
The Pennsylvania State University

The Graduate School

Department of Philosophy

THE DISCOVERY OF MATHEMATICAL THEORY:

A CASE STUDY IN THE LOGIC OF MATHEMATICAL INQUIRY

A Thesis in

Philosophy

by

Daniel Gerardo Campos

© 2005 Daniel Gerardo Campos

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

December 2005

The thesis of Daniel Gerardo Campos was reviewed and approved* by the following:

Douglas Anderson, Associate Professor of Philosophy, Thesis Advisor, Co-Chair of Committee

Emily Grosholz, Professor of Philosophy, Co-Chair of Committee

Dale Jacquette, Professor of Philosophy

Catherine Kemp, Assistant Professor of Philosophy

Michael Rovine, Associate Professor of Human Development and Family Studies

John Christman, Associate Professor of Philosophy, Head of the Department of Philosophy

*Signatures are on file in the Graduate School

ABSTRACT

What is the logic at work in the inquiring activity of mathematicians? I address this question, which pertains to the philosophical debate over whether, in addition to a logic of justification of mathematical knowledge, there is a logic of inquiry and discovery at work in actual mathematical research. Based on the philosophy of Charles Sanders Peirce (1839-1914), I propose that there is a logic of mathematical inquiry, and I expound its form. I argue that even though there are no rules or algorithms that will lead to breakthrough discoveries and successful inquiry with absolute certainty, Peirce’s philosophy provides a way to describe (i) the conditions for the possibility of mathematical discovery; (ii) the actual method of inquiry in mathematics and its associated heuristic techniques; and (iii) the logical form of reasoning that warrants the application of mathematical theories to the study of actual scientific problems in nature.

With regard to (i), I discuss the role of the problem-context of discovery and describe the epistemic conditions necessary for carrying out mathematical reasoning. With respect to (ii), I argue that experimental hypothesis-making in the course of analytical problem-solving, and not deduction from axioms, is the actual method of mathematical research. Regarding (iii), I argue that abduction and analogy warrant the application of mathematical theories to the study of actual scientific problems. The discovery and early development of mathematical probability, culminating with Jacob Bernoulli’s Ars Conjectandi (1713), serves as the historical case study to examine critically the proposed logic of mathematical inquiry. I discuss the practical implications of the proposed logic of inquiry for a philosophy of mathematical education.

TABLE OF CONTENTS

LIST OF FIGURES

ACKNOWLEDGEMENTS

Chapter 1 Introduction

Chapter 2 The Peircean Framework: Peirce’s Philosophy of Inquiry

2.1 Peirce’s Trichotomic
2.1.1 The Peircean Categories: Firstness, Secondness, and Thirdness
2.1.2 Triadic Relations and the Mathematical Origin of the Categories
2.2 The Triad in Reasoning
2.2.1 The Triad in Signs: Icons, Indices, and Symbols
2.2.2 The Triad in Arguments: Deduction, Induction, and Abduction
2.2.3 On Peirce’s Distinction between Induction and Abduction
2.3 Peirce’s Conception of Mathematics
2.3.1 Hypothetical States of Things
2.3.1.1 Diagrams and Icons
2.3.1.2 The Applicability of Mathematical Theories
2.3.2 Necessary Reasoning, Experimentation, and Observation
2.3.3 Pure Mathematical Reasoning versus Poietic Creation
2.4 Mathematics as Creative, Precise, Experimental, Open-Ended Inquiry

Chapter 3 The Context of Mathematical Discovery: The Case of Mathematical Probability

3.1 The 1654 Pascal-Fermat Correspondence and the Problem-Context of Discovery
3.1.1 The ‘Problem with Dice’
3.1.2 The ‘Problem of Points’ and Expectation
3.2 The Creation of a Hypothetical State of Affairs and the Origins of Mathematical Probability Theory
3.3 Peircean Considerations and Implications

Chapter 4 Epistemic Conditions for the Possibility of Mathematical Discovery

4.1 Epistemic Abilities
4.1.1 Imagination
4.1.2 Concentration
4.1.3 Generalization
4.2 The Community of Inquirers
4.2.1 Systems of Representation
4.2.2 Existing Mathematical Knowledge
4.2.3 Dialogical Criticism
4.3 Pragmatic Upshot Towards a Logic of Mathematical Inquiry

Chapter 5 The Method of Mathematical Inquiry and the Heuristics of Discovery

5.1 Heuristic Methods for Creating ‘Framing Hypotheses’
5.1.1 Abstraction
5.1.2 Framing Analogy
5.2 The ‘Analytic Method’ of Mathematics and the Heuristics of ‘Analytical Hypothesis-Making’
5.2.1 Lessons from Huygens’s General Method of Solution for the Problem of Points
5.2.1.1 The ‘Analytic Method’ of Mathematics
5.2.1.2 Generalization and Particularization as Analytical Heuristics
5.2.2 Lessons from the Demonstration of Bernoulli’s Theorem
5.2.3 Pragmatic Upshot of the Historical Lessons: Towards a Logic of Mathematical Inquiry

Chapter 6 The Leibniz-Bernoulli Correspondence: The Abductive Warrant of Bernoulli’s Theorem

6.1 The Correspondence
6.2 Bernoulli’s Hypothesis as an Inference to the Best Explanation
6.3 Bernoulli’s Hypothesis as a Case of Abduction
6.3.1 Formulating and Weighing Plausible Explanatory Hypotheses in Response to Living Doubt
6.3.2 ‘Abductive Insight’ in Bernoulli’s Hypothesis
6.3.3 ‘Simplicity’ in Bernoulli’s Hypothesis
6.4 Comparing Both Accounts of Bernoulli’s Ampliative Inference

Chapter 7 Abduction as Rational Ampliative Inference: Objections and Replies

7.1 The Descriptive Adequacy of the Abductive Model
7.2 The Problem of Truthful Hypothesizing
7.2.1 The Abductive Faculty
7.2.2 The Simplicity of Abductive Hypotheses
7.2.3 Bernoulli’s Truthful Hypothesizing

Chapter 8 Explanation and Reality

8.1 True as a priori Real Dispositions
8.2 A priori Real Dispositions as Explanatory of a posteriori Statistical Regularities
8.2.1 Some Possible Objections
8.2.1.1 Bernoulli’s ‘Propensity’ Interpretation?
8.2.1.2 Bernoulli’s ‘Realism’?
8.3 Explanation and Reality in Bernoulli’s Reasoning

Chapter 9 Conclusion

Bibliography

LIST OF FIGURES

Figure 2-1: Division of the Three Universal Categories according to their Orders of Genuineness and Degeneracy.

Figure 2-2: Diagram of a Three-Way Forking Road as Suggested by Peirce.

Figure 2-3: Diagram from C. S. Peirce’s demonstration of Euclid’s Fifth Proposition (Elements I.5), known as the Pons Asinorum.

ACKNOWLEDGEMENTS

I owe my deepest gratitude to an entire community of inquirers that enabled me to complete this work in philosophy. I regard this thesis as the product of a community that includes Professor Daniel Conway, who encouraged my interests and helped me to get started as a student of philosophy; Professor Johannes Fritsche, who introduced me to the central question of my dissertation by way of Aristotle and who as a generous teacher was always willing to discuss philosophical matters or simply to read with me; Professors Mike Rovine, Dale Jacquette, and Cathy Kemp, who in manifold ways oriented my work; Emily Grosholz and Douglas Anderson, my academic advisors and dissertation directors, who patiently helped me to shape and clarify my thinking; and all my friends with whom I studied in cafés and diners and from whom I learned how to let philosophy be a conversation. I am thankful to all of them. Emily and Doug are not only academic advisors but, more importantly, exemplars of what it means to live a reflective, ethical life.

I thank my parents, Rodolfo Campos and Liannette Badilla, and my sisters, Antonieta and Xinia, for all the support and affection they have always given me. To them and to my friends, especially Caro, I offer an embrace: thank you!

Chapter 1

Introduction

What is the logic at work in the inquiring activity of mathematicians? This is the central philosophical question that I propose to address here. Logicians and philosophers of mathematics will generally agree with the position that there is a logic of justification of mathematical knowledge, even if they might disagree about its specific form. The question of whether there is a logic of mathematical inquiry, and especially of mathematical discovery, is far more controversial. Or at least it should be, since the controversy would be a sign that the question receives ample philosophical consideration. However, in comparison to the question of logical justification, the question of logical inquiry and discovery receives relatively little attention from philosophers.

For instance, in a recent introductory work to the philosophy of mathematics, James Robert Brown notes that mathematical activity is far broader than proving theorems, as it includes practices such as choosing problems for research, focusing on a certain line of research, deciding which mathematical projects to fund, and so on (Brown 1999, p. 30). He argues that “[m]athematical achievements may rest entirely on deductive evidence, but mathematical practice is based squarely on the inductive kind [of evidence]” (Brown 1999, p. 30). He develops an exposition of what he calls ‘inductive evidence’, but his overall emphasis is on the role of such evidence, especially of pictures and figures, in proofs, not in the actual inquiring process of trying to pose and attack mathematical problems.1 This, of course, is not to fault Brown. It is simply to point out that the question of the logic of mathematical inquiry is comparatively neglected. Brown explicitly writes that “[a]side from proofs, the notion of mathematical inference is a largely unexplored field. It is certainly not in the same stage of development as, say, rational inference and methodology in the natural sciences” (Brown 1999, p. 171). He cites the work of George Pólya (1954) and of Hilary Putnam in general, of Penelope Maddy (1990) in set theory, and of Daniel Shanks (1993) in number theory, as attempts to address the question of the nature of mathematical inference. Beyond the specific question of inference, I might also cite the work of Imre Lakatos (1976) on the logic of development of mathematical concepts, Carlo Cellucci (2002) on the method of mathematical research, and recent collections such as those edited by Donald Gillies (1992) and Emily Grosholz and Herbert Breger (2000), exploring questions of revolution and development in the growth of mathematical knowledge, as variegated efforts to study various aspects of mathematical inquiry and the resulting development of mathematical knowledge.

My aim is to contribute to this generally interrelated, and relatively neglected, line of philosophical research by pursuing a systematic investigation into the logic of mathematical inquiry, paying especial attention to the reasoning processes leading to mathematical discovery. I will approach the question from an openly Peircean standpoint.

The question of the logic of inquiry—in theoretical and practical endeavors, in mathematics, science, and philosophy—remained central to the American mathematician, scientist, and philosopher Charles Sanders Peirce throughout his life.2 With specific regard to the logic of inquiry in mathematics, Peirce thought that the question of how mathematicians can guess on the ways to solve a mathematical problem or to carry out the demonstration of a proposed mathematical result “is a mystery that deserves a life-time of study” (NEM 4, p. 215).3 Peirce, I think, made significant inroads into the “mystery” of rational mathematical conjecturing, and of mathematical inquiry more generally, throughout his life-time. I hope to embark on a similar course of study along Peircean lines. This in fact reveals at the outset the nature of my approach. In the first place, I aim at a ‘systematic’ approach in the sense that it will rely entirely upon the system of universal categories developed by Peirce. Chapter 2, in fact, is an introduction to the Peircean system of categories and its implications for a theory of inquiry. Second, my project is a study of ‘logic’ in Peirce’s broad sense of ‘logic’ as the science of good, self-controlled reasoning. Though Peirce made important contributions—comparable only to those of Gottlob Frege—to what we know contemporarily as “formal logic,” Peirce did not think that this narrow field was the whole of logic, nor even its principal part. For Peirce ‘logic’ is more properly understood as the art of good, self-controlled reasoning. Of this art, “formal deductive logic” is but one sub-branch. I think of Peirce’s logic, understood as the art of good reasoning, as being more akin to the entirety of Aristotle’s Organum—i.e., not only to the Prior and Posterior Analytics, but also to the Dialectics, for instance—and in the spirit of, say, Antoine Arnauld and Pierre Nicole’s 1662 Ars Cogitandi, which at the time of the birth of modern science attempted to account for non-deductive, pre-probabilistic forms of reasoning, for example.

1 For example, he describes some types of inductive evidence involved in enumerative induction, analogy, and broad experience (Brown 1999, p. 30-32). More broadly, he argues that pictures and figures are not only heuristic devices in mathematical research to aid the understanding and provide explanations, but are actually a type of evidence that justifies and proves results.
2 I call Peirce an “American” philosopher in the full sense of a “philosopher from the Americas” and not in the narrow and inadequate sense of “American” as “being from the United States of America.”
3 In keeping with standard practice in Peirce scholarship, all citations from Peirce 1973, The New Elements of Mathematics, will be abbreviated henceforth as NEM. In this case, for example, NEM 4, p. 215 corresponds to volume 4, page 215. I will introduce similar standard abbreviations upon the first citation of texts by Peirce.

More specifically, Peirce thought that the science of logic consisted of three branches, namely: (i) ‘Speculative grammar’, a label due to Duns Scotus, which consists of the “analysis of what kinds of signs are absolutely essential to the embodiment of thought” (EP 2, p. 257).4 It is the study of what general types of signs are necessary for reasoning to be possible. As I will explain in chapter 2, for Peirce the most general types of signs are ‘icons’, ‘indices’, and ‘symbols’. (ii) ‘Critic’, which is the study of “all the different elementary modes of getting at truth and especially all the different classes of arguments…[and of] their properties so far as these properties concern [the] power of the arguments as leading to the truth” (EP 2, p. 256). I will show in chapter 2 that for Peirce the elementary types of arguments that can lead to the truth in various degrees are deduction, induction, and abduction. Here we see that deductive logic is but one sub-branch of the whole science. (iii) ‘Methodeutic’, which “is the last goal of logical study” and consists in “the theory of the advancement of knowledge of all kinds” (EP 2, p. 256). This is precisely the branch of logic that, on the basis of the other two, ought to reveal the methods for discovery and breakthrough advancement of mathematical and scientific knowledge. It is methodeutic that, for Peirce, ultimately deserves “a life-time of study.”

4 In keeping with standard practice in Peirce scholarship, all citations from Peirce 1998a, The Essential Peirce, volume 2, will be abbreviated henceforth as EP 2.

My approach to the question of the ‘logic’ of mathematical inquiry, therefore, will reflect Peirce’s three-fold conception of the science of logic. After an introduction in chapter 2 to Peirce’s system of categories and its consequences for a philosophy of inquiry, I conceive of my subsequent exposition as consisting of three parts. The first part treats of the context of mathematical discovery (chapter 3) and of the conditions for the possibility of discovery within that context (chapter 4). It is a study into the conditions that favor breakthroughs in the course of mathematical inquiry. The second part concerns the method of mathematical inquiry and the heuristic techniques that mathematicians deploy in pursuing that method (chapter 5). The third part addresses the forms of reasoning that lead to the innovative application of mathematical theories and models to problems in the natural sciences and the logical difficulties related to such attempts at innovative application (chapters 6-8).

I suggest at the outset that these three parts relate to the three branches of Peircean logic in the following way. Part One, on the context and the conditions of mathematical discovery, and relevant sections of chapter 2 that introduce Peirce’s special notion of mathematical ‘diagrams’ together make up a study on ‘speculative grammar’, in as much as they investigate the types of signs necessary for mathematical reasoning. The whole of Part One, however, is a broader study than ‘speculative grammar’ and in some ways is closer to what we might contemporarily call “epistemology” insofar as it addresses, for instance, the epistemic abilities required for mathematical reasoning according to Peirce. Part Two, on the method of mathematical inquiry and its associated heuristic techniques, is a study in logical ‘critic’ in as much as it classifies forms of reasoning involved in actual mathematical research, and is an incipient attempt at ‘methodeutic’ in as much as it suggests a general method and particular heuristic techniques conducive to mathematical discovery. Finally, Part Three, on the logical grounds for the innovative application of mathematics to science, is a study in ‘methodeutic’, or the art of discovery, at the intersection of mathematics and natural science.

There is a third way in which the nature of my approach is Peircean. In closing his famous essay on “How to Make Our Ideas Clear,” part of the Illustrations of the Logic of Science published in 1877-1878, Peirce declares: “How to give birth to those vital and procreative ideas which multiply into a thousand forms and diffuse themselves everywhere, advancing civilization and making the dignity of man, is an art not yet reduced to rules, but of the secret of which the history of science affords some hints” (EP 1, p. 141).5 For Peirce, the key to deciphering the ‘art of discovery’ lies in the history of science, including of course the history of mathematics. In the history of mathematical and scientific reasoning we find the record of our inquiring practices. This is of course a claim that needs to be defended. I do not defend it at the outset but ask that the forthcoming investigation be judged, at least in part, as a defense of what the historical approach might achieve.

Admittedly, I do not seek to reduce the ‘art of discovery’ to rules in the sense of algorithms that will necessarily lead to conceptual breakthroughs, successful hypotheses, and mathematical discoveries. I grant at the outset that a logic of inquiry cannot produce rules for discovery in the same way in which deductive logic clarifies truth-preserving rules of inference. The expectation that a logic of inquiry should meet the standards of a logic of justification is what leads detractors of the former logic to foreclose its possibility even before it is attempted. For example, from twentieth century philosophy of science we can think most famously of Karl Popper’s charge that what Peirce would call the “art of giving birth to vital and procreative ideas” in science is not a question of logic but of psychology (see Popper 1959). The charge that Popper and many other philosophers make is really that a logic of inquiry, including its art of discovery, cannot live up to the standards that their narrow sense of ‘logic’ demands. The standard is usually that of rule-derived, certain knowledge. Following Peirce, who devoted much effort to developing a broader logic, including the branch of ‘methodeutic’, I propose to the contrary that a logic of inquiry should meet its own standards. Accordingly, throughout the course of the forthcoming argument, I will try to make clear what those standards are.

5 In keeping with standard practice in Peirce scholarship, all citations from Peirce 1992, The Essential Peirce, volume 1, will be abbreviated henceforth as EP 1.

What I proceed to do here, then, is not to develop an abstract model of a logic of inquiry with complete disregard to the history of mathematical practice. I rather aim to study the living practice of mathematical inquiry, as reflected in the discovery and early development of a major branch of mathematics in early modernity. To this effect, first I provide my introductory interpretation of Peirce’s philosophy of mathematical inquiry, and then I study the discovery of mathematical probability theory. The purpose of the case study is to provide us with a concrete, historical example of progressive mathematical research that will allow a critical examination of the Peircean logic of inquiry. This purpose calls for a delicate balance between two tasks that I will strive to preserve. On the one hand, I will attempt to examine, by way of the case study, the strengths and weaknesses of Peirce’s logic of mathematical inquiry, offering criticism and even attempting some original development when required by the evidence of the case study. On the other hand, I will attempt to explain, through the proposed Peircean logic, the mathematical reasoning involved in the discovery and early development of mathematical probability. Given my striving for balance, however, I think I should forewarn the reader that my project is not a history of probability. There are already outstanding histories of the concept of probability, most notably The Emergence of Probability (Ian Hacking, 1975) and Classical Probability in the Enlightenment (Lorraine Daston, 1988), and also histories of mathematical probability and statistics, most notably A History of the Mathematical Theory of Probability (Isaac Todhunter, 1865) and The History of Statistics (Stephen Stigler, 1986). My goal is not to provide an alternative work to these historical works. My principal aim is rather to make a first systematic attempt at expounding the logic of mathematical inquiry by way of an important case study.

With regard to the case study, traditionally the beginning of mathematical probability has been located at the 1654 correspondence between Blaise Pascal and Pierre de Fermat regarding the mathematics of games of chance, though historians such as F. N. David (1962) and M. G. Kendall (1970) have paid significant attention to the mathematical precedents to the Pascal-Fermat correspondence. My treatment of the case study will pay due attention to those precedents by anonymous medieval poets and mathematicians such as Galileo, for example. I will examine closely the Pascal-Fermat correspondence, the work of Christian Huygens in the latter half of the seventeenth century, and especially Jacob Bernoulli’s Ars Conjectandi (1713)—the work that introduces the first crucial theorem that seeks to justify the estimation of unknown probabilities on the basis of the statistical ratios of phenomena observed in the natural world. Bernoulli’s Ars Conjectandi, in fact, will turn out to be a focal text, due to its relation to the earlier work of Pascal, Fermat, and Huygens and to the 1703-1704 epistolary discussion between Bernoulli and G. W. Leibniz regarding the assumptions and implications of Bernoulli’s theorem that preceded the (posthumous) publication of the text. The discovery of mathematical probability is of interest as a case study because, although it seems to have originated in the investigation of the mathematics of games of chance, the mathematical urgency and complexity of the field evolved due to its applicability to the natural and social sciences. Mathematical probability is therefore a branch of mathematics that at once influences and is influenced by problems from the sciences. It will therefore provide an interesting case to address the logic of inquiry of the mathematician qua pure mathematician, on the one hand, and qua mathematical scientist on the other. So I think the consideration of this case will be a worthwhile endeavor towards studying the logic at work in the inquiring activity of mathematicians.
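For orientation, the theorem at the center of this case study can be glossed in modern notation as a weak law of large numbers for binomial trials. The formulation below is my own paraphrase, not a quotation from Bernoulli or Peirce:

$$\lim_{n\to\infty} P\left(\left|\frac{S_n}{n} - p\right| < \varepsilon\right) = 1 \quad \text{for every } \varepsilon > 0,$$

where $S_n$ is the number of successes in $n$ independent trials, each with the same fixed, possibly unknown, probability of success $p$. It is the use of the observed statistical ratio $S_n/n$ to estimate an unknown $p$ whose logical warrant the later chapters examine.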

Finally, a fourth way in which Peirce’s philosophy influences my approach is reflected in my attempt to address the pragmatic upshot of the forthcoming investigation. What are the practical bearings of studying the logic of mathematical inquiry? To be clear, I think it is already a worthwhile theoretical endeavor to try to describe and expound the logica utens of mathematical inquiry. However, I think the most important practical bearings lie in what the investigation may suggest towards developing a logica docens for the training of students of mathematics. I do not mean only those students who are training to be professional mathematicians. I mean all those students who at any time in the course of their education undertake the study of mathematics. When students undertake mathematical training, what is it that they are preparing to do? How should we teach students, at any age and from any field, to do mathematics? How should we train aspiring mathematicians for a life of inquiring practice? I hope the following investigation might provide some inroads into these questions and suggest at least some ways to develop a logica docens of mathematical inquiry. By investigating what inquiring mathematicians actually do when they inquire, I hope also to gain some insight into how we might train future students of mathematics. Given the close relationship of mathematics to science and philosophy, and given its creative affinities to the arts, these issues might prove in the long run to extend beyond the distinguished history of philosophical reflection upon mathematics.

Chapter 2

The Peircean Framework: Peirce’s Philosophy of Inquiry

2.1 Peirce’s Trichotomic

At the core of Charles Sanders Peirce’s philosophy, pervading his entire system of thought, are his three universal categories. As early as his 1868 paper “On a New List of Categories,” Peirce attempts to derive the universal categories that constitute the essential, irreducible aspects of all phenomena (EP 1, p. 1-10). At the time of his early list, Peirce is greatly influenced by Kant’s critical philosophy, and he attempts a logical derivation of the universal categories required for the unification of experience, that is, of the elementary conceptions required to bring the manifold of sense impressions into unity in the understanding. The effort to derive, critique, and ultimately defend his list of universal categories remained a constant endeavor throughout Peirce’s philosophical career. In 1868 he claims that “the unity to which the understanding reduces the impressions is the unity of a proposition” (EP 1, p. 2), and thus Peirce derives his first list of universal categories from his logical analysis of the nature of a proposition. Later, Peirce also derives the universal categories phenomenologically, that is, by attending to the universal elements of appearance as they present themselves to us in ordinary experience.6 Most importantly and most fundamentally, as we will see, Peirce derives his list of three universal categories from the formal logic of relations, and in particular, from what he finds to be the irreducibility of triadic relations.7 Since Peirce regards the formal logic of relations to be a branch of mathematics, for him the universal categories ultimately had a mathematical origin.8

6 See, for example, the second of the 1903 Harvard Lectures, entitled “On Phenomenology,” in EP 2, p. 145-159.

Since the universal categories are at the core of Peirce’s entire philosophical system, they are naturally at the core of his philosophy of inquiry in mathematics and science. Thus, in order to introduce Peirce’s philosophy of inquiry as it pertains to mathematics, the natural starting point is a description of the three universal categories that he finds to pervade all phenomena. My focus will not be to expound or defend his derivation of the categories.9 My focus will rather be to describe the Peircean categories sufficiently so as to provide a basic framework to understand Peirce’s model of inquiry in mathematics. Accordingly, my emphasis will be on presenting the categories in their heuristic aspect, that is, as guides for our own philosophical researches into the logic of mathematical inquiry. In order to introduce the art of ‘trichotomic’ which deploys the categories heuristically, let us turn first to define and describe briefly the categories in Peircean terms, and then to illustrate the mathematical origin of the categories through Peirce’s preferred example of the irreducibility of triadic relations.

7 See, for example, NEM 4, p. 307-312.
8 See Houser, “Introduction” to EP 1, p. xxx, and Anderson 1995, p. 33-34.
9 For a comprehensive examination of Peirce’s development of his philosophical system on the basis of the categories, see Murphey 1961. Murphey, however, incorrectly argues that Peirce’s aim was to develop a complete, final, and closed system, and that he ultimately failed. Anderson (1995), Hausman (1993), and Hookway (1985) argue more convincingly, and from various perspectives, that Peirce conceived of his system as being open and always subject to revision as our communal knowledge of reality and reality itself evolve. Peirce never aimed at a complete and closed system.

2.1.1 The Peircean Categories: Firstness, Secondness, and Thirdness

In a mature formulation of the categories in his 1903 Harvard Lecture entitled “The Categories Defended,” Peirce introduces them as follows:

Category the First is the Idea of that which is such as it is regardless of anything else. That is to say, it is a Quality of Feeling.

Category the Second is the Idea of that which is such as it is as being Second to some First, regardless of anything else and in particular, regardless of any law, although it may conform to a law. That is to say, it is Reaction as an element of the Phenomenon.

Category the Third is the Idea of that which is such as it is as being a Third, or Medium, between a Second and its First. That is to say, it is Representation as an element of the Phenomenon. (EP 2, p. 161)

It is crucial to observe, first, that in these definitions the categories are Ideas both in the sense of Form or Nature and in the sense that they are the irreducible elements of thought that allow us to unify our experience of phenomena. As Ideas, the categories are the elements of thought that correspond to the elements of Quality, Reaction, and Representation in the phenomena. For Peirce, our experience of any phenomenon, therefore, has the three irreducible aspects of (i) feeling, that is, of experiencing the phenomenon as having an intrinsic and independent quality, (ii) reaction, that is, of experiencing the phenomenon as being constituted also by its relations of action and reaction with other phenomena, and especially of experiencing our minds as being reactive to the occurrence of the phenomenon, and (iii) representation, that is, of experiencing the phenomenon through a sign of it in our minds. Suppose that we observe a geometrical figure, say a triangle, drawn in black ink on white paper. We experience the figure in its Firstness in so far as we perceive it to consist of black patches on a white background, the black patches being, say, thick or thin, straight or curved, the background paper being smooth or coarse, and so on. Blackness, whiteness, thickness or thinness, smoothness or coarseness, and so on, are the experienced or felt qualities of the drawn figure. Moreover, the figure, as being actually physically present and as forcing its presence upon our perception, is the figure experienced as Second. Or, if we prefer, our actual reactive perception of the figure is Second to the active presence of the figure. Finally, in so far as we interpret the actually drawn figure to represent a triangle in general, that is, in so far as the actual drawing stands for a geometrical concept to our interpreting minds, perhaps in the context of the demonstration of a geometrical theorem, we experience that figure as a Third.

Moreover, implied in the preceding description we find Peirce’s view that the categories are not only ‘ideal’ but also ‘real’, that is, they are not only conceptions of our thought but also constitutive elements of the phenomena themselves.10 As constitutive elements of phenomena, Peirce often describes the categories as follows:

First is the beginning, that which is fresh, original, spontaneous, free. Second is that which is determined, terminated, ended, correlative, object, necessitated, reacting. Third is the medium, becoming, developing, bringing about. A thing considered in itself is a unit. A thing considered as a correlate or dependent, or as an effect, is second to something else. A thing which in any way brings one thing into relation with another is a third or medium between the two. (EP 1, p. 280)

Thus, as elements of phenomena, we can describe the categories as Originality (also, Spontaneity and Independence), Relation, and Mediation respectively. For Peirce, a phenomenon is ‘first’ in so far as it is original, spontaneous, free, and independent. A pure ‘first’ would be, for example, a completely random event that does not result from any cause or does not obey any law. A phenomenon is ‘second’ in so far as it reacts with another, or has its limits in its relation to another, or is necessitated by the occurrence of another. A ‘second’ would be, for example, an effect in so far as it is a reaction to its cause. A phenomenon is ‘third’ in so far as it mediates between two others. An active causal law is a ‘third’ in so far as it necessitates and brings about an effect upon the occurrence of its cause; for example, gravity as a ‘habitual’ or lawful relation between any two physical objects.

10 Peirce in fact dedicates the subsequent 1903 Harvard Lecture, entitled “The Seven Systems of Metaphysics,” to demonstrating the reality of the categories, or more precisely, their actual operation in nature. See EP 2, p. 179-195.

Now, the universal categories are hierarchical and need not occur purely in phenomena. A phenomenon might conceivably be a pure first, such as a purely spontaneous and chance event, independent of cause or law. In his cosmology, for instance, Peirce hypothesizes that such purely spontaneous random events, call them “flashes,” initiated the evolution of the cosmos from a completely indefinite no-thing-ness into a gradually less indefinite some-thing-ness in which some spontaneous events do take place.11 A phenomenon that is a second must also involve some degree of firstness. You are hit by a heavy pole in the small of your back. You experience the hit as a reaction; there is a real clash between agent and patient (EP 2, p. 177-178). But the hit also involves acute pain on the part of the patient. The pain is yours, not the pole’s or the hitting relation’s. So the phenomenon does involve some degree of firstness. That is, here is an experienced reaction that also involves a felt quality. Finally, a phenomenon might conceivably be a pure third, such as a purely ideal triangle, which does not have any felt quality and does not react with anything in the actual world. But a phenomenon that is predominantly a third may also involve degrees of secondness and of firstness. Walking in a crowded street, a friend shoves you forcefully so as to indicate the direction in which you should walk. The shove is predominantly a representation, in this case, a way to communicate the direction in which you should walk. But it involves a forceful, physical action and reaction, and it involves the felt quality of shoving and of being shoved respectively. Here is a representation that appeals to a reaction in order to communicate something and that involves a felt quality.

11 See, for example, his 1891-1893 Monist Metaphysical Series, EP 1, p. 285-371.

Finally, the higher categories are susceptible of ‘degeneration’. According to Peirce, “Firstness or freshness may have manifold varieties, or rather, arbitrariness and variety is its essence, but it is absolute and unsusceptible of differences of degree. It may be present more or less, but it has no different orders of complication” (EP 1, p. 280). Spontaneity, originality, and chance do not admit differences of order. Secondness or reaction, however, does admit a dual division, as it may be “genuine or degenerate” (EP 1, p. 280-281). In genuine reactions, there is an actual dynamical connection between the reacting objects or events. In such a genuine relation, if one of the relates were to disappear, and with it the reaction, the remaining relate would be modified, as it would lose the characters implied in the relation. Degenerate secondness consists in relations of reason that involve no dynamical connection. There are two varieties: a single object or event considered as a second to itself or a single object considered as a second to an object with which it has no real connection. Finally, Thirdness, Representation, or Mediation does admit of two orders of degeneracy (EP 1, p. 281). Genuine thirdness consists in a ‘vital’ connection among three objects, events, or terms, A, B, and C. In genuine mediation, each of the three terms is connected to each other by “a relation which only subsists by virtue of the third term, and each has a character which belongs to it only so long as the others really influence it. It would not be enough to say that the connection between the terms is dynamical, for forces only subsist between pairs of objects; we had better use the word ‘vital’ to express the mode of connection, for wherever there is life, generation, growth, development, there and there alone is such genuine thirdness” (EP 1, p. 281). Mediation of the first order of degeneracy, then, occurs when two of the three terms are identical, so that the third only mediates between two aspects of the same object, or when there is no ‘vital’ connection in the triad, but only dynamical connections between A and B and between B and C in a way that brings about a dynamical connection between A and C. Mediation of the second order of degeneracy occurs when all three terms are identical or when there are not even dynamical connections between pairs of terms but only relations of reason.

I will defer offering examples of the genuine and degenerate orders of the categories until the discussion of the triadic classification of signs below. For now, the trichotomic division of the categories may be diagrammed according to the following figure, based on one drawn by Peirce in his manuscripts for the 1903 Harvard Lectures on Pragmatism.12 In the figure we see a line representing pure or genuine firstness or spontaneity, a catena representing first-order degenerate and genuine secondness or reaction, and a stemma representing first- and second-order degenerate and genuine thirdness or mediation.

12 This is similar to one of three diagrams drawn by Peirce in “The Categories Defended” (MS 308; published in EP 2.12), as reproduced in EP 2, p. 162. In the lecture, Peirce aims to illustrate the various specific divisions of the categories, and especially the various stemmas of Thirdness. The interested reader may see that lecture for a more thorough discussion of the categories and their classification into orders of genuineness and degeneracy.


Figure 2-1: Division of the Three Universal Categories according to their Orders of Genuineness and Degeneracy.

Having introduced these definitions and brief descriptions of the categories, which shall become clearer as we deploy them in the forthcoming investigations, let us turn to discuss their mathematical origin in Peirce’s work in the logic of relations.

2.1.2 Triadic Relations and the Mathematical Origin of the Categories

As I pointed out above, Peirce derives his categories first through the logical analysis of the nature of a proposition—a decade-long effort culminating with his aforementioned 1868 “New List”—and later through his phenomenological investigation into the elements of appearance in the course of our ordinary experience—as it is evident throughout the 1903 Harvard Lectures. Most fundamentally, however, Peirce conceives of his categories as being founded upon his investigations into the logic of relations. As Douglas Anderson observes, “in developing his logic, [in 1870 Peirce] found another route to his categories through an analysis of relations” (Anderson 1995, p. 33; see CP 1.363, CP 3.63, and CP 3.421). Since Peirce associated formal logic, including the logic of relations, with mathematics, it is the mathematical-logical analysis of relations that discovers the categories (Anderson 1995, p. 33-34; see CP 4.240). Nathan Houser concurs, observing that for Peirce “[m]athematics is a science of discovery that investigates the realm of abstract forms, the realm of ideal objects (entia rationis). It is the mathematician who first discovers the fundamentality of triadicity by finding that monadic, dyadic, and triadic relations are irreducible, while relations of any degree (or adicity) greater than triadic can be expressed in combinations of triadic relations” (“Introduction” to EP 1, p. xxx).13 Thus, for Peirce the irreducibility of monadic, dyadic, and triadic relations, and the reducibility of all other degrees of relation to triadic relations, provide the ground for Firstness, Secondness, and Thirdness, or Independence, Relation, and Mediation, as the elementary and irreducible universal categories.

Undertaking a thorough examination of Peirce’s arguments in this regard is beyond the scope of the present introductory purposes.14 Nonetheless, it will be illuminating at least to consider Peirce’s archetypical exemplification of the irreducibility of triadic relations, since it is irreducible triadicity that he thinks will find the most resistance before being accepted as a true conception. Consider the relations involved in ‘A gives C to B’.15 According to Peirce this is an irreducible triadic relation that cannot be expressed as a congeries of pairs or dyadic relations. Take these three dual relations: ‘A enriches B’, ‘A parts with C’, and ‘B receives C’. The triple relation involved in the fact that A enriches B with C, all in one act, cannot be analyzed into a combination of these dual relations. As Peirce puts it, “A may enrich B, B may receive C, and A may part with C, and yet A need not necessarily give C to B. For that, it would be necessary that these three dual relations should not only coexist, but be welded into one fact. Thus we see that a triad cannot be analyzed into dyads” (EP 1, p. 252). The upshot of the example of giving is that here is one case in which it is not possible to express a ‘triple fact’ as a combination of ‘dual facts’, that is, it is not possible to analyze the triadic relation involved in the act of giving into a mere collection of dyadic relations. Such a reduction would miss something elemental about the relations involved in giving. I would put it as follows: ‘C irreducibly mediates in the giving relation between A and B’. ‘A’, ‘B’, and ‘C’ in their independence are monads; ‘A enriches B’, ‘A parts with C’, and ‘B receives C’ are dyads, irreducible to monads; and ‘A gives C to B’ is a triad, irreducible to dyads or monads.

In turn, consider the relations involved in ‘A sells C to B for the price D’.16 According to Peirce, this quadruple fact is a compound of two triple facts. The first triple fact is that ‘A makes with B a certain transaction, E’. The second triple fact is that ‘the transaction E is a sale of C for the price D’. The combination of these two triple facts makes up a quadruple fact. Thus we have an example of a genuine quadratic relation that is reducible to triadic relations, while we have already seen that genuine triadic relations are irreducible. According to Peirce, then, Independence, Relation, and Mediation, or Firstness, Secondness, and Thirdness, constitute the genuine elementary conceptions or universal categories involved in all facts or phenomena.

13 I will develop my interpretation of Peirce’s conception of mathematics below, an interpretation that will accommodate the description of mathematics as a science of discovery that studies ideal objects.
14 The interested reader might consult Thompson 1953, chapter 1, and the collection of essays on Peirce’s logic edited by Nathan Houser et al (1997).
15 For Peirce’s full discussion of this example, see NEM 4, p. 307; EP 1, p. 251-253; and EP 2, p. 170-173.
16 For Peirce’s full treatment of this example, see EP 1, p. 252.
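Peirce's contrast between the reducible 'selling' relation and the irreducible 'giving' relation can be put schematically in modern relational notation. The following gloss is mine, and the predicate names are illustrative rather than Peirce's:

$$\mathrm{Sells}(a,c,b,d) \;\equiv\; \exists e\,\bigl[\mathrm{Transacts}(a,b,e) \wedge \mathrm{SaleOf}(e,c,d)\bigr],$$

$$\mathrm{Gives}(a,c,b) \;\not\equiv\; \mathrm{Enriches}(a,b) \wedge \mathrm{PartsWith}(a,c) \wedge \mathrm{Receives}(b,c).$$

The four-place relation of selling decomposes into two three-place relations linked by the mediating transaction $e$, whereas the conjunction of dyads on the right of the second line can hold of three disconnected facts without any single act of giving having been "welded into one fact."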

Peirce explains the reducibility of the quadruple fact of ‘selling for a price’ to a pair of triple facts and the irreducibility of the triple fact of ‘giving’ to dual facts as follows. A dual relative term, such as ‘lover’, is a sort of blank form where two places are left blank, so that “in building a sentence round lover, as the principal word of the predicate, we are at liberty to make anything we see fit the subject, and then, besides that, anything we please the object of the action of loving” (EP 1, p. 252). In short, a dual relative term is a blank form such as ‘_____ loves _____’. However, a triple relative term, such as ‘giver’, has two correlates, and thus it is a sort of blank form with three places left blank—for example, ‘_____ gives _____ to _____’. “Consequently,” Peirce writes, “we can take two of these triple relatives and fill up one blank place in each with the same letter, X, which has only the force of a pronoun, or identifying index, and then the two together will form a whole having four blank places; and from that we can go on in a similar way to any higher number” (EP 1, p. 252). In terms of our preceding example, I think that Peirce has in mind the following schema: Take two triadic blank forms, ‘_____ makes with _____ the transaction _____’ and ‘the transaction _____ is a sale of _____ for the price _____’. Now, fill in one blank in each form with the letter E as follows: ‘_____ makes with _____ the transaction E’ and ‘the transaction E is a sale of _____ for the price _____’. We can now combine these triadic forms to make up the quadruple form ‘_____ makes with _____ the sale of _____ for the price _____’.

However, Peirce argues, “when we attempt to imitate this proceeding with dual relatives, and combine two of them by means of an X, we find we only have two blank places in the combination, just as we had in either of the relatives taken by itself” (EP 1, p. 252). That is, we cannot combine blank dyadic forms into blank triadic forms. I think that Peirce has the following schema in mind. Take two subsequent dyads of the form ‘_____ R _____’, where R is a transitive relation. Filling in the appropriate blanks with X, even if we combine ‘_____ R ___X’ and ‘___X R _____’ into ‘_____ R _____’, we still end up with a blank dyadic form. Try as we may, it will be impossible to construct a blank triadic form out of dyadic forms. In accordance with this schematic reasoning, Peirce concludes by describing a diagram: “A road with only three-way forkings may have any number of termini, but no number of straight roads put end on end will give more than two termini. Thus any number, however large, can be built out of triads; and consequently no idea can be involved in such a number radically different from the idea of three. I do not mean to deny that the higher numbers may present interesting special configurations from which notions may be derived of more or less general applicability; but these cannot rise to the height of [fundamental] philosophical categories” (EP 1, p. 252). The following figure of a three-way forking road, as suggested by Peirce, illustrates the way in which any number of relations may be reduced to triads.

Following Peirce’s previous example, for instance, the relation ‘A sells C5 to B2 for the price D14’ may be reduced to the triads ‘A makes with B2 the transaction E’ and ‘the transaction E consists in the sale of C5 for the price D14’. Note that this reduction may be repeated for any of the quadratic relations in the diagram, including those for which the three-way forks are not completed in the figure.


Figure 2-2: Diagram of a Three-Way Forking Road as Suggested by Peirce.

The diagram of a three-way forking road is especially apt to conclude this brief introduction to the Peircean categories. As Peirce suggests, this schema will be especially important because of its general applicability in mathematical and philosophical investigations. Peirce in fact defines ‘trichotomic’ as “the art of making three-fold divisions” on the basis of the universal categories (EP 1, p. 280). It is this art of ‘trichotomic’ that we will deploy next in order to expound Peirce’s philosophical views regarding the presence and function of the triad of universal categories in logical reasoning.

2.2 The Triad in Reasoning

Trichotomic is the art of making three-fold divisions on the basis of the universal categories. Peirce employs this art extensively to analyze the nature of reasoning. As a brief prolegomenon to an exposition of Peirce’s conception of mathematical reasoning, then, it will be helpful to expound now his views on the triadic nature of reasoning in general. And regarding ‘reasoning in general’, I must provide the following definitions at the outset so that the focus of my forthcoming discussion is understood to belong within a wider conceptual framework.17

By ‘reasoning’ Peirce means “the process by which we attain a belief which we regard as the result of previous knowledge” (EP 2, p. 11). As we will see, reasoning consists of various kinds of learning processes.

A ‘belief’ is “a state of mind of the nature of a habit, of which the person is aware, and which, if he acts deliberately, on a suitable occasion, would induce him to act in a way different from what he might act in the absence of such a habit” (EP 2, p. 12). For example, if a person ‘believes’ that a straight line is the shortest distance between two points, then if she wants to move from point A to point B along the shortest path, she will actually try to follow a straight path.

A ‘judgment’ is an act of consciousness by which we recognize a belief; while a ‘proposition’ is the expression of a judgment (EP 2, p. 12).

‘Argumentation’ is the expression of a ‘reasoning’, that is, the expression of a process by which we attain a new ‘belief’ from existing ones (EP 2, p. 12).

‘Expression’ is a kind of representation or signification (EP 1, p. 281).

Thus, ‘reasoning’ is expressed as an ‘argumentation’ by way of signs, where a ‘sign’ is “a thing which serves to convey knowledge of some other thing, which it is said to stand for or represent. This thing is called the object of the sign; the idea in the mind that the sign excites, which is a mental sign of the same object, is called an interpretant of the sign” (EP 2, p. 13).

On the basis of these definitions and schematic conception of ‘reasoning in general’, it will become apparent why I focus my introductory discussion of the triad in reasoning first on Peirce’s triadic classification of ‘signs’, or kinds of representation, into icons, indices, and symbols, in order to explicate, second, Peirce’s triadic classification of ‘argumentations’ into deduction, induction, and abduction. Both of these triadic classifications will provide for us the key concepts to discuss Peirce’s conception of mathematics and, more generally, to undertake a case study into the logic of mathematical inquiry.18

17 I base my schematic introduction of ‘reasoning in general’ on the manuscripts “What is a Sign?” (EP 2.2) and “Of Reasoning in General” (EP 2.3). The interested reader might turn to these manuscripts for a concise introduction to Peirce’s notion of reasoning and of the role of signs within that process. Though this schema of ‘reasoning in general’ is admittedly too succinct, the Peircean conception of ‘reasoning’ should become clearer as I discuss in detail the Peircean notion of mathematical inquiry and deploy it in the forthcoming case study.

18 In “A Guess at the Riddle” (EP 1.19), Peirce outlines what would make up a complete exposition of the triad in reasoning. Such exposition would discuss: (1) Three kinds of sign—Icons, Indices, and Symbols; (2) three kinds of symbols—terms, propositions, and arguments; (3) three kinds of arguments—deduction, induction, abduction or hypothesis—and three figures of syllogism; (4) three kinds of terms—absolute, relative, conjugative; and (5) some miscellaneous triads. See the corresponding editorial note in EP 1.19 to identify the papers where Peirce elaborates each of these triads. Clearly, a complete exposition would merit an extended work in itself. My more limited task here is to introduce only those notions that are absolutely necessary to a discussion of Peirce’s conception of mathematical reasoning.

2.2.1 The Triad in Signs: Icons, Indices, and Symbols

A sign is essentially a representation and thus, for Peirce, a sign “is a third mediating between the mind addressed and the object represented” (EP 1, p. 281). Representation or signification is a triadic relation since a “sign stands for something to the idea which it produces or modifies. Or, it is a vehicle conveying into the mind something from without. That for which it stands is called its Object; that which it conveys, its Meaning; and the idea to which it gives rise, its Interpretant” (NEM 4, p. 309). A sign, in short, is a representation that communicates the ‘meaning’ of an ‘object’ to an interpreting mind by producing another representation called the ‘interpretant’ idea.19 According to Peirce, the object of a representation is itself a representation, and so there is an endless series of representations behind the ‘object’ represented, although we can conceive of an absolute object as the limit of the series. Likewise, the ‘meaning’ of a representation is itself a “more diaphanous” representation of the object, so there is also an endless series of representations here. Finally, the ‘interpretant’ of a representation itself represents the truth regarding the object; but as a representation it itself produces another interpretant idea, producing another infinite series (NEM 4, p. 309-310). The upshot is that reasoning is a continuous process of interpreting representations since “all reasoning is an interpretation of signs of some kind” (EP 2, p. 4).

Since signs are representations or Thirds, the theory of the categories reveals that

they may be genuine, degenerate in the first order, or degenerate in the second order.

19 Strictly speaking, the interpretant sign need not occur in a mind, at least not in a human mind. For example, computers can follow or interpret ‘signs’. Since my focus is on human reasoning, however, I will usually write of interpretant signs occurring in minds.

27 Whether a sign is genuine or degenerate in the first or second order depends on the kind of relation that exists between the sign and the object in the object-sign-mind triad. Peirce describes genuine signs as follows: “A sign is in conjoint relation to the thing denoted and to the mind. If this triple relation is not of a degenerate species, the sign is related to its object only in consequence of a mental association, and depends upon a habit. Such signs are always abstract and general, because habits are general rules to which the organism has become subjected. They are, for the most part, conventional or arbitrary”

(EP 1, p. 225-226). These genuine signs are symbols. They are mostly conventional representations because they communicate their meaning by way of mental association; that is, on the basis of an accepted convention, the representation determines an interpretant idea in our minds. The sign is related to its object by way of a conventional mental association to which our minds are habituated (EP 1, p. 281). All general words and any way of conveying a judgment are examples of genuine signs or symbols.

However, “if the triple relation between the sign, its object, and the mind, is degenerate, then of the three pairs [sign-object, sign-mind, object-mind] two at least are in dual relations which constitute the triple relation. One of the connected pairs must consist of the sign and its object, for if the sign were not related to its object except by the mind thinking of them separately, it would not fulfill the function of the sign at all” (EP

1, p. 226). Again, the order of degeneracy of a sign results from the kind of relation that subsists between the sign and the object. If the sign-object relation does not result from mental association, then it must result from their “direct dual relation…independent of the mind using the sign” (EP 1, p. 226). Representations of the first order of degeneracy result when “the sign signifies its object solely by virtue of being really connected with

it” (EP 1, p. 226). These signs are indices. Examples include natural signs (e.g. smoke as a natural sign of fire or burning), physical symptoms (e.g. fever as a natural sign of some illness), the letters on a geometrical diagram that indicate specific elements of the diagram, and the subscript numbers in algebra that distinguish one value from another without specifying the values themselves. Representations of the second order of degeneracy result when the dual relation between the sign and its object “consists in a mere resemblance between them” (EP 1, p. 226). These signs are icons. Icons stand as representations virtually indistinguishable from their objects because they embody all the characters of their objects. Examples of icons include most works of art, such as sculptures and paintings, and geometrical diagrams.

In expounding Peirce’s notion of mathematical inquiry, I will develop at length the view that geometrical diagrams are icons. But even at this point it would be very illuminating to consider the following brief illustration of icons: “A diagram, indeed, so far as it has general signification, is not a pure icon; but in the middle part of our reasonings [say, in the course of proving a theorem], we forget that abstractness in great measure, and the diagram is for us the very thing. So in contemplating a painting, there is a moment when we lose the consciousness that it is not the thing, and the distinction of the real and the copy disappears, and it is for the moment a pure dream,—not any particular existence, and yet not general. At that moment we are contemplating an icon”

(EP 1, p. 226). We might also conceive of theatrical performances or movies as being icons at the moment of the observer’s immersed contemplation of them. Now, in so far as geometrical diagrams or paintings represent themselves, they are pure icons. However, it is important to observe that the foregoing examples anticipate that representations need

not be pure icons, indices, or symbols. As we will find in considering mathematical representations, symbols may have indexical and iconic features, indexes may have iconic features, and icons may have indexical and even symbolic features. For example, geometrical diagrams may be regarded as icons, in so far as they represent themselves; they may include indexical features, such as letters pointing to specific elements of the diagram; and they may be regarded as symbols, in so far as they have a general signification beyond being representations of themselves, for example, in so far as they may be interpreted as representations of actual features of physical space.20

I will return to discuss the triadic classification of signs into icons, indices, and

symbols at length in the context of addressing the nature of mathematical representation

according to Peirce.21 For now, let us turn to the triadic classification of arguments, which will also be essential to understand his notion of mathematical reasoning.

20 Emily Grosholz has recently developed an alternative, original study of the various forms of representation, including various mixed forms, or what Peirce calls ‘degenerate’ cases, which can be interpreted both iconically and symbolically. For a treatment of this issue with specific regard to representation in mathematics, see Grosholz 2005. 21 I must clarify that the triadic classification of representations into icons, indexes, and symbols is only a preliminary classification. Peirce in fact developed, largely by way of the subsequent application of the art of trichotomic to his classification of signs, a far more complex classification according to the subspecies and levels of degeneracy of each kind of sign. For a glance at Peirce’s complex, fully developed classification, see EP 2.32. For a brief introduction to Peirce’s semiotic system, see Houser, “Introduction,” in EP 1, p. xxxvi-xli. For a full discussion, see Liszka 1996.

2.2.2 The Triad in Arguments: Deduction, Induction, and Abduction

In a seminal 1867 paper entitled “On the Natural Classification of Arguments,”

Peirce classifies arguments into a triad.22 His aim in that paper—an aim which will also

guide my discussion here—is to provide a classification of the various kinds of

arguments, without undertaking yet to discuss their justification or validity. Under the

influence of his sustained study of Kant, Peirce classifies arguments as either analytic or

synthetic. Analytic arguments are deductive, and synthetic arguments are classified as

either ‘induction’ or ‘hypothesis’.23 Although his conception of the different kinds of

argument, and especially of induction and of hypothesis (later abduction or retroduction),

evolved over the course of his research in logic, Peirce always maintained a triadic

classification.24 Let us begin with an illustration of the classification before addressing

how Peirce arrives at it.

In the 1877-1878 Illustrations of the Logic of Science, Peirce exemplifies the

three kinds of argument in syllogistic form as follows. Suppose there is a bag filled with

white beans, so that any sample taken from the bag will consist only of white beans.

There are three basic kinds of syllogistic arguments that may arise from this situation,

namely:

22 See “On the Natural Classification of Arguments,” Proceedings of the American Academy of Arts and Sciences, April 9, 1867. Reprinted in W 2, p. 23-48. Note that following standard practice in Peirce scholarship, I will abbreviate Peirce 1982 as W, followed by volume and page number. 23 I will discuss extensively the distinction between ‘induction’ and ‘hypothesis’, ‘retroduction’, or ‘abduction’ below. 24 I cannot offer here a discussion of the development of Peirce’s views on the various kinds of arguments. For an account of this development, see Santaella 1998. For a specific account of the evolution of Peirce’s concept of abduction, see Anderson 1986.

(1) Deduction:

Premise 1: Rule — All the beans from this bag are white.

Premise 2: Case — These beans are from this bag.

Conclusion: Result — These beans are white.

(2) Induction:

Premise 1: Case — These beans are from this bag.

Premise 2: Result — These beans are white.

Conclusion: Rule — All the beans from this bag are white.

(3) Hypothesis:

Premise 1: Rule — All the beans from this bag are white.

Premise 2: Result — These beans are white.

Conclusion: Case — These beans are from this bag.25

The different kinds of argument arise from the reasoning situation of the inquirer.

First, if the inquirer knows already the ‘rule’ that all the beans from this bag are white

and the ‘case’ that this sample of beans (hidden within a container, let’s suppose) was drawn from this bag, then she may conclude the ‘result’ that the sampled beans are white.

The inquirer, then, attains a new belief on the basis of existing knowledge by a deductive reasoning process. That she believes the ‘result’ to be that these (presumably hidden) sampled beans are white means that she would be willing to act on the basis of that conclusion; for instance, she may be willing to take a bet against someone who gambles that some or all of the sampled beans are not white. Second, if the inquirer already knows

25 See “Deduction, Induction, and Hypothesis” in EP 1.12, especially p. 186-189.

that the beans sampled from this bag are white, she may conclude that all the beans from this bag are white. Though Peirce does not speak yet of the level of confidence that the inquirer may place in this conclusion, we might foresee that she might not be certain of it, but may only hold it with some degree of probability, and so may ‘believe’ the conclusion, that is, be willing to take action, according to this degree of probability.

Third, the inquirer may only know that all the beans in this bag are white and that she has before her a sample of all white beans. She may then conclude the ‘case’ to be that the sample of beans was taken from this bag. Again, she may not be certain about her conclusion, and may only hold it tentatively. This, as I have mentioned, is only an illustration, but one crucial point should already come to light: for Peirce the classification of arguments really corresponds to, or reflects, the various basic kinds of reasoning that we might require in order to confront various situations in the course of our inquiries. As Peirce would put it in 1898, the three forms of argument are but the

“scaffolding” that helps us to contemplate the structure of the three basic kinds of inferential reasoning (RLT, p. 141).26
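Since the three figures merely rearrange the same three propositions (rule, case, result), the contrast can be made concrete in a short sketch. The following Python toy model is my own illustration, not anything found in Peirce or in the sources cited here; the bags, colors, and sample sizes are invented. It is meant only to show that, when its premises hold, the deductive conclusion cannot fail, whereas the inductive and hypothetic conclusions remain fallible and are held with more or less confidence.

```python
# A toy illustration (not from Peirce) of the bean-bag triad: the same three
# propositions (rule, case, result) occupy different places in each inference.
import random

random.seed(0)

bag_a = ["white"] * 100                     # the rule is known: all beans in bag A are white
bag_b = ["white"] * 80 + ["black"] * 20     # a second bag, unknown to the reasoner

def deduction(bag, k=5):
    """Rule + case -> result: a sample drawn from an all-white bag must be all white."""
    sample = random.sample(bag, k)
    return all(bean == "white" for bean in sample)

def induction(bag, k=5):
    """Case + result -> rule: from an all-white sample, infer that the whole bag is white."""
    sample = random.sample(bag, k)
    inferred_rule = all(bean == "white" for bean in sample)
    actual_rule = all(bean == "white" for bean in bag)
    return inferred_rule, actual_rule       # the inferred rule may outrun the facts

def hypothesis(sample):
    """Rule + result -> case: an all-white sample is conjectured to have come from bag A."""
    return "bag_a" if all(bean == "white" for bean in sample) else "unknown"

print(deduction(bag_a))                     # always True, given that the premises are true
print(induction(bag_b))                     # possibly (True, False): a fallible generalization
print(hypothesis(random.sample(bag_b, 5)))  # may answer "bag_a" though the beans are from bag B
```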

Now, I think the preceding illustration also brings to light a relation between the classification of arguments and the universal categories that we may anticipate now but

will become problematic later. Though Peirce does not discuss the classification in these

terms, we may take a ‘rule’ to be of the nature of a ‘third’ since it mediates between all

possible cases and results concerning, say, the beans in this or other bags and the possible

samples of beans. Similarly, we may take a ‘case’ to be of the nature of a ‘second’ since

26 Following standard practice in Peirce scholarship, I will abbreviate Peirce 1998b, Reasoning and the Logic of Things, as RLT.

it is of the nature of a ‘relation’; in this case, a relation between these sampled beans, on the one hand, and this bag of beans, on the other. Finally, we may take a ‘result’ to be of the nature of a ‘first’ inasmuch as it is of the nature of a ‘quality’; for example, the proposition that ‘these beans are white’ predicates, according to Peirce, the quality

“whiteness” to belong to “these beans” as a subject. From this “categorical” perspective, I suggest that the three kinds of argument would appear as follows: (1) Deduction infers a

‘quality’ on the basis of a ‘relation’ and a ‘general rule’; (2) induction infers a ‘general rule’ on the basis of a ‘relation’ and a ‘quality’; and (3) hypothesis infers a ‘relation’ on the basis of a ‘general rule’ and a ‘quality’.27 Let us hold on to this correspondence

between the three classes of argument and the categories, although we will find in the

course of the forthcoming investigations that this correspondence becomes problematic,

even in terms of Peirce’s own evolving philosophical system.

Regarding the classification of arguments and the corresponding classification of

kinds of reasoning, at any rate, Peirce usually addresses the issue by showing that there

are three basic figures of syllogism that correspond, in turn, to three basic kinds of

reasoning. In other words, he typically addresses the triad in reasoning by showing, first,

that there is a formal distinction between three kinds of argument and, second, that this

formal distinction corresponds to an actual distinction between three kinds of inferential

reasoning. This is also the strategy that he deploys in what we might call his “mature”

27 Admittedly, this suggestion is based on Peirce’s 1867 exposition of the classification of arguments which relies on the form of the syllogism. As Peirce’s views on logic developed along with the discovery of the logic of relations, the ordered structure of the syllogism fades away from his account of the three types of reasoning, a kind of egalitarianism of predicates follows, and so the adequacy of my suggested correspondence between the categories and the three forms diminishes. However, I think that my suggested correspondence between categories and syllogistic forms of reasoning is helpful for the purposes of an introductory exposition.

classification, arguably starting in his 1898 Cambridge Conferences lecture entitled

“Types of Reasoning.”28

I cannot address here Peirce’s formal argument to show that there are three basic

figures of the syllogism.29 Since my ultimate goal is to address mathematical reasoning as

an activity, I will only take up his argument that the three basic figures correspond to

three different kinds of reasoning. The key to the argument is that, according to Peirce,

“demonstrative inference is the limiting case of probable inference. Certainty pro is

probability 1. Certainty con is probability 0” (RLT, p. 136). This is reflected in the first

figure of the syllogism. Peirce presents the probable syllogism as follows:

The proportion r of the Ms possess π as a haphazard character;

These Ss are drawn at random from the Ms;

Therefore, probably and approximately, the proportion r of the Ss possess π.

Following Hilary Putnam, we may say that “the situation that Peirce envisaged in

the inference under discussion [is that we] have a population M from which we are

sampling [and the] sampling is to be such that the attribute π occurs randomly in the sequence of Ms generated by the method of sampling (‘the Ss’)” (“Comment” in RLT, p.

63).30 Additionally, we may take “probably” in the conclusion to mean “with a high

28 See RLT, p. 123-142. 29 Peirce’s argument can be found in “On the Natural Classification of Arguments” (W 2, p. 23-48). Other partial versions of the argument are found in “Deduction, Induction, and Hypothesis” (EP 1.12) and in “Types of Reasoning” (RLT, p. 123-142). Hilary Putnam’s “Comments” on this last text (in RLT, p. 59-67) are also helpful and illuminating. The reader may notice that, strictly speaking, there are four basic figures of the syllogism, but Peirce considers the fourth to be a “mixed” kind corresponding to inference by “analogy.” For the details, the preceding references are also helpful. 30 See RLT, p. 136-137. Peirce adds four “explanatory remarks” regarding what he means by ‘probably and approximately’, ‘the proportion r’, ‘drawn at random’, and ‘haphazard’. See Putnam’s “Comments” (in RLT, p. 61-65) for a discussion of Peirce’s remarks from a contemporary perspective on these probability concepts.

degree of confirmation” (Putnam, “Comment” in RLT, p. 64). Moreover, Peirce specifies that the proportion r does not have any precise numerical value although, as a proportion, r must lie between 0 and 1. Peirce conceives of the reasoning regarding r as follows: distribute all possible values between 0 and 1 into two parcels; our statement is that r belongs to one of the two parcels. For example, we could state that ‘more than half of the

Ms, and thus of the Ss, possess π’. This is stating that r > 0.5. Or we could state that

‘nearly all of the Ms, and thus of the Ss, possess π’. If by “nearly all” we mean a percentage greater than 95%, for example, then we are stating that r > 0.95. The point is that r falls within one of two intervals between 0 and 1 (RLT, p. 136).

If the proportion r = 1 in the limiting case, and if we take π to be the quality P, that is, if all the Ms possess some quality P, then this probable inference becomes a valid demonstrative syllogism in the first figure:

All M are P

All S are M

All S are P.

Similarly, if the proportion r = 0, the probable inference also becomes a valid

demonstrative syllogism in the first figure:

No M are P

All S are M

No S are P.


In general, then, valid demonstrative syllogisms are limiting cases of probable inferences in the first figure.31 In Peirce’s words, the first figure of probable reasoning embraces “all necessary reasoning as a special case under it. It is in fact Deduction” (RLT, p. 138).
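The limiting-case argument can also be seen numerically. The following sketch is only an illustration under parameters of my own choosing (a finite population of 10,000 Ms and a random sample of 500 Ss), not a reconstruction of anything Peirce writes: when r lies strictly between 0 and 1, the sample proportion merely approximates r, ‘probably and approximately’; when r is 1 or 0, the conclusion holds with certainty, which is the sense in which demonstrative syllogisms appear as limiting cases of the first figure.

```python
# A hedged numerical sketch (mine, not Peirce's) of the first figure of probable
# reasoning: the proportion r of the Ms possess the character pi; the Ss are
# drawn at random from the Ms; hence, probably and approximately, the proportion
# r of the Ss possess pi. Population and sample sizes are arbitrary choices.
import random

random.seed(1)

def first_figure(r, population_size=10_000, sample_size=500):
    """Predict the sample proportion from a known population proportion r."""
    count = int(r * population_size)
    ms = [True] * count + [False] * (population_size - count)
    ss = random.sample(ms, sample_size)     # 'drawn at random from the Ms'
    return sum(ss) / sample_size            # proportion of the Ss possessing pi

print(first_figure(0.7))    # close to 0.7: 'probably and approximately'
print(first_figure(1.0))    # exactly 1.0: the limiting case, demonstrative deduction
print(first_figure(0.0))    # exactly 0.0: the other limiting case ('certainty con')
```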

Putnam emphasizes that Peirce did not regard this argument as a justification of demonstrative reasoning “nor, of course, did he regard it as a justification of probability inference. His aim was to provide a natural classification of the three types of arguments”

(“Comment” in RLT, p. 63).

Given Peirce’s description, deductive reasoning consists in drawing an inference about a specific character of the objects, events, or phenomena in a sample on the basis of our knowledge of the character of the objects in the population from which we know the sample to be drawn at random. Again, with respect to the categories we may surmise (i) that knowledge of the character of objects or events in a population is knowledge of a

‘general rule’, (ii) that knowledge of the provenance of a sample is knowledge of the relation between the objects in the sample and the population from which they are drawn, so that (iii) an inference regarding the character of the objects in a sample is an inference regarding a ‘quality’ of the objects. Be that as it may, it is clear that, under the current description, in deductive reasoning we proceed from our knowledge of the character of a population and of the method of sampling to a conclusion about the character of the sample.

31 As Putnam observes, Peirce assumes here, without stating it, that the population of Ms is finite (“Comment” in RLT, p. 62). As Putnam also notes, Peirce is aware of this assumption because Peirce later admits that the statement ‘the probability that an M is π is 1’ is “compatible with there being Ms that are not π” (Putnam in RLT, p. 275, note 12). This is possible when the population of Ms is infinite because a single M that is not π becomes negligible for the estimation of probabilities. I might add that, strictly speaking, it is because Peirce makes the assumption that the population of Ms is finite that he claims that ‘certainty pro is probability 1’ and ‘certainty con is probability 0’.

Turning to a second kind of reasoning, Peirce argues that a valid syllogism in the third figure is the limiting case of inductive reasoning. The third figure of probable reasoning is derived from the first figure just as the third figure of a necessary syllogism is derived from the respective first figure, that is, by interchanging the major premise and the conclusion while denying both (RLT, p. 139). Take the major premise that ‘the proportion r of the Ms possess π as a haphazard character’. Recall that r only means that the true ratio of the Ms that possess π falls within one of two intervals between 0 and 1.

According to Peirce, if we let ζ denote any ratio contained in the interval to which r does not belong, then the denial of the major premise will be the proposition that ‘the proportion ζ of the Ms possess π as a haphazard character’. Likewise, the denial of the conclusion that ‘the proportion r of the Ss possess π’ will be the proposition that ‘the proportion ζ of the Ss possess π’. Therefore, the third figure of probable reasoning is the following:

These Ss are drawn at random from the Ms;

Of these Ss, the proportion ζ possess the haphazard character π;

Therefore, probably and approximately, the proportion ζ of the Ms possess π.

For Peirce, this is the formula of induction (RLT, p. 139). The form of the argument reflects the structure of inductive reasoning—from the character of a sample of objects drawn at random from a population we infer the character of the population. Again, in terms of the categories, we may surmise that induction consists in inferring a ‘general rule’ that pertains to a population on the basis (i) of the ‘relation’ between the objects in the sample and the population from which they are drawn, and (ii) of the prevalence of a

‘quality’ in the sampled objects.
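The third figure runs the same sampling situation in the opposite direction, and a brief sketch may help; it is again my own toy illustration with invented numbers, not Peirce’s. From the proportion observed in a random sample of Ss we infer, probably and approximately, the proportion in the whole population of Ms, and repeated samples show that the inferred ratio is only approximate.

```python
# A toy illustration (not from Peirce) of the third figure: inferring the ratio
# in the population of Ms from the ratio observed in a random sample of Ss.
import random

random.seed(2)

# A hypothetical population in which 63% of the Ms possess the character pi.
ms = [True] * 6_300 + [False] * 3_700

def third_figure(population, sample_size=500):
    """Estimate the population ratio from the ratio observed in the sampled Ss."""
    ss = random.sample(population, sample_size)
    return sum(ss) / sample_size

print([round(third_figure(ms), 3) for _ in range(3)])  # estimates near, but not exactly, 0.63
```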

Induction, then, is usually for Peirce the form of reasoning that leads to knowledge of general rules or laws. But the ‘generality’ here is not equivalent to universality, especially when the population under study is infinite. Peirce notes that

“[a]ll that induction can do is infer the value of a ratio, and that only approximately”

(RLT, p. 136). This is because if the Ms are infinite in number, the ratio of Ms that possess the quality π to all the Ms may be as 1 to 1, and “still there may be exceptions….

Consequently, induction can never afford the slightest reason to think that a law is without exception” (RLT, p. 139-140). For Peirce our knowledge of general rules or laws of nature comes from our ‘experience’, where ‘experience’ consists in everything that our

‘reactions’ with outward reality force our minds to acknowledge. When we reason inductively, we infer general laws on the basis of our sample of experience. Now, it would actually be a fallacious induction to draw the ‘sample’ first—that is, to gather the collection of objects, events, and so on, first—and only then look for a character present in a certain proportion of that sample. If we then proceed to infer that the character is found throughout the population in the same ratio as it is found in the sample, we commit a fallacy, for we can find any number of common characters or qualities in any collection of objects (RLT, p. 137-138).32 What we must do in order to reason legitimately—not

only in induction but in all probable reasoning that relies on sampling—is to choose first

the character π to be studied or tested in advance of the drawing of the sample and only

then proceed to examine the proportion in which π is present in that sample. Induction, in

particular, consists in deliberately choosing a character for study and proceeding to

32 This fallacy is akin to what contemporary statisticians call “data snooping,” which consists in looking or “snooping” into the statistical data prior to deciding which statistical hypotheses to test with it.

ascertain the prevalence of the character in a sample in order to draw a conclusion regarding the population. It is in this way that inductive reasoning leads to probable and approximate knowledge of general rules regarding the nature of a population.
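The requirement that the character be chosen before the sample is drawn can likewise be illustrated numerically. The sketch below is my own toy example with invented parameters (twenty independent characters, each truly present in half of an idealized population); it is meant only to show why predesignation matters: an estimate for a character fixed in advance is honest, while picking whichever character happens to be most frequent in the sample after the fact systematically overstates its prevalence.

```python
# A toy illustration (not from Peirce) of the fallacy just described, akin to
# what footnote 32 calls 'data snooping'. Every character below is truly present
# in half the population, so any estimate well above 0.5 is an artifact of
# choosing the character after inspecting the sample.
import random

random.seed(3)
N_CHARACTERS = 20
SAMPLE_SIZE = 30

def draw_object():
    """An object with 20 independent binary characters, each present with probability 0.5."""
    return [random.random() < 0.5 for _ in range(N_CHARACTERS)]

sample = [draw_object() for _ in range(SAMPLE_SIZE)]

# Legitimate induction: the character (index 0) was predesignated before sampling.
predesignated = sum(obj[0] for obj in sample) / SAMPLE_SIZE

# Fallacious induction: inspect the sample first, then report the most common character.
snooped = max(sum(obj[i] for obj in sample) / SAMPLE_SIZE for i in range(N_CHARACTERS))

print(predesignated)   # about 0.5, the true prevalence
print(snooped)         # noticeably above 0.5, although every true prevalence is 0.5
```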

However, as Peirce himself emphasizes, while inductive reasoning can lead to knowledge of general rules, it “can never make a first suggestion” (RLT, p. 139). That is, induction itself can never suggest for investigation the possible laws, rules, regularities, or uniformities that may prevail in a population of phenomena. It can only infer the ratio in which the character already chosen for study is present in the population; this ratio gives us a probable and approximate general rule. Induction can only confirm or deny, with a given degree of approximation, the prevalence of conjectured laws, rules, or regularities in the population. In short, inductive reasoning only serves to test a conjecture regarding the general character of a population; it can never suggest a conjecture for inductive testing. Accordingly, Peirce places great emphasis on the ampliative force of a third kind of reasoning, different from both deduction and induction—a form of reasoning that results in what Peirce calls “first suggestions” or conjectures of all kinds, including suggestions regarding general laws or regularities.

The third kind of reasoning is abduction or retroduction. The form of abductive argument corresponds to the second figure of the probable syllogism:

Anything of the nature of M would have the character π, taken haphazard;

S has the character π;

Provisionally, we may suppose S to be of the nature of M.

Peirce restates this in conditional form as:

If µ were true, then π, π’, π’’ would follow as miscellaneous consequences;

But π, π’, π’’ are in fact true;

Provisionally, we may suppose that µ is true.

He observes that this “kind of reasoning is very often called adopting a hypothesis for the sake of its explanation of known facts” where the resulting explanation is the modus ponens:

If µ is true, then π, π’, π’’ are true;

µ is true;

Therefore, π, π’, π’’ are true. (RLT, p. 140)

In short, for Peirce abductive reasoning is inference to an explanatory hypothesis, and

this form of reasoning is to be distinguished from induction, in which we have already

adopted a hypothesis and are only testing its consequences.
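The bare argument form can be pictured schematically. The sketch below is not Peirce’s, and the hypotheses and consequences in it are placeholder names of my own invention; it encodes only the selection step: among candidate hypotheses whose consequences are given, we provisionally adopt those under which the observed facts would follow as a matter of course. It captures at most what will shortly be called ‘habitual’ abduction, since it presupposes that the candidate hypotheses and their consequences are already in hand, and it says nothing about how a genuinely novel hypothesis is first conceived.

```python
# A schematic sketch (my own, not Peirce's) of the abductive argument form above.
# The hypotheses mu_1, mu_2, mu_3 and their entailed consequences are invented
# placeholders for the schema: if mu were true, then pi, pi', pi'' would follow.

consequences = {
    "mu_1": {"pi", "pi_prime", "pi_double_prime"},
    "mu_2": {"pi", "rho"},
    "mu_3": {"sigma"},
}

observed = {"pi", "pi_prime", "pi_double_prime"}   # the facts that call for explanation

def abduce(observed_facts, candidates):
    """Return the hypotheses under which all observed facts would be a matter of course."""
    return [mu for mu, entailed in candidates.items() if observed_facts <= entailed]

print(abduce(observed, consequences))   # ['mu_1'], adopted provisionally, for later testing
```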

When in our inquiries we are confronted with new, often puzzling, facts that we

seek to explain, we are in a situation that requires us to make a conjecture that would

explain the facts, and to adopt the conjecture provisionally as a hypothesis that we may

test. Experiential sampling is also crucial in this situation. It is the course of experience, whether in everyday life or in methodical inquiry, that often presents us with unexplained phenomena. In this situation, then, the sample of facts is given to us by the course of experience. So long as this is the case, we may consider the sample to be taken haphazard. The ‘abductive suggestion’ consists in the conjecture that a general rule, say the general character of a certain type of event, explains the facts under investigation.

At first glance, it may seem that the categories apply neatly to abductive reasoning. Abduction seems to consist in inferring the ‘relation’ between a sample of

observed phenomena and a population of a given kind of object or event on the basis (i) of our knowledge of a certain ‘quality’ that pertains to the observed, sampled phenomena and (ii) of the ‘general nature’ of a population. However, I think that abduction cannot be cast in these categorical terms so neatly because there are at least two species of abductive reasoning. Under one species, which we might term ‘habitual abduction’, the inquirer already knows a general rule or law, and the reasoning consists in grasping that the known general rule, when applied to the facts under investigation, provides an explanation for those facts. So the inquirer provisionally hypothesizes that the general rule is at work in the production of the facts. In short, the ‘abductive suggestion’ is the hypothesis that the observed facts result from the known rule or law. Habitual abduction, then, usually takes the form of the conjectural classification of facts by way of laws for the purpose of explaining those facts. However, under the second species, which we might term ‘creative abduction’, the general rule itself is not known in advance of the inquiry into the observed facts and their explanation. The inquirer is confronted with puzzling or surprising facts, but she does not know of a general rule, law, or nature that may readily explain the facts. She must perceive, imagine, or ideate the explanation itself.33 In terms of the preceding formula of abductive argument, the difference between

‘habitual’ and ‘creative’ abduction is that in habitual abduction the inquirer already

knows the general rule stated in the first premise and her reasoning consists in associating

the rule with the observed phenomena, while in ‘creative abduction’ she must conceive

33 For now, I will leave open the question of how the inquirer “abduces” the general rule, law, or nature that explains the facts; that is, of whether perception plays a role, or whether creative abduction is a pure process of imagination, rational ideation, and so on. I will take up this issue in the context of the case study, in section 6.3.

the general rule itself, and state it in the major premise of the argument. Thus, creative abduction involves the discovery of ‘generality’, and it cannot be neatly described categorically as a process of inferring a ‘relation’ on the basis of a known ‘general law’ applied to the ‘quality’ of observed phenomena. We will therefore need to pay close attention to the distinction between habitual and creative abduction in the forthcoming study into the logic of mathematical inquiry, especially when, in chapter 6, we come to confront questions of how to warrant the application of general mathematical results to the scientific study of natural phenomena.

Let us close this preliminary discussion of the triad in reasoning by way of

Peirce’s own summary:

We see three types of reasoning. The first figure embraces all Deduction whether necessary or probable. By means of it we predict the special results of the general course of things, and calculate how often they will occur in the long run. A definite probability attaches to the Deductive conclusion because the mode of inference is necessary. The third figure is Induction by means of which we ascertain how often in the ordinary course of experience one phenomenon will be accompanied by another. No definite probability attaches to the Inductive conclusion, such as belongs to the Deductive conclusion; but we can calculate how often inductions of a given structure will attain a given degree of precision. The second degree of reasoning is Retroduction [or Abduction]. Here, not only is there no definite probability to the conclusion, but no definite probability attaches even to the mode of inference. We can only say that the Economy of Research prescribes that we should at a given stage of our inquiry try a given hypothesis, and we are to hold to it provisionally as long as the facts will permit. There is no probability about it. It is a mere suggestion that we tentatively adopt. (RLT, p. 141-142)

I might only add that, as we will see, the three kinds of reasoning are themselves divided into many species. At the very least, deduction can be classified as corollarial or theorematic, induction as qualitative or quantitative,34 and abduction as habitual or

34 See Hookway 1985, p. 208-229.

creative hypothesis. All of these forms of reasoning have their place in mathematical and scientific reasoning. Moreover, the case study will show that a fourth kind of reasoning, analogy, to which Peirce ascribes a “mathematical provenance” and which he considers to be of a “mixed character” in relation to the three basic kinds (RLT, p. 141), is also crucial to mathematical reasoning.

2.2.3 On Peirce’s Distinction between Induction and Abduction

As I have already noted, Peirce’s triadic classification of arguments and forms of reasoning evolves throughout the course of his philosophical and logical investigations.

The progressive and evolving distinction between induction and hypothesis, which in his mature work becomes a thoroughgoing distinction between induction and abduction, is a crucial part of the development of his triadic classification. Peirce, who was prone to providing intellectual autobiographies in his writings and lectures, acknowledges and recounts his evolving views on the distinction at various junctures.35 Only gradually did

Peirce come to view abduction as a distinct form of reasoning, not to be classified as a

species of induction but as a separate kind of reasoning altogether. Now, the distinction

between induction and abduction is at once one of the most original and controversial

aspects of Peirce’s logical classifications. Therefore, in preparation for our case study, it

will be worthwhile to conclude this introduction to the triad in reasoning by expounding

briefly Peirce’s reasons for the distinction between induction and abduction. Of course,

35 See for instance RLT, p. 141.

we have already seen that the formal structure of both forms of argument is different, corresponding to two different reasoning situations that an inquirer might confront. Thus, we should take the forthcoming arguments to be an elaboration of the distinctions already laid out implicitly in the preceding triadic classification.

In his 1877-1878 Illustrations of the Logic of Science, Peirce notes at least four interrelated distinctions between induction and hypothesis. (1) The first distinction regards the kinds of conclusion that result from induction and from hypothesis. At this stage of his work, Peirce characterizes induction as the “inference of a rule” and hypothesis as the “subsumption of a case under a class” (EP1, p. 189, 191). By induction we “conclude that facts, similar to observed facts, are true in cases not examined” while by hypothesis we “conclude the existence of a fact quite different from anything observed, from which, according to known laws, something observed would necessarily result” (EP1, p. 194).

(2) Induction is “reasoning from particulars to the general law” while hypothesis is “reasoning from effect to cause” (EP1, p. 94). This distinction follows from the previous one. In induction we reason that what we have observed in particular cases will generally be true in other similar, though as of yet unobserved, cases. In hypothesis, we reason that a fact which we have observed is the effect of another unobserved fact of a different kind, which is its cause.

(3) Induction “classifies” while hypothesis “explains” (EP1, p. 194). Peirce does not elaborate on this distinction. I take him to mean that induction classifies observed

particulars into a general class of objects or events.36 Hypothesis in turn consists in

tentatively supposing that an observed fact follows, by way of known causal laws, from

other unobserved facts; the explanation, then, consists in associating two facts as being

related as cause and effect by way of causal laws.37

(4) The fourth distinction is actually a more forceful statement of an aspect of the first one. Peirce calls it the “great difference” between induction and hypothesis, namely, that the former “infers the existence of phenomena such as we have observed in cases which are similar” while the latter “supposes something of a different kind from what we have directly observed, and frequently something which it would be impossible for us to observe directly” (EP1, p. 197). Alternatively, Peirce writes that “the essence of an induction is that it infers from one set of facts another set of similar facts, whereas hypothesis infers from facts of one kind to facts of another” (EP1, p. 198). This distinction indeed will remain central throughout the evolution of Peirce’s views on the classification of forms of reasoning, as it implies that it is by way of hypothesis, not induction, that we discover causes and form novel conceptions, at least as tentative conjectures. Induction does not find anything altogether new; it only associates unobserved phenomena with observed phenomena of a similar kind by way of a rule.

Hypothesis, however, consists in supposing some unobserved fact—often a fact not previously conceived—to hold as an explanation for observed facts of an entirely

36 I must observe, however, that in Peirce’s later thought ‘habitual abduction’ is also classificatory; but the purpose of the classification is to explain a phenomenon, not to assess the probability that the general class leads to good predictions about the characters of particulars. 37 Whether this is an adequate account of explanation within the Peircean framework will be an important component of the case study in chapter 8.

different kind. As this central distinction evolves, Peirce will come to provide an account of how such novel conceptions or suppositions come about.

Admittedly, the previous four distinctions between induction and hypothesis do not constitute formal reasons for distinguishing between the two forms of inference. The formal reasons for the distinction are given by the theory of the categories and its formal implications for the triad in reasoning. However, pragmatic reasons are clearly important for Peirce. According to him, the “utility and value of the distinction are to be tested by their applications” (EP1, p. 198). Attention to the applications or consequences of our conceptions is the mark of Peirce’s pragmatism.38 Peirce anticipates some such

consequences, including: (1) That induction is a ‘much stronger’ kind of inference than

hypothesis, in the sense that its conclusions are not merely tentative so that, I presume,

we must distinguish between them in our logical studies of actual arguments; and (2) that

it is impossible to infer hypothetical conclusions via induction so that, I presume, in order

to have a thorough logical account of the forms of reasoning involved in scientific inquiry, we must carve a place for hypothesis-making as rational inference (EP1, p. 198).

Overall, I interpret Peirce to mean that the distinction between induction and hypothesis

is not only justified by the triadic structure of reasoning but also by its value in furthering

a deeper, more thorough and subtle understanding of mathematical and scientific inquiry,

of their nature, history, and actual practice. If we fail to make the distinction, we fail to

understand theoretical inquiry in depth. A central failure would consist in neglecting to

account for the distinction between associating facts of a similar kind by way of a general

38 For Peirce’s statement of the pragmatic maxim, see “How to Make our Ideas Clear” (EP 1.8).

rule and conceiving of an entirely novel fact and supposing it to be associated, via a causal law, with observed facts of a different kind. According to Peirce, a theory of logic that does not distinguish between induction and hypothesis-making commits this neglect.39

Now, as Peirce’s triadic classification evolves, he sharpens the distinction

between induction and abduction. As we have already seen, in the 1898 Cambridge

Conferences Lectures Peirce provides the distinct structural forms of inductive and

abductive argument. Now, their structural form reveals a remarkable distinction between

inductive and abductive reasoning—inductive reasoning relies upon abductive

assumptions. Hilary Putnam writes that “by requiring that ‘induction’ include a premise

to the effect that the sampling method is random, Peirce was telling us that all induction requires prior knowledge of lawlike statements. For the statement that a method of sampling is random…requires knowledge of the equality of certain future frequencies, and is thus a species of lawlike knowledge, knowledge of generals” (“Comment” in RLT, p. 67). I take Putnam to mean that induction, on the Peircean model, presupposes that any object from the population would be sampled with equal frequency in the course of sampling in the infinite long-run. This presupposition of randomness in sampling amounts to a presupposition of a lawlike process. And it is an abductive presupposition; a conjecture regarding the nature of the sampling process. The upshot of induction relying on retroductive assumptions is important to the problem of induction: “Peirce was not

39 It will in fact be part of my task to show that, without a suitable concept of hypothesis or abduction as distinct from induction, we would fail to understand the logic of inquiry involved in the work of the early mathematical probabilists.

pretending to answer Hume by justifying a form of nondeductive inference from just statements about particulars to a general law; rather, Peirce was saying that inductive reasoning always requires the presence of assumptions about the general course of the world (that certain ways of sampling are random). Those assumptions, in Peirce’s view, themselves come from Retroduction, and are thus not knowledge [in the traditional epistemological sense of ‘justified true belief’] but hypotheses which are ‘provisionally adopted.’ It is the making of hypotheses and not the empiricist’s beloved ‘induction’ that makes empirical knowledge possible” (Putnam, “Comment” in RLT, p. 67).

There is an important line of defense of Peirce’s notions of abduction and induction, and of the distinction between them, on the grounds that they solve Hume’s famous ‘problem of induction’ and its variants. The key to the argument is that induction is not justified by principles arrived at inductively, but that induction is justified by the success of the inductive method in the long-run, where ‘success in the long-run’ is variously defined according to one’s interpretation of Peirce, and that this success depends in part on sound abductive presuppositions. A particular inductive inference may be wrong, but if we persist in the use of the inductive method, by virtue of its form and structure our inferences will lead to the truth in the long-run. Induction, moreover, must rely upon hypotheses and abductive conjectures to get off the ground as a form of reasoning. So the origin of empirical knowledge is in abductive hypothesis-making and what we require to account for the origin of scientific knowledge is an inquiry into the epistemological sources of abduction.

Evaluating such arguments regarding the ‘problem of induction’ and its history would take us too far afield.40 However, I do want to emphasize what these discussions

reveal about Peirce’s philosophy of inquiry: Peirce is interested in accounting for the reasoning activity of scientific inquirers within the actual complex situations in which they find themselves thinking, and in assessing the merit of their methods of reasoning on

the basis of their aim within the context of scientific inquiry. The ‘economy of research’

in scientific inquiry requires that inquirers, at various points, must make abductive

conjectures and test them inductively. There is no way around this in actual scientific

practice. Since this is what their aim demands, scientific inquirers do not wallow in epistemological “paper doubt” because their abductive conjectures and inductive tests do not rise to the level of certainty and justification of deductive reasoning.41 They are rather

concerned with developing and deploying methods that will help the progress of inquiry.

Accordingly, Peirce evaluates the merit, value, or ‘warrant’ of scientific forms of

reasoning on the basis of whether they facilitate the progress of scientific inquiry. The

key question for Peirce is not whether abduction or induction can ever attain the level of

justification that deductive reasoning has, but whether a method of reasoning fulfills its

function within the economy of scientific research.

40 See Hookway 1985, p. 208-229, for an introduction to these issues and a guide to other relevant secondary literature. 41 Peirce calls “paper doubt” the disingenuous kind of doubt that some philosophers feign in order to pretend that their conclusions result from a skeptical inquiry. The archetypical “paper doubt” for Peirce is Cartesian doubt, which only pretends to question all presuppositions in order to find an absolutely certain foundation for knowledge. According to Peirce, it is impossible to doubt all of our deeply ingrained beliefs at once. Moreover, the course of experience continuously throws a scientific inquirer into real, living doubt that requires an honest inquiry into our actually unsettled beliefs. For more on the contrast between “paper doubt” and “living doubt” see “The Fixation of Belief” (EP 1.1).

This is evident in the 1898 Cambridge Conferences Lectures. Peirce argues that while induction serves to test claims regarding the general character of phenomena, it can never suggest the conjectures that are to be tested inductively. This again emphasizes that the two forms of reasoning respond to different demands that arise in the course of scientific inquiry. Abductive reasoning responds to the need, arising from the ‘economy of research’, that at a given stage of an inquiry into the principles and causes of observed phenomena we must come up with conjectures, no matter how tentative they may be or how unlikely it may seem at the outset that they will turn out to be true. Peirce offers the example of scientific inquiry into cuneiform writing. In order to take the first steps towards reading cuneiform inscriptions, inquirers had to make some guesses regarding their meaning, and had to hold these guesses provisionally as hypotheses for inductive testing (RLT, p. 142). The testing, I surmise, would consist in ascertaining whether the hypothesized interpretation of the inscriptions would lead to probably and approximately true readings of other, perhaps undiscovered, writings. Inductive reasoning responds to the need dictated by the economy of research that we must test our plausible conjectures in order for scientific inquiry to progress.

Now, the claim that hypothesis-making is a different process from induction is admitted by other philosophers. Familiar twentieth-century views on the philosophy of science, such as Karl Popper’s method of ‘conjectures and refutations’ and Carl

Hempel’s ‘hypothetico-deductivism’, allow for such a distinction.42 However, these

philosophical views exclude conjecturing or hypothesis-making from the realm of logical

42 See Popper 1959 and 1963, and Hempel 1966.

scientific reasoning. On these views, ‘conjecturing’ or ‘hypothesis-making’ is not strictly logical reasoning; it is a “psychological” process where, of course, “psychological” means not subject to, and even unworthy of, “logical” or “philosophical” study. These models of scientific reasoning forgo, at the outset, the task of accounting for hypothesis-making as an intrinsic part of scientific inquiry. In my estimation, this amounts to giving up on a difficult task by claiming that it is a task for others, for non-philosophers or non-logicians. Peirce, in contrast, does not give up before trying. His model of scientific inquiry, supported by his theory of the categories and his triadic classification of reasoning, offers a distinction between induction and hypothesis-making and argues that hypothesis-making or abduction is a reasoning process, subject to its own logic. Logic, for Peirce, is simply about self-control in reasoning. Part of the task of the philosopher or logician of science, then, is to provide an account of abduction.

Peirce fully develops abductive logic in his mature work. In the 1903 Harvard

Lectures on Pragmatism, Peirce provides one of his fullest and clearest expressions of the logic of abduction. He describes abduction as the logical process of forming an explanatory hypothesis and again classifies it as a logical form of inference distinct both from deduction and from induction. He shows that even though an abduction only asserts its conclusion “problematically or conjecturally,” as an inference it has a definite logical form, namely:

“The surprising fact, C, is observed;

But if A were true, C would be a matter of course.

Hence, there is reason to suspect that A is true” (EP2, p. 231).

“Thus,” Peirce continues, “A cannot be abductively inferred, or if you prefer the expression, cannot be abductively conjectured, until its entire contents is already present in the premiss, ‘If A were true, C would be a matter of course’” (EP 2, p. 231). This premise amounts to stating that A would explain C, so Peirce is arguing that we cannot infer A unless our inference be to an explanatory hypothesis. In short, a condition for the admissibility of a hypothesis is that the hypothesis would account for the facts, and on those explanatory grounds we hold the hypothesis to be plausible.

The result of abductive reasoning, then, is the suggestion of what may plausibly be the case, situation, fact, or entity that explains an observed phenomenon. This is different from the deductive conclusion that something must necessarily be the case under given hypothetical or axiomatic conditions and from the inductive conclusion that something probably will be the case, in a calculable proportion of cases, upon the fulfillment of some particular conditions in nature (EP 2, p. 215-216). According to

Peirce, in the whole process of scientific inquiry, abductive hypotheses are mere plausible suggestions that we then take up for inductive experimental testing. But these ‘mere abductive suggestions’ are the only source of scientific discovery, while inductive conclusions are a matter of experimental verification. Again, herein we find the key distinction or “great difference” between the reasoning situations that call for abduction or induction. The question of abduction is the question of explanation of observed facts, while the question of induction is the question of the degree of agreement between observed and predicted facts. Creative abductive inferences are the site of novel conception, as the entities involved in the causes, principles, or laws that plausibly explain the observed phenomena are often of a different nature than the observed

phenomena. That is, when we propose a hypothetical explanation for an observed fact, we often conceive causes, principles, or laws that involve entities that are essentially different from the observed fact. In abduction, we begin with an observed particular phenomenon, we suppose a general explanation—a cause, a law, or a principle—that involves entities that are of a different nature than the observed particular phenomenon, and we provisionally conclude that the supposition is plausible. There is no such need for innovative conception in induction. We only assess the probable and approximate degree to which we may generalize a rule or theory.

Having considered Peirce’s view, and in order to appreciate more clearly the reasons for distinguishing between induction and abduction, let us pose an objection.

Suppose that, along with Bertrand Russell, we claim that “the inferring of premises from consequences is the essence of induction” (Russell 1907, p. 273-274). This amounts to claiming that abduction is essentially induction, since in abduction we reason from observed consequences to plausible explanatory premises. Thus, we might say that abduction, if there is any such form of reasoning at all, is really just a species of induction. What would Peirce reply?

In the first place, at a formal level—that is, at the level of the structure of arguments—Peirce argues that there are two different basic figures of the syllogism that

“infer premises from consequences,” if we take “premises” to mean the premises of a deductive argument and “consequences” to mean the conclusion of a deductive argument,

or even of a probable argument, in the first figure.43 The third figure is an induction,

while the second figure is an abduction. Thus, Peirce would argue that as Russellian

objectors we fail to make a formal distinction between two ways of “inferring premises

from consequences.” Second, and more importantly, Peirce would argue that we are

failing to grasp the “great difference” between two forms of actual reasoning. In

induction, we do indeed “infer premises from consequences” in the aforementioned

sense. But all that we do is to infer a general, ‘probably and approximately true’ rule

about objects that are essentially of the same nature. From observed swans, for example,

we make an inference about all swans, or nearly all swans, or more than half of the

swans, and so on. Thus, there is no novelty of conception involved in induction. In

abduction, we also “infer premises from consequences.” But this type of reasoning arises

in an inquiring situation that requires conceptual innovation. We observe a phenomenon

and conjecture as to the nature of its cause. We confront a surprising fact and conceive

of a novel explanatory hypothesis. For example, we observe a regular pattern emerge out

of a succession of chance events, and we suppose a real probability to be at work in

producing the pattern.44 Abduction responds to a different inquiring situation than

induction, one that requires conceptual discovery. In sum, a broad view that sweeps under

the label ‘induction’ every inference from consequences to premises fails to make crucial

distinctions regarding both the form of different kinds of scientific argument and the

43 I think it is fair to interpret Russell in this way, since his larger point in the passage I quote is that we tend to believe that the premises of a deduction are true because we find that their deductive consequences are true. 44 This, in fact, will become the central example in section 6.3 of the case study.

actual nature of the reasoning activity of scientific inquirers in various contexts and situations.

Now, regarding the logic of abduction specifically, Peirce argues that it requires that an abductive hypothesis be (i) explanatory and (ii) capable of experimental verification, if the hypothesis is to be adopted (see EP 2.16).45 Elsewhere, Peirce provides

other rules for the logic of abduction, that is, for the process of adopting explanatory

hypotheses for testing. Hookway (1985, p. 225-226) provides a concise summary of other

elements of the logic of abduction, including Peirce’s recommendations that we should

(iii) favor hypotheses that seem simple, natural, and plausible to us (CP 6.447), (iv) prefer

theories that explain a wide range of phenomena to those more narrow in scope (CP

7.221); (v) be mindful of successful theories in other areas and choose hypotheses that

employ similar kinds of explanations, that is, be mindful of analogies; and (vi) keep

always present the question of economy of money, time, thought, and energy, which for

Peirce is the ‘leading consideration in Abduction’ (CP 5.600). (vii) Peirce also warns

against giving undue preference to hypotheses on the basis of their ‘antecedent

likelihoods’ (CP 5.599), emphasizing therefore that the question of the plausibility of an

abductive hypothesis is different from the question of its antecedent probability or

likelihood. A conjecture might be highly improbable and yet it may seem plausible to us.
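How these recommendations might bear on the order in which conjectures are taken up for testing can be gestured at with a deliberately crude sketch. The scoring scheme and the candidate hypotheses below are entirely my own invention; Peirce offers maxims about the economy of research, not a formula, so this is only a toy way of weighing simplicity, breadth of explanation, and cost of testing against one another.

```python
# A deliberately crude sketch of weighing the recommendations above when deciding
# which conjecture to test first. The scores and the hypothetical candidates are
# invented for illustration; nothing here is Peirce's own procedure.

candidates = [
    # (name, simplicity, breadth of phenomena explained, cost of testing)
    ("hypothesis_A", 0.9, 0.4, 0.2),
    ("hypothesis_B", 0.5, 0.9, 0.7),
    ("hypothesis_C", 0.3, 0.3, 0.1),
]

def economy_order(cands):
    """Rank candidates: prefer simple, broad hypotheses that are cheap to test."""
    return sorted(cands, key=lambda h: h[1] + h[2] - h[3], reverse=True)

for name, *_ in economy_order(candidates):
    print(name)    # one possible order in which to take the conjectures up for testing
```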

The last remark brings to the fore a final consideration. There are two tasks

involved in accounting for abductive reasoning. One is to describe its logical form and to

account for the rules that make it a rational process—in Peirce’s terms, a self-controlled

45 We will revisit and expound these rules in section 6.3 of the case study.

thinking activity in which we can describe and criticize the principles of reasoning that we employ. Another task is to explain how the ‘abductive suggestions’ or conjectures arise in the first place. Hookway distinguishes between these tasks as (i) clarifying “the rules we follow in ranking explanations as more or less plausible” and (ii) explaining

“how we have a reliable source of hypotheses, and why we are good at making guesses about the nature of phenomena” (1985, p. 224). The latter task concerns the origin of

‘creative abduction’. That is, it concerns the epistemic source of the supposition ‘if A were true, C would be explained’ which is involved in abductive inference. Admittedly, I leave this second task open for now. I will return to it in depth in the context of our case study. I will only anticipate that Peirce attributes the origin of first explanatory suggestions to our ‘Abductive Insight’—a faculty of reasoning, akin to instinct, that continuously transforms our ‘perceptual judgments’, which are not subject to rational self-control, into ‘abductive conjectures’ subject to logical criticism via the logic of abduction. My immediate aim for now has been rather to introduce Peirce’s triadic classification of reasoning, and to explain the reasons for distinguishing between induction and abduction as two distinct forms of reasoning.

2.3 Peirce’s Conception of Mathematics

Charles Peirce provides two definitions of mathematics in “The Essence of

Mathematics” (CP 4.227-244).46 He attributes the first definition to his father, Benjamin

Peirce, who “in 1870 first defined mathematics as ‘the science which draws necessary

conclusions’” (CP 4.228).47 Second, Charles himself defines mathematics as “the study of

what is true of hypothetical states of things” (CP 4.233). These two definitions comprise

the essence of Peirce’s own conception of mathematics, and so I consider them to be the

best starting point for understanding his thought regarding the nature of mathematics.

Since the first definition refers to drawing necessary conclusions while the second to

studying hypothetical states of affairs, their compatibility might not be apparent. As

Peirce himself acknowledges, “[i]t is difficult to decide between the two definitions of

mathematics; the one by its method, that of drawing necessary conclusions; the other by its aim and subject matter, as the study of hypothetical states of things” (CP 4.238). My

task here will be to reconcile these definitions by expounding what Peirce means by them. We will then have a preliminary understanding of Peirce’s view on the nature of mathematics as a reasoning activity.

46 This work is dated January-February, 1902, and written as the first section of chapter 3 of the Minute Logic, a larger work never completed by Peirce. 47 For Benjamin Peirce’s definition, see his “Linear Associative Algebra” (1870), section 1. The editors of The Collected Papers of Charles Sanders Peirce also refer to the American Journal of Mathematics, vol. 4 (1881).

2.3.1 Hypothetical States of Things

In putting forth his father’s definition, Peirce emphasizes that mathematics is ‘the science which draws necessary conclusions’ and not the science of drawing necessary conclusions (CP 4.239).48 We should notice at once that this first definition is a description of mathematical activity—under this description, what mathematicians do is

to draw necessary consequences. But from what do they draw necessary consequences?

This is what the second definition answers. Mathematics, as ‘the study of what is true of hypothetical states of things’, concerns itself exclusively with what follows necessarily from the general description of a purely hypothetical state of affairs. Let us take up first the question of what Peirce means by hypothetical states of things, and subsequently we will address what it means to ‘reason necessarily’ about them.

In an illuminating passage, central to his thought on mathematics, Peirce writes:

[A]ll modern mathematicians agree with Plato and Aristotle that mathematics deals exclusively with hypothetical states of things, and asserts no matter of fact whatever; and further, that it is thus alone that the necessity of its conclusions is to be explained. This is the true essence of mathematics; and my father’s definition is in so far correct that it is impossible to reason necessarily concerning anything else than a pure hypothesis. Of course, I do not mean that if such pure hypothesis happened to be true of an actual state of things, the reasoning would thereby cease to be necessary. Only, it never would be known apodictically to be true of an actual state of things. Suppose a state of things of a perfectly definite, general description. That is, there must be no room for doubt as to whether anything, itself determinate, would or would not come under that description. And suppose, further, that this description refers to nothing occult — nothing that cannot be summoned up fully into the imagination. Assume, then, a range of possibilities equally definite and equally subject to the imagination; so that, so far as the given description of the supposed state of things is general, the different ways in which it might be made determinate could never introduce doubtful or occult features. The assumption, for example, must not refer to any matter of fact. For questions of fact are not within the purview of the imagination ….

48 The latter formulation would be, according to Peirce, a definition of formal deductive logic, that is, of the science that studies how necessary inferences are drawn. See also CP 4.239.

Perhaps it would have to be restricted to pure spatial, temporal, and logical relations. Be that as it may, the question whether in such a state of things, a certain other similarly definite state of things, equally a matter of the imagination, could or could not, in the assumed range of possibility, ever occur, would be one in reference to which one of the two answers, Yes and No, would be true, but never both. But all pertinent facts would be within the beck and call of the imagination; and consequently nothing but the operation of thought would be necessary to render the true answer. Nor, supposing the answer to cover the whole range of possibility assumed, could this be rendered otherwise than by reasoning that would be apodictic, general, and exact. No knowledge of what actually is, no positive knowledge, as we say, could result. On the other hand, to assert that any source of information that is restricted to actual facts could afford us a necessary knowledge, that is, knowledge relating to a whole general range of possibility, would be a flat contradiction in terms.

Mathematics is the study of what is true of hypothetical states of things. That is its essence and definition. Everything in it, therefore, beyond the first precepts for the construction of the hypotheses, has to be of the nature of apodictic inference. No doubt, we may reason imperfectly and jump at a conclusion; still, the conclusion so guessed at is, after all, that in a certain supposed state of things something would necessarily be true. Conversely, too, every apodictic inference is, strictly speaking, mathematics. (CP 4.232-233)

For Peirce, then, a ‘pure hypothesis’ is a ‘state of affairs’—let us call it also a ‘world’— of a ‘perfectly definite, general description’. The state of things is ‘general’ in the

Peircean sense—that is, it is a real, even if not actual, world which is sufficiently vague so as to allow many different particular worlds to be actual instantiations of it, but also sufficiently ‘definite’ or delimited so as to leave no ambiguity regarding whether a particular world falls within the limits of the hypothetical description. Although not actual, a general mathematical world is real in as much as it is a world (a) that we can conceive and bring ‘within the purview of our imagination’ or within the discerning gaze of what Peirce sometimes also calls our “mind’s eye”49 and (b) a world that we can

understand by reasoning about it because it is subject to logical delimitations including,

49 See, for instance, NEM 4, p. 219, footnote.

for example, the limitation that the propositions that we can derive from the hypotheses that frame the mathematical world must be either true or false.

In terms of the Peircean categories, a ‘hypothetical state of things’ is a world in which Thirdness predominates. Christopher Hookway goes so far as to argue that for

Peirce mathematical objects, being purely hypothetical, lack Secondness altogether, that is, these objects are non-reactive—we do not actually and outwardly react with or sensually perceive pure mathematical objects (Hookway 1985, p. 205-207). Along the same lines, we might be inclined to go so far as to say that mathematical objects also lack

Firstness, that is, these hypothetical objects qua mathematical have no felt quality about them. This means, for example, that mathematical objects have no qualities such as texture, smell, color, or taste. In studying what is true of a purely hypothetical state of things, it is irrelevant to our reasoning whether those things have a particular color or smell; thus, the general description of our hypothetical world need not specify such qualities. At least initially, however, I want to be more guarded and simply claim that, for

Peirce, in a purely hypothetical state of affairs Thirdness predominates. The prevailing character of a mathematical world is its generality, but I want to leave open the possibility that such a world may have also a reactive and a sensible character.50

Let us elucidate the foregoing conception of a ‘mathematical world’ as a

‘hypothetical state of things’ by way of Peirce’s own archetypical example, the first book of Euclid’s Elements. According to Peirce, Euclid’s dominating idea in this book is that students can understand the first elements of geometry only if they understand the logical

50 I will elaborate on some reasons for this below.

structure of the theory. Thus, in the first book Euclid makes a particularly painstaking effort to show the logical structure of geometric theory without saying anything explicit about the logic of mathematics (NEM 4, p. 236).51 Consequently, Peirce often turns to

Euclid to elucidate his philosophical ideas about mathematics, and a fundamental

Peircean distinction regarding the essence of mathematics is illustrated by the distinction

between Euclidean geometry as physical science and as mathematical doctrine (NEM 4,

p. 209-210). We may regard Euclid’s geometry as the science of real, physical space; in that case, geometry is “certain to a closer degree of approximation” than any other special—physical or psychical—science, and it “must employ the most refined methods of observation” to solve problems of measurement of physical space (NEM 4, p. 209).

However, as physical science Euclidean geometry does not approach the degree of precision that it has when regarded as a pure mathematical theory. In order to understand

Euclidean geometry as purely mathematical, “the postulates, after having been thoroughly revised, must be considered as purely hypothetical; and the question that the geometer [qua mathematician] is answering must be understood to be, ‘What would be the properties of spatial figures, in a space having the properties embodied in the postulates?” (NEM 4, p. 209). As pure mathematics, then, geometry only asks what would be the case, were certain hypotheses to hold. I call this the Peircean conception of mathematics as the science of the ‘would-be’: “[T]he feature of mathematics which separates it widely both from Philosophy and from every special science is that the

51 This is not to say that there were not logical gaps in the theory; it is to say that in the first book of the Elements Euclid deploys his most “polished and endless labor and thought” towards displaying the logical structure of the geometric doctrine (NEM 4, p. 236).

mathematician never undertakes (quâ mathematician) to make a categorical assertion from the beginning of his scientific life to the end. He simply says what would be the case under hypothetical circumstances” (NEM 4, p. 208). For Peirce, the entire structure of

Euclidean geometry consists of (1) definitions, (2) postulates, (3) axioms, (4) corollaries,

(5) diagrams, (6) letters, (7) theorems, and (8) scholia (see NEM 4, p. 237-238). The postulates, in particular, express initial hypotheses in general terms (NEM 4, p.237).

Thus, the postulates allow us, as mathematicians, to describe a hypothetical state of affairs of our own conception in general terms.

We can now understand why, for Peirce, mathematics “asserts no matter of fact whatever.” In the first place, pure mathematics makes no categorical assertions about what is actually the case in nature; pure mathematicians only reason about a hypothetical world of their own creation (NEM 4, p. xiii). It is in this sense, then, that for Peirce “all mathematics is, in its purity, a study of an ideal system” (NEM 4, p. xvii; see also NEM

4, p. 191). Moreover, for Peirce the usual distinction between pure and applied mathematics is only a matter of degree; there is no hard division of mathematics into two discrete, ‘pure’ and ‘applied’, branches. All mathematics studies what is true of hypothetical states of affairs; thus, even “when a mathematician deals with facts, they become for him ‘mere hypotheses’” (CP 3.428). According to him, “modern mathematicians recognize as the truly essential characteristic of their science, that…it concerns itself with pure hypotheses without caring at all whether they correspond to anything in nature or not, or at least, disregards such correspondence entirely after its hypotheses are once formed” (NEM 4, p. 194).

Peirce provides the example of a working physicist who searches for a soluble mathematical problem that “resembles” a practical physical problem as closely as possible. This search consists in a “logical analysis of the problem” to express it in mathematical form, say in the form of a system of equations. For Peirce, the purely mathematical work begins “when the equations or other purely ideal conditions are given” (NEM 4, p. xv). We can consider this work to be ‘applied mathematics’ only in a limited sense: “‘Applied Mathematics’ is simply the study of an idea which has been constructed so as to be more or less like nature” (NEM 4, p. xv). Nevertheless, it remains the study of an ‘idea’ in the sense of a ‘general form’. Peirce again uses geometry as an example. Geometry is purely mathematical in as much as it is the study of the relations that obtain in a purely ideal system as described, for instance, by way of arbitrary definitions and hypothesized postulates. It can only be labeled ‘applied mathematics’ in as much as the hypothetical system is constructed so as to resemble an actual, or natural, system. In other words, it can only be labeled ‘applied mathematics’ in as much as the mathematician “makes use of space imagination to form icons of [ideal] relations which have no particular connection with space” (NEM 4, p. xv). By way of contrast, we could say that, according to Peirce, the label “pure mathematics” in its ordinary sense simply refers to the study of an idea which has been constructed regardless of whether it resembles anything in nature or not. In other words, the ideal system may turn out to resemble some actual system, but the mathematician does not conceive of it in that way nor construct it for the purpose of studying nature.

Another way to characterize Peirce’s conception is to say that all mathematics is the study of pure hypotheses and that the ordinary distinction between ‘pure’ and

‘applied’ mathematics merely refers to whether the mathematician, in conceiving the hypotheses, makes an appeal to real outward experience. Take geometry again as an example of ‘applied mathematics’. For conceiving the ideal system, the mathematician appeals to space imagination, and space is, for Peirce, a “matter of real experience”

(NEM 4, p. xv). For example, when we say that a line is the shortest distance between two points, Peirce argues that we must appeal to two real outward experiences. One is the experience of vision, since “a straight line is a line that viewed endwise appears as a point”; the other is the experience of movement or physical action, since “length involves the idea of muscular action” (NEM 4, p. xv). Since the proposition that ‘the straight line is the shortest distance between two points’ necessarily involves the connection of two experiences, it “cannot be resolved into a merely formal phrase, like 2 and 3 are 5” (NEM

4, p. xv). Interestingly, for Peirce arithmetic is an example of ‘pure mathematics’ as contrasted to ‘applied mathematics’ in terms of the appeal to experience. He writes, for instance, “2 and 3 are 5 is true of an idea only, and of real things so far as the idea is applicable to them. It is nothing but a form, and asserts no relation between outward experiences” (NEM 4, p. xv). For example, Peirce argues, suppose we have three objects, a candle, a book, and a shadow. If to these objects we join a book, the result is five, because there will be two shadows; and if we further join, say, five more candles, the result is eight because the shadows disappear. Now, nobody would take these facts to be violations of arithmetic “for the propositions of arithmetic are not understood as applicable to matters of fact, except so far as the facts happen to conform to the idea of number” (NEM 4, p. xvi). Peirce’s point is that, as a matter of experience, it is not the case that every time we join three objects and one, the result is four objects. The upshot is

that for Peirce the ideal system of arithmetic, unlike the system of geometry, is not constructed on the basis of experience so as to conform to a natural system, to resemble matters of fact. Thus we cannot call arithmetic ‘applied mathematics’. Nevertheless, qua mathematics both arithmetic and geometry study purely hypothetical states of affairs.

The example of arithmetic also makes headway into a preliminary view of the notion of mathematical truth involved in Peirce’s conception of mathematics. For Peirce,

“if the cognitions of arithmetic, for example, are true cognitions, or even forms of cognition, this circumstance is quite aside from their mathematical truth” (CP 4.232). In this sense, a ‘true cognition’ would mean our cognition of an actual positive fact, of a state of affairs existent in nature. Such cognitions are the matter of positive science, which deals with fact. But mathematics deals with hypotheses and asserts no matter of fact. Thus, mathematical truth cannot consist in the correspondence between our cognition of a state of affairs and the actual state of affairs. Mathematical truth rather consists in the necessity of the conclusions derived from mathematical hypotheses. I will turn to the logical nature of ‘necessary reasoning’ below.

2.3.1.1 Diagrams and Icons

Now I claimed previously that, even though in a ‘mathematical world’ generality or Thirdness predominates, Peirce’s conception of a ‘hypothetical state of things’ may not be reducible to a world of mere generality. One principal reason for this is that, for

Peirce, bringing a pure hypothesis ‘within the purview of the imagination’ or of ‘the

mind’s eye’ is not a mere metaphor. It means that we are capable of representing this purely hypothetical world by way of a ‘sign’. More strongly and precisely, we must represent a purely hypothetical mathematical world by means of what Peirce calls a

‘diagram’. Peirce uses this word in “the peculiar sense of a concrete, but possibly changing, mental image of such a thing as it represents. A drawing or model may be employed to aid the imagination; but the essential thing to be performed is the act of imagining” (NEM 4, p. 219, footnote). A diagram, then, is a sign that represents in our minds the objects and relations that conform to our hypothesis. We may “actualize” the imagined mental diagram by way of physical models, say via graphs, drawings or equations, but the diagram is a sign that conveys a ‘meaning’ to our ‘mind’s eye’—the

‘meaning’ being the form of the relations that hold according to our pure hypothesis.

According to Peirce, there are two kinds of mathematical diagrams, (a) the geometrical,

“which are composed of lines” and (b) the algebraic, “which are arrays of letters and other characters whose interrelations are represented partly by their arrangement and partly by repetitions of them” (NEM 4, p. 219, footnote). Regardless of their kind, however, their essential nature is that of being representations that convey the forms of relation that obtain in a purely hypothetical state of things. In short, the essential act of diagramming in mathematics is not the act of drawing a geometrical figure or writing an algebraic expression; it is rather the act of imagining a representation that embodies the relations among objects that hold in our purely hypothetical world.

Now, for Peirce mathematical diagrams are icons. Recall that an icon is a sign that conveys information about its object because it embodies the very qualities of the object it represents. Accordingly, for Peirce a geometric diagram itself possesses the very form of the geometrical relations that it represents. Likewise, an algebraic diagram itself embodies the very forms of the mathematical relations that it represents. Peirce writes, “the icon is very perfect in signification, bringing its interpreter face to face with the character signified. For this reason, it is the mathematical sign par excellence” (EP 2, p. 307).

Hookway observes that for Peirce a purely mathematical world is in fact an icon; whole mathematical theories are icons (Hookway 1985, p. 187). That is, they are representations of themselves. For Peirce, while an icon is ‘perfect in respect to signification’, in respect to denotation it is wanting since it “gives no assurance that any such object as it represents really [or rather, actually] exists” (EP 2, p. 307). This is why I want to leave open the possibility that for Peirce, even though the predominant character of a mathematical world is its generality or Thirdness, such a world may also have an inherent quality or Firstness since it has the mode of being of an icon.

To clarify Peirce’s view, let us see why mathematical diagrams are not indexes or symbols. A mathematical theory is not in real reaction with the object it denotes; for

Peirce a mathematical theory is a pure hypothesis, and so its only mode of reality is as a representation of itself in our minds. Thus, a diagram does not index any object other than itself. A mathematical diagram qua purely mathematical is not a symbol either because it does not represent another object or world to the interpreting mind; for example, it does not express laws of nature. For Peirce, for example, the expression of a physical law, such as the law of gravitation, in mathematical notation, is a symbol that represents a law of nature. Mathematical expressions, or diagrams, as representations of purely hypothetical systems, do not signify any such natural laws.

The upshot of entire mathematical theories being icons, then, is that these theories as pure hypotheses signify themselves perfectly by embodying the very forms of relation that they represent, but they do not necessarily denote any actually existing objects or forms of relation among actually existing objects, as indices do, and they do not necessarily express any laws of nature, as symbols do.

Moreover, as Hookway clearly points out, the iconicity of mathematical theories reveals that the object of mathematical study is the ‘form of a relation’: “[T]he mathematician is concerned with the forms of different relational structures, and his theories instantiate them” (Hookway 1985, p. 191-192; see CP 4.530-531). It is worthwhile again to consider one of Peirce’s own examples in arguing that the object of mathematical diagrammatic investigation is the ‘form of a relation’.

For example, let f1 and f2 be the two distances of the two foci of a lens from the lens. Then,

(1 / f1) + (1 / f2) = (1 / f0).

This equation is a diagram of the form of the relation between the two focal distances and the principal focal distance; and the conventions of algebra (and all diagrams, nay all pictures, depend upon conventions) in conjunction with the writing of the equation, establish a relation between the very letters f1, f2, f0 regardless of their significance, the form of which relation is the Very Same as the form of the relation between the three focal distances that these letters denote….Thus, this algebraic Diagram presents to our observation the very, identical object of mathematical research, that is, the Form of the harmonic mean, which the equation aids one to study. (CP 4.530)

In this case, we have the example of an algebraic diagram. The form of the relation between actual focal distances is diagrammed algebraically, or in the literal sense of “algebra,” reduced to a diagram whose elements have the same formal relations as the distances. The mathematician qua mathematician is interested in studying the formal structure of the relations, and the diagram perfectly embodies those very formal relations

of interest. Hookway cogently emphasizes that the diagram, for Peirce, does not keep the mathematician at “one remove” from his object of study (Hookway 1985, p. 190-191).

The mathematician qua mathematician is not interested in studying the actual physical magnitudes of focal distances; she is rather interested in studying the form of relation that is present in the very mathematical diagram itself. This form of relation, in other words, is present both in the actual focal distances and in the algebraic equation; and the mathematician, in studying the diagram, is studying directly, not indirectly or at one remove, the object of mathematical inquiry. Finally, it will be important below to elucidate the way in which this form of relation is present to our observation in the mathematical diagram.
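To make the form of this relation concrete, take numbers of my own choosing—they are not Peirce’s—and let f1 = 2 and f2 = 6 in any unit of length whatever. Then

(1 / 2) + (1 / 6) = (2 / 3) = 1 / (3/2),

so that f0 = 3/2. The reciprocal-addition structure—what Peirce calls the form of the harmonic mean—is exactly the same whether the letters are read as focal distances, as bare numbers, or as any magnitudes at all; and it is this form of relation, not the particular magnitudes, that the algebraic diagram places before the mathematician’s observation.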

2.3.1.2 The Applicability of Mathematical Theories

Now, if mathematical theories are icons, how can they become “applicable” to the study of actually existing nature? While, under the Peircean conception, mathematics is always the study of a purely hypothetical state of affairs, Peirce also recognizes the importance of the applicability of mathematics to the study of natural phenomena and so to the study of problems in the special, physical and psychical, sciences. According to

Hookway, in fact, for Peirce the purpose “of mathematical activity lies in its applications” (Hookway 1985, p. 185). So we might state the question in the terms that

Hookway does, namely, “what relation must obtain between a mathematical problem and

a non-mathematical one, for the solution to the former to be applicable to the solution to the [latter]?” (Hookway 1985, p. 186).

An initial key for answering this question is that for Peirce the hypothetical state of affairs for mathematical study may be conceived so as to resemble an actual state of affairs (NEM 4, p. xv). That is, the ideal mathematical systems may not only be icons representing themselves but also symbols representing systems of relations actually existing in nature. Hookway discusses Peirce’s example of a map as an icon of a physical terrain. Suppose there is an ‘isomorphism’ between the map and the terrain, so that the relations that hold between points in the map correspond to the relations that hold between places in the terrain.52 Then any problems about relations obtaining between

places in the terrain, such as how distant one place is from another, will be reducible to

problems about relations between points in the map. Hookway notes that when there is an

isomorphic correspondence between the relations found in the sign and those found in

what the sign represents, Peirce says that “sign and signified display the same ‘form of

relation’: the map is an icon of the terrain because both embody relational structures of

the same form” (Hookway 1985, p. 190). Hookway’s answer to the question of

applicability, then, is that the solution of a mathematical problem is applicable to the

solution of a non-mathematical one when there is an isomorphism between the

mathematical theory and the reality to which it is applied (1985, p. 189-191). Although I

think Hookway is correct in so far as an ‘isomorphic correspondence’ illustrates the kind

52 Hookway describes the isomorphism of this example in detail. We can take ‘isomorphism’ here to mean that two structures—say one ideal, I, and the other physical, P—are isomorphic when (i) the two structures have the same number of elements and (ii) when the relations that hold among the elements of I have the same pattern as the relations that hold among the elements of P (see Brown 1999, p. 37).

of ‘resemblance’ that Peirce means to suggest, in this preliminary exposition I do not want to foreclose the possibility that ‘homomorphism’53 and other types of ‘resemblance’ or ‘analogy’ might still belong within the Peircean conception.
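The structural condition at issue in footnotes 52 and 53 can be put in a small computational sketch. The fragment below, written in Python, is only an illustration of my own—the map points, place names, and the ‘north of’ relation are invented—but it expresses the test that matters: a one-to-one correspondence between the elements of an ideal structure I and a physical structure P is an isomorphism just in case it carries the relation of I exactly onto the relation of P.

    # A sketch of the condition in footnote 52 (my own illustration, not
    # Peirce's or Hookway's): check whether a given correspondence between
    # an ideal relational structure I and a physical structure P carries the
    # relation of I exactly onto the relation of P.

    def preserves_relation(correspondence, relation_i, relation_p):
        """True when the correspondence maps relation_i exactly onto relation_p."""
        image = {(correspondence[a], correspondence[b]) for (a, b) in relation_i}
        return image == set(relation_p)

    # Ideal structure I: three points on a map, related by 'is drawn north of'.
    map_north_of = {("p1", "p2"), ("p2", "p3"), ("p1", "p3")}

    # Physical structure P: three places in a terrain, related by 'is north of'.
    terrain_north_of = {("ridge", "meadow"), ("meadow", "river"), ("ridge", "river")}

    # A one-to-one correspondence between map points and terrain places.
    correspondence = {"p1": "ridge", "p2": "meadow", "p3": "river"}

    print(preserves_relation(correspondence, map_north_of, terrain_north_of))  # prints True

Since the correspondence here is one-to-one and the relation is carried over exactly, the two structures embody the same form of relation in the sense of footnote 52; were the correspondence many-to-one, the same test would express only the weaker, homomorphic resemblance described in footnote 53.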

Returning to Peirce’s archetypical example, he claims that in Euclidean geometry the postulates embody properties of physical space (NEM 4, p. 209-210). The Euclidean system, though purely hypothetical, is constructed so as to resemble physical space. By

‘resemblance’ here Peirce means that the forms of relation that obtain between points, lines, and figures in the Euclidean geometric world correspond to forms of relation that obtain in actual physical space. Let us take up as an example Euclid’s famous fifth postulate: “That, if a straight line falling on two straight lines make the interior angles on the same side less than two right angles, the two straight lines, if produced indefinitely, meet on that side on which are the angles less than two right angles” (Euclid 1956, vol. 1, p. 155). Mathematically this postulate is a hypothesis that describes a purely hypothetical state of affairs, according to Peirce. But the hypothesis is conceived—or at least was originally imagined—so that the form of the relations that obtain between two straight lines, as described in the postulate, would correspond to the forms of relations that do obtain in actual physical space. Now, if a problem pertaining to a physical space is reduced to a problem in Euclidean geometry, the Euclidean solution becomes applicable to the actual problem provided that the correspondence between the Euclidean system and the physical space holds, including the proviso that the mathematical hypothesis of

53 We can also say that two structures, I and P, are homomorphic when only the second condition for isomorphism holds, that is, when the relations that hold among the elements of I have the same pattern as the relations that hold among the elements of P, even if I and P do not have the same number of elements (see Brown 1999, p. 37).

the fifth postulate corresponds to what is the case in the physical space. In this case, as in any other, the key is not whether the correspondence is isomorphic, but whether the ‘form of relation’ embodied in the Euclidean theory is the same ‘form of relation’ that obtains in the physical space.

I might put my answer to the question of applicability in the following way.

Under the Peircean conception, all mathematical theories are icons. Some mathematical theories are pure icons, that is, they are only representations of themselves. The theories pertain only to ‘would-bes’ or purely hypothetical states of affairs. However, other mathematical theories are ‘symbolic icons’. That is, as icons they are purely ideal systems representing hypothetical states of affairs, but they also turn out to be symbols representing actual natural systems; the relations that hold among the elements of the ideal system also hold among the elements of the natural system. For example, Euclidean geometry qua mathematical is a system of pure icons, but qua physical theory it was conceived as a system of symbolic icons representing actual physical space. In so far as it is a true symbolic icon of physical space—where ‘truth’ means isomorphic or homomorphic correspondence—Euclidean geometry is applicable to problems pertaining to physical space. And in so far as it fails to be a true symbolic icon, Euclidean geometry is not applicable to such problems. Moreover, sometimes the mathematician conceives hypothetical states of affairs as representations of actually existing systems, so that the resulting mathematical theories are thereby applicable to the study of those existing systems in so far as the representation is true. But a mathematical theory that is initially conceived as a purely hypothetical state of affairs—i.e., imagined as a pure icon—may later be found to be a symbolic representation of an actual state of affairs. In the former

case, the mathematician creates the resemblance between the mathematical world and the natural system, while in the latter case she discovers it.

I have circumscribed the discussion to the relatively limited question of how mathematical theories become applicable to practical problems, such as those of the special sciences, according to Peirce’s conception of mathematics. However, I should emphasize again that for Peirce ‘mathematics’ is not a special scientific discipline, describable as the sum of its accumulated theories, but is rather an activity that draws necessary conclusions. What Peirce calls ‘mathematical activity’ encompasses all activity characterized by necessary reasoning. In his words, “every apodictic inference is, strictly speaking, mathematics” (CP 4.233). Accordingly, I would claim that for Peirce the most important application of mathematics does not consist in the deployment of this or that particular mathematical theory to solve this or that practical problem, but in the overall deployment of necessary reasoning to investigate problems in, say, phenomenology, aesthetics, ethics, logic, and the practical, physical, and psychical sciences.54 For a more

thorough exposition of Peirce’s conception of mathematics, then, we must turn now to

what he means by ‘necessary reasoning’.

54 This list corresponds to Peirce’s own hierarchical classification of the sciences. See, for example, EP 2, p. 258-262. In my dissertation I will remain focused on how the “application” of mathematical theories to scientific problems works according to a Peircean conception of mathematics. A thorough inquiry into how, according to Peirce, mathematical reasoning is to be deployed in the investigations of the aforementioned sciences is well beyond the scope of this work.

2.3.2 Necessary Reasoning, Experimentation, and Observation

In his discussion on the “Essence of Mathematics” Peirce distinguishes between two kinds of necessary reasoning, namely, ‘corollarial’ and ‘theorematic’ reasoning, and he argues that theorematic reasoning characterizes mathematics as an activity (CP 4.233).

In fact, he considers his distinction between these two kinds of proof to be his first real discovery about the methods of mathematical inquiry (NEM 4, p. 49). Let us describe these alternative forms of necessary reasoning.

Corollarial reasoning yields, according to Peirce, what all the “philosophers” who follow Aristotle call “direct demonstrations” or “demonstrations why” (CP 4.233).55 A

‘demonstration why’ is a “demonstration which employs only general concepts and

concludes nothing but what would be an item of a definition if all its terms were

themselves distinctly defined” (CP 4.233). That is, a ‘demonstration why’ confines itself to general terms and deduces nothing but propositions that would themselves be part of a mere definition. As an example of such “insignificant remarks” Peirce mentions the

geometrical truths that Euclid did not think worthy to mention and that editors introduced

with a garland or corolla (CP 4.233). The mathematician’s characteristic activity is not to

derive corollaries but to pursue “indirect demonstrations” or “demonstrations that”—in

other words, “theorems” (CP 4.233). For Peirce, ‘demonstrations that’ demand a

55 By the “philosophers” who extol “demonstrations why” Peirce means those philosophers in the “Aristotelian” tradition who might think that the best formal demonstrations are deductions that only involve general terms. Below he presents a strong contrast between these “philosophers” and the “mathematicians” who pursue “demonstrations that.” I think Peirce’s contrast is deliberately hyperbolic. His classification of the sciences and his program of philosophy demand that philosophers should deploy ‘mathematical’ reasoning, and he is railing against those “philosophers” who ignore mathematical reasoning altogether. In the balance, Peirce clearly respected the reasoning methods of Plato, Aristotle, and Leibniz, for instance, alongside Euclid, and so these are not the “philosophers” that he is deriding.

different kind of reasoning. Theorematic reasoning involves the construction of a diagram upon which the mathematician experiments and observes the results of his experiment.

Peirce writes:

Here, it will not do to confine oneself to general terms. It is necessary to set down, or to imagine, some individual and definite schema, or diagram — in geometry, a figure composed of lines with letters attached; in algebra an array of letters of which some are repeated. This schema is constructed so as to conform to a hypothesis set forth in general terms in the thesis of the theorem. Pains are taken so to construct it that there would be something closely similar in every possible state of things to which the hypothetical description in the thesis would be applicable, and furthermore to construct it so that it shall have no other characters which could influence the reasoning.…[A]fter the schema has been constructed according to the precept virtually contained in the thesis, the assertion of the theorem is not evidently true, even for the individual schema; nor will any amount of hard thinking of the philosophers' corollarial kind ever render it evident. Thinking in general terms is not enough. It is necessary that something should be DONE. In geometry, subsidiary lines are drawn. In algebra permissible transformations are made. Thereupon, the faculty of observation is called into play. Some relation between the parts of the schema is remarked. But would this relation subsist in every possible case? Mere corollarial reasoning will sometimes assure us of this. But, generally speaking, it may be necessary to draw distinct schemata to represent alternative possibilities. Theorematic reasoning invariably depends upon experimentation with individual schemata…In the last analysis, the same thing is true of the corollarial reasoning, too; even the Aristotelian "demonstration why." Only in this case, the very words serve as schemata. Accordingly, we may say that corollarial, or "philosophical" reasoning is reasoning with words; while theorematic, or mathematical reasoning proper, is reasoning with specially constructed schemata. (CP 4.233)

The fundamental difference between corollarial and theorematic reasoning is that the latter demands imagining individual ‘diagrams’—in the Peircean sense—that embody the general characteristics contained in the hypothesis or initial statement of the theorem, perhaps setting the diagrams down upon some medium of concrete representation such as paper or computer screen, experimenting upon the diagrams by altering them in some methodical fashion, and observing the results of the experiment upon the diagram.

Corollarial reasoning demands no such effort at imagining, experimenting upon, and observing specially constructed diagrams. This reasoning does require signs, and words

may act in this reasoning as geometric or algebraic diagrams do in theorematic reasoning.

In fact, both types of reasoning are species of deduction. But theorematic reasoning calls for subtle and inventive experimentation upon diagrams. Hookway puts the difference as follows. While ‘corollarial deductions’ do not yield surprises, since we perceive the conclusion of the deduction upon imagining that the premises are true, ‘theorematic deductions’ do yield surprises, since they call for imagination and invention in experimenting upon an icon to enable us to “evolve” surprising consequences from the hypothesis of the theorem (Hookway 1985, p. 193-194; see NEM 4, p. 38 and p. 288). In both kinds of deduction a sign or diagram is essential—not merely heuristic—but the mark of a theorem is that a “subsidiary construction modifies the diagram in some permissible way” that is not specified or suggested already in the premises (Hookway

1985, p. 194).

Now, Peirce’s distinction between corollarial and theorematic reasoning originates in his study of the structure of proofs in Euclid.56 Thus, I turn to construct an

abridged version of Peirce’s own favorite example in order to illustrate what he means by

theorematic reasoning. The example is the demonstration of the fifth proposition from

Euclid’s Elements (NEM 4, p. 201-207).57 According to Peirce, this is the first

proposition in the Elements that requires, on the part of the student, some measure of

“active mathematical thought” and thus it is known as the Pons Asinorum or the Asses’

56 For more details on the Euclidean origin of the distinction for Peirce, see NEM 4, p. 238. For comments, see Hookway 1985, p. 194. 57 Hookway provides his own detailed illustrations to substantiate the distinction between corollarial and theorematic deduction, including the example of Lakatos’s discussion, in Proofs and Refutations, of Euler’s theorem as an instance of mathematical reasoning that fits the pattern of theorematic deduction. See Hookway 1985, p. 194-203.

Bridge (NEM 4, p. 201-202). Peirce restates the proposition as follows: “[I]n a triangle two of whose sides are equal to one another, the two angles opposite those sides are also equal; and moreover, if the two equal sides are prolonged beyond the third side, the two exterior angles so formed will be equal” (NEM 4, p. 205). Under Peirce’s model of mathematical reasoning, such a general statement of the hypothesis of the theorem is the first step of a ‘demonstration that’. In this case, the task is to ‘demonstrate that’ the hypothesized resulting relations between two interior angles and between two exterior angles hold for any triangle of the assumed characteristics, namely, with two equal sides.

The second step is to imagine an individual diagram that embodies all the general characteristics assumed in the hypothesis; in this case, to imagine an iconic representation of an isosceles triangle, whose two equal sides are prolonged. Peirce’s distinction between corollarial and theorematic reasoning primarily results from the necessity, in the latter case, of imagining an individual icon that will help us to show the result. In corollarial reasoning we might immediately grasp, just from the general statement of the thesis, other general truths that must necessarily follow. But theorematic reasoning demands the construction of an individual schema. Euclid thereby sets down in papyrus an individual schema like the triangle ABC, with the equal sides AB and AC produced further with the lines AD and AE (see Figure 2-3).58

58 This figure is based on that of Peirce reproduced in NEM 4, p. 206.


Figure 2-3: Diagram from C. S. Peirce’s demonstration of Euclid’s Fifth Proposition (Elements I.5), known as the Pons Asinorum.

The third step is to experiment upon the diagram by imagining—and actually drawing, if necessary to aid the imagination—modifications that might help us to show that the hypothesized relations do obtain, in this case, the specified relations between interior and exterior angles. Part of my task in deploying Peirce’s model of mathematical inquiry in the forthcoming case study will be to elucidate more thoroughly this sense of

‘experimentation’. An important aim for my work is the description of techniques of

‘mathematical experimentation’ that lead to breakthroughs in mathematical reasoning.

For now, I want to emphasize that this ‘experimentation’ consists in the imaginative and judicious modification of the original diagram so as to produce a related diagram that might literally show or ‘monstrate’ the hypothesized relations among elements of the original diagram. In the quoted passage, Peirce writes of experimenting upon the diagram by drawing ‘subsidiary lines’, that is, by constructing an auxiliary figure. But the experiment might also consist in analyzing an original complex diagram into simpler

elements, so that the experimental analysis might exhibit the desired results.59 In the case

of Euclid’s demonstration of the fifth proposition, the experiment consists, first, in using

proposition 3 to find point Z along the line AD and point H along the line AE such that

the lines AZ and AH are equal to each other, and second, in drawing the triangles ZCB

and HBC (see Figure 2-3).

The fourth step is to observe the results of the experiment. This ‘observation’ is

no mere acritical perception but rather a critical reasoning activity informed by existing

knowledge and guided or focused by the aims of the experiment. The goal is to observe—

that is, to grasp that the experiment ‘monstrates’— the hypothesized results. Furthermore,

the deployment of theoretical knowledge is continuous with, and intrinsic to, the act of

‘observation’.60 In our example, Euclid and his students must observe that, by proposition

4, the triangles ZCB and HBC are so related that the angles ZCB and HBC are equal to

each other. Thus, the experiment upon the diagram ‘demonstrates that’ the specified

exterior angles are equal to each other. Next, the students of geometry must observe that,

also by proposition 4 and by relevant postulates and axioms, the triangles ACZ and ABH

are so related that the angles ACZ and ABH are equal to each other. Finally, by

deploying axiom 3 intrinsically within the act of observation, the students must see that

59 This would provide an avenue to link Peirce’s conception of mathematical reasoning as experimentation upon a diagram with Aristotle’s notion of logical analysis. Patrick Byrne (1997), in fact, convincingly argues that in the Posterior Analytics Aristotle provides a logic of scientific research that he derived from the method of analysis in ancient Greek geometry. I think that we can conceive of such a method as one of experimenting, via geometrical analysis, upon diagrams. Regrettably, I will have no occasion here to pursue this thesis. 60 For Peirce, there are not two separate acts, one of pure ‘sense perception’ and the other of ‘reasoning’ about what has been observed on the basis of theory. Observation consists of both acts being continuous with and indivisible from each other. N.R. Hanson develops this Peircean position carefully. See Hanson 1965, especially chapters 1 through 4.

the angles ABC and ACB are equal to each other. Therefore, the experiment upon the individual schema also ‘demonstrates that’ the interior angles opposite the equal sides of an isosceles triangle are equal to each other. QED.

Now, it is important to emphasize that for Peirce all of this reasoning process results in a deductive demonstration. In theorematic reasoning, the conclusion necessarily follows from the definitions, hypotheses, axioms, and other demonstrated theorems that serve to substantiate the proof. But what is insightful about Peirce’s model of necessary mathematical reasoning is that theorematic deduction requires imaginative and judicious experimentation and observation of a diagram. To arrive at successful deductions, the mathematician must experiment via abduction and induction. For Peirce the final structure of a Euclidean theorematic deduction and the actual practice of the mathematician in trying to find a proof of the theorem are different matters. According to

Hookway, although mathematical results are certain, the actual experimental and observational practice of the mathematician is inductive (Hookway 1985, p. 203).

Carolyn Eisele concurs, noting that for Peirce the practicing mathematician repeats the experiment upon a diagram in various ways in order to infer inductively that every diagram constructed according to the same hypothetical precept would present the same general relations among its parts (Eisele 1979, p. 5).61

For now, let us summarize what Peirce calls the “general nature” of the “essential

parts” of necessary reasoning that are involved in active mathematical inquiry (see NEM

61 My own position, to be substantiated in the course of the forthcoming case study into the discovery of mathematical probability theory, will be that this purely mathematical practice of experimentation upon ‘diagrams’ may involve various types of both induction and abduction, among various forms of ‘analysis’. See my preliminary description of Carlo Cellucci’s concept of ‘analysis’ in section 2.4 below.

4, p. 221-222, for Peirce’s description with another geometrical example).62 In my

estimation, there are five stages.63

(1) The mathematician expresses a hypothesis, often posed as a problem, in

general terms, possibly in ordinary language.

(2) The mathematician translates the general language of the hypothesis or

problem into a concrete ‘diagram’, or icon of the hypothesis, which she creates in her

imagination. Even though it is a single object, the mathematician imagines the diagram so

as to represent the full meaning of the general hypothesis. The representation of a general

hypothesis by a single schema is accomplished because the mathematician understands

that the diagram might be modified in some respects but not in others; what matters is

that the particular diagram that the mathematician imagines embody all the general

characteristics set forth in the hypothesis. Peirce formally defines a ‘diagram’ as “an

icon or schematic image embodying the meaning of a general predicate; and from the

observation of this icon we are supposed to construct a new general predicate” (NEM 4,

p. 238). Thus, mathematical reasoning is essentially diagrammatic.

(3) The mathematician ‘experiments’ upon the diagram by making additions or

changes to the original schema. But she does not modify it randomly; she rather “seeks to

conquer the difficulties by dividing them” (NEM 4, p. 221). Though Peirce does not

express it in these terms, I propose that to experiment upon a diagram is to ‘analyze’ it in

62 Again, at this point I present this model descriptively. My defense and critique of it will arise in the course of the forthcoming case study. 63 In NEM 4, p. 221-222, Peirce only divides the reasoning into four essential parts. But in my summary, I divide experimentation and observation into two parts, so as to distinguish them clearly. Of course, none of these stages should be understood as being discrete steps; there is rather one continuous process of mathematical reasoning that, for logical analysis, we can divide into stages in various ways.

any one of many possible ways, with a view to solving the problem, that is, to substantiating the proposed hypothetical result.

(4) The mathematician ‘observes’ the results of the ‘experiments’ until she sees that one of the experimental modifications of the diagram serves to solve the problem or substantiate the hypothesis. She may repeat the experiment, to convince herself inductively that the experimental results she has observed can be generalized in a theorem (see Eisele 1979, p. 5, 7).

(5) The mathematician translates the result of experimentation and observation back into general language. This is often done with a view to the applicability of the solution to the mathematical problem; in other words, with a view towards the applicability of the mathematical hypothesis to actual states of affairs.
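To give these five stages a concrete, if elementary, illustration of my own choosing—it is not one of Peirce’s examples—consider the hypothesis, stated in ordinary language, that the sum of the first n odd numbers is the square of n (stage 1). The mathematician imagines a diagram embodying the hypothesis: an n-by-n square array of dots, with n left indefinite but the arrangement perfectly definite (stage 2). She then experiments upon the diagram by dividing it, separating the square into nested L-shaped borders: the first border is the single corner dot, the second the three dots enclosing it, the third the next five, and so on (stage 3). Observing the modified diagram, she sees that the k-th border contains 2k - 1 dots and that the n borders together exhaust the square; she may repeat the experiment upon squares of several sizes to convince herself that the observed relation does not depend on the particular schema (stage 4). Finally, she translates the result of the experiment back into general language (stage 5):

1 + 3 + 5 + … + (2n - 1) = n^2.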

2.3.3 Pure Mathematical Reasoning versus Poietic Creation

As Peirce himself recognizes, the definition of mathematics according to its method—namely, that it is the science which draws necessary conclusions—makes or seems to make the sole reasoning activity of the mathematician qua mathematician to deduce the consequences of hypotheses. However, the definition according to aim and subject matter, in as much as it points out that mathematics studies hypothetical states of affairs, seems to suggest that something like the poietic creation of hypotheses is part of mathematical reasoning (CP 4.238). Peirce observes that the framing of general hypotheses—for example, in conceiving “the field of imaginary quantity and the allied

idea of Riemann’s surface, in imagining non-Euclidian measurement, [and] ideal numbers”— involves the exercise of “immense genius” (CP 4.238). Moreover, the framing of particular mathematical hypotheses to solve special problems calls “for good judgment and knowledge, and sometimes great intellectual power, as in the case of

Boole’s algebra” (CP 4.238). Therefore, Peirce notes, a question arises: Should we exclude the work of poietic hypothesis-making, whether to create an ideal mathematical system or just to express a special scientific or practical problem in terms of a mathematical one, from the domain of pure mathematical reasoning? (CP 4.238).

Peirce’s own response is affirmative, though I think it is a guarded and nuanced response. He writes: “Perhaps the answer should be that, in the first place, whatever exercise of intellect may be called for in applying mathematics, to a question not propounded in mathematical form [it] is certainly not pure mathematical thought” and,

“in the second place, that the mere creation of a hypothesis may be a grand work of poietic genius, but cannot be said to be scientific, inasmuch as that which it produces is neither true or false, and therefore is not knowledge” (CP 4.238; emphasis mine). The first observation suggests that the work of stating a special problem from the sciences in purely mathematical form and of applying the mathematical solution to the special problem is not, strictly speaking, mathematical reasoning. To be in line with Peirce on these two points, perhaps we should speak of the “mathematician qua poet” in so far as she frames hypotheses, while we should speak of the “mathematician qua special scientist” in so far as she applies the solution of the mathematical problem to the actual scientific problem. To these two observations, Peirce adds that “if mathematics is the study of purely imaginary states of things, poets must be great mathematicians” (CP

4.238). To the contrary, Peirce emphasizes that mathematics is not merely the study of hypothetical states of affairs, which a poet might also perform in a novel, but it is the study of what is true of such hypothetical worlds (CP 4.238). And studying what is true of a hypothesis, we have seen, is a process of necessary reasoning that involves the construction of an icon that represents the hypothesis, methodical experimentation and observation upon the diagram, and generalization of the results of the experiments. In other passages, Peirce even goes so far as to sever the process of ‘forming a conjecture’ from mathematical reasoning because ‘forming a conjecture’ is instinctive and not subject to rational self-control or criticism, while mathematical reasoning is purely a self-controlled rational activity (NEM 4, p. 218, 222).64 In those passages, Peirce means that

mathematical hypotheses are not abductive, that is, they are not drawn by our ‘abductive

instinct’ for the purpose of explaining an actual, perceived phenomenon. Thus, pure

mathematical activity seems to involve neither poietic invention nor creative abduction.

Based on such passages, Hookway argues that for Peirce the framing of

mathematical hypotheses requires poietic genius but it is not scientific work (Hookway

1985, p. 204). Thus, the poietic creation of hypotheses is, strictly speaking, not part of the reasoning activity of the mathematician qua mathematician. However, I find the exclusion of the hypothesis-making function of the poietic imagination from ‘pure mathematical reasoning’ highly problematic, even from the perspective of Peirce’s own model of mathematical inquiry. After all, on his own account the mathematician cannot

64 In these passages, Peirce’s position seems to anticipate that of Popper in Conjectures and Refutations. But I will argue in section 6.3 that, on the whole, Peirce’s position does see instinctive conjecturing as part of scientific reasoning.

do mathematics unless she has some hypothetical state of affairs to study, and for carrying out such active studying she must be able to construct and modify diagrams in her imagination. So I would find it much more adequate, within Peirce’s model, and much more true to actual mathematical inquiry, to acknowledge the place of creative hypothesis-making as essential to mathematical thought.

Now, Hookway does point out what I think must be the reason why Peirce sometimes severs poietic creation from mathematical reasoning so emphatically. Peirce’s reason is systemic—he regards mathematics to be the fundamental discipline in his classification of the sciences, and the primary position of mathematics is due to its method of reasoning. Mathematical thought, which forms a diagram or model of the problem for study and experiments upon it, can be deployed in any of the sciences that lie in a lower hierarchical scale, and in fact such reasoning is required by the lower sciences, while ‘pure’ mathematical reasoning must remain free from the particular methods of the lower sciences. According to Hookway, Peirce regards the mathematical method to be a priori, practically infallible, yielding certain conclusions, and prelogical, that is, not subject to logical criticism (Hookway 1985, p. 203-207). Mathematical reasoning is a priori in the sense that its objects of study are entia rationis—we create the objects, namely, mathematical forms (Hookway 1985, p. 205). Its results are certain, even when experimentation and observation are ‘inductive’ (in Hookway’s interpretation), because mathematical objects lack ‘secondness’, i.e. are non-reactive, so the diagrammatic instances are the objects of study (Hookway 1985, p. 205-206). That is, mathematics is not subject to the error that studying perceived matters of fact introduces into our scientific reasoning. Finally, since upon grasping the meaning of the diagrammatic

instance we immediately grasp the general mathematical form of relation under study, no logical criticism of the grasping is required, and mathematical reasoning is prelogical

(Hookway 1985, p. 206-207). For all these reasons, pure mathematical reasoning is primary for Peirce.

Now, my intention here is not to defend, nor even to evaluate, the reasons for

Peirce’s classification. My intention is simply to describe the position of mathematics at the top of the Peircean hierarchy. Carolyn Eisele summarizes the matter as follows:

Peirce “set up a hierarchy of the sciences in which the methods of one science might be adapted to the investigation of those under it on the ladder. Mathematics occupied the top rung, since its independence of the actualities in nature and its concern with the framing of hypotheses and the study of its consequences made its methodology a model for handling the problems of the real world and also supplied model transforms into which such problems might be cast and by means of which they might be resolved” (Eisele

1979, p. 1). My suggestion, then, is that Peirce sometimes emphasizes that poietic creation is not part of pure mathematical reasoning in order to preserve the primacy of the mathematical method in his classification of the sciences. He does not want mathematical reasoning to appear to depend on any other reasoning method.65

In my estimation, however, it would be incongruous of Peirce to define

mathematics as the study of hypothetical states of affairs and yet to exclude the poietic

65 Admittedly, Peirce conceived of this hierarchy as a system of reciprocity. Those disciplines that have a higher position in the hierarchy influence and are influenced by the problems, methods, and results of those disciplines below them. The development of methods and the progress of knowledge is not a “trickle-down” affair, but a dynamic interrelation. However, mathematics does have a special status for Peirce at the top of the ladder, and I think that he does have a tendency—in my opinion, a tendency that is inconsistent with his overall philosophy of scientific inquiry—to privilege mathematical reasoning so as to seal it from the influence of “non-mathematical” reasoning.

creation of these hypotheses from mathematical activity, even for systemic reasons. But I think that his position is more nuanced than such a strict exclusion. I suggest that Peirce’s position is rather that the creation of mathematical hypotheses is poietic, but it is not merely poietic, and accordingly, that hypothesis-framing is part of mathematical reasoning that involves an element of poiesis but is not merely poietic either. Scientific considerations also inhere in the process of hypothesis-making. The mathematician is interested in studying hypothetical states of affairs of a perfectly general description. For example, as a geometrician Euclid is interested in framing a hypothetical world in which arbitrarily defined objects such as points, lines, planes, and so on, are subject to certain general forms of relation. When hypothesizing the relation between two parallel lines lying on the same plane, Euclid is not interested in the ‘inherent quality’ of the line, such as whether in our diagrammatic representation of it the line is red or blue. In a merely poietic hypothetical world, however, such as a poem or a novel, describing inherent qualities such as the color of snow on a sunny winter day or the tone of voice in which a character speaks can be of primary importance. Moreover, as Peirce emphasizes, the mathematician aims to study what is true of the hypothetical states of affairs that he frames. Thus, even as he creates diagrammatic representations of his hypothesis, his focus is on finding what is generally true about the hypothesis, and not on exploring the inherent qualities of the hypothetical world.

2.4 Mathematics as Creative, Precise, Experimental, Open-Ended Inquiry

In closing this preliminary discussion, I want to stress that both of the Peircean definitions of mathematics are descriptions of mathematical activity. According to its

‘method’, mathematics is the science which draws necessary conclusions. According to its ‘aim and subject matter’, mathematics is the science which studies what is true of hypothetical states of affairs. Both definitions reflect the Peircean notion that mathematics, in particular, and science, in general, is first and foremost a practice, an activity.

This makes Peirce’s conception of mathematics difficult to locate in terms of various classifications of the various philosophies of mathematics. I think the difficulty arises from the fact that the traditional classifications of philosophies of mathematics into realist, formalist, conceptualist, and intuitionist doctrines, for example, mainly concern the types of entities or objects that mathematics studies. Peirce’s conception of mathematics emphasizes instead the nature and character of mathematics as a reasoning activity. Some recent classifications, however, are helpful to locate Peirce’s view in the spectrum of the various philosophies of mathematics. Consider first Dale Jacquette’s classification according to whether conceptions of mathematics are ‘mind-independent’ or ‘mind-dependent’. Traditional conceptions of mathematics such as ‘realism’ and

‘formalism’ are mind-independent since they conceive of mathematics as the “study of real things and their properties, which in some sense may turn out to be purely formal”

(Jacquette 2002, p. 3). Other traditional conceptions, such as ‘conceptualism’ and

‘intuitionism’ are mind-dependent. The former conceives of mathematics as the study of

ideas as the contents of rational thought, while the latter conceives of mathematical entities as (Kantian) pure forms of intuition (Jacquette 2002, p. 2-3).

Now, if we ask whether Peirce’s mathematical objects are mind-dependent or independent, the answer is nuanced. In some respects, Peirce’s conception might be classified as ‘mind-independent’. His view is realist insofar as we can discover truths about a mathematical realm. However, Peirce would reject realist views, such as Frege’s, that hold that mathematics studies a particular segment of reality about which we discover positive truths (see Hookway 1985, p. 184). What mathematicians discover is rather what would be true of hypothetical mathematical worlds. And even though the objects of mathematical study are general forms of relation for Peirce, he would reject the

‘formalist’ view that mathematics is merely a ‘rule-governed syntax game’ and that

“mathematical truths are purely formal matters of mathematical notation” (Jacquette

2002, p. 2). The fact that we must experiment upon mathematical diagrams and observe the results in order to discover what would be true of our hypothetical world commits

Peirce to a moderate realism, as the imagined mathematical world comes to have its own independent, researchable standing once it is framed. However, Peirce would not ascribe to mathematical entities—‘mathematical icons’ or ‘diagrams’, in his case—a mind-independent existence in a Platonic realm. For Peirce, the grade of reality of mathematical hypotheses is not actuality, but it is potentiality or possibility. However, insofar as the hypothetical states of affairs of mathematics are our rational creation, they are ‘mind-dependent’. But they are not (Kantian) Forms of intuition. These Forms are rather fixed, rigid—they are not the product of human creativity, and in my estimation are rather like mental straitjackets that constrain the possibilities of our mathematical

reasoning. Peirce’s mathematical objects are rather the product of mathematical creativity; in fact, as we shall see in chapter 4, they are the creations of the mathematical imagination of a community of inquirers. Perhaps Peirce’s conception comes closest to being a form of conceptualism insofar as it views mathematics as “an investigation of the formal properties of ideas or concepts as the contents of thought” (Jacquette 2002, p. 2).

However, a moderate realism creeps in again because for Peirce mathematicians come to discover the properties of mathematical ‘conceptual’ realms, and these discoveries do not consist merely in unpacking the contents of a ‘concept’ but in actually experimenting upon the hypothetical worlds of mathematics to discover their properties. We might say, then, that regarding the origin of mathematical entities—or ‘icons’ of hypothetical states of affairs—Peirce’s conception of mathematics is ‘mind-dependent’, but that once conceived, Peirce views the hypothetical worlds of mathematics as being ‘mind-independent’ insofar as we must experiment upon them to discover what would be true about them. In the end, I think that we gain much more clarity using Jacquette’s classification when we emphasize that for Peirce mathematics is primarily a creative reasoning activity. Thus, his view of mathematics is primarily ‘mind-dependent’: without reasoning minds, there would be no mathematical realm, no ‘would-be’ worlds, to explore.

Keeping in mind that Peircean mathematics is a reasoning activity, let us consider also Carlo Cellucci’s classification of philosophies of mathematics as closed or open views (see Cellucci 2000). The closed-view sees mathematics as closed-ended inquiry.

Mathematical theories are founded upon fixed principles, given once and for all, which can only be accepted or rejected as a whole, so that the standard way of developing

mathematics is by deriving consequences from the axioms of a given system.

Mathematics proceeds by way of the axiomatic method; that is, mathematical reasoning consists in deductively unfolding what is already implicitly contained in the axioms of the given system or in replacing the latter with a new system by conceiving of entirely new axioms. Cellucci classifies the views of Kant, Frege, and Hilbert as closed-views. In contrast, the open-view regards mathematics as open-ended inquiry. Mathematical theories do not have absolute foundations in permanent axioms but rest on provisional hypotheses. Mathematical theories are dynamic, that is, capable of being transformed to represent changeable states of affairs. Within the context of these open systems, solving a problem in a particular field of mathematics may require concepts and methods from other fields, so that the standard way of developing mathematics is by ‘problem-solving’ via the ‘analytical’ method. Mathematics proceeds by way of ‘analysis’ in which proof-search begins with a hypothesis and not with a fixed, self-evident axiom; that is, mathematical reasoning consists in solving a problem by provisionally assuming a hypothesis and showing that it leads to an adequate solution. The ‘analytical’ search for hypotheses may proceed by a variety of techniques, including mathematical analogy

(reducing one problem to another by showing that the elements of an existing problem are analogous to those of another which has already been solved or is easier to solve), diagramming (in the ordinary sense of investigating an actual figure), generalization

(provisionally assuming a particular case to be a general rule), particularization

(instantiating a general rule in a particular case), induction, abduction, and so on.

Under Cellucci’s classification, Peirce’s conception of mathematics is clearly an open-view. For Peirce mathematical principles, axioms, or postulates are not self-evident

truths that determine the true content of a closed system; they are rather provisional hypotheses that frame an open-ended system. Within that system, we can pose problems and search for hypothetical solutions by way of mathematical experimentation. Thus we can speak of two senses of ‘hypothesis’ in Peirce’s open-ended view—first, the ‘framing-hypotheses’ that create a mathematical state of affairs, and second, ‘analytical hypotheses’ that we provisionally assume in order to solve problems within the system.66

Moreover, mathematical systems need not be axiomatic. A mathematical system may be a non-axiomatic hypothetical model of, say, a theoretical or practical problem.

Mathematicians, then, may create a ‘diagrammatic’ model and experiment upon it for the purposes of hypothetical problem-solving, without a need to axiomatize an entire system.

Peircean mathematical systems are also dynamic because ‘framing-hypotheses’ may always be modified or re-created and ‘analytical hypotheses’ may always be introduced—via analogy, for instance—from one field of mathematics into another.

Peirce’s position admits the open-view that problem solving may be a potentially infinite process of posing increasingly more general hypotheses and considering their consequences (see Cellucci 2000, p. 162). Let us take the system of Euclidean geometry as an example. A closed-view would interpret a change of the fifth-postulate as a toppling of the Euclidean system; if the postulate is “false”, the system is “false,” and a new,

“true” system must be constructed on the basis of “true” axioms. However, for Peirce this position would be completely inadequate to the inquiring practice of the mathematicians.

66 I must note, however, that the repertoire of ‘analytical’ techniques for hypothesis-making admitted by Cellucci is more extensive than Peirce’s, if we interpret Peirce’s experimentation upon diagrams to consist only in induction, analogy, and perhaps abduction. In this sense, Cellucci’s heuristic view of mathematics will serve to strengthen the Peircean view in the course of the case study.

According to his open-view, in modifying the fifth-postulate, a mathematician is simply changing a ‘framing-hypothesis’ and re-conceiving a mathematical world. Now the mathematician studies what would be true of the reconceived hypothetical state of affairs, exploring new possibilities and performing new experiments that may call for new

‘analytical hypotheses’. Thus, Peirce would not claim that, qua mathematics, the

Euclidean system is false while the non-Euclidean system is true. They are two interrelated open systems that we can investigate through experimentation. The creation of a new system need not imply the destruction or “falsification” of the old one; it is rather the conception of an alternative ‘would-be world’. Mathematicians do not set out to create closed systems on the basis of axioms; they set out to explore what follows from provisional hypotheses.

In the end, I think it is adequate to Peirce’s view to characterize mathematics as a creative, precise, experimental, open-ended reasoning activity. Mathematics is creative in so far as it is an activity in which we imagine and conceive a hypothetical state of affairs.

It is precise insofar as mathematicians must create a delimited world of a perfectly general description and study it by way of ‘necessary reasoning’. It is experimental insofar as ‘necessary reasoning’ requires the manipulation and observation of diagrams in order to discover what would be true about such a hypothetical world. It is open-ended in that a hypothetical state of mathematical affairs can always be modified or re-created and in that we can discover new ‘would-be’ truths about them. The forthcoming task of expounding the ‘logic of inquiry’ in mathematics will be, above all, the task of describing and elucidating the reasoning practices of mathematical inquirers. In this case, the inquirers will be the early mathematical probabilists, and the practice under scrutiny will

be their progressive discovery of the theory of mathematical probability. I will take as a starting point the position that mathematicians practice a science of discovery in which the primary ampliative form of reasoning is ‘hypothesis-making’ in two senses—first, and more broadly, ‘hypothesis-making’ as the ‘poietic creation’ that frames a general, open-ended mathematical world; second, and more narrowly, ‘hypothesis-making’ as

‘conjecturing’ regarding the experimental solution to particular problems arising within our general mathematical world.

Chapter 3

The Context of Mathematical Discovery: The Case of Mathematical Probability Theory

The concept of ‘probability’, in its manifold senses, dates back to antiquity. According to the ‘ancient’ epistemological concept, ‘probability’ means the ‘approvability of an opinion by an authority’. Although the word probabilitas is Latin, the most influential source of the ‘ancient’ concept is Aristotle’s Topics. The purpose of that treatise on dialectical deduction is to find a line of inquiry “from reputable opinions on any subject presented to us” (Aristotle 1984, Topics 100a21). The premises of a dialectical deduction are not primitive and true propositions, as in the case of a demonstrative deduction, but are opinions accepted by a reputable authority, say, by a wise or intelligent person.

Thomas Aquinas calls these statements of reputable opinions ‘probable propositions’:

“[B]elief or opinion [fides vel opinio] is sometimes achieved, on account of the probability of the propositions [probabilitatem propositionum] from which one proceeds, because reason inclines completely towards one part of a contradiction, but with fear concerning the other part. The Topics or dialectics is devoted to this. For the dialectical syllogism which Aristotle treats in the book of Topics is from probable premises [ex probabilibus est]” (Aquinas 1992, LB1 LC-1N.-6.). We find here two features of the

‘ancient’ epistemological conception of probability. First, it is related to the degree of credibility of a proposition. Reason leans towards accepting one proposition between a pair of contradictories; but it does so “with fear” because this proposition is only probable, not certain, and so reason may err—it may be that the contradictory proposition

is true. Second, probability is related to belief or opinion, not to knowledge. As Ian

Hacking points out, in scholastic epistemology there is a sharp distinction between knowledge and opinion as they do not have the same objects. Knowledge (scientia) is knowledge of universal truths that are true of necessity, either because they are primitive first truths or because they are known by syllogistic demonstration. Opinion (opinio) refers to beliefs not arrived at by way of demonstration, including propositions that, since they are non-universal, are non-demonstrable. These propositions are arrived at by reflection, argument, or disputation (Hacking 1975, p. 20-23). It is important to observe that, even under this conception of ‘probability’ as ‘approvability by authority’, there are systematic methods of reasoning to choose between ‘probable opinions’. In the Topics, for instance, Aristotle claims that he intends the dialectical syllogism to be useful (i) for intellectual training, (ii) for the philosophical sciences since being able to ‘puzzle on both sides of a subject’ makes us better able to detect truth and error, and (iii) for the criticism of the principles of several sciences (see Aristotle 1984, Topics II 101a25-101b5). This is his aim in developing his method, and I find it worthwhile to emphasize so as not to underestimate the philosophical worth of the ‘ancient’ conception of probability as present in Aristotle.67 Be that as it may, the crucial point is that “in scholastic doctrine

opinion is the bearer of probability” (Hacking 1975, p.22). Hacking argues that since the

objects of knowledge and opinion are different, the limit of the increasing probability of

opinion is certain belief, not knowledge. Moreover, he points out that the probability of

an opinion is not primarily a matter of evidence or reason, since ‘reason’ is linked to

67 The subsequent change of this conception of probable opinion into ‘opinion enforced by authority’ is a corruption. See Hacking 1975, p. 24, for a discussion of the Jesuit casuistical doctrine of probabilism.

‘cause’, ‘necessary cause’, or ‘the reason why’ in a demonstration.68 Accordingly, since in opinio there is no clear concept of evidence, ‘probability’ does not mean ‘evidential support’ but ‘acceptability by intelligent people’ (Hacking 1975, p. 22).

The evolution of ‘probability’ as an epistemological concept, from its ancient connotation linked to the credibility of an authoritative opinion to its modern connotation linked to the credibility of a proposition based on evidence, has been the subject of many distinguished histories—the most representative being, in English, Ian Hacking’s already noted Emergence of Probability (1975) and Lorraine Daston’s Classical Probability in the Enlightenment (1988). Hacking’s overarching thesis is that the transition into the

‘modern’ conception of probability begins when ‘probability’ is linked to evidence, and especially statistical evidence, so that ‘probability’ becomes a thoroughly dual concept that is at once ‘epistemological’—probability as the subjective credibility of a proposition—and ‘aleatory’—probability as the objective chance or likelihood of a physical event. He argues that this transition happens in the low, rather than in the high, sciences of the Renaissance. The physicists of the Renaissance, including Galileo, still work in an Aristotelian world of first causes: they are still dedicated to knowledge and

demonstrative science, not to opinion, and so they have no need for serious use of

probability concepts, let alone a mathematical concept of probability. In contrast, the

‘purveyors of opinion’ work in fields such as medicine and alchemy (precursor to

chemistry) that, under scholastic epistemology, could never become demonstrative

sciences. Nevertheless, they develop a distinction between causes and signs through a

68 We can locate this linkage in Aristotle’s Analytics.

conception of partial prognostication that bears probability rather than certainty, and whose probability arises from frequency. For instance, a specific symptom is frequently a sign of a specific disease (Hacking 1975, p.23-30).

According to Hacking, the probability of a proposition begins to be evaluated with respect to the evidence—in the form of signs—that supports it. The methods to select, evaluate, and measure evidence are rudimentary, and especially the mathematical measurement of probable evidence has yet to be conceived and developed, but the link between probability and evidence emerges. Commenting on the transition, Hacking writes: “As Fracastoro put it, ‘Some signs are almost always, others are often to be trusted’, and these are ‘signs with probability’. It is here that we find the old notion of probability as testimony [by witness or authority] conjoined with that of frequency. It is here that stable and law-like regularities become both observable and worthy of observation” (Hacking 1975, p. 43). Thus, in this transition we can locate the origin of the ‘modern’ dual concept of probability, both as epistemological—related to support by evidence—and statistical—related to stable frequencies.

Along Hacking’s line, L. J. Cohen locates another important step in the transition, namely, Pascal’s wager. Very crudely stated, in the wager propositions for the existence and the non-existence of God are assigned equal chances, so the agnostic should choose to act as if he believes in God because the expected payoff—salvation—is higher.

However, there is no way of checking the assignment of equal chances for both propositions against relative frequencies of actual events. Measurement is central to the wager, but the measurement is not with respect to actual events but with respect to a proposition. We can thus say that the wager also conjoins the epistemological and

statistical concepts of probability, although Pascal, according to Cohen, has a developed conception of the mathematical aspects of the concept that the physicians and alchemists lack. As Cohen puts it, “the measurement of a proposition’s credibility is now being treated as having the same mathematical structure as the measurement of an aleatory chance and…the term probability is now being used in both connections” (Cohen 1989, p. 17). For Cohen, the wager extends the concept of probability in an important way and from a different direction, namely, from the aleatory calculations related to games to a range of applications to propositions in other areas. In the wager Pascal does not show us how to evaluate the probability of a proposition, but he inaugurates the field of decision theory, that is, the theory concerned with decisions among different possible courses of action when the outcomes of the actions are uncertain.

Now, according to Hacking and Cohen, the word ‘probability’ first denotes something measurable in the 1662 Port Royal Logic, or the Ars Cogitandi, among whose anonymous authors we now recognize Antoine Arnauld.69 In the final chapters, the

authors “begin the study of a novel kind of non-deductive inference,” namely, decision

under uncertainty, and their treatment of this kind of inferential decision-making is

specifically probabilistic in the statistical sense (Hacking 1975, p.75). For Hacking, the

problem-context for this kind of inference is the following: theories are fixed for the time

being (i.e. the inference is not a kind of speculative theorizing); the decision-maker has

empirical data but does not know the most viable generalization from it; the class of

possible hypotheses is determined (so, I might observe, the inference is not hypothesis-

69 For a modern language version, see Arnauld 1970.

making or ‘abduction’ in the Peircean sense) and the decision-maker can apply probability calculations within that class (Hacking 1975, p.75). The new concept of probability, then, involves using empirical information about the relative frequencies of events as inductive evidence to decide on a general course of action. Hacking points out that the authors of the Ars Cogitandi use “frequencies to measure probabilities of natural occurrences, [and they are]…well aware that a decision problem requires a calculation of expectation involving not only utility but also probability” (Hacking 1975, p.77).

Therefore, the ‘probability’ of a proposition becomes, in the first place, a matter of measurable degree of credibility. Subsequent development and refinement of the mathematics of probability yield more sophisticated and precise methods of measurement, but the nature of ‘probability’ as measurable degree of credibility of a proposition is already present in the Ars Cogitandi. Second, the ‘probability’ of a proposition is no longer a matter of mere opinion in contradistinction to knowledge. The scholastic distinction between certain knowledge and probable opinion collapses.

Admittedly, this is a result not only of the development of mathematical probability but of the emergence of modern science in general. In any case, we may begin to think of opinion and knowledge as having the same objects, and their difference becomes a matter of degree that is probabilistically measurable. According to these two observations, then, the evaluation of the ‘probability’ of a proposition becomes a matter of evaluating the evidence in such a way that this evidence yields a measurable degree of credibility for the belief stated in the proposition.

Lorraine Daston, however, dissents from Hacking’s account of the emergence of a dual—epistemic and aleatory—conception of ‘probability’ in modernity. As a historian

she argues that the uses of the term ‘probability’ in the seventeenth century do not all reduce to dual epistemic and aleatory elements. Conversely, she argues that “several key seventeenth-century instances of concepts and methods using one or another of [the epistemic and aleatory] constituents do not mention the word ‘probability’” (Daston

1988, p. 12-13).70 She points out the example of John Graunt’s 1662 Natural and

Political Observations on the Bills of Mortality, a pioneering work of our contemporary

field of ‘demography’, in which he attempts basic, descriptive statistical analyses of the

London bills of mortality. Daston’s point, I think, is that even though we would interpret

Graunt as being concerned with the measurement of statistical ratios regarding the

population of London, perhaps with a view to predicting trade (David 1962, p. 102), he

does not use the term ‘probability’ in his book. Thus, Graunt does not associate statistical

ratios with any aleatory sense of ‘probability’.71 In sum, for Daston the dual classification

of probability in the seventeenth century as epistemic and aleatory can only be achieved

in hindsight from our contemporary perspective that classifies conceptions of probability

as ‘subjective’ or ‘objective’. The conceptions of probability in the seventeenth century

were far more variegated, and the writers, lawyers, philosophers, scientists, and

70 Daston cites Barbara Shapiro (1983), who finds in the English literature of the seventeenth-century other senses of ‘probability’ as degrees of assurance, reasonable doubt, verisimilitude, worthiness to be believed and epistemological modesty. 71 For a discussion of the English empirical analyses of the bills of mortality, see David 1962, chapter 10. According to David, “modern statistical method and practice has its twin roots in the of probability and in the collection of statistical data” (p. 98). And while in the seventeenth century mathematicians in the European continent began to be concerned with the theoretical calculus of probabilities, the English were characteristically “preoccupied with facts” such as gathering actual statistical data (p. 98). I find here an echo of the debate over whether the origins of mathematical probability are empirical or theoretical, a debate that I will address below.

mathematicians of the time had not yet achieved a clear distinction between probability as epistemic or subjective and probability as aleatory or objective.

It is not my intention in this work to weigh in on the debate concerning the overall history of the concept of ‘probability’ in its manifold colloquial, legal, philosophical, scientific, and mathematical dimensions. My more modest and focused interest is in discussing the conceptual origins of mathematical probability. Towards this end, I do want to call attention to Daston’s position that out of the many senses of ‘probability’ employed in the seventeenth century, the early theorists of mathematical probability adopted those senses that were more readily quantifiable. Daston writes:

The history of the classical theory of probability is in fact an extremely instructive case study in the preconditions for quantification. Out of the crowd of qualitative notions of probability current in the mid-seventeenth century, mathematicians seized upon only a few. Their choices appear in part to have been governed by what might be called proto-quantification. Some kinds of probability—legal evidence, odds on a wager, the value of a future event—were already cast in qualitative terms of sorts, albeit rather crude and arbitrary ones, and these were just the ones that became part of the earliest formulations of calculus of probabilities. Quantification depends on some perceived analogy between subject matter and the available mathematics, and protoquantification thrusts these analogies to the fore….However, analogy alone is not sufficient for quantification: the phenomena must be conceived as constant and orderly enough to receive mathematical treatment. The phenomena of the classical probabilists were, from our vantage point, extraordinarily diverse, ranging from the credibility of witnesses to the pattern of human mortality, but all were assumed to exhibit at least long term regularities compatible with Bernoulli’s and later Bayes’ theorems. Here the mathematical tools shaped ideas about the subject matter by imposing greater determinacy upon phenomena previously thought to be unruly. Prevailing ideas about the subject matter also shaped the mathematical tools, as in the protracted attempts of the probabilists to tailor the definition of expectation to the specifications of one or another sense of rationality. Quantification is thus a process of mutual accomodation between mathematics and subject matter to create and sustain the analogies that make applications possible. (Daston 1988, p. xv-xvi)

Daston convincingly argues that the ‘proto-quantitative’ concepts of probability already present in the study of fair legal contracts involving expectations, of good wagers, and of

games of chance, did shape the early quantitative treatment of probability. I would suggest that this illustrates the point that new mathematical conceptualizations may be guided by concepts already developed in other fields of inquiry.

More importantly, I call attention to this passage in order to propose—from a

Peircean standpoint and by way of contrast—that the discovery of the mathematical theory of probability did not consist in a process of quantification. Daston in fact thinks that the work of mathematicians such as Cardano, Pascal, Fermat, and Huygens was not, strictly speaking, the origin of a mathematical theory of probability but that it rather consisted in the application of “available mathematics”—mainly arithmetic, algebra, and simple combinatorics—to problems of expectation in aleatory contracts and in games of chance (see Daston 1988, p. 3-6).72 I would submit that the application of “available

mathematics” to the study or prediction of phenomena might well consist, in one of its

many possible forms, in the quantification of the phenomena on the basis of an analogy

between a theoretical, or ‘general’, and a practical, or ‘actual’, state of affairs. However,

the discovery of a mathematical theory, such as that of mathematical probability, consists

rather in the creation of an ideal system and in the exploration of what is true of that

system. In the case of mathematical probability, its discovery and early development consisted in the gradual creation and investigation of an ideal system by the community of various generations of probabilists. The early probabilists up to Pascal and Fermat already made important contributions to the creation of the hypothetical state of affairs

for study in mathematical probability, and in so doing they were part of a community of

72 I will return to evaluate this claim in more detail below.

mathematicians that gradually discovered a new field of inquiry. Let us then turn to discuss explicitly the problem-context that gave rise to their mathematical work, to see how it was part of the discovery of a new branch of mathematics.

3.1 The 1654 Pascal-Fermat Correspondence and the Problem-Context of Discovery

A central question in the philosophical debate regarding the advent of the mathematical theory of probability concerns whether the theory arose on the basis of empirical observations or of purely theoretical speculations. In my estimation, this is a debate regarding the ‘problem-context’ of discovery, that is, the epistemic context of inquiry that gives rise to a new mathematical theory. The debate tends to pose a strict dichotomy between empirical problem-solving and pure theorizing, and asks whether the origins of mathematical probability are found in one or the other kind of activity. The common starting point for both sides of the debate is the consideration of the 1654 correspondence between Blaise Pascal and Pierre de Fermat regarding some questions on games of chance.73 Some of the questions were posed originally to Pascal by the

Chevalier de Méré, a gambler who had the privilege of passing off his interest in games of

chance as mathematical curiosity.

73 Although there are precursors to the mathematical treatment of probability, among whom I will consider Cardano and Galileo below, the Pascal-Fermat correspondence is traditionally considered the starting point of mathematical probability because it “created a research tradition, complete with problems and concepts, that dominated the field for over fifty years” (Daston 1988, p. 15). I will start my discussion at the traditional point; however, I shall argue over the course of the case study that the discovery of mathematical probability theory was a process that involved the work of various generations of mathematical inquirers. The discovery, then, was a continuous process involving the work of a community of inquirers and not a series of discrete conceptual and mathematical leaps.

The upshot of the debate will be whether mathematical inquiry, as illustrated in the discovery of mathematical probability, is an ‘empirical’ or ‘theoretical’ activity. By an ‘empirical’ activity I mean one that mainly seeks solutions to practical problems, often by way of ‘actual’, physical experimentation; such activity finds conceptual innovation in

‘actual’ experience by way of perception. A ‘theoretical’ activity, in turn, mainly seeks solutions to purely theoretical problems, regardless of whether theory has practical applications; such activity finds conceptual innovation in rational ideation, independently of any ‘actual’ perception.74 I will evaluate the Pascal-Fermat correspondence in order both to clarify the dichotomous positions that argue for either empirical or theoretical

origins for mathematical probability and to offer an alternative position—a view

suggesting that, in the case of mathematical probability, an empirical problem-context

acted as an enabling condition for the possibility of mathematical innovation, but that the

activity of the early mathematical probabilists gradually became the study of a theoretical

system of ideas, a pure mathematical system nonetheless conceived and developed with a

view to its practical applications.

3.1.1 The ‘Problem with Dice’

The 1654 correspondence touches on various types of the ‘problem of points’ or “les

partis.” Such problems concern the fair distribution of the total sum staked in games of

74 I am using the term ‘actual’ in the Peircean sense—experimentation and experience are actual when they involve physical ‘reaction’, that is, when they involve ‘secondness’.

chance according to the players’ relative expectations for winning, when the game is suspended before it ends. For the case where two players are involved, for example, we might state the problem of points as follows: Suppose that two players play a match such that, in order to win, one must score n points before his opponent does. If they stop the match when player 1 has won x < n points and player 2 has won y < n points, how should the total sum that they staked in the match be divided? The original letter from Pascal to

Fermat is missing. However, in a letter dated July 29, 1654, Pascal identifies two types of problems of points under discussion with Fermat. One problem is “with dice [le parti des dés] and the other with sets of games [partie des parties] with perfect justness” (David

1962, p.230-231).75
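
To fix ideas, a small worked instance of the problem of points may help. The notation, the assumption of evenly matched players, and the helper function fair_shares below are illustrative conveniences of mine, not anything found in the correspondence. Take n = 3, x = 2, and y = 1: at most two further rounds would decide the match, and dividing the stakes in proportion to each player's chance of winning, were play continued, gives player 1 three quarters of the total. A brute-force sketch of that calculation:

    from fractions import Fraction
    from itertools import product

    def fair_shares(n, x, y):
        # Divide the stakes of a match to n points, stopped at the score x : y,
        # assuming evenly matched players: enumerate every way the remaining
        # rounds could go and credit each player with the fraction of
        # continuations in which he reaches n points first.
        rounds_left = (n - x) + (n - y) - 1      # always enough rounds to settle the match
        wins_for_player_1 = sum(
            1 for outcome in product((1, 2), repeat=rounds_left)
            if outcome.count(1) >= n - x         # player 1 collects his missing points
        )
        share_1 = Fraction(wins_for_player_1, 2 ** rounds_left)
        return share_1, 1 - share_1

    print(fair_shares(3, 2, 1))   # (Fraction(3, 4), Fraction(1, 4)), i.e. a 3-to-1 split

Counting over the fixed number of rounds that would certainly settle the match is a standard modern shortcut; playing those rounds out in full does not change which player reaches n points first.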

Let us consider the ‘problem with dice’ first. Unfortunately, the original letter from Pascal to Fermat, which presumably stated a precise and specific problem, is missing.

But from the extant correspondence we can surmise that the problem is “simply when a

person throwing dice bets that he will get a certain number in a stated number of throws”

(David 1962, p. 230). In an undated letter responding to Pascal’s original questions,

Fermat considers the following problem. Suppose that a player bets that he will throw a

certain score with a single die in eight throws; for example, a player may bet that in eight

throws, he will score at least one 6. The problem to resolve is how the total stakes of the

bet ought to be distributed if the game is suspended at various stages, say, before the first

or second or third throw, and so on. Fermat answers that if the player agrees not to make

the first throw, then the player ought to receive 1/6 of the total sum at stake. Now, if it is

75 Maxine Merrington’s translation of the entire correspondence is included as an appendix to David 1962. All my quotations of the correspondence are from this source.

further agreed that the player will not make his second throw either, then he also ought to receive 1/6 of the remainder of the stakes; that is, (1/6) * (1 – 1/6) = 5/36 of the total sum.

If the player further agrees not to make his third throw, then he should again receive 1/6 of the remainder; that is, 25/216 of the total sum. If the player further agrees not to make the fourth throw, he should receive 125/1296 of the total; and so on. The fair value of the agreement is similarly calculated up to eight throws (Fermat in David 1962, p. 229).76
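
The pattern behind these successive compensations is worth making explicit: each share is 1/6 of whatever portion of the stakes is still in play, so the share for forgoing the k-th throw is (1/6)(5/6)^(k-1) of the total. The following check in modern terms is a reconstruction for the reader's convenience, not anything stated in the letters:

    from fractions import Fraction

    remainder = Fraction(1)          # portion of the total stakes still in play
    shares = []
    for _ in range(8):               # the wager allows up to eight throws
        share = remainder / 6        # 1/6 of whatever remains, per the rule Fermat describes
        shares.append(share)
        remainder -= share

    print(shares[:4])    # the first four shares: 1/6, 5/36, 25/216, 125/1296
    print(1 - remainder) # their total over eight throws, 1 - (5/6)^8, about 0.77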

After expounding this correct reasoning, Fermat observes that there is an

erroneous example in Pascal’s original (now missing) letter. Suppose that a player undertakes to score a 6 in eight throws and, after having made three throws and failed to score a 6, the opponent asks him not to make the fourth throw. Fermat reports that according to Pascal the player should receive 125/1296 of the total sum in compensation.

Fermat counters that this is erroneous. He argues that since the player has made three

throws and scored no 6, he has gained nothing; therefore, the total sum remains in play and the player should receive 1/6 of the total stakes in compensation. And if after four throws, the player agrees not to make the fifth one, he again should receive 1/6 of the total stakes. He argues that as long as the entire sum remains in play “it follows not only

from theory but from common sense that each throw must have the same value” (David

1962, p. 230). Fermat concludes, “let me know whether we agree in principle, as I

believe, and only differ in application” (David 1962, p. 230). This last statement is

important because it highlights that Pascal and Fermat are not interested merely in

solving some applied problems and leaving the discussion at that; they are rather

76 We might note that the resulting series of fair payoffs are the terms of a geometrical series. For a brief exposition and discussion of the series, see Schneider 2000, p. 62-63.

interested in developing a mathematical method for estimating fair compensations on the basis of expectations. As we will see, whether this mathematical method is already probabilistic or not is an important matter of debate.

For now, however, let us consider Pascal’s response. In the letter dated July 29,

1654, he writes to Fermat: “I admire your method for the problem of points even more than that for dice. I have seen several people obtain that for dice, like M. Chevalier de

Méré, who first posed these problems to me, and also M. de Roberval: but M. de Méré never could find the true value for the problem of points nor a method for deriving it, so that I found myself the only one to know this ratio” (Pascal in David 1962, p. 231).

Pascal reports that de Méré has solved a “problem for dice” but not the “problem of points.” When coupled with the following passage from the same letter, this statement is the source of a debate regarding whether the problem-context for the Pascal-Fermat correspondence is empirical or theoretical. Moving on to pose other problems and attempting solutions, Pascal reports:

I have not time to send you the proof of a difficulty which greatly puzzled M. de Méré, for he is very able, but he is not a geometrician [that is, a mathematician] (this, as you know, is a great defect)….He told me that he had found a fallacy in the theory of numbers, for this reason:

If one undertakes to get a six with one die, the advantage in getting it in 4 throws is as 671 is to 625.

If one undertakes to throw 2 sixes with two dice, there is a disadvantage in undertaking it in 24 throws.

And nevertheless 24 is to 36 (which is the number of pairings of the faces of two dice) as 4 is to 6 (which is the number of faces of one die).

This is what made him so indignant and which made him say to one and all that the propositions were not consistent and that Arithmetic was self-contradictory: but you will very easily see that what I say is correct, understanding the principles as you do. (David 1962, p. 235-236)

The gambler de Méré knows that betting that he will throw at least one 6 in four throws of a die is “advantageous,” while undertaking a bet that he will throw at least one pair of

6s in twenty-four throws of two dice is “disadvantageous.” In contemporary terms, we would say that the a priori probability of winning the former bet is greater than 0.5, while the a priori probability of winning the latter bet is less than 0.5. A simple calculation using contemporary mathematical probability shows that the probability of throwing at least one 6 in four throws of a die is 0.5177 while the probability of throwing at least a pair of 6s in twenty-four throws of two dice is 0.4914.77 But how does de Méré know it?
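
These figures are easy to verify (a modern check, not something carried out in the correspondence itself): each probability follows from the complement rule, and exact fractions recover the odds of 671 to 625 that Pascal cites in the passage quoted above.

    from fractions import Fraction

    # Chance of at least one 6 in four throws of one die: 1 - (5/6)^4
    p_one_six = 1 - Fraction(5, 6) ** 4
    # Chance of at least one double-6 in twenty-four throws of two dice: 1 - (35/36)^24
    p_double_six = 1 - Fraction(35, 36) ** 24

    print(p_one_six, float(p_one_six))   # 671/1296, about 0.5177 (odds of 671 to 625)
    print(float(p_double_six))           # about 0.4914, just under an even chance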

Donald Gillies argues that de Méré knows about the relative advantage or

disadvantage of undertaking these bets by way of empirical observations akin to

experimental results. For Gillies, de Méré’s “empirical observations at the gambling table

had enabled him to realize that a probability of 0.4914 was less than 0.5. The precision of

this result is worthy of the finest and most painstaking scientific experimenter, and

perhaps de Méré should be regarded as such, whatever his actual motives for making

these observations” (Gillies 2000a, p. 54). David holds a similar opinion, surmising that

de Méré was such an assiduous gambler that he could distinguish empirically a difference

in probability of 0.5 – 0.4914 = 0.0086 (David 1962, p. 89). Gillies continues to argue

that “there is no doubt that ‘experimental observations’ of this character led to a striking

advance in mathematics, namely the early development of the mathematical theory of

probability” (Gillies 2000a, p. 54). His interpretation of this passage from the Pascal-

Fermat correspondence is part of a larger argument according to which mathematical

77 For a simple description of these probability calculations see David 1962, p. 89.

knowledge—at least in some areas—has empirical origins, that is, conceptual sources in actual experience, including sense perception.78 Gillies finds support for his empiricist

view in cases where mathematics, like the natural sciences, advances by reflection upon

the result of experimental investigations. The discovery of early mathematical probability

is just such a case for Gillies. According to him, the theory of probability—which

eventually became so important for the natural and social sciences—originated from the

frivolous activities of gamblers because the “standard games of chance involving coins,

dice, cards, roulette wheels, etc. can be considered as experimental apparatuses for

investigating the phenomenon of randomness. The compulsive gamblers who spent hours

studying the outcomes of experimental trials with these pieces of apparatus were in effect

scientists conducting a careful examination of the phenomenon of chance and

randomness, even though their motives were very far removed from those of the

disinterested student of nature” (Gillies 2000a, p. 53). According to Gillies, then, the problem-context for the discovery of mathematical probability was thoroughly empirical,

as it involved the observation of outcomes from actual experimentation with the aid of

randomizing tools.

Ivo Schneider rejects this position, arguing to the contrary that the problem of

dice raised by de Méré belongs within a thoroughly theoretical context. Schneider bases

78 Gillies’s overall empiricist view is nuanced and well worth consideration far beyond the limited treatment that I can provide here. For example, even though traditional empiricism suspects the positing of abstract entities, Gillies argues for an Aristotelian empiricism according to which abstract objects, such as numbers and sets, (i) exist objectively or at least intersubjectively and (ii) do not exist apart from the material world but embodied in the material world (2000a, p. 46-47). Gillies’s view extends to the epistemic origins of mathematical knowledge in perception and to an accompanying notion of mathematical truth as being hypothetical. But my limited focus right now is on what I call the ‘problem-context’ of discovery.

his argument in part on the text of the correspondence. He observes that in the July 29 letter Pascal initially credits de Méré with solving a “problem for dice.” Schneider interprets this to mean that de Méré solved precisely the problem of dice that Pascal raises later in the same July 29 letter to Fermat—the problem we have been considering.

According to Schneider, the two passages in Pascal’s letter only make sense if we assume that de Méré had calculated himself the odds of scoring a 6 in four throws of a single die and the odds of scoring a double 6 in twenty-four throws of two dice. De Méré’s indignation arises when he notes that the proportions of 4 to 6 and 24 to 36 are equal, while against his (mathematically naïve) expectation, the odds in the two bets are different. Thus, Schneider concludes, “the whole passage constitutes a report about a completely theoretical problem, far away from any ‘empirical observations’” (Schneider

2000, p. 61). Thus, whereas Gillies views de Méré as a careful observer of empirical results and indeed as a painstakingly precise experimenter, Schneider views him as an able calculator of odds on the basis of available mathematical theory.

Schneider offers another argument against the interpretation of de Méré’s work as being empirical and experimental. This argument is independent of the Pascal-Fermat correspondence. Schneider argues that Gillies’s interpretation of the correspondence involves two questionable presuppositions. First, there must already exist, for de Méré as well as for any other gambling experimenter, some idea of the stabilization of relative frequencies with increasing numbers of independent trials. Second, since many thousands of actual experimental trials are necessary to determine, with a significant level of certainty, that the odds are smaller than 1 to 1 in undertaking to score two 6s in twenty-four throws of two dice, then the gambling experimenters must already have an

appropriate way of storing the information of thousands of outcomes. However, neither presupposition is fulfilled in 1654. According to Schneider, the “conceptual and methodological tools necessary for an evaluation of empirical observations in the sense claimed by Gillies began to develop only a generation later with Jakob Bernoulli”

(Schneider 2000, p. 62). For Schneider, what de Méré, Pascal, or any other able mathematician of that time could do was to calculate the odds involved in various proposed games with dice.79 From Schneider’s perspective, these calculations clearly

belong within a theoretical problem-context.

In my estimation, it is clear that Fermat would have been able to apply his own

theoretical method correctly to any variant of the ‘problem with dice’. But it is unclear to

me that de Méré would have been able to do it.80 In this regard, Schneider’s interpretation

of the text of the correspondence is not convincing. Aside from a passing comment by

Pascal to the effect that de Méré solved some “problem for dice,” there is little reason to

believe that the French gambler would have been able to solve the variety and complexity

of problems that Fermat was able to solve. From the text alone it is even doubtful

whether Pascal knows how to solve the given theoretical problem. Pascal clearly does

report that the odds of scoring a six in four throws of a single die are 671 to 625. Given

his rejection of Fermat’s combinatorial method, it is unlikely that he would have arrived

79 He provides a clear and brief mathematical description of how this may have been possible. See Schneider 2000, p. 62-63. Schneider generalizes the method developed by Fermat for estimating fair payoffs in games with dice. This is the method we already discussed when introducing the ‘problem with dice’. 80 It is even unclear to me that Pascal would have been able to do it. Pascal constantly blunders in his arguments with Fermat, and then blatantly tries to save face by claiming, without any extant evidence, that his solutions to various problems agree with Fermat’s. For a view that brings into question Pascal’s own powers as a mathematical probabilist see David 1962, chapters 8 and 9. For a more charitable view of Pascal’s work, see Calinger 1999, p. 548-554.

at this result by estimating the various combinations of possible outcomes and showing that in 671 out of 1296 possible cases, at least one 6 is scored. He may have employed instead the method for the ‘problem with dice’ to show that the fair payoff out of the total stakes for foregoing the four throws in an attempt to score a 6 is (1/6) + (1/6)*(5/6) +

(1/6)*(25/36) + (1/6)*(125/216) = 671/1296. Now, Fermat’s general method may also be employed for estimating the odds of scoring a double 6 in twenty-four throws of two dice. The fair payoff is 1 – (35/36)^24 as a proportion of the total stakes.81 But Pascal does

not report this result. So I do not find convincing evidence that de Méré or Pascal have

solved the latter theoretical problem. Overall, we have no way to determine from the text

of the correspondence alone whether they have solved the theoretical problem and that is

how they know of the relative advantage or disadvantage of undertaking the two different

bets, as Schneider argues, or whether de Méré knows about these relative advantages

from his empirical observations as a gambler, as David and Gillies suggest, while Pascal

either knows the theoretical explanation for the observed results and is simply reporting

the episode to Fermat or does not know the mathematical explanation and is posing the question to Fermat surreptitiously. In short, I do not think it is possible to determine from

the text of the correspondence alone whether the problem-context for these mathematical

investigations is empirical or theoretical.

Nevertheless, I think that Gillies’s empiricist interpretation of the correspondence

relies unduly on the notion of experimentation. From my perspective, the act of

81 See Schneider 2000, p. 62. Note that the probability of not scoring a double 6 in a double-throw is 35/36. And there are twenty-four double-throws. So the probability of not scoring a double 6 in twenty-four double-throws is (35/36)^24.

experimentation involves the clear purpose of investigating a theoretical hypothesis.

Regardless of the conceptual origin of the hypothesis—which from a Peircean perspective is usually abductive—the notion of experimentation implies that a theoretical conjecture is being put to the test. Gillies’s position implies that de Méré, as a representative of the gambling experimenters, had the purpose of putting a series of specific hypotheses regarding chance to the test of physical trials involving the actual manipulation of various randomizing apparatus. But there is no evidence that these gamblers ever took any such scientific course of action. In this sense, Schneider is correct in pointing out that the concept of stable relative frequencies of events, empirically determinable via experimental trials, had not been clearly established at the time of the

1654 correspondence; in fact, any way to determine stable statistical ratios from independent experimental trials would not be formally conceptualized until the appearance of Jacob Bernoulli’s Ars Conjectandi in 1713 and its subsequent impact on the development of mathematical probability. Thus, I think it is unjustified to claim that the origins of mathematical probability, as illustrated in the Pascal-Fermat correspondence, lie in an experimental problem-context.

It is more adequate to claim that the problem with dice that Pascal reports or poses to Fermat is already a theoretical problem. Pascal appears to be reporting that mathematical principles easily explain the relative advantages and disadvantages in games that make de Méré so indignant; perhaps Pascal is even posing the question to

Fermat, being unsure of how to solve mathematically the relative disadvantage in the twenty-four double-throw game with dice. At any rate, the problem emerges in the context of a theoretical, not of an empirical or experimental, discussion, and it is also to

be solved via mathematical theory and not via experimental observation. I agree with

Schneider—the empiricist presuppositions that gamblers like de Méré would already have (i) a clear mathematical concept of stable statistical ratios and (ii) the means to estimate these ratios in complicated problems involving vast amounts of empirical information are unwarranted. In Peircean terms, I think the problem under investigation already belongs within a hypothetical state of affairs—a mathematical world created on the basis of hypotheses regarding the possible outcomes of idealized games that involve perfectly fair randomizing objects.

On the other hand, I also think it is unwarranted to claim without qualification that the historical origins of the mathematical problems under consideration in the Pascal-

Fermat correspondence lie in a purely theoretical context. I propose that the idealized mathematical system that Pascal and Fermat are exploring theoretically in their correspondence does have some empirical sources. While gamblers such as de Méré were not experimental scientists, they were keen observers of patterns of outcomes in games of chance. At least in simple cases, they knew empirically that undertaking certain games was more advantageous than others, even if they could offer no theoretical explanation for the relative advantage or disadvantage. They were able to settle some relative advantages or disadvantages empirically in simple cases, but had to look for the

‘mathematicians’—in the sense of the theoretical investigators of an idealized hypothetical system—to offer an explanation.

There is a long line of keen-eyed gamblers to bear out this position. F. N. David argues, for example, that Girolamo Cardano, an “inveterate gambler” and an able mathematician, had a “good working knowledge” of the empirical chances of winning at

various games with dice; in fact, Cardano’s Liber de Ludo Aleae—published in 1663 but probably written between 1530 and 1560—can be interpreted as an attempt to provide the practice of gaming with an incipient mathematical theory (David 1962, p. 58).

Cardano’s key insight is to try to determine by enumeration all the possible combinatorial outcomes of various games and to determine relative advantages or disadvantages on the basis of such enumerations. In turn, between 1613 and 1623 Galileo wrote Sopra le

Scoperte dei Dadi, a brief mathematical treatment of games with dice.82 He produced this

work probably at the request of the Grand Duke of Tuscany (David 1962, p. 65). The

main problem for study is the following. Suppose that three dice are thrown. We know

that there is an equal number of ‘3-partitions’ that yield a total score of 9 or 10. That is,

we can score a 9 with the 3-partitions (6,2,1), (5,3,1), (5,2,2), (4,4,1), (4,3,2), and (3,3,3), while we can score a 10 with the 3-partitions (6,3,1), (6,2,2), (5,4,1), (5,3,2), (4,4,2), and

(4,3,3). And yet in practice it is less advantageous to undertake to score a total of 9 than

to undertake to score a total of 10. Why is this empirical outcome the case? (see David

1962, p. 65). Galileo’s treatment is to distinguish between partitions and permutations, and to show that there are more permutations that produce a score of 10 than of 9.83 But

what is important for us right now is Galileo’s statement that “although 9…can be made

up in as many ways [or 3-partitions] as 10…yet it is known that long observation has

made dice-players consider 10…to be more advantageous than 9” (Galileo in David

1962, p. 65; emphasis mine). Here we have a clear example in which dice-players, among

82 E. H. Thorne’s translation is printed as Appendix 2 to David 1962, p. 191-195. The original text is printed in Galileo 1898, vol. 8, p. 591-594. 83 The partition (6,2,1), for example, can be thrown in six different permutations or combinatorial orders, namely: [6,2,1], [6,1,2], [2,6,1], [2,1,6], [1,6,2], [1,2,6].

whom we presumably find the Grand Duke of Tuscany, have been able to determine empirically, by long observation, that it is more advantageous to throw a score of 10 than of 9 with three dice. But they cannot determine the reason and have to turn to the

‘mathematician’ for a theoretical explanation.

Below, in the context of providing a fuller account of the progressive creation of an idealized system of mathematical probability, I will turn to consider in more detail

Cardano’s and Galileo’s hypothetical idealization of various problems with dice in order to study them mathematically. For now, I want to emphasize that even though Pascal and

Fermat study theoretical problems in their correspondence, the origin of the hypothetical system that frames their theoretical study is found, at least in part, in the keen empirical observations of gamblers. These players ascertained that undertaking certain games is relatively more advantageous or disadvantageous in practice, and in so doing they identified empirical phenomena that became the subject for idealized theoretical treatment. In short, their empirical observations contributed to the gradual creation of a problem-context for mathematical study. But, as Daston points out, it was not only, or even mainly, gambling questions that created the problem-context for the gradual discovery of mathematical probability; legal questions, for instance, also contributed to the creation of a problem-context for mathematical treatment (Daston 1988, p. 13-14).

Let us turn to see how this influence of legal problems is reflected in Pascal and Fermat’s study of the ‘problem of points’.

3.1.2 The ‘Problem of Points’ and Expectation

As I noted in introducing the Pascal-Fermat correspondence, its central mathematical question is the ‘problem of points’, of which the foregoing ‘problem with dice’ is a special case. Recall that the ‘problem of points’ consists in determining the fair distribution of the total stakes in games of chance according to the relative expectations of the players. Let us consider another special version of the problem from the correspondence, which Pascal calls the ‘problem with sets of games’.

There must have been a letter by Fermat in response to Pascal’s letter of July 29.

Unfortunately, this letter is also missing, but again we can infer some of its contents from a letter by Pascal dated August 24, 1654. In that letter, Pascal recounts Fermat’s solution to a ‘problem with sets of games’ involving several players, and then proceeds to critique it. As Pascal writes, the dispute ultimately concerns the method of solution of the problem of points: “When there are two players, your combinatorial method is very reliable, but when there are three, I think I can prove that it is not very applicable, unless you proceed in some other way which I have not understood. But the method I have shown you and which I always use can be applied in all cases and in all types of the problem of points, whereas the combinatorial method (which I use for particular cases because it is shorter than the general method) is only good for those few occasions and not for others” (Pascal in David 1962, p. 239). According to Pascal, Fermat’s ‘combinatorial method’ is not generally applicable. What does he mean?

Pascal reports Fermat’s ‘combinatorial’ solution to the following problem.

Suppose there are two players playing several games (as part of a match that requires

staking a given sum).84 Also, suppose that the first man needs two more games and that

the second needs three (in order to win the match and collect the stakes). The problem is

“to find the fair division of stakes” if the players stop playing (David 1962, p. 239).

Fermat’s solution consists in determining, first, “in how many games the play will be

absolutely decided” (David 1962, p. 239). For the specified situation, the match will be

decided in four games at most. Second, it is necessary to determine “how many

combinations would make the first [player] win and how many the second and to share

out the stakes in this proportion” (David 1962, p. 240). Now, in order “to find out how

many combinations of four games there are between two players, one must imagine that

they play with a die of two faces (since there are only two players) as in heads and tails,

and that they throw four of these dice…and now one has to see in how many different

ways these dice can turn up” (David 1962, p. 240). Pascal reports that this “is easy to

calculate, it is sixteen altogether, which is the square of four” (p. 240), even though he

should say rather that sixteen is equal to two—the number of possible outcomes of each

toss—raised to the fourth power—the number of throws. Be that as it may, Pascal reports

Fermat’s further supposition that the two-sided die has one face marked a as favorable to

the first player and another face marked b as favorable to the second, so that the sixteen

combinations of possible outcomes are given by the following table.

84 The type of game is not specified, but both Fermat and Pascal assume that the players have equal chances of winning any one instance of the game. For illustrative purposes, we might imagine a game of skill in which players have exactly equal playing abilities or, in the case of two players, simply a game that consists in tossing a perfectly fair coin.

Throw 1: a a a a a a a a b b b b b b b b
Throw 2: a a a a b b b b a a a a b b b b
Throw 3: a a b b a a b b a a b b a a b b
Throw 4: a b a b a b a b a b a b a b a b
Winner:  1 1 1 1 1 1 1 2 1 1 1 2 1 2 2 2

Each column of the table displays a possible combination of outcomes in four successive

throws of the two-sided die. The last row in each column indicates whether player 1 or

player 2 wins and collects the stakes. Recall that player 1 wins the match if he wins two

games, while player 2 needs three games to win. There are eleven possible combinations

that make player 1 the victor, while only five combinations make player 2 the winner.

Therefore, the total stakes must be shared in the ratio of 11 to 5. Pascal concludes: “This

is your [Fermat’s] method when there are two players whereupon you say that if there are

more players, it will not be difficult to find the fair division of stakes by the same

method” (David 1962, p. 240).
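
Fermat’s counting can be made concrete with a brief computational sketch (my own illustration, not part of the historical material; the ‘a’/‘b’ labels follow the table reported above):

    from itertools import product

    # Player 1 needs two more games, player 2 needs three; the match is settled in at most four games.
    # As in Fermat's imagined two-faced die, 'a' marks a game won by player 1 and 'b' one won by player 2.
    sequences = list(product('ab', repeat=4))        # the sixteen equipossible four-game sequences
    wins_1 = sum(1 for seq in sequences if seq.count('a') >= 2)
    wins_2 = len(sequences) - wins_1
    print(wins_1, wins_2)                            # prints 11 5, so the stakes divide 11 : 5

Counting sequences with at least two a’s suffices because any four-game sequence with fewer than two a’s contains at least three b’s, so exactly one player reaches his quota in every column of the table.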

Having recounted his correspondent’s solution, Pascal proceeds to argue that

Fermat’s combinatorial method is not general and in fact “will not always be correct” if

there are more than two players (David 1962, p. 240). He argues as follows. Suppose that

there are three players and that the first man needs one game, the second needs two, and the third two. In order “to solve the problem of points, following the same combinatorial method, one must first find in how many games the play will be [absolutely] decided,” and in this scenario it will be decided in three games (David 1962, p. 242). Second, it is necessary to find the number of combinations for the outcomes of these three games, and to determine how many combinations favor the first, second, or third player respectively.

In this case, Pascal more accurately writes that “it is easy to see how many combinations

there are altogether: it is the third power of 3, that is to say its cube, 27” (David 1962, p.

242). Pascal now imagines a three-sided die, with one face marked a as favorable to the first player, another marked b as favorable to the second player, and another marked c as favorable to the third. Pascal, in an attempt to follow Fermat’s method, provides the following table for the twenty-seven combinations of possible outcomes of throwing three three-sided dice at once, and lists what he considers the corresponding winners of the play.

Throw 1: a a a a a a a a a b b b b b b b b b c c c c c c c c c
Throw 2: a a a b b b c c c a a a b b b c c c a a a b b b c c c
Throw 3: a b c a b c a b c a b c a b c a b c a b c a b c a b c
Winner:  1 1 1 1 1 1 1 1 1 1 1 1 1 2 2 1 2 3 1 1 1 1 2 3 1 3 3
(In Pascal’s count the columns a b b, b a b, and b b a are also credited to the second player, and a c c, c a c, and c c a also to the third, yielding his totals of nineteen, seven, and seven.)

Pascal correctly enumerates the possible combinations, but he makes a mistake in the

determination of the winners of the match. He reasons that since the first player needs

just one game to win, then all the throws resulting in at least one a are favorable to him,

and there are nineteen such throws. Since the second man needs two games, then all the throws with at least two b’s are favorable to him, and there are seven. Similarly, since the third man needs at least two c’s, there are seven throws favorable to him. Thus, for some of the throws there are two winners. For example, for the throw a b b both the first and second players are winners, for the throw a c c both the first and third players are winners, and so on (David 1962, p. 243).

Pascal himself observes that this seems to lead to the erroneous conclusion that the stakes ought to be divided in the proportion of 19:7:7 (David 1962, p. 243). But instead of offering an alternative correct analysis, Pascal offers another erroneous

solution. In order to determine the fair division of stakes, he proposes the following method, still taking himself to be following Fermat. There are thirteen throws favorable only to the first player, six throws which give him “a half share,” and eight throws which give him nothing. Suppose that each share is worth one pistole. Then, we determine his fair share of the stakes as follows: 13 * (1 pistole) + 6 * (1/2 pistole) + 8 * (0 pistoles) =

16 pistoles. Thus, we “divide the sum of the products, 16, by the sum of the throws, 27, which gives the fraction 16/27; this gives the amount due to the first man, when the stakes are shared out, that is 16 pistoles out of 27” (David 1962, p. 243-244). By the same method, we find that there are four throws favorable to the second player alone, three throws worth half a share to him, and twenty worth nothing to him. Therefore, his fair share of the stakes is 4 * (1 pistole) + 3 * (1/2 pistole) + 20 * (0 pistoles) = 5 ½ pistoles.

It is easy to confirm that the same amount is due to the third player. Therefore, the fair share of the stakes ought to be distributed in the proportion 16 : 5 ½ : 5 ½ (David 1962, p.

244).

Pascal mistakenly attributes this line of reasoning to Fermat’s method. He claims that the “hypothetical conditions” involved in Fermat’s method are inadequate to the

“actual conditions” of play. This is because in the “actual conditions of the game with three players, only one man can win, for play ceases as soon as one man has won. But in the hypothetical conditions, two men can get the number of games they need: that is, when the first man wins the single game he needs and one of the others wins the two games he needs [for example, the throws a b b or a c c]: for still they would have played only three games, whereas when there were only two players, the hypothetical and the real conditions fitted in with the interests of both players; it is this which makes such a

difference between the real and hypothetical conditions” (David 1962, p. 244-245).

Before untangling Pascal’s erroneous analysis, it is worthwhile to weigh his objection carefully for what it reveals about the nature of the mathematical reasoning at work in the correspondence. Pascal charges that the hypothetical conditions constructed by Fermat in order to analyze mathematically the problem of points are not adequate to represent the actual conditions of play. In Peircean terms, the hypothetical state of affairs devised under Fermat’s method is not an adequate representation of the actual state of affairs; it may be an ‘icon’ of some hypothetical situation, but it is not a ‘symbol’ of the actual situation of play. Therefore, its application to the actual game will lead to erroneous conclusions; in this case, to an erroneous distribution of stakes. From a Peircean stance, then, the corrective would be to create an alternative hypothetical state of affairs that correctly symbolizes the actual conditions of the game.

As it turns out, it is Pascal’s incorrect analysis of Fermat’s hypothetical state of affairs that leads to error. On September 25, 1654, Fermat replies, arguing that there are only seventeen combinations favorable to the first man and five for each of the other two.

This is because the combination a c c, for example, is favorable only to the first player and not to the third, because everything that occurs after the first man has won a game is worth nothing since he has already won the match. According to Fermat, Pascal appears to have forgotten this (David 1962, p. 247-248). Constructing an amended table with the correct analysis of the game under consideration, we get the following.

Throw 1: a a a a a a a a a b b b b b b b b b c c c c c c c c c
Throw 2: a a a b b b c c c a a a b b b c c c a a a b b b c c c
Throw 3: a b c a b c a b c a b c a b c a b c a b c a b c a b c
Winner:  1 1 1 1 1 1 1 1 1 1 1 1 2 2 2 1 2 3 1 1 1 1 2 3 3 3 3

The table shows that the fair distribution of the stakes ought to be in the proportion

17:5:5.
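
Fermat’s attribution rule, on which everything occurring after a player has won counts for nothing, can also be checked mechanically (again a sketch of my own, with ‘a’, ‘b’, and ‘c’ marking games won by players 1, 2, and 3, who still need 1, 2, and 2 games respectively):

    from itertools import product

    needed = {'a': 1, 'b': 2, 'c': 2}   # games still needed by players 1, 2, and 3

    def winner(sequence):
        # Walk the three throws in order; the match ends as soon as one player reaches his quota.
        tally = {'a': 0, 'b': 0, 'c': 0}
        for game in sequence:
            tally[game] += 1
            if tally[game] == needed[game]:
                return game

    counts = {'a': 0, 'b': 0, 'c': 0}
    for seq in product('abc', repeat=3):   # the twenty-seven equipossible three-game sequences
        counts[winner(seq)] += 1
    print(counts)   # {'a': 17, 'b': 5, 'c': 5}, i.e. the stakes divide 17 : 5 : 5

Re-counting each sequence by Pascal’s criterion instead, crediting every player who reaches his quota anywhere in the three games, reproduces his erroneous totals of 19, 7, and 7.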

Regarding Pascal’s objection to the hypothetical conditions created in Fermat’s method, the latter replies pointedly that the “consequence, as you so well remarked, of this fiction of lengthening the match to a particular number of games is that it serves only to simplify the rules and (in my opinion) to make all the chances equal or, to state it more intelligibly, to reduce all the fractions to the same denominator” (David 1962, p. 248).

Fermat explains to Pascal that his method of imagining that the play extends to a certain number of games is a way of simplifying the problem at hand. The simplification consists in reducing all the fractions of fair compensation to the same denominator—namely, to the total number of combinations of possible outcomes. In the above example, the fair compensation to the three players can be reduced respectively to 17/27, 5/27, and 5/27 of the total stakes. Fermat in fact discusses an alternative way of solving the problem without appealing to this simplifying hypothesis. I will return to discuss it in detail in section 4.1.3 when examining the important function that generalization plays in

Fermat’s reasoning as compared to Pascal’s. Right now, I wish to emphasize instead that

Fermat has succeeded in creating a hypothetical state of affairs that, coupled with correct mathematical analysis, leads to the solution of a wide variety of problems known collectively as the ‘problem of points’.

In my estimation, even though Pascal and Fermat are studying what is true about a hypothetical state of affairs, and so strictly speaking are immersed in theoretical problems, there is clearly present in their work a concern with developing applicable methods of mathematical analysis. But the applications are not intended only for games

of chance. As I have pointed out, Lorraine Daston argues that legal doctrines, especially regarding the problem of devising fair aleatory contracts for situations involving uncertainty about the future, were among the main sources of concepts and problems for the early development of mathematical probability (Daston 1988, p. 6-14). Accordingly, the notion of ‘expectation’, being a ‘readily quantifiable’ or ‘proto-quantitative’ concept, became a major subject of concern for the early probabilists: “Upon closer examination, the works of the early probabilists turn out to be more about equity than about chances, and more about expectations than about probabilities. These ideas and the applications they stimulated—for example, to games of chance and annuities—came…largely from the law” (Daston 1988, p. 14). My intention is not to evaluate Daston’s thesis that legal doctrines framed the main concepts and problems of mathematical probability in the seventeenth century. As she openly admits, there is no “monistic explanation” regarding the conceptual origins of mathematical probability in the seventeenth century (Daston

1988, p. 6). My intention is rather to point out for our consideration the claim that the

Pascal-Fermat correspondence concerns ‘expectation’ more centrally than ‘probability’.

Commenting upon the correspondence, Daston distinguishes sharply between

Fermat’s and Pascal’s methods to analyze the ‘problem of points’. Unfortunately, she only considers one of the many variants of the ‘problem of points’ treated in the correspondence, specifically an example from the July 29 letter from Pascal to Fermat.

Suppose that two players stake 32 pistoles each in a three-point game. If player 1 has won two games and player 2 has won one game, how are the stakes to be divided if they stop playing? (see David 1962, p. 231-232). Pascal claims that Fermat’s (now missing) solution involves “excessive” labor of combinations. So he offers a “shortcut” to estimate

“the fair value of each game” (David 1962, p. 231). His “shortcut” is to reason as follows.

For the next game, if player 1 were to win, he would get three points and therefore would collect all the stakes, 64 pistoles. But if player 2 were to win instead, the players would be tied at two points, so if they wished to stop playing at that point, each player would get back his 32 pistoles. Therefore, player 1 is assured of 32 pistoles, and he has ‘equal chances’ of winning the next game, so he is entitled to 16 more pistoles—one half of the remaining 32 which is due to him for his ‘equal chances’ of winning the next game

(David 1962, p. 231-232). As Daston notes, in contemporary notation the fair share of the stakes due to player 1 is (1) * 32 + (1/2) * 32 = 48 pistoles. Now, Daston writes that

“Fermat’s solution, as it can be pieced together from the extant correspondence

(particularly Pascal’s reply of 24 August 1654), seems to rest upon a full enumeration of all possible outcomes,” and she seems to agree with Pascal’s rejection of “Fermat’s combinatorial method as unwieldy and potentially liable to error” (Daston 1988, p. 15-

16). However, we have already seen that, in other examples, it is Pascal’s application of

Fermat’s method, and not the method itself, that turns out to be “unwieldy and liable to error.”

We can easily reconstruct what Fermat’s ‘combinatorial solution’ would be. First, determine the number of games in which the play must absolutely be decided. In this case, it is two games. Second, determine the total number of combinations of possible outcomes, and share out the stakes in the proportion of combinations that favor each player. Imagine a two-sided die, with one face marked a as favorable to player 1 and another marked b as favorable to player 2. The table of combinations and respective winners is the following.

Throw 1: a a b b
Throw 2: a b a b
Winner:  1 1 1 2

The 64 pistoles making up the total stakes, then, ought to be distributed in the proportion

3:1 to players 1 and 2 respectively. Therefore, player 1 gets 48 pistoles and player 2 gets

16. As we have seen, this method is not as unwieldy as Pascal thinks, even when the games become more extended and complicated with more than two players. Moreover, as

I have already pointed out and as I will elaborate fully when discussing the role of generalization in mathematical reasoning, we see in the correspondence Pascal’s failure to generalize and to apply Fermat’s method correctly. In Pascal’s defense, we might observe with Daston that “when Pascal claimed that ‘Fermat’s method has nothing in common with my own,’ he apparently meant that Fermat had suggested no mechanical means of finding combinations” and that “once Pascal realized that combinations (i.e., coefficients of the terms of the binomial expansion) could be systematically read off from the arithmetic triangle, he himself favored this approach to the mathematical analysis of games of chance” (Daston 1988, p. 17). I would add that in order to make Fermat’s method more manageable for the Pascalian objector, in terms of enumerating combinations, it was necessary to develop the field of mathematical combinatorics, as

Huygens and Bernoulli did later.

Now, Daston observes that using contemporary notation to express Pascal’s calculation of the expectation of player 1 as (1) * 32 + (1/2) * 32 = 48 can be misleading because it may appear to be symmetric with the contemporary probabilistic approach of estimating expectation as the product of the probability of winning a game times the

stakes of the game. She notes instead that “Pascal made expectation and equality of condition the primitive concepts of his analysis” (Daston 1988, p. 16). The point is that the ‘(1/2)’ in the expression above is not conceptualized as a ‘probability’ in the sense of a ratio of possible outcomes: instead “the ½ factor derived from the equality of condition between the two players” (Daston 1988, p. 16). The ½ expresses an equality of condition, that is, of chances of winning, but it is not a probabilistic ratio in our contemporary sense.

I might add that if there is a conceptualization of ½ as a ratio of favorable to total possible outcomes in the correspondence, it would be Fermat’s. Daston concludes:

“Although Pascal clearly knew the outcome values of [player 1’s] winning or losing the next round, and understood Fermat’s combinatorial solution, he chose to analyze the problem in terms of certain gain [(1)*32] and remainder subject to equitable distribution

[(1/2)*32]. Only after this fundamental expectation has been established do probabilities of any description enter the argument, and then only to endorse halving the residual amount as fair. Unlike Fermat’s, Pascal’s strategy consisted in eliminating explicit considerations of probability from as much of the problem as possible, substituting certain gain and equity in their place. Fermat’s solution took equiprobable combinations as fundamental; Pascal’s approach was built upon expectation. Both mathematicians viewed the problem as one of determining expectations rather than probabilities” (Daston

1988, p. 16-17).
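
For comparison (modern notation, added here only to set the two routes side by side), Pascal’s certain gain plus an equal split of the remainder and the combinatorial weighting of the total stakes arrive at the same figure:

    E_1 \;=\; 1 \cdot 32 + \tfrac{1}{2} \cdot 32 \;=\; 48
    \qquad\text{and}\qquad
    E_1 \;=\; \tfrac{3}{4} \cdot 64 \;=\; 48,

where 3/4 is the proportion of equipossible continuations favorable to player 1 (a a, a b, and b a out of the four columns) in the small table above.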

As I have already argued, when taking the entire correspondence into consideration it is doubtful whether Pascal understood Fermat’s combinatorial method based on the enumeration of equipossible cases. But Daston makes the important point that Pascal’s reasoning is not probabilistic, at least not in the contemporary sense, and

that both Pascal and Fermat sought to estimate expectations rather than probabilities, inspired by applied problems concerning aleatory legal contracts. The difference between their respective conceptual approaches to the problem is, I think, that Fermat begins to treat systematically the calculation of ratios of favorable to total possible outcomes as being central to the estimation of fair expectations. To be bold, I might claim that Fermat is looking forward to what the mathematical concept of probability will become as a ratio of favorable to total possible outcomes, and Pascal is looking backwards, or at least sideways, unable to foresee the power of probabilistic analysis in problems of expectation. Some commentators might consider this claim too bold, though. Ivo

Schneider, for example, observes that the “whole correspondence does not contain a word or an idea connected with the or indeed with any concept of probability” (Schneider

2000, p. 61). Regarding the method of solution to the problem of points, he adds that “the considerations leading to [the solution of the problem] have nothing to do with the concept of probability nor with the concept of frequency. Instead the basic concept used by Fermat and by Pascal is the value of each single throw or more generally the value of each single game” (Schneider 2000, p. 63). Regarding Schneider’s comments, I would remark that the correspondence does not contain any explicit word about a concept of

‘probability’, but Fermat’s combinatorial method contributes seminal ideas for the development of mathematical probability. On balance, it is more adequate to claim that

Fermat takes crucial steps towards conceptualizing ‘probability’ in terms of the combinatorial estimation of the ratio of favorable to total possible outcomes, even as he is steeped in problems of expectation.

3.2 The Creation of a Hypothetical State of Affairs and the Origins of Mathematical Probability Theory

Having discussed at length the Pascal-Fermat correspondence, let me articulate more fully my position regarding the problem-context that led to the discovery of mathematical probability theory. As we have seen, the debate surrounding the correspondence tends to pose a dichotomy according to which the original problem-context for the discovery of mathematical probability was either empirical or theoretical. In my estimation, Pascal and Fermat were already working within a thoroughly theoretical framework; their treatment of the ‘problem of points’ and, more generally, of the problem of rational expectation in aleatory situations, is ‘purely mathematical’ in the sense that it is the study of what would be true under determinate hypothetical conditions. The framing, analysis, and proposed solutions to the problem are thoroughly theoretical, and they do not involve

‘actual’ experimentation, in the sense of testing a hypothesis with actual randomizing tools. However, the strong claim that these mathematicians were working on a theoretical problem “far away from any ‘empirical observations’” is unwarranted.85 I contend rather

that an empirical problem-context acted as an enabling condition for discovery. The

discovery consisted in the gradual framing and investigation of a hypothetical state of

affairs. Empirical observation enabled the progressive conception of a mathematical

world, and mathematicians in part conceived this theoretical world with the purpose of

finding theoretical solutions to empirical problems.

85 This previously quoted claim is found in Schneider 2000, p. 61.

Now, I propose that the gradual conceptualization of a ‘Fundamental Probability

Set’ of equiprobable outcomes is at the core of the progressive creation of the hypothetical state of affairs that early mathematical probabilists came to study. In contemporary terms, by a ‘fundamental probability set’ we understand the set consisting of all the possible outcomes for a given experiment.86 Thus, I am suggesting that the

central discovery in early probability theory consists in the progressive conceptualization

of a set—or collection or completely enumerated list—of all the possible outcomes of a

‘random experiment’ or ‘aleatory trial’ that have equal chances of happening.87 This

‘fundamental probability set’ of equiprobable outcomes constitutes the main ‘framing-

hypothesis’ for the mathematical investigations of the early probabilists as it largely

determines the hypothetical state of affairs to be investigated.
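
In present-day notation (a formalization the early probabilists themselves did not possess), the framing hypothesis amounts to fixing a finite set Ω of equipossible outcomes and measuring the chance of any event A ⊆ Ω by the ratio

    P(A) \;=\; \frac{|A|}{|\Omega|}, \qquad\text{for example}\quad \Omega = \{1, \dots, 6\}^{3},\ |\Omega| = 216 \text{ for a throw of three dice.}

The equiprobability of the elements of Ω is precisely the ‘framing-hypothesis’ referred to above.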

The history of this conceptualization begins with the first attempts to enumerate

all the possible outcomes of aleatory trials in games of chance. In his influential essay on

“The Beginnings of a Probability Calculus,” M. G. Kendall points out that at some

unknown point in time during the middle ages dice finally replaced tali as the main

instruments of play and, since cards were not introduced until about 1350 A.D., gaming

was probably conducted mainly with dice for about one thousand years (Kendall 1970, p.

19). In accordance with this, the first known enumerations of possible chance outcomes

86 For the experiment of tossing a fair coin, for example, the ‘fundamental probability set’ consists of the equally probable outcomes ‘heads’ and ‘tails’. The name ‘fundamental probability set’ is due to Jerzy Neyman (1950, p. 15). The early probabilists, of course, did not conceive of it as a ‘set’ in our contemporary mathematical sense. For ease of exposition alongside other commentators, however, I will also write of the ‘fundamental probability set’ when discussing early mathematical probability. 87 The formulation ‘random experiment’ involves the contemporary notions of randomness and experimentation, so the formulation ‘aleatory trial’ perhaps conforms better to the conceptual scheme of the early probabilists.

refer to games with dice. In particular, the earliest known enumerations list the

‘partitions’ of possible outcomes for dice throws.88 Kendall refers to a “clerical version”

of a game with three dice invented around 960 A.D. by the bishop Wibold of Cambray.

The bishop enumerates fifty-six virtues corresponding to the fifty-six partitions of

possible ways in which three dice can be thrown. The game consists in throwing three

dice and then practicing for a day the virtue that corresponds to the result. As Kendall

observes, “the important point is that the partitional falls of the dice were correctly

counted” (Kendall 1970, p. 22). He also cites medieval poems in English that list the

possible outcomes for various throws, and notes that “the different possible throws were

enumerated and known without any reference to gaming or to a probabilistic basis” (p.

22). Even though these partitional enumerations indeed are not probabilistic, I suggest

that their conception constitutes the first step towards the eventual ideation of a

‘fundamental probability set’. Dice players realize that all the possible partitions of

outcomes can be correctly enumerated and proceed to do so, for various purposes

including, apparently, the creation of a “virtuous” game to counteract the “vicious”

effects of gambling. The partitional enumerations are the first conceptual step towards the

eventual creation of a probability calculus. Even if no calculation is attempted on the

basis of the partitions, the enumeration begins to determine a field of possibilities that

eventually will become the subject of a calculus.

88 Recall that a ‘partition’ is a possible combination of dice scores regardless of order. For example, suppose that three dice are thrown and suppose that the scores are 1, 1, and 2, regardless of order. This result is the partition [1, 1, 2]. The ordered throws (1,1,2), (1,2,1), and (2,1,1) all correspond to this partition [1,1,2]. When order is taken into consideration, the possible outcomes are known as ‘permutations’ rather than partitions.

The next development is the distinction between ‘partitions’ and ‘permutations’.

The Latin poem De Vetula, usually ascribed to Richard de Fournival (1200-1250 A.D.), contains the earliest calculation of the total number of possible ordered throws with three dice. Thus, it accounts for permutations and not only for partitions. Kendall translates the relevant passage as follows: “If all three numbers are alike, there are six possibilities; if two are alike and the other different there are 30 cases, because the pair can be chosen in six ways, and the other in five; and if all three are different there are 20 ways, because 30 times 4 is 120 but each possibility arises in 6 ways. There are 56 possibilities. But if all three are alike, there is only one way for each number; if two are alike and one different, there are three ways; and if all are different there are six ways. The accompanying figure shows the various ways” (Kendall 1970, p. 23). An edition printed in 1662 includes an iconic figure, probably added by a commentator and not part of the original poem, that represents the fifty-six possible partitions and a table that calculates the number of permutations (see Kendall 1970, p. 24-25; David 1962, plates 6 and 7). In the quoted passage, the author of De Vetula first accounts for the fifty-six 3-partitions: six partitions where the same score turns up thrice, plus thirty where one score turns up twice, plus twenty where three different scores turn up. But then the author observes that the first six partitions can only turn up in a single order, while the second thirty partitions can turn up in three different orders, and the last twenty partitions can turn up in six different orders.

Therefore, even though the poem does not state it explicitly, it is implied that the number of permutations is (6)*1 + (30)*3 + (20)*6 = 216. This is the total number of possible permutations of three dice throws. Again, even though there is no attempt at any probabilistic calculations, the important distinction between unordered and ordered

throws is present in this poem, and so this is a further determination of what would eventually become a ‘fundamental probability set’. In the case of the throw of three dice, what would eventually become the hypothetical state of affairs for probabilistic study is more finely described as consisting of permutations and not merely of partitions. Now, both Kendall and David think that the distinction already present in De Vetula is an example of a key conceptual advance that gets lost and needs to be rediscovered (Kendall

1970, p. 23; David 1962, p. 34). For example, Kendall notes a rather free 14th Century

translation of De Vetula into French in which the translator does not list partitions or

permutations, but merely enumerates the sixteen possible total scores with three dice,

pointing out that some of them occur more often than others. Kendall writes, “the

essential step in the De Vetula has been lost” (Kendall 1970, p. 23). That is, the

description of the field of possible outcomes in terms of permutations has been lost, at

least for the translator.

However, the distinction has been initially achieved and, as history shows, will be

achieved again. A more determinate field of possibility, described in terms of partitions

and permutations, now awaits rediscovery as a system subject to mathematical

study. An incipient development is present in Cardano’s Liber de Ludo Aleae, composed probably between 1530 and 1560. There is some debate as to the relevance of Cardano’s work to the history of mathematical probability, since his work is not a mathematical treatise but a manual for gamblers containing all sorts of disparate advice. Moreover,

David suggests that Cardano’s mathematical insights might actually be due to his faithful assistant, Ludovico Ferrari (see David 1962, ch. 6). Regardless of authorship, along with

David I think the following passage from the chapter “On the Cast of One Die” in the

Liber de Ludo Aleae is relevant to the history of probability: “One-half of the total number of faces always represents equality; thus the chances are equal that a given point will turn up in three throws, for the total circuit is completed in six, or again that one of three given points will turn up in one throw. For example, I can as easily throw one, three or five, as two, four or six. The wagers are therefore laid in accordance with this equality if the die is honest, and if not they are made so much the larger or smaller in proportion to the departure from true equality” (Cardano 1961, p. 8-10). The important step taken in this passage consists in linking equipossible outcomes and equality of chances. When one fair die is cast, there are six equally facile outcomes; therefore, the chances of throwing any one of three points (e.g. 1, 3, or 5) are equal to those of throwing any of the other three points (e.g. 2, 4, 6). Cardano’s somewhat confusing exposition is due in part to his regard for the “fundamental principle of gambling” which consists in equality of conditions, for example, “of opponents, of bystanders, of money, of situation, of the dice box, and of the die itself” (Cardano 1961, p. 5). He is concerned with determining fair wagers in which the participants have equal chances of winning, and these equal chances are determined on the basis of the equally possible outcomes of casting fair dice.

Cardano’s breakthrough consists in actually calculating chances on the basis of counts of equipossible outcomes. The hypothetical system determined by the complete enumeration of outcomes is now the subject of an incipient mathematical calculus. In the Liber,

Cardano proceeds to calculate chances in throws of two and of three dice, on the basis of the counts of possible outcomes.
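
To illustrate the kind of calculation at issue (the example is mine and is not a citation of the Liber), the chance of throwing at least one six with two dice, reckoned over the thirty-six equipossible ordered outcomes, is

    \frac{36 - 25}{36} \;=\; \frac{11}{36},

since twenty-five of the ordered pairs contain no six at all.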

According to David, it is reasonable to believe that Cardano, as a passionate gambler, had a “good working knowledge” of the empirical chances involved in various

games of chance and that, as an avid reader, he was familiar with the correct enumerations of partitions present in works like De Vetula and several commentaries on it (David 1962, p. 58). In David’s estimation, then, “the step which [Cardano] or Ferrari took, and it is a big one, is to introduce the idea of combinations to enumerate all the elements of the fundamental probability set, and to notice that if all the elements of this set are of equal weight, then the ratio of the number of favourable cases to the total number of cases gives a result in accordance with experience” (p. 58). She concludes that

“[t]here is no doubt about it: here the abstraction from empiricism to theoretical concept is made…for the first time” (p. 58). I agree with David only in part. Actual experience in games with dice enabled the gradual creation of the fundamental probability set, as present in De Vetula and as deployed for the calculation of chances in Cardano’s Liber.

On the basis of actual experience these authors, as representatives of a whole community of inquirers, were able to create an ideal system by a process which David calls

“abstraction.”89 Moreover, I think it is reasonable to believe that intelligent gamblers had

estimated empirically the chances of winning in some games of chance. However, they

could offer no reason for it, and had to turn to mathematics for an explanation. Cardano, possibly with the aid of Ferrari, was able to turn his observations and experience into a mathematical problem. The idea, available to Cardano, of a complete enumeration of partitions and permutations of outcomes in games with dice framed the mathematical problem of the estimation of chances. Nevertheless, once Cardano frames the mathematical problem, he conducts his work in the realm of an ideal mathematical

89 I will provide my own Peircean interpretation of this process of ideation in section 4.1.1.

137 system. His correct calculations of odds in various games are mathematical, not empirical, estimations. Likewise, his mistakes in the calculation of various chances are mathematical, not empirical, mistakes. Thus, I do not find warrant to claim that Cardano checks that his mathematical estimation of chances accords with experience, or that he follows the “modern scientific ‘method’,” as David suggests (p. 59). Empirical observations and actual experience acted as enabling conditions for the framing of the mathematical problem, but Cardano’s incipient method of solution is thoroughly mathematical. Whether he may have purposefully turned to empirical observation or to scientific experimentation in order to confirm the mathematical solutions cannot be determined from the text.

When Galileo writes Sopra le Scoperte dei Dadi, sometime between 1613 and

1623, he deploys with ease, and without any apparent claim to originality, the method of estimating relative advantages on the basis of a complete enumeration of equipossible cases. At least for this Italian mathematician, both the idea of an enumeration of equipossible outcomes and the method of estimating the relative advantages or disadvantages of aleatory events on the basis of this enumeration are well-established.90

He begins the work by writing that the “fact that in a dice-game certain numbers are

more advantageous than others has a very obvious reason, i.e. that some are more easily

and more frequently made than others, which depends on their being able to be made up

with more variety of numbers” (Galileo in David 1962, p. 192). Galileo explains the

90 See Kendall (1970) and David (1962, ch. 7) for some historical speculation on how the idea and the method may have disseminated, at least among Italian mathematicians between Cardano and Galileo. I do not seek to recount or speculate on this history but simply to point out that Galileo already had a firm grasp of these elements of an incipient calculus of probability.

relative ease or frequency of aleatory events on the basis of the relative number of favorable possible outcomes in aleatory trials. Recall the main problem that Galileo discusses, probably at the request of the gambling Grand Duke of Tuscany. Suppose that three dice are thrown. Even though there is an equal number of ‘3-partitions’ that yield a total score of 9 or of 10, by “long observation” dice-players have established that in practice it is less advantageous to undertake to score a total of 9 than to undertake to score a total of 10. Why is this empirical outcome the case? I have already argued that here we have another case of a gambler searching for a mathematical explanation of observed empirical results.

Now I want to emphasize instead that Galileo’s mathematical treatment of the problem is to distinguish between partitions and permutations and to show that there are more permutations that produce a score of 10 than of 9. In beginning to solve the problem he writes: “[T]o achieve my end with the greatest clarity of which I am capable, I will begin by considering how, since a die has six faces, and when thrown it can equally well fall on any one of these, only 6 throws can be made with it, each different from all the others. But if altogether with the first die we throw a second, which also has six faces, we can make 36 throws, each different from all the others…And if we add a third die, since each one of its six faces can be combined with each one of the 36 combinations of the other two dice, we shall find that the combinations of three dice are 6 times 36, i.e. 216, each different from the others” (David 1962, p. 193). He thus establishes that there are two-hundred and sixteen outcomes that can occur “equally well” when three dice are cast.

Then he proceeds to analyze the composition of this fundamental probability set of equipossible outcomes. He observes that there are sixteen possible total scores, namely,

3, 4, 5,…, 18. The problem, then, is to find the number of ‘permutations’ that will produce each of these total scores. Galileo in essence reproduces the reasoning already present in De Vetula, declaring “these three fundamental points; first, that the triples, that is the sum of three-dice throws, which are made up of three equal numbers, can only be produced in one way; second, that the triples which are made up of two equal numbers and the third different, are produced in three ways; third, that those triples which are made up of three different numbers are produced in six ways” (David 1962, p. 194).

Based on this reasoning, Galileo constructs a table enumerating each of the possible total scores, every possible 3-partition that yields each of the possible total scores, and the number of permutations that produces each possible 3-partition. The table also sums up the total number of permutations that will yield each of the total scores and shows that the grand total of all of these sums is 216 (see David 1962, p. 194). Galileo uses the table to show that there are twenty-seven permutations that yield a total score of 10 while there are only twenty-five permutations that yield a total score of 9. This is why it is more advantageous to undertake to score a 10 than a 9, even though there are six 3-partitions or triples that yield both of these total scores.

It is important to note that there is no use yet of a probabilistic ratio in Galileo’s reasoning. He does not argue that it is more advantageous to undertake to score a 10 than a 9 because the probabilities of success are 27/216 and 25/216 respectively. However, his reasoning is clearly based on the analysis of the composition of what we call a fundamental probability set. Implicit in his reasoning is the assumption that it is enough to compare the number of outcomes favorable to one or the other score because there are

216 equipossible outcomes in total. That is, it is implicit that 216 is a common

denominator, so it is enough to compare the number of favorable outcomes—27 to 25 in the example. Therefore, even though Galileo’s reasoning is not probabilistic yet, it testifies to the progressive creation and mathematical exploration of a fundamental probability set; that is, of a hypothetical state of affairs that becomes the subject of progressive mathematical exploration that eventually results in a probability calculus and, more generally, in a mathematical theory of probability. Galileo creates a hypothetical model representative of a practical situation, makes some calculations on the basis of the hypothesis, and offers the theoretical model as an explanation of the outcomes observed in practice.
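
These counts are easy to verify by brute enumeration (a sketch of my own, not Galileo’s procedure, which worked instead from the table of triples just described):

    from itertools import product

    throws = list(product(range(1, 7), repeat=3))        # the 216 ordered throws ('permutations')
    partitions = {tuple(sorted(t)) for t in throws}      # the unordered triples ('3-partitions')
    print(len(throws), len(partitions))                  # prints 216 56

    favorable = {total: sum(1 for t in throws if sum(t) == total) for total in (9, 10)}
    print(favorable)                                     # prints {9: 25, 10: 27}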

Thirty or forty years after Galileo’s brief work, the Pascal-Fermat correspondence takes place. Fermat approaches the problem of points as a problem to be analyzed in terms of a fundamental probability set. The method of solution is to find the chances of winning of the various players involved in a game on the basis of the number of equipossible outcomes favorable to each of the players, and to distribute the stakes in proportion to these chances. Despite Pascal’s complaints that the ‘combinatorial method’ is unwieldy, Fermat’s successful treatment of the problem in terms of permutations of equipossible outcomes is noteworthy. The first known treatment of the problem of points is found in Fra Luca Pacioli’s 1494 Summa de Arithmetica, Geometria, Proportioni et

Proportionalità. His solution is incorrect. So are the different solutions offered by

Tartaglia in his 1556 Generale Trattato and by Peverone in his 1558 Due Brevi e Facili

Trattati, il Primo d’Arithmetica, l’Altro di Geometria.91 I have already noted Pascal’s

91 For a discussion of these treatments see Kendall 1970, p. 27-28.

own fumbling with the ‘combinatorial method’. Thus, Fermat’s reasoning is admirable in that it successfully analyzes the long-standing problem of points in an innovative way.

The idea of a complete enumeration of equipossible outcomes is well established, and he attacks the problem of points—a problem regarding the fair distribution of stakes based on the relative expectations of winning—by reducing it to a problem of estimating the number of equipossible outcomes favorable to each player. Fermat explicitly explains to

Pascal that the total number of equipossible outcomes provides a common denominator, and this is why the stakes ought to be divided according to the relation of the number of outcomes favorable to each player. As I have argued, Fermat works on a thoroughly theoretical mathematical problem. However, the empirical observations and actual experience of generations of inquirers enabled the progressive creation of the hypothetical state of affairs that frames his mathematical reasoning. Admittedly, his reasoning is not yet explicitly probabilistic; but it is part of the historical research that created the fundamental probability set and began to investigate problems on the basis of it, and so it is an intrinsic part of the progressive discovery of mathematical probability theory.

3.3 Peircean Considerations and Implications

From a Peircean perspective, it would be erroneous to sever empirical and theoretical problem-contexts from each other and to argue that the origins of mathematical probability are found in one or the other context. I have argued that the origins of the

main framing hypothesis of probability theory, namely the fundamental probability set, are found in empirical observations and experience with actual randomizing objects, especially dice, and that this framing hypothesis determines a mathematical world for theoretical study. This is in line with Peirce’s view that the “business of the mathematician lies with exact ideas, or hypotheses, which he first frames, upon the suggestion of some practical problem, then traces out their consequences, and ultimately generalizes” (MS 188, p. 2).92 But a further Peircean consideration arises here. Cornelis

de Waal interprets this passage to mean that for Peirce, “mathematics as a theoretical

science trails behind mathematics as a practical science” (de Waal 2005, p. 287). De

Waal considers a passage from Peirce’s manuscripts to the effect that the task of the

mathematician is “to imagine a state of things different from the real state of things, and

much simpler, yet clearly not differing from it enough to affect the practical answer to the

question proposed” (MS 165a, p. 67). According to de Waal, “[m]athematics thus

furnishes the scientist with a skeleton model that can be considered representative of the question being studied, and instead of studying the phenomenon with all its fortuitous detail, it suffices to study the model instead” (de Waal 2005, p. 288). I agree with this assessment to the extent that the Peircean position admits that actual experience and practical considerations act as enabling conditions for the creation of mathematical theories. Are they, however, necessary conditions? In my estimation, this two-fold question asks whether it is the case that for Peirce theoretical investigations in

92 Henceforth, all citations abbreviated as MS are references to the unpublished Harvard manuscripts of Charles Sanders Peirce. The manuscript number is according to the Robin catalogue (see Robin 1967). The page number, when provided, is according to the numbering assigned by the Institute for Studies in Pragmaticism, Texas Tech University.

mathematical practice must necessarily respond to practical considerations and whether this position is adequate to actual mathematical inquiry.

The early history of mathematical probability provides an example in which theoretical considerations seem to answer entirely to practical aims. According to Daston, mathematical probabilists developed very few new techniques until the end of the eighteenth century (Daston 1988, p. 5). As an example, we might note that they developed no mathematical techniques as novel, say, as those of analytic geometry or differential calculus. Thus, Daston argues, “[t]his very lack of new mathematical content bound mathematical probability more firmly to its applications. Since it belonged wholly to what we would now call applied mathematics, probability theory stood or fell upon its success in modeling the domain of phenomena that the classical interpretation had mapped out for it. Failure threatened not just this or that field of application, but the mathematical standing of the theory itself” (p. 5). Clearly I agree with Daston that practical applications enabled and actually motivated the creation and development of mathematical probability. I further agree that the early probabilists conceived of a hypothetical state of affairs in order to pose problems whose solutions would be applicable in practical contexts, such as those of legal aleatory contracting. From a

Peircean standpoint, however, I would like to suggest that early mathematical probability consisted in more than the sum of its applications. From its inception, the study of mathematical probability was the study of a hypothetical state of affairs. This is already evident in Fermat’s work. He is not only concerned to answer to specific practical problems, but he is interested in generalizing what is true about the mathematical world that he is investigating. He is interested in a general method qua mathematical and not

just qua practical tool. His interest in the correspondence is not only in application; as a

‘pure mathematician’ he is also interested in studying questions regarding a purely ideal system, a mathematical ‘diagram’ in the Peircean sense. I will substantiate more thoroughly my argument to this effect when we come to discuss in more detail Fermat’s generalization of his method in section 4.1.3. As we will also find in chapter 5, this generalizing tendency, this effort to develop general mathematical methods qua theoretical, is more explicitly present in the work of Huygens and Bernoulli subsequent to

Fermat’s.

I finally want to suggest that, for Peirce, the theoretical investigations of the mathematician need not respond to practical questions. While a practical or empirical problem-context is often an enabling condition for the creation of mathematical theories, it is not a necessary condition. Mathematicians are free to explore what is true of any hypothetical world of their own creation. This creation may respond to practical considerations and it often does in fact, as in the case of early mathematical probability.

However, the creation may also be entirely speculative and for purely theoretical purposes. For example, the investigations ensuing from changes to the fifth postulate of

Euclidean geometry responded to entirely theoretical considerations. It is true that, from

Peirce’s standpoint, ancient geometric theory, epitomized in Euclid’s system, likely originated as a theoretical activity that responded to the practice of measuring land (see de Waal 2005, p. 289). However, the Peircean position would also admit that the nineteenth-century studies that developed alternative geometrical systems were thoroughly theoretical activities concerned with a purely ideal system. In my estimation, in order to be an adequate account of actual mathematical inquiry, the Peircean position

must admit that purely theoretical and speculative questions also lead to the creation of mathematical theories. In the end, I think the viability of the Peircean view regarding the problem-context of mathematical discovery consists in its refusal to grant unrestricted and absolute primacy either to theory or to practice, regarding mathematics instead as the theoretical investigation of a purely hypothetical state of affairs that often, though not necessarily, represents an actual, practical situation under investigation.

Chapter 4

Epistemic Conditions for the Possibility of Mathematical Discovery

Let us turn to discuss the conditions for the possibility of mathematical discovery that are implied by Peirce’s logic of mathematical inquiry. Since I am proposing Peirce’s model as an open-ended systematic view of mathematical practice, any proposed

‘conditions for the possibility of discovery’ should not be ad hoc; they should rather reflect and indeed follow from the structure of that open-ended system. Accordingly, I will discuss these conditions as they relate to the Peircean view of mathematical inquiry.

In what follows, I will elaborate on ‘epistemic abilities’—which consist in the powers of imagination, concentration, and generalization—and the ‘community of inquiry’—which comprehends the function of ‘language’ or ‘system of representation’, ‘mathematical background knowledge’, and ‘criticism’—as necessary conditions for the possibility of mathematical discovery.

4.1 Epistemic Abilities

To begin, then, any proposed epistemic conditions should follow from the irreducible elements of quality, relation, and generality—or firstness, secondness, and thirdness—that are intrinsic to the phenomenon under study; in this case, a hypothetical

state of affairs.93 More specifically, I submit that the necessary epistemic conditions for

the possibility of mathematical discovery ought to be those abilities required by the

mathematician in order to detect and investigate with precision the qualitative, relational,

and general aspects of a mathematical hypothesis. And this is just what we find in

Peirce’s own description of the intellectual qualities necessary for mathematical

reasoning. Thomas S. Fiske reports that at a November 24, 1894 meeting of the American

Mathematical Society, “in an eloquent oration on the nature of mathematics, C. S. Peirce

proclaimed that the intellectual powers essential to the mathematician are ‘Concentration,

imagination, and generalization’” (quoted in Archibald 1938, p. 7).94 In his writings on

mathematics, Peirce often emphasizes the abilities of imagination, concentration, and generalization that are necessary for mathematical reasoning (see, for instance, CP 2.81 and 4.611). In my estimation, the powers of imagination, concentration, and generalization correspond to the necessary abilities to (i) create a mathematical ‘icon’—a hypothetical state of affairs that is of interest to the inquirer qua mathematician for its own intrinsic character; (ii) discriminate between mathematically essential and superfluous relations in the determination of the icon and focus the attention on the essential ones; and (iii) generalize on the basis of the characters and relations embodied

93 In strict Peircean terms, a ‘framing mathematical hypothesis’ is of a thoroughly general character and, therefore, strictly a ‘third’. Accordingly, henceforth when I write of the elements of quality and relation, or of firstness and secondness, in a mathematical hypothesis, I am referring to qualitative and relational elements intrinsic to this general hypothesis. In Peircean terms, these ‘first’ and ‘second’ elements are really aspects of a ‘third’ degenerate in the ‘first’ and ‘second’ orders. 94 This passage was called to my attention by de Waal 2005.

in the icon. Let me elucidate each of these intellectual powers in detail in the specific context of the creation and development of early probability theory.95

4.1.1 Imagination

The first intellectual ability required in mathematical research is the imagination.

The imagination consists in “the power of distinctly picturing to ourselves intricate configurations” (MS 252). That is, it consists in the ability to create original mathematical diagrams in order to represent an innovative hypothetical world, and to do

this ‘distinctly’ in the epistemological sense of being able to determine its properties with

exactitude. The imagination is the primary necessary epistemic condition for the

possibility of innovative mathematical reasoning because without its creative work the

inquirer would have no world to explore, no determinate hypothetical state of affairs to

investigate with the rigor of necessary reasoning. The faculties of concentration and generalization would have no subject to investigate, no mathematical matter to

experiment upon and observe, if there were no imagined mathematical hypotheses.

95 A thorough discussion on the nature of the mind and of these intellectual ‘powers’ according to Peirce is beyond the scope of my present concerns. However, I should note that, in my opinion, these ‘powers’ are not meant to be any abstract entities somehow “contained” in the mind. They are rather ‘abilities’ that are actualized in the performance of certain mental actions. They are instinctive and innate Peircean ‘habits’ that nonetheless must be cultivated and developed. For example, the mathematical ‘imagination’ is an ability that consists in being able to actually picture or imagine a mathematical ‘diagram’; that is, to create a ‘sign’ representing a mathematical idea where the idea is embodied in the sign. There is no abstract entity called the “imagination” apart from a habit actualized in the act of imagining. Likewise, there is no abstract power of concentration independent of the actual act of concentrating nor is there an abstract faculty of generalization separate from the act of generalizing. In what follows, I use the terms ‘powers’, ‘faculties’, and ‘abilities’ in the sense of these skillful ‘habits’ for performing a specific kind of action.

Imagination is the key to originality, in mathematics as in all scientific and philosophical reasoning, and so it is the primary source of breakthrough discovery.

Peirce in fact warns against those who would underestimate the importance of original imagination in mathematics, especially those philosophers and mathematicians who vainly glorify the skill of necessary demonstrative reasoning as if it were the highest capacity for mathematical reasoning, without realizing that demonstrative skill depends on the imagination: “The mistake of…all who think that necessary reasoning leaves no room for originality—it is hardly credible however that there is anybody who does not know that mathematics calls for the profoundest invention, the most athletic imagination, and for a power of generalization in comparison to whose everyday performances the most vaunted performances of metaphysical, biological, and cosmological philosophers in this line seem simply puny—their error, the key of the paradox which they overlook, is that originality is not an attribute of the matter of life, present in the whole only so far as it is present in the smallest parts, but is an affair of form, of the way in which parts none of which possess it are joined together” (CP 4.661). The mathematical imagination informs an original world, provides it with a distinct and determinate structure. Again, in this respect the mathematician is like the artist—her creative achievement consists in literally forming a whole that is not simply an aggregation of parts. The difference between the two endeavors is that the artist strives for beauty, which is a matter of the intrinsic quality of the whole and its effective and affective relation to an observer, while the mathematician strives for truth, which is a matter of the determinacy and exactitude of the distinctive characters and relations in the whole. The differences notwithstanding, the highest creations of the mathematical imagination are hypothetical worlds so rich in

possibility that their properties may actually surprise the inquirer: “The pure mathematician deals exclusively with hypotheses. Whether or not there is any corresponding real thing, he does not care. His hypotheses are creatures of his own imagination; but he discovers in them relations which surprise him sometimes” (CP 5.567). Thus, it is due to the rich originality of the imagination that mathematics is a science of discovery—even though hypothetical mathematical worlds are created, they are not closed systems where everything is determined by rigid ‘self-evident’ axioms, but they are rather open-ended creations that may be subject to surprising discoveries.

Let us see this in the case of early mathematical probability. I submit that the creative imagination of generations of inquirers was the main source of the progressive creation of the ‘fundamental probability set’. Perhaps it is no trivial coincidence that the outcomes of games of chance were the matter of religious, oracular, and esoteric speculations, and indeed the subjects of musings by the author of De Vetula, Chaucer,

Dante, and other poets.96 Be that as it may, the work of the mathematical imagination

consisted in creating a hypothetical world so determinate and distinct as to be able to

investigate what would be necessarily and demonstrably true about it. Generations of

imaginative “would-be” prophets and poets, collectively represented by the author of De

Vetula, created a state of affairs in which all the distinct possible outcomes of imagined

dice-throws were determined. The initial difference between Galileo and the Grand Duke

of Tuscany is that Galileo is able to imagine the fundamental probability set more

distinctly: he distinguishes not only among partitions, but also among permutations; that

96 For details on the enumeration of chance outcomes as they appear in poetry, see Kendall 1970.

is, he imagines a situation in which the order of the thrown dice-scores matters. In their epistolary exchange Pascal and Fermat often invoke the imagination. In discussing a variant of the problem of points between two players, Pascal writes that “one must imagine that they play with a die of two faces” so that one of the faces favors one of the players (David 1962, p. 240). Moreover, one of Pascal’s principal charges is that

Fermat’s proposed “hypothetical conditions” of play in a game of points between three players do not correspond to the “actual conditions” of play (David 1962, p. 244-245).

But, as I have already argued in section 3.1.2, it is Pascal who fails to grasp Fermat’s proposed “hypothetical conditions” and this failure is largely a failure of imagination:

Pascal fails to picture the hypothetical game in the way that Fermat does. And this failure of the imagination is actually present in Pascal’s incorrect tables describing the possible outcomes of the game when compared to Fermat’s tables.97 These tables are

mathematical ‘diagrams’—they are at once ‘icons’ representing a hypothetical state of

affairs and ‘symbols’ representing an actual situation; in short, they are ‘symbolic icons’.98 Pascal’s symbolic icon is an incorrect representation of the actual situation that

he is trying to analyze, while Fermat’s representation is correct. In these actual, tangible

‘diagrams’ we see the respective failure and success of their imagination.
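To make the difference between distinguishing partitions and distinguishing permutations concrete, the following minimal computational sketch (in Python; it is my illustration, not part of the historical material under discussion) enumerates the fundamental probability set for three dice. The pair of sums traditionally associated with the Grand Duke's question to Galileo, 9 and 10, can each be produced by six partitions, yet by 25 and 27 ordered outcomes respectively; only the latter, permutation-sensitive count represents the hypothetical state of affairs distinctly enough to settle the question.

    from itertools import product
    from collections import Counter

    # The fundamental probability set for three dice, with the order of scores retained.
    ordered = list(product(range(1, 7), repeat=3))          # 6^3 = 216 permutations
    ordered_sums = Counter(sum(t) for t in ordered)

    # The same throws with order ignored: only the partitions (sets of faces) remain.
    partitions = {tuple(sorted(t)) for t in ordered}
    partition_sums = Counter(sum(p) for p in partitions)

    for target in (9, 10):
        print(target, partition_sums[target], ordered_sums[target])
    # prints: 9 6 25  and  10 6 27 -- equal partition counts, unequal permutation counts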

Subsequent developments in the history of mathematical probability further illustrate the powerful role that the imagination played in the creation of the ‘fundamental

97 The question might arise whether Pascal’s failure is actually one of conceptualization and not of imagination. The Peircean response is that since reasoning is a semiotic process, to conceptualize is literally to signify, that is, to create or produce a ‘sign’, such as a mathematical ‘diagram’, and it is precisely the function of the imagination to create these ‘signs’. 98 This is one example of the types of mixed or ambiguous representations, at once iconic and symbolic, that Emily Grosholz has recently explored in alternative, but often compatible, ways. See Grosholz 2005.

probability set’. Let me refer briefly to one of these episodes, the 1703-1704 Leibniz-

Bernoulli correspondence.99 As we will see, after the Pascal-Fermat correspondence,

Huygens published the first general treatise of probability theory, De Ratiociniis in Aleae

Ludo (1657), in which, among other results, he found a general solution to the problem of

points independently of Fermat. After Huygens’s treatise appeared, Jacob Bernoulli took

an interest in probability theory and began mathematical research that culminated with

his Ars Conjectandi, published posthumously in 1713. In the Ars Conjectandi, Bernoulli

sought to develop a method to study social and natural phenomena by way of

mathematical probability. We will have occasion to discuss the details of his proposed

mathematical method later. For now, it is important to note that as early as 1703

Bernoulli announced to Leibniz that he had discovered a method to estimate the

probabilities of natural events, such as the probability of atmospheric phenomena like

storms and the probability of death due to various diseases. Leibniz immediately objected

to Bernoulli’s claims. One of his main reasons was that, while in games of chance it is

possible to enumerate all the equipossible outcomes, such a total enumeration is not

possible in the case of natural phenomena like storms and diseases, because these natural

events are “happenings which depend on an infinite number of cases” (Leibniz in

Bernoulli 1966, p. 72). Bernoulli responds with an imaginative analogy. The human body

can be compared to an urn. Just as an urn might contain white and black pebbles in a

certain ratio, so the human body might contain healthy and diseased parts in a certain

ratio. And just as the probability of drawing black pebbles in the long run is the ratio of

99 I will discuss this correspondence in depth in chapter 6. For the original version, see Leibniz 1855, p. 75-89. For an English translation of the correspondence, see Bernoulli 1966, p. 67-78.

black pebbles to all pebbles, so the probability of death from disease in the long run is the ratio of diseased parts of various kinds to all parts of the body. That the parts of the body are infinite is not problematic mathematically; the mathematical notion of ‘limit’ solves this problem because “the ratio of one infinity to another is still a finite number”

(Bernoulli 1966, p. 76; see also Hacking 1975, p. 163-164). We may assume that

Bernoulli would offer a similar analogy in the case of the atmosphere—it is like an urn containing stormy and stable parts. The merits of the analogy from a contemporary perspective are beside the point; Bernoulli’s imaginative leap is significant for his time: he is envisioning ways in which the calculus of probabilities, much more developed now with Huygens’s and his own work, is not only applicable to games of chance and to aleatory legal contracting; it is applicable to the study of natural and social phenomena.

Daston underscores the imaginative aspect of Bernoulli’s analogy when she writes that

“Bernoulli’s use of the now familiar urn example to model the relation between underlying causes [i.e. a priori probabilities] and observed effects [i.e. statistical frequencies] was perhaps the first quantitative attempt to construe a chance mechanism metaphorically. Heretofore, probabilists had treated lotteries, dice games, and coin tosses at the immediate level of practical problems, not as analogues for more general processes in nature” (Daston 1988, p. 237-238). Therefore, Bernoulli’s creative analogy, which

Daston even calls a metaphor, leads to one of the most significant advancements in the early history of mathematical probability—the first significant and serious attempt to describe “uncertain” but regular natural and social phenomena by way of mathematical probability.
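The mathematical point behind Bernoulli's reply to Leibniz can be illustrated with a minimal sketch (mine, in Python; the particular numbers are invented purely for illustration): however finely the 'urn' is subdivided, the ratio of one kind of part to all parts remains a finite number, and the long-run frequency of draws tends toward that ratio.

    from fractions import Fraction
    import random

    # Dividing the "body" ever more finely leaves the ratio of diseased parts to all
    # parts a finite number (here 2/5, a figure chosen arbitrarily for illustration).
    for scale in (1, 10, 1000, 10**6):
        print(scale, Fraction(2 * scale, 5 * scale))        # always 2/5

    # The long-run frequency of draws from such an urn approaches the same ratio.
    random.seed(0)
    draws = sum(random.random() < 0.4 for _ in range(20000))
    print(draws / 20000)                                    # close to 0.4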

Admittedly, someone might object, even on allegedly Peircean grounds, that

Bernoulli’s imaginative analogy is not, strictly speaking, mathematical reasoning. The objector might observe that Bernoulli is not recreating the hypothetical state of affairs that probabilists study but rather reconceiving its range of application. In other words, in this case imaginative work is crucial to Bernoulli qua applied mathematical scientist but not to Bernoulli qua pure mathematician. In this regard, I think that Peirce’s occasional tendency to divide strictly the work of the pure mathematician and of the applied mathematical scientist can be puzzling (given his own view of inquiry as an activity) and that this tendency encourages the above objection.100 But I think the objection ultimately

ought to be rejected on Peircean grounds. Scientific research, including mathematics, is

first and foremost an activity. Now, any scientific inquirer goes through several phases of

research work that require different types of actions. But these actions are not discrete

and isolated. They are rather continuous with each other and flow into one another in the

context of actually pursuing research. This is precisely Bernoulli’s situation. He is both

pure mathematician and mathematical scientist. He studies a hypothetical state of affairs

and also reasons about the range of application of the hypothesis to actual states of

affairs. And in both roles he must do imaginative work. Thus, it would be incongruous to

claim that as pure mathematician Bernoulli works within a given and rigid context while

as mathematical scientist he is imaginative so as to extend the range of application of his

pure mathematical findings. I think it is far more adequate to Bernoulli’s actual practice

100 See, for example, CP 5.567, in which Peirce argues that projective geometry is not pure mathematics because it does not study pure hypotheses, but hypotheses that somehow retain some definite meaning intended to correspond to an actual situation.

to claim that in re-conceiving the scope of application of mathematical calculations on the fundamental probability set, he is also recreating the hypothetical state of affairs that mathematical probabilists study. Most pointedly, the fundamental probability set can now consist of infinitely many equipossible elements and can represent a greater variety of actual situations.

Now, returning to the role of the imagination in mathematical reasoning according to Peirce, it is important to emphasize that actual experience is an important catalyst for the work of the imagination. ‘Experience’ in this Peircean sense means the actual, forceful ‘reaction’ of a mind with an outward reality; ‘experience’ is the reactive secondness involved in, say, actual sense perception, where an outward reality forces its presence upon the mind.101 Cornelis de Waal argues that “phenomena encountered in the

sciences are an important source for mathematical notions and theories. More generally,

we can say that, for Peirce, it is experience that furnishes mathematicians with their

ideas” (de Waal 2005, p. 289). He cites the following passage by Peirce: “The results of

experience have to be simplified, generalized, and severed from fact so as to be perfect

ideas before they are suited to mathematical use. They have, in short, to be adapted to the

powers of mathematics and the mathematician. It is only the mathematician who knows

what these powers are; and consequently the framing of the mathematical hypotheses

must be performed by the mathematician” (MS 17; emphasis mine). On the basis of the

foregoing discussion, we can interpret this passage to mean that the imagination frames

101 Throughout this discussion, I will use the term ‘experience’ to mean this outward clash with actual reality in the Peircean sense. The issue is whether actual experience is a necessary or an enabling condition for the function of the imagination.

the facts of experience in terms of mathematical hypotheses, so that the mathematician can apply his powers of concentration and generalization to the study of those hypotheses, which ultimately are “pure mental creation[s] involving no assertion about any thing but the mathematician’s idea, —his dream, as it might be called, except for its precision, clearness, and consistency” (MS 17). In the case of early mathematical probability, actual experience with randomizing apparatus, such as dice, catalyzes the earliest formulations of the fundamental probability set like that found in De Vetula.

Actual experience with tali and dice eventually led to the imaginative idealization of the perfect die, central to the incipient mathematical explorations of Cardano, Pascal, and

Fermat.102

However, I should emphasize in closing that while experience catalyzes the work

of the imagination, it is not the case that all imaginative mathematical conceptions have

their origin in experience.103 In other words, while experience enables the work of the

imagination, it is not always necessary for such creative activity. Peirce does

appropriately acknowledge that not all mathematical conceptions are derived from

physical experience. For example, conceiving the idea of imaginary quantity and

“imagining non-Euclidian measurement” both take place in the thoroughly theoretical

context of a hypothetical state of affairs, and mathematicians exercise “immense genius”

in creating them in order to solve pure mathematical problems (CP 4.238; emphasis

102 For another example of how experience catalyzes the work of the imagination in mathematical reasoning, see de Waal 2005. The author cites MS 94, in which Peirce argues that the mathematical conceptions of ‘surface’, ‘line’, ‘point’, ‘right line’, and ‘plane’ are derived from experience. 103 This is a claim both about Peirce’s conception of mathematics and about mathematics in general. In my estimation, some mathematical hypotheses are purely imaginative, without ‘actual’ experiential sources, and Peirce allows for this.

mine). Moreover, I have already argued in chapter 2 that, according to Peirce, the traditional distinction between ‘pure’ and ‘applied’ mathematics is simply a distinction concerning whether or not the mathematician has to appeal to a real outward experience in order to derive conceptions and advance her reasoning. The ‘applied’ branches of mathematics must make such appeals to experience, while the ‘pure’ branches need not. In the end, however, the mathematician must leave outward experience behind and concern herself exclusively with pure hypotheses. In this sense, all mathematics is ultimately pure (see

NEM IV, p. xv). Therefore, just as an empirical problem-context is an enabling but not a necessary condition for the possibility of mathematical discovery, so also experience is an enabling but not a necessary condition for the creative work of the mathematical imagination.104

This may be puzzling, since Peirce’s conception of mathematics as a reasoning

activity may seem to imply that all mathematical ideas must be derived from experience.

But this is only puzzling when “practice” and “theory” are fallaciously severed from each

other, so that all activity is associated only with practice. Imaginative theoretical

reasoning is also an activity and a practice. Not only ‘actual’ outward ‘experience’ in

empirical contexts, but also reasoning activity regarding hypothetical states of affairs in

104 I emphasize again that in this context I am discussing ‘actual experience’ in the Peircean sense. Considering a broader sense of the term ‘experience’ beyond this immediate context, however, we find that there are ‘formal’ kinds of ‘experience’—as opposed to ‘material’ or ‘physical’ kinds—that enrich the reasoning ability of the mathematician. Solving mathematical problems is a ‘formal’ kind of experience; in the absence of ‘actual’ experience, it is often experience of represented mathematical forms—notations, figures and, in general, ‘diagrams’ in the Peircean sense—that enrich the imaginative work of the mathematician. When we admit this ‘formal’ sense of experience, then we might claim that experience in general, whether ‘physical’ or ‘formal’, is a necessary condition for the work of the mathematical imagination. This claim would be compatible with the Peircean view of mathematical inquiry. For a discussion of ‘formal’ experience in mathematics, see Grosholz 2000b.

theoretical contexts stimulates the creative imagination to conceive of alternative worlds, to re-create and re-form hypothetical states of things in order to investigate them.

Accordingly, I think that those passages in which Peirce appears to claim that actual experience is necessary for mathematical ideation may at best be interpreted as historical claims. For instance, geometry and arithmetic historically arise from the activities of measuring land and counting. But once the mathematical states of affairs the geometers and arithmeticians study are created, these hypotheses take on a theoretical life of their own, including the creation of subsequent branches of mathematics. The drawback to this interpretation is that, as we should recall from chapter 2, Peirce even claims that the ideal system of arithmetic, unlike that of geometry, is not conceived on the basis of outward experience but is purely theoretical, the pure work of the imagination, from its inception

(see NEM IV, p. xv). In the end, the first necessary epistemic condition for the possibility of mathematical innovation is the mathematical imagination, sometimes enabled by actual experience.

4.1.2 Concentration

The second power necessary for mathematical reasoning is concentration, that is,

“the ability to take up a problem, bring it to a convenient shape for study, make out the gist of it, and ascertain without mistake just what it does and does not involve” (MS 252).

Concentration, then, is the capacity to seize upon a problem, to determine what is mathematically essential about it as it is represented in a ‘diagram’, and to hold it clearly

in view for precise and sustained analysis. Peirce links concentration to an ability for detailed and discriminating observation as follows: “[M]athematics requires a certain vigor of thought, the power of concentration of attention, so as to hold before the mind a highly complex image, and keep it steady enough to be observed” (CP 2.81).

Concentration, then, is the condition for the possibility of carrying out the detailed observation of specially constructed ‘diagrams’ that mathematical reasoning demands.

Concentration is a necessary condition for mathematical inquiry because mathematics is for Peirce an observational science, though the observation involved is of a special kind, namely, the observation of imaginatively created diagrams: Mathematics “does not undertake to ascertain any matter of fact whatever, but merely posits hypotheses, and traces out their consequences. It is observational, in so far as it makes constructions in the imagination according to abstract precepts, and then observes these imaginary objects, finding in them relations of parts not specified in the precept of construction. This is truly observation, yet certainly in a very peculiar sense; and no other kind of observation would at all answer the purpose of mathematics” (CP 1.240). We may think of this peculiar kind of observation as a type of judicious inspection of the consequences of variations upon a mathematical ‘diagram’. Now, Peirce comes close to describing concentration as a kind of mathematical excellence or virtue that can be cultivated by training, but only to an extent. Concentration consists in a ‘vigor of thought’ and “though training can do wonders in a short time in enhancing this vigor, still it will not make a powerful thinker out of a naturally feeble mind, or one that has been utterly debilitated by intellectual sloth” (CP 2.81). Concentration, therefore, is a kind of natural ability that

needs to be exercised in order to develop the strength and vitality that mathematical reasoning demands.

In the case of the discovery of probability theory, the necessity of the power of concentration is best exemplified by the instances of erroneous reasoning that failed due to a lack of concentration. For brevity of exposition, I will limit my discussion to one of the most salient examples. It comes from G. F. Peverone’s attempt, in his 1558 treatise on arithmetic and geometry, to solve a version of the problem of points, after failed attempts by Fra Luca Pacioli and by Tartaglia. Peverone considers this problem: Suppose two players, A and B, are playing for ten points. A has won seven points and B has won nine points. How should the stakes be divided from this point forward (if B stakes 2 crowns)?

(see Kendall 1970, p. 27 and David 1962, p. 38). Kendall renders Peverone’s argument in this way: “A should put 12 crowns and B 2 crowns [or, equivalently, the stake should be divided in the proportion 1:6]. For if A, like B, had one game to go each would put two crowns [or divide the stakes in equal proportions]. If A had two games to go against

B’s one, he should put 6 crowns against B’s two, because, by winning two games he would have won four crowns, but with the risk of losing the second after the first; and with three games to go he should put 12 crowns because the difficulty and risk are doubled” (Kendall 1970, p. 27).105 Kendall thinks that “this must be one of the nearest

105 In this quotation I have corrected a mistake in Kendall’s text. Because Kendall renders the problem as “how much each player should stake,” instead of rendering as “how much each player should collect from the already existing stakes,” he gets tangled up and unwittingly writes that “A should put 2 crowns and B 12 crowns.” For an alternative rendering which is more in line with the usual posing of the game of points, see David 1962, p. 38-39. I have followed Kendall in order to discuss his interpretation of Peverone’s argument. The origin of the confusion is in Peverone’s tortuous Italian text, as quoted by Kendall: “Se giuocassero a 1 giuoco, bastarebbero scutti 2; et a due giuochi 6, per che vincendo solo 2 giuochi guadagnarebbe scutti 4; ma questo sta con pericolo di perdere il secondo, vinto il primo: però deve

misses in mathematics” (p. 28). It is a “near miss” because Peverone argues correctly for the cases where A has one and two points left to win, but then forgets the rule of his own reasoning and errs for the case where A has three points left to win. Kendall reproduces what should have been Peverone’s argument, had he remained consistent, as follows. If B has one point left to win and is staking 2 crowns, then A should stake the following: (a) with one point left to win, 2 crowns; (b) with two points left to win, 2 + 4 = 6 crowns; and (c) with three points left, 2 + 4 + 8 = 14 crowns, not 12 (p. 28). Kendall concludes:

“Peverone was perfectly well acquainted with geometrical progressions and uses the word progressione in one exposition of his answer to this problem. Having got as far as the staking of 6 crowns by A with two games to go, if he had only stuck to his own rule and considered the conditional probabilities of gain more closely, he would have solved this simple case of the problem of points, in essence, nearly a century before Fermat and

Pascal” (p. 28).

I agree with Kendall in that this is a “near miss” and I would add that the reason for the “near miss” is that Peverone lost his ‘concentration’ in the Peircean sense. That is, the Italian mathematician lost imaginative sight of the mathematical ‘diagram’ under study. Let me recast this loss of ‘concentration’ in terms of the Fermatian diagrams of the fundamental probability set under scrutiny. Consider the first case, where A and B have one point left to win. The match then must be absolutely decided in one more game. If, as mathematicians, we envision this situation in terms of the tables suggested by Fermat’s

guadagnare scutti 6, et a 3 giuochi scutti 12, per che si indoppia la difficoltà e pericolo” (Kendall 1970, p. 27).

reasoning, then we see that there are only two equipossible outcomes, “a” or “b”, which make players A or B the winner respectively:

a   b
A   B

Therefore, the stakes ought to be divided in the equal proportions 1:1, that is, they each

stake 2 crowns. Now consider the second case, where A has two points left to win against

B’s single needed point. The match must be absolutely decided in two games. Thus, the

Fermatian diagram of the situation is the following:

a   a   b   b
a   b   a   b
A   B   B   B

There is only one equipossible outcome that favors player A against three outcomes that

favor player B. Therefore, the stakes ought to be divided in the proportion 1:3, that is, A

must stake 6 crowns against B’s 2 crowns. Now, my suggestion is that Peverone imagines

a diagram that, for the purposes of mathematical analysis of the problem of points, is

equivalent to these Fermatian diagrams. This is evident in Peverone’s statement that A

risks “losing the second [game] after winning the first.” Based on a correct diagrammatic

analysis, Peverone provides the correct solution to the first two cases. But then he loses

his ‘concentration’. For the case where A has three points left to win, he does not modify

the diagram while keeping in view the principles he has already deployed; instead he

forgets the diagram and rushes to answer that A should stake 12 crowns against B’s 2

crowns because “the difficulty and risk are doubled” (p. 27).106 Had he kept his

‘concentration’ in carrying out his reasoning, he would have constructed a diagram equivalent to the following one. In this last case, the match must be absolutely decided in three games. The equipossible outcomes are represented by the following diagram:

a   a   a   a   b   b   b   b
a   a   b   b   a   a   b   b
a   b   a   b   a   b   a   b
A   B   B   B   B   B   B   B

Therefore, the stakes ought to be divided in the proportion 1:7, or equivalently, A should

stake 14 crowns against B’s 2 crowns. In sum, I think that Peverone’s failure consists in

not being able to hold before his imaginative attention the diagrammatic representation of

the fundamental probability set of equipossible outcomes, even though he is able to

envision it for the first two simple cases.107
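For readers who wish to check the three Fermatian diagrams above mechanically, the following sketch (my own, in Python, and not anything Peverone or Fermat wrote) extends the match to the number of games in which it must be decided, enumerates every equipossible sequence, and counts the winners. It reproduces the divisions 1:1, 1:3, and 1:7, that is, stakes of 2, 6, and 14 crowns against B's 2 crowns in the rendering followed above.

    from itertools import product

    def fermat_counts(a_needs, b_needs):
        """Count the equipossible sequences of the decisive games won by A and by B."""
        games = a_needs + b_needs - 1              # the match is certainly over by then
        a_wins = b_wins = 0
        for seq in product("ab", repeat=games):
            na, nb = a_needs, b_needs
            for g in seq:                          # scan the sequence game by game
                na -= (g == "a")
                nb -= (g == "b")
                if na == 0:
                    a_wins += 1
                    break
                if nb == 0:
                    b_wins += 1
                    break
        return a_wins, b_wins

    for a_needs in (1, 2, 3):                      # A needs 1, 2, or 3 points; B needs 1
        a, b = fermat_counts(a_needs, 1)
        print(f"A needs {a_needs}: division {a}:{b}; A stakes {2 * b // a} against B's 2 crowns")
    # prints the divisions 1:1, 1:3, 1:7 and the stakes 2, 6, 14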

4.1.3 Generalization

The third necessary intellectual ability of the mathematician is the power of

generalization. This is the ability “to see that what seems at first a snarl of intricate

106 It may be that Peverone lost his ‘concentration’ for lack of appropriate notation to represent the present situation via an ‘icon’ and to describe a general procedure via an ‘iconic symbol’. Perhaps an adequate algebraical notation would have helped Peverone to keep his ‘concentration’. In that case, we would see an example of the role of mathematical language or system of representation as a condition for the possibility of mathematical inquiry. Even an icon akin to the Fermatian tables may have helped him. I will turn to discuss this condition below. 107 David has lingering doubts regarding whether Peverone is really trying to enumerate the probability set, though she does not substantiate the doubts (1962, p. 39). If she is correct, then Peverone’s failure is far more fundamental than a failure of concentration—it is rather a failure of the imagination to diagram the situation under analysis.

circumstances is but a fragment of a harmonious and comprehensible whole” (MS 252).

By this power, the mathematician “sees” that particular, and seemingly peculiar, relations among elements of a mathematical diagram really belong within a comprehensive, intelligible pattern of general relations. It is an ability, then, to “see”—in the sense of

“grasping with the mind’s eye”—that a particular mathematical diagram embodies a web of general relations. This “seeing” however does not have the character of passive perception; I think Peirce rather means by it the discerning observation of a series of complex mathematical representations of a given problem, so as to discover the mathematically essential form in all of them.

Peirce writes in such terms when providing a definition of the logical process of

‘generalization’, a definition meant to apply not only to mathematics but to all generalizing reasoning: “Generalization in its strict sense, means the discovery, by reflection upon a number of cases, of a general description applicable to them all. This is the kind of thought movement which I have elsewhere called formal hypothesis, or reasoning from definition to definitum” (CP 2.422). The reasoning process of

‘generalization’ is a process of discovery of a general description that applies to a whole series of examined cases; it is to “see” or to discover that those examined cases are particular instantiations of a ‘general’—that is, of a precise, distinct description of a harmonious state of affairs, actual or hypothetical, that comprehends all of the observed cases. The product of the reasoning process of ‘generalization’ is “an idea derived from the comparison of a number of objects;” however, this resulting idea “is not an extension of an idea already had, but, on the contrary, an increase of definiteness of the conceptions

I apply to known things” (CP 2.422; emphasis mine). Here, Peirce clarifies what

‘generalization’ is by contrast to ‘extension’. The process of ‘extension’ consists in “the discovery (by increase of information) that a predicate applies—mutatis mutandis—to subjects to which it had not occurred to us to apply it” (CP 2.422). Thus, in a process of extension the inquirer already knows the general description and discovers that it applies to new cases; extension is the discovery of new particular instances that fall under a

‘general’. In a process of generalization, the inquirer rather discovers the general description that covers what is essential about all the examined cases. For these reasons,

Peirce calls generalization an increase in the “depth” and extension an increase in the

“breadth” of our concepts (CP 2.422).

It is important to emphasize that Peirce describes both ‘generalization’ and

‘extension’ as ‘operations of thought’. I think that these operations or actions may be classified as species of the three kinds of inference. From the foregoing descriptions, I claim from a Peircean standpoint (i) that extension is a species of induction—the inference that observed instances fall under a known general rule, class, or law—and (ii) that generalization is a species of ‘creative abduction’—the ampliative inference from observed instances to a new hypothetical general rule. If I am correct, inductive extension increases “breadth” of our conceptual schemes while abductive generalization increases their “depth.”

It is also important to note that generalization and extension are not dichotomous, as there may also be operations of thought that are ‘generalizing extensions’. These are a kind of extension “in which the change of the predicate, in order to make it applicable to a new class of subjects, is so far from obvious, that it is the part of the mental process which chiefly attracts our notice” (CP 2.422). Consider Peirce’s main mathematical

example: “[W]hat is usually called Fermat's theorem is that if {r} be a prime number, and a be any number not divisible by {r}, then a^({r}-1) leaves a remainder of 1 when divided by

{r}. Now, what is called the generalized theorem of Fermat is that if {k} is any integer number, and φ{k} its totient, or the number of numbers as small as {k} and prime to it,

and if a be a number prime to {k}, then a^(φ{k}) leaves a remainder 1 when divided by {k}.

Instead of calling such process a Generalization, it would be far better to call it a

generalizing extension” (CP 2.422). It is a generalizing extension because it is the

discovery that a general mathematical rule applies to more cases than was previously

known. The upshot of the foregoing distinctions is both to understand precisely what

Peirce means by ‘generalization’ and to note that it is closely related to other operations

of thought involved in reasoning in general and mathematical reasoning in particular. But

the fundamental power required of the mathematician is not the important ability to

extend a notion but the necessary ability to generalize, to add “depth” to our

mathematical conceptions by discovering what is essential in a variety of ‘diagrams’, namely, their common mathematical form.
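Since the two theorems in the quotation from CP 2.422 are easy to check numerically, here is a minimal sketch (mine, in Python, offered only as a modern illustration) that verifies both statements for a few small values; the totient is computed directly from its definition as the count of numbers no greater than k and prime to it.

    from math import gcd

    def totient(k):
        """Euler's totient: the number of integers from 1 to k that are prime to k."""
        return sum(1 for n in range(1, k + 1) if gcd(n, k) == 1)

    # Fermat's theorem: for prime r and a not divisible by r, a^(r-1) leaves remainder 1 mod r.
    for r in (5, 7, 13):
        print(r, all(pow(a, r - 1, r) == 1 for a in range(1, r)))

    # Generalized theorem: for any k > 1 and a prime to k, a^totient(k) leaves remainder 1 mod k.
    for k in (9, 10, 12):
        coprime = [a for a in range(1, k) if gcd(a, k) == 1]
        print(k, totient(k), all(pow(a, totient(k), k) == 1 for a in coprime))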

In the context of discussing the lessons that the history of science has for the field of logic, Peirce affirms that the “most important operation of the mind is that of generalization” (CP 1.82). Interestingly, all the examples that he provides in the context of this claim are mathematical. He writes: “If we look at any earlier work upon

mathematics as compared with a later one upon the same subject, that which most

astonishes us is to see the difficulty men had in first seizing upon general conceptions

which after we become a little familiarized to them are quite matters of course” (CP

1.82). A good example comes from ancient Egyptian arithmetic. According to Peirce,

Egyptian mathematicians were able to think of adding two fractions, say one-fifth plus one-fifth, but did not conceive of the result as two-fifths but rather as the sum of one-third plus one-fifteenth (the two expressions are indeed equal: 1/3 + 1/15 = 5/15 + 1/15 = 6/15 = 2/5). The failure to generalize consists in being unable to “conceive of a sum of fractions unless their denominators were different”; that is, I surmise, in being unable to conceive of the sum of fractions in terms of a common denominator (CP 1.82).

After discussing this and other examples, Peirce concludes: “That which the early mathematicians failed to see in all these cases was that some feature which they were accustomed to insert into their theorems was quite irrelevant and could perfectly well be omitted without affecting in the slightest degree the cogency of any step of the demonstrations” (CP 1.82). This formulation of the conclusion tends to blend the work of concentration and of generalization, but it does re-emphasize that generalization consists in discovering what is mathematically essential about a variety of cases. From our previous discussion, it is clear that the power of concentration is also invoked in order to hold in clear view the diagrams that contain mathematically essential and superfluous characteristics and relations. This is just a reminder that although Peirce distinguishes between the abilities of imagination, concentration, and generalization, they are all intrinsically related in actual mathematical reasoning.

Let us turn to discuss an important case of generalization in the early history of mathematical probability. The case comes specifically from the Pascal-Fermat correspondence, and it consists in Fermat’s success and Pascal’s failure to ‘generalize’

Fermat’s combinatorial solution to the problem of points. Recall Pascal’s claim, in his letter dated August 24, 1654, that Fermat’s ‘combinatorial method’ of solution works for games involving two players but not for games involving three or more players. The

specific problem under study is how to divide the stakes of a game when player 1 has one point left to win, while players 2 and 3 both need two points to win. Fermat’s

‘combinatorial method’ yields the correct solution—the stakes ought to be divided in the proportion 17:5:5. But Pascal both makes a mistake in deploying the combinatorial method and provides an incorrect solution using his own so-called “general” method. His main objection to Fermat’s combinatorial method is that it creates a “hypothetical” situation in which the game is extended for three more points, corresponding to the number of games in which the match must absolutely be decided, but that this

“hypothetical” situation does not correspond to the “actual” situation of play because the match might end before three games (see David 1962, p. 239-246). Recall also that on

September 24, 1654, Fermat replies with the correct solution using his combinatorial method and claims that “the consequence…of this fiction [or hypothesis] of lengthening the match to a particular number of games is that it serves only to simplify the rules and

(in my opinion) to make all the chances equal or, to state it more intelligibly, to reduce all the fractions to the same denominator” (David 1962, p. 248). With this in mind, let us now consider how Fermat shows to Pascal that his combinatorial method of solution is general. I submit that he does this by creating alternative mathematical diagrams representing the problem and showing that the essential elements of the solution remain the same.

On September 25, Fermat suggests the following to Pascal. Consider still the same game between three players in which player 1 needs one point and players 2 and 3 need two points. Instead of lengthening the hypothetical match to three more games, extend it to four. Then, there are not just 3^3 = 27 possible ordered combinations of

outcomes but 3^4 = 81 possible ordered combinations, or permutations, of outcomes. Still,

the solution to the problem will be to count how many possible outcomes favor each of

the players. You will find that there are 51 combinations that favor player 1, while there

are 15 combinations that favor each of the other two players. Therefore, the stakes ought

to be divided in the proportion 51:15:15, but this is equivalent to 17:5:5, our previous

solution. Now extend the hypothetical match to five more games, or to any number

greater than three. The combinatorial solution will always be 17:5:5. Thus, Fermat

concludes, “I am right in saying that the combination a c c is favourable only to the first

man and not to the third, and c c a is favourable to the third and not to the first, [which

are the cases that Pascal analyzes incorrectly with his own method], and therefore my

combinatorial rule is the same for three players as for two and, in general, for any number

of players” (David 1962, p. 248). In my Peircean interpretation, Fermat is creating

alternative mathematical diagrams of the problem using always the same combinatorial

precept and showing that under any of those alternative representations, the solution to

the problem remains the same. Therefore, the combinatorial method is general. We might

envision the alternative diagrams to be tables similar to those we have already discussed.

In the case when the match is extended to four more games, the diagram is a table with

81 columns representing each of the equipossible outcomes of extending the match four

more games and indicating the match winner for each outcome. All the cases make up

what amounts to the following fundamental probability set: {(a,a,a,a), (a,a,a,b), (a,a,a,c)

… (b,b,b,b) … (c,c,c,c) …}. In the case where the game is extended to five games, the

diagram would be a table with 3^5 = 243 equipossible outcomes that make up the

respective fundamental probability set. But no matter what the particular diagram is, as

long as it is constructed with the same general combinatorial precept, it always leads to the same general solution. Pascal may go ahead and imagine as many different diagrams as he wishes in order to convince himself. This only means that his ability to generalize is less powerful than Fermat’s. But as long as the diagram is built according to the combinatorial rule so as to represent correctly the fundamental probability set, the solution to the problem will be correct.
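Fermat's claim that the extension of the hypothetical match is immaterial can likewise be checked mechanically. The following sketch (mine, in Python, not Fermat's own procedure) enumerates every ordered sequence of the extended games and records which player is the first to collect the points he still needs; the proportion 17:5:5 is unchanged whether the match is extended to three, four, or five more games.

    from itertools import product

    def count_winners(needs, extra_games):
        """Enumerate every ordered outcome of `extra_games` more games and count,
        for each player, the sequences in which that player is the first to
        collect the points he still needs."""
        wins = [0] * len(needs)
        for seq in product(range(len(needs)), repeat=extra_games):
            left = list(needs)
            for g in seq:                  # scan the sequence game by game
                left[g] -= 1
                if left[g] == 0:
                    wins[g] += 1
                    break
        return wins

    # Player 1 needs one point; players 2 and 3 need two points each.
    for extra in (3, 4, 5):
        print(extra, 3 ** extra, count_winners((1, 2, 2), extra))
    # prints [17, 5, 5], [51, 15, 15], [153, 45, 45] -- always the proportion 17:5:5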

Fermat, however, does not stop here but actually proceeds to analyze the problem in a different way that leads to the same combinatorial solution. Consider player 1: he can win in one, two, or three games. In the first case, “if he wins in a single game, he must, with one die of three faces, win with the first throw. A single die has three possibilities, this player has a chance of 1/3 of winning, when only one game is played” (David 1962, p. 249). Now, if two games are played, the triple-faced die yields nine possible outcomes.

Out of these nine possibilities, two outcomes give player 1 the victory: either player 2 wins the first game and player 1 the second game or player 3 wins the first game and player 1 the second. Therefore, player 1 has a chance of 2/9 of winning in two games.

Finally, if three games are played with a triple-faced die, there are twenty-seven possible outcomes. Fermat shows that two of these favor player 1, and therefore he has a 2/27 chance of winning in three games. On the basis of this analysis, Fermat argues that the

“sum of chances” that player 1 will win is 1/3 + 2/9 + 2/27 = 17/27. Therefore, he concludes, “this rule is sound and applicable to all cases, so that without recourse to any artifice, the actual combinations in each number of games give the solution and show what I said in the first place, that the [hypothetical] extension to a particular number of games is nothing but a reduction of the several fractions to a common denominator.

There in a few words is the whole mystery, which puts us on good terms again since we both only seek accuracy and truth” (David 1962, p. 249). The “mystery” is ultimately solved and elucidated by way of Fermat’s generalization.
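Fermat's "sum of chances" and his remark about reducing the fractions to a common denominator can be checked directly with exact fractions; the following minimal sketch (mine, in Python) does so.

    from fractions import Fraction

    # Player 1 may win on the first, the second, or the third game of the continuation.
    chances = [Fraction(1, 3), Fraction(2, 9), Fraction(2, 27)]
    print(sum(chances))        # 17/27
    # Over the common denominator 27 this is 9/27 + 6/27 + 2/27 = 17/27,
    # i.e., 17 of the 27 equipossible outcomes, matching the combinatorial count.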

4.2 The Community of Inquirers

The powers of imagination, concentration, and generalization are the necessary epistemic conditions for the possibility of mathematical inquiry. Mathematicians, however, do not inquire alone; they rather cooperate with a cross-generational community of inquiry composed both of their predecessors and their contemporaries who contribute to the progress of mathematics by posing important research questions, creating new hypothetical systems, making crucial conceptual breakthroughs or cursory but necessary demonstrations, criticising each other’s work, finding and correcting mistakes in reasoning, attempting novel attacks on an existing problem, and, in general, collaborating in the investigation of what is true of the hypothetical states of things that make up the subject matter for mathematical research. From a Peircean standpoint, this community of inquiry is a necessary condition for the possibility of innovative mathematical inquiry. This, in my opinion, is a significant claim as it means that without a community of inquiry, the mathematician could not create hypothetical states of affairs nor pursue progressive investigation into what is true about such hypothetical worlds.

And this is not merely a contingent economical claim. It is not simply because the mathematician does not, as a matter of contingent fact, have sufficient resources,

especially time, in order to research in isolation and make discoveries that the community of inquiry is a necessary condition for research. It is because mathematics is an activity and the mathematician’s very reasoning practice is of necessity communal, not individual.

This claim is of paramount importance within the Peircean philosophy of inquiry in general. For Peirce, the view that rational inquiry is individual has Cartesian origins.

As early as 1868, in an essay on the nature of cognition entitled “Some Consequences of

Four Incapacities,” Peirce argues that “the Cartesian criterion…amounts to this:

‘Whatever I am clearly convinced of, is true.’ If I were really convinced, I should have done with reasoning and should require no test of certainty. But thus to make single individuals absolute judges of truth is most pernicious. The result is that metaphysicians will all agree that metaphysics has reached a pitch of certainty far beyond that of the physical sciences; — only they can agree upon nothing else. In sciences in which men come to agreement, when a theory has been broached it is considered to be on probation until this agreement is reached. After it is reached, the question of certainty becomes an idle one, because there is no one left who doubts it. We individually cannot reasonably hope to attain the ultimate philosophy which we pursue; we can only seek it, therefore, for the community of philosophers. Hence, if disciplined and candid minds carefully examine a theory and refuse to accept it, this ought to create doubts in the mind of the author of the theory himself” (CP 5.265). Peirce’s central claim is that rational inquiry is essentially communal, not individual, as the Cartesian “paper-doubter” would pretend.

But this does not mean that it must be consensual; consensus is an ideal, a regulative hope that guides ‘scientific’ practice—where ‘science’ is the reasoning activity of those

community members that earnestly seek the truth by putting their beliefs to the test of reality, even of the hypothetical grade of reality of mathematical worlds, and by submitting those beliefs to the criticism of others. Communal rational inquiry, for Peirce, is essentially dialectical.

This general claim about communal inquiry in science and philosophy applies to mathematics. Even if mathematics is the study of what is true about hypothetical states of things—that is, about the very creations of the mathematician’s mind—the mathematician cannot make progressive discoveries individually, let alone lasting discoveries concerning the necessary truths about those hypothetical worlds.

There may be several ways to attempt to substantiate this claim from a Peircean perspective. The following passage on the communal, semiotic nature of reasoning provides a good way to understand Peirce’s position.108 For Peirce, “thinking always

proceeds in the form of a dialogue—a dialogue between the different phases of the ego—

so that, being dialogical, it is essentially composed of signs, as its matter, in the sense in

which a game of chess has the chessmen for its matter. Not that particular signs employed

are themselves the thought! Oh, no; no whit more than the skins of an onion are the

onion” (CP 4.6). Commenting on this passage, Douglas Anderson points to the

communal, semiotic nature of thinking that Peirce proposes: “The career of thought itself,

he suggested, takes place through the dynamic inquiry of a community of inquirers in

ongoing dialogue” (Anderson 1995, p. 1). Thinking is continuous dialogue—with oneself

and with others—carried out through signs: this is the key to the necessary role of the

108 I will not attempt to provide a comprehensive exposition of Peirce’s view on the communal, semiotic nature of thinking. I will only try to sketch it in enough detail to outline the necessary role of the community of inquiry in mathematical reasoning. This is one of those places, in a finite project, where one regrets the need to be brief.

community in mathematical reasoning. On the basis of this view, I will briefly outline a three-fold function of the community of mathematical inquirers as follows: (i) A mathematical ‘language’ or ‘system of representation’ is a communal development necessary for the possibility of mathematical inquiry; (ii) mathematical knowledge is settled communally, and the individual mathematician must employ it as background in her research; and (iii) the community of inquiry provides the dialogical criticism necessary for successful mathematical innovation. This outlined position, of course, is to be examined critically via the case of early mathematical probability theory.

4.2.1 Systems of Representation

As we saw in the introductory discussion on the “triad in reasoning,” for Peirce all thinking proceeds by way of ‘signs’. In any learning process, the triadic relation object-sign-mind is irreducible and inescapable since “all reasoning is an interpretation of signs of some kind” (EP 2, p. 4). This was a long-standing doctrine for Peirce, and he argued extensively for it.109 Here, I will only note succinctly the heart of his argument that

appears in the 1868 essay “Questions Concerning Certain Faculties Claimed for Man,”

a critique of Cartesian epistemology and one of the clearest expositions of Peirce’s

semiotic, communal conception of reasoning (EP 1.3). Peirce begins by arguing (i) that

we have no power of introspection but that all of our knowledge of internal facts is

109 For an early discussion, see the articles known collectively as the Cognition Series, published in 1868-1869 in the Journal of Speculative Philosophy, reprinted in EP 1, selections 2, 3, and 4.

inferred from our knowledge of external facts, and (ii) that we have no power of intuition but that all our cognitions are logically determined from previous cognitions (and there is no absolutely first, intuited cognition but rather cognizing begins as an experiential process). Then he asks “whether we can think without signs” (EP 1, p. 23). Since we have no power of introspection, we cannot appeal to directly cognized internal facts to answer this question, but we have to investigate external facts. Thus, Peirce summarizes his argument to the effect that we cannot think without signs as follows: “If we seek the light of external facts, the only cases of thought which we can find are of thought in signs. Plainly no other thought can be evidenced by external facts. But we have seen that only by external facts can thought be known at all. The only thought, then, which can possibly be cognized, is thought in signs. But thought which cannot be cognized does not exist. All thought, therefore, must necessarily be in signs” (EP 1, p. 24). Now, Peirce continues, “from the proposition that every thought is a sign, it follows that every thought must address itself to some other, must determine some other, since this is the essence of a sign” (EP 1, p. 24). That is, thinking is a continuous process in which every thought produces another thought since every ‘sign’ must have an ‘interpretant’ sign which has the preceding sign as its ‘object’. Thinking is a process or an event in time consisting of a continuous stream of thought-signs, each of which is subsequently interpreted by an

‘interpretant’ sign.

This thinking process is for Peirce dialogical: in the case of an individual person, the dialogue consists in a self addressing, through a present thought-sign, a future self who interprets the sign and, in turn, addresses a future self through the new thought-sign, and so on; in the case of a community, individuals communicate their thoughts to each

other through communally-convened ‘signs’.110 Therefore, the mathematician must

reason through mathematical ‘signs’ and, in order to communicate her inquiries, these

must be expressed by way of a communally-convened system of representation. It may be

that the inquiring mathematician herself proposes an original system of signs, especially

when working in a new field, that is, in an innovative hypothetical state of affairs. But in

order for the inquiry to be effectively communicated to other inquirers, that is, to convey

with precision to other dialoguing inquirers her thoughts on the mathematical subject

matter under study, the suggested original sign-conventions must be understood by the

community and must be relatable or capable of being associated with its precedent

thought-signs. For the foregoing reasons, therefore, the community of inquirers is a

necessary condition for the creation of an effective system of mathematical

representation. Let us elaborate on this position by reflecting briefly upon early

probability theory.

In the first place, creating a thoroughly adequate system of representation was

necessary for the development of the field. In my estimation, the most forceful example

supporting this position concerns one of the main reasons why the ancient Greek

mathematicians did not develop any probabilistic mathematics, even though gaming was

prevalent and chance decision-making practices, such as the casting of tali and dice,

were ubiquitous in religious and secular affairs. There are many positions, none of them

entirely satisfactory, regarding why the ancient Greeks did not develop any mathematics

110 For a full exposition of thinking as a dialogical process made up of a stream of thought-signs, interpreted either in one’s own mind or communicated via conventional signs to other minds, see Peirce’s “Some Consequences of Four Incapacities” (EP 1.3), especially p. 38-40.

of chance. Among these, I find Gillies’s position most noteworthy (Gillies 2000, p. 22-

24). By comparison with the emergence of mathematical probability in the seventeenth century, Gillies finds two main reasons why the ancient Greek mathematicians did not develop a mathematics of chance: (i) Ancient Greek mathematicians were skillful in geometry, but the development of probability theory required arithmetic and algebra, and in these areas the Greeks had poor systems of representation. (ii) The assumption of equipossibility could not have applied to the irregular dice of the ancients; that is, what

Gillies considers their “experimental apparatus” did not lend itself to simplifying assumptions. Gambling, for instance, was carried out with dice, not with coins, whose faces were conceivably equipossible. The first upshot of Gillies’s position is that adequate systems of representation, including those of related mathematical fields like arithmetic and algebra, were necessary conditions for the possibility of discovery of mathematical probability. The second upshot, clearly of empiricist provenance, is that the lack of regular experimental tools and thus of the opportunity to observe empirically the occurrence of stable statistical ratios prevented the possibility of conceiving of a mathematics of chance.

Of the two reasons, then, I want to argue for the first as the fundamental one, while I want to suggest that the second is merely ancillary to the first. I agree that one of the main reasons that the Greek mathematicians did not develop the mathematics of chance is that they did not have a good system of representation to frame and analyze probabilistic problems mathematically. In particular, the Greeks lacked an adequate system of numerals to represent numbers and hence the results of counts, and without numerals they could not develop a sophisticated arithmetic, much less any incipient

combinatorics. Without a system of numerals, then, these mathematicians could not count possibilities to create a fundamental probability set nor perform even simple arithmetical operations on this comprehensive enumeration of possibilities. Now, David also argues that the absence of numerals foreclosed the possibility of an empirical discovery of the mathematics of chance, though with empiricist leanings she thinks this was the case because it would have been impossible to keep a thorough tally of the experimental outcomes of aleatory trials with tali, dice, or any other gaming tool. Unable to keep a systematic account of chance outcomes, the ancients could not seize upon the idea of stable statistical ratios (David 1962, p. 22). I think this may be the case: the Greeks may not have had the tools to observe some regular empirical results that would have enabled the discovery of mathematical probability. But this was mainly the consequence of the lack of a good system of representation for some arithmetical ‘diagramming’ required to develop mathematical probability. David acknowledges this implicitly when she argues that the arithmetic and algebraic development necessary for mathematical probability could only take place after the difficulty of writing down numbers was overcome (David

1962, p. 24). In sum, the absence of a system of signs to represent enumerations foreclosed the possibility of conceiving the fundamental probability set and of developing a mathematical calculus on the basis of it. And it took various generations of mathematicians from a variety of cultures before a good system of numerals was established.

In the second place, a thoroughly adequate system of mathematical representation for the study of mathematical probability theory had to be developed communally in order to be effective. For example, the struggle to conceive of the correct set of

equipossible outcomes for the game of points involved the effort to represent effectively the difference between partitions and permutations. Such efforts at effective representation are present in a 1662 printed edition of De Vetula, where a table of permutations for the throw of three dice is provided.111 They are also evident in a table

that Galileo creates in his brief treatise to convey to his audience the difference between

possible partitions and permutations, also for the throw of three dice.112 And I think this

is illustrated by the Pascal-Fermat correspondence. Even when a system of numerals was

in place, part of these two mathematicians’ effort to communicate with each other consisted in creating an effective ‘diagram’ to represent the set of equipossible outcomes

for the game of points. In interpreting Fermat’s combinatorial method, Pascal creates the

types of tables that I have reproduced in section 3.1.2, and Fermat approves of them as

representations of the fundamental probability set. These are all efforts to represent, by

way of effective signs, the idea of a fundamental probability set. Reasoning upon the set must be carried through signs, and so the analysis of the set of equipossible outcomes becomes clearer and more accurate as the signs themselves—such as tables instead of long enumerations in prose or verse—become clearer and more effective, and this not for individual mathematicians only, but for the entire community of early probabilists.
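To make the difference between partitions and permutations concrete, here is a brief computational sketch of my own (it reproduces neither the table in De Vetula nor Galileo's, and the names in the code are merely illustrative). It enumerates the fundamental probability set for an idealized throw of three dice and shows why the distinction matters: the sums 9 and 10, for instance, each arise from six partitions, yet 10 arises from more permutations and is therefore the likelier throw.

```python
from itertools import product
from collections import Counter

# All 6**3 = 216 ordered outcomes (permutations) of a throw of three dice:
# an enumeration, or 'diagram', of the fundamental probability set for the
# idealized three-dice throw.
ordered_outcomes = list(product(range(1, 7), repeat=3))
assert len(ordered_outcomes) == 216

# Permutations grouped by the sum of the upturned faces.
permutations_by_sum = Counter(sum(throw) for throw in ordered_outcomes)

# Partitions (unordered triples) grouped by the same sums.
unordered_outcomes = {tuple(sorted(throw)) for throw in ordered_outcomes}
partitions_by_sum = Counter(sum(throw) for throw in unordered_outcomes)

print(partitions_by_sum[9], partitions_by_sum[10])      # 6 6
print(permutations_by_sum[9], permutations_by_sum[10])  # 25 27
```

Counting partitions alone would make the sums 9 and 10 look equally favorable; counting permutations, that is, the genuinely equipossible outcomes, shows that 10 is the better wager.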

In the third place, the community of inquirers not only created gradually an adequate system of representation to express their developing thought concerning probability theory; generations of inquirers actually developed entire branches of mathematics that informed conceptually the development of probability theory. This

111 For a reproduction, see Kendall 1970, p. 24-25. 112 See David 1962, p. 194.

development in part consists in the creation of entire systems of representation, such as those of arithmetic and algebra; in other words, it consists in the development of vast systems of mathematical ‘diagrams’ to express their thought concerning entire hypothetical worlds. But it also consists in part in the communal development of a growing body of mathematical knowledge. Thus, the community of inquirers not only created the convened system of thought-signs but also developed the requisite body of mathematical thought itself. Accordingly, this function of the community demands its own examination.

4.2.2 Existing Mathematical Knowledge

Mathematical knowledge—that is, mathematics regarded not as an ongoing inquiring practice but, less dynamically and more traditionally, as an established body of propositions about the different hypothetical states of things that frame the different branches of this science—is necessarily the product of communal inquiry. And this body of knowledge, represented by a conventional system of mathematical diagrams, is a necessary condition for the mathematician’s original reasoning. Let me expound these two interrelated claims.

First, a community of inquirers is a necessary condition for the possibility of the growth of a rich body of mathematical knowledge, that is, of a system of ideas pregnant with possibility for further creation and investigation. And all mathematicians must pick up the thread, so to speak, of this communal thought in order to find out what needs to be

investigated. Regarded as historical or sociological, this claim should not be controversial. Suppose that it were possible for a mathematical genius to develop, in complete isolation and by way of an entirely original system of signs, a body of mathematical knowledge that she never communicates to other mathematical inquirers.

Even then, as a historical or sociological matter, mathematical knowledge would not grow rich, and the knowledge would be lost as soon as the genius died. Regarded as epistemological, the claim requires substantiation. I submit the following two-fold

Peircean argument. The growth of mathematical knowledge consists in the gradual creation and study of hypothetical states of affairs, represented and embodied in systems of mathematical ‘diagrams’. A system of mathematical diagrams is a system of ‘signs’ expressing mathematical ‘ideas’. Now, ‘signs’ and ‘ideas’ are irreducibly triadic, and an examination of their respective triadic natures provides two arguments, one firmly epistemological and the other more speculative. A ‘sign’, according to Peirce, has “three references: 1st, it is a sign to some thought which interprets it; 2nd, it is a sign for some

object to which in that thought it is equivalent; 3rd, it is a sign, in some respect or quality,

which brings it into connection with its object” (EP 1, p. 38). Notice that in its first

element, a sign refers to some interpreting thought. It always refers to subsequent,

interpreting thought of our own and, if outward expression is involved, it is also interpreted by someone else’s thought. Now, suppose that a mathematician’s ‘thought-signs’ were only interpreted by thoughts of his own. Then, the possibilities for interpretation—for example, by way of conceiving of a new thought-sign, modifying the preceding sign, focusing attention on its particular characteristics, examining its general elements, associating it with other signs, and so on—are limited by the powers of

imagination, concentration, and generalization of the mathematician himself. For instance, an extraordinarily imaginative mathematician may nonetheless have an ordinary power of concentration, while a mathematician with a great ability for generalization may nonetheless have an ordinary ability for imagination, and so on. In such cases, the growth of a rich body of mathematical knowledge will be curtailed by the necessarily limited epistemic powers of individual inquirers.

Moreover, ‘ideas’ themselves are irreducibly triadic in that they (i) are constituted by intrinsic qualities, (ii) are inextricably related to other ideas since no idea arises in complete isolation, and (iii) have a tendency for becoming general. This is in fact what in

1892 Peirce labels the “law of mind,” namely, “that ideas tend to spread continuously and to affect certain others which stand to them in a peculiar relation of affectibility. In this spreading, they lose intensity, and especially the power of affecting others, but gain generality and become welded with other ideas” (EP 1, p. 313). This law of mind is a speculative hypothesis to explain the evolutionary character of nature. What is important in our present context is that mathematical ideas, expressed through thought-signs, are inextricably related to other ideas and, if the law is tenable, they actually have a power of eliciting other ideas and a tendency to become general, that is, to establish progressively more general ideas and habits of thought in the mathematician’s mind. Peirce cites as evidence the fact that thinkers in different places often arrive at similar results at the same time, for example, in non-Euclidean geometry and in the logic of relatives. This would happen, according to his speculative law of mind, because the living problems of a community of inquirers are ideas that may elicit the same solutions in the minds of different inquirers. The ideas are active and effective in the intellectual life of the

community of inquirers. Thus mathematical ideas, once conceived, take on a life of their own in the mind. They have intrinsic properties as icons, of course, but they also have a power to react with other ideas and to become increasingly general, so as to apply to ever-growing and continuously evolving hypothetical worlds.

Now, Peirce hypothesizes that these ideas have their reactive and ultimately generalizing power not only within an individual, or personal mind, but also between minds. Therefore, the ideas of an individual mathematician have a power of affecting and transforming the ideas of other mathematicians. The mathematician must express the ideas, and once expressed, these living ideas have their intellectual effect by transforming the thought of the community of inquirers. In one of his most daring hypotheses, posed abductively in order to explain the possibility of the actual existence of, say, public spirit and corporate personality, Peirce speculates that there may in fact be corporate or communal minds where ‘ideas’ have their reactive effect and evolve according to their generalizing tendency. Short of examining this hypothesis in detail here, I would like to submit that mathematical ‘ideas’ that grow into rich and complex systems do indeed require for their development the cooperative intellectual effort of the community of mathematical inquirers. The development of geometric, arithmetic, algebraic, and probabilistic systems of ideas, for example, necessarily required the intellectual collaboration of entire communities—that is, the direct and effective influence of each mathematician’s ideas upon the ideas of others. The expansive and evolving nature of these mathematical systems of ideas necessarily transcended, by their tendency for generalization, the limits of any individual mind.

Admittedly, the latter, speculative argument may require a more thorough elaboration. However, I think that in conjunction with the former argument, it helps to substantiate the epistemological claim that the community of inquirers is a necessary condition for the growth of mathematical knowledge. Now I want to consider the second claim, namely, that existing mathematical knowledge is a necessary condition for the possibility of original mathematical reasoning. The researching mathematician brings into her reasoning activity a body of background mathematical knowledge, from arithmetic and geometry, to the various systems of algebra, analytic geometry, differential calculus, number theory, and so on, as the case may be. From a Peircean perspective, this background knowledge consists in manifold, possibly interrelated, systems of diagrams that represent and in fact embody manifold, possibly interrelated hypothetical states of things. I submit that for the mathematician as active inquirer this existing knowledge consists in complex systems of ‘diagrams’ that, in the course of her reasoning, she may bring to bear in the analysis of a problem by modifying, reducing, relating, and generally associating them with the ‘diagrammatic problem’ under consideration. That is, the deeper and more extensive her knowledge of existing mathematical fields, the richer her possibilities for finding original solutions to new problems within an existing field or even for creating entirely new fields of inquiry. She is able to enter the fullest version of the communal conversation, so to speak. I might even say that the more extensive her knowledge of existing mathematical systems of ideas the richer the “intellectual matter” that can be explored, modified, re-formed, in-formed, re-created and generated by the mathematician’s power of imagination. The imagination, let us recall, is a necessary epistemic condition for mathematical inquiry, and thus the mathematician’s background

mathematical knowledge provides the “intellectual matter” that functions as a necessary condition for the possibility of original inquiry. The richer the existing “matter,” the richer the possibilities for innovation. Admittedly, existing knowledge is a necessary but not a sufficient condition. The imagination, if powerful enough, may create its own hypotheses and explore them through original ‘diagrams’, but in the long run the communal development of mathematics requires the creative work of the imagination upon existing knowledge. A powerful mathematical imagination with deep and extensive knowledge of hypothetical worlds already has a subject “matter” pregnant with possibility for further creation and re-creation. Similarly, the power of generalization has the possibility of leading to “deeper” mathematical understanding the more “extensive” the body of available knowledge that it has at its disposal. In this sense, we might recall

Peirce’s observation that the act of generalization leads to increasing “depth” as an extensive “breadth” of knowledge becomes organized under fewer principles.113

In the case of the early mathematical probabilists, communally-developed mathematical knowledge did function as a necessary, though not sufficient, condition for the possibility of discovery. Clearly, basic arithmetic was involved in the reasoning of

Cardano, Galileo, Pascal, and Fermat, and at least Fermat explicitly conceptualized chances in terms of mathematical ratios of favorable to total equipossible outcomes, though this conceptualization was implicit already in Cardano’s gambler’s manual.

Fermat also devises a ‘combinatorial method’ of solution to the problem of points, even

113 Carlo Cellucci discusses extensively the role of existing mathematical knowledge in the actual process of solving mathematical problems by way of the ‘analytical’ method (see Cellucci 2002, ch. 22). I will discuss the function of background mathematical knowledge in the course of mathematical reasoning, including Cellucci’s ideas, in chapter 5. For a more extensive treatment of the role of existing mathematical knowledge in mathematical reasoning, see Grosholz 1991, especially chapters 1 and 2.

as Pascal dreaded that the calculation of combinations might become unwieldy in complex problems. But Fermat stood by his method, perhaps realizing that the absence of a well-developed theory of combinations and permutations was simply an obstacle to be overcome by developing the area of combinatorics. A mathematician with Fermat’s intellectual powers and with his knowledge of algebra and number theory must have foreseen, or at least reasonably hoped, that the development of a theory of combinations sophisticated enough for the application of the combinatorial method to complex problems in the estimation of chances was entirely possible. Most notably, both Huygens and Bernoulli took up this task and developed the field of combinatorics so as to simplify the calculus of chances and, eventually with Bernoulli, of probabilities. In this case, once the basic mathematical ideas of probability theory had arisen and some initial problems concerning chances had been solved, it became necessary to develop the body of knowledge in combinatorics in order for probability theory to make further progress.

Fermat proposed a general method of solution for the problem of points even if it required the development of combinatorics in order to be most useful, instead of foreclosing the task of solving the problem because a seemingly “pre-requisite” field was not sufficiently developed yet. In the end, these examples support the position that a communally-developed body of mathematical knowledge is a necessary condition for the possibility of mathematical discovery; however, it is not a sufficient condition inasmuch as the development of one field, such as mathematical probability, may in fact spur the parallel communal development of a seemingly pre-requisite area, such as mathematical combinatorics, and in this development the work of the imagination, at the very least, is also necessary.
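As a gloss on the combinatorial method just discussed, the following minimal sketch (my own reconstruction, not a transcription of Fermat's tables) shows the reasoning in its general form: if player A lacks a points and player B lacks b, the game is imagined to run for the maximal a + b - 1 further rounds, each of the 2^(a+b-1) ordered sequences of round-winners is treated as equipossible, and the stakes are divided according to the fraction of sequences in which A gathers the points he needs.

```python
from itertools import product
from fractions import Fraction

def fermat_share(a: int, b: int) -> Fraction:
    """Player A's share of the stakes when A lacks `a` points and B lacks `b`
    points, each round being won by either player with equal chance."""
    rounds = a + b - 1  # the game is necessarily settled within this many rounds
    sequences = list(product('AB', repeat=rounds))
    favorable = sum(1 for seq in sequences if seq.count('A') >= a)
    return Fraction(favorable, len(sequences))

# A standard example: when A lacks 2 points and B lacks 3, A's share is 11/16.
print(fermat_share(2, 3))   # 11/16
```

The unwieldiness Pascal feared is visible in the exponential growth of the enumeration, which is exactly why a developed combinatorics, supplying the counts without listing the sequences one by one, was worth pursuing.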


4.2.3 Dialogical Criticism

The Peircean conception of thinking as a dialogical process implies that dialogical criticism is a necessary condition for the possibility of mathematical innovation.

Mathematical discovery is a critical, dialogical, and therefore ultimately communal, process. That for Peirce thinking is dialogical has been sufficiently established. That it is critical is implied by his conception of scientific and mathematical inquiry as open-ended practices, so that any belief is subject to the test of actual experience or, in the case of a purely theoretical endeavor such as mathematics, of dialogical criticism carried out by way of precise analysis. Let us then turn immediately to discuss the necessary function of dialogical criticism in the discovery of early mathematical probability.114

As is well known, a prevalent style of discussion among the European mathematicians of the sixteenth and seventeenth centuries involved a mathematician posing a problem, for which he often claimed to have a solution, as a public challenge for other mathematicians to solve. Some of the precursors of mathematical probability, such

as Cardano, Tartaglia, and Peverone were often involved in such discussions in a variety of areas, especially algebra. Even though this must have been an interesting way of

114 There is another Peircean line of argumentation that I will not develop here but that I would like to point out. Mathematical inquirers are fallible; their reasoning, even if considered to be ‘necessary reasoning’, often goes awry. Thus their fallibility requires the dialogical criticism of other inquirers to be set straight. For a discussion by Peirce of ‘fallibility’ in mathematical and scientific reasoning, see “The First Rule of Logic,” in EP 2.5 or RLT, p. 65-80. For a discussion of Peirce’s fallibilism and mathematics, see Hookway 1985, p. 182-183.

proceeding, I think that the disputatious character of such challenges was not aimed at contributing to a communal advancement of mathematics, since often the challenger would safeguard his methods for arriving at a solution. David reports, for example, that in the sixteenth century Tartaglia earned a high reputation as a mathematician by traveling to various Italian universities and posing questions on the solutions to some third-degree algebraic equations which the local mathematicians, most famously Antonio Maria Fiore in Venice in 1535, were unable to solve. However, he kept to himself the method of solution (see David 1962, p. 47-49). If such disputations advanced mathematics, this advancement was an indirect and unwitting result of the ensuing public discussion.

With specific regard to the origins of mathematical probability, the communal process of dialogical criticism is better represented by the successive attempts at solving the problem of points. As I have already mentioned, these attempts started with Fra Luca

Pacioli, also known as Luca di Borgo, in his 1494 Summa. In this work, he poses a version of the problem of points as a problem of proportions, apparently not conceiving of it as a problem of chances (see David 1962, p. 26-39). At this juncture, what I must emphasize is that the Summa was written as a textbook intended for mathematical instruction, and it was an influential work in the training of Italian mathematicians in the sixteenth century. David argues that it motivated Cardano and Tartaglia to “write their versions of the algebra and of the arithmetic of the day” (1962, p. 38). With regard to the problem of points, Pacioli offers an incorrect solution to the problem but, as I have also noted in section 3.2, this inspired Tartaglia first and Peverone later to take up the task of providing their own solutions. This is precisely an example of the kind of dialogical criticism that leads to mathematical innovation. Tartaglia and Peverone also erred, but I

have observed that Peverone came close to a correct solution, except that his power of mathematical concentration failed him.

This process of intergenerational criticism among mathematical inquirers led eventually to the Pascal-Fermat correspondence. We have already discussed the correspondence in ample detail. It remains to emphasize now that this epistolary exchange is a prime example of the communal function of dialogical criticism in advancing mathematics. Pascal and Fermat take up questions in the estimation of expectations in aleatory situations—exemplified by the problem of points in games of chance—with a candid spirit of pursuing exacting analysis and criticism in the context of an open-ended inquiry. Both mathematicians openly expound their best methods of solution to each other in the hope of learning from the critical observations of their correspondent. We have seen, for example, how Pascal attempts to apply Fermat’s combinatorial method to a problem involving three players and criticizes the method because it allegedly leads to incorrect solutions, thus spurring Fermat to expound it generally and even to offer an alternative method of solution. Dialogical criticism thus spurred explicit ‘generalization’. It is also noteworthy that Fermat tried to engage Pascal in questions of number theory; Pascal however declined, recognizing that Fermat’s discoveries in this area were beyond his comprehension (see David 1962, p. 247-251). I think that in the correspondence these mathematicians followed what in 1898 Peirce called the ‘first rule of logic’, namely, “that there is but one thing needful for learning the truth, and that is a hearty and active desire to learn what is true” (RLT, p.170). This active desire to learn is the fundamental condition for discovering the truth because, according to the Peircean view, “inquiry of every type, fully carried out, has the vital

power of self-correction and of growth” (RLT, p. 170). To “carry out an inquiry fully” means to investigate in the spirit of a communal endeavor that is open to the lessons of experience and dialogical criticism. The aim is not to defend what the inquirer believes to be true, but to expose what he believes to be true to the test of experience and to the honest and exacting criticism of the community of inquirers. I submit that Pascal, and especially Fermat, pursued their joint epistolary research in this spirit, thus illustrating the necessity of communal dialogue for mathematical discovery.

Moreover, it is noteworthy that Huygens likely took up his research into the mathematics of expectation in aleatory situations upon learning of the problem of points in Paris in 1655. David writes that even though Huygens found out about the existence of the problem of points, he was not “told of the solutions of Fermat and of Pascal nor of the methods which they followed, which may have been because his Parisian friends were not competent to explain them” (David 1962, p. 111). David reports that, at the time of

Huygens’s visit, Pascal was undergoing his second conversion and so had retired from mathematical circles, while Fermat was in Toulouse. Nevertheless, upon his return to Holland, Huygens began his work on the mathematics of expectation that led to his 1657 De Ratiociniis in Aleae Ludo (1962, p. 111-112). This is another instance in which the community of inquirers catalyzed a crucial development in early mathematical probability. Huygens found out about the problem of points in the Parisian mathematical circles, and even though he did not learn Pascal’s or Fermat’s methods of solution, he set to work again in the spirit of the Peircean ‘first rule of logic’ so as to find out what is mathematically true about problems of expectation involving chance. Below we will discuss his mathematical approach for solving the problem. But the fact is that his

inquiries were successful and that he offered a more systematic and conceptually lucid solution than even Fermat.

Finally, I should briefly note that the 1703-1704 Leibniz-Bernoulli correspondence is another paramount example of the necessary role of dialogical criticism in leading to mathematical innovation. I have already anticipated that the correspondence centers on the question of whether the calculus of probabilities, greatly advanced by Bernoulli’s Ars Conjectandi, is applicable to problems in the natural and social sciences. Bernoulli claims that the probabilities of natural events may be estimated by observed statistical frequencies of those events while, for reasons that we will have occasion to discuss in section 6.1, Leibniz objects that the probability calculus is not applicable to such events. What I want to emphasize now is that in his Ars

Conjectandi Bernoulli explicitly addresses Leibniz’s objections (see Bernoulli 1966, ch.

4, especially p. 42-44). He is careful to address the objections so as to justify his mathematical method of estimating probabilities on the basis of observed frequencies.

Whether his defense of the method is successful is an open question that I will address later. This very question in fact eventually became the problem of inverse probability, which has its own history of communal inquiry, featuring such prominent probabilists as

Bayes and Laplace. The important point at this juncture is that dialogical criticism led to a careful and thoughtful argument on the part of Bernoulli to defend the applicability of

his mathematical method to scientific problems. Once again, then, the community of inquirers served to spur mathematical innovation.115
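For readers who wish to see the content of Bernoulli's claim in modern notation (the notation is mine, not Bernoulli's), his main theorem can be glossed roughly as follows: if S_n is the number of occurrences, in n independent trials, of an event whose probability in each trial is a fixed value p, then for any margin of error epsilon greater than zero,

\[
\Pr\!\left( \left| \frac{S_n}{n} - p \right| < \varepsilon \right) \longrightarrow 1 \qquad \text{as } n \to \infty .
\]

Bernoulli's hope, contested by Leibniz, was that this result licenses the inverse use of observed frequencies as estimates of an unknown p; that inverse step is what later crystallized as the problem of inverse probability.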

4.3 Pragmatic Upshot Towards a Logic of Mathematical Inquiry

Let us turn now to consider succinctly what the foregoing discussion on the epistemic conditions for the possibility of mathematical discovery implies for a practical logic of mathematical inquiry. The fulfillment of these conditions by themselves in the context of mathematical research does not guarantee that the investigation will succeed or that new discoveries in mathematics will necessarily result. I readily admit that there is not a rule-like set of sufficient conditions for mathematical discovery. The Peircean conception of mathematics in fact implies that such a rule-like set of sufficient conditions does not exist, and to demand that it be provided in order for a proposed logic of discovery to be a “logic” in the narrow sense of “mechanical” rules for inference is to misunderstand entirely the nature of mathematical reasoning, on the one hand, and of

‘logic’ in its more comprehensive sense as the science of good reasoning—deductive, inductive, abductive, or otherwise, on the other. Consider mathematics defined as the science that draws necessary conclusions. As Peirce puts it with what we might call a sort of logico-

religious earnestness, it is a logical heresy to hold that “necessary reasoning takes a

115 Some further questions that arise from this discussion and that would require further investigation are (i) whether criticism is entirely distinct from creation or whether it is part of the creative process in framing mathematical worlds and (ii) whether criticism involves reasoning powers or abilities different from imagination, concentration and generalization.

course from which it can no more deviate than a good machine can deviate from its proper way of action, and that its future work might conceivably be left to a machine”

(CP 4.611). Peirce admits that “all genuine mathematical work, except the formulation of the initial postulates (if this be regarded as mathematical work) is necessary reasoning;” however, such reasoning demands originality and calls for the powers of imagination, concentration, and generalization to choose a path of reasoning among the many possible ones open to a practicing mathematician (CP 4.611). In sum, to demand a “recipe” sufficient for original mathematical inquiry in order to call the recipe a “logic of discovery” is to misunderstand altogether the nature of mathematical inquiring activity.

Nevertheless, it is possible to outline a set of necessary and enabling conditions for the possibility of mathematical discovery. The foregoing conditions, including an enabling empirical problem-context and the epistemic conditions of the individual mathematician—imagination, concentration, generalization—and of the community of inquirers—system of representation, mathematical knowledge, dialogical criticism—, provide us with a sketch of the basic elements of the logica utens at work in innovative mathematical research. To some extent these conditions, especially the individual epistemic ones, are innate or instinctive, and so it is not possible to turn this logica utens entirely into a logica docens. However, all of these conditions can be fostered, and to this extent it would be possible to develop a logica docens for mathematical discovery. This logica docens would be in the spirit of the correspondence course on the “Art of

Reasoning” that Peirce designed in the late 1880s.116 Peirce conceived of the course as

116 For Peirce’s description of the course, see W 6, p. 10-32.

one in which the students would attempt practical reasoning exercises in traditional logic, mathematical reasoning, and scientific reasoning. Their solutions would then be subject to analytical scrutiny so as to give the students insight into the theory of reasoning once the “living examples make the necessity and significance of it apparent” (W 6, p. 11).

Peirce aimed at teaching the “living process” of reasoning and at exercising the student’s mind “in such a way that it gains strength and skill at the same time” (W 6, p. 16). The specific aims of each of the exercises in logical, mathematical, and scientific reasoning reveal that Peirce thought the conditions for the possibility of good reasoning could be cultivated: “The first point, for instance, which I bring to the test, in nearly every case, is the brightness, inventiveness, liveliness of the mind; and some exercises are devoted to waking up this faculty should it be in a dormant state, as it often is” (W 6, p. 16). This is akin to cultivating the power of imagination of the mathematician. Peirce continues by emphasizing the strengthening of abilities akin to the power of concentration: “The next thing necessary is to see that the man makes a vigorous distinction between fact and fancy….Some fancy that reasoning has to be performed within the private chambers of their own brains, and do not appreciate, at first, how intimately it is connected with the real world” (W 6, p. 16). This description suggests the training of the power of concentration for a natural scientist, but it is easy to see how the mathematician, who ought to distinguish between what is relevant and what is superfluous in constructing and analyzing mathematical ‘diagrams’, would benefit from similar training. Though Peirce does not emphasize it in the description of the course, we can extend his ideas to conceive of training the power of generalization.

The upshot is that these abilities can be cultivated and invigorated, and in the case of the training of mathematicians, I think they should be. This type of education in the

“art of mathematical reasoning” should go alongside the traditional learning of the established body of mathematical knowledge that dominates the students’ education, often well into their university coursework. In a vivid analogy meant to emphasize vigor in reasoning, Peirce writes: “If you want to teach a man to box, you must set him to boxing. But you must carefully analyze each motion for him. [Thus] the exercises must be analytic. The analytical gymnastics of the mind is what the instruction consists in” (W

6, p. 30). My suggestion is that the mathematician in training ought to be subjected to these gymnastics of her powers or abilities from the outset, from her early mathematical education in the schools and through the entire course of her training. Moreover, mathematicians might deliberately continue to cultivate these conditions, that is, to strengthen them as part of the logica utens that they actually deploy in their investigations. It is beyond the limits of this project to develop specific suggestions for how to foster these conditions. I will have to leave it as a follow-up task on the basis of my present work. The task is to think of the specific kind of mathematical education that would invigorate the proposed set of enabling and necessary conditions for the possibility of discovery. At the very least, however, my present results open the path and point the way for such future research.

I nevertheless do want to make some closing remarks about the power of imagination. The emphasis that Peirce places on the ability to imagine hypothetical worlds is, in my estimation, one of his most important contributions to the logic of mathematical inquiry. Mathematics, like all of science, is first and foremost an

imaginative activity. This means that reason must necessarily appeal to the imagination in order to fulfill its mathematical function. If this be called a Romantic view of mathematical inquiry, I shall not object.117 Peirce, however, has a regrettable tendency to

turn around and suggest that the imaginative formulation of initial postulates might not be

regarded as mathematical work (see CP 4.611, just quoted above). It is perplexing to me

that there should be any doubt for Peirce that this imaginative creation of hypothetical

worlds is an intrinsic part of mathematical reasoning. The case of the early probabilists

demonstrates it: the discovery of the fundamental probability set was the product of the

creative imagination of generations of inquirers. In their case, actual experience did to

some extent feed the imagination with “intellectual matter” to transform into ‘signs’ or

‘diagrams’. However, without the imagination there would have been no hypothetical

world for the mathematical probabilists to explore.

From a Peircean standpoint, I think that ultimately I am justified to submit that the

mathematician is a type of poet who studies what is necessarily true about his

hypothetical creations. These creations must have some definite characteristics, such as

being general and determinate, and they are investigated through some precise heuristic

methods of analysis, as we will see next. They are, nevertheless, the mathematical

creations of a poetic mind. The mathematician in training, therefore, might greatly

strengthen his necessary power of imagination not only by attempting to frame the

117 In the April 15, 2005, Dotterer lecture at The Pennsylvania State University, Richard Rorty suggested that giving primacy to the imagination over reason is the Romantic response to the Enlightenment. Then he proceeded to exalt “poetry” and the work of the poets as the necessary philosophical response to the worship of “science” and the work of the scientists that some philosophical traditions practice. Rorty included Peirce among the “science worshipers.” But Peirce’s view is clearly that mathematical reasoning must invoke the imagination to create its subject matter; if this be called a Romantic view of mathematics, this is altogether acceptable to me.

principles of possible mathematical worlds but also by engaging in any other imaginative activity, including poetic, artistic, and musical creation. With regard to the relation between music and mathematics, for instance, Peirce writes that “[a]ccuracy and temper go together” in the practice of both (MS 748; printed in NEM 4, p. xiv). He adds that this mix of accuracy and temper “is one of the many respects in which mathematicians and musicians have a certain degree of resemblance. Many mathematicians love instrumental music: why should they not? For the delight equally of the science and the art consists in the contemplation of complicated systems of relationship. It really needs explanation if a mathematician is not musical. The intelligent listening to a fugue of Bach is certainly more like reading a piece of higher mathematics than the lesson of the school boy in elementary geometry is like the higher geometry” (NEM 4, p. xiv). The implications for the education of a mathematician are clear: in listening to music he is contemplating a system of relations as complex as any system of mathematical ‘diagrams’. The powers of imagination and of concentration are sharpened as much in one as in the other activity.

And the sharpening of these abilities is a necessary component of a logica docens for mathematical inquiry.

Chapter 5

The Method of Mathematical Inquiry and the Heuristics of Discovery

Having expounded a set of enabling and necessary conditions for the possibility of mathematical discovery, I will turn now to consider in detail the method of mathematical inquiry and the heuristic techniques by which it proceeds. I turn to the examination of method and heuristics because, when the conditions for the possibility of discovery are actually given, mathematicians must do their research, they must pose and solve mathematical problems. How do they proceed? What are their methods of actual research? Recall from chapter 2 that, according to the Peircean conception, the process of

‘necessary reasoning’ deployed in active mathematical inquiry can be logically analyzed into five continuous stages, namely: (i) Expressing a hypothesis in general terms, often by posing it as a problem; (ii) creating a concrete ‘diagram’ or mathematical icon to represent the hypothesis; (iii) experimenting upon the diagram by imagining possible changes to the diagram or schema; (iv) observing the results of experimentation until one experiment is seen to solve the problem or to ‘de-monstrate’ the original general hypothesis; and (v) generalizing the solution and, if necessary, translating it into the general language of the original hypothesis. The five-stage method of mathematical inquiry, therefore, requires the creation of and experimentation upon ‘diagrams’ in order to substantiate the general hypotheses.

When presenting the Peircean view on the method, I suggested that mathematical inquiry advances by way of ‘hypothesis-making’ of two kinds: first, by the making of

‘framing hypotheses’ that create and determine a hypothetical state of affairs, and second, by the making of ‘conjectures’ or ‘analytical hypotheses’ in the course of experimenting upon diagrams to solve particular problems. I also suggested that although Peirce does not acknowledge it explicitly, the process of ‘experimenting upon diagrams’ is an

‘analytical’ process—that is, a process of judiciously modifying a ‘diagram’ or mathematical ‘icon’ in a variety of possible ways with a view towards solving the problem. This sense of ‘analysis’ as central to mathematical discovery is in line with what Carlo Cellucci proposes to be the open view of mathematics as an inquiring practice. For Cellucci, mathematics is an open-ended activity that proceeds by the method of analyzing proposed problems through a variety of hypothesis-making techniques (see

Cellucci 2000). In what follows, I aim to develop the Peircean logic of mathematical inquiry by studying the heuristics of discovery involved in the inquiries of the early mathematical probabilists. I aim to “develop” the Peircean position in the sense that I will not only consider the methods for framing and analytical hypothesis-making that Peirce himself discusses but will also appeal to the heuristic methods that Cellucci delineates.

Now, I do not aim to provide a comprehensive set of hypothesis-making methods in the forthcoming discussion. Even though I will analyze some of the heuristic techniques employed by the probabilists and accordingly point out some of the most important hypothesis-making methods that mathematicians employ, I rather aim to show how the careful logical investigation of historical case studies may lead us to a deeper understanding of the heuristics of discovery that are actually involved in mathematical inquiry and that ought to be fostered in the training of aspiring mathematical researchers.

5.1 Heuristic Methods for Creating ‘Framing Hypotheses’

I call ‘framing hypotheses’ those propositions that either (i) create and determine a hypothetical state of things for mathematical investigation—e.g. definitions, axioms, and postulates—or (ii) suggest a plausible, more specific result within that hypothetical world that becomes a problem to be solved—e.g. theorems. Under the Peircean view, even the axioms that serve as the fundamental propositions of an axiomatic mathematical system ought to be considered as ‘framing hypotheses’ and not as self-evident truths.118 I

proceed next to define and illustrate—by way of Peirce’s and Cellucci’s work—some of the important methods for making ‘framing hypotheses’ that the early mathematical

probabilists deployed in their inquiries.

5.1.1 Abstraction

The mental operation of abstraction is central to mathematical research.

According to Peirce, “the adjective abstract was first used, in Latin, and in imitation of the Greek, of a geometrical form conceived as depleted of matter….The word is, with little doubt, a translation of the Greek [aphairesis], although no Greek text known at that time in the West, has been adduced, from which it could have been borrowed. The

118 According to James Robert Brown, we could adopt three views regarding axioms in mathematics: Axioms are (i) self-evident truths; (ii) arbitrary stipulations, such as those of conventionalism and formalism; and (iii) fallible attempts to describe how things are (Brown 1999, p. 170). In that classification, Peirce’s view is a variant of (iii), except it is not realist in the sense that Brown intends. The axioms are fallible attempts at describing a hypothetical state of affairs, sometimes representing an actual state of affairs, but not necessarily. That is, the axioms do not necessarily seek to describe a realm of mathematical ideas independent of the mathematicians’ creative conceptions.

etymological meaning is, of course, drawing away from; this, however, does not mean, as is often supposed, drawing the attention away from an object, but, as all the early passages in both ancient languages fully demonstrate, drawing one element of thought

(namely, the form) away from the other element (the matter), which last is then neglected. But even in the very first passage in which abstraction occurs as a term of logic, two distinct meanings of it are given, the one the contemplation of a form apart from matter, as when we think of whiteness, and the other the thinking of a nature indifferenter, or without regard to the differences of its individuals, as when we think of a white thing, generally. The latter process is called, also, precision (or better, prescission)”

(CP 2.428). There are, therefore, two different kinds of abstraction: one consists in drawing the form away from the matter of an object; the other consists in thinking of a general character of a group of objects regardless of their individual differences or particular qualities. Strictly speaking, Peirce prefers to call the former process

‘abstraction’ and the latter ‘prescission’.119

With respect to the operation more properly called ‘abstraction’, Peirce claims that it is “the very nerve of mathematical thinking. Thus, in the modern theory of equations, the action of changing the order of a number of quantities, is taken as itself a subject of mathematical operation, under the name of a substitution. So a straight line,

119 Discussing the latter in more detail, he writes that “prescission, if accurately analyzed, will be found not to be an affair of attention. We cannot prescind, but can only distinguish, color from figure. But we can prescind the geometrical figure from color; and the operation consists in imagining it to be so illuminated that its hue cannot be made out (which we easily can imagine, by an exaggeration of the familiar experience of the indistinctness of hues in the dusk of twilight). In general, prescission is always accomplished by imagining ourselves in situations in which certain elements of fact cannot be ascertained. This is a different and more complicated operation than merely attending to one element and neglecting the rest” (CP 2.428).

which is nothing but a relation between points, is studied, and even intuited, as a distinct thing. It would be best to limit the word abstraction to this process” (CP 2.428). Peirce in fact defines abstraction, in its more strict sense, always with reference to mathematical reasoning: Abstraction “consists of seizing upon something which has been conceived as a [epos pteroen], a meaning not dwelt upon but through which something else is discerned, and converting it into an [epos apteroen], a meaning upon which we rest as the principal subject of discourse. Thus, the mathematician conceives an operation as something itself to be operated upon. He conceives the collection of places of a moving particle as itself a place which can at one instant be totally occupied by a filament, which can again move, and the aggregate of all its places, considered as possibly occupied in one instant, is a surface, and so forth” (CP 1.83).

Peirce elaborates on the difference between ‘prescission’ and ‘hypostatic abstraction’ in the context of a discussion of the operations of thought characteristic of mathematics, and claims that it is the latter that is central to mathematical reasoning (CP

4.235). According to Peirce, “Abstractions are particularly congenial to mathematics.

Everyday life first, for example, found the need of that class of abstractions which we call collections. Instead of saying that some human beings are males and all the rest females, it was found convenient to say that mankind consists of the male part and the female part. The same thought makes classes of collections, such as pairs, leashes, quatrains, hands, weeks, dozens, baker's dozens, sonnets, scores, quires, hundreds, long hundreds, gross, reams, thousands, myriads, lacs, millions, milliards, milliasses, etc. These have suggested a great branch of mathematics. Again, a point moves: it is by abstraction that the geometer says that it ‘describes a line.’ This line, though an abstraction, itself moves;

and this is regarded as generating a surface; and so on. So likewise, when the analyst treats operations as themselves subjects of operations, a method whose utility will not be denied, this is another instance of abstraction….These examples exhibit the great rolling billows of abstraction in the ocean of mathematical thought” (CP 4.235). As we can see, collections and sets; points, lines, and surfaces; and algebraic operations on operations, are all the result of the process of abstraction according to Peirce. Note that this is not to claim that all these conceptions, much less all mathematical conceptions, have empirical origins. Theoretical conceptions are also subject to the process of abstraction, as the notion of an algebraic operation on an operation illustrates. Abstraction applies both to actual percepts and to theoretical ideas, both to perceived and to imagined objects.

The early probabilists gradually discovered the fundamental probability set—the main ‘framing hypothesis’ of early mathematical probability, since it began to delineate the mathematical world for study—by a process of ‘abstraction’. This means that the fundamental probability set was initially conceived by imaginatively drawing the

‘mathematical form’ for study from the ‘matter’ of actual experiences. The inquirers personified by the anonymous author of De Vetula were initially able to imagine, by abstracting from their actual experience with games of chance, a collection of possible outcomes of aleatory trials. This does not mean that the collection or complete enumeration of possible outcomes was directly perceived; it rather means that these inquirers were able to imagine the aleatory situation of dice-throwing, for example, in such a careful, attentive way so as to seize upon the complete enumeration of possible outcomes as the ‘mathematical form’ worthy of mathematical study, while disregarding those aspects of the ‘matter’ of the situation that were not of mathematical interest.

Among such mathematically unessential aspects of the ‘matter’ of dice-throwing to be disregarded we might include, say, the material of which the dice are made, the color of the dice, who the players are (so long as the game does not involve skill or skill can be assumed equal), whether the throw is right- or left-handed, and so on.

I am suggesting, then, that the early mathematical probabilists discovered the fundamental probability set not by abstraction from a direct percept but rather by abstraction from an image. That is, they first imagined an aleatory situation under ideal conditions: perfect dice, equally skilled players, and so on. Then they abstracted from this image the conception of an enumerated collection of possible outcomes. This means that they transformed the possible individual imaginary outcomes of the idealized game into an abstract object, an enumeration or collection which, later in the history of mathematics, came to be conceived as a set. It is not important right now that the initial enumerations may have been incorrect for the purposes of estimating chances insofar as they did not distinguish between partitions and permutations. This distinction would be the later result of a more precise determination of the fundamental probability set in the context of the ‘analysis’ of problems in the calculus of chances, as we can see in

Galileo’s ‘analytic’ work. The crucial point now is that the fundamental probability set was initially conceived, even if imprecisely, by way of abstraction from an idealized situation.
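Purely as an illustration of the operation being described, and not as a claim about how the probabilists themselves proceeded, the following sketch (with hypothetical names of my own) pictures 'idealizing abstraction' as the discarding of every attribute of a concrete throw except its mathematically relevant form, and the subsequent treatment of the collection of all such forms as a single object of study.

```python
from dataclasses import dataclass
from itertools import product
from typing import Tuple

@dataclass
class ConcreteThrow:
    """A deliberately over-described throw of three dice; most of these
    attributes belong to the 'matter' that the abstraction disregards."""
    faces: Tuple[int, int, int]   # the upturned faces, e.g. (3, 5, 1)
    dice_material: str            # bone, ivory, wood, ...
    dice_color: str
    player: str
    left_handed: bool

def abstract_form(throw: ConcreteThrow) -> Tuple[int, int, int]:
    """Idealizing abstraction: retain only the mathematically relevant form
    of the throw (which faces came up) and neglect everything else."""
    return throw.faces

# Hypostatic abstraction: the collection of all possible abstracted outcomes
# is itself made an object of study, namely the fundamental probability set
# for a throw of three dice.
fundamental_probability_set = set(product(range(1, 7), repeat=3))
assert len(fundamental_probability_set) == 216
```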

Admittedly, this original image of a game under ideal conditions was the result of abstracting from actual experience with games. I would like to call this operation of abstracting an idea from actual perceptual experience an ‘idealizing abstraction’. Clearly the Peircean stance admits that ideas may arise by abstraction from perceptual

experience. But this does not mean that all ideas arise by idealizing abstraction from perceptual experience. Our first ideas, once conceived on the basis of experience, spur a continuous train of thought in the mind that, animated by the imagination, has a life of its own. Ideas engender ideas through the work of the imagination. Sometimes this imaginative work proceeds by way of abstraction. In the conception of a fundamental probability set, the imaginative work—literally, the making of a mathematical image or

‘sign’—consisted in abstracting the mathematical form from the image of idealized aleatory conditions. Once the concept of a fundamental probability set was abstracted, the course of mathematical thought upon this idea took on a life of its own in the minds of a community of inquirers.

5.1.2 Framing Analogy

Analogy is an important method for framing mathematical hypotheses. In a very general sense, analogy is the inference that if two things agree with each other in one or more respects, they will agree in other respects as well. The nature of the agreement between the analogous things—whether individual objects, an object and its sign, relations between objects, systems of relations, and so on—may be one of strict correspondence or of some degree of resemblance. If the agreement is a strict one-to-one correspondence between (i) all the elements and (ii) the patterns of relations among the elements that constitute two different things, the analogy is an isomorphism. If the number of elements is not the same, but the structural patterns of relation between the

elements of the two things are the same, the analogy is a homomorphism. Less strictly, an analogy between two things may be a relation of stronger or weaker resemblance.

Peirce considered analogy to be a fourth type of inference, a mixed case between induction and abduction.120 In a relatively early effort at discussing the various forms of

ampliative inference, Peirce writes that “among probable inferences of mixed character,

there are many forms of great importance. The most interesting, perhaps, is the argument

from Analogy in which, from a few instances of objects agreeing in a few well-defined

respects, inference is made that another object, known to agree with the others in all but

one of those respects, agrees in that respect also” (CP 2.787). In this case, we might

understand the analogy to be between a sample of completely examined objects and

another partially examined object. I think that the Peircean view that analogy is a mixed case between induction and hypothesis or abduction may be elaborated as follows. In an analogy, there is an enumeration of the aspects in which two things correspond to, or at least resemble, each other. This sort of enumerative evidence is usually inductive—it consists in a sample of observations about particulars. However, the inference is not from particular instances to a probable general rule; it is rather a conclusion to another

particular aspect in which the analogous things plausibly agree. No definite probability is attached to the conclusion of the inference by analogy. The inference only suggests what may plausibly be the case about one of the things under examination on the basis of its

120 However, later in the development of Peirce’s doctrines of logic, he seems to have come to classify analogy as a species of abduction. The question of what is the form of an analogy according to Peirce and how it ought to be classified is a subject that still requires a thorough investigation of its own. Here I will only try to discuss it in sufficient detail for my present purposes, though I openly admit that my interpretation of analogy may be considered as an incipient attempt at an elaboration of the Peircean model rather than as a faithful account of Peirce’s own explicit views.

partial agreement with the other, and this is similar to a ‘habitual’ abductive conclusion.

In analogy, therefore, no general rule is inferred, as it is in ‘creative’ abduction. Analogy remains at the level of knowledge of particulars, and so it is not an inference that leads to generalization. It does however tend to extend our knowledge—in the Peircean sense of

‘extension’ versus ‘generalization’—by concluding that an object may plausibly belong to a known class of objects.

With specific regard to necessary diagrammatic reasoning, Peirce writes that

“deduction consists in constructing an icon or diagram the relations of whose parts shall present a complete analogy with those of the parts of the object of reasoning, of experimenting upon this image in the imagination, and of observing the result so as to discover unnoticed and hidden relations among the parts” (CP 3.363; emphasis mine).

Therefore, here Peirce claims that the mathematician constructs an icon or diagram for study by way of analogy, where we might take a “complete analogy” to be an isomorphism, or at least a homomorphism, between the thing represented and the diagrammatic representation of the thing. In the larger context of his thought, I think we should take

Peirce’s position to be that mathematical ‘diagrams’ are often constructed by analogy, though mathematicians have other methods of construction depending on what the specific reasoning context requires.

Be that as it may, it is precisely as an extensive inference—that is, as an inference that adds “breadth” rather than “depth” to our knowledge—that Bernoulli uses an analogy to justify, in one of a series of arguments, the application of mathematical probability to the study of natural phenomena. As I discussed in section 4.1.1 , in his 1703-1704 correspondence with Leibniz, Bernoulli used an analogy between an urn filled with

pebbles and a human body containing sicknesses or diseased parts to warrant the application of the probability calculus to the study of natural events, such as diseases and storms. Bernoulli offers the analogy in response to his correspondent’s objection that the possible outcomes of natural events, unlike those of games of chance, are infinite—as

Hacking points out, an objection that in contemporary terms amounts to claiming that there is no fundamental probability set for natural events (Hacking 1975, p. 163-164).

Let me now expound Bernoulli’s inference by analogy. The initial step is to enumerate several characteristics with respect to which the urn and the human body

“agree” or are analogous. Bernoulli lists two such characteristics in this case.

Accordingly, the first premise of the analogy is that just as the urn contains white or black balls in a ratio that is unknown to us, so also the human body contains “the tinder of sicknesses within itself” (Bernoulli 1966, p. 76). For simplicity, we might interpret

Bernoulli to mean that the human body contains healthy—analogous to white—and sick—analogous to black—parts in a ratio that is unknown to us. Bernoulli assumes that the ratio of sick to healthy parts provides an indication of the probability of death within a given time period. It is likely that Bernoulli conceived of this tinder box of sicknesses as containing various types of sicknesses, such as dropsy and plague, which he uses in other examples and which were commonly listed in mortality tables such as those gathered by

John Graunt’s 1662 Bills of Mortality.121 This latter interpretation of course would

complicate the analogy, since we would have to conceive of the various sicknesses as

being analogues to balls of various colors within an urn. These complications

121 For a reprint of some summary statistics on causes of mortality from Graunt’s Bills, see David 1962, p. 101.

notwithstanding, the second premise is that just as it is possible to sample with replacement from the urn so as to determine empirically the ratio of white to black balls, so also is it possible to sample parts from the human body and observe them so as to determine empirically the ratio of healthy to sick parts. The next step is to point out some characteristic that is known to be true of the urn. Accordingly, the third premise is the claim, which Bernoulli believes to be warranted by his theorem, that on the basis of a mathematically determined sample size we can become “morally certain” that the experimentally observed ratio of white to black balls is as close to the true ratio as it is scientifically desired (Bernoulli 1966, p. 75). The final step is to draw the conclusion by analogy. Bernoulli thus infers that on the basis of a mathematically determined sample size we can also become “morally certain” that the empirical ratio of healthy to sick parts is as close to the true ratio as scientifically desired.

Regardless of the factual and conceptual merits of the various premises, the most important point for our present purposes is that by way of this analogy Bernoulli extends the notion of the fundamental probability set.122 Just as there is a fundamental probability

set in games of chance, so also is there a fundamental probability set in natural events.

Bernoulli argues that mathematically it is irrelevant that in natural events the number of

possible outcomes constituting the set is infinite since the ratio of two infinite quantities

122 For an eloquent discussion of the scientific and conceptual context that made the various premises and implicit assumptions of Bernoulli’s analogy plausible or implausible to his contemporaries, see Daston 1988, p. 230-253. Her main thesis is that Bernoulli created a model of causation that appealed to those of his contemporaries who wanted to uphold the possibility of practically certain scientific knowledge against skeptical arguments. The model of causation consisted in conceiving of a priori probabilities as the unknown causes of observed effects, namely, a posteriori statistical ratios. Thus it was possible to reason scientifically from effects (observed frequencies) back to causes (unobserved a priori probabilities) and to obtain scientifically practical certainty about the hidden causes of observed phenomena.

may nevertheless be finite, as the notion of a limit shows. Bernoulli’s analogy, therefore, extends the general concept of the fundamental probability set so as to be applicable in the particular case of natural events. Daston explains that “Bernoulli’s appropriation of the urn example to describe the processes linking inaccessible causes [i.e. a priori probabilities] to observed effects [i.e. statistical frequencies] expanded not only the domain of problems upon which probabilists might test their skills, but also the conceptual tools for extending the range of the theory’s applications still further. By likening situations as disparate as the diseases that afflict the human body…to drawing white and black balls at random from an urn, probabilists hoped to free their theory from its preoccupation with gambling puzzles. Bernoulli’s urn model of causation set the pattern for other applications of classical probability theory” (Daston 1988, p. 238). This is an exemplary instance, therefore, of the significant re-creation by an analogical extension of a ‘framing hypothesis’.123 Further studies into historical cases ought to lead to a more comprehensive list of methods for the making of framing hypotheses. For now, however, let us turn to examine cases of ‘analytical’ hypothesis-making.

5.2 The ‘Analytic Method’ of Mathematics and the Heuristics of ‘Analytical Hypothesis-Making’

I call ‘analytical hypotheses’ those that the mathematicians conceive as plausible solutions to a problem. In Peircean terms, these hypotheses consist in experimental modifications to an existing ‘diagram’—be it a geometrical figure, an algebraic equation,

123 Recall that I have already argued in section 4.3 that the imaginative creation or re-creation of ‘framing hypotheses’ ought to be considered an inextricable aspect of mathematical reasoning.

and so on—in order to derive another ‘diagram’ representing the sought mathematical result. Again, I will examine some salient cases from the early history of mathematical probability so as to (i) clarify in general what I, following Cellucci, mean by the ‘analytic method’ of mathematics, and (ii) define and illustrate some of the most important heuristic methods for making ‘analytical hypotheses’ outlined by Peirce and Cellucci.

5.2.1 Lessons from Huygens’s General Method of Solution for the Problem of Points

After his 1655 visit to Paris, Huygens set to work on various problems on the mathematics of chance, including the problem of points, independently of Pascal and

Fermat. His work is highly significant since, as David proclaims, “[t]he scientist who first put forward in a systematic way the new propositions evoked by the problems set to

Pascal and Fermat, who gave the rules, and who first made definitive the idea of mathematical expectation was Christianus Huygens” (David 1962, p. 110). In his 1657

De Ratiociniis in Aleae Ludo, Huygens put forth the first systematic treatment of the mathematics of chance, and this work became the standard text for studying the elements of the subject. It went through various English translations, one of them by

John Arbuthnot, and Jacob Bernoulli included it, with his own annotations, as part I of the Ars Conjectandi. The main body of Huygens’s De Ratiociniis in Aleae Ludo consists of the following fourteen propositions:

I: To have equal chances of getting a and b is worth (a + b) / 2.

II: To have equal chances of getting a, b or c is worth (a + b + c) / 3.

III: To have p chances of obtaining a and q of obtaining b, chances being equal, is worth (pa + qb) / (p + q).

IV: Suppose I play an opponent as to who will win the first three games and that I have already won two and he one. I want to know what proportion of the stakes is due to me if we decide not to play the remaining games.

V: Suppose that I lack one point and my opponent three. What proportion of the stakes, etc.

VI: Suppose that I lack two points and my opponent three, etc.

VII: Suppose that I lack two points and my opponent four, etc.

VIII: Suppose now that three people play together and that the first and second lack one point each and the third two points.

IX: In order to calculate the proportion of stakes due to each of a given number of players who are each given numbers of points short, it is necessary, to begin with, to consider what is owing to each in turn in the case where each might have won the succeeding game.

X: To find how many times one may wager to throw a six with one die.

XI: To find how many times one should wager to throw 2 sixes with 2 dice.

XII: To find the number of dice with which one may wager to throw 2 sixes at the first throw.

XIII: On the hypothesis that I play a throw of 2 dice against an opponent with the rule that if the sum is 7 points I will have won but that if the sum is 10 he will have won, and that we split the stakes in equal parts if there is any other sum, find the expectation of each of us.

XIV: If another player and I throw turn and turn about with 2 dice on condition

that I will have won when I have thrown 7 points and he will have won when he

has thrown 6, if I let him throw first find the ratio of my chance to his.124
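
Since Propositions I through III state Huygens’s rule of expectation as an explicit formula, a brief computational gloss may be useful before turning to Proposition IX. The following sketch is my own modern rendering of that rule, with illustrative stakes chosen by me rather than taken from Huygens.

    from fractions import Fraction

    def huygens_value(chances_and_prizes):
        """Value of a game with p chances of a, q chances of b, and so on:
        (p*a + q*b + ...) / (p + q + ...), as in Propositions I-III."""
        total = sum(p for p, _ in chances_and_prizes)
        return sum(Fraction(p * prize, total) for p, prize in chances_and_prizes)

    # Proposition I: equal chances of getting 3 or 7 are worth (3 + 7) / 2 = 5.
    print(huygens_value([(1, 3), (1, 7)]))   # 5
    # Proposition III: 3 chances of 10 and 2 chances of 5 are worth 40 / 5 = 8.
    print(huygens_value([(3, 10), (2, 5)]))  # 8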

Among these propositions, the most important for our discussion will be number IX because it provides a general rule for the solution of any particular version of the problem of points. I will detail the heuristic method that Huygens employs to derive this rule. Prior to that specific discussion, however, let us turn to consider briefly what

Huygens’s treatise reveals in general about the ‘analytic method’ of mathematical inquiry.

5.2.1.1 The ‘Analytic Method’ of Mathematics

Carlo Cellucci has recently argued in favor of a philosophical view of mathematics as an open-ended, heuristic practice instead of what he calls the

‘foundationalist’ view of mathematics as a closed-ended body of knowledge completely determined by self-evident axioms (see Cellucci 2000 and 2002). In that context, he argues that the ‘heuristic’ view reveals that the actual method of mathematical inquiry is

‘analytic’ instead of ‘axiomatic’. Actual mathematical inquiry does not proceed by way of mechanical deduction from self-evident principles and axioms. Some mathematical theories might exhibit an axiomatic structure once they are developed, but at this point they are “dead,” so to speak; established, axiomatized theories are no longer an actual,

124 I have listed the propositions as translated in David 1962, p. 116-117.

living matter of inquiry.125 Mathematical inquiry rather proceeds by way of analytical

problem-solving. According to Cellucci, “the analytic method is the procedure according

to which one analyzes a problem [that is, breaks it into constituent problems, or reduces it

to another problem, and so on] in order to solve it and, on the basis of such analysis, one

formulates a hypothesis. The hypothesis constitutes a sufficient condition for the solution

of the problem, but it is itself a problem that must be resolved. In order to resolve it, one

proceeds in the same way, that is, one analyzes it and, on the basis of such analysis,

formulates a new hypothesis. [Thus, analysis] is a potentially infinite process” (Cellucci

2002, p. 174).126 Under this view, therefore, the search for an absolute foundation to

mathematical knowledge is vain. To cast mathematical axioms as self-evident truths that

serve as absolute foundations for mathematical knowledge is to curtail the actual process

of analytical inquiry. Moreover, in as much as the analytic “passage from the given

problem to a hypothesis that constitutes a sufficient condition for its solution is

configured as a reduction from one problem to another, the analytic method is also called

the method of reduction” (p. 175). And in as much as the analytic method requires

formulation of a hypothesis for the solution of a problem, it “is also called the method of

hypothesis” (p. 177). Analysis, then, consists in reasoning processes that we might very

broadly conceive as ‘reduction’ and ‘hypothesis-making’.

Now, I have argued in section 2.4 that, in my estimation, the Peircean conception

of mathematics is open-ended, heuristic, and advocates the analytic method in the

125 If I may venture a metaphor, I think of these theories as a kind of fossil record of what once was a living matter. 126 All translations from this work are mine.

Cellucian sense since it views mathematics as an inquiring practice in which the mathematician solves ideal problems by way of analytical hypothesis-making. These ideal problems are framed within hypothetical states of affairs that often serve as representative models for ‘actual’ problems in nature. Moreover, ‘framing hypotheses’ in the Peircean sense need not be axioms, and even when an ideal mathematical system is axiomatized, the axioms are simply hypotheses that may be reconceived as the process of mathematical inquiry demands.

I want to submit now that the ‘heuristic’ view of mathematics, with its endorsement of the analytical method, explains well the type of mathematical practice that Huygens’s treatise reveals. There are no axioms serving as the foundation of

Huygens’s De Ratiociniis in Aleae Ludo. There is rather a series of propositions—all framed by the general idea of the fundamental probability set—that actually stand for problems of chance and expectation. In order to solve them, Huygens analyzes them, reducing them to other problems and posing hypothetical solutions; that is, in Peircean terms, experimenting with the mathematical models or ‘diagrams’ that represent the problem and observing the results of experimentation. The solution to each problem, in turn, suggests new problems for investigation. The analytical process, then, gradually leads to a “deepening” of knowledge on the mathematics of chance. For example, even without discussing the details here, we might easily imagine that the solution to the problem stated in proposition VII could proceed by analyzing this problem into those problems already solved in the immediately preceding propositions. And as we shall see in detail shortly, proposition IX is a general problem that can be analyzed into simpler problems that are either of easy solution or already solved in previous propositions,

especially II and VIII. Moreover, in his general treatment of the problem of points in proposition IX Huygens assumes that all players have equal chances of winning each game. This suggests a new, more general, problem: what if the players do not have equal chances of winning each game? Abraham de Moivre takes up this problem and offers an even more general solution to the problem of points in his 1718 Doctrine of Chances. We find in Huygens’s treatise, then, not an axiomatized theory but a series of interrelated problems regarding the calculus of chance whose solutions eventually lead Huygens to offer general rules for the solution of similar problems, such as the general method for solving particular problems of points stated in proposition IX. And the same analytical process is taken up by other inquirers, so that the analytical method does tend towards increasingly more general problems, potentially ad infinitum.

Now, proponents of the ‘foundationalist’ view of mathematics as an affair of deduction from self-evident axioms might of course deny that Huygens’s treatise is properly a mathematical work. Daston in fact observes that even though “the famous correspondence between Blaise Pascal and Pierre Fermat first cast the calculus of probabilities in mathematical form in 1654, many mathematicians would argue that the theory achieved full status as a branch of mathematics only in 1933 with the publication of A. N. Kolmogorov’s Grundbegriffe der Wahrscheinlichkeitsrechnung. Taking David

Hilbert’s Foundations of Geometry as his model, Kolmogorov advanced an axiomatic formulation of probability based on Lebesgue integrals and measure set theory. Like

Hilbert, Kolmogorov insisted that any axiomatic system admitted ‘an unlimited number of concrete interpretations besides those from which it was derived,’ and that once the axioms for probability theory had been established, ‘all further exposition must be based

exclusively on these axioms, independent of the usual concrete meaning of these elements and their relations’” (Daston 1988, p.3). Under such a ‘foundationalist’ view, therefore, the work of all the early probabilists, including Huygens, may be regarded as a non-mathematical, even if scientific, attempt at providing quantified models of chance phenomena, but not as mathematical theorizing proper. They may concede Daston’s own view that “the link between model and subject matter is considerably more intimate than that between theory and applications” so that, even in the eyes of the early probabilists, the field of mathematical probability was “a mathematical model of a certain set of phenomena, rather than…an abstract theory independent of its applications” (Daston

1988, p. xii). Even conceding this, however, the “foundationalists” would not confer upon early mathematical probability the seemingly privileged rank of a theory.

From a Peircean perspective, the distinction between model and theory may be of philosophical interest for understanding the structure of mathematical and scientific knowledge, but it is not relevant for determining whether the early probabilists were acting and reasoning as mathematicians. Foundationalist philosophers of mathematics may impose their conceptions of mathematics on early mathematical probability in order to battle as much as they want about whether it is a model or a theory. However, from a

Peircean open-ended, heuristic perspective, what marks the probabilists’ reasoning as genuinely mathematical is that they were creating ideal states of affairs and studying what would be true about them. Whether the results of studying these hypothetical states of affairs amounted structurally to models or to theories is beside our point of interest.

Nevertheless, let me state that I think that no deeper understanding of mathematics is gained by arbitrarily circumscribing the notion of mathematical ‘theory’ to axiomatized

systems of propositions. If anything, it promotes the erroneous idea that mathematics is the dead stuff printed in a certain kind of textbook.127 My inclination is to say that a mathematical ‘theory’ is a purely ideal system while a mathematical ‘model’ is a system that represents an ‘actual’ problematic phenomenon. In Peircean terms, a theory is a ‘pure icon’ while a model is a ‘symbolic icon’. Qua pure mathematicians, the early probabilists were creating a theory; qua applied mathematical scientists, they were modeling aleatory phenomena. Be that as it may, what is crucial to us is that the ideal systems of early mathematical probability were open-ended and subject to reconception and revision, as problem-solving demanded and as mathematical theorizing and the modeling of actual chance phenomena dictated. Whether theorizing or modeling, their activity was thoroughly mathematical, and it proceeded by problem-solving and hypothesis-making.128 Huygens’s work testifies to this, as we shall now see.

127 Preferably one without any figures, actual diagrams, pictures, conjectures or wild guesses. See, for instance, James Robert Brown’s discussion of the Bourbaki group in French mathematics, which equates the highest standards of rigor with a thorough refusal to use any pictures or figures or other heuristic aides in their demonstrations (Brown 1999, p. 172-173). The immediate Peircean reply, of course, is that the members of the Bourbaki group only take themselves not to be working with ‘diagrams’ even though their algebraic and analytic expressions are also mathematical ‘signs’ usually equivalent to other “pictorial” forms of representation. 128 To be bold, I would say that mathematics is not what philosophers, or mathematicians with philosophical dogmas, define it to be according to their precepts, proceeding then to exclude from mathematics anything that does not fit those precepts: mathematics is what mathematicians actually do. And what mathematicians actually do is recorded in its history as well as enacted in living research. This is why I think philosophers of mathematics ought to look at the history of the subject in order to understand its nature. Otherwise, the discussion over this or that arbitrary prescription of what mathematics is becomes extremely uninteresting.

5.2.1.2 Generalization and Particularization as Analytical Heuristics

Proposition IX provides a general rule for the solution of the problem of points.

Let me first expound Huygens’s demonstration and then discuss what it reveals about analytical heuristics. Again, the proposition is the following:

In order to calculate the proportion of stakes due to each of a given number of

players who are each given numbers of points short, it is necessary, to begin with,

to consider what is owing to each in turn in the case where each might have won

the succeeding game.

To demonstrate it, Huygens reasons as follows.129 (I will insert my annotations in parentheses.) He supposes that there are three players, A, B, and C, and that A lacks one game, B two games, and C two games in order to win the match. (That is, he begins by considering a particular problem of points.) He begins by trying to find the proportion of stakes due to B, calling the sum of stakes q (which serves as an algebraic unknown), if either A, or B himself, or C wins the first succeeding game. There are, therefore, three cases to consider. (a) If player A were to win the next game, then the match would end and consequently the sum due to B is 0 (i.e. B is due 0q). (b) If B were to win the next game, he would therefore lack 1 game, while A and C would still lack 1 and 2 games respectively. Therefore, by proposition VIII, B is due 4q/9. (Imagine the “Fermatian table” of equipossible outcomes for the ensuing situation. There would be 4 out of 9 possible outcomes that would favor player B. Huygens does not construct a Fermatian

129 My rendition of Huygens’s reasoning is a loose translation of his demonstration as reprinted in Bernoulli 1713, p. 18-19.

table, but the exercise allows us to understand the result given our previous discussion of the Fermatian method in the Pascal-Fermat correspondence.) (c) Lastly, if C were to win the next game, then he would lack 1 game, while A and B would still lack 1 and 2 games respectively. Consequently, by proposition VIII, B is due 1q/9. (Again imagine the

“Fermatian table” of equipossible outcomes for the ensuing situation. There would be only one out of nine possible outcomes that would favor player B.) Moreover, if we

“colligate in one summation,” that is, if we add, that which in each of the three cases is due to B, namely 0, 4q/9, and 1q/9, the result is 5q/9. Dividing this sum by 3, which is the number of players, the result is exactly 5q/27. By proposition II this is the “part sought,” that is, the proportion of the total stakes that is due to B. (Had we diagrammed the

“Fermatian table” for the particular version of the problem of points that Huygens considers, we would have found that there are twenty-seven equipossible outcomes, out of which only five outcomes favor player B.) As if to leave completely clear his reasoning, Huygens restates his conclusion that since B would obtain either 0, 4q/9, or

1q/9, then by proposition II the proportion of stakes due to B is “0 + 4q/9 + 1q/9 : 3” or

5q/27.

(At this point, Huygens derives a general rule for solving the problem of points from the foregoing solution to one particular version of the problem of points.) Therefore,

Huygens argues, one must consider in any problem whatsoever, clearly in the preceding one or in any other version of the problem, what is due to each player in the case where each might win the next game. (In the previous particular problem, we would find by the same method that A is due 17q/27 and C is due 5q/27.) For just as one cannot solve the preceding problem until we “subduce” it under the calculations already done for

proposition VIII, so also we cannot solve the problem in which the three players lack 1,

2, and 3 games respectively until we calculate how the stakes ought to be distributed when: (i) they lack 1, 2, and 2 games respectively, which is the preceding problem just solved, and (ii) they lack 1, 1, and 3 games respectively, which is the problem already solved in proposition VIII. (Note that when (iii) they lack 0, 2, 3 games respectively, the solution is trivial since A gets all of the stakes. This is why Huygens does not list it.)

Huygens provides a table that “comprehends” the calculations for each subsequent particular problem of points, up to the problem in which A, B, and C lack 2, 3, and 5 games respectively, noting that the particular solutions can be extended. (By providing the table, Huygens emphasizes that his general rule will work no matter how complex the particular problem of points under study.)
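
Huygens’s general rule is, in effect, a recursion, and it can be checked mechanically. The following sketch is my own reconstruction of that recursion (under the stated assumption that every player has an equal chance of winning each succeeding game), not a transcription of Huygens’s table; it reproduces the shares 17q/27, 5q/27, and 5q/27 for the case in which A, B, and C lack 1, 2, and 2 games.

    from fractions import Fraction
    from functools import lru_cache

    @lru_cache(maxsize=None)
    def shares(lacks):
        """Proportion of the stakes due to each player, where lacks[i] is the
        number of games player i still needs in order to win the match and all
        players have equal chances of winning each succeeding game."""
        n = len(lacks)
        # If some player lacks 0 games, the match is over: that player takes all.
        for i, k in enumerate(lacks):
            if k == 0:
                return tuple(Fraction(int(j == i)) for j in range(n))
        # Otherwise average, over the n equally likely winners of the next game,
        # the shares of the resulting situations ("colligate ... and divide").
        totals = [Fraction(0)] * n
        for winner in range(n):
            reduced = list(lacks)
            reduced[winner] -= 1
            for j, share in enumerate(shares(tuple(reduced))):
                totals[j] += share
        return tuple(s / n for s in totals)

    print(shares((1, 2, 2)))   # A: 17/27, B: 5/27, C: 5/27
    print(shares((1, 2, 3)))   # the 1, 2, 3 case discussed above

Exact rational arithmetic is used so that the computed shares can be compared directly with Huygens’s own fractions.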

Allow me to draw out now what Huygens’s demonstration reveals about mathematical demonstration via analytical heuristics. I find in Huygens’s demonstration the five-stage process of necessary reasoning that Peirce outlines. (i) Huygens expresses a hypothesis in general terms, in this case, his proposed general rule for solving the problem of points. This proposition becomes a problem that ought to be resolved. (ii)

Huygens creates a concrete ‘diagram’ or mathematical icon to represent the general proposition; in this case, the ‘diagram’ consists in a particular, definite, play situation that can be represented by words, Fermatian tables, and so on. (iii) Huygens experiments upon the ‘diagram’ by imagining possible changes to it, in this case, by considering three possible modifications that correspond to each of the three players winning the next game. As I already noted, each of the three ensuing situations can then be represented by three new ‘diagrams’, say, by three “Fermatian tables” of the resulting play situations.

(iv) Huygens observes the results of the experimental modification of the original

‘diagram’ and grasps that the experiment solves the problem, i.e. ‘de-monstrates’ the original general proposition. (v) Finally, Huygens generalizes the solution, even providing a complex table that we might consider a composite diagram of multiple play situations and that we understand to be but a part of an infinitely composite diagram.

Let us now delve into the ‘analytical’ stages of the process. In this example, stage

(ii) consists in a ‘particularization’ of the problem of points. Cellucci defines heuristic particularization as “the inference by way of which one passes from one hypothesis to another one that it contains as a particular case” (Cellucci 2002, p. 267). We might state the general problem of points as follows: Given that players A, B,…, X, Y, Z lack a, b,…, x, y, z points respectively to win the match, find the proportion of the total stakes q that is due to each one of them. Huygens finds that trying to find a general rule of solution directly from this general statement of the problem is too difficult. Thus he particularizes the general problem, in effect by constructing a particular ‘diagram’ of it, and proceeds to solve the particular version.

Experimental stage (iii) consists in ‘reducing’ the particular ‘diagram’ of the problem into three alternative ‘diagrams’ of problems that have already been solved.

‘Reduction’ in this sense simply means to resolve the present problem into one or more alternative problems whose solutions, when composed or linked in some suitable way, are sufficient for solving the original one. In this case, Huygens reduces the problem in which players A, B, and C lack 1, 2, and 2 games respectively into three alternative problems: how to divide the stakes when (a) they lack 0, 2, and 2 games; (b) they lack 1,

1, and 2 games; and (c) 1, 2, and 1 games. Case (a) has a trivial solution, and cases (b)

and (c) have already been solved in proposition VIII. Additionally, proposition II provides the rule by which the original problem can be solved in terms of the solutions to cases (a), (b), and (c).

In stage (iv) Huygens “sees” or “grasps” that the method of solution is general: it can be applied to any particular problem and it will lead to the correct solution.

Equivalent modifications of the original diagram in any play situation will yield the correct response regarding the fair distribution of stakes. Cellucci defines heuristic generalization as “the inference by way of which one passes from one hypothesis to another one that contains it as a particular case” (Cellucci 2002, p. 267). I submit that

Huygens “grasps” the generality of the rule quickly due to his vigorous power of generalization. Any mathematician with a lesser power of generalization, however, could arrive at the same generalization by conducting other diagrammatical experiments. It is in this sense that Peirce argues that necessary mathematical reasoning is inductive. The mathematician could experiment with problems in which there are, say, four players that lack 1, 1, 1, and 2 points. She could create a diagram of this play situation and resolve it by the same method into the various possible alternative diagrams. Still she would find that Huygens’s general rule works. Thus, mathematicians arrive at inductive generalizations more or less quickly according to whether their power of generalization is more or less vigorous. Accordingly, Huygens emphasizes the generality of his method by providing a table with the solution to more complex games. No matter how complex the diagram, his general method works, and his readers can confirm it inductively by conducting alternative experiments themselves. In this way, stage (v) is completed and the process of demonstration ends.

In sum, I propose that in the course of this demonstration Huygens reasons analytically, deploying the heuristic experimental techniques of ‘particularization’,

‘reduction’, and ‘generalization’ to solve a problem and, consequently, to “demonstrate necessarily” what originally stood as a hypothetical proposition. Accordingly, I find in this example a confirmation that the method of mathematical inquiry is analytical and experimental—in the Cellucian and Peircean senses that I have expounded—and an illustration that ‘particularization’, ‘reduction’, and ‘generalization’ are among the key heuristic techniques of research that mathematicians ought to cultivate. Let us turn to learn other heuristic methods of analysis from Bernoulli’s proof of his famous law of large numbers.

5.2.2 Lessons from the Demonstration of Bernoulli’s Theorem

In his Ars Conjectandi, Bernoulli states and demonstrates the first law of large numbers—arguably the most important breakthrough of early mathematical probability theory. Bernoulli sought to find a way to estimate the probabilities of events on the basis of empirical observation in situations where the probabilities cannot be estimated a priori. In the next chapter, I will discuss in detail the logical implications of Bernoulli’s mathematical findings and of his arguments to warrant the application of mathematical probability theory, on the basis of his theorem, to problems in the natural and social sciences. Right now I will discuss the strictly mathematical problem that Bernoulli poses and resolves. It is, in fact, a two-fold problem. First, Bernoulli wants to show that as the

number of empirical observations increases and tends towards infinity, the observed statistical frequency of a phenomenon approaches its a priori probability arbitrarily closely. He claims that by an “instinct of nature” we know that the more observations we have the lesser the risk involved in estimating the a priori probability from the a posteriori ratio. However, what we know by instinct requires mathematical demonstration (Bernoulli 1966, p. 38). That is, Bernoulli hypothetically poses, on the basis of an instinct of nature, a mathematical proposition, and this proposition becomes an ‘analytical’ problem that he must resolve. Stephen Stigler observes that “the empirical approach to the determination of chances was not new with Bernoulli, nor did he consider it to be new. What was new was Bernoulli’s attempt to give formal treatment to the vague notion that the greater the accumulation of evidence about the unknown proportion of cases, the closer we are to certain knowledge about that proportion” (Stigler 1986, p. 65).

Bernoulli approaches the problem formally by trying to show that as the number of empirical observations of a binomial event tends towards infinity, the probability that the difference between the a priori probability and the observed frequency is less than any small number approaches 1 (or maximum probability). In contemporary terms, Bernoulli seeks to prove the following hypothesized mathematical proposition:

Let p be the probability of a successful event E being the outcome on any chance

experimental trial. Let n be the number of experimental trials, x be the number of

successes in n trials, and s_n = x/n be the proportion of successes in n trials.

Bernoulli wants to justify mathematically that s_n is a good estimate of p by

showing that for any small positive number ε, the probability P( | p - s_n | < ε ) → 1

as n → ∞.130

Second, Bernoulli seeks to show that the required number of experimental trials may be

specified mathematically in order to ensure that the empirical estimate is as close to the a

priori probability of an event as it is desired. That is, Bernoulli wants to show that, for

any given small probability δ, n can be specified such that P( | p – (x/n) | ≤ ε ) > (1 – δ).

Continuing with our foregoing notation, we may express Bernoulli’s statement of the

problem as follows:

Show that n may be specified such that, for any given large positive number c,

P( | p - (x/n) | ≤ ε ) > c P( | p - (x/n) | > ε ).131
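
Although Bernoulli of course had no such tool, the content of this two-fold claim can be checked numerically. The sketch below is a modern illustration with parameter values of my own choosing (p = 3/5, ε = 1/50), not Bernoulli’s figures; it sums the binomial probabilities of all frequencies x/n within ε of p and reports the corresponding odds of falling within rather than outside the bounds, that is, the c in Bernoulli’s statement that a given n secures.

    from math import lgamma, log, exp

    def prob_within(p, n, eps):
        """Probability that the observed frequency x/n of n independent trials
        with success probability p lies within eps of p, summing the binomial
        terms on a logarithmic scale to avoid overflow for large n."""
        total = 0.0
        for x in range(n + 1):
            if abs(p - x / n) <= eps:
                log_term = (lgamma(n + 1) - lgamma(x + 1) - lgamma(n - x + 1)
                            + x * log(p) + (n - x) * log(1 - p))
                total += exp(log_term)
        return total

    for n in (100, 1000, 10000):
        p_in = prob_within(0.6, n, 0.02)
        print(n, round(p_in, 4), round(p_in / (1 - p_in), 1))  # n, P, odds c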

Stigler emphasizes that Bernoulli proved more than just the first law of large numbers.

Had Bernoulli only considered the first part of the problem, it would be strictly fair to call his theorem just the first “weak law of large numbers.” However, mainly on account of the second part, Bernoulli’s “actual result was deeper, subtler, more precise, more

difficult, and more ambitious than the simple and elementary statement of the weak law

of large numbers” (Stigler 1986, p. 66). Bernoulli not only demonstrated formally that as

the number of observations increases to infinity the probability that the difference

between the observed frequency and the a priori probability is arbitrarily small tends to 1,

but he also showed how to determine the number of observations required in order to

130 See Hacking 1971b, p. 221-222, and Stigler 1986, p. 63-70. I am following Hacking’s (1971b) notation now in order to engage his discussion of the problem more easily in the next chapter. 131 I adapt this expression from Stigler 1986, p. 66. Stigler points out several possible mathematical and conceptual pitfalls related to stating Bernoulli’s result in contemporary probabilistic terms. Among these, the most salient is that Bernoulli only treats the case where the numbers of successes, r, and failures, s, are integers, and not with the contemporary situation in which the ratio p = r / (r + s) ranges over all the real numbers in the interval [0, 1]. For the details, see p. 66-67.

attain a desired level of accuracy in the statistical estimate. With this in mind, let us turn to discuss Bernoulli’s attack on this two-fold analytical problem.

Bernoulli’s demonstration of the theorem is extensive and considerably longer than contemporary proofs of the weak law of large numbers. Stigler reconstructs

Bernoulli’s mathematical proof in contemporary terms with clarity and succinctness (see

Stigler 1986, p. 67-70). Accordingly, my aim here is not to retrace the path already covered by Stigler nor to reconstruct the entirety of Bernoulli’s proof. I will rather expound Bernoulli’s heuristic attack on the problem, emphasizing and explaining the

‘analytical’ character of his reasoning. Bernoulli states the theorem, or problematic proposition in need of proof, as follows:

I will call those cases in which a certain event can happen successful or fertile cases; and those cases sterile in which the same event cannot happen. Also, I will call those trials successful or fertile in which any of the fertile cases is perceived; and those trials unsuccessful or sterile in which any of the sterile cases is observed. Therefore, let the number of fertile cases to the number of sterile cases be exactly or approximately in the ratio r to s, and hence the ratio of fertile cases to all the cases will be r / (r + s) or r / t [letting t = r + s], which is within the limits (r + 1) / t and (r – 1) / t. It must be shown that so many trials can be run such that it will be more probable than any given times (e.g., c times) that the number of fertile observations will fall within these limits rather than outside these limits—i.e., it will be c times more likely than not that the number of fertile observations to the number of all the observations will be in a ratio neither greater than (r + 1) / t nor less than (r – 1) / t. (Bernoulli 1966, p. 60-61)

Bernoulli sets out to show, then, that it is possible to determine a number of trials so that the empirical statistical frequency of an event will be within two bounds around the true probability—the more the observations, the tighter the bounds.

Bernoulli’s strategy is to ‘reduce’ the probabilistic problem to a problem stated in terms of the binomial expansion.132 The ‘reduction’ works as follows. Recalling that r is

the number of “fertile” or successful trials and s is the number of “sterile” or unsuccessful trials, let nt = n (r + s) be the total “number of observations taken.” This means that there

are n binomial experiments and that in each of those there are t trials of which r are

successes and s are failures. Now, expand the binomial (r + s)^(nt) and relate each of the

terms of the expansion divided by t^(nt) with the “expectation” [expectatio] or “probability”

[probabilitas] of a given ratio of successful and unsuccessful trials. That is, expand the binomial to get the expression:

r^(nt) + (nt/1) r^(nt-1) s + [nt(nt-1)/(1*2)] r^(nt-2) s^2 +

[nt(nt-1)(nt-2)/(1*2*3)] r^(nt-3) s^3 + … + (nt/1) r s^(nt-1) + s^(nt).

Then, on the basis of proposition XIII of the first part of the Ars Conjectandi, Bernoulli

can deduce that each of the terms of the expansion divided by t^(nt) gives the probability of a

given ratio of successes and failures. That is, (r^(nt) / t^(nt)) gives the probability that all the

trials are successes; [(nt/1) r^(nt-1) s] / t^(nt) gives the probability that all but one of the trials are successes, and so on. Consequently, eliminating the common denominator t^(nt), it is possible to identify the first term of the expansion, r^(nt), with the number of experiments

where there are only successful trials; the second term of the expansion, (nt/1) r^(nt-1) s,

with the number of experiments in which all the trials but one are successes; the third

term, [nt(nt-1)/(1*2)] r^(nt-2) s^2, with the number of experiments in which all but two

132 Here we see the influence that the existing knowledge in mathematics has on mathematical research. Newton’s proof of the binomial theorem enabled a more powerful strategy of proof by Bernoulli.

trials are successes; and so on, until the last term, s^(nt), is identified with the number of experiments in which all the trials are failures.

Having expressed all the possible outcomes in terms of the binomial expansion,

Bernoulli can now solve the probabilistic problem exclusively in terms of the expansion

and of some properties of infinite series.133 However, prior to detailing the ensuing

analytical solution, I want to characterize the nature of Bernoulli’s ‘reduction’ of the probabilistic problem so far. Bernoulli has created an isomorphism between the collection

of all the possible outcomes of the experiments, i.e. of what we call the ‘fundamental

probability set’, and the binomial expansion. All the possible outcomes of the nt

experiments are counted in the binomial expansion, and each of the terms of the

expansion represents the number of ways in which each possible combination of r

successes and s failures can occur. Moreover, the ratio of each of the terms to the

common denominator t^(nt) represents the probability of each of the possible events. A

simple example will clarify the matter. Suppose there is a two-sided die, with one white

and one black face. Let the experiment be to throw the die twice. Perform the experiment

twice. Now determine the probability of each of the possible events. In this case, let r =

number of “white” throws and s = number of “black” throws. The number of throws, or

trials, per experiment is t = 2, and the number of experiments, or double-throws of the

die, is n = 2. All the elements of the fundamental probability set and the probabilities of

all the possible events are represented in the following binomial expansion:

(r + s)^(nt) = (r + s)^(2*2) = (r + s)^4 = r^4 + 4 r^3 s + [(4*3)/(1*2)] r^2 s^2 + 4 r s^3 + s^4.

133 It is important to emphasize that Bernoulli carried out considerable original work on infinite series and that a treatise on the subject was included as an appendix to the Ars Conjectandi.

There are five terms in the expansion corresponding to five possible events, namely:

(1) The coefficient of r^4 indicates that there is 1 possible way to throw four whites;

(2) the coefficient of r^3 s indicates that there are 4 possible ways to throw three

whites and one black;

(3) the coefficient of r^2 s^2 indicates that there are 6 possible ways to throw two whites

and two blacks;

(4) the coefficient of r s^3 indicates that there are 4 possible ways to throw one white

and three blacks;

(5) and the coefficient of s^4 indicates that there is 1 possible way to throw four

blacks.

Since the total number of possible outcomes is t^(nt) = 2^4 = 16, the probability of each event

can be obtained by dividing the number of possible outcomes favoring each of the events

above by 16.
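
The correspondence between the terms of the expansion and the enumeration of equally possible cases can be verified mechanically. The following sketch, again only a modern illustration of the example just given, enumerates the sixteen outcomes of the four throws and recovers the coefficients 1, 4, 6, 4, 1 together with the associated probabilities.

    from itertools import product
    from collections import Counter

    # The 16 equally possible outcomes of nt = 4 throws of the two-sided die.
    outcomes = list(product("WB", repeat=4))
    counts = Counter(o.count("W") for o in outcomes)

    # Outcomes with 4, 3, 2, 1, 0 whites: the coefficients 1, 4, 6, 4, 1 of the
    # expansion of (r + s)^4, read off term by term.
    print([counts[k] for k in (4, 3, 2, 1, 0)])        # [1, 4, 6, 4, 1]
    # The corresponding probabilities, obtained by dividing by t^(nt) = 16.
    print([counts[k] / 16 for k in (4, 3, 2, 1, 0)])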

Now, historically Bernoulli is not the first to use the binomial expansion to

treat combinatorial problems in probability. However, he seems to have conceived of this

method on his own. Daston, for instance, writes that “[a]lthough much of Chapter 3 of

Part II of the Ars conjectandi duplicated the results of Pascal’s Traité du triangle

arithmétique (1665), Bernoulli appears not to have known of Pascal’s work. However,

Bernoulli produced a ‘table of combinations’ that is essentially the arithmetic triangle, and proceeded to investigate its ‘truly curious and surprising’ properties, including the familiar derivation of the coefficients of the binomial expansion” (Daston 1988, p. 235).

Moreover, Bernoulli’s analysis of the problem under discussion in terms of the binomial

expansion reveals a vigorous and original imagination.

In sum, the isomorphism between the elements and relations involved in the probabilistic problem and the elements and relations represented in the binomial expansion allows Bernoulli to treat the problem in terms of relevant properties of the expansion itself. In Peircean terms, thus far (i) Bernoulli has expressed a hypothesis in general terms by proposing the theorem, and (ii) he has created a concrete ‘diagram’ or mathematical icon to represent the situation described in the hypothesis. The binomial expansion is a ‘diagram’ of the fundamental probability set designed with the express purpose of exhibiting clearly the relations among elements of that set that are relevant to the problem at hand. The next stage in the reasoning is (iii) to experiment upon the

‘diagram’ by introducing suitable changes to the algebraic expression and exhibiting relations between the terms that compose it so as to derive the desired result.

From a Peircean perspective, I submit that Bernoulli proceeds to reasoning stage

(iii) by way of the following analysis.134 Rewrite the binomial expansion of (r + s)^(nt) as:

r^(nt) + (nt/1) r^(nt-1) s + … + L_n + … + M + … + R_n + … + (nt/1) r s^(nt-1) + s^(nt), where M is the largest term in the expansion, and L_n and R_n are the terms that are n places

to the left and to the right of M respectively. As I have already pointed out, this ‘diagram’

is an orderly schema of all the possible cases that make up the fundamental probability

set. But this recasting of the binomial expansion already poses an ancillary problem.

Bernoulli needs to show that the largest term M represents the number of possible

experimental outcomes that result in nr successes and ns failures. He proves this in

134 Bernoulli formally presents the proof by stating and proving five lemmas and then invoking them in the demonstration of the main theorem. But I submit that his actual inquiry did not follow this order. He rather first analyzed the main problem into ancillary problems that he proceeded to resolve in the lemmas.

lemma 3, with the support of lemmas 1 and 2. Thus, M represents the total number of possible cases in which the true ratio r:s is empirically observed. Bernoulli’s next step in his ‘diagrammatical experiment’ consists in resolving or literally breaking this schema into two component parts—for convenience of exposition, I will call them the ‘central’ and ‘peripheral’ parts. The ‘central’ part consists of the terms L_n, …, M, …, R_n. The

‘peripheral’ part consists of the terms r^(nt), (nt/1) r^(nt-1) s, …, L_(n-1), and R_(n+1), …, (nt/1) r s^(nt-1), s^(nt). Now, add all of the terms of the ‘central’ part, L_n + … + M + … + R_n, and all of

the terms of the ‘peripheral’ part, r^(nt) + (nt/1) r^(nt-1) s + … + L_(n-1) + R_(n+1) + … + (nt/1) r s^(nt-1) + s^(nt).

Now observe the result of the diagrammatical experiment so far (recall that

observation is stage (iv) in the Peircean model of mathematical reasoning). Bernoulli

already proved that M represents the number of possible cases that result in nr successes

and ns failures. Now, observe that since L_n and R_n are the terms that are n places to the

left and to the right of M respectively, they represent the number of cases in which either

nr + n or nr – n trials are “fertile”—while the rest of the trials are “sterile”—respectively.

Therefore, observe that the sum L_n + … + M + … + R_n represents the number of possible

cases in which no more than nr + n and no less than nr – n trials are successful, while the

sum r^(nt) + … + L_(n-1) + R_(n+1) + … + s^(nt) represents the number of possible cases in which either

more than nr + n or less than nr – n trials are successful. Recalling the isomorphism

already described, observe also that the former sum of ‘central’ terms represents the

number of possible cases in which the observed experimental outcomes approximate the

true ratio r : s, while the latter sum of ‘peripheral’ terms represents the number of possible

cases in which the observed experimental outcomes deviate from the true ratio r : s.

On the basis of all the foregoing experimentation and observation, we see that what Bernoulli needs to show is that n can be chosen so large that

(L_n + … + M + … + R_n) > c (r^(nt) + … + L_(n-1) + R_(n+1) + … + s^(nt)) for any desired c.

This analytical recasting of the problem again poses ancillary problems that need to be

resolved. First Bernoulli needs to show that in the expansion of a binomial of power nt, n

can be chosen so that the ratio of M to L_n and the ratio of M to R_n will be greater than any

given ratio. This he proves in lemma 4. Then Bernoulli needs to show that in the same

binomial expansion, n can be chosen so that the ratio of the sum of all the terms from M up

to and including L_n (or R_n) to the sum of all the terms beyond L_n (or R_n)—that is, the ratio of ‘central’ to ‘peripheral’ terms to the left or right of M—is greater than any given ratio. This he proves in lemma 5.

Both of these proofs reflect Bernoulli’s powerful ingenuity and deep and subtle knowledge of infinite series. By lemmas 4 and 5, then, Bernoulli can deduce that a power nt of the binomial r + s can be chosen so large that

(L_n + … + M + … + R_n) > c (r^(nt) + … + L_(n-1) + R_(n+1) + … + s^(nt)) for any c.

Observe that this solves the first part of the original two-fold problem, whose result is the law of large numbers.

Divide both sides of the inequality by the total number of possible outcomes, t^(nt). The

resulting inequality of probabilities,

[(L_n + … + M + … + R_n) / t^(nt)] > c [(r^(nt) + … + L_(n-1) + R_(n+1) + … + s^(nt)) / t^(nt)] for any c,

expresses the desired result, namely, that as the number of experiments increases, and

eventually as n → ∞, it will be more than c times more probable that the observed ratio of

successes to failures will be within the bounds (L_n / t^(nt)) and (R_n / t^(nt)) around the true

probability than outside those bounds. In other words, the number of observations can

always grow so large that it will be more probable that the empirically observed

frequencies will approximate rather than deviate from the a priori probability, where what “more probable,” “approximate,” and “deviate” mean can be specified with mathematical precision.
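
The behaviour of the ‘central’ and ‘peripheral’ sums can also be exhibited numerically. The sketch below is my own illustration rather than a reconstruction of Bernoulli’s lemmas; for an arbitrarily chosen ratio of fertile to sterile cases (r = 3, s = 2) it sums the 2n + 1 terms around the largest term M of the expansion of (r + s)^(nt) and compares them with the remaining terms, showing that the ratio of the two sums grows without bound as n increases, so that any desired c can be secured.

    from math import comb

    def central_to_peripheral(r, s, n):
        """Ratio of the sum of the 2n+1 'central' terms of (r + s)^(nt), taken
        around the largest term M (the term with nr successes and ns failures),
        to the sum of the remaining 'peripheral' terms, in exact integers."""
        t = r + s
        nt = n * t
        terms = [comb(nt, k) * r**(nt - k) * s**k for k in range(nt + 1)]
        centre = n * s                           # index of M: ns failures
        central = sum(terms[centre - n: centre + n + 1])
        peripheral = sum(terms) - central
        return central / peripheral

    for n in (5, 20, 80, 320):
        print(n, f"{central_to_peripheral(3, 2, n):.3g}")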

However, Bernoulli does not stop here. There is a second part to the original problem, namely, to find the n necessary for a desired level of approximation of the statistical estimate to the a priori probability. He lets R be equal to the ratio of the number of possible successful trials to the number of all possible trials. Since the power nt of the binomial represents the total number of experiments, it follows from the preceding result that so many observations can be made such that the sum of cases in which

[(nr – n) / nt] < R < [(nr + n) / nt], or equivalently,

[(r – 1) / t] < R < [(r + 1) / t], exceeds the sum of the other cases by more than c times. That is, n not only can grow sufficiently large but its value can also be determined with mathematical exactitude so that the statistical estimate will “approximate” the a priori probability to a precisely specified degree; in the foregoing notation, so that R will be c times more likely to fall within than outside the bounds [(r – 1) / t] and [(r + 1) / t]. At this point, Bernoulli’s demonstrative reasoning ends—the stages of analytical experimentation and observation have finally led to a full demonstration of the general theorem. He proceeds to apply his newly demonstrated result to some specific examples.

At this point, I should emphasize a crucial insight that I owe to Daston. She points out, almost in passing, that “Bernoulli’s method of approximating by inequalities the required number of observations nt…was inspired by Archimedes’s approximation of π”

(Daston 1988, p. 236). In defending his claim that the estimation of a priori probabilities by way of a posteriori frequencies can be shown sufficient for “moral certainty” even in situations where the number of possible cases involved is infinite, Bernoulli cites historical accomplishments in which mathematical approximations are sufficient for practical use. As his main example, he mentions that “the determinate ratio of the circumference of a circle to its diameter…cannot be expressed accurately except by the infinitely continued decimal places of Ludophus, but…nevertheless, [it] is bounded by

Archimedes, Metius, and Ludolphus himself within limits which are very sufficient to practical application” (Bernoulli 1966, p. 43). The proof strategy that Archimedes uses to bound the value of π provided Bernoulli with a plan of attack to demonstrate his famous theorem. A natural way to develop my present study would be to undertake a careful investigation of the nature of Archimedes’s reasoning to ascertain whether it can be aptly characterized as ‘experimental’, in the Peircean sense, and ‘analytical’, in the Cellucian sense, and to assess its heuristic impact on Bernoulli’s reasoning.135

I regret I cannot undertake this task in detail in this study. However, I do want to

make some pertinent remarks here. Archimedes approximates the value of π in

Proposition III of De Circuli Dimensione or Measurement of the Circle. Archimedes shows that the ratio of the circumference to the diameter of a circle is less than 3 1/7 but

135 There are at least two 17th Century Latin editions of Archimedes’s works. The first one is Archimedis opera quae extant, David Flurant de Rivualt (Ed.), Paris: Apud Claudium Morellum, 1615; the second is Archimedis opera, Isaac Barrow (Ed.), London: G. Godbid, 1675. I understand that there also is a 17th Century German edition, but I have not been able to find the citation at this time. Any one of these editions may have been used by Bernoulli. It would be interesting to find which one it was, perhaps from Bernoulli’s notebooks and archives at Basel. And the interest would not be only historical; it would be of philosophical relevance because the commentaries of the various editors are illuminating in different ways, and so each one may have influenced Bernoulli’s actual reasoning differently.

greater than 3 10/71. Historian of Mathematics Carl Boyer notes that this approximation is far more precise than those of the Egyptians and Babylonians (Boyer 1989, p. 142-

143). Archimedes’s solution consists of two parts, each concerned with finding one of the bounds. It is reasonable to claim, then, that this proposition was originally a problem to be resolved—namely, “approximate the value of π”136—that Archimedes ‘reduced’ it to

two problems—namely, “find a lower and an upper bound for π”—and that the formal

statement of the proposition and its impeccable, elegant proof is only the formal skeleton

revealing an involved analytical process. He finds the upper bound by inscribing a circle

within a regular hexagon, then doubling recursively the number of sides of the regular

polygon circumscribing the circle, up to a 96-sided polygon—and showing, through a chain of inequalities, that the ratio of the perimeter of the 96-sided polygon to the diameter is less than 3 1/7.

Invoking relevant theorems from Euclidean geometry, he can deduce that π is also less

than 3 1/7. Archimedes finds the lower bound in a similar fashion, this time starting with

a regular hexagon inscribed in a circle, next doubling recursively the number of sides of

the inscribed polygon successively up to a 96-sided polygon, and showing, through a similar chain of inequalities, that the ratio of the perimeter of the 96-sided polygon to the diameter is greater than 3 10/71. Again invoking Euclidean

theorems, he can deduce that π > 3 10/71. Thus, a series of carefully conceived and

calculated inequalities leads to the approximation, which Bernoulli considers sufficient for

practical use.
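Rendered anachronistically in modern arithmetic, the double recursion takes only a few lines. The sketch below is my own reconstruction, not Archimedes's procedure (he works with chains of inequalities between whole-number ratios rather than with square roots): for a circle of diameter 1 it doubles the number of sides of the circumscribed and inscribed polygons from 6 up to 96 and prints the two perimeters, which straddle π and end up below 3 1/7 and above 3 10/71 respectively.

import math

# Perimeters of the circumscribed (a) and inscribed (b) regular hexagons
# for a circle of diameter 1.
a = 2 * math.sqrt(3)          # circumscribed hexagon
b = 3.0                       # inscribed hexagon
sides = 6
while sides < 96:
    a = 2 * a * b / (a + b)   # harmonic mean: circumscribed polygon with doubled sides
    b = math.sqrt(a * b)      # geometric mean: inscribed polygon with doubled sides
    sides *= 2
print(sides, b, a)            # 96, about 3.14103 and 3.14271
print(3 + 10/71, 3 + 1/7)     # about 3.14085 and 3.14286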

This is a clear case in which a heuristic method, a plan of attack on a difficult

analytical problem, is suggested to a great mathematician by his knowledge of the history

136 We must keep in mind, of course, that ancient geometers never used our modern notation π to denote the ratio of the circumference to the diameter of the circle.

of his subject and what it reveals about methods of mathematical ‘experimentation’. It points to the close link between “deep” mathematical knowledge and effective analytical

‘hypothesis-making’. Cellucci speaks to this link when he writes, “every step of the process of reduction establishes new relations between the problem [under investigation] and existing knowledge, as it is necessary from the moment that resolving a problem generally requires us to transcend the problem’s own limits and to explore the relationship between it and other problems. Existing knowledge plays an essential role in the discovery of hypotheses. Naturally, it does not constitute a sufficient basis for finding new hypotheses because these hypotheses go beyond existing knowledge. But no new hypothesis can be found without starting from data, and the facts of existing knowledge—for example, the problems already resolved—are the data for finding hypotheses” (Cellucci 2002, p. 174). Consequently, Cellucci argues that mathematical inquiry cannot be carried out in closed systems. Mathematical problems arise in relation to other problems, often from other mathematical systems, and their analysis necessarily calls for knowledge of problems already resolved in other areas. Moreover, the whole of mathematical knowledge is in a continuous process of development and expansion, so that developments in one area may pose problems in another area and may in fact

“deepen” the scope of existing problems.

5.2.3 Pragmatic Upshot of the Historical Lessons: Towards a Logic of Mathematical Inquiry

For the best mathematicians, I think, the practice of “deepening” their knowledge in a variety of areas, and especially in the history of mathematics, is simply part of the logica utens that prepares them to make better and more effective ‘analytical hypotheses’ through a variety of heuristic methods. This is their practice and no prescriptive logic of inquiry is required to guide them. I want to emphasize, however, the upshot of the foregoing discussion for a logica docens. I think it is safe to claim that it is common among mathematicians, or at least it is the norm in programs of mathematical education, to regard the study of the history of mathematics as being entirely inconsequential to learning to conduct actual mathematical research. In my estimation, this attitude impoverishes the research abilities of students of mathematics. Their training ought to expose them to the reasoning of mathematicians throughout the history of mathematics, at least in some areas. This would not only deepen their existing knowledge, but also cultivate their analytical capacity and train them in heuristic methods. As a student of probability, for example, I had a very narrow view of the possible ways to undertake to prove the law of large numbers. It has been my study of the work of Bernoulli that has

“deepened” my understanding of the theorem; as a student, such an investigation could have also fostered my analytical ability and strengthened my grasp on a variety of heuristic methods. Likewise, contemporary students of differential and integral calculus would gain tremendous insight into the power of the analytical methods at their disposal by comparing them to those that Archimedes developed in order to study the properties of a wide range of curves, solids, and geometrical figures.

As I have already admitted, my aim here has not been to provide a comprehensive list of heuristic methods for analytical hypothesis-making. I have rather aimed at drawing some of these methods from a historical case study, thereby illustrating as well how a careful study of crucial historical discoveries may lead us to a better understanding of the heuristics of mathematical hypothesis-making. Summing up the case of Bernoulli, in his reasoning we again find an example of experimental analysis conducted by way of a variety of heuristic techniques, most notably a ‘reduction’ by way of an isomorphism— which is a highly formal species of ‘analogy’—, the ‘resolution’ or literal breaking apart of an algebraic array into component parts, and the subsequent series of algebraic

‘deductions’—formal, rule-guided operations that transform one algebraic expression into an equivalent one—involving inequalities between various relevant ratios. Other examples would reveal more techniques.

For instance, De Moivre’s gradual improvement on Bernoulli’s theorem, which eventually led him to the normal approximation of the binomial distribution for the case where p = 1/2, involved a heuristic method that Cellucci catalogues as ‘hybridization’.137

‘Hybridization’ is “the inference through which the properties of the object of a certain

mathematical domain are transferred to the objects of another domain, giving place to a

partial superposition of both domains” (Cellucci 2002, p. 285). A typical example may be

Descartes’s development of analytic geometry, in which both equations and curves are

hybrids that possess at once algebraic and geometrical properties. In the case of De

Moivre, he came to conceive of the binomial expansion not only as an algebraic

137 The identification of this method is due to Emily Grosholz. See Grosholz 1992 and Grosholz 2000b, p.88.

expression but as a curve. Stigler locates the first indication of this in De Moivre’s 1730

Miscellanea Analytica de Seriebus et Quadraturis (Stigler 1986, p. 76-77; see also

Pearson 1926). De Moivre writes, “If the terms of the binomial are thought of as set upright, equally spaced at right angles to and above a straight line, the extremities of the terms follow a curve. The curve so described has two inflection points, one on each side of the maximal term” (De Moivre 1730, quoted in Stigler 1986, p. 76-77). The image that arises is that of a curve that today we would recognize as the curve of a normal distribution. The central or maximal term is that which represents the most probable outcome; in the curve, it would be the peak. The terms that gradually move away from the central, maximal term in either direction represent the gradually less probable outcomes towards the tails of the curve. When De Moivre imaginatively conceives of the binomial expansion as a hybrid, inasmuch as it also represents a curve, he can investigate its properties as a curve, including its points of inflection. He finds that for the binomial distribution the inflection points are located at a distance (1/2)(n + 2)^(1/2) from the maximal term (Stigler 1986, p. 77; see De Moivre 1730, p. 109-110). Therefore, the

‘hybridization’ of the binomial expansion is the crucial heuristic step that eventually allowed De Moivre to find what we recognize as the first normal approximation to the binomial distribution. The examples could and ought to continue to expand our list of heuristic methods.
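De Moivre's hybrid reading of the binomial is easy to mimic numerically. The sketch below is my own illustration, not De Moivre's analysis; the choice n = 100 and the use of a discrete second difference are merely convenient stand-ins. It treats the symmetric binomial coefficients as ordinates of a curve, locates the point where the second difference changes sign (the discrete analogue of an inflection point), and compares its distance from the maximal term with the value (1/2)(n + 2)^(1/2) cited above.

import math

n = 100                                          # number of trials, with p = 1/2
f = [math.comb(n, k) for k in range(n + 1)]      # binomial terms as ordinates of a curve

def d2(k):
    # discrete second difference, a stand-in for the second derivative of the curve
    return f[k + 1] - 2 * f[k] + f[k - 1]

center = n // 2                                  # the maximal term
k = center
while d2(k) < 0:                                 # concave near the peak, convex in the tails
    k += 1
# interpolate the sign change to estimate where the hybrid curve inflects
x = (k - 1) + (-d2(k - 1)) / (d2(k) - d2(k - 1))
print(x - center)                                # about 5.06 for n = 100
print(0.5 * math.sqrt(n + 2))                    # De Moivre's distance, about 5.05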

For now, I want to close our discussion of Bernoulli’s “pure” mathematical reasoning by looking forward to the question of its “applied” scientific upshot. In her assessment of the upshot of Bernoulli’s mathematical theorem, Daston argues that

“mathematical conjectures about far more complex and interesting situations like human

disease and the weather became possible” because of the theorem (Daston 1988, p. 231).

Moreover, she writes that “Bernoulli also drew the attention of mathematicians to the relationship between probabilistic conjecture and inductive reasoning….Because

Bernoulli and his eighteenth-century successors equated the a priori probabilities with causes and the a posteriori observations with effects, his theorem became a tool for discovering the probability of causes from effects” (Daston 1988, p. 231-232). However, she charges that even though Bernoulli’s theorem assumes that the true a priori ratio r : s is known, he “did not hesitate to invert his method, without proof or further justification, to find the probability with which [a priori] r and s could be inferred from the observed ratio of fertile to sterile cases” (Daston 1988, p. 236). In other words, she claims that

Bernoulli tries to apply the mathematical result to empirical problems without further mathematical or logical warrant. This charge precisely raises the issue for the last part of my study of the logic of mathematical inquiry; namely, how is it logically justifiable to apply ‘ideal’ or ‘hypothetical’ mathematical theories to the scientific study of ‘actual’ events in nature?

Chapter 6

The Leibniz-Bernoulli Correspondence: The Abductive Warrant of Bernoulli’s Theorem

Between 1703 and 1704, G.W. Leibniz and Jacob Bernoulli sustained an

epistolary discussion regarding Bernoulli’s famous theorem of mathematical probability

now known as the first ‘law of large numbers.’ During the course of the correspondence,

Bernoulli describes to Leibniz the probabilistic problem that his theorem intends to

solve—briefly, the problem of justifying a method to estimate a posteriori, from the

observed results of experiments, the probabilities of events when those probabilities

cannot be calculated a priori, that is, prior to experimental observation.138 The ensuing

exchange of objections and replies elucidates the logical warrant underlying the

applicability of Bernoulli’s theorem in the statistical estimation of probabilities in the

natural and social sciences. The valid estimation of such probabilities would fulfill the

aim of Bernoulli’s ‘art of conjecturing’, which he describes as follows:

We are said to know [scire] or to understand [intelligere] those things which are certain and beyond doubt; all other things we are said merely to conjecture [conjicere] or guess about [opinari].

To conjecture about something is to measure [metiri] its probability; and therefore, the art of conjecturing or the stochastic art [Ars Conjectandi sive Stochastice] is defined by us as the art of measuring as exactly as possible the probabilities of things with this in mind: that in our decisions or actions we may be able always to choose or to follow what has been perceived as being superior, more advantageous, safer, or better considered; in this alone lies all the wisdom of the philosopher and all the discretion of the statesman. (Bernoulli 1966, p. 13)

138 From this point onwards, ‘a priori’ will mean ‘prior to experimental observation’ and ‘a posteriori’ will mean ‘posterior to experimental observation’.

Here I argue that Bernoulli’s reasoning in justifying his proposed method is a case of abductive inference. To this end, I first recount in detail the key elements of the correspondence for our present purposes; second, I discuss how Bernoulli’s reasoning could be understood as a case of inductive inference to the best explanation; third, I argue that his reasoning could instead be understood as a case of abductive inference; and finally, I compare both accounts and argue that the abductive one provides a better logical model than the inductive account for understanding Bernoulli’s ampliative inference.

The general upshot of my discussion is (i) to distinguish between inductive inference to the best explanation and abduction as competing accounts of ampliative inference in science and mathematics; (ii) to reclassify an apt model of the inference to the best explanation as abductive and not inductive; and (iii) to examine in what way abductive inference may warrant the application of mathematical theory to natural and social sciences.

6.1 The Correspondence

On October 3, 1703, Bernoulli wrote a letter to Leibniz in which he describes the central problem of probability that occupies him (see Leibniz 1855, p. 77-78). This is the problem of finding a good method to estimate the probability of an event a posteriori from the results of experiments when the probability of that event cannot be determined a priori. More specifically, Bernoulli is concerned with the problem that arises when we do not know a priori the count of favorable and unfavorable possible outcomes in order to

calculate the probability of an event and we must therefore find a way to estimate the probability a posteriori on the basis of observed experiments where the event either occurs or fails to occur. From the outset, Bernoulli describes this as an important problem for the application of mathematical probability—or of the ‘doctrine of chances’—to natural and social sciences, or, as he puts it, to “civil, moral, and economic affairs”

(Bernoulli 1966, p. 68). In fact, Bernoulli compares the application of mathematical probability to games of chance and to life expectancy when he presents the problem to

Leibniz as follows:

[I]t is a known fact that the probability of any event depends on the number of possible outcomes with which the event can or cannot happen; and so, it occurred to me to ask why, for example, do we know with how much greater probability a seven rather than an eight will fall when we roll a pair of dice, and why indeed do we not know how much more probable it is for a young man of twenty years to survive an old man of sixty years than for an old man of sixty years to survive a young man of twenty years; this is the point: we know the number of possible ways in which a seven and in which an eight can fall when rolling dice, but we do not know the number of possible ways which prevail in summoning a young man to die before an old man, and which prevail in summoning an old man to die before a young man. I began to inquire whether what is hidden from us by chance a priori can at least be known by us a posteriori from an occurrence observed many times in similar cases—i.e., from an experiment performed on many pairs of young and old men. For had I observed it to have happened that a young man outlived his respective old man in one thousand cases, for example, and to have happened otherwise only five hundred times, I could safely enough conclude that it is twice as probable that a young man outlives an old man as it is that the latter outlives the former. (1966, pp. 68-70; emphasis mine)

So far, Bernoulli characterizes the problem as one of finding a warranted way to estimate unknown a priori probabilities of events via the observed relative frequencies of those events. Estimating probabilities by way of observed frequencies may seem like a natural way to proceed, but showing that it is a reasonable method requires mathematical, logical, and epistemological warrant. This warrant is what Bernoulli seeks. Bernoulli’s theorem

provides the mathematical warrant. Below I argue that the logic of abduction ‘warrants’ the plausibility of the method.139

Bernoulli proceeds to describe the problem to Leibniz in the mathematical terms

that his theorem addresses. Although the mathematical problem and the mathematical reasoning that Bernoulli deploys in solving the problem are not our main concern here, it is illuminating for an account of the logic of mathematical inquiry to observe how

Bernoulli presents the problem to his correspondent: “Moreover, although—and this is amazing—even the stupidest man knows, by some instinct of nature per se and by no

previous instruction, that the more observations there are, the less danger there is in straying from the mark, it requires not at all ordinary research to demonstrate this fact accurately and geometrically” (Bernoulli 1966, p. 70). Bernoulli poses an analytical

problem.140 By means of what he calls a ‘natural instinct’, we know that the more experimental observations we have, the more our a posteriori estimate of probability will approach the true unknown a priori probability of an event. This result—known by instinct—poses a problem, namely, a proposition that stands in need of a mathematical

139 In this chapter, I leave the sense of ‘warrant’ sufficiently open so as not to reduce it to a formal, deductivist sense. As I will discuss in section 6.3 below, ultimately the ‘warrant’ that Bernoulli’s reasoning attains is not the formal, rigid sense of rendering a belief absolutely certain, but the Peircean sense of rendering a belief ‘reasonable’ for appropriate action. In the case of scientific inquiry, this appropriate action will be the experimental testing of a conjecture. 140 By ‘analysis’ I mean, as in previous chapters, the method of mathematical inquiry which consists in searching—by way of various species of abduction, induction, and analogy, including abstraction, generalization, particularization, reduction, diagramming, and so on—for the hypotheses that lead to the solution of a proposed problem (see Cellucci 2002). As I have argued, the analytical situation is best characterized as follows: in the course of continuous, open-ended research, we come to frame a problem that would lead to a plausible mathematical result. Next, we engage in a process of mathematical analysis to demonstrate the plausible result; throughout this process, new abductions, analogies, inductions, and so on, provide the explanatory hypotheses that eventually lead to the deductive or ‘theorematic’ demonstration of the result.

demonstration. As I discussed in section 5.2.2, in contemporary terms the mathematical result in need of demonstration is the following:

Let p be the probability of a successful event E being the outcome on any chance

experimental trial. Let n be the number of experimental trials, x be the number of

successes in n trials, and sn = x/n be the proportion of successes in n trials.

Bernoulli wants to justify mathematically that sn is a good estimate of p by

showing that for any small positive number ε, the probability P( | p - sn | < ε ) → 1

as n → ∞.141

Bernoulli continues his letter by further describing the mathematical problem that he

must solve in order to justify the a posteriori estimation of probabilities. He writes, “in

addition, it must be inquired whether the probability of an accurate ratio increases

steadily as the number of observations grows, so that finally the probability that I have

found the true ratio rather than a false ratio exceeds any given probability; or whether

each problem, so to speak, has an asymptote—that is, whether I shall finally reach some

level of probability beyond which I cannot be more certain that I have detected the true

ratio” (1966, p. 70). Continuing with our foregoing notation, we may state the problem as

follows. Show that n may be specified such that, for any given large positive number c,

(1) P( | p – (x/n) | ≤ ε ) > c P( | p – (x/n) | > ε ), or equivalently,

(2) P( | p – sn | ≤ ε ) > cP( | p – sn | > ε ) (see Stigler 1986, p. 66).

Bernoulli states the mathematical upshot of his yet unpublished solution to this

problem in this way: “For if the latter is true [namely, that the accuracy of experimental

141 See Hacking 1971a, p. 221-222. Again, I am following Hacking’s notation in order to engage his discussion of the problem more easily later.

estimation of probability reaches an asymptote], we will be done with our attempt at finding out the number of possible outcomes through experiments; if the former is true, we will investigate the ratio between the numbers of possible outcomes a posteriori with as much certainty as if it were known to us a priori. And I have found the former condition is indeed the case; whence I can now determine how many trials must be set up so that it will be a hundred, a thousand, ten thousand, etc., times more probable (and finally, so that it will be morally certain) that the ratio between the numbers of possible outcomes which I obtain in this way is legitimate and genuine” (1966, p. 70-71). In short, in order to justify the legitimacy and genuineness of estimating probabilities a posteriori

Bernoulli must prove two results, namely (i) that the probability of an accurate estimate increases steadily, beyond any given level, as the number of observed outcomes increases and (ii) that the number of experimental observations may be specified so that the probability of an accurate estimate is greater than any given probability.142

Bernoulli concludes by restating the practical upshot of successfully solving this

mathematical problem: “The following suffices for practice in civil life: to formulate our

conjectures in any situation that may occur no less scientifically than in games of chance;

I think that all the wisdom of the politician lies in this alone” (1966, p. 71). Bernoulli thus

emphasizes that his goal is to provide solid mathematical foundations for an art of

conjecturing; that is, for an art that relies on the a posteriori estimation of relative

probabilities of events from observed outcomes in natural and social matters, an art that is to guide decision making on the basis of probabilistic weighing of empirical evidence.

142 ‘Accuracy’ here means simply that the difference between p and sn is small.

Leibniz, in turn, raises a series of objections to Bernoulli’s announced mathematical results. On December 3, 1703, Leibniz recapitulates Bernoulli’s problem:

“When we estimate empirically, by means of experiments, the probabilities of successes, you ask whether a perfect estimation can be finally obtained in this manner. You write that you have found this to be so” (1966, p. 72). But Leibniz immediately expresses the main reason for his objection: “There appears to me to be a difficulty in this conclusion: that happenings which depend upon an infinite number of cases cannot be determined by a finite number of experiments” (1966, p. 72; emphasis mine). We may elucidate the meaning of Leibniz’s objection by considering the two different arguments that he puts forth to substantiate it. Leibniz’s first argument is the following: “[I]ndeed, nature has her own habits, born from the return of causes, but only ‘in general.’ And so, who will say whether a subsequent experiment will not stray somewhat from the rule of all the preceding experiments, because of the very mutabilities of things? New diseases continuously inundate the human race, but if you had performed as many experiments as you please on the nature of deaths, you have not on that account set up the boundaries of the world so that it cannot change in the future” (1966, p. 72-73).

Isaac Todhunter (1865) and F. N. David (1962, p. 133) interpret this passage to mean that Leibniz is objecting to Bernoulli’s inverse use of the theorem. Briefly, the problem of ‘inverse probability’ consists in using Bernoulli’s theorem—where the observed relative frequencies of events are shown mathematically to approximate their a priori probabilities, on the supposition that the a priori probabilities are known in advance—for the purpose of estimating probabilities a posteriori in situations where only the observed sample frequencies are given. The inverse use is fallacious if, without

further mathematical or logical warrant, one simply takes the observed long-run sample frequencies to be estimates of the probabilities. Todhunter, for example, makes reference to one of the applications of the theorem that Bernoulli himself discusses in the Ars

Conjectandi, namely, that of the estimation of the ratio of white balls to black balls in an urn (see

Bernoulli 1966, p. 64-65). According to Todhunter, Bernoulli proposed to use his theorem in an inverse way. The British historian writes: “Suppose that…we do not know anything beforehand of the ratio of the white balls to the black; but that we have made a large number of drawings, and have obtained a white ball R times, and a black ball S times: then according to James Bernoulli we are to infer that the ratio of the white ball to the black balls in the urn is approximately R/S” (Todhunter 1865, p. 73). It is strange, however, that Todhunter thinks that the Bernoulli-Leibniz correspondence is explicitly about the problem of inverse probability, because he also writes: “To determine the precise numerical estimate of this inference requires further investigation….[T]his has been done in two ways, by an inversion of James Bernoulli’s theorem, or by the aid of another theorem called Bayes theorem; the results approximately agree” (Todhunter

1865, p. 73). I find the ascription of the inverse probability controversy to the Bernoulli-

Leibniz correspondence to be anachronistic. This problem will only be raised and addressed explicitly later in the history of mathematical probability theory, most notably with the publication of Thomas Bayes’s “An essay towards solving a problem in the doctrine of chances” in 1764. Be that as it may, Todhunter continues by claiming that the inverse use of Bernoulli’s theorem is “that which Leibnitz found it difficult to admit, and which James Bernoulli maintained against him” (Todhunter 1865, p. 73). F. N. David, in turn, observes that Bernoulli held back publication of the Ars Conjectandi “while he

pondered over the implications of the theorem which now goes by his name. He had considerable correspondence with Leibnitz, possibly the first disputation on inverse probability which is recorded. James [Bernoulli] stuck stubbornly to his point of view, but may have had inward doubts because he did not publish” (David 1962, p. 133). It will be my contention that Bernoulli’s defense of the applicability of his theorem for the estimation of probabilities is not “stubborn;” it is rather a reasonable defense whose logic we can reconstruct and analyze. Nevertheless, I think that David’s interpretation of the correspondence is more accurate than Todhunter’s—the correspondence reveals that

Bernoulli is struggling with the logical warrant for the applicability of his theorem to problems in the social and natural sciences, even if the logical difficulties are not raised in the more precise, but also narrower, terms of the problem of inverse probability.

A. P. Dempster thinks that Leibniz is not objecting to Bernoulli’s inverse use of the theorem but is rather presenting “a general argument against any method of trying to determine the characteristics of an infinite population from a finite sample” (Dempster in

“Preface” to Bernoulli 1966, p. 5). Ian Hacking, in turn, claims that Leibniz is mainly

“sceptical about a posteriori statistical inference” and that “he cannot persuade himself that there is a [Fundamental Probability Set, that is, a set of disjoint alternative events of equal possibility] for diseases” as there is for “dicing and urns” (Hacking 1971a, p. 219-

220). According to Hacking, in other words, Leibniz is concerned that, in the case of diseases, the number of disjoint, alternative, equally possible outcomes is infinite; but since the classical definition of probability depends on the notion of a set of disjoint alternative events of equal possibility, the probabilities of events concerning diseases cannot be estimated as they are undefined. Under this interpretation, Leibniz is assuming

that the number of disjoint alternative events of equal possibility must be finite; otherwise, the probability ratios cannot be estimated.

In general, I think Leibniz is indeed concerned with the ‘empirical estimation’ of probabilities, that is, with a posteriori statistical inference. However, he is not precisely raising the problem of inverse probability. Leibniz’s concerns are broader. He is concerned here with two types of issues—metaphysical and logical. There are two interrelated metaphysical issues, namely, that games of chance and natural happenings are different sorts of events—the former sort have finite possible cases, the latter have infinite possible cases—and that the events of nature are ‘mutable’. Leibniz questions whether there are any stable, non-mutable counts of possible outcomes in natural happenings, such as diseases, so that experimental observations can help us to establish the probabilities of natural events. More strongly, Leibniz perhaps is asking if the mutability of things in nature does not imply that empirical estimation on the basis of experiments cannot help us to learn the ‘rules’ of natural happenings, where ‘rules’ in this case are probabilities. The objection is metaphysical, as it is based on the mutability of natural events and the infinity of possible cases, but this interpretation gains in credibility from an understanding of Leibniz as a rationalist epistemologist who believes that our knowledge of natural laws or ‘rules’ is a priori and stems from reason, not from experience. This is the general skepticism about a posteriori learning that Hacking attributes to Leibniz. Be that as it may, Leibniz thinks that the metaphysical infinity and mutability of natural happenings, such as the progress or change of diseases, make the a posteriori estimation of probabilities of events, such as death from suffering a disease, impossible.

On April 20, 1704, Bernoulli makes two replies to the foregoing objections. Both illustrate how Bernoulli’s mathematical reasoning tries to undermine the mathematical upshot of Leibniz’s metaphysical concerns. He begins by clarifying the mathematical result of his theorem through an example of drawing pebbles, independently and with replacement, from an urn containing twice as many white pebbles as black ones. The problem is to determine the unknown ratio of white to black pebbles by experiment.

Bernoulli writes, “I claim (assuming that you have two estimates of the two-to-one ratio, though quite close to one another, different, one being larger, the other being smaller— say 201:100 and 199:100) that I can determine scientifically the necessary number of observations so that with ten, a hundred, a thousand, etc. times more probability, the ratio of the number of drawings in which you choose a white pebble to the number of drawings in which you choose a black pebble will fall within, rather than outside of, these limits of the two-to-one ratio: 201:100 and 199:100; and so I claim that you can be morally certain the ratio obtained by experiment will come as close as you please to the true two-to-one ratio” (Bernoulli 1966, p. 75-76). This is a restatement of what the contemporary mathematical expression (1) above describes for p = 2/3, ε = 1/300, and c = 10, 100, 1000, etc.
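To get a feel for what it takes to “determine scientifically the necessary number of observations,” the sketch below, which is my own and doubly anachronistic (it uses the normal approximation to the binomial rather than Bernoulli's far more conservative combinatorial bounds), searches for a number of drawings n at which the probability that the observed proportion falls within the stated limits exceeds c/(c + 1), that is, at which falling within the limits is more than c times as probable as falling outside them.

import math

p, eps, c = 2/3, 1/300, 1000       # the urn example as reconstructed above
target = c / (c + 1)               # "c times more probable" to fall within than outside

def prob_within(n):
    # normal approximation to the binomial: P(|x/n - p| <= eps) for n drawings
    sigma = math.sqrt(p * (1 - p) / n)
    return math.erf(eps / (sigma * math.sqrt(2)))

n = 1000
while prob_within(n) <= target:
    n += 1000
print(n)                           # roughly 2.2 * 10**5 drawings under this approximation

The particular number returned matters less here than the structure of the requirement itself.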

Bernoulli then draws an analogy between an urn containing pebbles and a human body containing diseases: “But if now in place of the urn you substitute the human body of an old or a young man, the human body which contains the tinder of sicknesses within itself as the urn contains pebbles, you can determine in the same way how much nearer to death one is than the other” (Bernoulli 1966, p. 76). Bernoulli suggests that we measure the ratio of healthy to sick parts in the bodies of the old and young men and that we take

their respective ratios as the relative measure of the men’s respective proximity to death.

I have already pointed out in section 5.1.2 that Bernoulli grants that urns and human bodies are different types of things, just as games of chance and natural happenings are different types of events; however, he claims that in mathematical probability we can treat them analogously, and analogy is one of the fundamental methods mathematicians employ to resolve analytical problems. This is his first argument for rejecting the mathematical cogency of Leibniz’s metaphysical misgivings.143 Bernoulli strengthens his

mathematical argument by appealing to the notion of a limit as a ratio of two infinite

quantities: “It does no good to say that the number of sicknesses to which each is exposed

is infinite; for let us grant this; it is nevertheless known that there are levels of infinity,

and the ratio of one infinity to another infinity is still a finite number, and can be

expressed either precisely or sufficiently precisely for practical use” (1966, p. 76).

Bernoulli is again making an analytical move: he brings a concept from another field of

mathematics—namely, “the newly discovered fact that π is the limit of a sequence of

ratios” (Hacking 1971a, p. 220)—to bear on a conceptual difficulty in mathematical

probability.

Bernoulli’s second reply on April 20, 1704 regards the ‘mutability of things’. He

admits that natural events, such as diseases, change and mutate over time. But this need

not impede our a posteriori estimation of probabilities; it only requires that we update our

143 According to Hacking, Bernoulli’s analogy relies on “an underlying isomorphism of physical symmetries and propensities” (Hacking 1971a, p. 220). I will return to discuss Bernoulli’s analogy between balls in an urn and parts in the human body in section 8.1 below. There I will argue that the analogy implies that for Bernoulli a priori probabilities are ‘real propensities’ both in games of chance and in natural events.

empirical data so that our estimates are useful: “If sicknesses are multiplied by the passage of time, then new observations, in any case, must be set up” (1966, p. 76-77). In contemporary statistical terms, this is simply an admission that spatio-temporal extrapolation from an observed sample will result in erroneous statistical inference. In sum, Bernoulli responds via mathematical analysis to the metaphysical difficulties that infinity and mutability pose for his method of a posteriori estimation of probabilities.144

Having discussed Leibniz’s metaphysical concerns from December 3, 1703, let us turn now to the logical issue that he raises—namely, whether from finite evidence we can infer anything about an infinite class of events. Leibniz elaborates this logical concern

more strongly in his second argument substantiating his objection to Bernoulli. He writes:

When we investigate the path of a comet from any number of observations, we suppose that it is either a conic curve or another kind of simple curve. Given any number of points, an infinite number of curves can be found passing through them. Thus I show the following: I postulate (and this can be demonstrated) that given any number of points, some regular curve can be found passing through these points. Let it be given that this curve has been found, and call it “A.” Now, let another point be taken lying between the points given but outside of this curve; let a curve pass through this new point and the points given originally, according to the above postulate: this curve must be different from the first curve, but nevertheless it passes through the same given points through which the first curve passes. And since a point can be varied an infinite number of times, there will also be an infinite number of these and other possible curves. Moreover, observed outcomes can be compared with these points, where the fixed underlying outcomes or their estimates inferred from observed outcomes, can be compared with the model curve. It may be added that, although a perfect estimation cannot be had empirically, an empirical estimate would nonetheless be useful and sufficient in practice. (1966, p. 73-74)

The crux of Leibniz’s argument is that, for any set of observed points, there are infinitely many curves that pass through those points; thus, from the observed points we cannot

144 From a Peircean standpoint, this analysis should also lead to a reorientation of the metaphysical views that gave rise to the objections in the first place.

estimate ‘perfectly’—that is, with ‘moral certainty’—a model curve.145 When the points, for example, represent the observed positions of a comet, we can estimate infinitely many paths for the comet that fit the observed positions. The upshot is that from finite empirical evidence we cannot infer with certainty any one hypothesis when infinitely many arbitrary hypotheses are compatible with the evidence. The logical stakes of Leibniz’s objection reach beyond the estimation of curves from observed data points or the estimation of probabilities from observed frequencies of events. Leibniz questions the very process of inferring a hypothesis from empirical data—from his perspective, there are no sound logical grounds for choosing one hypothesis rather than another. He grants that an empirical hypothesis, such as an inferred curve or an estimated probability, would be “useful and sufficient in practice,” that is, that it would suffice to justify an action in empirical matters. However, the logical basis for the action does not attain the level of certainty. In so far as we are concerned with the logical ‘perfection’, or certainty, of the hypothesis, the inference is arbitrary and thus highly uncertain.146

145 The notion of ‘moral certainty’ appears throughout the correspondence. The issue is whether a posteriori statistical frequencies can yield ‘moral certainty’, as Bernoulli seeks to prove with his theorem. According to the Oxford English Dictionary, ‘moral certainty’ means “the quality or state of being subjectively certain; assurance, confidence; absence of doubt or hesitation.” In the context of the correspondence, then, ‘moral certainty’ means the state of being (subjectively) certain, especially so as to provide a perfect warrant for practical action. In the broader context of seventeenth-century epistemology, however, the term ‘moral certainty’ is equivocal. Barbara Shapiro, for instance, shows that in seventeenth-century England the term had various meanings ranging from ‘doubtful assent’ to ‘sufficient assurance’ or ‘undoubted assent’ to a proposition on the basis of various kinds of evidence, including sense experience, testimony, and persuasive arguments. Importantly, her exposition implies that ‘moral certainty’ was not circumscribed to the realm of subjective certainty but could also be inter-subjective, as in the case of the public or communal evaluation of the evidence substantiating a statement or proposition in matters of religion or law. This last observation might imply for us that the aspiration to attain moral certainty on the basis of Bernoulli’s ‘art of conjecturing’ may be the aim not of an individual inquirer but of an entire community of inquirers. See Shapiro 1983, p. 27-37. 146 In fact, for Leibniz there seems to be no logical status for a belief between certainty and non-knowledge.
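Leibniz's curve argument is easy to reproduce with polynomial stand-ins for his 'regular curves'. In the sketch below, which is my own illustration with made-up data points, the unique cubic through four observed points plays the role of the curve A in the quoted passage; adding a fifth point that lies off A and interpolating again yields a different curve through the very same four points, and the construction can evidently be repeated without end.

import numpy as np

# Four "observed positions" (made-up data for illustration)
xs = np.array([0.0, 1.0, 2.0, 3.0])
ys = np.array([1.0, 2.0, 5.0, 10.0])

# Curve A: the unique polynomial of degree at most 3 through the four points
A = np.polyfit(xs, ys, 3)

# A new point lying between the given ones but off curve A
x5, y5 = 1.5, np.polyval(A, 1.5) + 0.5

# Curve B: a different polynomial passing through all five points
B = np.polyfit(np.append(xs, x5), np.append(ys, y5), 4)

# Both curves reproduce the original four observations (up to rounding)
print(np.polyval(A, xs))
print(np.polyval(B, xs))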

In logical terms, we can understand Leibniz’s objection to be directed against simple enumerative induction. That mode of inference will not get us on the way to justified certain knowledge of planetary paths or probabilities of events. Our inferential process must be different since, in the case of the path of a comet for example, even before we infer a representative curve, we already presuppose that the path is represented by a simple curve. Thus, some non-inductive knowledge guides from the outset the estimation of the path. But Leibniz’s concern is even deeper. According to Hacking, in the case of a posteriori estimation of probabilities, Leibniz’s point is not simply that “any number of statistical hypotheses are logically consistent with a finite experimental segment [i.e. a finite number of trials]. The point is far stronger. Bernoulli shows that when the observed proportion of heads in n tosses is sn, there would be a very good

probability of getting sn if the true unknown probability of heads were itself close to sn.

Apparently this is a good reason for estimating the unknown p around sn. But there are infinitely many arbitrary hypotheses on which sn would be just as probable” (1975, p.

164). In other words, we observe the proportion of occurrence of an event to be sn.

According to Bernoulli, this provides us good reason to hypothesize that the true unknown probability p of the event is close to sn and therefore to use sn as an estimate of

the unknown p. Leibniz objects that there are infinitely many other arbitrary hypotheses that would make the observed proportion sn just as probable. So what is the reason for inferring that the true probability p is close to sn? Bernoulli’s reply will be: “When in

doubt, choose the simplest hypothesis” (Hacking 1975, p. 164). Far more than a logically naïve appeal to simplicity is involved in Bernoulli’s reply and, as I previously noted,

Leibniz will not accept an appeal to simple enumerative induction to warrant Bernoulli’s inference. So let us turn to Bernoulli’s response to elucidate his inferential reasoning.

In his April 20, 1704, letter to Leibniz, Bernoulli writes:

The example of investigating the trajectory of a comet from several of its observed positions…I would never use it to demonstrate a proposition: although, in a limited way I can find an application, since it cannot be denied that if five points have been observed, all of which are perceived to lie along a parabola, the notion of a parabola will be stronger than if only four points have been observed: for although there are an infinite number of curves which may pass through five points, there is nevertheless beyond this infinite number another infinite number—rather, an infinitely times more infinite number—of curves which may pass through only the first four points and not through the fifth point, all of which are excluded by this fifth observation. And yet I admit that every conjecture which is deduced by observations of this sort would be quite flimsy and uncertain if it were not conceded that the curve sought is one of the simple curves; this indeed seems quite correct to me since we see everywhere that nature follows the simplest paths. (1966, p. 77-78)

Bernoulli’s reply is two-fold. First, he argues that although there are infinitely many curves that fit a given number of observed points, additional observations increasingly help us to reduce the number of possible curves that fit the data. At first glance, this may

appear to be an argument for increasing inductive probability: that is, the greater the

number of experimental trials, the better our empirical estimates of probabilities, and therefore the more reason we have to infer the true unknown probabilities from the observed proportions. However, even though this is compatible with one of the main results that Bernoulli proves, this is not his point in responding to Leibniz. The general point is rather that additional evidence helps us to reduce significantly the number of hypotheses that we may infer from the data. Additional evidence thus significantly constrains the set of possible hypotheses that we may consider. In short, Bernoulli does not argue that additional evidence increases the inductive probability of a hypothesis but

rather that it delimits the set of possible hypotheses that we may infer.147 Second,

Bernoulli concedes that in order to make a strong and practically certain—as opposed to

‘flimsy and uncertain’—conjecture about the curve that represents the path of a comet, on the basis of a finite number of empirical observations, we must assume that a simple curve represents the path of the comet. But he thinks that this assumption is sound since we perceive simplicity everywhere in nature.148

The question that I will take up in the rest of this chapter is what kind of logical

process produces hypotheses of the kind that Bernoulli formulates. This is a question of

the logic of discovery that is at work in Bernoulli’s reasoning; in particular, of the

reasoning that warrants the application of ideal mathematical systems to actual scientific

problems. My aim is to show that there is at work a logic of ampliative inference and to

specify its form. Hacking’s interpretation of the correspondence suggests that Bernoulli

makes an inference to the best explanation when he proposes that it is the unknown probability p being close to sn that explains the observed proportion sn (see Hacking,

1971a, p. 219-221, 228-229).149 I want to argue instead that Bernoulli makes an abductive

inference. Next, I expound Bernoulli’s reasoning as an inductive inference to the best explanation in order to contrast this view later with my own interpretation of Bernoulli’s reasoning as a case of abduction.

147 As I will argue in section 6.3.1, additional evidence constrains the set of plausible abductive hypotheses that we may formulate. 148 In the coming sections, I will consider what warrant, if any, there is for this principle of simplicity, and whether this principle guides an inference to the best explanation or an abduction. 149 Hacking’s claim is more general, as we will see later.

6.2 Bernoulli’s Hypothesis as an Inference to the Best Explanation

Let us recount the inferential situation that interests us from the Leibniz –

Bernoulli correspondence. Suppose we observe empirically the relative frequency of an event to be sn, but the a priori probability p is unknown. Bernoulli proposes, as a reason

to estimate p by means of sn, the hypothesis Hp: ‘the true unknown event probability p is

close to sn’. Leibniz objects that there are infinitely many other arbitrary hypotheses that

are compatible with our observation sn. Bernoulli replies that we should choose Hp since it is the simplest hypothesis and since we observe nature everywhere to choose the simplest paths. In this section, I take up Hacking’s suggestion that Bernoulli makes an inference to the best explanation and I develop this interpretation carefully.

Let us start by considering Gilbert Harman’s model of the inference to the best explanation (Harman 1965).150 Harman proposes it as the fundamental form of non-

deductive inference—a form that is often warranted in cases where, for example, simple

enumerative induction is not and that, moreover, includes enumerative induction as one

of its special cases. He initially writes that in making an inference to the best explanation,

“one infers, from the fact that a certain hypothesis would explain the evidence, to the

truth of the hypothesis” (Harman 1965, p. 89). This statement, however, is too strong. In

the case of Bernoulli, it would mean that he infers, from the fact that Hp would explain sn, the conclusion that Hp is true. However, as I will argue in section 6.3 below, it is more

accurate to say that when we initially infer a hypothesis, we infer it to be plausible; only

150 I start with Harman’s model because it seems to be the first widely acknowledged exposition of the inference to the best explanation in the recent surge of philosophical interest that it has generated over the last 40 years.

later do we infer it to be true provided it withstands experimental tests or the course of further experience. Moreover, Harman’s initial description tells us that we infer the truth of any hypothesis that would explain the data. Consider Hq1: ‘we have an imperfect

capacity of observation that makes what happens with certainty, under determining

conditions, seem to us to happen only in sn proportion of cases.’ Or consider Hq2: ‘an evil demon makes what happens with certainty, under determining conditions, seem to us to happen only in sn proportion of cases.’ These hypotheses would, in some contexts,

explain the evidence. But we would not immediately infer them to be true nor would we

infer just any hypothesis that would be explanatory.

Harman, however, acknowledges that there may be several candidate explanatory

hypotheses for the evidence and that we need to reject all alternative hypotheses before

making a warranted inference. So he refines his description of inference to the best

explanation: “[O]ne infers, from the premise that a given hypothesis would provide a

‘better’ explanation for the evidence than would any other hypothesis, to the conclusion

that the given hypothesis is true” (Harman 1965, p. 89). Following Harman, we would

describe Bernoulli’s inference as follows:

Premise: The hypothesis Hp that ‘unknown p is close to sn’ would provide a

“better” explanation for the observed sn than would any other hypothesis.

Conclusion: Hp is true.

This characterization points to at least three relevant features of the inference to hypotheses. First, it acknowledges that in science, when we have some given empirical data, we formulate several alternative hypotheses that are compatible with it; we do not simply formulate one such hypothesis and stop there. Although Bernoulli does not

mention, in the correspondence, the alternative hypotheses he could have pondered that would explain sn, he at least tacitly admits that there are possible alternative hypotheses

to Hp since he does not reply to Leibniz that Hp is the only hypothesis compatible with sn but that it is the simplest one. Second, Harman’s description points out that the hypotheses that we infer are explanatory. This is crucial. Bernoulli may grant to Leibniz that there are several, even infinitely many, hypotheses that are compatible with the observed sn. But not all of the alternative hypotheses will be explanatory. This of course

raises the question of what is an explanatory hypothesis, or more simply, what is an

explanation. Harman does not undertake to answer this question, very extensive and

difficult in itself, but at least he points out, third, that we infer hypotheses that explain the

evidence better than any of the alternatives. This opens up for us the possibility of

arguing that although we may not be able to determine absolutely what an explanation is,

we can at least make relative judgments regarding better or worse explanations. That is,

when confronted with two or more alternative explanatory hypotheses for the data, we

can make judgments about their relative merit on the basis of some relevant criteria.

Harman briefly points to some such criteria when he asks under what conditions

we can make, for example, a warranted enumerative inductive inference. He answers:

“Whenever the hypothesis that all A’s are B’s is (in the light of all the evidence) a better,

simpler, more plausible (and so forth) hypothesis than is the hypothesis, say, that

someone is biasing the observed sample in order to make us think that all A’s are B’s”

(Harman 1965, p. 91). I will leave aside for now the problem that, from the fact that a

hypothesis is more plausible than any alternative, Harman would have us infer that the

hypothesis is true.151 I want to emphasize instead that, in his case, Bernoulli clearly

argues that he infers Hp because it is a simpler hypothesis than any alternative, say, than

Hq3: ‘someone is biasing the observed sample in order to make us think that event E

happens in sn proportion of cases.’152 In this example, Bernoulli also deploys some

implicit beliefs that are part of the inference, such as the assumption that the sample of

observed outcomes results from independent, random trials, and that the physical trials

are such that their possible outcomes are equipossible (see Hacking 1971b).

Precisely in this regard, Harman argues that one of the crucial features of the inference to the best explanation is that it reveals the role that ‘lemmas’ play in inferring explanatory hypotheses. Articulating, exposing and evaluating such lemmas is an important aspect of the analysis of knowledge that results from inference since, according to Harman, there is a wide agreement among epistemologists that if we are to know, our belief must be both true and warranted. That is, a necessary condition of knowledge is not only that our final belief be true, but also that the ‘lemmas’, or intermediate propositions between premises and conclusion, be true. Otherwise, if the intermediate propositions

“are warranted but false, then [we] cannot be correctly described as knowing the

conclusion” (Harman 1965, p. 92). Harman calls this necessary condition for knowledge

the “condition that the lemmas be true” (p.92). I do not aim here to accept or reject these

epistemological strictures on what it is “to know,” but I do want to emphasize that these

151 It may be that in Harman’s exposition it is implied that inferring that an explanatory hypothesis is true means that we infer that it is probably or plausibly true, not certainly true. This would make for a more charitable reading of his exposition. But if this view is implied, it would have been worthwhile to make it explicit. 152 This hypothesis is due to Hacking 1971a.

often implicit ‘lemmas’ are an intrinsic part of the inference from evidence to best explanatory hypotheses. Recall the form of Bernoulli’s inference under the present reconstruction—Premise: Hp would provide a “better” (simpler) explanation for observed

sn than would any other hypothesis; Conclusion: Hp is true. The premise includes an

implicit lemma that is part of the inference; namely, ‘I, Bernoulli, hold the belief that Hp is indeed the simplest explanatory hypothesis.’ If Hp turned out not to be the simplest

hypothesis, the lemma would be false. Harman would then say that the inference to the

truth of Hp would be unwarranted and that Hp would be false.153

So far I have emphasized the merits of Harman’s model of the inference to the

best explanation for elucidating hypothetical inferences. This model reveals that, in an

inference such as Bernoulli’s hypothesis that unknown p is close to sn, the mathematician or

scientist formulates several alternative explanatory hypotheses for the evidence and

infers the best one by appealing to some relevant criteria for judging the relative merit of

explanations and to some implicit lemmas. Moreover, the model attempts to account for

the fact that ampliative inferences are not of the form of enumerative induction since

ampliative hypotheses often appeal to concepts that are not part of the evidence. In formal terms, that is, ampliative hypotheses involve terms that do not appear in the statements of the evidence. For example, the concept of an unknown probability p underlying statistical phenomena is not part of the evidence consisting of an empirical estimation of the relative frequency sn of an event. Thus, the ampliative inference to Hp

153 Note that this treats the inference to the best explanation as if it were deductive by subjecting it to the standards of justification of deductively certain knowledge. As will become clear in section 6.3 below, these lemmas are better described as ‘background beliefs’ that inform an abductive hypothesis.

constitutes a conjecture that in turn provides the ‘reasonable’ hypothetical ground for estimating unknown event probabilities by way of observed relative frequencies.154

However, Harman’s model of inference to the best explanation has some significant problems in accounting for the ampliative inference to explanatory hypotheses. First, the model does not explain how the explanatory hypotheses arise. The model takes the alternative explanatory hypotheses as given and reveals that we infer the best one, but it does not analyze the reasoning process through which the scientist conceives such hypotheses. This is mainly because Harman treats explanatory inference abstractly and not as an actual process. Second, it does not make a sufficiently sharp analytic distinction between two reasoning processes involved in inferring explanatory hypotheses—namely, the process of formulating several plausible explanatory hypotheses and the process of inferring the best one. And third, because the model fails to make the distinction between formulating plausible hypotheses and inferring the best one, it hastily describes the inference to an explanatory hypothesis as the inference to the truth, and not to the plausibility, of the hypothesis.

Peter Lipton develops a model of the inference to the best explanation that improves on Harman’s in some important ways, especially with respect to the second and third problems above.

So now I turn to present his model in order to analyze Bernoulli’s inference. Lipton proposes that his model describes situations in which “it is not simply that the phenomena

154 Throughout my discussion, the term ‘reasonable’ will have a distinctly Peircean meaning. A belief— that is, the basis for action—is ‘reasonable’ when we adopt it according to one of the forms of logical inference—abduction, induction, or deduction—and we hold it with the appropriate, or justified, degree of confidence—plausibility, probability, or necessity, respectively. Note that both a deductive certainty and an abductive conjecture may be ‘reasonable’, so long as they are based upon appropriate grounds such as true premises or insightful perceptual judgments, respectively.

to be explained provide reasons for inferring the explanations: we infer the explanations precisely because they would, if true, explain the phenomena” (Lipton 1991, p. 57).

Strangely, as we will see, for Lipton this inference is inductive, even if it is not an inference in which the evidence provides inductive support for the conclusion, but one in which we infer a hypothesis because it explains the evidence. So he writes that inference to the best explanation is “a new model of induction, one that binds inference in a new and exciting way. According to [it], our inferential practices are governed by explanatory considerations” (Lipton 1991, p. 58). Lipton describes the form of this inference as follows: “Given our data and our background beliefs, we infer what would, if true, provide the best of the competing explanations we can generate for those data (so long as the best is good enough for us to make any inference at all)” (Lipton 1991, p. 58). In the following assessment of his model, I advance Lipton’s apt emphasis on the role of explanatory considerations as guides to hypothetical inference, while I point out some difficulties that arise from his classification of inference to the best explanation as inductive.

Lipton makes two important distinctions regarding the sorts of explanations that we seek when we infer explanatory hypotheses. First, he distinguishes between actual and potential explanations. For his model, Lipton assumes ‘inferential and explanatory realism,’ that is, (a) that a goal of inference is truth, (b) that our actual inferential practices generally take us toward the goal of truth, and (c) that for something to be an actual explanation, it must be at least approximately true (Lipton 1991, p. 59). So the actual explanation of a phenomenon is its true, or at least approximately true, explanation, in reality. A model of inference to the best actual explanation fails to

describe our actual inferential practices since it portrays all of our inferences as true, when our “inductive practice is fallible: we sometimes reasonably infer falsehoods”

(Lipton 1991, p. 59).155 It also fails to account for the role of competing explanations in

inference and mischaracterizes our inferential process as we can only decide whether a

hypothesis is an actual explanation after we have inferred it. In short, this model of

inference to the best actual explanation does not give us “an account of the way

explanatory considerations can be a guide to truth” (Lipton 1991, p. 59). Consequently,

Lipton proposes a model of inference in which we first consider potential explanations

and then infer the best one. Potential explanations need not be true explanations of the

empirical evidence; they are only plausible explanations of the phenomenon that accord

with our system of beliefs and that would provide some way to understand the observed

phenomenon. The plausibility of potential explanatory hypotheses acts as an “epistemic

filter”: in our inferential process we do not consider all possible explanations—for

example we do not consider arbitrary ones—but only the potential explanations, that is,

the plausible ones that are “live options” to become the actual explanation (Lipton 1991,

p. 60-61).

The upshot of the foregoing distinction is to characterize the inference to the best

explanation as a process consisting of two “epistemic filters.” According to Lipton, in our

reasoning we deploy the first filter to select a group of plausible explanations for an

observed phenomenon from a vast pool of possible explanations. Then we use a second

155 This phrase anticipates a key element of a Peircean critique of Lipton’s model: the model describes our inference to explanatory hypotheses as inductive, not abductive. But the phrase also indicates that we can ‘reasonably’ infer falsehoods, where ‘reasonable’ may be taken in the Peircean sense that I defined above.

filter to select the best explanation from the group of competing plausible explanations.

In this way, Lipton’s account improves on the second and third problems with Harman’s model. That is, Lipton distinguishes between the process of generating explanatory hypotheses and that of selecting the best one, and he accounts for the fact that in generating hypotheses our foremost consideration is the plausibility, not yet the actual truth, of the explanation.

Lipton distinguishes, second, between the likeliest and loveliest explanations.

Given the empirical evidence, the likeliest explanation is “the explanation that is most warranted,” while the loveliest explanation is “the one which would, if correct, be the most explanatory or provide the most understanding” (Lipton 1991, p. 61). In a succinct description he writes, “[l]ikeliness speaks of truth; loveliness of potential understanding”

(Lipton 1991, p. 61). For Lipton, one way to understand this distinction is to see that we evaluate likeliness, but not loveliness, with respect to the total available evidence. For example, he observes that Newtonian mechanics is one of the loveliest explanations in science and, at one time, it was also one of the likeliest explanations given the then available data. As the theory of special relativity arose along with the new data that supports it, Newtonian mechanics became a less likely explanation of all of the data, but it remains a lovely explanation of the old data, that is, it still helps us to understand clearly the physical phenomena that it explains well. By this example Lipton also tries to illustrate that likely and lovely explanations are differently affected by new competing explanations, as a new explanation may make the current one less likely but not necessarily less lovely (Lipton

1991, p. 62). The last point is problematic, as I think that an explanation that decreases in likeliness in the face of new data also decreases in loveliness, since we learn that we did

not understand the old phenomena as deeply as we thought, and the new likelier explanation may be a lovelier explanation. For instance, the special theory of relativity may be likelier and lovelier, since it explains the additional evidence and since, although the theory may be more difficult to grasp than Newtonian mechanics, once we grasp it we gain a deeper understanding of the phenomena that Newton’s theory also explains.

In my estimation, the really crucial distinction that Lipton is trying to draw, but does not draw explicitly, is that we evaluate the likeliness of a hypothesis by way of inductive probability, while we evaluate its loveliness by way of explanatory plausibility.

In evaluating inductive probability, we weigh the relative support that all of the empirical data provide for one or the other hypothesis; while in evaluating explanatory plausibility, we judge how well an explanation coheres with our current system of beliefs, that is, with our background theoretical knowledge, and with what we perceive or think to be possible. I think the reason that Lipton does not distinguish between inductive probability and explanatory plausibility explicitly, even when he makes all the other aforementioned distinctions, is that he considers the inference to the best explanation to be an inductive inference, while at the same time he argues that it is an inference to the loveliest explanation. The inference is clearly non-deductive but, like Harman, Lipton does not even consider the possibility that the inference to the best explanation may be non-inductive. If Lipton were to associate the inference to the likeliest explanation explicitly with the inductive estimation of the probability of a hypothesis given the evidence, he would have to at least discuss explicitly the possibility that inference to the loveliest explanation is non-inductive. He would have to consider whether the inference to the loveliest explanation is a third form of inference, for instance, abductive. But Lipton, like

Harman, is content simply to cite, without argument, abduction as one of the antecedent, and apparently insufficiently developed, forms of inductive inference to the best explanation (see Lipton 1991, p. 58). Recalling that in section 2.2.3 I argued, on Peircean grounds, that induction and abduction are different forms of inference and that induction is an evaluation of probability while abduction is an evaluation of plausibility, we now can see that the inference to the loveliest explanation, if it is a viable form of inference, must be abductive and not inductive.

For now, let us continue a discussion of Lipton’s model of inference to the loveliest explanation, as I consider it to be a viable model of inference that is miscategorized. Lipton rejects a description of the inference to the best explanation as inference to the likeliest explanation because it begs the questions of what are the marks of likeliness, what principles we use to judge one inference more likely than another one, and what features of an argument lead us to say that the premises make the conclusion likely (Lipton 1991, p. 62). In other words, since inference to the best explanation should describe ‘strong inductive arguments’ in which “the premises make the conclusion likely,” we beg the question when we simply say that when we infer a hypothesis what we do is infer the likeliest explanation (Lipton 1991, p. 62). In contrast, describing the inference to the loveliest explanation elucidates how explanatory considerations guide our inference to a hypothesis. For Lipton, the model of the inference to the loveliest potential explanation “claims that the explanation that would, if true, provide the deepest understanding, is the explanation that is likeliest to be true. Such an explanation suggests a really lovely explanation of our inferential practice itself, one that links the search for truth and the search for understanding in a fundamental way” (Lipton 1991, p. 63). It

links them, that is, without conflating their role in inference.156 In sum, Lipton’s

characterization of the inference to the best explanation as the inference to the loveliest

potential explanation aptly elucidates that, when we infer explanatory hypotheses, (i) we

initially generate relevant candidate hypotheses that are plausible explanations of the

empirical evidence, and (ii) then we choose the hypothesis that yields the deepest

understanding of the observed empirical phenomena.157

Let us now deploy this model to describe Bernoulli’s inference. Given the

evidence sn and our background beliefs, following Bernoulli we infer what would, if true,

provide the best of the various competing explanations that we can generate for the observed sn. In this case, we infer the explanatory hypothesis Hp: ‘the true unknown event

probability p is close to sn’. This inferential process involves two ‘epistemic filters’: (i)

Generate a group of plausible explanations, say H1, H2,…, Hp-1, Hp, for sn, from the vast

pool of possible explanations. (ii) Select the best explanation, that is, the loveliest or the

one that produces the deepest understanding, of the occurrence of an event with observed

relative frequency sn from the select group of plausible competing explanations. In this

case, the best explanation is Hp.
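
To fix ideas, the two-filter structure just described can be rendered schematically in code. The following sketch is merely illustrative; it is not part of Lipton’s or Bernoulli’s apparatus, and the plausibility test and the loveliness score it invokes are hypothetical placeholders for judgments that, on the account defended in this chapter, are not algorithmic.

    # A purely schematic sketch of Lipton's two 'epistemic filters'; the functions
    # passed in as is_plausible and loveliness are hypothetical stand-ins.
    def inference_to_best_explanation(hypotheses, is_plausible, loveliness):
        # First filter: keep only the 'live options', i.e., hypotheses judged
        # plausible against our background beliefs.
        live_options = [h for h in hypotheses if is_plausible(h)]
        if not live_options:
            return None  # no explanation is good enough to license any inference
        # Second filter: among the live options, select the loveliest, i.e., the
        # one that would, if true, provide the deepest understanding.
        return max(live_options, key=loveliness)

    # Toy usage with invented stand-ins for the competing hypotheses in the text.
    candidates = ["H1", "H2", "Hp"]
    best = inference_to_best_explanation(
        candidates,
        is_plausible=lambda h: h != "H2",                # pretend H2 clashes with background beliefs
        loveliness=lambda h: 1.0 if h == "Hp" else 0.5,  # pretend Hp yields the deepest understanding
    )
    # best == "Hp"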

By describing the first filter, we begin to respond to Leibniz’s objection that there

are infinitely many arbitrary hypotheses that explain sn, since we show that we only

consider as ‘live options’ for inference those hypotheses that we judge to be plausible explanations on the basis of our system of beliefs. Thus, arbitrary hypotheses that do not

156 For other accounts of discovery as a logical process involving the search for explanation, see Schaffner 1993, especially chapter 2, and Hanson 1965.
157 In section 6.3, I will describe how the logic of abduction provides an account of the method or process of ‘generation’ of hypotheses itself.

aid our understanding or cohere with at least some of our existing beliefs will not appear to be explanatory or plausible.158 Lipton’s model aptly emphasizes the role of background

beliefs in the generation of hypotheses. For Bernoulli the system of beliefs includes the

belief that the explanation for the occurrence of sn is not arbitrary, nor unintelligible, but

is mathematically describable. Accordingly, the system also includes both theoretical

knowledge of mathematical probability and the beliefs concerning the applicability of the

theory to different sorts of problems. In particular, Bernoulli believes that a priori

probabilities explain the relative frequency of events in games of chance. Thus, whatever

explanation he offers or accepts for the occurrence of sn must cohere with this belief. Hp

does. Stating explicitly Hp along with some of the background beliefs that cohere with it

and make it plausible, we could say: ‘an event—ludic or natural—occurs with relative

frequency sn because its a priori probability p is approximately sn, and we can describe,

via mathematical probability theory, how sn approaches p as the number of our

experimental observations increases.’
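
In modern notation, the precise content of this ‘approach’ of sn to p is given by Bernoulli’s theorem, read as the weak law of large numbers for binary trials. The following is a standard modern rendering rather than Bernoulli’s own formulation in the Ars Conjectandi, which instead specifies, for any desired odds, a number of trials beyond which it is correspondingly more probable that sn falls within a given small distance of p than outside it:

    \[
    \text{for every } \varepsilon > 0: \qquad
    \lim_{n \to \infty} \Pr\bigl( \lvert s_n - p \rvert < \varepsilon \bigr) = 1 ,
    \]

where sn is the relative frequency of the event in n independent trials and p is its fixed a priori probability.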

Let us turn now to specify further the second filter in Bernoulli’s inference.

Bernoulli explicitly says that Hp is the loveliest explanation because it is the simplest one.

That is, Hp yields the deepest understanding of the occurrence of an event with relative

frequency sn because it shows that the phenomenon accords with the simplicity of ways

that we perceive to rule everywhere in nature. In this case, the hypothesis explains that

the a posteriori observed sn results, in a simple and intelligible way, from its proximity to

158 It is at this point that Popper turns to psychology and claims that discovery is not a question for the logic of scientific inquiry. It is the general purpose of my entire argument to show that the formation of ampliative hypotheses is a logical process. Additionally, I must observe that, from a Peircean view of the entire process of scientific inquiry, any system of beliefs will improve so long as the inquirer subjects his plausible hypotheses to the test of real experience. This will become apparent in section 6.3 below.

the a priori probability p. Initially, we may take this ‘simplicity’ to mean ‘logical simplicity’. But this is not enough. The simplicity that Bernoulli appeals to is not only a

‘logical simplicity’—in the sense of the conceptual parsimony of a hypothesis—because it is a simplicity that we perceive. As I will argue in sections 6.3.3 and 7.2.2, the logic of abduction provides a way to understand Bernoulli’s appeal to ‘simplicity’. For now, it suffices to observe that the simplicity of the hypothesis is the explicit reason that

Bernoulli gives to Leibniz in order to choose it as the best, or loveliest, explanation. But there are other implicit reasons that we can draw out of Bernoulli’s correspondence on the basis of Lipton’s model.

First, Lipton points out that, in order to choose the loveliest explanation, we may perform experiments to narrow the field of candidate plausible hypotheses resulting from the first filter. This point is well taken, though in general it seems more germane to hypothetical inferences in natural sciences than to Bernoulli’s inference for justifying the application of a mathematical method of probability estimation to problems in natural sciences. As we shall see in the next section, however, an important criterion in

Bernoulli’s reasoning is that the inferred hypothesis be capable of yielding consequences that admit of experimental verification. Second, according to Lipton, the lovely explanations that we infer tend to fulfill our preferences for explanations that (a) specify a causal mechanism, (b) are precise, and (c) unify our understanding and our explanatory scheme. On his view, our search for rational explanations of phenomena is often best understood as a search for the unknown causes of observed effects. In the case of

Bernoulli, this would mean that he considers the a priori probability p to be the cause of

the observed statistical frequency sn.159 We also prefer explanations that are precise.

Bernoulli’s theorem, for instance, provides a mathematical explanation of the relation between p and sn; in particular, it explains with precision how sn approaches p as the

number of experimental observations increases. Thus, Bernoulli sees that the hypothesis

Hp underlies a very precise logical and mathematical explanation of the occurrence of sn.

Finally, the most important characteristic of a hypothetical explanation is that it unifies our understanding. Bernoulli’s hypothesis suggests that with the tools of mathematical probability we may understand stochastic phenomena both in games of chance and in natural events. That is, through one mathematical theory we can understand types of events that, at first glance, appear to be very dissimilar. In fact, one of the most forceful points behind Leibniz’s objections is precisely that natural events are essentially different from games of chance—because natural events are mutable and have infinite types of possible outcomes—and, therefore, that the calculus of mathematical probability, applicable in ludic problem-contexts, is not applicable to the problems of natural science. Bernoulli instead grasps that both games of chance and natural events can be understood as stochastic phenomena and that, as stochastic, they are subject to the same explanation via mathematical probability. Bernoulli sees that for all such stochastic phenomena, we can explain the occurrence of an event with statistical frequency sn by the

proximity of the event’s a priori probability p to sn. In turn, the mathematical theory of

159 Lipton recognizes that the logical form of the inference to the best explanation does not necessarily presuppose a causal theory of explanation and that the philosophical question of what is an explanation is complex and unsettled, so he hopes that other possibly correct theories of explanation may turn out to be compatible with his account of the inference.

probability, in general, and Bernoulli’s theorem, in particular, develop the unified explanatory scheme for such phenomena.

Bernoulli’s insight into the common stochastic nature of very disparate events and into their unifying hypothetical explanation constitutes one of his most fundamental conceptual discoveries. And it is an insight. By emphasizing the insightful nature of

Bernoulli’s inference I again want to suggest that, although it is an inference to an explanation, it is not an inductive inference. It is in this respect that Lipton’s model, put forth as an account of inductive inference, does not account for the true form of

Bernoulli’s inference. Next, I interpret Bernoulli’s reasoning as a case of abductive inference to a hypothesis. This will prepare the way to argue that the inference to the best explanation, understood as the inference to the loveliest potential explanation, is a form of abduction that Lipton misclassifies as a form of induction and that this misclassification amounts to a conflation of two distinct forms of inference. The conflation blurs some important aspects of Bernoulli’s inference to an explanatory hypothesis. However, a reclassification of Lipton’s model of the inference to the best explanation as an abductive inference and a more thorough description of Peircean abduction elucidate Bernoulli’s inference in the full scope of its crucial details.

6.3 Bernoulli’s Hypothesis as a Case of Abduction

I first interpret Bernoulli’s inference as a case of abduction before turning to a contrast of this position with the foregoing interpretation of it as an inductive inference to

the best explanation. Recall from section 2.2.2 that in his 1903 Harvard Lectures on

Pragmatism, C.S. Peirce describes abduction as the logical process of forming an explanatory hypothesis and shows that even though an abduction only asserts its conclusion “problematically or conjecturally,” as an inference it has the following definite logical form:

“The surprising fact, C, is observed;

But if A were true, C would be a matter of course.

Hence, there is reason to suspect that A is true” (EP 2, p. 231).

Recasting Bernoulli’s inference in abductive form, we have:

Premise 1: We observe the relative frequency of event E to be sn.160

Premise 2: If the unknown a priori probability p of event E were close to sn, then

the occurrence of sn would be a matter of course.

Conclusion: There is reason to suspect that Hp is true, namely, that ‘the unknown

a priori probability p of event E is close to sn’.

We infer the hypothesis Hp as plausible because it would explain the occurrence of sn.

This conclusion is a plausible belief and, in turn, provides us with a reason for estimating

the unknown probability of natural events by way of their observed relative

frequencies.161

160 For now, I leave aside the element of surprise in the observation, but I take it up later when it is more crucial to my discussion.
161 Recall that a belief, that is, the basis for action, is ‘reasonable’ when we adopt it according to one of the forms of logical inference—abduction, induction, or deduction—and we hold it with the appropriate, or justified, degree of confidence—plausibility, probability, or necessity, respectively.

Now, recall from section 2.2.3 the distinction between induction and abduction, and especially the “great difference” between the reasoning situations that call for one or the other of these forms, namely, that the question of abduction is the question of explanation of observed facts, while the question of induction is the question of the degree of agreement between observed and predicted facts. On the basis of that distinction, I claim that Bernoulli’s inference, like any inferential discovery, is more properly understood as an abduction than as an induction. Bernoulli’s inference, as I am reconstructing it, takes the observed relative frequency sn of event E as given. In terms of

mathematical probability, Bernoulli shows with his theorem that sn is an ‘accurate’

estimate of the unknown a priori probability p of event E. Leibniz, however, queries him on the logical grounds for justifying such an estimate of probability in the case of natural events. In other words, even if Bernoulli proves the theorem with mathematical rigor, what are the reasons for applying it to the estimation of probabilities in the case of natural

events? Bernoulli’s reply is an abductive conjecture: we observe the relative frequency of a natural event to be sn; but if the unknown a priori probability p of the event were close

to sn, then the occurrence of sn would be expected; therefore, we may plausibly conjecture

that the unknown a priori probability p of the event is close to sn.

We can begin to see that Bernoulli’s inference is abductive and not inductive

because its conclusion is an explanatory hypothesis that involves general conceptions that

are not of the nature of the observed phenomenon. The observed phenomenon is a

relative frequency—a ratio of favorable to total outcomes—but the explanation involves

the concept of an unknown a priori probability.162 The idea of an unobserved probability

of success underlying a sequence of random experimental outcomes explains the

occurrence of the observed relative frequency of successes. This brings to the fore the

most striking innovative aspect of Bernoulli’s inference: it provides a plausible

explanation for the fact that, in random natural processes, we come to observe stable

statistical frequencies of events. Bernoulli’s entire reasoning, both in pursuing a proof of his theorem and in justifying before Leibniz the logical grounds for its applicability to the estimation of probabilities of natural events, relies on the abductive insight that unobserved probabilities explain the surprising occurrence of stable statistical frequencies in chance events.

Ian Hacking makes the important point that one of Bernoulli’s most fundamental conceptual contributions to the discovery of mathematical probability is to provide a way to explain statistical regularity in random processes.163 Hacking emphasizes “how

difficult it was [at the time of the emergence of probability] to comprehend regularity

within the random” (Hacking 1971a, p. 228).164 I agree with Hacking’s incisive

assessment of the importance of the Swiss mathematician’s contribution. As I already

anticipated, however, Hacking characterizes Bernoulli’s reasoning as an inference to the

best explanation. He writes, “confronted by puzzling or interesting phenomena, one

favours hypotheses that provide the best explanation of the phenomena. If the

162 Note that I do not attribute to Bernoulli a frequentist view according to which a probability is defined as the long run relative frequency, or ratio of favorable to total outcomes, of an event.
163 See Hacking 1971a, p. 228-229.
164 Furthermore, Hacking credits Karl Pearson for reminding us, along with others, “how puzzling statistical stability once seemed” (Hacking 1971a, p. 228). See Pearson 1924, 1925, and 1926.

phenomenon to hand is some statistical stability, Bernoulli’s theorem shows us that the best explanation is that we have independent trials with probability close to the observed frequency….[I]f we observe a long sequence of repeated trials in which the relative frequency of occurrences of [event] S gradually gets closer and closer to p, we have a phenomenon that can be rationally explained only on the assumption that the chance of S is about p. Bernoulli’s theorem demonstrates how to explain the fact of regularity within the random” (Hacking 1971a, p. 229). Although Hacking’s point regards Bernoulli’s theorem and not specifically the arguments he presents to Leibniz in correspondence, the connections between the theorem and the correspondence are clear. Through both—the theorem and the correspondence—Bernoulli addresses the fundamental reason why we observe stable statistical frequencies in random events and infers that we have reason to use these frequencies as estimates of the true a priori probabilities of those events. My view is that Hacking provides an initially apt characterization of Bernoulli’s reasoning, so long as we understand the inference to the best explanation to be abductive and not inductive. This, however, is not the traditional understanding of the inference to the best explanation, so Hacking’s initial interpretation needs to be examined and developed more fully. Moreover, characterizing Bernoulli’s inferential process as abductive elucidates more thoroughly various important aspects of his reasoning. Let us continue, then, to draw out those abductive elements from Bernoulli’s reasoning.

6.3.1 Formulating and Weighing Plausible Explanatory Hypotheses in Response to Living Doubt

The surprise that we experience when we observe statistical regularity in a random sequence of events, especially in nature, is an important catalyst for Bernoulli’s inquiry into an explanation for that puzzling phenomenon. As Peirce argues extensively, our response to that puzzlement is a process of inquiry that begins with an abductive search for plausible explanations of the phenomenon.165 Accordingly, we can specify

more thoroughly the first premise of his abductive inference as follows: ‘In a sequence of

independent random experiments, we observe the relative frequency of a natural event E

to be sn and we are puzzled that there should be any statistical regularity in the random

sequence that exhibits this frequency sn’. The premise states the puzzlement that spurs the

inquiry.

Now, according to Peirce when we experience real living doubt we search for

explanatory hypotheses that fulfill two necessary conditions; namely, the hypotheses

must (i) explain the phenomenon and (ii) set up expectations that eliminate the surprise

or, in other words, be capable of producing testable predictions. Peirce writes: “What

good is abduction? What should an explanatory hypothesis be to be worthy to rank as a hypothesis? Of course, it must explain the facts. But what other conditions ought it to

fulfill to be good? The question of the goodness of anything is whether that thing fulfills

its end. What then, is the end of an explanatory hypothesis? Its end is, through the

subjection to the test of experiment, to lead to the avoidance of all surprise and to the

165 See especially two of his series of lectures, the “Illustrations of the Logic of Science” (EP 1, p. 109- 199) and the “Harvard Lectures on Pragmatism” (EP 2, p. 133-241).

establishment of a habit of positive expectation that shall not be disappointed. Any hypothesis, therefore, may be admissible, in the absence of any special reasons to the contrary, provided it be capable of experimental verification, and only in so far as it is capable of such verification” (EP 2, p. 235; emphasis mine). The two conditions, then, may in fact be stated as one necessary and sufficient condition: a hypothesis is admissible if, and only if, it is capable of experimental verification.166 It is necessary and sufficient because, for Peirce, the condition that the hypothesis be capable of experimental verification already presupposes the first and second necessary conditions. That is, for a hypothesis to be experimentally verifiable it must be capable of yielding testable

predictions, and in order to yield predictions, it must explain ‘how’ the observed

phenomenon would come to be as a matter of course.167

Thus, when we are initially surprised that we have observed the frequency sn in a random natural process, we might formulate several hypotheses in an attempt to explain the phenomenon and mitigate the puzzlement. Let us state explicitly some of these

166 Since my concern here is mainly with the logical form of abduction, I leave aside the question of what we are to understand by experimental verification since, for Peirce, the answer “involves the whole logic of induction” (EP 2, p. 235). Peirce himself discusses what he means and does not mean by verification in EP 2, p. 235-238. It is important, however, to clarify at least that Peirce is not a logical positivist with regard to verification. Abductive hypotheses need not have directly observable consequences, expressible in a purely observational language, to be verifiable. The requirement is that the hypotheses have conceivable practical consequences in the context of an entire, and partly theoretical, system of belief—consequences that would be experientially testable, even if indirectly. As Nathan Houser poignantly writes, “Peirce’s devotion to mathematics and science, his emphasis on the scientific method, and his pragmatic maxim (which sounds a lot like a verification principle) certainly suggest an affinity between pragmatism and positivism….The pragmatic maxim may thus be taken as a test for whether our conceptions, and our theories, are indexed to experience, or whether they are part of a mere language game. But though there are many points in common between pragmatism and positivism, there are important differences, especially Peirce’s insistence on realism and on the legitimacy of abductive reasoning, and his denial of a sharp demarcation between the language of observation and the language of theory” (“Introduction” to EP 1, p. xxxiv). 167 For now, I leave open the question of the form of this ‘how’, that is, of the form of explanation. See section IV below for a brief discussion of the models of explanation that the logic of abduction may admit.

possible hypotheses. Restating some of the possible hypotheses from section 6.2 in light of our current discussion, we get:

Hq1: ‘we have an imperfect capacity of observation that makes event E, which

happens with certainty under determining conditions, seem to us to happen only in

a statistically regular proportion sn of random cases.’

Hq2: ‘an evil demon makes event E, which happens with certainty under

determining conditions, seem to us to happen only in a statistically regular

proportion sn of random cases.’

Hq3: ‘someone is biasing the observed sample in order to make us think that event

E happens in a statistically regular proportion sn of random cases.’

Following Peirce,168 we can formulate another possible alternative hypothesis and restate

Bernoulli’s own stated hypothesis as follows:

Hq4: ‘the statistical regularity sn observed in a sequence of random natural

outcomes is due to mere chance—that is, to the pure randomness of a natural

process where there are no a priori probabilities of events.’

Hp: ‘the statistical regularity sn observed in a sequence of random natural

outcomes is due to the proximity of the unknown a priori probability p of event E

to sn.’

We may grant to Leibniz, as Bernoulli does, that infinitely many other hypotheses may be

formulated in an attempt to explain the occurrence of sn, but let us limit our discussion to

168 I am following Peirce in the sense of paraphrasing the possible hypotheses that he formulates to explain the general observation in nature that “with overwhelming uniformity, in our past experience, direct and indirect, stones left free to fall have fallen” (see EP 2, p. 181-183). Peirce’s own discussion regards the theme of reality and truth.

these five hypotheses as they should prove sufficient to elucidate the process of abductive reasoning.

Taking an important cue from Lipton, we should notice that the logical process of abduction consists first in a search for explanatory hypotheses—or in Lipton’s terms, for potential explanations—for the observed phenomenon. Now, Peirce’s logic of abduction shows that our initial formulation of hypotheses is directly informed by our system of beliefs, by our existing background knowledge. For various systems of belief, any of the foregoing hypotheses might appear to be possible explanations of the phenomenon. Hq1 would be an explanation of the occurrence of statistical regularity sn for a person who

holds the metaphysical belief that the universe is deterministic—so that all events are

determined with certainty provided that the necessary initial conditions are given—and

who holds the epistemological belief that our capacity for knowing and observing all the determining conditions of events in nature is limited, so that our observations of statistical frequencies are in part the reflection of our imperfect epistemic capacity. Hq2 would count as an explanation for a skeptic who believes that the observations of statistical frequencies of events afforded by our sense-perception are extremely fallible and uncertain, so that the said observations might as well result from the sensory deceptions of an evil demon. Hq3 would be an explanation for one who admits the belief

that statistical regularities may take place in natural events, but that our estimate of the

regularity in this particular occasion is biased by our method of sampling. Someone may

even generalize the hypothesis and claim that our samples always yield biased estimates.

Hq4 would be a possible explanation for one who holds the metaphysical belief that only

pure chance is at work in the observed natural process, so that any particular statistical

estimate of regularity is merely random and will vary completely randomly from one sample to the next. In other words, this is a plausible explanation for, say, someone who assumes that every sn in different experimental sequences of trials is due to mere chance.

Finally, Hp would be a plausible explanation of the occurrence of sn for someone who

believes that statistical regularities do take place in nature according to a reason and that

the reason may be described accurately by the appropriate mathematical methods.169

It is important to note that, while the foregoing hypotheses are not completely arbitrary, as they must in some way meet the necessary condition of being explanatory, some of them may appear to be, to the same person at the same time, possible explanations of the observed phenomenon. Bernoulli, for instance, clearly holds Hp to be plausible, as it is the explanation that he offers to Leibniz, but he also holds a form of metaphysical determinism similar to Hq1 (see Hacking 1971a, p. 214). So the criterion of

being a possible, or potential, explanation of the observed phenomenon is not sufficient

for the adoption of an abductive hypothesis. It is here precisely that Lipton’s notion of a

‘second epistemic filter’, one that reduces our list of potential explanations to the most

plausible one(s), becomes crucial. In Peircean terms, I am claiming that the logical

process of abduction involves not only the formulation of possible explanations but also

the judgment of relative plausibility of explanation. Strictly speaking, all the foregoing

hypotheses may be regarded as the result of abductive reasoning. However, like

Bernoulli, we do not necessarily formulate explicitly all of these hypotheses because in

the course of abductive reasoning we make judgments of relative plausibility according to

169 I leave the notion of ‘reason’ open so that it may be understood as a cause, a law, or a principle, in order to allow for causal or nomological theories of explanation.

our beliefs. Peirce in fact thinks that we infer abductive hypotheses with various degrees of plausibility. In describing the formulation of an explanatory conjecture he writes that the inquirer “provisionally holds [the hypothesis] to be ‘Plausible’; this acceptance ranges, in different cases,—and reasonably so,—from a mere expression of it in the interrogative mood, as a question meriting attention and reply, up through all appraisals of Plausibility, to uncontrollable inclination to believe” (EP 2, p. 441). If we ponder conjectures with varying degrees of plausibility, then we can also weigh their relative plausibility in order to infer the hypothesis most worthy of being subjected to further scientific inquiry.

Let us attempt to specify this judgment from Bernoulli’s perspective. In the course of our reasoning, we might conceive all five hypotheses to be possible explanations, but we immediately judge some hypotheses to be relatively more or less plausible than others. Bernoulli might immediately deem Hq2 and Hq3 to be relatively less

plausible than the other explanations—the former because he is not a skeptical

epistemologist with regard to the accuracy of our sense-perception and so he reasonably

assumes that our observations of experimental outcomes are accurate, and the latter

because he, or any experimenter, can ensure that the experimental trials be independent

and random and thus that the sample be unbiased. If any particular experimental sequence

were biased for some reason, the experimenter could correct the problem so that, in general,

no sampling bias takes place. Moreover, the judgment of the relative plausibility of

hypotheses is to some extent a judgment of relative explanatory power, that is, of the

relative contribution of the explanation towards our understanding of the phenomenon.

We can see especially that, although Hq2 and Hq3 are possible explanations, they do not

contribute much to our understanding of the occurrence of the statistically stable frequency sn. Were we, unlike Bernoulli, skeptical epistemologists, we might accept that

Hq2 reveals that our sense-perception is easily deceived, but even so the hypothesis does

not tell us how or why deception leads to our observation of a statistical regularity instead

of an observation of utterly unintelligible randomness. Hq3, in turn, might suggest that

biased sampling produces our statistical estimate, but it does not explain with

mathematical precision how the sampling method biases the estimate, nor how to assess

the severity of the bias, nor how to correct the bias. In turn, we might share with

Bernoulli the view expressed in Hq1, but even then our hypothesis would only reveal the

epistemological limitations of our knowledge of natural processes via statistical

estimates; that is, it would only reveal that our statistical estimates represent only our

subjective degree of certainty about nature. The hypothesis, however, does not explain

how, given our cognitive limitations, we come to observe precisely the statistical

regularity sn and not some other estimate.

Thus far, the judgment of relative plausibility leaves us with two alternative

explanations. Hq4 suggests that we live in a natural world where statistical expectation

and prediction are impossible—any apparent statistical regularity sn observed in a random

sequence of natural outcomes is due to mere chance; there is no regularity and there are

no a priori probabilities of events in natural processes. Hp alternatively suggests that

statistical regularities such as sn are to be expected to result from the unknown a priori

probabilities of events. At this point, we should recall that the necessary condition that the hypotheses be explanatory goes along with the necessary condition that those hypotheses be capable of yielding testable predictions. In our abductive reasoning, we

adopt explanatory hypotheses only if they are capable of producing testable expectations.

In our current inferential situation, both Hq4 and Hp explain sn but only the second

hypothesis provides grounds for prediction and expectation. It is at least conceivable that

we might be able to test whether, under repeated sampling, the observed frequency of

event E tends to be in the proximity of some p. That is, on the grounds of the hypothesis

of an unknown a priori probability p, we might expect that, under repeated sampling

sequences 1 through r, sn1, sn2 … snr tend to be close to each other due to their common

proximity to the unknown probability p. The mathematical form of our test and the exact interpretation of the result would need to be worked out, but at least the possibility of

testing our expectation is conceivable. This very conceivability of practical consequences is the condition that Peirce’s pragmatic logic of abduction stipulates for the admission of a hypothesis. It is more difficult to conceive, however, how we would draw testable consequences from Hq4—if any seeming statistical regularity were merely due to chance,

we would not have any logical way of setting reasonable statistical expectations to test

whether experiment, or experience, disappoints or fulfills those expectations. We should

also notice that none of Hq1, Hq2, or Hq3 are capable of producing any conceivable testable

predictions. Their consequence is to paralyze us either because our sense-perceptual

faculties are too fallible or easily deceived or because we can never hope to draw samples representative of what is the case in nature.

Following Peirce, then, I would claim that Bernoulli’s explanatory hypothesis Hp

is the most plausible one and that Bernoulli infers it abductively. We can draw further reasons for concluding that Hp is the most plausible hypothesis by noticing that it also coheres more strongly than Hq4 with one of the fundamental hypotheses underlying the

possibility of scientific inquiry, namely, the hypothesis that there is intelligible regularity in the universe. Hp coheres with the belief that we may find intelligible regularity even in

sequences of random natural events. At the very least, I can ‘reasonably’ claim that the

hypothesis of intelligible regularity in nature is part of Bernoulli’s system of beliefs, as

his overall research undertaking—his very actual practice—reveals the conviction that

mathematical reasoning can describe regularity and stability in nature. These additional

remarks amount to claiming that, besides being capable of yielding conceivable practical

consequences, Hp coheres more strongly than Hq4 with some fundamental beliefs

underlying Bernoulli’s inquiry, and so he might also judge Hp to be more plausible than

Hq4 on explanatory grounds.

6.3.2 ‘Abductive Insight’ in Bernoulli’s Hypothesis

Now, the abductive account must address the way in which Bernoulli’s inference

is an act of insight. The crucial claim in this respect is that Bernoulli sees or, more

precisely, perceives that a probabilistic order explains regularity in chance natural

events.170 Recall that, as a central element of his reply to Leibniz, Bernoulli declares that

“we see everywhere that nature follows the simplest paths” (Bernoulli 1966, p. 78). Thus,

I must explain in what sense, according to the Peircean logic of abduction, Bernoulli

170 Strictly speaking, this is an epistemological claim about the origin of ideas and not a claim of formal logic about the structure of an inference. But I think it is important to address, at least briefly, the perceptual element of abduction, especially since from a Peircean perspective, the “logic of abduction” is a model of reasoning processes that are continuous and dynamic, so that the epistemology vs. formal logic distinction, though it is useful for analyzing reasoning processes, is not present in the processes themselves.

perceives that unobserved probabilities explain the surprising occurrence of stable statistical frequencies in chance events. The key to the Peircean position lies in the continuity between ‘perceptual judgment’ and ‘abductive inference’. According to Peirce, perception is for the logician “simply what experience,—that is, the succession of what happens to him,—forces him to admit immediately and without reason” (EP 2, p. 224). In any experiential situation, the ‘object’ of our perception is the ‘percept’. This ‘percept’ need not be a physical thing; it can be anything that is before our attention including an idea or an abstract problem, so long as it is represented in a sign.171 A ‘perceptual

judgment’ is a propositional judgment that we form of the percept. Just as in perception

experience forces us to admit the presence of the percept, so also the formation of the

perceptual judgment is rationally uncontrollable, that is, the formation of the perceptual

judgment is not subject to conscious logical criticism.172 Thus, the percepts themselves

suggest an uncontrollable, but interpretative, perceptual judgment. This is what Peirce

calls the ‘interpretativeness’ of perceptual judgments: interpretation of the percept is

continuous with and inseparable from perception, or, “perception is interpretative” (EP 2,

p. 229).

For Peirce, a perceptual judgment is simply the limiting case of abductive

judgment. One of the central or ‘cotary’ propositions of Peirce’s pragmatism is that

171 Notice that perception is not limited to sense experience of the natural, or actual, world, but extends to any kind of experience, including the intellectual experience of a mathematical world. Two observations are germane at this point. First, for Peirce there are three Universes of Experience, namely, the universe of Ideas, of Brute Actuality, and of Signs. So we can experience thoughts, facts, and representations that mediate between an object and a mind (see EP 2, p. 435). Second, recall that for Peirce mathematics is the study of what is true of hypothetical states of things. A mathematical world is a hypothetical world that the mathematician experiences.
172 Peirce writes, “a perceptual judgment is a judgment absolutely forced upon my acceptance and that by a process which I am utterly unable to control and consequently am unable to criticize” (EP 2, p. 210).

“abductive inference shades into perceptual judgment without any clear line of demarcation between them; or in other words our first premisses, the perceptual judgments, are to be regarded as an extreme case of abductive inferences, from which they differ in being absolutely beyond criticism” (EP 2, p. 227). Thus perceptual judgments and abductive judgments are essentially the same general kind of reasoning: both involve grasping some relevant aspect of the percept that the percept itself suggests or presents to our attentive observation. This is why Peirce describes abductive judgments as ‘flashes of insight’: “The abductive suggestion comes to us like a flash. It is an act of insight, although extremely fallible insight. It is true that the different elements of the hypothesis were in our minds before; but it is the idea of putting together what we had never before dreamed of putting together which flashes the new suggestion before our contemplation”

(EP 2, p. 227). Importantly, “the only symptom by which [perceptual and abductive judgments] can be distinguished is that we cannot form the least conception of what it would be to deny the perceptual judgment….An abductive suggestion, however, is something whose truth can be questioned or even denied” (EP 2, p. 229-230). Note that this difference is only symptomatic—it is not formal or essential, as perceptual judgments are only the limiting case of abductive judgment, and specifically the case where we cannot subject our judgment to logical or ‘reasoned’ criticism.

In the case of Bernoulli’s reasoning, the percept is the phenomenon of random natural events. This percept is clearly quite general; it is not a particular event but a general class of events; it is, nonetheless, what is before his attention. More specifically, the phenomenon of random natural events is the percept that is before Bernoulli’s scientific attention, and he is regarding random natural events as mathematically

describable events. Moreover, as he considers the phenomenon of random natural events,

Bernoulli already relies on previous perceptual judgments about these events—most importantly, on the judgment that some natural events, even if they seem to occur by chance, are nonetheless regular; in particular, they exhibit a regular frequency of occurrence. This judgment is in fact an expression of the phenomenon that now stands in need of explanation. The question is why chance events exhibit a regular frequency.

Given this problem context, then, Bernoulli’s abductive insight consists in bringing the concept of a priori probability to bear—logically in his discussion with Leibniz and mathematically in the proof of his theorem—on the explanation of the phenomenon of statistical regularity within the random in nature. This is not to say that the concept of

‘probability’ is Bernoulli’s invention; it is to say rather that Bernoulli perceives the way in which the concept of a priori probability contributes to the explanation of regular statistical frequencies in natural events. The concept of probability, on the one hand, and the perception of stable relative frequencies of natural events, on the other, are already part of the scientific context in which Bernoulli works, and his abductive insight consists in grasping that a priori probabilities explain the a posteriori statistical regularities observed in nature. Bernoulli perceives and, in a continuous reasoning process, conjectures that the regular frequencies that we observe in the natural world—the actual world of facts—are proximate to their a priori probabilities, and that this proximity is mathematically describable. Bernoulli, of course, needs to work out his conjecture mathematically by way of a deductive demonstration of the theorem. But his insight consists in grasping that the chance events of the natural world exhibit a statistical regularity that is explained by a priori probabilities and describable via the mathematical

theory of probability. This amounts to putting together into an intelligible relation what no one has been able to relate intelligibly before.

Comparing Bernoulli’s insight into the role of probability in statistically regular phenomena to the reasoning of his contemporaries will help us to appreciate its incisiveness. Let us consider for example a paper published by John Arbuthnot in the

1710 volume of the Philosophical Transactions of the Royal Society of London. Based on the London birth tables over the 82-year period from 1629 to 1710, Arbuthnot observes that the stable ratio of male to female births is 18:17. Assuming that the true a priori odds of male to female births are 1:1, Arbuthnot shows that observing an 18:17 a posteriori ratio over 82 years is highly improbable. Now, this observed regularity is puzzling because Arbuthnot expects that the observed ratio should be 1:1, since the equal number of males and of females of all ages would be ideal for the preservation of humankind, as

Divine Providence ordains. So what is the explanation for the regular ratio 18:17?

Arbuthnot reasons that it must be that Divine Providence intervenes to ensure an equal number of males and females because the mortality rates of males are higher, and it is required to have an equal number of males and females of reproductive age to ensure the ordained survival of humankind (see Arbuthnot 1710). Arbuthnot thus concludes that it is

Divine Providence, and not chance, that explains the 18:17 male to female birth ratio. Let us contrast Arbuthnot’s reasoning with Bernoulli’s. Bernoulli’s ‘art of conjecturing’ leads to the inference that the true a priori probability of a male birth is 18/35. Or, we could say that the true ratio of male to female parts in the “seed” (to paraphrase Arbuthnot) is 18:17. That is,

Bernoulli’s art indicates that the stable statistical ratio is explained by an a priori

probability as its efficient cause, not by the order dictated by Divine Providence as a final cause. Herein lies Bernoulli’s insight.
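
For concreteness, and on the standard reading of Arbuthnot’s computation, according to which what is improbable under the 1:1 hypothesis is the excess of male over female births in every one of the 82 recorded years, the arithmetic behind the two inferences can be sketched as follows; the notation for Bernoulli’s estimate is mine:

    \[
    \Pr(\text{a male excess in all 82 years} \mid \text{odds } 1{:}1) = \left(\tfrac{1}{2}\right)^{82} \approx 2.1 \times 10^{-25},
    \qquad
    \hat{p}_{\text{male birth}} = \frac{18}{18+17} = \frac{18}{35} \approx 0.514 .
    \]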

Now, although Bernoulli’s reasoning is insightful, his hypothesis is more properly described as an abductive, and not a perceptual, judgment because it is subject to controlled, logical criticism. Bernoulli abductively judges that natural events exhibit a regular statistical frequency because that frequency is close to the a priori probability of the event. In short, he judges that random natural events, just like events in games of chance, are probabilistic. But we may criticize this judgment; the judgment is in fact a plausible hypothesis that we can scrutinize. We can conceive the possibility of our judgment being a wrong judgment; it may be, for example, that relative frequencies of events are due to mere chance and that the frequencies will vary completely randomly from one experimental sequence to another. So we can conceive of testing whether, in repeated experimental sequences, the relative frequency of an event will be regular or not. As we have seen, however, this is already a form of inductive reasoning, but

Bernoulli’s initial judgment that natural events exhibit a regular statistical frequency because of their a priori probability is an abductive judgment—it is an act of insight into a relevant aspect of the observed phenomenon.
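The conceivable test just described can be sketched minimally as follows, assuming a fixed underlying probability; the value p = 0.3, the number of sequences, and the sequence length are arbitrary illustrative choices.

```python
import random

random.seed(1)  # fixed seed so that the illustration is reproducible
p = 0.3         # assumed a priori probability of the event
n = 10_000      # trials per experimental sequence

# Run several independent experimental sequences and record the relative
# frequency of the event in each one.
for sequence in range(5):
    successes = sum(random.random() < p for _ in range(n))
    print(f"sequence {sequence + 1}: relative frequency = {successes / n:.3f}")

# If the events are governed by a fixed a priori probability, the printed
# frequencies cluster closely around p; if they varied completely randomly
# from one sequence to another, no such regularity would appear.
```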

6.3.3 ‘Simplicity’ in Bernoulli’s Hypothesis

Finally, the simplicity of Bernoulli’s hypothesis must not elude the abductive account. After all, Bernoulli claims that we perceive that nature follows the simplest

paths, and for him this is a reason to infer the ‘simplest’ hypothesis. For Bernoulli, the simplest path to regular statistical frequencies in nature is that the frequencies of chance events be close to their a priori probabilities, and this is good reason to infer his hypothesis and to found the ‘art of conjecture’ upon it. But is Bernoulli’s sense of the simplicity of his hypothesis defensible on the basis of Peirce’s sense of abductive simplicity?

Part of the question is what kind of ‘simplicity’ characterizes an abductive hypothesis. For Peirce the ‘simplicity’ of hypotheses is of two types. Simplicity means primarily the natural ‘facility’ or ‘instinctiveness’ of a hypothesis, and only secondarily

‘logical’ simplicity or ‘parsimony’, that is, the simplicity propounded by Ockham’s razor

(see EP 2, p. 444).173 This means that the simplest hypothesis is the one that nature itself

suggests—it is the one that we most forcefully perceive to be the case. Only secondarily

does simplicity mean parsimony. Thus, we primarily abduce a hypothesis based on the

direct experience of what the natural phenomenon itself suggests and only secondarily

subject it to logical scrutiny to excise any conceptions unnecessary to the explanation of

the observed phenomenon. The Peircean position relies on the belief that our capacity for

hypothesizing is attuned to nature. I will examine this position in detail in section 7.2,

for it is admittedly problematic and deserves ample consideration and scrutiny.

173 In contemporary terms, we might recognize here the distinction between the ‘simplicity’ and the ‘economy’ of a hypothesis. The contemporary distinction would clarify that the two types of “Peircean” ‘simplicity’ are not necessarily compatible and often oppose each other. For example, a hypothesis might appeal to many concepts in order to be a ‘simple’ explanation of an observed phenomenon, while an ‘economic’ or parsimonious hypothesis may be a quite complex explanation. I think Peirce would recognize that problematic relation. I also recognize the distinction, but will use the Peircean terminology of two types of simplicity in order to make clearer the exposition of Peirce’s ideas.

For now, I conclude the abductive account by suggesting that Bernoulli’s own sense of the simplicity of his hypothesis only accords partially with this Peircean position. For Bernoulli, his conjecture is the simplest hypothesis in that it reflects the simplicity that we perceive in nature. He claims that we see nature everywhere to follow the simplest paths. In so far as he means that the simplicity of nature is an objective simplicity—that is, as an inherent quality of nature, a pure First—then I think this does not accord with Peirce, for whom the simplicity of nature is relational—‘simplicity’ is of the character of a Second in so far as it is a relation of attunement between our minds and the structure of the universe. A hypothesis that primarily expresses that relational attunement is the ‘simplest hypothesis’. Thus, in so far as Bernoulli means that the simplicity of nature is a perceived simplicity in that it is the result of the attunement of our minds to the structure of the universe, then Bernoulli’s sense accords with Peirce’s sense of the simplicity of an abductive hypothesis. But would Bernoulli be justified in adopting a hypothesis on the grounds that it is ‘simple’ in either one of the foregoing senses? Again,

I will address this question in section 7.2, in the context of considering some possible objections to the abductive account of his inference that I have advanced.

6.4 Comparing Both Accounts of Bernoulli’s Ampliative Inference

How does the foregoing account of Bernoulli’s reasoning as abductive compare with the previous account of his reasoning as an inference to the best—loveliest potential—explanation? I think there are several compatible elements between the

accounts. The most important common feature is that both accounts emphasize that

Bernoulli infers an explanatory hypothesis and that this process involves, first, the generation of plausible explanations and, second, a relative weighing of their merits as plausible hypotheses. Moreover, the criteria, under both models, for judging the relative plausibility of hypothetical explanations are compatible. The Peircean conditions that admissible hypotheses be explanatory and yield conceivable testable predictions cohere with Lipton’s view that we deem to be more plausible those explanatory hypotheses that

(a) specify a causal mechanism, (b) are precise, and (c) unify our understanding and our explanatory scheme.

First, it is reasonable to say that hypotheses that are capable of experimental verification—in Peirce’s broad sense of having conceivable practical consequences that can be put to the test of experiment or experience—must be sufficiently precise so as to suggest how the consequences might follow from the hypothesis under specific conditions. The preference for precision is, in Peircean terms, the condition that admissible abductive hypotheses contain only ‘clear’ and ‘distinct’ conceptions, that is, conceptions for which we can conceive all relevant practical bearings. This is what Peirce calls the maxim of Pragmatism, namely: “Consider what effects that might conceivably have practical bearings we conceive the object of our conception to have: then, our conception of those effects is the whole of our conceptions of the object” (1998, p. 135).

For Peirce, this is a maxim of Logic, and more specifically, the maxim of the logic of abduction, which ought to perform two services: “[I]t ought, in the first place, to give us expeditious riddance of all ideas essentially unclear. In the second place, it ought to lend support [to], and help render distinct, ideas essentially clear but more or less difficult of

apprehension” (1998, p. 239). So abductive hypotheses, in order to be admissible, must be sufficiently clear so as to be capable of rendering testable experimental or experiential consequences, and this is akin to our preference for precise explanations.

Second, the practical consequences of an abductive hypothesis are conceivable on the basis of our entire system of beliefs and so also on the basis of everything that is part of our explanatory scheme. In order to follow the pragmatic maxim, our reasoning about the concepts involved in a hypothesis must involve all relevant elements from our system of knowledge so as to identify the ways in which a proposed hypothesis coheres with or contradicts our existing beliefs. That is, we do not reason on the basis of isolated terms and propositions; we reason on the basis of a whole understanding that helps us to elucidate the concepts and consequences of the explanatory hypotheses that we formulate when confronted with an unexplained phenomenon. As I have argued in previous sections, Bernoulli’s explanatory hypothesis, as compared to alternative hypotheses, coheres most strongly with his core mathematical and scientific beliefs, and this is an important reason why he infers it either as the best explanation or as the most plausible abductive explanation, depending on the respective inferential models that we have deployed to interpret his reasoning.

The third issue—namely, that of abductive hypotheses fulfilling our preference for explanations that specify a causal mechanism—is more complicated, but I think the

Peircean model of abduction can address it. Lipton admits that the question of a philosophical theory of explanation is not settled, but his hope is that, whatever the true theory of explanation may turn out to be, it will be compatible with his logical description of the inference to the best explanation. Nonetheless, Lipton does think that whatever a

good explanation may be, one of the characteristics we prefer is that it specify some causal mechanism by way of which the observed phenomenon occurs.174 This is

compatible with the logic of abduction, which admits of causal laws as explanations of

phenomena. In the case of Bernoulli, he seeks to explain the occurrence of the statistical

regularity sn as a general, mathematically ruled phenomenon. So he seeks a general,

mathematically principled explanation. This kind of explanation is what his abductive

hypothesis suggests and what his mathematical theorem elaborates. Admittedly, for

Bernoulli when it comes to explaining the statistically regular occurrence of a particular

natural event, part of the explanation must articulate not only the mathematical principles

of probability that explain the statistical regularity but also the particular causes of that

particular event. And he has a determinist view of causation that will create problems in

his reasoning, as I will argue in section 8.3. Even so, he thinks of the a priori

probabilities of events as causes and of the observed statistical ratios as effects. His art of

conjecturing precisely seeks to provide a scientific way to reason from effects back to

causes.175 He intends for this new science to provide a third way to yield knowledge

between demonstrative certainty and mere opinion. My suggestion has been that

Bernoulli warrants this new mathematical science on the basis of an abductive

hypothesis.

174 I agree with Lipton that it is not within the scope of an account of the logic of ampliative inference to solve the question of a true theory of explanation. Lipton favors a causal model of explanation (1991, ch. 3) while N.R. Hanson favors a model of explanation that appeals to physical laws and principles, including causal laws (1965, ch. 3). 175 In our present context, I leave open the question of how an a priori probability p may be the cause of a statistical frequency sn. I will discuss extensively the question of how p is explanatory of sn for Bernoulli in section 8.2.

In sum, I think it is reasonable to claim that Lipton’s criteria for preferable hypotheses are largely compatible with the Peircean conditions for the admissibility of hypotheses, and therefore, with the abductive model of inference. In the case of

Bernoulli, the hypothesis Hp fulfills the Peircean criteria better than any of the alternative

hypotheses, and so we may infer it abductively, as we conclude that it is the most plausible explanation for the observed statistical regularity and thus worthy of experimental examination.

Now, it is important to emphasize again that at the end of our ampliative reasoning we conclude that a hypothesis such as Hp is plausible, not probable. This is an

indication that the reasoning is abductive, not inductive. It is precisely here where I think

Lipton misclassifies his model of inference to the best explanation as inductive. Were he

to advocate a model of inference to the likeliest potential explanation, the classification of

the inference as inductive would be understandable because the question of likeliness, in

Lipton’s terms, is a question of inductive probability in Peircean terms. Likeliness is a

measure—quantitative or qualitative—of the degree to which a general hypothesis agrees

with all the evidence, so it is a measure of the inductive probability of the hypothesis. But

Lipton advocates a model of inference to the loveliest potential explanation, that is, to the explanation that provides the deepest understanding of the phenomenon, and this is an abductive inference. In the case of Bernoulli, he infers the general hypothesis Hp—‘the

unknown a priori probability p of the event E is close to the observed statistical

frequency sn’—because ‘p being close to sn’ makes most understandable the occurrence

of sn and not because ‘p being close to sn’ makes most probable the occurrence of sn.176 In

short, I think that Lipton’s model of inference to the best explanation must be classified

as a form of abduction and not of induction.

Finally, as I pointed out above, the general hypothesis Hp contains the conception

of an ‘unknown a priori probability p’ that underlies a natural process in which the

statistical regularity sn occurs. The hypothesis is not inducible from the observed

phenomenon since the conception of an ‘unknown a priori probability’ is not inducible

from the idea of an observed frequency ratio. The concept of observed ‘relative

frequency’ does not lead inductively—that is, by drawing a general conclusion

exclusively from the repeated observation of frequency ratios—to the concept of ‘a priori probability’.177 Bernoulli, then, brings a concept to bear in a novel way on the explanation of an observed phenomenon, and this is a mark of abduction: the act of bringing relevant concepts to bear in creative ways on the explanation of previously unexplained observed phenomena is an abductive act.178 For all the foregoing reasons,

then, I think Bernoulli’s ampliative inference is most accurately and adequately

characterized as abductive.

176 On this point, see Hacking who writes: “I am not trying to foist on Jacques Bernoulli a simple ‘likelihood’ interpretation, which would be quite unjustified. That is, I do not claim that he favours sn as an estimator of p because that is the estimator that makes the data most probable” (1971a, p. 229). 177 Again, I am claiming that Bernoulli does not hold a frequentist interpretation of probability; that is, Bernoulli does not simply define probability to be the long run relative frequency of an event. His work is prior to the rise of the frequency interpretation of probability. In my estimation, his position is rather that the a priori probability of an event explains the observed occurrence of the event with some relative frequency, but the probability is not defined as the relative frequency. Again, I will address how p is explanatory of sn for Bernoulli in section 8.2 below. 178 We might once again think here of Bernoulli’s reasoning regarding the relationship between a priori probabilities and a posteriori statistical regularities in contrast to Arbuthnot’s.

Chapter 7

Abduction as Rational Ampliative Inference: Objections and Replies

The foregoing abductive account provides a logical description of the reasoning

that led Bernoulli to a seminal discovery in the applicability of mathematical probability

to science. The question that Bernoulli confronts is how to justify the estimation of

probabilities by way of statistical frequencies in the case of natural events. His solution,

or justification, is an inference to a plausible hypothesis: The most plausible explanation

for observing the statistically regular frequency sn of event E in a random natural

process is that the unknown a priori probability p is close to sn. Therefore we have good

reason to hold the hypothesis that the unknown probability p is close to the statistical

frequency sn.
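Put schematically, in the notation of the preceding chapter, the inference runs: (i) the statistically regular frequency sn of event E is observed in a random natural process; (ii) if Hp were the case, that is, if the unknown a priori probability p of E were close to sn, then one would expect to observe sn; (iii) hence there is reason to hold Hp as a plausible explanatory hypothesis, worthy of experimental examination.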

I have argued that both Lipton’s model of the inference to the loveliest potential

explanation and, more comprehensively, Peirce’s logic of abduction provide reasonable

accounts of Bernoulli’s discovery. These models constitute an improvement over

traditional twentieth century accounts of the logic of scientific reasoning, such as

Hempel’s hypothetico-deductive model or Popper’s model of conjectures and

refutations.179 Both of these traditional accounts claim that logical scientific reasoning only begins after the

hypotheses or conjectures are given. Proponents of these models typically place the

burden of demonstrating that the question of discovery is logical and not merely

179 For Lipton’s discussion of how the inference to the best explanation improves on hypothetico- deductivism, see Lipton 1991, chapters 4 and 5.

psychological on the proponents of any logic of discovery. I submit, to the contrary, that the foregoing account does show that a logical process, not psychological happenstance, led Bernoulli to infer his hypothesis. In what follows, I aim to substantiate this claim further by addressing explicitly some possible criticisms that may be leveled against abduction as rational ampliative inference. In order to draw out the contemporary relevance of this issue, I turn to address the possible charges that contemporary philosopher Bas van Fraassen might bring against the abductive account of Bernoulli’s discovery. In Laws and Symmetry, van Fraassen elaborates a detailed criticism against the inference to the best explanation (I.B.E.) as the fundamental rule of ampliative reasoning.

I take up some key criticisms that may apply also against the abductive account of

Bernoulli’s ampliative inference in order to elucidate both where those criticisms are misguided and where, even if they are valid criticisms against I.B.E., they can be addressed by way of the abductive account.180

7.1 The Descriptive Adequacy of the Abductive Model

At the outset of his critique, van Fraassen announces that he will urge a “gentle”

verdict against the I.B.E. model, namely: “Someone who comes to hold a belief because

he found it explanatory, is not thereby irrational. He becomes irrational, however, if he

adopts it as a rule to do so, and even more if he regards us as rationally compelled by it”

180 For Lipton’s response to van Fraassen in defense of the inference to the best explanation, see Lipton 1991, chapters 7 and 9. I find Lipton’s responses to be very apt.

(van Fraassen 1989, p. 142). Van Fraassen is ultimately concerned with refuting the position that I.B.E. is the cornerstone of epistemology, that is, the position that it is the fundamental rule by which we do, and ought to, seek and secure ‘warranted’ beliefs, where ‘warranted’ means deductively certain or at least likely to be true. At once we can see the ultimate stakes of the debate: On what grounds and by which rule are we to pursue certain, or at least probable, true knowledge? As we will see, any response that links inference to explanation—such as I.B.E. and abduction—is untenable for van

Fraassen. The deliberate adoption of an ‘ampliative rule’ of inference based on explanation would in fact make us irrational according to him.

Among van Fraassen’s many charges against I.B.E., the central one for our purposes is that the I.B.E. model pretends to be something other than it is (van

Fraassen 1989, p. 142). He writes that I.B.E. “is not what it pretends to be, if it pretends to fulfill the ideal of induction. As such its purport is to be a rule to form warranted new beliefs on the basis of the evidence, the evidence alone, in a purely objective manner. It purports to do this on the basis of an evaluation of hypotheses with respect to how well they explain the evidence, where explanation again is an objective relation between hypothesis and evidence alone” (van Fraassen 1989, p. 142). I submit that the logic of abduction is not subject to the same charge. In the first place, one of the essential differences between induction and abduction is that the particular observed instances which constitute the evidence do not lead, by inductive generalization, to explanatory hypotheses that involve novel concepts not contained in the evidence. In fact, the model of I.B.E., at least in Lipton’s version of inference to the loveliest explanation, is best categorized as abductive. So inference to the loveliest explanation ought to purport to be

abductive. This may not assuage van Fraassen’s other objections, but at least it will be a more accurate classification of the logical form of that inference. Second, the model of abductive inference does not purport to consist in the objective formation of an explanatory hypothesis on the basis of the evidence alone. For Peirce the formation of an explanatory hypothesis is of the nature of perceptual judgments, and perception is always interpretative. Just as perception is inherently informed by our interpreting ‘habits’—that is, by the conscious and unconscious beliefs that inherently inform our perception of reality—so is abductive inference informed by our existing system of knowledge. Note that abductive inference is ‘informed’ but not ‘determined’ by our existing knowledge because experience often confronts us with unexpected phenomena, with real facts that do not cohere with our system of beliefs, and so we must subject our disrupted beliefs to logical scrutiny. According to Peirce, we infer the plausibility of explanatory hypotheses partly on the basis of our existing system of knowledge, and once we make the inference, we subject our hypotheses to the test of reality, of real experiment or experience. Of course, as we will see, this appeal to the ‘test of reality’ will only delay some of van

Fraassen’s more poignant criticisms. For now, it is important to emphasize that abduction consists, at least partly, in conceiving new explanatory notions or in associating previously unrelated concepts to the explanation of the observed phenomena. Abduction does not consist in a rule-like inference to general explanation from particular evidence alone.

This brings up the third point, namely, that the logic of abduction does not provide a

‘rule’ for inference. The model of abduction describes the conditions for the possibility of generating explanatory hypotheses, the logical form of ampliative inference to an

explanatory hypothesis, and the criteria for assessing the plausibility and explanatory merit of hypotheses.181 However, it does not purport to be a rule that leads to the

discovery and establishment of ‘warranted’ new beliefs—again, in the narrow sense of

‘warranted’ as ‘deductively certain’ or ‘likely to be true’.182 In fact, fourth, the logic of

abduction only describes the reasoning process that leads to the adoption of an

explanatory hypothesis as plausible, not as ‘warranted’ true belief. Abductive inference

only provides the initial judgments, or first premises, of a more involved and thorough

process of scientific reasoning. According to Peirce, the whole process involves the

generation and adoption of plausible explanatory hypotheses, the deduction of testable

consequences from the hypotheses, and the experimental or experiential testing to assess

the level of agreement between fact and theory. The description of this process involves

the entire logic of abduction, deduction, and induction. Even after completing the whole

inquiry, we only hold the scientific conclusion as provisionally true until such time as

new evidence or experience might bring it into question.183 The logic of abduction does

not involve the claim that the inference to an explanatory hypothesis produces a new

‘warranted’ belief; it only claims that an abductive inference consists in generating an explanatory hypothesis and judging it to be plausible.

Now, the epistemological demand that, in order to be rationally believed, an

inferred explanatory hypothesis must be certainly, or at least probably, true conflicts with

actual scientific practice, which involves the rational adoption of explanatory hypotheses

181 See section 2.2.2. 182 In the remainder of this chapter, the term ‘warranted’ will have this narrow sense. 183 See Charles Sanders Peirce, Illustrations of the Logic of Science, in EP 1, p. 109-199. Here Peirce describes his early model of the ‘method of science’.

as plausible. But van Fraassen’s main objection against abductive inference would be that it cannot yield new beliefs that are at least likely to be true; this, at least, is precisely his main charge against I.B.E. Van Fraassen argues that I.B.E. cannot yield ‘warranted’ new beliefs because it only selects the best among the historically given hypotheses. However, he points out, we cannot compare these available hypotheses with those that no one has formulated, and the best one among the available candidate hypotheses may well be the best among many bad ones. This is not sufficient to warrant our belief in the best hypothesis. According to van Fraassen, choosing the best is only a comparative judgment, but a prior ampliative step is required as well. He writes: “To believe is at least to consider more likely to be true than not. So to believe the best explanation requires more than evaluation of the given hypothesis. It requires a step beyond the comparative judgment that this hypothesis is better than its actual rivals. While the comparative judgment is indeed a ‘weighing (in the light of) the evidence’, the extra step—is not. For me to take it that the best of set X will be more likely to be true than not, requires a prior belief that the truth is already more likely to be found in X, than not” (van Fraassen 1989, p. 143). Van Fraassen aptly notes that, in order to be a viable form of reasoning, ampliative inference from observed phenomena to an explanatory hypothesis would have to involve two steps: first, generate a set of explanatory hypotheses that is likely to include the true explanation; and second, infer the best hypothesis by deciding, on the basis of adequate (objective) criteria, which one provides the best explanation for all the available evidence. The models of inference to the best explanation that van Fraassen

rejects do not explicitly account for the first, ampliative step and, furthermore, cannot provide a ‘warrant’ for it. So the second, comparative step is unwarranted.184

The abductive model, however, does account explicitly for the ampliative step.

Peirce in fact emphasizes the ampliative step of generating hypotheses. The comparative step of judging one hypothesis to be more plausible than another is only implicit. But I elicit it from Peirce by considering that for him reasoning is a continuous process; thus,

when confronted with an unexplained phenomenon, we formulate a series of potential

explanations, judge them to be more or less plausible, and infer the most plausible

hypothetical explanation, provided it be capable of being put to the test of experiment.

The logical analysis of the process may make it seem as if we generate a series of rival

hypotheses in discrete acts of reasoning and then infer the best one in a separate act; but

the process of generating hypothetical explanations and of judging their relative

plausibility is continuous. In fact, the term ‘steps’ introduces some undue discreteness

into the analysis, so it is more accurate to say that Peirce emphasizes the generative phase

of the abductive process while often leaving implicit the comparative phase of the whole

process of abductive reasoning.

Let us turn to assess how the abductive account of Bernoulli’s inference stands,

given van Fraassen’s concerns so far. To begin, my account explicitly describes the

184 In my estimation, Lipton’s model of inference to the loveliest explanation addresses thoroughly van Fraassen’s charge. In the first place, the model describes two ‘epistemic filters’: one for generating explanatory hypotheses and another for choosing the best one. In the second place, it concedes that inference to the best explanatory hypothesis does not warrant the conclusion that the hypothesis is likely to be true. It concludes that the hypothesis is lovely, that is, that it has some desirable features—such as explanatory precision, cohesion with other beliefs, and power to unify our understanding—that make it the most tenable position for the time being. But again, I propose that this is precisely what makes it an abductive, not an inductive, inference.

ampliative phase of Bernoulli’s reasoning process. There are natural events that exhibit a statistical regularity in the form of an observed relative frequency sn. But in the context of

early modern science in Europe this is puzzling: Why would we observe any statistical regularity of that form in a natural event? Bernoulli, just like Leibniz, Arbuthnot, or any other scientist who confronts this kind of question, struggles to formulate possible

explanations Hi. Of course, there are an infinite number of possible explanations that they

might formulate. But the abductive model tells us that the generative process always has

the logical form: ‘If Hi were the case, then one would expect to observe sn’. Admittedly,

this logical form does not provide the mathematical scientists with an ampliative rule for

making any crucial discovery regarding the explanation of statistical regularities in nature. Even if one were to prescribe this formal scheme of reasoning to Bernoulli, he would not thereby generate good explanations on the basis of the evidence alone.

Important conditions, such as his knowledge of mathematical probability and his belief

that nature is mathematically describable, must be in place for him to be able to

conjecture that statistical regularities in nature have a mathematical explanation, that a

priori probabilities act as causes in a mathematically precise way.

Moreover, Bernoulli makes comparative judgments among the possible explanations that he is able to generate. But these are judgments of relative plausibility,

not of relative probability. We do not know all the hypothetical explanations that he may

have entertained, though in the previous chapter I suggested some hypotheses that any

inquirer could have formulated. But, according to my reconstruction of Bernoulli’s inference, he did formulate the hypothetical explanation: ‘if Hp were the case, that is, if

the unknown a priori probability p of event E were close to sn, then we would expect to

observe sn’. And he did conclude, therefore, that Hp is plausible. He infers it as

plausible because Hp makes the occurrence of sn “expectable” not only in the sense of

making it “predictable” but also in the sense of making it “understandable” in the light of

Bernoulli’s entire system of beliefs. In sum, the abductive account rejects any purport

that Bernoulli followed an inductive rule for inferring the explanation most likely to be

true on the basis of the evidence alone. Instead the model (i) describes explicitly the

generative and comparative aspects of the reasoning process that leads Bernoulli to adopt

the hypothetical explanation that a priori probabilities explain observed statistical

frequencies in natural events, (ii) stipulates some of the conditions that allow Bernoulli to

generate that hypothetical explanation, and (iii) specifies some of the criteria that guide

Bernoulli’s judgment of relative plausibility among competing explanations.

7.2 The Problem of Truthful Hypothesizing

Only at this point, however, do we come to consider what, in my estimation,

would be one of van Fraassen’s most challenging objections to the abductive model of

ampliative inference. Put succinctly, the objection would be that even if the abductive

model were to describe explicitly the ampliative phase in the inference to a hypothetical

explanation, it would not thereby offer any justification for that inference. In other words,

even if van Fraassen were to accept the foregoing description of abductive inference, he

might charge that all I have done so far is describe a model of unjustified or

‘unwarranted’ inference. Recall that against the inference to the best explanation van

Fraassen poses the problem in terms of likeliness to be true: To believe the best explanation means to hold that the explanation is at least likely to be true. This requires the prior belief that the truth is likely to be found among the set of explanatory hypotheses that we are able to generate. But what is the ‘warrant’ for this prior belief?

Strictly speaking, the abductive model is not subject to this exact problem because it does not claim that we ‘believe’ the inferred explanatory hypothesis. The model does not make this claim in the traditional sense of ‘belief’ as certainly (or probably) true opinion, nor in the Peircean sense of ‘belief’ as a deliberately adopted proposition upon which we are willing to, and in fact do, act under given circumstances.185 All we are

willing to do when we infer a scientific hypothesis is to put it to the ‘test of reality’ to

examine whether, after a whole process of deduction of consequences and inductive

comparison of predictions with results, we may reasonably adopt the hypothesis as a

tenable theory. Only at that point do we ‘believe’ the hypothesis and, according to Peirce, the warrant for doing so is that the entire method of reasoning—the abductive, deductive, and inductive process—that we employ is self-corrective, that is, it detects its own errors when confronted with reality in the course of experience, and so it leads towards the truth in the long run.186 Moreover, Peirce’s logic of abduction only describes the initial stages

185 For a full account of Peirce’s notion of ‘belief’, see “The Maxim of Pragmatism” in EP 2, p. 133-144. According to Peirce, “belief consists mainly in being deliberately prepared to adopt the formula believed in as a guide to action” (p. 139). 186 See, for example, “The First Rule of Logic” in Peirce RLT, p. 165-180. In this lecture, part of the Cambridge Conferences Lectures of 1898, Peirce argues that inquiry of every type—including abductive, inductive, and deductive reasoning—is self-corrective so that “it may be said that [the only] thing needful for learning…is a hearty and active desire to learn what is true” (p. 171). The emphasis is on active inquiry: as long as we remain attentive to the experiential disruption of our beliefs and keep putting questions to nature by testing our hypotheses, self-correcting reasoning will lead us towards the truth in the long run. This claim involves the Peircean view that our logic strives to conform to the logic of the universe.

of our process of self-controlled reasoning, that is, of reasoning which we pursue deliberately and can control logically; but Peirce concedes from the outset that abductive inferences are not ‘warranted’ in the strong sense in which deductive inferences are warranted because from true premises they must necessarily lead to true conclusions.

According to Peirce, the only justification for abductive inferences is that this form of reasoning provides our only hope for understanding the reality that we confront. This is because abduction is the only form of reasoning truly generative of new conceptions and of novel explanations for the phenomena that we experience.187

Nevertheless, we do hold our abductive inferences to be at least plausible and we

do act upon them by proceeding to test them via experiment. So we might ask whether

this action is justified in some way. Moreover, Peirce does propose that in abduction lies

our only ‘reasonable’ hope for understanding the reality we experience, so one might

press for the basis of that hope.188 An objector in line with van Fraassen, then, might pose

this problem against the abductive account: Even if the abductive model does not make

any pretense that explanatory hypotheses are inferred as certainly true or even as

probably true, what is the ‘warrant for believing’ that even our most plausible hypotheses

are worthy of being put forth for the so-called ‘test of reality’? What is the ‘warrant for

believing’ that our explanatory hypotheses tend to strike upon the truth? This problem is important; it poses challenges that a thoughtful defense of the abductive model must address. For shorthand, I propose to call it the ‘problem of truthful hypothesizing’. In

187 Peirce writes, “if we are ever to understand things at all, it must be in that [abductive] way” (EP 2, p. 205). 188 Recall that a belief, which is the basis for action, is ‘reasonable’ when we adopt it according to one of the forms of logical inference—abduction, induction, or deduction—and we hold it with the appropriate, or justified, degree of confidence—plausibility, probability, or necessity, respectively.

response, I want to argue that even though there is no strict ‘warrant for believing’ that our hypothesizing tends towards the truth—in the narrow epistemological sense of

‘warranted belief’ as a deductively certain proposition—there is reason for hoping that our abductive reasoning does tend to generate hypotheses that lead us towards the truth in the long run.

For his part, van Fraassen anticipates and rejects several possible responses to the

‘problem of truthful hypothesizing’. I will examine the two responses that pertain directly to Peirce’s ideas. First, there is what van Fraassen calls the appeal to ‘privilege’, which

“consists in a claim for privilege of our genius. Its idea is to glory in the belief that we are by nature predisposed to hit on the right range of hypotheses” (van Fraassen 1989, p.

143). Van Fraassen links this claim to the “medieval metaphysical principle of adequatio mentis a rei” (p. 143). But he notes that contemporary readers would not accept it as a metaphysical principle and would request a justification. One of the possible justifications of the principle is naturalism in epistemology. According to van Fraassen, the naturalist position bases its claims “on the fact of our adaptation to nature, our evolutionary success which must be due to a certain fitness” (p. 143). This position is akin to Peirce’s own. Van Fraassen objects that the conclusion that we are predisposed by nature to strike upon true hypotheses “will not follow without a hypothesis of pre-adaptation, contrary to what is allowed by Darwinism” (p. 143). According to him, nature

“does not select for internal virtues—not even ones that could increase the chance of adaptation or even survival beyond the short run. Our new theories cannot be more likely to be true, merely given that we were the ones to think of them and we have characteristics selected for in the past, because the success at issue is success in the

future” (p. 143). Van Fraassen, in sum, objects for two interrelated reasons, namely, (i) the naturalist position relies on the non-Darwinian hypothesis that a ‘pre-adaptation’ to nature allowed us, from the outset, to strike upon true hypotheses about our natural world and so this ‘internal virtue’ favored our survival, and (ii) even if in the past natural selection favored our alleged capacity for correct hypothesizing, our past success does not guarantee that in the future we will continue to generate hypotheses about nature that, on the whole, are likely to be true.

7.2.1 The Abductive Faculty

Let us weigh these charges specifically against Peirce’s own position.189 For him,

the question of what sort of validity can be attributed to abductive reasoning is the

fundamental question of logical critic, that is, of the branch of logic that treats of the classification and evaluation of arguments (see EP 2, p. 443).190 In my estimation,

Peirce’s argument for the validity of abduction does not rest ultimately on an appeal to

Darwinism in support of an evolutionary epistemology. His principal argument for the

validity of abduction is not that, in the course of evolution, nature has selected for our

internal capacity for guessing true explanations of phenomena. That is, Peirce’s argument

189 It is beyond the scope of my present examination of the viability of the abductive model of inference to assess whether Darwinism allows for the positions inherent to naturalist epistemology in general. It may well be that the naturalist position is, in general, non-Darwinian so that an appeal to Darwinism is unfounded. The other elements of the position may nevertheless be worthy of reconsideration in a more extensive study. At any rate, my specific concern here is with Peirce’s ‘naturalist epistemology’ and the support it may lend to the logic of abduction. 190 For Peirce, the three branches of logic are speculative grammar, critic, and methodeutic. See “What Makes a Reasoning Sound?” in EP 2, especially p. 256-257.

does not rely on any particular thesis regarding human evolution. His argument, however, does rest on an appeal to our ‘abductive faculty’ and in this sense it falls under what van

Fraassen classes as ‘naturalism’. The ‘abductive faculty’ is our instinctive capacity for perceiving and cognizing the general elements of phenomena that make them understandable and explicable.191 Peirce puts forth various reasons for thinking that we

do have such a faculty of ampliative cognition. He does not put them forth as

conclusively certain arguments, but they are meant to show that it is reasonable to hold that we have a capacity for hypothesizing that tends to strike upon the truth. In my interpretation, Peirce’s reasons are mainly of three types—the instinctive nature of human reason, the historical progress of science, and the role of experience in inquiry.

In the relatively late (1908) essay “The Neglected Argument for the Reality of

God,” Peirce explicitly rests the validity of abductive inference upon our ‘abductive

faculty’. He describes abduction as “the spontaneous conjectures of instinctive reason”

(EP 2, p. 443). Regarding the question of the validity of abduction, he writes that the

“first answer we naturally give is…that we cannot help accepting the conjecture at such a

valuation as that at which we do accept it; whether as a simple interrogation, or as more

or less plausible, or occasionally, as an irresistible belief. But far from constituting, by

itself, a logical justification such as it becomes a rational being to put forth, this pleading,

that we cannot help yielding to the suggestion, amounts to nothing more than a

confession of having failed to train ourselves to control our thoughts. It is more to the

191 This definition implicitly relies on both Peirce’s theory of the categories and his theory of perception. For my presentation of the theory of the categories, see section 2.1.1. For a brief account of the theory of perception, see section 6.3.3.

purpose, however, to urge that the strength of the impulse is a symptom of its being instinctive” (p. 443). According to Peirce, while we experience the abductive suggestion as a spontaneous and forceful impulse, this only helps to mark the abduction as instinctive but does not justify the inference. If we were to offer the irresistibility of the conjecture as its warrant, we would merely betray a logical failure on our part—logical in the Peircean sense of failing to exercise controlled criticism of the form and implications of our thoughts. So the urgency of the thought does not validate the abduction; it only marks it as instinctive.

Yet for Peirce the instinctiveness of the abductive process does provide reason to hope that it tends to lead us to the truth. In other words, Peirce does not put forth the instinctiveness of abduction as the ‘warrant’ to accept the hypothesis conclusively; he rather suggests that it is a reason to give the hypothesis consideration at an adequate degree of plausibility and to shape it by exerting some controlled criticism over it. For

Peirce, “if we knew that the impulse to prefer one hypothesis over another really were analogous to the instincts [for flight] of birds and wasps [for example], it would be foolish not to give it play, within the bounds of reason; especially, since we must entertain some hypothesis, or else forego all further knowledge than that which we have already gained by that very means” (EP 2, p. 443-444). Peirce, then, thinks that the instinctive strength of our conjectures confers on them some degree of plausibility, and the very possibility of the advancement of our knowledge is at stake in giving our

plausible abductive hypotheses ‘play’—that is, a lively exercise of our reasoning powers.192

I should emphasize, however, that Peirce vacillates about just how purely

instinctive our ‘abductive faculty’ is. He does not assert that this faculty is just an instinct

or that our abductive conjectures are merely instinctive. At times he describes our

conjectures as being instinctive, but at other times he describes them as being analogous

to instinctive impulses, as in the previous passage.193 I think the reason for the vacillation

is that, for Peirce, it is not only the instinctive nature of our conjectures that gives us some

valid reason to ponder their plausibility as explanations for observed phenomena. There

are other reasons to hope that our abductive inferences point us in the direction of true

explanations. Accordingly, he writes: “But is it a fact that man possesses this magical

[abductive] faculty? Not, I reply, to the extent of guessing right the first time, nor perhaps

the second; but that the well prepared mind has wonderfully soon guessed each secret of

nature, is historical truth. All the theories of science have been so obtained….There is a

reason, an interpretation, a logic in the course of scientific advance; and this indisputably

proves to him who has perceptions of rational, or significant, relations, that man’s mind

must have been attuned to the truth of things in order to discover what he has discovered.

It is the very bedrock of logical truth” (EP 2, p. 444). There are two important points to

192 For Peirce’s interesting discussion of intellectual ‘play’, which in its most free form he calls ‘musement’, see EP 2, p. 434-439. 193 I think this is an interesting question for Peirce scholarship. Peirce often writes on the relationship between ‘reason’ and ‘instinct’. In his 1898 lecture “Philosophy and the Conduct of Life” (EP 2, p. 27-41) he seems to go so far as to make a pretty strong distinction, so that ‘reason’ ought to rule over the theoretical matters of science and philosophy while ‘instinct’ should guide us in practical matters of vital importance. Even then, however, ‘reason’ must be attentive to the suggestions of ‘instinct’. In general, his more consistent position is that ‘reason’ is itself an ‘instinct’. For a careful elaboration, see Ayim 1982.

draw out from this revealing passage. First, Peirce appeals to historical fact—to the

‘historical truth’ of the advancement of scientific knowledge—in order to support the claim that we have a capacity for hypothesizing that tends to strike upon the truth.

Second, Peirce implicitly admits that our abductive capacity is not merely instinctive, for it is not just any mind, but the well prepared mind, that perceives the true reasons and explanations of natural phenomena. I turn to elaborate these points.

The attention to the history of scientific progress is in fact an important feature of

Peirce’s views on logic and discovery. In his relatively early (1878) essay, “How to Make

Our Ideas Clear,” Peirce already declares that the history of science “affords some hints” on the art of conceiving the “vital and procreative ideas” that advance human knowledge

(EP 1, p. 141). Thirty years later, in the “Neglected Argument,” Peirce argues that the history of scientific progress “indisputably proves” to us that we must have a faculty for discovering the true explanations for natural phenomena.194 Admittedly, an appeal to

scientific progress is controversial for us, especially since the very notion that there is

progress in science is a matter of debate, and an appeal to the logic of scientific progress

is even more controversial.195 Moreover, Peirce seems to be begging the question: he

rests the validity of abductive hypothesizing upon our ‘abductive faculty’; but he presents

the history of successful scientific hypothesizing as ‘indisputable proof’ that we have

such a faculty.

194 I will shortly present Peirce’s seemingly strong claim of an ‘indisputable proof’ in a more tempered light since I am arguing that the Peircean position is rather that it is ‘reasonable’, not indisputable, to think that we have a capacity for forming true explanatory hypotheses for natural phenomena. 195 On the specific question of mathematical progress, see Grosholz 2000a, especially Part III.

I think that the Peircean position regarding the relationship between the history of scientific progress and the validity of abductive inference may be put forth more clearly by way of the distinction between ‘argument’ and ‘argumentation’. An ‘argument’ is

“any process of thought reasonably tending to produce a definite belief,” while an

‘argumentation’ is an argument “proceeding upon definitely formulated premisses” (EP

2, p. 435). We can offer the following ‘argument’ for the validity of abductive inference on the basis of the history of science. We infer abductive hypotheses at some degree of plausibility. Our abductive faculty does not ‘strictly justify’ or ‘warrant’ the outright conclusion that our hypotheses are certainly or probably true. However, our abductive faculty ‘reasonably justifies’ us in various ways to ponder our conjectures, judge their plausibility, and formulate them into testable hypotheses, all in the hope of ascertaining their truth after conducting a complete inquiry. One reason for giving play to our abductive hypotheses is that our abductive faculty is an intrinsic part of our perceptive abilities, of the way we encounter reality and form our conceptions about the universe.

Another reason is that our abductive faculty is analogous to the instincts of animals, say to the instinct of birds and wasps for flying. To refuse to give ‘play’ to our conjectures is to negate our own instinctive impulses. Another reason, the one at issue for us presently, is that the abductive faculty has enabled inquirers within the scientific community of inquiry throughout history to generate true hypotheses that explain some natural phenomena that were initially perplexing to us. The history of the advancement of scientific knowledge is evidence that human inquirers have the capacity for formulating true hypotheses to render understandable the phenomena of nature. At the same time, it gives us reasonable hope that if we continue to pursue our best conjectures we will strike

upon the truth about what presently puzzles us, that is, about those phenomena of nature that we cannot presently explain. Admittedly, the appeal to the history of science is not part of an ‘argumentation’ that proceeds from premises based on the history of science to the certain conclusion that our abductive faculty generates true hypotheses without error.

It is, however, part of an ‘argument’ that, on the basis of the historical advancement of scientific knowledge, leads us to hold the reasonable hope that we have a conjecturing capacity that tends to guide us towards the truth about nature.

At this point, we might recall van Fraassen’s objection that our alleged past success in hypothesizing about nature does not guarantee that in the future we will continue to generate hypotheses that, on the whole, are likely to be true (or at least plausible). His objection concerns evolutionary epistemology, but it might also be about the history of science. If we seek an absolutely certain guarantee for the success of future hypotheses, I think the point is granted. The alleged history of scientific progress does not guarantee the success of future scientific hypothesizing. In as much as the objection concerns scientific practice, however, I do not find it to be very forceful or interesting. If we were to act only upon absolutely certain inferences, we would be effectively paralyzed, both in practical and in scientific matters. The Peircean point is that, in scientific inquiry, the only hope for understanding what puzzles us is to pursue our abductive conjectures, and the admittedly tenuous validity of these conjectures rests upon our capacity for perceiving and hypothesizing the truth about nature. The history of scientific progress on the basis of abductive hypothesizing provides one reason to think that we have such a capacity. Now, an overarching defense of the position that there is a logical progress in the history of scientific knowledge is a task far too extensive to

undertake in the limited context of this discussion. I have defended extensively, however, the position that Bernoulli’s inference is abductive. He is confronted with the phenomenon of observable statistical regularities in natural events and infers the hypothetical explanation that the observed statistical frequencies are close to the true probability of the events. The modest claim now is that historical cases of successful hypothesizing suggest that we have a capacity for formulating true explanatory hypotheses about natural phenomena, and so it is reasonable to continue to pursue those hypotheses with a ‘lively exercise of our reasoning powers’.196

Peirce’s reference to this “lively exercise” resonates with his implicit hint that our

abductive capacity is not merely instinctive since it is the ‘well prepared mind’ that formulates hypotheses that tend towards the truth. A “lively exercise” is naturally an activity. Thus, the abductive capacity can be trained and sharpened by practice. In fact, with specific regard to mathematics Peirce explicitly writes, in a letter to F. A. Woods dated 14 October 1913, that reliance upon ‘geometrical instinct’ is not sufficient for mathematical inquiry, especially to discover what would be true of a hypothetical state of affairs; the ‘geometrical instinct’ must be trained. As Peirce puts it, “I do not believe instinct is an absolutely infallible source of truth. On the contrary, it must develop that every feature of creation” (NEM 4, p. vii). This necessity to train the hypothesizing

196 Van Fraassen writes that “in science and philosophy, no less than in ordinary life and in literature,” we see “putative examples” of inference from puzzling phenomena to their best explanation (van Fraassen 1989, p. 131). He might similarly charge that there are many ‘putative examples’ of abduction in the history of science, such as the ones I have presented. But I think this charge would only betray a certain disdain for what history might teach us about the logic of scientific inquiry. At any rate, it would not suffice to label the reconstruction of a logical case of abductive discovery as a ‘putative example’ in order to refute it. It would be necessary both to critique the reconstruction and to offer a better explanation of the discovery. Attempting to articulate what van Fraassen might reply is beyond the scope of this discussion, and so in all fairness I can only let my foregoing examples stand ready for criticism.

instinct also applies to the study of actual states of affairs. Just as mathematicians must develop their ‘geometrical instinct’ by trying different types of ‘diagrammatic’ experiments in a variety of analytical problem-solving situations so as to accumulate and reflectively weigh their experiences, so also the mathematical scientists—i.e. the mathematicians qua scientists—as well as natural and social scientists must train their

‘abductive instinct’ or hypothesis-making ability in a variety of situations. I submit that herein lies the pragmatic upshot of the present discussion; herein we find the practical bearings for a logica docens of the present issue under scrutiny. The training of mathematicians and mathematical scientists must not be simply a training in existing knowledge; it must also be a training in hypothesizing and experimenting. In sum, then, the ‘well prepared mind’ is the mind of those who, as part of the community of inquiry, have cultivated the conditions for the possibility of scientific discovery and have pursued extensive practice in hypothesis-making. Such inquirers have cultivated their abductive ability so as to validate the hope that communal hypothesis-making will lead to the truth in the long run.

7.2.2 The Simplicity of Abductive Hypotheses

Now, van Fraassen anticipates a second response to the ‘problem of truthful hypothesizing’ that is pertinent to Peirce’s position. He calls it the “retrenchment” of those who argue that “[e]xplanatory power is a mark of truth, not infallible, but a characteristic symptom” (van Fraassen 1989, p. 146). For van Fraassen, an example of

this so-called ‘retrenchment’ would be precisely my foregoing argument that abduction is an inference from evidence to a plausible, and not to a true, explanatory hypothesis. The abductive model ‘retrenches’ from a claim of truth to a claim of plausibility for the inferred hypothesis. The explanatory power of the hypothesis is a “characteristic symptom,” albeit fallible, that the hypothesis will turn out to be true. Van Fraassen discusses various forms of retrenchment. The abductive model would belong among those that argue that “the special features which make for explanation among empirically unrefuted theories make them (more) likely to be true,” or in the case of abduction, more plausible (van Fraassen 1989, p. 146). There may be several special features that would arguably make an explanation more plausible. Drawing from Lipton and Peirce, I have pointed out that explanatory hypotheses that are simpler, are more precise, cohere with our explanatory schemes, and unify our understanding are more plausible.

Importantly for our ongoing discussion of Bernoulli’s inference, van Fraassen critiques, as an example, the general claim that the special feature of ‘simplicity’ makes an explanation more likely to be true. More specifically, he rejects the claim that simpler theories are more likely to be true given the supposition that the universe is simple. Van Fraassen charges that this claim only appears plausible due to an equivocation regarding the meaning of ‘simplicity’. On the one hand, ‘simplicity’ may mean an ‘objective’ characteristic of the universe: “If the simplicity of the universe can be made into a concrete notion by specifying objective structural features that make for simplicity,” then he can see how one would arrive at “the opinion that the universe is (probably) simple. For there can be evidence of any objective structural feature” (van Fraassen 1989, p. 148). However, even if the ‘simplicity’ of the universe were to consist of ‘objective structural features’, this kind of simplicity would not necessarily make our simple hypotheses more likely to be true. For this ‘simplicity’ would be a global feature of the structure of the universe, but any “part of a structure, which is very simple overall, may be exceedingly complex considered in isolation” (p. 148). The part of the universe that our simple hypothesis attempts to explain may be highly complex, so the presumption that the simplicity of the explanatory hypothesis is a special feature that makes it more likely to be true does not hold. On the other hand, ‘simplicity’ may be ‘relational’.

According to van Fraassen, “if the universe’s simplicity means the relational property, that it lends itself to manageable description by us (given our limitations and capacities),” he cannot see how one can hold that the universe is simple. If simplicity is relational, one can only claim that our explanatory successes so far “are all successes among the descriptions we could give of nearby parts of the universe, and of the sort which our descriptive abilities allow” (p. 148). The universe and its phenomena may be of a level of complexity that lies beyond our limited descriptive abilities. Thus, whether the alleged

‘simplicity’ of nature means ‘simple-in-itself’ or ‘simple-to-us’, van Fraassen rejects the claim that this cosmic simplicity makes simpler hypotheses more probable. Van Fraassen might extend the foregoing arguments to claim that the simplicity of the universe, whether objective or relational, does not make simpler explanatory hypotheses more plausible either.

In the case of Bernoulli’s arguments, for example, I think that van Fraassen poses an important challenge.197 Bernoulli suggests that one reason for adopting the hypothesis

197 Recall my anticipation of this discussion in section 6.3.3.

that observed statistical regularities in natural events are due to the a priori probabilities of those events is that this is the simplest hypothesis and “we see everywhere that nature follows the simplest paths” (Bernoulli 1966, p. 78). This seems to be the argument that we should adopt the simplest hypothesis because nature is simple, and van Fraassen argues strongly against this argument. Could Bernoulli’s argument withstand the criticism? I think it could, if we interpret the argument along Peircean lines.

From a Peircean perspective, ‘simplicity’ does make for more plausible explanatory hypotheses, where ‘simplicity’ means both the simplicity of the universe and of the hypotheses. For Peirce, the ‘simplicity’ of the universe is relational. The universe is simple in that we are attuned to it.198 And this is precisely the way in which hypotheses are simple. The ‘simplicity’ of hypotheses is of two types. Simplicity means primarily the ‘facility’ or ‘instinctiveness’ of a hypothesis, and only secondarily ‘logical’ simplicity or ‘parsimony’, that is, the simplicity propounded by Ockham’s razor. In the “Neglected Argument” Peirce writes:

Modern science has been builded after the model of Galileo, who founded it on il lume naturale. That truly inspired prophet said that, of two hypotheses, the simpler is to be preferred; but I was formerly one of those who, in our dull self- conceit fancying ourselves more sly than he, twisted the maxim to mean the logically simpler, the one that adds the least to what has been observed, in spite of three obvious objections: first, that so there was no support for any hypothesis;

198 The attunement results because our minds have been shaped by the very evolution of the universe. The thesis is evolutionary: our minds have been formed and have evolved under the very influence of the evolutions of the universe. By way of illustration of the ‘relational’ character of simplicity, Peirce writes: “Observe, that the character of a theory as to being simple or complicated depends entirely on the constitution of the intellect that apprehends it. Bodies left to themselves move in straight lines, and to us straight lines appear as the simplest of curves. This is because when we turn an object about and scrutinize it, the line of sight is a straight line; and our minds have been formed under that influence. But abstractly considered, a system of like parabolas similarly placed, or any one of an infinity of systems of curves, is as simple as the system of straight lines. Again, motions and forces are combined according the principle of the parallelogram, and a parallelogram appears to us a very simple figure” (NEM 4, p. xix). For more on Peirce’s “evolutionary philosophy” see Hausman 1993.

secondly, that by the same token we ought to content ourselves with simply formulating the special observations actually made; and thirdly, that every advance of science that further opens the truth to our view discloses a world of unexpected complications. It was not until long experience forced me to realize that subsequent discoveries were every time showing I had been wrong,—while those who understood the maxim as Galileo had done, early had unlocked the secret—that the scales fell from my eyes and my mind awoke to the broad and flaming daylight that it is the simpler hypothesis in the sense of the more facile and natural, the one that instinct suggests, that must be preferred; for the reason that unless man have a natural bent in accordance with nature’s, he has no chance of understanding nature, at all. (EP 2, p. 444)

By ‘simplicity’, then, Peirce primarily means the relative cognitive ease with which hypotheses formulated by il lume naturale arise, and after “long experience” in scientific inquiry, he comes to believe that this simplicity is the instinctiveness of the hypotheses.

Now, Peirce takes up Galileo’s notion of il lume naturale and turns it into the ‘abductive faculty’.199 On the basis of my previous interpretation of the nature of this faculty in section 6.3.3, then, I think that for Peirce the ‘simplicity’ of hypotheses means their ‘instinctiveness’, ‘insightfulness’, and ‘perceptiveness’—that is, the degree to which the hypotheses arise from our perception of the general elements of a phenomenon that make it understandable and explicable. The new claim in defense of Bernoulli would be that instinctive, insightful, and perceptive hypotheses are more plausible and deserve primary attention in inquiry. Of course, as an important consideration we seek parsimonious hypotheses, but this consideration is secondary to giving our instinctive-perceptive hypotheses the full and vigorous attention of our reasoning powers.

From a Peircean standpoint, then, how is the simplicity of a hypothesis a mark of its plausibility, given that the universe is relationally simple? Recall that van Fraassen’s

199 In future work, it would be interesting to take up a full account of Peirce’s transformation of il lume naturale into the abductive faculty. Part of the account would involve the influence of nineteenth century evolutionary theory on Peirce’s thought and especially on his notion of ‘instinct’.

position is that the supposed simplicity of the universe—whether objective or relational—does not make simpler hypotheses more likely to be true. But by simple hypotheses he means logically parsimonious hypotheses, that is, hypotheses that accord with the maxim known as Ockham’s razor. As a matter of argumentation in those terms, van Fraassen might be right in holding that a higher plausibility of logically simpler hypotheses does not follow from the assumption of global cosmic simplicity, be it objective or relational.

That the universe may be globally or relationally simple does not thereby make parsimonious hypotheses more plausible. Peirce, however, concedes this point. He even mentions it explicitly as a reason to think that the Galilean maxim for preferring the simpler of hypotheses does not refer, first and foremost, to their logical parsimony, but to their instinctiveness and perceptiveness, when he writes that “every advance of science that further opens the truth to our view discloses a world of unexpected complications” (EP 2, p. 444). Peirce’s position is rather that the instinctiveness and perceptiveness of abductive hypotheses are a mark of their plausibility because our abductive faculty tends to grasp, even if tentatively, what is true of the actual universe. Expressing this as a maxim, we might say that ‘the more instinctive and perceptive hypothesis is more plausible, given that the universe itself is an object in intelligible relation to our minds (i.e. it is relationally simple)’.

Even though van Fraassen only argues explicitly against the claim that

‘simplicity’ makes hypotheses more likely to be true (or plausible), he believes that similar arguments would serve against other special features of good explanations.

Ultimately, he wants to establish that there is no strict ‘warrant’—whether it be an epistemological theory regarding the cognitive origin of hypotheses or an appeal to the

special epistemic features of hypotheses—for believing that our explanatory hypothesizing tends toward the truth. In response to the ‘problem of truthful hypothesizing’ I have argued, largely on Peircean grounds, that we have reason for hoping that our abductive inferences do tend to generate hypotheses that lead us towards the truth. These reasons include the instinctive nature of reason, the history of advancement of scientific knowledge, and the conditions for the possibility of discovery that human inquirers may cultivate. The upshot of my position is that we are justified in

‘giving play’ to our abductive conjectures within our reasoning capacities. It is

‘reasonable’ to exercise a lively logical analysis upon an abductive conjecture and, if the conjecture withstands it, to test it against experience by taking appropriate action on the basis of it. In the context of scientific inquiry, the appropriate action is to propose the conjecture for inductive experimental testing. This is very different from believing the conjecture to be certainly true; it is rather to take the sort of self-controlled, reasonable action that a plausible hypothesis deserves, and it is the sort of action that scientific inquiry requires in practice.

7.2.3 Bernoulli’s Truthful Hypothesizing

In his correspondence with Leibniz, Bernoulli ultimately seeks to justify the applicability of mathematical probability to the study of problems in the natural world.

Given his reluctance to finish and publish the Ars Conjectandi—which was only published posthumously in 1713 with an introduction by his nephew, Nicholas—we may conjecture

that Jacob Bernoulli himself thought that his reasoning did not provide a strictly certain ‘warrant’ for such an application of mathematical probability. Though he would not admit it to Leibniz, in Bernoulli’s own view his reasoning only afforded some tentative justification for applying his theorem to the estimation of probabilities in civic, moral, and economic affairs, and so Bernoulli remained hesitant to publish.200 But in spite of his intimate hesitations regarding the absolutely certain warrant for the applicability of the ‘law of large numbers’, Bernoulli also held the reasonable hope that his theorem might provide a good method to investigate natural events that exhibit statistical regularities. More specifically, Bernoulli instinctively and reasonably held on to the hope that the true probabilities of natural events might be estimated via observed relative frequencies of occurrence. An abductive inference provides the justification for this method—not an absolutely certain ‘warrant’, but a plausible basis for action, in the Peircean sense of ‘scientific action’ as putting a hypothesis to the test of experience.201 In this case, the ‘action’ would consist in applying the results of mathematical probability to the investigation of natural and social phenomena to ascertain whether the probabilistic methods contribute to the explanation of the phenomena. In turn, this brings to the fore a pressing issue that so far remains open—namely, the way in which the ‘true’ probabilities

200 Admittedly, he probably was dissatisfied with the theorem for purely mathematical reasons. See, for instance, Stigler 1986, p. 77-78. 201 In “The First Rule of Logic” Peirce makes a distinction between abductive hypotheses in Science and in Practice, that is, in theoretical and in practical matters. Strictly speaking, in practical affairs we may ‘believe’ the hypotheses by being deliberately prepared to act upon them in pursuing our practical aims, while in scientific matters we do not properly ‘believe’ the hypotheses—we only ‘hold them for true’ provisionally for theoretical reasons and the only ‘action’ scientific hypotheses afford is to put them to the test of experiment. See Peirce 1998b, p. 176-178.

of natural events may contribute to the explanation of natural phenomena. Next I turn to discuss this issue.

Chapter 8

Explanation and Reality

I have contended thus far that Bernoulli’s reasoning in defense of the applicability of his theorem to problems in the natural and the social sciences consists in an abductive inference to an explanatory hypothesis. In the context of that view, the present issue is to describe how the a priori probabilities of events may explain the a posteriori relative frequency of occurrence of those events according to Bernoulli. More generally, the question is how the ‘true’ probabilities of events may be part of the explanation, and not merely of the description, of phenomena in the natural world.

In the historical context of the correspondence, Leibniz may have put the question to Bernoulli in the following way: ‘Let us grant for now the hypothesis that the a posteriori frequency is close to the a priori probability of a natural event, for example, of storming in the atmosphere or of bodily illness or death. Even then, how does the a priori probability explain the observed frequency when the probability does not reveal the real causes of the empirical regularity? In order to explain empirical phenomena, we need to know their causes. The a priori probability itself is not a real cause, and positing this probability does not lead us to understand the true causes of the observed regularities in nature.’ In contemporary terms, we may put the difficulty in the following way. I have proposed that Bernoulli responds to Leibniz’s epistolary objections regarding the estimation of probabilities via observed relative frequencies by arguing that the proximity of the a posteriori frequency sn to the a priori probability p of natural event E provides

the most plausible explanation for the observed statistical regularity. I have labeled this explanation the hypothesis Hp. Now, in order for Hp to be an explanation of sn, the probability p cannot be merely nominally defined as the relative frequency sn. If it were so defined, the explanation would be trivial. At best, the probability p, estimated via the frequency sn, would describe some regularity in the occurrence of event E—whether in games of chance or in nature—but it would not explain the regularity of the event. If Bernoulli were to offer this kind of explanation, it would not help us to understand the phenomenon under investigation. Upon offering the proximity of sn to p—nominally defined as the relative frequency of occurrence of E—as an explanation for the actual occurrence of sn, an objector might still ask, ‘But why does the statistical regularity take place? After all, p is merely equivalent, by definition, to the regularity. So p does not explain it.’

Regarding this issue, I want to propose: (i) For Bernoulli a priori probabilities are ‘true’ in the sense of being ‘real’, and more precisely, ‘actual’ physical dispositions. (ii) Since a priori or ‘true’ probabilities are real, they are explanatory of a posteriori statistical frequencies in the sense of being actually operating tendencies or dispositions that inhere upon and guide natural phenomena. Bernoulli’s mathematical demonstration and logical defense of his theorem both rely on this view, although it conflicted with some deep-rooted beliefs regarding ‘causal determinism’ that he himself could not bring into question. (iii) While a priori real probabilities provide an initial hypothetical explanation of observable statistical phenomena, further inquiry into the real principles and causes of these a priori dispositions helps inquirers approach the aim of scientific inquiry—the truthful understanding of reality.


8.1 True Probabilities as a priori Real Dispositions

The first step towards addressing the present difficulty is to show that for

Bernoulli the a priori probability of an event is not merely defined nominally as the statistical frequency of occurrence. I propose that for him the a priori probability p is primarily, though only implicitly, an actual physical disposition towards the occurrence of the event.

The contention that Bernoulli conceives of probability as being physical or objective meets, no doubt, with some important obstacles. The main one comes from

Bernoulli’s Ars Conjectandi itself. In the first chapter of Part IV—the celebrated part in which Bernoulli sets forth his ‘art of conjecturing’ and proves his limit theorem—

Bernoulli explicitly proposes an epistemological, subjective interpretation of probability.

In defining the fundamental concepts for his art, Bernoulli distinguishes between two types of certainty regarding things or events (rerum), namely, ‘objective’ certainty, or the certainty of events in themselves as they have existed, do exist, or will exist, and

‘subjective’ certainty, or the certainty of events according to our limited knowledge of their past, present, or future existence. According to him, all past, present, and future things—all things that exist—are certain in themselves but may be uncertain according to us. Even all events that will take place in the future are necessarily determined to exist

(Bernoulli 1966, p. 6-7). Bernoulli’s universe, then, in line with his mechanistic age, appears to be completely deterministic. What is merely ‘possible’, for instance, does not

enjoy any kind of existence; in fact, ‘possibility’ is merely an epistemological concept—that which is possible is that which has a very little part of subjective certainty, and that which has infinitely little subjective certainty is impossible. So, at least in these passages, for Bernoulli things and events are possible or impossible according to us, but not in themselves (Bernoulli 1966, p. 8).

In this context of metaphysical determinism in nature and epistemological limitations on the part of human beings, for Bernoulli,

[P]robability is a degree of certainty and differs from absolute certainty as a part differs from the whole. If, for example, the whole and absolute certainty—which we designate by the letter a or by the unit symbol 1—is supposed to consist of five probabilities or parts, three of which stand for the existence or future existence of some event, the remaining two standing against its existence or future existence, this event is said to have 3/5 a or 3/5 certainty. (Bernoulli 1966, p. 8)

It is clear that Bernoulli explicitly defines probability as degree of subjective certainty. It also seems clear that Bernoulli broadly intends for his ‘art of conjecturing’ to be an art of measuring the subjective certainty of events in social and natural affairs since “to conjecture about something is to measure its probability” (Bernoulli 1966, p. 13). In short, Bernoulli intends for his art to allow us to reason with a degree of precision approaching ‘moral certainty’ about matters in which our cognitive capacities are limited and yield uncertain knowledge. However, even as he introduces the art of measuring subjective certainty, Bernoulli still holds an objective, physical conception of probability that is implicit in his discussion of the art of conjecturing, in his examples of the applicability of the art, and in his correspondence with Leibniz.
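In modern notation (my gloss, not Bernoulli’s own symbolism), the worked example in this definition amounts to treating probability as a fraction of the unit of certainty:

\[
P(E) \;=\; \frac{\text{parts of certainty standing for } E}{\text{total parts of certainty}} \;=\; \frac{3}{5}\,a \;=\; \frac{3}{5}, \qquad a = 1.
\]

This is the explicitly subjective, epistemic measure; the question pursued in what follows is whether it is the whole of Bernoulli’s conception, or whether an objective, physical conception operates alongside it.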

Ian Hacking makes the most fundamental observation in support of this view. He notes that several opposed theories of mathematical probability all share the “classical”

feature of defining probability in terms of a ‘Fundamental Probability Set’, that is, a “set of disjoint alternatives of equal probability” (Hacking 1971a, p. 210). According to the various theories, these disjoint alternatives are supposed to be equal in probability because they are equal with respect to either (i) “physical symmetries that determine the frequency with which alternatives occur on repeated trials on a chance set-up;” (ii) the epistemological support “furnished for the alternatives by the available data;” or (iii) possibility (Hacking 1971a, p. 210). Hacking’s overall historical thesis is that (i) develops into (ii), from the ‘doctrine of chances’ with a physical interpretation of probability in the

17th Century to the ‘art of conjecture’ with an epistemological interpretation of probability

in the 18th, and from Pascal and Huygens to Bernoulli and finally to Laplace. In this

broad context, even though Bernoulli aims to found the ‘art of conjecture’, tied to an

epistemological concept of probability, he remains steeped in a ‘doctrine of chances’

conception of physical probability (p. 212). Bernoulli’s notion of probability, then, is thoroughly dual. Hacking further argues that, in general, the dual interpretation of probability as epistemological or de dicto and as physical or de re results from defining probability in terms of ‘equally possible’ cases, where ‘possibility’ is itself a thoroughly dual concept (1971b, p. 341-343). This is also the case with Bernoulli, who maintains a physical notion of possibility alongside the epistemological one. Now, while Hacking’s lucid syntheses of the conceptual development of probability are seminal, I find it worthwhile to delve into textual analyses of illuminating passages in order to draw out the logical and epistemological details of key developments—in this case, of the way in which Bernoulli holds an objective conception of probability as a physical proclivity or disposition of a chance set-up towards an outcome. So I turn to examine how Bernoulli’s physical

conception of probability is grounded upon a more basic conception of ‘possibility’ as the physical ‘proclivity’ of a thing towards a certain outcome or as the physical ‘facility’ with which an event can be produced.

In introducing the ‘art of conjecturing’ in part IV of the eponymous work,

Bernoulli writes that “for correctly forming conjectures about anything at all, nothing is required other than that the numbers of these cases be accurately determined and that it be found out how much more easily some cases can happen than others” (Bernoulli 1966, p. 34-35; emphasis mine). This ease of happening has the sense of an outcome being facile, that is, producible or formable without difficulty. Thus, for Bernoulli the art of conjecturing is founded upon the correct determination of the relative ‘facility’ of different cases or outcomes. This is more strongly emphasized when he continues by presenting the central problem for the ‘art of conjecturing’:

But here, finally, we seem to have met our problem, since this may be done only in a very few cases and almost nowhere other than in games of chance the inventors of which, in order to provide equal chances for the players, took pains to set up so that the numbers of cases would be known and fixed for which gain or loss must follow, and so that all these cases could happen with equal ease [pare facilitate]. For in several other occurrences which depend upon either the work of nature or the judgment of men, this by no means is the situation. And so, for example, the numbers of cases in dice are known: for in each die there are clearly as many cases as there are sides, and they are all equally likely [æquè proclives]. Because of the similarity of the sides and the balanced weight of the die, there is no reason why one of the sides should be more prone to fall than another, as there would be no reason if the sides were of different shapes or if the die was made of a material heavier on one side than another. And so likewise, the numbers of chances for drawing forth a white or black pebble from an urn are known, and it is known that all chances are equally likely [æequè possibiles]: for the number of pebbles of each kind are known and determinate, and there is no reason why this pebble or that pebble should come forth rather than any other one. (Bernoulli 1966, p. 35)

In the original text the last sentence reads, nullaque perspicitur ratio cur hæc vel illa potius exire debeat quàm in altera (see Bernoulli 1713). Thus, it may be rendered as “no

reason is perceived why this pebble or that pebble should come forth rather than any other one.” The difference is important as it reveals that, for Bernoulli, not only is there no inherent physical reason why we should draw one pebble rather than another, but that we also cannot perceive any reason for our drawing this or that pebble. The latter expression implies an epistemological sense of equal possibility, revealing the utter duality of Bernoulli’s interpretation of ‘probability’ defined in terms of equipossibility. In the course of one passage, Bernoulli moves from a clearly objective notion of possibility to a more ambiguous subjective notion related to our perceptual limitations. However, he does not abandon the objective notion. In games of chance it is possible to estimate a priori probabilities of events not only because the number of cases is determinate and known, but also because the very chance set-up for the games is physically constructed so that each case is producible with equal feasibility. The die’s physical symmetry and material construction guarantee that all sides have an equal proclivity for turning up.

Likewise, in games that consist in drawing pebbles, or even better, balls from an urn, all drawings are equally possible in the sense of occurring with equal ease. I think it is reasonable to extend Bernoulli’s comments and suggest that the physical symmetry of the balls and the material construction of the urn, that is, the objective characteristics of the chance set-up, make each draw equally possible to happen.

By Bernoulli’s own admission, this is precisely where the central problem of the

‘art of conjecturing’ arises. In games of chance, (i) the number of cases is determinate and known a priori, and (ii) the game is objectively constructed so that all cases have equal proclivity towards happening. Games involving the throwing of dice and the drawing of balls in urns provide examples where these conditions are met. However, this

is not the situation in events that depend upon “either the work of nature or the judgment of men.” Consider, for example, the situation we encounter in events related to human disease, meteorology, or games of mental or physical skill:

But what mortal will ever determine, for example, the number of diseases—i.e., the number of cases—which are able to seize upon the uncountable parts of the human body at any age and which can inflict death upon us? And what mortal will ever determine how much more likely this disease than that disease, pestilence than dropsy, dropsy than fever will destroy a man so that then a conjecture can be formed about the relationship between life and death in future generations? Who likewise will reckon up the innumerable cases of mutations to which the air is daily exposed, so that he can then guess after any given month, not to mention after any given year, what the constitution of the air will be? Again, who has well enough examined the nature of the human mind or the amazing structure of our body so that in games which depend wholly or in part on the acumen of the former or the agility of the latter, he could dare to determine the cases in which this player or that can win or lose? For since these and other such things depend upon causes completely hidden from us, and since moreover these things will forever deceive our effort because of the innumerable variety of their combinations, it would clearly be unwise to wish to learn anything this way [that is, by counting a priori favorable and unfavorable cases for a particular event]. (Bernoulli 1966, p. 36-37)

It is in these types of situations that we require an art of measuring probabilities a posteriori from observed outcomes and the results of experimental trials. In my interpretation of Bernoulli, the need for this art arises mainly because condition (i) above

is not met, that is, because the number of possible cases is either indeterminate or

unknown a priori. Thus, it is again undeniable that Bernoulli has an epistemological conception of probability at work in founding the ‘art of conjecturing’. We cannot perceive the number of diseased parts of the body or the number of mutations in the air, nor do we know the nature of the human mind or the structure of the human body in so far as they determine our probabilities of winning games of mental acumen or bodily agility. We do not know the causes of bodily disease or atmospheric disturbances. All of

these aspects of Bernoulli’s example suggest that he is working with a subjective notion of probability.

However, I think that Bernoulli is still working with an objective conception of probability as well, even in these examples of situations that depend either on the work of nature or on human judgment. The probabilities of disease or death, of atmospheric phenomena such as storms, and of winning a game of skill are physical probabilities. For

Bernoulli, these probabilities result from actually diseased parts of the body, from the objective state of the atmosphere at any given time and the actual disturbances of its parts, and from the actual mental or physical constitution of the player. These probabilities are not only the measure of subjective certainty but also the actual objective chance of an event. The physical probability of an event is a direct result of the objective characteristics that “facilitate” its happening.202 The probability of death is the proclivity

of a human being towards dying given the ratio of diseased to healthy parts in the body; the probability of a storm is the atmospheric disposition towards the storm happening given the relative disturbances of the parts of air; and the probability of winning or losing a game of, say, tennis is the proclivity towards winning or losing the game according to the relative mental and bodily agility of the players. Accordingly, he argues that by having observed that, out of three-hundred men of the age and constitution of Titius, one-hundred have survived past ten years and two-hundred have died, we “could safely enough conclude [colligare poteris]” that the number of cases in which Titius must die

202 What I am proposing implies, I think, that for Bernoulli these probabilities, when properly estimated according to his theorem, ultimately express an isomorphism between objective states of affairs and subjective knowledge. This isomorphism and the ‘actuality’ of probabilities contradict Bernoulli’s metaphysical determinism. Bernoulli’s explicit and implicit beliefs are in conflict. I will address this issue in section 8.3 below.

within the next ten years is twice the number in which he can survive past ten years

(Bernoulli 1966, p. 37). Similarly, by having observed the weather for the past several years, we could note how many times it was calm or rainy, or by having observed “judiciously” two players, we can count the number of times in which one or the other player has won

(p. 37-38). “In this way,” Bernoulli concludes, we can detect “what the ratio probably is between the numbers of cases in which the same events, with similar circumstances prevailing, are able to happen and not to happen later on” (p. 38). These prevailing circumstances are objective states of affairs; they are the physical conditions that facilitate this or that outcome in matters of nature or society.203
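Put in modern terms (a gloss of mine, not Bernoulli’s notation), the Titius example is an a posteriori estimate from the observed mortality record:

\[
\hat{p} \;\approx\; \frac{200}{300} \;=\; \frac{2}{3},
\]

where \hat{p} is the estimated probability that a man of Titius’s age and constitution dies within the next ten years, so that the odds of dying to surviving are about 2 to 1. The estimate itself says nothing yet about how reliable it is; that is the question Bernoulli’s limit theorem is designed to address.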

The strong sense of the reality or ‘actuality’ of physical probabilities that I am proposing is further implied in Bernoulli’s claim that by deploying his method of estimating probabilities a posteriori from observing trials—the mathematical method afforded by his limit theorem—we are trying to detect the ‘true ratio of cases’ favorable or unfavorable for an event. In announcing one of the specific mathematical problems that his theorem will address, Bernoulli writes: “It certainly remains to be inquired whether after the number of observations has been increased, the probability is increased of attaining the true ratio [genuinæ rationis] between the number of cases in which some event can happen and in which it cannot happen, so that this probability finally exceeds any given degree of certainty; or whether the problem has, so to speak, its own asymptote—that is, whether some degree of certainty is given which one can never

203 It is important to note that Bernoulli explicitly acknowledges the presumption that “every single thing is able to happen and not to happen in as many cases as it was previously observed to have happened and not to have happened in like circumstances” (Bernoulli 1966, p. 37). He is of course writing years before David Hume raised the problem of induction which essentially claims that there is no non-inductive warrant for this presumption, so that inductive inference is invalid.

exceed, so that however many observations are made, we can never be more than 1/2 or 2/3 or 3/4 certain that we have detected the true ratio of the cases [verum casuum rationem]” (Bernoulli 1966, p. 39; emphasis mine). I interpret this true ratio of cases to be an a priori real probability for Bernoulli, a probability that physically consists in an a priori actual disposition towards an outcome.

Recalling the terms of exposition of the theorem from previous chapters, Bernoulli’s mathematical problem is to show P( | p - sn | < ε ) → 1 as n → ∞. According to my interpretation of Bernoulli’s discussion, the ‘art of conjecturing’ explicitly aims at estimating a degree of subjective certainty, namely, P( | p - sn | < ε ) for small ε. This is allegedly a subjective probability. However, there is also an a priori real p that consists in an actual physical proclivity of a chance set-up or a random natural process towards this or that outcome. Bernoulli’s own discussion of the problem that he addresses via the ‘art of conjecturing’ and of the examples of mortality and meteorology reveals an implicit conception of probability as objective. The probability p is the true ratio of objective cases that have a physical proclivity towards an outcome. The ‘art of conjecturing’, therefore, implicitly consists in estimating the real a priori probability p of a natural event by way of the observed experimental result sn.
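For reference, the claim of the limit theorem may be restated in display form (my rendering of the formula already used above, assuming n independent trials of an event E with fixed a priori probability p and observed relative frequency sn):

\[
\lim_{n \to \infty} P\bigl( \lvert s_n - p \rvert < \varepsilon \bigr) \;=\; 1
\qquad \text{for every fixed } \varepsilon > 0.
\]

The theorem thus concerns the convergence of a probability about sn, not the identity of sn and p in any finite run of trials.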

Precisely in this connection, Hacking points out another way in which we can

ascertain that Bernoulli’s conception of probability is also physical. As Hacking notes,

Bernoulli’s “chief aim…is to determine some unknown probabilities by experiment”

(Hacking 1971a, p. 214). Accordingly, Bernoulli’s interpretation of probability cannot be very subjective—if probabilities can be determined by experimenting upon an objective chance set-up or on nature, the probabilities cannot be purely personal degrees of belief

(Hacking 1971a, p. 215-216). Bernoulli is rather “interested in experimental discovery of unknown probabilities. He imagines that the unknown probabilities are produced by some kind of symmetries, and that it is the task of an art of conjecturing to produce good estimates of those unknowns” (p. 217). In my estimation, the fact that Bernoulli appeals to experimentation to ascertain probabilities reveals that these probabilities are the result of objective characteristics of, say, diseases more or less tending to result in death or of atmospheric mutations more or less tending to produce storms. So I find it somewhat puzzling that in the end Hacking resists this interpretation of Bernoulli as holding a conception of probability as physical tendency or propensity towards an outcome.

Hacking ultimately denies that for Bernoulli chance could be an “ultimate fact”; that is, chance cannot be ‘actual’ for Bernoulli. For example, the outcome of any particular coin-toss is completely determined by causes other than the objective features of the coin and the chance set-up for coin-tossing (p. 215). In the end he takes Bernoulli’s professed belief in metaphysical determinism at face value (see p. 214), even if Bernoulli’s own treatment of the mathematical problem of estimating probabilities and discussion of examples suggest something different.

Hacking has some reasons for resisting the interpretation of Bernoulli as holding chance to be real, of course. The principal one is that he assimilates Bernoulli’s subjective conception of probability to Heisenberg’s conception of subjective probability in quantum physics. Hacking argues that Bernoulli’s conception is like that of the quantum physicist who holds that probability statements about physical systems contain both an objective element of tendency and a subjective element of our beliefs, based on incomplete knowledge, about the state of the system. Just as in quantum physics we can

check by repeated experimentation our subjective probability statements about the system, so also in statistical inference we can check by experiment our subjective measures of the probability of aleatoric events (Hacking 1971a, p. 215-216). I take

Hacking to be defending the view that for Bernoulli P( | p - sn | < ε ) ultimately remains a subjective probability. However, I do not find that this should be reason to reject the position that, in spite of Bernoulli’s professed mechanical, deterministic view of the universe, the a priori probability p of an aleatoric event is implicitly real for him.

In fact, I think Hacking’s own experimental realism would bear out my position

(see Hacking 1982). He distinguishes between two kinds of scientific realism. Realism

about theories proposes that in science we try to form true theories about the world and

its physical constitution. Realism about entities asserts the existence of at least some of the entities that are the objects of study in physics and the natural sciences. According to

Hacking, realism about entities holds sway in science; it is in fact “a necessary condition for the coherence of most experimentation in natural science” (Hacking 1971a, p. 281).

This is evident in the actual practice of research scientists as their experiments are interventions in nature that make use of the causal properties of theoretical entities, such as the electron when it was first proposed as a theoretical entity potentially explanatory of observed phenomena. This is what Hacking calls the ‘experimental argument for scientific realism’. Admittedly, Bernoulli does not propose that we experiment with the causal properties of a priori probabilities. But he does propose that we find what the a

priori probability p is on the basis of experimentation. This unknown p is a priori real; it is a true ratio of actual physical parts or actual physical conditions favorably or

unfavorably tending to produce an outcome; and it is actually at work in the experimental trials.

The analogy between the urn and the human body (or the atmosphere) constitutes the final and, I think, strongest expression of the reality of a priori probabilities for

Bernoulli. As I have discussed, he illustrates the mathematical problem that confronts him again via the example of drawing pebbles from an urn. Suppose that, unknown to us, there are three-thousand white pebbles and two-thousand black pebbles in the urn. The problem is to show that we can find the number of necessary trials (draws) so that it is n times more probable to draw pebbles in a 3:2 ratio than in another ratio different from

3:2. “If this is the case,” he argues, “and if, finally, moral certainty [about the true ratio] is reached in this way…we will have investigated the number of cases a posteriori almost as accurately as if they had been known to us a priori” (Bernoulli 1966, p. 40). The epistemological probability to be estimated is our degree of subjective certainty about the true ratio of pebbles in the urn. Bernoulli needs to show that this probability approaches moral certainty, that is, subjective certainty in practical affairs. The physical probability at work is the actual a priori objective tendency of the aleatoric set-up such that white or black pebbles are drawn in a ratio of 3:2. Having presented the urn example, Bernoulli uses the analogy to show how his reasoning applies to natural events:

And…this is amply sufficient in civil life, where what is morally certain is considered as absolutely certain in order to form our conjectures in any situation that may arise no less scientifically than in games of chance. And in fact, if, for example, we replace the urn by the atmosphere or the human body (both of which contain fuel for various mutations and diseases as the urn contains pebbles), we will in the same way be able to determine by observations how much more easily this or that event can take place in the regions of the atmosphere or the human body. (Bernoulli 1966, p. 40-41)
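Before returning to Bernoulli’s correspondence, the urn example just discussed can be made concrete with a short, purely illustrative simulation (this is my own sketch, not part of Bernoulli’s apparatus; the counts 3,000 and 2,000 come from his example, while the trial sizes and the random seed are arbitrary choices of mine):

    import random

    # Bernoulli's urn: 3000 white and 2000 black pebbles, so the unknown
    # 'true' probability of drawing white is p = 3/5 (a 3:2 ratio).
    WHITE, BLACK = 3000, 2000
    p_true = WHITE / (WHITE + BLACK)

    def observed_frequency(n_draws, rng):
        """Draw n_draws pebbles with replacement (independent trials) and
        return the relative frequency of white, i.e. the a posteriori
        estimate of p from the observed outcomes."""
        white_count = sum(rng.random() < p_true for _ in range(n_draws))
        return white_count / n_draws

    rng = random.Random(1713)  # arbitrary seed (the year of the Ars Conjectandi)
    for n in (100, 1_000, 10_000, 100_000):
        s_n = observed_frequency(n, rng)
        print(f"n = {n:>7}: s_n = {s_n:.4f}, |s_n - p| = {abs(s_n - p_true):.4f}")

On a typical run the deviation of sn from p tends to shrink as the number of draws grows, which is the behavior the limit theorem warrants in probability; the simulation only exhibits the tendency, it does not prove it. These are exactly the estimates that Bernoulli, by the analogy quoted above, proposes to carry over from the urn to the atmosphere or the human body.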

Bernoulli appeals to this very analogy in his correspondence with Leibniz.204 In that

epistolary exchange, Bernoulli argues that, although games of chance and natural events such as diseases of the human body or mutations of the atmosphere are different types of events, they are subject to analogous mathematical treatment when it comes to estimating their probabilities a posteriori. That is, the mathematical problem of estimating probabilities a posteriori on the basis of observed experimental results is analogous in games of chance and in natural events; therefore, the mathematical solution is analogous. In the Ars Conjectandi, Bernoulli’s analogy appeals strongly to an objective symmetry between games of chance and natural events. The very physical constitutions of aleatoric set-ups and of the human body or the atmosphere are analogous. Therefore, just as an urn filled with pebbles has the objective characteristics tending to yield draws according to the true white-to-black ratio, so also the human body has the objective characteristics tending towards producing death from various diseases and the atmosphere has a physical constitution with some objective propensity towards storming. The a priori probability of disease in a human body or of storming in a region of the atmosphere is an actual ‘proclivity’ or ‘facility’ resulting from their very physical conditions—literally, from the physical “fuel” they contain for malaise or for storms. Thus it is, in sum, that for Bernoulli a priori probabilities are actual dispositions towards this or that outcome in aleatoric and in natural events.

204 See my discussion of the correspondence in section 6.1.

8.2 A priori Real Dispositions as Explanatory of a posteriori Statistical Regularities

In Bernoulli’s discussion of the ‘art of conjecturing’ in the analogous contexts of aleatory and natural events, it is implicit that a priori real dispositions are inherent reasons that yield regular statistical frequencies. These real dispositions inhere upon the sequence of independent random outcomes in aleatoric set-ups or natural events, thus tending to produce the regularities. Therefore, I want to propose that it is as inherent reasons that a priori proclivities implicitly explain a posteriori regularities, even in

Bernoulli’s thoroughly ambiguous conception of probability in the Ars Conjectandi. I will elaborate this view, which is only implicit in Bernoulli’s own texts, from a contemporary Peircean perspective in order to show that Bernoulli does not foreclose the possibility of interpreting probabilities as being objective proclivities that produce and explain statistical regularities.

However, I must first comment on a difficulty intrinsic to Bernoulli’s own stated beliefs that gives rise to the need for this contemporary elaboration. The difficulty stems from Bernoulli’s own view of causal determinism, that is, from the view that all events, aleatoric or natural, are objectively certain as they are completely determined by their causes. Bernoulli expresses this position in his metaphysical claim that any past, present, or future event is objectively certain, since it is absolutely determined to take place by divine predetermination. If events were not objectively certain, Bernoulli writes, “it is not clear in what way the praise of the omniscience and omnipotence of the greatest Creator can remain undiminished” (Bernoulli 1966, p. 7). Moreover, as we have seen in the analogy between drawing white or black pebbles from an urn to estimate their true ratio

and observing the frequencies of human mortality or atmospheric storming to estimate the respective probabilities, Bernoulli thinks that the stochastic art must be applied because “these and other such things [or natural events] depend upon causes completely hidden from us [causis omnino latentibus]” (Bernoulli 1966, p. 36-37; emphasis mine).

That is, the ‘art of conjecturing’ is required in the study of natural events because their determining causes are only latent. Bernoulli’s causal determinism, then, links him to the position that the explanation of an empirical phenomenon consists in the whole of its determining causes. It is here precisely that Leibniz may have pressed him on the grounds that positing an a priori probability does not reveal the real causes of a statistical regularity and therefore does not constitute a satisfactory explanation of it. Explicitly for

Bernoulli, the a priori probability is not a completely determining cause of the regularity; therefore, the logical justification for his stochastic art on the basis of the explanatory hypothesis that a posteriori statistical frequencies occur due to a priori probabilities is even more tenuous than a plausible conjecture, since the hypothesis does not seem to count as an explanation of the phenomenon.

This difficulty is compounded by the tension between Bernoulli’s professed causal determinism and his view of probabilities as a priori proclivities. As I have argued, Bernoulli implicitly holds that a priori probabilities are the real proclivities of aleatory or natural events towards an outcome. In the limit theorem Bernoulli distinguishes between the a priori probability p and the a posteriori statistical frequency sn as two separate mathematical entities that are in a describable and demonstrable mathematical relation. Logically, both in the Ars Conjectandi and in the correspondence with Leibniz, Bernoulli’s description of the mathematical problem and defense of the

solution as applicable to natural and social science appeal to a priori probabilities qua real proclivities as the reason for the occurrence of statistical regularities. In terms of my exposition in section 6.3.2, this is the conflict between hypotheses Hp—‘the statistical

regularity sn observed in a sequence of random natural outcomes is due to the proximity of the unknown a priori probability p of event E to sn’—on the one hand, and Hq1—‘we have an imperfect capacity of observation that makes event E, which happens with certainty under determining conditions, seem to us to happen only in a statistically regular proportion sn of random cases’—on the other. Bernoulli infers Hp along with its implicit view of a priori probabilities as real dispositions, in spite of his proclaimed determinism.

What is required for Bernoulli to resolve this difficulty, I think, is a different view of a priori probabilities as being explanatory of a posteriori frequencies—namely, the view that a priori proclivities act as inherent reasons that tend to produce statistical regularities in nature. Some elements of this view are implicit in Bernoulli’s texts, and next I draw them out explicitly in order to substantiate his justification for the ‘art of conjecturing’. My motivation is that I understand Bernoulli as being in the midst of discovering a new way to conceive of and explain statistical phenomena, even if some elements of this discovery were only latent in his reasoning and unclear to Bernoulli himself.

In order to substantiate this position, let me first suggest what kind of ‘inherent reason’ these a priori proclivities constitute. They are general dispositions that tend to become actualized upon specific circumstances. For example, a physically symmetric die has general dispositions that tend to become actualized as relative frequencies upon the circumstance of repeated, independent throws. Analogously for Bernoulli, human beings of a specific age and bodily constitution have general dispositions towards being healthy or ill that tend to become actualized upon their continuing to live their natural lives. Or athletes of a specific level of ability and fitness have a general disposition towards winning or losing that tends to become actualized upon the occasion of playing many independent games. Peirce describes these general dispositions as follows:

I am, then, to define the meaning of the statement that the probability, that if a die be thrown from a dice box it will turn up a number divisible by three, is one- third. The statement means that the die has a certain “would-be”; and to say that a die has a “would-be” is to say that it has a property, quite analogous to any habit that a man might have. Only the “would-be” of the die is presumably as much simpler and more definite than the man’s habit as the die’s homogenous composition and cubical shape is simpler than the nature of the man’s nervous system and soul; and just as it would be necessary, in order to define a man’s habit, to describe how it would lead him to behave and upon what sort of occasion—albeit this statement would by no means imply that the habit consists in that action—so to define the die’s “would-be” it is necessary to say how it would lead the die to behave on an occasion that would bring out the full consequence of the “would-be”; and this statement will not of itself imply that the “would-be” of the die consists in such behavior.

Now in order that the full effect of the die’s “would-be” may find expression, it is necessary that the die should undergo an endless series of throws from the dice box, the result of no throw having the slightest influence upon the result of any other throw, or, as we express it, the throws must be independent each of every other. (Peirce 1957, p. 79-80)

The expression or “full effect” of the die’s ‘would-be’ consists in the observed relative frequency of throwing a number divisible by three gradually approaching 1/3, until converging upon it as the number of throws goes to infinity. Commenting upon this passage, Donald Gillies claims that Peirce mistakenly treats the ‘would-be’ as a property of the die, not noticing that the ‘would-be’ depends on the conditions upon which the die is thrown; it is, for example, like the weight of a body, which depends on the gravitational field in which the body is found, and not like the mass, which is a genuine property of the body (Gillies 2000, p. 118). This is a slight misinterpretation, in as much

as the ‘would-be’ for Peirce is a habit, and ‘habits’ in the Peircean sense are not meant to be properties of objects independent of any relational context but rather general dispositions that depend on actualizing conditions—on reactions—to become manifested.

As with any object, the die is constituted by its qualities, its relations, and its dispositions, and these categorical elements are irreducible and inseparable from the die. Be that as it may, Gillies does point out that Peirce makes “a valuable point in distinguishing between the probability of the die as a dispositional quantity, a ‘would-be’, on the one hand, and an occasion that would bring out the full consequence of the ‘would-be’ on the other” because “it allows us to introduce probabilities as ‘would-be’s’ even on occasions where the full consequences of the ‘would-be’ are not manifested” (Gillies 2000, p. 118). That is, Peirce’s distinction allows us to distinguish between probabilities as dispositions, on the one hand, and relative frequencies as their expression in a long run of experimental trials, on the other.
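A minimal arithmetic gloss of Peirce’s die example may help fix this distinction (the notation is mine): the ‘would-be’ is the dispositional quantity

\[
p \;=\; \frac{\#\{3, 6\}}{\#\{1, 2, 3, 4, 5, 6\}} \;=\; \frac{2}{6} \;=\; \frac{1}{3},
\]

defined independently of any particular run of throws, whereas the relative frequency of throws showing a number divisible by three is the expression of that disposition, and only in an endless run of independent throws does it converge upon 1/3.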

It is precisely this distinction that allows us to grasp Bernoulli’s implicit view of proclivities as reasons inhering upon the manifestation of stable relative frequencies in the long run. It is remarkable that Peirce uses an analogy between the ‘habit’ of a die to fall with a certain face up and the ‘habit’ of a human being to act in a specific way under given circumstances, just as Bernoulli makes an analogy between the proclivity of a die to fall with this or that face up and the proclivity of a human being to die from various diseases. What Bernoulli calls a ‘proclivity’ or ‘facility’ is akin to what Peirce calls a

‘would-be’ or ‘habit’. These ‘habits’ are not independent of actualizing conditions but rather actively work upon the fulfillment of these conditions, for example, upon the repeated throwing of a die in a specific aleatory set-up or upon the living or dying of

many human beings of similar age and constitution in a given environment. At the same time, these ‘habits’ or ‘dispositions’ are ‘inherent reasons’ that guide the actualization of regular statistical frequencies upon the fulfillment of specific experimental conditions.

Expounding Bernoulli’s reasoning from a perspective that makes explicit use of the Peircean categories, I suggest that Bernoulli’s ‘proclivities’ are ‘real dispositions’—of the nature of Thirdness—that become ‘actualized’ upon the occasion of some experimental ‘reaction’—of the nature of Secondness. In one of his characterizations of

Thirdness, Peirce in fact says that “it might be called an inherent reason” (RLT, p. 148).

In the course of Bernoulli’s reasoning, then, real a priori dispositions are implicitly the

‘inherent reason’ for the occurrence of stable relative frequencies, and it is in this sense that they explain the occurrence of such regularities in natural events. The implicit view that an a priori tendency towards an outcome explains the observed relative frequency of a natural event involves a ‘disposition’ as the explanatory ‘reason’ for the observed frequency.

According to Peirce, we must suppose this ‘reason’ to be an “active general principle” (EP 2, p. 183). Let us consider again two of our possible explanations from section 6.3.2 for the puzzling but stable statistical frequencies that we observe in nature:

Hq4: ‘the statistical regularity sn observed in a sequence of random natural outcomes is due to mere chance—that is, to the pure randomness of a natural process where there are no a priori probabilities of events.’

Hp: ‘the statistical regularity sn observed in a sequence of random natural outcomes is due to the proximity of the unknown a priori probability p of event E to sn.’

Following Peirce, let us recast them according to the general view of the operations of nature that they involve as follows:

Hq4: ‘statistical regularities and, more generally, uniformities and regularities in nature are due to mere chance and afford no ground for expectation.’

Hp: ‘statistical regularities and, more generally, uniformities and regularities in nature are due to some active general principle’.

According to Peirce, every “reasonable man” must adopt the latter hypothesis as it is the only one compatible with our past, direct and indirect, experience (see EP 2, p. 183). A skeptic who exhibits ‘paper-doubt’—that is, who pretends to doubt beliefs that he nonetheless embraces in action—would have to suppose every confirmation of the regularities “to be merely fortuitous in order reasonably to escape the conclusion that general principles are really operative in nature” (EP 2, p. 183). Accordingly, the only

reasonable conjecture to explain observed statistical regularities is that these regularities

are due to real a priori dispositions—real habits—that are really operative in stochastic

phenomena and that become actually operative upon the fulfillment of specific conditions

in nature, that is, upon actual ‘reactions’ that bring into operation the disposition or habit.

In the conceptual context of Bernoulli’s examples, for instance, just as a fair physical die

has an equal proclivity to fall with any of its faces up upon the actual experiment of

throwing it independently and repeatedly, so also a given region of the atmosphere has a

‘facility’ for storming—a ratio of mutated parts to all its constitutive parts—that becomes

actualized upon the realization of some specific atmospheric conditions. The die’s equal

proclivities and the atmosphere’s facility for storming are general principles that really

and inherently operate upon the actual throwing of a die or upon an actual storm.

I find N.R. Hanson’s conception of abductive explanation to be the most apt when investigating Bernoulli’s abductive reasoning. For Hanson, an abductive explanation provides a pattern of organization that explains the observed phenomenon. This pattern invokes reasons or principles that organize the phenomena into an intelligible framework

(Hanson 1965, p. 85-92). These principles, however, are not properly understood when conceived simply as objective ‘causal-chains’. According to Hanson, what we conceive of as ‘causes’ and ‘effects’ in a causal-chain depends on the theoretical context within which we attempt to explain a phenomenon, so ‘causes’ are theory-loaded from beginning to end: “They are not simple, tangible links in the chain of sense-experience, but rather details in an intricate pattern of concepts” (Hanson 1965, p. 54). The principles of phenomena are more properly expressed by way of ‘laws’ grounded in a conceptual framework. A ‘law’ is “a family of statements, definitions and rules” (Hanson 1965, p.

98). Laws may function as theoretical statements, empirical summaries, definitions, or heuristic maxims, depending on the context of research (Hanson 1965, p. 94-98, 112-

118). Sometimes they may even take the form of ‘causal laws’, which are different from causal chains (Hanson 1965, p. 62-65). For Hanson, causal chains are built in the form:

(A then B)1, (A then B)2, … , (A then B)n; therefore, all As are followed by Bs. If an

exception is found in the course of experience, the chain is broken. What underlies causal

laws is a thoroughgoing conceptual pattern. Hanson writes: “That happenings are often

related as cause and effect need not mean that the world is shackled with ineffable

[causal] chains, but it does mean that experience and reflection have given us good

reason to expect [B] every time we confront [A]. For [A] to be thought of as a cause of

[B] we must have good reasons for treating ‘A’…as a theory-loaded, explanatory term”

(Hanson 1965, p. 65). Thus, causal laws have an epistemological function within a broad theoretical framework, or in Peirce’s terms, within a system of beliefs. Laws, then, are statements that have a variety of functions depending on the theoretical context of research.

An explanation in terms of causal laws, instead of simply in terms of determining causes, would provide the initial step for Bernoulli to get out of his conundrum. Had

Leibniz pressed him on the status of his explanation of statistical regularities in terms of a priori probabilities, Bernoulli could have replied that his theorem expresses a causal law of nature. Bernoulli in fact comes close to this position in the very concluding passage of the Ars Conjectandi, where he writes: “[F]inally, this one thing seems to follow [from my theorem]: that if observations of all events were to be continued throughout all eternity, (and hence the ultimate probability would tend toward perfect certainty), everything in the world would be perceived to happen in fixed ratios and according to a constant law of alternation [omnia in mundo certis rationibus & constanti vicissitudinis lege contingere deprehenderentur], so that even in the most accidental and fortuitous occurrences we would be bound to recognize, as it were, a certain necessity and, so to speak, a certain fate” (Bernoulli 1966, p. 65-66.) Were the number of our observations of all events to go to infinity, then, we would find that all events take place in a certain ratio according to constant law. We would ascertain that causes determine their effects in a precise manner through causal laws, and our infinite observations would lead us to

‘deprehend’—that is, to seize upon or detect—those laws. However, if the causal law expressed in Bernoulli’s theorem, and the a priori probabilities that it involves, were to remain merely nominal, then Bernoulli’s ‘law of large numbers’ would still not qualify as

an explanation for Leibniz as it fails to reveal the real causes of events and, in contemporary terms, the law would remain a mere description, not an explanation, of the statistical regularities that arise in nature. It is here that Bernoulli does not explicitly take the step that is required for his hypothesis to stand as an explanation.

The view that Bernoulli should have adopted as an explicit working hypothesis is that these causal laws ultimately express the reality of inherent reasons that are actually operative in the making of stochastic regularities in natural phenomena. In the case of statistical phenomena, true a priori probabilities, understood as the ratio of objective cases tending towards or against the making of a specific event, constitute the real

‘reason’ that organizes the statistical phenomena to make them intelligible. A priori proclivities inhere upon stochastic events, making the resulting regularities understandable. The corresponding explanatory hypothesis, then, places the regularities within an intelligible pattern. But he did not adopt this view even as a working hypothesis due to his adherence to an entire system of completely deterministic metaphysical beliefs, according to which each particular effect is completely determined by the totality of its particular causes, and due to his corresponding explicit definition of probability as degree of subjective certainty.

8.2.1 Some Possible Objections

Now, the foregoing sketch, from a contemporary Peircean standpoint, of the form of explanation involved in Bernoulli’s theorem may be subject to at least two objections that I proceed to address.

8.2.1.1 Bernoulli’s ‘Propensity’ Interpretation?

At this point, commentators of early mathematical probability might object that I am foisting upon Bernoulli a ‘propensity’ interpretation of probability—that is, an interpretation according to which a ‘probability’ is an inherent disposition, propensity, or proclivity of a random object or process that tends to become actualized upon the realization of specific circumstances.205 The objection could take two forms: (a) Bernoulli

held a ‘subjective’, not an ‘objective’, interpretation of probability. (b) Even if Bernoulli

were to hold an ‘objective’ interpretation, it could not be the ‘propensity’ interpretation,

which was first proposed by Popper only in the twentieth century.

The first form of the objection ascribes to Bernoulli the “classical” theory of

probability, with its conception of probabilities in terms of ratios of equally possible

cases, along with a subjective interpretation of probability. As Gillies points out, the

classical theory of probability is a product of the thinking of the European Enlightenment

and embodied many of the Enlightenment’s characteristic ideas, including the

205 For a thorough discussion of the propensity interpretation of probability see Gillies 2000b, ch. 5.

“admiration for Newtonian mechanics, and the consequent belief in universal determinism” (Gillies 2000, p. 14). In Gillies’s opinion, the early probabilists of the historical period up to Laplace—especially his Philosophical Essay on Probabilities

(1814)—regarded probability as epistemological or subjective, rather than objective.

Even in chapter VII of the Philosophical Essay, for example, where Laplace discusses the situation of a biased coin with chance of heads equal to (1+α)/2 and chance of tails equal to (1-α)/2, the occasional “phrases which seem to imply the [objective] existence of unknown chances” are better interpreted as slips or inconsistencies “rather than as a commitment to objective probability. Since all the probabilists of that period had a firm belief in universal determinism, it is difficult to see how they could have conceived of probability other than as a measure of human ignorance” (Gillies 2000, p. 21). Gillies refers to the view of Daston (1988), who argues that the distinctly dual interpretation of probability as subjective or epistemological and objective or physical only arose well into the nineteenth century with Poisson, Cournot, and Ellis (see Daston 1988, p. 191). For

Daston, the distinction would have been inconceivable to the early probabilists since they embraced the Enlightenment ideal of the “reasonable man” who could act on the basis of broad experience precisely because the “objective probabilities of experience and the subjective probabilities of belief were, in a well-ordered mind, mirror images of one another” (Daston 1988, p. 197). Gillies’s and Daston’s positions, then, would provide grounds for objecting that I am ascribing to Bernoulli an objective interpretation of probability that he could not have possibly conceived.

I would reply, first, that while Laplace may have explicitly and extensively argued for an epistemological interpretation of probability on the basis of his admiration for

Newtonian mechanics and the consequent universal determinism, this same view is not

Bernoulli’s. The Ars Conjectandi was published posthumously in 1713, a full century before Laplace’s Philosophical Essay. There is far more ambiguity in Bernoulli’s views on probability than in Laplace’s. The latter is far more imbued with Enlightenment ideas, and he develops a thoroughly subjective interpretation of probability that had not yet congealed in Bernoulli’s work; Bernoulli’s treatment and discussion of probability, and especially of a priori probabilities, relies extensively on physical proclivities and a doctrine of objective chances, even as he attempts to found a subjective art of conjecturing. In this regard, my position is closer to Hacking’s (1971b), whereas Gillies sets his own and Daston’s view against Hacking’s claim that mathematical probability, from its very emergence around 1660, was thoroughly dual, at once objective and subjective (see Gillies 1990, p. 18-22).

Second, whether or not Laplace’s inconsistencies can be dismissed as occasional slips, I do not think the same can be claimed about Bernoulli. In his work, the implicit reliance upon objective chances is too extensive to dismiss. Even if Bernoulli strives to define and describe probability as ‘degree of subjective certainty’, the ambiguous treatment of probability in his work and in his correspondence relies often on the objective proclivities of aleatory set-ups or natural events. The examples of mortality rates and meteorology illustrate the fundamental motivation for Bernoulli’s work towards an art of conjecturing, and in his description of these examples he relies on the physical analogies and symmetries between dice, human bodies, and regions of the atmosphere.

Now, it is important to acknowledge that Bernoulli clearly did not express any commitment to objective probability. I think it is also important to observe, however, that

in his work there is not any one interpretation of probability—objective or subjective, epistemological or physical—that is thoroughly worked out and fully crystallized. The

Ars Conjectandi and the Leibniz-Bernoulli correspondence contain many seminal ideas of probability; they leave tacitly open several possible conceptions and interpretations that will only be parsed out and clarified in the course of the subsequent history of mathematical probability. My suggestion is that one of these implicit conceptions—one that Bernoulli, unlike Laplace, has not yet foreclosed—is that of a priori probabilities as

‘proclivities’ or ‘dispositions’ that ‘facilitate’ the occurrence of stable statistical frequencies.

Now, a second form of the objection would suggest that, even if the seeds of an objective conception of probability are alive in Bernoulli’s work, I am committing an anachronism in ascribing to him an interpretation of probability as a physical

‘propensity’. I would reply that I am not claiming that Bernoulli explicitly conceived of probabilities as ‘propensities’ or that he thought of them as Peirce or Popper would come to think of them in the nineteenth and twentieth centuries. I am only arguing that

Bernoulli’s arguments in favor of his method of estimating unknown probabilities in nature imply a view of a priori probabilities as ‘proclivities’, ‘tendencies’, or

‘dispositions’ that provide the explanatory reason for the emergence of statistical regularities in nature. Admittedly, I have taken a Peircean stance to draw out explicitly the position that is only implied in Bernoulli’s work, but this is not to claim that Bernoulli thinks explicitly in this way, nor is it to suggest that he holds a clearly formed interpretation of probability as an objective ‘propensity’.

8.2.1.2 Bernoulli’s ‘Realism’?

This, however, brings us to a final objection that may be raised against my elaboration of Bernoulli’s explanatory reasoning, namely, that the Peircean standpoint attributes to Bernoulli an untenable ‘realism’ regarding probability. The objection would be directed against the defensibility of this form of ‘realism’ and not against the claim that Bernoulli holds any such implicit view. The ‘realism’ lies in this previously noted argument: Given our observation of persistent regularities in nature—such as stable statistical regularities, or in Peirce’s own example, that ‘stones left free to fall have fallen’—the inquirer may conjecture that the regularities are due either to mere chance or to some active general principle. But the skeptic who adopts the first hypothesis would have to suppose that every regularity is merely fortuitous and affords no ground for expectation in order to escape the supposition that there are active general principles operating in nature (while still acknowledging that we do observe uniformities and regularities). Thus, we must adopt the hypothesis that there are really operative general principles in nature.206

As Bas van Fraassen notes, Peirce identified arguments of this general form as having been advanced by the scholastic realists against the nominalists, and Peirce advances his own version in the 1903 Harvard Lectures (see van Fraassen 1989, p. 19-

23). Van Fraassen notes that those arguments are two-fold: (i) To the conclusion that

there must be laws of nature, and independently (ii) to the conclusion that we must

believe that there are such laws. The first argument is of the form: There are pervasive,

206 See EP 2, p. 181-183.

stable regularities in nature; but no regularity will persist by chance as there must be a reason for the regularity; therefore, the reason is the existence of a law of nature. The second argument is of the form: If we deny the previous conclusion, we are reduced to skepticism; that is, if we say there is no reason for a regularity, then we imply that there is no reason for the regularity to persist. But if we say that there is no reason, then we cannot expect it to persist. Therefore, we have no basis for rational expectation of the future (van Fraassen 1989, p. 19).

According to van Fraassen, Peirce’s arguments in the Harvard Lectures are of the second type. But when speaking of ‘mere chance’, Peirce equivocates between ‘by chance’ as ‘due to no reason’ and ‘by chance’ as ‘no more likely to happen than its contraries’. Under the latter meaning, van Fraassen concedes that if no event is more likely than any other, then there is no ground for expectation. But under the former meaning, van Fraassen denies the conclusion, as there is no inconsistency in asserting that a natural regularity takes place while denying that there is a deeper, or ‘real’, reason for it. Instead of accusing Peirce of equivocation, we may attribute to him the tacit premise that “whatever happens either does so for a reason or else happens no more often than its contraries” (van Fraassen 1989, p. 21). But, van Fraassen argues, this premise would have Peirce subscribing also to the first argument to the conclusion that there must be laws of nature. Accordingly, he criticizes Peirce’s argument because it involves the tacit view “that a universe without laws—if those are the reasons for the regularities—would be totally irregular, chaotic” (van Fraassen 1989, p. 21).207 Van Fraassen sums up

207 For van Fraassen’s reconstruction and critique of Peirce’s argument, see van Fraassen 1989, p. 20-22.

what he takes to be Peirce’s version of scholastic realism as follows: “A law must be conceived as the reason which accounts for uniformity in nature, not the mere uniformity or regularity itself. And the law must be conceived as something real, some element or aspect of reality quite independent of our thinking or theorizing—not merely a principle in our preferred science or humanly imposed taxonomy” (van Fraassen 1989, p. 22-23).

Van Fraassen, in short, rejects the argument that the existence of regularities in nature implies that there must be real laws producing the regularities. This suggests a general objection against the ‘realism’ involved in my reconstruction of Bernoulli’s reasoning that requires careful consideration.208

In order to clarify the objection, we should note first that van Fraassen conflates

‘law’ with ‘principle’ in Peirce. Van Fraassen rejects Peirce’s argument in the context of

a larger critique of the concept of ‘law’, which he considers to be an outdated metaphor for a global constraint on natural systems, a remnant of the pre-modern notion that a

Creator set up general principles that govern the universe (see van Fraassen 1989, p. 1-

14). Perhaps this is why he quickly identifies ‘principle’ with ‘law’ in Peirce. Although

Peirce himself equates them sometimes when writing of the reality of laws of nature,

‘law’ and ‘principle’ are not necessarily equivalent. A general principle need not be a

law. In our case study, for instance, the general principle is the general disposition of

aleatory set-ups or of natural processes towards producing a particular outcome with a

given frequency; this general principle acts as an inherent reason in experimental trials

208 My limited aim here will be to clarify the form of ‘realism’ at issue and to suggest the ways in which it strengthens Bernoulli’s reasoning, without engaging in a thorough review of the realism-nominalism debate.

and gives rise to a statistical regularity. As Hanson would point out, Bernoulli’s ‘law of large numbers’ is manifold according to the mathematical or scientific context in which it is deployed.209 As a mathematical result, the law expresses a relation between two ratios,

p and sn, according to the rules of mathematical probability. As an empirical law, it expresses the process through which a statistical regularity arises in nature out of a sequence of independent random events. The link between the mathematical and empirical dimensions of the law is that the law represents a process in which a general principle—in this case, a disposition or proclivity or true ratio p—acts as an inherent

reason that produces the regularity sn.210 In both cases, this ‘law’ is more properly

209 On the manifold meanings of ‘law’ according to the term’s use in science, see Hanson (1965), chapter 5, especially p. 93-105. In physics, for instance, ‘laws’ may function as definitions, empirically true statements against which contrary evidence is inconceivable, empirical statements that may be false but that organize theories and facts into intelligible patterns, contingent and falsifiable universal statements, or notational tools for demarcating the notions we accept for thinking about phenomena.

210 Gillies (2000b) also considers this issue. He asks whether Bernoulli’s theorem is a mathematical theorem or an empirical law. What is the relation between an empirical law of large numbers and the mathematical theorem? Does the mathematical result provide a theoretical explanation for the empirical phenomenon? (see Gillies 2000b, p. 7). Gillies focuses his attention on which one of various semantic interpretations of the probability calculus provides the most satisfactory answer. He argues against the frequency interpretation, according to which mathematical probability is a mathematical science which studies a particular range of observable phenomena, namely, ‘mass phenomena’ and ‘repetitive events’ (p. 88-89). According to Von Mises, for instance, in mass phenomena “the relative frequencies of certain attributes become more and more stable as the number of observations is increased” (Von Mises 1928, p. 12; quoted in Gillies 2000b, p. 92). The mathematical theorem, then, is an abstraction from our observation of the stabilities due to an empirical law (Gillies 2000b, p. 94). However, Gillies shows that the definition of probability as the limiting frequency of an event in the long run “is supposed to be an operational definition of a theoretical concept (probability) in terms of an observable concept (frequency). However, it could be claimed that it fails to provide a connection between observation and theory because of the use of limits in an infinite sequence” (Gillies 2000b, p. 101). In short, the frequency interpretation fails to provide a link between the abstract theoretical notions and the empirical realities. Gillies favors the propensity interpretation of probability to resolve the problem (see Gillies 2000b, ch. 5). My Peircean view is that the question of whether Bernoulli’s theorem is a mathematical theorem or an empirical law poses a false dichotomy. Qua proposition of pure mathematics that pertains to a purely hypothetical state of affairs, Bernoulli’s result is a mathematical theorem, a pure ‘icon’. But Bernoulli grasps abductively that the mathematical ‘icon’ is also a ‘symbol’, an empirical law—that is, it represents an actual reality. This is the reality of ‘probability’ as an inherent reason, a proclivity or propensity that causes the observed empirical regularities. Thus, qua proposition of mathematical science, of what Peirce calls ‘applied mathematics’, Bernoulli’s result is a ‘symbol’ or ‘law’ representing an actual causal process in nature.

understood as a representation. In Peirce’s terms, the ‘law of large numbers’ is a general proposition, of the nature of representation, that furnishes a safe basis for prediction and that corresponds to a reality (see EP 2, p. 181-182). For Peirce, ‘really to be’ and ‘to be represented’ are different, though they correspond to each other: “When I say that really to be is different from being represented, I say that what really is ultimately consists in what shall be forced upon us in experience, that there is an element of brute compulsion in fact, and that fact is not a question of mere reasonableness” (EP 2, p. 182). That is,

‘really to be’ is not merely a question of our form of representation; it is a question of what forces itself upon us in experience. For Peirce, then, the dispositions or tendencies toward statistical regularities of natural events force their reality upon us in the course of experience.
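For reference, the relation between the two ratios can be stated in modern notation (a contemporary reconstruction, not Bernoulli’s own formulation): if p is the fixed a priori probability of an event and sn is the relative frequency with which the event occurs in n independent trials, then for every margin ε > 0,

\[ \lim_{n \to \infty} \Pr\bigl(\,|s_n - p| \le \varepsilon\,\bigr) = 1 , \]

or, in the form closer to Bernoulli’s aim of ‘moral certainty’, for any ε > 0 and any desired level of certainty c < 1 there is a finite number of trials n beyond which \Pr(|s_n - p| \le \varepsilon) > c. Read as a proposition of pure mathematics, this concerns only the hypothetical ratio p; read abductively, as argued above, it represents a real proclivity p producing the observed regularity sn.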

The preceding observations clarify that the present objection weighs against the argument that we must hypothesize that real general principles, which act as inherent reasons, produce the regularities that we observe in nature, while laws represent these operations in natural processes. This would seem a mere shift in terms, except that it suggests some ways to elucidate and present more robustly the type of ‘realism’ at work here. It is important to emphasize that it is the grounds for rational expectation in science that are at stake in Peircean ‘realist’ arguments. These are the grounds for the formulas or laws that furnish “a safe basis for prediction,” as Peirce puts it. More broadly, these are the grounds for the emergence, within our system of beliefs, of general formulas for action or ‘habits’. Peirce argues that if we hypothesize that a stable regularity is due to mere chance, then the regularity does not provide a sound basis for prediction and we have no grounds for expectation. So we must conjecture instead that there are active

general principles at work in producing the regularity. The point is that rational expectation is grounded upon a hypothesis or conjecture. In order to make rational predictions based on our observation of a persistent regularity thus far in experience, we must hypothesize that there is a reason for it. This adds another dimension to the role of hypothesis in scientific reasoning, since I have already argued extensively that, in order to explain the regularity, we must abductively hypothesize what the reason is.

As I have noted, van Fraassen objects that it is not necessary to hypothesize that there is a real reason for a regularity in order to expect rationally that the regularity will continue and to base our predictions on that expectation. On my part, I find this assertion puzzling, as it seems to ignore completely the question of what is the warrant for expecting that the future will resemble the past in order to make scientific predictions—in our case, the warrant for expecting rationally that the regularities observed so far will continue to hold. Peirce’s realist argument offers a solution in the form of a hypothesis; it appeals to the most reasonable hypothesis to explain the fact of persistent regularity: We must hypothesize that general principles are really operative in nature. This hypothesis both explains the fact of uniformity or regularity of past experience and grounds our rational, or better, in Peircean terms, our ‘reasonable’ expectations and predictions, while van Fraassen’s position, at least as presented in this limited context, does not address the issue of the rational grounds for expectation. In our example, for instance, in order to expect reasonably that a stable statistical frequency will persist we must hypothesize that there is a reason for the regularity, and in order to explain the regularity, we must hypothesize what the reason is. Ultimately, Peirce is pointing out that in order for science to be a reasonable endeavor, inquirers must hypothesize that there are real reasons for

observed regularities. Now, van Fraassen attributes to Peirce not only the second argument to the hypothesis that there are real reasons for empirically observable regularities, but also the first argument to the positive assertion that there are such real reasons. However, it is only Peirce’s argument for the reality of general principles as a working hypothesis for scientific inquirers that I am bringing to bear on my elaboration of

Bernoulli’s explanatory reasoning. This working hypothesis is what I think should be explicitly at play in Bernoulli’s argument in order for it to stand as an explanation within the general context of his mathematical and scientific reasoning, and it is what I aim to defend.

Even so, van Fraassen poses an interesting challenge that is worth considering.

According to him, for Peirce ‘by chance’ sometimes means ‘due to no reason at all’ so that he can argue that regularities occur either by chance or due to an active reason.

However, van Fraassen writes, Peirce recognizes the reality of chance “and agreed that anything at all could come about spontaneously, by chance, without any underlying reasons” (van Fraassen 1989, p. 20). It appears that Peirce is inconsistent in maintaining the reality of chance while, at the same time, holding that ‘no regularity could come about without a reason’. Van Fraassen is correct in observing that for Peirce chance is real and that individual phenomena can arise spontaneously, by chance or firstness. Even regularities can arise spontaneously for Peirce. The key, however, is that regularities cannot arise by mere chance. This is what van Fraassen overlooks. Peirce argues that regularities arise either by mere chance or due to an active reason. He hypothesizes that it is due to an active reason or principle. But this does not preclude an element of spontaneity in the emergence of regularity. Like any phenomenon, a regularity has

irreducible elements of spontaneity, reaction, and habit-taking, even if the third element of habit-taking or disposition predominates.211 In the case of the law of large numbers, for

instance, the statistical regularity due to a real proclivity of the natural process does

ultimately arise out of a random sequence of events, and so disposition or habit

predominates in the process. However, there is always an element of spontaneity, as the regularity does arise in chance ways from random experimental trials, and there is always an element of reaction, as the regularity arises out of particular physical states and

actualizing conditions of the natural process.

In the end, my main objective has been to elucidate the Peircean hypothesis of the

‘reality of general principles’ as a working hypothesis for scientific inquirers and to argue

that this hypothesis ought to be part of Bernoulli’s explanatory reasoning. Bernoulli’s

‘realism’ would consist, then, in the formulation of the working hypothesis that a priori

proclivities are the really operative reasons that tend to produce statistical regularities out

of a sequence of independent random experiments. The full consideration of this position finally brings us to discuss the relation between explanation and reality.

8.3 Explanation and Reality in Bernoulli’s Reasoning

From a contemporary perspective, Bernoulli’s own claim about the kind of knowledge that his ‘art of conjecturing’ ultimately yields may appear, in some respects, relatively modest. According to him, the manifold causes that determine natural events

211 Recall here my discussion of the three universal categories in section 2.1.1.

such as bodily disease and atmospheric storming will remain latent in the phenomena.

We may come to know some isolated causes, but the whole of the determining causes and their intricate operations will be unknown to us—we cannot know them a priori. Given this epistemological situation, Bernoulli thinks that the best that we can do is to estimate probabilities a posteriori. The self-proclaimed upshot of his theorem is that, as we increase the number of observations of, say, a natural event, we can approach practical, or moral, certainty regarding the frequency of occurrence of that event. According to one of his examples, for instance, the more we observe the life spans of pairs of young and old men, the more practically certain we become of how much more likely a young man is to live a given number of years from now than an old man. But we can never know the real causes that will determine a particular young man to die before or after a particular old man (see Bernoulli 1966, p. 36-38). In general, he thinks his theorem shows what we know by an ‘instinct of nature’, namely, that the more observations we have and the more experiments we make, the more practically certain our a posteriori knowledge is. At best, by way of the observation of all phenomena throughout eternity, “the ultimate probability would tend toward perfect certainty” and “everything in the world would be perceived to happen in fixed ratios and according to a constant law of alternation” (Bernoulli 1966, p.

66). This possibility, however, is an ideal, and it is not attainable by human cognition.

Nonetheless, Bernoulli does explicitly aim at justifying a richer, fuller sense of science. In the scientific context of his age, he confronts a notion of science according to which we can only have absolutely certain knowledge deduced from a priori principles, or else we have no knowledge but mere opinion (see Hacking 1975, ch. 3), and he attempts to establish a richer mathematical science in which we can also have probable

knowledge in natural and social affairs. In order to found this science, he proves the first limit theorem of mathematical probability and formulates a plausible hypothesis to justify the application of the theorem in the estimation of unknown a priori probabilities by way of a posteriori frequencies. Although it would take time for this richer probabilistic science to develop, Bernoulli laid its foundations and argued formidably in its favor.

Leibniz’s March 22, 1714, letter to Bourguet testifies to the achievement of

Bernoulli’s mathematical and logical reasoning. In commenting on this letter, Hacking emphasizes how Bernoulli’s correspondence and Ars Conjectandi persuaded

Leibniz that it is possible to estimate probabilities a posteriori (Hacking 1971b, p. 346).

After describing to Bourguet the a priori method of estimating how probable—in the physical sense of feasible or makeable—an event is by counting the possible outcomes of, say, the throw of dice, Leibniz writes: “One may still estimate likelihoods

(vraisemblances) a posteriori, by experience, to which one must have recourse in default of a priori reasons. For example, it is equally likely that a child should be born a boy or a girl, because the number of boys and girls is very nearly equal all over the world. One can say that what happens more or less often is more or less feasible in the present state of things, putting together all considerations that must concur in the production of a fact”

(cited in Hacking 1971b, p. 346). Leibniz, therefore, now believes that we might estimate the degree of physical feasibility of a natural event from our experiential observation of the statistical frequency of the event. And even if we do not know the a priori reasons for

that physical feasibility or proclivity, our statistical estimate is no mere opinion but a well-founded, warranted estimate of a factual truth.212

In spite of his deliberate attempt at enriching science with a new, warranted form

of knowledge, however, and due to his deep-rooted belief in ‘causal determinism’ and to

his explicit attempt at defining probability as degree of subjective certainty in part IV of

the Ars Conjectandi, Bernoulli cannot himself envision some of the significant

possibilities for scientific understanding that his ‘stochastic art’ opens up. In the first

place, he does not fully realize how his reasoning provides a way to understand the occurrence of statistical regularities in nature. Given our epistemic limitations, he thinks that we can never ascertain the whole of the determining causes of natural events nor the causal laws through which these causes operate. And so due to his emphasis on our inability to seize upon determining causes, Bernoulli may have struggled to clarify before

Leibniz the way in which a priori probabilities explain a posteriori frequencies.

Considered from this perspective, Bernoulli’s response would have to be that a priori

probabilities are the real efficient cause of stable statistical frequencies. But this inherent

‘physical probabilism’—that is, the notion that events are, at least in part, inherently

probabilistic and thus not absolutely determined—is at odds with his notion of complete

causal determination and with his definition of probability as degree of subjective certainty in an absolutely deterministic world where all events are objectively certain.

Thus, Bernoulli does not grasp that his implicit treatment of a priori probabilities as

212 Interestingly, Hacking also notes that “for once, Leibniz has not kept up with the literature, for Arbuthnot had already published [four years earlier, in 1710] his proof that regularly more boys are born than girls” (1971b, p. 346).

physical proclivities creates the possibility of understanding statistically regular phenomena as being the intelligible product of real a priori dispositions operating as inherent reasons in aleatory and natural events, and that the actual operation of these dispositions in the making of statistical regularities is represented via his own theorem.

Moreover, Bernoulli does not realize how his logical defense, via an explanatory hypothesis, of his theorem’s applicability in natural and social science provides a way to focus scientific inquiry into the real principles, reasons, or causes of statistical phenomena. The crucial point in this respect is that the proximity of an observed a posteriori frequency to a real, but unknown, a priori probability is a working scientific hypothesis. Likewise, the related view that the active operation of a priori dispositions in aleatory or natural events tends to produce statistical regularities is also a working scientific hypothesis. Upon Bernoulli’s positing of a really operative a priori probability as the explanation for a statistical regularity, a fellow inquirer like Leibniz may counter-argue: ‘Even if we posit an a priori probability or true ratio, we still need to know what the causes of that probability are.’ As we see in the letter to Bourguet, for instance,

Leibniz admits that a posteriori statistical frequencies are warranted estimates of the a priori physical probability of an outcome. However, he notes that we still do not know the a priori reasons for that a priori probability. In the case of birth rates, for example,

Leibniz thinks that we may come to know a posteriori what the true probability of a male or of a female birth is in the world population (in his erroneous estimate, the ratio of male to female births is 1:1), but we do not know the a priori reasons for this probability. But further research into these a priori reasons is precisely what Bernoulli’s working hypothesis would suggest as the focus for further inquiry—search into the principles,

reasons, laws, and causes that produce that a priori tendency. Leibniz’s possible retort to

Bernoulli need not be an objection; it must rather be a catalyst for further scientific inquiry. Bernoulli’s explanatory hypothesis is not meant to be the end of the inquiry; it is rather a working hypothesis that helps to focus further research in order to understand the statistical phenomenon.

Let us revisit the example of Arbuthnot’s study of birth rates in London.

Arbuthnot observes that the statistical ratio of male to female births is 18:17. Why should this be? Arbuthnot argues that Divine Providence intervenes for the purpose of ensuring the preservation of mankind. Since this observed 18:17 ratio would be highly improbable due to chance alone when the true a priori ratio of male to female births is assumed to be

1:1, since it is necessary to have an equal number of males and females for reproduction, and since males have a higher mortality rate, Divine Providence intervenes for the sake of a final cause—the preservation of humankind (see Arbuthnot 1710). This, however, does not count as a scientific explanation for Bernoulli. But his stochastic art makes it possible to infer instead that the a priori probability of a male birth in London in the late seventeenth and early eighteenth centuries is 18/35 because this would offer the most plausible explanation for the observed frequency. At the very least, he offers an explanation in terms of an a priori efficient cause, not a final cause. Now Leibniz might ask, ‘but why is the a priori probability of a male birth 18/35?’ Although Bernoulli himself does not realize it, his reasoning makes possible the reply that that is precisely a new question for research: ‘We should hold the working hypothesis that the true probability is 18/35, and now we should focus our scientific research into the real principles and causes that produce that a priori disposition.’ This working hypothesis

should guide all further investigation as it will focus the inquiry in a fruitful manner into the real reasons that make the stable frequency intelligible. In the long run, this course of inquiry will lead to a more lucid and truthful understanding of the statistical ratio of male to female births than the mere supposition that God has willed the ratio to be so.
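The arithmetic behind the inference is simply 18/(18 + 17) = 18/35 ≈ 0.514. A minimal simulation sketch may make the contrast between the two hypotheses vivid (this is my own illustration, not Arbuthnot’s or Bernoulli’s computation; the yearly number of births and the number of simulated years are hypothetical figures chosen only for the example):

import random

def male_fraction(n_births, p_male, rng):
    # Simulate one hypothetical year of birth records and return the male fraction.
    males = sum(1 for _ in range(n_births) if rng.random() < p_male)
    return males / n_births

def years_with_18_to_17_excess(p_male, n_years=500, n_births=10000, seed=1):
    # Count simulated years in which males outnumber females by at least 18:17.
    rng = random.Random(seed)
    threshold = 18 / 35
    return sum(1 for _ in range(n_years)
               if male_fraction(n_births, p_male, rng) >= threshold)

# Under the 'mere chance' hypothesis p = 1/2, so large an excess in a year of
# many births is a rare accident (roughly one simulated year in several hundred).
print(years_with_18_to_17_excess(p_male=0.5))

# Under the working hypothesis p = 18/35, it occurs in roughly half of the years.
print(years_with_18_to_17_excess(p_male=18/35))

On the former hypothesis the persistent excess of male births remains an unexplained accident; on the latter it is just what the posited proclivity would regularly produce, which is why the working hypothesis offers the more plausible abduction.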

Thus the course of scientific inquiry into statistical phenomena will proceed by way of abductive hypothesizing, that is, by way of inquirers conjecturing the real principles and causes that tend to produce regularity. And the search for explanation will be an inquiry into reality. For Bernoulli, as for all investigators who, in Peircean terms, inquire in scientific ‘singleness of heart’—that is, with the earnest desire to learn how the truth stands in the matter under investigation—,213 scientific research ultimately aims at

explaining reality, at making natural phenomena understandable to us by revealing their

real principles, reasons, and causes. Truthful explanation is a much higher aim than

simply describing the phenomena adequately, for example, than simply describing

mathematically the emergence of a statistical frequency via the ‘law of large numbers’.

Bernoulli’s scientific aim, as it is evident in his reasoning, is not merely to describe statistical regularity but to explain it. This same aspiration, therefore, ought to guide further inquiry in order to explain statistical regularity—inquiry that proceeds from

Bernoulli’s working hypothesis of real a priori probabilities as dispositions towards the gradual discovery of the real reasons that produce those a priori dispositions. In this regard, Bernoulli’s explanatory hypothesis helps scientific inquirers to focus their

213 See “The First Rule of Logic,” in EP 2.5 and in RLT, p. 165-180.

research so as to gradually approach the aim of scientific inquiry—namely, the truthful understanding of reality.

Chapter 9

Conclusion

I opened this investigation by asking what is the logic at work in the inquiring

activity of mathematicians. This is a question regarding the logica utens that

mathematicians actually deploy in the course of mathematical inquiry. But in accordance

with Peirce’s pragmatism, I have tried to keep before our attention what this logica utens

may suggest towards developing a logica docens for the training of students of

mathematics. By way of summary, I want to close this investigation with some final

reflections on the practical bearings, for the training of mathematicians, of the Peircean

logic of mathematical inquiry that I have expounded. This is also in the spirit of Peirce’s

philosophy, which he intended to be examined, criticized, and developed.

Peirce in fact made several efforts at developing methods for training students in

the ‘art of reasoning’, including the correspondence course that I discussed in section

4.3.214 He makes some important remarks with specific regard to the teaching and the

studying of mathematics in “Reason’s Conscience: A Practical Treatise on the Theory of

Discovery wherein Logic is Conceived as Semeiotic” (MS 693; printed in part in NEM

4.10). This manuscript was intended as a text for the training of students of logic, broadly

conceived, again, as a three-fold science of ‘speculative grammar’, ‘critic’, and

‘methodeutic’. After affirming that mathematics is the easiest of sciences that nonetheless

214 Carolyn Eisele has written a good introduction to Peirce’s thought on mathematical education. See “Peirce’s Philosophy of Mathematical Education” in Eisele 1979, p. 177-200.

is usually taught poorly, Peirce writes on the necessary abilities for teaching mathematics: “A teacher of mathematics, especially in its most elementary branches (for they once passed, it does not so much matter), ought to have three requirements which I name in the order of their importance, though all are indispensable. He should be a strong mathematician, a subtile logician, and an accomplished psychologist. It will also be of advantage if he should be a natural teacher and an experienced teacher of mathematics”

(NEM 4, p. 199). Without a doubt, fulfilling these requirements would be quite a challenge for any teacher, however gifted and well-prepared. However, leaving aside the psychological issues that Peirce alludes to, an understanding of the logic of mathematical inquiry would make the teacher a stronger mathematician in her own right, and she would have a more thorough command of the nature of the reasoning that she ought to elicit from her students. Elsewhere, Peirce claims that “[m]any pupils have genius; they have intuitions; they relieve the teacher of the hardest part of his task” (NEM 4, p. xiv).

That is, they have a creative imagination, an ability to conceive and modify images, signs, or, in the case of mathematics, ‘diagrams’.215 What such pupils require, then, is a

teacher who knows what is the “real nature of a mathematical demonstration” (NEM 4, p.

xiv). That is, they need guidance in developing an effective logic of inquiry. What is needed, in sum, is a teaching approach that conveys, from the earliest mathematical studies, that mathematics is not a matter of following pre-established rules to solve contrived, self-contained

problems, but an inquiring practice that requires an active imagination, vigorous concentration, and an effort at generalizing findings, often on the basis of existing

215 In the case of Peirce, the term ‘intuition’ should not be understood in a Cartesian or Kantian way. I think that by ‘intuitions’ Peirce here must mean simply ‘signs arising from a creative imagination’.

knowledge outside the relatively narrow realm of a particular mathematical field. Such an approach, I submit, would already be a tremendous improvement in mathematical education.

With respect to the student of mathematics, Peirce in turn vividly writes:

In reading mathematics, the student should beware of falling into a passive state. He must remember that it is he himself, and nobody else, who has to perform the entire reasoning process, for which the book merely affords some hints. There are men who never dream of this. If they are asked to look at a diagram, they will say to themselves, “Yes, we will suppose that done,” without the smallest consciousness of not having done what was asked of them…Should anybody get it into their heads that they are required to do the reasoning themselves, they will feel as if they had been stood upon the ridge pole of a cathedral and were required to walk unaided from one end of it to the other. The scare will deprive them of their senses.

You may remark that a boy with any turn for mathematics talks of learning Latin but never talks of learning mathematics. He talks of doing mathematics. Now you, instead of standing there eternally shivering on the brink, first carefully watch you what the book says you are to do, and having thought over how you are to do it, then without further hesitation jump you in and do it; and you will soon find it is great fun. (NEM 4, p. 199-200)

I do not know if students who fear mathematics as much as they fear falling to their death from the ridge pole of a cathedral will ever come to think of doing mathematics as fun, but the point is that they can at least do something about their fear: namely, do mathematics.216 And to do mathematics, what they require, first and foremost, is not a set

of rules in, say, arithmetic and geometry, but a logica utens. It is this logic that they need

to practice as they learn the elements of arithmetic and geometry. The foregoing

investigation into the logic of mathematical inquiry elucidates the basic outlines of the

practice that educators ought to encourage.

216 Peirce illustrates this by continuing to show how the student may walk across the Pons Asinorum, the bridge found at Euclid’s fifth proposition, as discussed in chapter 2.

In the first place, students must cultivate the conditions for the possibility of discovery. Most fundamentally, this means strengthening their powers of imagination, concentration, and generalization. In my estimation, the first ability that a poor mathematical education enervates is the imagination. ‘Follow this rule to solve this kind of problem’ is the essence of a poor education; meanwhile the question of what to do should a different problem arise is never raised. As I suggested in section 4.3, then, the foremost task to be accomplished in mathematical training is encouraging the creative imagination as much as strengthening the vigor of concentration and the power of generalization. Learning to deploy the universal categories heuristically would help in this matter. Exercises aimed at distinguishing the elements of inherent quality or

Firstness, relation or Secondness, and generality or Thirdness, in any object of conscious attention—an actual physical phenomenon as much as an imagined poetic or mathematical state of affairs represented in an ‘icon’—would strengthen the powers of imagination, concentration, and generalization respectively.217 In the course of the

students’ gradual learning of the elements of existing mathematical knowledge, these

exercises ought to be accompanied by illustrations of the important function of the

community of inquirers in providing dialogical criticism of one’s reasoning. A practical problem context would further the students’ progress in conducting inquiry by training them to grasp analogies between ideal mathematical states of affairs and actual physical or practical problems. Such training would be especially important for students

217 As I have acknowledged, the design of such exercises would require further reflection. Peirce’s work would provide some guidance, of course. For instance, he dedicates one of his 1898 Cambridge Conferences Lectures, entitled “Training in Reasoning,” to describing in detail some exercises conducive to distinguishing inherent quality, relation, and generality in any phenomenon. See RLT, p. 181-196.

who may have no interest in becoming ‘pure mathematicians’ in the Peircean sense, but who will see the potential utility of pursuing mathematical reasoning in their own specialized inquiries.

In the second place, as students make progress in their education, they require practical training in the method of analysis and the associated techniques of mathematical experimentation. In this regard, my principal thesis has been that ‘hypothesis-making’— understood both as framing hypothetical states of affairs and as making analytical, problem-solving hypotheses in the course of experimenting upon ‘diagrams’—and not deduction from self-evident axioms is the main form of ampliative reasoning in mathematics. Even at the heart of ‘theorematic’ deduction—that is, of the process of drawing necessary conclusions that are not immediately evident from the premises of the reasoning in the way that corollaries are immediately evident—we find that mathematicians are experimenting, attempting various hypotheses in order to arrive at a desired result. Through a variety of techniques, they are attempting experimental modifications upon images of their own creation in order to derive other images. In its ampliative phases, their logical reasoning can be thoroughly characterized as experimenting upon ‘signs’ so as to derive other related ‘signs’—their mathematical

“train of thought” consists in this continuous association of interrelated ‘diagrams’. This is nothing but an expression of Peirce’s law of mind, a hypothesis that I discussed in chapter 2 when proposing that abduction, unlike induction, consists in the innovative association of ideas represented as ‘signs’. Peirce writes: “I say that the whole law of mind, in the department of science [including mathematics], of art, and of practical life,—whether it be what we call knowing, or emotion, or reaction with the world,

consists in this that ideas connect themselves with iconical ideas, so as to make up sets”

(MS 1008; printed in NEM 4, p. xx). What I have called a ‘mathematical world’—a hypothetical state of affairs—is nothing but a set of interconnected ‘icons’ created by way of ‘framing hypotheses’ and developed by way of experimentation upon logically associated or connected ‘diagrams’.

Now, that mathematics should be practiced by creating and experimenting upon

‘diagrams’ would come as a surprise to most students of mathematics, at least in the early stages of education, I think. Perhaps the brightest students come to understand it on their own early in their studies. The rest of us need help to see it—at least I would have profited from this kind of insight during my studies. I came to understand it with some level of clarity only during my university courses, when I was actually doing mathematics, not following rules for the solution of specific types of problems; Peirce and Cellucci helped me to understand it distinctly with their articulations of the method of mathematical inquiry; and I have attempted to develop their articulations further. I admit that I have only discussed some of the heuristic techniques of analysis that practicing mathematicians employ. Peirce and especially Cellucci (2002) have already developed a more thorough catalogue of techniques.218 However, on the basis of their

work I have described in careful detail the various stages that constitute the analytic

experimental process—the five stage process that students of mathematics must practice.

218 Nonetheless, a task that is pending from a Peircean standpoint is to classify those techniques under the three main kinds of reasoning and the subspecies thereof. This is a task of logical ‘critic’ in the Peircean sense. I admit that it proved to be too extensive a task for this limited work, and as a consequence my discussion of heuristic techniques appears to me to be somewhat unsystematic, contrary to my original aim.

And I have illustrated how the history of mathematics, as a record of mathematical practice, holds the key towards developing a method for successful mathematical inquiry.

In the third place, I propose that students of mathematics ought to gain experience in the application of mathematics to practical and scientific problems. As the example of

Bernoulli’s comprehensive reasoning for proving his mathematical theorem and warranting its application to problems in the sciences indicates, the living practice of inquiry is not circumscribed to any one field of specialization. Bernoulli acted both as

‘pure’ mathematician and as ‘applied’ mathematical scientist in the course of the comprehensive reasoning that established his theorem as one of the historical hallmarks of mathematical achievement. Over the course of several decades in the late seventeenth and early eighteenth centuries, he pursued the mathematical research and tackled the subtle logical and scientific difficulties associated with the foundation of the ‘art of conjecturing’ upon the mathematics of probability. It was a long struggle, one that he may have felt he lost, since he did not himself publish the Ars Conjectandi. I have argued, however, that he succeeded both as pure mathematician and as mathematical scientist. There may have been shortcomings in his reasoning. But he spurred a line of research that continued through the investigations of Abraham de Moivre, Thomas Bayes, Pierre Simon Laplace, and Carl Friedrich Gauss, among others, that eventually led to the pervasive influence of mathematical probability, and its twin field of mathematical statistics, upon scientific research in the nineteenth and twentieth centuries, and that will continue throughout the present twenty-first century. On the basis of this historical example, I advocate the benefits to the student of mathematics, and especially to the ‘pure’ mathematician in training, of cultivating a practice as mathematical scientist. This

practice may be aided, once again, by deploying the universal categories as heuristic tools, that is, by observing and distinguishing the inherent qualities, the relations, and the general elements of phenomena.
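
For reference, the theorem discussed above may be stated in present-day notation. The formulation and the symbols used here ($S_n$ for the number of successes in $n$ independent trials, $p$ for the fixed probability of success on each trial, $\varepsilon$ for the margin of error) are a modern gloss introduced only for illustration; they are not Bernoulli’s own statement, which was framed in terms of ratios of cases and a bound sufficient for ‘moral certainty’:

\[
P\left(\left|\frac{S_n}{n} - p\right| < \varepsilon\right) \longrightarrow 1 \quad \text{as } n \to \infty, \qquad \text{for every fixed } \varepsilon > 0.
\]

Read in this way, the theorem grounds the inferential practice Bernoulli sought to warrant: with a sufficiently large number of trials, the observed relative frequency $S_n/n$ may be taken, within any desired margin and with probability as close to certainty as one pleases, as an estimate of the underlying ratio $p$.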

By following the outlines of this logica docens that reflects the logica utens of mathematical inquiry, I propose that students of mathematics may ultimately develop a characteristic approach, a personal ‘methodeutic’ for mathematical research. Admittedly, if the Peircean logic of mathematical inquiry that I have described is correct and is indeed a practicable logica utens, that ‘methodeutic’ will clearly have the general character of experimental analysis. However, as Peirce’s system of universal categories shows, the generality of a logical ‘methodeutic’ does not preclude its inherent spontaneity in practice. The creative imagination of the student of mathematics will lead her to develop a personal style, much like the creative style of poets and artists. At that point, mathematical practice will become fun, as Peirce desired from what I would call his

‘joyful’ philosophical stance, that is, a stance that does not preclude originality and spontaneity from any reasoning practice. To the extent that mathematical practice is akin to other activities, this logica docens, this approach to training, will extend its influence beyond the education of students of mathematics alone to the fostering of other scientific and poietic practices.

Bibliography

Anderson, D. (1986). "The Evolution of Peirce's Concept of Abduction." Transactions of the Charles S. Peirce Society 22: 145-164.

Anderson, D. (1987). Creativity and the Philosophy of C. S. Peirce. Dordrecht, The Netherlands, Martinus Nijhoff.

Anderson, D. (1995). Strands of System: The Philosophy of Charles Peirce. West Lafayette, Indiana, Purdue University Press.

Aquinas, Thomas (1992). In libros Posteriorum analyticorum. Milano, Editoria Elettronica Editel.

Arbuthnot, J. (1710). "An Argument for Divine Providence, Taken from the Constant Regularity Observed in the Births of Both Sexes." Philosophical Transactions of the Royal Society of London 27: 186-190.

Archibald, R. C. (1938). A Semicentennial History of the American Mathematical Society, 1888-1938. New York, American Mathematical Society.

Aristotle (1984). The Complete Works of Aristotle. Princeton, New Jersey, Princeton University Press.

Arnauld, A., and Nicole, Pierre (1970). La Logique, ou L'Art de penser: contenant, outre les regles communes, plusiers observations nouvelles propres a former le iugement. New York, G. Olms.

Ayim, M. (1982). Peirce's View of the Roles of Reason and Instinct in Scientific Inquiry. Meerut, India, Anu Prahashan.

Bayes, T. (1763). "An essay towards solving a problem in the doctrine of chances." Philosophical Transactions of the Royal Society of London 53: 370-418.

Bernoulli, J. (1713). Ars Conjectandi. Basel, Thurnisiorum.

Bernoulli, J., and Sung, Bing, Trans. (1966). Translations from James Bernoulli. Cambridge, Massachusetts, Department of Statistics, Harvard University.

Brown, J. R. (1999). Philosophy of Mathematics: An Introduction to the World of Proofs and Pictures. New York, Routledge.

Byrne, P. (1997). Analysis and Science in Aristotle. Albany, New York, State University of New York Press.

Calinger, R. (1999). A Contextual History of Mathematics to Euler. Upper Saddle River, New Jersey, Prentice Hall.

Cardano, G. (1961). The Book on Games of Chance: "Liber de ludo aleae". New York, Holt, Rinehart and Winston.

Cellucci, C. (2000). The Growth of Mathematical Knowledge: An Open World View. The Growth of Mathematical Knowledge. E. Grosholz, and Breger, Herbert. Dordrecht, The Netherlands, Kluwer Academic Publishers.

Cellucci, C. (2002). Filosofia e matematica. Bari, Italy, Laterza.

Cohen, L. J. (1989). An Introduction to the Philosophy of Induction and Probability. Oxford, Clarendon Press.

Daston, L. (1988). Classical Probability in the Enlightenment. Princeton, New Jersey, Princeton University Press.

David, F. N. (1962). Games, Gods and Gambling: A History of Probability and Statistical Ideas. London, Charles Griffin & Co.

De Moivre, A. (1718). The Doctrine of Chances, or a Method of Calculating the Probability of Events in Play. London, W. Pearson.

De Moivre, A. (1730). Miscellanea Analytica de Seriebus et Quadraturis. London, J. Tonson and J. Watts.

de Waal, C. (2005). "Why Metaphysics Needs Logic and Mathematics Doesn't: Mathematics, Logic, and Metaphysics in Peirce's Classification of the Sciences." Transactions of the Charles S. Peirce Society XLI(2): 283-297.

Eisele, C. (1979). Studies in the Scientific and Mathematical Philosophy of Charles S. Peirce. The Hague, Mouton.

Euclid (1956). The Thirteen Books of Euclid's Elements. Thomas Heath (Ed.). New York, Dover.

Galileo (1898). Opere. Firenze, Barbera.

Gillies, D., Ed. (1992). Revolutions in Mathematics. Oxford, Oxford University Press.

Gillies, D. (2000a). An Empiricist Philosophy of Mathematics and its Implications for the History of Mathematics. The Growth of Mathematical Knowledge. E. Grosholz,

and Breger, Herbert. Dordrecht, The Netherlands, Kluwer Academic Publishers: 41-57.

Gillies, D. (2000b). Philosophical Theories of Probability. London, Routledge.

Graunt, J. (1662). Natural and Political Observations on the Bills of Mortality. London, T. Roycroft.

Grosholz, E. (1991). Cartesian Method and the Problem of Reduction. Oxford, Clarendon Press.

Grosholz, E. (1992). Was Leibniz a Mathematical Revolutionary? Revolutions in Mathematics. D. Gillies. Oxford, Oxford University Press.

Grosholz, E., and Breger, Herbert, Eds. (2000a). The Growth of Mathematical Knowledge. Dordrecht, The Netherlands, Kluwer Academic Publishers.

Grosholz, E. (2000b). The Partial Unification of Domains, Hybrids, and the Growth of Mathematical Knowledge. The Growth of Mathematical Knowledge. E. Grosholz, and Breger, Herbert. Dordrecht, The Netherlands, Kluwer Academic Publishers: 81-91.

Grosholz, E. (2001). Mathematics, Representation, and Molecular Structure. Tools and Modes of Representation in the Laboratory Sciences. U. Klein. Dordrecht, Kluwer Academic Publishers.

Grosholz, E. (2005). Mathematical Reasoning and Heuristics. C. Cellucci, and Gillies, Donald. London, King's College Publications: 1-23.

Hacking, I. (1971a). "Jacques Bernoulli's 'Art of Conjecturing'." The British Journal for the Philosophy of Science 22: 209-229.

Hacking, I. (1971b). "Equipossibility Theories of Probability." The British Journal for the Philosophy of Science 22: 339-355.

Hacking, I. (1975). The Emergence of Probability. Cambridge, U.K., Cambridge University Press.

Hacking, I. (1982). "Experimentation and Scientific Realism." Philosophical Topics 13: 71-88.

Hanson, N. R. (1965). Patterns of Discovery: An Inquiry into the Conceptual Foundations of Science. Cambridge, U.K., Cambridge University Press.

Harman, G. H. (1965). "The Inference to the Best Explanation." The Philosophical Review 74: 88-95.

Hausman, C. (1993). Charles S. Peirce's Evolutionary Philosophy. Cambridge, U.K., Cambridge University Press.

Hempel, C. (1966). Philosophy of Natural Science. Upper Saddle River, New Jersey, Prentice Hall.

Hilbert, D. (1899). Foundations of Geometry. La Salle, Illinois, Open Court.

Hookway, C. (1985). Peirce. London, Routledge.

Houser, N., Roberts, Don, and Van Evra, James, Ed. (1997). Studies in the Logic of Charles Sanders Peirce. Indianapolis, Indiana University Press.

Huygens, C. (1657). Ratiociniis in aleae ludo. Exercitionum Mathematicorum. F. van Schooten. Amsterdam, J. Elsevirii.

Jacquette, D., Ed. (2002). Philosophy of Mathematics: An Anthology. Oxford, Blackwell.

Kendall, M. G. (1970). The Beginnings of a Probability Calculus. Studies in the History of Statistics and Probability. E. S. Pearson, and Kendall, M. G. London, Charles Griffin & Co.: 19-34.

Kolmogorov, A. N. (1933). Grundbegriffe der Wahrscheinlichkeitsrechnung. Berlin, J. Springer.

Lakatos, I. (1976). Proofs and Refutations. Cambridge, Cambridge University Press.

Laplace, P. S. (1814). Essai philosophique sur les probabilités. Paris, Mme. ve Courcier.

Leibniz, G. W. (1855). Leibnizens Mathematische Schriften herausgegeben von C. I. Gerhardt. Berlin, A. Asher.

Lipton, P. (1991). Inference to the Best Explanation. New York, Routledge.

Liszka, J. J. (1996). A General Introduction to the Semeiotic of Charles Sanders Peirce. Indianapolis, Indiana University Press.

Maddy, P. (1990). Realism in Mathematics. Oxford, Oxford University Press.

Murphey, M. G. (1961). The Development of Peirce's Philosophy. Cambridge, Massachusetts, Harvard University Press.

Neyman, J. (1950). First Course in Probability and Statistics. New York, Holt.

Pacioli, L. (1494). Summa de arithmetica, geometria, proportioni et proportionalità. Venice, Paganinus de Paganinis.

Pearson, K. (1924). "Historical Note on the Origin of the Normal Curve of Errors." Biometrika 16: 402-404.

Pearson, K. (1925). "James Bernoulli's Theorem." Biometrika 17: 201-210.

Pearson, K. (1926). "Abraham de Moivre." Nature 117: 551-552.

Pearson, E. S., and Kendall, M. G., Ed. (1970). Studies in the History of Statistics and Probability. London, Charles Griffin & Co.

Peirce, B. (1870). Linear Associative Algebra. Washington City.

Peirce, C. S. (1932). Collected Papers of Charles Sanders Peirce. Cambridge, Massachusetts, Harvard University Press.

Peirce, C. S. (1957). Essays in the Philosophy of Science. Indianapolis, The Liberal Arts Press.

Peirce, C. S. (1976). The New Elements of Mathematics. The Hague, The Netherlands, Mouton Publishers.

Peirce, C. S. (1982). Writings of Charles S. Peirce: A Chronological Edition. Bloomington, Indiana University Press.

Peirce, C. S. (1992a). The Essential Peirce: Selected Philosophical Writings 1. Indianapolis, Indiana University Press.

Peirce, C. S. (1992b). Reasoning and the Logic of Things. Cambridge, Massachusetts, Harvard University Press.

Peirce, C. S. (1998). The Essential Peirce: Selected Philosophical Writings 2. Indianapolis, Indiana University Press.

Peverone, G. F. (1558). Due brevi e facili trattati, il primo d'arithmetica, l'altro di geometria. Lione, Gio. di Tournes.

Polya, G. (1954). Mathematics and Plausible Reasoning. Princeton, Princeton University Press.

Popper, K. (1959). The Logic of Scientific Discovery. London, Hutchinson.

Popper, K. (1963). Conjectures and Refutations: The Growth of Scientific Knowledge. New York, Routledge.

Potter, V. (1967). Charles S. Peirce on Norms and Ideals. Amherst, Massachusetts, The University of Massachusetts Press.

Robin, R. S. (1967). Annotated Catalogue of the Papers of Charles S. Peirce. Amherst, Massachusetts, University of Massachusetts Press.

Russell, B. (1907). The Regressive Method of Discovering Premises in Mathematics. Essays in Analysis. D. Lackey. New York, George Braziller.

Santaella, L. (1998). "La evolución de los tres tipos de argumento: abducción, inducción y deducción." Analogía Filosófica 12(1): 9-20.

Schaffner, K. F. (1993). Discovery and Explanation in Biology and Medicine. Chicago, The University of Chicago Press.

Schneider, I. (2000). The Mathematization of Chance in the Middle of the 17th Century. The Growth of Mathematical Knowledge. E. Grosholz, and Breger, Herbert. Dordrecht, The Netherlands, Kluwer Academic Publishers: 59-75.

Shanks, D. (1993). Solved and Unsolved Problems in Number Theory. New York, Chelsea.

Shapiro, B. (1983). Probability and Certainty in Seventeenth-Century England. Princeton, New Jersey, Princeton University Press.

Stigler, S. (1986). The History of Statistics: The Measurement of Uncertainty before 1900. Cambridge, Massachusetts, Belknap Press.

Thompson, M. (1953). The Pragmatic Philosophy of C. S. Peirce. Chicago, The University of Chicago Press.

Todhunter, I. (1865). A History of the Mathematical Theory of Probability from the Time of Pascal to that of Laplace. London, Macmillan.

van Fraassen, B. (1989). Laws and Symmetry. Oxford, Clarendon Press.

Von Mises, R. (1928). Wahrscheinlichkeit, Statistik und Wahrheit. Wien, J. Springer.

VITA

Daniel Gerardo Campos

Born July 25, 1972 in San José, Costa Rica

Education

Ph.D. Philosophy. Pennsylvania State University, University Park, PA. December 2005.

M.A. Philosophy. Pennsylvania State University, University Park, PA. August 2002.

M.A. Statistics. Pennsylvania State University, University Park, PA. August 1996.

B.A. Mathematics. Harding University, Searcy, AR. May 1994.

Educación secundaria. Colegio Monterrey, San José, Costa Rica. December 1989.

Educación primaria. Unidad Educativa México, San José, Costa Rica. December 1984.

Honors, Grants, and Awards

Resident Scholar Fellowship, Dibner Library of the History of Science and Technology, Smithsonian Institution, Washington D.C., 2005.

Graduate Scholar Award, Pennsylvania State University, Fall 2000 – Spring 05.

University Graduate Fellowship, Pennsylvania State University, Fall 1998 – Spring 99.

Walton Scholar, Harding University, 1990 – 94.

Outstanding Senior in Mathematics, Harding University, 1994.

Published Articles

“Assessing the Value of Nature: A Transactional Approach.” Environmental Ethics, 24(1), 57-74, 2002.

“Resource Selection by Animals: The Statistical Analysis of Binary Response.” Coenoses, 12(1), 1-21, 1997.