Philosophia Scientiæ Travaux d'histoire et de philosophie des sciences

16-1 | 2012 From Practice to Results in Logic and Mathematics

Valeria Giardino, Amirouche Moktefi, Sandra Mols and Jean-Paul Van Bendegem (dir.)

Electronic version URL: http://journals.openedition.org/philosophiascientiae/700 DOI: 10.4000/philosophiascientiae.700 ISSN: 1775-4283

Publisher Éditions Kimé

Printed version Date of publication: 1 April 2012 ISBN: 978-2-84174-581-4 ISSN: 1281-2463

Electronic reference Valeria Giardino, Amirouche Moktefi, Sandra Mols and Jean-Paul Van Bendegem (dir.), Philosophia Scientiæ, 16-1 | 2012, “From Practice to Results in Logic and Mathematics” [Online], Online since 01 April 2012, connection on 15 January 2021. URL: http://journals.openedition.org/philosophiascientiae/700; DOI: https://doi.org/10.4000/philosophiascientiae.700

This text was automatically generated on 15 January 2021.

All rights reserved

TABLE OF CONTENTS

Préface Léna Soler

Introduction: From Practice to Results in Mathematics and Logic Valeria Giardino, Amirouche Moktefi, Sandra Mols and Jean Paul Van Bendegem

Part I: Conceptual and Theoretical Issues

Practice, Constraint, and Mathematical Concepts Mark C. R. Smith

Proof and Understanding in Mathematical Practice Danielle Macbeth

The Role of Representations in Mathematical Reasoning Jessica Carter

Towards a Practice-based Philosophy of Logic: Formal Languages as a Case Study Catarina Dutilh Novaes

Part II: Case Studies

Learning from Euler. From Mathematical Practice to Mathematical Explanation Daniele Molinini

From Practice to New Concepts: Geometric Properties of Groups Irina Starikova

Pratique mathématique et lectures de Hegel, de Jean Cavaillès à William Lawvere Baptiste Mélès

La physique dans la recherche en mathématiques constructives Vincent Ardourel

Philosophia Scientiæ, 16-1 | 2012 2

Préface

Léna Soler

1 The point of departure of this volume was a conference, "From Practice to Results in Logic and Mathematics", organized by the PratiScienS group in June 2010 in Nancy. The present volume does not contain, properly speaking, the proceedings of this conference. Indeed, the book is the result of a separate call for papers, and there is no exact coincidence between the papers it includes and the contributions to the initial conference. Nevertheless, the conference and this volume were initiated in the same spirit and with the same motivations. These are the spirit and motivations of the PratiScienS project, so by way of preface I would like to say a few words about them.

2 PratiScienS stands for 'Rethinking Sciences from the Standpoint of Scientific Practices'. The project started in 2007 and is pursued by a small interdisciplinary group of France-based historians, philosophers and sociologists of science (where 'science' must be understood here in a sense which includes mathematics and logic in addition to the empirical sciences). The general idea is to achieve a fine-grained characterization of the actual practices of scientists, and to investigate philosophically significant issues about these practices (such as the issue of robustness, that is, how something acquires the status of a robust result in the empirical and the formal sciences,¹ or the issue of the contingency/inevitability of robust scientific achievements).

3 The investigation of these issues requires a methodological reflection on the means and difficulties that a study of scientific practices implies, and the development of exploratory tools suitable for each scientific field. It also requires a clarification of what it means to study a scientific field from the standpoint of practices, including a clarification of what a practice is, of how to conceptualize a scientific practice, of how to individuate (that is, identify) this or that scientific practice as one unitary practice, and so on. Quite astonishingly, these questions are rarely addressed, despite the fact that even a superficial look at the existing studies conducted under the heading of 'practices' reveals heterogeneous and possibly conflicting underlying answers to them.

4 The exploration of such questions involves the specification of the kinds of studies that are commonly contrasted with practice-based inquiries into science (be it with respect to their aims, their scope, their methods or the kind of accounts they produce). When we scrutinize the way this contrast appears in the literature, we meet important distinctions such as: (a) practices as processes, dynamical actions and procedures, contrasted with the propositional products of these practices (theories, experimental facts, results of a mathematical theorem, etc.); (b) ongoing, day-to-day, real-time actual science (science as it is really practiced, science in action) contrasted with science as it is a posteriori reconstructed by practitioners, notably in their publications; (c) real science opposed to idealized accounts provided by philosophers, most of the time in a derogatory sense of the term 'idealized', that is, in a sense implying the allegation of having produced a one-sided and truncated, if not completely deceitful, account of what practitioners actually do and what the science under scrutiny really is.

5 Further work is needed to make progress on these questions. This is already the case for practice-based studies of the empirical sciences, but it is all the more necessary when the target is the formal sciences, because after the 1980s, in the first decades of the so-called 'practice turn', practice-based investigations of science focused primarily, if not exclusively, on the empirical sciences (and first of all on experimental fields, exemplified prototypically by experimental physics through laboratory studies of physics in action). Mathematics has not often been the target of such inquiries, let alone logic. Indeed, mainstream philosophy of mathematics and logic has remained mostly focused on foundational issues, probably for reasons that have to do with the common idea of mathematics as the domain of necessary truths established once and for all through deductive, apodictic proofs.

6 That said, in recent decades a growing interest in mathematical and logical practices has become more and more tangible. Practice-based approaches to mathematics have been valued as fruitful by more and more scholars, and more and more work has been done under the banner of mathematical practice. A similar, although delayed and more timid, movement can be noticed in the case of logic. So it seems that we are in the midst of an important methodological shift in the philosophical investigation of mathematics and logic.

7 This methodological shift is very likely to go along with a renewal of current philosophical conceptions of mathematics and logic: in the same manner as practice-based studies of the natural sciences have transformed the image of physics and other experimental sciences, practice-based studies of mathematics and logic should enrich and modify our ideas about mathematics and logic. To contribute to this shift was the main motivation for the organization of the 2010 conference. The present volume, I hope, will help to get a better sense of the central issues, exploratory methods and dominant trends in the practice-oriented studies of mathematics and logic today.

Acknowledgements

8 I would like to express my gratitude to a number of people and institutions without which the 2010 conference could not have taken place and the volume could not have been published. First of all, thanks to all the members of the PratiScienS group, notably to the members of the organizing core group, Sandra Mols, Valeria Giardino and Amirouche Moktefi, who are also the scientific editors of this book along with Jean Paul Van Bendegem, one of those who contributed most to turning our interest towards mathematical practices and to helping us understand what mathematical practices are. I am also grateful to the other members of the Scientific Committee of the conference, Gerhard Heinzmann, Paolo Mancosu, Philippe Nabonnand, Jean-Michel Salanskis and Andrew Warwick. Finally, I want to thank the institutions that currently support the activities of the PratiScienS group, namely the Agence Nationale de la Recherche (ANR), the Maison des Sciences Humaines de Lorraine (MSH Lorraine), the Région Lorraine, the Henri Poincaré Archives (Laboratoire d'Histoire des Sciences et de Philosophie), and the University of Lorraine.

NOTES

1. Characterizing the Robustness of Science: After the Practice Turn in Philosophy of Science, L. Soler, E. Trizio, T. Nickles and W. Wimsatt (eds.), Springer, 2012.

AUTHOR

LÉNA SOLER LHSP - Archives H. Poincaré (UMR 7117), CNRS - Université de Lorraine, IUFM de Lorraine (France)

Introduction: From Practice to Results in Mathematics and Logic

Valeria Giardino, Amirouche Moktefi, Sandra Mols and Jean Paul Van Bendegem

Acknowledgments We want to thank the PratiScienS group and the MSH-Lorraine staff for their help with the organisation of the conference that inspired this volume. Thanks are also due to the authors, the reviewers and the proof-readers who helped us throughout the preparation of this volume. Lastly, we warmly thank Sandrine Avril and Manuel Rebuschi for their help and support.

1 Mathematical practice: a short overview

1 This volume is a collection of essays that discuss the relationships between the practices deployed by logicians and mathematicians, either as individuals or as members of research communities, and the results of their research. We are interested in exploring the concept of 'practices' in the formal sciences. Though common in the history, philosophy and sociology of science, this concept has surprisingly thus far been little reflected upon in logic and mathematics. Yet such a practice-based approach is crucial for a critical study of these fields. Indeed, in their daily work, mathematicians and logicians do deploy sets of practices, some intimately tied to mathematics and logic as such and others relating rather to material, theoretical, intellectual and social issues in their environments. For these reasons, this volume belongs to the interdisciplinary trend that has attracted more and more interest in recent years, and which is known as the philosophy of mathematical practice.

2 To give an all too short account of this recent approach, one might look for its roots in Imre Lakatos's 1976 seminal Proofs and Refutations [Lakatos 1976], a work that was in part inspired by Georg Polya's How to Solve It [Polya 1945], which discusses heuristics and problem-solving techniques in an educational setting. Later on, Philip Kitcher proposed in his book The Nature of Mathematical Knowledge [Kitcher 1985] a more or less formal model of how mathematics as an activity can be described. Several authors embraced that trend, as is shown in subsequent volumes that made connections between philosophy and history of mathematics, such as Thomas Tymoczko's New Directions in the Philosophy of Mathematics [Tymoczko 1986] and, later and more explicitly, Bill Aspray and Philip Kitcher's History and Philosophy of Modern Mathematics [Aspray & Kitcher 1988]. It must be said here that historians of mathematics have long paid attention to practices and produced significant studies with often implicit, but also sometimes explicit, claims about their philosophical relevance.

3 The above description may suggest a rather uniform approach to the study of mathematical practice, but such is not the case. In the introduction to The Philosophy of Mathematical Practice [Mancosu 2008], Paolo Mancosu identifies two main traditions. The first is the 'maverick' tradition, which remains close to the Lakatosian approach, while the second situates itself within the modern analytical tradition, namely the naturalizing programme that started with Willard Van Orman Quine, and where Penelope Maddy has played and still plays an important role.

4 This is not yet the end of the story, however. Independently of these developments in the philosophy of mathematics, but also partially inspired by Lakatos as well as Thomas Kuhn's The Structure of Scientific Revolutions [Kuhn 1962], some researchers developed a sociology of mathematics, where one of the major focuses was mathematical practice as a group or community phenomenon. Two works should be mentioned in this line: David Bloor's Knowledge and Social Imagery [Bloor 1976] and Sal Restivo's The Social Relations of Physics, Mysticism, and Mathematics [Restivo 1985]. In contrast with history and philosophy of mathematics, the sociological approach did not merge easily with the mentioned traditions, although some 'brave' attempts in this direction should be noted [Restivo, Van Bendegem, & Fischer 1993], [Rosental 2008], [Löwe & Müller 2010]. Plenty of reasons can be listed, but surely one of the central elements is the internal-external debate: does mathematics develop according to 'laws' or patterns that are internal to mathematics, or do external elements contribute?

5 One outcome of these discussions has been to draw attention to other, so far neglected, areas where mathematics is involved, prominently mathematics education and ethnomathematics. In 2002, a conference was organized in Brussels where the organizers, Jean Paul Van Bendegem and Bart Van Kerkhove, tried to realize their ambition of bringing representatives of all these disciplines together. This conference led to the 2007 Perspectives on Mathematical Practices, with the overambitious subtitle Bringing Together Philosophy of Mathematics, Sociology of Mathematics, and Mathematics Education [Van Kerkhove & Van Bendegem 2007]. Another important outcome of all these developments, overall, has been the confirmation of the rather heterogeneous character of this field, as can be seen in the recently founded Association for the Philosophy of Mathematical Practice (APMP) (see the website of the association at http://institucional.us.es/apmp/), wherein both of Mancosu's traditions are clearly present and at times happily interacting.

6 More generally, so as to draw a complete picture of the philosophy of mathematical practice, it has to be noted that all of the above would have to be looked at from a wider perspective, in order to see how these developments relate to the whole domain of the philosophy of mathematics, including its mainstream subdomain, the foundational studies of mathematics. In a sense this is a tension within the internal approach: even if one accepts that mathematics is subject to something like an internal development, it remains to be determined how this development is best characterized. Such a complete picture will not be drawn here but will need to be clarified in the future.

2 Themes and issues in the philosophy of mathematical practice

7 In the light of what has been said, what does the problem agenda of the philosophy of mathematical practice look like? The main focus will be on exploring the internal issues, while indicating along the way connections with external issues and themes. An elegant way to obtain some coherence in such an agenda is to look at the various levels where the practices are situated: macro-, meso- and micro-level, as a first rough classification.

8 At the macro-level, the discussion is about the global development of mathematics. One specific question that arises is whether or not 'revolutions' take place and, if so, of what kind. On this issue, the standard volume remains Donald Gillies's Revolutions in Mathematics [Gillies 1992]. Are there Kuhnian parallels to be drawn or does a 'philosophy of mathematical practice' approach also confirm the special status of mathematics vis-à-vis the sciences? As has happened in post-Kuhn philosophy of science, the sociological elements enter into the picture, as one asks why particular mathematical developments took place in their place and time. Rephrased, the issue is that of how well mathematics is or is not closed off from the rest of society. The relevance of the educational and ethnomathematical dimensions is obvious at this level.

9 At the meso-level, the concern is to deal with research programmes (traditions, styles, etc.). Are there such 'research programmes'? How do they guide research, motivate people, and so forth? The educational dimension is present here as well but on a more internal level: how is the next generation of mathematicians initiated into a particular research programme? A lot of work has been done at this level, and the features addressed include being deemed interesting, fruitful, explanatory, promising of solutions, or indicative of heuristics. These are all features that are usually neglected in foundational studies.

10 At the micro-level, the core theme is the nature of mathematical proof. Besides the notion of proof itself and its inner difficulties, there is a wealth of related themes to explore. A first group includes issues related to presentation (concerning the (non-)formalized character of a proof, its (in-)formality, the use of diagrams and their status) and accessibility (who is able to grasp the proof?) and, perhaps most importantly, the question of why a proof is convincing. A second group of themes concerns what 'accompanies' proofs. Mathematicians perform 'experiments' (but are such quotation marks necessary, and why or why not?); they crunch numbers in the hope of finding interesting patterns, performing inductive reasoning; they support their proofs by arguments, leading to the questions whether proofs themselves can be seen as a particular type of argument, and whether, fascinatingly, a rhetoric of mathematical texts is possible. Quite interestingly, at this micro-level, fields and disciplines other than history, philosophy and sociology also contribute to the picture. Such is the case with cognitive psychology, crucial for understanding reasoning processes as they take place in the human (social?) brain. Such is the case too with evolutionary biology and psychology, necessary to establish the roots and origins of mankind's 'number sense', if any. It is to be noted too that, at this micro-level, the impact of the particular tradition, whether 'maverick' or 'naturalized', is not overwhelmingly present, making a close collaboration perfectly possible.

11 More generally, one thing is certain: there is no shortage of themes to explore and problems to be solved when looking at the philosophy of mathematics from the standpoint of practices. The list of subjects that we presented above is not exhaustive, and additional questions (the use of computers in mathematics certainly being one of them) are expected to be addressed by the (hopefully) growing and thriving community of philosophers interested in this practice-based approach.

3 The essays in this volume

12 The essays in this volume are contributions to the philosophy of mathematical practice. Two kinds of papers are to be found. The first set of essays is more conceptual and theoretical, with authors discussing the notion of 'practice' and its connections with concepts, understanding and representations. The second set of essays contains rather detailed analyses and discussions of case studies.

13 In the first group of essays, first, as a general discussion, Mark Smith defends a conception of mathematical practice and mathematical subject matter that puts together inferential pluralism and a form of concept-realism. The second essay by Danielle Macbeth is a close examination of the relation between the practice of proving and understanding. Fully formalized mathematical proofs usually do not advance our mathematical understanding: does this mean that form does not correspond to content? In her view, to avoid such an opposition, it is necessary to develop a different notion of (formal) proof, and eighteenth-century algebraic proofs can provide an example of fully rigorous and fully content-filled mathematical proofs. The third essay, by Jessica Carter, is an analysis of the role of representations in mathematical proofs, illustrated by an example from contemporary mathematical practice where the value of an expression is found by gradually breaking it down into simpler parts. More generally, and referring to the Peircean terminology, she also discusses the role of icons and indices in such a practice. Finally, the fourth essay by Catarina Dutilh Novaes delineates a practice-based philosophy of logic and illustrates it by focusing on the role played by formal languages in logic. In her view, formal languages have an operative role; paper-and-pencil and hands-on technology trigger specific cognitive processes, which psychology discloses.

14 In the second group of essays, the first three are analyses of case studies exploring specific mathematical practices. First, Daniele Molinini focuses on the explanatory character of Euler's proof of his Theorem, which, according to the author, is not recognized by Steiner's approach to mathematical explanation. Secondly, Irina Starikova shows how the representation of groups through graphs leads to a new conceptual perspective on the geometry of groups. Thirdly, Baptiste Mélès defends the thesis that the concepts of paradigm and thematization used by Jean Cavaillès find an illustration and a formalization in category theory and a precedent in the Hegelian dialectic. In the fourth and last essay, Vincent Ardourel explores possible interactions between mathematics and physics, by focusing on instances of practices of mathematical research that consist in constructively reformulating physical theories.

15 All in all, the essays in this volume give an idea of how many different routes can be taken by the investigation of mathematical and logical practices. Although one might wish the opposite, the general impression is that a far-reaching integration of the internal and external viewpoints is unlikely to occur in the very near future. Nevertheless, and following the recent creation of the APMP, one may hope that, at the very least, historians and philosophers of mathematics and logic have found one another, which is not to say that the heterogeneity of this field of research is no longer an issue to be dealt with.

BIBLIOGRAPHY

ASPRAY, WILLIAM & KITCHER, PHILIP (EDS.) 1988 History and Philosophy of Modern Mathematics, Minneapolis: University of Minnesota Press.

BLOOR, DAVID 1976 Knowledge and Social Imagery, London: Routledge & K. Paul.

GILLIES, DONALD (ED.) 1992 Revolutions in Mathematics, Oxford: Clarendon Press.

KITCHER, PHILIP 1985 The Nature of Mathematical Knowledge, New York: Oxford University Press.

KUHN, THOMAS 1962 The Structure of Scientific Revolutions, Chicago: University of Chicago Press.

LAKATOS, IMRE 1976 Proofs and Refutations, Cambridge: Cambridge University Press.

LÖWE, BENEDIKT & MÜLLER, THOMAS (EDS.) 2010 Philosophy of Mathematics: Sociological Aspects and Mathematical Practice, London: College Publications.

MANCOSU, PAOLO (ED.) 2008 The Philosophy of Mathematical Practice, Oxford, New York: Oxford University Press.

POLYA, GEORG 1945 How to Solve It: A New Aspect of Mathematical Method, Princeton: Princeton University Press.

RESTIVO, SAL 1985 The Social Relations of Physics, Mysticism, and Mathematics, Dordrecht, Boston: D. Reidel.

RESTIVO, SAL, VAN BENDEGEM, JEAN PAUL & FISCHER, ROLAND (EDS.) 1993 Math Worlds: Philosophical and Social Studies of Mathematics and Mathematics Education, Albany: SUNY Press.

ROSENTAL, CLAUDE 2008 Weaving Self-Evidence: A sociology of logic, Princeton: Princeton University Press.

TYMOCZKO, THOMAS (ED.) 1986 New Directions in the Philosophy of Mathematics, Boston: Birkhäuser.

VAN KERKHOVE, BART & VAN BENDEGEM, JEAN PAUL (EDS.) 2007 Perspectives on Mathematical Practices: Bringing Together Philosophy of Mathematics, Sociology of Mathematics, and Mathematics Education, Dordrecht: Springer.

AUTHORS

VALERIA GIARDINO Institut Jean Nicod (CNRS EHESS-ENS), Paris (France) & Universidad de Sevilla (Spain)

AMIROUCHE MOKTEFI IRIST (EA 3424), Université de Strasbourg & LHSP - Archives H. Poincaré (UMR 7117), CNRS - Université de Lorraine (France)

SANDRA MOLS LHSP - Archives H. Poincaré (UMR 7117), CNRS - Université de Lorraine (France)

JEAN PAUL VAN BENDEGEM CLWF, Center for Logic and Philosophy of Science, Vrije Universiteit Brussel (Belgium)

Part I: Conceptual and Theoretical Issues

Practice, Constraint, and Mathematical Concepts

Mark C. R. Smith

Introduction

1 A good deal of what goes on in our ordinary lives and in our reflective theoretical moments is shaped by an array of practices of various kinds (conceptual, cultural, technological, and others), and philosophy of late has become increasingly sensitive to these, in particular to their role in shaping the substructure of normative thought in contexts of both practical and theoretical reasoning. How we reason emerges constitutively out of what we're up to, interested in, find salient, or simply need in virtue of being human. In the specific case of mathematical reasoning, we might regard the very existence of multiple, mutually incompatible derivational systems as evidence for a kind of 'practice pluralism', the thesis that different forms of conceptual life, so to speak, are bound to yield different results according to their own internal constraints of reasoning, absent a system-external source of norms of mathematical thought. The pluralist idea, loosely stated, is that in the context of pure mathematics, it is practice which determines the content of at least most mathematical concepts. But a large part of my argument here will be that this view is false, and I will both motivate the negative or critical aspect of my view, and suggest an alternative in the light of a novel argument.

2 Mathematical practice has some features which are distinctive. In general a mature practice or cluster of practices can be individuated thanks to at least these three aspects.¹ The first is that a practice is individuated by the network of concepts in which that practice finds its place as something meaningful and recognizable as such among a community. Call this the interpretive structure into which a practice must fit. To illustrate: since I have no interpretive structure for telling the difference between, say, people really playing cricket and people going through some joking simulacrum of cricket, I have no serious way of telling whether or not the practice of cricket is being embodied or carried out by the group I'm watching. My lacking an interpretive structure, such as the application criteria for the concepts of run and wicket and so on, not only makes me an outsider with respect to the practice, but makes me incapable of distinguishing it from an imitation or a close, but distinct, relative. Secondly, a mature practice is something which must affect the mental states of those who are undertaking it. In order for a practice to be recognizable as cricket or waltzing or doing group theory, it must be part of the participants' self-conception that they are engaged in that very activity: there must be first-person acknowledgement available. And thirdly, a practice must be 'something to which we advert in describing an activity in certain terms', as Thompson [Thompson 2008, 199] puts it: that is, there must be second- and third-personal acknowledgement available (for instance, of an activity as counting as waltzing) in addition to the first-personal self-conception.

3 The multiple branches of mathematics share these characteristics just as much as do the practices of making a soufflé, presiding over a trial, or investigating the anatomical structure of the fins of certain fish. In order to see what is most intriguing about mathematics understood as a collection of practices, though, we can make a further distinction among practices which are non-truth-seeking (such as making a soufflé), and practices like anatomical investigation which are truth-seeking with respect to a given subject-matter. Mathematics both pure and applied is a practice of the latter kind. But it has the further remarkable feature that, while it is a cluster of mature truth-seeking practices, its content is (generally) not available for empirical inspection and its truths are not (in the general case) open to confirmation in the face of experience.

4 And so my interest in what follows is in asking what the results of mathematical practices, so characterized, can tell us about the distinctive target-domain of (some portion of) mathematics. My specific thesis will be that mathematical results—at least in some significant range of cases—cannot have their content fixed by the practices that engender them, but rather that the direction of fit is closer to the reverse: something anterior to the practice is what ratifies the practice itself. I will, therefore, be arguing for a kind of realism, one which, as I shall explain, concerns concepts rather than objects, and which I will try to make persuasive in light of a specific, familiar, and quite interesting example.

5 The rest of the paper is organized as follows. In section 2, I will broach the contrast between practice monism (as embodied by Michael Dummett, on whose view there is one single correct canon of inferential procedures) and practice pluralism (as represented by Hartry Field), and argue in favour of retaining some insights from each of their quite different views. Part of my argument here will be that while practice pluralism is closest to the truth, pluralism itself offers a series of examples of how practices adjust to target-domains and have their content fixed by those domains, rather than the reverse. But what, precisely, is involved in speaking of a mathematical 'target-domain'? This is the question explored throughout section 3. In 3.1, I make the preliminary case that realism of one sort or another has to be endorsed, but, contra the indispensability argument discussed in 3.2, it cannot be a realism vis-à-vis objects. In 3.3, I flesh out a fuller picture of my rival argument—the argument from constraint—to motivate the beginnings of a more satisfactory, practice-motivated realism. Finally, in section 4, I draw some conclusions and make explicit the connection between practices, constraint, and the insights retained from the second section.

1 The practice of inference: harmony and pluralism

6 Alternative logics, embodying alternative models of what inference can or should be, are designed in such a way as to preserve different features for varying purposes. While classical logic has truth-preserving inference in view, combined (perhaps necessarily) with a realist conception of semantic values for mathematical propositions, intuitionistic strictures on logic are meant to preserve epistemic warrant in inference, while a paraconsistent system which denies 'explosion' or ex falso quodlibet might conceivably be suitable for an account of inferences in reasoning about the contents and structure of fictional worlds (and there may be other uses of paraconsistency as well).
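To make concrete what a paraconsistent system gives up, the principle of 'explosion' can be stated formally. The following Lean 4 sketch is an illustration added here, not part of the author's argument, and the name `explosion` is hypothetical. It states ex falso quodlibet and derives it in Lean's own logic, which is not paraconsistent: from p together with ¬p, an arbitrary proposition q follows.

```lean
-- Hypothetical name; an illustrative sketch only.
-- Ex falso quodlibet: from p and ¬p, any proposition q follows.
theorem explosion (p q : Prop) (hp : p) (hnp : ¬p) : q :=
  absurd hp hnp
```

A paraconsistent system is one in which no such derivation is available, so that a contradiction about, say, a fictional world does not license every conclusion about it.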

7 But standing against this first blush of a pluralist vision of inferential practices, we find Michael Dummett (see especially [Dummett 1978]). The elements of his argument are familiar but bear repeating, for they culminate in an insight about harmony which I think is worth retaining. Begin with an everyday sort of example about the practice of inference. Imagine that we are speaking of someone who is now dead, and we wonder if she was courageous. Let's say that she was never in danger, never faced a situation in which she might be called upon to display courage or show herself to be a coward. Is it the case that there is some truth 'out there', some verification-transcendent truth, to the proposition that she was courageous? Surely not: surely it is the case that 'She was courageous' is neither true nor false. It is not that she really was one way or the other and we'll never know which: the issue is that it seems that this proposition does not have a truth value at all. In Dummett's view, this familiar sort of observation is an indication that we should not tether meaning to bivalent truth conditions, because the sentence 'She was courageous' is both meaningful and deprived of a truth value. We can say: (1) It is not the case that she was not courageous. But that does not entail: (2) She was courageous.

8 What we know is what would have had to have been the case in order for us to be entitled to assert 'She is courageous' or 'She showed courage on that occasion', but now that she is dead we should not suppose that there is some truth, or some falsehood, about her that just happens to be forever outside our epistemic reach. Rather, we know what it would have been like to be entitled to the assertion, and so it is meaningful, but that does not license the inference from (1) to (2). That inference would be licensed, by double negation elimination, on a truth-value-realist's semantic picture. But there is certainly something deep and important about Dummett's contention that the inference is suspicious. Surely, she was neither the one nor the other. But we speakers of English are comfortable with explaining what kinds of (counterfactual) circumstances would have entitled us to say that she was either courageous or cowardly, in a certain situation. I think that the resonance of the argument comes from an expectation of harmony, a property we expect to find among our inferential practices and the structures they invoke. Understanding mathematical language and thought, with their putative entities and structures and inferences, involves just the same machinery as any other portion of thought. These involve the ability to fit 'molecular' semantic pieces into a wider network of features, such as the evidence on which some

proposition rests and the consequences that flow from that proposition. This is what the harmony at issue involves.

9 On this view, understanding any proposition p involves: (i) mastering at least some of its appropriate contexts of use in the light of what other speakers recognize as appropriate, (ii) the ability to draw out some of what follows from it, (iii) the ability to transpose it into another appropriate context (e.g. when a child hears a metaphor for the first time, and can display her understanding or failure to understand that metaphor in her subsequent transposition of its use to fresh contexts which are either appropriate or inappropriate), and (iv) being able to adduce warrant for it. I know nothing about industrial chemistry, but I know that gold dissolves in cyanide (and indeed that 'Gold dissolves in cyanide' is T ↔ gold dissolves in cyanide). But if I go to a meeting of industrial chemists, they will soon see that I don't understand this proposition in any real way because I can't adduce evidence for it, I can't draw any consequences from it, and I can't transpose it to any other relevantly similar circumstance. I have no network into which to fit p: it has no harmonious position in my cognitive equipment. In fact I have no interpretive structure which would afford me the chance of being a participant in the practices that people who really understand the proposition are capable of. Mathematical propositions, like any other cluster of propositions, are located in a network of evidence (warrants) and consequences, which I need to master in order to master the relevant portion of the language. I need substantially more than truth conditions for just p alone: I also need its fit into a network of relevant collective practices of use, giving and taking evidence, and consequence-drawing. The significant insight to retain from Dummett is that engagement in mathematics (or of any other portion of thought and language) is a skillful deployment of practical abilities that cannot be represented as piecemeal, proposition-by-proposition agglomeration.

10 But Dummett also defends a further thesis: that the right logic should embody the single correct semantics, and so that we cannot ratify (for example) DN-elimination in the logic, by the argument from the reflection on courage given above. He defends the view that there is a unique correct logical canon. We might style him a 'practice monist' in this respect. There is, however, an apparent tension between the monistic thesis concerning inference and the harmony thesis. If a practice constitutively demands a harmonious interpretive structure, and given that classical mathematics grounded in classical logic is one of the most robust interpretive structures there are, then the reformist project of monism conflicts directly with mature, robust and harmonious practices. Since the proof-strategy of DN-elimination is, furthermore, so pervasively operative over mathematical concepts that are very robust indeed, and since it allows for so many transitions among robust structures, there is very good reason to ratify it. Because such inferences are authorized in number theory, for example, the transposition of a (perfectly legitimate) constraint from inferences about one target-domain (such as states of character like courage) onto a quite different target-domain is unwarranted, and jars with the expectation of harmony.

11 A rather different position vis-à-vis inferential practice in mathematics is defended by Hartry Field (see, for instance, [Field 1998]), in whom we find a strong 'practice pluralism'. His famous slogan is that mathematics need not be true to be good. Many ordinarily take truth to be a non-trivial property of the class of uniquely warranted

mathematical propositions by the strictures of some monolithic canon. Not so for Field. The reasoning is as follows. There are many canons, many alternative logics, and each one will deliver some different inferences. A logical canon is like a licensing bureau, and different wickets in the bureau will stamp your license in different ways to ratify some inferences and prohibit others. The classical bureau will license the inference from (1) to (2), but the intuitionistic bureau won't allow it to go through. The classical bureau will license you from (p and ~ p) to q, but the window with Graham Priest behind it (see [Priest 2006]) will let p and ~ p both be true and not license the inference to q, as he argues that we should not ratify ex falso (or ex contradictione) quodlibet in the general case. Paraconsistent logic is comfortable with inconsistent arithmetic (e.g. [Mortensen 1995], [Mortensen 2009]), while to classical logic, inconsistency is fatal. (Frege learned this the hard way.) But now, we might ask in pluralist spirit, on what ground would you stand from which you could further say 'Classical logic is true, and a paraconsistent system is false', or vice versa? There is no such ground. Correctness, in the case of mathematics, must just be consistency with the principles of a canon, some canon or other. The only grounds for choice between canons, for choice as to which window in the license bureau we walk up to, are either aesthetic or pragmatic. Choosing between logics is not a matter of logical choice obedient to some prior standard, nor a matter of discerning which is true. It might be a matter of choosing what's more beautiful, or picking what better serves the interests of our empirical science: in essence, the choice of interpretive structure, the selection of an appropriate practice in which to engage, is one which is guided by the aims at hand, guided by the target-domain. But how are we to characterize that domain, and the results about it?
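Put schematically, and only as a notational sketch, the disagreement among the windows of the bureau concerns which of two rules gets stamped. The labels DNE and ECQ and the subscripts CL, IL and LP (for classical logic, intuitionistic logic and Priest's paraconsistent logic) are conventions assumed here for illustration; they are not notation used in the text above.

```latex
% Requires amssymb for \nvdash.
% (1) "It is not the case that she was not courageous" has the form \neg\neg p;
% (2) "She was courageous" has the form p.
\[
\frac{\neg\neg p}{p}\;(\mathrm{DNE})
\qquad\qquad
\frac{p \wedge \neg p}{q}\;(\mathrm{ECQ})
\]
% Classical logic licenses both rules; intuitionistic logic withholds DNE
% but keeps ECQ; the paraconsistent window withholds ECQ.
\[
\neg\neg p \vdash_{\mathrm{CL}} p, \qquad
\neg\neg p \nvdash_{\mathrm{IL}} p, \qquad
p \wedge \neg p \vdash_{\mathrm{IL}} q, \qquad
p \wedge \neg p \nvdash_{\mathrm{LP}} q
\]
```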

2 The argument from constraint

12 The indispensability argument for mathematical realism, due in large part to Quine and Putnam, is widely held to be the most promising candidate for establishing mathematical realism concerning the mathematical target-domain, if such realism can be established at all. It's also an argument that begins with our practices as they weave together mathematics and natural science into a conceptual whole. Yet it suffers from some significant problems. In the following subsections I will begin by offering a different, and novel, argument for realism concerning mathematical concepts, contrast it with the indispensability argument, and reply to some potential objections while fleshing out the argument further.

2.1 The argument in brief

13 It goes like this. One of three familiar ancient Greek construction problems in geometry is the problem of squaring the circle (the two others are trisecting the angle and doubling the cube). The problem is this:

14 (SC) Using only an unmarked straight-edge and compasses, and given a circle of area A, construct a square having that same area A.

15 Greek geometers struggled with this problem for a long time, as did many mathematicians until the 19th century (mathematicians were similarly exercised by the other problems too). It turns out that the construction demanded in (SC) is, in fact, impossible, and we can prove it.

16 Suppose a circle of radius $r$. The area $A_C$ of a circle is given by the formula $A_C = \pi r^2$. Now, let $r = 1$, to give us an arbitrary unit length. The area $A_S$ of a square of side $a$ is, of course, $a^2$. This means that to construct $A_S = A_C$ we would need to construct a square with side $a = \sqrt{\pi}$.

17 This is what's not possible. $\pi$ is a transcendental number: this means that it does not satisfy any non-trivial polynomial equation with rational coefficients (i.e. one of the form $a_n x^n + a_{n-1} x^{n-1} + \dots + a_0 = 0$). Now, polynomial equations operate exclusively with the functions of addition, subtraction, multiplication and exponentiation: these are the operations that can be geometrically constructed, that is, carried out in principle with the tools set in (SC). But $\pi$ does not satisfy any procedure that involves just these operations: it is transcendental (the proof was given by Lindemann in 1882 [Lindemann 1882]). The upshot is that there is no constructive geometric 'performance', so to speak, by which we could accomplish the task set in (SC). The transcendence of $\pi$ means that we cannot turn one structure-type into another by a certain pathway. So a purely mathematical property furnishes an explanation of why something is impossible, an explanation of a constraint on the possibilities that are within the ambit of our practice.
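The steps of the impossibility argument can be laid out as a short derivation. This is a sketch of the standard textbook route rather than a reconstruction of the author's own reasoning: the field-theoretic framing (constructible lengths lying in towers of quadratic extensions of $\mathbb{Q}$) is an assumption brought in here for precision, and the only result taken from the text is the transcendence of $\pi$ [Lindemann 1882].

```latex
% Step 1: what (SC) demands, with the unit radius r = 1.
\begin{align*}
A_C &= \pi r^2, \quad r = 1 \;\Longrightarrow\; A_C = \pi, \\
A_S &= a^2 = A_C \;\Longrightarrow\; a = \sqrt{\pi}.
\end{align*}
% Step 2: every length constructible with straight-edge and compasses is
% reached by finitely many quadratic extensions of the rationals.
\[
a \text{ constructible}
\;\Longrightarrow\; [\mathbb{Q}(a) : \mathbb{Q}] = 2^k \text{ for some } k \ge 0
\;\Longrightarrow\; a \text{ algebraic over } \mathbb{Q}.
\]
% Step 3: algebraic numbers are closed under multiplication.
\[
\sqrt{\pi} \text{ algebraic}
\;\Longrightarrow\; \pi = \sqrt{\pi}\cdot\sqrt{\pi} \text{ algebraic},
\]
% which contradicts the transcendence of \pi. Hence a = \sqrt{\pi} is not
% constructible, and the task set in (SC) cannot be carried out.
```

On this way of putting it, the constraint is carried entirely by the contrapositive of Step 2 together with Step 3: nothing about any particular attempted construction needs to be examined.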

18 In order better to see the force of the argument from constraint over the indispensability argument ('IA', for short), I shall consider two points of contrast. The first has to do with IA's ultimate inability to derive its target conclusion on the basis of premises that require mathematical objects' explanatory role to match the role of empirical postulates. The second point raises the question of how one might argue for the actual satisfaction of mathematical truth conditions. I raise each of these in turn.

2.2 A mismatch of roles

19 The general form of the indispensability argument for the existence of specifically mathematical abstract objects invokes their indispensability for empirical science. There is in fact so much mathematics included in the interpretive structure of natural science that the structure would crumble, and natural scientific practices become unintelligible, if the mathematics were removed. I borrow the following formulation of the argument's principal thesis from Colyvan: 'If apparent reference to some entity (or class of entities) ξ is indispensable to our best scientific theories, then we ought to believe in the existence of ξ' [Colyvan 2001, 7].

20 The idea might be cast like this: apparent reference, combined with indispensability (in some sense to be made more precise) in the formulation of a body of theory which we know on independent grounds to be true, secures reference. Commitment to mathematical objects, by these lights, is the very same sort of thing as commitment to quarks and other unobservable entities. If 'quark' appears to refer, and we cannot achieve the same scientific success without reference to quarks as we can with such reference, then it is rational to believe that 'quark' refers, even if it is only a provisional belief. In a similar vein, if 'the real line' appears to refer, and we cannot achieve the same degree of explanatory success in physics without the real line as we can with it, then it is rational to believe that the expression refers. IA claims that mathematical objects fill the same explanatory role as physical unobservables.

21 The argument's fulcrum is the extent of the empirical support we can bring on behalf of this thesis: that mathematics cannot, without loss of explanatory content and disintegration of the interpretive structure, be paraphrased out of empirical science. An IA of this kind is, then, an empirical hypothesis, to be tested like any other against experience. In that vein, in a piece on behalf of mathematical objects' indispensability in natural science, Alan Baker [Baker 2005] raises the very interesting example of North American periodical cicadas and the resources that need to be brought to bear on the explanation of their life-cycle. Briefly, the case is this: in each of three species of cicada of the Magicicada genus, 'the nymphal stage remains in the soil for a lengthy period, then the adult cicada emerges after either 13 years or 17 years depending on the geographical area' [Baker 2005, 229]. That the life-cycle should fall into a prime-numbered-year sequence is curious, and wants explaining. Baker canvasses two rather different biological explanations of why this is an evolutionarily advantageous sequence, both of which, despite their differences, invoke the number-theoretic fact that prime periods minimize intersection compared to non-prime periods. (Exactly what they minimize intersection with is open to interpretation, but Baker's discussion of the phenomenon is uniform across both of the interpretive versions he cites: see the references in his article.) In any event, Baker writes: '[T]here are genuine mathematical explanations of physical phenomena, and the explanation of the prime cycle lengths of periodical cicadas using number theory is one example of such. If this is right, then applying inference to the best explanation in the cicada example yields the conclusion that numbers exist' [Baker 2005, 236]. Nevertheless, he continues, 'Whatever cases of putative mathematical explanation the platonist might come up with, there will always be some leeway for nominalist objections since the role of mathematical posits is unlikely ever to exactly match the role of concrete unobservables, such as electrons' [Baker 2005, 236].
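The number-theoretic fact at work in the cicada case can be illustrated with a toy computation. The sketch below is not Baker's model, nor either of the biological models he cites: the candidate life-cycle lengths (12 to 18 years), the range of competing cycle lengths (2 to 9 years), and all function and variable names are assumptions chosen purely for illustration. It scores each candidate period by the average number of years between coincidences with the competing cycles, using the least common multiple.

```python
from math import gcd


def lcm(a: int, b: int) -> int:
    """Least common multiple: years until cycles of length a and b next coincide."""
    return a * b // gcd(a, b)


def mean_years_between_coincidences(period: int, other_periods) -> float:
    """Average, over the competing cycles, of the gap between coincidences."""
    others = list(other_periods)
    return sum(lcm(period, q) for q in others) / len(others)


# Hypothetical ranges, assumed for illustration only.
OTHER_PERIODS = range(2, 10)       # competing cycles of 2..9 years
CANDIDATE_PERIODS = range(12, 19)  # candidate cicada life-cycles of 12..18 years

scores = {
    p: mean_years_between_coincidences(p, OTHER_PERIODS)
    for p in CANDIDATE_PERIODS
}

# Candidates ordered from fewest to most frequent coincidences.
ranking = sorted(scores, key=scores.get, reverse=True)

if __name__ == "__main__":
    for p in ranking:
        print(f"period {p:2d}: mean gap {scores[p]:.1f} years")
```

Under these assumptions the two prime candidates, 17 and 13, come out ahead of every composite period in the range, since a prime period shares no factor with any shorter cycle and so coincides with it as rarely as possible.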

22 I think it's helpful to cast Baker's argument like this: since explanation is achieved by invoking a property that is instantiated by nothing at all unless it's instantiated by an abstract object, the relevant abstract object is indispensable to the explanation. A further feature of the empirical IA is that it sees its criterion of success as the ability to locate cases in which the explanation of contingent physical facts ineliminably involves reference to abstract objects, whose existence is necessary if they exist at all. This opens up an opportunity for the nominalist to drive a chisel into the interstitial explanatory space between concrete and abstract particulars, and then hammer off the abstracta, perhaps by invoking an instrumentalist or fictionalist strategy, for instance in the manner of Hartry Field [Field 1989], for whom the mathematics lubricates inferences that it would otherwise be too clumsy to nominalize fully, but does no more duty than that. Now, my claim is that while examples like Baker's cicadas are compelling up to a point, we are ultimately left with a mismatch of explanatory role between mathematical and physical entities, a mismatch which opponents of IA will exploit by pointing out that IA does not decide the issue. (Note that the idea here is not so much that IA is false or fallacious, but that it leaves the debate just where it started.)

23 But we can better make the mathematical realist's case by reorienting the focus of the argument from a practice-centred inference to the best explanation (in which abstract objects emerge as posits on an ontological and epistemic par with physical unobservables) to an equally practice-centred inference that will allow us to derive the explanation of a properly physical and metaphysical impossibility. Rather than looking to what mathematical entities (of some kind to be made more precise) rule in, the argument from constraint appeals to what they rule out. Mathematical entities don't need to match the role of concrete unobservables: they do a different kind of duty, but explanatory duty nonetheless. The thrust of the argument from constraint is to reveal rational grounds to endorse the existence of mathematical entities of some kind, but grounds which, rather than appealing to an empirical hypothesis about the functional role of such entities qua postulates, point instead to those entities' status as necessary in circumscribing the boundaries of the possible. These boundaries are laid bare by attending to what mathematical inference reveals about constraints which are both intra-theoretic to the interpretive structure, and have purchase on the range of what can actually be done.

2.3 The argument more fully

24 Mark Balaguer expresses this perfectly sensible demand: 'platonists need to argue that the platonistic truth conditions of our mathematical utterances are actually satisfied' [Balaguer 2009, 133].2 The realist needs to adduce some reasons for thinking that the conditions are satisfied by the right sort of thing. What might these be? Here we can make an additional point on behalf of the argument from constraint over the IA by way of an objection that teases out some important points of contrast. The following objection, analogous to Balaguer's question, might be raised against my claims for the import of constraint. If I host a party, and I run out of beer at a quarter past midnight, my not being able to serve Emily a beer at twenty past midnight is explained by my lack of beer, or by the dearth of beer in the fridge, or something along those lines. But this explanation is not an explanation that commits me to such things as dearths or lacks, as though they were objects. The proper explanation of my inability to serve enough drinks will make no mention of such an entity as a dearth of beer. Why suppose, then, that the explanatory resources for the impossibility of (SC) will be any different from the resources needed for the explanation of why Emily can't have a beer?

25 My line of reply is this. There appear to be two fundamentally different kinds of constraint invoked in these two scenarios, and two different corresponding kinds of explanation. My inability to serve enough drinks is a contingent constraint, and that constraint is explained by causal features of the situation. The relevant causal explanation will invoke contingent relations between events. The impossibility of carrying out the (SC) construction, on the other hand, is not an event, since it does not unfold in space-time, and is not open to causal explanation since the constraint expresses a metaphysical impossibility. There is no possible world in which the (SC) construction can be carried out, whereas there is obviously a possible world in which I have more beer than I have at my party in this one. We might call the mathematical explanation of the constraint an essential explanation: one that is necessary, and thus non-causal. Part of the argument from constraint's bite is that it points to a necessary property, instantiated in any metaphysically possible world whatever. This particular necessity points in turn not just to theoretical indispensability for natural science, but to metaphysical indispensability in understanding a portion of the structure of possible worlds, this one among them. The argument from constraint, then, is an argument from the requirements of essential explanation to metaphysical indispensability in the relevant interpretive structure, in which the relata of the explanation are not contingent events.

26 Now, given the phenomenon of constraint, what can the argument from that phenomenon teach us about the target-domain of the relevant portion of mathematics,

such as to explain how we are able to move from practice to results concerning that domain? In other words, what underwrites the transition from practice to results in contexts such as those of essential explanation? Several candidates suggest themselves, but I will argue that only one among them is able to support the specific modal and explanatory features that must be accommodated by a suitable account.

27 Objects, either concrete or abstract, can be rejected as candidates for populating the target-domain. On the one hand, concrete objects are ipso facto contingent, and so any properties which supervene upon them, or are emergent from them, will be subject at most to physical rather than to logical or metaphysical necessity. Concrete particulars do not provide support for inferences across all possible worlds. Nor can concrete particulars, supposing them to be the relevant target-domain, by themselves support inferences made on the basis of mathematical induction. For instance, the unique prime factorization theorem over the integers would be falsified for any number greater than whatever number of particulars there happens to be. On the other hand, abstract objects, by the well-known arguments from Benacerraf, are equally untenable as candidates, given that a determinate characterization of any number as some particular set, or other kind of individual, is unavailable in principle [Benacerraf 1965]. Multiple and structurally indiscriminable characterizations are always at hand. But this last observation suggests a further possible range of candidates: concepts, rather than objects. For concepts can of course be multiply instantiated.3 If this is the case, then what is the character of the concepts at issue in the specifically mathematical target-domain of the practices involved in essential explanation? On one proposal, such concepts are to be understood as either akin to instrumental idealizations which are used in the natural sciences for modelling real systems—ideal gasses, for example, or treating fluids as continuous rather than discrete substances in fluid dynamics—or else as akin to the concepts used in fictional storytelling. On this proposal, the target-domain of mathematical practice is something other, which wears mathematics as a mask. 
It is really, on this view, the material world, or a pseudo-world of free imaginative creation, which is the target-domain. But this cannot be right, if we reflect on the phenomenon of constraint as it has been described here. For if the material world is the target-domain of mathematical practice, then there is no support for inferences concerning trans-world impossibilities; and if a pseudo-world of our imaginative creation is our real target, then the explanatory purchase of what I have called 'essential explanation' is lost. And so I propose that what does the work of extracting results from mathematical practice is in fact the relevant range of concepts themselves, understood as real at least in the sense of not being mere representations of something else (for there is nothing else that they can plausibly represent), and in the further sense that the structure of the concepts determines the appropriateness or inappropriateness of our thought about them, and the consequent practices by which we engage with them, rather than the other way around.4 The order of explanation appropriate to an account of mathematical correctness and the content of mathematical thought begins with concepts as their content and structure are worked out over time, and only then looks to whether the connected practices—practices of inference in particular, of consequence-extraction—are such as to represent the rational transitions to which we are entitled in virtue of the concept, which is gradually more fully understood. In the context of the argument from constraint proposed above, the TRANSCENDENTAL NUMBER concept determines the content of a constellation of

thoughts about π and further determines the explanatory reach of those thought-contents with respect to the range of what is and what is not possible.

Conclusion

28 A quick summary of the argument is in order. I suggested first that we can characterize mature practices of almost any kind along three dimensions: that they have meaning and can be individuated by virtue of being encompassed under a system of interpretive concepts; that they affect the mental state, specifically the first-personal self-conception, of participants; and that third-personal recognition of participants as such is available. But in addition to these, mathematics, among other practices, is also a discipline that seeks the truth, but the target-domain in which these truths are sought by means of mathematical practices is unique, or very nearly so, in not being available for sensory examination. So what is that domain? Methodologically speaking, we need to examine some features of mathematical practices themselves, multiple practices of inference in particular, so as to corner the results in our target-domain, and here we find (retaining this insight from Field) that practices of inference are adjusted in the light of what interests happen to be guiding investigation, in the light of what kinds of results are sought. But nevertheless, the exact character of the target-domain of pure mathematics is worth inquiring into, and while I have argued that some form of realism is required, it cannot be the object-realism espoused by defenders of the indispensability argument. Nonetheless, realism is required in order that we may account for the transition from, in my example, pure mathematical practices to their genuinely explanatory outcomes (an essential explanation, in the vocabulary arising from the argument from constraint). I further argued that only real concepts are suited to the multi-faceted task of moving from practice to results.

29 Practices are always-already there, so to speak, but they are good or bad in relation to concepts antecedent in the order of explanation when it comes to phenomena such as constraint.5 Concepts are not generated by practice in the general case; practices are instead measured for their worth in relation to the concepts that are their target-domain. Inferential policies are responsible to concepts: hence the explanation of constraint. I have sought to motivate reconnecting the target-domain of a portion of mathematics not just with its internal structure of operation as a practice, but as a discipline whose subject matter is importantly practice-determining. The argument has proceeded by pointing to what is ruled out. In my example, what is ruled out is not ruled out by practices taken by themselves, but rather by (in this specific case) the TRANSCENDENTAL NUMBER concept. Squaring the circle is not ruled out by our ways of doing, but by what we find as limits at the outskirts of our doings.

BIBLIOGRAPHY

BAKER, ALAN 2005 Are there genuine mathematical explanations of physical phenomena?, Mind, 114, 223-238.

BALAGUER, MARK 2009 Fictionalism, theft, and the story of mathematics, Philosophia Mathematica, 17(2), 131-162.

BENACERRAF, PAUL 1965 What numbers could not be, Philosophical Review, 74(1), 47-73.

COLYVAN, MARK 2001 The Indispensability of Mathematics, New York: OUP.

CORFIELD, DAVID 2010 Understanding the infinite I: Niceness, robustness, and realism, Philosophia Mathematica, 18(3), 253-275.

DUMMETT, MICHAEL 1978 Truth and Other Enigmas, Cambridge: Harvard University Press.

FIELD, HARTRY 1989 Realism, Mathematics, and Modality, New York: Blackwell. 1998 Mathematical objectivity and mathematical objects, in Contemporary Readings in the Foundations of Metaphysics, edited by LAURENCE, S. & MACDONALD, C., Oxford: Blackwell.

ISAACSON, DANIEL 1994 Mathematical intuition and objectivity, in Mathematics and Mind, edited by GEORGE, A., New York: Oxford University Press.

LINDEMANN, FERDINAND VON 1882 Über die Zahl π, Mathematische Annalen, 20(2), 213-225.

MORTENSEN, CHRIS 1995 Inconsistent Mathematics, Dordrecht: Kluwer. 2009 Inconsistent mathematics: Some philosophical implications, in Handbook of the Philosophy of Science, edited by IRVINE, A. D., North-Holland Publishing Company - Elsevier, vol. 9: Philosophy of Mathematics.

PRIEST, GRAHAM 2006 In Contradiction, Oxford: Oxford University Press, 2nd ed.

THOMPSON, MICHAEL 2008 Life and Action, Cambridge: Harvard University Press.

NOTES

1. I borrow these three criteria loosely from Michael Thompson [Thompson 2008, 199], though the use I make of them is entirely different.

2. Balaguer raises this point in respect of platonism in contrast specifically to mathematical fictionalism, but the challenge ought to generalize.

3. Individual concepts such as the ones expressed by proper names are an exception to multiple instantiation, but the Benacerraf argument shows that, if numbers are not individuals, number concepts are by the same token not individual concepts either.

4. In the light of these considerations I think that Daniel Isaacson [Isaacson 1994] is perhaps closest to the truth in invoking what he calls 'concept platonism'. For concepts as determiners of thoughts do triple duty: first, unlike platonic particulars, they are multiply instantiable and thus constant across isomorphic structures that serve to fill their place-holders; second, they are constraining in virtue of possessing necessary structural relations to each other; and third, following from these first two features, they are explanatory in the requisite sense of furnishing essential explanations.

5. A related point, especially concerning infinite structures, is made in Corfield [Corfield 2010].

ABSTRACTS

In this piece I articulate and defend a conception of mathematical practice and mathematical subject-matter, which is responsive both to a sensible pluralism concerning the connection between inferential practices and guiding interests, on the one hand, and to the objective content-determining structure of mathematical concepts on the other. I begin by sketching a general characterization of practices themselves, and by specifying some of the unique features of mathematical practices. An examination of inferential pluralism follows, and some insights of pluralist arguments are retained. But I argue further for the requirement of some form of mathematical realism, though the object-realism of the indispensability argument is assessed and rejected. My positive proposal argues for a form of concept-realism, which is established by what I call the "argument from constraint."

In this article I set out and defend a conception of mathematical practices and of the mathematical domain of discourse that is sensitive, on the one hand, to the pluralism of relations between inferential practices and interests and, on the other, to the objective and determining structure of mathematical concepts. I first sketch a general characterization of practices, and then specify certain phenomena peculiar to mathematical practices. A survey follows of the ideas that emerge from pluralist arguments, and of those worth retaining. I then defend the need for a form of mathematical realism, which cannot, however, be the object-realism advocated by proponents of the indispensability argument. I defend instead a realism of concepts, supported by what I call the 'argument from constraint'.

AUTHOR

MARK C. R. SMITH Queen's University (Canada)

Proof and Understanding in Mathematical Practice

Danielle Macbeth

1 In their practice mathematicians prove, and reprove, theorems—in some cases hundreds of times1—in order better to understand. But how does the practice of proving theorems result in mathematical understanding? What is a proof that it yields understanding (when it does)? These questions are all the more difficult insofar as a proof as it is conceived in mathematical logic, as a formal derivation of a theorem from axioms, seems to have nothing whatever to do with understanding. Is perhaps the locus of mathematical understanding something other than proof? Or should we perhaps think instead that mathematicians' proofs are something essentially different from derivations as they are understood in mathematical logic? What exactly is the relationship between, on the one hand, the mathematical practice of proving theorems, and on the other, the desired mathematical result of better understanding?2

2 Needless to say, no fully adequate answer can be provided here. What I aim to do is, first, to identify what I take to be the most fundamental issue regarding proof and understanding in mathematical practice, the issue that must be addressed before anything more detailed can be said about how particular proofs, or particular sorts of proofs, enable understanding, or about the nature of mathematical understanding, or about different aspects of the mathematician's understanding. As developed in the first section, the problem is that given our current conception of formal proof it is impossible to see how such a proof could possibly advance our understanding because a formal proof abstracts from all content. But if that is right then we need to reconsider the notions of formal reasoning and formal proof. In the second section two quite different accounts of what one might mean by formal reasoning are outlined, one in which form is fundamentally opposed to content and one in which it is not. In the third section I turn to the sort of algebraic reasoning that was the norm in mathematics in the eighteenth century in order to clarify the idea that one might reason in a way that is at once fully rigorous, and hence in a sense formal, and also from the contents of mathematical ideas.

3 In Section Four some ideas are very briefly sketched regarding how one might extrapolate from that case to the sort of mathematical reasoning that since the nineteenth century has been the norm in mathematical practice. The fifth section concludes the discussion.

1 Proof, understanding, and formalization

4 According to what it seems fair to say is the standard view, a mathematician's proof is a sequence of sentences beginning with axioms each subsequent sentence of which is a logical consequence, in the technical sense due to Tarski, of an earlier sentence or sentences.3 Suppes' Introduction to Logic, published in 1957, is a classic exposition of the view. First, it provides "a completely explicit theory of inference adequate to deal with all the standard examples of deductive reasoning in mathematics" [Suppes 1957, xii]. The underlying idea of this explicit theory is of course very familiar, but it is worth rehearsing nonetheless. On this view: "A correct piece of reasoning, whether in mathematics, physics, or casual conversation, is valid in virtue of its logical form. Because most arguments are expressed in ordinary language with the addition of a few technical symbols particular to the discipline at hand, the logical form of the argument is not transparent. Fortunately, this logical structure may be laid bare by isolating a small number of key words and phrases like 'and', 'not', 'every', and 'some'. In order to fix upon these central expressions and to lay down explicit rules of inference depending on their occurrence, one of our first steps shall be to introduce logical symbols for them. With the aid of these symbols it is relatively easy to state and apply rules of valid inference." [Suppes 1957, xii]

5 Once having mastered translations into the symbolic notation and the rules governing inferences involving the logical signs in the system (Chapters 1-6), the student is equipped to navigate the differences between a formal proof and an informal proof of the sort, it is claimed, that mathematicians in their practice actually provide (Chapter 7). On the account Suppes outlines, whereas the mathematician's (informal) proof has jumps in the deductive reasoning, in a fully formalized proof all the jumps have been replaced by small steps that are manifestly instances of the rules of formal logic. "In giving an informal proof, we try to cover the essential, unfamiliar, unobvious steps and omit the trivial and routine inferences". A formal proof includes even the trivial and routine inferences [Suppes 1957, 128].

6 But as has been more recently noted, this seems, surprisingly, not to be right. Mathematicians, in their actual practice, often do not give proofs in this sense; they do not set out a chain of deductions, whether or not with jumps in the reasoning. Instead, they describe how the chain of reasoning goes, or would go if one actually engaged in it. As Bostock has put the point, in mathematical practice "one does not actually construct such proofs [i.e., fully formalized, or even gappy, deductive proofs from axioms]; rather one proves that there is a proof as originally defined" [Bostock 1997, 239]. Avigad similarly suggests that "real" proofs, that is, the proofs that mathematicians actually give, "are simply higher-level, informal texts that indicate the existence of lower-level formal ones; i.e., they are recipes, or descriptions" [Avigad 2006, 132]. This feature of the proofs that mathematicians actually provide is the basis also of Azzouni's "derivation-indicator view of mathematical practice" [Azzouni 2006].4 As Azzouni sees, what the mathematician provides is only something that indicates that there is a derivation; what is given is not the derivation itself, that is, the chain of reasoning (whether or not with gaps or jumps) from one claim to another according to rules of valid inference. Now Azzouni does think, as does Avigad, that the derivation itself, the derivation that is only described by the mathematician, can be given in the symbolism of standard mathematical logic, and that, we will see, may not be so. But I want for the moment to leave this to one side in order to focus more closely on the idea that mathematicians' proofs are in fact not proofs in the sense of chains of reasoning but instead descriptions of such proofs.

7 A simple example illustrates the difference we are concerned with here.5 Suppose you are given the task of dividing eight hundred and seventy-three by seventeen. You might solve the problem by engaging in a bit of mental arithmetic that you could report as follows. A hundred seventeens is seventeen hundred; so fifty seventeens is half that, eight hundred and fifty. Add one more seventeen—so that now we have fifty-one seventeens—to give eight hundred and sixty-seven. This is six less than eight hundred and seventy-three. So the answer is fifty-one with six remainder. Alternatively, you might do (or imagine doing) a standard paper-and-pencil calculation in Arabic numeration the result of which (if you were schooled in North America) will look something like this:

              51
        17 ) 873
             85
             --
              23
              17
              --
               6

8 Either way, one gets the answer that is wanted. But the two collections of marks serve very different purposes. The written English, it seems fair to say, provides a description of some reasoning, in particular a description of the steps of mental arithmetic a person might go through in order to find the answer to the problem that was posed. The description does not give those steps themselves. To say or write that a hundred seventeens is seventeen hundred is not to calculate a product. Even more obviously, to say or write the words 'add one more seventeen' is not to add one more seventeen. It is to instruct a hearer or reader: at this point in their reasoning, assuming they are following the instructions, they are to add seventeen. The paper-and-pencil calculation is radically different. It is manifestly not a description of a chain of reasoning one might undertake; rather it shows a calculation, at least to one familiar with this use of signs to divide one number by another. (Notice that the paper-and-pencil calculation does not show the steps that are described in the first passage; calculating in Arabic numeration does not merely reproduce mental arithmetic but involves its own style of computing.) Working through this collection of signs in Arabic numeration, or performing the calculation from scratch for oneself, just is to calculate. Here we have not a description of reasoning but instead a display of reasoning.6 One calculates in the system of Arabic numeration in a way that is simply impossible in natural language. To one who is literate in this use of the system of written signs, the calculation shows the reasoning.
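The two routes to the answer can be set side by side in a minimal Python sketch. It is offered only as an illustration under stated assumptions: the function names `mental_route_873_by_17` and `long_division` are hypothetical, and the second function follows the schoolbook digit-by-digit procedure rather than any particular notation. The first function retraces the steps reported in the English passage; the second works through the dividend one digit at a time and records the intermediate figures that appear in the paper-and-pencil layout (85, 23, 17, 6).

```python
def mental_route_873_by_17():
    """Retrace the mental arithmetic reported in the English passage."""
    hundred_seventeens = 100 * 17               # "a hundred seventeens is seventeen hundred"
    fifty_seventeens = hundred_seventeens // 2  # "fifty seventeens is half that": 850
    fifty_one_seventeens = fifty_seventeens + 17  # "add one more seventeen": 867
    remainder = 873 - fifty_one_seventeens      # "six less than eight hundred and seventy-three"
    return 51, remainder


def long_division(dividend, divisor):
    """Schoolbook long division, one digit of the dividend at a time.

    Returns (quotient, remainder, steps). Each step is a tuple
    (partial dividend, quotient digit, product subtracted).
    """
    if divisor <= 0 or dividend < 0:
        raise ValueError("expects a non-negative dividend and a positive divisor")
    quotient_digits = []
    steps = []
    partial = 0
    for digit in str(dividend):
        partial = partial * 10 + int(digit)   # bring down the next digit
        q = partial // divisor                # how many times the divisor fits
        if q or quotient_digits:              # do not record leading zeros
            quotient_digits.append(str(q))
            steps.append((partial, q, q * divisor))
        partial -= q * divisor                # what is left to carry forward
    quotient = int("".join(quotient_digits)) if quotient_digits else 0
    return quotient, partial, steps


if __name__ == "__main__":
    print(mental_route_873_by_17())
    print(long_division(873, 17))
```

Both functions arrive at the same quotient and remainder, but the intermediate values they pass through differ, which matches the observation that calculating in Arabic numeration does not merely reproduce mental arithmetic.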

9 Bostock, Avigad, and Azzouni all claim that proofs in contemporary mathematical practice describe rather than display the reasoning. This is easily verified by looking at the form of expression of the proofs that are provided in standard contemporary textbooks of college level mathematics, say, in abstract algebra. The following descriptive phrases, taken from proofs in the opening chapter of Birkhoff and Mac Lane's classic A Survey of Modern Algebra (1965), are typical:7

- ... the first law may be proved by induction on n.
- ... by successive applications of the definition, the associative law, the induction assumption, and the definition again.
- By choice of m, P(k) will be true for all k < m.
- Hence, by the well-ordering postulate ...
- From this formula it is clear that ...
- This reduction can be repeated on b and r₁ ...
- This can be done by expressing the successive remainders rᵢ in terms of a and b ...
- By the definition of a prime ...
- On multiplying through by b ...
- ... by the second induction principle, we can assume P(b) and P(c) to be true ...
- Continue this process until no primes are left on one side of the resulting equation ...
- Collecting these occurrences, ...
- By definition, the hypothesis states that ...
- ... Theorem 10 allows us to conclude ...

10 Proofs written using such phrases are not themselves the reasoning but instead accounts of how the reasoning is to go and would go were one actually to reason from the starting point to the desired theorem. They tell you what to do rather than showing you the doing of it.

11 Mathematicians in their practice do not give formal, or even formalizable (in Suppes' sense of filling in all the gaps in the reasoning), proofs but instead only describe chains of reasoning. We need, then, to ask whether the chains of reasoning that the mathematician describes can be formalized in Suppes' sense. In one respect the answer is obviously yes. We do know how to take a mathematician's proof and on that basis produce a fully formalized proof, and have actually done this for various mathematically significant proofs, among them Gödel's first incompleteness theorem, the Jordan curve theorem, the prime number theorem, and the four-color theorem.8 But what exactly is the relationship between the mathematician's reasoning and the formalized proof? The answer is not obvious and it is not obvious because a formalized proof is usually unintelligible. A formalized proof does not provide or even advance mathematical understanding. There are only two options. If we assume that the formalized proof does set out the chain of reasoning that the mathematician describes, it would seem then to follow, in light of the fact that the formalized proof does not advance understanding, that mathematical understanding does not reside in the practice of proving theorems but instead in some other aspect of mathematical practice. Alternatively, one might argue that the formalized proof does not set out the mathematician's reasoning, the reasoning that is described in a textbook proof, precisely because a mathematician's chain of reasoning (the reasoning that is described in a textbook proof) can advance one's mathematical understanding. On this second approach, one begins with the thought that the mathematician's reasoning does, at least in some cases, advance mathematical understanding. But because we know that the corresponding formal proof does not advance mathematical understanding, one concludes that the mathematician's reasoning cannot be identified with a formal proof.

12 Manders embraces the first horn. He accepts the mathematical logician's formal conception of proof but then locates mathematical understanding outside of proof, in the conceptual settings (as he calls them) of proofs. And he does so because, he sees, the reliability that is gained in formalizing a proof is achieved at the expense of understanding: "fully formalized proofs are usually unintelligible. Whatever goes into clarity of mathematical ideas can be obscured by the way those ideas are represented in reliability theoretic mathematical foundations" [Manders 1987, 202]. For Manders, because he assumes that questions of reliability are adequately settled by formalizing proofs, while at the same time recognizing that formalization is inimical to understanding, it simply cannot be the case that the locus of mathematical understanding is proof.

13 Rav argues contrapositively that because a mathematician's proof can provide mathematical understanding, the mathematician's reasoning cannot be identified with a formalization. Although, Rav thinks, "most of our current mathematical theories can be expressed in first-order set-theoretical language", a mathematician's proof cannot be formalized because it "depends on an understanding and prior assimilation of the meanings of concepts from which certain properties follow logically" [Rav 1999, 20, fn 20, and 29]. A mathematician's proof, Rav argues, is irreducibly meaningful; it has irreducible semantic content and so cannot be identified with a derivation in some formal system precisely because such a derivation is merely syntactic.9

14 As the arguments of Rav and Manders taken together indicate, the conception of formal proof that we inherit from mathematical logic leaves us with only two choices: either a mathematician's proof is (or at least can be) a source of mathematical understanding and is essentially different from a formalization of it, or the mathematician's proofs are formal, that is, in principle formalizable, but are not the means by which one achieves mathematical understanding. I think that we should be very puzzled by this. Surely our logic is, just as Suppes says, "a completely explicit theory of inference adequate to deal with all the standard examples of deductive reasoning in mathematics" [Suppes 1957, xii]. It is barely conceivable that the norms governing the reasoning mathematicians actually engage in are simply disjoint from the familiar rules of logical inference.10 But if so, then why is it that conceptual clarity in mathematics is essentially different from formal clarity? This is the problem of proof and understanding that we are concerned with here.

2 Two conceptions of formal reasoning

15 Manders and Rav agree that a fully formalized proof has nothing whatever to do with mathematical understanding. And it is not hard to see why it does not: a fully formalized proof does not depend in any way on content or meaning. In a formal proof what matters is not (non-logical) content but only the logical particles in virtue of which the rules governing permissible moves may be applied. One does not have to think at all about what is being claimed or inferred in a formal proof, as is made manifest in the fact that such proofs are machine-checkable. Indeed, this is a large part of the point of the formalization; the point is to make any and all relations of meaning explicit, all steps in the proof purely syntactic, so that one can see precisely on what the proof depends. But there is a puzzle hidden here. A proof, Rav suggests, has irreducible semantic content. Mathematicians, if he is right, reason on the basis of contentful mathematical ideas. Thurston, a practicing mathematician and Fields medalist, concurs: "reliability does not primarily come from mathematicians formally checking formal arguments; it comes from mathematicians thinking carefully and critically about mathematical ideas" [Thurston 1994, 47]. Thurston, unlike Rav, does think that (at least in a sense) mathematicians' proofs can in principle be formalized. "However", he cautions, "we should recognize that the humanly understandable and humanly checkable proofs we actually do are what is most important to us, and that they are very different from formal proofs" [Thurston 1994, 48].11 But supposing that mathematicians do reason on the basis of contentful mathematical ideas, then we should be able to make those ideas, and the steps of reasoning they enable, explicit so as to see precisely on what the reasoning depends, and we should be able to do that without losing sight of meaning altogether. But it is just this that we seem not to be able to do insofar as properly logical, rigorous reasoning depends, we think, on form as contrasted with content. Hence, to make the mathematician's reasoning fully explicit is to show that it does not, after all, rely on any meanings, any semantic content, but only logical form.

16 On the model-theoretic account of logic and language that is due to Tarski, a language involves two essentially different and opposed elements, logical form, which alone matters to the goodness of inference, and content that is given by an interpretation or model. More exactly, on this conception, a (first-order) language involves four essentially different sorts of signs that following Hodges we can characterize as follows: "First, there are the truth-functional connectives. These have fixed logical meanings. Second there are the individual variables. These have no meaning. When they occur free, they mark a place where an object can be named; when bound they are part of the machinery of quantification. ... The third symbols are the non-logical constants. These are the relation, function, and individual constant symbols. Different first-order languages have different stocks of these symbols. In themselves they don't refer to any particular relations, functions or individuals, but we can give them references by applying the language to a particular structure or situation. And fourth there are the quantifier symbols. These always mean 'for all individuals' and 'for some individuals'; but what counts as an individual depends on how we apply the language. To understand them, we have to supply a domain of quantification. The result is that a sentence of a first-order language isn't true or false outright. It only becomes true or false when we have interpreted the non-logical constants and the quantifiers." [Hodges 1985/6, 144]

17 On this conception, what matters to the goodness of inference is only logical form. An inference from some premise or premises P to a conclusion C is good just in case C is a logical consequence of P, that is, there is no interpretation or model in which P comes out true and C false. To have a fully rigorous chain of reasoning is thus to dispense with all content and meaning, to attend only to form. To reason rigorously, on this account, is to be impervious to extra-logical meanings.

18 To reason rigorously is to make everything on which one's reasoning depends explicit; it is to state in advance all the premises and assumptions that are required in the derivation of the conclusion. The Tarskian conception commits us to something stronger, to the idea that meanings have nothing at all to do with the goodness of inference. The notion of logical consequence reduces an act of inference, the derivation of some conclusion from a premise or premises, to a purely formal, syntactic relation between uninterpreted strings of signs. And although the demand for rigor, for a formal proof in that sense, does not require that one strip all meaning from one's non-logical constants but only make all assumptions explicit, the step from the demand for rigor to the requirement that one formalize one's chain of reasoning in some meaningless system of signs has seemed to many a short one indeed. We read, for instance, in Nagel's classic essay "The Formation of Modern Conceptions of Formal Logic in the Development of Geometry": "Everyone is aware, he [Pasch] pointed out, of the dangers which threaten the geometer who uses diagrams and other sensory images; but few seem to be equally aware of the traps which we lay for ourselves when we employ common words to designate mathematical concepts. For such words have many associated meanings not relevant to the task of a rigorously deductive science, and those associated meanings sway us to the detriment of rigor." [Nagel 1939, 238]

19 So far, so good, one might say: it is for just this reason that we need elucidations of our primitive signs, explicit definitions of other terms, axioms setting out the basic truths of the system, and rules on the basis of which alone anything can be derived. Nagel, however, continues: "To avoid these handicaps it is therefore desirable to formalize the set of nuclear propositions; that is, we ought to replace them by a series of expressions in which the 'geometrical concepts' of the propositions have been replaced by arbitrarily selected marks, whose sole function is to serve as 'places' or blanks to be filled in as occasion may warrant. The result of such a formalization is an 'empty frame', which expresses the structure of the set of nuclear propositions and which is alone relevant to the task of pure geometry."

20 In pure geometry, according to Nagel, all meaningful words, that is, all non-logical words, are to be replaced by meaningless symbols to ensure that one draws no inferences that are grounded in meaning. Pure geometry, like pure mathematics generally, is to be conceived (on this view) as the manipulation of empty marks according to rules.

21 But, again, rigor does not entail formalism in this sense. All that rigor requires is that no steps are to be made in the system save those permitted by the axioms, definitions, and given rules of inference, which serves to ensure that all one's presuppositions have been made explicit. Although it is true that the possibility of following the rules mechanically, as if one's non-logical primitive signs had no meaning, can serve as a test of the adequacy of one's system of axioms and definitions, it simply does not follow that one should conceive one's signs as empty of all meaning and signification. As Frege puts the point, "it is possible, of course, to operate with figures mechanically, just as it is possible to speak like a parrot. . . [But] it only becomes possible [to do this] after the mathematical notation has, as a result of genuine thought, been so developed that it does the thinking for us, so to speak" [Frege 1884, iv]. "The use of symbols must not be equated with a thoughtless, mechanical procedure. . . One can think also in symbols" [Frege 1976, 33]. If Frege is right, mathematicians develop systems of written signs not to dispense with all content but instead to express content in a way that enables rigorous reasoning on the basis of that content.

22 Why then does Nagel so easily slide from the demand for rigor (for a formal proof in that sense) to the demand for formalization in his sense? Why does he think that the only way to avoid being "swayed" by "associated meanings" is by jettisoning meanings altogether? A plausible answer, or at least part of an answer, is that he associates rigor with mechanism: to be really rigorous is to think like a machine, impervious not only to associated meanings but to any meaning at all.

23 Mathematicians in their practice do not think like machines any more than chess masters play chess merely mechanically, that is, as chess-playing computers do, by brute number crunching (hundreds of millions of moves per second). Nevertheless, it can seem that the goodness of inference really does rely only on logical form as contrasted with content. In fact this is not so. Although there is a perfectly good sense in which inferences are valid in virtue of their form, it does not follow that the notion of logical form fundamentally contrasts with and is opposed to that of semantic content, of content as it matters to understanding and truth. It is perfectly possible to accept that inferences are valid in virtue of their form while rejecting the distinction between logical form and content that is codified in the model-theoretic conception of language that we find in mathematical logic. For to say that an inference is valid in virtue of form need mean no more than that the inference is an instance or application of something more general, a schema or rule that applies not only in the given case but also in other cases that are relevantly like it.

24 Consider, for example, the (material) inference from 'Felix is a cat' to 'Felix is a mammal'. This inference is good, if it is good, because one can infer generally from something's being a cat that it is a mammal; the inference is good (if it is good) because inferences of the form 'x is a cat; therefore, x is a mammal' are good. The inference is not good in virtue of being about Felix. Though one does infer something about Felix, the validity of the inference is explained not by reference to Felix but by appeal to a rule, something to the effect that being a cat entails being a mammal. Similarly, the inference from 'John is in the kitchen or the den' and 'John is not in the den' to 'John is in the kitchen' is good, if it is good, because one can infer generally from something's being this or that, and its not being that, to its being this. The inference is not good in virtue of being about John's whereabouts. Though one does infer something about John's whereabouts, the validity of the inference is explained not by reference to John but by appeal to a rule, namely, the rule we call disjunctive syllogism. Whereas in our Felix example the rule concerns valid inferences that rely on the meanings of the words 'cat' and 'mammal', in the John example the rule concerns valid inferences that rely on the meanings of the words 'or' and 'not'. And just the same is true in all other cases of actual inference, whether material or formal; any actual inference is an instance of a rule that applies also in other cases.

25 The model-theoretic conception of language according to which logical form stands opposed to semantic content commits us to something stronger. It commits us, first, to the idea that the Felix inference is not really a valid inference at all. On this view, no inference is valid in virtue of meaning, from which it follows directly that 'cat' and 'mammal' play an essentially different role in reasoning from words such as 'not' and 'or'. Somehow, on this view, logical contents such as that given by 'or' and 'not' are not really contents, meanings, at all but only forms, or form indicators.
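The asymmetry that this conception introduces between the John inference and the Felix inference can be seen in a small Python sketch. It rests on a deliberate simplification that is an assumption of the sketch and not of the text: sentences are treated as propositional atoms, and an "interpretation" is just an assignment of truth values to those atoms, so the quantificational machinery of a first-order language is left out. The helper name `is_consequence` and the atom names are hypothetical.

```python
from itertools import product


def is_consequence(premises, conclusion, atoms):
    """Return True if no assignment of truth values to the atoms makes
    every premise true and the conclusion false."""
    for values in product([True, False], repeat=len(atoms)):
        v = dict(zip(atoms, values))
        if all(p(v) for p in premises) and not conclusion(v):
            return False  # found a counter-interpretation
    return True


# The John inference: 'kitchen or den', 'not den', therefore 'kitchen'.
kitchen_or_den = lambda v: v["kitchen"] or v["den"]
not_den = lambda v: not v["den"]
kitchen = lambda v: v["kitchen"]
john_valid = is_consequence([kitchen_or_den, not_den], kitchen, ["kitchen", "den"])

# The Felix inference: 'cat', therefore 'mammal'. The atoms are
# uninterpreted, so nothing links being a cat to being a mammal.
cat = lambda v: v["cat"]
mammal = lambda v: v["mammal"]
felix_valid = is_consequence([cat], mammal, ["cat", "mammal"])

# The same inference with the rule stated as an explicit extra premise.
cat_implies_mammal = lambda v: (not v["cat"]) or v["mammal"]
felix_with_rule = is_consequence([cat, cat_implies_mammal], mammal, ["cat", "mammal"])

if __name__ == "__main__":
    print(john_valid, felix_valid, felix_with_rule)
```

On this simplified picture the John inference counts as valid, while the Felix inference does not until the connection between 'cat' and 'mammal' is added as a premise, since the checker treats 'or' and 'not' as fixed and 'cat' and 'mammal' as open to any reinterpretation.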

26 Particular contents, at least as contents are understood on the model-theoretic view, do not matter to the validity of an inference or of a chain of inferences in a proof. But, we have seen, particular contents do seem to matter in mathematical reasoning. Though any inference is valid in virtue of its form, actual mathematical reasoning is reasoning from the contents of concepts, from mathematical ideas. Even so much as to begin to understand how this can be, we need to bear in mind that although there is a clear sense in which inferences are valid in virtue of their form (because they are instances of something more general), it does not follow that language should be conceived model theoretically, in terms of a thoroughgoing distinction between logical form and content as it matters to meaning, understanding, and truth.

27 Mathematicians, in their practice, reason on the basis of content. But because we tend to think that logical form is opposed to content, that the goodness of inference is independent of content, the notion of a fully formalized mathematical proof generates a philosophical difficulty. It becomes impossible to understand how a mathematical proof could further mathematical understanding. Either a mathematical proof can advance one's mathematical understanding, in which case it cannot be identified with a fully formalized proof, or it can be fully formalized, made machine checkable, but does not contribute to mathematical understanding. The problem here is clearly not with the very idea of rigor. It is perfectly coherent that one might reason fully rigorously on the basis of content. The problem is that we do not know how to make intelligible the idea of reasoning on the basis of content, mathematical ideas. It is to this problem that we now turn.

3 Reasoning from content in the formula language of elementary algebra

28 We have seen that our standard way of thinking about logic (and language) makes it very hard to see how reasoning might be on the basis of content. I want, then, to set that standard way of thinking aside as far as possible and consider instead the sort of constructive algebraic problem solving that was characteristic of mathematical practice in the eighteenth century, long before we had anything even approaching our contemporary views of logic. Such problem solving is done in the familiar symbolic language of algebra that we all learned as school children and, as is well known, it can be made fully rigorous in an axiomatization. As I will argue, this language, together with the mathematical practice that uses it, provides, at least on one plausible interpretation, an illustration of how one might reason on the basis of content in mathematics.

29 Among the axioms of elementary algebra are these:
1. a + b = b + a
2. (a + b) + c = a + (b + c)
3. a × b = b × a
4. (a × b) × c = a × (b × c)
5. a × (b + c) = (a × b) + (a × c)

30 Given these axioms governing permissible rewritings, one can further derive various theorems, which similarly govern permissible rewritings, for instance, this: (a + b)² = a² + 2ab + b². But how exactly should we think of these axioms? One way is quantificationally, that is, in terms of a conception of generality that was fully clarified only in the early decades of the twentieth century.12 On this way of thinking, the axioms are implicitly universally quantified claims about numbers, basic truths about numbers, all of them. A second way, following Hilbert, is to think of them as implicitly defining addition and multiplication, as stipulations about what we will mean by the signs '+' and '×'. The grounds for thinking of the axioms this way are familiar: although we can read the axioms as truths about numbers, as on the quantificational reading, those same axioms can also be interpreted very differently, as truths about quite different sorts of objects. So long as the axioms are satisfied in a given domain then the derived theorems will hold in that domain. And this is, of course, an important, and importantly mathematical, insight.
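One way the theorem mentioned above might be derived by successive rewritings is sketched below. The sketch assumes two abbreviations that are not among the five axioms listed: that the square of an expression abbreviates its product with itself, and that 2ab abbreviates (a × b) + (a × b). Each line is obtained from the previous one by rewriting in accordance with the axiom named on the right.

```latex
\begin{align*}
(a+b)^2 &= (a+b)\times(a+b) && \text{abbreviation for the square}\\
 &= \bigl((a+b)\times a\bigr) + \bigl((a+b)\times b\bigr) && \text{axiom 5}\\
 &= \bigl(a\times(a+b)\bigr) + \bigl(b\times(a+b)\bigr) && \text{axiom 3, twice}\\
 &= \bigl((a\times a)+(a\times b)\bigr) + \bigl((b\times a)+(b\times b)\bigr) && \text{axiom 5, twice}\\
 &= \bigl((a\times a)+(a\times b)\bigr) + \bigl((a\times b)+(b\times b)\bigr) && \text{axiom 3}\\
 &= (a\times a) + \Bigl(\bigl((a\times b)+(a\times b)\bigr) + (b\times b)\Bigr) && \text{axiom 2, twice}\\
 &= a^2 + 2ab + b^2 && \text{abbreviations}
\end{align*}
```

Only axioms 2, 3, and 5 are used; axioms 1 and 4 play no role in this particular derivation.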

31 But there is also a third way of thinking of these axioms. We can treat the axioms, and the theorems derived from them, neither as truths about numbers nor as uninterpreted stipulations for which many models might be given, but instead as basic and derived truths about the operations of addition and multiplication, as ascribing higher-level properties to those operations, in our axioms here, commutativity, associativity, and distribution. On this reading, the axioms and theorems are perfectly meaningful (as they are not on the second, Hilbertian reading), but they are in no way about numbers (as on the first, quantificational reading). What the axioms and theorems concern on this third reading are certain arithmetical operations and particular properties of those operations. On this reading, the axioms set out basic properties of those operations and on that basis one derives theorems that ascribe other, non-basic properties to those same operations.

32 It is important that we be clear about what this third reading amounts to, and to that end it will help to consider an interpretation of it that does not capture what we are after here.13 On the (mistaken) interpretation, the first axiom is fully expressed thus: (∀a, b ∈ N)(∀+ ∈ S) a + b = b + a.

33 The three readings outlined above are then to be taken to correspond to three different answers one might give to the question what is assumed to be fixed and what can vary in an interpretation. On the first, quantificational reading, both N and S are assumed fixed. On the second, Hilbertian reading, both N and S are treated as uninterpreted independent of the specification of a model in which the axiom comes out true. The third reading, on this account, keeps S fixed but allows N to vary with different interpretations. (And a fourth possibility opens up as well, namely, to keep N fixed and let S vary.) But that is not what is intended here. The third reading we are after, according to which the axioms ascribe properties to the operations of addition and multiplication, does not involve any quantifiers at all. It has the form of a simple predication; it is like an ascription of a property to an object, something of the form 'Fa', but is one level up insofar as the "object" in this case is the addition function (not the numbers that function correlates but the correlating function itself), and the property ascribed is a second-level one. To explain how exactly the notation would have to be functioning to serve this expressive purpose would take us too far afield.14 For our purposes what is important is that such a reading is available, and that it is quite different both from the first quantificational reading and from the (related) Hilbertian reading. On the first reading the axioms are about some objects and ascribe properties and relations to those objects, all of them. On the second reading, the axioms are about some particular domain of objects only on an interpretation (in a model) that makes the axioms true. On the third reading, the axioms are not about objects, numbers, but about particular arithmetical functions, and they ascribe various (higher-level) properties and relations to those arithmetical functions.

34 The third reading of the axioms (and theorems derived from them) as ascriptions of properties to the operations of addition and multiplication can help to clarify the first two readings as follows. First, if, as the third reading has it, addition has the property of being commutative, then it obviously follows that for any two particular numbers a and b, a + b = b + a. It follows, that is, that the quantificational reading of that same axiom is true. But now, as we can see, that universally quantified generality is not known to be true directly. How could it be given that there are infinitely, indeed, non-denumerably many numbers to which it applies? Instead that quantified generality is known to be true because the operation of addition has the property of being commutative.15 It is not merely an accidental truth about numbers that, as it happens for any two of them, the first plus the second is equal to the second plus the first. It is, rather, necessary that this be so. And this necessity at the level of individual, particular numbers can be explained by the fact, one level up (that is, at the level of operations on numbers rather than at the level of the numbers themselves), that the operation of addition has the property of being commutative.

35 The second, Hilbertian reading is also explained because what the Hilbertian reading really reflects is the fact that inferences are valid in virtue of their form, the fact that they are instances of something essentially general that can be applied in other cases. The basic idea is this. If, as on our third reading, we take our axioms and theorems to be ascriptions of properties to the operations of addition and multiplication, then we can see as well that the inferences from the axioms to the theorems are not good in virtue of the fact that they are about addition and multiplication, any more than our inference about Felix was good in virtue of being about Felix. They are good because having those particular properties that are ascribed in the axioms entails having certain other properties as well. That is just what the derivations of the theorems from the axioms show. It follows directly that any other function or operation having the same particular properties as are ascribed in the axioms will also have the properties that are ascribed in the theorems that are derived from the axioms. The Hilbertian insight that the axioms and theorems can be applied in other cases does not show that the model-theoretic conception of language is the right conception to have. For that insight can be seen instead as a reflection of the fundamental feature of inference we have already remarked on, the fact that any particular inference is an instance or application of a general rule that can be applied also in relevant other cases.

36 On our third reading, the axioms of elementary algebra are contentful truths about the addition and multiplication functions from which other truths follow. All these (basic and derived) truths function in turn to license inferences in particular cases. They can, in other words, be applied as rules in solving construction problems and in demonstrating theorems. Consider, for illustration, this theorem: if two numbers are each a sum of two integer squares then their product is also a sum of integer squares. In order to show algebraically that this theorem is true, we begin by formulating the starting point in the symbolic language. We have two numbers that are each a sum of two integer squares, which we express in the symbolic language thus: $a^2 + b^2$ and $c^2 + d^2$. What we must show is that their product is also a sum of integer squares. So we write down the product of our two sums thus: $(a^2 + b^2)(c^2 + d^2)$. Notice what we have done here. We began with a certain mathematical idea, the idea, or as we can also say, the concept, of a sum of two integer squares. The content of that concept, what it is to be a sum of two integer squares, was then expressed in the language of algebra: $a^2 + b^2$. This expression, '$a^2 + b^2$', does not merely abbreviate the English phrase (though one could imagine so using the signs, perhaps thus: + 212) but instead formulates in a specially designed system of signs the content of that English phrase, what it means. To express the idea that we have two such sums we then used different letters in place of 'a' and 'b', and on that basis we were able to exhibit the content of the idea of a product of two numbers that are, each of them, a sum of two integer squares: $(a^2 + b^2)(c^2 + d^2)$. Having thus formulated the content of the concept product of two sums of (two) integer squares in an arithmetically articulated complex of signs in the language, we can now apply the rules given in the axioms and derived theorems.

37 Leaving out obvious steps (steps that we could easily put in, tedious though it would be to do so), the fifth axiom licenses the move from $(a^2 + b^2)(c^2 + d^2)$ to $a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2$. Notice, first, that although the rule applies to a certain form of expression it is still perfectly possible to keep the content expressed in view as we apply the rule. As an indication of this we can express the rule in axiom 5 in (meaningful) words: if you have a product of a number and a sum then it can be rewritten as a sum of the products of the number together with each of the summands. It is also worth remarking that although we generally think of inferential articulation as something that involves a relation between concepts, here we are dealing with an internally articulated idea. It is the whole expression that sets out the idea with which we are concerned, that is, the idea of a product of two sums of integer squares, but because that expression is internally articulated, we can apply the rewrite rules of elementary algebra to it.

38 Now we need to reorder to get: $a^2c^2 + b^2d^2 + a^2d^2 + b^2c^2$. At this point a student might wonder why that was a good move to make. The reason, however, becomes clear at the next step in which we both add and subtract $2abcd$ (with the letters appropriately reordered) to give, after some reorganization, $a^2c^2 - 2acbd + b^2d^2 + a^2d^2 + 2adbc + b^2c^2$. And now even the student ought to be able to see that this last expression can be rewritten (by appeal to familiar derived rules) as $(ac - bd)^2 + (ad + bc)^2$. But that, we can see, is just what was wanted, a sum of two integer squares. We have the desired result.
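The chain of rewritings just described can be collected into a single display. This is a sketch in which the obvious intermediate steps are still left out; the justifications are named by property rather than by axiom number, and the numerical instance at the end is an illustration that does not appear in the original text.

```latex
\begin{align*}
(a^2 + b^2)(c^2 + d^2)
  &= a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2
     && \text{distributivity} \\
  &= a^2c^2 + b^2d^2 + a^2d^2 + b^2c^2
     && \text{reordering the summands} \\
  &= (a^2c^2 - 2abcd + b^2d^2) + (a^2d^2 + 2abcd + b^2c^2)
     && \text{adding and subtracting } 2abcd \\
  &= (ac - bd)^2 + (ad + bc)^2
     && \text{square of a difference, square of a sum}
\end{align*}
% Illustrative check with a = 1, b = 2, c = 3, d = 4:
% (1 + 4)(9 + 16) = 5 \cdot 25 = 125 and (3 - 8)^2 + (4 + 6)^2 = 25 + 100 = 125.
```

Since $a$, $b$, $c$, and $d$ are integers, so are $ac - bd$ and $ad + bc$, which is what makes the final expression a sum of two integer squares.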

39 By formulating the content of the concept with which we began in the formula language of algebra, we were able to reason in the language, by putting equals for equals, to obtain eventually the result that was wanted. That reasoning was, or at least could have been made to be, fully rigorous, every step licensed by a basic or derived rule in the system, but it was also fully contentful. In proving the theorem that the product of two sums of integer squares is also a sum of integer squares (that is, that being a product of two sums of integer squares entails being a sum of integer squares, and vice versa) in elementary algebra one does not abstract from content; instead one expresses content in a mathematically tractable way, in a way enabling reasoning in the system of signs. Of course the beginning student will operate with the language in a fairly mechanical way, much as the beginning chess player operates with chess pieces in the course of a game. It takes time to become literate in the system, to learn to see the meanings in the signs, the contents they express, and this is not surprising. The symbolic language of elementary algebra is quite unlike natural language. It is a specially designed system operating on its own distinctive principles, and it takes practice, and some skill, to learn to use it to its full potential—just as it takes practice, and skill, to become a master chess player, to learn to see the opportunities and hazards in particular configurations of chess pieces.

40 There is a further lesson in this example as well. As Avigad (from whom I borrow the example) notes, "the proof uses only the commutativity and associativity of addition and multiplication, the distributivity of multiplication over addition and subtraction, and the fact that subtraction is an inverse to addition; hence it shows that the theorem is true much more generally in any commutative ring" [Avigad 2006, 115]. This is just what I have called the Hilbertian insight again. This little chain of reasoning is valid, and so we know that it is an instance of a rule that can be applied also in other cases provided that the requisite properties hold in those cases. This chain of reasoning does not show that we are, in this case, not in fact reasoning about products of sums and sums of squares, any more than the fact that the rule governing the inference about Felix applies also in other cases shows that in the given case we are not reasoning about Felix. What it shows is that the reasoning can also be applied in other cases as well.

41 Although in our example the symbolism of elementary algebra enables the display of reasoning from some starting point to the desired conclusion, we have seen that mathematicians in their practice often do not provide such a proof or chain of reasoning but only describe one. In this regard mathematicians' proofs are very different from what an account such as Suppes' would lead one to expect. And there is another respect as well in which mathematicians' actual practice diverges from the standard view, and diverges in a way that seems centrally tied to the issue of mathematical understanding. According to the standard view, theorems are proven on the basis of axioms, not on the basis of definitions; although definitions are sometimes discussed in logic, they play no role in reasoning as it is understood by the mathematical logician. Definitions are taken merely to provide abbreviations. As Avigad puts it, "in standard logic textbooks... definitions are usually treated outside the deductive framework; in other words, one views definienda as meta-theoretic names for the formulas they stand for, with the understanding that in the 'real' formal proof it is actually the definientia that appear" [Avigad 2006, 130].16 What this cannot explain, however, is why it is that, as Tappenden has argued, "mathematicians often set finding the 'right'/'proper'/'correct'/'natural' definition as a research objective, and success – finding the 'proper' definition – can be counted as a significant advance in knowledge" [Tappenden 2008, 256].17 A good definition is often critical to the discovery of an interesting and revealing proof of some theorem; indeed, the right definition can render a result wholly trivial, that is, utterly transparent.18 And when a definition has made a result trivial in this way, mathematicians will tell you that we have fully understood the phenomenon in question. This is very hard to understand if a definition is merely an abbreviation. But again if we consider a case from early modern algebra, we can begin to formulate a more satisfactory alternative.

42 In our first example we began with a mathematical idea expressed in English and formulated the content of that idea in the language of elementary algebra. We had in that case no simple mathematical sign for the idea the content of which is given by the complex sign '$(a^2 + b^2)(c^2 + d^2)$'. The fundamental importance of definitions in mathematics is thus not even hinted at by such an example. A different example will do better. Though it does not involve any explicit appeal to definitions (which would involve us in more difficulties than can be adequately addressed here), our second example does constitutively involve identities of the form $a = b$, where the sign on the left is a simple sign in the language and the sign on the right is a complex sign comprising a whole collection of simple signs. As this example makes clear, the identity is not functioning merely as an abbreviation.

43 We begin with the exponential function $e^x$, where $e$ is understood to be some particular number (that is, it is a name for a number as $\pi$ is): $e$ is the number such that $e^x$ is its own derivative. (It can be shown that such a number must exist.) Given that $e^x$ is its own derivative, it follows that: $e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + \dots + x^n/n! + \dots$

44 And we know that this is true because if we take the derivative of the infinite series on the right we will end up with just the same series again: the derivative of the first term is zero, of the second is the first, of the third is the second, and so on. What we have in this identity is, then, on the left, a simple sign for a particular function, and on the right, a very richly articulated collection of signs, one that clearly displays a certain pattern, which is why we need not worry about the fact that we cannot actually write out the whole infinite sequence. Enough of it is given that we know how it will go, and so could extend it arbitrarily far. This identity is furthermore true so we know that one and the same function is designated both by the simple expression on the left and by the complex expression on the right. The two signs do not, however, express the same Fregean sense, as is evident from the fact that we had to argue that the identity was true. But we can also just see that the senses of the two expressions are different insofar as the one expression, '$e^x$', has no internal arithmetical articulation while the other, the infinite series, has a great deal. Precisely because the expression on the right is complex, one can apply the rules of algebra to the content that is there exhibited in a way that is simply impossible in the case of the simple sign appearing on the left.
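The claim that the series reproduces itself under differentiation can be checked term by term. The following is a sketch which assumes that the series may be differentiated term by term (a standard result for power series, not argued for in the text); the summation indices $n$ and $m$ are introduced here only for the illustration.

```latex
\begin{align*}
\frac{d}{dx}\,\frac{x^n}{n!}
  &= \frac{n\,x^{n-1}}{n!} = \frac{x^{n-1}}{(n-1)!} \qquad (n \ge 1), \\
\frac{d}{dx}\sum_{n=0}^{\infty}\frac{x^n}{n!}
  &= 0 + \sum_{n=1}^{\infty}\frac{x^{n-1}}{(n-1)!}
   = \sum_{m=0}^{\infty}\frac{x^m}{m!} \qquad (m = n - 1).
\end{align*}
```

To conclude that the series is the exponential function itself one also needs that it takes the value 1 at $x = 0$, as $e^x$ does, together with the uniqueness of a function that is its own derivative and has that value at 0.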

45 It can also be shown that, under suitable assumptions, the sine and cosine functions can be expressed as power series, namely, these: $\sin(x) = x - x^3/3! + x^5/5! - x^7/7! + \dots$ and $\cos(x) = 1 - x^2/2! + x^4/4! - x^6/6! + \dots$ And now we are in a position to notice something interesting, namely, that all the terms that appear in the power series expansion of the function $e^x$ occur in one or other of the power series for the sine and cosine functions. Only the signs are different. Let us then everywhere replace $x$ in $e^x = 1 + x + x^2/2! + x^3/3! + x^4/4! + \dots + x^n/n! + \dots$ with $ix$, where $i$ is by definition such that $i^2 = -1$, to give $e^{ix} = 1 + ix + (ix)^2/2! + (ix)^3/3! + (ix)^4/4! + (ix)^5/5! + \dots$ Now we do some standard algebraic manipulations according to the rules to get $e^{ix} = 1 + ix - x^2/2! - ix^3/3! + x^4/4! + ix^5/5! - x^6/6! - \dots$ And rearranging things a bit, by collecting together the terms that contain $i$, gives this: $e^{ix} = (1 - x^2/2! + x^4/4! - x^6/6! + \dots) + i(x - x^3/3! + x^5/5! - \dots)$. And now we can see that the first series is just the one we identified with the cosine function and the second the one we identified with the sine function. Putting equals for equals thus gives us Euler's famous theorem, $e^{ix} = \cos(x) + i\sin(x)$, relating the exponential function and the two trigonometric functions.
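The "standard algebraic manipulations" in this derivation rest on the fact that the powers of $i$ repeat with period four. The following sketch makes that step explicit in summation notation; the indices $n$ and $k$ are introduced here only for the illustration, and the regrouping of terms is assumed to be legitimate because the series converges absolutely.

```latex
\[
i^{2k} = (i^2)^k = (-1)^k, \qquad i^{2k+1} = i \cdot i^{2k} = (-1)^k\, i
\]
\begin{align*}
e^{ix} &= \sum_{n=0}^{\infty} \frac{(ix)^n}{n!}
        = \sum_{k=0}^{\infty} \frac{i^{2k}\, x^{2k}}{(2k)!}
        + \sum_{k=0}^{\infty} \frac{i^{2k+1}\, x^{2k+1}}{(2k+1)!} \\
       &= \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k}}{(2k)!}
        + i \sum_{k=0}^{\infty} \frac{(-1)^k x^{2k+1}}{(2k+1)!} \\
       &= \cos(x) + i \sin(x)
\end{align*}
```

The last step is the putting of equals for equals: each of the two sums is replaced by the simple sign with which it was identified.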

46 As this example illustrates, mathematical identities involving both simple signs and arithmetically complex signs can play a crucial role in a demonstration. Without the simple signs '$e^x$', '$\sin(x)$', and '$\cos(x)$' we would not be able to recognize our result as a result involving the exponential function and the two trigonometric functions; without the power series with which these three functions were identified, we would be unable to reason our way to the desired result. The example thus illustrates the way in which finding a fruitful articulation of some mathematical notion, one that will enable the proof of some result, can constitute a real mathematical advance. Of course the simple signs are eliminable in the demonstration in the sense that replacing all instances of them in the course of reasoning with the relevant complex signs will not affect the validity of the reasoning. But equally obviously doing that destroys the interest of the demonstration insofar as it is no longer possible to see the demonstration as a demonstration involving the exponential and trigonometric functions. What one would end up with in that case would be a trivial identity, one with the same power series on both sides of the identity sign. Euler's theorem is mathematically significant because it reveals a connection between two apparently independent mathematical domains, that of exponential functions and that of trigonometric functions. But we can see this only if we have the identities of the exponential function and of the trigonometric functions with which we began, both the complex signs, the power series, and the simple signs '$e^x$', '$\sin(x)$', and '$\cos(x)$'.

47 By displaying the contents of mathematical notions of antecedent interest in new and mathematically tractable ways, identities enable demonstrations involving those notions, and they can do so because, as we have just seen, in the identities those contents are displayed in complex signs in ways that enable reasoning. In these identities, both the simple sign and the complex sign designate or mean one and the same thing; they have the same Bedeutung.

48 But, again, they do not express the same sense; they have quite different roles to play in the process of reasoning. Different things follow from the simple and the complex signs precisely because the one is a simple sign and the other is a complex sign with a lot of internal articulation that can be utilized in the proof. And though I will not try to show it here, just the same point can be applied also to definitions in actual mathematical practice. Definitions in mathematics are not mere abbreviations. They provide an articulation of the content of some mathematical notion that is also named in the definiendum, content that can be critical to the chain of reasoning required by the proof.

49 I have suggested that we think of axioms, for instance, those of elementary algebra, as ascriptions of basic properties to arithmetical operations such as addition and multiplication. Theorems derived from those axioms ascribe further properties to those operations, and both those axioms and the theorems derived from them can then be treated as basic and derived rules governing valid inferences. We have looked at two sorts of cases, one that did not involve any identities of simple and complex signs (our little demonstration of the theorem that if two numbers are both sums of integer squares then their product is also a sum of integer squares) and another in which such identities have an essential role to play, namely, in the demonstration of Euler's theorem. In both sorts of cases the reasoning is, in principle, fully rigorous; every step in the reasoning is justified by some antecedently specified (or specifiable) rule. But the reasoning is also fully contentful. The formula language of arithmetic and algebra enables the expression of a content in a way that is mathematically tractable, that is, in a way enabling one to reason (on the basis of content) in the system of signs. And because it does, we can begin at least to understand, if only for this case, how it is that a proof can advance our mathematical understanding. Because the dilemma with which we began—according to which a proof cannot be both formal, that is, fully rigorous, and also contentful (hence relevant to mathematical understanding)—does not arise for this case, we can see how a fully rigorous chain of reasoning can be fully compatible with reasoning on the basis of mathematical ideas. Our examples show that it is possible that by proving something a mathematician might gain insight and understanding.

4 Looking forward

50 Over the course of the nineteenth century the practice of constructive algebraic problem solving, that we see, for instance, in Euler's work, began to give way, in the work of Riemann, Dedekind, and others, to a new sort of mathematical practice, the practice of reasoning deductively from defined concepts, Denken in Begriffen, thinking in concepts.19 Because there was no mathematical language, no system of written signs within which to do this work, proofs had to be reported (as they are still today) in natural language together with whatever help could be provided by the formula language of arithmetic and algebra. Taking our cue from that formula language, it is, however, possible to say what such a system of signs would have to do: it would have to formulate the inferentially articulated contents of the concepts of mathematical interest in a way that enables mathematical reasoning. Our logical languages were not designed to do this. Although one can formulate basic and derived laws of logic in a logical language such as that Suppes introduces in his little logic textbook, and can reason rigorously in the system, one cannot exhibit the contents of concepts in such a language. The language was not designed for that expressive purpose, and nor even is it obvious how this might be done, what sort of language might be needed instead.

51 It seems in a way obvious that an expression of algebra such as '$a^2 + b^2$' formulates the content of the mathematical idea of a sum of two integer squares, that it displays what it is to be such a sum. But as obvious and natural as this is to us, it is far from clear how the language is functioning to enable such a display. And because this is not clear such an expression does not immediately suggest to us what it would take to display the contents of concepts in a way that would enable not now algebraic computations but instead deductive reasoning, that is, the sort of reasoning mathematicians began to employ over the course of the nineteenth century. But we do know at least this much. Because the reasoning involved in this case is deductive rather than algebraic, the expression of the contents of our concepts would need to use signs for logical relations in much the way that, in elementary algebra, we use signs for arithmetical operations. As in algebra we introduce a few primitive signs for the basic arithmetical operations, so we would need to introduce a few primitive signs for the basic logical relations, say, the conditional, negation, and identity, as well as a sign usable in the expression of generality of content. We would then need to formulate axioms, corresponding to the axioms of algebra, that would set out the basic properties of those primitive logical relations, and to derive some theorems from those axioms. As in algebra, the axioms and derived theorems would then function as rules governing inferences in the system. Definitions of concepts of antecedent mathematical interest could then be formulated in the system and those definitions could provide the starting point for proofs much as identities formed the starting point for our little demonstration of Euler's theorem. That is what we would need … and that, I would argue (though of course I cannot do so here), is what is provided for us already in Frege's 1879 Begriffsschrift, his concept-script. Frege explicitly and self-consciously designed his strange two-dimensional notation to express mathematical content in a way that would enable rigorous reasoning on the basis of that content. And in Part III of his 1879 logic, he proves a theorem on the basis of explicit definitions that aims to illustrate precisely how a strictly deductive proof in his language can be ampliative, that is, constitute a real extension of our knowledge. If Frege is right, his language is just what we need to display rather than merely to describe both the contents of mathematical concepts and the chains of reasoning that take us from definitions of concepts to theorems revealing logical relations among those defined concepts. Of course, we will not be able even to begin to understand how this could be so, so long as we continue to read his notation as a mere variant of our own.20

Conclusion

52 Proofs in mathematics can take many forms, and can contribute to mathematical understanding in a very wide variety of ways. What I have said here only touches on one point, but it is a point that is, I think, central. In order to understand the role of proving theorems in achieving mathematical understanding, we need better to understand what a mathematical proof is, and in particular, the role that specially devised systems of written marks have historically played in the mathematical practice of proof, how they have served to formulate content in ways that facilitate rigorous mathematical reasoning. Standard mathematical logic is not such a system of signs; nor, again, was it designed to be. Rather than formulating content as it matters to inference in mathematical practice (as Frege's concept-script was designed to do, and I would argue succeeds in doing) standard notations of logic introduce abbreviations, special signs to take the place of words, to enable the perspicuous display of the logical forms of sentences of natural language. Little wonder, then, that a mathematical proof fully formalized in standard notation is unintelligible.

53 Lacking any adequate system of signs within which to reason deductively from the contents of defined concepts, mathematicians in their practice today—at least in cases in which the reasoning is deductive from concepts rather than algebraic—can only report the course of their reasoning; they describe how it goes rather than showing how it goes in a specially devised system of written marks. And as anyone who has worked on results in algebra that were discovered before the development of an adequate symbolism for (elementary) algebra knows, it is often much harder to understand a proof that is only described than it is to follow one in which the reasoning is displayed.21 But there is another problem as well: where the reasoning is not displayed in a system of written marks but only reported it is much harder to come to understand how the reasoning works as reasoning. We will not understand how a mathematical proof can yield mathematical understanding until we understand much better than we currently do how it is that one reasons on the basis of content in mathematics. And to understand that we must concern ourselves with how content is formulated in specially devised systems of signs in actual mathematical practice—in Euclidean diagrams, in algebraic equations, and (I would argue) in Frege's Begriffsschrift formulae.22

BIBLIOGRAPHY

AVIGAD, JEREMY 2006 Mathematical method and proof, , 153, 105-159.

AZZOUNI, JODY 2006 Tracking Reason: Proof, Consequence, and Truth, Oxford: Oxford University Press.

BOSTOCK, DAVID 1997 Intermediate Logic, Oxford: Oxford University Press.

FREGE, GOTTLOB 1884 Die Grundlagen der Arithmetik. Eine logisch-mathematische Untersuchung über den Begriff der Zahl, Breslau: W. Koebner, cited according to the English translation by J. L. Austin: The Foundations of Arithmetic, Evanston, IL: Northwestern University Press, 1980. 1976 Nachgelassene Schriften und wissenschaftlicher Briefwechsel, vol. 2, Hamburg: Felix Meiner, cited according to the English translation by Hans Kaal: Philosophical and Mathematical Correspondence, G. Gabriel et al. (eds.), Chicago: University of Chicago Press, 1980.

GOETHE, NORMA B. & FRIEND, MICHÈLE 2010 Confronting ideals of proof with the ways of proving of the research mathematician, Studia Logica, 96, 277-292.

GOLDFARB, WARREN 1979 Logic in the twenties: The nature of the quantifier, The Journal of Symbolic Logic, 44, 351-368.

HODGES, WILFRID 1985/6 Truth in a structure, in Proceedings of the , 86, 135-141.

LAUGWITZ, DETLEF 1995 Bernhard Riemann 1826-1866: Wendepunkte in der Auffassung der Mathematik, Basel: Birkhäuser, cited according to the English translation by Abe Shenitzer: Bernhard Riemann 1826-1866: Turning Points in the Conception of Mathematics, Basel, Berlin, and Boston: Birkhäuser, 1999.

MACBETH, DANIELLE 2005 Frege's Logic, Cambridge: Harvard University Press. s. d. Seeing how it goes: Paper-and-pencil reasoning in mathematical practice, Philosophia Mathematica.

MANDERS, KENNETH L. 1987 Logic and conceptual relationships in mathematics, in Logic Colloquium '85, North Holland: Elsevier Science, 193-211.

NAGEL, ERNEST 1939 The formation of modern conceptions of formal logic in the development of geometry, Osiris, 7, 142-224, cited according to the reprinting in Teleology Revisited and Other Essays in the Philosophy and History of Science, New York: Press, 1979.

RAV, YEHUDA 1999 Why do we prove theorems?, Philosophia Mathematica, 7, 5-41. 2007 A critique of a formalist-mechanist version of the justification of arguments in mathematicians' proof practices, Philosophia Mathematica, 15, 291-320.

ROTA, GIAN-CARLO 1997 The phenomenology of proof, Synthese, 111, 183-196.

SPIVAK, MICHAEL 1965 on Manifolds, Reading, Mass.: Addison-Wesley.

SUPPES, PATRICK 1957 Introduction to Logic, New York: Van Nostrand Reinhold, reissued by Mineola, New York: Dover, 1999.

TAPPENDEN, JAMIE 2008 Mathematical concepts and definitions, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press.

THURSTON, WILLIAM 1994 On proof and progress in mathematics, Bulletin of the American Mathematical Society, 30, 161-177, cited according to the reprinting in 18 Unconventional Essays on the Nature of Mathematics, Reuben Hersh (ed.), Springer, 2006.

VERVLOESEM, KOEN 2010 Mathematical concepts in computer proofs, in Philosophical Perspectives on Mathematical Practice, edited by VAN KERKHOVE, BART, DE VUYST, JONAS, & VAN BENDEGEM, JEAN PAUL, London: College Publications.

WIEDIJK, FREEK 2008 Formal proof – getting started, Notices of the AMS, 55(11), 1408-1414.

NOTES

1. There are, for example, over two hundred proofs of the law of quadratic reciprocity, all aimed at a better understanding of this astonishing law. See [Tappenden 2008, 260]. Gian-Carlo Rota provides the beginnings of an explanation in [Rota 1997]. According to him, the first proof of a theorem is often "needlessly complicated". "It takes a long time, ranging from a few decades to entire centuries, before the facts that are hidden in the first proof are understood... This gradual bringing out of the significance of a new discovery takes the appearance of a succession of proofs, each one simpler than the preceding. New and simpler versions of a theorem will stop appearing when the facts are finally understood" [Rota 1997, 192-193].

2. This question is different from, and I will suggest prior to, questions regarding the nature of mathematical understanding and of the sorts of proofs that provide it.

3. This is not the only conception of proof and logic in circulation. It nonetheless is very standard and can serve as needed here to enable the articulation of what I take to be the deep problem we face regarding the relationship between proof and understanding in mathematical practice.

4. See especially Chapter 7. See also Yehuda Rav's critique of the view in [Rav 2007].

5. It should not be inferred that the distinction has a sharp boundary; there are intermediate and mixed cases. What matters for our purposes is that there seems to be a clear contrast in actual mathematical practice between the descriptions of reasoning mathematicians actually provide and the chains of reasoning themselves. The example aims to illustrate this contrast with a clear and simple case.

6. More exactly, we have a trace left after the paper-and-pencil activity of reasoning has been completed, one that can then be gone through again, either by the original agent or by another calculator.

7. The list is due to Avigad [Avigad 2006, 131].

8. The list is due to Freek Wiedijk in [Wiedijk 2008, 1408].

9. See also [Goethe & Friend 2010].

10. Of course, creative mathematicians use a variety of non-deductive methods in formulating plausible conjectures and discovering proofs, that is, in what Reichenbach in Chapter One of Experience and Prediction (1938) called the context of discovery. My remark applies only to what he calls the context of justification, that is, to the rational reconstruction of the course of a proof that the mathematician communicates to others. See www.ditext.com/reich/reichc.html.

11. Koen Vervloesem also makes this point in his discussion of the difference between human and computer proofs [Vervloesem 2010].

12. Warren Goldfarb rehearses some of the history [Goldfarb 1979].

13. The reading to be outlined was developed in comments by an anonymous referee of an earlier draft of this essay.

14. It is perhaps at least worth noting that on this reading the letters 'a', 'b', and 'c' do not function to refer to objects but instead serve, together with other signs, to form an expression for a second-level concept.

15. Of course in the order of knowing one first learns of particular pairs of numbers, say, seven and five, that 7 + 5 = 5 + 7. The claim is that nonetheless in the order of being, it is the fact that addition is commutative that is prior.

16. This is just what we find in [Suppes 1957, Chapter 8].

17. Compare Frege's remark, in the introduction to The Foundations of Arithmetic, that "often it is only after immense intellectual effort, which may have continued over centuries, that humanity at last succeeds in achieving knowledge of a concept in its pure form, in stripping off the irrelevant accretions which veil it from the eyes of the mind" [Frege 1884, vii].

18. Tappenden quotes Michael Spivak [Spivak 1965, 104]: "Stokes' theorem shares three important attributes with many fully evolved major theorems: a) It is trivial, b) It is trivial because the terms appearing in it have been properly defined, c) It has significant consequences" [Tappenden 2008, 257].

19. I owe the phrase "Denken in Begriffen" to Detlef Laugwitz in [Laugwitz 1995].

20. In [Macbeth 2005], I outline the reading of Frege's notation that I would argue is needed here. See also [Macbeth s. d.].

21. This is often the case, but not always. The ancient proof that there is no largest prime is reported (for instance, in Euclid's Elements), but it is nonetheless immediately intelligible to almost anyone.

22. An earlier version of this paper was presented at the conference, The Notion of Form in 19th and Early 20th Century Logic and Mathematics, held at the Vrije Universiteit Amsterdam in January 2011, and a much abridged version at an affiliated session, sponsored by the Association for the Philosophy of Mathematical Practice, of the 14th Congress of Logic, Methodology and Philosophy of Science, July 19-26, 2011. I thank participants at both events for very valuable comments and feedback, and Arianna Betti, Marco Panza, Karine Chemla, Ken Manders, and Jamie Tappenden, in particular, for very helpful comments. The current version also responds to concerns and criticisms of two anonymous referees.

ABSTRACTS

The mathematical practice of proving theorems seems clearly to result in improved mathematical understanding; the aim of proving, and reproving, theorems in mathematics is better understanding. And yet, as is by now well known, fully formalized mathematical proofs are usually unintelligible; they do not advance our mathematical understanding. How, then, should we understand the relationship between proving theorems and advancing our mathematical understanding? I argue that, first, we need a different notion of (formal) proof, one that does not take form to be opposed to content. Eighteenth century algebraic proof provides an example of fully rigorous and fully contentful mathematical proof, and in this case one can see how mathematical proof might provide mathematical understanding. The task is to extrapolate the insights gained from this case to the sort of deductive reasoning from concepts that has been the norm in mathematical practice since the nineteenth century.

Prouver des théorèmes est une pratique mathématique qui semble clairement améliorer notre compréhension mathématique. Ainsi, prouver et reprouver des théorèmes en mathématiques vise à apporter une meilleure compréhension. Cependant, comme il est bien connu, les preuves mathématiques totalement formalisées sont habituellement inintelligibles et, à ce titre, ne contribuent pas à notre compréhension mathématique. Comment, alors, comprendre la relation entre prouver des théorèmes et améliorer notre compréhension mathématique ? J'avance ici que nous avons d'abord besoin d'une notion différente de preuve (formelle), qui ne tienne pas la forme pour opposée au contenu. La pratique de la preuve algébrique au XVIIIe siècle fournit un exemple de preuve à la fois pleinement rigoureuse et porteuse de contenu, et dans ce cas, il est possible de voir comment une preuve mathématique apporte une compréhension mathématique. Il s'agira alors de mobiliser les enseignements de cet exemple pour étudier le type de raisonnement déductif à partir de concepts, qui a constitué la norme dans la pratique mathématique depuis le XIXe siècle.

AUTHOR

DANIELLE MACBETH Haverford College (USA)

The Role of Representations in Mathematical Reasoning1

Jessica Carter

Introduction

1 The use of representations in mathematical reasoning is the focus of a number of papers and books in both the philosophy of mathematics and in mathematics education. Ken Manders has long stressed the use of artefacts in mathematics, see for example his [Manders 2008]. More recently the topic has been discussed by Grosholz in her book Representation and Productive Ambiguity in Mathematics and the Sciences [Grosholz 2007], where it is claimed that ambiguous representation is a prerequisite for the fruitfulness of mathematics as well as natural sciences. In [Duval 2006] it is argued that the ability to shift between different representations is fundamental for mathematical understanding. This paper seeks to clarify the role of representations in mathematical proofs. In [Carter 2010] a proof is described as using an interplay between different kinds of representations. By representations, I mean diagrams, symbols, definitions and expressions that can be written down. One role that representations play is to enable us to break down proofs into manageable parts and thus to focus on certain details of a proof, by removing irrelevant information. [Carter 2010, 3]

2 Here I wish to show just how representations enable us to break a proof down, and explain in more detail the role that icons and indices play in this process.

3 My claim is simply that one role of iconic representations, i.e., representations that resemble what they represent, is to ensure that there is a link between the obtained result and the original expression. Use of indices makes it possible to reinsert the result obtained, by dealing with the manageable parts, in the appropriate place. A different aspect that will be addressed concerns what could be denoted as 'compound definitions'. By those, I mean definitions consisting of more than one component. In the case considered, we are dealing with matrices with certain entries. So one component of the definition tells us we are dealing with matrices; the second component concerns the entries of the matrices. The point here is that at a particular step of the proof one need only focus on one component, thus removing irrelevant information.2

4 This process will be illustrated by an example from current mathematical practice, more precisely a result from Free Probability Theory. What is found is the value of an expression. In this my approach differs from previous work in this field in two respects. First, most examples in the literature are historical examples. Second, as many authors are concerned with visualisation, many examples concern geometry and geometric diagrams. The example presented here concerns an algebraic expression.

5 Before turning to the case study, I will give a brief introduction to what will be needed from Peirce's theory of signs. Note that my point is not to present, or argue for, a reading that Peirce would necessarily have endorsed. I present an interpretation of his theory that I believe is useful when addressing certain aspects of mathematical reasoning.3

1 Semiotics

6 The fundamental concept is that of a sign. A sign, or a representation, is something that stands for something else. According to Peirce, for something to act as a sign more is required; a sign is not a sign unless it is interpreted as a sign. A sign, therefore, is a triad (as everything else is understood in terms of three):

A sign, or representamen, is something which stands to somebody for something in some respect or capacity. It addresses somebody, that is, creates in the mind of that person an equivalent sign, or perhaps a more developed sign. That sign which it creates I call the interpretant of the first sign. The sign stands for something, its object. It stands for that object, not in all respects, but in reference to a sort of idea, which I have sometimes called the ground of the representamen. [Peirce 1965, 2.228, italics in original]4

7 What is notable in this description is that, in addition to the sign and the object that it represents, there is a third part, the interpretant. Thus a sign can be pictured as follows:

8 Given this triad one may ask, for example, about the nature of the sign in question and about relations between sign and object.5 What is relevant for our purpose is the relation between the sign and the object; one may ask: In what capacity does the sign represent a given entity? According to Peirce's categories there are three possibilities, named icon, index and symbol. For a first description, icons represent by virtue of likeness. A picture is an example. An index represents by being actually connected to what it represents. The movement of the leaves and branches of the tree outside my window is a sign to me that it will be windy later when I cycle home. Finally a symbol represents by the assignment of a rule that determines its interpretant. Words are symbols, as well as, for example, π.

9 Peirce makes further subdivisions. Icons may in turn represent in three ways, depending on whether the sign shares certain qualities with the represented; or it shows some actual relations; or by virtue of "parallelism in something else" [Peirce 1965, 2.227]. These different kinds are denoted images, diagrams and metaphors.

10 Indices represent by being connected to the represented, either by a real physical (causal) connection, or by a purposeful act of connecting the signs. As an example of an index in the first sense, Peirce refers to a weathercock that is taken as a sign of the direction of the wind. For the second sense, Peirce mentions the geometers' use of letters in diagrams, in the sense of signposts, as indices.

11 Symbols obviously play a major role in mathematics as most expressions are built up from symbols. This is also the case with the expression that we will consider later. Here we will not be much concerned with the role that these play. I mention one curiosity:6 usually the symbol π is taken to denote the real number that is the ratio of the circumference of a circle to its diameter. In the case study we shall deal with, π denotes a permutation. The rationale for this is that Greek letters are often used to denote mathematical entities and π is the Greek 'p', for permutation. The letters that we shall consider are the outcome of a "purposeful act of connecting a sign to its object", and they will therefore act as indices:

Geometricians mark letters against the different parts of their diagrams and then use these letters to indicate those parts. [Peirce 1965, 2.285]

12 In this sense, Peirce comments that "indices are absolutely indispensable in mathematics" [Peirce 1965, 3.305].

13 In addition Peirce explicitly highlighted the role of icons ("likenesses") in mathematics:

The reasoning of mathematicians will be found to turn chiefly upon the use of likenesses, which are the very hinges of the gates of their science. The utility of likenesses to mathematicians consists in their suggesting in a very precise way, new aspects of supposed states of things... [Peirce 1965, 2.281]

14 In the case study we shall deal with, there will be two senses of iconicity at play.

15 Central to Peirce's conception of reasoning in mathematics is that all such reasoning is diagrammatic, and as such iconic. A mathematical theorem will contain certain hypotheses. By fixing reference with certain indices, it is possible to produce a diagram that displays the relations of these referents. In statements concerning basic geometry, the diagram could be a geometric diagram as we know them from Euclid. In other parts of mathematics diagrams take a different form. One of Peirce's favourite examples concerns the theorem that, in modern terms, the cardinality of a set is strictly less than the cardinality of its power set. Denoting the set by C, it is possible to express the relations in a diagram as follows: There is no 1-1 correspondence between the members of C and P(C). This is a statement about the relation between a set, C, and the set of subsets of C, i.e., a diagram. It is thus clear that Peirce employs the term 'diagram' in a much wider sense than usual. Even spoken language can be diagrammatic:

the wordy and loose deductions of the philosophers may make use rather of auditory diagrams, if I may be allowed the expression, than [of] visual ones. [Peirce 1976, 164]

16 Generally a proof proceeds either by seeing the conclusion of the theorem directly in the diagram, or by performing experiments on the diagram which will in the end lead to the conclusion. In the above example, the crucial step in the experiment consists in first assuming that the theorem is false, saying that there is a 1-1 correspondence f : C → P(C), and then forming the set M = {s ∈ C : s ∉ f(s)}.
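The "experiment" on the diagram can be tried out on a small finite set. The following is a minimal sketch in Python, assuming the usual diagonal set of those elements s that do not belong to their own image f(s); the function names `diagonal_set` and `all_subsets` and the three-element set are illustrative choices, not part of the source. It runs through every map f from C to its power set and checks that the diagonal set is never among the values of f.

```python
from itertools import product


def diagonal_set(C, f):
    """Return M = {s in C : s not in f(s)} for a map f from C to subsets of C."""
    return frozenset(s for s in C if s not in f[s])


def all_subsets(C):
    """List every subset of the finite set C as a frozenset."""
    items = sorted(C)
    return [
        frozenset(x for x, keep in zip(items, mask) if keep)
        for mask in product([False, True], repeat=len(items))
    ]


C = {0, 1, 2}                      # hypothetical small set
subsets = all_subsets(C)           # the power set P(C)

# Try every possible map f : C -> P(C) and record whether the
# diagonal set M ever occurs as a value f(s).
missed_every_time = True
for images in product(subsets, repeat=len(C)):
    f = dict(zip(sorted(C), images))
    if diagonal_set(C, f) in f.values():
        missed_every_time = False

print(len(subsets), missed_every_time)
```

Since no map reaches its own diagonal set, no map from C to P(C) is onto, which is the finite shadow of the theorem discussed above.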

17 In our case study we see examples of diagrammatic reasoning in this sense. In addition, there will also be examples of reasoning using iconicity in the sense of a parallelism in something else, i.e., as a metaphor:

Particularly deserving of notice are icons in which the likeness is aided by conventional rules. Thus, an algebraic formula is an icon, rendered such by the rules of commutation, association, and distribution of the symbols. [Peirce 1965, 2.279]

18 To illustrate this consider the following identity: $(a + b)^2 = a^2 + 2ab + b^2$. To see that this holds, we use (conventional) rules that hold for multiplication and sum of numbers, and what it means for something to be "squared", i.e., to be multiplied by itself. We also need to know that ab = ba. Most notably, we need to know, or be acquainted with, what the letters a and b are taken to represent, so they function as indices. If they represent numbers, as in this case, the commutative rule holds; if they represent something else, this rule need not hold. Since, with the aid of these conventional rules, we obtain new insight, Peirce stresses that iconicity is at stake:

This capacity of revealing unexpected truth is precisely that wherein the utility of algebraical formulae consists, so that the iconic character is the prevailing one. [Peirce 1965, 2.279]
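The dependence of the identity on what the letters stand for can be made concrete. The sketch below, in Python, is only an illustration under assumed inputs: the numbers 3 and 5 and the two 2 × 2 matrices `A` and `B` are hypothetical examples chosen because the matrices do not commute. For numbers the binomial identity holds; for these matrices it fails, while the form that keeps AB and BA apart still holds.

```python
def add(X, Y):
    """Entrywise sum of two 2 x 2 matrices given as nested lists."""
    return [[X[i][j] + Y[i][j] for j in range(2)] for i in range(2)]


def mul(X, Y):
    """Product of two 2 x 2 matrices given as nested lists."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


# Letters read as numbers: ab = ba, so the identity holds.
a, b = 3, 5
numbers_ok = (a + b) ** 2 == a**2 + 2 * a * b + b**2

# Letters read as matrices that do not commute (hypothetical example).
A = [[0, 1], [0, 0]]
B = [[0, 0], [1, 0]]
AB, BA = mul(A, B), mul(B, A)

lhs = mul(add(A, B), add(A, B))                          # (A + B)^2
naive = add(add(mul(A, A), add(AB, AB)), mul(B, B))      # A^2 + 2AB + B^2
careful = add(add(mul(A, A), add(AB, BA)), mul(B, B))    # A^2 + AB + BA + B^2

print(numbers_ok, lhs == naive, lhs == careful)
```

The same string of symbols is thus a reliable icon only relative to the rules that hold for the objects the letters index.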

19 Before we turn to the main example, I shall illustrate some of the main ideas with a simpler example. The point is to simplify the expression $\sin(\tan^{-1}(x))$. The first step is to represent $\tan^{-1}(x)$ geometrically, see the picture below. To simplify, we assume x > 0.

20 Denote by θ the angle $\tan^{-1}(x)$. In this diagram, we can also identify $\sin(\tan^{-1}(x)) = \sin(\theta)$. To simplify, we denote this line segment a. From the diagram we see that we have two similar triangles, BAC and BED. Using the fact that the ratios of corresponding sides of such triangles are identical, we can write the following equation:

21 Summing up: In the first step we represent part of the expression in a geometric diagram, i.e., we display relations of the theorem in a diagram. It is also possible to identify the sought-for expression, $\sin(\tan^{-1}(x))$, in this diagram as the length of a side in a certain triangle. By lettering these sides appropriately, using indices, we can express a relation between the known quantity x and the unknown quantities a, b, and thus produce a new diagram. This diagram is an equation involving real numbers. Using rules for how to operate on these, we arrive at the final expression for a. Since a was the index representing the desired expression, we can reinsert this and obtain the result:

22 Note the use of iconicity, both in the sense of producing certain diagrams, and as using conventional rules. In this example we also see that each step has a certain focus. When dealing with the first diagram, the focus is first on similar triangles. Next we focus on the right angled triangle. In the final step(s), we solve an equation involving real numbers. This is one aspect of what I refer to as "using representations in order to break a proof down into manageable parts". Finally note the extensive use of indices. I explicitly mentioned this with respect to the 'a' that is used to denote the expression.
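A quick numerical check can accompany the diagrammatic argument. The sketch below, in Python, assumes the standard closed form x / √(1 + x²) for sin(tan⁻¹(x)) with x > 0, which is what one gets from a right-angled triangle with legs 1 and x; this closed form is an assumption supplied here for illustration, and the function name `sin_arctan` and the sample values are hypothetical.

```python
import math


def sin_arctan(x):
    """Assumed closed form for sin(arctan(x)): opposite side over hypotenuse
    in a right-angled triangle with legs 1 and x."""
    return x / math.sqrt(1.0 + x * x)


# Compare the closed form with a direct evaluation for a few positive x.
samples = (0.1, 1.0, 2.5, 40.0)
agreement = all(
    math.isclose(math.sin(math.atan(x)), sin_arctan(x), rel_tol=1e-12)
    for x in samples
)
print(agreement)
```

The letter a of the text plays the role of the return value here: it names the unknown length, is solved for, and is then put back in place of the original expression.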

2 Case from Free Probability Theory

23 We will consider U. Haagerup's and S. Thorbjørnsen's paper [Haagerup & Thorbjørnsen 1999]. In this paper the following expression is studied

24 A Gaussian random matrix is a matrix whose entries are complex valued random variables of the form $f_{ij} + i\,g_{ij}$. The $B_i$'s are thus of the following form:

25 The entries, $f_{ij}$, $g_{ij}$, form families of independent and identically distributed Gaussian random variables.

26 This expression is first reduced to

27 where

• $B_1, \ldots, B_p$ are independent matrices of the same type as B.

• The π appearing in the subindices is a permutation on the set {1, 2, …, p}.

28 It is the following expression that will be the topic of our investigation:

29 In this particular case we take the mean of the entries of the matrices to be 0 and their variance 1, i.e., $f_{ij}, g_{ij} \sim N(0,1)$. It is shown that the expression is equal to the following, where π̂ is a permutation obtained from π, and k(π̂) and l(π̂) denote certain natural numbers that will be explained below:

$m^{k(\hat{\pi})} \cdot n^{l(\hat{\pi})}$ (2)

30 The proof is done in a number of steps that will be presented here, demonstrating how it is broken down into manageable parts. The focus is on the overall steps, keeping the mathematical details to a minimum. The proof can be construed as consisting of two major parts (as illustrated in picture 7). In one, the permutation π is reworked into a different permutation, π̂. In the second, the focus is on the expression (1). We start with this expression.

1. In the first step the trace of the product of the matrices is found. In order to obtain this, the letters, B, are taken to represent ordinary matrices. We then use rules for multiplication of matrices and take the trace of the resulting matrix.7 Letting b(u, v, k) designate the u, v'th entry of the matrix $B_k$, fixing reference by indices, the following expression (3) is obtained:

31 Note the indices of $u_i$ are divided into even and odd parts, and i runs from 1 to 2p. Odd indices keep track of columns and even ones keep track of the rows of the matrices. Recall that the entries b(u,v,k) are (complex valued) Gaussian distributed random variables. They are distributed so that the value of these products, if not zero, is 1. This means that we can simply count how many of these terms in the sum are not zero. We may make an initial guess about the value of the expression simply by looking at the number of possible choices of the $u_i$'s. If i is odd, $u_i$ can take all values between 1 and n. If i is even, $u_i$ can take any value between 1 and m. Since there are p of each, there are $n^p \cdot m^p$ different products. To find the value of the expression, we need to work out how many of these products are not zero.
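Step 1 rests on the elementary fact that the trace of a product of matrices can be written as a sum, over all choices of indices, of products of entries. The following is a minimal sketch in Python, assuming small integer matrices in place of the random matrices of the text; the matrices `X` and `Y` and the function names are illustrative and do not come from the source. The shapes alternate between n × m and m × n, so that odd-numbered indices run over n values and even-numbered indices over m values, as described above.

```python
from itertools import product


def mat_mul(X, Y):
    """Product of two matrices given as nested lists."""
    return [
        [sum(X[i][k] * Y[k][j] for k in range(len(Y))) for j in range(len(Y[0]))]
        for i in range(len(X))
    ]


def trace_of_product(mats):
    """Multiply the matrices in order and take the trace of the result."""
    result = mats[0]
    for M in mats[1:]:
        result = mat_mul(result, M)
    return sum(result[i][i] for i in range(len(result)))


def trace_as_index_sum(mats):
    """Sum over all index tuples (u_1, ..., u_K) of the products
    M_1[u_1][u_2] * M_2[u_2][u_3] * ... * M_K[u_K][u_1]."""
    ranges = [range(len(M)) for M in mats]   # u_k runs over the rows of M_k
    total = 0
    for u in product(*ranges):
        term = 1
        for k, M in enumerate(mats):
            term *= M[u[k]][u[(k + 1) % len(mats)]]
        total += term
    return total


# Hypothetical test matrices with n = 2 and m = 3: X is n x m, Y is m x n.
X = [[1, 2, 0], [-1, 3, 4]]
Y = [[2, 1], [0, -2], [5, 1]]
mats = [X, Y, X, Y]                          # 2p = 4 factors, so p = 2

number_of_terms = 1
for M in mats:
    number_of_terms *= len(M)                # n * m * n * m = n^p * m^p

print(trace_of_product(mats) == trace_as_index_sum(mats), number_of_terms)
```

With p = 2, n = 2 and m = 3 the index sum has 2² · 3² = 36 terms, in line with the count of products given above.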

2. Next the entries b(u,v,k) are taken to designate complex valued random variables. The condition that the expectation of such a product is different from zero is that the product must consist of pairs of entries with their conjugate.8 This gives the following condition:

$b(u_{2i}, u_{2i+1}, \pi(i)) = b(u_{2\pi(i)}, u_{2\pi(i)-1}, \pi(i))$, i ∈ {1,2,…,p}, or

$u_{2i} = u_{2\pi(i)}$ and $u_{2i+1} = u_{2\pi(i)-1}$

3. By calculations on these two conditions and use of the definition of π̂ one obtains $u_j = u_{\hat{\pi}(j)+1}$. That is, whenever a product is non-zero, this equality holds. The question is then how many groups of different $u_j$ for j ∈ {1,2,…,2p} there are. Mathematically this corresponds to determining the number of equivalence classes under the equivalence relation $j \sim_{\hat{\pi}} \hat{\pi}(j)+1$ on the set {1,2,…,2p}.

32 Note that there is also a notational shift where j ∈ {1,2,…,2p} replaces 2i+1 and 2i.

33 It is seen that, through these steps, there is a gradual reduction of the original expression (1). The first expression concerns the expectation of a product of Gaussian random variables. In the reduced expression, $u_j = u_{\hat{\pi}(j)+1}$, $u_j$ as well as the indices j represent natural numbers. More precisely they represent variables that take as values certain natural numbers. (If j is odd $u_j$ can take values from 1 to n, if even $u_j$ can take values from 1 to m; j can take values from 1 to 2p.) The π in the indices represents a permutation. The task of finding the value of the expression is finally reduced to considering an identity concerning natural numbers. This is one of the simpler details referred to in the above description of the role of representations in proofs. Note also that each step has a certain focus. In step one, the focus is on properties of matrices, forgetting that their entries are random variables. In the second step, the focus shifts to the entries, the complex valued random variables, and as already noticed, in the third step, we deal mainly with natural numbers.

34 Turning to the role of icons, we see that, at first, part of the theorem (2) is displayed in the expression (1). In step 1 above we use iconicity in the sense of a parallelism in something else. We take the indices, $B_i$, to denote matrices and use rules that tell us how to multiply these in order to obtain the product of the random variables.

35 At step 2 iconicity is used in the sense of producing a (new) diagram that depicts certain relations that will tell us something about the solution of the problem. Looking at expression (3), we see that it is a sum. The setup is such that each summand, if not zero, is one. So we need to count the number of terms that are not zero. The desired relations are obtained by asking for the condition, that is, which relations must hold between the random variables, so that the product is not zero. The formulation of this condition becomes the new diagram. The rest of the proof works out these conditions using an interplay between these two roles of icons, until the basic step concerning natural numbers is arrived at.

36 The second part of the proof concerns the permutation π : {1,2,…,p} → {1,2,…,p}, which is transformed into a new permutation π̂ on {1,2,…,2p}. This also reflects the change of sub-indices in the expression above. One way to arrive at the expression for π̂ is to write the product of the matrices without the permutation:

$C_1^* C_2 \cdots C_{2p-1}^* C_{2p}$

37 But we still wish to keep track of which matrices are identical, so we compare the above expression with the original one:

$B_1^* B_{\pi(1)} \cdots B_p^* B_{\pi(p)}$

38 Before illustrating the general idea we consider an example, letting p = 4 and considering the permutation π = (12)(34)9:

$C_1^* C_2 C_3^* C_4 C_5^* C_6 C_7^* C_8$ is compared to $B_1^* B_2 B_2^* B_1 B_3^* B_4 B_4^* B_3$

39 We see that the first and fourth B have the same index 1, which means they are identical; similarly the matrices in the second and third place are identical, continuing until we have:

$C_1 = C_4$, $C_2 = C_3$, $C_5 = C_8$, $C_6 = C_7$

40 This particular example is shown in picture 3 below.

41 In general comparing the two expressions one obtains the following identities:

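$C_{2i-1} = B_i = B_{\pi(\pi^{-1}(i))} = C_{2\pi^{-1}(i)}$ and

$C_{2i} = B_{\pi(i)} = C_{2\pi(i)-1}$

42 Calculations on these equations give the permutation π̂:

$\hat{\pi}(2i-1) = 2\pi^{-1}(i)$ and $\hat{\pi}(2i) = 2\pi(i)-1$, i ∈ {1,2,…,p}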

43 Note that π̂ takes odd numbers into even numbers and vice versa. These permutations can be pictured by diagrams. If p = 4, and π is the permutation

44 (12)(34):

Picture 3. The diagram on the left represents π = (12)(34) and the diagram on the right represents the corresponding π̂.

45 For another example take the permutation (13)(24):

Picture 4. The diagram on the left represents π = (13)(24) and the diagram on the right represents the corresponding π̂.
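The passage from π to π̂ can be written out mechanically. The sketch below, in Python, assumes that π̂ is read off from the identities between the C's given above, that is, π̂(2i−1) = 2π⁻¹(i) and π̂(2i) = 2π(i)−1; this reading, the dictionary representation of permutations and the name `pi_hat` are assumptions made for illustration.

```python
def pi_hat(pi):
    """Given a permutation pi of {1, ..., p} as a dict, return the assumed
    permutation of {1, ..., 2p} with
    pi_hat(2i - 1) = 2 * pi^{-1}(i) and pi_hat(2i) = 2 * pi(i) - 1."""
    inverse = {image: i for i, image in pi.items()}
    result = {}
    for i in pi:
        result[2 * i - 1] = 2 * inverse[i]
        result[2 * i] = 2 * pi[i] - 1
    return result


pi_a = {1: 2, 2: 1, 3: 4, 4: 3}    # the permutation (12)(34)
pi_b = {1: 3, 2: 4, 3: 1, 4: 2}    # the permutation (13)(24)

hat_a = pi_hat(pi_a)
hat_b = pi_hat(pi_b)

print(sorted(hat_a.items()))
print(sorted(hat_b.items()))

# Each value swaps parity, and applying pi_hat twice returns the start.
swaps_parity = all((j + hat_a[j]) % 2 == 1 for j in hat_a)
is_pairing = all(hat_a[hat_a[j]] == j for j in hat_a)
print(swaps_parity, is_pairing)
```

For (12)(34) the pairs produced are 1 with 4, 2 with 3, 5 with 8 and 6 with 7, which are the positions at which the same matrix occurs in the product for that permutation.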

46 Recall the equivalence relation obtained above. The number of equivalence classes can also be illustrated by diagrams. The relation is $j \sim_{\hat{\pi}} \hat{\pi}(j)+1$ and we use it on the two permutations above. The first gives the following equivalence classes:

Picture 5. The diagram on the left represents π̂ from picture 3 and the diagram on the right represents equivalence classes. Numbers connected by a line are in the same equivalence class. These are calculated as follows: $1 \sim_{\hat{\pi}} \hat{\pi}(1) + 1 = 5$, so 1 is seen to be related to 5. Similarly $5 \sim_{\hat{\pi}} \hat{\pi}(5) + 1 = 8 + 1 = 1$, calculating mod 8. Thus 1 and 5 form an equivalence class.

47 In this case, there are 5 equivalence classes, of which 2 contain even numbers and 3 contain odd numbers. From the definition of the equivalence relation, it can be seen that equivalence classes will always contain exclusively even numbers or odd numbers. Thus we may define k(π̂) to be the number of equivalence classes containing even numbers, and l(π̂) to be the number of equivalence classes containing odd numbers. Then if the value (2) of the expression is recalled, it is seen that if p is 4, and the permutation is as just discussed, the value of the expression is $m^2 \cdot n^3$.
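The count of equivalence classes can be cross-checked against a brute-force count of index tuples. The following Python sketch makes several assumptions that are not stated in this form in the source: π̂ is computed as π̂(2i−1) = 2π⁻¹(i) and π̂(2i) = 2π(i)−1, indices are read mod 2p, and the sizes m = 2 and n = 3 are arbitrary small test values. All function names are illustrative. The permutation (13)(24) is included for comparison.

```python
from itertools import product


def pi_hat(pi):
    """Assumed definition: pi_hat(2i-1) = 2*pi^{-1}(i), pi_hat(2i) = 2*pi(i)-1."""
    inverse = {image: i for i, image in pi.items()}
    result = {}
    for i in pi:
        result[2 * i - 1] = 2 * inverse[i]
        result[2 * i] = 2 * pi[i] - 1
    return result


def classes(pi):
    """Equivalence classes generated by j ~ pi_hat(j) + 1 on {1, ..., 2p},
    computed as the orbits of the map j -> pi_hat(j) + 1 (mod 2p)."""
    hat = pi_hat(pi)
    size = 2 * len(pi)
    step = {j: hat[j] % size + 1 for j in hat}
    seen, result = set(), []
    for start in sorted(step):
        orbit, x = [], start
        while x not in seen:
            seen.add(x)
            orbit.append(x)
            x = step[x]
        if orbit:
            result.append(sorted(orbit))
    return result


def k_and_l(pi):
    """Number of classes of even numbers (k) and of odd numbers (l)."""
    found = classes(pi)
    k = sum(1 for c in found if c[0] % 2 == 0)
    ell = sum(1 for c in found if c[0] % 2 == 1)
    return k, ell


def count_tuples(pi, m, n):
    """Count tuples (u_1, ..., u_2p), odd positions in n values and even
    positions in m values, with u_{2i} = u_{2 pi(i)} and
    u_{2i+1} = u_{2 pi(i) - 1}, indices taken mod 2p."""
    p = len(pi)
    ranges = [range(n) if j % 2 == 1 else range(m) for j in range(1, 2 * p + 1)]
    count = 0
    for values in product(*ranges):
        def u(j):
            return values[(j - 1) % (2 * p)]
        if all(u(2 * i) == u(2 * pi[i]) and u(2 * i + 1) == u(2 * pi[i] - 1)
               for i in pi):
            count += 1
    return count


pi_a = {1: 2, 2: 1, 3: 4, 4: 3}    # (12)(34)
pi_b = {1: 3, 2: 4, 3: 1, 4: 2}    # (13)(24)
m, n = 2, 3                        # hypothetical small sizes

for pi in (pi_a, pi_b):
    k, ell = k_and_l(pi)
    print(classes(pi), k, ell, count_tuples(pi, m, n) == m**k * n**ell)
```

Under these assumptions the two counts agree for both permutations: each class contributes one free index, ranging over m values if the class consists of even numbers and over n values if it consists of odd numbers.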

48 The second example gives:

Picture 6. The diagram on the left represents π̂ from picture 4 and the diagram on the right represents equivalence classes. Numbers connected by a line are in the same equivalence class. These are calculated as in the former example.

49 Here there are only 3 equivalence classes, which give the value $m^2 \cdot n$ to the expression. Since in the two cases there is a different number of equivalence classes, one may ask whether there is a way to determine how many there are for a particular permutation. This question is examined by the authors. It turns out that if the lines do not cross, as in the first example, there will always be 5 equivalence classes when p = 4. In general there will be p + 1 equivalence classes. This is another detail of the proof that can be studied partly using diagrams, as addressed elsewhere [Carter 2010]. Here I will mention a single point concerning iconicity of the diagrams. The property of being a crossing or non-crossing permutation has a bearing on the result of the expression. This property is at first visible in the diagrams. It is possible, though, to formulate an algebraic analogue of being a crossing permutation. In this sense, since the diagram displays this relation, it is an iconic representation.
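How the number of equivalence classes varies with the permutation can be explored by simple enumeration. The Python sketch below tallies the number of classes over all 24 permutations of {1, 2, 3, 4}. It assumes the same reading of π̂ as π̂(2i−1) = 2π⁻¹(i) and π̂(2i) = 2π(i)−1, and it does not test the crossing property itself; it only records which class counts occur. The function name `number_of_classes` and the tuple encoding of permutations are illustrative.

```python
from collections import Counter
from itertools import permutations


def number_of_classes(images):
    """images[i - 1] = pi(i) for a permutation pi of {1, ..., p}.
    Returns the number of orbits of j -> pi_hat(j) + 1 (mod 2p), with the
    assumed pi_hat(2i-1) = 2*pi^{-1}(i) and pi_hat(2i) = 2*pi(i) - 1."""
    p = len(images)
    pi = {i + 1: images[i] for i in range(p)}
    inverse = {image: i for i, image in pi.items()}
    step = {}
    for i in pi:
        step[2 * i - 1] = (2 * inverse[i]) % (2 * p) + 1
        step[2 * i] = (2 * pi[i] - 1) % (2 * p) + 1
    seen, count = set(), 0
    for start in step:
        if start in seen:
            continue
        count += 1
        x = start
        while x not in seen:
            seen.add(x)
            x = step[x]
    return count


p = 4
histogram = Counter(
    number_of_classes(perm) for perm in permutations(range(1, p + 1))
)

print(sorted(histogram.items()))
print(max(histogram) == p + 1)
```

In this enumeration no permutation yields more than p + 1 = 5 classes, and (12)(34) is among those that reach this maximum, while (13)(24) gives 3.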

50 Moving on to explain the role of icons and indices, consider first the diagram below representing the process described above:

Picture 7

51 We see again that the expression is finally, via a reformulation of the permutation, reworked to the equality $u_j = u_{\hat{\pi}(j)+1}$. The answer is then found from the equivalence relation $j \sim_{\hat{\pi}} \hat{\pi}(j)+1$, calculating the number of equivalence classes. We have thus seen how the original expression is gradually broken down to study specific details. The claims about the role of icons and indices were:

1. Icons, i.e., representations that resemble what they represent, enable the use of an obtained result in the original expression.

2. The role of indices is to make it possible to reinsert the result in the appropriate place.

53 The major letters used are the following:

• B, in the first step representing matrices, in the second step representing Gaussian random matrices.

• b, representing random variables.

• π, representing a permutation.

• $u_j$, representing natural numbers.

54 In the first step of the proof, we take the $B_i$'s to represent matrices and it is our acquaintance with these that gives us rules on how to multiply, and find the trace of, matrices. These rules are used to obtain the second expression, the sum of expectations of the product of the entries. Observe that to obtain the first equality, we need both acquaintance with what the letters, $B_i$, are taken to represent and the relations that hold for this kind of entity. This was the description given above for iconicity in terms of a mediated third. The same picture holds in step three, where calculations are performed on the conditions

$u_{2i} = u_{2\pi(i)}$ and $u_{2i+1} = u_{2\pi(i)-1}$,

55 leading to the expression $u_j = u_{\hat{\pi}(j)+1}$. Here we take $u_j$ to represent numbers, and use relations on these to obtain the desired equality. (This is simplified somewhat in order not to drown the main point in mathematical details.)

56 The second step is different. Here we note that the b(u,v,k)'s represent random variables, and that the expression involves products of complex random variables with their conjugates. From this expression, we extract certain relations that must hold in order for the products to be different from zero. These relations allow us to produce a new diagram.

57 Even though there is partial likeness along the way, there is a puzzle that needs to be assembled in the end. That this is possible depends on the indexical use of the representations, i.e., the role of representations as "sign-posts", which is illustrated in the diagram.

• $\hat{\pi}$ points to $\pi$ in the original expression.

• $u_j = u_{\hat{\pi}(j)+1}$ points (if the change of subindices is recalled) to the entries $b(u_{2i}, u_{2i-1}, k)$.

58 In conclusion, my claim was that the value of a complex expression could be found, using representations, in a series of steps that break it down into manageable parts. In each step it is possible to select a particular focus, and thus "remove irrelevant information". The use of iconicity along the way ensures that the final expressions or results tell us something about the original expression, whereas the use of symbols as indices lets us recall where to reinsert the obtained results.

BIBLIOGRAPHY

CARTER, JESSICA 2010 Diagrams and proofs in analysis, International Studies in the Philosophy of Science, 24(1), 1-14.

DUVAL, RAYMOND 2006 A cognitive analysis of problems of comprehension in a learning of mathematics, Educational Studies in Mathematics, 61, 103-131.

GROSHOLZ, EMILY R. 2007 Representation and Productive Ambiguity in Mathematics and the Sciences, New York: Oxford University Press.

HAAGERUP, UFFE & THORBJØRNSEN, STEEN 1999 Random matrices and K-theory for exact C*-algebras, Documenta Mathematica, 4, 341-450.

MANDERS, KENNETH 1999 Euclid or Descartes: Representation and responsiveness, unpublished.
2008 The Euclidean diagram, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press.

MOORE, MATTHEW E. 2010 New Essays on Peirce's Mathematical Philosophy, Chicago and La Salle: Open Court Publishing Company.

PEIRCE, CHARLES SANDERS 1965 Collected Papers, Cambridge: The Belknap Press of Harvard University Press.
1976 The New Elements of Mathematics, vol. IV, The Hague: Mouton.

NOTES

1. I wish to thank Frederik Christiansen as well as the anonymous reviewers for insightful comments.

2. Ken Manders [Manders 1999] deals more extensively with a similar point. He introduces the notions of responsiveness and indifference in order to address the topic of progress in mathematics. Human beings have limited resources for paying attention. As a consequence, one measure for the progress of a method is in terms of the number of types of response rendered superfluous by it.

3. For more interpretations of Peirce's philosophy of mathematics I recommend the recent book edited by Matthew Moore [Moore 2010].

4. References to Peirce are to his Collected Papers, 1965. 2.228 means paragraph 228 in the second book.

5. According to Peirce's Phaneroscopy, sign, interpretant and object can take part in firstness (possibility), secondness (existence), and thirdness (law), which, at first, gives rise to 10 different categories of signs.

6. This point was made by one of the reviewers.

7. We also need the fact that the expectation of the sum is the sum of expectations.

8. This follows since the expression $E(\{f + ig\}^m \{f - ig\}^n)$ is zero unless $m = n$.

9. $\pi = (12)(34)$ means that it is the map $\pi : \{1, 2, 3, 4\} \to \{1, 2, 3, 4\}$, where the number 1 is mapped to 2, 2 to 1, 3 to 4, and 4 to 3.

ABSTRACTS

This paper discusses the role of representations in mathematical proofs. It is proposed that representations enable us to break a proof down into manageable parts. We illustrate with an example from current mathematical practice how the value of an expression is found by gradually breaking it down into simpler parts. In addition I explain which role icons and indices play in this process. Icons ensure that there is likeness between the expression and the obtained results. Indices acting as sign-posts enable us to reinsert the results at the appropriate places.

Cet article discute le rôle des représentations dans les preuves mathématiques. Il est suggéré ici que les représentations nous permettent de diviser une preuve en plusieurs parties plus faciles à traiter. Nous illustrerons cela avec un exemple de la pratique mathématique actuelle qui consiste à trouver la valeur d'une expression en la divisant graduellement en parties plus simples. Par ailleurs, j'explique le rôle que jouent les icônes et les indices dans cette procédure. Les icônes assurent la similarité entre l'expression et les résultats obtenus. En revanche, les indices agissent comme des indicateurs qui permettent de réinsérer les résultats aux emplacements appropriés.

AUTHOR

JESSICA CARTER University of Southern Denmark, Odense (Denmark)

Towards a Practice-based Philosophy of Logic: Formal Languages as a Case Study*

Catarina Dutilh Novaes

In different subfields of philosophy, focus on actual human practices has been an important (albeit still somewhat non-mainstream) approach for some time already. This is true in particular of general philosophy of science, and to a lesser extent of philosophy of mathematics: following the revolutionary work of authors such as Kuhn and Lakatos, quite a few philosophers of science and mathematics sought to incorporate the actual practices of scientists and mathematicians into their philosophical analyses. Now, it seems fair to say that such a practice-based turn in these fields has delivered important results; we now have a more thorough and encompassing understanding of science as an enterprise thanks to these analyses. In this sense, it is perhaps surprising to notice that no such practice-based turn has yet taken place within the philosophy of logic. Why is that? Is there something about logic that makes it intrinsically unsuitable for a practice-based approach? What are the prospects for new insight into traditional philosophical questions pertaining to logic (e.g. the nature and scope of logic) to be gained from focus on the practices of logicians? It might be contended that, given logic's normative (as opposed to descriptive) nature, and given that it deals with a priori issues, not empirical ones, attention to the practices of logicians is a wrongheaded approach from the start. The main goal of the present contribution is to argue that this is not the case, independently of the issue of the normativity and a prioricity of logic.1 In the first part of the paper, I delineate what a practice-based philosophy of logic would (could) look like, insisting in particular on why it can be relevant and how it is to be articulated. The 'how?' question is especially significant, as crucial methodological challenges must be addressed for the formulation of a methodologically robust practice-based philosophy of logic. 
In the second part, I illustrate how a practice-based approach in (the philosophy of) logic could be developed by means of a case study: the use of formal languages in logic, in particular in the practices of logicians. (The material presented here is essentially a condensed version of the main arguments presented at book length in [Dutilh Novaes 2012].) I argue that formal languages play a fundamental operative role in the work of logicians, as a paper-and-pencil, hands-on technology triggering certain cognitive mechanisms and, more specifically, countering some of our more 'natural' cognitive mechanisms which are not particularly suitable for research in logic (as well as in other fields). I substantiate these claims mostly with empirical data from research on the psychology of reasoning. By means of this analysis, I hope to show that a practice-based philosophy of logic can be a fruitful enterprise, especially if accompanied by much-needed methodological reflection.

1 Practice-based philosophy of logic

1.1 Why?

Perhaps the first question to be asked before we embark on a new enterprise such as a practice-based approach in philosophy of logic is: Why do we need a new approach at all? In what sense do more traditional approaches fail to clarify certain aspects of the target-phenomenon, and in which ways is the proposed new approach likely to shed new light on the matter? In order to address these questions, let me first offer some general considerations on the very idea of philosophical analyses with emphasis on (human) practices. Let us start with the schema 'practice-based philosophy of X', where 'X' is a given (scientific) field. If we replace 'X' with 'science', we obtain 'practice-based philosophy of science', which was arguably the first practice-based approach systematically developed within philosophy. Generally speaking, a practice-based philosophy of science takes actual scientific practices as a starting point instead of an idealized, abstract notion of 'science'. The contrast between Popper's and Kuhn's respective views is illustrative: Popper maintained that science proceeds (indeed, should proceed) by means of falsification [Popper 1959], but Kuhn pointed out that this picture was in no way a reflection of actual scientific practices [Kuhn 1962]. A Popperian might still contend that this is in any case how science should proceed; but when there is such a significant mismatch between theory and object of analysis, in what sense is the theory still informative or relevant in any interesting way? The crucial gestalt shift undertaken by Kuhn was arguably that of viewing 'science' not as an abstract entity (the collection of scientific theories) but rather as a body of (rather heterogeneous) human practices. From a Kuhnian perspective, such practices are the actual phenomena that a philosophy of science should address, not (only) the theories themselves and their logical relations as such. 
But in order to argue that the Popperian picture was a wrong picture of scientific practice, Kuhn needed to make his philosophical analyses empirically-informed: it was on the basis of investigations on the history of science that Kuhn could challenge the Popperian picture and argue that, rather than falsification, a scientist typically aims at the confirmation of theories. So here is the first important feature of (and challenge for) a practice-based approach in philosophy: it must be empirically informed, as it must gather data on how e.g. scientific research is actually conducted. This may seem to jeopardize the very core of philosophical methodology as traditionally construed, and to blur the distinction between philosophy, sociology and history. But are these insurmountable difficulties?

Besides practice-based philosophy of science (which also has Hacking and Latour, among others, as prominent names), recent years have witnessed the emergence of a practice-based philosophy of mathematics in the work of, among others, J. P. van Bendegem [Kerkhove & Bendegem 2007], Hersh [Hersh 1997], [Hersh 2006], and Mancosu [Mancosu 2008]. Again, the main idea is that traditional philosophy of mathematics, with the predominance of the 'foundations of mathematics' research program, offers a distorted or in any case extremely limited picture of mathematics as actually practiced by mathematicians. For philosophers interested in mathematical practices, the target-phenomenon to be accounted for is not only the mathematicians' output (the theories they formulate and the theorems they prove), but also the processes (both individual and social) leading to the output, i.e. the practices of mathematicians. In a sense, in both cases the crucial step seems to be a widening of the scope of philosophy of X so as to include the actual practices of the practitioners of X. It may be debated whether this new scope is meant to replace the old, output-oriented scope, or whether it is meant to be an addition, not necessarily seeking to supplant traditional forms of philosophical investigation of X. But ultimately, practice-based approaches to science and mathematics seem to rely crucially on an assumption and on an empirical claim: actual practices of mathematicians and scientists are philosophically relevant (assumption), and traditional philosophical accounts of both science and mathematics (those with no particular focus on practices) do not reflect actual practices within these fields (empirical claim).2 I believe the empirical claim to have been essentially vindicated by the works of Kuhn and many others following in his footsteps (even though there is of course still scope for debate on the exact details of encompassing models of scientific and mathematical practices). 
As for the assumption, one way that it could be discharged is if a practice-based approach to X succeeds in solving, or at least in shedding new light on, an open question within traditional philosophy of X, one with respect to which traditional philosophical methodology has proved to be insufficiently illuminating.3 Let us take a closer look at the current state of the philosophy of logic. I think it is fair to say that 'traditional' philosophy of logic is often far removed from the actual, latest developments of research in logic properly speaking. The main focus is still on traditional themes such as truth, consequence, paradoxes, etc. To be sure, such topics remain of crucial importance, but there is a wide range of developments within logic as a discipline that receive scarce attention from philosophers of logic. Indeed, the logic that they talk about is all too often the logic of several decades ago, when (mathematical) logic was almost exclusively concerned with the foundations of mathematics. In a sense, philosophy of logic is still very much tied to what could be described as a 'Quinean agenda', and in many cases still to the idea that first-order predicate logic is the quintessential logical system (as in e.g. [Sher 2008]). In practice, however, and in particular in terms of applications, first-order logic is much less prominent, a fact perhaps related to its sub-optimal computational properties. In short, current philosophy of logic does not pay sufficient attention to the actual, current practices of logicians. But what are the current practices of logicians? Again, this is an empirical question, and one which is not likely to receive an unproblematic answer, precisely because what is to count as 'logic' is a contentious issue to start with. Delineating what is to count as a 'practice of logic', even if not in a sharp, clear-cut way, is most certainly not a trivial enterprise. While I acknowledge this difficulty, I think a few general remarks can at least convey some of the mismatches between 'logic' as talked about by philosophers of logic and 'logic' as practiced by logicians. Firstly, logic is no longer exclusively concerned with the foundations of mathematics; it intersects with areas such as computer science, game theory, decision theory, linguistics, cognitive science, etc. Secondly, several different logical systems besides first-order logic are regularly used and studied, but discussions on logical pluralism do not seem to be able to really make sense of the plurality of logical systems. Arguably, explaining how (if at all) the current plurality of logical systems available on the market can (or cannot) be harmonized with the traditional idea of 'logic as the science of correct reasoning' is one of the most pressing tasks for philosophers of logic. While there has been quite some interest in this issue on the part of philosophers [Haack 2000], [Beall & Restall 2006], a hallmark of such discussions on logical pluralism is that they are arguably still not sufficiently 'pluralistic', leaving too much of the actual practices of logicians out (as argued by e.g. [Benthem 2008]). Typically, only those that could be described as the 'classical' non-classical logics are explicitly discussed (intuitionistic logic, relevant logic, and in recent years paraconsistent logics). In particular, actual research in logic goes well beyond the concepts of consequence and truth, but nevertheless discussions on logical pluralism are typically formulated in terms of consequence-pluralism and alethic pluralism. I am not suggesting that current discussions within philosophy of logic are no longer relevant and should simply be abandoned in favor of a practice-based approach. 
Rather, I am suggesting that the mismatch between philosophical theorizing and actual (recent) practices when it comes to logic is a reason for us to take seriously the idea of expanding the research agenda within philosophy of logic, and in particular to pay more attention to the actual practices of logicians. The idea is to raise philosophical issues pertaining to the latest developments in logic,4 and to ask ourselves questions such as: What are the cognitive and social mechanisms involved in the development of logic as a discipline? What is it that we do when we 'do logic', and how do we do it? So here is a tentative characterization of practice-based philosophy of logic in its goals and tasks:

- It takes as its starting point logic as it was and is actually practiced: recent developments as well as its history.

- Its tasks are:

- to clarify underlying assumptions and raise pertinent questions;

- to draw philosophically significant conclusions from technical results;

- to 'make sense' of logical practices: the tools and technologies involved, the cognitive processes, social interactions.

Let me stress once again that practice-based philosophy of logic need not replace more traditional forms of investigation in the field; it is best seen as an alternative, complementary philosophical approach to logic, but one which may ultimately even shed new light on traditional issues.

1.2 How?

Now that I have addressed the 'why' question, let us turn to the much more delicate 'how' question: how is practice-based philosophy of logic to be developed in a methodologically robust manner? Mere anecdotal evidence will obviously not be sufficient; but as already noticed, a practice-based philosophy of logic must be empirically-informed, given that the goal is precisely to take actual practices into account5. Here again we encounter the output vs. process distinction: while traditional philosophy of logic focuses on the output of logical practices, i.e. the logical theories themselves, from a practice-based perspective logic is viewed as a body of human cognitive and social processes as well as a collection of theories. From this point of view, how logicians arrive at their results—the heuristics of logical practices—becomes an important question, and insofar as they are human beings like any other, the general cognitive makeup of human agents and their patterns of social interaction will be significant. Indeed, I submit that there are essentially two intertwined but distinct levels of practices to be investigated: the social level and the individual level. On the one hand, the social level of logic as a collective, public enterprise involves networks of people who communicate with each other and whose work builds on previous work (it is a cumulative enterprise). Naturally, logicians share specific (socially established) norms on how work in logic ought to be done. On the other hand, the individual level of logic as a cognitive enterprise regards the cognitive processes taking place when a logician is at work; even though the social aspect is of course fundamental for the creative process, ultimately thinking remains an individual, private matter. I suggest that these two levels may require distinct methodologies to be investigated. 
With respect to the social level, a natural question to be asked is: what distinguishes a practice-based philosophy of logic from a sociology of logic? Granted, sociology of logic is an almost non-existent field, but at least one comprehensive sociological study of the practices of logicians has been undertaken with significant results, namely [Rosental 2008]. Would a practice-based philosophy of logic focusing on the social level be significantly different from such a sociological approach? I believe it would, even though a philosophical analysis would do well to be informed by the results obtained by means of the application of solid sociological methodology such as Rosental's. But just to illustrate the kind of analysis that a practice-based philosophy of logic could offer on the social level, let us reflect for a moment on the ubiquitous practice of presenting a proof-system and a semantics, and then moving on to proving soundness and completeness results. That this is how we 'do logic' is such a widespread convention that few people reflect on what exactly it means to offer a soundness and completeness proof from a philosophical point of view. A practice-based philosopher of logic could raise precisely this kind of question: why is this manner of proceeding so widespread, and what exactly is achieved by means of such proofs? Of course, there are traditional philosophical analyses of the significance of proofs of soundness and completeness, such as [Dummett 1978], but they fail to highlight the fact that it has to a large extent become a social norm to proceed in this manner. A practice-based philosopher of logic, by contrast, may outline the partially conventional nature of such a procedure, and may even go as far as questioning the propriety of this procedure in specific cases.6

In short, the main difference between a sociology of logic and a practice-based philosophy of logic is that the former is fundamentally descriptive; the latter, while having a descriptive dimension (and possibly being informed by sociological findings), will also be to some extent prescriptive. Some of the tasks of a practice-based philosophy of logic are to offer critical analyses of the conceptual foundations of actual work being done in logic (clarifying underlying assumptions), to identify possible conceptual problems underlying the practices, and to suggest directions for improvement. Indeed, actual practices are not always (necessarily) 'right'. What kind of methodology can we use for data-gathering on the practices of logicians? As already mentioned, mere anecdotal evidence is not sufficient. Moreover, once such data are gathered, it is not entirely obvious that we can then proceed with the 'usual' philosophical methodology, whatever that may be. At this point, however, it would be premature to offer full-fledged methodological considerations; faithful to the practice-based principle, I believe that it is only by doing practice-based philosophy of logic that the appropriate methodological guidelines will really present themselves. Nevertheless, even at this early stage it is important to be clear on the kind of methodology that can be used for data-gathering on practices. I have a fairly precise idea of how to get started on the individual level of logic as a cognitive enterprise: a promising approach seems to be to take into account findings from cognitive science and psychology on how humans in general reason. 
There is a well-developed tradition in the psychology of reasoning which has gathered significant data on human reasoning patterns, and an interesting question is precisely to inquire into the cognitive changes produced by logical training on this initial cognitive state that logicians, as humans, share with all other humans (it is precisely the line of investigation that I shall pursue in the second part of the paper). Although little work has been done on the psychology and cognitive science of learning logic specifically, there is some work on the topic; [Stenning 2002], for example, is a remarkable illustration of how cognitive processes engaged in the learning of logic can be studied.7 Moreover, the practice-based philosopher of logic already has a good starting point with the available data on how humans in general, rather than logicians specifically, reason. I shall argue in the second part of the paper that significant insight can be gained on the practices of logicians just by taking this material as a starting point. As for the collective level of logic as a social enterprise, data-gathering seems a delicate matter. Robust sociological methodology would have to be employed, such as e.g. systematic, quantitative analyses of the corpus of articles published on a given topic in view of a specific question to be addressed. Historical studies may also prove to be useful, as they may provide information on the variants and invariants in the practices of logicians across different times and places. But at this point, data-gathering for the social level of analysis would require significantly more discussion than I can offer here, and I hope that others may be able to offer suggestions for further development in this area. In general, the methodological challenges for the formulation of a practice-based philosophy of logic stem from the fact that the approach finds itself in the difficult position of balancing descriptive and prescriptive elements. 
It must not only deal with the is/ought divide, but it must also find a way to merge the 'is' level with the 'ought' level. It does not take a purely idealized notion of what logic ought to be as its starting point, but it is not meant to be sociology of logic either. Perhaps one way to describe this predicament is the following: practice-based philosophy of logic discusses how things ought to be (with respect to logical practices), but taking as a starting point how things actually are: a constrained, situated form of normativity. Moreover, ideally the approach should be able to create a situation of reflective equilibrium between practices and theory: ideally, the dialogue between the practice-based philosopher of logic and the logician should go both ways.8

2 A case study: uses of formal languages in logic

2.1 Why do logicians use formal languages?

One does not need to be a practice-based logician to recognize the importance of understanding the role of formal languages in logic. True enough, most of us seem to take for granted that 'this is simply how we do things', forgetting (or dismissing) the fact that, for most of its history, logic was not practiced with formal languages as we now know them. Yet, this seems to be a fundamental question: why use formal languages when doing logic? What is the actual impact of using formal languages for research in logic? Does it really make a difference for the logician's investigations in terms of the results obtained? Is it necessary for the development of logic as a discipline? One cannot deny the substantial changes that logical practices have undergone since it became customary to do logic with formal languages, and this phenomenon requires an explanation. Although perhaps not indispensable for logical practices up to a certain level of abstraction, formal languages are arguably such powerful tools that they profoundly transform the ways in which we do logic. Typical answers to these questions refer to the clarity of expression and level of abstraction afforded by formal languages; sometimes reference is also made to the object-language vs. meta-language distinction, which greatly contributed to the development of meta-theoretical investigations. It seems to me however that, while such remarks are not false properly speaking, in fact they only 'scratch the surface' of the phenomena in question.9 There seems to be much more to the use of formal languages in logical practices than these elements alone, and in particular we should inquire into the cognitive impact of using formal languages when doing logic. First, here is a question that is often seen as either trivial or pedantic (or both): what are formal languages? Some may dismiss the question as unimportant (e.g. 
[Avron 1994, 218]), while others may say that there is an exceedingly simple and philosophically uninteresting answer. Indeed, from a purely mathematical point of view, a formal language is characterized by a finite vocabulary and a finite and sharply defined syntax: its (well-formed) expressions and formulas are generated recursively on the basis of the vocabulary and the syntax (forming a potentially infinite class). Again, this is of course true, but it leaves a lot unexplained. For example, in what sense are formal languages formal, and in what sense are they languages? Do they fulfill a communicative role as vernacular languages do? Do they have a social dimension as languages used for communication among logicians? Presumably, the adjective 'formal' differentiates them from other kinds of linguistic systems.10 Elsewhere [Dutilh Novaes 2011], I discuss at length the different meanings that the term 'formal' has with respect to logic, and it would seem that, of the eight senses of 'formal' identified in that paper, formal languages are formal in particular in the sense of formal as de-semantification (a concept to be explained below) and in the sense of formal as computable (related both to how the syntax of a formal language operates and how inference-making with a formal language and an appropriate deductive system typically functions). These two aspects will have a key impact on the cognitive effect of doing logic with formal languages, as I shall argue. But before focusing on these cognitive aspects, a few more preliminary observations on formal languages and logical practices seem in order. First of all, formal languages obviously do not replace other forms of communication in logical practices: typically, logicians use a mix of formal and vernacular languages, switching back and forth when convenient (this can be observed in particular in oral contexts). But again, research in logic which uses formal languages is significantly different from research in logic without them (as history shows); the appearance of this specific technology has drastically changed the way logicians work, not only in terms of methods but also in terms of the results obtained. But what kind of tool is it? Some fairly obvious but often overlooked facts about formal languages are worth mentioning here. To begin with, formal languages are written languages, with no obvious spoken counterparts. As such, they involve predominantly (but perhaps not fundamentally) our visual capacities. Equally important is the observation that formal languages came into existence only after a very long and gradual process going through the use of schematic letters and the development of mathematical notation (algebra in particular), spanning many centuries and different continents [Staal 2006], [Dutilh Novaes 2012, chap. 3]. 
The 17th century is a crucial period in this sense, not only for the groundbreaking developments in notations for algebra (in particular with Viète and Descartes), but also in that, perhaps for the first time in history, logic and mathematics were increasingly viewed as related disciplines [Mugnai 2010]. This greatly facilitated the transfer of mathematical notational techniques to logic. And yet, as is well known, the development of the first full-fledged logical formal language had to wait until 1879, with Frege's Begriffsschrift. But the Begriffsschrift did not at first have much of an impact on logical and mathematical practices. This sociological fact may be related to the stage of development of the foundational program at the time; in 1879, axiomatizations of different subfields of mathematics were in their very early days, and at that point the focus was on the axioms rather than on the rules of inference underlying the theories (which were then informally and tacitly adopted, as if they were unproblematic [Awodey & Reck 2002]). In other words, at that point, the need for such a formal language as Frege's concept-script was simply not truly felt. But by the time the first volume of Whitehead and Russell's Principia Mathematica appeared (which the authors themselves saw as a continuation of Frege's project), the time was right for a more 'formal' approach to rules of inference, in particular because meta-theoretical questions became increasingly seen as fundamental.11 More specifically, it became clear that the completeness (in different senses of 'completeness'; see [Awodey & Reck 2002]) of the proposed axiomatizations could not be taken for granted and had to be established by means of proofs. Now, to conduct such meta-theoretical investigations on those axiomatizations, it became important to 'objectify' the underlying logic as a mathematical object itself, and thus the concept of object-language emerged. 
The classical rationale for the introduction of formal languages is the idea that vernacular languages are prone to all kinds of ambiguities and imperfections, and a formal, artificial language would be the result of a sanitation of expressive means for the purposes of scientific investigation in particular. The preface of Frege's Begriffsschrift is the locus classicus for this position (I shall say more on the expressive role of formal languages shortly). Moreover, a crucial and essentially new function for formal languages was discovered in the early 20th century: to enable meta-theoretical investigations to be conducted.12 While these two aspects are undoubtedly significant for explaining the impact of the use of formal languages in logical practices, my main contention here is that they only tell part of the story, and my goal is to propose at least a partial account of the rest of the story.

2.2 Formal languages as a technology

Here is yet another hypothesis which may not sound particularly illuminating at first, but which I believe truly contributes to clarifying the role of formal languages in logical practices: formal languages are best seen as a particular kind of technology. Of course, this hypothesis will not be very informative unless we can provide a more precise meaning to the rather vague term 'technology'. As a first approximation, a technology can be described as a specific method, material or device used to solve practical problems. Formal languages as such are not a method by themselves, but they are devices that allow for the implementation of certain methods. Ultimately, formal languages are a valuable item in a logician's toolbox; but naturally, being tools, they will deliver results only if put to use in fruitful ways.

A particularly puzzling aspect of technologies in general is that they are usually developed with specific applications in view (i.e. to fulfill certain needs that the hitherto available technologies fail to address), but often turn out to offer possibilities that had not been originally foreseen, and which go beyond the specific practical problems they were created to address. We could refer to this phenomenon as the 'surprise factor' in any technological development. What this means is that, once a given technology is developed to tackle a specific familiar issue, it often opens up a whole new range of possibilities which would not even have been conceivable prior to its development. New technologies may literally create new worlds, besides bringing answers to existing problems.13

Perhaps the best example of a radically life-changing technology is writing. Writing developed through thousands of years, but for a very long time, it consisted of very rudimentary forms of proto-writing [Schmandt-Besserat 1996]. Originally, proto-writing was developed in Mesopotamia as a simple technology for accounting, i.e.
for keeping track of production surplus and goods in general; naturally, such needs only arose against the background of a society which had abandoned a hunting-gathering economy in favor of agriculture and herding.14 So what started as rudimentary techniques to keep track of goods slowly but surely developed into a much more powerful technology, one which profoundly modified the way human beings (in literate societies) live their lives.

Formal languages are, as already mentioned, written languages, so the history of their development is a specific chapter in the general history of the development of writing. An important aspect is how the development of mathematical notation in particular went hand in hand with the search for notations that would facilitate processes of calculation; this is especially conspicuous in the development of numerical notation (see [De Cruz & De Smedt 2010]). Interestingly, with respect to numerical notation, expressivity and ease of calculation often do not go hand in hand. The classical example is the contrast between Roman numerals and Indo-Arabic numerals; the former are in a sense more intuitive and 'iconic', but they constitute a cumbersome tool for calculation, while the opposite is true of the latter.15 Indeed, there seems to be an inherent tension between desiderata of expressivity and desiderata of effective calculation, a tension which will occupy a central position in the present analysis.

As already noted, the pioneers of the introduction of formal languages in logic typically had concerns of expressivity (precision, objectivity) in mind when defending the need for a technology other than vernacular (written) languages in their investigations. The notable exception is of course Leibniz, for whom the calculative aspect of an artificial language such as his envisioned lingua characteristica, and its potential to lead to new discoveries, was crucial. The claim I will be defending in what follows is that this facilitating effect for logical reasoning as such is the somewhat 'unexpected' (but not for Leibniz, of course) upshot of the introduction of the specific technology corresponding to formal languages in logical practices. Inquiring into the cognitive processes underlying these phenomena is one of the purposes of the paper. But rather than using the term 'calculative', whose meaning is more restricted than what I have in mind, I will be using the term 'operative' to refer to this feature of formal languages.

In effect, we can discern three different and somewhat conflicting roles of formal languages in the practices of logicians (but notice that this enumeration does not make any claim of exhaustiveness):16

• Expressive role
• Iconic role
• Operative (calculative) role.

The expressive role of formal languages.
A discussion of the expressive aspect of formal languages must begin with an examination of Frege's own description of the purpose(s) of his 'ideography':

My first step was to attempt to reduce the concept of ordering in a sequence to that of logical consequence, so as to proceed from there to the concept of number. To prevent anything intuitive from penetrating here unnoticed, I had to bend every effort to keep the chain of inferences free of gaps. In attempting to comply with this requirement in the strictest possible way I found the inadequacy of language to be an obstacle; no matter how unwieldy the expressions I was ready to accept, I was less and less able, as the relations became more and more complex, to attain the precision that my purpose required. This deficiency led me to the idea of the present ideography. Its first purpose, therefore, is to provide us with the most reliable test of the validity of a chain of inferences and to point out every presupposition that tries to sneak in unnoticed. [Frege 1977, 5-6]

The imperfections of vernacular languages (which Frege refers to as 'Sprache des Lebens', the language of life) include, but are not limited to, the phenomena of ambiguity, equivocation, empty names, etc. For a variety of reasons, the expressivity afforded by vernacular languages is inadequate for Frege's project of providing logical foundations for mathematics, and in particular for testing the correctness of already existing mathematical theorems on the basis of an examination of the logical validity of the corresponding chains of inference. A crucial aspect of this enterprise is the requirement of perspicuousness: every single step must be made absolutely explicit, as shortcuts are not allowed, and hidden presuppositions are not welcome. It is in this sense that this motivation for using formal languages can be described as related to concerns of expressivity: every small inferential step and every premise/assumption must be explicitly expressed.

While I am prepared to grant that the use of a formal language can greatly facilitate the realization of these goals, it is crucial to observe that a formal language is not a necessary prerequisite for the success of the enterprise (and clearly, it is not sufficient either). In theory, and perhaps even in practice, this could be accomplished by means of a regimentation of the vernacular language in question. (Indeed, in the later Latin medieval tradition, the Latin being used for logical investigations was no longer a form of vernacular but rather a highly regimented version of Latin, containing a wide range of conventions introduced in order to increase precision of expression.) Moreover, it is worth noticing that replacing a vernacular language with a formal one for whatever purposes typically does not come for free; there is always the risk of overall loss of expressivity if the formal language fails to perform its expressive function adequately. Frege himself notes that a formal language is like a microscope, i.e. a tool suitable for particular applications and in particular circumstances: but if the microscope is malfunctioning, one would be well advised to stick to the not entirely adequate, but in any case widely tested and reliable alternative, namely the eye. For these and other reasons, I believe that the gain in clarity that is (purportedly) associated with the use of formal languages (in logic and elsewhere) is by no means the main reason why using or not using formal languages has such an impact on the practices of logicians and on the results they obtain.
The iconic role of formal languages.17 Formal languages are typically sentential languages, which means, according to Stenning's [Stenning 2002, 51] useful distinction, that their very appearance on paper as concatenations (i.e. linearly presented) must be interpreted indirectly, i.e. on the basis of the syntactical conventions which define how the inscriptions on paper must be 'scanned' in the first place. Diagrams, by contrast, rely on patterns of spatial relations between inscriptions other than concatenation, and these spatial relations can be interpreted directly as (somehow) mirroring the spatial relations present in the phenomenon being depicted. Nevertheless, as Stenning himself acknowledges, the distinction formulated in these terms is not necessarily sharp, and many systems of representation have elements of both forms of representation, sentential and diagrammatic.

'Vernacular writing' is (normally) linear and one-dimensional (either vertically or horizontally), a fact that tends to obscure its iconic nature, as argued by Krämer [Krämer 2003]. Every form of writing is inexorably iconic insofar as it presupposes inscriptions on surfaces, but given that most of them are fundamentally based on concatenation (linearity), it is all too easy to disregard the iconic element.18 Now, what is often the case with formal languages is that, even though they are still essentially sentential languages, they nevertheless explore the iconic feature of written languages more deeply than vernacular written languages. 'Formal writing', when at its best, makes full use of the two-dimensional possibilities of a surface; for example, proofs are typically represented by two-dimensional structures such as trees and graphs. Frege's Begriffsschrift is a good example of a notation that incorporates iconicity into a sentential form of representation (admittedly, it is a type-setting nightmare, which might explain why it was never widely adopted). As recently argued by D. Macbeth (this volume), the diagrammatic aspect of Frege's notation is fundamentally integrated into the very machinery of the system. In a similar vein, diagrammatic logical systems have been extensively studied in recent years [Shin 2002], [Shin & Lemon 2008].

The visual-iconic nature of formal logic suggests cognitive connections between doing logic and our visual faculties, and these have been explored in detail by K. Stenning in Seeing Reason (primarily, but not exclusively, in the context of logic teaching for undergraduate students). The cognitive impact of the use of this concrete technology (formal languages) when reasoning in logic, in particular in connection with its inherent visual and iconic nature, is a fascinating topic that deserves to be explored much more widely. Stenning's pioneering work has brought to the fore several aspects of this interaction, but it has also shown that our understanding of the impact of using different systems of representation to reason in and about logic is still at its early stages. Particularly surprising are Stenning's findings concerning individual differences in reasoning styles and the significance of using different modalities (sentential or diagrammatic systems of representation) for the resolution of the same classes of problems [Stenning 2002, chap. 2]. Whatever it may turn out to be, the story of the impact of using different technologies when doing logic will have to incorporate individual differences, and this suggests that, on a cognitive level, there is no universal, uniform way of 'doing logic'.

The operative role of formal languages. Another dimension of the use of formal languages in logic, one which is related to, but nevertheless distinct from the iconic dimension, is what we could describe as their 'operative' dimension.19 'Operative writing' is a concept introduced by S.
Krämer [Krämer 2003], which in the case of formal languages concerns the 'paper-and-pencil' dimension which seems to play an important role in how logicians reason and arrive at new results (Krämer explicitly mentions the formal languages of logicians as examples of operative writing). She presents operative writing in the following terms:

a medium for representing a realm of cognitive phenomena [...] a tool for operating hands-on with these phenomena in order to solve problems or to prove theories pertaining to this cognitive realm. [Krämer 2003, 522]

The conception that Krämer is criticizing as too limited is the 'phonetic' conception of writing, one which reduces writing to the role of transcription of spoken language to visual media. Clearly, formal languages do not fit into this picture, along with other forms of what she describes as 'operative writing'.20 One possible way of understanding the divide between operative and non-operative writing would concern the prior availability of the cognitive phenomena that a given portion of writing is (or is not) a report of: non-operative writing would express already available thoughts, while operative writing allows for new thoughts and ideas to come into existence, precisely through the process of operating with, and on, portions of writing. Prima facie, this seems plausible, but from this point of view, any form of writing can count as operative writing at least in some circumstances. We all know that, when writing a paper in plain vernacular, doing the actual writing is often an integral and necessary part of the creative process.21 In effect, it is a different, specific characteristic of Krämer's notion of operative writing that is particularly relevant for the present attempt to grasp the cognitive impact of using formal languages when doing logic: what she describes as the process of de-semantification, which we shall examine now.

2.3 The operative role of formal languages

Before discussing the theory behind the notion of formal languages as operative writing, let me offer a few platitudes on the practices of logicians. Firstly, 99% of logicians have a black/white board in their offices; it is rare to encounter a logician (or a mathematician, for that matter) for whom writing/scribbling frantically is not a part of his/her modus operandi. Indeed, writing down symbols typically plays an important role in how a logician organizes his/her thoughts and comes to new ideas and insights. In itself, this is not particularly controversial: for almost all kinds of problems or reasoning tasks, it helps a great deal if we have paper and pencil at our disposal to scribble with. Thinking and problem-solving in general are not exclusively inner mental processes; by making use of external devices which help make cognitive phenomena more concrete, we are often in a much better position to solve the tasks in question. This phenomenon has been extensively analyzed in the literature on the concepts of extended mind and extended cognition [Clark & Chalmers 1998], [Clark 2008], [Menary 2010], and writing is a prototypical example of a technology extending cognition beyond the limits of skull and skin.

Still, it does look like formal languages play a particularly significant role as a hands-on tool for discovery in logic, beyond the general reasoning-enhancing properties of writing and drawing. They not only play a justificatory role in contributing to the perspicuousness of the correctness of a proof, but they also perform the heuristic function of contributing to the discovery of new results and theorems, more than just any other arbitrary form of writing/drawing. Of course, we must always bear in mind that individual differences and different styles of reasoning (again, see [Stenning 2002]) suggest that this heuristic dimension does not operate uniformly on all agents.
Some logicians may make extensive use of concrete devices such as whiteboards, sheets of paper and pencils, while others may conduct a significant portion of their thinking without actually writing anything down. Nevertheless, the heuristic function of formal languages is sufficiently pervasive to require an explanation. What are the features of formal languages that allow them to perform this operative function?

Before elaborating specifically on the notion of de-semantification, let me point out from the start that this discussion is connected with the 'logic as calculus vs. logic as language' dichotomy introduced by van Heijenoort [Heijenoort 1967]. Typically, the proponents of the use of formal languages in science (logic in particular) emphasize their expressive advantages. Responding to a criticism of the Begriffsschrift, Frege makes a point of stressing that his ideography differs from Boole's notation in that it is not a mere calculus: it is a lingua characteristica (see [Heijenoort 1967]). It has also often been stressed in the secondary literature that, unlike what was to become of formal languages at a later stage, Frege's logical language is a meaningful language, not a calculus, and that this would be a sign of the superiority of the Fregean project over later projects such as the Hilbertian project (see [Sundholm 2003]). Here, however, I defend a very different view (following [Krämer 2003]): the process of de-semantification of formal languages can be, and often is, beneficial for reasoning in and about logic.


That this process of de-semantification did take place historically can be attested by a remark by Tarski (admittedly, with a certain degree of exaggeration), and a famous quote by Whitehead:22

Already at an earlier stage in the development of the deductive method we were, in the construction of a mathematical discipline, supposed to disregard the meanings of all expressions specific to this discipline, and we were to behave as if the places of these expressions were taken by variables void of any independent meaning. But, at least, to the logical concepts we were permitted to ascribe their customary meanings. […] Now, however, the meanings of all expressions encountered in the given discipline are to be disregarded without exception, and we are supposed to behave in the task of constructing a deductive theory as if its sentences were configurations of signs void of any content. [Tarski 1959, 134]

By relieving the brain of all unnecessary work, a good notation sets it free to concentrate on more advanced problems, and in effect increases the mental power of the race. In mathematics, granted that we are giving any serious attention to mathematical ideas, the symbolism is invariably an immense simplification. [...] [B]y the aid of symbolism, we can make transitions in reasoning almost mechanically by the eye, which otherwise would call into play the higher faculties of the brain. It is a profoundly erroneous truism [...] that we should cultivate the habit of thinking what we are doing. The precise opposite is the case. Civilization advances by extending the number of important operations which we can perform without thinking about them. [Whitehead 1911, 59-61]

But what exactly is meant by 'de-semantification' and what is the connection with the notion of operative writing?
Before giving my account of the concept, let me start by offering Krämer's own formulations:

We can explicate this idea with a type of writing in which this 'process of de-semantification' is particularly apparent. We will name this process 'operative writing'. This modality of writing is commonly known, and misunderstood, as 'formal language' and represents one of the fundamental innovations in 17th century science. [Krämer 2003, 531]

The rules of calculus apply exclusively to the syntactic shape of written signs, not to their meaning: thus one can calculate with the sign '0' long before it has been decided if its object of reference, the zero, is a number, in other words, before an interpretation for the numeral '0' – the cardinal number of empty sets – has been found that is mathematically consistent.23 [Krämer 2003, 532]

[…] signs can be manipulated without interpretation.24 This realm separates the knowledge of how to solve a problem from the knowledge of why this solution functions. [Krämer 2003, 532]

[…] the potency of these calculations is always connected to a move toward de-semantification: the meanings of signs become undifferentiated. [Krämer 2003, 532]

Essentially, the process of de-semantification entails that each step in the process of reasoning will not be guided by the reasoner's intuitions and beliefs which would (almost inevitably, as we shall see) be activated if she were to reason on the basis of the meanings of the inscriptions in question. Rather, by 'mechanically' applying the rules defined within the formalism (as a result of a process of de-semantification; knowing how but not necessarily why), i.e. by not 'thinking what we are doing', the reasoner not only alleviates the demands on her cognitive resources; she may also effectively block the interference of external information (which, in the case of logical reasoning, should be treated as irrelevant).
Moreover, thus defined, operative writing allows for the (in principle) unlimited iteration of such operations; by contrast, if the reasoner is guided by meaning and intuitions, typically at some point she is led too far astray from the cognitive territory she is capable of operating on solely on the basis of her intuitions.


Of course, these points are not entirely new, and the search for the 'mechanization' of reasoning within logic and elsewhere has had its enthusiasts for a long time (even though it has its opponents too, as mentioned above). What I would like to add to the debate in the next section are the findings from psychology and cognitive science that seem to explain, at least partially, why de-semantification and mechanization of reasoning can be such powerful tools, in particular in the quest for novel results. But before we move on to psychology, here is one more quote, this time by Gödel:

[In a formal system] rules of inference are laid down which allow one to pass from the axioms to new formulas and thus to deduce more and more propositions, the outstanding feature of the rules of inference being that they are purely formal, i.e. refer only to the outward structure of the formulas, not to their meanings, so that they could be applied by someone who knew nothing about mathematics, or by a machine. [Gödel 1995, 45; emphasis added]

The theoretical step from disregarding the meaning of formulas to the 'mechanical' application of rules of inference is a substantive step, i.e. de-semantification is neither a necessary nor a sufficient condition for the mechanical application of rules. Nevertheless, the connection between the two notions is a tight one, and it is easy to see that the mechanical application of rules can be greatly facilitated by a process of de-semantification. In the next section we shall see why it can be so beneficial for the production of new knowledge if a given set of rules can in principle be applied for investigations in a given domain by someone who 'knows nothing' about it.
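The idea that rules of inference 'refer only to the outward structure of the formulas' can be illustrated with a minimal sketch in Python. It is offered as an illustration under stated assumptions, not as a reconstruction of any historical formal system: the encoding of a conditional as the tuple ("->", antecedent, consequent) and the function name close_under_modus_ponens are hypothetical choices made for this example. The procedure matches shapes only and never consults what the atoms stand for.

```python
# Formulas are plain data: an atom is a string, and a conditional is the
# tuple ("->", antecedent, consequent). This encoding is an assumption made
# for the sake of the example.

def close_under_modus_ponens(premises):
    """Return every formula derivable from `premises` by modus ponens alone.

    The rule is applied by matching shapes: from X and ("->", X, Y), add Y.
    Nothing here depends on what the atoms are taken to mean.
    """
    derived = set(premises)
    changed = True
    while changed:
        changed = False
        for formula in list(derived):
            is_conditional = (
                isinstance(formula, tuple)
                and len(formula) == 3
                and formula[0] == "->"
            )
            if is_conditional and formula[1] in derived and formula[2] not in derived:
                derived.add(formula[2])
                changed = True
    return derived


# The atoms are deliberately meaningless tokens.
premises = {"wug", ("->", "wug", "blick"), ("->", "blick", "dax")}
theorems = close_under_modus_ponens(premises)
print("dax" in theorems)   # True
print("zorp" in theorems)  # False
```

Because the atoms carry no familiar content, there is nothing for prior beliefs to latch onto; the derivation of "dax" goes through whether or not the reader has any view about wugs.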

3 Psychology of reasoning and the use of formal languages

In this section, I review some of the results from the psychology of reasoning tradition which, I claim, are relevant to the issues being discussed here, in particular in that they shed light on the operative import of the use of formal languages in logical practices. Decades of research on the psychology of deduction have amassed significant data on the discrepancies between our spontaneous reasoning mechanisms and the canons of deductive reasoning; the bottom line is that subjects typically do not 'do well' in experiments with deductive tasks (for surveys, see [Evans 2002] and [Dutilh Novaes 2012, chap. 4]). Several explanations for these discrepancies have been put forward, but it seems that we typically let external information 'sneak in', i.e. we take context and prior beliefs into account (even if not directly relevant for the task at hand), just as suggested by Frege in the preface of the Begriffsschrift. These are all quite sensible reasoning tendencies for practical purposes, but in scientific contexts they may be counter-productive, for a variety of reasons. In particular, if one of the main goals in science is to produce new knowledge, the tendency we have (see experiments discussed below) to seek confirmation for the beliefs we already hold may hinder scientific discovery. In such contexts, we need devices that help us counter our usual doxastic conservativeness, and formal languages are among such devices.

The psychology of reasoning tradition has also established that the 'mistakes'25 we make when reasoning (deductively) are not random; there are patterns in such 'mistakes', and to account for the processes which might be underlying this systematicity, the notion of 'reasoning biases' has been introduced. The different biases are presented as an attempt to understand why it is that subjects systematically deviate from normative responses in experiments with deductive tasks. Here I focus on one of these 'biases' recognized in the literature, namely confirmation bias, and more specifically on how confirmation bias affects inference-drawing. In the abstract of a survey article on confirmation bias, we find the following definition:

Confirmation bias [...] connotes the seeking or interpreting of evidence in ways that are partial to existing beliefs, expectations, or a hypothesis in hand. [Nickerson 1998, 175]

In practice, people can reinforce their existing beliefs by selectively collecting new evidence, by interpreting evidence in a biased way, by selectively recalling information from memory, etc. In the words of Jonathan Evans, one of the most influential researchers in the field, "confirmation bias is perhaps the best known and most widely accepted notion of inferential error to come out of the literature on human reasoning" [Evans 1989, 41]. 'Confirmation bias' is a general term used to describe different and perhaps unrelated cognitive processes; admittedly, the concept is not always clearly defined. But what the different, possibly unrelated processes collectively referred to as 'confirmation bias' all seem to have in common is what could be described as our overwhelming tendency towards doxastic conservativeness. Typically, we seek to confirm and maintain the beliefs we already hold; we are extremely resistant to revising prior beliefs. In C. Fine's [Fine 2006, chap. 4] fitting terminology, our brain is extremely 'pigheaded'.

Among the many guises that confirmation bias can acquire, I am here particularly interested in confirmation bias as affecting inference-drawing. Human agents appear to consistently refrain from drawing conclusions from premises they do accept if the conclusions clash with prior beliefs. They may either draw the 'wrong' conclusions (i.e.
those that do not follow from the premises, at least not in the technical sense of deductively following, but instead seem to be in harmony with their prior beliefs) or refuse to draw conclusions in order to avoid conflict with prior belief. The technical term usually used to refer specifically to the phenomenon of prior belief interfering with inference-drawing and inference evaluation (in particular with respect to syllogistic tasks) is belief bias.26

The experiments investigating the phenomenon of belief bias are usually formulated with syllogistic arguments; subjects are presented with fully formulated arguments rather than being asked to draw conclusions themselves from given premises. As it turns out, subjects typically endorse far more arguments as valid if their conclusions also accord with prior belief, and reject arguments whose conclusions clash with prior belief, even though they have been instructed to judge only on the basis of what follows 'logically'.27 Here I report on experiments with evaluation tasks, but belief bias has also been observed in conclusion production tasks, i.e. when subjects are asked to draw conclusions themselves from given premises [Oakhill & Johnson-Laird 1985].

In such studies, subjects are typically presented with syllogisms: some are valid, some are invalid, some have a believable conclusion, some have an unbelievable conclusion, in all four possible combinations. Here are some of the syllogisms presented to subjects in the [Evans, Barston, & Pollard 1983] study (notice that the valid syllogisms, both believable and unbelievable, are of the same syllogistic mood and figure, i.e. the same 'logical form', and the same holds of the invalid syllogisms):

Valid-believable: No police dogs are vicious. Some highly trained dogs are vicious. Therefore, some highly trained dogs are not police dogs.

Valid-unbelievable: No nutritional things are inexpensive. Some vitamin tablets are inexpensive. Therefore, some vitamin tablets are not nutritional.

Invalid-believable: No addictive things are inexpensive. Some cigarettes are inexpensive. Therefore, some addictive things are not cigarettes.

Invalid-unbelievable: No millionaires are hard workers. Some rich people are hard workers. Therefore, some millionaires are not rich people.

The general results of the experiment were the following (percentage of arguments accepted as valid):

              Believable conclusion    Unbelievable conclusion
Valid                  89                        56
Invalid                71                        10

Clearly, prior beliefs seem to be activated when subjects are evaluating (the correctness of) arguments. Subjects do show a degree of what we could call 'logical competence' (assuming the traditional canons) in that valid syllogisms are more often endorsed than invalid syllogisms in both categories; similarly, syllogisms with believable conclusions are more often endorsed than syllogisms with unbelievable conclusions.
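The acceptance rates reported above can be summarized by two simple differences, sketched here in Python. This is only a descriptive back-of-the-envelope calculation on the four percentages given in the text, not the statistical analysis of the original study; the names accepted, mean_rate, logic_effect and belief_effect are introduced for the example.

```python
# Acceptance rates (% of arguments judged valid), copied from the results
# reported above for the [Evans, Barston, & Pollard 1983] study.
accepted = {
    ("valid", "believable"): 89,
    ("valid", "unbelievable"): 56,
    ("invalid", "believable"): 71,
    ("invalid", "unbelievable"): 10,
}


def mean_rate(position, label):
    """Average acceptance over the two cells whose key has `label` at `position`."""
    cells = [rate for key, rate in accepted.items() if key[position] == label]
    return sum(cells) / len(cells)


# Effect of logical status: valid minus invalid, averaged over believability.
logic_effect = mean_rate(0, "valid") - mean_rate(0, "invalid")
# Effect of belief: believable minus unbelievable, averaged over validity.
belief_effect = mean_rate(1, "believable") - mean_rate(1, "unbelievable")

print(logic_effect)   # 32.0
print(belief_effect)  # 47.0
```

On this rough measure, the believability of the conclusion shifts acceptance by more percentage points than the validity of the argument does.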

What is remarkable though is that invalid syllogisms with believable conclusions are more often endorsed (71%) than valid syllogisms with unbelievable conclusions (56%). Apparently, doxastic conservativeness trumps logical competence, and this seems to be an accurate picture of some of our most pervasive cognitive tendencies.28

In order to further probe the mechanisms involved in these responses, other researchers [Sa, West, & Stanovich 1999] designed an experiment where syllogisms were presented to subjects with conclusions that would be neither believable nor unbelievable; in other words, syllogisms with so-called 'unfamiliar content', so that the patterns of doxastic conservativeness would presumably not be activated. The prediction was that, if it was indeed doxastic conservativeness that was trumping logical competence, in the case of syllogisms with unfamiliar content this would not occur. First, subjects were presented with an invalid syllogism with familiar content, and in fact with a believable conclusion:

All living things need water. Roses need water. Roses are living things.

As could have been anticipated, only 32% of the subjects gave the logically 'correct' response when evaluating this syllogism, i.e. that it is invalid. They were then given a little scenario of a situation on a different planet, involving an imaginary species, wampets, and an imaginary class, hudon, and subsequently were asked to evaluate the following syllogism:

All animals of the hudon class are ferocious. Wampets are ferocious. Wampets are animals of the hudon class.

Interestingly, 78% of the very same subjects, a large majority of whom had failed to give the 'logically correct' response in the previous task, gave the 'logically correct' response here, i.e. that the syllogism is invalid. Even more significantly, the two syllogisms have the exact same mood and figure, presumably AAA-2 (the universal quantifiers are unstated in the second premise and the conclusion). So while subjects had failed to 'see' the invalidity of the first syllogism, arguably in virtue of its familiar content, in the second case the unfamiliar content ensured that prior beliefs did not interfere with the subjects' reasoning. Experiments such as this led K. Stanovich to the following conclusion:

The rose problem [the discrepancy just mentioned] illustrates one of the fundamental computational biases of human cognition: the tendency to automatically bring prior knowledge to bear when solving problems. That prior knowledge is implicated in performance on this problem even when the person is explicitly told to ignore the real-world believability of the conclusion, illustrates that this tendency toward contextualizing problems with prior knowledge is so ubiquitous that it cannot easily be turned off, hence its characterization here as a fundamental computational bias (one that pervades virtually all thinking whether we like it or not). [Stanovich 2003, 292-293]

(Technically, it is not a matter of knowledge as we philosophers understand it, i.e. as involving factuality, but rather a matter of belief, given that a subject will act in the exact same way towards her false beliefs.) Stanovich then goes on to argue that the computational bias consisting in activating prior beliefs when tackling a given problem is in most cases advantageous, e.g. in terms of the allocation of cognitive resources and time constraints, and that it makes sense to think of it as an adaptation which increased fitness in the human environment of evolutionary adaptedness, many thousands of years ago. But he claims that there are a few (but important) situations in modern life where such computational biases must be overridden, as they would otherwise produce suboptimal reasoning leading to negative real-life consequences. It seems to me that, although he may overstate the negative consequences of not overriding 'pigheadedness' in specific situations, there is something very true in Stanovich's observations. As a general cognitive pattern, our tendency towards doxastic conservativeness is normally beneficial; but in specific situations, in particular in scientific contexts, it must be compensated for if we aim at optimal reasoning (given certain goals) in these situations. Which devices could effectively act as a counterbalance to some of these computational biases? My claim here is that formal languages are one such technology, in particular in that the process of de-semantification that is typically (though not always) a core feature of our uses of formal languages as operative devices effectively helps us block the interference of prior beliefs in the reasoning process. In short, we here see how a philosophical thesis, namely the claim that Krämer's notions of operative writing and de-semantification can account for the cognitive impact of uses of formal languages (in particular, but not exclusively, in logic), can receive empirical corroboration. The observed effect of improved logical 'performance' with unfamiliar content seems in practice to correspond to a process of de-semantification, and reflecting on doxastic conservativeness as a fundamental computational bias (as defined by Stanovich) gives us a tentative explanation for why our reasoning abilities can be undermined when dealing with familiar content: essentially, because we simply cannot help seeking confirmation for the beliefs we already possess.
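The invalidity of the form shared by the rose and wampet syllogisms can also be checked mechanically, which is one small illustration of what it means to treat an argument in a de-semantified way. The following is a minimal sketch, not part of the experiments discussed: it reads 'All X are Y' as set inclusion, assumes (as in traditional syllogistic) that every term has at least one instance, and searches small domains for a countermodel. The function names and the domain-size bound are assumptions made for this illustration.

```python
from itertools import product


def all_are(xs, ys):
    """Read 'All X are Y' as set inclusion."""
    return xs <= ys


def find_countermodel(premises, conclusion, max_size=3):
    """Return sets (S, M, P) making the premises true and the conclusion
    false, or None if no such sets exist in domains up to max_size."""
    for size in range(1, max_size + 1):
        # Each individual is independently in or out of S, M and P.
        for rows in product(product([False, True], repeat=3), repeat=size):
            S = {i for i in range(size) if rows[i][0]}
            M = {i for i in range(size) if rows[i][1]}
            P = {i for i in range(size) if rows[i][2]}
            if not (S and M and P):
                continue  # assumption: no empty terms
            if premises(S, M, P) and not conclusion(S, M, P):
                return S, M, P
    return None


# AAA-2: All P are M; all S are M; therefore all S are P (rose / wampet form).
aaa2 = find_countermodel(
    premises=lambda S, M, P: all_are(P, M) and all_are(S, M),
    conclusion=lambda S, M, P: all_are(S, P),
)

# AAA-1 (Barbara): All M are P; all S are M; therefore all S are P.
barbara = find_countermodel(
    premises=lambda S, M, P: all_are(M, P) and all_are(S, M),
    conclusion=lambda S, M, P: all_are(S, P),
)

print("AAA-2 countermodel:", aaa2)
print("AAA-1 countermodel:", barbara)
```

The search finds a countermodel for AAA-2 and none for AAA-1. The absence of a countermodel in small domains is only a sanity check on the valid form, not a proof of validity. What matters for the present argument is that the check never consults what 'roses' or 'wampets' are: the content of the terms plays no role.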

Most people would not dispute that the use of formal languages should facilitate the reasoning process in that it augments computational power, as suggested by Whitehead's quote above and as discussed in the literature on the 'extended mind thesis'. But my analysis goes further: the use of formal languages, or 'operative languages' more generally (including mathematical formalisms), in fact allows a human agent to 'run a different software' altogether, as it were, one that is able to override the 'pigheaded' software that we normally run when reasoning. This effect is particularly important in the case of applied logics, i.e. when logic is used to investigate a particular domain or phenomenon, given that in such cases the investigator may have stronger intuitions ('hunches') concerning the target phenomenon in question, which should nevertheless not interfere in the reasoning process. One way to describe the different 'software' induced by reasoning with formal languages is that it forces the agent to consider all cases where the premises are true, rather than focusing on her preferred cases (those which accord with her prior, external beliefs).29 Similarly, Stenning's [Stenning 2002] results on how the media and the 'languages' used for teaching logic (i.e. sentential or diagrammatical systems of representation) have a significant impact on the students' learning process and reasoning suggest that the 'external means' one uses when doing logic have an undeniable cognitive effect on the reasoning process itself. From this point of view, it is not surprising at all that, once they had the technology of formal languages at their disposal, logicians could do logic in ways that are substantially different from how logic was done prior to the wide adoption of formal languages in logical practices.30 It might be thought, however, that the results of what is perhaps the most widely studied task in the psychology of reasoning literature, namely the Wason selection task, contradict my main thesis here: subjects' performance on this task is typically very poor when they are given 'abstract material' to work with, but it improves in many (though not all) contentful versions of the task.31 But what has also emerged from this literature is that not all contentful versions of the task enhance reasoning performance; it is not even the familiarity factor that explains the facilitating effect. What really seems to make a substantive difference is whether subjects interpret the conditional in the task as a descriptive conditional or as a deontic conditional. Stenning and van Lambalgen [Stenning & Lambalgen 2004] have argued that if subjects interpret the conditional as deontic, the task acquires an entirely different logical structure than if they interpret the conditional as descriptive; in fact, with the deontic interpretation, the task is computationally much more tractable, and not for reasons associated with confirmation bias or anything related to it. Moreover, the fact that subjects perform poorly when given 'abstract' contents (letters or numbers) only goes to show that the use of formal languages is not an innate ability in humans: it is a technique that must be learned in order to be used successfully. Thus, the results on the Wason selection task are at the very least compatible with the main hypothesis of this paper: formal languages are a technology which, when properly mastered, may serve as a counterbalance to some of our more spontaneous cognitive patterns, precisely in that they activate other cognitive processes which we can perform but which are not activated under normal circumstances.
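For readers who want the descriptive reading of the selection task spelled out, the following minimal sketch may help. The task itself is not described in the text above, so everything specific here is an assumption: the sketch uses the classic abstract version (cards showing A, K, 4 and 7; rule 'if a card has a vowel on one side, it has an even number on the other'), treats the rule as a material conditional, and takes the hidden faces to range over the same letters and numbers. All names are hypothetical.

```python
VOWELS = set("AEIOU")

# Assumed card faces for the classic abstract version of the task.
LETTERS = ["A", "K"]
NUMBERS = ["4", "7"]


def rule_holds(letter, number):
    """Descriptive reading: 'vowel -> even number' as a material conditional."""
    return (letter not in VOWELS) or (int(number) % 2 == 0)


def must_turn(visible):
    """A card must be turned iff some possible hidden face falsifies the rule."""
    if visible.isalpha():
        return any(not rule_holds(visible, n) for n in NUMBERS)
    return any(not rule_holds(l, visible) for l in LETTERS)


cards = ["A", "K", "4", "7"]
to_turn = [card for card in cards if must_turn(card)]
print(to_turn)  # ['A', '7']
```

On this reading only the cards that could reveal a counterexample need to be turned. The sketch covers the descriptive interpretation alone; it does not model the deontic interpretation discussed by Stenning and van Lambalgen.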

Conclusion

To conclude, let me refer to a much quoted passage by D'Alembert: 'Algebra is generous: she often gives more than is asked of her'; F. Staal [Staal 2007] has suggested that the same holds of formal/artificial languages. Drawing on empirical data from psychology and cognitive science, I have offered one possible explanation for the generosity of formal languages: they give us more than we expect because they allow us to go beyond the beliefs we already hold; they deliver us new knowledge. This phenomenon is related to the fact that they are a technology with built-in mechanisms for the inhibition of more spontaneous reasoning tendencies, which seek to confirm prior belief (what I have referred to as our tendency towards doxastic conservativeness). These mechanisms are related to the operative ('paper-and-pencil') nature of formal languages, i.e. to their role as cognitive artifacts. My suggestion was that these external devices allow us to run a 'different software' when reasoning, a software that is not geared towards bringing in prior beliefs, which is one of our strongest 'computational biases' according to Stanovich. Notice however that, originally, the avowed purposes of formal languages were essentially expressive and meta-theoretical; in this sense too formal languages are generous, i.e. in the sense of impacting reasoning in ways that go far beyond the original applications they were designed for. As for the prospects for a practice-based philosophy of logic, I have attempted to illustrate one possible path of development, namely to bring in data about how humans reason in order to attain a better grasp of the analogies and (perhaps more importantly) disanalogies between 'everyday life' forms of reasoning and reasoning as performed by trained logicians. There is every reason to believe that other practice-oriented attempts may deliver significant results, and I hope at least to have convinced the reader that a practice-based approach can be fruitful and shed new light on some long-standing issues within the philosophy of logic: in the present case, why doing logic with formal languages is so different from doing logic without formal languages.

BIBLIOGRAPHY

ANDRADE-LOTERO, EDGAR & DUTILH-NOVAES, CATARINA 2010 Validity, the squeezing argument and alternative semantic systems: the case of Aristotelian syllogistic, Journal of Philosophical Logic, OnlineFirst, 1-32, doi:10.1007/s10992-010-9166-y.

AVRON, ARNON 1994 What is a logical system?, in What is a logical system?, edited by GABBAY, D., Oxford: Oxford University Press, 217-238.

AWODEY, STEVE & RECK, ERICH H. 2002 Completeness and Categoricity. Part I: Nineteenth-century Axiomatics to Twentieth-century Metalogic, History and Philosophy of Logic, 23, 1-30.

BEALL, JC & RESTALL, GREG 2006 Logical Pluralism, Oxford: Oxford University Press.

BENTHEM, JOHAN VAN 2008 Logical dynamics meets logical pluralism?, Australasian Journal of Logic, 6, 182-209.

CLARK, ANDY 2008 Supersizing the Mind, Oxford: Oxford University Press. 1998 The extended mind, Analysis, 58, 10-23.

DE CRUZ, HELEN, NETH, HANSJÖRG & SCHLIMM, DIRK 2010 The cognitive basis of arithmetic, in PhiMSAMP. Philosophy of mathematics: Sociological aspects and mathematical practice, edited by LÖWE, B. & MÜLLER, T., London: College Publications, 59-106.

DE CRUZ, HELEN & DE SMEDT, JOHAN 2010 The innateness hypothesis and mathematical concepts, Topoi, 29, 3-13.

DUMMETT, MICHAEL 1978 The justification of deduction, in Truth and Other Enigmas, London: Duckworth, 290-318.

DUTILH NOVAES, CATARINA 2011 The different ways in which logic is (said to be) formal, History and Philosophy of Logic, 32(4), 303-332. 2012 Formal Languages from a Cognitive Perspective, Cambridge: Cambridge University Press.

ELQAYAM, SHIRA & EVANS, JONATHAN ST. B. T. 2011 Subtracting "ought" from "is": Descriptivism versus normativism in the study of human thinking, Behavioral and Brain Sciences, 34(5), 233-248.

EVANS, JONATHAN, BARSTON, JULIE & POLLARD, PAUL 1983 On the conflict between logic and belief in syllogistic reasoning, Memory and Cognition, 11, 295-306.

EVANS, JONATHAN ST. B. T. 1989 Bias in Human Reasoning: Causes and Consequences, London: Erlbaum Associates. 2002 Logic and human reasoning: An assessment of the deduction paradigm, Psychological Bulletin, 128(6), 978-996.

FINE, CORDELIA 2006 A Mind of Its Own: How your brain distorts and deceives, New York: W.W. Norton.

FREGE, GOTTLOB 1977 Begriffsschrift, in From Frege to Gödel: A Source Book in Mathematical Logic, 1879-1931, edited by HEIJENOORT, JEAN VAN, Cambridge: Harvard University Press, 3rd ed., 1-82, 1879.

GÖDEL, KURT 1995 Collected Works - Vol. III: Unpublished essays and lectures, Oxford: Oxford University Press.

HAACK, SUSAN 2000 Philosophy of Logics, Cambridge: Cambridge University Press.

HEIJENOORT, JEAN VAN 1967 Logic as calculus and logic as language, Synthese, 17, 324-330.

HERSH, REUBEN 1997 What is Mathematics, Really?, Oxford: Oxford University Press.

HERSH, REUBEN (ED.) 2006 18 Unconventional Essays on the Nature of Mathematics, Berlin: Springer.

HODGES, WILFRID 2009 Logic and games, in Stanford Encyclopedia of Philosophy, edited by ZALTA, E.

KERKHOVE, BART & BENDEGEM, JEAN PAUL VAN 2007 Perspectives on Mathematical Practices: Bringing together philosophy of mathematics, sociology of mathematics, and mathematics education, Dordrecht: Springer.

KRÄMER, SYBILLE 2003 Writing, notational iconicity, calculus: On writing as a cultural technique, Modern Language Notes - German Issue, 118(3), 518-537.

KUHN, THOMAS 1962 The Structure of Scientific Revolutions, Chicago: University of Chicago Press.

LANDY, DAVID & GOLDSTONE, ROBERT L. 2007a How abstract is symbolic thought?, Journal of Experimental Psychology: Learning, Memory, and Cognition, 33(4), 720-733. 2007b Formal notations are diagrams: Evidence from a production task, Memory and Cognition, 35(8), 2033-2040. 2009 Pushing symbols, in The 31st Annual Conference of the Cognitive Science Society, Amsterdam, 1072-1077. URL facultystaff.richmond.edu/~dlandy/Publications/Landy_Goldstone_2009.pdf.

MANCOSU, PAOLO 2008 Mathematical explanation: Why it matters, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press, 134-149.

MARION, MATHIEU 2009 Why play logical games?, in Games: Unifying Logic, Language, and Philosophy, edited by MAJER, O., PIETARINEN, A.-V., & TULENHEIMO, T., Dordrecht: Springer, 3-26.

MENARY, RICHARD 2007b Writing as thinking, Language Sciences, 29, 621-632.

MENARY, RICHARD (ED.) 2010 The Extended Mind, Cambridge: MIT Press.

MORRIS, ANNE K. & SLOUTSKY, VLADIMIR M. 1998 Understanding of logical necessity: Developmental antecedents and cognitive consequences, Child Development, 69(3), 721-741.

MUGNAI, MASSIMO 2010 Logic and mathematics in the seventeenth century, History and Philosophy of Logic, 31(4), 297-314.

NICKERSON, RAYMOND S. 1998 Confirmation bias: A ubiquitous phenomenon in many guises, Review of General Psychology, 2(2), 175-220.

OAKHILL, JANE & JOHNSON-LAIRD, PHILIP N. 1985 The effect of belief on the production of syllogistic conclusions, Quarterly Journal of Experimental Psychology, 37(A), 553-569.

POPPER, KARL 1959 The Logic of Scientific Discovery, London: Hutchinson, translation of Logik der Forschung, 1934.

ROSENTAL, CLAUDE 2008 Weaving Self-Evidence: A sociology of logic, Princeton: Princeton University Press.

SÁ, WALTER, WEST, RICHARD & STANOVICH, KEITH 1999 The domain specificity and generality of belief bias: Searching for a generalizable critical thinking skill, Journal of Educational Psychology, 91, 497-510.

SCHMANDT-BESSERAT, DENISE 1996 How Writing Came About, Austin: University of Texas Press.

SHER, GILA 2008 Tarski's thesis, in New Essays on Tarski and Philosophy, edited by PATTERSON, D., Oxford: Oxford University Press, 300-339.

SHIN, SUN-JOO 2002 The Iconic Logic of Peirce's Graphs, Cambridge: MIT Press. 2008 Diagrams, in Stanford Encyclopedia of Philosophy, edited by ZALTA, E. URL plato.stanford.edu/entries/diagrams/.

STAAL, FRITS 2006 Artificial languages across sciences and civilizations, Journal of Indian Philosophy, 34, 89-141. 2007 Preface: The generosity of formal languages, Journal of Indian Philosophy, 35, 405-412.

STANOVICH, KEITH E. 2003 The fundamental computational biases of human cognition: Heuristics that (sometimes) impair decision making and problem solving, in The Psychology of Problem Solving, edited by DAVIDSON, J. E. & STERNBERG, R. J., New York: Cambridge University Press, 291-342.

STENNING, KEITH 2002 Seeing Reason, Oxford: Oxford University Press. 2004 A little logic goes a long way: Basing experiment on semantic theory in the cognitive science of conditional reasoning, Cognitive Science, 28, 481-529. 2008 Human Reasoning and Cognitive Science, Cambridge: MIT Press.

SUNDHOLM, B. G. 2003 Tarski and Lesniewski on languages with meaning versus languages without use: A 60th birthday provocation for Jan Wolenski, in Philosophy and Logic. In Search of the Polish Tradition, edited by HINTIKKA, J., CZARNECKI, T., KIJANIA-PLACEK, K., PLACEK, T., & ROJSZCZAK, A., Dordrecht: Kluwer, 109-128.

TARSKI, ALFRED 1959 Introduction to Logic and to the Methodology of Deductive Sciences, New York: Oxford University Press, 8th ed.

WHITEHEAD, ALFRED N. 1911 An Introduction to Mathematics, Oxford: Oxford University Press.

NOTES

*. This research was supported by a VENI grant from the Netherlands Organization for Scientific Research (NWO). I would also like to thank two anonymous reviewers for detailed comments, and the audience at the 'From Practices to Results' conference in Nancy (June 2010) for valuable feedback.

1. In other words, attention to the practices of logicians may be illuminating even within a normative, a priori conception of logic, as I hope will become clear in the paper.

2. Notice that the distinction between process vs. output does not correspond neatly to the 'context of discovery vs. context of justification' distinction, as one of the elements that is of interest from a practice-based perspective is how canons of scientific justification are established and enforced by communities of practitioners.

3. There are other ways of discharging the assumption, such as raising new, important questions which are not (or cannot be) formulated within traditional philosophy of X (I owe this point to an anonymous referee). But in such cases, the traditional philosopher of X can still argue that the question is not in fact relevant, or that it is not even a truly philosophical question (speaking from personal experience!).

4. One field that must be further explored on philosophical grounds is the connection between logic and games. There are of course several logical systems that make heavy use of the game analogy (dialogical logic, game-theoretic semantics, etc.), but is the analogy justified? In which way is logic really (like) a game? See for example [Marion 2009] and [Hodges 2009] for discussions of these issues.

5. Insofar as logic is traditionally seen as an a priori, normative enterprise, it might be thought that 'empirically-informed philosophy of logic' is something of an oxymoron. To argue that it is not so is one of the goals of the present contribution.

6. I have in mind the cases in which a semantics does not naturally present itself and a somewhat ad hoc, concocted semantics is developed to fill in the gap, even though this semantics does not really represent the original target-phenomenon faithfully. See [Andrade-Lotero & Dutilh-Novaes 2010] for a critical discussion of the significance of proofs of soundness and completeness in the case of Aristotelian syllogistic.

7. There is also an interesting tradition in research on mathematics education, on how for example the concept of logical necessity develops upon instruction; see for instance [Morris & Sloutsky 1998].

8. Admittedly, this might be too much to hope for; it is not obvious that the logician will be interested in talking to the practice-based philosopher of logic.

9. I argue for these claims in more detail in [Dutilh Novaes 2012, chap. 3].

10. Although Montague famously claimed that English is a formal language.

11. This does not mean that Principia Mathematica itself had already reached the stage of formulating meta-theoretical questions (essentially, it had not); rather, the claim is that the meta-theoretical questions which were beginning to emerge (as described in [Awodey & Reck 2002]) could be more easily addressed by others against the background offered by Principia Mathematica.

12. But notice that formal languages are not a necessary condition for meta-logical investigations. The Prior Analytics itself, the first logical text in this tradition, offers sophisticated meta-theoretical analyses.

13. Needless to say, the surprise factor in technological development can also lead to disastrous consequences.

14. Writing did arise independently at a few other places and times, but the history of such developments is not significantly different (as far as we can establish from our current knowledge of these developments).

15. See [Krämer 2003, 531]. But compare [De Cruz, Neth & Schlimm 2010] on the cognitive impact of different numerical systems.

16. These three roles are discussed in more detail in [Dutilh Novaes 2012, chap. 3].

17. Given that the iconic role of formal languages is not the main topic of the present contribution, my discussion here will be rather succinct. The brief considerations presented here should essentially be seen as suggestions for future work.

18. Naturally, there are different kinds (degrees?) of iconicity. The iconicity of, for example, the notation in Principia Mathematica is fundamentally different from the iconicity in, for example, Euclidean diagrams. Krämer's main point, which I endorse here, is that every form of writing explores and relies on graphic/iconic components. One of her examples is the use of footnotes in written texts, which clearly exploits the graphic two-dimensional possibilities of surfaces (as opposed to the one-dimensionality of speech).

19. That they are distinct dimensions emerges from the observation that a given system of representation can have a high level of iconicity and yet not be suitable to be 'operated on', i.e. to facilitate and aid mental processes (again, the differences between Roman and Indian/Arabic numerals). It might be thought that iconicity should enhance reasoning performance in the sense that, given that it allows for direct interpretation (in Stenning's sense), the cognitive step of 'scanning' the inscriptions and 'imposing an abstract syntactic structure on concatenations' [Stenning 2002, 51] can be 'skipped'. In practice, however, diagrammatic systems of representation allow for a lesser degree of abstraction, which in turn affects expressibility and also calculability, as I shall argue. Moreover, at least two other factors must be taken into account: individual differences in reasoning styles, and the kind of task at hand (both of which are discussed in [Stenning 2002]). The heart of the matter is in any case that there are good reasons to distinguish the iconic from the operative dimension of formal languages.

20. "In other words, this [phonetic] definition excludes so-called 'formal languages' that construct graphical systems sui generis and which are all, at best, verbalized retroactively and verbalized only in a limited, fragmentary form. [...] Calculus is the incarnation of operative writing." [Krämer 2003, 522]. Tellingly, Krämer started her research career as a historian of 17th century logic and mathematics, in particular the concepts of calculus and calculation.

21. Menary analyzes the cognitive impact of writing from the point of view of extended cognition, i.e. the idea that certain external objects, artifacts and technologies are often constitutive of the cognitive processes involving them [Menary 2007b].

22. In a sense, the very birth of logic as we know it with Aristotle already contained an element of de-semantification, in his extensive use of schematic letters in the two Analytics (as pointed out by an anonymous reviewer).

23. As reported by [Krämer 2003, fn. 29], until the beginning of modern times in Europe, the zero was not viewed as a number on a par with other numbers, but rather essentially as a 'gap'.

24. Commenting on whether infinitely small or infinitely large numbers are 'actual' numbers in the context of his development of infinitesimal mathematics, Leibniz said: "On n'a point besoin de faire dépendre l'analyse mathématique des controverses métaphysiques" [There is no need to make mathematical analysis depend on metaphysical controversies] (quoted in [Krämer 2003, fn. 36]).

25. Whether these deviances from the traditional canons of deduction can rightly be considered to be mistakes depends on accepting the normative authority of these canons over thinking and reasoning; but this is a moot point [Elqayam & Evans 2011].

26. In fact, the term 'belief bias' is more felicitous as a description of the whole phenomenon than the term 'confirmation bias', given that the bias is not towards confirming simpliciter but rather towards reinforcing prior beliefs, which of course may require disconfirming (refuting) a particular hypothesis if it clashes with prior beliefs.

27. This is one of the methodological problems with such experiments: they seem to assume that subjects will give an unproblematic interpretation to the notion of 'following logically', but this is arguably a thoroughly theory-laden concept; see [Morris & Sloutsky 1998]. Nevertheless, what emerges from the experiments is still the strong 'pull' of doxastic conservativeness.

28. As described in [Fine 2006, chap. 4], our pigheadedness goes well beyond its manifestation specifically with respect to inference-drawing.

29. Indeed, one way to conceptualize this difference is in terms of non-monotonicity vs. monotonicity. Our 'computational bias' entails non-monotonic patterns of reasoning, while well-designed formal languages in deductive settings enforce monotonic reasoning; see [Dutilh Novaes 2012] for details.

30. Moreover, work by Landy and Goldstone [Landy & Goldstone 2007a], [Landy & Goldstone 2007b], [Landy & Goldstone 2009] suggests that agents typically call upon sensorimotor processing to operate with mathematical and logical notations, and thus that the perceptual features of the different systems are bound to have an impact on the reasoning processes. They literally 'move' bits and pieces of the notation around to perform the correct transformations. An 'extended cognition' perspective on formal languages is developed in [Dutilh Novaes 2012, chap. 5].

31. For reasons of space, I am assuming that the reader is familiar with the Wason selection task; for further details, see [Stenning & Lambalgen 2004], [Stenning & Lambalgen 2008]. The point of this short discussion is rather to anticipate a natural objection to my main thesis, and to suggest how the objection could be dealt with. To be sure, the discussion of the Wason task here will remain rather superficial for reasons of space; further details can be found in [Dutilh Novaes 2012, chap. 4].

ABSTRACTS

In different subfields of philosophy, focus on actual human practices has been an important (albeit still somewhat non-mainstream) approach in recent decades. But so far, no such practice-based turn has taken place within the philosophy of logic. In the first part of the paper, I delineate what a practice-based philosophy of logic would (could) look like, insisting in particular on why it can be relevant and how it is to be undertaken. In the second part, I illustrate the proposed practice-based approach by means of a case-study: the role played by formal languages in logic, in particular in the practices of logicians. I argue that formal languages play a fundamental operative role in the work of logicians, as a paper-and-pencil, hands-on technology triggering certain cognitive processes and, more specifically, countering some of our more 'spontaneous' cognitive patterns which are not particularly suitable for research in logic (as well as in other fields). I substantiate these claims with empirical data from research in the psychology of reasoning. With this analysis, I hope to show that a practice-based philosophy of logic can be a fruitful enterprise, in particular if accompanied by much-needed methodological reflection.

In recent decades, work focusing on actual human practices has gained importance in various areas of philosophy, without however reaching a dominant position. To date, though, this kind of practical turn has not yet reached the philosophy of logic. In the first part, I sketch what a philosophy of logic centered on the study of practices would (or could) be, insisting in particular on its relevance and on how it should be conducted. In the second part, I illustrate this practice-centered approach by means of a case study: the role played by formal languages in logic, in particular in the practices of logicians. My thesis is that formal languages play a fundamental operative role in the work of logicians as a practical paper-and-pencil technology that generates cognitive processes and, more specifically, counterbalances some of our 'spontaneous' cognitive patterns that are poorly suited to research in logic (as well as in other fields). This thesis will be supported by empirical data from research in the psychology of reasoning. With this analysis I hope to show that a philosophy of logic centered on the study of practices can be fruitful, in particular if it is complemented by the necessary methodological reflections.

AUTHOR

CATARINA DUTILH NOVAES University of Groningen (The Netherlands)

Part II: Case Studies

Learning from Euler. From Mathematical Practice to Mathematical Explanation

Daniele Molinini

1 Introduction

1 Does philosophy of mathematics have something to do with the way mathematicians do their job? Can philosophy of mathematics learn from the observation and the analysis of the way mathematics is done? As is well known, a positive answer to these questions came from the philosophy of mathematics itself, with the emergence during the sixties of a strong opposition to the classical foundational programs (logicism, Hilbert's program and intuitionism). This reaction against the "dogmas of foundationalism" [Tymoczko 1998, 95], which started with Imre Lakatos and which has been pursued by what Aspray and Kitcher defined as "a maverick group of philosophers" [Aspray & Kitcher 1988], gave central importance to the history of mathematics and took mathematical practice as a driving force in philosophical research.

2 The research questions posed by the "maverick" philosophers were oriented more towards the heuristics of mathematics, the way in which mathematics grows, the notion of explanation in mathematics, and the distinction between formal and informal proofs and reasoning in mathematics [Aspray & Kitcher 1988, 17]. Furthermore, questions concerning the dynamics of mathematical discovery and the historical development of mathematics were also addressed [Kitcher 1984].

3 There is no doubt that such research questions are still urgent and central to the agenda of contemporary philosophers of mathematics ([Mancosu, Jørgensen & Pedersen 2005]; [Mancosu 2008]). And it can be noted that similar claims about the importance of scientific practice in philosophical investigation come from the general philosophy of science.1 But how can the questions of the traditional philosophy of science be dissolved or moved by the examination of scientific practice? And, as far as the topic of the present issue of this journal is concerned, how can an examination of the practice of mathematicians help in informing topics in the philosophy of mathematics? In this paper I will propose a possible answer to the latter question. By focusing on the notion of mathematical explanation, I will show that an analysis of Euler's scientific practice not only provides an evaluation of a philosophical model (Steiner's model of mathematical explanation in mathematics), but also suggests new directions of investigation that might prove fruitful. In this sense, the aim of the paper is twofold: it will test one particular account of mathematical explanation on a specific case-study, and it will show that an investigation which takes scientific practice as a starting point for philosophical analysis can have strong repercussions on debates which are central to the contemporary philosophy of mathematics and to philosophy of science in general.

4 It should be pointed out that this practice-driven methodology in the study of the notion of explanation, i. e. a methodology according to which a theory of explanation must be tested and refined starting from the analysis of scientific practice itself, has been adopted by other authors as well ([Hafner & Mancosu 2008]; [Mancosu 2008]). The same methodology is welcomed by some members of the so-called "Stanford School of History and Philosophy of Science", notably Nancy Cartwright and Margaret Morrison [Hoefer 2008]. These authors agree that it is possible to reach a better comprehension of mathematical explanation by focusing on particular case-studies and taking the test-case itself as a starting point for philosophical considerations.

5 In this study I will focus on mathematical explanation within mathematics, leaving aside the different topic of mathematical explanation in natural and social sciences [Mancosu 2008, 134]. But what do we refer to with the expression 'mathematical explanation of a mathematical fact'? To answer this question, it is important to note that mathematical activity cannot be reduced to a purely justificatory exercise. A great part of this activity is driven by factors other than justificatory aims such as establishing the truth of a mathematical fact. In many cases, when confronted with a theorem or a problem, the mathematician prefers a particular proof-strategy or procedure because it provides more than a mere justification of the mathematical truth ([Kitcher 1984]; [Sandborg 1998]; [Hafner & Mancosu 2005]; [Tappenden 2005]). In other words, that particular proof-strategy or procedure goes beyond the simple reason that 'it makes the mathematical truth evident'. It also provides an 'explanation' of why that mathematical result is evident (the reason why).

6 As Paolo Mancosu has shown in one of his studies on explanation in mathematics, even if there is a renewed interest in the notion of mathematical explanation, this interest does not represent a novelty in philosophy and the attention to the topic can be traced back to Aristotle [Mancosu 2000]. Although there seems to be enough evidence that mathematical explanations occur within mathematics, the notion has not been sufficiently studied and much more investigation is needed to capture it adequately. Mark Steiner's account of mathematical explanation, proposed in his paper "Mathematical explanation" [Steiner 1978a], represents the first explicit contribution to the study of the notion in analytic philosophy. More precisely, with his model Steiner attempts to account for explanations in mathematics which come in the form of proof.2

7 The outline of the paper is as follows. First of all, I will illustrate Steiner's account of mathematical explanation in mathematics. Then, in Section 3, I will concentrate on Euler's mathematical practice and present his geometrical proof of the so-called 'Euler's theorem' for the existence of an instantaneous axis of rotation in rigid body kinematics. Finally, I will evaluate Steiner's account on this particular test-case and show that the account does not count Euler's geometrical proof as a genuine explanation, thus contradicting the intuitions of Euler himself. The final Section will contain my conclusions.

2 Steiner's account of explanation in mathematics

8 Mark Steiner has presented his account of mathematical explanation (coming under the form of proof) in his paper "Mathematical explanation" [Steiner 1978a]. The starting point in the building of his original approach is the following observation: "to explain the behaviour of an entity, one deduces the behavior from the essence or nature of the entity" [Steiner 1978a, 143]. This remark has to face a well-known problem: since mathematical truths are commonly regarded as necessary, it is meaningless to speak of essential properties of a mathematical entity. Thus, in order to escape all the difficulties related to the definition of an essential property of a mathematical entity x, i. e. a property x enjoys in all possible worlds, Steiner introduces the relative notion of characterizing property:

Instead of 'essence', I shall speak of 'characterizing property', by which I mean a property unique to a given entity or structure within a family or domain of such entities or structures. (I take the notion of a family or domain undefined in this paper; examples will follow shortly.) We thus have a relative notion, since a given entity can be part of a number of different domains or families. Even in a single domain, entities may be characterized multiply. [Steiner 1978a, 143]

9 For Steiner, an explanatory proof depends on such a property, while a non-explanatory proof does not. In particular, "an explanatory proof makes reference to a characterizing property of an entity mentioned in the theorem, such that from the proof it is evident that the result depends on that property" [Steiner 1978a, 143]. The dependence of the result on the characterizing property shows in the fact that if we try to manipulate the proof, by substituting in it a different object of the same domain, the theorem collapses. This introduces us to Steiner's second core notion about explanation by proofs: generalizability through the variation of a characterizing property. If we deform the proof by varying a certain characterizing property of a related entity, what we obtain in response is a change of the theorem. To every deformation of the proof there corresponds a deformation in the theorem, i. e. to an array of proofs there corresponds an array of theorems.3 The theorems obtained are proved and explained by the deformations of the original proof. This is what it means, for Steiner, for an explanatory proof to be generalizable [Steiner 1978a, 143].

10 To sum up, Steiner offers two criteria for a proof to count as explanatory:

11 C1 dependence on a characterizing property of an entity mentioned in the theorem (dependence criterion)

12 C2 possibility to deform the proof by "substituting the characterizing property of a related entity" and getting "a related theorem" (generalizability criterion).

13 Steiner's examples of explanatory proofs include two different proofs of the formula for the sum of the first n integers and the proof of the irrationality of √2 involving the fundamental theorem of arithmetic. In another paper, which is devoted to the distinct topic of mathematical explanations in science, Steiner considers as explanatory a proof of a particular theorem of linear algebra [Steiner 1978b]. Since the test case I am going to present in the next Section concerns the very same theorem (as discussed by Euler), let me consider this example here as illustrative of his account. The proof requires some familiarity with the language of linear algebra, and therefore it is first necessary to give some basic terminology.

14 An orthogonal matrix $A$ is a real square matrix with the following property: $A^{t} = A^{-1}$ (the transpose matrix of $A$ coincides with the inverse of $A$).4 The class of $n \times n$ orthogonal matrices is a group under matrix multiplication.5 The group of real orthogonal $n \times n$ matrices is called the orthogonal group, and it is denoted by $O(n)$. The property $A^{t} = A^{-1}$ clearly holds for every real orthogonal matrix which belongs to $O(n)$. Since $AA^{t} = I$, we have: $|AA^{t}| = |A||A^{t}| = |A||A| = |A|^{2} = |I| = 1$.6 Thus $|A| = \pm 1$ for every member of $O(n)$. The subgroup of matrices with determinant $+1$ is called the special orthogonal group and it is denoted by $SO(n)$. We can regard a matrix as a representation of a linear mapping $\alpha$ of the Euclidean space ($\alpha : V \to V$); then $n$ is the dimension of the space ($n = \dim(V)$). The characteristic polynomial $P_{\alpha}(t)$ of the linear map $\alpha : V \to V$ is the polynomial $|A - tI|$, where $A$ is our matrix representation of $\alpha$. The scalar $\lambda$ is an eigenvalue of the linear mapping $\alpha$ if and only if $|A - \lambda I| = 0$ (this is a theorem), where the equation $|A - \lambda I| = 0$ is called the secular equation.

15 An eigenvector of the transformation is a non-null vector $v$ that is transformed into a scalar multiple of itself: $\alpha(v) = \lambda v$. If $\lambda$ is any eigenvalue of $\alpha$, the set of all vectors $v$ of $V$ with $\alpha(v) = \lambda v$ is a nonzero subspace of $V$ which is called the eigenspace of $\lambda$; it consists of the zero vector and all the eigenvectors belonging to $\lambda$. Naturally, if the eigenspace has dimension 1, it is a line.

16 All the notions sketched in the previous paragraph belong to the basic background knowledge of linear algebra and group theory. Consider now a particular case of orthogonal matrices (or transformations): those matrices for which the condition $|A| = +1$ holds, i. e. the members of the special orthogonal group SO(n). Here is the theorem considered by Steiner, together with its proof:

17 Theorem 1. Every matrix $A \in SO(3)$, with $A \neq I_{3}$, has an eigenvalue $+1$.

18 Proof. The proof of the existence of such an eigenvalue is very short and could be stated via a speedy argument which requires no direct calculations but only some background notions.7 We assume $A \neq I_{3}$ because $A = I_{3}$ is the uninteresting case of the identity transformation.

19 Consider a real matrix $A$ member of $SO(3)$ and recall that for this matrix the determinant is equal to $+1$. Furthermore we know that, in general, the secular equation $|A - \lambda I| = 0$ has three roots which correspond to three eigenvalues. We want to prove that one of these eigenvalues is our $\lambda = +1$.

20 Suppose now $\lambda_{1}$, $\lambda_{2}$ and $\lambda_{3}$ are eigenvalues of $A$. They are the roots of a cubic polynomial with real coefficients (the entries of the matrix are real). Thus, according to the fundamental theorem of algebra (FTA) and the complex conjugate root theorem (CCRT),8 one of the eigenvalues (say, $\lambda_{1}$) is real. By CCRT, if $\lambda_{2}$ is not real, then its complex conjugate $\lambda_{2}^{*}$ is also an eigenvalue ($\lambda_{3} = \lambda_{2}^{*}$). Since $|A| = \lambda_{1}\lambda_{2}\lambda_{3}$, we have two possibilities for the eigenvalues:

21 a) $\lambda_{1} = 1$, $\lambda_{2} = \lambda_{3} = \pm 1$

22 b) $\lambda_{1} = 1$, $\lambda_{2} = \lambda_{3}^{*} \notin \mathbb{R}$ (observe here the change in notation for $\lambda_{3}^{*}$)

23 In either case we have that $+1$ is an eigenvalue.9
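The case distinction above tacitly relies on the fact that every eigenvalue of an orthogonal matrix has modulus 1. The following is a minimal sketch of that step in modern notation that does not appear in the text: the Hermitian inner product $\langle\cdot,\cdot\rangle$, the eigenvector $v$ and the rotation $R_{\theta}$ are assumptions introduced only for illustration. Note that $|A|$ denotes a determinant, while $|\lambda|$ denotes the modulus of a complex number.

```latex
% Assumption: A is real orthogonal (A^{t}A = I) and v \neq 0 is a
% (possibly complex) eigenvector, A v = \lambda v.
\begin{align*}
  \langle v, v\rangle
    = \langle A^{t}A v, v\rangle
    = \langle A v, A v\rangle
    = \langle \lambda v, \lambda v\rangle
    = |\lambda|^{2}\,\langle v, v\rangle
  \quad\Longrightarrow\quad |\lambda| = 1 .
\end{align*}
% Case a): all three eigenvalues are real, hence each equals \pm 1;
%          their product is |A| = 1, so the number of -1's is even
%          and at least one eigenvalue equals +1.
% Case b): \lambda_{2}\lambda_{3} = \lambda_{2}\lambda_{2}^{*} = |\lambda_{2}|^{2} = 1,
%          hence \lambda_{1} = |A| / (\lambda_{2}\lambda_{3}) = 1.

% Illustration (hypothetical example): rotation by \theta about the third axis.
\[
  R_{\theta} =
  \begin{pmatrix}
    \cos\theta & -\sin\theta & 0 \\
    \sin\theta & \cos\theta  & 0 \\
    0          & 0           & 1
  \end{pmatrix},
  \qquad
  \text{eigenvalues: } 1,\; e^{i\theta},\; e^{-i\theta}.
\]
```

For $0 < \theta < \pi$ this example falls under case b), and the eigenspace of the eigenvalue $+1$ is the third coordinate axis, i. e. a line.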

24 In his paper "Mathematics, explanation and scientific knowledge" [Steiner 1978b], Steiner considers the proof above as explanatory. How then are his criteria for explanatoriness C1 and C2 supposed to operate in this case? Recall, first of all, that for Steiner a proof is explanatory only if it makes evident that the conclusion depends on a property of some entity or structure mentioned in the theorem (criterion C1). In other words, the proof must involve a characterizing property.

25 Steiner's attention in his paper is not explicitly directed to explanations in pure mathematics; the focus is elsewhere.10 This is, perhaps, what precludes him from offering a detailed description of how C1 and C2 are fulfilled in the proof above. However, in a personal communication he suggests that we have to "pick the characterizing property of SO(3) as having an odd dimension". This means that the entity (or structure) mentioned in the theorem is SO(3), which belongs to the family SO(n), while its characterizing property is 'having an odd dimension'. But in what sense, then, does the previous proof depend on the odd dimensionality of SO(3)? If we concentrate, as Steiner suggests, on the proof strategy above, it is evident that the existence of the eigenvalue $\lambda = +1$ depends on the fact that n is odd. We also agree with Steiner that there is no necessity for any eigenvalue to be +1 [Steiner 1978a, 18], because this does not hold for all the elements of the family SO(n) (in two or four dimensions, for instance, the number of real eigenvalues is even and the theorem does not hold).

26 Second, recall that for Steiner generalizability comes when we substitute the characterizing property of a related object and what we get is a related deformed theorem (criterion C2). Concerning C2, then, Steiner's idea is that the generalizability of the proof above is obtained by replacing the dimension 3 by some odd number: "There is an explanatory proof of this [existence of an eigenvalue +1] that extends to any Euclidean space of odd dimension" (personal communication). It is easy to see how we can use this strategy to get new theorems. By replacing 3 by some odd number we obtain our related theorem, i. e. a theorem which states the existence of the eigenvalue +1 for every real matrix A ∈ SO(n) with n odd.11
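The generalization to odd n can be spelled out as a short counting argument. What follows is a hedged reconstruction in the notation of Theorem 1, not a quotation from Steiner; the symbols $r$, $p$, $q$ and $\mu$ are introduced only for this sketch, and it uses the standard fact that every eigenvalue of an orthogonal matrix has modulus 1.

```latex
% Assumptions: A \in SO(n), so |A| = 1; every eigenvalue has modulus 1.
% Non-real eigenvalues occur in conjugate pairs \mu, \mu^{*} with \mu\mu^{*} = |\mu|^{2} = 1.
% r = number of real eigenvalues (with multiplicity); p of them equal +1, q equal -1.
\begin{align*}
  r &= p + q \equiv n \pmod{2}, \\
  1 &= |A| = (+1)^{p}\,(-1)^{q} \prod_{\text{pairs}} \mu\mu^{*} = (-1)^{q}
       \quad\Longrightarrow\quad q \text{ is even}, \\
  n \text{ odd}
    &\;\Longrightarrow\; r \text{ odd}
     \;\Longrightarrow\; p = r - q \text{ odd}
     \;\Longrightarrow\; p \geq 1 .
\end{align*}
% For even n the conclusion can fail. The plane rotation
\[
  \begin{pmatrix}
    \cos\theta & -\sin\theta \\
    \sin\theta & \cos\theta
  \end{pmatrix},
  \qquad 0 < \theta < \pi,
\]
% belongs to SO(2) and has eigenvalues e^{i\theta} and e^{-i\theta}, neither of which is +1.
```

On this reading, replacing 3 by an odd n leaves every step intact, while replacing it by an even n breaks the parity step.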

27 After this short presentation of Steiner's theory, in the next Section I will concentrate on Euler and his geometrical proof of a particular result in the kinematics of rigid body motion. The theorem in which this result appears, called Euler's theorem, states that the general displacement of a rigid body with a point fixed is a rotation about some axis. In other words, the theorem says that for every rotation of a rigid body with a fixed point there exists an instantaneous axis around which the rotation is made.12 From an analysis of Euler's practice as a mathematician it will emerge that Euler looked at his geometrical proof as a mathematical 'explanation' of a mathematical truth, and not as a simple justificatory procedure.

3 Euler's proof for the existence of an instantaneous axis of rotation

28 Euler's contributions to mechanics are numerous and of primary importance. Among them are the first use of a perpendicular Cartesian coordinate system to describe the motion of a rigid body, in his Recherches sur les corps célestes en général [Euler 1747], and the introduction of the so-called 'Euler Angles' in chapter IV of his Introductio in Analysin Infinitorum (1748), while investigating surfaces in space. Euler was also the first to prove the existence of an instantaneous axis of rotation in the kinematics of rigid body motion. He obtained this result, for the first time, in his E177 Découverte d'un nouveau principe de mécanique [Euler 1750].13

29 In E177 Euler utilizes previous results in order to study the general motion of a rigid body with a fixed point and deduces the changes in the position and the velocity distribution from the given forces acting on the body. His enterprise in the dynamics of rigid body motion in space was stimulated by the problem of the rotation of the Earth around its axis (so as to explain the precession of the equinoxes). Newton had given a first explanation of the phenomenon in book III of the Principia, while D'Alembert tried to improve Newton's study in his 1749 Recherches sur la précession des équinoxes et sur la nutation de l'axe de la Terre [Wilson 1987]. Furthermore, Euler's studies in hydraulics and in the motion of ships during the 1740s allowed him a new and different approach to mechanics.14 The separation of the progressive motion of the center of gravity from the rotatory motion obtained in his Scientia Navalis (written between 1737 and 1740, and published in 1749) is applied here in a kinematical version and it permits Euler to reach the following conclusion: the rotatory motion of the body is independent of its progressive motion. As Euler affirms in the Introduction:

Quelque composé que soit le mouvement d'un corps solide, on le peut toujours décomposer en un mouvement progressif et en un mouvement de rotation. […] on peut entreprendre la recherche du mouvement de rotation, tout comme si le corps n'avoit aucun mouvement progressif. [Euler 1750, 82]

30 The separation of motions gives Euler the opportunity to concentrate only on rotatory motion and to state the existence of an instantaneous axis of rotation:

Supposant donc le centre de gravité d'un corps solide quelconque en repos, ce corps sera néanmoins susceptible d'une infinité de mouvemens différens. Or je démontrerai dans la suite, que, quel que soit le mouvement d'un tel corps, ce sera pour chaque instant non seulement le centre de gravité qui demeure en repos, mais il y aura aussi toujours une infinité de points situés dans une ligne droite, qui passe par le centre de gravité, dont tous se trouveront également sans mouvement. [Euler 1750, 83]

31 In the Section Détermination du mouvement en général, dont un corps solide est susceptible, pendant que son centre de gravité demeure en repos, in order to study the general motion of the body, Euler introduces a Cartesian system fixed in absolute space and assumes that a point Z of the body with coordinates x, y, z has velocities P, Q, R in the direction of the axes. The components of the velocity P, Q, R are functions of x, y, z. Euler's final purpose is to find these functions. He considers another point z "infiniment proche du précédent Z", of coordinates x + dx, y + dy, z + dz and velocities P + dP, Q + dQ, R + dR. After an analytic treatment, Euler is able to conclude that there are points, which have coordinates (Cu, -Bu, Au), that do not move during time dt.15 These points are on a straight line through the origin, which is called the "instantaneous axis of rotation".16

32 […] tous les points du corps, qui sont contenus dans ces formules x = Cu, y = -Bu, z = Au demeureront en repos pendant le temps dt. Or tous ces points se trouvent dans une ligne droite, qui passe par le centre de gravité O; donc cette ligne droite demeurant immobile sera l'axe de rotation, autour duquel le corps tourne dans le présent instant. [Euler 1750, 95]
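In modern vector notation, which Euler does not use, this conclusion can be sketched as follows. The angular velocity vector $\omega$ and its identification, up to a common factor, with the direction $(C, -B, A)$ are assumptions made only for this illustration; the coefficients $A$, $B$, $C$ are those of Euler's formulas, whose derivation is not reproduced here.

```latex
% Assumption: the instantaneous velocity field of a rigid body with fixed
% point O is v(r) = \omega \times r, with \omega \neq 0 and r = (x, y, z).
\begin{align*}
  (P, Q, R) = \omega \times (x, y, z) = 0
  \;\Longleftrightarrow\;
  (x, y, z) = u\,\omega \quad \text{for some } u \in \mathbb{R}.
\end{align*}
% Assumption: \omega is proportional to (C, -B, A). Then the points at rest are
%   x = Cu, \quad y = -Bu, \quad z = Au,
% i.e. a straight line through the origin O.
```

The sketch only shows why the set of points at rest is a line through the fixed point; it does not reconstruct Euler's own analytic treatment.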


33 After the analytic argument, Euler adds a purely geometric (i. e. non analytic) proof of the existence of the instantaneous axis of rotation, discussing the infinitesimal motion of a spherical surface with a fixed point. He embarks on the new proof after the following remark: "Sans entrer dans le détail du calcul, que je viens de déveloper, on peut aussi prouver la même vérité par la seule Géométrie" [Euler 1750, 96]. Keep in mind the theorem: When a rigid body is moved around its center it is always possible to find a line, passing through the center, whose position is the same as before the motion. His geometrical proof runs as follows17:

34 Euler's geometrical proof. In time dt, the rigid body with a fixed point (the center of gravity) will have moved from one position to another. Now, consider in the moving body a spherical layer ("couche sphérique") whose center coincides with the center of gravity (in Figure 1 I have indicated the center of gravity with a cross 'x', but this point is not shown in Euler's original diagram). As Euler observes, to know the motion of the spherical surface defined by this spherical layer amounts to determining the motion of the entire body.

35 If we consider an arc AB of a great circle on the spherical surface, after time dt it will have moved to the new position ab. Since the body is rigid, AB = ab. Euler remarks that, from the study of this configuration, it will be possible to find the positions of all the points of the spherical surface after time dt [Euler 1750, 96].

36 Let's look at the diagram (Figure 1). If we prolong the two arcs BA and ba they will meet in a point, say C. If we take the point c on the arc baC such that ac = AC, after time dt point C (on the arc BA) will be transported to position c (on the arc ba), and thus the arc CAB to cab. For simplicity, call $g_{1}$ the great circle BAC and $g_{2}$ the other circle bcC. Consider now a point M outside the great circle $g_{1}$, and trace the arc MC. After time dt the point M will move to a new position, say m. If we want to know the position m, we simply have to trace the arc cm = CM such that ∠acm = ∠ACM.

37 Now, if we prolong the two arcs CM and cm, they will intersect in a point O. Thus, after time dt, the arc CMO will be transported to the arc cmO. If CMO = cmO, the point O will not move in time dt and will be a fixed point. But to say that the arcs CMO and cmO are equal amounts to saying that the spherical triangle cCO is isosceles. Therefore the problem of seeing whether there exists such a fixed point O corresponds to the problem of finding a possible configuration in which the spherical triangle cCO is isosceles. This configuration, as Euler points out, can always be found:

Il est certain qu'on peut toujours constituer en sorte l'arc CMO, qu'ayant décrit son arc correspondant cmO, il soit cmO = CMO. Car pour que cela arrive on n'a qu'à constituer l'arc CMO en sorte, que l'angle cCO devienne égal à l'angle CcO; afin que le triangle sphérique COc devienne isoscèle et partant les côtés CO et cO égaux entr'eux. [Euler 1750, 96]

FIGURE 1: Reconstruction of Euler's original diagram as it appears in E177.

38 In order for the spherical triangle to be isosceles it suffices that the angles ∠cCO and ∠CcO be equal. How do we choose M in the right way to obtain this configuration? Euler's reasoning is very simple. Consider the configuration in which the angles are equal: ∠cCO = ∠CcO. Observe that:

∠cCO = ∠ACO - ∠ACc and ∠CcO = 180° - ∠acO (4)

39 But we also want that

∠ACO = ∠acO and ∠cCO = ∠CcO (5)

40 Thus, by simple substitutions and addition of the angles, we have:

2∠cCO = 180° - ∠ACc (6)
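The substitution step can be written out explicitly. The lines below only combine (4) and (5) as stated above; no further assumption is made.

```latex
\begin{align*}
  \angle cCO + \angle CcO
    &= (\angle ACO - \angle ACc) + (180^{\circ} - \angle acO)
       && \text{adding the two equalities in (4)} \\
    &= 180^{\circ} - \angle ACc
       && \text{since } \angle ACO = \angle acO \text{ by (5)} \\
  2\,\angle cCO
    &= 180^{\circ} - \angle ACc
       && \text{since } \angle cCO = \angle CcO \text{ by (5)}
\end{align*}
```

Hence ∠cCO = 90° - ½∠ACc, so the required angle at C is fixed by the angle ∠ACc alone.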


FIGURE 2: The instantaneous axis of rotation.

41 Although Euler does not do this, it is immediate to see the instantaneous axis just by tracing a line passing through the center of gravity (which is at rest in the motion) and the point O. In Figure 2, I traced the instantaneous axis $I_{r}$ and the system of reference fixed with the rigid body before and after the infinitesimal motion, respectively $(x_{0}, y_{0}, z_{0})$ and $(x_{1}, y_{1}, z_{1})$.

42 Why did Euler add the previous geometrical proof if he had already obtained the result via an analytic procedure? Bear in mind the mathematical result: there exists a line which does not move in the infinitesimal transformation. As Euler affirms, we can prove that this is true without calculations and via geometry alone [Euler 1750, 96]. It seems then that Euler adds the geometrical proof to convince the reader of the validity of the previous analytic result. More precisely, as I am going to show in the next paragraph, Euler is convinced that the geometrical procedure provides us with a kind of comprehension of the result (it is not mere justification). The geometrical proof has an epistemic value that the analytic proof lacks.

43 Even though Euler strongly contributed to the transition towards an analytical mechanics, by pushing forward the process of mathematization started with Newton, geometrical arguments still played a crucial role in his scientific practice. As an example, in his De novo genere oscillationum (presented in 1739), in the study of what we call today 'forced simple harmonic motion', Euler interpreted geometrically every quantity involved in the problem.18 This, however, should not come as a surprise. As a legacy of the 17th century, geometrical concepts were still rooted in mechanics and the way in which the new mechanics developed did not entirely depend on the new mathematical techniques for conceptual support [Mahoney 1984]. In the Preface to his Mechanica (1736), Euler points out that in reading works such as Newton's Principia and Hermann's Phoronomia he had difficulties in solving problems slightly different from the geometrical cases presented, but he could "understand" the solutions to the particular geometrical problems "well enough" [Euler 1736, Preface].19 Here Euler attributed to geometrical proofs some specific epistemic virtue ("understanding").20 Observe that, while Euler stresses that the works composed "without analysis" do not provide sufficient knowledge to solve problems slightly different from a given (geometrical) example, he admits that geometrical methods "persuade of the truth of the things that are demonstrated" and lead to the comprehension of the solution [Euler 1736, Preface]. This is exactly the case of E177: he adds the geometrical proof because it persuades of the existence of point O (and then of the instantaneous axis) and, furthermore, leads to the comprehension of the solution. The analytic procedure lacks this epistemic value. This is why Euler's geometrical proof marks a decisive point and is considered by Euler to be of primary importance. To prove a geometrical theorem, Euler adds to the analytic proof a purely geometric proof. In considering a geometrical proof for the geometrical theorem, Euler restricts the conceptual resources used to prove the theorem to those which determine the content of the theorem, i. e. geometrical concepts. To put it roughly, the same geometrical concepts are used to enunciate and prove the theorem. Therefore Euler is concerned with 'purity of methods'.21 And, in line with what Euler affirms in the Preface to his Mechanica, this purity increases the epistemic quality of the proof with respect to the analytic proof: the geometrical procedure provides us with a kind of comprehension of the result (it is not pure justification). The qualitative type of knowledge that the geometrical proof provides is the knowledge of the basic reasons why the result is true.
The purely geometric proof is explanatory because, by making it possible to reach the result through the same resources which determine the content of the theorem, it provides us with the knowledge of why the instantaneous axis exists: the axis exists because we can easily construct it geometrically, by finding point O and tracing the line in the diagram. Geometrical reasoning defines a natural, 'organic' conceptual path which leads us from the content of the theorem to the result. This is how the purely geometric proof sheds light on why the result is true.22 The analytic proof, although convincing of the correctness of the mathematical claim, does not offer this possibility.

44 Let me mention that the previous case regarding the geometrical proof in E177 does not represent an isolated example within Euler's mathematical activity. In 1775 Euler published E478, Formulae generales pro translatione quacunque corporum rigidorum. Here he reconsiders the more general problem of the description of the change of position of a rigid body. The difference with E177 is that here Euler is dealing with an arbitrary (finite) motion. In Section 20 of E478 Euler examines the following problem: does there exist a straight line which has its initial and final directions parallel (before and after the translation)? In order to find this line he tries to solve a system of equations, but the problem is analytically too complex and he is not able to find a solution.23 Thus, once more, he gives a geometrical proof, similar to that presented above.24 Euler is convinced of the validity of his analytic (unsolved) procedure, and thus of the existence of the line, because he has a geometrical proof which confirms this "vérité". Again, the geometrical proof persuades of the truth of the things that are demonstrated. More precisely, Euler observes how sometimes rules of analytical formulas "hide" the truth of excellent properties such as the existence of the line under investigation:

As it thus has been seen by the most solid reasoning, that in every translated situation there is always given such a straight line IZ, whose direction does not differ from that which the same line IZ held in its initial situation […] On this account this excellent property, the truth of which is so easily shown geometrically, will be most hidden by rules of analytical formulas; and owing to this we can anticipate the rules themselves to be the most important advancement for the whole science of mechanics. [Euler 1775, 96]

45 Interestingly, at the end of his paper Euler refers to the difficulty concerning the analytical problem and writes:

But truly nobody who is easily stunned will undertake this [analytic] work; on this account this extraordinary property of all rigid bodies is judged as much more difficult, and can provide the best opportunity for the geometers to exercise their powers by thoroughly explaining this property. [Euler 1775, 98] 25

46 To sum up, Euler informs the reader why the particular geometrical proof was chosen, beyond the simple reason that it makes the proof work. This is what happens in the geometrical proof presented above, where the proof is given by Euler to "easily" show the existence of the instantaneous axis. By making obvious how the fixed point O can be found, this particular proof convinces us of the result. Yet it provides not only a simple justification of the result, but also an explanation of why the instantaneous axis exists (the geometer explains this property through the conceptual instruments of geometry). And the previous quotations show that Euler is convinced that the (pure) geometrical proof has this particular (explanatory) virtue.

4 Testing Steiner's account on Euler's proof

47 Steiner's model for explanation in mathematics has been discussed by various authors ([Resnik & Kushner 1987]; [Weber & Verhoeven 2002]; [Hafner & Mancosu 2005]). However, a fine-grained analysis of it has been carried out only by Resnik and Kushner [Resnik & Kushner 1987], and Hafner and Mancosu [Hafner & Mancosu 2005]. Both criticisms point to a major defect of Steiner's account: without any constraint on what a 'family' (or 'domain') of mathematical entities is, the choice of 'characterizing property' in a proof is subject to arbitrariness. Furthermore, and more interestingly for what follows here, Hafner and Mancosu choose as test-case a proof recognized as explanatory in mathematical practice, coming from the work of Alfred Pringsheim in the theory of infinite series, and show how Steiner's criteria of explanatoriness fail to count this proof as explanatory. Hafner and Mancosu's moral is that Steiner's model does not reflect the practice of mathematicians, and for this reason must be rejected or improved to account for the intuitions of practicing mathematicians.

48 Let's now consider Euler's geometrical proof given in the previous Section. If Euler's geometrical proof is explanatory, as Euler's mathematical practice seems to suggest, Steiner's model of explanation should reflect this intuition by recognizing it as an explanatory proof. In other words, criteria C1 and C2 should be applied successfully.

49 Recall Steiner's example discussed in Section 2.26 First of all, we must identify the characterizing property. Steiner does not provide any precise indication on how to perform such identification, but he suggests that this property should characterize something referred to in the theorem such that from the proof it is evident that the conclusion depends on this property. What does Euler's theorem refer to? If we come back to the theorem, we can observe that it is about a spherical surface which remains rigid in the transformation (the distance between any two given points of the surface remains constant in the transformation), a point which is the center of such a surface and which does not move during the transformation, and a line whose position is the same as before the transformation. Now, the object line cannot be taken into account as an object possessing a characterizing property, because it represents the result we want to prove. In fact, recall that according to Steiner the result must depend on the characterizing property. Therefore it would be nonsensical to say that the result depends on a characterizing property of the result itself. It might be thought that the characterizing property is the surface's property of having a fixed point, or the property of the point center of remaining in the same position. The proof and the result clearly depend on both these conditions. However, even if we assume that these properties characterize uniquely the two objects spherical surface and point (within the two families 'spherical surfaces subject to a transformation' and 'points subject to a transformation'), it is difficult to see how criterion C2 might apply. How do we deform the proof by substituting a characterizing property of a related entity? If we vary one of the two properties, the proof (and thus the theorem) will collapse, without telling us anything about new theorems.

50 On the other hand, we can turn our attention to the proof itself and observe how it depends on a great number of geometrical objects (lines, points, angles, arcs, …). Although each of those elements belongs to a family of similar entities and can be characterized in terms of some specific property (according to Euclidean geometry), they all contribute to the proof strategy and are therefore necessary to the validity of the proof. Is it possible to identify a characterizing property among those objects? Perhaps a natural move would be to consider the particular importance Euler attributes to one step of the proof. As we have seen above, the key step in Euler's proof strategy is to find a geometrical configuration in which point O has the same position before and after the motion of the spherical surface. To find this configuration, as Euler points out, amounts to finding a configuration in which the spherical triangle cCO is isosceles. The triangle is clearly the focus of Euler's proof, and it is evident that the proof, together with the result, depends on this object's being isosceles (if spherical triangle cCO were not isosceles, point O would not be fixed in the transformation and the theorem would collapse). This suggests that we can take the characterizing property of triangle cCO to be that of being isosceles, i.e. 'having the angles ∠cCO and ∠CcO equal' (or, equivalently, 'having the two sides CMO and cmO equal'). This condition, which uniquely characterizes the triangle (it picks out that particular triangle within the family of triangles), marks a crucial step in Euler's proof and therefore deserves particular attention. Unfortunately, although this choice would reflect Steiner's core intuition, because the proof depends on this particular property of the spherical triangle cCO, neither C1 nor his second criterion C2 can be applied. Concerning C1, the entity triangle cCO is not mentioned in the theorem, and therefore this criterion is not satisfied. Moreover, recall that, according to C2, if we deform the proof by "substituting the characterizing property of a related entity" we get "a related theorem". Here the entity under investigation is a triangle, and its characterizing property is 'having the angles ∠cCO and ∠CcO equal'. If we consider a related entity (another triangle), it is not clear how a substitution in its characterizing property could be made. According to Euler's constructive proof, if we consider a spherical triangle cCO with two angles not equal, the theorem fails to hold and we do not obtain new theorems.

51 There is, however, a second object mentioned in the proof that should be regarded as a good candidate for having a characterizing property. In order for the spherical triangle OCc to be isosceles, point M must be chosen such that CM is perpendicular to γ. It might be thought, then, that the proof depends on the property the arc CM has of being perpendicular to the angular bisector γ or, which amounts to the same thing, on the property the angle formed by ∠cCO and […] has of being a right angle. Although both represent reasonable choices for a characterizing property, and they uniquely characterize the theorem, even in this case the proof would not meet Steiner's criteria. The two entities (the arc CM and the angle formed by ∠cCO and […]) are not mentioned in the theorem, contrary to what is demanded by C1. Furthermore, it is not clear how the deformability criterion C2 is supposed to operate. How do we pick out a related entity with a replaced characterizing property? Again, it seems that Steiner's criterion C2 cannot be applied, simply because it is not possible to deform the proof in his sense.

52 My choices above were motivated by the fact that identifying a configuration in which the spherical triangle OCc is isosceles represents a crucial step in Euler's proof strategy. If there is a characterizing property, it is reasonable to concentrate on that step. Furthermore, this reflects well Steiner's idea about the dependence of the result on the characterizing property: it must be evident, from the proof, that the result depends on the characterizing property. There are, of course, other objects in the proof which can be picked out as entities possessing a specific characterizing property: for instance, the property that the angles ∠cCO and ∠CcO have according to expression (4), or the property the great circle AB has of remaining unaffected in length when moved to the new position ab. However, even if we consider these possibilities, Steiner's criterion C1 is not satisfied (the entities in question are not mentioned in the theorem) and it is hard to see how C2 might be applied. Neither the theorem nor our proof seems to be 'deformable', in Steiner's sense, to yield genuinely new results. It is not possible to deform the proof by "holding constant the proof idea" [Steiner 1978a] and obtain something like the new theorem that we found in Steiner's example of Section 2.

53 Let me summarize the results of this Section. For the proof to count as explanatory in Steiner's sense, it must be plain that criteria C1 and C2 are satisfied. Although the choice of a characterizing property in Euler's proof seems to be highly arbitrary, as previous criticisms of the model have shown ([Resnik & Kushner 1987]; [Hafner & Mancosu 2005]), I have considered some possible candidates. My choice was motivated by the importance that a property of some geometrical objects has in Euler's proof strategy. However, neither criterion C1 nor criterion C2 applies to these characterizing properties, which prevents Steiner's theory from recognizing the proof as explanatory. Other possibilities might of course be considered, but a general observation seems to show that the model is inapplicable to this particular test case: Euler's constructive proof is about a particular result, and it seems that no deformation of it will tell us anything about new theorems. The moral is that Steiner's account fails to recognize this proof as explanatory, and this contrasts with Euler's intuitions as a mathematician.²⁷ Obviously, the previous considerations shift the burden of proof to Steiner. If his theory of explanation is right, Steiner must show that it is able to recognize Euler's geometrical proof as an explanatory proof.

5 Conclusions and perspectives

54 If my analysis is correct, Steiner's model cannot account for the explanatory character of the geometrical proof given by Euler, and it should therefore be refined or even rejected in favour of a different approach. Perhaps the lesson we learn from Euler is that new directions of investigation are needed to capture the notion of mathematical explanation adequately. We have seen that Euler appealed to geometry to show why a particular geometrical result is true. Moreover, I pointed to the crucial role that Euclidean geometry, and more precisely a pure (geometrical) method, played in his explanatory practice. This suggests that it might be philosophically more profitable to abandon Steiner's idea that an explanatory proof depends on a particular property of an entity mentioned in the theorem, in favour of an approach which focuses on the preferences mathematicians express for certain mathematical concepts or for the particular mathematical framework used to prove a theorem. On the other hand, it might be thought that the notion of explanatory proof cannot be captured simpliciter, as Steiner proposes, but that there is a variety of explanatory proof practices in mathematics. In this sense, Euler's practice would represent just one example of such practices.

55 It would be interesting to pursue these issues further, but they are fertile terrain for another study. The modest aim of this paper was rather to show that Steiner's theory has difficulties in accounting for a case of genuine mathematical explanation. Furthermore, and more importantly for the ambitions of the present thematic issue, I hope to have provided an example of how mathematical practice can inform the philosophy of mathematics. About 250 years have passed, yet we can still learn from Euler.

I am indebted to the following people for their generous comments on earlier versions of this paper and for their encouragement: Paolo Mancosu, Marco Panza, Chris Pincock, Henk De Regt, Sébastien Maronne, José Diez, Steven French. I would also like to thank Mark Steiner for his clarifications on his account of explanation, and an anonymous referee for very helpful suggestions.

BIBLIOGRAPHY

ASPRAY, WILLIAM & KITCHER, PHILIP 1988 History and Philosophy of Modern Mathematics, Minneapolis: University of Minnesota Press.


DE REGT, HENK & DIEKS, DENNIS 2005 A contextual approach to scientific understanding, Synthese, 144, 137-170.

DETLEFSEN, MICHAEL 2008 Purity as an ideal of proof, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press, 179-198.

DETLEFSEN, MICHAEL & ARANA, ANDREW 2011 Purity of methods, Philosophers' Imprint, 11(2), 1-20.

EULER, LEONHARD

1736 Mechanica, sive motus scientia analytice exposita [E15, E16], St. Petersburg: Ex typographia Academiae Scientiarum, reprinted in Opera Omnia, Series II, Vols. 1 and 2. The works of Euler will be cited according to Leonhardi Euleri Opera omnia, C. Blanc, A.T. Grigorijan, W. Habicht et al. (eds.), Basel: Birkhäuser, 1911-1986.

1739 De novo genere oscillationum [E126], in Opera Omnia, II, vol. 10, 78-97, originally published in Commentarii academiae scientiarum Petropolitanae, 11, 1750, 128-149.

1747 Recherches sur les corps célestes en général [E112], in Opera Omnia, II, vol. 25, 1-44, originally published in Mémoires de l'académie des sciences de Berlin, 3, 1747, 93-143.

1748 Introductio in analysin infinitorum [E101, E102], Lausanne: Apud Marcum-Michaelem Bousquet and Socios, reprinted in Opera Omnia, Series I, Vols. 8 and 9.

1750 Découverte d'un nouveau principe de mécanique [E177], in Opera Omnia, II, vol. 5, 81-108, originally published in Mémoires de l'académie des sciences de Berlin, 6, 1752, 185-217.

1775 Formulae generales pro translatione quacunque corporum rigidorum [E478], in Opera Omnia, II, vol. 9, 84-98, originally published in Novi Commentarii academiae scientiarum Petropolitanae, 20, 1775, 189-207. A translation by John Sten is available here: http://www.17centurymaths.com/contents//euler/e478tr.pdf.

GOLDSTEIN, HERBERT 1957 Classical Mechanics, Reading, Massachusetts: Addison Wesley, 1st ed.

GROVE, LARRY C. 2002 Classical Groups and Geometric Algebra, no. 39 in Graduate Studies in Mathematics, Providence: American Mathematical Society.

GROVE, LARRY C. & BENSON, CLARK T. 1985 Finite Reflection Groups, New York: Springer, 2nd ed.

GUICCIARDINI, NICCOLÒ 2004 Dotage: Newton's mathematical legacy in the eighteenth century, Early Science and Medicine, 9(3), 218-256.

HAFNER, JOHANNES & MANCOSU, PAOLO

2005 The varieties of mathematical explanation, in Visualization, Explanation and Reasoning Styles in Mathematics, edited by MANCOSU, PAOLO, JØRGENSEN, KLAUS FROVIN, & PEDERSEN, STIG, Dordrecht: Springer, 215-250.

2008 Beyond unification, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press, 151-179.

HOEFER, CARL 2008 Introducing Nancy Cartwright's philosophy of science, in Nancy Cartwright's Philosophy of Science, edited by HARTMANN, STEPHAN, HOEFER, CARL, & BOVENS, LUC, New York: Routledge, 1-13.

KITCHER, PHILIP 1984 The Nature of Mathematical Knowledge, Oxford: Oxford University Press.


KOETSIER, TEUN 2007 Euler and kinematics, in Leonhard Euler: Life, Work and Legacy, edited by BRADLEY, R. E. & SANDIFER, C. E., Amsterdam: Elsevier, 167-194.

MAHONEY, MICHAEL S. 1984 Changing canons of mathematical and physical intelligibility in the later 17th century, Historia Mathematica, 11, 417-423.

MALTESE, GIULIO 2000 On the Relativity of Motion in Leonhard Euler's Science, Archive for History of Exact Sciences, 54(4), 319-348.

MANCOSU, PAOLO

2000 On mathematical explanation, in The Growth of Mathematical Knowledge, edited by GROSHOLZ, E. & BREGER, H., Kluwer, 103-109.

2001 Mathematical explanation: Problems and prospects, Topoi, 20, 97-117.

2008 Mathematical explanation: Why it matters, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press, 134-149.

2011 Explanation in mathematics, in The Stanford Encyclopedia of Philosophy, edited by ZALTA, EDWARD N., summer 2011 ed.

MANCOSU, PAOLO, JØRGENSEN, KLAUS FROVIN & PEDERSEN, STIG 2005 Visualization, Explanation and Reasoning Styles in Mathematics, vol. 327, Dordrecht: Springer.

RAVETZ, JEROME 1961 The representation of physical quantities in eighteenth-century mathematical physics, Isis, 52, 7-20.

RESNIK, MICHAEL D. & KUSHNER, DAVID 1987 Explanation, independence, and realism in mathematics, British Journal for the Philosophy of Science, 38, 141-158.

SANDBORG, DAVID 1998 Mathematical explanation and the theory of why-questions, British Journal for the Philosophy of Science, 49, 603-624.

STEINER, MARK

1978a Mathematical explanation, Philosophical Studies, 34, 135-151.

1978b Mathematics, explanation and scientific knowledge, Noûs, 12, 17-28.

TAPPENDEN, JAMIE 2005 Proof style and understanding in mathematics I: Visualization, unification and axiom choice, in Visualization, Explanation and Reasoning Styles in Mathematics, edited by MANCOSU, PAOLO, JØRGENSEN, KLAUS FROVIN, & PEDERSEN, STIG, Dordrecht: Springer, 147-214.

TYMOCZKO, THOMAS 1998 New Directions in the Philosophy of Mathematics: An Anthology, Princeton: Princeton University Press.

WEBER, ERIK & VERHOEVEN, LIZA 2002 Explanatory proofs in mathematics, Logique et Analyse, 179-180, 299-307.

WEYL, HERMANN 1973 The Classical Groups. Their invariants and representations, Princeton: Princeton University Press, 2nd ed.


WILSON, CURTIS 1987 D'Alembert versus Euler on the precession of the equinoxes and the mechanics of rigid bodies, Archive for History of Exact Sciences, 37, 223-273.

NOTES

1. For instance, in their paper on scientific understanding, Henk De Regt and Dennis Dieks write: "Nowadays few philosophers of science will contest that they should take account of scientific practice, both past and present. Any general characteristic of actual scientific activity is in principle relevant to the philosophical analysis of science" [De Regt & Dieks 2005, 139].

2. Although "proofs are often vehicles for mathematical explanation" [Sandborg 1998, 616], not all explanations in mathematics take this form. For cases of explanation in mathematics which do not take the form of a proof, see [Kitcher 1984], [Sandborg 1998] and [Mancosu 2001].

3. Although Steiner offers some examples, the notion of 'deformation' is left undefined in his discussion. He writes: "Deformation is similarly undefined – it implies not just mechanical substitution, but reworking the proof, holding constant the proof idea" [Steiner 1978a, 147].

4. The inverse $A^{-1}$ of $A$ satisfies $A^{-1}A = I = AA^{-1}$. The transpose of $A$, denoted by $A^t$, is the matrix obtained by exchanging $A$'s rows and columns: $(A^t)_{ij} = A_{ji}$.

5. A group is a set G, together with a binary operation * on G, which has the following properties: 1) for all g and h in G, g * h ∈ G; 2) for all f, g and h in G, f * (g * h) = (f * g) * h; 3) there is a unique e in G such that for all g in G, g * e = g = e * g; 4) if g ∈ G there is some h in G such that g * h = e = h * g.

6. We used two well-known properties of determinants: for any square matrices $A$ and $B$ of the same size, $|AB| = |A||B|$ and $|A^t| = |A|$.

7. This is the proof to which Steiner refers, and it is given in Goldstein's Classical Mechanics [Goldstein 1957, 123]. The very same proof is common in textbooks of linear algebra and group theory. For instance, see [Grove & Benson 1985].

8. The FTA states that any non-constant polynomial with complex coefficients has at least one complex root. Given the factorization of a polynomial of degree n into linear factors, it follows that such a polynomial has exactly n complex roots, counted with multiplicity. The CCRT says that if a polynomial in one variable with real coefficients, such as $|A - \lambda I|$, has a complex root $\lambda$, then the complex conjugate $\lambda^*$ of $\lambda$ is also a root of the polynomial. This means that, according to the CCRT, if $\lambda$ is a solution of the polynomial equation $|A - \lambda I| = 0$, then $\lambda^*$ is also a solution of the same equation. In general, the FTA says that $|A - \lambda I| = 0$ will always have complex solutions, but not necessarily real ones. However, we are considering a real orthogonal $3 \times 3$ matrix, so the polynomial $|A - \lambda I|$ is a real polynomial (the entries of the matrix $A$ are real, and so are the coefficients of the polynomial) of odd degree (degree 3). Thus, by the FTA, the polynomial equation $|A - \lambda I| = 0$ has an odd number of solutions, counted with multiplicity. Now, according to the CCRT, non-real solutions come in conjugate pairs, and consequently there is an even number of them. Therefore the polynomial equation $|A - \lambda I| = 0$ has at least one real solution (i.e. at least one real eigenvalue).

9. What interests us here is the existence of such an eigenvalue. Observe, however, that the proof given also shows that there is only one eigenvalue $\lambda = +1$. The uniqueness of $\lambda = +1$ can also be proved as a corollary of the more general Cartan-Dieudonné theorem, by showing that the dimension of the space Fix(α) of the fixed points of the transformation is 1 [Grove 2002, 49].

10. In this paper Steiner is mainly concerned with the existence of mathematical explanations in science, and with the existential conclusions which follow once the existence of such explanations is accepted.

11. For instance, this theorem appears in Hermann Weyl's The Classical Groups [Weyl 1973, 58].

12. Note that the theorem seen above, used by Steiner to illustrate his account, is just a modern algebraic formulation of Euler's theorem, once the appropriate physical interpretation is established. This is, for instance, how the theorem appears in Goldstein's textbook: "The real orthogonal matrix specifying the physical motion of a rigid body with one point fixed always has the eigenvalue +1" [Goldstein 1957, 119].

13. In 1910 and 1913, the Swedish mathematician Gustaf Eneström completed the first comprehensive survey of Euler's works. Each work is classified by the letter E followed by a number (its "Eneström number").

14. See [Maltese 2000] on this phase of Euler's work in mechanics, concerning the relativity of motion.

15. The letters A, B, C stand for indeterminate coefficients.

16. Euler does not use the word instantaneous. He refers to it simply as axe de rotation.

17. For a similar reconstruction see [Koetsier 2007, 184-185].

18. Referring to this paper, Jerome Ravetz has observed: "symptomatic of his thinking at this stage is the description of the sine as a line, not a function" [Ravetz 1961, 13].

19. The original passage is in Latin. I adopt here the translation which appears in [Guicciardini 2004].

20. Euler uses the Latin verb 'percipio', which means 'to perceive', 'to understand'. The verbal form is translated as 'understanding' by [Guicciardini 2004, 245].

21. On purity of methods see [Detlefsen 2008] and [Detlefsen & Arana 2011]. Detlefsen and Arana offer the following characterization of purity: "purity in mathematics has generally been taken to signify a preferred relationship between the resources used to prove a theorem or solve a problem and the resources used or needed to understand or comprehend that theorem or problem. In this sense, a pure proof or solution is one which uses only such means as are in some sense intrinsic to (a proper understanding of) a theorem proved or a problem solved" [Detlefsen & Arana 2011, 1].

22. Although the connection between purity of methods and explanation has not yet been sufficiently investigated, purity and explanation are not entirely separate matters, and the two notions seem to be closely connected [Mancosu 2011]. In particular, the idea that purity of methods can increase the epistemic quality of a proof, and more particularly that it can have explanatory value, has been expressed by some authors (cf. [Detlefsen & Arana 2011, 7-8]). Even though in this paper I am pointing to such a connection (between purity of methods and explanatory value) for the case of Euler's E177, it should be noted that being 'pure' is not regarded as a necessary condition for a proof to be explanatory. Other features of a proof, for instance its being visualizable or its being part of a particular theory or framework, have been considered plausible sources of explanatory power as well [Mancosu 2011], regardless of the 'pure' or 'impure' character of the proof.

23. For a presentation of E478, see [Koetsier 2007].

24. As Koetsier has pointed out, it is possible that Euler did not realize that his geometrical proof for the instantaneous motion, in E177, also holds for the case of finite motion [Koetsier 2007, 185].

25. John Sten's translation renders the Latin verb 'enucleare' as 'to explain'. This is the rendering given in standard Latin-English dictionaries, for instance in Charlton T. Lewis and Charles Short's A Latin Dictionary (Oxford: Clarendon Press, 1879).

26. In that case, the characterizing property of SO(3) was 'having an odd dimension', and the generalizability of the theorem came from deforming the proof by replacing the dimension 3 by some odd number.

27. Note that in considering Euler's geometrical proof as a mathematical explanation I did not endorse any particular account of mathematical explanation, such as Steiner's. The fact that this is a mathematical explanation comes from Euler's practice as a mathematician and physicist, as I showed in the previous Section.
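As a numerical illustration of the algebraic result discussed in notes 8 and 9, the following minimal sketch composes two rotations about different axes and checks that the resulting real orthogonal 3 × 3 matrix A satisfies |A − λI| = 0 at λ = +1. The helper names (`rotation`, `matmul`, `transpose`, `det3`, `char_poly_at`), the chosen axes and angles, and the use of Python are assumptions made purely for illustration; none of them comes from Euler, Goldstein or Steiner, and a floating-point check is of course not a proof.

```python
import math

def rotation(axis, angle):
    """Rotation matrix about `axis` by `angle` radians (Rodrigues' formula)."""
    x, y, z = axis
    norm = math.sqrt(x * x + y * y + z * z)
    x, y, z = x / norm, y / norm, z / norm
    c, s = math.cos(angle), math.sin(angle)
    t = 1.0 - c
    return [
        [c + x * x * t, x * y * t - z * s, x * z * t + y * s],
        [y * x * t + z * s, c + y * y * t, y * z * t - x * s],
        [z * x * t - y * s, z * y * t + x * s, c + z * z * t],
    ]

def matmul(a, b):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(a):
    """Exchange rows and columns."""
    return [[a[j][i] for j in range(3)] for i in range(3)]

def det3(m):
    """Determinant of a 3 x 3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def char_poly_at(m, lam):
    """Value of |m - lam * I| for a 3 x 3 matrix m."""
    shifted = [[m[i][j] - (lam if i == j else 0.0) for j in range(3)]
               for i in range(3)]
    return det3(shifted)

# Hypothetical test case: two rotations about different axes, composed.
A = matmul(rotation((1.0, 0.0, 0.0), 0.9), rotation((0.0, 1.0, 1.0), -1.3))

# A is orthogonal: A times its transpose is (numerically) the identity.
AAt = matmul(A, transpose(A))
orthogonality_error = max(abs(AAt[i][j] - (1.0 if i == j else 0.0))
                          for i in range(3) for j in range(3))

print("max deviation of A * A^t from I:", orthogonality_error)
print("|A|:", det3(A))
print("|A - I|:", char_poly_at(A, 1.0))
```

Composing two rotations, rather than building A from a single known axis, keeps the check from being circular: the axis of the composite motion is not supplied in advance, yet |A − I| still vanishes up to rounding error.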

ABSTRACTS

In his "Découverte d'un nouveau principe de mécanique" (1750) Euler offered, for the first time, a proof of the so-called Euler's Theorem. In this paper I will focus on Euler's original proof and I will show how a look at Euler's practice as a mathematician can inform the philosophical debate about the notion of explanatory proofs in mathematics. In particular, I will show how one of the major accounts of mathematical explanation, the one proposed by Mark Steiner in his paper "Mathematical explanation" (1978), is not able to account for the explanatory character of Euler's proof. This contradicts the original intuitions of the mathematician Euler, who attributed to his proof a particular explanatory character.

Dans sa « Découverte d'un nouveau principe de mécanique » (1750) Euler a donné, pour la première fois, une preuve du théorème qu'on appelle aujourd'hui le Théorème d'Euler. Dans cet article je vais me concentrer sur la preuve originale d'Euler, et je vais montrer comment la pratique mathématique d'Euler peut éclairer le débat philosophique sur la notion de preuves explicatives en mathématiques. En particulier, je montrerai comment l'un des modèles d'explication mathématique les plus connus, celui proposé par Mark Steiner dans son article « Mathematical explanation » (1978), n'est pas en mesure de rendre compte du caractère explicatif de la preuve donnée par Euler. Cela contredit l'intuition originale du mathématicien Euler, qui attribuait à sa preuve un caractère explicatif spécifique.

AUTHOR

DANIELE MOLININI Équipe Rehseis (UMR 7219), Université Paris 7 (France)


From Practice to New Concepts: Geometric Properties of Groups

Irina Starikova

Introduction

1 In practice, it seems that mathematics is not carried on solely in terms of axioms, theorems and proofs. Often, some properties are easy to overlook from the perspective given by a definition or a traditional representation. An effective method of revealing new properties is by using different representations of the concepts in such a way that these properties become noticeable.

2 The example presented here shows how groups were represented as graphs. These graphs, represented in turn as metric spaces, helped to reveal many geometric properties of groups. As a result, many combinatorial problems were solved through the application of geometry and topology.

3 For the epistemological analysis of this example, I apply the approach proposed by Manders [Manders 1999]. He considers mathematical practice as control of the selective response to given information, where selective response means emphasising some properties of an object while neglecting others. From this perspective, representations serve to implement the principal constituents of this activity. As will be demonstrated below, this approach makes clear that a change in representation is a valuable means of finding new properties.

4 The structure of the paper is as follows. The first section explains Manders' approach and his comparison of Euclid's and Descartes' geometries. The second section presents the case study: it explains how groups, previously studied algebraically, came to be approached as geometric objects. This was achieved by representing groups as Cayley graphs, and then by representing these graphs as metric spaces. Detailed definitions will be provided, but for the purposes of this paper it is enough to grasp the general idea. Finally, the conceptual impact of the geometric approach to groups will be analysed in terms of Manders' approach. These issues are developed in detail in [Starikova 2011].

1 Mathematical practice in the terms of selective responses

5 In his unpublished paper 'Euclid or Descartes? Representation and Responsiveness', Manders analyses the contribution of Descartes' Géométrie in comparison with Euclid's plane geometry. He particularly stresses the epistemic role of algebraic notation in that contribution. He draws attention to the fact that mathematical problem solving has a strategically selective character: at each segment of practice only some of the available information is taken into account. Strategic indifference includes such epistemic activities as abstraction, unification, idealisation and approximation. Responding to particular elements of the context while remaining indifferent to others provides control over each step taken in that context:

6 ...[T]he mathematician faced with a proposition to prove must exercise strategically allocated indifference in responding to that situation, say, details of examples that do not bear on general claims one is trying to frame or prove. [Manders 1999, 4]

Manders uses the term 'respondif' for these responses and indifferences. I will refer to them as 'selective responses' (which include indifferences).

7 In short, Manders announces that two things are to be demonstrated:
• a. that mathematical advance is based on a systematic, coordinated use of responsiveness and indifference, and
• b. that the coordination of this responsiveness and indifference is implemented by means of representations [Manders 1999, 2].

8 Let me explain the key ideas of his approach.

1.1 Application and applicative response

9 Selective responses are often applied from another domain. In this way one can distinguish between direct and application-mediated responses. In what follows, I will call these 'original' and 'applicative' responses respectively. The purpose of application is to access the resources of the applied domain. For example, in Descartes' geometry, geometric problems are solved by solving algebraic equations which represent the geometric curves. Here is Manders' summary of Descartes' approach:¹

10 From a diagram-based problem to initial equations.
• Set out the problem in diagram form, as if it is already solved.
• Elaborate the diagram, consider its lines and what will be the primitives in the algebraic equations (constants and variables).
• List the algebraic conditions for the problem.
• Label the known and unknown line segments with single letters.

11 Algebraic manipulation. Eliminate the auxiliary unknowns.

12 'Construct' (solve geometrically) the equation.
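The three stages above can be sketched on a deliberately simple case, the construction of an equilateral triangle on a given segment (Euclid I.1), treated in the Cartesian manner. This is only an illustrative sketch under stated assumptions: the choice of example, the placement of the segment on a coordinate axis, the function name `equilateral_apex` and the use of Python are mine, and are not taken from Manders or Descartes.

```python
import math

def equilateral_apex(d):
    """Apex of the equilateral triangle on segment AB, with A = (0, 0), B = (d, 0).

    Stage 1 (from diagram to equations): suppose the problem solved and label
    the unknown apex (x, y). It lies on both circles of radius d centred at
    A and at B:
        x**2 + y**2 = d**2
        (x - d)**2 + y**2 = d**2
    Stage 2 (algebraic manipulation): subtracting the second equation from the
    first eliminates y and leaves 2*d*x - d**2 = 0, hence x = d / 2.
    Substituting back gives y**2 = 3 * d**2 / 4.
    Stage 3 ('construction'): return to the geometry by choosing a root;
    the positive root is the apex above the segment.
    """
    if d <= 0:
        raise ValueError("the segment length d must be positive")
    x = d / 2
    y = math.sqrt(3) * d / 2
    return x, y

# Usage: the apex is at distance d from both endpoints of the segment.
d = 2.0
x, y = equilateral_apex(d)
print(math.hypot(x, y), math.hypot(x - d, y))
```

Once the two equations are written down, the diagram plays no further role until the final choice of root, which is the kind of indifference to the diagram that the algebraic stage permits.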

13 Each stage of this process gives rise to selective responses, either to the original context or to the applicative context. According to Manders, what makes Descartes' approach exceptionally effective is that it enriches traditional geometry with many new types of selective responses. It also coordinates them with each other and with the already established Euclidean geometrical responses and indifferences. For example, the Cartesian approach applies algebraic selective responses:

14 Descartes' geometrical method enhances geometry in the style of the ancients by the algebra of his time, by introducing algebraic responses to geometrical problems (systems of equations), algebraic methods to simplify systems of equations and geometrical responses to equations. [Manders 1999, 8]

1.2 The role of representations (artefacts)

15 In mathematical reasoning we often produce and respond to artefacts: natural language expressions, Euclidean diagrams, algebraic or logical formulas. According to Manders, artefacts help to implement and control selective responses. Artefacts also provide new responses, suitable for the current context: e.g. to recognise a region bounded by three sides. They also make the artefacts available for further steps: e.g. to draw a line between two points. Therefore artefacts fix and stabilise the responses and indifferences to particular elements of a structure.

16 There is also a coordination of the diagram and the text: the text specifies what is essential in producing and reading the diagrams. Therefore Euclidean constructions involve a complex coordination and control of selective responses both to the diagram and the text.

17 Such control and coordination may have different levels of 'quality', and this is the point where improvement is possible. In Euclid's demonstrations some information may be read only from a diagram, some only inferred from prior text entries, while some is both diagram-based and text-based. For example, in the demonstration of Euclid I.1, we know that the closed curves are circles only from prior stipulation in the text. The fact that they intersect arises only from their situation in the diagram, as in the figure below, rather than being inferred from the prior text.

FIGURE 1: Euclid's Proposition I.1.

18 Diagrams are produced according to the specifications in the accompanying text, which makes the depicted relations reproducible and therefore stable. Diagram and text contribute to each other differently, and compensate for each other's weaknesses.

19 However, Manders admits that our control over (Euclidean) diagrams is limited. In the Euclidean style of using diagrams, some metric properties may be unclear. For example, some segments or angles may appear equal when in fact they are unequal. There is also a risk of missing cases in Euclidean diagrammatic practice (e.g. the case of an obtuse triangle may be missed, with considerable consequences).

20 Manders specifies that the features that we read directly from a diagram are (a) perceptually explicit, and (b) stable under the diagram perturbations such as a sequence of reproductions of the diagram. These conditions compensate for our limited control of diagrams.

21 Therefore the Euclidean use of diagrams extensively relies on the appearance of the diagrams, which means that the diagrams have to be realised in a medium, and a degree of accuracy—to satisfy conditions (a) and (b)—is required. Furthermore, the non-trivial nuances, such as the solutions' dependence on case branching, must be taken into account.

22 Let us return to the stages of Descartes' approach. The second stage is paramount for an applicative approach. This stage brings about a substantial advance in problem solving: after the equations are written down, all the calculations proceed quickly by means of efficient algebraic algorithms and become largely a "cut-and-dried matter" [Manders 1999, 28].

23 Note that the second stage consists mainly of the algebraic manipulations. Indeed, we do not need any more geometrical consideration of known and unknown quantities in the diagram. The coefficients extracted from the diagram at the beginning are later on treated indifferently to their geometrical magnitudes. This is a case of indifference to the diagram metric properties and the application of an algebraic response—solving equations. After the first stage, our reasoning is neither motivated by, nor particularly responsive to, the diagram of the original problem. As Manders put it: Compared to traditional geometrical analysis/synthesis, exclusively compared to "purely synthetic" geometrical thought, the elaborate and clear-cut segmentation of Cartesian geometric problem solving method into distinct tasks and stages, at moderate cost of ultimate geometrical grasp, turns on greatly enhanced respondif coordination and control, especially in the algebraic stages... [Manders 1999, 25]

24 The final step is just to check if the solutions are relevant. But what initiates the application is the first or representational stage. Manders calls it a "response-shaping" coordination. He makes the important point that the introduction of algebraic notation helps us to apply the fast algebraic algorithms: Changing the artefact basis of part of the analytic process, to a representation (algebraic equations) indifferent to diagram appearance, allows all this progress towards solving problems, unimpeded by the difficulties of diagram control in Euclidean-style reasoning. [Manders 1999, 18]

25 Therefore, in the case of an applicative approach, the representation facilitates the application. Moreover, it brings a new algebraic notion into geometry: the degree of an equation. Degree is a concept unknown in Euclidean practice, which now becomes available for approaching geometrical problems.

26 Manders remarks that in terms of such criteria as proof-strength, consistency and computability, nothing is significantly improved in Descartes' geometry (except for computability). Also, viewing Descartes' method as simply the elimination of (tedious) geometric constructions would be incomplete. The results of Descartes' algebraisation of geometry are fundamental, and Manders' approach does explain the (conceptual) advance of Descartes' geometry over Greek geometry. It gives us concrete examples of the benefits. One such example is the indifference towards case distinctions, which are not needed in Descartes' algebraic method. Another example is the new type of response: the appreciation of the equation degree.

2 The case study

27 Let me now move to geometric group theory. I shall start with explaining the geometric approach, and then when analysing it I will make the comparison with the combinatorial approach with its focus on the role of representations.

28 The geometric perspective on groups is not new. Groups were commonly seen as transformations of geometric objects, and this view was the core of Klein's Erlangen programme which aimed to classify and characterise geometries on the basis of group theory (and projective geometry) [Klein 1873]. 2 However, in this paper I discuss a new approach, which is based on the idea that groups as such can be thought of as geometric objects. I shall consider hyperbolic groups as an important example and then give it a philosophical analysis.

2.1 The geometric approach

2.1.1 Generated groups

29 Definition (a generating set). Let G be a group. Then a subset S ⊆ G is called a generating set for the group G if every element of G can be expressed as a product of the elements of S or the inverses of the elements of S.

30 In other words, every element of G can be written as a composition of symbols (called letters) representing the elements of S and their inverses. A representation of a non-identity element g as a product of n ≥ 1 letters is called a word. In a given word the number n ∈ ℕ is called the length of the word. The word of length 0 is called the empty word and by definition it represents the group identity I. Other representations of I by words of length n ≥ 1 are called group relations.

31 There may be several generating sets for the same group. The largest generating set is the set of all group elements. For example, the subsets {1} and {2,3} generate the group (ℤ, +) or ℤ for short, whereas {2} does not.

32 Definition (a finitely generated group). A group with a specified set of generators S is called a generated group and is designated as (G, S). If a group has a finite set of generators, it is called a finitely generated group.

33 Example. The group ℤ is a finitely generated group, for it has a finite generating set, for example S = {1}. The generated group ℤ with respect to the generating set {1} is usually designated as (ℤ, {1}). The group (ℚ, +) of rational numbers under addition cannot be finitely generated.
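The examples of generating sets for ℤ given above can be checked mechanically. The following is a minimal sketch, resting on the standard fact that a set of integers generates (ℤ, +) exactly when its greatest common divisor is 1; the function name `generates_Z` is a hypothetical one chosen for illustration.

```python
from functools import reduce
from math import gcd

def generates_Z(generators):
    """Return True if the given integers generate the additive group (Z, +).

    The subgroup generated by a set of integers is d*Z, where d is the
    greatest common divisor of the set, so the set generates Z iff d == 1.
    """
    gens = [abs(g) for g in generators if g != 0]
    if not gens:
        return False
    return reduce(gcd, gens) == 1

print(generates_Z({1}), generates_Z({2, 3}), generates_Z({2}))
```

This agrees with the text: {1} and {2,3} generate ℤ, whereas {2} generates only the even integers.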

34 Generators provide us with a 'compact' representation of finitely generated groups: i.e. a finite set of elements, which by the application of the group operation, gives us the rest of the group.

2.1.2 Groups represented by graphs

35 In effect, a representation of a finitely generated group with respect to a chosen set of generators can be realised as a graph.

36 Definition (a Cayley graph). Let (G, S) be a finitely generated group. Then the Cayley graph Γ(G, S) of a group G with respect to the choice of S is a directed coloured graph, where vertices are identified with the elements of G and the directed edges of a colour s connect all possible pairs of vertices (x, sx), x ∈ G, s ∈ S.

37 The vertices of a Cayley graph do not have to be labelled, whereas edges must be coloured if there is more than one generator. The edge corresponding to multiplication by a generator x which satisfies the condition x² = I does not need to be a directed edge.

38 Example. The Cayley graph for the first example, (ℤ, {1}) is an infinite chain as illustrated in the figure below:

FIGURE 2: The Cayley graph of the group (ℤ, {1}).

39 Different choices of generators give different Cayley graphs. The same group ℤ with generators {1, 2} can be depicted as an infinite ladder, as in Figure 3:

FIGURE 3: The Cayley graph of the group (ℤ, {1,2}), where bold stands for {1}.

40 and (ℤ, {2,3}) gives the graph in the figure below:

FIGURE 4: The Cayley graph of the group (ℤ, {2,3}), where bold stands for {3}.

2.2 Groups and their graphs as metric spaces

41 The 'geometric' properties of groups are the properties which can be revealed by thinking of their Cayley graphs as metric spaces (I will explain how). Many of these geometric properties turn out to be independent of the choice of generators for a Cayley graph. For this reason they are considered to be properties of the groups themselves. At first glance it looks like they depend on the choice of generators, simply because the Cayley graph does. However, it turns out that these properties are the same for different generating sets. Studying these properties makes up a substantial part of geometric group theory. To relate the geometric and algebraic properties of groups means to answer the question:

42 If G, G' are quasi-isometric groups, to what extent do G and G' share the same algebraic properties?

43 In other words, the idea is to 'look at groups through' their Cayley graphs and try to see new (geometric) properties of groups, and then to return to the algebra and check which groups share these properties and under which constraints. This opens up the following opportunities for a group-theorist:

1. Using a presentation of the group G to define a metric on the group and then to exploit the consequent geometry;
2. Defining geometric counterparts to some algebraic properties of groups ('up to quasi-isometry');
3. Classifying groups with these geometric properties.

44 The obtained results can be considered as a contribution of the groups' geometrisation to the algebra of groups. Now some more precise formulations and examples are to be given.

2.2.1 Word metric

45 Thinking of groups in terms of metric spaces requires that the relevant space is given by a group G with a particular generating set S and the word metric d_S: [(G, S), d_S]. A metric on groups can be introduced by the notion of word as defined above.

46 Definition (a word metric). If g, h ∈ G then the word metric (with respect to S) d_S(g, h) is the length of a shortest word representing g⁻¹h, that is, of a shortest word w such that gw = h.
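To make the definition concrete, here is a minimal sketch that computes the word metric on (ℤ, +) by a breadth-first search outwards from the identity. The function names and the `max_length` cut-off are illustrative assumptions; in additive notation g⁻¹h is simply h − g.

```python
from collections import deque

def word_length(target, generators, max_length=50):
    """Length of a shortest word in the generators and their inverses
    that sums to `target` in (Z, +), found by BFS from the identity 0."""
    steps = {s for g in generators for s in (g, -g) if g != 0}
    seen = {0}
    frontier = deque([(0, 0)])          # pairs (element, word length so far)
    while frontier:
        element, length = frontier.popleft()
        if element == target:
            return length
        if length == max_length:        # illustrative cut-off, not part of the definition
            continue
        for s in steps:
            nxt = element + s
            if nxt not in seen:
                seen.add(nxt)
                frontier.append((nxt, length + 1))
    raise ValueError("target not reached within max_length letters")

def word_metric(g, h, generators):
    """d_S(g, h): length of a shortest word representing g^{-1} h (here h - g)."""
    return word_length(h - g, generators)

print(word_metric(0, 3, {1}))      # the word 1 + 1 + 1
print(word_metric(0, 1, {2, 3}))   # the word 3 - 2
print(word_metric(0, 7, {2, 3}))   # the word 2 + 2 + 3
```

With the generating set {1} the distance from 0 to 3 is 3, while with {2,3} the element 1 is at distance 2 from the identity: the metric depends on the chosen generators.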

47 The metric space [(G, S), d_S] may not appear at first glance to give us much structure to study as compared to that found in the classical metric spaces, such as the Euclidean or hyperbolic. However it becomes potentially more intriguing when one observes that a discrete valued metric space can be compared to an interesting continuously valued metric space such as the hyperbolic plane. Here is how Gromov expresses this issue about a word metric space: This space may appear boring and uneventful to a geometer's eye since it is discrete and the traditional local (e.g. topological and infinitesimal) machinery does not run in [the group] Γ. To regain the geometric perspective one has to change one's position and move the observation point far away from Γ. Then the metric in Γ seen from the distance d becomes the original distance divided by d and for d → ∞ the points in Γ coalesce into a connected continuous solid unity which occupies the visual horizon without any gaps and holes and fills our geometer's heart with joy. [Gromov 1993, 1]

48 This implies that if in a discrete space any pair of distinct points is joined by a geodesic segment (the shortest path), it makes such a metric space comparable to the classical metric spaces in the same way as a sequence of connected points can be similar to a line.3 This idea is central to geometric group theory, and next I will explain how it was realised in a technical sense.

2.2.2 Cayley graphs as metric spaces

49 The idea above can be realised through the introduction of a metric on Cayley graphs. By the definition of a Cayley graph, words in a generated group (G, S) correspond to paths in the Cayley graph Γ(G,S).

50 Definition. A path between two arbitrary vertices of the graph, x and y, is a sequence of edges between x and y.

51 The idea is to take each such edge to be of a length equal to 1. Then the length of the path is equal to the number of unit edges in this path. For example, for the Cayley graph in Figure 5 which has one generator 1, the word/path from 0 to 3 is 1 ‧ 1 ‧ 1, and its length is 3.

FIGURE 5: The Cayley graph of the group ℤ₆.

52 Definition (a path metric). Let Γ(G, S) be the Cayley graph for a generated group (G, S). For any pair of vertices x, y of Γ(G, S) the path metric (word metric) d_{Γ,S}(x, y) on the Cayley graph Γ(G, S) can be defined as the length of (one of) the shortest paths (geodesic segments) connecting x and y.

53 The set of vertices in a Cayley graph with this path metric is now a metric space, and the group metric space ((G, S), d_S) is isometric to the Cayley graph metric space (Γ(G, S), d_{Γ,S}) by definition.

54 There can be more than one path (word) that is the shortest in a group. For example, in the Cayley graph of the group C₃ × C₂ in the figure below there are two equally short paths between the vertices labelled 0 and 1: i.e. 0,4,1 and 0,3,1.

FIGURE 6: The Cayley graph of the group C₃ × C₂.
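The two geodesics mentioned above can be recovered computationally. The sketch below assumes that the vertex labels of the figure correspond to the identification of C₃ × C₂ with ℤ₆, generated by 2 (of order 3) and 3 (of order 2); this labelling is an assumption made for illustration, although it is consistent with the paths quoted in the text. The function names are hypothetical.

```python
from collections import deque

def cayley_neighbours(x, generators, n):
    """Neighbours of x in the Cayley graph of Z_n (edges read as undirected)."""
    return {(x + s) % n for g in generators for s in (g, -g)}

def shortest_paths(start, goal, generators, n):
    """All geodesic segments from start to goal, found by breadth-first search."""
    dist = {start: 0}
    parents = {start: []}
    queue = deque([start])
    while queue:
        x = queue.popleft()
        for y in cayley_neighbours(x, generators, n):
            if y not in dist:                  # first visit: y is one step beyond x
                dist[y] = dist[x] + 1
                parents[y] = [x]
                queue.append(y)
            elif dist[y] == dist[x] + 1:       # another equally short way into y
                parents[y].append(x)

    def unwind(v):
        if v == start:
            return [[start]]
        return [path + [v] for p in parents[v] for path in unwind(p)]

    return sorted(unwind(goal))

print(shortest_paths(0, 1, generators=(2, 3), n=6))
```

Under the stated labelling the search returns the paths 0,3,1 and 0,4,1, both of length 2.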

2.3 Quasi-isometry

55 The notion of quasi-isometry is central to geometric group theory. It serves to formalise the idea of comparing metric spaces as expressed above by Gromov. Let me also use Bridson's words to better explain its role: ... [O]ne needs a language that will lend precision to observations such as the following: if one places a dot at each integer point along a line in the Euclidean plane, then the line and the set of dots become indistinguishable when viewed from afar, whereas the line and the plane remain visibly distinct. One makes this observation precise by saying that the set of dots is quasi-isometric to the line whereas the line is not quasi-isometric to the plane. [Bridson 1999, 138]

2.3.1 Definition and some examples

56 As Bridson points out, the way to compare metric spaces is through the notion of quasi-isometry. It is based on isometry, which is a distance-preserving map between metric spaces. Isometry is similar to congruence in geometry, in that it expresses the idea 'is the same as' in a given category. By contrast, quasi-isometry is a weaker equivalence relation between metric spaces: it is supposed to grasp the idea 'is similar to'.

57 Definition (a quasi-isometry). For metric spaces (M_1, d_1) and (M_2, d_2) a function f : M_1 → M_2 (not necessarily continuous) is called a quasi-isometry if there exist constants A ≥ 1 and B ≥ 0 such that:

\[ \frac{1}{A}\, d_1(x, y) - B \;\le\; d_2(f(x), f(y)) \;\le\; A\, d_1(x, y) + B \]

58 for all x, y ∈ M_1, and a constant C ≥ 0 such that for every u in M_2 there exists x in M_1 with d_2(u, f(x)) ≤ C. The spaces M_1 and M_2 are called quasi-isometric if there exists a quasi-isometry f : M_1 → M_2. This means that the distance in M_2 between the images of any two points is bounded above and below by linear functions of the distance between these points in M_1. The notion of quasi-isometry generalises the familiar definition of isometry corresponding to the case A = 1 and B = 0.
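The two-sided bound in the definition can be tested on examples. The sketch below assumes the standard form of the inequality, (1/A)·d_1(x, y) − B ≤ d_2(f(x), f(y)) ≤ A·d_1(x, y) + B, and checks it on a finite sample of points only; such a check can refute a proposed pair of constants but cannot prove that a map is a quasi-isometry, and it does not test the third condition involving C. The helper names are hypothetical.

```python
import math
from itertools import combinations

def satisfies_qi_bounds(f, d1, d2, sample, A, B):
    """Check (1/A)*d1(x,y) - B <= d2(f(x),f(y)) <= A*d1(x,y) + B on a finite sample."""
    for x, y in combinations(sample, 2):
        image_dist = d2(f(x), f(y))
        if not (d1(x, y) / A - B <= image_dist <= A * d1(x, y) + B):
            return False
    return True

def euclid(x, y):
    """Usual metric on the real line (and on the integers)."""
    return abs(x - y)

sample = [k / 4 for k in range(-40, 41)]   # points of R in steps of 0.25

# Rounding down, R -> Z, distorts distances by less than 1: take A = 1, B = 1.
floor_ok = satisfies_qi_bounds(math.floor, euclid, euclid, sample, A=1, B=1)

# Squaring collapses x and -x, so the lower bound fails for these constants.
square_ok = satisfies_qi_bounds(lambda x: x * x, euclid, euclid, sample, A=2, B=1)

print(floor_ok, square_ok)
```

The first map passes the test on the sample, in line with the idea that ℤ and ℝ look alike from afar; the second fails it.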

59 Examples:

60 1. Any non-empty bounded space X is quasi-isometric to a point: f : X → {x} is a quasi-isometry.

61 2. f : ℝ × [0,1] → ℝ, the projection to the first coordinate, is a quasi-isometry, where d is the usual Euclidean metric.

62 3. f : (ℤ, d) → (ℝ, d) is a quasi-isometry (f(x) = x for any x ∈ ℤ), where d is the usual Euclidean metric (see the figure below). Indeed, d_2(f(x), f(y)) = d_1(x, y). Take A = 1 and B = 0; since every real number lies within 1/2 of an integer, C = 1/2 suffices.

FIGURE 7: ℤ is quasi-isometric to ℝ.

63 4. f : ℤ² → E² is a quasi-isometry with respect to the Euclidean metric, as in the figure below:

FIGURE 8: ℤ² is quasi-isometric to E².

64 5. The Cayley graph of (ℤ, {1,2}) as well as the Cayley graph of (ℤ, {2,3}) with respect to the word metric are quasi-isometric to ℝ (see Figures 3 and 4).

65 Non-examples:

66 1. The empty set is quasi-isometric only to itself. Let us take B to be the empty set, and A to be a non-empty set with some arbitrary metric d. There is no function f : A → B if B is empty and A is non-empty.

67 2. Boundedness is a quasi-isometry invariant. Thus, for example, ℝ is not quasi-isometric to [0, 1].

68 Quasi-isometry on groups

The next proposition states that all the Cayley graphs of a given group with respect to finite generating sets are quasi-isometric:

69 Proposition. Take a group G, with finite subsets S and T, such that both S and T generate G. Then the Cayley graph Γ(G, S) is quasi-isometric to Γ(G, T). Indeed, d_T(x, y) ≤ k d_S(x, y), where k = max_i d_T(s_i, I), that is, k is the greatest length of the generators in the set S measured in the metric d_T, and vice versa.

70 Now it is possible to define quasi-isometry between two finitely generated groups.

71 Definition (quasi-isometric groups). Two finitely generated groups G and G' are quasi-isometric if there is a Cayley graph Γ(G, S) of the group G with respect to a generating set S, such that Γ(G, S) is quasi-isometric to a Cayley graph Γ(G', T) of the group G' with respect to a generating set T.

72 In other words, two finitely generated groups are quasi-isometric if and only if for some choices of generators in groups G and G', their Cayley graphs are quasi-isometric. The choice of generators is however unimportant because of the previous proposition.
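The bound used in the proposition above, d_T(x, y) ≤ k·d_S(x, y) with k the greatest d_T-length of a generator in S (and vice versa), can be checked numerically for the group ℤ with S = {1} and T = {2, 3}. This is a sketch on a finite window of integers, not a proof; the function name and the window radius are illustrative assumptions. Since both word metrics are invariant under translation, it suffices to measure distances from the identity 0.

```python
def word_lengths(generators, radius):
    """Word length with respect to `generators` of every integer n with |n| <= radius.

    Level-by-level BFS from 0 in the Cayley graph of (Z, generators).
    Assumes the generators really generate Z; otherwise the search never ends.
    """
    steps = {s for g in generators for s in (g, -g)}
    targets = set(range(-radius, radius + 1))
    lengths = {0: 0}
    frontier = [0]
    while not targets.issubset(lengths):
        next_frontier = []
        for x in frontier:
            for s in steps:
                y = x + s
                if y not in lengths:
                    lengths[y] = lengths[x] + 1
                    next_frontier.append(y)
        frontier = next_frontier
    return {n: lengths[n] for n in targets}

S, T = (1,), (2, 3)
d_S = word_lengths(S, 30)
d_T = word_lengths(T, 30)

k_S_in_T = max(d_T[s] for s in S)   # greatest T-length of a generator in S
k_T_in_S = max(d_S[t] for t in T)   # greatest S-length of a generator in T

bounds_hold = all(
    d_T[n] <= k_S_in_T * d_S[n] and d_S[n] <= k_T_in_S * d_T[n]
    for n in d_S
)
print(k_S_in_T, k_T_in_S, bounds_hold)
```

Here the two constants are 2 and 3, and both inequalities hold on the whole window, which illustrates why the choice of generators does not affect the quasi-isometry class.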

2.3.2 Quasi-isometry invariants: hyperbolicity

73 It turns out that there is a substantial number of properties of finitely generated groups that are independent of the choice of generators and that are called 'quasi-isometry invariants' or 'geometric properties of groups'. By having a number of geometric properties of groups, one can relate them to the algebraic properties and approach algebraic problems through these geometric properties. One amongst many of these geometric properties is hyperbolicity.

74 The concept of hyperbolic groups was introduced by Gromov [Gromov 1987]; it has since become very influential and has given rise to an extensive research programme. Before this, hyperbolicity was considered only on surfaces and other differentiable manifolds with a metric. Gromov's innovation was to extend it to a more general context, defining it also on discrete objects such as graphs or groups [Gromov 1987]. The idea is again to embed a Cayley graph of a group quasi-isometrically into the hyperbolic space ℍⁿ and apply hyperbolic geometry to study this group.4 This involves the notion of a geodesic triangle.

75 Definition. A geodesic triangle is a figure consisting of three different points together with the pairwise-connecting geodesic segments.

76 The points are known as the vertices, while the geodesic segments are known as the sides of the triangle. A geodesic triangle can be considered in any space in which geodesics exist. For example, a geodesic triangle in a Cayley graph with the associated word metric consists of three distinct arbitrary vertices x, y, z connected by three geodesic segments (the sides of the triangle), from x to y, y to z and z to x respectively.

77 Example. In Figure 6, a simple geodesic triangle is formed by the vertices 0, 2 and 4. However, the vertices 0, 1, 2 form two geodesic triangles, with the sides 0,2; 0,4,1; 2,4,1 and 0,2; 0,3,1; 2,4,1 respectively.

79 Obviously a geodesic triangle (as in the last example) does not have to be shaped as a Euclidean triangle. All that is needed is that the three vertices are pairwise connected by geodesic segments.

80 Thin (hyperbolic) triangles and negatively curved groups

81 Given the notion of a geodesic triangle, one can introduce the notion of negative curvature or hyperbolicity on Cayley graphs.

82 The common definition of hyperbolicity is based on the key property of hyperbolic spaces that the sum of a triangle's angles is less than π. In a discrete case as in the case of a graph, there are no angles, yet there is the word metric. The following notion of a δ-thin triangle allows us to define hyperbolicity for this metric in a more general way without appealing to the notion of angle. I will now give two definitions of a δ-thin triangle and show how it applies to Cayley graphs and their groups as hyperbolic spaces.

83 Definition 1 (a δ-thin triangle). Let δ ≥ 0. A geodesic triangle in a metric space is said to be δ-thin if each of its sides is contained in the δ-neighbourhood of the union of the other two sides, as demonstrated in Figure 9 below [Gromov 1987, 120].

84 With the definition of a δ-thin geodesic triangle, we can define a δ-hyperbolic space.

85 Definition (a hyperbolic space). A geodesic space X is called δ-hyperbolic or negatively curved if there is a constant δ such that every geodesic triangle in X is δ-thin.

86 The next definition gives a different perspective on thin triangles, which is more clearly related to the idea of quasi-isometry. 5

FIGURE 9: A thin hyperbolic triangle.

87 Definition 2 (a δ-thin triangle). Let ∆ be a geodesic triangle in a metric space X, and f_∆ : ∆ → T_∆ be an isometry from the sides of the triangle ∆ to a tripod T_∆. Namely, f_∆ sends the vertices of the triangle to the vertices of the tripod. For some δ ≥ 0, triangle ∆ is δ-thin if f_∆(p) = f_∆(q) ⟹ d(p, q) ≤ δ.

88 The idea is the following. Take any triangle; by pinching the sides of a triangle and making it thinner, one eventually arrives at a tripod as it is visualised in Figure 10:

FIGURE 10: Mapping a hyperbolic triangle to a tripod.

89 Example. In Figure 5, the triangle 1,2,3 forms a tripod. Any of its sides belongs to the union of its other sides. For instance, side 1,2 belongs to the union of the sides 2,3 and 1,2,3. More generally a tripod is a 0-thin geodesic triangle. However, the Cayley graph is hyperbolic because it is bounded (see Example 1 of hyperbolic groups below).

90 The crucial observation made about the hyperbolicity of groups is that it is invariant with respect to the choice of generators. Therefore it can be considered as a property of the group itself and not only of a generated group [Gromov 1987, 75-263]. This follows from the next two facts:

91 1. For any generating sets S and T of a group G, the Cayley graph metric spaces 6 (Γ(G,S),dS) and (Γ(G,T),dT) are quasi-isometric. This fact follows from the proposition in 2.3.1.

92 2. If two geodesic spaces are quasi-isometric, one is hyperbolic if and only if the other is [Howie 1999, Lemma 3, 10].

93 It follows from the second definition that those Cayley graphs which are trees (along with other graphs satisfying δ-hyperbolicity) can be considered as hyperbolic spaces. 7 It is also often said that Gromov-hyperbolic spaces are the ones that exhibit "tree-like behavior". 8

FIGURE 11: A tree on the Poincaré disc.

94 Examples of hyperbolic groups ([Howie 1999, 11]):

95 1. Every finite group is hyperbolic, because its Cayley graphs are all bounded. Indeed, for any generating set, the set of vertices in any of its Cayley graphs is finite, as is the distance between any of the vertices. Take δ to be equal to the greatest distance between all possible pairs of the vertices; then any distance in the graph is no greater than this δ. In particular, the distance between a point on one side of a geodesic triangle and any point on the two other sides is no greater than δ. Therefore this point is in the δ-neighbourhood of the union of the two other sides.

96 2. Every finitely generated free group is hyperbolic, because its Cayley graphs are trees. 9

97 3. A non-example: the group ℤ² is quasi-isometric to E², and hence is not hyperbolic. The fact that ℤ² is not hyperbolic shows that not every finitely generated group is hyperbolic.
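The non-example can be illustrated with Definition 1. The sketch below considers ℤ² with its two standard generators, for which the word metric is the taxicab distance, and a family of geodesic triangles with vertices (0,0), (n,0) and (0,n). The choice of triangles and all function names are illustrative assumptions, and thinness is measured at the vertices of the graph only.

```python
def l1(p, q):
    """Word metric on Z^2 for the standard generators: the taxicab distance."""
    return abs(p[0] - q[0]) + abs(p[1] - q[1])

def thinness(sides):
    """Smallest delta such that every vertex on each side lies within
    distance delta of the union of the other two sides."""
    delta = 0
    for i, side in enumerate(sides):
        others = [q for j, other in enumerate(sides) if j != i for q in other]
        for p in side:
            delta = max(delta, min(l1(p, q) for q in others))
    return delta

def corner_triangle(n):
    """Geodesic triangle in Z^2 with vertices (0,0), (n,0) and (0,n).

    The third side runs from (n,0) up to (n,n) and then left to (0,n);
    its length 2n equals the taxicab distance between its endpoints,
    so it is a geodesic segment.
    """
    bottom = [(i, 0) for i in range(n + 1)]
    left = [(0, j) for j in range(n + 1)]
    far = [(n, j) for j in range(n + 1)] + [(i, n) for i in range(n - 1, -1, -1)]
    return [bottom, left, far]

print([thinness(corner_triangle(n)) for n in (1, 2, 5, 10)])
```

The thinness of these triangles grows with n, so no single constant δ works for all geodesic triangles, in agreement with the claim that ℤ² is not hyperbolic.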

98 Some of the 'geometric properties of groups' independent of the choice of generators are related to important combinatorial problems, such as finite presentability and solvable word problem.

99 Finite presentability

Definition. A finitely presented group is a group with a finite number of generators and relations.

100 The question is which groups are finitely presentable. It turns out that groups which are quasi-isometric to a finitely presented group are also finitely presented:

101 Proposition 1. If two groups G and G' are quasi-isometric, then G' is finitely presented if and only if G is finitely presented.10

102 Solvable word problem

Suppose a group G is finitely presented. A word in the generators and their inverses represents some element of the group. Is there an algorithm to decide whether this is the identity element? Or whether two arbitrary words of the group are equivalent? If so, then the group is said to have a solvable word problem. The uniform word problem (for the class of all finitely presented groups with solvable word problem) is unsolvable as demonstrated by [Dehn 1911], [Novikov 1955], and [Boone, Cannonito, & Lyndon 1973]. However, the asymptotic approach allows us to formulate some more quantitative statements about the solvability of the word problem in the classes of quasi-isometric groups:

103 Proposition 2. For two quasi-isometric groups, if one has a solvable word problem, then for the other it is also solvable.

104 Propositions 1 and 2 connect geometric properties with algebraic properties. Moreover, it was shown that hyperbolicity can give a new perspective on these combinatorial problems. For example, the next two results due to [Gromov 1987] establish important properties of hyperbolic groups related to these problems:

105 1. Every hyperbolic group has a finite presentation [Howie 1999, Theorem 2, 11-12].

106 2. Every hyperbolic group has a solvable word problem [Howie 1999, 17, Lemma 6].
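For the simplest hyperbolic groups, the finitely generated free groups, a solution of the word problem is easy to write down: a word represents the identity exactly when free reduction (cancelling adjacent pairs of a letter and its inverse) leaves the empty word. The sketch below illustrates this special case only and is not the general procedure for hyperbolic groups; the encoding of inverses by uppercase letters is a hypothetical convention chosen for illustration.

```python
def free_reduce(word):
    """Freely reduce a word by cancelling adjacent pairs x x^-1 and x^-1 x.

    Letters are single characters: a lowercase letter is a generator and
    the matching uppercase letter stands for its inverse.
    """
    stack = []
    for letter in word:
        if stack and stack[-1] == letter.swapcase():
            stack.pop()                 # the new letter cancels the previous one
        else:
            stack.append(letter)
    return "".join(stack)

def is_identity_in_free_group(word):
    """A word is trivial in the free group iff it reduces to the empty word."""
    return free_reduce(word) == ""

def same_element(u, v):
    """u and v represent the same element iff u * v^-1 is trivial."""
    inverse_v = v[::-1].swapcase()
    return is_identity_in_free_group(u + inverse_v)

print(is_identity_in_free_group("abBA"))   # a b b^-1 a^-1
print(is_identity_in_free_group("abAB"))   # the commutator of a and b
print(same_element("ab", "aaAb"))
```

The first word reduces to the empty word, the commutator does not, and the last two words represent the same element.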

107 Research into this question is still of major importance in combinatorial group theory. Thus this result obtained by the geometric approach about hyperbolic groups is a valuable contribution to the study of groups.

3 Discussion

108 To analyse the case study I will use Manders' model, introduced in the previous section, which sees mathematical activities as the coordination of strategic responses and indifferences to given information. Manders' analysis makes explicit the advantages of Descartes' algebraisation. I will show that it is also effective in my case study.

3.1 Combinatorial vs. geometric representation

109 To analyse the epistemic role of Cayley graphs in the geometrisation of groups, let me compare them to the ways of representing groups that are traditionally used in combinatorial group theory.

110 Combinatorial group theory mostly uses symbolic notation. Group elements can be expressed by symbols. For example, the elements of the dihedral group D₃ of symmetries of an equilateral triangle can be written as:

111 I, r, r², f, fr, fr²

FIGURE 12: The flip f and the rotation r.

112 All these 'geometric' groups have the same structure as permutation groups, where permutations can also be imagined as transformations in a space (or a plane).

113 For instance, if we take the equilateral triangle ABC and apply a flip, then this transformation can be presented as a permutation of the set of vertices of ABC: A is swapped with C and B stays the same. Then the rotation r would be C moving to A, B to C and A to B. The flip f followed by the rotation, r • f, can be written by using the following notation:

TABLE 1: A multiplication table for D3.

114 In the latter two cases the algebra represents the concept that was articulated in a geometric way without geometrical allusions: merely in terms of some abstract As and Bs. We can think of group D₃ more abstractly as a set with group axioms, no longer thinking about group elements as continuous motion or as permutations. This is where indifference to the geometric aspects is especially effective. The combinatorial representations and axiomatic expression of groups are detached from the concrete nature of the group: one can talk about a group and its properties without specifying what exactly the group operation is. What is important is that this operation satisfies group axioms. This abstract character of algebra allows one to describe the structural aspects of objects from different origins in a flexible way. Indifference to geometry provides a broad applicability of group theory to other object-specific fields (e.g. crystallography).

115 Presented groups are the closest algebraic counterparts to Cayley graphs. A group presentation includes generators and group relations (words equal to the identity, i.e. equivalent to the empty word). Here is the group presentation of D₃: ⟨r, f | r³, f², rfrf⟩. To give a more general example: a cyclic group of order n can be presented as ⟨a | aⁿ = e⟩. This type of combinatorial representation is quite informative and, at the same time, synoptic. It is used both in the combinatorial and in the geometric approach. However, as demonstrated in the previous section, for a detailed consideration of group geometry Cayley graph diagrams are essential.
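The relations of the presentation of D₃ can be verified on the permutations of the vertices A, B, C described earlier (the flip swaps A and C and fixes B; the rotation sends A to B, B to C and C to A). The following is a minimal sketch; the dictionary encoding and the function names are assumptions made for illustration, and a word is read so that its rightmost letter acts first, as in r • f.

```python
def compose(p, q):
    """The permutation 'q first, then p' on the vertex labels."""
    return {v: p[q[v]] for v in q}

IDENTITY = {"A": "A", "B": "B", "C": "C"}
r = {"A": "B", "B": "C", "C": "A"}   # rotation: A -> B, B -> C, C -> A
f = {"A": "C", "B": "B", "C": "A"}   # flip: A <-> C, B fixed

def evaluate(word):
    """Multiply out a word in the letters 'r' and 'f'."""
    result = IDENTITY
    for letter in word:
        result = compose(result, {"r": r, "f": f}[letter])
    return result

# The three relators of the presentation all evaluate to the identity.
print(all(evaluate(w) == IDENTITY for w in ("rrr", "ff", "rfrf")))

# The six words I, r, r^2, f, fr, fr^2 give six different permutations.
elements = [evaluate(w) for w in ("", "r", "rr", "f", "fr", "frr")]
print(len({tuple(sorted(e.items())) for e in elements}))
```

The relators r³, f² and rfrf are indeed trivial, and the six words listed in the text represent six distinct elements.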

116 What makes them so effective? The mathematical difference between the symbolic notation of a generated group and its Cayley graph is often said to be insignificant, but the cognitive difference between the two representations is significant. Let us consider how a Cayley graph diagram can be useful in the same example of group D₃ (see Figure 13). Recall that f² = I. This is immediately visible in the diagram: the two-arrowed edges reflect the fact that f = f⁻¹. Similarly, one can easily see that rf ≠ fr, whereas the corresponding paths in Figure 6 do commute; the latter therefore represents an abelian group whereas the former represents a non-abelian group (in an abelian group G, ab = ba, ∀ a, b ∈ G).

FIGURE 13: The Cayley graph of the group D3.

117 Multiplication tables also reflect the abelian property: an abelian group multiplication table is symmetric with respect to the main diagonal. But multiplication tables show only simple relations like f² = I, whereas in a Cayley graph they are all visible as loops in the diagram. One can check in the diagram if the two paths corresponding to given words end in the same vertex. This would mean that these words are equivalent.
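The path-following test just described can be sketched in code. The example below builds a Cayley graph of D₃ on the six permutations of A, B, C and follows the paths spelled by two words from the identity vertex. It assumes, for illustration only, that an edge labelled s leads from x to xs (multiplication on the right); the tuple encoding and the function names are likewise hypothetical.

```python
from itertools import permutations

VERTICES = "ABC"
GENERATORS = {
    "r": ("B", "C", "A"),   # images of A, B, C under the rotation
    "f": ("C", "B", "A"),   # images of A, B, C under the flip
}

def times(x, s):
    """The group element 's first, then x', with elements stored as image tuples."""
    return tuple(x[VERTICES.index(s[i])] for i in range(3))

# Cayley graph: one vertex per group element, one labelled edge per generator.
EDGES = {
    (x, name): times(x, s)
    for x in permutations(VERTICES)
    for name, s in GENERATORS.items()
}

def endpoint(word, start=("A", "B", "C")):
    """Follow the path spelled by `word` from `start` and return its end vertex."""
    vertex = start
    for letter in word:
        vertex = EDGES[(vertex, letter)]
    return vertex

def equivalent(u, v):
    """Two words are equivalent iff their paths from the identity end at the same vertex."""
    return endpoint(u) == endpoint(v)

print(equivalent("rf", "frr"))   # rf and fr^2 end at the same vertex
print(equivalent("rf", "fr"))    # rf and fr do not
print(equivalent("rfrf", ""))    # a relator traces a loop
```

The words rf and fr² lead to the same vertex, rf and fr lead to different vertices, and the relator rfrf traces a loop back to the identity.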

118 Comparing generated groups with Cayley graphs, geometric group theorists would usually say that the two are 'practically the same objects'. Indeed, what is mathematically significant here is the response to the generating aspect of the group (the generators). It is essential that a Cayley graph gives different information from the information given by the algebraic notation of the generated group: e.g. by colours, edges, shapes, the structure of the group at a glance, and importantly, connectedness. Groups do not have these properties. It is the diagram that suggests our response to these properties.

3.2 The geometric response to artefacts

119 As shown in section 2.2.1, groups can be seen as metric spaces equipped with the word metric. This is a discrete-valued metric, so it does not give enough structure to compare a word metric space to the classical metric spaces by their geometric properties. However, the practice with particular types of selective responses makes it possible to place groups in the same research-object category as classical metric spaces. This means that groups can be studied by using the same classical methods. The main elements of this practice are (i) the indifference to the discrete structure of the group metric space, and (ii) the response to the perceptual similarity of particular Cayley graphs with these metric spaces. These are new responses, unavailable to the combinatorial approach. They are implemented by reading the graphs as geometric objects in a space, which in turn leads to new advances: the concepts of quasi-isometry, hyperbolicity of groups and other geometric properties of groups.

120 The applied responses are central to this approach. They are modified and adapted to the original tasks. In the geometric response to Cayley graph diagrams, we perceive them as objects embedded in a space and having geometric elements (e.g. the edges of a graph are thought of as the measurable sides of triangles). This response is applied from geometry, as for example in Euclidean geometry we respond to the intersections of lines, figures and their relations in the diagrams. The response is modified: e.g. a geodesic triangle does not have to be shaped like a Euclidean triangle. In other words, some of the Euclidean diagrammatic appearance is neglected, whereas the more abstract properties of triangularity (three connected vertices) are highlighted. Also for hyperbolic Cayley graphs our response is modified to be non-Euclidean: we see the graph as a geometric object on a saddle-shaped surface or the Poincaré disc (as in Figure 11). Despite the fact that the graphs are visualised as being on a flat surface, we apply indifference to this property of the visualisation.

121 To summarise, practice with various representations helps us to see the different aspects of mathematical objects. Diagrams have some advantages over symbolic notations in terms of representing the structure of objects at a glance, in colours and connected shapes that resemble geometric figures. The examples above demonstrate that the geometric response to the diagrams of graphs makes it clear how we can naturally integrate additional geometric machinery (geodesics, triangles, metric spaces, hyperbolicity). It is easier to think of the geometric and topological properties of an object such as a graph equipped with a diagram, than of an algebraic one equipped with algebraic symbolism. As a result, a particular response to diagrammatic representations facilitates the ongoing application and development of concepts.

BIBLIOGRAPHY

BOONE, WILLIAM W., CANNONITO, FRANK BENJAMIN & LYNDON, ROGER C., 1973 Word Problems: Decision Problem in Group Theory, Amsterdam: North-Holland.

BRIDSON, MARTIN R. & HAEFLIGER, ANDRÉ 1999 Metric Spaces of Non-positive Curvature, Berlin: Springer.


DEHN, MAX 1911 Über unendliche diskontinuierliche Gruppen, Mathematische Annalen, 71(1), 116-144.

GROMOV, MIKHAIL 1987 Hyperbolic groups, in Essays in Group Theory, New York: Springer, Mathematical Sciences Research Institute Publications, vol. 8.
1993 Asymptotic Invariants of Infinite Groups, London Mathematical Society Lecture Note Series, vol. 182, Cambridge University Press, Geometric Group Theory.

HAWKINS, THOMAS 1984 The Erlanger programme of Felix Klein: Reflections on its place in the history of mathematics, Historia Mathematica, 11, 442-470.

HOWIE, JAMES 1999 Hyperbolic Groups. Lecture Notes. Heriot-Watt University, Edinburgh. URL http://www.ma.hw.ac.uk/~jim/HWM00-23.pdf

KAPOVICH, ILYA & BENAKLI, NADIA 2002 Boundaries of hyperbolic groups, Contemporary Mathematics, 296, 39-93, Combinatorial and geometric group theory. Proceedings of the AMS special session on combinatorial group theory (New York, 2000/Hoboken, NJ).

KLEIN, FELIX 1873 Über die sogenannte nicht-euklidische Geometrie (Zweiter Aufsatz), Mathematische Annalen, 6, 112-145.

MANDERS, KENNETH 1999 Euclid or Descartes: Representation and responsiveness, unpublished.

NOVIKOV, PYOTR 1955 On the algorithmic unsolvability of the word problem in group theory, in Proceedings of the Steklov Institute of Mathematics, vol. 44, 1-143, Russian.

STARIKOVA, IRINA 2011 A Philosophical Investigation of Geometrisation in Mathematics, Ph.D. thesis, University of Bristol.

NOTES

1. Manders examines the problem of Heraclides in Pappus' Collectio as approached by Descartes in Book III of Géométrie: given the square AD and the line BN, extend the side AC to E so that EF, on EB with F on CD, equals BN. See [Manders 1999, 13-14].
2. For a historical exposition, see [Hawkins 1984].
3. A geodesic is locally the shortest path between points in the space.
4. For details see [Gromov 1987] and [Bridson 1999].
5. For more details see [Bridson 1999, 409].
6. See for example [Howie 1999, Example 5, 5].
7. Any connected graph without cycles is a tree, e.g. a tripod is a tree.
8. See [Kapovich & Benakli 2002, 31].
9. A group G is called free if it has a subset S such that any element of G can be generated by a unique finite sequence of elements from S and their inverses (disregarding trivial variations such as sg₁⁻¹ = sg₂⁻¹g₂g₁⁻¹).
10. See more in [Bridson 1999, 143-144].


ABSTRACTS

The paper aims to show how mathematical practice, in particular with visual representations, can lead to new mathematical results. The argument is based on a case study from a relatively recent and promising mathematical subject: geometric group theory. The paper discusses how the representation of groups by Cayley graphs made it possible to discover new geometric properties of groups.

Cet article cherche à montrer comment la pratique mathématique, particulièrement celle admettant des représentations visuelles, peut conduire à de nouveaux résultats mathématiques. L'argumentation est basée sur l'étude du cas d'un domaine des mathématiques relativement récent et prometteur: la théorie géométrique des groupes. L'article discute comment la représentation des groupes par les graphes de Cayley rendit possible la découverte de nouvelles propriétés géométriques de groupes.

AUTHOR

IRINA STARIKOVA University of Bristol (UK)


Mathematical Practice and Readings of Hegel, from Jean Cavaillès to William Lawvere *

Baptiste Mélès

Introduction

1 We shall defend here the hypothesis that the concepts of paradigm and thematization, by which Jean Cavaillès describes the dynamics of mathematical activity in the posthumous work Sur la logique et la théorie de la science, find in category theory both an illustration and a formalization², and in Hegelian dialectic a precedent.

2 We shall first examine this hypothesis, defining along the way the concept of thematization and the few elementary notions of category theory that we shall need later. Then, supposing that the notion of "dialectic" used by Cavaillès is Hegelian, we shall check whether the concept of "thematization" can claim to describe the dialectical process of Hegelian logic, an interpretation of which we shall find a sketch in William Lawvere. Finally, we shall put this interpretation to the test by analysing the Hegelian dialectic of being and nothing in the Science of Logic and the Lectures on the History of Philosophy.

3 We shall thus see how Hegelian dialectic can at once describe mathematical practice and be represented in it.


1 From Cavaillès to category theory

1.1 Paradigm and thematization in Jean Cavaillès

4 In the second part of Sur la logique et la théorie de la science, Cavaillès observes in the history of mathematics a twofold "process of separation³", by "paradigm" and by "thematization" [Cavaillès 2008, 27].

5 There is, on the one hand, a process that is "longitudinal, or coextensive with the demonstrative chain", namely that of the "paradigm". "Arithmetical reasoning on the finite whole number" has thus given rise, by generalization, to the "most abstract chains" [Cavaillès 2008, 27] of algebra. From a formal point of view, this generalization is the one by which the original constants are replaced by variables: "by replacing the determinations of acts with the empty place for a substitution, one rises progressively to a degree of abstraction that gives the illusion of an irreducible formal" [Cavaillès 2008, 29]. Examples of this generalization abound in the history of mathematics:

Such are the attempts at a geometric or algebraic characteristic, the theory of determinants, the symbolism of the infinitesimal calculus, where each time the desire shows itself to retain only what is essential; such, finally, is the reduction of all judgments to the predicative judgment, of all reasoning to the repetition or the breaking up of a single and identical formula. [Cavaillès 2008, 30]

6 The paradigm is a generalization bearing on objects. But on its own it cannot account for the whole of the dynamics at work in the history of mathematics.

7 Mathematics is not only knowledge of objects, but also knowledge of acts: "Formalization is achieved only when, upon the outline of structures, the rules that govern them are superimposed in systematized form" [Cavaillès 2008, 30]. Orthogonally to the generalization of objects, which replaces the original constants with variables, there is a generalization of acts, which abstracts the rules of transformation of these variables. This is where "thematization" comes in, a process that is "vertical, or establishing a new system of connection that uses the old one as a starting base, no longer as a stage traversed by a movement but as an object of reflection in its present bearing" [Cavaillès 2008, 27]. In this new process, it is no longer the basic objects that matter, but the transformations to which they are subjected: "Thematization takes as its starting point the chain grasped this time in its flight, a trajectory that turns into meaning. Thought no longer moves towards the created term, but starts from the manner of creating in order to give its principle, by an abstraction of the same nature as the other but directed transversally" [Cavaillès 2008, 30].

8 Cavaillès mentions a few examples of this, mainly mathematical ones: "group theory, theory of linear operations, of matrices, topology of topological transformations" [Cavaillès 2008, 31]; but the same gesture is also at work in logic with proof theory, which takes as objects what, in reasoning, were still only acts (introduction of conjunction, elimination of implication, etc.).

9 A common example will bring out the difference between paradigm and thematization. Vectors can be seen as a generalization by paradigm of numbers, one that obeys, in part, the same laws (associativity and commutativity of addition, scalar multiplication, etc.). In the course of this generalization the objects have certainly changed in nature, since they take the form (a₁ a₂ ... aₙ) instead of the simple form a; but they remain objects nonetheless. For their part, the acts remain acts: one no longer adds quite the same objects, but one is still dealing with rule-governed operations on objects. By contrast, linear algebra as a whole contains the generalization by thematization of vector calculus. To study linear maps and their relations is to take as object what previously had the status of an act. The act of transforming vectors itself becomes an object, typically a matrix, on which new acts are possible, in particular composition by matrix multiplication. If the paradigm turns objects into objects and acts into acts, thematization abstracts from the objects and turns acts into objects.
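The passage from acts to objects described above can be sketched in a few lines of Python. The two matrices, the vector and the function names are hypothetical choices made for the illustration: the sketch only shows that acting on a vector twice gives the same result as acting once with the product matrix, so that composition becomes an operation on the matrices themselves.

```python
# Hypothetical names throughout; a 2x2 matrix is stored as a tuple of rows.

def apply(matrix, vector):
    """The act: transform a vector by the linear map that `matrix` represents."""
    return tuple(sum(a * x for a, x in zip(row, vector)) for row in matrix)

def matmul(m2, m1):
    """A new act on the thematized objects: compose two linear maps (m2 after m1)."""
    columns = list(zip(*m1))
    return tuple(tuple(sum(a * b for a, b in zip(row, col)) for col in columns)
                 for row in m2)

rotate = ((0, -1), (1, 0))   # quarter turn
stretch = ((2, 0), (0, 1))   # doubles the first coordinate
v = (3, 4)

stepwise = apply(stretch, apply(rotate, v))     # act on the vector twice
composed = apply(matmul(stretch, rotate), v)    # act once with the product matrix
```

Here `stepwise` and `composed` agree, while `matmul` never mentions a vector: it operates on the former acts as objects in their own right.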

10 This last example shows that the two processes of generalization, far from being exclusive, must be considered in their interaction: "There is no formalism without syntax, no syntax without another formalism that develops it" [Cavaillès 2008, 33]. Thematization can indeed in turn give rise to a generalization by paradigm: it is by thematization that one abstracts the idea of group from permutations on a particular set, but by paradigm that one passes from this particular group to the general notion of group; then again by thematization that one considers, under the name of group homomorphisms, the acts by which one can pass from one group to another. Neither of the two processes is therefore final: the history of mathematics is the alternation of the two gestures.

11 This twofold process of generalization is responsible for the "dialectical" nature of the history of mathematics. This thesis is formulated in full generality on the last page of Sur la logique et la théorie de la science [Cavaillès 2008, 78], but it already runs through the early works, notably Méthode axiomatique et formalisme, where Cavaillès refers to the Husserlian concept of the "thematizing faculty of mathematics⁴", which in the posthumous work he nevertheless detaches, as Hourya Sinaceur has observed, from any reference to the activity of consciousness:

12 Thematization is not, for Cavaillès, the effect of an act of the (transcendental) subject turning its "gaze" or its interest towards a something thereby constituted as an object. It is a fundamental modality of mathematical experience (or knowledge), the latter being privileged as the prototype of essentially progressive rational activity. The construction of objects is not constitution in the phenomenological sense of the term. [Sinaceur 1990, 2582]

13 Now category theory has played a double role with respect to the concepts of paradigm and thematization, since it is at once one illustration of them among others and their formalization par excellence.

1.2 Category theory as illustration and formalization

1.2.1 Categories and morphisms

14 Category theory can be described as a theory of morphisms. Unlike set theory, which defines morphisms between sets or between structures only after having explicitly defined the latter (that is, after having first specified the nature of the objects on which these morphisms will act), category theory studies the properties of morphisms while abstracting from the nature of the objects concerned.


15 The primary function of this theory is, so to speak, to factor out a certain number of results that, without it, would be scattered across as many theories as there are species of structure. For example, one defines the notion of map in set theory, the notion of homomorphism in group theory, the notion of linear map in linear algebra, the notion of continuous map in topology, etc., whereas these notions have common features that could be dealt with once and for all. Likewise, there is a family resemblance between the Cartesian product for sets, the direct product for groups, the product of vector spaces in linear algebra, etc. Instead of feigning surprise on learning the properties of each of these concepts, one can unify their theory.

16 It is this unification of mathematics "from above" that Charles Ehresmann sees at work in category theory:

At present, in a work devoted to the study of mathematical structures of any given species (for example topologies, groups, ...), the author generally begins by defining the structures in question, then introduces the notions of isomorphism and homomorphism between these structures (for example homeomorphisms and continuous maps); he then studies substructures and quotient structures. It is therefore natural to ask whether there exists a unifying notion for all these various theories, like the notion of group, which made it possible to restore the unity of Geometry after the discovery of non-Euclidean geometries. [Ehresmann 1965, VII]

17 But let us explain this notion of morphism by a few simple examples. A morphism is what associates with one object another object; it can be conceived as a process of transformation. Such is, for example, the morphism "mincer", which transforms the cow into minced steak, and which can be represented as follows:

18 From a more mathematical point of view, a morphism can be a map from one set into another, for example the function that associates with a number its square:

19 A morphism can also be described by means of internal diagrams, that is, by making explicit the internal composition of the objects involved in our morphism. Here, for example, is the morphism "AOC", which in the near future will determine the colour of the various côtes-d'auvergne wines:


20 Between sets, morphisms are maps; this is expressed in category theory by saying that the category of sets is the one whose objects are sets and whose morphisms are maps. More generally, every category is given by a certain type of objects and a certain type of morphisms.

21 Every map is a morphism, but not every morphism is a map: in particular, a morphism can exist between two mathematical structures. One can for example define the category of graphs, whose objects are graphs and whose morphisms are "graph morphisms".

22 Thus, between the streets of a city and the map of that city, there is a morphism that associates a vertex on the map with each crossroads, and an arrow with each direction of traffic of each street. There are therefore not only morphisms of elements, but also morphisms of morphisms. The only restriction is that the morphism preserve the graph structure.

23 Likewise, one can define the category of groups, whose objects are groups and whose morphisms are group homomorphisms. Take for example the group (ℤ/4ℤ, +), that is, the group of integers modulo 4. We can define a morphism from this group to the group (ℤ/2ℤ, +), as shown by a diagram representing the two groups and their morphism in an intuitive way:

24 This morphism preserves the group structure: it is therefore indeed a group homomorphism, or in other words a morphism in the category of groups.
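What it means for such a morphism to preserve the group structure can be checked mechanically. The Python sketch below assumes that the morphism in question is reduction modulo 2, which is the natural candidate but is not spelled out in the text; the function names are likewise hypothetical.

```python
# Assumption: the morphism is reduction modulo 2; the original diagram is not available here.
Z4 = range(4)

def add_mod(n):
    """Return the addition law of the group (Z/nZ, +)."""
    return lambda a, b: (a + b) % n

add4, add2 = add_mod(4), add_mod(2)

def phi(a):
    """Candidate morphism Z/4Z -> Z/2Z."""
    return a % 2

def is_homomorphism(f, elements, op_source, op_target):
    """Check f(a * b) == f(a) * f(b) for every pair of elements of the source."""
    return all(f(op_source(a, b)) == op_target(f(a), f(b))
               for a in elements for b in elements)

def not_phi(a):
    """A map that does not preserve the structure, for contrast."""
    return 1 if a == 0 else 0
```

Calling `is_homomorphism(phi, Z4, add4, add2)` succeeds, whereas the same check on `not_phi` fails, since that map does not even send the neutral element to the neutral element.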

25 As Andrea Asperti and Giuseppe Longo sum it up:

The starting point of category theory is the principle that every kind of structured mathematical object comes equipped with a notion of "acceptable" transformation or construction, that is, a morphism that preserves the structure of the object. [Asperti & Longo 1991, 43]

26 This point of view is expressly claimed by the co-founder of the theory, Saunders Mac Lane:

Category theory asks of every type of mathematical object: "What are the morphisms?"; it suggests that these morphisms should be described at the same time as the objects. [Mac Lane 1969, 30]

27 And this is precisely how one recognizes a category theorist; in a letter, the historic Bourbakist André Weil, notoriously hostile to category theory, writes to Claude Chevalley: "As you know, my honourable colleague Mac Lane maintains that every notion of structure necessarily involves a notion of homomorphism." Weil then asks Chevalley what, in his view, can possibly be gained from this kind of consideration. According to Ralf Krömer and the testimony of Pierre Cartier, Weil's hostility to category theory is the main cause of Bourbaki's silence about it.


28 Among morphisms, there is one that stands out in particular: the identity morphism, which merely preserves the object as it is. It can be represented, for instance, by a loop:

29 In the category of sets, the identity morphism is none other than the identity map; in the category of groups, it is the identity homomorphism, etc.

30 We thus see that the fundamental notion of category theory is that of morphism, which extends, very generally, to mere sets devoid of structure as well as to mathematical structures of whatever richness. One can thus define functors as morphisms acting on categories, and natural transformations as morphisms acting on functors.

1.2.2 Elementary constructions

31 Composition. Once the notion of morphism has been defined, it is essential to define the composition of morphisms, which intuitively consists in a succession of morphisms. The composition will thus be written g ∘ f:

32 In the category of sets, the composition of morphisms is the composition of maps: for example, the morphism "maternal grandfather" corresponds to the composition of the morphisms "mother" and "father":
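The "maternal grandfather" example can be written out as a small Python sketch. The family members are invented for the illustration, and `compose` is a hypothetical helper; the point is only that the composite applies "mother" first and "father" second.

```python
def compose(g, f):
    """Return g ∘ f, the map that applies f first and then g."""
    return lambda x: g(f(x))

# Hypothetical family, for illustration only.
mother = {"Alice": "Beatrice", "Colin": "Diane"}.get
father = {"Beatrice": "Edgar", "Diane": "Felix"}.get

maternal_grandfather = compose(father, mother)  # father ∘ mother
```

Note the order: the composite is written father ∘ mother, although "mother" is the first map to be applied.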

33 Without great difficulty, the composition of morphisms can be transposed to categories other than sets, for example to the category of groups:


34 These few definitions allow us to define a category in full generality. A category is made up of:
1. objects;
2. morphisms between these objects;
3. an identity morphism for each object;
4. a composition of morphisms that is associative, that is, such that h ∘ (g ∘ f) = (h ∘ g) ∘ f.
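For finite sets and maps, the requirements just listed can be verified directly. The following Python sketch uses four small sets and three maps chosen arbitrarily for the example (all names are hypothetical). It also checks that identities are neutral for composition, a standard requirement that the list above leaves implicit.

```python
def compose(g, f):
    """g ∘ f on maps stored as dicts: apply f first, then g."""
    return {x: g[f[x]] for x in f}

def identity(obj):
    """Identity morphism on a finite set."""
    return {x: x for x in obj}

# Four hypothetical objects (finite sets) and one morphism between each consecutive pair.
A, B, C, D = {1, 2}, {"a", "b", "c"}, {True, False}, {0}
f = {1: "a", 2: "c"}                       # f : A -> B
g = {"a": True, "b": True, "c": False}     # g : B -> C
h = {True: 0, False: 0}                    # h : C -> D

associative = compose(h, compose(g, f)) == compose(compose(h, g), f)
left_unit = compose(identity(B), f) == f
right_unit = compose(f, identity(A)) == f
```

A check on one triple of maps is of course not a proof; it only makes the shape of the axioms concrete.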

35 Product. Many categories can be equipped with a product.

36 Rather than give the abstract definition of this concept⁵, we shall simply say that the product of two objects, when it exists, comes with two projections that make it possible to recover the one and the other object.

37 We shall show this only by a few examples. First, the cylinder is the product of a circle and a segment: it can indeed be equipped with two projections that determine it completely. Knowing two of its carefully chosen cast shadows, one can restore the whole volume of the cylinder. There is nothing in the cylinder that cannot be recovered from its respective projections.

38 Next, the product of two sets is none other than their Cartesian product. Take for example two finite sets, one of two and the other of three elements; their Cartesian product can be represented as a rectangle.

39 This Cartesian product is defined by associating an element with each pair of elements drawn from the one and the other set.
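The Cartesian product of a two-element set and a three-element set, together with its two projections, can be sketched as follows in Python. The particular sets and the names `proj_A` and `proj_B` are assumptions made for the example.

```python
from itertools import product

A = ["x", "y"]        # a hypothetical two-element set
B = [1, 2, 3]         # a hypothetical three-element set

AxB = list(product(A, B))   # one element for each pair (a, b)

def proj_A(pair):
    """First projection: recovers the component in A."""
    return pair[0]

def proj_B(pair):
    """Second projection: recovers the component in B."""
    return pair[1]
```

The product has six elements, which can be laid out as a 2 by 3 rectangle, and each element is entirely determined by its two projections.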

40 We shall give one further example, which has the advantage of belonging to a structure and not simply to a set: the product of graphs. Their product can be obtained by first proceeding in the same way as for the Cartesian product in the category of sets, that is, by "multiplying" the elements by one another. But this cannot suffice, for it is no longer a matter of the product of two sets, but of two structures. One must therefore then compose the morphisms of the one and the other graph.


41 From this example, it is easy to see how to generate the direct product, which is none other than the product in the category of groups. Take for example the groups (ℤ/2ℤ, +) and (ℤ/2ℤ, +). By taking the Cartesian product of the underlying sets of these groups, we obtain a set with four elements; then by composing the respective morphisms of the two groups, we do obtain a new group, in this case the Klein group (ℤ/2ℤ × ℤ/2ℤ, +):
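The construction of the Klein group can be made concrete with a short Python sketch. We assume here the usual reading of the direct product, namely pairs of elements with componentwise addition modulo 2; the variable names are hypothetical.

```python
from itertools import product

Z2 = (0, 1)
V = list(product(Z2, Z2))            # underlying set: four pairs

def add(p, q):
    """Componentwise addition modulo 2 on pairs."""
    return ((p[0] + q[0]) % 2, (p[1] + q[1]) % 2)

identity = (0, 0)
closed = all(add(p, q) in V for p in V for q in V)
associative = all(add(add(p, q), r) == add(p, add(q, r))
                  for p in V for q in V for r in V)
has_identity = all(add(identity, p) == p == add(p, identity) for p in V)
self_inverse = all(add(p, p) == identity for p in V)   # no element of order 4
```

The last check shows that every element is its own inverse, which is what distinguishes this four-element group from (ℤ/4ℤ, +).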

42 The interest of category theory is to designate by a single name, and to elaborate by a single method of construction, concepts belonging to very diverse mathematical theories.

43 Sum. The sum, often called coproduct⁶, is the dual of the product. Intuitively, it corresponds to the merging of two structures into a single one, in which care is taken to avoid any uncontrolled collision⁷.

44 For example, the coproduct of two sets corresponds to their disjoint union; that is, the two sets are indeed "united" into a third, but taking care to count twice the elements that occur on both sides. Thus the disjoint union of the sets A = {a, b} and B = {b, c} would not be A⊔B = {a, b, b, c} = {a, b, c}, but A⊔B = {a, b₁, b₂, c}, the renaming making it possible to distinguish the two occurrences of the element b.
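A common way of implementing the disjoint union is to tag every element with its set of origin, as in the Python sketch below. This is one possible encoding, not the one used in the text: it renames all elements rather than only the shared element b, but the effect on b is the same.

```python
A = {"a", "b"}
B = {"b", "c"}

def disjoint_union(left, right):
    """Tag each element with its origin (1 or 2) so that shared elements stay distinct."""
    return {(x, 1) for x in left} | {(x, 2) for x in right}

plain_union = A | B                 # the two copies of "b" are merged
tagged = disjoint_union(A, B)       # ("b", 1) and ("b", 2) are kept apart
```

The ordinary union has three elements, the disjoint union four.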

45 In certain cases the sum may require a collision, but this must always be subject to strict control. For example, the sum of two pointed sets {x₀, a, b} and {x₀, c, d} must indeed be {x₀, a, b, c, d}: the resulting pointed set can contain only one base point, where the two original base points merge.
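The controlled collision of base points can be sketched in Python as follows. The representation of a pointed set as a pair (elements, base point) and the function name `pointed_sum` are assumptions made for the illustration.

```python
def pointed_sum(left, right, base="x0"):
    """Sum of two pointed sets given as (elements, base point).

    Non-base elements are tagged by origin to keep them apart; the two base
    points are identified with the single base point `base` of the result.
    """
    (left_elems, left_base), (right_elems, right_base) = left, right
    merged = {base}
    merged |= {(x, 1) for x in left_elems if x != left_base}
    merged |= {(x, 2) for x in right_elems if x != right_base}
    return merged, base

X = ({"x0", "a", "b"}, "x0")
Y = ({"x0", "c", "d"}, "x0")
elements, base_point = pointed_sum(X, Y)
```

The result has five elements: the four non-base elements, kept distinct, and a single base point.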


46 Likewise, the direct sum of vector spaces sends each zero vector of the original vector spaces to the unique zero vector of the resulting vector space.

47 The sum is thus intuitively what gathers together the content of two objects, tolerating only as much collision as is indispensable to preserving the structure of the objects: thus the disjoint union preserves the structure of set, the sum of pointed sets preserves the structure of pointed set (which would not be the case if two base points were kept), and the direct sum of vector spaces preserves the structure of vector space.

48 Once again, category theory makes it possible to factor out a certain number of concepts and theorems that, without it, are scattered across a host of theories: a bad infinity, undoubtedly.

49 There are of course many other fundamental concepts of category theory: the initial object, the terminal object, choice, determination, section, retraction, the fibred product, the limit, etc. But we shall confine ourselves here to the concepts that will directly serve the understanding of the analyses that follow.

1.2.3 Thematization in category theory

50 No less than by these fundamental concepts, category theory is distinguished by a characteristic turn of mind, which answers to the concept of thematization in Cavaillès: each category is equipped with morphisms, that is, with acts, which can be thematized in a new category. In this respect it would not be enough to say that category theorists are fond of thematization: through and through, and from its very origin, category theory is thematization.

51 Grothendieck expresses in all its purity the category theorists' wholly uninhibited recourse to quantification:

It is certain that one must be able to consider categories, functors, homomorphisms of functors, etc. as mathematical objects, over which one can quantify freely, and which one can in turn consider as forming the elements of sets. (quoted in [Krömer 2004, 259])

52 "To quantify freely", in Grothendieck's phrase, could be the slogan of category theorists; that is, in particular, to quantify over morphisms, morphisms of morphisms (functors), morphisms of morphisms of morphisms (natural transformations), etc. Thematization used to occur at the end of a slow process of incubation, which took centuries, then merely decades, and it was a matter of genius; today, immediate in category theory, it is no more than a matter of reflex.

53 On this basis one can say that category theory has provided both one illustration among others and a formalization par excellence of the concept of thematization.

54 One illustration among others, insofar as Eilenberg and Mac Lane, the founders of category theory, observing the repetition of similar gestures from one theory to another (group morphisms, vector space morphisms, morphisms of topological spaces, etc.), came to bring out the unifying concept of morphism as structure-preserving transformation. In doing so, they did generalize the notion of morphism, and took as object what previously was still only an act on lower-level objects. They thus applied the notion of thematization, and generalized (hence raised to the paradigm) the notion of morphism, by making it indifferent to the structures on which it bears.

55 But category theory is also the formalization par excellence of Cavaillès's concepts, and most particularly of the concept of thematization, insofar as it turns into a mathematical procedure what, in Cavaillès's eyes, remained within the limits of a philosophical analysis of the history of mathematics. The philosophical concept has become a mathematical concept, thereby returning to the discipline whose practice, as Hourya Sinaceur has seen [Sinaceur 1990, 2581], had inspired Husserl. Philosophy would thus have given mathematics the means to think by its own powers the procedure that it was applying implicitly.

56 Category theory, whose birth is contemporary with Cavaillès's last work, thus provides his cardinal concepts of paradigm and thematization with an application as well as a formalization.

1.2.4 Historical remarks

57 Since Cavaillès first made his name through his careful study of the history of set theory, one may be surprised to see him brought close to the categorical point of view. But it would be reductive to regard him as "the philosopher of set theory", for in Sur la logique et la théorie de la science set theory is no longer at the heart of his concerns. As Jan Sebestik observes:

In his two theses, Méthode axiomatique et formalisme and Remarques sur la formation de la théorie abstraite des ensembles (1938), Cavaillès studied from the inside the birth of set theory and the problem of the foundations of mathematics, from its "prehistory" (Bolzano) to the Zermelo-Fraenkel axiomatization and to Hilbert's programme with its vicissitudes, leading to the formalization of mathematics and logic, as well as to a new vision of mathematics opened up by Gödel's discoveries. […] In the articles that accompany and follow his theses, Cavaillès changes register: the internal and technical study of the problem of foundations being completed, he can now modify his approach, illuminate his own work by putting it in perspective, and emphasize the lines of development that orient the axiomatic work of mathematical theories. [Sebestik 2008, 91-92]

58 Here it is no longer so much the foundation of mathematics that interests Cavaillès as the forces that govern its becoming. This change of perspective is characteristic of the categorical point of view, as opposed to the set-theoretic point of view.

59 Now, this change of perspective is not without historical echoes. Cavaillès frequented Bourbakist circles at the École normale supérieure, in Strasbourg and (by force of circumstances but above all of ideas) in Clermont-Ferrand; like Bourbaki, he was influenced by the work of Emmy Noether; and Ehresmann, before promoting category theory [Ehresmann 1965], was the friend and then one of the posthumous editors of Cavaillès. One finds, in retrospect, another indication of this in Saunders Mac Lane, when he brings out from the history of mathematics two principal forms of creation of objects, generalization and abstraction, using moreover the same examples as Cavaillès [Mac Lane 1986, 434-438]. The point is obviously not to maintain an unverifiable, and probably nonexistent, direct influence of Cavaillès on category theory, or of category theory on Cavaillès: Eilenberg and Mac Lane were no more in contact with Bourbaki than with Cavaillès when they developed the fundamental concepts of category theory. But it suffices to indicate that there exists in the mathematics and philosophy of that period a common tendency. If Cavaillès, shot in 1944, did not have time to know the fundamental concepts of category theory, published in September 1945, he frequented the milieu in which these concepts were being born.

1.3 A Hegelian "dialectic"?

60 The word dialectic, on which Cavaillès concludes Sur la logique et la théorie de la science, has resonances (Platonic, Kantian and Hegelian) that could not fail to arouse the curiosity of commentators. It is a disputed question whether Cavaillès was thinking of Hegel when he ended his work with the idea of a dialectical process. Jean-Toussaint Desanti firmly rejects this hypothesis: "It was not of Hegel, whom he distrusted, that he was thinking when he wrote these words" [Desanti 1981, IV]. Gilles-Gaston Granger, for his part, favours the Platonic hypothesis: "It remains to elucidate the exact sense of this last word, a sense original in Cavaillès, and in any case irreducible to the one found in Plato or in Hegel, although perhaps still less remote from the former than from the latter" [Granger 1998, 72]. Finally, Canguilhem considers that Hegel was still too little known in France before the great commentaries of Koyré and Kojève [Canguilhem 1989, 686].

61 But Jules Vuillemin is said to have had from Cavaillès himself the admission of a reference to Hegel: referring to La Philosophie de l'algèbre [Vuillemin 1962], Élisabeth Schwartz speaks of a "powerful Hegelian presence, of which Jules Vuillemin told us he had it from Cavaillès that this was indeed what his late master recognized in invoking the philosophy of the concept" [Schwartz 2005, 23]. Schwartz has thus been able to show in what sense Cavaillès "chooses the language of Hegel to express the culmination of his mathematical Spinozism" [Schwartz 2006, 306], and how the Hegelian reference sheds light on Cavaillès's never-broken relation to German idealism [Schwartz 2006, 311-312 and 319 ff.].

62 Following this hypothesis, we shall try to make sense of the interpretation of Hegel that it may presuppose. For if the allusion is indeed Hegelian, one may ask whether this interpretation of the dialectical process allows a return to Hegel. A metaphor always works in both directions: the thing compared acts back on that to which it is compared. The reference to Hegel is illuminating only if we determine which Hegel is at issue. Can the notions of paradigm and thematization shed light on the unfolding of the Hegelian dialectic? Or, in other words, what image could Cavaillès have had of the Hegelian dialectic, if that is indeed what is meant, in order to compare the history of mathematics to it?

63 We have very little to go on regarding Cavaillès's interpretation of Hegel; apart from a brief mention in Sur la logique et la théorie de la science [Cavaillès 2008, 5], the name practically never appears in his writing. But if category theory can be recognized as the application and formalization of the concept of thematization, then the repeated allusions of the category theorist William Lawvere to Hegelian logic might give us some leads on the nature of the possible relation between Hegelian logic and thematization.


2 Lawvere and the concept of Aufhebung in category theory

64 The Science of Logic occupies a central place in the Hegelian system: it is both the end point of the Phenomenology of Spirit, since spirit that has reached absolute knowing is ripe for the development of a logic, and the starting point of the Encyclopedia. Yet this term "logic" could not fail to intrigue those who today study what now goes by that name, namely a body of axioms and concepts, and even a certain style, inherited notably from Frege and Russell.

65 But the encounter between Hegelian logic and modern logic has very often been disappointing on both sides.

66 The most frequent attitude has been an outright rejection of what Hegel calls "logic". Russell set the example by presenting as a salutary conversion his "rebellion", which he dates to late 1898, against a Hegelian thought of which he had first been a disciple [Russell 1961, ch. 5]. He did not fail thereafter to mock the dialectic for accommodating contradictions too easily, notably the liar paradox, whereas formal logic treated them with gravity [Russell 1961, 97]. Robert Blanché follows in Russell's line when, in La Logique et son histoire, otherwise so thorough, he devotes barely a few lines to Kantian and Hegelian logic, lines in which he denies these undertakings the very name of logic [Blanché & Dubucs 1996, 247-248].

67 Also noting that Hegelian logic could not easily be expressed in the formal logic inherited from Russell, others took the opposite course and tried to bring logic to heel. This gave rise to sometimes pathological systems whose sole purpose was to permit the formalization of Hegelian logic. Thus the Polish logician Rogowski, hoping to do justice to dialectical thought, developed many-valued logics equipped with an ever-growing number of truth values.

68 Although opposed, these two attitudes proceed from the same view of Hegelian thought. To reject Hegelian logic on the pretext that it cannot be formulated in a standard logic, or to adopt minority logics in order to understand Hegel, is to endorse the idea that Hegelian thought must be adopted or rejected by some act of faith, the passage from one to the other being possible only by way of a conversion and, worse still, an arbitrary conversion.

69 Now, with respect to this debate, there is one initiative that deserves to be examined in all its originality: that of William Lawvere. His work is admirable in at least three respects. First, his research has left a lasting mark on the history of category theory; it is chiefly to him that we owe the view that it provides a universal and elegant foundation for mathematics.

70 Second, Lawvere has distinguished himself by the quality of his introductory works, which abound in concrete examples: favourite breakfasts, flies in flight and Chinese restaurants. Mathematical rigour is not thereby sacrificed on the altar of pedagogy, and Saunders Mac Lane, co-founder with Eilenberg of category theory, does not hide his admiration for the book Conceptual Mathematics and for the rigour that Lawvere manages to convey to his students⁸.

71 Finally, Lawvere is remarkable, and rare, in having always campaigned against the separation of philosophy and the sciences; not so much to deplore, as one often hears, the poor scientific level of philosophers as, more originally, to criticize the tendency of scientists in general, and mathematicians in particular, to take an interest at most in the mathematical foundations of their discipline while dismissing out of hand the idea of philosophical foundations. In this Lawvere claims the legacy of Grassmann, who could not imagine writing his Ausdehnungslehre of 1844 without prefacing it with an introduction in which the concept of linear algebra would be deduced a priori from the nature of thought [Grassmann 1878, Introduction].

72 But interest in William Lawvere's work is redoubled once one notices that, in at least four of his articles on category theory, he alludes to Hegelian logic [Lawvere 1989], [Lawvere 1991], [Lawvere 1992], [Lawvere 1996b]. Compared with the two attitudes mentioned above concerning the relation between formal logic and Hegelian dialectic, this categorical interpretation of the dialectic would have the great merit, on the one hand, of not giving up on understanding Hegel and, on the other, of interpreting him within a canonical framework rather than in some dialect created ad hoc.

2.1 Dialectic and mathematics

73 Lawvere claims a double heritage: that of Grassmann and, of course, that of category theory. These are, in his view, so many good reasons to want to maintain a living link between mathematics and philosophy.

74 Lawvere indeed refers to Grassmann, who in the Introduction to the Ausdehnungslehre of 1844 distinguished dialectic from mathematics as follows: "The opposition (Gegensatz) between the universal and the particular determines […] the division of the formal sciences into dialectic and mathematics. The former is a philosophical science, in that it seeks unity in all thought, whereas mathematics takes the opposite direction, in that it apprehends everything it thinks, each time, as a particular." [Grassmann 1878, xxii]

75 Lawvere formalizes this distinction as follows, in a manner strongly influenced by the traditional graphical style of category theory: [Diagram not reproduced.]

76 But he immediately adds that one cannot rest content with this opposition between mathematics and dialectic: "We need an instrument that guides our students to follow each of these two activities in a unified way, passing from the general to the particular and from the particular to the general. I believe that the theory of mathematical categories […] can embody such an instrument." [Lawvere 1996a, 256]


77 Note in passing the care with which Lawvere links philosophical and pedagogical interests: what could be more Hegelian than maintaining this link between encyclopedia and pedagogy?

78 In categorical terms, Lawvere's hope is that morphisms can be brought out between mathematics and dialectic, that is, that the two disciplines can be unified, but at a higher level, by way of their thematization. If mathematics and dialectic are morphisms between the general and the particular, unifying them will be possible only on condition of exhibiting morphisms between these morphisms.

79 The aim is therefore neither to reduce mathematics to philosophy nor to reduce philosophy to mathematics, but to bring out the common ground from which their distinction arises. The title Conceptual Mathematics, which Lawvere gives to his introductory book on category theory [Lawvere & Schanuel 1991], must therefore be taken in a strong sense.

2.2 Dialectic and category theory

80 In his 1992 article "Categories of Space and of Quantity", Lawvere shows that category theory makes it possible to express certain dialectical processes at work in contemporary mathematics. The latter achieves on more than one occasion what Lawvere calls a "unity and identity of opposites" (also abbreviated UIO): space and quantity, intensive and extensive quantity, general and particular [Lawvere 1992, 14], and so on. Heroism is not too strong a word for Lawvere's tour de force: the article contains strictly no mathematical symbols, nor indeed any variable names, not even ones as trivial as the number x, the function f or the object A. Despite, or perhaps because of, its mathematical abstraction, the 1992 article is composed solely of words from the English dictionary.

81 We shall examine the example of the opposition between space and quantity, to see how, according to Lawvere, category theory shows their identity at a higher level.

82 Lawvere starts from the definition of product and coproduct in category theory. These exist in many mathematical structures, but do not always interact in the same way. More precisely, two types of relation are particularly frequent, while being mutually exclusive: either the product distributes over the coproduct, or the two operations are isomorphic.


2.2.1 Distributive categories

83 When the product distributes over the coproduct, Lawvere speaks of a distributive category. For example, in elementary arithmetic the following equation always holds:

84 a × (b + c) = a × b + a × c.

85 In set theory, the Cartesian product distributes over the disjoint union:

86 A × (B ⊔ C) = (A × B) ⊔ (A × C)

87 As further examples of distributive categories Lawvere cites continuous sets, differentiable spaces, measurable spaces and combinatorial spaces. From the fact that spatial categories are distributive categories, Lawvere infers that one may usefully posit the converse and identify the "categories of space" with the distributive categories.
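
The distributive law for sets can be checked mechanically on a small finite example. The following Python sketch is an illustration added here and is not taken from Lawvere's article: the sets A, B and C and the helper names disjoint_union and distribute are hypothetical choices. It models the disjoint union by tagging elements and checks that the evident map from A × (B ⊔ C) to (A × B) ⊔ (A × C) is a bijection, which is the sense in which the two sides can be identified.

```python
from itertools import product


def disjoint_union(left, right):
    """Tagged union: elements of `left` get tag 0, elements of `right` get tag 1."""
    return {(0, x) for x in left} | {(1, y) for y in right}


def distribute(pair):
    """Send (a, (tag, x)) in A x (B + C) to (tag, (a, x)) in (A x B) + (A x C)."""
    a, (tag, x) = pair
    return (tag, (a, x))


# Hypothetical finite sets; B and C overlap on purpose, so that the
# tagging done by the disjoint union actually matters.
A = {"a1", "a2"}
B = {"b1", "b2", "b3"}
C = {"b1", "c1"}

lhs = set(product(A, disjoint_union(B, C)))                   # A x (B + C)
rhs = disjoint_union(set(product(A, B)), set(product(A, C)))  # (A x B) + (A x C)

# The map is onto rhs and loses no element of lhs, hence it is a bijection.
image = {distribute(p) for p in lhs}
is_bijection = (image == rhs) and (len(image) == len(lhs))
```

Tagging is what keeps the shared element "b1" from being counted only once; with a plain union in place of the disjoint union, the two sides would still match each other but would no longer model the coproduct of sets.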

2.2.2 Linear categories

88 In contrast to distributive categories, Lawvere defines linear categories by the isomorphism that sometimes exists between product and coproduct.

89 This isomorphism can be explained by the notion of direct sum in linear algebra. Take two finite-dimensional vector spaces, for example a line and a plane. Their direct sum is the vector space that embeds both in one and the same vector space, taking care that their overlap is limited to the strict minimum needed to preserve the vector space structure; in particular, it must contain only one zero vector. In this case the direct sum therefore amounts to constructing a three-dimensional vector space. The direct sum is thus indeed the incarnation of the coproduct in the category of vector spaces.

90 But once this direct sum is given, one can also define two projections, which recover the original vector spaces. We can project our three-dimensional space onto one of its dimensions on the one hand and onto the two remaining dimensions on the other. Equipped with these projection morphisms, our sum then looks strangely like a product. That is why, generalizing this situation in which product and coproduct are isomorphic, Lawvere calls the categories that satisfy this isomorphism linear categories, as one speaks of linear algebra for vector spaces.
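
The line-and-plane example can be made concrete in a few lines of Python. This is only a sketch under stated assumptions: vectors are represented as tuples of floats, and the function names (inject_line, inject_plane, project_line, project_plane, add) are hypothetical labels introduced for this illustration. It exhibits the two injections that make the three-dimensional space a sum and the two projections that make it a product, and checks the identities relating them.

```python
def inject_line(u):
    """Embed a vector of the line (1 coordinate) into the 3-dimensional sum."""
    return u + (0.0, 0.0)


def inject_plane(v):
    """Embed a vector of the plane (2 coordinates) into the 3-dimensional sum."""
    return (0.0,) + v


def project_line(w):
    """Project the 3-dimensional space onto its first coordinate."""
    return w[:1]


def project_plane(w):
    """Project the 3-dimensional space onto its last two coordinates."""
    return w[1:]


def add(w1, w2):
    """Coordinate-wise vector addition."""
    return tuple(x + y for x, y in zip(w1, w2))


u = (2.0,)        # a vector of the line
v = (3.0, -1.0)   # a vector of the plane
w = add(inject_line(u), inject_plane(v))  # their image in the direct sum

# Projecting after injecting recovers the original vector ...
recovers_line = project_line(inject_line(u)) == u
recovers_plane = project_plane(inject_plane(v)) == v
# ... the "cross" composites are zero, so the two summands share only the zero vector ...
cross_is_zero = project_plane(inject_line(u)) == (0.0, 0.0)
# ... and every vector of the sum is rebuilt from its two projections.
rebuilt = add(inject_line(project_line(w)), inject_plane(project_plane(w))) == w
```

The same object thus carries both the injections of a coproduct and the projections of a product, which is the situation that the text describes as product and coproduct being isomorphic.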

91 There are other examples of linear categories: that of abelian groups⁹, that of topological vector spaces, and so on. Insofar as vectors can be seen as quantities, Lawvere proposes to identify linear categories with the "categories of quantity".


2.2.3 Quantities of space and spaces of quantities

92 The notions of space and quantity have been defined by Lawvere as opposed categories: the one satisfies the law of distributivity of product over sum, the other does not. Far from minimizing this opposition, Lawvere goes on to show how each of the opposites can, at a higher level, pass into the other: to this end he describes on the one hand the generation of quantities of space, and on the other that of spaces of quantities.

93 Quantities of space, to begin with, correspond according to Lawvere to extensive quantities, that is, quantities resulting from an activity of measurement, such as mass and volume; they are distinguished from intensive quantities such as density, which consist of ratios between extensive quantities [Lawvere 1992, 18]. Extensive quantities thus show that one can determine morphisms, or more precisely functors, from a distributive category to a linear category. They bring to light genuine "quantities of space".

94 But the reverse path is also possible: there are functors from a linear category to a distributive category. If we consider the variation of a quantity, this variation, as a function, belongs to the space of functions. One can therefore construct "spaces of quantity" just as well as "quantities of space".

95 What this dialectic, which is in no way artificial, reveals is that although distributive and linear categories, or more specifically categories of space and of quantity, are radically and definitively opposed, it is nonetheless possible to define morphisms leading from each to the other. This obviously cannot be done at their own level, where the opposition remains, but only by moving to the immediately higher level of thematization, at which the morphisms (or acts) of a distributive category become the objects of a linear category, and the morphisms of a linear category the objects of a distributive category.

2.3 Aufhebung as a mathematical concept

96 Having shown by precise examples that contemporary mathematics, thanks to category theory, can positively determine the unity and identity of opposed objects, Lawvere sets out to bring out the abstract structure of this dialectical process.

97 Given two objects equipped with morphisms leading from the first to the second and from the second to the first, Lawvere calls Aufhebung the smallest category that resolves in this way the opposites of a given category [Lawvere 1989, 68]. This is why he puts forward the term bicategory¹⁰, which can be represented by the following diagram: [Diagram not reproduced.]


98 A doubt may nevertheless remain about Lawvere's use of the Hegelian dialectic: nowhere is it supported by any precise quotation. This silence, quite natural given that Lawvere's primary intention is to import the dialectic into mathematics, cannot fail to leave the historian of philosophy unsatisfied.

99 We shall therefore try to give content to this categorical interpretation of Hegel by studying the dialectic of being and nothing as it is set out in three major texts: the Science of Logic, the Encyclopedia and the Lectures on the History of Philosophy.

3 Hegel and the dialectic of being and nothing

100 The dialectic of being and nothing is perhaps, of all the Aufhebungen of the Encyclopedia, the one that Hegel developed most frequently and at greatest length. Standing at the threshold of the system, it is the one charged with embodying the method. It is the spearhead, others would say the Trojan horse, of the dialectic.

101 This dialectic appears in a particular light once it is compared with the Lectures on the History of Philosophy; for one quickly sees, in the section devoted to Greek philosophy, that there is not one but three dialectics of being and nothing. The first is that of Parmenides and Zeno, according to which "being is, non-being is not": this thought of being and nothing maintains the mutual and definitive exclusion of the two concepts. The second is that of Heraclitus, who shows the true path to follow, that of the unification of opposites. The third is the atomism of Leucippus and Democritus, a parody of dialectic that would be speculatively unsatisfying, except precisely as a counterexample.

3.1 Eleaticism and the discrete category

102 Eleaticism is contained entirely in the Parmenidean statement: "Being is, non-being is not." While crediting the Eleatics with having expressed the first genuinely conceptual thought, Hegel does not fail, in the Lectures on the History of Philosophy, to point out the extreme "poverty" of their point of view, which corresponds very precisely to the abstraction and poverty of the concepts of being and nothing in general in the Science of Logic.

103 From the point of view of category theory, given the two opposites being and nothing, Eleaticism would be limited to the trivial assertion of the identity morphisms defined on each of them: [Diagram not reproduced.]

104 This thought is content to gloss an opposition taken as given, without calling it into question, simply equipping it with identities, that is, with tautologies.

105 Such is the abstract form of the opposition of being and nothing as thought and expressed by Eleaticism. The category thus defined is sometimes called 1 + 1, and it is an example of a discrete category: "A category is discrete when every arrow is an identity" [Mac Lane 1969, 11]. This type of category, like Eleaticism according to Hegel, generally has little to teach us: it is fundamentally poor.
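
The discrete category described above can be written out explicitly. The following LaTeX fragment is a sketch added for illustration; the labels E (being, être) and N (nothing, néant) are notation assumed here and do not come from the sources cited.

```latex
% E = being (etre), N = nothing (neant): labels assumed for this illustration.
\[
  \mathrm{Ob}(\mathbf{1}+\mathbf{1}) = \{E, N\}, \qquad
  \mathrm{Hom}(E,E) = \{\mathrm{id}_E\}, \qquad
  \mathrm{Hom}(N,N) = \{\mathrm{id}_N\},
\]
\[
  \mathrm{Hom}(E,N) = \mathrm{Hom}(N,E) = \varnothing .
\]
```

The two empty hom-sets are what express, in this reading, the mutual exclusion of being and nothing: there is no arrow along which one could pass into the other.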

3.2 Heracliteanism and the bicategory

106 The dialectic of being and nothing in the Science of Logic posits the concept of becoming as the "nearest example" [Hegel 1970, 353] of the unity of the concepts of being and nothing.

107 Strictly speaking, many other concepts unify these two concepts: not only that of beginning [Hegel 1970, 353], but also all the subsequent concepts: being-there, being-for-itself, quantity, and so on. To take much later examples, life is not conceivable without this unity of being and nothing, any more than spirit is [Hegel 1970, 524]. But if each of these later concepts contains this unity of being and nothing, it does so on the one hand because it presupposes the concept of becoming, and on the other by adding far more determinations than are necessary. The privileged role of becoming in this respect is to be the minimal concept that concretely synthesizes these two opposed concepts. The concept of beginning is, for its part, "equally near". It can therefore be substituted for that of becoming: "The thing, in its beginning, is not yet, but the beginning is not simply its nothing; its being is also already in it" [Hegel 1970, 353]. It is therefore to Heraclitus that the merit belongs of having discovered "the first concrete, the absolute as having in it the unity of opposed terms" [Hegel 1971, vol. XVII, 344], which Hegel also calls "the first unity of opposed determinations" [Hegel 1971, vol. XVII, 350].

108 What is essential to this living synthesis of being and nothing is the passages that open between the two concepts: "What is the truth is neither being nor nothing, but the fact that being, not passes over, but has passed over into nothing, and nothing into being. Yet their truth is just as much not their state of non-differentiation, but the fact that they are absolutely different, and that nevertheless, just as immediately, each disappears into its contrary. Their truth is therefore this movement of the immediate disappearing of the one into the other: becoming; a movement in which the two are different, but by way of a difference that has just as immediately dissolved itself." [Hegel 1972, 23] In the additions to the Encyclopedia, Hegel specifies, if need be: "In the history of philosophy, it is the system of Heraclitus that corresponds to this stage of the logical Idea" [Hegel 1970, 523].

109 In categorical terms, one may say that Heracliteanism equips the pair of being and nothing with two new morphisms: one from being to nothing, which is passing away, that is, sliding from the present (which is) to the past (which is not); the other from nothing to being, namely coming to be, sliding from the future (which is not) to the present (which is). There are therefore morphisms between the two objects being and nothing: these are not isolated, confined within their respective identities.

110 The concept of becoming corresponds to the theory, no longer really of being and nothing, which become abstractions or rather impoverishments of it, but of passing away and coming to be. Studying the relations between these two new objects, or more precisely between these two morphisms, is the contribution of Heracliteanism. The dialectic, in this concrete case, can therefore be described as a bicategory, that is, a category equipped of course with morphisms between objects, but above all with morphisms between these morphisms.
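
The data just described can be laid out in symbols. The LaTeX fragment below is a hedged sketch, not a reconstruction of Lawvere's own diagram: the labels E (being), N (nothing), p (passing away) and a (coming to be) are assumed here, and which morphisms between morphisms actually exist is left unspecified, since the text does not say.

```latex
% E = being, N = nothing, p = passing away, a = coming to be: assumed labels.
\[
  p : E \to N, \qquad a : N \to E,
\]
\[
  a \circ p : E \to E, \qquad p \circ a : N \to N,
\]
% A morphism between morphisms relates two parallel arrows f, g with the same source and target:
\[
  \alpha : f \Rightarrow g .
\]
```

Unlike the discrete category of Eleaticism, the hom-sets between E and N are no longer empty, and the composites of p and a provide arrows from each object to itself that need not be identities.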

111 Admittedly, the concept of becoming is still abstract in Heraclitus, but it at least has the merit of showing the way forward; and that this concept is still abstract is simply the sign that the dialectic is far from complete.

3.3 Atomism and the linear category

112 At this stage of the dialectic, as presented in the Encyclopedia and the Science of Logic, the matter is closed. But the history of philosophy, like any history, sometimes undergoes slowdowns, if not certain forms of regression. The dialectic of being and nothing underwent one in the figure of the atomism of Leucippus and Democritus.

113 The philosophy of Empedocles had served as a warning: there are sometimes, in the history of philosophy, false syntheses, parodies of dialectic, which Hegel generally brands with the simple word Synthese [Hegel 1971, vol. XVII, 379-380], or Eklektismus¹¹ [Hegel 1975, vol. XIX, 31-35]. Faced with a given opposition, such thinking is content to mix the opposites and to present this mixture (a mixture of the four elements, a mixture of love and hate) as their unification.

114 The atomism of Leucippus and Democritus started, according to Hegel, from the same principle as Eleaticism: the same arises only from the same, and nothing arises from nothing. But unlike the Eleatics, the atomists did not stop at this formal opposition and sought to overcome the contradiction. They thus translated the opposition of being and nothing into that of the atom and the void [Hegel 1971, vol. XVII, 386], and then contented themselves with "mixing" the void and the full. But the opposites are unified only very superficially: there is here "a synthesis that is not determined by the nature of what is united, but in which these beings in and for themselves remain at bottom separate" [Hegel 1971, vol. XVII, 388]. Atomism, like the Empedoclean Synthese, thus quite simply projects the opposites into different regions of the real. This procedure, which allows anything to be reconciled with its opposite provided one takes care over the overlap, has a name in category theory: the sum.

115 But this new category can also be equipped with projection morphisms: abstracting from the nothing there is in the real (the void), one recovers in the atom the main lines of the Eleatic theory of being; and abstracting from what there is of being in the world, one recovers the main lines of the Eleatic theory of nothing. Insofar as these two projections exist in Hegel's eyes, the sum can just as well be conceived as a product. The category is therefore in all likelihood linear, insofar as the sum and the product are isomorphic.


116 Lawvere's analyses can therefore be extended. Not only can the dialectical process, that of the unity and identity of opposites, be described as a bicategory, but the vocabulary of category theory also makes it possible to distinguish the true dialectic from its two antagonists, which are merely two variants of the thinking of the understanding, the thinking that does not thematize. To remain at the formal opposition of two concepts is to stay within a discrete category; and to want to bring the opposites together in a purely external way is to settle for the categorical sum, or equally the categorical product, of a linear category.

117 This mathematized representation of the Hegelian dialectic may well seem suspect. Does it not amount to mechanizing, and hence devitalizing, Hegelian philosophy, replacing it with purely formal games on arrows and on arrows between arrows, and losing along the way the intuitive meaning of the concepts brought to light by the dialectic? But this is a formalism against which Lawvere sought to defend himself: "What is particularly striking is that the Hegelian analysis of any […] turns out to involve graphic monoids that are in fact bicategories. Thus the organization of any branch of knowledge, insofar as it can be mathematical (that is, teachable), can to some extent be reflected in graphical representations. Although proposed nearly 200 years ago, the Hegelian method of analysis has been largely underused since then; supposedly contradictory ideological claims, asserting either that it is incoherent or that its fluidity is too admirable to be mathematized, have conspired against a wide diffusion of its teaching. We believe we have shown by a few modest examples that it is coherent (and non-trivial), and that a large part of the method can be mathematized, in the service of those who want to use it seriously, including for the part that remains fluid." [Lawvere 1989, 52-53. Our emphasis.]

118 Lawvere thus sought to devote himself to a genuine defence and illustration of the dialectical method. It is a task that one man can hardly accomplish alone: one must "clarify the usefulness of these particular interpretations of the dialectical method of investigation". But for that, Lawvere adds, "we greatly need the help of interested philosophers and mathematicians" [Lawvere 1992, 30].

Conclusion

119 This appeal by William Lawvere to philosophers was perhaps only a return to the philosophical sources of category theory: shortly after Cavaillès observed and conceptualized a phenomenon inherent in the history and practice of mathematics, mathematicians in turn thematized this process, giving rise to category theory, the illustration and formalization of that activity. Mathematical practice was thus formulated in philosophical concepts, which could in turn be formalized by mathematical concepts. Through thematization, mathematical practice itself becomes an object of mathematical knowledge.

120 Hegel distinguished philosophical from mathematical knowledge by the absence of dialectic in the latter [Hegel 2006, Preface, XLVIII ff.]. By describing the dialectical nature of mathematical practice, Cavaillès and Lawvere have paradoxically shown, against Hegel, what was profoundly Hegelian in the history of mathematics: the philosophical concept immanent in the mathematical concept.

BIBLIOGRAPHY

ASPERTI, ANDREA & LONGO, GIUSEPPE
1991 Categories, Types, and Structures. An Introduction to Category Theory for the Working Computer Scientist, Foundations of Computing, Cambridge: MIT Press.

BLANCHÉ, ROBERT & DUBUCS, JACQUES
1996 La Logique et son histoire, U Philosophie, Paris: Armand Colin.

CANGUILHEM, GEORGES
1989 Une vie, une œuvre. 1903-1944, Jean Cavaillès, philosophe et résistant, in Œuvres complètes de philosophie des sciences, edited by CAVAILLÈS, JEAN, Hermann, 683-686, 1994.

CAVAILLÈS, JEAN
1981 Méthode axiomatique et formalisme. Essai sur le problème du fondement des mathématiques, Paris: Hermann.
2008 Sur la logique et la théorie de la science, Bibliothèque des textes philosophiques, Paris: Vrin.

DESANTI, JEAN-TOUSSAINT
1981 Souvenir de Jean Cavaillès, in Méthode axiomatique et formalisme. Essai sur le problème du fondement des mathématiques, edited by CAVAILLÈS, JEAN, Paris: Hermann, I-IV.
1999 Variantes philosophiques 2 : Philosophie, un rêve de flambeur, Figures, Paris: Grasset.

EHRESMANN, CHARLES
1965 Catégories et structures, Travaux et recherches mathématiques, Paris: Dunod.

GRANGER, GILLES-GASTON
1988 Pour la Connaissance philosophique, Paris: Odile Jacob.
1998 Jean Cavaillès et l’histoire, Philosophia Scientiæ, 3(1), 65-77.

GRASSMANN, HERMANN GÜNTHER
1878 Die Ausdehnungslehre von 1844 oder die lineale Ausdehnungslehre, ein neuer Zweig der Mathematik, Leipzig: O. Wigand.

HEGEL, GEORG WILHELM FRIEDRICH
1970 Encyclopédie des sciences philosophiques, I. La science de la logique, Bibliothèque des textes philosophiques, Paris: Vrin, 1994.
1971 Leçons sur l’histoire de la philosophie, I. La philosophie grecque, de Thalès à Anaxagore, Bibliothèque des textes philosophiques, Paris: Vrin.
1972 Science de la logique. L’être. Édition de 1812, Bibliothèque philosophique, Paris: Aubier-Montaigne, 1987.
1975 Leçons sur l’histoire de la philosophie, IV. La philosophie grecque : le dogmatisme et le scepticisme, les néoplatoniciens, Bibliothèque des textes philosophiques, Paris: Vrin.
2006 Phénoménologie de l’esprit, Bibliothèque des textes philosophiques, Paris: Vrin.


KRÔMER, RALF 2004 La théorie des catégories : ses apports mathématiques et ses implications épistémologiques. Un hommage historico-philosophique, Thèse de doctorat, Université Nancy 2 - Universitât des Saarlandes.

LAWVERE, F. WILLIAM 1989 Display of Graphics and their Applications, as Exemplified by 2-Categories and the Hegelian "Taco", dans Proceedings of the First International Conference on Algebraic Methodology and Software Technology, University of Iowa, 51-75. 1991 Some thoughts on the future of category theory, Lecture Notes in Mathematics, 1488/1991, 1-13. 1992 Categories of space and of quantity, dans The Space of Mathematics : Philosophical, Epistemological and Historical Explorations, Berlin : De Gruyter, 14-30, International Symposium on Structures in Mathematical Theories (1990), San Sebastian, Spain. 1996a Grassmann’s dialectics and category theory, dans Proceedings of the 1994 Conference to commemorate 150 years of Grassmann’s "Ausdehnungslehre", Hermann Günther Grassmann (1809-1877): Visionary Mathematician, Scientist and Neohumanist Scholar, Boston Studies in the Philosophy of Science, édité par SCHUBRING, GERT, Dordrecht-Boston-London: Kluwer Academic Publishers, t. 187, 255-264. 1996b Unity and identity of opposites in calculus and physics, Applied Categorical Structures, 4, 167-174, doi: 10.1007/BF00122250.

LAWVERE, F. WILLIAM & SCHANUEL, STEPHEN H. 1991 Conceptual Mathematics. A First Introduction to Categories, Cambridge : Cambridge University Press, 1997.

MAC LANE, SAUNDERS 1950 Duality for groups, Bulletin of the American Mathematical Society, 56(6), 485-516. 1969 Categories for the Working Mathematician, Graduate Texts in Mathematics, New York : Springer, 1991. 1986 Mathematics : Form and Function, New York : Springer. 1997 Reviewed work: Conceptual mathematics: A first introduction to categories by F. William Lawvere; Steven Schanuel; Emilio Faro; Fatima Fenaroli; Danilo Lawvere; Students of Mathematics 108 of the University of Buffalo, The American Mathematical Monthly, 104(10), 985-987.

RUSSELL, BERTRAND 1961 Histoire de mes idées philosophiques, Tel, Paris : Gallimard.

SCHWARTZ, ELISABETH 2005 Histoire des mathématiques et histoire de la philosophie chez Jules Vuillemin, dans Philosophie des mathématiques et théorie de la connaissance. L’œuvre de Jules Vuillemin, édité par RASHED, ROSHDI, Sciences dans l’histoire, Paris : Albert Blanchard, 1-28. 2006 Jean Cavaillès (1903-1944), Revue d’Auvergne, 580-581, 305-329.

SEBESTIK, JAN 2008 Postface, dans Sur la Logique et la théorie de la science, édité par CAVAILLÈS, JEAN, Bibliothèque des textes philosophiques, Paris : Vrin, 91-142.

SINACEUR, HOURYA 1990 Encyclopédie philosophique universelle, Paris : Presses Universitaires de France, t. II, chap. Thématisation, 2581-2582. 1994 Jean Cavaillès. Philosophie mathématique, Philosophies, Paris : Presses Universitaires de France.


VUILLEMIN, JULES 1962 La Philosophie de l’algèbre. I. Recherches sur quelques concepts et méthodes de l’Algèbre moderne, Épiméthée, Paris : Presses universitaires de France.

NOTES

*. This text was first delivered, in partial but convergent forms, under the titles « Jean Cavaillès entre Hegel et la théorie des catégories » (conference "Concepts purs, concepts appliqués" organized by the PhénoMath group, Université de Nice, 10 December 2010), and then « Logique hégélienne et théorie des catégories chez William Lawvere » (study day "Modèles mathématiques pour la philosophie", organized by the author, Maison des sciences de l'homme de Clermont-Ferrand, 26 March 2011). The author warmly thanks Jocelyn Benoist, Sylvain Cabanacq, Marion Duquerroy, Thierry Paul, Andrei Rodin and Hourya Sinaceur, as well as the editors and the anonymous referees of the journal. He dedicates this text to Elisabeth Schwartz and to her close family.

2. Restricted to the second aspect, that of formalization, this hypothesis was formulated by Jean-Toussaint Desanti [Desanti 1999, 223-225].

3. Among the best commentaries on this passage are those of Hourya Sinaceur [Sinaceur 1994, 95 sq.] and Gilles-Gaston Granger [Granger 1988, 70-81].

4. In this work, Cavaillès discusses the generalizations of the concepts of addition and multiplication [Cavaillès 1981, 49 sq.], and defines thematization as the "transformation of an operation into an element of a higher operative field, [for] example the topology of topological transformations (essential, in general, in group theory)" [Cavaillès 1981, 177]. And it is in a note that Cavaillès seems to credit Husserl with the concept of thematization: "The notion of type, introduced by Russell to escape the paradox he had discovered in set theory, corresponds to what Husserl [Formal and Transcendental Logic] calls the thematizing faculty of mathematics: any property of an object can in turn become an object and possess properties" [Cavaillès 1981, 105 n. 1].

5. According to the abstract definition, an object A × B together with a pair of morphisms π1 : A × B → A and π2 : A × B → B is called the product of A and B if, for every object C and every pair of morphisms f1 : C → A and f2 : C → B, there exists exactly one morphism f : C → A × B such that f1 = π1 ∘ f and f2 = π2 ∘ f, in other words such that the diagram formed by f, f1, f2, π1 and π2 commutes.

6. We shall favor the first term, which is more intuitive; strictly speaking, however, it is preferable to use the term "sum" only in the setting of distributive categories (cf. infra, p. 171).


7. According to the abstract definition, an object A + B together with a pair of morphisms i1 : A → A + B and i2 : B → A + B is called the coproduct of A and B if, for every object C and every pair of morphisms f1 : A → C and f2 : B → C, there exists exactly one morphism f : A + B → C such that f ∘ i1 = f1 and f ∘ i2 = f2, in other words such that the diagram formed by f, f1, f2, i1 and i2 commutes.

8. In particular as regards the order of the quantifiers [Mac Lane 1997].

9. Consider the free product of two groups (A ; +) and (B ; +). Its elements are of the form a1 + b1 + a2 + b2 + … + an + bn. If the group is commutative, all the ai can be gathered on the left and all the bi on the right, giving a1 + a2 + … + an + b1 + b2 + … + bn, which simplifies to a certain element ak + bk, corresponding to a certain pair (ak ; bk); the group of elements generated in this way by the free product is then isomorphic to the direct product (A × B ; +). In the category of abelian groups, sum and product are therefore isomorphic, which is not the case in general in the category of groups.

10. The concept of bicategory was first defined by Mac Lane [Mac Lane 1950].

11. There Hegel rescues the Neoplatonists from the charge of eclecticism: these philosophers did not juxtapose opposed theories as one patches a garment, but gave them a concrete unification.

RÉSUMÉS

Les concepts de paradigme et de thématisation, par lesquels Jean Cavaillès décrit dans l’ouvrage posthume Sur la Logique et la théorie de la science la dynamique de l’activité mathématique, trouvent dans la théorie des catégories à la fois une illustration et une formalisation, et dans la dialectique hégélienne un précédent. Dans un premier temps, nous examinerons cette hypothèse, non sans définir le concept de thématisation et les quelques notions élémentaires de théorie des catégories qui nous serviront par la suite. Ensuite, en supposant hégélienne la notion de « dialectique » employée par Cavaillès, nous vérifierons si le concept de « thématisation » peut prétendre décrire le processus dialectique de la logique hégélienne, interprétation dont nous trouverons l’esquisse chez William Lawvere. Enfin, nous mettrons à l’épreuve cette interprétation en analysant la dialectique hégélienne de l’être et du néant dans la Science de la logique et les Leçons sur l’histoire de la philosophie. Nous verrons ainsi comment la dialectique hégélienne peut tout à la fois décrire la pratique mathématique et s’y voir représenter.


The concepts of paradigm and thematization, through which Jean Cavaillès describes the dynamics of mathematical activity in the posthumous book Sur la Logique et la théorie de la science, find in category theory both an illustration and a formalization, and in Hegelian dialectic a precedent. We first examine this hypothesis, defining along the way the concept of thematization and the few elementary notions of category theory that are needed later. Then, assuming that the notion of "dialectic" used by Cavaillès is Hegelian, we check whether the concept of "thematization" can claim to describe the dialectical process of Hegelian logic, an interpretation sketched in William Lawvere's writings. Finally, we put this interpretation to the test by analyzing the Hegelian dialectic of being and nothingness in the Science of Logic and the Lectures on the History of Philosophy. We thus see how Hegelian dialectic can at once describe mathematical practice and be represented within it.

AUTHOR

BAPTISTE MÉLÈS Université de Clermont-Ferrand II (France)


Physics in Constructive Mathematics Research

Vincent Ardourel

Introduction

1 The relations between mathematics and physics have been, and continue to be, the subject of numerous studies in the philosophy of science. The most discussed question is that of the applicability of mathematics to physics: 'why and how does mathematics apply to nature?' and 'how does mathematics contribute to the development of physics?' [Steiner 1998], [Morrison 2000], [Wilson 2000], [Batterman 2002]. The converse question, whether and how physics contributes to the development of mathematics, has by contrast received much less discussion.

2 Some authors have nevertheless addressed this question. As early as 1905, H. Poincaré stressed the interactions between mathematical physics and pure mathematics, and showed in particular how the former contributes to the development of the latter¹ [Poincaré 1905, 152]. More recently, P. Maddy has highlighted the influence of applied mathematics in the history of pure mathematics [Maddy 2008]. A. Urquhart is likewise interested in the interactions between physics and mathematics, and focuses on the "inverse problem" [Urquhart 2008a, 413] to the one traditionally discussed: the application of physics to mathematics. He shows how mathematics develops by "assimilating" [Urquhart 2008a, 413] the ideas of physicists and how the methodology of physics is exported into mathematics [Urquhart 2008a], [Urquhart 2008b]. In all these discussions, however, the mathematics at issue is classical mathematics, which is founded on classical logic, is commonly used by the scientific community and is taught at university. There are other kinds of mathematics, founded on different logics, which, like classical mathematics, are also the subject of research within the scientific community. The place of physics in this kind of mathematical research has, by contrast, not really been discussed. P. Fletcher has indeed addressed the question of a constructivist perspective on physics, but without discussing the influence of physics on research in constructive mathematics [Fletcher 2002]. J. R. Brown, on the other hand, has stressed how physics interacts with constructive mathematics by allowing the philosophical debate between constructivism and classicism to be carried onto 'practical ground' [Brown 2003].

3 In this article, I propose to analyze the influence of physics on research in constructive mathematics. More precisely, I aim to show that one practice of research in constructive mathematics consists in reformulating physical theories constructively. I discuss three aspects of this practice in particular. In Section 2, I show that it illustrates a rather rare situation in which mathematical research is motivated by considerations from the philosophy of mathematics: we shall see that these considerations exert a real influence on the establishment of technical results in constructive mathematics. In Section 3, I then show that this research practice amounts to a methodology that I call that of "logicians" [Poincaré 1905, 12], following Poincaré's terminology. Finally, in Section 4, I show that in this practice physical theories play a fruitful heuristic role in the development of constructive mathematics, in the sense that they reveal new problems and new areas of research in constructive mathematics. Before that, I begin in Section 1 by presenting the specific features of research in constructive mathematics.

1 Research in constructive mathematics

1.1 The search for constructive proofs

4 Within the community of mathematicians, constructivists form a minority of researchers² with very particular requirements: they regard constructive proofs as the only legitimate proofs. For constructivists, a mathematical object exists if it can in principle be constructed by means of a procedure. This requirement gives rise to several varieties of constructivism. One may, for example, work within the theory of recursive functions [Kushner 1973] or within the theory of Type Two Effectivity [Weihrauch 2000]³. In this article, I restrict myself to the constructive mathematics inherited from the work of [Bishop 1967] and to the interpretation that Bridges and Richman give of it⁴. To the question of what constructive mathematics is, D. Bridges replies:

According to the pioneers of the subject, such as Brouwer, Markov and Bishop, it is the mathematics that requires interpreting 'what exists' as 'what is computable' (in a sense that depends on the perspective of each of them). In adopting this interpretation, one is forced, almost inevitably, to resort to a logic different from traditional classical logic. [...] In our view, it is clear that in practice constructive mathematics is nothing other than mathematics using intuitionistic logic. [Bridges 1999, 439]

5 Constructivists such as Bridges and Richman follow Bishop in interpreting the requirement that mathematical proofs be constructive as a commitment to intuitionistic logic. This logic differs from classical logic notably by the absence of the principle of excluded middle. In intuitionistic logic, one cannot infer the truth of a proposition from the falsity of its negation, and the double negation of a proposition is not equivalent to its affirmation.
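The contrast between the two logics can be made concrete in a proof assistant. The following Lean 4 sketch is an illustration added for the reader, not something drawn from Bishop, Bridges or Richman, and the theorem names are hypothetical. The first two statements are proved without any classical axiom; the third, double-negation elimination, is obtained here only by appealing to `Classical.byContradiction` from Lean's core library.

```lean
-- Constructively valid: a proposition implies its double negation.
theorem dn_intro (p : Prop) (hp : p) : ¬¬p :=
  fun hnp => hnp hp

-- Constructively valid: excluded middle cannot be refuted,
-- even though it is not itself derivable intuitionistically.
theorem not_not_em (p : Prop) : ¬¬(p ∨ ¬p) :=
  fun h => h (Or.inr (fun hp => h (Or.inl hp)))

-- Classical only: passing from ¬¬p back to p.
-- This proof explicitly invokes a classical principle.
theorem dn_elim (p : Prop) (h : ¬¬p) : p :=
  Classical.byContradiction h
```

Only `dn_elim` depends on a classical principle; the other two proofs go through in intuitionistic logic as they stand.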

6 Research in constructive mathematics generally consists in reformulating the results of classical mathematics constructively: it seeks constructive proofs of theorems of classical mathematics in fields as varied as analysis, algebra and topology. At first sight, this kind of activity may suggest that research in constructive mathematics does not contribute to the development of mathematics (classical or constructive). Indeed, the theorems that constructivists seek to prove are generally not new theorems, in the sense that they have already been proved within classical mathematics.

7 Such a conception of the development of mathematics is untenable. First, the non-constructive theorems of classical mathematics are not theorems for constructivists. To produce a constructive proof of a statement already proved in classical mathematics is to prove a new theorem of constructive mathematics. The novelty of a theorem is relative to the kind of mathematics in question. Thus, insofar as it produces new constructive theorems, research in constructive mathematics contributes to the development of constructive mathematics itself.

8 Second, constructivists sometimes prove theorems that have never yet been proved within classical mathematics. Consider, for example, the following statement: "There exists an algorithm which, applied to a holomorphic function f on the domain D(0,1), together with two complex values omitted from the domain of f, computes the order v of the pole of f at 0" [Bridges s. d., #4]. This theorem was proved constructively in 1982 [Bridges 1982]. Now, there is currently no proof of this statement other than the constructive one⁵. There are classical proofs of statements that resemble it, that is, statements whose content is close to it, namely Picard's theorem [Picard 1879], [Conway 1978, chap. 12]. But there is no proof of this precise statement other than the constructive proof. This example illustrates a surprising aspect of research in constructive mathematics: it can contribute to the development of classical mathematics, since a theorem proved within constructive mathematics can always subsequently become a theorem of classical mathematics⁶.

1.2 Research recently directed toward physics

9 Until the 1980s, research in constructive mathematics was concerned exclusively with pure mathematics: mathematics applied to physics did not seem to be among the research topics of constructivists. In Bishop's reference book, Foundations of Constructive Analysis, no mention is made of possible applications of constructive mathematics to physics [Bishop 1967]. In his words:

This book is a contribution to constructivist propaganda, designed to show that a satisfactory alternative [to classical mathematics] exists. To this end, we develop a large part of pure analysis⁷ constructively. [Bishop 1967, ix]


10 Yet at the same period there already existed a great deal of research in classical mathematics applied to physics⁸. This difference in the treatment of applications is explained in particular by the fact that, unlike classical mathematicians, constructivists are primarily interested in the foundations of mathematics, and more precisely in the kinds of proof that it is legitimate to accept in mathematics. It is therefore understandable that the applicability of constructive mathematics to physics was not a research priority.

11 One of the first contributions by constructivists to mathematics applied to physics is due to Bridges. In Towards a constructive formulation for quantum mechanics, Bridges lays the foundations of a constructive reformulation of quantum mechanics [Bridges 1981]. The physical notions of states, events and operators that occur in quantum mechanics are defined within constructive mathematics⁹. For example, four axioms (numbered ES1 to ES4) are stated about two sets, E and S,¹⁰ in order to define the notions of events and states of a physical system:

An event structure is a pair (E, S) satisfying [axioms] ES1 - ES4. The elements of E, or events, are interpreted as the outcomes of physical experiments and can be identified with questions whose answers are 'yes' or 'no'. (For example, 'does an electron pass through a certain slit of the experimental apparatus?') The interpretation of 'a ≤ b' is as follows: if event a occurs, then event b occurs as well; a′ is interpreted as the event that occurs precisely when a does not occur. The elements of S, or states, for their part represent the states of a physical system in the following sense: s(a) is the probability that event a occurs when the system is prepared in the state represented by the element s of S. [Bridges 1981, 262]

12 Bridges states axioms ES1 to ES4¹¹ in such a way that the theorems derivable from them can be proved constructively. His work thus consists in developing a constructivist mathematical framework in which one can define mathematical objects interpretable as physical notions of quantum mechanics¹². With this article, Bridges launched the project of a constructive reformulation of physics, a project that has been pursued to this day [Richman & Bridges 1999], [Billinge 1997], [Ye 1999], [Ye 2000], [Ye 2011], [Spitters 2002a], [Spitters 2002b], [Spitters 2006].

13 Let us note at once that this enterprise of reformulating physical theories cannot exist in classical mathematics, since physical theories are already formulated in that kind of mathematics. It is found, however, in another kind of mathematics whose standards of proof also differ from those of classical mathematics: predicative mathematics. This is a mathematics that rejects proofs involving a certain type of definition (those described, precisely, as impredicative¹³). Feferman, having developed this kind of mathematics, considers that the predicative approach "suffices for the formalization of almost all, if not all, of the mathematics applied to the sciences¹⁴" [Feferman 1998b, 285], [Feferman 1998a].


2 A practice motivated by philosophy

14 The constructive reformulation of physical theories is a practice of mathematical research that has the distinctive feature of being motivated by philosophical considerations. Even though similar cases seem to exist in the history of mathematics¹⁵, the situation is rare enough to deserve our attention.

15 In Science and Constructive Mathematics, Brown shows that physics constitutes an "arena of confrontation" or "battlefield" [Brown 2003, 48] between constructivism and classicism in the philosophy of mathematics. He draws on the debate that took place from 1993 to 2000 between a group of constructivists on one side and their critic Hellman on the other:

Recently, Geoffrey Hellman has argued that constructive mathematics fails in a number of cases. For example, there is no constructive proof of Gleason's theorem, although it is crucial for certain results in quantum mechanics. In general relativity, the singularity theorems of Hawking and Penrose do not seem to have constructive counterparts, and likewise for the unbounded operators of quantum mechanics¹⁶. Hellman sees this as a fatal blow. However, constructivists (Bridges, Richman, Billinge) have replied that there is a constructive version of Gleason's theorem sufficient for all the needs of quantum mechanics. According to the constructivists, the same should hold for the other examples. [Brown 2003, 48]

16 This debate between Hellman and a group of constructivists, as summarized by Brown, seeks to settle a question in the philosophy of mathematics (is a non-constructive proof legitimate in mathematics?) on the basis of considerations about the applicability of mathematics (which mathematics applies to nature?). This strategy rests on accepting an indispensabilist presupposition, namely that the viability of a kind of mathematics depends on its indispensability to the sciences¹⁷. This approach is fully embraced by G. Hellman, who holds that:

As Weyl maintained, 'it is a function of mathematics to be at the service of the natural sciences' [Weyl 1949, 61]. Fulfilling this function is indeed an indispensable objective for any substitute for classical mathematics. An alternative that failed on this point is certainly not a viable substitute. [Hellman 1998, 426]¹⁸

17 Once the capacity of a kind of mathematics to support a formulation of physical theories is taken as the criterion for assessing it, one understands what is at stake for Hellman in exposing the flaws of a constructive reformulation of physical theories, which is indeed what he sought to do in [Hellman 1993a], [Hellman 1993b] and [Hellman 1998].

18 This indispensabilist presupposition is adopted not only by the critics of constructivism¹⁹ but also by some constructivists themselves²⁰. F. Ye, who develops in [Ye 1999], [Ye 2000], [Ye 2011] a constructive formulation of unbounded operators, precisely the operators whose non-constructivity G. Hellman judges problematic, introduces his work as follows:

We regard this article as a contribution to the general project of seeking an answer to the question: Is classical mathematics logically indispensable for formulating contemporary physical theories […]? [Ye 2000, 358]


19 Similarly, H. Billinge, who was working on a constructive proof of Gleason's theorem, wrote two years before one was found:

The problem we face is the following: if we cannot prove (in one way or another) some form of Gleason's theorem by means of constructive mathematics, it would seem that we lose a very important theorem of quantum mechanics. It would seem that classical mathematics proves indispensable²¹ to contemporary science. [Billinge 1997, 662]

20 Finally, Spitters, whose thesis develops the constructive theory of integration and functional analysis, introduces his work as follows:

[The question of] whether constructive mathematics is sufficient²² for developing mathematical physics, and quantum mechanics in particular, […] has raised several precise mathematical problems. Some of these problems have been solved²³ […] Other problems will be solved in this thesis²⁴. [Spitters 2002a, 9]

21 Given the presupposition shared by some constructivists and their critics about what is at stake in a possible constructive reformulation of physics, one understands why the actual reformulation of physical theories is a genuine research project in constructive mathematics. Indeed, the best way to show that constructive mathematics can suffice for the formulation of physics is precisely to produce such a constructive reformulation of physical theories.

22 Having shown how the constructive reformulation of physical theories is a practice motivated by philosophical considerations, I now propose to identify the type of methodology of mathematical research to which this practice corresponds.

3 The constructive reformulation of physical theories: a methodology of "logicians"

23 In La Valeur de la science, Poincaré distinguishes two types of mathematicians: the logicians and the intuitive [Poincaré 1905, chapter 1, § 1]²⁵. The former seek above all to prove mathematical statements, even when these seem obvious and known to all²⁶. The latter seek to put forward new statements, conjectures never formulated before, even if the proof of their conjecture may lack rigor²⁷. These are two very different methodologies of mathematical research. One treats the search for rigorous proofs as the essential activity of mathematicians, the other the search for new statements.

24 In light of this distinction, how should we characterize the role played by physics in mathematical research? In this section, I show that physics enters research in classical mathematics through both an "intuitive"²⁸ methodology and a "logician's" methodology. By contrast, it seems to enter research in constructive mathematics only through a "logician's" methodology.


3.1 Physics and the intuitive methodology in mathematics

25 Poincaré illustrates how physics enters the intuitive methodology of (classical) mathematicians with the example of a result in complex analysis²⁹ that the geometer Klein established by means of a physical analogy:

[Mr. Klein] is studying one of the most abstract questions in the theory of functions: the question is whether, on a given Riemann surface, there always exists a function admitting given singularities. What does the famous German geometer do? He replaces his Riemann surface by a metallic surface whose electrical conductivity varies according to certain laws. He connects two of its points to the two poles of a battery. The current, he says, must pass, and the way in which this current is distributed over the surface will define a function whose singularities are precisely those called for by the statement. No doubt Mr. Klein knows well that he has given here only a sketch: nevertheless he did not hesitate to publish it; and he probably believed he found in it, if not a rigorous demonstration, at least some sort of moral certainty. A logician would have rejected such a conception with horror, or rather he would not have had to reject it, for in his mind it could never have arisen. [Poincaré 1905, 13]

26 What role did physics play in mathematical research in this example? According to Poincaré, Klein used electromagnetic theory to interpret a mathematical problem as a physical problem [Klein 1882, 167], [Klein 1895], [Chorlay 2007, 135]. This interpretation provided a "physical image" [Poincaré 1905, 153] of the problem in complex analysis, which made it possible to formulate a conjecture, not rigorously proved, about the behavior of a complex function. Thanks to the complex formulation (in the sense of complex analysis) of electromagnetic theory, Klein was able to draw an analogy between a physical thought experiment and a mathematical problem, an analogy that allowed him to state, "if not a rigorous demonstration, at least some sort of moral certainty" [Poincaré 1905, 153], that is, a mathematical conjecture.

27 This "intuitive" methodology of research in classical mathematics seems to reappear in more recent discussions. Urquhart shows how physics supplies computational methods that are used by some mathematicians even though they lack rigorous mathematical foundations [Urquhart 2008b]³⁰. This is the case, for example, of the replica method, introduced by the physicist Mézard to solve problems in statistical physics [Mézard, Parisi & Virasoro 1987]. In physics, this method makes it possible to compute the partition function of a statistical system³¹. Strictly speaking, however, the method is mathematically unjustified [Urquhart 2008b, 432]. Nevertheless, this method, which 'works' in physics in the sense that it yields empirically verified predictions, is exported into mathematics. It is used in solving problems of combinatorial optimization and graph theory [Urquhart 2008b, 435]. This 'export' into mathematics is highly controversial, which in my view keeps the distinction between the intuitive method and the logician's method relevant today³².

28 The “intuitive” methodology seems to me to be specific to classical mathematics. This is presumably because physicists work within classical mathematics and not within constructive mathematics. Consequently, the physical analogies and methods from which mathematicians can profit in their research apply only to classical mathematics. Nothing rules out in principle an “intuitive” methodology that draws on physics within constructive mathematics, but it has to be acknowledged that, to date, no such methodology is in use. The recourse to physics in constructive mathematical research does not belong to a kind of research that ‘guesses’ or ‘puts forward’ conjectures but, on the contrary, to one that proves with full rigour.

3.2 Physics and the logician’s methodology in mathematics

29 Constructivist mathematicians take an interest in physics in order to give the formulation of physical theories a foundation that is rigorous by their own standards. When Bridges examines the formulation of quantum mechanics [Bridges 1981], [Bridges & Svozil 2000] and Ye that of the theory of relativity [Ye 2011], they seek to define rigorously the notions used in physics (event, state, geodesic, curvature, tensor, etc.) which, in their eyes, have so far received only an illegitimate mathematical definition. Their work consists precisely in proposing a rigorous reformulation of these notions.

30 This search for rigorous mathematical foundations for physical notions is a methodology also found in classical mathematics. It is precisely what L. Schwartz accomplished for the notion of the Dirac δ function [Urquhart 2008b]. In physics, this function represents “the mass density function of a point particle of unit mass located at the origin” [Urquhart 2008b, 429]. According to Dirac, it can be used “in practice for any calculation in quantum mechanics without giving incorrect results” [Maddy 2008, 38]. Strictly speaking, however, it is mathematical nonsense: the function is zero everywhere except at zero, yet its integral over the whole space equals 1. This is contradictory, since a function that vanishes almost everywhere must also have a zero integral over the whole space.

31 After learning of this function, which led to “formulas so crazy from a mathematical point of view” [Schwartz 1997, 218], [Urquhart 2008b, 429], Schwartz worked to establish a rigorous foundation and succeeded by developing the theory of distributions [Zemanian 1965].
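The tension described here, and the way distribution theory dissolves it, can be sketched in standard textbook notation. This is only an illustration: the symbols f and φ and the choice of the real line as the underlying space are assumptions introduced here, not taken from the sources cited in the text.

```latex
% Requires amsmath for \text.
% Naive requirements placed on the Dirac function:
\[
  \delta(x) = 0 \quad \text{for all } x \neq 0,
  \qquad
  \int_{-\infty}^{+\infty} \delta(x)\,dx = 1 .
\]
% For an ordinary (Lebesgue-integrable) function f, vanishing almost
% everywhere forces a zero integral, so no such function meets both requirements:
\[
  f = 0 \ \text{almost everywhere}
  \;\Longrightarrow\;
  \int_{-\infty}^{+\infty} f(x)\,dx = 0 .
\]
% Distributional reading: \delta is not a function but a linear
% functional acting on test functions \varphi (hypothetical symbol):
\[
  \langle \delta, \varphi \rangle = \varphi(0)
  \qquad \text{for every smooth, compactly supported } \varphi .
\]
```

On this reading the expression under the integral sign is only a notation for the pairing of δ with a test function, which is why the contradiction no longer arises.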

32 In my view, Schwartz’s work on the Dirac function is analogous to the constructivists’ work on the notions of state and Hermitian operator in quantum mechanics. In both cases, the aim is a rigorous mathematical foundation for notions that occur in physics: the mass density of a point particle in one case, states and Hermitian operators in the other. Although these notions of quantum mechanics are rigorously formulated within classical mathematics, constructivists regard them as mathematically unfounded. The only difference between the two situations therefore lies in what is accepted as a rigorous foundation: for Schwartz, mathematics based on classical logic; for the constructivists, mathematics based on intuitionistic logic.

33 Having identified the type of methodology to which the constructive reformulation of physical theories corresponds, I now examine the role it plays in the development of constructive mathematics.

4 Physics in constructive mathematics: a heuristic role

34 Constructively reformulating physical theories is a fruitful practice in constructive mathematical research for at least two reasons. First, it stimulates the search for constructive proofs of mathematical statements and for constructive formulations of physical notions. Second, it steers research in constructive mathematics towards new domains. In both cases, physical theories perform a heuristic function, in the sense that they point to new research topics in constructive mathematics.

4.1 Stimulating the search for a constructive proof of an individual mathematical statement

35 The search for a constructive reformulation of a physical theory sometimes focuses on one particular statement. This is notably the case when that statement is judged to have an especially important status in the physical theory. The search for a proof of this statement then serves as a test for the theory as a whole. In this case, the practice can be called fruitful if it leads to a constructive proof of the statement in question, that is, to a new theorem of constructive mathematics.

36 Gleason’s theorem is one example33. It is an important theorem of quantum mechanics: it connects the Born rule for measurement with statistical operators and projection operators. It states that the probability of a measurement outcome can be written as the trace of the product of the statistical operator and the projection operator34. Gleason proved the theorem in 1957 within classical mathematics [Gleason 1957]. Until 1999, however, the statement was not a theorem of constructive mathematics. Richman and Bridges were the first to give a constructive proof of it, not without difficulty [Richman & Bridges 1999]35.
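As a rough illustration, the statement of the theorem can be written in the usual textbook form. The symbols μ, ρ, P and ℋ are assumptions introduced here for the sketch; the article itself gives the statement only in words.

```latex
% Requires amsmath for \text.
% \mathcal{H}: separable Hilbert space with \dim \mathcal{H} \geq 3
% \mu: probability measure on the projections of \mathcal{H}
% \rho: statistical (density) operator, positive with \mathrm{Tr}(\rho) = 1
\[
  \mu(P) = \mathrm{Tr}(\rho P)
  \qquad \text{for every projection } P \text{ on } \mathcal{H}.
\]
```

Read from left to right, the formula says that any assignment of probabilities to measurement outcomes satisfying the measure axioms is given by the trace rule for some statistical operator ρ.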

37 When constructivists concentrate on finding a constructive proof of an individual statement that occurs in a physical theory, the physical content of that theory no longer seems to be involved; its role appears weak, if not non-existent. The physical theory then contributes to the development of constructive mathematics by singling out a mathematical statement which, however important for the theory, remains above all a mathematical statement36.

4.2 Stimulating the search for a constructive formulation of physical notions

38 The search for a constructive proof of an individual statement is often part of the search for a constructive reformulation of a physical theory as a whole and of the physical notions involved37. In this case, the physical content of the theory seems to play a more important role than in the previous one. Moreover, the outcome of such a practice is no longer a single new theorem (however important) but a set of results linked to one another within a single structure.

39 The constructive reformulation of quantum mechanics is one example [Bridges 1981]. First, the task here really is to state various axioms, definitions and properties in order to define physical notions. The notions of state and event are defined, as are those of operator and measurement outcome. These are essential to the formulation of quantum mechanics, since its physical quantities (such as energy, spin, particle number, etc.) are defined as operators, and measurement outcomes as the possible states of the physical system after these quantities have been measured. The possibility of interpreting physically the notions defined mathematically in a constructive framework seems decisive for the viability of Bridges’s project. Physical properties of events, for example “if event a occurs, then event b also occurs” [Bridges 1981, 262], must be expressible mathematically in a constructive framework, which is achieved by the relation “a ≤ b” [Bridges 1981, 262].

40 Second, the mathematical results in [Bridges 1981] consist of a series of interlinked constructive lemmas and results concerning, among other things, the projection of states, probability between events and relations between events. For example, axioms ES1-ES3 make it possible to prove the following lemma for two events a and b: “if s ∈ S, s(a) > s(b) and a ∧ b′ exists, then s(a ∧ b′) > 0”38 (where s(a) [respectively s(b)] is the probability that event a [respectively b] occurs if the physical system is prepared in state s). From this result, a further lemma can then be proved, for example: “a ∧ b′ = 0 ⇔ a ≤ b”39. The contribution to constructive mathematics of a constructive formulation of the physical notions of quantum mechanics is thus not limited to establishing one new theorem; it extends to a set of interdependent constructive lemmas and mathematical results40.

4.3 Steering research towards new domains

41 The constructive reformulation of a physical theory also steers research in constructive mathematics towards new domains. This is a second aspect of the heuristic role that physics plays in constructive mathematical research.

42 One example is the search for a constructive foundation for unbounded linear operators. Their theory extends41 the theory of bounded operators on Hilbert spaces already available in [Bishop & Bridges 1985]. The search for a constructive foundation for a theory more general than the existing theory of bounded operators is essentially due to the search for a constructive reformulation of quantum mechanics. Indeed, these operators are used extensively in quantum mechanics, notably to account for fundamental notions such as the position or momentum of particles. A constructive foundation for these operators seems unavoidable for anyone who claims a complete constructive reformulation of quantum mechanics42. What is at stake in this work is the exploration of a new domain of constructive mathematical research, one that has already yielded many results.

43 The work of Ye [Ye 1999], [Ye 2000], [Ye 2011] and Spitters [Spitters 2002a], [Spitters 2002b] has led in particular to the constructive formulation of a special case of these operators: unbounded self-adjoint operators. A spectral decomposition theorem, Stone’s theorem, and other properties of these operators have been established in a constructive framework [Ye 1999], [Ye 2000], [Ye 2011], [Spitters 2002a], [Spitters 2002b], even though “a good theory for unbounded operators [in general] is still to be established” [Spitters 2002a, 55]. It is not surprising, moreover, that the search for a constructive formulation of the theory of unbounded operators began with unbounded self-adjoint operators, since these are precisely the ones that mainly occur in quantum mechanics43.

44 Similarly, the search for a constructive formulation (within a framework known as strict finitism44) of the general theory of relativity, recently undertaken in [Ye 2011], tackles a domain never before explored in constructive mathematics: pseudo-Riemannian geometry. As Ye himself stresses:

“Chapter 8 [of [Ye 2011]], which deals with pseudo-Riemannian geometry, is a recent addition to strict finitism. It is not based on any existing constructive theory45.” [Ye 2011, vii]

45 In this example too, the search for a constructive reformulation of a physical theory is decisive for the new research direction. It is indeed in order to reformulate the general theory of relativity that Ye turns to this new domain of constructive mathematical research:

“This chapter develops the fundamental principles of differentiable manifolds and pseudo-Riemannian geometry for46 its applications to general relativity.” [Ye 2011, 217]

46 This constructive reformulation of the general theory of relativity has given a constructive foundation to the notions of tensor, metric, covariant derivative and geodesic, thereby making it possible to “cover the basics of the mathematics applied […] to general relativity” [Ye 2011, vii].

47 More generally, physics plays a heuristic role not only in constructive mathematical research but also in classical mathematics: it reveals new domains of research. As early as 1822, Fourier’s investigations of heat propagation gave rise to a new domain of research in classical mathematics, Fourier analysis [Fourier 1822]. More recently, Maddy recalls that “in contemporary science, for example, the needs of quantum field theory and string theory have led to the study of new areas of set theory” [Maddy 2008, 38]. Once this influence of physics on mathematical research has been observed, the question of its necessity arises. Could the heat equation and pseudo-Riemannian geometry have become objects of mathematical research (classical and constructive respectively) had no physical context been involved? To this counterfactual question, one can only answer that nothing allows us to claim that Fourier analysis could not have been established without the physical research on heat propagation. Likewise, nothing allows us to claim that research on unbounded operators or on pseudo-Riemannian geometry could never have come about had these two notions not occurred in physical theories. However, even if physics is not in principle necessary for the development of new domains of mathematical research, it nonetheless proves, as a matter of fact, to be at the origin of this new research. This article analyses that observation, with the aim of showing that physics plays a heuristic role in the development of mathematics understood in a broader sense than usual: not only as classical mathematics but also as mathematics whose standards of proof differ markedly from those of classical mathematics.

Conclusion

48 The relations between physics and mathematics have already been much discussed, but almost always on the assumption that mathematics means classical mathematics. By contrast, the influence of physics on research in constructive mathematics, which differs from classical mathematics in its logical foundations, has received little attention.

49 Yet since the 1980s, constructivist mathematicians have taken an interest in physical theories and sought to reformulate them constructively. More precisely, they have turned mainly to the constructive reformulation of quantum mechanics and of the general theory of relativity. In this article, I have examined this practice of constructive mathematical research, which consists in reformulating physical theories, and have analysed it under three aspects. First, I showed that this research practice is distinctive in being motivated by philosophical considerations. It illustrates a particular situation in mathematical research in which the philosophy of mathematics influences that research. In this interaction between philosophy and mathematical research, physics serves as a “battlefield” in the philosophical debate between constructivism and classicism in mathematics. Second, I sought to characterize the methodology of mathematical research to which this practice corresponds. In terms of Poincaré’s distinction, it is a methodology of “logicians” rather than of “intuitive minds”, since the research activity in question is directed essentially towards finding proofs rather than conjectures. Third, I showed how, in this practice, physical theories play a heuristic role in the development of constructive mathematics. On the one hand, they stimulate the search for proofs of individual mathematical statements (as in the case of Gleason’s theorem) and also the search for several interdependent constructive mathematical results (such as those involved in defining the physical notion of quantum state). On the other hand, they steer research in constructive mathematics towards new domains (as in the cases of unbounded operators and pseudo-Riemannian geometry).

BIBLIOGRAPHY

BATTERMAN, ROBERT 2002 The Devil in the Details: Asymptotic Reasoning in Explanation, Reduction and Emergence, Oxford: Oxford University Press.

BILLINGE, HELEN 1997 A constructive formulation of Gleason’s theorem, Journal of Philosophical Logic, 26(6), 661-670. 2000 Applied constructive mathematics: on Hellman’s mathematical constructivism in spacetime, The British Journal for the Philosophy of Science, 51(2), 299-318.

BISHOP, ERRETT 1967 Foundations of Constructive Analysis, New York: McGraw-Hill.

BISHOP, ERRETT & BRIDGES, DOUGLAS S. 1985 Constructive Analysis, Grundlehren der mathematischen Wissenschaften, vol. 279, Berlin: Springer.

BRIDGES, DOUGLAS S. 1981 Toward a constructive foundation for quantum mechanics, Lecture Notes in Mathematics, 873 (Constructive Mathematics), 260-273. 1999 Can constructive mathematics be applied in physics?, Journal of Philosophical Logic, 28(5), 439-453. s. d. Constructive Mathematics: Frequently Asked Questions (FAQ). www.math.canterbury.ac.nz/php/groups/cm/faq/.

BRIDGES, DOUGLAS S. & SVOZIL, KARL 2000 Constructive mathematics and quantum physics, International Journal of Theoretical Physics, 39(3), 503-515.

BRIDGES, DOUGLAS S. et al. 1982 Picard’s Theorem, Transactions of the American Mathematical Society, 269(2), 513-520.

BROWN, JAMES R. 2003 Science and constructive mathematics, Analysis, 63(1), 48-51.

CHORLAY, RENAUD 2007 L’émergence du couple local/global dans les théories géométriques, de Bernhard Riemann à la théorie des faisceaux (1851-1953), Doctoral thesis in the history of mathematics, Université Paris 7.

COLYVAN, MARK 2001 The Indispensability of Mathematics, New York: Oxford University Press.

CONWAY, JOHN B. 1978 Functions of One Complex Variable I, Graduate Texts in Mathematics, vol. 11, New York: Springer.

FEFERMAN, SOLOMON 1998a In the Light of Logic, Oxford University Press, chap. Weyl vindicated: Das Kontinuum seventy years later, 249-283. 1998b In the Light of Logic, Oxford University Press, chap. Why a little bit goes a long way: Logical foundations of scientifically applicable mathematics?, 284-298.

FIELD, HARTRY H. 1980 Science without Numbers: A Defence of Nominalism, Princeton: Princeton University Press.

FLETCHER, PETER 2002 A constructivist perspective on physics, Philosophia Mathematica, 10(1), 26-42.

FOURIER, JOSEPH 1822 Théorie analytique de la chaleur, Paris: Jacques Gabay, 1988.

FRIEDGUT, EHUD & BOURGAIN, JEAN 1999 Sharp thresholds of graph properties, and the k-sat problem, Journal of the American Mathematical Society, 12(4), 1017-1054.

GLEASON, ANDREW M. 1957 Measures on the closed subspaces of Hilbert space, Journal of Mathematics and Mechanics, 6(6), 885-893.

HEATHCOTE, ADRIAN 1990 Unbounded operators and the incompleteness of quantum mechanics, Philosophy of Science, 57(3), 523-534.

HELLMAN, GEOFFREY 1992 On the scope and force of indispensability arguments, in Proceedings of the Biennial Meeting of the Philosophy of Science Association, 456-464. 1993a Gleason’s theorem is not constructively provable, Journal of Philosophical Logic, 22(2), 193-203. 1993b Constructive mathematics and quantum mechanics: Unbounded operators and the spectral theorem, Journal of Philosophical Logic, 22(3), 221-248. 1998 Mathematical constructivism in spacetime, The British Journal for the Philosophy of Science, 49(3), 425-450.

JAFFE, ARTHUR & QUINN, FRANK 1993 "Theoretical mathematics": Toward a cultural synthesis of mathematics and theoretical physics, Bulletin of the American Mathematical Society, 29(1), 1-13.

JEFFREYS, HAROLD & JEFFREYS, BERTHA S. 1946 Methods of Mathematical Physics, Cambridge Mathematical Library, Cambridge: Cambridge University Press, 2001.

KIRKPATRICK, SCOTT & SELMAN, BART 1994 Critical behaviour in the satisfiability of random boolean expressions, Science, 264(5163), 1297-1301.

KLEIN, FELIX 1882 Ueber Riemann’s Theorie der algebraischen Funktionen und ihrer Integrale, Teubner, English translation by Frances Hardcastle, On Riemann’s Theory of Algebraic Functions and their Integrals, Cambridge: MacMillan and Bowes, 1893. 1895 Riemann and his significance for the development of modern mathematics, Bulletin of the American Mathematical Society, 1(7), 165-180.

KUSHNER, BORIS A 1973 Lektsii po konstructivnomu matematicheskomu analizu, Moscow: Nauka, English translation by E. Mendelson, Lectures on constructive mathematical analysis, Providence: American Mathematical Society, 1984.

MADDY, PENELOPE 2008 How applied mathematics became pure, The Review of Symbolic Logic, 1(1), 16-41.

MÉZARD, MARC, PARISI, GIORGIO & VIRASORO, MIGUEL ANGEL 1987 Spin Glass Theory and Beyond, Singapore: World Scientific.

MORRISON, MARGARET 2000 Unifying Scientific Theories: Physical Concepts and Mathematical Structures, Cambridge: Cambridge University Press.

PICARD, ÉMILE 1879 Sur les fonctions analytiques uniformes dans le voisinage d’un point singulier essentiel, in Comptes rendus de l’Académie des sciences.

POINCARÉ, HENRI 1905 La Valeur de la science, Bibliothèque de Philosophie Scientifique, Paris: Flammarion, 1939.

PUTNAM, HILARY 1971 Philosophy of Logic, New York: Harper and Row, French translation by P. Pecatte, Philosophie de la logique, Combas: l’Éclat, 1996.

QUINE, WILLARD VAN ORMAN 1953 From a logical point of view, New York: Harper Torchbooks, chap. On what there is, 1-19, 1963.

RICHMAN, FRED & BRIDGES, DOUGLAS S. 1999 A constructive proof of Gleason’s theorem, Journal of Functional Analysis, 162, 287-312.

ROBINSON, ABRAHAM 1966 Non-standard analysis, Amsterdam: North-Holland Publishing Company.

ROUILHAN, PHILIPPE DE 2005 Prédicativisme, in Encyclopaedia Universalis, Dictionnaire des Idées, Encyclopaedia Universalis SA, 638-640.

SCHWARTZ, LAURENT 1997 Un mathématicien aux prises avec le siècle, Paris: Odile Jacob, English translation by L. Schneps, A Mathematician Grappling with his Century, New York: Birkhäuser, 2001.

SMITH, PETER 2003 Constructivism exploded?, Analysis, 63(279), 263-266.

SPITTERS, BAS 2002a Constructive and intuitionistic integration theory and functional analysis, Doctoral thesis, University of Nijmegen. 2002b Located operators, Mathematical Logic Quarterly, 48(1), 107-122. 2006 A constructive view on ergodic theorems, Journal of Symbolic Logic, 71(2), 611-623.

STEINER, MARK 1998 The Applicability of Mathematics as a Philosophical Problem, Harvard: Harvard University Press.

TALAGRAND, MICHEL 2003 Spin Glasses: A Challenge for Mathematicians, Berlin: Springer. 2006 The Parisi formula, Annals of Mathematics, 163(1), 221-263.

URQUHART, ALASDAIR 2008a The boundary between mathematics and physics, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press, 407-416. 2008b Mathematics and physics: Strategies of assimilation, in The Philosophy of Mathematical Practice, edited by MANCOSU, PAOLO, Oxford: Oxford University Press, 417-440.

WEIHRAUCH, KLAUS 2000 Computable Analysis, Heidelberg: Springer.

WEYL, HERMANN 1949 Philosophy of Mathematics and Natural Science, Princeton: Princeton University Press.

WILSON, MARK 2000 The unreasonable uncooperativeness of mathematics in the natural sciences, The Monist, 83(2), 296-314.

YE, FENG 1999 Strict Constructivism and the Philosophy of Mathematics, Ph.D. dissertation, Princeton University. 2000 Toward a constructive theory of unbounded linear operators, The Journal of Symbolic Logic, 65(1), 357-370. 2008 A strictly finitistic system for applied mathematics. www.phil.pku.edu.cn/cllc/people/fengye/strictlyFinitisticSystemForAppliedMath.pdf. 2011 Strict Finitism and the Logic of Mathematical Applications, no. 355 in Synthese Library, Dordrecht: Springer.

ZEMANIAN, ARMEN H 1965 Distribution Theory and Transform Analysis, New York: Dover Publications, 1987.

NOTES

1. Mathematical physics, on the one hand, reveals problems of pure analysis (as in the study of heat propagation by [Fourier 1822], which led to the problem of solving a differential equation, the heat equation). On the other hand, it also helps to find ways of solving these problems (as in the case of the mathematician F. Klein, who used a physical analogy to settle a question of complex analysis [Klein 1882], [Poincaré 1905, 13, 153]). I return to this example in part III.

2. The constructivist community nevertheless includes several research teams around the world, notably in the mathematics departments of the universities of Uppsala in Sweden (Erik Palmgren and Viggo Stoltenberg-Hansen), Munich in Germany (Josef Berger, Peter Schuster and Helmut Schwichtenberg), Padua in Italy (Milly Maietti, Giovanni Sambin and Silvio Valentini) and Canterbury in New Zealand (Douglas Bridges and Luminiţa Simona Vîţă), and at the Japan Advanced Institute of Science and Technology (Hajime Ishihara and Takako Nemoto). The European research programme ‘CONSTRUMATH’, coordinated by P. Schuster, brings these research centres together. Other research groups focus more specifically on applying constructive mathematics to computer science, for example the computer science department of Cornell University in the United States (with Robert L. Constable) [Bridges s. d., #6].

3. These two theories provide an example of a constructive approach to mathematics based on classical logic.

4. J’utilise aussi le travail de F. Ye qui s’inscrit dans un type de constructivisme très proche de celui Bridges et Bishop [Ye 2008], [Ye 2011] . Il s’agit d’une approche finitiste stricte (voir note de bas de page no 44). 5. « Je ne connais aucune preuve de cet énoncé autre que la preuve constructive de [Bridges 1982] » [Bridges s. d., #4]. 6. C’est un des avantages que possèdent les mathématiques constructives sur les mathématiques classiques : « Tout théorème prouvé avec la logique intuitionniste peut [ensuite] être modélisé dans le cadre des mathématiques classiques » [Bridges 1999, 440]. 7. Je souligne. 8. Par exemple, la première édition du livre de [Jeffreys & Jeffreys 1946] Methods of Mathematical Physics paraît en 1946. De même, plusieurs revues scientifiques telles que le Journal of Mathematical Physics ou Theoretical and Mathematical Physics ont commencé à paraître respectivement dès 1960 et 1969. 9. « Récemment, beaucoup d’efforts ont été engagés dans la recherche d’un fondement axiomatique de la physique qui conduit d’une manière naturelle au modèle de l’espace de Hilbert standard de la mécanique quantique. Dans cet article, nous décrivons une théorie axiomatique constructive des événements et des états, et nous montrons que les parties appropriées du modèle de l’espace de Hilbert satisfont à nos axiomes » [Bridges 1981, 260]. 10. E pour events (événements) et S pour states (états). 11. Les quatre axiomes ES1 à ES4 sont : — « ES1 : a ≠ b si et seulement si il existe un s dans S tel que s(a) ≠ s(b) »

— « ES2 : si a1, a2 ... sont des éléments orthogonaux de E tels que pour tout s de S la série Σns(an) converge, alors il existe un b dans E tel que pour tout s, s(b) = 1 - Σns(an) » — « ES3 : si a ᐯ b existe, alors pour tout s de S, s(a ᐯ b) ≤ s(a) + s(b) »

— « ES4 : soient a1,a2 ... des éléments orthogonaux de E, et s un élément de S tel que Σ ns(an) converge et reste inférieure à 1. Pour tout ε > 0, il existe un b dans E tel que b ≤ a’n(n ≥ 1) et s(b) +

Σns(an) > 1 - ε » [Bridges 1981, 260]. 12. D. Bridges propose en une nouvelle version d’une formulation constructive de la mécanique quantique, interprétée cette fois dans le contexte physique des réseaux quantiques [Bridges & Svozil 2000]. 13. La définition d’un objet est qualifiée d’imprédicative si celle-ci fait référence à une totalité infinie d’objets à laquelle l’objet en question appartient. Selon une approche prédicative, « un objet ne peut être défini dans les termes d’une multiplicité d’objets parmi lesquels il se trouve » [Rouilhan 2005]. 14. Je souligne. 15. Il semble en effet que ce soit aussi le cas de l’Analyse non standard, théorie mathématique développée à partir des années 1960 par A. Robinson et E. Nelson et dont les recherches semblent s’inscrire autour de la question des infinitésimaux leibniziens : « On montre dans ce livre que les idées de Leibniz peuvent être pleinement soutenues et qu’elles mènent à une approche novatrice et féconde de l’Analyse classique et beaucoup d’autres branches des mathématiques » [Robinson 1966, 2], traduction française de l’extrait par I. Drouet (communication privée). 16. « Un opérateur A borné, défini sur un espace de Hilbert ℋ dans un domaine D(A) £ ℋ est une fonction satisfaisant : ǁAΨǁ ≤ kǁΨǁ, avec Ψ ∈ D(A), k ∈ ℝ+ et où ǁ • ǁ correspond à la norme d’un vecteur définie par le produit scalaire sur ℋ. Un operateur non borné est un opérateur pour lequel il n’existe pas un tel k » [Heathcote 1990, 524]. 17. La notion d’indispensabilité des mathématiques dans les sciences empiriques a fait et continue de faire toujours l’objet de nombreuses discussions en philosophie des mathématiques. Ces discussions [Field 1980], [Colyvan 2001], initiées par [Quine 1953] et [Putnam 1971], cherchent à

Philosophia Scientiæ, 16-1 | 2012 167

settle an ontological question (do mathematical objects exist independently of us?) on the basis of the observation that these objects prove indispensable to the empirical sciences (or, on the contrary, dispensable, depending on the philosophical position defended). The aim is to move the debate between nominalism and Platonism in the philosophy of mathematics onto 'practical' ground. In our case, indispensability arguments are used to debate not an ontological question but a question of logical foundations. Indeed, on the interpretation of Richman and Bridges that we follow, "one can work constructively, that is, using intuitionistic logic, with any object [emphasis mine] of mathematics, and not only with certain particular objects of a 'constructive' type (whatever that may mean)" [Bridges 1999, 440].

18. This presupposition is also found in [Hellman 1993b, 211]: "To what extent can the mathematics actually used in the empirical sciences, particularly in physics, be formulated constructively […]? It is probably no exaggeration to say that the viability of a constructivist philosophy of mathematics is at stake here", and in [Hellman 1992, 456], where Hellman explicitly discusses the following question: "To what extent are classical mathematics and infinitistic (non-constructive) reasoning indispensable to the mathematics applicable to the sciences?"

19. Brown states this presupposition even more categorically: "If classical mathematics is indispensable to the sciences, then the constructive approach is truly ruined" [Brown 2003, 48]. Note that Brown himself sides with Hellman in his article [Brown 2003]. There he in turn proposes a relatively simple physical situation that does not seem to admit a formulation within a constructivist approach. P. Smith replies to this example by showing how a constructivist physicist could nonetheless formulate this situation [Smith 2003].

20. H. Billinge points out, however, that this presupposition does not seem to be accepted by the constructivist M. Dummett: "The logic appropriate to mathematics is determined by the correct theory of meaning in mathematics. [M. Dummett] would simply not recognise the fact (if it is one) that certain results of contemporary scientific theories that can only be formulated with classical mathematics show that constructive mathematics is inadequate in any sense" [Billinge 2000, 313].

21. Emphasis mine.

22. Emphasis mine.

23. "Bridges and Richman have given a constructive proof of Gleason's theorem. Ye has given a constructive formulation of the spectral theorem for unbounded self-adjoint operators on Hilbert spaces" [Spitters 2002a, 9].

24. "More precisely, we shall give a constructive formulation of the ergodic theorem, of the Peter-Weyl theorem on the representation of compact groups, and of the approximation theorem for almost periodic functions, and we shall develop part of the theory of operators constructively. We shall also extend Ye's results on unbounded operators" [Spitters 2002a, 9].

25. "It is impossible to study the works of the great mathematicians, or even those of the lesser ones, without noticing and distinguishing two opposite tendencies, or rather two entirely different kinds of minds. The one sort are above all preoccupied with logic; to read their works, one is tempted to believe that they have advanced only step by step, after the manner of a Vauban who pushes on his siege works against a fortified place, leaving nothing to chance. The other sort let themselves be guided by intuition and at the first stroke make quick but sometimes precarious conquests, like bold cavalrymen of the advance guard. It is not the subject matter they treat that imposes one or the other method on them. Though one often says of the former that they are analysts and calls the others geometers, that does not prevent the one sort from remaining analysts even when they work at geometry, while the others are still geometers even when they occupy themselves with pure analysis. It is the very nature of their mind that makes them logicians or intuitive thinkers [emphasis mine], and they cannot lay it aside when they approach a new subject" [Poincaré 1905, 11].

26. Poincaré takes the following example: "M. Méray wants to prove that a binomial equation always has a root, or, in ordinary words, that an angle may always be subdivided. If there is any truth that we think we know by direct intuition, it is this one. Who will doubt that an angle may always be divided into any number of equal parts? M. Méray does not see it that way; in his eyes this proposition is not at all evident, and to prove it he needs several pages" [Poincaré 1905, 12].

27. The example Poincaré uses to illustrate this methodology is discussed in the next section.

28. Care must be taken here not to confuse the "intuitive" methodology with intuitionistic logic.

29. This is the branch of mathematics that studies complex-valued functions (with values in ℂ).

30. This discussion is part of a debate initiated by an article by Jaffe and Quinn [Jaffe & Quinn 1993] denouncing the lack of mathematical rigour in the computational methods used by physicists, and the use of those methods by some mathematicians, which leads to what they pejoratively call 'theoretical mathematics'. The debate is well summarised in [Urquhart 2008b, § 1].

31. This is an important theoretical quantity in statistical physics, from which many empirically testable predictions can be derived.

32. Indeed, while some mathematicians use this non-rigorous method in their research, others on the contrary condemn it and seek to give it a rigorous foundation. For example, Kirkpatrick and Selman announce a result in combinatorial optimisation obtained using the mathematical methods of statistical physics [Kirkpatrick & Selman 1994]. Five years later, however, Friedgut and Bourgain give a rigorous proof of it using the conventional methods of mathematicians [Friedgut & Bourgain 1999]. Likewise, the Parisi formula of statistical physics, used by some mathematicians, has still not been rigorously proved. Talagrand has nevertheless succeeded in partially proving it by the conventional methods of mathematicians [Talagrand 2006]. The mathematicians Talagrand (who sees in the replica method only "a way of guessing correct formulas" [Talagrand 2003, 195]), Friedgut and Bourgain work according to a methodology that Poincaré would doubtless not hesitate to call that of "logicians".

33. "Let μ be a σ-additive measure on the closed subspaces of a (real or complex) Hilbert space ℋ of dimension ≥ 3. Then there exists a positive self-adjoint trace-class operator W such that, for every closed subspace A, μ(A) = Tr(W P_A), where P_A is the projection operator of ℋ onto A" [Hellman 1993a, 193].

34. "The theorem answers the central question: how can probability be introduced into quantum mechanics within the Hilbert-space formalism?" [Hellman 1992, 461].

35. "Fred Richman, in a tour de force of technique and inventiveness, finally produced a constructive proof of Gleason's original theorem" [Bridges 1999, 450].

36. The philosopher of constructive mathematics Per Martin-Löf holds that Gleason's theorem is not part of physics, since it is not used to carry out calculations in the laboratory [Hellman 1992, 462]. The question of which criteria a mathematical theorem must satisfy in order to be part of a physical theory does not, in my view, admit a simple answer. Here, however, I shall adopt the view that Gleason's theorem is a mathematical theorem that is not part of quantum mechanics, since it enters quantum mechanics only indirectly, in deriving a formula that is itself used for laboratory calculations, namely to calculate the probability of an event.

37. This is not necessarily the case. For example, in [Spitters 2006], Spitters develops a constructive approach to the ergodic theorem even though the search for a constructive formulation of the physical theory of dynamical systems (in which this theorem plays a part) does not seem to be a research topic among constructivists.

38. This is Lemma (1.2) of [Bridges 1981].

39. This is Lemma (1.3) of [Bridges 1981], whose proof uses Lemma (1.2).

40. Note that Bridges's constructive formulation of quantum mechanics is not entirely satisfactory, as he himself acknowledges: "It is clear that a constructive approach to the mathematical foundations of quantum physics reveals serious problems" [Bridges 1981, 272]. Some of these problems, such as finding a constructive proof of Gleason's theorem, which Bridges already identified in 1981 as a serious problem, were subsequently solved [Richman & Bridges 1999].

41. Spitters [Spitters 2002a], [Spitters 2002b], who continues the work of Ye [Ye 1999], [Ye 2000] on unbounded operators, speaks explicitly of extension and generalisation to describe the theory of unbounded operators in relation to the theory of bounded operators. The theory of bounded operators becomes a special case of the theory of unbounded operators: "We generalise [emphasis mine] the notion of operator to include that of unbounded operators. […] The expression 'unbounded operator' is synonymous with 'operator'. In this terminology, a bounded operator is a special case of an unbounded operator" [Spitters 2002a, 56].

42. There is general agreement on this point, among constructivists and their critics alike. For example, according to the constructivist mathematician Spitters, "the unbounded operators f(x) ↦ f′(x) and f(x) ↦ x f(x) on L²(ℝ) play a fundamental role in quantum mechanics" [Spitters 2002a, 55]. This is also Billinge's position: "The non-constructivity of unbounded linear operators would mean that constructivists cannot use the standard mathematical tools to represent many properties of quantum mechanics (for example, position and momentum). We would be obliged to reject the constructive view of quantum mechanics" [Billinge 2000, 303]. And according to Hellman, unbounded operators "are fundamental to understanding the structure of quantum mechanics" [Hellman 1992, 461].

43. The observables of quantum mechanics are indeed self-adjoint operators, that is, operators satisfying the relation A = A†.

44. Ye's strict finitist approach [Ye 2008], [Ye 2011] is a constructive approach very close to the one developed in [Bishop & Bridges 1985]: "Developing mathematics within strict finitism is very close to developing mathematics within Errett Bishop's constructive mathematics. This book [Ye 2011] will follow many ideas from the book Constructive Analysis by E. Bishop and D. Bridges […] My aim is to demonstrate that the main ideas of Bishop and Bridges can be carried out within strict finitism, that is, within elementary recursive mathematics" [Ye 2011, vii]. Strict finitism is an "even more restrictive framework" [Ye 2008, 2] than that of Bishop and Bridges.

45. Emphasis mine.

46. Emphasis mine.

ABSTRACT

I analyse a practice of research in constructive mathematics: the constructive reformulation of physical theories. More precisely, I discuss three features of this practice. First, I show that it is motivated by philosophical considerations, and I discuss how physics is used to adjudicate the debate between constructivism and classicism in the philosophy of mathematics. Second, I characterise the methodology of mathematical research that this practice involves and show that it is, in terminology borrowed from Poincaré, a methodology of "logicians". Finally, I show that within this practice physical theories play a heuristic role in the development of constructive mathematics: they successfully stimulate the search for constructive results and direct research in constructive mathematics towards new domains.

AUTHOR

VINCENT ARDOUREL
IHPST - Université Paris 1 Panthéon-Sorbonne (France)
