
Proving theorems with computers

Kevin Buzzard ∗ May 1, 2020

∗Kevin Buzzard is a professor of mathematics at Imperial College London. His email address is [email protected].

Superhuman

How is breakthrough mathematics achieved? Here is one example, from algebraic number theory. In his 1987 paper “Deforming Galois representations”, Barry Mazur observes that the geometric concept of a smoothly-varying complex family of representations of a group has an arithmetic analogue. He sets up deformation theory in the purely algebraic setting of mod p and p-adic Galois representations, and makes some interesting observations about the relationship between deformation rings and Galois cohomology. By 1990, Mazur and Tilouine have raised a profound question about whether a certain universal deformation ring coming out of this theory is isomorphic to one of Hida’s Hecke algebras. In 1993 Wiles uses new techniques in commutative algebra to reduce a variant of this question to a numerical criterion, and a year later, aided by Taylor, he has pushed the strategy through. The semistable Shimura–Taniyama conjecture (at that time often called the semistable Shimura–Taniyama–Weil conjecture) follows, and hence, by earlier work of Ribet, Fermat’s Last Theorem.

This is but one of very many examples where cross-fertilization has occurred in mathematics. The breadth of Mazur’s mathematical knowledge (he was initially a topologist) played a key role here. In a 2014 article [Maz14] for the Math Intelligencer, Mazur writes: “Reasoning by analogy is the keystone: it is present in much (perhaps all) daily mathematical thought, and is also often the inspiration behind some of the major long-range projects in mathematics”.

One might hence ask the following question: if one human had an understanding of all of modern pure mathematics simultaneously, how much further would they immediately be able to see? How many insights are there which are needed to unblock one field, but which have already been made in another?

This question can of course be dismissed as a thought experiment. Perhaps it is the kind of thing which philosophers might muse over, but it is the year 2020 and pure mathematics is much too big for one human to comprehend.

However, it is the year 2020 and hence we have computer proof systems which are now in theory capable of understanding all of modern pure mathematics. So why is our community not teaching it to them? This is entirely within our grasp, and we have absolutely no idea what will happen when we do.

1 Teaching mathematics to humans

Let us look at the way pure mathematics is learnt by humans, from undergraduate to PhD level.

1.1 The basics

Three key concepts in pure mathematics are the definition, the theorem statement, and the proof. We will talk more about this trichotomy later on, but for now let us focus on the concept of proof. A course introducing the formal notion of proof might cover concepts such as sets, functions and binary relations, and basic theorems about these objects will be carefully proved. For example, there might be a proof
that distinct equivalence classes for an equivalence relation are disjoint. There are many ways that one can attempt to teach this to undergraduates. Recently I have become fond of a method where I bring a set of around 100 plastic shapes coloured red, yellow, green and blue into class. Two shapes are defined to be equivalent if they have the same colour, and it is not hard to convince the students that this is an equivalence relation, and that the red shapes form an equivalence class, the blue shapes form another, and so on. The fact that distinct equivalence classes are disjoint is now obvious. However this is an example, not a proof. The formal proof, which I then go on to show the students, is a series of elementary steps, each of which follows from the rules of logic or the axioms of an equivalence relation.

In proofs such as these, we are operating very close to the “machine” which drives formal mathematics – the machine which tells us that if P is true, and if P implies Q, then Q is true, and other such logical rules. This axiom-based attitude continues to play an important role in subsequent classes such as first courses in group theory, linear algebra, and real analysis. The real numbers might be presented as a complete totally ordered archimedean field, that is, a structure satisfying a list of axioms. Using only these axioms we can build a basic theory of real analysis from first principles. The theories of sequences, limits, infinite sums, continuous functions ℝ → ℝ, differentiation, integration and so on can all be carefully built from the axioms for complete totally ordered archimedean fields. Similarly, a lecturer presents the axioms of a group in a first course on group theory, and from these axioms we can build the theory of subgroups, normal subgroups, group homomorphisms, kernels, images and quotient groups, and prove the first isomorphism theorem for groups. At this stage in the development of mathematics, every proof can be chased right down to the axioms of the system we are considering, and students are expected to learn from such courses that mathematics can be done in this way. Much (but, as we are about to see, not all) of undergraduate pure mathematics is of this form.

1.2 Developing intuition

After a while it becomes inconvenient to do mathematics in a purely axiomatic fashion. For example, proving that if we remove a finite set of points from ℝ² then the resulting topological space is still path connected could of course in theory be done from the axioms, but in practice, rather than attempting to write down the function defining a path between two arbitrary points in the space, one would just draw a picture. The same is also true when proving basic results about contour integrals in complex analysis – there are several “proofs by picture” in a typical development of the theory. The concept of a simple closed curve in the plane having an inside and an outside will often be taken as read, although very few students will have seen a proof of the Jordan curve theorem at this point in their mathematical education. Over time, students learning mathematics begin to understand our unwritten rules of “what is allowed in practice”. One is reminded of the apocryphal story of a student asking their professor whether the fact just presented to the class as “obvious” was indeed obvious, and the professor going into deep thought to emerge 20 minutes later with the reply “yes”. By this point in the development of a student’s education, lecturers are expecting the students to “learn to fly”. Arguments in lectures may take place high above the axioms, with technical details being dismissed as obvious or easy to verify, and left to the reader (perhaps with some hints). This is the beginning of what Terry Tao [Tao09] has called the post-rigorous stage of mathematics. To borrow a phrase from computer science, students begin to learn the intuitive “front end” of mathematics.

1.3 PhD Research

Those students who convince us that they can steer their mathematical arguments correctly are rewarded by being given PhD places. The prize for this “levelling up” is that they are allowed access to the mathematical literature, and from now on they can assume any result they like, as long as it is published in a reasonably prestigious journal and their advisor believes it. A typical PhD thesis in pure mathematics will contain new proofs of results in a given theory. In my personal case, this theory was the theory of p-adic Galois representations attached to modular forms. By the time they graduate, a PhD student will typically know many theorem statements concerning the objects they chose to study, and may well have contributed to this list of theorem statements themselves. Again to borrow a phrase from computer science, the student knows the interface to each object in their area of expertise. The student is allowed to assume any results in the interface, and might well know how to prove some of them – but possibly not all of them. For example, when I was a PhD student, I had not read the details of the proof of the theorem of Deligne which attached a p-adic Galois representation to a modular form, a key result from the interface to the theory of modular forms upon which my entire PhD work was based. In fact, at that time there was only really a sketch proof of the result in the literature. This was not however a problem, because the proof of Deligne’s theorem was “known to the experts”. Students are expected to get their own intuition of their area, and after a while should have a feeling about what is accessible given known results, and what requires genuinely new ideas.

In the rest of this article, I would like to discuss the idea that computers can be taught mathematics in much the same sort of way. I leave it up to the reader to decide whether they would like to be involved, but what I do believe is that these computer systems are now here, that they can eat mathematics, and that they will ultimately change the way we do both teaching and research; furthermore, the sooner these systems are noticed by mathematicians, the sooner this will happen.

2 A brief introduction to interactive theorem provers (ITPs)

Before we talk about computer proof systems, let us consider a computer program which many of us are familiar with: LaTeX.

The computer program LaTeX is a mathematical typesetting system. Thirty years ago, only a small number of (typically young) mathematicians knew how to write LaTeX files, but now most of us do; this program is now the standard typesetting system for mathematicians. If one runs the LaTeX program on a LaTeX file, one of two things can happen. The file might contain errors, for example perhaps we accidentally used a command in normal text mode which is only valid in maths mode. In this case the LaTeX editor we are using will typically flag these errors and ask that we fix them. But when all the errors are gone, the LaTeX program will compile the LaTeX file, and the output will be a (hopefully) beautifully typeset document. This document is typically a PDF file nowadays, which can be read on a screen, printed out, or of course sent to another computer on the internet.

The computer program Lean is an interactive theorem prover (ITP). A small number of (typically young) mathematicians know how to write Lean files. A Lean file can contain definitions, theorem statements, and proofs – we shall see examples later. Just as a LaTeX file is likely to contain some parts written in “maths mode”, a Lean file is likely to contain some parts written in “tactic mode”.

If one runs the Lean program on a Lean file, one of two things can happen. The file might contain errors, for example perhaps not all the hypotheses of a theorem were checked (or were true!) when it was applied. In this case, the Lean editor we are using will typically flag these errors and ask that we fix them. When all the errors are gone, Lean will compile the Lean file, and the output will be... nothing at all. When this happens, Lean believes that all the definitions in your file make sense, and it believes that all the proofs in the file are correct.¹

¹ It is not strictly speaking true that the output is nothing at all – the actual output is a computer file, unreadable by humans, where each proof is represented as a complicated graph. The proofs in this form (terms in a type theory) can be independently verified by typecheckers written in other languages and running on other operating systems and other chipsets, to minimise the possibility that bugs in Lean cause incorrect proofs to be accepted as valid.

Note that Lean is not (just) a programming language like Python or C++ or Haskell. To give an example of the difference: in Python you could write a program which printed out the first 1000 prime numbers, or the first 1000 digits of π. In Lean you could write a proof that there are infinitely many prime numbers, or a proof that π is transcendental.

Lean was written by Leonardo de Moura at Microsoft Research, and is free and open source software which runs on any modern operating system. It is one of many ITP systems. Others include Coq, Isabelle/HOL, Mizar, Metamath, HOL Light, Arend and there are at least 20 more; many of these are also free and open source, and some are over 30 years old. All of these systems use slightly different logical foundations, and it is hard to automatically move a mathematical proof from one system to another, for the same reasons that it is hard to automatically translate a computer program from one language to another. However the differences in these systems will not concern us here. It suffices to say that essentially all of the examples in this article can in theory be written in essentially all of the ITP systems. We will focus on the Lean theorem prover in the examples below, but this is only because it is the system which the author knows best.

3 Teaching mathematics to computers

For the rest of this article, I would like to discuss the possibility of teaching a computer pure mathematics, following the same path as the way we teach it to humans. Let us start with the basics. Earlier on, we mentioned three fundamental mathematical concepts – the theorem statement, the proof, and the definition. We will now see how Lean understands these concepts.

3.1 Propositions

First let us talk about general true/false statements, a fundamental concept in mathematics. As well as theorems, mathematicians are interested in conjectures, which are true/false statements which might be believed to be true, but which are not proven. Here are some examples of true/false statements:

• 2 + 2 = 4;
• 2 + 2 = 5;
• Fermat’s Last Theorem;
• The Riemann Hypothesis.

Two of these statements are true, one is false, and the truth value of the Riemann Hypothesis is currently unknown. Mathematicians do not really have a good word for a general true/false statement. The Riemann Hypothesis is called a conjecture, but it seems ridiculous to call 2 + 2 = 4 or 2 + 2 = 5 a conjecture. Note however that most mathematicians use the words Theorem, Lemma, Proposition and Corollary to express ideas which are formally the same, whereas logicians use the word Proposition to mean an arbitrary true/false statement. From now on we will use the word Proposition in this way, meaning a mathematical statement which has a truth value, rather than one which is definitely true. For example, 2 + 2 = 5 is a false Proposition. We will capitalise Proposition to remind us of this slightly non-standard usage.

In Lean there is a “universe” called Prop, which can be thought of as the collection of all Propositions. Propositions written in Lean are often easily readable to a trained mathematician. Here is an example of some valid Lean code which defines a Proposition.

    variables (X Y Z : Type)
      (f : X → Y) (g : Y → Z)

    open function -- giving us access to the concept of injectivity

    example : Prop := injective f ∧ injective g →
      injective (g ∘ f)

This code defines the Proposition stating that the composite of two injective functions is injective. Note that there is no proof here – we are just observing that the theorem statement can be formalised in Lean. Note also that the variables X, Y and Z were initialised to be not sets but “types”; however in this context the two ideas coincide, the only difference being a linguistic one where we speak about terms x of a given type X and write x : X rather than speaking about elements x of a given set X and writing
x ∈ X. Note also that the arrow → is used both as notation for a function, and for the concept ⟹ of implication.

Here is an example which shows that 2 + 2 = 5 is also a Proposition in Lean:

    example : Prop :=
      2 + 2 = 5

A Proposition is any true/false statement.

3.2 Proofs

Creating a mathematical proof is like solving a puzzle, or a level of a computer game. Let us look further at this analogy. In a sudoku puzzle, one is presented with a partially filled-in grid of numbers and asked to fill in the rest of them, subject to some axioms. My first experience with sudoku was seeing levels of this game in newspapers, and solving them with a pen. Solving a sudoku level in this way can be quite inconvenient – any error may create quite a mess and might be difficult to recover from. It is far easier to solve sudoku levels on a phone app or a web browser, where errors can be erased more efficiently, and experimental lines can be explored and then painlessly accepted or rejected.

Many of us will have seen undergraduate work, written in pen, which is also quite a mess. Lean offers a structured environment where proofs can be created. As an example, let us consider the proof of the Proposition that if f : X → Y and g : Y → Z are injective functions, then so is g ∘ f. Let us first run through the mathematical proof. By definition, our task is to prove that if a, b ∈ X and g(f(a)) = g(f(b)), then a = b. By injectivity of f, we know that to prove a = b it suffices to prove that f(a) = f(b). Finally, the assertion f(a) = f(b) follows from injectivity of g and our assumption.

We will go through a Lean proof of this Proposition below. As one creates the proof using a Lean editor, one can see Lean’s “tactic state”, representing the things Lean knows at any point during the proof. For example, just after one has applied injectivity of f, Lean’s tactic state is the following:

    X Y Z : Type,
    f : X → Y,
    g : Y → Z,
    f_inj : injective f,
    g_inj : injective g,
    a b : X,
    hgf : (g ∘ f) a = (g ∘ f) b
    ⊢ f a = f b

The hypotheses are listed above the “turnstile” ⊢ and the goal is stated after it. Recall that a b : X just means that a and b are elements of the set X. The hypothesis f_inj is the assertion that f is injective, and so on. As one solves the level, i.e., builds the proof, in Lean, the tactic state changes, until it eventually becomes no goals or Proof complete (the exact success message will depend on which editor one is using to interact with Lean). Just like theorem statements, Lean’s tactic state can often be easily understood by mathematicians without any specialist knowledge of ITPs, because the notation used is standard mathematical notation.

Here is a complete Lean file containing the proof, written in Lean’s tactic mode, with comments (written in grey, preceded by --, and ignored by Lean). If you are reading this article in digital format, I invite you to interact with this proof by clicking on this link, which will open up a Lean editor within a web browser. This is not the best way to interact with Lean – it will be slow (you might have to wait for up to ten seconds for Lean to initialise and process the file). But when it is compiled, you can click around and see Lean’s tactic state at any given point in the proof. If you install Lean on your own computer, the same process will be lightning fast.

    import tactic -- gives access to many useful tactics

    open function

    variables (X Y Z : Type) (f : X → Y) (g : Y → Z)

    -- The composite of two injective functions is injective.
    theorem injective_comp :
      injective f ∧ injective g → injective (g ∘ f) :=
    begin
      -- assume f and g are injective.
      rintro ⟨f_inj, g_inj⟩,
      -- We want to prove g ∘ f is injective. So say a, b ∈ X and assume g(f(a)) = g(f(b)).
      intros a b hgf,
      -- We want to prove that a = b. By injectivity of f, it suffices to prove that f(a) = f(b).
      apply f_inj,
      -- By injectivity of g, it suffices to prove g(f(a)) = g(f(b)).
      apply g_inj,
      -- But this is an assumption.
      assumption,
    end

This tactic mode proof looks mildly intimidating but basically comprehensible, in the same way that \sum_{n=0}^{100}n^2 looked mildly intimidating but basically comprehensible before we learnt about LaTeX’s maths mode. The tactics used in the proof such as intros and apply perform basic logical moves on the tactic state, and are not hard to pick up. The last line of the proof, the assumption tactic, solves the goal g(f(a)) = g(f(b)) by noting that it is equal to one of our assumptions, namely hgf.

Anyone concerned about how such a triviality might take five lines to prove might be interested in seeing a complete “term mode” proof of the same theorem:

    theorem injective_comp' :
      injective f ∧ injective g → injective (g ∘ f) :=
    λ ⟨f_inj, g_inj⟩ _ _ hgf, f_inj $ g_inj hgf

This complete proof – shorter than the corresponding LaTeX proof – is far harder for a beginner to understand, and also indicates something about what is going on under the hood, namely that a proof in Lean is actually a function. Indeed, Lean is a functional programming language. We will not go any further into this issue here.

Below is another tactic mode Lean proof, this time of the fact that if a and b are real numbers, then (a + b)³ = a³ + 3a²b + 3ab² + b³. Before we embark upon it, let us consider what goes into a proof from first principles, assuming that the real numbers are a field. We first write the left hand side as ((a + b)(a + b))(a + b) or (a + b)((a + b)(a + b)) depending on what definition we are using for x³. Then we apply left and right distributivity several times to expand out the brackets. If one does this whilst carefully keeping track of all brackets involved, one will discover that one now needs to apply associativity and commutativity of addition and multiplication 20 times or more in order to turn the left hand side into the right hand side – operations which a mathematician applies intuitively and without comment. The full proof, using only the axioms of a ring, seems to be at least 30 lines long, although the exact number of axiom applications needed depends on other foundational questions such as whether 3 is defined to mean (1 + 1) + 1 or 1 + (1 + 1). Here is a complete proof of this result in Lean’s tactic mode:

    import tactic -- tactic mode
    import data.real.basic -- the real numbers

    example (a b : ℝ) :
      (a + b)^3 = a^3 + 3*a^2*b + 3*a*b^2 + b^3 :=
    begin
      ring,
    end

The ring tactic is a high-level tactic, concealing tedious low-level work and enabling mathematicians to operate using their usual interface, well above the axioms of a ring. This tactic was implemented in Lean by Mario Carneiro, a computer scientist, following the formally verified algorithm for checking identities in rings described in 2005 by Grégoire and Mahboubi in [GM05]. It has been indispensable for the subsequent development of ring theory in Lean.

A general tactic mode proof will contain a mixture of low-level and high-level tactics, corresponding to whether the corresponding human proof is operating at axiom level or well above the axioms.

Codewars (www.codewars.com) is a website containing programming challenges in many programming languages, including Lean. The Lean Codewars levels contain many mathematical puzzles, ranging from easy questions (proving that the sum of two odd numbers is even, for example), to far harder ones involving finding all integer solutions to the subtle Diophantine equations x² − 37y² = 3 and y² = x³ + 11. Solving these harder problems involves using a range
of low-level and high-level tactics. Of course one can also invoke theorems from Lean’s extensive mathematics library, where various number theoretic facts such as quadratic reciprocity are already proved. We will say more about Lean’s mathematics library below.

3.3 Definitions

As well as Propositions and proofs, systems like Lean can understand new mathematical definitions. Many of the definitions in Lean’s maths library are definitions of types or of terms. For example, the type ℝ is defined to be the type of equivalence classes of Cauchy sequences of rationals, and the term π is defined to be twice the smallest positive zero of the cosine function. Groups, rings, fields, manifolds, schemes, perfectoid spaces and many many other mathematical structures are all defined to be types in Lean, and their definitions are all readable to mathematicians. For example here is a definition of a group in Lean.

    class group (G : Type)
      extends has_mul G, has_one G, has_inv G :=
    (mul_assoc :
      ∀ (a b c : G), (a * b) * c = a * (b * c))
    (one_mul : ∀ (a : G), 1 * a = a)
    (mul_left_inv : ∀ (a : G), a⁻¹ * a = 1)

The “structural” part of a group (the multiplication, identity and inverse) is packed into the second line, and the axioms follow afterwards. Given this definition, one can now start to prove other basic results. For example here is something which is proved very early on in the development of the interface for groups – the proof that the left identity coming from the axioms is also a right identity.

    theorem mul_one (a : G) : a * 1 = a :=
    begin
      -- exercise!
    end

This exercise now becomes a puzzle. It is my experience that certain undergraduates enjoy solving puzzles like this in Lean, developing the basic theory of groups by completing levels of a computer game.

In 2012 Peter Scholze introduced the concept of a perfectoid algebra and a perfectoid space into mathematics. Slightly later on, Fontaine introduced the general notion of a perfectoid ring. Here is the definition of a perfectoid ring in Lean (here p is a prime number):

    structure perfectoid_ring (R : Type)
      [Huber_ring R] extends Tate_ring R : Prop :=
    (complete : is_complete_hausdorff R)
    (uniform : is_uniform R)
    (ramified : ∃ ϖ : pseudo_uniformizer R,
      ϖ^p ∣ p in Rᵒ)
    (Frobenius : surjective (Frob Rᵒ/p))

A perfectoid ring is a complete Hausdorff Tate ring satisfying some technical hypotheses. Experts in the area will certainly be able to read and understand the gist of the code above. Perfectoid rings and perfectoid spaces were formalised in Lean by Johan Commelin, Patrick Massot and myself: see [BCM20].

3.4 Lean’s mathematics library

The perfectoid ring code snippet above would not compile in core Lean alone; some imports would be needed from Lean’s maths library and Lean’s perfectoid space library. Lean’s maths library is a rapidly-growing library containing a lot of undergraduate level algebra, analysis, number theory, and topology. At the time of writing (April 2020) it contains theorems from number theory such as quadratic reciprocity, theorems from algebra such as the Hilbert basis theorem, theorems from analysis such as the inverse function theorem, open mapping theorem and Arzelà–Ascoli theorem, definitions such as manifolds and topological spaces, smooth functions, and so on. Undergraduates at my university can (and do) solve problem sheets and past exam questions in Lean; I am convinced that it can play a role in making undergraduate teaching better (see forthcoming work of Iannone and Thoma, discussing my interventions so far).

Note however that much work remains to be done in Lean’s maths library. A notable omission in number theory is the basic theory of factorisation of ideals into prime ideals in algebraic number fields, a notable
omission in algebra is undergraduate representation theory, and a notable omission in complex analysis is Cauchy’s integral formula. It is only a matter of time before these holes are filled however, such is the pace of development right now. Enough commutative algebra was developed for undergraduates at my university to define schemes and to prove that an affine scheme is a scheme (that is, that the structure presheaf on an affine scheme is a sheaf). We have also formalised various tags in the Stacks Project [Sta18], an encyclopedic reference for modern algebraic geometry. Homological algebra is on the horizon, although we have made the design decision to set up everything within the context of abelian categories and are currently developing tactics to help with diagram chasing. By “we” I mean the collection of mathematicians and computer scientists who are collaborating to build Lean’s mathematics library, an open source library of mathematics written in Lean which was started in 2017 and now contains nearly a quarter of a million lines of code. There are plenty of ways to get started learning the theory, and plenty of projects to work on. Within a few years I believe that the library will cover all of undergraduate pure mathematics. Another of my visions for the library is that it becomes a 21st century version of Bourbaki. Indeed much of Bourbaki’s Topologie Générale is now formalised in Lean, thanks to the efforts of (mathematician) Patrick Massot, building on earlier work of (computer scientist) Johannes Hölzl. Much of this was needed to formalise the definition of a perfectoid space. We are only just beginning to create a rich interface for the theory of local fields however, so proving Scholze’s tilting correspondence is still several years away. Later on, we will ask whether proving profound results like this is even the right thing to be doing.

3.5 Learning to fly

We have seen that ITPs, if taught by humans, are capable of picking up the basics of mathematics, and are (completely unsurprisingly) capable of checking proofs which stick close to the axioms of mathematics, such as most basic results in the first year of a mathematics degree. We have also seen an example of a higher-powered tactic, ring, which solves a problem for which working axiomatically would be a chore. How much further away from the axioms can we move?

The Yoneda lemma is a lemma in category theory where the idea is, in some sense, in the theorem statement, and the proof is to just chase the diagrams. It will come as no surprise to hear that Lean can find the proof by itself; a generic “follow your nose” tactic has been written by Scott Morrison at ANU. Morrison, a mathematician, is an expert in designing high-powered tactics in Lean which can discover proofs of statements such as this. Morrison is currently concentrating on category theory, although the tactics he is developing are slowly being taken up by developers in other areas of mathematics in Lean. In particular, Lean already has some kind of primitive intuition for mathematics, although this intuition currently works better in some areas than others.

Gabriel Ebner, a computer scientist, is implementing a second approach, where versions of Lean goals are passed to an external system which specialises in solving first order logic problems, and the external system passes back data which Lean then attempts to turn into a rigorous proof. Such techniques (calling external solvers which have been designed to solve certain types of logic problem) are referred to as “hammers” and they are already in active use in several other ITPs – Lean is still playing catch-up in this area. Indeed the first hammer, Sledgehammer [PB10], developed by a team led by Larry Paulson, was an extremely successful tool for the Isabelle/HOL proof system. Note however that almost all experiments by computer scientists are restricted to the databases of formalised theorems which are currently available, and because such databases still do not even cover all of undergraduate mathematics, one imagines that there is still room for improvement.

3.6 Research level mathematics

Mathematics is vast, and growing quickly, and unless there is some kind of complete cultural change (which seems unlikely), one cannot imagine humans formalising their proofs in an ITP rather than typing them up into LaTeX any time soon (it would also
8 make papers around four times longer; four seems to other systems. One output of such a project would be the current “de Bruijn factor” representing the be a complex graph linking important mathematical ratio between the length of a formal proof and the concepts, and which grows over time. Hales imagines length of the corresponding LATEX proof). Similarly exploration tools enabling humans to analyse this it would take an extraordinary breakthrough in ma- graph, like a Google Earth for mathematics. Search chine learning before computers can start to read for mathematical theorems would suddenly become human-written papers. much easier – and the machine learning experts would There have been some spectacular one-off achieve- finally have something to get their teeth into. Math ments however. The two most obvious examples are Reviews and Zentralblatt contain human-written re- [GAAea13] (a formalisation of the proof of the Feit– views of modern mathematical papers, and writing a Thompson odd order theorem in Coq) and [HAB+17] formal abstract would in many cases be easier than (a formalisation of the Hales–Ferguson proof of the writing a review, as only the theorem statements Kepler conjecture). These two formalisations came would be required, and once sufficiently many defi- about for two rather different reasons. The Feit– nitions are in the system, formalising theorem state- Thompson work, led by Gonthier, was a demonstra- ments becomes easy. As more young mathematicians tion that ITPs were capable of understanding a proof learn how to write mathematics in this kind of soft- which is of standard. The Kepler con- ware, the possibility of making such a database seems jecture work, led by Hales, was an attempt to jus- to be becoming ever more real. 
Such a project seems tify the correctness of the Hales–Ferguson proof, after to be a feasible way of digitising modern mathemat- the Annals of Mathematics chose to publish the pa- ics, and can be integrated into the hammers men- per proof whilst the referees claimed that they were tioned previously in order to make more powerful only “99% certain” of its correctness. To give an- proof search. If PhD students are encouraged to for- other recent example, the 2017 Ellenberg–Gijswijt malise the statements of the results they are claiming proof [EG17] of the cap set conjecture, also published to prove, and the statements of some of the theorems in the Annals, was formalised [DHL19] in Lean two they use, then such a database could begin to grow years later by Dahmen, H¨olzland Lewis. These ex- very quickly indeed. amples show that it is now feasible for modern math- The Formal Abstracts project offers a real possibil- ematics (or at least some of it) to be formalised in ity of making an entirely new object – a digitisation “real time”. However the problem remains that there of what the modern mathematician believes, and a is no currently feasible method to turn the corpus of chance for machine learning experts to see what they modern mathematics into a formally verified state. can make of it in a format which they can easily un- derstand. If we, the mathematical community, build 3.7 Formal abstracts this database, then they, the computer scientists, will come. In fact, they are waiting. But let us go back to the human PhD student, who, when they start their research, does not need to know all of the proofs of all of the theorems that they will Final thoughts be using. The important thing is that they know the statements. And teaching modern theorem state- I have spoken a lot about Lean, but there are many ments to a computer is well within our grasp. 
other systems available; Coq, and Isabelle/HOL are Tom Hales has proposed, in his Formal Abstracts also serious systems with a lot of mathematics in, project [Hal20], that the main theorem statements and there are others too. The debate about which and definitions of mathematical papers be formalised system is “best” is a complex one, involving techni- in an ITP. Proofs are not required. Hales has cho- calities about dependent types, strong normalisation sen Lean for the system he wants to use, but there and subject reduction, and is beyond the scope of this is no reason why other analogous projects cannot use article.
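For readers who have not seen one of these systems, here is a minimal sketch of what a formal abstract in the sense of Section 3.7 might look like, written in Lean 4 syntax. It is an illustration only and is not taken from the Formal Abstracts project: the name fermat_last_theorem and the way the hypotheses are phrased are assumptions made for this example. The statement is formalised precisely, whilst the proof is deliberately left as sorry, which Lean accepts with a warning.

```lean
-- Illustrative sketch only (Lean 4, no imports needed); the theorem name is hypothetical.
-- A formal abstract records the precise statement; the proof is not required.
theorem fermat_last_theorem
    (x y z n : Nat) (hx : 0 < x) (hy : 0 < y) (hz : 0 < z) (hn : 2 < n) :
    x ^ n + y ^ n ≠ z ^ n := by
  sorry  -- placeholder: Lean records the statement and warns that the proof is missing
```

Once a statement like this is in the system, other formal statements can refer to it by name, even though no machine-checked proof of it has been supplied.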

All of the systems could be used for educational purposes at undergraduate level; I use Lean with my 1st year students and some of them love it. I run a weekly club, the Xena project, where I teach undergraduates (and they teach me) how to do undergraduate and MSc level mathematics in Lean. I would be extremely interested in making one of these systems more user-friendly whilst keeping it flexible enough to do a lot of undergraduate level mathematics. An interesting related question is how to teach this topic to undergraduates – teaching material for undergraduate mathematicians is currently being developed by a team consisting of both mathematicians and computer scientists. The CoCalc website [Sag20] enables teams of people to work on Lean code in a collaborative real time environment and I have used this environment for training undergraduates to use Lean.

Whether or not they are used for teaching, ultimately it seems to me inevitable that these systems will change research. Writing proofs in these systems forces a human to clarify their thinking, and will perhaps lead them to discover better abstractions. Long proofs, too long for any single human to comprehend every last detail of, are becoming more commonplace. Can computers help to check these?

Looking further ahead, I believe that one day our subject will experience a true culture shock, where a computer announces a proof of a conjecture which humans are interested in, and if the argument is sufficiently deep then the computer might not be able to explain its argument using a language that humans can understand. But well before that happens, computers will become tools which we can use to search for and verify lemmas in all areas of pure mathematics, and the sooner an area is taught to one of these systems, the sooner computers will be able to help. The Lean Prover community website at https://leanprover-community.github.io/ and the Lean chatroom at leanprover.zulipchat.com are two places to start, if people have any questions about what some of us believe will be the future of mathematics.

I thank the members of the Lean Prover community for their feedback on an earlier version of this article.

References

[BCM20] Kevin Buzzard, Johan Commelin, and Patrick Massot, Formalising perfectoid spaces, Proceedings of the 9th ACM SIGPLAN International Conference on Certified Programs and Proofs, CPP 2020, New Orleans, LA, USA, January 20-21, 2020, 2020, pp. 299–312.

[DHL19] Sander R. Dahmen, Johannes Hölzl, and Robert Y. Lewis, Formalizing the solution to the cap set problem, 10th International Conference on Interactive Theorem Proving, 2019, pp. Art. No. 15, 19. MR4008934

[EG17] Jordan S. Ellenberg and Dion Gijswijt, On large subsets of $\mathbb{F}_q^n$ with no three-term arithmetic progression, Ann. of Math. (2) 185 (2017), no. 1, 339–343. MR3583358

[GAAea13] Georges Gonthier, Andrea Asperti, Jeremy Avigad, et al., A machine-checked proof of the odd order theorem, Interactive theorem proving, 2013, pp. 163–179. MR3111271

[GM05] Benjamin Grégoire and Assia Mahboubi, Proving equalities in a commutative ring done right in Coq, Theorem proving in higher order logics, 2005, pp. 98–113. MR2197007

[HAB+17] Thomas Hales, Mark Adams, Gertrud Bauer, Tat Dat Dang, John Harrison, Le Truong Hoang, Cezary Kaliszyk, Victor Magron, Sean McLaughlin, Tat Thang Nguyen, Quang Truong Nguyen, Tobias Nipkow, Steven Obua, Joseph Pleso, Jason Rute, Alexey Solovyev, Thi Hoai An Ta, Nam Trung Tran, Thi Diep Trieu, Josef Urban, Ky Vu, and Roland Zumkeller, A formal proof of the Kepler conjecture, Forum Math. Pi 5 (2017), e2, 29. MR3659768

[Hal20] Tom Hales, Formal Abstracts Project, 2020.

[Maz14] Barry Mazur, Is it plausible?, Math. Intelligencer 36 (2014), no. 1, 24–33. MR3166990

[PB10] Lawrence C. Paulson and Jasmin Christian Blanchette, Three years of experience with Sledgehammer, a practical link between automatic and interactive theorem provers, The 8th International Workshop on the Implementation of Logics, IWIL 2010, Yogyakarta, Indonesia, October 9, 2011, 2010, pp. 1–11.

[Sag20] SageMath, Inc., CoCalc collaborative computation online, 2020. https://cocalc.com/.

[Sta18] The Stacks Project Authors, Stacks Project, 2018.

[Tao09] Terence Tao, There’s more to mathematics than rigour and proofs, 2009.
