Formal Proof—Getting Started Freek Wiedijk

Freek Wiedijk is a lecturer in computer science at the Radboud University Nijmegen, The Netherlands. His email address is [email protected].

Notices of the AMS, Volume 55, Number 11, December 2008

A List of 100 Theorems

Today highly nontrivial mathematics is routinely being encoded in the computer, ensuring a reliability that is orders of magnitude larger than if one had just used human minds. Such an encoding is called a formalization, and a program that checks such a formalization for correctness is called a proof assistant.

Suppose you have proved a theorem and you want to make certain that there are no mistakes in the proof. Maybe a mistake has already been found a couple of times and you want to make sure that this will not happen again. Maybe you fear that your intuition is misleading you and want to make sure that this is not the case. Or maybe you just want to bring your proof into the most pure and complete form possible. We will explain in this article how to go about this.

Although formalization has become a routine activity, it is still labor intensive. Using current technology, a formalization will be roughly four times the size of a corresponding informal LaTeX proof (this ratio is called the de Bruijn factor), and it will take almost a full week to formalize a single page from an undergraduate mathematics textbook.

The first step towards a formalization of a proof consists of deciding which proof assistant to use. For this it is useful to know which proof assistants have been shown to be practical for formalization. On the webpage [1] there is a list that keeps track of the formalization status of a hundred well-known theorems. The first few entries on that list appear in Table 1.

      theorem                                        number of systems in which the
                                                     theorem has been formalized
  1.  The Irrationality of $\sqrt{2}$                ≥ 17
  2.  Fundamental Theorem of Algebra                 4
  3.  The Denumerability of the Rational Numbers     6
  4.  Pythagorean Theorem                            6
  5.  Prime Number Theorem                           2
  6.  Gödel’s Incompleteness Theorem                 3
  7.  Law of Quadratic Reciprocity                   4
  8.  The Impossibility of Trisecting the Angle
      and Doubling the Cube                          1
  9.  The Area of a Circle                           1
 10.  Euler’s Generalization of Fermat’s
      Little Theorem                                 4
 11.  The Infinitude of Primes                       6
 12.  The Independence of the Parallel Postulate     0
 13.  Polyhedron Formula                             1
  …   …

Table 1. The start of the list of 100 theorems [1].

On the webpage [1] only eight entries are listed for the first theorem, but in [2] seventeen formalizations of the irrationality of $\sqrt{2}$ have been collected, each with a short description of the proof assistant.

When we analyze this list of theorems to see what systems occur most, it turns out that there are five proof assistants that have been significantly used for formalization of mathematics. These are:

  proof assistant    number of theorems formalized
  HOL Light          69
  Mizar              45
  ProofPower         42
  Isabelle           40
  Coq                39
  all together       80

Currently in all systems together 80 theorems from this list have been formalized. We expect to get to 99 formalized theorems in the next few years, but Fermat’s Last Theorem is the 33rd entry of the list and therefore it will be some time until we get to 100.

If we do not look for quantity but for quality, the most impressive formalizations up to now are:

  Gödel’s First Incompleteness Theorem: by Natarajan Shankar using the proof assistant nqthm in 1986, by Russell O’Connor using Coq in 2003, and by John Harrison using HOL Light in 2005.

  Jordan Curve Theorem: by Tom Hales using HOL Light in 2005, and by Artur Korniłowicz using Mizar in 2005.

  Prime Number Theorem: by Jeremy Avigad using Isabelle in 2004 (an elementary proof by Atle Selberg and Paul Erdős), and by John Harrison using HOL Light in 2008 (a proof using the Riemann zeta function).

  Four-Color Theorem: by Georges Gonthier using Coq in 2004.

All but one of the systems used for these four theorems are among the five systems that we listed. This again shows that currently these are the most interesting for formalization of mathematics.

Here are the proof styles that one finds in these systems:

  proof assistant    proof style of the system
  HOL Light          procedural
  Mizar              declarative
  ProofPower         procedural
  Isabelle           both possible
  Coq                procedural

A declarative system is one in which one writes a proof in the normal way, although in a highly stylized language and with very small steps. For this reason a declarative formalization resembles program source code more than ordinary mathematics. In a procedural system one does not write proofs at all. Instead one holds a dialogue with the computer. In that dialogue the computer presents the user with proof obligations or goals, and the user then executes tactics, which reduce a goal to zero or more new, and hopefully simpler, subgoals. Proof in a procedural system is an interactive game.

In this paper we will show HOL Light as the example of a procedural system, and Mizar as the example of a declarative system.

The main advantage of HOL Light is its elegant architecture, which makes it a very powerful and reliable system. A proof of the correctness of the 394 line HOL Light “logical core” has even been formalized. On the other hand HOL has the disadvantage that it sometimes cannot express abstract mathematics (mostly when it involves algebraic structures) in an attractive way. It can essentially express all abstract mathematics though. Another disadvantage of HOL is that the proof parts of the HOL scripts are unreadable. They can only be understood by executing them on the computer.

Mizar on the other hand allows one to write abstract mathematics very elegantly, and its scripts are almost readable like ordinary mathematics. Also Mizar has by far the largest library of already formalized mathematics (currently it is over 2 million lines). However, Mizar has the disadvantage that it is not possible for a user to automate recurring proof patterns, and the proof automation provided by the system itself is rather basic. Also, in Mizar it is difficult to express the formulas of calculus in a recognizable style. It is not possible to “bind” variables, which causes expressions for constructions like sums, limits, derivatives, and integrals to look unnatural.

The Example: Quadratic Reciprocity

In this article we will look at two formalizations of a specific theorem. For this we will take the Law of Quadratic Reciprocity, the seventh theorem from the list of a hundred theorems. This theorem has thus far been formalized in four systems: by David Russinoff using nqthm in 1990, by Jeremy Avigad using Isabelle in 2004, by John Harrison using HOL Light in 2006, and by Li Yan, Xiquan Liang, and Junjie Zhao using Mizar in 2007.

When I was a student, my algebra professor Hendrik Lenstra always used to say that the Law of Quadratic Reciprocity is the first nontrivial theorem that a student encounters in the mathematics curriculum. Before this theorem, most proofs can be found without too much trouble by expanding the definitions and thinking hard. In contrast the Law of Quadratic Reciprocity is the first theorem that is totally unexpected. It was already conjectured by Euler and Legendre, but was proved only by the “Prince of Mathematicians”, Gauss, who called it the Golden Theorem and during his lifetime gave eight different proofs of it.

The Law of Quadratic Reciprocity relates whether an odd prime $p$ is a square modulo an odd prime $q$, to whether $q$ is a square modulo $p$. The theorem says that these are equivalent unless both $p$ and $q$ are 3 modulo 4, in which case they have opposite truth values.
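The statement just given can be tried out on small cases. The following is a minimal sketch, not taken from any of the formalizations mentioned in this article: it assumes brute-force search is acceptable for small primes, and the function names (`is_square_mod`, `odd_primes_below`, `check_reciprocity`) are made up for this illustration.

```python
def is_square_mod(a, p):
    """Return True if a is congruent to a nonzero square modulo the prime p.

    Brute force: try every x in 1..p-1. Only sensible for small p.
    """
    return any((x * x - a) % p == 0 for x in range(1, p))


def odd_primes_below(n):
    """List the odd primes smaller than n by trial division (illustrative helper)."""
    return [
        k
        for k in range(3, n, 2)
        if all(k % d for d in range(3, int(k ** 0.5) + 1, 2))
    ]


def check_reciprocity(limit):
    """Check the statement for every pair of distinct odd primes below limit.

    For each pair (p, q): "p is a square mod q" and "q is a square mod p"
    should agree, unless both p and q are 3 modulo 4, in which case they
    should disagree.
    """
    primes = odd_primes_below(limit)
    for p in primes:
        for q in primes:
            if p == q:
                continue
            same = is_square_mod(p, q) == is_square_mod(q, p)
            both_3_mod_4 = p % 4 == 3 and q % 4 == 3
            if same == both_3_mod_4:
                return False  # a counterexample would land here
    return True
```

Calling `check_reciprocity(60)` runs through all pairs of distinct odd primes below 60. A check like this is only evidence for small cases; ruling out mistakes in the general proof is what a formalization is for.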
There also are two supplements to the Law of Quadratic Reciprocity, which say that $-1$ is a square modulo an odd prime $p$ if and only if $p \equiv 1 \pmod{4}$, and that $2$ is a square modulo an odd prime $p$ if and only if $p \equiv \pm 1 \pmod{8}$.

The Law of Quadratic Reciprocity is usually phrased using the Legendre symbol. A number $a$ is called a quadratic residue modulo $p$ if there exists an $x$ such that $x^2 \equiv a \pmod{p}$. The Legendre symbol $\left(\frac{a}{p}\right)$ for $a$ coprime to $p$ then is defined by

\[
\left(\frac{a}{p}\right) =
\begin{cases}
1 & \text{if $a$ is a quadratic residue modulo $p$} \\
-1 & \text{otherwise.}
\end{cases}
\]

Using the Legendre symbol, the Law of Quadratic Reciprocity can be written as:

\[
\left(\frac{p}{q}\right) \left(\frac{q}{p}\right) = (-1)^{\frac{p-1}{2} \cdot \frac{q-1}{2}}
\]

The right hand side will be $-1$ if and only if both $p$ and $q$ are 3 (mod 4).

[Figure: diagram with marks at $0$, $\frac{1}{2}p$ and $\frac{1}{2}q$, divided into regions labeled by inequalities in $px$ and $qy$.]

HOL Light

Suppose that we select HOL Light as our proof assistant. The second step will be to download and install the system. This does not take long. First download the ocaml compiler from [5] and install it. Next download the tar.gz file with the current version of the HOL Light sources from [6] and unpack it. Then follow the installation instructions in the README file. If you use Linux or Mac OS X, all you will need to do is type “make”. Under Windows, installation is a bit more involved: you will have to copy the “pa_j_….ml” file that corresponds to your version of ocaml as given by “ocamlc -version” to a file called “pa_j.ml”, and then compile that copy using one of the two “ocamlc -c” commands that
