
Figurate numbers and sums of numerical powers: Fermat, Pascal, Bernoulli

David Pengelley∗

In the year 1636, one of the greatest mathematicians of the early seventeenth century, the Frenchman Pierre de Fermat (1601–1665), wrote to his correspondent Gilles Personne de Roberval1 (1602–1675) that he could solve “what is perhaps the most beautiful problem of all arithmetic” [4], and stated that he could do this using the following theorem on the figurate numbers derived from natural progressions, as he first stated in a letter to another correspondent, Marin Mersenne2 (1588–1648) in Paris:

∞∞∞∞∞∞∞∞

Pierre de Fermat, from Letter to Mersenne. September/October, 1636, and again to Roberval, November 4, 1636

The last number multiplied by the next larger number is double the collateral triangle; the last number multiplied by the triangle of the next larger is three times the collateral pyramid; the last number multiplied by the pyramid of the next larger is four times the collateral triangulo-triangle; and so on indefinitely in this same manner [4], [15, p. 230f], [8, vol. II, pp. 70, 84–85].

∞∞∞∞∞∞∞∞

Fermat, most famous for his last theorem,3 worked on many problems, some of which had ancient origins. Fermat had a law degree and spent most of his life as a government official in Toulouse. There are many indications that he did mathematics partly as a diversion from his professional duties, solely for personal gratification. That was not unusual in his day, since a mathematical profession comparable to today’s did not exist. Very few scholars in Europe made a living through their research accomplishments. Fermat had one especially unusual trait: characteristically he did not divulge proofs for the discoveries he wrote of to others; rather, he challenged them to find proofs of their own. While he enjoyed the attention and esteem he received from his correspondents, he never showed interest in publishing a book with his results. He never traveled to the centers of mathematical activity, not even Paris, preferring to communicate with the scientific community through an exchange of letters, facilitated by the Parisian theologian Marin Mersenne, who served as a clearinghouse for scientific correspondence from all over Europe, in the absence as yet of scientific research journals. While Fermat made very important contributions to the development of the differential and integral calculus and to analytic geometry [15, Chapters III, IV], his life-long passion belonged to the study of properties of the integers, now known as number theory, and it is here that Fermat has had the most lasting influence on the subsequent course of mathematics. In hindsight, Fermat was one of the great mathematical pioneers, who built a whole new paradigm for number theory on the accomplishments of his predecessors, and laid the foundations for a mathematical theory that would later be referred to as the “queen of mathematics.” [17]

∗ Mathematical Sciences; Dept. 3MB, Box 30001; New Mexico State University; Las Cruces, NM 88003; [email protected].

1 Roberval was a prominent mathematician of the era just before the invention of the infinitesimal calculus. He developed his own “method of indivisibles” for finding areas and volumes before the fundamental theorem of calculus was available. For more on his life and work see, e.g., [19].

2 Mersenne was a theologian, philosopher, mathematician and music theorist. In addition to his creative work, he carried on extensive correspondence with mathematicians and other scientists in many countries. The scientific journal had not yet come into being, and Mersenne served as the central node for exchange of much mathematical and scientific information. For more on his life and work see, e.g., [16].

3 Fermat claimed that the equation $x^n + y^n = z^n$ has no solution in positive whole numbers x, y, z when n > 2. One of the greatest triumphs of twentieth-century mathematics was the proof in the 1990s of his famous long-standing claim. This was accomplished by Andrew Wiles, utilizing much of the advanced mathematics developed in the preceding century. For more information see, e.g., [9].

Figure 1: Fermat

What was this “most beautiful problem of all arithmetic” to which Fermat refers in his letter to Roberval? And what are the figurate numbers about which he makes his geometric sounding claims? He writes as if these numbers are themselves geometric figures like triangles and pyramids. In fact they are the numbers that “count” how many equally spaced dots occur in forming discrete versions of these geometric figures, and such figurate numbers had first been studied millennia earlier. Fermat’s “most beautiful problem of all arithmetic”, which he claimed he could solve using figurate numbers, also had a long history. He was referring to finding formulas for sums of powers in a natural progression, i.e., in the progression 1, 2, . . . , n whose “last number” is n.
We shall see shortly what is meant by a sum of powers, and see why formulas for sums of powers became central in mathematics as the ideas of calculus emerged through the seventeenth century. Both figurate numbers and formulas for sums of powers were important in mathematics long before and after Fermat, and the whole story is told in full in [14]. This project focuses on the intertwined study of the figurate numbers and formulas for sums of powers through the work of three great seventeenth century mathematicians, Pierre de Fermat, Blaise Pascal (1623–1662), and Jakob Bernoulli (1654–1705). We will see that while figurate numbers are not generally well known by that name today, they are the same numbers that count combinations of choices, that appear as coefficients in binomial expansions, and that appear in Pascal’s Triangle, and are thus still perhaps the most important numbers in combinatorial mathematics today.

Fermat’s figurate numbers and sums of powers

Already in the classical Greek era, figurate numbers were of great interest, and formulas for sums of powers were sought for solving area and volume problems. The Pythagoreans, a mysterious group led by Pythagoras in ancient Greece around the sixth century b.c.e., believed that number was the substance of all things, and geometric patterns seen in dots or pebbles in the sand showed them relationships between numbers. For instance, in Figure 2 we see dot pictures for three types of numbers the Pythagoreans probably considered, and we can obtain some formulas for discrete sums from them.

Figure 2: Square, rectangular, and triangular numbers

Exercise 1. Using the dot pictures of Figure 2, explain how to obtain formulas like

\[
\sum_{i=1}^{n} (2i - 1) = n^2, \qquad \sum_{i=1}^{n} 2i = n(n+1), \qquad \sum_{i=1}^{n} i = \frac{n(n+1)}{2},
\]

and then prove these formulas by mathematical induction.
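A numerical spot check of the three dot-picture formulas can be sketched in Python as follows. The function name `check_dot_formulas` is illustrative and not part of the project, and a finite check like this is of course no substitute for the proof the exercise asks for.

```python
def check_dot_formulas(n):
    """Return True if the three sums from Exercise 1 match their closed forms for this n."""
    odd_sum = sum(2 * i - 1 for i in range(1, n + 1))   # dots in an n-by-n square
    even_sum = sum(2 * i for i in range(1, n + 1))      # dots in an n-by-(n+1) rectangle
    plain_sum = sum(range(1, n + 1))                    # dots in a triangle with n rows
    return (odd_sum == n ** 2
            and even_sum == n * (n + 1)
            and 2 * plain_sum == n * (n + 1))

# Try the first fifty values of n.
print(all(check_dot_formulas(n) for n in range(1, 51)))
```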

We can now also understand Fermat’s first claim, if we realize that for the progression 1, 2, . . . , n, what Fermat means by the collateral triangle is the triangle with rows of dots whose length ranges from 1 through n, as in Figure 2. We will call the total number of dots in such a triangle a triangular number.

Exercise 2. Interpret and justify Fermat’s first claim about the collateral triangle using the results of the previous exercise.

The further figurate numbers, such as Fermat’s collateral pyramid, will generalize this by counting dots in analogous higher-dimensional figures. We shall study these shortly, but first we begin the parallel story of sums of powers.

In classical Greek mathematics we find an interest not just in knowing how to sum the terms in a progression, as in the closed formula4 appearing on the right side of $\sum_{i=1}^{n} i = \frac{n(n+1)}{2}$. We also find sums like $\sum_{i=1}^{n} i^2$ and $\sum_{i=1}^{n} i^3$, which today we call sums of powers, i.e., the sum of n terms in the progression each raised to a fixed power. These arose in specific determinations of areas and volumes of geometric objects with curved sides, by what is today called the Greek method of exhaustion. For instance, Archimedes of Syracuse (c. 287–212 b.c.e.), perhaps the greatest mathematician of antiquity, determined areas bounded by parabolas and spirals.

4 By a closed formula we mean one without an arbitrarily long summation process.

To see the connection to sums of powers, the reader should recall and review from modern calculus that an area bounded by curves is given by an integral, defined as a limit of sums that approximate the area with sums of areas of rectangles. Recall, for instance, that for the important integral $\int_0^1 x^k\,dx$, with k an arbitrary fixed natural number, the simplest Riemann sums will involve $\sum_{i=1}^{n} i^k$, and that to determine the integral as a limiting value of the Riemann sums, one will need some very good knowledge about the size of these sums as n grows. While you may be tempted simply to evaluate this integral by antidifferentiation using the fundamental theorem of calculus, note that neither antidifferentiation nor the fundamental theorem of calculus were even dreamt of by anyone in the world until the second half of the seventeenth century, two millennia after Archimedes and several decades after Fermat’s work.

However, by Fermat’s time in the first half of the seventeenth century, finding formulas for sums of powers $\sum_{i=1}^{n} i^k$ for the purpose of computing the integral $\int_0^1 x^k\,dx$ was a major quest in mathematics. We should therefore not be surprised that Archimedes desired knowledge about sums of squares $\sum_{i=1}^{n} i^2$, since $\int_0^1 x^2\,dx$ gives an area bounded by a parabola, and the reader may check that the same integral arises when finding the area bounded by an Archimedean spiral, a curve described today by $r = a\theta$ in polar coordinates. In Fermat’s time the curves $y = x^k$ for k > 2 were called higher parabolas by analogy.

Exercise 3. Carry out an analysis with sums of rectangles to approximate the area $\int_0^1 x^k\,dx$ of the region under the curve $y = x^k$ by an expression involving $\sum_{i=1}^{n} i^k$, and state what limit needs to be computed to determine the area. Specialize to the case k = 2 to specify what limit will give the area under the parabola.

Archimedes actually solved his area problem for the parabola by two incredibly clever methods that did not require using a sum of squares, and the reader may see these in [13]. But he found the area bounded by a turn of his spiral by discovering, proving, and utilizing a result about the sum of squares, $\sum_{i=1}^{n} i^2$, within the Greek method of exhaustion. We will not show his geometric viewpoint or proof of his result on a sum of squares, but will state his result here in modern numerical terminology:

\[
(n+1)\,n^2 + \sum_{i=1}^{n} i = 3 \sum_{i=1}^{n} i^2.
\]

Exercise 4. Use this equality of Archimedes, and the earlier formula for the sum of terms in the progression, to find an explicit formula for $\sum_{i=1}^{n} i^2$.

Exercise 5. Now use the explicit polynomial formula you have for $\sum_{i=1}^{n} i^2$ to compute the necessary limit to obtain the value of $\int_0^1 x^2\,dx$.

Exercise 6. Show from the polynomial formula for $\sum_{i=1}^{n} i^2$ that

\[
\frac{n^3}{3} < \sum_{i=1}^{n} i^2 < \frac{(n+1)^3}{3},
\]

and determine the value of $\int_0^1 x^2\,dx$ directly from these inequalities without needing the precise formula for $\sum_{i=1}^{n} i^2$.

Exercise 7. Roberval wrote to Fermat that he could find the areas under all the higher parabolas (which we would label as the integrals $\int_0^1 x^k\,dx$ for k > 2) using the inequalities

\[
\frac{n^{k+1}}{k+1} < \sum_{i=1}^{n} i^k < \frac{(n+1)^{k+1}}{k+1},
\]

which he claimed hold. Indeed, calculate $\int_0^1 x^k\,dx$ by considering lower and upper sums of rectangles based on left and right endpoints of equally spaced partitions of the interval, and by using Roberval’s inequalities to compute the appropriate limit.

Fermat went further with sums of powers, claiming he could find explicit formulas for them.
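To get a feel for Roberval’s inequalities and for how the rectangle sums squeeze the area, the following Python sketch may help. The helper names are illustrative only. Exact rational arithmetic is used so that the strict inequalities are tested without rounding error, and the lower and upper sums are built from left and right endpoints of n equal strips on the interval from 0 to 1, which is one reading of the setup in Exercise 7.

```python
from fractions import Fraction

def power_sum(n, k):
    """Sum of the k-th powers 1**k + 2**k + ... + n**k."""
    return sum(i ** k for i in range(1, n + 1))

def roberval_holds(n, k):
    """Check n^(k+1)/(k+1) < sum of k-th powers < (n+1)^(k+1)/(k+1)."""
    s = power_sum(n, k)
    return Fraction(n ** (k + 1), k + 1) < s < Fraction((n + 1) ** (k + 1), k + 1)

def lower_upper_sums(n, k):
    """Left- and right-endpoint rectangle sums for x**k on [0, 1] with n equal strips."""
    scale = n ** (k + 1)
    return Fraction(power_sum(n - 1, k), scale), Fraction(power_sum(n, k), scale)

# Watch the two rectangle sums for the parabola (k = 2) close in on each other.
for n in (10, 100, 1000):
    lower, upper = lower_upper_sums(n, 2)
    print(n, float(lower), float(upper))
```

The gap between the two rectangle sums is exactly 1/n, which is why knowing the size of the power sum up to Roberval’s bounds is already enough to pin down the limit.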

Let us now reconnect our exploration of sums of powers to figurate numbers by seeing how we can obtain the polynomial formula for $\sum_{i=1}^{n} i^2$ from Fermat’s second claim about figurate numbers. This was exactly the viewpoint expressed by Fermat, that the sums of powers problem can be solved by understanding figurate numbers. To do this, consider Fermat’s second claim, that “the last number multiplied by the triangle of the next larger is three times the collateral pyramid”. We need to know what he meant by the pyramid collateral to the number n. Analogous to the collateral triangle, by the collateral pyramid Fermat means to count the dots in a three-dimensional triangular pyramid with sides each having n dots (Figure 3).

Figure 3: Pyramidal numbers

We can now apply Fermat’s claim, by thinking first about the geometry of a triangular pyramid with n dots on each edge. We see that its vertical layers consist of n triangular numbers stacked on top of each other, with edge lengths from 1 to n. Since we know from before that the triangular number with edge length i has $\frac{i(i+1)}{2}$ dots, we see that altogether the pyramid has $\sum_{i=1}^{n} \frac{i(i+1)}{2}$ dots. Then since the “triangle of the next larger” (number than n) must have $\frac{(n+1)(n+2)}{2}$ dots, we see that Fermat’s claim can be expressed as

\[
n\,\frac{(n+1)(n+2)}{2} = 3 \sum_{i=1}^{n} \frac{i(i+1)}{2}.
\]

Now notice that we can separate the right side via

\[
\sum_{i=1}^{n} \frac{i(i+1)}{2} = \sum_{i=1}^{n} \frac{i^2}{2} + \sum_{i=1}^{n} \frac{i}{2},
\]

and should now be able to solve for an explicit formula for $\sum_{i=1}^{n} i^2$, since we already have an explicit formula for $\sum_{i=1}^{n} i$.

Exercise 8. Solve for an explicit formula for $\sum_{i=1}^{n} i^2$ using Fermat’s claim, and check to see if it is the same formula that you found in a previous exercise from the equality of Archimedes.

The previous exercise performs exactly what Fermat had in mind when he said he could solve the sums of powers problem using his claims about figurate numbers.
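Fermat’s second claim, and the splitting of the pyramid count into a sum of squares plus a sum of first powers, can be spot-checked by counting dots directly. This is only a sketch with illustrative function names; it counts by brute force and deliberately does not use the closed formula that Exercise 8 asks the reader to find.

```python
def triangle(n):
    """Dots in a triangle whose rows have lengths 1, 2, ..., n."""
    return sum(range(1, n + 1))

def pyramid(n):
    """Dots in a triangular pyramid built from triangles of edge length 1 through n."""
    return sum(triangle(i) for i in range(1, n + 1))

def second_claim_holds(n):
    """The last number times the triangle of the next larger equals three times the pyramid."""
    return n * triangle(n + 1) == 3 * pyramid(n)

def split_holds(n):
    """Twice the pyramid count equals the sum of squares plus the sum of first powers."""
    squares = sum(i * i for i in range(1, n + 1))
    return 2 * pyramid(n) == squares + triangle(n)

print(all(second_claim_holds(n) and split_holds(n) for n in range(1, 101)))
```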

Exercise 9. We have seen how to obtain the formula for a sum of squares from the equality of Archimedes, and also from Fermat’s second claim about figurate numbers. But so far we have proofs for neither of these precursors, so just for good measure, prove the formula for sums of squares by mathematical induction.

Moving towards higher powers and higher figurate numbers, we note here that formulas for sums of cubes and sums of fourth powers were also known centuries before Fermat, and some work towards general methods had been achieved. To give a flavor of the diversity of sources, knowledge of a formula for sums of cubes appears implicitly in the work of the neo-Pythagorean Nicomachus of Gerasa in the first century c.e., and explicitly in verse by Āryabhaṭa in India written in 499 c.e., along with a proof by the Islamic mathematician Abū Bakr al-Karajī (c. 1000 c.e.) of the House of Wisdom established in Baghdad in the ninth century. The Egyptian mathematician Abū ‘Alī al-Ḥasan ibn al-Haytham (965–1039) gives us the first steps along a path toward understanding these formulas in general, in which he finds formulas for sums of fourth powers in order to find the volume of a general paraboloid of revolution (in contemporary terms this involves integrating $x^4$). His complicated method could in principle be generalized to higher powers [14]. What we do not see in these earlier works is a connection between the figurate numbers and sums of powers. This is what Fermat brought to the scene, and to which we return.

Exercise 10. Guess a formula for sums of cubes: First calculate the first six sums and look for a pattern. Then prove by mathematical induction that your guess is correct.

Exercise 11. Above we showed how Fermat’s second claim about figurate numbers could be used to derive a formula for a sum of squares. Generalize this approach to obtain the formula for a sum of cubes from Fermat’s third claim. Of course a collateral triangulo-triangle is the generalization of a pyramid to a four-dimensional object consisting of n stacked triangular pyramids with edge lengths ranging from 1 to n. Discuss what would be involved in carrying this to even higher powers.

Exercise 12. From Fermat’s claim above, derive a formula for the number of dots in a 3-dimensional triangular pyramid with n dots on an edge.

Exercise 13. From Fermat’s final claim above, derive a formula for the number of dots in a 4-dimensional triangulo-triangle with n dots on an edge. Generalize to formulas for higher dimensional ‘pyramids’.

Since Fermat did not reveal his methods, it is up to us to decipher what he may have been thinking when he made his claims about figurate numbers, and to see if they are true. We will be able to do this by studying the figurate numbers and discovering their agreement with the numbers in the “arithmetical triangle”5 (Figure 4) of his contemporary and correspondent Blaise Pascal. The numbers in the arithmetical triangle have yet other extremely important properties in combinatorial mathematics, namely in their roles as combination numbers and binomial coefficients.

Fermat’s claims are not merely geometric interpretations of relationships between certain whole numbers; they are actually discrete analogues of results we already know about continuous objects like areas and volumes. For instance, his first claim is a discrete version of the fact that for any line segment, the product of the segment with itself creates a rectangle with area twice that of the triangle with that side and height. Fermat’s use of the “next larger number” for one of the discrete lengths is exactly what is necessary to ensure a precise two-to-one ratio of the discrete count of dots in a rectangular figure to the count of dots in an inscribed triangle at the corners, as we know happens for the continuous measurements of the analogous areas. This is visible geometrically in the packing of dots in the middle of Figure 2. Similarly, when he says “the last number multiplied by the triangle of the next larger is three times the collateral pyramid,” he is expressing a delicate discrete version of the fact that a three-dimensional pyramid on a triangular base has volume one-third that of the prism created with the same base and height. Fermat claims that similar relationships hold in all dimensions.

5 Today called Pascal’s triangle.

Figure 4: Pascal’s arithmetical triangle

Let us formally define the figurate numbers by their recursive piling up property. Here $F_{n,j}$ will denote the figurate number that is j-dimensional with n dots along each edge. Call n its edge length. For instance, $F_{3,2}$ is the planar triangular number with 3 dots per edge, so by counting up its rows of dots, $F_{3,2} = 1 + 2 + 3 = 6$. The piling up property, by which we define larger figurate numbers from smaller ones, is encoded in the formula $F_{n+1,j+1} = F_{n,j+1} + F_{n+1,j}$. This formalizes the idea that to increase the edge length of a (j + 1)-dimensional figurate number from n to n + 1, we simply add on another layer at its base, consisting of a j-dimensional figurate number of edge length n + 1 (Figure 3). This defining formula is called a recursion relation, since it defines each of the figurate numbers in terms of those of lower dimension and edge length, provided we start correctly by specifying those numbers with the smallest dimension and those with the smallest edge length. While we could start from dimension one, it is useful to begin with zero-dimensional figurate numbers, all of which we will define to have the value one. We thus define

\[
\begin{aligned}
F_{n,0} &= 1 && (n \ge 1),\\
F_{1,j} &= 1 && (j \ge 0),\\
F_{n+1,j+1} &= F_{n+1,j} + F_{n,j+1} && (n \ge 1,\ j \ge 0).
\end{aligned}
\]

We must check that our starting data determine just what was intended for Fermat’s figurate numbers. We have required that any figurate number with edge length one will have exactly one dot, no matter what its dimension. And the recursion relation clearly produces the desired one-dimensional figurate numbers $F_{n,1} = n$ for all $n \ge 1$ from the starting data of zero-dimensional numbers.

Now Fermat’s claimed formulas to Mersenne assume the form

\[
\begin{aligned}
n F_{n+1,1} &= 2 F_{n,2},\\
n F_{n+1,2} &= 3 F_{n,3},\\
n F_{n+1,3} &= 4 F_{n,4},\\
&\ \ \vdots
\end{aligned}
\]

Exercise 14. Explain why Fermat’s claimed formulas to Mersenne assume the form

\[
\begin{aligned}
n F_{n+1,1} &= 2 F_{n,2},\\
n F_{n+1,2} &= 3 F_{n,3},\\
n F_{n+1,3} &= 4 F_{n,4},\\
&\ \ \vdots
\end{aligned}
\]
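The recursive definition above translates almost word for word into a short program, which makes it easy to tabulate figurate numbers and to test the three displayed claims numerically. The following Python sketch uses the illustrative name `figurate(n, j)` for $F_{n,j}$; memoization through `functools.lru_cache` is an implementation convenience, not something taken from the text.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def figurate(n, j):
    """F_{n,j}: the j-dimensional figurate number with edge length n (n >= 1, j >= 0)."""
    if j == 0 or n == 1:
        return 1  # starting data: F_{n,0} = 1 and F_{1,j} = 1
    # recursion relation, written with n and j in place of n+1 and j+1
    return figurate(n, j - 1) + figurate(n - 1, j)

# The triangular number with 3 dots per edge.
print(figurate(3, 2))  # 6

# Fermat's first three claims, for edge lengths 1 through 20.
for n in range(1, 21):
    assert n * figurate(n + 1, 1) == 2 * figurate(n, 2)
    assert n * figurate(n + 1, 2) == 3 * figurate(n, 3)
    assert n * figurate(n + 1, 3) == 4 * figurate(n, 4)
```

Passing these checks shows only that the claims hold for the values tried; the argument for all n comes later through Pascal’s Twelfth Consequence.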

We shall study the figurate numbers further to see why Fermat’s claim about them is true, and simultaneously prepare the groundwork for reading Pascal’s work on sums of powers. The reader may have noticed that the individual figurate numbers seem familiar, from looking at Pascal’s arithmetical triangle (Figure 4). The numbers shown there seem to match the figurate numbers, i.e., $F_{n,j}$ appears in Pascal’s “parallel row” (horizontal) labeled n and “perpendicular row” (vertical) labeled j + 1. Indeed, in his Treatise on the Arithmetical Triangle, Pascal defined the numbers in the triangle by starting the process off with a 1 in the corner, and defined the rest simply by saying that each number is the sum of the two numbers directly above and directly to the left of it, which corresponds precisely to the recursion relation and starting data by which we formally defined the figurate numbers. So they must agree.

Perhaps you already also recognize the numbers in the arithmetical triangle as binomial coefficients or combination numbers. The numbers occurring along Pascal’s ruled diagonals in Figure 4 appear to be the coefficients in the expansion of a binomial; for instance, the coefficients of $(a + b)^4 = 1a^4 + 4a^3b + 6a^2b^2 + 4ab^3 + 1b^4$ occur along the diagonal Pascal labels with 5. Our modern notation for these binomial coefficients is that $\binom{m}{j}$ denotes the coefficient of $a^{m-j} b^j$ in the expansion of $(a + b)^m$. Indeed, if in the arithmetical triangle we index both the diagonals and their individual entries beginning with zero (rather than one, as Pascal does), then the entry in diagonal m at column j will be the binomial coefficient $\binom{m}{j}$. This is easy to prove, by showing that they, too, satisfy the recursion relation and starting data of the arithmetical triangle.

Exercise 15. Show that the coefficients in the expansion of a binomial satisfy the starting data and the recursion relation of the arithmetical triangle. In other words, if for all $m \ge 0$ we write $(a + b)^m = \sum_{j=0}^{m} \binom{m}{j} a^{m-j} b^j$, show that these coefficients satisfy the starting data $\binom{m}{0} = \binom{m}{m} = 1$, and Pascal’s recursion relation $\binom{m+1}{j} = \binom{m}{j} + \binom{m}{j-1}$. (Hint: write $(a + b)^{m+1} = (a + b)(a + b)^m$.)

Since we have now identified both the figurate numbers and the binomial coefficients as the numbers in the arithmetical triangle generated by the basic recursion relation, their precise relationship follows just by comparing their indexing: $F_{i+1,j}$ is the number in Pascal’s arithmetical triangle in the row and column he labels i + 1 and j + 1, which we reindexed (starting with zero) to be labeled as i and j, which is in diagonal i + j and column j, so it is the binomial coefficient $\binom{i+j}{j}$. We have

\[
F_{i+1,j} = \binom{i+j}{j} \quad \text{for } i, j \ge 0.
\]

We can also verify a closed formula in terms of factorials for the numbers in the triangle:

\[
\binom{m}{j} = \frac{m!}{j!\,(m-j)!} = \frac{m(m-1)\cdots(m-j+1)}{j!} \quad \text{for } 0 \le j \le m,
\]

where the notation i! (read “i factorial”) is defined to mean $i \cdot (i-1) \cdot (i-2) \cdots 3 \cdot 2 \cdot 1$, and 0! is defined to be 1.

Exercise 16. Show that

\[
\binom{m}{j} = \frac{m!}{j!\,(m-j)!} \quad \text{for } 0 \le j \le m.
\]

(Hint: Show that this factorial formula satisfies the starting data and the recursion relation of the arithmetical triangle.)

The ubiquitous numbers in the arithmetical triangle had already been in use for over 500 years, in places ranging from China to the Islamic world, before Pascal developed and applied its properties in his Traité du Triangle Arithmétique (Treatise on the Arithmetical Triangle), written by 1654 [12]. A key fact, which Pascal called his Twelfth Consequence, is that neighboring numbers along a diagonal in the triangle are always in a simple ratio:

\[
(m - j) \binom{m}{j} = (j + 1) \binom{m}{j+1} \quad \text{for } j < m,
\]

which is easily obtained from the factorial formula above.

Exercise 17. Prove Pascal’s Twelfth Consequence from the factorial formula for binomial coefficients.

Translated into figurate numbers, Pascal’s Twelfth Consequence reemerges as precisely Fermat’s claim in his letter to Mersenne! For instance, letting j = 2 and m = n + 2 yields

\[
n \binom{n+2}{2} = 3 \binom{n+2}{3},
\]

or $n F_{n+1,2} = 3 F_{n,3}$, exactly Fermat’s claim that “the last number multiplied by the triangle of the next larger is three times the collateral pyramid.” The reader may now easily confirm Fermat’s general claim.
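The identification of figurate numbers with binomial coefficients, and Pascal’s Twelfth Consequence itself, can both be checked on a range of values with a few lines of Python. In this sketch `figurate` is an illustrative helper that fills in one row of the arithmetical triangle at a time from the recursion relation, while `math.comb` and `math.factorial` are the standard library functions for binomial coefficients and factorials (`math.comb` requires Python 3.8 or later).

```python
from math import comb, factorial

def figurate(n, j):
    """F_{n,j} computed from the starting data and the recursion relation."""
    row = [1] * (j + 1)               # F_{1,0}, ..., F_{1,j}
    for _ in range(n - 1):            # raise the edge length one step at a time
        for col in range(1, j + 1):
            # F_{n+1,col} = F_{n+1,col-1} + F_{n,col}
            row[col] = row[col - 1] + row[col]
    return row[j]

# Indexing check: F_{i+1,j} equals the binomial coefficient "i+j choose j".
assert all(figurate(i + 1, j) == comb(i + j, j)
           for i in range(0, 12) for j in range(0, 8))

# Factorial formula and Pascal's Twelfth Consequence.
for m in range(0, 15):
    for j in range(0, m + 1):
        assert comb(m, j) * factorial(j) * factorial(m - j) == factorial(m)
        if j < m:
            assert (m - j) * comb(m, j) == (j + 1) * comb(m, j + 1)

# The instance quoted in the text, with j = 2 and m = n + 2.
n = 7
print(n * comb(n + 2, 2), 3 * comb(n + 2, 3))
```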

Exercise 18. State Fermat’s general claim about figurate numbers, and prove it from Pascal’s Twelfth Consequence.

We have achieved our goal of understanding and verifying Fermat’s claims about figurate numbers, although we cannot know that this is precisely how Fermat thought of things. In the process we have connected the figurate numbers to the arithmetical triangle and binomial coefficients, and found a factorial formula for them. All this will prove to be a powerful shift. We have also seen how Fermat could use his knowledge of figurate numbers to find formulas for sums of powers, at least for squares and cubes. While it is clear that the process can be continued indefinitely, it quickly becomes impractically complicated, and it is also not clear that it yields any new general insight about sums of powers. However, Pascal too wrote about sums of powers, using the binomial coefficient meaning of the figurate numbers to great advantage, and this is where we continue.

Pascal’s Arithmetical Triangle, binomials, and sums of powers of arithmetic progressions

Pascal wrote two treatises of interest to us at around the same time. In his Treatise on the Arithmetical Triangle [11, v. 30], Pascal made a systematic study of the numbers in his triangle, simultaneously encompassing their figurate, combinatorial, and binomial roles. Although these numbers had emerged in the mathematics of several cultures over many centuries [12], Pascal was the first to connect binomial coefficients with combinatorial coefficients in probability. A major motivation for Pascal’s treatise was a question from the beginnings of probability theory, about the equitable division of stakes in an interrupted game of chance. The question had been posed to Pascal around 1652 by Antoine Gombaud, the Chevalier de Méré, who wanted to improve his chances at gambling: Suppose two players are playing a fair game, to continue until one player wins a certain number of rounds, but the game is interrupted before either player reaches the winning number. How should the stakes be divided equitably, based on the number of rounds each player has won [12, p. 431, 451ff]? The solution requires the combinatorial properties inherent in the numbers in the arithmetical triangle, as Pascal demonstrated in his Treatise, since they count the number of ways various occurrences can combine to produce a given result.

Blaise Pascal (1623–1662) was born in Clermont-Ferrand, in central France. When he was a teenager, his father introduced him to meetings of Marin Mersenne’s circle of mathematical discussion in Paris. He quickly became involved in the development of projective geometry, the first in a series of highly creative mathematical and scientific episodes in his life, punctuated by periods of religious fervor. Around age twenty-one he spent several years developing a mechanical addition and subtraction machine, in part to help his father in tax computations as a local administrator. It was the first of its kind ever to be marketed.
Then for several years he was at the center of investigations of the problem of the vacuum, which led to an understanding of barometric pressure. In fact, the scientific unit of pressure is named the pascal. He is also known for Pascal’s law on the behavior of fluid pressure. Around 1654 Pascal conducted his studies on the arithmetical triangle and its relationship to probability. His correspondence with Fermat in that year marks the beginning of probability theory. Several years later, Pascal refined his ideas on area problems via the method of indivisibles already being developed by others, and solved various problems of areas, volumes, centers of gravity, and lengths of curves.6 After only two years of work on the calculus of indivisibles, Pascal fell gravely ill, abandoned almost all intellectual work to devote himself to prayer and charitable work, and died three years later at age thirty-nine. In addition to his work in mathematics and physics, Pascal is prominent for his Provincial Letters defending Christianity, which gave rise to his posthumously published Pensées on religious philosophy [10, 6]. Pascal was an extremely complex person, and one of the outstanding scientists of the mid-seventeenth century, but we will never know how much more he might have accomplished with more sustained efforts and a longer life.

6 Later in the seventeenth century, Gottfried Wilhelm Leibniz (1646–1716), one of the two inventors of the infinitesimal calculus, which supplanted the method of indivisibles, explicitly credited Pascal’s approach as stimulating his own ideas on the so-called characteristic triangle of infinitesimals in his fundamental theorem of calculus.

Figure 5: Blaise Pascal

The relation of the arithmetical triangle to counting combinations, and thus their centrality in probability theory, follows easily from the factorial formula above for the triangle’s numbers. The reader may verify that $\binom{m}{j}$ represents the number of different combinations of j elements that can occur in a set of m elements.

Exercise 19. Prove that the number of distinct five-card hands possible from a standard deck of fifty-two playing cards is $\binom{52}{5}$. Then prove that $\binom{m}{j}$ is the number of different combinations of j elements that can occur in a set of m elements. Hint: First see why $m(m-1)\cdots(m-j+1)$ is the number of different ways of selecting a sequence of j elements from a sequence with m elements (where different orderings of the same elements count as different sequences).

We have seen that the numbers in the arithmetical triangle have three interchangeable interpretations: as figurate numbers, combination numbers, and binomial coefficients. Given this multifaceted nature, it is no wonder that they arose early on, in various manners and parts of the world, and that they are ubiquitous today. The arithmetical triangle in fact overflows with fascinating patterns.7

Exercise 20. (Optional) Prove that the sum of the numbers in each diagonal of the arithmetical triangle is a power of two.

In about the same year as his Treatise on the Arithmetical Triangle, Pascal produced the text we will now study, Potestatum Numericarum Summa (Sums of Numerical Powers) [18, v. III, pp. 341–367], which analyzes sums of powers of an arithmetic progression (a sequence of numbers with a fixed difference between each term and its successor) in terms of the numbers in the arithmetical triangle, interpreted as binomial coefficients. Pascal also makes the connection between these results and area problems via the method of indivisibles.

Fermat’s great enthusiasm in 1636 for the problem of calculating sums of powers was not immediately embraced by others, and Pascal, although a direct correspondent of Fermat’s, was apparently unaware of Fermat’s work when he made his own analysis about eighteen years later in Sums of Numerical Powers. Here we will see the transition from a geometric to algebraic approach almost complete, since Pascal, unlike others before him, and even Fermat, is bent on presenting a generalized arithmetic solution for the problem, albeit still using mostly verbal descriptions of formulas, instead of modern algebraic notation. We will find that Pascal obtains a compact formula directly relating sums of powers for various exponents, using binomial coefficients as the intermediary.
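The combinatorial reading of the triangle’s numbers discussed above lends itself to a brute-force check: list every j-element selection from a small set and count them. The following Python sketch uses the standard library’s `itertools.combinations` and `math.comb`; the helper names are illustrative. It also tabulates the sum along each diagonal, taking diagonal m (indexed from zero) to consist of the binomial coefficients with upper index m.

```python
from itertools import combinations
from math import comb

def count_combinations(m, j):
    """Count the j-element subsets of an m-element set by listing them all."""
    return sum(1 for _ in combinations(range(m), j))

def diagonal_sum(m):
    """Sum of the entries on diagonal m of the arithmetical triangle (indexed from zero)."""
    return sum(comb(m, j) for j in range(m + 1))

# Brute-force counts agree with the binomial coefficients for small sets.
assert all(count_combinations(m, j) == comb(m, j)
           for m in range(0, 10) for j in range(0, m + 1))

# Number of five-card hands from a fifty-two card deck.
print(comb(52, 5))  # 2598960

# Diagonal sums for the first few diagonals.
print([diagonal_sum(m) for m in range(8)])
```

Listing all five-card hands would also work but is slow, which is one practical reason to want the factorial formula.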

∞∞∞∞∞∞∞∞

Blaise Pascal, from Sums of Numerical Powers

Remark.

Given, starting with the unit, some consecutive numbers, for example 1, 2, 3, 4, one knows, by the methods the Ancients made known to us, how to find the sum of their squares, and also the sum of their cubes; but these methods, applicable only to the second and third degrees, do not extend to higher degrees. In this treatise, I will teach how to calculate not only the sum of squares and of cubes, but also the sum of the fourth powers and those of higher powers up to infinity: and that, not only for a sequence of consecutive numbers beginning with the unit, but for a sequence beginning with any number, such as the sequence 8, 9, 10,.... And I will not restrict myself to the natural sequence of numbers: my method will apply also to a progression having as ratio [difference] 2, 3, 4, or any other number, that is to say to a sequence of numbers different by two units, like 1, 3, 5, 7,..., 2, 4, 6, 8,..., or differing by three units like 1, 4, 7, 10, 13,.... And what is more, whatever the first term in the sequence may be: if the first term is 1, as in the sequence with ratio three, 1, 4, 7, 10,...: or if it is another term in the progression, as in the sequence 7, 10, 13, 16, 19; or even if it is alien to the progression, as in the sequence with ratio three, 5, 8, 11, 14,... beginning with 5. It is remarkable that a single general method will suffice to treat all these different cases. This method is so simple that it will be explained along several lines, and without the preparation of algebraic notations to which difficult demonstrations must have recourse. One can judge this after having read the following problem.

7 These patterns are explored in the project Treatise on the Arithmetical Triangle [1, pp. 185—195].

Definition.

Consider a binomial A + 3, whose first term is the letter A, and the second a number: raise this binomial to any power, the fourth for example, which gives

$$A^4 + 12 \cdot A^3 + 54 \cdot A^2 + 108 \cdot A + 81;$$

the numbers 12, 54, 108, which are multiplying the different powers of A, and are formed by the combination of the figurate numbers with the second term, 3, of the binomial, will be called the coefficients of A. Thus, in the cited example, 12 will be the coefficient of the cube of A; 54, that of the square; and 108, that of the first power. As for the number 81, it will be called the pure number.

Lemma

Suppose any number, like 14, be given, and a binomial 14 + 3, whose first term is 14 and the second any number 3, in such a manner that the difference of the numbers 14 and 14 + 3 will equal 3. Let us raise these numbers to a same power, the fourth for example: the fourth power of 14 is $14^4$, that of the binomial, 14 + 3, is

$$14^4 + 12 \cdot 14^3 + 54 \cdot 14^2 + 108 \cdot 14 + 81.$$

In this expression, the powers of the first term, 14, of the binomial are obviously affected by the same coefficients as the powers of A in the expansion of $(A + 3)^4$. This put down, the difference of the two fourth powers, $14^4$ and $14^4 + 12 \cdot 14^3 + 54 \cdot 14^2 + 108 \cdot 14 + 81$, is $12 \cdot 14^3 + 54 \cdot 14^2 + 108 \cdot 14 + 81$; this difference comprises: on the one hand, the powers of 14 whose degree is less than the proposed degree 4, these powers being affected by the coefficients which the same powers of A have in the expansion of $(A + 3)^4$; on the other hand, the number 3 (the difference of the proposed numbers) raised to the fourth power [because the absolute number 81 is the fourth power of the number 3].

From this we deduce the following Rule: The difference of like powers of two numbers comprises: the difference of these numbers raised to the proposed power; plus the sum of all the powers of lower degree of the smaller of the two numbers, these powers being respectively multiplied by the coefficients which the same powers of A have in the expansion of a binomial raised to the proposed power and having as first term A and as second term the difference of the given numbers. Thus, the difference of $14^4$ and $11^4$ will be

$$12 \cdot 11^3 + 54 \cdot 11^2 + 108 \cdot 11 + 81,$$

since the difference of the first powers is 3. And so forth.

A single general method for finding the sum of like powers of the terms of any progression.

Given, beginning with any term, any sequence of terms of an arbitrary progression, find the sum of like powers of these terms raised to any degree.

Suppose an arbitrary number 5 is chosen as the first term of a progression whose ratio [difference], arbitrarily chosen, is for example three; consider, in this progression, as many of the terms as one wishes, for instance the terms 5, 8, 11, 14, and raise these terms to any power, suppose to the cube. The question is to find the sum of the cubes $5^3 + 8^3 + 11^3 + 14^3$. These cubes are 125, 512, 1331, 2744; and their sum is 4712. Here is how one finds this sum.

Let us consider the binomial A + 3 having as first term A and as second term the difference of the progression. Raise this binomial to the fourth power, the power immediately higher than the proposed degree three; we obtain the expression

$$A^4 + 12 \cdot A^3 + 54 \cdot A^2 + 108 \cdot A + 81.$$

This admitted, we consider the number 17, which, in the proposed progression, immediately follows the last term considered, 14. We take the fourth power of 17, known as 83521, and subtract from it:

First: the sum 38 of the terms considered, 5 + 8 + 11 + 14, multiplied by the number 108 which is the coefficient of A;

Second: the sum of the squares of the same terms 5, 8, 11, 14, multiplied by the number 54, which is the coefficient of $A^2$.

And so on, in case one still has the powers of A of lesser degree than the proposed degree three. With these subtractions made, one subtracts also the fourth power of the first term proposed, 5. Finally one subtracts the number 3 (ratio [difference] of the progression) itself raised to the fourth power and taken as many times as one considers terms in the progression, here four times. The remainder of the subtraction will be a multiple of the sum sought; it will be the product of this sum with the number 12, which is the coefficient of $A^3$, that is to say the coefficient of the term A raised to the proposed power three.

Thus, in practice, one must form the fourth power of 17, being 83521, then subtracting from it successively:

First, the sum of the terms proposed, 5 + 8 + 11 + 14, being 38, multiplied by 108, that is, the product 4104;

Then the sum of the squares of the same terms, $5^2 + 8^2 + 11^2 + 14^2$, or 25 + 64 + 121 + 196, or again 406, which, multiplied by 54, gives 21924;

Then the number 5 to the fourth power, which is 625;

Finally the number 3 to the fourth power, being 81, multiplied by four, which gives 324.

In summary one must subtract the numbers 4104, 21924, 625, 324, whose sum is 26977. Taking this sum away from 83521, there remains 56544. The remainder thus obtained is equal to the sum sought, 4712, multiplied by 12; and, in fact, 4712 multiplied by 12 equals 56544. The rule is, as one sees, easy to apply. Here now is how one proves it.
The number 17 raised to the fourth power, which one writes $17^4$, is equal to

$$17^4 - 14^4 + 14^4 - 11^4 + 11^4 - 8^4 + 8^4 - 5^4 + 5^4.$$

In this expression, only the term $17^4$ appears with the single sign +; the other terms are in turns added and subtracted. But the difference of the terms 17 and 14 is 3; likewise the difference of the terms 14 and 11, and of the terms 11 and 8, and of the terms 8 and 5. Thenceforth, according to our preliminary lemma:

$17^4 - 14^4$ equals $12 \cdot 14^3 + 54 \cdot 14^2 + 108 \cdot 14 + 81$.
Likewise $14^4 - 11^4$ equals $12 \cdot 11^3 + 54 \cdot 11^2 + 108 \cdot 11 + 81$.
Likewise $11^4 - 8^4$ equals $12 \cdot 8^3 + 54 \cdot 8^2 + 108 \cdot 8 + 81$.
Likewise $8^4 - 5^4$ equals $12 \cdot 5^3 + 54 \cdot 5^2 + 108 \cdot 5 + 81$.
The term $5^4$ does not need to be transformed.

One then finds as the value of $17^4$:

$$\begin{aligned}
& 12 \cdot 14^3 + 54 \cdot 14^2 + 108 \cdot 14 + 81 \\
+\; & 12 \cdot 11^3 + 54 \cdot 11^2 + 108 \cdot 11 + 81 \\
+\; & 12 \cdot 8^3 + 54 \cdot 8^2 + 108 \cdot 8 + 81 \\
+\; & 12 \cdot 5^3 + 54 \cdot 5^2 + 108 \cdot 5 + 81 \\
+\; & 5^4,
\end{aligned}$$

or, on interchanging the order of the terms:

$$\begin{aligned}
& 5 + 8 + 11 + 14 && \text{multiplied by } 108, \\
+\; & 5^2 + 8^2 + 11^2 + 14^2 && \text{multiplied by } 54, \\
+\; & 5^3 + 8^3 + 11^3 + 14^3 && \text{multiplied by } 12, \\
+\; & 81 + 81 + 81 + 81 \\
+\; & 5^4.
\end{aligned}$$

If therefore one subtracts on both sides the sum:

$$\begin{aligned}
& 5 + 8 + 11 + 14 && \text{multiplied by } 108, \\
+\; & 5^2 + 8^2 + 11^2 + 14^2 && \text{multiplied by } 54, \\
+\; & 81 + 81 + 81 + 81 \\
+\; & 5^4;
\end{aligned}$$

there remains $17^4$ diminished by the previously known :

$$\begin{aligned}
& -5 - 8 - 11 - 14 && \text{multiplied by } 108, \\
& -5^2 - 8^2 - 11^2 - 14^2 && \text{multiplied by } 54, \\
& -81 - 81 - 81 - 81 \\
& -5^4;
\end{aligned}$$

which will be found equal to the sum $5^3 + 8^3 + 11^3 + 14^3$ multiplied by 12. Q.E.D.

One may thus present as follows the statement and the general solution of the proposed problem.

The sum of powers

Given, beginning with any term, any sequence of terms of an arbitrary progression, find the sum of like powers of these terms raised to any degree.

We form a binomial having A as its first term, and for its second term the difference of the given progression; we raise this binomial to the degree immediately higher than the proposed degree, and we consider the coefficients of the various powers of A in the expansion obtained. Now we raise to the same degree the term that, in the given progression, immediately follows the last term considered. Then we subtract from the number obtained the following quantities:

First: The first term given in the progression, that is, the smallest of the given terms, itself raised to the same power (immediately higher than the proposed degree).

Second: The difference of the progression, raised to the same power, and taken as many times as there are terms considered in the progression.

Third: The sums of the given terms, raised to the various degrees less than the proposed degree, these sums being respectively multiplied by the coefficients of the same powers of A in the expansion of the binomial formed above.

The remainder of the subtraction thus accomplished is a multiple of the sum sought: it contains it as many times as unity is contained in the coefficient of the power of A whose degree is equal to the proposed degree.

NOTE

The reader himself will deduce practical rules that apply in each particular case. Suppose, for example, that one wishes to find the sum of a certain number of terms in the natural sequence [i.e., of natural numbers] beginning with an arbitrary number: here is the rule that one deduces from our general method: In a natural progression beginning with any number, the square of the number immediately above the last term, diminished by the square of the first term and the number of terms given, is equal to double the sum of the stated terms. Suppose given a sequence of any consecutive numbers whose first term is arbitrary, for example the four numbers 5, 6, 7, 8: I say that $9^2 - 5^2 - 4$ equals the double of 5 + 6 + 7 + 8. One will easily obtain analogous rules giving the sums of powers of higher degrees and which apply to all progressions.

Conclusion.

Any who are a little acquainted with the doctrine of indivisibles will not fail to see what profit one may make from the preceding results for the determination of curvilinear areas. These results permit the immediate squaring of all types of parabolas and an infinity of other curves. If then we extend to continuous quantities the results found for numbers, by the method expounded above, we will be able to state the following rules:

Rules relating to the natural progression beginning with unity.

The sum of a certain number of lines is to the square of the largest as 1 is to 2. The sum of the squares of the same lines is to the cube of the largest as 1 is to 3. The sum of their cubes is to the fourth power of the largest as 1 is to 4.

General rule relating to the natural progression beginning with unity.

The sum of like powers of a certain number of lines is to the immediately greater power of the largest among them as unity is to the exponent of this same power.

I will not pause here for the other cases, because this is not the place to study them. It will be enough for me to have cursorily stated the preceding rules. One can discover the others without difficulty by relying on the principle that one does not increase a continuous magnitude when one adds to it, in any number one wishes, magnitudes of a lower8 order of infinitude. Thus points add nothing to lines, lines to surfaces, surfaces to solids; or, to speak in numbers as is proper in an arithmetical treatise, roots do not count in regard to squares, squares in regard to cubes, and cubes in regard to square-squares. In such a way one must disregard, as nil, quantities of smaller order. I have insisted on adding these few remarks, familiar to those who practise indivisibles, in order to bring out the always wonderful connection that nature, in love with unity, establishes between objects distant in appearance. It appears in this example, where we see the calculation of the dimensions of continuous magnitudes joined with the summation of numerical powers.

∞∞∞∞∞∞∞∞

Pascal’s approach to sums of powers is rich with detail, and ends with his view on how this topic displays the connection between the continuous and the discrete. His idea of using a sum of equations in which one side “telescopes” via cancellations is masterful, and is a tool widely used in mathematics today. Pascal presents a rule obtained by generalizable example. The idea of a generalizable example is to prove the claim for a particular number, but in a way that clearly shows that it works for any number. This was a common method of proof for centuries, in part because there was no notation adequate to handle the general case, and in particular no way of using indexing as we do today to deal with sums of arbitrarily many terms. Pascal’s study of sums of powers is expanded over Fermat’s in scope on two fronts: to arithmetic progressions with arbitrary differences and to those beginning with any number. The reader should analyze whether his example convinces one of the general rule, and then apply it to obtain sums of fourth and fifth powers.
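Pascal’s verbal rule can be replayed mechanically as a check on his arithmetic. The following Python sketch is a modern paraphrase under stated assumptions, not Pascal’s own formulation: the function name `pascal_power_sum` and its parameter names are invented here for illustration, the progression is assumed to have a nonzero integer difference, and the sums of lower powers that Pascal treats as already known are obtained by calling the same rule recursively.

```python
from math import comb


def pascal_power_sum(first, diff, count, degree):
    """Sum of the `degree`-th powers of the `count` terms
    first, first + diff, first + 2*diff, ... following Pascal's rule.

    Hypothetical helper for illustration; assumes diff != 0 and degree >= 1.
    """
    top = degree + 1                    # the degree "immediately higher"
    following = first + count * diff    # term just after the last one considered
    # Start from the higher power of the following term, then subtract the
    # higher power of the first term and `count` copies of diff**top.
    remainder = following**top - first**top - count * diff**top
    # Subtract the sums of lower powers, each multiplied by the coefficient
    # of A**j in the expansion of (A + diff)**top.
    for j in range(1, degree):
        coefficient = comb(top, j) * diff**(top - j)
        remainder -= coefficient * pascal_power_sum(first, diff, count, j)
    # What remains is the sought sum times the coefficient of A**degree.
    return remainder // (top * diff)


# Pascal's own example: the cubes of 5, 8, 11, 14.
print(pascal_power_sum(5, 3, 4, 3))  # 4712, the sum Pascal reports
```

For the progression 5, 8, 11, 14 the intermediate calls reproduce Pascal’s figures 38 and 406 for the sums of the terms and of their squares, and the final remainder is 56544 = 12 · 4712.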

Exercise 21. Apply Pascal’s method to obtain the sum of the fourth powers in the progression 5, 8, 11, 14, and then check your answer by direct calculation.

Exercise 22. Apply Pascal’s method to obtain the polynomial formula for the sum of the fourth powers in a natural progression beginning with one, i.e., $\sum_{i=1}^{n} i^4$.

Exercise 23. Apply Pascal’s method to obtain the polynomial formula for the sum of the fifth powers in a natural progression beginning with one, i.e., $\sum_{i=1}^{n} i^5$.

Pascal’s prescription does require us to substitute known formulas for sums with previous exponents, and then solve for the sum with desired exponent. But we can at least say that Pascal’s prescription represents the first general recipe for sums of powers, and it is an attractive formulation, connecting sums of powers to each other very directly using binomial coefficients. Let us transcribe his verbal prescription into modern notation for how to find the sum of the first n kth powers:

$$(k+1)\sum_{i=1}^{n} i^k = (n+1)^{k+1} - 1^{k+1} - n \cdot 1^{k+1} - \sum_{j=1}^{k-1}\binom{k+1}{j}\sum_{i=1}^{n} i^j.$$

We call this Pascal’s equation. The reader may verify that his method generalizes to produce what he claims in his verbal prescription for more general progressions.
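The recursion implicit in Pascal’s equation can be carried out with exact rational arithmetic. The sketch below is one possible illustration, not a method taken from the source: the names `power_sum_poly` and `evaluate` are hypothetical, polynomials are represented as tuples of coefficients of increasing powers of n, and the equation is used only for k ≥ 1 as in the text.

```python
from fractions import Fraction
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def power_sum_poly(k):
    """Coefficients (c_0, c_1, ..., c_{k+1}) such that
    1^k + 2^k + ... + n^k = c_0 + c_1*n + ... + c_{k+1}*n^(k+1),
    obtained from Pascal's equation (hypothetical helper, k >= 1)."""
    if k < 1:
        raise ValueError("this sketch uses Pascal's equation for k >= 1 only")
    # Expand (n + 1)^(k+1), then subtract 1^(k+1) and n * 1^(k+1).
    coeffs = [Fraction(comb(k + 1, m)) for m in range(k + 2)]
    coeffs[0] -= 1
    coeffs[1] -= 1
    # Subtract binom(k+1, j) times the polynomial for each lower exponent j.
    for j in range(1, k):
        for m, c in enumerate(power_sum_poly(j)):
            coeffs[m] -= comb(k + 1, j) * c
    # Divide by (k + 1) to isolate the sum of k-th powers.
    return tuple(c / (k + 1) for c in coeffs)


def evaluate(coeffs, n):
    """Value of the polynomial with the given coefficients at n."""
    return sum(c * n**m for m, c in enumerate(coeffs))


# The familiar formula for squares: n/6 + n^2/2 + n^3/3.
print([str(c) for c in power_sum_poly(2)])
```

For k = 2 the coefficients come out as 0, 1/6, 1/2, 1/3, which is the classical formula for the sum of squares; larger k can be checked against a direct sum with `evaluate`.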

Exercise 24. (Optional) Write out Pascal’s general result, in modern notation, and provide a proof (based on the method of his example) to justify his general prescription for a sum of powers of any arithmetic progression, i.e., with arbitrary difference and beginning with any number. Include a modern formulation of his , and apply it to compute some examples.

8 The French version mistakenly says “higher” here.

We can use Pascal’s equation to confirm patterns that have slowly been emerging, namely that sums-of-powers have a particular degree and predictable leading and trailing coefficients:

$$\sum_{i=1}^{n} i^k = \frac{n^{k+1}}{k+1} + \frac{1}{2}n^k + {?}\,n^{k-1} + \cdots + {?}\,n + 0 \quad \text{for } k \ge 1,$$

where the numbers ? remain a mystery at this point. We can confirm these features and even push one step further to discover and confirm a simple pattern for the coefficients of $n^{k-1}$ in the polynomial formulas.

Exercise 25. In the text and exercises we have obtained explicit polynomial formulas for sums of powers up to exponent five. From these we conjecture

$$\sum_{i=1}^{n} i^k = \frac{n^{k+1}}{k+1} + \frac{1}{2}n^k + {?}\,n^{k-1} + \cdots + {?}\,n + 0.$$

Use Pascal’s equation to prove these observed patterns for all k. (Hint: mathematical induction. You will need a strong form of induction, in which you assume the truth of all preceding statements, not just the one prior to the one you are trying to verify. Why is this stronger form of mathematical induction a valid method of proof?) Then push one step further to conjecture and prove a pattern for the coefficient of $n^{k-1}$ in the formulas.

At this point we can also use Pascal’s equation to prove that sums of powers satisfy Roberval’s inequalities discussed earlier, which he used to find the areas under all the higher parabola curves.

Exercise 26. Prove the inequality $\sum_{i=1}^{n} i^k < (n+1)^{k+1}/(k+1)$ of Roberval using Pascal’s equation. Can you also use it to prove his second inequality $n^{k+1}/(k+1) < \sum_{i=1}^{n} i^k$, which is harder to show?

The patterns in the coefficients of the polynomial formula for sums of powers above begin to reveal more of the connection between the continuous and the discrete. The term $\frac{n^{k+1}}{k+1}$ is the area $\int_0^n x^k\,dx$ under the curve $y = x^k$ between 0 and n. The left side of the above equation is the area of a right-endpoint approximating sum of rectangles for this area, and thus the rest constitutes “correcting terms” interpolating between the area under the curve and the sum of rectangles. The second term, $\frac{1}{2}n^k$, amounts to improving the right-endpoint approximation to a trapezoidal approximation, and the next term also has geometric interpretation as a further correction.
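The remark about the trapezoidal approximation can be illustrated numerically. The sketch below assumes unit-width subintervals on [0, n] and k ≥ 1; the function names `right_endpoint_sum` and `trapezoid_sum` are illustrative choices, not notation from the text. It compares the two approximating sums and shows that they differ by exactly half of n^k.

```python
from fractions import Fraction


def right_endpoint_sum(k, n):
    """Area of the circumscribing unit-width rectangles under y = x^k on [0, n]."""
    return sum(i**k for i in range(1, n + 1))


def trapezoid_sum(k, n):
    """Area of the unit-width trapezoids under y = x^k on [0, n]."""
    return sum(Fraction((i - 1)**k + i**k, 2) for i in range(1, n + 1))


# The gap between the two approximations is n^k / 2 (here with k = 3, n = 10).
k, n = 3, 10
gap = right_endpoint_sum(k, n) - trapezoid_sum(k, n)
print(gap, Fraction(n**k, 2))
```

The equality holds because the half-differences of consecutive kth powers telescope, the same device Pascal uses in his proof.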

Exercise 27. There is a simple geometric interpretation of the second term in the polynomial formula above. Draw a picture illustrating the difference between the region under the curve $y = x^k$ for $0 \le x \le n$ and the region of circumscribing rectangles with ends at values. Interpreting their areas as $\int_0^n x^k\,dx = \frac{n^{k+1}}{k+1}$ and $\sum_{i=1}^{n} i^k$, find an interpretation in the picture of how the term $\frac{1}{2}n^k$ above represents part of the region between these two, and explain what its connection is to the trapezoid rule from calculus as a numerical approximation for definite integrals. This should suggest to you the sign of the next term in the formula above. What should it be and why?

We can speculate that the other coefficients in these polynomials continue to follow an interesting pattern. We pursue this in the next part of our story, emerging at the turn of the eighteenth century in the work of Jakob Bernoulli (1654–1705).

We end this section by remarking on Pascal’s Conclusion about indivisibles and the squaring (area) of higher parabolas. Clearly for him the connection between “dimensions of continuous magnitudes” and “summation of numerical powers” is striking and subtle, and was probably a prime motivation for his investigations on sums of powers in an era when many were vying to square higher parabolas and other curves. His view is that for continuous quantities, terms of “lower order of infinitude” (i.e., lesser dimension) add nothing, and one must “disregard [them] as nil,” so that the sums of powers formulas above become

$$\sum_{i=1}^{n} i^k \approx \frac{n^{k+1}}{k+1},$$

which is his statement about summing continuous quantities. Today we recognize this as analogous to our integration formula

$$\int_0^n t^k\,dt = \frac{n^{k+1}}{k+1}$$

for the area under a higher parabola. Turning these analogies into a tight logical connection between discrete summation formulas and continuous area results was part of the long struggle to define and rigorize calculus, which began with the classical Greek mathematics exemplified by Archimedes, and lasted until well into the nineteenth century.

Exercise 28. Use

$$\sum_{i=1}^{n} i^k = \frac{n^{k+1}}{k+1} + \frac{1}{2}n^k + \cdots$$

as obtained in an exercise above to prove that

$$\lim_{n \to \infty} \frac{\sum_{i=1}^{n} i^k}{n^{k+1}} = \frac{1}{k+1},$$

a result known as Wallis’s theorem (Wallis will appear in the next section). Utilize Wallis’s theorem and the modern definition of the integral as a limit of approximating sums to calculate

$$\int_0^x t^k\,dt = \frac{x^{k+1}}{k+1} \quad \text{for any real } x > 0.$$

Discuss how this supports what Pascal is arguing in his Conclusion.
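A numerical experiment can make the limit in Wallis’s theorem plausible before it is proved. The following sketch is only an illustration, with the hypothetical helper name `wallis_ratio`; it computes the ratio exactly and prints how far it lies from 1/(k + 1) as n grows.

```python
from fractions import Fraction


def wallis_ratio(k, n):
    """Exact ratio (1^k + 2^k + ... + n^k) / n^(k+1)."""
    return Fraction(sum(i**k for i in range(1, n + 1)), n**(k + 1))


# Distance from the conjectured limit 1/(k + 1) for growing n (here k = 3).
k = 3
for n in (10, 100, 1000):
    print(n, float(wallis_ratio(k, n) - Fraction(1, k + 1)))
```

The printed gaps shrink roughly in proportion to 1/n, consistent with the second term of the polynomial formula being of one degree lower than the first.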

Jakob Bernoulli finds a pattern

Jakob (Jacques, James) Bernoulli (1654–1705) was one of two spectacular mathematical brothers in a large family of mathematicians spanning several generations. The Bernoulli family had settled in Basel, Switzerland, when fleeing the persecution of Protestants by Catholics in the Netherlands in the sixteenth century. Jakob at first studied mathematics against the will of his father, who wanted him to become a minister, and then traveled widely to learn from prominent mathematicians and scientists in France, the Netherlands, and England. He was appointed professor of mathematics in Basel, and he and his younger brother Johann (Jean, 1667–1748) were among the first to fully absorb Gottfried Leibniz’s (1646–1716) newly invented methods of calculus, and to apply them to solve many fascinating mathematical questions. For instance, in 1697 Jakob used a differential equation to solve the brachistochrone problem, i.e., to find the curve down which a frictionless bead will slide from one point to another in the least time. His method began a new mathematical field, the calculus of variations, in which one seeks among all curves the one that maximizes or minimizes some property [12, pp. 547–549]. Bernoulli also used the calculus to discover numerous wonderful properties of the logarithmic spiral, leading him to request that this “spira mirabilis” be engraved on his tombstone [5, p. 417f], [23, pp. 148–153]. And he worked and published much on infinite series.

Figure 6: The mathematical Bernoulli family [5, p. 416]

Figure 7: Jakob Bernoulli

Jakob wrote the earliest substantial book on probability theory, Ars Conjectandi (The Art of Conjecturing). Its posthumous publication in 1713 contained much original work, including the pattern we have been seeking in the formulas for sums of powers. It should not surprise us that Jakob connects probability theory and sums of powers, for we have learned that the figurate numbers, binomial coefficients, and combination numbers are simply different interpretations of the numbers in the arithmetical triangle of Pascal.

Figure 8: Ars Conjectandi

While Pascal’s equation displayed a compact connection between sums of powers formulas for various exponents, its recursive nature still prevents quick and easy calculation of the polynomial formulas representing $\sum_{i=1}^{n} i^k$ for various values of k. Nor did it reveal any general pattern in all the coefficients of these polynomials, even though we suspect there is one. Bernoulli addresses both issues in his short addendum9 on sums of powers (Summae Potestatum) in a chapter of Ars Conjectandi on permutations and combinations [2, v. 3, pp. 164–167], [3], [21, pp. 85–90]. Here the Bernoulli numbers first appear, and our knowledge takes a tremendous leap forward. We will comment in detail after our source, but we note as an aid beforehand that Bernoulli uses the integral sign $\int$ to represent finite sums!

9 We are indebted to Daniel E. Otero for this translation from Latin.

∞∞∞∞∞∞∞∞

Jakob Bernoulli, from The Art of Conjecturing Part Two

A THEORY OF PERMUTATIONS AND COMBINATIONS On combinations of particular numbers of things; which leads to figurate numbers and their properties

[...] Scholium. We note in passing that many (among others, Faulhaber and Remmelin from Ulm, Wallis, Mercator in his Logarithmotechnia, Prestet) have engaged themselves in the study of figurate numbers. But I have found no one who has given a universal and scientific demonstration of this property. Wallis put forward his fundamental methods in the Arithm. Infinitorum, where he investigates inductively10 the ratios of the series of Squares, Cubes, or other powers of the natural numbers to the series, having as many terms, of the largest of these powers. From this he moves [...] to the study of Triangular, Pyramidal, and the remaining figurate numbers. But it would have been more convenient and appropriate in the nature of things had he instead first prepared a treatise on the figurate numbers, with universal and accurate demonstrations, and then later continued the investigation of sums of powers. For after all, the method of demonstration by induction is not particularly scientific, and besides, each series requires its own special methods. Those series which should be considered first, by general estimation, and whose natures are most fundamental and simple, are seen to be the figurate numbers, which are generated by addition, while the powers are generated by multiplication. Moreover, the series of figurates, beginning with their respective zeros, have exact fractional ratios with the series having the same number of constant terms equal to the largest of these,11 which is not necessarily so for the powers (at least not in a finite number of terms, regardless how many zeros, by excess or defect, are prefixed to it). Furthermore, from the knowledge of the sum of figurates, it is no more difficult to determine the sums of powers, and so the author has concluded from these first ideas, as I will now do most briefly.
Let there be given the series of natural numbers from unity: 1, 2, 3, 4, 5, etc., up to n, and suppose that we ask for the sums of these, or of their squares, their cubes, etc.: In the Table of Combinations12 the indefinite term in the [...]13 third column is found to be

$$\frac{\overline{n-1} \cdot \overline{n-2}}{1 \cdot 2} = \frac{nn - 3n + 2}{2},$$

and the sum of all the terms (that is, all $\frac{nn - 3n + 2}{2}$) is

$$\frac{n \cdot \overline{n-1} \cdot \overline{n-2}}{1 \cdot 2 \cdot 3} = \frac{n^3 - 3nn + 2n}{6};$$

this gives14

$$\int \frac{nn - 3n + 2}{2} \quad \text{or} \quad \int \tfrac{1}{2}nn - \int \tfrac{3}{2}n + \int 1 = \frac{n^3 - 3nn + 2n}{6}.$$

So

$$\int \tfrac{1}{2}nn = \frac{n^3 - 3nn + 2n}{6} + \int \tfrac{3}{2}n - \int 1.$$

But

$$\int \tfrac{3}{2}n = \tfrac{3}{2}\int n = \text{(by what was shown above) } \tfrac{3}{4}nn + \tfrac{3}{4}n,$$

and $\int 1 = n$; substituting these above gives

$$\int \tfrac{1}{2}nn = \frac{n^3 - 3nn + 2n}{6} + \frac{3nn + 3n}{4} - n = \tfrac{1}{6}n^3 + \tfrac{1}{4}nn + \tfrac{1}{12}n,$$

and by doubling, $\int nn$ (the sum of the squares of all n)

$$= \tfrac{1}{3}n^3 + \tfrac{1}{2}nn + \tfrac{1}{6}n.$$

[...]15 And by proceeding to higher powers in turn, we easily build up the following formulas:

Sums of Powers16 17

$$\begin{aligned}
\int n &= \tfrac{1}{2}nn + \tfrac{1}{2}n. \\
\int nn &= \tfrac{1}{3}n^3 + \tfrac{1}{2}nn + \tfrac{1}{6}n. \\
\int n^3 &= \tfrac{1}{4}n^4 + \tfrac{1}{2}n^3 + \tfrac{1}{4}nn. \\
\int n^4 &= \tfrac{1}{5}n^5 + \tfrac{1}{2}n^4 + \tfrac{1}{3}n^3 \ast - \tfrac{1}{30}n. \\
\int n^5 &= \tfrac{1}{6}n^6 + \tfrac{1}{2}n^5 + \tfrac{5}{12}n^4 \ast - \tfrac{1}{12}nn. \\
\int n^6 &= \tfrac{1}{7}n^7 + \tfrac{1}{2}n^6 + \tfrac{1}{2}n^5 \ast - \tfrac{1}{6}n^3 \ast + \tfrac{1}{42}n. \\
\int n^7 &= \tfrac{1}{8}n^8 + \tfrac{1}{2}n^7 + \tfrac{7}{12}n^6 \ast - \tfrac{7}{24}n^4 \ast + \tfrac{1}{12}nn. \\
\int n^8 &= \tfrac{1}{9}n^9 + \tfrac{1}{2}n^8 + \tfrac{2}{3}n^7 \ast - \tfrac{7}{15}n^5 \ast + \tfrac{2}{9}n^3 \ast - \tfrac{1}{30}n. \\
\int n^9 &= \tfrac{1}{10}n^{10} + \tfrac{1}{2}n^9 + \tfrac{3}{4}n^8 \ast - \tfrac{7}{10}n^6 \ast + \tfrac{1}{2}n^4 \ast - \tfrac{3}{20}nn. \\
\int n^{10} &= \tfrac{1}{11}n^{11} + \tfrac{1}{2}n^{10} + \tfrac{5}{6}n^9 \ast - 1n^7 \ast + 1n^5 \ast - \tfrac{1}{2}n^3 \ast + \tfrac{5}{66}n.
\end{aligned}$$

Indeed, a pattern can be seen in the progressions herein, which can be continued by means of this rule: Suppose that c is the value of any power; then the sum of all $n^c$ or

$$\begin{aligned}
\int n^c = {} & \frac{1}{c+1}n^{c+1} + \frac{1}{2}n^c + \frac{c}{2}An^{c-1} + \frac{c \cdot \overline{c-1} \cdot \overline{c-2}}{2 \cdot 3 \cdot 4}Bn^{c-3} \\
& + \frac{c \cdot \overline{c-1} \cdot \overline{c-2} \cdot \overline{c-3} \cdot \overline{c-4}}{2 \cdot 3 \cdot 4 \cdot 5 \cdot 6}Cn^{c-5} \\
& + \frac{c \cdot \overline{c-1} \cdot \overline{c-2} \cdot \overline{c-3} \cdot \overline{c-4} \cdot \overline{c-5} \cdot \overline{c-6}}{2 \cdot 3 \cdot 4 \cdot 5 \cdot 6 \cdot 7 \cdot 8}Dn^{c-7} \ldots \text{ \& so on,}
\end{aligned}$$

where the value of the power n continues to decrease by two until it reaches n or nn. The uppercase letters A, B, C, D, etc., in order, denote the coefficients of the final term of $\int nn$, $\int n^4$, $\int n^6$, $\int n^8$, etc., namely

$$A = \tfrac{1}{6}, \quad B = -\tfrac{1}{30}, \quad C = \tfrac{1}{42}, \quad D = -\tfrac{1}{30}.$$

These coefficients are such that, when arranged with the other coefficients of the same order, they add up to unity: so, for D, which we said signified $-\tfrac{1}{30}$, we have

$$\tfrac{1}{9} + \tfrac{1}{2} + \tfrac{2}{3} - \tfrac{7}{15} + \tfrac{2}{9}\ (+D) - \tfrac{1}{30} = 1.$$

By means of these formulas, I discovered in under a quarter hour’s work that the tenth (or quadrato-sursolid) powers of the first thousand numbers from unity, when collected into a sum, yield

91409924241424243424241924242500.

Clearly this renders obsolete the work of Ismael Bulliald18, who wrote so as to thicken the volumes of his Arithmeticae Infinitorum with demonstrations involving immense labor, unexcelled by anyone else, of the sums of up to the first six powers (which is only a part of what we have superseded in a single page).

∞∞∞∞∞∞∞∞

10 That is, by inductive, as opposed to deductive, reasoning. Inductive reasoning argues from particular cases (specific examples) towards a general conclusion. Deductive reasoning argues from a known principle to an unknown, from the general to the specific, or from a premise to a logical conclusion. When Bernoulli refers here to inductive reasoning, he is not referring to the method of deductive proof called mathematical induction, which is entirely different.

11 The reader may wish to decipher this claim, and see that it is actually equivalent to Fermat’s claims in his letter to Mersenne. Hint: Bernoulli means that to obtain an unchanging fractional ratio 1/(r + 1) using the sum of any series of r-dimensional figurate numbers in the numerator, one should prefix the sequence with r zeros for determining the number of constant terms in the denominator of the ratio. He then contrasts this with sums of powers, and compares with the infinite case; can you see what he is getting at?

12 That is, the arithmetical triangle.

13 We omit Bernoulli’s derivation of the sum of first powers, moving directly to a sum of squares.

14 Bernoulli uses a bar over an expression where we would today place parentheses around it.

15 Bernoulli continues on to a sum of cubes.

16 The symbol $\ast$ in Bernoulli’s table means that a term with a particular power of n is not shown because the coefficient is zero.

17 There is an error in the original published Latin table of sums of powers formulas. The last coefficient in the formula for $\int n^9$ should be $-\tfrac{3}{20}$, not $-\tfrac{1}{12}$; we have corrected this here.

Exercise 29. See whether you can duplicate Bernoulli’s claim that he calculated (by hand, of course) the sum of the tenth powers of the first thousand numbers in less than a quarter of an hour.
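Bernoulli’s figure can also be checked by machine, which is no substitute for the hand calculation the exercise asks for but does confirm the target. The sketch below assumes only the tenth-power formula from Bernoulli’s table; the names `BERNOULLI_CLAIM` and `tenth_power_formula` are illustrative. It compares the formula, a direct summation, and the number Bernoulli reports.

```python
from fractions import Fraction

# The value Bernoulli reports for 1^10 + 2^10 + ... + 1000^10.
BERNOULLI_CLAIM = 91409924241424243424241924242500


def tenth_power_formula(n):
    """Bernoulli's table entry for the sum of tenth powers, evaluated exactly."""
    n = Fraction(n)
    return (n**11 / 11 + n**10 / 2 + Fraction(5, 6) * n**9
            - n**7 + n**5 - n**3 / 2 + Fraction(5, 66) * n)


direct_sum = sum(i**10 for i in range(1, 1001))
print(direct_sum == BERNOULLI_CLAIM)
print(tenth_power_formula(1000) == BERNOULLI_CLAIM)
```

With n = 1000 every term of the formula is a power of ten times a simple fraction, which is what makes the computation feasible by hand.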

Interestingly, while Bernoulli indicates his familiarity with the work of several others on sums of powers, he mentions neither Fermat nor Pascal. John Wallis (1616–1703) had studied sums of powers in his Arithmetica Infinitorum of 1655, with the same motivation as Fermat and Pascal, to find the areas under higher parabolas. Bernoulli contrasts Wallis’s work with his own, including comparing sums of figurate numbers with sums of powers. While finite sums of powers do not behave as nicely as sums of figurate numbers, Bernoulli’s subsequent formulas shed light on the nature of the difference between them by providing a precise expression for $\sum_{i=1}^{n} i^k$ as a polynomial in n.

Notice Bernoulli’s summation notation as he proceeds to analyze sums of powers. The expression after the integral indicates both the general term and the ending index, i.e., he writes $\int n^7$ for $\sum_{i=1}^{n} i^7$. He also uses an asterisk to indicate “missing” terms, i.e., monomials with zero coefficient.

Exercise 30. Today we use the notation $\sum_{i=1}^{n} i^7$ instead of Bernoulli’s $\int n^7$. What are the advantages and disadvantages of the two notations?

Bernoulli first shows how to derive sum formulas for the first few exponents, using his knowledge of the arithmetical triangle, by exactly the same method we presented when considering Fermat’s claim to have solved the problem. He presents the results of calculation in a table of polynomials for sums up to the tenth powers. And now suddenly he claims:

∞∞∞∞∞∞∞∞ Indeed, a pattern can be seen in the progressions herein which can be continued by means of this rule:

∞∞∞∞∞∞∞∞ Perhaps readers will discover this pattern for themselves before reading closely Bernoulli’s description of it.

Exercise 31. (Optional) Guess, as did Bernoulli, the complete pattern of coefficients for sums of powers formulas just from the examples in Bernoulli’s table. Clearly the pattern is to be sought down each column of Bernoulli’s table. The key is to multiply each column of numbers by a common denominator, and then compare with the arithmetical triangle (computing the sequence of successive differences in a column, and the successive differences in that sequence, etc., may also help). Can you also express the general rule for calculating the special numbers A, B, C, D, . . . , which Bernoulli introduces? Hint: What happens when n = 1?

18 Bernoulli is pretty hard on Ismaël Bullialdus, about whom see, e.g., [7].

The reader should check that in modern notation, Bernoulli is claiming

$$\sum_{i=1}^{n} i^k = \frac{n^{k+1}}{k+1} + \frac{n^k}{2} + \frac{1}{k+1}\sum_{j=2}^{k}\binom{k+1}{j}B_j n^{k+1-j} \quad \text{for } k \ge 1,$$

where we have represented the special sequence of numbers that Bernoulli calls A, B, C, D, . . . by $B_{2m}$ for $m \ge 1$, and $B_{2m+1} = 0$. These numbers have been important in mathematics ever since their introduction here by Bernoulli.19 The great eighteenth century mathematician Leonhard Euler (1707–1783) christened them the Bernoulli numbers.

Bernoulli also claims that he can compute his special sequence of numbers A, B, C, D, . . . . First he notes that they

∞∞∞∞∞∞∞∞ in order, denote the coefficients of the final term of $\int nn$, $\int n^4$, $\int n^6$, $\int n^8$.

∞∞∞∞∞∞∞∞ Indeed, we notice that the coefficient of n in the general formula he gives is always the first occurrence of a new in the process. And he says:

∞∞∞∞∞∞∞∞ These coefficients are such that, when arranged with the other coefficients of the same order, they add up to unity.

∞∞∞∞∞∞∞∞ Here he is simply evaluating both sides of his general formula at n = 1. Since the left side is then 1, the kth formula simplifies to

$$1 = \frac{1}{k+1} + \frac{1}{2} + \frac{1}{k+1}\sum_{j=2}^{k}\binom{k+1}{j}B_j.$$

Since the last term in the sum is the newest Bernoulli number $B_k$, one can solve for it in terms of the previous ones. Thus the Bernoulli numbers are recursively defined by these formulas. He gives as an example the computation of $D = B_8 = -\frac{1}{30}$ from the formula for k = 8 and the previous numbers.

While this still leaves a step-by-step aspect to the determination of sums of powers formulas, the process is now greatly simplified. Moreover, we see a general pattern in the relationship between the coefficients for different values of k, since the Bernoulli numbers are the same in the formulas for all k. These are the great steps forward that Bernoulli provided beyond the work of Fermat and Pascal.

How might we attempt to verify the general validity of the pattern Bernoulli guessed? Since Pascal gave us an equation relating the sums of kth powers to those of lower powers, we should be able to proceed by strong mathematical induction on k, by simply substituting all the formulas of Bernoulli’s into Pascal’s equation to verify the inductive claim at each stage. All but one of Bernoulli’s formulas substituted in Pascal’s equation are assumed true inductively, and the kth is thus shown true by verifying the equality itself.
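The recursive definition obtained by setting n = 1 translates directly into a short computation. The sketch below is an illustration under stated assumptions: the function names `bernoulli_numbers` and `power_sum` are hypothetical, the numbers are indexed from 2 as in the modern formula in the text, and solving for the newest number uses the fact that its binomial coefficient equals k + 1.

```python
from fractions import Fraction
from math import comb


def bernoulli_numbers(upto):
    """Dictionary {k: B_k} for 2 <= k <= upto, from the 'add up to unity' rule."""
    b = {}
    for k in range(2, upto + 1):
        # Contribution of the previously found numbers B_2, ..., B_{k-1}.
        earlier = sum((comb(k + 1, j) * b[j] for j in range(2, k)), Fraction(0))
        # Solve 1 = 1/(k+1) + 1/2 + (earlier + (k+1)*B_k)/(k+1) for B_k.
        b[k] = 1 - Fraction(1, k + 1) - Fraction(1, 2) - earlier / (k + 1)
    return b


def power_sum(k, n):
    """1^k + ... + n^k from the general formula in modern notation (k >= 1)."""
    b = bernoulli_numbers(k)
    n = Fraction(n)
    tail = sum((comb(k + 1, j) * b[j] * n**(k + 1 - j) for j in range(2, k + 1)),
               Fraction(0))
    return n**(k + 1) / (k + 1) + n**k / 2 + tail / (k + 1)


# Bernoulli's A, B, C, D are the entries with even index 2, 4, 6, 8.
numbers = bernoulli_numbers(8)
print([str(numbers[k]) for k in (2, 4, 6, 8)])
```

Running the recursion up to k = 8 returns 1/6, -1/30, 1/42, -1/30 for the even indices, in agreement with the values A, B, C, D that Bernoulli lists.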

19 The evidence suggests that around the same time, Takakazu Seki (1642?—1708) in Japan also discovered the same numbers [20, 22].

Exercise 32. Prove Bernoulli’s claimed formulas by strong mathematical induction, in the manner suggested in the text, using Pascal’s equation, Bernoulli’s claims, and the Bernoulli numbers as defined recursively. At some point in your calculations you may need to prove and use the identity $\binom{a}{b}\binom{b}{c} = \binom{a}{c}\binom{a-c}{b-c}$. Hint: When substituting Bernoulli’s claims into Pascal’s equation, verify equality by calculating and comparing the coefficients for an arbitrary power of n on each side of the equation.

The reader may notice a pattern in the Bernoulli numbers not even mentioned by Bernoulli.

Exercise 33. What do you conjecture about the signs of the Bernoulli numbers? Compute several more Bernoulli numbers to see whether your conjecture has promise.

Exercise 34. Let’s verify that the odd Bernoulli numbers are zero!

a) Calculate
\[
(e^x - 1)\left(1 - \frac{x}{2} + \sum_{n=2}^{\infty} \frac{B_n}{n!}\, x^n\right).
\]
(Hint: First work out the first few terms of the product. The final answer is very simple.)

b) Next solve for $1 - \frac{x}{2} + \sum_{n=2}^{\infty} \frac{B_n}{n!}\, x^n$.

c) Then use this to prove that all the odd Bernoulli numbers are zero. (Hint: Move $\frac{x}{2}$ to the other side, then remember what an even function is.)

Exercise 35. Write a summary discussing what each of Fermat, Pascal, and Bernoulli accomplished that went beyond their predecessors’ knowledge.

Leonhard Euler in the eighteenth century was the first to prove Bernoulli’s claimed patterns in the coefficients of the sums of powers polynomials, as part of his development of spectacular new mathematics for finding sums of infinite series, a discovery called the Euler-Maclaurin summation formula. For instance, he was able to guess, and then prove, that the sum of the reciprocal squares, $\sum_{i=1}^{\infty} \frac{1}{i^2}$, is exactly $\pi^2/6$, one of the most astonishing discoveries of the era. Before this time, almost the only infinite series whose sums were known were geometric series, and their sums never involved totally unexpected numbers like π. Euler’s work led to modern methods of studying the distribution of prime numbers, one of the most active research areas in mathematics today [14].

References

[1] J. Barnett, G. Bezhanishvili, H. Leung, D. Pengelley, D. Ranjan, Historical Projects in Discrete Mathematics and Computer Science, chapter in Resources for Teaching Discrete Mathematics (editor B. Hopkins), MAA Notes 74, Mathematical Association of America, Washington, DC, 2009, pp. 163–276. Also at http://www.math.nmsu.edu/hist_projects/DMRG.1.pdf.

[2] J. Bernoulli, Die Werke von Jakob Bernoulli, Naturforschende Gesellschaft in Basel, Birkhäuser Verlag, Basel, 1975.

[3] J. Bernoulli, The Art of Conjecturing: Together with “Letter to a Friend on Sets in Court Tennis” (transl. E.D. Sylla), Johns Hopkins University Press, Baltimore, 2006.

[4] C. Boyer, Pascal’s Formula For the Sums of Powers of the Integers, Scripta Mathematica 9 (1943), 237—244.

[5] C.B. Boyer, U.C. Merzbach, A History of Mathematics (second ed.), John Wiley & Sons, New York, 1989.

[6] Encyclopædia Britannica, Chicago, 1986.

[7] Ismaël Bullialdus, available from http://en.wikipedia.org/wiki/Ismaël_Bullialdus.

[8] P. de Fermat, Oeuvres, P. Tannery (ed.), Paris, 1891—1922.

[9] Fermat’s Last Theorem, available from http://en.wikipedia.org/wiki/Fermat’s_last_theorem.

[10] C.C. Gillispie, F.L. Holmes (eds.), Dictionary of Scientific Biography, Scribner, New York, 1970.

[11] Great Books of the Western World, Mortimer Adler (ed.), Encyclopædia Britannica, Inc., Chicago, 1991.

[12] V.J. Katz, A History of Mathematics: An Introduction (second ed.), Addison-Wesley, New York, 1998.

[13] R. Laubenbacher, D. Pengelley, Mathematical Expeditions: Chronicles by the Explorers, Springer Verlag, New York, 1999.

[14] A. Knoebel, R. Laubenbacher, J. Lodder, D. Pengelley, Mathematical Masterpieces: Further Chronicles by the Explorers, Springer Verlag, New York, 2007.

[15] M. Mahoney, The Mathematical Career of Pierre de Fermat, 2nd edition, Princeton University Press, Princeton, New Jersey, 1994.

[16] Marin Mersenne, available from http://en.wikipedia.org/wiki/Mersenne.

[17] Mathematics, available from http://en.wikipedia.org/wiki/Mathematics.

[18] B. Pascal, Oeuvres, L. Brunschvicg (ed.), Paris, 1908–14; Kraus Reprint, Vaduz, Liechtenstein, 1976.

[19] Gilles de Roberval, available from http://en.wikipedia.org/wiki/Gilles_de_Roberval.

[20] K. Shen, Seki Takakazu and Li Shanlan’s Formulae on the Sum of Powers and Factorials of Natural Numbers. (Japanese) Sugakushi Kenkyu 115 (1987), 21–36.

[21] D.J. Struik, A Source Book in Mathematics, 1200—1800, Cambridge, Harvard University Press, 1969; Princeton University Press, Princeton, 1986.

[22] K. Yosida, A Brief Biography on Takakazu Seki (1642?—1708), Math. Intelligencer 3 (1980/81), no. 3, 121—122.

[23] R.M. Young, Excursions in Calculus: An Interplay of the Continuous and the Discrete, Mathematical Association of America, Washington, D.C., 1992.

Notes to the instructor

This project is for students of upper division discrete mathematics or combinatorics, and connects sums of powers to figurate numbers, binomial coefficients, Pascal’s triangle, and Bernoulli numbers via the writings of Fermat, Pascal, and Jakob Bernoulli. It follows a portion of the epic story of formulas for sums of powers that stretches from the Pythagoreans to Euler and is central in the development of discrete mathematics and combinatorics. The full story is told via primary sources in [14, ch. 1].

Many of the techniques and phenomena introduced in a discrete mathematics or combinatorics course simply arise naturally in this project, like recursive definitions, delicate and challenging work with summations and inequalities, counting and geometry, binomial coefficients and combination numbers, and proofs by mathematical induction, including strong induction. Instead of separately covering various such topics and techniques, that class time could simply be spent on the project, and students will learn those things in the process. The only formal prerequisites are a facility with proofs, mathematical induction, the binomial theorem, and calculus.

The project is quite flexible, and the instructor can choose from various activities offered. The full project can be completed in at most four to five weeks, and for a shorter project the instructor may choose selectively; some exercises are marked as optional if they are not critical to later work. Students can work productively in groups, with group or individual writeups. Many of the ideas and knowledge in the project are acquired through guided discovery exercises, a number of which are open-ended, so the instructor should work through all the details before assigning any student work, and select carefully from the big picture if exercises are omitted. Students may need substantial guidance with some parts.

The goal is for students to learn many basic notations, techniques, and skills in the context of an historically and mathematically authentic big motivating problem with multiple connections to other mathematics. Hopefully this will be much more effective and rewarding than simply being asked to learn various skills for no immediately apparent application.

The project also asks students to conjecture from patterns they generate, develop their mathematical intuition and judgement, and try proving their conjectures, i.e., putting students in the creative mathematical driver’s seat in an authentic context. The setting of sums of powers in the context of primary sources allows a richness of questions and interpretations, and especially includes deep connections to geometry, the two-way interplay with calculus, and a richness of proof techniques.
