

Beautiful Mathematics


© 2011 by the Mathematical Association of America, Inc.

Library of Congress Catalog Card Number 2011939398
Print edition ISBN: 978-0-88385-576-8
Electronic edition ISBN: 978-1-61444-509-8
Printed in the United States of America
Current Printing (last digit): 10 9 8 7 6 5 4 3 2 1


Beautiful Mathematics

Martin Erickson Truman State University

Published and Distributed by The Mathematical Association of America


Council on Publications and Communications
Frank Farris, Chair

Committee on Books
Gerald Bryce, Chair

Spectrum Editorial Board
Gerald L. Alexanderson, Editor
Robert E. Bradley
Susanna S. Epp
Richard K. Guy
Keith M. Kendig
Shawnee L. McMurran
Jeffrey L. Nunemacher
Kenneth A. Ross
Franklin F. Sheehan
James J. Tattersall
Robin Wilson


SPECTRUM SERIES

The Spectrum Series of the Mathematical Association of America was so named to reflect its purpose: to publish a broad range of books including biographies, accessible expositions of old or new mathematical ideas, reprints and revisions of excellent out-of-print books, popular works, and other monographs of high interest that will appeal to a broad range of readers, including students and teachers of mathematics, mathematical amateurs, and researchers.

777 Mathematical Conversation Starters, by John de Pillis
99 Points of Intersection: Examples—Pictures—Proofs, by Hans Walser. Translated from the original German by Peter Hilton and Jean Pedersen
Aha Gotcha and Aha Insight, by Martin Gardner
All the Math That’s Fit to Print, by Keith Devlin
Beautiful Mathematics, by Martin Erickson
Calculus Gems: Brief Lives and Memorable Mathematics, by George F. Simmons
Carl Friedrich Gauss: Titan of Science, by G. Waldo Dunnington, with additional material by Jeremy Gray and Fritz-Egbert Dohse
The Changing Space of Geometry, edited by Chris Pritchard
Circles: A Mathematical View, by Dan Pedoe
Complex Numbers and Geometry, by Liang-shin Hahn
Cryptology, by Albrecht Beutelspacher
The Early Mathematics of Leonhard Euler, by C. Edward Sandifer
The Edge of the Universe: Celebrating 10 Years of Math Horizons, edited by Deanna Haunsperger and Stephen Kennedy
Euler and Modern Science, edited by N. N. Bogolyubov, G. K. Mikhailov, and A. P. Yushkevich. Translated from Russian by Robert Burns
Euler at 300: An Appreciation, edited by Robert E. Bradley, Lawrence A. D’Antonio, and C. Edward Sandifer
Expeditions in Mathematics, edited by Tatiana Shubin, David F. Hayes, and Gerald L. Alexanderson
Five Hundred Mathematical Challenges, by Edward J. Barbeau, Murray S. Klamkin, and William O. J. Moser
The Genius of Euler: Reflections on his Life and Work, edited by William Dunham
The Golden Section, by Hans Walser. Translated from the original German by Peter Hilton, with the assistance of Jean Pedersen
The Harmony of the World: 75 Years of Mathematics Magazine, edited by Gerald L. Alexanderson with the assistance of Peter Ross
A Historian Looks Back: The Calculus as Algebra and Selected Writings, by Judith Grabiner
History of Mathematics: Highways and Byways, by Amy Dahan-Dalmédico and Jeanne Peiffer, translated by Sanford Segal
How Euler Did It, by C. Edward Sandifer
Is Mathematics Inevitable? A Miscellany, edited by Underwood Dudley
I Want to Be a Mathematician, by Paul R. Halmos
Journey into Geometries, by Marta Sved
JULIA: a life in mathematics, by Constance Reid
The Lighter Side of Mathematics: Proceedings of the Eugène Strens Memorial Conference on Recreational Mathematics & Its History, edited by Richard K. Guy and Robert E. Woodrow
Lure of the Integers, by Joe Roberts
Magic Numbers of the Professor, by Owen O’Shea and Underwood Dudley
Magic Tricks, Card Shuffling, and Dynamic Computer Memories: The Mathematics of the Perfect Shuffle, by S. Brent Morris
Martin Gardner’s Mathematical Games: The entire collection of his Scientific American columns
The Math Chat Book, by Frank Morgan
Mathematical Adventures for Students and Amateurs, edited by David Hayes and Tatiana Shubin. With the assistance of Gerald L. Alexanderson and Peter Ross
Mathematical Apocrypha, by Steven G. Krantz
Mathematical Apocrypha Redux, by Steven G. Krantz
Mathematical Carnival, by Martin Gardner
Mathematical Circles Vol I: In Mathematical Circles Quadrants I, II, III, IV, by Howard W. Eves
Mathematical Circles Vol II: Mathematical Circles Revisited and Mathematical Circles Squared, by Howard W. Eves
Mathematical Circles Vol III: Mathematical Circles Adieu and Return to Mathematical Circles, by Howard W. Eves
Mathematical Circus, by Martin Gardner
Mathematical Cranks, by Underwood Dudley
Mathematical Evolutions, edited by Abe Shenitzer and John Stillwell
Mathematical Fallacies, Flaws, and Flimflam, by Edward J. Barbeau
Mathematical Magic Show, by Martin Gardner
Mathematical Reminiscences, by Howard Eves
Mathematical Treks: From Surreal Numbers to Magic Circles, by Ivars Peterson
Mathematics: Queen and Servant of Science, by E.T. Bell
Mathematics in Historical Context, by Jeff Suzuki
Memorabilia Mathematica, by Robert Edouard Moritz
Musings of the Masters: An Anthology of Mathematical Reflections, edited by Raymond G. Ayoub
New Mathematical Diversions, by Martin Gardner
Non-Euclidean Geometry, by H. S. M. Coxeter
Numerical Methods That Work, by Forman Acton
Numerology or What Pythagoras Wrought, by Underwood Dudley
Out of the Mouths of Mathematicians, by Rosemary Schmalz
Penrose Tiles to Trapdoor Ciphers ...and the Return of Dr. Matrix, by Martin Gardner
Polyominoes, by George Martin
Power Play, by Edward J. Barbeau
Proof and Other Dilemmas: Mathematics and Philosophy, edited by Bonnie Gold and Roger Simons
The Random Walks of George Pólya, by Gerald L. Alexanderson
Remarkable Mathematicians, from Euler to von Neumann, by Ioan James
The Search for E.T. Bell, also known as John Taine, by Constance Reid
Shaping Space, edited by Marjorie Senechal and George Fleck
Sherlock Holmes in Babylon and Other Tales of Mathematical History, edited by Marlow Anderson, Victor Katz, and Robin Wilson
Student Research Projects in Calculus, by Marcus Cohen, Arthur Knoebel, Edward D. Gaughan, Douglas S. Kurtz, and David Pengelley
Symmetry, by Hans Walser. Translated from the original German by Peter Hilton, with the assistance of Jean Pedersen
The Trisectors, by Underwood Dudley
Twenty Years Before the Blackboard, by Michael Stueben with Diane Sandford
Who Gave You the Epsilon? and Other Tales of Mathematical History, edited by Marlow Anderson, Victor Katz, and Robin Wilson
The Words of Mathematics, by Steven Schwartzman

MAA Service Center
P.O. Box 91112
Washington, DC 20090-1112
800-331-1622
FAX 301-206-9789


To Rodman Doll, who mentored me in mathematics when I was a high school student


Preface

Why are numbers beautiful? It’s like asking why is Beethoven’s Ninth Symphony beautiful. If you don’t see why, someone can’t tell you. I know numbers are beautiful. If they aren’t beautiful, nothing is.
PAUL ERDŐS (1913–1996)

This book is about beautiful mathematical concepts and creations.

Some people believe that mathematics is the language of nature, others that it is an abstract game with symbols and rules. Still others believe it is all calculations. Plato equated mathematics with “the good.” My approach to mathematics is as an art form, like painting, sculpture, or music. While the artist works in a tangible medium, the mathematician works in a medium of numbers, shapes, and abstract patterns. In mathematics, as in art, there are constraints. The most stringent is that mathematical results must be true; others are conciseness and elegance. As with other arts, mathematical ideas have an esthetic appeal that can be appreciated by those with the willingness to investigate.

I hope that this book will inspire readers with the beauty of mathematics. I present mathematical topics in the categories of words, images, formulas, theorems, proofs, solutions, and unsolved problems. We go from complex numbers to arithmetic progressions, from Alcuin’s sequence to the zeta function, and from hypercubes to infinity squared.

Who should read this book? I believe that there is something new in it for any mathematically-minded person. I especially recommend it to high school and college students, as they need motivation to study mathematics, and beauty is a strong motivation; and to professional mathematicians, because we always need fresh examples of mathematical beauty to pass along to others. Within each chapter, the topics require progressively more prerequisite knowledge. Topics that may be too advanced for a beginning reader will become more accessible as the reader progresses in mathematical study. An appendix gives background definitions and theorems, while another gives challenging exercises, with solutions, to help the reader learn more.

Thanks to the people who have kindly provided suggestions concerning this book: Roland Bacher, Donald Bindner, Robert Cacioppo, Robert Dobrow, Shalom Eliahou, Ravi Fernando, Suren Fernando, David Garth, Joe Hemmeter, Daniel Jordan, Ken Price, Khang Tran, Vincent Vatter, and Anthony Vazzana. Thanks also to the people affiliated with publishing at the Mathematical Association of America, including Gerald Alexanderson, Don Albers, Carol Baxter, Rebecca Elmo, Frank Farris, Beverly Ruedi, and the anonymous readers, for their help in making this book a reality.


Contents

Preface

1 Imaginative Words
1.1 Lemniscate
1.2 Centillion
1.3 Golden Ratio
1.4 Borromean Rings
1.5 Sieve of Eratosthenes
1.6 Transversal of Primes
1.7 Waterfall of Primes
1.8 Squares, Triangular Numbers, and Cubes
1.9 Determinant
1.10 Complex Plane

2 Intriguing Images
2.1 Square Pyramidal Square Number
2.2 Binary Trees
2.3 Bulging Hyperspheres
2.4 Projective Plane
2.5 Two-Colored Graph
2.6 Hypercube
2.7 Full Adder
2.8 Sierpiński’s Triangle
2.9 Squaring Map
2.10 Riemann Sphere

3 Captivating Formulas
3.1 Arithmetical Wonders
3.2 Heron’s Formula and Heronian Triangles
3.3 Sine, Cosine, and Exponential Function Expansions
3.4 Tangent and Secant Function Expansions
3.5 Series for Pi
3.6 Product for Pi
3.7 Fibonacci Numbers and Pi
3.8 Volume of a Ball
3.9 Euler’s Integral Formula
3.10 Euler’s Polyhedral Formula
3.11 The Smallest Taxicab Number
3.12 Infinity and Infinity Squared
3.13 Complex Functions
3.14 The Zeta Function and Bernoulli Numbers
3.15 The Riemann Zeta Function
3.16 The Jacobi Identity
3.17 Entropy
3.18 Rook Paths

4 Delightful Theorems
4.1 A Square inside Every Triangle
4.2 Morley’s Theorem
4.3 The Euler Line
4.4 Monge’s Theorem
4.5 Power Means
4.6 Regular Heptagon
4.7 Isometries of the Plane
4.8 Symmetries of Regular Convex Polyhedra
4.9 Polynomial Symmetries
4.10 Kings and Serfs
4.11 The Erdős–Szekeres Theorem
4.12 Minkowski’s Theorem
4.13 Lagrange’s Theorem
4.14 Van der Waerden’s Theorem
4.15 Latin Squares and Projective Planes
4.16 The Lemniscate Revisited

5 Pleasing Proofs
5.1 The Pythagorean Theorem
5.2 The Erdős–Mordell Inequality
5.3 Triangles with Given Area and Perimeter
5.4 A Property of the Directrix of a Parabola
5.5 A Classic Integral
5.6 Integer Partitions
5.7 Integer Triangles
5.8 Triangle Destruction
5.9 Squares in Arithmetic Progression
5.10 Random Hemispheres
5.11 Odd Binomial Coefficients
5.12 Frobenius’ Postage Stamp Problem
5.13 Perrin’s Sequence
5.14 On the Number of Partial Orders
5.15 Perfect Error-Correcting Codes
5.16 Binomial Coefficient Magic
5.17 A Group of Operations

6 Elegant Solutions
6.1 A Tetrahedron and Four Spheres
6.2 Alphabet Cubes
6.3 A Triangle in an Ellipse
6.4 About the Roots of a Cubic
6.5 Distance on Planet X
6.6 A Tilted Circle
6.7 The Millionth Fibonacci Number
6.8 The End of a Conjecture
6.9 A Zero-Sum Game
6.10 An Expected Maximum
6.11 Walks on a Graph
6.12 Rotations of a Grid
6.13 Stamp Rolls
6.14 Making a Million
6.15 Coloring a Projective Plane

7 Creative Problems
7.1 Two-Dimensional Gobbling Algorithm
7.2 Nonattacking Queens Game
7.3 Lucas Numbers Mod m
7.4 Exact Colorings of Graphs
7.5 Queen Paths
7.6 Transversal Achievement Game
7.7 Binary Matrix Game

A Harmonious Foundations
A.1 Sets
A.2 Relations
A.3 Functions
A.4 Groups
A.5 Fields
A.6 Vector Spaces

B Eye-Opening Explorations
B.1 Problems
B.2 Solutions

Bibliography
Index
About the Author


1 Imaginative Words

It is impossible to be a mathematician without being a poet in soul.
SOFIA KOVALEVSKAYA (1850–1891)

The objects of mathematics can have fascinating names. Mathematical words describe numbers, shapes, and logical concepts. Some are ordinary words adapted for a specific purpose, such as cardinal, cube, group, face, field, ring, and tree. Others are unusual, like cosecant, holomorphism, octodecillion, polyhedron, and pseudoprime. Some sound peculiar: deleted comb space, harmonic map, supremum norm, twisted sphere bundle, to name a few. Mathematical words have appeared in poems (see [19]). Let us look at some mathematical words.

1.1 Lemniscate

Consider the lemniscate, a curve shaped like a figure-eight,¹ as shown in Figure 1.1. We learn in [46] that it gets its name from the Greek word lemniskos, a ribbon used for fastening a garland on one’s head, derived from the island Lemnos where such ribbons were worn. By a coincidence, the end of the word lemniscate sounds like “skate,” and one can (with practice and skill) skate a figure-eight. Skating a lemniscate is portrayed in the animated Schoolhouse Rock segment “Figure Eight,” with the theme sung by jazz vocalist Blossom Dearie (1926–2009). She sings that a figure-eight is “double four,” which is probably the Indo-European origin of the word “eight.” In the animation, a girl daydreams of skating a figure-eight that turns into the infinity symbol ∞.

Figure 1.1. A lemniscate.

¹ Another curve, known as the Eight Curve, is perhaps closer to a figure-eight, but we will stick with the lemniscate because it is so graceful.

Figure 1.2. A lemniscate graph.

Figure 1.3. A lemniscate and a hyperbola as circular inverses.

As in Figure 1.2, a lemniscate can be graphed on an $xy$ coordinate system by the parametric equations

$$x = \frac{\cos t}{1 + \sin^2 t}, \qquad y = \frac{\sin t \cos t}{1 + \sin^2 t}, \qquad -\infty < t < \infty.$$

As the parameter $t$ moves along the real number line, a point $(x, y)$ in the plane traces the lemniscate over and over, following the right lobe counterclockwise and the left lobe clockwise.

Where do the parametric equations for the lemniscate come from? One way to obtain a lemniscate is by circular inversion of a hyperbola. In Figure 1.3, we see the hyperbola

$$x^2 - y^2 = 1$$

and the lemniscate as inverses with respect to the unit circle centered at the origin. Each point on the hyperbola is joined by a line segment to the origin. The point where the segment crosses the lemniscate is indicated. The length of the segment from the origin to the lemniscate is the reciprocal of the length of the segment from the origin to the hyperbola. The self-intersection point of the lemniscate corresponds to a point at infinity on the hyperbola.

The distance from the origin to a point $(x, y)$ is $\sqrt{x^2 + y^2}$. To find the point on the lemniscate corresponding to $(x, y)$ on the hyperbola, we make the transformation

$$(x, y) \mapsto \left( \frac{x}{x^2 + y^2}, \frac{y}{x^2 + y^2} \right).$$

This transforms the equation for the hyperbola into an equation for the lemniscate:

$$(x^2 + y^2)^2 = x^2 - y^2.$$

Starting with parametric equations for the hyperbola,

$$x = \sec t, \qquad y = \tan t, \qquad -\infty < t < \infty,$$

we obtain, upon making the same transformation, the parametric equations for the lemniscate.
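The inversion argument can be checked numerically. The following is a minimal sketch, not a proof: it samples the hyperbola $x = \sec t$, $y = \tan t$, applies the inversion map, and compares the result with the parametric equations of the lemniscate and with the implicit equation $(x^2 + y^2)^2 = x^2 - y^2$. The function names and the sampling range are illustrative choices, and the range is assumed to stay away from the values of $t$ where $\cos t = 0$.

```python
import math

def hyperbola_point(t):
    """Point (sec t, tan t) on the hyperbola x^2 - y^2 = 1."""
    return 1.0 / math.cos(t), math.tan(t)

def invert(x, y):
    """Circular inversion in the unit circle centered at the origin."""
    d2 = x * x + y * y
    return x / d2, y / d2

def lemniscate_point(t):
    """Point given by the parametric equations of the lemniscate."""
    s, c = math.sin(t), math.cos(t)
    return c / (1 + s * s), s * c / (1 + s * s)

def max_discrepancy(samples=200):
    """Largest deviation found between the inverted hyperbola and the lemniscate."""
    worst = 0.0
    for k in range(samples):
        t = -1.5 + 3.0 * k / (samples - 1)  # sample range assumed; avoids cos t = 0
        ix, iy = invert(*hyperbola_point(t))
        lx, ly = lemniscate_point(t)
        implicit = (ix * ix + iy * iy) ** 2 - (ix * ix - iy * iy)
        worst = max(worst, abs(ix - lx), abs(iy - ly), abs(implicit))
    return worst

print(max_discrepancy())
```

The printed value should be at the level of floating-point rounding error.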

1.2 Centillion

What is the largest number you can name? A million is $10^3$ thousand. A billion is $10^6$ thousand. A trillion is $10^9$ thousand. The largest number given in a dictionary list of numbers is typically a centillion, which is 1 followed by 100 groups of three zeros followed by another group of three zeros, or $10^{300}$ thousand, or

$$10^{303}.$$

A centillion is much larger than a googol, a coined term for $10^{100}$, but much smaller than a googolplex, defined as 1 followed by a googol of zeros.

If you have ten dollars and seven cents in pennies then you have, in a way, a centillion. The number of ways of selecting a subset of the pennies is $2^{1007}$, which is about 1.4 centillion.
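Because Python integers have arbitrary precision, the penny count can be compared with a centillion directly. This is a small sketch; the variable names are illustrative.

```python
CENTILLION = 10 ** 303
GOOGOL = 10 ** 100

# Number of subsets of a set of 1007 pennies.
subsets_of_pennies = 2 ** 1007

# Both operands are exact integers; the quotient is an ordinary float.
ratio = subsets_of_pennies / CENTILLION
print(f"2^1007 is about {ratio:.2f} centillion")
```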

1.3 Golden Ratio

Figure 1.4 shows a golden rectangle. If we remove the square on the shorter side, the remaining rectangle has the same proportions as the original rectangle. The golden ratio is

$$\frac{y}{x} = \frac{x}{y - x}.$$

This is

$$\frac{y}{x} = \frac{1}{\frac{y}{x} - 1},$$

or

$$\left( \frac{y}{x} \right)^2 - \frac{y}{x} = 1.$$

Completing the square,

$$\left( \frac{y}{x} - \frac{1}{2} \right)^2 = \frac{5}{4},$$

so

$$\frac{y}{x} - \frac{1}{2} = \pm \frac{\sqrt{5}}{2},$$

and hence

$$\frac{y}{x} = \frac{1 \pm \sqrt{5}}{2}.$$

Figure 1.4. The golden rectangle. (Side labels: $x$ and $y - x$.)

Since $y/x$ is greater than 1, the positive sign applies. Therefore, the golden ratio, denoted by $\phi$, is

$$\phi = \frac{1 + \sqrt{5}}{2} \approx 1.6.$$

A rational number is a ratio of two integers, such as $4/7$ or $2/1$. An irrational number is a real number that isn’t rational. The golden ratio is an irrational number. If $\phi$ were rational, then we could write $\phi = y/x$, where $x$ and $y$ are positive integers. From Figure 1.4, we see that $\phi$ also equals $x/(y - x)$. This is a representation of $\phi$ as a ratio of a pair of smaller positive integers. We could repeat the process, representing $\phi$ as ratios of smaller and smaller pairs of positive integers. But this would imply an infinite decreasing chain of positive integers, which is impossible. Therefore $\phi$ is irrational.

In Euclidean geometry, we can construct a line through any two points, a circle with any center and passing through any given point, and the intersection of two given lines, a line and a circle, or two circles. Figure 1.5 shows a construction of the golden rectangle using four lines and six circles. For more about the golden ratio, see the delightful book [52].

Figure 1.5. Construction of the golden rectangle.
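The two facts used above, that $\phi$ satisfies $\phi = 1/(\phi - 1)$ and that the step from $y/x$ to $x/(y - x)$ cannot continue forever on positive integers, can be illustrated with a short sketch. The helper `descend` and the starting pair $(13, 8)$ are illustrative assumptions, not part of the text.

```python
import math

phi = (1 + math.sqrt(5)) / 2

def descend(y, x):
    """Repeat the step (y, x) -> (x, y - x) while y > x > 0.

    For a true golden rectangle every pair would have the same ratio;
    for integers the chain must stop, which is the contradiction used
    in the irrationality argument.
    """
    chain = [(y, x)]
    while x > 0 and y > x:
        y, x = x, y - x
        chain.append((y, x))
    return chain

print(phi, 1 / (phi - 1))   # the two values agree up to rounding
print(descend(13, 8))       # a finite chain of shrinking integer pairs
```

Starting from a pair of consecutive Fibonacci numbers, whose ratio is close to $\phi$, the chain walks down through earlier Fibonacci pairs and then stops.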

1.4 Borromean Rings

Figure 1.6 shows three interlocking rings called Borromean rings, named after the Borromeo family in Italy whose coat of arms depicted them. No two rings are linked, yet the three cannot be pulled apart; if we remove any one of them, the other two come apart.

Borromean rings exist as an abstract concept, but in one sense they do not exist in reality: the rings cannot be represented by three circles in 3-dimensional Euclidean space, even with arbitrary radii. The problem is that one cannot make rigid circles pass over and under each other in the required way. For a proof, see [31]. However, the sculptor John Robinson has shown that the interlocking configuration can be made with three squares or equilateral triangles instead of circles.

Figure 1.6. Borromean rings.

Figure 1.7. Sieve of Eratosthenes.

1.5 Sieve of Eratosthenes

A prime number is an integer greater than 1 with no positive divisors other than 1 and itself. For example, 13 is a prime number, but 10 is not because it is divisible by 2 and 5. There are infinitely many prime numbers, as proved in Euclid’s Elements. Every integer greater than 1 is the product of primes in a unique way (up to the order of the factors); this is the Fundamental Theorem of Arithmetic.

The Sieve of Eratosthenes, invented by Eratosthenes of Cyrene (c. 276–195 BCE), is an algorithm for listing all the primes up to a given number. The method is to eliminate proper multiples of known primes. Figure 1.7 shows the result of the Sieve of Eratosthenes on the numbers 2 through 400.

In the figure, the boxes represent the integers 2 through 400, reading left-to-right and top-to-bottom. Unshaded squares represent prime numbers and shaded squares represent composite (non-prime) numbers. The algorithm starts with the boxes unshaded. Then proper multiples of the first prime (2) are shaded (since they are divisible by 2 and hence composite). The next remaining number in order, 3, is a prime, and its proper multiples are shaded. The next remaining number is the prime 5, and its proper multiples are shaded. This continues for all primes up to 19 (those in the first row of the array). We need to sift out multiples of the primes 2, 3, 5, 7, 11, 13, 17, and 19, since 19 is the largest prime whose square is less than 400. Any composite number up to 400 must be divisible by one of them.
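The shading procedure just described translates almost line for line into code. The following is a minimal sketch; the function name is an illustrative choice. As in the text, it crosses out the proper multiples of each prime whose square does not exceed the limit.

```python
def sieve_of_eratosthenes(limit):
    """Return the list of primes up to and including limit."""
    if limit < 2:
        return []
    # is_prime[n] plays the role of an unshaded box for the number n.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= limit:
        if is_prime[p]:
            # Shade the proper multiples 2p, 3p, ... of the prime p.
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
        p += 1
    return [n for n in range(2, limit + 1) if is_prime[n]]

primes = sieve_of_eratosthenes(400)
print(len(primes), primes[:8])
```

For the limit 400 the outer loop stops after 19 is sieved, since the next prime, 23, has a square larger than 400, in agreement with the list of sifting primes given above.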

1.6 Transversal of Primes

Let $p$ be a prime number. In a $p \times p$ square array consisting of the numbers 1 through $p^2$ (in left-to-right, top-to-bottom order), is there always a collection of $p$ primes with no two of them in the same row or column? The solutions for $p = 2$, 3, and 5 are unique. Figure 1.8 shows an example for $p = 11$.

A transversal of an $n \times n$ array is a selection of $n$ cells of the array with no two in the same row or column. We are asking whether there is a transversal of primes in a $p \times p$ array.

Adrien-Marie Legendre (1752–1833) conjectured that there exists at least one prime number between consecutive squares $N^2$ and $(N + 1)^2$. This conjecture is still open. Legendre’s conjecture is a necessary condition for our problem, since there must be a prime in the last row of the grid. Is the answer to our question possibly “no” for some prime $p$?

  1   2   3   4   5   6   7   8   9  10  11
 12  13  14  15  16  17  18  19  20  21  22
 23  24  25  26  27  28  29  30  31  32  33
 34  35  36  37  38  39  40  41  42  43  44
 45  46  47  48  49  50  51  52  53  54  55
 56  57  58  59  60  61  62  63  64  65  66
 67  68  69  70  71  72  73  74  75  76  77
 78  79  80  81  82  83  84  85  86  87  88
 89  90  91  92  93  94  95  96  97  98  99
100 101 102 103 104 105 106 107 108 109 110
111 112 113 114 115 116 117 118 119 120 121

Figure 1.8. A transversal of primes.
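For small $p$ the question can be explored by computer. Finding a transversal of primes is a bipartite matching problem: rows on one side, columns on the other, with an edge wherever the array entry is prime. The sketch below uses the standard augmenting-path method for bipartite matching; this is one possible approach chosen for illustration, not a method taken from the text, and the function names are hypothetical.

```python
def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True

def prime_transversal(p):
    """Return cols with cols[r] = column of the prime chosen in row r, or None."""
    # allowed[r] lists the columns c whose entry r*p + c + 1 is prime.
    allowed = [[c for c in range(p) if is_prime(r * p + c + 1)] for r in range(p)]
    row_of_col = [None] * p  # row currently assigned to each column

    def try_row(r, seen):
        # Try to place row r, re-routing earlier rows along an augmenting path.
        for c in allowed[r]:
            if c in seen:
                continue
            seen.add(c)
            if row_of_col[c] is None or try_row(row_of_col[c], seen):
                row_of_col[c] = r
                return True
        return False

    for r in range(p):
        if not try_row(r, set()):
            return None  # no transversal of primes exists for this p
    cols = [None] * p
    for c, r in enumerate(row_of_col):
        cols[r] = c
    return cols

cols = prime_transversal(11)
print([r * 11 + c + 1 for r, c in enumerate(cols)])
```

The transversal printed for $p = 11$ need not coincide with the one drawn in Figure 1.8, since more than one may exist.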

1.7 Waterfall of Primes

Primes greater than 2 are odd and therefore upon division by 4 leave a remainder of 1 or 3. For example, $11 = 4 \cdot 2 + 3$ and $13 = 4 \cdot 3 + 1$. Among the first 1000 odd primes, 495 are of the form $4n + 1$ and 505 are of the form $4n + 3$. Thus, there are about half of each type. As the sequence of primes goes on, the distribution of primes into the two types is closer and closer to half-half. The waterfall of primes in Figure 1.9 depicts the way that prime numbers fall into the two classes, primes of the form $4n + 1$ on the right and primes of the form $4n + 3$ on the left. As the waterfall continues for all eternity, the difference between the number of primes of the two forms changes sign infinitely often.

Primes of different forms have different properties. For example, an odd prime is the sum of two squares of integers if and only if it is of the form $4n + 1$.

Figure 1.9. A waterfall of primes.
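The tally of the two classes can be reproduced with a few lines of code. This is a sketch with illustrative function names; it generates the first odd primes by trial division and counts the remainders upon division by 4, so the counts quoted above can be checked by running it.

```python
def first_odd_primes(count):
    """Return the first `count` odd primes, found by trial division."""
    primes = []
    n = 3
    while len(primes) < count:
        # n is odd, so it is prime when no earlier odd prime q with q*q <= n divides it.
        if all(n % q for q in primes if q * q <= n):
            primes.append(n)
        n += 2
    return primes

def residue_counts(count):
    """Count primes of the forms 4n + 1 and 4n + 3 among the first odd primes."""
    primes = first_odd_primes(count)
    ones = sum(1 for q in primes if q % 4 == 1)
    threes = sum(1 for q in primes if q % 4 == 3)
    return ones, threes

ones, threes = residue_counts(1000)
print(ones, threes)
```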

1.8 Squares, Triangular Numbers, and Cubes

Number theory is the study of properties of the counting numbers, 1, 2, 3, .... A theorem of Joseph-Louis Lagrange (1736–1813) says that every positive integer is equal to the sum of four squares of integers. For example,

$$132 = 9^2 + 7^2 + 1^2 + 1^2.$$

A similar theorem, due to Carl Friedrich Gauss (1777–1855), asserts that every positive integer is equal to the sum of at most three triangular numbers. A triangular number is a number of the form $1 + 2 + \cdots + k$, for some positive integer $k$. So $10 = 1 + 2 + 3 + 4$ is a triangular number. The reason for this term is that dots representing the numbers 1 through $k$ can be stacked in the shape of a triangle. An example of Gauss’s theorem is

$$100 = 91 + 6 + 3.$$

A third theorem of number theory, due to Pierre de Fermat (1601–1665), says that a cube (a positive integer of the form $n^3$) is never equal to the sum of two cubes. For instance, there is no way to write $10^3 = 1000$ as the sum of two cubes. This result is part of a more general assertion known as Fermat’s Last Theorem, which was proved by Andrew Wiles in 1995. It says there are no positive integer solutions to the equation $x^n + y^n = z^n$, where $n$ is an integer greater than 2.

Figure 1.10 represents these three theorems pictorially. In his diary, Gauss wrote an equivalent of the second equation accompanied by the exclamation Eureka! A good reference on number theory is [37].

Figure 1.10. Three theorems of number theory.
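The theorems of Lagrange and Gauss guarantee that representations exist, and for small numbers a brute-force search finds them quickly. The sketch below is a plain exhaustive search with illustrative function names; the representations it returns may differ from the examples in the text, since they are generally not unique.

```python
import math

def four_squares(n):
    """Return (a, b, c, d) with a >= b >= c >= d >= 0 and a^2 + b^2 + c^2 + d^2 == n."""
    for a in range(math.isqrt(n), -1, -1):
        for b in range(min(a, math.isqrt(n - a * a)), -1, -1):
            for c in range(min(b, math.isqrt(n - a * a - b * b)), -1, -1):
                rest = n - a * a - b * b - c * c
                d = math.isqrt(rest)
                if d <= c and d * d == rest:
                    return a, b, c, d
    return None

def three_triangulars(n):
    """Return at most three positive triangular numbers that sum to n."""
    tri = []
    k = 0
    while k * (k + 1) // 2 <= n:
        tri.append(k * (k + 1) // 2)  # includes 0, which stands for an unused slot
        k += 1
    tri_set = set(tri)
    for x in tri:
        for y in tri:
            z = n - x - y
            if z in tri_set:
                return tuple(t for t in (x, y, z) if t > 0)
    return None

print(four_squares(132))
print(three_triangulars(100))
```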

1.9 Determinant

A determinant is an algebraic quantity that determines whether or not a system of $n$ linear equations in $n$ unknowns has a unique solution. Perhaps you are familiar with the formula for $2 \times 2$ determinants:

$$\begin{vmatrix} a & b \\ c & d \end{vmatrix} = ad - bc.$$

Did you know that the determinant is the area of a parallelogram? In Figure 1.11, the area of the gray parallelogram, spanned by the vectors $(a, b)$ and $(c, d)$, is the area of the rectangle minus the areas of two triangles and two trapezoids:

$$(a + c)(b + d) - \tfrac{1}{2}ab - \tfrac{1}{2}cd - \tfrac{1}{2}c(b + b + d) - \tfrac{1}{2}b(c + a + c) = ad - bc.$$

In any dimension, a determinant is equal to the signed volume of the parallelepiped spanned by its row vectors.
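As a check on the area computation, the sketch below evaluates the determinant, the rectangle-minus-pieces expression, and the area obtained from the shoelace formula for the parallelogram with vertices $(0, 0)$, $(a, b)$, $(a + c, b + d)$, $(c, d)$. The shoelace formula is brought in here only as an independent check; it and the function names are not from the text.

```python
def det2(a, b, c, d):
    """Determinant of the 2 x 2 matrix with rows (a, b) and (c, d)."""
    return a * d - b * c

def dissection_area(a, b, c, d):
    """Rectangle area minus two triangles and two trapezoids."""
    return ((a + c) * (b + d)
            - a * b / 2 - c * d / 2
            - c * (b + b + d) / 2
            - b * (c + a + c) / 2)

def shoelace_area(a, b, c, d):
    """Signed area of the parallelogram spanned by (a, b) and (c, d)."""
    pts = [(0, 0), (a, b), (a + c, b + d), (c, d)]
    total = 0
    for (x1, y1), (x2, y2) in zip(pts, pts[1:] + pts[:1]):
        total += x1 * y2 - x2 * y1
    return total / 2

print(det2(5, 1, 2, 4), dissection_area(5, 1, 2, 4), shoelace_area(5, 1, 2, 4))
```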

Figure 1.11. A $2 \times 2$ determinant as an area. (Labeled points: $(0, 0)$, $(a, 0)$, $(a + c, 0)$, $(a, b)$, $(0, d)$, $(c, d)$, $(0, b + d)$, $(a + c, b + d)$.)

1.10 Complex Plane

Complex numbers were treated with skepticism when they were first introduced in the 1500s. What sense can be made of the number $\sqrt{-1}$?

Since the square of a positive number is positive, and the square of a negative number is positive, and zero squared is zero, it appears that the square of no number can be $-1$. However, this is the definition of $i$, the imaginary unit.

The equation $x^2 + 1 = 0$ has no solution in the field of real numbers $\mathbb{R}$. But it is possible to solve it in a field that contains $\mathbb{R}$. The field is created by adding a new element, $i$, to $\mathbb{R}$ that has the property that $i^2 + 1 = 0$. Once $i$ is added, other numbers must be added in order to ensure that the structure is a field. We call this field the field of complex numbers, $\mathbb{C}$. In this field, every nonconstant polynomial equation with complex coefficients has a complex root.

The field $\mathbb{C}$ can be identified with the plane $\mathbb{R}^2$ in a natural way. However, the two structures are different. Multiplication of two vectors in $\mathbb{R}^2$ cannot be defined to produce a new vector so as to form a field. (The dot product of two vectors produces a number, not a vector. The cross product is defined for two vectors in $\mathbb{R}^3$, but this multiplication does not produce a field, since it isn’t commutative.) However, a vector multiplication is possible in $\mathbb{C}$. The main reason for defining this new field is the realization of these two properties: algebraic closure and existence of a multiplication.

In the construction identifying $\mathbb{C}$ and $\mathbb{R}^2$, we identify each real number $r \in \mathbb{R}$ with the ordered pair $(r, 0)$, we identify $i$ with the ordered pair $(0, 1)$, and we identify the number $a + bi$ with the ordered pair $(a, b)$. This construction was first carried out by the Norwegian mathematician Caspar Wessel (1745–1818). For an engaging account of the early history of complex numbers, see [36].

Once we have defined the element $i$ such that the equation $i^2 = -1$ holds, we have defined the field of complex numbers $\mathbb{C}$. The relation $i^2 = -1$ induces a rotational product for the whole complex plane. Hence, $(\mathbb{R}^2, +, \cdot)$ forms a field where addition is ordinary vector addition. We call $(\mathbb{R}^2, +, \cdot)$ the complex plane, and the points $(a, b) \in \mathbb{R}^2$ are called complex numbers. They are ordinary points in the ordinary plane, but with a way to multiply them to get another point. Thus, the plane with vector addition and this rotational product is a field. See Figure 1.12.

Many polynomials don’t have real zeroes, for example, $x^2 + 1$. The complex numbers are built upon the reals and a zero of this polynomial. The nontrivial fact that every nonconstant polynomial has a complex zero was first proved by Carl Friedrich Gauss (1777–1855) in the early 1800s, and is called the Fundamental Theorem of Algebra.

Figure 1.12. The complex plane. (Axis labels: real axis, imaginary axis.)

We define the sum of two complex numbers by

$$(a + bi) + (c + di) = (a + c) + (b + d)i,$$

or, in terms of ordered pairs,

$$(a, b) + (c, d) = (a + c, b + d).$$

Multiplication is defined based on the rule that $i^2 = -1$, so that

$$(a + bi)(c + di) = (ac - bd) + (ad + bc)i,$$

or, in terms of ordered pairs,

$$(a, b) \cdot (c, d) = (ac - bd, ad + bc).$$

The formula for multiplication can be remembered by writing $(a, b)$ as $a + bi$ and using ordinary multiplication of binomials, replacing $i^2$ by $-1$ wherever it occurs.

Since $(a, 0) + (b, 0) = (a + b, 0)$ and $(a, 0) \cdot (b, 0) = (ab, 0)$, we can identify the $x$-axis with the real line. We call $(x, 0)$ a real number.

As $(0, 1) \cdot (0, 1) = (-1, 0)$, we can identify $(0, 1)$ with the imaginary unit $i$, and the $y$-axis is called the imaginary axis. We call $(0, y)$ a pure imaginary number.

We see that $(b, 0) \cdot (0, 1) = (0, b)$ and thus $(a, b) = (a, 0) + (0, b) = (a, 0) + (b, 0) \cdot (0, 1)$. That is, $(a, b) = a + bi$ where $a$ and $b$ are real numbers called the real and imaginary parts of the complex number $a + bi$. We denote the real part of $z$ by $\Re z$ and its imaginary part by $\Im z$.
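The ordered-pair rules are easy to try out. In this sketch the functions `add` and `mul` are illustrative names implementing the two definitions above on plain tuples, and Python's built-in `complex` type is used only as an independent comparison.

```python
def add(z, w):
    """(a, b) + (c, d) = (a + c, b + d)."""
    (a, b), (c, d) = z, w
    return (a + c, b + d)

def mul(z, w):
    """(a, b) . (c, d) = (ac - bd, ad + bc)."""
    (a, b), (c, d) = z, w
    return (a * c - b * d, a * d + b * c)

i = (0, 1)
print(mul(i, i))               # the pair representing -1
print(mul((2, 3), (4, -5)))    # product of 2 + 3i and 4 - 5i as a pair
print(complex(2, 3) * complex(4, -5))
```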

We define $-(a + bi)$ to be the complex number corresponding to the ordered pair $(-a, -b)$. Also, we define the difference of $a + bi$ and $c + di$ to be $(a + bi) - (c + di) = (a - c) + (b - d)i$.

To prove that $\mathbb{C}$ is a field, we need to show that an arbitrary nonzero complex number $z = a + bi$ has a multiplicative inverse $z^{-1}$. We can do this by rationalizing the denominator:

$$z^{-1} = \frac{1}{a + bi} = \frac{a - bi}{(a + bi)(a - bi)} = \frac{a - bi}{a^2 + b^2} = \frac{a}{a^2 + b^2} - \frac{b}{a^2 + b^2} i.$$

To evaluate the complex quotient $z_1 / z_2$ is easy; we compute $z_1 z_2^{-1}$.

We can show that complex multiplication is associative, commutative, distributive over addition, and has the unit $1 = 1 + 0i = (1, 0)$. This means that $(\mathbb{R}^2, +, \cdot)$ is a field, and we denote it by $\mathbb{C}$.

The modulus or absolute value of a complex number $a + bi$ is its distance, as a point in the plane, from the origin $(0, 0)$. That is, $|a + bi| = \sqrt{a^2 + b^2}$.

The distance between the complex numbers $z_1 = a_1 + b_1 i$ and $z_2 = a_2 + b_2 i$ is the usual Euclidean distance in the plane between $(a_1, b_1)$ and $(a_2, b_2)$, which is

$$\sqrt{(a_2 - a_1)^2 + (b_2 - b_1)^2}.$$

The conjugate of $z = a + bi$ is $\bar{z} = a - bi$. Geometrically, the vector $z$ is reflected in the $x$-axis to produce the vector $\bar{z}$.

Leonhard Euler (1707–1783) discovered a fundamental connection between the sine and cosine functions and the exponential function. When we consider them as functions of a real variable, there doesn’t appear to be any connection among them. Sine and cosine are bounded, periodic and take on negative values, which contrasts with the behavior of the exponential function, which has none of these properties. To see the relationship, we look at the exponential function with a purely imaginary argument, or, more precisely, determine how the exponential function should be extended to the $y$-axis of $\mathbb{R}^2$.

The functions have power series expansions

$$\cos \theta = 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \frac{\theta^6}{6!} + \cdots$$

$$\sin \theta = \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \frac{\theta^7}{7!} + \cdots$$

$$e^x = 1 + \frac{x}{1!} + \frac{x^2}{2!} + \frac{x^3}{3!} + \frac{x^4}{4!} + \frac{x^5}{5!} + \frac{x^6}{6!} + \frac{x^7}{7!} + \cdots.$$

Euler considered $e^{i\theta}$. Because $i^{4k+2} = -1$ and $i^{4k} = 1$, the coefficients of even powers of $\theta$ in the expansions of $e^{i\theta}$ and $\cos \theta$ are the same. And because $i^{4k+1} = i$ and $i^{4k+3} = -i$, the coefficients of odd powers of $\theta$ in the expansion of $e^{i\theta}$ and $i \sin \theta$ are the same. Thus, we have Euler’s formula

$$e^{i\theta} = \cos \theta + i \sin \theta.$$

This is not really a derivation since we haven’t defined the cosine, sine, or exponential functions with complex arguments, much less determined their power series. So, actually, Euler’s formula is an insight into what becomes the definition of the exponential function with a pure imaginary argument.
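The comparison of coefficients can be watched numerically by summing the exponential series at a purely imaginary argument. The sketch below is an illustration, not a derivation; the function name and the number of terms are arbitrary choices.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z/1! + z^2/2! + ... with the given number of terms."""
    total, term = 0, 1
    for n in range(terms):
        total += term
        term = term * z / (n + 1)  # next term z^(n+1) / (n+1)!
    return total

theta = 2.0
print(exp_series(1j * theta))
print(complex(math.cos(theta), math.sin(theta)))
```

The two printed values agree to within rounding error, as Euler's formula predicts.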


Using Euler's formula, we can write a complex number $z=a+bi$ in polar coordinates form as $z=re^{i\theta}$. For example,

$$5=5e^{i0},\quad -5=5e^{i\pi},\quad i=e^{i(\pi/2)},\quad -i=e^{i(-\pi/2)},\quad -1+i=\sqrt{2}\,e^{i(3\pi/4)}.$$

As a further example, the circle of radius 5 centered at $z_0$ is $|z-z_0|=5$, or equivalently, $z=z_0+5e^{i\theta}$, where $0\le\theta<2\pi$.

We have seen that we can represent a complex number as an ordered pair of real numbers or in polar coordinates. We can also represent it as a vector

$$\begin{bmatrix}a\\ b\end{bmatrix},$$

where $a$ and $b$ are real numbers. And we can represent a complex number as a $2\times2$ matrix:

$$\begin{bmatrix}a&-b\\ b&a\end{bmatrix}\quad\text{or}\quad r\begin{bmatrix}\cos\theta&-\sin\theta\\ \sin\theta&\cos\theta\end{bmatrix}.$$

The second matrix is a rotation matrix corresponding to counterclockwise rotation about the origin by the angle $\theta$ in radians. Multiplication by a fixed matrix is a linear transformation of the plane. This allows us to apply complex numbers to problems of geometry. An example of the interplay between complex numbers and plane geometry is the description of the isometries of the Euclidean plane in terms of complex numbers. See Isometries of the Plane in Chapter 4.


2 Intriguing Images

Mathematics, rightly viewed, possesses not only truth, but supreme beauty—a beauty cold and austere, like that of sculpture, without appeal to any part of our weaker nature, without the gorgeous trappings of painting or music, yet sublimely pure, and capable of a stern perfection such as only the greatest art can show. The true spirit of delight, the exaltation, the sense of being more than Man, which is the touchstone of the highest excellence, is to be found in mathematics as surely as poetry.
—BERTRAND RUSSELL (1872–1970), The Study of Mathematics

Many mathematical concepts are embodied in diagrams, drawings, and other kinds of images. A sketch may illustrate a theorem. A picture may point the way to new mathematics. Let us look at some mathematical images and learn about the mathematics behind them.

2.1 Square Pyramidal Square Number

The equation

$$1^2+2^2+3^2+\cdots+24^2=70^2$$

might seem, at first glance, to be a miscellaneous mathematical fact, but it is special. Édouard Lucas (1842–1891) posed a problem, called the Cannonball Puzzle, which asked for a number $N$ such that $N$ cannonballs (spheres) can be placed in a square array, or in a pyramidal array with a square base. Thus, Lucas asked for a solution in integers¹ to the equation

$$1^2+2^2+3^2+\cdots+m^2=n^2,$$

where $N=n^2$. From the formula for the sum of the first $m$ consecutive squares, the equation becomes

$$\frac{m(m+1)(2m+1)}{6}=n^2.$$

Lucas suspected, but was unable to prove, that the only solution is $m=24$ and $n=70$, with $N=4900$, as above. That is, 4900 is the only square pyramidal square number greater than 1. Mathematicians after Lucas succeeded in proving this (see [4] for a particularly clear treatment). Figure 2.1 shows the pyramid of 4900 spheres.

¹ A polynomial equation required to have integer solutions is called a Diophantine equation, after Diophantus of Alexandria (c. 200–c. 284 C.E.), whose book Arithmetica treats equations of this kind.

Figure 2.1. A pyramid of 4900 spheres.

The set of solutions to

$$\frac{x(x+1)(2x+1)}{6}=y^2$$

comprises what is called an elliptic curve. The study of elliptic curves is an active and fascinating area of mathematics. An excellent book is [53].

The solution to the Cannonball Puzzle is the basis for the existence of a famous lattice in 24 dimensions known as Leech's lattice, discovered by John Leech (1926–1992). A lattice is a regularly repeating pattern of points, such as the intersection points of graph paper. If we put a sphere of the same radius around each lattice point so that the spheres just touch, then the lattice yields a sphere packing. Leech's lattice has a remarkably high contact number (the number of nearest neighbors of each lattice point): 196,560. In the sphere packing associated with Leech's lattice, every sphere touches exactly 196,560 other spheres. This is the maximum contact number for a lattice sphere packing in 24 dimensions.

Leech's lattice can be constructed from a Lorentzian lattice in 26-dimensional space. The vector $w=(0,1,2,3,\ldots,24\mid 70)$ has length 0 in this space, since in the Lorentzian metric we compute length to be the square root of the sum of the squares of the coordinates, with the exception that the square of the last coordinate is subtracted. Leech's lattice (a 24-dimensional linear space) is the quotient space $w^{\perp}/w$ of the orthogonal space to $w$ (dimension 25) and the space spanned by $w$ (dimension 1). The definitive reference on lattices and sphere packing is [14].
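The Cannonball Puzzle is easy to explore numerically. The following is a minimal sketch, not a proof: it searches a finite range of $m$ for square pyramidal numbers that are perfect squares. The function names and the search limit are illustrative choices, not taken from the text.

```python
from math import isqrt


def square_pyramidal(m: int) -> int:
    """Return 1^2 + 2^2 + ... + m^2 using the closed form m(m+1)(2m+1)/6."""
    return m * (m + 1) * (2 * m + 1) // 6


def cannonball_solutions(limit: int) -> list[tuple[int, int]]:
    """Return all pairs (m, n) with 1 <= m <= limit and 1^2 + ... + m^2 = n^2."""
    solutions = []
    for m in range(1, limit + 1):
        total = square_pyramidal(m)
        n = isqrt(total)          # integer square root, exact for big integers
        if n * n == total:
            solutions.append((m, n))
    return solutions


if __name__ == "__main__":
    # A finite search only; the theorem that no further solutions exist
    # is what the mathematicians after Lucas proved.
    print(cannonball_solutions(100_000))
```

Within any finite range the search finds only the trivial solution $m=n=1$ and Lucas's solution $m=24$, $n=70$, in line with the theorem quoted above.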


Figure 2.2. An order-preserving labeling of the full binary tree of order 4.

2.2 Binary Trees

Consider

$$\binom{2^n-2}{2^{n-1}-1}\binom{2^{n-1}-2}{2^{n-2}-1}^{2}\binom{2^{n-2}-2}{2^{n-3}-1}^{2^2}\cdots\binom{2}{1}^{2^{n-2}},\qquad n\ge 1.$$

The expression counts the number of order-preserving labelings of the full binary tree of order $n$, with the integers $1,\ldots,2^n-1$. The full binary tree of order $n$ is a directed graph with a top node joined by arrows to two nodes at the next level down; each of these is joined by arrows to two nodes at the next level down, and so on for $n$ levels in all. In an order-preserving labeling, each node is labeled with a smaller number than the labels of any of its descendants. Figure 2.2 shows an example with $n=4$.

To see that this is what is counted, notice that the 1 must go at the top node of the tree. Then there is a choice of half of the remaining elements to go into the left subtree. It may be made in $\binom{2^n-2}{2^{n-1}-1}$ ways. This leaves the other elements in the right subtree. The least element in each subtree must go on top. Repeating, allowing for all the choices in all the branches at each level, gives the expression. Call this number $f(n)$. The table below gives some numerical values.

n    f(n)
1    1
2    2
3    80
4    21964800
5    74836825861835980800000

If $n=1$, the expression is an empty product, which we define to be 1.

The labeled binary tree in Figure 2.2 is an example of a data structure in computer science called a heap. A heap can be viewed as a labeled subtree of a full binary tree. Figure 2.3 illustrates a heap on the set $\{1,2,3,4,5,6,7,8,9\}$. Heaps allow for quick insertion or deletion of minimum or maximum values, and they are used in a sorting algorithm called heapsort.
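The counting argument above amounts to the recurrence $f(n)=\binom{2^n-2}{2^{n-1}-1}\,f(n-1)^2$ with $f(1)=1$: choose the labels for the left subtree, then label each subtree independently. The sketch below evaluates that recurrence and reproduces the table; the function name is an illustrative choice.

```python
from math import comb


def order_preserving_labelings(n: int) -> int:
    """Count order-preserving labelings of the full binary tree of order n.

    Uses f(1) = 1 and f(k) = C(2^k - 2, 2^(k-1) - 1) * f(k-1)^2.
    """
    count = 1  # f(1): a single node can be labeled in one way
    for k in range(2, n + 1):
        below_root = 2**k - 2            # nodes other than the root
        left_size = below_root // 2      # nodes in the left subtree
        count = comb(below_root, left_size) * count**2
    return count


if __name__ == "__main__":
    for n in range(1, 6):
        print(n, order_preserving_labelings(n))
```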


Figure 2.3. A heap.

2.3 Bulging Hyperspheres

Figure 2.4 shows a square of side length 4 circumscribing four circles of radius 1. By the Pythagorean theorem, the radius of the small circle in the middle of the larger circles is $\sqrt{2}-1$. What happens if we generalize to any dimension $d\ge 1$? Suppose that we have a hypercube of side 4 in $d$-dimensional space, containing $2^d$ hyperspheres of radius 1. See Volume of a Ball in Chapter 3. We can place the hypercube so that its center is at the origin and the unit hyperspheres are centered at $(\pm1,\pm1,\ldots,\pm1)$. By the Pythagorean theorem, the distance between the center of one of the unit hyperspheres and the center of the hypersphere that sits in the middle of them is $\sqrt{d}$. Hence the radius of the small hypersphere is

$$r=\sqrt{d}-1.$$

For $d=4$, the radius of the small hypersphere is 1, the same as the radii of the other hyperspheres. For $d>4$, the small hypersphere is larger than those that surround it. For $d=9$, the radius is 2 and the small hypersphere touches the sides of the hypercube. For $d>9$, the small hypersphere bulges outside the hypercube!

Figure 2.4. A circle surrounded by other circles.
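As a quick numerical illustration of the formula $r=\sqrt{d}-1$, the sketch below tabulates the inner radius for a few dimensions and compares it with 2, the distance from the center of the hypercube to its sides. The function name is illustrative.

```python
from math import sqrt


def inner_radius(d: int) -> float:
    """Radius of the central hypersphere touching the 2^d unit hyperspheres."""
    return sqrt(d) - 1


if __name__ == "__main__":
    for d in (2, 3, 4, 9, 10, 16):
        r = inner_radius(d)
        # The hypercube has side 4, so its sides are at distance 2 from the center.
        status = "bulges outside" if r > 2 else "fits inside"
        print(f"d = {d:2d}   r = {r:.4f}   {status}")
```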

2.4 Projective Plane

A projective plane is a geometry in which every two points determine a line and every two lines intersect in exactly one point. There are no parallel lines! Figure 2.5 shows a thirteen-point projective plane.


Figure 2.5. A projective plane of order three.

The thirteen-point projective plane has thirteen lines (in the figure, some of the lines are curved). Every line contains four points and every point is on four lines. Nine of the points are labeled with coordinates 00 through 22. The other four points, called ideal points, are labeled 0, 1, 2, and $\infty$.

We call this little universe a projective plane of order three (one less than the number of points per line). A projective plane of order $n$ is a collection of $n^2+n+1$ points and $n^2+n+1$ lines such that each line contains $n+1$ points, each point lies on $n+1$ lines, every two points determine a unique line, and every two lines intersect in exactly one point.

There exists a projective plane of order equal to any power of a prime number. No one knows if there is a projective plane whose order is not a prime power. The smallest integer greater than 1 that isn't a prime power is 6, and Gaston Tarry (1843–1913) proved that there is no projective plane of order 6. The next feasible order, 10, has been ruled out by a combination of mathematics and computer calculations: there is no projective plane of order 10. The existence of a projective plane of order 12 remains an open question. If it exists, it would have 157 points and 157 lines.

2.5 Two-Colored Graph

In graph theory, a graph is a collection of vertices and a collection of edges joining pairs of vertices. The edges may be straight or curved and may cross. Figure 2.6 shows a complete graph on 17 vertices. It is complete because every two vertices are joined by an edge. The edges of the graph are colored with two colors, indicated by dark lines and light lines. The coloring has the property that there exist no four vertices all of whose six edge connections are the same color. However, every two-coloring of the edges of the complete graph on 18 vertices must have four vertices all of whose edge connections are the same color. This statement is an instance of a combinatorial result called Ramsey's theorem.

Ramsey's theorem, discovered by Frank Ramsey (1903–1930), says that for every $n$, there exists a least integer $R(n)$ so that no matter how the edges of a complete graph on $R(n)$ vertices are two-colored, there exist $n$ vertices all of whose edge connections are the same color. Thus $R(4)=18$. Can you show that $R(3)=6$?


The coloring of Figure 2.6 has a cyclic symmetry, with every vertex joined by dark edges to the vertices at steps 1, 2, 4, 8, 9, 13, 15, and 16 clockwise around the circle. A good reference on graph theory is [54].

Figure 2.6. A two-coloring of the complete graph on 17 vertices.
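The cyclic description of the coloring makes it easy to check by brute force. The sketch below assumes the vertices are numbered 0 through 16 around the circle and that an edge is dark exactly when the step between its endpoints is in the list given above. It then tests every set of four vertices. The names are illustrative.

```python
from itertools import combinations

# Steps (mod 17) joined by dark edges, as listed in the text.
DARK_STEPS = {1, 2, 4, 8, 9, 13, 15, 16}


def is_dark(u: int, v: int, n: int = 17) -> bool:
    """True if the edge uv is dark. The step set is closed under negation
    mod 17, so the answer does not depend on the order of u and v."""
    return (u - v) % n in DARK_STEPS


def has_monochromatic_clique(size: int, n: int = 17) -> bool:
    """True if some `size` vertices have all their connecting edges one color."""
    for vertices in combinations(range(n), size):
        colors = {is_dark(u, v, n) for u, v in combinations(vertices, 2)}
        if len(colors) == 1:
            return True
    return False


if __name__ == "__main__":
    print("monochromatic 4 vertices:", has_monochromatic_clique(4))
    print("monochromatic triangle:  ", has_monochromatic_clique(3))
```

The check on four vertices confirms only the lower bound $R(4)>17$; it says nothing about colorings of the complete graph on 18 vertices.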

2.6 Hypercube

To sketch a two-dimensional drawing of a cube, draw two squares separated by a little distance and draw four lines joining corresponding vertices. We can go a step further and draw a hypercube, a four-dimensional cube. Draw two cubes a little distance apart and draw lines joining corresponding vertices. The drawing has sixteen vertices and thirty-two edges, each vertex joined to four other vertices. Figure 2.7 shows one way to draw the picture.

A hypercube can be defined combinatorially as the set of sixteen binary strings of length four, e.g., 0110, where two strings are joined if and only if they differ in exactly one place. For example, the strings 0110 and 0111 are joined. This way of thinking about a hypercube is useful in constructing an example of a graph coloring. In the exercises in Appendix B, you are asked to give a three-coloring of the edges of a complete graph on 16 vertices such that there exists no triangle all of whose edges are the same color. In such a coloring, each single-color subgraph is a hypercube with the diagonals added.

Figure 2.7. A hypercube.
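The combinatorial definition translates directly into code. The sketch below builds the hypercube from binary strings and joins two strings when they differ in exactly one place; the function name and the `dim` parameter are illustrative.

```python
from itertools import combinations, product


def hypercube(dim: int = 4):
    """Return (vertices, edges) of the dim-dimensional cube on binary strings."""
    vertices = ["".join(bits) for bits in product("01", repeat=dim)]
    edges = [
        (u, v)
        for u, v in combinations(vertices, 2)
        if sum(a != b for a, b in zip(u, v)) == 1  # differ in exactly one place
    ]
    return vertices, edges


if __name__ == "__main__":
    vertices, edges = hypercube(4)
    print(len(vertices), "vertices,", len(edges), "edges")
    degree = {v: 0 for v in vertices}
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    print("every vertex has degree", set(degree.values()))
```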


Figure 2.8. A full adder.

2.7 Full Adder

Figure 2.8 shows a logic diagram for an integrated circuit known as a full adder. It is the backbone of the arithmetic unit of a computer. The circuit performs binary addition. Given two input bits (0 or 1) and a previous carry bit, the circuit adds the numbers in binary and yields an output bit and a next carry bit. Here is the truth table for this circuit.

input1  input2  carry  output  nextcarry
0       0       0      0       0
0       0       1      1       0
0       1       0      1       0
0       1       1      0       1
1       0       0      1       0
1       0       1      0       1
1       1       0      0       1
1       1       1      1       1

The elements in a full adder are called logic gates. The basic types are AND gates, OR gates, and NOT gates. They perform the logical operations described by their names. In combination they can create complex circuits such as the full adder.

The NOT gate is the simplest. It changes an input of 1 to 0, and 0 to 1.

input  output
0      1
1      0

The AND gate yields an output of 1 if and only if both inputs are 1.

input1  input2  output
0       0       0
0       1       0
1       0       0
1       1       1

The OR gate yields an output of 1 if at least one input is 1.

input1  input2  output
0       0       0
0       1       1
1       0       1
1       1       1

All aspects of a computer's thinking, including memory, are formed from these building blocks. The mathematical underpinnings are discussed in [43].
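To see how the three basic gates combine, here is a minimal sketch of a full adder built only from AND, OR, and NOT. The wiring is one standard construction (two exclusive-or stages plus a carry stage) and is an assumption: it is not necessarily the arrangement drawn in Figure 2.8, though it produces the same truth table.

```python
def NOT(a: int) -> int:
    return 1 - a


def AND(a: int, b: int) -> int:
    return a & b


def OR(a: int, b: int) -> int:
    return a | b


def XOR(a: int, b: int) -> int:
    """Exclusive or, expressed with the three basic gates."""
    return OR(AND(a, NOT(b)), AND(NOT(a), b))


def full_adder(input1: int, input2: int, carry: int) -> tuple[int, int]:
    """Return (output, nextcarry) for one column of binary addition."""
    partial = XOR(input1, input2)
    output = XOR(partial, carry)
    nextcarry = OR(AND(input1, input2), AND(partial, carry))
    return output, nextcarry


if __name__ == "__main__":
    print("input1 input2 carry output nextcarry")
    for a in (0, 1):
        for b in (0, 1):
            for c in (0, 1):
                out, nxt = full_adder(a, b, c)
                print(a, b, c, out, nxt)
```

The output bit and next carry bit together are the two-bit binary representation of input1 + input2 + carry.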

2.8 Sierpiński's Triangle

Figure 2.9 shows a fractal shape called Sierpiński's triangle, introduced by Wacław Sierpiński (1882–1969). The triangle is shown at the sixth step of its formation. Starting with a solid equilateral triangle, at the first step the triangle is divided into four equal-size equilateral triangles and the middle one is removed. At each subsequent step, this process is repeated on the remaining solid equilateral triangles.

The area of the Sierpiński triangle is 0. Suppose that the starting equilateral triangle has area 1. At the first step, the area is 3/4 of the original area since one of the four sub-triangles is removed. At each further step, the area is reduced to 3/4 of the previous area. Hence, the area at step $k$ is

$$\left(\frac{3}{4}\right)^{k},$$

and this tends to 0 as $k$ tends to infinity.

Figure 2.9. The Sierpiński triangle.

Sierpiński's triangle isn't one-dimensional or two-dimensional. It has a fractional dimension, called Hausdorff dimension, equal to

$$\frac{\log 3}{\log 2}=1.585\ldots.$$

The reason is that Sierpiński's triangle is the union of three copies of itself, each scaled down by a factor of two.

2.9 Squaring Map

Define a graph whose vertices are the integers modulo $n$. For two vertices $A$ and $B$, draw a directed arrow from vertex $A$ to vertex $B$ if

$$A^2 \bmod n = B.$$

We call this graph the squaring map modulo $n$. You may want to draw the squaring map for some small values of $n$, such as $2\le n\le 10$. Here is the squaring map modulo 25.

[Figure: the squaring map modulo 25.]

The arrows for the vertices that map to themselves, 0 and 1, aren't shown. There is a directed cycle of length four, namely,

$$6\to 11\to 21\to 16\to 6.$$


We call the sets $\{0\}$, $\{1\}$, and $\{6,11,21,16\}$ attractors. If we start at a vertex and follow the arrows, we end up in an attractor. If there is a directed path from one vertex to another, we say that the vertices are in the same component of the graph. The squaring map modulo 25 has three components.

The squaring map modulo $n$ always has exactly one attractor in each component (exercise). Hence, the squaring map modulo $n$ always has at least two components, corresponding to the attractors $\{0\}$ and $\{1\}$. It can be shown (more difficult) that the squaring map modulo $n$ has exactly two components if and only if $n$ is a power of 2 or a Fermat prime. A Fermat prime is a prime of the form $F_j=2^{2^j}+1$, where $j\ge 0$. The only known Fermat primes are $F_0=3$, $F_1=5$, $F_2=17$, $F_3=257$, and $F_4=65537$. What does the squaring map look like when $n=17$? The graph appears, with a few minor differences, as a figure earlier in this chapter.

Using a computer, can you find the number of components and the size of a largest attractor when $n$ is 1,000,000? For a complete solution to the squaring map problem, see [24].
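Since each component contains exactly one attractor, counting components reduces to finding the directed cycles of the map $x\mapsto x^2 \bmod n$. The sketch below does this by following arrows from each vertex and recording a cycle the first time a walk returns to a vertex on its own path. The function name and the three-state bookkeeping are illustrative choices, not the method of [24].

```python
def squaring_map_attractors(n: int) -> list[list[int]]:
    """Return the attractors (directed cycles) of x -> x*x mod n."""
    UNVISITED, ON_PATH, DONE = 0, 1, 2
    state = [UNVISITED] * n
    attractors = []
    for start in range(n):
        path = []
        x = start
        # Follow arrows until we reach a vertex we have seen before.
        while state[x] == UNVISITED:
            state[x] = ON_PATH
            path.append(x)
            x = x * x % n
        # If that vertex lies on the current path, we have closed a new cycle.
        if state[x] == ON_PATH:
            attractors.append(path[path.index(x):])
        for v in path:
            state[v] = DONE
    return attractors


if __name__ == "__main__":
    cycles = squaring_map_attractors(25)
    print("components modulo 25:", len(cycles))
    print("attractors:", cycles)
```

The same function can be called with `n = 1_000_000` to explore the question posed above.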

2.10 Riemann Sphere

The Riemann sphere builds on the definition of the complex plane as a representation of complex numbers. It is possible to model the real line and the plane by including a new point that we call $\infty$. The models are called the extended real line or plane. What do they look like? The extended real line is a circle, since $\pm\infty$ are the same point. Similarly, the extended plane is a sphere called the Riemann sphere, named after Bernhard Riemann (1826–1866); see Figure 2.10.

Stereographic projection is a bijection between points in the plane and points on the punctured Riemann sphere (with the North Pole removed). The North Pole is identified with $\infty$. We take the Riemann sphere to be a unit sphere centered at $(0,0,0)$. The North Pole is $(0,0,1)$ and the South Pole is $(0,0,-1)$. In stereographic projection, a point $z=x+yi\equiv(x,y,0)$ in the complex plane (which is equivalent to the xy-plane) is mapped to the point $(x',y',z')$ on the sphere so that $(0,0,1)$, $(x,y,0)$, and $(x',y',z')$ are collinear. The origin is mapped to the South Pole; the unit circle in the plane is mapped to the equator; circles centered at the origin of radius greater than (respectively, less than) 1 are mapped to latitudinal circles in the northern (respectively, southern) hemisphere; and lines through the origin are mapped to longitudinal circles.

Figure 2.10. Stereographic projection.


We define the extended complex plane $\hat{\mathbf{C}}$ to be the complex plane $\mathbf{C}$ together with the point at infinity $\infty$; that is, $\hat{\mathbf{C}}=\mathbf{C}\cup\{\infty\}$. Stereographic projection is a correspondence between $\hat{\mathbf{C}}$ and the Riemann sphere.

Let's find the correspondence between $(x,y,0)$ and $(x',y',z')$. The line determined by $(0,0,1)$ and $(x,y,0)$ has parametric form

$$x'=\lambda x,\quad y'=\lambda y,\quad z'=1-\lambda,\quad \lambda\in\mathbf{R}.$$

Since $(x',y',z')$ lies on the unit sphere,

$$(\lambda x)^2+(\lambda y)^2+(1-\lambda)^2=1,$$

or

$$\lambda^2(x^2+y^2+1)-2\lambda=0,$$

a quadratic equation in $\lambda$. One of the roots is $\lambda=0$, which corresponds to the North Pole. The other is

$$\lambda=\frac{2}{x^2+y^2+1}=\frac{2}{|z|^2+1}.$$

This yields

$$x'=\frac{2\,\Re z}{|z|^2+1},\quad y'=\frac{2\,\Im z}{|z|^2+1},\quad z'=\frac{|z|^2-1}{|z|^2+1}.$$

In the reverse direction, we obtain from the parametric formulas

$$x=\frac{x'}{1-z'},\quad y=\frac{y'}{1-z'}.$$

Stereographic projection gives a correspondence between lines and circles in the plane and circles on the sphere. A curve that is either a line or a circle is called a "lircle." See Figure 2.11.

Figure 2.11. Lircles on the Riemann sphere.

Let's prove this. The equation for a lircle is

$$A(x^2+y^2)+Bx+Cy+D=0.$$

If $A=0$, then we have a line. The condition that the circle is not degenerate is

$$4AD<B^2+C^2.$$

We see this by completing squares. Substitution gives

$$A\left[\left(\frac{x'}{1-z'}\right)^2+\left(\frac{y'}{1-z'}\right)^2\right]+B\,\frac{x'}{1-z'}+C\,\frac{y'}{1-z'}+D=0,$$

which simplifies to

$$A(x'^2+y'^2)+Bx'(1-z')+Cy'(1-z')+D(1-z')^2=0.$$

Since $x'^2+y'^2+z'^2=1$, we have

$$A(1-z'^2)+Bx'(1-z')+Cy'(1-z')+D(1-z')^2=0,$$

which upon division by $1-z'$ yields

$$A(1+z')+Bx'+Cy'+D(1-z')=0,$$

the equation of a plane. This is obvious if the lircle is a line. The intersection of a plane and the sphere is a circle. Finally, the condition that the plane intersects the sphere is the nondegeneracy condition for lircles. To see this, use the formula for distance from a point to a plane. It is clear that the steps are reversible and hence all circles on the sphere are stereographic images of lircles in the complex plane.

A nice property of stereographic projection is that it preserves angles. Suppose that $ax+by+c=0$ and $dx+ey+f=0$ are two lines in the complex plane. We know that their images under stereographic projection are circles that intersect at the North Pole and another point. The angles of the curves at the intersections are the same (by symmetry), so let us determine the angle of intersection at the North Pole. From the equation for the plane of stereographic projection of a lircle, we see that the stereographic projections of the lines lie on the planes $ax'+by'+c(1-z')=0$ and $dx'+ey'+f(1-z')=0$, respectively. A tangent to a circle lying on a sphere lies in the tangent plane to the sphere at the point of tangency. Hence, tangents to the circles at the North Pole lie in the plane $z'=1$; they are given by $ax'+by'=0$, $z'=1$ and $dx'+ey'=0$, $z'=1$. It is evident from the form of the equations that the angle between the tangent lines is the same as the angle between the original lines.

The Riemann sphere has many pleasing properties. Any two lines intersect at $\infty$. A neighborhood of $\infty$ is a spherical cap on the Riemann sphere; it corresponds, under stereographic projection, to the outside of a circle centered at the origin. Thus, any path that moves away from the origin (a ray or spiral, for example) is said to tend to $\infty$. This is different from the real line, where we distinguish between $+\infty$ and $-\infty$. Another nice property of the Riemann sphere is that stereographic projections of the points $z$ and $-1/\bar{z}$ are antipodal.

The symmetries (self-similarities) of the Riemann sphere are the Möbius functions, also called linear fractional transformations, of the form

$$z\mapsto\frac{az+b}{cz+d},\quad a,b,c,d\in\mathbf{C}.$$
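The stereographic projection formulas of this section are short enough to try out directly. The sketch below implements the map from the plane to the unit sphere and its inverse, using Python's built-in complex numbers. The function names are illustrative, and the point at infinity (the North Pole) is deliberately not handled.

```python
def to_sphere(z: complex) -> tuple[float, float, float]:
    """Project a point z of the complex plane onto the unit sphere."""
    m = abs(z) ** 2
    return (2 * z.real / (m + 1), 2 * z.imag / (m + 1), (m - 1) / (m + 1))


def to_plane(point: tuple[float, float, float]) -> complex:
    """Inverse projection; the North Pole (0, 0, 1) is excluded."""
    x, y, z = point
    return complex(x / (1 - z), y / (1 - z))


if __name__ == "__main__":
    print(to_sphere(0))         # the origin goes to the South Pole
    print(to_sphere(1j))        # a point of the unit circle lands on the equator
    w = 1 + 2j
    print(to_plane(to_sphere(w)))           # round trip returns w (up to rounding)
    print(to_sphere(w))
    print(to_sphere(-1 / w.conjugate()))    # the antipodal point of the line above
```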


3 Captivating Formulas

Mathematicians do not study objects, but relations among objects; they are indifferent to the replacement of objects by others as long as the relations don't change. Matter is not important, only form interests them.
—HENRI POINCARÉ (1854–1912)

Mathematical formulas, whether simple or complicated, convey in symbols the essence of mathematicians' discoveries. Some formulas are well known, such as Euler's formula $e^{i\theta}=\cos\theta+i\sin\theta$. Some are less known. We will look at a few formulas I find beautiful, some stark and some ornate. You may find them beautiful too.

3.1 Arithmetical Wonders

Here are three arithmetical curiosities:

$$123456789\times 8+9=987654321$$

$$123456789\times 9+10=1111111111$$

$$111111111\times 111111111=12345678987654321.$$

It is easy to verify their truth, but why do they work? What happens when you do the multiplication?

3.2 Heron's Formula and Heronian Triangles

Heron's formula, discovered by Heron of Alexandria (c. 10–70), gives the area of a triangle in terms of its side lengths. Suppose that a triangle has side lengths $a$, $b$, $c$, and semiperimeter $s=(a+b+c)/2$. Then the area of the triangle is

$$A=\sqrt{s(s-a)(s-b)(s-c)}.$$

For instance, a triangle with sides 10, 11, and 13 has semiperimeter $s=17$ and area

$$\sqrt{17\cdot 7\cdot 6\cdot 4}=2\sqrt{714}.$$


We will prove Heron's formula. Let vectors $\mathbf{a}$ and $\mathbf{b}$ represent the sides of lengths $a$ and $b$. By the determinant formula of Chapter 1, the area of the triangle is

$$A=\tfrac{1}{2}\,|\det M|,$$

where $M$ is the $2\times 2$ matrix whose rows are $\mathbf{a}$ and $\mathbf{b}$. Since the transpose $M^t$ of $M$ has the same determinant,

$$4A^2=\det M\,\det M^t=\det(MM^t)=\begin{vmatrix}a^2&\mathbf{a}\cdot\mathbf{b}\\ \mathbf{a}\cdot\mathbf{b}&b^2\end{vmatrix}.$$

The third side of the triangle is represented by the vector $\mathbf{c}=\mathbf{a}-\mathbf{b}$, and

$$c^2=(\mathbf{a}-\mathbf{b})\cdot(\mathbf{a}-\mathbf{b})=a^2-2\,\mathbf{a}\cdot\mathbf{b}+b^2.$$

Solving for $\mathbf{a}\cdot\mathbf{b}$, we obtain

$$4A^2=\begin{vmatrix}a^2&(a^2+b^2-c^2)/2\\ (a^2+b^2-c^2)/2&b^2\end{vmatrix},$$

$$\begin{aligned}
16A^2&=\begin{vmatrix}2a^2&a^2+b^2-c^2\\ a^2+b^2-c^2&2b^2\end{vmatrix}\\
&=4a^2b^2-(a^2+b^2-c^2)^2\\
&=(2ab+a^2+b^2-c^2)(2ab-a^2-b^2+c^2)\\
&=((a+b)^2-c^2)(c^2-(a-b)^2)\\
&=(a+b+c)(a+b-c)(c+a-b)(c-a+b),
\end{aligned}$$

$$A^2=s(s-c)(s-b)(s-a),$$

$$A=\sqrt{s(s-a)(s-b)(s-c)}.$$

A Heronian triangle is a triangle with rational side lengths and area. An example is the familiar right triangle with sides 3, 4, 5, and area 6. We will give a formula that generates all Heronian triangles.

Since the area of a triangle is half of its base times its height, the altitudes of a Heronian triangle are rational. Hence, we may scale a Heronian triangle by a rational factor so that it has an altitude of 2. We will assume that this altitude is to a longest side of the triangle. Thus, the triangle splits into two right triangles, as in the diagram.

[Diagram: a triangle with an altitude of length 2 drawn to its longest side; the altitude splits that side into segments w and x, and the other two sides have lengths y and z.]


By hypothesis, the lengths $w+x$, $y$, and $z$ are rational. We will show that $w$ and $x$ are rational. By the Pythagorean theorem,

$$w^2+4=y^2$$

$$x^2+4=z^2.$$

Subtraction gives

$$(w+x)(w-x)=y^2-z^2,$$

and hence $w-x$ is rational. It follows that $(w+x)+(w-x)=2w$ is rational, and thus $w$ and $x$ are rational.

From $w^2+4=y^2$, we have $4=(y+w)(y-w)$. Set

$$y+w=2p$$

$$y-w=\frac{2}{p},$$

where $p$ is a rational number. By the triangle inequality, $y+w>2$ and so $p>1$. Solving for $y$ and $w$, we obtain

$$w=p-\frac{1}{p},\quad y=p+\frac{1}{p},\quad p>1.$$

This gives rational side lengths for the right triangle on the left in the diagram. The other right triangle has a similar form:

$$x=q-\frac{1}{q},\quad z=q+\frac{1}{q},\quad q>1.$$

Allowing for a rational scaling factor of $r$, every Heronian triangle has side lengths given uniquely, up to an interchange of $p$ and $q$, by

$$r\left(p+\frac{1}{p}\right),\quad r\left(q+\frac{1}{q}\right),\quad r\left(p-\frac{1}{p}+q-\frac{1}{q}\right),\qquad p,q>1,\ r>0,$$

with area

$$r^2\left(p-\frac{1}{p}+q-\frac{1}{q}\right).$$

For instance, the Heronian triangle corresponding to $p=7/2$, $q=13/5$, and $r=10/19$ has side lengths

$$\frac{265}{133},\quad\frac{388}{247},\quad\frac{4941}{1729},$$

and area

$$\frac{49410}{32851}.$$

The 3–4–5 right triangle comes from $p=2$, $q=3$, $r=6/5$.

Many questions can be asked about Heronian triangles. For example, can we find all Heronian triangles with consecutive integer side lengths? (easy) Are there Heronian triangles whose medians are rational numbers? (unsolved) A gem about Heronian triangles is that they can be scaled so that their vertices have integer coordinates in the plane (see [55]).
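The parametrization can be checked with exact rational arithmetic. The sketch below builds the triangle for given $p$, $q$, $r$ and compares the square of its stated area with Heron's expression $s(s-a)(s-b)(s-c)$, so no square roots are needed. The function names are illustrative.

```python
from fractions import Fraction


def heronian_triangle(p, q, r):
    """Return (sides, area) of the Heronian triangle with parameters p, q > 1, r > 0."""
    p, q, r = Fraction(p), Fraction(q), Fraction(r)
    base = p - 1 / p + q - 1 / q          # w + x for the triangle with altitude 2
    sides = (r * (p + 1 / p), r * (q + 1 / q), r * base)
    area = r**2 * base
    return sides, area


def heron_area_squared(a, b, c):
    """Square of the area from Heron's formula, kept exact for rational sides."""
    s = (a + b + c) / 2
    return s * (s - a) * (s - b) * (s - c)


if __name__ == "__main__":
    for params in (("2", "3", "6/5"), ("7/2", "13/5", "10/19")):
        sides, area = heronian_triangle(*params)
        print([str(side) for side in sides], "area", area,
              "agrees with Heron:", heron_area_squared(*sides) == area**2)
```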


3.3 Sine, Cosine, and Exponential Function Expansions

The power series expansions of the sine, cosine, and exponential functions have esthetic appeal. A power series expansion of a function is an infinite series of the form

$$a_0+a_1x+a_2x^2+a_3x^3+a_4x^4+\cdots,$$

where the $a_n$ are numbers and $x$ is a variable. How do we find the power series expansion of $e^x$? Assume that

$$e^x=a_0+a_1x+a_2x^2+a_3x^3+a_4x^4+\cdots,$$

for all real numbers $x$. If we let $x=0$, then the right side of this expression collapses to $a_0$, while the left side is $e^0=1$. Hence $a_0=1$. Now we know that

$$e^x=1+a_1x+a_2x^2+a_3x^3+a_4x^4+\cdots.$$

To determine $a_1$, differentiate both sides. As the derivative of $e^x$ is $e^x$, we obtain

$$e^x=a_1+2a_2x+3a_3x^2+4a_4x^3+\cdots.$$

Letting $x=0$, we have $1=a_1$, so

$$e^x=1+2a_2x+3a_3x^2+4a_4x^3+5a_5x^4+\cdots.$$

Taking another derivative, we obtain

$$e^x=2a_2+3\cdot 2a_3x+4\cdot 3a_4x^2+5\cdot 4a_5x^3+\cdots.$$

Letting $x=0$, we have $1=2a_2$, so $a_2=1/2$. Repeating, we obtain a power series expansion for the exponential function:

$$e^x=1+x+\frac{x^2}{2!}+\frac{x^3}{3!}+\frac{x^4}{4!}+\cdots.$$

The series converges for all real numbers $x$. In fact, the variable can be any complex number $z$. Setting $x=1$, we obtain a formula for $e$, the base of natural logarithms:

$$e=1+1+\frac{1}{2!}+\frac{1}{3!}+\frac{1}{4!}+\cdots\approx 2.71828.$$

If we do the same for sine and cosine, we find that

$$\sin x=x-\frac{x^3}{3!}+\frac{x^5}{5!}-\frac{x^7}{7!}+\cdots$$

$$\cos x=1-\frac{x^2}{2!}+\frac{x^4}{4!}-\frac{x^6}{6!}+\cdots.$$

Leonhard Euler (1707–1783) observed that the expansion for the exponential function works just as well if $x$ is a complex variable, and if we replace $x$ by $i\theta$, where $i$ is the imaginary unit ($i^2=-1$), then we have a relation among the exponential, sine, and cosine functions:

$$e^{i\theta}=\cos\theta+i\sin\theta.$$


Letting $\theta=\pi$ yields

$$e^{i\pi}=\cos\pi+i\sin\pi=-1,$$

and hence

$$e^{i\pi}+1=0.$$

This relation unites five important mathematical constants, $\pi$, $e$, $i$, 1, and 0, in one formula.
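The series for the exponential function can be summed numerically, for real or complex arguments. The sketch below truncates the series after a fixed number of terms (30 is an arbitrary choice that is ample for the small arguments used here) and compares the results with $e$, with Euler's formula, and with $e^{i\pi}=-1$.

```python
from math import cos, e, factorial, pi, sin


def exp_series(z, terms: int = 30):
    """Partial sum 1 + z + z^2/2! + ... of the exponential series."""
    return sum(z**k / factorial(k) for k in range(terms))


if __name__ == "__main__":
    print(exp_series(1), e)                       # the base of natural logarithms
    theta = 0.75
    print(exp_series(1j * theta), complex(cos(theta), sin(theta)))
    print(exp_series(1j * pi))                    # close to -1, up to rounding error
```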

3.4 Tangent and Secant Function Expansions

We found in the previous section that the power series expansions of $\sin x$ and $\cos x$ follow a simple pattern. What about the power series expansions of $\tan x$ and $\sec x$? They are

$$\tan x=x+\frac{2x^3}{3!}+\frac{16x^5}{5!}+\frac{272x^7}{7!}+\frac{7936x^9}{9!}+\cdots$$

$$\sec x=1+\frac{x^2}{2!}+\frac{5x^4}{4!}+\frac{61x^6}{6!}+\frac{1385x^8}{8!}+\cdots.$$

What is the pattern of the sequences $\{1,2,16,272,7936,\ldots\}$ and $\{1,1,5,61,1385,\ldots\}$? Suppose that

$$\tan x=\frac{a_0x^0}{0!}+\frac{a_1x^1}{1!}+\frac{a_2x^2}{2!}+\frac{a_3x^3}{3!}+\cdots$$

$$\sec x=\frac{b_0x^0}{0!}+\frac{b_1x^1}{1!}+\frac{b_2x^2}{2!}+\frac{b_3x^3}{3!}+\cdots.$$

From the differentiation formula $(\tan x)'=\sec^2x$, we obtain

$$a_1+\frac{a_2}{1!}x+\frac{a_3}{2!}x^2+\cdots=b_0^2+\left(\frac{b_0}{0!}\frac{b_1}{1!}+\frac{b_1}{1!}\frac{b_0}{0!}\right)x+\left(\frac{b_0}{0!}\frac{b_2}{2!}+\frac{b_1}{1!}\frac{b_1}{1!}+\frac{b_2}{2!}\frac{b_0}{0!}\right)x^2+\cdots.$$

Equating coefficients of like powers of $x$, we have

$$\frac{a_n}{(n-1)!}=\sum_{k=0}^{n-1}\frac{b_k}{k!}\,\frac{b_{n-1-k}}{(n-1-k)!},\quad n\ge 1,$$

and hence

$$a_n=\sum_{k=0}^{n-1}\binom{n-1}{k}b_kb_{n-1-k},\quad n\ge 1.$$

Similarly, from $(\sec x)'=\sec x\tan x$, we get

$$b_n=\sum_{k=0}^{n-1}\binom{n-1}{k}a_kb_{n-1-k},\quad n\ge 1.$$

These recurrence relations, with the initial values $a_0=0$ and $b_0=1$, generate the sequences $\{a_n\}$ and $\{b_n\}$. You can show that $a_n=0$ for $n$ even, and $b_n=0$ for $n$ odd (which is expected since tan is an odd function and sec is an even function). Ignoring the 0s, we have the sequences $\{1,2,16,272,7936,\ldots\}$ and $\{1,1,5,61,1385,\ldots\}$.

We might ask whether the sequences have any other significance, and they do. They give the number of alternating permutations of the set $\{1,2,\ldots,n\}$. An alternating permutation is a permutation in which the elements alternately increase and decrease. Let $c_n$ be the number of alternating permutations of $\{1,2,\ldots,n\}$. The following table lists the alternating permutations for $1\le n\le 5$ and the corresponding values of $c_n$.

n   alternating permutations                                  c_n
1   1                                                         1
2   12                                                        1
3   132, 231                                                  2
4   1324, 1423, 2314, 2413, 3412                              5
5   13254, 14253, 14352, 15243, 15342, 23154, 24153, 24351,   16
    25143, 25341, 34152, 34251, 35142, 35241, 45132, 45231

We can guess from these numbers that

$$c_n=\begin{cases}a_n&\text{if }n\text{ is odd}\\ b_n&\text{if }n\text{ is even.}\end{cases}$$

Can you prove this formula by mathematical induction?
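The two recurrences and the guess about alternating permutations can both be tested by machine for small $n$. The sketch below generates $a_n$ and $b_n$ from the recurrences and, separately, counts alternating permutations by brute force, taking "alternating" to mean first up, then down, and so on, as in the table. This is numerical evidence only, not the induction proof asked for. The function names are illustrative.

```python
from itertools import permutations
from math import comb


def tangent_secant_numbers(n_max: int):
    """Return lists a, b with a[n], b[n] the n-th coefficients defined above."""
    a = [0] * (n_max + 1)
    b = [0] * (n_max + 1)
    b[0] = 1
    for n in range(1, n_max + 1):
        a[n] = sum(comb(n - 1, k) * b[k] * b[n - 1 - k] for k in range(n))
        b[n] = sum(comb(n - 1, k) * a[k] * b[n - 1 - k] for k in range(n))
    return a, b


def count_alternating(n: int) -> int:
    """Count permutations p of 1..n with p1 < p2 > p3 < p4 > ..."""
    def alternates(p):
        return all((p[i] < p[i + 1]) == (i % 2 == 0) for i in range(n - 1))
    return sum(alternates(p) for p in permutations(range(1, n + 1)))


if __name__ == "__main__":
    a, b = tangent_secant_numbers(9)
    print("a:", a)
    print("b:", b)
    for n in range(1, 8):
        predicted = a[n] if n % 2 == 1 else b[n]
        print(n, count_alternating(n), predicted)
```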

3.5 Series for Pi

The famous number $\pi$, also called Archimedes' constant, is defined as the ratio of a circle's circumference to its diameter. The "p" in $\pi$ ("pi") stands for "peripheria," the circumference of the circle. The series formula¹

$$\frac{\pi}{4}=1-\frac{1}{3}+\frac{1}{5}-\frac{1}{7}+\frac{1}{9}-\frac{1}{11}+\cdots$$

is a striking representation of $\pi$. We can derive it by starting with the geometric series formula

$$\frac{1}{1+x^2}=1-x^2+x^4-\cdots+(-1)^nx^{2n}+(-1)^{n+1}\frac{x^{2n+2}}{1+x^2}.$$

The expansion is valid for all real numbers $x$ and all integers $n\ge 0$. Integrate both sides:

$$\int_0^1\frac{dx}{1+x^2}=\int_0^1dx-\int_0^1x^2\,dx+\int_0^1x^4\,dx-\cdots+(-1)^n\int_0^1x^{2n}\,dx+(-1)^{n+1}\int_0^1\frac{x^{2n+2}}{1+x^2}\,dx.$$

The left side evaluates to

$$\tan^{-1}1-\tan^{-1}0=\frac{\pi}{4}-0=\frac{\pi}{4},$$

and the right side to

$$1-\frac{1}{3}+\frac{1}{5}-\cdots+(-1)^n\frac{1}{2n+1}$$

plus or minus the last integral, which is bounded:

$$0<\int_0^1\frac{x^{2n+2}}{1+x^2}\,dx<\int_0^1x^{2n+2}\,dx=\frac{1}{2n+3}.$$

Since the upper bound tends to 0 as $n\to\infty$, the integral tends to 0, and this finishes the derivation.

¹ This formula was first discovered by Madhavan of Sangamagramam (1350–1425), and rediscovered by Gottfried Leibniz (1646–1716) and James Gregory (1638–1675).

The series converges slowly. It requires 625 terms to obtain the approximation 3.14 for $\pi$. The difference between $\pi$ and a partial sum of the series has the asymptotic formula

$$\pi-4\sum_{k=1}^{n}\frac{(-1)^{k+1}}{2k-1}\sim(-1)^n\sum_{m=0}^{\infty}\frac{2\,(-1)^m\,b_{2m}}{(2n)^{2m+1}},$$

where $\{b_n\}$ is the sequence of numbers associated with the coefficients of the secant function, given in Tangent and Secant Function Expansions.
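The slow convergence is easy to observe. The sketch below computes partial sums of the series and compares the actual error with the bound from the derivation: after the terms up to $(-1)^n/(2n+1)$, the error in $\pi/4$ is less than $1/(2n+3)$, so the error in $\pi$ is less than $4/(2n+3)$. The function name is illustrative.

```python
from math import pi


def leibniz_partial_sum(terms: int) -> float:
    """4 * (1 - 1/3 + 1/5 - ...), using the given number of terms."""
    return 4 * sum((-1) ** k / (2 * k + 1) for k in range(terms))


if __name__ == "__main__":
    for terms in (10, 100, 1000, 10000):
        approx = leibniz_partial_sum(terms)
        n = terms - 1                      # index of the last term included
        bound = 4 / (2 * n + 3)
        print(f"{terms:6d} terms: {approx:.6f}  error {abs(pi - approx):.2e}  bound {bound:.2e}")
```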

3.6 Product for Pi Wallis’s formula2 for is an infinite product: 2 2 4 4 6 6 : 2 D 1 3 3 5 5 7 We can derive it from an infinite product expansion of the sine function: x x x x x x sin x x 1 1 1 1 1 1 : D C 2 C 2 3 C 3 Although we don’t giveÁ a rigorousÁ proof of thisÁ infiniteÁ product expansion,Á youÁ can intu- itively see that it’s true because the zeros of sin x are n. ˙ Letting x =2, we obtain D 1 1 1 1 1 1 1 1 1 1 1 1 1 :::: D 2 2 C 2 4 C 4 6 C 6 Â ÃÂ ÃÂ ÃÂ ÃÂ ÃÂ Ã Therefore 2 2 4 4 6 6 : 2 D 1 3 3 5 5 7 See [9] for more rigorous approaches. Comparing the product and the series formulas for sin x, we obtain (as Leonhard Euler did) a value of a series. We have x3 x5 x7 x 3Š C 5Š 7Š C x x x x x x x 1 1 1 1 1 1 D C 2 C 2 3 C 3 Á Á Á Á Á Á x2 x2 x2 x 1 1 1 : D 2 22 2 32 Â ÃÂ ÃÂ Ã 2This formula was discovered by John Wallis (1616–1703), who made numerous contributions in algebra, geometry and calculus.


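Equating coefficients of $x^3$ yields

$$-\frac{1}{6} = -\frac{1}{\pi^2} - \frac{1}{2^2\pi^2} - \frac{1}{3^2\pi^2} - \cdots,$$

and so

$$\frac{\pi^2}{6} = \frac{1}{1^2} + \frac{1}{2^2} + \frac{1}{3^2} + \cdots.$$

We will see how to compute further values of the sum

$$\sum_{m=1}^{\infty}\frac{1}{m^k}, \quad k \ge 2,$$

in The Zeta Function and Bernoulli Numbers.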
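Both formulas of this section can be checked numerically. The sketch below is an illustration added here, with hypothetical helper names `wallis_partial` and `basel_partial`; it multiplies the first factor pairs of Wallis’s product and sums the first terms of $\sum 1/m^2$.

```python
import math


def wallis_partial(pairs: int) -> float:
    """Product of the first `pairs` factor pairs (2n/(2n-1)) * (2n/(2n+1))."""
    product = 1.0
    for n in range(1, pairs + 1):
        product *= (2 * n) / (2 * n - 1) * (2 * n) / (2 * n + 1)
    return product


def basel_partial(terms: int) -> float:
    """Partial sum 1/1^2 + 1/2^2 + ... + 1/terms^2."""
    return sum(1.0 / m ** 2 for m in range(1, terms + 1))


if __name__ == "__main__":
    print(2 * wallis_partial(10_000), math.pi)
    print(basel_partial(10_000), math.pi ** 2 / 6)
```

Both approximations converge slowly, with errors of roughly the reciprocal of the number of factors or terms used.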

3.7 Fibonacci Numbers and Pi

The Fibonacci sequence³ $\{F_n\}$ is defined by

$$F_0 = 0, \quad F_1 = 1, \quad F_n = F_{n-1} + F_{n-2}, \quad n \ge 2.$$

The Fibonacci sequence is

0, 1, 1, 2, 3, 5, 8, 13, 21, 34, 55, ....

It may be surprising that

$$\frac{\pi}{4} = \sum_{n=1}^{\infty}\tan^{-1}\frac{1}{F_{2n+1}}.$$

What do the Fibonacci numbers have to do with $\pi$? The partial sums are

$$\sum_{n=1}^{k}\tan^{-1}\frac{1}{F_{2n+1}} = \frac{\pi}{4} - \tan^{-1}\frac{1}{F_{2k+2}},$$

and the summation result follows instantly. The partial sums formula can be derived using the formula for the tangent of a difference and Cassini’s identity

$$F_n^2 - F_{n+1}F_{n-1} = (-1)^{n+1}, \quad n \ge 1.$$

See [28] for a wealth of identities involving Fibonacci numbers.
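The partial-sum formula and Cassini’s identity are easy to test by machine. This is a small sketch with illustrative names (`fibonacci`, `arctan_partial_sum`), not a proof; it uses floating-point arctangents, so the comparison is only up to rounding error.

```python
import math


def fibonacci(count: int) -> list:
    """Return [F_0, F_1, ..., F_{count-1}]."""
    values = [0, 1]
    while len(values) < count:
        values.append(values[-1] + values[-2])
    return values[:count]


def arctan_partial_sum(k: int) -> float:
    """Sum of arctan(1 / F_{2n+1}) for n = 1, ..., k."""
    fib = fibonacci(2 * k + 3)  # indices 0 .. 2k+2
    return sum(math.atan(1.0 / fib[2 * n + 1]) for n in range(1, k + 1))


if __name__ == "__main__":
    fib = fibonacci(50)
    for k in (1, 2, 5, 10):
        closed_form = math.pi / 4 - math.atan(1.0 / fib[2 * k + 2])
        print(k, arctan_partial_sum(k), closed_form)
```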

3.8 Volume of a Ball

A circle of radius $r$ has area $\pi r^2$. A sphere of radius $r$ has volume $\frac{4}{3}\pi r^3$. Does it make sense to talk about the volume of a higher-dimensional sphere? A $d$-dimensional ball of radius $r$, centered at the origin, is the set of points $(x_1, \ldots, x_d)$ in $d$-dimensional Euclidean space such that

$$x_1^2 + \cdots + x_d^2 \le r^2.$$

³The Fibonacci sequence was introduced by Leonardo of Pisa, known as Fibonacci (c. 1170–c. 1250), one of the most innovative mathematicians of the Middle Ages.


The boundary of a ball is a hypersphere. In dimension 1 we have a line segment, in dimension 2 a disk (a circle and its interior), and in dimension 3 an ordinary ball. A 4-dimensional ball is more difficult to picture, but we can use analogies with the lower-dimensional objects. We will show that the volume of a $d$-dimensional ball of radius $r$, for $d \ge 1$, is

$$V_d r^d,$$

where $V_d$, the volume of the unit $d$-dimensional ball (radius 1), is

$$\frac{\pi^{d/2}}{(d/2)!}.$$

Though we cannot construct or even visualize an $n$-dimensional sphere, we can find its volume! If $d$ is even, then $d/2$ is an integer and $(d/2)!$ is the usual factorial function. If $d$ is odd, we have to explain what $(d/2)!$ means. We define

$$\left(\frac{1}{2}\right)! = \frac{\sqrt{\pi}}{2}.$$

An explanation for this, using the gamma function (a generalization of the factorial function), will be given in the next section. Now we can compute $(d/2)!$ for $d$ odd by treating the expression like a normal factorial, multiplying $d/2$ by $d/2 - 1$, $d/2 - 2$, etc., until we get down to $1/2$. For instance,

$$\left(\frac{7}{2}\right)! = \frac{7}{2}\cdot\frac{5}{2}\cdot\frac{3}{2}\cdot\left(\frac{1}{2}\right)! = \frac{105\sqrt{\pi}}{16}.$$

We already know that $V_1 = 2$, $V_2 = \pi$, and $V_3 = (4/3)\pi$. We will check $V_2$ and $V_3$, and then determine $V_4$. Here is a check of the formula for the area of a unit circle. We use polar coordinates:

$$V_2 = \int_0^{2\pi}\int_0^1 r\,dr\,d\theta = \int_0^{2\pi}\frac{1}{2}r^2\Big|_0^1\,d\theta = \int_0^{2\pi}\frac{1}{2}\,d\theta = \frac{1}{2}(2\pi) = \pi.$$

For the unit ball in 3-dimensional space, we use polar coordinates to represent the first two dimensions. We have $x_1^2 + x_2^2 = r^2$, and $x_3$ is bounded by a line segment of radius $\sqrt{1-r^2}$. Summing (with an integral) the volumes of the line segments,

$$V_3 = \int_0^{2\pi}\int_0^1 2\sqrt{1-r^2}\,r\,dr\,d\theta = -\frac{2}{3}(1-r^2)^{3/2}\Big|_0^1(2\pi) = \frac{4}{3}\pi.$$

For the 4-dimensional ball,

$$V_4 = \int_0^{2\pi}\int_0^1 \pi\left(\sqrt{1-r^2}\right)^2 r\,dr\,d\theta = \pi\left(\frac{1}{2}r^2 - \frac{1}{4}r^4\right)\Big|_0^1(2\pi) = \frac{\pi^2}{2}.$$

Thus $V_4 = \pi^2/2$.


Let’s prove the volume formula for the $d$-dimensional unit ball. Representing the first two dimensions by polar coordinates $r$ and $\theta$, the cross-section is a $(d-2)$-dimensional ball of radius $\sqrt{1-r^2}$. Hence

$$V_d = \int_0^{2\pi}\int_0^1 V_{d-2}\left(\sqrt{1-r^2}\right)^{d-2} r\,dr\,d\theta.$$

(Using the letter ‘d’ to denote both a differential and a dimension should cause no confusion.) The double integral is

$$V_{d-2}\left(-\frac{1}{d}\right)\left(1-r^2\right)^{d/2}\Big|_0^1(2\pi) = V_{d-2}\left(\frac{2\pi}{d}\right).$$

Therefore we can find the constants $V_d$ recursively:

$$V_d = V_{d-2}\left(\frac{2\pi}{d}\right), \quad d \ge 2.$$

The volume formula follows at once by mathematical induction. We will use it to prove Lagrange’s four squares theorem in Chapter 4.
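The closed formula and the recursion can be compared numerically. In this sketch (function names are illustrative), $(d/2)!$ is evaluated as $\Gamma(d/2 + 1)$ with Python’s `math.gamma`, and the recursion is started from the known values $V_1 = 2$ and $V_2 = \pi$.

```python
import math


def unit_ball_volume(d: int) -> float:
    """Closed formula pi^(d/2) / (d/2)!, with (d/2)! computed as Gamma(d/2 + 1)."""
    return math.pi ** (d / 2) / math.gamma(d / 2 + 1)


def unit_ball_volume_recursive(d: int) -> float:
    """Recursion V_d = V_{d-2} * (2*pi/d), started from V_1 = 2 and V_2 = pi."""
    if d == 1:
        return 2.0
    if d == 2:
        return math.pi
    return unit_ball_volume_recursive(d - 2) * (2 * math.pi / d)


if __name__ == "__main__":
    for d in range(1, 11):
        print(d, unit_ball_volume(d), unit_ball_volume_recursive(d))
```

Running the loop shows that the unit-ball volume peaks near dimension 5 and then decreases.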

3.9 Euler’s Integral Formula

In Volume of a Ball, we said that to evaluate the factorial of a half-integer we need the gamma function. Leonhard Euler (1707–1783) noted that

$$n! = \int_0^{\infty} t^n e^{-t}\,dt, \quad n \ge 0.$$

This can be obtained by starting with

$$\int_0^{\infty} e^{-at}\,dt = \left(-\frac{e^{-at}}{a}\right)\Big|_{t=0}^{\infty} = \frac{1}{a},$$

and differentiating $n$ times with respect to $a$ to get

$$\int_0^{\infty} (-1)^n t^n e^{-at}\,dt = (-1)^n\frac{n!}{a^{n+1}}.$$

Setting $a = 1$ yields Euler’s result. The gamma function is defined for all complex numbers $z$ with positive real part by

$$\Gamma(z) = \int_0^{\infty} t^{z-1}e^{-t}\,dt, \quad \Re z > 0.$$

For integers,

$$\Gamma(n) = (n-1)!, \quad n \ge 1.$$

We will show that $(1/2)! = \sqrt{\pi}/2$. From the definition of $\Gamma(z)$, we have

$$\left(\frac{1}{2}\right)! = \Gamma\left(\frac{3}{2}\right) = \int_0^{\infty} t^{1/2}e^{-t}\,dt.$$


The change of variables $t = x^2$ gives

$$\left(\frac{1}{2}\right)! = 2\int_0^{\infty} x^2 e^{-x^2}\,dx,$$

and integration by parts (with $u = x$ and $dv = xe^{-x^2}\,dx$) gives

$$\left(\frac{1}{2}\right)! = \left(-xe^{-x^2}\right)\Big|_0^{\infty} + \int_0^{\infty} e^{-x^2}\,dx.$$

The first term on the right is 0, and we use the integral given in A Classic Integral in Chapter 5:

$$\int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi}.$$

Since $e^{-x^2}$ is an even function, $(1/2)!$ is half of this integral.

3.10 Euler’s Polyhedral Formula

A cube has eight vertices, twelve edges, and six faces.

For a convex polyhedron, let V be the number of vertices, E the number of edges, and F the number of faces. Euler’s polyhedral formula is

$$V - E + F = 2.$$

You could test this formula on some other examples, say a tetrahedron and an octahedron. Consider the polyhedron below. It is a cube with a square tunnel from one face to the opposite face.


The polyhedron is drawn as a solid, so we can’t see all its vertices, edges, and faces. But we can infer that $V = 16$, $E = 32$, and $F = 16$. So

$$V - E + F = 0.$$

The hole makes the difference. If a polyhedron can be drawn on a surface with $g$ holes ($g$ is called the genus of the surface), then Euler’s formula is

$$V - E + F = 2 - 2g.$$

Care must be taken in the definition of genus. It applies only to surfaces that are orientable. A Möbius strip, a surface with only one side, is not orientable. For a proof of Euler’s polyhedral formula, see [1]. For generalizations of Euler’s formula, a rich resource is [32].

3.11 The Smallest Taxicab Number

A story is often told about G. H. Hardy (1877–1947) visiting Srinivasa Ramanujan (1887–1920) when Ramanujan was ill in a hospital. Hardy said that the number of his taxicab, 1729, seemed to him to be a dull number. Ramanujan responded that, on the contrary, 1729 is the smallest number that is the sum of two cubes in two ways. That is,

$$1729 = 10^3 + 9^3 = 12^3 + 1^3.$$

However, the statement that 1729 is the smallest such number must be altered if we allow cubes of negative numbers. Let’s find the smallest positive integer that is the sum of two cubes (positive or negative) in two ways. We list the first ten cubes.

n     1   2   3    4    5     6     7     8     9     10
n³    1   8   27   64   125   216   343   512   729   1000

From the curious relation

$$3^3 + 4^3 + 5^3 = 6^3,$$

we have

$$91 = 3^3 + 4^3 = 6^3 + (-5)^3,$$

so 91 is the sum of two cubes in two ways. This is the smallest such positive integer, as can be verified using the table.
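The verification “using the table” can also be delegated to a brute-force search. The sketch below is an illustration under stated assumptions: the function name `smallest_two_way_sum` and the search limit are choices made here, cubes of zero are excluded, and two representations count as the same if they differ only in the order of the summands.

```python
import math
from collections import defaultdict


def smallest_two_way_sum(limit: int, allow_negative: bool) -> int:
    """Smallest integer in 1..limit that is a sum of two nonzero cubes in two ways.

    If a > |b| >= 1 and a^3 - |b|^3 <= limit, then 3(a-1)^2 < limit, so
    searching |a|, |b| <= sqrt(limit/3) + 2 covers every representation.
    """
    bound = math.isqrt(limit // 3) + 2
    low = -bound if allow_negative else 1
    representations = defaultdict(set)
    for a in range(low, bound + 1):
        for b in range(a, bound + 1):  # b >= a avoids counting (a, b) and (b, a) twice
            if a == 0 or b == 0:
                continue
            total = a ** 3 + b ** 3
            if 0 < total <= limit:
                representations[total].add((a, b))
    return min(n for n, reps in representations.items() if len(reps) >= 2)


if __name__ == "__main__":
    print(smallest_two_way_sum(2000, allow_negative=False))
    print(smallest_two_way_sum(2000, allow_negative=True))
```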


3.12 Infinity and Infinity Squared

The collection of all whole numbers $\{0, 1, 2, 3, 4, 5, \ldots\}$ is an infinite set. The collection of all ordered pairs of whole numbers,

$$\{(0,0), (0,1), (0,2), \ldots, (1,0), (1,1), (1,2), \ldots\},$$

is also infinite. Does it make sense to talk about the size of an infinite set, to say that two infinite sets are the same size or that one is larger than the other? Georg Cantor (1845–1918) described a way to do so. The key definition is that two sets, finite or infinite, are the same size if there is a bijection (a one-to-one correspondence) between them. Although it may appear that the set of ordered pairs of whole numbers is larger than the set of whole numbers, the two infinite sets are the same size. The mapping

$$(m, n) \mapsto \frac{(m+n)(m+n+1)}{2} + m, \quad m, n \ge 0,$$

takes an ordered pair of whole numbers to a whole number. The correspondence is shown in the table.

        n = 0    1    2    3    4    5    6   ...
m = 0       0    1    3    6   10   15   21   ...
m = 1       2    4    7   11   16   22   29   ...
m = 2       5    8   12   17   23   30   38   ...
m = 3       9   13   18   24   31   39   48   ...
m = 4      14   19   25   32   40   49   59   ...
m = 5      20   26   33   41   50   60   71   ...
m = 6      27   34   42   51   61   72   84   ...
...

The table contains all the whole numbers, increasing in order along diagonals. Given a whole number $w$, let $n$ be the unique whole number such that

$$\frac{n(n+1)}{2} \le w < \frac{(n+1)(n+2)}{2}.$$

Then let $m$ be the whole number defined by

$$m = w - \frac{n(n+1)}{2}.$$

Here $n$ is the index of the diagonal that contains $w$ (the sum of the row and column numbers), so $w$ sits in row $m$ and column $n - m$. The mapping $w \mapsto (m, n - m)$ is the other half of the bijection. Thus, the set of all whole numbers has just as many elements as the set of all ordered pairs of whole numbers.
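The two halves of the bijection can be written out and tested against each other. This is a minimal sketch with illustrative names `pair` and `unpair`; in `unpair`, the variable `diagonal` is the index of the diagonal containing $w$, so the column is the diagonal index minus the row.

```python
import math


def pair(m: int, n: int) -> int:
    """Map the ordered pair (m, n) to (m+n)(m+n+1)/2 + m."""
    diagonal = m + n
    return diagonal * (diagonal + 1) // 2 + m


def unpair(w: int) -> tuple:
    """Inverse map: recover (m, n) from the whole number w."""
    # Largest diagonal index with diagonal*(diagonal+1)/2 <= w.
    diagonal = (math.isqrt(8 * w + 1) - 1) // 2
    m = w - diagonal * (diagonal + 1) // 2
    return (m, diagonal - m)


if __name__ == "__main__":
    for m in range(4):
        print([pair(m, n) for n in range(7)])
    print(unpair(18))
```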


3.13 Complex Functions

Consider the function

$$f(z) = z^2,$$

from $\mathbb{C}$ to $\mathbb{C}$. Each complex number $z$ is squared to produce a new complex number $z^2$. We can’t graph the function in two dimensions because there are four dimensions in play, two for the domain and two for the range. However, we can say what the squaring function does geometrically. We can represent a complex number $z$ as

$$z = re^{i\theta},$$

where $r = |z|$ and $\theta$ is the angle that $z$ makes in the complex plane, measured counterclockwise from the positive real axis. Because

$$z^2 = r^2e^{2i\theta}$$

is a vector with length $r^2$ and argument $2\theta$, the squaring map changes the length of vectors and rotates them counterclockwise.

The derivative of a complex function is defined in the same way as for a real-valued function:

$$f'(z) = \lim_{\Delta z \to 0}\frac{f(z + \Delta z) - f(z)}{\Delta z}.$$

But there is a subtle difference that turns out to be critical. When we say that $\Delta z \to 0$, the complex number $\Delta z$ can approach 0 in any fashion, and this may prevent the existence of the derivative. This restriction on the existence of the derivative means that complex functions that have a derivative are special.

It is possible for a real-valued function to have a derivative that does not have a derivative. Let

$$f(x) = x^{5/3}.$$

Then $f'(x) = (5/3)x^{2/3}$, but $f'(x)$ is not differentiable at $x = 0$. However, if a function of a complex variable is differentiable, then all its higher derivatives exist.

If $f(z) = z^2$, then

$$f'(z) = \lim_{\Delta z \to 0}\frac{(z + \Delta z)^2 - z^2}{\Delta z} = \lim_{\Delta z \to 0}\frac{z^2 + 2z\,\Delta z + (\Delta z)^2 - z^2}{\Delta z} = \lim_{\Delta z \to 0}\frac{2z\,\Delta z + (\Delta z)^2}{\Delta z} = \lim_{\Delta z \to 0}(2z + \Delta z) = 2z.$$

We found that $(z^2)' = 2z$, as for real-valued functions.


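Consider the complex conjugation function, $f(z) = \bar{z}$. Its derivative would be

$$f'(z) = \lim_{\Delta z \to 0}\frac{\overline{(z + \Delta z)} - \bar{z}}{\Delta z}.$$

Writing $z = x + iy$ and $\Delta z = \Delta x + i\,\Delta y$, we have

$$f'(z) = \lim_{\Delta x, \Delta y \to 0}\frac{\overline{(x + iy + \Delta x + i\,\Delta y)} - \overline{(x + iy)}}{\Delta x + i\,\Delta y} = \lim_{\Delta x, \Delta y \to 0}\frac{(x + \Delta x - iy - i\,\Delta y) - (x - iy)}{\Delta x + i\,\Delta y} = \lim_{\Delta x, \Delta y \to 0}\frac{\Delta x - i\,\Delta y}{\Delta x + i\,\Delta y}.$$

If we set $\Delta y = 0$, then the limit would be 1. If we set $\Delta x = 0$, then the limit would be $-1$. But a limit must be unique, so it doesn’t exist and therefore $f(z) = \bar{z}$ has no derivative.

Suppose that

$$f(z) = u + iv,$$

where $u$ and $v$ are functions of $x$ and $y$, so $u = u(x, y)$ and $v = v(x, y)$. It can be shown that necessary and sufficient conditions for $f'(z)$ to exist (for $u$ and $v$ with continuous partial derivatives) are given by the Cauchy–Riemann equations

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \quad \frac{\partial u}{\partial y} = -\frac{\partial v}{\partial x}.$$

These partial derivatives express the rate of change of the functions when only one of the variables changes. For example, $\partial u/\partial x$, the partial derivative of $u$ with respect to $x$, is the derivative of $u(x, y)$ as a function of $x$, where $y$ is held constant.

As an exercise, you can check that $f(z) = z^2$ satisfies the Cauchy–Riemann equations. The first step is to write $z = x + iy$.

We can represent a complex number $a + bi$ as a $2 \times 2$ matrix

$$\begin{pmatrix} a & -b \\ b & a \end{pmatrix},$$

a rotation and stretching matrix that mimics what happens when we multiply complex numbers by $a + bi$. Thus, we can think of multiplication by a complex number as a map from $\mathbb{C}$ to $\mathbb{C}$ or as a map from $\mathbb{R}^2$ to $\mathbb{R}^2$. A map from $\mathbb{R}^2$ to $\mathbb{R}^2$ with

$$\begin{pmatrix} x \\ y \end{pmatrix} \mapsto \begin{pmatrix} u(x, y) \\ v(x, y) \end{pmatrix}$$

has a derivative that is at each point a linear map of the form

$$\begin{pmatrix} \dfrac{\partial u}{\partial x} & \dfrac{\partial u}{\partial y} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial v}{\partial y} \end{pmatrix}.$$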


For a complex function, the Cauchy–Riemann equations ensure that the derivative map acts like multiplication by a complex number, for the corresponding matrix has the appropriate form:

$$\begin{pmatrix} \dfrac{\partial u}{\partial x} & -\dfrac{\partial v}{\partial x} \\[2mm] \dfrac{\partial v}{\partial x} & \dfrac{\partial u}{\partial x} \end{pmatrix}.$$

Differentiability is a stronger property for complex functions than for real-valued functions. If a complex function has a derivative in some region, then it is differentiable infinitely many times there. In the next section, we will see an example of a differentiable complex function defined on the entire complex plane except for one point.
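The dependence on the direction of approach can be observed with difference quotients. The following sketch uses Python’s built-in complex numbers; the names `difference_quotient`, `square`, and `conjugate` are illustrative, and a small finite step stands in for the limit.

```python
def difference_quotient(f, z: complex, dz: complex) -> complex:
    """Return (f(z + dz) - f(z)) / dz for a small complex step dz."""
    return (f(z + dz) - f(z)) / dz


def square(z: complex) -> complex:
    return z * z


def conjugate(z: complex) -> complex:
    return z.conjugate()


if __name__ == "__main__":
    z = 1 + 2j
    step = 1e-6
    for name, f in (("square", square), ("conjugate", conjugate)):
        along_real = difference_quotient(f, z, step)            # dz real
        along_imaginary = difference_quotient(f, z, 1j * step)  # dz imaginary
        print(name, along_real, along_imaginary)
```

For the squaring function both directions give values close to $2z$; for conjugation the two directions give values close to $1$ and $-1$, so no single derivative exists.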

3.14 The Zeta Function and Bernoulli Numbers

The zeta function is defined for all integers $k \ge 2$ by

$$\zeta(k) = \sum_{m=1}^{\infty}\frac{1}{m^k}.$$

(If $k = 1$ we have the divergent harmonic series.)

Bernoulli numbers⁴ are defined recursively by $B_0 = 1$ and

$$B_n = -\frac{1}{n+1}\sum_{k=0}^{n-1}\binom{n+1}{k}B_k, \quad n \ge 1.$$

The first few Bernoulli numbers are

$$1, \ -\frac{1}{2}, \ \frac{1}{6}, \ 0, \ -\frac{1}{30}, \ 0, \ \frac{1}{42}, \ 0, \ -\frac{1}{30}, \ 0, \ \frac{5}{66}, \ \ldots.$$

It appears that $B_{2n+1} = 0$, for $n \ge 1$, and this is true. Bernoulli numbers are related to the coefficients of the tangent function that we found in Tangent and Secant Function Expansions. See, e.g., [21, p. 287].

There is a connection between the zeta function and Bernoulli numbers. For any positive integer $n$, we have

$$\zeta(2n) = (-1)^{n+1}\frac{(2\pi)^{2n}}{2(2n)!}B_{2n}.$$

See, e.g., [23]. In Product for Pi, we discovered that

$$\sum_{m=1}^{\infty}\frac{1}{m^2} = \frac{\pi^2}{6}.$$

What is the value of

$$\sum_{m=1}^{\infty}\frac{1}{m^4}?$$

⁴Bernoulli numbers are named for Jakob Bernoulli (1654–1705), who made important discoveries in probability theory and counting, and in calculus and differential equations.
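The recursive definition translates directly into exact rational arithmetic. This sketch (function names `bernoulli_numbers` and `zeta_even` are illustrative) computes the Bernoulli numbers with Python’s `fractions.Fraction` and evaluates the formula for $\zeta(2n)$ in floating point; for $n = 2$ it returns a value that agrees with $\pi^4/90$.

```python
from fractions import Fraction
from math import comb, factorial, pi


def bernoulli_numbers(count: int) -> list:
    """Return [B_0, ..., B_{count-1}] from the recursive definition."""
    numbers = [Fraction(1)]
    for n in range(1, count):
        partial = sum(comb(n + 1, k) * numbers[k] for k in range(n))
        numbers.append(-partial / (n + 1))
    return numbers


def zeta_even(n: int) -> float:
    """Evaluate zeta(2n) = (-1)^(n+1) (2 pi)^(2n) B_{2n} / (2 (2n)!)."""
    b_2n = bernoulli_numbers(2 * n + 1)[2 * n]
    return (-1) ** (n + 1) * (2 * pi) ** (2 * n) * float(b_2n) / (2 * factorial(2 * n))


if __name__ == "__main__":
    print(bernoulli_numbers(11))
    print(zeta_even(1), pi ** 2 / 6)
    print(zeta_even(2), pi ** 4 / 90)
```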


3.15 The Riemann Zeta Function

We saw in The Zeta Function and Bernoulli Numbers that the zeta function is defined for all integers $k \ge 2$ by

$$\zeta(k) = \sum_{m=1}^{\infty}\frac{1}{m^k}.$$

We also know that

$$\zeta(2) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \frac{1}{4^2} + \cdots = \frac{\pi^2}{6}$$

and

$$\zeta(2n) = (-1)^{n+1}\frac{(2\pi)^{2n}}{2(2n)!}B_{2n}.$$

In 1859 Bernhard Riemann showed how to extend the definition of $\zeta$ to the entire complex plane except the point 1. For a complex number $s$ with real part greater than 1, define

$$\zeta(s) = \sum_{m=1}^{\infty}\frac{1}{m^s}.$$

It can be shown that the sum converges.

The series

$$\sum_{m=1}^{\infty}\frac{(-1)^{m-1}}{m^s}$$

converges when the real part of $s$ is positive. Since

$$\zeta(s) - \sum_{m=1}^{\infty}\frac{(-1)^{m-1}}{m^s} = \sum_{m=1}^{\infty}\frac{1}{m^s} + \sum_{m=1}^{\infty}\frac{(-1)^m}{m^s} = \sum_{n=1}^{\infty}\frac{2}{(2n)^s} = 2^{1-s}\zeta(s),$$

we have

$$\zeta(s) = \frac{1}{1 - 2^{1-s}}\sum_{m=1}^{\infty}\frac{(-1)^{m-1}}{m^s}.$$

This extends the definition of $\zeta(s)$ to the half plane where the real part of $s$ is positive. Riemann showed that the definition of $\zeta$ can be extended to the entire complex plane except $s = 1$ by using a functional equation

$$\zeta(s) = 2^s\pi^{s-1}\sin\left(\frac{\pi s}{2}\right)\Gamma(1-s)\,\zeta(1-s), \quad \Re s < 0.$$

Here $\Gamma(z)$ is the gamma function, a generalization of the factorial function that we met in Euler’s Integral Formula:

$$\Gamma(z) = \int_0^{\infty} t^{z-1}e^{-t}\,dt.$$

Recall that $\Gamma(n) = (n-1)!$ when $n$ is a positive integer.


As an exercise, you can use the functional equation to prove that

$$\zeta(-1) = -\frac{1}{12}.$$

The extended definition of $\zeta(s)$ is a differentiable function. According to the theory of complex functions, it is uniquely defined. Riemann’s zeta function is connected with prime numbers, as seen in the formula

$$\zeta(s) = \sum_{m=1}^{\infty}\frac{1}{m^s} = \prod_p \frac{1}{1 - p^{-s}}, \quad \Re s > 1,$$

where the product ranges over all primes $p$. To understand why this formula holds, use the geometric series sum formula

$$\frac{1}{1 - p^{-s}} = 1 + p^{-s} + p^{-2s} + p^{-3s} + \cdots,$$

and observe what happens when such sums for various primes $p$ are multiplied together.
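Two of the formulas in this section lend themselves to a quick numerical comparison for real $s > 1$. The sketch below is only an illustration (the names `zeta_from_alternating_series`, `primes_up_to`, and `zeta_from_euler_product` are chosen here, and the truncation limits are arbitrary): it evaluates $\zeta(s)$ once from the alternating series and once from a finite part of the product over primes.

```python
import math


def zeta_from_alternating_series(s: float, terms: int = 100_000) -> float:
    """Approximate zeta(s) as (1 / (1 - 2^(1-s))) * sum of (-1)^(m-1) / m^s."""
    eta = 0.0
    for m in range(1, terms + 1):
        eta += (-1) ** (m - 1) / m ** s
    return eta / (1.0 - 2.0 ** (1.0 - s))


def primes_up_to(limit: int) -> list:
    """Sieve of Eratosthenes (requires limit >= 1)."""
    sieve = [True] * (limit + 1)
    sieve[0:2] = [False, False]
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            for multiple in range(p * p, limit + 1, p):
                sieve[multiple] = False
    return [p for p, is_prime in enumerate(sieve) if is_prime]


def zeta_from_euler_product(s: float, prime_limit: int = 1000) -> float:
    """Approximate zeta(s) by the product of 1 / (1 - p^(-s)) over primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1.0 / (1.0 - p ** (-s))
    return product


if __name__ == "__main__":
    print(zeta_from_alternating_series(2.0), zeta_from_euler_product(2.0), math.pi ** 2 / 6)
```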

3.16 The Jacobi Identity

Consider the set of $n \times n$ matrices with real entries. Although we can add and multiply matrices together, matrix multiplication is not commutative for $n > 1$. For example,

$$\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix}\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix} = \begin{pmatrix} 2 & 3 \\ 3 & 4 \end{pmatrix}$$

but

$$\begin{pmatrix} 1 & 2 \\ 1 & 1 \end{pmatrix}\begin{pmatrix} 1 & 1 \\ 1 & 2 \end{pmatrix} = \begin{pmatrix} 3 & 5 \\ 2 & 3 \end{pmatrix}.$$

However, we can define a multiplication of matrices that is anti-commutative. We define the product of two square matrices of the same size, $A$ and $B$, in bracket notation as

$$[A, B] = AB - BA.$$

Then

$$[B, A] = BA - AB = -[A, B].$$

Under this definition of multiplication, the set of matrices satisfies the Jacobi identity⁵:

$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]] = 0.$$

5The Jacobi identity is due to Carl Gustav Jacob Jacobi (1804–1851), one of the most inﬂuential mathematicians of the 1800s.


Let’s check:

$$[A, [B, C]] + [B, [C, A]] + [C, [A, B]]$$
$$= [A, BC - CB] + [B, CA - AC] + [C, AB - BA]$$
$$= A(BC - CB) - (BC - CB)A + B(CA - AC) - (CA - AC)B + C(AB - BA) - (AB - BA)C$$
$$= ABC - ACB - BCA + CBA + BCA - BAC - CAB + ACB + CAB - CBA - ABC + BAC$$
$$= 0.$$

An algebra is a vector space in which we can multiply vectors. An algebra is bilinear if it satisfies

$$[ra + sb, c] = r[a, c] + s[b, c]$$

and

$$[a, rb + sc] = r[a, b] + s[a, c],$$

where $r$ and $s$ are scalars (elements of the base field) and $a$, $b$, and $c$ are vectors. It is easy to show that the bracket product gives a bilinear algebra.

An algebra with a bilinear anticommutative multiplication that satisfies Jacobi’s identity is called a Lie algebra, named after Sophus Lie (1842–1899). Lie algebras are nonassociative: the identity

$$[a, [b, c]] = [[a, b], c]$$

is not assumed.

Another, perhaps more familiar, example of a Lie algebra is the vector cross product, $[u, v] = u \times v$, defined on 3-dimensional Euclidean space. You can check that the vector cross product satisfies Jacobi’s identity by direct calculation or by using the identity $a \times (b \times c) = (a \cdot c)b - (a \cdot b)c$, where $a$, $b$, and $c$ are three-dimensional real vectors.

Lie algebras are important in quantum mechanics. A very understandable explanation of Lie algebras in the context of Euclidean space is [51].
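The identity can also be spot-checked on concrete matrices. The sketch below uses plain nested lists rather than a matrix library; the helper names (`mat_mul`, `mat_add`, `mat_sub`, `bracket`, `jacobi_sum`) and the third matrix `C` are chosen here for illustration, while `A` and `B` are the matrices from the non-commutativity example.

```python
def mat_mul(a, b):
    """Product of two square matrices given as nested lists."""
    size = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]


def mat_add(a, b):
    return [[x + y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def mat_sub(a, b):
    return [[x - y for x, y in zip(row_a, row_b)] for row_a, row_b in zip(a, b)]


def bracket(a, b):
    """The bracket product [a, b] = ab - ba."""
    return mat_sub(mat_mul(a, b), mat_mul(b, a))


def jacobi_sum(a, b, c):
    """[a, [b, c]] + [b, [c, a]] + [c, [a, b]], which should be the zero matrix."""
    first = bracket(a, bracket(b, c))
    second = bracket(b, bracket(c, a))
    third = bracket(c, bracket(a, b))
    return mat_add(mat_add(first, second), third)


if __name__ == "__main__":
    A = [[1, 1], [1, 2]]
    B = [[1, 2], [1, 1]]
    C = [[0, 1], [-1, 3]]  # arbitrary third matrix, chosen for illustration
    print(mat_mul(A, B), mat_mul(B, A))
    print(jacobi_sum(A, B, C))
```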

3.17 Entropy

Suppose that you have a goose that lays golden eggs, silver eggs, and bronze eggs. She lays one egg each day and you don’t know which kind it will be. Half of the days she lays golden eggs, one-fourth of the days she lays silver eggs, and one-fourth of the days she lays bronze eggs. Your neighbor has a goose who lays golden eggs, silver eggs, and bronze eggs with equal probability, one per day. There is uncertainty about what kind of eggs the geese will produce on any day. But how much uncertainty? Which goose is more unpredictable? We will give a mathematical definition of uncertainty and use it to measure the uncertainty associated with your goose and your neighbor’s goose.

We say that a source $S$ is a set of outcomes that occur with various probabilities. Suppose that outcomes $x_1$, $x_2$, ..., $x_n$ occur with probabilities $p_1$, $p_2$, ..., $p_n$, respectively, where the probabilities are nonnegative real numbers that sum to 1.


In 1948 Claude E. Shannon (1916–2001), the founder of information theory, defined the entropy $H$ of a source:

$$H(S) = -\sum_{i=1}^{n} p_i \log p_i.$$

Entropy is a weighted average of the logarithms of the probabilities of events. The logarithms normalize the probabilities so that the resulting calculations can be done in convenient units called bits of information. In information theory, calculations are done with base 2 logarithms.

Let’s calculate the entropy of the two magical geese. Your goose lays eggs with entropy

$$H(\text{your goose}) = -\frac{1}{2}\log_2\frac{1}{2} - \frac{1}{4}\log_2\frac{1}{4} - \frac{1}{4}\log_2\frac{1}{4} = 1.5 \text{ bits}.$$

Your neighbor’s goose lays eggs with entropy

$$H(\text{neighbor’s goose}) = -\frac{1}{3}\log_2\frac{1}{3} - \frac{1}{3}\log_2\frac{1}{3} - \frac{1}{3}\log_2\frac{1}{3} \approx 1.58 \text{ bits}.$$

Your neighbor’s goose is more unpredictable than your goose by about 0.08 bits. Maximum entropy occurs when all outcomes are equally probable.

Shannon proved the two main theorems of information theory. Shannon’s first theorem says that given any information source, the most compact way to encode it with 0s and 1s requires, on average, a codeword whose length is equal to the entropy of the source. So, in our example of the geese that lay expensive eggs, if you want to keep a day-by-day journal account of the type of eggs that your goose lays, you need to expend, on average, 1.5 binary symbols per day, no matter what encoding scheme you devise. Your neighbor needs to use an average of 1.58 symbols per day.

Shannon’s second theorem says that when information is sent over a noisy channel, where some symbols may be distorted, we can always devise a code to send the information at near perfect accuracy, but at a slower rate than if no code is used. The rate is given by a quantity called the channel capacity, which is defined in terms of entropy. See, e.g., [15].

Shannon didn’t arrive at the information theory definition of entropy in a vacuum. Rudolf Julius Emanuel Clausius (1822–1888) introduced the concept of entropy in thermodynamics, Ludwig Eduard Boltzmann (1844–1906) gave a mathematical formulation, and Josiah Willard Gibbs (1839–1903) described entropy as an amount of randomness. The work of Shannon’s predecessors aided him in formulating his ideas.
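The entropy calculation for the two geese takes only a few lines. In this sketch the function name `entropy_bits` is illustrative, and outcomes with probability 0 are skipped, following the usual convention that they contribute nothing to the sum.

```python
import math


def entropy_bits(probabilities) -> float:
    """Shannon entropy -sum(p * log2(p)) of a probability distribution, in bits."""
    if abs(sum(probabilities) - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return -sum(p * math.log2(p) for p in probabilities if p > 0)


if __name__ == "__main__":
    your_goose = [1 / 2, 1 / 4, 1 / 4]
    neighbors_goose = [1 / 3, 1 / 3, 1 / 3]
    print(entropy_bits(your_goose))
    print(entropy_bits(neighbors_goose))
```

For three equally likely outcomes the entropy is $\log_2 3$, which is the value rounded to 1.58 in the text.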

3.18 Rook Paths

A chess Rook can move any number of squares horizontally or vertically in one step. How many paths can a Rook take from the lower-left corner square to the upper-right corner square of an $8 \times 8$ chessboard, assuming that it moves right or up at each step? An example of a Rook path is shown in Figure 3.1. We want to count lattice paths from $(0, 0)$ to $(n, n)$ with steps of the form $(x, 0)$ or $(0, y)$, where $x$ and $y$ are positive integers.


The Rook path problem can be solved by generalizing to find the number of paths from $(0, 0)$ to any square on an arbitrary size board, that is, to any point $(m, n)$. Let $r(m, n)$ be the number of paths, where $m, n \ge 0$. We set $r(0, 0) = 1$. By symmetry, $r(m, n) = r(n, m)$. For $m$ or $n$ positive, $r(m, n)$ is equal to the sum of the values of $r$ for the horizontal and vertical predecessors of $(m, n)$, since the Rook arrives at $(m, n)$ from one of the squares to its left or below it. For example, $r(3, 2) = (2 + 5 + 14) + (4 + 12) = 37$. From the following table, we see that the number of Rook paths from the lower-left corner to the upper-right corner of an $8 \times 8$ chessboard is $r(7, 7) = 470010$.

 ⋮     ⋮      ⋮      ⋮       ⋮       ⋮        ⋮        ⋮
64   320   1328   4864   16428   52356   159645   470010   ...
32   144    560   1944    6266   19149    56190   159645   ...
16    64    232    760    2329    6802    19149    52356   ...
 8    28     94    289     838    2329     6266    16428   ...
 4    12     37    106     289     760     1944     4864   ...
 2     5     14     37      94     232      560     1328   ...
 1     2      5     12      28      64      144      320   ...
 1     1      2      4       8      16       32       64   ...

We determined $r(m, n)$ using a variable number of preceding terms. But there is a recurrence relation that requires only three preceding terms:

$$r(0, 0) = 1, \quad r(0, 1) = 1, \quad r(1, 0) = 1, \quad r(1, 1) = 2;$$
$$r(m, n) = 2r(m-1, n) + 2r(m, n-1) - 3r(m-1, n-1), \quad m \ge 2 \text{ or } n \ge 2.$$

(We assume that $r(m, n) = 0$ for $m$ or $n$ negative.) It can be proved by the method of inclusion and exclusion and is an exercise in Appendix B. The recurrence formula yields a rational generating function for the doubly-infinite sequence $\{r(m, n)\}$, namely,

$$\sum_{m \ge 0,\ n \ge 0} r(m, n)x^m y^n = \frac{(1-x)(1-y)}{1 - 2(x+y) + 3xy}.$$

Figure 3.1. A Rook path.
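Both ways of computing $r(m, n)$ can be compared by machine. The following sketch is illustrative (the names `rook_paths` and `rook_table_by_recurrence` are chosen here): the first function sums over all horizontal and vertical predecessors, the second fills a table with the three-term recurrence and its four initial values.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def rook_paths(m: int, n: int) -> int:
    """Number of Rook paths from (0, 0) to (m, n), summing over all predecessors."""
    if m == 0 and n == 0:
        return 1
    from_left = sum(rook_paths(i, n) for i in range(m))
    from_below = sum(rook_paths(m, j) for j in range(n))
    return from_left + from_below


def rook_table_by_recurrence(size: int) -> list:
    """Table r[m][n] for 0 <= m, n <= size from the three-term recurrence."""
    initial = {(0, 0): 1, (0, 1): 1, (1, 0): 1, (1, 1): 2}
    r = [[0] * (size + 1) for _ in range(size + 1)]
    for m in range(size + 1):
        for n in range(size + 1):
            if (m, n) in initial:
                r[m][n] = initial[(m, n)]
                continue
            left = r[m - 1][n] if m >= 1 else 0          # r is 0 for negative indices
            below = r[m][n - 1] if n >= 1 else 0
            corner = r[m - 1][n - 1] if m >= 1 and n >= 1 else 0
            r[m][n] = 2 * left + 2 * below - 3 * corner
    return r


if __name__ == "__main__":
    print(rook_paths(3, 2), rook_paths(7, 7))
    print(rook_table_by_recurrence(7)[7][7])
```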


The form of the denominator is given by the recurrence relation. The numerator is obtained by multiplying the denominator by the polynomial that represents the initial values, $1 + x + y + 2xy$, keeping only those monomials with exponents of $x$ and $y$ both less than 2.

Another way to obtain the generating function for Rook paths is to start with the generating function $1/(1 - x - y)$, which counts sequences of length $n$ having some number of $x$’s and a complementary number of $y$’s (the total number of $x$’s and $y$’s is $n$). For Rook paths, we allow an arbitrary step length in either direction. This amounts to replacing $x$ by $x/(1-x)$ and $y$ by $y/(1-y)$. Hence, the generating function for Rook paths is

$$\frac{1}{1 - x/(1-x) - y/(1-y)}.$$

We can generalize Rook paths to three dimensions. How many ways can a Rook move from $(0, 0, 0)$ to $(m, n, o)$, where each step is a positive integer multiple of $(1, 0, 0)$, $(0, 1, 0)$, or $(0, 0, 1)$? The generating function for three-dimensional Rook paths is

$$\frac{(1-x)(1-y)(1-z)}{1 - 2(x+y+z) + 3(xy+yz+zx) - 4xyz}.$$

In dimension $d$, an asymptotic formula for the number of Rook paths from the origin to a main diagonal point is

$$r(n, \ldots, n) \sim (d+1)^{dn-1}\,d^{(d+2)/2}\,\bigl(2\pi n(d+2)\bigr)^{(1-d)/2}.$$

See [16]. It is surprising that $\pi$ appears in the formula.

Manuel Kauers and Doron Zeilberger have conjectured that, for $n$ fixed, the number of Rook paths from the origin to a main diagonal point is

$$r(n, \ldots, n) \sim e^{n-1}\frac{(nd)!}{n!^d}.$$

Let $r_n = r(n, n)$, the number of Rook paths from $(0, 0)$ to the diagonal point $(n, n)$. The generating function for the sequence $\{r_n\} = \{1, 2, 14, 106, 838, \ldots\}$ is

$$\sum_{n=0}^{\infty} r_n x^n = \frac{1}{2}\left(1 + \sqrt{\frac{1-x}{1-9x}}\right).$$

A recurrence formula for such paths is

$$r_0 = 1, \quad r_1 = 2;$$
$$r_n = \bigl((10n - 6)r_{n-1} - (9n - 18)r_{n-2}\bigr)/n, \quad n \ge 2.$$

No counting proof of this recurrence formula is known.

For three-dimensional Rook paths, the diagonal sequence $\{r_n = r(n, n, n)\}$ satisfies the


recurrence formula

$$r_0 = 1, \quad r_1 = 6, \quad r_2 = 222, \quad r_3 = 9918;$$
$$r_n = \bigl((121n^3 - 212n^2 + 85n + 6)\,r_{n-1} + (475n^3 - 3462n^2 + 7853n - 5658)\,r_{n-2}$$
$$\qquad + (-1746n^3 + 14580n^2 - 40662n + 37908)\,r_{n-3}$$
$$\qquad + (1152n^3 - 12672n^2 + 46080n - 55296)\,r_{n-4}\bigr)/(2n^3 - 2n^2), \quad n \ge 4.$$

Such recurrence relations exist for Rook paths to a diagonal point in any dimension, but their orders and the degrees of the polynomial coefficients are unknown in general.

A Rook path is equivalent to a game of Nim. In Nim, two players alternately remove any number of stones from one of a number of piles. The game ends when the last stone is removed. A Rook path from $(0, 0, \ldots, 0)$ to $(a_1, a_2, \ldots, a_d)$ is equivalent to a Nim game starting with $d$ piles of stones of sizes $a_1$, $a_2$, ..., $a_d$.
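The diagonal recurrences can be compared with a direct count in any dimension. This is a sketch under assumptions: the names `rook_paths_nd`, `diagonal_2d`, and `diagonal_3d_next` are chosen here, and the three-dimensional coefficients are taken with the signs $+6$, $+475$, $-3462$, $+7853$, $-5658$, and so on, as in the recurrence written out above.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def rook_paths_nd(point: tuple) -> int:
    """Number of Rook paths from the origin to `point` (a tuple of whole numbers)."""
    if all(coordinate == 0 for coordinate in point):
        return 1
    total = 0
    for axis, coordinate in enumerate(point):
        for smaller in range(coordinate):  # every predecessor along this axis
            total += rook_paths_nd(point[:axis] + (smaller,) + point[axis + 1:])
    return total


def diagonal_2d(count: int) -> list:
    """First `count` values of r(n, n) from the two-term recurrence."""
    r = [1, 2]
    for n in range(2, count):
        r.append(((10 * n - 6) * r[n - 1] - (9 * n - 18) * r[n - 2]) // n)
    return r[:count]


def diagonal_3d_next(n: int, r: list) -> int:
    """Value of r(n, n, n) for n >= 4 from the four previous values r[n-4 .. n-1]."""
    numerator = ((121 * n**3 - 212 * n**2 + 85 * n + 6) * r[n - 1]
                 + (475 * n**3 - 3462 * n**2 + 7853 * n - 5658) * r[n - 2]
                 + (-1746 * n**3 + 14580 * n**2 - 40662 * n + 37908) * r[n - 3]
                 + (1152 * n**3 - 12672 * n**2 + 46080 * n - 55296) * r[n - 4])
    return numerator // (2 * n**3 - 2 * n**2)


if __name__ == "__main__":
    print(diagonal_2d(5), [rook_paths_nd((n, n)) for n in range(5)])
    direct = [rook_paths_nd((n, n, n)) for n in range(5)]
    print(direct, diagonal_3d_next(4, direct))
```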


4 Delightful Theorems

Mathematics is like looking at a house from different angles.
THOMAS F. STORER¹ (1938–2006)

Mathematicians prove theorems. Once a theorem is proved, it is true for all time. The theorems proved by the ancient Greeks are as true today as they were over two thousand years ago, and the theorems proved today will be true even if, after millions of years, humans evolve into another species. In this chapter we present some delightful and sometimes surprising theorems.

4.1 A Square inside Every Triangle

Given any triangle, is it always possible to inscribe a square in it? We require that the square has a side on one of the sides of the triangle, with the other two corners touching the other sides of the triangle.

The answer is yes, by similarity. Put the triangle on top of a square, as $\triangle ABC$ is placed in Figure 4.1. Now extend the other two sides of $\triangle ABC$ so that they meet the line that the square sits on. This results in a triangle similar to the given triangle and circumscribing the square. Finally, change the scale of the whole diagram so that the circumscribing triangle is the same size as our given triangle, and we are done. Note that the side of the triangle we place on the square must be chosen so that the altitude to that side lies inside the triangle. We have proved the theorem.


Figure 4.1. A square inscribed in a triangle.

Theorem. Given any triangle, there exists a square inscribed inside it.

1Tom Storer, the ﬁrst Native American to earn a Ph.D. in mathematics, was the author’s thesis advisor.


Figure 4.2. Morley’s theorem (with the equilateral triangle in bold).

In the proof, we worked backwards, starting with the square to be inscribed and ﬁtting a triangle around it. The idea of working backwards in a geometric construction is used strikingly in the proof of the next theorem. By the way, the square inscribed in the triangle is constructible using straightedge and compass. How can you do the construction?

4.2 Morley’s Theorem

One of the most delightful theorems of plane geometry was discovered fairly recently, a little over a hundred years ago.

Morley’s Theorem. In any triangle, the three points of intersection of adjacent angle trisectors are the vertices of an equilateral triangle.² See Figure 4.2.

Paul Erdős (1913–1996) asserted that in some Platonic realm there is a book that contains the best proofs of mathematical theorems. In 1995 John H. Conway found a beautiful proof of Morley’s theorem that may well be in Erdős’ book. Conway’s proof is elegant, memorable, and its diagram requires only six extra lines. Also, the proof proceeds by the neat method of starting with an equilateral triangle and working backwards to create a triangle similar to the given triangle, as in the proof of A Square inside Every Triangle. Following Conway, let

$$\theta^{+} = \theta + 60^\circ \quad \text{and} \quad \theta^{++} = \theta + 120^\circ,$$

where $\theta$ is an angle. Suppose that our triangle has angles $3\alpha$, $3\beta$, and $3\gamma$, so that $\alpha + \beta + \gamma = 60^\circ$. Start with an equilateral triangle of arbitrary size.

²Morley’s theorem, a gem of geometry, was discovered in 1899 by Frank Morley (1860–1937). Morley worked in the areas of algebra and geometry. It isn’t possible to construct the angle trisectors of an arbitrary angle using only straight-edge and compass. Is Morley’s theorem a part of Euclidean geometry?


Next, form three triangles with angles shown in the figure below, and with the sides established by shaded segments congruent to the sides of the equilateral triangle.

[Figure: three triangles with angles ($\alpha$, $\beta^{+}$, $\gamma^{+}$), ($\alpha^{+}$, $\beta$, $\gamma^{+}$), and ($\alpha^{+}$, $\beta^{+}$, $\gamma$).]

We can easily check that these angle measures really do give triangles. For example,

$$\alpha + \beta^{+} + \gamma^{+} = (\alpha + \beta + \gamma) + 60^\circ + 60^\circ = 60^\circ + 60^\circ + 60^\circ = 180^\circ.$$

The shaded edges determine the sizes of these triangles. Finally, form three more triangles, with angles shown below.

[Figure: three triangles with angles ($\alpha^{++}$, $\beta$, $\gamma$), ($\alpha$, $\beta^{++}$, $\gamma$), and ($\alpha$, $\beta$, $\gamma^{++}$).]

We need to say how large the triangles are, which we will do in a moment. For now, we check that the given angles make triangles. For example,

$$\alpha + \beta + \gamma^{++} = (\alpha + \beta + \gamma) + 120^\circ = 180^\circ.$$

To determine the sizes of the new triangles, drop pairs of equal line segments from a vertex of each triangle to the opposite side. Let’s take the triangle with angles $\alpha$, $\beta$, and $\gamma^{++}$ as an example. Drop line segments from the vertex with angle $\gamma^{++}$ to the opposite side so that the indicated angles are both $\gamma^{+}$. Let the length of the line segments be equal to the side of the equilateral triangle. This determines the size of this triangle. The sizes of the other two triangles are determined similarly. The appearance of the diagram would change slightly if one of the angles in the original triangle is a right angle or an obtuse angle. For example, if $3\gamma > 90^\circ$, then $\gamma^{+} > 90^\circ$. How does this change the picture?


We claim that the seven triangles thus formed ﬁt together to make a triangle similar to the given triangle. Since we have created a triangle similar to the given triangle for which the result is true, and the given triangle was arbitrary, we have proved the result for all triangles.


To prove that the triangles fit together, we must check that the four angles around each vertex of the equilateral triangle sum to $360^\circ$. I leave this as an exercise. We must also check that the sides fit together. The three triangles sharing a side with the equilateral triangle fit because the shared sides were formed to have length equal to the side of the equilateral triangle. To see that the other three triangles fit, notice that the two shaded triangles are congruent (because they have the same angles and a pair of congruent corresponding sides); hence the shaded triangles fit along a common side. Making this type of observation for five analogous cases completes the proof.

For other proofs of Morley’s theorem, see [26]. There are many types of proofs, including one using complex numbers, but I believe that Conway’s proof is the simplest.

4.3 The Euler Line

Every triangle has four well-known centers. The orthocenter H is the intersection of the three altitudes. The centroid G (the center of gravity of the triangle) is the intersection of the three medians. The circumcenter O (the center of the circumscribed circle) is the intersection of the perpendicular bisectors of the three sides. The incenter I (the center of the inscribed circle) is the intersection of the three angle bisectors.


Figure 4.3. The Euler line (OGH).

Leonhard Euler (1707–1783) discovered that H, G, and O are collinear. The line that they lie on is called the Euler line. Furthermore, G lies one-third of the way along the Euler line from O to H. See Figure 4.3. We will give a vector proof. In vector notation, we can describe Euler’s discovery as follows.

Theorem. If H, G, and O are the orthocenter, centroid, and circumcenter of a triangle, then
$$3\,\overrightarrow{OG} = \overrightarrow{OH}.$$

The coordinates of the centroid are the averages of the coordinates of the three vertices of the triangle. This implies that
$$\overrightarrow{GA} + \overrightarrow{GB} + \overrightarrow{GC} = \vec{0}.$$
By the definition of vector addition,
$$\overrightarrow{OG} + \overrightarrow{GA} = \overrightarrow{OA},$$
$$\overrightarrow{OG} + \overrightarrow{GB} = \overrightarrow{OB},$$
$$\overrightarrow{OG} + \overrightarrow{GC} = \overrightarrow{OC}.$$
Adding, we get
$$3\,\overrightarrow{OG} = \overrightarrow{OA} + \overrightarrow{OB} + \overrightarrow{OC}.$$
We will show that the right side is equal to $\overrightarrow{OH}$.

The vector sum $\overrightarrow{OA} + \overrightarrow{OB}$ is a vector represented by the diagonal of the parallelogram spanned by $\overrightarrow{OA}$ and $\overrightarrow{OB}$. By definition of the circumcenter, these two vectors have the same length and the parallelogram they span is a rhombus. Since the diagonals of a rhombus are perpendicular, $\overrightarrow{OA} + \overrightarrow{OB}$ is a vector on the line OP, perpendicular to side AB. Hence, $\overrightarrow{OA} + \overrightarrow{OB}$ is parallel to the altitude from C. It follows by the definition of vector addition that $\overrightarrow{OA} + \overrightarrow{OB} + \overrightarrow{OC}$ is a vector from O to a point on the altitude from C. A similar argument shows that $\overrightarrow{OA} + \overrightarrow{OB} + \overrightarrow{OC}$ is a vector from O to a point on the altitude from A and on the altitude from B. Since H is the intersection of the altitudes, $\overrightarrow{OA} + \overrightarrow{OB} + \overrightarrow{OC} = \overrightarrow{OH}$.
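The identity $3\,\overrightarrow{OG} = \overrightarrow{OH}$ can be checked numerically. The following Python sketch is only an illustration, not part of the proof: the function names and the sample triangle are assumptions chosen here, and the three centers are computed independently of one another (the circumcenter from a standard coordinate formula, the orthocenter by intersecting two altitudes).

```python
def circumcenter(a, b, c):
    """Circumcenter of triangle abc (standard coordinate formula)."""
    (ax, ay), (bx, by), (cx, cy) = a, b, c
    d = 2 * (ax * (by - cy) + bx * (cy - ay) + cx * (ay - by))
    sa, sb, sc = ax * ax + ay * ay, bx * bx + by * by, cx * cx + cy * cy
    ux = (sa * (by - cy) + sb * (cy - ay) + sc * (ay - by)) / d
    uy = (sa * (cx - bx) + sb * (ax - cx) + sc * (bx - ax)) / d
    return (ux, uy)


def centroid(a, b, c):
    """Average of the three vertices."""
    return ((a[0] + b[0] + c[0]) / 3, (a[1] + b[1] + c[1]) / 3)


def orthocenter(a, b, c):
    """Intersect the altitudes from A and B.

    Solves (H - A).(C - B) = 0 and (H - B).(A - C) = 0 by Cramer's rule.
    """
    a1, b1 = c[0] - b[0], c[1] - b[1]
    e1 = a[0] * a1 + a[1] * b1
    a2, b2 = a[0] - c[0], a[1] - c[1]
    e2 = b[0] * a2 + b[1] * b2
    det = a1 * b2 - a2 * b1
    return ((e1 * b2 - e2 * b1) / det, (a1 * e2 - a2 * e1) / det)


def euler_gap(a, b, c):
    """Length of the vector 3*OG - OH; it should be (numerically) zero."""
    o, g, h = circumcenter(a, b, c), centroid(a, b, c), orthocenter(a, b, c)
    dx = 3 * (g[0] - o[0]) - (h[0] - o[0])
    dy = 3 * (g[1] - o[1]) - (h[1] - o[1])
    return (dx * dx + dy * dy) ** 0.5


# Sample triangle (an arbitrary choice for illustration).
print(euler_gap((0.0, 0.0), (4.0, 0.0), (1.0, 3.0)))
```

For the sample triangle the three centers work out by hand to O = (2, 1), G = (5/3, 1), and H = (1, 1), so all three lie on the horizontal line y = 1.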


Figure 4.4. The circles, common external tangents, and collinear points of Monge’s theorem.

4.4 Monge’s Theorem

Theorem. Given three circles in the plane, with different radii and none inside another, the three pairs of common external tangents of the circles intersect in three collinear points.³

See Figure 4.4. Suppose that the circles are $C_1$, $C_2$, $C_3$, with centers $c_1$, $c_2$, $c_3$, and radii $r_1$, $r_2$, $r_3$, respectively. Let $p_{ij}$ be the intersection of the external tangents to $C_i$ and $C_j$, where $1 \le i < j \le 3$. Observe (using similar triangles) that
$$p_{ij} = c_i + \frac{r_i}{r_i - r_j}\,(c_j - c_i) = \frac{r_i c_j - r_j c_i}{r_i - r_j}.$$
It follows that
$$r_1(r_2 - r_3)\,p_{23} + r_2(r_3 - r_1)\,p_{13} + r_3(r_1 - r_2)\,p_{12} = 0,$$
and therefore $p_{12}$, $p_{23}$, and $p_{13}$ are collinear, since the scalar coefficients sum to 0 (and are nonzero because the radii are different).
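As a numerical illustration of the formula for $p_{ij}$ and of the linear relation above, here is a short Python sketch. Points are represented as complex numbers; the three sample circles and the function names are assumptions made for this example only.

```python
def external_center(c_i, r_i, c_j, r_j):
    """Intersection of the common external tangents of two circles.

    Uses p_ij = (r_i*c_j - r_j*c_i) / (r_i - r_j); requires r_i != r_j.
    """
    return (r_i * c_j - r_j * c_i) / (r_i - r_j)


def cross(u, v):
    """2D cross product of complex numbers viewed as plane vectors."""
    return u.real * v.imag - u.imag * v.real


# Sample circles: (center, radius). These values are illustrative only.
c1, r1 = 0 + 0j, 1.0
c2, r2 = 6 + 0j, 2.0
c3, r3 = 3 + 7j, 3.0

p12 = external_center(c1, r1, c2, r2)
p13 = external_center(c1, r1, c3, r3)
p23 = external_center(c2, r2, c3, r3)

# The weighted sum from the text; it should vanish.
relation = r1 * (r2 - r3) * p23 + r2 * (r3 - r1) * p13 + r3 * (r1 - r2) * p12

# Collinearity test: the vectors p13 - p12 and p23 - p12 are parallel.
collinearity = cross(p13 - p12, p23 - p12)

print(p12, p13, p23, abs(relation), collinearity)
```

For these circles the three points are −6, −1.5 − 3.5i, and 12 − 14i, and the vector from the first to the third is four times the vector from the first to the second.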

4.5 Power Means

Power means are a grand generalization of the arithmetic mean and geometric mean. They have been studied for nearly 200 years but there are still interesting open questions concerning them, and some problems have been solved only recently.

A real-valued function f is convex on an interval I if
$$f((1 - \lambda)a + \lambda b) \le (1 - \lambda) f(a) + \lambda f(b),$$
for all $a, b \in I$ and $0 \le \lambda \le 1$.

A straightforward application of calculus shows that if $f''(x) \ge 0$ for all $x \in I$, then f is convex on I. For example, the function $f(x) = -\ln x$ is convex on the interval $(0, \infty)$.

³ This theorem is attributed to Gaspard Monge (1746–1818), the inventor of descriptive geometry.


If a function is convex, then it satisfies an inequality due to Johan Jensen (1859–1925):

Theorem (Jensen’s Inequality). Suppose that f is convex on I. If $a_1, \dots, a_n \in I$ and $\lambda_1, \dots, \lambda_n$ are nonnegative real numbers such that $\lambda_1 + \cdots + \lambda_n = 1$, then
$$f\left(\sum_{i=1}^{n} \lambda_i a_i\right) \le \sum_{i=1}^{n} \lambda_i f(a_i).$$

The proof is by mathematical induction on n.

Suppose that $a_1, \dots, a_n$ are positive numbers and $w_1, \dots, w_n$ are positive numbers (weights) such that $\sum w_i = 1$. For $-\infty \le r \le \infty$, let
$$M_r = \begin{cases}
\left(\sum_{i=1}^{n} w_i a_i^r\right)^{1/r}, & -\infty < r < \infty,\ r \ne 0, \\
\prod_{i=1}^{n} a_i^{w_i}, & r = 0, \\
\max\{a_i\}, & r = \infty, \\
\min\{a_i\}, & r = -\infty.
\end{cases}$$
We call $M_r$ the r-th power mean of the numbers $a_1, \dots, a_n$ with weights $w_1, \dots, w_n$. We can make the values of the $a_i$ explicit, if we wish, using the notation $M_r(a_1, a_2, \dots, a_n)$.

The functions $M_r$ with $r = -1$, 0, 1, and 2 are called, respectively, the harmonic mean (HM), geometric mean (GM), arithmetic mean (AM), and quadratic mean (QM):
$$M_{-1} = \frac{1}{\sum_{i=1}^{n} w_i / a_i},$$
$$M_0 = \prod_{i=1}^{n} a_i^{w_i},$$
$$M_1 = \sum_{i=1}^{n} w_i a_i,$$
$$M_2 = \left(\sum_{i=1}^{n} w_i a_i^2\right)^{1/2}.$$

Applying Jensen’s inequality to the convex function $f(x) = -\ln x$ yields the arithmetic mean–geometric mean (AM–GM) inequality (with weights):
$$M_0 \le M_1.$$
The equal-weight case of the AM–GM inequality is often useful:
$$(a_1 \cdots a_n)^{1/n} \le \frac{a_1 + \cdots + a_n}{n},$$
where $a_1, \dots, a_n$ are positive numbers. Equality holds if and only if all the $a_i$ are equal.
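The definition of $M_r$ translates directly into code. The Python sketch below is one possible rendering; the function name `power_mean` and the sample data are assumptions made for illustration. It treats the four cases of the definition separately and then evaluates the named means on a small example.

```python
import math


def power_mean(values, weights, r):
    """r-th power mean of positive values with positive weights summing to 1."""
    if r == math.inf:
        return max(values)
    if r == -math.inf:
        return min(values)
    if r == 0:
        # Weighted geometric mean, computed through logarithms.
        return math.exp(sum(w * math.log(a) for a, w in zip(values, weights)))
    return sum(w * a ** r for a, w in zip(values, weights)) ** (1.0 / r)


# Sample data (illustrative): three values with equal weights.
values = [1.0, 2.0, 4.0]
weights = [1 / 3, 1 / 3, 1 / 3]

hm = power_mean(values, weights, -1)   # harmonic mean
gm = power_mean(values, weights, 0)    # geometric mean
am = power_mean(values, weights, 1)    # arithmetic mean
qm = power_mean(values, weights, 2)    # quadratic mean

print(hm, gm, am, qm)
```

For the values 1, 2, 4 the geometric mean is the cube root of 8, that is 2, and the arithmetic mean is 7/3, so the AM–GM inequality is visible directly.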


Theorem (Power Means). Let $a_1, \dots, a_n$, $w_1, \dots, w_n$ be fixed. Then $M_r$ is a continuous and increasing function of r for $-\infty \le r \le \infty$. Moreover, $M_r$ is a strictly increasing function of r unless all the $a_i$ are equal.

Here is a proof. For $0 < r < s < \infty$, let $t = s/r > 1$ and $f(x) = x^t$, for $x > 0$. We have $f''(x) = t(t-1)x^{t-2} \ge 0$, so f is a convex function. By Jensen’s inequality,
$$f\left(\sum_{i=1}^{n} w_i a_i^r\right) \le \sum_{i=1}^{n} w_i f(a_i^r).$$
Hence
$$\left(\sum_{i=1}^{n} w_i a_i^r\right)^{s/r} \le \sum_{i=1}^{n} w_i a_i^s$$
and
$$\left(\sum_{i=1}^{n} w_i a_i^r\right)^{1/r} \le \left(\sum_{i=1}^{n} w_i a_i^s\right)^{1/s}.$$
This shows that $M_r \le M_s$.

For $0 < r < \infty$, we apply the AM–GM inequality to $a_1^r, \dots, a_n^r$ and obtain
$$\left(\sum_{i=1}^{n} w_i a_i^r\right)^{1/r} \ge \prod_{i=1}^{n} a_i^{w_i}.$$
Hence $M_0 \le M_r$.

The cases $-\infty < r < s \le 0$ are covered by these results and the identity
$$M_{-r}(a_1, \dots, a_n) = \left(M_r(1/a_1, \dots, 1/a_n)\right)^{-1}.$$

If $0 < r < \infty$, then
$$M_r \le \left(\sum_{i=1}^{n} w_i \max\{a_i\}^r\right)^{1/r} = \max\{a_i\} = M_\infty.$$
Similarly, $M_{-\infty} \le M_r$ if $-\infty < r < 0$. Therefore, $M_r$ is an increasing function of r. It is easy to show that there is strict inequality unless $a_1 = a_2 = \cdots = a_n$.

Because $M_r$ is a composition of continuous functions, it is continuous on the intervals $(-\infty, 0)$ and $(0, \infty)$. To prove that $M_r$ is continuous for all $r \in [-\infty, \infty]$, we must show that $M_r$ is continuous at 0, $\infty$, and $-\infty$. To show that $M_r$ is continuous at $\infty$, we apply the squeeze principle. We have already shown that $M_r \le M_\infty$. Since, for an index i with $a_i = M_\infty$,
$$M_r \ge \left(w_i M_\infty^r\right)^{1/r} = (w_i)^{1/r} M_\infty$$
and $\lim_{r \to \infty} (w_i)^{1/r} = 1$, it follows that $\lim_{r \to \infty} M_r = M_\infty$. Hence $M_r$ is continuous at $\infty$. A similar proof shows that $M_r$ is continuous at $-\infty$.

Finally, the continuity of $M_r$ at $r = 0$ is proved by squeezing $M_r$ between $M_0$ and a bound that tends to $M_0$ as $r \to 0$. We consider the case where r is positive (the case where r is negative follows from the identity above). We have already shown that $M_0 \le M_r$. From the AM–GM inequality, applied to the numbers $1/a_i^r$ with weights $w_i a_i^r / \sum_j w_j a_j^r$,
$$\frac{1}{\sum_j w_j a_j^r} = \sum_{i=1}^{n} \frac{w_i a_i^r}{\sum_j w_j a_j^r} \cdot \frac{1}{a_i^r} \ge \prod_{i=1}^{n} \left(\frac{1}{a_i^r}\right)^{w_i a_i^r / \sum_j w_j a_j^r}.$$
Taking the reciprocal and the 1/r-th power yields
$$M_r \le \prod_{i=1}^{n} a_i^{\,w_i a_i^r / \sum_j w_j a_j^r}.$$
Since the upper bound tends to $M_0$ as $r \to 0$, we conclude that $M_r$ is continuous at 0.

We thus obtain the following chain of classical inequalities:
$$\min\{a_i\} \le \mathrm{HM} \le \mathrm{GM} \le \mathrm{AM} \le \mathrm{QM} \le \max\{a_i\}.$$

Graphing Mr for various values of the weights and variables may give the impression that the curve always has a single inﬂection point (where it changes from convex to concave). Often, the curve looks like the picture below.

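[Figure: graph of $M_r$ against r. The curve increases from the horizontal asymptote $\min\{a_i\}$ to the horizontal asymptote $\max\{a_i\}$, passing through $M_0$ at $r = 0$.]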

Since $M_r$ has two horizontal asymptotes, the curve has at least one inflection point. But is there always exactly one? Harold Shniad [47] found the counterexample
$$M_r = \left(0.1e^{r} + 0.8e^{2r} + 0.1e^{3r}\right)^{1/r}.$$
(We have written the powers of the variables as equivalent exponential functions.) A computer algebra system shows that the second derivative of $M_r$ changes sign three times, so there are three points of inflection: $M_r''$ is positive for $r = -2$, negative for $r = -1$, positive for $r = 0$, and negative for $r = 4$.

In 2008 Phan Thanh Nam and Mach Nguyet Minh [44] showed that for two variables, the function $M_r$ is indeed convex-concave (having one inflection point). Their admirable proof deals with the complicated algebra involved in the second derivative. Is there a simpler proof?

If there are more than two variables with equal weights, by setting some of the variables equal, Shniad’s function can be written as a power mean with ten variables and equal weights. So, in this case, the curve has more than one inflection point. The question of whether the power mean for more than two variables and equal weights is always convex–concave is, to the best of my knowledge, unsolved.
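The sign pattern of the second derivative of Shniad’s function can be explored without a computer algebra system. The following Python sketch approximates $M_r''$ by a central finite difference; the step size, the function names, and the use of a finite difference in place of a symbolic derivative are all choices made for this illustration.

```python
import math

WEIGHTS = (0.1, 0.8, 0.1)
EXPONENTS = (1.0, 2.0, 3.0)  # the variables are e, e^2, e^3


def shniad_mean(r):
    """Shniad's power mean (0.1 e^r + 0.8 e^(2r) + 0.1 e^(3r))^(1/r)."""
    if r == 0:
        # M_0 is the weighted geometric mean, here e^(0.1 + 1.6 + 0.3) = e^2.
        return math.exp(sum(w * k for w, k in zip(WEIGHTS, EXPONENTS)))
    total = sum(w * math.exp(k * r) for w, k in zip(WEIGHTS, EXPONENTS))
    return math.exp(math.log(total) / r)


def second_derivative(f, r, h=1e-3):
    """Central-difference approximation of f''(r)."""
    return (f(r + h) - 2 * f(r) + f(r - h)) / (h * h)


for r in (-2, -1, 0, 4):
    value = second_derivative(shniad_mean, r)
    print(r, "positive" if value > 0 else "negative")
```

The four sample points are the ones named in the text; three sign changes between them force at least three inflection points.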


Figure 4.5. A regular heptagon.

4.6 Regular Heptagon

A regular polygon with n sides is constructible by straightedge and compass if and only if n is of the form
$$2^k p_1 \cdots p_m,$$
where $k, m \ge 0$ and the $p_i$ are distinct primes of the form $2^{2^j} + 1$, with $j \ge 0$. Primes of this form are called Fermat primes, and the only ones known are 3, 5, 17, 257, and 65537, corresponding to $j = 0$, 1, 2, 3, and 4.

If we allow the use of an angle-trisecting device, then certain other regular polygons can be constructed.

Theorem. A regular heptagon, which has seven sides, can be constructed using straightedge, compass, and an angle trisecting device.

We will show that a regular (convex) heptagon can be constructed in the complex plane. The vertices are 1, $z$, $z^2$, $z^3$, $z^4$, $z^5$, and $z^6$, where $z = e^{2\pi i/7}$. See Figure 4.5. The construction amounts to constructing a segment of length $z + z^6 = z + 1/z = 2\cos(2\pi/7)$, a real number, for this is twice the projection of z onto the real axis. From the equation $z^7 = 1$ we have
$$z^6 + z^5 + z^4 + z^3 + z^2 + z + 1 = 0.$$
Letting $a = z + 1/z$ (a real number), we obtain
$$a^3 + a^2 - 2a - 1 = 0.$$
The irreducibility of this cubic polynomial is the reason why a regular heptagon is not constructible by straightedge and compass alone. A minimal polynomial of a constructible length must have a degree that is a power of 2.

To eliminate the coefficient of $a^2$, we make the substitution $a = (b - 1)/3$ and obtain
$$b^3 - 21b - 7 = 0.$$
We can solve this equation by using the formula for the cosine of a triple angle:
$$\cos 3\theta = 4\cos^3\theta - 3\cos\theta.$$


Figure 4.6. The four types of isometries of the Euclidean plane.

Our trisection procedure allows us to construct $\cos\theta$ when given $\cos 3\theta$. The change of variables $b = 2\sqrt{7}\,c$ puts our equation in the proper form, namely $4c^3 - 3c = 1/(2\sqrt{7})$.

Using straightedge and compass, we can construct any rational number and take square roots. Therefore, with the angle trisecting device we can construct a and thereby construct the regular heptagon.

See [20] for a discussion of the construction of the regular heptagon and the regular triskaidecagon, with thirteen sides, along with the following characterization of the regular polygons that can be constructed using straightedge, compass, and angle trisecting device:

Theorem. A regular n-gon can be constructed with straightedge, compass, and angle trisector if and only if n is of the form
$$2^k 3^l p_1 \cdots p_m,$$
where $k, l, m \ge 0$, and the $p_i$ are distinct primes greater than 3 of the form $2^h 3^j + 1$, with $h, j \ge 0$.
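The chain of substitutions in the heptagon argument, from $a = 2\cos(2\pi/7)$ through b to c, can be checked numerically. In the Python sketch below the variable names follow the text, while `cos_3theta` and `theta` are names introduced here for the angle that the trisecting device would act on.

```python
import math

# a = z + 1/z = 2 cos(2*pi/7), the length we want to construct.
a = 2 * math.cos(2 * math.pi / 7)

# Substitution a = (b - 1)/3, then b = 2*sqrt(7)*c.
b = 3 * a + 1
c = b / (2 * math.sqrt(7))

# With c = cos(theta), the cubic becomes cos(3*theta) = 1/(2*sqrt(7)).
cos_3theta = 1 / (2 * math.sqrt(7))
theta = math.acos(cos_3theta) / 3   # the trisection step

residual_a = a**3 + a**2 - 2 * a - 1
residual_b = b**3 - 21 * b - 7
residual_c = 4 * c**3 - 3 * c - cos_3theta

print(residual_a, residual_b, residual_c, c - math.cos(theta))
```

All four printed quantities should be zero up to rounding error: the first three confirm the cubic equations, and the last confirms that trisecting the angle whose cosine is $1/(2\sqrt{7})$ recovers c.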

4.7 Isometries of the Plane

An isometry of the Euclidean plane $\mathbf{R}^2$ is a function $f : \mathbf{R}^2 \to \mathbf{R}^2$ that preserves distances:
$$|f(a) - f(b)| = |a - b|, \quad \text{for all } a, b \in \mathbf{R}^2.$$
The four types of isometries, as shown in Figure 4.6, are translations, rotations, reflections, and glide-reflections. A glide-reflection is a composition of a translation and a reflection in a line parallel to the direction of translation. Translations and glide-reflections have no fixed points (although a glide-reflection’s reflecting line is fixed as a set of points), while rotations have one fixed point (the center) and reflections have a line of fixed points. Translations and rotations are orientation-preserving, while reflections and glide-reflections are orientation-reversing.

Given that every isometry of the plane is one of these four types, we will prove that all isometries are given by two families of complex functions. For this purpose, we represent the Euclidean plane as $\mathbf{C}$.


Theorem. Every isometry of the Euclidean plane is of the form
$$f(z) = \alpha z + \beta \quad \text{or} \quad f(z) = \alpha \bar{z} + \beta, \quad \text{where } \alpha, \beta \in \mathbf{C},\ |\alpha| = 1.$$
The first function is an orientation-preserving isometry; the second is an orientation-reversing one.

A translation is represented as
$$f(z) = z + \beta, \quad \beta \in \mathbf{C},$$
where $\beta$ gives the direction and magnitude of the translation.

A rotation with center at the origin is represented as
$$f(z) = \alpha z, \quad \alpha \in \mathbf{C},\ |\alpha| = 1.$$
The angle of the rotation is $\arg \alpha$.

A rotation with an arbitrary center is represented using conjugation (in the group theory sense: $ghg^{-1}$) of a rotation by a translation:
$$f(z) = \alpha(z - \gamma) + \gamma, \quad \alpha, \gamma \in \mathbf{C},\ |\alpha| = 1.$$
The angle of rotation is $\arg \alpha$ and the center of rotation is $\gamma$.

If $f(z) = \alpha z + \beta$, with $\alpha, \beta \in \mathbf{C}$, $|\alpha| = 1$, then f is a translation or a rotation. Indeed, if $\alpha = 1$, then f is a translation, while if $\alpha \ne 1$, then f is the rotation given by
$$f(z) = \alpha\left(z - \frac{\beta}{1 - \alpha}\right) + \frac{\beta}{1 - \alpha}.$$

Reflection with respect to the x-axis is represented as
$$f(z) = \bar{z}.$$

Now let us consider a reflection with respect to a line through the origin. Suppose that the reflecting line is given by the complex number $\omega$ (as a vector), with $|\omega| = 1$. Then reflection with respect to this line is effected by conjugation:
$$f(z) = \omega\,\overline{(\omega^{-1} z)} = \omega^2 \bar{z}.$$
Hence $\alpha = \omega^2$.

Reflection with respect to a line parallel to $\omega$ is effected by conjugation by a translation $si\omega$, for some real s:
$$f(z) = \omega^2\,\overline{(z - si\omega)} + si\omega = \omega^2 \bar{z} + 2si\omega, \quad \omega \in \mathbf{C},\ |\omega| = 1,\ s \in \mathbf{R}.$$

Glide-reflection with respect to a line through the origin is represented as
$$f(z) = \omega^2\,\overline{(z + t\omega)} = \omega^2 \bar{z} + t\omega, \quad \omega \in \mathbf{C},\ |\omega| = 1,\ t \in \mathbf{R}.$$

Glide-reflection with respect to an arbitrary line is represented as
$$f(z) = \omega^2 \bar{z} + 2si\omega + t\omega, \quad \omega \in \mathbf{C},\ |\omega| = 1,\ s, t \in \mathbf{R}.$$


The vector $2si\omega$ is perpendicular to $\omega$ and $t\omega$ is parallel to $\omega$.

If $f(z) = \alpha \bar{z} + \beta$, with $\alpha, \beta \in \mathbf{C}$, $|\alpha| = 1$, then f is a reflection or a glide-reflection. For let $\alpha = \omega^2$, and let $2si\omega$ and $t\omega$ be the perpendicular and parallel components of $\beta$ with respect to $\omega$, respectively. Then we may write
$$f(z) = \omega^2 \bar{z} + 2si\omega + t\omega,$$
and we see that f is a glide-reflection (or reflection, if $t = 0$).

It is easy to show that the set of isometries forms a group under composition. The isometries form a closed set with an identity and inverses, and composition of functions is associative.

The group of isometries of the Euclidean plane is generated by one type of isometry: reflections.

A translation is the composition of two reflections.
A rotation is the composition of two reflections.
A reflection is one reflection.
A glide-reflection is the composition of a reflection and a translation, so it is equivalent to three reflections.

Thus, every isometry in the Euclidean plane is a composition of one, two, or three reflections.
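The two families of complex functions are easy to experiment with. The Python sketch below is an illustration under stated assumptions: the helper names and the sample values of $\alpha$ and $\beta$ are chosen here, not taken from the text. It recovers the center of a rotation $f(z) = \alpha z + \beta$ and the parameters $\omega$, s, t of an orientation-reversing isometry $f(z) = \alpha \bar{z} + \beta$.

```python
import cmath


def direct(alpha, beta):
    """Orientation-preserving isometry z -> alpha*z + beta (|alpha| = 1)."""
    return lambda z: alpha * z + beta


def opposite(alpha, beta):
    """Orientation-reversing isometry z -> alpha*conj(z) + beta (|alpha| = 1)."""
    return lambda z: alpha * z.conjugate() + beta


def rotation_center(alpha, beta):
    """Fixed point beta/(1 - alpha) of z -> alpha*z + beta, for alpha != 1."""
    return beta / (1 - alpha)


def glide_parameters(alpha, beta):
    """Return (omega, s, t) with alpha = omega^2 and beta = 2*s*i*omega + t*omega."""
    omega = cmath.sqrt(alpha)
    q = beta / omega          # equals t + 2*s*i because |omega| = 1
    return omega, q.imag / 2, q.real


# Sample rotation by 60 degrees followed by a shift (illustrative values).
alpha_rot = cmath.exp(1j * cmath.pi / 3)
rot = direct(alpha_rot, 2 + 1j)
center = rotation_center(alpha_rot, 2 + 1j)

# Sample orientation-reversing isometry.
alpha_ref, beta_ref = 1j, 1 + 3j
glide = opposite(alpha_ref, beta_ref)
omega, s, t = glide_parameters(alpha_ref, beta_ref)

z = 0.7 - 2.5j
print(abs(rot(center) - center))               # the center is fixed
print(abs(glide(glide(z)) - (z + 2 * t * omega)))  # applying a glide-reflection twice
```

Applying a glide-reflection twice gives the translation by $2t\omega$, which is why the last printed value should vanish; when $t = 0$ the map is a reflection and its square is the identity.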

4.8 Symmetries of Regular Convex Polyhedra

A regular polyhedron has the property that under symmetry all its vertices are equivalent, all its edges are equivalent, and all its faces are equivalent. There are five regular convex polyhedra: tetrahedron, cube, octahedron, dodecahedron, and icosahedron. See Figure 4.7. A simple proof uses Euler’s formula, $V - E + F = 2$, where E is the number of edges, V the number of vertices, and F the number of faces of the polyhedron. See, e.g., [26].

Figure 4.7. The five regular convex polyhedra (tetrahedron, cube, octahedron, dodecahedron, icosahedron).

                vertices   edges   faces   edges per vertex   edges per face
tetrahedron         4        6       4            3                  3
cube                8       12       6            3                  4
octahedron          6       12       8            4                  3
dodecahedron       20       30      12            3                  5
icosahedron        12       30      20            5                  3
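The entries of the table can be cross-checked against Euler’s formula and against two counting identities: every edge has two endpoints and borders two faces. The Python sketch below simply encodes the table; the dictionary layout and the function name are choices made for this example.

```python
# (vertices, edges, faces, edges per vertex, edges per face), copied from the table.
POLYHEDRA = {
    "tetrahedron": (4, 6, 4, 3, 3),
    "cube": (8, 12, 6, 3, 4),
    "octahedron": (6, 12, 8, 4, 3),
    "dodecahedron": (20, 30, 12, 3, 5),
    "icosahedron": (12, 30, 20, 5, 3),
}


def check(name):
    """Return True if the table row satisfies Euler's formula and both edge counts."""
    v, e, f, per_vertex, per_face = POLYHEDRA[name]
    euler = (v - e + f == 2)
    endpoints = (v * per_vertex == 2 * e)   # each edge meets two vertices
    borders = (f * per_face == 2 * e)       # each edge borders two faces
    return euler and endpoints and borders


for name in POLYHEDRA:
    print(name, check(name))
```

Swapping the vertex and face counts of a row gives the row of the dual polyhedron, which matches the pairing of cube with octahedron and dodecahedron with icosahedron.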


Figure 4.8. A regular tetrahedron inscribed in a cube.

A symmetry of a polyhedron is a motion that moves it so that it occupies its original space. The set of all symmetries forms a group under composition of motions. We are considering only proper symmetries of the polyhedron, those that preserve orientation. Reflections are excluded. What are the symmetry groups of the regular convex polyhedra?

The dual polyhedron of a polyhedron is the polyhedron obtained by putting a vertex at the center of each face of the given polyhedron and joining two new vertices if the faces of the given polyhedron share an edge. A polyhedron and its dual have the same symmetry group. The cube and the octahedron are duals, so they have the same symmetry group. The icosahedron and dodecahedron are duals, so they have the same symmetry group. The tetrahedron is self-dual.

If we pick up a cube and set it down again so that it occupies its original space, then its vertices, edges, and faces may have changed position. The symmetry group of the cube is the group of all such ways to reposition the cube. It’s easy to find the order (the number of elements) of the symmetry group of the cube. We can set the cube down on any of its six faces, and then rotate it in any of four ways. Hence there are $6 \cdot 4 = 24$ symmetries. However, we still need to decide which 24-element group this is. We know that the symmetric group $S_4$, the group of permutations on 4 objects, has $4! = 24$ elements. In fact, the symmetry group of the cube, and of the regular octahedron, is isomorphic to $S_4$. To see this, we need only show that the cube contains some four elements that are permuted in all ways by the symmetries of the cube, and that each permutation of them gives rise to a unique symmetry of the cube. The four diagonals of the cube have this property. Every symmetry of the cube permutes the diagonals, and conversely every permutation of the diagonals comes from a symmetry of the cube.

To find the symmetry group of the regular tetrahedron, place the vertices of the tetrahedron at four vertices of a cube. These are opposite pairs of vertices on opposite faces of the cube. See Figure 4.8. Now that we know the symmetry group of the cube ($S_4$), we find that the symmetry group of the regular tetrahedron goes along for the ride. Every symmetry of the cube automatically gives a symmetry of the regular tetrahedron (with the tetrahedron inscribed in the cube), or it moves the vertices of the tetrahedron to the other four vertices of the cube. How many symmetries of the regular tetrahedron are there? Since we can put the tetrahedron down on any of its four faces and then rotate it in any of three ways, the regular tetrahedron has $4 \cdot 3 = 12$ symmetries. The symmetries of the tetrahedron comprise a subgroup of $S_4$ of order 12. It can be shown that this subgroup is the alternating group $A_4$, consisting of even permutations in $S_4$.
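The claims about the cube and the inscribed tetrahedron can be confirmed by enumeration. The Python sketch below rests on one modeling assumption that is not in the text: the cube is placed with vertices at $(\pm 1, \pm 1, \pm 1)$, so that its rotations are exactly the signed permutation matrices of determinant +1. All names in the code are introduced for this example.

```python
from itertools import permutations, product


def perm_sign(p):
    """Sign (+1 or -1) of a permutation given as a tuple of images."""
    inversions = sum(
        1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j]
    )
    return -1 if inversions % 2 else 1


# A rotation is stored as (p, s) and sends v to (s[0]*v[p[0]], s[1]*v[p[1]], s[2]*v[p[2]]).
# Its determinant is perm_sign(p) * s[0] * s[1] * s[2]; we keep determinant +1.
ROTATIONS = [
    (p, s)
    for p in permutations(range(3))
    for s in product((1, -1), repeat=3)
    if perm_sign(p) * s[0] * s[1] * s[2] == 1
]


def apply(rotation, v):
    p, s = rotation
    return tuple(s[i] * v[p[i]] for i in range(3))


# Each diagonal joins v and -v; represent it by the endpoint with first coordinate +1.
DIAGONALS = [(1, 1, 1), (1, 1, -1), (1, -1, 1), (1, -1, -1)]


def diagonal_permutation(rotation):
    """Permutation of the four diagonals induced by a rotation."""
    images = []
    for d in DIAGONALS:
        w = apply(rotation, d)
        if w[0] == -1:
            w = tuple(-x for x in w)
        images.append(DIAGONALS.index(w))
    return tuple(images)


cube_perms = {diagonal_permutation(r) for r in ROTATIONS}

# Inscribed tetrahedron: the four cube vertices whose coordinates multiply to +1.
TETRAHEDRON = {(1, 1, 1), (1, -1, -1), (-1, 1, -1), (-1, -1, 1)}
tetra_rotations = [
    r for r in ROTATIONS if {apply(r, v) for v in TETRAHEDRON} == TETRAHEDRON
]
tetra_perms = {diagonal_permutation(r) for r in tetra_rotations}

print(len(ROTATIONS), len(cube_perms), len(tetra_perms))
```

The three counts are 24, 24, and 12: the 24 rotations induce 24 distinct permutations of the diagonals, hence all of $S_4$, and the rotations preserving the tetrahedron induce 12 permutations, all of them even.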


Similar arguments show that the symmetry group of both the regular icosahedron and the regular dodecahedron is the alternating group A5. A cube can be inscribed in a regular dodecahedron so that each of the twelve edges of the cube is a diagonal of a face of the dodecahedron. See the diagram below. Each face of the dodecahedron has five diagonals (the edges of a pentagram). Hence, five such cubes can be inscribed. Symmetries of the dodecahedron permute the five cubes. It follows that the symmetry group of the dodecahedron is a subgroup of the symmetric group S5. Since the symmetry group has order 60 (why?), it is A5.

Theorem. The regular tetrahedron has symmetry group A4. The cube and regular octahedron have symmetry group S4. The regular dodecahedron and regular icosahedron have symmetry group A5. See [13] for a panoramic and detailed survey of symmetries of geometric ﬁgures.

4.9 Polynomial Symmetries

A symmetry of a polynomial (in several variables) is a permutation of the polynomial’s variables that leaves the polynomial unchanged. For example, the polynomial $xy^2 + yz^2 + zx^2$ has symmetry group $\mathbf{Z}_3$, since the permutations of the variables that preserve the polynomial are $(x)(y)(z)$, $(x, y, z)$, and $(z, y, x)$.

Theorem. Given a finite group G of order n, there exists a polynomial in n variables, all of whose coefficients are 1, with symmetry group G.

Given any n-element group G as a group of permutations of the set $\{1, \dots, n\}$, we see by inspection that the polynomial
$$f(x_1, x_2, \dots, x_n) = \sum_{\sigma \in G} x_{\sigma(1)} x_{\sigma(2)}^2 \cdots x_{\sigma(n)}^n$$
has symmetry group G. The monomial $x_1 x_2^2 \cdots x_n^n$ is mapped by $\sigma$ to another monomial $x_{\sigma(1)} x_{\sigma(2)}^2 \cdots x_{\sigma(n)}^n$ in the polynomial if and only if $\sigma \in G$.
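For small groups the construction can be verified by brute force. In the Python sketch below, a monomial $x_{\sigma(1)} x_{\sigma(2)}^2 \cdots x_{\sigma(n)}^n$ is stored as the tuple $\sigma$ (with indices starting at 0), and a candidate symmetry is a permutation $\tau$ of the variables. The helper names, and the choice of cyclic groups as test cases, are assumptions made for illustration.

```python
from itertools import permutations


def cyclic_group(n):
    """The n cyclic shifts of (0, 1, ..., n-1), as permutation tuples."""
    return [tuple((i + k) % n for i in range(n)) for k in range(n)]


def symmetries(monomials, n):
    """Variable permutations tau (x_i -> x_tau(i)) that fix the set of monomials.

    A monomial is a tuple sigma: position k holds the index of the variable
    raised to the power k + 1. Relabeling variables by tau turns sigma into
    the tuple (tau[sigma[0]], ..., tau[sigma[n-1]]).
    """
    found = []
    for tau in permutations(range(n)):
        image = {tuple(tau[i] for i in sigma) for sigma in monomials}
        if image == monomials:
            found.append(tau)
    return found


z3 = cyclic_group(3)
print(sorted(symmetries(set(z3), 3)))
```

For the cyclic group of order 3 the search returns exactly the three cyclic shifts, in agreement with the statement that the polynomial built from $\mathbf{Z}_3$ has symmetry group $\mathbf{Z}_3$.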


If we do this construction for $\mathbf{Z}_3$, we obtain
$$f(x_1, x_2, x_3) = x_1 x_2^2 x_3^3 + x_2 x_3^2 x_1^3 + x_3 x_1^2 x_2^3.$$
Factoring out $x_1 x_2 x_3$, and making the change of variables $x_1 = x$, $x_2 = y$, and $x_3 = z$, we obtain our polynomial $xy^2 + yz^2 + zx^2$.

The quaternion group Q consists of eight elements, $\pm 1$, $\pm i$, $\pm j$, and $\pm k$, that satisfy the rules $i^2 = j^2 = k^2 = -1$ and $ij = k$, $jk = i$, $ki = j$.

Let’s find a polynomial whose symmetry group is the quaternion group. We use the labeling

1   2   3   4   5   6   7   8
1  −1   i  −i   j  −j   k  −k

Using the multiplication rules, we find that the polynomial is
$$\begin{aligned}
& x_1 x_2^2 x_3^3 x_4^4 x_5^5 x_6^6 x_7^7 x_8^8 + x_2 x_1^2 x_4^3 x_3^4 x_6^5 x_5^6 x_8^7 x_7^8 \\
{}+{} & x_3 x_4^2 x_2^3 x_1^4 x_7^5 x_8^6 x_6^7 x_5^8 + x_4 x_3^2 x_1^3 x_2^4 x_8^5 x_7^6 x_5^7 x_6^8 \\
{}+{} & x_5 x_6^2 x_8^3 x_7^4 x_2^5 x_1^6 x_3^7 x_4^8 + x_6 x_5^2 x_7^3 x_8^4 x_1^5 x_2^6 x_4^7 x_3^8 \\
{}+{} & x_7 x_8^2 x_5^3 x_6^4 x_4^5 x_3^6 x_2^7 x_1^8 + x_8 x_7^2 x_6^3 x_5^4 x_3^5 x_4^6 x_1^7 x_2^8.
\end{aligned}$$
It has symmetry group Q.

A finite group G has many realizations. We have seen that G is the symmetry group of a polynomial. Some other realizations are:

G is given by its multiplication table.
G is isomorphic to a set of permutations, under composition. (Arthur Cayley⁴)
G is isomorphic to a matrix group. This is called a group representation.
G is the automorphism group of a finite graph. (Roberto Frucht⁵)
G is the automorphism group of a compact Riemann surface. (Adolf Hurwitz⁶)
G is the automorphism group of a perfect binary code. (Kevin Phelps⁷)
G is the automorphism group of a distributive lattice. (Garrett Birkhoff⁸)
G is given by a presentation. An example of a group presentation is given in Appendix A.

It is not known whether the following is true:

G is the automorphism group of an algebraic extension of the rational numbers.

⁴ Arthur Cayley (1821–1895) was a pioneer in the areas of algebra, non-Euclidean geometry, and combinatorics.
⁵ Roberto Frucht (1906–1997) was a graph theorist.
⁶ Adolf Hurwitz (1859–1919) was an algebraist, geometer, and number theorist.
⁷ Kevin Phelps is a researcher in the areas of coding theory, combinatorics, and graph theory.
⁸ Garrett Birkhoff (1911–1996) was an algebraist, working specifically in the areas of lattice theory and universal algebra.


Figure 4.9. A tournament on nine vertices.

4.10 Kings and Serfs

A tournament is a complete finite graph in which each edge has been replaced by a directed arrow. Figure 4.9 shows a tournament on nine vertices.

In a tournament, a King is a vertex from which every other vertex can be reached in one or two steps. A Serf is a vertex that can be reached from every other vertex in one or two steps. Every vertex in the tournament of Figure 4.9 is both a King and a Serf. We will show that this is typical.

The outdegree of a vertex v is the number of directed edges that emanate from v. The indegree of v is the number of edges directed to v. Every vertex in the tournament of Figure 4.9 has outdegree 4 and indegree 4.

Theorem (H. G. Landau). Every tournament has a King.

Consider a vertex v of maximum outdegree. We will prove that v is a King of the tournament. Suppose that there are edges directed away from v to r vertices, $u_1, \dots, u_r$. Suppose also that there is a vertex w that cannot be reached in one or two steps from v. Then w is not among the $u_i$ and there are edges directed from w to all the $u_i$ and to v. But this means that the outdegree of w is at least $r + 1$, contradicting the choice of v.

The assertion that every tournament has a Serf is the dual statement of Landau’s theorem.

In a random tournament, the direction of each edge is chosen randomly with equal probability of going in either direction. In a large random tournament, almost assuredly every vertex is both a King and a Serf.

Theorem (Stephen B. Maurer). In a random tournament on n vertices, the probability that every vertex is a King and a Serf tends to 1 as n tends to infinity.

A tournament lacks the desired property if and only if there exists a pair of vertices $v_1$ and $v_2$ with $v_1 \ne v_2$ such that there is no path of length 2 from $v_1$ to $v_2$. In a random tournament, this happens with probability at most
$$\binom{n}{2}\left(\frac{3}{4}\right)^{n-2}.$$
The reason is that there are $\binom{n}{2}$ choices for the “bad” vertices $v_1$ and $v_2$, and $n - 2$ choices for a third vertex w; the probability that there is a path from $v_1$ to w to $v_2$ is $(1/2)^2 = 1/4$.


Hence, the probability that there is no path of length 2 from $v_1$ to $v_2$ is $(3/4)^{n-2}$. Therefore, since probabilities are subadditive, the probability that there exist such $v_1$ and $v_2$, with no path of length 2 from $v_1$ to $v_2$, is bounded by the number of choices of $v_1$ and $v_2$ times $(3/4)^{n-2}$.

Our upper bound is the product of a polynomial and an exponential function with base less than 1. As $n \to \infty$, the exponential function dominates and the product tends to 0. Hence, the probability of the complementary event (the event that every vertex is a King and a Serf) tends to 1 as n tends to $\infty$.
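Kings and Serfs are simple to compute for a concrete tournament. The Python sketch below uses two hypothetical examples that are not taken from the text: a rotational tournament on nine vertices, in which vertex i beats the next four vertices cyclically, and a transitive tournament on four vertices. The function names, including `maurer_bound` for the upper bound above, are introduced here.

```python
from math import comb


def rotational_tournament(n):
    """Tournament on n vertices (n odd): i -> i+1, ..., i+(n-1)/2 modulo n."""
    half = (n - 1) // 2
    return [{(i + k) % n for k in range(1, half + 1)} for i in range(n)]


def transitive_tournament(n):
    """Tournament with i -> j whenever i < j."""
    return [set(range(i + 1, n)) for i in range(n)]


def reverse(adj):
    """Reverse every edge of the tournament."""
    rev = [set() for _ in adj]
    for v, outs in enumerate(adj):
        for u in outs:
            rev[u].add(v)
    return rev


def kings(adj):
    """Vertices from which every other vertex is reachable in one or two steps."""
    result = []
    for v, outs in enumerate(adj):
        reached = set(outs)
        for u in outs:
            reached |= adj[u]
        reached.discard(v)
        if len(reached) == len(adj) - 1:
            result.append(v)
    return result


def serfs(adj):
    """Serfs are the Kings of the reversed tournament."""
    return kings(reverse(adj))


def maurer_bound(n):
    """Upper bound C(n, 2) * (3/4)^(n-2) on the failure probability."""
    return comb(n, 2) * 0.75 ** (n - 2)


nine = rotational_tournament(9)
print(kings(nine), serfs(nine))
print(kings(transitive_tournament(4)), serfs(transitive_tournament(4)))
print([maurer_bound(n) for n in (10, 50, 100)])
```

In the rotational example every vertex is both a King and a Serf, while in the transitive example only the top vertex is a King and only the bottom vertex is a Serf. The bound exceeds 1 for small n and is therefore informative only once n is fairly large.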

4.11 The Erdős–Szekeres Theorem

In any sequence of ten distinct real numbers, there exists an increasing subsequence of four terms or a decreasing subsequence of four terms. The terms in the subsequence need not be consecutive. For example, the ten-term sequence

7, 8, 4, 9, 5, 1, 6, 2, 3, 10

contains, among others, the four-term increasing subsequence 4, 5, 6, 10.

This assertion is an instance of the Erdős–Szekeres theorem, due to Paul Erdős (1913–1996) and George Szekeres (1911–2005), which we state below.

Let the terms of the sequence be $x_1, x_2, \dots, x_{10}$. To each $x_i$, we associate an ordered pair $(u_i, d_i)$, where $u_i$ is the length of a longest increasing subsequence that begins with $x_i$ (the u stands for “up”), and $d_i$ is the length of a longest decreasing subsequence that begins with $x_i$ (the d stands for “down”). Assume that the sequence doesn’t contain an increasing or decreasing subsequence of length four. Then, for each i, we have $1 \le u_i, d_i \le 3$. Now we can invoke the famous pigeonhole principle, a staple of combinatorial mathematics.

Pigeonhole Principle. If $N + 1$ objects are placed in N pigeonholes, then one of the pigeonholes contains at least two objects.

The proof is by contradiction.

We apply the pigeonhole principle with $N = 9$, where the objects are the ten numbers 1, 2, ..., 10, and the pigeonholes are the nine ordered pairs $(1, 1)$, $(1, 2)$, ..., $(3, 3)$. For $1 \le i \le 10$, we place i in the pigeonhole corresponding to the ordered pair $(u_i, d_i)$. The pigeonhole principle guarantees that some two numbers i and j, with $i < j$, are in the same pigeonhole; that is, some $(u_i, d_i)$ and $(u_j, d_j)$ are equal. But this is impossible, for if $x_i < x_j$ then $u_i > u_j$, while if $x_i > x_j$ then $d_i > d_j$. The contradiction means that there exists an increasing subsequence or a decreasing subsequence of length four.

Here is the general statement of the theorem.

Erdős–Szekeres Theorem. In any sequence of $mn + 1$ distinct real numbers, where m and n are positive integers, there exists an increasing subsequence of length $m + 1$ or a decreasing subsequence of length $n + 1$.

The proof uses the pigeonhole principle as in the special case $m = n = 3$. The result of the theorem doesn’t hold if we replace $mn + 1$ by $mn$. We can see this in our special case.
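The labels $(u_i, d_i)$ used in the pigeonhole argument can be computed by a short dynamic program that works from the end of the sequence backward. The Python sketch below is an illustration; the function name `up_down_labels` is an assumption, and the sequences are the example from the text with and without its final term.

```python
def up_down_labels(seq):
    """For each position i, return (u_i, d_i).

    u_i is the length of a longest increasing subsequence starting at seq[i],
    d_i the length of a longest decreasing subsequence starting at seq[i].
    """
    n = len(seq)
    up = [1] * n
    down = [1] * n
    for i in range(n - 1, -1, -1):
        for j in range(i + 1, n):
            if seq[j] > seq[i]:
                up[i] = max(up[i], 1 + up[j])
            elif seq[j] < seq[i]:
                down[i] = max(down[i], 1 + down[j])
    return list(zip(up, down))


ten = [7, 8, 4, 9, 5, 1, 6, 2, 3, 10]
nine = ten[:-1]

labels_ten = up_down_labels(ten)
labels_nine = up_down_labels(nine)

print(labels_ten)
print(labels_nine)
```

For the nine-term sequence the labels fill the nine pigeonholes $(1,1), \dots, (3,3)$ exactly once each, which is why no monotonic subsequence of length four is forced. Adding the tenth term leaves no free pigeonhole, so some label must leave the 3 by 3 range.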


Remove the 10 at the end of the example sequence above, and there is no increasing or decreasing subsequence of length four.

How many sequences consisting of the numbers 1, 2, ..., 9, in some order, contain no monotonic (increasing or decreasing) subsequence of length 4? A computer search shows that there are 1764. This is interesting, because 1764 is a perfect square: $1764 = 42^2$. There are 42 fillings of a $3 \times 3$ array with the numbers 1 through 9, so that the numbers increase in each row and column. These are called standard fillings. Here is an example of a standard filling.

1 2 4

3 5 8

6 7 9

The number of standard fillings is given by the hook length formula. The hook length of a cell in a grid is the number of cells to its right in the same row and below it in the same column, plus one to count the cell itself. The hook lengths for the cells of a $3 \times 3$ grid are shown below.

5 4 3

4 3 2

3 2 1

The number of standard fillings of the $3 \times 3$ grid is the number of permutations of nine elements divided by the product of the hook lengths:
$$\frac{9!}{5 \cdot 4 \cdot 4 \cdot 3 \cdot 3 \cdot 3 \cdot 2 \cdot 2 \cdot 1} = 42.$$
Thus, the number of permutations of nine elements that contain no monotonic subsequence of length four is the square of the number of standard fillings of a $3 \times 3$ grid. We have touched on the rich theory of Young tableaux. See, e.g., [32].
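Both counts can be reproduced by machine. The Python sketch below evaluates the hook length formula for a rectangular grid and counts, by exhaustive search, the permutations with no monotonic subsequence of a given length. The function names are assumptions made for this example, and the search is run here only on the small case of four elements, where it finishes instantly.

```python
from bisect import bisect_left
from itertools import permutations
from math import factorial


def hook_count(rows, cols):
    """Number of standard fillings of a rows x cols grid (hook length formula)."""
    product = 1
    for i in range(rows):
        for j in range(cols):
            # cells to the right + cells below + the cell itself
            product *= (cols - j - 1) + (rows - i - 1) + 1
    return factorial(rows * cols) // product


def longest_increasing(seq):
    """Length of a longest strictly increasing subsequence (patience sorting)."""
    tails = []
    for x in seq:
        k = bisect_left(tails, x)
        if k == len(tails):
            tails.append(x)
        else:
            tails[k] = x
    return len(tails)


def count_without_monotone(n_items, length):
    """Count permutations of 1..n_items with no monotonic subsequence of the given length."""
    count = 0
    for perm in permutations(range(1, n_items + 1)):
        increasing = longest_increasing(perm)
        decreasing = longest_increasing([-x for x in perm])
        if increasing < length and decreasing < length:
            count += 1
    return count


print(hook_count(3, 3))
print(hook_count(2, 2), count_without_monotone(4, 3))
```

The small case mirrors the one in the text: a 2 by 2 grid has 2 standard fillings, and 4 = 2² permutations of four elements avoid monotonic subsequences of length three. Calling `count_without_monotone(9, 4)` repeats the computer search mentioned in the text; it runs through all 9! permutations, so it takes noticeably longer.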

4.12 Minkowski’s Theorem

A region in the plane is convex if it contains the line segment joining any two of its points. The region is centrally symmetric if it contains the point $(-x, -y)$ whenever it contains the point $(x, y)$. A lattice point in the plane is a point with integer coordinates. A convex centrally symmetric planar region certainly contains at least one lattice point: the origin. A famous theorem of Hermann Minkowski⁹ asserts that if such a region has area greater than 4, then it must contain another lattice point.

⁹ Hermann Minkowski (1864–1909) made fundamental contributions in number theory and relativity theory.


Figure 4.10. A planar region satisfying the conditions of Minkowski’s theorem.

Minkowski’s Theorem. A convex centrally symmetric planar region of area greater than 4 contains a lattice point other than the origin. Figure 4.10 shows an example of a convex centrally symmetric region of area greater than 4. We see that it contains the lattice points .1; 1/ and . 1; 1/, in addition to .0; 0/. It’s a good exercise to show why the conclusion of Minkowski’s theorem fails if the region isn’t convex or isn’t centrally symmetric. We sketch a proof of Minkowski’s theorem. Suppose that K is a convex centrally symmetric planar region of area greater than 4. Then we claim that K contains distinct points v .a; b/ and w .c; d/ such that v w is an ordered pair of even integers. The proof D D of this claim uses the pigeonhole principle, but in a different version from the one in the previous section. For each element of K, add or subtract an ordered pair of even integers so that the result is a pair of numbers both between 1 and 1. Since the area of K is greater than 4, some distinct points v and w are mapped to the same point, and hence have the desired property. Since K is centrally symmetric, w is an element of K. Because K is convex, the midpoint of v and w, that is, v . w/ C ; 2 is in K. Since each coordinate of the numerator is an even integer, the midpoint is a lattice point,and it is not .0; 0/ since v w. So, we have proved the existence of a lattice point in ¤ K other than the origin. Because of central symmetry, K contains two such lattice points. We can deduce a consequence of Minkowski’s theorem in number theory. Theorem. If p is a prime number of the form 4n 1, then p is the sum of two squares of C integers: p x2 y2. D C For instance, 29 52 22. The representation of p as a sum oftwosquares isuniqueup D C to the order of the squares, but we won’t prove this. There is no representation of a prime p as a sum of two squares if p 3 .mod 4/. Squares modulo 4 are either 0 or 1, so the Á


sum of two squares must be 0, 1, or 2 modulo 4.

Our proof requires a generalization of Minkowski’s theorem to arbitrary planar lattices. Let $v_1$ and $v_2$ be linearly independent vectors in the plane. Then $v_1$ and $v_2$ generate a lattice $\Lambda$ consisting of all sums of integral multiples of the two vectors:
\[ \Lambda = \{ m_1 v_1 + m_2 v_2 : m_1, m_2 \text{ integers} \}. \]
Let $\Delta$ be the area of the parallelogram spanned by $v_1$ and $v_2$.

Minkowski’s Theorem (for an Arbitrary Lattice). A convex centrally symmetric planar region of area greater than $4\Delta$ contains a lattice point other than the origin.

Let $p$ be a prime of the form $4n + 1$. An important fact of number theory is that $-1$ is a square modulo $p$. A good way to think about this is in terms of the group of units (the nonzero elements) modulo $p$. For any prime, the units form a cyclic group; that is, the group consists of powers of an element called a generator. For instance, the group of units modulo 7 is generated by 3, because modulo 7 we have $3^1 \equiv 3$, $3^2 \equiv 2$, $3^3 \equiv 6$, $3^4 \equiv 4$, $3^5 \equiv 5$, and $3^6 \equiv 1$. For our prime $p$, let $g$ be a generator. Then $(g^n)^2 = g^{2n} \equiv -1 \pmod p$, because $g^{2n}$ is a square root of $g^{4n} = g^{p-1} \equiv 1$ and is not itself congruent to 1.

Let $h = g^n$ (whose square is $-1$ modulo $p$), and define $\Lambda$ by the vectors $v_1 = (h, 1)$ and $v_2 = (p, 0)$. The parallelogram spanned by $v_1$ and $v_2$ has area $\Delta = 1 \cdot p = p$.

If $(x, y) \in \Lambda$, then $x^2 + y^2 \equiv 0 \pmod p$. The reason is that $x = m_1 h + m_2 p$ and $y = m_1$, for some integers $m_1$ and $m_2$, and hence
\[ x^2 + y^2 = (m_1 h + m_2 p)^2 + m_1^2 \equiv m_1^2 (h^2 + 1) \equiv 0 \pmod p. \]

Let $K$ be the open disk of radius $\sqrt{2p}$ centered at the origin. The area of $K$ is $2\pi p > 4\Delta$, and of course a disk is convex and centrally symmetric. Minkowski’s theorem guarantees the existence of a lattice point $(x, y)$ in $K$ other than the origin. For this lattice point,
\[ 0 < x^2 + y^2 < 2p. \]
Since $x^2 + y^2 \equiv 0 \pmod p$, we have $x^2 + y^2 = p$, a representation of $p$ as a sum of two squares.

See [39] for an excellent introduction to Minkowski’s theorem and its consequences.
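The lattice used in this proof can be explored directly. The following is a minimal Python sketch, not part of the original text: the function name `two_squares` is hypothetical, the square root $h$ of $-1$ is found by brute force instead of through a generator $g$, and a finite search over the disk stands in for the existence guarantee of Minkowski’s theorem.

```python
from math import isqrt

def two_squares(p):
    """Return (x, y) with x*x + y*y == p, assuming p is a prime of the form 4n + 1."""
    # Brute-force a square root h of -1 modulo p (one exists because p = 4n + 1).
    h = next(k for k in range(1, p) if (k * k + 1) % p == 0)
    # Lattice points have the form (m1*h + m2*p, m1). For each y = m1 inside the
    # disk of radius sqrt(2p), choose m2 so that x has the smallest absolute value.
    for y in range(1, isqrt(2 * p) + 1):
        x = (y * h) % p
        if x > p // 2:
            x -= p
        if x * x + y * y == p:
            return abs(x), y
    raise ValueError("p must be a prime of the form 4n + 1")

print(two_squares(29))  # (5, 2)
```

Every lattice point has $x^2 + y^2$ divisible by $p$, so the test `x * x + y * y == p` picks out the points of the disk that the theorem promises.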

4.13 Lagrange’s Theorem

Lagrange’s Theorem. Every positive integer is a sum of four squares.

For instance,
\[ 15 = 3^2 + 2^2 + 1^2 + 1^2. \]
Three squares do not always suffice; e.g., 7 isn’t a sum of three squares. Our proof of this number theory gem uses Minkowski’s theorem.

Let’s start with the easiest case:
\[ 1 = 1^2 + 0^2 + 0^2 + 0^2. \]
We must show that every integer greater than 1 is a sum of four squares. If $m$ is a sum of four squares and $n$ is a sum of four squares, then the product $mn$ is a sum of four squares. To understand this, we turn, surprisingly, to alternative number systems.


The length of the complex number $z = a + bi$ is $|z| = \sqrt{a^2 + b^2}$. Let $z_1 = a + bi$ and $z_2 = c + di$. Then from the identity
\[ |z_1| |z_2| = |z_1 z_2|, \]
we obtain, upon squaring,
\[ (a^2 + b^2)(c^2 + d^2) = (ac - bd)^2 + (ad + bc)^2. \]
Thus, the product of two sums of squares is itself a sum of squares.

We can get the same result for sums of four squares by using quaternions (see Polynomial Symmetries). The length of a quaternion $q = a + bi + cj + dk$ is
\[ |q| = \sqrt{a^2 + b^2 + c^2 + d^2}. \]
Let $q_1 = a + bi + cj + dk$ and $q_2 = w + xi + yj + zk$. Then the identity
\[ |q_1| |q_2| = |q_1 q_2| \]
yields
\begin{align*}
(a^2 + b^2 + c^2 + d^2)(w^2 + x^2 + y^2 + z^2)
  &= (aw - bx - cy - dz)^2 + (ax + bw + cz - dy)^2 \\
  &\quad + (ay - bz + cw + dx)^2 + (az + by - cx + dw)^2.
\end{align*}
Hence, a product of two sums of four squares is a sum of four squares. Since every integer greater than 1 factors into prime numbers, we need only prove that every prime is a sum of four squares.

We will use a generalization of Minkowski’s theorem to arbitrary lattices in $d$-dimensional space. A lattice $\Lambda$ in $d$-dimensional space is the collection of integer linear combinations of $d$ independent vectors, $v_1, \ldots, v_d$:
\[ \Lambda = \{ m_1 v_1 + \cdots + m_d v_d : m_1, \ldots, m_d \text{ integers} \}. \]
Let $\Delta$ be the volume of the parallelepiped spanned by $v_1, \ldots, v_d$.

Minkowski’s Theorem (for an Arbitrary Lattice in d-Dimensional Space). A convex centrally symmetric region in $d$-dimensional space of volume greater than $2^d \Delta$ contains a lattice point other than the origin.

Let $p$ be a prime. For $p = 2$, we have $2 = 1^2 + 1^2 + 0^2 + 0^2$. So, let’s assume that $p$ is an odd prime. In order to carry out our proof that $p$ is a sum of four squares, we will describe a lattice in 4-dimensional space. Its definition depends on the solution to a congruence modulo $p$. We claim that there exist integers $a$ and $b$ such that
\[ a^2 + b^2 + 1 \equiv 0 \pmod p. \]
There are exactly $(p + 1)/2$ distinct squares modulo $p$, namely, the squares modulo $p$ of the integers $0, 1, \ldots, (p - 1)/2$. Thus, the quantities $a^2$ and $-(b^2 + 1)$ both take exactly


$(p + 1)/2$ values modulo $p$. By the pigeonhole principle, there exist $a$ and $b$ for which they are equal, so they satisfy the congruence.

We define the lattice $\Lambda$ in 4-dimensional space to be the set of integer linear combinations of the vectors
\begin{align*}
v_1 &= (p, 0, 0, 0), \\
v_2 &= (0, p, 0, 0), \\
v_3 &= (a, b, 1, 0), \\
v_4 &= (b, -a, 0, 1).
\end{align*}
Recall from Chapter 1 that a $d \times d$ determinant is equal to the (signed) volume of the parallelepiped in $d$-dimensional space spanned by its rows. Hence
\[ \Delta = \begin{vmatrix} p & 0 & 0 & 0 \\ 0 & p & 0 & 0 \\ a & b & 1 & 0 \\ b & -a & 0 & 1 \end{vmatrix}. \]
This lower-triangular determinant is equal to the product of its diagonal entries: $\Delta = p^2$.

If $(x_1, x_2, x_3, x_4) \in \Lambda$, then there exist integers $m_1$, $m_2$, $m_3$, and $m_4$ such that
\begin{align*}
(x_1, x_2, x_3, x_4) &= m_1 (p, 0, 0, 0) + m_2 (0, p, 0, 0) + m_3 (a, b, 1, 0) + m_4 (b, -a, 0, 1) \\
 &= (m_1 p + m_3 a + m_4 b,\; m_2 p + m_3 b - m_4 a,\; m_3,\; m_4).
\end{align*}
Modulo $p$ we have
\begin{align*}
x_1^2 + x_2^2 + x_3^2 + x_4^2 &\equiv (m_3 a + m_4 b)^2 + (m_3 b - m_4 a)^2 + m_3^2 + m_4^2 \\
 &\equiv m_3^2 a^2 + m_4^2 b^2 + m_3^2 b^2 + m_4^2 a^2 + m_3^2 + m_4^2 \\
 &\equiv (a^2 + b^2 + 1)(m_3^2 + m_4^2) \\
 &\equiv 0.
\end{align*}

Let $K$ be the open four-dimensional hypersphere of radius $\sqrt{2p}$ centered at the origin. The formula for the volume of a $d$-dimensional hypersphere (ball) of radius $r$ was given in Chapter 3:
\[ \frac{\pi^{d/2} r^d}{(d/2)!}. \]
For $K$, we have $d = 4$ and $r = \sqrt{2p}$, so its volume is
\[ 2\pi^2 p^2. \]
Since $\pi^2 > 8$, this expression is greater than $16p^2 = 2^4 \Delta$, and Minkowski’s theorem implies the existence of a lattice point in $K$ other than the origin. If $(x_1, x_2, x_3, x_4)$ is such a point, then
\[ 0 < x_1^2 + x_2^2 + x_3^2 + x_4^2 < 2p, \]
and therefore $x_1^2 + x_2^2 + x_3^2 + x_4^2 = p$, a representation of $p$ as a sum of four squares. This concludes the proof of Lagrange’s theorem.
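The two ingredients of this proof, the multiplicative identity and the four-dimensional lattice, can be tried out numerically. The sketch below is illustrative only and is not from the original text. All function names are hypothetical. Trial division and an exhaustive search over small $m_3$, $m_4$ stand in for the factorization and for the existence argument supplied by Minkowski’s theorem.

```python
from itertools import product
from math import isqrt

def quaternion_product(q1, q2):
    """Multiply quaternions given as 4-tuples (real, i, j, k)."""
    a, b, c, d = q1
    w, x, y, z = q2
    return (a*w - b*x - c*y - d*z,
            a*x + b*w + c*z - d*y,
            a*y - b*z + c*w + d*x,
            a*z + b*y - c*x + d*w)

def norm(q):
    """Sum of the squares of the four coordinates."""
    return sum(t * t for t in q)

def find_ab(p):
    """Find (a, b) with a^2 + b^2 + 1 = 0 (mod p), assuming p is an odd prime."""
    half = (p - 1) // 2
    for a, b in product(range(half + 1), repeat=2):
        if (a * a + b * b + 1) % p == 0:
            return a, b
    raise ValueError("p must be an odd prime")

def four_squares_prime(p):
    """Search the lattice of the proof for a point of squared length p."""
    if p == 2:
        return (1, 1, 0, 0)
    a, b = find_ab(p)
    r = isqrt(p)  # every coordinate of a point of squared length p is at most sqrt(p)
    for m3, m4 in product(range(-r, r + 1), repeat=2):
        # Choose m1 and m2 so that x1 and x2 have the smallest absolute value.
        x1 = (m3 * a + m4 * b) % p
        x2 = (m3 * b - m4 * a) % p
        x1 -= p if x1 > p // 2 else 0
        x2 -= p if x2 > p // 2 else 0
        if x1 * x1 + x2 * x2 + m3 * m3 + m4 * m4 == p:
            return (x1, x2, m3, m4)
    raise ValueError("p must be prime")

def four_squares(n):
    """Combine the prime factors of n with the quaternion identity."""
    result, d = (1, 0, 0, 0), 2
    while n > 1:
        if n % d == 0:
            result = quaternion_product(result, four_squares_prime(d))
            n //= d
        else:
            d += 1
    return tuple(abs(t) for t in result)

print(four_squares(15), norm(four_squares(15)))
```

The representation returned for a given integer depends on the order of the search, so it need not match the example $15 = 3^2 + 2^2 + 1^2 + 1^2$; only the sum of the squares is determined.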


4.14 Van der Waerden’s Theorem

An $l$-term arithmetic progression ($l$-AP) is a sequence
\[ a,\ a + d,\ a + 2d,\ \ldots,\ a + (l - 1)d, \]
where $a$ is the initial term of the sequence and $d$ is the common difference between two consecutive terms of the sequence. For instance, 10, 15, 20, 25, 30, 35, 40 is a 7-AP with initial term 10 and common difference 5.

In 1927 B. L. van der Waerden¹⁰ showed that if the set of positive integers $N$ is partitioned into two classes, then at least one of the classes contains arbitrarily long arithmetic progressions.

Van der Waerden’s Theorem. If the set of positive integers ($N$) is partitioned into two classes, then at least one contains an $l$-AP for every $l \ge 1$.

The theorem doesn’t say that some class contains an infinite arithmetic progression. This isn’t guaranteed, as you can see by defining the classes so that 1 is in the first class, 2 and 3 are in the second class, 4, 5, and 6 are in the first class, 7, 8, 9, and 10 are in the second class, and so on. Neither class contains an infinite arithmetic progression.

To prove van der Waerden’s theorem, we will generalize it to allow for a partitioning of $N$ into any finite number of classes.

Van der Waerden’s Theorem (Infinite Version). Let $c$ be an integer greater than 1. If the set of positive integers ($N$) is partitioned into $c$ classes, then at least one of the classes contains an $l$-AP for every $l \ge 1$.

We call this theorem the “infinite version” because another version of van der Waerden’s theorem concerns only finite sections of the positive integers. This statement is referred to as the “finite version” of the theorem. We set $N(n) = \{1, 2, 3, \ldots, n\}$.

Van der Waerden’s Theorem (Finite Version). Given integers $c \ge 1$ and $l \ge 1$, there exists a least integer $W(c, l)$ with the property that if $N(W(c, l))$ is partitioned into $c$ classes, then one of the classes contains an $l$-AP.

We sometimes refer to the classes as colors and to the partition of $N(W(c, l))$ as a $c$-coloring.
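The definitions above can be made concrete with a short Python sketch. It is an illustration, not part of the original text, and the names `contains_ap` and `block_classes` are hypothetical. The first function tests a finite set for an $l$-AP; the second builds an initial segment of the two classes described above, with alternating blocks of lengths 1, 2, 3, and so on.

```python
def contains_ap(numbers, length):
    """Return True if the finite set contains an arithmetic progression of `length` terms."""
    s = set(numbers)
    if length <= 1:
        return len(s) >= length
    top = max(s, default=0)
    for a in s:
        d = 1
        while a + (length - 1) * d <= top:
            if all(a + k * d in s for k in range(length)):
                return True
            d += 1
    return False

def block_classes(n):
    """Split 1..n into two classes using alternating blocks of sizes 1, 2, 3, ..."""
    classes = ([], [])
    value, size, which = 1, 1, 0
    while value <= n:
        classes[which].extend(range(value, min(value + size, n + 1)))
        value += size
        size += 1
        which = 1 - which
    return classes

print(contains_ap([10, 15, 20, 25, 30, 35, 40], 7))  # True
first, second = block_classes(10)
print(first)   # [1, 4, 5, 6]
print(second)  # [2, 3, 7, 8, 9, 10]
```

Longer blocks supply longer progressions, which is why both classes contain arbitrarily long finite progressions even though neither contains an infinite one.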
An $l$-AP contained in a single class is called a monochromatic $l$-AP. The values of $W(c, l)$ are called van der Waerden numbers.

The generalization to $c$ colors is equivalent to the restriction to two colors. For in a coloring with $c$ colors, all the colors but one could be combined to make one color, and hence produce a coloring using only two colors. If the monochromatic $l$-AP is found in the color that isn’t a combined color, then we are done; otherwise, we can repeat the argument until we have a monochromatic $l$-AP in one of the $c$ colors.¹¹

¹⁰ Bartel Leendert van der Waerden (1903–1996) made contributions in algebra and wrote a famous textbook on the subject called Modern Algebra.

¹¹ In proving the theorem that bears his name, van der Waerden worked with Emil Artin (1898–1962) and Otto Schreier (1901–1929). Schreier suggested the finite version of the theorem, while Artin suggested the generalization to arbitrarily many colors.


The finite version of van der Waerden’s theorem is equivalent to the infinite version.

To see that the finite version implies the infinite version, suppose that we have a partition of $N$ into $c$ classes. We want to show that one of the classes contains arbitrarily long finite arithmetic progressions. For any $l$, there exists an integer $W(c, l)$ such that no matter how $N(W(c, l))$ is partitioned into $c$ classes, one class contains a monochromatic $l$-AP. Thus, for each $l \ge 1$, some class in the partition of $N$ contains a monochromatic $l$-AP. Since there are only finitely many classes, one of the classes must contain monochromatic $l$-APs for infinitely many values of $l$. This class contains arbitrarily long monochromatic arithmetic progressions.

We will show that the infinite version implies the finite version for $c = 2$ (the general case is handled in the same way). Actually, we will prove the contrapositive; that is, we will assume that the finite version is false and show that the infinite version is false. That the finite version is false for $c = 2$ means that there is a positive integer $l$ such that for every positive integer $n$ there is a partition of $N(n)$ into two classes neither of which contains an $l$-AP. For $n = 1, 2, 3, \ldots$, let $D(n)$ be a partition of $N(n)$ into two classes, $A$ and $B$, such that neither $A$ nor $B$ contains an $l$-AP. For instance, if $l = 3$ and $n = 6$, we can take $D(6)$ to be $N(6) = \{1, 3, 4, 6\} \cup \{2, 5\}$, as neither subset contains a 3-AP. Consider the sequence $S$ of such partitions:
\[ S = \{ D(1), D(2), D(3), \ldots \}. \]
The integer 1 occurs in one of the classes in each partition, so it must occur in the same class, either $A$ or $B$, in infinitely many of them. Let
\[ S(1) = \{ D(1, 1), D(1, 2), D(1, 3), \ldots \} \]
be a subsequence of $S$ in which 1 occurs in the same class in each partition. In the same way, we may form a subsequence $S(2)$ of $S(1)$ in which the integer 1 occurs in the same class and the integer 2 occurs in the same class (although 1 and 2 need not occur in the same class):
\[ S(2) = \{ D(2, 2), D(2, 3), D(2, 4), \ldots \}. \]
Continuing, we can get a sequence
\[ S(n) = \{ D(n, n), D(n, n + 1), D(n, n + 2), \ldots \}, \]
in which the integer $i$ occurs in the same class in each partition, for $i = 1, \ldots, n$.

Now we define a partition $D$ of $N = \{1, 2, 3, \ldots\}$ by putting $m$ into the class in which it appears in $D(m, m)$. If the infinite version were true, then the partition $D$ would have a class that contained an $l$-AP in the first $k$ integers, for some $k$. Because $D(k, k)$ belongs to each of $S(1), \ldots, S(k)$, it places every $m \le k$ in the same class as $D$ does. But then $D(k, k)$ would contain an $l$-AP, which contradicts the fact that $D(k, k)$ was constructed so that it doesn’t contain an $l$-AP. Hence the infinite version must be false. The contrapositive of this implication is that the infinite version implies the finite version. This is what we wanted to show.

We will now prove the finite version of van der Waerden’s theorem. The proof is by mathematical induction on $l$, the number of terms in the monochromatic arithmetic progression. The result is trivially true for $l = 1$, since $W(c, 1) = 1$ for all $c$. It is also trivially true for $l = 2$, since $W(c, 2) = c + 1$ for all $c$. This statement is the basis of the induction.


Given $l \ge 2$, we assume that $W(c, l)$ exists for all $c \ge 2$, and we will prove the existence of $W(c, l + 1)$ for all $c \ge 2$. This will complete the induction.

We claim that $W(c, l + 1)$ exists and satisfies $W(c, l + 1) \le f(c)$, where $f$ is defined recursively:
\begin{align*}
f(1) &= 2W(c, l), \\
f(n) &= 2W(c^{f(n-1)}, l)\, f(n - 1), \quad n \ge 2.
\end{align*}

Suppose that $N(f(c))$, which we call a $c$-block, is $c$-colored without a monochromatic $(l + 1)$-AP, and $N(f(c))$ is partitioned into $f(c)/f(c - 1)$ blocks of $f(c - 1)$ consecutive integers. Let’s call these blocks of integers $(c - 1)$-blocks. Likewise, each $(c - 1)$-block is partitioned into $f(c - 1)/f(c - 2)$ blocks of $f(c - 2)$ consecutive integers, which we call $(c - 2)$-blocks. This partitioning takes place at each of the $c$ levels, until each 1-block is partitioned into $2W(c, l)$ 0-blocks (which are integers).

By the definition of $W(c, l)$, the first half of each 1-block contains a monochromatic $l$-AP. The coloring of the elements of a 1-block induces a coloring of the 1-block itself: we assign one of $c^{f(1)}$ colors to the 1-block according to the way its elements are $c$-colored. Since $f(2) = 2W(c^{f(1)}, l)\, f(1)$, each 2-block contains $2W(c^{f(1)}, l)$ 1-blocks, so that by definition of $W(c^{f(1)}, l)$ the first half of each 2-block contains a monochromatic $l$-AP of 1-blocks. By similar reasoning, the first half of each 3-block contains a monochromatic $l$-AP of 2-blocks. This property holds at each level, with the first half of the $c$-block $N(f(c))$ containing a monochromatic $l$-AP of $(c - 1)$-blocks.

We consider only those integers that lie in $l$-APs at all $c$ levels of blocks. We coordinatize each integer as
\[ x = (x_1, \ldots, x_c), \]
with $1 \le x_i \le l$, where $x_i$ is the position of $x$ in the monochromatic $l$-AP of the $i$-block in which it occurs. All coordinatized integers have the same color, say $\alpha_1$. Within each 1-block, the $l$ integers
\[ (1, x_2, \ldots, x_c),\ (2, x_2, \ldots, x_c),\ \ldots,\ (l, x_2, \ldots, x_c) \]
constitute a monochromatic $l$-AP. Therefore, the integer $(l + 1, x_2, \ldots, x_c)$ is a color other than $\alpha_1$, say $\alpha_2$. Furthermore, the factor 2 in the definition of $f(1)$ implies that $(l + 1, x_2, \ldots, x_c)$ occurs within the 1-block.

Now we introduce the idea of focusing. Within a 2-block, the $l$ integers
\[ (l + 1, 1, x_3, \ldots, x_c),\ (l + 1, 2, x_3, \ldots, x_c),\ \ldots,\ (l + 1, l, x_3, \ldots, x_c) \]
are a monochromatic $l$-AP of color $\alpha_2$. This forces $(l + 1, l + 1, x_3, \ldots, x_c)$ to be a color other than $\alpha_2$. However, we can focus a second $l$-AP on this integer, namely,
\[ (1, 1, x_3, \ldots, x_c),\ (2, 2, x_3, \ldots, x_c),\ \ldots,\ (l, l, x_3, \ldots, x_c). \]
Thus, $(l + 1, l + 1, x_3, \ldots, x_c)$ cannot be color $\alpha_1$ or $\alpha_2$; say it is color $\alpha_3$. Figure 4.11 illustrates the two focused progressions, representing colors $\alpha_1$, $\alpha_2$, $\alpha_3$ by dots, circles, and an x, respectively. The dashes represent numbers with undetermined colors. Continuing the


[ . . o ]   [ . . o ]   [ - - x ]   (three consecutive 1-blocks making up one 2-block)

Figure 4.11. Focusing in a 2-block.

focusing process at each of the $c$ levels, we conclude that $(l + 1, l + 1, \ldots, l + 1)$ can be none of the colors $\alpha_1, \ldots, \alpha_c$, a contradiction. Hence, there exists a monochromatic $(l + 1)$-AP. This completes the induction.

As an example of van der Waerden’s theorem, there exists a number $n$ such that no matter how $N(n)$ is partitioned into two classes, one of the classes will contain a 3-AP. It turns out that the least such $n$ is 9. That is, no matter how we partition $N(9)$ into two classes, one of the two classes must contain a 3-AP; it is unavoidable. For example, if we have $N(9) = \{1, 3, 4, 6, 8\} \cup \{2, 5, 7, 9\}$, then the first subset contains 4, 6, 8, which is a 3-AP. But we can’t take $n$ to be 8 and always expect to get a 3-AP in one class. Here is a partition of $N(8)$ into two classes neither of which contains a 3-AP:
\[ N(8) = \{1, 2, 3, 4, 5, 6, 7, 8\} = \{1, 3, 6, 8\} \cup \{2, 4, 5, 7\}. \]
This isn’t the only partition that avoids a 3-AP in both classes. We can also partition $N(8)$ as
\[ \{1, 2, 5, 6\} \cup \{3, 4, 7, 8\}. \]

It turns out that if we are looking for a 4-AP then we will have to partition $N(35)$ into two classes in order to guarantee it. A general problem is the determination of $W(2, l)$ for various values of $l$. It has been solved only for $l = 1$, 2, 3, 4, and 5. It is trivial that $W(2, 1) = 1$, because the class that contains the only element of $N(1)$ will contain a 1-term arithmetic progression. It is also immediate that $W(2, 2) = 3$, because when three integers are partitioned into two classes, one of the classes must contain at least two integers and hence a 2-AP. We have indicated that $W(2, 3) = 9$. It is nontrivial to show that $W(2, 4) = 35$ and $W(2, 5) = 178$. The known values of $W(c, l)$ with $c \ge 2$ and $l \ge 3$ are $W(2, 3) = 9$, $W(2, 4) = 35$, $W(2, 5) = 178$, $W(3, 3) = 27$, and $W(4, 3) = 76$. As is suggested by the upper bound $f(c)$ in our proof of van der Waerden’s theorem, it is known that $W(c, l)$ grows very fast.

Van der Waerden’s theorem is important in a branch of combinatorics called Ramsey theory. Ramsey theory is the part of combinatorics that studies the question, “What order exists in disorder?” Graphs and arithmetic sequences are natural settings for the question, and there we find the seminal Ramsey theory results (Ramsey’s theorem and van der Waerden’s theorem). Van der Waerden’s theorem says that, if to each positive integer we assign a color (say, green or red), then we will possess an infinite collection of pieces of information. The order in this collection is the existence of arbitrarily long monochromatic arithmetic progressions.
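The value $W(2, 3) = 9$ is small enough to confirm by exhaustive search. The following Python sketch is an illustration only, with hypothetical function names; it lists every 2-coloring of $N(n)$ that has no monochromatic 3-AP. The same brute force becomes impractical very quickly for larger van der Waerden numbers.

```python
from itertools import product

def has_mono_ap(coloring, length):
    """coloring[i] is the color of the integer i + 1."""
    n = len(coloring)
    for a in range(1, n + 1):
        d = 1
        while a + (length - 1) * d <= n:
            if len({coloring[a + k * d - 1] for k in range(length)}) == 1:
                return True
            d += 1
    return False

def ap_free_colorings(n, colors=2, length=3):
    """All colorings of 1..n with no monochromatic progression of the given length."""
    return [c for c in product(range(colors), repeat=n)
            if not has_mono_ap(c, length)]

print(len(ap_free_colorings(8)))  # 6
print(len(ap_free_colorings(9)))  # 0
```

The six colorings of $N(8)$ are three partitions, each counted twice because the two colors can be swapped.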


The deﬁnitive book on Ramsey theory, [22], explores several generalizations of van der Waerden’s theorem.

4.15 Latin Squares and Projective Planes

A latin square¹² of order $n$ is an $n \times n$ array in which all the numbers 1 through $n$ appear in every row and in every column. Two latin squares of order 3 are

1 2 3          1 2 3
2 3 1   and    3 1 2
3 1 2          2 3 1.

In each array, the numbers 1, 2, and 3 appear in every row and in every column. If we superimpose them, every ordered pair of the numbers 1, 2, and 3 appears exactly once. Such latin squares are called orthogonal.

11 22 33
23 31 12
32 13 21
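The two properties just described are easy to test mechanically. Here is a minimal Python sketch, with hypothetical names `is_latin` and `are_orthogonal`, applied to the two squares of order 3 shown above.

```python
def is_latin(square):
    """Check that each of 1..n appears in every row and every column."""
    n = len(square)
    symbols = set(range(1, n + 1))
    rows_ok = all(set(row) == symbols for row in square)
    cols_ok = all({row[j] for row in square} == symbols for j in range(n))
    return rows_ok and cols_ok

def are_orthogonal(first, second):
    """Check that superimposing the squares gives every ordered pair exactly once."""
    n = len(first)
    pairs = {(first[i][j], second[i][j]) for i in range(n) for j in range(n)}
    return len(pairs) == n * n

A = [[1, 2, 3], [2, 3, 1], [3, 1, 2]]
B = [[1, 2, 3], [3, 1, 2], [2, 3, 1]]
print(is_latin(A), is_latin(B), are_orthogonal(A, B))  # True True True
```

Since there are exactly $n^2$ cells, counting distinct pairs is enough: $n^2$ distinct pairs means no pair is repeated.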

From the two orthogonal latin squares of order 3, we can construct a projective plane of order 3, as shown in Figure 2.5. Change the numbers in the latin squares from 1, 2, and 3 to 0, 1, and 2. A projective plane of order three has thirteen points and thirteen lines. The thirteen points are the ordered pairs

00, 01, 02, 10, 11, 12, 20, 21, 22,

together with the ideal points 0, 1, 2, and $\infty$. We arrange the nine non-ideal points in a $3 \times 3$ array, and the four ideal points around the array, as in Figure 2.5.

We need to define the thirteen lines. The first three lines are the horizontal lines
\[ \{00, 10, 20, 0\}, \quad \{01, 11, 21, 0\}, \quad \{02, 12, 22, 0\}. \]
The next three lines are the vertical lines
\[ \{00, 01, 02, \infty\}, \quad \{10, 11, 12, \infty\}, \quad \{20, 21, 22, \infty\}. \]
The next three lines go through constant values in the first latin square (together with the ideal point 1):
\[ \{02, 10, 21, 1\}, \quad \{01, 12, 20, 1\}, \quad \{00, 11, 22, 1\}. \]

¹² The term “latin square” derives from the use by Leonhard Euler (1707–1783) of latin letters instead of numbers in the arrays.


The next three lines go through constant values in the second latin square (together with the ideal point 2):
\[ \{02, 11, 20, 2\}, \quad \{00, 12, 21, 2\}, \quad \{01, 10, 22, 2\}. \]
Finally, we include the ideal line
\[ \{0, 1, 2, \infty\}. \]
It’s easy to check that each line contains four points and each point is on four lines. Furthermore, it can be checked that two points determine a unique line and two lines intersect in exactly one point.

A set of mutually orthogonal latin squares (MOLS) of order $n$ is a collection of latin squares in which each pair are orthogonal. The maximum number of MOLS of order $n$ is at most $n - 1$. To see this, suppose that we have $n$ MOLS of order $n$. Relabel the numbers in each latin square (if necessary) so that the first row is $1, 2, \ldots, n$. Relabeling numbers doesn’t change latinicity or orthogonality. Consider the $(2, 1)$ entry of each square. It cannot be 1 since there is a 1 above it. Hence, there are only $n - 1$ choices for it, and by the pigeonhole principle some two squares have the same entry, say $m$. But for these two squares the pair $mm$ occurs twice in the superimposed squares, in the $(2, 1)$ position and the $(1, m)$ position, so they aren’t orthogonal. The contradiction implies that the maximum number of MOLS is at most $n - 1$.

Recall from Chapter 2 that a projective plane of order $n$ is a collection of $n^2 + n + 1$ points and $n^2 + n + 1$ lines such that each line contains $n + 1$ points, each point lies on $n + 1$ lines, every two points determine a unique line, and every two lines intersect in exactly one point. As in our example with $n = 3$, a projective plane of order $n$ is equivalent to a set of $n - 1$ MOLS of order $n$.

Theorem.¹³ A projective plane of order $n$ is equivalent to a set of $n - 1$ MOLS of order $n$.

¹³ This theorem was first proved in 1938 by Raj Chandra Bose (1901–1987), who made fundamental contributions to the theory of algebraic designs and combinatorial codes.

Starting with a collection of $n - 1$ MOLS of order $n$, the construction of a projective plane of order $n$ works as in our example with $n = 3$. Let the MOLS be labeled $1, \ldots, n - 1$, with the entries in each latin square labeled $0, \ldots, n - 1$. Let the points of the projective plane be the ordered pairs $(x, y)$, where $0 \le x, y \le n - 1$, together with the ideal points $0, 1, \ldots, n - 1, \infty$. This accounts for $n^2 + n + 1$ points. The lines are the $n$ horizontal lines
\[ \{ (x, b) : 0 \le x \le n - 1 \} \cup \{0\}, \quad 0 \le b \le n - 1; \]
the $n$ vertical lines
\[ \{ (a, y) : 0 \le y \le n - 1 \} \cup \{\infty\}, \quad 0 \le a \le n - 1; \]
the ideal line
\[ \{ 0, 1, \ldots, n - 1, \infty \}; \]
and, for $1 \le i \le n - 1$, the collection of points whose latin square entries are constant in the $i$th square together with the corresponding ideal point:
\[ \{ (x, y) : \text{entry } (x, y) \text{ of the } i\text{th latin square equals } j \} \cup \{i\}, \quad 0 \le j \le n - 1. \]
This accounts for $n + n + 1 + (n - 1)n = n^2 + n + 1$ lines.

It follows immediately from the definitions that each line contains $n + 1$ points and each point is on $n + 1$ lines.

Given any point $P$, the $n + 1$ lines containing $P$ contain $(n + 1)n = n^2 + n$ points other than $P$. Because of latinicity and orthogonality, the points are distinct. Since these are all the points other than $P$, every point lies on a line with $P$. From the definitions, there is at most one line determined by any two given points. Hence, every two points determine a unique line.

Given any line $l$, there are $n + 1$ points on it and these points lie on $(n + 1)n$ other lines. Because two points determine exactly one line, these lines are distinct. Since these are all the lines other than $l$, every line intersects $l$. No two lines intersect in more than one point, for if two lines intersected in two points, then these two points would determine more than one line. Hence, every two lines intersect in exactly one point.

We have shown how $n - 1$ MOLS of order $n$ yield a projective plane of order $n$. Now we go in the reverse direction. Suppose that we have a projective plane of order $n$. Choose a line, call it the $x$-axis, and label its $n + 1$ points as $0, 1, \ldots, n - 1$ and the ideal point $\mathbf{0}$. Choose another line through the point 0, call it the $y$-axis, and label its points $0, 1, \ldots, n - 1, \infty$, in such a way that the intersection of the two axes is labeled 0 on both axes. Call the line joining $\mathbf{0}$ and $\infty$ the ideal line. For every point $P$ not on the ideal line, suppose that the line through $P$ and $\infty$ (a “vertical line”) intersects the $x$-axis at $x$, and the line through $P$ and $\mathbf{0}$ (a “horizontal line”) intersects the $y$-axis at $y$. Give $P$ the coordinates $(x, y)$. This also gives coordinates to the points on the axes. The ideal line has $n - 1$ points other than $\mathbf{0}$ and $\infty$. Each point will give rise to a latin square with coordinates $(x, y)$, where $0 \le x, y \le n - 1$.
Each of these points is on $n$ lines other than the ideal line. Let the points on each line correspond to constant-value entries in the corresponding latin square. You may wish to confirm the latinicity and orthogonality properties of the resulting squares. We have shown that a projective plane of order $n$ is equivalent to a set of $n - 1$ MOLS of order $n$.

It is always possible to construct a projective plane of order $n$, or the equivalent set of $n - 1$ MOLS of order $n$, from a field of order $n$. A field of order $n$ exists if and only if $n$ is a prime power (see Appendix A). However, there exist projective planes of prime power order that do not arise from fields. No one has found a finite projective plane that is not of prime power order or proven that none exists.

Here’s the main idea of the construction of a set of $n - 1$ MOLS of order $n$ from a field of order $n$. Let the nonzero field elements be $f_1, \ldots, f_{n-1}$, and let $f_0 = 0$. We define a latin square for each $f_i$, with $1 \le i \le n - 1$, by the rule that the $(x, y)$ entry, where $0 \le x, y \le n - 1$, is
\[ f_i f_x + f_y. \]
I leave it as an exercise to show that this produces mutually orthogonal latin squares. The two mutually orthogonal latin squares of order 3 shown at the beginning of this section arise from the field $\mathbb{Z}_3$.

For more on the connections between latin squares and finite geometries, see [29].
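For a prime $n$ the field is the integers modulo $n$, and the whole construction can be carried out and checked by machine. The Python sketch below is illustrative and rests on stated assumptions: it handles only prime $n$ (prime powers need genuine finite-field arithmetic), it takes $f_k = k$, and the names `mols_from_prime`, `projective_plane`, and `check_plane` are hypothetical.

```python
from itertools import combinations

def mols_from_prime(n):
    """Squares with entry (i*x + y) mod n at (x, y), for i = 1..n-1; n is assumed prime."""
    return [[[(i * x + y) % n for y in range(n)] for x in range(n)]
            for i in range(1, n)]

def projective_plane(n):
    """Build the points and lines of the plane from the MOLS, following the text."""
    squares = mols_from_prime(n)
    inf = "inf"                                # the ideal point "infinity"
    ideal = [("ideal", i) for i in range(n)]   # the ideal points 0, 1, ..., n-1
    lines = []
    for b in range(n):                         # horizontal lines
        lines.append({(x, b) for x in range(n)} | {ideal[0]})
    for a in range(n):                         # vertical lines
        lines.append({(a, y) for y in range(n)} | {inf})
    lines.append(set(ideal) | {inf})           # the ideal line
    for i, square in enumerate(squares, start=1):
        for j in range(n):                     # constant entries in the i-th square
            cells = {(x, y) for x in range(n) for y in range(n) if square[x][y] == j}
            lines.append(cells | {ideal[i]})
    points = set().union(*lines)
    return points, lines

def check_plane(n):
    """Verify the counts and the two incidence axioms."""
    points, lines = projective_plane(n)
    size = n * n + n + 1
    counts_ok = len(points) == size and len(lines) == size
    sizes_ok = all(len(line) == n + 1 for line in lines)
    points_ok = all(sum(1 for line in lines if p in line and q in line) == 1
                    for p, q in combinations(points, 2))
    lines_ok = all(len(l & m) == 1 for l, m in combinations(lines, 2))
    return counts_ok and sizes_ok and points_ok and lines_ok

print(check_plane(3), check_plane(5))  # True True
```

The array orientation used here (first index $x$, second index $y$) is one convenient choice and may differ from the layout of Figure 2.5; the incidence properties do not depend on it.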


4.16 The Lemniscate Revisited

Recall the lemniscate graph from Chapter 1 (Figure 1.2), with parametric equations
\[ x = \frac{\cos t}{1 + \sin^2 t}, \qquad y = \frac{\sin t \cos t}{1 + \sin^2 t}, \qquad -\infty < t < \infty. \]
A moving point makes one lap around the graph as $t$ goes from 0 to $2\pi$. It repeats this figure-eight in periods of $2\pi$. What distance does it travel? In other words, what is the arc length of the lemniscate graph? To find the length of a curve, we integrate the differential of arc length,
\[ \sqrt{\left(\frac{dx}{dt}\right)^2 + \left(\frac{dy}{dt}\right)^2}\; dt. \]
For the lemniscate, this is
\[ \frac{dt}{\sqrt{1 + \sin^2 t}}. \]
The length $L$ of the lemniscate curve is found by integrating the differential of arc length over the interval of $t$ values:
\[ L = \int_0^{2\pi} \frac{dt}{\sqrt{1 + \sin^2 t}}. \]
Integrals of this type are known as elliptic integrals because they arise in connection with finding the arc length of an ellipse. This isn’t an easy integral to evaluate. If we use a computer algebra system to evaluate it numerically, we find an approximation to the length of the lemniscate curve:
\[ L \doteq 5.244115108. \]
Carl Friedrich Gauss (1777–1855) discovered that this number is related to $\pi$. The circumference of a unit circle is
\[ 2\pi \doteq 6.283185307. \]
Hence, the ratio of the circumference of a unit circle to the length of the lemniscate curve is
\[ \frac{2\pi}{L} \doteq 1.198140234. \]
This number may not look familiar, but Gauss recognized it as the arithmetic-geometric mean of 1 and $\sqrt{2}$.

Starting with two positive real numbers $a$ and $b$, we define their arithmetic-geometric mean as follows. Set $a_0 = a$ and $b_0 = b$. Let $a_1$ be the arithmetic mean of $a_0$ and $b_0$, that is, $a_1 = (a_0 + b_0)/2$; and let $b_1$ be the geometric mean of $a_0$ and $b_0$, that is, $b_1 = \sqrt{a_0 b_0}$. Repeat this process starting with the numbers $a_1$ and $b_1$. Thus $a_2 = (a_1 + b_1)/2$ and $b_2 = \sqrt{a_1 b_1}$. Continuing, we obtain sequences $\{a_n\}$ and $\{b_n\}$ that converge to the same limit, the arithmetic-geometric mean of $a$ and $b$, denoted by $M(a, b)$. To see that the sequences


converge, we use the fact that the arithmetic mean is always at least equal to the geometric mean (see page 55). Assuming that $b_0 \le a_0$, we have
\[ b_0 \le b_1 \le a_1 \le a_0. \]
We see that $\{b_n\}$ is a nondecreasing sequence bounded above, and hence convergent. Similarly, $\{a_n\}$ is a nonincreasing sequence bounded below and hence convergent. Suppose that $\{a_n\}$ converges to $L_1$ and $\{b_n\}$ converges to $L_2$. Then $(L_1 + L_2)/2 = L_1$, and therefore $L_1 = L_2$, i.e., the two sequences converge to the same limit.

A calculation shows that the arithmetic-geometric mean of 1 and $\sqrt{2}$ is
\[ M(1, \sqrt{2}) \doteq 1.198140234. \]
It can be shown that $M(1, \sqrt{2})$ is a transcendental number (it isn’t a zero of a nonzero polynomial with integer coefficients). As Gauss did, we will demonstrate that
\[ \frac{2\pi}{L} = M(1, \sqrt{2}). \]
The idea is to generalize the arc length integral. Often in mathematics, problems are more easily solved when they are generalized. Define
\[ I(a, b) = \int_0^{\pi/2} \frac{dt}{\sqrt{a^2 \cos^2 t + b^2 \sin^2 t}}. \]
By symmetry, and since $1 + \sin^2 t = \cos^2 t + 2 \sin^2 t$, the integral for the length of the lemniscate curve equals $4 I(1, \sqrt{2})$. The key step is to show that
\[ I(a, b) = I\!\left( \frac{a + b}{2}, \sqrt{ab} \right). \qquad (*) \]
Once this is accomplished, the rest of the demonstration will be easy. We can repeatedly use $(*)$ to obtain
\[ I(a, b) = I(a_1, b_1) = I(a_2, b_2) = \cdots = I(M(a, b), M(a, b)). \]
Hence
\begin{align*}
I(a, b) &= I(M(a, b), M(a, b)) \\
 &= \int_0^{\pi/2} \frac{dt}{\sqrt{M(a, b)^2 \cos^2 t + M(a, b)^2 \sin^2 t}} \\
 &= \frac{1}{M(a, b)} \int_0^{\pi/2} dt \\
 &= \frac{\pi/2}{M(a, b)}.
\end{align*}


Since for the lemniscate, $L = 4 I(1, \sqrt{2})$, the relation $2\pi / L = M(1, \sqrt{2})$ follows immediately.

To finish the demonstration we must prove $(*)$. The change of variables
\[ \sin t = \frac{2a \sin u}{a + b + (a - b) \sin^2 u}, \]
where $0 \le t, u \le \pi/2$, yields
\[ \cos t \, dt = \frac{2a \cos u \, \bigl( a + b + (b - a) \sin^2 u \bigr)}{\bigl( a + b + (a - b) \sin^2 u \bigr)^2} \, du, \]
and using some algebra and trigonometry we get
\[ \frac{dt}{\sqrt{a^2 \cos^2 t + b^2 \sin^2 t}} = \frac{du}{\sqrt{\left( \frac{a + b}{2} \right)^2 \cos^2 u + \left( \sqrt{ab} \right)^2 \sin^2 u}} = \frac{du}{\sqrt{a_1^2 \cos^2 u + b_1^2 \sin^2 u}}. \]
This proves $(*)$, so we can now state Gauss’s discovery as a theorem.

Theorem. Let $L$ be the length of the lemniscate given by the parametric equations
\[ x = \frac{\cos t}{1 + \sin^2 t}, \qquad y = \frac{\sin t \cos t}{1 + \sin^2 t}, \qquad 0 \le t \le 2\pi. \]
Then
\[ \frac{2\pi}{L} = M(1, \sqrt{2}), \]
where $M(1, \sqrt{2})$ is the arithmetic-geometric mean of 1 and $\sqrt{2}$.

See [10] for a discussion of the relationship between elliptic integrals and elliptic curves.
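Gauss’s relation can be checked numerically with a few lines of Python. This is only a sketch under stated assumptions: the names `agm` and `lemniscate_length` are hypothetical, the stopping tolerance is an arbitrary choice, and the midpoint rule is used for the integral in place of a computer algebra system.

```python
from math import pi, sin, sqrt

def agm(a, b):
    """Arithmetic-geometric mean M(a, b) of two positive numbers."""
    while abs(a - b) > 1e-14 * max(a, b):
        a, b = (a + b) / 2, sqrt(a * b)
    return (a + b) / 2

def lemniscate_length(steps=2000):
    """Midpoint rule for the integral of dt / sqrt(1 + sin^2 t) over [0, 2*pi]."""
    h = 2 * pi / steps
    return h * sum(1 / sqrt(1 + sin((k + 0.5) * h) ** 2) for k in range(steps))

L = lemniscate_length()
print(f"{L:.6f}")                 # 5.244115
print(f"{2 * pi / L:.6f}")        # 1.198140
print(f"{agm(1, sqrt(2)):.6f}")   # 1.198140
```

The arithmetic-geometric mean settles after only a handful of iterations, which is what makes the relation useful for computing elliptic integrals.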


5 Pleasing Proofs

Real mathematics . . . must be justified as art if it can be justified at all.
G. H. HARDY (1877–1947), A Mathematician’s Apology¹

In mathematics, assertions can be proved, which distinguishes mathematics from other disciplines. Mathematical knowledge is thus absolute and universal, independent of space and time. In this chapter, we present some proofs that are particularly memorable. Most are not well known and deserve to be better known.

5.1 The Pythagorean Theorem

The Pythagorean theorem states that given a right triangle, the area of a square formed on the hypotenuse is equal to the sum of the areas of the squares formed on the two legs. There are many proofs of this important theorem. Figure 5.1 shows a tessellation proof. The plane is tessellated, or tiled, with copies of the square on the hypotenuse of the triangle (shaded in the figure), and also tessellated by copies of the squares on the two legs. This shows that the square on the hypotenuse can be divided into five pieces that can be reassembled to form the squares on the two legs. Two pieces make the smaller square and three pieces make the larger square.

Figure 5.1. A tessellation proof of the Pythagorean theorem.

¹ Hardy, G. H., A Mathematician’s Apology, p. 139, Cambridge University Press, 1967. Reprinted with permission.


5.2 The Erdős–Mordell Inequality

In 1935 Paul Erdős conjectured a geometric inequality. Let $ABC$ be a triangle and $M$ be a point in the interior or on the boundary of $ABC$. Let the distances from $M$ to the vertices $A$, $B$, $C$ be $x$, $y$, $z$, respectively, let the sides $BC$, $CA$, $AB$ have lengths $a$, $b$, $c$, respectively, and let the distances from $M$ to the sides $BC$, $CA$, $AB$ be $p$, $q$, $r$, respectively. Then
\[ x + y + z \ge 2(p + q + r), \]
with equality occurring if and only if $ABC$ is equilateral and $M$ is its center. See the diagram below (the shading in the diagram will be explained in a moment).

[Diagram: triangle $ABC$ with the point $M$ inside; the segments from $M$ to $A$, $B$, $C$ are labeled $x$, $y$, $z$; the perpendiculars from $M$ to $BC$, $CA$, $AB$ are labeled $p$, $q$, $r$; the sides are labeled $a$, $b$, $c$; two triangles at the vertex $A$ are shaded, one light and one dark.]

Soon after the conjecture was made, Louis Mordell and D. F. Barrow proved it, but the proof was complicated. Other proofs have been found from time to time, and in 2007 a simple proof was found by Claudi Alsina and Roger B. Nelsen [2]. This proof is so beautiful that it deserves to be admired.

The key idea is the inequality $ax \ge br + cq$, which we will now establish. Scale the triangle $ABC$ by a factor of $x$, the light shaded triangle by a factor of $b$, and the dark shaded triangle by a factor of $c$; and put the triangles together as in the diagram below.

[Diagram: the three scaled triangles assembled into a trapezoid; the base is made up of segments of lengths $br$ and $cq$, and the scaled copy of $ABC$ has sides of lengths $cx$, $bx$, and $ax$.]

We claim that the figure is a trapezoid. To see that the base is really a straight line, note that the four angles at the vertex corresponding to $A$ (looking at both diagrams) consist of two pairs of complementary angles. In a trapezoid, the side opposite the base is at least as long as the base, by the Pythagorean theorem. Therefore $ax \ge br + cq$. Equality occurs only when the trapezoid is a rectangle.

Similar arguments show that $by \ge cp + ar$ and $cz \ge aq + bp$. From the three inequalities,
\[ x + y + z \ge \left( \frac{b}{c} + \frac{c}{b} \right) p + \left( \frac{a}{c} + \frac{c}{a} \right) q + \left( \frac{b}{a} + \frac{a}{b} \right) r. \]


By the arithmetic mean–geometric mean (AM–GM) inequality, the sums in parentheses are at least equal to 2. This proves the Erdős–Mordell inequality. Furthermore, equality occurs in the AM–GM inequality only if $a = b = c$, that is, only if $ABC$ is an equilateral triangle. Given that $ABC$ is an equilateral triangle and the trapezoid in the second diagram is a rectangle, the two right triangles in the second diagram are congruent and hence $q = r$. Similarly, $p = q$, and therefore $M$ is the incenter (i.e., the center) of $ABC$.

In 2008 Victor Pambuccian [41] proved that in absolute geometry (Euclidean geometry without the parallel postulate), the Erdős–Mordell inequality is equivalent to the statement that the sum of the angles of a triangle is at most $\pi$.
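A numerical spot check can make the inequality concrete. The sketch below is only an illustration, not part of the proof: it takes $p$, $q$, $r$ to be the distances from $M$ to the sides $BC$, $CA$, $AB$, and the function names and the sample triangle are made up for the example.

```python
import math
import random

def dist_point_line(m, u, v):
    """Distance from the point m to the line through u and v."""
    (mx, my), (ux, uy), (vx, vy) = m, u, v
    cross = (vx - ux) * (my - uy) - (vy - uy) * (mx - ux)
    return abs(cross) / math.hypot(vx - ux, vy - uy)

def erdos_mordell_gap(A, B, C, M):
    """Return (x + y + z) - 2(p + q + r) for the point M of triangle ABC."""
    x, y, z = (math.dist(M, V) for V in (A, B, C))
    p = dist_point_line(M, B, C)  # distance to side BC
    q = dist_point_line(M, C, A)  # distance to side CA
    r = dist_point_line(M, A, B)  # distance to side AB
    return (x + y + z) - 2 * (p + q + r)

def random_interior_point(A, B, C, rng):
    """Random point of the triangle, chosen through barycentric coordinates."""
    u, v = rng.random(), rng.random()
    if u + v > 1:
        u, v = 1 - u, 1 - v
    w = 1 - u - v
    return (u * A[0] + v * B[0] + w * C[0],
            u * A[1] + v * B[1] + w * C[1])

if __name__ == "__main__":
    rng = random.Random(0)
    A, B, C = (0.0, 0.0), (4.0, 0.0), (1.0, 3.0)
    smallest = min(erdos_mordell_gap(A, B, C, random_interior_point(A, B, C, rng))
                   for _ in range(10_000))
    print("smallest gap found:", smallest)
```

For an equilateral triangle and its center the gap is zero up to rounding, which matches the equality case.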

5.3 Triangles with Given Area and Perimeter

If a triangle has perimeter 6, how large can its area be? With some thought, you can conclude that the unique triangle with perimeter 6 and maximum area is the equilateral triangle of side length 2. Its altitude is $\sqrt{3}$ and hence its area is $\sqrt{3}$. But let's change the question a little. What if we require a triangle of perimeter 6 and area equal to some specified number less than $\sqrt{3}$? It turns out that there are infinitely many triangles with perimeter 6 having the specified area. In general, given positive real numbers $A$ and $P$ such that $A < \sqrt{3}P^2/36$, there are infinitely many triangles with area $A$ and perimeter $P$. We'll prove this in a moment.

First let's show that the area $A$ cannot exceed $\sqrt{3}P^2/36$, where $P$ is the perimeter. Recall Heron's formula for the area of a triangle (Chapter 3):

\[ A = \sqrt{s(s-a)(s-b)(s-c)}, \]

with $a$, $b$, $c$ the side lengths and $s = (a + b + c)/2$ the semiperimeter. By the arithmetic mean–geometric mean inequality (see Chapter 4), we have

\[ A = \sqrt{s}\,\sqrt{(s-a)(s-b)(s-c)} \le \sqrt{s}\left(\frac{(s-a) + (s-b) + (s-c)}{3}\right)^{3/2} = \sqrt{s}\left(\frac{s}{3}\right)^{3/2} = \frac{s^2}{3\sqrt{3}} = \frac{\sqrt{3}P^2}{36}. \]

This proves that the area of the triangle is at most $\sqrt{3}P^2/36$, and the maximum is attained only by an equilateral triangle with sides $P/3$.

Let's now prove that if $A$ is any positive real number less than $\sqrt{3}P^2/36$, then there are infinitely many triangles with area $A$ and perimeter $P$. Squaring both sides of Heron's formula, and using the fact that $P = 2s$, we obtain

\[ 16A^2 = P(P - 2a)(P - 2b)(P - 2c). \]

Since $P - 2c = 2a + 2b - P$, this becomes

\[ 16A^2 = P(P - 2a)(P - 2b)(2a + 2b - P). \]


Using the quadratic formula to solve for $b$, we have

\[ b = \frac{P - a}{2} \pm \frac{1}{2}\sqrt{a^2 - \frac{16A^2}{P(P - 2a)}}. \]

One of the sign choices gives $b$ and the other $c$. The key observation is that this formula produces actual triangles if the quantity inside the square root is nonnegative, that is,

\[ a^2 - \frac{16A^2}{P(P - 2a)} \ge 0, \]

or

\[ a^2 P(P - 2a) \ge 16A^2. \]

We will show that there are infinitely many allowable choices for $a$, as long as $A < \sqrt{3}P^2/36$. The function

\[ f(a) = a^2 P(P - 2a) \]

is a cubic polynomial with a double root at $a = 0$ and the third root at $a = P/2$. It is easy to show that $f$ has a local maximum at $(P/3, P^4/27)$. If $16A^2 = P^4/27$, or equivalently $A = \sqrt{3}P^2/36$, then there is only one value of $a$ for which $f(a) \ge 16A^2$, namely $a = P/3$, corresponding to the equilateral triangle of side $P/3$. However, if $A < \sqrt{3}P^2/36$, then there are infinitely many values of $a$ (in an interval containing $P/3$) such that $f(a) \ge 16A^2$. Each such $a$ yields a triangle with area $A$ and perimeter $P$.

Notice that we didn't absolutely need the AM–GM inequality at the beginning of our analysis. Our argument boils down to Heron's formula, the quadratic formula, and simple properties of a cubic polynomial.
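The construction can be tried out directly. The sketch below assumes the solution $b, c = \frac{P-a}{2} \pm \frac{1}{2}\sqrt{a^2 - 16A^2/(P(P-2a))}$ of the quadratic; the function names are illustrative only.

```python
import math

def triangle_sides(P, A, a):
    """Sides (a, b, c) of a triangle with perimeter P, area A, and one side a,
    or None when no such triangle exists."""
    if not 0 < a < P / 2:
        return None
    disc = a * a - 16 * A * A / (P * (P - 2 * a))
    if disc < 0:
        return None
    root = math.sqrt(disc)
    b = (P - a + root) / 2  # one sign choice gives b ...
    c = (P - a - root) / 2  # ... and the other gives c
    return (a, b, c)

def heron(a, b, c):
    """Area of the triangle with sides a, b, c, by Heron's formula."""
    s = (a + b + c) / 2
    return math.sqrt(s * (s - a) * (s - b) * (s - c))

if __name__ == "__main__":
    # Several different triangles, all with perimeter 6 and area 1.
    for a in (1.5, 2.0, 2.5):
        sides = triangle_sides(6, 1, a)
        print(sides, heron(*sides))
```

With $P = 12$, $A = 6$, and $a = 3$ the function recovers the 3-4-5 right triangle.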

5.4 A Property of the Directrix of a Parabola

Given a parabola and a point in its "exterior," there are two tangent lines to the parabola passing through the point. The locus of points for which the two tangent lines are perpendicular is the parabola's directrix. We will prove these statements.

Without loss of generality, let the parabola be $y = kx^2$ (with $k > 0$), and let $P = (x_0, y_0)$ be a point in the exterior of the parabola, that is, with $y_0 < kx_0^2$. A line through $P$ tangent to the parabola at $(a, ka^2)$ has equation

\[ y_0 - ka^2 = m(x_0 - a), \]

where $m$ is the slope. Since $m = 2ka$, we obtain

\[ y_0 - \frac{m^2}{4k} = m\left(x_0 - \frac{m}{2k}\right), \]

or

\[ m^2 - 4kx_0 m + 4ky_0 = 0. \]

The condition $y_0 < kx_0^2$ guarantees that the discriminant of the quadratic is positive; hence, there are two values of $m$ corresponding to two tangent lines to the parabola through $P$.


These tangent lines are perpendicular if and only if the product of their slopes is $-1$. The product of the slopes is the constant coefficient of the quadratic, $4ky_0$. Thus, the two tangent lines passing through $P$ are perpendicular if and only if $y_0 = -1/(4k)$, i.e., $P$ is on the parabola's directrix.

We also give a geometric proof that every point on the directrix has the required property.

[Diagram: a parabola with focus $F$ and points of tangency $A$ and $B$; the points $C$, $P$, $D$ lie on the directrix; the angles $\alpha$, $\beta$, $\gamma$, $\delta$ are marked.]

As in the diagram, let $P$ be a point on the parabola's directrix. Draw tangent lines from $P$ to the parabola at $A$ and $B$, making angles $\alpha$ and $\beta$ with the directrix. Let $\gamma$ and $\delta$ be the complementary angles to $\alpha$ and $\beta$, respectively. Let vertical lines through $A$ and $B$ intersect the directrix at $C$ and $D$, respectively. Hence, $\angle PAC = \gamma$ and $\angle PBD = \delta$. Construct line segments $AF$, $BF$, and $PF$, where $F$ is the focus. A parabola has the property that a ray parallel to its axis of symmetry strikes the parabola and is reflected through the focus. This implies that $\angle PAF = \gamma$ and $\angle PBF = \delta$. Since $FA = AC$ (by definition of directrix), the triangles $PAC$ and $PAF$ are congruent. Hence $\angle FPA = \alpha$. By a similar argument, $\angle FPB = \beta$. Consequently, $2\alpha + 2\beta = 180^\circ$ and thus $\angle APB$ is a right angle. (It also follows that $\gamma = \beta$, $\delta = \alpha$, and $AFB$ is a straight line segment.)

It is easy to show that for any point exterior to the parabola and not on the directrix, the two tangent lines passing through it are not perpendicular. Hint: draw the intersection of one of the tangent lines with the directrix.
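The algebraic half of the argument is easy to check numerically. The following sketch assumes the parabola $y = kx^2$ with $k > 0$ and solves the quadratic $m^2 - 4kx_0 m + 4ky_0 = 0$ for the two tangent slopes; the function name is hypothetical.

```python
import math

def tangent_slopes(k, x0, y0):
    """Slopes of the two tangent lines from (x0, y0) to the parabola y = k*x**2,
    i.e., the roots of m**2 - 4*k*x0*m + 4*k*y0 = 0 (assumes k > 0)."""
    disc = 16 * k * k * x0 * x0 - 16 * k * y0
    if disc <= 0:
        raise ValueError("the point is not in the exterior of the parabola")
    root = math.sqrt(disc)
    return ((4 * k * x0 - root) / 2, (4 * k * x0 + root) / 2)

if __name__ == "__main__":
    k = 0.5
    directrix_y = -1 / (4 * k)
    for x0 in (-2.0, 0.0, 3.0):
        m1, m2 = tangent_slopes(k, x0, directrix_y)
        print(x0, m1 * m2)  # the product of the slopes on the directrix
```

For points on the directrix the product of the slopes comes out as $-1$ up to rounding; for any other exterior point it equals $4ky_0 \ne -1$.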

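5.5 A Classic Integral

The classic integral formula

\[ I = \int_{-\infty}^{\infty} e^{-x^2}\,dx = \sqrt{\pi} \]

can be proved by a clever trick that is well worth seeing. The surprising technique is to evaluate a double integral:

\[ I^2 = \int_{-\infty}^{\infty} e^{-x^2}\,dx \int_{-\infty}^{\infty} e^{-y^2}\,dy = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-x^2 - y^2}\,dx\,dy. \]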


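We use polar coordinates $r$ and $\theta$, where $0 \le r < \infty$ and $0 \le \theta \le 2\pi$, with $x^2 + y^2 = r^2$ and $dx\,dy = r\,dr\,d\theta$. Then

\[ I^2 = \int_0^{2\pi}\int_0^{\infty} e^{-r^2} r\,dr\,d\theta = \int_0^{2\pi} d\theta \int_0^{\infty} r e^{-r^2}\,dr = 2\pi\left(-\frac{1}{2}e^{-r^2}\right)\Bigg|_0^{\infty} = 2\pi \cdot \frac{1}{2} = \pi. \]

Therefore

\[ I = \sqrt{\pi}. \]

The value of the integral can come as a surprise on first sight.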
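A quick numerical approximation confirms the value. This is a minimal sketch using the midpoint rule on a truncated interval; the cutoff and step count are arbitrary choices for the illustration.

```python
import math

def gaussian_integral(limit=10.0, steps=200_000):
    """Midpoint-rule approximation of the integral of exp(-x**2) over [-limit, limit].
    The tails beyond |x| = 10 are far below floating-point precision."""
    h = 2 * limit / steps
    return h * sum(math.exp(-((-limit + (i + 0.5) * h) ** 2)) for i in range(steps))

if __name__ == "__main__":
    print(gaussian_integral(), math.sqrt(math.pi))
```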

5.6 Integer Partitions

A partition of a positive integer $n$ is a summation of positive integers equal to $n$. The order of the summands is unimportant. For example, the partitions of 5 are

\[ 5, \quad 4 + 1, \quad 3 + 2, \quad 3 + 1 + 1, \quad 2 + 2 + 1, \quad 2 + 1 + 1 + 1, \quad 1 + 1 + 1 + 1 + 1. \]

There is a rich literature on partitions of integers. Some good sources are [3] and [32].

One observation about partitions is that the number of partitions of $n$ into odd parts (summands) is equal to the number of partitions of $n$ into distinct parts. For instance, there are three partitions of 5 into odd parts:

\[ 5, \quad 3 + 1 + 1, \quad 1 + 1 + 1 + 1 + 1. \]

And there are three partitions of 5 into distinct parts:

\[ 5, \quad 4 + 1, \quad 3 + 2. \]

We will prove our claim for every positive integer $n$ by describing a bijection (a one-to-one correspondence) between the set of partitions of $n$ into odd parts and the set of partitions of $n$ into distinct parts. Suppose that we have a partition of $n$ into odd parts. If any two parts are equal, then replace them with their sum. Continue until we have distinct parts. In the other direction, starting with a partition of $n$ into distinct parts, if a summand is even, replace it by two of its halves. Continue until we have only odd parts. This demonstrates the bijection.


The bijection associates partitions of 5 into odd parts and partitions of 5 into distinct parts as follows:

\[ \begin{aligned} 5 &\leftrightarrow 5 \\ 3 + 1 + 1 &\leftrightarrow 3 + 2 \\ 1 + 1 + 1 + 1 + 1 &\leftrightarrow 4 + 1. \end{aligned} \]

As an exercise, you could show the bijection for $n = 6$ (there are four partitions of each type).

Of course, we have to convince ourselves that the bijection is well defined. That is, if we start with a partition of $n$ into odd parts, then there is only one way to do the combining and finish with a partition of $n$ into distinct parts. And conversely, if we start with a partition of $n$ into distinct parts, there is only one way to do the dividing and finish with a partition of $n$ into odd parts. And we must show that the two operations are inverses of each other. All this follows if we notice that combining two parts has no effect on the other parts, nor does dividing a part into two equal parts.
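The two operations are simple enough to simulate literally. The sketch below merges equal parts pairwise and splits even parts in half, exactly as described; the helper `partitions` and the other function names are hypothetical and are there only to enumerate test cases.

```python
from collections import Counter

def odd_to_distinct(parts):
    """Repeatedly replace two equal parts by their sum until all parts are distinct."""
    counts = Counter(parts)
    while True:
        repeated = [p for p, c in counts.items() if c >= 2]
        if not repeated:
            break
        p = min(repeated)
        counts[p] -= 2
        counts[2 * p] += 1
    return sorted((p for p, c in counts.items() if c == 1), reverse=True)

def distinct_to_odd(parts):
    """Repeatedly replace an even part by its two halves until all parts are odd."""
    stack = list(parts)
    result = []
    while stack:
        p = stack.pop()
        if p % 2 == 0:
            stack.extend([p // 2, p // 2])
        else:
            result.append(p)
    return sorted(result, reverse=True)

def partitions(n, largest=None):
    """Yield the partitions of n as non-increasing lists."""
    if largest is None:
        largest = n
    if n == 0:
        yield []
        return
    for first in range(min(n, largest), 0, -1):
        for rest in partitions(n - first, first):
            yield [first] + rest

if __name__ == "__main__":
    for p in partitions(6):
        if all(part % 2 == 1 for part in p):
            print(p, "<->", odd_to_distinct(p))
```

Running it for $n = 6$ lists the four pairs asked for in the exercise.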

5.7 Integer Triangles

An integer triangle is a triangle with integer side lengths. Given a nonnegative integer $n$, let $t(n)$ be the number of incongruent integer triangles with perimeter $n$. For example, $t(10) = 2$, since there are two integer triangles with perimeter 10, namely, $(3, 3, 4)$ and $(2, 4, 4)$. We write a triangle as an ordered triple $(a, b, c)$, with $a \le b \le c$. Here are the first few values of the sequence $\{t(n)\}$, known as Alcuin's sequence.

n      0  1  2  3  4  5  6  7  8  9  10
t(n)   0  0  0  1  0  1  1  2  1  3  2

What is a formula for $t(n)$? I will give the simplest derivation that I know of. It relates integer triangles to partitions of an integer into three parts. The key observation is that

\[ t(2n) = u(n), \quad n \ge 0, \]

where $u(n)$ is the number of ways to write $n$ as a sum of three positive integers (order unimportant). For example, $t(10) = u(5) = 2$, as there are two partitions of 5 into three parts, namely,

\[ 2 + 2 + 1 \quad \text{and} \quad 3 + 1 + 1. \]

The function $u(n)$ is often denoted by $p(n, 3)$ or $p_3(n)$.

We see that $t(2n) = u(n)$ for all $n \ge 0$ by the bijection

\[ (a, b, c) \leftrightarrow \{n - a,\ n - b,\ n - c\}, \]

where $(a, b, c)$ represents the sides of an integer triangle of perimeter $2n$ and $\{n - a, n - b, n - c\}$ represents the summands in a partition of $n$ into three parts. Notice that $a$, $b$, and $c$ satisfy the triangle inequality if and only if each quantity is less than $n$, and this is true if and only if $n - a$, $n - b$, and $n - c$ are positive.


Next, we will show that $\{u(n)\}$ satisfies a simple recurrence relation. Consider three cases according to the size of the last summand in a partition of $n + 6$ into three parts, where the parts are written in non-increasing order. If the last summand is 3 or greater, then subtract 2 from each summand to obtain a partition of $n$ into three parts. If the last summand is a 2, then subtract 2 from each part to obtain a partition of $n$ into one or two parts. If the last summand is a 1, then subtract 1 from each part to obtain a partition of $n + 3$ into one or two parts. Given any positive integer $m$, the number of partitions of $m$ into one part is 1, and the number of partitions of $m$ into two parts is $\lfloor m/2 \rfloor$, where $\lfloor x \rfloor$, the floor of $x$, is the greatest integer less than or equal to $x$. Hence

\[ u(n + 6) = u(n) + 1 + \left\lfloor \frac{n}{2} \right\rfloor + 1 + \left\lfloor \frac{n + 3}{2} \right\rfloor, \quad n \ge 0. \]

We can simplify by examining the cases $n$ even and $n$ odd, to obtain the recurrence relation

\[ u(n + 6) = u(n) + n + 3, \quad n \ge 0. \]

Remembering that $u(n) = t(2n)$, we have the initial values

n      0  1  2  3  4  5
u(n)   0  0  0  1  1  2

Using the recurrence relation and the initial values, we can generate data and guess the formula

\[ u(n) = \left[\frac{n^2}{12}\right], \quad n \ge 0, \]

where $[x]$ denotes the nearest integer to $x$. Because $n^2/12$ is never a half-integer, the formula is well-defined. The formula has the correct six initial values. All that remains to show is that it satisfies the recurrence relation for $\{u(n)\}$. This is simple:

\[ \left[\frac{(n + 6)^2}{12}\right] = \left[\frac{n^2 + 12n + 36}{12}\right] = \left[\frac{n^2}{12}\right] + n + 3. \]

By our analysis so far, we have found a formula for $t(n)$ with $n$ even:

\[ t(n) = u(n/2) = [n^2/48]. \]

Taking another look at the data, we conjecture that

\[ t(n) = t(n + 3), \quad \text{if } n \text{ is odd}. \]

This gives us a way to calculate $t(n)$ for $n$ odd. You can establish the relation by showing the correspondence

\[ (a, b, c) \leftrightarrow (a + 1, b + 1, c + 1), \]

where $(a, b, c)$ represents an integer triangle with odd perimeter $n$.

Putting the pieces together, we obtain a formula for the number of incongruent integer triangles with perimeter $n$:

\[ t(n) = \begin{cases} \left[\dfrac{n^2}{48}\right] & \text{if } n \text{ is even}, \\[2ex] \left[\dfrac{(n + 3)^2}{48}\right] & \text{if } n \text{ is odd}, \end{cases} \qquad n \ge 0. \]
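The closed formula can be compared against a direct count. The sketch below is an illustration with made-up function names; the nearest-integer bracket is computed in exact integer arithmetic, which is safe here because the quotients are never half-integers.

```python
def t_brute(n):
    """Count integer triangles (a <= b <= c) with perimeter n by direct search."""
    count = 0
    for a in range(1, n // 3 + 1):
        for b in range(a, (n - a) // 2 + 1):
            c = n - a - b          # c >= b is guaranteed by the range of b
            if a + b > c:          # triangle inequality
                count += 1
    return count

def nearest_int(num, den):
    """Nearest integer to num/den for positive den (no ties arise in this use)."""
    return (2 * num + den) // (2 * den)

def t_formula(n):
    """[n^2/48] for even n and [(n+3)^2/48] for odd n."""
    m = n if n % 2 == 0 else n + 3
    return nearest_int(m * m, 48)

if __name__ == "__main__":
    print([t_formula(n) for n in range(11)])
    print(all(t_brute(n) == t_formula(n) for n in range(200)))
```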


Alcuin’s sequence is named in honor of Alcuin of York (732–804). Alcuin is credited with writing a book called Propositions of Alcuin, A Teacher of Emperor Charlemagne, for Sharpening Youths. The book contains 53 mathematical word problems that can be solved by simple arithmetic, algebra, or geometry. The twelfth problem is an allocation problem.

Problem 12: A father and his three sons. A father, when dying, gave to his sons 30 glass flasks, of which 10 were full of oil, 10 were half full, and 10 were empty. Divide the oil and the flasks so that the three sons receive the same number of flasks and the same amount of oil.

Each son receives 10 flasks. There are five solutions, which can be listed by the number of full flasks that go to each son: $\{5, 5, 0\}$, $\{5, 4, 1\}$, $\{5, 3, 2\}$, $\{4, 4, 2\}$, and $\{4, 3, 3\}$. We don't count permutations of solutions as distinct. Each son receives an equal number of full flasks and empty flasks, and a number of half-full flasks to make the total 10.

The number of ways to allocate $n$ full flasks of oil, $n$ half-full flasks, and $n$ empty flasks to three persons so that each person receives the same number of flasks and the same amount of oil is $t(n + 3)$. This is why $\{t(n)\}$ is called Alcuin's sequence.

Alcuin's sequence has many wonderful properties. For example, it is a zigzag sequence (its values alternately rise and fall starting with $n = 6$). It can be extended to negative values of $n$ using the same formula that we found, and the doubly-infinite sequence is palindromic, i.e., it satisfies the same recurrence relation going forward as backward. The order nine linear recurrence relation with constant coefficients is

\[ t(n) = t(n - 2) + t(n - 3) + t(n - 4) - t(n - 5) - t(n - 6) - t(n - 7) + t(n - 9), \quad n \ge 9. \]

We can see the form of this recurrence relation in the expansion of the denominator of the rational generating function for Alcuin's sequence:

\[ \frac{x^3}{(1 - x^2)(1 - x^3)(1 - x^4)} = \frac{x^3}{1 - x^2 - x^3 - x^4 + x^5 + x^6 + x^7 - x^9}. \]

The period of a sequence $\{a(n)\}$ is the least positive integer $k$ such that $a(n + k) = a(n)$ for all $n$. Given $m \ge 2$, the period of the sequence $\{t(n) \bmod m\}$ is $12m$. For example, Alcuin's sequence modulo 2 begins

\[ 0, 0, 0, 1, 0, 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 1, 1, 0, 1, 0, 0, 0, 0, 0, \]

and the pattern repeats with a period of 24. This is an intriguing result because the corresponding problem for the Fibonacci sequence remains unsolved, except for moduli that are powers of 2 or 5.

Let's prove the claim. We first show that the sequence $\{t(n) \bmod m\}$ repeats in a cycle of length $12m$. It follows that its period is a divisor of $12m$. For even values of the argument, we have

\[ t(2n + 12m) = [(2n + 12m)^2/48] = [(2n)^2/48] + mn + 3m^2 \equiv t(2n) \pmod{m}. \]

For $n$ odd we have

\[ t(n + 12m) = t(n + 3 + 12m) \equiv t(n + 3) = t(n) \pmod{m}. \]
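Before completing the proof, the claimed period can be checked by machine for small moduli. The sketch below leans on the order nine recurrence: nine consecutive residues determine everything that follows, so it suffices to find the first place where the initial window of nine residues recurs. The function names are illustrative.

```python
def alcuin(n):
    """t(n) from the closed formula: nearest integer to n^2/48 (n even)
    or (n+3)^2/48 (n odd), computed in integer arithmetic."""
    m = n if n % 2 == 0 else n + 3
    return (2 * m * m + 48) // 96

def period_mod(m):
    """Least k > 0 with t(n + k) = t(n) (mod m) for all n >= 0."""
    start = [alcuin(i) % m for i in range(9)]
    k = 1
    # A recurring window of nine residues forces the whole tail to repeat.
    while [alcuin(k + i) % m for i in range(9)] != start:
        k += 1
    return k

if __name__ == "__main__":
    for m in range(2, 11):
        print(m, period_mod(m))
```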


Figure 5.2. The complete graph of order 17.

We next show that the period of $\{t(n) \bmod m\}$ is not a proper divisor of $12m$. Since the period is a divisor of $12m$, it is of the form $12r$, for $r \le m$, or the sequence repeats in a cycle of length $6m$ or $4m$. If the period has the form $12r$, then $m$ divides $t(12r) = 3r^2$ (since $t(0) = 0$) and $m$ divides $t(12r + 2) = 3r^2 + r$ (since $t(2) = 0$). Hence, $m$ divides the difference, which is $r$, and $m \le r$.

We will show that the sequence $\{t(n) \bmod m\}$ does not repeat in a cycle of length $6m$. We have already covered the case $m = 2s$. Let $m = 4s + 1$. Then $6m = 24s + 6$ and $t(6m) = 12s^2 + 6s + 1 = 3s(4s + 1) + 3s + 1$, and we see that $m$ does not divide $t(6m)$. The case $m = 4s + 3$ is similar.

Finally, we will show that the sequence $\{t(n) \bmod m\}$ does not repeat in a cycle of length $4m$. We have already covered the case $m = 3s$. Let $m = 3s + 1$. Then $4m = 12s + 4$ and $t(4m) = s(3s + 1) + s$, and again we see that $m$ does not divide $t(4m)$. The case $m = 3s + 2$ is similar.

We conclude that the period of $\{t(n) \bmod m\}$ is $12m$.

It can be shown (see [8]) that the sequence $\{t(n) \bmod m\}$ takes every value modulo $m$ if and only if $m$ is one of

\[ 7,\ 10,\ 19,\ 2^j,\ 3^j,\ 5^j,\ 11^j,\ 13^j,\ 41^j,\ 2 \cdot 3^j,\ 5 \cdot 3^j, \quad \text{for } j \ge 1. \]

5.8 Triangle Destruction

The complete graph of order $n$ consists of $n$ vertices and all possible edges between pairs of vertices. The edges can be drawn straight or curved and may cross. Recall Figure 2.6, the complete graph on 17 points. Figure 5.2 shows this graph without the two-coloring of edges. Consider the game called Triangle Destruction, played on the graph. Two players, Oh and Ex, alternately remove edges. Oh moves first, removing an edge. Then Ex moves, removing any other edge. Then Oh removes an edge, and so on. The first player to eliminate the last triangle of the graph is the winner.

Given best possible play by both players, who should win Triangle Destruction? The key is to think of what the graph looks like immediately before the second-to-last move. The player whose turn it is cannot win, but no matter what move he or she makes, the other player can then win immediately. The graph before the second-to-last move must contain two triangles, for if it contained no triangles then the game would already be over, and if it contained only one triangle then the player whose turn it was would destroy that triangle and win. There are three cases.

Case 1: The graph contains two disjoint triangles.

In this case, the graph contains no other edges, or else the player to move would delete one of the extraneous edges and not lose on the next turn.

Case 2: The graph contains two triangles with a vertex in common.

This is essentially the same as Case 1. The graph contains no other edges, or else the player to move would delete one of the extraneous edges and not lose on the next turn.

Case 3: The graph contains two triangles that share an edge.

In this case, we have a collection of four vertices and five edges. The graph contains no edges outside the complete graph on these four vertices, or else the player on the move would delete one of the extraneous edges and not lose on the next turn. The sixth edge (the dotted line), making a complete graph, would also have to be in the graph, or else the player to move would remove the edge shared by both triangles and destroy them both.

In all cases, the graph at this stage has six edges, an even number. Since we start with a graph with $\binom{17}{2} = 136$ edges, an even number of turns must have taken place. Therefore, it must be Oh's turn. Hence Ex will win.

Triangle Destruction can be played on the complete graph of order $n$. By the same reasoning as in the $n = 17$ game, Ex wins if $n \equiv 0, 1 \pmod{4}$, and Oh wins if $n \equiv 2, 3 \pmod{4}$.
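For very small graphs the game can be solved by exhaustive search, which gives an independent check on the parity rule. The sketch below is an assumption-laden illustration: it encodes the set of remaining edges as a bitmask, and a position is winning for the player to move if some edge removal either destroys the last triangle or leaves the opponent in a losing position. All names are made up.

```python
from functools import lru_cache
from itertools import combinations

def first_player_wins(n):
    """Solve Triangle Destruction on the complete graph of order n (n >= 3).
    Returns True if the first player (Oh) wins with best play."""
    edges = list(combinations(range(n), 2))
    index = {e: i for i, e in enumerate(edges)}
    # Each triangle is the bitmask of its three edges.
    triangles = [
        (1 << index[(a, b)]) | (1 << index[(a, c)]) | (1 << index[(b, c)])
        for a, b, c in combinations(range(n), 3)
    ]

    def has_triangle(mask):
        return any(mask & t == t for t in triangles)

    @lru_cache(maxsize=None)
    def to_move_wins(mask):
        # mask holds the remaining edges and contains at least one triangle.
        for i in range(len(edges)):
            bit = 1 << i
            if mask & bit:
                rest = mask ^ bit
                if not has_triangle(rest) or not to_move_wins(rest):
                    return True
        return False

    return to_move_wins((1 << len(edges)) - 1)

if __name__ == "__main__":
    for n in range(3, 6):
        print(n, "Oh wins" if first_player_wins(n) else "Ex wins")
```

The number of positions doubles with every edge, so this approach is practical only for orders up to about 6; the parity argument is what settles $n = 17$.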


5.9 Squares in Arithmetic Progression

The numbers 1, 25, and 49 are three perfect squares in arithmetic progression (since $25 - 1 = 49 - 25$). The same is true of the trio 4, 100, and 196. There are infinitely many triples of distinct perfect squares of positive integers in arithmetic progression, as evidenced by

\[ (a^2 - 2ab - b^2)^2, \quad (a^2 + b^2)^2, \quad (b^2 - 2ab - a^2)^2, \]

where $a$ and $b$ are positive integers with $a > b$. Clearly, these are squares. We can check that they are in arithmetic progression by expanding the expressions and subtracting the first from the second and the second from the third. Both differences are $4ab(a^2 - b^2)$. For example, with $a = 7$ and $b = 5$, we obtain the arithmetic progression

\[ 46^2 = 2116, \quad 74^2 = 5476, \quad 94^2 = 8836, \]

with common difference 3360.

How do we find such a solution? We use a number theory technique, due to Diophantus, in which we can find rational points on a curve given a rational base point. Let the three numbers be $x^2$, $y^2$, and $z^2$, with $z^2 - y^2 = y^2 - x^2$, so that

\[ x^2 + z^2 = 2y^2. \]

Let $x' = x/y$ and $z' = z/y$, so that

\[ x'^2 + z'^2 = 2. \]

We have a solution $(x', z') = (1, 1)$. Suppose that there is another solution $(m, n)$, where $m$ and $n$ are rational numbers. Then the slope of the line from $(1, 1)$ to $(m, n)$ is a rational number, say $t$. Thus $t = (n - 1)/(m - 1)$, and we have $n = t(m - 1) + 1$. Hence

\[ m^2 + [t(m - 1) + 1]^2 = 2, \]

and it follows that

\[ (1 + t^2)m^2 + (2t - 2t^2)m + (t^2 - 2t - 1) = 0. \]

By the quadratic formula,

\[ m = \frac{2t^2 - 2t \pm 2(1 + t)}{2(1 + t^2)}. \]

The positive sign gives us the solution we already know, $(1, 1)$, so the negative sign applies:

\[ m = \frac{t^2 - 2t - 1}{t^2 + 1}, \qquad n = \frac{-t^2 - 2t + 1}{t^2 + 1}. \]

Letting $t = a/b$, where $a$ and $b$ are integers ($b \ne 0$), we obtain

\[ m = \frac{a^2 - 2ab - b^2}{a^2 + b^2}, \qquad n = \frac{b^2 - 2ab - a^2}{a^2 + b^2}. \]

Multiplying these values by $a^2 + b^2$ results in the solution.
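The parameterization is easy to exercise. The following sketch (function name hypothetical) returns the three bases for given $a > b > 0$, taking absolute values so that the bases are positive.

```python
def squares_in_progression(a, b):
    """Return (x, y, z) such that x^2, y^2, z^2 are in arithmetic progression,
    for integers a > b > 0."""
    x = abs(a * a - 2 * a * b - b * b)
    y = a * a + b * b
    z = abs(b * b - 2 * a * b - a * a)
    return x, y, z

if __name__ == "__main__":
    for a, b in [(2, 1), (3, 1), (7, 5)]:
        x, y, z = squares_in_progression(a, b)
        print((a, b), (x * x, y * y, z * z), "difference", y * y - x * x)
```

The pair $(a, b) = (2, 1)$ reproduces the opening example 1, 25, 49.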


While there are infinitely many sets of three squares in arithmetic progression, there are no sets of four squares in arithmetic progression. This was conjectured by Pierre de Fermat (1601–1665) and proved by Leonhard Euler (1707–1783).

An application of Diophantus' method is the parameterization of all rational points on the unit circle $x^2 + y^2 = 1$. Taking $(0, -1)$ as the base point, we can show that all rational points on the unit circle, except the base point, are given by

\[ x = \frac{2t}{1 + t^2}, \qquad y = \frac{1 - t^2}{1 + t^2}, \qquad t \in \mathbb{Q}. \]

If we allow $t = \infty$, then we get the point $(0, -1)$.

Similarly, we can parameterize all rational points on the unit sphere $x^2 + y^2 + z^2 = 1$. With $(0, 0, -1)$ as the base point, we obtain

\[ x = \frac{2s}{1 + s^2 + t^2}, \qquad y = \frac{2t}{1 + s^2 + t^2}, \qquad z = \frac{1 - s^2 - t^2}{1 + s^2 + t^2}, \qquad s, t \in \mathbb{Q}. \]

If $s = t = \infty$, then we get the point $(0, 0, -1)$.

If we allow $s$ and $t$ to be real numbers, then the parameterization is essentially the stereographic projection of the plane onto the unit sphere given in Riemann Sphere in Chapter 2.

5.10 Random Hemispheres

Let $n$ hemispheres of a fixed sphere be selected at random. The probability that the sphere is covered by the hemispheres is $1 - 2^{-n}(n^2 - n + 2)$. Here is a proof: Each hemisphere is bounded by a great circle. The $n$ great circles divide the surface of the sphere into $n^2 - n + 2$ regions (which can be proved by mathematical induction). Given a great circle, there are two hemispheres that have it as a boundary. Hence, the probability that a given region within the great circle is covered by one of the two hemispheres is $1/2$. The probability that the region is covered by none of the $n$ hemispheres is $2^{-n}$. Since there are $n^2 - n + 2$ regions, the probability that at least one of them is covered by no hemisphere is $2^{-n}(n^2 - n + 2)$. The probability that the entire sphere is covered is the complementary probability.

The event that a sphere is not covered by $n$ hemispheres is the same as the event that the $n$ centers of the hemispheres are contained in a single hemisphere (why?). So we have shown that the probability that $n$ random points on the surface of a sphere are contained in a hemisphere is $2^{-n}(n^2 - n + 2)$.

As a generalization, suppose that $n$ spherical caps are selected at random on a sphere. Let the subtended central angle of each cap be $2\alpha$, where $0 < \alpha < \pi$. If $\alpha = \pi/2$, then each cap is a hemisphere. Let $f(n)$ be the probability that the surface of the sphere is covered by the $n$ spherical caps. No exact formula for $f(n)$ is known, but an asymptotic result is

\[ \lim_{n \to \infty} \frac{\log(1 - f(n))}{n} = \log\left(1 - \sin^2\frac{\alpha}{2}\right), \]

where log is the natural logarithm.
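The hemisphere formula can be tabulated exactly with rational arithmetic. In the sketch below (names are illustrative), the region count is also rebuilt by the induction step: one great circle gives 2 regions, and the $k$th circle, in general position, meets the earlier ones in $2(k - 1)$ points and so adds $2(k - 1)$ regions.

```python
from fractions import Fraction

def regions(n):
    """Number of regions cut out by n great circles in general position (n >= 1)."""
    return n * n - n + 2

def regions_by_induction(n):
    """Same count, built up one circle at a time (n >= 1)."""
    count = 2
    for k in range(2, n + 1):
        count += 2 * (k - 1)
    return count

def cover_probability(n):
    """Probability that n random hemispheres cover the sphere (n >= 1)."""
    return 1 - Fraction(regions(n), 2 ** n)

if __name__ == "__main__":
    for n in range(1, 9):
        print(n, regions(n), cover_probability(n))
```

The probability is 0 for $n \le 3$ and first becomes positive at $n = 4$, where it equals $1/8$.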

5.11 Odd Binomial Coefficients

A positive integer $n$ dominates a positive integer $k$ if the powers of 2 expansion of $n$ contains all the terms in the powers of 2 expansion of $k$. For example, 45 dominates 9, since $45 = 32 + 8 + 4 + 1$ and $9 = 8 + 1$. The binomial coefficient $\binom{n}{k}$ is odd if and only if $n$ dominates $k$.

One way to prove this is by showing that the exact power of 2 that divides $\binom{n}{k}$ is equal to the number of carries when $k$ and $n - k$ are added in base 2. We claim that

\[ (x + 1)^{2^j} \equiv x^{2^j} + 1 \pmod{2}, \quad j \ge 0. \]

The reason is that in the binomial expansion of the expression on the left, the binomial coefficients $\binom{2^j}{m}$, for $0 < m < 2^j$, are all even. This is because

\[ \binom{2^j}{m} = \frac{2^j}{m}\binom{2^j - 1}{m - 1}, \]

and $m$ divides the product in the numerator on the right yet $m < 2^j$.

Consider a specific value of $n$, such as 45. We have

\[ \begin{aligned} (x + 1)^{45} &= (x + 1)^{32 + 8 + 4 + 1} \\ &= (x + 1)^{32}(1 + x)^8(1 + x)^4(1 + x)^1 \\ &\equiv (1 + x^{32})(1 + x^8)(1 + x^4)(1 + x^1) \pmod{2}. \end{aligned} \]

Modulo 2, the only binomial coefficients that show up on the left are the odd ones, but on the right the only binomial coefficients $\binom{45}{m}$ that show up are those for which 45 dominates $m$. This proves the result in the case $n = 45$, and the general result is proved in the same way.

An immediate consequence of our result is that the number of odd entries in the $n$th row of Pascal's triangle is $2^\alpha$, where $\alpha$ is the number of 1s in the binary representation of $n$.

By the way, if in Pascal's triangle you replace the even numbers by 0s and the odd numbers by 1s, you will get a pattern that looks like Sierpiński's triangle of Chapter 2.
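In binary terms, $n$ dominates $k$ exactly when every 1 bit of $k$ is also a 1 bit of $n$, which is a one-line bitwise test. The sketch below (function names are illustrative) compares that test with the actual parity of the binomial coefficients.

```python
from math import comb

def dominates(n, k):
    """True if every power of 2 in the binary expansion of k also appears in that of n."""
    return (n & k) == k

def odd_entries_in_row(n):
    """Number of odd binomial coefficients C(n, k), 0 <= k <= n, by direct count."""
    return sum(comb(n, k) % 2 for k in range(n + 1))

if __name__ == "__main__":
    print(dominates(45, 9), comb(45, 9) % 2)
    print(odd_entries_in_row(45), 2 ** bin(45).count("1"))
```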

5.12 Frobenius' Postage Stamp Problem

Suppose that postage stamps come in denominations $a$ and $b$, where $a$ and $b$ are positive integers greater than 1 with greatest common divisor 1, and we have an unlimited supply of stamps. The problem is to find those positive integer amounts that cannot be made using these stamps.2

(a) The number of amounts that cannot be made is finite.

(b) The largest amount that cannot be made is $ab - a - b$.

(c) The number of amounts that cannot be made is $(ab - a - b + 1)/2$.

For example, if $a = 3$ and $b = 5$, then we can make all amounts except 1, 2, 4, and 7.

2 The creator of this problem, Ferdinand Georg Frobenius (1849–1917), made contributions in the areas of differential equations and group theory.


Figure 5.3. The line $ax + by = c$, with intercepts $(c/a, 0)$ and $(0, c/b)$ and the integer points $(x_0 c, y_0 c)$ and $(x_0 c - bt, y_0 c + at)$ marked.

If $\gcd(a, b) \ne 1$, we would have infinitely many amounts that cannot be made. For example, if $a = 2$ and $b = 4$, then no odd amount can be made.

(a) We will show that if $c > ab$, then the amount $c$ can be made with a positive number of $a$ stamps and $b$ stamps. It follows that the number of positive integer amounts that cannot be made is finite.

A well-known theorem of number theory is if $a$ and $b$ are relatively prime positive integers, i.e., $\gcd(a, b) = 1$, then there exist integers $x_0$ and $y_0$ such that $ax_0 + by_0 = 1$.

Assume that $c > ab$. We want to show that there exist positive integers $x$ and $y$ such that $ax + by = c$. This is equivalent to showing that the line $ax + by = c$ contains a point with integer coordinates in the first quadrant of the plane. See Figure 5.3. For any integer $t$ we have

\[ a(x_0 c - bt) + b(y_0 c + at) = c. \]

So integer solutions exist along the line shown at intervals of step length $\sqrt{a^2 + b^2}$. The result will follow if we show that the length of the line segment in the first quadrant is greater than the step length. Thus

\[ c > ab \implies \frac{c}{ab}\sqrt{a^2 + b^2} > \sqrt{a^2 + b^2} \implies \sqrt{\left(\frac{c}{a}\right)^2 + \left(\frac{c}{b}\right)^2} > \sqrt{a^2 + b^2}. \]

Therefore, some integer solution lies in the first quadrant, and we are done.

(b) A generating function gives the result. Without loss of generality, assume that $a < b$. The generating function for the nonnegative integer amounts that can be made using no $b$ stamps (and any number of $a$ stamps) is

\[ 1 + x^a + x^{2a} + \cdots = \frac{1}{1 - x^a}. \]


The generating function for the amounts that can be made with one $b$ stamp is

\[ x^b + x^{a + b} + x^{2a + b} + \cdots = \frac{x^b}{1 - x^a}. \]

The generating function for the amounts that can be made with two $b$ stamps is

\[ x^{2b} + x^{a + 2b} + x^{2a + 2b} + \cdots = \frac{x^{2b}}{1 - x^a}. \]

...

The generating function for the amounts that can be made with $a - 1$ of the $b$ stamps is

\[ x^{(a - 1)b} + x^{a + (a - 1)b} + x^{2a + (a - 1)b} + \cdots = \frac{x^{(a - 1)b}}{1 - x^a}. \]

We can stop here, because the amounts that can be made with $a$ or more of the $b$ stamps have already been counted (using $b$ or more of the $a$ stamps). Hence, the generating function for all nonnegative integer amounts that can be made is

\[ \frac{1}{1 - x^a}\left(1 + x^b + x^{2b} + \cdots + x^{(a - 1)b}\right) = \frac{1 - x^{ab}}{(1 - x^a)(1 - x^b)}. \]

The generating function for all nonnegative integer amounts is

\[ 1 + x + x^2 + \cdots = \frac{1}{1 - x}. \]

Therefore, the generating function for all amounts that cannot be made is

\[ \frac{1}{1 - x} - \frac{1 - x^{ab}}{(1 - x^a)(1 - x^b)} = \frac{(1 - x^a)(1 - x^b) - (1 - x)(1 - x^{ab})}{(1 - x)(1 - x^a)(1 - x^b)}. \]

We know from (a) that this rational function is a polynomial. The highest amount that cannot be made using $a$ and $b$ stamps is the degree of this polynomial, i.e., the degree of the numerator minus the degree of the denominator:

\[ (ab + 1) - (a + b + 1) = ab - a - b. \]

(c) The number of amounts that cannot be made is the number of nonzero terms in the polynomial. We can obtain this by evaluating it at $x = 1$, or, equivalently, by calculating the limit of the generating function as $x \to 1$. Since both numerator and denominator tend to 0, we use l'Hôpital's rule, and we must do so three times. The third-order derivative of the numerator, letting $x = 1$, is

\[ -3ab(ab - a - b + 1). \]

The third-order derivative of the denominator, letting $x = 1$, is

\[ -6ab. \]

The quotient is

\[ \frac{ab - a - b + 1}{2}, \]

so this is the number of positive integer amounts that cannot be made using $a$ and $b$ stamps.

The generalization of Frobenius' problem to three stamp denominations is unsolved. See [40] for an algorithmic approach to the problem and a survey of results.
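All three statements can be checked by brute force for small denominations. The sketch below marks which amounts up to $ab$ are makeable by a simple dynamic program; stopping at $ab$ is justified by part (a), and the function name is hypothetical.

```python
from math import gcd

def unmakeable(a, b):
    """Positive amounts that cannot be written as a*x + b*y with integers x, y >= 0,
    for relatively prime denominations a, b > 1. Amounts above a*b need not be
    examined, since every amount greater than a*b can be made."""
    if gcd(a, b) != 1:
        raise ValueError("denominations must be relatively prime")
    limit = a * b
    makeable = [False] * (limit + 1)
    makeable[0] = True
    for c in range(1, limit + 1):
        makeable[c] = (c >= a and makeable[c - a]) or (c >= b and makeable[c - b])
    return [c for c in range(1, limit + 1) if not makeable[c]]

if __name__ == "__main__":
    print(unmakeable(3, 5))
    for a, b in [(3, 5), (4, 7), (5, 9)]:
        bad = unmakeable(a, b)
        print((a, b), max(bad), a * b - a - b, len(bad), (a * b - a - b + 1) // 2)
```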


5.13 Perrin’s Sequence

Let an bethe nth term of Perrin’s sequence (named after the French mathematician R. f g Perrin), deﬁned by

a0 3; a1 0; a2 2; D D D an an2 an3; n 3: D C

If $p$ is prime, then $p \mid a_p$ ($p$ divides $a_p$). See [34] for three pleasing proofs of this Perrin property. We give a simple counting argument based on the fact that $a_n$ is the number of maximal independent subsets of $\{1, 2, 3, \ldots, n\}$, as observed by Zoltán Füredi in 1987.

Denote by $b_n$ the number of maximal independent subsets of $\{1, 2, 3, \ldots, n\}$, where $n \ge 2$. The word independent means that the subset contains no two consecutive numbers (where 1 and $n$ are regarded as consecutive). The word maximal means that no more elements can be included in the subset. An example of a maximal independent subset of $\{1, 2, 3, 4, 5, 6, 7\}$ is $\{1, 4, 6\}$.

We will prove that $a_n = b_n$, for $n \ge 2$, and show that $p \mid a_p$, for $p$ prime. However, $n \mid a_n$ does not imply that $n$ is prime. The smallest counterexample is $n = 271441 = 521^2$.

We can easily check that $b_2 = 2$, $b_3 = 3$, and $b_4 = 2$, agreeing with Perrin's sequence. Suppose that the largest element in a maximal independent subset of $\{1, 2, 3, \ldots, n\}$ is $k$ and the second-largest element is $j$. Then clearly $k = j + 2$ or $k = j + 3$. Eliminating $k$ and the numbers between $k$ and $j$ results in a maximal independent subset of a set of size $n - 2$ (if $k = j + 2$) or size $n - 3$ (if $k = j + 3$). This transformation, and its inverse, show that $b_n = b_{n-2} + b_{n-3}$, for $n \ge 5$. Therefore $a_n = b_n$, for $n \ge 2$.

Now we can give a combinatorial proof that a prime $p$ divides the $p$th Perrin number. Let $p$ be a prime. Consider the map, say $f$, that acts on maximal independent subsets of $\{1, \ldots, p\}$ by rotating them one step forward: $f(k) = k + 1$ if $k < p$, and $f(p) = 1$. We claim that $f$ has no fixed points; that is, there exists no maximal independent subset $S$ of $\{1, \ldots, p\}$ such that $f$ applied to $S$ is $S$ itself. The reason is that if there were such an $S$, then, given any element $s \in S$, by definition of $f$ we would have $s + 1 \in S$. But this implies that $S$ contains all the elements of $\{1, \ldots, p\}$ and hence could not be independent.

Furthermore, we claim that each maximal independent subset $S$ of $\{1, \ldots, p\}$ has order $p$ under $f$. This means that $f$ must be applied $p$ times to $S$ in order to obtain $S$ again. I leave this as an exercise (it uses the fact that $p$ is a prime). These observations imply that the collection of maximal independent subsets of $\{1, \ldots, p\}$ is partitioned into sub-collections of size $p$, which means that $p$ divides the number of maximal independent subsets of $\{1, \ldots, p\}$, that is, $p \mid a_p$.
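The counting interpretation can be checked by brute force for small $n$. The following is a minimal sketch, not part of the original argument; the function names are illustrative. It compares Perrin's sequence with a direct count of maximal independent subsets of $\{1, \ldots, n\}$ (with 1 and $n$ treated as consecutive) and tests the divisibility $p \mid a_p$ for a few primes.

```python
from itertools import combinations


def perrin(n_max):
    """Return [a_0, ..., a_{n_max}] for Perrin's sequence."""
    a = [3, 0, 2]
    for n in range(3, n_max + 1):
        a.append(a[n - 2] + a[n - 3])
    return a[: n_max + 1]


def is_independent(subset, n):
    """True if no two elements are consecutive, with 1 and n treated as consecutive."""
    s = set(subset)
    return all((k % n) + 1 not in s for k in s)


def count_maximal_independent(n):
    """Brute-force count of maximal independent subsets of {1, ..., n}."""
    universe = range(1, n + 1)
    count = 0
    for size in range(n + 1):
        for subset in combinations(universe, size):
            if not is_independent(subset, n):
                continue
            # Maximal: adding any missing element destroys independence.
            if all(not is_independent(subset + (x,), n)
                   for x in universe if x not in subset):
                count += 1
    return count


if __name__ == "__main__":
    a = perrin(30)
    for n in range(2, 11):
        print(n, a[n], count_maximal_independent(n))
    print([p for p in (2, 3, 5, 7, 11, 13, 17, 19, 23, 29) if a[p] % p == 0])
```

The brute-force count is exponential in $n$, so it is only meant for small cases; the recurrence does the real work.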

5.14 On the Number of Partial Orders

A partial order on a set is a binary relation that is transitive, reflexive, and anti-symmetric (see Appendix A). For example, the relation “$a$ divides $b$” is a partial order on the set $\{1, 2, 3, 4, 5, 6\}$. Let $p(n)$ be the number of partial orders on the set $\{1, \ldots, n\}$. No formula
for $p(n)$ is known. The known values of $p(n)$, at the time of this writing, are given in the table. See entry A001035 in the Online Encyclopedia of Integer Sequences.

  n   p(n)
  1   1
  2   3
  3   19
  4   219
  5   4231
  6   130023
  7   6129859
  8   431723379
  9   44511042511
 10   6611065248783
 11   1396281677105899
 12   414864951055853499
 13   171850728381587059351
 14   98484324257128207032183
 15   77567171020440688353049939
 16   83480529785490157813844256579
 17   122152541250295322862941281269151
 18   241939392597201176602897820148085023

The units digits appear to be periodic with period 4. The repeating block is 1, 3, 9, 9. Can we prove this?

When faced with a challenging mathematical question, it's often a good idea to generalize, which may give a better understanding. Trials with other moduli suggest that if the modulus $m$ is a prime number, then the sequence is periodic with period $m - 1$. If the modulus $m$ is a prime power, the sequence appears to be periodic with period $\phi(m)$, where $\phi$ is Euler's $\phi$-function. (This function counts the number of positive integers less than $m$ that have no common factor with $m$ other than 1.) For any modulus $m$, the sequence appears to be periodic with period equal to the least common multiple (lcm) of the constituent periods. For example, with $m = 12$, the period appears to be $\operatorname{lcm}(\phi(4), \phi(3)) = \operatorname{lcm}(2, 2) = 2$.

Let's prove our conjecture when $m$ is prime. This result was originally proved by Z. I. Borevich in 1984, but we will give a beautiful 2010 proof (unpublished) by Aaron Meyerowitz. It is similar to the proof of a property of Perrin's sequence given in the previous section. We will show that

\[ p(n + m - 1) \equiv p(n) \pmod{m}, \quad n \ge 1, \]

where $m$ is prime. It's easier to follow the proof with specific numbers, so let's take $m = 5$ and $n = 3$. We want to show that

\[ p(7) \equiv p(3) \pmod{5}. \]

Remember that $p(3)$ is the number of partial orders on the set $\{1, 2, 3\}$. The key idea is to replace the 3 in this set by five clones, say 3a, 3b, 3c, 3d, and 3e. Now we have a
set of seven elements: $\{1, 2, 3a, 3b, 3c, 3d, 3e\}$. We will consider partial orders on this set, which is equivalent to the set $\{1, 2, 3, 4, 5, 6, 7\}$. Let $\sigma$ be the permutation of the set $\{1, 2, 3a, 3b, 3c, 3d, 3e\}$ that fixes 1 and 2, and cycles the clones:

\[ \sigma = (1)(2)(3a, 3b, 3c, 3d, 3e). \]

It induces a permutation on the collection of all partial orders on the set $\{1, 2, 3a, 3b, 3c, 3d, 3e\}$. Since $\sigma$ is a 5-cycle, and 5 is a prime, its orbits have sizes 1 or 5. We are counting modulo 5, so the orbits of size 5 may be ignored. When can $\sigma$ result in an orbit of size 1 (a fixed point)? If a partial order is fixed, then there can be no relations among the clones (other than the reflexive relations), or else a clone with no predecessor would be mapped to another clone that has a predecessor. Furthermore, each clone must act exactly like the element 3 in the set $\{1, 2, 3\}$. Hence, the orbits of size 1 are in bijective correspondence with the partial orders on $\{1, 2, 3\}$. This proves the congruence, and the general case where $m$ is a prime is proved in the same manner.

Can this argument be boosted up (say, by defining clones of clones) to account for the case where $m$ is a prime power?

For our purposes, we need the case only when $m$ is prime. Applying our result with $m = 2$ and $m = 5$, we find that $p(n + 1) - p(n)$ is divisible by 2 for all $n$ (thus, all $p(n)$ are odd), and $p(n + 4) - p(n)$ is divisible by 5 for all $n$. It follows that $p(n + 4) - p(n)$ is divisible by 10 for all $n$, and hence the block 1, 3, 9, 9 repeats forever. It is gratifying that we can prove such a thing about a sequence for which we can't compute many values.
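For readers who want to experiment, here is a small sketch (illustrative names, brute force only) that counts partial orders directly for $n \le 4$ and then checks the congruence $p(n + m - 1) \equiv p(n) \pmod{m}$ against the first ten values from the table for a few small primes $m$. The direct count enumerates every relation on the off-diagonal pairs, so it is hopeless beyond very small $n$.

```python
from itertools import product

# First ten values of p(n) copied from the table above (n = 1, ..., 10).
TABLE = [1, 3, 19, 219, 4231, 130023, 6129859, 431723379,
         44511042511, 6611065248783]


def count_partial_orders(n):
    """Count partial orders on {0, ..., n-1}; reflexive pairs are implicit."""
    pairs = [(i, j) for i in range(n) for j in range(n) if i != j]
    count = 0
    for bits in product((0, 1), repeat=len(pairs)):
        rel = {pair for pair, bit in zip(pairs, bits) if bit}
        # Anti-symmetry: (i, j) and (j, i) cannot both hold for i != j.
        if any((j, i) in rel for (i, j) in rel):
            continue
        # Transitivity: (i, j) and (j, l) force (i, l).
        if all((i, l) in rel
               for (i, j) in rel for (k, l) in rel
               if j == k and i != l):
            count += 1
    return count


def congruence_holds(values, m):
    """Check p(n + m - 1) = p(n) (mod m) wherever both entries are available."""
    return all((values[i + m - 1] - values[i]) % m == 0
               for i in range(len(values) - (m - 1)))


if __name__ == "__main__":
    print([count_partial_orders(n) for n in range(1, 5)])
    print({m: congruence_holds(TABLE, m) for m in (2, 3, 5, 7)})
```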

5.15 Perfect Error-Correcting Codes

A binary code is a set of binary vectors of some fixed length. For example, the set

\[ \{(0, 0, 1),\ (1, 1, 0),\ (1, 1, 1)\} \]

constitutes a code. The elements of a code are called codewords. If we have information to send over a noisy channel, one in which errors occur, we can first encode the information with codewords.

The distance between two binary strings is the number of coordinates in which they differ. For example, the distance between $(0, 0, 1)$ and $(1, 1, 1)$ is 2, since their first and second coordinates differ. The distance of a code is the minimum distance between any two codewords. The code given above has distance 1.

If a code has distance $2e + 1$, then there is a method for correcting $e$ or fewer errors (alterations of bits). If $e$ or fewer errors occur in the transmission of a codeword, we assume that the intended codeword is the one within distance $e$ of the received binary vector. We think of each codeword as surrounded by a sphere of radius $e$, consisting of all binary vectors whose distance from the center is at most $e$. All such vectors are decoded to the codeword at the center. If the vectors have length $n$, then the number of binary vectors in a sphere of radius $e$ is

\[ \sum_{i=0}^{e} \binom{n}{i}, \]

since, for $0 \le i \le e$, there are $\binom{n}{i}$ choices for which $i$ coordinates of a binary vector disagree with the codeword.
A code is capable of correcting $e$ errors if and only if the spheres of radius $e$ around the codewords are disjoint. Since there are $2^n$ binary strings of length $n$, in a code with $w$ codewords it is necessary that

\[ w \sum_{i=0}^{e} \binom{n}{i} \le 2^n. \]

In the case of equality, we say that a code is perfect. A perfect code is a special mathematical object. It is equivalent to a packing of $w$ disjoint spheres of radius $e$ in an $n$-dimensional binary vector space.

In a perfect code, the quantities $w$ and $\sum_{i=0}^{e} \binom{n}{i}$ must both be powers of 2 (because their product is a power of 2). This turns out to be a severe limitation on the possibilities for existence of perfect codes. Let's ignore the cases $e = 0$ (which means that every binary vector is a codeword), $e = n$ (which means that there is only one codeword), and $e = (n - 1)/2$, for $n$ odd (which means that there are only two codewords). Aside from these trivial cases, an infinite family of perfect codes exists with $e = 1$; these are called Hamming codes.³ The only other feasible values of $(n, e)$ are $(23, 3)$ and $(90, 2)$. This is difficult to prove, but a computer search may convince you of its likelihood. Notice that

\[ \binom{23}{0} + \binom{23}{1} + \binom{23}{2} + \binom{23}{3} = 2^{11} \quad \text{and} \quad \binom{90}{0} + \binom{90}{1} + \binom{90}{2} = 2^{12}. \]
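The computer search mentioned above might look like the following sketch. The function names and the search bound are illustrative choices, not taken from the text. It lists the pairs $(n, e)$ with $2 \le e < (n - 1)/2$ for which the sphere size is a power of 2, which skips the trivial cases and the Hamming case $e = 1$.

```python
from math import comb


def sphere_size(n, e):
    """Number of binary vectors of length n within distance e of a fixed vector."""
    return sum(comb(n, i) for i in range(e + 1))


def is_power_of_two(x):
    return x > 0 and x & (x - 1) == 0


def feasible_parameters(n_max):
    """Pairs (n, e) with 2 <= e < (n - 1) / 2 whose sphere size is a power of 2."""
    found = []
    for n in range(1, n_max + 1):
        for e in range(2, n):
            if 2 * e >= n - 1:  # reached e = (n - 1) / 2 or beyond
                break
            if is_power_of_two(sphere_size(n, e)):
                found.append((n, e))
    return found


if __name__ == "__main__":
    print(sphere_size(23, 3), sphere_size(90, 2))
    print(feasible_parameters(100))
```

A finite search of this kind only supports the claim within the chosen bound; it is not a proof.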

In fact, there is no perfect code with $(n, e) = (90, 2)$. Assume that there is such a code. Without loss of generality, we may assume that the code contains the all-zero vector of length 90. Let $X$ be the set of binary strings of length 90 whose first two coordinates are 1 and that have exactly one other coordinate equal to 1. Since there are 88 choices for the third coordinate that equals 1, we see that $X$ has 88 elements. Since the code is perfect, each element of $X$ is contained in exactly one sphere of radius 2 around a codeword. Such a codeword must have exactly five 1s (why?), two of them in the first two coordinates. Let $Y$ be the set of codewords of this kind. Since the sphere around each such codeword contains exactly three elements of $X$, we have $3|Y| = |X| = 88$. But 88 isn't divisible by 3, so we have a contradiction. Therefore, there is no perfect code with $(n, e) = (90, 2)$.

A perfect binary code with $(n, e) = (23, 3)$, called the Golay code, was discovered by the mathematician and physicist Marcel J. E. Golay (1902–1989). We will give a construction of the Golay code based on an idea due to Robert T. Curtis and Tony R. Morris (see [6]). We will produce the Golay code as a vector space, namely, the vector space of linear combinations of the row vectors of a $12 \times 24$ matrix. This will yield a code where the codewords have length 24. Deleting any one coordinate results in a perfect code where the codewords have length 23.

Let $G$ be the icosahedral graph (Figure 5.4). It has twelve vertices, corresponding to the twelve vertices of a regular icosahedron. Two vertices are adjacent (joined by an edge) if and only if the corresponding vertices of the icosahedron are the endpoints of an edge. Assume that the vertices of $G$ are numbered from 1 to 12. Take vectors of length 24, denoted by $\langle x, y \rangle$, where $x$ is the indicator vector of any subset of vertices of $G$, and $y$ is the indicator vector of the set of vertices of $G$ that are nonadjacent to an odd number of vertices of $x$. For example, if

\[ x = \langle 1, 1, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0 \rangle, \]

then

\[ y = \langle 1, 1, 0, 1, 1, 0, 0, 1, 1, 0, 0, 0 \rangle. \]

Figure 5.4. The icosahedral graph.

³ Hamming codes were discovered by Richard Hamming (1915–1998).

The generator matrix of the extended Golay code is $M = [I_{12} \mid A]$, where $A$ is the nonadjacency matrix of the icosahedral graph, and $I_{12}$ is the $12 \times 12$ identity matrix. The $ij$ entry of $A$ is 1 if vertices $i$ and $j$ are nonadjacent and 0 if they are adjacent. Every vertex is nonadjacent to itself, so the diagonal entries of $A$ are 1. This matrix $M$ encodes the definition that $y$ represents the set of vertices of $G$ that are nonadjacent to an odd number of vertices represented by $x$.

An arbitrary binary vector $v$ of length 12 maps to $vM$, producing a binary vector of length 24. We obtain $2^{12}$ different vectors altogether, since the rows of $M$ are linearly independent. If $v = 0$, then $vM$ is the all-zero vector. We claim that the other image vectors have eight 1s, twelve 1s, sixteen 1s, or twenty-four 1s. The easiest way to prove this is by getting a computer to do all the multiplications and keep track of the number of 1s in the image vectors. Deleting one coordinate results in the Golay code with $(n, e) = (23, 3)$, which has $2^{12}$ codewords of length 23, each pair differing in at least seven coordinates.

The Golay code is special. Its automorphism group (the group of permutations of the coordinates of the codewords that leave the code unchanged) is the Mathieu⁴ group $M_{23}$, a sporadic simple group of order $23 \cdot 22 \cdot 21 \cdot 20 \cdot 16 \cdot 3$. The 759 codewords of weight 8 in the extended Golay code give rise to a Steiner⁵ system $S(5, 8, 24)$.

See [45] for a clear account of coding theory.
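The computer check suggested above can be sketched as follows. This is an illustration under stated assumptions: the vertex numbering comes from the standard coordinates $(0, \pm 1, \pm\varphi)$ and their cyclic shifts, not from Figure 5.4, so individual codewords may differ from the book's by a relabeling of coordinates, and all function names are hypothetical. The sketch builds the rows of $[I_{12} \mid A]$ as 24-bit integers, forms all $2^{12}$ linear combinations over the two-element field, and tallies the weights.

```python
from collections import Counter
from itertools import product
from math import sqrt


def icosahedron_adjacency():
    """Adjacency sets of the icosahedral graph; edges join vertices at distance 2."""
    phi = (1 + sqrt(5)) / 2
    verts = []
    for s1, s2 in product((1, -1), repeat=2):
        verts += [(0, s1, s2 * phi), (s1, s2 * phi, 0), (s2 * phi, 0, s1)]
    adj = [set() for _ in verts]
    for i, p in enumerate(verts):
        for j, q in enumerate(verts):
            d2 = sum((a - b) ** 2 for a, b in zip(p, q))
            if abs(d2 - 4) < 1e-9:
                adj[i].add(j)
    return adj


def generator_rows():
    """Rows of [I | A] as 24-bit integers, A being the nonadjacency matrix."""
    adj = icosahedron_adjacency()
    rows = []
    for i in range(12):
        identity_part = 1 << i
        nonadjacent_part = sum(1 << (12 + j) for j in range(12) if j not in adj[i])
        rows.append(identity_part | nonadjacent_part)
    return rows


def codewords(rows):
    """All linear combinations of the rows over GF(2)."""
    words = []
    for v in range(1 << len(rows)):
        word = 0
        for i, row in enumerate(rows):
            if (v >> i) & 1:
                word ^= row
        words.append(word)
    return words


def weight(word):
    return bin(word).count("1")


if __name__ == "__main__":
    words = codewords(generator_rows())
    print(sorted(Counter(weight(w) for w in words).items()))
    # Delete one coordinate (bit 0) and report the minimum nonzero weight.
    print(min(weight(w >> 1) for w in words if w))
```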

⁴ Émile Léonard Mathieu (1835–1890) discovered five sporadic simple groups.
⁵ Jakob Steiner (1796–1863) was a geometer who worked primarily in the area of synthetic geometry.

5.16 Binomial Coefficient Magic

In 1891 A. C. Dixon⁶ found a formula for the alternating sum of the cubes of binomial coefficients:

\[ \sum_{k=0}^{2n} (-1)^k \binom{2n}{k}^3 = (-1)^n \frac{(3n)!}{(n!)^3}. \]

Though there are formulas for the sum of the first and second powers of the binomial coefficients,

\[ \sum_{k=0}^{n} \binom{n}{k} = 2^n \quad \text{and} \quad \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n} \]

(e.g., see [17]), no formula is known for the sum of the cubes of the binomial coefficients:

\[ \sum_{k=0}^{n} \binom{n}{k}^3. \]

Let's prove Dixon's formula. We will look at it as a special case of a more general formula. This is a typical strategy in mathematics. Sometimes a more general formula is easier to prove than a specific case. Since we are dealing with cubes of binomial coefficients, it is fairly natural to sum over products of three different binomial coefficients. An educated guess is the formula called Dixon's identity:

\[ \sum_{j=-\infty}^{\infty} (-1)^j \binom{a+b}{a+j} \binom{b+c}{b+j} \binom{c+a}{c+j} = \frac{(a+b+c)!}{a!\,b!\,c!}, \]

where $a$, $b$, and $c$ are nonnegative integers. The domain of summation is actually finite, since the summand is 0 outside the interval $-a \le j \le a$. Setting $a = b = c = n$ and making the change of variables $j = k - n$, we recover the desired special case.

Let's think of Dixon's identity as an identity in one variable with the other two variables fixed. Thus, we replace $a$ by $m$, and think of $b$ and $c$ as constants. The identity is

\[ f(m) = \sum_{j=-\infty}^{\infty} (-1)^j \binom{m+b}{m+j} \binom{b+c}{b+j} \binom{c+m}{c+j} = \frac{(m+b+c)!}{m!\,b!\,c!}. \]

We want to prove this identity for all $m \ge 0$. For $m = 0$, there is only one nonzero summand (corresponding to $j = 0$), and the identity reads $\binom{b+c}{b} = (b+c)!/(b!\,c!)$, which is true. What happens when $m$ increases to $m + 1$? If the identity holds, then

\[ \frac{f(m+1)}{f(m)} = \frac{(m+1+b+c)!}{(m+1)!\,b!\,c!} \cdot \frac{m!\,b!\,c!}{(m+b+c)!} = \frac{m+b+c+1}{m+1}, \quad m \ge 0. \]

If we can prove this relation, then it will follow by induction that $f(m) = (m+b+c)!/(m!\,b!\,c!)$ for all $m \ge 0$, and since we could do the same process for each of the three variables, this will establish Dixon's identity. We write the relation to be proved as

\[ (m+1) f(m+1) - (m+1+b+c) f(m) = 0. \]

⁶ Alfred Cardew Dixon (1865–1936) worked primarily in differential equations. Dixon attributed the summation formula to Frank Morley, the geometer who proved Morley's theorem on the trisectors of a triangle.

We will use telescoping series. Denote the summand in Dixon's identity as

\[ f(m, j) = (-1)^j \binom{m+b}{m+j} \binom{b+c}{b+j} \binom{c+m}{c+j}. \]

By definition,

\[ \sum_{j} f(m, j) = f(m). \]

We want to find a function $g(m, j)$ such that

\[ (m+1) f(m+1, j) - (m+1+b+c) f(m, j) = g(m, j+1) - g(m, j). \qquad (*) \]

If we sum both sides of $(*)$ over all integers $j$, then the right side will be a telescoping series, and if the telescoping series sums to 0, we are done! Thus, we get a one-step proof if we can find $g(m, j)$. Experience has shown that we can often choose a function of the form

\[ g(m, j) = r(m, j) f(m, j), \]

where $r(m, j)$ is a rational function of $m$ and $j$. In our case, we can take

\[ r(m, j) = \frac{(b+j)(c+j)}{2(j-m-1)}. \]

How do we find such a rational function? One way is to feed some values of $f(m, j)$ into a computer to guess the numerator and denominator, assuming that the degrees are not large. The most important point is that once we have $g(m, j)$, we can easily check (by computer or by hand) that $(*)$ is satisfied. The resulting telescoping series sums to 0, since $f(m, j) = 0$ for $j$ sufficiently large or sufficiently small.

Let's prove condition $(*)$ by hand. In order to cancel lots of factorials, we divide each term of $(*)$ by $f(m, j)$, remembering that $g(m, j) = r(m, j) f(m, j)$. The left side becomes

\[ \frac{(m+1)(m+b+1)(m+c+1)}{(m+1+j)(m+1-j)} - (m+b+c+1), \]

and the right side becomes

\[ \frac{(b-j)(c-j)}{2(j+m+1)} - \frac{(b+j)(c+j)}{2(j-m-1)}. \]

Algebra shows that the expressions are equal. This concludes the proof of Dixon's identity and the special sum in our discussion.

The functions $f(m, j)$ and $g(m, j)$ in our solution are called a WZ pair, named after the discoverers of the method, Herbert Wilf and Doron Zeilberger. A complete description of the WZ method is given in [42].

We could have used the WZ method directly on Dixon's formula, without the need for the more general identity. However, the rational function would have been more complicated. You might want to use the method to prove the formulas

\[ \sum_{k=0}^{n} \binom{n}{k} = 2^n, \quad \sum_{k=0}^{n} \binom{n}{k}^2 = \binom{2n}{n}, \quad \text{and} \quad \sum_{k=0}^{2n} (-1)^k \binom{2n}{k}^2 = (-1)^n \binom{2n}{n}. \]

Figure 5.5. A 4 × 4 array.

You can use the method on these formulas directly, without recourse to more general identities.
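Both Dixon's identity and the rational-function step of the WZ argument for it can be spot-checked numerically. The sketch below uses illustrative names and exact rational arithmetic. It verifies the identity for small $m$, $b$, $c$ and compares the two expressions obtained after dividing $(*)$ by $f(m, j)$, restricted to $-m \le j \le m$ so that no denominator vanishes.

```python
from fractions import Fraction
from math import comb, factorial


def binom(n, k):
    """Binomial coefficient, taken to be 0 outside 0 <= k <= n."""
    return comb(n, k) if 0 <= k <= n else 0


def f(m, j, b, c):
    """Summand of Dixon's identity with a replaced by m."""
    sign = -1 if j % 2 else 1
    return sign * binom(m + b, m + j) * binom(b + c, b + j) * binom(c + m, c + j)


def dixon_sum(m, b, c):
    """Sum of f(m, j) over the finite range where the summand can be nonzero."""
    return sum(f(m, j, b, c) for j in range(-m, m + 1))


def left_side(m, j, b, c):
    """Left side of (*) divided by f(m, j)."""
    return (Fraction((m + 1) * (m + b + 1) * (m + c + 1),
                     (m + 1 + j) * (m + 1 - j)) - (m + b + c + 1))


def right_side(m, j, b, c):
    """Right side of (*) divided by f(m, j)."""
    return (Fraction((b - j) * (c - j), 2 * (j + m + 1))
            - Fraction((b + j) * (c + j), 2 * (j - m - 1)))


if __name__ == "__main__":
    print(dixon_sum(2, 3, 4), factorial(9) // (factorial(2) * factorial(3) * factorial(4)))
    print(left_side(2, 1, 3, 4), right_side(2, 1, 3, 4))
```

A numerical check like this is no substitute for the algebra, but it catches sign and index slips quickly.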

5.17 A Group of Operations

Let a $4 \times 4$ array be given (Figure 5.5). Suppose that we can perform three operations on the array:

• Exchange any two rows.
• Exchange any two columns.
• Exchange any two quadrants (the quadrants are outlined with heavy lines).

Combinations of the operations form a group. What is its order?

The group is isomorphic to the affine linear group $AG(4, 2)$. This is the group of all transformations

\[ v \mapsto vM + t, \]

where $M$ is an invertible $4 \times 4$ binary matrix, and $v$ and $t$ are binary row vectors of length 4. Its order is

\[ (2^4 - 1)(2^4 - 2)(2^4 - 2^2)(2^4 - 2^3)\,2^4 = 322560. \]

The reason is that there are $2^4 - 1$ choices for the first row of an invertible binary matrix (any binary vector of length 4 except the all-zero vector), $2^4 - 2$ choices for the second row (any binary vector except a multiple of the first row), $2^4 - 2^2$ choices for the third row (any binary vector except a linear combination of the first two rows), $2^4 - 2^3$ choices for the fourth row (any binary vector except a linear combination of the first three rows), and $2^4$ choices for the translation vector $t$. The order of the group is the product of the numbers of choices.

Let's prove that the group of operations is $AG(4, 2)$. We put binary coordinates on the cells of the array (Figure 5.6), with the first two coordinates placed in the horizontal direction and the last two placed in the vertical direction. We'll refer to the quadrants of the array as NE, NW, SE, and SW.

    11 | 0011  0111  1011  1111
    10 | 0010  0110  1010  1110
    01 | 0001  0101  1001  1101
    00 | 0000  0100  1000  1100
       +-----------------------
          00    01    10    11

Figure 5.6. A 4 × 4 array with coordinates.

Let's look at the action of transvection matrices on the coordinates. A transvection matrix is a matrix formed by changing one of the off-diagonal entries of the identity matrix from 0 to 1. For each such matrix $M$, we consider the result of multiplying on the left by a row vector: $vM$. In this case, the translation vector $t$ is the all-zero vector.

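\[
\begin{array}{ll}
\text{matrix} & \text{action} \\[6pt]
M_{12} = \begin{bmatrix} 1 & 1 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} & \text{swaps columns 10 and 11} \\[6pt]
M_{21} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 1 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} & \text{swaps columns 01 and 11} \\[6pt]
M_{13} = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} & \text{swaps NE and SE quadrants} \\[6pt]
M_{31} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix} & \text{swaps NE and NW quadrants} \\[6pt]
M_{34} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 1 \\ 0 & 0 & 0 & 1 \end{bmatrix} & \text{swaps rows 10 and 11} \\[6pt]
M_{43} = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 1 & 1 \end{bmatrix} & \text{swaps rows 01 and 11}
\end{array}
\]

In all cases the origin cell, 0000, is fixed.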

We will show that these six transvection matrices generate all invertible $4 \times 4$ binary matrices. The matrices are labeled $M_{ij}$ so that the single off-diagonal 1 occurs in the $i$th row and $j$th column.

Every invertible matrix can be written as a product of elementary row operation matrices. The elementary row operations are (1) multiplication by a nonzero scalar, (2) interchange of two rows, and (3) replacement of a row by that row plus a scalar multiple of another row. With operations over a two-element field, scalar multiplication doesn't amount to much, since the only nonzero scalar is 1.

Transvection matrices correspond to elementary row operations of type (3). We assume that the operating matrix is on the left and the operated-upon matrix is on the right. For example, multiplying a $4 \times 4$ matrix $A$ on the left by $M_{12}$ replaces the first row of $A$ by the sum of the first and second rows. In general, the product $M_{ij} A$ is the matrix in which row $i$ of $A$ is replaced by the sum of rows $i$ and $j$.

The type (2) elementary row operations are easy to form with our transvection matrices. For instance, the matrix product $M_{12} M_{21} M_{12} A$ is the matrix in which the first two rows of $A$ are interchanged. With permutation matrices at our disposal, it is easy to generate the other transvection matrices. How would you generate the matrix $M_{23}$? Thus, the six matrices generate all $4 \times 4$ invertible binary matrices.

Addition of a nonzero translation vector $t$ moves the origin cell. Therefore, the group of operations is the same as $AG(4, 2)$.
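The generation claim is small enough to confirm by machine. The following sketch uses hypothetical helper names and stores each matrix as a tuple of four row bitmasks. It closes the six transvections under multiplication over the two-element field and counts the resulting matrices; multiplying that count by the $2^4$ translations should reproduce the group order stated earlier.

```python
def transvection(i, j, n=4):
    """Identity matrix with an extra 1 in row i, column j (1-based), rows as bitmasks."""
    rows = [1 << (n - 1 - r) for r in range(n)]  # bit (n - 1 - c) stands for column c
    rows[i - 1] |= 1 << (n - j)
    return tuple(rows)


def multiply(a, b):
    """Matrix product a * b over GF(2): each row of a selects rows of b to XOR."""
    n = len(a)
    product_rows = []
    for row in a:
        acc = 0
        for k in range(n):
            if (row >> (n - 1 - k)) & 1:
                acc ^= b[k]
        product_rows.append(acc)
    return tuple(product_rows)


def generated_group(generators):
    """Closure of the generators under multiplication (each generator is its own inverse)."""
    n = len(generators[0])
    identity = tuple(1 << (n - 1 - r) for r in range(n))
    seen = {identity}
    frontier = [identity]
    while frontier:
        next_frontier = []
        for g in frontier:
            for h in generators:
                p = multiply(g, h)
                if p not in seen:
                    seen.add(p)
                    next_frontier.append(p)
        frontier = next_frontier
    return seen


SIX = [transvection(1, 2), transvection(2, 1), transvection(1, 3),
       transvection(3, 1), transvection(3, 4), transvection(4, 3)]

if __name__ == "__main__":
    group = generated_group(SIX)
    print(len(group), len(group) * 2 ** 4)
```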


6 Elegant Solutions

The essence of mathematics resides in its freedom.
—GEORG CANTOR (1845–1918)

What makes a difficult mathematics problem easy? Sometimes there is a sudden flash of understanding. Sometimes past experience points out the right direction to take. This chapter presents problems whose solutions illustrate concepts or techniques that can be appreciated for their power and beauty, and may be useful to you in future problem solving.

6.1 A Tetrahedron and Four Spheres

Four spheres of radius 1 are contained in a regular tetrahedron in such a way that each is tangent to three faces of the tetrahedron and to the other three spheres. What is the side length of the tetrahedron?

In a regular tetrahedron, let $r$ be the ratio of the distance between its center and a face to the side length. Line segments joining the four centers of the mutually tangent spheres form a small regular tetrahedron of side length 2. The small tetrahedron and the circumscribing tetrahedron have the same center, and their faces are one unit apart (the radii of the spheres). Hence, the side length of the circumscribing tetrahedron is $(2r + 1)/r$. We will find $r$ and thereby find this length.

Given a regular tetrahedron of side length 1, the altitude of a face has length $\sqrt{3}/2$ (by the Pythagorean theorem). Since the center of an equilateral triangle divides the altitudes in the ratio $2 : 1$, the distance between the center of a face and a vertex is $\sqrt{3}/3$ and hence (again by the Pythagorean theorem) the altitude of the tetrahedron is $\sqrt{6}/3$. Since the center
of a regular tetrahedron divides the altitudes in the ratio $3 : 1$, we conclude that $r = \sqrt{6}/12$ and therefore the side length of the circumscribing tetrahedron is $2 + 2\sqrt{6}$.

6.2 Alphabet Cubes

Suppose that you have 27 wooden cubes, all the same size. On each face of the first cube is the letter A. On each face of the second cube is the letter B, and so on, all the way to Z. The twenty-seventh cube is blank. Is it possible to arrange the cubes in a $3 \times 3 \times 3$ cube, with the blank cube in the middle, so that consecutive letters of the alphabet occur on cubes sharing a face?


It is impossible to accomplish this. Consider a checkerboard coloring of the cubes, using red and black, so that cubes that share a face have different colors. If there were such an arrangement, then as we go through the alphabet, we would change cube colors at each step. But this would mean that we have thirteen cubes of each color. However, the checkerboard coloring has fourteen cubes of one color and thirteen of the other. Omitting the middle cube makes the distribution of colors even more imbalanced, with fourteen of one color and twelve of the other color. Therefore, our desired path through the alphabet is impossible.

It is possible to solve the problem if consecutive alphabetic cubes have an edge or a face in common, and we can even make it so that this applies to A and Z.

H G E     T U V     S R Q
I F D     J   W     K M P
A B C     Z Y X     L N O
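The arrangement can be verified mechanically. The sketch below rests on an assumption about how the printed letters are to be read: the three $3 \times 3$ blocks are taken as consecutive layers of the cube, each read row by row, with the blank cube at the center of the middle layer. The names are illustrative. It checks that every pair of consecutive letters, including Z and A, sits on cubes sharing a face or an edge, and it recomputes the color counts used in the impossibility argument.

```python
from itertools import product

# Assumed reading of the printed arrangement: three layers, each given row by row.
LAYERS = [
    ["HGE", "IFD", "ABC"],
    ["TUV", "J W", "ZYX"],
    ["SRQ", "KMP", "LNO"],
]


def positions(layers):
    """Map each letter to its (x, y, z) cell; the blank cube is skipped."""
    pos = {}
    for z, layer in enumerate(layers):
        for y, row in enumerate(layer):
            for x, ch in enumerate(row):
                if ch != " ":
                    pos[ch] = (x, y, z)
    return pos


def share_edge_or_face(p, q):
    """Cells share a face if they differ by 1 in one coordinate, an edge if in two."""
    diffs = sorted(abs(a - b) for a, b in zip(p, q))
    return diffs in ([0, 0, 1], [0, 1, 1])


def is_valid_cycle(layers):
    pos = positions(layers)
    letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    return all(share_edge_or_face(pos[a], pos[b])
               for a, b in zip(letters, letters[1:] + letters[0]))


def color_counts():
    """Checkerboard color counts of the 26 non-central cells of a 3 x 3 x 3 cube."""
    cells = [c for c in product(range(3), repeat=3) if c != (1, 1, 1)]
    even = sum(1 for c in cells if sum(c) % 2 == 0)
    return even, len(cells) - even


if __name__ == "__main__":
    print(is_valid_cycle(LAYERS), color_counts())
```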

6.3 A Triangle in an Ellipse

Show how to inscribe a triangle of maximum area in an ellipse.

What happens when the ellipse is a circle? You may conjecture that a triangle of maximum area inscribed in a circle is equilateral. Let's prove this by simple geometry.

Suppose that $ABC$ is a triangle of maximum area inscribed in a circle. We will show that it is equilateral. Let a line parallel to $AB$ be tangent to the circle at a point $C'$. There are two choices for $C'$; pick the one so that the distance between $AB$ and $C'$ is the greatest. If $C$ is any point on the circle other than $C'$, the area of $ABC$ will be less than the area of $ABC'$, since the altitude of the first triangle will be less than the altitude of the second triangle. Because $ABC$ is a maximum-area inscribed triangle, $C = C'$. Since $C$ lies on the perpendicular bisector of $AB$, which passes through the center of the circle, we conclude that $AC = BC$. By a similar argument, $BC = BA$, and therefore $ABC$ is an equilateral triangle.

Now that we know that a maximum-area triangle inscribed in a circle is an equilateral triangle, we can show how to find a maximum-area triangle inscribed in an ellipse. Without loss of generality, suppose that the ellipse has the equation

\[ \frac{x^2}{a^2} + \frac{y^2}{b^2} = 1, \]

or

\[ \frac{b^2 x^2}{a^2} + y^2 = b^2. \]

By the change of coordinates $x' = bx/a$, $y' = y$, the equation becomes

\[ x'^2 + y'^2 = b^2, \]

a circle of radius $b$. Thus, an ellipse may be thought of as a circle stretched by a constant factor. The stretch scales the area of every plane figure by that same factor. Hence, a maximum-area triangle inscribed in the circle is stretched to produce a maximum-area triangle inscribed in the ellipse.

Figure 6.1 shows the construction. In the diagram, the ellipse lies outside the circle, but it could just as well be the other way around. We place $A$ at an intersection point of the circle and ellipse, and let $B$ and $C$ be the points on the circle forming the equilateral triangle $ABC$. We draw the line containing $B$ and $C$, which intersects the ellipse at $D$ and $E$. Then $ADE$ is a maximum-area triangle inscribed in the ellipse.

Maximum-area triangles inscribed in ellipses come in lots of shapes. We can stretch any equilateral triangle inscribed in the circle.

Figure 6.1. Inscribing a maximum-area triangle in an ellipse.

6.4 About the Roots of a Cubic

The cubic polynomial

\[ x^3 + 11x^2 + 19x - 100 \]

has three roots, one real and two complex. What is the sum of their squares?

We could find the roots explicitly, square them, and add, but we will be more elegant and relate the roots to the polynomial's coefficients. A monic cubic polynomial with roots $r_1$, $r_2$, and $r_3$ can be factored as

\[ (x - r_1)(x - r_2)(x - r_3) = x^3 - (r_1 + r_2 + r_3)x^2 + (r_1 r_2 + r_1 r_3 + r_2 r_3)x - r_1 r_2 r_3. \]

Equating coefficients, we see that¹

\[ r_1 + r_2 + r_3 = -11, \qquad r_1 r_2 + r_1 r_3 + r_2 r_3 = 19, \qquad r_1 r_2 r_3 = 100. \]

Therefore, the sum of the squares of the roots is

\[ r_1^2 + r_2^2 + r_3^2 = (r_1 + r_2 + r_3)^2 - 2(r_1 r_2 + r_1 r_3 + r_2 r_3) = (-11)^2 - 2 \cdot 19 = 83. \]

Here is a method to calculate the sum of the $k$th powers of the roots of our polynomial, for any positive integer $k$. Let

\[ p_k = r_1^k + r_2^k + r_3^k, \quad k \ge 0. \]

Each root satisfies the equation

\[ x^3 = -11x^2 - 19x + 100, \]

the characteristic equation of the recurrence relation

\[ p_k = -11 p_{k-1} - 19 p_{k-2} + 100 p_{k-3}, \quad k \ge 3. \]

Hence, the sequence $\{p_k\}$ is given by this recurrence relation and the initial conditions

\[ p_0 = 3, \quad p_1 = -11, \quad p_2 = 83. \]

It's now a simple matter to use the recurrence formula to calculate the sum of the $k$th powers of the roots. For example, the sum of the cubes of the roots is

\[ p_3 = -11(83) - 19(-11) + 100(3) = -404. \]

What is the sum of the fourth powers of the roots? The answer is the year that Wolfgang Amadeus Mozart (1756–1791) wrote the opera Apollo et Hyacinthus.
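The recurrence is easy to run by machine. The sketch below is illustrative and not from the text: it computes the power sums from the recurrence and cross-checks them against the traces of powers of the companion matrix of the cubic, whose eigenvalues are the three roots. The companion-matrix check is an added device, chosen because it stays in exact integer arithmetic.

```python
def power_sums(k_max):
    """p_k for the roots of x^3 + 11x^2 + 19x - 100, computed from the recurrence."""
    p = [3, -11, 83]
    for k in range(3, k_max + 1):
        p.append(-11 * p[k - 1] - 19 * p[k - 2] + 100 * p[k - 3])
    return p[: k_max + 1]


# Companion matrix of x^3 + 11x^2 + 19x - 100; its eigenvalues are the roots.
COMPANION = [[0, 0, 100],
             [1, 0, -19],
             [0, 1, -11]]


def mat_mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def trace_of_power(k):
    """Trace of COMPANION**k, which equals the sum of the k-th powers of the roots."""
    result = [[1, 0, 0], [0, 1, 0], [0, 0, 1]]
    for _ in range(k):
        result = mat_mul(result, COMPANION)
    return sum(result[i][i] for i in range(3))


if __name__ == "__main__":
    print(power_sums(4))
    print([trace_of_power(k) for k in range(5)])
```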

¹ These relations are called Viète's formulas, named after François Viète [Vieta] (1540–1603).

6.5 Distance on Planet X

Planet X is a sphere of radius 1000 km. On Planet X, a city is located at $10^\circ$ N latitude and $20^\circ$ E longitude, while another city is located at $30^\circ$ N latitude and $60^\circ$ E longitude. What is the shortest distance between the two cities along the surface of Planet X?

Nothing is special about the coordinates given for the two cities, so we want a method for finding the minimum distance along the surface of Planet X between any two points. A shortest path between two points on a sphere is called a geodesic. A geodesic is an arc of a great circle, so if we know the central angle of the great circle containing the points, we can multiply by the radius to get its length.

Latitude and longitude are essentially spherical coordinates. If we convert them to Cartesian coordinates of vectors $v$ and $w$, then we can use the dot product formula

\[ \cos \alpha = \frac{v \cdot w}{|v|\,|w|} \]

to find the central angle $\alpha$ separating the two cities, and multiply $\alpha$ (in radians) by the radius to find the length of the geodesic.

We take the spherical coordinates for a point on the surface of a planet with radius $r$ to be $(r, \phi, \theta)$, where $\phi$ is the latitude and $\theta$ is the longitude of the point. The corresponding Cartesian coordinates are

\[ (x, y, z) = (r \cos\phi \cos\theta,\ r \cos\phi \sin\theta,\ r \sin\phi). \]

Let the two points have spherical coordinates $(r, \phi_1, \theta_1)$ and $(r, \phi_2, \theta_2)$ and hence Cartesian coordinates

\[ v = (r \cos\phi_1 \cos\theta_1,\ r \cos\phi_1 \sin\theta_1,\ r \sin\phi_1) \quad \text{and} \quad w = (r \cos\phi_2 \cos\theta_2,\ r \cos\phi_2 \sin\theta_2,\ r \sin\phi_2). \]

By the dot product formula, the cosine of the central angle ($\cos \alpha$) is

\[ \cos\phi_1 \cos\phi_2 \cos\theta_1 \cos\theta_2 + \cos\phi_1 \cos\phi_2 \sin\theta_1 \sin\theta_2 + \sin\phi_1 \sin\phi_2, \]

and the length of the geodesic is

\[ r \cos^{-1}\left( \cos\phi_1 \cos\phi_2 \cos\theta_1 \cos\theta_2 + \cos\phi_1 \cos\phi_2 \sin\theta_1 \sin\theta_2 + \sin\phi_1 \sin\phi_2 \right). \]

Figure 6.2. A tilted circle. [Axes x, y, z; labeled points (1,0,0), (0,1,0), (0,0,1), and (1/3,1/3,1/3).]

In our problem, $r = 1000$ km, $\phi_1 = 10^\circ$, $\theta_1 = 20^\circ$, $\phi_2 = 30^\circ$, and $\theta_2 = 60^\circ$. It follows that the two alien cities are separated by a central angle of approximately 0.74 radians ($42^\circ$) and a distance of approximately 740 km.

We can use the dot product method to prove a formula of spherical trigonometry known as the spherical law of cosines. Consider a spherical triangle on a unit sphere, with angles $A$, $B$, $C$ and opposite sides $a$, $b$, and $c$. The sides of the triangle are arcs of great circles of the sphere. The angles are determined by the planes of the great circles defining the sides. Since the sphere has radius 1, the arcs subtend central angles $a$, $b$, and $c$. Set up a Cartesian coordinate system so that the $xy$-plane passes through $A$, $C$, and the center of the sphere. Choose the $x$-axis so that $C$ has coordinates $(1, 0, 0)$. By plane trigonometry, $A$ has coordinates $(\cos b, \sin b, 0)$ and $B$ has coordinates $(\cos a, \sin a \cos C, \sin a \sin C)$. The dot product formula gives us the spherical law of cosines:

\[ \cos c = \cos a \cos b + \sin a \sin b \cos C. \]
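The dot product method for the Planet X distance translates directly into a few lines of code. This is a minimal sketch with illustrative function names; it takes latitude and longitude in degrees, converts to Cartesian coordinates as described above, and multiplies the central angle by the radius.

```python
from math import acos, cos, degrees, radians, sin


def to_cartesian(r, lat_deg, lon_deg):
    """Cartesian coordinates of a surface point given latitude and longitude in degrees."""
    lat, lon = radians(lat_deg), radians(lon_deg)
    return (r * cos(lat) * cos(lon), r * cos(lat) * sin(lon), r * sin(lat))


def central_angle(lat1, lon1, lat2, lon2):
    """Central angle in radians between two points, from the dot product of unit vectors."""
    v = to_cartesian(1.0, lat1, lon1)
    w = to_cartesian(1.0, lat2, lon2)
    dot = sum(a * b for a, b in zip(v, w))
    return acos(max(-1.0, min(1.0, dot)))  # clamp to guard against rounding


def geodesic_length(r, lat1, lon1, lat2, lon2):
    return r * central_angle(lat1, lon1, lat2, lon2)


if __name__ == "__main__":
    alpha = central_angle(10, 20, 30, 60)
    print(round(alpha, 2), round(degrees(alpha)), round(geodesic_length(1000, 10, 20, 30, 60)))
```

Clamping the dot product before calling `acos` is a defensive choice for nearly identical or nearly antipodal points, where rounding can push the value slightly outside $[-1, 1]$.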

6.6 A Tilted Circle

Find parametric equations for the circle in $\mathbb{R}^3$ that passes through the points $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$. See Figure 6.2.

Parametric equations are equations for $x$, $y$, $z$ in terms of a new variable $t$. They allow us to trace the circle in $\mathbb{R}^3$ as a vector function of one variable. Sometimes a mathematical problem can be solved by taking a hint from physics. The unit circle centered at the origin in $\mathbb{R}^2$ is given by the parametric equations

\[ x = \cos t, \quad y = \sin t, \quad 0 \le t < 2\pi. \]

The components of the velocity of a point moving around the circle at constant speed are

\[ x' = -\sin t, \quad y' = \cos t, \quad 0 \le t < 2\pi. \]

The variable $t$, for time, is defined so that the point moves around the circle once every $2\pi$
units of time. Acceleration of the point is given by the second derivatives: x00 cos t; D y00 sin t; 0 t < 2: D Ä The acceleration vector is the opposite of the position vector: x00 x; y00 y: D D This applies to any circular motion at constant normalized speed, where the circle is centered at the origin. In three dimensions, the position function .x.t/; y.t/; z.t// satisﬁes x00 x; y00 y; z00 z: D D D Solutions to these equations constitute simple harmonic motion, a linear combination of sine and cosine functions. Thus

$$x = A_1 \cos t + B_1 \sin t, \quad y = A_2 \cos t + B_2 \sin t, \quad z = A_3 \cos t + B_3 \sin t,$$

for some constants $A_1$, $B_1$, $A_2$, $B_2$, $A_3$, $B_3$.

In our problem, the center of the circle is $(1/3, 1/3, 1/3)$, as this point is equidistant from $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$, and also coplanar with them (since the sum of its coordinates is 1). The radius of the circle is the distance between its center and $(1, 0, 0)$, that is, $\sqrt{6}/3$, but we don't need this value. We add the center to our parametric equations thus far:

$$x = \tfrac{1}{3} + A_1 \cos t + B_1 \sin t,$$
$$y = \tfrac{1}{3} + A_2 \cos t + B_2 \sin t,$$
$$z = \tfrac{1}{3} + A_3 \cos t + B_3 \sin t, \quad 0 \le t < 2\pi.$$

Given that $x(0) = 1$, we find $A_1 = 2/3$, and from $x(2\pi/3) = 0$, we find $B_1 = 0$. So

$$x = \tfrac{1}{3} + \tfrac{2}{3} \cos t.$$

By symmetry,

$$x = \tfrac{1}{3} + \tfrac{2}{3} \cos t,$$
$$y = \tfrac{1}{3} + \tfrac{2}{3} \cos\left(t - \tfrac{2\pi}{3}\right),$$
$$z = \tfrac{1}{3} + \tfrac{2}{3} \cos\left(t - \tfrac{4\pi}{3}\right), \quad 0 \le t < 2\pi.$$

The circle passes through the points $(1, 0, 0)$, $(0, 1, 0)$, and $(0, 0, 1)$ when $t = 0$, $2\pi/3$, and $4\pi/3$, respectively.

Another problem that can be solved by appealing to physical intuition is the construction of the Fermat point of a triangle [26, pp. 34–36]. The book [30] is devoted to the method of using physical reasoning to solve math problems.
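The parametrization of the tilted circle is easy to test numerically. The following Python sketch evaluates the three coordinate functions and checks that every sampled point lies in the plane $x + y + z = 1$ at distance $\sqrt{6}/3$ from the center. The function names and the sample count are illustrative assumptions.

```python
import math

def tilted_circle(t):
    """Point on the circle through (1,0,0), (0,1,0), (0,0,1) at parameter t."""
    x = 1 / 3 + (2 / 3) * math.cos(t)
    y = 1 / 3 + (2 / 3) * math.cos(t - 2 * math.pi / 3)
    z = 1 / 3 + (2 / 3) * math.cos(t - 4 * math.pi / 3)
    return (x, y, z)

def max_deviation(samples=360):
    """Largest error in the plane equation or the radius over sampled points."""
    center = (1 / 3, 1 / 3, 1 / 3)
    radius = math.sqrt(6) / 3
    worst = 0.0
    for k in range(samples):
        point = tilted_circle(2 * math.pi * k / samples)
        plane_error = abs(sum(point) - 1)
        radius_error = abs(math.dist(point, center) - radius)
        worst = max(worst, plane_error, radius_error)
    return worst

# The three prescribed points occur at t = 0, 2*pi/3, 4*pi/3.
for t in (0, 2 * math.pi / 3, 4 * math.pi / 3):
    print([round(coordinate, 12) for coordinate in tilted_circle(t)])
print(max_deviation())
```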


6.7 The Millionth Fibonacci Number

What are the first three digits of $F_{1000000}$, the one millionth Fibonacci number? The Fibonacci sequence $\{F_n\}$ is defined by

$$F_0 = 0, \quad F_1 = 1, \quad F_n = F_{n-1} + F_{n-2}, \quad n \ge 2.$$

It isn't practical to compute $F_{1000000}$, even with a computer, because it has too many digits. But we are only asked to find the first three digits. We use an explicit formula for the $n$th Fibonacci number:

$$F_n = \frac{\phi^n - \hat{\phi}^n}{\sqrt{5}}, \quad n \ge 0,$$

where

$$\phi = \frac{1 + \sqrt{5}}{2}, \quad \hat{\phi} = \frac{1 - \sqrt{5}}{2}.$$

The constant $\phi$ is the golden ratio described in Chapter 1. Since $\phi \approx 1.6$ and $\hat{\phi} \approx -0.6$, the Fibonacci sequence grows like the exponential sequence $\{\phi^n/\sqrt{5}\}$. The difference between the two sequences becomes exponentially small as $n$ tends to infinity, and is therefore negligible. Thus

$$F_n \approx \frac{\phi^n}{\sqrt{5}}.$$

Taking the base-10 logarithm,

$$\log F_n \approx n \log \phi - \log \sqrt{5}.$$

For $n = 1000000$, we have

$$\log F_{1000000} \approx 1000000 \log \phi - \log \sqrt{5} \approx 208987.2908.$$

We see that the number of digits in the millionth Fibonacci number is 208,988. To get the first three digits, we compute

$$10^{0.2908} \approx 1.953,$$

so the first three digits of the millionth Fibonacci number are 195. In scientific notation, the millionth Fibonacci number is

$$F_{1000000} \approx 1.95 \times 10^{208987}.$$

It isn't difficult to find the last three digits of the millionth Fibonacci number. We can run the Fibonacci recurrence relation, keeping only the last three digits at each step. The millionth Fibonacci number ends in 875. The last three digits of the Fibonacci numbers repeat every 1500 terms. Since $F_{1000}$ ends in 875, and 1500 divides $1000000 - 1000$, this confirms that the last three digits of $F_{1000000}$ are 875.
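Both calculations are short enough to carry out in a few lines. The Python sketch below obtains the leading digits from the logarithm of $\phi^n/\sqrt{5}$ and the trailing digits by running the recurrence modulo a power of 10. The function names are illustrative, and the leading-digit routine assumes that double-precision logarithms are accurate enough, which is a reasonable assumption for $n$ around $10^6$.

```python
import math

def leading_digits(n, count=3):
    """Return (first `count` digits, number of digits) of F_n via logarithms."""
    phi = (1 + math.sqrt(5)) / 2
    log_value = n * math.log10(phi) - math.log10(math.sqrt(5))
    exponent = math.floor(log_value)
    mantissa = 10 ** (log_value - exponent)      # lies in [1, 10)
    return int(mantissa * 10 ** (count - 1)), exponent + 1

def trailing_digits(n, count=3):
    """Return F_n modulo 10**count by iterating the recurrence."""
    modulus = 10 ** count
    a, b = 0, 1                                   # F_0, F_1
    for _ in range(n):
        a, b = b, (a + b) % modulus
    return a

print(leading_digits(10 ** 6))
print(trailing_digits(10 ** 6), trailing_digits(1000))
```

The exponent returned by `leading_digits` reproduces the digit count 208,988 quoted above.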


6.8 The End of a Conjecture

Let $\sigma$ be a permutation of the set $\{1, \ldots, n\}$, for some positive integer $n$. The order of $\sigma$ is the smallest positive integer $k$ such that $\sigma^k$ ($\sigma$ applied $k$ times) is the identity permutation. A natural question is, what is the greatest possible order of $\sigma$?

The order of a permutation is the least common multiple of the lengths of its disjoint cycles. Experimentation shows that the greatest possible order of a permutation of 10 elements is $2 \cdot 3 \cdot 5 = 30$, which occurs for permutations consisting of disjoint cycles of lengths 2, 3, and 5. Based on this and other small examples, we could make a conjecture:

Conjecture. If $n$ is the sum of consecutive prime numbers, $n = 2 + 3 + \cdots + p$, then the greatest possible order of a permutation of the set $\{1, \ldots, n\}$ is $2 \cdot 3 \cdots p$.

This conjecture is plausible but false. Disprove it.

To disprove a conjecture, it suffices to find an instance when it doesn't hold. How do we do this in the present case? We will start with a partition $n = 2 + 3 + \cdots + p$, for some $p$, and replace some of the primes by powers of these primes, in such a way that the sum of the terms is still $n$. Since the prime powers will have no common factors, we can hope that their least common multiple will be greater than the least common multiple of the terms that they replace. In fact, the conjecture is true for all primes $p$ up to 19. So let's look at the case $p = 23$. Consider the partition

$$100 = 2 + 3 + 5 + 7 + 11 + 13 + 17 + 19 + 23.$$

Replace the 2, 3, and 23 by 16, 9, 1, 1, and 1 (retaining the sum). Then the ratio of the least common multiple of the numbers in the second partition to the least common multiple of the numbers in the first partition is

$$\frac{16 \cdot 9}{2 \cdot 3 \cdot 23} = \frac{144}{138},$$

which is greater than 1. Hence, a permutation of 100 elements whose cycle lengths are the terms of the new partition will have a greater order than one given by the partition in the conjecture.

The function $g(n)$ that gives the greatest possible order of a permutation of $n$ elements is called Landau's function, named after Edmund Landau (1877–1938). There is no simple formula known for $g(n)$, although some of its properties are known. For instance, $\ln g(n)$ is asymptotic to $\sqrt{n \ln n}$, which means that

$$\lim_{n \to \infty} \frac{\ln g(n)}{\sqrt{n \ln n}} = 1.$$

A surprising characteristic of $g(n)$ is that it is constant for arbitrarily many consecutive values of $n$. See [33] for a very readable account of Landau's function.
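The counterexample can be confirmed by computing Landau's function directly. The Python sketch below uses a standard dynamic program over prime powers, a technique that is not described in the text. The names `primes_up_to` and `landau` are illustrative. The program relies on the fact that an optimal set of cycle lengths can be taken to be powers of distinct primes, padded with cycles of length 1.

```python
def primes_up_to(n):
    """All primes p <= n, by trial division (adequate for small n)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def landau(n):
    """Greatest order of a permutation of n elements."""
    # best[s] = largest product of powers of distinct primes whose sum is <= s
    best = [1] * (n + 1)
    for p in primes_up_to(n):
        # Go downward so best[s - q] still excludes the prime p.
        for s in range(n, p - 1, -1):
            q = p
            while q <= s:
                best[s] = max(best[s], best[s - q] * q)
                q *= p
    return best[n]

primorial_23 = 2 * 3 * 5 * 7 * 11 * 13 * 17 * 19 * 23
print(landau(10), landau(100), primorial_23)
```

For $n = 100$ the computed value exceeds $2 \cdot 3 \cdots 23 = 223092870$, in agreement with the replacement argument above.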

6.9 A Zero-Sum Game

Consider a game in which two players, A and B, choose one of two alternatives, $x$ and $y$. Based on their choices, there is a payoff from player B to player A according to the following table.


                        B
                  x ($q_1$)   y ($q_2$)
    A   x ($p_1$)    +3          −2
        y ($p_2$)    −1          +2

For example, if A and B both choose $x$, then B gives 3 points to A. Suppose that A and B play this game repeatedly, at each turn randomly choosing $x$ and $y$ according to fixed probabilities $p_1$, $p_2$, $q_1$, $q_2$, as shown in the table. What is the expected long-term outcome of the game?

Let $\pi$ be the expected payoff to player A. Thus

$$\pi = 3p_1q_1 - 2p_1q_2 - p_2q_1 + 2p_2q_2.$$

Both A and B are trying to maximize their expected gain. Player A should choose probabilities $p_i$ that maximize $\pi$ no matter what probabilities $q_j$ player B chooses. At the same time, player B should choose probabilities $q_j$ that minimize $\pi$ no matter what probabilities $p_i$ player A chooses. This fundamental principle of zero-sum games is summarized by the equilibrium formula

$$\max_{p_i} \min_{q_j} \pi = \min_{q_j} \max_{p_i} \pi.$$

Geometrically, this min-max value is the saddle point of the surface given by $\pi$ in three-dimensional space where the independent variables are $p = p_1$ and $q = q_1$. Let's calculate it. We have

$$\pi(p, q) = 3pq - 2p(1 - q) - (1 - p)q + 2(1 - p)(1 - q) = 8pq - 4p - 3q + 2.$$

At a critical point, the partial derivatives are $\partial\pi/\partial p = 8q - 4 = 0$, so that $q = 1/2$, and $\partial\pi/\partial q = 8p - 3 = 0$, so that $p = 3/8$. The determinant of the Hessian matrix is

$$\begin{vmatrix} \dfrac{\partial^2 \pi}{\partial p^2} & \dfrac{\partial^2 \pi}{\partial p\,\partial q} \\[2ex] \dfrac{\partial^2 \pi}{\partial q\,\partial p} & \dfrac{\partial^2 \pi}{\partial q^2} \end{vmatrix} = \begin{vmatrix} 0 & 8 \\ 8 & 0 \end{vmatrix} = -64 < 0.$$

This confirms that the critical point is a saddle point, as indicated in Figure 6.3. The payoff for A is $\pi(3/8, 1/2) = 1/2$. Therefore, B should stop playing this game.

The min-max equilibrium result for zero-sum games was formulated by John von Neumann (1903–1957). If the payoffs are $\pi_{11}$, $\pi_{12}$, $\pi_{21}$, and $\pi_{22}$, as shown below, can you determine the condition on the $\pi$'s that guarantees the existence of a min-max solution?

                   B
               x            y
    A   x   $\pi_{11}$   $\pi_{12}$
        y   $\pi_{21}$   $\pi_{22}$

An excellent reference on game theory, including zero-sum games, is [50].
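The saddle point has a simple interpretation that can be checked with exact arithmetic: when A plays $x$ with probability $3/8$, the expected payoff no longer depends on B's choice, and when B plays $x$ with probability $1/2$, it no longer depends on A's choice. The Python sketch below verifies this for the payoff table of this section. The function and variable names are illustrative.

```python
from fractions import Fraction

# Payoffs from B to A; rows are A's choices (x, y), columns are B's choices (x, y).
PAYOFF = ((3, -2), (-1, 2))

def expected_payoff(p, q, payoff=PAYOFF):
    """Expected payoff to A when A plays x with probability p and B with probability q."""
    row = (p, 1 - p)
    col = (q, 1 - q)
    return sum(row[i] * col[j] * payoff[i][j] for i in range(2) for j in range(2))

p_star, q_star = Fraction(3, 8), Fraction(1, 2)
grid = [Fraction(k, 10) for k in range(11)]

# Distinct payoffs seen when one player uses the equilibrium probability.
values_when_A_optimal = {expected_payoff(p_star, q) for q in grid}
values_when_B_optimal = {expected_payoff(p, q_star) for p in grid}
print(values_when_A_optimal, values_when_B_optimal)
```

Each set contains the single value $1/2$, the value of the game to A.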


Figure 6.3. The payoffs in a zero-sum game. (The surface $\pi(p, q)$ over the unit square, with corner values $\pi = 2$ at $(0, 0)$, $\pi = -2$ at $(1, 0)$, $\pi = -1$ at $(0, 1)$, $\pi = 3$ at $(1, 1)$, and saddle value $\pi = 1/2$.)

6.10 An Expected Maximum

A man has a bottle of vitamin pills from which he takes a half pill per day. Each day, he selects a pill at random from the bottle. If it is a whole pill he cuts it in half, takes a half pill, and puts the other half back in the bottle. If it is a half pill, he takes it. He continues this daily regimen until the bottle is empty. If the bottle starts with $n$ pills, show that the expected maximum number of half pills in the bottle is asymptotic to $n/e$ as $n$ tends to infinity. (The savvy problem-solver won't be surprised by the appearance of $e$ in a probability problem.)

We will set up and solve a system of differential equations. Let $p(t)$ be the expected number of whole pills and $h(t)$ the expected number of half pills on day $t$, where $0 \le t \le 2n$. The probability that a whole pill is selected on day $t$ is $p(t)/(p(t) + h(t))$ and the probability that a half pill is selected is $h(t)/(p(t) + h(t))$. If it is a whole pill, the number of whole pills goes down by 1 while the number of half pills goes up by 1. If it is a half pill, the number of whole pills is unchanged while the number of half pills goes down by 1. These observations give rise to a system of difference equations:

$$p(t + 1) = p(t) - \frac{p(t)}{p(t) + h(t)},$$
$$h(t + 1) = h(t) + \frac{p(t)}{p(t) + h(t)} - \frac{h(t)}{p(t) + h(t)}, \quad 0 \le t \le 2n - 1,$$

where $p(0) = n$ and $h(0) = 0$.

We replace this discrete process with a continuous one, i.e., a system of differential equations:

$$\frac{dp}{dt} = -\frac{p}{p + h},$$
$$\frac{dh}{dt} = \frac{p - h}{p + h},$$

where $p$ and $h$ are functions of $t$, with $p(0) = n$ and $h(0) = 0$. This approximation to a discrete system by a continuous one becomes better as $n$ increases.

We see that $dh/dt$ is positive until $h = p$. Therefore, the maximum value of $h$ occurs when $h = p$. Dividing the two differential equations gives the single differential equation

$$\frac{dh}{dp} = \frac{h - p}{p},$$


where $h$ is a function of $p$, and $h = 0$ when $p = n$. By inspection,

$$h = p \ln(n/p),$$

since then $dh/dp = \ln(n/p) - 1 = h/p - 1$. With $p = h$, this yields

$$h = n/e.$$

Thus, as $n$ increases, the expected maximum number of half pills approaches $n/e$.

What do the curves $p(t)$ and $h(t)$ look like? Since a whole pill is consumed in two days (not necessarily consecutive) and a half pill is consumed in one day,

$$2p + h + t = 2n,$$

and hence

$$t = 2n - 2p - p \ln(n/p).$$

(It's impossible to write $p$ and $h$ in terms of elementary functions of $t$.) The maximum value of $h$ occurs when $p = n/e$, and this gives

$$t = (2 - 3/e)n.$$

The curves $p(t)$ and $h(t)$ are shown below.

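[Figure: the curves $p(t)$ and $h(t)$ for $0 \le t \le 2n$. The curve $p(t)$ starts at height $n$; the curve $h(t)$ starts at 0 and reaches its maximum $n/e$ at $t = (2 - 3/e)n$.]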
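The limit $n/e$ can be compared with a direct simulation of the pill bottle. The Python sketch below is a Monte Carlo experiment, not the differential-equation argument of the text: it plays out the daily regimen with a seeded random number generator and averages the observed maximum number of half pills. The function names, the seed, and the sample sizes are assumptions chosen for illustration.

```python
import math
import random

def max_half_pills(n, rng):
    """Simulate one bottle of n whole pills; return the peak number of half pills."""
    whole, half = n, 0
    peak = 0
    while whole + half > 0:
        # A whole pill is drawn with probability whole / (whole + half).
        if rng.random() * (whole + half) < whole:
            whole -= 1
            half += 1
        else:
            half -= 1
        peak = max(peak, half)
    return peak

def average_peak_ratio(n, trials, seed=0):
    """Average of (peak number of half pills) / n over several simulated bottles."""
    rng = random.Random(seed)
    total = sum(max_half_pills(n, rng) for _ in range(trials))
    return total / (trials * n)

ratio = average_peak_ratio(10_000, 10)
print(ratio, 1 / math.e)
```

For large $n$ the simulated ratio should be close to $1/e \approx 0.368$. Random fluctuations push the observed maximum slightly above the continuous prediction.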

6.11 Walks on a Graph

A graph is a set of points (vertices) and a collection of edges joining pairs of points. A walk on a graph is a sequence of vertices such that consecutive vertices are adjacent (joined by an edge). Figure 6.4 depicts a graph with four vertices and three edges. An example of a walk on this graph is the sequence 1, 2, 1, 2, 3. The length of a walk is the number of edges traversed in the graph. The walk 1, 2, 1, 2, 3 has length 4.

Figure 6.4. A graph to walk on. (The vertices 1, 2, 3, 4 lie in a row, with edges joining 1 to 2, 2 to 3, and 3 to 4.)


Let $a_{ij}^{(n)}$ be the number of walks of length $n$ from vertex $i$ to vertex $j$, where $n \ge 1$ and $1 \le i, j \le 4$. Let us find a formula for $a_{ij}^{(n)}$.

To solve this problem, we introduce the adjacency matrix of the graph:

$$A = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}.$$

The $ij$ entry of $A$ is 1 if there is an edge from vertex $i$ to vertex $j$, and 0 otherwise. By definition, $A$ is a symmetric matrix.

One of the useful properties of an adjacency matrix is that its square tells us the number of walks of length 2 between any two vertices:

$$A^2 = \begin{bmatrix} 1 & 0 & 1 & 0 \\ 0 & 2 & 0 & 1 \\ 1 & 0 & 2 & 0 \\ 0 & 1 & 0 & 1 \end{bmatrix}.$$

For example, the 12 entry of $A^2$ is 0, and there are no walks of length 2 from vertex 1 to vertex 2.

Why does the $ij$ entry of $A^2$ equal the number of walks of length 2 from $i$ to $j$? Consider what happens when we form the matrix product $A^2$. Since

$$A = \begin{bmatrix} a_{11}^{(1)} & a_{12}^{(1)} & a_{13}^{(1)} & a_{14}^{(1)} \\ a_{21}^{(1)} & a_{22}^{(1)} & a_{23}^{(1)} & a_{24}^{(1)} \\ a_{31}^{(1)} & a_{32}^{(1)} & a_{33}^{(1)} & a_{34}^{(1)} \\ a_{41}^{(1)} & a_{42}^{(1)} & a_{43}^{(1)} & a_{44}^{(1)} \end{bmatrix},$$

the $ij$ entry of $A^2$ is

$$a_{i1}^{(1)}a_{1j}^{(1)} + a_{i2}^{(1)}a_{2j}^{(1)} + a_{i3}^{(1)}a_{3j}^{(1)} + a_{i4}^{(1)}a_{4j}^{(1)}.$$

Each term $a_{ik}^{(1)}a_{kj}^{(1)}$ is the product of two numbers equal to 0 or 1, which is non-zero if and only if $a_{ik}^{(1)} = 1$ and $a_{kj}^{(1)} = 1$, that is, the vertices $i$ and $k$ are adjacent and the vertices $k$ and $j$ are adjacent. This happens precisely when there is a walk of length 2 from $i$ to $j$ through $k$. Since we sum over all vertices $k$, the $ij$ entry of $A^2$ is $a_{ij}^{(2)}$.

For $n \ge 1$, the $ij$ entry of $A^n$ is $a_{ij}^{(n)}$, the number of walks of length $n$ from $i$ to $j$, which can be proved by mathematical induction.

Let's investigate some higher powers of $A$. We have

$$A^3 = \begin{bmatrix} 0 & 2 & 0 & 1 \\ 2 & 0 & 3 & 0 \\ 0 & 3 & 0 & 2 \\ 1 & 0 & 2 & 0 \end{bmatrix}, \quad A^4 = \begin{bmatrix} 2 & 0 & 3 & 0 \\ 0 & 5 & 0 & 3 \\ 3 & 0 & 5 & 0 \\ 0 & 3 & 0 & 2 \end{bmatrix},$$


and

$$A^5 = \begin{bmatrix} 0 & 5 & 0 & 3 \\ 5 & 0 & 8 & 0 \\ 0 & 8 & 0 & 5 \\ 3 & 0 & 5 & 0 \end{bmatrix}.$$

It appears that the Fibonacci numbers are involved. The Fibonacci numbers $F_n$ are defined by the recurrence formula

$$F_0 = 0, \quad F_1 = 1, \quad F_n = F_{n-1} + F_{n-2}, \quad n \ge 2,$$

so

$$\{F_n\} = \{0, 1, 1, 2, 3, 5, 8, 13, 21, 34, \ldots\}.$$

We claim that

$$A^n = \begin{cases} \begin{bmatrix} F_{n-1} & 0 & F_n & 0 \\ 0 & F_{n+1} & 0 & F_n \\ F_n & 0 & F_{n+1} & 0 \\ 0 & F_n & 0 & F_{n-1} \end{bmatrix}, & \text{for } n \text{ even}, \\[6ex] \begin{bmatrix} 0 & F_n & 0 & F_{n-1} \\ F_n & 0 & F_{n+1} & 0 \\ 0 & F_{n+1} & 0 & F_n \\ F_{n-1} & 0 & F_n & 0 \end{bmatrix}, & \text{for } n \text{ odd}. \end{cases}$$

We have verified this for $n = 1$ and $n = 2$. Suppose that the formula holds for an even value of $n$. Then

$$A^{n+1} = \begin{bmatrix} F_{n-1} & 0 & F_n & 0 \\ 0 & F_{n+1} & 0 & F_n \\ F_n & 0 & F_{n+1} & 0 \\ 0 & F_n & 0 & F_{n-1} \end{bmatrix} \begin{bmatrix} 0 & 1 & 0 & 0 \\ 1 & 0 & 1 & 0 \\ 0 & 1 & 0 & 1 \\ 0 & 0 & 1 & 0 \end{bmatrix}$$

$$= \begin{bmatrix} 0 & F_{n-1} + F_n & 0 & F_n \\ F_{n+1} & 0 & F_{n+1} + F_n & 0 \\ 0 & F_n + F_{n+1} & 0 & F_{n+1} \\ F_n & 0 & F_n + F_{n-1} & 0 \end{bmatrix}$$

$$= \begin{bmatrix} 0 & F_{n+1} & 0 & F_n \\ F_{n+1} & 0 & F_{n+2} & 0 \\ 0 & F_{n+2} & 0 & F_{n+1} \\ F_n & 0 & F_{n+1} & 0 \end{bmatrix}.$$

This is the correct formula for $n + 1$. Similarly, we can show that if the formula holds for an odd value of $n$, then it holds for $n + 1$. It follows by mathematical induction that the formula holds for all $n \ge 1$. The formula gives a simple way to find $a_{ij}^{(n)}$.
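The claimed formula for $A^n$ can be tested against direct matrix multiplication. The Python sketch below uses plain nested lists so that it needs no external library. The helper names are illustrative, and `fib` uses the convention $F_0 = 0$, $F_1 = 1$ of this section.

```python
A = [[0, 1, 0, 0],
     [1, 0, 1, 0],
     [0, 1, 0, 1],
     [0, 0, 1, 0]]

def mat_mult(X, Y):
    size = len(X)
    return [[sum(X[i][k] * Y[k][j] for k in range(size)) for j in range(size)]
            for i in range(size)]

def mat_power(M, n):
    """M raised to the power n >= 1 by repeated multiplication."""
    result = M
    for _ in range(n - 1):
        result = mat_mult(result, M)
    return result

def fib(n):
    a, b = 0, 1
    for _ in range(n):
        a, b = b, a + b
    return a

def claimed_power(n):
    """The matrix that the text claims equals A**n."""
    f_prev, f_n, f_next = fib(n - 1), fib(n), fib(n + 1)
    if n % 2 == 0:
        return [[f_prev, 0, f_n, 0],
                [0, f_next, 0, f_n],
                [f_n, 0, f_next, 0],
                [0, f_n, 0, f_prev]]
    return [[0, f_n, 0, f_prev],
            [f_n, 0, f_next, 0],
            [0, f_next, 0, f_n],
            [f_prev, 0, f_n, 0]]

print(all(mat_power(A, n) == claimed_power(n) for n in range(1, 21)))
```

The entry in row $i$, column $j$ of `mat_power(A, n)` (with rows and columns numbered from 1) is the number of walks of length $n$ from vertex $i$ to vertex $j$.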


6.12 Rotations of a Grid

Let a $2 \times 3$ grid be given, containing the integers 1 through 6, as in Figure 6.5. Let $L$ (for "left") be the operation of rotating the left-most $2 \times 2$ sub-grid $90^\circ$ clockwise, and $R$ (for "right") be the operation of rotating the right-most $2 \times 2$ sub-grid $90^\circ$ clockwise. That is, $L$ is the permutation $(1, 2, 5, 4)(3)(6)$ and $R$ is the permutation $(2, 3, 6, 5)(1)(4)$. Given successive applications of these two operations, how many different permutations of the grid can result? What is the group?

    1 2 3
    4 5 6

Figure 6.5. A $2 \times 3$ grid.

We are faced with two questions: how many elements are in the group and what is its structure? To start, we look at some combinations of group elements.

Using the rotations $L$ and $R$, we can put any of the numbers in any cell of the grid. Let's say that we put a selected number in the $(1, 1)$ position of the grid. Then we can put any of the remaining numbers in the $(2, 1)$ position, using powers of $L^2RL^{-1}$, which fixes the $(1, 1)$ position and moves the other cells in a 5-cycle, and any of the remaining entries in the $(1, 2)$ position using powers of $R$. Thus, we see that there are at least $6 \cdot 5 \cdot 4 = 120$ permutations of the grid.

Because $120 = 5!$, we could guess that the group is isomorphic to the group of permutations of a five-element set, for this group has order $5!$. This group, denoted by $S_5$, is called the symmetric group of degree 5. We will prove this is correct.

To prove that the group is isomorphic to $S_5$, we are confronted with the question, what five things can we permute? In an effort to find a five somewhere in the problem, we notice that the number of pairs of cells of the grid is $\binom{6}{2} = 15$, so perhaps we should put the pairs of cells into five groups of three each, as

$$A = \{\{1, 4\}, \{2, 6\}, \{3, 5\}\}$$
$$B = \{\{1, 3\}, \{2, 5\}, \{4, 6\}\}$$
$$C = \{\{1, 5\}, \{2, 4\}, \{3, 6\}\}$$
$$D = \{\{1, 2\}, \{3, 4\}, \{5, 6\}\}$$
$$E = \{\{1, 6\}, \{2, 3\}, \{4, 5\}\}.$$

It's helpful to picture the sets with lines representing the pairs of cells. Here is the picture for the set $A$.


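[Picture of the set $A$: the $2 \times 3$ grid with a line segment joining the two cells of each of the pairs $\{1, 4\}$, $\{2, 6\}$, and $\{3, 5\}$.]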

The operations $L$ and $R$ move each set $A$, $B$, $C$, $D$, and $E$ to another such set. That is, the sets are permuted by $L$ and $R$. Specifically,

$$L = (A, D, B, E)(C) \quad \text{and} \quad R = (A)(B, E, C, D).$$

At this point we know that the group is a subgroup of $S_5$. Reading products from left to right (the leftmost operation is applied first), we calculate

$$LR = (A, B, C, D, E),$$

and

$$RLR = (A, B)(C, E, D).$$

Applying $RLR$ three times, the 3-cycle disappears and we are left with a transposition:

$$(RLR)^3 = (A, B)(C)(D)(E).$$

Hence, the group contains a 5-cycle and a transposition of adjacent terms in the 5-cycle ($A$ and $B$). It is well known (e.g., see [25, p. 118]) that they generate $S_5$.

Let's show that the transposition $(1, 2)$ and the cycle $(1, 2, \ldots, n)$ generate all permutations of $\{1, 2, \ldots, n\}$. We'll demonstrate this in the case $n = 5$ but the same argument works in general.

We can generate every transposition of consecutive numbers. The technique is conjugation, which sends an element $x$ to a new element $g^{-1}xg$. Thus

$$(5, 4, 3, 2, 1)(1, 2)(1, 2, 3, 4, 5) = (2, 3)(1)(4)(5),$$

and we have generated the transposition $(2, 3)$. Continuing, we generate $(3, 4)$, $(4, 5)$, and $(5, 1)$.

Now that we have transpositions of consecutive numbers, we can generate all transpositions. For instance,

$$(2, 3)(3, 4)(4, 5)(3, 4)(2, 3) = (2, 5).$$

So we have the transposition $(2, 5)$. As an exercise, show how to obtain the transposition $(3, 5)$.

Now that we have all transpositions, it is easy to obtain any cycle. For instance,

$$(3, 5)(3, 1)(3, 4) = (3, 5, 1, 4).$$

Finally, since all permutations are products of cycles, we are done.

An alternative presentation of the group is worth considering. We relabel the entries of the grid.


    0 1 3
    ∞ 2 4

The entries are the residue classes of integers modulo 5, i.e., the numbers 0, 1, 2, 3, and 4, together with $\infty$. We define two functions

$$f(x) = \frac{1}{2x + 1}, \quad g(x) = 3x.$$

You can verify that $f$ represents the operation $L$ (it rotates the left-most $2 \times 2$ sub-grid by $90^\circ$), and $g$ represents $R$ (it rotates the right-most $2 \times 2$ sub-grid by $90^\circ$). Remember to reduce each value of the functions modulo 5. For example,

$$f\colon\ 0 \mapsto 1 \mapsto \frac{1}{3} = \frac{2}{6} = \frac{2}{1} = 2 \mapsto \frac{1}{5} = \frac{1}{0} = \infty \mapsto \frac{1}{\infty} = 0.$$

The resulting group of compositions of $f$ and $g$ is called the group of linear fractional transformations of the five-element field together with $\infty$.

The problem can be generalized to allow any rotation of any $2 \times 2$ sub-grid of an $m \times n$ grid, where $2 \le m \le n$. The group is the full symmetric group of degree $mn$, except in the cases $m = n = 2$, when we get a cyclic group of order 4, and $m = 2$, $n = 3$, when we get $S_5$. To show this, we can use the fact that the symmetric group on $\{1, 2, \ldots, n\}$ is generated by transpositions of the form $(k, k + 1)$, where $1 \le k \le n - 1$.
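The order of the group generated by the two rotations of the $2 \times 3$ grid can be found by exhaustive search. The Python sketch below closes the set $\{L, R\}$ under composition and also checks that the maps $x \mapsto 1/(2x + 1)$ and $x \mapsto 3x$ on the residues modulo 5 together with $\infty$ act on the cells as $L$ and $R$. The labelling of the cells by $0, 1, 3, \infty, 2, 4$ (row by row) is an assumption about the relabelled grid, and all function names are illustrative.

```python
INF = "inf"   # stands for the point at infinity

def perm_from_cycle(cycle, n=6):
    """Tuple whose entry i-1 is the image of cell i under the given cycle."""
    images = list(range(1, n + 1))
    for a, b in zip(cycle, cycle[1:] + cycle[:1]):
        images[a - 1] = b
    return tuple(images)

def compose(first, second):
    """Apply `first`, then `second`."""
    return tuple(second[first[i] - 1] for i in range(len(first)))

def generated_group(generators):
    """All products of the generators, found by a graph search from the identity."""
    identity = tuple(range(1, len(generators[0]) + 1))
    seen = {identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = compose(g, s)
            if h not in seen:
                seen.add(h)
                frontier.append(h)
    return seen

L = perm_from_cycle((1, 2, 5, 4))
R = perm_from_cycle((2, 3, 6, 5))
group = generated_group([L, R])

LABELS = (0, 1, 3, INF, 2, 4)   # assumed labels of cells 1..6

def f(x):
    """x -> 1/(2x + 1) on the residues mod 5 together with infinity."""
    if x == INF:
        return 0
    denominator = (2 * x + 1) % 5
    return INF if denominator == 0 else pow(denominator, -1, 5)

def g(x):
    """x -> 3x on the residues mod 5 together with infinity."""
    return INF if x == INF else (3 * x) % 5

def as_cell_perm(func):
    return tuple(LABELS.index(func(label)) + 1 for label in LABELS)

print(len(group), as_cell_perm(f) == L, as_cell_perm(g) == R)
```

The modular inverse `pow(denominator, -1, 5)` requires Python 3.8 or later.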

6.13 Stamp Rolls

We have two stamp rolls with unlimited supplies of 1-cent and 2-cent stamps (Figure 6.6). Let $a(n)$ be the number of ways to make postage of $n$ cents by taking strips of stamps from the two rolls. The order of the strips and the number of stamps per strip matter. For example, $a(4) = 15$, as there are fifteen ways to make postage of four cents:

$$(1) + (1) + (1) + (1), \quad (1 + 1) + (1) + (1), \quad (1) + (1 + 1) + (1),$$
$$(1) + (1) + (1 + 1), \quad (1 + 1) + (1 + 1), \quad (1 + 1 + 1) + (1),$$
$$(1) + (1 + 1 + 1), \quad (1 + 1 + 1 + 1), \quad (2) + (1) + (1),$$
$$(1) + (2) + (1), \quad (1) + (1) + (2), \quad (2) + (1 + 1),$$
$$(1 + 1) + (2), \quad (2) + (2), \quad (2 + 2).$$

Figure 6.6. One-cent and two-cent stamp rolls.

In this notation, the numbers within parentheses comprise a strip. For instance, $(2) + (1 + 1)$ means a single 2-cent stamp followed by a strip of two 1-cent stamps.

Find $a(100)$ and give an approximate value of $a(10^5)$, the number of ways to make postage of \$1000. You will probably want to use a computer.

We will find a recurrence relation for $\{a(n)\}$. Then we can calculate $a(100)$ and approximate $a(10^5)$.

When solving a challenging mathematics problem, it's a good idea to look at specific cases. We can easily find $a(1) = 1$, $a(2) = 3$, $a(3) = 6$, and $a(4) = 15$ (as above). We can go a little further and find $a(5) = 33$ and $a(6) = 78$, but larger values are harder to produce. Can you guess a recurrence relation for $\{a(n)\}$ from what we know?

Since the last strip chosen consists of some number of 1-cent or 2-cent stamps, we have the recurrence relation

$$a(n) = a(n - 1) + a(n - 2) + a(n - 3) + a(n - 4) + \cdots$$
$$\qquad + a(n - 2) + a(n - 4) + a(n - 6) + a(n - 8) + \cdots,$$

where we stop summing when the arguments become negative, and we define $a(0) = 1$. We could use this to calculate $a(100)$, but it would be better to find a recurrence relation that requires only a fixed number of previous terms. For $n \ge 3$, we have

$$a(n) = a(n - 1) + a(n - 2) + [a(n - 3) + a(n - 4) + \cdots]$$
$$\qquad + a(n - 2) + [a(n - 4) + a(n - 6) + a(n - 8) + \cdots]$$
$$= a(n - 1) + 3a(n - 2),$$

because the two bracketed sums together equal $a(n - 2)$, by the same recurrence relation applied to $n - 2$. Hence, $\{a(n)\}$ is given by a linear recurrence relation of order two:

$$a(0) = 1, \quad a(1) = 1, \quad a(2) = 3,$$
$$a(n) = a(n - 1) + 3a(n - 2), \quad n \ge 3.$$

With the help of a computer, we find that

$$a(100) = 870338141873214655919573200648700175 \approx 8.7 \times 10^{35}.$$

We can calculate $a(10^5)$ by computer using the recurrence relation and get an approximate answer:

$$a(10^5) \approx 7.5 \times 10^{36224}.$$
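Both recurrence relations are easy to run. The Python sketch below computes the sequence once from the full "last strip" recurrence and once from the order-two recurrence, so that the two can be compared. The function names are illustrative; Python integers have arbitrary precision, so $a(100)$ is obtained exactly.

```python
def stamp_counts_direct(limit):
    """a(0..limit) from the recurrence that sums over the last strip."""
    a = [1]                                               # a(0) = 1
    for n in range(1, limit + 1):
        ones = sum(a[n - k] for k in range(1, n + 1))     # last strip: k one-cent stamps
        twos = sum(a[n - k] for k in range(2, n + 1, 2))  # last strip: k/2 two-cent stamps
        a.append(ones + twos)
    return a

def stamp_counts_fast(limit):
    """a(0..limit) from a(n) = a(n-1) + 3 a(n-2), valid for n >= 3."""
    a = [1, 1, 3]
    for n in range(3, limit + 1):
        a.append(a[n - 1] + 3 * a[n - 2])
    return a[:limit + 1]

print(stamp_counts_fast(6))
print(stamp_counts_fast(100)[100])
```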


Or we can obtain an exact formula for $a(n)$ and then approximate it. The standard way to solve a linear homogeneous recurrence relation of order two with constant coefficients is to assume it has a solution of the form

$$a(n) = \alpha r_1^n + \beta r_2^n, \quad n \ge 1,$$

where $\alpha$ and $\beta$ are constants, and $r_1$ and $r_2$ are the distinct roots of the characteristic equation

$$x^2 - x - 3 = 0,$$

i.e.,

$$r_1 = \frac{1 - \sqrt{13}}{2}, \quad r_2 = \frac{1 + \sqrt{13}}{2}.$$

Using $a(1) = 1$ and $a(2) = 3$, we find

$$\alpha = \frac{1}{3} - \frac{2}{3\sqrt{13}} \quad \text{and} \quad \beta = \frac{1}{3} + \frac{2}{3\sqrt{13}},$$

thus obtaining

$$a(n) = \left(\frac{1}{3} - \frac{2}{3\sqrt{13}}\right) r_1^n + \left(\frac{1}{3} + \frac{2}{3\sqrt{13}}\right) r_2^n, \quad n \ge 1.$$

For large $n$, the exponential term $r_2^n$ dominates. Hence

$$a(10^5) \approx \left(\frac{1}{3} + \frac{2}{3\sqrt{13}}\right) r_2^{100000} \approx 7.5 \times 10^{36224}.$$

We can also approximate $a(10^5)$ using a generating function. Define

$$f(x) = \sum_{n=0}^{\infty} a(n)x^n = a(0) + a(1)x + a(2)x^2 + a(3)x^3 + a(4)x^4 + \cdots$$
$$= 1 + x + 3x^2 + 6x^3 + 15x^4 + \cdots.$$

The recurrence relation for $\{a(n)\}$ yields

$$\sum_{n=0}^{\infty} a(n)x^n \left(1 - x - x^2 - x^3 - x^4 - \cdots - x^2 - x^4 - x^6 - x^8 - \cdots\right) = a(0) = 1.$$

From the formula for the sum of a geometric series, we obtain

$$f(x) = \frac{1}{1 - \dfrac{x}{1 - x} - \dfrac{x^2}{1 - x^2}} = \frac{1 - x^2}{1 - x - 3x^2}.$$

The generating function is a rational function and the coefficients in its denominator match those in the recurrence relation. This happens with the generating function of any linear recurrence relation with constant coefficients. We can write the generating function as

$$f(x) = \frac{1}{3} + \frac{\alpha}{1 - r_1x} + \frac{\beta}{1 - r_2x}.$$

The geometric series with growth rate $r_2x$ dominates, and we obtain the same approximation as before:

$$a(10^5) \approx \beta r_2^{100000} \approx 7.5 \times 10^{36224}.$$
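The dominant term $\beta r_2^n$ is far too large for floating-point arithmetic when $n = 10^5$, but its base-10 logarithm is not. The Python sketch below works with logarithms to produce the mantissa and exponent of the approximation. The function name is illustrative, and the result is only as accurate as double-precision logarithms allow.

```python
import math

def dominant_term(n):
    """Return (mantissa, exponent) with beta * r2**n ~ mantissa * 10**exponent."""
    root13 = math.sqrt(13)
    r2 = (1 + root13) / 2
    beta = 1 / 3 + 2 / (3 * root13)
    log_value = math.log10(beta) + n * math.log10(r2)
    exponent = math.floor(log_value)
    mantissa = 10 ** (log_value - exponent)
    return mantissa, exponent

print(dominant_term(100))        # compare with a(100), about 8.7 * 10**35
print(dominant_term(10 ** 5))    # compare with 7.5 * 10**36224
```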


6.14 Making a Million

How many ways can you make \$1 million using any number of pennies, nickels, dimes, quarters, one-dollar bills, five-dollar bills, ten-dollar bills, twenty-dollar bills, fifty-dollar bills, and hundred-dollar bills? You will need a computer for this problem.

The number of pennies used in making a million dollars must be a multiple of 5. Thus, we may think of 5 cents (a nickel or five pennies) as the smallest unit of currency, all other units being multiples of 5. Let $c_n$ be the number of ways to make $5n$ cents. The generating function for the sequence $\{c_n\}$ is

$$c(x) = 1 + c_1x + c_2x^2 + c_3x^3 + \cdots$$
$$= \frac{1}{1 - x} \cdot \frac{1}{1 - x} \cdot \frac{1}{1 - x^2} \cdot \frac{1}{1 - x^5}$$
$$\quad \cdot \frac{1}{1 - x^{20}} \cdot \frac{1}{1 - x^{100}} \cdot \frac{1}{1 - x^{200}} \cdot \frac{1}{1 - x^{400}} \cdot \frac{1}{1 - x^{1000}} \cdot \frac{1}{1 - x^{2000}}.$$

The terms in the middle row above correspond to the contributions from the coins. The terms in the bottom row represent contributions from the paper money. To see how the factors in the generating function work, consider the contribution to the generating function from the term

$$\frac{1}{1 - x^2} = 1 + x^2 + x^{2 \cdot 2} + x^{3 \cdot 2} + x^{4 \cdot 2} + \cdots.$$

A selection of, say, $x^{3 \cdot 2}$ corresponds to a selection of three dimes (a dime is two of our basic units). When we multiply all selections together and combine coefficients of like powers, we obtain the generating function for $\{c_n\}$.

Since \$1 million is 20,000,000 of the 5-cent units, our job is to find $c_{20{,}000{,}000}$, the coefficient of $x^{20{,}000{,}000}$ in $c(x)$. We will write the generating function in a form that makes this easier.

In the denominator of $c(x)$, all the powers of $x$ divide the largest power, 2000. Accordingly, we rewrite each factor in the denominator as $(1 - x^{2000})$ with a compensating factor, a geometric series, in the numerator. The new numerator is

$$(1 + x + \cdots + x^{1999})^2 (1 + x^2 + x^4 + \cdots + x^{1998})(1 + x^5 + \cdots + x^{1995})$$
$$\cdot\, (1 + x^{20} + \cdots + x^{1980})(1 + x^{100} + \cdots + x^{1900})(1 + x^{200} + \cdots + x^{1800})$$
$$\cdot\, (1 + x^{400} + \cdots + x^{1600})(1 + x^{1000}).$$

A computer algebra system can quickly multiply out the new numerator. The new denominator is $(1 - x^{2000})^{10}$, and we can expand its reciprocal as a binomial series:

$$\frac{1}{(1 - x^{2000})^{10}} = \sum_{k=0}^{\infty} \binom{k + 9}{9} x^{2000k}.$$

We complete the calculation by multiplying this binomial series by appropriate terms from the numerator to obtain the coefficient of $x^{20{,}000{,}000}$. The numerator is a polynomial, say $p$, of degree 16271, but the only powers of $x$ that matter are multiples of 2000; the corresponding coefficients are

$$p_0 = 1$$
$$p_{2000} = 48820947949$$
$$p_{4000} = 3864246773424$$
$$p_{6000} = 34961841233371$$
$$p_{8000} = 73423441820500$$
$$p_{10000} = 41833663537539$$
$$p_{12000} = 5760824622356$$
$$p_{14000} = 107159417621$$
$$p_{16000} = 1647239.$$

We calculate, again with the help of a computer,

$$c_{20{,}000{,}000} = \sum_{j=0}^{8} \binom{10000 - j + 9}{9} p_{2000j}$$
$$= 441{,}287{,}168{,}799{,}272{,}062{,}712{,}629{,}114{,}612{,}633{,}953{,}025{,}220{,}001$$
$$\approx 4.4 \times 10^{44}.$$

6.15 Coloring a Projective Plane

A projective plane of order three is shown on page 17. We will consider a projective plane of order two, as shown in Figure 6.7. It has seven points and seven lines. Each line contains three points, each point lies on three lines, every two points determine a unique line, and every two lines intersect in a unique point. If we color the seven points with seven different colors, how many different colorings do we obtain?

Figure 6.7. A projective plane of order 2.

The points and lines of a finite geometry may be moved around as long as the incidences between points and lines remain unchanged. For instance, the points 1, 2, and 3 must be collinear no matter how we draw the plane.

We will show that there are 30 different colorings. The important thing is the number of symmetries of the plane that can occur when the points are moved. Since all seven points are equivalent, we may move any point to occupy the position of any other point. So there are seven choices for where to move a point. Once that choice is made, there are six choices for where to move another point. However, once these decisions are made, the third point collinear with the first two must stay on the line determined by them. There are four remaining points, and any of them can be moved to any of the four remaining positions. However, this choice, together with the earlier choices, determines the positions of all the remaining points. Altogether, there are $7 \cdot 6 \cdot 4 = 168$ choices, and this is the number of symmetries of the projective plane.

Without symmetries, we would have $7! = 5040$ different colorings. This multiply-counts colorings that can be obtained from each other by a symmetry. Since there are 168 symmetries, there are only $5040/168 = 30$ different colorings.

Recalling "A Group of Operations" from Chapter 5, you may notice that the formula for the number of symmetries of the projective plane, $7 \cdot 6 \cdot 4$, gives the number of invertible $3 \times 3$ binary matrices. The multiplicative group of the matrices is isomorphic to the symmetry group of the projective plane. We can specify the isomorphism by labeling the seven points of the projective plane with the seven nonzero binary vectors of length three. Each matrix acts by multiplication on the set of points. We must do the labeling so that the vectors corresponding to three points on a line sum to 0, as collinearity is preserved by matrix multiplication.
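The count of 168 symmetries of the projective plane of order two can be confirmed by brute force over all $7! = 5040$ permutations of the points. The Python sketch below labels the points 1 through 7 by the nonzero binary vectors of length three and takes the lines to be the triples that sum to 0, that is, triples $\{a, b, a \oplus b\}$ with $\oplus$ the bitwise exclusive or. This labelling is an assumption and may differ from the one in Figure 6.7, although points 1, 2, 3 are collinear in both. The names are illustrative.

```python
from itertools import permutations
from math import factorial

POINTS = range(1, 8)
# Three points are collinear exactly when their binary labels XOR to zero.
LINES = {frozenset((a, b, a ^ b)) for a in POINTS for b in POINTS if a != b}

def is_symmetry(images):
    """True if the map i -> images[i-1] sends every line to a line."""
    return all(frozenset(images[p - 1] for p in line) in LINES for line in LINES)

symmetry_count = sum(1 for images in permutations(POINTS) if is_symmetry(images))
coloring_count = factorial(7) // symmetry_count
print(len(LINES), symmetry_count, coloring_count)
```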


7 Creative Problems

In mathematics, you understand what you build up. —FAN CHUNG It is easy to formulate new mathematical problems. One only needs an inquiring mind. The problems in this chapter are partially or completely unsolved, so there is much to work on!

7.1 Two-Dimensional Gobbling Algorithm

Choose a positive integer, say, 20. Now choose a random integer between 1 and 20, say, 9. Subtract: $20 - 9 = 11$. Next, choose a random integer between 1 and 11, say, 7. Subtract: $11 - 7 = 4$. Choose a random integer between 1 and 4, say, 3. Subtract: $4 - 3 = 1$. Now we must choose the integer 1, and we subtract: $1 - 1 = 0$. Since we have obtained 0, we stop. We did four subtractions. Starting with 20, how many subtractions are expected?

We can show, using a recurrence relation, that starting with a positive integer $n$, the expected number of subtractions is the harmonic number

$$H_n = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.$$

What if we start with a pair of positive integers, say, $(10, 6)$? Choose a random integer between 1 and 10, say, 5, and a random integer between 1 and 6, say, 2. Subtract: $(10, 6) - (5, 2) = (5, 4)$. Repeat, choosing the ordered pair $(2, 3)$. Subtract: $(5, 4) - (2, 3) = (3, 1)$. Repeat, choosing $(2, 1)$. Subtract: $(3, 1) - (2, 1) = (1, 0)$. Since one of the numbers is 0, we stop. There were three subtractions. How many subtractions do we expect?

Let $e(m, n)$ be the expected number of subtractions, starting with a pair of positive integers $(m, n)$. It's easy to write a recurrence formula for it:

$$e(m, 1) = 1, \quad m \ge 1;$$
$$e(1, n) = 1, \quad n \ge 1;$$
$$e(m, n) = 1 + \frac{1}{mn} \sum_{j=1}^{m-1} \sum_{k=1}^{n-1} e(j, k), \quad m, n > 1.$$

The recurrence formula produces a table of values of $e(m, n)$.


                            n
    m      1      2         3         4          5
    1      1      1         1         1          1
    2      1     5/4       4/3      11/8        7/5
    3      1     4/3      53/36    223/144    115/72
    4      1    11/8     223/144   475/288    549/320
    5      1     7/5     115/72    549/320   4309/2400

Do you see a pattern?
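The table of values of $e(m, n)$ can be regenerated with exact rational arithmetic. The Python sketch below implements the recurrence formula with memoization. The function name is illustrative.

```python
from fractions import Fraction
from functools import lru_cache

@lru_cache(maxsize=None)
def expected_subtractions(m, n):
    """Expected number of subtractions starting from the pair (m, n)."""
    if m == 1 or n == 1:
        return Fraction(1)
    # After one subtraction the pair is (j, k); the process continues only
    # when both entries are still positive.
    total = sum(expected_subtractions(j, k)
                for j in range(1, m) for k in range(1, n))
    return 1 + total / (m * n)

for m in range(1, 6):
    print([str(expected_subtractions(m, n)) for n in range(1, 6)])
```

Larger tables, which may help in the search for a pattern, follow by extending the two ranges.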

7.2 Nonattacking Queens Game

A chess Queen attacks all squares on its row, column, and diagonals. A Queen's range on an $8 \times 8$ board is shown in Figure 7.1. Let $n$ be a positive integer. Two players play a game in which they alternately place Queens on an $n \times n$ board so that each new Queen is out of range of the others. The last player able to place a Queen on the board is the winner.

Figure 7.1. A Queen's range on an $8 \times 8$ board.

Given best possible play by both players, who should win this game? If $n$ is odd, then the first player has a winning strategy: place the first Queen in the center of the board and after a second player move, take the square symmetric with respect to the center. Every time the second player has an available move, so does the first player, so the second player will run out of moves first.

For $n$ even, the outcome of the game is in general unknown. This problem was introduced in [18]. You can show by considering all cases that the first player wins for $n = 2$, 4, 6, and 8. Hassan Noon and Glen van Brummelen [38] showed that the second player wins for $n = 10$. Can you determine who wins on a $12 \times 12$ board?

7.3 Lucas Numbers Mod m

The Lucas numbers are defined by

$$L_0 = 2, \quad L_1 = 1, \quad L_n = L_{n-1} + L_{n-2}, \quad n \ge 2.$$

The sequence $\{L_n\}$ satisfies the same recurrence relation as the Fibonacci sequence but has different initial values.

Given $m \ge 2$, when does the range of the sequence $\{L_n \bmod m\}$ consist of a complete residue system modulo $m$? The corresponding question for Alcuin's sequence was addressed in Chapter 5. Although we don't have a formula for the period of $\{L_n \bmod m\}$, we know by the pigeonhole principle that it is at most $m^2$. Computer explorations give rise to a conjecture.

Conjecture. The sequence $\{L_n \bmod m\}$ takes all values modulo $m$ if and only if $m$ is one of

$$2, \quad 4, \quad 6, \quad 7, \quad 14, \quad 3^k, \quad k \ge 1.$$

For example, the sequence $\{L_n \bmod 6\}$ is

$$2, 1, 3, 4, 1, 5, 0, \ldots,$$

and we obtain all the residues modulo 6. The sequence $\{L_n \bmod 5\}$ is

$$2, 1, 3, 4, 2, 1, \ldots,$$

and since it repeats we never obtain the residue 0. Can you prove the conjecture?

Stephen A. Burr solved the corresponding problem for the Fibonacci sequence [11]. The sequence $\{F_n \bmod m\}$ contains all residues modulo $m$ if and only if $m$ is one of

$$5^j, \quad 2 \cdot 5^j, \quad 4 \cdot 5^j, \quad 3^k \cdot 5^j, \quad 6 \cdot 5^j, \quad 7 \cdot 5^j, \quad 14 \cdot 5^j, \quad j \ge 0, \quad k \ge 1.$$
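Computer explorations of the kind that suggest the conjecture take only a few lines. The Python sketch below collects the residues of the Lucas sequence modulo $m$ over one full period and lists the moduli in a small range for which every residue appears. The function name and the search bound are illustrative.

```python
def lucas_residues(m):
    """Set of values taken by the Lucas sequence modulo m."""
    start = (2 % m, 1 % m)
    a, b = start
    seen = set()
    while True:
        seen.add(a)
        a, b = b, (a + b) % m
        # The step (a, b) -> (b, a + b) is invertible modulo m, so the
        # sequence of pairs is purely periodic and returns to its start.
        if (a, b) == start:
            return seen

complete_moduli = [m for m in range(2, 17) if len(lucas_residues(m)) == m]
print(complete_moduli)
```

Raising the upper limit of the range extends the experiment to larger moduli.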

7.4 Exact Colorings of Graphs A graph ofthekindencountered in graphtheoryis a set of verticesand a set of edgesjoining pairs of vertices. A complete graph is a graph in which every two vertices are joined by an edge. There are many problems about colorings of graphs. In this problem, we are concerned with coloring the edges of a graph using a set of colors. An exact c-coloring of a graph is an assignment of one color chosen from c colors, to each edge of the graph such that each color is used at least once. The following exact coloring problem for infinite graphs is unsolved. For 1 m c, Ä Ä let P.c; m/ be the statement that every exact c-coloringof the edges of a countably infinite complete graph yields an exactly m-colored countably infinitecomplete subgraph. For what values of c and m is P.c; m/ true? The case m 1 is a famous theorem of combinatorics D known as Ramsey’s theorem. Ramsey’s Theorem. If the edges of the complete infinite graph on a countable infinity of vertices are colored using finitely many colors, then there exists a complete subgraph on infinitely many vertices all of whose edges are the same color. A corollary of Ramsey’s theorem is that P.c; m/ is true when m 2. If c m, then D D P.c; m/ is trivially true (take the subgraph to be the given graph). These may be the only values of c and m for which P.c; m/ is true.

Conjecture. The statement $P(c, m)$ is true if and only if $m = 1$, $m = 2$, or $c = m$.


Figure 7.2. A Queen path.

As an example of how we can disprove $P(c, m)$, consider the case $P(11, 5)$. We exhibit an exact 11-coloring of the edges of the complete infinite graph on countably many vertices so that there is no exactly 5-colored complete infinite subgraph. Color each edge of a subgraph $K_5$ using a different color. This requires $\binom{5}{2} = 10$ colors. Color every other edge in the infinite graph using the 11th color. Every infinite subgraph is exactly $\left(\binom{k}{2} + 1\right)$-colored for some $k$. Since 5 is not a number of this form, $P(11, 5)$ is false.

Alan Stacey and Peter Weidl [49] proved that $P(c, m)$ is false for each fixed $m \ge 3$ and $c$ sufficiently large. Can you prove the conjecture in its entirety?
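The counting step in the $P(11, 5)$ example can be checked directly. In this sketch, $k$ is read as the number of vertices of the special $K_5$ that an infinite subgraph contains, so $k$ ranges from 0 to 5; that reading and the variable name `possible_color_counts` are assumptions of this illustration.

```python
from math import comb

# An infinite subgraph containing k of the five special vertices sees
# C(k, 2) of the ten distinct colors, plus the 11th color on all other edges.
possible_color_counts = sorted({comb(k, 2) + 1 for k in range(6)})
print(possible_color_counts)
```

The value 5 does not appear among the possible counts, which is what the argument needs.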

7.5 Queen Paths

A chess Queen can move any number of squares horizontally, vertically, or diagonally in one step. Figure 7.2 shows a sample Queen path from the lower-left corner of the board to the upper-right corner. Let us extend the board infinitely to the right and upward. Denote the squares by ordered pairs of nonnegative integers, with the lower-left corner square labeled $(0, 0)$. How many lattice paths can the Queen take from $(0, 0)$ to $(m, n)$, where $m$ and $n$ are nonnegative integers?

Let $q(m, n)$ be the number of paths from $(0, 0)$ to $(m, n)$ such that at each step the Queen moves up, right, or up-right. In the table, we calculate each entry by adding all the entries to the left of, below, and diagonally left-below the entry. For example, $q(3, 2) = 2 + 7 + 22 + 4 + 17 + 1 + 7 = 60$. The reason this works is that the Queen has to arrive at the given square from one of these squares.

$$\begin{array}{rrrrrrrrl}
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \vdots & \\
64 & 464 & 2392 & 10305 & 39625 & 140658 & 470233 & 1499858 & \cdots \\
32 & 208 & 990 & 3985 & 14430 & 48519 & 154352 & 470233 & \cdots \\
16 & 92 & 401 & 1498 & 5079 & 16098 & 48519 & 140658 & \cdots \\
8 & 40 & 158 & 543 & 1712 & 5079 & 14430 & 39625 & \cdots \\
4 & 17 & 60 & 188 & 543 & 1498 & 3985 & 10305 & \cdots \\
2 & 7 & 22 & 60 & 158 & 401 & 990 & 2392 & \cdots \\
1 & 3 & 7 & 17 & 40 & 92 & 208 & 464 & \cdots \\
1 & 1 & 2 & 4 & 8 & 16 & 32 & 64 & \cdots
\end{array}$$

We see from the table that the number of Queen paths from the lower-left corner to the upper-right corner of the board is $q(7, 7) = 1499858$.
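The rule used to fill in the table translates directly into a short memoized function. The following is a minimal sketch of that rule; the name `q` matches the text, while the use of `lru_cache` is simply one convenient way to avoid recomputation.

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def q(m, n):
    """Number of Queen paths from (0, 0) to (m, n) using right, up, and up-right moves."""
    if m == 0 and n == 0:
        return 1
    total = sum(q(i, n) for i in range(m))   # squares to the left in the same row
    total += sum(q(m, j) for j in range(n))  # squares below in the same column
    # squares on the diagonal going down and to the left
    total += sum(q(m - k, n - k) for k in range(1, min(m, n) + 1))
    return total


# Print the 8 x 8 corner of the table, top row first, as in the text.
for n in range(7, -1, -1):
    print(" ".join(f"{q(m, n):>8}" for m in range(8)))
```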


The recurrence relation for the two-variable sequence requires arbitrarily many prior values. Let’s ﬁnd a recurrence relation for the number of Queen paths that requires a ﬁxed number of prior values. We use the generating function method that we saw in the solution to Stamp Rolls in Chapter 6. We represent the Queen’s basic steps by the indeterminates x, y, and xy. From the recurrence relation that we have already found, we have

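$$\sum_{m=0}^{\infty} \sum_{n=0}^{\infty} q(m, n)\, x^m y^n \left(1 - x - x^2 - \cdots - y - y^2 - \cdots - (xy) - (xy)^2 - \cdots\right) = q(0, 0) = 1.$$

Thus, the generating function is

$$\sum_{m=0}^{\infty} \sum_{n=0}^{\infty} q(m, n)\, x^m y^n = \frac{1}{1 - \dfrac{x}{1-x} - \dfrac{y}{1-y} - \dfrac{xy}{1-xy}} = \frac{1 - x - y + x^2 y + x y^2 - x^2 y^2}{1 - 2x - 2y + xy + 3x^2 y + 3x y^2 - 4x^2 y^2}.$$

Looking at the denominator of the generating function, we can read off a recurrence formula for the number of Queen paths: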

$$\begin{aligned}
q(0, 0) &= 1, & q(0, 1) &= 1, & q(0, 2) &= 2, \\
q(1, 0) &= 1, & q(1, 1) &= 3, & q(1, 2) &= 7, \\
q(2, 0) &= 2, & q(2, 1) &= 7, & q(2, 2) &= 22;
\end{aligned}$$

$$\begin{aligned}
q(m, n) = {} & 2q(m-1, n) + 2q(m, n-1) - q(m-1, n-1) - 3q(m-2, n-1) \\
& - 3q(m-1, n-2) + 4q(m-2, n-2), \quad m \ge 3 \text{ or } n \ge 3.
\end{aligned}$$

We set $q(m, n) = 0$ for $m$ or $n$ negative.

The diagonal sequence for Queen paths, $\{q_n = q(n, n)\}$, is

$$1, 3, 22, 188, 1712, 16098, 154352, 1499858, 14717692, 145509218, \ldots.$$

Its generating function is

$$\frac{x - 1}{3x - 2} \left[ 1 + \frac{1 - x}{\sqrt{1 - 12x + 16x^2}} \right].$$

From the generating function we obtain a recurrence formula:

$$q_0 = 1, \quad q_1 = 3, \quad q_2 = 22, \quad q_3 = 188;$$

$$q_n = \frac{(29n - 18)\, q_{n-1} + (-95n + 143)\, q_{n-2} + (116n - 302)\, q_{n-3} + (-48n + 192)\, q_{n-4}}{2n}, \quad n \ge 4.$$

The quantity under the square root sign can be factored as

$$1 - 12x + 16x^2 = (1 - r_1 x)(1 - r_2 x),$$


where $r_1, r_2 = 6 \pm 2\sqrt{5}$. It can be shown that

$$q_n \sim c\, r_1^n / \sqrt{\pi n},$$

where $c = \sqrt{10(3\sqrt{5} - 5)}/8$.

The problem of counting Queen paths can be generalized to higher dimensions. A Queen path from $(0, 0, 0)$ proceeds in steps that are positive integer multiples of $(1, 0, 0)$, $(0, 1, 0)$, or $(0, 0, 1)$. A Queen path from $(0, 0, \ldots, 0)$ to $(a_1, a_2, \ldots, a_d)$ is equivalent to a Wythoff’s Nim game that starts with $d$ piles of stones of sizes $a_1$, $a_2$, ..., $a_d$. In the game, two players alternately remove the same number of stones from any of the piles. The game is over when the last stone is removed. Our formulas count the number of possible games.

The number of Queen paths to a main diagonal point $(n, n, n)$ has been recently conjectured by Alin Bostan (leading a team) to satisfy a linear recurrence relation of order 14 with polynomial coefficients of degree 52. What is a recurrence relation for the number of Queen paths to a diagonal point in dimension $d \ge 3$?
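Both recurrences for two-dimensional Queen paths in this section can be tested against the direct summation rule. The sketch below is only a numerical check on a small board, not a proof; the helper names `queen_table`, `fixed_order_rhs`, and `diagonal` are invented for this illustration.

```python
def queen_table(size):
    """q[m][n] for 0 <= m, n < size, computed from the 'sum of earlier squares' rule."""
    q = [[0] * size for _ in range(size)]
    q[0][0] = 1
    for m in range(size):
        for n in range(size):
            if m == 0 and n == 0:
                continue
            q[m][n] = (
                sum(q[i][n] for i in range(m))
                + sum(q[m][j] for j in range(n))
                + sum(q[m - k][n - k] for k in range(1, min(m, n) + 1))
            )
    return q


def fixed_order_rhs(q, m, n):
    """Right-hand side of the recurrence read off from the generating function."""
    def g(i, j):
        # q is taken to be 0 when either index is negative
        return q[i][j] if i >= 0 and j >= 0 else 0

    return (2 * g(m - 1, n) + 2 * g(m, n - 1) - g(m - 1, n - 1)
            - 3 * g(m - 2, n - 1) - 3 * g(m - 1, n - 2) + 4 * g(m - 2, n - 2))


def diagonal(count):
    """First `count` terms of q_n = q(n, n) from the order-4 recurrence."""
    d = [1, 3, 22, 188]
    for n in range(4, count):
        numerator = ((29 * n - 18) * d[n - 1] + (-95 * n + 143) * d[n - 2]
                     + (116 * n - 302) * d[n - 3] + (-48 * n + 192) * d[n - 4])
        d.append(numerator // (2 * n))  # the division is exact for these terms
    return d[:count]


table = queen_table(10)
print(diagonal(10))
```

Comparing `diagonal(10)` with the diagonal of `table` confirms that the two descriptions agree on the first ten terms.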

7.6 Transversal Achievement Game

Recall from Transversal of Primes in Chapter 1 that a transversal of an $n \times n$ array is a set of $n$ cells with no two in the same row or column. We can define a two-player game based on transversals. Two players, Oh and Ex, alternately choose unoccupied cells from an $n \times n$ array. They write their symbols, O and X, in the chosen cells. The first player, if any, to occupy a collection of $n$ cells constituting a transversal is the winner. (The player may occupy other cells, too.)

If there is a winner, then the winner must be the first player, Oh. The reason is that if the second player, Ex, had a winning strategy, then the first player could adopt it and win one step earlier. A blocking strategy for Ex might be to try to occupy a complete row or column. This would prevent Oh from occupying a transversal. However, Oh may prevent Ex from occupying a complete row or column. For what values of $n$ does Oh win this game?

7.7 Binary Matrix Game

A two-dimensional version of van der Waerden’s theorem, called Gallai’s theorem, named after Tibor Gallai (1912–1992), guarantees that in any coloring of the elements of an infinite square grid with two colors, there must exist four cells, all colored the same, lying at the vertices of a square with horizontal and vertical sides. A finite version of Gallai’s theorem says that there exists a positive integer $n$ such that given any two-coloring of the cells of an $n \times n$ grid, there exist some four cells all the same color, lying at the vertices of a square with horizontal and vertical sides.

We can think of the grid as a matrix and take the colors to be 0 and 1, so that we have a binary matrix. Say that a constant sub-square is a set of four equal entries of a binary matrix at the vertices of a square with horizontal and vertical sides. A long-standing problem was to find the least value of $n$ that forces the existence of a constant sub-square. It was solved in


Figure 7.3. A $14 \times 14$ binary matrix without a constant sub-square.

Figure 7.4. A pattern for a $13 \times \infty$ binary matrix without a constant sub-square.

2009 by Roland Bacher and Shalom Eliahou [5], who proved that $n = 15$. Furthermore, they showed that every $14 \times 15$ binary matrix must have four such entries, and there exist $14 \times 14$ and $13 \times \infty$ binary matrices that don’t.

Figure 7.3 shows a $14 \times 14$ binary matrix with no constant sub-square. Figure 7.4 shows the pattern for a $13 \times \infty$ binary matrix with no constant sub-square. Representing 0 by a blank and 1 by a filled square gives the figures, especially the second one, an Escher-like quality.

We can create a game related to the Bacher–Eliahou result. Suppose that two players, Oh and Ex, alternately place their symbols, O and X, in unoccupied cells of an $n \times n$ grid. The first player, if any, to mark four cells of a constant sub-square is the winner. As in the transversal achievement game, if there is a winner with best possible play, then it is Oh. The proof is by contradiction. If Ex had a winning strategy, then Oh could simply adopt it and get there first. By the Bacher–Eliahou result, Oh has a winning strategy if $n \ge 15$. But perhaps Oh can force a win on a smaller playing board. What is the least value of $n$ for which Oh can always win?

If instead of a binary matrix we have a trivalued matrix with entries 0, 1, or 2, then the minimum size of the matrix that guarantees the existence of a constant sub-square is unknown. Can you find it?
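For experiments with either question, the basic tool is a test for a constant sub-square. The following is a minimal sketch; the function name `has_constant_subsquare` is hypothetical, and the matrix is assumed to be a list of equal-length rows. It works unchanged for entries 0, 1, 2.

```python
def has_constant_subsquare(matrix):
    """True if four equal entries sit at the corners of an axis-parallel square."""
    rows, cols = len(matrix), len(matrix[0])
    for i in range(rows):
        for j in range(cols):
            # s is the side length of the square whose top-left corner is (i, j)
            for s in range(1, min(rows - i, cols - j)):
                corners = {
                    matrix[i][j], matrix[i][j + s],
                    matrix[i + s][j], matrix[i + s][j + s],
                }
                if len(corners) == 1:
                    return True
    return False


example = [
    [0, 1, 0],
    [1, 1, 0],
    [0, 0, 1],
]
print(has_constant_subsquare(example))
```

The check examines every corner position and every side length, so its cost grows roughly as the cube of the board size, which is fine for boards as small as $15 \times 15$.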


A Harmonious Foundations

Mathematics is a more powerful instrument of knowledge than any other that has been bequeathed to us by human agency.
—RENÉ DESCARTES (1596–1650)

Mathematical definitions appear inevitable, as if they exist independently of human thought. The appearance of inevitability prompts the question of whether mathematics is discovered or invented. We can’t answer that question, but we note that someone had to think of the definitions that we now take for granted. This results from a historical process of formulating problems, looking for solutions, and creating the best mathematics for the given situations.

In this appendix, we give background information on the mathematical concepts in the book. As a utilitarian fork or a chair can be beautiful, everyday mathematical constructs are also beautiful. Simple definitions can give rise to surprising phenomena. A good reference on mathematical foundations is [48].

A.1 Sets

Sets provide the building blocks for many mathematical definitions. The modern notion of sets was introduced by Georg Cantor (1845–1918). However, Cantor’s set theory admitted some paradoxes, the most famous of which is Russell’s paradox. It concerns the set $S$ of all sets that are not members of themselves. If $S$ is a member of itself, then by definition $S$ is not a member of itself. But if $S$ is not a member of itself, then by definition $S$ is a member of itself. There is a contradiction either way. Set theory was put on a firm foundation by Ernst Zermelo (1871–1953) and Abraham Fraenkel (1891–1965). Their system, together with the Axiom of Choice, is called ZFC set theory.

Russell’s paradox has found its way into rigorous mathematics via results in mathematical logic such as those due to Kurt Gödel (1906–1978). Gödel’s Incompleteness Theorem asserts that in any consistent mathematical system, that is, one in which false statements are not provable, and rich enough to contain the integers, there exists a statement $G$ that is true but not provable within the system. That is, the system is incomplete.

Statement $G$: Statement $G$ is not provable within the system.

Consider whether $G$ is true or false. If it is false, then it is provable within the system. But this would mean that a false statement is provable, which is impossible in a consistent


system. Hence $G$ is a true statement. Since $G$ says that $G$ is not provable within the system, this must be the case. So we have a statement, $G$, that is true but not provable within the system. Thus the system is incomplete.

A virtue of set theory is that we can define many important mathematical objects in terms of sets. For a clear introduction to Gödel’s theory, see [35]. For an advanced discussion of the way set theory is used in mathematical applications, see [12].

A set is a collection of elements. We sometimes define a set by listing its elements. For example,

$$A = \{1, 2, 3, 4, 5, 6, 7, 8, 9, 10\}$$

has for its elements the integers from 1 to 10 inclusive. We may also define a set by a rule that its elements satisfy. Thus $A$ may be written as

$$A = \{x : x \text{ is an integer between 1 and 10 (inclusive)}\}.$$

We write $x \in S$ to indicate that $x$ is an element of $S$. Thus $3 \in A$. If two sets $A$ and $B$ have the same elements, then $A$ and $B$ are equal and we write $A = B$.

We say that $A$ is a subset of $B$, and write $A \subseteq B$, if every element of $A$ is an element of $B$. If $A \subseteq B$ and $B \subseteq A$, then by definition $A = B$.

The empty set, denoted by $\emptyset$, is the set with no elements.

Some sets of numbers have special names:

$$\begin{aligned}
N &= \text{the set of natural numbers } \{1, 2, 3, \ldots\} \\
Z &= \text{the set of integers (positive, negative, and 0)} \\
Q &= \text{the set of rational numbers} \\
R &= \text{the set of real numbers} \\
C &= \text{the set of complex numbers.}
\end{aligned}$$

The cardinality of a set $A$, denoted by $|A|$, is the number of elements in $A$. For example, $|\{1, 3, 5, 7, 9\}| = 5$. If $A$ has finitely many elements, we say that $A$ is finite. If $A$ is not finite, then $A$ is infinite. Two sets are said to have the same cardinality if there is a bijection between them.

The union of two sets $A$ and $B$, written $A \cup B$, is the set of elements in $A$ or $B$ or both.

The intersection of $A$ and $B$, written $A \cap B$, is the set of elements in both $A$ and $B$.

Sets $A$ and $B$ are disjoint if $A \cap B = \emptyset$. If $A$ and $B$ are disjoint, then $|A \cup B| = |A| + |B|$.

A collection of sets $\mathcal{C}$ is pairwise-disjoint if every pair of members of $\mathcal{C}$ are disjoint.

The difference of $A$ and $B$, written $A \setminus B$, is the set of elements in $A$ but not in $B$. If $B \subseteq A$, and the set $A$ is clear from context, we call $A \setminus B$ the complement of $B$, and denote it by $\overline{B}$.

The power set of $A$, denoted $P(A)$, is the collection of all subsets of $A$, including the empty set. If the cardinality of $A$ is $n$, then $|P(A)| = 2^n$.

It is always the case that the cardinality of $P(A)$ is greater than the cardinality of $A$ (even if $A$ is an infinite set). The reason is that there is no onto function from $A$ to $P(A)$. For suppose that $f$ is a function from $A$ to $P(A)$. Let $X = \{x \in A : x \notin f(x)\}$. Given any $a$ in $A$, if $a \in X$, then $a \notin f(a)$, and hence $f(a) \ne X$; and if $a \notin X$, then $a \in f(a)$, and hence $f(a) \ne X$. Therefore, $X$ is not in the range of $f$ and we conclude that $f$ is not an onto function. As a consequence of this result, there is no largest infinite set.
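For a small finite set, the argument that no function from $A$ to $P(A)$ is onto can be checked exhaustively. This is a sketch for illustration only: the three-element set, and the names `power_set` and `diagonal_set`, are choices made here and do not come from the text.

```python
from itertools import combinations, product


def power_set(elements):
    """All subsets of `elements`, as frozensets."""
    return [frozenset(c)
            for r in range(len(elements) + 1)
            for c in combinations(elements, r)]


def diagonal_set(f, A):
    """The set X = {x in A : x not in f(x)} from the proof."""
    return frozenset(x for x in A if x not in f[x])


A = [0, 1, 2]
subsets = power_set(A)

# Try every one of the 8^3 functions f : A -> P(A) and confirm that
# the diagonal set X is never one of the values of f.
all_missed = all(
    diagonal_set(dict(zip(A, images)), A) not in images
    for images in product(subsets, repeat=len(A))
)
print(all_missed)
```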

A.2 Relations

The Cartesian product $A \times B$ of two sets $A$ and $B$ is the collection of all ordered pairs $(a, b)$ with $a \in A$ and $b \in B$. If $A$ and $B$ are finite, then $|A \times B| = |A||B|$.

A relation $R$ on a set $X$ is a subset of $X \times X$. If $(a, b) \in R$, then $a$ is related to $b$.

Here are two relations on the set $Z$ of integers:

$$R_1 = \{(a, b) : a, b \in Z \text{ and } a - b \text{ is divisible by } 3\}$$

and

$$R_2 = \{(a, b) : a, b \in Z \text{ and } a \text{ is divisible by } b\}.$$

A relation $R$ on $X$ is reflexive if $(a, a) \in R$ for all $a \in X$; symmetric if $(b, a) \in R$ whenever $(a, b) \in R$; antisymmetric if $(a, b) \in R$ and $(b, a) \in R$ imply that $a = b$; and transitive if $(a, b) \in R$ and $(b, c) \in R$ imply that $(a, c) \in R$.

An equivalence relation is a relation that is reflexive, symmetric, and transitive. The relation $R_1$ is an equivalence relation. Given an element $x \in X$, the set of elements related to $x$, that is, the set of $y \in X$ such that $(x, y) \in R$, is called the equivalence class of $x$, and is denoted $[x]$. For example, in $R_1$ we have $[0] = \{\ldots, -6, -3, 0, 3, 6, \ldots\}$.

In general, for $m \ge 2$, we define the congruence relation $a \equiv b$ (modulo $m$) if and only if $a - b$ is divisible by $m$. This is an equivalence relation on $Z$.

A partial order is a relation that is reflexive, antisymmetric, and transitive. The relation $R_2$, restricted to the positive integers, is a partial order.
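The four properties can be tested mechanically on finite pieces of these relations. In the sketch below the relations are cut down to finite windows, and the divisibility relation is tested on positive integers only, since 2 and −2 divide each other; the window sizes and all function names are assumptions of this illustration.

```python
def is_reflexive(R, X):
    return all((a, a) in R for a in X)


def is_symmetric(R):
    return all((b, a) in R for (a, b) in R)


def is_antisymmetric(R):
    return all(a == b for (a, b) in R if (b, a) in R)


def is_transitive(R):
    return all((a, d) in R for (a, b) in R for (c, d) in R if b == c)


# R1: a - b divisible by 3, on the window -6..6
X = range(-6, 7)
R1 = {(a, b) for a in X for b in X if (a - b) % 3 == 0}

# R2: a divisible by b, on the positive integers 1..12
P = range(1, 13)
R2 = {(a, b) for a in P for b in P if a % b == 0}

class_of_zero = sorted(b for (a, b) in R1 if a == 0)
print(class_of_zero)
```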

A.3 Functions

The concept of a function is used so often that it may be difficult to understand why there was ever any ambiguity about the definition. Is a function a machine that takes an input and gives an output? Is it a curve that you can graph? It may seem strange that the modern definition of function wasn’t worked out until the twentieth century. Before then, a function could conceivably return more than one value for a given input, so it was more general than the functions of today which return only one value. On the other hand, until the 1800s a function was a mapping that could be constructed from a family of well known functions such as sine, cosine, and exponential functions.

A function is a set of ordered pairs $(x, y)$, where $x$ is an element of a set $X$ and $y$ is an element of a set $Y$, where it is possible that $X = Y$. Every element of $X$ occurs as the first element of an ordered pair, and for each $x$ in $X$ there is a unique corresponding $y$ in $Y$. Less formally, a function is a rule that assigns to each element in $X$ a unique element in $Y$.

There is no requirement that a function can be graphed by a continuous curve. An example of a pathological function, nowhere continuous, was given by Peter Gustav Lejeune Dirichlet (1805–1859). It is a function from the set of real numbers to the set $\{0, 1\}$. The value of the function is 1 at each rational number and 0 at each irrational number. You can’t graph the function because it oscillates too wildly between 0 and 1. However, it is well-defined.


A function $f$ from $A$ to $B$, written $f : A \to B$, is a subset of $A \times B$ such that for each $a \in A$ there exists a unique $b \in B$ with $(a, b) \in f$. We say that $f$ maps $a$ to $b$, and we write $f(a) = b$ or $f : a \mapsto b$. We call $b$ the image of $a$. The domain of $f$ is $A$. The codomain of $f$ is $B$. The range of $f$ is the set of $b \in B$ for which there exists $a \in A$ such that $f(a) = b$.

For example, the function

$$f : \{1, 2, 3, 4\} \to \{1, 2, 3, 4, 5, 6, 7, 8\}, \quad x \mapsto 2x$$

maps each element $x \in \{1, 2, 3, 4\}$ to its double in $\{1, 2, 3, 4, 5, 6, 7, 8\}$. The range of $f$ is $\{2, 4, 6, 8\}$.

The identity function on $A$ is the function $f : A \to A$ defined by $f(x) = x$.

A function $f : A \to B$ is one-to-one if no two elements of $A$ are mapped to the same element of $B$. The function $f$ is onto if each $b \in B$ is the image of some $a \in A$. Equivalently, $f$ is onto if the range of $f$ equals $B$. If $f$ is one-to-one and onto, then $f$ is a bijection.

If $f$ is a bijection, then the inverse of $f$, denoted $f^{-1}$, is a function from $B$ to $A$ where $f^{-1}(b) = a$ if and only if $f(a) = b$.

The following theorem is often useful.

Theorem. Suppose that $X$ and $Y$ are finite sets of the same cardinality. Then a function from $X$ to $Y$ is one-to-one if and only if it is onto.

Given functions $f : A \to B$ and $g : B \to C$, the composition of $f$ and $g$ is the function from $A$ to $C$ defined by $x \mapsto g(f(x))$. If $f : A \to A$ is a bijection, then the composition of $f$ and $f^{-1}$ is the identity function on $A$.
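The theorem about finite sets of equal cardinality is easy to confirm by brute force in a small case. The sketch below represents a function from $X$ to $Y$ as the tuple of its values; the three-element sets and the helper names are choices made for this illustration.

```python
from itertools import product


def is_one_to_one(values):
    """No two inputs share an image."""
    return len(set(values)) == len(values)


def is_onto(values, codomain):
    """Every element of the codomain is an image."""
    return set(values) == set(codomain)


X = [0, 1, 2]
Y = ["a", "b", "c"]

# Each tuple in product(Y, repeat=3) lists f(0), f(1), f(2) for one function f : X -> Y.
functions = list(product(Y, repeat=len(X)))
agree = all(is_one_to_one(f) == is_onto(f, Y) for f in functions)
bijections = [f for f in functions if is_one_to_one(f)]
print(agree, len(bijections))
```

Changing `Y` to a set of a different size makes the two properties come apart, which shows why the equal-cardinality hypothesis is needed.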

A.4 Groups

The mathematical term group was first used by Évariste Galois (1811–1832) in the study of the solvability of polynomial equations. Other mathematicians, such as Arthur Cayley (1821–1895) and Augustin-Louis Cauchy (1789–1857), used essentially the same idea in the study of permutations. Eventually, these ideas, and others involving number theory and geometry, were synthesized into the modern definition of an abstract group. A good primer on group theory is [25].

A group $G$ is a nonempty set together with a binary operation $*$ such that:

For all $x, y \in G$, we have $x * y \in G$ (closure).

For all $x, y, z \in G$, we have $x * (y * z) = (x * y) * z$ (associative law).

There exists an element $e \in G$ with the property that, for all $x \in G$, we have $x * e = e * x = x$.

For every $x \in G$, there exists an element $x^{-1} \in G$ with the property that $x * x^{-1} = x^{-1} * x = e$.

The element $e$ is called the identity of $G$. The element $x^{-1}$ is called the inverse of $x$. The identity element of a group is unique and the inverse $x^{-1}$ of each element $x$ is unique.


Examples of groups:

The set of integers $Z$ is a group with respect to addition.

The set $R \setminus \{0\}$ of nonzero real numbers is a group with respect to multiplication.

In writing group elements, we usually suppress the group operation sign, denoting $x * y$ by $xy$. We abbreviate $xx$ by $x^2$, $x^{-1}x^{-1}$ by $x^{-2}$, etc. For all $x \in G$, we set $x^0 = e$.

A finite group is a group with a finite number of elements. The order of a finite group is the number of elements in it.

The cyclic group $Z_n$, of order $n$, is the set $\{0, \ldots, n-1\}$ with the operation of addition modulo $n$.

If $p$ is prime, then the nonzero residues modulo $p$ form a cyclic group of order $p - 1$, with multiplication modulo $p$. In general, for $n \ge 2$, the set of numbers $m$ such that $1 \le m < n$ and $\gcd(m, n) = 1$ form a group of order $\phi(n)$, with multiplication modulo $n$. The group is denoted $Z_n^*$. For example, $Z_{10}^* = \{1, 3, 7, 9\}$, under multiplication modulo 10.

A group $G$ is abelian if $xy = yx$ for all $x, y \in G$. Otherwise, $G$ is nonabelian. For example, the group $Z$ is abelian.

The order of an element $x \in G$ is the least positive integer $n$ for which $x^n = e$. If there is no such integer, then $x$ has infinite order. For example, in $Z_4$, the elements 0, 1, 2, 3 have orders 1, 4, 2, 4, respectively.

The symmetric group $S_n$ consists of the $n!$ permutations of an $n$-element set, e.g., $\{1, 2, 3, \ldots, n\}$. The group operation is the composition of permutations (performing one permutation followed by the other permutation). The elements of $S_n$ are conveniently written in cycle notation. Thus

$$(1, 2, 3)(4, 8)(3, 6, 7)(5)(9)(10)$$

is the element of $S_{10}$ that maps 1 to 2 to 3 to 1, transposes 4 and 8, maps 3 to 6 to 7 to 3, and fixes 5, 9, and 10. To multiply two permutations together, find the result of the composition of the two bijections (reading left to right). For example,

$$(1, 2, 3)(4, 5)\,(1, 2, 3, 4, 5) = (1, 3, 2, 4)(5).$$
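The left-to-right multiplication rule can be tried out with a small amount of code. This sketch stores a permutation of $\{1, \ldots, n\}$ as a dictionary and builds it from disjoint cycles; the function names `from_cycles` and `compose` are hypothetical.

```python
def from_cycles(cycles, n):
    """Permutation of {1, ..., n} given by disjoint cycles, as a dict x -> image of x."""
    perm = {i: i for i in range(1, n + 1)}
    for cycle in cycles:
        # each entry maps to the next one, and the last wraps around to the first
        for a, b in zip(cycle, cycle[1:] + cycle[:1]):
            perm[a] = b
    return perm


def compose(p, q):
    """Product pq read left to right: apply p first, then q."""
    return {x: q[p[x]] for x in p}


left = from_cycles([(1, 2, 3), (4, 5)], 5)
right = from_cycles([(1, 2, 3, 4, 5)], 5)
product_perm = compose(left, right)
print(product_perm)
```

Swapping the arguments of `compose` on two transpositions that share a point gives a different result, which is a quick way to see that the operation is not commutative.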

Since $(1, 2)(1, 3) \ne (1, 3)(1, 2)$, the symmetric group $S_n$ is nonabelian for $n \ge 3$.

Two groups $G_1$ and $G_2$ are isomorphic if there is a bijection (called an isomorphism) $\varphi : G_1 \to G_2$ that preserves multiplication: $\varphi(gh) = \varphi(g)\varphi(h)$, for all $g, h \in G_1$. For example, $Z_{10}^*$ is isomorphic to $Z_4$. Can you find an isomorphism?

Suppose that $G_1$ and $G_2$ are two groups. The product of $G_1$ and $G_2$, denoted $G_1 \times G_2$, is the set of ordered pairs $\{(g_1, g_2) : g_1 \in G_1, g_2 \in G_2\}$ subject to the multiplication rule $(g_1, g_2)(g_1', g_2') = (g_1 g_1', g_2 g_2')$.

The product $Z_2 \times Z_2$ is a four-element group. It is not isomorphic to $Z_4$, for $Z_2 \times Z_2$ has three elements of order 2 while $Z_4$ has only one. The group $Z_2 \times Z_3$ is isomorphic to $Z_6$. Can you find an isomorphism?

A subset $H$ of $G$ is a subgroup of $G$ if $H$ is a group with respect to the group operation of $G$. For example, the two-element group $\{(1, 2)(3), (1)(2)(3)\}$ is a subgroup of the six-element group $S_3$.

The symmetric group $S_n$ is especially important because every finite group is isomorphic to a subgroup of some $S_n$.

Theorem. If G is a ﬁnite group of order n, then G is isomorphic to a subgroup of Sn.


Figure A.1. Generators of the dihedral group D6.

Here is a proof. For each element $g \in G$, we define a function $f_g : G \to G$ by the rule $f_g(a) = ag$ (right multiplication by $g$). Because $f_g$ has an inverse, $f_{g^{-1}}$, it is a bijection. We check: $f_g(f_{g^{-1}}(a)) = ag^{-1}g = a$ and $f_{g^{-1}}(f_g(a)) = agg^{-1} = a$. Since $f_g$ is a permutation of the $n$-element set $G$, we can define a function $\varphi : G \to S_n$ by $\varphi(g) = f_g$. We claim that $\varphi$ is an isomorphism between $G$ and the range of $\varphi$. First, we check that $\varphi$ preserves multiplication:

$$\varphi(gh)(a) = f_{gh}(a) = a(gh) = (ag)h = f_h(f_g(a)) = (f_g f_h)(a) = (\varphi(g)\varphi(h))(a).$$

Second, we check that $\varphi$ is one-to-one: If $\varphi(g)(a) = \varphi(h)(a)$, then $f_g(a) = f_h(a)$, which implies that $ag = ah$, and $g = h$.
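The construction in the proof can be carried out concretely. In the sketch below the group is $S_3$, stored as tuples with left-to-right composition; the names `cayley_embedding` and `op` are made up for this illustration, and the resulting permutations act on the positions 0 to 5 of the element list rather than on 1 to 6.

```python
from itertools import permutations


def op(p, q):
    """Left-to-right product of permutations stored as tuples: apply p, then q."""
    return tuple(q[p[i]] for i in range(len(p)))


def cayley_embedding(elements, op):
    """Map each g to the permutation a -> a*g of the element list (right multiplication)."""
    index = {g: i for i, g in enumerate(elements)}
    return {g: tuple(index[op(a, g)] for a in elements) for g in elements}


S3 = list(permutations(range(3)))
phi = cayley_embedding(S3, op)

# phi(gh) should equal phi(g) followed by phi(h), and phi should be one-to-one.
preserves_products = all(
    phi[op(g, h)] == op(phi[g], phi[h]) for g in S3 for h in S3
)
is_injective = len(set(phi.values())) == len(S3)
print(preserves_products, is_injective)
```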

The dihedral group $D_n$, of order $2n$, consists of the set of symmetries of a regular convex $n$-gon. If we number the vertices of the $n$-gon 1, ..., $n$, then we see that $D_n$ is a subgroup of $S_n$. The subgroup is generated by two permutations: the rotation $r = (1, 2, 3, \ldots, n)$ and a flip $f$ along an axis of symmetry of the $n$-gon. If $n$ is odd, we take the flip to be

$$f = (n)(1, n-1)(2, n-2) \cdots ((n-1)/2, (n+1)/2).$$

If $n$ is even, we take

$$f = (1, n)(2, n-1) \cdots (n/2, n/2 + 1).$$

See Figure A.1 for a depiction of $D_6$, the group of symmetries of a regular convex hexagon.

Every element of $D_n$ can be written in the form $r^\alpha f^\beta$, where $\alpha \in \{0, 1, 2, \ldots, n-1\}$ and $\beta \in \{0, 1\}$. Elements are multiplied using the rules $r^n = e$, $f^2 = e$, and $rf = fr^{-1}$. We say that $D_n$ has the presentation

$$\langle r, f : r^n = e,\ f^2 = e,\ rf = fr^{-1} \rangle.$$

For an explanation of the theory of group presentations, consult [27].

We have noted that every element of $S_n$ can be expressed as a product of cycles. A cycle of length one is called a fixed point and a cycle of length two is called a transposition.


Cycles of length greater than two can be written as products of transpositions. For example, $(1, 2, 3) = (1, 2)(1, 3)$. A permutation may be written as a product of fixed points and transpositions in more than one way. The number of transpositions is always even or always odd. A permutation is accordingly called an even permutation or an odd permutation. Of the $n!$ permutations in $S_n$, half are even and half are odd. This follows from the observation that $f(\sigma) = \sigma(1, 2)$ is a bijection between the set of even permutations in $S_n$ and the set of odd permutations in $S_n$. As the identity permutation is even and the set of even permutations is closed under multiplication and taking inverses, the even permutations are a group. The alternating group $A_n$ is the group of even permutations of an $n$-element set. It has order $n!/2$.

Let $G$ be a group and $X$ a set. An action of $G$ on $X$ is a function that associates to each $g \in G$ and $x \in X$ an element of $X$, denoted $gx$, such that the following conditions hold:

For every $x \in X$, we have $ex = x$ (where $e$ is the identity element of $G$).

For every $g, h \in G$ and $x \in X$, we have $g(hx) = (gh)x$.

In a group action, each element $g \in G$ yields a permutation of the set $X$, defined by sending $x$ to $gx$. For if $gx = gy$, then $x = y$, so the map is one-to-one, and $g(g^{-1}x) = x$, so the map is onto.

For example, the symmetric group $S_n$ acts on the set $\{1, 2, 3, \ldots, n\}$ by the action $gx = g(x)$, where $g(x)$ is the image of $x$ under the bijection $g : \{1, 2, 3, \ldots, n\} \to \{1, 2, 3, \ldots, n\}$.

Similarly, the cyclic group $Z_n$ acts on the set $\{1, 2, 3, \ldots, n\}$ by the action

$$gx = \begin{cases} g + x & \text{if } g + x \le n \\ g + x - n & \text{if } g + x > n. \end{cases}$$

Here $g$ denotes the equivalence class representative of $[g]$ between 1 and $n$.

A.5 Fields

The concept of a field arose in the study of the solvability of polynomial equations as well as in the study of properties of the real numbers and complex numbers. Heinrich M. Weber (1842–1913) gave the first modern definition of a field.

A field $F$ is a set having at least two elements, with two binary operations, $+$ and $\cdot$, such that the following conditions hold:

$F$ is an abelian group with respect to $+$.

$F \setminus \{0\}$, where 0 is the additive identity, is an abelian group with respect to $\cdot$.

For all $x, y, z \in F$, we have $x \cdot (y + z) = x \cdot y + x \cdot z$ (distributive law).

Examples of fields:

The set $R$ of real numbers with the usual addition and multiplication.

The set $Q$ of rational numbers with the usual addition and multiplication.

The set $Z_2 = \{0, 1\}$ with addition and multiplication modulo 2.


The set $Z_p = \{0, 1, \ldots, p-1\}$, where $p$ is a prime number, with addition and multiplication modulo $p$.

A finite field exists and is unique up to isomorphism for any prime power order. We show a construction for the order $8 = 2^3$. Start with the field $Z_2 = \{0, 1\}$. Find a polynomial of degree 3 over this field that doesn’t factor into polynomials of lesser degree. One choice is $f(x) = x^3 + x + 1$. We see that $f(0) = 0^3 + 0 + 1 = 1$ and $f(1) = 1^3 + 1 + 1 = 1$. So neither 0 nor 1 is a root of $f$. If $f$ factored into polynomials of lesser degree, then at least one of the factors would be linear, but then 0 or 1 would be a root. Hence $f$ doesn’t factor. Next, take $\theta$ to be a root of $f$ in the field of order 8 that we are trying to construct. Thus $f(\theta) = \theta^3 + \theta + 1 = 0$, which implies that $\theta^3 = \theta + 1$ (because the base field is $Z_2$). Define the field to be the collection of polynomials of degree $\le 2$ in $\theta$ over $Z_2$. The eight field elements are:

$$0,\ 1,\ \theta,\ \theta^2,\ \theta + 1,\ \theta^2 + \theta,\ \theta^2 + 1,\ \theta^2 + \theta + 1.$$

To do addition and multiplication in this field, add or multiply polynomials, reduce coefficients modulo 2, and use the identity $\theta^3 = \theta + 1$. The seven nonzero elements of the field form a cyclic group generated by $\theta$. Successive powers of $\theta$ comprise all seven nonzero field elements. See Appendix B for a challenge about constructing another finite field.
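The arithmetic of this eight-element field fits comfortably in a few lines of code. In the sketch below, which is one possible encoding rather than anything prescribed by the text, a field element is a 3-bit integer whose bit $k$ is the coefficient of the $k$th power of the chosen root, so the root itself is `0b010`; the names `MODULUS`, `gf8_add`, and `gf8_mul` are hypothetical.

```python
MODULUS = 0b1011  # encodes x^3 + x + 1


def gf8_add(a, b):
    """Addition: coefficients are added modulo 2, which is bitwise XOR."""
    return a ^ b


def gf8_mul(a, b):
    """Multiplication: shift-and-add, reducing with the cubic whenever degree 3 appears."""
    result = 0
    while b:
        if b & 1:
            result ^= a
        b >>= 1
        a <<= 1
        if a & 0b1000:      # a now has a cubic term
            a ^= MODULUS    # replace it using x^3 = x + 1
    return result


ROOT = 0b010
powers = []
p = 1
for _ in range(7):
    p = gf8_mul(p, ROOT)
    powers.append(p)
print([format(x, "03b") for x in powers])
```

The list `powers` runs through every nonzero element exactly once and ends at 1, in line with the statement that the nonzero elements form a cyclic group of order 7.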

A.6 Vector Spaces

A vector space consists of a group of vectors defined over a field of scalars. More formally:

A vector space $V$ over a field $F$ is an additive abelian group together with a rule that assigns to every $f \in F$ and $v \in V$ an element $f \cdot v \in V$ such that the following conditions hold for all $f$, $f_1$, $f_2 \in F$ and $v$, $v_1$, $v_2 \in V$:

$f \cdot (v_1 + v_2) = f \cdot v_1 + f \cdot v_2$;

$(f_1 + f_2) \cdot v = f_1 \cdot v + f_2 \cdot v$;

$f_1 \cdot (f_2 \cdot v) = (f_1 f_2) \cdot v$;

$1 \cdot v = v$, where 1 is the multiplicative identity of $F$.

Elements of $V$ are called vectors and elements of $F$ are called scalars.

Examples of vector spaces:

The group $R^2$ is a vector space over the field $R$.

The group $R$ is a vector space over the field $Q$ of rational numbers.

An important example of a vector space is $F^n$, the vector space of ordered $n$-tuples over a field $F$. Addition and scalar multiplication of vectors are defined componentwise. We write an element of $F^n$ as an $n \times 1$ vector. For example, with $n = 4$ and $F = Z_2$, one vector is

$$\begin{bmatrix} 1 \\ 0 \\ 1 \\ 0 \end{bmatrix}.$$


Suppose that $V$ is a vector space over $F$. A subset $S$ of $V$ spans $V$ if every vector $v \in V$ can be written as a linear combination of elements of $S$; that is,

$$v = f_1 v_1 + \cdots + f_n v_n,$$

for some elements $v_1$, ..., $v_n$ in $S$ and $f_1$, ..., $f_n$ in $F$. A subset $S$ of $V$ is linearly independent if no element of $S$ can be written as a linear combination of the other elements of $S$. A basis of $V$ is a subset of $V$ that spans $V$ and is linearly independent.

Theorem. If V is a vector space, then V has a basis. Moreover, all bases of V have the same cardinality.

The cardinality of a basis of a vector space is its dimension. For example, the vector space $R^2$ over $R$ has dimension 2. One basis, called the standard basis, consists of

$$\begin{bmatrix} 1 \\ 0 \end{bmatrix} \quad \text{and} \quad \begin{bmatrix} 0 \\ 1 \end{bmatrix}.$$

A linear transformation from one vector space to another is given by a matrix. A matrix $A$ is a rectangular array of numbers $[a_{ij}]$, where $1 \le i \le m$ and $1 \le j \le n$.

For $2 \times 2$ matrices

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \quad \text{and} \quad B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix},$$

where the entries are arbitrary numbers, we define

$$A + B = \begin{bmatrix} a_{11} + b_{11} & a_{12} + b_{12} \\ a_{21} + b_{21} & a_{22} + b_{22} \end{bmatrix}.$$

That is, we add the corresponding entries of $A$ and $B$.

We define scalar multiplication so that the result of applying the transformation $A$ and then a multiple $c$ is the same as applying the transformation $cA$. This definition amounts to multiplying each entry of $A$ by $c$.

We need to define matrix multiplication. We want to define the matrix product $AB$ so that it represents the result of applying the linear transformation $B$ to $x$ and $y$, and applying the linear transformation $A$ to the result. We write $x$ and $y$ as a vector

$$\begin{bmatrix} x \\ y \end{bmatrix}.$$

We set

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix} = \begin{bmatrix} a_{11}x + a_{12}y \\ a_{21}x + a_{22}y \end{bmatrix},$$


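and

$$\begin{aligned}
\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}
&= \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11}x + b_{12}y \\ b_{21}x + b_{22}y \end{bmatrix} \\
&= \begin{bmatrix} (a_{11}b_{11} + a_{12}b_{21})x + (a_{11}b_{12} + a_{12}b_{22})y \\ (a_{21}b_{11} + a_{22}b_{21})x + (a_{21}b_{12} + a_{22}b_{22})y \end{bmatrix} \\
&= \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix} \begin{bmatrix} x \\ y \end{bmatrix}.
\end{aligned}$$

Therefore, we define

$$\begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix} \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix} = \begin{bmatrix} a_{11}b_{11} + a_{12}b_{21} & a_{11}b_{12} + a_{12}b_{22} \\ a_{21}b_{11} + a_{22}b_{21} & a_{21}b_{12} + a_{22}b_{22} \end{bmatrix}.$$

We call the $ij$ entry of the product the dot product of the $i$th row vector of $A$ and the $j$th column vector of $B$.

Matrix addition is defined similarly for any two matrices of the same dimensions, and matrix multiplication is defined similarly for any two matrices in which the number of columns of the first matrix is the same as the number of rows of the second matrix.

We can use matrices to solve systems of linear equations. For example, we can write the system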

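$$\begin{aligned}
3x + 4y + 5z &= 154 \\
x + 10z &= 0 \\
3x + 7y + 12z &= 385
\end{aligned}$$

as

$$\begin{bmatrix} 3 & 4 & 5 \\ 1 & 0 & 10 \\ 3 & 7 & 12 \end{bmatrix} \begin{bmatrix} x \\ y \\ z \end{bmatrix} = \begin{bmatrix} 154 \\ 0 \\ 385 \end{bmatrix}.$$

The matrix

$$\begin{bmatrix} x \\ y \\ z \end{bmatrix}$$

is a vector. Call it $v$. Denote the $3 \times 3$ matrix by $A$ and the vector of constants on the right side by $c$. The system is

$$Av = c.$$

We can solve this system by multiplying by the inverse of $A$. Suppose that $A^{-1}$ is the inverse of $A$ (with respect to matrix multiplication). Then

$$v = A^{-1}c.$$

When is the matrix

$$A = \begin{bmatrix} a_{11} & a_{12} \\ a_{21} & a_{22} \end{bmatrix}$$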


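invertible? Solving for $x$ and $y$ in the corresponding two-equation system $A\mathbf{x} = \mathbf{c}$, we find that

\[
x = \frac{c_1 a_{22} - c_2 a_{12}}{a_{11}a_{22} - a_{12}a_{21}} \quad\text{and}\quad y = \frac{c_2 a_{11} - c_1 a_{21}}{a_{11}a_{22} - a_{12}a_{21}}.
\]

The quantity $a_{11}a_{22} - a_{12}a_{21}$ is called the determinant of the matrix. The system has a unique solution for every $\mathbf{c}$ if and only if the determinant is nonzero.

The determinant of an $n \times n$ matrix $A = [a_{i,j}]$ is defined as

\[
\det A = \sum_{\sigma} \operatorname{sgn}(\sigma)\, a_{1,\sigma(1)} a_{2,\sigma(2)} \cdots a_{n,\sigma(n)},
\]

where the sum is over all $n!$ permutations $\sigma$ of the set $\{1, 2, \ldots, n\}$ and $\operatorname{sgn}(\sigma)$ is the sign of $\sigma$, i.e., $+1$ if $\sigma$ is an even permutation and $-1$ if $\sigma$ is an odd permutation.

Let’s look at a matrix as a geometric transformation. Consider a rotation of the Cartesian coordinate system by $\theta$ radians in the counterclockwise direction about the origin. We can find the matrix that performs this transformation. Trigonometry shows that

\[
\begin{bmatrix} 1 \\ 0 \end{bmatrix} \text{ is rotated to } \begin{bmatrix} \cos\theta \\ \sin\theta \end{bmatrix} \quad\text{and}\quad \begin{bmatrix} 0 \\ 1 \end{bmatrix} \text{ is rotated to } \begin{bmatrix} -\sin\theta \\ \cos\theta \end{bmatrix}.
\]

It follows from the definition of matrix multiplication that the rotation matrix is given by

\[
R = \begin{bmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{bmatrix}.
\]

The definitions make such computations transparent.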
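The definitions in this section can be tried out directly. The following is a minimal Python sketch, not part of the original text: the helper names `mat_vec`, `mat_mul`, `sign`, `det`, and `rotation` are hypothetical, and the determinant is computed straight from the permutation sum, which is only practical for small matrices.

```python
import math
from itertools import permutations

def mat_vec(A, v):
    """Apply the matrix A (a list of rows) to the vector v."""
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def mat_mul(A, B):
    """Product AB: entry ij is the dot product of row i of A and column j of B."""
    cols = list(zip(*B))
    return [[sum(a * b for a, b in zip(row, col)) for col in cols] for row in A]

def sign(perm):
    """Sign of a permutation of 0..n-1, found by counting inversions."""
    n = len(perm)
    inversions = sum(1 for i in range(n) for j in range(i + 1, n) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def det(A):
    """Determinant as the signed sum over all n! permutations."""
    n = len(A)
    total = 0
    for perm in permutations(range(n)):
        term = sign(perm)
        for i in range(n):
            term *= A[i][perm[i]]
        total += term
    return total

def rotation(theta):
    """Matrix of a counterclockwise rotation by theta radians."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

# Applying B and then A agrees with applying the single matrix AB.
A = [[1, 2], [3, 4]]
B = [[0, 1], [5, -2]]
v = [7, -3]
assert mat_vec(A, mat_vec(B, v)) == mat_vec(mat_mul(A, B), v)

# The 3 x 3 coefficient matrix of the example system has a nonzero determinant,
# so it is invertible.
print(det([[3, 4, 5], [1, 0, 10], [3, 7, 12]]))
```

The first assertion is the design principle behind the definition of the product: composition of transformations corresponds to multiplication of matrices.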


B Eye-Opening Explorations

The only way to learn mathematics is to do mathematics.
PAUL HALMOS (1916–2006)

In this appendix, we pose some problems related to topics discussed in the book. Can you solve these problems?

B.1 Problems

1. Recall the Lemniscate of Chapter 1. A Cartesian equation for this curve is

\[
(x^2 + y^2)^2 = x^2 - y^2.
\]

Suppose that $x$ and $y$ are integers considered modulo $p$, where $p$ is an odd prime. Use a computer algebra system to count the number of ordered pairs $(x, y)$ that satisfy the lemniscate equation modulo $p$. Conjecture a formula for the number of solutions in terms of $p$.

2. Recall the discussion of a googol in Centillion in Chapter 1. What is the smallest number of pennies such that the number of subsets is greater than a googol?

3. Recall the properties of complex numbers and determinants discussed in Chapter 1.

(a) Prove that two triangles whose vertices in the complex plane are $\alpha$, $\beta$, $\gamma$ and $\alpha'$, $\beta'$, $\gamma'$ are similar if and only if

\[
\begin{vmatrix} \alpha & \beta & \gamma \\ \alpha' & \beta' & \gamma' \\ 1 & 1 & 1 \end{vmatrix} = 0.
\]

(b) Prove that the complex numbers $\alpha$, $\beta$, and $\gamma$ are the vertices of an equilateral triangle if and only if

\[
\alpha + \beta\omega + \gamma\omega^2 = 0,
\]

where

\[
\omega = -\frac{1}{2} + \frac{\sqrt{3}}{2}\, i
\]

is a cube root of unity.


4. In this problem, we look at a flat version of the Square Pyramidal Square Number of Chapter 2. Find an integer greater than 1 that is both a square number and a triangular number.

5. In Bulging Hyperspheres of Chapter 2, we saw that a “small” hypersphere of radius $\sqrt{d} - 1$ bulges outside a $d$-dimensional hypercube of side 4 when $d > 9$. What happens to the ratio of the volume of this hypersphere to the volume of the hypercube as $d$ increases? Use the formula from Volume of a Ball in Chapter 3.

6. Recall the Two-Colored Graph in Chapter 2. Draw a three-coloring of the edges of the complete graph on 16 vertices with the property that there is no triangle all of whose edges are the same color. Refer to the discussion of Hypercube in Chapter 2. Prove that a three-coloring of the edges of a complete graph on 17 vertices must contain a triangle with all three edges the same color. Use the pigeonhole principle (see page 66). These results are prominent in the area of combinatorics known as Ramsey theory.

7. Recall the Hypercube of Chapter 2. If we define a hypercube in 5 dimensions, how many vertices does it have? How many neighboring vertices does each vertex have? How many edges are there?

8. Recall the Squaring Map of Chapter 2. For $n = 10^6$, find the number of components of the graph and the size of a largest attractor. You may need a computer.

9. Recall the Riemann Sphere of Chapter 2. What action on the Riemann sphere is caused by the mapping $z \mapsto 1/z$ in the complex plane? What geometric relationship do the numbers $z$ and $-1/\bar{z}$ have?

10. Recall the Heronian triangles of Chapter 3. Find two incongruent triangles with integer side lengths, having the same integer area and perimeter. You may need a computer.

11. Use the technique for finding the area of a triangle given in Heron’s Formula and Heronian Triangles in Chapter 3 to find the volume of a tetrahedron in terms of its side lengths.

12. Recall the product formula for $\sin x$ given in Product for Pi in Chapter 3. Prove the infinite product formula for the hyperbolic sine function:

\[
\prod_{n=1}^{\infty} \frac{n^2 + 1}{n^2} = \frac{\sinh \pi}{\pi}.
\]

13. Use mathematical induction to prove Cassini’s identity, from Fibonacci Numbers and Pi in Chapter 3:

\[
F_n^2 - F_{n+1}F_{n-1} = (-1)^{n+1}, \quad n \ge 1.
\]

Use this and the formula for the tangent of a difference to prove the partial sums formula

\[
\sum_{n=1}^{k} \tan^{-1} \frac{1}{F_{2n+1}} = \frac{\pi}{4} - \tan^{-1} \frac{1}{F_{2k+2}}.
\]


14. Recall the definition of the Fibonacci sequence $\{F_n\}$ given in Fibonacci Numbers and Pi in Chapter 3. Find its generating function. Also, find the generating function for the sequence of fifth powers of the Fibonacci numbers, $\{F_n^5\}$.

15. Recall The Smallest Taxicab Number of Chapter 3. Find the smallest positive integer that is the sum of two fourth powers in more than one way.

16. In the discussion of The Zeta Function and Bernoulli Numbers in Chapter 3, we said that we could use Bernoulli numbers to find

\[
\sum_{m=1}^{\infty} \frac{1}{m^4}.
\]

Fill in the details of this calculation.

17. Recall the Riemann zeta function from Chapter 3. Prove that

\[
\zeta(-n) = -\frac{B_{n+1}}{n+1}, \quad n \ge 1.
\]

In particular, this means that $\zeta(-n) = 0$ when $n$ is an even positive integer.

18. Give a counting proof of the recurrence formula for the number of Rook paths in Chapter 3:

\[
r(0,0) = 1, \quad r(0,1) = 1, \quad r(1,0) = 1, \quad r(1,1) = 2;
\]
\[
r(m,n) = 2r(m-1,n) + 2r(m,n-1) - 3r(m-1,n-1), \quad m \ge 2 \text{ or } n \ge 2.
\]

19. Recall the theorem A Square inside Every Triangle from Chapter 4. Describe how to perform the construction of a square inside a triangle using straightedge and compass.

20. Recall from Polynomial Symmetries in Chapter 4 that given a finite group, there is a polynomial whose symmetries are that group. Find such a polynomial for the dihedral group $D_6$. See Figure A.1.

21. Recall the definition of Kings and Serfs in a tournament from Chapter 4. A vertex that reaches every other vertex in one step is called an Emperor. Prove that

(a) A tournament with no Emperor has at least three Kings.

(b) A tournament on $n > 4$ vertices can have any number of Kings between 1 and $n$ except 2.

22. How many permutations of the set $\{1, 2, 3, \ldots, 16\}$ have no increasing subsequence of length five and no decreasing subsequence of length five? Recall the discussion after the Erdős–Szekeres theorem in Chapter 4.

23. Use Minkowski’s theorem from Chapter 4 to prove that, for any real numbers $a$, $b$, $c$, $d$ with

\[
\Delta = \begin{vmatrix} a & b \\ c & d \end{vmatrix} \ne 0,
\]


there exist integers $x$ and $y$, not both 0, such that

\[
|ax + by| \le \sqrt{\Delta} \quad\text{and}\quad |cx + dy| \le \sqrt{\Delta}.
\]

24. Recall the lemniscate graph from Chapter 4. Find the area it encloses.

25. The substitution $r = \sin t$ changes the integral giving the length of the lemniscate curve from

\[
L = 4 \int_0^{\pi/2} \frac{dt}{\sqrt{1 + \sin^2 t}}
\]

to

\[
L = 4 \int_0^1 \frac{dr}{\sqrt{1 - r^4}}.
\]

Use this integral to show

\[
L = 4 \left( 1 + \frac{1}{2 \cdot 5} + \frac{1 \cdot 3}{2 \cdot 4 \cdot 9} + \frac{1 \cdot 3 \cdot 5}{2 \cdot 4 \cdot 6 \cdot 13} + \cdots \right).
\]

The last number in each denominator is twice the previous number plus 1.

26. Investigate the Fibonacci sequence $\{F_n\}$ modulo powers of 2. Can you conjecture a formula for the length of the period of the sequence $\{F_n \bmod 2^m\}$, where $m$ is a positive integer? Do the same for the sequence $\{F_n \bmod 5^m\}$. This problem comes from the discussion of Integer Triangles in Chapter 5.

27. Find a formula for the number of odd entries in the $n$th row of Pascal’s triangle. Recall Odd Binomial Coefficients in Chapter 5.

28. Use the method of constructing Perfect Error-Correcting Codes from Chapter 5 to construct a binary code of length eight, consisting of 16 code words at distance at least four apart. Start with a cyclic graph on four vertices. What code do you get when you delete one coordinate from the length eight code?

29. Recall the problem Making a Million in Chapter 6. How many ways are there to make a million dollars if we also use \$2 bills?

30. Prove the claim about the expected number of steps in the one-dimensional gobbling algorithm described in Two-Dimensional Gobbling Algorithm in Chapter 7.

31. Show how to construct a field of nine elements. Recall the explanation of constructing finite fields in Appendix A. Start with the field Z3. Find a polynomial of degree 2 that doesn’t factor over this field. Recall the discussion of Projective Plane in Chapter 2. Explain how to use the nine point field to construct a projective plane with 91 points and 91 lines.


B.2 Solutions

1. If $p \not\equiv 1 \pmod 8$, then the curve contains $p$ integer points, while if $p \equiv 1 \pmod 8$, then it contains $p - 4$ integer points.

We find a parameterization of the lemniscate with rational functions. Recalling the technique of Diophantus from Squares in Arithmetic Progression in Chapter 5, we use the rational parameterization of the unit circle

\[
\cos t = \frac{2v}{1 + v^2}, \quad \sin t = \frac{1 - v^2}{1 + v^2}, \quad v \in \mathbf{Q}.
\]

From this, we obtain a rational parameterization for the lemniscate:

\[
x = \frac{v + v^3}{1 + v^4}, \quad y = \frac{v - v^3}{1 + v^4}, \quad v \in \mathbf{Q}.
\]

If we plot this curve in the plane, where $v$ is a real variable, we see that the entire lemniscate is generated with each point occurring exactly once. If $v$ is in the field $\{0, 1, 2, \ldots, p-1\}$, then we have to interpret what the division in the formula means and what happens if a denominator is 0. If the denominators are nonzero, then we interpret division as multiplying by a multiplicative inverse.

When can the denominators $1 + v^4$ be 0 modulo $p$? The nonzero elements modulo $p$, where $p$ is a prime, form a cyclic group of order $p - 1$, with multiplication modulo $p$. Let $g$ be a generator of the group. Then $-1 = g^{(p-1)/2}$, and we see that $-1$ has a fourth root if and only if $(p-1)/8$ is an integer, i.e., $p \equiv 1 \pmod 8$. In this case, the four solutions to $1 + v^4 = 0$ are $g^{(p-1)/8}$, $g^{3(p-1)/8}$, $g^{5(p-1)/8}$, and $g^{7(p-1)/8}$. For any prime $p$, the defined points in the parameterization are distinct. This accounts for $p$ points on the lemniscate if $p \not\equiv 1 \pmod 8$ and $p - 4$ points if $p \equiv 1 \pmod 8$.

To show that the values are distinct, assume that

\[
\frac{v_1(1 + v_1^2)}{1 + v_1^4} = \frac{v_2(1 + v_2^2)}{1 + v_2^4}, \quad \frac{v_1(1 - v_1^2)}{1 + v_1^4} = \frac{v_2(1 - v_2^2)}{1 + v_2^4}.
\]

Dividing the first expression by the second and simplifying, we obtain

\[
\frac{1 + v_1^2}{1 - v_1^2} = \frac{1 + v_2^2}{1 - v_2^2},
\]

or

\[
-1 + \frac{2}{1 - v_1^2} = -1 + \frac{2}{1 - v_2^2},
\]

which implies that $v_1^2 = v_2^2$. Substituting into the original equations yields $v_1 = v_2$.

2. A computer check shows that $n = 333$ is the smallest integer such that

\[
2^n > 10^{100}.
\]
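Both of the first two solutions can be checked by brute force. The sketch below is one possible way to do it in Python rather than in a computer algebra system; the function names `lemniscate_points_mod` and `smallest_exponent_exceeding_googol` are hypothetical.

```python
def lemniscate_points_mod(p):
    """Count pairs (x, y) modulo p with (x^2 + y^2)^2 = x^2 - y^2 (mod p)."""
    return sum(
        1
        for x in range(p)
        for y in range(p)
        if ((x * x + y * y) ** 2 - (x * x - y * y)) % p == 0
    )

def smallest_exponent_exceeding_googol():
    """Smallest n with 2^n > 10^100, using exact integer arithmetic."""
    n = 0
    while 2 ** n <= 10 ** 100:
        n += 1
    return n

# Compare the brute-force counts with the formula stated in Solution 1.
for p in [3, 5, 7, 11, 13, 17, 19, 23]:
    predicted = p - 4 if p % 8 == 1 else p
    print(p, lemniscate_points_mod(p), predicted)

print(smallest_exponent_exceeding_googol())
```

The double loop costs $p^2$ evaluations, which is fine for conjecturing a pattern from small primes but would need the parameterization for large $p$.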

3. (a) Two triangles are similar if and only if one can be dilated, rotated, and translated to the position of the other. The first row of the given matrix is a linear combination of the second and third rows (and the determinant is 0) when

\[
(\alpha, \beta, \gamma) = a(\alpha', \beta', \gamma') + b(1, 1, 1).
\]


The multiplier $a$ does the dilation and rotation (a complex number can accomplish this), while $b(1, 1, 1)$ is a translation in the direction of the complex number $b$.

(b) The complex numbers $\alpha$, $\beta$, and $\gamma$ are the vertices of an equilateral triangle if and only if

\[
\omega(\gamma - \beta) = \alpha - \gamma.
\]

To see this, sketch the vectors $\gamma - \beta$ and $\alpha - \gamma$, and see what happens when the first vector is rotated $120^\circ$ counterclockwise (done by $\omega$). Using the relation $1 + \omega + \omega^2 = 0$, we have

\[
\alpha + \beta\omega + \gamma\omega^2 = 0.
\]

4. It is easy to check that 36 is the smallest such number. We have

\[
36 = 6^2 = 1 + 2 + 3 + \cdots + 8.
\]

All square triangular numbers are given by the recurrence formula

\[
a_0 = 0, \quad a_1 = 1, \quad a_n = 6a_{n-1} - a_{n-2}, \quad n \ge 2.
\]

Thus $\{a_n\} = \{0, 1, 6, 35, 204, \ldots\}$, and $a_n^2$ is both square and triangular.

5. The ratio of the volume of the hypersphere of radius $\sqrt{d} - 1$ to the volume of the hypercube of side 4 is

\[
\frac{\pi^{d/2} (\sqrt{d} - 1)^d}{(d/2)!\, 4^d}.
\]

As $d$ increases, the ratio decreases close to 0 and then takes off toward infinity.

6. The picture below shows a graph representing one of the color classes. This graph can be made from the four-dimensional hypercube with each point also joined to its diagonal opposite. In binary representation this means joining each binary string to the strings that differ from it in one coordinate and to its complementary string.

[Figure: one color class of the three-coloring on sixteen vertices.]

The three color classes must be put together to make the complete graph on sixteen vertices. I leave this as an exercise. Now we show that every three-coloring of the edges of the complete graph on seventeen vertices must contain a monochromatic triangle. Choose a vertex. There are sixteen edges emanating from it. Since there are three colors available for the edges, it follows by the pigeonhole principle that at least six of them are the same color. Without loss


of generality, suppose that six edges emanating from the vertex are blue. If any edge joining two endpoints of these edges is blue, then there is a blue triangle and we are done. So suppose that all the edges on the complete graph formed by the six endpoints are red or green. Then, by the version of Ramsey’s theorem mentioned in Two-Colored Graph, there is a red triangle or a green triangle and we are done.

7. From the model of a hypercube in dimension $n$ as the set of binary strings of length $n$, we see that in dimension 5 there are $2^5 = 32$ vertices. The neighbors of a vertex are the strings that differ from it in exactly one coordinate. Since there are five choices for the coordinate, each vertex has five neighbors. Multiplying the number of vertices by the number of neighbors per vertex, we get the total number of edges counted twice (from the perspective of both endpoints). Hence, the hypercube has $32 \cdot 5/2 = 80$ edges.

An $n$-dimensional hypercube has a vertex set consisting of all binary $n$-tuples. Two vertices are joined by an edge if they differ in exactly one coordinate. For $0 \le k \le n$, a $k$-dimensional face of an $n$-dimensional hypercube is a subset of $k$ coordinates and the corresponding edge connections that form a $k$-dimensional hypercube. The number of $k$-dimensional faces of an $n$-dimensional hypercube is

\[
\binom{n}{k} 2^{n-k}, \quad 0 \le k \le n.
\]

The reason is that there are $\binom{n}{k}$ choices for the $k$ coordinates that form the $k$-dimensional hypercube. The other $n - k$ coordinates can be either 0 or 1.

For example, the 4-dimensional hypercube has $2^4 = 16$ vertices, $\binom{4}{1} 2^3 = 32$ edges, $\binom{4}{2} 2^2 = 24$ faces, $\binom{4}{3} 2 = 8$ three-dimensional faces, and $\binom{4}{4} = 1$ four-dimensional face (the whole hypercube). Try this calculation for the 3-dimensional cube.

8. A simple algorithm finds the number of components and the size of a largest attractor in the squaring map modulo $n$. Start with $n$ singleton sets $\{0\}$, $\{1\}$, $\{2\}$, $\ldots$, $\{n-1\}$. For $k$ from 0 to $n - 1$, merge the set containing $k$ and the set containing $k^2 \bmod n$. Then the sets are the components of the graph. To find the size of the attractor in each set, choose an element of a set and compute its successive squares modulo $n$. When the process goes into a loop, count the number of steps in the loop. Applying this algorithm (using a computer) when $n = 10^6$, we find that there are fourteen components and a largest attractor has size 2500.

9. The Riemann sphere is rotated $180^\circ$ about the real ($x$-) axis. To see this, show that the mapping $z \mapsto 1/z$ induces the mapping $(x, y, z) \mapsto (x, -y, -z)$. The numbers $z$ and $-1/\bar{z}$ are antipodal points on the Riemann sphere.

10. A simple computer program finds the smallest two such triangles: $\{17, 25, 28\}$ and $\{20, 21, 29\}$, both with perimeter 70 and area 210. You can find this solution by hand with a little educated guessing. Heron’s formula says that the area of a triangle with sides $\{a, b, c\}$ is $\sqrt{s(s-a)(s-b)(s-c)}$, where $s = (a + b + c)/2$. Setting $a' = s - a$, $b' = s - b$, and $c' = s - c$, the formula becomes $\sqrt{s a' b' c'}$. Noting that $a' + b' + c' = s$, the problem requires finding two triples $\{a', b', c'\}$ with equal products and equal sums. Experimenting with prime factors 2, 3, 5, and 7, we find two such triples: $\{2 \cdot 3^2, 2 \cdot 5, 7\}$


and $\{3 \cdot 5, 2 \cdot 7, 2 \cdot 3\}$, both with sum $5 \cdot 7$; and the product $s a' b' c'$ is a perfect square. The triples $\{a', b', c'\}$ determine the triples $\{a, b, c\}$.

11. Assume that the tetrahedron has sides $a$, $b$, $c$, $a'$, $b'$, $c'$, with $a'$ opposite to $a$, $b'$ opposite to $b$, and $c'$ opposite to $c$. Suppose that $a$ is represented by the vector $\mathbf{a}$, etc. The volume of the tetrahedron is

\[
V = \frac{1}{6} |\det M|,
\]

where $M$ is the $3 \times 3$ matrix whose rows are $\mathbf{a}$, $\mathbf{b}$, and $\mathbf{c}$. Since the transpose matrix $M^t$ has the same determinant as $M$, we have

\[
36V^2 = \det M \det M^t = \det M M^t = \begin{vmatrix} a^2 & \mathbf{a} \cdot \mathbf{b} & \mathbf{a} \cdot \mathbf{c} \\ \mathbf{a} \cdot \mathbf{b} & b^2 & \mathbf{b} \cdot \mathbf{c} \\ \mathbf{a} \cdot \mathbf{c} & \mathbf{b} \cdot \mathbf{c} & c^2 \end{vmatrix}.
\]

Writing the dot products in terms of the side lengths, we obtain

\[
36V^2 = \begin{vmatrix} a^2 & (a^2 + b^2 - c'^2)/2 & (a^2 + c^2 - b'^2)/2 \\ (a^2 + b^2 - c'^2)/2 & b^2 & (b^2 + c^2 - a'^2)/2 \\ (a^2 + c^2 - b'^2)/2 & (b^2 + c^2 - a'^2)/2 & c^2 \end{vmatrix},
\]

\[
288V^2 = \begin{vmatrix} 2a^2 & a^2 + b^2 - c'^2 & a^2 + c^2 - b'^2 \\ a^2 + b^2 - c'^2 & 2b^2 & b^2 + c^2 - a'^2 \\ a^2 + c^2 - b'^2 & b^2 + c^2 - a'^2 & 2c^2 \end{vmatrix}.
\]

A computer algebra system can help to calculate the determinant. It turns out that

\begin{align*}
144V^2 ={}& a^2b^2a'^2 + a^2b^2b'^2 + a^2c^2a'^2 + a^2c^2c'^2 \\
&+ b^2c^2b'^2 + b^2c^2c'^2 + a^2a'^2b'^2 + a^2a'^2c'^2 \\
&+ b^2a'^2b'^2 + b^2b'^2c'^2 + c^2b'^2c'^2 + c^2a'^2c'^2 \\
&- a^2b^2c'^2 - a^2c^2b'^2 - b^2c^2a'^2 - a'^2b'^2c'^2 \\
&- a^2a'^4 - b^2b'^4 - c^2c'^4 - a'^2a^4 - b'^2b^4 - c'^2c^4.
\end{align*}

The first twelve terms are products of squares of triples that constitute neither a triangle nor three edges with a common vertex. The next four terms are products of squares of triples that form a triangle. The last six terms are products of squares of sides and fourth powers of their opposite sides.

12. By definition,

\[
\sinh z = \frac{e^{z} - e^{-z}}{2}.
\]

The zeros of $\sinh z$ are $z = i\pi n$, where $n$ is an integer. Hence, an infinite product expansion of this function is

\[
\sinh z = z \prod_{n=1}^{\infty} \left( 1 + \frac{z^2}{n^2 \pi^2} \right).
\]

Letting $z = \pi$, we obtain the infinite product formula.
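The 22-term expansion in Solution 11 is easy to get wrong by a sign, so a numerical cross-check against the determinant form is useful. The following Python sketch is an illustration only: the function names are hypothetical, and the inputs are the squared edge lengths ($A = a^2$, $A_1 = a'^2$, and so on), which keeps the arithmetic exact for integer inputs.

```python
def expansion_144_v_squared(A, B, C, A1, B1, C1):
    """144 V^2 from the grouped form of the 22-term expansion.

    A, B, C are the squares of the edges a, b, c meeting at one vertex;
    A1, B1, C1 are the squares of the respective opposite edges.
    """
    # Twelve positive terms: an opposite pair times one of the other four edges.
    positive = (A * A1 * (B + B1 + C + C1)
                + B * B1 * (A + A1 + C + C1)
                + C * C1 * (A + A1 + B + B1))
    # Four negative terms: the three edges of each face.
    faces = A * B * C1 + A * C * B1 + B * C * A1 + A1 * B1 * C1
    # Six negative terms: a squared edge times the fourth power of its opposite.
    fourth_powers = A * A1 * (A + A1) + B * B1 * (B + B1) + C * C1 * (C + C1)
    return positive - faces - fourth_powers

def determinant_288_v_squared(A, B, C, A1, B1, C1):
    """288 V^2 as the 3 x 3 determinant with diagonal 2a^2, 2b^2, 2c^2."""
    g12 = A + B - C1
    g13 = A + C - B1
    g23 = B + C - A1
    return (2 * A * (2 * B * 2 * C - g23 * g23)
            - g12 * (g12 * 2 * C - g23 * g13)
            + g13 * (g12 * g23 - 2 * B * g13))

# Regular tetrahedron with unit edges: V = sqrt(2)/12, so 144 V^2 = 2.
print(expansion_144_v_squared(1, 1, 1, 1, 1, 1))
# Three mutually perpendicular unit edges: V = 1/6, so 144 V^2 = 4.
print(expansion_144_v_squared(1, 1, 1, 2, 2, 2))
```

The determinant equals twice the expansion for any inputs, so comparing the two on a few arbitrary integers is a quick test of the signs.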


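13. Cassini’s identity holds for $n = 1$, since

\[
F_1^2 - F_2 F_0 = 1^2 - 1 \cdot 0 = 1 = (-1)^2.
\]

Assume that it holds for $n$. Then

\begin{align*}
F_{n+1}^2 - F_{n+2}F_n &= F_{n+1}^2 - (F_n + F_{n+1})F_n \\
&= F_{n+1}(F_{n+1} - F_n) - F_n^2 \\
&= F_{n+1}F_{n-1} - F_n^2 \\
&= -(-1)^{n+1} \\
&= (-1)^{n+2},
\end{align*}

and we see that the identity holds for $n + 1$. Therefore, by mathematical induction, Cassini’s identity holds for all $n \ge 1$.

From the formula for the tangent of a difference and Cassini’s identity,

\begin{align*}
\tan^{-1} \frac{1}{F_{2n}} - \tan^{-1} \frac{1}{F_{2n+2}} &= \tan^{-1} \frac{(1/F_{2n}) - (1/F_{2n+2})}{1 + (1/F_{2n})(1/F_{2n+2})} \\
&= \tan^{-1} \frac{F_{2n+2} - F_{2n}}{F_{2n}F_{2n+2} + 1} \\
&= \tan^{-1} \frac{F_{2n+1}}{F_{2n+1}^2} \\
&= \tan^{-1} \frac{1}{F_{2n+1}}.
\end{align*}

It follows that the partial sums telescope:

\begin{align*}
\sum_{n=1}^{k} \tan^{-1} \frac{1}{F_{2n+1}} &= \sum_{n=1}^{k} \left( \tan^{-1} \frac{1}{F_{2n}} - \tan^{-1} \frac{1}{F_{2n+2}} \right) \\
&= \tan^{-1} \frac{1}{F_2} - \tan^{-1} \frac{1}{F_{2k+2}} \\
&= \frac{\pi}{4} - \tan^{-1} \frac{1}{F_{2k+2}}.
\end{align*}

14. To find the generating function for the Fibonacci sequence, we use the well-known formula

\[
F_n = \frac{\phi^n - \hat{\phi}^n}{\sqrt{5}}, \quad n \ge 0,
\]

where $\phi = (1 + \sqrt{5})/2$ and $\hat{\phi} = (1 - \sqrt{5})/2$. Then

\begin{align*}
\sum_{n=0}^{\infty} F_n x^n &= \frac{1}{\sqrt{5}} \sum_{n=0}^{\infty} \left( \phi^n - \hat{\phi}^n \right) x^n = \frac{1}{\sqrt{5}} \left( \frac{1}{1 - \phi x} - \frac{1}{1 - \hat{\phi} x} \right) \\
&= \frac{x}{1 - x - x^2}.
\end{align*}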


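The series converges for $|\phi x| < 1$.

We find the generating function for the fifth powers of the Fibonacci numbers similarly. From the binomial theorem and the fact that $\phi\hat{\phi} = -1$, we have

\begin{align*}
F_n^5 &= \frac{1}{(\sqrt{5})^5} \left( \phi^n - \hat{\phi}^n \right)^5 \\
&= \frac{1}{25\sqrt{5}} \left( \phi^{5n} - 5\phi^{4n}\hat{\phi}^n + 10\phi^{3n}\hat{\phi}^{2n} - 10\phi^{2n}\hat{\phi}^{3n} + 5\phi^n\hat{\phi}^{4n} - \hat{\phi}^{5n} \right) \\
&= \frac{1}{25\sqrt{5}} \left( (\phi^5)^n - 5(-\phi^3)^n + 10\phi^n - 10\hat{\phi}^n + 5(-\hat{\phi}^3)^n - (\hat{\phi}^5)^n \right).
\end{align*}

Thus, the generating function is

\[
\frac{1}{25\sqrt{5}} \left( \frac{1}{1 - \phi^5 x} - \frac{5}{1 + \phi^3 x} + \frac{10}{1 - \phi x} - \frac{10}{1 - \hat{\phi} x} + \frac{5}{1 + \hat{\phi}^3 x} - \frac{1}{1 - \hat{\phi}^5 x} \right).
\]

Combining the outer two fractions, then the next outer two, and finally the inner two, we obtain

\[
\frac{1}{5} \left( \frac{x}{1 - 11x - x^2} + \frac{2x}{1 + 4x - x^2} + \frac{2x}{1 - x - x^2} \right).
\]

We use the fact that the Lucas numbers, $L_n$, defined by the recurrence relation $L_0 = 2$, $L_1 = 1$, and $L_n = L_{n-1} + L_{n-2}$, for $n \ge 2$, are given by $L_n = \phi^n + \hat{\phi}^n$, for $n \ge 0$. The generating function is valid for $|\phi^5 x| < 1$.

15. Leonhard Euler found the solution

\[
635318657 = 59^4 + 158^4 = 133^4 + 134^4.
\]

A straightforward way to find it using a computer is to generate a list of the first 200 fourth powers, then form a list of sums of two terms from the first list. The solution will appear twice on the second list.

16. The formula

\[
\zeta(2n) = (-1)^{n+1} \frac{(2\pi)^{2n}}{2(2n)!} B_{2n}
\]

with $n = 2$ yields

\[
\zeta(4) = -\frac{(2\pi)^4}{2(4)!} B_4.
\]

Since $B_4 = -1/30$, we obtain

\[
\zeta(4) = \frac{\pi^4}{90}.
\]

17. Applying the formula

\[
\zeta(s) = 2^s \pi^{s-1} \sin\left( \frac{\pi s}{2} \right) \Gamma(1 - s)\, \zeta(1 - s), \quad \Re(s) < 0,
\]

for $s = -n$, we have

\[
\zeta(-n) = 2^{-n} \pi^{-n-1} \sin\left( -\frac{\pi n}{2} \right) \Gamma(n + 1)\, \zeta(n + 1).
\]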


If $n$ is even, then the sine factor is 0, so $\zeta(-n) = 0$. If $n$ is odd, then we may use the formulas

\[
\zeta(2k) = (-1)^{k+1} \frac{(2\pi)^{2k}}{2(2k)!} B_{2k}, \quad k \ge 1,
\]

and

\[
\Gamma(n + 1) = n!
\]

to obtain

\[
\zeta(-n) = -\frac{B_{n+1}}{n+1}.
\]

18. From the definition of $r(m, n)$, we have

\begin{align*}
& 2r(m-1, n) + 2r(m, n-1) - 3r(m-1, n-1) \\
&\quad = r(m-1, n) + r(m-1, n) + r(m, n-1) + r(m, n-1) - 3r(m-1, n-1) \\
&\quad = \sum_{a=0}^{m-1} r(a, n) + \sum_{b=0}^{n-1} r(m-1, b) + \sum_{b=0}^{n-1} r(m, b) + \sum_{a=0}^{m-1} r(a, n-1) - 3r(m-1, n-1) \\
&\quad = \sum_{a=0}^{m-1} r(a, n) + \sum_{b=0}^{n-1} r(m, b) + \sum_{b=0}^{n-1} r(m-1, b) + \sum_{a=0}^{m-1} r(a, n-1) - 3r(m-1, n-1) \\
&\quad = \sum_{a=0}^{m-1} r(a, n) + \sum_{b=0}^{n-1} r(m, b) + 3r(m-1, n-1) - 3r(m-1, n-1) \\
&\quad = r(m, n).
\end{align*}

19. Here is a diagram of the construction.

[Figure: triangle $ABC$ with the square $BCED$ constructed on side $BC$.]

The construction of a square ($BCED$) by straightedge and compass is well known. Draw lines from $D$ and $E$ to $A$. The points where they intersect $BC$ give the side of the square to be constructed. A similar triangles argument shows that the rectangle inscribed in $ABC$ is a square.
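The recurrence of Solution 18 can also be checked numerically. The sketch below assumes, as the sums in that solution do, that $r(m, n)$ counts paths from $(0,0)$ to $(m,n)$ in which each step moves any positive distance right or up, so that the last step arrives from some $(a, n)$ with $a < m$ or some $(m, b)$ with $b < n$. The names `rook_paths` and `recurrence_holds` are hypothetical.

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def rook_paths(m, n):
    """Number of rook paths to (m, n), summing over the position before the last step."""
    if m == 0 and n == 0:
        return 1
    return (sum(rook_paths(a, n) for a in range(m))
            + sum(rook_paths(m, b) for b in range(n)))

def recurrence_holds(m, n):
    """Check r(m,n) = 2r(m-1,n) + 2r(m,n-1) - 3r(m-1,n-1), with r = 0 off the board."""
    def r(i, j):
        return rook_paths(i, j) if i >= 0 and j >= 0 else 0
    return rook_paths(m, n) == 2 * r(m - 1, n) + 2 * r(m, n - 1) - 3 * r(m - 1, n - 1)

print([rook_paths(k, k) for k in range(4)])
print(all(recurrence_holds(m, n)
          for m in range(8) for n in range(8) if m >= 2 or n >= 2))
```

The restriction to $m \ge 2$ or $n \ge 2$ matters: at $(1, 1)$ the three-term formula gives 1 rather than 2, because $r(0,0)$ is set by convention rather than by the sum.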


20. One such polynomial is

\[
x_1x_2 + x_2x_3 + x_3x_4 + x_4x_5 + x_5x_6 + x_1x_6.
\]
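One way to confirm that this polynomial has exactly the symmetries of the dihedral group is to count the permutations of the six variables that leave it unchanged; the dihedral group of the hexagon has twelve elements. The Python sketch below does this by brute force over all $6! = 720$ permutations. The helper name `term_set` is hypothetical.

```python
from itertools import permutations

def term_set(labels):
    """Index pairs {i, j} of the terms x_i x_j obtained by walking around the cycle."""
    return {frozenset((labels[i], labels[(i + 1) % 6])) for i in range(6)}

ORIGINAL = term_set((1, 2, 3, 4, 5, 6))

# A permutation sigma sends x_i x_(i+1) to x_sigma(i) x_sigma(i+1);
# it is a symmetry when the set of terms is unchanged.
symmetries = [p for p in permutations(range(1, 7)) if term_set(p) == ORIGINAL]
print(len(symmetries))
```

The rotation $(2, 3, 4, 5, 6, 1)$ and the reflection $(6, 5, 4, 3, 2, 1)$ both appear in the list, as expected for the symmetries of a hexagon.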

21. (a) Suppose that the tournament has no Emperor. Let $v$ be a King. Let $A$ be the set of vertices to which $v$ is directed and $B$ the remaining set of vertices. Since $v$ is a King, $v$ can reach any vertex in $B$ in two steps. Since $v$ is not an Emperor, $B$ is not empty. Why is $A$ non-empty? Let $v_A$ be a King in the tournament restricted to $A$, and $v_B$ a King in the tournament restricted to $B$. Then $v_A$ and $v_B$ are also Kings of the given tournament, so the tournament has at least three Kings.

(b) If the tournament has an Emperor, then there is exactly one King, and we are done. By part (a), if there is no Emperor then there are at least three Kings. So we must show that any number of Kings from 3 to $n$ is possible. Suppose that $3 \le k \le n$. By the proof of Maurer’s theorem, a tournament on $k$ vertices has the property that not every vertex is a King with probability at most

\[
\binom{k}{2} \left( \frac{3}{4} \right)^{k-2}.
\]

This is less than 1 for $k \ge 21$. Hence, there exists a tournament on $k$ vertices in which every vertex is a King when $k \ge 21$. As an exercise, you can construct tournaments with this property for $3 \le k \le 20$. Now, let $T$ be a tournament on $k$ vertices in which every vertex is a King. Draw a directed edge from every vertex of $T$ to $n - k$ other vertices. Join them to each other with edges directed in any way. This tournament has exactly $k$ Kings.
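The threshold $k \ge 21$ in part (b) comes from evaluating the bound, which a few lines of Python can confirm. This is only a sketch of the arithmetic; the function name `maurer_bound` is hypothetical, and exact fractions are used to avoid rounding near 1.

```python
from fractions import Fraction
from math import comb

def maurer_bound(k):
    """The bound C(k, 2) * (3/4)^(k-2) on the probability that some vertex is not a King."""
    return comb(k, 2) * Fraction(3, 4) ** (k - 2)

# Smallest k >= 3 for which the bound drops below 1.
threshold = next(k for k in range(3, 100) if maurer_bound(k) < 1)
print(threshold, float(maurer_bound(threshold - 1)), float(maurer_bound(threshold)))
```

The ratio of consecutive values is $\frac{3}{4} \cdot \frac{k+1}{k-1}$, which is below 1 once $k > 7$, so the bound stays below 1 for every larger $k$.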

22. According to the hook length formula, the number of standard fillings of a $4 \times 4$ grid with the numbers 1 through 16 is

\[
\frac{16!}{1 \cdot 2 \cdot 2 \cdot 3 \cdot 3 \cdot 3 \cdot 4 \cdot 4 \cdot 4 \cdot 4 \cdot 5 \cdot 5 \cdot 5 \cdot 6 \cdot 6 \cdot 7} = 24024.
\]

The number of permutations of the integers 1 through 16 that do not contain an increasing subsequence of length five or a decreasing subsequence of length five is

\[
24024^2 = 577152576.
\]
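The hook length computation is short enough to script. The following Python sketch is illustrative only, and the function names are hypothetical; it takes a shape as a tuple of weakly decreasing row lengths and divides $n!$ by the product of the hook lengths.

```python
from math import factorial

def hook_lengths(shape):
    """Hook length of every cell of a Young diagram given by its row lengths."""
    column_heights = [sum(1 for row in shape if row > j) for j in range(shape[0])]
    return [
        (shape[i] - j - 1) + (column_heights[j] - i - 1) + 1  # arm + leg + the cell itself
        for i in range(len(shape))
        for j in range(shape[i])
    ]

def count_standard_fillings(shape):
    """Number of standard fillings of the shape, by the hook length formula."""
    product = 1
    for h in hook_lengths(shape):
        product *= h
    return factorial(sum(shape)) // product

square = count_standard_fillings((4, 4, 4, 4))
print(square, square ** 2)
```

The hook lengths of the $4 \times 4$ grid are exactly the sixteen factors in the denominator above, from 7 in one corner down to 1 in the opposite corner.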

23. Let $K$ be the set of ordered pairs $(x, y)$ of real numbers such that

\[
-\sqrt{\Delta} \le ax + by \le \sqrt{\Delta}, \quad -\sqrt{\Delta} \le cx + dy \le \sqrt{\Delta}.
\]

The transformation $(x, y) \mapsto (ax + by, cx + dy)$ is linear with determinant $\Delta$. It follows that the area of $K$ is 4. If the area of $K$ were greater than 4, then by Minkowski’s theorem $K$ would contain a lattice point other than $(0, 0)$. However, the same conclusion holds for $K$ since it is a closed set. Why does it still hold?


24. The area bounded by a curve given by parametric equations $x(t)$ and $y(t)$, where $\alpha \le t \le \beta$, is

\[
\int_{\alpha}^{\beta} y(t)\, x'(t)\, dt.
\]

For the lemniscate, the integral is

\[
4 \int_{\pi/2}^{0} \frac{4 \cos t \sin^2 t\, (5 + \cos(2t))}{(-3 + \cos(2t))^3}\, dt.
\]

An antiderivative is

\[
F(t) = -\frac{4 \sin^3 t}{(-3 + \cos(2t))^2},
\]

and so, by the Fundamental Theorem of Calculus, the area is

\[
4(F(0) - F(\pi/2)) = 1.
\]
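As a sanity check on the signs in Solution 24, the integral can be evaluated numerically and compared with the antiderivative. The Python sketch below uses a plain composite Simpson rule; the helper names are hypothetical, and the integrand is copied from the solution rather than derived again.

```python
import math

def integrand(t):
    """The integrand y(t) x'(t) for the lemniscate, as written in Solution 24."""
    return (4 * math.cos(t) * math.sin(t) ** 2 * (5 + math.cos(2 * t))
            / (-3 + math.cos(2 * t)) ** 3)

def antiderivative(t):
    """F(t) = -4 sin^3 t / (-3 + cos 2t)^2."""
    return -4 * math.sin(t) ** 3 / (-3 + math.cos(2 * t)) ** 2

def simpson(f, a, b, n=1000):
    """Composite Simpson rule with n (even) subintervals; works for a > b as well."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

area_numeric = 4 * simpson(integrand, math.pi / 2, 0.0)
area_exact = 4 * (antiderivative(0.0) - antiderivative(math.pi / 2))
print(area_numeric, area_exact)
```

The integrand is negative on $(0, \pi/2)$, and integrating from $\pi/2$ down to $0$ turns that into a positive quarter of the area.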

25. Isaac Newton (1643–1727) showed how to extend the binomial theorem to all real exponents by the series formula

\[
(1 + x)^{\alpha} = \sum_{n=0}^{\infty} \binom{\alpha}{n} x^n, \quad |x| < 1,
\]

where

\[
\binom{\alpha}{n} = \frac{\prod_{i=0}^{n-1} (\alpha - i)}{n!}, \quad n \ge 1; \qquad \binom{\alpha}{0} = 1.
\]

The integrand is

\[
\frac{1}{\sqrt{1 - r^4}} = 1 + \frac{1}{2} r^4 + \frac{1 \cdot 3}{2 \cdot 2 \cdot 2!} r^8 + \frac{1 \cdot 3 \cdot 5}{2 \cdot 2 \cdot 2 \cdot 3!} r^{12} + \cdots.
\]

Integrating this over $0 \le r \le 1$ yields the desired series.

26. The period of $\{F_n \bmod 2^m\}$ is $3 \cdot 2^{m-1}$. The period of $\{F_n \bmod 5^m\}$ is $4 \cdot 5^m$. No general formula is known for the period of the sequence $\{F_n \bmod m\}$ where $m \ge 2$.

27. Recall that $n$ dominates $k$ means that the binary expansion of $n$ has a 1 in every position where the binary expansion of $k$ has a 1. Suppose that the binary expansion of $n$ has $d(n)$ 1s. Then $n$ dominates exactly $2^{d(n)}$ numbers, as this is the number of subsets of a set of size $d(n)$. Therefore, there are $2^{d(n)}$ odd entries in the $n$th row of Pascal’s triangle.
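The formulas in Solutions 26 and 27 are the kind one conjectures from experiments, and the experiments are easy to reproduce. The Python sketch below is one way to run them; the names `fibonacci_period` and `odd_entries_in_row` are hypothetical.

```python
from math import comb

def fibonacci_period(modulus):
    """Length of the period of the Fibonacci sequence modulo the given modulus (>= 2)."""
    a, b = 0, 1
    steps = 0
    while True:
        a, b = b, (a + b) % modulus
        steps += 1
        if (a, b) == (0, 1):  # the starting pair has come around again
            return steps

def odd_entries_in_row(n):
    """Number of odd binomial coefficients C(n, k), 0 <= k <= n."""
    return sum(comb(n, k) % 2 for k in range(n + 1))

print([fibonacci_period(2 ** m) for m in range(1, 7)])
print([fibonacci_period(5 ** m) for m in range(1, 4)])
print([odd_entries_in_row(n) for n in range(9)])
```

Because the sequence modulo $m$ is determined by consecutive pairs and the step is reversible, the pair $(0, 1)$ must recur, so the loop always terminates.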

28. The first four coordinates of the code represent a collection of vertices of the cyclic graph of length four. There are $2^4 = 16$ such subsets. The last four coordinates comprise the indicator vector of the set of vertices nonadjacent to the first set. The resulting code has distance four. Deleting a coordinate results in a code of length seven and distance three, consisting of 16 code words. It is a Hamming code.


29. The number of ways is

\[
4012504634719967902995238092061023959932853457130267501 \approx 4 \times 10^{54}.
\]

30. Let $e(n)$ be the expected number of steps in the gobbling algorithm starting with $n$. The first number chosen is either 1 (with probability $1/n$) or not 1 (with probability $(n-1)/n$). This yields the recurrence formula

\[
e(1) = 1; \quad e(n) = \frac{1}{n}\,(e(n-1) + 1) + \frac{n-1}{n}\, e(n-1) = e(n-1) + \frac{1}{n}, \quad n \ge 2.
\]

It follows that

\[
e(n) = 1 + \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n}.
\]

31. The polynomial $f(x) = x^2 + x + 2$ does not factor over $\mathbf{Z}_3 = \{0, 1, 2\}$. To see this, try 0, 1, and 2, and check that you do not get 0. If $f$ factored, then it would have two linear factors and hence would have roots in $\mathbf{Z}_3$. Let $\theta$ be a root of $f$. A field $F$ of nine elements is obtained by taking powers of $\theta$ together with 0. Thus, the elements of the field are

\[
0, \quad 1, \quad \theta, \quad \theta^2 = 2\theta + 1, \quad \theta^3 = 2\theta + 2, \quad \theta^4 = 2, \quad \theta^5 = 2\theta, \quad \theta^6 = \theta + 2, \quad \theta^7 = \theta + 1.
\]

To construct a projective plane of order nine, let the points be the ordered pairs $(x, y)$, where $x, y \in F$, together with the ideal points $m \in F$ and $\infty$. This accounts for $9^2 + 9 + 1 = 91$ points. The lines are of three types: sets of points $(x, y)$ that satisfy an equation $y = mx + b$, where $m, b \in F$, together with $m$; sets of points $(a, y)$, where $a \in F$, together with $\infty$; and the ideal line, consisting of the points $m \in F$ and $\infty$. This accounts for $9^2 + 9 + 1 = 91$ lines. As an exercise, check that each point is on ten lines, each line contains ten points, every two points determine a unique line, and every two lines intersect in a unique point.

There exist three other projective planes of order nine that do not arise from a field. See [7].
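The table of powers of $\theta$ and the incidence properties of the plane in Solution 31 can both be verified by machine. The Python sketch below represents $c_0 + c_1\theta$ as the pair `(c0, c1)` and reduces products with $\theta^2 = 2\theta + 1$. All names (`gf9_add`, `gf9_mul`, `theta_powers`, `projective_plane`, `every_pair_on_one_line`) and the tuple encodings of points are hypothetical choices made for this illustration.

```python
from itertools import combinations, product

def gf9_add(u, v):
    """Add two field elements, coefficientwise modulo 3."""
    return ((u[0] + v[0]) % 3, (u[1] + v[1]) % 3)

def gf9_mul(u, v):
    """Multiply c0 + c1*theta elements, reducing with theta^2 = 2*theta + 1."""
    a0, a1 = u
    b0, b1 = v
    return ((a0 * b0 + a1 * b1) % 3, (a0 * b1 + a1 * b0 + 2 * a1 * b1) % 3)

def theta_powers():
    """Return [theta^1, ..., theta^8] as coefficient pairs (c0, c1)."""
    theta, x, powers = (0, 1), (1, 0), []
    for _ in range(8):
        x = gf9_mul(x, theta)
        powers.append(x)
    return powers

FIELD = list(product(range(3), repeat=2))
INFINITY = ("inf",)

def projective_plane():
    """Points and lines of the plane built from the nine-element field."""
    points = [("pt", x, y) for x in FIELD for y in FIELD]
    points += [("slope", m) for m in FIELD] + [INFINITY]
    lines = []
    for m in FIELD:  # lines y = m x + b, together with the ideal point m
        for b in FIELD:
            affine = [("pt", x, gf9_add(gf9_mul(m, x), b)) for x in FIELD]
            lines.append(frozenset(affine + [("slope", m)]))
    for a in FIELD:  # vertical lines x = a, together with the ideal point infinity
        lines.append(frozenset([("pt", a, y) for y in FIELD] + [INFINITY]))
    lines.append(frozenset([("slope", m) for m in FIELD] + [INFINITY]))  # ideal line
    return points, lines

def every_pair_on_one_line(points, lines):
    """True if each pair of distinct points lies on exactly one line."""
    seen = set()
    for line in lines:
        for pair in combinations(line, 2):
            key = frozenset(pair)
            if key in seen:
                return False
            seen.add(key)
    return len(seen) == len(points) * (len(points) - 1) // 2

print(theta_powers())
points, lines = projective_plane()
print(len(points), len(lines), every_pair_on_one_line(points, lines))
```

The pair check works by counting: 91 lines with 45 point pairs each give 4095 pairs, which equals the number of pairs of 91 points, so if no pair repeats then every pair is covered exactly once.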


Bibliography

[1] M. Aigner and G. M. Ziegler. Proofs From THE BOOK. Springer-Verlag, New York, third edition, 2004.
[2] C. Alsina and R. B. Nelsen. A visual proof of the Erdős–Mordell inequality. Forum Geometricorum, 7:99–102, 2007.
[3] G. E. Andrews and K. Eriksson. Integer Partitions. Cambridge University Press, Cambridge, 2004.
[4] W. S. Anglin. The square pyramid puzzle. The American Mathematical Monthly, 97(2):120–124, 1990.
[5] R. Bacher and S. Eliahou. Extremal matrices without constant 2-squares. Journal of Combinatorics, 1(1):77–100, 2010.
[6] L. W. Beineke and R. J. Wilson, editors. Graph Connections. Clarendon Press, Oxford, 1997.
[7] M. K. Bennett. Affine and Projective Geometry. Wiley, New York, 1995.
[8] D. Bindner and M. Erickson. Alcuin’s sequence. The American Mathematical Monthly, to appear.
[9] G. Boros and V. H. Moll. Irresistible Integrals: Symbolics, Analysis and Experiments in the Evaluation of Integrals. Cambridge University Press, Cambridge, 2004.
[10] E. Brown. Three Fermat trails to elliptic curves. The College Mathematics Journal, 31(3):162–172, 2000.
[11] S. A. Burr. On moduli for which the Fibonacci sequence contains a complete system of residues. The Fibonacci Quarterly, 9:497–504, 1971.
[12] K. Ciesielski. Set Theory for the Working Mathematician, volume 39 of London Mathematical Society Student Texts. Cambridge University Press, Cambridge, 1997.
[13] J. H. Conway, H. Burgiel, and C. Goodman-Strauss. The Symmetries of Things. A. K. Peters, Wellesley, 2008.
[14] J. H. Conway and N. J. A. Sloane. Sphere Packings, Lattices, and Groups. Springer-Verlag, New York, third edition, 1999.
[15] M. Erickson. Pearls of Discrete Mathematics. Chapman & Hall/CRC Press, Boca Raton, 2009.


[16] M. Erickson, S. Fernando, and K. Tran. Enumerating rook and queen paths. Bulletin of the Institute of Combinatorics and Its Applications, 60:37–48, 2010.
[17] M. J. Erickson. Introduction to Combinatorics. Wiley, New York, 1996.
[18] M. J. Erickson and J. Flowers. Principles of Mathematical Problem Solving. Prentice Hall, Upper Saddle River, 1999.
[19] S. Glaz and J. Growney, editors. Strange Attractors: Poems of Love and Mathematics. A. K. Peters, New York, 2009.
[20] A. M. Gleason. Angle trisection, the heptagon, and the triskaidecagon. The American Mathematical Monthly, 95(3):185–194, 1988.
[21] R. L. Graham, D. E. Knuth, and O. Patashnik. Concrete Mathematics: A Foundation for Computer Science. Addison-Wesley, Reading, MA, second edition, 1994.
[22] R. L. Graham, B. L. Rothschild, and J. H. Spencer. Ramsey Theory. Wiley, New York, second edition, 1990.
[23] G. Hardy and E. Wright. An Introduction to the Theory of Numbers. Clarendon Press, Oxford, fifth edition, 1989.
[24] J. Hemmeter. On an iteration diagram. Congressus Numerantium, 60:59–66, 1987.
[25] I. N. Herstein. Abstract Algebra. Prentice Hall, Upper Saddle River, third edition, 1996.
[26] R. Honsberger. Mathematical Gems I. Mathematical Association of America, Washington, DC, 1973.
[27] D. L. Johnson. Presentations of Groups. Cambridge University Press, New York, 1990.
[28] T. Koshy. Fibonacci and Lucas Numbers with Applications. Wiley-Interscience, New York, 2001.
[29] C. F. Laywine and G. L. Mullen. Discrete Mathematics Using Latin Squares. Wiley-Interscience, New York, 1998.
[30] M. Levi. The Mathematical Mechanic: Using Physical Reasoning to Solve Problems. Princeton University Press, Princeton, 2009.
[31] B. Lindström and H.-O. Zetterström. Borromean circles are impossible. The American Mathematical Monthly, 98(4):340–341, 1991.
[32] J. H. van Lint and R. M. Wilson. A Course in Combinatorics. Cambridge University Press, Cambridge, 1992.
[33] W. Miller. The maximum order of an element of a finite symmetric group. The American Mathematical Monthly, 94(6):497–506, 1987.
[34] G. Minton. Three approaches to a sequence problem. Mathematics Magazine, 84(1):33–37, 2011.


[35] E. Nagel and J. R. Newman. Gödel’s Proof. New York University Press, New York, 2001.
[36] P. J. Nahin. An Imaginary Tale: The Story of $\sqrt{-1}$. Princeton University Press, Princeton, 1998.
[37] I. Niven, H. Zuckerman, and H. Montgomery. An Introduction to the Theory of Numbers. Wiley, New York, fifth edition, 1991.
[38] H. Noon and G. van Brummelen. The non-attacking queens game. The College Mathematics Journal, 37(3):223–227, 2006.
[39] C. D. Olds, A. Lax, and G. Davidoff. The Geometry of Numbers. The Mathematical Association of America, Washington, DC, 2000.
[40] R. W. Owens. An algorithm to solve the Frobenius problem. Mathematics Magazine, 76(4):264–275, 2003.
[41] V. Pambuccian. The Erdős–Mordell inequality is equivalent to non-positive curvature. Journal of Geometry, 88:134–139, 2008.
[42] M. Petkovsek, H. Wilf, and D. Zeilberger. A=B. A. K. Peters, New York, 1996.
[43] C. Petzold. Code: The Hidden Language of Computer Hardware and Software. Microsoft Press, Redmond, 1999.
[44] T. N. Phan and N. M. Mach. Proof for a conjecture on general means. Journal of Inequalities in Pure and Applied Mathematics, 9(3), 2008.
[45] V. Pless. Introduction to the Theory of Error-Correcting Codes. Wiley, New York, third edition, 1998.
[46] S. Schwartzman. The Words of Mathematics: An Etymological Dictionary of Mathematical Terms Used in English. Mathematical Association of America, Washington, DC, 1994.
[47] H. Shniad. On the convexity of mean value functions. Bulletin of the American Mathematical Society, 54:770–776, 1948.
[48] D. Smith, M. Eggen, and R. St. Andre. A Transition to Advanced Mathematics. Brooks/Cole, Chicago, seventh edition, 2009.
[49] A. Stacey and P. Weidl. The existence of exactly m-coloured complete subgraphs. Journal of Combinatorial Theory, Series B, 75:1–18, 1999.
[50] P. D. Straffin. Game Theory and Strategy. Mathematical Association of America, Washington, DC, 1993.
[51] L. W. Tu. An Introduction to Manifolds. Springer, New York, 2008.
[52] H. Walser. The Golden Section. Mathematical Association of America, Washington, DC, 2001.
[53] L. C. Washington. Elliptic Curves: Number Theory and Cryptography. Chapman & Hall/CRC Press, Boca Raton, 2003.

[54] D. B. West. Introduction to Graph Theory. Prentice Hall, Upper Saddle River, 1995.
[55] P. Yiu. Heronian triangles are lattice triangles. The American Mathematical Monthly, 108:261–263, 2001.

Index

absolute value, 11
acceleration, 115
Alcuin of York, 91
Alcuin’s sequence, 89, 132
algebra, 43, 64
    universal, 64
algorithm, 6, 157
    gobbling, 131
    heapsort, 15
    sorting, 15
allocation problem, 91
Alsina, Claudi, 84
alternating permutation, 30
AM–GM inequality, 55, 85
angle, 24
angle trisection, 58
angle trisectors, 50
anti-commutative multiplication, 42
antiderivative, 163
arc length, 79
Archimedes’ constant (π), 30
area, 154, 163
    of circle, 32
    of triangle, 25
arithmetic mean, 54
arithmetic progression, 72, 94
Arithmetica, 13
array, 106
Artin, Emil, 72
asymptotic, 117
asymptotic estimate, 31
attractor, 22
Axiom of Choice, 139

Bacher, Roland, 137
ball, 32
Barrow, D. F., 84
basis, 147
Bernoulli number, 40, 153
Bernoulli, Jakob, 40
bijection, 37, 88, 142
binary code, 64, 101
binary matrix, 136
binary representation, 96
binary string, 18, 157
binary tree, 15
binary vector, 101
binomial coefficient, 96, 104
binomial series, 128
binomial theorem, 163
Birkhoff, Garrett, 64
bits, 19, 44
Boltzmann, Ludwig Eduard, 44
Borevich, Z. I., 100
Borromean rings, 5
Borromeo family, 5
Bose, Raj Chandra, 77
Bostan, Alin, 136
bracket notation, 42
van Brummelen, Glen, 132
Burr, Stephen A., 133

calculus, 54
Cannonball Puzzle, 13
Cantor, Georg, 37, 109, 139
cardinal, 1
cardinality, 140
carries, 96
Cartesian coordinate system, 149
Cartesian equation, 151
Cartesian product, 141
Cassini’s identity, 32, 152
Cauchy, Augustin-Louis, 142
Cauchy–Riemann equations, 39
Cayley, Arthur, 64, 142
centillion, 3
change of coordinates, 111
change of variables, 81
characteristic equation, 127
chess Queen, 132, 134
chess Rook, 44
Chung, Fan, 131
circle, 2, 4, 5, 12, 16, 30, 32, 79, 110, 114, 155
    circumference of, 30
circles, 54
circular inversion, 2
Clausius, Rudolf Julius Emanuel, 44
code, 64, 101, 154
    perfect, 102
codeword, 101
coding theory, 103
combinatorics, 64, 75
complex number, 9, 22, 70
complex plane, 9, 22, 58, 152
complex variable, 28
composition, 142
computer, 19, 22, 67, 102, 103, 105, 116, 126, 128, 152, 155, 157, 160
computer algebra system, 57, 79, 128, 151, 158
computer science, 15
congruence relation, 141
conjugate, 11
conjugation, 124
contact number, 14
convex function, 54
convex set, 67
Conway, John H., 50
coordinate system, 2
coordinates
    Cartesian, 113
    polar, 33, 88
    spherical, 113
cosecant, 1
counting proof, 99, 153
cross product, 9, 43
cube, 1, 18, 35, 61, 110
    symmetry group of, 62
cube root of unity, 151
Curtis, Robert T., 102
cyclic group, 69

Dearie, Blossom, 1
deleted comb space, 1
derivative, 29, 38
    partial, 39
Descartes, René, 139
descriptive geometry, 54
determinant, 8, 26, 71, 118, 149, 151
difference, 140
differential, 79
differential equation, 104, 119
digit, 116
dimension, 147
    fractional, 21
Diophantine equation, 13
Diophantus, 13, 94, 155
directrix, 86
Dirichlet, Peter Gustav Lejeune, 141
disk, 33, 69
Dixon’s identity, 104
Dixon, Alfred Cardew, 104
dodecahedron, 61
dot product, 9, 113, 148

e, 28
Eight Curve, 1
element, 140
    order of, 143
elementary row operations, 108
Elements, 6
Eliahou, Shalom, 137
ellipse, 79, 110
elliptic curve, 14, 81
elliptic integral, 79
empty set, 140
entropy, 44
equilibrium formula, 118
equivalence relation, 141
Eratosthenes of Cyrene, 6
Erdős, Paul, ix, 50, 66, 84
Erdős–Mordell inequality, 84
Erdős–Szekeres theorem, 66, 153
Euclid, 6
Euclidean distance, 11
Euclidean geometry, 4, 50
Euclidean plane, 12, 59
Euclidean space, 5, 32, 43
Euler line, 53
Euler’s formula (exponential function), 11, 25
Euler’s formula (factorial function), 34
Euler’s formula (polyhedra), 35, 61
Euler’s φ-function, 100, 143
Euler, Leonhard, 11, 28, 31, 34, 53, 76, 95, 160
exponential function, 11
extended complex plane, 23
extended plane, 22
extended real line, 22

face, 1
factorial, 34
Fermat point, 115
Fermat prime, 22, 58
Fermat’s Last Theorem, 8
Fermat, Pierre de, 8, 95
Fibonacci, 32
Fibonacci number, 116, 122
Fibonacci sequence, 32, 91, 116, 122, 132, 133, 153, 154
field, 1, 9, 78, 145, 154
formula
    Dixon’s, 104
    Euler’s, 25
    Heron’s, 25
four squares theorem, 7
fractal, 20
fractional dimension, 21
Fraenkel, Abraham, 139
Frobenius, Ferdinand Georg, 96
Frucht, Roberto, 64
full adder, 19
function, 141, 142
    bijection, 142
    complex, 38
    convex, 54
    cosine, 28
    even, 30
    exponential, 11, 28, 66
    factorial, 33, 34
    floor, 90
    gamma, 33, 34, 41
    hyperbolic sine, 152
    identity, 142
    inverse, 142
    Landau’s, 117
    Möbius, 24
    nearest integer, 90
    odd, 30
    of complex variable, 38
    one-to-one, 142
    onto, 142
    pathological, 141
    polynomial, 66
    rational, 127
    secant, 29, 31
    sine, 28, 31
    tangent, 29
    zeta, 40, 41
functional equation, 41
Fundamental Theorem of Algebra, 9
Fundamental Theorem of Arithmetic, 6
Fundamental Theorem of Calculus, 163
Füredi, Zoltán, 99

Gallai’s theorem, 136
Gallai, Tibor, 136
Galois, Évariste, 142
game
    binary matrix, 137
    Nim, 47
    nonattacking Queens, 132
    transversal achievement, 136
    Triangle Destruction, 92
    Wythoff’s Nim, 136
    zero-sum, 117
game theory, 118
gamma function, 33, 34, 41
Gauss, Carl Friedrich, 7, 9, 79
generating function, 45, 91, 97, 127, 128, 135, 153
genus, 36
geodesic, 113
geometric construction, 4, 50, 58, 153
geometric mean, 54
geometric series, 42, 127
geometric transformation, 149
geometry, 12, 16, 50
    absolute, 85
    descriptive, 54
    Euclidean, 50, 85
    non-Euclidean, 64
Gibbs, Josiah Willard, 44
glide-reflection, 59
gobbling algorithm, 154
Gödel, Kurt, 139
Golay code, 102
Golay, Marcel J. E., 102
golden ratio, 3, 116
golden rectangle, 3
googol, 3, 151
googolplex, 3
graph, 15, 17, 18, 21, 75, 120, 133
    adjacency matrix of, 121
    coloring of, 133
    complete, 17, 18, 92, 133
    component of, 22
    edge of, 92
    icosahedral, 102
    infinite, 133
    nonadjacency matrix of, 103
    vertex of, 92
    walk on, 120
graph theory, 17, 133
greatest common divisor, 96
Gregory, James, 30
group, 1, 63, 64, 106, 123, 142
    abelian, 143
    affine linear, 106
    alternating, 62, 145
    automorphism, 64, 103
    cyclic, 69, 143
    dihedral, 144, 153
    finite, 143
    generator of, 69
    Mathieu, 103
    matrix, 64
    nonabelian, 143
    of isometries, 61
    order of, 143
    quaternion, 64
    sporadic simple, 103
    symmetric, 62, 123, 143
    symmetry, 63
group action, 145
group presentation, 64, 144
group representation, 64

Halmos, Paul, 151
Hamming code, 102, 163
Hamming, Richard, 102
Hardy, G. H., 36, 83
harmonic map, 1
harmonic number, 131
harmonic series, 40
Hausdorff dimension, 21
heap, 15
heapsort, 15
hemisphere, 95
heptagon, 58
Heron of Alexandria, 25
Heron’s formula, 25, 85, 157
Heronian triangle, 26
Hessian matrix, 118
hexagon, 144
holomorphism, 1
hook length formula, 67, 162
l’Hôpital’s rule, 98
Hurwitz, Adolf, 64
hyperbola, 2
hypercube, 16, 18, 152
hypersphere, 16, 33, 71

icosahedral graph, 102
icosahedron, 61, 102
ideal line, 77
ideal point, 17, 76
identity
    Cassini’s, 32, 152
    Dixon’s, 104
    Jacobi, 42
identity function, 142
imaginary unit (i), 9, 28
inclusion and exclusion, 45
Incompleteness Theorem, 139
inequality, 84
    AM–GM, 55, 85
    Jensen’s, 55
    power means, 56
    triangle, 27, 89
infinite product, 31, 152, 158
infinite series, 28
infinite set, 37
infinity, 1
information channel, 44
information theory, 44
integral, 31, 79, 87, 154, 163
integrated circuit, 19
integration by parts, 35
intersection, 140
isometry, 12, 59
isomorphism, 143

Jacobi identity, 42
Jacobi, Carl Gustav Jacob, 42
Jensen’s inequality, 55
Jensen, Johan, 55

Kauers, Manuel, 46
Kovalevskaya, Sofia, 1

Lagrange’s theorem, 7, 34, 69
Lagrange, Joseph-Louis, 7
Landau’s function, 117
Landau’s theorem, 65
Landau, Edmund, 117
Landau, H. G., 65
latin square, 76
lattice, 14, 64, 70
lattice path, 44, 134
lattice point, 67
lattice theory, 64
least common multiple, 100, 117
Leech’s lattice, 14
Leech, John, 14
Legendre’s conjecture, 6
Legendre, Adrien-Marie, 6
Leibniz, Gottfried Wilhelm, 30
lemniscate, 1, 79, 151, 154
Lemnos, 1
Lie algebra, 43
Lie, Sophus, 43
linear fractional transformation, 24, 125
linear transformation, 12, 147
lircle, 23
logarithm, 116
logic, 139
logic gate, 19
Lorentzian lattice, 14
Lucas numbers, 132, 160
Lucas, Édouard, 13

Madhavan of Sangamagramam, 30
mathematical induction, 30, 34, 55, 73, 95, 104, 121, 122, 152, 159
mathematical proof, 83
Mathieu group, 103
Mathieu, Émile Léonard, 103
matrix, 12, 42, 64, 147
    adjacency, 121
    binary, 106, 136
    Hessian, 118
    identity, 103
    invertible, 106
    nonadjacency, 103
    rotation, 12, 149
    symmetric, 121
    transvection, 107
    trivalued, 137
Maurer’s theorem, 65, 162
Maurer, Stephen B., 65
mean
    arithmetic, 55, 79
    arithmetic-geometric, 79
    geometric, 55, 79
    harmonic, 55
    power, 54, 55
    quadratic, 55
Meyerowitz, Aaron, 100
min-max equilibrium, 118
Minh, Mach Nguyet, 57
Minkowski’s theorem, 68, 153
Minkowski, Hermann, 67
Möbius function, 24
Möbius strip, 36
modulus, 11
MOLS, 77
Monge’s theorem, 54
Monge, Gaspard, 54
Mordell, Louis, 84
Morley’s theorem, 50
Morley, Frank, 50, 104
Morris, Tony R., 102
Mozart, Wolfgang Amadeus, 112
multiplication table, 64

Nam, Phan Thanh, 57
natural logarithm, 28
Nelsen, Roger B., 84
von Neumann, John, 118
Newton, Isaac, 163
Nim, 47
Noon, Hassan, 132
number
    Bernoulli, 40, 153
    binary, 19
    complex, 8, 34, 70, 140, 151
    cube, 8
    e, 28, 119
    Fibonacci, 116, 122
    fourth power, 153
    golden ratio, 116
    harmonic, 131
    i, 9, 28
    integer, 140
    irrational, 4, 141
    Lucas, 132
    natural, 140
    Perrin, 99
    π, 30, 46, 79
    power of two, 22
    prime, 5, 7, 17, 22, 42, 68, 70, 99, 100, 117, 143, 146, 151
    pure imaginary, 10
    quaternion, 70
    rational, 4, 64, 140, 141
    real, 10, 140
    square, 6, 7, 94, 152
    square pyramidal, 13
    taxicab, 36, 153
    transcendental, 80
    triangular, 7, 152
    whole, 37
number theory, 7, 8, 68, 97

octahedron, 35, 61
octodecillion, 1
one-to-one correspondence, 37
Online Encyclopedia of Integer Sequences, 100

Pambuccian, Victor, 85
parabola, 86
parallel postulate, 85
parallelepiped, 8, 70, 71
parallelogram, 8, 53, 69
parametric equations, 2, 79, 114, 163
partial order, 99, 141
partition of an integer, 88, 89
Pascal’s triangle, 96, 154, 163
pentagram, 63
permutation, 117, 123, 143, 153
    alternating, 30
    cycle, 124
    cycle notation for, 143
    even, 62, 145
    odd, 145
    transposition, 124
Perrin’s sequence, 99, 100
Perrin, R., 99
physics, 114
π, 30
pigeonhole principle, 66, 68, 71, 77, 133, 152, 156
Plato, ix
Poincaré, Henri, 25
point at infinity, 2
polar coordinates, 12, 33, 88
polygon, 58
polyhedron, 1, 35, 61
    dual, 62
    regular, 61
polynomial, 9, 63, 80, 146, 164
    cubic, 58, 111
    minimal, 58
    root of, 111
    symmetry of, 63
power mean, 54
power means inequality, 56
power of two, 22
power series, 11, 28, 29
power set, 140
powers of 2 expansion, 95
prime number, 5, 7, 17, 22, 42, 68, 70, 99, 100, 117, 143, 146, 151
    Fermat, 22, 58
probability, 65, 95, 119, 164
projective plane, 16, 17, 77, 129, 154
proof
    bijective, 88, 100
    by contradiction, 66
    contradiction, 102, 137
    counting, 99, 153
    mathematical induction, 30, 55, 73, 95, 121, 122, 152, 159
    pigeonhole principle, 66, 152
    tessellation, 83
    vector, 53
pseudoprime, 1
Pythagorean theorem, 16, 27, 83, 84

quantum mechanics, 43
Queen path, 134

Ramanujan, Srinivasa, 36
Ramsey theory, 75, 76, 152
Ramsey’s theorem, 17, 75, 133, 157
Ramsey, Frank, 17
rectangle, 8
recurrence formula, 46, 131, 135, 153, 156
recurrence relation, 29, 45, 90, 112, 126, 132
    characteristic equation of, 112
    linear homogeneous with constant coefficients, 127
reflection, 59
relation, 141
    antisymmetric, 141
    reflexive, 141
    symmetric, 141
    transitive, 141
rhombus, 53
Riemann sphere, 22, 95, 152
Riemann surface, 64
Riemann zeta function, 40, 41, 153
Riemann, Bernhard, 22, 41
ring, 1
Robinson, John, 5
Rook path, 44, 153
rotation, 59
Russell’s paradox, 139
Russell, Bertrand, 13

saddle point, 118
scalar, 146
Schoolhouse Rock, 1
Schreier, Otto, 72
scientific notation, 116
sequence, 29, 66
    Alcuin’s, 89
    diagonal, 46, 135
    Fibonacci, 32, 91, 122, 132, 133, 154
    palindromic, 91
    period of, 91
    Perrin’s, 99
    zigzag, 91
series, 30, 163
    binomial, 128
    geometric, 30, 127, 128
    harmonic, 40
    telescoping, 105, 159
set, 140
    complement of, 140
    finite, 140
    infinite, 37, 140
    power, 140
sets
    disjoint, 140
    equal, 140
    pairwise disjoint, 140
Shannon’s theorems, 44
Shannon, Claude E., 44
Shniad, Harold, 57
Sierpiński’s triangle, 20, 96
Sierpiński, Wacław, 20
Sieve of Eratosthenes, 6
similar triangles, 54, 161
simple harmonic motion, 115
source, 43
sphere, 32, 95, 109
sphere packing, 14
spherical cap, 95
spherical law of cosines, 114
spherical triangle, 114
spherical trigonometry, 114
square, 5, 18, 49, 153, 161
square number, 7
squaring map, 21, 152
squeeze principle, 56
Stacey, Alan, 134
stamps, 96, 125
Steiner, Jakob, 103
stereographic projection, 22, 95
Storer, Thomas F., 49
subgroup, 143
subset, 140
supremum norm, 1
surface, 36
symmetry, 18, 24, 62, 115, 130, 144
symmetry group, 63
Szekeres, George, 66

tangent, 54
tangent line, 86
Tarry, Gaston, 17
telescoping series, 105, 159
tessellation, 83
tetrahedron, 35, 61, 62, 109, 152
tournament, 65, 153
    Emperor of, 153
    King of, 65, 153
    random, 65
    Serf of, 65, 153
translation, 59
transversal, 136
trapezoid, 8, 84
tree, 1
    binary, 15
triangle, 8, 49, 50, 52, 84, 85, 110, 115, 153
    altitude of, 49
    area of, 25
    centroid of, 52
    circumcenter of, 52
    equilateral, 5, 20, 50, 85, 110, 151
    Heronian, 26, 152
    incenter of, 52
    integer, 89
    orthocenter of, 52
    right, 83
triangle inequality, 89
triangular number, 7
trigonometry, 149
triskaidecagon, 59
truth table, 19
twisted sphere bundle, 1

union, 140
universal algebra, 64
unsolved problems
    algebraic extension, 64
    binary matrix game, 137
    counting proof, 46
    exact colorings of graphs, 134
    gobbling algorithm, 132
    Heronian triangles, 27
    Lucas numbers, 133
    nonattacking Queens game, 132
    postage stamps, 98
    power means, 57
    projective plane, 17
    Queen paths, 136
    Rook paths, 46, 47
    transversal achievement game, 136
    transversal of primes, 6

vector, 8, 12, 53, 146
vector space, 43, 102, 146
velocity, 114
Viète’s formulas, 112
Viète, François, 112
volume, 8, 70, 71
    of ball, 33
    of sphere, 32
    of tetrahedron, 152

van der Waerden numbers, 72
van der Waerden’s theorem, 72, 136
van der Waerden, B. L., 72
walk, 120
Wallis’s product formula, 31
Wallis, John, 31
Weber, Heinrich M., 145
Weidl, Peter, 134
Wessel, Caspar, 9
Wiles, Andrew, 8
Wilf, Herbert, 105
Wythoff’s Nim, 136
WZ method, 105
WZ pair, 105

Young tableau, 67
    standard filling of, 67

Zeilberger, Doron, 46, 105
Zermelo, Ernst, 139
zero-sum game, 117
zeta function, 40, 41, 153
ZFC set theory, 139


About the Author

Martin Erickson was born in Detroit, MI in 1963. He graduated with High Honors from the University of Michigan in 1985 and received his Ph.D. at the University of Michigan in 1987. He is a professor of mathematics at Truman State University. He has written several acclaimed mathematics books, including Aha! Solutions (MAA) and Introduction to Number Theory (with Anthony Vazzana, CRC Press). He is a member of the Mathematical Association of America and the American Mathematical Society.
