
1 Factorisation of Integers

Definition N = {1, 2, 3,...} are the natural numbers.

Definition Z = {..., −2, −1, 0, 1, 2,...} are the integers.

Z is closed under the binary operations +, × and −.

Definition If α ∈ R then ⌊α⌋ is the greatest integer which is less than or equal to α.

Ex ⌊3⌋ = 3, ⌊√2⌋ = 1, ⌊−π⌋ = −4

Then ⌊α⌋ ≤ α < ⌊α⌋ + 1

Proposition 1 If a and b are two integers with b > 0 then there are integers q and r with 0 ≤ r < b and a = qb + r.

Proof. Let α = a/b. Then

⌊a/b⌋ ≤ a/b < ⌊a/b⌋ + 1 ⇒ 0 ≤ a/b − ⌊a/b⌋ < 1 ⇒ 0 ≤ a − b⌊a/b⌋ < b

so if r = a − b⌊a/b⌋ then a = qb + r with q = ⌊a/b⌋. □
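The proof of Proposition 1 is constructive: q is the floor of a/b. A minimal Python sketch follows; the function name `divide` is ours, not part of the notes. It relies on Python's `//` operator rounding towards −∞ for integers, so `a // b` equals ⌊a/b⌋ even when a is negative.

```python
def divide(a: int, b: int) -> tuple[int, int]:
    """Return (q, r) with a == q*b + r and 0 <= r < b, for b > 0."""
    if b <= 0:
        raise ValueError("b must be positive")
    q = a // b      # floor(a / b), also correct for negative a
    r = a - q * b   # the remainder, as in the proof of Proposition 1
    return q, r


for a in (17, -17, 0, 5):
    q, r = divide(a, 5)
    assert a == q * 5 + r and 0 <= r < 5
    print(f"{a} = {q} * 5 + {r}")
```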

Definition If a = cb (a, b, c ∈ Z) we say a is a multiple of b, or b divides a, and write b|a.

Proposition 2 If b ≠ 0, c ≠ 0 then

(a) b|a and c|b ⇒ c|a
(b) b|a ⇒ bc|ac
(c) c|d and c|e ⇒ ∀m, n ∈ Z, c|dm + en.

Proposition 3 Let a, b > 0. If b|a and b ≠ a then b < a.

Definition If b|a and b ≠ 1 or a then we say b is a proper divisor of a. If b does not divide a write b ∤ a.

Definition P = {p ∈ N : p > 1 and the only divisors of p are 1 and p} are the prime numbers. Then N \ (P ∪ {1}) are the composite numbers. P = {2, 3, 5, 7, 11, 13, 17, 19, 23, ...}.

Theorem 1 Every n > 1, n ∈ N, is a product of prime numbers.
Proof. If n ∈ P we are done. If n is not prime, let q1 be the least proper divisor of n. Then q1 is prime (since otherwise, by Prop 3, it would have a smaller proper divisor, which would also be a proper divisor of n). Let n = q1n1, 1 < n1 < n. If n1 is prime we are done. If not, n1 = q2n2, 1 < n2 < n1 < n. This process must terminate in fewer than n steps. Hence n = q1q2 ··· qs with s < n. □

Ex 10725 = 3 · 5 · 5 · 11 · 13

In a prime factorization of n arrange the primes so that p1 < p2 < ··· < pk, with exponents αi ∈ N, 1 ≤ i ≤ k, so

$$n = p_1^{\alpha_1} p_2^{\alpha_2} \cdots p_k^{\alpha_k} = \prod_{j=1}^{k} p_j^{\alpha_j}$$

is the standard factorisation of n.

Prime Numbers

We can use the sieve of Eratosthenes to list the primes 2 ≤ p ≤ N. If n ≤ N and n is not prime, then n must be divisible by a prime p ≤ √N (if p1 > √N and p2 > √N ⇒ p1p2 > N).

List all of the integers between 2 and N: 2, 3, 4, 5, ..., N. Successively remove

(i) 4, 6, 8, 10, ... even integers from 2² on
(ii) 9, 15, 21, 27, ... multiples of 3 from 3² on
(iii) 25, 35, 55, 65, ... multiples of 5 from 5² on

etc., i.e. remove all integers which are proper multiples of a prime p ≤ √N. We are left with all primes up to N.

Ex N = 16, √N = 4. Striking out the multiples of 2 and 3 removes 4, 6, 8, 9, 10, 12, 14, 15, 16 and leaves {2, 3, 5, 7, 11, 13}.
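The sieve described above can be sketched in a few lines of Python. This is an illustrative sketch only; the function name `primes_up_to` is ours. Crossing out starts at p², as in steps (i) to (iii), and stops once p exceeds √N.

```python
def primes_up_to(n: int) -> list[int]:
    """Sieve of Eratosthenes: return all primes p with 2 <= p <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    p = 2
    while p * p <= n:                 # only primes p <= sqrt(n) are needed
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False   # strike out p^2, p^2 + p, ...
        p += 1
    return [k for k in range(2, n + 1) if is_prime[k]]


print(primes_up_to(16))
```

For N = 16 only p = 2 and p = 3 are used for crossing out, matching the example.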

Theorem 2 |P| = ∞, i.e. there are an infinite number of primes.
Proof. Suppose not (???) and let P = {p1, p2, ..., pn} with p1 < p2 < ··· < pn. Let $q = \prod_{j=1}^{n} p_j + 1$. Then q > pj ∀j ⇒ q ∉ P, so q is composite and, by Theorem 1, some pi | q. But pi | q ⇒ $p_i \mid q - \prod_{j=1}^{n} p_j = 1$ ⇒ pi = 1, which is false (!!!). Hence |P| = ∞. □

How many primes are there ?

Note:

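$$\sum_{n=1}^{\infty} \frac{1}{n} = \infty, \qquad \sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6} < \infty.$$

We can show

$$\sum_{j=1}^{\infty} \frac{1}{p_j} = \infty$$

so the primes are denser than the squares.

If x > 0, let S(x) = #{n ∈ N : n² ≤ x}. Then S(x) = ⌊√x⌋. We can show

$$\pi(x) = \#\{p \in P : p \le x\} \sim \frac{x}{\log(x)}.$$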

Definition A modulus is a set of integers closed under ±. The zero modulus is just {0}. If a ∈ Z then M = {na : n ∈ Z} is a modulus.

Proposition 4 If M is a modulus with a, b ∈ M and m, n ∈ Z then ma + nb ∈ M.
Proof. a ∈ M ⇒ a + a = 2a ∈ M ⇒ 2a + a = 3a ∈ M etc., and a − a = 0 ∈ M, 0 − a = −a ∈ M, so ma ∈ M for every m ∈ Z and so is nb, thus ma + nb ∈ M. □

Proposition 5 If M ≠ {0} is a modulus, it is the set of multiples of a fixed positive integer.
Proof. Let d be the least positive integer in M (one exists, since M contains −n whenever it contains n). By Proposition 4 every multiple of d lies in M.

Claim: every element of M is a multiple of d. If not (???) let n ∈ M have d ∤ n. Then n = dq + r with 1 ≤ r < d. But r = n − dq ∈ M, contradicting the minimality of d (!!!). □

Definition Let a, b ∈ Z, not both zero, and let M = {ma + nb : m, n ∈ Z}. Then M is a modulus, so by Proposition 5 it is generated by some d > 0, in that M = {nd : n ∈ Z}. We call d the greatest common divisor or GCD of a and b, and write (a, b) = d.

Proposition 6

(i) ∃x, y ∈ Z so ax + by = (a, b)
(ii) ∀x, y ∈ Z, (a, b)|ax + by
(iii) If e|a and e|b then e|(a, b)

Definition If (a, b) = 1 we say a and b are coprime.

Ex The GCD (greatest common divisor) is normally computed using the Euclidean Algorithm. From Proposition 5 (a = 323, b = 221):

323 = 221 · 1 + 102 so 102 ∈ M
221 = 102 · 2 + 17 so 17 ∈ M
102 = 17 · 6 + 0

so 17 is the least positive integer in M ⇒ (323, 221) = 17. Reading back:

17 = 221 − 2 · 102 = 221 − 2 · (323 − 221) = 3 · 221 − 2 · 323

so (a, b) = xa + yb ⇒ x = −2, y = 3.
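The "reading back" step can be folded into the forward pass of the Euclidean Algorithm. The following Python sketch (the name `extended_gcd` is ours) keeps track of coefficients x, y with ax + by equal to the current remainder, so that the final remainder comes with its representation.

```python
def extended_gcd(a: int, b: int) -> tuple[int, int, int]:
    """Return (g, x, y) with g = (a, b) and a*x + b*y == g, for a, b >= 0."""
    old_r, r = a, b      # successive remainders
    old_x, x = 1, 0      # invariant: old_r == a*old_x + b*old_y
    old_y, y = 0, 1      # invariant: r     == a*x     + b*y
    while r != 0:
        q = old_r // r
        old_r, r = r, old_r - q * r
        old_x, x = x, old_x - q * x
        old_y, y = y, old_y - q * y
    return old_r, old_x, old_y


g, x, y = extended_gcd(323, 221)
print(g, x, y)   # the example above: 17 = (-2) * 323 + 3 * 221
```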

Proposition 7 If p ∈ P and p|ab then p|a or p|b.
Proof. If p ∤ a then (a, p) = 1. By Prop 6(i) ∃x, y ∈ Z so xa + yp = 1 ⇒ xab + ybp = b.

But p|ab so ab = qp. Hence (xq + yb)p = b so p|b. □

Proposition 8 If c > 0 and (a, b) = d then (ac, bc) = dc.
Proof. ∃x, y ∈ Z so xa + yb = d ⇒ x(ac) + y(bc) = dc ⇒ (ac, bc) | dc. Also d | a ⇒ cd | ca (and similarly cd | cb) ⇒ dc | (ac, bc). Hence dc = (ac, bc). □

Theorem 3 (Fundamental Theorem of Arithmetic) The standard factorisation of a number n ∈ N is unique.
Proof. If p | ab ··· m, by Proposition 7, p must divide one of the factors. If each of these is prime, then p must be one of them. If $n = p_1^{\alpha_1} \cdots p_i^{\alpha_i} = q_1^{\beta_1} \cdots q_j^{\beta_j}$ are two standard factorizations of n, each p must be a q and each q a p. Hence i = j = k, say. Since p1 < p2 < ··· < pk and q1 < q2 < ··· < qk, pℓ = qℓ for 1 ≤ ℓ ≤ k. If β1 < α1, divide n by $p_1^{\beta_1}$ to get $p_1^{\alpha_1-\beta_1} p_2^{\alpha_2} \cdots = p_2^{\beta_2} \cdots$, which is impossible since p1 divides the left side but not the right ⇒ α1 = β1 etc. □

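Proposition 9 Let a, b ∈ N have non-standard factorisations

$$a = \prod_{j=1}^{m} p_j^{\alpha_j} \quad\text{and}\quad b = \prod_{j=1}^{m} p_j^{\beta_j}$$

with αj ≥ 0, βj ≥ 0. Then

$$(a, b) = \prod_{j=1}^{m} p_j^{\min(\alpha_j, \beta_j)}.$$

Ex $a = 2^2 \cdot 3^4 \cdot 5^1$, $b = 2^1 \cdot 3^0 \cdot 5^1$ ⇒ $(a, b) = 2^1 \cdot 3^0 \cdot 5^1$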

Definition Let a, b ∈ Z+ = {0, 1, 2,...} = N ∪ {0}. The least common multiple or LCM of a and b is the smallest common multiple of a and b and is written {a, b}.

Ex {3, 4} = 12

Proposition 10 With the same notation as for Proposition 9,

$$\{a, b\} = \prod_{j=1}^{m} p_j^{\max(\alpha_j, \beta_j)}.$$

Proposition 11 Any common multiple of a and b is a multiple of the least common multiple.

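Proposition 12 {a, b}(a, b) = ab
Proof.

$$\text{LHS} = \prod_{j=1}^{m} p_j^{\max(\alpha_j, \beta_j) + \min(\alpha_j, \beta_j)}.$$

But ∀x, y: max(x, y) + min(x, y) = x + y. Hence

$$\text{LHS} = \prod_{j=1}^{m} p_j^{\alpha_j + \beta_j} = \prod_{j=1}^{m} p_j^{\alpha_j} \cdot \prod_{j=1}^{m} p_j^{\beta_j} = ab.$$

□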

Alternative Characterisation of the GCD

By Proposition 6 (ii), (a, b)|ax + by.

Let x = 1, y = 0 ⇒ (a, b)|a. Let x = 0, y = 1 ⇒ (a, b)|b.

So g = (a, b) is a common divisor of a and b. By Proposition 6 (iii), if e|a and e|b then e | g i.e. g is divisible by every common divisor. Hence it is the greatest. This property: “being a common divisor divisible by every common divisor” characterises the GCD up to sign.

Proof. If g1 and g2 satisfy this property, then g1 and g2 are both common divisors with g1 | g2 and g2 | g1. Hence g2 = αg1 = αβg2 ⇒ αβ = 1 if g2 ≠ 0. Hence α = ±1. So g1 = ±g2. The GCD, so defined by the above property, is made unique by fixing the sign, g > 0. □

Ex
divisors of 12 = {±1, ±2, ±3, ±4, ±6, ±12} = D12
divisors of 18 = {±1, ±2, ±3, ±6, ±9, ±18} = D18
common divisors = {±1, ±2, ±3, ±6} = D12 ∩ D18
So ±6 satisfies the property. Hence, fixing the sign, 6 = (12, 18).

Linear Equations in Z

Proposition 13 Given a, b, n ∈ Z, the equation ax + by = n has an integer solution x, y ⇔ (a, b)|n.

Proof. (⇐) By Proposition 6 (i) ∃x, y such that ax + by = (a, b). Since (a, b)|n, ∃c such that (a, b)c = n. Hence a(xc) + b(yc) = (a, b)c = n and xc, yc is a solution.
(⇒) By Proposition 6 (ii), (a, b)|ax + by = n. □

Proposition 14 Let (a, b) = 1 and let x0, y0 be a solution to ax + by = n (a solution exists by Proposition 13). Then all solutions are given by

x = x0 + bt, y = y0 − at, t ∈ Z.

Proof.

a(x0 + bt) + b(y0 − at) = ax0 + abt + by0 − bat = n so each such x and y is a solution. If ax0 + by0 = n and ax + by = n also, then a(x − x0) + b(y − y0) = n − n = 0, so b | a(x − x0). But (a, b) = 1. Hence b | x − x0 ⇒ bt = x − x0 so x = x0 + bt ⇒ abt + b(y − y0) = 0 ⇒ y − y0 = −at if b ≠ 0. □

Theorem 4 If (a, b) = 1, a > 0, b > 0 then every integer n > ab − a − b is representable as n = ax + by, x ≥ 0, y ≥ 0 and ab − a − b is not.
Proof. By Proposition 14, the solutions of ax + by = n are

x = x0 + bt

y = y0 − at

Choose t so that 0 ≤ y0 − at < a ⇒ 0 ≤ y0 − at ≤ a − 1. But

(x0 + bt)a = n − (y0 − at)b > ab − a − b − (a − 1)b = −a

⇒ (x0 + bt) > −1 ⇒ (x0 + bt) ≥ 0. Hence n is representable. Finally suppose ax + by = ab − a − b (???) with x ≥ 0, y ≥ 0 ⇒ a(x + 1) + b(y + 1) = ab.

But (a, b) = 1, hence a | y + 1 (a(x + 1 − b) = b(−y − 1)) and b | x + 1 ⇒ a ≤ y + 1 and b ≤ x + 1 so ab = (x + 1)a + (y + 1)b ≥ ba + ab = 2ab (!!!). □
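Theorem 4 is easy to test by brute force for small coprime pairs. The sketch below is only a numerical check, not a proof; the helper names `representable` and `check_theorem_4` and the cut-off `limit` are our own choices.

```python
from math import gcd


def representable(n: int, a: int, b: int) -> bool:
    """True if n = a*x + b*y for some integers x >= 0 and y >= 0."""
    # Try every x with a*x <= n and test whether the rest is a multiple of b.
    return any((n - a * x) % b == 0 for x in range(n // a + 1))


def check_theorem_4(a: int, b: int, limit: int = 200) -> bool:
    """Check Theorem 4 for one coprime pair on the next `limit` integers."""
    assert a > 0 and b > 0 and gcd(a, b) == 1
    g = a * b - a - b
    above = all(representable(n, a, b) for n in range(g + 1, g + 1 + limit))
    return above and not representable(g, a, b)


print(check_theorem_4(3, 5))   # here ab - a - b = 7
```

For a = 3, b = 5 the largest non-representable integer is 7, while 8 = 3 + 5, 9 = 3 · 3, 10 = 2 · 5, and so on.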

Definition For n ∈ N, σ(n) = sum of the divisors of n, i.e. $\sigma(n) = \sum_{d \mid n} d$.

Ex σ(12) = 1 + 2 + 3 + 4 + 6 + 12 = 28 σ(6) = 1 + 2 + 3 + 6 = 12 = 2(6).

Perfect Numbers

Definition A perfect number is equal to the sum of its proper divisors,

$$n = \sum_{\substack{d \mid n \\ 1 \le d < n}} d$$

or σ(n) = 2n.

Ex 6, 28

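Proposition 15 If $n = \prod_{j=1}^{m} p_j^{\alpha_j}$ then

$$\sigma(n) = \prod_{j=1}^{m} \frac{p_j^{\alpha_j + 1} - 1}{p_j - 1}$$

Proof. All divisors of n have the form $d = p_1^{x_1} \cdots p_m^{x_m}$ with 0 ≤ xj ≤ αj. Hence

$$\sigma(n) = \sum_{x_1=0}^{\alpha_1} \cdots \sum_{x_m=0}^{\alpha_m} p_1^{x_1} \cdots p_m^{x_m} = \left(\sum_{x_1=0}^{\alpha_1} p_1^{x_1}\right) \cdots \left(\sum_{x_m=0}^{\alpha_m} p_m^{x_m}\right) = \text{RHS above},$$

on summing each geometric series. □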

Definition A function f : N → N is called multiplicative if a, b ∈ N and (a, b) = 1 ⇒ f(ab) = f(a)f(b)

Proposition 16 (a, b) = 1 ⇒ σ(ab) = σ(a)σ(b), i.e. σ is a multiplicative function.
Proof. This follows from Proposition 15. □
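Propositions 15 and 16 can be checked numerically. The following sketch computes σ(n) from the standard factorisation and compares it with the definition; the function names (`factorise`, `sigma`, `sigma_by_definition`) are ours, and trial division is used only because it is short, not because it is efficient.

```python
def factorise(n: int) -> dict[int, int]:
    """Return {prime: exponent} for n >= 1, by trial division."""
    factors: dict[int, int] = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:                      # whatever is left is a prime factor
        factors[n] = factors.get(n, 0) + 1
    return factors


def sigma(n: int) -> int:
    """sigma(n) via Proposition 15: product of (p^(alpha+1) - 1) / (p - 1)."""
    result = 1
    for p, alpha in factorise(n).items():
        result *= (p ** (alpha + 1) - 1) // (p - 1)
    return result


def sigma_by_definition(n: int) -> int:
    """sigma(n) as the sum of all divisors of n."""
    return sum(d for d in range(1, n + 1) if n % d == 0)


print(factorise(10725), sigma(12), sigma(6))
```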

Theorem 5 Let $p = 2^n - 1$ be prime. Then $m = \frac{1}{2} p(p + 1) = 2^{n-1}(2^n - 1)$ is perfect. Every even perfect number has this form.
Proof. $m = \frac{1}{2} p(p + 1) = 2^{n-1} p$ and p is odd. By Proposition 15

$$\sigma(m) = \frac{2^n - 1}{2 - 1} \cdot \frac{p^2 - 1}{p - 1} = (2^n - 1)(p + 1) = p(p + 1) = 2m$$

so m is perfect. Let a be an even perfect number, $a = 2^{n-1} u$, u > 1, 2 ∤ u. (Note that $\sigma(2^\alpha) = 2^{\alpha+1} - 1 \ne 2 \cdot 2^\alpha$, so no power of 2 is perfect.) Since σ is multiplicative,

$$\sigma(a) = \frac{2^n - 1}{2 - 1}\, \sigma(u) = 2a = 2^n u$$

since a is perfect. Hence

$$\sigma(u) = \frac{2^n u}{2^n - 1} = u + \frac{u}{2^n - 1}.$$

But u | u and $\frac{u}{2^n - 1} \mid u$, so u has just two divisors, hence u ∈ P and $\frac{u}{2^n - 1} = 1$ ⇒ $u = 2^n - 1$. □

Conjecture There are no odd perfect numbers.

Definition If $p = 2^n - 1 \in P$ we say p is a Mersenne prime.

Theorem 6 If n > 1 and $a^n - 1$ is prime then a = 2 and n is prime.
Proof. If a > 2 then $a - 1 \mid a^n - 1$ (since $a^n - 1 = (a - 1)(a^{n-1} + a^{n-2} + \cdots + 1)$) so $a^n - 1 \notin P$. If a = 2 and n = jℓ, where j is a proper divisor of n, then $2^n - 1 = (2^j)^\ell - 1$ is divisible by $2^j - 1$ (take $a = 2^j$ in the equation above). Hence n ∈ P. □

web: http://www.utm.edu/research/primes/mersenne.shtml
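Theorems 5 and 6 can be illustrated for small exponents. The sketch below lists the even perfect numbers $2^{n-1}(2^n - 1)$ arising from Mersenne primes and confirms perfection directly from the definition; the function names and the trial-division primality test are our own simple choices, adequate only for small n.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test (fine for small n)."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


def sum_proper_divisors(n: int) -> int:
    """Sum of the divisors d of n with 1 <= d < n."""
    return sum(d for d in range(1, n) if n % d == 0)


def mersenne_exponents(max_n: int) -> list[int]:
    """Exponents 2 <= n <= max_n for which 2^n - 1 is prime."""
    return [n for n in range(2, max_n + 1) if is_prime(2 ** n - 1)]


def even_perfect_numbers(max_n: int) -> list[int]:
    """The numbers 2^(n-1) * (2^n - 1) of Theorem 5, for n <= max_n."""
    return [2 ** (n - 1) * (2 ** n - 1) for n in mersenne_exponents(max_n)]


print(mersenne_exponents(7), even_perfect_numbers(7))
```

Note that 2 is the only base that can occur and every exponent found is itself prime, as Theorem 6 requires; the converse fails in general (a prime exponent need not give a prime $2^n - 1$).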

Theorem 7 If $2^m + 1 \in P$ then $m = 2^n$.
Proof. If m = qr, where q > 1 is odd, then $2^{qr} + 1 = (2^r)^q + 1 = (2^r + 1)(2^{r(q-1)} - 2^{r(q-2)} + \cdots + 1)$ and $1 < 2^r + 1 < 2^{qr} + 1$ so $2^{qr} + 1$ cannot be prime. Hence m has no odd prime factor. Hence $m = 2^n$ for some integer n ≥ 0. □

Note The factorization

$$a^n - b^n = (a - b)(a^{n-1} + a^{n-2} b + a^{n-3} b^2 + \cdots + b^{n-1})$$

works here for odd n since

$$a^n + 1 = a^n - (-1)^n = (a - (-1))(a^{n-1} + a^{n-2}(-1) + a^{n-3}(-1)^2 + \cdots + (-1)^{n-1}) = (a + 1)(a^{n-1} - a^{n-2} + a^{n-3} - \cdots + 1)$$

Fermat Numbers

Definition The n-th Fermat number is $F_n = 2^{2^n} + 1$.

F0 = 3, F1 = 5, F2 = 17, F3 = 257, F4 = 65537.

Fi ∈ P for 0 ≤ i ≤ 4. No other Fermat prime is known.

F5 ∉ P.

(Euler, 1732): $641 \mid 2^{2^5} + 1 = 641 \cdot 6700417$.

Proof. Let

$a = 2^7$ and $b = 5$, so that $a - b^3 = 3$ and $1 + ab - b^4 = 1 + 5 \cdot 3 = 2^4$.

Therefore

$$2^{2^5} + 1 = (2^8)^4 + 1 = (2a)^4 + 1 = 2^4 a^4 + 1 = (1 + ab - b^4)a^4 + 1 = (1 + ab)a^4 + 1 - a^4 b^4$$
$$= (1 + ab)a^4 + (1 - a^2 b^2)(1 + a^2 b^2) = (1 + ab)\left[a^4 + (1 - ab)(1 + a^2 b^2)\right]$$

and 1 + ab = 641. □
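Euler's factorisation, and each identity used in the proof, can be confirmed with exact integer arithmetic. The short Python check below is illustrative; the helper name `fermat` is ours.

```python
def fermat(n: int) -> int:
    """The n-th Fermat number 2^(2^n) + 1."""
    return 2 ** (2 ** n) + 1


a, b = 2 ** 7, 5
# The three identities used in the proof.
assert a - b ** 3 == 3
assert 1 + a * b - b ** 4 == 2 ** 4
assert 1 + a * b == 641

# The cofactor in the last line of the proof.
cofactor = a ** 4 + (1 - a * b) * (1 + a ** 2 * b ** 2)
assert fermat(5) == 641 * cofactor

print([fermat(n) for n in range(5)], cofactor)
```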

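Theorem 8 (Lagrange) If p ∈ P, the exact power α of p dividing n! (written $p^\alpha \,\|\, n!$) is

$$\alpha = \left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor + \left\lfloor \frac{n}{p^3} \right\rfloor + \cdots$$

Proof.

n! = 1 · 2 ··· (p − 1) · p · (p + 1) ··· 2p ··· (p − 1)p ··· p² ···

There are ⌊n/p⌋ multiples of p, ⌊n/p²⌋ multiples of p², etc.

Each multiple of p contributes 1 to α. Each multiple of p² has already contributed 1, being a multiple of p, so contributes 1 more to α, leading to

$$\left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor$$

etc. Hence

$$\alpha = \left\lfloor \frac{n}{p} \right\rfloor + \left\lfloor \frac{n}{p^2} \right\rfloor + \left\lfloor \frac{n}{p^3} \right\rfloor + \cdots + \left\lfloor \frac{n}{p^r} \right\rfloor$$

where r is the least integer such that $p^{r+1} > n$. So $\lfloor n / p^\beta \rfloor = 0$ ∀β ≥ r + 1. □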

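Ex n = 12, p = 3 so

$$\alpha = \left\lfloor \frac{12}{3} \right\rfloor + \left\lfloor \frac{12}{9} \right\rfloor + \left\lfloor \frac{12}{27} \right\rfloor = 4 + 1 + 0 = 5.$$

In 12! = 12 · 11 · 10 · 9 · 8 · 7 · 6 · 5 · 4 · 3 · 2 · 1 the factors 12, 9, 6 and 3 contribute 1, 2, 1 and 1 powers of 3 respectively, and $3^5 \,\|\, 12!$.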
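The sum in Theorem 8 is finite and quick to evaluate. The sketch below (function names ours) computes it and compares it with the exponent found by dividing n! directly.

```python
from math import factorial


def power_in_factorial(n: int, p: int) -> int:
    """Exponent of the prime p in n!, by the formula of Theorem 8."""
    alpha = 0
    power = p
    while power <= n:          # terms with p^j > n are zero
        alpha += n // power    # floor(n / p^j)
        power *= p
    return alpha


def power_by_division(m: int, p: int) -> int:
    """Largest alpha with p^alpha dividing m, by repeated division."""
    alpha = 0
    while m % p == 0:
        m //= p
        alpha += 1
    return alpha


print(power_in_factorial(12, 3), power_by_division(factorial(12), 3))
```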

2 Congruences

Definition a ≡ b (mod m) if m | a − b, where m ≠ 0 and a, b, m ∈ Z. If so we say a is congruent to b modulo m. We call m the modulus.

Proposition 17 ≡ is an equivalence relation on Z and the set of equivalence classes forms a ring $(Z_m, +, \cdot, [1]_m)$ where

$$[a]_m + [b]_m = [a + b]_m$$

$$[a]_m \cdot [b]_m = [a \cdot b]_m$$

Proposition 18 If a1 ≡ b1 (mod m) and a2 ≡ b2 (mod m) then

a1 · a2 ≡ b1 · b2 (mod m) and a1 + a2 ≡ b1 + b2 (mod m).

Proposition 19 If ac ≡ bd (mod m), c ≡ d (mod m) and (c, m) = 1 then a ≡ b (mod m).

Proof. (a − b)c + b(c − d) = ac − bd ≡ 0 (mod m) and m | c − d ⇒ m | (a − b)c ⇒ m | a − b, since (c, m) = 1, so a ≡ b (mod m). □

If m ∈ P then (c, m) = 1 ∀c ∈ Z with m ∤ c, c ≠ 0 and ∃x, y ∈ Z so that cx + my = (c, m) = 1 so cx ≡ 1 (mod m). Hence $[c]_m$ has a multiplicative inverse class $[x]_m$ and $(Z_m, +, \cdot, [1]_m)$ is a field GF(m) called a Galois field.

Note $[c]_m$ is called a residue class with representative c. Each class has a smallest non-negative representative.

Proof. If c ∈ Z and m > 0, ∃q, r so that c = mq + r, 0 ≤ r < m and c ≡ r (mod m) ⇒ $[c]_m = [r]_m$. □

Ex m = 5

$$GF(5) = \{[0]_5, [1]_5, [2]_5, [3]_5, [4]_5\}$$

Euler’s Phi Function φ

Definition φ(n) = #{i ≤ n : 1 ≤ i and (i, n) = 1} is the number of natural numbers less than or equal to n and coprime to n.

Ex φ(1) = 1, φ(2) = 1, φ(4) = 2 since (1, 4) = 1, (2, 4) = 2, (3, 4) = 1, (4, 4) = 4.

Ex p ∈ P ⇒ φ(p) = p − 1 since (p, 1) = 1, (p, p) = p and (p, j) = 1, 1 < j < p.

Consider m > 1. In $Z_m$, $[c]_m$ will have an inverse class ⇔ (c, m) = 1. (⇐) cx + my = (c, m) = 1 ⇒ cx ≡ 1 (mod m). (⇒) cx ≡ 1 (mod m) ⇒ cx + my = 1 for some y ⇒ (c, m) | 1. Hence the number of classes which have inverses is φ(m).

Definition A reduced residue system is a complete set of representatives for those classes with inverses.

Ex {1, 3} is such a system for Z4.

Proposition 20 If $a_1, \ldots, a_{\phi(m)}$ is a reduced residue system and (m, k) = 1 then $k a_1, \ldots, k a_{\phi(m)}$ is also a reduced residue system.

Proof. $(a_i, m) = 1 \Rightarrow (k a_i, m) = 1$. If $k a_i \equiv k a_j \pmod m$ ⇒ $a_i \equiv a_j \pmod m$ ⇒ i = j. Hence the $k a_i$ represent distinct residue classes, and each is coprime with m. □

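Theorem 9 (Euler) (a, m) = 1 ⇒ $a^{\phi(m)} \equiv 1 \pmod m$.

Proof. Let $a_1, \ldots, a_{\phi(m)}$ be a reduced residue system. By Proposition 20, the $\{a a_i : 1 \le i \le \phi(m)\}$ and $\{a_i : 1 \le i \le \phi(m)\}$ represent the same classes (albeit in a different order). Hence

$$\prod_{j=1}^{\phi(m)} (a a_j) \equiv \prod_{j=1}^{\phi(m)} a_j \pmod m \;\Rightarrow\; a^{\phi(m)} \prod_{j=1}^{\phi(m)} a_j \equiv \prod_{j=1}^{\phi(m)} a_j \pmod m$$

and so $a^{\phi(m)} \equiv 1 \pmod m$, since $(a_j, m) = 1$ means we can cancel. □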

Corollary (Fermat’s Little Theorem) (a, p) = 1 ⇒ $a^p \equiv a \pmod p$.
Proof. φ(p) = p − 1 so $a^{p-1} \equiv 1 \pmod p$ ⇒ $a^p \equiv a \pmod p$. □

Note Simple probabilistic primality test: Check q ∈ N through considering $a^q \equiv a \pmod q$ for random a with (a, q) = 1. A failure proves q composite; a pass does not prove q prime.

Note Euler’s $a^{\phi(m)} \equiv 1 \pmod m$ is the basis of RSA public key cryptography.
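Both Euler's theorem and the simple primality test can be tried out with Python's built-in three-argument `pow`, which performs modular exponentiation. The sketch below is illustrative: the function names, the number of rounds and the fixed random seed are our assumptions, and φ is computed naively from its definition.

```python
import random
from math import gcd


def phi(m: int) -> int:
    """Euler's phi function, straight from the definition."""
    return sum(1 for i in range(1, m + 1) if gcd(i, m) == 1)


def euler_holds(m: int) -> bool:
    """Check a^phi(m) = 1 (mod m) for every a in 1..m coprime to m."""
    e = phi(m)
    return all(pow(a, e, m) == 1 % m for a in range(1, m + 1) if gcd(a, m) == 1)


def fermat_test(q: int, rounds: int = 20, seed: int = 0) -> bool:
    """Return False if q is certainly composite, True if q passed every round.

    A True answer is only evidence of primality, not a proof.
    """
    if q < 4:
        return q in (2, 3)
    rng = random.Random(seed)
    for _ in range(rounds):
        a = rng.randrange(2, q - 1)       # random base with 2 <= a <= q - 2
        if gcd(a, q) != 1:                # a shares a factor with q
            return False
        if pow(a, q, q) != a:             # a^q = a (mod q) fails
            return False
    return True


print(all(euler_holds(m) for m in range(1, 60)), fermat_test(97), fermat_test(9))
```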

Proposition 21 Let (m, m′) = 1, let x run over a complete residue system (mod m) and x′ over a complete system (mod m′). Then mx′ + m′x runs over a complete system (mod mm′).
Proof. Consider the mm′ numbers mx′ + m′x. If mx′ + m′x ≡ my′ + m′y (mod mm′) then

mx′ ≡ my′ (mod m′) and m′x ≡ m′y (mod m) ⇒ x′ ≡ y′ (mod m′) and x ≡ y (mod m)

since (m, m′) = 1. So each class is distinct. The result follows since there are mm′ classes (mod mm′). □

Proposition 22 Same as before but ‘complete’ → ‘reduced’.
Proof. Claim: (mx′ + m′x, mm′) = 1. If not (???) let p ∈ P have p | (mx′ + m′x, mm′). If p | m then p | m′x. But (m, m′) = 1 so p ∤ m′, hence p | x and p | (m, x), which is false (!!!) since (x, m) = 1. The case p | m′ is similar. This proves the claim.

Claim: Every a ∈ Z with (a, mm′) = 1 satisfies a ≡ mx′ + m′x (mod mm′) for some x, x′ with (x, m) = (x′, m′) = 1. By Proposition 21, ∃x, x′ so a ≡ mx′ + m′x (mod mm′). If (x, m) = d ≠ 1 then (a, m) = (mx′ + m′x, m) = (m′x, m) = (x, m) = d ≠ 1, which is false. Similarly (x′, m′) = 1.

By the above, the numbers mx′ + m′x are incongruent. Hence we have a reduced residue system of this form. □

Theorem 10 φ is a multiplicative function.
Proof. If (m, m′) = 1,

φ(mm′) = #{RRS(mm′)} = #{RRS(m)} · #{RRS(m′)} = φ(m) · φ(m′)

□

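Since φ is multiplicative, if $n = \prod_{j=1}^{m} p_j^{\alpha_j}$ is the standard factorisation,

$$\phi(n) = \prod_{j=1}^{m} \phi(p_j^{\alpha_j}).$$

Theorem 11

$$\phi(p^\alpha) = p^\alpha \left(1 - \frac{1}{p}\right) \quad\text{so}\quad \phi(n) = n \prod_{p \mid n} \left(1 - \frac{1}{p}\right).$$

Proof. Consider the natural numbers j in the interval $1 \le j \le p^\alpha$. There are

$$\frac{p^\alpha}{p} = p^{\alpha - 1}$$

multiples of p and the rest are coprime with p: (j, p) = 1, hence $(j, p^\alpha) = 1$. Therefore $\phi(p^\alpha) = p^\alpha - p^{\alpha-1} = p^\alpha \left(1 - \frac{1}{p}\right)$. □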

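Ex

$$\phi(100) = \phi(2^2 \cdot 5^2) = 100 \left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{5}\right) = 100 \cdot \frac{1}{2} \cdot \frac{4}{5} = 40$$

⇒ 40% of the natural numbers are coprime with 100.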
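The product formula of Theorem 11 gives a much faster way to compute φ than counting. A small Python sketch, with function names of our own choosing, compares the two:

```python
from math import gcd


def prime_divisors(n: int) -> list[int]:
    """The distinct primes dividing n, by trial division."""
    primes = []
    p = 2
    while p * p <= n:
        if n % p == 0:
            primes.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        primes.append(n)
    return primes


def phi(n: int) -> int:
    """phi(n) = n * product over primes p | n of (1 - 1/p), in integers."""
    result = n
    for p in prime_divisors(n):
        result = result // p * (p - 1)   # exact: p still divides result here
    return result


def phi_by_counting(n: int) -> int:
    """phi(n) from the definition."""
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


print(phi(100), phi_by_counting(100))
```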

Theorem 12 (Wilson) If p ∈ P, (p − 1)! ≡ −1 (mod p).
Proof. In $Z_p$, $f(x) = x^{p-1} - 1$ has degree p − 1 and roots $[1]_p, [2]_p, \ldots, [p-1]_p$ since $a^{p-1} \equiv 1 \pmod p$. Hence $x^{p-1} - 1 \equiv (x - 1)(x - 2) \cdots (x - (p - 1)) \pmod p$. Setting x = 0 ⇒ $-1 \equiv (-1)^{p-1}(p - 1)! \pmod p$ ⇒ (p − 1)! ≡ −1 (mod p) for p odd, and for p = 2, (2 − 1)! = 1! = 1 ≡ −1 (mod 2). □

Note The converse also holds.
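Wilson's theorem and its converse together characterise the primes, which can be seen directly for small n. The sketch below uses exact factorials, so it is a check of the statement rather than a practical primality test; the function name is ours.

```python
from math import factorial


def wilson_criterion(n: int) -> bool:
    """True if (n - 1)! = -1 (mod n), for n >= 2."""
    return (factorial(n - 1) + 1) % n == 0


print([n for n in range(2, 40) if wilson_criterion(n)])
```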

Note on Fermat Numbers

These can be defined as

$$F_0 = 3, \qquad F_{n+1} = F_n^2 - 2F_n + 2, \quad n \ge 0$$

since then

$$F_n - 1 = (F_{n-1} - 1)^2 = (F_{n-2} - 1)^{2^2} = \cdots = (F_0 - 1)^{2^n} = 2^{2^n}$$

so $F_n = 2^{2^n} + 1$ ∀n ≥ 0.

Proposition 23 $(F_n, F_m) = 1$ ∀n ≠ m.

3 Möbius Function and Möbius Inversion

(Mathematica: MoebiusMu[n])

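Definition

$$\mu(n) = \begin{cases} 1 & \text{if } n = 1 \\ (-1)^m & \text{if } n \text{ is a product of } m \text{ distinct } p \in P \\ 0 & \text{if } \exists p \in P \text{ with } p^2 \mid n \end{cases}$$

Ex µ(1) = 1, µ(2) = −1, µ(6) = (−1)² = 1, µ(p) = −1, µ(4) = 0, µ(12) = µ(2² · 3) = 0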

Proposition 24 µ is multiplicative.
Proof. Let (a, b) = 1, $a = \prod_{i=1}^{m} p_i^{\alpha_i}$, $b = \prod_{j=1}^{n} q_j^{\beta_j}$. If some αi or βj ≥ 2 then µ(ab) = 0 and µ(a) or µ(b) = 0, so µ(ab) = 0 = µ(a)µ(b). If not,

$$\mu(ab) = (-1)^{n+m} = (-1)^m (-1)^n = \mu(a)\mu(b)$$

so µ is multiplicative. □

Definition

$$I(n) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n > 1 \end{cases}$$

Proposition 25 If f(n) is multiplicative and not identically zero, then f(1) = 1.
Proof. (1, a) = 1 ⇒ f(1 · a) = f(1)f(a) so f(a) = f(1)f(a). If we choose a so f(a) ≠ 0 then 1 = f(1). □

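Theorem 13 Let g(n) and h(n) be multiplicative. Then the function

$$f(n) = \sum_{d \mid n} g(d)\, h\!\left(\frac{n}{d}\right)$$

is also multiplicative.
Proof. Let (a, b) = 1. Every divisor d of ab factors uniquely as d = uv with u | a and v | b, so

$$f(ab) = \sum_{d \mid ab} g(d)\, h\!\left(\frac{ab}{d}\right) = \sum_{u \mid a} \sum_{v \mid b} g(uv)\, h\!\left(\frac{ab}{uv}\right) = \sum_{u \mid a} \sum_{v \mid b} g(u)\, g(v)\, h\!\left(\frac{a}{u}\right) h\!\left(\frac{b}{v}\right)$$

since $(u, v) = \left(\frac{a}{u}, \frac{b}{v}\right) = 1$. Therefore

$$f(ab) = \left(\sum_{u \mid a} g(u)\, h\!\left(\frac{a}{u}\right)\right) \left(\sum_{v \mid b} g(v)\, h\!\left(\frac{b}{v}\right)\right) = f(a) f(b).$$

□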

Proposition 26 Let f be multiplicative and not identically zero. Then

$$\sum_{d \mid n} \mu(d) f(d) = \prod_{p \mid n} (1 - f(p)) \qquad (1)$$

where the product includes one term for each prime divisor of n.
Proof. Let g(n) = µ(n)f(n) and h(n) = 1 in Theorem 13. Then the LHS of equation (1) is $\sum_{d \mid n} g(d)\, h\!\left(\frac{n}{d}\right)$, so is multiplicative. The RHS of (1) is also multiplicative since if n = ab with (a, b) = 1 then p|n ⇔ p|a or p|b, and not both.

At n = 1: LHS = µ(1)f(1) = 1 and RHS = empty product = 1 (by definition).

At $n = p^\alpha$:

$$\text{LHS} = \sum_{d \mid p^\alpha} \mu(d) f(d) = \mu(1)f(1) + \mu(p)f(p) + \mu(p^2)f(p^2) + \cdots = 1 + (-1)f(p) + 0 + 0 + \cdots = 1 - f(p)$$

$$\text{RHS} = \prod_{p \mid p^\alpha} (1 - f(p)) = 1 - f(p) = \text{LHS}$$

Hence they are equal, since multiplicative functions are determined by their values at 1 and prime powers. □

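Proposition 27 If n > 0,

$$\sum_{d \mid n} \mu(d) = I(n) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n > 1 \end{cases}$$

Proof. Let f(d) = 1 in Proposition 26, so that for n > 1 the product on the right of (1) vanishes, and note

$$\sum_{d \mid 1} \mu(d) = \mu(1) = 1.$$

□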

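Theorem 14

$$\sum_{d \mid n} \phi(d) = n$$

Proof. Let S = {1, 2, ···, n}. If d | n let A(d) = {k : (k, n) = d, 1 ≤ k ≤ n}. Then $S = \bigsqcup_{d \mid n} A(d)$ (i.e. a disjoint union) ⇒ $\#S = \sum_{d \mid n} \#A(d)$, or $n = \sum_{d \mid n} f(d)$ where f(d) = #A(d). But

$$(k, n) = d \Leftrightarrow \left(\frac{k}{d}, \frac{n}{d}\right) = 1 \quad\text{and}\quad 0 < k \le n \Leftrightarrow 0 < \frac{k}{d} \le \frac{n}{d}$$

so if $q = \frac{k}{d}$ there is a 1-1 correspondence between the elements of A(d) and the q ∈ N satisfying $0 < q \le \frac{n}{d}$ and $\left(q, \frac{n}{d}\right) = 1$, i.e. $f(d) = \phi\!\left(\frac{n}{d}\right)$. Hence $n = \sum_{d \mid n} \phi\!\left(\frac{n}{d}\right)$. But as d runs through the divisors of n, so does $\frac{n}{d}$. Hence

$$n = \sum_{d \mid n} \phi(d).$$

□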

Ex Divisors of 6 are {1, 2, 3, 6} and

$$\phi(1) + \phi(2) + \phi(3) + \phi(6) = 1 + (2 - 1) + (3 - 1) + 6\left(1 - \frac{1}{2}\right)\left(1 - \frac{1}{3}\right) = 1 + 1 + 2 + 2 = 6$$

Dirichlet Multiplication

Definition If f and g are two real functions on N then define their Dirichlet product (or convolution) h(n) as

$$h(n) = \sum_{d \mid n} f(d)\, g\!\left(\frac{n}{d}\right) = (f * g)(n).$$

Proposition 28 I ∗ f = f ∗ I = f where

$$I(n) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{if } n > 1 \end{cases}$$

Proposition 29
f ∗ g = g ∗ f (commutative law)
(f ∗ g) ∗ k = f ∗ (g ∗ k) (associative law)

Definition The function u(n) = 1 ∀n ∈ N.

Then for Proposition 27:

$$\sum_{d \mid n} \mu(d) = I(n) \quad\text{is}\quad \mu * u = I \qquad (2)$$

and for Theorem 14:

$$\sum_{d \mid n} \phi(d) = n \quad\text{is}\quad \phi * u = N$$

where N(n) = n is the identity. If f(1) ≠ 0 there is a unique function $f^{-1}$ with $f * f^{-1} = f^{-1} * f = I$.

Ex By (2) $u = \mu^{-1}$, $u^{-1} = \mu$.

Theorem 13 says if f and g are multiplicative then so is f ∗ g, their Dirichlet product.

Theorem 15 (Möbius Inversion Formula)

$$f(n) = \sum_{d \mid n} g(d) \;\Leftrightarrow\; g(n) = \sum_{d \mid n} \mu(d)\, f\!\left(\frac{n}{d}\right)$$

Proof.(⇒) f = g ∗ u ⇒

f ∗ µ = (g ∗ u) ∗ µ = g ∗ (u ∗ µ) = g ∗ I = g.

(⇐) g = f ∗ µ ⇒

g ∗ u = (f ∗ µ) ∗ u = f ∗ (µ ∗ u) = f ∗ I = f.

□

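Ex Theorem 14:

$$\sum_{d \mid n} \phi(d) = n, \quad\text{i.e.}\quad \phi * u = N$$

$$\Rightarrow\; \phi(n) = (\mu * N)(n) = \sum_{d \mid n} \mu(d)\, \frac{n}{d} = n \sum_{d \mid n} \frac{\mu(d)}{d}$$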
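The identities µ ∗ u = I, φ ∗ u = N and φ = µ ∗ N can all be verified numerically with a direct implementation of the Dirichlet product. This is a sketch for small n only; the function names (`mobius`, `dirichlet`, and so on) are ours.

```python
from math import gcd


def mobius(n: int) -> int:
    """The Moebius function mu(n), by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:        # p^2 divided the original n
                return 0
            result = -result
        p += 1
    if n > 1:                     # one remaining prime factor
        result = -result
    return result


def dirichlet(f, g):
    """Return the Dirichlet product f * g as a new function on N."""
    def h(n: int) -> int:
        return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)
    return h


def u(n: int) -> int:        # u(n) = 1 for all n
    return 1


def N(n: int) -> int:        # N(n) = n
    return n


def I(n: int) -> int:        # identity for the Dirichlet product
    return 1 if n == 1 else 0


def phi(n: int) -> int:      # Euler's function from the definition
    return sum(1 for i in range(1, n + 1) if gcd(i, n) == 1)


phi_by_inversion = dirichlet(mobius, N)
print([phi_by_inversion(n) for n in (1, 6, 12, 100)])
```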

Liouville’s Function

Definition

$$n = \prod_{i=1}^{m} p_i^{\alpha_i} \;\Rightarrow\; \lambda(n) = (-1)^{\sum_{i=1}^{m} \alpha_i}$$

Then λ is completely multiplicative.

Theorem 16 ∀n ≥ 1,

$$\sum_{d \mid n} \lambda(d) = \begin{cases} 1 & \text{if } n \text{ is a square} \\ 0 & \text{otherwise.} \end{cases}$$

Proof. Let $g(n) = \sum_{d \mid n} \lambda(d)$. Then g = λ ∗ u is multiplicative as the Dirichlet product of multiplicative functions. So we need to compute $g(p^\alpha)$ for p ∈ P and α = 1, 2, 3, ...

$$g(p^\alpha) = \sum_{d \mid p^\alpha} \lambda(d) = \lambda(1) + \lambda(p) + \lambda(p^2) + \cdots + \lambda(p^\alpha) = 1 - 1 + 1 - 1 + \cdots + (-1)^\alpha = \begin{cases} 0 & \text{if } \alpha \text{ is odd} \\ 1 & \text{if } \alpha \text{ is even.} \end{cases}$$

If $n = \prod_{i=1}^{m} p_i^{\alpha_i}$ and n is not a square, then ∃j so αj is odd, hence $g(n) = \prod_{i=1}^{m} g(p_i^{\alpha_i}) = 0$ since the j-th term is zero. If n is a square each αi is even, hence $g(p_i^{\alpha_i}) = 1$ ∀i ⇒ g(n) = 1. □
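Theorem 16 is pleasant to confirm by machine. In the sketch below (function names ours) λ is computed by counting prime factors with multiplicity, and the divisor sum is compared with the indicator of the squares.

```python
from math import isqrt


def liouville(n: int) -> int:
    """lambda(n) = (-1)^(number of prime factors of n, with multiplicity)."""
    count = 0
    p = 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:
        count += 1
    return -1 if count % 2 else 1


def liouville_divisor_sum(n: int) -> int:
    """Sum of lambda(d) over the divisors d of n."""
    return sum(liouville(d) for d in range(1, n + 1) if n % d == 0)


def is_square(n: int) -> bool:
    return isqrt(n) ** 2 == n


print([n for n in range(1, 50) if liouville_divisor_sum(n) == 1])
```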

4 Averages of Arithmetic Functions

Definition

$$d(n) = \sum_{\substack{d \mid n \\ 1 \le d \le n}} 1 = \#\text{ of divisors of } n \in N$$

is the “divisor function”.

Then, as a function of n, d is very irregular: d(p) = 2 ∀p ∈ P but d(n) can be very large. Averages are smoother:

$$\tilde{d}(n) = \frac{1}{n} \sum_{j=1}^{n} d(j);$$

indeed (later)

$$\lim_{n \to \infty} \frac{\tilde{d}(n)}{\log(n)} = 1.$$

We need the partial sums

$$D(x) = \sum_{1 \le j \le x} d(j)$$

where we define D(x) = 0 for 0 < x < 1. So D(x) = d(1) + d(2) + ··· + d(⌊x⌋), x ≥ 1.

Later we prove Dirichlet’s theorem:

$$x \ge 1 \;\Rightarrow\; D(x) = x \log(x) + (2\gamma - 1)x + O(\sqrt{x})$$

(γ is Euler’s constant) where f(x) = O(g(x)) if ∃x0, ∃M > 0 such that ∀x ≥ x0, |f(x)| ≤ M g(x) defines the ‘big-Oh’ notation, and f(x) = h(x) + O(g(x)) ⇔ f(x) − h(x) = O(g(x)).

Ex x = O(x²), x² + 7x + 20 = O(x²)

Normally, f(x) is number theoretic, like D(x), h(x) is ‘nice’ and smooth, g(x) is a nice power, or other simple ‘mop up’ for the ‘random variation’ in f(x) e.g. D(x) = x log(x)+ O(x).

Definition We say f(x) is asymptotic to g(x) as x → ∞ if

$$\lim_{x \to \infty} \frac{f(x)}{g(x)} = 1$$

and write f(x) ∼ g(x), x → ∞.

So D(x) ∼ x log(x) as x → ∞ since

$$\frac{D(x)}{x \log(x)} = \frac{x \log(x)}{x \log(x)} + \frac{(2\gamma - 1)x}{x \log(x)} + O\!\left(\frac{\sqrt{x}}{x \log(x)}\right)$$

⇒ $\frac{D(x)}{x \log(x)} \to 1$. From this it follows that $\tilde{d}(n) \sim \log(n)$.

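Theorem 17 (Euler Summation) If f has a continuous derivative f′ on [y, x] ⊂ R where 0 < y < x, then

$$S = \sum_{y < n \le x} f(n) = \int_y^x f(t)\,dt + \int_y^x (t - \lfloor t \rfloor) f'(t)\,dt + f(x)(\lfloor x \rfloor - x) - f(y)(\lfloor y \rfloor - y). \qquad (3)$$

Proof. Let m = ⌊y⌋, k = ⌊x⌋. If n, n − 1 ∈ [y, x]:

$$\int_{n-1}^{n} \lfloor t \rfloor f'(t)\,dt = \int_{n-1}^{n} (n - 1) f'(t)\,dt = (n - 1)(f(n) - f(n - 1)) = \{n f(n) - (n - 1) f(n - 1)\} - f(n).$$

Summing from n = m + 2 to n = k, the sum in braces ({···}) telescopes to give

$$\int_{m+1}^{k} \lfloor t \rfloor f'(t)\,dt = k f(k) - (m + 1) f(m + 1) - \sum_{n=m+2}^{k} f(n) = k f(k) - m f(m + 1) - \sum_{y < n \le x} f(n).$$

Since ⌊t⌋ = m on [y, m + 1) and ⌊t⌋ = k on [k, x], adding the two end pieces gives

$$\sum_{y < n \le x} f(n) = -\int_y^x \lfloor t \rfloor f'(t)\,dt + k f(x) - m f(y). \qquad (4)$$

Integrating $\int_y^x f(t)\,dt$ (by parts) gives

$$\int_y^x f(t)\,dt = x f(x) - y f(y) - \int_y^x t f'(t)\,dt. \qquad (5)$$

Then (4) − (5) ⇒ (3). □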

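Theorem 18

$$\sum_{n \le x} \frac{1}{n} = \log(x) + \gamma + O\!\left(\frac{1}{x}\right)$$

where

$$\gamma = 1 - \int_1^\infty \frac{t - \lfloor t \rfloor}{t^2}\,dt = \lim_{x \to \infty} \left(\sum_{n \le x} \frac{1}{n} - \log(x)\right).$$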

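Proof. Let $f(t) = \frac{1}{t}$ in Theorem 17 with y = 1, so $f'(t) = -\frac{1}{t^2}$ and

$$\sum_{n \le x} \frac{1}{n} = 1 + \sum_{1 < n \le x} \frac{1}{n} = 1 + \int_1^x \frac{dt}{t} - \int_1^x \frac{t - \lfloor t \rfloor}{t^2}\,dt - \frac{x - \lfloor x \rfloor}{x}$$

$$= \log(x) + 1 - \int_1^\infty \frac{t - \lfloor t \rfloor}{t^2}\,dt + \int_x^\infty \frac{t - \lfloor t \rfloor}{t^2}\,dt + O\!\left(\frac{1}{x}\right) = \log(x) + \gamma + O\!\left(\frac{1}{x}\right),$$

since $0 \le \int_x^\infty \frac{t - \lfloor t \rfloor}{t^2}\,dt \le \int_x^\infty \frac{dt}{t^2} = \frac{1}{x}$. □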

Note: γ = 0.5772156649... is Euler’s constant (EulerGamma in Mathematica). It could be rational, but probably is not. By Theorem 18,

$$\lim_{x \to \infty} \left(\sum_{1 \le n \le x} \frac{1}{n} - \log(x)\right) = \gamma + 0 = \gamma.$$

Since log(x) → ∞ as x → ∞, $\sum_{n=1}^{\infty} \frac{1}{n} = \infty$, quoted earlier.

Theorem 19 (Dirichlet)

$$D(x) = \sum_{1 \le n \le x} d(n) = x \log(x) + (2\gamma - 1)x + O(\sqrt{x}) \qquad (6)$$

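Proof.

$$d(n) = \sum_{d \mid n} 1 \;\Rightarrow\; D(x) = \sum_{n \le x} d(n) = \sum_{n \le x} \sum_{d \mid n} 1$$

Now d|n ⇒ n = qd so we can express the double sum as

$$D(x) = \sum_{\substack{q,\, d \\ qd \le x}} 1$$

This is a sum over a set of lattice points in the q-d plane with (q, d) such that qd = n and n = 1, 2, 3, ..., ⌊x⌋. We sum these horizontally:

$$D(x) = \sum_{d \le x} \; \sum_{q \le x/d} 1$$

But

$$\sum_{1 \le i \le x} 1 = x + O(1) \quad\text{(Ex)}$$

so

$$D(x) = \sum_{d \le x} \left\{\frac{x}{d} + O(1)\right\} = x \sum_{d \le x} \frac{1}{d} + O(x) = x\left(\log(x) + \gamma + O\!\left(\frac{1}{x}\right)\right) + O(x) = x \log(x) + O(x) \qquad (7)$$

This is weaker than (6). To prove (6) we use the symmetry of the set of points about the line q = d:

$$D(x) = 2 \sum_{d \le \sqrt{x}} \left\{\left\lfloor \frac{x}{d} \right\rfloor - d\right\} + \lfloor \sqrt{x} \rfloor = 2\,\#(\text{below line } q = d) + \#(\text{on } q = d) \qquad (8)$$

But ∀y ∈ R, ⌊y⌋ = y + O(1) so (8) ⇒

$$D(x) = 2 \sum_{d \le \sqrt{x}} \left\{\frac{x}{d} - d + O(1)\right\} + O(\sqrt{x}) = 2x \sum_{d \le \sqrt{x}} \frac{1}{d} - 2 \sum_{d \le \sqrt{x}} d + O(\sqrt{x})$$

$$= 2x\left(\log \sqrt{x} + \gamma + O\!\left(\frac{1}{\sqrt{x}}\right)\right) - 2\left(\frac{x}{2} + O(\sqrt{x})\right) + O(\sqrt{x}) = x \log(x) + (2\gamma - 1)x + O(\sqrt{x})$$

where we have used Lemma 1 below for the middle sum. □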
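Both ways of counting the lattice points in the proof translate directly into code. The sketch below evaluates D(x) for integer x by horizontal summation and by the symmetric formula (8), then compares with the main terms of (6). The function names are ours and the value of γ is entered as a floating-point constant.

```python
from math import isqrt, log, sqrt

EULER_GAMMA = 0.5772156649015329   # Euler's constant, to double precision


def D_rows(x: int) -> int:
    """D(x) by horizontal summation: sum over d <= x of floor(x / d)."""
    return sum(x // d for d in range(1, x + 1))


def D_symmetric(x: int) -> int:
    """D(x) by equation (8): twice the points below q = d plus those on it."""
    s = isqrt(x)
    return 2 * sum(x // d - d for d in range(1, s + 1)) + s


def main_terms(x: int) -> float:
    """x log(x) + (2 gamma - 1) x, the main terms in equation (6)."""
    return x * log(x) + (2 * EULER_GAMMA - 1) * x


for x in (10, 100, 1000, 10000):
    error = D_symmetric(x) - main_terms(x)
    print(x, D_symmetric(x), round(error, 2), round(sqrt(x), 2))
```

The symmetric form needs only about √x terms, which is the computational counterpart of the improved error term.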

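Lemma If α > 0,

$$\sum_{n \le x} n^\alpha = \frac{x^{\alpha+1}}{\alpha + 1} + O(x^\alpha).$$

Proof. In Theorem 17 (Euler Summation), let $f(t) = t^\alpha$, $f'(t) = \alpha t^{\alpha-1}$ and y = 1 ⇒

$$\sum_{n \le x} n^\alpha = 1 + \sum_{1 < n \le x} n^\alpha = 1 + \int_1^x t^\alpha\,dt + \alpha \int_1^x (t - \lfloor t \rfloor)\, t^{\alpha-1}\,dt - (x - \lfloor x \rfloor)\, x^\alpha = \frac{x^{\alpha+1}}{\alpha + 1} + O(x^\alpha),$$

since $0 \le \alpha \int_1^x (t - \lfloor t \rfloor)\, t^{\alpha-1}\,dt \le \alpha \int_1^x t^{\alpha-1}\,dt \le x^\alpha$. □

Note: Improvements in the error term O(√x) in Dirichlet’s theorem for d(n) have come at great cost:

1903 Voronoi $O(x^{1/3} \log(x))$
1922 van der Corput $O(x^{33/100})$
1969 Kolesnik $O(x^{\varepsilon + 12/37})$ ∀ε > 0
1915 Hardy and Landau: $O(x^\theta)$ ⇒ θ ≥ 1/4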

The Distribution of Primes

Let

$$\mathrm{Li}(x) = \int_2^x \frac{dt}{\log(t)}$$

for x ≥ 2 be the “logarithmic integral” and π(x) = #{p ∈ P : 2 ≤ p ≤ x}. Consider the following data:

[Table of x, π(x), x/log(x) and Li(x): values not recovered.]

So $\pi(x) \approx \frac{x}{\log(x)}$ but π(x) ≈ Li(x) is better, and $\frac{\pi(x)}{x} \to 0$ as x → ∞ apparently. Indeed

$$\pi(x) \sim \frac{x}{\log(x)} \sim \mathrm{Li}(x).$$

This distribution is the subject of the famous Prime Number Theorem, which took all of the 19th century to prove.

Because $\log(10^n) = n \log(10)$, the proportion of primes falls off like 1/n: measured in units of 1/log(10),

in [2, 100]: about 1/2 of the numbers are prime
in [2, 1000]: 1/3
in [2, 1,000,000]: 1/6

etc., so they progressively thin out with a local density $\frac{1}{\log(t)}$, since if a < b

$$\#\{p \in P : a \le p \le b\} = \pi(b) - \pi(a) \sim \int_2^b \frac{dt}{\log t} - \int_2^a \frac{dt}{\log t} = \int_a^b \frac{dt}{\log(t)}.$$

Theorem 20 For n ≥ 2,

$$\frac{1}{8} \le \frac{\pi(n)}{n / \log n} \le 12.$$

Note: This is as close as we will get to proving the Prime Number Theorem.

Lemma (Chebyshev) If $H(n) = \sum_{j=2}^{n} \frac{1}{j}$ then

$$\frac{1}{8} \le \frac{H(n)\,\pi(n)}{n} \le 6.$$

Proof of Theorem 20 assuming Chebyshev’s Lemma:

For n ≥ 2,

$$\log\frac{n}{2} = \int_2^n \frac{dt}{t} < \frac{1}{2} + \frac{1}{3} + \cdots + \frac{1}{n} < \int_1^n \frac{dt}{t} = \log(n).$$

For n ≥ 4,

$$\frac{1}{2} \log(n) \le \log\frac{n}{2}.$$

Hence $\frac{1}{2} \log(n) \le H(n) \le \log(n)$ so, by the RHS of Chebyshev’s Lemma,

$$\frac{\pi(n) \log(n)}{n} \le \frac{2\,\pi(n) H(n)}{n} \le 12$$

and by the LHS of Chebyshev’s Lemma

$$\frac{1}{8} \le \frac{\pi(n) H(n)}{n} \le \frac{\pi(n) \log(n)}{n}$$

using Lemma 2 when n ≥ 4.

If n = 2, π(2) = 1 and

$$\frac{1}{8} \le \frac{1}{2 / \log(2)} \approx 0.34 \le 6.$$

If n = 3, π(3) = 2 and

$$\frac{1}{8} \le \frac{2}{3 / \log(3)} \approx 0.73 \le 6.$$

This completes the proof of the theorem.

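Proof of Lemma 2:

Claim: ∀k ≥ 0,

$$\pi(2^{k+1}) \le 2^k \qquad (9)$$

Proof: If x ≥ 9, $\pi(x) \le \frac{x}{2}$ since all even numbers greater than 2 are composite, as are 1 and 9. Since $\pi(2^1) = 1 = 2^0$, $\pi(4) = 2 = 2^1$ and $\pi(8) = 4 = 2^2$, (9) is true ∀k ≥ 0.

Claim:

$$\frac{1}{2} \ell \le H(2^\ell) \le \ell \qquad (10)$$

where $H(n) = \frac{1}{2} + \cdots + \frac{1}{n}$.

$$H(2^\ell) = \frac{1}{2} + \left(\frac{1}{3} + \frac{1}{4}\right) + \left(\frac{1}{5} + \frac{1}{6} + \frac{1}{7} + \frac{1}{8}\right) + \cdots + \left(\frac{1}{2^{\ell-1} + 1} + \cdots + \frac{1}{2^\ell}\right)$$
$$\ge \frac{1}{2} + \left(\frac{1}{4} + \frac{1}{4}\right) + \left(\frac{1}{8} + \frac{1}{8} + \frac{1}{8} + \frac{1}{8}\right) + \cdots + \left(\frac{1}{2^\ell} + \cdots + \frac{1}{2^\ell}\right) = \frac{\ell}{2}$$

and

$$H(2^\ell) = \left(\frac{1}{2} + \frac{1}{3}\right) + \left(\frac{1}{4} + \frac{1}{5} + \frac{1}{6} + \frac{1}{7}\right) + \cdots + \frac{1}{2^\ell}$$
$$\le \left(\frac{1}{2} + \frac{1}{2}\right) + \left(\frac{1}{4} + \frac{1}{4} + \frac{1}{4} + \frac{1}{4}\right) + \cdots + \left(\frac{1}{2^{\ell-1}} + \cdots + \frac{1}{2^{\ell-1}}\right) + \frac{1}{2^\ell} \le \ell$$

This proves the claim.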

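If p ∈ P has n < p < 2n ⇒ p | (2n)! and p ∤ n! ⇒

$$p \,\Big|\, \binom{2n}{n} = \frac{(2n)!}{n!\, n!} \;\Rightarrow\; \prod_{n < p < 2n} p \,\Big|\, \binom{2n}{n} \qquad (11)$$

By Lagrange, the power of p in $\binom{2n}{n}$ is

$$\sum_{m=1}^{r} \left(\left\lfloor \frac{2n}{p^m} \right\rfloor - 2 \left\lfloor \frac{n}{p^m} \right\rfloor\right) \qquad (12)$$

where $p^r \le 2n < p^{r+1}$ and the sum is ≤ r since ∀x, ⌊2x⌋ − 2⌊x⌋ ≤ 1 (see below). Hence

$$\binom{2n}{n} \,\Big|\, \prod_{p^r \le 2n} p^r$$

By (11) and (12)

$$n^{\pi(2n) - \pi(n)} < \prod_{n < p < 2n} p \le \binom{2n}{n} \le \prod_{p^r \le 2n} p^r \le (2n)^{\pi(2n)} \qquad (13)$$

and

$$2^n \le \binom{2n}{n} \le 4^n \qquad (14)$$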

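Using the LHS of (13) with (14) we get $n^{\pi(2n) - \pi(n)} < 2^{2n}$ and the RHS gives $2^n < (2n)^{\pi(2n)}$, n ≥ 1. Now let $n = 2^k$, k = 0, 1, 2, ... so these two inequalities translate to

$$2^{k(\pi(2^{k+1}) - \pi(2^k))} \le 2^{2^{k+1}}, \qquad 2^{2^k} \le 2^{(k+1)\pi(2^{k+1})}, \qquad k \ge 0$$

or

$$k(\pi(2^{k+1}) - \pi(2^k)) \le 2^{k+1}, \qquad 2^k \le (k + 1)\pi(2^{k+1}). \qquad (15)$$

Hence

$$(k + 1)\pi(2^{k+1}) - k\pi(2^k) = k(\pi(2^{k+1}) - \pi(2^k)) + \pi(2^{k+1}) \le 2^{k+1} + \pi(2^{k+1}) \le 2^{k+1} + 2^k = 3 \cdot 2^k \quad\text{by (9)}$$

Apply this for 0, 1, 2, ..., k and add ($\pi(2^0) = \pi(1) = 0$):

$$\Rightarrow\; (k + 1)\pi(2^{k+1}) \le 3(2^0 + 2^1 + \cdots + 2^k) < 3 \cdot 2^{k+1}. \qquad (16)$$

By (15) and (16), ∀k ≥ 0

$$\frac{1}{2} \cdot \frac{2^{k+1}}{k + 1} \le \pi(2^{k+1}) < 3 \cdot \frac{2^{k+1}}{k + 1}.$$

If n ∈ N, n > 1, choose k so $2^{k+1} \le n < 2^{k+2}$. By (10) (π is increasing)

$$\pi(n) \le \pi(2^{k+2}) < 3 \cdot \frac{2^{k+2}}{k + 2} \le \frac{6 \cdot 2^{k+1}}{H(2^{k+2})} \le \frac{6n}{H(n)}$$

(H is increasing) and

$$\pi(n) \ge \pi(2^{k+1}) \ge \frac{1}{2} \cdot \frac{2^{k+1}}{k + 1} = \frac{1}{8} \cdot \frac{2^{k+2}}{\frac{1}{2}(k + 1)} \ge \frac{1}{8} \cdot \frac{2^{k+2}}{H(2^{k+1})} \ge \frac{1}{8} \cdot \frac{n}{H(n)}$$

$$\Rightarrow\; \frac{1}{8} \le \frac{\pi(n)}{n / H(n)} \le 6$$

as claimed. □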
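The constants 1/8, 6 and 12 are far from sharp, as a quick computation shows. The sketch below checks the inequalities of Chebyshev's Lemma and of Theorem 20 for every n up to a chosen limit; it is a numerical illustration only, and the function name and the limit are our own.

```python
from math import log


def chebyshev_bounds_hold(limit: int) -> bool:
    """Check 1/8 <= pi(n)H(n)/n <= 6 and 1/8 <= pi(n)log(n)/n <= 12
    for all 2 <= n <= limit, where H(n) = 1/2 + ... + 1/n."""
    # Sieve of Eratosthenes up to `limit`.
    is_prime = [True] * (limit + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(limit ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, limit + 1, p):
                is_prime[multiple] = False

    pi_n, h_n = 0, 0.0
    for n in range(2, limit + 1):
        pi_n += is_prime[n]          # running value of pi(n)
        h_n += 1.0 / n               # running value of H(n)
        ratio_h = pi_n * h_n / n
        ratio_log = pi_n * log(n) / n
        if not (1 / 8 <= ratio_h <= 6 and 1 / 8 <= ratio_log <= 12):
            return False
    return True


print(chebyshev_bounds_hold(10_000))
```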

Ex ∀x ∈ R, 0 ≤ ⌊2x⌋ − 2⌊x⌋ ≤ 1
Proof. ⌊x⌋ ≤ x ⇒ 2⌊x⌋ ≤ 2x and 2⌊x⌋ ∈ Z ∴ 2⌊x⌋ ≤ ⌊2x⌋

so 0 ≤ ⌊2x⌋ − 2⌊x⌋.

If x ∈ Z, then ⌊2x⌋ − 2⌊x⌋ = 2x − 2 · x = 0 ≤ 1.

If x ∉ Z, ∃n ∈ Z so n < x < n + 1 and x = n + 1/2 + ε where |ε| < 1/2. Then 2x = 2n + 1 + 2ε, so

⌊2x⌋ = ⌊2n + 1 + 2ε⌋ = 2n + 1 + ⌊2ε⌋ ≤ 2n + 1 (since 2ε < 1) and ⌊x⌋ = n.

Hence ⌊2x⌋ − 2⌊x⌋ ≤ (2n + 1) − 2n = 1. □

Note we have used several times the result

⌊y + n⌋ = ⌊y⌋ + n ∀n ∈ Z. (Ex)

5 Primes in Gaps

• primes can be close together: {11, 13}, {29, 31}, {101, 103},...

• there can be long stretches of N with no primes:

a1 = n! + 2
a2 = n! + 3
⋮
a(n−1) = n! + n

are n − 1 composite and consecutive numbers, so none are prime, and n can be as large as you like.

• we will prove the celebrated Bertrand’s Hypothesis: ∀n ∈ N with n ≥ 2, ∃p ∈ P with n ≤ p < 2n.

• ∀n ∈ N does there exist a p ∈ P with n² < p < (n + 1)²?

Aron, Potter, Young

$$\lim_{n \to \infty}\, [\text{size of gap } n] = \infty.$$

If n ∈ N

a1 = (n + 1)! + 2

a2 = (n + 1)! + 3

a3 = (n + 1)! + 4

⋮

an = (n + 1)! + (n + 1)

Then {a1, ..., an} are consecutive and i + 1 | ai ⇒ composite.

But [1993] best gap length = 804 at $p \approx 10^{15}$.

n = 804 ⇒ $n! \approx 3.2 \times 10^{1988}$.

Proof of Bertrand’s Postulate

Proof. Claim:

$$x \ge 2 \;\Rightarrow\; \prod_{p \le x} p \le 4^{x-1} \qquad (17)$$

If q is the largest prime less than or equal to x

$$\prod_{p \le x} p = \prod_{p \le q} p \quad\text{and}\quad 4^{q-1} \le 4^{x-1}$$

so we can assume x = q is prime. If q = 2, $2 \le 4^{2-1}$, so let q = 2m + 1 be odd. Then

$$\prod_{p \le 2m+1} p = \left(\prod_{p \le m+1} p\right) \cdot \left(\prod_{m+1 < p \le 2m+1} p\right) = A \cdot B.$$

By induction $A \le 4^m$. Also

$$\binom{2m+1}{m} = \frac{(2m + 1)!}{m!\,(m + 1)!}$$

so all primes in B divide the numerator and are not cancelled, so

$$B \le \binom{2m+1}{m} = \binom{2m+1}{m+1} \le \frac{1}{2}(1 + 1)^{2m+1} = 2^{2m}.$$

Hence $A \cdot B \le 4^m\, 2^{2m} = 4^{2m+1-1} = 4^{x-1}$, which proves the claim.

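Legendre’s Theorem Implications

n! contains the prime factor p exactly $\sum_{j \ge 1} \left\lfloor \frac{n}{p^j} \right\rfloor$ times.

Ex $24! = 2^{22} \cdot 3^{10} \cdot 5^4 \cdot 7^3 \cdot 11^2 \cdot 13 \cdot 17 \cdot 19 \cdot 23$

p = 23: ⌊24/23⌋ = 1, ⌊24/23²⌋ = 0, ···
p = 7: ⌊24/7⌋ = 3, ⌊24/7²⌋ = 0, ···

Claim:

$$p^r \,\Big\|\, \binom{2n}{n} \;\Rightarrow\; p^r \le 2n.$$

$\binom{2n}{n}$ contains p exactly $\sum_{j \ge 1} \left(\left\lfloor \frac{2n}{p^j} \right\rfloor - 2 \left\lfloor \frac{n}{p^j} \right\rfloor\right)$ times. But

$$0 \le \left\lfloor \frac{2n}{p^j} \right\rfloor - 2 \left\lfloor \frac{n}{p^j} \right\rfloor < \frac{2n}{p^j} - 2\left(\frac{n}{p^j} - 1\right) = 2$$

⇒ each summand is 0 or 1 and is 0 for $p^j > 2n$

$$\Rightarrow\; \sum_{j \ge 1} \left(\left\lfloor \frac{2n}{p^j} \right\rfloor - 2 \left\lfloor \frac{n}{p^j} \right\rfloor\right) \le \max\{j : p^j \le 2n\}$$

⇒ if p² > 2n, p occurs at most once in $\binom{2n}{n}$.

If $\frac{2}{3} n < p \le n$ ⇒ p does not appear in $\binom{2n}{n}$:

$\frac{2}{3} n < p$ ⇒ 2n < 3p ⇒ p, 2p are the only multiples of p in the numerator of $\frac{(2n)!}{n!\, n!}$. p ≤ n ⇒ there are two in the denominator. So they cancel.

Ex n = 24

$$\binom{48}{24} = 2^2 \cdot 3^2 \cdot 5^2 \cdot 13 \cdot 29 \cdot 31 \cdot 37 \cdot 41 \cdot 43 \cdot 47$$

Here $\sqrt{2 \times 24} \approx 6.9$, and no prime with 16 < p ≤ 24 appears.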

Grand Finale

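Assume that for some n ∈ N there is no p in n < p < 2n (???). Now

$$4^n = 2^{2n} = \sum_{j=0}^{2n} \binom{2n}{j} \;\Rightarrow\; \binom{2n}{n} \ge \frac{4^n}{2n}$$

Hence

$$\frac{4^n}{2n} \le \binom{2n}{n} \le \underbrace{\left(\prod_{p \le \sqrt{2n}} 2n\right)}_{A} \cdot \underbrace{\left(\prod_{\sqrt{2n} < p \le \frac{2}{3}n} p\right)}_{B} \cdot \underbrace{\left(\prod_{n < p \le 2n} p\right)}_{C}$$

$A \le (2n)^{\sqrt{2n}}$, C = 1 by (???), $B \le 4^{\frac{2n}{3}}$ by (17). Thus

$$4^n \le (2n)^{1 + \sqrt{2n}}\, 4^{\frac{2n}{3}} \;\Rightarrow\; \frac{n}{3} \log(4) \le (1 + \sqrt{2n}) \log(2n) \;\Rightarrow\; n < 468.$$

But having a prime with n < p < 2n for every n is equivalent to $p_{n+1} < 2p_n$ for consecutive primes.

Consider the primes $q_j$: {2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631}. Here $q_{j+1} < 2q_j$, so Bertrand’s Postulate is true for n < 468 (!!!), hence it is true ∀n ≥ 2. □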
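The finite part of the argument is easy to verify by machine. The sketch below checks that the listed chain consists of primes, each less than twice its predecessor, and also tests the postulate directly for 2 ≤ n < 468. The function names are ours and the primality test is plain trial division.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test."""
    if n < 2:
        return False
    d = 2
    while d * d <= n:
        if n % d == 0:
            return False
        d += 1
    return True


CHAIN = [2, 3, 5, 7, 13, 23, 43, 83, 163, 317, 631]


def chain_is_valid(chain: list[int]) -> bool:
    """Every entry is prime and each is less than twice the previous one."""
    primes_only = all(is_prime(q) for q in chain)
    doubling = all(nxt < 2 * cur for cur, nxt in zip(chain, chain[1:]))
    return primes_only and doubling


def bertrand_holds(n: int) -> bool:
    """True if some prime p satisfies n < p < 2n."""
    return any(is_prime(p) for p in range(n + 1, 2 * n))


print(chain_is_valid(CHAIN), all(bertrand_holds(n) for n in range(2, 468)))
```

For a given n with 2 ≤ n < 468, the first chain element exceeding n is less than twice the previous element, which is at most n, so it lies strictly between n and 2n.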

Prime Number Theorem Implications

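$$\lim_{x \to \infty} \frac{\pi(x) \log(x)}{x} = 1$$

or

$$\pi(x) = \frac{x}{\log(x)} + o\!\left(\frac{x}{\log(x)}\right)$$

Number of primes in (x, x(1 + ε)], ε > 0, is

$$\pi(x + \varepsilon x) - \pi(x) = \frac{\varepsilon x}{\log(x)} + o\!\left(\frac{x}{\log(x)}\right) > 0 \quad\text{for } x \ge x_\varepsilon$$

⇒ ∃p with x < p ≤ (1 + ε)x. Let ε = 1 ⇒ ∀n ≥ N2 ∃p with n < p < 2n.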

Bertrand’s Postulate: Chebyshev [1850], Ramanujan, Erdős at age 19 years

Progress beyond Bertrand

∃θ < 1 with

$$\pi(x + x^\theta) - \pi(x) \sim \frac{x^\theta}{\log(x)}$$

1930 Hoheisel θ = 1 − 1/33,000 + ε (∀ε > 0)
1937 Ingham θ = 5/8 + ε
1961 Montgomery θ = 3/5 + ε
1972 Huxley θ = 7/12 + ε
1979 Iwaniec, Jutila θ = 13/23 + ε
1984 Iwaniec, Pintz θ = 1/2 + 1/21 + ε = 0.547... + ε
1994 Lou and Yao θ = 7/13 + ε = 0.538... + ε
1998 Baker and Harman θ = 0.535... + ε

Exercise for John in his Retirement

[Hardy and Wright, 1979]: There is a prime p with n² < p < (n + 1)²

Note: θ = 1/2 ⇒ ∃p with x < p < x + √x. x = n² ⇒ n² < p < n² + n < (n + 1)².

Degree of difficulty for the student:

Other Results on the Distribution of Primes

Theorem 21 (Bertrand’s Postulate) ∀n ∈ N with n ≥ 2, ∃p ∈ P with n ≤ p < 2n.

Theorem 22 There are infinitely many primes of the form 4n − 1. Proof. Assume there are only a finite number and let p be the largest. Let

$$N = 2^2 \cdot \underbrace{3 \cdot 5 \cdots p}_{n} - 1$$

The product n = 3 · 5 ··· p contains all the odd primes less than or equal to p as factors. Since N > p and N = 4n − 1, it cannot be prime. No prime less than or equal to p divides N (since it would divide 1). Thus all the prime factors of N must exceed p.

If $x = 4m + 1$ and $y = 4\ell + 1$ then

$$xy = 16m\ell + 4m + 4\ell + 1 = 4(4m\ell + m + \ell) + 1 = 4k + 1$$

If two factors of N are of the form 4n+1, so is their product. But N has the form 4n−1, so at least one prime factor q of N must be of the form q = 4m − 1. Since q > p, this contradicts the choice of p as the largest prime of this form, which proves the theorem. □

Can also show there are an infinite number of primes of each of the forms 4n+1, 5n− 1, 8n − 1, 8n − 3 and 8n + 3.

Note All numbers greater than 2 of the form 4n or 4n + 2 are composite. Every odd prime p ∈ P is of the form 4n + 1 or 4n + 3.

Theorem (Dirichlet) If k > 0 and (h, k) = 1 then ∀x > 1,
$$\sum_{\substack{p \le x \\ p \equiv h \ (\mathrm{mod}\ k)}} \frac{\log(p)}{p} = \frac{1}{\phi(k)}\log(x) + O(1)$$

Corollary Since x → ∞ ⇒ log(x) → ∞, there are an infinite number of primes in every arithmetic progression nk + h, n = 0, 1, 2, 3,... since p = nk + h for some n ⇔ p ≡ h (mod k).

Theorem (Dirichlet) Let
$$\pi_h(x) = \sum_{\substack{p \le x \\ p \equiv h \ (\mathrm{mod}\ k)}} 1.$$

Then $\pi_h(x)$ counts the number of primes in nk + h, n = 0, 1, 2, 3, ... and
$$\pi_h(x) \sim \frac{\pi(x)}{\phi(k)} \sim \frac{1}{\phi(k)} \frac{x}{\log(x)} \quad \text{as } x \to \infty$$

Corollary For each h (mod k), $\pi_h(x)$ has the same asymptotic value i.e. the number of primes in each class $[h]_k$ is asymptotically the same.

Note All attempts to extend this result to more complex subsets of N than arithmetic progressions have failed.

1. Are there an infinite number of primes of the form $p = n^2 + 1$? Ex There are an infinite number of composites $xy = n^2 + 1$.

2. Are there an infinite number of primes p such that q = 2p+1 is also prime? (Sophie Germain primes.)

3. Are there an infinite number of primes p such that q = p + 2 is also prime? (Twin primes conjecture.) Ex If n > 3 one of {n, n + 2, n + 4} is divisible by 3, and is hence composite. (No triple primes conjecture.) Proof.

n ≡ 0 (mod 3) ⇒ 3 | n
n ≡ 1 (mod 3) ⇒ n + 2 ≡ 3 ≡ 0 (mod 3) ⇒ 3 | n + 2
n ≡ 2 (mod 3) ⇒ n + 4 ≡ 6 ≡ 0 (mod 3) ⇒ 3 | n + 4 □

4. Find a quadratic polynomial $f(n) = an^2 + bn + c$ with an infinite number of prime values.


6 Sums of Squares

Sums of Two Squares

Which n can be expressed as $n = x^2 + y^2$?
$1 = 1^2 + 0^2$, $2 = 1^2 + 1^2$, $4 = 2^2 + 0^2$, $5 = 2^2 + 1^2$, $8 = 2^2 + 2^2$.
But 3, 6, 7 cannot be written in this form.

Proposition If n ≡ 3 (mod 4) then $n = x^2 + y^2$ is impossible.
Proof. $x^2 \equiv 0$ or $1$ (mod 4) only ⇒ $x^2 + y^2 \equiv 0, 1$ or $2$ (mod 4) only ⇒ $x^2 + y^2 \equiv 3$ (mod 4) is impossible. □

Ex $3 \ne x^2 + y^2$, $7 \ne x^2 + y^2$, $15 \ne x^2 + y^2$

Proposition If n is representable (as the sum of two squares) so is $k^2n$ ∀k ∈ N.
Proof. $x^2 + y^2 = n \Rightarrow k^2x^2 + k^2y^2 = k^2n \Rightarrow k^2n = (kx)^2 + (ky)^2$ so $k^2n$ is representable. □

Theorem n is not representable ⇔ $\exists\, p^\alpha \,\|\, n$ where α is odd and p ≡ 3 (mod 4).
Proof. (⇐) Let $n = x^2 + y^2$ and d = (x, y) (the GCD), $x_1 = \frac{x}{d}$, $y_1 = \frac{y}{d}$, $n_1 = \frac{n}{d^2}$; then $\left(\frac{x}{d}\right)^2 + \left(\frac{y}{d}\right)^2 = \frac{n}{d^2} \Rightarrow d^2 \mid n$ and $x_1^2 + y_1^2 = n_1$ and $(x_1, y_1) = 1$.

If $p^\beta \,\|\, d \Rightarrow p^{\alpha - 2\beta} \mid n_1$ and $\alpha - 2\beta \ge 1$ since α is odd. Hence $p \mid n_1$. But $(x_1, y_1) = 1$ so $p \nmid x_1$ and there is a u ∈ Z so $ux_1 \equiv y_1$ (mod p).

Hence $0 \equiv n_1 \equiv x_1^2 + y_1^2 \equiv x_1^2 + (ux_1)^2 \equiv x_1^2(1 + u^2)$ (mod p). But $(p, x_1) = 1$ also so $1 + u^2 \equiv 0$ (mod p). But p ≡ 3 (mod 4) ⇒ $(-1 \mid p) = (-1)^{\frac{p-1}{2}} = (-1)^{\frac{2+4\ell}{2}} = -1$ so $u^2 \equiv -1$ (mod p) is impossible. This contradiction shows $n \ne x^2 + y^2$. □
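The criterion can be compared against a brute-force search for small n. This is a minimal sketch, assuming the function names `is_sum_of_two_squares` and `has_bad_prime` (both hypothetical, chosen for this illustration) and using trial division to factor n.

```python
from math import isqrt


def is_sum_of_two_squares(n):
    """Brute force: is there x >= 0 with n - x^2 a perfect square?"""
    for x in range(isqrt(n) + 1):
        r = n - x * x
        y = isqrt(r)
        if y * y == r:
            return True
    return False


def has_bad_prime(n):
    """True if some prime p = 3 (mod 4) divides n to an odd power."""
    p = 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            if p % 4 == 3 and e % 2 == 1:
                return True
        p += 1
    # What remains is 1 or a single prime appearing to the first power.
    return n > 1 and n % 4 == 3
```

For every n in a test range, `is_sum_of_two_squares(n)` should be the negation of `has_bad_prime(n)`, in line with the theorem.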

Proposition a, b, c, d ∈ Z ⇒ $(a^2 + b^2)(c^2 + d^2) = (ac + bd)^2 + (ad - bc)^2$.
Proof. LHS = $a^2c^2 + a^2d^2 + b^2c^2 + b^2d^2$

RHS = $a^2c^2 + b^2d^2 + 2acbd + a^2d^2 + b^2c^2 - 2adbc$ = LHS.

OR

$z = a - ib$, $w = c + id$, $|z|^2 \cdot |w|^2 = |zw|^2$. □

Note: If $n_1 = x_1^2 + y_1^2$ and $n_2 = x_2^2 + y_2^2$ then $n_1n_2 = z_1^2 + z_2^2$ where $z_1 = x_1x_2 + y_1y_2$ and $z_2 = x_1y_2 - x_2y_1$. Hence the product of any two representable numbers is representable.

Ex $5 = 2^2 + 1^2$, $13 = 3^2 + 2^2$ ⇒ $65 = 5 \cdot 13 = (2 \cdot 3 + 1 \cdot 2)^2 + (2 \cdot 2 - 3 \cdot 1)^2 = 8^2 + 1^2$.

Theorem Every prime p ≡ 1 (mod 4) can be written as the sum of two squares.
Proof. Outline: Show $x^2 + y^2 = kp$. Then if 1 < k there is a $k_1 < k$.

p ≡ 1 (mod 4) ⇒ $(-1 \mid p) = (-1)^{\frac{p-1}{2}} = 1$ so $u^2 \equiv -1$ (mod p) has a solution. Hence $u^2 + 1 = kp$ for some k ∈ N. Let x = u, y = 1. So $x^2 + y^2 = kp$.

Define r, s by
$$r \equiv x \pmod k, \quad -\tfrac{k}{2} < r \le \tfrac{k}{2}$$
$$s \equiv y \pmod k, \quad -\tfrac{k}{2} < s \le \tfrac{k}{2}$$
Then $r^2 + s^2 \equiv x^2 + y^2 \equiv 0$ (mod k) ⇒ $r^2 + s^2 = k_1k$ for some $k_1 \ge 1$. ⇒ $(rx + sy)^2 + (ry - sx)^2 = (r^2 + s^2)(x^2 + y^2) = (k_1k)(kp) = k_1k^2p$ from the Proposition above.

But $rx + sy \equiv r^2 + s^2 \equiv 0$ (mod k) and $ry - sx \equiv rs - sr \equiv 0$ (mod k). So $k^2 \mid (rx + sy)^2$ and $k^2 \mid (ry - sx)^2$ and we can write

$$\left(\frac{rx + sy}{k}\right)^2 + \left(\frac{ry - sx}{k}\right)^2 = k_1p \;\Rightarrow\; x_1^2 + y_1^2 = k_1p.$$

$r^2 + s^2 \le \frac{k^2}{4} + \frac{k^2}{4} = \frac{k^2}{2}$ but $r^2 + s^2 = k_1k \Rightarrow k_1k \le \frac{k^2}{2} \Rightarrow k_1 \le \frac{k}{2} \Rightarrow k_1 < k$ and we are done. □
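The descent in this proof is effectively an algorithm. The sketch below follows the steps above under stated assumptions: the names `centered_residue` and `two_squares` are hypothetical, u is found by brute-force search (the proof only asserts that it exists), and the input is assumed to be a prime p ≡ 1 (mod 4).

```python
def centered_residue(a, k):
    """Representative r of a mod k with -k/2 < r <= k/2."""
    r = a % k
    if 2 * r > k:
        r -= k
    return r


def two_squares(p):
    """Write a prime p = 1 (mod 4) as x^2 + y^2 by the descent in the proof."""
    # Find u with u^2 = -1 (mod p); brute force is enough for an illustration.
    u = next(u for u in range(1, p) if (u * u + 1) % p == 0)
    x, y = u, 1
    k = (x * x + y * y) // p          # x^2 + y^2 = k p
    while k > 1:
        r = centered_residue(x, k)
        s = centered_residue(y, k)
        # Both numerators are divisible by k, so the divisions are exact.
        x, y = (r * x + s * y) // k, (r * y - s * x) // k
        k = (x * x + y * y) // p      # new multiplier k_1 < k
    return abs(x), abs(y)
```

For example, with p = 29 the search gives u = 12 and $12^2 + 1^2 = 5 \cdot 29$; one descent step then yields $5^2 + 2^2 = 29$.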

Notes:

1. $n = x^2 + y^2 + z^2 \Leftrightarrow n \ne 4^e(8k + 7)$ and only 15 numbers less than 100 cannot be written as the sum of three squares

{7, 15, 23, 28, 31, 39, 47, 55, 60, 63, 71, 79, 87, 92, 95}
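The list of exceptions can be reproduced by brute force and compared with the form $4^e(8k + 7)$. The names `is_sum_of_three_squares`, `excluded_form` and `EXCEPTIONS` below are hypothetical, introduced only for this check.

```python
from math import isqrt


def is_sum_of_three_squares(n):
    """Brute-force search for n = x^2 + y^2 + z^2 with x <= y."""
    for x in range(isqrt(n) + 1):
        for y in range(x, isqrt(n - x * x) + 1):
            r = n - x * x - y * y
            z = isqrt(r)
            if z * z == r:
                return True
    return False


def excluded_form(n):
    """True if n = 4^e (8k + 7) for some e, k >= 0."""
    while n % 4 == 0:
        n //= 4
    return n % 8 == 7


# Numbers below 100 that are not a sum of three squares.
EXCEPTIONS = [n for n in range(1, 100) if not is_sum_of_three_squares(n)]
```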

2. Every integer can be written as the sum of four squares.

3. 3 6= x3+3 but every integer can be written as the sum of 9 cubes (of positive integers).

4. Let g(k) be the smallest value of s ∈ N such that every integer can be written as the sum of s kth powers. g(2) = 4, g(3) = 9

• 1770 Waring guessed {g(2), g(3), g(4) = 19}
• 1909 Proved g(k) exists for all k.
• Much later
$$g(k) = 2^k + \left\lfloor \left(\tfrac{3}{2}\right)^k \right\rfloor - 2$$

for 6 ≤ k ≤ 200,000 and thought to be true for all k.

5. Goldbach’s Conjecture (1742): Every even integer n > 2 can be written as the sum of two primes.
4 = 2 + 2
6 = 3 + 3
8 = 5 + 3
10 = 5 + 5
12 = 7 + 5
...
100 = 97 + 3

• Known $2n = p_1 + p_2 + \cdots + p_k$, $k \le 2 \times 10^{10}$.
• Vinogradov: $n > n_0$, $2n = p_1 + p_2 + p_3 + p_4$.
• Chen (1966): $n > n_0$, $2n = p_1 + p_2p_3$.

Sums of Four Squares

Bachet (1621): Stated ∀n ∈ N, $n = x^2 + y^2 + z^2 + w^2$ where x, y, z, w ≥ 0. Verified up to n = 325.

Fermat claimed he had a proof.

Descartes: “The theorem is true, but so difficult I dare not undertake it.”

Euler (1743): The product of two sums of four squares is again a sum of four squares.

(1751): $1 + x^2 + y^2 \equiv 0$ (mod p) has a solution ∀p ∈ P.

Lagrange (1770): Proof.

Euler (1773): Simpler proof—after 43 years!

Proposition The product of two sums of four squares is a sum of four squares.
Proof.
$$(a^2 + b^2 + c^2 + d^2)(r^2 + s^2 + t^2 + u^2) = (ar + bs + ct + du)^2 + (as - br + cu - dt)^2 + (at - bu - cr + ds)^2 + (au + bt - cs - dr)^2$$

Check by multiplying out each side. 
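Rather than expanding by hand, the identity can be spot-checked on integers. A minimal sketch, assuming the hypothetical helper name `euler_four_square`, which returns the four bracketed terms on the right-hand side in the order written above:

```python
def euler_four_square(a, b, c, d, r, s, t, u):
    """The four terms whose squares sum to (a^2+b^2+c^2+d^2)(r^2+s^2+t^2+u^2)."""
    return (
        a * r + b * s + c * t + d * u,
        a * s - b * r + c * u - d * t,
        a * t - b * u - c * r + d * s,
        a * u + b * t - c * s - d * r,
    )


def sum_of_squares(values):
    """Sum of the squares of the given integers."""
    return sum(v * v for v in values)
```

With (a, b, c, d) = (1, 2, 3, 4) and (r, s, t, u) = (5, 6, 7, 8) the terms are 70, −8, 0, −16, and $70^2 + 8^2 + 0^2 + 16^2 = 5220 = 30 \cdot 174$.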

We now need only show every prime p is the sum of four squares.

Proposition If p is an odd prime then
$$1 + x^2 + y^2 \equiv 0 \pmod p$$
has a solution with $0 \le x, y < \frac{p}{2}$.

Proof. Let
$$S_1 = \left\{ 0^2, 1^2, \ldots, \left(\tfrac{p-1}{2}\right)^2 \right\}.$$
Then $x^2 \equiv y^2$ (mod p) ⇒ $(x + y)(x - y) \equiv 0$ (mod p) ⇒ p | x + y or p | x − y ⇒ x = y if $0 \le x, y \le \frac{p-1}{2}$, so the numbers in $S_1$ are distinct mod p. So are the numbers in
$$S_2 = \left\{ -1 - 0^2, -1 - 1^2, -1 - 2^2, \ldots, -1 - \left(\tfrac{p-1}{2}\right)^2 \right\}.$$

$S_1 \cup S_2$ contains $\left(\frac{p-1}{2} + 1\right) + \left(\frac{p-1}{2} + 1\right)$ numbers i.e. p + 1 numbers. Hence one number in $S_1$ is congruent to one number in $S_2$, or $x^2 \equiv -1 - y^2$ (mod p) with $0 \le x, y \le \frac{p-1}{2}$ ⇒ $1 + x^2 + y^2 \equiv 0$ (mod p). □
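The pigeonhole argument guarantees a collision between the two sets, and it is easy to find one by direct search. A sketch, assuming the hypothetical name `solve_one_plus_two_squares` and an odd prime input:

```python
def solve_one_plus_two_squares(p):
    """Find (x, y) with 0 <= x, y <= (p-1)/2 and 1 + x^2 + y^2 = 0 (mod p)."""
    half = (p - 1) // 2
    # S1: residues of x^2, remembering which x produced each residue.
    s1 = {(x * x) % p: x for x in range(half + 1)}
    # S2: residues of -1 - y^2; look for a collision with S1.
    for y in range(half + 1):
        target = (-1 - y * y) % p
        if target in s1:
            return s1[target], y
    return None  # not reached for an odd prime p, by the proposition
```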

Approach to Lagrange’s theorem: Express some multiple of p as the sum of four squares, then prove there is a smaller multiple. The Proposition above implies $kp = x^2 + y^2 + 1^2 + 0^2$.

Proposition If p is an odd prime, there is an odd integer m < p such that

$$mp = x^2 + y^2 + z^2 + w^2.$$

Proof. By the above $kp = x^2 + y^2 + 1^2 + 0^2$ where $0 \le x, y < \frac{p}{2}$ ⇒ $kp = x^2 + y^2 + 1 < \frac{p^2}{4} + \frac{p^2}{4} + 1 < p^2$ ⇒ k < p.

Claim: We can choose k odd. If k is even let

$$kp = x^2 + y^2 + z^2 + w^2$$

then all of x, y, z, w are odd, all are even, or two are odd and two are even. So arrange terms so x ≡ y (mod 2) and z ≡ w (mod 2).

$$\Rightarrow \frac{kp}{2} = \left(\frac{x - y}{2}\right)^2 + \left(\frac{x + y}{2}\right)^2 + \left(\frac{z - w}{2}\right)^2 + \left(\frac{z + w}{2}\right)^2$$

If $\frac{k}{2}$ is even repeat this process, until eventually we obtain an odd multiple of p expressed as the sum of four squares. □

Proposition If m, p are odd, 1 < m < p and $mp = x^2 + y^2 + z^2 + w^2$ then there is a positive integer $m_1$ with $m_1 < m$ and

$$m_1p = x_1^2 + y_1^2 + z_1^2 + w_1^2$$

Proof. Choose A, B, C, D in $-\frac{m}{2} < A, B, C, D < \frac{m}{2}$ with A ≡ x, B ≡ y, C ≡ z, D ≡ w (mod m).

$$\Rightarrow A^2 + B^2 + C^2 + D^2 \equiv x^2 + y^2 + z^2 + w^2 \equiv 0 \pmod m$$

⇒ $A^2 + B^2 + C^2 + D^2 = km$ for some k. But $A^2 + B^2 + C^2 + D^2 < \frac{m^2}{4} + \frac{m^2}{4} + \frac{m^2}{4} + \frac{m^2}{4} = m^2$ ⇒ 0 < k < m. (k ≠ 0 since k = 0 ⇒ A = B = C = D = 0 so m | x, y, z, w ⇒ $m^2 \mid mp$ ⇒ m | p, which is impossible for 1 < m < p.)

Hence
$$m^2kp = (x^2 + y^2 + z^2 + w^2)(A^2 + B^2 + C^2 + D^2)$$

$$\Rightarrow m^2kp = (xA + yB + zC + wD)^2 + (xB - yA + zD - wC)^2 + (xC - yD - zA + wB)^2 + (xD + yC - zB - wA)^2$$

Each term in parentheses on the RHS is divisible by m:

xA + yB + zC + wD ≡ $x^2 + y^2 + z^2 + w^2$ ≡ 0 (mod m)
xB − yA + zD − wC ≡ xy − yx + zw − wz ≡ 0 (mod m)
xC − yD − zA + wB ≡ xz − yw − zx + wy ≡ 0 (mod m)
xD + yC − zB − wA ≡ xw + yz − zy − wx ≡ 0 (mod m)

So put
$$x_1 = \frac{xA + yB + zC + wD}{m}, \quad y_1 = \frac{xB - yA + zD - wC}{m}, \quad z_1 = \frac{xC - yD - zA + wB}{m}, \quad w_1 = \frac{xD + yC - zB - wA}{m}$$

⇒ $x_1^2 + y_1^2 + z_1^2 + w_1^2 = \frac{m^2kp}{m^2} = kp$ and k < m (from above) so the proposition is proved with $m_1 = k$. □

Theorem Every positive integer can be written as the sum of four integer squares.
Proof. $2 = 1^2 + 1^2 + 0^2 + 0^2$ and $p_i = x_i^2 + y_i^2 + z_i^2 + w_i^2$ for odd $p_i \in P$. So given

$$n = 2^{\alpha_0} \prod_{i=1}^{m} p_i^{\alpha_i}$$
apply the above results as many times as necessary to show that n can be expressed as the sum of four squares. □

Euler’s Conjecture (1769): ∀k > 3 a non-zero kth power is not equal to the sum of k − 1 non-zero kth powers.

(1966) Lander and Parkin k = 5

$$144^5 = 27^5 + 84^5 + 110^5 + 133^5$$

(1988) Elkies (using elliptic curves)

$$20{,}615{,}673^4 = 2{,}682{,}440^4 + 15{,}365{,}639^4 + 18{,}796{,}760^4$$
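Both counterexamples can be confirmed with exact integer arithmetic. The helper name `is_power_sum` is hypothetical; Python integers are arbitrary precision, so no overflow is involved.

```python
def is_power_sum(terms, total, k):
    """True if the k-th powers of `terms` add up to total^k."""
    return sum(t ** k for t in terms) == total ** k


# Lander and Parkin (k = 5) and Elkies (k = 4), as quoted in the notes.
LANDER_PARKIN = ([27, 84, 110, 133], 144, 5)
ELKIES = ([2682440, 15365639, 18796760], 20615673, 4)
```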

Notes:

1. Every integer can be expressed as the “algebraic” sum of three squares i.e. $n = \pm x^2 \pm y^2 \pm z^2$:

$2n + 1 = (n + 1)^2 - n^2 + 0^2$ (2n + 1 odd)
$2n = (n + 1)^2 - n^2 - 1^2$ (2n even)

2. Every integer can be expressed as the sum of five cubes: If 6 | n then $(x + 1)^3 + (x - 1)^3 - 2x^3 = 6x = n$

∀n, $6 \mid n - n^3$ so take $x = \frac{n - n^3}{6}$
$$\Rightarrow \left(\frac{n - n^3}{6} + 1\right)^3 + \left(\frac{n - n^3}{6} - 1\right)^3 - 2\left(\frac{n - n^3}{6}\right)^3 = n - n^3$$

$$\Rightarrow n = x_1^3 + x_2^3 + x_3^3 + x_4^3 + x_5^3$$
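The construction is explicit, so it translates directly into code. A minimal sketch, assuming the hypothetical name `five_cubes`; negative cubes are allowed, as in the identity above.

```python
def five_cubes(n):
    """Five integers whose cubes sum to n, from (x+1)^3 + (x-1)^3 - 2x^3 = 6x."""
    x = (n - n ** 3) // 6   # exact, since n^3 - n = (n-1) n (n+1) is divisible by 6
    return [n, x + 1, x - 1, -x, -x]
```

For n = 2 this gives x = −1 and the cubes of 2, 0, −2, 1, 1, which sum to 8 + 0 − 8 + 1 + 1 = 2.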

7 Diophantine Equations

Ex The Pythagorean equation $x^2 + y^2 = z^2$, (x, y, z) = 1 has a solution
$$x = p^2 - q^2, \quad y = 2pq, \quad z = p^2 + q^2$$

for p ∈ N, q ∈ N with p > q, (p, q) = 1, one being even and one being odd. Conversely every solution of the Pythagorean equation in coprime positive integers has this form (with y the even term). Ex p = 2, q = 1 gives (3, 4, 5); p = 3, q = 2 gives (5, 12, 13).
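The parametrisation can be used to enumerate primitive triples. The function name `primitive_triples` and the bound `max_p` are assumptions made for this sketch.

```python
from math import gcd


def primitive_triples(max_p):
    """Triples (p^2 - q^2, 2pq, p^2 + q^2) for coprime p > q of opposite parity."""
    triples = []
    for p in range(2, max_p + 1):
        for q in range(1, p):
            if gcd(p, q) == 1 and (p - q) % 2 == 1:
                triples.append((p * p - q * q, 2 * p * q, p * p + q * q))
    return triples
```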

Ex The equation $2x^2 + 3y^2 = z^2$ is insoluble in non-zero integers: Assume (x, y, z) = 1 ⇒ 3 ∤ x. But $2x^2 \equiv z^2$ (mod 3) ⇒ $(zx^{-1})^2 \equiv 2$ (mod 3). But $0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 1$ (mod 3) so this equation has no solution.

Fermat’s Last Theorem (FLT, 1994, Wiles, Ribet & Taylor) For n ≥ 3 the equation $x^n + y^n = z^n$ has no solution in (strictly) positive integers.

Theorem 23 There are no non-trivial solutions to $x^4 + y^4 = z^2$

Corollary FLT is true for n = 4: If false there is a solution $a^4 + b^4 = c^4 = (c^2)^2$ so there would be a solution x = a, y = b, $z = c^2$.
Proof. (of Theorem 23) (“method of infinite descent”) Suppose $a^4 + b^4 = c^2$ is a solution with $c^2$ as small as possible. Then

Claim: (a, b) = 1. If not ∃p ∈ P, p | a and p | b ⇒ $p^4 \mid c^2$ so $p^2 \mid c$ and
$$\left(\frac{a}{p}\right)^4 + \left(\frac{b}{p}\right)^4 = \left(\frac{c}{p^2}\right)^2$$
would be a solution with a smaller value for $c^2$.

Claim: a and b cannot both be odd. If so a = 2n + 1, b = 2m + 1 ⇒ $a^4 + b^4 \equiv 2$ (mod 4) but $0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 0$, $3^2 \equiv 1$ (mod 4) so $c^2 \equiv 2$ (mod 4) is impossible.

Claim: a and b cannot both be even. Since 2|a, 2|b ⇒ 2|(a, b) = 1.

Hence one is even and one is odd. Call the even one a. Now we have a solution to $x^2 + y^2 = z^2$, (x, y) = 1:
$$(a^2)^2 + (b^2)^2 = c^2, \quad (a^2, b^2) = 1,$$
where $a^2$ is even and $b^2$ is odd.

Hence, ∃m, n ∈ Z, (m, n) = 1 not both odd so
$$a^2 = 2mn, \quad b^2 = m^2 - n^2, \quad c = m^2 + n^2 \qquad (18)$$

Claim: n is even. If n is odd and m even, $b^2 \equiv m^2 - n^2$ (mod 4) ⇒ $b^2 \equiv -n^2 \equiv -1$ (mod 4) but this is impossible so n is even, hence m is odd by (18).

Say n = 2q.
$$(18) \Rightarrow a^2 = 4mq \Rightarrow \left(\frac{a}{2}\right)^2 = mq. \qquad (19)$$

Claim (m, q) = 1: If not, ∃p | m and p | q ⇒ p | m and p | n ⇒ (m, n) ≠ 1. By (19) ∃t, v ∈ N with $m = t^2$, $q = v^2$ and (t, v) = 1. Now
$$n^2 + (m^2 - n^2) = m^2 \qquad (20)$$

We know $n = 2q = 2v^2$, $m^2 - n^2 = b^2$ and $m = t^2$, so (20) ⇒ $(2v^2)^2 + b^2 = (t^2)^2$ and no two of $2v^2$, b and $t^2$ have a common factor. Therefore, by the Pythagorean theorem again, $2v^2 = 2AB$, $t^2 = A^2 + B^2$, A > 0, B > 0, (A, B) = 1. $v^2 = AB$ and (A, B) = 1 ⇒ $A = r^2$, $B = s^2$, r > 0, s > 0 and so $r^4 + s^4 = t^2$.

But $t \le t^2 = m \le m^2 < m^2 + n^2 = c$ by (18) and so c is not the least member of a solution (!!!). □

Corollary $x^{4n} + y^{4n} = z^{4n}$, n = 1, 2, 3, ... has no non-trivial solution.
Proof. If it did, say $a^{4n} + b^{4n} = c^{4n}$, a, b, c ≥ 1 ⇒ $(a^n)^4 + (b^n)^4 = (c^{2n})^2$ would give a solution to $x^4 + y^4 = z^2$, which is impossible. □

Theorem 24 FLT(p) for any odd prime p ⇒ FLT(n) ∀n ≥ 3.
Proof. Let n ≥ 3 and $a^n + b^n = c^n$, a, b, c ≥ 1. Let n not be divisible by any odd prime. Then $n = 2^m$, m ≥ 2 so
$$a^{2^m} + b^{2^m} = c^{2^m} \Rightarrow \left(a^{2^{m-2}}\right)^4 + \left(b^{2^{m-2}}\right)^4 = \left(c^{2^{m-1}}\right)^2$$

solves $x^4 + y^4 = z^2$ which is impossible.

Let n be divisible by some odd prime p ≥ 3 so n = pm. Then
$$a^n + b^n = c^n \Rightarrow a^{mp} + b^{mp} = c^{mp} \Rightarrow (a^m)^p + (b^m)^p = (c^m)^p$$

which is impossible by FLT(p) □

Further reading Fermat’s Last Theorem—by Simon Singh (4th Estate) Video—Fermat’s Last Theorem—BBC Horizon

Proof based on the Frey curve $y^2 = x(x - A^p)(x - B^p)$, which, if $A^p + B^p = C^p$, has a “discriminant” $(ABC)^p$, should not, and does not exist.

Ex Equations like $y^2 = x^3 + 7$ are called “elliptic curves”. They arise in solving integrals for, say, the period of a body in a planetary orbit.

(Lebesgue, 1869) The equation $y^2 = x^3 + 7$ is insoluble over Z.
Proof. If x is even, x = 2α ⇒ RHS = $8\alpha^3 + 7 = 8\beta + 7$, where $\beta = \alpha^3$. But $0^2 \equiv 0$, $1^2 \equiv 1$, $2^2 \equiv 4$, $3^2 \equiv 1$, $4^2 \equiv 0$, $5^2 \equiv 1$, $6^2 \equiv 4$ and $7^2 \equiv 1$ (mod 8) so $y^2 \equiv 7$ (mod 8) has no solution. Hence x is odd. Write

$$y^2 + 1 = x^3 + 8 = (x + 2)(x^2 - 2x + 4) = (x + 2)((x - 1)^2 + 3)$$

If x = 2n + 1 (odd) then $(x - 1)^2 + 3 = 4n^2 + 3 = 4m + 3$, $m = n^2$, so (see back) it must have a prime factor of the form $p = 4\ell + 3$. But then $y^2 + 1 \equiv qp \equiv 0$ (mod p). But (lemma later) p ≡ 3 (mod 4) ⇒ $y^2 \equiv -1$ (mod p) has no solution. □

We frequently need to know the answer to the following: When does $x^2 \equiv r$ (mod p) have a solution x? Or, more generally, $x^2 \equiv \alpha$ (mod m). The answer is given by the theory of quadratic reciprocity due to Gauss. This will be developed later.

8 Pell’s Equation

$$x^2 - Ny^2 = 1$$

Trivial solution x = 1, y = 0; we look for solutions with x, y ≥ 0.

N = −1 ⇒ (x, y) = (1, 0) or (0, 1) are trivial solutions only.

N ≤ −2 ⇒ (x, y) = (1, 0).

Let N > 0 and not a square: If $N = M^2$, M ≥ 1, $x^2 - Ny^2 = x^2 - (My)^2 = (x - My)(x + My) = 1$ ⇒ x − My = 1 and x + My = 1 so we can get all solutions. Indeed (x, y) = (1, 0) for x, y ≥ 0. So we always assume N ≥ 2.

Note: Solutions to Pell’s equation provide good rational approximations for square roots, since
$$x^2 = Ny^2 + 1 \;\Rightarrow\; \frac{x^2}{y^2} = N + \frac{1}{y^2} \;\Rightarrow\; \frac{x}{y} \approx \sqrt{N} \text{ if } y \text{ is large.}$$

Note: This type of equation has a long and interesting history, and has lots of applications, especially to fields $F = \mathbb{Q}(\sqrt{N})$.

Ex (Euler, 1770) A triangular number has the form $\frac{n(n+1)}{2}$. Which numbers are both triangular and square? $m^2 = n(n + 1)/2 \Rightarrow 8m^2 + 1 = 4n^2 + 4n + 1 = (2n + 1)^2 \Rightarrow x^2 - 2y^2 = 1$ where x = 2n + 1, y = 2m. So solutions to this Pellian equation produce (all) square triangular numbers.
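The correspondence can be turned into a short generator. The sketch below assumes the starting solution (3, 2) of $x^2 - 2y^2 = 1$ and steps by multiplying with $3 + 2\sqrt{2}$; the name `square_triangular` is hypothetical.

```python
def square_triangular(count):
    """First `count` triples (n, m, m^2) with n(n+1)/2 = m^2, via x^2 - 2y^2 = 1."""
    results = []
    x, y = 3, 2                      # assumed starting solution: 3^2 - 2*2^2 = 1
    for _ in range(count):
        n, m = (x - 1) // 2, y // 2  # x = 2n + 1, y = 2m
        results.append((n, m, m * m))
        # (x + y*sqrt(2)) * (3 + 2*sqrt(2)) = (3x + 4y) + (2x + 3y)*sqrt(2)
        x, y = 3 * x + 4 * y, 2 * x + 3 * y
    return results
```

The first few values produced this way are 1, 36, 1225 and 41616.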

Definition A fundamental solution to $x^2 - dy^2 = 1$ is a positive solution (r, s) such that any other positive solution (x, y) satisfies r < x and s < y.

Theorem 25 (Lagrange) Let (r, s) be the least positive (or fundamental) solution to $x^2 - dy^2 = 1$, where d is not a square. Then every positive solution to this equation is given by $(x_n, y_n)$ where
$$x_n + \sqrt{d}\,y_n = (r + s\sqrt{d})^n \quad \text{for } n = 1, 2, 3, \ldots$$
Proof.
$$x_n^2 - dy_n^2 = (x_n + y_n\sqrt{d})(x_n - y_n\sqrt{d}) = (r + s\sqrt{d})^n (r - s\sqrt{d})^n = (r^2 - s^2d)^n = 1^n = 1$$

Hence $(x_n, y_n)$ is a solution.
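Expanding $(x + y\sqrt{d})(r + s\sqrt{d})$ gives an integer recurrence for the successive solutions. A minimal sketch, assuming the caller supplies a valid fundamental solution (r, s); the name `pell_solutions` is hypothetical.

```python
def pell_solutions(d, r, s, count):
    """First `count` solutions (x_n, y_n) of x^2 - d y^2 = 1 from (r + s*sqrt(d))^n."""
    solutions = []
    x, y = r, s
    for _ in range(count):
        solutions.append((x, y))
        # (x + y*sqrt(d)) * (r + s*sqrt(d)) = (r x + d s y) + (s x + r y)*sqrt(d)
        x, y = r * x + d * s * y, s * x + r * y
    return solutions
```

With d = 2 and (r, s) = (3, 2) the first three solutions are (3, 2), (17, 12), (99, 70).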

Let (a, b) be a positive solution. Suppose ∀n = 1, 2, 3, ..., $(a, b) \ne (x_n, y_n)$. Then there is a positive integer m with
$$(r + s\sqrt{d})^m < a + b\sqrt{d} < (r + s\sqrt{d})^{m+1} \qquad (21)$$

But $(r + s\sqrt{d})^{-m} = (r - s\sqrt{d})^m$ so (21) ⇒
$$1 < (a + b\sqrt{d})(r - s\sqrt{d})^m < r + s\sqrt{d} \qquad (22)$$
Let $u + v\sqrt{d} = (a + b\sqrt{d})(r - s\sqrt{d})^m$ so
$$u^2 - v^2d = (u + v\sqrt{d})(u - v\sqrt{d}) = (a + b\sqrt{d})(r - s\sqrt{d})^m (a - b\sqrt{d})(r + s\sqrt{d})^m = (a^2 - b^2d)(r^2 - s^2d)^m = 1 \cdot 1^m = 1$$
Thus (u, v) is a solution.

But $1 < u + v\sqrt{d} \Rightarrow 0 < u - v\sqrt{d} < 1$ so
$$2u = (u + v\sqrt{d}) + (u - v\sqrt{d}) > 1 + 0 > 0$$
and
$$2v\sqrt{d} = (u + v\sqrt{d}) - (u - v\sqrt{d}) > 1 - 1 = 0$$
so u > 0, v > 0 and $u + v\sqrt{d} < r + s\sqrt{d}$ by (22), contradicting the assumption that (r, s) is the fundamental solution. Hence $(a, b) = (x_n, y_n)$ for some n. □

Finding the least positive solution is not easy however and requires the theory of continued fractions of J. L. Lagrange. Frenicle’s table for non-square d up to 50 is given below.

Pell's equation

Euler, after a cursory reading of Wallis's Opera Mathematica, mistakenly attributed the first serious study of nontrivial solutions to equations of the form $x^2 - dy^2 = 1$, where x ≠ 1 and y ≠ 0, to Cromwell's mathematician John Pell. However, there is no evidence that Pell, who taught at the University of Amsterdam, had ever considered solving such equations. They would be more aptly called Fermat's equations, since Fermat first investigated properties of nontrivial solutions of such equations. Nevertheless, Pellian equations have a long history and can be traced back to the Greeks. Theon of Smyrna used x/y to approximate $\sqrt{2}$, where x and y were integral solutions to $x^2 - 2y^2 = 1$. In general, if $x^2 = dy^2 + 1$, then $x^2/y^2 = d + 1/y^2$. Hence, for y large, x/y is a good approximation of $\sqrt{d}$, a fact well known to Archimedes. Archimedes's problema bovinum took two thousand years to solve. According to a manuscript discovered in the Wolfenbüttel library in 1773 by Gotthold Ephraim Lessing, the German critic and dramatist, Archimedes became upset with Apollonius of Perga for criticizing one of his works. He devised a cattle problem that would involve immense calculation to solve and sent it off to Apollonius. In the accompanying correspondence, Archimedes asked Apollonius to compute, if he thought he was

smart enough, the number of the oxen of the sun that grazed once upon the plains of the Sicilian isle Trinacria and that were divided according to color into four herds, one milk white, one black, one yellow and one dappled, with the following constraints:

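white bulls = yellow bulls + $\left(\frac{1}{2} + \frac{1}{3}\right)$ black bulls,
black bulls = yellow bulls + $\left(\frac{1}{4} + \frac{1}{5}\right)$ dappled bulls,
dappled bulls = yellow bulls + $\left(\frac{1}{6} + \frac{1}{7}\right)$ white bulls,

white cows = $\left(\frac{1}{3} + \frac{1}{4}\right)$ black herd,

black cows = $\left(\frac{1}{4} + \frac{1}{5}\right)$ dappled herd,

dappled cows = $\left(\frac{1}{5} + \frac{1}{6}\right)$ yellow herd, and

yellow cows = $\left(\frac{1}{6} + \frac{1}{7}\right)$ white herd.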

Archimedes added, if you find this number, you are pretty good at numbers, but do not pat yourself on the back too quickly for there are two more conditions, namely: white bulls plus black bulls is square and dappled bulls plus yellow bulls is triangular. Archimedes concluded, if you solve the whole problem then you may 'go forth as conqueror and rest assured that thou art proved most skillful in the science of numbers'. The smallest herd satisfying the first seven conditions in eight unknowns, after some simplifications, leads to the Pellian equation $x^2 - 4729494\,y^2 = 1$. The least positive solution, for which y has 41 digits, was discovered by Carl Amthor in 1880. His solution implies that the number of white bulls has over $2 \times 10^5$ digits. The problem becomes much more difficult when the eighth and ninth conditions are added and the first complete solution was given in 1965 by H.C. Williams, R.A. German, and C.R. Zarnke of the University of Waterloo. In Arithmetica, Diophantus asks for rational solutions to equations of the type $x^2 - dy^2 = 1$. In the case where $d = m^2 + 1$, Diophantus offered the integral solution $x = 2m^2 + 1$ and $y = 2m$. Pellian equations are found in Hindu mathematics. In the fourth century, the Indian mathematician ...

9 Continued Fractions

Ex
$$1 + \cfrac{1}{2 + \cfrac{1}{3 + \frac{1}{4}}} = 1 + \cfrac{1}{2 + \frac{1}{13/4}} = 1 + \cfrac{1}{2 + \frac{4}{13}} = 1 + \frac{1}{30/13} = 1 + \frac{13}{30} = \frac{43}{30}$$

looks silly until we consider some interesting continued fraction expansions

π : [3, 7, 15, 1, 292, 1, 1, 1, ...] i.e.
$$\pi = 3 + \cfrac{1}{7 + \cfrac{1}{15 + \cfrac{1}{1 + \cfrac{1}{292 + \cdots}}}}$$
e : [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, ...]
$\sqrt{2}$ : [1, 2, 2, 2, 2, ...]
$\sqrt{3}$ : [1, 1, 2, 1, 2, 1, 2, 1, 2, ...]
$\sqrt{5}$ : [2, 4, 4, 4, ...]
$\sqrt{n^2 + 1}$ : [n, 2n, 2n, ...] (Euler)

Definition By a simple continued fraction (or C.F.) we mean an expression
$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots}} = [a_0, a_1, a_2, \ldots]$$

where $a_0 \in \mathbb{Z}$ and $a_i \in \mathbb{N}$ for i ≥ 1.

Note: $[a_0] = \frac{a_0}{1}$, $[a_0, a_1] = \frac{a_0a_1 + 1}{a_1}$, $[a_0, a_1, a_2] = \frac{a_2a_1a_0 + a_2 + a_0}{a_2a_1 + 1}$

Generally, $[a_0, \ldots, a_n] = \frac{p_n}{q_n}$ where $p_n$ and $q_n$ are polynomials in the $a_i$, linear in any given $a_j$, and $a_0$ does not occur in the denominator $q_n$. $(p_n, q_n)$ are called the nth convergents.

Note: $[a_0, \ldots, a_n] = [a_0, \ldots, a_{n-1} + \frac{1}{a_n}]$

Proposition If $[a_0, \ldots, a_m] = [b_0, \ldots, b_n]$, $a_i, b_i \in \mathbb{N}$, $a_m, b_n > 1$ then m = n and $a_i = b_i$ ∀i.
Proof. This follows by induction from
$$[a_0, \ldots, a_m] = a_0 + \frac{1}{[a_1, \ldots, a_m]} = b_0 + \frac{1}{[b_1, \ldots, b_n]}$$

if we can show $[a_1, \ldots, a_m] > 1$ when $a_1, \ldots, a_m \ge 1$ and $a_m > 1$. But this is so since $[a_1, \ldots, a_m] = a_1 + \cfrac{1}{a_2 + \cdots}$. □

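Let $a_i > 0$ and ∀n let $\tau_n = [a_0, \ldots, a_n]$; then $\tau_n$ can be computed using the recursive formulas, for n ≥ 2:

$$p_0 = a_0, \quad p_1 = a_0a_1 + 1, \quad p_n = a_np_{n-1} + p_{n-2}$$
$$q_0 = 1, \quad q_1 = a_1, \quad q_n = a_nq_{n-1} + q_{n-2}$$

so $\tau_0 = \frac{p_0}{q_0}$, $\tau_1 = \frac{p_1}{q_1}$ and $\tau_n = \frac{p_n}{q_n}$.
Proof.
$$\tau_n = [a_0, \ldots, a_n] = \left[a_0, \ldots, a_{n-1} + \tfrac{1}{a_n}\right] = \frac{p'_{n-1}}{q'_{n-1}}$$
where these belong to $a_0, \ldots, a_{n-2}, a_{n-1} + \frac{1}{a_n}$, i.e. (induction)

$$\frac{p'_{n-1}}{q'_{n-1}} = \frac{\left(a_{n-1} + \frac{1}{a_n}\right)p_{n-2} + p_{n-3}}{\left(a_{n-1} + \frac{1}{a_n}\right)q_{n-2} + q_{n-3}} = \frac{a_n(a_{n-1}p_{n-2} + p_{n-3}) + p_{n-2}}{a_n(a_{n-1}q_{n-2} + q_{n-3}) + q_{n-2}} = \frac{a_np_{n-1} + p_{n-2}}{a_nq_{n-1} + q_{n-2}} \quad \text{(induction again!)}$$

Hence $p_n = a_np_{n-1} + p_{n-2}$ and $q_n = a_nq_{n-1} + q_{n-2}$ □

$(p_n, q_n)$ are called the nth convergents of the C.F.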
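The recursion is straightforward to run. The sketch below is an illustration only: the name `convergents` is hypothetical, and it seeds the recursion with the conventional values $p_{-1} = 1$, $q_{-1} = 0$ (an assumption not stated in the notes, but one that reproduces $p_1 = a_0a_1 + 1$ and $q_1 = a_1$).

```python
from fractions import Fraction


def convergents(terms):
    """Convergents p_n/q_n of [a_0, a_1, ...] via p_n = a_n p_{n-1} + p_{n-2}."""
    p_prev, p = 1, terms[0]   # assumed seeds p_{-1} = 1, p_0 = a_0
    q_prev, q = 0, 1          # assumed seeds q_{-1} = 0, q_0 = 1
    out = [Fraction(p, q)]
    for a in terms[1:]:
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
        out.append(Fraction(p, q))
    return out
```

For the terms [1, 2, 3, 4] the convergents are 1, 3/2, 10/7 and 43/30.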

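Let $\theta \in \mathbb{R} \setminus \mathbb{Z}$, θ > 1. $a_0 = \lfloor\theta\rfloor$ so $\theta = a_0 + \frac{1}{\theta_1}$, $\theta_1 > 1$ defines $\theta_1$. Continue with $\theta_1 = a_1 + \frac{1}{\theta_2}$ so $a_1 = \lfloor\theta_1\rfloor$, $\theta_2 > 1$ if $\theta_1 \notin \mathbb{Z}$, etc. $\theta_n = a_n + \frac{1}{\theta_{n+1}}$, $a_n = \lfloor\theta_n\rfloor$, $\theta_{n+1} > 1$ if $\theta_n \notin \mathbb{Z}$. We get
$$\theta = a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cdots + \cfrac{1}{a_n + \cfrac{1}{\theta_{n+1}}}}}$$
so $\theta = \left[a_0, a_1, \ldots, a_n + \frac{1}{\theta_{n+1}}\right]$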

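Proposition The expansion stops if $\theta_n = a_n$ is in $\mathbb{N}$ and then $\theta \in \mathbb{Q}^+$ i.e. is a positive rational number. Conversely, if $\theta \in \mathbb{Q}^+$, the C.F. expansion is finite.
Proof. Let $\theta = \frac{u}{v} \in \mathbb{Q}^+$, u, v ∈ N. Use division

$u = a_0v + r_1$, $0 < r_1 < v$
$v = a_1r_1 + r_2$, $0 < r_2 < r_1$
$r_1 = a_2r_2 + r_3$, $0 < r_3 < r_2$
...
$r_{n-1} = a_nr_n + 0$

as if we were doing the Euclidean algorithm. These equations give

$$\theta = \theta_0 = \frac{u}{v} = a_0 + \frac{r_1}{v} = a_0 + \frac{1}{v/r_1} = a_0 + \frac{1}{\theta_1}$$
$$\theta_1 = a_1 + \frac{r_2}{r_1} = a_1 + \frac{1}{r_1/r_2} = a_1 + \frac{1}{\theta_2}$$
$$\vdots$$
$$\theta_n = \frac{r_{n-1}}{r_n} \in \mathbb{N}$$

so the C.F. expansion is finite. □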
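The division steps above are exactly the Euclidean algorithm, so the partial quotients of a positive rational can be read off in a few lines. The name `rational_cf` is hypothetical.

```python
def rational_cf(u, v):
    """Partial quotients [a_0, a_1, ..., a_n] of u/v for positive integers u, v."""
    terms = []
    while v:
        a, r = divmod(u, v)   # u = a*v + r with 0 <= r < v
        terms.append(a)
        u, v = v, r
    return terms
```

For 43/30 the divisions are 43 = 1·30 + 13, 30 = 2·13 + 4, 13 = 3·4 + 1, 4 = 4·1 + 0, giving [1, 2, 3, 4].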

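Proposition ∀n ≥ 2
$$\theta = \frac{\theta_np_{n-1} + p_{n-2}}{\theta_nq_{n-1} + q_{n-2}}$$

Proof. The definition of $\theta_n$ is $\theta = [a_0, \ldots, a_{n-1}, \theta_n]$ so $\theta = \tau_n = \frac{p_n}{q_n} = \frac{\theta_np_{n-1} + p_{n-2}}{\theta_nq_{n-1} + q_{n-2}}$ using $\theta_n$ in place of $a_n$ for this particular C.F. □

Ex $\sqrt{2} = [1, 2, 2, \ldots]$:
$(\sqrt{2} - 1)(\sqrt{2} + 1) = 2 - 1 = 1 \Rightarrow \sqrt{2} - 1 = \frac{1}{1 + \sqrt{2}}$ so $\sqrt{2} = 1 + \frac{1}{1 + \sqrt{2}}$.
We now copy the expression for $\sqrt{2}$ into the $\sqrt{2}$ on the RHS successively (photocopy model for recursion).
$$\sqrt{2} = 1 + \cfrac{1}{1 + 1 + \cfrac{1}{1 + \sqrt{2}}} = 1 + \cfrac{1}{2 + \cfrac{1}{1 + \sqrt{2}}} = 1 + \cfrac{1}{2 + \cfrac{1}{2 + \cfrac{1}{1 + \sqrt{2}}}}$$
etc. leading to $\sqrt{2} = [1, 2, 2, 2, 2, \ldots, 2, 1 + \sqrt{2}]$. If we continue indefinitely we obtain $\sqrt{2} = [1, 2, 2, \ldots] = [1, \overline{2}]$.

Every quadratic irrational has a periodic continued fraction; this characterises quadratic irrationals.

Ex $\sqrt{2} = [1, 2, \ldots, 2, 1 + \sqrt{2}]$ so $a_0 = 1$, $a_1 = 2$, ...
$p_0 = a_0 = 1$, $q_0 = 1$, $\tau_0 = \frac{p_0}{q_0} = \frac{1}{1} = 1$

$p_1 = a_0a_1 + 1 = 3$, $q_1 = a_1 = 2$, $\tau_1 = \frac{p_1}{q_1} = \frac{3}{2} = 1.5$

$p_2 = a_2p_1 + p_0 = 7$, $q_2 = a_2q_1 + q_0 = 5$, $\tau_2 = \frac{p_2}{q_2} = \frac{7}{5} = 1.4$
and the approximation $\tau_n \approx \sqrt{2}$ gets better.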

Theorem 26 Let $a_0 \in \mathbb{Z}$, $a_i \in \mathbb{N}$, i ≥ 1. Then $(\tau_n)$ converges to an irrational number θ. The $a_i$ are uniquely determined by the C.F. expansion of θ. Conversely, if θ is an irrational number, and $\tau_n = [a_0, \ldots, a_n]$ are obtained by expanding θ as a C.F. then

$$\theta = \lim_{n \to \infty} \tau_n.$$

Proof. The sequences $(p_n)$ and $(q_n)$ are both strictly monotonically increasing sequences of natural numbers.

Claim:
$$p_nq_{n-1} - p_{n-1}q_n = (-1)^{n-1} \qquad (23)$$
∀n ≥ 1. If n = 1 this is $p_1q_0 - p_0q_1 = (a_0a_1 + 1) \cdot 1 - a_0a_1 = 1 = (-1)^{1-1}$ which is true. Assume it is true for n = m. Then

$$p_{m+1}q_m - p_mq_{m+1} = (a_{m+1}p_m + p_{m-1})q_m - p_m(a_{m+1}q_m + q_{m-1})$$
$$= p_{m-1}q_m - p_mq_{m-1}$$
$$= -(p_mq_{m-1} - p_{m-1}q_m) = -(-1)^{m-1} = (-1)^m$$

Hence, by induction, the claim is true ∀n ≥ 1.

Divide (23) by $q_nq_{n-1}$ to obtain
$$\frac{p_n}{q_n} - \frac{p_{n-1}}{q_{n-1}} = \frac{(-1)^{n-1}}{q_nq_{n-1}}$$
or
$$\tau_n - \tau_{n-1} = \frac{(-1)^{n-1}}{q_nq_{n-1}} \qquad (24)$$

Apply this to $\theta = [a_0, \ldots, a_{n-1}, \theta_n]$ to get
$$\theta - \tau_{n-1} = \frac{(-1)^{n-1}}{q_{n-1}(\theta_nq_{n-1} + q_{n-2})} \qquad (25)$$

But $\theta_i > 0$ and $q_i \to \infty$

$$\therefore \lim_{n \to \infty} \tau_n = \theta$$
since RHS of (25) → 0. The proof of uniqueness is similar to that given above when $\theta \in \mathbb{Q}^+$. □

Aside Numbers of the form $\alpha + \beta\sqrt{d}$, d ∈ N, $d \ne m^2$ are a field, $F = \mathbb{Q}(\sqrt{d})$, the “extension” of $\mathbb{Q}$ by $\sqrt{d}$:
$$\frac{1}{\alpha + \beta\sqrt{d}} = \frac{\alpha - \beta\sqrt{d}}{\alpha^2 - \beta^2d} = \left(\frac{\alpha}{\alpha^2 - \beta^2d}\right) - \left(\frac{\beta}{\alpha^2 - \beta^2d}\right)\sqrt{d} \in \{\alpha_1 + \beta_1\sqrt{d}\}$$

Diophantine Approximation

Equation (25) implies

$$\left|\theta - \frac{p_n}{q_n}\right| = \frac{1}{q_n(\theta_{n+1}q_n + q_{n-1})} < \frac{1}{q_nq_{n+1}} \qquad (26)$$

The numbers $q_0, q_1, \ldots$ are strictly increasing in N. The continued fraction process provides us with an infinite sequence of rational approximations to an irrational number, θ, namely the convergents $\frac{p_n}{q_n} \in \mathbb{Q}$. How rapidly do they approach θ?

By (26), if $\frac{x}{y}$ is a convergent,

$$\left|\theta - \frac{x}{y}\right| < \frac{1}{y^2}$$
It is possible to prove that (Hurwitz, 1891) any irrational number θ has an infinite number of rational approximations which satisfy

$$\left|\theta - \frac{x}{y}\right| < \frac{1}{\sqrt{5}\,y^2} \qquad (27)$$
This is the best possible: If we choose $\beta > \sqrt{5}$ then there are numbers $\eta \in \mathbb{R} \setminus \mathbb{Q}$ for which there are only a finite number of rationals $\frac{x}{y}$ with $\left|\eta - \frac{x}{y}\right| < \frac{1}{\beta y^2}$, e.g. the golden ratio
$$g = 1 + \frac{1}{g} = 1 + \cfrac{1}{1 + \cfrac{1}{1 + \cdots}}$$
so $g^2 - g - 1 = 0 \Rightarrow g = \frac{1 + \sqrt{5}}{2}$.

Inequalities of the form (27) will be very important later when we study rational, algebraic, irrational and transcendental numbers such as $\frac{401}{403}$, $\frac{1 + \sqrt{5}}{2}$ and e or π.

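Quadratic Irrationals

• solutions to quadratic equations with Z coefficients e.g. $x^2 - 2 = 0 \Rightarrow x = \sqrt{2}$.
• simplest type of irrational e.g. $(\sqrt{4} + 7^{1/3})^{1/5}$ is ‘more’ irrational as is π (see later)

Ex $\theta = \frac{24 - \sqrt{15}}{17}$: $3 < \sqrt{15} < 4 \Rightarrow \lfloor\theta\rfloor = 1$ and $\theta = 1 + \frac{1}{\theta_1}$

$\theta_1 = \frac{1}{\theta - 1} = \frac{17}{7 - \sqrt{15}} = \frac{7 + \sqrt{15}}{2}$, $\lfloor\theta_1\rfloor = 5 \Rightarrow \theta_1 = 5 + \frac{1}{\theta_2}$
$\theta_2 = \frac{1}{\theta_1 - 5} = \frac{2}{\sqrt{15} - 3} = \frac{\sqrt{15} + 3}{3}$, $\lfloor\theta_2\rfloor = 2 \Rightarrow \theta_2 = 2 + \frac{1}{\theta_3}$
$\theta_3 = \frac{1}{\theta_2 - 2} = \frac{3}{\sqrt{15} - 3} = \frac{\sqrt{15} + 3}{2}$, $\lfloor\theta_3\rfloor = 3 \Rightarrow \theta_3 = 3 + \frac{1}{\theta_4}$
$\theta_4 = \frac{1}{\theta_3 - 3} = \frac{2}{\sqrt{15} - 3} = \frac{\sqrt{15} + 3}{3}$ so $\theta_4 = \theta_2$
$$\Rightarrow \frac{24 - \sqrt{15}}{17} = 1 + \cfrac{1}{5 + \cfrac{1}{2 + \cfrac{1}{3 + \cfrac{1}{2 + \cfrac{1}{3 + \cdots}}}}} = [1, 5, \overline{2, 3}]$$

Ex
$\sqrt{2} = [1, \overline{2}]$
$\sqrt{3} = [1, \overline{1, 2}]$
$\sqrt{5} = [2, \overline{4}]$
$\sqrt{6} = [2, \overline{2, 4}]$

H. Davenport, The Higher Arithmetic

Ex $\sqrt{50} = [7, \overline{14}]$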
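Expansions of this kind can be computed with integer arithmetic only, by tracking each complete quotient in the form $(m + \sqrt{N})/d$. The sketch below is one standard way to do this, not taken from the notes; the name `sqrt_cf` is hypothetical, and stopping when a partial quotient equals $2a_0$ relies on the fact (stated later in these notes) that the period of $\sqrt{N}$ ends with $2a_0$.

```python
from math import isqrt


def sqrt_cf(N):
    """Return (a_0, period) with sqrt(N) = [a_0, overline(period)], N not a square."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        raise ValueError("N must not be a perfect square")
    m, d, a = 0, 1, a0            # current complete quotient is (m + sqrt(N)) / d
    period = []
    while a != 2 * a0:            # the period ends with the term 2*a_0
        m = d * a - m
        d = (N - m * m) // d      # exact division at every step
        a = (a0 + m) // d
        period.append(a)
    return a0, period
```

For N = 6 this gives $a_0 = 2$ and period [2, 4], matching $\sqrt{6} = [2, \overline{2, 4}]$.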

Purely periodic fractions

Ex
$$\sqrt{2} + 1 = 2 + \cfrac{1}{2 + \cfrac{1}{2 + \cdots}} = [\overline{2}]$$
$$\sqrt{6} + 2 = [\overline{4, 2}]$$

These numbers are easier to deal with than those with a ‘preperiod’.

Ex
$$\alpha = 4 + \cfrac{1}{1 + \cfrac{1}{3 + \cfrac{1}{4 + \cfrac{1}{1 + \cfrac{1}{3 + \cdots}}}}} = [4, 1, 3, \alpha]$$

Using the recursive equations we get convergents $\frac{4}{1}, \frac{5}{1}, \frac{19}{4}, \ldots$
$$\alpha = \frac{19\alpha + 5}{4\alpha + 1} \Leftarrow \alpha = \frac{\alpha p_{n-1} + p_{n-2}}{\alpha q_{n-1} + q_{n-2}}$$

Hence $4\alpha^2 - 18\alpha - 5 = 0$ and α is a quadratic irrational.

Now consider the number β which has the period of α reversed:
$$\beta = [\overline{3, 1, 4}] \Rightarrow \beta = \frac{19\beta + 4}{5\beta + 1} \Rightarrow 5\beta^2 - 18\beta - 4 = 0$$

The equations are the same if $-\frac{1}{\beta} = \alpha$ ⇒ $-\frac{1}{\beta}$ is the second root of the equation for α, called the (algebraic) conjugate of α or $\bar\alpha$, i.e. $\beta = -1/\bar\alpha$.

In general let $\alpha = [a_0, \ldots, a_n, \alpha]$ be purely periodic, then
$$\alpha = \frac{p_n\alpha + p_{n-1}}{q_n\alpha + q_{n-1}}$$

Let $\beta = [\overline{a_n, \ldots, a_0}] = [a_n, \ldots, a_0, \beta]$ then (Ex)
$$\beta = \frac{p_n\beta + q_n}{p_{n-1}\beta + q_{n-1}}$$
As before $-\frac{1}{\beta}$ is the conjugate of the root α.

Note: If β > 1 then $-1 < -\frac{1}{\beta} < 0$.

Theorem 27 Any purely periodic continued fraction represents a quadratic irrational number α > 1 with a conjugate $\bar\alpha$ satisfying $-1 < \bar\alpha < 0$. This conjugate is $\bar\alpha = -\frac{1}{\beta}$ where β is defined by the C.F. of α with the period reversed.

Remark (Galois, 1828) This property characterises numbers with purely periodic continued fractions.

Definition A quadratic irrational α is reduced if α > 1 and $-1 < \bar\alpha < 0$.

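Theorem 28 If α is reduced, its C.F. expansion is purely periodic.
Proof. There are integers a, b, c such that $a\alpha^2 + b\alpha + c = 0$. Solving for α:
$$\alpha = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a} = \frac{P \pm \sqrt{D}}{Q}$$

where P, Q ∈ Z, D ∈ N, $D \ne m^2$. Assume the sign is positive, else multiply by $\frac{-1}{-1}$, so
$$\alpha = \frac{P + \sqrt{D}}{Q}$$
so $\bar\alpha$, the other root, is
$$\bar\alpha = \frac{P - \sqrt{D}}{Q}.$$
Note that
$$\frac{P^2 - D}{Q} = \frac{b^2 - (b^2 - 4ac)}{2a} = 2c \Rightarrow Q \mid P^2 - D$$

But 1 < α and $-1 < \bar\alpha < 0$ so
(i) $\alpha - \bar\alpha > 0 \Rightarrow \frac{2\sqrt{D}}{Q} > 0 \Rightarrow Q > 0$
(ii) $\alpha + \bar\alpha > 0 \Rightarrow \frac{2P}{Q} > 0 \Rightarrow P > 0$
(iii) $\bar\alpha < 0 \Rightarrow P < \sqrt{D}$
(iv) $1 < \alpha \Rightarrow Q < P + \sqrt{D} < 2\sqrt{D}$
$$\Rightarrow P, Q \in \mathbb{N},\ P < \sqrt{D},\ Q < 2\sqrt{D} \text{ and } Q \mid P^2 - D. \qquad (28)$$

Now expand α as a C.F.
$$\alpha = a_0 + \frac{1}{\alpha_1}, \quad a_0 = \lfloor\alpha\rfloor, \quad \alpha_1 > 1$$
$$\Rightarrow \bar\alpha = a_0 + \frac{1}{\bar\alpha_1} \Rightarrow \bar\alpha_1 = -\frac{1}{a_0 - \bar\alpha} \Rightarrow -1 < \bar\alpha_1 < 0$$

Hence $\alpha_1$ is reduced also. Similarly $\alpha_2, \alpha_3, \ldots$ are reduced.

Now
$$\frac{1}{\alpha_1} = \alpha - a_0 = \frac{P + \sqrt{D}}{Q} - a_0 = \frac{P - Qa_0 + \sqrt{D}}{Q}$$
so let $P_1 = -P + Qa_0$ so
$$\alpha_1 = \frac{Q}{-P_1 + \sqrt{D}} = \frac{P_1 + \sqrt{D}}{Q_1} \qquad (29)$$

where $Q_1Q = D - P_1^2$ and $Q_1 \in \mathbb{Z}$ since $Q \mid D - P^2$ and $P_1 \equiv -P$ (mod Q).

Then
$$\alpha_1 = \frac{P_1 + \sqrt{D}}{Q_1}$$
and since $\alpha_1$ is reduced, $P_1 > 0$, $Q_1 > 0$ and we get the conditions (28) above using (29). We carry on with the C.F. process, using $\alpha_1$ instead of α, .... Each complete quotient has the form
$$\alpha_n = \frac{P_n + \sqrt{D}}{Q_n}$$
where $P_n, Q_n$ satisfy (28). There are only a finite set of possibilities for the pairs $(P_n, Q_n)$ so eventually we come to a pair $(P_m, Q_m) = (P_n, Q_n)$, m > n so $\alpha_m = \alpha_n$ and so the C.F. is periodic from this point on.

Claim: The C.F. is purely periodic.

Subclaim: $\alpha_{n-1} = \alpha_{m-1}$. If this were so we would be able to work back to get, eventually, $\alpha_0 = \alpha_{m-n}$ proving pure periodicity. Proof of the subclaim: $\alpha_n = a_n + \frac{1}{\alpha_{n+1}} \Rightarrow \bar\alpha_n = a_n + \frac{1}{\bar\alpha_{n+1}}$. Let $\beta_n = -\frac{1}{\bar\alpha_n}$; then $-1 < \bar\alpha_n < 0 \Rightarrow 1 < \beta_n$ and $-\frac{1}{\beta_n} = a_n - \beta_{n+1}$ or $\beta_{n+1} = a_n + \frac{1}{\beta_n}$ so $a_n = \lfloor\alpha_n\rfloor = \lfloor\beta_{n+1}\rfloor$. Now let n < m and $\alpha_n = \alpha_m$ so $\bar\alpha_n = \bar\alpha_m \Rightarrow \beta_n = \beta_m$ and $a_{n-1} = \lfloor\beta_n\rfloor = \lfloor\beta_m\rfloor = a_{m-1}$. But $\alpha_{n-1} = a_{n-1} + \frac{1}{\alpha_n}$, $\alpha_{m-1} = a_{m-1} + \frac{1}{\alpha_m} \Rightarrow \alpha_{n-1} = \alpha_{m-1}$.

Applying this again successively $\alpha = \alpha_0 = \alpha_{m-n} = \alpha_r$ say, and

$$\alpha = [a_0, a_1, \ldots, a_{r-1}, \alpha_r] = [a_0, a_1, \ldots, a_{r-1}, \alpha] = [\overline{a_0, a_1, \ldots, a_{r-1}}]$$
pure periodic with period length r. □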

Now consider the table given above: N ∈ N, $N \ne m^2$. All the continued fractions are of a special form:
(1) None are purely periodic: the conjugate of $\sqrt{N}$ is $-\sqrt{N} < -1$. But with $a_0 = \lfloor\sqrt{N}\rfloor$, $\alpha = a_0 + \sqrt{N}$ has 1 < α and $-1 < \bar\alpha < 0$, and the continued fraction of α begins with $2a_0$ since $\lfloor a_0 + \sqrt{N}\rfloor = a_0 + \lfloor\sqrt{N}\rfloor = 2a_0$. Hence $a_0 + \sqrt{N} = [2a_0, a_1, \ldots, a_n, 2a_0, \ldots]$ eventually since it is purely periodic.
(2) There is one preperiod term for $\sqrt{N}$ (with a periodic part consisting of symmetric terms), followed by $2a_0$:
$$\sqrt{N} = [a_0, \overline{a_1, a_2, \ldots, a_2, a_1, 2a_0}]$$

Ex $\sqrt{53} = [7, \overline{3, 1, 1, 3, 14}]$

Theorem (Lagrange) A continued fraction is periodic ⇔ it is the continued fraction of a quadratic irrational. Ex

$2^{1/3} = [1, 3, 1, 5, 1, 1, 4, 1, \ldots]$
$\frac{e - 1}{e + 1} = [0, 2, 6, 10, 14, \ldots]$
$e = [2, 1, 2, 1, 1, 4, 1, 1, 6, 1, 1, 8, \ldots]$

Return to the Fundamental Solution to Pell’s Equation

$$\sqrt{N} = [a_0, \underbrace{a_1, \ldots, a_n, 2a_0}_{\text{periodic part}}]$$

Let $\frac{p_{n-1}}{q_{n-1}}$, $\frac{p_n}{q_n}$ be the two convergents just preceding $2a_0$, i.e.

$$\frac{p_{n-1}}{q_{n-1}} = [a_0, \ldots, a_{n-1}], \qquad \frac{p_n}{q_n} = [a_0, \ldots, a_{n-1}, a_n]$$
Now
$$\alpha = \sqrt{N} = \frac{\alpha_{n+1}p_n + p_{n-1}}{\alpha_{n+1}q_n + q_{n-1}} \qquad (1)$$
where
$$\alpha_{n+1} = 2a_0 + \cfrac{1}{a_1 + \cdots} = [2a_0, a_1, a_2, \ldots] = a_0 + \sqrt{N}$$

Substituting this value for $\alpha_{n+1}$ in (1) and simplifying gives
$$\sqrt{N}(\sqrt{N} + a_0)q_n + \sqrt{N}q_{n-1} = (\sqrt{N} + a_0)p_n + p_{n-1}$$
equating rational and irrational parts:

⇒ $Nq_n = a_0p_n + p_{n-1}$

$a_0q_n + q_{n-1} = p_n$

⇒ $p_{n-1} = Nq_n - a_0p_n$

$q_{n-1} = p_n - a_0q_n$

Substitute in equation (23), $p_nq_{n-1} - p_{n-1}q_n = (-1)^{n-1}$, to get

$$p_n(p_n - a_0q_n) - q_n(Nq_n - a_0p_n) = (-1)^{n-1}$$

$$\Rightarrow p_n^2 - Nq_n^2 = (-1)^{n-1}$$
Hence $x = p_n$, $y = q_n$ is a solution to $x^2 - Ny^2 = 1$ if n is odd and to $x^2 - Ny^2 = -1$ if n is even. In the latter case apply the same argument to the convergents at the end of the second period: $[a_0, a_1, \ldots, a_n, 2a_0, a_1, \ldots, a_n, 2a_0]$ so $x = p_{2n+1}$, $y = q_{2n+1}$ solve $x^2 - Ny^2 = (-1)^{2n+1-1} = (-1)^{2n} = 1$

Ex N = 21
$\sqrt{21} = [4, \overline{1, 1, 2, 1, 1, 8}]$ so n = 5 odd
Convergents: $\frac{4}{1}, \frac{5}{1}, \frac{9}{2}, \frac{23}{5}, \frac{32}{7}, \frac{55}{12} = \frac{p_5}{q_5}, \ldots$
So $x_1 = p_5 = 55$, $y_1 = q_5 = 12$ solves $x_1^2 - 21y_1^2 = 1$

This gives the fundamental solution. Other solutions are
$$x_m + y_m\sqrt{21} = (x_1 + y_1\sqrt{21})^m, \quad m = 2, 3, 4, \ldots$$

Ex N = 29
$\sqrt{29} = [5, \overline{2, 1, 1, 2, 10}]$ so n = 4 even
Convergents: $\frac{5}{1}, \frac{11}{2}, \frac{16}{3}, \frac{27}{5}, \frac{70}{13}, \frac{727}{135}, \frac{1524}{283}, \frac{2251}{418}, \frac{3775}{701}, \frac{9801}{1820} = \frac{p_9}{q_9}$
So $x_1 = 9801$, $y_1 = 1820$ gives the fundamental solution.
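The procedure in these examples can be automated by generating convergents of $\sqrt{N}$ until one satisfies the equation. The sketch below is an illustration under stated assumptions: the name `pell_fundamental` is hypothetical, the complete quotients are tracked as $(m + \sqrt{N})/d$ with integer arithmetic, and it takes on trust that the first convergent with $p^2 - Nq^2 = 1$ is the fundamental solution.

```python
from math import isqrt


def pell_fundamental(N):
    """First convergent p/q of sqrt(N) with p^2 - N q^2 = 1 (N not a square)."""
    a0 = isqrt(N)
    if a0 * a0 == N:
        raise ValueError("N must not be a perfect square")
    m, d, a = 0, 1, a0             # complete quotient (m + sqrt(N)) / d
    p_prev, p = 1, a0              # assumed seeds p_{-1} = 1, p_0 = a_0
    q_prev, q = 0, 1               # assumed seeds q_{-1} = 0, q_0 = 1
    while p * p - N * q * q != 1:
        m = d * a - m
        d = (N - m * m) // d
        a = (a0 + m) // d          # next partial quotient
        p_prev, p = p, a * p + p_prev
        q_prev, q = q, a * q + q_prev
    return p, q
```

For N = 21 the loop stops at 55/12, and for N = 29 at 9801/1820, matching the two examples.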

Note

1. Not all details have been proved, e.g. that the C.F. expansion gives all of the solutions, including the fundamental.

2. There are deep mysteries tied up in C.F. expansions, e.g. why are they so closely related to quadratic irrationals?

10 Quadratic Reciprocity

Let $p$ be an odd prime, $p \in \{3, 5, 7, \ldots\}$, and $n \in \mathbb{Z}$.

• If $p \nmid n$ and $x^2 \equiv n \pmod p$ has a solution $x \in \mathbb{Z}$, let $(n \mid p) = 1$.
• If $p \nmid n$ and $x^2 \equiv n \pmod p$ has no solution, let $(n \mid p) = -1$.
• If $p \mid n$, let $(n \mid p) = 0$.

This defines the Legendre symbol, $(n \mid p)$.

Ex p = 11:

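$1^2 \equiv 1$, $2^2 \equiv 4$, $3^2 \equiv 9$, $4^2 = 16 = 11 + 5 \equiv 5$, $5^2 = 25 = 22 + 3 \equiv 3$, and
$6^2 \equiv (11 - 5)^2 \equiv (-5)^2 \equiv 5^2 \equiv 3$, $7^2 \equiv (-4)^2 \equiv 5$, $8^2 \equiv (-3)^2 \equiv 9$, $9^2 \equiv (-2)^2 \equiv 4$, $10^2 \equiv (-1)^2 \equiv 1$ and $11^2 \equiv 0 \pmod{11}$.

So the quadratic residues are 1, 3, 4, 5, 9 ⇒ $(n \mid 11) = 1$; the non-residues are 2, 6, 7, 8, 10 ⇒ $(n \mid 11) = -1$; and $(11 \mid 11) = 0$.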

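Proposition $a \equiv b \pmod p \Rightarrow (a \mid p) = (b \mid p)$.
Proof. $a = b + lp$, so $p \mid a \Leftrightarrow p \mid b$ ∴ $(a \mid p) = 0 \Leftrightarrow (b \mid p) = 0$. Also $(a \mid p) = 1$ ⇔ $x^2 \equiv a \pmod p$ has a solution ⇔ $x^2 \equiv a \equiv b \pmod p$ has a solution ⇔ $(b \mid p) = 1$. □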

Proposition Let $p \in \mathbb{P}$ be odd. Every (reduced) residue system mod $p$ contains exactly $\frac{p-1}{2}$ quadratic residues and $\frac{p-1}{2}$ quadratic non-residues mod $p$. The quadratic residues belong to the residue classes containing the numbers
$$R = \left\{1^2, 2^2, 3^2, \ldots, \left(\frac{p-1}{2}\right)^2\right\}.$$

Proof. Claim: the numbers in $R$ are distinct mod $p$. If $x^2, y^2 \in R$ with $1 \le x, y \le \frac{p-1}{2}$ and $x^2 \equiv y^2$ ⇒ $(x - y)(x + y) \equiv 0 \pmod p$ ⇒ $p \mid (x - y)(x + y)$. But $0 < x + y \le \frac{p-1}{2} + \frac{p-1}{2} < p$, so $p \mid x - y$ ⇒ $x \equiv y \pmod p$ ⇒ $x = y$. Since $(p - k)^2 = p^2 - 2pk + k^2 \equiv k^2 \pmod p$ and $\{1, 2, \ldots, p - 1\}$ is a complete set of representatives of the non-zero classes, every quadratic residue is congruent to one of the numbers in $R$. □

Ex $p = 7$: $1^2 \equiv 1$, $2^2 \equiv 4$, $3^2 \equiv 2$.

The number of residues is $\frac{7-1}{2} = 3$. So 1, 4, 2 are residues and 3, 5, 6 are non-residues.

Ex For all odd $p$, $(1 \mid p) = 1$.

Ex For all odd $p$ and $m \in \mathbb{Z}$ with $p \nmid m$, $(m^2 \mid p) = 1$.

Ex Fix p and let f(n) = (n | p) so f : Z → {−1, 0, 1}. Then f(n + p) = (n + p | p) = (n | p) = f(n) so f is periodic with period p. It is also completely multiplicative. f(ab) = f(a)f(b) ∀a, b ∈ Z.

Theorem 29 (Euler) Let p ∈ P be odd. Then ∀n ∈ Z,

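$$(n \mid p) \equiv n^{\frac{p-1}{2}} \pmod p.$$

Proof. Note $\frac{p-1}{2} \in \mathbb{N}$. If $p \mid n$ both sides are zero, so suppose $p \nmid n$. Let $(n \mid p) = 1$. Then $\exists x$ so $x^2 \equiv n \pmod p$
$$\Rightarrow\; n^{\frac{p-1}{2}} \equiv (x^2)^{\frac{p-1}{2}} = x^{p-1} \equiv 1 = (n \mid p) \pmod p$$
by Fermat's little theorem. Hence
$$n^{\frac{p-1}{2}} \equiv (n \mid p) \pmod p \quad \text{if } (n \mid p) = 1.$$

Let $(n \mid p) = -1$. Consider the polynomial
$$f(x) = x^{\frac{p-1}{2}} - 1, \qquad \partial f = \text{degree of } f = \frac{p-1}{2}.$$
So, over any field, $f$ has at most $\frac{p-1}{2}$ roots, hence the congruence $f(x) \equiv 0 \pmod p$ has at most $\frac{p-1}{2}$ solutions. But the $\frac{p-1}{2}$ quadratic residues mod $p$ are solutions (the case $(n \mid p) = 1$), so the non-residues are not. Hence
$$n^{\frac{p-1}{2}} \not\equiv 1 \pmod p \quad \text{if } (n \mid p) = -1.$$
But
$$n^{p-1} - 1 = \left(n^{\frac{p-1}{2}} - 1\right)\left(n^{\frac{p-1}{2}} + 1\right)$$
and $p \mid n^{p-1} - 1$, so $n^{\frac{p-1}{2}} \equiv \pm 1 \pmod p$. Hence
$$n^{\frac{p-1}{2}} \equiv -1 = (n \mid p) \pmod p. \;\square$$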
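As an illustration of Theorem 29, here is a small sketch that evaluates the Legendre symbol with modular exponentiation and compares it with a direct search for square roots. The function names `legendre_euler` and `legendre_bruteforce` are illustrative, not from the notes; `p` is assumed to be an odd prime and is not checked.

```python
def legendre_euler(n, p):
    """(n | p) for an odd prime p via Euler: n^((p-1)/2) mod p."""
    r = pow(n, (p - 1) // 2, p)      # r is 0, 1 or p - 1
    return -1 if r == p - 1 else r

def legendre_bruteforce(n, p):
    """(n | p) straight from the definition, by searching for x with x^2 = n."""
    if n % p == 0:
        return 0
    has_root = any((x * x - n) % p == 0 for x in range(1, p))
    return 1 if has_root else -1

# Quadratic residues mod 11, as in the example above.
print([n for n in range(1, 11) if legendre_euler(n, 11) == 1])  # [1, 3, 4, 5, 9]
```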

Proposition $f(n) = (n \mid p)$ is completely multiplicative.
Proof. $p \mid m$ or $p \mid n$ ⇒ $p \mid mn$ ⇒ $(mn \mid p) = 0$, hence $f(mn) = f(m) \cdot f(n)$ if $p \mid m$ or $p \mid n$. Let $p \nmid m$ and $p \nmid n$. Then $p \nmid mn$ so
$$(mn \mid p) \equiv (mn)^{\frac{p-1}{2}} = m^{\frac{p-1}{2}} \cdot n^{\frac{p-1}{2}} \equiv (m \mid p)(n \mid p) \pmod p.$$
But each of $(mn \mid p)$, $(m \mid p)$ and $(n \mid p)$ is $\pm 1$, so the difference $(mn \mid p) - (m \mid p)(n \mid p)$ is $0$ or $\pm 2$. This difference is divisible by the odd prime $p$, hence it is 0, and therefore $(mn \mid p) = (m \mid p)(n \mid p)$ ⇒ $f(mn) = f(m) \cdot f(n)$, so $f$ is completely multiplicative. □

Proposition
$$(-1 \mid p) = (-1)^{\frac{p-1}{2}} = \begin{cases} 1 & p \equiv 1 \pmod 4 \\ -1 & p \equiv 3 \pmod 4 \end{cases}$$

Proof. By Euler, Theorem 29,
$$(-1 \mid p) \equiv (-1)^{\frac{p-1}{2}} \pmod p;$$
since each side is $\pm 1$, they must be equal. □

Ex $p = 5$, $\frac{p-1}{2} = 2$ (only the entries at $n = 0, 1, p - 1, p$ are shown):

n          0   1   2   3   4   5
(n | 5)    0   1   ·   ·   1   0

$p = 7$:

n          0   1   2   3   4   5   6   7
(n | 7)    0   1   ·   ·   ·   ·   -1  0

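Proposition
$$(2 \mid p) = (-1)^{\frac{p^2-1}{8}} = \begin{cases} 1 & p \equiv \pm 1 \pmod 8 \\ -1 & p \equiv \pm 3 \pmod 8 \end{cases}$$

p mod 8    0   1   2   3   4   5   6   7   8
(2 | p)    ×   R   ×   N   ×   N   ×   R   ×

(R: 2 is a residue; N: 2 is a non-residue; ×: no odd prime lies in this class.)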

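Proof. Consider the $\frac{p-1}{2}$ congruences mod $p$:
$$\begin{aligned} p - 1 &\equiv 1(-1)^1 \\ 2 &\equiv 2(-1)^2 \\ p - 3 &\equiv 3(-1)^3 \\ &\;\;\vdots \\ r &\equiv \frac{p-1}{2}(-1)^{\frac{p-1}{2}} \end{aligned}$$
where $r$ is whichever of $\frac{p-1}{2}$ and $p - \frac{p-1}{2}$ is even. Multiply these together and note that each integer on the LHS is even, since $p$ is odd:
$$2 \cdot 4 \cdot 6 \cdots (p - 1) \equiv \left(\frac{p-1}{2}\right)! \, (-1)^{1 + 2 + 3 + \cdots + \frac{p-1}{2}} \pmod p$$
$$\Rightarrow\; 2^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)! \equiv \left(\frac{p-1}{2}\right)! \, (-1)^{\frac{p^2-1}{8}} \pmod p.$$
But $p \nmid \left(\frac{p-1}{2}\right)!$, hence, by Euler, Theorem 29,
$$(2 \mid p) \equiv 2^{\frac{p-1}{2}} \equiv (-1)^{\frac{p^2-1}{8}} \pmod p$$
and since LHS and RHS are $\pm 1$ we have
$$(2 \mid p) = (-1)^{\frac{p^2-1}{8}}. \;\square$$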

Euler's theorem is normally too computationally expensive a way to compute $(n \mid p)$. Gauss' lemma and the Reciprocity theorem, proved below, both give better ways to evaluate this function.

Note: $f : \mathbb{Z} \to \{-1, 0, 1\} \subset S' \cup \{0\} = \{z \in \mathbb{C} : |z| = 1\} \cup \{0\}$ is an example of a so-called character, an extension of the character $\chi : \mathbb{Z}/p\mathbb{Z} \to S'$, $\chi([n]_p) = (n \mid p)$.

Theorem 30 (Gauss) Let $p \nmid n$ and consider the residues mod $p$ of the $\frac{p-1}{2}$ multiples of $n$, $M = \{n, 2n, 3n, \ldots, \frac{p-1}{2} n\}$, taking the least positive residue representatives, i.e. those which lie in $\{1, \ldots, p - 1\}$. If $m$ is the number of these which exceed $\frac{p}{2}$, then $(n \mid p) = (-1)^m$.

Proof. If $p \mid in - jn$ with $1 \le i, j \le \frac{p-1}{2}$, then $p \mid (i - j)n$ but $p \nmid n$ ⇒ $i = j$. Hence the numbers in $M$ are incongruent mod $p$. Consider their least positive residues and put them in two disjoint sets $A = \{a_1, \ldots, a_k\}$ and $B = \{b_1, \ldots, b_m\}$ where $a_i \equiv tn \pmod p$, $1 \le t \le \frac{p-1}{2}$, $0 < a_i < \frac{p}{2}$ and $b_i \equiv sn \pmod p$, $1 \le s \le \frac{p-1}{2}$, $\frac{p}{2} < b_i < p$ (1).

Since $A \cap B = \emptyset$, $m + k = \frac{p-1}{2}$. Let $c_i = p - b_i$, $1 \le i \le m$, and $C = \{c_1, \ldots, c_m\}$. Now $0 < c_i < \frac{p}{2}$ by (1).

Claim $A \cap C = \emptyset$: If $c_i = a_j$ ⇒ $p - b_i = a_j$ ⇒ $a_j + b_i = p \equiv 0 \pmod p$ ∴ $tn + sn = (t + s)n \equiv 0 \pmod p$ for some $s$ and $t$ with $1 \le s, t < \frac{p}{2}$. But this is impossible since $2 \le s + t < p$ ⇒ $p \nmid s + t$. Hence $A \cap C = \emptyset$.

Hence $A \cup C$ consists of $m + k = \frac{p-1}{2}$ distinct integers in $[1, \frac{p-1}{2}]$, so
$$A \cup C = \{a_1, \ldots, a_k, c_1, \ldots, c_m\} = \left\{1, 2, \ldots, \frac{p-1}{2}\right\}.$$

Now form the product of all of the elements in $A \cup C$:
$$a_1 a_2 \cdots a_k c_1 c_2 \cdots c_m = \left(\frac{p-1}{2}\right)!$$

But $c_i = p - b_i$ so
$$\begin{aligned} \left(\frac{p-1}{2}\right)! &= a_1 \cdots a_k (p - b_1) \cdots (p - b_m) \\ &\equiv (-1)^m a_1 \cdots a_k b_1 \cdots b_m \pmod p \\ &\equiv (-1)^m n(2n)(3n) \cdots \left(\frac{p-1}{2} n\right) \pmod p \\ &\equiv (-1)^m n^{\frac{p-1}{2}} \left(\frac{p-1}{2}\right)! \pmod p \end{aligned}$$
$$\Rightarrow\; n^{\frac{p-1}{2}} \equiv (-1)^m \pmod p$$
and $(n \mid p) = (-1)^m$ follows by Theorem 29. □
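Gauss' lemma translates directly into a counting procedure. The sketch below (function names `gauss_m` and `legendre_gauss` are illustrative, and `p` is assumed to be an odd prime not dividing `n`) counts the least positive residues of $n, 2n, \ldots, \frac{p-1}{2}n$ that exceed $\frac{p}{2}$.

```python
def gauss_m(n, p):
    """Number of least positive residues of n, 2n, ..., ((p-1)/2)n exceeding p/2."""
    half = (p - 1) // 2
    return sum(1 for j in range(1, half + 1) if 2 * ((j * n) % p) > p)

def legendre_gauss(n, p):
    """(n | p) = (-1)^m for an odd prime p with p not dividing n (Theorem 30)."""
    return -1 if gauss_m(n, p) % 2 else 1

# p = 11, n = 3: residues of 3, 6, 9, 12, 15 are 3, 6, 9, 1, 4; two exceed 11/2.
print(gauss_m(3, 11), legendre_gauss(3, 11))  # 2 1
```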

Theorem 31 If m is defined as in the above theorem,

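$$m \equiv \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor + (n - 1)\frac{p^2 - 1}{8} \pmod 2$$
so if $n$ is odd:
$$m \equiv \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor \pmod 2.$$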

Note (1) If $n \in \mathbb{N}$, write $n = 2^a n'$ with $a \ge 0$ and $n'$ odd, so
$$(n \mid p) = (2 \mid p)^a (n' \mid p) = (-1)^{\frac{a(p^2-1)}{8}} (n' \mid p).$$

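(2) $(n \mid p) = (-1)^m$, so only the value of $m \pmod 2$ (its parity) is needed to compute the Legendre symbol.

Proof. The number $m$ is the number of least positive residues of $n, 2n, \ldots, \frac{p-1}{2} n$ exceeding $\frac{p}{2}$. Let $jn$ be one of these multiples. Then
$$\frac{jn}{p} = \left\lfloor \frac{jn}{p} \right\rfloor + \left\{ \frac{jn}{p} \right\} \quad \text{where } 0 < \left\{ \frac{jn}{p} \right\} < 1,$$
so
$$jn = p \left\lfloor \frac{jn}{p} \right\rfloor + p \left\{ \frac{jn}{p} \right\} = p \left\lfloor \frac{jn}{p} \right\rfloor + r_j \quad \text{where } 0 < r_j < p.$$

The number $r_j$ is the least positive residue of $jn$: $r_j = jn - p \left\lfloor \frac{jn}{p} \right\rfloor$ (1). Using the same notation as in the previous theorem,
$$\{r_1, \ldots, r_{\frac{p-1}{2}}\} = \{a_1, \ldots, a_k, b_1, \ldots, b_m\}$$
$$\left\{1, 2, \ldots, \frac{p-1}{2}\right\} = \{a_1, \ldots, a_k, c_1, \ldots, c_m\}, \qquad c_i = p - b_i.$$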

Add all of the elements in each set:

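$$\sum_{j=1}^{\frac{p-1}{2}} r_j = \sum_{i=1}^{k} a_i + \sum_{j=1}^{m} b_j \qquad (2)$$
$$\sum_{j=1}^{\frac{p-1}{2}} j = \sum_{i=1}^{k} a_i + \sum_{j=1}^{m} c_j = \sum_{i=1}^{k} a_i + mp - \sum_{j=1}^{m} b_j \qquad (3)$$

In (2) use (1) for $r_j$:
$$\sum_{i=1}^{k} a_i + \sum_{j=1}^{m} b_j = n \sum_{j=1}^{\frac{p-1}{2}} j - p \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor \qquad (4)$$

(3) is
$$mp + \sum_{i=1}^{k} a_i - \sum_{j=1}^{m} b_j = \sum_{j=1}^{\frac{p-1}{2}} j \qquad (5)$$

Add (4) and (5), using $\sum_{j=1}^{(p-1)/2} j = \frac{p^2-1}{8}$, to get
$$2 \sum_{i=1}^{k} a_i + mp = (n + 1)\frac{p^2 - 1}{8} - p \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor.$$
But $mp \equiv m$, $-p \equiv 1 \pmod 2$ and $n + 1 \equiv n - 1 \pmod 2$, hence
$$m \equiv (n - 1)\frac{p^2 - 1}{8} + \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jn}{p} \right\rfloor \pmod 2. \;\square$$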

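Theorem 32 (Quadratic Reciprocity Law, Gauss, 1796) If $p$ and $q$ are distinct odd primes, then
$$(p \mid q)(q \mid p) = (-1)^{\frac{(p-1)(q-1)}{4}} \qquad (1)$$

Proof. $(q \mid p) = (-1)^m$ where
$$m \equiv \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jq}{p} \right\rfloor \pmod 2.$$
Similarly $(p \mid q) = (-1)^n$ where
$$n \equiv \sum_{i=1}^{\frac{q-1}{2}} \left\lfloor \frac{ip}{q} \right\rfloor \pmod 2.$$

Hence $(p \mid q)(q \mid p) = (-1)^{m+n}$ and (1) follows from the claimed identity:
$$\sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jq}{p} \right\rfloor + \sum_{i=1}^{\frac{q-1}{2}} \left\lfloor \frac{ip}{q} \right\rfloor = \frac{p-1}{2} \cdot \frac{q-1}{2} \qquad (2)$$
Consider the rectangle with vertices $(0, 0)$, $(\frac{p}{2}, 0)$, $(\frac{p}{2}, \frac{q}{2})$, $(0, \frac{q}{2})$. (In the illustration, $p = 7$, $q = 5$.)

The diagonal $y = \frac{q}{p}x$ does not pass through any lattice point inside the rectangle, because at such a lattice point $(x, y)$ we would have $xq = yp$ ⇒ $p \mid x$ and $q \mid y$, so $x \ge p$, $y \ge q$ and the point $(x, y)$ must be outside the rectangle.

The total number of lattice points inside the rectangle is
$$c = \frac{p-1}{2} \cdot \frac{q-1}{2}.$$
The total number of points in the triangle below the diagonal is
$$b = \sum_{j=1}^{\frac{p-1}{2}} \left\lfloor \frac{jq}{p} \right\rfloor.$$
The number above is
$$a = \sum_{i=1}^{\frac{q-1}{2}} \left\lfloor \frac{ip}{q} \right\rfloor.$$
So $a + b = c$ and so (2) follows, hence (1). □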
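The lattice-point identity (2) is easy to check numerically. A minimal sketch follows; the helper names `floor_sum` and `lattice_identity_holds` are illustrative, and the inputs are assumed to be distinct odd primes.

```python
def floor_sum(a, b):
    """Sum of floor(j*a/b) for j = 1, ..., (b-1)/2."""
    return sum((j * a) // b for j in range(1, (b - 1) // 2 + 1))

def lattice_identity_holds(p, q):
    """Check identity (2): points below plus points above the diagonal fill the rectangle."""
    below = floor_sum(q, p)   # b in the proof
    above = floor_sum(p, q)   # a in the proof
    return below + above == ((p - 1) // 2) * ((q - 1) // 2)

# The illustrated case p = 7, q = 5: 3 points below, 3 above, 6 in total.
print(floor_sum(5, 7), floor_sum(7, 5), lattice_identity_holds(7, 5))  # 3 3 True
```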

Ex (219 | 383). Note 383 ∈ P. Now 219 = 3 · 73 (73 ∈ P) so, by multiplicativity (219 | 383) = (3 | 383) (73 | 383) .

Reciprocity implies that
$$(3 \mid 383)(383 \mid 3) = (-1)^{\frac{(383-1)(3-1)}{4}} = (-1)^{191} = -1$$
so, using periodicity mod 3 ($383 \equiv -1 \pmod 3$),
$$(3 \mid 383) = -(383 \mid 3) = -(-1 \mid 3) = 1.$$

Also
$$(73 \mid 383) = (383 \mid 73)(-1)^{\frac{(383-1)(73-1)}{4}} = (18 \mid 73) = (2 \mid 73)(3 \mid 73)^2 = (-1)^{\frac{73^2-1}{8}} = 1.$$

Hence $(219 \mid 383) = 1 \cdot 1 = 1$ and $x^2 \equiv 219 \pmod{383}$ has a solution.

11 Elliptic Equations and Curves

• Diophantine family with interesting properties. • Used in factoring and encryption. • Curves have their own intrinsic arithmetic.

Ex (see above) $y^2 = x^3 + 7$ has no $\mathbb{Z}$ solutions.

General (Weierstrass) form:

$$y^2 + a_1 xy + a_3 y = x^3 + a_2 x^2 + a_4 x + a_6$$

The substitution $y \to \frac{1}{2}(y - a_1 x - a_3)$ and multiplication by 4 gives
$$y^2 = 4x^3 + (a_1^2 + 4a_2)x^2 + 2(2a_4 + a_1 a_3)x + (a_3^2 + 4a_6).$$

Then $x \to \frac{x - 3(a_1^2 + 4a_2)}{36}$, $y \to \frac{y}{108}$ and multiplying by $108^2$ we get
$$y^2 = x^3 - Ax - B.$$

Proposition If the $a_i \in \mathbb{Z}$, so are $A$ and $B$.

Discriminant

Definition $D = 4A^3 - 27B^2$

If $y^2 = F(x)$ is an elliptic equation, where $F(x)$ is a cubic polynomial with integer coefficients and roots $r_1, r_2, r_3 \in \mathbb{C}$, then the discriminant is
$$D := \prod_{i < j} (r_i - r_j)^2 \in \mathbb{Z}$$
which is nonzero if and only if the roots are distinct.

• If $D = 0$ then $x^3 - Ax - B = (x - 2\alpha)(x + \alpha)^2$, where $\alpha^2 = \frac{A}{3}$ and $\alpha^3 = \frac{B}{2}$.

1. If $D = 0$ and $A \ne 0$ then the curve $y^2 = x^3 - Ax - B$ crosses itself; this is known as a node.

2. If $D = 0$ and $A = B = 0$ we have a cusp.

• If $D \ne 0$ the curve is non-singular, i.e. 'interesting'.

Can use Mathematica to plot elliptic curves, e.g. ContourPlot[y^2 - x^3 + x, {x, -4, 4}, {y, -4, 4}, PlotPoints -> 200, Contours -> {0}, ContourShading -> False]

Line intersection property: each non-vertical line meeting a curve $E(\mathbb{R})$ (the points on the curve with coordinates in $\mathbb{R}$) in two points $P, Q$ meets it in a third point $R$.

Proof. If the line is $y = mx + c$, solve with $y^2 = x^3 - Ax - B$, so $(mx + c)^2 = x^3 - Ax - B$. This cubic in $x$ has two real solutions $x_1, x_2$ if $P = (x_1, y_1)$, $Q = (x_2, y_2)$, and therefore a third, $x_3$, so let $R = (x_3, mx_3 + c)$. □

Now if $P, Q$ have rational coordinates, then $m, c \in \mathbb{Q}$. The cubic is $x^3 - m^2 x^2 - (A + 2mc)x - (B + c^2) = 0$, so $x_1 + x_2 + x_3 = m^2$. Hence if $A, B \in \mathbb{Z}$, $x_1, y_1 \in \mathbb{Q}$ and $x_2, y_2 \in \mathbb{Q}$, then $x_3 \in \mathbb{Q}$ and hence $y_3 \in \mathbb{Q}$.

This simple observation enables us to generate new rational solutions, or points on the curve, out of old.

Vertical lines: We say each vertical line meets the curve again “at ∞” and give this point a label 0 or zero.

Note:

1. $P' = Q'$ is possible but we still get a third point $R$.

2. If $P''Q''$ is vertical, then their x-coordinates are the same, so $P'' = (x_1, y_1)$, $Q'' = (x_2, y_2)$ ⇒ $x_1 = x_2$ so $y_1^2 = x_1^3 - Ax_1 - B = x_2^3 - Ax_2 - B = y_2^2$. Hence $y_2 = -y_1$ and the curve is symmetric about OX.

Definition of the group law: definition of + If $P, Q \in E(\mathbb{R})$ and $R'$ has the same x-coordinate as $R$, the third point on the line through $P$ and $Q$, but with y-coordinate negated, let $R' = P + Q$. This defines +.

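• $P = (x_1, y_1)$, $Q = (x_2, y_2)$, $R' = (x_3, y_3)$; then $x_1 \ne x_2$ ⇒
$$x_3 = \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2, \qquad y_3 = -\frac{y_2 - y_1}{x_2 - x_1} x_3 - \frac{y_1 x_2 - y_2 x_1}{x_2 - x_1} \qquad (A)$$

• $P = Q$ ⇒
$$x_3 = \left(\frac{3x_1^2 - A}{2y_1}\right)^2 - 2x_1, \qquad y_3 = \frac{3x_1^2 - A}{2y_1}(x_1 - x_3) - y_1 \qquad (B)$$

• $P = Q'$ ⇒ $0 = P + Q$.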

Notes:

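1. The proof of these formulas is an exercise in coordinate geometry.

2. $P'' = P$ since $(x_1, -(-y_1)) = (x_1, y_1)$.

3.
$$\begin{aligned} Q + P &= \left( \left(\frac{y_1 - y_2}{x_1 - x_2}\right)^2 - x_1 - x_2, \; -\frac{y_1 - y_2}{x_1 - x_2} x_3 - \frac{y_2 x_1 - y_1 x_2}{x_1 - x_2} \right) \\ &= \left( \left(\frac{y_2 - y_1}{x_2 - x_1}\right)^2 - x_1 - x_2, \; -\frac{y_2 - y_1}{x_2 - x_1} x_3 - \frac{y_1 x_2 - y_2 x_1}{x_2 - x_1} \right) \\ &= P + Q \end{aligned}$$
so + is commutative.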

4. + is also associative (P + Q) + R = P + (Q + R).

5. + takes a pair of points with $\mathbb{Q}$ coordinates to a point with $\mathbb{Q}$ coordinates, i.e. $+ : E(\mathbb{Q}) \times E(\mathbb{Q}) \to E(\mathbb{Q})$.

Write 2P instead of P + P and nP for P + (n − 1)P .

Note: We could have $P \ne 0$ but $nP = 0$ for some $n > 1$.

Ex $y^2 = x^3 - 63x - 162$: $P_1 = (-6, 0)$, $P_2 = (-3, 0)$, $P_3 = (9, 0)$ all satisfy $2P_i = 0$.

Definition If $nP = 0$ with $P \ne 0$ we say $P$ is a torsion point.

Ex $y^2 = x^3 - 2$, $5^2 = 3^3 - 2$ ⇒
$$P = (3, 5) \in E(\mathbb{Z}) \subset E(\mathbb{Q}), \qquad 2P = \left(\frac{129}{100}, -\frac{383}{1000}\right), \qquad 3P = \left(\frac{164323}{29241}, -\frac{66234835}{5000211}\right)$$
etc. and $nP \ne 0$ $\forall n \in \mathbb{N}$.
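The multiples in the last example can be checked with exact rational arithmetic. The sketch below implements formulas (A) and (B) for $y^2 = x^3 - Ax - B$; the function name `ec_add` and the use of `None` for the point 0 at infinity are illustrative choices, not notation from the notes. Note that $B$ does not enter the formulas.

```python
from fractions import Fraction

def ec_add(P, Q, A):
    """P + Q on y^2 = x^3 - A*x - B over Q; None represents the point 0 at infinity."""
    if P is None:
        return Q
    if Q is None:
        return P
    (x1, y1), (x2, y2) = P, Q
    if x1 == x2 and y1 == -y2:
        return None                      # vertical line: P = Q', so P + Q = 0
    if x1 != x2:
        m = Fraction(y2 - y1) / Fraction(x2 - x1)      # chord slope, formula (A)
    else:
        m = Fraction(3 * x1 * x1 - A) / Fraction(2 * y1)  # tangent slope, formula (B)
    x3 = m * m - x1 - x2
    y3 = m * (x1 - x3) - y1              # negated y-coordinate of the third point
    return (x3, y3)

P = (Fraction(3), Fraction(5))           # on y^2 = x^3 - 2, so A = 0
P2 = ec_add(P, P, 0)
P3 = ec_add(P2, P, 0)
print(P2)  # (Fraction(129, 100), Fraction(-383, 1000))
print(P3)  # (Fraction(164323, 29241), Fraction(-66234835, 5000211))
```

With $A = 63$, `ec_add((-6, 0), (-6, 0), 63)` returns `None`, matching $2P_1 = 0$ in the torsion example.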

Ex $y^2 = x^3 - 11$, $P = (3, 4)$, $Q = (15, 58)$ generate an 'independent' set of two dimensions: $nP + mQ = 0 \Rightarrow n = m = 0$, and $E(\mathbb{Q})$ has no torsion points.

Ex (Mestre) $y^2 - 246xy + 36599029y = x^3 - 19339780x - 36239244$ has at least 12 independent points.

Conjecture ∀n ∈ N ∃ an elliptic curve with at least n independent points.

Note: Finding points can be difficult: (Bremner, Cassels) $y^2 = x^3 + 877x$; $P = (0, 0)$, $2P = 0$; the next simplest point is
$$\left( \frac{375494528127162193105504069942092792346201}{6215987776871505425463220780697238044100}, \; \frac{256256267988926809388776834045513089648669153204356603464786949}{490078023219787588959802933995928925096061616470779979261000} \right)$$

Theorem (Mazur) The number, $t$, of torsion points for an elliptic curve $E(\mathbb{Q})$ is one of $\{1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 16\}$. Also, if $A, B \in \mathbb{Z}$, a torsion point has integral coordinates and either $y = 0$ (so $2(x, y) = 0$) or $y^2 \mid \Delta := 4D$.

There are infinitely many non-torsion points if there are any at all.

Definition The rank of a curve $E(\mathbb{Q})$ is an integer $r$ such that there are $r$ independent points on the curve and every point can be expressed as a sum of multiples of these points and some torsion point.

Theorem (Mordell) $0 \le r < \infty$, so each curve has finite rank.

Note:

1. The structure of $E(\mathbb{Q})$ is that of a finitely generated abelian group, $G \cong T \oplus \mathbb{Z}^r$, where the torsion elements $T$ form a subgroup.

2. Finding r is a very difficult problem.

Elliptic curves mod p

We can often get insight into points on a curve with rational coordinates, $E(\mathbb{Q})$, by considering them modulo $p$, where $p$ is a prime. If $p \ne 2$ or $3$, formulas (A) and (B) still work, so they define $+ : E(\mathbb{Z}_p) \times E(\mathbb{Z}_p) \to E(\mathbb{Z}_p)$ with $\mathbb{Z}_p = \{[0], \ldots, [p - 1]\} = GF(p)$ the finite field of order $p$.

Ex

$$y^2 \equiv x^3 - Ax - B \pmod p, \qquad \text{e.g.} \quad y^2 \equiv x^3 + x + 2 \pmod{11}$$

The solutions are $S = \{(1, \pm 2), (2, \pm 1), (4, \pm 2), (5, 0), (6, \pm 2), (7, 0), (8, \pm 4), (9, \pm 5), (10, 0)\}$ and $0 = \infty$, making 16. All points are torsion since they are finite in number.

How many points are there in $E(\mathbb{Z}_p)$? $x$ can have $p$ values, so $x^3 - Ax - B$ takes at most $p$ values. If these were random, we would expect about half to be quadratic residues and half non-residues, the residues giving two possible values of $y$.

Theorem (Hasse) $|\#E(\mathbb{Z}_p) - (p + 1)| < 2\sqrt{p}$
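For small $p$ the points can simply be enumerated. The following sketch (function names `curve_points`, `count_points` and `hasse_holds` are illustrative) lists the affine solutions of $y^2 \equiv x^3 - Ax - B \pmod p$, adds one for the point at infinity, and checks Hasse's bound in the integer form $(\#E - p - 1)^2 < 4p$.

```python
def curve_points(A, B, p):
    """Affine solutions (x, y) of y^2 = x^3 - A*x - B (mod p), by exhaustive search."""
    return [(x, y) for x in range(p) for y in range(p)
            if (y * y - (x ** 3 - A * x - B)) % p == 0]

def count_points(A, B, p):
    """#E(Z_p): affine solutions plus the point 0 at infinity."""
    return len(curve_points(A, B, p)) + 1

def hasse_holds(A, B, p):
    """Check |#E - (p + 1)| < 2*sqrt(p) without floating point."""
    return (count_points(A, B, p) - p - 1) ** 2 < 4 * p

# y^2 = x^3 + x + 2 (mod 11) corresponds to A = -1, B = -2.
print(count_points(-1, -2, 11), hasse_holds(-1, -2, 11))  # 16 True
```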

Proposition $p \equiv 3 \pmod 4$ ⇒ the number of points on $y^2 \equiv x^3 - x \pmod p$ is exactly $p + 1$ (including $\infty$).

Any integral solution of $y^2 = x^3 - Ax - B$ becomes a modular solution of the congruence $y^2 \equiv x^3 - Ax - B \pmod p$.

Warning: Over $\mathbb{Z}$, we may have $D \ne 0$ (a requirement), but $D \equiv 0 \pmod p$ when $p \mid \Delta$, and such a curve would not be elliptic mod $p$. If $\Delta \not\equiv 0 \pmod p$ we have good reduction, so we assume this is so.

$E : y^2 = x^3 - Ax - B$, $A, B \in \mathbb{Z}$, $\Delta = 16(4A^3 - 27B^2) \ne 0$, $D = 4A^3 - 27B^2$

Map

$$\theta : \mathbb{Z} \to \mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p = GF(p) \; (\mathbb{Z}_p \text{ before}), \qquad n \mapsto [n]_p$$

Theorem (Nagell-Lutz) Points P = (x, y) of E(Q) of finite order, other than 0, i.e. the torsion points, have integer coordinates, i.e. are in E(Z), and either y = 0 or y | D.

We map $E(\mathbb{Q})$ to $E(\mathbb{Z}_p)$ through considering $y^2 \equiv x^3 - Ax - B \pmod p$.

Theorem (Reduction Theorem) Let $T \subset E(\mathbb{Q})$ be the subgroup of all points of finite order (the torsion subgroup). If $p \nmid 2D$, $p \in \mathbb{P}$, then reduction mod $p$ is an isomorphism of $T$ onto a subgroup of $E(\mathbb{Z}_p)$.

Theorem (Lagrange) If $S \subset G$ and $S$ is a subgroup of the finite group $G$, then $\#(S) \mid \#(G)$, i.e. the order of $S$ divides the order of $G$.

Corollary If $P \in T$ and the order of $P$ in $E(\mathbb{Q})$ is $m \in \mathbb{N}$ then $m \mid \#E(\mathbb{Z}_p)$ $\forall p \nmid 2D$.

These theorems can be used to determine the points of finite order of an elliptic curve $E(\mathbb{Q})$.

Ex $E : y^2 = x^3 + 3$, $D = -3^5$, so let $p \ge 5$, $p \in \mathbb{P}$. Then $\#E(\mathbb{Z}_5) = 6$, $\#E(\mathbb{Z}_7) = 13$ ⇒ $\#T \mid 6$ and $\#T \mid 13$ ⇒ $\#T = 1$ ⇒ $T = \{0\}$, so $E$ has no (finite) points of finite order. Note $(1, 2) \in E(\mathbb{Q})$ since $2^2 = 1^3 + 3$, so $(1, 2)$ has infinite order and $E(\mathbb{Q})$ has an infinite number of points.

Ex $E : y^2 = x^3 - 43x + 166$, $D = -2^{15} \cdot 13$. Exploring small integers $(x, y) \in \mathbb{Z}^2$ we find $P = (3, 8) \in E$. Using the point doubling formula above, the x-coordinates of $2P, 4P, \ldots$ are $x(P) = 3$, $x(2P) = -5$, $x(4P) = 11$, $x(8P) = 3$, so $x(P) = x(8P)$ ⇒ $8P = \pm P$, so $P$ is a point of finite order.

Since $3 \nmid 2D$, by the Reduction Theorem, $T$ is isomorphic to a subgroup of $E(\mathbb{Z}_3)$. $\#E(\mathbb{Z}_3) = 7$ so $\#(T) = 1$ or $7$. But $0 \in T$ and so is $P = (3, 8)$, so $\#(T) = 7$. The only abelian group of order 7 is $\mathbb{Z}_7$, a cyclic group generated by $P$ (which must be of order 7 since its order divides 7). Computing $\{0, P, 2P, 3P, 4P, 5P, 6P\}$ we get
$$T = \{0, (3, \pm 8), (-5, \pm 16), (11, \pm 32)\}.$$

Congruent Number Problem

Find a simple test to determine whether or not $n \in \mathbb{N}$ is the area of a right triangle, all of whose sides are of rational length.

Ex

$6 = \frac{1}{2} \cdot 3 \cdot 4$ so 6 is congruent.

Ex Fermat $n = 1$ is not congruent. ($X^4 + Y^4 \ne Z^4$ $\forall X, Y, Z \in \mathbb{Z}$).

Ex Euler $n = 7$ is congruent.

$\{1, 2, 3, 4\}$ are not congruent but $\{5, 6, 7\}$ are congruent.

Problem: Find a nice criterion to check $n$.

Theorem (Tunnell, 1983) Let $n$ be an odd square-free natural number. Then if $n$ is congruent, the number of triples $(x, y, z) \in \mathbb{Z}^3$ satisfying $2x^2 + y^2 + 8z^2 = n$ is twice the number of triples $(x, y, z)$ satisfying $2x^2 + y^2 + 32z^2 = n$.

Let $n$ be square-free and let $X, Y, Z$ ($X < Y < Z$) be sides of a right triangle with area $n$. The number $n \in \mathbb{N}$ is fixed. So $n = \frac{1}{2}XY$, $X^2 + Y^2 = Z^2$.

Proposition There is a 1-1 correspondence between the right triangles given above and rational numbers $x$ for which $x$, $x + n$, $x - n$ are each the square of a rational number. The correspondence is
$$(X, Y, Z) \mapsto x = \left(\frac{Z}{2}\right)^2$$
$$x \mapsto X = \sqrt{x + n} - \sqrt{x - n}, \quad Y = \sqrt{x + n} + \sqrt{x - n}, \quad Z = 2\sqrt{x}$$

In particular, $n$ is congruent ⇔ $\exists x \in \mathbb{Q}^+$ such that $x$, $x + n$, $x - n$ are squares of rational numbers.

Proof. (⇒) Let $X, Y, Z \in \mathbb{Q}^+$ be a triple with $n = \frac{1}{2}XY$, $X^2 + Y^2 = Z^2$. Then $X^2 + Y^2 = Z^2$ and $2XY = 4n$ ⇒ $(X \pm Y)^2 = Z^2 \pm 4n$ ⇒
$$\left(\frac{X \pm Y}{2}\right)^2 = \left(\frac{Z}{2}\right)^2 \pm n = x \pm n \qquad (1)$$
if $x = \left(\frac{Z}{2}\right)^2$. So $x$, $x \pm n$ are squares of rational numbers.

(⇐) Given $x$, $x \pm n$ being squares, then
$$X = \sqrt{x + n} - \sqrt{x - n}, \quad Y = \sqrt{x + n} + \sqrt{x - n}, \quad Z = 2\sqrt{x}$$
satisfy $X < Y < Z$ and $X, Y, Z \in \mathbb{Q}^+$. Finally
$$XY = (\sqrt{x + n} - \sqrt{x - n})(\sqrt{x + n} + \sqrt{x - n}) = (x + n) - (x - n) = 2n$$
and
$$X^2 + Y^2 = (x + n) + (x - n) - 2\sqrt{(x + n)(x - n)} + (x + n) + (x - n) + 2\sqrt{(x + n)(x - n)} = 4x = Z^2.$$



Let $n$ be a congruent number. By the above equation (1),
$$\left(\frac{X \pm Y}{2}\right)^2 = \left(\frac{Z}{2}\right)^2 \pm n \quad \text{if } n = \frac{1}{2}XY.$$

Multiply these two equations together:
$$\left(\frac{X^2 - Y^2}{4}\right)^2 = \left(\frac{Z}{2}\right)^4 - n^2$$
so $v^2 = u^4 - n^2$ has a rational solution $v = \frac{X^2 - Y^2}{4}$, $u = \frac{Z}{2}$. Now multiply by $u^2$: $u^6 - n^2 u^2 = (uv)^2$. Let $x = u^2 = \left(\frac{Z}{2}\right)^2$ (as before) and $y = uv = (X^2 - Y^2)Z/8$ ⇒ a pair $(x, y) \in \mathbb{Q}^2$ satisfying $y^2 = x^3 - n^2 x$, an elliptic equation $y^2 = x(x - n)(x + n)$.
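The passage from a rational right triangle to a point on $y^2 = x^3 - n^2 x$ can be checked with exact fractions. This is a small sketch of the map just derived; the function name `triangle_to_point` is illustrative.

```python
from fractions import Fraction

def triangle_to_point(X, Y, Z):
    """Map a right triangle (X, Y, Z) to (n, x, y) with y^2 = x^3 - n^2*x."""
    X, Y, Z = Fraction(X), Fraction(Y), Fraction(Z)
    if X * X + Y * Y != Z * Z:
        raise ValueError("not a right triangle")
    n = X * Y / 2                      # area
    x = (Z / 2) ** 2                   # x = u^2 with u = Z/2
    y = (X * X - Y * Y) * Z / 8        # y = u*v with v = (X^2 - Y^2)/4
    return n, x, y

n, x, y = triangle_to_point(3, 4, 5)
print(n, x, y)                         # 6 25/4 -35/8
print(y * y == x ** 3 - n * n * x)     # True
```

The triangle with sides $\frac{3}{2}, \frac{20}{3}, \frac{41}{6}$ has area 5 and can be passed to the same function to obtain a rational point on $y^2 = x^3 - 25x$.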

Hence if $n$ is congruent, the curve $y^2 = x^3 - n^2 x$ has a nontrivial rational point. The converse, that any point $(x, y) \in \mathbb{Q}^2$ must come from such a triangle, is false in general. We need extra conditions equivalent to $\exists Q \in E_n(\mathbb{Q})$ such that $(x, y) = P = 2Q$, i.e. $P$ is a (rational) point which is double a rational point.

Theorem 32B Let $(x, y) \in \mathbb{Q}^2$ be on $y^2 = x^3 - n^2 x$. Let $x$ satisfy

(i) it is the square of a rational number, (ii) its denominator is even, (iii) its numerator is coprime with $n$.

Then there is a right triangle with rational sides and area $n$ under the correspondence of the above Proposition.

Proof. Let $u = \sqrt{x} \in \mathbb{Q}^+$ (by (i)) and let $v = \frac{y}{u} \in \mathbb{Q}$. Since $(x, y)$ is on $E_n(\mathbb{Q})$: $v^2 = \frac{y^2}{x} = x^2 - n^2$ ⇒ $v^2 + n^2 = x^2$ (1). Let $t \in \mathbb{N}$ be the denominator of $u$, i.e. the smallest natural number so that $tu \in \mathbb{Z}$. By (ii) $t$ is even. Because $n \in \mathbb{N}$, the denominators of $v^2$ and $x^2$ are the same by (1), namely $t^4$. Hence $(t^2 v)^2 + (t^2 n)^2 = (t^2 x)^2$ is a primitive Pythagorean triple with $t^2 n$ even. (Primitive through (iii).) Hence $\exists a, b \in \mathbb{Z}$ such that $t^2 n = 2ab$, $t^2 v = a^2 - b^2$, $t^2 x = a^2 + b^2$. Then the right triangle with sides $\frac{2a}{t}$, $\frac{2b}{t}$, $2u$ has area $\frac{1}{2} \cdot \frac{2a}{t} \cdot \frac{2b}{t} = \frac{2ab}{t^2} = n$. Finally, the image of this triangle with $X = \frac{2a}{t}$, $Y = \frac{2b}{t}$, $Z = 2u$ is $x = \left(\frac{Z}{2}\right)^2$ as required. □

Ex (i) and (ii) alone are not sufficient: $n = 5$, $x = \frac{25}{4}$, $y = \frac{75}{8}$ ⇒ $X = \sqrt{x + n} - \sqrt{x - n} = \sqrt{5} \notin \mathbb{Q}$.

Back to Tunnell’s theorem n odd and square-free,

$$n \text{ congruent (A)} \;\Rightarrow\; \#\{(x, y, z) \in \mathbb{Z}^3 : 2x^2 + y^2 + 8z^2 = n\} = 2\,\#\{(x, y, z) \in \mathbb{Z}^3 : 2x^2 + y^2 + 32z^2 = n\} \text{ (B)}$$
i.e. (A) ⇒ (B).

Then, subject to an unproved conjecture, (B) ⇒ (A). We can confidently use not(B) ⇒ not(A).

Ex $\#\{(x, y, z) : 2x^2 + y^2 + 8z^2 = n\} = \#\{(x, y, z) : 2x^2 + y^2 + 32z^2 = n\}$ if $n < 8$, since then $z = 0$ in both. So none of the odd numbers $\{1, 3, 5, 7\}$ can be congruent unless the size of each set is 0 ($2 \cdot 0 = 0$). But $x^2, y^2 \equiv 0, 1$ or $4 \pmod 8$ ⇒ $2x^2 + y^2 + 8z^2 \not\equiv 5, 7 \pmod 8$. So e.g. if $n = 5$ or $7$ both of the sets of triples are $\emptyset$.

Ex The first congruent number n ≡ 1, 3 (mod 8) is n = 41:

If (B) ⇒ (A) is true, the above argument would imply all of the following (odd, square-free) numbers are congruent, through $2 \cdot 0 = 0$: $\{5, 7, 13, 15, 21, 23, 29, 31, 37, 39, 47\}$.
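Tunnell's counts are small enough to compute by direct enumeration. The sketch below counts the integer triples for both quadratic forms and tests condition (B); the function names `count_triples` and `satisfies_B` are illustrative, and $n$ is assumed to be odd and square-free. A `False` result shows $n$ is not congruent; a `True` result only shows congruence subject to the unproved conjecture mentioned above.

```python
from math import isqrt

def count_triples(n, c):
    """Number of (x, y, z) in Z^3 with 2*x^2 + y^2 + c*z^2 == n."""
    count = 0
    z_max = isqrt(n // c)
    for z in range(-z_max, z_max + 1):
        rest = n - c * z * z
        x_max = isqrt(rest // 2)
        for x in range(-x_max, x_max + 1):
            r = rest - 2 * x * x
            y = isqrt(r)
            if y * y == r:
                count += 1 if y == 0 else 2   # y and -y
    return count

def satisfies_B(n):
    """Condition (B) of Tunnell's theorem for odd square-free n."""
    return count_triples(n, 8) == 2 * count_triples(n, 32)

print(count_triples(41, 8), count_triples(41, 32))   # 32 16
print([n for n in (1, 3, 5, 7, 11, 41) if satisfies_B(n)])  # [5, 7, 41]
```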

12 Numbers Rational and Irrational

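If $\alpha \in \mathbb{R}$ we say $\alpha \in \mathbb{Q}$ if $\alpha = \frac{m}{n}$, $m, n \in \mathbb{Z}$, $n \ne 0$.

Proposition $\sqrt{2} \notin \mathbb{Q}$.
Proof. Assume $\sqrt{2} \in \mathbb{Q}$
⇒ $\sqrt{2} = \frac{a}{b}$, $(a, b) = 1$ (???)
⇒ $a = \sqrt{2}\, b$ ⇒ $a^2 = 2b^2$ ⇒ $2 \mid a^2$ ⇒ $2 \mid a$.
So $a = 2c$ and $4c^2 = 2b^2$ ⇒ $2c^2 = b^2$ ⇒ $2 \mid b^2$ ⇒ $2 \mid b$.
Hence $2 \mid a$ and $2 \mid b$ so $2 \mid (a, b)$ so $(a, b) \ne 1$ (!!!). □

We say $\sqrt{2}$ is irrational or $\sqrt{2} \in \mathbb{I} = \mathbb{R} \setminus \mathbb{Q}$.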

We can generalise the above proposition to get a much wider family of irrational numbers:

Theorem 33 If $x \in \mathbb{R}$ satisfies the equation
$$x^n + c_1 x^{n-1} + \cdots + c_n = 0$$
where $c_i \in \mathbb{Z}$, then $x$ is either an integer or an irrational number.

Proof. Let $x \in \mathbb{Q}$, i.e. $x = \frac{a}{b}$, $b > 0$, $(a, b) = 1$. Then
$$a^n = -b(c_1 a^{n-1} + c_2 a^{n-2} b + \cdots + c_n b^{n-1}).$$
If $b > 1$, then for a prime $p$, $p \mid b$ ⇒ $p \mid a^n$ ⇒ $p \mid a$, but then $p \mid (a, b)$ (!!!). Hence $b$ has no prime divisors, so $b = 1$. □

Corollary If $m \in \mathbb{N}$ is not an $n$th power then $\sqrt[n]{m} = m^{1/n} \in \mathbb{I}$, since $\alpha = m^{1/n}$ satisfies $x^n - m = 0$.

Trigonometric function values and π

Lemma 1 Let $g \in \mathbb{Z}[x]$ (i.e. a polynomial with integral coefficients). Let $h(x) = \frac{x^n g(x)}{n!}$. If $j \ne n$, $h^{(j)}(0)$ is an integer divisible by $(n + 1)$. If $g(0) = 0$, $h^{(n)}(0)$ is an integer divisible by $(n + 1)$.
Proof. Let
$$\frac{1}{n!} x^n g(x) = \frac{1}{n!}(c_n x^n + c_{n+1} x^{n+1} + \cdots + c_j x^j + \cdots)$$
where $c_0, \ldots, c_{n-1} = 0$ and the $c_i$ are integers.

Then the $j$'th derivative is
$$h^{(j)}(0) = \frac{c_j \, j!}{n!}.$$

If $j < n$, $c_j = 0$.

If $j > n$, $n + 1 \mid h^{(j)}(0)$ since $\frac{j!}{n!} = (n + 1)(n + 2) \cdots (j)$.

If $j = n$ ⇒ $h^{(n)}(0) = c_n$, but $g(0) = 0$ ⇒ $x^n g(x) = x^n[g_1 x + g_2 x^2 + \cdots] = c_{n+1} x^{n+1} + \cdots$, so $c_n = 0$ ⇒ $n + 1 \mid h^{(n)}(0)$. □

Lemma 2 If $f(x)$ is a polynomial in $(r - x)^2$, then, for any odd positive integer $j$, $f^{(j)}(r) = 0$, i.e. $f'(r) = 0$, $f'''(r) = 0$, $\ldots$.
Proof. Let $j \ge 0$ be an integer and $n \in \mathbb{N}$ a positive integer.
$$\begin{aligned} f(x) = a_0 &\implies f^{(2j+1)}(x) = 0 \implies f^{(2j+1)}(r) = 0 \\ f(x) = (r - x)^2 &\implies f'(x) = -2(r - x) \implies f'(r) = 0 \\ f(x) = (r - x)^{2n},\; 2j + 1 > 2n &\implies f^{(2j+1)}(x) = 0 \implies f^{(2j+1)}(r) = 0 \\ f(x) = (r - x)^{2n},\; 2j + 1 < 2n &\implies f^{(2j+1)}(x) = (-1)^{2j+1} \frac{(2n)!}{(2n - 2j - 1)!} (r - x)^{2n-2j-1} \implies f^{(2j+1)}(r) = 0. \end{aligned}$$

Therefore if $f(x)$ is a sum of even powers of $(r - x)$, all of its odd derivatives vanish at $x = r$. □

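Theorem 34 $\pi^2$ is irrational, i.e. $\pi^2 \in \mathbb{I}$.
Proof. Let $f(x) = \frac{x^n (1 - x)^n}{n!}$ where $n \in \mathbb{N}$.

By Lemma 1 above, $\forall j$, $f^{(j)}(0) \in \mathbb{Z}$, and $f(x) = f(1 - x)$ ⇒ $f^{(j)}(1) \in \mathbb{Z}$. Since $0 < x < 1$ ⇒ $0 < x^n < 1$ and $0 < 1 - x < 1$ ⇒ $0 < (1 - x)^n < 1$, we have $0 < f(x) < \frac{1}{n!}$ (1).

Let $\pi^2 = \frac{a}{b}$, $a \ge 1$, $b \ge 1$, $a, b \in \mathbb{N}$ (???). Let
$$F(x) = b^n[\pi^{2n} f^{(0)}(x) - \pi^{2n-2} f^{(2)}(x) + \pi^{2n-4} f^{(4)}(x) - \cdots + (-1)^n f^{(2n)}(x)].$$

So $F(0) \in \mathbb{Z}$, $F(1) \in \mathbb{Z}$. Now
$$\begin{aligned} \frac{d}{dx}\left[ F'(x) \sin \pi x - \pi F(x) \cos \pi x \right] &= \left( F^{(2)}(x) + \pi^2 F(x) \right) \sin \pi x \\ &= b^n \pi^{2n+2} f(x) \sin \pi x \\ &= \pi^2 a^n f(x) \sin \pi x. \end{aligned}$$

So
$$\pi a^n \int_0^1 f(x) \sin \pi x \, dx = \left[ \frac{F'(x) \sin \pi x}{\pi} - F(x) \cos \pi x \right]_0^1 = F(1) + F(0) \in \mathbb{Z}.$$
But by (1),
$$0 < \pi a^n \int_0^1 f(x) \sin \pi x \, dx < \frac{\pi a^n}{n!} < 1 \quad \text{for } n \ge n_0$$
which is a contradiction. Hence $\pi^2$ is irrational. □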

Corollary $\pi$ is irrational: if not, $\pi^2$ would be rational.

Note: With a similar, but more complex proof, we can show r ∈ Q \{0} ⇒ cos r is irrational.

Corollary 1 to the note π is irrational, since if π ∈ Q, cos π ∈ I but cos π = −1.

Corollary 2 All trigonometric functions are irrational at non-zero rational values of their arguments.
Proof. $r \in \mathbb{Q} \setminus \{0\}$ and $\sin r \in \mathbb{Q}$ ⇒ $\cos 2r = 1 - 2\sin^2 r \in \mathbb{Q}$, which is false. Similarly, $\tan r \in \mathbb{Q}$ ⇒ $\cos 2r = \frac{1 - \tan^2 r}{1 + \tan^2 r} \in \mathbb{Q}$. □

Corollary 3 Any non-zero value of an inverse trigonometric function is irrational at rational values of the argument.

Proof. Let $r \in \mathbb{Q}$ and $\arccos r = \cos^{-1} r = s$. Suppose $s \in \mathbb{Q}$, $s \ne 0$ ⇒ $\cos s = r \in \mathbb{Q}$, which is false. □

Exponential, hyperbolic and logarithmic functions

Note: $e^0 = 1 \in \mathbb{Q}$ and $\sinh 0 = 0$, $\cosh 0 = 1$, but these are the only rational values at rational arguments. The proof is similar to Theorem 34, based on cosh:

Corollary 4 $e^r \in \mathbb{Q}$ ⇒ $e^{-r} = \frac{1}{e^r} \in \mathbb{Q}$ ⇒ $\frac{e^r + e^{-r}}{2} \in \mathbb{Q}$, but this is not possible if $r \in \mathbb{Q} \setminus \{0\}$.

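Theorem 35 $e$ is irrational, $e \in \mathbb{I}$.
Proof. Claim: $\forall n \in \mathbb{N}$
$$0 < e - \sum_{j=0}^{n} \frac{1}{j!} < \frac{1}{n \cdot n!} \qquad (1)$$
Represent $e$ by an infinite series $e = 1 + \frac{1}{1!} + \frac{1}{2!} + \cdots + \frac{1}{j!} + \cdots$ so
$$e - \sum_{j=0}^{n} \frac{1}{j!} = \sum_{j=n+1}^{\infty} \frac{1}{j!} > 0.$$

Also
$$\begin{aligned} e - \sum_{j=0}^{n} \frac{1}{j!} &= \frac{1}{(n+1)!} + \frac{1}{(n+2)!} + \cdots \\ &= \frac{1}{n!}\left( \frac{1}{n+1} + \frac{1}{(n+1)(n+2)} + \frac{1}{(n+1)(n+2)(n+3)} + \cdots \right) \\ &< \frac{1}{n!}\left( \frac{1}{n+1} + \frac{1}{(n+1)^2} + \frac{1}{(n+1)^3} + \cdots \right) \\ &= \frac{1}{n!}\left( \frac{1/(n+1)}{1 - 1/(n+1)} \right) \quad \text{(sum of a geometric series, } r = \tfrac{1}{n+1}\text{)} \\ &= \frac{1}{n!} \cdot \frac{1}{n} \end{aligned}$$
which proves the claim.

Now let $e = \frac{m}{n}$, $m, n \in \mathbb{N}$, $(m, n) = 1$ (???), and assume $n \ne 1$. Let
$$\eta = n!\left( e - \sum_{j=0}^{n} \frac{1}{j!} \right).$$
By (1)
$$0 < \eta < n! \cdot \frac{1}{n \cdot n!} = \frac{1}{n}.$$
But
$$\eta = n!\left( \frac{m}{n} - 1 - \frac{1}{1!} - \frac{1}{2!} - \cdots - \frac{1}{n!} \right) \in \mathbb{Z} \quad (!!!)$$

Hence $e$ is irrational. □

Corollary $\sqrt{e}$ is irrational, since otherwise $e = (\sqrt{e})^2$ would be in $\mathbb{Q}$.

Question: $e$ seems to be 'more' irrational than $\sqrt{2}$. We will explore families of irrational numbers below.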
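The bound (1) used in the proof of Theorem 35 can be observed numerically for small $n$. The sketch below is only an illustration: it uses the floating-point constant `math.e` as a stand-in for $e$, which is accurate enough for $n \le 10$ but is not a proof. The function names `partial_sum` and `tail` are illustrative.

```python
import math
from fractions import Fraction

def partial_sum(n):
    """Exact value of 1/0! + 1/1! + ... + 1/n!."""
    return sum(Fraction(1, math.factorial(j)) for j in range(n + 1))

def tail(n):
    """Approximation to e - partial_sum(n), using the float math.e for e."""
    return Fraction(math.e) - partial_sum(n)

for n in range(1, 11):
    bound = Fraction(1, n * math.factorial(n))
    print(n, float(tail(n)), float(bound), 0 < tail(n) < bound)
```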

Let S ⊂ R be a subset. We say S has measure zero if it is possible to cover S with a finite or countable set of intervals of arbitrarily small total length. Write µ(S) = 0.

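Ex $S = \mathbb{N}$:
$$1 \in \left(1 - \frac{\varepsilon}{2}, 1 + \frac{\varepsilon}{2}\right), \quad 2 \in \left(2 - \frac{\varepsilon}{2^2}, 2 + \frac{\varepsilon}{2^2}\right), \quad \ldots, \quad j \in \left(j - \frac{\varepsilon}{2^j}, j + \frac{\varepsilon}{2^j}\right) = I_j$$
So
$$\mathbb{N} \subset \bigcup_{j=1}^{\infty} I_j$$
and
$$\ell(I_j) = \text{length of } I_j = \left(j + \frac{\varepsilon}{2^j}\right) - \left(j - \frac{\varepsilon}{2^j}\right) = \frac{2\varepsilon}{2^j}.$$

Then
$$\sum_{j=1}^{\infty} \ell(I_j) = 2\varepsilon \sum_{j=1}^{\infty} \frac{1}{2^j} = 2\varepsilon$$
which can be made arbitrarily small by choice of $\varepsilon > 0$. Hence $\mu(\mathbb{N}) = 0$. We can replace $\mathbb{N}$ by any countable set $A = \{a_n : n \in \mathbb{N}\} \subset \mathbb{R}$ by defining $I_j = \left(a_j - \frac{\varepsilon}{2^j}, a_j + \frac{\varepsilon}{2^j}\right)$, since $\ell(I_j) = \frac{2\varepsilon}{2^j}$.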

Definition A property of real numbers is said to hold “almost everywhere” or to almost all numbers, if the set of numbers which do not have the property has measure zero.

Ex µ(Q) = 0 since Q is countable. Hence almost all numbers are irrational.

Note We can count the numbers in Q+ via listing them and then counting the diagonals, skipping any already counted.

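[Diagram: the positive fractions arranged in an infinite array, with $\frac{1}{1}, \frac{1}{2}, \frac{1}{3}, \ldots$ in the first row, $\frac{2}{1}, \frac{2}{2}, \frac{2}{3}, \ldots$ in the second, $\frac{3}{1}, \frac{3}{2}, \frac{3}{3}, \ldots$ in the third, and so on. Arrows trace successive diagonals, labelling the entries $r_1, r_2, r_3, r_4, r_5, \ldots$ and skipping repeated values such as $\frac{2}{2}$.]

⇒ $\mathbb{Q}^+ = \{r_n : n \in \mathbb{N}\}$.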

Since `([0, 1]) = 1 > 0, [0, 1] and (hence) R are not of measure 0. Hence, since the union of any two countable sets is countable, the irrational numbers I are not countable. Proof. A = {an : n ∈ N},B = {bn : n ∈ N} ⇒ A ∪ B = {cn : c2n = an, c2n−1 = bn, n = 1, 2, 3,...} so A ∪ B is countable. 

Now let S ⊂ R. We say S is dense in R if ∀α < β ∃x ∈ S with α < x < β.

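Archimedean Axiom (AA) $\forall \varepsilon > 0$ $\exists n \in \mathbb{N}$ such that $0 < \frac{1}{n} < \varepsilon$.

Proposition $\mathbb{Q}$ is dense in $\mathbb{R}$.
Proof. Let $\alpha < \beta$. By AA $\exists n \in \mathbb{N}$ such that $0 < \frac{1}{n} < \beta - \alpha$. Let $m \in \mathbb{Z}$ satisfy $m < n\beta \le m + 1$. Then $\alpha < \beta - \frac{1}{n} \le \frac{m+1}{n} - \frac{1}{n} = \frac{m}{n}$ and $\frac{m}{n} < \beta$. Hence $\alpha < \frac{m}{n} < \beta$ and we can let $x = \frac{m}{n}$. □

Proposition $\mathbb{I}$ is dense in $\mathbb{R}$.
Proof. Let $\alpha, \beta \in \mathbb{R}$ have $\alpha < \beta$. Let $\alpha < \frac{m}{n} < \beta$ as above, and using AA choose $k \in \mathbb{N}$ so
$$0 < \frac{1}{k} < \frac{\beta - \frac{m}{n}}{\sqrt{2}}.$$
Then $\alpha < \frac{m}{n} < \frac{m}{n} + \frac{\sqrt{2}}{k} < \beta$ and $x = \frac{m}{n} + \frac{\sqrt{2}}{k} \in \mathbb{I}$. □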

Definition A number is algebraic if it satisfies an equation

$$x^n + a_1 x^{n-1} + a_2 x^{n-2} + \cdots + a_n = 0$$
with $a_i \in \mathbb{Q}$.

Ex $\sqrt{2}$ satisfies $x^2 - 2 = 0$.

The unique polynomial with leading coefficient 1 (called monic) in Q[x] of minimal degree which has a given algebraic number α as a root is called the minimal polynomial of α, and the degree of this polynomial is called the degree of α.

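The set of all algebraic numbers is called $\mathbb{A} \subset \mathbb{R}$.

Proposition $\mathbb{A}$ is countable, so $\mu(\mathbb{A}) = 0$.
Proof. If $\mathbb{A}_n$ is the set of algebraic numbers of degree $n$ for $n = 1, 2, 3, \ldots$ then
$$\mathbb{A} = \bigcup_{n=1}^{\infty} \mathbb{A}_n.$$

There are a countable number of polynomials of degree $n$ with $\mathbb{Q}$ coefficients, since $p(x) = x^n + a_1 x^{n-1} + \cdots + a_n \leftrightarrow (a_1, \ldots, a_n) \in \mathbb{Q}^n$ and the latter is a countable set.

But each polynomial has at most $n$ roots in $\mathbb{R}$ ⇒ $\mathbb{A}_n$ is countable. To complete the proof we need that a countable union of countable sets is countable. To see this, use the diagonal counting trick: write $A_1 = \{a_{11}, a_{12}, a_{13}, \ldots\}$, $A_2 = \{a_{21}, a_{22}, a_{23}, \ldots\}$, $A_3 = \{a_{31}, a_{32}, a_{33}, \ldots\}$, $\ldots$ as the rows of an array and count along successive diagonals. □

Since $\mu(\mathbb{A}) = 0$, almost all numbers are not algebraic. We call these numbers transcendental and the set of all such numbers $\mathbb{T} = \mathbb{R} \setminus \mathbb{A}$.

Ex $\frac{2^{1/3} + \sqrt{2}}{\sqrt{3}} \in \mathbb{A}$, $\pi$ and $e \in \mathbb{T}$. The former is not difficult, but $\pi$ and $e$ are both very difficult.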

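Both $\mathbb{Q}$ and $\mathbb{I}$ (and $\mathbb{T}$) are dense in $\mathbb{R}$. This implies each real number can be expressed as the limit of rational numbers: let $\alpha \in \mathbb{R}$; then given $n \in \mathbb{N}$ $\exists r_n \in \mathbb{Q}$ with $\alpha - \frac{1}{n} < r_n < \alpha + \frac{1}{n}$, so $|\alpha - r_n| < \frac{1}{n}$ ⇒ $\alpha = \lim_{n \to \infty} r_n$. But this universal fact gives little insight into the difference between $\mathbb{Q}$, $\mathbb{A}$ and $\mathbb{T}$.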

Definition A real number α is said to be approximable by rationals to order n ∈ N if ∃ a constant C = C(α) > 0 such that the inequality

$$\left| \alpha - \frac{h}{k} \right| < \frac{C}{k^n}$$
has infinitely many rational solutions $\frac{h}{k}$ where $k > 0$, $(h, k) = 1$.

Note Approximable to order 3 ⇒ Approximable to order 2 and 1.

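Theorem 36 If $\alpha \in \mathbb{I}$, $\exists$ infinitely many $\frac{h}{k} \in \mathbb{Q}$ with
$$\left| \alpha - \frac{h}{k} \right| < \frac{1}{k^2}$$
i.e. $\alpha$ is approximable to order 2.
Proof. See page 75. If $\alpha \in \mathbb{I}$ its continued fraction expansion is infinite, so the set of convergents $\frac{p_n}{q_n}$ is infinite and
$$\left| \alpha - \frac{p_n}{q_n} \right| < \frac{1}{q_n q_{n+1}} < \frac{1}{q_n^2}$$
so we can let $\frac{h}{k} = \frac{p_n}{q_n}$.

OR Let $n \in \mathbb{N}$. Consider the $n + 1$ real numbers
$$S = \{0, \alpha - \lfloor \alpha \rfloor, 2\alpha - \lfloor 2\alpha \rfloor, \ldots, n\alpha - \lfloor n\alpha \rfloor\}$$
and their distribution in the intervals $\frac{j}{n} \le x < \frac{j+1}{n}$, $j = 0, \ldots, n - 1$, which cover $[0, 1)$, so contain all of the numbers in $S$. Hence (by the Dirichlet pigeon-hole principle) two numbers lie in the same interval, say with $0 \le n_1 < n_2 \le n$, $n_1\alpha - \lfloor n_1\alpha \rfloor, n_2\alpha - \lfloor n_2\alpha \rfloor \in \left[\frac{j}{n}, \frac{j+1}{n}\right)$. The length of this interval is $\frac{1}{n}$ so
$$|(n_2\alpha - \lfloor n_2\alpha \rfloor) - (n_1\alpha - \lfloor n_1\alpha \rfloor)| < \frac{1}{n}.$$
Let $k = n_2 - n_1$ and $h = \lfloor n_2\alpha \rfloor - \lfloor n_1\alpha \rfloor$, $k \in \mathbb{N}$, $h \in \mathbb{Z}$, so $|k\alpha - h| < \frac{1}{n}$ and $k \le n$ (1) ⇒ $\left| \alpha - \frac{h}{k} \right| < \frac{1}{nk} \le \frac{1}{k^2}$. Suppose there were only a finite number of such pairs $(h, k)$: $(h_1, k_1), \ldots, (h_r, k_r)$. Let
$$\varepsilon = \min\left\{ \left| \alpha - \frac{h_1}{k_1} \right|, \ldots, \left| \alpha - \frac{h_r}{k_r} \right| \right\} > 0.$$

Use AA to find $n \in \mathbb{N}$ with $0 < \frac{1}{n} < \varepsilon$, so $\exists h, k$ by (1) with $\left| \alpha - \frac{h}{k} \right| < \frac{1}{nk} \le \frac{1}{n} < \varepsilon$, so $\frac{h}{k} \ne \frac{h_i}{k_i}$ (!!!). □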
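The pigeon-hole argument is constructive, and can be sketched in code. The function below (the name `dirichlet_approx` is illustrative) places the fractional parts of $0, \alpha, 2\alpha, \ldots, n\alpha$ into the $n$ intervals $[\frac{j}{n}, \frac{j+1}{n})$ and returns $(h, k)$ from the first collision. It works in floating point, so it is only reliable when the fractional parts are not extremely close to an interval boundary.

```python
from math import floor, sqrt

def dirichlet_approx(alpha, n):
    """Return (h, k) with 1 <= k <= n and |k*alpha - h| < 1/n (pigeon-hole)."""
    box = {}                            # interval index j -> multiplier i
    for i in range(n + 1):
        frac = i * alpha - floor(i * alpha)
        j = min(int(frac * n), n - 1)   # index of the interval containing frac
        if j in box:
            i1 = box[j]
            return floor(i * alpha) - floor(i1 * alpha), i - i1
        box[j] = i
    raise AssertionError("unreachable: n + 1 numbers in n intervals")

h, k = dirichlet_approx(sqrt(2), 10)
print(h, k, abs(sqrt(2) - h / k) < 1 / k ** 2)  # 7 5 True
```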

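Theorem 37 Any rational number is approximable to order 1, but not to any higher order.
Proof. Let $\alpha = \frac{a}{b}$, $(a, b) = 1$, $b \ge 1$ be rational. Then there are infinitely many solutions $(x, y)$ to $ax - by = 1$ ($x = x_0 + bt$, $y = y_0 + at$, $t \in \mathbb{Z}$ if $(x_0, y_0)$ is one solution) and infinitely many with $x > 0$. Then
$$ax - by = 1 \;\Rightarrow\; \left| \frac{a}{b} - \frac{y}{x} \right| = \frac{1}{bx} < \frac{2}{x}.$$
Hence $\alpha$ is approximable to order 1.

If $\frac{y}{x} \in \mathbb{Q}$ and $\frac{y}{x} \ne \frac{a}{b}$ then
$$\left| \frac{a}{b} - \frac{y}{x} \right| = \frac{|ax - by|}{bx} \ge \frac{1}{bx}$$
and there is no constant $C$ such that $\frac{1}{bx} < \frac{C}{x^2}$ for infinitely many $x \in \mathbb{N}$. Hence $\frac{a}{b} = \alpha$ is not approximable to any order higher than 1. □

Ex $\xi = \frac{1}{10} + \frac{1}{10^2} + \frac{1}{10^4} + \cdots + \frac{1}{10^{2^m}} + \cdots \in \mathbb{I}$. Let $r_m$ = the $(m + 1)$th partial sum of $\xi$, $r_m \in \mathbb{Q}$. Then
$$|\xi - r_m| = 10^{-2^{m+1}} + 10^{-2^{m+2}} + \cdots < 2 \cdot 10^{-2^{m+1}} = 2(10^{-2^m})^2, \qquad r_m = \frac{a_m}{10^{2^m}}, \; a_m \in \mathbb{N}.$$
So this inequality shows we can approximate $\xi$ to order 2 at least. Hence $\xi \notin \mathbb{Q}$ ⇒ $\xi \in \mathbb{I}$.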

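Theorem 38 A real algebraic number $\alpha$ of degree $n$ is not approximable to order $n + 1$ or higher.
Proof. Theorem 37 is the case $n = 1$. Let $\alpha$ satisfy the equation $f(x) = a_0 x^n + a_1 x^{n-1} + \cdots + a_n = 0$ where $a_i \in \mathbb{Z}$, $n \ge 2$. Since the degree of $\alpha$ is $n$, this polynomial must be irreducible (since otherwise $f(x) = g(x) \cdot h(x)$ ⇒ $0 = f(\alpha) = g(\alpha) \cdot h(\alpha)$ so $g(\alpha) = 0$ or $h(\alpha) = 0$, and each would have lower degree than $n$). If $x \in (\alpha - 1, \alpha + 1)$, $|x| < |\alpha| + 1$ so
$$\begin{aligned} |f'(x)| &= \left| n a_0 x^{n-1} + (n - 1) a_1 x^{n-2} + \cdots + a_{n-1} \right| \\ &\le \left| n a_0 x^{n-1} \right| + \left| (n - 1) a_1 x^{n-2} \right| + \cdots + |a_{n-1}| \\ &< n |a_0| (|\alpha| + 1)^{n-1} + (n - 1) |a_1| (|\alpha| + 1)^{n-2} + \cdots + |a_{n-1}| = A. \end{aligned}$$
Now if $\frac{h}{k}$ is a rational approximation to $\alpha$ with $\alpha - 1 < \frac{h}{k} < \alpha + 1$ we must have $f\left(\frac{h}{k}\right) \ne 0$, since otherwise $f(x)$ would have a factor $x - \frac{h}{k}$ over $\mathbb{Q}$, but it is of degree $n \ge 2$ and irreducible. Hence
$$\left| f\left(\frac{h}{k}\right) \right| = \frac{|a_0 h^n + a_1 h^{n-1} k + \cdots + a_n k^n|}{k^n} \ge \frac{1}{k^n}.$$
By the Mean Value Theorem $f\left(\frac{h}{k}\right) = f\left(\frac{h}{k}\right) - f(\alpha) = \left(\frac{h}{k} - \alpha\right) f'(\xi)$ for some $\xi$ between $\frac{h}{k}$ and $\alpha$. Hence
$$\left| \frac{h}{k} - \alpha \right| = \frac{\left| f\left(\frac{h}{k}\right) \right|}{|f'(\xi)|} > \frac{1}{A k^n}.$$
There is no constant $C$ so that $\frac{1}{A k^n} < \frac{C}{k^{n+1}}$ for infinitely many $k$, hence $\alpha$ is not approximable to order $n + 1$ or higher. □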

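Definition A Liouville Number $\eta \in \mathbb{R}$ satisfies: $\forall m \in \mathbb{N}$ $\exists\, \frac{h_m}{k_m} \in \mathbb{Q}$ such that $\left|\eta - \frac{h_m}{k_m}\right| < \frac{1}{k_m^m}$.

Proposition Any Liouville Number is transcendental.

Proof. $\forall m \ge n + 1$, $\left|\eta - \frac{h_m}{k_m}\right| < \frac{1}{k_m^m} \le \frac{1}{k_m^{n+1}}$. So, by Theorem 38, $\eta$ cannot be algebraic of degree $n$, $\forall n \in \mathbb{N}$.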

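Ex $\eta = 10^{-1!} + 10^{-2!} + \cdots + 10^{-m!} + \cdots = 0.1100010\ldots$ and $\frac{h_m}{k_m}$ is the $m$th partial sum, so $k_m = 10^{m!}$. Then
$$\left|\eta - \frac{h_m}{k_m}\right| = 10^{-(m+1)!} + 10^{-(m+2)!} + \cdots < 2 \cdot 10^{-(m+1)!} < \left(10^{m!}\right)^{-m} = \frac{1}{k_m^m}.$$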
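The inequality in this example can be checked with exact rational arithmetic. The following is a minimal sketch, assuming a long partial sum (here 7 terms, an arbitrary choice) is an acceptable stand-in for $\eta$ itself; the names `liouville_partial` and `check` are illustrative only.

```python
from fractions import Fraction
from math import factorial

def liouville_partial(m):
    """Return the m-th partial sum h_m/k_m of eta as an exact fraction."""
    return sum(Fraction(1, 10 ** factorial(j)) for j in range(1, m + 1))

# Assumption: a 7-term partial sum stands in for eta (the neglected tail is tiny).
ETA_TERMS = 7
eta_approx = liouville_partial(ETA_TERMS)

def check(m):
    """Check |eta - h_m/k_m| < 1/k_m^m with k_m = 10^(m!)."""
    k_m = 10 ** factorial(m)
    return abs(eta_approx - liouville_partial(m)) < Fraction(1, k_m ** m)

results = [check(m) for m in range(1, 5)]
```

Using `Fraction` avoids floating-point underflow, since the differences involved are far smaller than any float can represent.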

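Ex (Baker)(Alladi, 1979) $\left|\log(2) - \frac{a}{b}\right| > \frac{1}{10^{10}\,b^{5.8}}$ $\forall\, \frac{a}{b} \in \mathbb{Q}$.

Ex (Baker, 1964) $\left|\sqrt[3]{2} - \frac{a}{b}\right| > \frac{C}{b^{2.96}}$ $\forall\, \frac{a}{b} \in \mathbb{Q}$.

Ex (Mahler, 1953) $\left|\pi - \frac{a}{b}\right| > \frac{1}{b^{42}}$ $\forall\, \frac{a}{b} \in \mathbb{Q}$.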

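Ex (Gelfond, Schneider, 1934) $\alpha \in \mathbb{A}$, $\beta \in \mathbb{A} \setminus \mathbb{Q}$, $\alpha \ne 0, 1 \Rightarrow \alpha^\beta \in \mathbb{T}$. e.g. $2^{\sqrt{2}} \in \mathbb{T}$, $\sqrt{2}^{\sqrt{2}} \in \mathbb{T}$.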

Ex (Hermite, 1873) e ∈ T.

Ex (Lindemann, 1882) $\pi \in \mathbb{T}$ via $\alpha \in \mathbb{A} \setminus \{0\} \Rightarrow e^\alpha \in \mathbb{I}$, since $e^{i\pi} = -1$.

RSA Public Key Cryptograms

1. Let p, q ∈ P be large and distinct primes known only to Alice. 2. Alice selects a (large) random integer e relatively prime to r := (p − 1)(q − 1).

3. Alice publishes n = pq and e (say in a newspaper—the public key).

4. The sender Bob has a message and encodes this in an integer 1 < m < n.

5. He uses the public key $(n, e)$ to compute the least positive residue $c \equiv m^e \pmod n$. The encrypted message is $c$.

6. He sends $c$ to Alice using any (public) transmission process.

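7. Alice recovers $m$ from $c$ using
$$\varphi(n) = n\prod_{p \mid n}\left(1 - \frac{1}{p}\right) = pq\left(1 - \frac{1}{p}\right)\left(1 - \frac{1}{q}\right) = (p - 1)(q - 1) = r$$
as follows:

8. Compute $d \equiv e^{-1} \pmod r$ using $xe + yr = 1$, so $de \equiv 1 \pmod r$.

9. Then $c^d \equiv (m^e)^d \equiv m^{ed} \equiv m^{1 + \ell\varphi(n)} \equiv m\left(m^{\varphi(n)}\right)^\ell \equiv m \cdot 1^\ell \pmod n \Rightarrow c^d \equiv m \pmod n$, and $1 < m < n \Rightarrow$ we have recovered $m$.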
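Steps 1 to 9 can be traced on a toy key. This is only a sketch under stated assumptions: the primes 61 and 53, the exponent 17 and the message 65 are hypothetical illustrative values, far too small for real use, and `encrypt` and `decrypt` are names introduced here.

```python
from math import gcd

# Toy parameters (hypothetical, far too small for real use).
p, q = 61, 53
n = p * q                  # public modulus n = pq
r = (p - 1) * (q - 1)      # r = phi(n)
e = 17                     # public exponent with (e, r) = 1
assert gcd(e, r) == 1

# Step 8: d = e^{-1} (mod r); three-argument pow with -1 needs Python 3.8+.
d = pow(e, -1, r)

def encrypt(m):
    """Step 5: least positive residue of m^e mod n."""
    return pow(m, e, n)

def decrypt(c):
    """Step 9: c^d mod n recovers m."""
    return pow(c, d, n)

m = 65
c = encrypt(m)
recovered = decrypt(c)
```

Python's built-in `pow` performs the modular exponentiation by repeated squaring, so neither $m^e$ nor $c^d$ is ever formed in full.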

Code Cracking

Given $n = pq$ where $p, q \in \mathbb{P}$ are large, find $p$ and $q$.

Method 1 Either $p \le \sqrt{n}$ or $q \le \sqrt{n}$, so for $2 \le j \le \lfloor\sqrt{n}\rfloor$ try $j \mid n$ until it succeeds.

• If $n \sim 10^{100}$, $\sqrt{n} \sim 10^{50}$.

• If we can check 1 million divisors per second on average, then we need $10^{50}/10^6 = 10^{44}$ seconds $\approx 3.2 \times 10^{36}$ years $= T$.

• If we speed this up by 1 million times (i.e. $10^{12}$ per second), $T = 3.2 \times 10^{30}$ years.
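Method 1 and its time estimate can be sketched as follows. The function name `trial_division` and the year length of 365.25 days are assumptions made for this illustration.

```python
from math import isqrt

def trial_division(n):
    """Return the smallest j with 2 <= j <= floor(sqrt(n)) and j | n, else None."""
    for j in range(2, isqrt(n) + 1):
        if n % j == 0:
            return j
    return None

# Rough time estimate for n ~ 10^100 at 10^6 trial divisors per second.
SECONDS_PER_YEAR = 365.25 * 24 * 3600
years_needed = (10 ** 50 / 10 ** 6) / SECONDS_PER_YEAR
```

For small inputs the loop returns immediately; the estimate shows why it is hopeless at cryptographic sizes.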

Method 2 (Pollard's $p - 1$, 1974) Let $n$ have a prime factor $p$ such that $p - 1$ is a product of (high powers of) small primes. By Fermat's little theorem, if $p \nmid a$, $a^{p-1} \equiv 1 \pmod p \Rightarrow p \mid (a^{p-1} - 1, n)$. The prime $p$ is unknown. So first let $k = 2^{\alpha_1}3^{\alpha_2} \cdots r^{\alpha_s}$ where $2, 3, \ldots, r$ are the first $s$ primes and the $\alpha_i \in \mathbb{N}$ are "small". Compute $\left(a^k - 1, n\right) = \left((a^k - 1) \bmod n,\; n\right)$, which can be done in $O(\log_2(2kn))$ operations. If $\exists\, p \mid n$ with $p - 1 \mid k$ and $(a, n) = 1$ then $p \mid a^k - 1$, since $a^k = a^{(p-1)\ell} = (a^{p-1})^\ell \equiv 1^\ell \equiv 1 \pmod p$, and so $\left(a^k - 1, n\right) \ge p > 1$.

If $\left(a^k - 1, n\right) \ne n$ we have a non-trivial factor of $n$, so we can divide by it, and repeat this process on each factor.

If $\left(a^k - 1, n\right) = n$ choose a new $a$. If $\left(a^k - 1, n\right) = 1$ choose a larger $k$.

Ex $n = 246{,}082{,}373$

1. Compute $2^{n-1} \not\equiv 1 \pmod n$ (Mathematica: PowerMod[2, n-1, n], very slow). Hence $n$ is not prime, by Fermat's little theorem, i.e. $n$ is composite.

2. Let $a = 2$, $k = 2^2 \cdot 3^2 \cdot 5 = 180$; then $k = 180 = 2^2 + 2^4 + 2^5 + 2^7$ in base 2. We need to compute $2^{2^i} \pmod n$ for $0 \le i \le 7$; we can do a few extras by successive squaring mod $n$:

i    $2^{2^i} \pmod n$
0    2
1    4
2    16
3    256
4    65,536
5    111,566,955
6    166,204,404
7    214,344,997
8    111,354,998
9    82,087,367
10   7,262,569
11   104,815,687

Using the table,
$$2^{180} = 2^{2^2} \cdot 2^{2^4} \cdot 2^{2^5} \cdot 2^{2^7} \equiv 16 \cdot 65{,}536 \cdot 111{,}566{,}955 \cdot 28{,}795{,}219 \pmod{246{,}082{,}373} \equiv 121{,}299{,}227 \pmod n$$
then, using the Euclidean algorithm, $\left(2^{180} - 1, n\right) = \gcd(121{,}299{,}226,\; 246{,}082{,}373) = 1$, so the test fails because $n$ has no factor $p$ with $p - 1$ dividing 180. Choose a new $k = \operatorname{lcm}\{2, 3, \ldots, 9\} = 2^3 \cdot 3^2 \cdot 5 \cdot 7 = 2520$. In base 2, $2520 = 2^3 + 2^4 + 2^6 + 2^7 + 2^8 + 2^{11}$, so $2^{2520} \equiv 2^{2^3} \cdot 2^{2^4} \cdots 2^{2^{11}} \equiv 101{,}220{,}672 \pmod n$ and $(2^{2520} - 1, n) = \gcd(101{,}220{,}671,\; 246{,}082{,}373) = 2521$, so $2521 \mid n$ and we have a factor. Indeed $n = 2521 \cdot 97613$ and each of these factors is prime.

Summary (Pollard $p - 1$) $n > 2$ composite given.

1. $k = \operatorname{lcm}\{1, 2, 3, \ldots, K\}$, a product of small primes to small powers.
2. Choose arbitrary $a$ in $1 < a < n$, say $a = 2$.
3. Calculate $(a, n)$. If it is more than 1 then $(a, n)$ is a factor of $n$, so return it.
4. Let $d = \left(a^k - 1, n\right)$. If $1 < d < n$ return $d$. If $d = 1$ go to 1. and choose $K \to K + 1$. If $d = n$ go to 2. and choose another $a$.

Pollard's algorithm eventually returns a proper factor since we will reach $K = \frac{1}{2}(p - 1)$, so $k = 2^{\alpha_1} \cdots \frac{1}{2}(p - 1)$ and $(p - 1) \mid k$.

Note: The algorithm is fast only when $n$ has a prime factor $p$ such that $p - 1$ is the product of small primes to small powers, i.e. $K$ is reasonably small.
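The summary can be sketched in a few lines. This is a hedged illustration, not a tuned implementation: the function names and the cut-off `max_K` are assumptions, and `math.lcm` requires Python 3.9 or later.

```python
from math import gcd, lcm

def pollard_gcd(n, a, k):
    """Return (a^k - 1, n), computed with modular exponentiation."""
    return gcd(pow(a, k, n) - 1, n)

def pollard_p_minus_1(n, a=2, max_K=50):
    """Try k = lcm(1..K) for K = 2, 3, ...; return a proper factor or None."""
    g = gcd(a, n)
    if g > 1:
        return g                      # step 3: (a, n) is already a factor
    for K in range(2, max_K + 1):
        k = lcm(*range(1, K + 1))     # step 1
        d = pollard_gcd(n, a, k)      # step 4
        if 1 < d < n:
            return d
        if d == n:
            return None               # the summary would now pick another a
    return None

n = 246082373
factor = pollard_p_minus_1(n)
```

On the worked example, `pollard_gcd(n, 2, 180)` and `pollard_gcd(n, 2, 2520)` correspond to the two attempts made in the text.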

Method 3 (Lenstra, ECM, 1987) Non-zero elements of $\mathbb{Z}/p\mathbb{Z} = \mathbb{F}_p$ form a multiplicative group of order $p - 1$, so $p - 1 \mid k \Rightarrow a^k = 1$ in the group. This makes Pollard's $p - 1$ work. Here the group $\mathbb{F}_p^*$ (the so-called multiplicative group) is replaced by the group of points on an elliptic curve $E(\mathbb{F}_p)$ and $a$ by a point $P \in E(\mathbb{F}_p)$. As before, choose $k$ a product of small primes. If $\#E(\mathbb{F}_p) \mid k \Rightarrow kP = 0$ in $E(\mathbb{F}_p)$ and this will often allow us to find a non-trivial factor of $n$.

Lenstra is good if for some curve $E(\mathbb{Q})$ and some $p \in \mathbb{P}$, $p \mid n$ and $\#E(\mathbb{F}_p)$ is a product of small primes. If we lose with Pollard, the game is over, e.g. $n = pq$ and $p - 1$, $q - 1$ both have large prime factors. If we lose with Lenstra, we simply choose a new curve.

Note: (Subject to a conjecture but crucial underpinning) $\#E(\mathbb{F}_p) = p + 1 - \varepsilon_p$, $|\varepsilon_p| \le 2\sqrt{p}$, and for fixed $p$, as we pass over all such curves, the numbers $\varepsilon_p$ are well spread in the interval $\left[-2\sqrt{p}, 2\sqrt{p}\right]$, so it is likely we will find a curve $E$ with $\#E(\mathbb{F}_p)$ = product of small primes.

Summary (Lenstra, ECM) $n > 2$ a composite integer.

1. Check $(n, 6) = 1$ and $n \ne m^r$ for any $r \ge 2$.

2. Choose random integers $b, x_1, y_1$ with $1 < b, x_1, y_1 < n$.

3. Let $c = y_1^2 - x_1^3 - bx_1 \pmod n$. Let $E : y^2 = x^3 + bx + c$. Let $P = (x_1, y_1) \in E$.

4. Check $g = (4b^3 + 27c^2, n) = 1$. If $g = n$ we have 'bad reduction' so go back and get a new $b$. If $1 < g < n$ we have a non-trivial factor, so return $g$.

5. Let $k = \operatorname{lcm}\{1, 2, \ldots, K\}$ for some $K \in \mathbb{N}$.

6. Compute $kP = \left(\dfrac{a_k}{d_k^2}, \dfrac{b_k}{d_k^3}\right)$.

7. Calculate $d = (d_k, n)$. If $1 < d < n$, return $d$. If $d = 1$ go to 2. and choose a new curve. If $d = n$ go to 5. and decrease $k$.

So how/why does it work? What is step 6.?

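Suppose we eventually found a curve $E$ such that for $p \mid n$, $\#E(\mathbb{F}_p) \mid k$. Then each $P \in E(\mathbb{F}_p)$ has an order $o(P) \mid \#E(\mathbb{F}_p) \mid k$, so $o(P) \mid k$, hence $kP = 0$, i.e. $kP$ is the point at $\infty$, written 0. Then $p \mid d_k$ (see $(\star)$ below). Hence $p \mid (d_k, n)$ and normally $n \nmid d_k$.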

How to compute $kP$ efficiently ($(P + P) + P + \cdots$ is too slow)? Write $k = k_0 + k_1 \cdot 2 + k_2 \cdot 2^2 + \cdots + k_r \cdot 2^r$ in binary, $k_i \in \{0, 1\}$, and compute

$P_0 = P$
$P_1 = 2P_0 = 2P$
$P_2 = 2P_1 = 2^2P$
$\vdots$
$P_r = 2P_{r-1} = 2^rP$

so that
$$kP = \sum_{k_i = 1} P_i$$
($2\log_2(k)$ steps).

All computations are done mod $n$. Let $Q_1 = (x_1, y_1)$, $Q_2 = (x_2, y_2)$, $x_i, y_i \in \mathbb{Z}_n$ (integers mod $n$). Then $Q_3 = Q_1 + Q_2 = (x_3, y_3)$ with
$$x_3 = \lambda^2 - x_1 - x_2, \qquad y_3 = -\lambda x_3 - (y_1 - \lambda x_1)$$
where $\lambda = \dfrac{y_2 - y_1}{x_2 - x_1}$ and the division is carried out mod $n$. Note that in $\mathbb{Z}_n = \mathbb{Z}/n\mathbb{Z}$, $x_2 - x_1$ may not have an inverse. Then:

1. If $(x_2 - x_1, n) = 1$, the inverse exists.
2. If $1 < (x_2 - x_1, n) < n$, return this.
3. If $(x_2 - x_1, n) = n$, go back to 2. or 5. in Lenstra.

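To double a point $Q = (x, y) \pmod n$ on $y^2 = f(x) = x^3 + ax^2 + bx + c$ we need
$$\lambda = \frac{f'(x)}{2y} = \frac{3x^2 + 2ax + b}{2y} \pmod n$$
(with $a = 0$ for the curves $y^2 = x^3 + bx + c$ used here), and the same choices 1., 2. or 3. apply based on $(2y, n)$.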
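The addition and doubling rules, together with the binary method for $kP$, can be sketched as below. This is a minimal illustration under stated assumptions: the curve is $y^2 = x^3 + bx + c$ (so $a = 0$), points are pairs already reduced mod $n$, and the names `ec_add` and `ec_mul` and the convention of returning `(point, None)` or `(None, g)` are choices made here, not part of the original notes.

```python
from math import gcd

def ec_add(Q1, Q2, b, n):
    """Add Q1 and Q2 on y^2 = x^3 + b*x + c modulo n.

    Returns (point, None) when the slope is computable, or (None, g) when the
    denominator shares the factor g > 1 with n (g may equal n).
    """
    (x1, y1), (x2, y2) = Q1, Q2
    if Q1 == Q2:
        num, den = 3 * x1 * x1 + b, 2 * y1      # tangent slope f'(x)/(2y)
    else:
        num, den = y2 - y1, x2 - x1             # chord slope
    den %= n
    g = gcd(den, n)
    if g != 1:
        return None, g                          # the addition law "breaks"
    lam = num * pow(den, -1, n) % n             # modular inverse, Python 3.8+
    x3 = (lam * lam - x1 - x2) % n
    y3 = (-lam * x3 - (y1 - lam * x1)) % n
    return (x3, y3), None

def ec_mul(k, P, b, n):
    """Compute kP (k >= 1) by successive doubling; same return convention."""
    acc = None                                  # nothing accumulated yet
    Pi = P                                      # Pi runs through P, 2P, 4P, ...
    while k:
        if k & 1:
            if acc is None:
                acc = Pi
            else:
                acc, g = ec_add(acc, Pi, b, n)
                if acc is None:
                    return None, g
        k >>= 1
        if k:
            Pi, g = ec_add(Pi, Pi, b, n)
            if Pi is None:
                return None, g
    return acc, None
```

A returned `g` with $1 < g < n$ is the non-trivial factor; `g == n` corresponds to choice 3. For instance, on the tiny (hypothetical) curve $y^2 = x^3 + x + 23$ mod 35, doubling $P = (1, 5)$ fails at once because $(2y, n) = (10, 35) = 5$.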

Ex $n = 1{,}715{,}761{,}513$, $2^{n-1} \equiv 93{,}082{,}891 \pmod n \Rightarrow n \notin \mathbb{P}$.

1. $n$ is not a power: $\sqrt{n}, \sqrt[3]{n}, \ldots, \sqrt[31]{n} = 1.9855$ are not integers (Mathematica: check n == Floor[n^(1/j)]^j for j = 2, ..., 31). $(n, 6) = 1$.

2. $\sqrt{n} \approx 41{,}422$ so $\exists\, p \mid n$, $p < 41{,}422$. We want $k$ so that some integer close to $p$ divides $k$. Try $k = \operatorname{lcm}\{1, 2, \ldots, 17\} = 12{,}252{,}240$ with lots of factors less than 41,422.

Choosing an elliptic curve: Choose a point P and one coefficient for E, then the other so the point is on the curve.

Ex $P = (2, 1)$, $c = -7 - 2b$, e.g. $b = 1 \Rightarrow c = -9$, so $E : y^2 = x^3 + x - 9$ and $(2, 1) \in E$.

Now compute kP using successive doubling

$$k = 12{,}252{,}240 = 2^4 + 2^6 + 2^{10} + 2^{12} + 2^{13} + 2^{14} + 2^{15} + 2^{17} + 2^{19} + 2^{20} + 2^{21} + 2^{23}$$

so we need $2^iP \pmod n$ for $0 \le i \le 23$.

So

kP ≡ (1, 225, 303, 014, 142, 796, 033) ≡ (421, 401044, 664, 333, 727)

This tells us nothing about the factors of n. It is when the addition law breaks that we get a factor. So we need a new k, a new P or a new curve.

$k = 12{,}252{,}240$ as before, $P = (2, 1)$ as before. $b = 2 \Rightarrow c = -7 - 2b = -11$ so $E : y^2 = x^3 + 2x - 11$ and $P \in E$, and $kP \pmod n$ is still okay. $b = 42 \Rightarrow c = -91$ so $E : y^2 = x^3 + 42x - 91$, $P \in E$. The addition law breaks and a factor is delivered. Table (A) of $2^iP \pmod n$ is okay. Then we start adding up the points to produce (B).

At the penultimate step

$$(2^4 + 2^6 + \cdots + 2^{21})P = 3{,}863{,}632\,P \equiv (1{,}115{,}004{,}543,\; 1{,}676{,}196{,}055) \pmod n$$

Then from the new (A), $2^{23}P \equiv (1{,}267{,}572{,}925,\; 848{,}156{,}341) \pmod n$, and we try to add these points. We need the inverse mod $n$ of the difference of their $x$-coordinates, but $\gcd(1{,}115{,}004{,}543 - 1{,}267{,}572{,}925,\; n) = 26{,}927 \ne 1$, which gives us the factorisation $n = 26{,}927 \cdot 63{,}719$.

Note: $(\star)$ Now we see what this means. 0 is not a finite point on the curve, so $kP = 0$ means we are not able to compute mod $p$ coordinates for $kP$. The only way that can happen is if $p \mid x_2 - x_1$, a denominator, i.e. $kP = 0$ means that at some stage the process of building $kP$ (using the appropriate versions of tables (A) and (B) above) breaks down.

Mathematica: << NumberTheory`FactorIntegerECM` may be bad. $a^b \pmod n$ is PowerMod[a, b, n] and appears bad.

Method 4 Search for solutions to $x^2 \equiv y^2 \pmod N$, since then $x^2 - y^2 = (x - y)(x + y) = kN$ and we might get a factor of $N$.

13 ABC Conjecture

A simple but powerful relation between the additive and multiplicative properties of numbers.

Definition The radical
$$N(n) = \prod_{p \mid n} p$$
is the largest square-free divisor of $n$, or the core of $n$.

Ex $n = 2^2 3^5 5^3 \Rightarrow N(n) = 2 \cdot 3 \cdot 5$; $N(p^\alpha) = p$ $\forall p \in \mathbb{P}$.

Proposition N is multiplicative and N(n) · N(m) = N(nm) · N((n, m)).
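The radical and the identity in this proposition are easy to test numerically. A small sketch follows; the function name `radical` and the simple trial-division factoring are choices made for illustration.

```python
from math import gcd

def radical(n):
    """N(n): product of the distinct primes dividing n (n >= 1)."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            result *= p
            while n % p == 0:
                n //= p
        p += 1
    # Whatever remains (if > 1) is a single prime factor.
    return result * n if n > 1 else result

identity_holds = all(
    radical(a) * radical(b) == radical(a * b) * radical(gcd(a, b))
    for a in range(1, 60) for b in range(1, 60)
)
```

The primes common to both arguments are counted twice on the left, which is exactly what the extra factor $N((n, m))$ on the right accounts for.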

ABC Conjecture: $\forall \varepsilon > 0$ $\exists K_\varepsilon > 0$ such that if $a, b, c$ are relatively prime integers and $a + b = c$ then
$$\max(|a|, |b|, |c|) \le K_\varepsilon N(abc)^{1+\varepsilon}.$$

This is an important unsolved problem. It is so deep it implies the asymptotic Fermat’s Last Theorem.

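Ex $a = 3^k$, $b = 2 \cdot 3^k$, $c = 3^{k+1}$, so $a + b = c$ and $\max(|a|, |b|, |c|) = c = 3^{k+1}$. Here $abc = 2 \cdot 3^{3k+1}$ so $N(abc) = 2 \cdot 3$, and the statement "$3^{k+1} \le K_\varepsilon(2 \cdot 3)^{1+\varepsilon}$ $\forall k$, for some $K_\varepsilon$" is false for every $\varepsilon > 0$, i.e. $(a, b, c) = 1$ is essential.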

Theorem (Asymptotic Fermat Theorem) ABC $\Rightarrow \exists n_0 \in \mathbb{N}$ so that the Fermat equation $x^n + y^n = z^n$ has no solution in relatively prime integers $\forall n \ge n_0$.
Proof. Let $x, y, z$ be relatively prime (i.e. each has different prime factors). Note that $N(x^ny^nz^n) = N(xyz) \le xyz \le z^3$ since $x < z$, $y < z$. Apply ABC with $\varepsilon = 1$, so
$$z^n = \max(x^n, y^n, z^n) \le K_1N(x^ny^nz^n)^2 \le K_1z^6$$
so
$$n\log(z) \le \log(K_1) + 6\log(z) \;\Rightarrow\; n \le 6 + \frac{\log(K_1)}{\log(z)} \le 6 + \frac{\log(K_1)}{\log(3)}$$

so let
$$n_0 = 7 + \frac{\log(K_1)}{\log(3)}.$$

Conjecture (Catalan Conjecture) 8 and 9 are the only consecutive powers, i.e. the only solution to Catalan's equation

$$x^m - y^n = 1 \qquad (\star)$$
in positive integers $x, y, n, m > 1$ is $3^2 - 2^3 = 1$.

Many special cases of this conjecture have been proved, e.g.

1. $x^2 - y^n = 1$ has one solution $x = n = 3$, $y = 2$.
2. $x^m - y^2 = 1$ has no solutions.

So we need only consider $n, m \ge 3$.

Theorem (Asymptotic Catalan Conjecture) ABC $\Rightarrow$ the Catalan equation has only a finite number of solutions.
Proof. Let $(x, y, m, n)$ be a solution with $m, n \ge 3$. Then $(x, y) = 1$ since otherwise $p \mid x$, $p \mid y \Rightarrow p \mid 1$. The ABC Conjecture with $\varepsilon = \frac{1}{4}$ $\Rightarrow \exists K_{1/4} = K$ such that
$$\max(|a|, |b|, |c|) \le KN(abc)^{5/4}.$$
But $(\star) \Rightarrow x^m = 1 + y^n$, so
$$y^n < x^m \le KN(1 \cdot x^m \cdot y^n)^{5/4} = KN(xy)^{5/4} \le K(xy)^{5/4}$$
$$\Rightarrow\; m\log(x) \le \log(K) + \tfrac{5}{4}(\log(x) + \log(y))$$
$$\Rightarrow\; n\log(y) < \log(K) + \tfrac{5}{4}(\log(x) + \log(y))$$
$$\Rightarrow\; m\log(x) + n\log(y) < 2\log(K) + \tfrac{5}{2}(\log(x) + \log(y))$$
$$\Rightarrow\; \left(m - \tfrac{5}{2}\right)\log(x) + \left(n - \tfrac{5}{2}\right)\log(y) < 2\log(K).$$
But $2 \le x$, $2 \le y$, so
$$\left(m - \tfrac{5}{2}\right)\log(2) + \left(n - \tfrac{5}{2}\right)\log(2) \le \left(m - \tfrac{5}{2}\right)\log(x) + \left(n - \tfrac{5}{2}\right)\log(y) < 2\log(K)$$
$$\Rightarrow\; m + n < \frac{2\log(K)}{\log(2)} + 5.$$
Thus there are only finitely many exponents $m$ and $n$ for which $(\star)$ has a solution. It is known that, for fixed $m, n$, $(\star)$ has only a finite set of solutions $x, y$. Hence $(\star)$ has only a finite set of solutions $x, y, m, n$.

Wieferich Primes

If $p \in \mathbb{P}$ is odd, $2^{p-1} \equiv 1 \pmod p$. If $p$ is such that $2^{p-1} \equiv 1 \pmod{p^2}$, $p$ is called a Wieferich prime and we write $p \in W \subset \mathbb{P}$.

Ex $9 \nmid 2^2 - 1$ so 3 is not one. $25 \nmid 2^4 - 1$ so 5 is not one. $49 \nmid 2^6 - 1$ so 7 is not one.
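The defining congruence is cheap to test with modular exponentiation. A minimal sketch, with the search bound 4000 chosen arbitrarily and the helper names introduced here for illustration:

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def is_wieferich(p):
    """True iff 2^(p-1) = 1 (mod p^2), for an odd prime p."""
    return pow(2, p - 1, p * p) == 1

wieferich_below_4000 = [p for p in range(3, 4000)
                        if is_prime(p) and is_wieferich(p)]
```

The search returns only 1093 and 3511 in this range, which illustrates how rare such primes are.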

Problem: Are there an infinite number of Wieferich (or non-Wieferich) primes?

Lemma Let $p \in \mathbb{P}$ be odd. If $\exists n \in \mathbb{N}$ so $2^n \equiv 1 \pmod p$ but $2^n \not\equiv 1 \pmod{p^2}$ then $p \notin W$.

Proof. Let $2^d \equiv 1 \pmod p$ with $d$ minimal ($d > 0$). Then $d \mid n$ (else $n = ed + r$, $0 < r < d$ and $1 \equiv 2^n = 2^{ed} \cdot 2^r = (2^d)^e \cdot 2^r \equiv 2^r \pmod p$, contradicting $d$ being minimal).

Now $2^n \not\equiv 1 \pmod{p^2} \Rightarrow 2^d \not\equiv 1 \pmod{p^2}$ (else, with $n = ed$, $(2^d)^e \equiv 1^e \Rightarrow 2^n \equiv 1 \pmod{p^2}$). Now $2^d \equiv 1 \pmod p \Rightarrow 2^d = 1 + kp$ and $(k, p) = 1$ (else $2^d = 1 + k'p^2$ and $2^d \equiv 1 \pmod{p^2}$). Also $2^{p-1} \equiv 1 \pmod p \Rightarrow d \mid p - 1 \Rightarrow p - 1 = de$ for some $e$ with $1 \le e \le p - 1$. Then $(ek, p) = 1$ and $2^{p-1} = (2^d)^e = (1 + kp)^e \equiv 1 + ekp \not\equiv 1 \pmod{p^2}$, so $p \notin W$.

Definition A powerful number is a $v \in \mathbb{N}$ such that $p \mid v \Rightarrow p^2 \mid v$. For example $72 = 2 \cdot 6^2 = 2^3 \cdot 3^2$ is powerful but $192 = 2 \cdot 96 = 2^2 \cdot 48 = 2^2 \cdot 4^2 \cdot 3 = 2^6 \cdot 3$ is not.

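Theorem ABC $\Rightarrow |W^c| = \infty$.
Proof. $\forall n \in \mathbb{N}$ let $2^n - 1 = u_nv_n$ $(\otimes)$ where $v_n$ is the maximal powerful divisor of $2^n - 1$, so $u_n$, which consists of just those primes which appear to power 1, is square-free.

If $p \mid u_n$ then $2^n \equiv 1 \pmod p$ by $(\otimes)$ but $2^n \not\equiv 1 \pmod{p^2}$, since 1 is the power of $p$ appearing in $2^n - 1$. Hence $p \in W^c$, so all the prime divisors of $u_n$ are in $W^c$. If $|W^c| < \infty$, $\exists$ only finitely many square-free integers with prime divisors all in $W^c$ $\Rightarrow \#\{u_n : n = 1, 2, \ldots\} < \infty \Rightarrow \#\{v_n : n = 1, 2, \ldots\} = \infty$, and so $\{v_n\}$ is unbounded in size. Since $v_n$ is powerful, $N(v_n) \le \sqrt{v_n}$. Let $0 < \varepsilon < 1$ in ABC and consider $(2^n - 1) + 1 = a + b = c = 2^n$. Then

$$v_n \le u_n \cdot v_n = 2^n - 1 < 2^n = \max(|a|, |b|, |c|) \le K_\varepsilon N(2^n(2^n - 1) \cdot 1)^{1+\varepsilon} = K_\varepsilon N(2u_nv_n)^{1+\varepsilon} \le K_\varepsilon(2u_n)^{1+\varepsilon}N(v_n)^{1+\varepsilon} \le K'_\varepsilon v_n^{(1+\varepsilon)/2}$$

so $v_n < K'_\varepsilon v_n^{(1+\varepsilon)/2} \Rightarrow v_n^{1 - \frac{1}{2} - \frac{\varepsilon}{2}} < K'_\varepsilon \Rightarrow v_n < (K'_\varepsilon)^{1/(1/2 - \varepsilon/2)} = B_\varepsilon$, contradicting the unbounded nature of the $\{v_n\}$.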

Theorem (LeVeque, 1952) If $a, b \ge 2$ are given, the equation $a^x - b^y = 1$ has at most one solution in positive integers $x, y$ unless $a = 3$, $b = 2$ where there are two solutions: $(x, y) = (1, 1)$ and $(x, y) = (2, 3)$.

Proof. Assume that $(u, v)$ and $(x, y)$ are solutions with $u < x$. So $a^u - b^v = a^x - b^y = 1$. Then $v < y$ since $0 < a^x - a^u = b^y - b^v$, and $a^u(a^{x-u} - 1) = b^v(b^{y-v} - 1)$. Now $a^u - b^v = 1 \Rightarrow (a, b) = 1 \Rightarrow b^v = a^u - 1 = a^{x-u} - 1$ and $a^u = b^v + 1 = b^{y-v} - 1$.

Hence $a^u = a^{x-u}$ so $u = x - u \Rightarrow 2u = x$, and also $b^{y-v} - b^v = 2$ so $y - v > v$ and $b^v(b^{y-2v} - 1) = 2$. Hence $v = 1$, $b = 2$, $b^{y-2v} - 1 = 1$, so $y - 2v = 1 \Rightarrow y = 1 + 2v = 3$. Thus $a^u = 1 + b^v = 3$, so $u = 1$ and $a = 3$.

14 Formulas for Primes

A function is easy: Let $f(n) := \max\{p \in \mathbb{P} : p \mid n\}$; indeed
$$f(n) = \lim_{r \to \infty}\lim_{s \to \infty}\lim_{t \to \infty}\sum_{u=0}^{s}\left[1 - \left(\cos^2\frac{(u!)^r\pi}{n}\right)^{2t}\right].$$
Both are impractical.

A formula for $p_n$, the $n$th prime, would be nice, but is probably impossible to find, the elements of $\mathbb{P}$ being scattered in such an irregular manner.

Easier aim: find a formula which produces only primes. We will show below that no polynomial will work but

$f(n) = \left\lfloor\theta^{3^n}\right\rfloor$ does work for some $\theta \in \mathbb{R}$.

Ex $f(n) = an + b$. Let $f(n) = p$ and $f(m) = q$, $p, q \in \mathbb{P}$, $p \ne q$. Then $(a, b) = 1$, since $d = (a, b) \Rightarrow d \mid a$ and $d \mid b$ so $d \mid an + b = p$. Similarly $d \mid q$, but $p \ne q$ so $(p, q) = 1 \Rightarrow d = 1$. So if $f$ has more than one prime value, $(a, b) = 1$.

Tables of primes reveal arithmetic progressions of various lengths:

1. 3, 5, 7.

2. 7, 37, 67, 97, 127, 157.

3. 199, 409, 619, 829, 1039, 1249, 1459, 1669, 1879, 2089.

Proposition No arithmetic progression in $\mathbb{N}$, of infinite length, can yield only primes.
Proof. Let $an + b = p \in \mathbb{P}$, $n, p$ fixed, and $n_k = n + kp$, $k = 0, 1, 2, \ldots$. Then the $n_k$th term of the progression is

$$an_k + b = a(n + kp) + b = an + b + akp = p(1 + ak)$$
so $p \mid an_k + b$ $\forall k \ge 1$. Thus, since the $n_k$ come at intervals of $p$ terms, every $p$th term of the original progression is divisible by $p$. Hence the progression contains infinitely many composite numbers.

Note: Dirichlet’s Theorem (see back) says {an + b : n ∈ N} contains an infinite number of primes if (a, b) = 1.

Proposition If $p \nmid a$ then every $p$th term of $\{an + b\}$, starting somewhere, is divisible by $p$.

Proof. $p \nmid a \Rightarrow (p, a) = 1 \Rightarrow \exists r, s$ so $pr + as = 1$. Let $n_k = kp - bs$, $k = 1, 2, 3, \ldots$. Then

$$f(n_k) = an_k + b = a(kp - bs) + b = akp - abs + b = akp - b(1 - pr) + b = p(ak + br).$$

Thus $p \mid an_k + b$. Since $n_{k+1} - n_k = p$, the terms $an_k + b$ occur $p$ terms apart.

Ex $2 \nmid a \Rightarrow$ every second term is divisible by 2. So $\{an + b\}$ cannot have more than 2 consecutive prime values (and two only if one of them is 2).

Ex $\{30{,}030n - 6887 : n = 1, 2, 3, \ldots\}$ has 12 consecutive terms which are prime. $30{,}030 = 2 \cdot 3 \cdot 5 \cdot 7 \cdot 11 \cdot 13$. This is a curiosity: linear formulas fail.

Quadratic Formulas

$$f(n) = an^2 + bn + c$$

Ex $f(n) = n^2 + 21n + 1$ is not composite for $n = -38, -37, \ldots, 0, 1, 2, \ldots, 17$: 56 values. $f(0) = 1 \notin \mathbb{P}$ of course. However, $f(18) = 703 = 37 \cdot 19$.

Ex $19 \mid f(n)$ if $n \equiv -1 \pmod{19}$: since $n = -1 + 19\ell \Rightarrow$

$$f(n) = (-1 + 19\ell)^2 + 21(-1 + 19\ell) + 1 = 19(-1 + 19\ell + 19\ell^2).$$

Ex $f(n) = n^2 + n + 41 \in \mathbb{P}$ for $n \in \{-40, -39, \ldots, 39\}$, i.e. for 80 consecutive values.
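Both quadratic examples can be confirmed by brute force. In this sketch the names `f` and `g` are labels for the two polynomials, and "not composite" is read as "the absolute value is 1 or a prime", since $n^2 + 21n + 1$ takes negative values on part of the range.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def f(n):
    return n * n + 21 * n + 1

def g(n):
    return n * n + n + 41

# |f(n)| is 1 or prime for n = -38, ..., 17 (56 values).
f_ok = all(abs(f(n)) == 1 or is_prime(abs(f(n))) for n in range(-38, 18))
# g(n) is prime for n = -40, ..., 39 (80 values).
g_ok = all(is_prime(g(n)) for n in range(-40, 40))
```

The first failures sit just outside the stated ranges: $f(18) = 19 \cdot 37$ and $g(40) = 41^2$.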

Conjecture 80 is the best possible for any quadratic.

Known (1967): No $f(n) = n^2 + n + A$ ($A > 41$) gives primes for $n = 0, 1, \ldots, A - 2$.

Proposition No quadratic can always be prime.
Proof. $f(n) = an^2 + bn + c = p \in \mathbb{P} \Rightarrow f(n_k) \equiv an^2 + bn + c \equiv 0 \pmod p$ where $n_k = n + kp$, so every $p$th term of $\{f(n)\}$ is divisible by $p$.

Does $\{an^2 + bn + c\}$ contain an infinite number of prime values? Unknown. Does $\{n^2 + 1\}$ have an infinite number of prime values? Also unknown.

If f ∈ Z[x] is a polynomial and f(n) = p ∈ P, then p|f(n + kp) for k = 0, 1, 2,..., so no such f has an infinite set of consecutive prime values.

Ex Given $d \in \mathbb{N}$, $\exists$ a polynomial $f \in \mathbb{Q}[x]$ of degree at most $d$, taking on $d + 1$ arbitrarily assigned values, which could be prime:
$$60f(x) = 7x^5 - 85x^4 + 355x^3 - 575x^2 + 418x + 180$$
has

n    = 0  1  2  3   4   5
f(n) = 3  5  7  11  13  17

Method: Use Lagrange interpolation with $x_i = \{0, 1, \ldots, 5\}$ and $y_i = \{3, 5, 7, \ldots, 17\}$ and
$$f(x) = \sum_{i=1}^{6} y_i\left(\prod_{j=1,\, j \ne i}^{6}\frac{(x - x_j)}{(x_i - x_j)}\right)$$

so $f(x_i) = y_i$, $1 \le i \le 6$.
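The interpolation formula can be evaluated exactly with rational arithmetic and compared with the quintic quoted above. A minimal sketch; `lagrange` and `poly60` are names introduced here.

```python
from fractions import Fraction

def lagrange(xs, ys):
    """Return the Lagrange interpolating polynomial through (xs[i], ys[i])."""
    def f(x):
        total = Fraction(0)
        for i, (xi, yi) in enumerate(zip(xs, ys)):
            term = Fraction(yi)
            for j, xj in enumerate(xs):
                if j != i:
                    term *= Fraction(x - xj, xi - xj)
            total += term
        return total
    return f

xs = [0, 1, 2, 3, 4, 5]
ys = [3, 5, 7, 11, 13, 17]
f = lagrange(xs, ys)

def poly60(x):
    """The polynomial 60 f(x) as quoted in the text."""
    return 7 * x**5 - 85 * x**4 + 355 * x**3 - 575 * x**2 + 418 * x + 180
```

Two polynomials of degree at most 5 that agree at six points are identical, so comparing `60 * f(x)` with `poly60(x)` at any further points is a consistency check on the quoted coefficients.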

By these examples, we see the functions must be more complex than linear or polynomial, if they are to have all prime values.

Theorem There is a number $\theta \in \mathbb{R}$ such that $f(n) = \left\lfloor\theta^{3^n}\right\rfloor$ is prime for all $n \in \mathbb{N}$.

Note: This formula is not effective, since to know $\theta$ exactly, we would need to be able to recognise arbitrarily large primes ($\theta \approx 1.3064\ldots$).

Lemma If $u_1 \le u_2 \le \cdots \le u_n \le \cdots \le B$ is a bounded increasing real sequence, then
$$\lim_{n \to \infty} u_n = \theta$$
exists.

Lemma If $A \le \cdots \le v_n \le \cdots \le v_2 \le v_1$ is a decreasing real sequence which is bounded below, then
$$\lim_{n \to \infty} v_n = \alpha$$
also exists.

Assumption: $\exists A \in \mathbb{N}$ such that if $n \ge A$, $\exists p \in \mathbb{P}$ with $n^3 < p < (n + 1)^3 - 1$.

"The proof of this assumption is very difficult" (Elementary Number Theory by Underwood Dudley). Indeed, but easier than $\exists p$ so $n^2 < p < (n + 1)^2$, I would say.

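Proof of the Theorem. Let $p_1$ be any prime with $A < p_1$ and for $n = 1, 2, 3, \ldots$ let $p_{n+1}$ be a prime with
$$p_n^3 < p_{n+1} < (p_n + 1)^3 - 1.$$

Let $u_n = p_n^{3^{-n}}$ and $v_n = (p_n + 1)^{3^{-n}}$, $n = 1, 2, \ldots$. Then $u_{n+1} = p_{n+1}^{3^{-n-1}} > \left(p_n^3\right)^{3^{-n-1}} = p_n^{3^{-n}} = u_n$. So $\{u_n\}$ is increasing. Also $\{v_n\}$ is decreasing, since $v_{n+1} = (p_{n+1} + 1)^{3^{-n-1}} < \left((p_n + 1)^3 - 1 + 1\right)^{3^{-n-1}} = (p_n + 1)^{3^{-n}} = v_n$. From their definitions above, $u_n < v_n$ $\forall n \in \mathbb{N}$. Hence, by the two Lemmas above,
$$\lim_{n \to \infty} u_n = \theta \quad\text{and}\quad \lim_{n \to \infty} v_n = \alpha.$$

Since $u_n < v_n$ we have $\theta \le \alpha$, indeed $u_n < \theta \le \alpha < v_n$ because $\{u_n\}$ is strictly increasing and $\{v_n\}$ strictly decreasing. Therefore $u_n^{3^n} < \theta^{3^n} \le \alpha^{3^n} < v_n^{3^n}$ $\forall n \in \mathbb{N}$. But, from their definitions, $u_n^{3^n} = p_n$ and $v_n^{3^n} = p_n + 1$, so $p_n < \theta^{3^n} < p_n + 1 \Rightarrow p_n = \left\lfloor\theta^{3^n}\right\rfloor$ $\forall n \in \mathbb{N}$, so $\left\lfloor\theta^{3^n}\right\rfloor$ is prime.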
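The construction in the proof can be followed numerically for a few steps. This is only a sketch under assumptions: it starts from $p_1 = 2$ (the proof requires $p_1 > A$ for an unspecified $A$, so this starting value is a guess that happens to work for the steps shown), it always takes the least admissible prime, and it uses floating point for $u_n$ and $v_n$.

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test (slow but sufficient here)."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def next_prime_after(m):
    """Least prime strictly greater than m."""
    p = m + 1
    while not is_prime(p):
        p += 1
    return p

# Assumption: start at p_1 = 2 and take the least prime above p_n^3 each time.
primes = [2]
for _ in range(3):
    primes.append(next_prime_after(primes[-1] ** 3))

# u_n = p_n^(3^-n) and v_n = (p_n + 1)^(3^-n), with n = index + 1.
u = [p ** (3.0 ** -(i + 1)) for i, p in enumerate(primes)]
v = [(p + 1) ** (3.0 ** -(i + 1)) for i, p in enumerate(primes)]
```

The lists `u` and `v` squeeze towards a common limit near 1.3064 from below and above, which is how $\theta$ arises; going further needs primality tests on numbers with dozens of digits, which is the sense in which the formula is not effective.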

This Theorem would be valuable if we could work out the value of θ without reference to primes.

Ex $f(n) = \sin\left(\dfrac{\pi(1 + (n - 1)!)}{n}\right)$; then for $n > 1$, $f(n) = 0 \Leftrightarrow n \in \mathbb{P}$ (Wilson's theorem).
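Since $\sin(\pi x) = 0$ exactly when $x$ is an integer, the condition $f(n) = 0$ says that $n$ divides $1 + (n - 1)!$. A short sketch using exact integer arithmetic instead of the sine (the function name is an illustrative choice):

```python
from math import factorial

def wilson_is_prime(n):
    """True iff n > 1 and n divides 1 + (n-1)!, i.e. sin(pi*(1+(n-1)!)/n) = 0."""
    return n > 1 and (1 + factorial(n - 1)) % n == 0

primes_found = [n for n in range(1, 30) if wilson_is_prime(n)]
```

The factorial grows so quickly that this is a characterisation of primes rather than a practical test.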

Another Catalan Conjecture (1876) Let $p_0 = 2 \in \mathbb{P}$ and $p_{n+1} = 2^{p_n} - 1$ for $n = 0, 1, 2, \ldots$; then

$p_1 = 2^2 - 1 = 3 \in \mathbb{P}$
$p_2 = 2^3 - 1 = 7 \in \mathbb{P}$
$p_3 = 2^7 - 1 = 127 \in \mathbb{P}$

and $p_4 \in \mathbb{P}$. Is $p_j \in \mathbb{P}$ $\forall j$? $\{p_j\}$ increases very rapidly.

Theorem (Matijasevič, 1971) There exists a multinomial $p$ ($p \in \mathbb{Z}[a, b, c, \ldots, z]$ of degree 25 (Jones, Sato, Wada, 1975)) such that the set of prime numbers coincides with the set of positive values assumed by this multinomial, as the variables range in the set of non-negative integers $\mathbb{Z}^+ = \mathbb{N} \cup \{0\}$.

Proposition (Dixon) If $p$ is the multinomial, $r = 2 + \frac{1}{2}(p - 2 + |p - 2|)$ is a function (not a multinomial) with range exactly the set of primes.
Proof. $p(x) \le 0 \Rightarrow p - 2 < 0 \Rightarrow |p - 2| = 2 - p$ so $r = 2 + \frac{1}{2} \cdot 0 = 2 \in \mathbb{P}$, and $p(x) > 0 \Rightarrow p(x) \ge 2 \Rightarrow r = 2 + \frac{1}{2}(p - 2 + p - 2) = 2 + p - 2 = p$, so $r(x) = p(x) \in \mathbb{P}$.

(Jones, 1979) $F_0 = 1$, $F_1 = 1$ and $F_{n+1} = F_n + F_{n-1}$, $n \ge 1$: the Fibonacci numbers are the set of positive values at non-negative integers of

$$p(x, y) = 2xy^4 + x^2y^3 - 2x^3y^2 - y^5 - x^4y + 2y$$
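The claim can be explored by tabulating the positive values of $p(x, y)$ on a small grid. A sketch, with the grid bound `B = 50` an arbitrary choice:

```python
def p(x, y):
    """Jones' polynomial as given in the text."""
    return 2*x*y**4 + x**2*y**3 - 2*x**3*y**2 - y**5 - x**4*y + 2*y

B = 50
positive_values = sorted({p(x, y)
                          for x in range(B + 1)
                          for y in range(B + 1)
                          if p(x, y) > 0})
```

One way to see why this works: the polynomial factors as $y\left(2 - (y^2 - xy - x^2)^2\right)$, which is positive only when $y^2 - xy - x^2 = \pm 1$, and then its value is $y$.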

Hilbert's 10th Problem: There is no algorithm which is good enough to decide whether any given diophantine equation has a solution in positive integers (Matijasevič).

(Siegel, 1972) Every quadratic diophantine equation is decidable.

Unknown: Is every multinomial in 2 variables decidable?

These results and questions relate to the axiomatic and logical foundation of arithmetic, e.g. are some problems simply impossible to solve because we do not have an appropriate set of properties of numbers to begin with?

Resistant problems:

1. Twin primes conjecture: $\exists$ an infinite set of $p_n \in \mathbb{P}$ so that $p_n + 2 \in \mathbb{P}$ also.

2. There exist infinitely many Sophie Germain primes, i.e. $p_n \in \mathbb{P}$ and $2p_n + 1 \in \mathbb{P}$.

3. $M_p = 2^p - 1$ is a Mersenne number. Are infinitely many composite? We believe so.

A simple new "axiom"/"conjecture" will resolve each of these questions.

15 Axiom D

(Dirichlet) $(a, b) = 1$, $a \ne 0$, $b \ge 1$, $f(x) = bx + a$; then $\exists$ an infinite number of integers $m > 0$ with $f(m) \in \mathbb{P}$.

Conjecture/Axiom (Dickson, 1904) Let $s \ge 1$ and $f_j(x) = b_jx + a_j$ with $b_j \ge 1$ and $a_j, b_j \in \mathbb{Z}$. If $\nexists\, n > 1$ with

$$n \mid f_1(k)f_2(k) \cdots f_s(k)\ \ \forall k \in \mathbb{Z} \quad\text{OR}\quad \forall n > 1,\ \exists k \in \mathbb{Z} \text{ so } n \nmid f_1(k)f_2(k) \cdots f_s(k) \qquad (\star)$$

then there exist infinitely many $m \in \mathbb{N}$ with $\{f_1(m), \ldots, f_s(m)\}$ all primes.

This is Axiom D, the weakest form of a more general Axiom H where the linear polynomials $f_j$ are replaced with polynomials of arbitrary degree.

Proposition Axiom D $\Leftrightarrow$ [$(\star) \Rightarrow \exists m \in \mathbb{N}$ so $\{f_1(m), \ldots, f_s(m)\}$ are primes].
Proof. ($\Rightarrow$) Follows directly. ($\Leftarrow$) $\exists m_1 \ge 1$ so $f_1(m_1), \ldots, f_s(m_1)$ are primes. Let $g_j(x) = f_j(x + 1 + m_1)$ for $1 \le j \le s$. Then $(\star)$ is satisfied by the $\{g_j\}$, hence $\exists k_1 \ge 1$ so $g_1(k_1), \ldots, g_s(k_1)$ are primes. Let $m_2 = k_1 + 1 + m_1 > m_1 + 1$, so $f_1(m_2), \ldots, f_s(m_2)$ are primes. Repetition of this procedure generates infinitely many $m_j \in \mathbb{N}$.

Theorem Axiom D $\Rightarrow \forall m \ge 1$ there exist infinitely many arithmetic progressions consisting of $m$ Sophie Germain primes.
Proof. Let $d = (2m + 2)! \ge 4!$ (even). Consider the $2m$ polynomials:

$f_1(x) = x + d$
$f_2(x) = x + 2d$
$\vdots$
$f_m(x) = x + md$
$f_{m+1}(x) = 2x + 2d + 1$
$f_{m+2}(x) = 2x + 4d + 1$
$\vdots$
$f_{2m}(x) = 2x + 2md + 1$

so $f_{m+j}(x) = 2f_j(x) + 1$, $1 \le j \le m$. These polynomials satisfy $(\star)$: Let $f(x) = \prod_{j=1}^{2m} f_j(x)$, with degree $2m$ and leading coefficient $2^m$. Let $p \in \mathbb{P}$ divide $f(k)$ for $k = -1, 0, 1, \ldots, p - 2$ (1). Now $f(-1) = \prod_{j=1}^{2m}(\text{odd}) \equiv 1 \pmod 2 \Rightarrow p \ne 2$, so $f(x) \equiv 0 \pmod p$ has $2m$ roots (in a field extension of $\mathbb{F}_p$), but it has $p$ roots from (1). Hence $p \le 2m$ and $p \mid d = (2m + 2)!$. But we CLAIM $f(-1) \equiv 1 \pmod 3$:

$$f(-1) = (d - 1)(2d - 1) \cdots (2d - 1)(4d - 1) \cdots \equiv (-1)^{2m} \equiv 1 \pmod 3.$$
Hence $p \ne 3$. But

$$f(1) = \underbrace{(1 + d)(1 + 2d) \cdots}_{m \text{ factors}}\ \underbrace{(3 + 2d)(3 + 4d) \cdots}_{m \text{ factors}} \equiv 3^m \pmod p$$

since $p \mid d$. Hence $p \nmid f(1)$, which is a contradiction. Hence the $\{f_j\}$ satisfy $(\star)$. By Axiom D there exist infinitely many $k$ so $f_i(k) = p_i$ and $f_{m+i}(k) = 2p_i + 1$ are primes for $i = 1, \ldots, m$. Moreover $p_1 < p_2 < \cdots < p_m$ are in progression with difference $d$.

Corollary Axiom D ⇒ there exist infinitely many Sophie Germain primes.

Proposition Let $a, b, c$ be pairwise relatively prime non-zero integers (i.e. each consists of products of different primes). There exist infinitely many pairs of primes $(p, q)$ so $ap - bq = c$, assuming Axiom D.

Proposition (Schinzel and Sierpiński, 1958) Axiom D $\Rightarrow$ there exist infinitely many $n$ with $\frac{1}{2}\varphi(n) \in \mathbb{P}$, where $\varphi$ is Euler's phi function.

Proposition There exist infinitely many triples of consecutive integers, each being the product of two distinct primes, assuming Axiom D.
Proof. Let

$f_1(x) = 10x + 1$
$f_2(x) = 15x + 2$
$f_3(x) = 6x + 1$

Then $f_1(0)f_2(0)f_3(0) = 2$ and $f_1(1)f_2(1)f_3(1) = 11 \cdot 17 \cdot 7$ $\Rightarrow (\star)$ is satisfied. Thus there exist infinitely many integers $m \ge 1$ such that

$p = 10m + 1$, $q = 15m + 2$, $r = 6m + 1$ are primes.

Then

$3p = 30m + 3$
$3p + 1 = 30m + 4 = 2q$
$3p + 2 = 30m + 5 = 5r$

so $\{3p, 3p + 1, 3p + 2\}$ are products of two distinct primes.
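The triples in this proof can be listed for small $m$. A sketch, with the search bound 200 chosen arbitrarily and the helper names introduced for illustration:

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

def two_distinct_primes(n):
    """True iff n = p*q with p, q distinct primes."""
    for d in range(2, isqrt(n) + 1):
        if n % d == 0:
            return is_prime(d) and is_prime(n // d) and d != n // d
    return False

triples = []
for m in range(1, 200):
    p, q, r = 10 * m + 1, 15 * m + 2, 6 * m + 1
    if is_prime(p) and is_prime(q) and is_prime(r):
        triples.append((3 * p, 3 * p + 1, 3 * p + 2))
```

The first hit is $m = 1$, giving $33 = 3 \cdot 11$, $34 = 2 \cdot 17$, $35 = 5 \cdot 7$. Whether the list is infinite is exactly what Axiom D asserts; the code only samples it.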

Theorem Axiom D $\Rightarrow$ there exist infinitely many composite Mersenne numbers ($M_p = 2^p - 1$).
Proof. Let $f_1(x) = 4x - 1$, $f_2(x) = 8x - 1$. Then $f_1(0)f_2(0) = 1 \Rightarrow (\star)$. Hence there exist infinitely many $m \ge 1$ such that $p = 4m - 1$ and $q = 8m - 1$ are primes.

But then $q = 2p + 1$ and $p \equiv 3 \pmod 4$. Claim $q \mid 2^p - 1$: Consider the Legendre symbol $(2 \mid q) = (-1)^{\frac{q^2 - 1}{8}}$. Since $q = 8m - 1 \equiv -1 \pmod 8$, we have $q^2 - 1 = 16m(4m - 1)$, so $\frac{q^2 - 1}{8}$ is even. So $(2 \mid q) = 1 \equiv 2^{\frac{q - 1}{2}} = 2^p \pmod q$. Hence $q \mid 2^p - 1$.

Now, if $m > 1$ the corresponding primes $p, q$ satisfy $2^p - 1 = 2^{4m-1} - 1 > 8m - 1 = q \Leftrightarrow 16^m - 2 > 16m - 2 \Leftrightarrow 16^m > 16m$, which is true. So $q \mid 2^p - 1$ is a proper divisor and $M_p = 2^p - 1$ is composite.
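The divisibility claim $q \mid 2^p - 1$ can be checked for the small pairs $(p, q) = (4m - 1, 8m - 1)$ without forming $2^p - 1$. A sketch, with the bound 200 an arbitrary choice:

```python
from math import isqrt

def is_prime(n):
    """Trial-division primality test, adequate for small n."""
    if n < 2:
        return False
    return all(n % d for d in range(2, isqrt(n) + 1))

# Pairs p = 4m - 1, q = 8m - 1 = 2p + 1, both prime, with m > 1.
pairs = [(4 * m - 1, 8 * m - 1) for m in range(2, 200)
         if is_prime(4 * m - 1) and is_prime(8 * m - 1)]

# q | 2^p - 1  is the same as  2^p = 1 (mod q).
checks = [pow(2, p, q) == 1 for p, q in pairs]
```

The first pair is $(11, 23)$, recovering the familiar factor 23 of $M_{11} = 2047$.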

Theorem Let $a_1 < a_2 < \cdots < a_s$ be non-zero integers and assume $f_1(x) = x + a_1, \ldots, f_s(x) = x + a_s$ satisfy $(\star)$. Then there exist infinitely many integers $m \ge 1$ so $\{m + a_1, m + a_2, \ldots, m + a_s\}$ are consecutive primes.

Theorem Axiom D $\Rightarrow \forall k \in \mathbb{N}$ there exist infinitely many pairs of consecutive primes with difference $2k$. In particular, there exist infinitely many pairs of twin primes.
Proof. Let

$f_1(x) = x + 1$, $f_2(x) = x + 2k + 1$; then $f_1(0)f_2(0) = 1 + 2k = a$ and $f_1(1)f_2(1) = 2(2 + 2k) = b$

and

$$(a, b) = (1 + 2k, 4(1 + k)) = (1 + 2k, 1 + k) = 1$$

since $2(1 + k) - (1 + 2k) = 1$. Hence $\{f_1, f_2\}$ satisfy $(\star)$.

By the previous Theorem, since $1 < 2k + 1$, there exist infinitely many integers $m \ge 1$ so $f_1(m), f_2(m)$ are consecutive primes, i.e. $p = m + 1$ and $q = m + 2k + 1 = p + 2k$ are consecutive primes, so $\{p, p + 2k\}$ are pairs with the given property.

Axiom D has a number of other consequences, e.g. on the existence of primes in arithmetic progressions.

Let $1 < n$ and $d$ a multiple of $\prod_{p \le n} p$. Then there exist infinitely many arithmetic progressions with difference $d$, each consisting of $n$ consecutive primes.

Proving Axiom D: First try $s = 2$, generalising $s = 1$. Showing Axiom D is independent: very difficult and unlikely.

16 Partitions

Generating functions arise because if $n, m \in \mathbb{Z}$, addition $n + m$ (of integers) corresponds to multiplication $z^n \cdot z^m$ (of polynomials/series).

Ex (Lagrange) $\forall n$ $\exists x_i$, $1 \le i \le 4$, $n = x_1^2 + x_2^2 + x_3^2 + x_4^2$ is equivalent to: if $\left(1 + z^{1^2} + z^{2^2} + \cdots + z^{n^2} + \cdots\right)^4 = f(z) = a_0 + a_1z + a_2z^2 + \cdots$ then $a_i > 0$ $\forall i = 0, 1, 2, \ldots$.

Change Making How many ways can we make change for $n \in \mathbb{N}$ if the coins are of denomination 1, 2 and 3, i.e. given $n$, how many different solutions are there to $n = 1x + 2y + 3z$ in integers $x, y, z \ge 0$?

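Let $|z| < 1$ and write, using the sum to $\infty$ of a geometric series:
$$\frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots = 1 + z^1 + z^{1+1} + z^{1+1+1} + \cdots$$
$$\frac{1}{1 - z^2} = 1 + z^2 + z^{2+2} + z^{2+2+2} + \cdots$$
$$\frac{1}{1 - z^3} = 1 + z^3 + z^{3+3} + z^{3+3+3} + \cdots$$
so
$$\frac{1}{(1 - z)(1 - z^2)(1 - z^3)} = (1 + z^1 + z^{1+1} + \cdots)(1 + z^2 + z^{2+2} + \cdots)(1 + z^3 + z^{3+3} + \cdots) = \sum_{n=0}^{\infty} c(n)z^n, \qquad c(0) = 1.$$
What happens when we multiply out the RHS? We get terms like $z^{1+1+1+1} \cdot z^2 \cdot z^{3+3} = z^{12}$, but this is $z$ to the power (four '1's + one '2' + two '3's), i.e. a method of changing 12 into '1's, '2's and '3's. Every way of changing 12 will appear, so $c(12)$ is exactly the number of ways 12 can be 'changed'. Similarly, $c(n)$ is the number of ways $n$ can be changed:

$$\sum_{n=0}^{\infty} c(n)z^n = \frac{1}{(1 - z)(1 - z^2)(1 - z^3)}$$
so our number theory counting problem has been transformed into an analytic problem, i.e. finding coefficients of a Taylor series. Don't do the series multiplication to get the $c(n)$s. Use partial fractions (check by simplifying the RHS):
$$\frac{1}{(1 - z)(1 - z^2)(1 - z^3)} = \frac{1}{6}\frac{1}{(1 - z)^3} + \frac{1}{4}\frac{1}{(1 - z)^2} + \frac{1}{4}\frac{1}{(1 - z^2)} + \frac{1}{3}\frac{1}{(1 - z^3)}$$
Then
$$\frac{d}{dz}\left(\frac{1}{1 - z}\right) = \frac{1}{(1 - z)^2} = \frac{d}{dz}\left(\sum_{n=0}^{\infty} z^n\right) = \sum_{n=0}^{\infty}(n + 1)z^n$$

and

$$\frac{d}{dz}\left(\frac{1}{2(1 - z)^2}\right) = \frac{1}{(1 - z)^3} = \frac{d}{dz}\left(\sum_{n=0}^{\infty}\frac{n + 1}{2}z^n\right) = \sum_{n=0}^{\infty}\frac{(n + 1)(n + 2)}{2}z^n$$

$$\Rightarrow\; c(n) = \frac{1}{6} \cdot \frac{(n + 1)(n + 2)}{2} + \frac{1}{4}(n + 1) + \begin{cases}\frac{1}{4} & n \text{ even}\\ 0 & \text{otherwise}\end{cases} + \begin{cases}\frac{1}{3} & 3 \mid n\\ 0 & \text{otherwise}\end{cases} = \left\lfloor\frac{n^2}{12} + \frac{n}{2} + 1\right\rfloor \text{ (see below)}$$

If $2 \mid n$ and $3 \mid n$ we get

$$c(n) = \frac{n^2}{12} + \left(\frac{3}{12} + \frac{1}{4}\right)n + \frac{2}{12} + \frac{1}{4} + \frac{1}{4} + \frac{1}{3} = \frac{n^2}{12} + \frac{n}{2} + 1$$

But $c(n) \in \mathbb{N}$, so
$$c(n) = \left\lfloor\frac{n^2}{12} + \frac{n}{2} + 1\right\rfloor$$

If $2 \mid n$ and $3 \nmid n$:
$$c(n) = \frac{n^2}{12} + \frac{n}{2} + 1 - \frac{1}{3} = \left\lfloor\frac{n^2}{12} + \frac{n}{2} + 1\right\rfloor$$
since $c(n) \in \mathbb{N}$. Similarly, if $2 \nmid n$ and $3 \mid n$:
$$c(n) = \frac{n^2}{12} + \frac{n}{2} + 1 - \frac{1}{4} = \left\lfloor\frac{n^2}{12} + \frac{n}{2} + 1\right\rfloor$$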
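The closed form can be compared with a direct count. In this sketch `count_change` multiplies out the three geometric series coefficient by coefficient (the standard coin-change recurrence), and `closed_form` evaluates $\lfloor n^2/12 + n/2 + 1 \rfloor$ in integer arithmetic; both names are introduced here.

```python
def count_change(n, coins=(1, 2, 3)):
    """Coefficient of z^n in 1/((1-z)(1-z^2)(1-z^3)), by dynamic programming."""
    ways = [1] + [0] * n
    for coin in coins:
        # Multiply the series so far by 1/(1 - z^coin).
        for t in range(coin, n + 1):
            ways[t] += ways[t - coin]
    return ways[n]

def closed_form(n):
    """floor(n^2/12 + n/2 + 1) = floor((n^2 + 6n + 12)/12)."""
    return (n * n + 6 * n + 12) // 12

agree = all(count_change(n) == closed_form(n) for n in range(0, 200))
```

The comparison covers every residue of $n$ mod 6, including the case $2 \nmid n$, $3 \nmid n$ that the text leaves to the reader.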

Crazy Dice

Normal dice have faces labelled 1 to 6. When two are tossed there exist $6 \times 6 = 36$ equally likely outcomes, e.g. the probability of $(6, 6)$ is $\frac{1}{36}$. What are the probabilities for sums $s = x + y$, $2 \le s \le 12$? Let
$$p(z) = z^1 + z^2 + z^3 + z^4 + z^5 + z^6.$$
Combined possibilities for sums are encoded in

$$(z^1 + z^2 + z^3 + z^4 + z^5 + z^6)(z^1 + z^2 + z^3 + z^4 + z^5 + z^6) = z^2 + 2z^3 + 3z^4 + 4z^5 + 5z^6 + 6z^7 + 5z^8 + 4z^9 + 3z^{10} + 2z^{11} + z^{12}$$
so there are 3 ways in which we can achieve $s = 10$:

5 + 5 = 10
6 + 4 = 10
4 + 6 = 10

Question: Can we label the two cubes with other positive integers and obtain the same frequencies for sums? i.e. do there exist $a_1, \ldots, a_6; b_1, \ldots, b_6 \in \mathbb{N}$ so

$$p_a(z) \cdot p_b(z) = (z^{a_1} + z^{a_2} + z^{a_3} + z^{a_4} + z^{a_5} + z^{a_6})(z^{b_1} + z^{b_2} + z^{b_3} + z^{b_4} + z^{b_5} + z^{b_6}) = z^2 + 2z^3 + 3z^4 + 4z^5 + 5z^6 + 6z^7 + 5z^8 + 4z^9 + 3z^{10} + 2z^{11} + z^{12}\,?$$

Call these Crazy Dice:

$$\text{LHS} = (z + z^2 + z^3 + z^4 + z^5 + z^6)^2 = \left(\frac{z(1 - z^6)}{1 - z}\right)^2 = \left(\frac{z(1 - z^2)(1 + z^2 + z^4)}{1 - z}\right)^2 = \left[z(1 + z)(1 + z + z^2)(1 - z + z^2)\right]^2$$

Since $\mathbb{Z}[x]$ is a unique factorisation domain, the polynomials $p_a$ and $p_b$ must consist of these factors. Since $a_i \ge 1$, $b_i \ge 1$, $1 \le i \le 6$, a factor $z$ must occur in both. $p_a(1) = 1^{a_1} + 1^{a_2} + 1^{a_3} + 1^{a_4} + 1^{a_5} + 1^{a_6} = 1 + 1 + 1 + 1 + 1 + 1 = 6$, so $(1 + z + z^2)(1 + z)$ must appear in a factorization of $p_a$ (these take the values 3 and 2 at $z = 1$, whereas $1 - z + z^2$ takes the value 1). The same applies to $p_b$. This leaves the two factors $(1 - z + z^2)$ to distribute. One to each $\to$ normal dice. Both to $p_a$ $\to$ crazy dice.

$$p_a(z) = z(1 + z)(1 + z + z^2)(1 - z + z^2)^2 = z + z^3 + z^4 + z^5 + z^6 + z^8$$
$$p_b(z) = z(1 + z + z^2)(1 + z) = z + 2z^2 + 2z^3 + z^4$$
so $\{1, 3, 4, 5, 6, 8\}$ and $\{1, 2, 2, 3, 3, 4\}$ are the labels.
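The two labellings can be compared directly by tabulating all 36 sums for each pair of dice. A short sketch (the variable names are illustrative):

```python
from collections import Counter
from itertools import product

def sum_counts(die_a, die_b):
    """Count how often each sum occurs over all pairs of faces."""
    return Counter(a + b for a, b in product(die_a, die_b))

normal = [1, 2, 3, 4, 5, 6]
crazy_a = [1, 3, 4, 5, 6, 8]
crazy_b = [1, 2, 2, 3, 3, 4]

normal_counts = sum_counts(normal, normal)
crazy_counts = sum_counts(crazy_a, crazy_b)
```

The `Counter` for each pair lists the coefficients of the product polynomial, so equality of the two counters is the statement $p_a(z)\,p_b(z) = p(z)^2$.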

Representation Function

Let $A \subset \mathbb{Z}^+ = \mathbb{N} \cup \{0\}$ be a subset of the non-negative integers.

How many ways can a given $n \in \mathbb{N}$ be written as the sum of two elements of $A$?

• Order counts and the summands can be equal:

$$r(n) = \#\{(a, b) \in A \times A : n = a + b\}$$

• Order does not count, but they can be equal:

$$r_+(n) = \#\{(a, b) \in A \times A : a \le b,\ n = a + b\}$$

• Order does not count, and they cannot be equal:

$$r_-(n) = \#\{(a, b) \in A \times A : a < b,\ n = a + b\}$$

Let $A(z)$ be the generating function for the set $A$, i.e.
$$A(z) = \sum_{n \in A} z^n.$$
Then
$$\sum_{n=0}^{\infty} r(n)z^n = (A(z))^2 = A^2(z)$$
and
$$\sum_{n=0}^{\infty} r_-(n)z^n = \frac{1}{2}\left[A^2(z) - A(z^2)\right] = B(z) \text{ say;}$$
finally
$$\sum_{n=0}^{\infty} r_+(n)z^n = B(z) + A(z^2) = \frac{1}{2}\left[A^2(z) + A(z^2)\right].$$

Question: Is there an infinite set $A \subset \mathbb{Z}^+$, with $A \ne \emptyset$, for which $r_+(n) = C$ = constant $\forall n$ or $\forall n \ge n_0$?

Then $\frac{1}{2}\left(A^2(z) + A(z^2)\right) = \frac{C}{1 - z} + P(z)$ where $P$ is a polynomial with $\partial P < n_0$.

Let $z \to -1^+$. Then $|P(z)| \le B_1$, a bound, and $\left|\frac{C}{1 - z}\right| \le B_2$, a bound, so the RHS is bounded; but $A^2(z) \ge 0$ and $A(z^2) \to A(1) \to \infty$, so the LHS is unbounded. Hence the answer to the question is no.

Question: Can we split $\mathbb{Z}^+$ into two disjoint sets $A$ and $B$ so that every non-negative integer is expressible in the same number of ways as the sum of two distinct members of $A$ as it is as the sum of two distinct members of $B$?

Trial-and-error: Let $0 \in A$; then $1 \in B$, else $1 = 1 + 0 = a + a'$ but not $1 = b + b'$. Then $2 \in B$, else $2 = 2 + 0 = a + a' \ne b + b'$ ($1 + 1$ is not distinct). Then $3 \in A$, else $3 \ne a + a'$ whereas $3 = 1 + 2 = b + b'$, etc. Then
$$A = \{0, 3, 5, 6, 9, \ldots\}, \qquad B = \{1, 2, 4, 7, 8, \ldots\}.$$
What is the pattern? Are $A$ and $B$ unique?

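Use generating functions $A(z)$ for $A$ and $B(z)$ for $B$, so

\[ \frac{1}{2}\left[ A^2(z) - A(z^2) \right] = \frac{1}{2}\left[ B^2(z) - B(z^2) \right] \qquad (1) \]

Also, because $A \sqcup B = \mathbb{Z}^+$ is a splitting,

\[ A(z) + B(z) = \frac{1}{1 - z} = 1 + z + z^2 + z^3 + \cdots \qquad (2) \]

(1) ⇒ $A^2(z) - B^2(z) = A(z^2) - B(z^2)$, so $(A(z) - B(z))(A(z) + B(z)) = A(z^2) - B(z^2)$.

(2) ⇒
\[ \frac{A(z) - B(z)}{1 - z} = A(z^2) - B(z^2) \]
\[ \Rightarrow A(z) - B(z) = (1 - z)\left[ A(z^2) - B(z^2) \right] \quad \forall z, |z| < 1. \]

Replacing $z$ by $z^2$:
\[ A(z^2) - B(z^2) = (1 - z^2)\left[ A(z^4) - B(z^4) \right] \]
\[ \Rightarrow A(z) - B(z) = (1 - z)(1 - z^2)\left[ A(z^4) - B(z^4) \right] \]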

Iterating this gives:

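\[ A(z) - B(z) = (1 - z)(1 - z^2)(1 - z^4) \cdots \left(1 - z^{2^{n-1}}\right) \left[ A\left(z^{2^n}\right) - B\left(z^{2^n}\right) \right] \]

But $A(0) = 1$, $B(0) = 0$ and $z^{2^n} \to 0$ as $n \to \infty$ since $|z| < 1$, so

\[ A(z) - B(z) = \prod_{j=0}^{\infty} \left(1 - z^{2^j}\right) \left[ A(0) - B(0) \right] = \prod_{j=0}^{\infty} \left(1 - z^{2^j}\right) \qquad (3) \]

We can easily multiply this out!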

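Every $n \in \mathbb{Z}^+$ has a unique binary representation, i.e. an expression as a sum of distinct powers of 2, $2^j$. So when (3) is multiplied out, each $z^n$ arises exactly once, from the factors $-z^{2^j}$ with $2^j$ in the binary representation of $n$. Indeed,

• n = sum of an even number of powers of 2 ⇒ $z^n$ has coefficient +1.

• n = sum of an odd number of powers of 2 ⇒ $z^n$ has coefficient −1.

So A = {n : n is an even ...}, B = {n : n is an odd ...}.

This is not trivial, not something we might have guessed.

\[
A = \left\{ \begin{array}{l} 0 = 0 \\ 3 = 2^0 + 2^1 \\ 5 = 2^0 + 2^2 \\ 6 = 2^1 + 2^2 \\ 9 = 2^0 + 2^3 \\ \vdots \end{array} \right.
\qquad
B = \left\{ \begin{array}{l} 1 = 2^0 \\ 2 = 2^1 \\ 4 = 2^2 \\ 7 = 2^0 + 2^1 + 2^2 \\ 8 = 2^3 \\ \vdots \end{array} \right.
\]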
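The splitting by the parity of the number of binary digits equal to 1 can be tested numerically. A minimal sketch follows; the cut-off 256 and the function names are assumptions chosen for illustration.

```python
def bit_parity_split(limit):
    """Split 0..limit-1 by the parity of the number of 1s in the binary expansion."""
    A = [n for n in range(limit) if bin(n).count("1") % 2 == 0]
    B = [n for n in range(limit) if bin(n).count("1") % 2 == 1]
    return A, B


def distinct_pair_count(S, n):
    """Number of ways to write n = s + t with s < t and both s, t in S."""
    members = set(S)
    return sum(1 for s in members if s < n - s and (n - s) in members)


A, B = bit_parity_split(256)
# For n < 256 every summand is below 256, so the truncated sets suffice.
balanced = all(distinct_pair_count(A, n) == distinct_pair_count(B, n)
               for n in range(256))
print(A[:5], B[:5], balanced)
```

The first few members printed agree with the sets found by trial and error.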

Euler’s Identity

Consider the number of ways of expressing n as the sum of (any number of) distinct positive integers, p(n):

6 = 1 + 2 + 3 = 2 + 4 = 1 + 5 = 6 so p(6) = 4. Also express n as the sum of positive odd numbers, q(n) allowing repeats so:

6 = 1 + 5 = 3 + 3 = 1 + 1 + 1 + 3 = 1+1+1+1+1+1

so q(6) = 4 and p(6) = q(6). This is not a coincidence!

Theorem (Euler) The number of ways of expressing $n$ as the sum of distinct positive integers equals the number of ways of expressing $n$ as a sum of (not necessarily distinct) odd positive integers.

Proof. We must prove $\sum_{n=0}^{\infty} p(n) z^n = \sum_{n=0}^{\infty} q(n) z^n$, i.e.

\[ (1 + z)(1 + z^2)(1 + z^3) \cdots = \frac{1}{(1 - z)(1 - z^3)(1 - z^5) \cdots} \]

This is Euler's identity.

Consider

\[ \frac{1}{\text{RHS}} \times \text{LHS} = (1 - z)(1 - z^3)(1 - z^5) \cdots (1 + z)(1 + z^2)(1 + z^3) \cdots. \]

This is

\[
\begin{aligned}
& (1 - z)(1 + z)(1 - z^3)(1 + z^3) \cdots (1 + z^2)(1 + z^4)(1 + z^6) \cdots \\
&= (1 - z^2)(1 - z^6)(1 - z^{10}) \cdots (1 + z^2)(1 + z^4)(1 + z^6) \cdots = P(z), \text{ say.}
\end{aligned}
\]

But then

\[
\begin{aligned}
P(z^2) &= (1 - z^4)(1 - z^{12}) \cdots (1 + z^4)(1 + z^8) \cdots \\
&= (1 - z^2)(1 + z^2)(1 - z^6)(1 + z^6) \cdots (1 + z^4)(1 + z^8) \cdots \\
&= (1 - z^2)(1 - z^6) \cdots (1 + z^2)(1 + z^4) \cdots = P(z).
\end{aligned}
\]

So $P(z) = P(z^2)$, so $P'(z) = 2zP'(z^2)$ and $P'(0) = 0$. Similarly $P''(0) = 0$, $P'''(0) = 0$, ..., $P^{(n)}(0) = 0$ for all $n \in \mathbb{N}$, so (Taylor expansion) $P(z) = P(0) + 0 = P(0)$ for all $|z| < 1$. But $P(0) = (1 - 0)(1 - 0) \cdots (1 + 0)(1 + 0) \cdots = 1$. Hence $P(z) = 1$ for all $|z| < 1$, so LHS = RHS and we have proved Euler's mysterious identity. □
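Euler's theorem can be checked for small $n$ by counting both kinds of partition. The sketch below is one possible way to do so; the function names are chosen here for illustration.

```python
def count_partitions(n, parts, distinct):
    """Count partitions of n using the allowed parts; each part at most once if distinct."""
    ways = [1] + [0] * n
    for part in parts:
        if distinct:
            # Descending order stops the same part from being used twice.
            for total in range(n, part - 1, -1):
                ways[total] += ways[total - part]
        else:
            # Ascending order lets a part repeat any number of times.
            for total in range(part, n + 1):
                ways[total] += ways[total - part]
    return ways[n]


def distinct_count(n):
    """Partitions of n into distinct positive integers."""
    return count_partitions(n, range(1, n + 1), True)


def odd_count(n):
    """Partitions of n into odd positive integers, repeats allowed."""
    return count_partitions(n, range(1, n + 1, 2), False)


print(distinct_count(6), odd_count(6))
print(all(distinct_count(n) == odd_count(n) for n in range(1, 41)))
```

Each pass over a part corresponds to multiplying by one factor $(1 + z^m)$ or $1/(1 - z^m)$ of the two products.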

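Ex Euler's identity at $z = \frac{1}{2}$ is

\[ \prod_{j=1}^{\infty} \left( 1 + \frac{1}{2^j} \right) = \prod_{i=1}^{\infty} \left( 1 - \frac{1}{2^{2i-1}} \right)^{-1} \]

i.e. $\frac{3}{2} \cdot \frac{5}{4} \cdot \frac{9}{8} \cdots = \frac{2}{1} \cdot \frac{8}{7} \cdot \frac{32}{31} \cdots$, or take logs and use $\log(1 + z) = z - \frac{z^2}{2} + \frac{z^3}{3} - \frac{z^4}{4} + \cdots$ ($|z| < 1$):

\[ \sum_{j=1}^{\infty} \sum_{n=1}^{\infty} (-1)^{n+1} \frac{z^{jn}}{n} = \sum_{i=1}^{\infty} \sum_{n=1}^{\infty} \frac{z^{(2i-1)n}}{n} \]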
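The numerical identity at $z = \frac{1}{2}$ can be checked in floating point. This is a rough sketch: the truncation points (60 and 30 factors) are assumptions, chosen so that the omitted factors differ from 1 by less than double precision can resolve.

```python
import math


def lhs(terms=60):
    """Partial product of (1 + 1/2^j) for j = 1..terms."""
    value = 1.0
    for j in range(1, terms + 1):
        value *= 1 + 0.5 ** j
    return value


def rhs(terms=30):
    """Partial product of 1 / (1 - 1/2^(2i-1)) for i = 1..terms."""
    value = 1.0
    for i in range(1, terms + 1):
        value /= 1 - 0.5 ** (2 * i - 1)
    return value


print(lhs(), rhs(), math.isclose(lhs(), rhs(), rel_tol=1e-12))
```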

Partition Function p(n)

Question: In how many ways can n ∈ N be expressed as a sum of natural numbers?

First let order count:

Ex
\[
\begin{aligned}
4 &= 1 + 3 = 3 + 1 \\
  &= 2 + 2 \\
  &= 4 \\
  &= 2 + 1 + 1 = 1 + 2 + 1 = 1 + 1 + 2 \\
  &= 1 + 1 + 1 + 1
\end{aligned}
\]
so there are $8 = 2^{4-1}$ ways.

Proposition If order counts, $n$ can be expressed as a sum in $q(n) = 2^{n-1}$ ways.

Proof. For $n = 1$: $q(1) = 1 = 2^{1-1}$, so the result is true. Given $n > 1$, assume it is true for every $m < n$ and write

\[ n = (n - 1) + 1 = (n - 2) + 2 = \cdots = 1 + (n - 1). \]

By the induction hypothesis the bracketed numbers can be expressed in a total of

\[ 2^{n-2} + 2^{n-3} + \cdots + 1 = 2^{n-1} - 1 \]

ways, and this counts the sums for $n$ with 2 or more terms, with order counting. The only remaining sum is $n = n$, so we get $q(n) = 2^{n-1}$ for all $n \in \mathbb{N}$. □
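The count $2^{n-1}$ of ordered sums can be confirmed by listing them for small $n$. A minimal sketch, with the generator name `compositions` chosen here for illustration:

```python
def compositions(n):
    """Yield every ordered sum of positive integers equal to n, as a tuple."""
    if n == 0:
        yield ()
        return
    for first in range(1, n + 1):
        for rest in compositions(n - first):
            yield (first,) + rest


for n in range(1, 8):
    count = sum(1 for _ in compositions(n))
    print(n, count, count == 2 ** (n - 1))
```

The recursion fixes the first summand and then enumerates ordered sums of the remainder, so it walks through a splitting much like the one used in the induction.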

If order does not count then the counting is much more complex: p(1) = 1, p(2) = 2, p(3) = 3,

4 = 1 + 1 + 1 + 1 = 1 + 1 + 2 = 1 + 3 = 2 + 2 = 4 and p(4) = 5. Similarly p(5) = 7. There is no obvious pattern.

Major MacMahon computed hundreds of values of p(n) by hand and it suddenly occurred to him that, from a distance, the outline of the digits formed a parabola! ⇒ number of digits $\sim C\sqrt{n}$, so $p(n) \sim e^{\alpha\sqrt{n}}$. Later work showed

\[ p(n) \sim \frac{e^{\pi\sqrt{2n/3}}}{4\sqrt{3} \cdot n} \quad \text{(Rademacher)} \]

At $n = 200$, RHS $\approx 4 \times 10^{12} \approx p(200)$. The proof uses elliptic modular functions. We will derive an upper bound for p(n). p(n) is called the (unrestricted) partition function.

Geometric Representation

Ex 15 = 6 + 3 + 3 + 2 + 1

••••••
•••
•••
••
•

Reading vertically, 15 = 5 + 4 + 3 + 1 + 1 + 1 is another, “conjugate” partition. The number of parts in the first equals the size of the largest part in the second, and vice-versa.

Proposition The number of partitions of n into m parts is equal to the number of partitions of n into parts, the largest of which is m.
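Reading the dot diagram by columns is easy to mechanise. The following sketch assumes a partition is stored as a non-increasing list of parts; the function name `conjugate` is chosen here for illustration.

```python
def conjugate(partition):
    """Return the conjugate of a partition given as a non-increasing list of parts."""
    if not partition:
        return []
    # Column c of the diagram contains one dot for every part larger than c.
    return [sum(1 for part in partition if part > column)
            for column in range(partition[0])]


original = [6, 3, 3, 2, 1]
flipped = conjugate(original)
print(flipped)                                   # column lengths of the diagram
print(len(original) == flipped[0])               # number of parts = largest part of conjugate
print(conjugate(flipped) == original)            # conjugating twice returns the original
```

Since conjugation is its own inverse, it pairs partitions with $m$ parts with partitions whose largest part is $m$.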

Theorem (Euler)

\[ \prod_{m=1}^{\infty} \frac{1}{1 - x^m} = \sum_{n=0}^{\infty} p(n) x^n, \qquad |x| < 1, \quad p(0) = 1. \]

So the LHS is a generating function for p(n).

Proof. Expand each factor on the LHS as a power series using the sum to ∞ of a geometric series:

\[ \text{LHS} = (1 + x + x^2 + x^3 + \cdots)(1 + x^2 + x^4 + \cdots)(1 + x^3 + x^6 + \cdots) \cdots \]

Now multiply out and collect like powers of x so

\[ \text{LHS} = 1 + \sum_{j=1}^{\infty} a(j) x^j \]

We need to prove $a(j) = p(j)$. If we take a term $x^{k_1}$ from the first, $x^{2k_2}$ from the second, $x^{3k_3}$ from the third, ..., $x^{m k_m}$ from the $m$th, where each $k_i \ge 0$, their product is $x^{k_1} \cdot x^{2k_2} \cdots x^{m k_m} = x^k$ say, where $k = k_1 + 2k_2 + 3k_3 + \cdots + m k_m$, or

\[ k = \underbrace{(1 + 1 + \cdots + 1)}_{k_1} + \underbrace{(2 + 2 + \cdots)}_{k_2} + \underbrace{(3 + 3 + \cdots)}_{k_3} + \cdots + \underbrace{(m + m + \cdots)}_{k_m} \]

so this is a partition of $k$ into positive summands. Conversely each term $x^k$ comes from such a partition. Hence $a(k) = p(k)$. (This can be made into a more rigorous proof.) □
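Multiplying out the product up to a fixed power of $x$ gives the first values of p(n). A minimal sketch under the assumption that only coefficients up to $x^{\text{limit}}$ are wanted; the function name is chosen here for illustration.

```python
def partition_numbers(limit):
    """Coefficients p(0..limit) of prod_{m>=1} 1/(1 - x^m), truncated at x^limit."""
    coeffs = [1] + [0] * limit
    for m in range(1, limit + 1):
        # Multiplying by 1/(1 - x^m) = 1 + x^m + x^(2m) + ... is a running sum with step m.
        for k in range(m, limit + 1):
            coeffs[k] += coeffs[k - m]
    return coeffs


print(partition_numbers(10))
```

Factors with $m$ greater than the limit do not affect the retained coefficients, so the truncated product is exact up to that degree.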

Similarly other types of partitions can be described using generating functions:

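Generating functions for the number of partitions of $n$ into parts which are:

    even               $\prod_{m=1}^{\infty} \frac{1}{1 - x^{2m}}$
    prime              $\prod_{p} \frac{1}{1 - x^{p}}$
    unequal            $\prod_{m=1}^{\infty} (1 + x^{m})$
    distinct squares   $\prod_{m=1}^{\infty} (1 + x^{m^2})$
    squares            $\prod_{m=1}^{\infty} \frac{1}{1 - x^{m^2}}$
    distinct primes    $\prod_{p} (1 + x^{p})$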

Pentagonal Numbers

These belong to the family of polygonal numbers, beloved by the Greek Pythagoreans.

1 = 1
1 + 4 = 5
1 + 4 + 7 = 12
1 + 4 + 7 + 10 = 22

In general, the $n$th pentagonal number is the $n$th partial sum of the arithmetic progression

1, 4, 7, 10, 13, ..., 3n + 1, ...    (n = 0, 1, 2, ...).

Let

\[
\begin{aligned}
\omega(n) &= \sum_{j=0}^{n-1} (3j + 1) \\
&= 3 \sum_{j=0}^{n-1} j + \sum_{j=0}^{n-1} 1 \\
&= \frac{3}{2} n(n - 1) + n \\
&= \frac{3n^2 - n}{2}
\end{aligned}
\]

Then, normally, $\omega(n) = \frac{3n^2 - n}{2}$ and $\omega(-n) = \frac{3n^2 + n}{2}$ are called pentagonal numbers. $\omega(1) = 1$, $\omega(2) = 5$, $\omega(3) = 12$, ....

Theorem (Euler's Pentagonal Number Theorem) Let $|x| < 1$, then

\[ \prod_{m=1}^{\infty} (1 - x^m) = \sum_{n=-\infty}^{\infty} (-1)^n x^{\omega(n)} \]

So, surprisingly, the LHS is a sort of generating function for the $\omega(n)$. Note also the surprising relationship between the $p(n)$ and $\omega(n)$:

\[ 1 = \left( \sum_{n=0}^{\infty} p(n) x^n \right) \left( \sum_{n=-\infty}^{\infty} (-1)^n x^{\omega(n)} \right) \]
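The theorem can be tested by expanding the product up to a fixed degree and comparing with the series. This is a sketch only; the cut-off 50 and the function names are assumptions made for illustration.

```python
def omega(n):
    """Pentagonal number (3n^2 - n)/2, valid for negative n as well."""
    return (3 * n * n - n) // 2


def euler_product(limit):
    """Coefficients of prod_{m>=1} (1 - x^m), truncated at x^limit."""
    coeffs = [1] + [0] * limit
    for m in range(1, limit + 1):
        # Multiply by (1 - x^m); descending k keeps coeffs[k - m] at its old value.
        for k in range(limit, m - 1, -1):
            coeffs[k] -= coeffs[k - m]
    return coeffs


def pentagonal_series(limit):
    """Coefficients of sum_n (-1)^n x^omega(n), truncated at x^limit."""
    coeffs = [0] * (limit + 1)
    for n in range(-limit, limit + 1):
        if omega(n) <= limit:
            coeffs[omega(n)] += 1 if n % 2 == 0 else -1
    return coeffs


print(euler_product(7))
print(euler_product(50) == pentagonal_series(50))
```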

Proof. (Euler by induction 1750, Legendre 1830, Jacobi 1846; Franklin 1881 gave this remarkable “combinatorial” proof.) Let

\[ \prod_{m=1}^{\infty} (1 - x^m) = 1 + \sum_{n=1}^{\infty} a(n) x^n \]

Now every partition of $n$ into unequal parts contributes to $a(n)$ with

• +1 if $x^n$ is the product of an even number of terms.

• −1 if $x^n$ is the product of an odd number of terms.

Hence

\[ \prod_{m=1}^{\infty} (1 - x^m) = 1 + \sum_{n=1}^{\infty} \left( p_e(n) - p_o(n) \right) x^n \qquad (1) \]

where $p_e(n)$ and $p_o(n)$ are the numbers of partitions of $n$ into an even, respectively odd, number of unequal parts. Franklin showed that there is a 1-1 correspondence between even and odd partitions, so $p_e(n) = p_o(n)$, except when $n$ is pentagonal.

Consider the graph of a partition. It is in standard form if the parts are in strictly decreasing order going down the page.

Definition The base of the graph is the longest line segment connecting points in the last row. Let b be the number of points in the base.

Definition The slope of the graph is the longest 45° segment joining the last point in the first row with the last point in successive rows. Let s be the number of points in the slope.

[Figure: graph of a partition in standard form, with the slope marked (s = 4).]

Definition Operation A: Move points on the base so they all lie on a line parallel to the slope. It is permissible if the resulting graph is in standard form.

Definition Operation B: Move all points on the slope so they lie on a line below the base. Again it is permissible if the resulting graph is in standard form.

If A is permissible we get a new partition of n into unequal parts with 1 less term. If B is permissible we get a new partition of n into unequal parts with 1 more term.

If for a given n and every partition of n, exactly one of A or B is permissible, there will be a 1-1 correspondence between partitions of n into an even and odd number of terms ⇒ pe(n) = po(n) for these n.

Determining whether A or B is permissible:

• Case 1 b < s: b ≤ s − 1 ⇒ A is okay but B is not.

• Case 2 b = s: B is not okay. A is okay except when the base and slope intersect. Call this exception (a).

• Case 3 s < b: A is not permissible. B is okay except when b = s + 1 and the base and slope intersect. Call this exception (b).

∴ there are just two exceptions, (a) and (b) above.

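• Consider (a), where b = s and the base and slope intersect: Let there be $k$ rows, so $b = k$, and counting ‘•’s:

\[ n = k + (k + 1) + \cdots + (2k - 1) = \frac{3k^2 - k}{2} = \omega(k) \]

So if $k$ is even we get an extra partition into an even number ($k$) of parts. If $k$ is odd we get an extra odd partition. ∴ $p_e(n) - p_o(n) = (-1)^k$.

• In (b), where b = s + 1 and the base and slope intersect:

\[ n = \frac{3k^2 - k}{2} + k \quad \text{(because there is an extra point on each row)} \quad = \frac{3k^2 + k}{2} = \omega(-k) \]

and again $p_e(n) - p_o(n) = (-1)^k$.

Hence, by (1)

\[ \prod_{m=1}^{\infty} (1 - x^m) = 1 + \sum_{k=1}^{\infty} (-1)^k x^{\omega(k)} + \sum_{k=1}^{\infty} (-1)^k x^{\omega(-k)} \]
□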
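The difference between the numbers of even and odd partitions into unequal parts can be tabulated directly for small $n$. The sketch below enumerates the partitions rather than carrying out operations A and B; the function names are chosen here for illustration.

```python
def distinct_partitions(n, largest=None):
    """Yield partitions of n into strictly decreasing positive parts."""
    if largest is None:
        largest = n
    if n == 0:
        yield ()
        return
    for first in range(min(n, largest), 0, -1):
        for rest in distinct_partitions(n - first, first - 1):
            yield (first,) + rest


def even_minus_odd(n):
    """p_e(n) - p_o(n): partitions into unequal parts, counted with sign by parity of length."""
    return sum(1 if len(p) % 2 == 0 else -1 for p in distinct_partitions(n))


# Non-zero entries appear only at n = 1, 2, 5, 7, 12, 15, the pentagonal numbers up to 15.
print([even_minus_odd(n) for n in range(1, 16)])
```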

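Theorem (Euler) Let $p(0) = 1$ and $p(n) = 0$ for $n < 0$. Then for $n \ge 1$:

\[ p(n) = \sum_{k=1}^{\infty} (-1)^{k+1} \{ p(n - \omega(k)) + p(n - \omega(-k)) \} \]

Proof. By the above two theorems

\[ \left( 1 + \sum_{k=1}^{\infty} (-1)^k \{ x^{\omega(k)} + x^{\omega(-k)} \} \right) \left( \sum_{m=0}^{\infty} p(m) x^m \right) = 1 \]

Multiplying out the LHS:

\[ \sum_{n=0}^{\infty} p(n) x^n + \sum_{m=0}^{\infty} \sum_{k=1}^{\infty} (-1)^k p(m) x^{m + \omega(k)} + \sum_{m=0}^{\infty} \sum_{k=1}^{\infty} (-1)^k p(m) x^{m + \omega(-k)} = 1 \]

\[ \sum_{n=0}^{\infty} p(n) x^n + \sum_{n=0}^{\infty} \left[ \sum_{k=1}^{\infty} (-1)^k p(n - \omega(k)) \right] x^n + \sum_{n=0}^{\infty} \left[ \sum_{k=1}^{\infty} (-1)^k p(n - \omega(-k)) \right] x^n = 1 \]

For $n \ge 1$ the coefficient of $x^n$ on the RHS is zero. So equating coefficients of $x^n$ on each side:

\[ p(n) = \sum_{k=1}^{\infty} (-1)^{k+1} \{ p(n - \omega(k)) + p(n - \omega(-k)) \} \]
□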

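Ex $p(5) = \sum_{k=1}^{\infty} (-1)^{k+1} \{ p(5 - \omega(k)) + p(5 - \omega(-k)) \}$. Using $\omega(0) = 0$, $\omega(1) = 1$, $\omega(2) = 5$, $\omega(3) = 12$, $\omega(-1) = 2$, $\omega(-2) = 7$, $\omega(-3) = 15$ we get:

\[
\begin{aligned}
p(5) &= (-1)^2 \{ p(5 - \omega(1)) + p(5 - \omega(-1)) \} + (-1)^3 \{ p(5 - \omega(2)) + p(5 - \omega(-2)) \} \\
&= 1 \cdot \{ p(4) + p(3) \} - \{ p(0) + p(-2) \} + 0 \\
&= \{ 5 + 3 \} - \{ 1 + 0 \} \\
&= 7
\end{aligned}
\]

as before.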
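The recurrence gives a quick way to compute a table of p(n). A minimal sketch follows; the function name `partitions_by_recurrence` is an assumption made here, and terms with a negative argument are skipped since p vanishes there.

```python
def partitions_by_recurrence(limit):
    """Return [p(0), ..., p(limit)] using the pentagonal number recurrence."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = 0
        k = 1
        while k * (3 * k - 1) // 2 <= n:          # omega(k) <= n
            sign = 1 if k % 2 == 1 else -1        # (-1)^(k+1)
            total += sign * p[n - k * (3 * k - 1) // 2]
            second = k * (3 * k + 1) // 2         # omega(-k)
            if second <= n:
                total += sign * p[n - second]
            k += 1
        p[n] = total
    return p


table = partitions_by_recurrence(100)
print(table[:11])
print(table[100])
```

Only about $\sqrt{n}$ terms of the sum are non-zero for each $n$, which is what makes this recurrence practical for hand or machine computation.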

An upper bound for p(n)

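Theorem For all $n \ge 1$, $p(n) < e^{K\sqrt{n}}$ where $K = \pi\sqrt{\frac{2}{3}}$.

Proof. Let $F(x) = \prod_{n=1}^{\infty} (1 - x^n)^{-1} = 1 + \sum_{k=1}^{\infty} p(k) x^k$ and restrict $x$ to lie in $0 < x < 1$. Then $p(n) x^n < F(x)$, each term being positive. So

\[ \log(p(n)) < \log(F(x)) + n \log\left(\frac{1}{x}\right) = A + B \]

First estimate A, then B:

\[
\begin{aligned}
A &= \log(F(x)) \\
&= -\log\left( \prod_{n=1}^{\infty} (1 - x^n) \right) \\
&= -\sum_{n=1}^{\infty} \log(1 - x^n) \\
&= \sum_{n=1}^{\infty} \sum_{m=1}^{\infty} \frac{x^{mn}}{m} \\
&= \sum_{m=1}^{\infty} \frac{1}{m} \sum_{n=1}^{\infty} (x^m)^n \\
&= \sum_{m=1}^{\infty} \frac{1}{m} \cdot \frac{x^m}{1 - x^m}
\end{aligned}
\]

Now $\frac{1 - x^m}{1 - x} = 1 + x + x^2 + \cdots + x^{m-1}$ and $0 < x < 1$, so

\[ m x^{m-1} \le \frac{1 - x^m}{1 - x} \le m \quad \Rightarrow \quad \frac{m(1 - x)}{x} \le \frac{1 - x^m}{x^m} \le \frac{m(1 - x)}{x^m} \]

(with equality only when $m = 1$). All terms are positive, so inverting and dividing by $m$ gives:

\[ \frac{x^m}{m^2 (1 - x)} \le \frac{1}{m} \cdot \frac{x^m}{1 - x^m} \le \frac{1}{m^2} \cdot \frac{x}{1 - x} \]

Sum on $m$:

\[ A = \sum_{m=1}^{\infty} \frac{1}{m} \cdot \frac{x^m}{1 - x^m} \le \frac{x}{1 - x} \sum_{m=1}^{\infty} \frac{1}{m^2} = \frac{\pi^2}{6} \cdot \frac{x}{1 - x} \]

Let $t = \frac{1 - x}{x}$, so $1 + t = 1 + \frac{1 - x}{x} = \frac{1}{x}$, so $A \le \frac{\pi^2}{6} \cdot \frac{1}{t}$ and $\log\left(\frac{1}{x}\right) = \log(1 + t) < t$.

Hence $\log(p(n)) < \log(F(x)) + n \log\left(\frac{1}{x}\right) < \frac{\pi^2}{6t} + nt$.

Now the minimum value of $\theta(t) = \frac{\pi^2}{6t} + nt$ occurs when $t = t_0 = \frac{\pi}{\sqrt{6n}}$.

For this value of $t$

\[ \theta(t_0) = 2 n t_0 = \frac{2 n \pi}{\sqrt{6n}} = K\sqrt{n} \]

Hence $\log(p(n)) < K\sqrt{n}$ ⇒ $p(n) < e^{K\sqrt{n}}$. □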
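The bound and the minimising value $t_0$ can be checked numerically. This is a sketch under stated assumptions: the range $1 \le n \le 200$ and the helper names are chosen here for illustration, and p(n) is computed by expanding the generating function.

```python
import math


def partition_numbers(limit):
    """Coefficients p(0..limit) of prod_{m>=1} 1/(1 - x^m)."""
    coeffs = [1] + [0] * limit
    for m in range(1, limit + 1):
        for k in range(m, limit + 1):
            coeffs[k] += coeffs[k - m]
    return coeffs


K = math.pi * math.sqrt(2 / 3)


def theta(t, n):
    """The upper bound pi^2/(6t) + n*t for log p(n), valid for every t > 0."""
    return math.pi ** 2 / (6 * t) + n * t


def bound_holds(limit):
    """Check p(n) < exp(K * sqrt(n)) for 1 <= n <= limit."""
    p = partition_numbers(limit)
    return all(p[n] < math.exp(K * math.sqrt(n)) for n in range(1, limit + 1))


n = 50
t0 = math.pi / math.sqrt(6 * n)
print(bound_holds(200))
print(math.isclose(theta(t0, n), K * math.sqrt(n)))
```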

We can use generating functions and logarithmic differentiation to devise recursion formulas for arithmetical functions:

Let A ⊂ N be a subset. Let f(n) be an arithmetical function. Let the product

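\[ F_A(x) = \prod_{n \in A} (1 - x^n)^{-\frac{f(n)}{n}} \]

and the series

\[ G_A(x) = \sum_{n \in A} \frac{f(n)}{n} x^n \]

converge absolutely for $|x| < 1$. Then

\[
\begin{aligned}
\log(F_A(x)) &= -\sum_{n \in A} \frac{f(n)}{n} \log(1 - x^n) \\
&= \sum_{n \in A} \frac{f(n)}{n} \sum_{m=1}^{\infty} \frac{x^{mn}}{m} \\
&= \sum_{m=1}^{\infty} \frac{1}{m} G_A(x^m)
\end{aligned}
\]

Then differentiate and multiply by $x$ to obtain:

\[
\begin{aligned}
x \frac{F_A'(x)}{F_A(x)} &= \sum_{m=1}^{\infty} G_A'(x^m) x^m \\
&= \sum_{m=1}^{\infty} \sum_{n \in A} f(n) x^{mn} \\
&= \sum_{m=1}^{\infty} \sum_{n=1}^{\infty} \chi_A(n) f(n) x^{mn} \\
&= \text{RHS}
\end{aligned}
\]

where

\[ \chi_A(n) = \begin{cases} 1 & n \in A \\ 0 & n \notin A \end{cases} \]

is the so-called characteristic function of $A$.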

Now collect terms with mn = k to get

\[ \text{RHS} = \sum_{k=1}^{\infty} f_A(k) x^k \]

where

\[ f_A(k) = \sum_{d \mid k} \chi_A(d) f(d) = \sum_{d \mid k,\, d \in A} f(d) \]

Hence

\[ x F_A'(x) = F_A(x) \sum_{k=1}^{\infty} f_A(k) x^k \qquad (1) \]

Now write $F_A(x)$ as a power series in $x$. The coefficients will depend on $A$ and $f$, of course, so call them $p_{A,f}(n)$:

\[ F_A(x) = \sum_{n=0}^{\infty} p_{A,f}(n) x^n, \qquad p_{A,f}(0) = F_A(0) = \prod_{n \in A} 1 = 1 \]

Finally, equate the coefficients of $x^n$ on both sides of (1) to obtain

\[ n\, p_{A,f}(n) = \sum_{k=1}^{n} f_A(k)\, p_{A,f}(n - k) \]

with

\[ p_{A,f}(0) = 1 \quad \text{and} \quad f_A(k) = \sum_{d \mid k,\, d \in A} f(d) \]

Ex $A = \mathbb{N}$, $f(n) = n$ ⇒ $p_{A,f}(n) = p(n)$, the (unrestricted) partition function, and $f_A(k) = \sum_{d \mid k} d = \sigma(k)$, the divisor sum function, so:

\[ n\, p(n) = \sum_{k=1}^{n} \sigma(k)\, p(n - k) \]

Check for n = 5: p(1) = 1, p(2) = 2, p(3) = 3, p(4) = 5, p(5) = 7, so LHS = 5 · p(5) = 35

RHS = σ(1)p(4) + σ(2)p(3) + σ(3)p(2) + σ(4)p(1) + σ(5)p(0) = 1 · 5 + 3 · 3 + 4 · 2 + 7 · 1 + 6 · 1 = 35
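The recursion can be turned into a short program. The sketch below fixes $f(n) = n$ and lets the set $A$ be given by a membership test; the names `restricted_divisor_sum`, `partition_counts` and `in_A` are hypothetical, introduced here for illustration.

```python
def restricted_divisor_sum(k, in_A):
    """f_A(k) for f(n) = n: the sum of the divisors d of k that lie in A."""
    return sum(d for d in range(1, k + 1) if k % d == 0 and in_A(d))


def partition_counts(limit, in_A=lambda n: True):
    """Return [p_A(0), ..., p_A(limit)] from n*p_A(n) = sum_k f_A(k) * p_A(n - k)."""
    p = [1] + [0] * limit
    for n in range(1, limit + 1):
        total = sum(restricted_divisor_sum(k, in_A) * p[n - k]
                    for k in range(1, n + 1))
        p[n] = total // n   # the recursion guarantees that n divides the sum
    return p


print(partition_counts(6))                             # A = N: unrestricted partitions
print(partition_counts(6, in_A=lambda n: n % 2 == 1))  # A = odd numbers
```

With $A = \mathbb{N}$ the inner sum is $\sigma(k)$, and with $A$ the odd numbers the same code counts partitions into odd parts.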