Additive and Combinatorial Number Theory

———————————————————————————————————————

Notes written by Jacques Verstraëte, based on a course given by W.T. Gowers in Cambridge (1998). Chapter 4 written by Tim Gowers.

Contents

1 The Hales-Jewett Theorem

2 Roth’s Theorem

3 Weyl’s Inequality

4 Vinogradov’s Three Primes Theorem

5 The of

6 Freiman’s Theorem

7 Szemerédi’s Theorem

References

Notation

§1 The Hales-Jewett Theorem

The following theorem was proved in 1927 by van der Waerden [20], answering a conjecture of Schur:

Theorem 1.1. If the natural numbers are partitioned into two sets, then one set must contain arbitrarily long arithmetic progressions.

This result was proved before Ramsey’s Theorem, and led to a number of generalizations, with implications in Ramsey Theory. Theorem 1.1 can be rewritten as follows: for any pair of positive integers $k, r$, there exists an integer $W = W(k, r)$ such that if $[W]$ is $r$-coloured, then we may find a monochromatic $k$-term arithmetic progression.

In this section, an important theorem known as the Hales-Jewett Theorem [8] is proved. Consider the following notation. For $x \in [k]^N$, $A \subset [N]$ and $j \in [k]$ define

$$(x \oplus jA)_i = \begin{cases} x_i & i \notin A \\ j & i \in A \end{cases}$$

A Hales-Jewett line is a set of the form $\{x \oplus jA : 1 \le j \le k\}$, for some $x \in [k]^N$ and $A \subset [N]$, $A \ne \emptyset$. The Hales-Jewett Theorem implies van der Waerden’s Theorem. To see this, represent points in the cube $[k]^N$ by the coefficients in a base $k$ expansion of non-negative integers less than $k^N$. Provided $N$ is large enough, a monochromatic line exists, corresponding to a monochromatic arithmetic progression.
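The correspondence between Hales-Jewett lines and arithmetic progressions can be checked on small cases. The sketch below is only an illustration: the helper names `hj_line`, `to_integer` and `is_arithmetic_progression` are hypothetical, coordinates are 0-based, and a coordinate value $c \in [k]$ is read as the base-$k$ digit $c - 1$ (an assumption about how the encoding is normalised).

```python
from itertools import combinations, product

def hj_line(x, A, k):
    """Return the line {x (+) jA : 1 <= j <= k} as a list of tuples.

    x is a tuple with entries in {1, ..., k}; A is a non-empty set of
    0-based coordinate positions (an illustrative convention).
    """
    return [tuple(j if i in A else xi for i, xi in enumerate(x))
            for j in range(1, k + 1)]

def to_integer(point, k):
    """Read a point of [k]^N as base-k digits, using digit = coordinate - 1."""
    value = 0
    for c in point:
        value = value * k + (c - 1)
    return value

def is_arithmetic_progression(values):
    """True if consecutive differences are all equal and non-zero."""
    diffs = {b - a for a, b in zip(values, values[1:])}
    return len(diffs) == 1 and diffs != {0}

def all_lines_map_to_progressions(k, N):
    """Check every Hales-Jewett line of [k]^N (intended for k >= 2)."""
    for x in product(range(1, k + 1), repeat=N):
        for size in range(1, N + 1):
            for A in combinations(range(N), size):
                numbers = [to_integer(p, k) for p in hj_line(x, set(A), k)]
                if not is_arithmetic_progression(numbers):
                    return False
    return True

# One explicit line in [3]^4 with active coordinate set A = {1, 3}.
line = hj_line((2, 1, 3, 1), {1, 3}, 3)
numbers = [to_integer(p, 3) for p in line]
print(line)     # the three points of the line
print(numbers)  # their base-3 values, an arithmetic progression
```

The common difference is the sum of the powers of $k$ attached to the positions in $A$, which is why any non-empty $A$ gives a genuine progression.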

Hales-Jewett Theorem. Let $k, r \in \mathbb{N}$. Then there exists $N$ such that if $[k]^N$ is $r$-coloured, then it contains a monochromatic Hales-Jewett line.

Proof. Let $HJ(k, r)$ denote the smallest integer for which the theorem works. We must show $HJ(k, r)$ is always finite. If $k = 1$ set $N = 1$. Suppose that $N = HJ(i, r)$ has been found for each $i < k$ and set $i = k$. Let $N_1 = HJ(k-1, r^{2^{r-1}})$ and set

r−i s N = HJ(k 1; r2 k r ) i −

Ni for i = 1; 2; : : : ; r, where sr = i

x = (z kA ; : : : ; z kA ; z 1A ; : : : ; z 1A ; : : : ; z 1A ) 1 ⊕ 1 t ⊕ t t+1 ⊕ t+1 u ⊕ u r ⊕ r and A = t

$$\{x \oplus j_1A_1 \oplus j_2A_2 \oplus \cdots \oplus j_dA_d : 1 \le j_i \le k\},$$

where $A_1, A_2, \dots, A_d$ are disjoint and non-empty in $[N]$, then for every $k, r, d$ there exists an $N$ such that, however $[k]^N$ is $r$-coloured, there is a monochromatic $d$-dimensional Hales-Jewett subspace. Another way of viewing the Hales-Jewett theorem: if the subsets of $[N]$ are coloured with $r$ colours, then there exist disjoint sets $A_0, A_1, \dots, A_k$ such that the sets $A_0 \cup \bigcup_{i \in I} A_i$, where $I \subset [k]$, are all monochromatic. The following remarkable inductive proof of the Hales-Jewett theorem is due to Shelah [14]:

Proof. Let $M = HJ(k-1, r)$ and define $N_1 = r^{(k-1)^{M-1}}$ and $N_i = r^{(k-1)^{M-i} k^{N_1 + \dots + N_{i-1}}}$ for $i = 2, 3, \dots, M$. Let $\kappa$ be an $r$-colouring of $[k]^{N_1} \times \cdots \times [k]^{N_M}$. Given $x \in [k]^{N_M}$, let $\kappa_x$ be the colouring of $[k]^{N_1} \times \cdots \times [k]^{N_{M-1}}$ induced by $\kappa$. There are at most $r^{k^{N_1 + \dots + N_{M-1}}}$ such colourings, so we can find two points $x_1$ and $x_2$, of the form $(k-1, \dots, k-1, k, \dots, k)$, such that $\kappa_{x_1} = \kappa_{x_2}$. If the first $m$ and first $n$ co-ordinates of $x_1$ and $x_2$ are $(k-1)$, respectively, and $A_M = (m, n]$, then $\kappa_{z_M \oplus jA_M}$ is the same for $j = k-1$ and $j = k$, where $z_M = x_1$. Let $L_M = \{z_M \oplus jA_M : 1 \le j \le k\}$.

For each $i$, we have an induced colouring of $[k]^{N_1} \times \cdots \times [k]^{N_{i-1}} \times L_{i+1} \times \cdots \times L_M$. There are at most $r^{k^{N_1 + \dots + N_{i-1}} (k-1)^{M-i}}$ different colourings of this kind, so we find a line $L_i \subset [k]^{N_i}$, $L_i = \{z_i \oplus jA_i : 1 \le j \le k\}$, such that $\kappa_{z_i \oplus jA_i}$ is the same for $j = k-1, k$. At the end of this process, we construct $L_1 \times \cdots \times L_M$ so that $\kappa$, restricted to $L_1 \times \cdots \times L_M$, does not vary over a co-ordinate change from $k-1$ to $k$. This completes the inductive step. □

This proof was a breakthrough in that it was the first to give primitive recursive bounds on the van der Waerden numbers. Erdős and Turán [4] hoped this could be achieved by finding, for each $k \in \mathbb{N}$, an $o(N)$ function $n_k(N)$ such that every subset of $[N]$ of size at least $n_k(N)$ contains an arithmetic progression of length $k$. We now look at this problem more closely.

§2 Roth’s Theorem

The following theorem was first proved by Szemerédi [17] using ingenious combinatorial techniques, and later by Furstenberg [6], using methods in ergodic theory.

Szemerédi’s Theorem. Let $A$ be a set of positive upper density in $\mathbb{N}$. Then $A$ contains arbitrarily long arithmetic progressions.

Szemerédi actually proved more than this. Let $n_k(N)$ denote the smallest integer such that any subset of $n_k(N)$ elements taken from $[N]$ contains an arithmetic progression of length $k$. Szemerédi established that $n_k(N) = o(N)$ for each $k$, thus proving a conjecture of Erdős and Turán [4]. The proof used van der Waerden’s Theorem and Szemerédi’s Regularity Lemma, therefore the upper bound on $n_k(N)$ obtained can be no better than the bounds given by these results.

Roth [11] gave a remarkable analytic proof that $n_3(N) = o(N)$ in 1954. Szemerédi proved it for the more difficult case $k = 4$ [16], which then generalized to the above theorem, for general $k$. We present the theorem of Roth here. The interest in this proof is that it gives a good upper bound on $n_3(N)$ (namely $n_3(N) \le cN/\log\log N$ for some constant $c > 0$) and that it offers the possibility of generalization. Szemerédi’s Theorem was proved by markedly different techniques and Furstenberg’s proof gives no bounds on the van der Waerden numbers.

Let $N \in \mathbb{N}$ and $f : \mathbb{Z}_N \to \mathbb{C}$. The (discrete) Fourier transform $\hat f$ of $f$ is defined by $\hat f(r) = \sum_{s=0}^{N-1} f(s)\omega^{rs}$, where $\omega = \exp(2\pi i/N)$. We define the convolution of $f$ and $g$, $f * g$, by $(f * g)(r) = \sum_{t-u=r} f(t)\overline{g(u)}$. The following properties are easily proved from the definition, and will be used throughout the material to follow. The first identity is known as Parseval’s Identity and the third will be called the convolution formula.

Lemma 2.2. The following properties hold for Fourier transforms:

(1) $\sum_r |\hat f(r)|^2 = N \sum_r |f(r)|^2$

(2) $\sum_r \hat f(r)\overline{\hat g(r)} = N \sum_r f(r)\overline{g(r)}$

(3) $(f * g)\hat{\ } = \hat f\,\overline{\hat g}$

(4) $N \sum_r |(f * g)(r)|^2 = \sum_r |\hat f(r)|^2 |\hat g(r)|^2$

(5) $\sum_r |\hat f(r)|^4 = N \sum_{a+b=c+d} f(a)f(b)\overline{f(c)f(d)}$

An arithmetic progression in $\mathbb{Z}_N$ is called a $\mathbb{Z}$-arithmetic progression if it is an arithmetic progression when considered as a subset of $\mathbb{Z}$.
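The identities of Lemma 2.2 can be checked numerically on a small example. The sketch below assumes the normalisation $\hat f(r) = \sum_s f(s)\omega^{rs}$ and a convolution that conjugates its second argument; the function names `dft` and `convolve` and the sample vectors are illustrative choices, not part of the notes.

```python
import cmath

def dft(f):
    """Transform on Z_N with f_hat(r) = sum_s f(s) * omega**(r*s)."""
    N = len(f)
    return [sum(f[s] * cmath.exp(2j * cmath.pi * ((r * s) % N) / N)
                for s in range(N))
            for r in range(N)]

def convolve(f, g):
    """(f * g)(r) = sum over t - u = r of f(t) * conj(g(u))."""
    N = len(f)
    return [sum(f[t] * complex(g[(t - r) % N]).conjugate() for t in range(N))
            for r in range(N)]

# Arbitrary sample functions on Z_7 (hypothetical data).
f = [1, 0, 2 - 1j, 0.5, 0, 3j, -1]
g = [0, 1, 1, 0, 2j, 0, 1.5]
N = len(f)
f_hat, g_hat = dft(f), dft(g)

# (1) Parseval's identity.
parseval_gap = abs(sum(abs(v) ** 2 for v in f_hat)
                   - N * sum(abs(v) ** 2 for v in f))

# (3) Convolution formula: (f * g)^ = f_hat * conj(g_hat).
conv_hat = dft(convolve(f, g))
convolution_gap = max(abs(conv_hat[r] - f_hat[r] * g_hat[r].conjugate())
                      for r in range(N))

# (5) Fourth moment versus additive quadruples a + b = c + d.
quadruples = sum(
    f[a] * f[b] * (complex(f[c]) * complex(f[(a + b - c) % N])).conjugate()
    for a in range(N) for b in range(N) for c in range(N))
fourth_moment_gap = abs(sum(abs(v) ** 4 for v in f_hat) - N * quadruples)

print(parseval_gap, convolution_gap, fourth_moment_gap)  # all near zero
```

Identity (5) is the one used later to relate large Fourier coefficients to additive structure, so it is a useful one to test when changing conventions.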

Lemma 2.3. Let $a, d \in \mathbb{Z}_N$ with $d \ne 0$ and let $m \le N$. Then the set $A = \{a, a+d, \dots, a+(m-1)d\}$ can be partitioned into fewer than $3m^{1/2}$ subsets, which are $\mathbb{Z}$-arithmetic progressions.

Proof. Let $\ell = \lfloor m^{1/2} \rfloor$ and consider the numbers $\{a, a+d, \dots, a+(m-1)d\}$. At least two lie within $N/\ell$ of each other, so there exists $s \in [\ell - 1]$ with $-N/\ell \le sd \le N/\ell$ such that we split $A$ into subprogressions, each with common difference $sd$. If $P$ is one of them, then $P$ can be partitioned into $\mathbb{Z}$-arithmetic progressions, all but two of which have size at least $\ell \approx m^{1/2}$, as $|sd| \le N/\ell$. So the whole set can be partitioned into at most $m/m^{1/2} + 2s \le 3m^{1/2}$ $\mathbb{Z}$-arithmetic progressions. □

The idea in the proof of Roth’s Theorem is that if a set $A$ does not contain an arithmetic progression of length three, then $A$ has a large non-trivial Fourier coefficient. This implies that $A$ has an intersection with a long $\mathbb{Z}_N$-arithmetic progression, where the density of $A$ increases.

As the density is bounded above by 1, and $\mathbb{Z}_N$-arithmetic progressions are taken care of by Lemma 2.3, this completes the argument, provided $N$ is large enough.

Roth’s Theorem. There is a constant $c > 0$ such that for any $N \in \mathbb{N}$ and $A \subset [N]$ of size at least $cN/\log\log N$, $A$ contains an arithmetic progression of length three.

Proof. In general, if $X, Y$ and $Z$ are subsets of $\mathbb{Z}_N$ with densities $\alpha, \beta, \gamma$ respectively, then the number of triples $(x, y, z) \in X \times Y \times Z$ such that $x + z = 2y$ (mod $N$) is

$$N^{-1}|X||Y||Z| + N^{-1}\sum_{r \ne 0} \hat X(r)\hat Y(-2r)\hat Z(r).$$

Using Cauchy-Schwarz, the second term has modulus at most

$$N^{-1} \max_{r \ne 0} |\hat X(r)| \Big(\sum_r |\hat Y(-2r)|^2\Big)^{1/2} \Big(\sum_r |\hat Z(r)|^2\Big)^{1/2}.$$

Using Parseval’s Identity for $\hat Y$ and $\hat Z$, this expression is $\beta^{1/2}\gamma^{1/2} N \max_{r \ne 0} |\hat X(r)|$. Provided $\max_{r \ne 0} |\hat X(r)| \le \frac{1}{2}\alpha\beta^{1/2}\gamma^{1/2} N$, there are at least $\frac{1}{2}\alpha\beta\gamma N^2$ triples of the required form, as $N^{-1}|X||Y||Z| = \alpha\beta\gamma N^2$. A non-trivial solution occurs if $\frac{1}{2}\alpha\beta\gamma N^2 > N$.

Now let $A$ have density $\delta$ and $B = \{a \in A : N/3 < a < 2N/3\}$. We plan to show that $A$ has a substantial intersection with, and higher density in, a long arithmetic progression $P$. If $|B| \le \delta N/5$, then $A$ has density at least $6\delta/5$ in $[0, N/3]$ or $[2N/3, N)$, and we have the required arithmetic progression $P$, of length $\lfloor N/3 \rfloor$ or $\lfloor N/3 \rfloor + 1$. Suppose $|B| \ge \delta N/5$. Then there exists a non-trivial solution $a + c = 2b$ with $(a, b, c) \in A \times B \times B$, or $|\hat A(r)| \ge \delta^2 N/10$ for some $r \ne 0$. In the first case, we have a $\mathbb{Z}$-arithmetic progression of length three in $A$, as required.
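The counting formula at the start of the proof is exact, and can be confirmed on a small modulus. The following sketch assumes the convention $\hat f(r) = \sum_s f(s)\omega^{rs}$; the sets and the modulus $N = 11$ are arbitrary illustrative choices, and the helper names are hypothetical.

```python
import cmath

def dft(f):
    """f_hat(r) = sum_s f(s) * omega**(r*s) on Z_N."""
    N = len(f)
    return [sum(f[s] * cmath.exp(2j * cmath.pi * ((r * s) % N) / N)
                for s in range(N))
            for r in range(N)]

def indicator(S, N):
    """Characteristic function of S as a list indexed by Z_N."""
    return [1 if x in S else 0 for x in range(N)]

def count_directly(X, Y, Z, N):
    """Number of (x, y, z) in X x Y x Z with x + z = 2y (mod N)."""
    return sum(1 for x in X for y in Y for z in Z
               if (x + z - 2 * y) % N == 0)

def count_by_fourier(X, Y, Z, N):
    """The same count as N^{-1} * sum_r X_hat(r) Y_hat(-2r) Z_hat(r)."""
    Xh, Yh, Zh = (dft(indicator(S, N)) for S in (X, Y, Z))
    total = sum(Xh[r] * Yh[(-2 * r) % N] * Zh[r] for r in range(N))
    return (total / N).real

N = 11
X, Y, Z = {1, 2, 5}, {0, 3, 4, 7}, {2, 6, 9, 10}
direct = count_directly(X, Y, Z, N)
via_fourier = count_by_fourier(X, Y, Z, N)
print(direct, via_fourier)  # the two counts agree up to rounding
```

The $r = 0$ term of the Fourier sum is $N^{-1}|X||Y||Z|$, which is the "expected" count; the remaining terms measure the deviation controlled by $\max_{r \ne 0}|\hat X(r)|$.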

Partition the unit circle into $M$ consecutive equal intervals $I_1, I_2, \dots, I_M$ of diameter as close to $\delta^2/20$ as possible. Define $P_j = \{x : \omega^{-rx} \in I_j\}$. Then each $P_j$ is an arithmetic progression in $\mathbb{Z}_N$ with common difference $-r^{-1}$ (mod $N$) (the $\mathbb{Z}_N$ inverse of $-r$). Define $f(x) = A(x) - \delta$. Then $\sum_x f(x) = 0$ and $\hat f(r) = \hat A(r)$. So

$$\frac{\delta^2 N}{10} \le |\hat f(r)| = \Big|\sum_x f(x)\omega^{-rx}\Big| \le \Big|\sum_j \sum_{x \in P_j} f(x)\omega^{-rx}\Big| \le \sum_j \Big|\sum_{x \in P_j} f(x)\omega^{-rx}\Big|.$$

Now fix $j$ and let $x_j \in P_j$. Then

$$\Big|\sum_{x \in P_j} f(x)\omega^{-rx}\Big| \le \Big|\sum_{x \in P_j} f(x)\omega^{-rx_j}\Big| + \Big|\sum_{x \in P_j} f(x)\big(\omega^{-rx} - \omega^{-rx_j}\big)\Big| \le \Big|\sum_{x \in P_j} f(x)\Big| + \frac{\delta^2}{20}\sum_{x \in P_j} |f(x)| \le \Big|\sum_{x \in P_j} f(x)\Big| + \frac{\delta^2}{20}|P_j|.$$

Summing over $j$, we find $\frac{\delta^2 N}{10} \le \sum_{j=1}^M \big|\sum_{x \in P_j} f(x)\big| + \frac{\delta^2}{20}\sum_{j=1}^M |P_j|$. Since the last sum is $N$, we get $\sum_{j=1}^M \big|\sum_{x \in P_j} f(x)\big| \ge \frac{\delta^2 N}{20}$. Recalling that $\sum_x f(x) = 0$, we find that $\sum_{j=1}^M \big(\big|\sum_{x \in P_j} f(x)\big| + \sum_{x \in P_j} f(x)\big)$ is at least $\frac{\delta^2 N}{20}$. So there exists $j$ such that $\sum_{x \in P_j} f(x) \ge \frac{\delta^2}{40}|P_j|$. As $A(x) = f(x) + \delta$, $|A \cap P_j| \ge \delta(1 + \delta/40)|P_j|$. By Lemma 2.3, $P_j$ may be partitioned into $r \le 3|P_j|^{1/2}$ $\mathbb{Z}$-arithmetic progressions $Q_1, Q_2, \dots, Q_r$. This gives

$$\sum_{i=1}^r \sum_{x \in Q_i} f(x) \ge \frac{\delta^2}{40}|P_j|$$

and so there is $k$ such that $\sum_{x \in Q_k} f(x) \ge \delta^2|Q_k|/80$ and $|Q_k| \ge \delta^2|P_j|/80r \ge \delta^3 N^{1/2}/5000$. In other words, $A$ has density at least $\delta(1 + \delta/80)$ in the long arithmetic progression $Q_k$. We now repeat the argument on $A \cap Q_k$ in $Q_k$. As the density increases by a factor of at least $1 + \delta/80$ each time, this procedure must stop in $160/\delta$ steps. That is, $A$ must contain an arithmetic progression of length three provided that $\delta \ge 500/\log\log N$. □

Heath-Brown [9] and Szemerédi [18] have recently improved the denominator to $(\log N)^{c}$, for some constant $c > 0$.

§3 Weyl’s Inequality

Let $f : \mathbb{N} \to \mathbb{R}$ be a function and write $\{f(x)\}$ for the fractional part of $f(x)$. We say that $f$ is uniformly distributed if for $\alpha \in (0, 1]$,

$$\lim_{n \to \infty} n^{-1}\,|\{m \le n : \{f(m)\} < \alpha\}| = \alpha.$$

Weyl [23] established that if $f(x)$ is a polynomial that has at least one non-constant term with an irrational coefficient, then $f$ is uniformly distributed. This theorem is proved using a fundamental inequality, known as Weyl’s Inequality, involving exponential sums. We shall prove this theorem with $f(x) = \alpha x^k$: that is, $\{\alpha, 2^k\alpha, 3^k\alpha, \dots\}$ is equidistributed modulo 1. As a consequence, if $\alpha$ is a real number then for any $\varepsilon > 0$ there exists $N$ such that $N^2\alpha$ is at distance at most $\varepsilon$ from an integer.

We derive the appropriate inequality to prove this result by establishing estimates for exponential sums. The statements here are written for simplicity, rather than for finding optimal bounds. We begin with the following elementary lemma:

Lemma 3.1. Let $\alpha, \beta \in \mathbb{R}$. Then for $n \in \mathbb{N}$,

$$\Big|\sum_{x=1}^n e(\alpha x + \beta)\Big| \le \min\{n, (2\|\alpha\|)^{-1}\}$$

where $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer.

Proof. The constant $\beta$ does not affect the inequality. If $\|\alpha\| = 0$, then the sum has modulus $n$. If $\|\alpha\| \ne 0$, then the sum is $e(\alpha)(1 - e(\alpha n))/(1 - e(\alpha))$. As $\sin z = \frac{1}{2i}(e^{iz} - e^{-iz})$, this is at most $|\sin \pi\alpha|^{-1}$. Since $|\sin \pi\alpha| \ge 2\|\alpha\|$, the inequality follows. □
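A quick numerical sanity check of Lemma 3.1 is sketched below. It assumes the usual convention $e(x) = \exp(2\pi i x)$; the sample triples $(\alpha, \beta, n)$ are arbitrary, and the function names are illustrative only.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x), the assumed convention for the notes."""
    return cmath.exp(2j * math.pi * x)

def distance_to_nearest_integer(alpha):
    """The quantity written ||alpha|| in the text."""
    return abs(alpha - round(alpha))

def exponential_sum(alpha, beta, n):
    """sum_{x=1}^{n} e(alpha*x + beta)."""
    return sum(e(alpha * x + beta) for x in range(1, n + 1))

def lemma_bound(alpha, n):
    """min{n, (2||alpha||)^{-1}}, with the second term dropped if ||alpha|| = 0."""
    d = distance_to_nearest_integer(alpha)
    return n if d == 0 else min(n, 1 / (2 * d))

samples = [
    (0.5, 0.0, 10),
    (1 / 3, 0.2, 25),
    (math.sqrt(2), 0.7, 1000),
    (0.001, 0.0, 50),
    (7.25, 0.1, 40),
    (3.0, 0.4, 12),
]

# Largest amount by which |sum| exceeds the bound (should not be positive,
# apart from floating-point rounding).
worst_excess = max(abs(exponential_sum(a, b, n)) - lemma_bound(a, n)
                   for a, b, n in samples)
print(worst_excess)
```

The case $\alpha = 3$ shows why the minimum with $n$ is needed: there every term equals $e(\beta)$ and the sum has modulus exactly $n$.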

Lemma 3.2. Let $m, r, Q \in \mathbb{N}$ with $Q \ge 2$ and $m \le r$. Let $\theta_1, \theta_2, \dots, \theta_m$ be real numbers with $\|\theta_i - \theta_j\| \ge r^{-1}$ whenever $i \ne j$. Then

$$\sum_{i=1}^m \min\{\|\theta_i\|^{-1}, Q\} \le 6\log Q\,(Q + r).$$

Proof. Without loss of generality, $\theta_i \in [-1/2, 1/2]$ and the contribution $S_+$ to the sum from the non-negative $\theta_i$ is at least one half of the total. Suppose the positive $\theta_i$ are ordered: $0 < \theta_1 < \theta_2 < \dots < \theta_k$. Then

$$\sum_{i=1}^k \min\{\|\theta_i\|^{-1}, Q\} = \sum_{i=1}^k \min\{\theta_i^{-1}, Q\} \le \sum_{i=1}^k \min\{r/(i-1), Q\} = \sum_{i=0}^{\lfloor r/Q \rfloor} Q + \sum_{r/Q < i < k} r/i.$$

Estimating the last term with an integral, $S_+ \le (1 + r/Q)Q + 2r(\log k + \log Q - \log r) \le Q + r + 2r\log Q$. Therefore the sum is at most $2S_+ \le 6\log Q\,(Q + r)$. □

The following lemma will be used in this and subsequent sections.

Lemma 3.3. Let $q, Q, R \in \mathbb{N}$, $Q \ge 2$, and let $\alpha \in \mathbb{R}$ be chosen so that there exists $a \in \mathbb{N}$ with $(a, q) = 1$ and $|\alpha - a/q| \le q^{-2}$. Then

$$\sum_{x=0}^R \min\{\|\alpha x + \beta\|^{-1}, Q\} \le 48\log Q\,(Q + q + R + QR/q).$$

Proof. Let $s, t \ge 0$ be natural numbers. Then $\|s\alpha - t\alpha\| \ge \|(s-t)a/q\| - |s-t|q^{-2}$. If $0 < |s-t| \le q/2$ then $a(s-t) \ne 0$ (mod $q$), so this is at least $1/q - q/2q^2 = 1/2q$. In the first case, suppose $R < q/2 - 1$. Then $\beta, \alpha + \beta, \dots, R\alpha + \beta$ are all $(2q)^{-1}$-separated (mod 1), so by Lemma 3.2 with $r = 2q$,

$$\sum_{x=0}^R \min\{\|\alpha x + \beta\|^{-1}, Q\} \le 6\log Q\,(Q + 2q).$$

In the second case, split the sum into segments of size at most $q/2$ (at most $4R/q$ segments in total). By Lemma 3.2, each contributes at most $6\log Q\,(Q + 2q)$. Therefore the sum is at most $24\log Q\,(QR/q + 2R)$. In both cases, we obtain the upper bound $48\log Q\,(Q + q + R + QR/q)$, as required. □

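Theorem 3.4. Let $q, Q \in \mathbb{N}$, $Q \ge 2$, let $(a, q) = 1$ and let $\alpha \in \mathbb{R}$ with $|\alpha - a/q| \le q^{-2}$. Let $\varphi(x) = x^2 + cx + d$. Then

$$\Big|\sum_{x=0}^Q e(\alpha\varphi(x))\Big| \le 20\log Q\,(Q^{1/2} + q^{1/2} + Q/q^{1/2}).$$

Proof. Let $\psi_u(y) = \frac{1}{2u}[\varphi(y+u) - \varphi(y)] = y + u/2 + c/2$. Then

$$\begin{aligned}
\Big|\sum_{x=0}^Q e(\alpha\varphi(x))\Big|^2 &= \sum_{y=0}^Q \sum_{x=0}^Q e(\alpha\varphi(x) - \alpha\varphi(y)) \\
&= \sum_{y=0}^Q \sum_{u=-y}^{Q-y} e(\alpha\varphi(y+u) - \alpha\varphi(y)) \\
&= \sum_{u=-Q}^{-1} \sum_{y=-u}^{Q} e(\alpha\varphi(y+u) - \alpha\varphi(y)) + \sum_{u=0}^{Q} \sum_{y=0}^{Q-u} e(\alpha\varphi(y+u) - \alpha\varphi(y)) \\
&= \sum_{u=-Q}^{Q} \sum_{y \in I_u} e(2\alpha u\psi_u(y)) \\
&\le \sum_{u=-Q}^{Q} \Big|\sum_{y \in I_u} e(2u\alpha y + \beta_u)\Big| \\
&\le \sum_{u=-Q}^{Q} \min\{\|2u\alpha\|^{-1}, Q\} \\
&\le \sum_{u=-2Q}^{2Q} \min\{\|u\alpha\|^{-1}, Q\} \\
&\le 48\log Q\,(Q + q + 2Q + 1 + Q(2Q+1)/q) \\
&\le 200\log Q\,(Q + q + Q^2/q),
\end{aligned}$$

where $I_u$ denotes the range of $y$-summation. This gives the desired bound. □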

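In the next results, we will write $\mathbf{u}_j \equiv u_1, u_2, \dots, u_j$, for convenience. The next lemma is the step required to prove Weyl’s Inequality.

Lemma 3.5. Let $\varphi$ be a monic polynomial of degree $k$ and let $0 \le j \le k-1$. Let $f(\alpha) = \sum_{x=1}^Q e(\alpha\varphi(x))$. Then

$$|f(\alpha)|^{2^j} \le (2Q)^{2^j - j - 1} \sum_{u_1 \in I_1} \sum_{u_2 \in I_2} \cdots \sum_{u_j \in I_j} \Big|\sum_{y \in I_{\mathbf{u}_j}} e(\alpha\varphi_{\mathbf{u}_j}(y))\Big|$$

where $I_1, I_2, \dots, I_j$ are integer intervals, contained in $(-Q, Q]$, such that $I_i$ depends on $u_1, u_2, \dots, u_{i-1}$, $I_{\mathbf{u}_j}$ is a sub-interval of $[Q]$ and $\varphi_{\mathbf{u}_j}$ is a polynomial of degree $k - j$ with leading coefficient $\frac{k!}{(k-j)!}\,u_1u_2\cdots u_j$.

Proof. By induction on $j$. For $j = 0$, the result is clear. Now, as $(\sum_{i=1}^n a_i)^2 \le n\sum_{i=1}^n a_i^2$,

$$\begin{aligned}
|f(\alpha)|^{2^{j+1}} &\le (2Q)^{2^{j+1} - 2j - 2}(2Q)^j \sum_{\mathbf{u}_j} \Big|\sum_{y \in I_{\mathbf{u}_j}} e(\alpha\varphi_{\mathbf{u}_j}(y))\Big|^2 \\
&\le (2Q)^{2^{j+1} - j - 2} \cdot \sum_{\mathbf{u}_j} \sum_{u_{j+1} \in I_{j+1}} \Big|\sum_{y \in I_{\mathbf{u}_{j+1}}} e(\alpha[\varphi_{\mathbf{u}_j}(y + u_{j+1}) - \varphi_{\mathbf{u}_j}(y)])\Big|,
\end{aligned}$$

where $I_{j+1}$ is a subset of $I_{\mathbf{u}_j} - I_{\mathbf{u}_j} \subset (-Q, Q]$ and $I_{\mathbf{u}_{j+1}} \subset I_{\mathbf{u}_j}$. Now $\varphi_{\mathbf{u}_j}(y + u_{j+1}) - \varphi_{\mathbf{u}_j}(y) = \varphi_{\mathbf{u}_{j+1}}(y)$ where $\varphi_{\mathbf{u}_{j+1}}(y)$ has degree $k - j - 1$. The leading coefficient of $\varphi_{\mathbf{u}_{j+1}}(y)$ is $u_{j+1}(k-j) \cdot \frac{k!}{(k-j)!} \cdot u_1u_2\cdots u_j$. □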

We now state and prove Weyl’s Inequality.

Theorem 3.6. Let $(a, q) = 1$, $\alpha \in \mathbb{R}$, $|\alpha - a/q| \le q^{-2}$, $k, Q \in \mathbb{N}$, $Q \ge 2$. Then

$$\Big|\sum_{x=1}^Q e(\alpha\varphi(x))\Big| \le 100(\log Q)^{k/2^{k-1}}\,Q\,(Q^{-1} + q^{-1} + qQ^{-k})^{1/2^{k-1}}.$$

Proof. Let $k \ge 2$. Given $n \in \mathbb{N}$, there are at most $(2\log_2 n)^{2(k-1)}$ ways of writing $n$ as a product of $k-1$ integers. If $m = k!Q^{k-1}$, then

$$\begin{aligned}
|f(\alpha)|^{2^{k-1}} &\le (2Q)^{2^{k-1} - k} \cdot \sum_{u_1, \dots, u_{k-1}} \Big|\sum_y e(\alpha\varphi_{\mathbf{u}_{k-1}}(y))\Big| \\
&\le (2Q)^{2^{k-1} - k} \cdot \sum_{\mathbf{u}_{k-1}} \Big|\sum_y e(\alpha\varphi_{\mathbf{u}_{k-1}}(y))\Big| \\
&\le (2Q)^{2^{k-1} - k} \cdot 2^{k-1}(\log_2 Q)^{k-1} \sum_{x=-m+1}^{m} \min\{Q, \|\alpha x\|^{-1}\} \\
&\le (2Q)^{2^{k-1} - k} \cdot 2^{k-1}(\log_2 Q)^{k-1} \cdot 48\log Q\,(Q + 2k!Q^{k-1} + q + 2k!Q^k/q).
\end{aligned}$$

This is at most $100(\log Q)^k Q^{2^{k-1}}(Q^{-1} + q^{-1} + qQ^{-k})$. □

We briefly look at an important practical application of Weyl’s Inequality, which will lead to Weyl’s Theorem.

Proposition 3.7. Let $\alpha \in \mathbb{R}$, $N \in \mathbb{N}$. Then there exists $q$ with $1 \le q \le N$ such that $\|\alpha q\| \le N^{-1}$.

Proof. Of the reals $0, \alpha, 2\alpha, \dots, N\alpha$, two lie within $N^{-1}$ of each other (mod 1). Thus there exist $s \ne t$ with $\|(s - t)\alpha\| \le N^{-1}$. Set $q = |s - t|$. □
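Proposition 3.7 guarantees a good denominator but the pigeonhole proof does not say which one. For small $N$ it can simply be found by search, as in the sketch below. Exact rational arithmetic is used to avoid rounding issues; the function name `dirichlet_denominator` and the sample value of $\alpha$ are illustrative assumptions.

```python
from fractions import Fraction

def distance_to_nearest_integer(x):
    """The quantity ||x|| for an exact rational x."""
    return abs(x - round(x))

def dirichlet_denominator(alpha, N):
    """Smallest q in [1, N] with ||alpha * q|| <= 1/N, by exhaustive search.

    Proposition 3.7 says such a q always exists, so the final raise is
    only a safeguard.
    """
    for q in range(1, N + 1):
        if distance_to_nearest_integer(alpha * q) <= Fraction(1, N):
            return q
    raise ValueError("no denominator found; this would contradict Proposition 3.7")

alpha = Fraction(314159, 100000)   # a rational stand-in for pi
q = dirichlet_denominator(alpha, 10)
print(q, float(distance_to_nearest_integer(alpha * q)))
```

With $N = 10$ the search returns the denominator of the familiar approximation $22/7$.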

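Lemma 3.8. For $n \in \mathbb{N}$, the number of factors of $n$ is at most $n^{4/\log\log n}$.

Proof. Let $2 \le t \le n$ and write $\tau(n)$ for the number of divisors of $n$. Writing $p^a \,\|\, n$ to mean that $p^a \mid n$ and $p^{a+1} \nmid n$, we have

$$\begin{aligned}
\tau(n) = \prod_{p^a \| n} (a+1) &\le \prod_{p^a \| n,\ p \le t} (a+1) \prod_{p^a \| n,\ p > t} 2^a \\
&\le \Big(1 + \frac{\log n}{\log 2}\Big)^t \Big(\prod_{p^a \| n} p^a\Big)^{\log 2/\log t} \\
&\le \exp\big(t(2 + \log\log n) + \log 2 \cdot \log n/\log t\big).
\end{aligned}$$

On choosing $t = \log n/(\log\log n)^3$, we obtain the result. □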
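The bound of Lemma 3.8 is far from tight for small $n$, which makes it easy to test empirically. The sketch below checks it for $3 \le n \le 5000$; the lower limit $n \ge 3$ is an assumption made so that $\log\log n$ is positive, and the helper names are illustrative.

```python
import math

def divisor_counts(limit):
    """Return a list d with d[n] = number of divisors of n, for 0 <= n <= limit."""
    d = [0] * (limit + 1)
    for a in range(1, limit + 1):
        for multiple in range(a, limit + 1, a):
            d[multiple] += 1
    return d

def satisfies_lemma_3_8(n, tau_n):
    """Check tau(n) <= n ** (4 / log log n), assuming n >= 3."""
    return tau_n <= n ** (4 / math.log(math.log(n)))

LIMIT = 5000
d = divisor_counts(LIMIT)
violations = [n for n in range(3, LIMIT + 1) if not satisfies_lemma_3_8(n, d[n])]
print(violations)  # an empty list means the bound held on the whole range
```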

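Lemma 3.9. Let $A \subset \mathbb{Z}_N$, $|A| = M$, and suppose that $A \cap (-2L, 2L] = \emptyset$. Then there exists $r$ such that $0 < r < (N/L)^2$ and $|\hat A(r)| \ge LM/2N$.

Proof. We have $x, y \in I = (-L, L]$ implies $x - y \in (-2L, 2L]$. Therefore if $(I * I)(s) \ne 0$ then $A(s) = 0$. So $\sum_s (I * I)(s)A(s) = 0$ and hence $\sum_r |\hat I(r)|^2\hat A(r) = 0$, which implies $\sum_{r \ne 0} |\hat I(r)|^2|\hat A(r)| \ge |\hat I(0)|^2|\hat A(0)| = 4L^2M$. However $|\hat I(r)| = |\sum_{s \in I} e(-rs/N)| \le \min\{\|r/N\|^{-1}, 2L\}$. If $-N/2 < r < N/2$, then this is $\min\{N/|r|, 2L\}$. Consequently

$$\begin{aligned}
\sum_{r \ne 0} |\hat I(r)|^2|\hat A(r)| &\le \max_{0 < |r| \le (N/L)^2} |\hat A(r)| \sum_r |\hat I(r)|^2 + M \sum_{|r| > (N/L)^2} (N/r)^2 \\
&\le 2LN \max_{0 < |r| \le (N/L)^2} |\hat A(r)| + 3MN^2/(N/L)^2.
\end{aligned}$$

The last term equals $3ML^2$, so $2LN \max_{0 < |r| \le (N/L)^2} |\hat A(r)| \ge 4L^2M - 3L^2M = L^2M$, which gives the result. □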

The next theorem, due to Weyl [23], is a well-known consequence of Weyl’s Inequality:

Theorem 3.10. For every $k \in \mathbb{N}$ there exists $\varepsilon > 0$ such that for all $M$ sufficiently large and $\alpha \in \mathbb{R}$, there exists $q \le M$ such that $\|q^k\alpha\| \le 2M^{-\varepsilon}$.

Proof. Approximate $\alpha$ arbitrarily closely by a rational $a/N$ with $N$ prime. Without loss of generality, $\alpha = a/N$. If the claim of the theorem is false, then $A = \{a, 2^ka, \dots, M^ka\}$ and $(-2L, 2L]$ are disjoint when $L = \lfloor NM^{-\varepsilon} \rfloor$. Applying Lemma 3.9, we find $r$ such that $0 < r \le (N/L)^2 \le 2M^{2\varepsilon}$, and such that $|\hat A(r)| \ge M^{1-\varepsilon}/2$. Now $|\hat A(r)| = |\sum_{s=1}^M e(\alpha rs^k)|$. Let $q \le M^{k-1}$ with $|\alpha r - p/q| \le (qM^{k-1})^{-1}$. If $q \ge M^{2^{-k}}$, applying Weyl’s inequality gives

$$\Big|\sum_{s=1}^M e(\alpha rs^k)\Big| \le M^{1+\varepsilon} \cdot M^{-1/(2k2^{k-1})},$$

for $M$ sufficiently large. With $\varepsilon = 1/(k2^{k+3})$, this is a contradiction. If $q \le M^{2^{-k}}$, then $\|\alpha qr\| \le M^{1-k}$, by Proposition 3.7. But then $\|\alpha(qr)^k\| \le 2^kM^{1/2}M^{2k\varepsilon}M^{1-k} \le 2M^{-\varepsilon}$ for $M$ sufficiently large. □

§4 Vinogradov’s Three-Primes Theorem

Vinogradov’s famous theorem asserts that every sufficiently large odd number is the sum of three primes. Together with Chen’s theorem (every sufficiently large even number is the sum of p and q, where p is prime and q is the product of at most two primes) this is one of the strongest results in the direction of Goldbach’s conjecture. In this section we shall see how to use exponential-sum estimates to prove Vinogradov’s theorem, and we shall also gain some insight into why Goldbach’s conjecture itself is out of reach.

We begin with some definitions and simple lemmas. Given $n \in \mathbb{N}$, let $\Lambda(n)$ be $\log p$ if $n = p^k$ with $p$ prime, $k \ge 1$, and zero otherwise. Let $\mu(n) = (-1)^k$ if $n$ is a product of $k$ distinct primes (interpreting this as 1 when $n = 1$) and zero otherwise. These functions are called von Mangoldt’s function and the Möbius function respectively.

Lemma 4.1. Let $x \in \mathbb{N}$. Then $\sum_{d \mid x} \Lambda(d) = \log x$.

Proof. Write $x$ as a product of prime powers and it becomes obvious. □

Lemma 4.2. Let $x \in \mathbb{N}$. Then $\sum_{d \mid x} \mu(d) = 0$ unless $x = 1$, in which case $\sum_{d \mid x} \mu(d) = 1$.

Proof. Let $x \ge 2$ and write $x = p_1^{a_1} \cdots p_k^{a_k}$. Then every subset $A \subset [k]$ contributes $(-1)^{|A|}$ to the sum $\sum_{d \mid x} \mu(d)$. But

$$\sum_{A \subset [k]} (-1)^{|A|} = \sum_{j=0}^k (-1)^j\binom{k}{j} = (1 - 1)^k = 0.$$

(Another way of looking at the last calculation is that a randomly chosen subset of $[k]$ has the same chance of being of even as of odd size.) □
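Both divisor-sum identities are easy to confirm by brute force. The following is a minimal sketch using trial division; the function names are illustrative, and the range $1 \le x \le 300$ is an arbitrary choice.

```python
import math

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p (exponent >= 1), else 0."""
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)  # smallest prime factor
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def mobius(n):
    """mu(n) = (-1)^k for a product of k distinct primes, 0 if a square divides n."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def divisors(n):
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]

# Lemma 4.1: the von Mangoldt function sums to log x over the divisors of x.
lemma_4_1_gap = max(abs(sum(von_mangoldt(d) for d in divisors(x)) - math.log(x))
                    for x in range(1, 301))

# Lemma 4.2: the Moebius function sums to 1 for x = 1 and to 0 otherwise.
mobius_sums = {x: sum(mobius(d) for d in divisors(x)) for x in range(1, 301)}

print(lemma_4_1_gap, mobius_sums[1], mobius_sums[12])
```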

Recall that d(x) is defined to be the number of divisors of x. We know from the previous section that d(x) is sometimes quite large. The next lemma shows that this does not happen all that often.

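Lemma 4.3. Let $n \in \mathbb{N}$. Then $\sum_{x \le n} d(x)^2 \le 2n(\log n)^3$.

Proof. This is surprisingly easy to prove. Indeed,

$$\begin{aligned}
\sum_{x \le n} d(x)^2 &= \sum_{x \le n} \sum_{b \mid x} \sum_{c \mid x} 1 \\
&= \sum_{b \le n} \sum_{c \le n} \sum_{y \cdot \mathrm{lcm}(b,c) \le n} 1 \\
&\le \sum_{a \le n} \sum_{d \le n/a} \sum_{e \le n/ad} \sum_{y \le n/ade} 1 \\
&\le \sum_{a \le n} \sum_{d \le n/a} \sum_{e \le n/ad} n/ade \\
&\le \sum_{a \le n} \sum_{d \le n/a} (n/ad)(\log n + 1) \\
&\le \sum_{a \le n} (n/a)(\log n + 1)^2 \\
&\le n(\log n + 1)^3,
\end{aligned}$$

which proves the lemma. □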
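The chain of inequalities ends with the bound $n(\log n + 1)^3$, which holds for every $n \ge 1$ and is therefore the convenient form to test. The sketch below computes $\sum_{x \le n} d(x)^2$ with a simple sieve and compares it with that final bound; the names and the range of $n$ are illustrative.

```python
import math

def divisor_counts(limit):
    """Return a list d with d[x] = number of divisors of x, for 0 <= x <= limit."""
    d = [0] * (limit + 1)
    for a in range(1, limit + 1):
        for multiple in range(a, limit + 1, a):
            d[multiple] += 1
    return d

def bound_holds_up_to(limit):
    """Check sum_{x <= n} d(x)^2 <= n * (log n + 1)^3 for every n <= limit."""
    d = divisor_counts(limit)
    running_sum = 0
    for n in range(1, limit + 1):
        running_sum += d[n] ** 2
        if running_sum > n * (math.log(n) + 1) ** 3:
            return False
    return True

print(bound_holds_up_to(3000))
```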

It is easy to check that the number of ways of writing $n$ as the sum of three primes is $\int_0^1 F(\alpha)^3 e(-\alpha n)\,d\alpha$, where $F(\alpha)$ is the function $\sum_{p \le n} e(\alpha p)$. Roughly speaking, our aim will be to estimate $F(\alpha)$ for every $\alpha$, and use this estimate to prove that the integral is non-zero. As in the previous section, $F(\alpha)$ turns out to be small when $\alpha$ is not too close to a rational with small denominator. When it is close to such a rational, we shall use results about the distribution of primes in an arithmetic progression to estimate $F(\alpha)$ directly.

There are, however, certain advantages in weighting the primes so that their density is approximately constant through the interval. Since the density near $m$ is $(\log m)^{-1}$, the appropriate weight to give $p$ is $\log p$. Accordingly, we shall estimate the function $f(\alpha) = \sum_{p \le n} \log p\, e(\alpha p)$. The integral $\int_0^1 f(\alpha)^3 e(-\alpha n)\,d\alpha$ gives us the sum of $(\log p_1)(\log p_2)(\log p_3)$ over all triples $(p_1, p_2, p_3)$ such that $p_1 + p_2 + p_3 = n$, so for the purposes of Vinogradov’s theorem it is enough to prove that this integral is non-zero for large enough odd $n$.
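The integral formula for the number of representations can be tested exactly, because the integrand is a trigonometric polynomial. In the sketch below the integral over $[0, 1]$ is replaced by an average over the points $r/M$ with $M = 2n + 1$; this choice of $M$ is an assumption that suffices because $p_1 + p_2 + p_3 - n$ lies strictly between $-M$ and $M$. Ordered triples are counted, and the function names are illustrative.

```python
import cmath

def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, int(n ** 0.5) + 1):
        if is_prime[p]:
            for multiple in range(p * p, n + 1, p):
                is_prime[multiple] = False
    return [p for p in range(2, n + 1) if is_prime[p]]

def count_by_brute_force(n):
    """Number of ordered triples of primes (p1, p2, p3) with p1 + p2 + p3 = n."""
    primes = primes_up_to(n)
    prime_set = set(primes)
    return sum(1 for p1 in primes for p2 in primes if (n - p1 - p2) in prime_set)

def count_by_exponential_sums(n):
    """Average of F(r/M)^3 * e(-r*n/M) over r in Z_M, with M = 2n + 1."""
    primes = primes_up_to(n)
    M = 2 * n + 1
    total = 0
    for r in range(M):
        F = sum(cmath.exp(2j * cmath.pi * ((r * p) % M) / M) for p in primes)
        total += F ** 3 * cmath.exp(-2j * cmath.pi * ((r * n) % M) / M)
    return (total / M).real

for n in (9, 15, 21, 33):
    print(n, count_by_brute_force(n), round(count_by_exponential_sums(n)))
```

Replacing each term $e(\alpha p)$ by $\log p \, e(\alpha p)$ in `F` would give the weighted count discussed above instead.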

Finally, even this function is not always the most convenient to estimate. The next lemma shows that we may replace it by $g(\alpha) = \sum_{x \le n} \Lambda(x)e(\alpha x)$, with only a small error.

Lemma 4.4. $|f(\alpha) - g(\alpha)| \le C\sqrt{n}$ for every $\alpha$ and some absolute constant $C$.

Proof. $g(\alpha) - f(\alpha) = \sum_{p^k \le n,\, k \ge 2} \log p\, e(\alpha p^k)$, which in modulus is at most $(\log_2 n)\sum_{p \le \sqrt{n}} 1$. By Chebyshev’s theorem the result follows. □

The next lemma is similar to the lemma we kept using during the proof of Weyl’s inequality, and follows from it. Since we are about to prove several results with the same hypotheses, let us state them once and for all before starting. Thus, $a$ and $q$ will be positive integers with $(a, q) = 1$ and $\alpha$ is a real number with $|\alpha - a/q| \le q^{-2}$.

Lemma 4.5. Let $Q, R$ be positive integers with $q \le Q$. Then

$$\sum_{x=1}^R \min\{\|\alpha x\|^{-1}, Qx^{-1}\} \le 200\log Q\log R\,(q + R + Qq^{-1}).$$

Proof. We know, from chapter 3, that the numbers $0, \alpha, 2\alpha, \dots, \lfloor q/2 \rfloor\alpha$ are $(2q)^{-1}$-separated. Therefore,

$$\sum_{x \le q/2} \min\{\|\alpha x\|^{-1}, Qx^{-1}\} \le 2\sum_{x \le \lceil q/4 \rceil} 2q/x \le 4q\log q.$$

Given an integer $i$, let $S_i$ be the sum $\sum_{x=2^{i-1}}^{2^i - 1} \min\{\|\alpha x\|^{-1}, Qx^{-1}\}$. Then

$$S_i \le \sum_{x=2^{i-1}}^{2^i - 1} \min\{\|\alpha x\|^{-1}, Q/2^{i-1}\}$$

which, by Lemma 3.3 of the last chapter, is at most $48\log Q\,(2^{-(i-1)}Q + 2^{i-1} + q + Qq^{-1})$. Summing over all $i$ such that $2^i > q/2$ and $2^{i-1} \le R$, we obtain the desired result. □

We now prove an identity due to Vaughan [21], which will allow us to show that g(fi) is small when fi is not close to a rational with small denominator. This identity seems mysterious when it is just drawn out of a hat, but the mystery can be reduced with a few remarks.

We wish to show that $g(\alpha) = \sum_{x \le n} \Lambda(x)e(\alpha x)$ is appreciably smaller than $n$ when $q$ is not too small (or too large). The function which is hard to understand is of course $\Lambda$, but we know that $\Lambda$ has the nice property that $\sum_{d \mid x} \Lambda(d) = \log x$, which is much more familiar. Therefore, we try to express $g(\alpha)$ as a sum of pieces of this form. As a first observation, we notice (or rather, it has been noticed) that

$$\sum_{x \le n} \sum_{y \le n/x} \Lambda(x)e(\alpha xy) = \sum_{u \le n} \sum_{x \mid u} \Lambda(x)e(\alpha u).$$

This is very promising, because

$$\begin{aligned}
\sum_{x \le n} \Lambda(x)e(\alpha x) &= \sum_{x \le n} \sum_{y \le n/x} \sum_{d \mid y} \mu(d)\Lambda(x)e(\alpha xy) \\
&= \sum_{d \le n} \mu(d) \sum_{z \le n/d} \sum_{x \le n/zd} \Lambda(x)e(\alpha dxz),
\end{aligned}$$

which is a $\pm 1$-combination of sums of the required form, and therefore seems to have a chance of being small.

Now it is clearly not easy to obtain a good estimate for the last sum directly, because $d$ takes $n$ possible values and for each one we are not going to do better than a modulus of 1. It is therefore essential to restrict $d$. However, this introduces a new error term which must be shown to be small. Moreover, showing that this error term is small turns out not to be possible unless we also restrict $x$ to be not too small. We now prove the identity by a process of trial and error, starting with the observations above.

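Lemma 4.6. Let $X = n^{2/5}$. Then $g(\alpha) = \sum_{x \le n} \Lambda(x)e(\alpha x) = S - T - U + O(n^{2/5})$, where

$$S = \sum_{d \le X} \mu(d) \sum_{z \le n/d} \sum_{x \le n/zd} \Lambda(x)e(\alpha dxz),$$

$$T = \sum_{d \le X} \mu(d) \sum_{z \le n/d} \sum_{x \le X,\, x \le n/zd} \Lambda(x)e(\alpha dxz)$$

and

$$U = \sum_{X < u \le n/X} \sum_{d \mid u,\, d \le X} \mu(d) \sum_{X < x \le n/u} \Lambda(x)e(\alpha xu).$$

Proof. Write $\tau_u = \sum_{d \mid u,\, d \le X} \mu(d)$. By Lemma 4.2, $\tau_1 = 1$ and $\tau_u = 0$ when $1 < u \le X$, while the inner sum over $X < x \le n/u$ is empty when $u > n/X$. Hence

$$\sum_{u \le n} \tau_u \sum_{X < x \le n/u} \Lambda(x)e(\alpha xu) = U + \sum_{X < x \le n} \Lambda(x)e(\alpha x).$$

Since $\sum_{x \le X} \Lambda(x) = O(X)$, we have

$$\sum_{X < x \le n} \Lambda(x)e(\alpha x) = g(\alpha) + O(n^{2/5}).$$

On the other hand, writing $u = dz$,

$$\sum_{u \le n} \tau_u \sum_{X < x \le n/u} \Lambda(x)e(\alpha xu) = \sum_{d \le X} \mu(d) \sum_{z \le n/d} \sum_{X < x \le n/zd} \Lambda(x)e(\alpha dxz) = S - T.$$

Combining these identities gives the result. □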
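The decomposition in Lemma 4.6 can be verified numerically for small $n$. The sketch below rests on an assumption about the garbled definitions: $U$ is taken to be $\sum_{X < u \le n/X} \tau_u \sum_{X < x \le n/u} \Lambda(x)e(\alpha xu)$ with $\tau_u = \sum_{d \mid u,\, d \le X} \mu(d)$. Under that reading, $S - T - U$ equals $\sum_{X < x \le n} \Lambda(x)e(\alpha x)$ exactly, for any integer cut-off $X \ge 1$. The function names and the parameters $n = 300$, $X = 9$, $\alpha = \sqrt{2}$ are illustrative.

```python
import cmath
import math

def e(x):
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def von_mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, else 0."""
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)
    while n % p == 0:
        n //= p
    return math.log(p) if n == 1 else 0.0

def mobius(n):
    """The Moebius function by trial division."""
    result, d = 1, 2
    while d * d <= n:
        if n % d == 0:
            n //= d
            if n % d == 0:
                return 0
            result = -result
        d += 1
    return -result if n > 1 else result

def vaughan_pieces(alpha, n, X):
    """Return (S, T, U) for an integer cut-off X, under the assumed definitions."""
    lam = [0.0] + [von_mangoldt(x) for x in range(1, n + 1)]
    S = T = U = 0
    for d in range(1, X + 1):
        mu = mobius(d)
        if mu == 0:
            continue
        for z in range(1, n // d + 1):
            for x in range(1, n // (z * d) + 1):
                term = mu * lam[x] * e(alpha * d * x * z)
                S += term
                if x <= X:
                    T += term
    for u in range(X + 1, n // X + 1):
        tau = sum(mobius(d) for d in range(1, X + 1) if u % d == 0)
        for x in range(X + 1, n // u + 1):
            U += tau * lam[x] * e(alpha * x * u)
    return S, T, U

n, X, alpha = 300, 9, math.sqrt(2)
S, T, U = vaughan_pieces(alpha, n, X)
g = sum(von_mangoldt(x) * e(alpha * x) for x in range(1, n + 1))
tail = sum(von_mangoldt(x) * e(alpha * x) for x in range(X + 1, n + 1))
identity_gap = abs((S - T - U) - tail)
error_term = abs(g - (S - T - U))   # equals |sum_{x <= X} Lambda(x) e(alpha x)|
print(identity_gap, error_term)
```

The quantity `error_term` is the $O(n^{2/5})$ term of the lemma: it is bounded by $\sum_{x \le X} \Lambda(x)$.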

In the next three lemmas, we show that each of $S$, $T$ and $U$ is small. Notice that $S$ is the sum we originally expected to be able to bound, and is therefore in a sense the important one, while $T$ and $U$ are error terms that we were unable to avoid introducing.

Lemma 4.7. $|S| \le 80(\log n)^3(q + X + n/q)$.

Proof. Writing $u$ for $xz$, we have

$$|S| = \Big|\sum_{d \le X} \mu(d) \sum_{u \le n/d} \sum_{x \mid u} \Lambda(x)e(\alpha du)\Big| \le \sum_{d \le X} \Big|\sum_{u \le n/d} \log u\, e(\alpha du)\Big|$$

by Lemma 4.1. But

$$\begin{aligned}
\Big|\sum_{u \le n/d} \log u\, e(\alpha du)\Big| &= \Big|\sum_{u \le n/d} e(\alpha du) \int_1^u dt/t\Big| \\
&= \Big|\int_1^{n/d} \sum_{t \le u \le n/d} e(\alpha du)\, dt/t\Big| \\
&\le \int_1^{n/d} \min\{\|\alpha d\|^{-1}, n/d\}\, dt/t \\
&\le \log n\, \min\{\|\alpha d\|^{-1}, n/d\}.
\end{aligned}$$

Summing over $d \le X$ and applying Lemma 4.5 (taking into account that $\log X = (2/5)\log n$) we obtain the bound claimed. □

Lemma 4.8. $|T| \le 160(\log n)^3(q + X^2 + n/q)$.

Proof. Interchanging the order of summation of $z$ and $x$ in the definition of $T$, and using the fact that $|\mu(d)| \le 1$, we have

$$|T| \le \sum_{d \le X} \sum_{x \le X} \Lambda(x) \Big|\sum_{z \le n/dx} e(\alpha dxz)\Big|.$$

Now let $y = dx$, and this becomes

$$\sum_{y \le X^2} \sum_{x \le X,\, x \mid y} \Lambda(x) \Big|\sum_{z \le n/y} e(\alpha yz)\Big|.$$

By Lemma 4.1, $\sum_{x \le X,\, x \mid y} \Lambda(x) \le \log y \le \log n$, so we can bound this above by

$$\log n \sum_{y \le X^2} \min\{\|\alpha y\|^{-1}, n/y\}$$

which is at most the bound stated, by Lemma 4.5. □

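Lemma 4.9. $|U| \le 40(\log n)^4(n^{1/2}q^{1/2} + n/X^{1/2} + nq^{-1/2})$.

Proof. Given a positive integer $i$, let $U_i$ be the sum

$$\sum_{u=2^{i-1}}^{2^i - 1} |\tau_u| \Big|\sum_{X < x \le n/u} \Lambda(x)e(\alpha xu)\Big|,$$

where $\tau_u = \sum_{d \mid u,\, d \le X} \mu(d)$. Then $|U|$ is at most the sum of the $U_i$ over those $i$ with $2^i > X$ and $2^{i-1} < n/X$. It is easy to check that there are at most $\log n$ such values of $i$. (The fact that $2^i$ is between roughly $n^{2/5}$ and roughly $n^{3/5}$ more than compensates for the replacement of $\log_2 n$ by $\log n$.) We shall estimate the $U_i$ separately. By the Cauchy-Schwarz inequality,

$$U_i^2 \le \Big(\sum_{u=2^{i-1}}^{2^i - 1} |\tau_u|^2\Big)\Big(\sum_{u=2^{i-1}}^{2^i - 1} \Big|\sum_{X < x \le n/u} \Lambda(x)e(\alpha xu)\Big|^2\Big).$$

Since $|\tau_u| \le d(u)$, Lemma 4.3 shows that the first factor is at most $2^i(\log n)^3$ when $n$ is sufficiently large. Expanding the square, the second factor equals

$$\sum_{u=2^{i-1}}^{2^i - 1} \sum_{X < x,\, y \le n/u} \Lambda(x)\Lambda(y)e(\alpha(x - y)u),$$

which, on interchanging the order of summation, is at most

$$\sum_{X < x,\, y \le n/2^{i-1}} \Lambda(x)\Lambda(y) \Big|\sum_u e(\alpha(x - y)u)\Big| \le \sum_{X < x,\, y \le n/2^{i-1}} \Lambda(x)\Lambda(y) \min\{\|\alpha(x - y)\|^{-1}, 2^{i-1}\}.$$

Writing $z = x - y$ and using $\Lambda(x)\Lambda(y) \le (\log n)^2$, this is at most

$$(\log n)^2(n/2^{i-1}) \sum_{|z| \le n/2^{i-1}} \min\{\|\alpha z\|^{-1}, 2^{i-1}\},$$

which, by Lemma 3.3 of the last chapter, is at most

$$(\log n)^2(n/2^{i-1}) \cdot 48\log n\,(q + n/2^{i-2} + 2^{i-1} + 2n/q).$$

Multiplying the two estimates together, we have shown that

$$U_i^2 \le 96n(\log n)^6(q + 4n/2^i + 2^{i-1} + 2n/q),$$

which implies, since $n/2^i$ and $2^{i-1}$ are at most $n/X$, that

$$U_i \le 40(\log n)^3(n^{1/2}q^{1/2} + n/X^{1/2} + nq^{-1/2}).$$

Since there are at most $\log n$ values of $i$ such that $U_i$ contributes to $U$, the result follows. □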

Remarks. It may look complicated to split the sum into $\log n$ (or so) further pieces, but this was a good (and standard) thing to do because we were estimating something of the form $\sum_u f(u)g(u)$, where $f(u)$ appeared to be roughly proportional to $u$ and $g(u)$ roughly proportional to $u^{-1}$. So applying the Cauchy-Schwarz inequality straight away would have been disastrous. Note that the choice of $X = n^{2/5}$ was made in order to minimize $\max\{X^2, nX^{-1/2}\}$.

If we put together Lemmas 4.4 and 4.6 to 4.9 we obtain the following result.

Theorem 4.10. Let $a, q$ be positive integers with $(a, q) = 1$ and let $\alpha$ be a real number such that $|\alpha - a/q| \le 1/q^2$. Then $\sum_{x \le n} \Lambda(x)e(\alpha x)$ and $\sum_{p \le n} \log p\, e(\alpha p)$ are both, in modulus, at most $50(\log n)^4(n^{1/2}q^{1/2} + n^{4/5} + nq^{-1/2})$, when $n$ is sufficiently large.

We have now managed to show that $f(\alpha)$ is small, provided that $q$ is not too small. The usual approach to the rest of the proof is to estimate $f(\alpha)$ when $\alpha$ is close to a rational with small denominator, using the Siegel-Walfisz theorem (see [19]), and then combine these results to obtain a fairly accurate estimate for $\int f(\alpha)^3 e(-\alpha n)\,d\alpha$ (in particular, accurate enough to show that it is non-zero). In these notes, a different argument is used, which is believed to explain, in a more intuitive way, why the integral comes out to be positive. It has the added advantage that we do not actually need to estimate the integral at all accurately, although it is possible to work harder in order to do so.

The main idea is to work out exactly what is meant by the familiar idea that the primes are somehow randomly distributed. A minor problem to worry about first is that there are more small primes than large ones, but we have already dealt with that by weighting a prime $p$ by $\log p$. Now, in chapter 2, we thought of a subset $A$ of $\{1, 2, \dots, n\}$ as being random if the Fourier coefficients $\hat A(r)$ were all much smaller than $n$, for non-zero $r$. However, it is clear that the primes are not random in this sense, because, for example, only one prime is a multiple of five.

Which constraints of this kind have an effect on Fourier coefficients? It is an easy exercise to show that congruence conditions mod $q$ have an effect if and only if $q$ is small. Motivated by this observation, we let $p_1, \dots, p_k$ be the primes less than or equal to $(\log n)^A$, in ascending order, and define $Q$ to be the set of integers less than or equal to $n$ that are not multiples of any $p_i$. Here, $A$ is an absolute constant (in fact we shall choose $A = 16$), but there is some freedom in the argument, and we could have made $p_k$ quite a bit larger. What we shall do in the rest of the section is show that the weighted primes behave like a random subset of $Q$.

It is not hard to work out how to interpret this statement. It means that the Fourier transforms $f(\alpha) = \sum_{p \le n} \log p\, e(\alpha p)$ and $h(\alpha) = \sum_{x \in Q} e(\alpha x)$ are roughly proportional. This implies that integrals involving these functions are also roughly proportional, so that, roughly speaking, whatever is true for $Q$ is true for the weighted primes as well. (That “roughly speaking” is important: a good exercise is to see why Lemma 4.20 below does not translate into a solution of the Goldbach conjecture.)

We begin by obtaining an estimate similar to Theorem 4.10 for the function h(fi). The proof is much simpler, however.

Lemma 4.11. Suppose that $(a, q) = 1$ and $|\alpha - a/q| \le q^{-2}$. Then

$$|h(\alpha)| \le 100(\log n)^2(n^{1/2} + q + nq^{-1} + n^{1 - 1/4A}).$$

Proof. Notice first that

$$h(\alpha) = \sum_{s=0}^k (-1)^s \sum_{1 \le i_1 < \dots < i_s \le k}\ \sum_{y \le n/p_{i_1} \cdots p_{i_s}} e(\alpha p_{i_1} \cdots p_{i_s} y).$$

The justification of this is similar to the proof of Lemma 4.2. If $z \in Q$ then $e(\alpha z)$ is added when $s = 0$, and otherwise does not appear. If $z \notin Q$ then $z = p_{j_1}^{a_1} \cdots p_{j_r}^{a_r} w$ for some $w \in Q$, and $a_i \ge 1$, and $e(\alpha z)$ is added $(-1)^{|B|}$ times for every subset $B$ of $\{j_1, \dots, j_r\}$, giving a total contribution of zero.

The inner sum is at most $\min\{\|\alpha p_{i_1} \cdots p_{i_s}\|^{-1}, n/p_{i_1} \cdots p_{i_s}\}$. Let $t = \log n/2A\log\log n$ and note that $p_k^t \le \sqrt{n}$. These estimates and the fundamental theorem of arithmetic imply that

$$\Big|\sum_{s=0}^t (-1)^s \sum_{1 \le i_1 < \dots < i_s \le k}\ \sum_{y \le n/p_{i_1} \cdots p_{i_s}} e(\alpha p_{i_1} \cdots p_{i_s} y)\Big|$$

is at most $\sum_{x \le \sqrt{n}} \min\{\|\alpha x\|^{-1}, n/x\}$, which, by Lemma 4.5, is at most $100(\log n)^2(n^{1/2} + q + nq^{-1})$.

The rest of the sum is, in modulus, at most

$$n \sum_{s=t+1}^k\ \sum_{1 \le i_1 < \dots < i_s \le k}\ \prod_{j=1}^s p_{i_j}^{-1},$$

which is at most

$$n \sum_{s=t+1}^k (s!)^{-1}(p_1^{-1} + \dots + p_k^{-1})^s.$$

It is well known (and follows from the prime number theorem) that $p_1^{-1} + \dots + p_k^{-1}$ is about $\log\log k$, and so at most $2\log\log\log n$, when $n$ is sufficiently large. Approximating $s!$ by $(s/e)^s$, we obtain an upper bound of $2n(2e\log\log\log n/t)^t$, since $t \ge 4e\log\log\log n$. It is not hard to check that this is at most $n^{1 - 1/4A}$ when $n$ is sufficiently large. This, together with the first estimate, proves the lemma. □

We now turn to the “major-arcs” estimates, that is, estimates for $f(\alpha)$ and $h(\alpha)$ when $\alpha$ is close to a rational with small denominator. It turns out that such estimates are more or less equivalent to estimating $\sum_{p \in X} \log p$ and $|X \cap Q|$ for certain long arithmetic progressions $X$. In the case of the primes themselves, we shall appeal to known estimates of this type, as given in the next result, the Siegel-Walfisz theorem.

Siegel-Walfisz Theorem. Let $A$ be a positive real number, let $x$ be an integer, let $q \le (\log x)^A$ be another integer and let $(a, q) = 1$. Then
\[ \sum_{p \le x,\ p \equiv a \ (q)} \log p = \frac{x}{\varphi(q)} + O\bigl( x \exp(-C \sqrt{\log x}) \bigr), \]
where $C$ is a constant depending on $A$ only.

Notice that from the Siegel-Walfisz Theorem it follows that, if $q \le (\log n)^A$, and $X$ is the arithmetic progression $\{a, a+q, \dots, a+(m-1)q\}$, where $(a, q) = 1$ and $1 \le a \le n - (m-1)q$, then for any constant $B$, we have
\[ \sum_{p \in X} \log p = \frac{mq}{\varphi(q)} + O\bigl( n/(\log n)^B \bigr), \]
with the implied constant in the error term depending on $A$ and $B$ only.

We shall now obtain an estimate for $|X \cap Q|$, when $X$ is an arithmetic progression of the kind above.

Lemma 4.13. Let $q \le (\log n)^A$, let $X = \{a, a+q, \dots, a+(m-1)q\}$ be a subset of $[n]$ with $m \ge n^{1/2}$ and suppose that $(q, a) = 1$. Then
\[ |X \cap Q| = \frac{mq}{\varphi(q)} \prod_{i=1}^{k} (1 - p_i^{-1}) + O(m n^{-1/4A}). \]

Proof. Let $x \in X$ be chosen uniformly at random, and for each $i$ let $X_i$ be the event $p_i \mid x$. Then the probability of $X_i$ is $p_i^{-1} + O(m^{-1})$ if $p_i \nmid q$ and $O(m^{-1})$ if $p_i \mid q$. More generally, for any choice $1 \le i_1 < \dots < i_s \le k$ we have
\[ \mathrm{Prob}(X_{i_1} \cap \dots \cap X_{i_s}) = \prod_{j=1}^{s} \epsilon_{i_j}/p_{i_j} + O(m^{-1}), \]
where $\epsilon_i = 1$ if $p_i \nmid q$ and $0$ if $p_i \mid q$. It follows from this and the inclusion-exclusion formula that, for any $t$,
\[ 1 - \mathrm{Prob}\Bigl( \bigcup_{i=1}^{k} X_i \Bigr) = \sum_{s=0}^{t} (-1)^s \sum_{1 \le i_1 < \dots < i_s \le k} \ \prod_{j=1}^{s} \epsilon_{i_j}/p_{i_j} + O\Bigl( m^{-1} \sum_{s=1}^{t} \binom{k}{s} \Bigr). \]
Now
\[ \prod_{i=1}^{k} (1 - \epsilon_i/p_i) = \sum_{s=0}^{k} (-1)^s \sum_{1 \le i_1 < \dots < i_s \le k} \ \prod_{j=1}^{s} \epsilon_{i_j}/p_{i_j} \]
and
\[ \sum_{1 \le i_1 < \dots < i_s \le k} \ \prod_{j=1}^{s} \epsilon_{i_j}/p_{i_j} \le (s!)^{-1} \bigl( p_1^{-1} + \dots + p_k^{-1} \bigr)^s \le (4e \log\log\log n / s)^s \]
when $n$ is sufficiently large. If $t \ge 8e \log\log\log n$, then this quantity summed from $t+1$ to $k$ is at most $(4e \log\log\log n / t)^t$. Furthermore, $\sum_{s=1}^{t} \binom{k}{s}$ is easily seen to be at most $k^t$. It follows that
\[ 1 - \mathrm{Prob}\Bigl( \bigcup_{i=1}^{k} X_i \Bigr) = \prod_{i=1}^{k} (1 - \epsilon_i/p_i) + O\bigl( m^{-1} (\log n)^{At} + (4e \log\log\log n / t)^t \bigr). \]
Choosing $t$ to be $\log n / 2A \log\log n$ gives an error of at most $O(n^{-1/4A})$, as in the proof of Lemma 4.11. Note finally that
\[ \prod_{i=1}^{k} (1 - \epsilon_i/p_i) = \prod_{i=1}^{k} (1 - 1/p_i) \prod_{p_i \mid q} (1 - 1/p_i)^{-1} = \prod_{i=1}^{k} (1 - 1/p_i) \prod_{p \mid q} (1 - 1/p)^{-1} = \frac{q}{\varphi(q)} \prod_{i=1}^{k} (1 - p_i^{-1}). \]
Multiplying everything by $m$ proves the lemma. □

Corollary 4.14. Let $a$, $q$, $X$ be as in Lemma 4.13, let $K = \prod_{i=1}^{k} (1 - p_i^{-1})^{-1}$ and let $B$ be any positive constant. Then
\[ K |X \cap Q| - \sum_{p \in X} \log p = O\bigl( n (\log n)^{-B} \bigr). \]

Proof. This follows immediately from Lemma 4.13 and the remark following the Siegel-Walfisz Theorem (Theorem 4.12). (Strictly speaking one must consider what happens if $(a, q) \ne 1$, but then it is easy to see that both $K |X \cap Q|$ and $\sum_{p \in X} \log p$ are very small.) □

Lemma 4.15. Let $q \le (\log n)^A$, let $(b, q) = 1$ and let $\alpha$ be a real number such that $|\alpha - b/q| \le (\log n)^A / qn$. Let $G$ be a function from $\{1, 2, \dots, n\}$ to $\mathbb{R}$ such that $|G(x)| \le \log n$ for every $x$ and such that
\[ \Bigl| \sum_{x \in X} G(x) \Bigr| = O\bigl( n (\log n)^{-B} \bigr) \]
for every arithmetic progression $X = \{a, a+q, \dots, a+(m-1)q\}$, where $B \ge 4A + 2$. Then
\[ \Bigl| \sum_{x \le n} G(x) e(\alpha x) \Bigr| = O\bigl( n (\log n)^{-A} \bigr). \]

Proof. Let $\beta = \alpha - b/q$ and let $X$ be one of the arithmetic progressions of the above type. Notice that, if $x, y \in X$, then
\[ |e(\beta x) - e(\beta y)| = |1 - e(\beta(x - y))| \le 2\pi |x - y| |\beta| \le 2\pi m (\log n)^A / n. \]

Therefore, letting $x_0$ be an arbitrary element of $X$, we have
\[
\begin{aligned}
\Bigl| \sum_{x \in X} G(x) e(\alpha x) \Bigr| &= \Bigl| \sum_{x \in X} G(x) e(bx/q) e(\beta x) \Bigr| \\
&\le \Bigl| e(ab/q) \sum_{x \in X} G(x) \bigl( e(\beta x) - e(\beta x_0) \bigr) \Bigr| + \Bigl| e(ab/q) e(\beta x_0) \sum_{x \in X} G(x) \Bigr| \\
&= \Bigl| \sum_{x \in X} G(x) \bigl( e(\beta x) - e(\beta x_0) \bigr) \Bigr| + \Bigl| \sum_{x \in X} G(x) \Bigr| \\
&\le \bigl( 2\pi m (\log n)^A / n \bigr) m \log n + O\bigl( n (\log n)^{-B} \bigr) \\
&= O\bigl( (\log n)^{A+1} m^2 n^{-1} + n (\log n)^{-B} \bigr).
\end{aligned}
\]

But we can partition $[n]$ into $2n/m_0$ arithmetic progressions of the form of $X$, with $m \le m_0$ in each case. Therefore, choosing $m_0 = n (\log n)^{-B/2}$ and summing over all these, we find that
\[ \Bigl| \sum_{x \le n} G(x) e(\alpha x) \Bigr| = O\bigl( n (\log n)^{A + 1 - B/2} \bigr), \]
which proves the result, since $B \ge 4A + 2$. □

Recall that $f(\alpha) = \sum_{p \le n} \log p \, e(\alpha p)$. Let us define $h_1(\alpha)$ to be $K \sum_{x \in Q} e(\alpha x) = K h(\alpha)$.

Corollary 4.16. Let $A = 16$. Then, for every real number $\alpha$, $f(\alpha) - h_1(\alpha) = O\bigl( n (\log n)^{-A/4} \bigr)$.

Proof. Let $\alpha$ be a real number. Then we can find $q \le n (\log n)^{-A}$ and $b$ with $(b, q) = 1$ such that $|\alpha - b/q| \le (\log n)^A / nq$. If $q \ge (\log n)^A$, then Theorem 4.10 implies that $f(\alpha) = O\bigl( n (\log n)^{4 - A/2} \bigr)$, while Lemma 4.11 (with an easy estimate for $K$) implies that $h_1(\alpha) = O\bigl( n (\log n)^{3 - A} \bigr)$, so the result holds.

If on the other hand $q \le (\log n)^A$, then set $G(x) = \log x - KQ(x)$ if $x$ is prime, and $-KQ(x)$ otherwise. Corollary 4.14 tells us that $G$ satisfies the conditions for Lemma 4.15. But $\sum_{x \le n} G(x) e(\alpha x) = f(\alpha) - h_1(\alpha)$, so Lemma 4.15 gives us the result in this case. □

This is all we need for the three-primes theorem. However, it is perhaps of some interest to obtain an actual estimate for $f(\alpha)$ and $h_1(\alpha)$ when $q$ is small, rather than merely showing that they are close. So the next two lemmas are here for interest only.

For notational convenience, when we write $(a, q) = 1$ in the next lemma we shall mean that $a$ and $q$ are coprime and that $1 \le a \le q$.

Lemma 4.17. For every $q$, $\sum_{(a,q)=1} e(a/q) = \mu(q)$.

Proof. If $q = 1$ then the result holds. If $q$ is a prime, then
\[ \sum_{(a,q)=1} e(a/q) = \sum_{1 \le a \le q} e(a/q) - 1 = 0 - 1 = -1. \]
If $q = p^k$ is a prime power with $k \ge 2$, then
\[ \sum_{(a,q)=1} e(a/q) = \sum_{1 \le a \le q} e(a/q) - \sum_{1 \le b \le p^{k-1}} e(b/p^{k-1}) = 0 - 0 = 0. \]
Finally, if $q$ and $r$ are coprime, then
\[ \sum_{(a,q)=1} e(a/q) \sum_{(b,r)=1} e(b/r) = \sum_{(a,q)=1,\ (b,r)=1} e\bigl( (ar + bq)/qr \bigr). \]
But $ar + bq$ runs through all residues mod $qr$, and $(ar + bq, qr) = 1$ if and only if $(a, q) = 1$ and $(b, r) = 1$. So the sum is $\sum_{(a,qr)=1} e(a/qr)$. These properties of the left hand side force it to equal $\mu$. □
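Lemma 4.17 is easy to test numerically for small $q$. The sketch below is illustrative only: the helper names `coprime_exp_sum` and `mobius` are hypothetical, and the Möbius function is computed by plain trial division.

```python
import cmath
from math import gcd


def coprime_exp_sum(q):
    """Sum of e(a/q) over 1 <= a <= q with (a, q) = 1."""
    return sum(cmath.exp(2j * cmath.pi * a / q)
               for a in range(1, q + 1) if gcd(a, q) == 1)


def mobius(q):
    """Moebius function by trial division: 0 if a square divides q,
    otherwise (-1)^(number of prime factors)."""
    result = 1
    p = 2
    while p * p <= q:
        if q % p == 0:
            q //= p
            if q % p == 0:  # p^2 divided the original q
                return 0
            result = -result
        p += 1
    if q > 1:  # one prime factor left over
        result = -result
    return result


# Largest discrepancy between the two sides for q < 60.
worst = max(abs(coprime_exp_sum(q) - mobius(q)) for q in range(1, 60))
```

Up to rounding error, `worst` should be zero, matching the three cases of the proof (primes, higher prime powers, and multiplicativity).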

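Now, given $q \le (\log n)^A$, let us define a function $H_q : [n] \to \mathbb{R}$ by letting $H_q(x)$ equal $q/\varphi(q)$ if $(x, q) = 1$ and zero otherwise.

Lemma 4.18. Let $q \le (\log n)^A$, let $(b, q) = 1$ and let $\alpha$ be a real number such that $|\alpha - b/q| \le (\log n)^A / nq$. Let $\beta = \alpha - b/q$. Then
\[ \sum_{x \le n} H_q(x) e(\alpha x) = \frac{\mu(q)}{\varphi(q)} \sum_{x \le n} e(\beta x) + O\bigl( (\log n)^{2A} \bigr). \]

Proof. Let us write $X_a$ for the set of integers less than or equal to $n$ and congruent to $a$ mod $q$. If $(a, q) \ne 1$, then clearly $\sum_{x \in X_a} H_q(x) e(\alpha x) = 0$. On the other hand, if $(a, q) = 1$, then
\[ \sum_{x \in X_a} H_q(x) e(\alpha x) = \frac{q}{\varphi(q)} \sum_{x \in X_a} e(bx/q) e(\beta x) = \frac{q}{\varphi(q)} e(ab/q) \sum_{x \in X_a} e(\beta x). \]
Now, if $a_1, a_2 \le q$, then
\[ \Bigl| \sum_{x \in X_{a_1}} e(\beta x) - \sum_{x \in X_{a_2}} e(\beta x) \Bigr| \le 1 + \Bigl| \sum_{x \in X_{a_1}} e(\beta x) \Bigr| \, \bigl| 1 - e(\beta(a_1 - a_2)) \bigr|. \]
Since $|a_1 - a_2| \le q$, we know that $1 - e(\beta(a_1 - a_2)) = O\bigl( (\log n)^A / n \bigr)$, so this shows that, for every $a$,
\[ \sum_{x \in X_a} e(\beta x) = q^{-1} \sum_{x \le n} e(\beta x) + O\bigl( (\log n)^A \bigr). \]

(In words, the numbers $\sum_{x \in X_a} e(\beta x)$ are all approximately equal, and therefore all approximately equal to their average.) It follows that
\[ \sum_{x \le n} H_q(x) e(\alpha x) = \frac{q}{\varphi(q)} \sum_{(a,q)=1} e(ab/q) \Bigl( q^{-1} \sum_{x \le n} e(\beta x) + O\bigl( (\log n)^A \bigr) \Bigr). \]
Since $b$ is coprime to $q$, Lemma 4.17 gives $\sum_{(a,q)=1} e(ab/q) = \mu(q)$, and since $q \le (\log n)^A$ the total error is $O\bigl( (\log n)^{2A} \bigr)$. This proves the lemma. □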

Corollary 4.19. Let $\alpha$, $b$, $q$ and $\beta$ be as in Lemma 4.18. Then $f(\alpha)$ and $h_1(\alpha)$ are both equal to $(\mu(q)/\varphi(q)) \sum_{x \le n} e(\beta x) + O\bigl( n (\log n)^{-A} \bigr)$.

Proof. This follows easily from Theorem 4.12 and Lemmas 4.13, 4.15 and 4.18. Let $P(x)$ be the function $\log x$ if $x$ is prime and zero otherwise. Setting $G(x) = P(x) - H_q(x)$, Theorem 4.12 tells us that the conditions for Lemma 4.15 are satisfied. But this implies that $f(\alpha) = \sum_{x \le n} H_q(x) e(\alpha x) + O\bigl( n (\log n)^{-A} \bigr)$. Then Lemma 4.18 gives us our estimate for $f(\alpha)$. The same argument works for $h_1(\alpha)$ if we use Lemma 4.13 instead of Theorem 4.12. □

After that diversion, let us now finish the proof of the three-primes theorem. There are two steps to the proof. First, we show that every sufficiently large odd integer is the sum of three elements of Q (or fake primes) in many ways, using the Brun sieve once again. Then we deduce, from the fact that f and h1 are uniformly close, that the same is true of the genuine primes.

Lemma 4.20. Let $m$ be an integer. Then the number of ways of writing $m = x + y$ with $x$ and $y$ both in $Q$ is at least $m \prod_{i=1}^{k} (1 - r_i/p_i) + O\bigl( m^{-1} n^{1/2} + m n^{-1/4A} \bigr)$, where $r_i = 1$ if $p_i \mid m$ and $2$ otherwise.

Proof. Choose $x$ randomly and uniformly from the set $[m]$. For each $i$ let $X_i$ be the event that $p_i \mid x$ or $p_i \mid m - x$. As in the proof of Lemma 4.13, it is easy to show that $\mathrm{Prob}(X_i) = r_i/p_i + O(m^{-1})$. (The point about the $r_i$ is that the events $p_i \mid x$ and $p_i \mid m - x$ are the same if $p_i \mid m$ and mutually exclusive otherwise.) More generally, it is not hard to show that
\[ \mathrm{Prob}(X_{i_1} \cap \dots \cap X_{i_s}) = \prod_{j=1}^{s} \frac{r_{i_j}}{p_{i_j}} + O(m^{-1}). \]
Therefore, by the inclusion-exclusion formula,
\[ 1 - \mathrm{Prob}\Bigl( \bigcup_{i=1}^{k} X_i \Bigr) = \sum_{s=0}^{t} (-1)^s \sum_{1 \le i_1 < \dots < i_s \le k} \ \prod_{j=1}^{s} r_{i_j}/p_{i_j} + O\Bigl( m^{-1} \sum_{s=1}^{t} \binom{k}{s} \Bigr). \]
But
\[ \prod_{i=1}^{k} (1 - r_i/p_i) = \sum_{s=0}^{k} (-1)^s \sum_{1 \le i_1 < \dots < i_s \le k} \ \prod_{j=1}^{s} r_{i_j}/p_{i_j} \]
and
\[ \sum_{1 \le i_1 < \dots < i_s \le k} \ \prod_{j=1}^{s} r_{i_j}/p_{i_j} \le (s!)^{-1} \bigl( 2p_1^{-1} + \dots + 2p_k^{-1} \bigr)^s \le (8e \log\log\log n / s)^s. \]
As in the proof of Lemma 4.13, it follows that
\[ 1 - \mathrm{Prob}\Bigl( \bigcup_{i=1}^{k} X_i \Bigr) = \prod_{i=1}^{k} (1 - r_i/p_i) + O\bigl( m^{-1} (\log n)^{At} + (8e \log\log\log n / t)^t \bigr) \]
for any $t \ge 16e \log\log\log n$. Choosing $t$ to be $\log n / 2A \log\log n$ implies the lemma. □

Corollary 4.21. If $n$ is sufficiently large and odd, then the number of ways of writing $n$ as the sum of three elements of $Q$ is at least $(n^2/16) K^{-1} \prod_{i=2}^{k} (1 - 2p_i^{-1})$.

Proof. Note first that Lemma 4.13 implies that the number of elements of $Q$ less than or equal to $n/2$ is at least $K^{-1} n/4$ (when $n$ is sufficiently large). For every odd $z \le n/2$, the number of ways of writing $n - z$ as the sum of two elements of $Q$ is, by Lemma 4.20, at least $(n/4) \prod_{i=2}^{k} (1 - 2/p_i)$. The result follows. □

It is possible to be much more careful and work out the number of ways of writing $n$ as the sum of three elements of $Q$ to within a factor $1 + o(1)$, but we do not need this.

Vinogradov’s Three-Primes Theorem. Every sufficiently large odd integer is the sum of three primes.

Proof. Note first that $(16K)^{-1} \prod_{i=2}^{k} (1 - 2p_i^{-1})$ is easily shown to be at least $(\log n)^{-1}$ when $n$ is sufficiently large, so the number of ways of writing $n$ as the sum of three elements of $Q$ is at least $n^2/\log n$. On the other hand, it is also $\int h(\alpha)^3 e(-\alpha n) \, d\alpha$, so we certainly have $\int h_1(\alpha)^3 e(-\alpha n) \, d\alpha \ge n^2/\log n$.

As we commented at the beginning, it is sufficient for our purposes to establish that $\int f(\alpha)^3 e(-\alpha n) \, d\alpha \ne 0$. But, by Corollary 4.16,
\[
\begin{aligned}
\Bigl| \int f(\alpha)^3 e(-\alpha n) \, d\alpha - \int h_1(\alpha)^3 e(-\alpha n) \, d\alpha \Bigr|
&= O\bigl( n (\log n)^{-A/4} \bigr) \int \bigl| f(\alpha)^2 + f(\alpha) h_1(\alpha) + h_1(\alpha)^2 \bigr| \, d\alpha \\
&= O\bigl( n (\log n)^{-A/4} \bigr) \int |f(\alpha)|^2 + |h_1(\alpha)|^2 \, d\alpha \\
&= O\bigl( n (\log n)^{-A/4} \bigr) \Bigl( \sum_{p \le n} (\log p)^2 + K^2 |Q| \Bigr) \\
&= O\bigl( n^2 \log n \, (\log n)^{-A/4} \bigr).
\end{aligned}
\]

Since we chose $A$ to be 16, this and our estimate for the integral with $h_1$ are enough to prove the theorem. □
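The quantity that the proof shows to be non-zero, $\int f(\alpha)^3 e(-\alpha n)\,d\alpha$, is a weighted count of ordered triples of primes summing to $n$. For small $n$ the unweighted count can simply be enumerated. The following brute-force sketch is only an illustration; it says nothing about the "sufficiently large" threshold in the theorem, and the function names are hypothetical.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes; returns the list of primes <= n (for n >= 2)."""
    sieve = [True] * (n + 1)
    sieve[0] = sieve[1] = False
    for i in range(2, int(n ** 0.5) + 1):
        if sieve[i]:
            for j in range(i * i, n + 1, i):
                sieve[j] = False
    return [i for i, is_prime in enumerate(sieve) if is_prime]


def three_prime_representations(n):
    """Number of ordered triples of primes (p1, p2, p3) with p1 + p2 + p3 = n."""
    ps = primes_up_to(n)
    prime_set = set(ps)
    return sum(1 for p1 in ps for p2 in ps if (n - p1 - p2) in prime_set)


# Counts for the odd integers 7, 9, ..., 99.
counts = {n: three_prime_representations(n) for n in range(7, 100, 2)}
```

For example, $7 = 2 + 2 + 3$ has three orderings, so `counts[7]` is 3.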

§5 The Geometry of Numbers

A lattice in $\mathbb{R}^n$ is a subgroup generated by $n$ linearly independent vectors. A basis for a lattice $\Lambda$ is a linearly independent set in $\Lambda$ that generates $\Lambda$.

Lemma 5.1. Let $\Lambda$ be a lattice and let $x_1, x_2, \dots, x_n$ and $y_1, y_2, \dots, y_n$ be distinct bases of $\Lambda$. Let $\alpha : x_i \mapsto y_i$ be linear. Then $\det(\alpha) = \pm 1$.

Proof. Each $y_i$ is an integer combination of the $x_i$, and vice versa, so both $\alpha$ and $\alpha^{-1}$ are non-singular and have integer determinants. Since these determinants multiply to 1, each is $\pm 1$. □

Let $\Lambda$ be a lattice with basis $x_1, x_2, \dots, x_n$. The fundamental parallelepiped of $\Lambda$ with respect to $x_1, x_2, \dots, x_n$ is the set $P = \{ a_1 x_1 + a_2 x_2 + \dots + a_n x_n : 0 \le a_i < 1 \}$. Note that the sets $x + P$, $x \in \Lambda$, are disjoint and their union is $\mathbb{R}^n$. The determinant $\det(\Lambda)$ of $\Lambda$ is the volume of $P$, which is well-defined by Lemma 5.1. Alternatively, $\det(\Lambda)$ is $|\det(A)|$, where $A = (x_1, x_2, \dots, x_n)$ and the $x_i$ are column vectors with respect to the canonical basis for $\mathbb{R}^n$. A convex body is a bounded convex open subset of $\mathbb{R}^n$.

Lemma 5.2. Let $\Lambda$ be a lattice and suppose that $K$ is a convex body in $\mathbb{R}^n$. Then $\mathrm{vol}(K) = \lim_{t \to \infty} |\Lambda \cap tK| \det(\Lambda) / t^n$.

Proof. Let $Q$ be a translate of a fundamental parallelepiped $P$ of $\Lambda$. Then $tQ$ contains exactly $t^n$ points in $\Lambda$ if $t$ is an integer. In general, $|\Lambda \cap tQ|$ lies between $\lfloor t \rfloor^n$ and $\lceil t \rceil^n$. Therefore the result is true for all sets of the form $z + \rho P$ with $z \in \mathbb{R}^n$ and $\rho > 0$. As $K$ is convex, it can be approximated by finite unions of such sets. □

A sublattice of a lattice $\Lambda$ in $\mathbb{R}^n$ is a subgroup $M \subset \Lambda$ which is also a lattice.

Lemma 5.3. Let $\Lambda$ be a lattice and let $M$ be a sublattice of $\Lambda$. Then the index of $M$ as a subgroup of $\Lambda$ is $\det(M)/\det(\Lambda)$.

Proof. Let $P$ be a fundamental parallelepiped for $M$. Then every vector $x \in \mathbb{R}^n$ can be written uniquely as $y + z$, where $y \in M$ and $z \in P$. Therefore every $x \in \Lambda$ can be written uniquely as $y + z$ with $y \in M$ and $z \in P \cap \Lambda$. So the index of $M$ is $|P \cap \Lambda|$. If $t$ is an integer, $|tP \cap \Lambda| = t^n |P \cap \Lambda|$. Thus $\mathrm{vol}(P) = |P \cap \Lambda| \det(\Lambda)$, by Lemma 5.2, and it follows that $\det(M) = |P \cap \Lambda| \det(\Lambda)$. □
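The key step in the proof of Lemma 5.3, that the index equals the number of points of $\Lambda$ in a fundamental parallelepiped of $M$, can be checked directly in the plane. The sketch below assumes $\Lambda = \mathbb{Z}^2$ and a sublattice $M$ spanned by two integer vectors; the function name is hypothetical.

```python
def index_by_counting(u, v):
    """For the sublattice M of Z^2 spanned by integer vectors u and v, return
    (number of points of Z^2 in the half-open parallelepiped {s*u + t*v :
    0 <= s, t < 1}, |det(u, v)|). Lemma 5.3 says the two numbers agree."""
    det = u[0] * v[1] - u[1] * v[0]
    if det == 0:
        raise ValueError("u and v must be linearly independent")
    sign = 1 if det > 0 else -1
    xs = [0, u[0], v[0], u[0] + v[0]]
    ys = [0, u[1], v[1], u[1] + v[1]]
    count = 0
    for x in range(min(xs), max(xs) + 1):
        for y in range(min(ys), max(ys) + 1):
            # Cramer's rule: (x, y) = s*u + t*v with s = s_num/|det|, t = t_num/|det|.
            s_num = sign * (x * v[1] - y * v[0])
            t_num = sign * (u[0] * y - u[1] * x)
            if 0 <= s_num < abs(det) and 0 <= t_num < abs(det):
                count += 1
    return count, abs(det)


points, determinant = index_by_counting((3, 1), (1, 2))
```

Working with the integer numerators `s_num` and `t_num` avoids any floating-point boundary issues.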

Blichfeldt’s Lemma. Let $K \subset \mathbb{R}^n$ be a measurable set, $\Lambda$ a lattice and suppose $\mathrm{vol}(K) > \det(\Lambda)$. Then $K - K$ contains a non-zero lattice point.

Proof. Let $x_1, x_2, \dots, x_n$ be a basis for $\Lambda$ and let $Q = \{ a_1 x_1 + \dots + a_n x_n : -1 \le a_i < 1 \}$. Then $\mathrm{vol}(Q) = 2^n \det(\Lambda)$ and $Q$ contains $2^n$ points of $\Lambda$. Provided $M$ is sufficiently large, $\mathrm{vol}(K \cap MQ)$ is still greater than $\det(\Lambda)$, so we may assume $K \subset MQ$. Let $N$ be an integer, chosen such that $(1 + M/N)^n < \mathrm{vol}(K)/\det(\Lambda)$. If the lemma were false, then the sets $x + K$, $x \in \Lambda \cap NQ$, would be disjoint. The union of these sets is contained in $(M + N)Q$ and has volume $(2N)^n \mathrm{vol}(K)$, since there are $(2N)^n$ lattice points in $NQ$. Therefore $(2N)^n \mathrm{vol}(K) \le (2(N + M))^n \det(\Lambda)$. By the choice of $N$, this is a contradiction. □

Minkowski’s First Theorem. Let $\Lambda$ be a lattice and let $K$ be a centrally symmetric convex body with $\mathrm{vol}(K) > 2^n \det(\Lambda)$. Then $K$ contains a non-zero point of $\Lambda$.

Proof. As $K$ is convex and centrally symmetric, $K = \frac{1}{2} K - \frac{1}{2} K$. However, $\mathrm{vol}(\frac{1}{2} K) > \det(\Lambda)$, so the result follows by Blichfeldt’s Lemma. □

Let $\Lambda$ be a lattice and let $K$ be a centrally symmetric convex body. Define $0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ by $\lambda_k = \inf\{ \lambda : \lambda K \text{ contains } k \text{ linearly independent vectors in } \Lambda \}$. The numbers $\lambda_1, \lambda_2, \dots, \lambda_n$ are called the successive minima of $K$ with respect to $\Lambda$. Note that we can find vectors $b_1, b_2, \dots, b_n \in \mathbb{R}^n$ such that $b_k \in \lambda_k K \cap \Lambda$ for each $k \le n$. These $b_i$ actually form a basis for $\Lambda$.

Minkowski’s Second Theorem. Let $\Lambda$ be a lattice and let $K$ be a centrally symmetric convex body. Suppose $0 < \lambda_1 \le \lambda_2 \le \dots \le \lambda_n$ are the successive minima of $K$ with respect to $\Lambda$. Then $\lambda_1 \lambda_2 \cdots \lambda_n \mathrm{vol}(K) \le 2^n \det(\Lambda)$.

Proof. Let $b_1, b_2, \dots, b_n$ be a basis as defined above. Set $V_1 = \{0\}$ and, for each $i$, set $V_i = \langle b_1, b_2, \dots, b_{i-1} \rangle$ and $W_i = \langle b_i, b_{i+1}, \dots, b_n \rangle$. Define a map $c_i : K \to K$ by setting $c_i(x)$ equal to the centre of gravity of $(x + V_i) \cap K$. We note that $c_i$ is continuous ($K$ is open) and $c_i(x)$ does not depend on the first $i - 1$ co-ordinates of $x$. Also $c_i(x) - x \in V_i$, and so if $c_i(x)_j$ is the $j$th co-ordinate of $c_i(x)$ with respect to $b_1, b_2, \dots, b_n$, then $c_i(x)_j = x_j$ for $j \ge i$. Now define $\varphi(x) = \sum_{i=1}^{n} (\lambda_i - \lambda_{i-1}) c_i(x)$, with $\lambda_0 = 0$. Then, expanding $\varphi(x)$,
\[
\begin{aligned}
\varphi(x) &= \sum_{i=1}^{n} (\lambda_i - \lambda_{i-1}) \sum_{j=1}^{n} c_i(x)_j b_j \\
&= \sum_{j=1}^{n} b_j \Bigl[ \sum_{i=1}^{n} c_i(x)_j (\lambda_i - \lambda_{i-1}) \Bigr] \\
&= \sum_{j=1}^{n} b_j \Bigl[ \sum_{i=1}^{j} x_j (\lambda_i - \lambda_{i-1}) + \sum_{i=j+1}^{n} c_i(x)_j (\lambda_i - \lambda_{i-1}) \Bigr] \\
&= \sum_{j=1}^{n} b_j \bigl( \lambda_j x_j + \varphi_j(x_{j+1}, \dots, x_n) \bigr),
\end{aligned}
\]
where $\varphi_j$ is some continuous function. The next claim is that $\mathrm{vol}\,\varphi(K) = \lambda_1 \lambda_2 \cdots \lambda_n \mathrm{vol}(K)$. First note that $\varphi(x)_n = \lambda_n x_n$. For fixed $t$, let $K(t)$ denote the cross section $\{ x \in K : x_n = t \}$ of $K$. Then $\varphi$ restricted to $K(t)$ can be represented by a formula
\[ \varphi\Bigl( \sum_{i=1}^{n-1} x_i b_i + t b_n \Bigr) = \lambda_n t b_n + \sum_{j=1}^{n-1} b_j \bigl( \lambda_j x_j + \psi_j(x_{j+1}, \dots, x_{n-1}) \bigr) \]
as $t$ is fixed, so by induction $\mathrm{vol}(\varphi(K(t))) = \mathrm{vol}\{ y \in \varphi(K) : y_n = \lambda_n t \} = \lambda_1 \lambda_2 \cdots \lambda_{n-1} \mathrm{vol}(K(t))$. Applying Fubini’s theorem, the theorem is proved. □

For further reading on Minkowski’s Theorems, see [15].

Let $r_1, r_2, \dots, r_k \in \mathbb{Z}_N$ and $\delta > 0$. Then the Bohr neighbourhood $B(r_1, r_2, \dots, r_k; \delta)$ is the set $\{ s \in \mathbb{Z}_N : |r_i s| \le \delta N, \ i = 1, 2, \dots, k \}$, where $|r_i s|$ is the distance from $r_i s$ to the nearest multiple of $N$. A $d$-dimensional arithmetic progression is a subset of $\mathbb{Z}$ or $\mathbb{Z}_N$ of the form $\{ x_0 + \sum_{i=1}^{d} a_i x_i : 0 \le a_i < s_i \}$. It is proper if the numbers $\sum_{i=1}^{d} a_i x_i$ are all distinct. It will be seen, in §6, that Bohr neighbourhoods can be used as a step in finding arithmetic progressions, using Fourier transforms.
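To make the definition concrete, a Bohr neighbourhood in $\mathbb{Z}_N$ can be listed by brute force. The sketch below uses illustrative values of $N$, $r_i$ and $\delta$ (assumptions, not taken from the notes), and the function name is hypothetical.

```python
def bohr_neighbourhood(rs, delta, N):
    """All s in Z_N with |r*s| <= delta*N for every r in rs, where |x| is the
    distance from x to the nearest multiple of N."""
    def dist(x):
        x %= N
        return min(x, N - x)

    return [s for s in range(N) if all(dist(r * s) <= delta * N for r in rs)]


# Illustrative parameters.
N, rs, delta = 101, [3, 17], 0.2
B = bohr_neighbourhood(rs, delta, N)
k = len(rs)
```

Two features are visible immediately: $0 \in B$, and $B$ is symmetric ($s \in B$ if and only if $-s \in B$). The size of `B` can also be compared with the quantity $(\delta/k)^k N$, which is the cardinality of the progression that Theorem 5.7 finds inside a Bohr neighbourhood.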

Theorem 5.7. Let $r_1, r_2, \dots, r_k \in \mathbb{Z}_N$ and $0 < \delta < 1/2$. Then the Bohr neighbourhood $B(r_1, \dots, r_k; \delta)$ contains a proper $k$-dimensional arithmetic progression of cardinality at least $(\delta/k)^k N$.

Proof. We have $s \in B(r_1, r_2, \dots, r_k; \delta)$ if and only if $(r_1 s, r_2 s, \dots, r_k s)$ lies within $\ell_\infty^k$-distance $\delta N$ of a point in $N\mathbb{Z}^k$. Or, equivalently, $(r_1 s, r_2 s, \dots, r_k s) + N\mathbb{Z}^k$ contains a point $x$ with $\|x\|_\infty \le \delta N$. Let $\Lambda$ be the lattice generated by $N\mathbb{Z}^k$ and $(r_1, r_2, \dots, r_k)$. The index of $N\mathbb{Z}^k$ in $\mathbb{Z}^k$ is clearly $N^k$, and it has index $N$ in $\Lambda$. Therefore $\Lambda$ has index $N^{k-1}$ in $\mathbb{Z}^k$, implying $\det(\Lambda) = N^{k-1}$ by Lemma 5.3. Let $K = \{ (a_1, a_2, \dots, a_k) : -1 < a_i < 1 \}$; we apply Minkowski’s Second Theorem to $K$ and $\Lambda$, to obtain a basis $b_1, b_2, \dots, b_k$ of $\mathbb{R}^k$ with $b_i \in \Lambda$ and $\|b_i\|_\infty = \lambda_i$, where $\lambda_1 \lambda_2 \cdots \lambda_k \mathrm{vol}(K) \le \det(\Lambda) \cdot 2^k$. Therefore $\lambda_1 \lambda_2 \cdots \lambda_k \le N^{k-1}$. Now notice that if $a_1, a_2, \dots, a_k$ are integers with $|a_i| \le \delta N / k\lambda_i$, then $\| \sum_i a_i b_i \|_\infty \le \sum_i (\delta N / \lambda_i k) \cdot \lambda_i = \delta N$. However $b_i$ is a vector of the form $(r_1 s_i, r_2 s_i, \dots, r_k s_i)$ modulo $N$. So $| \sum_i a_i s_i r_j | \le \delta N$, $j = 1, 2, \dots, k$. Let $P$ be the $k$-dimensional arithmetic progression $\{ \sum_{i=1}^{k} a_i s_i : |a_i| \le \delta N / k\lambda_i \}$. As the vectors $b_1, b_2, \dots, b_k$ are independent and $\| \sum_i a_i b_i \|_\infty \le \delta N$, $P$ is proper: no two terms are equal (mod $N$). The number of integers $a_i$ in the interval $[-\delta N / k\lambda_i, \delta N / k\lambda_i]$ is at least $\delta N / k\lambda_i$, therefore $|P| \ge \prod_i \delta N / k\lambda_i = (\delta/k)^k N^k (\lambda_1 \lambda_2 \cdots \lambda_k)^{-1} \ge (\delta/k)^k N$. □

A layered graph is a graph $G$ with vertex set comprising a disjoint union $V_0 \cup V_1 \cup \dots \cup V_n$ of sets such that each edge lies between $V_i$ and $V_{i+1}$ for some $i \in [0, n-1]$. It will often be convenient to have an implicit orientation of the edges, from $V_i$ to $V_{i+1}$ for each $i$. A layered graph $G$ is called a Plünnecke Graph if it satisfies the following two conditions:

(1) For $u \in V_{i-1}$, $v \in V_i$ and distinct $w_1, w_2, \dots, w_k \in V_{i+1}$ with $uv, vw_j \in E(G)$, $j = 1, 2, \dots, k$, there exist distinct $v_1, v_2, \dots, v_k \in V_i$ such that $uv_j$ and $v_j w_j \in E(G)$.
(2) For distinct $u_1, u_2, \dots, u_k \in V_{i-1}$, $v \in V_i$ and $w \in V_{i+1}$ with $u_j v, vw \in E(G)$, there exist distinct $v_1, v_2, \dots, v_k \in V_i$ such that $u_j v_j, v_j w \in E(G)$.

[Figure: illustration of the Plünnecke conditions, with vertices $u$, $v$, $w_1, \dots, w_k$ and $v_1, \dots, v_k$.]

Given two layered graphs $G$, $H$ with vertex sets $V_0 \cup V_1 \cup \dots \cup V_n$ and $W_0 \cup W_1 \cup \dots \cup W_n$, the product graph $G \times H$ has vertex set $(V_0 \times W_0) \cup (V_1 \times W_1) \cup \dots \cup (V_n \times W_n)$, with $(v, w)$ joined to $(v', w')$ if and only if $vv' \in E(G)$ and $ww' \in E(H)$. It is easily seen that $G \times H$ is a Plünnecke Graph if $G$ and $H$ are. The $i$th magnification ratio $D_i(G)$ of a layered graph $G$ with vertex set $V_0 \cup V_1 \cup \dots \cup V_n$ is defined to be
\[ \min\Bigl\{ \frac{|\mathrm{Im}_i(Z)|}{|Z|} : Z \subset V_0, \ Z \ne \emptyset \Bigr\}, \]
where $\mathrm{Im}_i(Z) = \{ y \in V_i : \text{there is a directed path from some } z \in Z \text{ to } y \}$.

Lemma 5.8. Let $G$, $H$ be layered graphs. Then $D_i(G \times H) = D_i(G) D_i(H)$.

Proof. Suppose $G$, $H$ have vertex sets $V_0 \cup V_1 \cup \dots \cup V_n$ and $W_0 \cup W_1 \cup \dots \cup W_n$ respectively. Let $Y \subset V_0$, $Z \subset W_0$ satisfy $|\mathrm{Im}_i(Y)|/|Y| = D_i(G)$ and $|\mathrm{Im}_i(Z)|/|Z| = D_i(H)$. Then $(v, w) \in \mathrm{Im}_i(Y \times Z)$ if and only if there are paths from some $y \in Y$ to $v$ and some $z \in Z$ to $w$; equivalently, $(v, w) \in \mathrm{Im}_i(Y) \times \mathrm{Im}_i(Z)$. Therefore $D_i(G \times H) \le D_i(G) D_i(H)$.

Conversely, if $F$ is a layered graph with vertex set $P \cup Q \cup R$ and $D(P, Q)$, $D(Q, R)$ and $D(P, R)$ are the magnification ratios in the layered subgraphs between $P, Q$, between $Q, R$ and between $P, R$ respectively, then $D(P, R) \ge D(P, Q) D(Q, R)$. Define a layered graph $F$ and vertex sets $P$, $Q$ and $R$ as follows: let $P = V_0 \times W_0$, $Q = V_0 \times W_i$ and $R = V_i \times W_i$. Join $(v, w) \in V_0 \times W_0$ to $(v', w') \in V_0 \times W_i$ in $F$ if $v = v'$ and $w' \in \mathrm{Im}_i(\{w\})$. Similarly, join $(v, w) \in V_0 \times W_i$ to $(v', w') \in V_i \times W_i$ if $w = w'$ and $v' \in \mathrm{Im}_i(\{v\})$. Then there exists a path in $F$ from $(v, w) \in P$ to $(v', w') \in R$ if and only if there exists a path from $(v, w)$ to $(v', w')$ in $G \times H$. Hence $D(P, R) = D_i(G \times H)$. We next show that $D(P, Q) = D_i(H)$. If $C \subset W_0$ with $|\mathrm{Im}_i(C)|/|C| = D_i(H)$, let $v \in V_0$ and $C' = \{v\} \times C$. This shows $D(P, Q) \le D_i(H)$. Conversely, let $C \subset P = V_0 \times W_0$. For each $x \in V_0$, let $C_x = \{ y : (x, y) \in C \}$. Then $|\mathrm{Im}_Q(C_x)|/|C_x| \ge D_i(H)$, hence $D(P, Q) = D_i(H)$; note that for distinct $x$ the sets $\{x\} \times C_x$, and likewise their images in $Q$, are disjoint, and $C$ is arbitrary. Similarly $D(Q, R) = D_i(G)$. Therefore $D_i(G) D_i(H) \le D_i(G \times H)$. □

Menger’s Theorem. Let $G$ be a graph and let $a, b \in V(G)$. Then the maximum number of internally disjoint $a$-$b$ paths equals the size of a smallest set of vertices separating $a$ from $b$.

Lemma 5.10. Let $G$ be a Plünnecke Graph on $V_0 \cup V_1 \cup \dots \cup V_n$, such that $D_n(G) \ge 1$. Then there are $|V_0|$ disjoint paths from $V_0$ to $V_n$ and, in particular, $D_i \ge 1$ for all $i \le n$.

Proof. Add a vertex $a$ joined to all of $V_0$ and a vertex $b$ joined to all of $V_n$. Let $m$ be the maximum number of disjoint $a$-$b$ paths. There exists a set $S = \{s_1, s_2, \dots, s_m\}$ of size $m$ separating $a$ from $b$, by Menger’s Theorem. Set $S_i = S \cap V_i$. Choose $S$ such that $M = \sum_{s \in S} i(s)$ is a minimum, where $i(s)$ denotes the index of the layer containing $s$. We claim that $S \subset V_0 \cup V_n$. Suppose this is false; there exists $i$ with $1 \le i < n$ such that $S \cap V_i = \{s_1, s_2, \dots, s_q\} \ne \emptyset$. Let $P_1, P_2, \dots, P_m$ be disjoint paths from $V_0$ to $V_n$. Each $P_i$ contains exactly one $s_i$, by the minimality of $m$. Let $s_i^-$ and $s_i^+$ denote the predecessor and successor of $s_i$ on the path containing $s_i$, oriented from $V_0$ to $V_n$, $1 \le i \le q$. By the minimality of $M$, we cannot replace any elements of $S$ with predecessors on the paths. So we find a path $P$ from $V_0$ to $V_n$ that misses $\{s_1^-, s_2^-, \dots, s_q^-, s_{q+1}, \dots, s_m\}$. This path must intersect $S$, as $S$ is a separating set. Let $\{r\} = P \cap V_{i-1}$. Then the next vertex of $P$ must be $s_i$ for some $i$ with $1 \le i \le q$.

We claim that every path from $\{s_1^-, s_2^-, \dots, s_q^-, r\}$ to $\{s_1^+, s_2^+, \dots, s_q^+\}$ passes through the vertices $s_1, s_2, \dots, s_q$. Suppose that this claim is false. If there exists a path $Q$ from $s_i^-$ to $s_j^+$ missing $s_1, s_2, \dots, s_q$, then the path comprising the segment of $P_i$ to $s_i^-$, the segment of $Q$ to $s_j^+$ and the segment of $P_j$ onwards misses $S$. This contradicts the fact that $S$ is a separating set. Therefore the graph induced by $\{s_1^-, \dots, s_q^-, r\}$, $\{s_1, \dots, s_q\}$ and $\{s_1^+, \dots, s_q^+\}$ is a Plünnecke Graph. In this subgraph, let $d^+(x)$ and $d^-(x)$ be the out- and in-degrees of $x$. Since $s_i^-$ is joined to $s_i$, $d^+(s_i^-) \ge d^+(s_i)$. Similarly, $d^-(s_i^+) \ge d^-(s_i)$. Also $\sum_{i=1}^{q} d^+(s_i) = \sum_{i=1}^{q} d^-(s_i^+)$ by counting edges, and $d^+(r) + \sum_i d^+(s_i^-) = \sum_i d^-(s_i)$. Since $d^+(r) > 0$, we have a contradiction. Therefore $S \subset V_0 \cup V_n$ and, by minimality, $S = (V_0 \cap S) \cup \mathrm{Im}_n(V_0 \setminus S)$ and $|S| = |V_0 \cap S| + |\mathrm{Im}_n(V_0 \setminus S)| \ge |V_0 \cap S| + |V_0 \setminus S| = |V_0|$. □

Plünnecke’s Theorem. Let $G$ be a Plünnecke Graph on $V_0 \cup V_1 \cup \dots \cup V_n$. Then $D_1 \ge D_2^{1/2} \ge \dots \ge D_n^{1/n}$.

Proof. It is enough to show $D_i^{1/i} \ge D_n^{1/n}$ for $i < n$. When $D_n = 1$, this holds. Suppose $D_n < 1$. Choose a positive integer $r$; then $D_n(G^r) = D_n^r$, by Lemma 5.8. Given an integer $m$, we can find a set $\{b_1, b_2, \dots, b_m\} \subset \mathbb{Z}$ such that all sums $b_{j_1} + b_{j_2} + \dots + b_{j_i}$, with $i \le n$ and $j_1 \le \dots \le j_i$, are distinct. The number of these sums, given $i$, is $\binom{m+i-1}{i}$, which is between $m^i/i!$ and $m^i$. Let $B = \{b_1, b_2, \dots, b_m\}$, $A = \{0\}$ and let $H_m$ be the natural layered graph with layers $A, A + B, \dots, A + nB$. Let $m$ be minimal such that $m^n D_n^r / n! \ge 1$. Then $m = \lceil (n! D_n^{-r})^{1/n} \rceil \le (n! D_n^{-r})^{1/n} + 1$. By the choice of $m$, $D_n(G^r \times H_m) \ge 1$, so by Lemma 5.10, $D_i(G^r \times H_m) \ge 1$, and $D_i(G)^r \cdot D_i(H_m) \ge 1$. However, $D_i(H_m) \le m^i$, so
\[ D_i = D_i(G) \ge m^{-i/r} \ge \bigl[ (n! D_n^{-r})^{1/n} + 1 \bigr]^{-i/r} \to D_n^{i/n} \]
as $r \to \infty$. This completes the proof when $D_n < 1$. If $D_n > 1$, consider the reverse $I_m$ of $H_m$, with vertex sets $A + nB, A + (n-1)B, \dots, A + B, A$. Then $D_n(I_m) \ge \binom{m+n-1}{n}^{-1}$ and
\[ D_i(I_m) \le \binom{m+n-i-1}{n-i} \cdot \binom{m+n-1}{n}^{-1} \le n! \, m^{n-i}/m^n = n! \, m^{-i}. \]
Let $r$ be a positive integer and $m$ maximal such that $D_n^r m^{-n} \ge 1$. Then $m = \lfloor D_n^{r/n} \rfloor \ge D_n^{r/n} - 1$. Then $D_n(G^r \times I_m) \ge 1$, so $D_i(G^r \times I_m) \ge 1$. However $D_i(G^r \times I_m) \le D_i^r \, n! \, m^{-i}$, so $D_i^r \ge m^i \, n!^{-1}$, implying $D_i \ge [ (D_n^{r/n} - 1)^i \, n!^{-1} ]^{1/r} \to D_n^{i/n}$, as required. □

Corollary 5.12. Let $A$ and $B$ be non-empty subsets of $\mathbb{Z}_N$ such that $|A + iB| \le C|A|$. For $h \ge i$, there is $\emptyset \ne A' \subset A$ such that $|A' + hB| \le C^{h/i} |A'|$.

Proof. Let $G$ be the natural Plünnecke Graph. If the result were false, then $D_h(G) > C^{h/i}$, so $D_i(G) > C$, which implies that $|A + iB| > C|A|$, a contradiction. □
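Plünnecke's inequalities can be tested on the natural layered graph with layers $A, A + B, \dots, A + nB$, where $\mathrm{Im}_i(Z) = Z + iB$. The following sketch computes the magnification ratios by brute force over all non-empty $Z \subset A$; the sets $A$ and $B$ are arbitrary illustrative choices and the function names are hypothetical.

```python
from itertools import combinations


def sumset(X, Y):
    """X + Y = {x + y : x in X, y in Y}."""
    return {x + y for x in X for y in Y}


def iterated_sumset(Z, B, i):
    """Z + iB, i.e. Z plus i summands from B."""
    result = set(Z)
    for _ in range(i):
        result = sumset(result, B)
    return result


def magnification_ratio(A, B, i):
    """D_i = min |Z + iB| / |Z| over non-empty subsets Z of A (exponential in |A|)."""
    elements = sorted(A)
    best = None
    for size in range(1, len(elements) + 1):
        for Z in combinations(elements, size):
            ratio = len(iterated_sumset(Z, B, i)) / size
            if best is None or ratio < best:
                best = ratio
    return best


# Illustrative sets (assumptions, not from the notes).
A = {0, 1, 3, 7, 8}
B = {0, 1, 5}
D1, D2, D3 = (magnification_ratio(A, B, i) for i in (1, 2, 3))
```

The theorem predicts $D_1 \ge D_2^{1/2} \ge D_3^{1/3}$ for these values.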

Corollary 5.13. If $A$ is a non-empty subset of $\mathbb{Z}$ and $|A + A| \le C|A|$, then $|kA| \le C^k |A|$ for each $k \ge 3$.

Proof. Take $i = 1$ and $B = A$ in the preceding Corollary. This implies that there exists a non-empty $A' \subset A$ such that $|A' + kA| \le C^k |A'| \le C^k |A|$, but $|A' + kA| \ge |kA|$, so the result is proved. □

Lemma 5.14. Let $U, V, W \subset \mathbb{Z}$. Then $|U| |V - W| \le |U + V| |U + W|$.

Proof. Define, for $x \in V - W$, $\varphi(u, x) = (u + v(x), u + w(x))$, where $v(x) \in V$, $w(x) \in W$ satisfy $v(x) - w(x) = x$. Then $\varphi$ is an injection $U \times (V - W) \to (U + V) \times (U + W)$. □
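The inequality of Lemma 5.14 is easy to test on random finite sets of integers. This is a minimal sketch with hypothetical helper names; the seed and ranges are arbitrary.

```python
import random


def sumset(X, Y):
    """X + Y = {x + y : x in X, y in Y}."""
    return {x + y for x in X for y in Y}


def difference_set(X, Y):
    """X - Y = {x - y : x in X, y in Y}."""
    return {x - y for x in X for y in Y}


def lemma_5_14_holds(U, V, W):
    """Check |U| |V - W| <= |U + V| |U + W|."""
    return len(U) * len(difference_set(V, W)) <= len(sumset(U, V)) * len(sumset(U, W))


rng = random.Random(0)  # fixed seed so the run is reproducible
trials = []
for _ in range(200):
    U, V, W = (set(rng.sample(range(40), rng.randint(1, 10))) for _ in range(3))
    trials.append(lemma_5_14_holds(U, V, W))
```

Every entry of `trials` should be `True`, since the map $\varphi$ in the proof is injective for any choice of representatives $v(x)$, $w(x)$.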

Theorem 5.15. Let $A, B \subset \mathbb{Z}$ be such that $|A + B| \le C|A|$ and let $k$ and $l$ be natural numbers with $l \ge k$. Then $|kB - lB| \le C^{k+l} |A|$.

Proof. Suppose $l \ge k \ge 1$. By Corollary 5.12, there exists $A' \subset A$ with $|A' + kB| \le C^k |A'|$. Again there exists $A'' \subset A'$ with $|A'' + lB| \le C^l |A''|$. Using Lemma 5.14, $|A''| |kB - lB| \le |A'' + kB| |A'' + lB| \le C^{k+l} |A'| |A''|$, and the result follows on dividing by $|A''|$. □

In the next chapter, we will see the use of Theorem 5.15. In essence, the arithmetic properties of $kA$ for large $k$ are easier to deal with than when $k$ is small. Theorem 5.15 also allows one to deal with distinct set sums $A + B$ by converting the problem to a single set difference problem $kB - lB$.

§6 Freiman’s Theorem

Freiman’s Theorem [5] describes the structure of a set $A$ under the condition that $A + A$ has size close to that of $A$. We define a generalised arithmetic progression to be a sum $P$ of ordinary arithmetic progressions (see Theorem 5.7). If $P$ is a subset of a small generalised arithmetic progression then $|P + P|$ is close to $|P|$. Freiman’s Theorem states the converse: if $|P + P|$ is close to $|P|$ then $P$ must be contained in a small generalised arithmetic progression.

We now proceed to the proof of Freiman’s Theorem, using a remarkable and ingenious approach due to Ruzsa [12].

Let $A \subset \mathbb{Z}_s$ or $A \subset \mathbb{Z}$ and $B \subset \mathbb{Z}_t$. Then $\varphi : A \to B$ is called a (Freiman) $k$-homomorphism if whenever $x_1 + x_2 + \dots + x_k = y_1 + y_2 + \dots + y_k$, with $x_i, y_i \in A$, we have $\sum \varphi(x_i) = \sum \varphi(y_i)$. In addition, $\varphi$ is called a $k$-isomorphism if $\varphi$ is invertible and $\varphi$ and $\varphi^{-1}$ are $k$-homomorphisms.

Note that $\varphi$ is a $k$-homomorphism if the map $\psi : (x_1, \dots, x_k) \mapsto \sum \varphi(x_i)$ induced by $\varphi$ is a well defined map $kA \to kB$, and a $k$-isomorphism if $\psi$ is a bijection. Our interest will be in 2-isomorphisms, as these preserve arithmetic progressions: a set 2-isomorphic to an arithmetic progression is clearly an arithmetic progression. We use the following notation: if $\varphi : A \to B$ and $A' \subset A$, then $\varphi|_{A'}$ denotes the restriction of $\varphi$ to $A'$.
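The definition of a $k$-homomorphism can be checked by brute force on small sets. In the sketch below the map is given as a dictionary, and the optional arguments `source_mod` and `target_mod` (hypothetical names) say whether sums on either side are taken in $\mathbb{Z}$ or modulo an integer.

```python
from itertools import combinations_with_replacement


def is_freiman_homomorphism(phi, A, k, source_mod=None, target_mod=None):
    """True if equal k-fold sums of elements of A always have equal k-fold
    sums of images under phi (a dict). Sums are reduced modulo source_mod /
    target_mod when these are given, and taken in Z otherwise."""
    image_of_sum = {}
    for xs in combinations_with_replacement(sorted(A), k):
        source = sum(xs)
        image = sum(phi[x] for x in xs)
        if source_mod is not None:
            source %= source_mod
        if target_mod is not None:
            image %= target_mod
        # Record the image sum the first time a source sum is seen.
        if image_of_sum.setdefault(source, image) != image:
            return False
    return True


A = range(5)
reduce_mod_5 = {x: x % 5 for x in A}   # quotient map Z -> Z_5 restricted to A
lift_to_Z = {x: x for x in A}          # its inverse, from Z_5 back to Z

forward = is_freiman_homomorphism(reduce_mod_5, A, 2, target_mod=5)
backward = is_freiman_homomorphism(lift_to_Z, A, 2, source_mod=5)
```

Here `forward` is `True` but `backward` is `False`: in $\mathbb{Z}_5$ we have $1 + 4 = 0 + 0$, while in $\mathbb{Z}$ the sums are $5$ and $0$. So reduction mod 5 on $\{0, \dots, 4\}$ is a 2-homomorphism but not a 2-isomorphism, which is why Lemma 6.1 restricts to short intervals $I_j$.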

Lemma 6.1. Let $A \subset \mathbb{Z}$ and suppose $|kA - kA| \le C|A|$. Then, for any prime $N > C|A|$, there exists $A' \subset A$ with $|A'| \ge |A|/k$ that is $k$-isomorphic to a subset of $\mathbb{Z}_N$.

Proof. We may suppose $A \subset \mathbb{N}$ and select a prime $p > k \max A$. Then the quotient map $\varphi_1 : \mathbb{Z} \to \mathbb{Z}_p$ is a homomorphism of all orders, and $\varphi_1|_A$ is a $k$-isomorphism. Now let $q$ be a random element of $[p - 1]$ and define $\varphi_2 : \mathbb{Z}_p \to \mathbb{Z}_p$ by $\varphi_2(x) = qx$. Then $\varphi_2$ is an isomorphism of all orders, and hence a $k$-isomorphism. Let $\varphi_3(x) = x$ where $\varphi_3 : \mathbb{Z}_p \to \mathbb{Z}$. Then for any $j$, $\varphi_3|_{I_j}$ is a $k$-isomorphism, where
\[ I_j = \Bigl\{ x \in \mathbb{Z}_p : \frac{j-1}{k} p \le x < \frac{j}{k} p - 1 \Bigr\}. \]
For, if $\sum_{i=1}^{k} x_i = \sum_{i=1}^{k} y_i \pmod p$ with $x_i, y_i \in I_j$, then $\sum_{i=1}^{k} x_i = \sum_{i=1}^{k} y_i$ in $\mathbb{Z}$. By the pigeonhole principle, there exists $A' \subset A$ with $|A'| \ge |A|/k$ (depending on $q$) and $\varphi_2 \varphi_1 [A'] \subset I_j$ for some $j$. Restricted to $A'$, $\varphi_3 \varphi_2 \varphi_1$ is a $k$-homomorphism. Finally, let $\varphi_4$ be the quotient map (a $k$-homomorphism) $\mathbb{Z} \to \mathbb{Z}_N$. Then with $\varphi = \varphi_4 \varphi_3 \varphi_2 \varphi_1$, $\varphi(x) = qx \pmod p \pmod N$ and $\varphi|_{A'}$ is a $k$-homomorphism, as it is the composition of $k$-homomorphisms.

The only way $\varphi|_{A'}$ is not a $k$-isomorphism is if there are $a_1, a_2, \dots, a_k, a_1', a_2', \dots, a_k' \in A'$ such that $\sum_{i=1}^{k} \varphi(a_i) = \sum_{i=1}^{k} \varphi(a_i')$ but $\sum_{i=1}^{k} a_i \ne \sum_{i=1}^{k} a_i'$. Now $\sum_i a_i \ne \sum_i a_i'$ implies $\sum_i a_i \ne \sum_i a_i' \pmod p$, so we have that $q(\sum_i a_i - \sum_i a_i') \pmod p$ is a multiple of $N$. The probability of this event is at most $|kA - kA|/N < 1$, since $|kA - kA| \le C|A|$ and $N > C|A|$. So for some $q$, $\varphi|_{A'}$ is a $k$-isomorphism. □

The next theorem, due to Bogolyubov [3], shows that we may find long arithmetic progressions with small dimension in $2A - 2A$. The proof is surprisingly simple.

Theorem 6.2. Let $A \subset \mathbb{Z}_N$ with $|A| \ge \alpha N$. Then $2A - 2A$ contains an arithmetic progression of length at least $(\alpha^2/4)^{\alpha^{-2}} N$ and dimension at most $\alpha^{-2}$.

Proof. Let $g(x)$ be the number of ways of writing $x = (a - b) - (c - d)$ with $a, b, c, d \in A$. That is, $g = (A * A) * (A * A)$, and $x \in 2A - 2A$ if and only if $g(x) \ne 0$. Now $g(x) = N^{-1} \sum_r |\hat{A}(r)|^4 \omega^{rx}$, by Lemma 2.2 (3). Let $K = \{ r \ne 0 : |\hat{A}(r)| \ge \alpha^{3/2} N \}$. Then
\[ \sum_{r \notin K, \ r \ne 0} |\hat{A}(r)|^4 \le \max_{r \notin K, \ r \ne 0} |\hat{A}(r)|^2 \sum_r |\hat{A}(r)|^2 < \alpha^3 N^2 \cdot \alpha N^2 = \alpha^4 N^4. \]

Therefore, if $x$ is such that $\mathrm{Re}(\omega^{rx}) \ge 0$ for all $r \in K$, then
\[ \mathrm{Re}\Bigl( \sum_r |\hat{A}(r)|^4 \omega^{rx} \Bigr) > |\hat{A}(0)|^4 - \alpha^4 N^4 = 0. \]
Therefore $g(x) \ne 0$ and $2A - 2A$ contains the Bohr neighbourhood $B(K; 1/4)$, since $\mathrm{Re}(\omega^{rs}) \ge 0$ if and only if $-N/4 \le rs \le N/4$. Now, writing $k = |K|$, we have $\sum_{r \in K} |\hat{A}(r)|^2 \ge k \alpha^3 N^2$ and $\sum_{r \in K} |\hat{A}(r)|^2 \le \alpha N^2$, so $k \le \alpha^{-2}$. By Theorem 5.7, $2A - 2A$ contains the required arithmetic progression. □
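The first half of the proof (the large-spectrum set $K$, its size bound, and the containment $B(K; 1/4) \subset 2A - 2A$) can be verified directly for a small example. The sketch below uses the unnormalised transform $\hat{A}(r) = \sum_{x \in A} \omega^{-rx}$, takes $\alpha = |A|/N$ exactly, and picks $N$ and $A$ arbitrarily; these choices and the function name are assumptions made for illustration.

```python
import cmath


def bogolyubov_data(A, N):
    """Return (alpha, K, Bohr set B(K; 1/4), 2A - 2A) for A in Z_N."""
    A = {a % N for a in A}
    alpha = len(A) / N
    # Unnormalised discrete Fourier transform of the indicator function of A.
    A_hat = [sum(cmath.exp(-2j * cmath.pi * r * x / N) for x in A)
             for r in range(N)]
    K = [r for r in range(1, N) if abs(A_hat[r]) >= alpha ** 1.5 * N]

    def dist(x):
        x %= N
        return min(x, N - x)

    # |r*s| <= N/4 for all r in K, tested in integer arithmetic.
    bohr = {s for s in range(N) if all(4 * dist(r * s) <= N for r in K)}
    two_a_minus_two_a = {(a + b - c - d) % N
                         for a in A for b in A for c in A for d in A}
    return alpha, K, bohr, two_a_minus_two_a


# Illustrative example.
alpha, K, bohr, target = bogolyubov_data([0, 1, 2, 3, 4, 5, 10, 11, 12, 20, 30, 40], 61)
```

For any choice of $A$ the proof predicts `bohr <= target` and `len(K) <= alpha ** -2`.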

We now present Ruzsa’s proof of Freiman’s Theorem.

Freiman’s Theorem. Let $A \subset \mathbb{Z}$ be a set such that $|A + A| \le C|A|$. Then $A$ is contained in a $d$-dimensional arithmetic progression $P$ of cardinality at most $k|A|$, where $d$ and $k$ depend on $C$ only.

Proof. By Theorem 5.15, $|8A - 8A| \le C^{16} |A|$. By Lemma 6.1, $A$ contains a subset $A'$ of cardinality at least $|A|/8$ which is 8-isomorphic to a set $B \subset \mathbb{Z}_N$ with $C^{16} |A| < N \le 2C^{16} |A|$, where $N$ is a prime whose existence follows from Bertrand’s Postulate. So $|B| = \alpha N$ with $\alpha \ge (16 C^{16})^{-1}$. By Theorem 6.2, $2B - 2B$ contains an arithmetic progression of dimension at most $\alpha^{-2}$ and cardinality at least $(\alpha^2/4)^{\alpha^{-2}} N \ge (\alpha^2/4)^{\alpha^{-2}} |A|$. Since $B$ is 8-isomorphic to $A'$, $2B - 2B$ is 2-isomorphic to $2A' - 2A'$. Any set 2-isomorphic to a $d$-dimensional arithmetic progression is a $d$-dimensional arithmetic progression. Therefore $2A' - 2A'$, and hence $2A - 2A$, contains an arithmetic progression $Q$ of dimension at most $\alpha^{-2}$ and cardinality $\gamma |A|$, where $\gamma \ge (\alpha^2/4)^{\alpha^{-2}}$.

Now let $X = \{x_1, x_2, \dots, x_k\} \subset A$ be maximal such that $x, y \in X$, $x \ne y$ imply $x - y \notin Q - Q$. Equivalently, all the sets $x + Q$ are disjoint, so $|X + Q| = |X| |Q|$. Since $X$ is maximal, $A \subset X + (Q - Q)$, and $X$ is contained in the $k$-dimensional arithmetic progression $R = \{ \sum_{i=1}^{k} a_i x_i : 0 \le a_i \le 1 \}$. Clearly $|R| \le 2^k$. Therefore $A$ is contained in the arithmetic progression $R + (Q - Q)$, of dimension at most $\alpha^{-2} + k$. We know that $X + (Q - Q) \subset A + (4A - 4A) = A + 2A - 2A + 2A - 2A$, and that $X + Q \subset A + 2A - 2A = 3A - 2A$. So $|X + Q| \le |3A - 2A| \le C^5 |A|$, by Theorem 5.15. So $k \le C^5 |A| / |Q| \le C^5 \gamma^{-1}$. Finally, $|Q - Q| \le 2^{\alpha^{-2}} |Q|$, by $d$-dimensionality. So $A$ is contained in an arithmetic progression of dimension at most $\alpha^{-2} + C^5 \gamma^{-1}$, and cardinality at most $2^k 2^{\alpha^{-2}} |Q| \le 2^k 2^{\alpha^{-2}} |2A - 2A| \le 2^k C^4 2^{\alpha^{-2}} |A|$. □

The constants from this theorem can be chosen to be $d = \exp(C^{\alpha})$ and $k = \exp\exp(C^{\beta})$, where $\alpha, \beta > 0$ are absolute constants. Using a refinement of the same approach, a better result can be obtained for set differences of the same set (see [2]):

Theorem 6.4. Let $C$ be a positive real number. Suppose $A$ is a set of integers satisfying $|A - A| \le C|A|$ and
\[ |A| \ge \frac{\lfloor C \rfloor \lfloor C+1 \rfloor}{2(\lfloor C+1 \rfloor - C)}. \]
Then $A$ is a subset of an arithmetic progression of dimension at most $\lfloor C - 1 \rfloor$ and cardinality at most $\exp\exp(C^{\gamma})$, where $\gamma > 0$ is an absolute constant.

It is likely that a result with very much the same constants is true for $A + A$. These theorems can be generalized to theorems about abelian groups [4], [13]. We now turn to results concerning difference sets, which will eventually aid in finding four-term arithmetic progressions in the next chapter.

Lemma 6.5. Let $A_1, A_2, \ldots, A_m$ be subsets of $[N]$, $\alpha > 0$ and suppose that $\sum_{i=1}^{m} |A_i| \ge \alpha m N$. Then there exists $B \subset [m]$, of cardinality at least $\alpha^5 m/2$, such that for at least ninety percent of pairs $(i, j) \in B \times B$, $|A_i \cap A_j| \ge \alpha^2 N/2$.

Proof. Let $x_1, x_2, \ldots, x_5$ be chosen randomly and independently from $[N]$. Let $B = \{i : \{x_1, x_2, \ldots, x_5\} \subset A_i\}$. Then $\mathrm{Prob}[i \in B] = (|A_i|/N)^5$ and thus the expected size of $B$ is
\[ \sum_{i=1}^{m} (|A_i|/N)^5 \ge m \Big( \sum_{i=1}^{m} |A_i| / mN \Big)^5 \ge \alpha^5 m, \]
by Jensen's Inequality. By Cauchy-Schwarz, $E[|B|^2] \ge \alpha^{10} m^2$. If $|A_i \cap A_j| < \alpha^2 N/2$, then $\mathrm{Prob}[i \in B, j \in B] < \alpha^{10}/32$. So if $C = \{(i, j) \in B \times B : |A_i \cap A_j| < \alpha^2 N/2\}$, then $E[|C|] < \alpha^{10} m^2/32$. It follows that $E[|B|^2 - 16|C|] > \alpha^{10} m^2/2$. Hence there exist $x_1, x_2, \ldots, x_5$ such that $|B|^2 > \alpha^{10} m^2/2$ and $|B|^2 \ge 16|C|$. □

The following theorem is due to Balog and Szemerédi [1]:

Theorem 6.6. Let $A$ be a subset of an abelian group. Suppose $\alpha > 0$ and that there are at least $\alpha|A|^3$ quadruples $(a, b, c, d) \in A \times A \times A \times A$ such that $a - b = c - d$. Then $A$ contains a subset $A'$ such that $|A'| \ge c|A|$ and $|A' - A'| \le C|A|$, where $c$ and $C$ depend on $\alpha$ only.

Proof. Set $|A| = n$. Let $f(x) = (A * A)(x)$, the number of ways of writing $x = a - b$ with $a, b \in A$. Then $\sum_x f(x) = n^2$, $\sum_x f(x)^2 \ge \alpha n^3$ and $\max_x f(x) \le n$. It follows that $f(x) \ge \alpha n/2$ for at least $\alpha n/2$ values of $x$: otherwise let $B = \{x : f(x) < \alpha n/2\}$ and note that
\[ \sum_{x \in B} f(x)^2 < \max_{x \in B} f(x) \sum_{x \in B} f(x) < \alpha n^3/2, \]
which implies $\sum_x f(x)^2 < \alpha n^3$. Let $x$ be called a popular difference if $f(x) \ge \alpha n/2$. Define a graph $G$ with vertex set $A$ and edge set $\{ab : a - b \text{ is a popular difference}\}$ (note that $f$ is symmetric). There are at least $\alpha^2 n^2/8$ edges in $G$, by the first part of the proof. Let $\Gamma(a)$ denote the open neighbourhood of a vertex $a$ in $G$. Then $\sum_{a \in A} |\Gamma(a)| \ge \alpha^2 n^2/4$ so, by the preceding lemma, we can find $B \subset A$ of cardinality at least $\alpha^{10} n/2^{11}$ such that $|\Gamma(a) \cap \Gamma(b)| \ge \alpha^4 n/32$ for at least ninety percent of pairs $(a, b) \in B \times B$.

Define a new graph $H$ with vertex set $B$ and edge set $\{ab : |\Gamma(a) \cap \Gamma(b)| \ge \alpha^4 n/32\}$. Since the average degree in $H$ is at least $9|B|/10$, at least $4|B|/5$ vertices have degree at least $4|B|/5$. Let $A'$ be the set of all such vertices; this will be the desired set. Let $a, b \in A'$. There are at least $3|B|/5$ numbers $c \in B$ such that $ac$ and $bc$ are edges of $H$, by definition of $A'$. If $ac$ is an edge of $H$, then $|\Gamma(a) \cap \Gamma(c)| \ge \alpha^4 n/32$, so there are at least $\alpha^4 n/32$ numbers $d$ such that $ad$ and $cd$ are edges of $G$, and similarly for $bc$. If $ad$ is an edge of $G$ then there are at least $\alpha n/2$ pairs $(x, y) \in A \times A$ such that $y - x = d - a$, so $a + y - x = d$, and similarly for other edges of $G$. Therefore there are at least
\[ \frac{3}{5} \cdot \frac{\alpha^{10} n}{2^{11}} \cdot \Big( \frac{\alpha^4 n}{32} \Big)^2 \Big( \frac{\alpha n}{2} \Big)^4 \]
distinct octuples $(x_1, y_1, x_2, y_2, \ldots, x_4, y_4) \in A^8$ such that $a + y_1 - x_1 + y_2 - x_2 + y_3 - x_3 + y_4 - x_4 = b$. If we choose a different pair $(a', b') \in A' \times A'$ such that $b' - a' \ne b - a$, then the corresponding set of octuples is disjoint. Since there are at most $n^8$ octuples in total, the number of distinct differences $b - a$ times the above lower bound is at most $n^8$, and so $|A' - A'| \le 2^{26}\alpha^{-22} n$. Setting $c = \alpha^{10}/(5 \cdot 2^9)$ and $C = 2^{26}\alpha^{-22}$ completes the proof. □

Corollary 6.7. Let $A \subset \mathbb{Z}^k$ with $|A| = m$ and such that the number of quadruples $(a, b, c, d) \in A \times A \times A \times A$, with $a - b = c - d$, is at least $cm^3$. Then there exists an arithmetic progression $P$ of cardinality at most $Cm$ and dimension at most $d$ such that $|A \cap P| \ge cm$, where $C$ and $d$ depend only on $c$.

Proof. This follows directly from the preceding result and Freiman's Theorem. □

This corollary, or rather a derivative of it, will be very useful in studying four-term arithmetic progressions in the next chapter. In fact, this result is equivalent to Freiman’s Theorem.

Any integer quadruple $(a, b, c, d)$ such that $a - b = c - d$ is called an additive quadruple. For a function $\phi : B \to \mathbb{Z}_N$, where $B \subset \mathbb{Z}_N$, we say $(a, b, c, d) \in B \times B \times B \times B$ is an additive quadruple of $\phi$ if $(a, b, c, d)$ is an additive quadruple and $(\phi(a), \phi(b), \phi(c), \phi(d))$ is an additive quadruple. If $A \subset \mathbb{Z}^d$, then $(A * A)(x)$ is the number of representations of $x$ as $y - z$ with $y, z \in A$. Therefore the number of quadruples $(a, b, c, d) \in A \times A \times A \times A$ with $a - b = c - d$ is $\|A * A\|_2^2$. The result we shall use in the next chapter is the following:
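The count of additive quadruples as a sum of squares of the difference-representation function can be checked on small examples. The following is a minimal Python sketch, not part of the original notes; the function names and the two test sets are illustrative assumptions.

```python
from collections import Counter
from itertools import product


def difference_counts(A):
    """Return f with f[x] = number of pairs (y, z) in A x A with y - z = x."""
    return Counter(y - z for y in A for z in A)


def additive_quadruples_bruteforce(A):
    """Count (a, b, c, d) in A^4 with a - b = c - d by direct enumeration."""
    return sum(1 for a, b, c, d in product(A, repeat=4) if a - b == c - d)


def additive_quadruples(A):
    """Count the same quadruples as the sum over x of f(x)^2."""
    return sum(v * v for v in difference_counts(A).values())


# Hypothetical examples: an arithmetic progression has many additive
# quadruples, while the powers of two have only the trivial ones.
progression = list(range(1, 9))
powers = [2 ** i for i in range(8)]
```

Both sets have eight elements, but the progression has roughly three times as many additive quadruples as the powers of two, which is the kind of contrast that Theorem 6.6 and Freiman's Theorem quantify.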

Corollary 6.8. Let $B \subset \mathbb{Z}_N$ be a set of cardinality $\beta N$, and let $\phi : B \to \mathbb{Z}_N$ be a function with at least $\alpha N^3$ additive quadruples. Then there exist constants $\gamma$ and $\eta$, depending only on $\beta$ and $\alpha$, a $\mathbb{Z}_N$-arithmetic progression $P$ of cardinality at least $N^{\gamma}$ and a linear function $\psi : P \to \mathbb{Z}_N$ such that $\psi(s) = \phi(s)$ for at least $\eta|P|$ values of $s \in P$.

Proof. Let $\Gamma$ denote the graph of $\phi$ in $\mathbb{Z} \times \mathbb{Z}$. By Theorem 6.6, applied with $A = \Gamma$, there are constants $c$ and $C$, depending only on $\alpha$, and a set $A' \subset A$ of cardinality at least $c|A|$ such that $|A' - A'| \le C|A|$. A result of Ruzsa shows that if $A$ is any set with $|A - A| \le C|A|$, then there exists a $\mathbb{Z}$-arithmetic progression $Q$ of dimension at most $2^{18}C^{32}$ and size at least $(2^{20}C^{32})^{-2^{18}C^{32}}|A|$, such that $|A \cap Q| \ge C^{-5}2^{-d}|Q|$. Applying this result to $A'$, we get a $d$-dimensional $\mathbb{Z}$-arithmetic progression $Q$ of cardinality at most $CN$, with $|\Gamma \cap Q| \ge cN$, where $d, c, C$ depend on $\alpha$ and $\beta$ only. If $Q = Q_1 + Q_2 + \ldots + Q_d$, then, since $|Q| \ge |\Gamma \cap Q| \ge cN$, at least one $Q_i$ has cardinality at least $(cN)^{1/d}$, so $Q$ can be partitioned into one-dimensional arithmetic progressions of cardinality at least $(cN)^{1/d}$. Therefore there is an arithmetic progression $R \subset \mathbb{Z} \times \mathbb{Z}$, of cardinality at least $(cN)^{1/d}$, such that $|R \cap \Gamma| \ge cC^{-1}|R|$. As $\Gamma$ is the graph of a function, $R$ is not vertical unless $|R \cap \Gamma| = 1$, in which case the result is proved. So there exists an arithmetic progression $P \subset \mathbb{Z}$ of the same size as $R$ and a linear function $\psi$ such that $\Gamma$ contains at least $cC^{-1}|P|$ pairs $(s, \psi(s))$. Reducing modulo $N$ gives the required result. □

Following Ruzsa's proof of Freiman's Theorem, we may take $\gamma = \alpha^{K}$ and $\eta = \exp(-\alpha^{-K})$, where $K > 0$ is an absolute constant.

§7 Szemerédi's Theorem

To prove Szemerédi's Theorem [16] for four-term arithmetic progressions we use, following Gowers [7], a two-case argument: we consider first sets which behave roughly like random sets, and then those which do not. If a set does not behave like a random set, it can be restricted to an arithmetic progression, of reasonable length, in which its density increases. This argument applies only a finite number of times, as the density is bounded above by 1. Notice the similarities in approach with the proof of Roth's Theorem. The difference is that a stronger condition, namely quadratic uniformity, is required for random-like behaviour with regard to four-term arithmetic progressions. The difficult part is finding an arithmetic progression of reasonable length in which the density increases.

We now define the concept of quadratic uniformity. Let $f : \mathbb{Z}_N \to \{z \in \mathbb{C} : |z| \le 1\}$ and $\alpha > 0$. Then $f$ is $\alpha$-uniform if $\sum_r |\hat f(r)|^4 \le \alpha N^4$. If $A \subset \mathbb{Z}_N$, $|A| = \delta N$ and $f(x) = A(x) - \delta$, then $A$ is $\alpha$-uniform if $f$ is $\alpha$-uniform; equivalently, $A$ is $\alpha$-uniform if $\sum_r |\hat A(r)|^4 \le (\delta^4 + \alpha)N^4$. The concept of $\alpha$-uniformity is not quite strong enough, in terms of containing the expected number of arithmetic progressions of length four. We say $f$ is quadratically $\alpha$-uniform if
\[ \sum_k \sum_r |\Delta(f; k)^{\wedge}(r)|^4 \le \alpha N^5, \]
where $\Delta(f; k)(x) = f(x)\overline{f(x - k)}$. We generally define

\[ \Delta(f; k_1, k_2, \ldots, k_r) = \Delta(\Delta(f; k_1, k_2, \ldots, k_{r-1}); k_r). \]

This is independent of the order of the $k_i$. Also

\[ \begin{aligned}
\sum_k \sum_r |\Delta(f; k)^{\wedge}(r)|^4
 &= N \sum_{x,k,l,m} \Delta(f; k)(x)\,\overline{\Delta(f; k)(x - l)}\,\overline{\Delta(f; k)(x - m)}\,\Delta(f; k)(x - l - m) \\
 &= N \sum_{x,k,l,m} \Delta(f; k, l, m)(x) \\
 &= N \sum_{k,l} \Big| \sum_x \Delta(f; k, l)(x) \Big|^2 .
\end{aligned} \]
In this chapter, $D$ will denote the unit disc in the complex plane.
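The identity relating the fourth powers of the Fourier coefficients of $\Delta(f; k)$ to the sums of $\Delta(f; k, l)$ can be checked numerically on a small cyclic group. The following Python sketch is an illustration only: it assumes the unnormalised transform $\hat f(r) = \sum_x f(x)\omega^{-rx}$ and the convention $\Delta(f; k)(x) = f(x)\overline{f(x-k)}$, and the modulus 7 and the set $\{0, 1, 3\}$ are arbitrary choices.

```python
import cmath

N = 7  # small modulus chosen for illustration
omega = cmath.exp(2j * cmath.pi / N)


def fourier(f):
    """Unnormalised transform: f^(r) = sum_x f(x) * omega^(-r x)."""
    return [sum(f[x] * omega ** (-r * x) for x in range(N)) for r in range(N)]


def delta(f, k):
    """Delta(f; k)(x) = f(x) * conj(f(x - k)), with indices taken mod N."""
    return [f[x] * f[(x - k) % N].conjugate() for x in range(N)]


def quadratic_sum(f):
    """Sum over k and r of |Delta(f; k)^(r)|^4."""
    return sum(abs(c) ** 4 for k in range(N) for c in fourier(delta(f, k)))


def double_difference_sum(f):
    """N times the sum over k, l of |sum_x Delta(f; k, l)(x)|^2."""
    total = 0.0
    for k in range(N):
        dk = delta(f, k)
        for l in range(N):
            total += abs(sum(delta(dk, l))) ** 2
    return N * total


# Balanced function f(x) = A(x) - delta of the set A = {0, 1, 3} in Z_7.
A = {0, 1, 3}
f = [complex((x in A) - len(A) / N) for x in range(N)]
# A complex-valued test function with values in the unit disc.
g = [cmath.exp(1j * x * x) / 2 for x in range(N)]
```

With these definitions, `quadratic_sum(f) / N ** 5` is the smallest $\alpha$ for which `f` is quadratically $\alpha$-uniform.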

Lemma 7.1. Let $f : \mathbb{Z}_N \to D$. Then $f$ is $\alpha$-uniform if and only if for any function $g : \mathbb{Z}_N \to \mathbb{C}$,
\[ \sum_k \Big| \sum_s f(s)\overline{g(s - k)} \Big|^2 \le \sqrt{\alpha}\, N^2 \|g\|_2^2 . \]
Also, $f$ is $\alpha$-uniform if $\max_r |\hat f(r)| \le \alpha^{1/2} N$.

Proof. For the first part, we know that
\[ \begin{aligned}
\sum_k \Big| \sum_s f(s)\overline{g(s - k)} \Big|^2
 &= \sum_k |(f * g)(k)|^2 \\
 &= N^{-1} \sum_r |(f * g)^{\wedge}(r)|^2 \\
 &= N^{-1} \sum_r |\hat f(r)|^2 |\hat g(r)|^2 \\
 &\le N^{-1} \Big( \sum_r |\hat f(r)|^4 \Big)^{1/2} \Big( \sum_r |\hat g(r)|^4 \Big)^{1/2},
\end{aligned} \]
by the Cauchy-Schwarz inequality and Lemma 2.2. Since $(\sum_r |\hat g(r)|^4)^{1/2} \le \sum_r |\hat g(r)|^2$, if $f$ is $\alpha$-uniform then the inequality in $f$ and $g$ above must hold. For the second part, we use the fact that $\sum_r |\hat f(r)|^4 \le \max_r |\hat f(r)|^2 \sum_r |\hat f(r)|^2$, and Parseval's Identity from Lemma 2.2 to obtain $\sum_r |\hat f(r)|^2 \le N^2$, and the result follows. □

The first few results lead to showing that quadratically $\alpha$-uniform sets do contain four-term arithmetic progressions. We begin by proving a number of technical lemmas concerning $\alpha$-uniformity. The following lemma shows that a quadratically uniform set is also uniform.

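Lemma 7.2. If $f$ is quadratically $\alpha$-uniform, then $f$ is $\alpha^{1/2}$-uniform.

Proof.
\[ \begin{aligned}
\Big( \sum_r |\hat f(r)|^4 \Big)^2
 &= \Big( N \sum_{x,k,l} f(x)\overline{f(x - k)}\,\overline{f(x - l)} f(x - k - l) \Big)^2 \\
 &= N^2 \Big( \sum_{x,k,l} \Delta(f; k, l)(x) \Big)^2 \\
 &\le N^4 \sum_{k,l} \Big| \sum_x \Delta(f; k, l)(x) \Big|^2 \\
 &= N^3 \sum_{k,r} |\Delta(f; k)^{\wedge}(r)|^4 \le \alpha N^8 .
\end{aligned} \]
□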

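Lemma 7.3. Let $f_1, f_2, f_3 : \mathbb{Z}_N \to D$. Suppose that $f_3$ is $\alpha$-uniform. Then we have
\[ \Big| \sum_{a,d} f_1(a) f_2(a + d) f_3(a + 2d) \Big| \le \alpha^{1/4} N^2 . \]

Proof. If $S = \sum_{a,d} f_1(a) f_2(a + d) f_3(a + 2d)$ then
\[ \begin{aligned}
|S| &= \Big| \sum_{a + c = 2b} f_1(a) f_2(b) f_3(c) \Big| \\
 &= \Big| N^{-1} \sum_r \hat f_1(r) \hat f_2(-2r) \hat f_3(r) \Big| \\
 &\le N^{-1} \max_r |\hat f_3(r)| \cdot \Big( \sum_r |\hat f_1(r)|^2 \Big)^{1/2} \cdot \Big( \sum_r |\hat f_2(-2r)|^2 \Big)^{1/2} \\
 &\le N^{-1} \cdot \alpha^{1/4} N \cdot N^2 = \alpha^{1/4} N^2 .
\end{aligned} \]
□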
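The Fourier expression for the three-term sum $S$ used in the proof of Lemma 7.3, and the resulting bound, can be verified numerically. The Python sketch below is an illustration under stated assumptions: the transform is taken as $\hat f(r) = \sum_x f(x)\omega^{-rx}$, and the modulus 11 and the set $\{0, 1, 2, 4, 7\}$ are arbitrary examples (an odd modulus is used so that $r \mapsto -2r$ is a bijection).

```python
import cmath


def fourier(f):
    """Unnormalised transform on Z_N: f^(r) = sum_x f(x) * omega^(-r x)."""
    N = len(f)
    omega = cmath.exp(2j * cmath.pi / N)
    return [sum(f[x] * omega ** (-r * x) for x in range(N)) for r in range(N)]


def three_term_sum(f1, f2, f3):
    """Direct evaluation of sum_{a,d} f1(a) f2(a+d) f3(a+2d) over Z_N."""
    N = len(f1)
    return sum(f1[a] * f2[(a + d) % N] * f3[(a + 2 * d) % N]
               for a in range(N) for d in range(N))


def three_term_sum_fourier(f1, f2, f3):
    """The same sum written as N^(-1) sum_r f1^(r) f2^(-2r) f3^(r)."""
    N = len(f1)
    h1, h2, h3 = fourier(f1), fourier(f2), fourier(f3)
    return sum(h1[r] * h2[(-2 * r) % N] * h3[r] for r in range(N)) / N


# Illustrative data: indicator and balanced function of a set in Z_11.
N = 11
A = {0, 1, 2, 4, 7}
indicator = [complex(x in A) for x in range(N)]
balanced = [v - len(A) / N for v in indicator]

direct = three_term_sum(indicator, indicator, balanced)
via_fourier = three_term_sum_fourier(indicator, indicator, balanced)
# Smallest alpha for which the balanced function is alpha-uniform.
alpha = sum(abs(c) ** 4 for c in fourier(balanced)) / N ** 4
```

Here `direct` and `via_fourier` agree up to rounding, and `abs(direct)` is bounded by `alpha ** 0.25 * N ** 2`, as the lemma predicts.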

Lemma 7.4. Let $f_1, f_2, f_3, f_4 : \mathbb{Z}_N \to D$ and $\alpha > 0$. Suppose $f_4$ is quadratically $\alpha$-uniform. Then
\[ \Big| \sum_{a,d} f_1(a) f_2(a + d) f_3(a + 2d) f_4(a + 3d) \Big| \le \alpha^{1/8} N^2 . \]

Proof. Let $S$ be the sum we are estimating. Then
\[ \begin{aligned}
|S|^2 &\le N \sum_a \Big| \sum_d f_1(a) f_2(a + d) f_3(a + 2d) f_4(a + 3d) \Big|^2 \\
 &\le N \sum_a \Big| \sum_d f_2(a + d) f_3(a + 2d) f_4(a + 3d) \Big|^2 \\
 &= N \sum_a \sum_{d,e} f_2(a + d)\overline{f_2(a + e)}\, f_3(a + 2d)\overline{f_3(a + 2e)}\, f_4(a + 3d)\overline{f_4(a + 3e)} \\
 &= N \sum_a \sum_{d,k} \Delta(f_2; k)(a + d)\, \Delta(f_3; 2k)(a + 2d)\, \Delta(f_4; 3k)(a + 3d) \\
 &= N \sum_a \sum_{d,k} \Delta(f_2; k)(a)\, \Delta(f_3; 2k)(a + d)\, \Delta(f_4; 3k)(a + 2d) .
\end{aligned} \]

Since $f_4$ is quadratically $\alpha$-uniform, there are $\alpha(k)$, $k \in \mathbb{Z}_N$, such that for each $k$, $\Delta(f_4; k)$ is $\alpha(k)$-uniform and $N^{-1} \sum_k \alpha(k) = \alpha$. By Lemma 7.3, the above expression is at most
\[ N \sum_k \alpha(3k)^{1/4} N^2 = N \sum_k \alpha(k)^{1/4} N^2 \le N \cdot \alpha^{1/4} N \cdot N^2 = \alpha^{1/4} N^4 . \]
□

Theorem 7.5. Let $A_1, A_2, A_3, A_4 \subset \mathbb{Z}_N$ with $|A_i| = \delta_i N$. Suppose that $A_3$ is $\alpha^{1/2}$-uniform and $A_4$ is quadratically $\alpha$-uniform. Then
\[ \Big| \sum_{a,d} A_1(a) A_2(a + d) A_3(a + 2d) A_4(a + 3d) - \delta_1 \delta_2 \delta_3 \delta_4 N^2 \Big| \le 12 \alpha^{1/8} N^2 . \]

Proof. Set $f_i(x) = A_i(x) - \delta_i$. Replace the $A_i(\cdot)$ with $f_i(\cdot) + \delta_i$ in the sum we wish to estimate. The sum splits into sixteen parts. We think of the $\delta_i$ as constant functions and apply the two preceding lemmas. If we choose $f_4$, then applying Lemma 7.4 shows that the sum is at most $\alpha^{1/8} N^2$. If we do not choose $f_4$, but choose $f_3$, then the sum is at most $(\alpha^{1/2})^{1/4} N^2 = \alpha^{1/8} N^2$, by Lemma 7.3. If neither $f_3$ nor $f_4$ is chosen, we use the identity
\[ \sum_{a,d} g_1(a) g_2(a + d) = \sum_{a,b} g_1(a) g_2(b) = \Big( \sum_a g_1(a) \Big) \Big( \sum_b g_2(b) \Big) . \]
This shows that all of the remaining terms are zero, apart from the constant term, which is $\sum_{a,d} \delta_1 \delta_2 \delta_3 \delta_4 = N^2 \delta_1 \delta_2 \delta_3 \delta_4$. This completes the proof of Theorem 7.5. □

Corollary 7.6. Let $A \subset [N]$, $|A| = \delta N$ where $\delta > 0$. Suppose that $A$ is quadratically $\alpha$-uniform. If $\alpha \le \delta^{32}/2^{88}$ and $N \ge 200/\delta^4$, then $A$ contains an arithmetic progression of length four or we can find a subprogression where $A$ has density at least $\frac{9}{8}\delta$.

Proof. Let $A_1 = A_2 = A \cap [2N/5, 3N/5]$ and $A_3 = A_4 = A$. If $|A_1| \le \delta N/10$, we have $A \cap [0, 2N/5]$ or $A \cap [3N/5, N)$ of cardinality at least $\delta(9N/20)$. Otherwise, by Theorem 7.5, $A_1 \times A_2 \times A_3 \times A_4$ contains at least $(\delta^4/100 - 12\alpha^{1/8})N^2$ $\mathbb{Z}_N$-arithmetic progressions of length four. Provided this is greater than $N$, we have a $\mathbb{Z}$-arithmetic progression, since all $\mathbb{Z}_N$-arithmetic progressions in $A_1 \times A_2 \times A_3 \times A_4$ are $\mathbb{Z}$-arithmetic progressions. □

We now turn to the case where $f$ is not quadratically uniform. If $A$ is the corresponding set of density $\delta$, then we plan to show that $A$ intersects a $\mathbb{Z}$-arithmetic progression $P \subset \{1, 2, \ldots, N\}$ of size at least $N^d$ and such that $|A \cap P| \ge (\delta + \varepsilon)|P|$, where $\varepsilon$ and $d$ depend only on $\alpha$ and $\delta$.

Lemma 7.7. Suppose that $f$ is not quadratically $\alpha$-uniform. Then there exists a set $B$, of cardinality at least $\alpha N/2$, and a function $\phi : B \to \mathbb{Z}_N$ such that
\[ \sum_{k \in B} |\Delta(f; k)^{\wedge}(\phi(k))|^2 \ge (\alpha/2)^2 N^3 . \]

Proof. Since $f$ is not quadratically $\alpha$-uniform, $\sum_k \sum_r |\Delta(f; k)^{\wedge}(r)|^4 > \alpha N^5$. So there must be more than $\alpha N/2$ values of $k$ for which $\sum_r |\Delta(f; k)^{\wedge}(r)|^4 \ge \alpha N^4/2$. So there are more than $\alpha N/2$ values of $k$ such that $\max_r |\Delta(f; k)^{\wedge}(r)| \ge (\alpha/2)^{1/2} N$, by the second part of Lemma 7.1. Therefore there exists a set $B$, of cardinality at least $\alpha N/2$, and a function $\phi$ such that $|\Delta(f; k)^{\wedge}(\phi(k))| \ge (\alpha/2)^{1/2} N$ for all $k \in B$. Summing the squares of this over $k \in B$ gives the required result. □

Recall the definition of an additive quadruple, given in the last part of the last chapter.

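Lemma 7.8. Suppose that $f : \mathbb{Z}_N \to D$, $B \subset \mathbb{Z}_N$ and $\phi : B \to \mathbb{Z}_N$ is a function such that, for some $\alpha > 0$,
\[ \sum_{k \in B} |\Delta(f; k)^{\wedge}(\phi(k))|^2 \ge \alpha N^3 . \]
Then there exist at least $\alpha^4 N^3$ quadruples $(a, b, c, d) \in B \times B \times B \times B$ such that $a + b = c + d$ and $\phi(a) + \phi(b) = \phi(c) + \phi(d)$.

Proof. Expanding the left hand side of the inequality, we get:
\[ \begin{aligned}
 & \sum_{k \in B} |\Delta(f; k)^{\wedge}(\phi(k))|^2 \ge \alpha N^3 \\
\Rightarrow\ & \sum_{k \in B} \sum_{s,t} f(s)\overline{f(s - k)}\,\overline{f(t)} f(t - k)\, \omega^{-\phi(k)(s - t)} \ge \alpha N^3 \\
\Rightarrow\ & \sum_{k \in B} \sum_{s,u} f(s)\overline{f(s - k)}\,\overline{f(s - u)} f(s - k - u)\, \omega^{-\phi(k)u} \ge \alpha N^3 \\
\Rightarrow\ & \sum_{u,s} \Big| \sum_{k \in B} \overline{f(s - k)} f(s - k - u)\, \omega^{-\phi(k)u} \Big| \ge \alpha N^3 \\
\Rightarrow\ & \sum_{u,s} \Big| \sum_{k \in B} \overline{f(s - k)} f(s - k - u)\, \omega^{-\phi(k)u} \Big|^2 \ge \alpha^2 N^4 .
\end{aligned} \]
Let $\gamma(u)$ satisfy
\[ \sum_s \Big| \sum_{k \in B} \overline{f(s - k)} f(s - k - u)\, \omega^{-\phi(k)u} \Big|^2 = \gamma(u) N^3 . \]
Using the first part of Lemma 7.1, we deduce that $B(k)\omega^{\phi(k)u}$ is not $\gamma(u)^2$-uniform, and therefore (by definition)
\[ \sum_r \Big| \sum_{k \in B} \omega^{\phi(k)u - rk} \Big|^4 \ge \gamma(u)^2 N^4 . \]
By the above inequalities, $\sum_u \gamma(u) \ge \alpha^2 N$ so $\sum_u \gamma(u)^2 \ge \alpha^4 N$. Therefore
\[ \sum_u \sum_r \Big| \sum_{k \in B} \omega^{\phi(k)u - rk} \Big|^4 \ge \alpha^4 N^5 . \]
Expanding the left hand side we find that:
\[ \sum_{u,r} \sum_{a,b,c,d \in B} \omega^{(\phi(a) + \phi(b) - \phi(c) - \phi(d))u}\, \omega^{-r(a + b - c - d)} \ge \alpha^4 N^5 . \]
However, the left side is $N^2$ times the number of quadruples $(a, b, c, d) \in B \times B \times B \times B$ for which $a + b = c + d$ and $\phi(a) + \phi(b) = \phi(c) + \phi(d)$. □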

We recall the definition of additive quadruples for a function $\phi$, from the end of chapter six.

Lemma 7.9. Suppose that $\phi : \mathbb{Z}_N \to \mathbb{Z}_N$ has at least $\alpha N^3$ additive quadruples. Then there exist $\eta, \gamma$, depending only on $\alpha$, and an arithmetic progression $P$ of length at least $N^{\gamma}$ such that for some $\lambda$ and $\mu$,
\[ \sum_{k \in P} |\Delta(f; k)^{\wedge}(\lambda k + \mu)|^2 \ge \eta N^2 |P| . \]

Proof. This follows from Corollary 6.8 with $\gamma = \alpha^{K}$ and $\eta = \exp(-\alpha^{-K})$. □

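Lemma 7.10. Let $f : \mathbb{Z}_N \to D$. Let $\eta > 0$ and $P \subset \mathbb{Z}_N$ be a $\mathbb{Z}_N$-arithmetic progression such that, with $\lambda, \mu \in \mathbb{Z}_N$,
\[ \sum_{k \in P} |\Delta(f; k)^{\wedge}(2\lambda k + \mu)|^2 \ge \eta |P| N^2 . \]
Then, for $|P| \le N^{1/2}$, there exists a partition of $\mathbb{Z}_N$ into translates $P_1, P_2, \ldots, P_M$ of $P$ or $P$ with an endpoint removed, such that for each $i$ we can find $r_i \in \mathbb{Z}_N$ such that
\[ \sum_i \Big| \sum_{x \in P_i} f(x)\, \omega^{-\lambda x^2 - r_i x} \Big| \ge \eta N/2 . \]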

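Proof.
\[ \begin{aligned}
 & \sum_{k \in P} \Big| \sum_x f(x)\overline{f(x - k)}\, \omega^{-(2\lambda k + \mu)x} \Big|^2 \ge \eta |P| N^2 \\
\Rightarrow\ & \sum_{k \in P} \sum_x \sum_y f(x)\overline{f(x - k)}\,\overline{f(y)} f(y - k)\, \omega^{-(2\lambda k + \mu)(x - y)} \ge \eta |P| N^2 \\
\Rightarrow\ & \sum_{k \in P} \sum_x \sum_u f(x)\overline{f(x - k)}\,\overline{f(x - u)} f(x - k - u)\, \omega^{-(2\lambda k + \mu)u} \ge \eta |P| N^2 .
\end{aligned} \]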

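Every $u \in \mathbb{Z}_N$ can be written in exactly $|P|$ ways as $v + l$ with $v \in \mathbb{Z}_N$ and $l \in P$, therefore,

\[ \sum_{k \in P} \sum_{l \in P} \sum_x \sum_v f(x)\overline{f(x - k)}\,\overline{f(x - v - l)} f(x - v - k - l)\, \omega^{-(2\lambda k + \mu)(v + l)} \ge \eta |P|^2 N^2 . \]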

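Hence we can find $v \in \mathbb{Z}_N$ such that

\[ \Big| \sum_{k \in P} \sum_{l \in P} \sum_x f(x)\overline{f(x - k)}\,\overline{g(x - l)} g(x - k - l)\, \omega^{-(2\lambda v k + \mu v + 2\lambda k l + \mu l)} \Big| \ge \eta |P|^2 N , \]
where $g(x) = f(x - v)$. Now with $2\lambda v k = 2\lambda v\big((x - l) - (x - k - l)\big)$, $\mu l = \mu\big(x - (x - l)\big)$ and $2\lambda k l = \lambda\big(x^2 - (x - k)^2 - (x - l)^2 + (x - k - l)^2\big)$,
\[ \Big| \sum_{k \in P} \sum_{l \in P} \sum_x h_1(x)\overline{h_2(x - k)}\,\overline{h_3(x - l)} h_4(x - k - l) \Big| \ge \eta |P|^2 N , \]
where $h_1(x) = f(x)\omega^{-\lambda x^2 - \mu x}$, $h_2(x) = f(x)\omega^{-\lambda x^2}$, $h_3(x) = g(x)\omega^{-\lambda x^2 + (2\lambda v - \mu)x}$ and $h_4(x) = g(x)\omega^{-\lambda x^2 + 2\lambda v x}$. This implies that

\[ \sum_x \Big| \sum_{k \in P} \sum_{l \in P} h_1(x)\overline{h_2(x - k)}\,\overline{h_3(x - l)} h_4(x - k - l) \Big| \ge \eta |P|^2 N . \]
For each $x$, define $\eta(x)$ by $\big| \sum_{k \in P} \sum_{l \in P} \overline{h_2(x - k)}\,\overline{h_3(x - l)} h_4(x - k - l) \big| = \eta(x)|P|^2$. Then
\[ \begin{aligned}
 & N^{-1} \Big| \sum_r \sum_{k \in P} \sum_{l \in P} \sum_{m \in P + P} \overline{h_2(x - k)}\,\overline{h_3(x - l)} h_4(x - m)\, \omega^{-r(k + l - m)} \Big| \ge \eta(x)|P|^2 \\
\Rightarrow\ & \sum_r \Big| \sum_{k \in P} \overline{h_2(x - k)}\, \omega^{-rk} \Big| \cdot \Big| \sum_{l \in P} \overline{h_3(x - l)}\, \omega^{-rl} \Big| \cdot \Big| \sum_{m \in P + P} h_4(x - m)\, \omega^{rm} \Big| \ge \eta(x)|P|^2 N .
\end{aligned} \]
However $\sum_r \big| \sum_{l \in P} \overline{h_3(x - l)}\, \omega^{-rl} \big|^2 = N \sum_{l \in P} |h_3(x - l)|^2 \le N|P|$, and similarly for $h_4$. Applying Cauchy-Schwarz,

\[ \max_r \Big| \sum_{k \in P} \overline{h_2(x - k)}\, \omega^{-rk} \Big| \cdot 2^{1/2} N |P| \ge \eta(x)|P|^2 N . \]
So there exists $r_x$ such that $\big| \sum_{k \in P} h_2(x - k)\, \omega^{-r_x k} \big| \ge \eta(x)|P| 2^{-1/2}$. That is,
\[ \Big| \sum_{k \in P} f(x - k)\, \omega^{-\lambda(x - k)^2 + r_x(x - k)} \Big| \ge \eta(x)|P| 2^{-1/2} . \]
Summing over all $x$, we obtain $\sum_x \big| \sum_{y \in x - P} f(y)\, \omega^{-\lambda y^2 + r_x y} \big| \ge \eta |P| N 2^{-1/2}$. An easy averaging argument then shows that we can partition $\mathbb{Z}_N$ into translates of copies of $P$ (or $P$ with an endpoint removed), which we call $P_1, P_2, \ldots, P_M$, with
\[ \sum_i \Big| \sum_{y \in P_i} f(y)\, \omega^{-\lambda y^2 - r_i y} \Big| \ge \eta N/2 . \]
The division by $2^{1/2}$ is to ensure that the $P_i$ differ in length by at most 1. □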

Let $\phi : \mathbb{Z}_N \to \mathbb{Z}_N$ be a function. We define, for $S \subset \mathbb{Z}_N$, $\operatorname{diam} \phi(S) = \max\{ |\phi(x) - \phi(y)| : x, y \in S \}$.

Lemma 7.11. Let $m, r, l \in [N]$ and let $P$ be a $\mathbb{Z}_N$-arithmetic progression of length $m$. Let $\phi : \mathbb{Z}_N \to \mathbb{Z}_N$ be a linear function. Then, provided that $l \le (m/r)^{1/3}$, $P$ can be partitioned into subprogressions $P_i$, $i \ge 1$, of lengths $l$ or $l - 1$, such that $\operatorname{diam} \phi(P_i) \le N/r$ for each $i$.

Proof. Without loss of generality, suppose $P = [0, m - 1]$. By the pigeonhole principle, there exists $d \le rl$ such that $|\phi(d) - \phi(0)| \le N/rl$. Set $Q = \{x, x + d, \ldots, x + (l - 1)d\}$. Then $|\phi(x + ld) - \phi(x)| \le l|\phi(d) - \phi(0)| \le N/r$, so $\operatorname{diam} \phi(Q) \le N/r$. As each congruence class modulo $d$ has size at least $m/d \ge m/rl \ge l^2$, we can split $P$ into copies $P_i$ of $Q$, differing in length by at most one. □
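The construction in the proof of Lemma 7.11 can be carried out explicitly. The following Python sketch is illustrative rather than definitive: it assumes $P = [0, m-1]$ and $\phi(x) = \text{slope} \cdot x$ modulo $N$, measures differences by the representative of least absolute value, and uses hypothetical parameter values ($N = 10007$, $m = 1003$, $r = l = 5$, slope $1234$) chosen only so that $l^3 \le m/r$.

```python
def centered(value, N):
    """Representative of value mod N with least absolute value."""
    value %= N
    return value - N if value > N // 2 else value


def small_step(slope, N, bound):
    """Find d in [1, bound] with |slope * d mod N| <= N / bound.

    Such a d exists by the pigeonhole principle.
    """
    for d in range(1, bound + 1):
        if abs(centered(slope * d, N)) * bound <= N:
            return d
    raise AssertionError("unreachable by the pigeonhole principle")


def split_lengths(size, l):
    """Split size into parts of length l or l - 1 (needs size >= l * (l - 1))."""
    parts = -(-size // l)       # ceil(size / l)
    short = parts * l - size     # number of parts of length l - 1
    return [l - 1] * short + [l] * (parts - short)


def partition_progression(m, r, l, slope, N):
    """Partition [0, m-1] into progressions of difference d, length l or l-1."""
    d = small_step(slope, N, r * l)
    pieces = []
    for residue in range(min(d, m)):
        members = list(range(residue, m, d))   # one congruence class mod d
        start = 0
        for length in split_lengths(len(members), l):
            pieces.append(members[start:start + length])
            start += length
    return d, pieces


# Hypothetical parameters satisfying l^3 <= m / r.
N, m, r, l, slope = 10007, 1003, 5, 5, 1234
d, pieces = partition_progression(m, r, l, slope, N)
```

Each piece is a progression with common difference `d` on which `slope * x` varies by at most `N / r` modulo `N`, which is the diameter bound claimed in the lemma.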

In the next lemma, we apply Weyl’s Theorem (Theorem 3.10):

Lemma 7.12. Let $m \in [N]$, let $\phi : \mathbb{Z}_N \to \mathbb{Z}_N$ be a quadratic function and let $P$ be a $\mathbb{Z}_N$-arithmetic progression of length $m$. Then for any $l \le m^{1/(18 \cdot 128)}$, $P$ can be partitioned into subprogressions $P_i$, $i \ge 1$, of lengths $l$ or $l - 1$, with $\operatorname{diam} \phi(P_i) \le C m^{-1/(6 \cdot 128)} N$.

Proof. Suppose $\phi(x) = ax^2 + bx + c$ and $P = [m]$. Choose $d \le m^{1/2}$ such that, modulo $N$, $|ad^2| \le m^{-1/128} N$: this is possible, by Theorem 3.10 with $k = 2$. Let $t \le m^{1/(3 \cdot 128)}$ and $Q_i = \{x, x + d, \ldots, x + (t - 1)d\}$. Then $\phi(x + td) - \phi(x) = (2axd + bd)t + ad^2t^2$. We note that $|ad^2t^2| \le C m^{-1/(3 \cdot 128)} N$ modulo $N$. Applying Lemma 7.11 to $Q_i$, with $r = m^{1/(6 \cdot 128)}$, for $l \le m^{1/(18 \cdot 128)}$, $Q_i$ can be partitioned into subprogressions $R_{ij}$ of sizes $l$ or $l - 1$ with $\operatorname{diam} \phi(R_{ij}) \le N/r = C m^{-1/(6 \cdot 128)} N$. Considering a partition of $P$ into $Q_i$s, the $R_{ij}$ form the required arithmetic progressions. □

Lemma 7.13. Let $\phi : \mathbb{Z}_N \to \mathbb{Z}_N$ be a quadratic polynomial and $r \le N$. Then there exists $m \le C r^{1 - 1/(18 \cdot 128)}$ such that $[0, r - 1]$ can be partitioned into arithmetic progressions $P_1, P_2, \ldots, P_m$, of lengths differing by at most 1, and such that, if $f : \mathbb{Z}_N \to D$ is any function with
\[ \Big| \sum_{x=0}^{r-1} f(x)\, \omega^{-\phi(x)} \Big| \ge \eta r , \]
then
\[ \sum_{j=1}^{m} \Big| \sum_{x \in P_j} f(x) \Big| \ge \eta r/2 . \]

Proof. By Lemma 7.12, we find $P_1, P_2, \ldots, P_m$ such that $\operatorname{diam} \phi(P_i) \le C N r^{-1/(6 \cdot 128)}$. Provided $N$ is sufficiently large, this is at most $\eta N/4\pi$. By the triangle inequality,
\[ \sum_{j=1}^{m} \Big| \sum_{x \in P_j} f(x)\, \omega^{-\phi(x)} \Big| \ge \eta r . \]
Let $x_j \in P_j$. The estimate on the diameter of $\phi(P_j)$ implies that $|\omega^{-\phi(x)} - \omega^{-\phi(x_j)}| \le \eta/2$ for all $x \in P_j$. So
\[ \begin{aligned}
\sum_{j=1}^{m} \Big| \sum_{x \in P_j} f(x) \Big|
 &= \sum_{j=1}^{m} \Big| \sum_{x \in P_j} f(x)\, \omega^{-\phi(x_j)} \Big| \\
 &\ge \sum_{j=1}^{m} \Big| \sum_{x \in P_j} f(x)\, \omega^{-\phi(x)} \Big| - \sum_{j=1}^{m} (\eta/2)|P_j| \ge \eta r/2 .
\end{aligned} \]
This completes the proof. □

Szemerédi's Theorem. There exists an absolute constant $c > 0$ such that if $A \subset [N]$, $|A| = \delta N$ and $\delta \ge (\log\log\log N)^{-c}$, then $A$ contains an arithmetic progression of length four.

Proof. Regard $A$ as a subset of $\mathbb{Z}_N$. If $A$ is quadratically $\alpha$-uniform with $\alpha = \delta^{32}/2^{88}$, then the theorem is proved, by Corollary 7.6. Let $f(x) = A(x) - \delta$ and suppose $f$ is not quadratically $\alpha$-uniform. By Lemma 7.7, there exists a set $B$ of cardinality at least $\alpha N/2$ and a function $\phi : B \to \mathbb{Z}_N$ such that
\[ \sum_{k \in B} |\Delta(f; k)^{\wedge}(\phi(k))|^2 \ge (\alpha/2)^2 N^3 . \]
By Lemma 7.8, $\phi$ has at least $(\alpha/2)^8 N^3$ additive quadruples and so, by Lemma 7.9, there exists an arithmetic progression $P$ with $|P| \ge N^{\gamma}$ and

\[ \sum_{k \in P} |\Delta(f; k)^{\wedge}(2\lambda k + \mu)|^2 \ge \eta |P| N^2 . \]
By Ruzsa's proof of Freiman's Theorem, we may choose $\gamma = \alpha^{K}$ and $\eta \ge \exp(-\alpha^{-K})$, where $K > 0$ is an absolute constant. By Lemma 7.10, we then have

\[ \sum_i \Big| \sum_{x \in P_i} f(x)\, \omega^{-\lambda x^2 - r_i x} \Big| \ge \eta N/2 , \]
where the $P_i$ are as in Lemma 7.10. Apply Lemma 7.13 in each $P_i$ to obtain further progressions $P_{ij}$, of cardinalities differing by at most 1, and with average lengths $C|P|^{1/(18 \cdot 128)}$ (for some constant $C > 0$) and such that

\[ \sum_i \sum_{j=1}^{m} \Big| \sum_{x \in P_{ij}} f(x) \Big| \ge \eta N/4 . \]
A consequence of Lemma 7.12 is that we can insist that the $P_{ij}$ are $\mathbb{Z}$-arithmetic progressions, except that (by Lemma 2.3) the average length of $P_{ij}$ is $C|P|^{1/(2 \cdot 18 \cdot 128)}$, where $C > 0$ is a constant and no $P_{ij}$ has more than twice this length.

Relabel the $P_{ij}$s as $Q_1, Q_2, \ldots, Q_M$, where $M = N^{1 - \gamma/(2 \cdot 18 \cdot 128)}$ and the $Q_i$ have average length $N^{\gamma/(2 \cdot 18 \cdot 128)}$. As $\sum_x f(x) = 0$, we have
\[ \sum_i \Big( \Big| \sum_{x \in Q_i} f(x) \Big| + \sum_{x \in Q_i} f(x) \Big) \ge \eta N/4 . \]
The contribution of $Q_i$ with $|Q_i| \le \sqrt{N/M}$ is at most $2N/\sqrt{M} \le \eta N/8$, therefore there exists $Q_i$ with $|Q_i| \ge \sqrt{N/M}$ such that $\big| \sum_{x \in Q_i} f(x) \big| + \sum_{x \in Q_i} f(x) \ge \eta |Q_i|/8$. This implies that $\sum_{x \in Q_i} f(x) \ge \eta |Q_i|/16$.

So we have shown that there exists an arithmetic progression $Q$, of length at least $\sqrt{N/M} \ge N^{\gamma/(4 \cdot 18 \cdot 128)} = N^{\delta^{c}}$, such that $|A \cap Q| \ge (\delta + \exp[-\delta^{-c}])|Q|$, where $c > 0$ is a constant. Rewriting this in terms of $\delta$, a four-term arithmetic progression must be found when $\delta \ge (\log\log\log N)^{-c}$ for some $c > 0$. □

References

[1] A. Balog and E. Szemerédi, A statistical theorem of set addition, Combinatorica 14(3) (1994), 263–268.
[2] Y. Bilu, Structure of sets with small sumset, Structure Theory of Set Addition, Astérisque 258 (1999), 77–108.
[3] N. N. Bogolyubov, Zap. Kafedry Mat. Fizi. 4 (1939), 185.
[4] P. Erdős and P. Turán, On some sequences of integers, J. London Math. Soc. 11 (1936), 261–264.
[5] G. R. Freiman, Foundations of a Structural Theory of Set Addition, Translations of Mathematical Monographs 37, Amer. Math. Soc., Providence, R.I., USA.
[6] H. Furstenberg, Ergodic behaviour of diagonal measures and a theorem of Szemerédi on arithmetic progressions, J. Analyse Math. 31 (1977), 204–256.
[7] W. T. Gowers, A new proof of Szemerédi's theorem for arithmetic progressions of length four, Geom. Funct. Anal. 8(3) (1998), 529–551.
[8] A. W. Hales and R. I. Jewett, Regularity and positional games, Trans. Amer. Math. Soc. 106 (1963), 222–229.
[9] D. R. Heath-Brown, Integer sets containing no arithmetic progressions, J. London Math. Soc. (2) 35 (1987), 385–394.
[10] N. M. Korobov, Exponential Sums and their Applications, Mathematics and its Applications 80, Kluwer (1992).
[11] K. F. Roth, On certain sets of integers, J. London Math. Soc. 28 (1953), 245–252.
[12] I. Ruzsa, Generalized arithmetic progressions and sumsets, Acta Math. Hungar. 65 (1994), 379–388.
[13] I. Ruzsa, An analog of Freiman's Theorem for abelian groups, Structure Theory of Set Addition, Astérisque 258 (1999), 323–326.
[14] S. Shelah, Primitive recursive bounds for van der Waerden Numbers, J. Amer. Math. Soc. 1(3) (1988), 683–697.
[15] C. F. Siegel, Lectures on Geometric Number Theory.
[16] E. Szemerédi, On sets of integers containing no four elements in arithmetic progression, Acta Math. Acad. Sci. Hungar. 20 (1969), 89–104.
[17] E. Szemerédi, On sets of integers containing no k elements in arithmetic progression, Acta Arith. Hungar. 27 (1975), 299–345.
[18] E. Szemerédi, Integer sets containing no arithmetic progressions, Acta Math. Hungar. 56 (1990), 155–158.
[19] G. Tenenbaum, Introduction to analytic and probabilistic number theory, Cam. Studies in Advanced Math. 46, Cam. Univ. Press (1995).
[20] B. L. van der Waerden, Beweis einer Baudetschen Vermutung, Nieuw Arch. Wisk. 15 (1927), 212–216.
[21] R. C. Vaughan, The Hardy-Littlewood Method, 2nd Ed., Cam. Tracts in Math. 125, Cam. Univ. Press (1997).
[22] I. M. Vinogradov, The method of trigonometrical sums in the theory of numbers, Trav. Inst. Math. Steklof 23 (1947).
[23] H. Weyl, Über die Gleichverteilung von Zahlen mod Eins, Math. Annalen 77 (1913), 313–352.

Notation

$A$    general integer set
$A + B$    sum set $\{a + b : a \in A, b \in B\}$
$\|\alpha\|$    distance from $\alpha$ to the nearest integer
$\{\alpha\}$    fractional part of $\alpha$
$A^*$    set of subset sums $\{\sum \varepsilon_i a_i : \varepsilon_i \in \{0, 1\}, a_i \in A\}$
$B(K, \delta)$    Bohr neighbourhood
$\mathbb{C}$    field of complex numbers
$c(q)$    $\min\{c : |A| \ge c, A \subset \mathbb{Z}_q \Rightarrow A^* = \mathbb{Z}_q\}$
$c, C$    constant
$\Gamma$    graph of a function
$\Gamma(a)$    neighbourhood of a vertex $a$
$d$    dimension of arithmetic progression
$D$    unit disc in $\mathbb{C}$
$D_i(G)$    $i$th magnification ratio
$\det(\Lambda)$    determinant of lattice $\Lambda$
$\Delta(f; k)(r)$    $f(r)\overline{f(r - k)}$
$e(\alpha)$    $\exp(2\pi i \alpha)$
$E$    expectation
$\delta$    density
$f * g$    convolution
$\hat f$    Fourier transform of $f$
$\|g\|_2$    $\ell^2$-norm of function $g$
$G \times H$    product of layered graphs
$HJ(k, r)$    Hales-Jewett numbers
$i, j, k$    counting variables
$\mathrm{Im}_i(Y)$    image of $Y$ in $i$th layer
$kA$    $k$-fold sum $A + A + \ldots + A$ of $A$
$K$    convex body or absolute constant
$K(t)$    $t$th cross-section of body $K$
$\Lambda(x)$    von Mangoldt's function
$(m, n]$    integers greater than $m$ and less than or equal to $n$
$\mu(x)$    Möbius function
$n_k(N)$    number of elements required in $[N]$ for a $k$-term progression
$N$    large integer
$[N]$    $\{1, 2, \ldots, N\}$
$\mathbb{N}$    natural numbers
$\mathrm{Prob}$    probability
$\mathbb{R}$    real numbers
$\tau(n)$    divisor function
$\mathrm{vol}(K)$    volume of $K$
$W(k, r)$    van der Waerden numbers
$x \oplus jA$    Hales-Jewett line
$\mathbb{Z}$    set of integers
$\omega^r$    $\exp(2\pi i r/N)$
