
Faculty of Sciences Department of Mathematics

Exponential sums and applications in number theory and analysis

Frederik Broucke

Supervisor: Prof. dr. J. Vindas

Master’s thesis submitted in partial fulfilment of the requirements for the degree of Master of Science in Mathematics

Academic year 2017–2018

Preface

The original idea for this thesis was to study Helfgott's proof of the ternary Goldbach conjecture [16]. It soon became clear to me that this would be a monumental task, given the size of the proof (well over 300 pages). I therefore decided to study instead the basic principles of the Hardy-Littlewood or circle method, the technique that forms the backbone of Helfgott's proof and that occupies a very important place in additive number theory in general. To this end, I read several chapters of the book “The Hardy-Littlewood method” by R.C. Vaughan [37] during the first semester. This is probably the most difficult mathematical text I have read so far; the scarcity of intermediate steps, the lack of detail, and phrases such as “one easily sees that” were often frustrating and demotivating. Nevertheless I persevered, and in hindsight I am truly glad that I did. Not only did I learn a great deal about the subject, I also feel that I have become better and more fluent at reading (difficult) mathematical texts in general. After I had read this book, my supervisor, Professor Vindas, gave me the task of applying the ideas and techniques of the circle method to the study of Riemann's function, a “pathological” continuous function that exhibits very irregular pointwise behaviour. The combination of these two topics eventually led to the choice of exponential sums as the subject of the thesis, with a strong emphasis on the applications to the circle method and to the study of Riemann's function.

Personally, I believe that a mathematical proof can be understood on two levels. On the one hand, one can understand each transition or step separately, seeing why line n + 1 follows from line n for every n. On the other hand (and perhaps more importantly), one can understand the proof on a global level: what are the underlying ideas and motivations, why is something done in this way, why does it work? In my thesis I have tried to shed sufficient light on both levels. I have tried to prove as many results as possible rigorously and completely. In doing so, it is virtually impossible to avoid (sometimes tedious) technical details. Still, I hope that the technicalities do not obscure the underlying ideas, and that by reading this thesis the reader can gain insight into, and appreciation for, the beautiful subject of exponential sums and their applications to many fascinating problems.

I deliberately chose to write this master's thesis in English. This seems fitting to me for the master's thesis as a (small-scale) piece of scientific research, since the vast majority of the scientific literature is written in English.

Finally, I would like to thank my supervisor, Professor Vindas. He supported me with advice, helped me with problems during the reading and the writing, and provided useful references. At the same time he gave me enough freedom to do my own thing, which I greatly appreciate. Finally, he made a lot of time in his busy schedule to proofread the numerous preliminary versions meticulously.

The author gives permission to make this master's thesis available for consultation and to copy parts of it for personal use. Any other use is subject to the restrictions of copyright law, in particular with regard to the obligation to state the source explicitly when citing results from this master's thesis.

Frederik Broucke
30 May 2018

Contents

Preface

List of symbols

1 Introduction

2 Exponential sums
    2.1 Some elementary estimates
    2.2 Characters and Gauss sums
        2.2.1 Ramanujan sums
        2.2.2 Separable Gauss sums and primitive characters
        2.2.3 Quadratic Gauss sums
    2.3 kth-power Gauss sums
        2.3.1 kth-powers modulo $p^l$
        2.3.2 The exponential sums S(q, a) and S(q, a, b)
    2.4 Weyl sums
        2.4.1 Weyl’s method
        2.4.2 Vinogradov’s method
        2.4.3 Vinogradov’s mean value theorem

3 The Hardy-Littlewood method
    3.1 Generalities
    3.2 Waring’s problem
        3.2.1 Approximating the generating function
        3.2.2 The singular series
        3.2.3 The singular integral
        3.2.4 The contribution from the major arcs
        3.2.5 The contribution from the minor arcs
    3.3 The ternary Goldbach problem
        3.3.1 The contribution from the major arcs
        3.3.2 The contribution from the minor arcs

4 The Vinogradov-Korobov zero-free region for ζ

5 Riemann’s non-differentiable function
    5.1 Introduction
    5.2 Behaviour at rational points
    5.3 Behaviour at irrational points
        5.3.1 The upper bound for α(ρ)
        5.3.2 The lower bound for α(ρ)

6 Conclusion

A Summary in Dutch

B Popular summary
    B.1 Detecting solutions of equations

C Additional theorems
    C.1 Diophantine approximation
    C.2 The Poisson summation formula
    C.3 The continuous wavelet transform

List of symbols

Symbol    Description

$\mathbb{N}$    The set of natural numbers including zero, $\{0, 1, 2, \dots\}$.
$\mathbb{Z}$    The set of integers.
$\mathbb{Q}$    The set of rational numbers.
$\mathbb{R}$    The set of real numbers.
$\mathbb{C}$    The set of complex numbers.
$R^\times$    The unit group of a ring $R$.
$d \mid n$    $d$ divides $n$.
$p^\alpha \,\|\, n$    $p^\alpha$ exactly divides $n$; $p^\alpha \mid n$ and $p^{\alpha+1} \nmid n$.
$(n, m)$    The greatest common divisor of $n$ and $m$.
$d(n)$    The number of positive divisors of $n$.
$\omega(n)$    The number of distinct prime factors of $n$.
$\mu(n)$    The Möbius function; $\mu(n) = (-1)^{\omega(n)}$ if $n$ is square-free, $\mu(n) = 0$ otherwise.
$\varphi(n)$    Euler’s totient function; the number of $a$, $1 \le a \le n$, for which $(a, n) = 1$.
$\Lambda(n)$    The von Mangoldt function; $\Lambda(n) = \log p$ if $n = p^\alpha$ for some prime $p$, $\Lambda(n) = 0$ otherwise.
$\varepsilon(n)$    The unit function for Dirichlet convolution; $\varepsilon(1) = 1$ and $\varepsilon(n) = 0$ if $n \ne 1$.
$\vartheta(x)$    The first Chebyshev function; $\vartheta(x) = \sum_{p \le x} \log p$.
$\psi(x)$    The second Chebyshev function; $\psi(x) = \sum_{n \le x} \Lambda(n)$.
$\pi(x)$    The prime counting function; $\pi(x) = \sum_{p \le x} 1$.
$\operatorname{ind}_g a$    The index of $a$. If $g$ is a primitive root mod $q$, and $(a, q) = 1$, then $\operatorname{ind}_g a$ is the unique $m$ mod $\varphi(q)$ such that $a \equiv g^m \bmod q$.
$e(z)$    The complex exponential with period 1; $e(z) = \exp(2\pi i z)$.
$G(n, \chi)$    The Gauss sum associated with $n$ and the character $\chi$ mod $q$: $G(n, \chi) = \sum_{m=1}^{q} \chi(m) e(mn/q)$.
$c_q(n)$    The Ramanujan sum: $c_q(n) = \sum_{m=1,\, (m,q)=1}^{q} e(mn/q)$.
$S_k(q, a)$    The $k$th-power Gauss sum: $\sum_{m=1}^{q} e\!\left(\frac{a m^k}{q}\right)$.
$S_k(q, a, b)$    A variant of the $k$th-power Gauss sum: $\sum_{m=1}^{q} e\!\left(\frac{a m^k + b m}{q}\right)$.
$S_f(M)$    Weyl sums; $S_f(M) = \sum_{m=1}^{M} e(f(m))$.
$[x]$    The integer part of $x$; the unique integer such that $[x] \le x < [x] + 1$.
$\{x\}$    The fractional part of $x$; $\{x\} = x - [x]$.
$\lceil x \rceil$    The ceiling function of $x$; the unique integer such that $\lceil x \rceil - 1 < x \le \lceil x \rceil$.
$\|x\|$    The distance from $x$ to the nearest integer.

$f(x) = O(g(x))$    $|f(x)| \le C g(x)$ for some absolute constant $C$.
$f(x) = o(g(x))$    $\lim f(x)/g(x) = 0$.
$f(x) = \Omega(g(x))$    The negation of $f(x) = o(g(x))$.
$f(x) \ll g(x)$    $f(x) = O(g(x))$.
$f(x) \gg g(x)$    $g(x) = O(f(x))$, $g$ non-negative.
$f(x) \sim g(x)$    $\lim f(x)/g(x) = 1$.
$f(x) \asymp g(x)$    $f(x) \ll g(x)$ and $f(x) \gg g(x)$.

Concerning the asymptotic notations: the range in which the inequalities or limits hold is usually clear from the context; if needed it will be specified, e.g.
\[ f(x) = o(g(x)) \quad \text{as } x \to 0 \]
means that
\[ \lim_{x \to 0} \frac{f(x)}{g(x)} = 0. \]
If the implicit constant in the notation is not absolute, but depends on some additional parameter, this will be indicated by a subscript, e.g.
\[ f(x) \ll_k g(x), \]
where $f$ and $g$ are functions which also depend on some additional parameter $k$, means that there is a constant $C_k$, depending only on $k$, such that $|f(x)| \le C_k g(x)$.

We use the convention that for a complex number $z$, its argument $\arg z$ is a number in the interval $]-\pi, \pi]$.

We use the notation $\hat{f}$ for the Fourier transform of a function $f$, defined as follows:
\[ \hat{f}(y) = \int_{\mathbb{R}} f(x) e^{-ixy} \, dx. \]

Sometimes (e.g. in summations), the expression $\min(X, 1/0)$ will occur; this will be taken to be $X$.

Chapter 1

Introduction

We denote the complex exponential with period 1 by $e$, that is, $e(z) = \exp(2\pi i z)$ for all $z \in \mathbb{C}$. This thesis is devoted to the study of exponential sums, namely sums of the form
\[ \sum_{n=1}^{N} e(f(n)), \qquad f \text{ real-valued}, \]
and their applications. Except for some special cases, these sums cannot be evaluated explicitly. The first objective of this thesis is to study means to establish upper bounds for exponential sums which are sharper than the trivial bound
\[ \left| \sum_{n=1}^{N} e(f(n)) \right| \le N. \]
This requires proving that some cancellation occurs within the sum, i.e. that the arguments of the exponentials, the $f(n)$, do not all have the same value mod 1. This is generally a very difficult task, and proofs of such sharper bounds are often delicate and subtle. Quite often, the gains over the trivial estimate $N$ are very small, for example a factor $N^{-\delta}$ for a tiny positive $\delta$, or even smaller. These small gains can nonetheless have vast ramifications in applications.

The second objective of this thesis is to investigate some of the applications of exponential sums in specific problems in number theory and analysis.

In Chapter 2, some general methods for estimating exponential sums will be investigated. After some introductory examples in Section 2.1, we will prove some basic properties of character and Gauss sums in Section 2.2. These sums are ubiquitous in number theory. We devote special attention to quadratic Gauss sums and their generalisations, the kth-power Gauss sums, which will be treated in Section 2.3. In the last section of this chapter, Section 2.4, we will discuss Weyl sums, exponential sums in which the argument f is a polynomial or, more generally, a smooth function. We will present two methods for obtaining estimates for them, Weyl’s method and Vinogradov’s method. One of the essential ingredients in Vinogradov’s method, Vinogradov’s mean value theorem, will also be treated.

The next chapters of the thesis are devoted to applications in number theory and analysis of the results obtained in Chapter 2.


In Chapter 3, we will explore an important technique in additive number theory, the Hardy-Littlewood method. After a historical introduction and an outline of the general ideas in Section 3.1, we will consider its application to two problems: Waring’s problem (Section 3.2) and the ternary Goldbach problem (Section 3.3).

In Chapter 4 we will deduce the Vinogradov-Korobov zero-free region for the Riemann zeta function ζ. Although this result is about sixty years old, it is to this day the asymptotically best zero-free region for ζ (but unfortunately nowhere near the hoped-for zero-free region which would follow from the Riemann Hypothesis). The implications for the error term in the Prime Number Theorem are also stated.

Finally, Chapter 5 is devoted to the study of Riemann’s so-called “non-differentiable” function. In particular, we will prove at which points it is differentiable and at which points it is not, and moreover we will determine the Hölder exponent at every point.

Chapters 2–4 are the result of a literature study and are heavily based on the books by Vaughan [37], and Iwaniec and Kowalski [23]. Specifically, we list the primary sources per section in Table 1.1.

Chapter 5 contains some original results (or rather a new method of proving already existing results), but still relies heavily on other works, especially the articles by Duistermaat [4] and Jaffard [24].

Table 1.1: Primary sources per section

Section    Source
2.1        [29, Section 2]
2.2        [1, Chapter 8]
2.3        [37, Section 4.2]
2.4        [37, Sections 2.2, 5.1 and 5.2] and [23, Section 8.5]
3.2        [37, Chapter 4 and Section 5.3]
3.3        [37, Chapter 3] and [23, Sections 13.4 and 13.5]
4          [23, Section 8.5]
5.2        [4]
5.3        [4] and [24]

Chapter 2

Exponential sums

2.1 Some elementary estimates

To get some intuition and a taste of exponential sums, we begin our exposition with some elementary estimates. First, we consider the simple but very important example of an exponential sum over an arithmetic progression. For these sums we have the following upper bound.

Proposition 2.1.1. Let $\alpha, \beta$ be real numbers and $M$ a positive integer. Then
\[ \left| \sum_{m=1}^{M} e(\alpha m + \beta) \right| \le \min\left( M, \frac{1}{2\|\alpha\|} \right), \]
where $\|\alpha\|$ denotes the distance from $\alpha$ to the nearest integer.

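Proof. The upper bound $M$ follows from the triangle inequality. If $\alpha$ is an integer, the minimum equals $M$ by the convention on $\min(X, 1/0)$, so assume that $\alpha$ is not an integer. By the formula for the partial sums of a geometric series,
\[ \left| \sum_{m=1}^{M} e(\alpha m + \beta) \right| = \left| e(\alpha + \beta) \frac{1 - e(\alpha M)}{1 - e(\alpha)} \right| \le \frac{2}{|1 - e(\alpha)|}. \]
Without loss of generality, we may replace $\alpha$ by $\|\alpha\|$, since we may change the sign of $\alpha$ and add an arbitrary integer to it without altering the value of $|1 - e(\alpha)|$. By elementary trigonometry, $|1 - e(\|\alpha\|)| = 2\sin(\pi\|\alpha\|)$, and by Jordan’s inequality $\sin(\pi\|\alpha\|) \ge (2/\pi)\pi\|\alpha\| = 2\|\alpha\|$, since $\pi\|\alpha\| \in [0, \pi/2]$. Hence $|1 - e(\alpha)| \ge 4\|\alpha\|$, which gives the bound $1/(2\|\alpha\|)$.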
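As a numerical sanity check of Proposition 2.1.1, the following Python sketch compares the absolute value of the sum with $\min(M, 1/(2\|\alpha\|))$. The function names are illustrative choices that do not appear in the text, and floating-point arithmetic can only confirm the inequality up to rounding error.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1: e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * math.pi * z)


def dist_to_nearest_int(x: float) -> float:
    """||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))


def progression_sum(alpha: float, beta: float, M: int) -> complex:
    """The sum of e(alpha*m + beta) over m = 1, ..., M."""
    return sum(e(alpha * m + beta) for m in range(1, M + 1))


def progression_bound(alpha: float, M: int) -> float:
    """min(M, 1/(2||alpha||)), read as M when alpha is an integer."""
    d = dist_to_nearest_int(alpha)
    return float(M) if d == 0 else min(float(M), 1 / (2 * d))


if __name__ == "__main__":
    # Print |sum| next to the bound for a few sample values of alpha.
    for alpha in (math.sqrt(2), 0.001, 1 / 3):
        total = abs(progression_sum(alpha, 0.3, 50))
        print(alpha, total, progression_bound(alpha, 50))
```

For $\alpha$ very close to an integer the bound degenerates to the trivial one, which matches the intuition that almost no cancellation occurs in that case.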

If one has $M$ real numbers $x_1, \dots, x_M$, then one can picture the partial sums $s_m = e(x_1) + \dots + e(x_m)$ as part of a walk starting in 0, with step length 1, and where the direction of the $m$th step is determined by the fractional part $\{x_m\}$ of $x_m$. Intuitively, if the values of the fractional parts of the $x_m$ are close together, the walk will be more or less in one general direction, and the resulting sum will be large. If the fractional parts are more evenly distributed mod 1, the direction of each step will appear more “random”, and we expect the resulting sum to be smaller¹. This idea is captured in the following theorem, taken from [29].

¹ One can show that for a random walk with $M$ steps in two dimensions the expected value of the square of the distance from the endpoint to the starting point is $M$, while the expected value of the distance itself is $\sim \frac{\sqrt{\pi}}{2}\sqrt{M}$.

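Theorem 2.1.2 (Kusmin-Landau). Let $x_1, \dots, x_M$ be real numbers and set $\delta_m = x_{m+1} - x_m$. Suppose there is a positive $\Delta$ such that $\Delta < \delta_1 \le \dots \le \delta_{M-1} < 1 - \Delta$. Then
\[ \left| \sum_{m=1}^{M} e(x_m) \right| \le \cot\frac{\pi\Delta}{2}. \]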

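Proof. Let $z_m = e(x_m)$, $w_m = z_{m+1}/z_m = e(\delta_m)$ and $\rho_m = 1/(1 - w_m)$. Then
\[ \sum_{m=1}^{M} e(x_m) = \sum_{m=1}^{M-1} \rho_m (z_m - z_{m+1}) + z_M. \]
By partial summation this equals
\[ \rho_1 z_1 + \sum_{m=2}^{M-1} (\rho_m - \rho_{m-1}) z_m + (1 - \rho_{M-1}) z_M, \]
so that
\[ \left| \sum_{m=1}^{M} e(x_m) \right| \le |\rho_1| + \sum_{m=2}^{M-1} |\rho_m - \rho_{m-1}| + |1 - \rho_{M-1}|. \]
If $\rho = 1/(1 - w)$ and $w = e(\delta)$ with $0 < \delta < 1$, then $\rho = (1 + i\cot\pi\delta)/2$ and $|\rho| = |1 - \rho| = 1/(2\sin\pi\delta)$. Since $\cot$ is decreasing on $]0, \pi[$, the above is
\[
\begin{aligned}
&\le \frac{1}{2\sin\pi\delta_1} + \frac{1}{2}\sum_{m=2}^{M-1} (\cot\pi\delta_{m-1} - \cot\pi\delta_m) + \frac{1}{2\sin\pi\delta_{M-1}} \\
&= \frac{1}{2}\left( \frac{1}{\sin\pi\delta_1} + \cot\pi\delta_1 - \cot\pi\delta_{M-1} + \frac{1}{\sin\pi\delta_{M-1}} \right) \\
&\le \frac{1}{\sin\pi\Delta} + \cot\pi\Delta = \cot\frac{\pi\Delta}{2}.
\end{aligned}
\]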
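The Kusmin-Landau bound can be illustrated numerically. The sketch below, in Python, checks the hypothesis on the gaps $\delta_m$ and compares the sum with $\cot(\pi\Delta/2)$ for the sample sequence $x_m = 0.1m + 0.002m^2$; this sequence and all function names are assumptions made for the illustration and are not taken from the text.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def gaps(xs):
    """The differences delta_m = x_{m+1} - x_m."""
    return [b - a for a, b in zip(xs, xs[1:])]


def satisfies_hypothesis(xs, Delta: float) -> bool:
    """Check Delta < delta_1 <= ... <= delta_{M-1} < 1 - Delta."""
    d = gaps(xs)
    nondecreasing = all(a <= b for a, b in zip(d, d[1:]))
    return nondecreasing and Delta < d[0] and d[-1] < 1 - Delta


def exponential_sum(xs) -> complex:
    """The sum of e(x_m) over the whole sequence."""
    return sum(e(x) for x in xs)


def kusmin_landau_bound(Delta: float) -> float:
    """cot(pi * Delta / 2)."""
    return 1 / math.tan(math.pi * Delta / 2)


if __name__ == "__main__":
    # Sample sequence with slowly increasing gaps between 0.106 and 0.498.
    xs = [0.1 * m + 0.002 * m * m for m in range(1, 101)]
    print(satisfies_hypothesis(xs, 0.1))
    print(abs(exponential_sum(xs)), kusmin_landau_bound(0.1))
```

Note that the bound does not depend on the number of terms $M$, only on how far the gaps stay away from the integers.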

Corollary 2.1.3. Let $f$ be a real-valued continuous function on $[a, b]$, differentiable on $]a, b[$ such that $f'$ is increasing. Suppose that $\|f'(x)\| \ge \Delta$ for all $x \in \,]a, b[$, for some positive $\Delta$. Then
\[ \left| \sum_{a \le m \le b} e(f(m)) \right| \le \frac{2}{\pi\Delta}. \]

Proof. Let $N$ be an integer chosen so that $N < f'(x) < N + 1$ for all $x \in \,]a, b[$. Such an integer exists in view of Darboux’s theorem ($f'$ has the intermediate value property). If we replace $f(x)$ by $f(x) - Nx$, the sum is unchanged and $\Delta \le f'(x) \le 1 - \Delta$. We can apply the previous theorem with $x_m = f(m)$. We have $\delta_m = f(m + 1) - f(m) = f'(\xi_m)$ for some $\xi_m \in \,]m, m + 1[$ by the mean value theorem. Since $f'$ is increasing, the hypothesis on the $\delta_m$ is indeed satisfied. The result follows by noting that $\cot y < 1/y$ for $y \in \,]0, \pi/2]$.

2.2 Characters and Gauss sums

We assume the reader is acquainted with the basics of (Dirichlet) characters, but we will restate the important properties. For an introduction to characters, see for example [1, Chapter 6].

Definition 2.2.1. Suppose $(G, \cdot)$ is a group. A character $\chi$ of $G$ is a morphism of $G$ to the multiplicative group of the complex numbers: $\chi : (G, \cdot) \to (\mathbb{C}^\times, \cdot)$.

It is easily seen that for any character $\chi$ of $G$, $\chi(e) = 1$, where $e$ is the identity of $G$, and that for any element $g \in G$ of finite order $n$, $\chi(g)$ is an $n$th root of unity. The character which maps every element to 1 is called the principal character, and is denoted by $\chi_0$.

Example 2.2.2. If $G = \langle g \rangle$ for some $g$ of finite order $n$, then the characters of $G$ are given by $\chi_k : g^m \mapsto e(km/n)$, for $k = 0, \dots, n - 1$.

Define $\hat{G}$ to be the set of all characters of $G$. We can make $\hat{G}$ into a group by defining the multiplication of two characters $\chi_1, \chi_2$ pointwise: $(\chi_1 \cdot \chi_2)(g) = \chi_1(g)\chi_2(g)$; the identity element of $\hat{G}$ is the principal character, and for a character $\chi$, its inverse is given by the character $1/\chi$ which maps $g$ to $1/\chi(g)$. The set $\hat{G}$ with this multiplication is called the character group of $G$. Using the above example, and the classification of finite abelian groups, it is easy to show that every finite abelian group is isomorphic to its character group. The following theorem is one of the main reasons why characters are so useful.

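Theorem 2.2.3 (Orthogonality relations for characters). Suppose $G$ is a finite abelian group of order $n$. Then for any $\chi \in \hat{G}$ and for any $g \in G$ we have
\[ \sum_{g \in G} \chi(g) = \begin{cases} n & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise;} \end{cases} \qquad \sum_{\chi \in \hat{G}} \chi(g) = \begin{cases} n & \text{if } g = e, \\ 0 & \text{otherwise.} \end{cases} \]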

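The orthogonality relations imply that the characters form an orthogonal basis for the vector space of all functions $f : G \to \mathbb{C}$. Indeed, consider such a function $f$, and define its Fourier transform $\hat{f} : \hat{G} \to \mathbb{C}$ via
\[ \hat{f}(\chi) = \frac{1}{|G|} \langle f, \chi \rangle = \frac{1}{|G|} \sum_{g \in G} f(g) \overline{\chi(g)}. \]
Then $f = \sum_{\chi \in \hat{G}} \hat{f}(\chi)\chi$:
\[
\begin{aligned}
\sum_{\chi \in \hat{G}} \hat{f}(\chi)\chi(g) &= \sum_{\chi \in \hat{G}} \frac{1}{|G|} \sum_{h \in G} f(h) \overline{\chi(h)} \chi(g) \\
&= \sum_{h \in G} f(h) \frac{1}{|G|} \sum_{\chi \in \hat{G}} \chi(h^{-1}g) = f(g).
\end{aligned}
\]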

Example 2.2.4. Suppose $f$ is a periodic arithmetic function with period $q$. Then
\[ f(n) = \sum_{k=1}^{q} \hat{f}(k) e(kn/q), \qquad \text{where } \hat{f}(k) = \frac{1}{q} \sum_{n=1}^{q} f(n) e(-nk/q). \]
(Here, the character group of $(\mathbb{Z}/q\mathbb{Z}, +)$ is identified with the group $(\mathbb{Z}/q\mathbb{Z}, +)$ itself.)
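The inversion formula of Example 2.2.4 is the finite Fourier expansion on $\mathbb{Z}/q\mathbb{Z}$. As a small illustration, the Python sketch below computes the coefficients $\hat{f}(k)$ of a $q$-periodic function given by its values $f(1), \dots, f(q)$ and reconstructs $f$ from them. The helper names and the sample values are assumptions made for the example.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def fourier_coefficients(values):
    """Return [f^(1), ..., f^(q)] where values = [f(1), ..., f(q)]."""
    q = len(values)
    return [
        sum(values[n - 1] * e(-n * k / q) for n in range(1, q + 1)) / q
        for k in range(1, q + 1)
    ]


def reconstruct(coefficients, n: int) -> complex:
    """Evaluate the sum of f^(k) e(kn/q) over k = 1, ..., q."""
    q = len(coefficients)
    return sum(coefficients[k - 1] * e(k * n / q) for k in range(1, q + 1))


if __name__ == "__main__":
    values = [3, 1, 4, 1, 5, 9]  # sample values f(1), ..., f(6)
    coefficients = fourier_coefficients(values)
    # The reconstructed values agree with f up to rounding error.
    print([round(reconstruct(coefficients, n).real, 6) for n in range(1, 7)])
```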

In number theory, we are mainly interested in the so-called Dirichlet characters. For a positive integer $q$, a Dirichlet character mod $q$ is an arithmetical function $\chi$ such that there exists a character $\tilde{\chi}$ of the group of reduced residues mod $q$, $(\mathbb{Z}/q\mathbb{Z})^\times$, with the property that
\[ \begin{cases} \chi(n) = \tilde{\chi}(n + q\mathbb{Z}) & \text{if } (n, q) = 1, \\ \chi(n) = 0 & \text{if } (n, q) > 1. \end{cases} \]
A Dirichlet character is hence just the lift to $\mathbb{N}^+$ of a character of $(\mathbb{Z}/q\mathbb{Z})^\times$; often the distinction between a character of $(\mathbb{Z}/q\mathbb{Z})^\times$ and a $q$-periodic completely multiplicative arithmetical function $f$ with $f(n) \ne 0 \iff (n, q) = 1$ will not be made, and they will both be referred to as a Dirichlet character. Dirichlet characters satisfy the orthogonality relations
\[ \sum_{m=1}^{q} \chi(m) = \begin{cases} \varphi(q) & \text{if } \chi = \chi_0, \\ 0 & \text{otherwise;} \end{cases} \qquad \sum_{\chi \bmod q} \chi(n) = \begin{cases} \varphi(q) & \text{if } n \equiv 1 \bmod q, \\ 0 & \text{otherwise.} \end{cases} \]
Also, every $q$-periodic arithmetical function supported on the integers coprime with $q$ can be expressed as a sum over Dirichlet characters. It is therefore useful to study sums involving Dirichlet characters.

Definition 2.2.5. Suppose $q$ is a positive integer. Given an integer $n$ and a Dirichlet character $\chi$ mod $q$, the Gauss sum $G(n, \chi)$ associated with $n$ and $\chi$ is given by
\[ G(n, \chi) = \sum_{m=1}^{q} \chi(m) e(mn/q). \]
From the definition it is obvious that $G(n, \chi)$ is periodic in $n$ with period $q$. The Gauss sum $G(n, \chi)$ can be viewed as the inner product of the multiplicative character $\chi$ with the additive character $m \mapsto e(-mn/q)$, and hence as a discrete analogue of the Gamma function.

2.2.1 Ramanujan sums

We will first consider a special case of the Gauss sums, the Ramanujan sums.

Definition 2.2.6. Suppose $q$ and $n$ are positive integers. The Ramanujan sum $c_q(n)$ is the Gauss sum associated with the principal character mod $q$:
\[ c_q(n) = G(n, \chi_0) = \sum_{\substack{m=1 \\ (m,q)=1}}^{q} e(mn/q). \]

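Lemma 2.2.7. For fixed $n$, $c_q(n)$ is a multiplicative function of $q$.

Proof. Suppose $(q_1, q_2) = 1$. By the Chinese remainder theorem, for each $m$ mod $q_1 q_2$ there are unique $m_1$ mod $q_1$, $m_2$ mod $q_2$ such that $m \equiv m_1 q_2 + m_2 q_1 \bmod q_1 q_2$. Furthermore, $(m, q_1 q_2) = 1 \iff (m_1, q_1) = (m_2, q_2) = 1$. Therefore,
\[
\begin{aligned}
c_{q_1 q_2}(n) &= \sum_{\substack{m_1=1 \\ (m_1,q_1)=1}}^{q_1} \sum_{\substack{m_2=1 \\ (m_2,q_2)=1}}^{q_2} e\!\left( \frac{(m_1 q_2 + m_2 q_1)n}{q_1 q_2} \right) \\
&= \sum_{\substack{m_1=1 \\ (m_1,q_1)=1}}^{q_1} e\!\left( \frac{m_1 n}{q_1} \right) \sum_{\substack{m_2=1 \\ (m_2,q_2)=1}}^{q_2} e\!\left( \frac{m_2 n}{q_2} \right) = c_{q_1}(n) c_{q_2}(n).
\end{aligned}
\]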

Theorem 2.2.8.
\[ c_q(n) = \sum_{d \mid (n,q)} d\,\mu(q/d) = \frac{\mu(q/(q, n))}{\varphi(q/(q, n))} \varphi(q). \]

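Proof. Write $\varepsilon$ for the characteristic function of $\{1\}$. Then $\varepsilon = \mu * 1$, and we can eliminate the condition $(m, q) = 1$ via this relation:
\[
\begin{aligned}
c_q(n) &= \sum_{\substack{m=1 \\ (m,q)=1}}^{q} e(mn/q) = \sum_{m=1}^{q} \varepsilon((m, q)) e(mn/q) = \sum_{m=1}^{q} e(mn/q) \sum_{d \mid (m,q)} \mu(d) \\
&= \sum_{d \mid q} \mu(d) \sum_{k=1}^{q/d} e(kdn/q) = \sum_{\substack{d \mid q \\ (q/d) \mid n}} \mu(d)\, q/d = \sum_{d \mid (n,q)} d\,\mu(q/d).
\end{aligned}
\]
Evaluating at prime powers $p^l$ yields:
\[
c_{p^l}(n) = \begin{cases} 0 & \text{if } p^{l-1} \nmid n, \\ -p^{l-1} & \text{if } p^{l-1} \,\|\, n, \\ p^l - p^{l-1} & \text{if } p^l \mid n \end{cases}
\;=\; \frac{\mu(p^l/(p^l, n))}{\varphi(p^l/(p^l, n))} \varphi(p^l).
\]
Since both $c_q(n)$ and $\mu(q/(q, n))\varphi(q)/\varphi(q/(q, n))$ are multiplicative in $q$ and they coincide on prime powers, they are equal.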
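The three expressions for the Ramanujan sum in Theorem 2.2.8 can be compared by brute force. The following Python sketch does this with deliberately naive implementations of $\mu$ and $\varphi$; the function names are illustrative and the code is a check for small $q$ and $n$, not an efficient algorithm.

```python
import cmath
import math
from math import gcd


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def mobius(n: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a square divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


def phi(n: int) -> int:
    """Euler's totient function, counted directly."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def ramanujan_definition(q: int, n: int) -> complex:
    """c_q(n) as a sum over the reduced residues m mod q."""
    return sum(e(m * n / q) for m in range(1, q + 1) if gcd(m, q) == 1)


def ramanujan_divisor_sum(q: int, n: int) -> int:
    """The sum of d * mu(q/d) over the divisors d of (n, q)."""
    g = gcd(n, q)
    return sum(d * mobius(q // d) for d in range(1, g + 1) if g % d == 0)


def ramanujan_closed_form(q: int, n: int) -> int:
    """mu(q/(q,n)) * phi(q) / phi(q/(q,n))."""
    g = gcd(q, n)
    return mobius(q // g) * phi(q) // phi(q // g)


if __name__ == "__main__":
    for q, n in ((4, 2), (12, 8), (15, 5)):
        print(q, n, ramanujan_divisor_sum(q, n), ramanujan_closed_form(q, n))
```

In particular the sum is always an integer, which is not obvious from the definition as a sum of roots of unity.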

2.2.2 Separable Gauss sums and primitive characters

We now consider the Gauss sum $G(n, \chi)$ mod $q$ and suppose $(n, q) = 1$. Then we can write
\[ G(n, \chi) = \sum_{m=1}^{q} \chi(m) e(mn/q) = \sum_{m=1}^{q} \overline{\chi}(n)\chi(mn) e(mn/q) = \overline{\chi}(n) G(1, \chi). \tag{2.1} \]
In the last step we used that $mn$ runs over a complete residue system mod $q$ whenever $m$ does, since $n$ is invertible mod $q$. This nice property is called separability.

Definition 2.2.9. The Gauss sum $G(n, \chi)$ mod $q$ is called separable if $G(n, \chi) = \overline{\chi}(n) G(1, \chi)$.

Since χ(n) = 0 whenever (n, q) > 1, G(n, χ) is separable for every n if and only if G(n, χ) = 0 whenever (n, q) > 1. The absolute value of these Gauss sums is easily determined.

Theorem 2.2.10. Suppose $G(n, \chi)$ is separable for every $n$. Then $|G(1, \chi)| = \sqrt{q}$.

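Proof.
\[
\begin{aligned}
|G(1, \chi)|^2 &= G(1, \chi)\overline{G(1, \chi)} = G(1, \chi) \sum_{m=1}^{q} \overline{\chi}(m) e(-m/q) \\
&= \sum_{m=1}^{q} G(m, \chi) e(-m/q) = \sum_{m=1}^{q} \sum_{k=1}^{q} \chi(k) e(m(k - 1)/q) \\
&= \sum_{k=1}^{q} \chi(k) \sum_{m=1}^{q} e(m(k - 1)/q) = \chi(1) q = q.
\end{aligned}
\]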
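To see separability and the identity $|G(1, \chi)| = \sqrt{q}$ in a concrete case, the sketch below builds a character modulo an odd prime $p$ from a primitive root and evaluates the Gauss sums numerically. It relies on the standard fact that every non-principal character modulo a prime is primitive, so that the Gauss sums are separable. The construction via indices and all function names are choices made for this illustration.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def primitive_root(p: int) -> int:
    """Smallest primitive root modulo an odd prime p (brute force)."""
    for g in range(2, p):
        if len({pow(g, j, p) for j in range(p - 1)}) == p - 1:
            return g
    raise ValueError("no primitive root found; is p an odd prime?")


def character_mod_prime(p: int, k: int):
    """The character chi_k with chi_k(g^j) = e(k*j/(p-1)), extended by 0."""
    g = primitive_root(p)
    index = {pow(g, j, p): j for j in range(p - 1)}

    def chi(n: int) -> complex:
        r = n % p
        return 0j if r == 0 else e(k * index[r] / (p - 1))

    return chi


def gauss_sum(n: int, chi, q: int) -> complex:
    """G(n, chi): the sum of chi(m) e(mn/q) over m = 1, ..., q."""
    return sum(chi(m) * e(m * n / q) for m in range(1, q + 1))


if __name__ == "__main__":
    p = 13
    chi = character_mod_prime(p, 5)  # non-principal since 5 is not 0 mod 12
    print(abs(gauss_sum(1, chi, p)), math.sqrt(p))
```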

Next, we would like to determine which characters give rise to (everywhere) separable Gauss sums.

Lemma 2.2.11. Let $\chi$ be a Dirichlet character mod $q$ and suppose $G(n, \chi) \ne 0$ for some $n$ with $(n, q) > 1$. Then there exists an integer $d$ with $d \mid q$, $d < q$ such that
\[ \chi(a) = 1 \quad \text{whenever } (a, q) = 1 \text{ and } a \equiv 1 \bmod d. \]

Proof. Let $k = (n, q) > 1$ and $d = q/k$. Now suppose $a$ is a natural number with $(a, q) = 1$ and $a \equiv 1 \bmod d$. Since $a$ is invertible mod $q$, $am$ runs over a complete residue system mod $q$ whenever $m$ does. Hence
\[ G(n, \chi) = \sum_{m=1}^{q} \chi(am) e(amn/q) = \chi(a) G(an, \chi). \]
Since $a \equiv 1 \bmod d$, $a = 1 + bd = 1 + bq/k$ for some $b$. Now $anm/q = nm/q + bnm/k \equiv nm/q \bmod 1$, since $k \mid n$. Therefore,
\[ G(n, \chi) = \chi(a) G(an, \chi) = \chi(a) G(n, \chi), \]
and since $G(n, \chi) \ne 0$, $\chi(a) = 1$.

Definition 2.2.12. Suppose χ is a Dirichlet character mod q.

• A positive divisor d of q is called an induced modulus of χ if

χ(a) = 1 whenever (a, q) = 1 and a ≡ 1 mod d.

• The smallest induced modulus of χ is called the conductor of χ.

• χ is called primitive if its conductor equals q, i.e. if it has no induced modulus smaller than q.

Remark 2.2.13.

• 1 is an induced modulus for χ if and only if χ is the principal character. Therefore, every non-principal character modulo a prime p is primitive.

• One can show that, if $d$ is an induced modulus for $\chi$, there is a character $\psi$ mod $d$ which induces $\chi$, in the sense that $\chi = \psi\chi_0$, where $\chi_0$ is the principal character mod $q$. If $d$ is the conductor, then $\psi$ can be taken primitive mod $d$.

Lemma 2.2.11 implies the following theorem.

Theorem 2.2.14. Suppose $\chi$ is a primitive Dirichlet character mod $q$. Then $G(n, \chi)$ is separable for every $n$, and so $|G(1, \chi)| = \sqrt{q}$.

Remark 2.2.15.

• The converse of the above theorem also holds: if $G(n, \chi)$ is separable for every $n$, then $\chi$ is primitive.

• For general Gauss sums, there is the following theorem (see for example [30, page 290]). Let $\chi$ be a character mod $q$ induced by the primitive character $\psi$ modulo its conductor $d$. Put $r = q/(q, n)$. If $d \nmid r$, then $G(n, \chi) = 0$, and if $d \mid r$, then
\[ G(n, \chi) = \overline{\psi}(n/(q, n))\, \psi(r/d)\, \mu(r/d)\, \frac{\varphi(q)}{\varphi(r)}\, G(1, \psi). \]

2.2.3 Quadratic Gauss sums

Definition 2.2.16. Let $q, a, b, k$ be positive integers with $(a, q) = 1$. The exponential sums $S_k(q, a)$ and $S_k(q, a, b)$ are defined as:
\[ S_k(q, a) = \sum_{m=1}^{q} e\!\left( \frac{a m^k}{q} \right), \qquad S_k(q, a, b) = \sum_{m=1}^{q} e\!\left( \frac{a m^k + b m}{q} \right). \]
If the exponent $k$ is clear from the context, the subscript $k$ will be omitted in the notation for these sums. In this subsection, we will determine the values of these sums for $k = 2$, which is a classical result due to Gauss.

Definition 2.2.17.

• Suppose $p$ is an odd prime. The Legendre symbol is defined as
\[ \left( \frac{a}{p} \right) = \begin{cases} 1 & \text{if } p \nmid a \text{ and } a \text{ is a quadratic residue mod } p, \\ -1 & \text{if } p \nmid a \text{ and } a \text{ is a quadratic non-residue mod } p, \\ 0 & \text{if } p \mid a. \end{cases} \]

• Suppose $q$ is an odd integer, $q = p_1^{\alpha_1} \cdots p_N^{\alpha_N}$ for odd primes $p_i$. The Jacobi symbol is defined as
\[ \left( \frac{a}{q} \right) = \left( \frac{a}{p_1} \right)^{\alpha_1} \cdots \left( \frac{a}{p_N} \right)^{\alpha_N}. \]

Since the Legendre and Jacobi symbols coincide when they are both defined, we do not make a distinction in the notation. Notice also that when $p \ne 2$ is prime,
\[ \left( \frac{a}{p} \right) \equiv a^{\frac{p-1}{2}} \bmod p. \]
It is easy to see that the Jacobi symbol for fixed $q$ is a Dirichlet character mod $q$: it is $q$-periodic, completely multiplicative and supported on the integers coprime with $q$. It is in fact a quadratic character: a non-principal character whose square is principal.

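Lemma 2.2.18. Suppose $p$ is an odd prime and $a$ is an integer with $p \nmid a$. Then
\[ S(p, a) = \left( \frac{a}{p} \right) S(p, 1). \]

Proof. The number of solutions of $x^2 \equiv m \bmod p$ equals $1 + \left( \frac{m}{p} \right)$. Therefore
\[
\begin{aligned}
S(p, a) &= \sum_{m=1}^{p} \left( 1 + \left( \frac{m}{p} \right) \right) e(am/p) = G\!\left( a, \left( \frac{\cdot}{p} \right) \right) \\
&= \left( \frac{a}{p} \right) G\!\left( 1, \left( \frac{\cdot}{p} \right) \right) = \left( \frac{a}{p} \right) S(p, 1),
\end{aligned}
\]
where we used the fact that $(a, p) = 1$ and the separability of $G(a, \chi)$ for any non-principal character $\chi$ mod $p$ (see Theorem 2.2.14).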

We now determine the value of the quadratic Gauss sums in the case $a = 1$.

Theorem 2.2.19. Suppose $a$ and $q$ are positive integers with at least one of them even, and define the exponential sum (note the 2 in the denominator)
\[ S'(q, a) = \sum_{m=1}^{q} e\!\left( \frac{a m^2}{2q} \right). \]
Then
\[ S'(q, a) = \sqrt{\frac{q}{a}}\, \frac{1 + i}{\sqrt{2}}\, \overline{S'(a, q)}. \]

Proof. We present an analytic proof given in [30, Chapter 9]. Define
\[ f(x) = \begin{cases} e(ax^2/(2q)) & \text{for } 1/2 < x < q + 1/2, \\ 0 & \text{otherwise.} \end{cases} \]

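By the Poisson summation formula (C.2.1),
\[ S'(q, a) = \lim_{K \to +\infty} \sum_{k=-K}^{K} \hat{f}(2\pi k). \]
For non-zero $k$,
\[
\begin{aligned}
\hat{f}(2\pi k) &= \int_{1/2}^{q+1/2} e\!\left( \frac{ax^2}{2q} - kx \right) dx = \int_{1/2}^{q+1/2} \frac{-1}{2\pi i k} \bigl( e(-kx) \bigr)' e\!\left( \frac{ax^2}{2q} \right) dx \\
&= \frac{-1}{2\pi i k} O(1) + \frac{1}{2\pi i k} \int_{1/2}^{q+1/2} e(-kx) \frac{2\pi i a}{q} x\, e\!\left( \frac{ax^2}{2q} \right) dx \ll_{q,a} \frac{1}{|k|},
\end{aligned}
\]
by integration by parts. By completing the square,
\[ \frac{ax^2}{2q} - kx = \frac{a}{2q} (x - kq/a)^2 - \frac{k^2 q}{2a}, \]
and by the change of variables $u = (x - kq/a)/q$, we have that
\[ \hat{f}(2\pi k) = q\, e\!\left( -\frac{k^2 q}{2a} \right) \int_{1/(2q) - k/a}^{1 + 1/(2q) - k/a} e(aqu^2/2) \, du. \]
Since at least one of $a$ and $q$ is even, if $k \equiv r \bmod a$, then $qk^2 \equiv qr^2 \bmod 2a$, so we can group the residues mod $a$ in the sum over $k$:
\[ \sum_{k=-K}^{K} \hat{f}(2\pi k) = q \sum_{r=1}^{a} e\!\left( \frac{-qr^2}{2a} \right) \sum_{m=-[K/a]}^{[K/a]} \int_{1/(2q) - m - r/a}^{1 + 1/(2q) - m - r/a} e(aqu^2/2) \, du + O_{q,a}(1/K). \]
When $K \to +\infty$, the inner sum converges to
\[ \int_{\mathbb{R}} e(aqu^2/2) \, du = \frac{1}{\sqrt{aq}} \frac{1 + i}{\sqrt{2}}, \]
which is called the Fresnel integral. This is a “classical” improper integral which can be evaluated via contour integration and residue calculus. We eventually obtain:
\[ S'(q, a) = \sqrt{\frac{q}{a}}\, \frac{1 + i}{\sqrt{2}}\, \overline{S'(a, q)}. \]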

By taking a = 2, we obtain a famous result of Gauss:

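Corollary 2.2.20 (Gauss).
\[ S(q, 1) = \sum_{m=1}^{q} e\!\left( \frac{m^2}{q} \right) = \begin{cases} \sqrt{q} & \text{if } q \equiv 1 \bmod 4, \\ 0 & \text{if } q \equiv 2 \bmod 4, \\ i\sqrt{q} & \text{if } q \equiv 3 \bmod 4, \\ (1 + i)\sqrt{q} & \text{if } q \equiv 0 \bmod 4. \end{cases} \]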

Consider two distinct odd primes $p_1$ and $p_2$. It is easy to verify (see also Lemma 2.3.4 below) that $S(p_1 p_2, 1) = S(p_1, p_2) S(p_2, p_1)$. By evaluating this quadratic Gauss sum in two different ways, using the above corollary and Lemma 2.2.18, we get a quick proof of the famous reciprocity law for the Legendre symbol:
\[ \left( \frac{p_2}{p_1} \right) \left( \frac{p_1}{p_2} \right) = (-1)^{(p_1 - 1)(p_2 - 1)/4}. \]
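Both ingredients of this argument are easy to test on small primes. The Python sketch below checks the factorisation $S(p_1 p_2, 1) = S(p_1, p_2) S(p_2, p_1)$ numerically and the reciprocity law exactly, computing the Legendre symbol through Euler's criterion. The function names are illustrative, and the exponent is reduced modulo $q$ before dividing to keep the floating-point error small.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def quadratic_gauss_sum(q: int, a: int) -> complex:
    """S(q, a): the sum of e(a m^2 / q) over m = 1, ..., q."""
    return sum(e(((a * m * m) % q) / q) for m in range(1, q + 1))


def legendre(a: int, p: int) -> int:
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    r = pow(a, (p - 1) // 2, p)
    return -1 if r == p - 1 else r


def reciprocity_sign(p1: int, p2: int) -> int:
    """The right-hand side (-1)^((p1-1)(p2-1)/4) of the reciprocity law."""
    return (-1) ** (((p1 - 1) * (p2 - 1)) // 4)


if __name__ == "__main__":
    p1, p2 = 7, 11
    print(legendre(p2, p1) * legendre(p1, p2), reciprocity_sign(p1, p2))
    print(quadratic_gauss_sum(p1 * p2, 1))
    print(quadratic_gauss_sum(p1, p2) * quadratic_gauss_sum(p2, p1))
```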

The above theorem also implies a reciprocity law for the quadratic Gauss sum $S(q, a)$.

Corollary 2.2.21. If $a$ is odd, then
\[ S(q, a) = \sqrt{\frac{q}{a}}\, \frac{1 + i}{2}\, \bigl( 1 + e(-qa/4) \bigr)\, \overline{S(a, q)}. \]

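Proof.
\[ S(q, a) = S'(q, 2a) = \sqrt{\frac{q}{2a}}\, \frac{1 + i}{\sqrt{2}}\, \overline{S'(2a, q)} = \sqrt{\frac{q}{a}}\, \frac{1 + i}{2} \sum_{m=1}^{2a} e\!\left( -\frac{qm^2}{4a} \right). \]
We split the last sum into a sum over even and a sum over odd $m$. For the odd $m$, observe that $(m + 2a)^2 \equiv m^2 \bmod 4a$, so we may sum over the odd numbers in any complete residue system mod $2a$. Choosing the complete residue system $\{a + 1, a + 2, \dots, 3a\}$ and using the fact that $a$ is odd, we get:
\[
\begin{aligned}
\sum_{m=1}^{2a} e\!\left( -\frac{qm^2}{4a} \right) &= \sum_{m=1}^{a} e\!\left( -\frac{q(2m)^2}{4a} \right) + \sum_{m=1}^{a} e\!\left( -\frac{q(2m + a)^2}{4a} \right) \\
&= \sum_{m=1}^{a} e\!\left( -\frac{qm^2}{a} \right) + \sum_{m=1}^{a} e\!\left( -\frac{qm^2}{a} \right) e(-qa/4) \\
&= \overline{S(a, q)}\, \bigl( 1 + e(-qa/4) \bigr).
\end{aligned}
\]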

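Theorem 2.2.22. Suppose $a$ and $q$ are positive integers with $(a, q) = 1$. For odd $n$, define
\[ \varepsilon_n = \begin{cases} 1 & \text{if } n \equiv 1 \bmod 4, \\ i & \text{if } n \equiv 3 \bmod 4. \end{cases} \]
Then
\[ S(q, a) = \begin{cases} \left( \dfrac{a}{q} \right) \varepsilon_q \sqrt{q} & \text{if } q \text{ is odd,} \\ 0 & \text{if } q \equiv 2 \bmod 4, \\ \left( \dfrac{q}{a} \right) (1 + i)\, \overline{\varepsilon_a}\, \sqrt{q} & \text{if } q \equiv 0 \bmod 4. \end{cases} \]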
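The closed form of Theorem 2.2.22 can be compared with a direct evaluation of $S(q, a)$. The Python sketch below does so, using the standard binary algorithm for the Jacobi symbol; the function names are illustrative, and the comparison is a floating-point check on small moduli rather than a proof.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def quadratic_gauss_sum(q: int, a: int) -> complex:
    """S(q, a) evaluated directly from the definition."""
    return sum(e(((a * m * m) % q) / q) for m in range(1, q + 1))


def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for a positive odd integer n."""
    a %= n
    result = 1
    while a != 0:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):  # (2/n) = -1 exactly for n = 3, 5 mod 8
                result = -result
        a, n = n, a  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0


def epsilon(n: int) -> complex:
    """epsilon_n = 1 if n = 1 mod 4, and i if n = 3 mod 4 (n odd)."""
    return complex(1, 0) if n % 4 == 1 else complex(0, 1)


def gauss_sum_closed_form(q: int, a: int) -> complex:
    """Right-hand side of Theorem 2.2.22, assuming (a, q) = 1."""
    if q % 2 == 1:
        return jacobi(a, q) * epsilon(q) * math.sqrt(q)
    if q % 4 == 2:
        return 0j
    return jacobi(q, a) * (1 + 1j) * epsilon(a).conjugate() * math.sqrt(q)


if __name__ == "__main__":
    for q, a in ((15, 2), (8, 3), (10, 3)):
        print(q, a, quadratic_gauss_sum(q, a), gauss_sum_closed_form(q, a))
```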

We have all the essential ingredients for the proof, but it will be postponed to Subsection 2.3.2 since it makes use of a lemma (Lemma 2.3.6) which also holds for general exponents k.

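Finally, we are left to treat the sums $S(q, a, b)$, but these can be easily related to the sums $S(q, a)$ as follows. Suppose first that $b \equiv 2b' \bmod q$ for some $b'$. Then we can complete the square and we get:
\[ S(q, a, b) = \sum_{m=1}^{q} e\!\left( \frac{a(m + a^{-1}b')^2}{q} \right) e\!\left( -\frac{a^{-1}b'^2}{q} \right) = e\!\left( -\frac{a^{-1}b'^2}{q} \right) S(q, a). \]
Here, $a^{-1}$ is the multiplicative inverse of $a$ mod $q$. If there is no such $b'$, then $q$ is even and $b$ is odd. In this case, with $a^{-1}$ now denoting the inverse of $a$ mod $4q$, we have
\[
\begin{aligned}
S(4q, a) &= \sum_{m=1}^{2q} e\!\left( \frac{a(2m + a^{-1}b)^2}{4q} \right) + \sum_{m=1}^{2q} e\!\left( \frac{a(2m)^2}{4q} \right) \\
&= 2e\!\left( \frac{a^{-1}b^2}{4q} \right) S(q, a, b) + 2S(q, a),
\end{aligned}
\]
since $2m + a^{-1}b$ runs over all odd residues mod $4q$ when $m$ runs over $\{1, \dots, 2q\}$. Therefore, by Theorem 2.2.22,
\[ S(q, a, b) = \begin{cases} \dfrac{1}{2}\, e\!\left( -\dfrac{a^{-1}b^2}{4q} \right) S(4q, a) & \text{if } q \equiv 2 \bmod 4, \\ 0 & \text{if } q \equiv 0 \bmod 4. \end{cases} \]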
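The reduction of $S(q, a, b)$ to $S(q, a)$ can be checked numerically in both cases. The Python sketch below implements the two formulas, with `pow(a, -1, q)` (available from Python 3.8) supplying the modular inverse; the function names and the split into helper functions are choices made for the example.

```python
import cmath
import math


def e(z: float) -> complex:
    """Complex exponential with period 1."""
    return cmath.exp(2j * math.pi * z)


def S(q: int, a: int) -> complex:
    """S(q, a): the sum of e(a m^2 / q) over m = 1, ..., q."""
    return sum(e(((a * m * m) % q) / q) for m in range(1, q + 1))


def S_linear(q: int, a: int, b: int) -> complex:
    """S(q, a, b): the sum of e((a m^2 + b m) / q) over m = 1, ..., q."""
    return sum(e(((a * m * m + b * m) % q) / q) for m in range(1, q + 1))


def S_linear_reduced(q: int, a: int, b: int) -> complex:
    """S(q, a, b) expressed through S(q, a) or S(4q, a); needs (a, q) = 1."""
    if q % 2 == 1 or b % 2 == 0:
        # Case b = 2b' mod q: complete the square.
        b_half = (b * pow(2, -1, q)) % q if q % 2 == 1 else b // 2
        a_inv = pow(a, -1, q)
        return e(-((a_inv * b_half * b_half) % q) / q) * S(q, a)
    if q % 4 == 0:
        return 0j
    # Remaining case: q = 2 mod 4 and b odd; invert a modulo 4q.
    a_inv = pow(a, -1, 4 * q)
    return 0.5 * e(-((a_inv * b * b) % (4 * q)) / (4 * q)) * S(4 * q, a)


if __name__ == "__main__":
    for q, a, b in ((9, 2, 5), (6, 5, 3), (8, 3, 1)):
        print(q, a, b, S_linear(q, a, b), S_linear_reduced(q, a, b))
```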

2.3 kth-power Gauss sums

Throughout this section, we fix an integer k > 1.

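2.3.1 kth-powers modulo $p^l$

We will first examine the distribution of $k$th-powers modulo a prime power $p^l$. The following theorem about the structure of the unit group is well known (see for example [1, Chapter 10]).

Theorem 2.3.1.

• For the prime 2, we have
\[ (\mathbb{Z}/2\mathbb{Z})^\times \cong 1, \qquad (\mathbb{Z}/2^2\mathbb{Z})^\times \cong \mathbb{Z}/2\mathbb{Z}, \qquad (\mathbb{Z}/2^l\mathbb{Z})^\times \cong \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2^{l-2}\mathbb{Z} \quad \text{for } l \ge 2. \]
An explicit isomorphism for the latter case is
\[ \mathbb{Z}/2\mathbb{Z} \times \mathbb{Z}/2^{l-2}\mathbb{Z} \to (\mathbb{Z}/2^l\mathbb{Z})^\times : (a + 2\mathbb{Z},\, b + 2^{l-2}\mathbb{Z}) \mapsto (-1)^a 5^b + 2^l\mathbb{Z}. \tag{2.2} \]

• For the odd primes, we have
\[ (\mathbb{Z}/p^l\mathbb{Z})^\times \cong \mathbb{Z}/p^{l-1}(p - 1)\mathbb{Z}. \]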

First, we would like to determine the number of solutions $x \in (\mathbb{Z}/p^l\mathbb{Z})^\times$ of the equation
\[ x^k \equiv y \bmod p^l \tag{2.3} \]
for a fixed $y \in (\mathbb{Z}/p^l\mathbb{Z})^\times$. Again, we will first consider the case of the prime 2 and $l \ge 3$, and later the case of the odd primes.

Case 1: $p = 2$. Using the isomorphism (2.2), write $x \equiv (-1)^a 5^b$ and $y \equiv (-1)^c 5^d$. Then (2.3) is equivalent to
\[ \begin{cases} ka \equiv c \bmod 2, \\ kb \equiv d \bmod 2^{l-2}. \end{cases} \tag{2.4} \]
If $k \equiv 1 \bmod 2$, then $k$ is invertible mod 2 and mod $2^{l-2}$, so that we have a unique solution for (2.4), and hence also for (2.3). Suppose $k \equiv 0 \bmod 2$. If $c \equiv 1 \bmod 2$, then (2.4) has no solutions. If $c \equiv 0 \bmod 2$, then (2.4) reduces to only the second equation. Write $e = (k, 2^{l-2})$. If $e \nmid d$, then this equation has no solutions. If $e \mid d$, then the equation
\[ \frac{k}{e} b \equiv \frac{d}{e} \bmod \frac{2^{l-2}}{e} \]
has a unique solution $b_0$, and the solutions of (2.4) are given by
\[ a = 0, 1, \qquad b = b_0 + m \frac{2^{l-2}}{e} \quad \text{for } m = 1, \dots, e. \]
In conclusion, if $k$ is odd, then (2.3) has a unique solution. If $k$ is even, then if $c$ is odd or $e \nmid d$, (2.3) has no solutions, and otherwise it has $2e$ solutions.

Case 2: $p$ odd. Consider a primitive root (i.e. a generator of the group of units) $g$ of $\mathbb{Z}/p^l\mathbb{Z}$, and write $a = \operatorname{ind}_g(x)$ and $b = \operatorname{ind}_g(y)$ (here $\operatorname{ind}_g(z)$ is the unique $c$ mod $p^{l-1}(p - 1)$ such that $z \equiv g^c \bmod p^l$, and is called the index of $z$). Then (2.3) is equivalent to
\[ ka \equiv b \bmod p^{l-1}(p - 1). \tag{2.5} \]
Write $e = (k, p^{l-1}(p - 1))$. If $e \nmid b$, then (2.5) has no solutions. If $e \mid b$, then
\[ \frac{k}{e} a \equiv \frac{b}{e} \bmod \frac{p^{l-1}(p - 1)}{e} \]
has a unique solution $a_0$, and the solutions of (2.5) are given by
\[ a_0 + m \frac{p^{l-1}(p - 1)}{e} \quad \text{for } m = 1, \dots, e. \]
In conclusion, if $e \nmid b$, then (2.3) has no solutions, and if $e \mid b$, then it has $e$ solutions.

Using these observations, we can easily count the number of $k$th-powers in $(\mathbb{Z}/p^l\mathbb{Z})^\times$.

Proposition 2.3.2. Define $e = (k, 2^{l-2})$ if $p = 2$ and $e = (k, p^{l-1}(p - 1))$ otherwise. The number of $k$th-powers in $(\mathbb{Z}/p^l\mathbb{Z})^\times$ is given by
\[ \begin{cases} 1 & \text{if } p = 2,\ l = 1, \\ 1 \text{ resp. } 2 & \text{if } p = 2,\ l = 2,\ k \text{ even resp. odd,} \\ 2^{l-2}/e \text{ resp. } 2^{l-1} & \text{if } p = 2,\ l \ge 3,\ k \text{ even resp. odd,} \\ p^{l-1}(p - 1)/e & \text{if } p \ne 2. \end{cases} \]
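The counts in Proposition 2.3.2 are easy to confirm by brute force for small prime powers. The following Python sketch enumerates the $k$th powers of the units modulo $p^l$ and compares their number with the formula; the function names are illustrative.

```python
from math import gcd


def count_kth_powers(p: int, l: int, k: int) -> int:
    """Number of distinct k-th powers among the units modulo p^l."""
    q = p ** l
    return len({pow(x, k, q) for x in range(1, q) if x % p != 0})


def predicted_count(p: int, l: int, k: int) -> int:
    """The count given by Proposition 2.3.2."""
    if p == 2:
        if l == 1:
            return 1
        if l == 2:
            return 1 if k % 2 == 0 else 2
        if k % 2 == 1:
            return 2 ** (l - 1)
        return 2 ** (l - 2) // gcd(k, 2 ** (l - 2))
    group_order = p ** (l - 1) * (p - 1)
    return group_order // gcd(k, group_order)


if __name__ == "__main__":
    for p, l, k in ((2, 5, 4), (3, 3, 6), (7, 2, 3)):
        print(p, l, k, count_kth_powers(p, l, k), predicted_count(p, l, k))
```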

For applications it is useful to relate the $k$th powers modulo larger prime powers to the $k$th powers modulo a small prime power. For this, define $\tau(p)$ and $\gamma(p)$ via

\[ p^{\tau(p)} \,\|\, k, \qquad \gamma(p) = \begin{cases} \tau(p) + 1 & \text{if } p > 2 \text{ or } p = 2 \text{ and } \tau(p) = 0, \\ \tau(p) + 2 & \text{if } p = 2 \text{ and } \tau(p) > 0. \end{cases} \]

If the prime $p$ is clear from the context, we will often omit the argument $p$ and write $\tau$ and $\gamma$ instead. Then, the number of solutions of $x^k \equiv y \bmod p^\gamma$ ($p \nmid y$) equals 0 or $p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$; the number of (invertible) $k$th powers mod $p^\gamma$ equals $\varphi(p^{\tau+1})/(k, \varphi(p^{\tau+1}))$.

Now suppose $l \ge \gamma$. Using the formulas above, we see that the number of solutions of $x^k \equiv y \bmod p^l$ ($p \nmid y$) stays the same, namely 0 or $p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$; the number of invertible $k$th powers mod $p^l$ however is multiplied by $p^{l-\gamma}$.

If $x_i^k \equiv y \bmod p^\gamma$ with $p \nmid x_i, y$ for $i = 1, \dots, p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$, then $(x_i + mp^\gamma)^k \equiv y + n_{i,m}p^\gamma \bmod p^l$ for some $n_{i,m}$, where $i = 1, \dots, p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$ and $m = 1, \dots, p^{l-\gamma}$. For every $n \bmod p^{l-\gamma}$ there are exactly $p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$ couples $(i, m)$ such that $n_{i,m} \equiv n \bmod p^{l-\gamma}$. Otherwise there would be a number $n$ such that the equation $x^k \equiv y + np^\gamma \bmod p^l$ has more than $p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$ solutions. Therefore, given a $k$th power $y$ mod $p^\gamma$, the numbers $y + np^\gamma$, $n = 1, \dots, p^{l-\gamma}$, are $p^{l-\gamma}$ distinct $k$th powers mod $p^l$. In conclusion, we have the following proposition.

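Proposition 2.3.3. Suppose $l \ge \gamma$. The $k$th powers of $(\mathbb{Z}/p^l\mathbb{Z})^\times$ are given by

\[ (y + np^\gamma) + p^l\mathbb{Z}, \]

where $y + p^\gamma\mathbb{Z}$ is a $k$th power in $(\mathbb{Z}/p^\gamma\mathbb{Z})^\times$ and $1 \le n \le p^{l-\gamma}$.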

2.3.2 The exponential sums S(q, a) and S(q, a, b)

In this subsection, we will take a closer look at the exponential sums $S_k(q, a)$ and $S_k(q, a, b)$. The case $k = 2$ was already investigated in Subsection 2.2.3; there we could evaluate them explicitly. For general $k$, such an explicit formula for these sums is not known, so we will have to deal with estimates for them. These sums are of great importance, especially in Waring's problem, see Section 3.2 below. A first step in treating them is a reduction to the prime power case, which can be done via the following lemma.

Lemma 2.3.4. Let $q_1, q_2, a, b$ be positive integers with $(q_1, q_2) = (q_1q_2, a) = 1$. Then

\[ S(q_1q_2, a) = S(q_1, aq_2^{k-1})\,S(q_2, aq_1^{k-1}), \quad\text{and}\quad S(q_1q_2, a, b) = S(q_1, aq_2^{k-1}, b)\,S(q_2, aq_1^{k-1}, b). \]

Proof. Since $q_1$ and $q_2$ are coprime, a complete residue system mod $q_1q_2$ is given by $\{tq_2 + uq_1 \mid 1 \le t \le q_1,\ 1 \le u \le q_2\}$, as follows from the Chinese remainder theorem. Therefore,

\[ \begin{aligned} S(q_1q_2, a, b) &= \sum_{t=1}^{q_1}\sum_{u=1}^{q_2} e\Bigl(\frac{a(tq_2 + uq_1)^k + b(tq_2 + uq_1)}{q_1q_2}\Bigr) \\ &= \sum_{t=1}^{q_1}\sum_{u=1}^{q_2} e\Bigl(\frac{at^kq_2^{k-1} + bt}{q_1} + \frac{au^kq_1^{k-1} + bu}{q_2}\Bigr) = S(q_1, aq_2^{k-1}, b)\,S(q_2, aq_1^{k-1}, b). \end{aligned} \]
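The twisted multiplicativity of Lemma 2.3.4 is easy to test numerically. The sketch below evaluates the sums directly in floating point; the helper names `e`, `S` and `multiplicativity_gap` are illustrative choices.

```python
import cmath

def e(t: float) -> complex:
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)

def S(q: int, a: int, b: int = 0, k: int = 3) -> complex:
    """S_k(q, a, b) = sum over x = 1..q of e((a*x^k + b*x)/q)."""
    # Reduce the numerator mod q before dividing, to keep floating-point error small.
    return sum(e(((a * pow(x, k, q) + b * x) % q) / q) for x in range(1, q + 1))

def multiplicativity_gap(q1: int, q2: int, a: int, b: int = 0, k: int = 3) -> float:
    """|S(q1*q2, a, b) - S(q1, a*q2^(k-1), b) * S(q2, a*q1^(k-1), b)|."""
    lhs = S(q1 * q2, a, b, k)
    rhs = S(q1, a * q2 ** (k - 1), b, k) * S(q2, a * q1 ** (k - 1), b, k)
    return abs(lhs - rhs)

# The gap should vanish up to rounding whenever (q1, q2) = (q1*q2, a) = 1.
assert multiplicativity_gap(5, 7, 3, b=2, k=3) < 1e-8
```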

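We will first focus our attention on the sum $S(q, a)$. The following two lemmas treat the prime case $q = p$, and reduce the prime power case $q = p^l$ to one with a smaller exponent.

Lemma 2.3.5. Suppose $(p, a) = 1$. Write $A$ for the set of non-principal characters $\chi$ mod $p$ for which $\chi^k$ is principal. Then

\[ S(p, a) = \sum_{\chi \in A} \overline{\chi}(a)\,G(1, \chi), \]

and $|S(p, a)| \le ((k, p-1) - 1)\sqrt{p}$.

Proof. Suppose $(y, p) = 1$. Using the orthogonality relations for characters, we have that

\[ \sum_{\chi \bmod p} \chi(x^k)\overline{\chi}(y) = \sum_{\chi \bmod p} \chi(x^ky^{-1}) = \begin{cases} p - 1 & \text{if } x^k \equiv y \bmod p, \\ 0 & \text{otherwise.} \end{cases} \]

Therefore, the number of solutions of $x^k \equiv y \bmod p$ equals

\[ \frac{1}{p-1}\sum_{x=1}^{p}\sum_{\chi \bmod p} \chi(x^k)\overline{\chi}(y) = \frac{1}{p-1}\sum_{\chi \bmod p} \overline{\chi}(y)\sum_{x=1}^{p}\chi^k(x) = \sum_{\substack{\chi \bmod p \\ \chi^k = \chi_0}} \overline{\chi}(y) = 1 + \sum_{\chi \in A}\chi(y), \]

where in the last step we used that $A$ is closed under complex conjugation.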

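The last expression also holds when $y \equiv 0$, since then the equation has only the solution $x \equiv 0$. Therefore, we can rewrite $S(p, a)$ in the following way:

\[ \begin{aligned} S(p, a) &= \sum_{y=1}^{p}\Bigl(1 + \sum_{\chi \in A}\chi(y)\Bigr)e\Bigl(\frac{ay}{p}\Bigr) \\ &= \sum_{\chi \in A}\sum_{y=1}^{p}\chi(y)\,e\Bigl(\frac{ay}{p}\Bigr) \\ &= \sum_{\chi \in A}\overline{\chi}(a)\sum_{y=1}^{p}\chi(ay)\,e\Bigl(\frac{ay}{p}\Bigr) = \sum_{\chi \in A}\overline{\chi}(a)\,G(1, \chi). \end{aligned} \]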

The non-principal characters whose $k$th power is principal are precisely given by

\[ \chi_h(m) = e\Bigl(\frac{h\operatorname{ind}_g m}{(k, p-1)}\Bigr), \qquad (m, p) = 1, \]

with $1 \le h < (k, p-1)$, and $g$ a primitive root mod $p$. Since these are primitive characters, $|G(1, \chi_h)| = \sqrt{p}$ by Theorem 2.2.14 and the lemma follows.

Lemma 2.3.6. Suppose $l > \gamma$ and $(p^l, a) = 1$. Then

\[ S(p^l, a) = \begin{cases} p^{l-1} & \text{if } l \le k, \\ p^{k-1}S(p^{l-k}, a) & \text{if } l > k. \end{cases} \]

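Proof. Every $k$th power coprime to $p$ is attained by the same number $c = p^{\gamma-\tau-1}(k, \varphi(p^{\tau+1}))$ of residues $x$ mod $p^l$. Using Proposition 2.3.3 for the terms with $p \nmid x$, and writing $x = pm$ for the remaining terms, we have

\[ S(p^l, a) = c\sum_{\substack{y = 1 \\ y \equiv x^k \bmod p^\gamma \\ p \nmid y}}^{p^\gamma}\ \sum_{n=1}^{p^{l-\gamma}} e\Bigl(\frac{a(y + np^\gamma)}{p^l}\Bigr) + \sum_{m=1}^{p^{l-1}} e(am^kp^{k-l}). \]

The inner sum of the left term equals

\[ e\Bigl(\frac{ay}{p^l}\Bigr)\sum_{n=1}^{p^{l-\gamma}} e\Bigl(\frac{an}{p^{l-\gamma}}\Bigr) = 0. \]

The right term equals

\[ \begin{cases} p^{l-1} & \text{if } l \le k, \\ p^{k-1}S(p^{l-k}, a) & \text{if } l > k. \end{cases} \]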
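The recursion of Lemma 2.3.6 can be verified numerically for small prime powers. The following sketch computes $\gamma(p)$ from its definition and compares both sides; `S`, `gamma` and `recursion_gap` are names chosen here for illustration.

```python
import cmath

def S(q: int, a: int, k: int) -> complex:
    """S_k(q, a) = sum over x = 1..q of e(a*x^k/q)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * pow(x, k, q)) % q) / q)
               for x in range(1, q + 1))

def gamma(p: int, k: int) -> int:
    """gamma(p) as defined before Proposition 2.3.3 (tau is the exact power of p in k)."""
    tau = 0
    while k % p ** (tau + 1) == 0:
        tau += 1
    return tau + 2 if (p == 2 and tau > 0) else tau + 1

def recursion_gap(p: int, l: int, a: int, k: int) -> float:
    """|S(p^l, a) - value predicted by Lemma 2.3.6|; meaningful for l > gamma(p, k)."""
    if l <= k:
        predicted = p ** (l - 1)
    else:
        predicted = p ** (k - 1) * S(p ** (l - k), a, k)
    return abs(S(p ** l, a, k) - predicted)

# Example: p = 5, k = 3 has gamma = 1, so the lemma applies for every l >= 2.
assert gamma(5, 3) == 1
assert recursion_gap(5, 2, 1, 3) < 1e-6
```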

As promised in Subsection 2.2.3, we will now give the proof of Theorem 2.2.22, which evaluates the quadratic Gauss sums $S_2(q, a)$ explicitly.

Proof of Theorem 2.2.22. Using the reciprocity law for the quadratic Gauss sums, Corollary 2.2.21, we see that we only have to prove the case $q$ odd. Consider first an odd prime $p$, and an integer $a$ with $p \nmid a$. Recall the definition of $\varepsilon_q$:

\[ \varepsilon_q = \begin{cases} 1 & \text{if } q \equiv 1 \bmod 4, \\ i & \text{if } q \equiv 3 \bmod 4. \end{cases} \]

Using Lemma 2.3.6 and Lemma 2.2.18, we see that for even powers

\[ S_2(p^{2l}, a) = p^l = \Bigl(\frac{a}{p^{2l}}\Bigr)\varepsilon_{p^{2l}}\sqrt{p^{2l}}, \]

and for odd powers

\[ S_2(p^{2l+1}, a) = p^lS_2(p, a) = \Bigl(\frac{a}{p^{2l+1}}\Bigr)\varepsilon_{p^{2l+1}}\sqrt{p^{2l+1}}. \]

Now consider odd $q$ arbitrary, and write $q = p_1^{l_1}\cdots p_n^{l_n}$ for distinct odd primes $p_i$. Using the "twisted" multiplicativity of $S(q, a)$ (Lemma 2.3.4), we get

\[ S_2(q, a) = \prod_{i<j}\Bigl[\Bigl(\frac{p_i}{p_j}\Bigr)\Bigl(\frac{p_j}{p_i}\Bigr)\Bigr]^{l_il_j}\ \prod_{i=1}^{n}\varepsilon_{p_i^{l_i}}\ \Bigl(\frac{a}{q}\Bigr)\sqrt{q}. \]

We claim that the constant up front equals $\varepsilon_q$. Observe that $\varepsilon_{q_1}\varepsilon_{q_2} = (-1)^{(q_1-1)(q_2-1)/4}\varepsilon_{q_1q_2}$. Denote by $m$ the number of $i$ such that $p_i^{l_i} \equiv 3 \bmod 4$. By quadratic reciprocity of the Legendre symbol, the first product equals $(-1)^{m(m-1)/2}$. The second product equals $(-1)^{[m/2]}\varepsilon_q$. The proof is now complete by noting that $m(m-1)/2$ and $[m/2]$ have the same parity.
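The evaluation $S_2(q, a) = \bigl(\frac{a}{q}\bigr)\varepsilon_q\sqrt{q}$ for odd $q$ with $(a, q) = 1$ can be compared with a direct computation. The sketch below uses the standard binary algorithm for the Jacobi symbol; the helper names `jacobi`, `eps`, `S2` and `max_gap` are illustrative.

```python
import cmath
from math import gcd, sqrt

def jacobi(a: int, n: int) -> int:
    """Jacobi symbol (a/n) for odd positive n."""
    a %= n
    result = 1
    while a:
        while a % 2 == 0:
            a //= 2
            if n % 8 in (3, 5):      # (2/n) = -1 exactly when n = 3, 5 mod 8
                result = -result
        a, n = n, a                  # quadratic reciprocity
        if a % 4 == 3 and n % 4 == 3:
            result = -result
        a %= n
    return result if n == 1 else 0

def eps(q: int) -> complex:
    """epsilon_q = 1 if q = 1 mod 4, and i if q = 3 mod 4 (q odd)."""
    return 1 if q % 4 == 1 else 1j

def S2(q: int, a: int) -> complex:
    """Quadratic Gauss sum: sum over x = 1..q of e(a*x^2/q)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * x * x) % q) / q) for x in range(1, q + 1))

def max_gap(limit: int) -> float:
    """Largest deviation between S2(q, a) and the closed form, over odd q <= limit."""
    return max(abs(S2(q, a) - jacobi(a, q) * eps(q) * sqrt(q))
               for q in range(3, limit + 1, 2)
               for a in range(1, q) if gcd(a, q) == 1)

assert max_gap(15) < 1e-8
```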

Combining the previous lemmas, we can give an estimate for S(q, a).

Theorem 2.3.7. Suppose that $(q, a) = 1$. Then $S(q, a) \ll_k q^{1-1/k}$.

Proof. For $k = 2$, this follows from Theorem 2.2.22. Suppose that $k > 2$. Note that $k \ge \gamma(p)$ for each $p$. Write $l = uk + v$ with $1 \le v \le k$ and $u \ge 0$. By Lemma 2.3.6, $S(p^l, a) = p^{(k-1)u}S(p^v, a)$.

Case 1: v > 1.

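If $p > k$, then $\gamma = 1$, so $S(p^v, a) = p^{v-1}$. If $p \le k$, then trivially $|S(p^v, a)| \le kp^{v-1}$. Hence

\[ |S(p^l, a)| \le \begin{cases} p^{l-l/k} & \text{if } p > k, \\ kp^{l-l/k} & \text{if } p \le k. \end{cases} \]

Case 2: $v = 1$.

By Lemma 2.3.5, $|S(p^v, a)| \le (k-1)\sqrt{p} \le kp^{1-1/k-1/6}$, since $k \ge 3$. Hence

\[ |S(p^l, a)| \le \begin{cases} p^{l-l/k} & \text{if } p > k^6, \\ kp^{l-l/k} & \text{if } p \le k^6. \end{cases} \]

By Lemma 2.3.4, we conclude that

\[ |S(q, a)| \le \Bigl(\prod_{p \le k^6} k\Bigr)q^{1-1/k} \ll_k q^{1-1/k}. \]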

Finally, we mention a theorem without proof, which gives an upper bound for $S(q, a, b)$. The case $q = p$ prime follows from deep work of Weil on exponential sums over finite fields, and we have the following theorem (see for example [31, chapter 2]).

Theorem 2.3.8 (Weil). Suppose $p$ is a prime, $f$ a polynomial with integral coefficients of degree $k < p$, such that $p$ does not divide the leading coefficient. Then

\[ \Bigl|\sum_{m=1}^{p} e(f(m)/p)\Bigr| \le (k-1)\sqrt{p}. \]

Using this theorem in the case $f(X) = aX^k + bX$, Hua [20] proved the ensuing upper bound for $S(q, a, b)$ with general $q$.

Theorem 2.3.9 (Hua). Suppose that $(q, a) = 1$. Then for any $\varepsilon > 0$, we have

\[ S(q, a, b) \ll_{k,\varepsilon} q^{1/2+\varepsilon}(q, b). \]
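For prime modulus, both the Weil bound (Theorem 2.3.8, applied to $f(X) = aX^k + bX$) and the sharper bound of Lemma 2.3.5 for $b = 0$ can be observed numerically. This is only a sketch for small primes; the function names are illustrative.

```python
import cmath
from math import gcd, sqrt

def S(p: int, a: int, b: int, k: int) -> complex:
    """S_k(p, a, b) = sum over x = 1..p of e((a*x^k + b*x)/p)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * pow(x, k, p) + b * x) % p) / p)
               for x in range(1, p + 1))

def weil_bound_holds(p: int, k: int, tol: float = 1e-9) -> bool:
    """Check |S(p, a, b)| <= (k-1)*sqrt(p) for all a coprime to p and all b."""
    return all(abs(S(p, a, b, k)) <= (k - 1) * sqrt(p) + tol
               for a in range(1, p) for b in range(p))

def lemma_235_bound_holds(p: int, k: int, tol: float = 1e-9) -> bool:
    """Check |S(p, a)| <= ((k, p-1) - 1)*sqrt(p) for all a coprime to p."""
    return all(abs(S(p, a, 0, k)) <= (gcd(k, p - 1) - 1) * sqrt(p) + tol
               for a in range(1, p))

assert weil_bound_holds(11, 3)
assert lemma_235_bound_holds(11, 3)   # (3, 10) = 1, so S(11, a) vanishes
```

When $(k, p-1) = 1$ the map $x \mapsto x^k$ permutes the residues mod $p$, so $S(p, a)$ is a complete geometric sum and vanishes, in agreement with Lemma 2.3.5.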

2.4 Weyl sums

In this section we will study Weyl sums. A Weyl sum is an exponential sum of the form

\[ \sum_{m=1}^{M} e(P(m)), \]

where $P$ is a real polynomial. These sums are named after Hermann Weyl, who was one of the first to study them and apply them in number theory, for example in the theory of uniform distribution [44]. Weyl sums also show up in Waring's problem. It is of great importance in applications to determine upper bounds for them. The goal is always to obtain a strictly better bound than the trivial one, $M$. For this, one has to use the form of $P$ to obtain some cancellation. We will present two methods which lead to an estimate for Weyl sums: Weyl's method and Vinogradov's method.

Besides these two methods, there is also a third method, developed by the Dutch mathematician Johannes van der Corput in the 1920s. His method is especially applicable in multiplicative number theory, for example for deriving bounds for the Riemann zeta function on the critical line $\sigma = 1/2$, or in the context of the Dirichlet divisor problem. We will not treat van der Corput's method in this thesis. For a detailed account of his method and its applications, we refer to the book by Graham and Kolesnik [9].

Let us first introduce some notation.

Definition 2.4.1. Given a function $f : \mathbb{R} \to \mathbb{C}$, the exponential sum $S_f(M)$ is defined as

\[ S_f(M) = \sum_{m=1}^{M} e(f(m)). \]

2.4.1 Weyl's method

In this subsection, we will prove Weyl's inequality. This inequality gives an estimate for Weyl sums in terms of the number of terms in the sum, and the size of the denominator of a rational approximation $a/q$ for the leading coefficient $\alpha$. The argument for showing this estimate uses the (forward) difference operator. By squaring the absolute value of the exponential sum, one obtains exponential sums involving this forward difference operator. This operator reduces the degree of polynomials by 1, so this procedure will relate the original exponential sum to exponential sums over polynomials of degree $k - 1$. The inductive application of this argument reduces it to degree 1, which can be estimated via Proposition 2.1.1. This proposition yields an estimate of the form $\min(M, \|\alpha\|^{-1})$; we will first need a technical lemma which deals with sums over this kind of expressions.

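Lemma 2.4.2. Suppose that $X$, $Y$ and $\alpha$ are real numbers, $X, Y \ge 1$, and for some $a, q$ with $(a, q) = 1$ we have $|\alpha - a/q| \le 1/q^2$. Then

\[ \sum_{m \le X}\min\Bigl(\frac{XY}{m}, \frac{1}{\|\alpha m\|}\Bigr) \ll XY\Bigl(\frac{1}{q} + \frac{1}{Y} + \frac{q}{XY}\Bigr)\log(2Xq). \]

Throughout the proof, it is useful to keep in mind that if $\alpha, \beta, \gamma \in \mathbb{R}$, $|\gamma| \le 1/2$ and $\alpha = \beta + \gamma$, then $\|\beta\| - |\gamma| \le \|\alpha\| \le \|\beta\| + |\gamma|$.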

Proof. Denote the sum by $S$. Writing $m$ as $jq + r$, we get

\[ S \le \sum_{0 \le j \le X/q}\ \sum_{r=1}^{q}\min\Bigl(\frac{XY}{qj + r}, \frac{1}{\|\alpha(qj + r)\|}\Bigr). \]

The idea of the proof is as follows. For each $j$, we define an integer $m_j$ such that we can relate $\|\alpha(qj + r)\|^{-1}$ to $\|(m_j + ar)/q\|^{-1}$. The sum over $r$ of the latter is easily estimated: each term equals $q/h$ for some integer $h$ between 0 and $q/2$. Since $a$ is invertible mod $q$, each such $h$ will occur at most 2 times. Define $m_j = [\alpha jq^2]$ and $\theta = q^2\alpha - qa$. Then

\[ \alpha(qj + r) = \frac{m_j + ar}{q} + \frac{\{\alpha jq^2\}}{q} + \frac{\theta r}{q^2}, \qquad |\theta| \le 1. \]

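If $j = 0$ and $r \le q/2$, then using $|\theta r/q^2| \le 1/2$, we get

\[ \|\alpha(qj + r)\| = \Bigl\|\frac{ar}{q} + \frac{\theta r}{q^2}\Bigr\| \ge \Bigl\|\frac{ar}{q}\Bigr\| - \Bigl|\frac{\theta r}{q^2}\Bigr| \ge \Bigl\|\frac{ar}{q}\Bigr\| - \frac{1}{2q} \ge \frac{1}{2}\Bigl\|\frac{ar}{q}\Bigr\|, \]

since $\|ar/q\| \ge 1/q$. For each $j$ we have

\[ \|\alpha(qj + r)\| \ge \Bigl\|\frac{m_j + ar}{q}\Bigr\| - \Bigl|\frac{\{\alpha jq^2\}}{q} + \frac{\theta r}{q^2}\Bigr| \ge \Bigl\|\frac{m_j + ar}{q}\Bigr\| - \frac{q + r}{q^2}. \]

Here we used that $|\{\alpha jq^2\}/q + \theta r/q^2| \le 1/2$ if $q \ge 4$, which we may assume, since for $q < 4$ the result is trivial. Now

\[ \frac{q + r}{q^2} \le \frac{2}{q} \le \frac{1}{2}\Bigl\|\frac{m_j + ar}{q}\Bigr\|, \]

except in at most 9 cases per value of $j$ (if the numerator is $0, \pm1, \pm2, \pm3$ or $\pm4$ mod $q$).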

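If $j > 0$ or $j = 0$ and $r > q/2$, then $qj + q \le 2(qj + r)$. Collecting everything, we get

\[ \begin{aligned} S &\ll \sum_{1 \le r \le q/2}\frac{1}{\|ar/q\|} + \sum_{0 \le j \le X/q}\Biggl(9\,\frac{XY}{qj + q} + \sum_{\substack{r=1 \\ q \nmid m_j + ar}}^{q}\frac{1}{\|(m_j + ar)/q\|}\Biggr) \\ &\ll \frac{XY}{q}\sum_{0 \le j \le X/q}\frac{1}{j + 1} + \Bigl(\frac{X}{q} + 1\Bigr)\sum_{1 \le h \le q/2}\frac{q}{h} \\ &\ll XY\Bigl(\frac{1}{q} + \frac{1}{Y} + \frac{q}{XY}\Bigr)\max\bigl(\log(X/q + 2), \log(q/2 + 2)\bigr) \\ &\ll XY\Bigl(\frac{1}{q} + \frac{1}{Y} + \frac{q}{XY}\Bigr)\log(2Xq). \end{aligned} \]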

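Remark 2.4.3. The above lemma will also often be used to estimate

\[ \sum_{m \le X}\min\Bigl(Y, \frac{1}{\|\alpha m\|}\Bigr) \]

via

\[ \sum_{m \le X}\min\Bigl(Y, \frac{1}{\|\alpha m\|}\Bigr) \le \sum_{m \le X}\min\Bigl(\frac{XY}{m}, \frac{1}{\|\alpha m\|}\Bigr) \ll XY\Bigl(\frac{1}{q} + \frac{1}{Y} + \frac{q}{XY}\Bigr)\log(2Xq). \]

If one redoes the above proof for the first sum, one can see that one can remove the factor $X$ in the logarithm, but this small gain is irrelevant in most applications.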
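To get a feeling for the sizes involved in Lemma 2.4.2 and Remark 2.4.3, one can evaluate the sums for a concrete $\alpha$. The sketch below takes $\alpha = \sqrt{2}$ with the convergent $a/q = 17/12$ as an illustrative choice; the helper names are hypothetical, and the computed ratio only illustrates the shape of the bound, since the implied constant is not specified in the text.

```python
from math import log, sqrt

def dist(t: float) -> float:
    """Distance ||t|| from t to the nearest integer."""
    return abs(t - round(t))

def weighted_sum(alpha: float, X: int, Y: float) -> float:
    """Sum over m <= X of min(X*Y/m, 1/||alpha*m||)."""
    total = 0.0
    for m in range(1, X + 1):
        d = dist(alpha * m)
        total += X * Y / m if d == 0 else min(X * Y / m, 1 / d)
    return total

def plain_sum(alpha: float, X: int, Y: float) -> float:
    """Sum over m <= X of min(Y, 1/||alpha*m||)."""
    total = 0.0
    for m in range(1, X + 1):
        d = dist(alpha * m)
        total += Y if d == 0 else min(Y, 1 / d)
    return total

def lemma_rhs(X: int, Y: float, q: int) -> float:
    """X*Y*(1/q + 1/Y + q/(X*Y))*log(2*X*q), without the implied constant."""
    return X * Y * (1 / q + 1 / Y + q / (X * Y)) * log(2 * X * q)

alpha, a, q = sqrt(2), 17, 12
assert abs(alpha - a / q) <= 1 / q ** 2          # hypothesis of Lemma 2.4.2
X, Y = 200, 50.0
ratio = weighted_sum(alpha, X, Y) / lemma_rhs(X, Y, q)   # observed size relative to the bound
```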

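The forward difference of a function $f$ with step length $h$ is defined as $\Delta_1(f; h)(x) = f(x + h) - f(x)$. The $j$th iterate $\Delta_j$ of the forward difference operator is defined recursively via $\Delta_j(f; h_1, \dots, h_j) = \Delta_1(\Delta_{j-1}(f; h_1, \dots, h_{j-1}); h_j)$. The following lemma is due to Weyl.

Lemma 2.4.4 (Weyl). We have

\[ |S_f(M)|^{2^j} \le (2M)^{2^j - j - 1}\sum_{|h_1| < M}\cdots\sum_{|h_j| < M} S(\vec{h}), \]

where

\[ S(\vec{h}) = \sum_{m \in I_j(\vec{h})} e\bigl(\Delta_j(f; h_1, \dots, h_j)(m)\bigr), \]

and the $I_j(\vec{h})$ are (possibly empty) intervals satisfying

\[ I_1(h_1) \subseteq [1, M], \qquad I_j(h_1, \dots, h_j) \subseteq I_{j-1}(h_1, \dots, h_{j-1}). \]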
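For $j = 1$ the lemma is an identity: expanding $|S_f(M)|^2$ and substituting $n = m + h_1$ gives a sum over differences. A minimal numerical sketch of this first differencing step, with an arbitrarily chosen test polynomial and illustrative function names:

```python
import cmath
from math import sqrt

def e(t: float) -> complex:
    return cmath.exp(2j * cmath.pi * t)

def weyl_sum(f, M: int) -> complex:
    """S_f(M) = sum over m = 1..M of e(f(m))."""
    return sum(e(f(m)) for m in range(1, M + 1))

def differenced_sum(f, M: int) -> complex:
    """Sum over |h| < M and m in I_1(h) of e(f(m + h) - f(m))."""
    total = 0j
    for h in range(1 - M, M):
        lo, hi = max(1, 1 - h), min(M, M - h)   # I_1(h) = [1, M] intersected with [1 - h, M - h]
        total += sum(e(f(m + h) - f(m)) for m in range(lo, hi + 1))
    return total

f = lambda x: sqrt(2) * x ** 3 + 0.3 * x        # test polynomial, chosen for illustration
M = 30
assert abs(abs(weyl_sum(f, M)) ** 2 - differenced_sum(f, M)) < 1e-6
```

Note that the differenced polynomial $f(m + h) - f(m)$ has degree one less than $f$ in $m$, which is what drives the induction.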

Proof. We proceed by induction on $j$. For $j = 1$ we have

\[ |S_f(M)|^2 = \sum_{m=1}^{M}\sum_{h_1 = 1 - m}^{M - m} e\bigl(\Delta_1(f; h_1)(m)\bigr) = \sum_{h_1 = 1 - M}^{M - 1}\ \sum_{m \in I_1(h_1)} e\bigl(\Delta_1(f; h_1)(m)\bigr), \]

where $I_1(h_1) = [1, M] \cap [1 - h_1, M - h_1]$. Now suppose the lemma holds for a certain value of $j$. By the Cauchy-Schwarz inequality,

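\[ |S_f(M)|^{2^{j+1}} \le (2M)^{2^{j+1} - 2j - 2}(2M)^j\sum_{|h_1| < M}\cdots\sum_{|h_j| < M} |S(\vec{h})|^2. \]

Applying the case $j = 1$ to the sum $S(\vec{h})$ (with $\Delta_j(f; h_1, \dots, h_j)$ in place of $f$ and $I_j(\vec{h})$ in place of $[1, M]$) gives

\[ |S(\vec{h})|^2 = \sum_{|h_{j+1}| < M}\ \sum_{m \in I_{j+1}(\vec{h}, h_{j+1})} e\bigl(\Delta_{j+1}(f; h_1, \dots, h_{j+1})(m)\bigr), \]

where $I_{j+1}(\vec{h}, h_{j+1}) = I_j(\vec{h}) \cap (I_j(\vec{h}) - h_{j+1})$, and the lemma follows for $j + 1$.

Theorem 2.4.5 (Weyl's inequality). Let $k \ge 2$ and let $f(x) = \alpha x^k + \alpha_1x^{k-1} + \dots + \alpha_k$ be a real polynomial. Suppose that for some $a, q$ with $(a, q) = 1$ we have $|\alpha - a/q| \le 1/q^2$. Then for any $\varepsilon > 0$, we have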

\[ S_f(M) \ll_{k,\varepsilon} M^{1+\varepsilon}\Bigl(\frac{1}{q} + \frac{1}{M} + \frac{q}{M^k}\Bigr)^{1/K}, \]

where $K = 2^{k-1}$.

Proof. We will apply the previous lemma (Lemma 2.4.4) with $j = k - 1$:

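\[ |S_f(M)|^K \le (2M)^{K-k}\sum_{\substack{h_1, \dots, h_{k-1} \\ |h_i| < M}}\ \sum_{m \in I_{k-1}(\vec{h})} e\bigl(\Delta_{k-1}(f; h_1, \dots, h_{k-1})(m)\bigr). \]

Write $h = h_1\cdots h_{k-1}$. Since $\Delta_{k-1}(f; h_1, \dots, h_{k-1})(m) = h\bigl(k!\alpha(m + h_1/2 + \dots + h_{k-1}/2) + (k-1)!\alpha_1\bigr)$, Proposition 2.1.1 gives, for $h \ne 0$,

\[ \Bigl|\sum_{m \in I_{k-1}(\vec{h})} e\bigl(h(k!\alpha(m + h_1/2 + \dots + h_{k-1}/2) + (k-1)!\alpha_1)\bigr)\Bigr| = \Bigl|\sum_{m \in I_{k-1}(\vec{h})} e(hk!\alpha m)\Bigr| \ll \min\Bigl(M, \frac{1}{\|\alpha\tilde{h}\|}\Bigr), \]

where $\tilde{h} = k!h$. The tuples with $h = 0$ contribute $\ll M^{k-1}$ to the total sum, and each nonzero value of $h$ is attained by $\ll_{k,\varepsilon} M^\varepsilon$ tuples. For the total sum, we get

\[ \begin{aligned} |S_f(M)|^K &\ll_{k,\varepsilon} (2M)^{K-k}\Bigl(M^{k-1} + M^\varepsilon\sum_{h=1}^{k!M^{k-1}}\min\Bigl(M, \frac{1}{\|\alpha h\|}\Bigr)\Bigr) \\ &\ll_{k,\varepsilon} M^{K-k+\varepsilon}\Bigl(M^{k-1} + \sum_{h=1}^{k!M^{k-1}}\min\Bigl(\frac{M^k}{h}, \frac{1}{\|\alpha h\|}\Bigr)\Bigr) \\ &\ll_{k,\varepsilon} M^{K-k+\varepsilon}\Bigl(M^{k-1} + M^k\Bigl(\frac{1}{q} + \frac{k!}{M} + \frac{q}{M^k}\Bigr)\log(2k!M^{k-1}q)\Bigr) \\ &\ll_{k,\varepsilon} M^{K+2\varepsilon}\Bigl(\frac{1}{q} + \frac{1}{M} + \frac{q}{M^k}\Bigr), \end{aligned} \]

by Lemma 2.4.2, and if we assume for example that $q \le M^k$ to bound the logarithm. (If $q > M^k$, then the result follows trivially via the triangle inequality.)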

We remark that this inequality is only useful if we have some lower and upper bounds on $q$; otherwise the trivial bound $|S_f(M)| \le M$ might be better. We have for example the following corollary.

Corollary 2.4.6. Under the same conditions as in Theorem 2.4.5 and the extra condition that $M \ll q \ll M^{k-1}$, we have that for any $\varepsilon > 0$

\[ S_f(M) \ll_{k,\varepsilon} M^{1+\varepsilon-c(k)}, \quad\text{where } c(k) = \frac{1}{2^{k-1}}. \]

2.4.2 Vinogradov's method

A new method for estimating Weyl sums was developed in the 1930s by I.M. Vinogradov [38, 39]. Like Weyl's method, the idea is to reduce the sum to exponential sums over linear polynomials, and to use Proposition 2.1.1 to estimate them. How this reduction is achieved, however, is quite different from Weyl's method (where it was achieved by successive differencing of the polynomial as a result of successive squaring of the exponential sum). In Vinogradov's method, this reduction is done by an application of Vinogradov's mean value theorem, a bound on the number of solutions of a certain system of Diophantine equations. Vinogradov's method can be briefly summarised as follows:

1. Shift the range of summation over some integer h, and average over the h in some chosen (multi)set H.

2. Raise to a high power and apply Hölder's inequality to obtain a large number of variables.

3. Reduce the polynomials in the newly created variables to linear ones by applying Vinogradov’s mean value theorem.

4. Estimate the exponential sums over the linear polynomials by e.g. Proposition 2.1.1.

To illustrate this method, we will estimate the exponential sum

\[ S_f(M) = \sum_{m=1}^{M} e(f(m)), \]

where $f(x) = \alpha_kx^k + \dots + \alpha_1x + \alpha_0$ is a polynomial of degree $k > 1$. The following discussion is based on a section from the book by Iwaniec and Kowalski [23, Section 8.5]. First, we shift the summation over a positive integer $h$:

\[ S_f(M) = \sum_{m=1-h}^{M-h} e(f(h + m)) = \sum_{m=1}^{M} e(f(h + m)) + 2h\theta_h, \]

for some $\theta_h$ with $|\theta_h| \le 1$. We choose $\mathcal{H} = \{xy \mid 1 \le x, y \le H\}$, where each number $h = xy$ occurs with multiplicity equal to the number of such representations, for some $H$ with $H^2 < M$ (typically we will put $H = M^a$ with $a < 1/2$). The motivation for this particular choice of $\mathcal{H}$ is that after averaging this creates a bilinear sum. Indeed, denote $\vec{\gamma}(x) = (x, x^2, \dots, x^k)$, then

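\[ f(xy + m) = B_m(\vec{\gamma}(x), \vec{\gamma}(y)) + \alpha_0^{(m)}, \quad\text{where } B_m(\vec{a}, \vec{b}) = \sum_{j=1}^{k}\alpha_j^{(m)}a_jb_j, \]

and where the coefficients $\alpha_j^{(m)}$ equal

\[ \alpha_j^{(m)} = \sum_{i=j}^{k}\binom{i}{j}\alpha_im^{i-j}. \]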
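The decomposition $f(xy + m) = B_m(\vec{\gamma}(x), \vec{\gamma}(y)) + \alpha_0^{(m)}$ is a direct consequence of the binomial theorem and can be checked exactly with rational arithmetic. In the sketch below the coefficient list and the function names are illustrative assumptions.

```python
from fractions import Fraction
from math import comb

def shifted_coeffs(alpha, m):
    """alpha[i] is the coefficient of x^i in f; returns [alpha_j^(m) for j = 0..k]."""
    k = len(alpha) - 1
    return [sum(comb(i, j) * alpha[i] * m ** (i - j) for i in range(j, k + 1))
            for j in range(k + 1)]

def f(alpha, x):
    """Evaluate the polynomial with coefficient list alpha at x."""
    return sum(c * x ** i for i, c in enumerate(alpha))

def B(coeffs, x, y):
    """B_m(gamma(x), gamma(y)) = sum over j = 1..k of alpha_j^(m) * x^j * y^j."""
    return sum(coeffs[j] * x ** j * y ** j for j in range(1, len(coeffs)))

alpha = [Fraction(1, 3), Fraction(-2, 5), Fraction(7, 2), Fraction(1, 4)]   # example cubic
c = shifted_coeffs(alpha, 2)
assert f(alpha, 3 * 4 + 2) == B(c, 3, 4) + c[0]
```

In particular $\alpha_0^{(m)} = f(m)$ and $\alpha_k^{(m)} = \alpha_k$, the latter being the fact used in the proof of Theorem 2.4.8.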

As shown below, the bilinearity of $B_m$ allows us to use Vinogradov's mean value theorem, which will reduce the sums to exponential sums of the form

\[ \sum_{\vec{\sigma}}\Bigl|\sum_{\vec{\lambda}} e\bigl(B_m(\vec{\sigma}, \vec{\lambda})\bigr)\Bigr|, \]

where the argument of the exponential is linear instead of polynomial in $\vec{\sigma}$ and $\vec{\lambda}$, and which therefore can be estimated by Proposition 2.1.1. Averaging over the $h \in \mathcal{H}$ yields

\[ S_f(M) = \frac{1}{H^2}\sum_{m=1}^{M} S_m + 2H^2\theta, \tag{2.6} \]

where

\[ S_m = \sum_{x=1}^{H}\sum_{y=1}^{H} e\bigl(B_m(\vec{\gamma}(x), \vec{\gamma}(y)) + \alpha_0^{(m)}\bigr) \tag{2.7} \]

and $|\theta| \le 1$. We will estimate the sums $S_m$, and then use these estimates to obtain an estimate for $S_f(M)$ from (2.6) and the triangle inequality. First we raise $S_m$ to a high power $l$ and apply Hölder's inequality to obtain

\[ \begin{aligned} |S_m|^l &\le H^{l-1}\sum_{x=1}^{H}\Bigl|\sum_{y=1}^{H} e\bigl(B_m(\vec{\gamma}(x), \vec{\gamma}(y))\bigr)\Bigr|^l \\ &= H^{l-1}\sum_{x=1}^{H}\zeta_x\sum_{y_1, \dots, y_l} e\bigl(B_m(\vec{\gamma}(x), \vec{\gamma}(y_1)) + \dots + B_m(\vec{\gamma}(x), \vec{\gamma}(y_l))\bigr) \\ &= H^{l-1}\sum_{x=1}^{H}\zeta_x\sum_{y_1, \dots, y_l} e\bigl(B_m(\vec{\gamma}(x), \vec{\gamma}(y_1) + \dots + \vec{\gamma}(y_l))\bigr). \end{aligned} \]

In the first line we dropped the constant term $\alpha_0^{(m)}$, in the second line we introduced $\zeta_x$, the opposite phase of the inner sum, to get rid of the absolute value, and in the third line we used the bilinearity of $B_m$. Now for integers $\lambda_1, \dots, \lambda_k$ denote by $\nu(\vec{\lambda})$ the number of solutions to the system of equations

\[ \begin{cases} y_1 + \dots + y_l = \lambda_1 \\ y_1^2 + \dots + y_l^2 = \lambda_2 \\ \quad\vdots \\ y_1^k + \dots + y_l^k = \lambda_k \end{cases} \]

in integers $y_i$ with $1 \le y_i \le H$. Using this notation, we have that

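\[ |S_m|^l \le H^{l-1}\sum_{\vec{\lambda}}\nu(\vec{\lambda})\Bigl|\sum_{x=1}^{H}\zeta_xe\bigl(B_m(\vec{\gamma}(x), \vec{\lambda})\bigr)\Bigr|, \tag{2.8} \]

where we sum the $\lambda_j$ in the range $l \le \lambda_j \le lH^j$. To decouple the weights $\nu$ from the inner sums, we raise to the power $2l$ and apply Hölder's inequality again to get

\[ |S_m|^{2l^2} \le H^{2l(l-1)}\Bigl(\sum_{\vec{\lambda}}\nu(\vec{\lambda})^{\frac{2l}{2l-1}}\Bigr)^{2l-1}\Bigl(\sum_{\vec{\lambda}}\Bigl|\sum_{x=1}^{H}\zeta_xe\bigl(B_m(\vec{\gamma}(x), \vec{\lambda})\bigr)\Bigr|^{2l}\Bigr). \tag{2.9} \]

To treat the sum over the weights $\nu$, we again apply Hölder's inequality to get sums which we can interpret combinatorially:

\[ \Bigl(\sum_{\vec{\lambda}}\nu^{\frac{2l}{2l-1}}\Bigr)^{2l-1} = \Bigl(\sum_{\vec{\lambda}}\nu^{\frac{2l-2}{2l-1}}\nu^{\frac{2}{2l-1}}\Bigr)^{2l-1} \le \Bigl(\sum_{\vec{\lambda}}\nu\Bigr)^{2l-2}\Bigl(\sum_{\vec{\lambda}}\nu^2\Bigr). \]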

The first sum just counts all the possible values for $y_i$ with $1 \le y_i \le H$ and hence equals $H^l$. The second sum equals $J_l^{(k)}(H; \vec{0})$, where $J_l^{(k)}(X; \vec{\sigma})$ is the number of solutions to the system of equations

\[ \begin{cases} y_1 + \dots + y_l - z_1 - \dots - z_l = \sigma_1 \\ y_1^2 + \dots + y_l^2 - z_1^2 - \dots - z_l^2 = \sigma_2 \\ \quad\vdots \\ y_1^k + \dots + y_l^k - z_1^k - \dots - z_l^k = \sigma_k \end{cases} \tag{2.10} \]

in integers $y_i, z_i$ with $1 \le y_i, z_i \le X$. The core element which makes Vinogradov's method for Weyl sums successful is a non-trivial estimate for $J_l^{(k)}$, which is referred to as a Vinogradov mean value theorem. Such a theorem will be given in Subsection 2.4.3. Observe that by the orthogonality of the exponentials, $J_l^{(k)}(X; \vec{\sigma})$ equals

\[ \int_0^1\cdots\int_0^1\Bigl|\sum_{y \le X} e(u_1y + \dots + u_ky^k)\Bigr|^{2l}e(u_1\sigma_1 + \dots + u_k\sigma_k)\,du_1\cdots du_k, \]

which by the triangle inequality is

\[ \le \int_0^1\cdots\int_0^1\Bigl|\sum_{y \le X} e(u_1y + \dots + u_ky^k)\Bigr|^{2l}\,du_1\cdots du_k = J_l^{(k)}(X), \]

where we write $J_l^{(k)}(X)$ for $J_l^{(k)}(X; \vec{0})$.
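For very small parameters the quantities $\nu(\vec{\lambda})$ and $J_l^{(k)}(X; \vec{\sigma})$ can be computed by enumeration, which makes the two combinatorial facts used above ($\sum\nu = H^l$ and $\sum\nu^2 = J_l^{(k)}(H)$) and the inequality $J_l^{(k)}(X; \vec{\sigma}) \le J_l^{(k)}(X)$ concrete. A brute-force sketch with illustrative function names:

```python
from collections import Counter
from itertools import product

def power_sum_counts(H: int, l: int, k: int) -> Counter:
    """nu(lambda): number of (y_1..y_l) in [1, H]^l with sum of y_i^j = lambda_j, j = 1..k."""
    counts = Counter()
    for ys in product(range(1, H + 1), repeat=l):
        counts[tuple(sum(y ** j for y in ys) for j in range(1, k + 1))] += 1
    return counts

def J(H: int, l: int, k: int, sigma=None) -> int:
    """J_l^(k)(H; sigma), computed as the sum over lambda of nu(lambda) * nu(lambda - sigma)."""
    nu = power_sum_counts(H, l, k)
    sigma = sigma if sigma is not None else (0,) * k
    return sum(v * nu.get(tuple(a - s for a, s in zip(lam, sigma)), 0)
               for lam, v in nu.items())

nu = power_sum_counts(4, 2, 2)
assert sum(nu.values()) == 4 ** 2                       # first sum equals H^l
assert J(4, 2, 2) == sum(v * v for v in nu.values())    # second sum equals J(H; 0)
```

The enumeration costs $H^l$ steps, so this is only feasible for toy values; the point of the mean value theorem is precisely to bound $J_l^{(k)}$ without enumeration.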

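The second sum in (2.9) equals

\[ \sum_{\vec{\lambda}}\sum_{x_1, \dots, x_{2l}}\zeta_{x_1}\cdots\zeta_{x_l}\overline{\zeta_{x_{l+1}}}\cdots\overline{\zeta_{x_{2l}}}\,e\bigl(B_m(\vec{\gamma}(x_1) + \dots + \vec{\gamma}(x_l) - \vec{\gamma}(x_{l+1}) - \dots - \vec{\gamma}(x_{2l}), \vec{\lambda})\bigr), \]

which, since the weights $\zeta_x$ have absolute value 1, is bounded by

\[ J_l^{(k)}(H)\sum_{\vec{\sigma}}\Bigl|\sum_{\vec{\lambda}} e\bigl(B_m(\vec{\sigma}, \vec{\lambda})\bigr)\Bigr|, \]

where the $\lambda_j$ resp. $\sigma_j$ run independently over the ranges $l \le \lambda_j \le lH^j$ resp. $|\sigma_j| \le lH^j$. This sum factors: denote

\[ D(\alpha, X) = \frac{1}{X^2}\sum_{|x| \le X}\Bigl|\sum_{1 \le y \le X} e(\alpha xy)\Bigr|, \tag{2.11} \]

then we have that the second sum in (2.9) is bounded by

\[ J_l^{(k)}(H)\,l^{2k}H^{k(k+1)}\prod_{j=1}^{k} D\bigl(\alpha_j^{(m)}, lH^j\bigr). \tag{2.12} \]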

Combining equations (2.9) and (2.12) we get that

\[ |S_m|^{2l^2} \le l^{2k}H^{4l(l-1)+k(k+1)}J_l^{(k)}(H)^2\prod_{j=1}^{k} D\bigl(\alpha_j^{(m)}, lH^j\bigr). \tag{2.13} \]

As in Weyl's method, the gain over the trivial estimate $M$ in the estimation of $S_f(M)$ will come from the estimation of exponential sums over (bi)linear polynomials, (2.11). The sums $D$ are trivially bounded by 3, but if we have additional information on the coefficients $\alpha_j^{(m)}$ (for example lower and upper bounds, or Diophantine properties), then by Proposition 2.1.1, we might get $\prod D \ll R(H)$ for some function $R$ with $R = o(1)$. On the other hand, the reason why Vinogradov's method is so successful is that we have strong upper bounds for $J_l^{(k)}$. In the following subsection we will prove the bound

\[ J_l^{(k)}(X) \ll_{k,l} X^{2l - \frac{1}{2}k(k+1) + \eta_{k,l}}, \tag{2.14} \]

where $\eta_{k,l} = k^2(1 - 1/k)^{[l/k]}/2$. Raising equation (2.13) to the power $1/(2l^2)$ and plugging this in (2.6) yields

\[ S_f(M) \ll_{k,l} M\Bigl(H^{2\eta_{k,l}}\prod D\Bigr)^{\frac{1}{2l^2}} + H^2. \tag{2.15} \]

For each specific situation we will choose a suitable value of $l$ such that $\eta_{k,l}$ is sufficiently small, in order to prevent that $H^{2\eta_{k,l}}$ negates the possible gain obtained in the estimation of $\prod D$. Since $\eta_{k,l}$ decays exponentially as $l \to \infty$ (see footnote 2), in applications it often suffices to take $l$ of polynomial growth in $k$, for example $l$ of order $k^2$. This means that, except for some small values of $k$, Vinogradov's method is an improvement over Weyl's method. In Weyl's method, the reduction to linear polynomials was achieved by successive squaring, and this resulted in a gain $M^{-1/K}$ where $K = 2^{k-1}$, which is exponential in $k$. In Vinogradov's method, one only needs to raise to the power $2l^2$, which is of polynomial growth in $k$, and we will typically have a gain of the form $M^{-c/l^2}$ for some positive constant $c$.

We will now make explicit how one can bound the sums $D$ in two cases. The first case is when we have lower and upper bounds on some of the coefficients $\alpha_j^{(m)}$.

Lemma 2.4.7. For any $J \subseteq \{1, \dots, k\}$ we have

\[ \prod_{j=1}^{k} D\bigl(\alpha_j^{(m)}, lH^j\bigr) \le \prod_{j \in J}\Bigl(|\alpha_j^{(m)}| + \frac{1}{|\alpha_j^{(m)}|\,l^2H^{2j}}\Bigr)(8k\log(lH))^k. \]

Footnote 2. We have the explicit inequality $\eta_{k,l} \le k^2e^{-l/k^2}$.

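Proof. By Proposition 2.1.1 we have that

\[ X^2D(\alpha, X) \le \sum_{|x| \le X}\min\Bigl(X, \frac{1}{2\|\alpha x\|}\Bigr). \]

Now suppose $I$ is an interval of length $1/|\alpha|$. Then by comparing with an integral,

\[ \begin{aligned} \sum_{x \in I}\min\Bigl(X, \frac{1}{2\|\alpha x\|}\Bigr) &\le 2X + 2\int_0^{1/(2|\alpha|)}\min\Bigl(X, \frac{1}{2|\alpha|t}\Bigr)dt \\ &\le 2X + 2\Bigl(X\,\frac{1}{2|\alpha|X} + \int_{1/(2|\alpha|X)}^{1/(2|\alpha|)}\frac{dt}{2|\alpha|t}\Bigr) \\ &\le 2\Bigl(X + \frac{1}{|\alpha|}\log X\Bigr) \le 2\Bigl(X + \frac{1}{|\alpha|}\Bigr)\log X. \end{aligned} \]

Therefore,

\[ \begin{aligned} D(\alpha, X) &\le \frac{1}{X^2}(2|\alpha|X + 1)\,2\Bigl(X + \frac{1}{|\alpha|}\Bigr)\log X \\ &\le 4\Bigl(|\alpha| + \frac{1}{X} + \frac{1}{X} + \frac{1}{|\alpha|X^2}\Bigr)\log X \le 8\Bigl(|\alpha| + \frac{1}{|\alpha|X^2}\Bigr)\log X. \end{aligned} \]

Since each $D$ is also bounded by 3, we have that for any $J \subseteq \{1, \dots, k\}$

\[ \prod_{j=1}^{k} D\bigl(\alpha_j^{(m)}, lH^j\bigr) \le \prod_{j \in J}\Bigl(|\alpha_j^{(m)}| + \frac{1}{|\alpha_j^{(m)}|\,l^2H^{2j}}\Bigr)(8k\log(lH))^k. \]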
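The two bounds on $D(\alpha, X)$ used in this proof, the trivial bound 3 and the bound $8(|\alpha| + 1/(|\alpha|X^2))\log X$, can be compared with a direct evaluation of (2.11). A small numerical sketch; the sample values of $\alpha$ and $X$ are arbitrary illustrative choices.

```python
import cmath
from math import log

def D(alpha: float, X: int) -> float:
    """D(alpha, X) = X^(-2) * sum over |x| <= X of |sum over 1 <= y <= X of e(alpha*x*y)|."""
    total = 0.0
    for x in range(-X, X + 1):
        total += abs(sum(cmath.exp(2j * cmath.pi * alpha * x * y) for y in range(1, X + 1)))
    return total / X ** 2

def explicit_bound(alpha: float, X: int) -> float:
    """8 * (|alpha| + 1/(|alpha| * X^2)) * log X, as in the proof of Lemma 2.4.7."""
    return 8 * (abs(alpha) + 1 / (abs(alpha) * X ** 2)) * log(X)

for alpha in (0.01, 0.05, 0.2):
    for X in (20, 50):
        d = D(alpha, X)
        assert d <= 3
        assert d <= explicit_bound(alpha, X) + 1e-9
```

The explicit bound only improves on the trivial one when $|\alpha|$ is neither too large nor too small compared with $1/X^2$, which is why lower and upper bounds on the coefficients are needed.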

For the second case, we suppose that we can approximate $\alpha_k$ by $a/q$. Then we have the following analog of Weyl's inequality (Theorem 2.4.5).

Theorem 2.4.8. Suppose $f(x) = \alpha_kx^k + \dots + \alpha_1x + \alpha_0$ for some real numbers $\alpha_j$ and $k > 1$. Suppose also that for some $a, q$ with $(a, q) = 1$ we have $|\alpha_k - a/q| \le 1/q^2$. Then for any $H \le \sqrt{M}$ and for any positive integer $l$ we have

\[ S_f(M) \ll_{k,l} M\Bigl(H^{2\eta_{k,l}}\Bigl(\frac{1}{H^k} + \frac{1}{q} + \frac{q}{H^{2k}}\Bigr)(\log 2qH)\Bigr)^{\frac{1}{2l^2}} + H^2. \]

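Proof. Note that for every $m$, $\alpha_k^{(m)} = \alpha_k$. By Proposition 2.1.1 and Lemma 2.4.2, we have

\[ \begin{aligned} D\bigl(\alpha_k^{(m)}, lH^k\bigr) &= \frac{1}{l^2H^{2k}}\sum_{|x| \le lH^k}\Bigl|\sum_{1 \le y \le lH^k} e(\alpha_kxy)\Bigr| \\ &\le \frac{1}{l^2H^{2k}}\sum_{|x| \le lH^k}\min\Bigl(lH^k, \frac{1}{2\|\alpha_kx\|}\Bigr) \\ &\ll \frac{1}{lH^k} + \Bigl(\frac{1}{lH^k} + \frac{1}{q} + \frac{q}{l^2H^{2k}}\Bigr)(\log 2qH^k) \\ &\ll_{k,l} \Bigl(\frac{1}{H^k} + \frac{1}{q} + \frac{q}{H^{2k}}\Bigr)(\log 2qH). \end{aligned} \]

The theorem now follows from equation (2.15).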

If we have some lower and upper bounds on the denominator q, we can deduce the following estimate.

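Corollary 2.4.9. Suppose that for some $a, q$ with $(a, q) = 1$ we have $|\alpha_k - a/q| \le 1/q^2$, and that $M \ll q \ll M^{k-1}$. Then for any $\varepsilon > 0$, we have

\[ S_f(M) \ll_{k,\varepsilon} M^{1+\varepsilon-c(k)}, \quad\text{where } c(k) = \frac{1}{8\lceil 2k^2\log(2k)\rceil^2}. \]

Proof. We choose

\[ H = M^{\frac{1}{2} - \frac{1}{4k}} \quad\text{and}\quad l = \lceil 2k^2\log(2k)\rceil. \]

By Theorem 2.4.8,

\[ S_f(M) \ll_{k,\varepsilon} M^{1+\varepsilon}\bigl(M^{\eta_{k,l}}M^{-1/2}\bigr)^{\frac{1}{2l^2}} + M^{1-1/(2k)}, \]

where we bounded $\log 2qH \ll_\varepsilon M^\varepsilon$. The result now follows since

\[ \eta_{k,l} \le k^2e^{-l/k^2} \le k^2\exp(-\log(4k^2)) \le 1/4. \]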
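The parameter choice in this proof, and the comparison between the exponents $c(k)$ of Corollary 2.4.6 (Weyl) and Corollary 2.4.9 (Vinogradov), can be tabulated directly. The following sketch uses illustrative function names and simply evaluates the formulas stated in the text.

```python
from math import ceil, exp, log

def eta(k: int, l: int) -> float:
    """eta_{k,l} = k^2 * (1 - 1/k)^[l/k] / 2."""
    return 0.5 * k * k * (1 - 1 / k) ** (l // k)

def vinogradov_l(k: int) -> int:
    """The choice l = ceil(2 k^2 log(2k)) from the proof of Corollary 2.4.9."""
    return ceil(2 * k * k * log(2 * k))

def c_vinogradov(k: int) -> float:
    """Exponent saving c(k) = 1 / (8 l^2) in Corollary 2.4.9."""
    return 1 / (8 * vinogradov_l(k) ** 2)

def c_weyl(k: int) -> float:
    """Exponent saving c(k) = 1 / 2^(k-1) in Corollary 2.4.6."""
    return 1 / 2 ** (k - 1)

for k in range(2, 41):
    l = vinogradov_l(k)
    assert eta(k, l) <= k * k * exp(-l / k ** 2) <= 0.25 + 1e-12

assert c_weyl(5) > c_vinogradov(5)       # Weyl's saving is larger for small k
assert c_vinogradov(40) > c_weyl(40)     # Vinogradov's saving is larger for large k
```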

We conclude this subsection on Vinogradov’s method by mentioning a refinement in the second case where we have Diophantine information on some of the coefficients αj. This refinement is achieved via the following large sieve-type inequality.

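Theorem 2.4.10. Suppose $\delta_1, \dots, \delta_d > 0$, $N_1, \dots, N_d \in \mathbb{N}$ and define

\[ \mathcal{N} = \{(n_1, \dots, n_d) \in \mathbb{N}^d \mid 1 \le n_j \le N_j\}. \]

Suppose $\mathcal{B}$ is a finite set of points of $\mathbb{R}^d$ which are $\vec{\delta}$-spaced mod 1, i.e. $\forall \vec{\beta} \ne \vec{\beta}' \in \mathcal{B}\ \exists j : \|\beta_j - \beta_j'\| \ge \delta_j$. Then for any complex numbers $a(\vec{n})$

\[ \sum_{\vec{\beta} \in \mathcal{B}}\Bigl|\sum_{\vec{n} \in \mathcal{N}} a(\vec{n})e(\vec{\beta}\cdot\vec{n})\Bigr|^2 \ll \sum_{\vec{n} \in \mathcal{N}}|a(\vec{n})|^2\prod_{j=1}^{d}(N_j + \delta_j^{-1}), \]

and the implicit constant depends only on $d$.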

If we have some Diophantine information on some of the coefficients $\alpha_j$, we can use Theorem 2.4.10 instead of Hölder's inequality to decouple the weights $\nu(\vec{\lambda})$ from the inner sum in equation (2.8). This is more efficient, since it only requires squaring instead of raising to the power $2l$. This however introduces a lot of technicalities which need to be addressed. Since the current section is already quite technical (and since the extra technicalities might obscure the general ideas of Vinogradov's method), we do not go into any further detail, and mention without proof the Weyl-type estimate which one can obtain. Note that this produces the better exponent $1/(2l)$ instead of the exponent $1/(2l^2)$ in Theorem 2.4.8. For details and the proof of Theorem 2.4.10, we refer to [37, Section 5.2].

Theorem 2.4.11. Suppose that for some $j$ with $2 \le j \le k$ there exist $a$ and $q$ with $q \le M^j$, $(a, q) = 1$, and $|\alpha_j - a/q| \le 1/q^2$. Then for any integer $l$

\[ S_f(M) \ll_{k,l} M\Bigl(M^{\eta_{k-1,l}}\Bigl(\frac{1}{M} + \frac{1}{q} + \frac{q}{M^j}\Bigr)\Bigr)^{\frac{1}{2l}}\log(2M). \]

2.4.3 Vinogradov’s mean value theorem

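We recall that $J_l^{(k)}(X)$ is the number of solutions of

\[ \begin{aligned} &x_1^j + \dots + x_l^j = y_1^j + \dots + y_l^j \quad (1 \le j \le k), \\ &x_i, y_i \in \mathbb{Z}, \quad 1 \le x_i, y_i \le X. \end{aligned} \tag{2.16} \]

Define

\[ U_k = \,]0, 1]^k, \qquad f(\vec{\alpha}) = \sum_{x \le X} e(\alpha_1x + \dots + \alpha_kx^k). \]

As noted before, by the orthogonality of the exponentials we have the following analytic representation for $J_l^{(k)}(X)$:

\[ J_l^{(k)}(X) = \int_{U_k}|f(\vec{\alpha})|^{2l}\,d\vec{\alpha}. \]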

By a combinatorial argument, one easily gets the following lower bound for $J_l^{(k)}(X)$. If one takes the $x_i$ arbitrary and the $y_i$ equal to (a permutation of) the $x_i$, we see that $J_l^{(k)}(X) \ge [X]^l$. On the other hand, consider the system of equations (2.10), where the $\sigma_j$ are considered as additional variables. The number of solutions of this equals $[X]^{2l}$ (the values of $x_i$ and $y_i$ completely determine the values of the $\sigma_j$). This can also be expressed as

\[ [X]^{2l} = \sum_{\vec{\sigma}} J_l^{(k)}(X; \vec{\sigma}), \]

where the sum is over the range $|\sigma_j| \le lX^j$. Using the inequality $J_l^{(k)}(X; \vec{\sigma}) \le J_l^{(k)}(X)$, we get that

\[ [X]^{2l} \ll l^kX^{\frac{1}{2}k(k+1)}J_l^{(k)}(X), \]

so that

\[ J_l^{(k)}(X) \gg \max\bigl(X^{2l - \frac{1}{2}k(k+1)}, X^l\bigr). \]

The main conjecture in Vinogradov's mean value theorem is that for any $\varepsilon > 0$

\[ J_l^{(k)}(X) \ll_{k,l,\varepsilon} \max\bigl(X^{2l - \frac{1}{2}k(k+1) + \varepsilon}, X^{l+\varepsilon}\bigr). \]

This conjecture was recently proved by the work of Wooley [45] and Bourgain, Demeter and Guth [2]. We will prove the following weaker form of Vinogradov's mean value theorem, which was first established by Karatsuba [25] and Stechkin [34].

Theorem 2.4.12. For $l \ge k$,

\[ J_l^{(k)}(X) \ll_{k,l} X^{2l - \frac{1}{2}k(k+1) + \eta_{k,l}}, \tag{2.17} \]

where

\[ \eta_{k,l} = \frac{1}{2}k^2\Bigl(1 - \frac{1}{k}\Bigr)^{[l/k]} \le k^2e^{-l/k^2}. \]

The core of the proof is to relate $J_l^{(k)}(X)$ to $J_{l-k}^{(k)}(X/p)$ for a suitable prime number $p$. We begin with a lemma of Linnik.

Lemma 2.4.13. Suppose $p$ is a prime with $p > k$. Let $A(p; \vec{h})$ be the number of solutions of

\[ \sum_{r=1}^{k} n_r^j \equiv h_j \bmod p^j \quad (1 \le j \le k), \]

with $n_r \le p^k$ and the $n_r$ distinct mod $p$. Then $A(p; \vec{h}) \le k!\,p^{k(k-1)/2}$.

Proof. Denote by $B(p; \vec{g})$ the number of solutions of

\[ \sum_{r=1}^{k} n_r^j \equiv g_j \bmod p^k \quad (1 \le j \le k), \tag{2.18} \]

with $n_r \le p^k$ and the $n_r$ distinct mod $p$. Then $A(p; \vec{h})$ is the sum of the $B(p; \vec{g})$ with $g_j \equiv h_j \bmod p^j$ and $1 \le g_j \le p^k$ $(1 \le j \le k)$. For fixed $h_1, \dots, h_k$, the number of choices for the $g_j$ equals

\[ p^{k-1}p^{k-2}\cdots p^1 = p^{\frac{1}{2}k(k-1)}. \]

It therefore suffices to show that $B(p; \vec{g}) \le k!$, and this will be shown by proving that every solution of (2.18) is a permutation of a given fixed solution. Consider such a fixed solution $n_1, \dots, n_k$, and suppose $m_1, \dots, m_k$ is another solution. Consider the polynomial

\[ P(x) = \prod_{r=1}^{k}(x - n_r), \]

which equals

\[ P(x) = \sum_{r=0}^{k}(-1)^{k-r}e_{k-r}x^r, \]

where the $e_r$ are the elementary symmetric polynomials in the $n_r$ (with $e_0 = 1$):

\[ e_r = \sum_{1 \le i_1 < \dots < i_r \le k} n_{i_1}\cdots n_{i_r}. \]

By Newton's identities, the $e_r$ are related to the power sums $p_j$ via

\[ re_r = \sum_{j=1}^{r}(-1)^{j-1}e_{r-j}p_j \quad\text{where } p_j = \sum_{r=1}^{k} n_r^j. \]

Now if $r \le k$, then $r$ is invertible mod $p^k$ ($k < p$), and since the $p_j$ are the same mod $p^k$ for the $n_r$ and the $m_r$ (they are both a solution of (2.18)), the same holds for the $e_r$, so we have that

\[ P(x) \equiv \prod_{r=1}^{k}(x - m_r) \bmod p^k. \]

This implies that for every $r$ we have $P(m_r) \equiv 0 \bmod p^k$, so for every $r$ there exists an $s$ such that $m_r \equiv n_s \bmod p$. Since the $n_s$ are distinct mod $p$, this $s$ is unique. Hence,

\[ (m_r - n_s)\prod_{j \ne s}(m_r - n_j) \equiv 0 \bmod p^k. \]

For $j \ne s$, $m_r - n_j \not\equiv 0 \bmod p$, so $m_r - n_j$ is invertible mod $p^k$, so $m_r \equiv n_s \bmod p^k$. We conclude that for each $r$ there is a unique $s$ with $m_r = n_s$, and this proves the lemma.
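For very small $p$ and $k$, both counts in Linnik's lemma can be obtained by enumeration. The sketch below tabulates $A(p; \vec{h})$ and $B(p; \vec{g})$ over all admissible tuples and returns their maxima; the function name `linnik_max_counts` is an illustrative choice.

```python
from collections import Counter
from itertools import product
from math import factorial

def linnik_max_counts(p: int, k: int):
    """Return (max over h of A(p; h), max over g of B(p; g)) by brute force."""
    pk = p ** k
    A, B = Counter(), Counter()
    for ns in product(range(1, pk + 1), repeat=k):
        if len({n % p for n in ns}) < k:      # the n_r must be distinct mod p
            continue
        sums = [sum(n ** j for n in ns) for j in range(1, k + 1)]
        B[tuple(s % pk for s in sums)] += 1                               # congruences mod p^k
        A[tuple(s % p ** j for j, s in enumerate(sums, start=1))] += 1    # congruences mod p^j
    return max(A.values()), max(B.values())

max_A, max_B = linnik_max_counts(3, 2)
assert max_B <= factorial(2)                  # every solution is a permutation of a fixed one
assert max_A <= factorial(2) * 3 ** 1         # k! * p^(k(k-1)/2) with p = 3, k = 2
```

The enumeration has $p^{k^2}$ steps, so it is limited to tiny cases such as $k = 2$.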

We want to use this lemma to reduce the number of variables from $2l$ to $2(l - k)$. In order to do this, we first divide solutions of (2.16) into two categories. Let $R_1(\vec{h})$ be the number of solutions of

\[ \sum_{r=1}^{l} x_r^j = h_j \quad (1 \le j \le k), \]

with $1 \le x_r \le X$ and the first $k$ variables $x_1, \dots, x_k$ mutually distinct. Denote by $R_2(\vec{h})$ the number of solutions with at least two of the variables $x_1, \dots, x_k$ equal. Then

\[ J_l^{(k)}(X) = \sum_{\vec{h}}\bigl(R_1(\vec{h}) + R_2(\vec{h})\bigr)^2 \le 2I_1 + 2I_2, \]

where

\[ I_i = \sum_{\vec{h}} R_i(\vec{h})^2. \]

Observe that $I_i$ counts the number of solutions of (2.16) where $x_1, \dots, x_k$ and $y_1, \dots, y_k$ have no repeated values (resp. each sequence has at least one repeated value) if $i = 1$ (resp. $i = 2$). We can use Lemma 2.4.13 to reduce the number of variables for solutions in the first category (contributing to $I_1$). For this, we define a set $\mathcal{P}$ of prime numbers $p$ with $p > k$ such that if $x_1, \dots, x_k$ are distinct and $y_1, \dots, y_k$ are distinct, they will remain distinct mod $p$ for at least one prime $p \in \mathcal{P}$. At the same time, in view of the bound $k!\,p^{k(k-1)/2}$ from Lemma 2.4.13, we also want to be able to control the size of the primes in $\mathcal{P}$. These two demands are fulfilled if we choose $\mathcal{P}$ as the set of the first $k^2(k-1)$ prime numbers $p$ with $p > X^{1/k}$ and $p > k$. Also, define $\omega_0(n)$ as the number of distinct prime factors $p$ of $n$ for which $p > X^{1/k}$. Then

\[ \omega_0(n) \le \frac{\log n}{\log X^{1/k}}. \]

Indeed, given $m$, the smallest $n$ for which $\omega_0(n) = m$ is $n = p_1\cdots p_m$ (where $p_i$ is the $i$th prime above $X^{1/k}$), and for this $n$, the inequality holds.

Lemma 2.4.14.

\[ I_1 \le k!\,k^2(k-1)X^k\max_{p \in \mathcal{P}}\bigl(p^{2l+k(k-5)/2}J_{l-k}^{(k)}(X/p)\bigr). \]

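Proof. Consider a solution $x_1, \dots, x_l, y_1, \dots, y_l$ which contributes to $I_1$. Then there is at least one prime $p \in \mathcal{P}$ with the property that $x_1, \dots, x_k$ are distinct mod $p$ and $y_1, \dots, y_k$ are distinct mod $p$. To see this, define

\[ P(z_1, \dots, z_k) = \prod_{1 \le i < j \le k}(z_i - z_j). \]

Then

\[ \omega_0\bigl(P(x_1, \dots, x_k)\bigr) + \omega_0\bigl(P(y_1, \dots, y_k)\bigr) \le \frac{k}{\log X}\sum_{1 \le i < j \le k}\bigl(\log|x_i - x_j| + \log|y_i - y_j|\bigr) < k^2(k-1), \]

since each of the $k(k-1)$ logarithms is smaller than $\log X$. As $\mathcal{P}$ contains $k^2(k-1)$ primes larger than $X^{1/k}$, at least one $p \in \mathcal{P}$ divides neither $P(x_1, \dots, x_k)$ nor $P(y_1, \dots, y_k)$.

Therefore,

\[ I_1 \le \sum_{p \in \mathcal{P}} I_1(p), \tag{2.19} \]

where $I_1(p)$ is the number of solutions of (2.16) with $x_1, \dots, x_k$ distinct mod $p$ and $y_1, \dots, y_k$ distinct mod $p$. Define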

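\[ f_p(\vec{\alpha}, h) = \sum_{\substack{x \le X \\ x \equiv h \bmod p}} e(\alpha_1x + \dots + \alpha_kx^k) \quad\text{and}\quad \mathcal{H}_p = \{(h_1, \dots, h_k) \mid 1 \le h_r \le p \text{ and the } h_r \text{ distinct mod } p\}. \]

Then

\[ I_1(p) = \int_{U_k}\Bigl|\sum_{\vec{h} \in \mathcal{H}_p} f_p(\vec{\alpha}, h_1)\cdots f_p(\vec{\alpha}, h_k)\Bigr|^2\Bigl|\sum_{h \le p} f_p(\vec{\alpha}, h)\Bigr|^{2l-2k}d\vec{\alpha}. \]

By Hölder's inequality

\[ \Bigl|\sum_{h \le p} f_p(\vec{\alpha}, h)\Bigr|^{2l-2k} \le p^{2l-2k-1}\sum_{h \le p}|f_p(\vec{\alpha}, h)|^{2l-2k}, \]

so

\[ I_1(p) \le p^{2l-2k}\max_{h \le p}\int_{U_k}\Bigl|\sum_{\vec{h} \in \mathcal{H}_p} f_p(\vec{\alpha}, h_1)\cdots f_p(\vec{\alpha}, h_k)\Bigr|^2|f_p(\vec{\alpha}, h)|^{2l-2k}\,d\vec{\alpha}. \tag{2.20} \]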

The integral equals the number of solutions of (2.16) with $x_1,\dots,x_k$ distinct mod $p$, $y_1,\dots,y_k$ distinct mod $p$, and $x_r \equiv y_r \equiv h \bmod p$ for $k < r \le l$. Writing $x_{k+r} = pw_r + h$, $y_{k+r} = pz_r + h$ with $-h/p < z_r, w_r \le (X-h)/p$, this equals the number of solutions of

\[ \sum_{r=1}^{k} (x_r^j - y_r^j) = \sum_{r=1}^{l-k} \bigl( (pz_r + h)^j - (pw_r + h)^j \bigr) \qquad (j = 1,\dots,k). \]

The right hand side of this equation equals

\[ \sum_{i=0}^{j} \binom{j}{i} h^{j-i} p^i \sum_{r=1}^{l-k} (z_r^i - w_r^i). \]

By inverting this linear transform, we see that this equals the number of solutions of

\[ p^j \sum_{r=1}^{l-k} (z_r^j - w_r^j) = \sum_{r=1}^{k} \bigl( (x_r - h)^j - (y_r - h)^j \bigr), \tag{2.21} \]

where $1 \le x_r, y_r \le X$, the $x_1,\dots,x_k$ are distinct mod $p$, the $y_1,\dots,y_k$ are distinct mod $p$, and $-h/p < z_r, w_r \le (X-h)/p$. Indeed, the lower triangular matrix $A$ with

\[ A_{j,i} = \begin{cases} \binom{j}{i} h^{j-i} & i \le j, \\ 0 & i > j, \end{cases} \]

has inverse $B$ with

\[ B_{j,i} = \begin{cases} \binom{j}{i} (-h)^{j-i} & i \le j, \\ 0 & i > j; \end{cases} \]

and

\[ \sum_{i=0}^{j} \binom{j}{i} (-h)^{j-i} \sum_{r=1}^{k} (x_r^i - y_r^i) = \sum_{r=1}^{k} \bigl( (x_r - h)^j - (y_r - h)^j \bigr). \]

Now choose the $x_r$ arbitrarily. Since $1 \le y_r \le X < p^k$, the $y_r$ are completely determined by their value mod $p^k$. By Lemma 2.4.13, we then have at most $k!\,p^{k(k-1)/2}$ choices for the $y_r$. Given the $x_r$ and $y_r$, the number of choices for $z_r$ and $w_r$ is bounded by $J_{l-k}^{(k)}(X/p)$. Indeed, put

\[ \sigma_j = \frac{1}{p^j} \sum_{r=1}^{k} \bigl( (x_r - h)^j - (y_r - h)^j \bigr), \]

then the number of choices for $z_r$ and $w_r$ equals the number of solutions of (2.10) with $l$ replaced by $l-k$, and where the variables occur in the range $-h/p < z_r, w_r \le (X-h)/p$. This number is bounded by the number of solutions to the homogeneous equation, and for the homogeneous system it is not hard to show that the number of solutions only depends on the length of the interval $(-h/p, (X-h)/p]$. This can be done by shifting the variables by the amount $-h/p$ and expanding everything via the binomial theorem.

Hence the number of solutions of (2.21) is bounded by $X^k\, k!\, p^{k(k-1)/2} J_{l-k}^{(k)}(X/p)$. Using equations (2.19) and (2.20), we see that

\[ I_1 \le k!\, X^k \sum_{p \in \mathcal{P}} p^{2l + k(k-5)/2} J_{l-k}^{(k)}(X/p). \]
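The inversion step used in the proof above (the binomial matrix with parameter $h$ is inverted by the one with parameter $-h$) can be checked numerically. The following is only a small sketch; the function names are illustrative and not part of the text, and the matrices are indexed by $1 \le i, j \le k$.

```python
from math import comb

def binomial_shift_matrix(k, h):
    """k x k lower triangular matrix with entry (j, i) = C(j, i) * h**(j - i), 1 <= i, j <= k."""
    return [[comb(j, i) * h ** (j - i) if i <= j else 0
             for i in range(1, k + 1)]
            for j in range(1, k + 1)]

def matmul(a, b):
    """Product of two square integer matrices given as lists of rows."""
    size = len(a)
    return [[sum(a[r][t] * b[t][c] for t in range(size)) for c in range(size)]
            for r in range(size)]

def identity(k):
    """k x k identity matrix."""
    return [[1 if r == c else 0 for c in range(k)] for r in range(k)]

if __name__ == "__main__":
    A = binomial_shift_matrix(4, 3)
    B = binomial_shift_matrix(4, -3)
    print(matmul(A, B) == identity(4))
```

Since all entries are integers, the comparison with the identity matrix is exact.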

The contribution from $I_2$ is treated in the following lemma.

Lemma 2.4.15.
\[ J_l^{(k)}(X) \le 4^l \binom{k}{2}^{2l} + 4\,k!\,k^2(k-1)\, X^k \max_{p \in \mathcal{P}} p^{2l + k(k-5)/2} J_{l-k}^{(k)}(X/p). \]

Proof. Similarly to $J_l^{(k)}(X)$, we have the following integral bound for $I_2$:

\[ I_2 \le \binom{k}{2}^{2} \int_{U_k} |f(2\vec\alpha)|^{2}\, |f(\vec\alpha)|^{2l-4} \, d\vec\alpha. \]

Indeed, the squared binomial coefficient represents the number of ways we can choose two pairs of variables, one from $x_1,\dots,x_k$ and one from $y_1,\dots,y_k$; $|f(2\vec\alpha)|^2$ is the contribution of those two pairs, where both variables of a pair are equal; and $|f(\vec\alpha)|^{2l-4}$ is the contribution from the other $2l-4$ variables. By Hölder's inequality, the integral is bounded by

\begin{align*}
\bigl\| |f(2\vec\alpha)|^2 \bigr\|_{l/2} \, \bigl\| |f(\vec\alpha)|^{2l-4} \bigr\|_{l/(l-2)}
&\le \|1\|_l \, \bigl\| |f(2\vec\alpha)|^2 \bigr\|_{l} \, \bigl\| |f(\vec\alpha)|^{2l-4} \bigr\|_{l/(l-2)} \\
&= \Bigl( \int_{U_k} |f(2\vec\alpha)|^{2l} \, d\vec\alpha \Bigr)^{1/l} \Bigl( \int_{U_k} |f(\vec\alpha)|^{2l} \, d\vec\alpha \Bigr)^{1-2/l}.
\end{align*}

Since $f$ is 1-periodic in every $\alpha_j$-direction, it is readily seen that the first integral equals the second one. As noted before,

\[ \int_{U_k} |f(\vec\alpha)|^{2l} \, d\vec\alpha = J_l^{(k)}(X). \]

We conclude that

\[ I_2 \le \binom{k}{2}^{2} J_l^{(k)}(X)^{1-1/l}. \]

If $I_2 \ge I_1$, then

\[ J_l^{(k)}(X) \le 4 \binom{k}{2}^{2} J_l^{(k)}(X)^{1-1/l}, \]

so $J_l^{(k)}(X) \le 4^l \binom{k}{2}^{2l}$ and the lemma holds. If $I_1 \ge I_2$, then $J_l^{(k)}(X) \le 4 I_1$ and the lemma also holds by Lemma 2.4.14.

We are now ready to prove Theorem 2.4.12.

Proof of Theorem 2.4.12. We will prove the explicit bound

\[ J_l^{(k)}(X) \le C(k,l)\, X^{2l - \frac12 k(k+1) + \eta_{k,l}}, \qquad C(k,l) = \exp\bigl( a k \max(l, k^2) [l/k] \bigr), \tag{2.22} \]

for some absolute constant $a$ (independent of $k$ and $l$). We proceed by induction on $l$. For the base case, suppose $k \le l < 2k$, so that $\eta_{k,l} = (k^2 - k)/2$. Choose the variables $x_{k+1},\dots,x_l, y_1,\dots,y_l$ arbitrarily. By an argument similar to the one given in the proof of Lemma 2.4.13, every solution $x_1,\dots,x_k$ of

\[ \sum_{r=1}^{k} x_r^j = \sum_{r=1}^{l} y_r^j - \sum_{r=k+1}^{l} x_r^j \qquad (1 \le j \le k), \]

where $x_{k+1},\dots,x_l, y_1,\dots,y_l$ are fixed, is a permutation of one given solution, so we have

\[ J_l^{(k)}(X) \le k!\, X^{2l-k} \le \exp\bigl( k \max(l,k^2) \bigr) X^{2l - \frac12 k(k+1) + \eta_{k,l}}. \]

Now suppose $l \ge 2k$, and the theorem holds for $l$ replaced by $l-k$. If $X \le k^k$, then the conclusion holds, since then $J_l^{(k)}(X) \le X^{2l} \le k^{2kl} = \exp(2kl \log k)$, and $2kl\log k \le 2k^2 l \le 2k^2(2[l/k]k) \le 4k\max(l,k^2)[l/k]$. Suppose now $X > k^k$. By Lemma 2.4.15 and the induction hypothesis, there exists a prime $p \in \mathcal{P}$ such that

\[ J_l^{(k)}(X) \le 4^l \binom{k}{2}^{2l} + 4\,k!\,k^2(k-1)\, X^k p^{2l + k(k-5)/2}\, C(k,l-k) \Bigl( \frac{X}{p} \Bigr)^{2(l-k) - \frac12 k(k+1) + \eta_{k,l-k}}. \]

The first term is

\[ \bigl( k(k-1) \bigr)^{2l} = \exp\bigl( 2l \log(k(k-1)) \bigr) \le \exp(4kl). \]

In the second term, the exponent of $p$ equals $k^2 - \eta_{k,l-k}$, and the exponent of $X$ equals $2l - k - k(k+1)/2 + \eta_{k,l-k}$. Since $X^{1/k} > k$, the smallest prime of $\mathcal{P}$ is the smallest prime larger than $X^{1/k}$, and for some fixed constant $b$ we have $p \le e^{bk} X^{1/k}$ for every $p \in \mathcal{P}$. Indeed, by the estimates of Chebyshev there exist constants $c_1 \le 1 \le c_2$ such that

\begin{align*}
\pi\bigl( e^{bk} X^{1/k} \bigr) - \pi\bigl( X^{1/k} \bigr)
&\ge c_1 \frac{e^{bk} X^{1/k}}{\log( e^{bk} X^{1/k} )} - c_2 \frac{X^{1/k}}{\log( X^{1/k} )} \\
&= \frac{X^{1/k}}{\log( X^{1/k} )} \Bigl( c_1 \frac{e^{bk}}{\frac{bk}{\log(X^{1/k})} + 1} - c_2 \Bigr) \\
&\ge \sqrt{k} \Bigl( c_1 \frac{e^{bk}}{\frac{bk}{\log k} + 1} - c_2 \Bigr) \ge k^2(k-1),
\end{align*}

for sufficiently large (but fixed) $b$, and where we used the fact that $X^{1/k} > k$ and $k/\log k \ge \sqrt{k}$ (say). Bounding $p$ like this yields

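\[ J_l^{(k)}(X) \le \exp(4kl) + 4\,k!\,k^2(k-1)\, e^{bk(k^2 - \eta_{k,l-k})}\, C(k,l-k)\, X^{2l - k - \frac12 k(k+1) + \eta_{k,l-k}}\, X^{k - \frac1k \eta_{k,l-k}}. \]

Since

\[ \eta_{k,l-k} - \frac1k \eta_{k,l-k} = \eta_{k,l}, \]

we have the desired exponent for $X$. For the constant we have that

\[ 4\,k! \le \exp(2k^2), \qquad k^2(k-1) \le \exp(k^3), \qquad e^{bk(k^2 - \eta_{k,l-k})} \le \exp(bk^3), \]

so

\begin{align*}
\exp(4kl) + 4\,k!\,k^2(k-1)\, e^{bk(k^2 - \eta_{k,l-k})}\, C(k,l-k)
&\le \exp\bigl( ak\max(l,k^2) \bigr)\, C(k,l-k) \\
&\le \exp\bigl( ak\max(l,k^2) \bigr) \exp\bigl( ak\max(l,k^2)([l/k]-1) \bigr) \\
&\le \exp\bigl( ak\max(l,k^2)[l/k] \bigr)
\end{align*}

for some constant $a$. Finally, the estimate $\eta_{k,l} \le k^2 e^{-l/k^2}$ follows from the fact that

\[ \frac12 \Bigl( 1 - \frac1k \Bigr)^{[l/k]} \le \Bigl( 1 - \frac1k \Bigr)^{l/k} = \exp\Bigl( -\frac{l}{k} \bigl( \log k - \log(k-1) \bigr) \Bigr) \]

and the fact that $\log k - \log(k-1) \ge 1/k$ by the mean value theorem.

Chapter 3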

The Hardy-Littlewood method

3.1 Generalities

Consider a set A = {a1, a2,... } ⊆ N with a1 < a2 < . . . , and some s ∈ N. One of the central objectives in the field of additive number theory is determining the structure of the s-fold sumset sA = {n1 + ··· + ns|ni ∈ A} from the structure of A itself. More specifically, one is interested in the following questions.

• Which integers are contained in sA, that is, which integers can be written as a sum of s elements of A?

• In how many ways can we write such integers as a sum of s elements of A?

Two famous instances of this are Waring’s problem, where A is the set of perfect kth-powers for some fixed k > 1, and (the generalised) Goldbach problem, where A is the set of prime numbers. The Hardy-Littlewood method is a very general technique to attack these additive problems using methods from analysis. The main idea of the method is the following one. Consider a generating function of A, defined as

\[ F(z) = \sum_{m=1}^{+\infty} z^{a_m} \qquad (|z| < 1). \]

Then

\[ F(z)^s = \sum_{m_1=1}^{+\infty} \cdots \sum_{m_s=1}^{+\infty} z^{a_{m_1} + \dots + a_{m_s}} = \sum_{n=1}^{+\infty} R(n) z^n, \]

where $R(n)$ is the number of ways to write $n$ as an ordered sum of $s$ elements of $A$. The order of the sum is important here: when for example $s = 2$, $a_1 + a_2$ and $a_2 + a_1$ are counted separately for $R(a_1 + a_2)$. Via Cauchy's integral formula, we get an expression for $R(n)$ in terms of $F$,

\[ R(n) = \frac{1}{2\pi i} \oint_{|z| = \rho} F(z)^s z^{-n-1} \, dz \qquad (0 < \rho < 1). \]
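As a small illustration of the generating function identity, the coefficients $R(n)$ of $F(z)^s$ can be computed by multiplying truncated power series. This is only a sketch: the function name and the truncation parameter `limit` are assumptions made for the example, and $A$ is taken to consist of positive integers.

```python
def representation_counts(A, s, limit):
    """Coefficients R(0), ..., R(limit) of F(z)^s, where F(z) = sum of z^a over a in A."""
    elements = sorted(a for a in set(A) if 1 <= a <= limit)
    coeffs = [1] + [0] * limit          # coefficients of F(z)^0 = 1
    for _ in range(s):
        nxt = [0] * (limit + 1)
        for total, ways in enumerate(coeffs):
            if ways == 0:
                continue
            for a in elements:
                if total + a > limit:
                    break               # elements are sorted, so we can stop here
                nxt[total + a] += ways
        coeffs = nxt
    return coeffs

if __name__ == "__main__":
    squares = [m * m for m in range(1, 8)]
    R = representation_counts(squares, 2, 50)
    # 25 = 9 + 16 = 16 + 9 and 50 = 1 + 49 = 49 + 1 = 25 + 25 (ordered sums of positive squares)
    print(R[25], R[50])
```

The ordered nature of the count is visible in the output: the two orderings of 9 + 16 are counted separately.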

Estimates for $R(n)$ can then be obtained by estimating the integral over some circle. Because of this, the Hardy-Littlewood method is sometimes also referred to as the circle method. One typically hopes to deduce an asymptotic formula of the form $R(n) = M(n) + O(E(n))$, where $M$ is a positive function, and $E$ is a function which is asymptotically of strictly smaller order than $M$. If one can prove such a formula, then it automatically follows that every sufficiently large integer is the sum of $s$ elements of $A$.

The method originated in a paper of Hardy and Ramanujan [15] about partitions and representations of numbers as a sum of squares. It was further developed in the context of Waring's problem by Hardy and Littlewood in the early 1920s in a series of papers sharing the title “Some problems of ‘Partitio Numerorum’”, see for example [12]. In 1928, I. M. Vinogradov¹ refined the technique. Instead of working with a power series as the generating function, he used finite exponential sums,

\[ f(x) = \sum_{\substack{m \in A \\ m \le n}} e(mx). \]

It should be noticed that the generating function $f$ depends on $n$. Using the orthogonality of the exponentials,

\[ R(n) = \int_{\mathbb{R}/\mathbb{Z}} f(x)^s e(-nx) \, dx. \]

Traditionally, one splits up the circle $\mathbb{R}/\mathbb{Z}$ in two parts: the “major arcs”, $\mathfrak{M}$, where $f$ is relatively large and can be well approximated by a simpler function so that one can derive an asymptotic formula for the integral over $\mathfrak{M}$; and the “minor arcs”, $\mathfrak{m}$, where one hopes to have good upper bounds (in $L^\infty$- and $L^p$-sense) for $f$ so that the integral over $\mathfrak{m}$ has an asymptotically smaller contribution than the main term in the asymptotic formula for the integral over $\mathfrak{M}$. Typically, the major arcs consist of intervals around rationals with small denominators, and the minor arcs are the rest of $\mathbb{R}/\mathbb{Z}$.
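The orthogonality relation for the finite exponential sum can be tested numerically. The sketch below is illustrative only (the function name and the choice of $Q = sn + 1$ sample points are assumptions, not taken from the text). Since $f(x)^s e(-nx)$ is a trigonometric polynomial whose frequencies are smaller than $Q$ in absolute value, averaging over $Q$ equally spaced points reproduces the integral up to rounding errors.

```python
import cmath

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def count_by_orthogonality(A, s, n):
    """Number of ordered representations of n as a sum of s elements of A, via f(x)^s e(-nx)."""
    elements = [m for m in A if 1 <= m <= n]
    Q = s * n + 1                       # enough sample points to avoid aliasing
    total = 0
    for j in range(Q):
        x = j / Q
        f = sum(e(m * x) for m in elements)
        total += f ** s * e(-n * x)
    return round((total / Q).real)

if __name__ == "__main__":
    # 10 = 3 + 7 = 7 + 3 = 5 + 5 as an ordered sum of two primes
    print(count_by_orthogonality([2, 3, 5, 7], 2, 10))
```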

Sometimes it will be easier to consider a weighted counting function for the number of ways to represent $n$ as a sum of elements of $A$. Instead of $R$, we consider

\[ \tilde R(n) = \sum_{\substack{m_i \in \mathbb{N} \\ m_1 + \dots + m_s = n}} w_1(m_1) \cdots w_s(m_s), \]

where the functions $w_i$ are (positive) weight functions. Define the Fourier transform $\hat w : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$ of $w$ as

\[ \hat w(x) = \sum_{m=1}^{+\infty} w(m) e(mx), \]

where the weights are defined in such a way that this series converges (e.g. with finite support). Then

\[ \tilde R(n) = \int_{\mathbb{R}/\mathbb{Z}} \hat w_1(x) \cdots \hat w_s(x) e(-nx) \, dx. \]

In the case of Goldbach's problem, the weights can be defined as (a truncated form of) the von Mangoldt function $\Lambda$, which is easier to “handle” with multiplicative (analytic) number theory than the characteristic function of the primes. It is clear that the approach with the generating function $f$ described above is a special case of this “Fourier transform” approach: there $f$ is just the Fourier transform of the characteristic function of $A \cap [1, n]$.

¹ Vinogradov made massive contributions to the circle method (including Waring's problem and the ternary Goldbach problem) and to the theory of exponential sums in general. His work on this subject is collected in his book “The method of trigonometrical sums in the theory of numbers” (see [41] for the original, or [43] for a version translated and revised by K.F. Roth and A. Davenport).

3.2 Waring’s problem

Fix an integer $k > 1$. In 1770, Waring posed the following question: is there an integer $s$ such that each positive integer can be written as the sum of at most $s$ $k$th powers? When $k = 2$ for example, Lagrange's four-square theorem from 1770 states that this is indeed the case, and that one can take $s = 4$. In 1909, Hilbert [17] proved that the question has a positive answer for each $k$. Denote by $g(k)$ the minimum $s$ such that each positive integer is the sum of at most $s$ $k$th powers. Then Lagrange's four-square theorem implies $g(2) \le 4$, and by looking at the squares mod 8 (a sum of three squares is never congruent to 7 mod 8), one easily sees that $g(2) \ge 4$, so that $g(2) = 4$. In fact, the values of the function $g$ are almost completely known:

\[ g(k) = 2^k + \lfloor (3/2)^k \rfloor - 2 \qquad \text{if } 2^k \{ (3/2)^k \} + \lfloor (3/2)^k \rfloor \le 2^k, \]

and otherwise

\[ g(k) = \begin{cases} 2^k + \lfloor (3/2)^k \rfloor + \lfloor (4/3)^k \rfloor - 2 & \text{if } \lfloor (4/3)^k \rfloor \lfloor (3/2)^k \rfloor + \lfloor (4/3)^k \rfloor + \lfloor (3/2)^k \rfloor = 2^k, \\ 2^k + \lfloor (3/2)^k \rfloor + \lfloor (4/3)^k \rfloor - 3 & \text{if } \lfloor (4/3)^k \rfloor \lfloor (3/2)^k \rfloor + \lfloor (4/3)^k \rfloor + \lfloor (3/2)^k \rfloor > 2^k. \end{cases} \]

Mahler [28] proved that $2^k \{ (3/2)^k \} + \lfloor (3/2)^k \rfloor \le 2^k$ for all but finitely many $k$. It is conjectured however that there are no such exceptional $k$, and that the first formula always holds. To see that $g$ is always larger than or equal to the above expression, one just has to consider the integer $2^k \lfloor (3/2)^k \rfloor - 1$. This integer is smaller than $3^k$, and hence it can only be a sum of the $k$th powers of 1 and 2. The minimum number of needed terms is easily seen to be $\lfloor (3/2)^k \rfloor - 1$ times $2^k$ and $2^k - 1$ times $1^k$, which gives that $g(k) \ge 2^k + \lfloor (3/2)^k \rfloor - 2$. From the above example, it is clear that the value of $g(k)$ (or at least a lower bound) is determined by small numbers which require an exceptional number of $k$th powers, simply because only the $k$th powers of 1 and 2 are available. It is therefore more interesting to look at the function $G$, where $G(k)$ is the minimal $s$ such that every sufficiently large integer is the sum of at most $s$ $k$th powers:

\[ G(k) = \min\{ s \in \mathbb{N}^+ \mid \exists N \in \mathbb{N} : n \ge N \implies \exists m_1,\dots,m_s \in \mathbb{N} : n = m_1^k + \dots + m_s^k \}. \]
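The lower bound for $g(k)$ coming from the integer $2^k \lfloor (3/2)^k \rfloor - 1$ can be verified for small $k$ by a direct computation. The following sketch uses a simple dynamic programme; the function names are illustrative and not part of the text.

```python
def min_kth_powers(n, k):
    """Least number of positive k-th powers summing to n, by dynamic programming."""
    powers = []
    m = 1
    while m ** k <= n:
        powers.append(m ** k)
        m += 1
    best = [0] + [n + 1] * n            # best[t] = least number of k-th powers summing to t
    for t in range(1, n + 1):
        best[t] = 1 + min(best[t - p] for p in powers if p <= t)
    return best[n]

def extremal_integer(k):
    """The integer 2^k * floor((3/2)^k) - 1 considered in the text."""
    return 2 ** k * (3 ** k // 2 ** k) - 1

def lower_bound(k):
    """The lower bound 2^k + floor((3/2)^k) - 2 for g(k)."""
    return 2 ** k + 3 ** k // 2 ** k - 2

if __name__ == "__main__":
    for k in (2, 3, 4):
        n = extremal_integer(k)
        print(k, n, min_kth_powers(n, k), lower_bound(k))
```

For $k = 2, 3, 4$ the extremal integers are 7, 23 and 79, and the computed minimum agrees with the lower bound in each case.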

For an integer $s$, define $R_s(n)$ as the number of ways to write $n$ as an ordered sum of $s$ $k$th powers. In this section, we will derive an asymptotic formula for $R_s$ when $s \ge s_0$ for some $s_0$. This formula will imply that $G(k) \le s_0$.

3.2.1 Approximating the generating function

Fix an integer $n$, and write $N = n^{1/k}$. The generating function is defined as

\[ f(x) = \sum_{m=1}^{N} e(m^k x). \]

We remark again that this function depends on $n$ (and also on $k$), but we do not include $n$ (nor $k$) in the notation. At rationals $a/q$ (with $(a, q) = 1$), we have

\[ f\Bigl( \frac{a}{q} \Bigr) = \sum_{m=1}^{N} e\Bigl( \frac{a m^k}{q} \Bigr) = \frac{N}{q} S(q,a) + O(q). \tag{3.1} \]

If $q$ is small compared to $n$, then $(N/q) S(q,a)$ is a good approximation. By Theorem 2.3.7, this is bounded by $N q^{-1/k}$ (and by Lemma 2.3.6, this upper bound can be reached), so we can expect that the main contribution to $R_s$ will come from rationals with small denominators. The first step is then to get an approximation for $f$ near these rationals, so that we can deduce some asymptotics for $R_s$. We would also like this approximation to be valid in an interval around these rationals which is as wide as possible.
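Relation (3.1) is easy to test numerically. The sketch below assumes the definition $S(q,a) = \sum_{m=1}^{q} e(am^k/q)$ of the complete sum and takes $N$ to be an integer; the function names are illustrative. Splitting the sum over $m \le N$ into complete blocks of length $q$ and one incomplete block shows that the error is at most $2q$ in absolute value, which is what the example checks.

```python
import cmath

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def complete_sum(q, a, k):
    """S(q, a): the sum of e(a m^k / q) over m = 1, ..., q."""
    return sum(e((a * pow(m, k, q)) % q / q) for m in range(1, q + 1))

def weyl_sum_at_rational(N, q, a, k):
    """f(a/q): the sum of e(a m^k / q) over m = 1, ..., N."""
    return sum(e((a * pow(m, k, q)) % q / q) for m in range(1, N + 1))

if __name__ == "__main__":
    N, q, a, k = 1000, 7, 3, 3
    error = abs(weyl_sum_at_rational(N, q, a, k) - N / q * complete_sum(q, a, k))
    print(error, error <= 2 * q)
```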

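To find a good approximation, we might propose a “smoother” exponential sum $\sum c_m e(my)$, with coefficients supported on the whole of $[1, n]$, instead of on the $k$th powers. Then

\[ f\Bigl( \frac{a}{q} + y \Bigr) - \sum_{m=1}^{n} c_m e(my) = \sum_{m=1}^{n} \tilde c_m e(my), \]

where

\[ \tilde c_m = \begin{cases} e\bigl( \frac{am}{q} \bigr) - c_m & \text{if $m$ is a $k$th power}, \\ -c_m & \text{otherwise}. \end{cases} \]

By summation by parts, this equals

\[ \Bigl( \sum_{m=1}^{N} e\Bigl( \frac{a m^k}{q} \Bigr) - \sum_{m=1}^{n} c_m \Bigr) e(ny) - \int_1^n 2\pi i y\, e(ty) \Bigl( \sum_{m \le t} \tilde c_m \Bigr) dt. \]

From this, it makes sense to make the partial sums of $\tilde c_m$ as small as possible, and in view of (3.1), we might propose to take the $c_m$ of the form

\[ c_m = \frac{S(q,a)}{q}\, d_m, \qquad \text{where } \sum_{m \le X^k} d_m \approx X. \]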

An example which fulfils the latter condition is $d_m = (1/k) m^{1/k-1}$, as it is easily verified by Euler summation that

\begin{align*}
\frac1k \sum_{m \le X^k} m^{1/k-1} &= \frac1k \Bigl( \int_0^{X^k} t^{1/k-1} \, dt - \{X^k\} X^{1-k} + \int_0^{X^k} \{t\} (1/k - 1) t^{1/k-2} \, dt \Bigr) \\
&= X + O\bigl( X^{1-k} \bigr). \tag{3.2}
\end{align*}

Define

\[ v(y) = \frac1k \sum_{m=1}^{n} m^{1/k-1} e(my), \qquad \text{and} \qquad V(x,q,a) = \frac1q S(q,a)\, v(x - a/q). \]

We will see that $V(x,q,a)$ approximates $f$ very well indeed near $a/q$. The above calculations show in fact that

\[ f(x) - V(x,q,a) \ll \bigl( 1 + n |x - a/q| \bigr) q, \]

but we will significantly improve this. Of course, the choice in the definition of $v$ is somewhat arbitrary, and there are other valid choices. The definition given here however is perhaps the simplest and most intuitive. Furthermore, this choice for $v$ will allow a relatively easy evaluation of the so-called “singular integral” in Subsection 3.2.3 below. We will begin by proving a slightly better bound than $1 + n|y|$ for $f(y) - v(y)$ when $|y|$ is small. This will be done in two steps: first by proving that $f$ is close to an integral, and second by showing that this integral is close to $v$.

Lemma 3.2.1. Define

\[ \tilde v(y) = \int_0^N e(y t^k) \, dt. \]

If $|y| \le \dfrac{1}{2kN^{k-1}}$, then $f(y) = \tilde v(y) + O\bigl( |y| k N^{k-1} \bigr)$.

Proof. By Euler-Maclaurin summation

\[ f(y) = \tilde v(y) + \int_0^N 2\pi i y k t^{k-1} e(y t^k) (\{t\} - 1/2) \, dt. \]

Estimating the integral via the triangle inequality would yield the bound $|y| N^k$. However, the oscillation of the exponential in the integrand can be exploited to yield a smaller bound. This is done by expanding $\{t\} - 1/2$ as a Fourier series. It is well known that

\[ \{t\} - 1/2 = \sum_{h \in \mathbb{Z} \setminus \{0\}} \frac{e(-ht)}{2\pi i h} \]

and that the partial sums are uniformly bounded. By Lebesgue's dominated convergence theorem, we may swap the sum and integral and obtain that

\[ \int_0^N 2\pi i y k t^{k-1} e(y t^k) (\{t\} - 1/2) \, dt = \sum_{h \in \mathbb{Z} \setminus \{0\}} \frac1h \int_0^N y k t^{k-1} e(y t^k - ht) \, dt. \]

We will estimate the integral by integration by parts. By the assumption on $y$, $y k t^{k-1} - h$ is non-zero on $[0, N]$. Therefore,

\begin{align*}
\int_0^N y k t^{k-1} e(y t^k - ht) \, dt
&= \int_0^N \frac{1}{2\pi i} \frac{y k t^{k-1}}{y k t^{k-1} - h} \bigl( e(y t^k - ht) \bigr)' \, dt \\
&= \frac{1}{2\pi i} \frac{y k N^{k-1}}{y k N^{k-1} - h} e(y N^k - hN) - \frac{1}{2\pi i} \int_0^N \Bigl( \frac{y k t^{k-1}}{y k t^{k-1} - h} \Bigr)' e(y t^k - ht) \, dt \\
&\ll \frac{|y| k N^{k-1}}{|y k N^{k-1} - h|},
\end{align*}

where in the last step we used that $y k t^{k-1} / (y k t^{k-1} - h)$ is monotonic on $[0, N]$, so that its derivative has a fixed sign on $[0, N]$. The result then follows by noting that

\[ \sum_{h \in \mathbb{Z} \setminus \{0\}} \frac{1}{|h|} \cdot \frac{|y| k N^{k-1}}{|y k N^{k-1} - h|} \le 4 |y| k N^{k-1} \sum_{h=1}^{+\infty} \frac{1}{h^2} \]

by the assumption on $y$.

Remark 3.2.2. The technique from the above lemma can be generalised to obtain the following: Suppose that $X < Y$ and that $f$ is a twice continuously differentiable function on $[X, Y]$ such that $f'$ is monotonic on $[X, Y]$. Suppose that $H_1$ and $H_2$ are integers such that

\[ \forall x \in [X, Y] : H_1 \le f'(x) \le H_2. \]

Then, with $H = \max(|H_1|, |H_2|)$, we have

\[ \sum_{X < m \le Y} e(f(m)) = \sum_{h = H_1}^{H_2} \int_X^Y e(f(x) - hx) \, dx + O(\log(2 + H)). \]

This result is an important part of van der Corput’s method of estimating exponential sums. We refer to [9, Lemma 3.5] for the details.

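Lemma 3.2.3. With $\tilde v$ defined as before, we have $\tilde v(y) = v(y) + O(1 + |y| N)$.

Proof. Write

\[ G(X) = \frac1k \sum_{m \le X} m^{1/k-1}. \]

By (3.2), $G(X) = X^{1/k} + O\bigl( X^{1/k-1} \bigr)$. Summation by parts yields

\begin{align*}
v(y) &= G(n) e(yn) - 2\pi i y \int_1^n G(t) e(yt) \, dt \\
&= n^{1/k} e(yn) + O\bigl( n^{1/k-1} \bigr) - 2\pi i y \int_1^n t^{1/k} e(yt) \, dt + O\bigl( |y| n^{1/k} \bigr).
\end{align*}

The integral term equals

\begin{align*}
-2\pi i y \int_1^{n^{1/k}} u\, k u^{k-1} e(y u^k) \, du &= e(y) - n^{1/k} e(yn) + \int_1^{n^{1/k}} e(y t^k) \, dt \\
&= \tilde v(y) - n^{1/k} e(yn) + O(1),
\end{align*}

by the change of variables $u = t^{1/k}$ and integration by parts, and the lemma follows.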

Combining the previous two lemmas, we see that $f(y) = v(y) + O\bigl( 1 + |y| k N^{k-1} \bigr)$ when $|y| \le 1/(2kN^{k-1})$. Using this, we can now prove that $V(x,q,a)$ is a good approximation for $f(x)$ near the rational point $a/q$.

Theorem 3.2.4. Suppose $(a, q) = 1$. If

\[ \Bigl| x - \frac{a}{q} \Bigr| \le \frac{N^{1-k}}{2kq}, \]

then for any $\varepsilon > 0$

\[ f(x) - V(x,q,a) \ll_{k,\varepsilon} q^{1/2+\varepsilon}. \]

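Proof. The main idea is to write $f$ as an average over the exponential sums $S(q,a,b)$. Write $y = x - a/q$, and define

\[ F(y,b) = \sum_{m=1}^{N} e\Bigl( y m^k - \frac{bm}{q} \Bigr). \]

Then we have

\[ f(x) = \frac1q \sum_{-q/2 < b \le q/2} S(q,a,b) F(y,b); \tag{3.3} \]

indeed

\[ \sum_{-q/2 < b \le q/2} \sum_{r=1}^{q} e\Bigl( \frac{a r^k + br}{q} \Bigr) \sum_{m=1}^{N} e\Bigl( y m^k - \frac{bm}{q} \Bigr) = \sum_{m=1}^{N} e(y m^k) \sum_{r=1}^{q} e\Bigl( \frac{a r^k}{q} \Bigr) \sum_{-q/2 < b \le q/2} e\Bigl( \frac{b(r - m)}{q} \Bigr) = q f(x), \]

since the last inner sum is zero unless $r \equiv m \bmod q$, in which case it equals $q$. The main contribution in this average will come from the term $b = 0$. By Lemmas 3.2.1 and 3.2.3 and the condition on $y$,

\[ \frac{S(q,a,0)}{q} F(y,0) = \frac{S(q,a)}{q} f(y) = V(x,q,a) + O(1). \]

The other terms will be estimated by the (corollary of the) Kusmin-Landau theorem,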

Corollary 2.1.3. Since $|y k t^{k-1}| \le 1/(2q)$, we have for $-q/2 < b \le q/2$ and $b \ne 0$ that

\[ \bigl| y k t^{k-1} - b/q \bigr| \ge |b|/(2q) \]

for every $t \in [0, N]$. By Corollary² 2.1.3,

\[ F(y,b) \ll q/|b|, \qquad \text{if } b \ne 0. \]

The theorem then follows by Hua’s bound for S(q, a, b), Theorem 2.3.9, and by observing that

q q/d 1 X 1/2+ε 1/2+ε X (q, b) 1/2+ε X X 1 (q, b)q q/b ≤ q ≤ q d q b md −q/2

Theorem 3.2.4 suggests the following definition of the major arcs: set

\[ P = \frac{N}{2k}, \qquad \mathfrak{M}(q,a) = \Bigl[ \frac{a}{q} - \frac{P}{qn}, \frac{a}{q} + \frac{P}{qn} \Bigr], \qquad \mathfrak{M} = \bigcup_{\substack{1 \le a \le q \le P \\ (q,a) = 1}} \mathfrak{M}(q,a). \tag{3.4} \]

One easily sees that the major arcs $\mathfrak{M}(q,a)$ are disjoint and contained in $(P/n, 1 + P/n]$. The minor arcs are the complement of the major arcs: $\mathfrak{m} = (P/n, 1 + P/n] \setminus \mathfrak{M}$. The width of each major arc $\mathfrak{M}(q,a)$ is motivated by Theorem 3.2.4. In the above definition, we have chosen to include in $\mathfrak{M}$ all the major arcs around reduced fractions with denominator up to $P$. The reason for choosing this many major arcs is the following one. On the one hand, we can't use too many major arcs: on each major arc, we will approximate the generating function $f$, and this approximation introduces an error. We hence have to restrict the number of major arcs so that the accumulated error will not be too large. On the other hand, we want the minor arcs to be as small as possible. On the minor arcs, we will have no choice other than to estimate the integral via the triangle inequality. We will bound the generating function on the minor arcs by the methods for estimating Weyl sums from Section 2.4. For a number $\alpha \in \mathfrak{m}$, Dirichlet's theorem on Diophantine approximation (Theorem C.1.1) tells us that for any $X > 0$ there are integers $a$ and $q$ with $1 \le q \le X$ and $(a, q) = 1$ such that $|\alpha - a/q| < 1/(Xq)$. However, the fact that $\alpha$ is not contained in any major arc imposes a restriction on the denominator $q$: it must be at least of a given magnitude. In view of results such as Theorems 2.4.5 and 2.4.8, we want to have a large lower bound for the denominator $q$, and thus we want a large number of major arcs. These two demands can be met if we choose to include all the major arcs around reduced fractions with denominator up to $P$ in our definition of $\mathfrak{M}$.

² The function $y k t^{k-1} - b/q$ is only increasing if $y \ge 0$. If $y < 0$, we can replace $y$ by $-y$ and use the fact that $|F(y,b)| = |F(-y,-b)|$.
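The disjointness of the major arcs can be checked for a concrete choice of parameters with exact rational arithmetic. The sketch below is illustrative only: it assumes that $n = N^k$ is a perfect $k$th power, treats the arcs as closed intervals of radius $P/(qn)$ around $a/q$, and uses hypothetical function names.

```python
from fractions import Fraction
from math import gcd

def major_arcs(N, k):
    """Return (n, P, arcs) for n = N^k, P = N/(2k); arcs is a sorted list of (left, right)."""
    n = N ** k
    P = Fraction(N, 2 * k)
    arcs = []
    q = 1
    while q <= P:
        for a in range(1, q + 1):
            if gcd(a, q) == 1:
                centre = Fraction(a, q)
                radius = P / (q * n)
                arcs.append((centre - radius, centre + radius))
        q += 1
    arcs.sort()
    return n, P, arcs

def pairwise_disjoint(arcs):
    """For arcs sorted by left endpoint it suffices to compare neighbours."""
    return all(arcs[i][1] < arcs[i + 1][0] for i in range(len(arcs) - 1))

if __name__ == "__main__":
    n, P, arcs = major_arcs(40, 2)
    print(len(arcs), pairwise_disjoint(arcs))
    print(arcs[0][0] > P / n, arcs[-1][1] <= 1 + P / n)
```

With $N = 40$ and $k = 2$ one gets $P = 10$, so there is one arc for each reduced fraction $a/q$ with $q \le 10$.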

Define

\[ R_{\mathfrak{M},s}(n) = \int_{\mathfrak{M}} f(x)^s e(-nx) \, dx, \qquad R_{\mathfrak{m},s}(n) = \int_{\mathfrak{m}} f(x)^s e(-nx) \, dx. \]

Then $R_s(n) = R_{\mathfrak{M},s}(n) + R_{\mathfrak{m},s}(n)$. Our first goal is to deduce an asymptotic formula for $R_{\mathfrak{M},s}$. If we approximate $f(x)$ on $\mathfrak{M}(q,a)$ by $V(x,q,a)$, we get

\[ R_{\mathfrak{M},s} \approx \sum_{q \le P} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_{\mathfrak{M}(q,a)} V(x,q,a)^s e(-nx) \, dx = \sum_{q \le P} S_n(q) J_n(q), \]

where

\[ S_n(q) = \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \Bigl( \frac1q S(q,a) \Bigr)^s e\Bigl( -\frac{an}{q} \Bigr), \qquad J_n(q) = \int_{-P/(nq)}^{P/(nq)} v(y)^s e(-ny) \, dy. \]

In order to obtain an asymptotic approximation of this sum over $q$, we will extend the domain of integration to an interval of unit length so that the integral becomes independent of $q$. Next we will extend the sum over $q$ to the whole of $\mathbb{N}^+$ to obtain

\[ R_{\mathfrak{M},s} \approx S(n) J(n), \tag{3.5} \]

where

\[ S(n) = \sum_{q=1}^{+\infty} S_n(q), \qquad J(n) = \int_{-1/2}^{1/2} v(y)^s e(-ny) \, dy. \]

They are called the singular³ series and the singular integral respectively. We will in fact show that

\[ S(n) \asymp_{k,s} 1 \qquad \text{and} \qquad J(n) \sim \Gamma\Bigl( 1 + \frac1k \Bigr)^s \Gamma\Bigl( \frac{s}{k} \Bigr)^{-1} n^{s/k-1}, \]

when $s \ge s_0$ for some suitable $s_0$, and where $\Gamma$ is the gamma function. Of course, the error made by introducing these approximations must not be too large (preferably at most $o(n^{s/k-1})$). The asymptotic relations for $S$ and $J$ and the necessary lemmas for controlling the error term will be established in the next subsections.

³ The adjective singular stems from the original approach using power series: the points $e(a/q)$ on the unit circle are singularities of the generating power series.

3.2.2 The singular series

We begin by taking a closer look at the sums $S_n(q)$. Observe that these sums are real: if $a$ runs over the reduced residue system, then so does $-a$. By the “twisted” multiplicativity of the exponential sums $S(q,a)$ (Lemma 2.3.4), we immediately obtain

Lemma 3.2.5. For fixed $n$, $S_n(q)$ is a multiplicative function of $q$.

We only need to look at what happens at prime powers. Recall the definitions of $\tau(p)$ and $\gamma(p)$ from Subsection 2.3.1:

\[ p^{\tau(p)} \parallel k, \qquad \gamma(p) = \begin{cases} \tau(p) + 1 & \text{if } p > 2 \text{ or } p = 2 \text{ and } \tau(p) = 0, \\ \tau(p) + 2 & \text{if } p = 2 \text{ and } \tau(p) > 0. \end{cases} \]

Using our knowledge of the sums S(q, a), we obtain the following upper bounds for Sn(q).

Proposition 3.2.6. Suppose $l \ge 1$ and write $l = uk + v$ with $1 \le v \le k$, and put $\lambda = l - \max(k, \gamma(p))$. If $\lambda > 0$ and $p^\lambda \nmid n$, then $S_n(p^l) = 0$. Otherwise we have the upper bounds

\[ S_n(p^l) \ll_{k,s} \begin{cases} p^{-(u+1/2)s} \bigl( \sqrt{p}\,(p^{l-1}, n) + (p^l, n) \bigr) & \text{if } v = 1, \\ p^{-(u+1)s} (p^l, n) & \text{if } v \ne 1. \end{cases} \]

Proof. Suppose first that $p > k$, then $\gamma = 1$ and $\lambda = l - k$. By Lemma 2.3.6, we have

\[ S_n(p^l) = p^{-ls} p^{u(k-1)s} \sum_{\substack{a=1 \\ p \nmid a}}^{p^l} S(p^v, a)^s e\Bigl( -\frac{an}{p^l} \Bigr). \tag{3.6} \]

Every $a$ in the sum can be written as $a = x p^v + y$, for some $x, y$ with $0 \le x < p^{l-v}$, $1 \le y \le p^v$, $p \nmid y$. Hence the sum on the right of (3.6) equals

\[ \sum_{\substack{y=1 \\ p \nmid y}}^{p^v} S(p^v, y)^s e\Bigl( -\frac{yn}{p^l} \Bigr) \sum_{x=0}^{p^{l-v}-1} e\Bigl( -\frac{xn}{p^{l-v}} \Bigr). \]

The inner sum over $x$ equals 0, unless $p^{l-v} \mid n$, in which case it equals $p^{l-v}$. Suppose that indeed $p^{l-v} \mid n$.

Case 1: $v > 1$. Again by Lemma 2.3.6, the sum over $y$ equals

\[ p^{s(v-1)} \sum_{\substack{y=1 \\ p \nmid y}}^{p^v} e\Bigl( -\frac{y (n/p^{l-v})}{p^v} \Bigr) = p^{s(v-1)} c_{p^v}\Bigl( \frac{n}{p^{l-v}} \Bigr), \]

where $c$ is the Ramanujan sum. By Theorem 2.2.8,

\[ S_n(p^l) = \begin{cases} 0 & \text{if } p^{l-1} \nmid n, \\ -p^{l-(u+1)s-1} & \text{if } p^{l-1} \parallel n, \\ p^{l-(u+1)s} (1 - 1/p) & \text{if } p^l \mid n, \end{cases} \tag{3.7} \]

whence the upper bound for $v \ne 1$ follows.

Case 2: $v = 1$. The sum over $y$ equals

\[ \sum_{y=1}^{p-1} S(p, y)^s e\Bigl( -\frac{(n/p^{l-1}) y}{p} \Bigr). \]

By Lemma 2.3.5,

\begin{align*}
S_n(p^l) &= p^{-ls} p^{u(k-1)s} p^{l-1} \sum_{y=1}^{p-1} \Bigl( \sum_{\chi \in A} G(1,\chi) \chi(y) \Bigr)^s e\Bigl( -\frac{(n/p^{l-1}) y}{p} \Bigr) \\
&= p^{-u(s-k)-s} \sum_{\chi_1 \in A} \cdots \sum_{\chi_s \in A} G(1,\chi_1) \cdots G(1,\chi_s) \sum_{y=1}^{p} \chi_1 \cdots \chi_s(y)\, e\Bigl( -\frac{(n/p^{l-1}) y}{p} \Bigr). \tag{3.8}
\end{align*}

Put $\chi = \chi_1 \cdots \chi_s$. If $\chi$ is non-principal, then by separability (2.1) the inner sum equals $\chi(n/p^{l-1}) G(1, \chi)$. If $\chi$ is principal, the inner sum equals $-1$ if $p^l \nmid n$ and $p - 1$ if $p^l \mid n$. By Theorem 2.2.14 on separable Gauss sums, we get

\begin{align*}
|S_n(p^l)| &\le p^{-u(s-k)-s} \Bigl( \sum_{\chi \ne \chi_0} p^{(s+1)/2} + \sum_{\chi = \chi_0} p^{s/2} (p, n/p^{l-1}) \Bigr) \tag{3.9} \\
&\le p^{-(u+1/2)s} \Bigl( \sum_{\chi \ne \chi_0} \sqrt{p}\,(p^{l-1}, n) + \sum_{\chi = \chi_0} (p^l, n) \Bigr) \\
&\ll_{k,s} p^{-(u+1/2)s} \bigl( \sqrt{p}\,(p^{l-1}, n) + (p^l, n) \bigr),
\end{align*}

since $|A|^s = \bigl( (k, p-1) - 1 \bigr)^s \ll_{k,s} 1$.

Suppose now that $p \le k$. Note that, since $\gamma \le k + 1$, $\max(k, \gamma) = k$ or $k + 1$. If $l \le \max(k, \gamma)$, then $l \le k + 1$, so $S_n(p^l) \ll_k 1$, and the upper bounds of the proposition hold trivially. If $l > \max(k, \gamma)$, write $l = u'k + v'$ with $\max(k, \gamma) - k < v' \le \max(k, \gamma)$. Then (3.6) holds with $u$ and $v$ replaced by $u'$ and $v'$, and analogously $S_n(p^l) = 0$ unless $p^{l-v'} \mid n$, in which case

\[ S_n(p^l) = p^{-ls + u'(k-1)s + l - v'} \sum_{\substack{y=1 \\ p \nmid y}}^{p^{v'}} S(p^{v'}, y)^s e\Bigl( -\frac{ny}{p^l} \Bigr) \ll_{k,s} p^{-(u'+v')s} (p^l, n), \]

since the sum is $\ll_{k,s} 1$ because $p \le k$. The proof is now complete by noting that $u' + v' \ge u + 1$, so this gives a stronger estimate than required.

This proposition allows us to bound the singular series S from above.

Theorem 3.2.7. Suppose $s \ge 4$, then $S$ converges absolutely. If $s \ge \max(5, k+2)$, then $S(n) \ll_{k,s} 1$. If $\max(4, k) \le s < \max(5, k+2)$, then $S(n) \ll_{k,s,\varepsilon} n^\varepsilon$ for any $\varepsilon > 0$.

Proof. From Proposition 3.2.6 we see that, writing $l = u_l k + v_l$ with $1 \le v_l \le k$,

\[ S_n(p^l) \ll_{k,s} p^{-(u_l + 1/2)s + 1/2}\, n = n\, p^{-(s-1)/2} p^{-u_l s}, \]

and therefore we have

\[ \sum_{l=1}^{+\infty} |S_n(p^l)| \ll_{k,s} n\, p^{-(s-1)/2} \sum_{u=0}^{+\infty} p^{-u} \ll_k n\, p^{-3/2}. \]

Using the multiplicativity of $S_n$, we can write the singular series as an Euler product and get

\[ \sum_{q \le Q} |S_n(q)| \le \prod_{p \le Q} \bigl( 1 + C_1 n p^{-3/2} \bigr) \le \prod_{p \le Q} \bigl( 1 + C_1 p^{-3/2} \bigr)^n \le C_2^n, \]

where $C_1, C_2$ are constants only depending on $k$ and $s$. This establishes the absolute convergence.

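Define $\theta = \theta_p$ via $p^\theta \parallel n$, and write $l = uk + v$, $1 \le v \le k$. By the previous proposition, we have

\[ S_n(p^l) \begin{cases} \ll_{k,s} p^{\omega} & \text{if } l \le \theta + \max(k, \gamma), \\ = 0 & \text{if } l > \theta + \max(k, \gamma), \end{cases} \]

where

\[ \omega = \begin{cases} -(u + 1/2)s + l & \text{if } l \le \theta \text{ and } v = 1, \\ -(u + 1/2)s + 1/2 + \theta & \text{if } l > \theta \text{ and } v = 1, \\ -(u + 1)s + \min(l, \theta) & \text{if } v > 1. \end{cases} \]

If $\theta = 0$, then $\omega \le -3/2$, and

\[ \sum_{l=1}^{+\infty} |S_n(p^l)| \ll_{k,s} p^{-3/2} \max(k, \gamma) \ll_k p^{-3/2}. \]

Suppose now that $\theta > 0$. We consider two cases for the exponents $l$.

Case 1: $l \le \theta$. If $s \ge \max(5, k+2)$, then for those $l$ we have $\omega \le -3/2 - 2u$, so the contribution of these exponents to the sum $\sum_l |S_n(p^l)|$ is

\[ \ll_{k,s} p^{-3/2} \sum_{u=0}^{+\infty} p^{-2u} \ll_k p^{-3/2}. \]

If $s \ge \max(4, k)$, then for those $l$ we have $\omega \le 0$, so their contribution to the sum is $\ll_{k,s} \theta$.

Case 2: $l > \theta$. If $s \ge \max(5, k+2)$, then for those $l$ we have $\omega \le -2$, so their contribution to the sum is $\ll_{k,s} p^{-2} \max(k, \gamma) \ll_k p^{-2}$. If $s \ge \max(4, k)$, then for those $l$ we have again $\omega \le 0$, so their contribution to the sum is $\ll_{k,s} \max(k, \gamma) \ll_k 1$.

We conclude that

\[ \sum_{l=1}^{+\infty} |S_n(p^l)| \ll_{k,s} p^{-3/2} \quad \text{resp. } \theta, \]

when $s \ge \max(5, k+2)$ resp. $\max(4, k)$. If $s \ge \max(5, k+2)$, then we have

\[ S(n) \ll_{k,s} \prod_p \bigl( 1 + C p^{-3/2} \bigr) \ll_{k,s} 1, \]

where $C$ is a constant which only depends on $k$ and $s$. If $s \ge \max(4, k)$, then

\begin{align*}
S(n) &\ll_{k,s} \prod_{p \mid n} (1 + C_1 \theta_p) \prod_{p \nmid n} \bigl( 1 + C_2 p^{-3/2} \bigr) \\
&\ll_{k,s} \prod_{p \mid n} (1 + \theta_p)^{C_1} = d(n)^{C_1} \ll_{k,s,\varepsilon} n^\varepsilon,
\end{align*}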

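where again the constants $C_1, C_2$ depend only on $k$ and $s$.

In order to control the error term in (3.5), we will need the following technical lemmas.

Lemma 3.2.8. Suppose $s \ge \max(4, k+1)$, $Q > 0$. Then for any $\varepsilon > 0$,

\[ \sum_{q \le Q} q^{1/k} |S_n(q)| \ll_{k,s,\varepsilon} (nQ)^\varepsilon. \]

Proof. Analogous to the preceding proof,

\[ (p^l)^{1/k} S_n(p^l) = \begin{cases} O_{k,s}(1) & \text{if } l \le \theta, \\ O_{k,s}(1/p) & \text{if } \theta < l \le \theta + \max(k, \gamma), \\ 0 & \text{if } l > \theta + \max(k, \gamma). \end{cases} \]

Therefore (with constants $C_1, C_2$ only depending on $k$ and $s$)

\begin{align*}
\sum_{q \le Q} q^{1/k} |S_n(q)| &\le \prod_{p \le Q} (1 + C_1 \theta_p + C_2/p) \\
&\le d(n)^{C_1} \prod_{p \le Q} (1 + C_2/p) \ll_{k,s,\varepsilon} (nQ)^\varepsilon,
\end{align*}

since $\prod_{p \le Q} (1 + 1/p) \ll \log Q$.

This lemma also allows us to estimate the tail of $S(n)$.

Lemma 3.2.9. Suppose $s \ge \max(4, k+1)$, $Q > 0$. Then for any sufficiently small $\varepsilon > 0$, we have

\[ \sum_{q > Q} |S_n(q)| \ll_{k,s,\varepsilon} n^\varepsilon Q^{\varepsilon - 1/k}. \]

Proof. For any $\varepsilon > 0$, we have by Lemma 3.2.8 that

\[ \sum_{Q < q \le 2Q} |S_n(q)| \le Q^{-1/k} \sum_{Q < q \le 2Q} q^{1/k} |S_n(q)| \ll_{k,s,\varepsilon} n^\varepsilon Q^{\varepsilon - 1/k}. \]

Applying this inductively with $Q$ replaced by $2^l Q$, we get

\[ \sum_{q > Q} |S_n(q)| \ll_{k,s,\varepsilon} n^\varepsilon Q^{\varepsilon - 1/k} \bigl( 1 + 2^{\varepsilon - 1/k} + \dots + 2^{l(\varepsilon - 1/k)} + \cdots \bigr) \ll_{k,s,\varepsilon} n^\varepsilon Q^{\varepsilon - 1/k}, \]

when $\varepsilon$ is sufficiently small (say $\varepsilon < 1/(2k)$).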

Define

\[ S^*(q) = \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \bigl| q^{-1} S(q,a) \bigr|^{s-1}. \]

Lemma 3.2.10. Suppose $s \ge \max(5, k+1)$ and $Q > 0$. Then for any $\varepsilon > 0$,

\[ \sum_{q \le Q} q^{-1/4} S^*(q) \ll_{k,s,\varepsilon} Q^\varepsilon. \]

Proof. In the same manner as for $S_n(q)$, one proves that $S^*(q)$ is multiplicative. Lemma 2.3.6 and a straightforward calculation yield

\[ S^*(p^l) \ll_{k,s} \begin{cases} p^{l - u(s-1) - (s-1)/2} & \text{if } v = 1, \\ p^{l - u(s-1) - (s-1)} & \text{if } v > 1, \end{cases} \]

where $l = uk + v$ with $1 \le v \le k$. Then

\[ \sum_{q \le Q} q^{-1/4} S^*(q) \le \prod_{p \le Q} \Bigl( 1 + \sum_{l=1}^{+\infty} p^{-l/4} S^*(p^l) \Bigr). \]

By our assumption on $s$,

\begin{align*}
\sum_{l=1}^{+\infty} p^{-l/4} S^*(p^l) &\ll_{k,s} \sum_{u=0}^{+\infty} p^{3uk/4 - u(s-1)} \Bigl( p^{3/4 - (s-1)/2} + \sum_{v=2}^{k} p^{3v/4 - (s-1)} \Bigr) \\
&\ll_{k,s} \sum_{u=0}^{+\infty} p^{-uk/4} \bigl( p^{3/4 - 2} + k p^{3k/4 - (s-1)} \bigr) \ll_{k,s} \frac1p,
\end{align*}

since the series is $\ll 1$ and $3k/4 - (s-1) \le -1$. The lemma then follows by using the fact that $\prod_{p \le Q} (1 + 1/p) \ll \log Q$.

Next it is of great interest to prove that $S(n)$ is bounded from below. Otherwise the asymptotic approximation (3.5) may not be of much use. The singular series is in fact strongly connected with the “Waring problem mod $q$”. By this we mean the following: define $M_n(q)$ to be the number of solutions of $m_1^k + \dots + m_s^k \equiv n \bmod q$, with $1 \le m_i \le q$. Then we have the following nice relation.

Lemma 3.2.11.

\[ (S_n * 1)(q) = \sum_{d \mid q} S_n(d) = q^{1-s} M_n(q). \]

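Proof. We can express $M_n(q)$ as an exponential sum (by the orthogonality of the additive characters $r \mapsto e(rh/q)$)

\[ M_n(q) = \frac1q \sum_{r=1}^{q} \sum_{m_1=1}^{q} \cdots \sum_{m_s=1}^{q} e\bigl( r (m_1^k + \dots + m_s^k - n)/q \bigr). \]

By grouping together the $r$ by $(r, q)$, we get that, with $d = q/(r, q)$,

\begin{align*}
M_n(q) &= \frac1q \sum_{d \mid q} \sum_{\substack{a=1 \\ (a,d)=1}}^{d} \sum_{m_1=1}^{q} \cdots \sum_{m_s=1}^{q} e\bigl( a (m_1^k + \dots + m_s^k - n)/d \bigr) \\
&= \frac1q \sum_{d \mid q} \sum_{\substack{a=1 \\ (a,d)=1}}^{d} \Bigl( \frac{q}{d} \Bigr)^s \sum_{m_1=1}^{d} \cdots \sum_{m_s=1}^{d} e\bigl( a (m_1^k + \dots + m_s^k - n)/d \bigr) \\
&= q^{s-1} \sum_{d \mid q} S_n(d).
\end{align*}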
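The relation between the sums $S_n(d)$ and the congruence counts $M_n(q)$ can be verified numerically for small moduli. The sketch below assumes the definitions $S(q,a) = \sum_{m=1}^{q} e(am^k/q)$ and $S_n(q) = \sum_{(a,q)=1} (S(q,a)/q)^s e(-an/q)$ used in this section; the function names are illustrative.

```python
import cmath
from math import gcd

def e(x):
    """The additive character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * cmath.pi * x)

def complete_sum(q, a, k):
    """S(q, a): the sum of e(a m^k / q) over m = 1, ..., q."""
    return sum(e((a * pow(m, k, q)) % q / q) for m in range(1, q + 1))

def S_n(q, n, k, s):
    """Sum over reduced residues a mod q of (S(q, a)/q)^s e(-a n / q)."""
    return sum((complete_sum(q, a, k) / q) ** s * e(-((a * n) % q) / q)
               for a in range(1, q + 1) if gcd(a, q) == 1)

def M_n(q, n, k, s):
    """Number of (m_1, ..., m_s) in [1, q]^s with m_1^k + ... + m_s^k = n mod q."""
    residue_counts = [0] * q
    for m in range(1, q + 1):
        residue_counts[pow(m, k, q)] += 1
    dist = [0] * q
    dist[0] = 1                          # the empty sum is 0 mod q
    for _ in range(s):
        nxt = [0] * q
        for r, ways in enumerate(dist):
            if ways:
                for t, c in enumerate(residue_counts):
                    if c:
                        nxt[(r + t) % q] += ways * c
        dist = nxt
    return dist[n % q]

def divisor_sum(q, n, k, s):
    """Sum of S_n(d) over the divisors d of q."""
    return sum(S_n(d, n, k, s) for d in range(1, q + 1) if q % d == 0)

if __name__ == "__main__":
    n, k, s = 5, 2, 3
    for q in (4, 9, 12):
        print(q, divisor_sum(q, n, k, s).real, M_n(q, n, k, s) / q ** (s - 1))
```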

Using this relation, we see that

\[ \sum_{l=0}^{+\infty} S_n(p^l) = \lim_{l \to +\infty} p^{l(1-s)} M_n(p^l), \]

(provided that the series converges). Providing lower bounds for $M_n(p^l)$ can thus give a lower bound for $S(n)$. Via elementary means, one can show that (see for example [37, Lemmas 2.13–2.15])

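Lemma 3.2.12. Suppose that

\[ s \ge \begin{cases} \dfrac{p}{p-1} \bigl( k, p^{\tau(p)}(p-1) \bigr) & \text{if } \gamma(p) = \tau(p) + 1, \\ 2^{\tau(2)+2} & \text{if } p = 2 \text{ and } \tau(2) > 0 \text{ and } k > 2, \\ 5 & \text{if } p = k = 2. \end{cases} \]

Then $M_n(p^l) \ge p^{(l - \gamma(p))(s-1)}$.

This allows us to prove that the singular series is bounded from below.

Theorem 3.2.13. Suppose that

\[ s \ge \begin{cases} 5 & \text{if } k = 2, \\ 4k & \text{if $k$ is a power of 2, and } k > 2, \\ (3k)/2 & \text{otherwise}. \end{cases} \]

Then $S(n) \gg_{k,s} 1$.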

Proof. One verifies that the conditions on $s$ imply that the conditions in Lemma 3.2.12 are fulfilled for any $p$, and that $s \ge \max(4, k+1)$, so in particular $S(n)$ converges absolutely. By Lemmas 3.2.11 and 3.2.12, we have that

\[ \sum_{l=0}^{+\infty} S_n(p^l) \ge p^{\gamma(p)(1-s)}. \]

It therefore suffices to show that for $p > k$

\[ \sum_{l=1}^{+\infty} S_n(p^l) \ge -C p^{-3/2}, \]

for some constant $C$ only depending on $k$ and $s$, since then we can bound the factors in the Euler product where $p < \max(k+1, C^4)$ from below by $p^{\gamma(p)(1-s)}$, and the factors with $p \ge \max(k+1, C^4)$ by $(1 - p^{-5/4})$. To do this, we will use the bounds obtained in the proof of Proposition 3.2.6. If $p > k$, then $\gamma(p) = 1$. As before, write $l = uk + v$ with $1 \le v \le k$. Define $\theta$ via $p^\theta \parallel n$. By equation (3.7),

\[ \sum_{\substack{l=1 \\ l \not\equiv 1 \bmod k}}^{+\infty} S_n(p^l) \ge -p^{\theta + 1 - ([\theta/k] + 1)s - 1} \ge -p^{-2}, \]

since only the term with $l = \theta + 1$ is negative, and

\[ \theta + 1 - [\theta/k]s - s \le \theta + 1 - [\theta/k]k - k - 1 \le -1. \]

By equation (3.9), if $l = \theta + 1$ and $l \equiv 1 \bmod k$ (which may or may not occur), then

\[ S_n(p^l) \ll_{k,s} p^{-s/2 + 1/2 + [l/k](k-s)} \ll p^{-3/2}. \]

It remains to treat

\[ \sum_{\substack{l \le \theta \\ l \equiv 1 \bmod k}} S_n(p^l), \qquad \text{since} \qquad \sum_{\substack{l > \theta + 1 \\ l \equiv 1 \bmod k}} S_n(p^l) = 0. \]

If $s \ge 5$ and $l \le \theta$ and $l \equiv 1 \bmod k$, then again by (3.9)

\[ S_n(p^l) \ll_{k,s} p^{[l/k](k-s) - s + s/2 + 1} \ll p^{[l/k](k-s) - 3/2}, \]

and

\[ \sum_{\substack{l \le \theta \\ l \equiv 1 \bmod k}} S_n(p^l) \ll_{k,s} p^{-3/2} \sum_{\substack{l \le \theta \\ l \equiv 1 \bmod k}} p^{-(s-k)[l/k]} \ll_{k,s} p^{-3/2}, \]

since $s \ge k + 1$. Suppose finally that $s = 4$, $l \le \theta$ and $l \equiv 1 \bmod k$. The proof of Proposition 3.2.6 shows that

\[ S_n(p^l) = p^{-[l/k](4-k) - 4} \sum_{a=1}^{p-1} S_k(p, a)^4. \]

Since $s = 4$, $k = 2$ or $k = 3$, and hence $S_k(p, a)$ is real or purely imaginary: for $k = 2$ this follows from Theorem 2.2.22, and for $k = 3$, this follows by noting that $S_3(p, a) = S_3(p, -a) = \overline{S_3(p, a)}$. Hence the fourth power in the above equation is always non-negative.

Summarising, we have proved that, in any case, there exists a constant $C$ only depending on $k$ and $s$ such that for $p > k$

\[ \sum_{l=1}^{+\infty} S_n(p^l) \ge -C p^{-3/2}, \]

and this proves the theorem.

3.2.3 The singular integral

We will now treat the singular integral $J(n)$. First we will show an asymptotic formula for $J(n)$, and then we will show that the error introduced by replacing the $J_n(q)$ by $J(n)$ is not too large.

Lemma 3.2.14. Suppose $\alpha, \beta$ are real numbers with $\alpha \ge \beta > 0$, $\beta \le 1$. Then

\[ \sum_{m=1}^{n-1} m^{\beta-1} (n-m)^{\alpha-1} = n^{\beta+\alpha-1} \bigl( B(\alpha, \beta) + O_{\alpha,\beta}( n^{-\beta} ) \bigr), \]

where $B$ is the Euler beta function.

Proof. Consider the function $\varphi(t) = t^{\beta-1} (n-t)^{\alpha-1}$. By Euler summation,

\[ \sum_{m=1}^{n-1} m^{\beta-1} (n-m)^{\alpha-1} = \int_1^{n-1} \varphi(t) \, dt + \varphi(1) + \int_1^{n-1} \{t\} \varphi'(t) \, dt. \]

It is readily seen that $\varphi(1) \ll_\alpha n^{\alpha-1}$. Now since $\varphi'(t)$ has at most one zero $t_0 \in \,]1, n-1[$, we see that

\[ \int_1^{n-1} \{t\} \varphi'(t) \, dt \ll \begin{cases} \varphi(1) + \varphi(n-1) & \text{if $\varphi'$ has no zero in } ]1, n-1[, \\ \varphi(1) + 2\varphi(t_0) + \varphi(n-1) & \text{if $\varphi'$ has the unique zero } t_0 \in \,]1, n-1[. \end{cases} \]

In both cases, these are $\ll_{\alpha,\beta} n^{\alpha-1}$. The first integral equals

\begin{align*}
& n^{\alpha+\beta-1} \int_{1/n}^{1-1/n} t^{\beta-1} (1-t)^{\alpha-1} \, dt \\
&\qquad = n^{\alpha+\beta-1} B(\alpha, \beta) - n^{\alpha+\beta-1} \Bigl( \int_0^{1/n} + \int_{1-1/n}^{1} \Bigr) t^{\beta-1} (1-t)^{\alpha-1} \, dt,
\end{align*}

and the lemma follows by noting that the second term is $\ll_{\alpha,\beta} n^{\alpha-1}$.

Theorem 3.2.15. For $s \ge 2$,

\[ J(n) = \Gamma\Bigl( 1 + \frac1k \Bigr)^s \Gamma\Bigl( \frac{s}{k} \Bigr)^{-1} n^{s/k-1} + O_{k,s}\bigl( n^{(s-1)/k-1} \bigr). \]

Proof. By the orthogonality of the exponentials,

\[ J(n) = \int_{-1/2}^{1/2} \Bigl( \frac1k \sum_{m=1}^{n} m^{1/k-1} e(my) \Bigr)^s e(-ny) \, dy = \sum_{\substack{1 \le m_1, \dots, m_s \le n \\ m_1 + \dots + m_s = n}} k^{-s} (m_1 \cdots m_s)^{1/k-1}. \]

We will estimate the latter sum via induction on $s$. Let us write the dependence of $J$ on $s$ explicitly via $J(n) = J(n, s)$. For $s = 2$, we apply Lemma 3.2.14 with $\alpha = \beta = 1/k$, and we get

\[ J(n, 2) = k^{-2} B\Bigl( \frac1k, \frac1k \Bigr) n^{2/k-1} + O_k\bigl( n^{1/k-1} \bigr) = \Gamma\Bigl( 1 + \frac1k \Bigr)^2 \Gamma\Bigl( \frac2k \Bigr)^{-1} n^{2/k-1} + O_k\bigl( n^{1/k-1} \bigr), \]

where we used the recurrence relations for the beta and gamma functions. Now suppose the theorem holds for some $s \ge 2$. Then we have

\[ J(n, s+1) = \sum_{m=1}^{n-1} k^{-1}m^{1/k-1}J(n-m, s) = \Gamma\Bigl(1+\frac1k\Bigr)^s\Gamma\Bigl(\frac sk\Bigr)^{-1}k^{-1}\sum_{m=1}^{n-1} m^{1/k-1}(n-m)^{s/k-1} + O_{k,s}\Bigl(\sum_{m=1}^{n-1} m^{1/k-1}(n-m)^{(s-1)/k-1}\Bigr). \]

Applying Lemma 3.2.14 with $\alpha = s/k$ and $\beta = 1/k$, we see that the sum in the main term is $B(1/k, s/k)\,n^{(s+1)/k-1} + O_{k,s}(n^{s/k-1})$, and the same lemma with $\alpha = (s-1)/k$ and $\beta = 1/k$ gives that the error term is $O_{k,s}(n^{s/k-1})$. Hence, by using the recurrence relations for the beta and gamma functions, the theorem follows.
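The case $s = 2$ of Theorem 3.2.15 is easy to test numerically. The following is only a sanity-check sketch: the function names are hypothetical, and the factor 2 in the comparison is an arbitrary generous constant, not one derived in the text.

```python
import math


def singular_integral_s2(n, k):
    """J(n, 2) = k^{-2} * sum_{m=1}^{n-1} (m (n - m))^{1/k - 1}."""
    exponent = 1.0 / k - 1.0
    return sum((m * (n - m)) ** exponent for m in range(1, n)) / k**2


def main_term_s2(n, k):
    """Gamma(1 + 1/k)^2 / Gamma(2/k) * n^{2/k - 1}, the main term for s = 2."""
    return math.gamma(1 + 1 / k) ** 2 / math.gamma(2 / k) * n ** (2 / k - 1)


for k in (2, 3):
    for n in (1000, 10000):
        error = abs(singular_integral_s2(n, k) - main_term_s2(n, k))
        # The theorem predicts error = O(n^{1/k - 1}); print the rescaled error.
        print(k, n, error / n ** (1 / k - 1))
```

If the theorem is applied correctly, the printed rescaled errors should stay bounded as $n$ grows.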

In order to show that we may indeed replace

\[ J_n(q) = \int_{-P/(nq)}^{P/(nq)} v(y)^s e(-ny)\,dy \]

by $J(n)$ without making too large an error, we need some estimate on $v$.

Lemma 3.2.16. Suppose $|y| \le 1/2$. Then

\[ v(y) \ll \min\bigl(n^{1/k}, |y|^{-1/k}\bigr). \]

Proof. We have

\[ v(y) = \frac1k\sum_{m=1}^{n} m^{1/k-1}e(ym) \ll n^{1/k} + O(1), \]

by equation (3.2). If $|y| \le 1/n$, then the upper bound stated in the lemma holds. Now suppose $|y| > 1/n$, and set $M = |y|^{-1}$. The terms in the sum with $m \le M$ contribute $\ll M^{1/k} \le |y|^{-1/k}$. To estimate the remaining terms, put

\[ d_m = \frac1k m^{1/k-1}, \qquad S_m = \sum_{r=1}^{m} e(yr). \]

By summation by parts,

\[ \frac1k\sum_{m=M+1}^{n} m^{1/k-1}e(ym) = d_{n+1}S_n - d_{M+1}S_M + \sum_{m=M+1}^{n}(d_m - d_{m+1})S_m. \]

By Proposition 2.1.1, $|S_m| \ll |y|^{-1}$, and since the sequence $(d_m)_m$ is decreasing, we have

\[ \frac1k\sum_{m=M+1}^{n} m^{1/k-1}e(ym) \ll d_{M+1}|y|^{-1} < |y|^{-1/k}. \]
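As an illustration of Lemma 3.2.16, one can evaluate $v(y)$ directly and compare it with $\min(n^{1/k}, |y|^{-1/k})$. This is a minimal sketch with hypothetical helper names; the sample points and the values $n = 2000$, $k = 3$ are arbitrary choices.

```python
import cmath
import math


def v(y, n, k):
    """v(y) = (1/k) * sum_{m=1}^{n} m^{1/k - 1} e(m y), with e(x) = exp(2 pi i x)."""
    total = sum(m ** (1 / k - 1) * cmath.exp(2j * math.pi * m * y) for m in range(1, n + 1))
    return total / k


def lemma_bound(y, n, k):
    """min(n^{1/k}, |y|^{-1/k}); at y = 0 only the first quantity is finite."""
    if y == 0:
        return n ** (1 / k)
    return min(n ** (1 / k), abs(y) ** (-1 / k))


n, k = 2000, 3
sample_points = [0.0, 1e-4, 1 / n, 0.003, 0.01, 0.1, 0.25, 0.5]
ratios = [abs(v(y, n, k)) / lemma_bound(y, n, k) for y in sample_points]
for y, ratio in zip(sample_points, ratios):
    print(y, ratio)
```

The ratios should remain bounded by a small constant across the whole range $|y| \le 1/2$, which is what the lemma asserts.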

3.2.4 The contribution from the major arcs

Theorem 3.2.17. Suppose $s \ge \max(5, k+1)$. With the definition of the major arcs as in (3.4), there exists a positive number $\delta$ such that

\[ R_{M,s}(n) = \Gamma\Bigl(1+\frac1k\Bigr)^s\Gamma\Bigl(\frac sk\Bigr)^{-1}S(n)\,n^{s/k-1} + O_{k,s}\bigl(n^{s/k-1-\delta}\bigr). \]

Proof. Suppose $x \in M(q, a)$. Then by Theorem 3.2.4 we have for any $\varepsilon > 0$ that $f(x) - V(x,q,a) \ll_{k,s,\varepsilon} q^{1/2+\varepsilon}$. Therefore, we have that

\[ f(x)^s - V(x,q,a)^s \ll_{k,s,\varepsilon} \begin{cases} \bigl(q^{1/2+\varepsilon}\bigr)^s & \text{if } |V(x,q,a)| \le q^{1/2+\varepsilon}, \\ \bigl(q^{1/2+\varepsilon}\bigr)^s + q^{1/2+\varepsilon}|V(x,q,a)|^{s-1} & \text{if } |V(x,q,a)| > q^{1/2+\varepsilon}, \end{cases} \]

and so the second estimate always holds. Using this, we get

\[ \sum_{\substack{a=1 \\ (a,q)=1}}^{q}\int_{M(q,a)}\bigl(f(x)^s - V(x,q,a)^s\bigr)e(-nx)\,dx \ll_{k,s,\varepsilon} q\,\frac{P}{nq}\bigl(q^{1/2+\varepsilon}\bigr)^s + q^{1/2+\varepsilon}S^*(q)\int_{-1/2}^{1/2}|v(x)|^{s-1}\,dx. \]

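Summing over all $q$ up to $P$, the first term contributes

\[ \frac{P^2}{n}\bigl(P^{1/2+\varepsilon}\bigr)^s \ll_{k,s,\varepsilon} n^{(2+s/2)/k-1+\varepsilon s/k}. \]

For the second term, note that by Lemma 3.2.16,

\[ \int_{-1/2}^{1/2}|v(x)|^{s-1}\,dx \ll \int_{-1/n}^{1/n} n^{(s-1)/k}\,dx + \int_{1/n}^{1/2} x^{-(s-1)/k}\,dx \ll_\varepsilon n^{(s-1)/k-1+\varepsilon}, \]

where the extra $\varepsilon$ in the exponent comes from the possibility $s = k+1$. By Lemma 3.2.10,

\[ \sum_{q\le P} q^{1/2+\varepsilon}S^*(q) \le P^{3/4+\varepsilon}\sum_{q\le P} q^{-1/4}S^*(q) \ll_\varepsilon n^{3/(4k)+\varepsilon/k+\varepsilon/k}. \]

Now note that

\[ \frac{2+s/2}{k} - 1 + \frac{\varepsilon s}{k} \le \frac sk - 1 - \frac{1-2\varepsilon s}{2k}, \qquad \frac{s-1}{k} - 1 + \varepsilon + \frac{3}{4k} + \frac{2\varepsilon}{k} = \frac sk - 1 - \Bigl(\frac{1}{4k} - \frac{2\varepsilon}{k} - \varepsilon\Bigr), \]

by our assumptions on $s$. If we choose $\varepsilon$ sufficiently small (depending on $k$ and $s$), then we find for some positive $\delta$ that

\[ R_{M,s}(n) = \sum_{q\le P}\sum_{\substack{a=1 \\ (a,q)=1}}^{q}\int_{M(q,a)} V(x,q,a)^s e(-nx)\,dx + O_{k,s}\bigl(n^{s/k-1-\delta}\bigr) = \sum_{q\le P} S_n(q)\int_{-P/(qn)}^{P/(qn)} v(y)^s e(-ny)\,dy + O_{k,s}\bigl(n^{s/k-1-\delta}\bigr). \]

Now we will extend the domain of integration to $[-1/2, 1/2]$: put $N(q) = [-1/2, 1/2] \setminus [-P/(qn), P/(qn)]$. By Lemma 3.2.16,

\[ S_n(q)\int_{N(q)} |v(y)|^s\,dy \ll |S_n(q)|\int_{P/(nq)}^{+\infty} y^{-s/k}\,dy \ll_{k,s} |S_n(q)|\Bigl(\frac{nq}{P}\Bigr)^{s/k-1}. \]

Summing over $q$ up to $P$, we get by Lemma 3.2.8 that

\[ \sum_{q\le P} S_n(q)\int_{N(q)} v(y)^s e(-ny)\,dy \ll_{k,s,\varepsilon} n^{s/k-1}P^{-1/k}(nP)^{\varepsilon} \ll n^{s/k-1-(1/k^2-(1+1/k)\varepsilon)}. \]

Again by taking $\varepsilon$ sufficiently small (depending on $k$ and $s$), we get that this is $\ll_{k,s} n^{s/k-1-\delta}$. Therefore,

\[ R_{M,s}(n) = S(n,P)J(n) + O_{k,s}\bigl(n^{s/k-1-\delta}\bigr), \qquad \text{with } S(n,P) = \sum_{q\le P} S_n(q). \]

By Lemma 3.2.9, we have that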

\[ \sum_{q>P} S_n(q) \ll_{k,s,\varepsilon} n^{\varepsilon+\varepsilon/k-1/k^2} \le n^{-\delta}, \]

if we take $\varepsilon$ sufficiently small. Using Theorem 3.2.15 for the asymptotics of $J(n)$, we get

\[ R_{M,s}(n) = \Gamma\Bigl(1+\frac1k\Bigr)^s\Gamma\Bigl(\frac sk\Bigr)^{-1}S(n)\,n^{s/k-1} + O_{k,s}\bigl(n^{s/k-1-1/k}S(n) + n^{s/k-1-1/k-\delta} + n^{s/k-1-\delta}\bigr), \]

and by Theorem 3.2.7, the error term is $\ll_{k,s} n^{s/k-1-\delta}$, which concludes the proof of the theorem.

3.2.5 The contribution from the minor arcs

Theorem 3.2.18. Define

1 1 [l/k]  1 1  η = 1 − , σ = max , , k,l 2 k 2k−1 8d2k2 log(2k)e2 and define s0 as the smallest integer such that  2  s0 > min 2l + ηk,l . l σ With the definition of the minor arcs as in (3.4), we have that there is a positive number δ such that for every s ≥ s0

\[ R_{m,s}(n) \ll_{k,s} n^{s/k-1-\delta}. \]

Proof. Suppose $\alpha \in m$. By Dirichlet's theorem on Diophantine approximation (Theorem C.1.1), there exist $a$ and $q$ with $q \le n/P$, $(a, q) = 1$ and $|\alpha - a/q| \le P/(nq)$. Since $\alpha \in m$, we must have that $q > P$. By the bounds obtained via Weyl's and Vinogradov's methods, Corollaries 2.4.6 and 2.4.9, we have that for any $\varepsilon > 0$

\[ f(\alpha) \ll_{k,\varepsilon} N^{1+\varepsilon-\sigma}. \]

Now suppose $l$ is an integer with $2l \le s$. Then

\[ |R_{m,s}(n)| \le \int_{m} |f(x)|^s\,dx \le \sup_{\alpha\in m}|f(\alpha)|^{s-2l}\int_0^1 |f(x)|^{2l}\,dx. \]

The integral equals the number of solutions of

\[ x_1^k + \dots + x_l^k = y_1^k + \dots + y_l^k, \qquad 1 \le x_j, y_j \le N. \]

By considering the system of equations

\[ \sum_{r=1}^{l}(x_r^j - y_r^j) = h_j \quad (1 \le j \le k-1), \qquad \sum_{r=1}^{l}(x_r^k - y_r^k) = 0, \]

where we consider the $h_j$ as additional free variables, we see that this is $\ll_{k,l} N^{k(k-1)/2}J_l^{(k)}(N)$. Then by Vinogradov's mean value theorem (Theorem 2.4.12), we have

\[ R_{m,s}(n) \ll_{k,s,\varepsilon} N^{(1+\varepsilon-\sigma)(s-2l)+2l-k+\eta_{k,l}} = N^{s-k-\delta}, \]

where $\delta = (s-2l)(\sigma-\varepsilon) - \eta_{k,l}$. Choose $\varepsilon = \sigma/2$; then

\[ \delta > 0 \quad \text{if} \quad s > 2l + \frac{2}{\sigma}\eta_{k,l}. \]

If we now put $s_0$ to be the smallest integer such that

\[ s_0 > \min_{l}\Bigl(2l + \frac{2}{\sigma}\eta_{k,l}\Bigr), \]

then we have indeed that for some $\delta > 0$ and for every $s \ge s_0$

\[ R_{m,s}(n) \ll_{k,s} n^{s/k-1-\delta}. \]

Combining Theorem 3.2.17, Theorem 3.2.18, Theorem 3.2.7 and Theorem 3.2.13, we get

Theorem 3.2.19. Using the notations from Theorem 3.2.18, we have that for every $s \ge s_0$ there is a positive number $\delta$ such that

\[ R_s(n) = \Gamma\Bigl(1+\frac1k\Bigr)^s\Gamma\Bigl(\frac sk\Bigr)^{-1}S(n)\,n^{s/k-1} + O_{k,s}\bigl(n^{s/k-1-\delta}\bigr). \tag{3.10} \]

Furthermore $S(n) \asymp_{k,s} 1$.

Corollary 3.2.20. Suppose s ≥ s0. Then every sufficiently large integer can be written as the sum of s kth-powers.

Remark 3.2.21.

• By choosing for example $l = [6k^2\log k]$ in the definition of $s_0$, we see that

\[ s_0 \le k^2\log k\,\Bigl(12 + O\Bigl(\frac{\log k}{k^2}\Bigr)\Bigr). \]

• Using Hua's Lemma⁴, which states that for $1 \le j \le k$

\[ \int_0^1 |f(x)|^{2^j}\,dx \ll_{k,\varepsilon} N^{2^j-j+\varepsilon}, \]

it is not hard to show that Theorem 3.2.19 holds with $s_0$ replaced by $2^k + 1$. We have that $2^k + 1 \le s_0$ for small values of $k$, while $s_0 < 2^k + 1$ if $k \ge 13$ or so.

Denote by $\tilde G(k)$ the minimum value of $s_0$ such that for each $s \ge s_0$ the asymptotic formula (3.10) holds for some $\delta > 0$ and $n$ sufficiently large. It is obvious that $G(k) \le \tilde G(k)$. Theorem 3.2.19 states that

\[ \tilde G(k) \le s_0 \le k^2\log k\,(12 + o(1)). \]

Using his improvements in Vinogradov's mean value theorem, Wooley [45] showed that

\[ \tilde G(k) \le 2k^2 + 2k - 3. \]

One can also prove upper bounds for G(k) which are significantly better than the ones provided by upper bounds for G˜(k). The current best bound for G(k) is also due to Wooley, who proved that

G(k) ≤ k log k + k log log k + O(k).

Note that for such smaller values of $s$, one does not necessarily have the asymptotic formula (3.10). For a survey article on Waring's problem, we refer to the article by Vaughan and Wooley [36].

⁴ The proof of Hua's Lemma uses the same ideas as the proof of Weyl's inequality (Theorem 2.4.5). We omit the details and refer to [37, Lemma 2.5].

3.3 The ternary Goldbach problem

In 1742, in a series of letters to Euler, the German mathematician Christian Goldbach famously conjectured that every number greater than 2 can be written as the sum of three primes, and that every even number greater than 2 can be written as the sum of two primes. Since Goldbach considered 1 to be a prime number, with the current definition of primes these conjectures are formulated as follows.

• (Ternary Goldbach conjecture). Every odd number greater than 5 can be written as the sum of three primes.

• (Binary Goldbach conjecture). Every even number greater than 2 can be written as the sum of two primes.

The first statement is also known as the weak Goldbach conjecture, and the second one as the strong Goldbach conjecture, since the second implies the first. To this day, the strong conjecture remains unproven. All even numbers up to $4\cdot 10^{18}$ have been checked, and not a single counterexample has been found. The ternary Goldbach conjecture was proven only fairly recently, in 2013, by Helfgott [16].

For a long time, there was no real progress towards a solution of these problems, until the development of the circle method. In 1923, Hardy and Littlewood [13, 14] proved that, under the assumption of the generalised Riemann hypothesis⁵, every sufficiently large odd number is the sum of three primes. In 1937, Vinogradov [40] gave an unconditional proof of this statement, using his refinement of the circle method and his method of estimating exponential sums. In this section, we will present Vinogradov's proof.

As described in Section 3.1, we will use a weighted counting function for the number of ways in which an integer $n$ can be written as the sum of three primes: define $R(n)$ as

\[ R(n) = \sum_{p_1+p_2+p_3=n}(\log p_1)(\log p_2)(\log p_3). \]

The generating function is then defined as

\[ f(x) = \sum_{p\le n}(\log p)\,e(px), \]

and

\[ R(n) = \int_{\mathbb{R}/\mathbb{Z}} f(x)^3 e(-nx)\,dx. \]

The reason for working with these weighted versions is that these weighted prime sums are easier to handle with results from multiplicative analytic number theory.

3.3.1 The contribution from the major arcs

To find a good approximation for the generating function and a proper definition of the major and minor arcs, it is instructive to see what we can say about the value of $f$ at rationals: suppose $1 \le a \le q$ with $(a, q) = 1$. Then

\[ f\Bigl(\frac aq\Bigr) = \sum_{p\le n}(\log p)\,e\Bigl(\frac{ap}{q}\Bigr) = \sum_{\substack{r=1 \\ (r,q)=1}}^{q} e\Bigl(\frac{ar}{q}\Bigr)\sum_{\substack{p\le n \\ p\equiv r \bmod q}}\log p + \sum_{\substack{p\mid q \\ p\le n}}(\log p)\,e\Bigl(\frac{ap}{q}\Bigr). \]

⁵ The generalised Riemann hypothesis states that for every Dirichlet L-function $L(\chi, s)$ every non-trivial zero of $L(\chi, s)$ has real part 1/2.

Now if $q$ is small compared to $n$, then the inner sum of the first term can be estimated via a quantitative form of the prime number theorem for primes in arithmetic progressions (see for example [30, chapter 11]):

Theorem 3.3.1 (Siegel-Walfisz). There is a constant $c$ such that for any $A > 0$, and $1 \le a \le q \le (\log x)^A$ with $(a, q) = 1$,

\[ \vartheta(x; q, a) = \frac{1}{\varphi(q)}x + O_A\bigl(x\exp(-c\sqrt{\log x})\bigr), \]

where $\vartheta(x; q, a)$ is given by

\[ \vartheta(x; q, a) = \sum_{\substack{p\le x \\ p\equiv a \bmod q}}\log p. \]

Using this theorem, the value of $f$ at $a/q$ with $q \le (\log n)^A$ becomes

\[ f\Bigl(\frac aq\Bigr) = \sum_{\substack{r=1 \\ (r,q)=1}}^{q} e\Bigl(\frac{ar}{q}\Bigr)\Bigl(\frac{1}{\varphi(q)}n + O_A\bigl(n\exp(-c\sqrt{\log n})\bigr)\Bigr) + O\bigl((\log n)\,\omega(q)\bigr) = \frac{\mu(q)}{\varphi(q)}n + O_A\bigl(n\exp(-c\sqrt{\log n})\bigr), \tag{3.11} \]

where $\omega(q)$ is the number of distinct prime factors of $q$, which satisfies $\omega(q) \ll \log q$, and where we used that $c_q(a) = c_q(1) = \mu(q)$.

In order to find a good approximation for $f$ near those rationals with small denominator, we start with the same ansatz as in Subsection 3.2.1. We suggest an approximation of the form $\sum_{m\le n} c_m e(my)$. Then,

\[ f\Bigl(\frac aq + y\Bigr) - \sum_{m=1}^{n} c_m e(my) = \sum_{m=1}^{n}\tilde c_m e(my), \]

where

\[ \tilde c_m = \begin{cases} (\log m)\,e(am/q) - c_m & \text{if } m \text{ is prime}, \\ -c_m & \text{otherwise}. \end{cases} \]

By summation by parts, this equals

\[ e(ny)\Bigl(\sum_{p\le n}(\log p)\,e\Bigl(\frac{ap}{q}\Bigr) - \sum_{m=1}^{n} c_m\Bigr) - 2\pi i y\int_1^n e(ty)\sum_{m\le t}\tilde c_m\,dt. \tag{3.12} \]

In view of this, we try to make the partial sums of $\tilde c_m$ small, and (3.11) suggests taking

\[ c_m = \frac{\mu(q)}{\varphi(q)}d_m, \qquad \text{where } \sum_{m\le X} d_m \approx X. \]

The simplest example for such a sequence $d_m$ is $d_m = 1$ for every $m$. Remarkably enough, this simple approximation is good enough for our purposes. Furthermore, this simple form allows for a much easier evaluation of the singular series and the singular integral, compared to Waring's problem. Define

\[ v(y) = \sum_{m=1}^{n} e(my). \]
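The approximation $f(a/q) \approx \mu(q)n/\varphi(q)$ from (3.11) can be observed numerically. The sketch below uses hypothetical helper names, computes $\mu(q)/\varphi(q)$ as the normalised Ramanujan sum $c_q(a)/\varphi(q)$ (the identity used in the text), and takes $n = 10^5$ as an arbitrary illustrative value; it says nothing about the true size of the error term.

```python
import cmath
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def f_at_rational(a, q, n):
    """f(a/q) = sum over primes p <= n of log(p) e(a p / q)."""
    return sum(math.log(p) * e(a * p / q) for p in primes_up_to(n))


def predicted_main_term(a, q, n):
    """n * c_q(a) / phi(q), which equals n * mu(q) / phi(q) when (a, q) = 1."""
    units = [r for r in range(1, q + 1) if math.gcd(r, q) == 1]
    return n * sum(e(a * r / q) for r in units) / len(units)


n = 100_000
for a, q in [(1, 3), (1, 4), (2, 5)]:
    print(a, q, f_at_rational(a, q, n) / n, predicted_main_term(a, q, n) / n)
```

For $q = 4$ the predicted main term vanishes because $\mu(4) = 0$, so $f(1/4)/n$ should be small.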

Lemma 3.3.2. Suppose $A > 0$. There exists a constant $c > 0$ such that whenever $1 \le a \le q \le (\log n)^A$, $(a, q) = 1$, and $|y| \le n^{-1}(\log n)^A$,

\[ f\Bigl(\frac aq + y\Bigr) = \frac{\mu(q)}{\varphi(q)}v(y) + O_A\bigl(n\exp(-c\sqrt{\log n})\bigr). \]

Proof. By equations (3.11) and (3.12),

\[ f\Bigl(\frac aq + y\Bigr) - \frac{\mu(q)}{\varphi(q)}v(y) \ll_A n\exp(-c\sqrt{\log n}) + |y|\int_1^n\Bigl|\sum_{m\le t}\tilde c_m\Bigr|\,dt, \]

where

\[ \tilde c_m = \begin{cases} (\log m)\,e(am/q) - \mu(q)/\varphi(q) & \text{if } m \text{ is prime}, \\ -\mu(q)/\varphi(q) & \text{otherwise}. \end{cases} \]

If we can establish that the partial sums

\[ F(t) = \sum_{m\le t}\tilde c_m \]

of $\tilde c_m$ are $O_A\bigl(n\exp(-c\sqrt{\log n})\bigr)$, then the lemma follows, since $n|y| \le (\log n)^A$, which can be absorbed in the error term by reducing the constant $c$ in the exponent to any $c' < c$. For the partial sum up to $n$, $F(n)$, this was already established by equation (3.11). One might proceed in the same way to estimate $F(t)$, but one has to be careful, since we used the Siegel-Walfisz theorem, which only holds for $q \le (\log t)^A$. However, if for example $t > \sqrt n$, then $(\log t)^A > 2^{-A}(\log n)^A > (\log n)^{A/2}$ if $n > e^4$, and one can use the same argument. Hence for $\sqrt n < t \le n$, we have

\[ \sum_{m\le t}\tilde c_m = \frac{\mu(q)}{\varphi(q)}t + O_A\bigl(t\exp(-c\sqrt{\log t})\bigr) - \frac{\mu(q)}{\varphi(q)}[t] \ll_A n\exp(-c\sqrt{\log n}). \]

For $1 \le t \le \sqrt n$, we can use Chebyshev's estimate $\vartheta(t) \ll t$ to get

\[ \sum_{m\le t}\tilde c_m \ll t + \frac{1}{\varphi(q)}[t] \ll_A n\exp(-c'\sqrt{\log n}). \]

Lemma 3.3.2 suggests the following definition of the major arcs. Fix a positive number $A$ and set

\[ P = (\log n)^A, \qquad M(q,a) = \Bigl[\frac aq - \frac Pn, \frac aq + \frac Pn\Bigr], \qquad M = \bigcup_{\substack{1\le a\le q\le P \\ (q,a)=1}} M(q,a). \tag{3.13} \]

Define $m = \,]P/n, 1 + P/n] \setminus M$, and

\[ R_M(n) = \int_M f(x)^3 e(-nx)\,dx, \qquad R_m(n) = \int_m f(x)^3 e(-nx)\,dx. \]

Then $R(n) = R_M(n) + R_m(n)$. Again we will first deduce an asymptotic formula for $R_M(n)$. The approximation of $f(a/q + y)^3$ by $\bigl(\mu(q)/\varphi(q)\,v(y)\bigr)^3$ on the major arcs will lead to a singular series and integral given by

\[ S(n) = \sum_{q=1}^{+\infty}\sum_{\substack{a=1 \\ (a,q)=1}}^{q}\frac{\mu(q)}{\varphi(q)^3}e\Bigl(-\frac{an}{q}\Bigr), \qquad J(n) = \int_{-1/2}^{1/2} v(y)^3 e(-ny)\,dy. \]

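Lemma 3.3.3. The series

\[ \sum_{n}\frac{1}{\varphi(n)^2} \]

converges, and

\[ \sum_{n\ge x}\frac{1}{\varphi(n)^2} \ll \frac1x. \]

Proof. The idea is that $\varphi(n)$ is close to $n$. Making this precise, a straightforward calculation shows that

\[ \frac{1}{\varphi^2} = \frac{1}{\mathrm{id}^2} * g, \]

where $g$ is the multiplicative function defined by

\[ g(1) = 1, \qquad g(p) = \frac{2p-1}{p^2(p-1)^2}, \qquad g(p^l) = 0 \text{ for } l > 1. \]

Using the convolution method, we get that

\[ \sum_{n\ge x}\frac{1}{\varphi(n)^2} = \sum_{n\ge x}\sum_{a\mid n} g(a)\frac{a^2}{n^2} = \sum_{a=1}^{+\infty} g(a)\sum_{m\ge x/a}\frac{1}{m^2} \ll \sum_{a=1}^{+\infty} g(a)\frac ax = \frac1x\sum_{a=1}^{+\infty} g(a)a \ll \frac1x, \]

since $g(p) \le 8/p^3$ and

\[ g(a) \le \frac{8^{\omega(a)}\mu^2(a)}{a^3} = \frac{d(a)^3\mu^2(a)}{a^3} \ll_\varepsilon \frac{1}{a^{3-\varepsilon}}. \]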

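Denote

\[ S(n, Q) = \sum_{q\le Q}\sum_{\substack{a=1 \\ (a,q)=1}}^{q}\frac{\mu(q)}{\varphi(q)^3}e\Bigl(-\frac{an}{q}\Bigr). \]

Lemma 3.3.4. $S(n)$ converges absolutely, $S(n, Q) = S(n) + O(1/Q)$, and

\[ S(n) = \prod_{p\nmid n}\Bigl(1 + \frac{1}{(p-1)^3}\Bigr)\prod_{p\mid n}\Bigl(1 - \frac{1}{(p-1)^2}\Bigr). \]

Furthermore, $S(n) = 0$ if $n$ is even, and $S(n) \asymp 1$ if $n$ is odd.

Proof. For the $q$th term in $S(n)$ we have

\[ \Bigl|\sum_{\substack{a=1 \\ (a,q)=1}}^{q}\frac{\mu(q)}{\varphi(q)^3}e\Bigl(-\frac{an}{q}\Bigr)\Bigr| = \frac{|\mu(q)c_q(n)|}{\varphi(q)^3} \le \frac{1}{\varphi(q)^2}. \]

By Lemma 3.3.3, $S(n)$ converges absolutely, $S(n) \ll 1$ and the tail is $\ll 1/Q$. By the multiplicativity of the general term $\mu(q)c_q(n)/\varphi(q)^3$, we can write $S(n)$ as an Euler product. By Theorem 2.2.8, we see that

\[ S(n) = \prod_p T(p), \]

where

\[ T(p) = \sum_{l=0}^{+\infty}\frac{\mu(p^l)\,\mu\bigl(p^l/(p^l,n)\bigr)}{\varphi(p^l)^2\,\varphi\bigl(p^l/(p^l,n)\bigr)} = \begin{cases} 1 + \dfrac{1}{(p-1)^3} & \text{if } p \nmid n, \\[2mm] 1 - \dfrac{1}{(p-1)^2} & \text{if } p \mid n. \end{cases} \]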

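If $n$ is even, then $2 \mid n$ and $1 - 1/(2-1)^2 = 0$. If $n$ is odd, then

\[ S(n) \ge \prod_{p\mid n}\Bigl(1 - \frac{1}{(p-1)^2}\Bigr) \ge \prod_{p>2}\Bigl(1 - \frac{1}{(p-1)^2}\Bigr) \ge \prod_{p}\Bigl(1 - \frac{1}{p^2}\Bigr) = \frac{1}{\zeta(2)} = \frac{6}{\pi^2}. \]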
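The Euler product of Lemma 3.3.4 is simple to evaluate approximately. The sketch below truncates the product at a hypothetical cutoff `prime_limit` (an assumption made for illustration; the true series is the infinite product) and checks the two qualitative statements: vanishing for even $n$ and the lower bound $6/\pi^2$ for odd $n$.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


def singular_series(n, prime_limit=10_000):
    """Euler product for S(n), truncated to primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        if n % p == 0:
            product *= 1 - 1 / (p - 1) ** 2  # local factor for p | n
        else:
            product *= 1 + 1 / (p - 1) ** 3  # local factor for p not dividing n
    return product


for n in (100, 101, 105, 1001):
    print(n, singular_series(n))
```

For odd $n$ the factor at $p = 2$ equals 2, so the computed values lie well above the bound $6/\pi^2$ used in the proof.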

Denote N = [−1/2, 1/2] \ [−P/n, P/n].

Lemma 3.3.5.

\[ J(n) = \frac12(n-1)(n-2), \]

and

\[ \int_N v(y)^3 e(-ny)\,dy \ll \frac{n^2}{P^2}. \]

Proof.

\[ J(n) = \int_{-1/2}^{1/2} v(y)^3 e(-ny)\,dy \]

is the number of ways in which one can write $n$ as $m_1 + m_2 + m_3$ with $1 \le m_i \le n$, which equals $(n-1)(n-2)/2$. By Proposition 2.1.1, $v(y) \ll 1/|y|$ for $y \in [-1/2, 1/2]$, so

\[ \int_N v(y)^3 e(-ny)\,dy \ll \int_{P/n}^{+\infty}\frac{1}{y^3}\,dy \ll \frac{n^2}{P^2}. \]
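The combinatorial evaluation of $J(n)$ in Lemma 3.3.5 can be confirmed by brute force. This is a small sketch; `count_triples` is a hypothetical helper name.

```python
def count_triples(n):
    """Number of ordered triples (m1, m2, m3) with 1 <= mi <= n and m1 + m2 + m3 = n."""
    return sum(
        1
        for m1 in range(1, n + 1)
        for m2 in range(1, n + 1)
        if 1 <= n - m1 - m2 <= n
    )


for n in (3, 10, 50):
    print(n, count_triples(n), (n - 1) * (n - 2) // 2)
```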

Using the previous lemmas, we can determine the contribution from the major arcs.

Theorem 3.3.6. With the definition of the major arcs as in (3.13), we have that

\[ R_M(n) = \frac12 S(n)n^2 + O_A\Bigl(\frac{n^2}{(\log n)^A}\Bigr). \]

Proof. Suppose $x \in M(q, a)$. By Lemma 3.3.2,

\[ f(x)^3 - \frac{\mu(q)}{\varphi(q)^3}v(x - a/q)^3 = \Bigl(f(x) - \frac{\mu(q)}{\varphi(q)}v(x - a/q)\Bigr)\Bigl(f(x)^2 + f(x)\frac{\mu(q)}{\varphi(q)}v(x - a/q) + \frac{\mu(q)^2}{\varphi(q)^2}v(x - a/q)^2\Bigr) \ll_A n^3\exp(-c\sqrt{\log n}), \]

where we used the trivial estimates $f, v \ll n$ for the second factor. By integrating over $M$, we get

\[ \int_M f(x)^3 e(-nx)\,dx = \sum_{q\le P}\sum_{\substack{a=1 \\ (a,q)=1}}^{q}\int_{M(q,a)}\frac{\mu(q)}{\varphi(q)^3}v(x - a/q)^3 e(-nx)\,dx + O_A\Bigl(Pn^2\exp(-c\sqrt{\log n})\sum_{q\le P}\varphi(q)\Bigr) \]

\[ = S(n, P)\int_{-P/n}^{P/n} v(y)^3 e(-ny)\,dy + O_A\bigl(P^3n^2\exp(-c\sqrt{\log n})\bigr). \]

The error term is

\[ \ll_A \frac{n^2}{(\log n)^A}, \]

while by Lemmas 3.3.4 and 3.3.5, the main term equals

\[ S(n)J(n) + O\Bigl(\frac{S(n)n^2}{P^2} + \frac{J(n)}{P} + \frac{n^2}{P^3}\Bigr) = \frac12 S(n)n^2 + O\Bigl(\frac{n^2}{(\log n)^A}\Bigr). \]

3.3.2 The contribution from the minor arcs

The main obstacle in the ternary Goldbach problem before the work of Vinogradov was to find estimates for the generating function $f(x)$ on the minor arcs. By assuming the generalised Riemann hypothesis, Hardy and Littlewood bypassed this problem, since then one can take major arcs which fill up the whole unit circle $\mathbb{R}/\mathbb{Z}$. Vinogradov was able to overcome this obstacle and to deduce satisfactory estimates on the minor arcs. His argument is rather complicated, but we will present a nice and elegant proof based on Vaughan's identity for the von Mangoldt function $\Lambda$. First we need a lemma for estimating a certain type of bilinear exponential sums.

Lemma 3.3.7. Suppose that $X, Y, Z$ are real numbers with $Y, Z \le X$, and that $\alpha$ is a real number for which there exist $a$ and $q$ with $(a, q) = 1$ and $|\alpha - a/q| < 1/q^2$. Then for any complex numbers $a_m$ and $b_n$ with $|a_m|, |b_n| \le 1$, we have

\[ \sum_{m>Y}\sum_{\substack{n>Z \\ mn\le X}} a_m b_n e(\alpha mn) \ll \Bigl(\frac XY + \frac XZ + \frac Xq + q\Bigr)^{1/2}\sqrt{X}\,(\log 2qX)^2. \]

Proof. We may assume that YZ ≤ X. We split the sum over m in the O(log X) sums

X X X , ,..., , Y

⁶ The following estimates also hold for the last sum where the range for $m$ is $2^iY < m \le X/Z$ instead of $2^iY < m \le 2^{i+1}Y$.

Cauchy-Schwarz inequality that

2  2 X X X X ambne(αmn) ≤ bne(αmn)

2j Y Z 2j Y

The contribution from the diagonal terms n1 = n2 is

X X X2 ≤ 2jY  2jYX ≤ . m Z 2j Y

j XX X  2 Y e(αm(n1 − n2)) j j j+1 1≤n1Z nm≤X and the lemma follows.

To obtain some cancellation in the estimation of f(x), we will use a famous formula for the von Mangoldt function which is due to Vaughan. We remind the reader of the elementary identities

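\[ \Lambda = \mu * \log, \qquad \log = 1 * \Lambda, \qquad \mu * 1 = \epsilon, \]

where $\epsilon$ denotes the identity for Dirichlet convolution ($\epsilon(1) = 1$ and $\epsilon(n) = 0$ for $n > 1$).

Lemma 3.3.8 (Vaughan's identity). Suppose $Y, Z \ge 1$. For any $m > Z$ we have

\[ \Lambda(m) = \sum_{\substack{b\le Y \\ b\mid m}}\mu(b)\log\frac mb - \sum_{b\le Y}\sum_{\substack{c\le Z \\ bc\mid m}}\mu(b)\Lambda(c) + \sum_{b>Y}\sum_{\substack{c>Z \\ bc\mid m}}\mu(b)\Lambda(c). \]

Proof. By the first elementary identity,

\[ \Lambda(m) = \sum_{\substack{b\le Y \\ b\mid m}}\mu(b)\log\frac mb + \sum_{\substack{b>Y \\ b\mid m}}\mu(b)\log\frac mb. \]

Since $\log(m/b) = (\Lambda * 1)(m/b)$, the second term equals

\[ \mathop{\sum\sum}_{\substack{b>Y \\ bc\mid m}}\mu(b)\Lambda(c) = \sum_{b>Y}\sum_{\substack{c>Z \\ bc\mid m}}\mu(b)\Lambda(c) + \sum_{b>Y}\sum_{\substack{c\le Z \\ bc\mid m}}\mu(b)\Lambda(c). \]

Here, the second term equals

\[ \mathop{\sum\sum}_{\substack{c\le Z \\ bc\mid m}}\mu(b)\Lambda(c) - \sum_{b\le Y}\sum_{\substack{c\le Z \\ bc\mid m}}\mu(b)\Lambda(c), \]

and since in the first term $c \le Z < m$, the sum over $b$ vanishes ($(\mu * 1)(m/c) = \epsilon(m/c) = 0$), and the lemma follows.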
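Vaughan's identity is a finite identity, so it can be verified exactly (up to floating-point rounding) for small $m$. The following sketch uses hypothetical helper names and naive trial division; the parameter values $Y = 3$, $Z = 4$ are arbitrary.

```python
import math


def factorize(n):
    """Prime factorisation of n as a dict {prime: exponent}, by trial division."""
    factors = {}
    d = 2
    while d * d <= n:
        while n % d == 0:
            factors[d] = factors.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def mobius(n):
    factors = factorize(n)
    if any(exp > 1 for exp in factors.values()):
        return 0
    return -1 if len(factors) % 2 else 1


def mangoldt(n):
    """Lambda(n) = log p if n is a power of the prime p, and 0 otherwise."""
    factors = factorize(n)
    if len(factors) == 1:
        (p,) = factors
        return math.log(p)
    return 0.0


def vaughan_rhs(m, Y, Z):
    """Right-hand side of Vaughan's identity; bc | m forces b | m and c | m."""
    divisors = [d for d in range(1, m + 1) if m % d == 0]
    first = sum(mobius(b) * math.log(m / b) for b in divisors if b <= Y)
    second = sum(
        mobius(b) * mangoldt(c)
        for b in divisors if b <= Y
        for c in divisors if c <= Z and m % (b * c) == 0
    )
    third = sum(
        mobius(b) * mangoldt(c)
        for b in divisors if b > Y
        for c in divisors if c > Z and m % (b * c) == 0
    )
    return first - second + third


Y, Z = 3, 4
worst = max(abs(mangoldt(m) - vaughan_rhs(m, Y, Z)) for m in range(Z + 1, 300))
print(worst)
```

The identity is only claimed for $m > Z$, which is why the loop starts at $Z + 1$.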

Theorem 3.3.9. Suppose α is a real number for which there exist a and q with (a, q) = 1 and α − a/q < 1/q2. Then X Λ(m)e(αm)  n4/5 + q−1/2n + q1/2n1/2 + q(log 2qn)3. m≤n Proof. By Vaughan’s identity (Lemma 3.3.8), the sum equals (with Y, Z < n) X XX Λ(m)e(αm) + µ(b)(log a)e(αab) m≤Z b≤Y ZY,c>Z Z

For S2, write d = bc and rearrange the indices to obtain ! X X X X X X |S2| = Λ(c)µ(d/c) e(αad) ≤ Λ(c) e(αad) .

d≤YZ c|d Z

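For $S_3$, set $d = ac$ and rearrange to obtain

\[ S_3 = \sum_{b>Y}\sum_{\substack{d>Z \\ bd\le n}}\Bigl(\sum_{\substack{c\mid d \\ c>Z}}\Lambda(c)\Bigr)\mu(b)\,e(\alpha bd). \]

Again the sum over $c$ is $\le \log d \le \log n$. By Lemma 3.3.7, we then have

\[ S_3 \ll (\log n)\Bigl(\frac nY + \frac nZ + \frac nq + q\Bigr)^{1/2}\sqrt{n}\,(\log 2qn)^2 \ll \Bigl(\frac{n}{Y^{1/2}} + \frac{n}{Z^{1/2}} + \frac{n}{q^{1/2}} + (nq)^{1/2}\Bigr)(\log 2qn)^3. \]

We minimise the sum $Z + S_1 + S_2 + S_3$ by choosing $Y = Z = n^{2/5}$, and we get

\[ f(\alpha) \ll \Bigl(n^{4/5} + \frac{n}{q^{1/2}} + (nq)^{1/2} + q\Bigr)(\log 2qn)^3. \]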

Theorem 3.3.10. With the definition of the major and minor arcs as in (3.13), we have

\[ R_m(n) \ll n^2(\log n)^{4-A/2}. \]

Proof. Estimating the integral via the triangle inequality, we have

\[ |R_m(n)| \le \int_m |f(x)|^3\,dx \le \sup_{\alpha\in m}|f(\alpha)|\int_0^1 |f(x)|^2\,dx. \]

By Parseval's theorem and integration by parts, we have for the integral:

\[ \int_0^1 |f(x)|^2\,dx = \sum_{p\le n}(\log p)^2 = \int_1^n (\log x)^2\,d\pi(x) = (\log n)^2\pi(n) - 2\int_1^n \pi(x)\frac{\log x}{x}\,dx \ll n\log n, \]

where we used Chebyshev's estimate $\pi(x) \ll x/\log x$. Now suppose $\alpha \in m$. By Dirichlet's theorem on Diophantine approximation (Theorem C.1.1), there exist $a$ and $q$ with $q \le n/P$, $(a, q) = 1$ and $|\alpha - a/q| \le P/(nq)$. Since $\alpha \in m$, this implies that $q > P$. Since $|\psi(n) - \vartheta(n)| \le \sqrt{n}\log n$, we have by Theorem 3.3.9 that

\[ f(\alpha) \ll \sqrt{n}\log n + \bigl(n^{4/5} + n(\log n)^{-A/2} + n(\log n)^{-A/2} + n(\log n)^{-A}\bigr)(\log n)^3 \ll n(\log n)^{3-A/2}. \]

At last we can prove Vinogradov’s theorem.

Theorem 3.3.11. For any number $B > 0$, we have

\[ R(n) = \frac12 S(n)n^2 + O_B\Bigl(\frac{n^2}{(\log n)^B}\Bigr). \]

Proof. Choose the value of $A$ in the definition of the major and minor arcs (3.13) to be $A = 2B + 8$. Then by Theorem 3.3.6 and Theorem 3.3.10, we have

\[ R(n) = R_M(n) + R_m(n) = \frac12 S(n)n^2 + O_B\Bigl(\frac{n^2}{(\log n)^A}\Bigr) + O\Bigl(\frac{n^2}{(\log n)^B}\Bigr) = \frac12 S(n)n^2 + O_B\Bigl(\frac{n^2}{(\log n)^B}\Bigr). \]

Combining Theorem 3.3.11 with Lemma 3.3.4 we obtain the following:

Corollary 3.3.12. Every sufficiently large odd integer is the sum of three primes.
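For small odd $n$ the weighted count $R(n)$ can be computed directly and compared with the main term $\frac12 S(n)n^2$ of Theorem 3.3.11. The sketch below is illustrative only: the helper names are hypothetical, the singular series is truncated at an arbitrary prime cutoff, and the asymptotic formula makes no quantitative promise at such small $n$, so the printed ratios should be read as an experiment rather than a verification.

```python
import math


def primes_up_to(limit):
    """Sieve of Eratosthenes; returns all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(limit) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


def weighted_count(n):
    """R(n) = sum over ordered prime triples p1 + p2 + p3 = n of log p1 log p2 log p3."""
    primes = primes_up_to(n)
    log_of = {p: math.log(p) for p in primes}
    total = 0.0
    for p1 in primes:
        for p2 in primes:
            p3 = n - p1 - p2
            if p3 < 2:
                break  # primes are increasing, so larger p2 only decreases p3
            if p3 in log_of:
                total += log_of[p1] * log_of[p2] * log_of[p3]
    return total


def singular_series(n, prime_limit=10_000):
    """Euler product of Lemma 3.3.4, truncated to primes p <= prime_limit."""
    product = 1.0
    for p in primes_up_to(prime_limit):
        product *= 1 - 1 / (p - 1) ** 2 if n % p == 0 else 1 + 1 / (p - 1) ** 3
    return product


for n in (101, 1001, 3001):
    r = weighted_count(n)
    main_term = 0.5 * singular_series(n) * n * n
    print(n, r, main_term, r / main_term)
```

A positive value of `weighted_count(n)` certifies that $n$ is a sum of three primes, which is the content of Corollary 3.3.12 for the values tested.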

Remark 3.3.13.

• In the proof that is presented here, one cannot make explicit what "sufficiently large" means, since we used the Siegel-Walfisz theorem (Theorem 3.3.1), which is ineffective: the implicit constant in that theorem is not effectively computable. We now know by the work of Helfgott [16] that "sufficiently large" means larger than 5.

• It is not possible to obtain an asymptotic formula for the binary Goldbach problem by adapting the above proof. Indeed, the contribution of the major arcs would be of order $n$. However, estimating the minor arc contribution by the $L^2$- or the $L^\infty$-bound for $f(\alpha)$ on the minor arcs derived in Theorem 3.3.10 will yield an upper bound which dominates $n$ by at least a logarithmic factor. The shortcomings of the circle method for binary problems are nicely described in a blog post of Terence Tao on the heuristic limitations of the circle method [35].

• One can however use the circle method to prove the binary Goldbach conjecture for "almost all even $n$": for every $A > 0$, the number of even integers below $n$ which are not the sum of two primes is bounded by $O_A\bigl(n(\log n)^{-A}\bigr)$. See for example [37, page 36].

Chapter 4

The Vinogradov-Korobov zero-free region for ζ

In this brief chapter we will use Vinogradov’s method for estimating exponential sums (see Subsection 2.4.2) to derive the Vinogradov-Korobov zero-free region of the Riemann zeta function ζ. This zero-free region was independently derived by Vinogradov [42] and Korobov [26] in 1958, and it remains (up to improvements of the involved constants) the (asymptotically) best zero-free region to this day.

We start with a simple identity for ζ, which we derive from its definition as Dirichlet series. Write s = σ + it. For σ > 1 and x > 1, we have

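\[ \zeta(s) = \sum_{n=1}^{+\infty}\frac{1}{n^s} = \sum_{n\le x}\frac{1}{n^s} + \sum_{n>x}\frac{1}{n^s} = \sum_{n\le x}\frac{1}{n^s} + \int_x^{+\infty}\frac{du}{u^s} + \frac{\{x\}}{x^s} - s\int_x^{+\infty}\frac{\{u\}}{u^{s+1}}\,du \]

\[ = \sum_{n\le x}\frac{1}{n^s} + \frac{x^{1-s}}{s-1} + \frac{\{x\}}{x^s} - s\int_x^{+\infty}\frac{\{u\}}{u^{s+1}}\,du, \tag{4.1} \]

where we used Euler summation. The last expression defines an analytic function in the punctured half plane $\{s \in \mathbb{C} \mid \sigma > 0\} \setminus \{1\}$, and by the uniqueness property of analytic continuation, the equality for $\zeta$ remains valid in that punctured half plane. Note that

\[ \sum_{n\le x}\frac{1}{n^s} = \sum_{n\le x}\frac{e\bigl(-\frac{t}{2\pi}\log n\bigr)}{n^\sigma}. \]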
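Identity (4.1) can be sanity-checked for real $s > 1$, where the values $\zeta(2) = \pi^2/6$ and $\zeta(4) = \pi^4/90$ are known in closed form. The sketch below is restricted to real $s$ and truncates the integral at a hypothetical `cutoff`, both of which are assumptions made for illustration; on each interval $[m, m+1]$ the integrand $\{u\}u^{-s-1}$ is integrated exactly.

```python
import math


def zeta_via_4_1(s, x, cutoff=100_000):
    """Right-hand side of (4.1) for real s > 1, with the integral cut off at `cutoff`."""
    head = sum(n ** -s for n in range(1, math.floor(x) + 1))
    fractional_part = x - math.floor(x)

    def antiderivative(u, m):
        # Antiderivative of (u - m) * u^{-s-1}; valid on [m, m + 1], where {u} = u - m.
        return u ** (1 - s) / (1 - s) + m * u ** -s / s

    integral = 0.0
    lower = x
    for m in range(math.floor(x), cutoff):
        integral += antiderivative(m + 1, m) - antiderivative(lower, m)
        lower = m + 1
    return head + x ** (1 - s) / (s - 1) + fractional_part / x**s - s * integral


print(zeta_via_4_1(2.0, 10.5), math.pi**2 / 6)
print(zeta_via_4_1(4.0, 3.25), math.pi**4 / 90)
```

The result should not depend on the choice of $x > 1$, which is a useful consistency check on the signs of the individual terms.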

The weights $n^{-\sigma}$ will be removed via summation by parts, so the first objective is to obtain good upper bounds for the exponential sum

\[ S_f(N) = \sum_{n=1}^{N} e(f(n)), \qquad \text{where } f(y) = -\frac{t}{2\pi}\log y. \tag{4.2} \]

The function $f$ is not a polynomial, but we can still use Vinogradov's method from Subsection 2.4.2 for estimating $S_f$ by approximating $f$ by its Taylor polynomial of degree $k$. Also, for reasons which will become clear in a minute, we will consider sums with summation range $M < n \le N$ for some $M < N$ instead of $1 \le n \le N$. We implement

these two changes in the first step of the method, namely the shifting of the summation. For a positive integer h, we shift by h, and then we approximate f(n + h) by its Taylor polynomial of degree k in h around the point n: X X e(f(n)) = e(f(n + h)) M

k (k+1) X (n) f (ξh,n) f(n + h) = α hj + hk+1 and j (k + 1)! j=0  (k+1)  (k+1) f (ξh,n) k+1 f k+1 e h − 1 ≤ 2π h . (k + 1)! (k + 1)!

Lemma 4.1. There is a positive constant $c_1$ such that for $t \ge N \ge 2$ and with the notation from (4.2)

\[ S_f(N) \ll N\exp\Bigl(-c_1\frac{(\log N)^3}{(\log t)^2}\Bigr), \]

and the implicit constant is absolute.

Proof. We have that

\[ f^{(j)}(n) = \frac{(-1)^j(j-1)!\,t}{2\pi n^j}, \qquad \alpha_j^{(n)} = \frac{(-1)^j t}{2\pi j n^j}. \]

In view of Lemma 2.4.7, we want to have good upper and lower bounds for the $\alpha_j^{(n)}$. A lower bound follows from the fact that $n \le N$. In order to get a good upper bound, we restrict the summation to $M < n \le N$ for some $M$, and estimate the rest via the triangle inequality. Choose for example $M = N^{7/8}$. Then

\[ \frac{t}{2\pi jN^j} \le \bigl|\alpha_j^{(n)}\bigr| < \frac{t}{2\pi jN^{7j/8}} \tag{4.4} \]

and

\[ |S_f(N)| \le \Bigl|\sum_{N^{7/8}<n\le N} e(f(n))\Bigr| + N^{7/8}. \]

The error $N^{7/8}$ is admissible, since (for $c_1 \le 1/8$) it is $\le N\exp\bigl(-c_1(\log N)^3(\log t)^{-2}\bigr)$. Next, we choose $H = N^{3/8}$. Then $h \le N^{3/4}$, and by (4.3), we have

 k  X X X (n) j 3/4 1−(k+1)/8 e(f(n)) = e αj h + O N + tN . N 7/8

Again, the error is admissible, provided we choose for example k + 1 ≥ 9(log t)/(log N). The remaining sum is treated via the methods of Subsection 2.4.2. By equation (2.13) and by the explicit form of Vinogradov’s mean value theorem (2.22), we have

k 1   2 X 2 3η /4 Y (n) 3j/8 2l e(f(n)) ≤ N exp(O(l ))N k,l D α , lN , j N 7/8

k   Y (n) 3j/8 Y (n) 1 3/8 k D αj , lN ≤ αj + 8k log(lN ) (n) 2 3j/4 j=1 j∈J αj l N  j/4  Y t N k ≤ exp(O(l2)) + log(N 3/8) , N 7j/8 t j∈J where we used the bounds (4.4) and where we absorbed the constants in exp(O(l2)). We now want to restrict ourselves to values of j which are sufficiently large, in order to cancel the t in the numerator in the first term. We would also like to save an amount N −δ for some positive δ. This can be done by choosing for example

 4 log t 8 log t  J = j ∈ | ≤ j ≤ . N 3 log N 3 log N

Observe that $J \ne \emptyset$, and since we will choose $k + 1 \ge 9(\log t)/(\log N)$ (as stated before), $J \subseteq \{1, \dots, k\}$. For $j \in J$ we have

j t N j/4 t N −1/8N 3/8 + = + N 7j/8 t N 1/8N 3/4j t 1 1 ≤ + , N j/8 N j/8 and X j 1 14 log t  8 log t 8 log t  4 log t   − log N = − log N + − + 1 8 8 2 3 log N 3 log N 3 log N 3 log N j∈J 1  log t 2 ≤ − (log N)2 16 log N 1 (log t)2 = − . 8 log N Indeed, one can show that for α ≥ 1 we have say

d4α/3e + [8α/3][8α/3] − d4α/3e + 1 ≥ 2α2 since the left hand side is ∼ 48α2/9 and by checking some small values for α numerically. Collecting the obtained estimates so far, we have

1   2   2 X 2 3 1 (log t) 3/8 k 2l e(f(n)) ≤ N exp(O(l )) exp ηk,l log N − log(N ) . 4 8 log N N 7/8

We will choose  log t  k = 9 and l = ak2 log N

2 −l/k2 for some positive integer a. Since ηk,l ≤ k e , this yields the following for the argument of the exponential:

3 1 (log t)2 243 1(log t)2 η log N − ≤ e−a − . 4 k,l 8 log N 4 8 log N

We choose a such that 243e−a/4 ≤ 1/16, so for example a = 7. Finally using the fact that  log t   log N 3/8k = exp 9 log(3/8) log N log N  1 (log t)2  ≤ C exp 32 log N say, for some absolute constant C, we conclude that

 2   4−1 X 1 (log t) log t e(f(n)) ≤ O(1)N exp − 98 9 32 log N log N N 7/8

Remark 4.2. It is highly likely that a more clever choice of the many parameters in the above proof will result in sharper values of both the implicit constant and the constant c1. We are however not that concerned with optimality; the main form of the estimate is the most important thing for us.

The bound from Lemma 4.1 implies a strong bound for the Riemann zeta function, but first we need one final lemma, taken from [9, Lemma 2.11].

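Lemma 4.3. For $t \ge 2$ and $1/2 \le \sigma \le 1$ we have

\[ \zeta(s) = \sum_{n\le t}\frac{1}{n^s} + O\bigl(t^{1-2\sigma}\log t\bigr). \]

Proof. In equation (4.1), we put $x = t^2$ to obtain

\[ \zeta(s) = \sum_{n\le t^2}\frac{1}{n^s} + \frac{t^{2-2s}}{s-1} + \frac{\{t^2\}}{t^{2s}} - s\int_{t^2}^{+\infty}\frac{\{u\}}{u^{s+1}}\,du = \sum_{n\le t}\frac{1}{n^s} + O\Bigl(t^{1-2\sigma} + \Bigl|\sum_{t<n\le t^2}\frac{1}{n^s}\Bigr|\Bigr). \]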

We decompose the sum in the error term in O(log t) sums of the form X X X , ,..., , t

X 1 X  t  S (y, u) = = e − log n . f nit 2π y

Since

t 1 j j+1 − ≥ for 2 t < u ≤ 2 t, 2πu 2j+2π j j+3 we have that Sf (2 t, u) ≤ 2 in that range by the (corollary of) the Kusmin-Landau theorem (Corollary 2.1.3). By summation by parts,

j j+1 Z 2j+1t j X 1 Sf (2 t, 2 t) Sf (2 t, u) s = j+1 σ + σ σ+1 du n (2 t) 2j t u 2j t

Mutatis mutandis, we also have

X 1  t1−2σ, ns 2it

Theorem 4.4. There exists a positive constant $c_2$ such that for $|t| \ge 2$ and $1/2 \le \sigma \le 1$

\[ \zeta(s) \ll |t|^{c_2(1-\sigma)^{3/2}}(\log|t|)^{2/3}, \]

and the implicit constant is absolute.

Proof. Since $\zeta(\bar s) = \overline{\zeta(s)}$, we may assume that $t > 0$. By the previous lemma (Lemma 4.3) and summation by parts we obtain

\[ |\zeta(s)| \le |S_f(t)|\,t^{-\sigma} + \sigma\int_1^t x^{-\sigma-1}|S_f(x)|\,dx + O\bigl(t^{1-2\sigma}\log t\bigr). \]

Using Lemma 4.1,

\[ \zeta(s) \ll t^{1-\sigma-c_1} + \int_1^t x^{-\sigma}\exp\Bigl(-c_1\frac{(\log x)^3}{(\log t)^2}\Bigr)dx + O\bigl(t^{1-2\sigma}\log t\bigr). \]

Using the substitution $u = \log x/\log t$ we see that the integral equals

\[ \log t\int_0^1 t^{(1-\sigma)u - c_1u^3}\,du. \]

We now would like to complete the cube in the exponent, but $(1-\sigma)u$ has the wrong sign for this. However, since

\[ \Bigl(\sqrt[4]{3c_1(1-\sigma)}\,u - \frac{(1-\sigma)^{3/4}}{\sqrt[4]{3c_1}}\Bigr)^2 \ge 0, \]

we have

\[ (1-\sigma)u - c_1u^3 \le (1-\sigma)u + \sqrt{3c_1}(1-\sigma)^{1/2}u^2 - 2(1-\sigma)u + \frac{(1-\sigma)^{3/2}}{\sqrt{3c_1}} - c_1u^3 \]

\[ = -c_1\Bigl(u^3 - 3\Bigl(\frac{1-\sigma}{3c_1}\Bigr)^{1/2}u^2 + 3\,\frac{1-\sigma}{3c_1}\,u - \Bigl(\frac{1-\sigma}{3c_1}\Bigr)^{3/2}\Bigr) + \frac{2(1-\sigma)^{3/2}}{3\sqrt{3c_1}} = -c_1(u-v)^3 + \frac23(1-\sigma)v, \]

where $v = \sqrt{(1-\sigma)/(3c_1)}$. We split the integral into an integral from $0$ to $v$ and one from $v$ to $1$, and in the second integral we complete the cube as shown above. We get that the first integral is bounded by

\[ \int_0^v t^{(1-\sigma)v}\exp\bigl(-c_1(\log t)u^3\bigr)\,du \ll t^{(1-\sigma)v}(\log t)^{-1/3} \]

and the second by

\[ \int_v^1 t^{2(1-\sigma)v/3}\exp\bigl(-c_1(\log t)(u-v)^3\bigr)\,du \ll t^{2(1-\sigma)v/3}(\log t)^{-1/3}. \]

Collecting everything, we have that

\[ \zeta(s) \ll t^{1-\sigma-c_1} + t^{\frac{(1-\sigma)^{3/2}}{\sqrt{3c_1}}}(\log t)^{2/3} + t^{\frac{2(1-\sigma)^{3/2}}{3\sqrt{3c_1}}}(\log t)^{2/3} + t^{1-2\sigma}\log t \ll t^{c_2(1-\sigma)^{3/2}}(\log t)^{2/3} \]

if $c_2 \ge 1/\sqrt{3c_1}$ is sufficiently large.

The methods for deducing zero-free regions for $\zeta$ from such bounds are well known. We present without proof the following result due to Landau [27]. This can be proven via the 3-4-1 inequality and a clever application of the Borel-Carathéodory Lemma (we refer to [30, Section 6.1] for the proof of the classical zero-free region using this method, or to exercise 9 from [30, Section 6.1] for the general theorem).

Theorem 4.5 (Landau). Suppose that $\theta(t)$ and $\phi(t)$ are functions on $\mathbb{R}^+$ with the following properties: $\phi$ is positive and increasing, $e^{-\phi(t)} \le \theta(t) \le 1/2$ and $\theta$ is decreasing. Suppose that $\zeta(s) \ll e^{\phi(|t|)}$ for $|t| \ge 2$, $\sigma \ge 1 - \theta(|t|)$. Then there is a positive constant $c$ such that

\[ \zeta(s) \ne 0 \quad \text{for } \sigma \ge 1 - c\,\frac{\theta(2|t|+1)}{\phi(2|t|+1)}, \]

\[ \frac{\zeta'}{\zeta}(s) \ll \frac{\phi(2|t|+2)}{\theta(2|t|+2)} \quad \text{for } \sigma \ge 1 - \frac c2\,\frac{\theta(2|t|+2)}{\phi(2|t|+2)}. \]

Corollary 4.6 (Vinogradov-Korobov). There is a positive constant $c_3$ such that (for $|t| \ge 3$)

\[ \zeta(s) \ne 0 \quad \text{and} \quad \frac{\zeta'}{\zeta}(s) \ll (\log|t|)^{2/3}(\log\log|t|)^{1/3} \]

for

\[ \sigma \ge 1 - c_3\frac{1}{(\log|t|)^{2/3}(\log\log|t|)^{1/3}}. \tag{4.5} \]

Proof. Apply Theorem 4.5 and Theorem 4.4 with

\[ \theta(t) = \Bigl(\frac{\log\log t}{\log t}\Bigr)^{2/3}, \qquad \phi(t) = (c_2 + 2/3)\log\log t. \]

By contour integration, one can deduce from this the following form of the Prime Number Theorem (see for example [21, pages 60–65] for a general method of obtaining error terms in the PNT from a given zero-free region).

Theorem 4.7. There exist positive constants $c_4$ and $c_5$ such that

\[ \psi(x) = x + O\Bigl(x\exp\Bigl(-c_4\frac{(\log x)^{3/5}}{(\log\log x)^{1/5}}\Bigr)\Bigr), \]

\[ \pi(x) = \mathrm{Li}(x) + O\Bigl(x\exp\Bigl(-c_5\frac{(\log x)^{3/5}}{(\log\log x)^{1/5}}\Bigr)\Bigr), \]

where

\[ \mathrm{Li}(x) = \int_2^x\frac{du}{\log u}. \]

The zero-free region (4.5) and the PNT in the form of Theorem 4.7 are the strongest known results in this context. For the sharpest known numerical values of the constants $c_2$, $c_4$ and $c_5$ and the implicit constants in Theorems 4.4 and 4.7, we refer to two articles by Ford [6, 7].

Chapter 5

Riemann’s non-differentiable function

5.1 Introduction

According to an account of Weierstrass, Riemann would have suggested the function

\[ \phi_R(x) = \sum_{n=1}^{+\infty}\frac{1}{n^2\pi}\sin(n^2\pi x) \]

(R for Riemann) as an example of a function which is continuous but nowhere differentiable. However, it seems that neither he nor Weierstrass was able to show the non-differentiability of $\phi_R$. It is said that this function served as inspiration for Weierstrass when he came up with his (in)famous "monster"

\[ \sum_{n=1}^{+\infty} a^n\cos(b^n\pi x), \qquad 0 < a < 1, \]

for which he showed that it is nowhere differentiable if $b$ is an odd integer and $ab > 1 + 3\pi/2$. In 1916, Hardy generalised this to $b > 1$ arbitrary, $ab \ge 1$ in his paper "Weierstrass's non-differentiable function" [10]. In that same paper, Hardy also proved, based on properties of the Jacobi theta function established in the paper [11] by him and Littlewood, that Riemann's function is not differentiable at any irrational point, and at rational points of the form $(2r+1)/(2s)$ or $2r/(4s+1)$. This seemed to confirm Riemann's conjecture, but in 1970, Gerver [8] showed that $\phi_R$ is differentiable at rational points of the form $(2r+1)/(2s+1)$ (with derivative $-1/2$), and not differentiable at the other rationals. His proof is elementary, but difficult and long. Later this was simplified by Smith in 1972 [33] and Itatsu in 1981 [22]. Using methods from wavelet analysis, Holschneider and Tchamitchian [19] showed the same (non-)differentiability results, and showed that the singularities of $\phi_R$ at certain rational points are of cusp type:

\[ \phi_R(a/q + x) = \phi_R(a/q) + C^-_{a/q}|x|_-^{1/2} + C^+_{a/q}|x|_+^{1/2} + \psi(x), \]

where the remainder $\psi$ is differentiable at $0$. This was also shown by other methods by Duistermaat in a very nice and accessible article [4]; he provides an asymptotic expansion of $\phi_R$ at every rational point. He also provides lower and upper bounds for the Hölder exponent

\[ \alpha(\rho) = \sup\{\alpha > 0 \mid \phi_R(\rho + x) = \phi_R(\rho) + O(|x|^\alpha)\} \]

at irrational points $\rho$. Finally, in his paper [24], Jaffard determined the Hölder exponent at every irrational point $\rho$, in terms of the rate of convergence of the convergents in the continued fraction expansion of $\rho$.

Most of the properties of φR described above were obtained by studying a Jacobi theta function in the upper half plane, and more specifically by studying its behaviour near the real line. Estimates for this function are usually obtained by its transformational behaviour under the action of some subgroup of the modular group PSL(2, Z), the Theta modular group. In this chapter, we will reprove some of the properties of φR, by studying the theta function via (a version of) the Poisson summation formula, and via properties of the quadratic Gauss sums, studied in Subsection 2.2.3.

As a generalisation of the Riemann function, one can consider the family of functions

\[ \phi_{\alpha,k}(x) = \sum_{m=1}^{+\infty} \frac{e(m^k x)}{m^{\alpha}}, \quad \alpha > 1, \ k \in \mathbb{N}^+. \]

The pointwise analysis of these functions is a difficult open problem. Some partial results however were obtained by Chamizo and Ubis [3].

Definition 5.1.1. We$^{1}$ define the functions $\phi$ and $\theta$ for complex $z = x + iy$ with $y > 0$ as:
\[ \phi(z) = \sum_{n=1}^{+\infty} \frac{1}{2\pi i n^2} e(n^2 z), \qquad \theta(z) = \sum_{n \in \mathbb{Z}} e(n^2 z). \]

Applying the M-test of Weierstrass, we see that these series converge uniformly on half planes of the form $\{z \in \mathbb{C} \mid y \ge y_0\}$, $y_0 > 0$. Using another theorem of Weierstrass, this implies that they are holomorphic on the upper half plane, and we may differentiate termwise to obtain the relation
\[ \phi'(z) = \frac{1}{2}\left(\theta(z) - 1\right). \tag{5.1} \]

Of course, φ has a continuous extension to the real line R, and our main object of interest, Riemann’s function, is given by φR(x) = 2 Re φ(x/2). To get information about φ on R, we will study the behaviour of θ near the real axis.

We begin with an elementary estimate for θ.

Lemma 5.1.2. Suppose $y > 0$. Then
\[ \theta(x + iy) \ll \begin{cases} y^{-1/2} & \text{if } y \le 1, \\ 1 & \text{if } y > 1. \end{cases} \]

$^{1}$ In the literature, these functions are usually defined to be 2-periodic; I however prefer them to be 1-periodic. This choice has of course no impact on the proved results.

Proof. We estimate by the triangle inequality and compare with the Gaussian integral to obtain:
\[ |\theta(x + iy)| \le 1 + 2\sum_{m=1}^{+\infty} \exp(-2\pi y m^2) \le 1 + 2\int_0^{+\infty} \exp(-2\pi y t^2)\,dt \ll 1 + y^{-1/2}, \]
and the lemma follows.
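The estimate of Lemma 5.1.2 can be checked numerically. The following is a minimal sketch, not part of the original argument: it evaluates a truncated theta series and compares it with the explicit value $1 + (2y)^{-1/2}$ of the Gaussian integral bound in the proof. The function names and the truncation rule are illustrative assumptions.

```python
import cmath
import math

def theta(z):
    """Truncated theta series sum_{n in Z} e(n^2 z), with e(w) = exp(2*pi*i*w).

    The cutoff is an assumption: terms with 2*pi*y*n^2 > 2*pi*60 are negligible.
    """
    y = z.imag
    n_max = int(math.sqrt(60 / y)) + 1
    total = 1 + 0j  # the n = 0 term
    for n in range(1, n_max + 1):
        total += 2 * cmath.exp(2j * math.pi * n * n * z)  # terms n and -n coincide
    return total

def theta_bound(y):
    """1 + 2 * int_0^inf exp(-2*pi*y*t^2) dt, which equals 1 + (2y)^(-1/2)."""
    return 1 + 1 / math.sqrt(2 * y)

for y in (0.001, 0.01, 0.1, 1.0, 5.0):
    worst = max(abs(theta(complex(x / 10, y))) for x in range(10))
    print(f"y = {y}: max |theta(x + iy)| = {worst:.4f}, bound = {theta_bound(y):.4f}")
```

For small $y$ the maximum is attained near $x = 0$ and is of size $y^{-1/2}$, in line with the lemma.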

5.2 Behaviour at rational points

In order to examine the regularity of φ at rational points, we will study the behaviour of θ near rationals. We start with a lemma, which can be viewed as the Poisson summation formula where the summation runs over an arithmetic progression instead of over all the integers.

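Lemma 5.2.1. Suppose $f \in C^1(\mathbb{R})$ and $f, f' \in L^1(\mathbb{R})$. Then
\[ \sum_{n \in b + q\mathbb{Z}} f(n) = \lim_{M \to +\infty} \frac{1}{q} \sum_{m=-M}^{M} e\left(\frac{bm}{q}\right) \hat{f}\left(\frac{2\pi m}{q}\right). \]

Proof. Define $g(x) = f(qx + b)$, then
\[ \sum_{n \in b + q\mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} g(n), \]
which, by the Poisson summation formula C.2.1, equals
\[ \lim_{M \to +\infty} \sum_{m=-M}^{M} \hat{g}(2\pi m) = \lim_{M \to +\infty} \frac{1}{q} \sum_{m=-M}^{M} e\left(\frac{bm}{q}\right) \hat{f}\left(\frac{2\pi m}{q}\right). \]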
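As an illustration of Lemma 5.2.1, the sketch below compares both sides for a Gaussian test function. It assumes the Fourier convention $\hat{f}(u) = \int f(t) e^{-iut}\,dt$, which is the one consistent with the proof above; the test function, the width parameter `C` and the truncation ranges are arbitrary choices made for this example.

```python
import cmath
import math

C = 0.01  # width parameter of the Gaussian test function (arbitrary choice)

def e(t):
    """e(t) = exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)

def f(t):
    """Test function f(t) = exp(-pi*C*t^2)."""
    return math.exp(-math.pi * C * t * t)

def f_hat(u):
    """Fourier transform of f under the convention f_hat(u) = int f(t) exp(-i*u*t) dt."""
    return math.exp(-u * u / (4 * math.pi * C)) / math.sqrt(C)

def progression_sum(b, q, span=200):
    """Left-hand side: sum of f(n) over n = b + q*k with |k| <= span."""
    return sum(f(b + q * k) for k in range(-span, span + 1))

def dual_sum(b, q, M=50):
    """Right-hand side: (1/q) * sum_{|m| <= M} e(b*m/q) * f_hat(2*pi*m/q)."""
    return sum(e(b * m / q) * f_hat(2 * math.pi * m / q) for m in range(-M, M + 1)) / q

for b, q in [(2, 5), (0, 3), (1, 1), (7, 12)]:
    print(b, q, progression_sum(b, q), dual_sum(b, q))
```

Both columns should agree up to rounding, and the imaginary part of the right-hand side should be negligible.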

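We can apply this formula to get an expansion for $\theta$ near rationals $a/q$.

Lemma 5.2.2. Suppose $1 \le a \le q$, $(a, q) = 1$ and $y = \operatorname{Im} z > 0$. Then
\[ \theta\left(\frac{a}{q} + z\right) = \frac{e^{\pi i/4}}{q\sqrt{2}}\, z^{-1/2} \left( S(q, a) + 2\sum_{m=1}^{+\infty} S(q, a, m) \exp\left(-\frac{i\pi m^2}{2q^2 z}\right) \right). \]
Here, $S(q, a)$ and $S(q, a, m)$ are the quadratic Gauss sums defined in Subsection 2.2.3, and $z^{-1/2}$ is defined using the principal branch of the logarithm: $z^{-1/2} = \exp\left(-\tfrac{1}{2}(\log|z| + i \arg z)\right)$.

Proof. We have
\[ \theta\left(\frac{a}{q} + z\right) = \sum_{n \in \mathbb{Z}} e\left(\frac{an^2}{q}\right) e(n^2 z) = \sum_{b=1}^{q} e\left(\frac{ab^2}{q}\right) \sum_{n \in b + q\mathbb{Z}} e(n^2 z). \]
For fixed $z$, consider the function $f_z : \mathbb{R} \to \mathbb{C} : t \mapsto e(zt^2)$. Then $f_z$ meets the requirements of Lemma 5.2.1, hence
\[ \sum_{n \in b + q\mathbb{Z}} e(n^2 z) = \frac{1}{q} \sum_{m \in \mathbb{Z}} e\left(\frac{bm}{q}\right) \hat{f}_z\left(\frac{2\pi m}{q}\right). \]
Its Fourier transform is given by
\[ \hat{f}_z(u) = \frac{e^{\pi i/4}}{\sqrt{2}}\, z^{-1/2} \exp\left(-\frac{iu^2}{8\pi z}\right). \]
Plugging this in the above gives
\[ \theta\left(\frac{a}{q} + z\right) = \frac{e^{\pi i/4}}{q\sqrt{2}}\, z^{-1/2} \sum_{b=1}^{q} e\left(\frac{ab^2}{q}\right) \sum_{m \in \mathbb{Z}} e\left(\frac{mb}{q}\right) \exp\left(-\frac{i\pi m^2}{2q^2 z}\right) \]
\[ = \frac{e^{\pi i/4}}{q\sqrt{2}}\, z^{-1/2} \sum_{m \in \mathbb{Z}} S(q, a, m) \exp\left(-\frac{i\pi m^2}{2q^2 z}\right) \]
\[ = \frac{e^{\pi i/4}}{q\sqrt{2}}\, z^{-1/2} \left( S(q, a) + 2\sum_{m=1}^{+\infty} S(q, a, m) \exp\left(-\frac{i\pi m^2}{2q^2 z}\right) \right). \]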
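The expansion of Lemma 5.2.2 lends itself to a numerical sanity check. The sketch below is only an illustration: it takes $S(q, a, m) = \sum_{b=1}^{q} e((ab^2 + mb)/q)$, as read off from the proof above (the formal definition is in Subsection 2.2.3, outside this chapter), and compares a truncated theta series with the truncated right-hand side. Truncation lengths and sample points are arbitrary assumptions.

```python
import cmath
import math

def e(t):
    """e(t) = exp(2*pi*i*t), for real or complex t."""
    return cmath.exp(2j * math.pi * t)

def gauss_sum(q, a, m=0):
    """S(q, a, m) = sum_{b=1}^{q} e((a*b^2 + m*b)/q); S(q, a) is the case m = 0."""
    return sum(e(((a * b * b + m * b) % q) / q) for b in range(1, q + 1))

def theta_direct(w, terms=200):
    """theta(w) = sum_{n in Z} e(n^2 w), truncated to |n| <= terms."""
    return 1 + 2 * sum(e(n * n * w) for n in range(1, terms + 1))

def theta_near_rational(a, q, z, terms=60):
    """Truncated right-hand side of Lemma 5.2.2 for theta(a/q + z), Im z > 0."""
    prefactor = cmath.exp(1j * math.pi / 4) / (q * math.sqrt(2)) * z ** (-0.5)
    tail = sum(gauss_sum(q, a, m) * cmath.exp(-1j * math.pi * m * m / (2 * q * q * z))
               for m in range(1, terms + 1))
    return prefactor * (gauss_sum(q, a) + 2 * tail)

for a, q, z in [(1, 3, 0.01 + 0.02j), (1, 4, -0.015 + 0.01j), (5, 6, 0.02 + 0.03j)]:
    print(a, q, theta_direct(a / q + z), theta_near_rational(a, q, z))
```

Python's complex power `z ** (-0.5)` uses the principal branch, matching the convention in the lemma.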

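The previous lemma allows us to determine an asymptotic formula for $\phi$ near $a/q$. Define the “twisted” $\phi$-function $\phi_{q,a}(z)$ as
\[ \phi_{q,a}(z) = \sum_{m=1}^{+\infty} \frac{S(q, a, m)}{2\pi i m^2}\, e(m^2 z). \]

Theorem 5.2.3. Suppose $1 \le a \le q$ and $(a, q) = 1$. Then for real $x$ we have
\[ \phi\left(\frac{a}{q} + x\right) = \phi\left(\frac{a}{q}\right) + \frac{e^{\pi i/4}}{q\sqrt{2}}\, S(q, a)\, x^{1/2} - \frac{1}{2}x + R_{q,a}(x), \]
with
\[ R_{q,a}(x) = \frac{e^{\pi i/4}}{\sqrt{2}}\, 4q\, x^{3/2} \phi_{q,a}\left(-\frac{1}{4q^2 x}\right) - \frac{e^{\pi i/4}}{\sqrt{2}}\, 6q \int_0^x t^{1/2} \phi_{q,a}\left(-\frac{1}{4q^2 t}\right) dt \ll q^{3/2} |x|^{3/2}. \]
Remark that for negative values of $x$, $x^{1/2} = i|x|^{1/2}$.

Proof. Suppose $y > 0$. By equation (5.1),
\[ \phi\left(\frac{a}{q} + x + iy\right) = \phi\left(\frac{a}{q} + iy\right) + \int_{a/q + iy}^{a/q + x + iy} \phi'(\zeta)\,d\zeta = \phi\left(\frac{a}{q} + iy\right) + \frac{1}{2}\int_{iy}^{x + iy} \theta\left(\frac{a}{q} + \zeta\right) d\zeta - \frac{1}{2}x. \]
Using Lemma 5.2.2,
\[ \int_{iy}^{x+iy} \theta\left(\frac{a}{q} + \zeta\right) d\zeta = \frac{e^{\pi i/4}}{q\sqrt{2}} \left( S(q, a) \left[ 2\zeta^{1/2} \right]_{iy}^{x+iy} + 2\int_{iy}^{x+iy} \zeta^{-1/2} (4q^2\zeta^2) \left( \phi_{q,a}\left(-\frac{1}{4q^2\zeta}\right) \right)' d\zeta \right) \]
\[ = \frac{2e^{\pi i/4}}{q\sqrt{2}} \left( S(q, a) \left[ \zeta^{1/2} \right]_{iy}^{x+iy} + 4q^2 \left[ \zeta^{3/2} \phi_{q,a}\left(-\frac{1}{4q^2\zeta}\right) \right]_{iy}^{x+iy} - 6q^2 \int_{iy}^{x+iy} \zeta^{1/2} \phi_{q,a}\left(-\frac{1}{4q^2\zeta}\right) d\zeta \right). \]
All the occurring functions have continuous extensions to $\mathbb{R}$. Letting $y \to 0^+$ gives the first part of the theorem. The bound on the error term follows easily when we note that $\phi_{q,a} \ll \sqrt{q}$, since $S(q, a, m) \ll \sqrt{q}$ by the remarks at the end of Subsection 2.2.3.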

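Table 5.1: Behaviour of $\operatorname{Re}\left(\phi(a/q + x) - \phi(a/q)\right)$. Here $\left(\frac{a}{q}\right)$ and $\left(\frac{q}{a}\right)$ denote Jacobi symbols.

q mod 4 | a mod 4 | x < 0 | x > 0
1 | any | $-\left(\frac{a}{q}\right)\frac{1}{2\sqrt{q}}\sqrt{|x|} + O(|x|)$ | $\left(\frac{a}{q}\right)\frac{1}{2\sqrt{q}}\sqrt{x} + O(x)$
3 | any | $-\left(\frac{a}{q}\right)\frac{1}{2\sqrt{q}}\sqrt{|x|} + O(|x|)$ | $-\left(\frac{a}{q}\right)\frac{1}{2\sqrt{q}}\sqrt{x} + O(x)$
2 | any | $-\frac{1}{2}x + O\left(q^{3/2}|x|^{3/2}\right)$ | $-\frac{1}{2}x + O\left(q^{3/2}x^{3/2}\right)$
0 | 1 | $-\left(\frac{q}{a}\right)\frac{1}{\sqrt{q}}\sqrt{|x|} + O(|x|)$ | $-\frac{1}{2}x + O\left(q^{3/2}x^{3/2}\right)$
0 | 3 | $-\frac{1}{2}x + O\left(q^{3/2}|x|^{3/2}\right)$ | $\left(\frac{q}{a}\right)\frac{1}{\sqrt{q}}\sqrt{x} + O(x)$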

Using the explicit expression for $S(q, a)$ given by Theorem 2.2.22, we can exhibit the behaviour of $\operatorname{Re}\left(\phi(a/q + x) - \phi(a/q)\right)$ in a precise fashion, which is given in Table 5.1. Remark that at some rational points, the function has a left (resp. right) derivative, but no right (resp. left) derivative. By rescaling by a factor 1/2, we obtain the well known regularity of $\phi_R$ at rational points.
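The square-root coefficients in Table 5.1 can be recovered by brute force from Theorem 5.2.3. The sketch below is a hedged illustration: it takes $S(q, a) = \sum_{b=1}^{q} e(ab^2/q)$ (the case $m = 0$ of the sums appearing in the proof of Lemma 5.2.2) and computes the real part of the coefficient of $x^{1/2}$, using $x^{1/2} = i|x|^{1/2}$ for $x < 0$. The function names are assumptions made for this example.

```python
import cmath
import math

def gauss_sum(q, a):
    """Quadratic Gauss sum S(q, a) = sum_{b=1}^{q} e(a*b^2/q), by brute force."""
    return sum(cmath.exp(2j * math.pi * ((a * b * b) % q) / q) for b in range(1, q + 1))

def sqrt_coefficients(q, a):
    """Coefficients of |x|^(1/2) in Re(phi(a/q + x) - phi(a/q)) for x < 0 and x > 0."""
    c = cmath.exp(1j * math.pi / 4) * gauss_sum(q, a) / (q * math.sqrt(2))
    left = (1j * c).real   # x < 0: x^(1/2) = i * |x|^(1/2)
    right = c.real         # x > 0
    return left, right

for q in range(1, 13):
    for a in range(1, q + 1):
        if math.gcd(a, q) == 1:
            left, right = sqrt_coefficients(q, a)
            print(f"a/q = {a}/{q} (q mod 4 = {q % 4}): left {left:+.4f}, right {right:+.4f}")
```

A vanishing coefficient on one side corresponds to a one-sided derivative there; both coefficients vanish exactly when $q \equiv 2 \bmod 4$.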

Corollary 5.2.4. Suppose $r = a/q$ is rational. If $a$ and $q$ are both odd, then $\phi_R$ is differentiable at $r$; otherwise the Hölder exponent of $\phi_R$ at $r$ equals 1/2.

As in the article of Duistermaat [4], we can iterate the integration by parts procedure from the proof of Theorem 5.2.3 to obtain an asymptotic expansion for $\phi$ at rationals. Define for $k \in \mathbb{N}$
\[ \phi_{q,a}^{(-k)}(x) = \sum_{m=1}^{+\infty} \frac{S(q, a, m)}{(2\pi i m^2)^{k+1}}\, e(m^2 x). \]
Then
\[ \phi_{q,a} = \phi_{q,a}^{(0)} \quad \text{and} \quad \frac{d\phi_{q,a}^{(-k)}}{dx}(x) = \phi_{q,a}^{(-k+1)}(x). \]

Proposition 5.2.5. Suppose $1 \le a \le q$ and $(a, q) = 1$. Then
\[ \phi\left(\frac{a}{q} + x\right) \sim \phi\left(\frac{a}{q}\right) + \frac{e^{\pi i/4}}{q\sqrt{2}}\, S(q, a)\, x^{1/2} - \frac{1}{2}x + \frac{e^{\pi i/4}}{\sqrt{2}} \sum_{k=0}^{+\infty} a_k q^{2k+1} \phi_{q,a}^{(-k)}\left(-\frac{1}{4q^2 x}\right) x^{k+3/2}, \]
where$^{2}$
\[ a_k = (-1)^k 4^{k+1} \prod_{j=1}^{k} (j + 1/2). \]

$^{2}$ We take the empty product to be equal to 1.

Proof. By induction on $k$. The case $k = 0$ follows from Theorem 5.2.3. For general $k$, this follows from
\[ \int_0^x t^{k+1/2} \phi_{q,a}^{(-k)}\left(-\frac{1}{4q^2 t}\right) dt = 4q^2 \int_0^x t^{k+5/2} \frac{d}{dt}\left( \phi_{q,a}^{(-k-1)}\left(-\frac{1}{4q^2 t}\right) \right) dt \]
\[ = 4q^2 x^{k+1+3/2} \phi_{q,a}^{(-k-1)}\left(-\frac{1}{4q^2 x}\right) - 4q^2 (k + 2 + 1/2) \int_0^x t^{k+1+1/2} \phi_{q,a}^{(-k-1)}\left(-\frac{1}{4q^2 t}\right) dt. \]

5.3 Behaviour at irrational points

We would now like to investigate the behaviour of $\phi$ at irrational points $\rho$. Unlike in the rational case, we will not be able to derive an asymptotic formula for $\phi$ near $\rho$. We can however determine the Hölder exponent
\[ \alpha(\rho) = \sup\{\alpha > 0 \mid \phi_R(\rho + x) = \phi_R(\rho) + O(|x|^{\alpha})\} \]
at these points, in terms of the rate of convergence of the convergents in the continued fraction expansion of $\rho$. For the basics of continued fractions, we refer to Appendix C.1. Denote the $n$th convergent in the continued fraction expansion of $\rho$ by $r_n = p_n/q_n$, where $(p_n, q_n) = 1$. Define $\tau_n$ via
\[ |\rho - r_n| = \left(\frac{1}{q_n}\right)^{\tau_n}. \]

We have the following properties of continued fractions (proven in Appendix C.1): for every $n \in \mathbb{N}$, $\tau_n > 2$, $|\rho - r_{n+1}| < |\rho - r_n|$, $r_n$ and $r_{n+1}$ lie on different sides of $\rho$, and
\[ p_{n+1} q_n - p_n q_{n+1} = (-1)^n. \tag{5.2} \]

Since $|r_n - r_{n+1}| = 1/(q_n q_{n+1})$ we have
\[ |\rho - r_n| \le \frac{1}{q_n q_{n+1}} \le 2|\rho - r_n|, \]
so
\[ \left(\frac{1}{q_n}\right)^{\tau_n - 1} \le \frac{1}{q_{n+1}} \le 2\left(\frac{1}{q_n}\right)^{\tau_n - 1}. \tag{5.3} \]

Note also that by equation (5.2), $q_n$ and $q_{n+1}$ are coprime, so it is never the case that both $q_n$ and $q_{n+1}$ are $\equiv 2 \bmod 4$.
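The quantities $\tau_n$ and the relations (5.2) and (5.3) can be explored numerically. The following sketch is an illustration only: it uses Python's `decimal` module with an arbitrarily chosen working precision, takes $\rho = \sqrt{2}$ as the sample irrational, and the helper names `convergents` and `check_convergents` are hypothetical.

```python
from decimal import Decimal, getcontext
import math

getcontext().prec = 80  # working precision in digits (a generous, arbitrary choice)

def convergents(rho, count):
    """First `count` convergents (p_n, q_n) of the positive Decimal rho."""
    result = []
    p_prev, q_prev = 1, 0      # (p_{-1}, q_{-1})
    p_prev2, q_prev2 = 0, 1    # (p_{-2}, q_{-2})
    x = rho
    for _ in range(count):
        a = int(x)             # partial quotient; int() is the floor since x > 0
        p, q = a * p_prev + p_prev2, a * q_prev + q_prev2
        result.append((p, q))
        p_prev2, q_prev2, p_prev, q_prev = p_prev, q_prev, p, q
        x = 1 / (x - a)
    return result

def check_convergents(rho, count):
    """Rows (n, p_n, q_n, tau_n, identity (5.2) holds, sandwich before (5.3) holds)."""
    conv = convergents(rho, count)
    rows = []
    for n in range(count - 1):
        (p, q), (p_next, q_next) = conv[n], conv[n + 1]
        if q == 1:
            continue  # tau_n is not defined when q_n = 1
        error = abs(rho - Decimal(p) / Decimal(q))
        tau = -math.log(float(error)) / math.log(q)   # |rho - r_n| = q_n^(-tau_n)
        identity_ok = p_next * q - p * q_next == (-1) ** n
        gap = 1 / Decimal(q * q_next)                 # |r_n - r_{n+1}|
        sandwich_ok = error <= gap <= 2 * error
        rows.append((n, p, q, tau, identity_ok, sandwich_ok))
    return rows

for n, p, q, tau, identity_ok, sandwich_ok in check_convergents(Decimal(2).sqrt(), 20):
    print(f"n = {n}: r_n = {p}/{q}, tau_n = {tau:.4f}, (5.2): {identity_ok}, sandwich: {sandwich_ok}")
```

For $\sqrt{2}$ the values $\tau_n$ decrease slowly towards 2, as expected for a number with bounded partial quotients.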

5.3.1 The upper bound for α(ρ)

An upper bound for the Hölder exponent at $\rho$ can be obtained from the expansion of $\phi$ near rationals. This proposition is due to Duistermaat [4, Proposition 5.2].

Proposition 5.3.1. Suppose there is a subsequence $(r_{n_k})_k$ such that for every $k$, $q_{n_k} \not\equiv 2 \bmod 4$ and $\tau_{n_k} \ge \tau$ for some $\tau \ge 2$. Then
\[ \operatorname{Re}\left(\phi(\rho + x) - \phi(\rho)\right) = \Omega\left(|x|^{\frac{1}{2} + \frac{1}{2\tau}}\right). \]
Here, “$= \Omega(\ldots)$” is the negation of “$= o(\ldots)$”.

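Proof. We will construct a sequence of points $(x_k)_k$ such that $x_k \to 0$ and $\left|\operatorname{Re}\left(\phi(\rho + x_k) - \phi(\rho)\right)\right|$ is bounded from below by a constant multiple of $|x_k|^{1/2 + 1/(2\tau)}$. We will do this by exploiting the square root behaviour of $\phi$ at $p_{n_k}/q_{n_k}$. For ease of notation, we will for now drop the index and write $q$ for $q_{n_k}$ and $p$ for $p_{n_k}$. Given such a $q$, we will choose an $x$ such that
\[ |x| = \lambda\,|\rho - p/q| \tag{5.4} \]
for a fixed (independent of $k$) constant $\lambda > 0$. First choose the sign of $x$ in such a way that $p/q + x$ lies on the side with the square root term. This is always possible since $q \not\equiv 2 \bmod 4$, see Table 5.1. Using Theorem 5.2.3 we see that
\[ \left|\operatorname{Re}\left(\phi(p/q + x) - \phi(p/q)\right)\right| \ge \frac{|x|^{1/2}}{2\sqrt{q}} - \frac{1}{2}|x| - Cq^{3/2}|x|^{3/2}, \]
where $C$ is an absolute constant (independent of $q$). By equation (5.4), and using that $|\rho - p/q| \le q^{-2}$, this is
\[ \ge \frac{\sqrt{\lambda}}{\sqrt{q}}\,|\rho - p/q|^{1/2} \left( \frac{1}{2} - \frac{1}{2\sqrt{q}}\sqrt{\lambda} - C\lambda \right). \]
Now fix a $\lambda$ with
\[ 0 < \lambda < \frac{1}{2C}. \]
If $q$ is sufficiently large, then
\[ \left|\operatorname{Re}\left(\phi(p/q + x) - \phi(p/q)\right)\right| \ge \varepsilon\, \frac{|\rho - p/q|^{1/2}}{\sqrt{q}}, \]
for some fixed $\varepsilon$ with
\[ 0 < \varepsilon < \sqrt{\lambda}\left(\frac{1}{2} - C\lambda\right). \]
Now by using the fact that $q^{-\tau} \ge |\rho - p/q|$, we get
\[ \left|\operatorname{Re}\left(\phi(p/q + x) - \phi(p/q)\right)\right| \ge \varepsilon\,|\rho - p/q|^{\frac{1}{2} + \frac{1}{2\tau}} = \varepsilon'\,|x|^{\frac{1}{2} + \frac{1}{2\tau}}. \]
Finally, since $\left|\operatorname{Re}\left(\phi(\rho) - \phi(p/q)\right)\right|$ and $\left|\operatorname{Re}\left(\phi(\rho) - \phi(p/q + x)\right)\right|$ are not both smaller than $\left|\operatorname{Re}\left(\phi(p/q + x) - \phi(p/q)\right)\right|/2$, we can take $x_k = p_{n_k}/q_{n_k} - \rho$ or $x_k = p_{n_k}/q_{n_k} + x - \rho$, such that $\left|\operatorname{Re}\left(\phi(\rho) - \phi(\rho + x_k)\right)\right|$ is maximal, and we get
\[ \left|\operatorname{Re}\left(\phi(\rho) - \phi(\rho + x_k)\right)\right| \ge \varepsilon''\,|x_k|^{\frac{1}{2} + \frac{1}{2\tau}}, \quad x_k \to 0. \]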

5.3.2 The lower bound for α(ρ)

To find lower bounds for the Hölder exponent, we follow Holschneider and Tchamitchian [19] and Jaffard [24] and use methods from wavelet analysis.

Definition 5.3.2.

• For our purposes$^{3}$, a wavelet is a function $g \in L^{\infty}(\mathbb{R}) \cap L^{1}(\mathbb{R})$ for which
\[ \int_{\mathbb{R}} g(x)\,dx = 0. \]

• Given a function $f \in L^{\infty}(\mathbb{R})$, the continuous wavelet transform $W_g f$ of $f$ with respect to $g$ is defined as
\[ W_g f(b, a) = \int_{-\infty}^{+\infty} \frac{1}{a}\, \overline{g}\left(\frac{x - b}{a}\right) f(x)\,dx, \quad b \in \mathbb{R}, \ a > 0. \]

The continuous wavelet transform is very useful in this context, since it may be used to analyse the function $f$ over $\mathbb{R}$ at arbitrary length scales. Informally, the wavelet coefficient $W_g f(b, a)$ contains information about $f$ at position $b$ and length scale $a$. In [18], Holschneider draws an analogy between the continuous wavelet transform and a mathematical microscope: $b$ corresponds to the position, $a^{-1}$ to the enlargement, and the chosen wavelet $g$ is a suitable “optic” chosen for the specific problem at hand. This idea of a mathematical microscope is made concrete in a range of theorems which relate global or local regularity of $f$ to bounds for the wavelet transform $W_g f$ near points on the real axis, and vice versa. The result which we will use is given by Proposition C.3.2 in Appendix C.3. For more on the continuous wavelet transform, we refer to the book “Wavelets, An Analysis Tool” by Holschneider [18].

To obtain pointwise regularity results for $\phi$, we will investigate the wavelet transform of $\phi$ with respect to the wavelet
\[ g(x) = \frac{1}{\pi(x + i)^2} \]
(which is called a Cauchy wavelet). The wavelet transform of $\phi$ with respect to this wavelet can be computed via residue calculus. We have
\[ W_g\phi(b, a) = \int_{\mathbb{R}} \frac{\phi(x)}{a\pi\left((x - b)/a - i\right)^2}\,dx = \frac{1}{2\pi^2 i} \sum_{n=1}^{+\infty} \frac{1}{n^2} \int_{\mathbb{R}} \frac{e\left(n^2(ax + b)\right)}{(x - i)^2}\,dx, \]
where we used the uniform convergence of the series defining $\phi$ to swap the sum and integral. For $R > 0$ denote by $\Gamma_R$ the semicircle with centre the origin and ranging from $R$ to $-R$ in the upper half of the complex plane. Then
\[ \int_{\Gamma_R} \frac{e\left(n^2(az + b)\right)}{(z - i)^2}\,dz \ll \frac{1}{R^2} \int_{\Gamma_R} |dz| \ll \frac{1}{R} \to 0 \]
when $R \to +\infty$. By Cauchy’s residue theorem, we get that
\[ \int_{\mathbb{R}} \frac{e\left(n^2(ax + b)\right)}{(x - i)^2}\,dx = 2\pi i \operatorname*{res}_{z = i} \frac{e\left(n^2(az + b)\right)}{(z - i)^2} = 2\pi i \cdot 2\pi i n^2 a\, e\left(n^2(b + ia)\right). \]
Simplifying everything, we get that
\[ W_g\phi(b, a) = ia\left(\theta(b + ia) - 1\right). \]

$^{3}$ Depending on the situation one often imposes extra assumptions, such as localisation of $g$ or $\hat{g}$, smoothness properties, higher order vanishing moments, . . .

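The function $\phi$ can be reconstructed from its wavelet transform by
\[ \phi(x) = \int_0^{+\infty} \frac{da}{a} \int_{-\infty}^{+\infty} \frac{1}{a}\, h\left(\frac{x - b}{a}\right) W_g\phi(b, a)\,db, \tag{5.5} \]
for suitable functions $h$, which are called reconstruction wavelets. In this case, any function $h \in L^{\infty}(\mathbb{R}) \cap L^{1}(\mathbb{R})$ with
\[ \int_0^{+\infty} \hat{h}(a) e^{-a}\,da = -\frac{1}{2} \]
suffices. Indeed, for $a > 0$, the inner integral in (5.5) equals
\[ i \int_{-\infty}^{+\infty} h\left(\frac{x - b}{a}\right) \left(\theta(b + ia) - 1\right) db = 2i \sum_{n=1}^{+\infty} e^{-2\pi n^2 a} \int_{-\infty}^{+\infty} h\left(\frac{x - b}{a}\right) e^{2\pi i n^2 b}\,db \]
\[ = 2i \sum_{n=1}^{+\infty} e^{-2\pi a n^2} e^{2\pi i n^2 x}\, a \int_{-\infty}^{+\infty} h(b) e^{-2\pi i a b n^2}\,db = 2i \sum_{n=1}^{+\infty} e^{2\pi i n^2 x} e^{-2\pi n^2 a}\, a\, \hat{h}(2\pi n^2 a). \]
Integrating this over $a$ with respect to $da/a$ gives
\[ \sum_{n=1}^{+\infty} 2i\, e(n^2 x) \int_0^{+\infty} e^{-2\pi n^2 a}\, \hat{h}(2\pi n^2 a)\,da = \sum_{n=1}^{+\infty} \frac{1}{2\pi i n^2}\, e(n^2 x) = \phi(x). \]

A valid reconstruction wavelet is for example given by $h = g$. Indeed
\[ \hat{g}(a) = \begin{cases} -2a e^{-a} & a \ge 0, \\ 0 & a < 0, \end{cases} \quad \text{and} \quad \int_0^{+\infty} \hat{g}(a) e^{-a}\,da = -2\int_0^{+\infty} a e^{-2a}\,da = -\frac{1}{2}. \]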

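Proposition 5.3.3. Suppose $z = b + ia$ with $a > 0$. If $2|\rho - r_{n+1}| \le |z| \le 2|\rho - r_n|$, then
\[ a\,\theta(\rho + z) \ll a^{\frac{1}{2} + \frac{1}{2\tau_{n+1}}} \left(1 + \frac{|b|}{a}\right)^{\frac{1}{2\tau_{n+1}}} + a^{\frac{1}{2} + \frac{1}{2\tau_n}} \left(1 + \frac{|b|}{a}\right)^{\frac{1}{2\tau_n}}. \]

Furthermore, if $q_{n+1} \equiv 2 \bmod 4$ resp. $q_n \equiv 2 \bmod 4$, then
\[ a\,\theta(\rho + z) \ll a^{\frac{1}{2} + \frac{1}{2\tau_n}} \left(1 + \frac{|b|}{a}\right)^{\frac{1}{2\tau_n}} \quad \text{resp.} \quad a^{\frac{1}{2} + \frac{1}{2\tau_{n-1}}} \left(1 + \frac{|b|}{a}\right)^{\frac{1}{2\tau_{n-1}}}. \]

The implicit constants are independent of $n$.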

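Proof. We will derive bounds for $\theta$ near $p/q$ and apply them with $p = p_{n+1}$, $q = q_{n+1}$. By Lemma 5.2.2,
\[ \theta\left(\frac{p}{q} + \zeta\right) = \frac{e^{\pi i/4}}{q\sqrt{2}}\, \zeta^{-1/2} \sum_{m \in \mathbb{Z}} S(q, p, m) \exp\left(-\frac{\pi i}{2q^2\zeta} m^2\right). \]
By the results obtained in Subsection 2.2.3, we have that $S(q, p, m) \ll \sqrt{q}$. Estimating via the triangle inequality gives
\[ \theta\left(\frac{p}{q} + \zeta\right) \ll \frac{1}{\sqrt{q}\,|\zeta|^{1/2}} \sum_{m \in \mathbb{Z}} \exp\left(-\frac{\pi \operatorname{Im}\zeta}{2q^2|\zeta|^2} m^2\right) = \frac{1}{\sqrt{q}\,|\zeta|^{1/2}}\, \theta\left(\frac{i \operatorname{Im}\zeta}{4q^2|\zeta|^2}\right). \]
Now upon applying Lemma 5.1.2 while distinguishing the cases $\operatorname{Im}\zeta/(4q^2|\zeta|^2) \le 1$ resp. $> 1$, we get that this is
\[ \ll \frac{1}{\sqrt{q}\,|\zeta|^{1/2}} + \frac{\sqrt{q}\,|\zeta|^{1/2}}{(\operatorname{Im}\zeta)^{1/2}}. \]

Now set $\zeta = z + (\rho - r_{n+1})$. Then $\theta(\rho + z) = \theta(r_{n+1} + \zeta)$. Since $2|\rho - r_{n+1}| \le |z|$ we have $\zeta \asymp z$:
\[ |\zeta| \le |z| + |\rho - r_{n+1}| \le \frac{3}{2}|z|, \qquad |\zeta| \ge |z| - |\rho - r_{n+1}| \ge \frac{1}{2}|z|. \]
Also
\[ \frac{1}{\sqrt{q_{n+1}}} = |\rho - r_{n+1}|^{\frac{1}{2\tau_{n+1}}} \le |z|^{\frac{1}{2\tau_{n+1}}}, \]
and by (5.3), the fact that $\tau_n > 2$, and $|z| \le 2|\rho - r_n|$,
\[ \sqrt{q_{n+1}} \le q_n^{\frac{\tau_n - 1}{2}} = |\rho - r_n|^{\frac{1}{2\tau_n} - \frac{1}{2}} \le \sqrt{2}\,|z|^{\frac{1}{2\tau_n} - \frac{1}{2}}. \]
Therefore, since $\operatorname{Im}\zeta = a$,
\[ a\,\theta(\rho + z) \ll a\,|z|^{\frac{1}{2\tau_{n+1}} - \frac{1}{2}} + a^{1/2}\,|z|^{\frac{1}{2\tau_n}}. \]
Now if $q \equiv 2 \bmod 4$, then $S(q, p, m) = 0$ for even $m$. Hence,
\[ \theta\left(\frac{p}{q} + \zeta\right) = \frac{e^{\pi i/4}}{q\sqrt{2}}\, \zeta^{-1/2} \sum_{m \in 1 + 2\mathbb{Z}} S(q, p, m) \exp\left(-\frac{\pi i}{2q^2\zeta} m^2\right). \]
By estimating via the triangle inequality, using $S(q, p, m) \ll \sqrt{q}$ and comparing with the Gaussian integral, we get
\[ \theta\left(\frac{p}{q} + \zeta\right) \ll \frac{1}{\sqrt{q}\,|\zeta|^{1/2}} \sum_{m \in 1 + 2\mathbb{Z}} \exp\left(-\frac{\pi \operatorname{Im}\zeta}{2q^2|\zeta|^2} m^2\right) \ll \frac{1}{\sqrt{q}\,|\zeta|^{1/2}} \int_{\mathbb{R}} \exp\left(-\frac{\pi \operatorname{Im}\zeta}{2q^2|\zeta|^2} t^2\right) dt \ll \frac{\sqrt{q}\,|\zeta|^{1/2}}{(\operatorname{Im}\zeta)^{1/2}}. \]

If $q_{n+1} \equiv 2 \bmod 4$ resp. $q_n \equiv 2 \bmod 4$, then we apply this with $p = p_{n+1}$, $q = q_{n+1}$ resp. $p = p_n$, $q = q_n$ and we obtain the stated result in a similar manner as before. ($|\rho - r_n| < |\rho - r_{n-1}|$ so $|z| \le 2|\rho - r_{n-1}|$.)

We now have all the necessary ingredients to determine the Hölder exponents at irrational points.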
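The two facts about Gauss sums used in the preceding proof, namely $S(q, p, m) \ll \sqrt{q}$ and the vanishing of $S(q, p, m)$ for even $m$ when $q \equiv 2 \bmod 4$, can be checked by brute force for small moduli. The sketch below assumes $S(q, p, m) = \sum_{b=1}^{q} e((pb^2 + mb)/q)$, as read off from the proof of Lemma 5.2.2; the helper names are hypothetical. A standard squaring argument gives $|S(q, p, m)| \le \sqrt{2q}$ for $(p, q) = 1$, which is the constant the check compares against.

```python
import cmath
import math

def gauss_sum(q, p, m):
    """S(q, p, m) = sum_{b=1}^{q} e((p*b^2 + m*b)/q), computed by brute force."""
    return sum(cmath.exp(2j * math.pi * ((p * b * b + m * b) % q) / q)
               for b in range(1, q + 1))

def check_gauss_sums(q_max):
    """Return (largest |S(q, p, m)| / sqrt(q), whether all sums with
    q = 2 mod 4 and m even vanish up to rounding), over q <= q_max, (p, q) = 1."""
    worst_ratio, vanishing_ok = 0.0, True
    for q in range(1, q_max + 1):
        for p in range(1, q + 1):
            if math.gcd(p, q) != 1:
                continue
            for m in range(q):
                size = abs(gauss_sum(q, p, m))
                worst_ratio = max(worst_ratio, size / math.sqrt(q))
                if q % 4 == 2 and m % 2 == 0 and size > 1e-9:
                    vanishing_ok = False
    return worst_ratio, vanishing_ok

print(check_gauss_sums(30))
```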

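Theorem 5.3.4. Suppose $\rho \in \mathbb{R} \setminus \mathbb{Q}$ and denote by $(r_{n_k})_k$ the subsequence of convergents in the continued fraction expansion of $\rho$ for which $q_{n_k} \not\equiv 2 \bmod 4$. If we define $\tau(\rho) = \limsup_k \tau_{n_k}$, then we have for the Hölder exponent of both $\operatorname{Re}\phi$ and $\phi$ at $\rho$:
\[ \alpha(\rho) = \frac{1}{2} + \frac{1}{2\tau(\rho)}. \]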
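To get a feeling for the formula in Theorem 5.3.4, one can tabulate $\frac{1}{2} + \frac{1}{2\tau_n}$ along the admissible convergents of a concrete number. The sketch below does this for $\rho = \sqrt{2} = [1; 2, 2, 2, \ldots]$, where the partial quotients are bounded, so that $\tau_n \to 2$ and the theorem gives $\alpha(\sqrt{2}) = 3/4$. A finite computation cannot determine a limsup; the code only illustrates the trend. It uses the identity $|p^2 - 2q^2| = 1$ for convergents of $\sqrt{2}$ to avoid cancellation, and all names are illustrative.

```python
import math

def sqrt2_convergents(count):
    """Convergents p_n/q_n of sqrt(2) = [1; 2, 2, 2, ...] via the standard recurrence."""
    p_prev, q_prev, p, q = 1, 0, 1, 1   # (p_{-1}, q_{-1}) and (p_0, q_0)
    out = [(p, q)]
    for _ in range(count - 1):
        p_prev, q_prev, p, q = p, q, 2 * p + p_prev, 2 * q + q_prev
        out.append((p, q))
    return out

def tau(p, q):
    """tau_n defined by |sqrt(2) - p/q| = q^(-tau_n).

    Since |p^2 - 2 q^2| = 1, the error equals 1 / (q * (q*sqrt(2) + p)).
    """
    error = 1 / (q * (q * math.sqrt(2) + p))
    return -math.log(error) / math.log(q)

def holder_estimates(count):
    """Pairs (q_n, 1/2 + 1/(2 tau_n)) over convergents with q_n not congruent to 2 mod 4."""
    return [(q, 0.5 + 0.5 / tau(p, q))
            for p, q in sqrt2_convergents(count) if q > 1 and q % 4 != 2]

for q, alpha in holder_estimates(40):
    print(f"q_n = {q}: 1/2 + 1/(2 tau_n) = {alpha:.5f}")
```

The printed values increase slowly towards 0.75.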

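Proof. Write $\tau = \tau(\rho)$. Pick $\varepsilon > 0$ arbitrary. Then there exists a subsequence $(r_{n_{k_l}})_l = (r_l)_l$ such that $\tau_l \ge \tau - \varepsilon$. By Proposition 5.3.1, we have that
\[ \alpha(\rho) \le \frac{1}{2} + \frac{1}{2(\tau - \varepsilon)}. \]

On the other hand, we have that $\exists K \in \mathbb{N} : \forall k \ge K : \tau_{n_k} \le \tau + \varepsilon$. By Proposition 5.3.3, there exists a $\delta > 0$ such that for $a$ and $|x|$ smaller than $\delta$,
\[ W_g\phi(\rho + x, a) \ll a^{\frac{1}{2} + \frac{1}{2(\tau + \varepsilon)}} \left(1 + \frac{|x|}{a}\right)^{\frac{1}{2(\tau + \varepsilon)}}. \]

This bound extends to $a < \delta$ and $|x| \ge \delta$. Indeed, by concavity we have
\[ \left(1 + \frac{|x|}{a}\right)^{\frac{1}{2(\tau + \varepsilon)}} \ge 2^{\frac{1}{2(\tau + \varepsilon)} - 1} \left(1 + \left(\frac{|x|}{a}\right)^{\frac{1}{2(\tau + \varepsilon)}}\right). \]

Therefore,
\[ a^{\frac{1}{2} + \frac{1}{2(\tau + \varepsilon)}} \left(1 + \frac{|x|}{a}\right)^{\frac{1}{2(\tau + \varepsilon)}} \gg a^{\frac{1}{2} + \frac{1}{2(\tau + \varepsilon)}} + a^{\frac{1}{2}} |x|^{\frac{1}{2(\tau + \varepsilon)}} \gg_{\delta} a^{\frac{1}{2}}. \]

However, by comparing with the Gaussian integral as in the proof of Lemma 5.1.2, we see that $\theta(b + ia) - 1 \ll a^{-1/2}$ for any $a > 0$, so that indeed
\[ W_g\phi(\rho + x, a) \ll a^{\frac{1}{2} + \frac{1}{2(\tau + \varepsilon)}} \left(1 + \frac{|x|}{a}\right)^{\frac{1}{2(\tau + \varepsilon)}} \]
whenever $a < \delta$. Also
\[ W_g\phi(b, a) \ll a^{\frac{1}{2}} \]
when $a \ge \delta$. By Proposition C.3.2, we then have that
\[ \alpha(\rho) \ge \frac{1}{2} + \frac{1}{2(\tau + \varepsilon)}. \]

Since $\varepsilon > 0$ was arbitrary, the theorem follows.

Chapter 6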

Conclusion

In this thesis, we have investigated exponential sums. In particular, we have studied (ways to determine) non-trivial upper bounds for them, and some applications in number theory and analysis.

Some of the important upper bounds which we have obtained are those for sums over an arithmetic progression:
\[ \sum_{m=1}^{M} e(\alpha m + \beta) \ll \min\left(M, \frac{1}{\|\alpha\|}\right), \]
and the following bounds for $k$th-power Gauss sums:
\[ S_k(q, a) \ll_k q^{1 - 1/k}. \]

We have evaluated the quadratic Gauss sums explicitly. We have also investigated two ways of estimating Weyl sums, Weyl’s method and Vinogradov’s method. In the case that $P$ is a real polynomial of degree $k$ whose leading coefficient $\alpha$ satisfies $|\alpha - a/q| \le 1/q^2$ for some $a$ and $q$ with $(a, q) = 1$, we have the following bounds: if $M \ll q \ll M^{k-1}$, then for any $\varepsilon > 0$
\[ \sum_{m=1}^{M} e(P(m)) \ll_{k,\varepsilon} M^{1 + \varepsilon - c(k)}, \]
where
\[ c(k) = \frac{1}{2^{k-1}} \quad \text{resp.} \quad c(k) = \frac{1}{8\lceil 2k^2 \log(2k) \rceil^2} \]
from Weyl’s method resp. Vinogradov’s method.

We have applied our obtained knowledge of exponential sums to Waring’s problem and the ternary Goldbach problem by means of the Hardy-Littlewood method. We have been able to prove that every sufficiently large integer can be written as the sum of $s$ $k$th powers, provided that $s \ge k^2 \log k\,(12 + o(1))$, and that every sufficiently large odd integer can be written as the sum of three primes. As a second application, we have derived the Vinogradov-Korobov zero-free region for the Riemann zeta function $\zeta$: there exists a positive constant $c$ such that
\[ \zeta(\sigma + it) \ne 0 \quad \text{for} \quad \sigma \ge 1 - c\,\frac{1}{(\log|t|)^{2/3} (\log\log|t|)^{1/3}}. \]

Finally, we have studied the pointwise behaviour of Riemann’s function
\[ \phi_R(x) = \sum_{n=1}^{+\infty} \frac{1}{n^2\pi} \sin(n^2\pi x). \]
We have shown that it is differentiable at rational points of the form $a/q$ with $(a, q) = 1$ and $a$ and $q$ both odd (Corollary 5.2.4), and that it is not differentiable at any other point. Furthermore, we have evaluated the Hölder exponent $\alpha(\rho)$ of the function at each irrational point $\rho$:
\[ \alpha(\rho) = \frac{1}{2} + \frac{1}{2\tau(\rho)}, \]
where $\tau(\rho)$ is a measure for the speed of convergence of the convergents in the continued fraction expansion of $\rho$.

Appendix A

Nederlandse samenvatting (Dutch summary)

In this thesis we have investigated exponential sums, that is, sums of the form
\[ \sum_{n=1}^{N} e(f(n)), \quad f \text{ real-valued}, \]
where we have set $e(z) = \exp(2\pi i z)$. The two main goals of this thesis were

1. to investigate methods for obtaining (non-trivial) upper bounds for the absolute value of exponential sums;

2. to investigate various applications of exponential sums to specific problems in number theory and analysis.

Different types of upper bounds were treated in Chapter 2, applications in Chapters 3–5.

One of the most important types of exponential sums consists of sums over an arithmetic progression. Using the formula for the partial sums of a geometric series, we have proved that
\[ \left| \sum_{m=1}^{M} e(\alpha m + \beta) \right| \le \min\left(M, \frac{1}{2\|\alpha\|}\right). \tag{A.1} \]
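Bound (A.1) is easy to test numerically. The sketch below is an illustration under the assumption that $\|\alpha\|$ denotes the distance from $\alpha$ to the nearest integer (the usual meaning of this notation; its definition lies outside this excerpt). The function names are hypothetical.

```python
import cmath
import math

def linear_sum(alpha, beta, M):
    """sum_{m=1}^{M} e(alpha*m + beta), with e(t) = exp(2*pi*i*t)."""
    return sum(cmath.exp(2j * math.pi * (alpha * m + beta)) for m in range(1, M + 1))

def dist_to_nearest_integer(alpha):
    """||alpha||: distance from alpha to the nearest integer (assumed meaning)."""
    return abs(alpha - round(alpha))

def bound(alpha, M):
    """min(M, 1 / (2 ||alpha||)), read as M when ||alpha|| = 0."""
    norm = dist_to_nearest_integer(alpha)
    return M if norm == 0 else min(M, 1 / (2 * norm))

for alpha in (0.001, 0.25, 0.5, math.sqrt(2)):
    for M in (10, 1000):
        print(alpha, M, abs(linear_sum(alpha, 0.3, M)), bound(alpha, M))
```

The trivial bound $M$ is only attained for $\alpha$ very close to an integer; otherwise the sum stays bounded as $M$ grows.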

Another important type of exponential sums are the $k$th-power Gauss sums. For numbers $1 \le a \le q$ with $(a, q) = 1$ these are defined as
\[ S_k(q, a) = \sum_{m=1}^{q} e\left(\frac{am^k}{q}\right). \]

For these Gauss sums we have proved the upper bound $S_k(q, a) \ll_k q^{1 - 1/k}$, and we have been able to evaluate the quadratic Gauss sums explicitly in terms of the Jacobi symbol. A final type of exponential sums which we have investigated are the Weyl sums. These are sums of the form
\[ S_f(M) = \sum_{m=1}^{M} e(f(m)), \]
where $f$ is a real polynomial. We have investigated in detail two methods which lead to upper bounds, Weyl’s method and Vinogradov’s method. The goal of both methods is to reduce the Weyl sums to sums over a linear polynomial, for which we have good estimates (see (A.1)). How this reduction is achieved differs strongly between the two methods: in Weyl’s method this is done by iteratively squaring (the absolute value of) the sum, in Vinogradov’s method by applying Vinogradov’s mean value theorem. These methods yield among others the following estimate: suppose that the leading coefficient $\alpha$ of $f$ satisfies $|\alpha - a/q| \le 1/q^2$ for certain $a$ and $q$ with $(a, q) = 1$, and suppose that $M \ll q \ll M^{k-1}$, where $k$ is the degree of $f$. Then we have for every $\varepsilon > 0$ that
\[ \sum_{m=1}^{M} e(f(m)) \ll_{k,\varepsilon} M^{1 + \varepsilon - c(k)}, \]
where
\[ c(k) = \frac{1}{2^{k-1}} \quad \text{resp.} \quad c(k) = \frac{1}{8\lceil 2k^2 \log(2k) \rceil^2} \]
for Weyl’s method resp. Vinogradov’s method.

We have studied the following three applications of exponential sums:

1. the Hardy-Littlewood method or circle method, in particular Waring’s problem and the ternary Goldbach problem;

2. the Vinogradov-Korobov zero-free region for the Riemann zeta function $\zeta$;

3. the pointwise analysis of Riemann’s “non-differentiable” function
\[ \phi_R(x) = \sum_{n=1}^{+\infty} \frac{1}{n^2\pi} \sin(n^2\pi x). \]

In the circle method, one starts from the idea of expressing the number of solutions of a Diophantine equation, such as $m_1^k + \cdots + m_s^k = n$ or $p_1 + p_2 + p_3 = n$, as an integral of an exponential sum, which now serves as a generating function. Via this method we have for example been able to prove that if $s \ge k^2 \log k\,(12 + o(1))$, every sufficiently large natural number can be written as the sum of $s$ $k$th powers, and that every sufficiently large odd natural number can be written as the sum of three primes. Starting from the estimate
\[ \sum_{n=1}^{N} n^{-it} \ll N \exp\left(-c_1 \frac{(\log N)^3}{(\log t)^2}\right), \]
for a certain $c_1 > 0$, which we have proved by means of Vinogradov’s method, we have derived the Vinogradov-Korobov zero-free region for $\zeta$: there exists a positive constant $c_2$ such that
\[ \zeta(\sigma + it) \ne 0 \quad \text{for} \quad \sigma \ge 1 - c_2\,\frac{1}{(\log|t|)^{2/3} (\log\log|t|)^{1/3}}. \]
Finally, using our knowledge of quadratic Gauss sums and a form of the Poisson summation formula, we have proved that Riemann’s function $\phi_R(x)$ is differentiable at rational points of the form $a/q$ with $(a, q) = 1$ and $a$ and $q$ both odd, and not differentiable at all other points. By means of the continuous wavelet transform we have also determined the Hölder exponent
\[ \alpha(\rho) = \sup\{\alpha > 0 \mid \phi_R(\rho + x) = \phi_R(\rho) + O(|x|^{\alpha})\} \]
at every irrational point $\rho$:
\[ \alpha(\rho) = \frac{1}{2} + \frac{1}{2\tau(\rho)}, \]
where $\tau(\rho)$ is a measure for the speed of convergence of the convergents in the continued fraction expansion of $\rho$.

Appendix B

Popular summary

This thesis is about exponential sums, that is, sums of terms of the form $e(\theta) = e^{2\pi i\theta}$. Via Euler’s formula one can write these terms as $e(\theta) = \cos(2\pi\theta) + i\sin(2\pi\theta)$, and one can represent these numbers as points in the complex plane lying on the circle with radius 1 and centre the origin. Surprisingly enough, these exponential sums have many applications, among others in number theory, the branch of mathematics which studies properties of integers. In this appendix we will explain one such application in number theory in some more detail.

B.1 Detecting solutions of equations

One of the remarkable connections between exponential sums and number theory rests on the fact that one can use exponential sums to detect solutions of equations. To explain this further, we first need the following two important properties of exponential functions.

Property B.1. For any two numbers $x$ and $y$ we have $e(x) \cdot e(y) = e(x + y)$.

Property B.2. Suppose that $n$ and $m$ are arbitrary integers. Then
\[ \int_0^1 e(nx) e(-mx)\,dx = \begin{cases} 1 & \text{if } n = m, \\ 0 & \text{if } n \ne m. \end{cases} \]

For example,
\[ \int_0^1 e(2x) e(-3x)\,dx = 0, \quad \text{and} \quad \int_0^1 e(5x) e(-5x)\,dx = 1. \]

How can one now use this property to detect solutions of equations? Consider for example the equation
\[ p_1 + p_2 = 5, \tag{B.1} \]
in the unknowns $p_1$ and $p_2$, where we only allow solutions in which both $p_1$ and $p_2$ are prime numbers. By considering the primes smaller than 5, namely 2 and 3, one quickly sees that this equation has two solutions: $(p_1, p_2) = (2, 3)$ and $(p_1, p_2) = (3, 2)$. Note that the order matters for the way in which we count the number of solutions. The fact that this equation has two solutions can be “detected” by exponential sums as follows:
\[ \int_0^1 \left(e(2x) + e(3x)\right)\left(e(2x) + e(3x)\right) e(-5x)\,dx \]
\[ = \int_0^1 \left( e((2+2)x) + e((2+3)x) + e((3+2)x) + e((3+3)x) \right) e(-5x)\,dx \]
\[ = \int_0^1 \left( e(4x)e(-5x) + e(5x)e(-5x) + e(5x)e(-5x) + e(6x)e(-5x) \right) dx \]
\[ = 0 + 1 + 1 + 0 = 2. \]

Here we expanded the brackets and used the rule of Property B.1 in the first step, and Property B.2 in the last step. The integral thus detects, as it were, that there are two solutions of equation (B.1). Let us look at another example. Suppose that we wish to detect the number of solutions of the equation
\[ p + k = 6 \tag{B.2} \]
in the unknowns $p$ and $k$, where we only allow solutions in which $p$ is a prime number and $k$ is the square of a (positive) natural number. The candidate primes are 2, 3 and 5; the candidate squares are $1 = 1^2$ and $4 = 2^2$. We see that we have two solutions, namely $(p, k) = (2, 4)$ and $(p, k) = (5, 1)$. These solutions are detected by the following integral:
\[ \int_0^1 \left(e(2x) + e(3x) + e(5x)\right)\left(e(1x) + e(4x)\right) e(-6x)\,dx \]
\[ = \int_0^1 \left( e((2+1)x) + e((2+4)x) + e((3+1)x) + e((3+4)x) + e((5+1)x) + e((5+4)x) \right) e(-6x)\,dx \]
\[ = \int_0^1 \left( e(3x)e(-6x) + e(6x)e(-6x) + e(4x)e(-6x) + e(7x)e(-6x) + e(6x)e(-6x) + e(9x)e(-6x) \right) dx \]
\[ = 0 + 1 + 0 + 0 + 1 + 0 = 2. \]

The integral equals 2, and this indeed agrees with the number of solutions of (B.2).

We can generalise this technique as follows. Suppose that we wish to count the number of solutions of the following equation,
\[ a + b = n, \tag{B.3} \]
in the unknowns $a$ and $b$, and for a certain fixed number $n$. Suppose that we have two categories of natural numbers at our disposal, and that we only allow $a$ to be a number of the first category, and $b$ a number of the second category. This situation generalises the two previous examples. In equation (B.1) our fixed number is $n = 5$, and both categories are the category of the prime numbers. In equation (B.2) our fixed number is $n = 6$, the first category is that of the prime numbers, and the second that of the squares. We can now set up a general formula to count the number of solutions of equation (B.3):
\[ \text{number of solutions} = \int_0^1 \left(\text{exp sum 1st cat}\right)\left(\text{exp sum 2nd cat}\right) e(-nx)\,dx. \tag{B.4} \]
Here “exp sum 1st cat” is the exponential sum running over all candidate solutions of the first category, and “exp sum 2nd cat” the exponential sum running over all candidate solutions of the second category. In the first example these two sums are equal to each other and equal to
\[ e(2x) + e(3x). \]

In the second example the two exponential sums are equal to
\[ e(2x) + e(3x) + e(5x), \quad \text{respectively} \quad e(1x) + e(4x). \]

What actually happens in this general formula? The exponential sums represent all candidate solutions of a certain category. When we multiply two exponential sums, then, as a consequence of Property B.1, all possible sums of these candidates are generated in a new exponential sum. By Property B.2, the integral against $e(-nx)$ “filters” out of this exactly those terms which represent a sum equal to $n$.
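The detection formula (B.4) can be tried out on a computer. The sketch below is an illustration, not part of the thesis: it evaluates the integral exactly by averaging over $N$ equally spaced points, which is legitimate because the integrand is a finite sum of terms $e(jx)$ with integer $|j| < N$. It assumes that all candidates and $n$ are non-negative integers; the function names are hypothetical.

```python
import cmath
import math

def exp_sum(candidates, x):
    """Exponential sum over a list of candidate integers, evaluated at x."""
    return sum(cmath.exp(2j * math.pi * c * x) for c in candidates)

def count_solutions(first, second, n):
    """Number of pairs (a, b), a in `first`, b in `second`, with a + b = n.

    Computes the integral in (B.4) as an average over N sample points; this is
    exact (up to rounding) because every frequency a + b - n is smaller than N
    in absolute value.
    """
    N = max(first) + max(second) + n + 1
    total = sum(exp_sum(first, j / N) * exp_sum(second, j / N)
                * cmath.exp(-2j * math.pi * n * j / N) for j in range(N))
    return round((total / N).real)

print(count_solutions([2, 3], [2, 3], 5))        # equation (B.1)
print(count_solutions([2, 3, 5], [1, 4], 6))     # equation (B.2)
```

Both calls reproduce the value 2 found by hand above.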

How is this “detection formula” (B.4) now used in mathematical problems? In number theory, many problems can be formulated as the search for solutions of equations. There is for example the famous Goldbach conjecture: every even number greater than 2 can be written as the sum of two primes. We can reformulate this by saying that for every even number $n > 2$, the equation
\[ p_1 + p_2 = n \]
has at least one solution. Once we have an equation, we have seen that formula (B.4) gives us the number of solutions. In the two worked examples we computed the integral explicitly, starting from the solutions which we had found for the equation. If one wants to show, for example, that a certain equation has solutions for every even number, then one cannot possibly solve all these equations explicitly, because there are infinitely many of them. In applications one therefore works in the other direction: if one can show in one way or another that the integral (B.4) yields a strictly positive number (and is thus not equal to 0), then we must conclude from this that our original equation has at least one solution! The remarkable thing is that we have then been able to prove that an equation has solutions, without knowing what these solutions are. Indeed, the value of the integral only tells us something about the number of solutions, but says nothing about what these solutions precisely are. It is often the case that one can say something about the integral for general $n$, and in this way one can draw conclusions about a whole family of equations at once.

In my thesis I study such exponential sums, among others with a view to such problems in number theory. The Goldbach conjecture dates from 1742, but it has still not been proved (and no counterexamples have been found either: every even natural number which has been checked so far can indeed be written as the sum of two primes). Fairly recently, however, a weaker variant of the conjecture has been proved, the so-called ternary or weak Goldbach conjecture: every odd number greater than 5 can be written as the sum of three primes. This was proved in 2013 by the Peruvian mathematician H.A. Helfgott [16]. His proof is extremely difficult and long and consists of very deep mathematics. The underlying idea, however, corresponds in its simplest form to what we have described: through a thorough study of exponential sums, Helfgott was able to show that the detection integral in the ternary Goldbach conjecture is strictly positive for every odd number greater than 5!

Appendix C

Additional theorems

C.1 Diophantine approximation

In this section, some well known and elementary theorems about approximating real numbers by rationals (which is called Diophantine approximation, after Diophantus of Alexandria) will be stated and proved. This appendix is based on Chapter 1 of the lecture notes by Schmidt [32]. We begin with an important theorem due to Dirichlet. Theorem C.1.1 (Dirichlet). Suppose that α and Q are real numbers with Q ≥ 1. Then there exist a and q such that 1 ≤ q ≤ Q, (a, q) = 1 and α − a/q ≤ 1/(Qq).

Proof. Consider the numbers $\beta_q = \{\alpha q\} = \alpha q - [\alpha q]$ for $q = 1, \dots, [Q]$, and consider the $[Q] + 1$ intervals

$$I_r = \left[\frac{r - 1}{[Q] + 1}, \frac{r}{[Q] + 1}\right) \quad \text{for } r = 1, \dots, [Q] + 1.$$

If there is a $\beta_q$ in $I_1$ resp. $I_{[Q]+1}$, then we can take $a = [\alpha q]$ resp. $a = [\alpha q] + 1$ and we are done. If not, then by the pigeonhole principle, there is an interval $I_r$ which contains two of the $\beta_q$, say $\beta_u$ and $\beta_v$ with $u < v$. Then we can take $q = v - u$, $a = [\alpha v] - [\alpha u]$. (If $a$ and $q$ are not coprime, divide both by their greatest common divisor; the upper bound then holds a fortiori.)
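For a concrete $\alpha$ and $Q$, a pair $(a, q)$ as in Theorem C.1.1 can be found by direct search. The following Python sketch does not follow the pigeonhole argument; it simply tries every $q \le Q$ together with the integer nearest to $\alpha q$, which must succeed because the theorem guarantees that a suitable pair exists. The function name is illustrative, and $\alpha$ is passed as an exact `Fraction` so that no rounding is involved.

```python
from fractions import Fraction
from math import floor, gcd


def dirichlet_approximation(alpha, Q):
    """Return coprime (a, q) with 1 <= q <= Q and |alpha - a/q| <= 1/(Q*q).

    alpha is a Fraction; Q is an int or a Fraction with Q >= 1.
    """
    for q in range(1, floor(Q) + 1):
        a = floor(alpha * q + Fraction(1, 2))      # integer nearest to alpha*q
        # |alpha - a/q| <= 1/(Q*q) is equivalent to |alpha*q - a| <= 1/Q.
        if abs(alpha * q - a) <= Fraction(1) / Q:
            g = gcd(a, q)                          # reduce; the bound only improves
            return a // g, q // g
    raise AssertionError("cannot happen by Dirichlet's theorem")


alpha = Fraction(314159, 100000)                   # a rational stand-in for pi
print(dirichlet_approximation(alpha, 10))          # (22, 7)
```

With $Q = 10$ the search stops at $q = 7$, since $|7\alpha - 22| = 0.00887 \le 1/10$, while every smaller $q$ misses the bound.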

An efficient way of generating good rational approximations of a given irrational number is by means of continued fractions.

Definition C.1.2. Consider the variables $a_0, a_1, a_2, \dots$ indexed by $\mathbb{N}$. A (simple) continued fraction $[a_0, a_1, \dots, a_n]$ is the rational function given by

$$a_0 + \cfrac{1}{a_1 + \cfrac{1}{a_2 + \cfrac{1}{\ddots + \cfrac{1}{a_n}}}}.$$

Since this is a rational function in the variables $a_0, a_1, \dots, a_n$, there exist polynomials $p_n$ and $q_n$ in $n + 1$ variables such that

$$[a_0, a_1, \dots, a_n] = \frac{p_n(a_0, a_1, \dots, a_n)}{q_n(a_0, a_1, \dots, a_n)}.$$


Since
$$\frac{p_n(a_0, a_1, \dots, a_n)}{q_n(a_0, a_1, \dots, a_n)} = a_0 + \cfrac{1}{\cfrac{p_{n-1}(a_1, a_2, \dots, a_n)}{q_{n-1}(a_1, a_2, \dots, a_n)}},$$
we define $p_0(a_0) = a_0$, $q_0(a_0) = 1$ and inductively

$$p_n(a_0, a_1, \dots, a_n) = a_0 p_{n-1}(a_1, a_2, \dots, a_n) + q_{n-1}(a_1, a_2, \dots, a_n),$$

$$q_n(a_0, a_1, \dots, a_n) = p_{n-1}(a_1, a_2, \dots, a_n).$$

If we write $p_n$ for $p_n(a_0, a_1, \dots, a_n)$ and $p_n'$ for $p_n(a_1, a_2, \dots, a_{n+1})$ (and the same for the $q_n$), this can be written as

$$p_n = a_0 p_{n-1}' + q_{n-1}', \qquad q_n = p_{n-1}'.$$

We will first prove some useful identities for the polynomials $p_n$ and $q_n$.

Lemma C.1.3. For $n \ge 2$, we have

$$p_n = a_n p_{n-1} + p_{n-2},$$

$$q_n = a_n q_{n-1} + q_{n-2}.$$

Proof. Via induction on $n$. For $n = 2$, this is clear. Suppose it holds for $n - 1$. Then we have

$$p_{n-1}' = a_n p_{n-2}' + p_{n-3}', \qquad q_{n-1}' = a_n q_{n-2}' + q_{n-3}'.$$

Therefore

$$p_n = a_0 p_{n-1}' + q_{n-1}' = a_0 (a_n p_{n-2}' + p_{n-3}') + a_n q_{n-2}' + q_{n-3}' = a_n p_{n-1} + p_{n-2},$$

$$q_n = p_{n-1}' = a_n p_{n-2}' + p_{n-3}' = a_n q_{n-1} + q_{n-2}.$$
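The recurrence of Lemma C.1.3 gives a fast way to compute all $p_n$ and $q_n$ in one pass. The Python sketch below is an illustration with made-up function names: it builds the pairs $(p_n, q_n)$ from the recurrence and compares each quotient with the continued fraction evaluated directly from the inside out. The sample partial quotients $3, 7, 15, 1, 292$ are chosen only as an example.

```python
from fractions import Fraction


def evaluate(a):
    """Evaluate [a[0], a[1], ..., a[-1]] from the innermost term outwards."""
    value = Fraction(a[-1])
    for term in reversed(a[:-1]):
        value = term + 1 / value
    return value


def convergents(a):
    """Return the pairs (p_n, q_n), using p_n = a_n p_{n-1} + p_{n-2} and likewise for q_n."""
    p_prev, q_prev = a[0], 1                     # p_0, q_0
    result = [(p_prev, q_prev)]
    if len(a) == 1:
        return result
    p_cur, q_cur = a[0] * a[1] + 1, a[1]         # p_1, q_1 from the definition
    result.append((p_cur, q_cur))
    for a_n in a[2:]:
        p_prev, p_cur = p_cur, a_n * p_cur + p_prev
        q_prev, q_cur = q_cur, a_n * q_cur + q_prev
        result.append((p_cur, q_cur))
    return result


a = [3, 7, 15, 1, 292]
for n, (p, q) in enumerate(convergents(a)):
    assert Fraction(p, q) == evaluate(a[: n + 1])
    print(n, p, q)
```

The direct evaluation has to restart for every $n$, whereas the recurrence reuses the two previous pairs, which is why it is the usual choice in practice.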

Lemma C.1.4. For $n \ge 1$ we have

$$q_n p_{n-1} - p_n q_{n-1} = (-1)^n, \qquad q_{n+1} p_{n-1} - p_{n+1} q_{n-1} = (-1)^n a_{n+1}.$$

Proof. Again via induction on $n$. The case $n = 1$ is easily verified. Suppose it holds for $n - 1$. In view of Lemma C.1.3, we obtain

$$q_n p_{n-1} - p_n q_{n-1} = (a_n q_{n-1} + q_{n-2}) p_{n-1} - (a_n p_{n-1} + p_{n-2}) q_{n-1} = -q_{n-1} p_{n-2} + p_{n-1} q_{n-2} = (-1)^n$$

and

$$q_{n+1} p_{n-1} - p_{n+1} q_{n-1} = (a_{n+1} q_n + q_{n-1}) p_{n-1} - (a_{n+1} p_n + p_{n-1}) q_{n-1} = a_{n+1} (-1)^n.$$

Lemma C.1.5. Set $\alpha = [a_0, a_1, \dots, a_{n+1}]$. Then
$$q_n \alpha - p_n = \frac{(-1)^n}{a_{n+1} q_n + q_{n-1}}.$$

Proof. The previous lemmas (Lemma C.1.3 and Lemma C.1.4) yield

$$q_n \alpha - p_n = q_n \frac{p_{n+1}}{q_{n+1}} - p_n = \frac{q_n p_{n+1} - p_n q_{n+1}}{q_{n+1}} = \frac{(-1)^n}{a_{n+1} q_n + q_{n-1}}.$$
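The identities of Lemma C.1.4 and Lemma C.1.5 are easy to test with exact integer arithmetic. The following Python sketch checks them for one arbitrary list of partial quotients; the list and the helper name are chosen for illustration only, and a passing check is of course no substitute for the proofs.

```python
from fractions import Fraction


def convergents(a):
    """Lists p, q with p[n]/q[n] = [a_0, ..., a_n], built with Lemma C.1.3 (needs len(a) >= 2)."""
    p = [a[0], a[0] * a[1] + 1]
    q = [1, a[1]]
    for a_n in a[2:]:
        p.append(a_n * p[-1] + p[-2])
        q.append(a_n * q[-1] + q[-2])
    return p, q


a = [1, 2, 3, 4, 5, 6]            # arbitrary sample with a_1, a_2, ... positive
p, q = convergents(a)

# Lemma C.1.4, first identity.
for n in range(1, len(a)):
    assert q[n] * p[n - 1] - p[n] * q[n - 1] == (-1) ** n

# Lemma C.1.4, second identity.
for n in range(1, len(a) - 1):
    assert q[n + 1] * p[n - 1] - p[n + 1] * q[n - 1] == (-1) ** n * a[n + 1]

# Lemma C.1.5 with alpha = [a_0, ..., a_{n+1}] = p_{n+1} / q_{n+1}.
for n in range(1, len(a) - 1):
    alpha = Fraction(p[n + 1], q[n + 1])
    assert q[n] * alpha - p[n] == Fraction((-1) ** n, a[n + 1] * q[n] + q[n - 1])

print(list(zip(p, q)))
```

The first identity says that the $2 \times 2$ matrix with rows $(p_{n-1}, q_{n-1})$ and $(p_n, q_n)$ has determinant $\pm 1$, which is the fact used later in the proof of Theorem C.1.10.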

It is obvious that a (finite) continued fraction is a rational number, and it is not hard to show that every rational number equals a finite continued fraction. The following lemma and theorem state that this remains true for irrational numbers if one replaces finite with infinite.

Lemma C.1.6. Suppose $a_0, a_1, a_2, \dots$ are integers with $a_1, a_2, \dots$ positive. Then $\rho = \lim_n [a_0, a_1, \dots, a_n]$ exists and is irrational. Moreover,
$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \cdots < \rho < \cdots < \frac{p_3}{q_3} < \frac{p_1}{q_1}.$$

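Proof. By Lemma C.1.3, $q_0 = 1$, $q_1 = a_1$ and $q_n = a_n q_{n-1} + q_{n-2}$, so that $(q_n)_n$ is an increasing sequence of positive integers. By Lemma C.1.4,

$$\frac{p_{n-2}}{q_{n-2}} - \frac{p_n}{q_n} = \frac{(-1)^{n-1} a_n}{q_{n-2} q_n}.$$

If $n \ge 2$ is even, this is negative; if $n \ge 3$ is odd, this is positive. Hence the convergents with even index increase and those with odd index decrease. Now suppose $n$ is even and $m$ is odd. If $n < m$, then
$$\frac{p_n}{q_n} \le \frac{p_{m-1}}{q_{m-1}} < \frac{p_m}{q_m},$$

where the last inequality follows from Lemma C.1.4: $q_m p_{m-1} - p_m q_{m-1} = (-1)^m < 0$. The case $n > m$ is analogous. Hence we have
$$\frac{p_0}{q_0} < \frac{p_2}{q_2} < \cdots < \cdots < \frac{p_3}{q_3} < \frac{p_1}{q_1}.$$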

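Since $(p_{2n}/q_{2n})_n$ is an increasing sequence which is bounded from above, it converges to a number $\rho$, and similarly $(p_{2n+1}/q_{2n+1})_n$ converges to a number $\tilde\rho$. In view of the identity
$$\frac{p_{n-1}}{q_{n-1}} - \frac{p_n}{q_n} = \frac{(-1)^n}{q_{n-1} q_n}$$
and the fact that $1/(q_{n-1} q_n) \to 0$, the limits $\rho$ and $\tilde\rho$ are equal. By Lemma C.1.4,

$$\left|\rho - \frac{p_n}{q_n}\right| < \left|\frac{p_{n+1}}{q_{n+1}} - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2}. \tag{C.1}$$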

If $\rho$ were rational, say $\rho = p/q$, then since $p_n/q_n \ne \rho$,

$$\left|\frac{p}{q} - \frac{p_n}{q_n}\right| = \frac{|q_n p - p_n q|}{q q_n} \ge \frac{1}{q q_n}.$$

This contradicts (C.1) when we take $n$ sufficiently large such that $q_n > q$.

If $\rho = \lim_n [a_0, a_1, \dots, a_n]$, we also write $\rho = [a_0, a_1, a_2, \dots]$.

Theorem C.1.7. Suppose $\rho$ is irrational. Then there exist integers $a_0, a_1, \dots$ with $a_1, a_2, \dots$ positive such that $\rho = [a_0, a_1, a_2, \dots]$. Moreover, we have

$$\left|\rho - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2},$$
and the integers $a_0, a_1, a_2, \dots$ are unique.

Proof. Define $\rho_0 = \rho$, $a_0 = [\rho]$ (the integral part, not the $0$th continued fraction), and recursively define $\rho_n$ and $a_n$ via
$$\rho_{n-1} = a_{n-1} + \frac{1}{\rho_n}, \qquad a_n = [\rho_n].$$

This is well defined, since each $\rho_{n-1}$ is irrational, so $0 < \rho_{n-1} - a_{n-1} < 1$. This also implies that for $n \ge 1$, $\rho_n > 1$ and $a_n \ge 1$. Note that for each $n$, $\rho = [a_0, a_1, \dots, a_n, \rho_{n+1}]$. By Lemma C.1.5,
$$q_n \rho - p_n = \frac{(-1)^n}{\rho_{n+1} q_n + q_{n-1}}.$$

Now $q_0 = 1$, $q_1 = a_1$, and $q_n = a_n q_{n-1} + q_{n-2}$, so that $(q_n)_n$ is an increasing sequence of positive integers. Since $\rho_{n+1} > 1$, we have that

$$\left|\rho - \frac{p_n}{q_n}\right| < \frac{1}{q_n^2},$$
so that indeed
$$\rho = \lim_{n \to \infty} [a_0, a_1, \dots, a_n].$$
To prove uniqueness, suppose
$$\rho = [b_0, b_1, b_2, \dots] = b_0 + \frac{1}{[b_1, b_2, \dots]},$$
with $b_n$ integers which are positive for $n \ge 1$. Since $[b_1, b_2, \dots] > b_1 \ge 1$, we have $0 < \rho - b_0 < 1$. Therefore $b_0 = [\rho] = a_0$, and $[b_1, b_2, \dots] = \rho_1$. Continuing in this fashion, we see that $b_n = a_n$ for all $n \in \mathbb{N}$.
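The recursion in the proof is an algorithm: take the integral part, invert the fractional part, and repeat. The Python sketch below runs it on $\rho = \sqrt{2}$, which is chosen only as an example. Because an irrational number cannot be stored exactly, the sketch assumes that 60 decimal digits of working precision are enough for the first ten partial quotients; for more terms the precision would have to grow.

```python
from decimal import Decimal, getcontext
from math import floor

getcontext().prec = 60          # working precision, an assumption for this example


def partial_quotients(rho, count):
    """First `count` partial quotients, via rho_{n-1} = a_{n-1} + 1/rho_n and a_n = [rho_n]."""
    quotients = []
    for _ in range(count):
        a_n = floor(rho)
        quotients.append(a_n)
        rho = 1 / (rho - a_n)   # rho_n; valid because rho_{n-1} is not an integer
    return quotients


rho = Decimal(2).sqrt()
a = partial_quotients(rho, 10)

# Convergents from the recurrence p_n = a_n p_{n-1} + p_{n-2} (same for q_n).
p, q = [a[0], a[0] * a[1] + 1], [1, a[1]]
for a_n in a[2:]:
    p.append(a_n * p[-1] + p[-2])
    q.append(a_n * q[-1] + q[-2])

# The bound of Theorem C.1.7: |rho - p_n/q_n| < 1/q_n^2.
for p_n, q_n in zip(p, q):
    assert abs(rho - Decimal(p_n) / Decimal(q_n)) < 1 / Decimal(q_n) ** 2

print(a)
print(list(zip(p, q)))
```

For $\sqrt{2}$ the partial quotients are $1, 2, 2, 2, \dots$, and the convergents $1/1, 3/2, 7/5, 17/12, \dots$ approach $\sqrt{2}$ alternately from below and from above, as Lemma C.1.6 predicts.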

Remark C.1.8. By Lemma C.1.4, (pn, qn) = 1, so pn/qn is in reduced form.

Definition C.1.9. Suppose $\rho$ is irrational, and let $a_0, a_1, \dots$ be as in the proof of Theorem C.1.7. The integer $a_n$ is called the $n$th partial quotient of $\rho$, and $p_n/q_n$ is called the $n$th convergent to $\rho$.

The following theorem by Lagrange is called the “law of best approximation”, and shows that the convergents to an irrational $\rho$ are in some sense the rationals which approximate $\rho$ the best.

Theorem C.1.10 (Lagrange). Suppose ρ is irrational, and let (pn/qn)n be the sequence of convergents to ρ. Then

• $|\rho q_0 - p_0| > |\rho q_1 - p_1| > |\rho q_2 - p_2| > \cdots$;

• if $n \ge 1$ and $1 \le q \le q_n$, and if $p \ne p_{n-1}, p_n$, $q \ne q_{n-1}, q_n$, then $|\rho q - p| > |\rho q_{n-1} - p_{n-1}|$.

Proof. With the notation from the proof of Theorem C.1.7 and by Lemma C.1.5, we have
$$|\rho q_n - p_n| = \frac{1}{\rho_{n+1} q_n + q_{n-1}} < \frac{1}{q_n + q_{n-1}}$$
and
$$|\rho q_{n-1} - p_{n-1}| = \frac{1}{\rho_n q_{n-1} + q_{n-2}} > \frac{1}{(a_n + 1) q_{n-1} + q_{n-2}} = \frac{1}{q_n + q_{n-1}},$$
which proves the first part. For the second part, define $\mu, \nu$ by the equations

$$\mu p_n + \nu p_{n-1} = p,$$

$$\mu q_n + \nu q_{n-1} = q.$$

By Lemma C.1.4, the matrix of this system has determinant $\pm 1$, and hence is invertible over $\mathbb{Z}$. Therefore $\mu$ and $\nu$ are integers. Since $p \ne p_n$ and $q \ne q_n$, $\nu \ne 0$. If $\mu = 0$, then $p = \nu p_{n-1}$, $q = \nu q_{n-1}$ and since $q \ne q_{n-1}$, we have $\nu \ge 2$. Then

$$|\rho q - p| \ge 2 |\rho q_{n-1} - p_{n-1}| > |\rho q_{n-1} - p_{n-1}|.$$

If both $\mu$ and $\nu$ are nonzero, they must be of opposite sign, since $q = \mu q_n + \nu q_{n-1}$ and $1 \le q \le q_n$. Since by Lemma C.1.6, also $\rho q_n - p_n$ and $\rho q_{n-1} - p_{n-1}$ are of opposite sign, $\mu(\rho q_n - p_n)$ and $\nu(\rho q_{n-1} - p_{n-1})$ have the same sign. Therefore

$$|\rho q - p| = |\mu| |\rho q_n - p_n| + |\nu| |\rho q_{n-1} - p_{n-1}| > |\rho q_{n-1} - p_{n-1}|.$$

We remark that Lagrange’s theorem also implies that for each $n \in \mathbb{N}^+$

$$\left|\rho - \frac{p_n}{q_n}\right| < \left|\rho - \frac{p_{n-1}}{q_{n-1}}\right|.$$
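The law of best approximation can be checked by exhaustive search for a specific number. The Python sketch below does this for $\rho = \sqrt{2}$, using its convergents and a fixed decimal precision (both are choices made for this illustration). It reads the exclusion in the second part of Theorem C.1.10 as a condition on the pair $(p, q)$, that is, $(p, q)$ differs from $(p_{n-1}, q_{n-1})$ and from $(p_n, q_n)$; this reading is an assumption, and it is the one the proof actually uses.

```python
from decimal import Decimal, getcontext
from math import floor

getcontext().prec = 50

rho = Decimal(2).sqrt()

# Convergents of sqrt(2) = [1, 2, 2, 2, ...] from the recurrence of Lemma C.1.3.
p, q = [1, 3], [1, 2]
for _ in range(6):
    p.append(2 * p[-1] + p[-2])
    q.append(2 * q[-1] + q[-2])


def error(P, Q):
    """Return |rho*Q - P|."""
    return abs(rho * Q - P)


# First part: the errors |rho*q_n - p_n| are strictly decreasing.
errors = [error(p_n, q_n) for p_n, q_n in zip(p, q)]
assert all(errors[n] > errors[n + 1] for n in range(len(errors) - 1))


def violations(n):
    """Pairs (P, Q) with 1 <= Q <= q_n, other than the two convergents,
    for which |rho*Q - P| <= |rho*q_{n-1} - p_{n-1}|; the theorem says there are none."""
    excluded = {(p[n - 1], q[n - 1]), (p[n], q[n])}
    found = []
    for Q in range(1, q[n] + 1):
        base = floor(rho * Q)
        for P in (base, base + 1):          # the two integers closest to rho*Q
            if (P, Q) not in excluded and error(P, Q) <= error(p[n - 1], q[n - 1]):
                found.append((P, Q))
    return found


assert all(violations(n) == [] for n in range(1, len(q)))
print([float(err) for err in errors])
```

Only the two integers nearest to $\rho q$ are tried for each $q$, since any other $p$ gives an error larger than $1$.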

C.2 The Poisson summation formula

Theorem C.2.1. Suppose $f : \mathbb{R} \to \mathbb{C}$ is piecewise $C^1$ with a finite number of discontinuities, and suppose $\|f\|_1, \|f'\|_1 < \infty$. Then

$$\sum_{n \in \mathbb{Z}} \frac{f(n+) + f(n-)}{2} = \lim_{K \to +\infty} \sum_{k=-K}^{K} \hat{f}(2\pi k).$$

Proof. For $n \le y \le n + 1$, we have
$$f(y) = \int_n^{n+1} f(x)\, dx + \int_n^{y} (x - n)\, df(x) + \int_y^{n+1} (x - n - 1)\, df(x),$$
by integration by parts. Therefore,
$$|f(y)| \le \int_n^{n+1} |f(x)|\, dx + \int_n^{n+1} |f'(x)|\, dx + O(1),$$
where the $O(1)$ term is the contribution from the discontinuities of $f$ in $[n, n + 1]$, which is zero when $|n|$ is sufficiently large. From this and the hypothesis on $f$, it follows that
$$F(y) = \sum_{n \in \mathbb{Z}} f(n + y)$$
is well defined and converges absolutely and uniformly. The function $F$ is 1-periodic and also piecewise $C^1$, so its Fourier series converges to $(F(y+) + F(y-))/2$ for every $y$. The $k$th Fourier coefficient of $F$ is given by

$$\int_0^1 F(y) e(-ky)\, dy = \sum_{n \in \mathbb{Z}} \int_0^1 f(n + y) e(-ky)\, dy = \int_{\mathbb{R}} f(x) e(-kx)\, dx = \hat{f}(2\pi k),$$
where we swapped the integration with the summation since the convergence is uniform. Putting $y = 0$ gives

$$\sum_{n \in \mathbb{Z}} \frac{f(n+) + f(n-)}{2} = \lim_{K \to +\infty} \sum_{k=-K}^{K} \hat{f}(2\pi k).$$

The Poisson summation formula also holds under weaker assumptions, for example when f is integrable and of bounded variation on R. A distributional version of the Poisson summation formula is investigated in [5].
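As a sanity check, the formula can be tested numerically on a Gaussian. The Python sketch below assumes the normalisation $e(x) = e^{2\pi i x}$, so that $\hat{f}(2\pi k) = \int_{\mathbb{R}} f(x) e^{-2\pi i k x}\, dx$ as in the proof above, and takes $f(x) = e^{-\pi t x^2}$ with $t > 0$, for which this integral equals $t^{-1/2} e^{-\pi k^2 / t}$ (a standard Gaussian integral). The function names and the cutoff are illustrative choices.

```python
from math import exp, pi, sqrt


def f(x, t):
    """Gaussian test function f(x) = exp(-pi * t * x^2), with t > 0."""
    return exp(-pi * t * x * x)


def f_hat_at_2pi_k(k, t):
    """Closed form of the integral of f(x) * exp(-2*pi*i*k*x) over the real line."""
    return exp(-pi * k * k / t) / sqrt(t)


def both_sides(t, cutoff=50):
    """Truncated left- and right-hand sides of the Poisson summation formula."""
    left = sum(f(n, t) for n in range(-cutoff, cutoff + 1))
    right = sum(f_hat_at_2pi_k(k, t) for k in range(-cutoff, cutoff + 1))
    return left, right


for t in (0.5, 2.0, 3.7):
    left, right = both_sides(t)
    print(t, left, right)
```

Since $f$ is continuous, the left-hand side is simply $\sum_n f(n)$. Both sums converge so quickly that the truncation at $|n|, |k| \le 50$ has no visible effect, and the two columns agree to machine precision.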

C.3 The continuous wavelet transform

Definition C.3.1.

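• A wavelet is a function $g \in L^\infty(\mathbb{R}) \cap L^1(\mathbb{R})$ for which
$$\int_{\mathbb{R}} g(x)\, dx = 0.$$

• Given a function $f \in L^\infty(\mathbb{R})$, the continuous wavelet transform $W_g f$ of $f$ with respect to $g$ is defined as

$$W_g f(b, a) = \int_{-\infty}^{+\infty} \frac{1}{a} g\left(\frac{x - b}{a}\right) f(x)\, dx, \qquad b \in \mathbb{R},\ a > 0.$$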
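To get a feeling for the definition, the transform can be approximated by a simple quadrature rule. The Python sketch below uses the "Mexican hat" function $g(x) = (1 - x^2) e^{-x^2/2}$, which is bounded, integrable and has integral zero; this wavelet, the test function $f(x) = \cos(\omega x)$ and the numerical parameters are all choices made for illustration and do not come from the thesis. For this pair one can compute by hand that $W_g f(b, a) = \sqrt{2\pi}\, (a\omega)^2 e^{-(a\omega)^2/2} \cos(\omega b)$, which gives something to compare against.

```python
from math import cos, exp, pi, sqrt


def g(x):
    """Mexican hat wavelet (1 - x^2) * exp(-x^2 / 2); its integral over R is zero."""
    return (1 - x * x) * exp(-x * x / 2)


def wavelet_transform(f, b, a, half_width=12.0, steps=20000):
    """Midpoint-rule approximation of W_g f(b, a).

    Substituting x = b + a*u turns the defining integral into the integral of
    g(u) * f(b + a*u) du, truncated here to |u| <= half_width because g decays
    like a Gaussian.
    """
    du = 2 * half_width / steps
    total = 0.0
    for i in range(steps):
        u = -half_width + (i + 0.5) * du
        total += g(u) * f(b + a * u)
    return total * du


omega, b, a = 3.0, 0.4, 0.5
numeric = wavelet_transform(lambda x: cos(omega * x), b, a)
exact = sqrt(2 * pi) * (a * omega) ** 2 * exp(-(a * omega) ** 2 / 2) * cos(omega * b)
print(numeric, exact)
```

The closed form shows that, for fixed $\omega$, the transform is largest at the scale $a = \sqrt{2}/\omega$: the wavelet picks out oscillations whose wavelength matches its own width. Constant functions are annihilated, because $g$ has integral zero.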

In many cases, the analyzed function f can be recovered from its wavelet transform: under suitable conditions on f, g, and h we have

$$f(x) = \int_0^{+\infty} \frac{da}{a} \int_{-\infty}^{+\infty} \frac{1}{a} h\left(\frac{x - b}{a}\right) W_g f(b, a)\, db.$$

We refer to the book by Holschneider [18] for theorems which make these conditions explicit.

The continuous wavelet transform is very useful for investigating the local regularity of the analyzed function $f$. Indeed, there is a correspondence between the decay of the wavelet transform near the real axis and the Hölder exponent of $f$ at a given point. Explicitly, we have the following proposition, which is essentially due to Jaffard [24].

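Proposition C.3.2. Suppose the function $f$ has a scale-space representation
$$f(x) = \int_0^{+\infty} \frac{da}{a} \int_{-\infty}^{+\infty} \frac{1}{a} h\left(\frac{x - b}{a}\right) T(b, a)\, db,$$
where the reconstruction wavelet $h$ satisfies $h \in C^1(\mathbb{R})$ and

$$|h(x)| + |h'(x)| \ll \frac{1}{(1 + |x|)^2}.$$
If the scale-space coefficients $T(b, a)$ satisfy the bounds

$$T(b, a) \ll a^{\alpha} \left(1 + \frac{|b - x_0|}{a}\right)^{\alpha'} \quad \text{for } a < \delta \text{ with } 0 < \alpha' < \alpha < 1, \tag{C.2a}$$
$$T(b, a) \ll a^{\beta} \quad \text{for } a \ge \delta \text{ with } \beta < 1, \tag{C.2b}$$

then for $|x - x_0| \le \delta/2$ we have $f(x) - f(x_0) \ll |x - x_0|^{\alpha}$.

Proof. We have
$$f(x) - f(x_0) = \int_0^{+\infty} \frac{da}{a} \int_{-\infty}^{+\infty} \frac{1}{a} \left[ h\left(\frac{x - b}{a}\right) - h\left(\frac{x_0 - b}{a}\right) \right] T(b, a)\, db.$$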

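We split the integral over $a$ into three parts: $I_1$ is from $\delta$ to $+\infty$, $I_2$ from $|x - x_0|$ to $\delta$ and $I_3$ from $0$ to $|x - x_0|$. Set
$$\omega(a, y) = \int_{-\infty}^{+\infty} \frac{1}{a} h\left(\frac{y - b}{a}\right) T(b, a)\, db.$$
For fixed $a$, the function $\omega(a, y)$ is differentiable with respect to $y$, and
$$\frac{\partial \omega}{\partial y}(a, y) = \int_{-\infty}^{+\infty} \frac{1}{a^2} h'\left(\frac{y - b}{a}\right) T(b, a)\, db.$$

If $a \ge \delta$, then by (C.2b) and by the bound on $h'$, $\frac{\partial \omega}{\partial y}(a, y) \ll a^{\beta - 1}$. Hence by the mean value theorem (with $\eta_a$ between $x$ and $x_0$),
$$I_1 \ll |x - x_0| \int_\delta^{+\infty} \frac{da}{a} \left|\frac{\partial \omega}{\partial y}(a, \eta_a)\right| \ll_\delta |x - x_0|,$$
since $\beta < 1$. Now if $a < \delta$, then by (C.2a),

$$\frac{\partial \omega}{\partial y}(a, y) \ll \int_{-\infty}^{+\infty} \frac{1}{a^2} \left|h'\left(\frac{y - b}{a}\right)\right| a^{\alpha} \left(1 + \frac{|b - x_0|}{a}\right)^{\alpha'} db = a^{\alpha - 1} \int_{-\infty}^{+\infty} |h'(b)| \left(1 + \frac{|y - x_0 - ab|}{a}\right)^{\alpha'} db.$$
Note that
$$\left(1 + \frac{|y - x_0 - ab|}{a}\right)^{\alpha'} \le \left(1 + \frac{|y - x_0|}{a} + |b|\right)^{\alpha'} \le \left(1 + \frac{|y - x_0|}{a}\right)^{\alpha'} + |b|^{\alpha'},$$
so by the bound on $h'$ and since $\alpha' < 1$,

$$\frac{\partial \omega}{\partial y}(a, y) \ll a^{\alpha - 1} \left(1 + \frac{|y - x_0|}{a}\right)^{\alpha'}.$$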

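Again using the mean value theorem (with $\eta_a$ between $x$ and $x_0$),

$$I_2 \ll |x - x_0| \int_{|x - x_0|}^{\delta} \left|\frac{\partial \omega}{\partial y}(a, \eta_a)\right| \frac{da}{a} \ll |x - x_0| \int_{|x - x_0|}^{\delta} a^{\alpha - 1} \left(1 + \frac{|x - x_0|}{a}\right)^{\alpha'} \frac{da}{a} = |x - x_0|^{\alpha} \int_1^{\delta / |x - x_0|} a^{\alpha - 1} \left(1 + \frac{1}{a}\right)^{\alpha'} \frac{da}{a} \ll |x - x_0|^{\alpha},$$
since $\alpha < 1$. For $a < |x - x_0| \le \delta$, one has by the bound on $h$ (analogous to the above)

$$\omega(a, y) \ll a^{\alpha} \left(1 + \frac{|y - x_0|}{a}\right)^{\alpha'}.$$

Therefore,

$$I_3 \ll \int_0^{|x - x_0|} a^{\alpha} \left(1 + \frac{|x - x_0|}{a}\right)^{\alpha'} \frac{da}{a} = |x - x_0|^{\alpha} \int_0^1 a^{\alpha} \left(1 + \frac{1}{a}\right)^{\alpha'} \frac{da}{a}.$$
The proof is now complete by noting that

$$\int_0^1 a^{\alpha} \left(1 + \frac{1}{a}\right)^{\alpha'} \frac{da}{a} \ll \int_0^1 a^{\alpha}\, \frac{da}{a} + \int_0^1 a^{\alpha - \alpha'}\, \frac{da}{a} \ll 1,$$
since $\alpha' < \alpha$.

Bibliography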

[1] Tom M. Apostol. Introduction to analytic number theory. Springer-Verlag, New York-Heidelberg, 1976. Undergraduate Texts in Mathematics.

[2] Jean Bourgain, Ciprian Demeter, and Larry Guth. Proof of the main conjecture in Vinogradov’s mean value theorem for degrees higher than three. Ann. of Math. (2), 184(2):633–682, 2016.

[3] Fernando Chamizo and Adrián Ubis. Some Fourier series with gaps. J. Anal. Math., 101:179–197, 2007.

[4] J.J. Duistermaat. Selfsimilarity of “Riemann’s nondifferentiable function”. Nieuw Archief voor Wiskunde. Vierde Serie, 9, 1991.

[5] A. L. Durán, R. Estrada, and R. P. Kanwal. Extensions of the Poisson summation formula. J. Math. Anal. Appl., 218(2):581–606, 1998.

[6] Kevin Ford. Vinogradov’s integral and bounds for the Riemann zeta function. Proc. London Math. Soc. (3), 85(3):565–633, 2002.

[7] Kevin Ford. Zero-free regions for the Riemann zeta function. In Number theory for the millennium, II (Urbana, IL, 2000), pages 25–56. A K Peters, Natick, MA, 2002.

[8] Joseph Gerver. The differentiability of the Riemann function at certain rational multiples of π. Amer. J. Math., 92:33–55, 1970.

[9] S. W. Graham and G. Kolesnik. van der Corput’s method of exponential sums, volume 126 of London Mathematical Society Lecture Note Series. Cambridge University Press, Cambridge, 1991.

[10] G. H. Hardy. Weierstrass’s non-differentiable function. Trans. Amer. Math. Soc., 17(3):301–325, 1916.

[11] G. H. Hardy and J. E. Littlewood. Some problems of diophantine approximation (II). Acta Math., 37(1):193–239, 1914.

[12] G. H. Hardy and J. E. Littlewood. Some problems of ‘Partitio Numerorum’: II. Proof that every large number is the sum of at most 21 biquadrates. Math. Z., 9(1-2):14–27, 1921.

[13] G. H. Hardy and J. E. Littlewood. Some problems of ‘Partitio numerorum’: III. On the expression of a number as a sum of primes. Acta Math., 44(1):1–70, 1923.

[14] G. H. Hardy and J. E. Littlewood. Some Problems of ‘Partitio Numerorum’: V. A Further Contribution to the Study of Goldbach’s Problem. Proc. London Math. Soc. (2), 22:46–56, 1924.


[15] G. H. Hardy and S. Ramanujan. Asymptotic formulæ in combinatory analysis. Proceedings of the London Mathematical Society, s2-17(1):75–115, 1918.

[16] Harald A. Helfgott. The ternary Goldbach problem. to appear in Ann. of Math. Studies, preprint arXiv:1501.05438v2.

[17] David Hilbert. Beweis für die Darstellbarkeit der ganzen Zahlen durch eine feste Anzahl nter Potenzen (Waringsche Problem). Math. Ann., 67(3):281–300, 1909.

[18] M. Holschneider. Wavelets. Oxford Mathematical Monographs. The Clarendon Press, Oxford University Press, New York, 1995. An analysis tool, Oxford Science Publications.

[19] M. Holschneider and Ph. Tchamitchian. Pointwise analysis of Riemann’s “nondifferentiable” function. Invent. Math., 105(1):157–175, 1991.

[20] Loo-keng Hua. On exponential sums. Sci. Record (N.S.), 1:1–4, 1957.

[21] A. E. Ingham. The distribution of prime numbers. Cambridge Tracts in Mathematics and Mathematical Physics, No. 30. Cambridge University Press, 1932.

[22] Seiichi Itatsu. Differentiability of Riemann’s function. Proc. Japan Acad. Ser. A Math. Sci., 57(10):492–495, 1981.

[23] Henryk Iwaniec and Emmanuel Kowalski. Analytic number theory, volume 53 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2004.

[24] Stephane Jaffard. The spectrum of singularities of Riemann’s function. Rev. Mat. Iberoamericana, 12(2):441–460, 1996.

[25] A. A. Karatsuba. Mean value of the modulus of a trigonometric sum. Izv. Akad. Nauk SSSR Ser. Mat., 37:1203–1227, 1973. In Russian.

[26] N. M. Korobov. Estimates of trigonometric sums and their applications. Uspehi Mat. Nauk, 13(4 (82)):185–192, 1958. In Russian.

[27] Edmund Landau. Über die Wurzeln der Zetafunktion. Math. Z., 20(1):98–104, 1924.

[28] Kurt Mahler. On the fractional parts of the powers of a rational number. II. Mathematika, 4:122–124, 1957.

[29] Hugh L. Montgomery and Robert C. Vaughan. Exponential sums. http://www-personal.umich.edu/~hlm/math775/ch17.pdf. Draft version of a chapter of the second volume in their series on multiplicative number theory.

[30] Hugh L. Montgomery and Robert C. Vaughan. Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2006.

[31] Wolfgang M. Schmidt. Equations over finite fields. An elementary approach. Lecture Notes in Mathematics, Vol. 536. Springer-Verlag, Berlin-New York, 1976.

[32] Wolfgang M. Schmidt. Diophantine approximation, volume 785 of Lecture Notes in Mathematics. Springer, Berlin, 1980.

[33] A. Smith. The differentiability of Riemann’s functions. Proc. Amer. Math. Soc., 34:463–468, 1972.

[34] S. B. Stečkin. Mean values of the modulus of a trigonometric sum. Trudy Mat. Inst. Steklov., 134:283–309, 411, 1975.

[35] Terence Tao. Heuristic limitations of the circle method. Blog post available at https://terrytao.wordpress.com/2012/05/20/heuristic-limitations-of-the-circle-method/.

[36] R. C. Vaughan and T. D. Wooley. Waring’s problem: a survey. In Number theory for the millennium, III (Urbana, IL, 2000), pages 301–340. A K Peters, Natick, MA, 2002.

[37] Robert C. Vaughan. The Hardy-Littlewood method, volume 125 of Cambridge Tracts in Mathematics. Cambridge University Press, Cambridge, second edition, 1997.

[38] I. M. Vinogradov. On Weyl’s sums. Mat. Sbornik, 42(5):521–530, 1935.

[39] I. M. Vinogradov. A new method of estimation of trigonometrical sums. Mat. Sbornik, 43(1):175–188, 1936.

[40] I. M. Vinogradov. Representations of an odd number as the sum of three primes. Dokl. Akad. Nauk SSSR, 15:291–294, 1937.

[41] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Trav. Inst. Math. Stekloff, 23:109, 1947. In Russian.

[42] I. M. Vinogradov. A new estimate of the function ζ(1 + it). Izv. Akad. Nauk SSSR. Ser. Mat., 22:161–164, 1958. In Russian.

[43] I. M. Vinogradov. The method of trigonometrical sums in the theory of numbers. Dover Publications, Inc., Mineola, NY, 2004. Translated from the Russian, revised and annotated by K. F. Roth and Anne Davenport, Reprint of the 1954 translation.

[44] H. Weyl. Zur Abschätzung von ζ(1 + ti). Math. Zeit., 10:88–101, 1921.

[45] Trevor D. Wooley. Vinogradov’s mean value theorem via efficient congruencing. Ann. of Math. (2), 175(3):1575–1627, 2012.