
Exponential Sums and the Distribution of Prime Numbers

Jori Merikoski

University of Helsinki

Faculty: Faculty of Science

Department: Department of Mathematics and Statistics

Author: Jori Merikoski

Title: Exponential Sums and the Distribution of Prime Numbers

Subject: Mathematics

Level: Master's thesis

Month and year: February 2016

Number of pages: 102 p.

Abstract

We study growth estimates for the Riemann zeta function on the critical strip and their implications for the distribution of prime numbers. In particular, we use the growth estimates to prove the Hoheisel-Ingham Theorem, which gives an upper bound for the difference between consecutive prime numbers. We also investigate the distribution of prime pairs, in connection with which we offer original ideas.

The Riemann zeta function is defined as $\zeta(s) := \sum_{n=1}^{\infty} n^{-s}$ in the half-plane $\operatorname{Re} s > 1$. We extend it to a meromorphic function on the whole plane with a simple pole at $s = 1$, and show that it satisfies the functional equation. We discuss two methods, van der Corput's and Vinogradov's, to give upper bounds for the growth of the zeta function on the critical strip $0 \le \operatorname{Re} s \le 1$. Both of these are based on the observation that $\zeta(s)$ is well approximated on the critical strip by a finite exponential sum $\sum_{n=1}^{T} n^{-s} = \sum_{n=1}^{T} \exp\{-s \log n\}$. Van der Corput's method uses the Poisson summation formula to transform this sum into a sum of integrals, which can be easily estimated. This yields the estimate $\zeta(1/2 + it) = \mathcal{O}(t^{1/6} \log t)$, as $t \to \infty$. Vinogradov's method transforms the problem of estimating an exponential sum into a combinatorial problem. It is needed to give a strong bound for the growth of the zeta function near the vertical line $\operatorname{Re} s = 1$.

We use complex analysis to prove the Hoheisel-Ingham Theorem, which states that if $\zeta(1/2 + it) = \mathcal{O}(t^{c})$ for some constant $c > 0$, then for any $\theta > \frac{1+4c}{2+4c}$, and for any function $x^{\theta} \ll h(x) \ll x$, we have $\psi(x + h) - \psi(x) \sim h$, as $x \to \infty$. The proof of this relies heavily on the growth estimate obtained by Vinogradov's method. Here $\psi(x) := \sum_{n \le x} \Lambda(n) = \sum_{p^k \le x} \log p$ is the summatory function of the von Mangoldt function. From this we obtain, by using van der Corput's estimate, that the difference between consecutive primes satisfies $p_{n+1} - p_n < p_n^{5/8 + \epsilon}$ for all large enough $n$, and for any $\epsilon > 0$.

Finally, we study prime pairs, and the Hardy-Littlewood Conjecture on their distribution. More precisely, let $\pi_{2k}(x)$ stand for the number of prime numbers $p \le x$ such that $p + 2k$ is also a prime. The following ideas are all original contributions of this thesis: We show that the average of $\pi_{2k}(x)$ over $2k \le x^{\theta}$ is exactly what is expected by the Hardy-Littlewood Conjecture. Here we can choose $\theta > \frac{1+4c}{2+4c}$ as above. We also give a lower bound for the averages of $\pi_{2k}(x)$ over much smaller intervals $2k \le E \log x$, and give interpretations of our results using the concept of equidistribution. In addition, we study prime pairs by using the discrete Fourier transform. We express the function $\pi_{2k}(n)$ as an exponential sum, and extract from this sum the term predicted by the Hardy-Littlewood Conjecture. This is interpreted as a discrete analog of the method of major and minor arcs, which is often used to tackle problems of additive number theory.

Keywords: Riemann zeta function, exponential sums, analytic number theory, prime numbers

Where deposited: Kumpulan tiedekirjasto


Contents

1 Introduction

2 The Riemann Zeta Function
2.1 Euler Product
2.2 Analytic Continuation to $\sigma > 0$, $s \ne 1$
2.3 Analytic Continuation to $\mathbb{C} \setminus \{1\}$ and the Functional Equation
2.4 Simple Estimates and a Zero-free Region

3 Van Der Corput's Method
3.1 Exponential Integrals
3.2 Exponential Sums
3.3 Preparations For the Next Chapter

4 Vinogradov's Method
4.1 From Exponential Sums to Combinatorics
4.2 Integrals $J(q, l)$
4.3 Order of $\zeta(s)$ Near the Line $\sigma = 1$

5 Difference Between Consecutive Prime Numbers
5.1 An Approximate Formula For $\psi(x)$
5.2 Estimating $\zeta(s)$ for $\frac{1}{2} \le \sigma \le 1$
5.3 Ingham's Mean Value Theorem
5.4 Prime Gaps

6 Prime Pairs
6.1 Averaged Form of the Hardy-Littlewood Conjecture
6.2 Lower Bounds For Averages of $\pi_{2k}(x)$
6.3 Exponential Sums Over Prime Numbers
6.4 Another Approach to the Hardy-Littlewood Conjecture

7 Conclusions

Appendices
A Notations
B Some Useful Theorems

References

1 Introduction

The Riemann zeta function is profoundly related to the distribution of prime numbers, which is a central object of study in analytic number theory. The relationship between the two is most apparent in the elegant formula

$$\psi(x) = x - \lim_{T \to \infty} \sum_{|\gamma| \le T} \frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac{1}{2} \log\left(1 - x^{-2}\right), \quad x > 1, \; x \ne p^{m}, \tag{1.1}$$

where the sum is over the non-trivial zeros $\rho = \beta + i\gamma$ of the zeta function. Here

$$\psi(x) := \sum_{n \le x} \Lambda(n) = \sum_{p^{m} \le x} \log p. \tag{1.2}$$

The formula (1.1) was first stated by Riemann and later proven by von Mangoldt in 1885 (Ivic, 2003, p. 298). It is because of this formula that a great deal of the theory of the zeta function concerns the location of the non-trivial zeros. According to the famous unsolved Riemann hypothesis they all lie on the line $\{s \in \mathbb{C} : \operatorname{Re} s = \sigma = \frac{1}{2}\}$.

It is not difficult to guess from (1.1) that estimates about the density of the zeros in vertical strips of the type $\{\sigma_1 \le \sigma \le \sigma_2\}$ imply upper bounds on how big the difference between consecutive prime numbers can be. Theorems of this nature can be found in Ivic (2003). What is perhaps less obvious is that estimates on how rapidly the zeta function grows on the vertical lines also imply bounds for prime gaps. This latter approach is the main topic of this Master's thesis.

The main body of this thesis is divided into five parts. In the second Chapter we will get acquainted with the Riemann zeta function. We prove the functional equation for the zeta function and show important estimates on its growth, using basic tools from function theory. We will also show that estimates on the growth of the zeta function near the line $\sigma = 1$ give bounds on how close the zeros of the zeta function can be to the same line. This is the only piece of information about the zeros that we will need to bound prime gaps.

In the third Chapter we will give estimates for the growth of the zeta function on the line $\sigma = \frac{1}{2}$ using van der Corput's method. The main idea is to approximate the zeta function by a finite exponential sum and then transform the sum into a sum of integrals using the Poisson summation formula. We will prove that

$$\zeta\left(\tfrac{1}{2} + it\right) = \mathcal{O}\left(t^{1/6} \log t\right), \quad t \to \infty. \tag{1.3}$$

In Chapter four we concentrate on a different method of dealing with exponential sums, due to Vinogradov. This technique will give better results near the line $\sigma = 1$ than van der Corput's method. The main goal is to obtain a result of the type $\zeta'(s)/\zeta(s) = \mathcal{O}(\log^{B} t)$, $\zeta(s) \ne 0$ in a region $\sigma \ge 1 - \omega(t) \frac{\log\log t}{\log t}$, where $\omega(t)$ tends to infinity with $t$. Van der Corput's method gives estimates of this kind only for a constant $\omega$. It will become apparent in Chapter five that this slight improvement is crucial for our application.

In the fifth Chapter we will give an upper bound for prime gaps. In particular, we will use our accumulated knowledge in Hoheisel and Ingham's ingenious proof of the fact that if

$$\zeta\left(\tfrac{1}{2} + it\right) = \mathcal{O}\left(t^{c}\right), \quad t \to \infty,$$

then for any given $\theta > \frac{1+4c}{2+4c}$ we have

$$p_{n+1} - p_n < p_n^{\theta}$$

for sufficiently large $n$. We will then use the estimates from Chapter three to obtain that for every $\epsilon > 0$ we have

$$p_{n+1} - p_n < p_n^{5/8 + \epsilon} \tag{1.4}$$

for large enough $n$.

In the sixth Chapter we will study another prime number problem of additive nature, namely the problem of prime pairs. Let $\pi_{2k}(x)$ stand for the number of prime numbers $p \le x$ such that $p + 2k$ is also a prime number. The behaviour of $\pi_{2k}(x)$ for any given $2k$ is a question that resists the firepower of even the most modern methods. We make the problem accessible by considering averages over $2k$. We will show that for any $\frac{1+4c}{2+4c} < \theta < 1$ the average of $\pi_{2k}(x)$ over positive $2k \le x^{\theta}$ is consistent with the famous Hardy-Littlewood Conjecture. Using this we will show that for large $x$, and for a positive proportion of the numbers $2k \le x^{\theta}$, the number $\pi_{2k}(x)$ is greater than a constant times what the Hardy-Littlewood Conjecture claims. We will also give lower bounds for the averages of $\pi_{2k}(x)$ over much smaller sets $\{k \le E \log x\}$.

In the last section of the sixth Chapter, we apply the discrete Fourier transform to study the distribution of prime pairs. This is yet another application of exponential sums to the theory of prime numbers. We express the function $\pi_{2k}(n)$ as an exponential sum, and then recover the term corresponding to the Hardy-Littlewood Conjecture from this sum.

We do not expect the reader to have much experience with analytic number theory. Knowledge of a proof of the prime number theorem using the zeta function is not a prerequisite, but it makes some of the discussion easier to follow. Ayoub (1963, Ch. 2) contains several proofs of the prime number theorem. We assume that the reader is comfortable with basic complex analysis, especially the calculus of residues. Ahlfors (1979) is suggested for an introduction to complex analysis. We will also make use of the Poisson summation formula from Fourier analysis in chapters two and three. There is an appendix at the end which contains notations and some useful basic theorems.

Due to length constraints we were faced with a dilemma. On the one hand, one wishes to give complete proofs for all the theorems that will be used. On the other hand, in order to fit the variety of ideas needed into such a narrow space, we would have to be so slick that the proofs would be utterly mystifying for a reader who is unfamiliar with the subject. For this reason we have left out some of the more technical analysis from Chapter four concerning Vinogradov's method. The omitted proofs are cited in the text.

Many of the proofs may seem very technical at first and there is a risk that the beautiful ideas get lost beneath the detailed calculations. For this reason there are, from time to time, heuristic arguments in the text to motivate the rigorous proofs. We hope that these will help to illustrate the underlying concepts. Even though the ideas developed for estimating the Riemann zeta function are most fascinating, we must not forget what made us embark on the journey: the desire to understand prime numbers.
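The step from the Hoheisel-Ingham exponent to the value $5/8$ in (1.4) is plain arithmetic, and the following minimal sketch makes it explicit with exact rational numbers. The function name `hoheisel_ingham_exponent` is a hypothetical helper introduced only for this illustration. The value $c = 1/6$ is the van der Corput exponent from (1.3), while $c = 1/4$ is the convexity bound mentioned in Section 2.4.

```python
from fractions import Fraction


def hoheisel_ingham_exponent(c: Fraction) -> Fraction:
    """Threshold (1 + 4c) / (2 + 4c): every theta above it is admissible."""
    return (1 + 4 * c) / (2 + 4 * c)


# c = 1/6 comes from the van der Corput estimate (1.3).
print(hoheisel_ingham_exponent(Fraction(1, 6)))  # 5/8

# c = 1/4 is the weaker convexity bound; it gives a larger threshold.
print(hoheisel_ingham_exponent(Fraction(1, 4)))  # 2/3
```

A smaller exponent $c$ gives a smaller threshold, which is why improved growth estimates on the line $\sigma = \frac{1}{2}$ translate directly into shorter prime gaps.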

2 The Riemann Zeta Function

2.1 Euler Product

We will use the notation $s = \sigma + it$ for complex variables. We will also denote the half-plane $\{s \in \mathbb{C} : \operatorname{Re} s = \sigma > \sigma_0\}$ by $\{\sigma > \sigma_0\}$ or just say that $s$ is in the half-plane $\sigma > \sigma_0$. Similar notations are used for vertical strips $\{\sigma_1 < \sigma < \sigma_2\}$ and vertical lines $\{\sigma = \sigma_0\}$. We also use the notation $\{\sigma > f(t)\}$ for the domain $\{s \in \mathbb{C} : \operatorname{Re} s = \sigma > f(t)\}$, where $f$ is a continuous function.

Definition 2.1. (Riemann zeta function). In the half-plane $\sigma > 1$ we define the Riemann zeta function as

$$\zeta(s) := \sum_{n=1}^{\infty} n^{-s}, \quad \sigma > 1. \tag{2.1}$$

As can be seen by using the triangle inequality, the sum converges uniformly and absolutely in every half-plane $\sigma > 1 + \delta$, $\delta > 0$. Therefore the series (2.1) defines an analytic function in the domain $\sigma > 1$. The series clearly does not converge at $s = 1$. From the general theory of Dirichlet series it follows that the sum therefore diverges for all $\sigma < 1$ (Saksman, 2011, p. 44). We also note that we clearly have the symmetry $\zeta(\bar{s}) = \overline{\zeta(s)}$. Hence all the zeros of $\zeta(s)$ lie symmetrically with respect to the real line. We also have the following important theorem, which relates the $\zeta$-function to prime numbers.

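Theorem 2.1. (Euler product). For $\sigma > 1$

$$\zeta(s) = \prod_{p} \left(1 - p^{-s}\right)^{-1}, \tag{2.2}$$

where the product is over all prime numbers.

Proof. We have

$$\left| \zeta(s) - \prod_{p \le m} \left(1 - p^{-s}\right)^{-1} \right| = \left| \sum_{n=1}^{\infty} n^{-s} - \prod_{p \le m} \left( \sum_{k=0}^{\infty} p^{-ks} \right) \right| = \left| \sum_{\substack{n :\, p \mid n \\ \text{for some } p > m}} n^{-s} \right| \le \sum_{n=m+1}^{\infty} n^{-\sigma} \to 0,$$

as $m \to \infty$. Here the product over $p \le m$ expands into the sum of $n^{-s}$ over all $n$ whose prime factors are all at most $m$, so the difference runs over those $n$ that have a prime factor greater than $m$, and every such $n$ exceeds $m$. For the last inequality we have used the triangle inequality and the fact that $|x^{-s}| = x^{-\sigma}$ for real $x > 0$.

Remark 2.1. From now on, whenever the index in a sum or a product is $p$, the sum or the product is taken over prime numbers.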
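The proof of Theorem 2.1 bounds the gap between $\zeta(s)$ and the partial Euler product by the tail $\sum_{n > m} n^{-\sigma}$. The sketch below checks this numerically at $s = 2$. It assumes the classical closed form $\zeta(2) = \pi^2/6$ as a reference value, which is not derived in the text, and it bounds the tail by the integral $\int_m^\infty x^{-\sigma}\,dx$. All function names are hypothetical helpers for this illustration.

```python
import math


def primes_up_to(m: int) -> list[int]:
    """Sieve of Eratosthenes: all primes p <= m."""
    if m < 2:
        return []
    is_prime = [True] * (m + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, math.isqrt(m) + 1):
        if is_prime[p]:
            for multiple in range(p * p, m + 1, p):
                is_prime[multiple] = False
    return [n for n in range(2, m + 1) if is_prime[n]]


def euler_product(sigma: float, m: int) -> float:
    """Partial Euler product over primes p <= m, for real sigma > 1."""
    product = 1.0
    for p in primes_up_to(m):
        product *= 1.0 / (1.0 - p ** (-sigma))
    return product


def tail_bound(sigma: float, m: int) -> float:
    """Integral bound for sum_{n > m} n^(-sigma), valid for sigma > 1."""
    return m ** (1.0 - sigma) / (sigma - 1.0)


zeta_2 = math.pi ** 2 / 6  # reference value, assumed known
for m in (10, 100, 1000):
    gap = zeta_2 - euler_product(2.0, m)
    # The gap is positive and stays below the tail bound from the proof.
    print(m, gap, tail_bound(2.0, m))
```

The printed gap shrinks as $m$ grows, in line with the limit $m \to \infty$ taken in the proof.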

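Euler's product can be viewed as an analytic version of the fundamental theorem of arithmetic, which asserts that every natural number can be represented uniquely as a product of powers of prime numbers. We also obtain

Corollary 2.1. For $\sigma \ge 1 + \delta > 1$ the following hold uniformly:

$$\log \zeta(s) = \sum_{p} \sum_{n=1}^{\infty} \frac{1}{n p^{ns}}, \tag{2.3}$$

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=2}^{\infty} \frac{\Lambda(n)}{n^{s}}, \tag{2.4}$$

where $\Lambda(n)$ is the von Mangoldt function

$$\Lambda(n) = \begin{cases} \log p, & n = p^{k} \text{ for some prime } p \text{ and } k \ge 1, \\ 0, & \text{otherwise.} \end{cases} \tag{2.5}$$

Proof. From the Euler product we get

$$\log \zeta(s) = -\sum_{p} \log\left(1 - p^{-s}\right) = \sum_{p} \sum_{n=1}^{\infty} \frac{1}{n p^{ns}}.$$

This double sum converges uniformly in every compact subset of $\sigma > 1$, because its absolute value is by the triangle inequality not greater than

$$\sum_{p} \sum_{n=1}^{\infty} \frac{1}{n p^{n\sigma}} \le \sum_{n=1}^{\infty} \sum_{k=2}^{\infty} \frac{1}{n k^{n\sigma}} \le \sum_{k=2}^{\infty} \frac{k^{-\sigma}}{1 - k^{-\sigma}} = \sum_{k=2}^{\infty} \frac{1}{k^{\sigma} - 1} < \infty.$$

Since the series converges uniformly, differentiation yields

$$-\frac{\zeta'(s)}{\zeta(s)} = \sum_{p} \sum_{n=1}^{\infty} \log p \; p^{-ns} = \sum_{n=2}^{\infty} \frac{\Lambda(n)}{n^{s}}.$$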
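To make the definition (2.5) concrete, here is a minimal sketch of the von Mangoldt function together with the summatory function $\psi(x)$ from the Introduction. The check at the end uses the identity $\sum_{d \mid n} \Lambda(d) = \log n$, a standard fact that is not stated in the text: it is the coefficient form of $\zeta(s) \cdot \left(-\zeta'(s)/\zeta(s)\right) = -\zeta'(s)$. The names `von_mangoldt` and `psi` are hypothetical helpers.

```python
import math


def von_mangoldt(n: int) -> float:
    """Lambda(n) = log p if n = p^k for a prime p and k >= 1, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, math.isqrt(n) + 1):
        if n % p == 0:
            # p is the smallest prime factor of n; n is a power of p
            # exactly when dividing out p repeatedly leaves 1.
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return math.log(n)  # no divisor up to sqrt(n): n itself is prime


def psi(x: int) -> float:
    """Summatory function psi(x) = sum of Lambda(n) over n <= x."""
    return sum(von_mangoldt(n) for n in range(1, x + 1))


# Coefficient identity behind (2.4): the divisor sum of Lambda is log n.
for n in (12, 30, 64, 97):
    divisor_sum = sum(von_mangoldt(d) for d in range(1, n + 1) if n % d == 0)
    print(n, divisor_sum, math.log(n))
```

Only prime powers contribute to $\psi(x)$, so for example $\psi(10)$ collects $\log 2$ three times (from 2, 4, 8), $\log 3$ twice (from 3, 9), and $\log 5$ and $\log 7$ once each.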

Because the sum in (2.3) converges for $\sigma > 1$, we also have

Corollary 2.2. $\zeta(s) \ne 0$ in the domain $\sigma > 1$ and

$$\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^{s}}, \tag{2.6}$$

where $\mu(n)$ is the Möbius function

$$\mu(n) = \begin{cases} 1, & n = 1, \\ 0, & n \text{ is not squarefree,} \\ (-1)^{k}, & n \text{ is squarefree with } k \text{ distinct prime factors.} \end{cases} \tag{2.7}$$

Proof. We have

$$\frac{1}{\zeta(s)} = \prod_{p} \left(1 - p^{-s}\right).$$

The claim follows immediately from carrying out the multiplication and using the definition of the Möbius function.
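The following sketch implements the Möbius function of (2.7) by trial division and checks two consequences of (2.6). The first is the identity $\sum_{d \mid n} \mu(d) = 0$ for $n > 1$, which is the coefficient form of $\zeta(s) \cdot 1/\zeta(s) = 1$. The second compares a partial sum of (2.6) at $s = 2$ with $6/\pi^2$, assuming the classical value $\zeta(2) = \pi^2/6$, which is not derived in the text. The name `mobius` is a hypothetical helper.

```python
import math


def mobius(n: int) -> int:
    """Moebius function: 0 if n has a squared prime factor, else (-1)^k."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result


# Divisor sums of mu vanish for n > 1 and equal 1 for n = 1.
for n in (1, 6, 12, 30):
    print(n, sum(mobius(d) for d in range(1, n + 1) if n % d == 0))

# Partial sum of (2.6) at s = 2; the tail is at most sum_{n > N} n^-2 < 1/N.
N = 2000
partial = sum(mobius(n) / n ** 2 for n in range(1, N + 1))
print(partial, 6 / math.pi ** 2)
```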

2.2 Analytic Continuation to $\sigma > 0$, $s \ne 1$

The following theorem states that even though the sum (2.1) does not converge for $\sigma \le 1$, we can still use its partial sum to approximate $\zeta(s)$. Estimating $\zeta(s)$ and related functions by a finite sum will be a recurring theme in all of the following chapters. We use the notation $\lfloor x \rfloor$ to denote the largest integer that is at most $x$. That is, $\lfloor x \rfloor = \max\{k \in \mathbb{Z} : k \le x\}$.

Theorem 2.2. The Riemann zeta function extends to an analytic function in the domain $\sigma > 0$, $s \ne 1$ by the formula

$$\zeta(s) = \sum_{n=1}^{N} n^{-s} + \frac{N^{1-s}}{s - 1} - s \int_{N}^{\infty} x^{-s-1} \left(x - \lfloor x \rfloor\right) dx, \tag{2.8}$$

where $N \ge 1$ is any given integer.

Proof. The Euler summation formula B.2 gives us

$$\sum_{n=N+1}^{M} n^{-s} = \frac{M^{1-s} - N^{1-s}}{1 - s} - s \int_{N}^{M} x^{-s-1} \left(x - \lfloor x \rfloor\right) dx.$$

Letting $M \to \infty$ we obtain for $\sigma > 1$

$$\zeta(s) = \sum_{n=1}^{N} n^{-s} + \sum_{n=N+1}^{\infty} n^{-s} = \sum_{n=1}^{N} n^{-s} + \frac{N^{1-s}}{s - 1} - s \int_{N}^{\infty} x^{-s-1} \left(x - \lfloor x \rfloor\right) dx.$$

The integral converges uniformly on every compact subset of the right half-plane and therefore the right-hand side of (2.8) defines an analytic function for all $\sigma > 0$, $s \ne 1$. The theorem now follows from the uniqueness of analytic continuation.

From this we immediately get by choosing N = 1

Corollary 2.3. The Riemann zeta function has a simple pole at s = 1 with residue 1.
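Formula (2.8) can be checked numerically. The sketch below is only an illustration under stated assumptions: it truncates the integral at a hypothetical cutoff `M` and evaluates each piece $\int_n^{n+1} (x - n)\, x^{-s-1}\,dx$ in closed form, which is elementary calculus rather than anything taken from the text. The truncation error is at most $|s|\, M^{-\sigma}/\sigma$, so the sketch is accurate for $\sigma = 2$ but converges slowly for $0 < \sigma < 1$. The reference value $\zeta(2) = \pi^2/6$ is assumed known.

```python
import math


def zeta_via_28(s: complex, N: int = 10, M: int = 20000) -> complex:
    """Right-hand side of (2.8) with the integral truncated at M > N."""
    s = complex(s)
    head = sum(n ** (-s) for n in range(1, N + 1))
    pole_term = N ** (1 - s) / (s - 1)
    integral = 0j
    for n in range(N, M):
        # On [n, n+1] the fractional part is x - n, so the piece equals
        # int x^(-s) dx - n * int x^(-s-1) dx over that interval.
        integral += ((n + 1) ** (1 - s) - n ** (1 - s)) / (1 - s)
        integral += n * ((n + 1) ** (-s) - n ** (-s)) / s
    return head + pole_term - s * integral


# The value should not depend on N, and at s = 2 it should match pi^2 / 6.
print(zeta_via_28(2, N=1))
print(zeta_via_28(2, N=50))
print(math.pi ** 2 / 6)
```

That the result is independent of $N$ is exactly the content of the Euler summation step in the proof: moving terms between the finite sum and the integral changes nothing.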

2.3 Analytic Continuation to $\mathbb{C} \setminus \{1\}$ and the Functional Equation

We will need the following lemma to prove the functional equation of $\zeta(s)$. The proof is from Saksman (2014, p. 33).

Lemma 2.1. (Transformation formula of the Jacobi theta function). Define the Jacobi theta function for $x > 0$ by

$$\theta(x) := \sum_{n=-\infty}^{\infty} e^{-\pi n^{2} x}. \tag{2.9}$$

Then

$$\theta\left(x^{-1}\right) = x^{1/2}\, \theta(x). \tag{2.10}$$

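Proof. Let $x > 0$ and define $f(y) := e^{-\pi y^{2} x}$. Then the Fourier transform of $f$ is by definition

$$\hat{f}(\xi) = \int_{-\infty}^{\infty} f(y)\, e(-\xi y)\, dy = \int_{-\infty}^{\infty} e^{-\pi (y^{2} x + 2 \xi y i)}\, dy = e^{-\pi \xi^{2}/x} \int_{-\infty}^{\infty} e^{-\pi \left(y \sqrt{x} + \frac{\xi}{\sqrt{x}} i\right)^{2}} dy = \frac{1}{\sqrt{x}}\, e^{-\pi \xi^{2}/x} \int_{-\infty}^{\infty} e^{-\pi (t + bi)^{2}}\, dt, \quad b = \frac{\xi}{\sqrt{x}}.$$

To evaluate the integral, we note that $e^{-\pi z^{2}}$ is an entire function. Hence we can write using Cauchy's theorem

$$\int_{-T+bi}^{T+bi} e^{-\pi z^{2}}\, dz = \int_{-T}^{T} e^{-\pi z^{2}}\, dz + \int_{T}^{T+bi} e^{-\pi z^{2}}\, dz + \int_{-T+bi}^{-T} e^{-\pi z^{2}}\, dz =: I_1 + I_2 + I_3.$$

A simple computation gives us for indices $k = 2, 3$

$$|I_k| \le \int_{T}^{T+bi} \left|e^{-\pi z^{2}}\right| |dz| \le |b|\, e^{-\pi (T^{2} - b^{2})} \to 0,$$

as $T \to \pm\infty$ for a fixed $b$. Therefore

$$\hat{f}(\xi) = \frac{1}{\sqrt{x}}\, e^{-\pi \xi^{2}/x} \int_{-\infty}^{\infty} e^{-\pi (t + bi)^{2}}\, dt = \frac{1}{\sqrt{x}}\, e^{-\pi \xi^{2}/x} \int_{-\infty}^{\infty} e^{-\pi t^{2}}\, dt = \frac{1}{\sqrt{x}}\, e^{-\pi \xi^{2}/x}.$$

Now the Poisson summation formula B.3 gives us

$$x^{-1/2}\, \theta\left(x^{-1}\right) = \sum_{n=-\infty}^{\infty} \hat{f}(n) = \sum_{n=-\infty}^{\infty} f(n) = \theta(x).$$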
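The transformation formula (2.10) is easy to test numerically, because the series (2.9) converges extremely fast. The sketch below truncates the series at a hypothetical cutoff `terms` and compares both sides of (2.10) for a few values of $x$. It is a sanity check under that truncation assumption, not a proof.

```python
import math


def theta(x: float, terms: int = 60) -> float:
    """Jacobi theta function (2.9), using the symmetry n -> -n of the sum."""
    return 1.0 + 2.0 * sum(
        math.exp(-math.pi * n * n * x) for n in range(1, terms + 1)
    )


for x in (0.1, 0.5, 1.0, 2.0, 7.5):
    left = theta(1.0 / x)
    right = math.sqrt(x) * theta(x)
    # Both columns should agree up to floating-point rounding.
    print(x, left, right)
```

For small $x$ the series for $\theta(x)$ needs many terms while the series for $\theta(1/x)$ needs almost none. This is the same trade-off the proof of the functional equation exploits when it maps the integral over $[0, 1]$ to an integral over $[1, \infty)$.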

We will also need to define the following two functions.

Definition 2.2. Define Euler's $\Gamma$-function by

$$\Gamma(s) := \int_{0}^{\infty} x^{s-1} e^{-x}\, dx, \quad \sigma > 0, \tag{2.11}$$

and for $\sigma > 1$ define

$$\xi(s) := s(s - 1)\, \pi^{-s/2}\, \Gamma\left(\frac{s}{2}\right) \zeta(s). \tag{2.12}$$

In what follows the basic properties of the $\Gamma$-function are assumed to be known. These can be found in Ahlfors (1979).

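Theorem 2.3. The function $\xi(s)$ extends to an entire function and satisfies

$$\xi(s) = \xi(1 - s) \tag{2.13}$$

for all $s \in \mathbb{C}$.

Proof. Assume at first that $\sigma > 1$. By a change of variables $x \mapsto \pi n^{2} x$ we obtain from the definition of the $\Gamma$-function that

$$\pi^{-s/2}\, n^{-s}\, \Gamma\left(\frac{s}{2}\right) = \int_{0}^{\infty} x^{\frac{s}{2} - 1} e^{-\pi n^{2} x}\, dx.$$

Summing over $n$ and multiplying by $s(s - 1)$ we get

$$\xi(s) = s(s - 1) \int_{0}^{\infty} x^{\frac{s}{2} - 1} \sum_{n=1}^{\infty} e^{-\pi n^{2} x}\, dx = s(s - 1) \int_{0}^{\infty} x^{\frac{s}{2} - 1}\, \frac{\theta(x) - 1}{2}\, dx, \tag{2.14}$$

where the interchange of the order of summation and integration is justified, because the series converges absolutely. By defining $\omega(x) = (\theta(x) - 1)/2$ we obtain from the transformation formula of the Jacobi theta function that

$$\omega\left(x^{-1}\right) = -\frac{1}{2} + \frac{1}{2}\sqrt{x} + \sqrt{x}\, \omega(x). \tag{2.15}$$

Next we divide the integration in (2.14) into two parts, $[0, 1]$ and $[1, \infty)$:

$$\xi(s) = s(s - 1) \left( \int_{0}^{1} x^{\frac{s}{2} - 1} \omega(x)\, dx + \int_{1}^{\infty} x^{\frac{s}{2} - 1} \omega(x)\, dx \right).$$

The first integral is the tricky part, which we have to deal with in order to obtain an analytic continuation for $\xi(s)$. To this end we make a change of variables $x \mapsto \frac{1}{x}$ in the first integral, which yields

$$\xi(s) = s(s - 1) \int_{1}^{\infty} \left( x^{\frac{s}{2} - 1} \omega(x) + x^{-\frac{s}{2} - 1} \omega\left(x^{-1}\right) \right) dx.$$

Hence by the transformation formula

$$\begin{aligned}
\xi(s) &= s(s - 1) \left( \int_{1}^{\infty} \omega(x) \left( x^{\frac{s}{2} - 1} + x^{-\frac{s}{2} - \frac{1}{2}} \right) dx + \frac{1}{2} \int_{1}^{\infty} \left( x^{-\frac{s}{2} - \frac{1}{2}} - x^{-\frac{s}{2} - 1} \right) dx \right) \\
&= s(s - 1) \int_{1}^{\infty} \omega(x) \left( x^{\frac{s}{2} - 1} + x^{-\frac{s}{2} - \frac{1}{2}} \right) dx + \frac{s(s - 1)}{2} \left[ \frac{2}{1 - s}\, x^{\frac{1}{2} - \frac{s}{2}} + \frac{2}{s}\, x^{-\frac{s}{2}} \right]_{1}^{\infty} \\
&= s(s - 1) \int_{1}^{\infty} \omega(x) \left( x^{\frac{s}{2} - 1} + x^{-\frac{s}{2} - \frac{1}{2}} \right) dx + 1,
\end{aligned}$$

for $\sigma > 1$. From the definition of $\theta(x)$ and $\omega(x)$ it is clear that $\omega(x)$ decays exponentially as $x$ tends to infinity. Therefore the integral in the last expression converges uniformly on every compact subset of $\mathbb{C}$ and hence defines an entire function, which yields an analytic continuation for $\xi(s)$. The functional equation follows from the fact that the last expression remains unchanged under $s \mapsto 1 - s$.

Now we can prove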

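Corollary 2.4. (Analytic Continuation and the Functional Equation of $\zeta(s)$). The function $\zeta(s)$ has an analytic continuation to $\mathbb{C} \setminus \{1\}$ with a simple pole at $s = 1$. Furthermore, for all $s \ne 1$ we have

$$\zeta(s) = \chi(s)\, \zeta(1 - s), \quad \chi(s) = 2^{s} \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1 - s). \tag{2.16}$$

Proof. From the definition of $\xi(s)$ we obtain

$$\zeta(s) = \frac{\xi(s)}{s(s - 1)\, \pi^{-s/2}\, \Gamma\left(\frac{s}{2}\right)}. \tag{2.17}$$

Hence by the previous theorem $\zeta(s)$ extends to a meromorphic function with a pole of order one at $s = 1$, because the simple pole of $\Gamma\left(\frac{s}{2}\right)$ at $s = 0$ cancels the other possible pole in (2.17). From the functional equation (2.13) and the definition of $\xi(s)$ we get

$$s(s - 1)\, \pi^{-\frac{s}{2}}\, \Gamma\left(\frac{s}{2}\right) \zeta(s) = (1 - s)(-s)\, \pi^{-\frac{1-s}{2}}\, \Gamma\left(\frac{1 - s}{2}\right) \zeta(1 - s),$$

which yields

$$\begin{aligned}
\chi(s) = \frac{\zeta(s)}{\zeta(1 - s)} &= \frac{\pi^{-\frac{1-s}{2}}\, \Gamma\left(\frac{1-s}{2}\right)}{\pi^{-\frac{s}{2}}\, \Gamma\left(\frac{s}{2}\right)} \\
&= \pi^{s - \frac{3}{2}} \sin\left(\frac{\pi s}{2}\right) \Gamma\left(\frac{1 - s}{2}\right) \Gamma\left(1 - \frac{s}{2}\right) \\
&= 2^{s} \pi^{s-1} \sin\left(\frac{\pi s}{2}\right) \Gamma(1 - s),
\end{aligned}$$

where we have used the reflection formula and the duplication formula of the $\Gamma$-function (Ahlfors, 1979).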
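As a worked example of (2.16), consider $s = -1$. The computation below assumes the classical value $\zeta(2) = \pi^{2}/6$, which is not derived in the text, together with $\Gamma(2) = 1$.

```latex
\begin{align*}
\chi(-1) &= 2^{-1}\,\pi^{-2}\,\sin\!\left(-\frac{\pi}{2}\right)\Gamma(2)
          = -\frac{1}{2\pi^{2}}, \\
\zeta(-1) &= \chi(-1)\,\zeta(2)
           = -\frac{1}{2\pi^{2}}\cdot\frac{\pi^{2}}{6}
           = -\frac{1}{12}.
\end{align*}
```

The same formula shows why negative even integers are zeros: at $s = -2$ the factor $\sin(\pi s/2) = \sin(-\pi)$ vanishes while $\Gamma(3)$ and $\zeta(3)$ are finite.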

Remark 2.2. The above proof of the functional equation is one of Riemann's original proofs from his famous 1859 paper. From (2.17) we see that $\zeta(s)$ has zeros at all the negative even integers, because $\Gamma(s)$ has poles at $s = -1, -2, -3, \dots$ These are the so-called trivial zeros of $\zeta(s)$. Note also that by the functional equation, the non-trivial zeros of $\zeta(s)$ lie symmetrically with respect to the line $\sigma = \frac{1}{2}$.

2.4 Simple Estimates and a Zero-free Region

The proofs in this section are from Titchmarsh (1951, Ch. 3). Let us first define notations related to asymptotics.

Definition 2.3. (Asymptotic notations). For two functions $f : \mathbb{C} \to \mathbb{C}$, $g : \mathbb{C} \to \mathbb{R}_{+}$ we write

$$f(s) = \mathcal{O}(g(s)), \tag{2.18}$$

or equivalently

$$f(s) \ll g(s) \tag{2.19}$$

as $s \to \infty$ in an unbounded connected set $A \subset \mathbb{C}$, if there exists a constant $C > 0$ such that $|f(s)| \le C g(s)$ for all $s \in A$. The subscript $\epsilon$ in $\mathcal{O}_{\epsilon}$ or $\ll_{\epsilon}$ indicates that the constant may depend on a parameter $\epsilon$. We write $f(s) \asymp g(s)$ if $g(s) \ll f(s) \ll g(s)$. For two functions $f : \mathbb{C} \to \mathbb{C}$, $g : \mathbb{C} \to \mathbb{R}_{+}$ we write

$$f(s) = o(g(s)) \tag{2.20}$$

if $f(s)/g(s) \to 0$ as $|s| \to \infty$ in an unbounded connected set $A \subset \mathbb{C}$. For two functions of a real variable we write

$$f(x) \sim g(x) \tag{2.21}$$

as $x \to \infty$ if $f(x)/g(x) \to 1$ as $x \to \infty$.

To study the asymptotics of $\zeta(s)$, it is useful to define the following function.

Definition 2.4. We define the characteristic function of $\zeta(s)$ by

$$\mu(\sigma) := \inf\left\{ r \in \mathbb{R} : \zeta(\sigma + it) = \mathcal{O}\left(|t|^{r}\right) \right\}. \tag{2.22}$$

Observe that by the formula (2.3) we have for $\sigma > 1$

$$|\log \zeta(s)| = \left| \sum_{p} \sum_{n=1}^{\infty} \frac{1}{n p^{ns}} \right| \le \sum_{p} \sum_{n=1}^{\infty} \frac{1}{n p^{n\sigma}} = \mathcal{O}(1), \quad t \to \infty.$$

Therefore $\zeta(s)$ and $1/\zeta(s)$ are both bounded in $t$ for a fixed $\sigma > 1$, and we have $\mu(\sigma) = 0$ in the domain $\sigma > 1$. From the complex Stirling formula for $\Gamma(s)$ we obtain that

$$|\chi(s)| \sim \left( \frac{t}{2\pi} \right)^{\frac{1}{2} - \sigma}, \quad t \to \infty. \tag{2.23}$$

See Titchmarsh (1951, p. 68) for details. By the functional equation (2.16) this implies $\mu(\sigma) = \frac{1}{2} - \sigma$ for $\sigma < 0$. That is,

$$\mu(\sigma) = \begin{cases} 0, & \sigma > 1, \\ \frac{1}{2} - \sigma, & \sigma < 0. \end{cases} \tag{2.24}$$

Now by the Phragmén-Lindelöf theorem, $\mu(\sigma)$ is a convex function and therefore continuous. It is also finite and non-negative for all values of $\sigma$ (Titchmarsh (1976)). By continuity and convexity we get $\mu(\sigma) \le \frac{1}{2}(1 - \sigma)$ for $0 \le \sigma \le 1$ with equality at the endpoints. In particular, we obtain $\mu\left(\frac{1}{2}\right) \le \frac{1}{4}$.

Extrapolating from the equation (2.24), one is tempted to guess that $\mu\left(\frac{1}{2}\right) = 0$ and that the graph of $\mu(\sigma)$ consists of two straight lines meeting at the point $\left(\frac{1}{2}, 0\right)$. This is known as the Lindelöf hypothesis and is still an unsolved problem. We shall prove in the next chapter that $\mu\left(\frac{1}{2}\right) \le \frac{1}{6}$. We also obtain

Theorem 2.4.

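$$\zeta(\sigma + it) = \mathcal{O}(\log t) \tag{2.25}$$

as $t \to \infty$ in the region $\sigma \ge 1 - \frac{1}{\log t}$. Furthermore, we have

$$\zeta(it) = \mathcal{O}\left(t^{1/2} \log t\right). \tag{2.26}$$

Proof. To prove the first part note that choosing $N = \lfloor t \rfloor$ in Theorem 2.2 yields for $\sigma > 0$

$$\begin{aligned}
\zeta(s) &= \sum_{n=1}^{\lfloor t \rfloor} n^{-s} + \frac{\lfloor t \rfloor^{1-s}}{s - 1} - s \int_{\lfloor t \rfloor}^{\infty} x^{-s-1} \left(x - \lfloor x \rfloor\right) dx \\
&= \sum_{n=1}^{\lfloor t \rfloor} n^{-s} + \mathcal{O}\left(t^{-\sigma}\right) + \mathcal{O}\left( t \int_{\lfloor t \rfloor}^{\infty} x^{-\sigma-1}\, dx \right) \\
&= \sum_{n=1}^{\lfloor t \rfloor} n^{-s} + \mathcal{O}\left(t^{1-\sigma}\right).
\end{aligned}$$

Therefore for $\sigma \ge 1 - \frac{1}{\log t}$

$$\zeta(s) \ll \sum_{n=1}^{\lfloor t \rfloor} n^{-\sigma} + t^{\frac{1}{\log t}} \ll t^{\frac{1}{\log t}} \sum_{n=1}^{\lfloor t \rfloor} n^{-1} \ll e \log t \ll \log t.$$

The second part now follows from the functional equation, since the complex Stirling formula for $\Gamma(s)$ implies that $\chi(it) = \mathcal{O}\left(t^{1/2}\right)$, as $t \to \infty$ (Titchmarsh, 1951, p. 68). Hence

$$\zeta(it) = \chi(it)\, \zeta(1 - it) \ll t^{1/2} \log t.$$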

To conclude this chapter, we will show that estimates for the growth of $\zeta(s)$ near the line $\sigma = 1$ imply a region near the same line where $\zeta(s) \ne 0$. As it will turn out, no more information on the zeros of $\zeta(s)$ will be required for our application to the distribution of prime numbers.

First we need to prove the following three lemmata, which are based on the Borel-Carathéodory theorem B.5. The proofs are from Titchmarsh (1951, Ch. 3).

Lemma 2.2. Assume that $f$ is analytic and satisfies

$$\left| \frac{f(s)}{f(s_0)} \right| \le e^{M} \tag{2.27}$$

for some $M > 0$ in $B(s_0, r)$. Then for all $s \in B(s_0, r/4)$

$$\left| \frac{f'(s)}{f(s)} - \sum_{\rho} \frac{1}{s - \rho} \right| \le 48\, \frac{M}{r}, \tag{2.28}$$

where the sum runs through the zeros $\rho$ of $f$ in $B(s_0, r/2)$, with multiple zeros counted as many times as is the order of the zero.

Proof. Let us define

$$g(s) := f(s) \prod_{\rho} (s - \rho)^{-1}, \tag{2.29}$$

where the product is taken over the zeros $\rho$ of $f$ in $B(s_0, r/2)$, with multiple zeros appearing as many times as is the order of the zero. Because the function $f(s)$ is analytic and the product exactly cancels the zeros of $f$ inside $B(s_0, r/2)$, $g(s)$ is analytic in $B(s_0, r)$ and does not vanish in $B(s_0, r/2)$. On $\partial B(s_0, r)$, we have $|s - \rho| > r/2 > |s_0 - \rho|$, which implies that

$$\left| \frac{g(s)}{g(s_0)} \right| = \left| \frac{f(s)}{f(s_0)} \prod_{\rho} \frac{s_0 - \rho}{s - \rho} \right| \le \left| \frac{f(s)}{f(s_0)} \right| \le e^{M}.$$

This holds for all $s \in B(s_0, r)$ by the maximum modulus principle (see Ahlfors, 1979). Therefore the function

$$h(s) := \log\left( \frac{g(s)}{g(s_0)} \right) \tag{2.30}$$

is analytic in $B(s_0, r/2)$, and we have $h(s_0) = 0$ and $\operatorname{Re}(h(s)) \le M$. Hence by the Borel-Carathéodory theorem B.5 we have $|h(s)| \le 6M$ for all $s \in B(s_0, 3r/8)$. Hence for $s \in B(s_0, r/4)$ we have

$$|h'(s)| = \left| \frac{1}{2\pi i} \int_{\partial B(s, r/8)} \frac{h(z)}{(z - s)^{2}}\, dz \right| \le \frac{1}{2\pi} \int_{\partial B(s, r/8)} \left| \frac{h(z)}{(z - s)^{2}} \right| |dz| \le \frac{8}{r} \sup_{z \in \partial B(s, r/8)} |h(z)| \le 48\, \frac{M}{r},$$

which is what we claimed since

$$h'(s) = \frac{d}{ds} \left( \log \frac{f(s)}{f(s_0)} + \sum_{\rho} \log \frac{s_0 - \rho}{s - \rho} \right) = \frac{f'(s)}{f(s)} - \sum_{\rho} \frac{1}{s - \rho}.$$

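Lemma 2.3. Suppose that $f$ satisfies the conditions of the previous lemma and suppose that $f(s)$ does not vanish in the right half of the disk $B(s_0, r/2)$. Then

$$-\operatorname{Re}\left( \frac{f'(s_0)}{f(s_0)} \right) \le 48\, \frac{M}{r}. \tag{2.31}$$

Also, if $f$ has a zero $\rho_0$ on the line segment joining $s_0 - r/2$ and $s_0$, then

$$-\operatorname{Re}\left( \frac{f'(s_0)}{f(s_0)} \right) \le 48\, \frac{M}{r} - \frac{1}{s_0 - \rho_0}. \tag{2.32}$$

Proof. Note that $-\operatorname{Re} z \le |z|$. Hence by the previous lemma

$$-\operatorname{Re}\left( \frac{f'(s_0)}{f(s_0)} - \sum_{\rho} \frac{1}{s_0 - \rho} \right) \le \left| \frac{f'(s_0)}{f(s_0)} - \sum_{\rho} \frac{1}{s_0 - \rho} \right| \le 48\, \frac{M}{r},$$

which implies both of the claims, since $\operatorname{Re}\left(1/(s_0 - \rho)\right) \ge 0$ by our assumptions for all zeros $\rho$.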

Lemma 2.4. Assume that $f$ satisfies the conditions of Lemma 2.2 and that

$$\left| \frac{f'(s_0)}{f(s_0)} \right| \le \frac{M}{r}. \tag{2.33}$$

Suppose that $f(s) \ne 0$ in the part $\sigma \ge \sigma_0 - 2r_0$ of $B(s_0, r)$, where $0 < r_0 < r/8$. Then for all $s \in B(s_0, r_0)$ we have

$$\left| \frac{f'(s)}{f(s)} \right| \le 96\, \frac{M}{r}. \tag{2.34}$$

Proof. From Lemma 2.2 we now get

$$-\operatorname{Re}\left( \frac{f'(s)}{f(s)} \right) \le 48\, \frac{M}{r} - \sum_{\rho} \operatorname{Re} \frac{1}{s - \rho} \le 48\, \frac{M}{r}$$

for all $s \in B(s_0, r/4)$ such that $\sigma \ge \sigma_0 - 2r_0$. The claim now follows from the Borel-Carathéodory theorem B.5 by using the radii $2r_0$ and $r_0$.

We are now ready to prove the following theorem concerning the roots of $\zeta(s)$. The proof is attributed to Landau in Titchmarsh (1951). Recall that for $\sigma > 1$ the non-vanishing of $\zeta(s)$ was deduced from the Euler product. The idea is to show that the Euler product `leaks' over the line $\sigma = 1$ even though we cannot show that the product converges for any $s = \sigma + it$ such that $\sigma \le 1$. The proof makes use of the fact that $3 + 4\cos x + \cos 2x = 2(1 + \cos x)^{2} \ge 0$ for all real $x$. This same observation was originally used by de la Vallée-Poussin in the proof that $\zeta(s) \ne 0$ on the line $\sigma = 1$. For a reader who is not familiar with this proof, it might be useful to read de la Vallée-Poussin's proof first before proceeding to the proof of the following more general and precise statement. The proof can be found in Titchmarsh (1951, p. 41).
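The trigonometric fact used above follows from the double-angle formula $\cos 2x = 2\cos^{2} x - 1$; spelled out as a short derivation:

```latex
\begin{align*}
3 + 4\cos x + \cos 2x
  &= 3 + 4\cos x + \left(2\cos^{2} x - 1\right) \\
  &= 2 + 4\cos x + 2\cos^{2} x \\
  &= 2\left(1 + \cos x\right)^{2} \;\ge\; 0 .
\end{align*}
```

The weights $3$, $4$, $1$ are exactly those that appear later in the combination of $\zeta'/\zeta$ evaluated at $\sigma_0$, $\sigma_0 + i\gamma$ and $\sigma_0 + 2i\gamma$.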

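Theorem 2.5. (Landau). Let $\varphi(t)$ and $1/\theta(t)$ be strictly positive non-decreasing functions for $t \ge 0$, such that $\varphi(t) \to \infty$, $\theta(t) \le 1$ and

$$\frac{\varphi(t)}{\theta(t)} = o\left(e^{\varphi(t)}\right). \tag{2.35}$$

Suppose that

$$\zeta(s) = \mathcal{O}\left(e^{\varphi(t)}\right) \tag{2.36}$$

as $s \to \infty$ in the region $1 - \theta(t) \le \sigma \le 3$. Then there exists a constant $A$ such that $\zeta(s) \ne 0$ in the region

$$\sigma \ge 1 - A\, \frac{\theta(2t + 1)}{\varphi(2t + 1)}. \tag{2.37}$$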

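Proof. Suppose that $\zeta(\beta + i\gamma) = 0$, $\gamma > 0$. Our aim is to show that there exists a constant $A > 0$ such that

$$\beta < 1 - A\, \frac{\theta(2\gamma + 1)}{\varphi(2\gamma + 1)}. \tag{2.38}$$

Assume that

$$1 + e^{-\varphi(2\gamma + 1)} \le \sigma_0 \le 2 \tag{2.39}$$

and define

$$s_0 = \sigma_0 + i\gamma, \quad s_1 = \sigma_0 + 2i\gamma, \quad r = \theta(2\gamma + 1). \tag{2.40}$$

The exact value of $\sigma_0$ will be specified later. Because $\theta(t)$ is non-increasing, both of the disks $B(s_0, r)$ and $B(s_1, r)$ lie in the region $\sigma \ge 1 - \theta(t)$. For some constant $A_1$ we have

$$\left| \frac{1}{\zeta(s_0)} \right| = \left| \sum_{n=1}^{\infty} \frac{\mu(n)}{n^{s_0}} \right| \le \sum_{n=1}^{\infty} n^{-\sigma_0} = \zeta(\sigma_0) \le \frac{A_1}{\sigma_0 - 1} \le A_1 e^{\varphi(2\gamma + 1)}, \tag{2.41}$$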

15 where the second inequality follows from Theorem 2.2 by choosing N = 1. Same holds for 1/ζ(s1). Therefore by our hypothesis on the growth of ζ(s), there exists an absolute constant A2, such that

ζ(s) ζ(s) (2.42) eA2φ(2γ+1), eA2φ(2γ+1) ζ(s ) ≤ ζ(s ) ≤ 0 1

in the disks B(s0, r) and B( s1, r) respectively. Let us rst assume that for the real part of the zero we have β > σ0 r/2. Since ζ(s) = 0 for σ > 1, we may use Lemma 2.3 with M = A φ(2γ + 1). Hence have− for some 6 2 constant A3 ζ (σ + 2iγ) φ(2γ + 1) (2.43) Re 0 0 < A , − ζ(σ + 2iγ) 3 θ(2γ + 1)  0  ζ (σ + iγ) φ(2γ + 1) 1 (2.44) Re 0 0 < A . − ζ(σ + iγ) 3 θ(2γ + 1) − σ β  0  0 − Because ζ(s) has a simple pole with residue 1 at s = 1, we have

ζ0(σ0) 1 − ζ(σ ) ∼ σ 1 0 0 − as σ 1 and hence 0 → ζ (σ ) a (2.45) 0 0 , − ζ(σ ) ≤ σ 1 0 0 − where a can be made arbitrarily close to 1 by the choice of σ0. From Corollary 2.1 we obtain

ζ (σ ) ζ (s ) ζ (s ) ∞ Λ(n) (2.46) 3 0 0 4Re 0 0 Re 0 1 = (3 + 4 cos(α) + cos(2α)) 0, − ζ(σ ) − ζ(s ) − ζ(s ) nσ0 ≥ 0 0 1 n=1 X where α = γ log n and 3 + 4 cos α + cos 2α = 2(1 + cos α)2 0. We have already estimated the terms on the left-hand≥ side in (2.43), (2.44) and (2.45) from above, which combined give us

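\[ \frac{3a}{\sigma_0 - 1} + 5A_3\frac{\phi(2\gamma+1)}{\theta(2\gamma+1)} - \frac{4}{\sigma_0 - \beta} > 0. \]
From this we can solve
\[ 1 - \beta > \left(\frac{3a}{4(\sigma_0 - 1)} + \frac{5A_3}{4}\frac{\phi(2\gamma+1)}{\theta(2\gamma+1)}\right)^{-1} - (\sigma_0 - 1) \tag{2.47} \]
\[ = \left(1 - \frac{3a}{4} - \frac{5A_3}{4}\frac{\phi(2\gamma+1)(\sigma_0 - 1)}{\theta(2\gamma+1)}\right)\left(\frac{3a}{4(\sigma_0 - 1)} + \frac{5A_3}{4}\frac{\phi(2\gamma+1)}{\theta(2\gamma+1)}\right)^{-1}. \tag{2.48} \]
By our assumption (2.35) we have
\[ \frac{\phi(t)}{\theta(t)} = o\left(e^{\phi(t)}\right). \]
Hence
\[ e^{-\phi(t)} = o\left(\frac{\theta(t)}{\phi(t)}\right), \]
and we may choose
\[ \sigma_0 := 1 + \frac{1}{40A_3}\frac{\theta(2\gamma+1)}{\phi(2\gamma+1)} \ge 1 + e^{-\phi(2\gamma+1)} \tag{2.49} \]
without violating the assumption (2.39). To make the first factor in (2.48) positive, we choose $a = 5/4$. This yields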

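\[ \beta < 1 - \left(1 - \frac{15}{16} - \frac{5}{160}\right)\left(\frac{75A_3}{2}\frac{\phi(2\gamma+1)}{\theta(2\gamma+1)} + \frac{5A_3}{4}\frac{\phi(2\gamma+1)}{\theta(2\gamma+1)}\right)^{-1} = 1 - \frac{\theta(2\gamma+1)}{1240A_3\,\phi(2\gamma+1)}, \]
where the first term in the second bracket is $3a/(4(\sigma_0 - 1))$ evaluated at $a = 5/4$ and $\sigma_0$ as in (2.49). This proves the claim.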

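If $\beta \le \sigma_0 - r/2$, the theorem follows immediately, because by choosing $\sigma_0$ as in (2.49) we obtain that
\[ \beta \le 1 + \frac{\theta(2\gamma+1)}{40A_3\,\phi(2\gamma+1)} - \frac{1}{2}\theta(2\gamma+1) < 1 - A_4\frac{\theta(2\gamma+1)}{\phi(2\gamma+1)}, \]
since $\phi(t) \to \infty$ and $\theta(t) \le 1$.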
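The arithmetic behind the constant 1240 and the trigonometric identity used in (2.46) can be checked mechanically. The following is a minimal sketch, not part of the original argument: it works in units where $A_3\phi(2\gamma+1)/\theta(2\gamma+1) = 1$, so that (2.49) reads $\sigma_0 - 1 = 1/40$, and the names `first_factor`, `second_factor`, `gain` and `trig_gap` are illustrative only.

```python
from fractions import Fraction
import math

# Work in units where A_3 * phi(2*gamma + 1) / theta(2*gamma + 1) = 1 (an assumption
# made only to keep the arithmetic exact); then (2.49) gives sigma_0 - 1 = 1/40.
a = Fraction(5, 4)                 # the choice a = 5/4 from the proof
sigma0_minus_1 = Fraction(1, 40)

# The two factors of (2.48) in these units.
first_factor = 1 - 3 * a / 4 - Fraction(5, 4) * sigma0_minus_1
second_factor = 3 * a / (4 * sigma0_minus_1) + Fraction(5, 4)

# 1 - beta > gain * theta / (A_3 * phi)
gain = first_factor / second_factor


def trig_gap(alpha: float) -> float:
    """Difference between 3 + 4cos(alpha) + cos(2 alpha) and 2(1 + cos(alpha))^2."""
    return (3 + 4 * math.cos(alpha) + math.cos(2 * alpha)
            - 2 * (1 + math.cos(alpha)) ** 2)


print(first_factor, second_factor, gain)
```

Exact rational arithmetic is used so that the constant is reproduced without rounding.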

We will also need the following growth estimate for $\zeta'(s)/\zeta(s)$. The proof is again from Titchmarsh (1951). The idea is to fix $s_0 = 1 + \delta + it_0$ and to use Lemma 2.4 in conjunction with the above theorem to obtain an estimate for $\zeta'(s)/\zeta(s)$ inside a small disk centred at $s_0$. The constants in this estimate will be independent of the choice of $t_0$.

Theorem 2.6. With the same assumptions as in Theorem 2.5 we have
\[ \frac{\zeta'(s)}{\zeta(s)} = O\left(\frac{\phi(2t+3)}{\theta(2t+3)}\right) \tag{2.50} \]
uniformly as $s \to \infty$ in the region
\[ \sigma \ge 1 - \frac{A}{4}\frac{\theta(2t+3)}{\phi(2t+3)}. \tag{2.51} \]

Proof. Set
\[ s_0 = 1 + \frac{A}{2}\frac{\theta(2t_0+3)}{\phi(2t_0+3)} + it_0, \qquad r = \theta(2t_0+3) \le 1. \tag{2.52} \]

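Then in the disk $B(s_0, r)$ we have for some constant $B > 1$ that
\[ \frac{\zeta(s)}{\zeta(s_0)} = O\left(\frac{e^{\phi(t)}}{\sigma_0 - 1}\right) = O\left(\frac{\phi(2t_0+3)}{\theta(2t_0+3)}\,e^{\phi(t_0+1)}\right) = O\left(e^{B\phi(2t_0+3)}\right), \tag{2.53} \]
\[ \frac{\zeta'(s_0)}{\zeta(s_0)} = O\left(\frac{1}{\sigma_0 - 1}\right) = O\left(\frac{\phi(2t_0+3)}{\theta(2t_0+3)}\right) = O\left(\frac{\phi(2t_0+3)}{r}\right), \tag{2.54} \]
where $\sigma_0 = \operatorname{Re} s_0$ and the $O$-constants are independent of $t_0$. Using the previous theorem we have $\zeta(s) \ne 0$ for all $t \le t_0 + 1$ in the region
\[ \sigma \ge 1 - A\,\frac{\theta(2t_0+3)}{\phi(2t_0+3)}. \tag{2.55} \]

Therefore we may use Lemma 2.4 with $M = B\phi(2t_0+3)$ and
\[ 2r' = \frac{3A}{2}\frac{\theta(2t_0+3)}{\phi(2t_0+3)}, \]
since $\zeta(s) \ne 0$ for $s \in B(s_0, r) \cap \{\sigma > \sigma_0 - 2r'\}$. Hence
\[ \frac{\zeta'(s)}{\zeta(s)} = O\left(\frac{\phi(2t_0+3)}{\theta(2t_0+3)}\right) \tag{2.56} \]
uniformly for all $t_0$ and for all $s$ such that
\[ |s - s_0| \le \frac{3A}{4}\frac{\theta(2t_0+3)}{\phi(2t_0+3)}. \tag{2.57} \]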

For $s = \sigma + it_0$ this gives us that for some constant $C$ which is independent of $t_0$ we have
\[ \left|\frac{\zeta'(\sigma + it_0)}{\zeta(\sigma + it_0)}\right| \le C\,\frac{\phi(2t_0+3)}{\theta(2t_0+3)} \]
for all
\[ \sigma \ge 1 - \frac{A}{4}\frac{\theta(2t_0+3)}{\phi(2t_0+3)}, \]
which is the theorem with $t$ replaced by $t_0$.

3 Van Der Corput's Method

In this chapter we will give an estimate for the growth of the $\zeta$-function on the critical line $\sigma = \frac{1}{2}$. Because of symmetry we can restrict our observations to the upper half-plane $t > 0$. Let us first motivate the technical considerations of this chapter. Recall that by Theorem 2.2 we have
\[ \zeta(s) = \sum_{n=1}^{N} n^{-s} + \frac{N^{1-s}}{s-1} - s\int_N^{\infty} x^{-s-1}(x - \lfloor x\rfloor)\,dx \tag{3.1} \]
for any given $N$ in the region $\sigma > 0$. Let us write the integrand on the right-hand side as a product of a monotone function and an oscillating function
\[ x^{-s-1}(x - \lfloor x\rfloor) = x^{-\sigma-1}\cdot x^{-it}(x - \lfloor x\rfloor). \]
The oscillating part of the integrand is
\[ x^{-it}(x - \lfloor x\rfloor) = e^{-it\log x}(x - \lfloor x\rfloor). \]
For $x \gg t$ we have
\[ \frac{d}{dx}\,t\log x = \frac{t}{x} \ll 1. \]
Hence for $x \gg t$ the oscillation of the exponential term is slow in comparison to the oscillations of $x - \lfloor x\rfloor$. Therefore we expect that cancellations occur. The integrand decays like $x^{-\sigma-1}$ as $x \to \infty$ and therefore for $N \gg t$ the contribution from the integral is expected to be small. The second term on the right-hand side of (3.1) is also small for $N \gg t$, which leads us to write that
\[ \zeta(s) \approx \sum_{n\le Ct} n^{-s} \]
for some constant $C$ in the domain $\sigma > 0$. This fact will be proven rigorously in Theorem 3.3. For this purpose we will develop a method due to van der Corput for estimating exponential sums of the type
\[ \sum_{a<n\le b} n^{-it} = \sum_{a<n\le b} e\left(-\frac{t}{2\pi}\log n\right), \tag{3.2} \]
where $e(x) := e^{2\pi i x}$.

Lemma 3.1. Assume that $f(x)$ and $g(x)$ are real functions on $[a, b]$ such that $f$ is differentiable, $g$ is continuous, $f'(x)/g(x)$ is monotonic and $f'(x)/g(x) \ge m > 0$ or $f'(x)/g(x) \le -m < 0$. Then
\[ \left|\int_a^b g(x)e(f(x))\,dx\right| \le \frac{2}{\pi m}. \tag{3.3} \]

Proof. Without loss of generality we may assume that $g(x)/f'(x)$ is positive and decreasing. Using the second mean-value theorem for integration (Apostol, 1978, p. 165) we get for the real part of the integral (3.3)
\[
\begin{aligned}
\int_a^b g(x)\cos(2\pi f(x))\,dx &= \int_a^b \frac{g(x)}{f'(x)}\,f'(x)\cos(2\pi f(x))\,dx \\
&= \frac{g(a)}{f'(a)}\int_a^{\xi} f'(x)\cos(2\pi f(x))\,dx \\
&= \frac{g(a)}{2\pi f'(a)}\left(\sin(2\pi f(\xi)) - \sin(2\pi f(a))\right)
\end{aligned}
\]
for some $\xi \in (a, b)$. By our assumptions the modulus of the last expression is clearly not greater than $1/\pi m$. The same holds for the imaginary part, which gives us the lemma.

Lemma 3.2. Let $f(x)$ be a real and twice differentiable function and let $g(x)$ be continuous in $[a, b]$. Suppose that $f''(x) \ge r > 0$ or $f''(x) \le -r < 0$ and that $|g(x)| \le M$. Then
\[ \left|\int_a^b g(x)e(f(x))\,dx\right| \le \frac{4\sqrt{2}\,M}{\sqrt{\pi r}}. \tag{3.4} \]

Proof. We may assume that $f''(x) \ge r > 0$. Then $f'(x)$ is strictly increasing and therefore vanishes at most once in the interval $[a, b]$, say at the point $c$. For a suitable $\delta > 0$ set
\[ I = \int_a^b g(x)e(f(x))\,dx = \int_a^{c-\delta} + \int_{c-\delta}^{c+\delta} + \int_{c+\delta}^{b} =: I_1 + I_2 + I_3, \]
where $a + \delta < c < b - \delta$. For $x \ge c + \delta$ or $x \le c - \delta$ we have
\[ |f'(x)| = \left|\int_c^x f''(u)\,du\right| \ge r|x - c| \ge r\delta. \]
Therefore our previous lemma gives us
\[ |I_1| + |I_3| \le \frac{4M}{\pi r\delta}. \]
Trivially we have $|I_2| \le 2M\delta$. Hence
\[ |I| \le \frac{4M}{\pi r\delta} + 2M\delta. \]

The lemma follows now by choosing $\delta = \sqrt{2/\pi r}$. If $c \le a + \sqrt{2/\pi r}$ or $c \ge b - \sqrt{2/\pi r}$, a similar argument gives the result. If $f'$ does not vanish, we set $c = a$ or $c = b$ depending on whether $f'$ is positive or negative. Then we divide the integration into two parts, and use the same technique to obtain the bound.
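As a numerical sanity check of Lemma 3.2, the sketch below approximates $\int_0^1 e(rx^2/2)\,dx$, for which $f''(x) = r$, $g = 1$ and $M = 1$, and compares its modulus with the right-hand side of (3.4). The test function, the midpoint rule and the function names are choices made here for illustration; they do not come from the text.

```python
import cmath
import math


def e(x: float) -> complex:
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def oscillatory_integral(r: float, steps: int = 20_000) -> complex:
    """Midpoint-rule approximation of the integral of e(r*x^2/2) over [0, 1]."""
    h = 1.0 / steps
    return h * sum(e(0.5 * r * ((j + 0.5) * h) ** 2) for j in range(steps))


def lemma_bound(r: float, M: float = 1.0) -> float:
    """Right-hand side of (3.4) for |g| <= M and f'' >= r."""
    return 4 * math.sqrt(2) * M / math.sqrt(math.pi * r)


r = 50.0
value = oscillatory_integral(r)
print(abs(value), lemma_bound(r))
```

For this particular $f$ the integral is a rescaled Fresnel integral, so its modulus is of order $r^{-1/2}$, the same order as the bound.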

3.2 Exponential Sums

The next theorem relates exponential sums to exponential integrals via the Poisson summation formula B.4. The proof is from Tenenbaum (1995, Ch. I.6). We use Riemann-Stieltjes integration in the computations below. For a reader who is not familiar with this, we suggest Apostol (1978, Ch. 7).

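Theorem 3.1. Assume that $f$ is differentiable and $f'$ is monotonic in the interval $[a, b]$ and let $\alpha$ and $\beta$ be the values of $f'(x)$ at the endpoints $a$ and $b$, $\alpha < \beta$. Then for every $\epsilon > 0$ we have
\[ \sum_{a<n\le b} e(f(n)) = \sum_{\alpha-\epsilon<\nu<\beta+\epsilon}\int_a^b e(f(x) - \nu x)\,dx + O(\log(\beta - \alpha + 2)). \tag{3.5} \]

Proof. We may assume that $-1 \le \alpha - \epsilon < 0$, because we can replace $f(x)$ by $f(x) + kx$ for a suitable integer $k$. We can also assume that $a$ and $b$ are of the form $m + 1/2$ for an integer $m$. By doing so the error on the left-hand side is $O(1)$. To estimate the error on the right-hand side, we note the trivial estimate
\[
\begin{aligned}
\left|\sum_{\alpha-\epsilon<\nu<\beta+\epsilon} e(-\nu x)\right| &\le \min\left\{\beta - \alpha + 2,\ \left|e(\lceil\alpha - \epsilon\rceil x)\,\frac{1 - e((\lfloor\beta + \epsilon\rfloor + 1)x)}{1 - e(x)}\right|\right\} \\
&\le \min\left\{\beta - \alpha + 2,\ \frac{2}{|1 - e(x)|}\right\} \\
&= O\left(\min\left\{\beta - \alpha + 2,\ \frac{1}{\|x\|}\right\}\right),
\end{aligned}
\]
where $\|x\|$ is the distance of $x$ from the nearest integer and $\lceil\alpha - \epsilon\rceil$ is the least integer greater than or equal to $\alpha - \epsilon$. Interchanging summation and integration, we get for the error in the lower endpoint of the integration interval, if, say, $a \le \lfloor a\rfloor + (\beta - \alpha + 2)^{-1}$,
\[ \left|\int_a^{\lfloor a\rfloor + \frac{1}{2}} e(f(x))\sum_{\nu} e(-\nu x)\,dx\right| \ll \int_a^{\lfloor a\rfloor + \frac{1}{2}} \min\left\{\beta - \alpha + 2,\ \frac{1}{\|x\|}\right\}dx \tag{3.6} \]
\[ = \int_{\lfloor a\rfloor + (\beta-\alpha+2)^{-1}}^{\lfloor a\rfloor + \frac{1}{2}} \frac{dx}{x - \lfloor a\rfloor} + \int_a^{\lfloor a\rfloor + (\beta-\alpha+2)^{-1}} (\beta - \alpha + 2)\,dx \tag{3.7} \]
\[ \ll \log(\beta - \alpha + 2). \tag{3.8} \]
If $\lfloor a\rfloor + (\beta - \alpha + 2)^{-1} \le a \le \lfloor a\rfloor + 1/2$, then we have only the first integral in (3.7). Similarly, if $a \ge \lfloor a\rfloor + 1/2$, we divide the integration from $a$ to $\lfloor a\rfloor + 3/2$ suitably to obtain the same estimate. The same estimates hold for the upper endpoint of integration $b$. Thus, we may assume that $a$ and $b$ are of the form $m + 1/2$. Finally, we may also assume that $f'$ is decreasing on $[a, b]$. If we now define $e(f(x))$ outside the interval $[a, b]$ to be identically zero, we get by the Poisson summation formula B.4 that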

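\[ \sum_{a<n\le b} e(f(n)) = \sum_{|\nu|\le N}\widehat{e(f)}(\nu) + o(1), \qquad N \to \infty. \]
The terms with $\nu \in [0, \beta + \epsilon)$ make up the sum of integrals in (3.5), so it remains to show that
\[ \sum_{|\nu|\le N,\ \nu\notin[0,\beta+\epsilon)}\widehat{e(f)}(\nu) = O(\log(\beta + 2)). \]
Now partial integration gives us for $\nu \notin [0, \beta + \epsilon)$
\[
\begin{aligned}
2\pi i\,\widehat{e(f)}(\nu) &= 2\pi i\int_a^b e(f(x) - \nu x)\,dx = \int_a^b \frac{1}{f'(x) - \nu}\,d\big(e(f(x) - \nu x)\big) \\
&= \left[\frac{e(f(x) - \nu x)}{f'(x) - \nu}\right]_a^b - \int_a^b e(f(x) - \nu x)\,d\left(\frac{1}{f'(x) - \nu}\right).
\end{aligned}
\]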

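For the second term on the right-hand side we note that $f'(x) - \nu$ does not vanish and is monotone decreasing, so that the integral is well defined and we have
\[ \left|\int_a^b e(f(x) - \nu x)\,d\left(\frac{1}{f'(x) - \nu}\right)\right| \le \int_a^b\left|d\left(\frac{1}{f'(x) - \nu}\right)\right| = \left|\frac{1}{\alpha - \nu} - \frac{1}{\beta - \nu}\right|. \]
Since we assumed that $-1 \le \alpha - \epsilon < 0$, we have for all integers $\nu \notin [0, \beta + \epsilon)$
\[ \left|\frac{1}{\alpha - \nu} - \frac{1}{\beta - \nu}\right| = \frac{\beta - \alpha}{(\nu - \alpha)(\nu - \beta)} \ll \frac{\beta + 1}{\nu(\nu - \beta)}. \]
Hence the sum of these terms over $|\nu| \le N$, $\nu \notin [0, \beta + \epsilon)$ is
\[ \ll \sum_{\nu\notin[0,\beta+\epsilon)}\frac{\beta + 1}{\nu(\nu - \beta)} = \sum_{\nu=1}^{\infty}\frac{\beta + 1}{\nu(\nu + \beta)} + \sum_{\nu\ge\beta+\epsilon}\frac{\beta + 1}{\nu(\nu - \beta)}, \]
where we have divided the sum into those $\nu$ that are negative and those that are at least $\beta + \epsilon$. This is equal to
\[
\begin{aligned}
&\sum_{1\le\nu\le\beta}\frac{\beta + 1}{\nu(\nu + \beta)} + \sum_{\nu>\beta}\frac{\beta + 1}{\nu(\nu + \beta)} + \sum_{\beta+\epsilon\le\nu\le 2\beta}\frac{\beta + 1}{\nu(\nu - \beta)} + \sum_{\nu>2\beta}\frac{\beta + 1}{\nu(\nu - \beta)} \\
&\quad\ll \sum_{1\le\nu\le\beta}\nu^{-1} + (\beta + 1)\sum_{\nu>\beta}\nu^{-2} + \frac{1}{\epsilon} + \sum_{1\le\nu\le\beta}\nu^{-1} + (\beta + 1)\sum_{\nu>\beta}\nu^{-2} \\
&\quad\ll (\beta + 1)\sum_{\nu>\beta}\nu^{-2} + \sum_{1\le\nu\le\beta}\nu^{-1} \ll \log(\beta + 2).
\end{aligned}
\]
For the first term, recall that $a$ and $b$ are both of the form $m + 1/2$ for some integer $m$. Therefore
\[ \left[\frac{e(f(x) - \nu x)}{f'(x) - \nu}\right]_a^b = (-1)^{\nu}\frac{e(f(b))}{\alpha - \nu} - (-1)^{\nu}\frac{e(f(a))}{\beta - \nu}, \]
and we get two sums of terms with alternating signs. Therefore the error is
\[ \ll \left|\sum_{|\nu|\le N,\ \nu\notin[0,\beta+\epsilon)}(-1)^{\nu+1}\nu^{-1}\right| + \left|\sum_{|\nu|\le N,\ \nu\notin[0,\beta+\epsilon)}(-1)^{\nu}(\nu - \beta)^{-1}\right| = O(1). \]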

The theorem now follows.

Next we generalize the above theorem. The proof is the author's own and is not contained in Tenenbaum (1995), although the idea is essentially the same as in the proof of the previous theorem. The theorem can be found in Titchmarsh (1951) along with a different proof.

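Theorem 3.2. Let $f(x)$ satisfy the conditions of the previous theorem, and suppose that $g(x)$ is a real non-negative decreasing function with a continuous derivative. Assume also that $|g'(x)|$ is decreasing. Then for every $\epsilon > 0$
\[ \sum_{a<n\le b} g(n)e(f(n)) = \sum_{\alpha-\epsilon<\nu<\beta+\epsilon}\int_a^b g(x)e(f(x) - \nu x)\,dx + O(g(a)\log(\beta - \alpha + 2)) + O(|g'(a)|). \tag{3.9} \]

Proof. As in the proof of the previous theorem, moving $a$ and $b$ to points of the form $m + 1/2$ changes the left-hand side by
\[ O(g(a) + g(b)) = O(g(a)\log(\beta - \alpha + 2)), \]
since $g$ is decreasing. For the error on the right-hand side, in the lower endpoint $a$ if, say, $a \le \lfloor a\rfloor + (\beta - \alpha + 2)^{-1}$, we compute
\[
\begin{aligned}
\sum_{\alpha-\epsilon<\nu<\beta+\epsilon}\int_a^{\lfloor a\rfloor + \frac{1}{2}} g(x)e(f(x) - \nu x)\,dx &\ll \int_a^{\lfloor a\rfloor + \frac{1}{2}} g(x)\left|e(f(x))\sum_{\nu} e(-\nu x)\right|dx \\
&\ll g(a)\int_a^{\lfloor a\rfloor + \frac{1}{2}}\min\left\{\beta - \alpha + 2,\ \frac{1}{\|x\|}\right\}dx,
\end{aligned}
\]
which is equal to
\[ g(a)\int_{\lfloor a\rfloor + \frac{1}{\beta-\alpha+2}}^{\lfloor a\rfloor + \frac{1}{2}}\frac{dx}{x - \lfloor a\rfloor} + g(a)\int_a^{\lfloor a\rfloor + \frac{1}{\beta-\alpha+2}}(\beta - \alpha + 2)\,dx \ll g(a)\log(\beta - \alpha + 2). \]
Similarly for the other cases and the upper endpoint $b$. Thus we may assume that $a$ and $b$ are of the form $m + 1/2$. Let $F(x) = g(x)e(f(x))$ for $x \in [a, b]$ and $F(x) = 0$ elsewhere. We have
\[ \sum_{a<n\le b} F(n) = \sum_{|\nu|\le N}\widehat{F}(\nu) + o(1), \qquad N \to \infty. \]

\[ \left[g'(x)\frac{e(f(x) - \nu x)}{(f'(x) - \nu)^2}\right]_a^b - \int_a^b\frac{e(f(x) - \nu x)}{(f'(x) - \nu)^2}\,dg'(x) - \int_a^b g'(x)e(f(x) - \nu x)\,d\left(\frac{1}{(f'(x) - \nu)^2}\right). \]
All the terms in the above expression are clearly $O(|g'(a)|/\nu^2)$, so that the sum over them is $O(|g'(a)|)$. The theorem follows from this.

We now combine the above theorem with Theorem 2.2 to get

Theorem 3.3. We have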

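\[ \zeta(s) = \zeta(\sigma + it) = \sum_{n\le u} n^{-s} - \frac{u^{1-s}}{1-s} + O(u^{-\sigma}), \qquad |t| \to \infty, \tag{3.10} \]
uniformly in $\sigma \ge \delta > 0$, for all $u = u(t) \ge C|t|/2\pi$, where $C > 1$ is any constant. In particular, for $u = |t|$ we have
\[ \zeta(s) = \sum_{n\le|t|} n^{-s} + O(|t|^{-\sigma}), \qquad |t| \to \infty. \tag{3.11} \]

Proof. We have by Theorem 2.2 that
\[
\begin{aligned}
\zeta(s) &= \sum_{n=1}^{N} n^{-s} - \frac{N^{1-s}}{1-s} - s\int_N^{\infty} x^{-s-1}(x - \lfloor x\rfloor)\,dx \\
&= \sum_{n=1}^{N} n^{-s} - \frac{N^{1-s}}{1-s} + O\left(\frac{|s|}{N^{\sigma}}\right).
\end{aligned}
\]
Note that
\[ n^{-s} = n^{-\sigma}e\left(-\frac{t}{2\pi}\log n\right). \]
Using the notation of the previous theorem, we set $g(x) = x^{-\sigma}$ and
\[ f(x) = -\frac{t\log x}{2\pi}, \qquad f'(x) = -\frac{t}{2\pi x}. \tag{3.12} \]
Therefore for $x \ge u \ge C|t|/2\pi$, $C > 1$, we have
\[ |f'(x)| \le \frac{|t|}{2\pi u} \le 1/C < 1. \]
Hence by the previous theorem
\[ \sum_{u<n\le N} n^{-s} = \sum_{u<n\le N} g(n)e(f(n)) = \int_u^N x^{-s}\,dx + O(u^{-\sigma}) = \frac{N^{1-s} - u^{1-s}}{1-s} + O(u^{-\sigma}). \]
Inserting this into
\[ \zeta(s) = \sum_{n\le u} n^{-s} + \sum_{u<n\le N} n^{-s} - \frac{N^{1-s}}{1-s} + O\left(\frac{|s|}{N^{\sigma}}\right) \]
and letting $N \to \infty$ gives (3.10). For $u = |t|$ we have
\[ \left|\frac{u^{1-s}}{1-s}\right| \ll \frac{|t|^{1-\sigma}}{|t|} = |t|^{-\sigma}, \qquad |t| \to \infty, \]
which gives (3.11).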
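Theorem 3.3 can be probed numerically without evaluating $\zeta(s)$ itself: by (3.10) the corrected partial sums $\sum_{n\le u} n^{-s} - u^{1-s}/(1-s)$ taken at two different cut-offs $u$ must agree up to $O(u^{-\sigma})$. The sketch below, with the illustrative name `corrected_partial_sum` and an arbitrarily chosen $s = 1/2 + 50i$, compares a few cut-offs.

```python
def corrected_partial_sum(s: complex, u: int) -> complex:
    """Sum of n^(-s) over n <= u, minus u^(1-s)/(1-s), as on the right of (3.10)."""
    partial = sum(n ** (-s) for n in range(1, u + 1))
    return partial - u ** (1 - s) / (1 - s)


s = complex(0.5, 50.0)  # sigma = 1/2, t = 50; any u >= C*t/(2*pi) is admissible
approximations = {u: corrected_partial_sum(s, u) for u in (1000, 2000, 4000)}
for u, value in approximations.items():
    print(u, value)
```

The printed values should differ from each other only by a quantity of size about $u^{-1/2}$, in line with the error term in (3.10).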

Remark 3.1. The previous theorem has a cute geometric interpretation. By writing
\[ \sum_{n\le u} n^{-s} = \zeta(s) + \frac{u^{1-s}}{1-s} + O(u^{-\sigma}) \tag{3.13} \]
we see that for $\sigma > 0$ the partial sums tend to a spiral of the form $u^{1-\sigma}u^{-it}/(1-s)$ centred at $\zeta(s)$ as $u \to \infty$. For $\sigma = 1$ the partial sums tend to a circle of radius $1/|1-s|$ centred at $\zeta(s)$.

To estimate the remaining sum we require the following two theorems. The proofs are again from Tenenbaum (1995, Ch. 6).

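Theorem 3.4. (Van der Corput 1). Let $f(x)$ be real, twice differentiable and let $|f''| \asymp \lambda_2$ in the interval $[a, b]$ for some constant $\lambda_2 > 0$. Then
\[ \sum_{a<n\le b} e(f(n)) \ll (b - a)\lambda_2^{\frac{1}{2}} + \lambda_2^{-\frac{1}{2}}. \tag{3.14} \]

Proof. Let us assume that $f'' > 0$. If $\lambda_2 \ge 1$, then the claim is trivially true, so assume that $\lambda_2 < 1$. Now Theorem 3.1 and Lemma 3.2 give us
\[ \sum_{a<n\le b} e(f(n)) \ll (\beta - \alpha + 1)\max_{\alpha-\epsilon<\nu<\beta+\epsilon}\left|\int_a^b e(f(x) - \nu x)\,dx\right| + \log(\beta - \alpha + 2) \ll (\beta - \alpha + 1)\lambda_2^{-\frac{1}{2}} + \log(\beta - \alpha + 2), \]
where
\[ \beta - \alpha = f'(b) - f'(a) = \int_a^b f''(x)\,dx = O((b - a)\lambda_2). \]
The theorem follows because
\[ \log(\beta - \alpha + 2) \ll \beta - \alpha + 2 \ll (b - a)\lambda_2 + 2 \ll (b - a)\lambda_2^{\frac{1}{2}} + \lambda_2^{-\frac{1}{2}}. \]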

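We will also need a similar theorem for the case when we have $|f'''| \asymp \lambda_3$. For this purpose we note that if we set $g_r(x) := f(x + r) - f(x)$, then
\[ g_r''(x) = f''(x + r) - f''(x) = rf'''(\xi) \]
for some $\xi \in (x, x + r)$. Therefore $|g_r''(x)| \asymp r\lambda_3$. The next lemma shows that we may estimate the sum of $e(f(n))$ by a double sum of $e(g_r(n))$ over $n$ and $r$. This reduces the case $|f'''| \asymp \lambda_3$ to the previous theorem. The parameter $q$ below is to be chosen later for an optimal estimate.

Lemma 3.3. Let $f$ be a real function and let $q$ be a positive integer not greater than $b - a$. Then
\[ \left|\sum_{a<n\le b} e(f(n))\right| \le \frac{2(b - a)}{q^{\frac{1}{2}}} + 2\left(\frac{b - a}{q}\sum_{r=1}^{q-1}\left|\sum_{a<n\le b-r} e(f(n + r) - f(n))\right|\right)^{\frac{1}{2}}. \tag{3.15} \]

Proof. We set $e(f(n)) := 0$ for $n \le a$ and $n > b$ for convenience. We have
\[ \left|\sum_n e(f(n))\right| = \frac{1}{q}\left|\sum_n\sum_{m=1}^{q} e(f(n + m))\right| \le \frac{1}{q}\sum_n\left|\sum_{m=1}^{q} e(f(n + m))\right|. \]
The outer sum runs over $n$ such that $2a - b \le a - q \le n \le b - 1$. Therefore there are at most $2(b - a)$ values of $n$ in the sum and we get by the Cauchy-Schwarz inequality
\[ \left|\sum_n e(f(n))\right| \le \frac{1}{q}\left(2(b - a)\sum_n\left|\sum_{m=1}^{q} e(f(n + m))\right|^2\right)^{\frac{1}{2}}. \tag{3.16} \]
Now
\[ \left|\sum_{m=1}^{q} e(f(n + m))\right|^2 = \sum_{m=1}^{q}\sum_{\mu=1}^{q} e(f(m + n) - f(\mu + n)) = q + 2\operatorname{Re}\sum_{1\le\mu<m\le q} e(f(m + n) - f(\mu + n)), \]
so that
\[ \sum_n\left|\sum_{m=1}^{q} e(f(n + m))\right|^2 \le 2(b - a)q + 2\left|\sum_n\sum_{1\le\mu<m\le q} e(f(m + n) - f(\mu + n))\right|. \]
Setting $\nu = \mu + n$ and $r = m - \mu$, each pair $(\nu, r)$ arises $q - r$ times, so the modulus of the double sum on the right is
\[ \left|\sum_{r=1}^{q-1}(q - r)\sum_{\nu} e(f(\nu + r) - f(\nu))\right| \le q\sum_{r=1}^{q-1}\left|\sum_{\nu} e(f(\nu + r) - f(\nu))\right|. \]

Inserting this into (3.16) yields
\[
\begin{aligned}
\left|\sum_n e(f(n))\right| &\le \frac{1}{q}\left(4(b - a)^2 q + 4(b - a)q\sum_{r=1}^{q-1}\left|\sum_{\nu} e(f(\nu + r) - f(\nu))\right|\right)^{\frac{1}{2}} \\
&\le \frac{2(b - a)}{q^{\frac{1}{2}}} + 2\left(\frac{b - a}{q}\sum_{r=1}^{q-1}\left|\sum_{\nu} e(f(\nu + r) - f(\nu))\right|\right)^{\frac{1}{2}}.
\end{aligned}
\]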
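Since (3.15) is an explicit inequality, it can be tested directly. The sketch below does so for the phase $f(x) = -\frac{t}{2\pi}\log x$ that is used later in the chapter, with integer endpoints and a few values of $q$. The parameter values and the function names `lhs` and `rhs` are chosen here purely for illustration.

```python
import cmath
import math


def e(x: float) -> complex:
    """e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def f(x: float, t: float = 1000.0) -> float:
    """Phase of n^(-it): f(x) = -(t / (2*pi)) * log(x)."""
    return -t * math.log(x) / (2 * math.pi)


def lhs(a: int, b: int) -> float:
    """Modulus of the sum of e(f(n)) over a < n <= b."""
    return abs(sum(e(f(n)) for n in range(a + 1, b + 1)))


def rhs(a: int, b: int, q: int) -> float:
    """Right-hand side of (3.15) for integers a < b and 1 <= q <= b - a."""
    inner = sum(
        abs(sum(e(f(n + r) - f(n)) for n in range(a + 1, b - r + 1)))
        for r in range(1, q)
    )
    return 2 * (b - a) / math.sqrt(q) + 2 * math.sqrt((b - a) / q * inner)


for q in (1, 5, 20, 100):
    print(q, lhs(100, 200), rhs(100, 200, q))
```

The output shows how the two terms on the right trade off against each other as $q$ grows, which is why $q$ is left free in the lemma.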

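Theorem 3.5. (Van der Corput 2). Let $f$ be real and three times continuously differentiable and suppose that $|f'''(x)| \asymp \lambda_3$. Then
\[ \sum_{a<n\le b} e(f(n)) \ll (b - a)\lambda_3^{\frac{1}{6}} + (b - a)^{\frac{1}{2}}\lambda_3^{-\frac{1}{6}}. \tag{3.17} \]

Proof. For $g_r(x) := f(x + r) - f(x)$ we have
\[ g_r''(x) = f''(x + r) - f''(x) = rf'''(\xi), \tag{3.18} \]
where $x < \xi < x + r$. Therefore $|g_r''(x)| \asymp r\lambda_3$, and we have by Theorem 3.4
\[ \sum_{a<n\le b-r} e(g_r(n)) \ll (b - a)(r\lambda_3)^{\frac{1}{2}} + (r\lambda_3)^{-\frac{1}{2}}. \]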

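\[ \sum_{a<n\le b} e(f(n)) \ll \frac{b - a}{q^{\frac{1}{2}}} + \left(\frac{b - a}{q}\sum_{r=1}^{q-1}\left[(b - a)(r\lambda_3)^{\frac{1}{2}} + (r\lambda_3)^{-\frac{1}{2}}\right]\right)^{\frac{1}{2}}; \]
in the remaining case the theorem is trivial.

We get our first main result from these two estimates.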

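Theorem 3.6. We have
\[ \zeta\left(\frac{1}{2} + it\right) = O\left(t^{\frac{1}{6}}\log t\right), \qquad t \to \infty. \tag{3.19} \]
Therefore, $\mu\left(\frac{1}{2}\right) \le \frac{1}{6}$.

Proof. Defining again $f(x) := -\frac{t}{2\pi}\log x$ we get
\[ f''(x) = \frac{t}{2\pi x^2}, \qquad f'''(x) = -\frac{t}{\pi x^3}. \tag{3.20} \]
Therefore, for all $a$ we have for all $x \in [a, 2a]$ that
\[ \frac{t}{8\pi a^2} \le |f''(x)| \le \frac{t}{2\pi a^2}, \qquad \frac{t}{8\pi a^3} \le |f'''(x)| \le \frac{t}{\pi a^3}. \tag{3.21} \]
Hence, to optimize our estimate of the sums
\[ \sum_{a<n\le b} n^{-it} \qquad (b \le 2a), \]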

\[ \sum_{a<n\le b} n^{-\frac{1}{2}-it} \ll t^{\frac{1}{6}} \]

1 2 1 1 it t a 2 1 n− 2 − + t 6 ,  a t  a
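The proof treats the sum over $n \le t$ one dyadic block $(a, b]$, $b \le 2a$, at a time, which is where the factor $\log t$ in (3.19) comes from. The sketch below only illustrates that bookkeeping: it splits $(1, N]$ into such blocks and evaluates $\sum n^{-1/2-it}$ block by block. The helper names `dyadic_blocks` and `block_sum` are hypothetical, and no claim is made about the implied constants in the van der Corput bounds.

```python
import math


def block_sum(t: float, a: int, b: int, sigma: float = 0.5) -> complex:
    """Sum of n^(-sigma - i*t) over a < n <= b."""
    return sum(n ** complex(-sigma, -t) for n in range(a + 1, b + 1))


def dyadic_blocks(N: int) -> list:
    """Split (1, N] into consecutive blocks (a, b] with a < b <= 2a."""
    blocks = []
    b = N
    while b > 1:
        a = (b + 1) // 2  # smallest integer a with b <= 2a
        blocks.append((a, b))
        b = a
    return blocks[::-1]


t = 1000.0
N = 1000  # by (3.11) the sum over n <= t approximates zeta(1/2 + it)
blocks = dyadic_blocks(N)
pieces = [block_sum(t, a, b) for a, b in blocks]
total = 1 + sum(pieces)  # the term n = 1 is not covered by any block

for (a, b), piece in zip(blocks, pieces):
    print((a, b), abs(piece))
print(abs(total), t ** (1 / 6) * math.log(t))
```

There are about $\log_2 N$ blocks, so a bound of size $t^{1/6}$ for each block yields $t^{1/6}\log t$ for the whole sum.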

3.3 Preparations For the Next Chapter

We will next generalize the previous theorems. The results will be used in the next chapter to estimate $\zeta(s)$ near the line $\sigma = 1$. We note that Lemma 3.3 was used to reduce the case of Theorem 3.5 ($|f'''| \asymp \lambda_3$) to that of Theorem 3.4 ($|g_r''| \asymp r\lambda_3$). The next lemma generalizes this idea: we can use Lemma 3.3 inductively to obtain an estimate for an exponential sum $\sum e(f(n))$ under the hypothesis that $|f^{(k)}| \asymp \lambda_k$. We will not give the proof here because it is quite messy and contains essentially no new ideas. The proof can be found in Titchmarsh (1951, p. 93).

Lemma 3.4. Assume that $f(x)$ is real and $k$ times continuously differentiable, where $k \ge 2$, and assume that $\lambda_k \le |f^{(k)}(x)| \le h\lambda_k$ in the interval $a < x \le b$. Then, with $K = 2^{k-1}$,
\[ \sum_{a<n\le b} e(f(n)) \ll h^{\frac{2}{K}}(b - a)\lambda_k^{\frac{1}{2K-2}} + (b - a)^{1-\frac{2}{K}}\lambda_k^{-\frac{1}{2K-2}}. \tag{3.22} \]

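\[ \sigma \ge 1 - \alpha(t) := 1 - \frac{A(\log\log t)^{\frac{1}{2}}}{\log^{\frac{1}{2}} t} \tag{3.23} \]
we have uniformly, for any given $r = 2, 3, \dots$, and $R = 2^{r-1}$, that
\[ \sum_{t^{\beta}<n\le t} n^{-s} = O(1), \qquad t \to \infty, \tag{3.24} \]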

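\[ f(x) = -\frac{t\log x}{2\pi}, \qquad f^{(k)}(x) = \frac{(-1)^k(k - 1)!\,t}{2\pi x^k}, \tag{3.25} \]
and let $k$ vary over $2, 3, \dots, r$. If $a < n \le b \le 2a$, then
\[ \frac{(k - 1)!\,t}{2\pi(2a)^k} \le |f^{(k)}(n)| \le \frac{(k - 1)!\,t}{2\pi a^k}, \tag{3.26} \]
so that we have
\[ \lambda_k = \frac{(k - 1)!\,t}{2\pi(2a)^k}, \qquad h = 2^k \tag{3.27} \]
in the notation of the previous lemma. Hence, with $K = 2^{k-1}$,
\[ \sum_{a<n\le b} n^{-it} \ll 2^{\frac{2k}{K}}a\left(\frac{(k - 1)!\,t}{2\pi(2a)^k}\right)^{\frac{1}{2K-2}} + a^{1-\frac{2}{K}}\left(\frac{(k - 1)!\,t}{2\pi(2a)^k}\right)^{-\frac{1}{2K-2}}. \]
Up to constants depending only on $k$, the second term is at most the first if
\[ a^{1-\frac{2}{K}+\frac{k}{2K-2}}\,t^{-\frac{1}{2K-2}} \le a^{1-\frac{k}{2K-2}}\,t^{\frac{1}{2K-2}}, \]
or if
\[ a^{\frac{k}{K-1}-\frac{2}{K}} \le t^{\frac{1}{K-1}}, \]
or if
\[ a \le t^{\frac{K}{kK-2K+2}} =: t_k. \]
Therefore, we get from the partial summation estimate B.1 for $\sigma \ge 1 - \alpha(t)$ and $a \le t_k$
\[ \sum_{a<n\le b} n^{-s} \ll a^{\alpha(t)-\frac{k}{2K-2}}\,t^{\frac{1}{2K-2}} \le a^{-\frac{k}{2K-2}}\,t^{\alpha(t)+\frac{1}{2K-2}}. \]
For $a > t_{k+1} = t^{\frac{K}{kK-K+1}}$ this gives
\[ \sum_{a<n\le b} n^{-s} \ll t^{\alpha(t)+\frac{1}{2K-2}-\frac{kK}{(2K-2)(kK-K+1)}} \ll t^{\alpha(t)-\frac{1}{2(k-1)K+2}}. \tag{3.28} \]
We write
\[ \sum_{t^{\beta}<n\le t} n^{-s} = \sum_{t/2<n\le t} n^{-s} + \sum_{t/4<n\le t/2} n^{-s} + \cdots \tag{3.29} \]
Hence for all $m = 1, 2, \dots, O(\log t)$ we have
\[ \sum_{t/2^m<n\le t/2^{m-1}} n^{-s} \ll \max_{k=2,3,\dots,r} t^{\alpha(t)-\frac{1}{2(k-1)K+2}}, \]
once we apply estimate (3.28) with the unique $k = 2, 3, \dots, r$ such that $2^{-m}t \in (t_{k+1}, t_k]$. Thus by (3.29) we have
\[ \sum_{t^{\beta}<n\le t} n^{-s} \ll \max_{k=2,3,\dots,r} t^{\alpha(t)-\frac{1}{2(k-1)K+2}}\log t = t^{\alpha(t)-\frac{1}{2(r-1)R+2}}\log t. \]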

4 Vinogradov's Method

Using estimates similar to those given at the end of the previous chapter it can be shown that $\zeta(s) = O(\log^5 t)$ as $t \to \infty$ in the region $\sigma \ge 1 - \frac{(\log\log t)^2}{\log t}$. The proof of this can be found in Titchmarsh (1951, p. 98). However, for our purposes this is not strong enough. We will need a different method for estimating exponential sums to obtain better results near the line $\sigma = 1$. The method was developed by Vinogradov. The proofs in this chapter follow the same lines as in Titchmarsh (1951, Ch. 6).

Let us first motivate the method. We will not yet specify the summation intervals for the sake of brevity. As in the previous chapter, set
\[ \sum n^{-it} = \sum e(f(n)), \qquad f(x) := -\frac{t}{2\pi}\log x. \tag{4.1} \]
In van der Corput's method we were led to study sums of $e(f(n + m) - f(n))$, where $m < n$ (see Lemma 3.3). We now make the observation that by Taylor's theorem
\[ f(n + m) - f(n) = -\frac{t}{2\pi}\left(\log(n + m) - \log n\right) = \sum_{r=1}^{k} A_r(n)m^r + R_k(n, m), \]
where
\[ A_r(n) = (-1)^r\frac{t}{2\pi rn^r}, \qquad |R_k(n, m)| \le \frac{t}{2\pi(k + 1)}\left(\frac{m}{n}\right)^{k+1}. \]
This leads us to expect that for large enough $k$ we have
\[ \sum_m e(f(n + m) - f(n)) \approx \sum_m e\left(\sum_{r=1}^{k} A_r(n)m^r\right). \tag{4.2} \]
We now note that the value of the sum on the right-hand side depends only on the numbers $A_r(n)$ modulo 1. This is true because we can clearly subtract any integer from any of the numbers $A_r(n)$ without changing the sum. Because of Lemma 3.5 we need only consider $n \le t^{\delta}$ for small enough $\delta$. Hence if we vary $n$ in (4.2) we expect that the values of $A_r(n)$ are roughly evenly distributed modulo 1. This suggests that we can write
\[ \sum_n\left|\sum_m e\left(\sum_{r=1}^{k} A_r(n)m^r\right)\right|^{2l} \ll C\int_0^1\cdots\int_0^1\left|\sum_m e\left(\sum_{r=1}^{k} x_rm^r\right)\right|^{2l}dx_1\cdots dx_k, \tag{4.3} \]
where $C$ is a function that essentially measures how well the numbers $A_r(n)$ are evenly distributed modulo 1. This argument will be made rigorous in Theorem 4.1.

The sum on the left-hand side of (4.3) may be obtained from the original sum using Hölder's inequality. The reason why we have raised to the power $2l$ is that the quantity on the right-hand side now has a clear combinatorial interpretation: Recall that
\[ \int_0^1 e(nx)\,dx = \begin{cases} 1, & n = 0, \\ 0, & n \ne 0. \end{cases} \]
We have
\[ \left|\sum_m e\left(\sum_{r=1}^{k} x_rm^r\right)\right|^{2l} = \sum_{m_1,\dots,m_l,n_1,\dots,n_l} e\left(\sum_{r=1}^{k} x_r\left(m_1^r + \cdots + m_l^r - n_1^r - \cdots - n_l^r\right)\right). \]
Therefore, the integral on the right-hand side of (4.3) is just the number of solutions to the set of equations
\[ m_1^r + \cdots + m_l^r = n_1^r + \cdots + n_l^r \quad\text{for all } r = 1, 2, \dots, k. \tag{4.4} \]
Thus we have reduced the problem of estimating the sum $\sum n^{-it}$ to a purely combinatorial question.

4.1 From Exponential Sums to Combinatorics
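For very small parameters the number of solutions of the system (4.4) can be found by brute force, which makes the combinatorial interpretation concrete. The sketch below counts tuples $(m_1, \dots, m_l, n_1, \dots, n_l)$ with all entries in $(a, a+q]$; the function name `count_solutions` and the chosen parameter values are illustrative only, and the method is far too slow for the ranges that matter in the theory.

```python
from itertools import product


def count_solutions(a: int, q: int, l: int, k: int) -> int:
    """Number of (m_1..m_l, n_1..n_l) with entries in (a, a+q] and
    m_1^r + ... + m_l^r = n_1^r + ... + n_l^r for every r = 1..k."""
    values = range(a + 1, a + q + 1)
    counts = {}
    for ms in product(values, repeat=l):
        # Tuples with the same vector of power sums solve the system pairwise.
        key = tuple(sum(m ** r for m in ms) for r in range(1, k + 1))
        counts[key] = counts.get(key, 0) + 1
    return sum(c * c for c in counts.values())


print(count_solutions(0, 4, 2, 2))  # summation range (0, 4]
print(count_solutions(7, 4, 2, 2))  # the same range shifted by 7
```

For $k = l = 2$ the two equations force $\{m_1, m_2\} = \{n_1, n_2\}$ as multisets, so the count is $2q^2 - q$; shifting the range does not change it.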

Let us now formalize the above discussion with precise notation. For an integer $k \ge 2$ let
\[ p(m) := p(m, x) := \sum_{r=1}^{k} x_rm^r, \tag{4.5} \]
and define
\[ S(q) := \sum_{a<m\le a+q} e(p(m)), \tag{4.6} \]
\[ J(q, l) := \int_0^1\cdots\int_0^1 |S(q)|^{2l}\,dx_1\cdots dx_k. \tag{4.7} \]
We have
\[ |S(q)|^{2l} = \sum_{m_1,\dots,m_l,n_1,\dots,n_l} e\left(\sum_{r=1}^{k} x_r\left(m_1^r + \cdots + m_l^r - n_1^r - \cdots - n_l^r\right)\right). \]
Integrating over the $k$-dimensional unit cube we obtain that $J(q, l)$ is the number of solutions to the set of equations
\[ m_1^r + \cdots + m_l^r = n_1^r + \cdots + n_l^r \quad\text{for all } r = 1, 2, \dots, k, \tag{4.8} \]
such that $a < m_i \le a + q$ and $a < n_i \le a + q$. We note that the number of these solutions is independent of $a$, since the set of equations
\[ m_1^r + \cdots + m_l^r = n_1^r + \cdots + n_l^r \quad\text{for all } r = 1, 2, \dots, k \]
holds if and only if
\[ (m_1 - a)^r + \cdots + (m_l - a)^r = (n_1 - a)^r + \cdots + (n_l - a)^r \quad\text{for all } r = 1, 2, \dots, k. \]
To relate the exponential sums to the integrals $J(q, l)$, we will need the following lemma, which asserts that if an arithmetic function $\phi(n)$ grows very slowly but steadily, then its value cannot be very often close to an integer. This will be used to estimate the function $C$ in (4.3).

Lemma 4.1. Suppose that $M$ and $N > 1$ are integers, and let $\phi(n)$ be a real function defined for integers $M \le n \le M + N - 1$. Assume also that for all such $n$ we have
\[ \delta \le \phi(n + 1) - \phi(n) \le c\delta, \tag{4.9} \]
where $\delta > 0$, $c \ge 1$. Let $W > 0$, and denote by $\|x\|$ the distance of $x$ to the closest integer. Then the number of values of $n$ such that $\|\phi(n)\| \le W\delta$ is less than
\[ (Nc\delta + 1)(2W + 1). \tag{4.10} \]

Proof. Let $-\frac{1}{2} < x < \frac{1}{2}$ be fixed, and denote by $G_x$ the number of values of $n$ such that
\[ k + x < \phi(n) \le k + x + \delta \]
for some integer $k$. By our assumptions for each integer $k$ there is at most one such $n$. Therefore
\[ G_x \le \phi(M + N - 1) - \phi(M) + 1 + \delta \le Nc\delta + 1. \]
The lemma now follows because an interval of length $2W\delta$ can be divided into $\lfloor 2W\rfloor + 1$ intervals of length less than $\delta$.

Now we can prove the following important theorem.
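Lemma 4.1 lends itself to a quick numerical illustration. In the sketch below the function $\phi(n) = \delta(n + n^2/(2N))$ is an arbitrary example whose increments lie between $\delta$ and $2\delta$ for $0 \le n \le N - 2$, so the hypothesis (4.9) holds with $c = 2$; all names and parameter values are assumptions made for this example.

```python
def dist_to_nearest_integer(x: float) -> float:
    """The quantity ||x||: distance from x to the closest integer."""
    return abs(x - round(x))


def count_near_integers(values, W: float, delta: float) -> int:
    """Number of values v with ||v|| <= W * delta."""
    return sum(1 for v in values if dist_to_nearest_integer(v) <= W * delta)


def lemma_bound(N: int, c: float, delta: float, W: float) -> float:
    """The quantity (N*c*delta + 1) * (2*W + 1) from (4.10)."""
    return (N * c * delta + 1) * (2 * W + 1)


N, delta, c, W = 1000, 0.003, 2.0, 2.0
phi = [delta * (n + n * n / (2 * N)) for n in range(N)]  # M = 0
increments = [phi[n + 1] - phi[n] for n in range(N - 1)]

print(count_near_integers(phi, W, delta), lemma_bound(N, c, delta, W))
```

The count stays below the bound because each window of length $\delta$ around a shifted integer can contain at most one value of $\phi$.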

Theorem 4.1. Let $k \ge 6$, $Q \ge 2$ and $2 \le q \le Q$ be integers, and let $f(x)$ be real and have continuous derivatives up to order $k + 1$ in the interval $[P + 1, P + Q]$. Assume that
\[ \lambda \le \frac{|f^{(k+1)}(x)|}{(k + 1)!} \le 2\lambda \tag{4.11} \]
in the same interval, and assume also that
\[ \lambda^{-\frac{1}{4}} \le Q \le \lambda^{-1}. \tag{4.12} \]
Then
\[ |S| := \left|\sum_{n=P+1}^{P+Q} e(f(n))\right| \le 2q^{-1}Q^{1-\frac{1}{2l}}\left(3k\lambda^{-k}q^{1-\frac{k(k+1)}{2}}J(q, l)\right)^{\frac{1}{2l}} + 4\pi k\lambda q^{k+1}Q + q, \tag{4.13} \]
where $l > 0$ is an integer.

Proof. We note that the theorem is trivial if $\lambda q^k \ge 1$, so let $\lambda q^k < 1$. Let us define for $P + 1 \le n \le P + Q - q$
\[ T(n) := \sum_{m=1}^{q} e(f(n + m) - f(n)). \tag{4.14} \]
Then
\[
\begin{aligned}
|S| &= q^{-1}\left|\sum_{m=1}^{q}\sum_{n=P+1}^{P+Q} e(f(n))\right| \\
&\le q^{-1}\left|\sum_{m=1}^{q}\sum_{n=P+1+m}^{P+Q-q+m} e(f(n))\right| + q^{-1}\sum_{m=1}^{q} q \\
&= q^{-1}\left|\sum_{n=P+1}^{P+Q-q}\sum_{m=1}^{q} e(f(n + m))\right| + q \\
&\le q^{-1}\sum_{n=P+1}^{P+Q-q}\left|\sum_{m=1}^{q} e(f(n + m))\right| + q \\
&= q^{-1}\sum_{n=P+1}^{P+Q-q}|T(n)| + q \\
&\le q^{-1}Q^{1-\frac{1}{2l}}\left(\sum_{n=P+1}^{P+Q-q}|T(n)|^{2l}\right)^{\frac{1}{2l}} + q,
\end{aligned}
\tag{4.15}
\]
where the last inequality follows from Hölder's inequality. Now by Taylor's theorem, we have for $1 \le m \le q$
\[ f(n + m) - f(n) = A_1(n)m + \cdots + A_k(n)m^k + 2\lambda\theta q^{k+1}, \]
where $A_r(n) := f^{(r)}(n)/r!$ and $|\theta| < 1$.

Let us now define the $k$-dimensional region
\[ \Omega_n := \left\{(x_1, \dots, x_k) \in \mathbb{R}^k : |x_r - A_r(n)| \le \frac{1}{2}q^{k+1-r}\lambda,\ \forall r = 1, 2, \dots, k\right\}. \tag{4.16} \]
Then, for $1 \le m \le q$, we have for all $x = (x_1, \dots, x_k) \in \Omega_n$
\[
\begin{aligned}
|e(f(n + m) - f(n)) - e(p(m, x))| &\le 2\pi|f(n + m) - f(n) - p(m, x)| \\
&\le 2\pi\sum_{r=1}^{k}|x_r - A_r(n)|m^r + 4\pi\lambda q^{k+1} \\
&\le \pi\lambda q^{k+1}\left(\sum_{r=1}^{k}\left(\frac{m}{q}\right)^r + 4\right) \\
&\le \pi\lambda q^{k+1}(k + 4) \le 2\pi k\lambda q^{k+1}.
\end{aligned}
\]
Hence

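\[
\begin{aligned}
|T(n)| &\le |S(q)| + |T(n) - S(q)| \\
&= |S(q)| + \left|\sum_{m=1}^{q}\big(e(f(n + m) - f(n)) - e(p(m, x))\big)\right| \\
&\le |S(q)| + 2\pi k\lambda q^{k+2}.
\end{aligned}
\]
This gives us
\[ |T(n)|^{2l} \le 2^{2l}|S(q)|^{2l} + (4\pi k\lambda q^{k+2})^{2l} \]
for all $x \in \Omega_n$. The $k$-dimensional volume of $\Omega_n$ is
\[ V(\Omega_n) = \prod_{r=1}^{k}\lambda q^{k+1-r} = \lambda^kq^{\frac{k(k+1)}{2}}. \]
Therefore integrating over $\Omega_n$ and dividing by its volume yields
\[ |T(n)|^{2l} \le 2^{2l}\lambda^{-k}q^{-\frac{k(k+1)}{2}}\int\cdots\int_{\Omega_n}|S(q)|^{2l}\,dx_1\cdots dx_k + (4\pi k\lambda q^{k+2})^{2l}. \tag{4.17} \]
The integral above remains unchanged if we subtract an integer from any of the coordinates of $\Omega_n$. This is true because for any sequence of integers $(g_1, \dots, g_k)$ we have
\[ S(q) = \sum_{a<m\le a+q} e\left(\sum_{r=1}^{k} x_rm^r\right) = \sum_{a<m\le a+q} e\left(\sum_{r=1}^{k} x_rm^r\right)e\left(-\sum_{r=1}^{k} g_rm^r\right) = \sum_{a<m\le a+q} e\left(\sum_{r=1}^{k}(x_r - g_r)m^r\right). \]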

We say that any such region is congruent to $\Omega_n$ modulo 1. Note that for every $n$ the region $\Omega_n$ is a product of intervals of length less than $\lambda q^k < 1$. This allows us to embed the region $\Omega_n$ into the $k$-dimensional unit cube $[0, 1)^k$. By embedding an interval $[a, b]$ with $b - a < 1$ into the unit interval $[0, 1)$ we mean the following: If there is an integer $n$ such that $n \le a < b < n + 1$, then the image of $[a, b]$ under the embedding is $[a - n, b - n] \subset [0, 1)$. If there is an integer $n$ such that $a < n \le b$, then the image of $[a, b]$ is $[a - n + 1, 1) \cup [0, b - n) \subset [0, 1)$.

We wish to estimate the sum over $n$ of the integrals in (4.17) by one integral over $[0, 1)^k$. Since the integrand $|S(q)|^{2l}$ is real and positive, we just need an upper bound for how many of the regions $\Omega_n$ intersect after embedding them into the $k$-dimensional unit cube. That is, for any given $P + 1 \le n \le P + Q - q$, how many values of $P + 1 \le m \le P + Q - q$ there exist such that the region $\Omega_m$ intersects with a region that is congruent to $\Omega_n$ modulo 1. Clearly a necessary condition for this is that the embeddings into the unit cube intersect in the last coordinate. By definition of $\Omega_n$ this is equivalent to
\[ \|A_k(m) - A_k(n)\| \le q^{-k}\lambda q^{k+1} = \lambda q. \tag{4.18} \]
For a fixed $n$ let us define $\phi(m) := A_k(m) - A_k(n)$. Then
\[ \phi(m + 1) - \phi(m) = \frac{f^{(k)}(m + 1) - f^{(k)}(m)}{k!} = \frac{f^{(k+1)}(\xi)}{k!}, \]
where $m < \xi < m + 1$. By our hypothesis (4.11) on $f(x)$, the conditions of Lemma 4.1 are satisfied with $c = 2$ and $\delta = \lambda(k + 1)$. That is,
\[ \lambda(k + 1) \le \phi(m + 1) - \phi(m) \le 2\lambda(k + 1). \]
Taking $W = q/(k + 1)$ in the notation of the previous lemma we obtain that for given $n$ the number of $m$ such that (4.18) holds is less than
\[ (2Q\lambda(k + 1) + 1)\left(\frac{2q}{k + 1} + 1\right) \le (2k + 3)\left(\frac{2q}{k + 1} + 1\right) \le 6q + 2k + 3 \le 3kq, \]
since we are assuming that $k \ge 6$. That is, the number of regions $\Omega_m$ which intersect with a region that is congruent to $\Omega_n$ is bounded by $3kq$. This holds independently of the choice of $n$. Therefore, in the sum of integrals

\[ \sum_{n=P+1}^{P+Q-q}\int\cdots\int_{\Omega_n}|S(q)|^{2l}\,dx_1\cdots dx_k, \]
after embedding the regions $\Omega_n$ into the $k$-dimensional unit cube, each point of the unit cube appears in at most $3kq$ of the integrals. Thus
\[ \sum_{n=P+1}^{P+Q-q}\int\cdots\int_{\Omega_n}|S(q)|^{2l}\,dx_1\cdots dx_k \le 3kq\int_0^1\cdots\int_0^1|S(q)|^{2l}\,dx_1\cdots dx_k \tag{4.19} \]
\[ = 3kq\,J(q, l). \tag{4.20} \]
Recall that by inequalities (4.15) and (4.17) we have
\[ |S| \le q^{-1}Q^{1-\frac{1}{2l}}\left(\sum_{n=P+1}^{P+Q-q}|T(n)|^{2l}\right)^{\frac{1}{2l}} + q, \]
and
\[ |T(n)|^{2l} \le 2^{2l}\lambda^{-k}q^{-\frac{k(k+1)}{2}}\int\cdots\int_{\Omega_n}|S(q)|^{2l}\,dx_1\cdots dx_k + (4\pi k\lambda q^{k+2})^{2l}. \]
Plugging in the estimate (4.20) for the sum of the integrals yields
\[
\begin{aligned}
|S| &\le q^{-1}Q^{1-\frac{1}{2l}}\left(2^{2l}\lambda^{-k}q^{-\frac{k(k+1)}{2}}\,3kq\,J(q, l) + (4\pi k\lambda q^{k+2})^{2l}Q\right)^{\frac{1}{2l}} + q \\
&\le 2q^{-1}Q^{1-\frac{1}{2l}}\left(3k\lambda^{-k}q^{1-\frac{k(k+1)}{2}}J(q, l)\right)^{\frac{1}{2l}} + 4\pi k\lambda q^{k+1}Q + q.
\end{aligned}
\]

Remark 4.1. In our application we will choose $q$, $k$ and $l$ so that the first two terms depend on the parameter $Q$ in the form $Q^{1-\rho}\log Q$, for some small $\rho(t) \to 0$ as $t \to \infty$. For this reason Vinogradov's method gives good bounds for $\zeta(s)$ only near the line $\sigma = 1$.

4.2 Integrals J(q, l)

Unfortunately we cannot, within the limitations of this thesis, describe in detail and rigour the methods used to estimate the integrals $J(q, l)$. There exist a number of different methods for obtaining bounds for $J(q, l)$. In this section we give a rough description of the method and the results given in Titchmarsh (1951, Ch. 6). An alternative method which yields slightly stronger results can be found in Ivić (2003, Ch. 6). Recall that
\[ J(q, l) = \int_0^1\cdots\int_0^1|S(q)|^{2l}\,dx_1\cdots dx_k \]
is equal to the number of solutions to the set of equations
\[ m_1^r + \cdots + m_l^r = n_1^r + \cdots + n_l^r \quad\text{for all } r = 1, 2, \dots, k, \]
with the constraints $a < m_i \le a + q$ and $a < n_i \le a + q$. Instead of studying this set of equations, the method consists of estimating the number of sets of integers $m_1, \dots, m_k, n_1, \dots, n_k$ for which the quantities
\[ m_1^r + \cdots + m_k^r - n_1^r - \cdots - n_k^r \tag{4.21} \]
lie in some given intervals. To simplify the combinatorics we also assume that the integers $m_1, \dots, m_k$ satisfy $m_1 < m_2 < \cdots < m_k$ and are roughly `evenly spaced'. The same assumptions are made for $n_1, \dots, n_k$. The number of solutions satisfying these constraints can be estimated relatively painlessly (Titchmarsh 1951, Lemmata 6.2, 6.3 and 6.4).

The more difficult part is getting back to the integrals $J(q, l)$ from this modified problem. The main idea is to estimate $J(q, l)$ by $J(q^{1-\frac{1}{k}}, l - k)$ and use this inductively. We can relate $J(q, l)$ to $J(q^{1-\frac{1}{k}}, l - k)$ in the following manner. First we divide the summation interval $(0, q]$ into $2^{\mu}$ parts of equal length to obtain

\[ S(q) = \sum_{0<m\le q} e(p(m)) = \sum_{g=1}^{2^{\mu}} Z_{\mu,g}, \]
where
\[ Z_{\mu,g} = \sum_{(g-1)2^{-\mu}q<m\le g2^{-\mu}q} e(p(m)). \]
Hence
\[ S(q)^l = \sum Z_{\mu,g_1}\cdots Z_{\mu,g_l}, \]
where the sum runs over all sets of indices $(g_1, \dots, g_l)$. We now separate the sum into two parts: in the first sum we gather those multi-indices $(g_1, \dots, g_l)$ for which the numbers $g_1, \dots, g_l$ are `evenly spaced' in some permutation of the indices. The number of these can be effectively bounded. This gives us
\[ S(q)^l = \sum_{\text{`evenly spaced'}} Z_{\mu,g_1}\cdots Z_{\mu,g_l} + \sum_{\text{not `evenly spaced'}} Z_{\mu,g_1}\cdots Z_{\mu,g_l}. \]

In the second sum we write $Z_{\mu,g} = Z_{\mu+1,2g-1} + Z_{\mu+1,2g}$, that is, we divide the sum $Z_{\mu,g}$ in the middle. Carrying out the multiplication $Z_{\mu,g_1}\cdots Z_{\mu,g_l}$ yields a sum of terms of the form $Z_{\mu+1,g_1}\cdots Z_{\mu+1,g_l}$, which gives
\[ S(q)^l = \sum_{\text{`evenly spaced'}} Z_{\mu,g_1}\cdots Z_{\mu,g_l} + \sum Z_{\mu+1,g_1}\cdots Z_{\mu+1,g_l}. \]

Some of the multi-indices $(g_1, \dots, g_l)$ in the second sum are now `evenly spaced' in some permutation of the indices, so we may separate these from the rest. We do not need to know explicitly the set of multi-indices that the second sum runs over, only that we can bound the number of those which are `evenly spaced'. We repeat this process of separating the `evenly spaced' terms and dividing the remaining terms, until we have products of the form $Z_{M,g_1}\cdots Z_{M,g_l}$, where $M > \mu$ is suitably chosen. We now have
\[ S(q)^l = \sum_{r=\mu}^{M}\sum Z_{r,g_1}\cdots Z_{r,g_l}, \]
where the latter sum is over $g_1, \dots, g_l$ which are `evenly spaced' in some order, except for $r = M$. For each $r = \mu, \dots, M$, the number of terms in the second sum is bounded. After squaring this can be used to estimate
\[ J(q, l) = \int_0^1\cdots\int_0^1|S(q)|^{2l}\,dx_1\cdots dx_k, \]
once we write
\[ Z_{r,g_1}\cdots Z_{r,g_l} = \left(Z_{r,g_1}\cdots Z_{r,g_k}\right)\left(Z_{r,g_{k+1}}\cdots Z_{r,g_l}\right) \]
for each of the products in the sum $S(q)^l$. Note that there are $l - k$ terms in the latter product on the right-hand side. Hölder's inequality can be used to relate this product to $|S(q^{1-\frac{1}{k}})|^{2(l-k)}$. After integration the first part of the product on the right-hand side corresponds to the modified combinatorial problem (4.21), since the numbers $g_1, \dots, g_k$ are `evenly spaced'. All this yields the formidable estimate

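Theorem 4.2. Let $l \ge \frac{1}{4}k^2 + \frac{5}{4}k$ and define
\[ M := \left\lfloor\frac{\log q}{k\log 2}\right\rfloor. \tag{4.22} \]
Then
\[ J(q, l) \le \max\{1, M\}\,48^{2l}(l!)^2\,l^k\,k^{\frac{k(k-1)}{2}}\,q^{2\frac{l-k}{k}+\frac{3}{2}k-\frac{1}{2}}\,J\left(q^{1-\frac{1}{k}}, l - k\right). \tag{4.23} \]

Proof. See Lemma 6.8 in Titchmarsh (1951, p. 107).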

From this it is relatively straightforward to deduce the following bound.

Theorem 4.3. Let $l = \left\lfloor k^2\log(k^2 + k) + \frac{1}{4}k^2 + \frac{5}{4}k\right\rfloor + 1$, and let $k \ge 7$. Then
\[ J(q, l) \le e^{64lk\log^2 k}\,\log^{2l} q\cdot q^{2l-\frac{1}{2}k(k+1)+\frac{1}{2}}. \tag{4.24} \]

Proof. See Lemma 6.10 in Titchmarsh (1951, p. 107).

4.3 Order of ζ(s) Near the Line σ = 1

We can now combine Theorems 4.1 and 4.3 to obtain

Theorem 4.4. Assume that the hypotheses of Theorem 4.1 hold for $f(x)$, and let $\rho := (56k^2\log k)^{-1}$. Then for $k \ge 7$
\[ S := \sum_{n=P+1}^{P+Q} e(f(n)) \ll e^{32k\log^2 k}\,Q^{1-\rho}\log Q. \tag{4.25} \]

Proof. We will not present the proof here, since it is just a matter of combining Theorems 4.1 and 4.3 and choosing $q$ optimally. This is rather tedious, albeit elementary. The necessary calculations are carried out on page 112 of Titchmarsh (1951).

We need to rene this result before we can apply it to ζ(s). We also collect our assumptions for future reference.

Theorem 4.5. Let $k \ge 7$, $Q \ge 2$ and $1 \le N \le Q$ be integers. Let $f(x)$ be real and have continuous derivatives up to order $k + 1$ in the interval $[P + 1, P + N]$. Assume that

(4.26) $\lambda \le \frac{|f^{(k+1)}(x)|}{(k+1)!} \le 2\lambda$

in the same interval, and assume also that

(4.27) $\lambda^{-\frac{1}{3}} \le Q \le \lambda^{-1}.$

Then, for $\rho = (56 k^2 \log k)^{-1}$, we have

(4.28) $\sum_{n=P+1}^{P+N} e(f(n)) \ll e^{32 k \log^2 k}\, Q^{1-\rho} \log Q.$

Proof. If $N \ge \lambda^{-\frac{1}{4}}$, the theorem follows from the previous theorem with $Q$ replaced by $N$. If $N < \lambda^{-\frac{1}{4}}$, then

$\left| \sum_{n=P+1}^{P+N} e(f(n)) \right| \le N < \lambda^{-\frac{1}{4}} \le Q^{\frac{3}{4}} \le Q^{1-\rho}.$
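To get a feel for the cancellation that estimates such as (4.28) quantify, the following minimal Python sketch evaluates an exponential sum directly and compares it with the trivial bound $N$. It assumes the usual abbreviation $e(x) = \exp(2\pi i x)$ and uses the phase $f(x) = -\frac{t}{2\pi}\log x$, for which $e(f(n)) = n^{-it}$. The function names and the parameter values are illustrative choices only, not taken from the text.

```python
import cmath
import math


def e(x: float) -> complex:
    """The standard abbreviation e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)


def exponential_sum(f, P: int, N: int) -> complex:
    """Return the sum of e(f(n)) over P + 1 <= n <= P + N."""
    return sum(e(f(n)) for n in range(P + 1, P + N + 1))


def zeta_phase(t: float):
    """Phase f(x) = -(t / (2*pi)) * log(x), so that e(f(n)) = n**(-1j*t)."""
    return lambda x: -t * math.log(x) / (2 * math.pi)


if __name__ == "__main__":
    P, N = 1000, 1000  # illustrative range (P, P + N]
    for t in (1e4, 1e5, 1e6):
        S = exponential_sum(zeta_phase(t), P, N)
        print(f"t = {t:.0e}: |S| = {abs(S):9.2f}   (trivial bound N = {N})")
```

The sketch only illustrates the size of such sums for a few parameters; it does not verify the constants in Theorem 4.5.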

We are now ready to prove the main result of this chapter.

Theorem 4.6. (Vinogradov). There exists a region

(4.29) $\sigma \ge 1 - \alpha(t) := 1 - \frac{A (\log\log t)^{\frac{1}{2}}}{\log^{\frac{1}{2}} t},$

where $A > 0$ is a constant, such that for some constant $C_0$

$\zeta(s) \ll \exp\left\{ C_0 \log^{\frac{1}{4}} t\, (\log\log t)^{\frac{5}{4}} \right\}$

uniformly as $s \to \infty$ in the region.

Proof. By Lemma 3.5 we have for any given $r = 2, 3, \dots$ that

r 1 s 2 − (4.30) − n = (1), β = r 1 O (r 1)2 − + 1 tβ

(4.31) $\zeta(s) = \sum_{n \le t^{\delta}} n^{-s} + O(1), \quad \delta = \frac{128}{7 \cdot 128 + 1} < \frac{1}{7},$

in the region $\sigma \ge 1 - \alpha(t)$.

To estimate the remaining sum we set

(4.32) $f(x) := -\frac{t \log x}{2\pi}, \quad f^{(k+1)}(x) = \frac{(-1)^{k+1} k!\, t}{2\pi x^{k+1}}.$

Let $a, b \in [0, t^{\delta}]$, $a < b \le 2a$. For $a < x \le b$ the function $|f^{(k+1)}(x)|$ is strictly decreasing. We now divide the interval $(a, b]$ into at most $k + 1$ subintervals

(4.33) $(a, 2^{\frac{1}{k+1}} a],\ (2^{\frac{1}{k+1}} a, 2^{\frac{2}{k+1}} a],\ \dots,\ (2^{\frac{m}{k+1}} a, b],$

where $m \le k$ is the largest integer such that $2^{\frac{m}{k+1}} a < b$. Then the hypothesis of Theorem 4.5 that

$\lambda \le \frac{|f^{(k+1)}(x)|}{(k+1)!} = \frac{t}{2\pi (k+1) x^{k+1}} \le 2\lambda$

holds on each subinterval, where the number $\lambda$ depends on the subinterval. For all subintervals of the interval $(a, b]$ the number $\lambda$ satisfies

(4.34) $\frac{t}{2\pi (k+1)(2a)^{k+1}} \le \lambda \le \frac{t}{4 (k+1) \pi a^{k+1}} \le \frac{t}{4\pi a^{k+1}}.$

Let us now assume that

(4.35) $2 \log^{\frac{1}{2}} t < \log a \le \frac{1}{7} \log t.$

In the notation of the previous theorem we set $Q = a$ and

(4.36) $k = \left\lfloor \frac{\log t}{\log a} \right\rfloor + 1 \ge 7.$

To use Theorem 4.5 we need to check that $a^{-3} \le \lambda \le a^{-1}$. We have by the definition of $k$ (4.36) that

$a = \exp\{\log a\} = \exp\left\{ \frac{\log t}{\log a} \log a + \log a - \log t \right\} \le \exp\{ (k+1) \log a - \log t \} = a^{k+1} t^{-1}.$

Therefore

$\lambda \le \frac{t}{4\pi a^{k+1}} \le a^{-1}.$

By the definition of $k$ we also have

$a^{k+1} t^{-1} = a^2 a^{k-1} t^{-1} \le a^2 \exp\left\{ \frac{\log t}{\log a} \log a - \log t \right\} \le a^2.$

Hence, to show that $\lambda \ge a^{-3}$, it is enough to show that $a \ge 2^{k+2} \pi (k+1)$, since then

$\lambda \ge \frac{t}{2\pi (k+1)(2a)^{k+1}} \ge \frac{a^{-2}}{2^{k+2} \pi (k+1)} \ge a^{-3}.$

By taking logarithms we see that $a \ge 2^{k+2} \pi (k+1)$ holds if $\log a > (k+2) \log 2 + \log(k+1) + \log \pi$, which is true if

log t log t log a + 3 + log + 2 + log π. ≥ log a log a     1 Since we assumed that log a 2 log 2 t, the last inequality holds if ≥ 1 1 log a log a + 3 + log log a + 2 + log π, ≥ 2 2   3 which is true for large enough t. Therefore, we have λ a− . Since k 7, we have by the Theorem 4.5 ≥ ≥ it 32k log2 k 1 ρ n− ke a − log a,  a

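(4.38) $\log a \ge C (\log t \log\log t)^{\frac{3}{4}} \quad ( > 2 \log^{\frac{1}{2}} t )$

for some large $C > 0$. Then by the definition of $k$,

$k \log^2 k \le \left( \frac{\log t}{\log a} + 1 \right) (\log\log t)^2 \le 2\, \frac{\log t}{\log a} (\log\log t)^2 \le \frac{2}{C} \log^{\frac{1}{4}} t\, (\log\log t)^{\frac{5}{4}},$

where we have used assumption (4.38) to obtain the last inequality. We also have

$\frac{\log a}{k^2 \log k} \ge \frac{\log a}{\left( \frac{\log t}{\log a} + 1 \right)^2 \log\log t} \ge \frac{\log^3 a}{4 \log^2 t \log\log t}.$

Finally we have

$\alpha(t) \log a = \frac{A (\log\log t)^{\frac{1}{2}} \log a}{\log^{\frac{1}{2}} t} = \frac{A \left( (\log t \log\log t)^{\frac{3}{4}} \right)^2 \log a}{\log^2 t \log\log t} \le \frac{A}{C^2} \frac{\log^3 a}{\log^2 t \log\log t},$

where we have again used assumption (4.38) to obtain the last inequality. Therefore, for a large enough constant $C$,

$\alpha(t) \log a - \frac{\log a}{56 k^2 \log k} \le \left( \frac{A}{C^2} - \frac{1}{224} \right) \frac{\log^3 a}{\log^2 t \log\log t} \le -\frac{1}{448} \frac{\log^3 a}{\log^2 t \log\log t} \le -\frac{C^3}{448} \frac{(\log t \log\log t)^{\frac{9}{4}}}{\log^2 t \log\log t} = -\frac{C^3}{448} \log^{\frac{1}{4}} t\, (\log\log t)^{\frac{5}{4}}.$

Hence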

3 2 log a 66 C 1 5 33k log k + α(t) log a log 4 t(log log t) 4 − 56k2 log k ≤ C − 448  1  5 C0 log 4 t(log log t) 4 , ≤ − once we rst choose a suitably large constant C. Thus by (4.37) for all a t such that 3 ≤ log a C(log t log log t) 4 we have ≥ s 1 5 n− exp C0 log 4 t(log log t) 4 + log log t  {− } a

s s s ζ(s) = n− + (1) = n− + n− + (1) O O n tδ n γ γ

Corollary 4.1. We have

(4.39) $\frac{\zeta'(s)}{\zeta(s)} = O\left( \log^{\frac{3}{4}} t\, (\log\log t)^{\frac{3}{4}} \right), \quad \zeta(s) \ne 0$

in the region

(4.40) $\sigma \ge 1 - \frac{A_0}{\log^{\frac{3}{4}} t\, (\log\log t)^{\frac{3}{4}}}$

for some constant $A_0 > 0$.

Proof. We can now use Theorems 2.5 and 2.6 with

$\theta(t) = \alpha(t) = \frac{A (\log\log t)^{\frac{1}{2}}}{\log^{\frac{1}{2}} t}, \quad \varphi(t) = C_0 \log^{\frac{1}{4}} t\, (\log\log t)^{\frac{5}{4}}.$

Indeed,

$\frac{\varphi(t)}{\theta(t)} = o\left( e^{\varphi(t)} \right),$

and the theorem now follows, since clearly

$\frac{\varphi(2t+3)}{\theta(2t+3)} \ll \log^{\frac{3}{4}} t\, (\log\log t)^{\frac{3}{4}}.$

Remark 4.2. In the next chapter we will use, for the sake of convenience, a weaker version of the previous theorem, which states that

$\frac{\zeta'(s)}{\zeta(s)} = O(\log t), \quad \zeta(s) \ne 0$

in the region

$\sigma \ge 1 - \frac{\omega(t) \log\log t}{\log t},$

where $\omega(t) \to \infty$ as $t$ does. By the previous theorem we may clearly choose any such $\omega(t) \ll \log^{\frac{1}{5}} t$, for example.

5 Difference Between Consecutive Prime Numbers

In this chapter we will use our theorems on the growth of the Riemann zeta function to prove an upper bound for prime gaps. In particular, we will show that if $\zeta(\frac{1}{2} + it) = O(t^c)$ for some $c > 0$, then for all $\theta > \frac{1+4c}{2+4c}$ we have $p_{n+1} - p_n < p_n^{\theta}$ for large enough $n$. We use the phrase `... for large enough n' to mean that there exists a constant $n_0$ such that the claim holds for all $n \ge n_0$.

A standard recipe for proving theorems about the distribution of prime numbers is to first study the Chebyshev function

(5.1) $\psi(x) = \sum_{n \le x} \Lambda(n) = \sum_{p^k \le x} \log p.$

This function is simpler to relate to the Riemann zeta function than, for example, the prime counting function $\pi(x) := \sum_{p \le x} 1$. One can prove that statements about $\psi(x)$ correspond to direct statements about prime numbers. For example, the statement $\psi(x) \sim x$ is equivalent to the prime number theorem $\pi(x) \sim x/\log x$ (for proof see Ayoub, 1963, p. 73).

We are interested in the following theorem, which relates the function $\psi(x)$ to the difference between prime numbers.

Theorem 5.1. Let $0 < \theta < 1$ be a constant, and assume that

(5.2) $\psi(x + x^{\theta}) - \psi(x) \sim x^{\theta}, \quad x \to \infty.$

Then there is always a prime number in the interval $(x, x + x^{\theta}]$ for all large enough $x$. Therefore, for large enough $n$,

(5.3) $p_{n+1} - p_n \le p_n^{\theta}.$

Proof. Assume that an interval $(x, x + x^{\theta}]$ contains no prime numbers. Then

ψ(x + xθ) ψ(x) = log p − x

log 2x 1 + 1 + ≤  ··· x

ψ(x + xθ) ψ(x) xθ, x − ∼ → ∞ cannot be true, which is a contradiction.
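As a numerical illustration of the quantities in Theorem 5.1, the following Python sketch tabulates $\Lambda(n)$ with a sieve, forms $\psi(x)$ by partial summation, and prints the ratio $(\psi(x + x^{\theta}) - \psi(x))/x^{\theta}$. The function names, the value $\theta = 0.8$ and the range of $x$ are assumptions made for this example only; a finite computation can of course only suggest the asymptotic behaviour, not prove it.

```python
import math


def von_mangoldt_table(limit: int) -> list:
    """Return lam with lam[n] = log p if n is a power of a prime p, else 0."""
    lam = [0.0] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(2 * p, limit + 1, p):
                is_prime[multiple] = False
            q = p
            while q <= limit:  # mark p, p^2, p^3, ...
                lam[q] = math.log(p)
                q *= p
    return lam


def chebyshev_psi_table(limit: int) -> list:
    """Return psi with psi[x] = sum of Lambda(n) over n <= x, for 0 <= x <= limit."""
    lam = von_mangoldt_table(limit)
    psi = [0.0] * (limit + 1)
    for n in range(1, limit + 1):
        psi[n] = psi[n - 1] + lam[n]
    return psi


if __name__ == "__main__":
    theta = 0.8  # illustrative exponent
    x_max = 10 ** 5
    psi = chebyshev_psi_table(x_max + int(x_max ** theta) + 1)
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        h = int(x ** theta)
        print(x, psi[x] / x, (psi[x + h] - psi[x]) / h)
```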

5.1 An Approximate Formula For ψ(x)

In this section we will prove a formula which relates $\psi(x)$ to $\zeta(s)$. The proofs are from Saksman (2014). Recall that by Corollary 2.1 we have

(5.4) $-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=2}^{\infty} \frac{\Lambda(n)}{n^s}.$

Therefore obtaining a formula for $\psi(x)$ is just a matter of recovering the partial sum of the coefficients of $-\zeta'(s)/\zeta(s)$. To this end we need

Lemma 5.1. Let $y, c > 0$ and $T \ge 2$. Define

(5.5) $I(y, T) := \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \frac{y^s}{s}\, ds, \quad \Theta(y) := \begin{cases} 1, & y > 1, \\ \frac{1}{2}, & y = 1, \\ 0, & y < 1. \end{cases}$

Then

(5.6) $|I(y, T) - \Theta(y)| \le \begin{cases} \frac{c}{T}, & y = 1, \\ y^c \min\left\{ 1, \frac{1}{T |\log y|} \right\}, & y \ne 1. \end{cases}$

Proof. Case $y \ne 1$. By the Residue theorem we have for a real $d \ne 0$

$\frac{1}{2\pi i} \left( \int_{c-iT}^{c+iT} + \int_{c+iT}^{d+iT} + \int_{d+iT}^{d-iT} + \int_{d-iT}^{c-iT} \right) \frac{y^s}{s}\, ds = \begin{cases} 1, & d < 0, \\ 0, & d > 0, \end{cases}$

since for $d < 0$ we pick the residue 1 of $y^s/s$ at the point $s = 0$. Therefore for $y \ne 1$ we have

$I(y, T) - \Theta(y) = -\frac{1}{2\pi i} \left( \int_{c+iT}^{d+iT} + \int_{d+iT}^{d-iT} + \int_{d-iT}^{c-iT} \right) \frac{y^s}{s}\, ds =: -\frac{1}{2\pi i} (I_1 + J + I_2),$

if we choose $d > 0$ when $y < 1$, and $d < 0$ when $y > 1$. In both cases we have $y^d \to 0$, as $|d| \to \infty$. Hence

$|I_1| = |I_2| \le \int_{c+iT}^{d+iT} \frac{y^{\sigma}}{T}\, |d\sigma| = \frac{|y^c - y^d|}{T |\log y|} \to \frac{y^c}{T |\log y|}, \quad |d| \to \infty,$

and

$|J| \le \frac{y^d}{|d|} \int_{d+iT}^{d-iT} |ds| = \frac{2T y^d}{|d|} \to 0, \quad |d| \to \infty,$

when we choose the sign of $d$ as before. Therefore letting $|d| \to \infty$ yields

$|I(y, T) - \Theta(y)| \le \frac{y^c}{T |\log y|}.$

To show that $|I(y, T) - \Theta(y)| \le y^c$, let $R = \sqrt{c^2 + T^2}$. For $y > 1$ define $C_1 = (\partial B(0, R)) \cap \{\sigma \le c\}$ as in Figure 5.1. The integrand $y^s/s$ has a pole at $s = 0$ with residue 1. Thus by the Residue theorem we have

$|I(y, T) - \Theta(y)| = \frac{1}{2\pi} \left| \int_{C_1} \frac{y^s}{s}\, ds \right| \le \frac{y^c}{2\pi R} \int_{\partial B(0,R)} |ds| \le y^c.$

For $y < 1$ define $C_2 = (\partial B(0, R)) \cap \{\sigma \ge c\}$. Then by Cauchy's theorem we have

$|I(y, T) - \Theta(y)| = |I(y, T)| = \frac{1}{2\pi} \left| \int_{C_2} \frac{y^s}{s}\, ds \right| \le \frac{y^c}{2\pi R} \int_{\partial B(0,R)} |ds| \le y^c.$

Figure 1: Integration paths $C_1$ and $C_2$.

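Case $y = 1$. A straightforward computation gives us

$I(1, T) = \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \frac{ds}{s} = \frac{1}{2\pi i} \left( \int_{c-iT}^{c} + \int_{c}^{c+iT} \right) \frac{ds}{s} = \frac{1}{\pi} \int_0^T \frac{c\, dt}{c^2 + t^2} = \frac{1}{\pi} \int_0^{T/c} \frac{dt}{1 + t^2} = \frac{1}{2} - \frac{1}{\pi} \int_{T/c}^{\infty} \frac{dt}{1 + t^2}.$

The lemma now follows, since

$\frac{1}{\pi} \int_{T/c}^{\infty} \frac{dt}{1 + t^2} \le \frac{1}{\pi} \int_{T/c}^{\infty} \frac{dt}{t^2} = \frac{c}{\pi T}.$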
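The bound (5.6) of Lemma 5.1 can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the proof: it parametrises $s = c + it$, uses the symmetry of the integrand to write $I(y, T) = \frac{1}{\pi} \int_0^T \mathrm{Re}\, \frac{y^{c+it}}{c+it}\, dt$, and approximates this integral with composite Simpson's rule. The step count and the sample values of $y$, $c$, $T$ are arbitrary choices.

```python
import cmath
import math


def perron_integral(y: float, c: float, T: float, steps: int = 20000) -> float:
    """Approximate I(y, T) = (1 / (2 pi i)) * integral of y^s / s over [c - iT, c + iT]."""

    def integrand(t: float) -> float:
        s = complex(c, t)
        return (cmath.exp(s * math.log(y)) / s).real

    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = T / steps
    total = integrand(0.0) + integrand(T)
    for j in range(1, steps):
        total += (4 if j % 2 else 2) * integrand(j * h)
    return total * h / 3 / math.pi


def theta(y: float) -> float:
    """The step function Theta(y) from (5.5)."""
    if y > 1:
        return 1.0
    if y == 1:
        return 0.5
    return 0.0


def lemma_bound(y: float, c: float, T: float) -> float:
    """Right-hand side of (5.6)."""
    if y == 1:
        return c / T
    return y ** c * min(1.0, 1.0 / (T * abs(math.log(y))))


if __name__ == "__main__":
    c, T = 1.0, 50.0
    for y in (0.5, 1.0, 2.0, 3.0):
        err = abs(perron_integral(y, c, T) - theta(y))
        print(f"y = {y}: |I - Theta| = {err:.5f}, bound = {lemma_bound(y, c, T):.5f}")
```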

Theorem 5.2. Let $c = 1 + (\log x)^{-1}$, and let $T \ll x$. Then

(5.7) $\psi(x) = \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) \frac{x^s}{s}\, ds + O\left( \frac{x \log^2 x}{T} \right).$

Proof. Summing over the previous lemma gives us

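$\psi(x) = \sum_{n \le x} \Lambda(n) = \sum_{n=2}^{\infty} \Theta\left( \frac{x}{n} \right) \Lambda(n) + O(\log x)$

$= \sum_{n=2}^{\infty} I\left( \frac{x}{n}, T \right) \Lambda(n) + O\left( \log x + \sum_{n=2,\, n \ne x}^{\infty} \Lambda(n) \left( \frac{x}{n} \right)^c \min\left\{ 1, \frac{1}{T \left| \log \frac{x}{n} \right|} \right\} \right),$

since if $x$ is an integer, the error contributed by that term in the previous lemma is $O(\log x / T)$. The first sum is by Corollary 2.1

$\sum_{n=2}^{\infty} I\left( \frac{x}{n}, T \right) \Lambda(n) = \sum_{n=2}^{\infty} \Lambda(n) \frac{1}{2\pi i} \int_{c-iT}^{c+iT} n^{-s} x^s\, \frac{ds}{s} = \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \sum_{n=2}^{\infty} \Lambda(n) n^{-s} x^s\, \frac{ds}{s} = \frac{1}{2\pi i} \int_{c-iT}^{c+iT} \left( -\frac{\zeta'(s)}{\zeta(s)} \right) \frac{x^s}{s}\, ds,$

where the interchange of the order of summation and integration is justified since the series converges absolutely. To estimate the sum in the error term we divide the sum into three parts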

∞ = + + := S1 + S2 + S3. n=1, n=x 0< x n 2 2< x n < x x n x X6 |X− |≤ |X− | 4 | −X|≥ 4 For the rst sum we clearly have S = (log x), since Λ(n) log n. 1 O ≤ 50 For the second sum we observe that x x n x n log = log 1 + − − , n n  n   because x n < x/4. Therefore | − | c 2 x c n x log x 1 x log x S Λ(n)n− n− . 2  T x n  T  T 2< x n < x 1

ζ (c) 1 0 . ζ(c)  c 1

− Hence

1+ 1 xc ζ (c) x log x 1 ex log x S 0 = . 3  T ζ(c)  T c 1 T

− The theorem now follows from collecting the three estimates together.

Remark 5.1. For the sake of brevity we have not proven here the most general formula. The so-called Perron formula, which gives a similar relationship between a general Dirichlet series $f(s) = \sum a_n n^{-s}$ and the summatory function $\sum_{n \le x} a_n$ of its coefficients, can be proven using a similar argument as above. The assumption $T \ll x$ in the above theorem may also be removed at the cost of a more complicated error term.

To deduce information about the asymptotics of $\psi(x)$ from the previous theorem, we would like to move the integration path to the left of the line $\{\sigma = 1\}$. Since every zero of $\zeta(s)$ is a pole of $\zeta'(s)/\zeta(s)$, to shift the path we require information about the zeros of $\zeta(s)$. By Corollary 4.1 we may shift the integration path just past the line where $\sigma = 1$. Now the integrand has a simple pole at $s = 1$ with residue $x$. Hence a careful calculation of the error terms would yield that $\psi(x) \sim x$. This proof and other proofs of the prime number theorem can be found with details in Ayoub (1963, Ch. 2).

It is not as surprising as it may first seem that such intricate information on the distribution of the prime numbers may be extracted from $\zeta(s)$ blowing up at $s = 1$: because of Euler's product, this fact immediately implies that there must be infinitely many prime numbers. We have just made this qualitative observation more precise.

For our application on prime gaps we would like to shift the integration path even further, all the way to the line $\sigma = \frac{1}{2}$, to obtain more detailed information about $\psi(x)$. Since we do not have the required information about the zeros of $\zeta(s)$, we need to be clever. The idea is to approximate $1/\zeta(s)$ with a function that is analytic in $\{\sigma < 1\}$. Recall that for $\sigma > 1$ we have

$\frac{1}{\zeta(s)} = \sum_{n=1}^{\infty} \frac{\mu(n)}{n^s}.$

Knowing that the partial sums of $\zeta(s)$ approximate it even where the sum does not converge, a natural candidate for our analytic function should be

(5.8) $\frac{1}{\zeta(s)} \approx \sum_{n \le T} \frac{\mu(n)}{n^s} =: M_T(s)$

for some suitably chosen $T$. Before making this argument rigorous we need more information on $\zeta(s)$ for $\frac{1}{2} \le \sigma \le 1$.

5.2 Estimating ζ(s) for 1/2 ≤ σ ≤ 1

As we have seen, it is easy to understand the behaviour of $\zeta(s)$ for $\sigma \ge 1$ and $\sigma \le 0$, but not inside the strip $0 < \sigma < 1$. This motivates the following theorem, which states that the behaviour of an analytic function on the boundaries of an infinite strip dictates its behaviour inside the strip. The proofs in this section are similar to those presented in Ramachandra (2007, Ch. 11).

Theorem 5.3. (Gallagher-Selberg). Let $f(s)$ be analytic in a neighbourhood of the vertical strip $\{\sigma_1 \le \sigma \le \sigma_2\}$ and of finite non-negative order in the strip, that is, $f(\sigma + it) = O(t^A)$ uniformly for some constant $A \ge 0$. Denote for $j = 1, 2$

(5.9) $M_j(t) := \max_{|u - t| \le \log t} |f(\sigma_j + iu)|,$

(5.10) $I_j(t) := \int_{-\infty}^{\infty} |f(\sigma_j + iy)|\, e^{-(y-t)^2}\, dy.$

Then for $\sigma_1 + d \le \sigma \le \sigma_2 - d$ we have uniformly

(5.11) $f(\sigma + it) \ll \frac{1}{d}\, I_1(t)^{\frac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1}}\, I_2(t)^{\frac{\sigma - \sigma_1}{\sigma_2 - \sigma_1}} \ll \frac{1}{d}\, M_1(t)^{\frac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1}}\, M_2(t)^{\frac{\sigma - \sigma_1}{\sigma_2 - \sigma_1}}, \quad t \to \infty.$

Proof. Let us fix $s$ such that $\sigma_1 + d \le \sigma \le \sigma_2 - d$ and consider the function

(5.12) $F(z, s) = f(z) K^{z-s} e^{(z-s)^2},$

where $K > 0$ is a constant. Now by the Residue theorem

$f(s) = F(s, s) = \frac{1}{2\pi i} \int_{\partial R_{s,T}} \frac{F(z, s)}{z - s}\, dz,$

where $R_{s,T} = \{\sigma_1 \le \mathrm{Re}\, z \le \sigma_2,\ |\mathrm{Im}\, z - t| \le T\}$ (see Figure 5.2 below). Hence clearly for $\sigma_1 + d \le \sigma \le \sigma_2 - d$

$|f(s)| \le \int_{\partial R_{s,T}} \frac{|F(z, s)|}{|z - s|}\, |dz| \le \frac{1}{d} \int_{\partial R_{s,T}} |F(z, s)|\, |dz|,$

since for all $z \in \partial R_{s,T}$ we have $|z - s| \ge d$.

Figure 2: Integration path $\partial R_{s,T}$.

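We next show that the integrals over the horizontal lines tend to zero as $T \to \infty$. As $|\mathrm{Im}\, z| \to \infty$, we have for $\sigma_1 \le \sigma \le \sigma_2$ and $\sigma_1 \le \mathrm{Re}\, z \le \sigma_2$,

$|F(z, s)| \ll |f(z)|\, e^{-(\mathrm{Im}\, z - t)^2} \ll (\mathrm{Im}\, z)^A e^{-(\mathrm{Im}\, z - t)^2} \to 0,$

since we are assuming that $f(z) \ll (\mathrm{Im}\, z)^A$. Hence

$\left( \int_{\sigma_1 + i(t-T)}^{\sigma_2 + i(t-T)} + \int_{\sigma_2 + i(t+T)}^{\sigma_1 + i(t+T)} \right) |F(z, s)|\, |dz| \ll \max_{x \in [\sigma_1, \sigma_2]} |F(x + i(t \pm T), s)| \to 0,$

as $T \to \infty$. Therefore, by letting $T \to \infty$ in (5.13) we obtain

$|f(s)| \le \frac{1}{d} \left( \int_{\sigma_2 + i(t-T)}^{\sigma_2 + i(t+T)} + \int_{\sigma_1 + i(t-T)}^{\sigma_2 + i(t-T)} + \int_{\sigma_1 + i(t-T)}^{\sigma_1 + i(t+T)} + \int_{\sigma_2 + i(t+T)}^{\sigma_1 + i(t+T)} \right) |F(z, s)|\, |dz|$

$\le \frac{1}{d} \left( \int_{\sigma_2 - i\infty}^{\sigma_2 + i\infty} + \int_{\sigma_1 - i\infty}^{\sigma_1 + i\infty} \right) |F(z, s)|\, |dz|$

$\ll \frac{1}{d} \left( I_1(t) K^{\sigma_1 - \sigma} + I_2(t) K^{\sigma_2 - \sigma} \right) = \frac{2}{d}\, I_1(t)^{\frac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1}}\, I_2(t)^{\frac{\sigma - \sigma_1}{\sigma_2 - \sigma_1}},$

if we choose $K^{\sigma_2 - \sigma_1} = I_1/I_2$ to make the two terms equal. This proves the first claim.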

We still need to show that $I_j(t) = O(M_j(t))$. We note that in the integrals $I_j(t)$ most of the contribution comes from the part where $y$ is close to $t$. Therefore we divide the integration into two parts

$I_j(t) = \int_{-\infty}^{\infty} |f(\sigma_j + iy)|\, e^{-(y-t)^2}\, dy = \int_{|y - t| \le \log t} + \int_{|y - t| > \log t} =: J_{j,1} + J_{j,2}.$

Now

$J_{j,1} \ll \int_{-\infty}^{\infty} M_j(t)\, e^{-(y-t)^2}\, dy \ll M_j(t)$

and since we are assuming that $f(s) \ll t^A$ uniformly, we have

$J_{j,2} = \int_{y \ge \log t} \left( |f(\sigma_j + (t + y)i)| + |f(\sigma_j + (t - y)i)| \right) e^{-y^2}\, dy \ll \int_{y \ge \log t} (y + t)^A e^{-y^2}\, dy = \int_{y \ge \log t} e^{A \log(y+t) - y^2}\, dy \to 0,$

as $t \to \infty$. Hence the theorem follows.

From this we immediately obtain

Corollary 5.1. Suppose that $f(s)$ is analytic in a neighbourhood of the vertical strip $\{\sigma_1 \le \sigma \le \sigma_2\}$ and of finite non-negative order in the strip. Assume also that there exist positive non-decreasing real functions $g_1(t)$ and $g_2(t)$ of finite order such that

(5.13) $f(\sigma_1 + it) \ll g_1(t), \quad f(\sigma_2 + it) \ll g_2(t).$

Then for all $\sigma_1 + d \le \sigma \le \sigma_2 - d$, we have

(5.14) $f(\sigma + it) \ll \frac{1}{d}\, g_1(t)^{\frac{\sigma_2 - \sigma}{\sigma_2 - \sigma_1}}\, g_2(t)^{\frac{\sigma - \sigma_1}{\sigma_2 - \sigma_1}}, \quad t \to \infty.$

Remark 5.2. From Corollary 5.1 we can easily obtain the Phragmén-Lindelöf Theorem, which states that for any $f(s)$ which satisfies the hypotheses of the above corollary, the indicator function

$\mu_f(\sigma) := \inf\{ r \in \mathbb{R} : f(\sigma + it) = O(t^r) \}$

is a convex function. As a convex function it is also continuous for $\sigma \in [\sigma_1, \sigma_2]$.

We can now prove the following

Theorem 5.4. Let be any given constant and assume that 1 c for A > 0 ζ( 2 + it) = (t ) some constant c. Then for 1 A σ 1 we have uniformly O 2 − log t ≤ ≤ 2c(1 σ) 2 (5.15) ζ(s) = (t − log t), O 2c(1 σ) 3 (5.16) ζ0(s) = (t − log t), O as s . → ∞ 54 Proof. If 1 σ 1 + 1 , then by the convexity of µ(σ) we have that 2 ≤ ≤ 2 log t c 2c(1 σ) 2 ζ(σ + it) = (t ) = (t − log t), O O c+ 2 because 2c(1 σ) log t 2 c If 1 the theorem also holds clearly t − t = e t . 1 log t σ 1, because of Theorem≥ 2.4. − ≤ ≤ Therefore assume that 1 1 1 In the notation of the previous 2 + log t σ 1 log t . corollary, we have 1 and ≤and≤ − σ1 = 2 σ2 = 1, c (5.17) g1(t) = t , g2(t) = log t. Hence

2c(1 σ) 2σ 1 2c(1 σ) 2 ζ(s) (log t)t − (log t) − t − log t.   To extend the result to 1 A σ 1 , we use the same corollary with σ = 0, σ = 2 − log t ≤ ≤ 2 1 2 1 1 1 + . Recall that by the Theorem 2.4 we have ζ(it) = (t 2 log t). Hence 2 log t O 1 1 2σ 2cσ 1 σ 2cσ 2 2c(1 σ) 2 ζ(s) log t (t 2 log t) − t t 2 − t log t t − log t,    A 1 σ A because t 2 − t log t = e . The second≤ part now follows from the rst part via Cauchy's formula. In the rst part of the theorem let the constant be 2A. Now for all s such that 1 A σ 1, 2 − log t ≤ ≤ the disk lies within the region 1 2A A Therefore B := B(s, A/log t) 2 log t σ 1 + log t . Cauchy's formula gives us − ≤ ≤

1 ζ(z) 0 ζ (s) = 2 dz 2πi ∂B (z s) Z − 2 log t 2c(1 σ) 3 max ζ(s) dz t − log t.  z ∂B| | A2 | |  ∈ Z∂B

5.3 Ingham's Mean Value Theorem

In this section we will prove a precise statement for what we mean by the approximation

$\frac{1}{\zeta(s)} \approx \sum_{n \le T} \frac{\mu(n)}{n^s}.$

This will be in the form of a mean value theorem. For this purpose we will need the following lemma. The proof is from Ramachandra (2007).

Lemma 5.2. Let

(5.18) $F_X(s) := \sum_{n \le X} a_n n^{-s}$

be a finite Dirichlet series, where $a_n \in \mathbb{C}$. Then we have uniformly

(5.19) $\int_0^T |F_X(\sigma + it)|^2\, dt \ll (T + X \log X) \sum_{n \le X} \frac{|a_n|^2}{n^{2\sigma}}, \quad T \to \infty.$

Proof. Let us write

$|F_X(s)|^2 = \sum_{n, m \le X} \frac{a_m \overline{a_n}}{m^{\sigma + it} n^{\sigma - it}} = \sum_{n \le X} \frac{|a_n|^2}{n^{2\sigma}} + \sum_{n, m \le X,\ n \ne m} \frac{a_m \overline{a_n}}{(mn)^{\sigma}} \left( \frac{n}{m} \right)^{it}.$

Then

(5.20) $\int_0^T |F_X(\sigma + it)|^2\, dt = T \sum_{n \le X} \frac{|a_n|^2}{n^{2\sigma}} + \sum_{n, m \le X,\ n \ne m} \frac{a_m \overline{a_n}}{(mn)^{\sigma}} \int_0^T \left( \frac{n}{m} \right)^{it} dt.$

The first term is already of the right form, and the second term is bounded by

a a a a (5.21) | m n| | m n| .  (mn)σ log n  (mn)σ log n n,m X, n=m m 1 m

a a n a a | m n| | m| | n|  (mn)σ n m  σ σ 1 1 m

1 1 a 2 2 a 2 2 a 2 log X | m| | n| n2 X log X | n| ,  m2σ n2σ  n2σ m X ! n X ! n X X≤ X≤ X≤ which combined with 5.20 proves the lemma.
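The identity (5.20) also gives a convenient way to evaluate the mean square of a finite Dirichlet series exactly, since $\int_0^T (n/m)^{it}\, dt = \frac{e^{iT \log(n/m)} - 1}{i \log(n/m)}$ for $n \ne m$. The Python sketch below uses this to compare the left-hand side of (5.19) with $(T + X \log X) \sum_{n \le X} |a_n|^2 n^{-2\sigma}$. The coefficients $a_n = 1$ and the values of $X$, $T$, $\sigma$ are assumptions chosen for illustration, and the printed ratio says nothing about the implied constant in general.

```python
import cmath
import math


def mean_square(a: list, sigma: float, T: float) -> float:
    """Integral of |F(sigma + it)|^2 over 0 <= t <= T, where F(s) = sum a[n-1] * n^(-s).

    Computed by expanding the square and integrating termwise, as in (5.20).
    """
    X = len(a)
    diagonal = T * sum(abs(a[n - 1]) ** 2 / n ** (2 * sigma) for n in range(1, X + 1))
    off_diagonal = 0j
    for m in range(1, X + 1):
        for n in range(1, X + 1):
            if m == n:
                continue
            w = math.log(n / m)
            # integral of (n/m)^(it) over [0, T]
            oscillation = (cmath.exp(1j * T * w) - 1) / (1j * w)
            coeff = complex(a[m - 1]) * complex(a[n - 1]).conjugate()
            off_diagonal += coeff * oscillation / (m * n) ** sigma
    return diagonal + off_diagonal.real


def lemma_rhs(a: list, sigma: float, T: float) -> float:
    """(T + X log X) * sum |a_n|^2 / n^(2 sigma), the right-hand side of (5.19)."""
    X = len(a)
    weight = sum(abs(a[n - 1]) ** 2 / n ** (2 * sigma) for n in range(1, X + 1))
    return (T + X * math.log(X)) * weight


if __name__ == "__main__":
    coefficients = [1] * 50  # illustrative choice a_n = 1 for n <= 50
    lhs = mean_square(coefficients, 0.5, 200.0)
    rhs = lemma_rhs(coefficients, 0.5, 200.0)
    print(f"mean square = {lhs:.2f}, (T + X log X) * sum = {rhs:.2f}, ratio = {lhs / rhs:.3f}")
```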

We are now ready for the proof of the following theorem, which states that the mean value of $|\zeta(s) M_T(s) - 1|^2$ on $0 \le t \le T$ is relatively small. The idea of the proof is again from Ramachandra (2007, Ch. 11), but we have simplified the argument found there. Note that in the previous lemma we may shift the integration interval from $[0, T]$ to $[y, y + T]$ for any $y \in \mathbb{R}$ and the same bound holds for the integral. This is because the shift can be absorbed into the complex numbers $a_n$ of the above lemma without changing their modulus, by writing $a_n n^{-i(y+t)} = a_n n^{-iy} n^{-it}$.

Theorem 5.5. (Ingham). Assume that $\zeta(\frac{1}{2} + it) = O(t^c)$ for some constant $c > 0$ and define

(5.22) $\varphi_T(s) := \zeta(s) M_T(s) - 1, \quad \Phi_T(s) := (1 - 2^{1-s}) \varphi_T(s),$

where

(5.23) $M_T = \sum_{n \le T} \frac{\mu(n)}{n^s}.$

Then for $\frac{1}{2} + \frac{1}{\log T} \le \sigma \le 1 - \frac{1}{\log T}$ we have uniformly

(5.24) $I := \int_0^T |\Phi_T(\sigma + it)|^2\, dt \ll T^{(2+4c)(1-\sigma)} \log^8 T, \quad T \to \infty.$

Proof. We have multiplied the function $\varphi_T(s)$ by $(1 - 2^{1-s})$ in order to get rid of the pole of $\zeta(s)$ at $s = 1$. Note that clearly $|\Phi_T(s)| \ll |\varphi_T(s)|$. Now by Theorem 5.3 we have with $\sigma_1 = \frac{1}{2}$, $\sigma_2 = 1$ that

$|\Phi_T(s)|^2 \ll \log T\; I_1(t)^{2(1-\sigma)}\, I_2(t)^{2\sigma - 1},$

where

(5.25) $I_j(t) := \int_{-\infty}^{\infty} |\Phi_T(\sigma_j + i(y + t))|^2\, e^{-y^2}\, dy.$

Hölder's inequality gives us

(5.26) $I = \int_0^T |\Phi_T(\sigma + it)|^2\, dt \ll \log T \int_0^T I_1(t)^{2(1-\sigma)} I_2(t)^{2\sigma - 1}\, dt \ll \log T \left( \int_0^T I_1(t)\, dt \right)^{2(1-\sigma)} \left( \int_0^T I_2(t)\, dt \right)^{2\sigma - 1},$

since $2(1 - \sigma) < 1$, $2\sigma - 1 < 1$ and $2(1 - \sigma) + 2\sigma - 1 = 1$. We next estimate the two integrals separately.

On $\sigma = \frac{1}{2}$ we have $|\zeta(\sigma + i(t + y))| \ll (|t| + |y|)^c$. Therefore

$\left| \Phi_T\left( \tfrac{1}{2} + i(y + t) \right) \right|^2 \ll |\zeta(s) M_T(s)|^2 + 1 \ll (|t| + |y|)^{2c} |M_T(s)|^2 + 1.$

Hence, changing the order of integration yields

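$\int_0^T I_1(t)\, dt = \int_{-\infty}^{\infty} e^{-y^2} \int_0^T \left| \Phi_T\left( \tfrac{1}{2} + i(y + t) \right) \right|^2 dt\, dy \ll \int_{-\infty}^{\infty} e^{-y^2} \left( T + (T + |y|)^{2c} \int_0^T \left| M_T\left( \tfrac{1}{2} + i(y + t) \right) \right|^2 dt \right) dy.$

Now by Lemma 5.2 we have

$\int_0^T \left| M_T\left( \tfrac{1}{2} + i(y + t) \right) \right|^2 dt \ll (T + T \log T) \sum_{n \le T} \frac{|\mu(n)|}{n} \ll T \log^2 T.$

Therefore

(5.27) $\int_0^T I_1(t)\, dt \ll \int_{-\infty}^{\infty} e^{-y^2} \left( T + (T + |y|)^{2c}\, T \log^2 T \right) dy \ll T^{1+2c} \log^2 T.$

For the second integral Theorem 3.3 gives us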

1 s 1 s 1 it 1 1 (5.28) (1 2 − )ζ(1 + it) = (1 2 − ) n− − + + (T − ). − − O 1 + t O n T   X≤ | | We have

s s s n− MT (s) = n− µ(n)n− n T n T n T X≤ X≤ X≤ s (5.29) =: 1 + ann− =: 1 + FT (s), T

1, n = 1 µ(d) = 0, n > 1. d n ( X| We note that clearly Therefore, for by (5.28) and an d n 1 = d(n). s = 1 + i(t + y), (5.29) we have | | ≤ | P 2 1 s 2 ΦT (s)) = (1 2 − )(ζ(s)MT (s) 1) | | | − − | 2 s MT (s) MT (s) n− M (s) 1 + | | + | |  T − (1 + t + y ) T n T X≤ | | 2 2 2 MT (s) MT (s) FT (s) + | | + | | .  | | (1 + t + y )2 T 2 | | 58 Hence for a xed y we have for s = 1 + i(y + t) T T T 2 T 2 2 2 MT (s) MT (s) ΦT (s)) dt FT (s) dt + | | 2 dt + | 2 | dt 0 | |  0 | | 0 (1 + t + y ) 0 T Z Z T Z | | Z F (s) 2 dt + log2 T,  | T | Z0 because of the trivial estimate 2 2 1 2 M (1 + i(t + y)) n− log T. T   n T ! X≤ Now we divide the sum into log T parts FT (s) m = log 2 j k m F (s) = + + + =: G (s). T ··· j T

m F (s) 2 log T G (s) 2. | T |  | j | j=1 X By Lemma 5.2 we have for a xed y T 2 2 j an Gj(1 + i(t + y)) dt (T + 2 T (log T + j log 2)) | 2| 0 | |  j 1 j n Z 2 − T

d(n)2 2jT log3(2jT ) 2jT log3 T.   2j 1T

T m Φ (s) 2 dt log2 T + log4 T log5 T, | T |   0 j=1 Z X and we obtain T T ∞ y2 2 I2(t) dt = e− ΦT (1 + i(t + y)) dt dy 0 0 | | Z Z−∞ Z ∞ y2 5 5 e− log T dy log T.   Z−∞ Collecting our two estimates together in (5.26) yields

(5.30) $I \ll \log T\, (T^{1+2c} \log^2 T)^{2(1-\sigma)} (\log^5 T)^{2\sigma - 1} \ll T^{(2+4c)(1-\sigma)} \log^8 T.$

5.4 Prime Gaps

We are now ready to prove the theorem of Hoheisel and Ingham on the difference of consecutive prime numbers. The proof given here is from Ramachandra (2007, Ch. 11).

Theorem 5.6. (Hoheisel-Ingham, 1937). Assume that $\zeta(\frac{1}{2} + it) = O(t^c)$ for some constant $c > 0$. Then for every $\frac{1+4c}{2+4c} < \theta < 1$ and for all functions $h = h(x)$ such that $x^{\theta} \ll h \ll x$, we have

(5.31) $\psi(x + h) - \psi(x) \sim h, \quad x \to \infty.$

Therefore, for large enough $n$,

(5.32) $p_{n+1} - p_n < p_n^{\theta}.$

Proof. Let us denote

(5.33) $Z = Z(s) := -\frac{\zeta'(s)}{\zeta(s)}, \quad \varphi = \varphi_T(s) = \zeta(s) M_T(s) - 1.$

Then a simple calculation gives us

$Z = -Z\varphi_T - \zeta' M_T = Z\varphi_T^2 + \zeta' M_T (\varphi_T - 1)$

(5.34) $= \frac{Z \Phi_T^2}{(1 - 2^{1-s})^2} + \zeta' M_T (\zeta M_T - 2).$

By Theorem 5.2 we have for $T \ll x$

(5.35) $\psi(x) = \frac{1}{2\pi i} \int_{c_0 - iT}^{c_0 + iT} Z\, \frac{x^s}{s}\, ds + O\left( \frac{x \log^2 x}{T} \right),$

where $c_0 = 1 + \frac{1}{\log x}$. Now let $\omega(t)$ be a function such that $\omega(t) \to \infty$ and $\omega(t) \ll \log^{\frac{1}{5}}(t)$, and define

(5.36) $\eta = \eta(T) := \frac{\omega(T) \log\log T}{\log T}.$

Then by Corollary 4.1 we can choose $\omega(t)$ so that we also have

(5.37) $Z(\sigma + it) \ll \log t, \quad \zeta(s) \ne 0$

for all $|t| \le T$ and $\sigma \ge 1 - \eta$. Therefore by the Residue theorem we have

$\frac{1}{2\pi i} \left( \int_{c_0 - iT}^{c_0 + iT} + \int_{c_0 + iT}^{1 - \eta + iT} + \int_{1 - \eta + iT}^{1 - \eta - iT} + \int_{1 - \eta - iT}^{c_0 - iT} \right) Z\, \frac{x^s}{s}\, ds = x,$

since the only pole inside the rectangle is at $s = 1$ with residue $x$. Since by Corollary 4.1 we have $Z(s) \ll \log t$, the second and the fourth integral are bounded by

$\int_{c_0 + iT}^{1 - \eta + iT} \left| Z\, \frac{x^s}{s} \right| |ds| \ll \frac{x^{c_0}}{T} \log T\, (c_0 - 1 + \eta) \ll \frac{x \log T}{T}.$

Therefore, for $T \ll x$ we have

(5.38) $\psi(x) = x + \frac{1}{2\pi i} \int_{1 - \eta - iT}^{1 - \eta + iT} Z\, \frac{x^s}{s}\, ds + O\left( \frac{x \log^2 x}{T} \right).$

Hence, for any $1 \le h \ll x$ we have

$\psi(x + h) - \psi(x) = h + \frac{1}{2\pi i} \int_{1 - \eta - iT}^{1 - \eta + iT} Z\, \frac{(x + h)^s - x^s}{s}\, ds + O\left( \frac{x \log^2 x}{T} \right)$

$= h + \frac{1}{2\pi i} \int_{1 - \eta - iT}^{1 - \eta + iT} Z \int_x^{x+h} u^{s-1}\, du\, ds + O\left( \frac{x \log^2 x}{T} \right)$

(5.39) $= h + \frac{1}{2\pi i} \int_x^{x+h} g(u)\, du + O\left( \frac{x \log^2 x}{T} \right),$

where

(5.40) $g(u) := \int_{1 - \eta - iT}^{1 - \eta + iT} Z u^{s-1}\, ds.$

What we need to show is that if we choose $T$ suitably, then $g(u) = o(1)$ for $x \le u \le x + h$ as $x \to \infty$. We now write using (5.34) the integrand as

$g(u) = \int_{1 - \eta - iT}^{1 - \eta + iT} \frac{Z \Phi_T^2}{(1 - 2^{1-s})^2}\, u^{s-1}\, ds + \int_{1 - \eta - iT}^{1 - \eta + iT} \zeta' M_T (\zeta M_T - 2)\, u^{s-1}\, ds =: g_1(u) + g_2(u).$

Now on $\sigma = 1 - \eta$ we have

$Z \ll \log t, \quad |1 - 2^{1-s}| \ge 2^{\eta} - 1 \gg \eta \ge \frac{1}{\log T}.$

Therefore by Theorem 5.5 we have

$g_1(u) \ll u^{-\eta} \log^3 T \int_0^T |\Phi_T(1 - \eta + it)|^2\, dt \ll x^{-\eta}\, T^{(2+4c)\eta} \log^{11} T,$

since we have assumed that $\zeta(\frac{1}{2} + it) = O(t^c)$. To estimate the second integral we note that the integrand is analytic for $\sigma < 1$. Hence by Cauchy's theorem

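$g_2(u) = \left( \int_{\frac{1}{2} - iT}^{\frac{1}{2} + iT} + \int_{\frac{1}{2} + iT}^{1 - \eta + iT} + \int_{1 - \eta - iT}^{\frac{1}{2} - iT} \right) \zeta' M_T (\zeta M_T - 2)\, u^{s-1}\, ds =: I + J_1 + J_2.$

To estimate $J_1$ and $J_2$, we have for $\sigma < 1$

$|M_T(s)| \le \sum_{n \le T} n^{-\sigma} \le T^{1-\sigma} \sum_{n \le T} n^{-1} \ll T^{1-\sigma} \log T,$

and by Theorem 5.4

$\zeta(s) = O(t^{2c(1-\sigma)} \log^2 t), \quad \zeta'(s) = O(t^{2c(1-\sigma)} \log^3 t).$

Therefore, for $k = 1, 2$, we have

$|J_k| \ll \max_{\frac{1}{2} \le \sigma \le 1 - \eta,\ t = T} |\zeta' M_T (\zeta M_T - 2)|\, x^{\sigma - 1} \ll \max_{\frac{1}{2} \le \sigma \le 1 - \eta} \left( \frac{T^{2+4c}}{x} \right)^{1-\sigma} \log^7 T.$

For the integral $I$ we have $\sigma = \frac{1}{2}$, so that

$\zeta' M_T (\zeta M_T - 2)\, u^{s-1} \ll x^{-\frac{1}{2}} \left( |\zeta \zeta' M_T^2| + 2 |\zeta' M_T| \right) \ll \frac{T^{2c} \log^3 T}{x^{\frac{1}{2}}} \left( |M_T|^2 + |M_T| \right).$

Now $|M_T|^2 + |M_T| \le \max\{ 2|M_T|^2, 2 \}$, depending on whether $|M_T| \ge 1$ or $|M_T| < 1$. Therefore

$I \ll \frac{T^{2c} \log^3 T}{x^{\frac{1}{2}}} \int_0^T \left( \left| M_T\left( \tfrac{1}{2} + it \right) \right|^2 + 2 \right) dt \ll \frac{T^{1+2c} \log^5 T}{x^{\frac{1}{2}}},$

since by Lemma 5.2

$\int_0^T \left| M_T\left( \tfrac{1}{2} + it \right) \right|^2 dt \ll T \log T \sum_{n \le T} n^{-1} \ll T \log^2 T.$

Combining our estimates for $g_1$ and $I$, $J_1$ and $J_2$ yields

$g(u) \ll \max_{\frac{1}{2} \le \sigma \le 1 - \eta} \left( \frac{T^{2+4c}}{x} \right)^{1-\sigma} \log^{11} T.$

Taking $T = x^{\frac{1}{2+4c} - \varepsilon} \ll x$ for any small $\varepsilon > 0$ gives us

$g(u) \ll x^{-\varepsilon\eta} \log^{11} T = \exp\left\{ -\varepsilon\, \omega(T)\, \frac{(\log\log T) \log x}{\log T} \right\} \log^{11} T$

(5.41) $\ll (\log T)^{11 - \varepsilon\omega(T)} \to 0,$

as $x \to \infty$, because $\omega(T) \to \infty$. Therefore from (5.39) we obtain

$\psi(x + h) - \psi(x) = h + o(h) + O\left( x^{1 - \frac{1}{2+4c} + \varepsilon} \log^2 x \right)$

for any given $\varepsilon > 0$. Hence, if we choose any $\frac{1+4c}{2+4c} < \theta < 1$, any $x^{\theta} \ll h \ll x$ and a small enough $\varepsilon$, we obtain

$\psi(x + h) - \psi(x) \sim h, \quad x \to \infty.$

The other part of the theorem now follows directly from Theorem 5.1.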
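The dependence of the admissible exponent on $c$ in Theorem 5.6 is simple arithmetic, and the following Python sketch makes it explicit with exact rational numbers. The function name is a hypothetical label introduced here; the sample values of $c$ are illustrative.

```python
from fractions import Fraction


def hoheisel_ingham_threshold(c) -> Fraction:
    """Return (1 + 4c) / (2 + 4c), the lower limit for theta in Theorem 5.6."""
    c = Fraction(c)
    return (1 + 4 * c) / (2 + 4 * c)


if __name__ == "__main__":
    for c in (Fraction(1, 4), Fraction(1, 6), Fraction(0)):
        print(f"c = {c}: theta > {hoheisel_ingham_threshold(c)}")
```

A smaller growth exponent $c$ gives a smaller threshold, and the threshold tends to $\frac{1}{2}$ as $c \to 0$.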

We can now use our estimate for $\zeta(s)$ obtained from van der Corput's method to deduce the following corollary.

Corollary 5.2. For every $\varepsilon > 0$, we have

(5.42) $p_{n+1} - p_n < p_n^{\frac{5}{8} + \varepsilon}$

for large enough $n$. Therefore, for large enough $N$, there is always a prime number between $N^3$ and $(N + 1)^3$.

Proof. By Theorem 3.6 we have $\zeta(\frac{1}{2} + it) = O(t^{\frac{1}{6}} \log t)$. Therefore Theorems 5.1 and 5.6 give us that for every

$\theta > \frac{1 + \frac{4}{6}}{2 + \frac{4}{6}} = \frac{5}{8}$

there always lies a prime number in the interval $(n, n + n^{\theta}]$ for large enough $n$, which implies the first part of the theorem. For the second part of the theorem we have

$(N + 1)^3 - N^3 = 3N^2 + 3N + 1 > 3 (N^3)^{\frac{2}{3}} > (N^3)^{\frac{5}{8} + \frac{1}{100}},$

so for large enough $N$ the interval $(N^3, N^3 + (N^3)^{\frac{5}{8} + \frac{1}{100}}]$ lies strictly between $N^3$ and $(N + 1)^3$ and contains a prime number, which gives us the theorem.
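Corollary 5.2 is a statement about all large enough $N$, so it cannot be confirmed by computation, but the claim about cubes is easy to explore for small $N$. The Python sketch below searches for the first prime strictly between $N^3$ and $(N + 1)^3$ by trial division. The function names and the search range are choices made for this illustration.

```python
def is_prime(n: int) -> bool:
    """Trial-division primality test, adequate for the small range explored here."""
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2
    d = 3
    while d * d <= n:
        if n % d == 0:
            return False
        d += 2
    return True


def prime_between_cubes(N: int):
    """Return the smallest prime p with N^3 < p < (N + 1)^3, or None if there is none."""
    for candidate in range(N ** 3 + 1, (N + 1) ** 3):
        if is_prime(candidate):
            return candidate
    return None


if __name__ == "__main__":
    for N in range(1, 11):
        print(N, N ** 3, prime_between_cubes(N), (N + 1) ** 3)
```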


6 Prime Pairs

In this chapter we study prime pairs by using exponential sums and the theorems of the previous chapters. The proofs and the results presented in this chapter are the author's own, unless otherwise stated. We have searched the existing literature for similar theorems. In those instances where we were able to find corresponding ideas (namely Theorem 6.4 and Section 6.4), we have inserted references to them.

6.1 Averaged Form of the Hardy-Littlewood Conjecture

In this section we will apply the theorem of Hoheisel and Ingham to derive asymptotic results concerning prime pairs. In many of the theorems we will have the familiar assumptions $\zeta(\frac{1}{2} + it) = O(t^c)$ for some constant $c > 0$ and $\frac{1+4c}{2+4c} < \theta < 1$. By Theorem 3.6, we may choose any $\frac{5}{8} < \theta < 1$.

Let $2k \ge 2$ be a constant and consider the prime pair counting function

(6.1) $\pi_{2k}(x) := |\{ p \le x : p \text{ and } p + 2k \text{ both prime numbers} \}|.$

That is, $\pi_{2k}(x)$ counts the prime numbers $p \le x$ such that $p + 2k$ is also a prime number. We use the notation $|A|$ for the cardinality of a finite set $A$.

By the prime number theorem we know that there are roughly $x/\log x$ prime numbers less than $x$. Then roughly $1/\log x$ of the numbers less than $x$ are prime numbers. Assuming that the event that some $n \le x$ is a prime is independent of the number $n + 2k$ being a prime, it is tempting to guess that $\pi_{2k}(x)$ is asymptotic to $x/\log^2 x$. However, there are some problems with this heuristic.

Consider first the case $2k = 2$. If $q > 2$ is a prime we know already that $q + 2$ is not divisible by 2. Therefore we should at least multiply $x/\log^2 x$ by 2. Suppose then that $q$ is not divisible by a prime $p > 2$. Then $q$ belongs to one of the $(p - 1)$ non-zero residue classes of $p$. If $q + 2$ is also not divisible by $p$, then it must be in one of the remaining $(p - 2)$ of the $(p - 1)$ non-zero residue classes of $p$. Hence, given that $q$ is not divisible by $p$, the number $q + 2$ is not divisible by $p$ approximately $\frac{p-2}{p-1}$ of the time. Since the naive heuristic assumed that the probability that the integer $q + 2$ is not divisible by $p$ is $\frac{p-1}{p}$, we should multiply our guess for $\pi_2(x)$ by a correction factor $\frac{p-2}{p-1} \big/ \frac{p-1}{p} = \frac{p(p-2)}{(p-1)^2}$. Doing this for all prime numbers $p > 2$ leads us to guess that

(6.2) $\pi_2(x) \sim 2 \left( \prod_{p > 2} \frac{p(p-2)}{(p-1)^2} \right) \frac{x}{\log^2 x}.$

The convergence of the above product can be seen from

(6.3) $\prod_{p > 2} \frac{p(p-2)}{(p-1)^2} = \prod_{p > 2} \left( 1 - \frac{1}{(p-1)^2} \right).$

Let us then consider, for example, the cases $2k = 2$ and $2k = 6$. If a number $q > 3$ is a prime, then it is not divisible by the number 3. This alone does not imply whether or not $q + 2$ is divisible by 3, but we do know that $q + 6$ also cannot be divisible by 3. This makes it more probable that $q + 6$ is a prime than that $q + 2$ is a prime. The

65 number q + 2 is not divisible by 3 only half of of the time, since either q 1 (mod 3) or ≡ q 2 (mod 3). Thus a reasonable guess is that π6(x) 2 π2(x). ≡More generally, let 2k be a constant and let q >∼ k be a prime number. Then q is not divisible by any of the prime factors of 2k. Hence also q + 2k is not divisible by any of the odd prime factors of k. Let p be a odd prime factor of k. Then q + 2 is not divisible by approximately 1 p 2 of the time, which leads us to guess that p 1 p 1 = p−1 p 1 − − − π2k(x) 2

Conjecture 6.1. (Hardy-Littlewood Conjecture). Let 2k 2 be a constant. Then ≥ x (6.4) π2k(x) C2k , ∼ log2 x where the constant C2k is dened by p(p 2) p 1 (6.5) C := 2 − − . 2k (p 1)2 p 2 p>2 2

$\liminf_{k \to \infty}\, (p_{k+1} - p_k) \le 246.$

The first result of this form was obtained by Yitang Zhang for at least one $2k \le 70{,}000{,}000$ in 2013 (Polymath, 2014). It should be noted that the Prime Number Theorem $\pi(x) \sim x/\log x$ immediately implies that

$\liminf_{k \to \infty} \frac{p_{k+1} - p_k}{\log p_k} \le 1.$

Using the theorem of Hoheisel and Ingham proven in the previous section we will show that the average of the functions $\pi_{2k}(x)$ over positive $2k \le x^{\theta}$ is consistent with the Hardy-Littlewood Conjecture. First we require the following theorem on the ordinary prime counting function $\pi(x)$.

Theorem 6.1. Suppose that $\zeta(\frac{1}{2} + it) = O(t^c)$ for some constant $c > 0$. Then for all $\frac{1+4c}{2+4c} < \theta < 1$, and for all functions $h = h(x)$ such that $x^{\theta} \ll h \ll x$, we have

(6.6) $\pi(x + h) - \pi(x) \sim \frac{h}{\log x}, \quad x \to \infty.$

Proof. By Theorem 5.6 we have

ψ(x + h) ψ(x) = log p h, x . − ∼ → ∞ x

log p log x 1 = log x (π(x + h) π(x)) . ≥ − x

ψ(x + h) ψ(x) h (6.7) π(x + h) π(x) − . − ≤ log x ∼ log x To obtain an inequality to the other direction, we note that

log(x + h)(π(x + h) π(x)) log p − ≥ x

log p = log p + log p + ... x

ψ(x + h) ψ(x) 1 log x h (6.8) π(x + h) π(x) − h 2 , − ≥ log(x + h) − log 2 ∼ log x since 1+4c < θ < 1 and xθ h x. Therefore, by (6.7) and (6.8), we have 2+4c   h π(x + h) π(x) . − ∼ log x

Remark 6.1. In what follows we will need the following almost trivial remark about the error term in the previous theorem. Let $n = n(x)$ be a function such that $x^{\theta} < n \le x$. If we write

(6.9) $\pi(n + x^{\theta}) - \pi(n) = \frac{x^{\theta}}{\log n} + R(n, x),$

then by the previous theorem $R(n, x) = o\left( \frac{x^{\theta}}{\log x} \right)$ holds uniformly for all of the functions $n = n(x)$. If this was not the case, then we could find a sequence $(x_k)_{k=1}^{\infty}$ tending to infinity such that for each $x_k$ we could find $n_k$ such that $x_k^{\theta} < n_k \le x_k$ and $|R(n_k, x_k)| \ge \varepsilon \frac{x_k^{\theta}}{\log x_k}$ for some fixed $\varepsilon > 0$. This would in turn contradict the above theorem, because then for any function $h = h(x)$ satisfying $x^{\theta} \ll h \ll x$ and $h(n_k) = x_k^{\theta}$ we would have

$\pi(x + h) - \pi(x) \not\sim \frac{h}{\log x}.$

Note that $n_k^{\theta} \le h(n_k) = x_k^{\theta} < n_k$. We will need this fact in a moment, when we sum the error terms over $x^{\theta} < n \le x$.

We also record the Prime Number Theorem for future reference. The proof can be found in Ayoub (1963, Ch. 2).

Theorem 6.2. (Prime Number Theorem). The number of prime numbers less than $x$ satisfies

(6.10) $\pi(x) \sim \frac{x}{\log x}, \quad x \to \infty.$

Using the above theorems we can prove the following theorem on the average of the functions $\pi_{2k}(x)$ over positive $2k \le x^{\theta}$.

Theorem 6.3. Suppose that $\zeta(\frac{1}{2} + it) = O(t^c)$ for some constant $c > 0$. Then for every $\frac{1+4c}{2+4c} < \theta < 1$, we have

(6.11) $\frac{2}{x^{\theta}} \sum_{2k \le x^{\theta}} \pi_{2k}(x) \sim 2\, \frac{x}{\log^2 x}, \quad x \to \infty.$

Furthermore, for all functions $h = h(x)$ such that $x^{\theta} \ll h \ll x$,

(6.12) $\frac{2}{x^{\theta}} \sum_{2k \le x^{\theta}} \left( \pi_{2k}(x + h) - \pi_{2k}(x) \right) \sim 2\, \frac{h}{\log^2 x}, \quad x \to \infty.$

Proof. Let $P(n)$ denote the characteristic function of primes. That is,

(6.13) $P(n) := \begin{cases} 1, & n \text{ is a prime}, \\ 0 & \text{otherwise}. \end{cases}$

Then for any fixed $2k$ the function $P(n)P(n + 2k)$ is the characteristic function for primes $p$ such that $p + 2k$ is also a prime. Note that if $r$ is odd, then the number of primes $p$ such that $p + r$ is also a prime is at most 1. Therefore,

π2k(x) = P (n)P (n + 2k) 2k xθ 2k xθ n x X≤ X≤ X≤ = P (n)P (n + k) + (xθ) O k xθ n x X≤ X≤ = P (n)P (n + k) + (x2θ). O k xθ xθ

P (n)P (n + k) = P (n) P (n + k) k xθ xθn x xθ

xθ xθ xθ x1+θ P (n) + o = P (n) + o . log n log n log n log2 x xθ

xθ xθ x1+θ P (n) π(x) π(xθ) . log n ≥ log x − ∼ log2 x xθ

xθ xθ xθ P (n) P (n) + P (n) log n ≤ θ log x log y xθ

θ 1+θ x x 2θ π2k(x) = P (n) + o + (x ) log n log2 x O 2k xθ xθ

2 x π2k(x) 2 xθ ∼ log2 x 2k xθ X≤ as x . → ∞ 69 The proof of the second claim is similar, but slightly easier. For all xθ h x we have   2 2 (π (x + h) π (x)) = P (n)P (n + 2k) xθ 2k − 2k xθ 2k xθ 2k xθ x a. That is, ≤ 2 x (6.14) π2k(x) 2 , x . (b a)xθ ∼ log2 x → ∞ axθ<2k bxθ − X≤ θ 2 In this sense the numbers 2k x , for which π2k(x) is large in comparison to x/log x, are equidistributed (see Section≤ 6.3).

Note that in the Hardy-Littlewood Conjecture we have the correction factors $C_{2k}$, which depend on the odd prime divisors of $k$. This is not in contradiction with the above theorem, since most of the contributions in the average (6.14) come from $2k$ which are dependent on $x$. In the Hardy-Littlewood Conjecture the number $2k$ is considered to be a constant. However, motivated by the previous theorem we next show that the average of the constants $C_{2k}$ over $2k \le y$ does tend to 2, as $y \to \infty$.

Theorem 6.4. Let

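\[
C_{2k} = 2 \prod_{p>2} \frac{p(p-2)}{(p-1)^2} \prod_{2<p \mid k} \frac{p-1}{p-2}. \tag{6.15}
\]
Then
\[
\frac{2}{y} \sum_{2k \le y} C_{2k} \to 2, \quad y \to \infty. \tag{6.16}
\]

Proof. Let us write
\[
\prod_{2<p \mid k} \frac{p-1}{p-2} = \prod_{2<p \mid k} \left(1 + \frac{1}{p-2}\right)
\]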

1 1 1 + = exp log 1 + p 2  p 2  2

p 1 y y 1 1 − = + µ(d) + y 2 p 2 2 2d| | p 2 O 2k y 2

y 1 1 (6.17) = 1 + µ(d) + y 2 . 2  | | p(p 2) O 22 p>2 Y − Y  −  1 = 1 + µ(d) . | | p(p 2) d>2, d odd 2

2 p(p 2) y . Thus, − → ∞ Q 2 p(p 2) 2 p 1 C = 2 − − y 2k (p 1)2 y p 2 2k y p>2 2k y 2

p(p 2) 1 1 = 2 − 1 + µ(d) + y− 2 (p 1)2  | | p(p 2) O p>2 2

As mentioned earlier, giving lower bounds for $\pi_{2k}(x)$ is a problem of great difficulty. Next we will apply the previous two theorems to give lower bounds for the functions $\pi_{2k}(x)$ for at least some proportion of the numbers $2k \le x^{\theta}$. Since we have an asymptotic result for the average of the functions, in order to deduce lower bounds we need to have upper bounds. Sieve theory has proven very successful in giving upper bounds for the functions $\pi_{2k}(x)$. We make use of the following result, the proof of which is too complicated to be given here. The proof can be found in Halberstam & Richert (2011, p. 117).

Theorem 6.5. We have uniformly for all $2k$
\[
\pi_{2k}(x) \le 4\,C_{2k}\,\frac{x}{\log^2 x}\,(1 + o(1)), \quad x \to \infty, \tag{6.18}
\]
where
\[
C_{2k} = 2 \prod_{p>2} \frac{p(p-2)}{(p-1)^2} \prod_{2<p \mid k} \frac{p-1}{p-2}. \tag{6.19}
\]

Remark 6.4. We have abused the small $o$ notation above. For two functions $f(x)$ and $g(x)$, we write $f(x) \le g(x)(1 + o(1))$ to say that for all $\varepsilon > 0$ we have $f(x) \le (1 + \varepsilon)g(x)$ for large enough $x$. Similarly for $f(x) \ge g(x)(1 + o(1))$. We also define the strict inequality $f(x) < g(x)(1 + o(1))$ as the negation of $f(x) \ge g(x)(1 + o(1))$. That is, there exists an $\varepsilon > 0$ such that $f(x) < g(x)(1 - \varepsilon)$ for arbitrarily large $x$. A similar definition applies for the other direction.
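The constants $C_{2k}$ of (6.19) and the averaging statement (6.16) of Theorem 6.4 can be explored numerically. The following Python sketch is only an illustration: the product over primes is truncated at $10^5$ (an arbitrary cutoff chosen here), and the names `PI2`, `c2k` and `average_c2k` are hypothetical helpers, not notation from the text.

```python
from math import prod


def primes_up_to(limit):
    """Sieve of Eratosthenes: all primes p <= limit."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(limit ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, limit + 1, p)))
    return [p for p in range(2, limit + 1) if sieve[p]]


# Truncated version of the product over odd primes of p(p-2)/(p-1)^2.
# The cutoff 10**5 is an assumption made for this illustration.
PI2 = prod(p * (p - 2) / (p - 1) ** 2 for p in primes_up_to(10**5) if p > 2)


def odd_prime_divisors(k):
    """Distinct odd primes dividing k, found by trial division."""
    divisors = []
    while k % 2 == 0:
        k //= 2
    p = 3
    while p * p <= k:
        if k % p == 0:
            divisors.append(p)
            while k % p == 0:
                k //= p
        p += 2
    if k > 1:
        divisors.append(k)
    return divisors


def c2k(k):
    """Approximate C_{2k} = 2 * PI2 * prod over odd p | k of (p-1)/(p-2)."""
    value = 2 * PI2
    for p in odd_prime_divisors(k):
        value *= (p - 1) / (p - 2)
    return value


def average_c2k(y):
    """The average (2/y) * sum of C_{2k} over 2k <= y, as in (6.16)."""
    return 2 / y * sum(c2k(k) for k in range(1, y // 2 + 1))
```

For instance, `c2k(3)` is twice `c2k(1)`, because the factor $(p-1)/(p-2)$ equals 2 at $p = 3$, and `average_c2k(y)` can be evaluated for growing `y` to watch the average approach 2.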

In what follows we will denote $\prod_{p>2} \frac{p(p-2)}{(p-1)^2}$ by $\Pi_2$. We will also make use of the fact that $2\Pi_2 > 1.32$ (Halberstam & Richert, p. 128). For a constant $0 < C < 1$ we will say that a subset $A \subset \{2k \le x^{\theta}\}$ is of proportion $C$ if $2|A|/x^{\theta} = C(1 + o(1))$ as $x \to \infty$.

Theorem 6.6. Suppose that $\zeta(\tfrac12 + it) = O(t^{c})$ for some constant $c > 0$ and let $\frac{1+4c}{2+4c} < \theta < 1$. Let $C < \frac{1}{24\Pi_2}$ be a positive constant. Then for at least a proportion $C$ of the numbers $2k \le x^{\theta}$ we have uniformly
\[
\pi_{2k}(x) \ge D\,C_{2k}\,\frac{x}{\log^2 x}\,(1 + o(1)), \quad x \to \infty, \tag{6.20}
\]
where $D = D(C) = \tfrac12 - 12C\Pi_2$. That is, for large enough $x$, we can find a subset of the numbers $2k \le x^{\theta}$ of size $Cx^{\theta}/2$ such that (6.20) holds.

Proof. Denote $K := \{2k \le x^{\theta}\}$. Suppose that there exists a subset $B \subset K$ such that $|B| = (1 - C)|K|$, and for all $2k \in B$ we have
\[
\pi_{2k}(x) < D\,C_{2k}\,\frac{x}{\log^2 x}\,(1 + o(1)),
\]
where $C$ and $D$ are to be determined so that this can be done for arbitrarily large $x$. Define $A = K \setminus B$, so that $|A| = C|K|$. Then
\begin{align*}
\frac{2}{x^{\theta}} \sum_{2k \le x^{\theta}} \pi_{2k}(x)
&= \frac{2}{x^{\theta}} \sum_{2k \in B} \pi_{2k}(x) + \frac{2}{x^{\theta}} \sum_{2k \in A} \pi_{2k}(x) \\
&< D\,\frac{x}{\log^2 x}\,(1 + o(1))\,\frac{2}{x^{\theta}} \sum_{2k \in B} C_{2k} + \frac{2}{x^{\theta}} \sum_{2k \in A} \pi_{2k}(x) \\
&\le D\,\frac{x}{\log^2 x}\,(1 + o(1))\,\frac{2}{x^{\theta}} \sum_{2k \in K} C_{2k} + \frac{2}{x^{\theta}} \sum_{2k \in A} \pi_{2k}(x) \\
&= 2D\,\frac{x}{\log^2 x}\,(1 + o(1)) + \frac{2}{x^{\theta}} \sum_{2k \in A} \pi_{2k}(x) \tag{6.21}
\end{align*}
by Theorem 6.4. The second sum can be bounded using the sieve theory estimate Theorem 6.5, since the theorem holds uniformly for all $2k$. We have
\[
\frac{2}{x^{\theta}} \sum_{2k \in A} \pi_{2k}(x) \le 4\,\frac{x}{\log^2 x}\,(1 + o(1))\,\frac{2}{x^{\theta}} \sum_{2k \in A} C_{2k}. \tag{6.22}
\]
To obtain an upper bound for the sum on the right-hand side we note that

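\[
\frac{2}{x^{\theta}} \sum_{2k \in A} C_{2k} = \frac{2}{x^{\theta}} \sum_{2k \in K} C_{2k} - \frac{2}{x^{\theta}} \sum_{2k \in B} C_{2k} = 2(1 + o(1)) - \frac{2}{x^{\theta}} \sum_{2k \in B} C_{2k}.
\]
To use Theorem 6.3, we need to show that
\[
\frac{2}{x^{\theta}} \sum_{2k \in B} C_{2k} \ge \left(\frac32 + \delta\right)(1 + o(1)) \tag{6.23}
\]
for some $\delta > 0$. First we note that for all $2k$ we have $C_{2k} \ge 2\Pi_2 > 1.32$. If $2k$ is divisible by 3, then $C_{2k} \ge 2\Pi_2 \frac{3-1}{3-2} = 4\Pi_2$. We also note that if the density $C$ of the set $A$ is small, then at least proportion $\frac13 - C$ of the numbers in $K$ must be divisible by 3, and included in $B$. This can be seen, if we denote $K_3 := \{2k \in K : 3 \mid k\}$, from the trivial estimate
\[
|B \cap K_3| = |(K \setminus A) \cap K_3| \ge |K_3| - |A| \ge \frac13 |K| - 1 - |A| = \left(\frac13 - C\right)|K| - 1,
\]
since
\[
|K_3| = \left\lfloor \frac{|K|}{3} \right\rfloor \ge \frac13 |K| - 1.
\]
We also have
\[
|B \setminus K_3| = |(K \setminus A) \setminus K_3| \ge |K \setminus K_3| - |A| \ge \left(\frac23 - C\right)|K|,
\]
because
\[
|K \setminus K_3| = |K| - \left\lfloor \frac{|K|}{3} \right\rfloor \ge \frac23 |K|.
\]
Therefore,
\begin{align*}
\frac{2}{x^{\theta}} \sum_{2k \in B} C_{2k}
&= \frac{2}{x^{\theta}} \sum_{2k \in B \cap K_3} C_{2k} + \frac{2}{x^{\theta}} \sum_{2k \in B \setminus K_3} C_{2k} \\
&\ge \left(\frac13 - C\right) 4\Pi_2\,(1 + o(1)) + \left(\frac23 - C\right) 2\Pi_2\,(1 + o(1)) \\
&\ge \left(\frac43 - 3C\right) 2\Pi_2\,(1 + o(1)) \ge (1.76 - 6C\Pi_2)(1 + o(1)),
\end{align*}
since $|K| = \lfloor x^{\theta}/2 \rfloor$. Hence,
\[
\frac{2}{x^{\theta}} \sum_{2k \in A} C_{2k} = 2(1 + o(1)) - \frac{2}{x^{\theta}} \sum_{2k \in B} C_{2k} \le (0.24 + 6C\Pi_2)(1 + o(1)). \tag{6.24}
\]
Therefore, by (6.21) and (6.22)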

\begin{align*}
\frac{2}{x^{\theta}} \sum_{2k \le x^{\theta}} \pi_{2k}(x)
&< (2D + 4 \cdot 0.24 + 24C\Pi_2)\,\frac{x}{\log^2 x}\,(1 + o(1)) \\
&< (1 + 2D + 24C\Pi_2)\,\frac{x}{\log^2 x}\,(1 + o(1)).
\end{align*}
By Theorem 6.3 the sum on the left-hand side is $2\,\frac{x}{\log^2 x}\,(1 + o(1))$. Therefore, we must have
\[
1 + 2D + 24C\Pi_2 > 2,
\]
which implies that for any positive density $C < \frac{1}{24\Pi_2}$ the theorem holds for
\[
D = \frac12 - 12C\Pi_2.
\]
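The arithmetic behind the constants in Theorem 6.6 can be checked with a few lines of Python. This is a hedged sketch: the numerical value of $\Pi_2$ used below is an assumption (the text only uses $2\Pi_2 > 1.32$), and the function names are hypothetical.

```python
# Assumed numerical value of Pi_2 = prod_{p>2} p(p-2)/(p-1)^2.
PI2 = 0.6601618158


def max_proportion():
    """Upper limit 1/(24*Pi_2) for the proportion C in Theorem 6.6."""
    return 1 / (24 * PI2)


def lower_bound_constant(C):
    """D(C) = 1/2 - 12*C*Pi_2, which is positive exactly when C < 1/(24*Pi_2)."""
    if not 0 < C < max_proportion():
        raise ValueError("C must satisfy 0 < C < 1/(24*Pi_2)")
    return 0.5 - 12 * C * PI2


def upper_bound_coefficient(C, D):
    """Coefficient 2D + 4*0.24 + 24*C*Pi_2 obtained from (6.21), (6.22) and (6.24)."""
    return 2 * D + 4 * 0.24 + 24 * C * PI2
```

With $D = D(C)$ the coefficient equals $1.96$ for every admissible $C$, which is below the value 2 forced by Theorem 6.3; this is the contradiction that the proof relies on.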

Remark 6.5. In the above theorem we have not tried to obtain the sharpest possible limitations for the constants $C$ and $D$. To improve the result we could be more careful with estimating the sum (6.23). However, since the sieve theory estimate Theorem 6.5 is four times bigger than what is expected, and since for all of the constants we have $C_{2k} \ge 2\Pi_2$, the above argument clearly cannot yield anything better than $C < \frac{1}{4\Pi_2}$.

We also note that the subset of $2k \le x^{\theta}$ for which the lower bound holds may be very different for different values of $x$. From the above proof it is clear that the subset of $2k \le x^{\theta}$ is equidistributed in the sense that any interval $\{ax^{\theta} < 2k \le bx^{\theta}\}$ for fixed $b > a$ also contains a subset of proportion $C$ for which the lower bound of the above theorem holds.

To conclude this section we note that the ideas presented above are applicable also to other prime constellations than just prime pairs. For a sequence of even integers

(2h1,... 2hk) we can dene π (x) = p x : p, p + 2h , . . . , p + 2h all prime numbers . 2h1,...,2hk |{ ≤ 1 k }| We can then study the averages of over θ Following the π2h1,...,2hk (x) 2h1,..., 2hk x . lines of the proof of Theorem 6.3 we would obtain ≤ 2k 2k π (x) P (n) P (n + h ) P (n + h ) xkθ 2h1,...,2hk xkθ 1 k θ ··· θ ∼ θ θ ··· θ 2h1 x 2hk x x

6.2 Lower Bounds For Averages of π2k(x)

In this section we will prove two theorems, which bound fairly short averages of $\pi_{2k}(x)$ from below. First we need a lemma. The proof is a simple application of the Cauchy-Schwarz inequality and the Prime Number Theorem.

Lemma 6.1. Let $B$ be a set of positive integers such that $b = o\!\left(\frac{x}{\log^2 x}\right)$ uniformly for all $b \in B$. Then
\[
\frac{1}{|B|^2} \sum_{(a,b) \in B^2,\ a \ne b} \pi_{|a-b|}(x) \ge \left(\frac{x}{\log^2 x} - \frac{1}{|B|}\,\frac{x}{\log x}\right)(1 + o(1)). \tag{6.25}
\]

Proof. We have
\[
\sum_{a \in B} \sum_{n \le x} P(n+a) = \sum_{a \in B} \bigl(\pi(x+a) - \pi(a)\bigr) = |B|\,\pi(x)(1 + o(1)),
\]
since $a = o\!\left(\frac{x}{\log^2 x}\right) = o(\pi(x))$ for all $a \in B$. Therefore, by the Cauchy-Schwarz inequality
\[
|B|^2 \pi(x)^2 (1 + o(1)) = \Biggl(\sum_{n \le x} \sum_{a \in B} P(n+a)\Biggr)^{2} \le x \sum_{n \le x} \Biggl(\sum_{a \in B} P(n+a)\Biggr)^{2}.
\]
Using the Prime Number Theorem this implies
\[
\frac{1}{|B|^2} \sum_{n \le x} \sum_{(a,b) \in B^2} P(n+a)P(n+b) \ge \frac{x}{\log^2 x}\,(1 + o(1)). \tag{6.26}
\]
The left-hand side is
\[
\frac{1}{|B|^2} \sum_{(a,b) \in B^2} \Bigl(\pi_{|a-b|}(x + \min\{a,b\}) - \pi_{|a-b|}(\min\{a,b\})\Bigr),
\]
which is equal to
\[
\frac{1}{|B|^2} \sum_{(a,b) \in B^2} \pi_{|a-b|}(x) + o\!\left(\frac{x}{\log^2 x}\right),
\]
since $a, b = o\!\left(\frac{x}{\log^2 x}\right)$. The contribution from the pairs $(a,b)$ such that $a = b$ is
\[
\frac{1}{|B|}\,\pi(x) = \frac{1}{|B|}\,\frac{x}{\log x}\,(1 + o(1)).
\]
Moving this to the right-hand side of (6.26) yields
\[
\frac{1}{|B|^2} \sum_{(a,b) \in B^2,\ a \ne b} \pi_{|a-b|}(x) \ge \left(\frac{x}{\log^2 x} - \frac{1}{|B|}\,\frac{x}{\log x}\right)(1 + o(1)).
\]

Using the above lemma we can give a lower bound for a weighted average of $\pi_{2k}(x)$ over $k \le E \log x$.

Theorem 6.7. Let $E > \tfrac12$ be a constant. Then
\[
\frac{1}{E^2 \log^2 x} \sum_{1 \le k \le E \log x} \bigl(\lfloor E \log x \rfloor - k\bigr)\,\pi_{2k}(x) \ge \left(1 - \frac{1}{2E}\right)\frac{x}{\log^2 x}\,(1 + o(1)).
\]

Proof. By choosing $B = \{1, 2, 3, \ldots, 2\lfloor E \log x \rfloor\}$ the previous lemma yields
\[
\frac{1}{4E^2 \log^2 x} \sum_{(a,b) \in B^2,\ a \ne b} \pi_{|a-b|}(x) \ge \left(\frac{x}{\log^2 x} - \frac{x}{2E \log^2 x}\right)(1 + o(1)).
\]
If $a - b$ is not divisible by 2, then $\pi_{|a-b|}(x) \le 1$. The contribution from this to the sum on the left-hand side is clearly negligible. Every even number $2k \le 2\lfloor E \log x \rfloor$ appears $4(\lfloor E \log x \rfloor - k)$ times as the difference $|a - b|$, namely for the pairs
\[
(1, 1 + 2k),\ (2, 2 + 2k),\ \ldots,\ \bigl(2\lfloor E \log x \rfloor - 2k,\ 2\lfloor E \log x \rfloor\bigr),
\]
and the other way around. Hence,
\[
\frac{1}{E^2 \log^2 x} \sum_{1 \le k \le E \log x} \bigl(\lfloor E \log x \rfloor - k\bigr)\,\pi_{2k}(x) \ge \left(1 - \frac{1}{2E}\right)\frac{x}{\log^2 x}\,(1 + o(1)).
\]
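The counting step in the proof of Theorem 6.7 can be verified by brute force. The Python sketch below is an illustration only; `difference_counts` is a hypothetical helper, and the value `L = 7` stands in for $\lfloor E \log x \rfloor$.

```python
from collections import Counter


def difference_counts(size):
    """Count ordered pairs (a, b) with a != b in {1, ..., size}^2 by |a - b|."""
    counts = Counter()
    for a in range(1, size + 1):
        for b in range(1, size + 1):
            if a != b:
                counts[abs(a - b)] += 1
    return counts


# L plays the role of floor(E log x); the set B is {1, ..., 2L}.
L = 7
counts = difference_counts(2 * L)

# Each even difference 2k should occur 4(L - k) times among ordered pairs.
even_difference_check = all(counts[2 * k] == 4 * (L - k) for k in range(1, L + 1))
```

In general a difference $d$ occurs $2(2L - d)$ times among ordered pairs from $\{1, \ldots, 2L\}$, which gives $4(L - k)$ for $d = 2k$.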

Remark 6.6. By Theorem 6.3 we expect that the best possible lower bound would be of the form
\[
\frac{1}{E \log x} \sum_{1 \le k \le E \log x} \pi_{2k}(x) \ge 2\,\frac{x}{\log^2 x}\,(1 + o(1)).
\]
For large constants $E$ the above theorem is close to this, once we take into account the fact that the weights satisfy
\[
\frac{1}{E^2 \log^2 x} \sum_{1 \le k \le E \log x} \bigl(\lfloor E \log x \rfloor - k\bigr) = \frac{1}{E^2 \log^2 x}\,\frac{(E \log x - 1)\,E \log x}{2}\,(1 + o(1)) = \frac12 + o(1).
\]
As an immediate corollary we get the following more elegant but weaker version.

Corollary 6.1. Let $E > \tfrac12$. Then
\[
\frac{1}{E \log x} \sum_{1 \le k \le E \log x} \pi_{2k}(x) \ge \left(1 - \frac{1}{2E}\right)\frac{x}{\log^2 x}\,(1 + o(1)).
\]

Remark 6.7. The above corollary is a much more quantitative statement of the fact that
\[
\liminf_{k \to \infty} \frac{p_{k+1} - p_k}{\log p_k} \le 1.
\]
As mentioned before, this follows immediately from the Prime Number Theorem.

We can use Lemma 6.1 also to give a lower bound for the average of π2km(x) for any integer m.

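Theorem 6.8. Let $h = h(x) = o\!\left(\frac{x}{\log^2 x}\right)$ be such that $\log x = o(h)$. Let $m = o\!\left(\frac{h}{\log x}\right)$ be an integer. Then
\[
\frac{1}{M^2} \sum_{1 \le k \le M} 2(M - k)\,\pi_{2mk}(x) \ge \frac{x}{\log^2 x}\,(1 + o(1)),
\]
where $M = \left\lfloor \frac{h}{2m} \right\rfloor$.

Proof. Let $B := \{2m, 4m, \ldots, 2mM\}$. Then by Lemma 6.1
\[
\frac{1}{M^2} \sum_{(a,b) \in B^2,\ a \ne b} \pi_{|a-b|}(x) \ge \left(\frac{x}{\log^2 x} - \frac{1}{M}\,\frac{x}{\log x}\right)(1 + o(1)) = \frac{x}{\log^2 x}\,(1 + o(1)),
\]
since $\frac{1}{M} \ll \frac{m}{h} = o\!\left(\frac{1}{\log x}\right)$. Each number $2mk$ appears $2(M - k)$ times as the difference $|a - b|$, which proves the theorem.

The following more appealing version follows at once.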

Corollary 6.2. Let $h = h(x) = o\!\left(\frac{x}{\log^2 x}\right)$ be such that $\log x = o(h)$. Let $m = o\!\left(\frac{h}{\log x}\right)$ be a positive integer. Then
\[
\frac{1}{M} \sum_{1 \le k \le M} \pi_{2mk}(x) \ge \frac{x}{2 \log^2 x}\,(1 + o(1)),
\]
where $M = \left\lfloor \frac{h}{2m} \right\rfloor$.

6.3 Exponential Sums Over Prime Numbers

In this section we prove two results concerning exponential sums over prime numbers. Let us first recall Weyl's Equidistribution Theorem for sequences (see Grafakos, 2008, p. 209). We say that the sequence $(a_k)_{k=1}^{\infty}$ of numbers in $[0,1]$ is equidistributed modulo 1, if for all $[a,b] \subset [0,1]$ we have
\[
\lim_{k \to \infty} \frac{|\{a_1, \ldots, a_k\} \cap [a,b]|}{k} = b - a. \tag{6.27}
\]

Theorem 6.9. (Weyl's Equidistribution Theorem). Let $(a_k)_{k=1}^{\infty}$ be a sequence of numbers in $[0,1]$. The sequence is equidistributed modulo 1 if and only if for every $m \in \mathbb{Z} \setminus \{0\}$ we have
\[
\frac{1}{N} \sum_{k=1}^{N} e(m a_k) \to 0, \quad N \to \infty. \tag{6.28}
\]

The two theorems that we will prove in this section are in the spirit of Weyl's Equidistribution Theorem, but we will use the concept of equidistribution for sequences which take values on larger intervals than $[0,1]$. This is clearly analogous to the above after scaling by the length of the interval. The next theorem states that the primes on the interval $[n, n + x^{\theta}]$ are equidistributed (modulo $x^{\theta}$) for large $x$, and for any $x^{\theta} \ll n \ll x$. The reason this is true is that we may simply choose any $\varphi = \theta - \varepsilon > \frac{1+4c}{2+4c}$, divide the interval $[n, n + x^{\theta}]$ into smaller intervals of length $x^{\varphi}$, and use Theorem 6.1.
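Weyl's criterion (6.28) is easy to observe numerically. The Python sketch below is an illustration under assumptions of our own: the test sequence $a_k = \{k\sqrt{2}\}$ and the constant sequence are examples chosen here, not sequences from the text, and the helper names are hypothetical.

```python
import cmath
from math import sqrt


def weyl_average(sequence, m):
    """Absolute value of (1/N) * sum of e(m * a_k), with e(x) = exp(2*pi*i*x)."""
    n = len(sequence)
    return abs(sum(cmath.exp(2j * cmath.pi * m * a) for a in sequence)) / n


def proportion_in(sequence, lo, hi):
    """Fraction of the terms that fall in the interval [lo, hi]."""
    return sum(lo <= a <= hi for a in sequence) / len(sequence)


N = 10_000
# Fractional parts of k*sqrt(2): a standard example of an equidistributed sequence.
irrational_rotation = [(k * sqrt(2)) % 1.0 for k in range(1, N + 1)]
# A constant sequence is not equidistributed; its Weyl averages have modulus 1.
constant_sequence = [0.25] * N
```

For the first sequence the averages `weyl_average(irrational_rotation, m)` are small for $m = 1, 2, 3$ and `proportion_in(irrational_rotation, 0.2, 0.5)` is close to $0.3$, in line with (6.27) and (6.28), while the constant sequence fails the criterion.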

Theorem 6.10. Suppose that 1 c for some constant and let ζ( 2 + it) = (t ) c > 0, 1+4c < θ < 1. Let xθ n x. Then for anyO constant m = 0 we have 2+4c   6 log n mp (6.29) e 0, xθ xθ → n

2+4c . Write −

x 1 mp b c− mp e = e + (xφ). xθ xθ O n

mp mn  e = e + kmx− (1 + o(1)). xθ xθ Therefore, by Theorem 6.1,   

x 1 x 1 b c− b c− φ mp x  e (1 + o(1))e kmx− xθ  log n k=0 n+kxφ

We can now prove

Theorem 6.11. Suppose that $\zeta(\tfrac12 + it) = O(t^{c})$ for some constant $c > 0$, and let $\frac{1+4c}{2+4c} < \theta < 1$. Then
\[
\frac{\log^2 x}{x^{1+\theta}} \sum_{2k \le x^{\theta}} e\!\left(\frac{2km}{x^{\theta}}\right) \pi_{2k}(x) \to 0, \tag{6.30}
\]
as $x \to \infty$ for any constant $m \ne 0$.

Proof. Let $P(n)$ be the characteristic function of prime numbers. By the previous theorem and the Prime Number Theorem we have

2km km e π (x) = e P (n)P (n + k) + (xθ) xθ 2k xθ O 2k xθ   k xθ   n x X≤ X≤ X≤ km = P (n) e P (n + k) + (x2θ) xθ O xθ

xθ x1+θ = o P ( n) + (x2 θ) = o .  log n O log2 x xθ

Remark 6.8. By Theorem 6.3 we know that

\[
\sum_{2k \le x^{\theta}} \pi_{2k}(x) \sim \frac{x^{1+\theta}}{\log^2 x}, \quad x \to \infty.
\]
Therefore, Theorem 6.11 is just another way of saying that the numbers $2k \le x^{\theta}$, for which $\pi_{2k}(x)$ is of the order $x/\log^2 x$, are equidistributed in the averaged sense of Remark 6.2.

6.4 Another Approach to the Hardy-Littlewood Conjecture

In this section we attack the problem of prime pairs by means of discrete Fourier analysis. Fourier series on $\mathbb{R}/\mathbb{Z}$ have been applied successfully to problems on prime numbers. One example is Vinogradov's proof of the Three Primes Theorem, which states that every sufficiently large odd $n$ can be expressed as a sum of three prime numbers (Davenport, 1980). The calculations in this section are very similar to Vinogradov's proof. The main difference in our approach is that we consider the discrete Fourier transform in $\mathbb{Z}_n$.

Let $\mathbb{Z}_n$ be the additive group of integers modulo $n$. In the formulas below we identify $\mathbb{Z}_n$ with $\{0, 1, 2, \ldots, n-1\}$ as a set. Let $f : \mathbb{Z}_n \to \mathbb{C}$ be a function. We may then define the discrete Fourier transform (modulo $n$) by
\[
\hat f(\xi) := \sum_{x \in \mathbb{Z}_n} f(x)\,e_n(-\xi x), \tag{6.31}
\]
where $e_n(x) := e\!\left(\frac{x}{n}\right) = e^{2\pi i x/n}$. Throughout this section the Fourier transform is always modulo $n$. Many of the formulas of the usual Fourier transform hold also for the discrete one (see Theorem B.7 of the Appendix). In particular, if we define the convolution of the two functions by
\[
(f * g)(x) := \sum_{y \in \mathbb{Z}_n} f(y)\,g(x - y), \tag{6.32}
\]
then the discrete Fourier transform satisfies $\widehat{(f * g)}(\xi) = \hat f(\xi)\,\hat g(\xi)$. That is, the Fourier transform of a convolution, which is defined additively, is just the product of Fourier transforms. The difficulty of the problem of prime pairs arises from the fact that it is both additive and multiplicative in nature. Therefore, interpreting the function $\pi_{2k}(n) = \sum_{x \le n} P(x)P(x + 2k)$ as a convolution and taking the Fourier transform should make the problem easier.

In this section the characteristic function of primes $P(x)$ differs from the previous sections in that it is defined modulo $n$. That is, we set $P(x + ln) := P(x)$ for all $0 \le x \le n - 1$ and $l \in \mathbb{Z}$. Clearly for a fixed $k$, the error this results in $\pi_{2k}(n)$ is $O(1)$ as $n$ tends to infinity. We also set $P(2) := 0$ for convenience, so that $P(x)$ counts only the odd primes. Since the discrete Fourier transform also satisfies the inversion formula, we have

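Theorem 6.12. Let $n$ and $k$ be positive integers. Then
\[
\pi_{2k}(n) = \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} |\hat P(\xi)|^2\,e_n(-2k\xi). \tag{6.33}
\]
In particular,
\[
\pi_{2k}(n) = \frac{n}{\log^2 n} + \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n \setminus \{0\}} |\hat P(\xi)|^2\,e_n(-2k\xi) + o\!\left(\frac{n}{\log^2 n}\right), \quad n \to \infty. \tag{6.34}
\]

Proof. We have
\[
\pi_{2k}(n) = \sum_{x \in \mathbb{Z}_n} P(x)P(x + 2k) = (P * P_{-})(-2k),
\]
where $P_{-}(x) = P(-x)$. By Theorem B.7 we have $\widehat{(P * P_{-})}(\xi) = \hat P(\xi)\,\hat P_{-}(\xi) = |\hat P(\xi)|^2$, since
\[
\hat P_{-}(\xi) = \sum_{x \in \mathbb{Z}_n} P(x)\,e_n(\xi x) = \overline{\hat P(\xi)}.
\]
By Theorem B.7 the Fourier inversion formula holds, which yields
\[
\pi_{2k}(n) = (P * P_{-})(-2k) = \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} \widehat{(P * P_{-})}(\xi)\,e_n(-2k\xi) = \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} |\hat P(\xi)|^2\,e_n(-2k\xi).
\]
The second part of the Theorem follows from the observation that
\[
\hat P(0) = \sum_{x \in \mathbb{Z}_n} P(x) = \pi(n) \sim \frac{n}{\log n}
\]
by the Prime Number Theorem.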
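The identity (6.33) is exact for the periodic indicator $P$, so it can be checked directly for a small modulus. The Python sketch below is an illustration, assuming a naive $O(n^2)$ transform and the arbitrary choice $n = 210$; the function names are hypothetical.

```python
import cmath


def is_prime(m):
    """Trial-division primality test, adequate for small m."""
    if m < 2:
        return False
    d = 2
    while d * d <= m:
        if m % d == 0:
            return False
        d += 1
    return True


def odd_prime_indicator(n):
    """P(x) for x = 0, ..., n-1, with P(2) := 0 so that only odd primes count."""
    return [1 if x > 2 and is_prime(x) else 0 for x in range(n)]


def e_n(x, n):
    """e_n(x) = exp(2*pi*i*x/n); reducing x modulo n keeps the argument small."""
    return cmath.exp(2j * cmath.pi * (x % n) / n)


def dft(values):
    """Discrete Fourier transform (6.31), computed directly from the definition."""
    n = len(values)
    return [sum(v * e_n(-xi * x, n) for x, v in enumerate(values)) for xi in range(n)]


def pair_count_direct(P, k):
    """Sum over x in Z_n of P(x) * P(x + 2k), with indices taken modulo n."""
    n = len(P)
    return sum(P[x] * P[(x + 2 * k) % n] for x in range(n))


def pair_count_fourier(P, k):
    """Right-hand side of (6.33): (1/n) * sum of |P_hat(xi)|^2 * e_n(-2k*xi)."""
    n = len(P)
    P_hat = dft(P)
    total = sum(abs(c) ** 2 * e_n(-2 * k * xi, n) for xi, c in enumerate(P_hat))
    return total.real / n


P = odd_prime_indicator(210)
```

Comparing `pair_count_direct(P, k)` with `pair_count_fourier(P, k)` for a few values of `k` shows agreement up to floating-point error. Both count pairs modulo $n$, which differs from the non-periodic count by $O(1)$ for fixed $k$, as noted before the theorem.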

Theorem 6.12 is of great interest for the reason that the main term $\frac{1}{n}|\hat P(0)|^2 \sim \frac{n}{\log^2 n}$ is of the right order of magnitude. We would like to show that cancellations occur in the remaining exponential sum. However, by the Hardy-Littlewood Conjecture 6.1 we expect that the remaining sum is of the same order as the leading term. We immediately have some control over the sizes of $\hat P(\xi)$. The trivial estimate is of course
\[
\hat P(\xi) = \sum_{x \in \mathbb{Z}_n} P(x)\,e_n(-\xi x) = O\!\left(\frac{n}{\log n}\right).
\]
The discrete analog of the Plancherel formula gives us much stronger constraints. In particular, the next theorem implies that $\frac{1}{n}|\hat P(\xi)|^2$ can be of size $\frac{n}{C \log^2 n}$ for at most $C \log n$ of $\xi \in \mathbb{Z}_n$.

Theorem 6.13. Let $n$ be a positive integer. Then
\[
\frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} |\hat P(\xi)|^2 = \sum_{x \in \mathbb{Z}_n} P(x) \sim \frac{n}{\log n}, \quad n \to \infty. \tag{6.35}
\]

Proof. The first equality follows immediately from the second part of Theorem B.7, and the second is just the Prime Number Theorem.

Taking just the first term in the sum (6.33) as an approximation to $\pi_{2k}(n)$ suffers from the same problem as the initial heuristic guess that we gave in the beginning of this Chapter; the primeness of an integer $p + 2k$ is not independent of the number $p$ being a prime. That is, the set of prime numbers is expected to have some additive structure, which is described by the constants $C_{2k}$ in the Hardy-Littlewood Conjecture. For instance, if $p > 2$ is prime, it is odd. Then we already know that $p + 2k$ is odd, and therefore twice as likely to be a prime number as a randomly chosen number.

This argument gave the factor 2 in the constant $C_{2k}$. In the sum (6.33) this additive structure causes $\hat P(\xi)$ to be of the largest possible order $\frac{n}{\log n}$ for some of the $\xi \ne 0$, as can be seen from the discussion below.

Let us first look at how to recover the factor 2 of the Hardy-Littlewood Conjecture from Theorem 6.12. We may assume that $n$ is even, since if $n$ is odd we may consider $n + 1$. The error from this is clearly $O(1)$. Then we have for all $0 \le \xi < n/2$
\[
\hat P\!\left(\xi + \frac{n}{2}\right) = \sum_{x \in \mathbb{Z}_n} P(x)\,e_n\!\left(-\xi x - \frac{n}{2}x\right) = \sum_{x \in \mathbb{Z}_n} P(x)\,e_n(-\xi x)\,e\!\left(-\frac{x}{2}\right) = -\hat P(\xi),
\]
since $P(x) = 1$ only if $x$ is odd. Hence,
\begin{align*}
\pi_{2k}(n) &= \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} |\hat P(\xi)|^2\,e_n(-2k\xi) = \frac{2}{n} \sum_{0 \le \xi < n/2} |\hat P(\xi)|^2\,e_n(-2k\xi) \\
&= 2\,\frac{n}{\log^2 n} + \frac{2}{n} \sum_{0 < \xi < n/2} |\hat P(\xi)|^2\,e_n(-2k\xi) + o\!\left(\frac{n}{\log^2 n}\right).
\end{align*}
Note that the factor 2 was obtained using the fact that primes greater than 2 are odd, which corresponds to its heuristic justification.

More generally, let us write the coecient C2k of the Hardy-Littlewood Conjecture as follows: p(p 2) p 1 C := 2 − − 2k (p 1)2 p 2 p>2 2

\[
\pi(n, q, a) := \sum_{\substack{m \le n \\ m \equiv a \bmod q}} P(m), \tag{6.36}
\]
and assume that the greatest common divisor $(a, q) = 1$. Suppose that $q \le \log^{B} n$ for some constant $B$. Then there exists a constant $C_B$ depending only on $B$ such that
\[
\pi(n, q, a) = \frac{\mathrm{Li}(n)}{\varphi(q)} + O\!\left(n \exp\bigl\{-C_B \log^{1/2} n\bigr\}\right), \quad n \to \infty. \tag{6.37}
\]
Here $\varphi(q)$ is the Euler totient function, which gives the number of positive integers less than $q$ that are coprime to $q$. $\mathrm{Li}(n)$ is the offset logarithmic integral
\[
\mathrm{Li}(n) := \int_{2}^{n} \frac{dx}{\log x} \sim \frac{n}{\log n}, \quad n \to \infty. \tag{6.38}
\]
The statement on the asymptotics of the logarithmic integral function follows easily by partial integration:
\[
\int_{2}^{n} \frac{dx}{\log x} = \frac{n}{\log n} - \frac{2}{\log 2} + \int_{2}^{n} \frac{dx}{\log^2 x} \sim \frac{n}{\log n},
\]
since
\[
\int_{2}^{n} \frac{dx}{\log^2 x} = \int_{\sqrt{n}}^{n} \frac{dx}{\log^2 x} + O(n^{1/2}) = O\!\left(\frac{n}{\log^2 n}\right).
\]
It should be noted that the next theorem does not give any estimate for the remaining sum. It only produces the factor $C_{2k}$ in a natural way from the exponential sum (6.33). For the theorem we set

Q = Qz := p, p

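Theorem 6.15. Let $k$ be a fixed positive integer, let $Q = Q_z$ be as above, and let $Q \mid n$. Then
\[
\pi_{2k}(n) = C_{2k}\,\frac{n}{\log^2 n} + \frac{1}{n} \sum_{0 < \xi < n/Q} e_n(-2k\xi) \sum_{r=0}^{Q-1} \left|\hat P\!\left(\xi + \frac{rn}{Q}\right)\right|^2 e\!\left(-\frac{2kr}{Q}\right) + o\!\left(\frac{n}{\log^2 n}\right),
\]
as $n \to \infty$.

Proof. Since $Q \mid n$, we have by Theorem 6.12
\begin{align*}
\pi_{2k}(n) &= \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} |\hat P(\xi)|^2\,e_n(-2k\xi) = \frac{1}{n} \sum_{r=0}^{Q-1} \sum_{0 \le \xi < n/Q} \left|\hat P\!\left(\xi + \frac{rn}{Q}\right)\right|^2 e_n\!\left(-2k\xi - \frac{2krn}{Q}\right) \\
&= \frac{1}{n} \sum_{0 \le \xi < n/Q} e_n(-2k\xi) \sum_{r=0}^{Q-1} \left|\hat P\!\left(\xi + \frac{rn}{Q}\right)\right|^2 e\!\left(-\frac{2kr}{Q}\right).
\end{align*}
To prove the theorem we need to show that for $\xi = 0$ we have
\[
\frac{1}{n} \sum_{r=0}^{Q-1} \left|\hat P\!\left(\frac{rn}{Q}\right)\right|^2 e\!\left(-\frac{2kr}{Q}\right) \sim C_{2k}\,\frac{n}{\log^2 n}, \quad n \to \infty.
\]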

84 First we note that Pˆ(0) = Pˆ(n), so that we may write the above sum as running from r = 1 to r = Q. We have rn xr Pˆ = P (x)e − Q Q   x Zn   X∈ ar = P (x) e −   Q 1 a Q x n   ≤X≤ x aX≤mod Q   ≡  ar = P (x) e − + (Q).   Q O 1 a 1, then there is at most one prime p a mod Q. For (a, Q) = 1 we have by the Siegel-Walsz Theorem ≡ Li(n) 1 P (x) = + n exp C log 2 n . ϕ(Q) O − B x n x aX≤mod Q  n o ≡ Using this we obtain

rn Li(n) ar 1 Pˆ = e − + Q n exp C log 2 n , Q ϕ(Q) Q O − B   1 a

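\begin{align*}
\frac{1}{n} \left|\hat P\!\left(\frac{rn}{Q}\right)\right|^2
&= \mu\!\left(\frac{Q}{(r,Q)}\right)^{2} \frac{1}{\varphi\!\left(\frac{Q}{(r,Q)}\right)^{2}}\,\frac{\mathrm{Li}(n)^2}{n} + O\!\left(\bigl(Q\,\mathrm{Li}(n) + Q^2 n\bigr) \exp\bigl\{-C_B \log^{1/2} n\bigr\}\right) \\
&= \mu\!\left(\frac{Q}{(r,Q)}\right)^{2} \frac{1}{\varphi\!\left(\frac{Q}{(r,Q)}\right)^{2}}\,\frac{\mathrm{Li}(n)^2}{n} + O\!\left(Q^2 n \exp\bigl\{-C_B \log^{1/2} n\bigr\}\right).
\end{align*}
Therefore,
\[
\frac{1}{n} \sum_{r=1}^{Q} \left|\hat P\!\left(\frac{rn}{Q}\right)\right|^2 e\!\left(-\frac{2kr}{Q}\right) \sim \frac{\mathrm{Li}(n)^2}{n} \sum_{r=1}^{Q} \mu\!\left(\frac{Q}{(r,Q)}\right)^{2} \frac{1}{\varphi\!\left(\frac{Q}{(r,Q)}\right)^{2}}\,e\!\left(-\frac{2kr}{Q}\right),
\]
as $n \to \infty$, because the summation over the errors is bounded by
\begin{align*}
Q^3 n \exp\bigl\{-C_B \log^{1/2} n\bigr\} &\ll n \exp\bigl\{3 \log Q - C_B \log^{1/2} n\bigr\} \\
&\ll n \exp\bigl\{3B \log\log n - C_B \log^{1/2} n\bigr\} \\
&\ll n \exp\bigl\{-C' \log^{1/2} n\bigr\}.
\end{align*}
Note that $\frac{1}{n}\mathrm{Li}(n)^2 \sim \frac{n}{\log^2 n}$. Hence, to complete the proof, we just need to show that the above sum tends to $C_{2k}$, as $n \to \infty$. To see this, we write the sum as running over the divisors of $Q$. That is, we collect together all the terms where $\frac{Q}{(r,Q)} = d$ and sum over $d \mid Q$. The equation $\frac{Q}{(r,Q)} = d$ holds for $r \le Q$ which are of the form $r = \frac{Q}{d}\,b$, where $(b, d) = 1$ and $b \le d$. Therefore,
\[
\sum_{r=1}^{Q} \mu\!\left(\frac{Q}{(r,Q)}\right)^{2} \frac{1}{\varphi\!\left(\frac{Q}{(r,Q)}\right)^{2}}\,e\!\left(-\frac{2kr}{Q}\right) = \sum_{d \mid Q} \frac{|\mu(d)|}{\varphi(d)^2} \sum_{\substack{b \le d \\ (b,d) = 1}} e\!\left(-\frac{2kb}{d}\right)
\]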

  1 = µ(d) c (2k) | | (p 1)2 d d Q p d X| Y| − c (2k) = 1 + p (p 1)2 p Q   Y| − p p(p 2) = − p 1 (p 1)2 p (2k,Q) p Q | − | − Y pY- 2k p p(p 2) − = C , → p 1 (p 1)2 2k p 2k p - 2k Y| − Y − as This is because tends to innity as tends to innity and n . Q = Qz = p

\[
c_p(2k) = \begin{cases} p - 1, & p \mid 2k, \\ -1, & p \nmid 2k. \end{cases}
\]
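The values of the sums $c_p(2k)$ used above can be confirmed by direct computation. The Python sketch below is an illustration: it evaluates $c_d(m) = \sum_{b \le d,\ (b,d)=1} e(-mb/d)$ from the definition, and the function names are hypothetical.

```python
import cmath
from fractions import Fraction
from math import gcd


def ramanujan_sum(d, m):
    """c_d(m): sum of e(-m*b/d) over 1 <= b <= d with gcd(b, d) = 1."""
    total = sum(
        cmath.exp(-2j * cmath.pi * ((m * b) % d) / d)
        for b in range(1, d + 1)
        if gcd(b, d) == 1
    )
    # The sum is a real integer; rounding removes floating-point noise.
    return round(total.real)


def local_factor(p, k):
    """The Euler factor 1 + c_p(2k)/(p-1)^2 as an exact fraction."""
    return 1 + Fraction(ramanujan_sum(p, 2 * k), (p - 1) ** 2)
```

For a prime $p$ the factor `local_factor(p, k)` equals $p/(p-1)$ when $p \mid 2k$ and $p(p-2)/(p-1)^2$ otherwise, which are the two kinds of factors appearing in the product for $C_{2k}$.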

Remark 6.9. For readers who are aware of the proof of Vinogradov's Three Primes Theorem, the technique of the above proof can be thought of as a discrete version of the major arc/minor arc method (see Davenport, 1980). The main difference here is again that we are considering the Fourier transform in $\mathbb{Z}_n$. This has the advantage of avoiding the construction of the major arcs and the minor arcs. In addition, the computation of the main term, which corresponds to the contribution of the major arc, is more direct.

It should be noted that all the calculations done in this section for $\pi_{2k}(n)$ apply also to the function
\[
\psi_{2k}(n) := \sum_{x \in \mathbb{Z}_n} \Lambda(x)\Lambda(x + 2k) = (\Lambda * \Lambda_{-})(-2k).
\]
It is sometimes more convenient to use this function because of the nice properties of the von Mangoldt function. A similar argument as in Theorem 6.1 shows that the Hardy-Littlewood Conjecture is equivalent to $\psi_{2k}(n) \sim C_{2k}\,n$. The Siegel-Walfisz Theorem holds also for the function
\[
\psi(n, q, a) := \sum_{\substack{m \le n \\ m \equiv a \bmod q}} \Lambda(m), \tag{6.40}
\]
in the form
\[
\psi(n, q, a) = \frac{n}{\varphi(q)} + O\!\left(n \exp\bigl\{-C_B \log^{1/2} n\bigr\}\right), \quad n \to \infty, \tag{6.41}
\]
for $q \le \log^{B} n$ and $(a, q) = 1$ (Davenport, 1980). Therefore, the same calculations as for the function $\pi_{2k}(n)$ give us the following theorem.

Theorem 6.16. Let $k$ be fixed, and let $Q$ be as in Theorem 6.15. Then
\[
\psi_{2k}(n) = \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} |\hat\Lambda(\xi)|^2\,e_n(-2k\xi), \tag{6.42}
\]
and
\[
\psi_{2k}(n) = C_{2k}\,n + \frac{1}{n} \sum_{0 < \xi < n/Q} e_n(-2k\xi) \sum_{r=0}^{Q-1} \left|\hat\Lambda\!\left(\xi + \frac{rn}{Q}\right)\right|^2 e\!\left(-\frac{2kr}{Q}\right) + o(n), \tag{6.43}
\]
as $n \to \infty$.


7 Conclusions

The main goal of this thesis was to study the growth of $\zeta(s)$ as well as exponential sums, and their implications on the distribution of the prime numbers. We have shown that if $\zeta(\tfrac12 + it) = O(t^{c})$, then for every $\theta > \frac{1+4c}{2+4c}$ we have $p_{n+1} - p_n < p_n^{\theta}$ for large enough $n$. In the proof of this fact we have used Vinogradov's estimates from the fourth Chapter for the growth of the zeta function near the line $\sigma = 1$. From the equation (5.41) it is clear that for the argument to work, it was necessary to establish that $\zeta(s) \ne 0$ and $\zeta'(s)/\zeta(s) \ll \log^{B} t$ in a region
\[
\sigma \ge 1 - \frac{\omega(t) \log\log t}{\log t},
\]
where $\omega(t) \to \infty$ as $t \to \infty$. From the van der Corput estimate $\zeta(\tfrac12 + it) = O(t^{1/6} \log t)$ proven in the third chapter we obtained using Theorem 5.6 that for every $\varepsilon > 0$ we have $p_{n+1} - p_n < p_n^{5/8 + \varepsilon}$ for large enough $n$.

We have also used our theorems on the connection between the growth of $\zeta(s)$ and the distribution of primes to obtain results on the number of prime pairs $p, p + 2k$. More precisely, we have shown that the average of the functions $\pi_{2k}(x)$ over $2k \le x^{\theta}$ is what we would expect by the Hardy-Littlewood Conjecture 6.1. We have also shown that for a positive proportion of the numbers $2k \le x^{\theta}$, the number $\pi_{2k}(x)$ satisfies a lower bound which is a constant multiple of what the Hardy-Littlewood Conjecture predicts. Using a combinatorial argument, we were also able to give a lower bound for the average of $\pi_{2k}(x)$ over $k \le E \log x$ for any $E > \tfrac12$. The lower bound obtained is worse only by a constant than what is expected to be the best possible bound. By applying the discrete Fourier transform in $\mathbb{Z}_n$ we were able to write the function $\pi_{2k}(n)$ as an exponential sum. We have shown how to extract from this sum the term $C_{2k}\,\frac{n}{\log^2 n}$, which the Hardy-Littlewood conjecture predicts to be asymptotic to $\pi_{2k}(n)$. The problem of bounding the remaining sum is the last (but probably much bigger) hurdle.

The true order of $\zeta(s)$ on the line $\sigma = \tfrac12$ remains unsolved. The best current result is $\mu(\tfrac12) \le \frac{32}{205}$ by Huxley (2005), which is still larger than $\tfrac17$. The famous Lindelöf hypothesis states that $\zeta(\tfrac12 + it) = O(t^{c})$ for every $c > 0$. For the prime gaps this would imply by Theorem 5.6 that we have
\[
p_{n+1} - p_n < p_n^{1/2 + \varepsilon} \tag{7.1}
\]
for any given $\varepsilon > 0$ for large enough $n$. It is also known that the Riemann hypothesis implies that $p_{n+1} - p_n < p_n^{1/2} \log n$ for large enough $n$, which is only slightly stronger (Ivic, 2003). Therefore Theorem 5.6 seems to have the right limit as $c \to 0$.

However, one can ask if the dependence $\theta > \frac{1+4c}{2+4c}$ in Theorem 5.6 is sharp. For example, suppose that we could obtain a mean value result of the type

T (2+2c)(1 σ) B (7.2) Φ (σ + it) dt T − log T,T , | T |  → ∞ Z0 89 which corresponds to the Theorem 5.5, but with in the place of 2 Then φT0 (s) φT0 (s) . going through the argument of the proof of the Theorem| | 5.6, with the substitution| |

ZΦT Z = ζ0M −1 21 s − T − − instead of

2 ZΦT Z = + ζ0M (ζM 2), (1 21 s)2 T T − − − we would clearly obtain the result of Theorem 5.6 for all 1+2c θ > 2+2c . Looking back to the proof of the Ingham's Theorem 5.5, the estimate (7.2) would follow, if one could show that

\[
\int_{0}^{T} |\Phi_T(1 + it)|\,dt \ll \log^{C} T, \quad T \to \infty. \tag{7.3}
\]
This corresponds to the stronger statement that for $t \le T$
\[
\frac{1}{\zeta(1 + it)} = M_T(1 + it) + O\bigl(t^{-1}\bigr), \quad t, T \to \infty, \tag{7.4}
\]
which at first glance does not seem too unlikely, because this is true for $\zeta(s)$ and its partial sums by Theorem 3.3. However, even if one of the approximations (7.3) or (7.4) is valid, it is not likely that either can be proven easily. This is because, by Titchmarsh (1951, pp. 314-315), the Riemann hypothesis is equivalent to the statement that the series
\[
\sum_{n=1}^{\infty} \mu(n)\,n^{-s}
\]
converges for all $\sigma > \tfrac12$, and the sum is $1/\zeta(s)$. By the Partial summation estimate B.1 this would imply that for every $\delta > 0$
\[
\frac{1}{\zeta(1 + it)} = M_T(1 + it) + \sum_{n > T} \mu(n)\,n^{-s} = M_T(1 + it) + O\bigl(T^{-1/2 + \delta}\bigr),
\]
which would imply that for every $\varepsilon > 0$

T T 1 + 2  Φ (1 + it) dt T 2 , Φ (1 + it) dt T . | T |  | T |  Z0 Z0 In this light the Ingham's Theorem 5.5 is a surprisingly strong result, and the proof of Theorem 5.6 seems optimal when we use the second power of φT0 (s) . The best current result for the prime gaps is | |

\[
p_{n+1} - p_n < p_n^{\frac12 + \frac{1}{40} + \varepsilon}
\]
by Baker, Harman and Pintz (2001). The bounds currently obtainable even assuming the Riemann Hypothesis fall well short of Cramér's conjecture on prime gaps, which states that
\[
p_{n+1} - p_n \ll \log^2 p_n.
\]
Cramér's conjecture is supported by a heuristic argument, which relies on a probabilistic model of the prime numbers (Cramér, 1936).


Appendices

A Notations

$j, k, l, m, n$    Integers.

$p, p_n$    Prime number and the $n$th prime number.

$s = \sigma + it,\ z = x + iy$    Complex numbers.

$\mathrm{Re}\,z,\ \mathrm{Im}\,z$    Real and imaginary part of $z$.

$\bar z$    Complex conjugate.

$e(x)$    1-periodic exponential function. $e(x) = e^{2\pi i x}$.

$e_n(x)$    $n$-periodic exponential function. $e_n(x) = e^{2\pi i x/n}$.

$\hat f(\xi)$    Fourier transform of $f(x)$. $\hat f(\xi) = \int_{-\infty}^{\infty} f(x)\,e(-\xi x)\,dx$.

$B(z, r)$    Open disk centered at $z$ with radius $r$.

$\bar A$    Closure of the set $A$.

$\partial A$    Boundary of the set $A$. In $\int_{\partial B(z,r)}$ the orientation is positive.

$d(n)$    Divisor function. $d(n) = \sum_{d \mid n} 1$.

$\Lambda(n)$    Von Mangoldt's function. $\Lambda(n) = \log p$ if $n = p^k$, and $\Lambda(n) = 0$ otherwise.

$\psi(x)$    Chebyshev's function. $\psi(x) = \sum_{n \le x} \Lambda(n)$.

$\mu(n)$    Möbius function. $\mu(n) = 1$ if $n = 1$; $\mu(n) = 0$ if $n$ is not squarefree; $\mu(n) = (-1)^k$ if $n$ has $k$ distinct prime factors.

$\varphi(n)$    Euler totient function. Number of positive integers less than $n$ and coprime to $n$.

$(a, b)$    Greatest common divisor of $a$ and $b$.

$\lfloor \cdot \rfloor$    Floor function. $\lfloor x \rfloor$ is the greatest integer less than or equal to $x$.

$\|\cdot\|$    $\|x\|$ is the distance of $x$ from the set of integers.

$O,\ O_{\varepsilon},\ \ll,\ \ll_{\varepsilon}$    Landau and Vinogradov notations. For two functions $f : \mathbb{C} \to \mathbb{C}$, $g : \mathbb{C} \to \mathbb{R}_{+}$ we write $f(s) = O(g(s))$ or equivalently $f(s) \ll g(s)$ as $s \to \infty$ in an unbounded connected set $A \subset \mathbb{C}$, if there exists a constant $C > 0$ such that $|f(s)| \le C g(s)$ for all $s \in A$. The subscript $\varepsilon$ indicates that the constant may depend on a parameter $\varepsilon$.

$\asymp$    We write $f(s) \asymp g(s)$ if $g(s) \ll f(s) \ll g(s)$.

$o$    Small $o$ notation. For two functions $f : \mathbb{C} \to \mathbb{C}$, $g : \mathbb{C} \to \mathbb{R}_{+}$ we write $f(s) = o(g(s))$, if $|f(s)|/g(s) \to 0$ as $s \to \infty$ in an unbounded connected set $A \subset \mathbb{C}$.

$\sim$    Asymptotic. We write $f(x) \sim g(x)$ as $x \to \infty$ if $f(x)/g(x) \to 1$ as $x \to \infty$.

94 B Some Useful Theorems

Theorem B.1. (Partial Summation Estimate). Let $b_1, b_2, \ldots, b_n$ be a decreasing sequence of non-negative real numbers and let $a_1, \ldots, a_n$ be complex numbers. If
\[
|A_m| := \left|\sum_{k=1}^{m} a_k\right| \le M, \quad m = 1, 2, \ldots, n, \tag{B.1}
\]
then
\[
\left|\sum_{k=1}^{n} a_k b_k\right| \le M b_1. \tag{B.2}
\]

Proof. Let us write
\[
\sum_{k=1}^{n} a_k b_k = b_1 A_1 + \sum_{k=2}^{n} b_k (A_k - A_{k-1}) = \sum_{k=1}^{n-1} A_k (b_k - b_{k+1}) + A_n b_n.
\]
Now from the triangle inequality we obtain
\[
\left|\sum_{k=1}^{n} a_k b_k\right| \le M \left(\sum_{k=1}^{n-1} (b_k - b_{k+1}) + b_n\right) = M b_1.
\]
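The estimate (B.2) can be tested on concrete data. The Python sketch below is an illustration; the helper name `partial_summation_sides` and the sample sequences are assumptions made here.

```python
import random


def partial_summation_sides(a, b):
    """Return (|sum a_k b_k|, M * b_1), where M is the largest |A_m|.

    The list b is assumed to be decreasing and non-negative, as in Theorem B.1.
    """
    partial_sum = 0
    largest_partial_sum = 0.0
    for a_k in a:
        partial_sum += a_k
        largest_partial_sum = max(largest_partial_sum, abs(partial_sum))
    weighted_sum = abs(sum(a_k * b_k for a_k, b_k in zip(a, b)))
    return weighted_sum, largest_partial_sum * b[0]


# Example 1: a_k = (-1)^k and b_k = 1/k, so M = 1 and b_1 = 1.
alternating = [(-1) ** k for k in range(1, 101)]
harmonic = [1 / k for k in range(1, 101)]
alternating_sides = partial_summation_sides(alternating, harmonic)

# Example 2: random complex a_k with a random decreasing non-negative b_k.
random.seed(1)
random_a = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(200)]
random_b = sorted((random.random() for _ in range(200)), reverse=True)
random_sides = partial_summation_sides(random_a, random_b)
```

In both examples the first entry of the returned pair should not exceed the second, in agreement with (B.2).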

Theorem B.2. (Euler Summation Formula). Let $f : [a, b] \to \mathbb{C}$ be continuously differentiable, where $a$ and $b$ are integers. Then
\[
\sum_{a < n \le b} f(n) = \int_{a}^{b} f(x)\,dx + \int_{a}^{b} (x - \lfloor x \rfloor)\,f'(x)\,dx. \tag{B.3}
\]

b b b f(n) = f(x) d x = f(x) dx + f(x) d( x x). b c b c − a

Theorem B.3. (Poisson Summation Formula). Assume that $f(x)$ is continuous and real and that for some $\varepsilon > 0$, $c > 0$ we have
\[
|f(x)| \le c\,(1 + |x|)^{-(1+\varepsilon)}, \tag{B.4}
\]
\[
|\hat f(\xi)| \le c\,(1 + |\xi|)^{-(1+\varepsilon)}, \tag{B.5}
\]
where $\hat f$ is the Fourier transform of $f$
\[
\hat f(\xi) := \int_{-\infty}^{\infty} f(x)\,e(-\xi x)\,dx. \tag{B.6}
\]
Then
\[
\sum_{n=-\infty}^{\infty} f(n) = \sum_{n=-\infty}^{\infty} \hat f(n). \tag{B.7}
\]

Proof. Define for $x \in \mathbb{R}$
\[
g(x) := \sum_{n=-\infty}^{\infty} f(x + n). \tag{B.8}
\]
By our assumption on $f(x)$ the series clearly converges uniformly and defines a continuous 1-periodic function. We compute the Fourier coefficients of $g(x)$ as
\[
\hat g(n) = \int_{0}^{1} g(x)\,e^{-2\pi i n x}\,dx = \sum_{k=-\infty}^{\infty} \int_{0}^{1} f(x + k)\,e(-n x)\,dx = \hat f(n).
\]
It follows from our assumption on $\hat f(\xi)$ that $\sum_{n=-\infty}^{\infty} |\hat g(n)| < \infty$. Hence by basic theory on Fourier series (Saksman, 2011) we have that the Fourier series of $g$ converges at $x = 0$, which gives us
\[
\sum_{n=-\infty}^{\infty} f(n) = g(0) = \sum_{n=-\infty}^{\infty} \hat g(n) = \sum_{n=-\infty}^{\infty} \hat f(n).
\]
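A numerical check of (B.7) is possible for functions whose Fourier transform is known in closed form. The Python sketch below is an illustration that assumes the Gaussian pair $f(x) = e^{-\pi a x^2}$, $\hat f(\xi) = a^{-1/2} e^{-\pi \xi^2 / a}$ with $a > 0$; this example and the helper names are not taken from the text.

```python
from math import exp, pi, sqrt


def gaussian(a):
    """f(x) = exp(-pi * a * x^2), which satisfies the decay condition (B.4)."""
    return lambda x: exp(-pi * a * x * x)


def gaussian_transform(a):
    """Fourier transform of the Gaussian above: a^(-1/2) * exp(-pi * xi^2 / a)."""
    return lambda xi: exp(-pi * xi * xi / a) / sqrt(a)


def symmetric_sum(function, cutoff=50):
    """Sum of function(n) over the integers -cutoff <= n <= cutoff."""
    return sum(function(n) for n in range(-cutoff, cutoff + 1))


def poisson_gap(a):
    """Difference between the two sides of (B.7) for the Gaussian with parameter a."""
    return abs(symmetric_sum(gaussian(a)) - symmetric_sum(gaussian_transform(a)))
```

The terms decay so fast that truncating both sums at $|n| \le 50$ introduces no visible error, and `poisson_gap(a)` is at the level of rounding error for moderate values of `a`.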

We will also need the following version of the Poisson summation formula. The proof is similar and can be found in Saksman (2011).

Theorem B.4. Suppose that $f$ is continuous and piecewise continuously differentiable on the finite interval $[a, b]$, where $a$ and $b$ are not integers. Suppose also that $f(x) = 0$ outside $[a, b]$. Then
\[
\sum_{a \le n \le b} f(n) = \lim_{N \to \infty} \sum_{|n| \le N} \hat f(n). \tag{B.9}
\]

The next theorem states that we can estimate an analytic function by its real part (Titchmarsh, 1976).

Theorem B.5. (Borel-Carathéodory). Let f be non-constant and analytic in B(w, R) and let f(w) = 0 and Ref M for some constant M > 0. Then for any given 0 < r < R we have ≤ r (B.10) f(z) 2M | | ≤ R r − for all z B(w, r). ∈ 96 Proof. We may assume that w = 0 without loss of generality. Dene f(z) g(z) = 2M f(z). − Then g is also analytic in B(0,R), since the real part of the denominator does not vanish. We also have g(0) = 0. Writing f = u + iv yields us

$|g(z)|^2 = \frac{u^2 + v^2}{(2M - u)^2 + v^2} \le 1.$

Hence by Schwarz's lemma we have

$|g(z)| \le \frac{r}{R}$

for all $z \in B(0, r)$. Therefore

$|f(z)| = 2M \left| \frac{g(z)}{1 + g(z)} \right| \le 2M \frac{r}{R - r}.$
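To see the bound (B.10) in action, one can test it on $f(z) = -z/(1 - z)$, which is analytic in B(0, 1), satisfies f(0) = 0 and has $\operatorname{Re} f < 1/2$ there, so that R = 1 and M = 1/2. The choice of this test function, the sampling of the circle |z| = r and the helper names are assumptions made for the illustration.

```python
import cmath
import math

def f(z):
    """Test function with f(0) = 0 and Re f < 1/2 on the unit disc."""
    return -z / (1 - z)

def max_modulus_on_circle(func, r, samples=3600):
    """Largest sampled |func(z)| on |z| = r (the sample k = 0 is z = r)."""
    return max(
        abs(func(r * cmath.exp(2j * math.pi * k / samples)))
        for k in range(samples)
    )

def borel_caratheodory_bound(M, r, R):
    """Right-hand side of (B.10)."""
    return 2 * M * r / (R - r)

M, R = 0.5, 1.0
for r in (0.1, 0.5, 0.9):
    observed = max_modulus_on_circle(f, r)
    bound = borel_caratheodory_bound(M, r, R)
    print(r, observed, bound)  # observed should not exceed bound
```

For this particular f the maximum on |z| = r is attained at z = r and equals r/(1 - r), so the example also suggests that the constant in (B.10) cannot be improved in general.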

We will also need the following elementary theorem.

Theorem B.6. Let $d(n) := \sum_{d \mid n} 1$. Then

(B.11)  $\sum_{n \le x} d(n)^2 \ll x \log^3 x, \quad x \to \infty.$

Proof. We have

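$\sum_{n \le x} d(n) = \sum_{n \le x} \sum_{km = n} 1 = \sum_{km \le x} 1 = \sum_{k \le x} \sum_{m \le x/k} 1 = \sum_{k \le x} \left\lfloor \frac{x}{k} \right\rfloor = \sum_{k \le x} \frac{x}{k} + O(x) = x \log x + O(x).$

Clearly for every $k, m \ge 1$

$d(k)d(m) = \sum_{d \mid k} 1 \sum_{d \mid m} 1 \ge \sum_{d \mid km} 1 = d(km).$

Hence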

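$\sum_{n \le x} d(n)^2 = \sum_{n \le x} \sum_{km = n} d(km) = \sum_{km \le x} d(km) = \sum_{k \le x} \sum_{m \le x/k} d(km) \le \sum_{k \le x} d(k) \sum_{m \le x/k} d(m) \ll \sum_{k \le x} d(k)\, \frac{x}{k} \log \frac{x}{k} \ll x \log x \sum_{k \le x} \frac{d(k)}{k}.$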

In the last expression

$\sum_{k \le x} \frac{d(k)}{k} = \sum_{nm \le x} \frac{1}{nm} \le \sum_{n \le x} \frac{1}{n} \sum_{m \le x} \frac{1}{m} \ll \log^2 x,$

which gives us the theorem.

The next theorem contains some basic properties of the discrete Fourier transform (Tao & Vu, 2007).

Theorem B.7. (Fourier Transform in $\mathbb{Z}_n$). Let n be a positive integer and let $f : \mathbb{Z}_n \to \mathbb{C}$ be a function. Define the discrete Fourier transform (mod n) of f by

$\hat{f}(\xi) := \sum_{x \in \mathbb{Z}_n} f(x)\, e_n(-\xi x),$

where $e_n(x) := e(x/n) = e^{2\pi i x / n}$. Then for all functions $f, g : \mathbb{Z}_n \to \mathbb{C}$ we have

(B.12)  $f(x) = \frac{1}{n} \sum_{\xi \in \mathbb{Z}_n} \hat{f}(\xi)\, e_n(\xi x),$

(B.13)  $\sum_{\xi \in \mathbb{Z}_n} \hat{f}(\xi)\, \overline{\hat{g}(\xi)} = n \sum_{x \in \mathbb{Z}_n} f(x)\, \overline{g(x)},$

(B.14)  $\widehat{(f * g)}(\xi) = \hat{f}(\xi)\, \hat{g}(\xi),$

where the convolution (f * g) is defined by

$(f * g)(x) := \sum_{y \in \mathbb{Z}_n} f(y)\, g(x - y).$

Proof. The proofs of all three claims are direct computations, based on an orthogonality relation. We have

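$\sum_{\xi \in \mathbb{Z}_n} \hat{f}(\xi)\, e_n(\xi x) = \sum_{\xi \in \mathbb{Z}_n} \sum_{y \in \mathbb{Z}_n} f(y)\, e_n(\xi(x - y)) = \sum_{y \in \mathbb{Z}_n} f(y) \sum_{\xi \in \mathbb{Z}_n} e_n(\xi(x - y)) = n f(x),$

since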

$\sum_{\xi \in \mathbb{Z}_n} e_n(\xi x) = \begin{cases} \dfrac{1 - e_n(xn)}{1 - e_n(x)} = 0, & x \ne 0, \\ n, & x = 0. \end{cases}$

This proves the first formula. For the discrete version of the Plancherel formula we have

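$\sum_{\xi \in \mathbb{Z}_n} \hat{f}(\xi)\, \overline{\hat{g}(\xi)} = \sum_{x \in \mathbb{Z}_n} \sum_{y \in \mathbb{Z}_n} f(x)\, \overline{g(y)} \sum_{\xi \in \mathbb{Z}_n} e_n(\xi(y - x)) = n \sum_{x \in \mathbb{Z}_n} f(x)\, \overline{g(x)}.$

To obtain the convolution formula we compute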

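$\widehat{(f * g)}(\xi) = \sum_{x \in \mathbb{Z}_n} \sum_{y \in \mathbb{Z}_n} f(y)\, g(x - y)\, e_n(-\xi x) = \sum_{y \in \mathbb{Z}_n} f(y)\, e_n(-\xi y) \sum_{x \in \mathbb{Z}_n} g(x - y)\, e_n(-\xi(x - y)) = \hat{f}(\xi)\, \hat{g}(\xi).$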
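The three identities (B.12), (B.13) and (B.14) can be verified numerically by evaluating the defining sums directly. The sketch below uses plain O(n^2) sums rather than a fast transform; the modulus n = 7, the sample functions f and g, and all helper names are assumptions chosen for the example.

```python
import cmath

def e_n(x, n):
    """e_n(x) = exp(2*pi*i*x/n)."""
    return cmath.exp(2j * cmath.pi * x / n)

def dft(f, n):
    """Discrete Fourier transform (mod n): sum of f(x) e_n(-xi*x)."""
    return [sum(f[x] * e_n(-xi * x, n) for x in range(n)) for xi in range(n)]

def inverse_dft(fhat, n):
    """Inversion formula (B.12)."""
    return [sum(fhat[xi] * e_n(xi * x, n) for xi in range(n)) / n
            for x in range(n)]

def convolve(f, g, n):
    """Cyclic convolution (f * g)(x) = sum of f(y) g(x - y) over y in Z_n."""
    return [sum(f[y] * g[(x - y) % n] for y in range(n)) for x in range(n)]

n = 7
f = [complex(k * k - 3, k) for k in range(n)]  # hypothetical sample data
g = [complex(1, -k) if k % 2 else complex(k, 2) for k in range(n)]
fhat, ghat = dft(f, n), dft(g, n)

# (B.12): inverting the transform recovers f.
inversion_error = max(abs(a - b) for a, b in zip(inverse_dft(fhat, n), f))

# (B.13): Plancherel, with complex conjugation on the second factor.
plancherel_error = abs(
    sum(a * b.conjugate() for a, b in zip(fhat, ghat))
    - n * sum(a * b.conjugate() for a, b in zip(f, g))
)

# (B.14): the transform turns convolution into a pointwise product.
convolution_error = max(
    abs(a - b * c) for a, b, c in zip(dft(convolve(f, g, n), n), fhat, ghat)
)
print(inversion_error, plancherel_error, convolution_error)
```

All three errors should be at the level of floating-point rounding. Note that no factor 1/n appears in the forward transform, which is why n appears on the right-hand side of (B.13).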


References

Ahlfors, L. (1979). Complex Analysis. Third edition. McGraw-Hill Education, New York.

Apostol, T. (1978). Mathematical Analysis. Second edition. Addison-Wesley Publishing Company, Massachusetts.

Ayoub, R. (1963). An Introduction to the Analytic Theory of Numbers. First edition. American Mathematical Society, Rhode Island.

Baker, R., Harman, G. & Pintz, J. (2001). The difference between consecutive primes. II. Proceedings of the London Mathematical Society, 83 (3), pp. 532-562.

Cramér, H. (1936). On the order of magnitude of the difference between consecutive prime numbers. Acta Arithmetica, 2, pp. 23-46.

Davenport, H. (1980). Multiplicative Number Theory. Second edition. Springer, New York.

Gallagher, P. (1976). On the distribution of primes in short intervals. Mathematika, 23 (1), pp. 4-9.

Grafakos, L. (2008). Classical Fourier Analysis. Second edition. Springer, New York.

Halberstam, H. & Richert, H. (2011). Sieve Methods. Second edition. Dover Publications Inc., New York.

Hardy, G. & Littlewood, J. (1923). Some Problems of 'Partitio Numerorum.' III. On the Expression of a Number as a Sum of Primes. Acta Mathematica, 44, pp. 1-70.

Huxley, M. (2005). Exponential sums and the Riemann zeta function. V. Proceedings of the London Mathematical Society. Third Series, 90 (1), pp. 1-41.

Ivić, A. (2003). The Riemann Zeta-function: Theory and Applications. First Edition. Dover Publications Inc., New York.

Polymath, D. H. J. (2014). Variants of the Selberg sieve, and bounded intervals containing many primes. arXiv: http://arxiv.org/abs/1407.4897

Ramachandra, K. (2007). Theory of Numbers: A Textbook. First Edition. Alpha Science International, Ltd., Oxford.

Saksman, E. (2014). Introduction to Analytic Number Theory. Lecture Notes. University of Helsinki.

Tao, T. & Vu, V. (2007). Additive Combinatorics. First Edition. Cambridge University Press, Cambridge.

Tenenbaum, G. (1995). Introduction to Analytic and Probabilistic Number Theory. First Edition. Cambridge University Press, Cambridge.

Titchmarsh, E. (1951). The Theory of the Riemann Zeta-Function. First Edition. Oxford University Press, Oxford.

Titchmarsh, E. (1976). The Theory of Functions. Second Edition. Oxford University Press, Oxford.
