Master's Thesis

MASTER IN ADVANCED MATHEMATICS

Faculty of Mathematics, Universitat de Barcelona

Sieve Theory and Applications

Author: Mario Sariols Quiles

Advisor: Dr. Luis Dieulefait
Carried out at: Departament de Matemàtiques i Informàtica

Barcelona, 6 September 2017

Contents

Introduction
  Acknowledgements
  Notation

1 Introduction to Sieve Theory
  1.1 Selberg’s Sieve
  1.2 Twin Primes and Sums of Two Primes

2 Linear Sieve
  2.1 Combinatorial Sieve
  2.2 The Functions Φ and φ
  2.3 The Jurkat-Richert Theorem

3 Large Sieve
  3.1 A Large Sieve Inequality

4 Chen’s Theorem
  4.1 Three Sieving Functions
  4.2 Linearity of the Sieves
  4.3 The Sieve of S(A, P, z)
  4.4 The Sieve of Σ z≤q…

Appendices
  A Arithmetic Functions
  B The Möbius and Euler’s Totient Functions
  C Dirichlet Characters

References

Introduction

The object of this Master's Thesis is the study of sieve theory and of a theorem due to J. Chen (1966), which states that every sufficiently large even number is the sum of an odd prime and a product of at most two primes.

The proof of this theorem makes heavy use of sieving techniques. The purpose of this thesis is therefore to introduce the required sieving methods, so that their application in the proof can be fully understood, and to provide a proof of the theorem.

Section 1 introduces the basic notation and methods of sieve theory. An upper bound for the prime counting function is found using elementary sieving methods; it is by no means comparable to the Prime Number Theorem, but it is interesting in its own right nonetheless. Moreover, a sieve due to Selberg is described and then applied to bound from above the twin prime counting function and the number of representations of a given even number as the sum of two primes. The references for this section are mainly §7.2 of [1], §1.2 of [4] and §7 of [5].

Section 2 focuses on combinatorial sieves, which are in a sense a generalisation of the sieve shown in Section 1. In particular, it centres on linear sieves, that is, combinatorial sieves of dimension one. Finally, a theorem due to Jurkat and Richert on linear sieves is stated and proved, since it will later be used in the proof of Chen’s Theorem in Section 4. This section roughly follows §8 of [3] and §9 of [5].

Section 3 briefly presents large sieves, because a large sieve inequality will be needed in the proof of Chen’s Theorem in Section 4. For more details, see §27 of [2] and §8 of [1].

Section 4 is exclusively dedicated to stating and proving Chen’s Theorem. The proof is predominantly taken from §10 of [5] and, to a lesser extent, from §11 of [3].

Acknowledgements

I would like to thank Professor Dieulefait for his guidance.

Notation

The letters $p$ and $q$ denote prime numbers. Similarly, $n$, $m$, $d$, $N$, among others, are always used for natural numbers. The set of natural numbers is denoted by

\[ \mathbb{N} = \{1, 2, 3, \dots\} \]

Given two natural numbers $m$ and $n$, write $(m, n)$ for their greatest common divisor and $[m, n]$ for their lowest common multiple. Moreover, $d^r \,\|\, n$ means that $d^r$ is the greatest power of $d$ dividing $n$, that is, $r \ge 0$ is the greatest integer such that $d^r \mid n$ and $d^{r+1} \nmid n$. Given two functions $f$ and $g$, write $f \ll g$ if there exists $A > 0$ such that $|f| \le Ag$, which is the same as $f = O(g)$. Similarly, write $f \gg g$ if there exists $B > 0$ such that $|f| \ge Bg$.

Finally, $f \ll_c g$ means that there exists $A > 0$, possibly depending on the parameter $c$, such that $|f| \le Ag$.

$\pi(x)$ is the prime counting function:
\[ \pi(x) = \sum_{p \le x} 1 \]

$\pi(x, a \bmod n)$ is the number of primes up to $x$ congruent to $a$ modulo $n$:
\[ \pi(x, a \bmod n) = \sum_{\substack{p \le x \\ p \equiv a \ (\mathrm{mod}\ n)}} 1 \]

$\omega(n)$ is the prime divisor function:
\[ \omega(n) = \sum_{p \mid n} 1 \]

$\varphi(n)$ is Euler’s totient function:
\[ \varphi(n) = \sum_{\substack{d \le n \\ (d, n) = 1}} 1 \]

$\gamma$ is Euler’s constant, and multiplicative functions are assumed not to be identically $0$.

1 Introduction to Sieve Theory

Given a finite set A of natural numbers, a set of primes P and a real number z > 1, the question is how many elements of A are not divisible by any prime in P smaller than z. Finding the answer to the previous question is what sieve theory seeks to accomplish.

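The set $\mathcal{A}$ is referred to as the sieving set, and $\mathcal{P}$ and $z$ are often named the sieving range and sieving level, respectively. Together they define the sieving function
\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{\substack{a \in \mathcal{A} \\ (a, P(z)) = 1}} 1 \]
where
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p \]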

Perhaps the most famous of sieves is the one due to Eratosthenes, who based the sieving of primes on the following remark:

Every natural number between $2$ and $N$ that is not divisible by any prime less than or equal to the square root of $N$ is a prime number.

Proof. Let $2 \le n \le N$. Write $n = p_1 \cdots p_r$ as a product of primes, where $r > 0$. If $n$ is not divisible by any prime less than or equal to $N^{1/2}$, then $p_i > N^{1/2}$ for all $i$. Hence $n > N^{r/2} \ge n^{r/2}$. Therefore $1 > \frac{r}{2}$, which implies $r = 1$.

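In the above setting, Eratosthenes’ sieve consists in sieving the set

\[ \mathcal{A} = \{n \in \mathbb{N} : n \le N\} \]

with sieving range $\mathcal{P} = \mathbb{P}$, the set of all primes, and sieving level $z = [N^{1/2} + 1]$. The value of $z$ cannot simply be set equal to $N^{1/2}$, for the technical reason that $P(z)$ is the product of the primes $p < z$ rather than $p \le z$. In any case,

\[ \sum_{\substack{N^{1/2} < n \le N \\ (n, P(z)) = 1}} 1 = \sum_{N^{1/2} < p \le N} 1 = \pi(N) - \pi(N^{1/2}) \]

and hence

\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{\substack{n \le N \\ (n, P(z)) = 1}} 1 = 1 + \sum_{\substack{1 < n \le N^{1/2} \\ (n, P(z)) = 1}} 1 + \sum_{\substack{N^{1/2} < n \le N \\ (n, P(z)) = 1}} 1 = 1 + \pi(N) - \pi(N^{1/2}) \]

since there is no $1 < n \le N^{1/2}$ coprime with $P(z)$. An upper bound for $S(\mathcal{A}, \mathcal{P}, z)$ will be found in order to estimate $\pi(N)$.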
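The identity $S(\mathcal{A}, \mathcal{P}, z) = 1 + \pi(N) - \pi(N^{1/2})$ can be checked numerically. The following is a minimal sketch, not part of the original text; the function names are illustrative choices, and the sieving level is taken as $z = [N^{1/2} + 1]$, computed with an integer square root.

```python
from math import isqrt


def primes_up_to(n):
    """Return the list of primes p <= n (sieve of Eratosthenes)."""
    if n < 2:
        return []
    is_prime = [True] * (n + 1)
    is_prime[0] = is_prime[1] = False
    for p in range(2, isqrt(n) + 1):
        if is_prime[p]:
            # Cross out the multiples of p, starting at p*p.
            for m in range(p * p, n + 1, p):
                is_prime[m] = False
    return [p for p in range(2, n + 1) if is_prime[p]]


def prime_count(x):
    """pi(x): number of primes up to x."""
    return len(primes_up_to(x))


def eratosthenes_sieving_function(N):
    """S(A, P, z) for A = {n <= N}, P = all primes, z = [sqrt(N) + 1]."""
    z = isqrt(N) + 1
    sieving_primes = [p for p in primes_up_to(z) if p < z]
    # Count the n <= N that are coprime with P(z).
    return sum(1 for n in range(1, N + 1)
               if all(n % p != 0 for p in sieving_primes))


if __name__ == "__main__":
    N = 1000
    print(eratosthenes_sieving_function(N),
          1 + prime_count(N) - prime_count(isqrt(N)))
```

Both printed numbers should agree: the left one counts survivors of the sieve directly, the right one uses the prime counting function.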


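In a general setting, a way to find an upper bound for a sieving function is to make use of the Möbius function (see Appendix B). By Theorem B.1
\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{\substack{a \in \mathcal{A} \\ (a, P(z)) = 1}} 1 = \sum_{a \in \mathcal{A}} \sum_{d \mid (a, P(z))} \mu(d) = \sum_{a \in \mathcal{A}} \sum_{\substack{d \mid a \\ d \mid P(z)}} \mu(d) = \sum_{d \mid P(z)} \mu(d) \sum_{\substack{a \in \mathcal{A} \\ d \mid a}} 1 \]

Given a square-free $d \in \mathbb{N}$, define

\[ \mathcal{A}_d = \{a \in \mathcal{A} : d \mid a\} \]

Then
\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{d \mid P(z)} \mu(d) |\mathcal{A}_d| \]
since $d \mid P(z)$ implies that $d$ is square-free. The above identity is known as Legendre’s identity, and will be used in upcoming sections. In the particular case of Eratosthenes’ sieve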

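\[ |\mathcal{A}_d| = \sum_{\substack{n \le N \\ d \mid n}} 1 = \left[\frac{N}{d}\right] = \frac{N}{d} - \left\{\frac{N}{d}\right\} \]

where $[t]$ and $\{t\}$ denote the integer and fractional parts of $t$. Hence
\[ S(\mathcal{A}, \mathcal{P}, z) = N \sum_{d \mid P(z)} \frac{\mu(d)}{d} - \sum_{d \mid P(z)} \mu(d) \left\{\frac{N}{d}\right\} \]

By Theorem B.1 with $f(n) = \frac{1}{n}$, rewrite the first sum as

\[ \sum_{d \mid P(z)} \frac{\mu(d)}{d} = \prod_{p \mid P(z)} \left(1 - \frac{1}{p}\right) \]

Therefore
\[ S(\mathcal{A}, \mathcal{P}, z) = N \prod_{p \mid P(z)} \left(1 - \frac{1}{p}\right) + R \]
where
\[ R = -\sum_{d \mid P(z)} \mu(d) \left\{\frac{N}{d}\right\} = O\Bigg(\sum_{d \mid P(z)} 1\Bigg) = O\left(2^{\pi(z)}\right) \]
since the number of divisors of $P(z)$ is exactly $2^{\omega(P(z))} \le 2^{\pi(z)}$. The problem is that the error term $R$ is too big for the chosen value of $z$ (which is of order $N^{1/2}$), and thus very little can be said about $S(\mathcal{A}, \mathcal{P}, z)$. A solution presents itself: reduce the size of $z$. Set $z = \log N$. In this case
\[ R = O\left(2^{\pi(\log N)}\right) = O\left(N^{\log 2}\right) \]
since $2^{\pi(\log N)} \le 2^{\log N} = e^{\log N \log 2} = N^{\log 2}$. Moreover

\[ S(\mathcal{A}, \mathcal{P}, z) \ge 1 + \pi(N) - \pi(z) \ge \pi(N) - z \]

since $S(\mathcal{A}, \mathcal{P}, z)$ certainly counts the number one and every prime between $z$ and $N$. Then

\[ \pi(N) \le z + S(\mathcal{A}, \mathcal{P}, z) = \log N + N \prod_{p \mid P(z)} \left(1 - \frac{1}{p}\right) + R = N \prod_{p \mid P(z)} \left(1 - \frac{1}{p}\right) + O\left(N^{\log 2}\right) \]
where

\[ \prod_{p \mid P(z)} \left(1 - \frac{1}{p}\right)^{-1} = \prod_{p < z} \left(1 - \frac{1}{p}\right)^{-1} = \prod_{p < z} \sum_{m \ge 0} \frac{1}{p^m} > \sum_{n < z} \frac{1}{n} > \int_1^z \frac{dx}{x} = \log z \]

Hence
\[ \prod_{p \mid P(z)} \left(1 - \frac{1}{p}\right) < \frac{1}{\log z} = \frac{1}{\log \log N} \]
and therefore
\[ \pi(N) \ll \frac{N}{\log \log N} \]
This result is much weaker than the Prime Number Theorem. Nonetheless, it serves the purpose of showing how sieve theory can be applied.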
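To make Legendre’s identity and the size of the error term concrete, here is a small numerical sketch (an illustration added for this discussion, with hypothetical function names). It evaluates $\sum_{d \mid P(z)} \mu(d)[N/d]$ by enumerating the square-free divisors of $P(z)$, compares it with a direct count, and compares both with the main term $N \prod_{p < z}(1 - 1/p)$ for $z = \log N$.

```python
from itertools import combinations
from math import ceil, log, prod


def primes_below(z):
    """Primes p < z (trial division; fine for small z)."""
    return [p for p in range(2, ceil(z))
            if all(p % q != 0 for q in range(2, int(p ** 0.5) + 1))]


def legendre_count(N, z):
    """Sum over d | P(z) of mu(d) * [N/d], for A = {n <= N}."""
    primes = primes_below(z)
    total = 0
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            d = prod(subset)               # square-free divisor of P(z); empty product is 1
            total += (-1) ** r * (N // d)  # mu(d) = (-1)^r and |A_d| = [N/d]
    return total


def direct_count(N, z):
    """Number of n <= N not divisible by any prime p < z."""
    primes = primes_below(z)
    return sum(1 for n in range(1, N + 1) if all(n % p != 0 for p in primes))


def main_term(N, z):
    """N * prod_{p < z} (1 - 1/p)."""
    return N * prod(1 - 1 / p for p in primes_below(z))


if __name__ == "__main__":
    N = 10_000
    z = log(N)  # roughly 9.21, so P(z) = 2 * 3 * 5 * 7
    print(legendre_count(N, z), direct_count(N, z), main_term(N, z))
```

The difference between the exact count and the main term is bounded by the number of divisors of $P(z)$, which is why a small sieving level keeps the error under control.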

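1.1 Selberg’s Sieve

In Selberg’s sieve one replaces the Möbius function in
\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{\substack{a \in \mathcal{A} \\ (a, P(z)) = 1}} 1 = \sum_{a \in \mathcal{A}} \sum_{d \mid (a, P(z))} \mu(d) \]
with a sequence of real numbers $\{\lambda_d\}$, where $d$ is square-free and $\lambda_1 = 1$, carefully chosen to minimise the final estimates. The reason is that
\[ \sum_{d \mid (a, P(z))} \mu(d) \le \Bigg(\sum_{d \mid (a, P(z))} \lambda_d\Bigg)^2 \]
for any sequence of real numbers $\{\lambda_d\}$ with $\lambda_1 = 1$. Indeed, the left-hand side is $1$ when $(a, P(z)) = 1$, in which case the right-hand side is $\lambda_1^2 = 1$, and it is $0$ otherwise. The choice $\lambda_d = 0$ for every $d \ne 1$ produces the trivial and useless bound
\[ S(\mathcal{A}, \mathcal{P}, z) \le \sum_{a \in \mathcal{A}} 1 = |\mathcal{A}| \]

Theorem 1.1 (Selberg Sieve). Let $\mathcal{A}$ be a sieving set, $\mathcal{P}$ a sieving range and $z$ a sieving level. Let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p \]
Let $\mathcal{A}_d = \{a \in \mathcal{A} : d \mid a\}$ for any square-free $d \in \mathbb{N}$, and let $f$ be a multiplicative function such that $0 < f(p) < 1$ for every $p \in \mathcal{P}$. Define

\[ r(d) = |\mathcal{A}_d| - |\mathcal{A}| f(d) \]
Let $g$ be the completely multiplicative function defined by $g(p) = f(p)$ for every $p \in \mathcal{P}$, and
\[ G(z) = \sum_{n < z} g(n) \]
Then
\[ S(\mathcal{A}, \mathcal{P}, z) \le \frac{|\mathcal{A}|}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)| \]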

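Proof. For every divisor $d$ of $P(z)$, let $\lambda_d$ be a real number, with $\lambda_1 = 1$ and $\lambda_d = 0$ for every $d \ge z$. Then, by Lemma A.5
\[
\begin{aligned}
S(\mathcal{A}, \mathcal{P}, z) &= \sum_{\substack{a \in \mathcal{A} \\ (a, P(z)) = 1}} 1 \le \sum_{a \in \mathcal{A}} \Bigg(\sum_{d \mid (a, P(z))} \lambda_d\Bigg)^2 = \sum_{a \in \mathcal{A}} \sum_{\substack{d_1 \mid a \\ d_1 \mid P(z)}} \sum_{\substack{d_2 \mid a \\ d_2 \mid P(z)}} \lambda_{d_1} \lambda_{d_2} \\
&= \sum_{d_1, d_2 \mid P(z)} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{a \in \mathcal{A} \\ [d_1, d_2] \mid a}} 1 = \sum_{d_1, d_2 \mid P(z)} \lambda_{d_1} \lambda_{d_2} |\mathcal{A}_{[d_1, d_2]}| \\
&= |\mathcal{A}| \sum_{d_1, d_2 \mid P(z)} \lambda_{d_1} \lambda_{d_2} f([d_1, d_2]) + \sum_{d_1, d_2 \mid P(z)} \lambda_{d_1} \lambda_{d_2} r([d_1, d_2]) \\
&= |\mathcal{A}| \sum_{d_1, d_2 \mid P(z)} \lambda_{d_1} \lambda_{d_2} \frac{f(d_1) f(d_2)}{f((d_1, d_2))} + R = |\mathcal{A}| T + R
\end{aligned}
\]
where
\[ T = \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}} \lambda_{d_1} \lambda_{d_2} \frac{f(d_1) f(d_2)}{f((d_1, d_2))} \]
and
\[ R = \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}} \lambda_{d_1} \lambda_{d_2} r([d_1, d_2]) \]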


Therefore

X f(d1)f(d2) X X T = λd1 λd2 = λd1 λd2 f(d1)f(d2) F (d) f((d1, d2)) d1,d2 0, since µ(1) µ(p) 1 F (p) = + = − 1 > 0 f(p) f(1) f(p) for all primes in P, and F is multiplicative. Moreover, µ2(d) = 1. Let X 1 V (z) = F (d) d


It is therefore manifest that the minimum value of T is 1 V (z) attained at µ(d) w = d F (d)V (z) since F (d) > 0. Substitute these values of wd in the expression previously found for λd, to obtain 1 X  δ  µ(δ) µ(d) X 1 λ = µ = d f(d) d F (δ)V (z) f(d)V (z) F (δ) δ

X 1 X 1 X 1 1 X 1 = = = F (δ) F (d`) F (d`) F (d) F (`) δ

Hence

µ(d) X 1 µ(d) Y 1 X 1 λ = = d f(d)F (d)V (z) F (`) V (z) f(p)F (p) F (`) d`

1 X µ(h) X 1 1 X 1 X 1 |λd| = ≤ V (z) F (h) F (`) V (z) F (h) F (`) h|d d`


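Thus
\[ |R| \le \sum_{\substack{d_1, d_2 < z \\ d_1, d_2 \mid P(z)}} |r([d_1, d_2])| \le \sum_{\substack{d < z^2 \\ d \mid P(z)}} |r(d)| \sum_{\substack{d_1, d_2 \\ [d_1, d_2] = d}} 1 \]
Let $d = p_1 \cdots p_r$ be a product of distinct primes, and write $d_1 = \prod_{i=1}^r p_i^{\alpha_i}$ and $d_2 = \prod_{i=1}^r p_i^{\beta_i}$ with $\alpha_i, \beta_i \in \{0, 1\}$. Then

\[ p_1 \cdots p_r = d = [d_1, d_2] = \prod_{i=1}^r p_i^{\max(\alpha_i, \beta_i)} \]
which implies that the number of ordered pairs $(d_1, d_2)$ such that $[d_1, d_2] = d$ equals the number of choices of pairs $(\alpha_i, \beta_i)$ with $\max(\alpha_i, \beta_i) = 1$ for every $i$. There are $3$ such pairs for every $i$, namely $(1, 0)$, $(0, 1)$ and $(1, 1)$, so the total is $3^{\omega(d)}$. Hence
\[ S(\mathcal{A}, \mathcal{P}, z) \le \frac{|\mathcal{A}|}{V(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)| \]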
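The counting step, that a square-free $d$ arises as $[d_1, d_2]$ from exactly $3^{\omega(d)}$ ordered pairs, is easy to confirm by brute force. The sketch below is an added illustration; the helper names are hypothetical.

```python
from math import gcd


def lcm(a, b):
    """Lowest common multiple [a, b]."""
    return a * b // gcd(a, b)


def divisors(d):
    """All positive divisors of d."""
    return [k for k in range(1, d + 1) if d % k == 0]


def omega(d):
    """Number of distinct prime divisors of d."""
    count, p = 0, 2
    while p * p <= d:
        if d % p == 0:
            count += 1
            while d % p == 0:
                d //= p
        p += 1
    return count + (1 if d > 1 else 0)


def lcm_pairs(d):
    """Number of ordered pairs (d1, d2) with [d1, d2] = d."""
    divs = divisors(d)
    return sum(1 for d1 in divs for d2 in divs if lcm(d1, d2) == d)


if __name__ == "__main__":
    for d in (1, 2, 6, 30, 210):
        print(d, lcm_pairs(d), 3 ** omega(d))
```

For each square-free $d$ listed, the last two printed columns should coincide.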

To conclude, it is enough to prove that $V(z) \ge G(z)$. Let $d \mid P(z)$. Then $d$ is a product of distinct primes. Thus $(h, \frac{d}{h}) = 1$ for every $h \mid d$. Hence $f(d) = f(h \cdot \frac{d}{h}) = f(h) f(\frac{d}{h})$. Therefore
\[ F(d) = \sum_{h \mid d} \frac{\mu(h)}{f\left(\frac{d}{h}\right)} = \frac{1}{f(d)} \sum_{h \mid d} \mu(h) f(h) = \frac{1}{f(d)} \prod_{p \mid d} (1 - f(p)) \]
by Theorem B.1. Then

X 1 X Y 1 X Y 1 V (z) = = f(d) = g(d) F (d) 1 − f(p) 1 − g(p) d

where the last inequality follows from the fact that X 1 ≥ 1 d

1.2 Twin Primes and Sums of Two Primes

Selberg’s sieve can be applied to provide an upper bound for the number of twin primes up to a given number and for the number of representations of an even number as the sum of two primes.

The twin prime conjecture states that there exist infinitely many primes $p$ such that $p + 2$ is prime. Let $\pi_2(x)$ be the number of twin primes up to $x$. Then the twin prime conjecture is equivalent to $\lim_{x \to +\infty} \pi_2(x) = +\infty$. Selberg’s sieve will provide the following upper bound for $\pi_2(x)$:
\[ \pi_2(x) \ll \frac{x}{\log^2 x} \]
Goldbach’s conjecture states that every even natural number greater than $2$ can be written as the sum of two primes. Equivalently, $R(N) \ge 1$ for every even $N > 2$, where $R(N)$ is the number of representations of $N$ as the sum of two primes. Using Selberg’s sieve, one can find the following upper bound for $R(N)$:

\[ R(N) \ll \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \frac{N}{\log^2 N} \]

Before proving either of the above, a lemma.

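Lemma 1.2. Let $N$ be an even natural number and $f$ the completely multiplicative function defined by
\[ f(p) = \begin{cases} \dfrac{1}{p} & \text{if } p \mid N \\[2mm] \dfrac{2}{p} & \text{otherwise} \end{cases} \]
Let $G(z) = \sum_{n < z} f(n)$. Then

\[ \frac{1}{G(z)} \ll \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \frac{1}{\log^2 z} \]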


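Proof. Let $n < z$. Write
\[ n = \prod_{i=1}^{\omega(N)} p_i^{\alpha_i} \prod_{j=1}^{k} q_j^{\beta_j} \]
where $p_1, \dots, p_{\omega(N)}$ are the distinct primes that divide $N$, and $q_1, \dots, q_k$ are distinct primes not dividing $N$, with $k$, $\alpha_i$ and $\beta_j$ nonnegative integers. Then

\[ f(n) = \prod_{i=1}^{\omega(N)} f(p_i)^{\alpha_i} \prod_{j=1}^{k} f(q_j)^{\beta_j} = \prod_{i=1}^{\omega(N)} \frac{1}{p_i^{\alpha_i}} \prod_{j=1}^{k} \frac{2^{\beta_j}}{q_j^{\beta_j}} = \frac{2^{\beta_1 + \dots + \beta_k}}{n} \]
Let $d(s) = \sum_{d \mid s} 1$ be the divisor function and
\[ d(s, N) = \sum_{\substack{d \mid s \\ (d, N) = 1}} 1 \]

Then
\[ d(n, N) = d\Bigg(\prod_{j=1}^{k} q_j^{\beta_j}\Bigg) = \prod_{j=1}^{k} (1 + \beta_j) \le \prod_{j=1}^{k} 2^{\beta_j} = 2^{\beta_1 + \dots + \beta_k} \]

since $d \mid \prod_{j=1}^{k} q_j^{\beta_j}$ if and only if
\[ d = \prod_{j=1}^{k} q_j^{\gamma_j} \]
where $0 \le \gamma_j \le \beta_j$ for every $j$, which gives $1 + \beta_j$ possible values of $\gamma_j$ for every $j$. Hence
\[ G(z) = \sum_{n < z} f(n) \ge \sum_{n < z} \frac{d(n, N)}{n} \]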

−1 Y  1 X d(n, N) X 1 X X 1 G(z) 1 − ≥ = d(n, N) p n r nr p|N n


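Let $s < z$ and let $n$ be such that $n \mid s$ and $p \mid N$ for every prime $p \mid \frac{s}{n}$. Write
\[ s = \prod_{i=1}^{\omega(N)} p_i^{a_i} \prod_{j=1}^{k} q_j^{b_j} \]
and
\[ n = \prod_{i=1}^{\omega(N)} p_i^{c_i} \prod_{j=1}^{k} q_j^{d_j} \]
where $p_1, \dots, p_{\omega(N)}$ are the distinct primes that divide $N$, and $q_1, \dots, q_k$ are distinct primes not dividing $N$, with $k$, $a_i$, $b_j$, $c_i$ and $d_j$ nonnegative integers. Then

\[ \frac{s}{n} = \prod_{i=1}^{\omega(N)} p_i^{a_i - c_i} \prod_{j=1}^{k} q_j^{b_j - d_j} \]
where $a_i - c_i \ge 0$ and $b_j - d_j \ge 0$, since $n \mid s$. In fact $b_j = d_j$, since $p \mid N$ for every $p \mid \frac{s}{n}$. In particular
\[ d(n, N) = \prod_{j=1}^{k} (1 + d_j) = \prod_{j=1}^{k} (1 + b_j) \]
Furthermore, the number of divisors $n$ of $s$ such that $p \mid N$ for every $p \mid \frac{s}{n}$ is exactly
\[ \prod_{i=1}^{\omega(N)} (1 + a_i) \]
since the exponents $d_j$ are completely determined by $s$ and there are $1 + a_i$ possible values for $c_i$, because $0 \le c_i \le a_i$ for every $i$. Thus

\[ \sum_{\substack{n \mid s \\ p \mid \frac{s}{n} \Rightarrow p \mid N}} d(n, N) = \prod_{j=1}^{k} (1 + b_j) \sum_{\substack{n \mid s \\ p \mid \frac{s}{n} \Rightarrow p \mid N}} 1 = \prod_{j=1}^{k} (1 + b_j) \prod_{i=1}^{\omega(N)} (1 + a_i) = d(s) \]
Finally, by Lemma A.2
\[ G(z) \prod_{p \mid N} \left(1 - \frac{1}{p}\right)^{-1} \ge \sum_{s < z} \frac{d(s)}{s} \gg \log^2 z \]
The lemma follows, because $\prod_{p \mid N} (1 - \frac{1}{p})^{-1} = \prod_{p \mid N} (1 + \frac{1}{p}) \prod_{p \mid N} (1 - \frac{1}{p^2})^{-1} \ll \prod_{p \mid N} (1 + \frac{1}{p})$.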
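The divisor identity used in the last step of the proof, namely that the sum of $d(n, N)$ over the divisors $n$ of $s$ with $s/n$ composed only of primes dividing $N$ equals $d(s)$, can be verified by brute force for small values. The following sketch is an added illustration; the function names are assumptions made here.

```python
from math import gcd


def divisors(s):
    """All positive divisors of s."""
    return [d for d in range(1, s + 1) if s % d == 0]


def prime_divisors(m):
    """Distinct prime divisors of m (empty list for m = 1)."""
    primes, p = [], 2
    while p * p <= m:
        if m % p == 0:
            primes.append(p)
            while m % p == 0:
                m //= p
        p += 1
    if m > 1:
        primes.append(m)
    return primes


def d_coprime(n, N):
    """d(n, N): number of divisors of n that are coprime with N."""
    return sum(1 for d in divisors(n) if gcd(d, N) == 1)


def restricted_sum(s, N):
    """Sum of d(n, N) over n | s such that every prime dividing s/n divides N."""
    return sum(d_coprime(n, N)
               for n in divisors(s)
               if all(N % p == 0 for p in prime_divisors(s // n)))


if __name__ == "__main__":
    for s in (12, 45, 90):
        print(s, restricted_sum(s, 6), len(divisors(s)))
```

For each $s$, the restricted sum and the plain divisor count $d(s)$ should agree.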


After the lemma, both previously stated results can be proved.

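Theorem 1.3. Let $x \ge 1$ be a real number and $\pi_2(x)$ the number of twin primes up to $x$. Then
\[ \pi_2(x) \ll \frac{x}{\log^2 x} \]

Proof. Consider the sieving set
\[ \mathcal{A} = \{n(n + 2) : n \le x\} \]
of $[x]$ elements. Let $\mathcal{P} = \mathbb{P}$ be the sieving range and $z = x^{1/8}$ the sieving level. Then
\[ S(\mathcal{A}, \mathbb{P}, z) = \sum_{\substack{n \le x \\ (n(n+2), P(z)) = 1}} 1 \]
Let $z < n \le x$. Assume that $p \mid n(n + 2)$ for some $p < z$. Then $p \mid n$ or $p \mid (n + 2)$, which implies $n \notin \mathbb{P}$ or $n + 2 \notin \mathbb{P}$, since $n + 2 > n > z > p$. This means that $(n(n + 2), P(z)) = 1$ if both $n \in \mathbb{P}$ and $n + 2 \in \mathbb{P}$, with $z < n \le x$. Therefore
\[ \pi_2(x) = \sum_{\substack{p \le x \\ p + 2 \in \mathbb{P}}} 1 = \pi_2(z) + \sum_{\substack{z < p \le x \\ p + 2 \in \mathbb{P}}} 1 \le \pi_2(z) + S(\mathcal{A}, \mathbb{P}, z) \le z + S(\mathcal{A}, \mathbb{P}, z) \]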

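In order to apply Selberg’s sieve, consider the sets $\mathcal{A}_d = \{a \in \mathcal{A} : d \mid a\}$ for all square-free $d \in \mathbb{N}$. Let $f$ be the completely multiplicative function of Lemma 1.2 with $N = 2$, that is,

\[ f(p) = \begin{cases} \dfrac{1}{p} & \text{if } p = 2 \\[2mm] \dfrac{2}{p} & \text{otherwise} \end{cases} \]
Define $r(d) = |\mathcal{A}_d| - [x] f(d)$ and
\[ G(z) = \sum_{n < z} f(n) \]
By Theorem 1.1

\[ S(\mathcal{A}, \mathbb{P}, z) \le \frac{[x]}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)| \]

To find an upper bound for the error term, let $d \mid P(z)$. Write $d$ as $d = p_1 \cdots p_r$ or $d = 2 p_1 \cdots p_r$, with $2 < p_i < z$ distinct primes and $r \ge 0$, and consider
\[ |\mathcal{A}_d| = \sum_{\substack{n \le x \\ d \mid n(n+2)}} 1 \]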

which equals the number of $n \le x$ that solve the congruence $n(n + 2) \equiv 0 \pmod{d}$. It is enough to solve this congruence modulo each prime dividing $d$ and then apply the Chinese Remainder Theorem, since $d$ is a product of distinct primes. Let $p = 2$. Then $n(n + 2) \equiv 0 \pmod{2}$ if and only if $n \equiv 0 \pmod{2}$, which corresponds to only $1$ residue class modulo $2$. Let $p > 2$. Then $n(n + 2) \equiv 0 \pmod{p}$ if and only if $n \equiv 0 \pmod{p}$ or $n + 2 \equiv 0 \pmod{p}$, which corresponds to $2$ different residue classes modulo $p$. Hence, by the Chinese Remainder Theorem, $n(n + 2) \equiv 0 \pmod{d}$ if and only if $n$ lies in one of $2^r$ different residue classes modulo $d$. Therefore

\[ |\mathcal{A}_d| = \sum_{i=1}^{2^r} \sum_{\substack{n \le x \\ n \equiv \alpha_i \ (\mathrm{mod}\ d)}} 1 = \sum_{i=1}^{2^r} \left(\frac{[x]}{d} + O(1)\right) \]

where $\alpha_i$ are the said $2^r$ different residue classes modulo $d$. Thus
\[ |\mathcal{A}_d| = 2^r \left(\frac{[x]}{d} + O(1)\right) = [x] f(d) + O(2^r) \]

since $f(d) = \frac{2^r}{d}$, whence
\[ r(d) = O(2^r) = O(2^{\omega(d)}) \]
since $\omega(d)$ equals either $r$ or $r + 1$ (depending on whether $d$ is odd or even). This bound for $r(d)$ allows the proof to be concluded, because

X ω(d) X ω(d) ω(d) X ω(d) X ω(d) X 2 log 6 2+2 log 6 3 |r(d)| << 3 2 = 6 ≤ 6 < z log 2 < z log 2 d

1 1 x 9 x π2(x) ≤ x 8 + S(A, P, z) <
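The residue class count in the proof can be checked directly. The sketch below (an added illustration with hypothetical names) lists the solutions of $n(n + 2) \equiv 0 \pmod{d}$ for square-free $d$ and compares $|\mathcal{A}_d|$ with $2^r x / d$, where $r$ is the number of odd primes dividing $d$ and $x$ is taken to be an integer.

```python
def solution_classes(d):
    """Residue classes n mod d with n(n + 2) = 0 (mod d)."""
    return [n for n in range(d) if n * (n + 2) % d == 0]


def odd_prime_factor_count(d):
    """Number of distinct odd primes dividing d."""
    while d % 2 == 0:
        d //= 2
    count, p = 0, 3
    while p * p <= d:
        if d % p == 0:
            count += 1
            while d % p == 0:
                d //= p
        p += 2
    return count + (1 if d > 1 else 0)


def count_A_d(x, d):
    """|A_d| for A = {n(n + 2) : n <= x}, with x a positive integer."""
    return sum(1 for n in range(1, x + 1) if n * (n + 2) % d == 0)


if __name__ == "__main__":
    x = 1000
    for d in (2, 3, 6, 15, 30, 105, 210):
        r = odd_prime_factor_count(d)
        print(d, len(solution_classes(d)), 2 ** r, count_A_d(x, d), 2 ** r * x / d)
```

The second and third columns should coincide, and the last two should differ by at most $2^r$, in line with $r(d) = O(2^{\omega(d)})$.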

Corollary 1.4. The sum of reciprocals of twin primes converges.

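Proof. Let $p_n$ be the $n$-th twin prime. Then, by Theorem 1.3

\[ n = \pi_2(p_n) \ll \frac{p_n}{\log^2 p_n} \le \frac{p_n}{\log^2 n} \]
for every $n > 1$. Hence
\[ \sum_{\substack{p \\ p + 2 \in \mathbb{P}}} \frac{1}{p} = \sum_{n \ge 1} \frac{1}{p_n} \ll \frac{1}{3} + \sum_{n \ge 2} \frac{1}{n \log^2 n} < +\infty \]
since $p_1 = 3$.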
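As a numerical illustration of Theorem 1.3 and Corollary 1.4 (added here, and not a substitute for the proofs), the sketch below counts the primes $p \le x$ with $p + 2$ prime, compares the count with $x / \log^2 x$, and accumulates the partial sums of the reciprocals. The function names are illustrative.

```python
from math import log


def prime_flags(n):
    """flags[k] is True exactly when k is prime, for 0 <= k <= n (n >= 1)."""
    flags = [True] * (n + 1)
    flags[0] = flags[1] = False
    p = 2
    while p * p <= n:
        if flags[p]:
            for m in range(p * p, n + 1, p):
                flags[m] = False
        p += 1
    return flags


def twin_primes_up_to(x):
    """Primes p <= x such that p + 2 is also prime."""
    flags = prime_flags(x + 2)
    return [p for p in range(2, x + 1) if flags[p] and flags[p + 2]]


if __name__ == "__main__":
    for x in (10 ** 3, 10 ** 4, 10 ** 5):
        twins = twin_primes_up_to(x)
        ratio = len(twins) / (x / log(x) ** 2)
        reciprocal_sum = sum(1 / p for p in twins)
        print(x, len(twins), round(ratio, 3), round(reciprocal_sum, 4))
```

The ratio column staying bounded is what the theorem predicts, and the slowly growing last column is consistent with convergence, although a finite computation cannot establish it.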


Finally, the second application of Selberg’s sieve, whose proof is very similar to that of Theorem 1.3.

Theorem 1.5. Let $N > 2$ be an even natural number and let $R(N)$ denote the number of representations of $N$ as the sum of two primes. Then

\[ R(N) \ll \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \frac{N}{\log^2 N} \]

It has to be made clear that the number of representations of N as the sum of two primes is to be understood as the number of ordered pairs of primes whose sum is N. Hence 12 = 5 + 7 = 7 + 5 are considered different representations of 12 as sum of two primes. This can be adjusted by a factor of 2.
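The convention of counting ordered pairs can be made explicit with a short computation. The sketch below is an added illustration with hypothetical function names; it computes $R(N)$ as the number of ordered pairs and evaluates the shape $\prod_{p \mid N}(1 + 1/p)\, N / \log^2 N$ of the bound in Theorem 1.5, without its implied constant.

```python
from math import log


def is_prime(n):
    """Trial-division primality test."""
    if n < 2:
        return False
    p = 2
    while p * p <= n:
        if n % p == 0:
            return False
        p += 1
    return True


def goldbach_representations(N):
    """R(N): number of ordered pairs of primes (p, q) with p + q = N."""
    return sum(1 for p in range(2, N - 1) if is_prime(p) and is_prime(N - p))


def bound_shape(N):
    """prod_{p | N} (1 + 1/p) * N / log^2 N, without the implied constant."""
    product = 1.0
    for p in range(2, N + 1):
        if N % p == 0 and is_prime(p):
            product *= 1 + 1 / p
    return product * N / log(N) ** 2


if __name__ == "__main__":
    for N in (12, 100, 1000, 10000):
        print(N, goldbach_representations(N), round(bound_shape(N), 2))
```

For $N = 12$ the function counts $5 + 7$ and $7 + 5$ separately, matching the convention stated above.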

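Proof. Consider the sieving set

\[ \mathcal{A} = \{n(N - n) : n \le N\} \]

of $N$ elements. Let $\mathcal{P} = \mathbb{P}$ be the sieving range and $z = N^{1/8}$ the sieving level. In particular
\[ S(\mathcal{A}, \mathbb{P}, z) = \sum_{\substack{n \le N \\ (n(N-n), P(z)) = 1}} 1 \]

Let $z < n < N - z$. Assume that $p \mid n(N - n)$ for some $p < z$. Then $p \mid n$ or $p \mid (N - n)$, which implies $n \notin \mathbb{P}$ or $N - n \notin \mathbb{P}$, since $n > z > p$ and $N - n > z > p$. This means that $(n(N - n), P(z)) = 1$ if both $n \in \mathbb{P}$ and $N - n \in \mathbb{P}$, with $z < n < N - z$. Therefore
\[ R(N) = \sum_{\substack{p \le z \\ N - p \in \mathbb{P}}} 1 + \sum_{\substack{z < p < N - z \\ N - p \in \mathbb{P}}} 1 + \sum_{\substack{N - z \le p \\ N - p \in \mathbb{P}}} 1 \le z + S(\mathcal{A}, \mathbb{P}, z) + z = 2z + S(\mathcal{A}, \mathbb{P}, z) \]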

As in the proof of Theorem 1.3, consider the sets $\mathcal{A}_d = \{a \in \mathcal{A} : d \mid a\}$ for all square-free $d \in \mathbb{N}$, and let $f$ be the completely multiplicative function defined by

\[ f(p) = \begin{cases} \dfrac{1}{p} & \text{if } p \mid N \\[2mm] \dfrac{2}{p} & \text{otherwise} \end{cases} \]

Let $r(d) = |\mathcal{A}_d| - N f(d)$ and
\[ G(z) = \sum_{n < z} f(n) \]
By Theorem 1.1

\[ S(\mathcal{A}, \mathbb{P}, z) \le \frac{N}{G(z)} + \sum_{\substack{d < z^2 \\ d \mid P(z)}} 3^{\omega(d)} |r(d)| \]

and by Lemma 1.2

\[ \frac{1}{G(z)} \ll \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \frac{1}{\log^2 z} = \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \frac{64}{\log^2 N} \ll \prod_{p \mid N} \left(1 + \frac{1}{p}\right) \frac{1}{\log^2 N} \]

The same bound for the error term will be found as in Theorem 1.3. Write $d \mid P(z)$ as $d = q_1 \cdots q_k p_1 \cdots p_r$, with $q_j \mid N$ and $p_i \nmid N$ all distinct primes and $k, r \ge 0$. Now
\[ |\mathcal{A}_d| = \sum_{\substack{n \le N \\ d \mid n(N-n)}} 1 \]
which equals the number of $n \le N$ that solve the congruence $n(N - n) \equiv 0 \pmod{d}$. It is enough to solve this congruence modulo each prime dividing $d$ and then apply the Chinese Remainder Theorem, since $d$ is a product of distinct primes. Let $q \mid N$. Then $n(N - n) \equiv 0 \pmod{q}$ if and only if $n \equiv 0 \pmod{q}$, which corresponds to only $1$ residue class modulo $q$. Let $p \nmid N$. Then $n(N - n) \equiv 0 \pmod{p}$ if and only if $n \equiv 0 \pmod{p}$ or $N - n \equiv 0 \pmod{p}$, which corresponds to $2$ different residue classes modulo $p$. Hence, by the Chinese Remainder Theorem, $n(N - n) \equiv 0 \pmod{d}$ if and only if $n$ lies in one of $1^k 2^r = 2^r$ different residue classes modulo $d$. Therefore

\[ |\mathcal{A}_d| = \sum_{i=1}^{2^r} \sum_{\substack{n \le N \\ n \equiv \alpha_i \ (\mathrm{mod}\ d)}} 1 = \sum_{i=1}^{2^r} \left(\frac{N}{d} + O(1)\right) \]

where $\alpha_i$ are the said $2^r$ different residue classes modulo $d$. Thus
\[ |\mathcal{A}_d| = 2^r \left(\frac{N}{d} + O(1)\right) = N f(d) + O(2^r) \]

since $f(d) = \frac{2^r}{d}$, whence
\[ r(d) = O(2^r) = O(2^{\omega(d)}) \]
since $r \le \omega(d)$. Finally

X ω(d) X ω(d) ω(d) X ω(d) X ω(d) X 2 log 6 2+2 log 6 3 |r(d)| << 3 2 = 6 ≤ 6 < z log 2 < z log 2 d

2 Linear Sieve

Given a sieving set $\mathcal{A}$, a sieving range $\mathcal{P}$ and a sieving level $z$, recall the sieving function
\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{\substack{a \in \mathcal{A} \\ (a, P(z)) = 1}} 1 \]
where
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p \]
and the sets

\[ \mathcal{A}_d = \{a \in \mathcal{A} : d \mid a\} \]
for any square-free $d \in \mathbb{N}$. Let $f$ be a multiplicative function such that $0 < f(p) < 1$ for every $p \in \mathcal{P}$. Define
\[ r(d) = |\mathcal{A}_d| - |\mathcal{A}| f(d) \]
Then, by Legendre’s identity and Theorem B.1
\[ S(\mathcal{A}, \mathcal{P}, z) = |\mathcal{A}| \sum_{d \mid P(z)} \mu(d) f(d) + \sum_{d \mid P(z)} \mu(d) r(d) = |\mathcal{A}| \prod_{p \mid P(z)} (1 - f(p)) + \sum_{d \mid P(z)} \mu(d) r(d) = |\mathcal{A}| V(z) + R(z) \]
where
\[ V(z) = \prod_{p \mid P(z)} (1 - f(p)) \]
and
\[ R(z) = \sum_{d \mid P(z)} \mu(d) r(d) \]
The sum that defines the error term $R(z)$ has as many as $2^{\omega(P(z))}$ addends (at most $2^{\pi(z)}$), which often makes it larger than the main term. In a combinatorial sieve, the Möbius function is replaced by two arithmetic functions with properties similar to those of $\mu$, with the objective of reducing the size of $R(z)$. Two functions $\lambda^+$ and $\lambda^-$ are respectively called upper and lower bound sieves if

\[ \lambda^-(1) = 1 = \lambda^+(1) \]
and
\[ \sum_{d \mid n} \lambda^-(d) \le 0 \le \sum_{d \mid n} \lambda^+(d) \]
for every $n > 1$. Moreover, assume there exist $D > 0$ and a set of primes $\mathcal{P}_0$ such that

\[ \lambda^-(d) = 0 = \lambda^+(d) \]

for all $d \ge D$ and for all $d$ divisible by some prime $p \notin \mathcal{P}_0$. Then $\lambda^\pm$ are said to have support level $D$ and sieving range $\mathcal{P}_0$. A combinatorial sieve is said to have dimension $n > 0$ if

\[ \prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} \frac{1}{1 - f(p)} \le C \left(\frac{\log z}{\log u}\right)^n \]
for some $C > 1$ and for all $1 < u < z$. The case $n = 1$ is called a linear sieve.

The Jurkat-Richert Theorem is a result on linear sieves that provides upper and lower bounds for sieving functions. This theorem makes use of a general combinatorial sieve and of particular upper and lower bound sieves, which are studied in Section 2.1, together with some bounds for
\[ \sum_{d \mid P(z)} \lambda^\pm(d) f(d) \]
involving two functions $\Phi$ and $\varphi$ defined in Section 2.2.

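2.1 Combinatorial Sieve

Theorem 2.1. Let $\mathcal{A}$ be a sieving set, $\mathcal{P}$ a sieving range and $z$ a sieving level. Let $\mathcal{P}_0$ be a subset of $\mathcal{P}$ such that $\mathcal{Q} = \mathcal{P} \setminus \mathcal{P}_0$ is finite. Let $\lambda^\pm$ be upper and lower bound sieves with support level $D$ and sieving range $\mathcal{P}_0$, such that $|\lambda^\pm(d)| \le 1$ for all $d$. Let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p, \qquad P_0(z) = \prod_{\substack{p \in \mathcal{P}_0 \\ p < z}} p, \qquad Q(z) = \prod_{\substack{p \in \mathcal{Q} \\ p < z}} p, \qquad Q = \prod_{p \in \mathcal{Q}} p \]
Let $f$ be a multiplicative function such that $0 < f(p) < 1$ for every $p \in \mathcal{P}$, let $r(d) = |\mathcal{A}_d| - |\mathcal{A}| f(d)$, and let
\[ F(z, \lambda^\pm) = \sum_{d \mid P_0(z)} \lambda^\pm(d) f(d) \]

Then
\[ S(\mathcal{A}, \mathcal{P}, z) \le |\mathcal{A}| F(z, \lambda^+) \prod_{p \mid Q(z)} (1 - f(p)) + R \]
and
\[ S(\mathcal{A}, \mathcal{P}, z) \ge |\mathcal{A}| F(z, \lambda^-) \prod_{p \mid Q(z)} (1 - f(p)) - R \]
where
\[ R = \sum_{\substack{d < DQ \\ d \mid P(z)}} |r(d)| \]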


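Proof. Let $d \in \mathbb{N}$ be square-free. There exist unique $d_1$ and $d_2$ such that $d = d_1 d_2$, with $(d_1, Q) = 1$ and $d_2$ divisible only by primes in $\mathcal{Q}$. Define

\[ \Lambda^\pm(d) = \lambda^\pm(d_1) \mu(d_2) \]

Then $\Lambda^-(1) = 1 = \Lambda^+(1)$, since $\lambda^-(1) = 1 = \lambda^+(1)$. Moreover, for every $n \in \mathbb{N}$, there exist unique $n_1$ and $n_2$ such that $n = n_1 n_2$, with $(n_1, Q) = 1$ and $n_2$ divisible only by primes in $\mathcal{Q}$. Hence

\[ \sum_{d \mid n} \Lambda^\pm(d) = \sum_{d_1 d_2 \mid n_1 n_2} \lambda^\pm(d_1) \mu(d_2) = \sum_{d_1 \mid n_1} \lambda^\pm(d_1) \sum_{d_2 \mid n_2} \mu(d_2) \]

By Theorem B.1
\[ \sum_{d \mid n} \Lambda^-(d) \le 0 \le \sum_{d \mid n} \Lambda^+(d) \]
for every $n > 1$, since $\lambda^\pm$ are upper and lower bound sieves. Thus, $\Lambda^\pm$ are upper and lower bound sieves.

Let $p \mid d$ where $p \notin \mathcal{P}$. Then $p \notin \mathcal{P}_0$ and $p \notin \mathcal{Q}$. Write $d = d_1 d_2$, for some unique $d_1$ and $d_2$ with $(d_1, Q) = 1$ and $d_2$ divisible only by primes in $\mathcal{Q}$. It follows that $p \mid d_1$, since $(p, d_2) = 1$. Let $d_1 = p d_3$. Then $d = p d_3 d_2$, where $(p d_3, Q) = 1$ and $d_2$ is divisible only by primes in $\mathcal{Q}$. Therefore
\[ \Lambda^\pm(d) = \lambda^\pm(p d_3) \mu(d_2) = 0 \]

since $\lambda^\pm(p d_3) = 0$, because $\lambda^\pm$ are upper and lower bound sieves with sieving range $\mathcal{P}_0$ and $p \notin \mathcal{P}_0$. This implies that $\Lambda^\pm$ have sieving range $\mathcal{P}$.

Let $d \ge DQ$. Write $d = d_1 d_2$, for some unique $d_1$ and $d_2$ with $(d_1, Q) = 1$ and $d_2$ divisible only by primes in $\mathcal{Q}$. Then either $d_2 \le Q$, which implies $d_1 \ge D$ and in particular $\lambda^\pm(d_1) = 0$, since $\lambda^\pm$ have support level $D$; or $d_2 > Q$, which implies that $d_2$ is not square-free, and hence $\mu(d_2) = 0$. In either case, $\Lambda^\pm(d) = 0$. Therefore, $\Lambda^\pm$ have support level $DQ$. Finally

\[
\begin{aligned}
\sum_{a \in \mathcal{A}} \sum_{d \mid (a, P(z))} \Lambda^\pm(d) &= \sum_{d \mid P(z)} \Lambda^\pm(d) \sum_{\substack{a \in \mathcal{A} \\ d \mid a}} 1 = \sum_{d \mid P(z)} \Lambda^\pm(d) |\mathcal{A}_d| \\
&= \sum_{d \mid P(z)} \Lambda^\pm(d) |\mathcal{A}| f(d) + \sum_{d \mid P(z)} \Lambda^\pm(d) r(d) = |\mathcal{A}| \sum_{d \mid P(z)} \Lambda^\pm(d) f(d) + \sum_{\substack{d < DQ \\ d \mid P(z)}} \Lambda^\pm(d) r(d)
\end{aligned}
\]
where

\[ \sum_{d \mid P(z)} \Lambda^\pm(d) f(d) = \sum_{d_1 \mid P_0(z)} \sum_{d_2 \mid Q(z)} \Lambda^\pm(d_1 d_2) f(d_1 d_2) = \sum_{d_1 \mid P_0(z)} \lambda^\pm(d_1) f(d_1) \sum_{d_2 \mid Q(z)} \mu(d_2) f(d_2) \]
since $P(z) = P_0(z) Q(z)$ and $(P_0(z), Q(z)) = 1$, because $\mathcal{P}_0$ and $\mathcal{Q}$ are disjoint, and $f(d_1 d_2) = f(d_1) f(d_2)$ by multiplicativity of $f$. By Theorem B.1
\[ \sum_{a \in \mathcal{A}} \sum_{d \mid (a, P(z))} \Lambda^\pm(d) = |\mathcal{A}| F(z, \lambda^\pm) \prod_{p \mid Q(z)} (1 - f(p)) + \sum_{\substack{d < DQ \\ d \mid P(z)}} \Lambda^\pm(d) r(d) \]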


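Use the fact that $-1 \le \Lambda^\pm(d) \le 1$ to obtain
\[ S(\mathcal{A}, \mathcal{P}, z) = \sum_{\substack{a \in \mathcal{A} \\ (a, P(z)) = 1}} 1 \le \sum_{a \in \mathcal{A}} \sum_{d \mid (a, P(z))} \Lambda^+(d) \le |\mathcal{A}| F(z, \lambda^+) \prod_{p \mid Q(z)} (1 - f(p)) + \sum_{\substack{d < DQ \\ d \mid P(z)}} |r(d)| \]
and, analogously,
\[ S(\mathcal{A}, \mathcal{P}, z) \ge \sum_{a \in \mathcal{A}} \sum_{d \mid (a, P(z))} \Lambda^-(d) \ge |\mathcal{A}| F(z, \lambda^-) \prod_{p \mid Q(z)} (1 - f(p)) - \sum_{\substack{d < DQ \\ d \mid P(z)}} |r(d)| \]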

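Next, two functions will be defined and proved to be upper and lower bound sieves whose support level and sieving range can be chosen freely, and which are always bounded by $1$ in absolute value.

Lemma 2.2. Let $D > 0$ and let $\mathcal{P}$ be a set of primes. Let
\[ P(D) = \prod_{\substack{p \in \mathcal{P} \\ p < D}} p \]
and define the sets

\[ \mathcal{D}^+ = \{p_1 \cdots p_j : j \ge 0,\ p_j < \dots < p_1 < D,\ p_1 \cdots p_{i-1} p_i^3 < D \ \forall i \text{ odd}\} \]
and
\[ \mathcal{D}^- = \{p_1 \cdots p_j : j \ge 0,\ p_j < \dots < p_1 < D,\ p_1 \cdots p_{i-1} p_i^3 < D \ \forall i \text{ even}\} \]
Then, the functions $\lambda^+_{D,\mathcal{P}}$ and $\lambda^-_{D,\mathcal{P}}$ defined by
\[ \lambda^\pm_{D,\mathcal{P}}(d) = \begin{cases} \mu(d) & \text{if } d \in \mathcal{D}^\pm \text{ and } d \mid P(D) \\ 0 & \text{otherwise} \end{cases} \]
are upper and lower bound sieves, respectively, with support level $D$ and sieving range $\mathcal{P}$. In particular, $|\lambda^\pm_{D,\mathcal{P}}(d)| \le 1$ for all $d$.

Proof. Both $\mathcal{D}^\pm$ are finite sets of square-free natural numbers smaller than $D$. Moreover, $1 \in \mathcal{D}^\pm$. Indeed, let $d \in \mathcal{D}^+$ and write $d = p_1 \cdots p_{\omega(d)}$ with $p_{\omega(d)} < \dots < p_1$. If $\omega(d)$ is odd, then $d < p_1 \cdots p_{\omega(d)}^3 < D$. If $\omega(d)$ is even, then $d < p_1 \cdots p_{\omega(d)-1}^2 < p_1 \cdots p_{\omega(d)-1}^3 < D$. In either case, $d < D$ for every $d \in \mathcal{D}^+$. Analogously, $d < D$ for every $d \in \mathcal{D}^-$.

It is therefore enough to prove that $\lambda^\pm_{D,\mathcal{P}}$ are upper and lower bound sieves, for it is plain to see that their support level is $D$ and their sieving range is $\mathcal{P}$, by construction. First,
\[ \lambda^\pm_{D,\mathcal{P}}(1) = \mu(1) = 1 \]
since $1 \in \mathcal{D}^\pm$. It remains to be proved that

\[ \sum_{d \mid n} \lambda^-_{D,\mathcal{P}}(d) \le 0 \le \sum_{d \mid n} \lambda^+_{D,\mathcal{P}}(d) \]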


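for every $n > 1$. Without loss of generality, assume $n \mid P(D)$, since $\lambda^\pm_{D,\mathcal{P}}(d) = 0$ for all $d \nmid P(D)$. Let $n \mid P(D)$. Then $n$ is the product of $\omega(n)$ distinct primes in $\mathcal{P}$, all smaller than $D$. The proof goes by induction on $\omega(n)$. Let $\omega(n) = 1$. Then $n = p$, with $D > p \in \mathcal{P}$. In particular, $n \in \mathcal{D}^-$. Hence

\[ \sum_{d \mid n} \lambda^-_{D,\mathcal{P}}(d) = \mu(1) + \mu(p) = 1 - 1 = 0 \]
and
\[ \sum_{d \mid n} \lambda^+_{D,\mathcal{P}}(d) = \mu(1) + \lambda^+_{D,\mathcal{P}}(p) \ge 1 - 1 = 0 \]
which proves the case $\omega(n) = 1$. Assume the result holds for every $n$ with $\omega(n) = r \ge 1$. Let $n$ be such that $\omega(n) = r + 1$. Write $n = q_0 q_1 \cdots q_r$ with $q_r < \dots < q_0 < D$ and $q_i \in \mathcal{P}$ for every $i$. Let
\[ N = \frac{n}{q_0} = q_1 \cdots q_r \]
Then $N \mid P(D)$ and $\omega(N) = r$. By the induction hypothesis

\[ \sum_{d \mid N} \lambda^-_{D,\mathcal{P}}(d) \le 0 \le \sum_{d \mid N} \lambda^+_{D,\mathcal{P}}(d) \]

Every divisor of $n$ is either of the form $d$ or $q_0 d$, where $d \mid N$. Thus

\[ \sum_{d \mid n} \lambda^+_{D,\mathcal{P}}(d) = \sum_{d \mid N} \lambda^+_{D,\mathcal{P}}(d) + \sum_{d \mid N} \lambda^+_{D,\mathcal{P}}(q_0 d) \ge \sum_{d \mid N} \lambda^+_{D,\mathcal{P}}(q_0 d) = \sum_{\substack{d \mid N \\ q_0 d \in \mathcal{D}^+}} \mu(q_0 d) = -\sum_{\substack{d \mid N \\ q_0 d \in \mathcal{D}^+}} \mu(d) \]
and likewise

\[ \sum_{d \mid n} \lambda^-_{D,\mathcal{P}}(d) = \sum_{d \mid N} \lambda^-_{D,\mathcal{P}}(d) + \sum_{d \mid N} \lambda^-_{D,\mathcal{P}}(q_0 d) \le \sum_{d \mid N} \lambda^-_{D,\mathcal{P}}(q_0 d) = \sum_{\substack{d \mid N \\ q_0 d \in \mathcal{D}^-}} \mu(q_0 d) = -\sum_{\substack{d \mid N \\ q_0 d \in \mathcal{D}^-}} \mu(d) \]

Let $d \mid N$. Write $d = p_1 \cdots p_s$, with $p_s < \dots < p_1 \le q_1 < q_0 < D$ and $p_i \in \mathcal{P}$ for every $i$. Let
\[ E = \frac{D}{q_0} \]
and let $\mathcal{E}^\pm$ be the sets $\mathcal{D}^\pm$ with $E$ in place of $D$. Then $q_0 d \in \mathcal{D}^+$ if and only if $q_0^3 < D$ and $p_1 \cdots p_{i-1} p_i^3 < E$ for every even $i$. Hence, assuming $q_0^3 < D$, the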


+ − − + condition q0d ∈ D is equivalent to d ∈ E . Similarly, q0d ∈ D if and only if d ∈ E , 3 provided that q0 < D. Moreover X X X X 0 = µ(d) = µ(d) = µ(d) = µ(d) d|N d|N d|N d|N − + − + q0d∈D q0d∈D d∈E d∈E 3 3 3 q3≥D q0 ≥D q0 ≥D q0 ≥D 0 since all the above sums are empty. Therefore X X X X µ(d) = µ(d) = µ(d) = µ(d) ≤ 0 d|N d|N d|N d|N + + − − q0d∈D q0d∈D d∈E d∈E 3 3 q0

X + λD,P (d) ≥ 0 d|n

Analogously X X X X µ(d) = µ(d) = µ(d) = µ(d) ≥ 0 d|N d|N d|N d|N − − + + q0d∈D q0d∈D d∈E d∈E 3 q3
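The defining inequalities of Lemma 2.2 can be tested by brute force for a small support level. The sketch below is an added illustration, not the author's code: it takes $\mathcal{P}$ to be all primes and $D$ a small integer (both assumptions made here), builds $\lambda^\pm_{D,\mathcal{P}}$ from the sets $\mathcal{D}^\pm$, and checks $\sum_{d \mid n} \lambda^-(d) \le 0 \le \sum_{d \mid n} \lambda^+(d)$ for every $n > 1$ dividing $P(D)$.

```python
from itertools import combinations


def primes_below(D):
    """Primes p < D for an integer D (trial division)."""
    return [p for p in range(2, D)
            if all(p % q != 0 for q in range(2, int(p ** 0.5) + 1))]


def in_truncated_set(primes_desc, D, parity):
    """primes_desc = (p_1 > p_2 > ... > p_j), all below D.

    True if p_1 ... p_{i-1} p_i^3 < D for every index i with i % 2 == parity
    (parity 1 describes D^+, parity 0 describes D^-).
    """
    partial = 1
    for i, p in enumerate(primes_desc, start=1):
        if i % 2 == parity and partial * p ** 3 >= D:
            return False
        partial *= p
    return True


def sieve_weight(primes_desc, D, sign):
    """lambda^+ (sign = +1) or lambda^- (sign = -1) at d = product of primes_desc."""
    parity = 1 if sign > 0 else 0
    if in_truncated_set(primes_desc, D, parity):
        return (-1) ** len(primes_desc)  # mu(d) for square-free d
    return 0


def divisor_sum(n_primes, D, sign):
    """Sum of lambda^{sign}(d) over all divisors d of n = product of n_primes."""
    ordered = sorted(n_primes, reverse=True)
    # combinations() keeps the input order, so every subset is descending.
    return sum(sieve_weight(subset, D, sign)
               for r in range(len(ordered) + 1)
               for subset in combinations(ordered, r))


def check_bound_sieves(D):
    """Check the upper/lower bound property for every n > 1 dividing P(D)."""
    primes = primes_below(D)
    for r in range(1, len(primes) + 1):
        for n_primes in combinations(primes, r):
            if not (divisor_sum(n_primes, D, -1) <= 0 <= divisor_sum(n_primes, D, +1)):
                return False
    return True


if __name__ == "__main__":
    print(check_bound_sieves(30))
```

Only divisors $n$ of $P(D)$ are tested, which is the case treated in the induction above.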

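The objective is to bound $\sum_{d \mid P(z)} \lambda^\pm_{D,\mathcal{P}}(d) f(d)$, where $f$ is a multiplicative function. The following lemma expresses this sum in terms of a sum of some other functions $T_n$; a bound for $T_n$ will be found in the next subsection.

Lemma 2.3. Let $2 \le z < D$ and let $\mathcal{P}$ be a set of primes. Let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p \]
Let $f$ be a multiplicative function and

\[ F(z, \lambda^\pm_{D,\mathcal{P}}) = \sum_{d \mid P(z)} \lambda^\pm_{D,\mathcal{P}}(d) f(d) \]

where $\lambda^\pm_{D,\mathcal{P}}$ are the upper and lower bound sieves with support level $D$ and sieving range $\mathcal{P}$ defined in Lemma 2.2. Then

\[ F(z, \lambda^+_{D,\mathcal{P}}) = V(z) + \sum_{n \text{ odd}} T_n(D, z) \]
and
\[ F(z, \lambda^-_{D,\mathcal{P}}) = V(z) - \sum_{n \text{ even}} T_n(D, z) \]
where
\[ V(z) = \prod_{p \mid P(z)} (1 - f(p)) \]
and
\[ T_n(D, z) = \sum_{\substack{p_1, \dots, p_n \in \mathcal{P} \\ w_n \le p_n < \dots < p_1 < z \\ p_i < w_i \ \forall i < n,\ i \equiv n \ (\mathrm{mod}\ 2)}} f(p_1 \cdots p_n) V(p_n) \]
with $w_i$ defined by
\[ p_1 \cdots p_i w_i^2 = D \]

Proof. By definition of the sets $\mathcal{D}^\pm$

\[ F(z, \lambda^\pm_{D,\mathcal{P}}) = \sum_{d \mid P(z)} \lambda^\pm_{D,\mathcal{P}}(d) f(d) = \sum_{\substack{d \mid P(z) \\ d \in \mathcal{D}^\pm}} \mu(d) f(d) \]

Let $d \mid P(z)$ be such that $d \in \mathcal{D}^+$. Then $d$ is a product of distinct primes in $\mathcal{P}$ smaller than $z \le D$ such that $p_1 \cdots p_{i-1} p_i^3 < D$ for every odd $i$, which is equivalent to $p_i^2 < \frac{D}{p_1 \cdots p_i} = w_i^2$, that is, $p_i < w_i$. Therefore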

+ X j F (z, λD,P ) = (−1) f(p1 . . . pj) p1,...,pj ∈P pj <...

X X j V (z) = µ(d)f(d) = (−1) f(p1 . . . pj)

d|P (z) p1,...,pj ∈P pj <...

Hence

X j X j V (z) = (−1) f(p1 . . . pj) + (−1) f(p1 . . . pj)

p1,...,pj ∈P p1,...,pj ∈P pj <...

p1,...,pj ∈P pj <...

equals, for any fixed n

X n X j−n (−1) f(p1 . . . pn) (−1) f(pn+1 . . . pj)

p1,...,pn∈P pn+1,...,pj ∈P wn≤pn<...

X j−n X (−1) f(pn+1 . . . pj) = µ(pn+1 . . . pj)f(pn+1 . . . pj) = V (pn)

pn+1,...,pj ∈P pn+1,...,pj ∈P pj <...

Hence

+ X X + X V (z) = F (z, λD,P ) − f(p1 . . . pn)V (pn) = F (z, λD,P ) − Tn(D, z) n odd p1,...,pn∈P n odd wn≤pn<...

By an analogous argument

− X V (z) = F (z, λD,P ) + Tn(D, z) n even

2.2 The Functions Φ and φ

Lemma 2.3 characterizes the functions $F(z, \lambda^\pm_{D,\mathcal{P}})$, used to bound the sieving function in a combinatorial sieve (Theorem 2.1), in terms of a sum of functions $T_n(D, z)$. Two functions $\Phi$ and $\varphi$ are now introduced in order to bound $T_n$ further.

Let
\[ \Phi(x) = e^\gamma \left(u(x) + \frac{v(x)}{x}\right) \]
and
\[ \varphi(x) = e^\gamma \left(u(x) - \frac{v(x)}{x}\right) \]

for $x > 0$, where $\gamma$ is Euler’s constant, $u(x) = \frac{1}{x}$ and $v(x) = 1$ for $0 < x \le 2$, and
\[ (x u(x))' = u(x - 1), \qquad v'(x) = -\frac{v(x - 1)}{x - 1} \]

for $x \ge 2$. From this definition, it follows that $\Phi(x) = \frac{2e^\gamma}{x}$ and $\varphi(x) = 0$ for $0 < x \le 2$. Moreover, for $x \ge 2$
\[ x(\Phi(x) + \varphi(x)) = 2e^\gamma x u(x) \]
and
\[ x(\Phi(x) - \varphi(x)) = 2e^\gamma v(x) \]

Whence
\[ (x\Phi(x))' + (x\varphi(x))' = 2e^\gamma (x u(x))' = 2e^\gamma u(x - 1) \]
and
\[ (x\Phi(x))' - (x\varphi(x))' = 2e^\gamma v'(x) = -2e^\gamma \frac{v(x - 1)}{x - 1} \]
for $x \ge 2$. Therefore

\[ (x\Phi(x))' = \varphi(x - 1), \qquad (x\varphi(x))' = \Phi(x - 1) \]
for $x \ge 2$. Hence
\[ \int_2^x \varphi(s - 1)\, ds = x\Phi(x) - 2\Phi(2) = x\Phi(x) - 2e^\gamma \]

for all $x \ge 2$. Then $\Phi(x) = \frac{2e^\gamma}{x}$ for $2 \le x \le 3$, since $\int_2^x \varphi(s - 1)\, ds = 0$, because $\varphi(s - 1) = 0$ when $s - 1 \le 2$. Hence

Lemma 2.4. Let $0 < x \le 3$. Then
\[ \Phi(x) = \frac{2e^\gamma}{x} \]

Similarly
\[ \int_2^x \Phi(s - 1)\, ds = x\varphi(x) - 2\varphi(2) = x\varphi(x) \]
for all $x \ge 2$, since $\varphi(2) = 0$. Then, for $2 \le x \le 4$, by Lemma 2.4
\[ \varphi(x) = \frac{1}{x} \int_2^x \Phi(s - 1)\, ds = \frac{2e^\gamma}{x} \int_2^x \frac{1}{s - 1}\, ds = \frac{2e^\gamma \log(x - 1)}{x} \]
Thus

Lemma 2.5. Let $2 < x \le 4$. Then
\[ \varphi(x) = 2e^\gamma \frac{\log(x - 1)}{x} \]
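The closed forms in Lemmas 2.4 and 2.5 can be cross-checked against the integral relations from which they come. The sketch below is an added illustration: the Simpson quadrature, the hard-coded value of Euler’s constant, and the function names are choices made here. The last function uses the relation $x\Phi(x) = 2e^\gamma + \int_2^x \varphi(s - 1)\, ds$ to evaluate $\Phi$ on $3 \le x \le 5$, a range beyond the closed forms stated above.

```python
from math import exp, log

EULER_GAMMA = 0.5772156649015329  # Euler's constant, hard-coded
C = 2 * exp(EULER_GAMMA)          # the constant 2 e^gamma


def simpson(g, a, b, n=2000):
    """Composite Simpson rule for the integral of g over [a, b] (n even)."""
    if a == b:
        return 0.0
    h = (b - a) / n
    total = g(a) + g(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * g(a + k * h)
    return total * h / 3


def Phi_closed(x):
    """Lemma 2.4: Phi(x) = 2 e^gamma / x, valid for 0 < x <= 3."""
    return C / x


def phi_closed(x):
    """Lemma 2.5: phi(x) = 2 e^gamma log(x - 1) / x, valid for 2 <= x <= 4."""
    return C * log(x - 1) / x


def phi_numeric(x):
    """phi(x) for 2 <= x <= 4 from x phi(x) = int_2^x Phi(s - 1) ds."""
    return simpson(lambda s: Phi_closed(s - 1), 2, x) / x


def Phi_numeric(x):
    """Phi(x) for 3 <= x <= 5 from x Phi(x) = 2 e^gamma + int_3^x phi(s - 1) ds.

    The integral starts at 3 because phi(s - 1) = 0 for s <= 3.
    """
    return (C + simpson(lambda s: phi_closed(s - 1), 3, x)) / x


if __name__ == "__main__":
    for x in (2.5, 3.0, 4.0):
        print(x, phi_numeric(x), phi_closed(x))
    print(Phi_numeric(4.0))
```

The first loop should print pairs that agree to many decimal places.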

Lastly, $\Phi$ and $\varphi$ are expressed as sums of the functions $f_n$ defined on pages 245-246 of [5], in the following way:
\[ \Phi(s) = 1 + \sum_{n \text{ odd}} f_n(s) \]
\[ \varphi(s) = 1 - \sum_{n \text{ even}} f_n(s) \]

The functions $f_n$ are used to bound the functions $T_n$ defined in Lemma 2.3, whenever the sieve is linear, as follows:

Lemma 2.6. Let $z \ge 2$ and let $\mathcal{P}$ be a set of primes. Let $f$ be a multiplicative function such that $0 < f(p) < 1$ for every $p \in \mathcal{P}$, and such that

\[ \prod_{\substack{p \in \mathcal{P} \\ u \le p < z}} \frac{1}{1 - f(p)} \le (1 + \varepsilon) \frac{\log z}{\log u} \]

for some $0 < \varepsilon < \frac{1}{200}$ and for all $1 < u < z$. Then,

\[ T_n(D, z) < V(z) \left( f_n\left(\frac{\log D}{\log z}\right) + \varepsilon\, 0.99^n e^{10 - \frac{\log D}{\log z}} \right) \]
for every odd $n$ and $D \ge z$, and for every even $n$ and $D \ge z^2$, where
\[ V(z) = \prod_{p \mid P(z)} (1 - f(p)) \]

A detailed proof of the above can be found on pages 253-256 of [5].

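2.3 The Jurkat-Richert Theorem

The Jurkat-Richert Theorem provides both an upper and a lower bound for sieving functions when the sieve is linear. The bounds depend on the functions $\Phi$ and $\varphi$ detailed above.

Theorem 2.7 (Jurkat-Richert). Let $\mathcal{A}$ be a sieving set, $\mathcal{P}$ a sieving range and $z \ge 2$ a sieving level. Let
\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p \]
Let $\mathcal{Q}$ be a finite subset of $\mathcal{P}$ and $Q = \prod_{p \in \mathcal{Q}} p$. Let $f$ be a multiplicative function such that $0 < f(p) < 1$ for every $p \in \mathcal{P}$, let $r(d) = |\mathcal{A}_d| - |\mathcal{A}| f(d)$, and assume that

\[ \prod_{\substack{p \in \mathcal{P} \setminus \mathcal{Q} \\ u \le p < z}} \frac{1}{1 - f(p)} \le (1 + \varepsilon) \frac{\log z}{\log u} \]

for some $0 < \varepsilon < \frac{1}{200}$ and for all $1 < u < z$. Then, for any $D \ge z$

\[ S(\mathcal{A}, \mathcal{P}, z) < \left( \Phi\left(\frac{\log D}{\log z}\right) + \varepsilon e^{14 - \frac{\log D}{\log z}} \right) |\mathcal{A}| V(z) + R \]
and for any $D \ge z^2$

\[ S(\mathcal{A}, \mathcal{P}, z) > \left( \varphi\left(\frac{\log D}{\log z}\right) - \varepsilon e^{14 - \frac{\log D}{\log z}} \right) |\mathcal{A}| V(z) - R \]
where the functions $\Phi$ and $\varphi$ are the ones defined in Section 2.2,
\[ V(z) = \prod_{p \mid P(z)} (1 - f(p)) \]
and
\[ R = \sum_{\substack{d < DQ \\ d \mid P(z)}} |r(d)| \]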


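Proof. Let $\mathcal{P}' = \mathcal{P} \setminus \mathcal{Q}$. By Lemma 2.2, the functions $\lambda^{\pm}_{D,\mathcal{P}'}$ defined there are upper and lower bound sieves with support level $D$ and sieving range $\mathcal{P}'$, and such that $|\lambda^{\pm}_{D,\mathcal{P}'}(d)| \le 1$ for every $d$. Let

\[ P'(z) = \prod_{\substack{p \in \mathcal{P}' \\ p < z}} p, \qquad V'(z) = \prod_{p \mid P'(z)} (1 - f(p)) \]

By Lemma 2.3 and Lemma 2.6

\[ F(z, \lambda^{+}_{D,\mathcal{P}'}) = V'(z) + \sum_{n \text{ odd}} T_n(D, z) < V'(z) \left( 1 + \sum_{n \text{ odd}} f_n\!\left( \frac{\log D}{\log z} \right) + \varepsilon e^{10 - \frac{\log D}{\log z}} \sum_{n \text{ odd}} 0.99^n \right) < V'(z) \left( \Phi\!\left( \frac{\log D}{\log z} \right) + \varepsilon e^{14 - \frac{\log D}{\log z}} \right) \]

for any $D \ge z$, since $\sum_{n \text{ odd}} 0.99^n = \frac{9900}{199} < e^4$. Again, by Lemma 2.3 and Lemma 2.6

\[ F(z, \lambda^{-}_{D,\mathcal{P}'}) = V'(z) - \sum_{n \text{ even}} T_n(D, z) > V'(z) \left( 1 - \sum_{n \text{ even}} f_n\!\left( \frac{\log D}{\log z} \right) - \varepsilon e^{10 - \frac{\log D}{\log z}} \sum_{n \text{ even}} 0.99^n \right) > V'(z) \left( \varphi\!\left( \frac{\log D}{\log z} \right) - \varepsilon e^{14 - \frac{\log D}{\log z}} \right) \]

for any $D \ge z^2$, since $\sum_{n \text{ even}} 0.99^n = \frac{9801}{199} < e^4$. Hence, by Theorem 2.1

\[ S(A, \mathcal{P}, z) \le |A| F(z, \lambda^{+}_{D,\mathcal{P}'}) \prod_{p \mid Q(z)} (1 - f(p)) + R < |A| \left( \Phi\!\left( \frac{\log D}{\log z} \right) + \varepsilon e^{14 - \frac{\log D}{\log z}} \right) \prod_{p \mid P'(z)} (1 - f(p)) \prod_{p \mid Q(z)} (1 - f(p)) + R = \left( \Phi\!\left( \frac{\log D}{\log z} \right) + \varepsilon e^{14 - \frac{\log D}{\log z}} \right) |A| V(z) + R \]

for any $D \ge z$, and similarly

\[ S(A, \mathcal{P}, z) \ge |A| F(z, \lambda^{-}_{D,\mathcal{P}'}) \prod_{p \mid Q(z)} (1 - f(p)) - R > |A| \left( \varphi\!\left( \frac{\log D}{\log z} \right) - \varepsilon e^{14 - \frac{\log D}{\log z}} \right) \prod_{p \mid P'(z)} (1 - f(p)) \prod_{p \mid Q(z)} (1 - f(p)) - R = \left( \varphi\!\left( \frac{\log D}{\log z} \right) - \varepsilon e^{14 - \frac{\log D}{\log z}} \right) |A| V(z) - R \]

for any $D \ge z^2$.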
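The two geometric sums used in the proof to pass from $e^{10 - \log D / \log z}$ to $e^{14 - \log D / \log z}$ can be checked exactly. The following is a minimal sketch; it assumes that the even sum starts at $n = 2$ (which is what the value 9801/199 suggests), and the helper name `geometric_sum` is only illustrative.

```python
from fractions import Fraction
import math


def geometric_sum(ratio, first_exponent):
    """Exact value of ratio**n summed over n = first_exponent, first_exponent + 2, ..."""
    return ratio ** first_exponent / (1 - ratio ** 2)


ratio = Fraction(99, 100)
odd_sum = geometric_sum(ratio, 1)   # n = 1, 3, 5, ...
even_sum = geometric_sum(ratio, 2)  # n = 2, 4, 6, ... (assumed starting index)

print(odd_sum, even_sum)
print(float(odd_sum) < math.exp(4), float(even_sum) < math.exp(4))
```

Both sums stay below $e^4$, which is all the proof requires.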


A relevant remark on Theorem 2.7 is that the hypothesis

Y 1 log z ≤ (1 + ε) 1 − f(p) log u p∈P\Q u≤p

Y 1 Y 1 ≤ << 1 1 − f(p) 1 − f(p) p∈Q p∈Q u≤p

Y 1 Y 1 Y 1 log z = << 1 − f(p) 1 − f(p) 1 − f(p) log u p∈P p∈P\Q p∈Q u≤p

1 Y 1 = << log z V (z) 1 − f(p) p∈P p 2.

3 Large Sieve

All previous sieves rely on the Möbius function. In this brief section, however, a completely different type of sieve is introduced. Given a sequence $(a_n)_n$ of complex numbers, a sequence $(b_r)_r$ of rational numbers and $R, x \ge 1$, a large sieve is an inequality of the form

\[ \sum_{r \le R} \left| \sum_{n \le x} a_n e^{2\pi i n b_r} \right|^2 \le C \sum_{n \le x} |a_n|^2 \]

where $C$ depends on $R$ and $x$ only, and hence neither on $b_r$ nor on $a_n$.

One of the most advanced applications of large sieves is the Bombieri-Vinogradov Theorem. This is a result on the average distribution of primes in congruence classes of large moduli. To be precise, it concerns the error term in approximating the number of primes up to $x$ in any single class modulo $d$ by $\frac{\pi(x)}{\phi(d)}$, on average over $d$, where $\phi$ is Euler's totient function. Under a generalised Riemann hypothesis, the error term of the above approximation is $O(\sqrt{x} \log x)$ before averaging, which is essentially the bound that the Bombieri-Vinogradov Theorem provides after averaging.

Theorem 3.1 (Bombieri-Vinogradov). Let x ≥ 1, n ∈ N and A > 0. Then, there exists β(A) > 0 such that

X π(x) x max π(x, n mod d) − <

Theorem 3.2 (Siegel-Walfisz). Let x ≥ 1, n ∈ N and A > 0. Then

\[ \pi(x, n \bmod d) - \frac{\pi(x)}{\phi(d)} \ll_A \frac{x}{\log^A x} \]

where $\pi(x, n \bmod d)$ is the number of primes up to $x$ congruent to $n$ modulo $d$.

3.1 A Large Sieve Inequality

Returning to an elementary large sieve, consider a differentiable function $f : [0, 1] \to \mathbb{C}$ with continuous derivative, extended by periodicity to all of $\mathbb{R}$ (with period one). Let $R \ge 1$ and $u \in [0, 1]$. For all natural numbers $r \le R$ and $h \le r$ with $(h, r) = 1$, one has

\[ -f\!\left( \tfrac{h}{r} \right) = -f(u) + \int_{h/r}^{u} f'(t) \, dt \]

by the Fundamental Theorem of Calculus. Take absolute values on either side of the above to obtain

\[ \left| f\!\left( \tfrac{h}{r} \right) \right| \le |f(u)| + \left| \int_{h/r}^{u} |f'(t)| \, dt \right| \]


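Consider now the family of intervals $\left( \frac{h}{r} - \frac{1}{2R^2}, \frac{h}{r} + \frac{1}{2R^2} \right)$ where $h$ and $r$ are natural numbers with $h \le r \le R$ and $(h, r) = 1$. Then, the union of them all is contained in $[0, 1]$, since $\frac{h}{r} - \frac{1}{2R^2} \ge \frac{1}{R} - \frac{1}{2R^2} > 0$ and $\frac{h}{r} + \frac{1}{2R^2} \le \frac{r-1}{r} + \frac{1}{2R^2} \le 1 - \frac{1}{R} + \frac{1}{2R^2} < 1$, since $R \ge 1$. Moreover, said intervals are nonoverlapping. Indeed, given $x \in \left( \frac{h}{r} - \frac{1}{2R^2}, \frac{h}{r} + \frac{1}{2R^2} \right) \cap \left( \frac{h'}{r'} - \frac{1}{2R^2}, \frac{h'}{r'} + \frac{1}{2R^2} \right)$ with $\frac{h}{r} \ne \frac{h'}{r'}$, then $\left| x - \frac{h}{r} \right| < \frac{1}{2R^2}$ and $\left| x - \frac{h'}{r'} \right| < \frac{1}{2R^2}$. On the one hand, by the triangle inequality

\[ \left| \frac{h}{r} - \frac{h'}{r'} \right| \le \left| x - \frac{h}{r} \right| + \left| x - \frac{h'}{r'} \right| < \frac{1}{2R^2} + \frac{1}{2R^2} = \frac{1}{R^2} \]

On the other hand, $hr' - h'r \ne 0$ since $0 \ne \frac{h}{r} - \frac{h'}{r'} = \frac{hr' - h'r}{rr'}$. Hence $|hr' - h'r| \ge 1$, since $h, h', r, r'$ are natural numbers. Therefore

\[ \left| \frac{h}{r} - \frac{h'}{r'} \right| = \frac{|hr' - h'r|}{rr'} \ge \frac{1}{rr'} \ge \frac{1}{R^2} \]

which is a contradiction. Before integrating the previous inequality over the interval $\left( \frac{h}{r} - \frac{1}{2R^2}, \frac{h}{r} + \frac{1}{2R^2} \right)$ with respect to $u$ and adding over all such intervals, take into account that

\[ \int_{\frac{h}{r} - \frac{1}{2R^2}}^{\frac{h}{r} + \frac{1}{2R^2}} \left| \int_{h/r}^{u} |f'(t)| \, dt \right| du \le \int_{\frac{h}{r} - \frac{1}{2R^2}}^{\frac{h}{r} + \frac{1}{2R^2}} \int_{\frac{h}{r} - \frac{1}{2R^2}}^{\frac{h}{r} + \frac{1}{2R^2}} |f'(t)| \, dt \, du = \frac{1}{R^2} \int_{\frac{h}{r} - \frac{1}{2R^2}}^{\frac{h}{r} + \frac{1}{2R^2}} |f'(t)| \, dt \]

Hence, integrating and adding over all intervals, and writing $I_{h,r} = \left( \frac{h}{r} - \frac{1}{2R^2}, \frac{h}{r} + \frac{1}{2R^2} \right)$, one obtains

\[ \frac{1}{R^2} \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \left| f\!\left( \tfrac{h}{r} \right) \right| = \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \int_{I_{h,r}} \left| f\!\left( \tfrac{h}{r} \right) \right| du \]
\[ \le \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \int_{I_{h,r}} |f(u)| \, du + \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \int_{I_{h,r}} \left| \int_{h/r}^{u} |f'(t)| \, dt \right| du \]
\[ \le \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \int_{I_{h,r}} |f(u)| \, du + \frac{1}{R^2} \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \int_{I_{h,r}} |f'(t)| \, dt \]
\[ \le \int_0^1 |f(u)| \, du + \frac{1}{R^2} \int_0^1 |f'(t)| \, dt = \int_0^1 |f(u)| \, du + \frac{1}{R^2} \int_0^1 |f'(u)| \, du \]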

Given $x \ge 1$ and a sequence $(a_n)_n$ of complex numbers, let

\[ f(u) = S(u)^2 \qquad \text{where} \qquad S(u) = \sum_{n \le x} a_n e^{2\pi i n u} \]

Then $f$ clearly is a differentiable function $f : \mathbb{R} \to \mathbb{C}$ with continuous derivative and $f(u + 1) = f(u)$, for every $u \in [0, 1]$. Moreover

\[ f'(u) = 2 S(u) S'(u) \]


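Therefore, by the above reasoning

\[ \frac{1}{R^2} \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \left| S\!\left( \tfrac{h}{r} \right) \right|^2 \le \int_0^1 |S(u)|^2 \, du + \frac{2}{R^2} \int_0^1 |S(u)| |S'(u)| \, du \]

First

\[ \int_0^1 |S(u)|^2 \, du = \int_0^1 \left( \sum_{n \le x} a_n e^{2\pi i n u} \right) \left( \sum_{m \le x} \bar{a}_m e^{-2\pi i m u} \right) du = \sum_{n \le x} \sum_{m \le x} a_n \bar{a}_m \int_0^1 e^{2\pi i (n - m) u} \, du = \sum_{n \le x} a_n \bar{a}_n = \sum_{n \le x} |a_n|^2 \]

And second, by the Cauchy-Schwarz inequality

\[ \int_0^1 |S(u)| |S'(u)| \, du \le \left( \int_0^1 |S(u)|^2 \, du \right)^{1/2} \left( \int_0^1 |S'(u)|^2 \, du \right)^{1/2} \]

where the square of the first factor has just been dealt with above, and the square of the second similarly equals

\[ \int_0^1 |S'(u)|^2 \, du = \int_0^1 \left( \sum_{n \le x} 2\pi i n a_n e^{2\pi i n u} \right) \left( \sum_{m \le x} (-2\pi i m) \bar{a}_m e^{-2\pi i m u} \right) du = 4\pi^2 \sum_{n \le x} \sum_{m \le x} n m a_n \bar{a}_m \int_0^1 e^{2\pi i (n - m) u} \, du = 4\pi^2 \sum_{n \le x} n^2 a_n \bar{a}_n \le 4\pi^2 x^2 \sum_{n \le x} |a_n|^2 \]

Whence

\[ \frac{1}{R^2} \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \left| S\!\left( \tfrac{h}{r} \right) \right|^2 \le \sum_{n \le x} |a_n|^2 + \frac{2}{R^2} \left( \sum_{n \le x} |a_n|^2 \right)^{1/2} \left( 4\pi^2 x^2 \sum_{n \le x} |a_n|^2 \right)^{1/2} = \left( 1 + \frac{4\pi x}{R^2} \right) \sum_{n \le x} |a_n|^2 \]

Multiplying by $R^2$ yields the following.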

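Theorem 3.3. Let $(a_n)_n$ be a sequence of complex numbers and $R, x \ge 1$. Then

\[ \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \left| \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \le (R^2 + 4\pi x) \sum_{n \le x} |a_n|^2 \]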
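The inequality of Theorem 3.3 lends itself to a direct numerical sanity check. The sketch below evaluates both sides for a finite list of coefficients; the function name `large_sieve_sides` and the choice of random test data are illustrative assumptions, and $x$ is taken to be the (integer) length of the list.

```python
import cmath
import math
import random


def large_sieve_sides(a, R):
    """Return (lhs, rhs) of the inequality of Theorem 3.3 for a = [a_1, ..., a_x]."""
    x = len(a)
    lhs = 0.0
    for r in range(1, R + 1):
        for h in range(1, r + 1):
            if math.gcd(h, r) != 1:
                continue
            # inner exponential sum over n <= x at the fraction h / r
            s = sum(a_n * cmath.exp(2j * math.pi * n * h / r)
                    for n, a_n in enumerate(a, start=1))
            lhs += abs(s) ** 2
    rhs = (R ** 2 + 4 * math.pi * x) * sum(abs(a_n) ** 2 for a_n in a)
    return lhs, rhs


random.seed(0)
coefficients = [complex(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(30)]
lhs, rhs = large_sieve_sides(coefficients, 8)
print(lhs <= rhs)
```

Such a check cannot replace the proof, but it helps catch transcription errors in the constants.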

The next step is to improve the above result by introducing Dirichlet characters (see Appendix C).


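Theorem 3.4. Let $(a_n)_n$ be a sequence of complex numbers and $R, x \ge 1$. Then

\[ \sum_{r \le R} \frac{r}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod r} \left| \sum_{n \le x} a_n \chi(n) \right|^2 \le (R^2 + 4\pi x) \sum_{n \le x} |a_n|^2 \]

where $\sum^*_{\chi \bmod r}$ denotes the sum over primitive characters $\chi$ modulo $r$.

Proof. Let $r \le R$ and $(n, r) = 1$. Let $\chi$ be a primitive character modulo $r$. Then, by Lemma C.2

\[ \chi(n) = \frac{1}{\tau(\bar{\chi})} \sum_{h \le r} \bar{\chi}(h) e^{2\pi i n \frac{h}{r}} \]

Multiply by $a_n$, sum over $n \le x$, take the modulus and square to obtain

\[ \left| \sum_{n \le x} a_n \chi(n) \right|^2 = \frac{1}{|\tau(\bar{\chi})|^2} \left| \sum_{n \le x} a_n \sum_{h \le r} \bar{\chi}(h) e^{2\pi i n \frac{h}{r}} \right|^2 = \frac{1}{r} \left| \sum_{h \le r} \bar{\chi}(h) \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \]

by Lemma C.3. Sum over all primitive characters modulo $r$

\[ \sideset{}{^*}\sum_{\chi \bmod r} \left| \sum_{n \le x} a_n \chi(n) \right|^2 = \frac{1}{r} \sideset{}{^*}\sum_{\chi \bmod r} \left| \sum_{h \le r} \bar{\chi}(h) \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \le \frac{1}{r} \sum_{\chi \bmod r} \left| \sum_{h \le r} \bar{\chi}(h) \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \]
\[ = \frac{1}{r} \sum_{\chi \bmod r} \left( \sum_{h_1 \le r} \bar{\chi}(h_1) \sum_{n \le x} a_n e^{2\pi i n \frac{h_1}{r}} \right) \left( \sum_{h_2 \le r} \chi(h_2) \sum_{n \le x} \bar{a}_n e^{-2\pi i n \frac{h_2}{r}} \right) \]
\[ = \frac{1}{r} \sum_{h_1 \le r} \sum_{h_2 \le r} \left( \sum_{n \le x} a_n e^{2\pi i n \frac{h_1}{r}} \right) \left( \sum_{n \le x} \bar{a}_n e^{-2\pi i n \frac{h_2}{r}} \right) \sum_{\chi \bmod r} \bar{\chi}(h_1) \chi(h_2) \]
\[ = \frac{\phi(r)}{r} \sum_{\substack{h_1 \le r \\ (h_1,r)=1}} \left( \sum_{n \le x} a_n e^{2\pi i n \frac{h_1}{r}} \right) \left( \sum_{n \le x} \bar{a}_n e^{-2\pi i n \frac{h_1}{r}} \right) = \frac{\phi(r)}{r} \sum_{\substack{h \le r \\ (h,r)=1}} \left| \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \]

since, by Lemma C.1

\[ \sum_{\chi \bmod r} \bar{\chi}(h_1) \chi(h_2) = \sum_{\chi \bmod r} \chi(h_1^{-1} h_2) = \begin{cases} \phi(r) & \text{if } h_1 = h_2 \text{ and } (h_1, r) = 1 \\ 0 & \text{otherwise} \end{cases} \]

for all $h_1, h_2 \le r$. Hence

\[ \frac{r}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod r} \left| \sum_{n \le x} a_n \chi(n) \right|^2 \le \sum_{\substack{h \le r \\ (h,r)=1}} \left| \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \]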


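Sum over $r \le R$ and apply Theorem 3.3

\[ \sum_{r \le R} \frac{r}{\phi(r)} \sideset{}{^*}\sum_{\chi \bmod r} \left| \sum_{n \le x} a_n \chi(n) \right|^2 \le \sum_{r \le R} \sum_{\substack{h \le r \\ (h,r)=1}} \left| \sum_{n \le x} a_n e^{2\pi i n \frac{h}{r}} \right|^2 \le (R^2 + 4\pi x) \sum_{n \le x} |a_n|^2 \]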

In the following Section, a particular inequality will have to be used at one point in the proof of Chen’s Theorem. The above result is used to prove said inequality, presented in the following Theorem.

Theorem 3.5. Let (an)n be a sequence of complex numbers such that |an| ≤ 1, for all n ∈ N. Let A, X, Y, Z ≥ 1 such that X > (log Y )2A. Set

\[ S = \frac{(XY)^{1/2}}{\log^A Y} \]

Then

2 X X X 1 X X XY (log XY ) max an − an <

Proof. Let (h, d) = 1, p a prime and n ∈ N. By Lemma C.1

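\[ \sum_{\chi \bmod d} \bar{\chi}(h) \chi(pn) = \sum_{\chi \bmod d} \chi(h^{-1} p n) = \begin{cases} \phi(d) & \text{if } pn \equiv h \pmod{d} \\ 0 & \text{otherwise} \end{cases} \]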

Then

X X X X an X a = χ¯(h)χ(pn) n ϕ(d) n

1 X X 1 X X χ¯ (h) a χ (pn) = a ϕ(d) 0 n 0 ϕ(d) n n


Hence

X X X 1 X X max an − an (h,d)=1 ϕ(d) d

X 1 X X X ≤ χ¯(h) anχ(n) χ(p) ϕ(d) d

X 1 X X X ≤ anχ(n) χ(p) ϕ(d) d

X 1 X X X anχ(n) χ(p) ϕ(d) d

∗ X 1 X X 0 00 X 0 00 = anχ (n)χ (n) χ (p)χ (p) ϕ(rs) 0 0 rs

∗ X 1 X X X = anχ(n) χ(p) ϕ(rs) rs

X 1 X X X anχ(n) χ(p) ϕ(d) d

∗ X 1 X 1 X X X ≤ anχ(n) χ(p) ϕ(s) ϕ(r) s

since

\[ \sum_{a \, (\mathrm{mod}\ r)} \frac{\pi(Y)}{\phi(r)} \chi(a) = \frac{\pi(Y)}{\phi(r)} \sum_{a \, (\mathrm{mod}\ r)} \chi(a) = 0 \]

due to

\[ \sum_{a \, (\mathrm{mod}\ r)} \chi(a) = 0 \]

which follows from the fact that there exists a natural number $b$ coprime with $r$ and such that $\chi(b) \ne 1$, for which

\[ \chi(b) \sum_{a \, (\mathrm{mod}\ r)} \chi(a) = \sum_{a \, (\mathrm{mod}\ r)} \chi(ab) = \sum_{a \, (\mathrm{mod}\ r)} \chi(a) \]
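The vanishing of a complete character sum can be illustrated with the simplest non-principal character available, the Legendre symbol modulo an odd prime. This is only a sketch for that special case (the argument above covers every non-principal character), and the helper names are illustrative.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p: a real character modulo p."""
    t = pow(a, (p - 1) // 2, p)      # Euler's criterion: t is 0, 1 or p - 1
    return -1 if t == p - 1 else t


def complete_character_sum(p):
    """Sum of the Legendre symbol over a full set of residues modulo p."""
    return sum(legendre(a, p) for a in range(1, p + 1))


for p in (3, 5, 7, 11, 13):
    print(p, complete_character_sum(p))
```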

Analogously

X rZ rY χ(p) << << (log Z)4A (log Y )4A p

X X X X X rY χ(p) = χ(p) − χ(p) ≤ χ(p) + χ(p) << (log Y )4A Z≤p

The number of terms removed in the above sum by adding the condition (p, s) = 1 is less than or equal to ω(s), which by Lemma A.4 is bounded above by 2 log s << log S, since s < S. Hence

X rY χ(p) << + log S (log Y )4A Z≤p

X X X anχ(n) ≤ |an||χ(n)| ≤ 1 ≤ X

n

Now, the sum

∗ X 1 X X X anχ(n) χ(p) ϕ(r) r

will be counted by dividing the range $r < S$ into $r < S_0 = \log^A Y$ and a second range $S_0 \le r < S$, which is to be partitioned into pairwise disjoint subintervals $S_j \le r < 2S_j$, where $S_j = 2^j S_0$ and $0 \le j \ll \log S$ (since $2^{\frac{\log S}{\log 2}} S_0 = e^{\log S} S_0 = S S_0 > S$). Thus

∗ ∗   X 1 X X X X 1 X rY anχ(n) χ(p) << X + log S ϕ(r) ϕ(r) (log Y )4A r

∗ X 1 X X X anχ(n) χ(p) ϕ(r) Sj ≤r<2Sj χ mod r n

    1  X r X   X r X  ≤  anχ(n)   χ(p)  Sj  ϕ(r)   ϕ(r)   r<2Sj n

∗ ∗ 1  X r X X   X r X X  =  anχ(n)   χ(p)  Sj  ϕ(r)   ϕ(r)  r<2Sj χ mod r n

∗ X r X X 2 X 2 2 X anχ(n) ≤ (4S + 4πX) |an| ≤ (4S + 4πX) 1 ϕ(r) j j r<2Sj χ mod r n


since |an| ≤ 1. As for the second factor, apply Theorem 3.4 this time with R = 2Sj, x = Y and the sequence of numbers equal to 1 when n is a prime, n ≥ Z and (n, s) = 1, and 0 otherwise, to obtain

2

∗ X r X X 2 X 2 2 χ(p) ≤ (4S + 4πY ) 1 ≤ (4S + 4πY )Y ϕ(r) j j r<2Sj χ mod r Z≤p

Therefore

∗ 1 1 X 1 X X X 1 2 2 2 2 anχ(n) χ(p) ≤ (4Sj + 4πX)X (4Sj + 4πY )Y ϕ(r) Sj Sj ≤r<2Sj χ mod r n> log Y and X 2 > log Y , by hypothesis. Whence

∗ X X 1 X X X XY X XY log S anχ(n) χ(p) << 1 << ϕ(r) logA Y logA Y j Sj ≤r<2Sj χ mod r n

The contribution of the sum in the range r < S0 and S0 ≤ r < S are both bounded by a XY log S certain constant times logA Y . Hence their sum is too, which means

∗ X 1 X X X XY log S anχ(n) χ(p) << ϕ(r) logA Y r


By Lemma B.4

X 1 X X X anχ(n) χ(p) ϕ(d) d

∗ X 1 X 1 X X X ≤ anχ(n) χ(p) ϕ(s) ϕ(r) s

4 Chen's Theorem

The Chinese mathematician Jingrun Chen proved in 1966 (although he did not publish his result until 1973 due to political turmoil in China) that there is a lower bound on the number of representations of even numbers as sums of an odd prime and a product of at most two primes.

Theorem 4.1 (Chen, 1966). Let $N$ be a large enough even natural number and let $R(N)$ denote the number of representations of $N$ as the sum of an odd prime and a product of at most two primes. Then

\[ R(N) \gg S(N) \frac{N}{\log^2 N} \]

where

\[ S(N) = 2 \prod_{\substack{p > 2 \\ p \mid N}} \frac{p - 1}{p - 2} \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \]

In particular, every large enough even natural number is the sum of a prime and a product of at most two primes, since $S(N) \gg 1$ in view of the fact that the convergent product

\[ 2 \prod_{p > 2} \left( 1 - \frac{1}{(p - 1)^2} \right) \]

is a positive constant independent of $N$, and

\[ \prod_{\substack{p > 2 \\ p \mid N}} \frac{p - 1}{p - 2} \ge \prod_{\substack{p > 2 \\ p \mid N}} 1 = 1 \]

A couple of remarks ought to be made before starting with the proof. First, representations are to be understood as ordered pairs, meaning that a + b and b + a are considered different representations of a number as the sum of two others. This affects R(N) only by a factor of at most 2. The second remark is that 1 is considered a product of at most two primes (for it is the product of no primes at all). This consideration does not alter the result, since R(N) is increased by 2 when N − 1 is prime and left unaffected otherwise.

Finally, the result has been deliberately stated in a simple form. One can be more precise about the prime decomposition of the product of at most two primes that occurs in the statement. In fact, said product is either 1, or a prime greater than $N^{1/8}$, or a product of a prime greater than $N^{1/8}$ and another prime greater than $N^{1/3}$. Moreover, representations of the form N = p + (N − p), where p divides N, are not counted. This implies that representations such as 3 + 3 for 6, or 24 = 2 + 22 = 3 + 21, for instance, are excluded.
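To get a feeling for the size of $S(N)$, the infinite product can be truncated numerically. The following sketch assumes that all odd prime factors of $N$ lie below the truncation point `limit`; the function names and the default truncation are illustrative choices, not part of the original text.

```python
import math


def primes_up_to(limit):
    """All primes <= limit, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def singular_series(N, limit=10**5):
    """Approximate S(N) by truncating the product over p > 2 at `limit`."""
    value = 2.0
    for p in primes_up_to(limit):
        if p == 2:
            continue
        value *= 1 - 1 / (p - 1) ** 2
        if N % p == 0:
            value *= (p - 1) / (p - 2)
    return value


print(singular_series(2 ** 10))  # N with no odd prime factor
print(singular_series(30))       # N divisible by 3 and 5
```

For $N$ a power of 2 the value is twice the twin prime constant, and each odd prime factor of $N$ can only increase it.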

To motivate Chen’s strategy for the proof of the theorem, consider first the following elementary remark:

Every natural number between 2 and N − 1 not divisible by any prime strictly smaller than the k-th root of N is the product of at most k − 1 primes.


Proof. Let $2 \le n < N$. Write $n = p_1 \cdots p_r$ as a product of primes, where $r > 0$. If $n$ is not divisible by any prime strictly smaller than $N^{1/k}$, then $p_i \ge N^{1/k}$ for all $i$. Hence $n \ge N^{r/k} > n^{r/k}$. Therefore $1 > \frac{r}{k}$, which implies $r < k$.

Initially, one may consider sieving the set

\[ \{ N - p \in \mathbb{N} : p < N \} \]

with sieving range P and sieving level $z = N^{1/3}$ to count how many N − p in said set are products of at most two primes. The reason is that said amount equals the number of representations of N as the sum of a prime and a product of at most two primes, since N = p + (N − p). However, it will be made clear in the upcoming proof that in this case the lower bounds for the sieving function obtained from Jurkat-Richert's Theorem turn out to be negative, and thus useless. Hence, the strategy is to consider a larger initial set, and sieve out all numbers which are products of three or more primes. This is done by assigning weights to the elements of the set in such a way that the weights are positive when such elements are products of at most two primes.

Finally, three main ingredients are to be used in the proof: the Jurkat-Richert Theorem (Theorem 2.7) to find upper and lower bounds for sieving functions, and either the large sieve inequality in Theorem 3.5 or the Bombieri-Vinogradov Theorem (Theorem 3.1) to bound the error term the linear sieve produces.
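The elementary remark on $k$-th roots stated above can be verified by brute force for small $N$. The sketch below uses trial division and the integer comparison $p^k \ge N$ in place of $p \ge N^{1/k}$ to avoid floating-point roots; the function names are illustrative.

```python
def prime_factors(n):
    """Prime factors of n with multiplicity, by trial division."""
    factors, d = [], 2
    while d * d <= n:
        while n % d == 0:
            factors.append(d)
            n //= d
        d += 1
    if n > 1:
        factors.append(n)
    return factors


def check_remark(N, k):
    """True if every 2 <= n < N whose prime factors p all satisfy p**k >= N
    is a product of at most k - 1 primes."""
    for n in range(2, N):
        factors = prime_factors(n)
        if all(p ** k >= N for p in factors) and len(factors) > k - 1:
            return False
    return True


print(check_remark(5000, 3), check_remark(5000, 4))
```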

4.1 Three Sieving Functions

Let $N$ be an even integer. Consider the sieving set

\[ A = \{ N - p : p < N, \ (p, N) = 1 \} \]

the sieving range

\[ \mathcal{P} = \{ p \in \mathbb{P} : (p, N) = 1 \} \]

and the sieving level

\[ z = N^{1/k} \]

where $k > 3$ is to be established later. Let

\[ P(z) = \prod_{\substack{p \in \mathcal{P} \\ p < z}} p \]

Therefore the sieving function X S(A, P, z) = 1 p

$p < N$ and $(p, N) = 1$. Let $d = (n, N)$. Then, $d \mid (N - p)$ and $d \mid N$. Hence, $d \mid p$, which implies $d = 1$ or $d = p$. The latter is impossible since $(p, N) = 1$. Hence the former must hold. Thus, any $n \in A$ such that $(n, P(z)) = 1$ can be written as

\[ n = p_1 \cdots p_r \, p_{r+1} \cdots p_{r+s} \]

where

\[ z = N^{1/k} \le p_1 \le \ldots \le p_r < N^{1/3} \le p_{r+1} \le \ldots \le p_{r+s} \]

since $n < N$, $(n, N) = 1$ and $(n, P(z)) = 1$. In particular

\[ N^{s/3} \le p_{r+1} \cdots p_{r+s} \le n < N \]

whence $\frac{s}{3} < 1$, which implies $s \in \{0, 1, 2\}$. Let

\[ y = N^{1/3} \]

The way to proceed is to assign to any n = N − p ∈ A with (n, P (z)) = 1 a positive weight if and only if either:

(i) N − p = 1, or

(ii) N − p is a prime greater than or equal to z, or

(iii) N − p is the product of exactly two primes, both greater than or equal to y, or

(iv) N − p is the product of exactly two primes, one of them greater than or equal to z and smaller than y, and the other greater than or equal to y

Let 1 X 1 X α = 1 − j − 1 n 2 2 z≤q

Moreover, the second sum X 1

n=p1p2p3 z≤p1 0 if and only if r = 0, or r = 1 and the second sum equals 0. Assume r = 1. Then the second sum equals 0 if and only if s = {0, 1}. Therefore αn > 0 if and only if r = 0 and s ∈ {0, 1, 2}, or r = 1 and s = {0, 1}. The case r = s = 0 corresponds to (i). The case r + s = 1 (meaning r = 0 and s = 1 and vice versa) refers to (ii). The case r = 0 and s = 2 is exactly (iii). Finally, r = s = 1 corresponds to (iv).
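The case analysis for the sign of the weights can be tested mechanically. The sketch below assumes the weight has the form $\alpha_n = 1 - \frac{1}{2} r - \frac{1}{2} t$, where $r$ counts (with multiplicity) the prime factors $q$ of $n$ with $z \le q < y$ and $t = 1$ exactly when $n = p_1 p_2 p_3$ with $z \le p_1 < y \le p_2 \le p_3$; this form is inferred from how $\alpha_n$ is used later in this section, and the coprimality condition with $N$ is ignored for simplicity. All function names are illustrative.

```python
def factor_counts(n):
    """Dictionary {prime: exponent} for n, by trial division."""
    counts, d = {}, 2
    while d * d <= n:
        while n % d == 0:
            counts[d] = counts.get(d, 0) + 1
            n //= d
        d += 1
    if n > 1:
        counts[n] = counts.get(n, 0) + 1
    return counts


def weight(n, z, y):
    """Return (alpha_n, r, s) for an n whose prime factors are all >= z (assumed form)."""
    counts = factor_counts(n)
    r = sum(j for q, j in counts.items() if z <= q < y)   # small prime factors
    s = sum(j for q, j in counts.items() if q >= y)       # large prime factors
    triple = 1 if (r == 1 and s == 2) else 0              # n = p1 p2 p3 as above
    return 1 - r / 2 - triple / 2, r, s


def check_weights(N, k):
    """Check that alpha_n > 0 exactly when r = 0, or r = 1 and s <= 1."""
    z, y = N ** (1 / k), N ** (1 / 3)
    for n in range(1, N):
        if any(q < z for q in factor_counts(n)):
            continue                                      # n is sieved out
        alpha, r, s = weight(n, z, y)
        expected = (r == 0) or (r == 1 and s <= 1)
        if (alpha > 0) != expected or s > 2:
            return False
    return True


print(check_weights(10 ** 4, 8))
```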


In any of the above cases, r + s ≤ 2. With all these considerations, a lower bound for R(N) can be found as follows

X X X X R(N) = 1 ≥ 1 = 1 ≥ 1 N=p+n n=N−p n∈A n∈A n∈{1,p1,p1p2} p

X X R(N) ≥ αn ≥ αn n∈A n∈A (n,P (z))=1 (n,P (z))=1 n∈{1,p1,p1p2 : p1,p2≥z}        X  1  X X  1  X X  =  1 −  j −  1   2   2   n∈A n∈A z≤q 2, which implies αn ≤ 0, by the above considerations. Now the first sum is exactly the sieving function

X 1 = S(A, P, z) n∈A (n,P (z))=1

The second sum equals

    X X  X X   X X  j =  1 +  (j − 1) n∈A z≤q

X X X X X 1 = 1 = S(Aq, P, z) n∈A z≤q

Ad = {n ∈ A : d | n}

42 4 CHEN’S THEOREM 4.1 Three Sieving Functions and the second sum turns out to be X X X X X X X (j − 1) = (j − 1) = (j − 1) 1 n∈A z≤q1 z≤q1 X X X X X X X X j − 1 = (j − 1) 1 ≤ (j − 1) 1 ≤ N qj z≤q1 n∈A z≤q1 n1 (n,P (z))=1 qj |n qj kn

j  N  N since the amount of n < N divisible by q adds up to qj ≤ qj . For any |x| < 1 1 X X = jxj−1 = (j − 1)xj−2 (1 − x)2 j>0 j>1 Therefore X X X X j − 1 X 1 X j − 1 X 1 (j − 1) ≤ N = N = N qj q2 qj−2 q2(1 − 1 )2 n∈A z≤q1 z≤q1 z≤q

N 2N 1 1− k = 1 ≤ 1 = 2N N k − 2 N k

1 1 k 1 k since N − 2 ≥ 2 N , for big enough values of N. Hence X X X 1− 1 j < S(Aq, P, z) + 2N k n∈A z≤q

B = {N − p1p2p3 : p1p2p3 < N, z ≤ p1 < y ≤ p2 ≤ p3, (p1p2p3,N) = 1}

Let N − p1p2p3 ∈ B ∩ P. Then, p = N − p1p2p3 ∈ B. Hence, p1p2p3 = N − p ∈ A, since p < N and (p, N) = 1. Let now p1p2p3 ∈ A with z ≤ p1 < y ≤ p2 ≤ p3. Then, N − p1p2p3 is prime, and in particular, p1p2p3 < N. Moreover, (p1p2p3,N) = 1, since otherwise p1p2p3 ∈/ A. Therefore, N − p1p2p3 ∈ B ∩ P if and only if p1p2p3 ∈ A and z ≤ p1 < y ≤ p2 ≤ p3. Thus X X X X X X 1 = 1 = 1 = 1 = 1 n=p p p n∈A 1 2 3 p1p2p3∈A p1p2p3∈A n∈B∩P p∈B (n,P (z))=1 z≤p1


Therefore, putting all the above together

Theorem 4.2.

\[ R(N) > S(A, \mathcal{P}, z) - \frac{1}{2} \sum_{z \le q < y} S(A_q, \mathcal{P}, z) - \frac{1}{2} S(B, \mathcal{P}, y) - N^{1 - \frac{1}{k}} - \frac{1}{2} N^{\frac{1}{3}} \]

4.2 Linearity of the Sieves

The following step is to apply the Jurkat-Richert Theorem to each of the above three sieving functions. To do so, one must choose a suitable multiplicative function. For all the sieves to come, let

\[ f(d) = \frac{1}{\phi(d)} \]

where $\phi(d)$ is Euler's totient function. Then, clearly $0 < f(p) < 1$ for all $p \in \mathcal{P}$, since $\phi(p) = p - 1$ and $2 \notin \mathcal{P}$ because $N$ is even. The linearity of a sieve does not depend on the sieving set, but rather on the multiplicative function, the sieving range (which is $\mathcal{P}$ in all three cases) and the sieving level (either $z$ or $y$). Consider both cases $z$ and $y$ simultaneously, since both are of the form $w = N^{1/K}$ with $K \ge 3$. The only thing yet to be defined is the finite subset $\mathcal{Q}$ of $\mathcal{P}$ in such a way that

\[ \prod_{\substack{p \in \mathcal{P} \setminus \mathcal{Q} \\ u \le p < w}} \frac{1}{1 - \frac{1}{\phi(p)}} \le (1 + \varepsilon) \frac{\log w}{\log u} \]

holds for some $0 < \varepsilon < \frac{1}{200}$ and for all $1 < u < w$. By Lemma A.7, for any $\varepsilon > 0$, there exists $n_1(\varepsilon)$ such that

\[ \prod_{u \le p < w} \frac{p}{p - 1} = \prod_{u \le p < w} \left( 1 - \frac{1}{p} \right)^{-1} < \left( 1 + \frac{\varepsilon}{3} \right) \frac{\log w}{\log u} \]

for every $n_1(\varepsilon) \le u < w$, independently of $w = N^{1/K}$. Moreover, there exists $n_2(\varepsilon)$ such that

\[ \prod_{p \ge n_2(\varepsilon)} \frac{(p - 1)^2}{p(p - 2)} < 1 + \frac{\varepsilon}{3} \]

since the product $\prod_{p > 2} \frac{(p - 1)^2}{p(p - 2)} = \prod_{p > 2} \left( 1 + \frac{1}{p(p - 2)} \right)$ converges. Let $n_3(\varepsilon) = \max\{ n_1(\varepsilon), n_2(\varepsilon) \}$. Then, for any $u \ge n_3(\varepsilon)$

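\[ \prod_{u \le p < w} \frac{1}{1 - \frac{1}{\phi(p)}} = \prod_{u \le p < w} \frac{1}{1 - \frac{1}{p - 1}} = \prod_{u \le p < w} \frac{p - 1}{p - 2} = \prod_{u \le p < w} \frac{(p - 1)^2}{p(p - 2)} \prod_{u \le p < w} \frac{p}{p - 1} < \left( 1 + \frac{\varepsilon}{3} \right)^2 \frac{\log w}{\log u} < (1 + \varepsilon) \frac{\log w}{\log u} \]

for any $0 < \varepsilon < 3$, and in particular, for any $0 < \varepsilon < \frac{1}{200}$. Let $\mathcal{Q}_\varepsilon$ be the set of primes not exceeding $n_3(\varepsilon)$. Define the finite subset $\mathcal{Q} = \mathcal{P} \cap \mathcal{Q}_\varepsilon$ of $\mathcal{P}$. Then

\[ \prod_{\substack{p \in \mathcal{P} \setminus \mathcal{Q} \\ u \le p < w}} \frac{1}{1 - \frac{1}{\phi(p)}} \le (1 + \varepsilon) \frac{\log w}{\log u} \]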

1 1 K holds for 0 < ε < 200 and for all 1 < u < N , for all w. Hence, all hypothesis of Jurkat- 1 Richert’s Theorem (Theorem 2.7) are verified, since moreover w = N K ≥ 2 because N is big enough. By the remarks after said theorem, one concludes that the three sieves are 1 linear and V (w) << log N , where Y  1  Y  1  V (w) = 1 − = 1 − ϕ(p) p − 1 p|P (w) 2

Q ≤ Qε < log N for sufficiently large N, since Qε does not depend on N. Finally

Theorem 4.3. Let $w = N^{1/K}$, with $K \ge 3$. Then

\[ V(w) = \frac{K e^{-\gamma} S(N)}{\log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right) \]

for large enough $N$.

K 1 Proof. Let N ≥ 4 . Then w = N K ≥ 4. First, compute

−1 −1 Y  1  Y  1  Y  1  V (w) 1 − = 1 − 1 − p − 1 p − 1 p − 1 22 p≥w p|N p|N p|N

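Secondly, $0 < \frac{1}{p - 1} \le \frac{1}{w - 1} \le \frac{1}{3}$ for all $p \ge w$. Use the fact that $1 - x > e^{-2x}$, certainly for $0 < x \le \frac{1}{3}$, and that $e^{-x} > 1 - x$ for all real $x \ne 0$, to obtain

\[ \prod_{\substack{p \ge w \\ p \mid N}} \left( 1 - \frac{1}{p - 1} \right) > \prod_{\substack{p \ge w \\ p \mid N}} \exp\!\left( \frac{-2}{p - 1} \right) = \exp\!\left( \sum_{\substack{p \ge w \\ p \mid N}} \frac{-2}{p - 1} \right) \ge \exp\!\left( \sum_{\substack{p \ge w \\ p \mid N}} \frac{-2}{w - 1} \right) \ge \exp\!\left( \frac{-2}{w - 1} \sum_{p \mid N} 1 \right) \]
\[ = \exp\!\left( \frac{-2\omega(N)}{w - 1} \right) \ge \exp\!\left( \frac{-4 \log N}{w - 1} \right) > \exp\!\left( \frac{-8 \log N}{w} \right) > 1 - 8 \frac{\log N}{w} = 1 - 8 \frac{\log N}{N^{1/K}} \]

since $\omega(N) \le 2 \log N$, by Lemma A.4. Whence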

−1 −1 Y  1  Y  1   log N  V (w) 1 − = 1 − 1 + O 1 p − 1 p − 1 K 22 N p|N Y p − 2   1  = 1 + O p − 1 log N p>2 p|N

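Next

\[ \sum_{n \ge w} \frac{1}{n(n - 2)} < \int_{w - 1}^{\infty} \frac{dx}{x(x - 2)} = \frac{1}{2} \int_{w - 1}^{\infty} \frac{dx}{x - 2} - \frac{1}{2} \int_{w - 1}^{\infty} \frac{dx}{x} = -\frac{1}{2} \log(w - 3) + \frac{1}{2} \log(w - 1) = \frac{1}{2} \log \frac{w - 1}{w - 3} \]

Hence

\[ \prod_{p \ge w} \left( 1 + \frac{1}{p(p - 2)} \right) < \prod_{p \ge w} \exp\!\left( \frac{1}{p(p - 2)} \right) = \exp\!\left( \sum_{p \ge w} \frac{1}{p(p - 2)} \right) < \exp\!\left( \frac{1}{2} \log \frac{w - 1}{w - 3} \right) \]
\[ = \left( \frac{w - 1}{w - 3} \right)^{1/2} = \left( 1 + \frac{2}{w - 3} \right)^{1/2} < 1 + \frac{2}{w - 3} = 1 + O\!\left( \frac{1}{w} \right) \]

since $1 + x < e^x$ for all $x > 0$. Therefore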

−1 −1 Y  1  Y  1 Y  1   1 Y p(p − 2) 1 − 1 − = 2 1 − 1 − = 2 p − 1 p p − 1 p (p − 1)2 22 p≥w p>2 p≥w Y  1    1  = 2 1 − 1 + O (p − 1)2 w p>2

Thus, by Theorem A.6

Y  1  Y  1    1  Y  1 1 − = 2 1 − 1 + O 1 − p − 1 (p − 1)2 w p 22 p2 2e−γ Y  1    1  = 1 − 1 + O log w (p − 1)2 log N p>2

which finally implies

−1 Y  1  Y  1  V (w) = V (w) 1 − 1 − p − 1 p − 1 22 p>2 p|N 2e−γ Y p − 2 Y  1    1  = 1 − 1 + O log w p − 1 (p − 1)2 log N p>2 p>2 p|N e−γ S(N)   1  Ke−γ S(N)   1  = 1 + O = 1 + O log w log N log N log N
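Theorem 4.3 can be compared against a direct computation in a simple case. The sketch below takes $N$ to be a power of 2, so that no odd prime divides $N$ and $w = N^{1/K}$ is an exact integer, and truncates the infinite product in $S(N)$; the function names, the numerical value of Euler's constant and the truncation point are assumptions made for illustration only.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def primes_up_to(limit):
    """All primes <= limit, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (limit + 1)
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(limit) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, limit + 1, i)))
    return [i for i in range(2, limit + 1) if sieve[i]]


def compare_V(exponent=48, K=3, product_limit=10**5):
    """Return (V(w), K e^{-gamma} S(N) / log N) for N = 2**exponent, w = N**(1/K).

    K is assumed to divide `exponent`, so that w is an exact power of 2.
    """
    N = 2 ** exponent
    w = 2 ** (exponent // K)
    V = 1.0
    S = 2.0  # N is a power of 2: the product over odd p | N is empty
    for p in primes_up_to(max(w, product_limit)):
        if p == 2:
            continue
        if p < w:
            V *= 1 - 1 / (p - 1)
        S *= 1 - 1 / (p - 1) ** 2
    predicted = K * math.exp(-EULER_GAMMA) * S / math.log(N)
    return V, predicted


V, predicted = compare_V()
print(V, predicted, V / predicted)
```

The ratio of the two values is expected to be close to 1, in line with the factor $1 + O(1/\log N)$ in the theorem.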

1 As a side note, recall that V (w) << log N . Then, by Theorem 4.3, not only is S(N) >> 1 as evidenced when first introducing S(N), but also S(N) << 1. With that, all is set to be begin sieving.

4.3 The Sieve of S(A, P, z)

Jurkat-Richert's Theorem (Theorem 2.7) provides the following lower bound for $S(A, \mathcal{P}, z)$

\[ S(A, \mathcal{P}, z) > \left( \varphi\!\left( \frac{\log D}{\log z} \right) + O(\varepsilon) \right) |A| V(z) - R \]

2 2 for any D ≥ z = N k , where φ is the function defined in Section 2.2, and X R = |r(d)| d

ω(N) = O(log N)

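Hence, by the Prime Number Theorem

\[ |A| = \frac{N}{\log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right) + O(\log N) = \frac{N}{\log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right) = (1 + O(\varepsilon)) \frac{N}{\log N} \]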


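Secondly

\[ |A_d| = \sum_{\substack{N - p \in A \\ d \mid (N - p)}} 1 = \sum_{\substack{p \le N \\ (p, N) = 1 \\ p \equiv N \bmod d}} 1 = \pi(N, N \bmod d) + O(\omega(N)) = \pi(N, N \bmod d) + O(\log N) \]

Then

\[ r(d) = |A_d| - \frac{|A|}{\phi(d)} = \pi(N, N \bmod d) + O(\log N) - \frac{\pi(N)}{\phi(d)} + \frac{\omega(N)}{\phi(d)} = \pi(N, N \bmod d) - \frac{\pi(N)}{\phi(d)} + O(\log N) \]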
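The size of the remainders $r(d) = |A_d| - |A| / \phi(d)$ can be inspected numerically for a small even $N$. The following sketch builds the sieving set explicitly; the function names and the choice $N = 10^4$ are illustrative assumptions.

```python
import math


def primes_below(n):
    """All primes < n, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * n
    sieve[0:2] = b"\x00\x00"
    for i in range(2, math.isqrt(n) + 1):
        if sieve[i]:
            sieve[i * i :: i] = bytearray(len(range(i * i, n, i)))
    return [i for i in range(2, n) if sieve[i]]


def euler_phi(d):
    """Euler's totient function, by trial division."""
    result, m, p = d, d, 2
    while p * p <= m:
        if m % p == 0:
            while m % p == 0:
                m //= p
            result -= result // p
        p += 1
    if m > 1:
        result -= result // m
    return result


def sieving_set(N):
    """A = {N - p : p < N prime, (p, N) = 1}."""
    return [N - p for p in primes_below(N) if math.gcd(p, N) == 1]


def remainder(A, d):
    """r(d) = |A_d| - |A| / phi(d), with A_d the multiples of d in A."""
    A_d = sum(1 for n in A if n % d == 0)
    return A_d - len(A) / euler_phi(d)


A = sieving_set(10 ** 4)
for d in (3, 7, 21):
    print(d, remainder(A, d))
```

The remainders are small compared with the main term $|A| / \phi(d)$, which is the behaviour that the Bombieri-Vinogradov Theorem quantifies on average over $d$.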

Bombieri-Vinogradov (Theorem 3.1) is to be applied to bound

X X π(N) X R = |r(d)| = π(N,N mod d) − + O(log N) ϕ(d) d

X π(N) ≤ π(N,N mod d) − + DQO(log N) ϕ(d) d

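The bound sought for $R$ is $O\!\left( \frac{N}{\log^3 N} \right)$. Hence, apply Theorem 3.1 with $x = n = N$ and $A = 3$. Choose

\[ D = \frac{N^{1/2}}{(\log N)^{1 + \beta(3)}} \]

for, by doing so, $D \ge z^2 = N^{2/k}$, provided that $k > 4$, for $N$ big enough. Moreover

\[ DQ < \frac{N^{1/2}}{(\log N)^{\beta(3)}} \]

since $Q < \log N$ for $N$ big enough. Then, by Theorem 3.1

\[ R \ll \frac{N}{\log^3 N} \]

since

\[ DQ \log N < \frac{N^{1/2}}{(\log N)^{\beta(3) - 1}} \ll \frac{N}{\log^3 N} \]

Finally

\[ \frac{\log D}{\log z} = \frac{\frac{1}{2} \log N - (1 + \beta(3)) \log \log N}{\frac{1}{k} \log N} = \frac{k}{2} - k(1 + \beta(3)) \frac{\log \log N}{\log N} \]

Hence, if $4 < k \le 8$

\[ 2 < \frac{\log D}{\log z} < \frac{k}{2} \le 4 \]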

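for big enough values of $N$. By Lemma 2.5

\[ \varphi\!\left( \frac{\log D}{\log z} \right) = \frac{2 e^{\gamma} \log z}{\log D} \log\!\left( \frac{\log D}{\log z} - 1 \right) = \frac{4 e^{\gamma} \log\!\left( \frac{k}{2} - 1 \right)}{k} + O\!\left( \frac{\log \log N}{\log N} \right) = \frac{4 e^{\gamma} \log \frac{k - 2}{2}}{k} + O(\varepsilon) \]

Hence

\[ S(A, \mathcal{P}, z) > \left( \frac{4 e^{\gamma} \log \frac{k - 2}{2}}{k} + O(\varepsilon) \right) \frac{N}{\log N} (1 + O(\varepsilon)) V(z) + O\!\left( \frac{N}{\log^3 N} \right) \]

Therefore

Theorem 4.4. Let $4 < k \le 8$. Then

\[ S(A, \mathcal{P}, z) > \left( \frac{4 e^{\gamma} \log \frac{k - 2}{2}}{k} + O(\varepsilon) \right) \frac{N}{\log N} V(z) + O\!\left( \frac{N}{\log^3 N} \right) \]

for large enough $N$.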

An intermediate result can be obtained at this point by setting $k = 5$ momentarily. The number of representations of $N$ as the sum of an odd prime and a product of at most four primes is bounded below by $S(A, \mathcal{P}, N^{1/5})$. In other words:

Corollary 4.5. Let $N$ be a large enough even natural number and let $R_4(N)$ denote the number of representations of $N$ as the sum of an odd prime and a product of at most four primes. Then

\[ R_4(N) \gg S(N) \frac{N}{\log^2 N} \]

Proof. By Theorem 4.4 with $k = 5$

\[ R_4(N) > S(A, \mathcal{P}, N^{1/5}) > \left( \frac{4 e^{\gamma} \log \frac{3}{2}}{5} + O(\varepsilon) \right) \frac{N}{\log N} V(N^{1/5}) + O\!\left( \frac{N}{\log^3 N} \right) = \left( 4 \log \tfrac{3}{2} + O(\varepsilon) \right) \frac{e^{\gamma}}{5} \frac{N}{\log N} V(N^{1/5}) + O\!\left( \frac{N}{\log^3 N} \right) \]

Choose $0 < \varepsilon < \frac{1}{200}$ small enough such that

\[ 4 \log \tfrac{3}{2} + O(\varepsilon) = 1.6218\ldots + O(\varepsilon) > 1 \]

Then, by Theorem 4.3

\[ R_4(N) > S(N) \frac{N}{\log^2 N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right) + O\!\left( \frac{N}{\log^3 N} \right) \gg S(N) \frac{N}{\log^2 N} \]


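4.4 The Sieve of $\sum_{z \le q < y} S(A_q, \mathcal{P}, z)$

Jurkat-Richert's Theorem (Theorem 2.7) provides the following upper bound for $S(A_q, \mathcal{P}, z)$

\[ S(A_q, \mathcal{P}, z) < \left( \Phi\!\left( \frac{\log D_q}{\log z} \right) + O(\varepsilon) \right) |A_q| V(z) + R_q \]

for any $D_q \ge z = N^{1/k}$, where Φ is the function defined in Section 2.2, and

\[ R_q = \sum_{\substack{d < D_q Q \\ d \mid P(z)}} |r_q(d)|, \qquad r_q(d) = |(A_q)_d| - \frac{|A_q|}{\phi(d)} \]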

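Let $d \mid P(z)$. Then $(q, d) = 1$, since $q \ge z$. Then

\[ |(A_q)_d| = \sum_{\substack{n \in A_q \\ d \mid n}} 1 = \sum_{\substack{n \in A \\ q \mid n \\ d \mid n}} 1 = \sum_{\substack{n \in A \\ qd \mid n}} 1 = |A_{qd}| \]

Therefore

\[ r_q(d) = |A_{qd}| - \frac{|A_q|}{\phi(d)} = |A_{qd}| - \frac{|A|}{\phi(qd)} - \frac{|A_q|}{\phi(d)} + \frac{|A|}{\phi(qd)} = r(qd) - \frac{1}{\phi(d)} \left( |A_q| - \frac{|A|}{\phi(q)} \right) = r(qd) - \frac{r(q)}{\phi(d)} \]

since $\phi(qd) = \phi(q)\phi(d)$. Hence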

X X X 1 R = |r (d)| ≤ |r(qd)| + |r(q)| q q ϕ(d) d

For any $z \le q < y$, let

\[ D_q = \frac{N^{1/2}}{q (\log N)^{1 + \beta(4)}} \]

which is greater than $z = N^{1/k}$, provided that $k > 6$, for in this case

\[ D_q > \frac{N^{1/2}}{y (\log N)^{1 + \beta(4)}} = \frac{N^{1/6}}{(\log N)^{1 + \beta(4)}} > N^{1/k} = z \]


for N big enough. Instead of estimating Rq alone, a bound for X X X X X 1 R ≤ |r(qd)| + |r(q)| q ϕ(d) z≤q

X X π(N) X |r(s)| = π(N,N mod s) − + O(log N) ϕ(s) s

X π(N) N ≤ π(N,N mod s) − + qDqQO(log N) << ϕ(s) log4 N s

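since $\frac{1}{k} \log N \le \log q < \frac{1}{3} \log N$. Hence $1 < \frac{\log D_q}{\log z} < 3$, for $N$ big enough. By Lemma 2.4

\[ \Phi\!\left( \frac{\log D_q}{\log z} \right) = \frac{2 e^{\gamma}}{\frac{k}{2} - k \frac{\log q}{\log N}} + O\!\left( \frac{\log \log N}{\log N} \right) = \frac{2 e^{\gamma}}{k \left( \frac{1}{2} - \frac{\log q}{\log N} \right)} + O(\varepsilon) \]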


On the other hand, by the Prime Number Theorem

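\[ |A_q| = \pi(N, N \bmod q) + O(\log N) = \frac{\pi(N)}{\phi(q)} + \pi(N, N \bmod q) - \frac{\pi(N)}{\phi(q)} + O(\log N) = \frac{N}{\phi(q) \log N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right) + \pi(N, N \bmod q) - \frac{\pi(N)}{\phi(q)} \]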

Therefore

  1  γ ! N 1 + O X  log D  X 2e log N Φ q  + O(ε) |A | = + O(ε) log z q 1 log q ϕ(q) log N z≤q

The factor 2eγ 2eγ + O(ε) ≤ + O(ε) = O(1) 1 log q k( 1 − 1 ) k( 2 − log N ) 2 3

1 since ε < 200 = O(1). Hence the second sum is      X π(N)  N N O  π(N,N mod q) −  = O = O(ε) ϕ(q) 2 log N  q

2eγ N   1  X 1 N X 1 1 + O + O(ε) k log N ϕ(q)( 1 log N − log q) log N ϕ(q) z≤q

By Lemma A.3

X 1 X 1 X 1 X 1 = ≤ << = log log y − log log z + O(1) ϕ(q) q − 1 q − 1 q z≤q

Hence N X 1 N O(ε) = O(ε) log N ϕ(q) log N z≤q


N X 1 kN X 1 = log N ϕ(q)( 1 log N − log q) log2 N k log q z≤q

k log q since 1 < 2 − k log N < 3. This means, by all the above, that

γ X  log D  2e N X 1 N Φ q  + O(ε) |A | < + O(ε) log z q k ϕ(q)( 1 log N − log q) log N z≤q

To bound the above sum, write

1 1 1 2 = ≤ + ϕ(q) q − 1 q q2

Then

X 1 X 1 X 2 ≤ + ϕ(q)( 1 log N − log q) q( 1 log N − log q) q2( 1 log N − log q) z≤q

X 2 X 2 X 2 ≤ = q2( 1 log N − log q) q2( 1 log N − log y) q2( 1 − 1 ) log N z≤q

Consider next the continuous, positive and increasing function

\[ f(x) = \frac{1}{\frac{1}{2} \log N - \log x} \]

defined for $z \le x \le y$. Then, by twice integrating by parts and using Lemma A.3 in the form

\[ \sum_{p < x} \frac{1}{p} = \log \log x + a + O\!\left( \frac{1}{\log x} \right) \]

53 P 4.4 The Sieve of z≤q 0 ! X 1 X f(q) Z y X 1 = = f(x) d q( 1 log N − log q) q q z≤q

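\[ \int_z^y f(x) \, d\log\log x = \int_z^y \frac{dx}{x \log x \left( \frac{1}{2} \log N - \log x \right)} = \int_{1/k}^{1/3} \frac{N^u \log N \, du}{N^u \, u \log N \left( \frac{1}{2} \log N - u \log N \right)} = \frac{1}{\log N} \int_{1/k}^{1/3} \frac{du}{u \left( \frac{1}{2} - u \right)} \]
\[ = \frac{2}{\log N} \int_{1/k}^{1/3} \left( \frac{du}{u} + \frac{du}{\frac{1}{2} - u} \right) = \frac{2}{\log N} \left( \log \tfrac{1}{3} - \log \tfrac{1}{k} - \log \tfrac{1}{6} + \log\!\left( \tfrac{1}{2} - \tfrac{1}{k} \right) \right) = \frac{2 \log(k - 2)}{\log N} \]

Whence

\[ \sum_{z \le q < y} \frac{1}{\phi(q) \left( \frac{1}{2} \log N - \log q \right)} \le \frac{2 \log(k - 2)}{\log N} + O\!\left( \frac{1}{\log^2 N} \right) \]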
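The closed form of the integral obtained after the substitution $x = N^u$ can be double-checked numerically. The sketch below approximates $\int_{1/k}^{1/3} \frac{du}{u(\frac{1}{2} - u)}$ with a composite Simpson rule and compares it with $2 \log(k - 2)$; the quadrature routine and its parameters are illustrative choices.

```python
import math


def simpson(f, a, b, n=2000):
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3


def sieve_integral(k):
    """Numerical value of the integral of du / (u (1/2 - u)) over [1/k, 1/3]."""
    return simpson(lambda u: 1 / (u * (0.5 - u)), 1 / k, 1 / 3)


for k in (7, 8):
    print(k, sieve_integral(k), 2 * math.log(k - 2))
```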


Thus

X X X  log Dq   X S(Aq, P, z) = S(Aq, P, z) < Φ log z + O(ε) |Aq|V (z) + Rq z≤q

Therefore

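Theorem 4.6. Let $6 < k \le 8$. Then

\[ \sum_{z \le q < y} S(A_q, \mathcal{P}, z) < \left( \frac{4 e^{\gamma} \log(k - 2)}{k} + O(\varepsilon) \right) \frac{N}{\log N} V(z) + O\!\left( \frac{N}{\log^3 N} \right) \]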

Another intermediate result can be obtained at this point. Having studied the first two terms of the weight $\alpha_n$ allows one to produce a lower bound for the number of representations of N as the sum of an odd prime and a product of at most three primes, say $R_3(N)$, which is obtained by assigning the weights

1 X α = 1 − j en 2 z≤q

1 X α = α − 1 n en 2 n=p1p2p3 z≤p1

This implies, that αen > 0 if and only if n is such that αn > 0, or αn = 0 and the above sum equals 1. Now, if said sum is 1 exactly, then αn = 0. Hence, αen > 0 if and only if n is of the case (i), (ii), (iii), (iv), or the product of three primes, n = p1p2p3, with z ≤ p1 < y ≤ p2 ≤ p3. Thus

1 X 1− 1 R (N) > S(A, P, z) − S(A , P, z) − N k 3 2 q z≤q

1 X 6 S(A, P, z) − S(A , P, z) − N 7 2 q z≤q − + O(ε) V (N 7 ) + O 7 2 7 log N log3 N

Whence


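Corollary 4.7. Let $N$ be a large enough even natural number and let $R_3(N)$ denote the number of representations of $N$ as the sum of an odd prime and a product of at most three primes. Then

\[ R_3(N) \gg S(N) \frac{N}{\log^2 N} \]

Proof. By Theorem 4.4 and Theorem 4.6 with $k = 7$

\[ R_3(N) > \left( 2 \log \tfrac{5}{2} - \log 5 + O(\varepsilon) \right) \frac{2 e^{\gamma}}{7} \frac{N}{\log N} V(N^{1/7}) + O\!\left( \frac{N}{\log^3 N} \right) \]

Choose $0 < \varepsilon < \frac{1}{200}$ small enough such that

\[ 2 \log \tfrac{5}{2} - \log 5 + O(\varepsilon) = \log \tfrac{5}{4} + O(\varepsilon) = 0.2231\ldots + O(\varepsilon) > \frac{1}{5} \]

Then, by Theorem 4.3

\[ R_3(N) > \frac{2}{5} S(N) \frac{N}{\log^2 N} \left( 1 + O\!\left( \frac{1}{\log N} \right) \right) + O\!\left( \frac{N}{\log^3 N} \right) \gg S(N) \frac{N}{\log^2 N} \]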
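The numerical constants appearing in Corollary 4.5 and Corollary 4.7 follow from the main terms of Theorem 4.4 and Theorem 4.6 and are easy to recompute. In the sketch below the function names and the literal value of Euler's constant are illustrative; the $O(\varepsilon)$ terms are ignored.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def lower_constant(k):
    """Main-term constant of Theorem 4.4 (lower bound for S(A, P, z)), 4 < k <= 8."""
    return 4 * math.exp(EULER_GAMMA) * math.log((k - 2) / 2) / k


def upper_constant(k):
    """Main-term constant of Theorem 4.6 (upper bound for the sum over q), 6 < k <= 8."""
    return 4 * math.exp(EULER_GAMMA) * math.log(k - 2) / k


# Corollary 4.5 (k = 5): the factor 4 log(3/2) must exceed 1
print(4 * math.log(3 / 2))

# Corollary 4.7 (k = 7): the factor 2 log(5/2) - log 5 = log(5/4) must exceed 1/5
print(2 * math.log(5 / 2) - math.log(5))

# The same quantity, read off from the two theorems with k = 7
print(lower_constant(7) - upper_constant(7) / 2)
```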

4.5 The Sieve of S(B, P, y)

Bombieri-Vinogradov will no longer be of any use when bounding the error term. Instead, this is accomplished by large sieve techniques, and particularly by Theorem 3.5. The set

\[ B = \{ N - p_1 p_2 p_3 : p_1 p_2 p_3 < N, \ z \le p_1 < y \le p_2 \le p_3, \ (p_1 p_2 p_3, N) = 1 \} \]

is far too restrictive. Instead, a new set $B'$ containing $B$ is defined. Divide the interval $z \le x < y$ into pairwise disjoint subintervals $x_j \le x < (1 + \varepsilon) x_j$, where

\[ x_j = (1 + \varepsilon)^j z \]

and $0 \le j \le \frac{\log y - \log z}{\log(1 + \varepsilon)}$, since $x_0 = z$ and

\[ (1 + \varepsilon)^{\frac{\log y - \log z}{\log(1 + \varepsilon)}} z = e^{\frac{\log y - \log z}{\log(1 + \varepsilon)} \log(1 + \varepsilon)} e^{\log z} = e^{\log y} = y \]

This implies that for every $z \le p_1 < y$, there exists a unique $j$ such that $x_j \le p_1 < (1 + \varepsilon) x_j$. Define the sets $B^j$ by

\[ B^j = \{ N - p_1 p_2 p_3 : z \le p_1 < y \le p_2 \le p_3, \ x_j \le p_1 < (1 + \varepsilon) x_j, \ x_j p_2 p_3 < N, \ (p_2 p_3, N) = 1 \} \]

The number of sets $B^j$ is bounded above by

\[ \frac{\log y - \log z}{\log(1 + \varepsilon)} + 1 = \frac{\left( \frac{1}{3} - \frac{1}{k} \right) \log N}{\log(1 + \varepsilon)} + 1 \ll \frac{\log N}{\log(1 + \varepsilon)} \ll \frac{\log N}{\varepsilon} \]

since $k > 3$ and $0 < \varepsilon < \frac{1}{200}$. Define

\[ B' = \bigcup_j B^j \]


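Then

\[ |B'| = \sum_j |B^j| \]

since the sets $B^j$ are pairwise disjoint. Furthermore

\[ B \subseteq B' \]

Hence

\[ S(B, \mathcal{P}, y) \le S(B', \mathcal{P}, y) = \sum_{\substack{b \in B' \\ (b, P(y)) = 1}} 1 = \sum_j \sum_{\substack{b \in B^j \\ (b, P(y)) = 1}} 1 = \sum_j S(B^j, \mathcal{P}, y) \]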
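The geometric partition of the interval $z \le x < y$ behind the sets $B^j$ can be made concrete with a few lines of code. The sketch below lists the left endpoints $x_j = (1 + \varepsilon)^j z$ and locates the unique index $j$ for a given $p_1$; the function names and the sample values of $z$, $y$ and $\varepsilon$ are illustrative assumptions.

```python
import math


def partition_points(z, y, eps):
    """Left endpoints x_j = (1 + eps)**j * z of the intervals covering [z, y)."""
    points, j = [], 0
    while (1 + eps) ** j * z < y:
        points.append((1 + eps) ** j * z)
        j += 1
    return points


def interval_index(p, z, eps):
    """The unique j with x_j <= p < x_{j+1}, for p >= z."""
    j = int(math.floor(math.log(p / z) / math.log(1 + eps)))
    # guard against floating-point rounding at the interval boundaries
    while (1 + eps) ** j * z > p:
        j -= 1
    while (1 + eps) ** (j + 1) * z <= p:
        j += 1
    return j


z, y, eps = 10.0, 1000.0, 0.004
points = partition_points(z, y, eps)
print(len(points), (math.log(y) - math.log(z)) / math.log(1 + eps) + 1)
print(interval_index(101, z, eps))
```

The number of intervals grows like $\log(y/z) / \varepsilon$, which is the source of the factor $\frac{\log N}{\varepsilon}$ above.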

The estimate of $S(B^j, \mathcal{P}, y)$ is once again provided by Jurkat-Richert's Theorem (Theorem 2.7), by means of

\[ S(B^j, \mathcal{P}, y) < \left( \Phi\!\left( \frac{\log D}{\log y} \right) + O(\varepsilon) \right) |B^j| V(y) + R_j \]

1 for any D ≥ y = N 3 , where Φ is the function defined in Section 2.2, and X Rj = |rj(d)| d

A prime p1 either divides d or is coprime with d. Hence, the second sum can be written as X X X 1 = 1 + 1

z≤p1

2N X 1− 1  1− 1  ≤ 1 ≤ 2N k ω(d) = O N k log d z p1≥z p1|d


by Lemma A.4. For any $d$ dividing $P(y)$, the condition $(p_1, d) = 1$ is equivalent to $(p_1 p_2 p_3, d) = 1$, since $(p_2 p_3, d) = 1$ is already implied by $d \mid P(y)$, as both $p_2$ and $p_3$ are greater than or equal to $y$. Therefore

1− 1 ! X 1 X N k log d r (d) = 1 − 1 + O j ϕ(d) ϕ(d) z≤p1

Let (an)n be the sequence defined by

\[ a_n = \begin{cases} 1 & \text{if } n = p_2 p_3, \ y \le p_2 \le p_3, \ (p_2 p_3, N) = 1 \\ 0 & \text{otherwise} \end{cases} \]

Then 1− 1 ! X 1 X N k log d r (d) = a − a + O j n ϕ(d) n ϕ(d) z≤p1

Let

\[ X_j = \frac{N}{x_j}, \qquad Y_j = \min(y, (1 + \varepsilon) x_j), \qquad Z_j = \max(z, x_j) = x_j \]

Then

1− 1 ! X X 1 X X N k log d r (d) = a − a + O j n ϕ(d) n ϕ(d) n< N z≤p1

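Let

\[ D = \frac{N^{1/2}}{\log^7 N} \]

which is greater than $y = N^{1/3}$. Then

\[ DQ < \frac{N^{1/2}}{\log^6 N} < \frac{N^{1/2}}{\log^6 y} \min\!\left( \frac{y}{x_j}, 1 + \varepsilon \right)^{1/2} = \frac{1}{\log^6 y} \left( \frac{N Y_j}{x_j} \right)^{1/2} = \frac{(X_j Y_j)^{1/2}}{\log^6 y} \le \frac{(X_j Y_j)^{1/2}}{\log^6 Y_j} \]

since $Q < \log N$ for $N$ big enough and both $\frac{y}{x_j}$ and $1 + \varepsilon$ are greater than 1. By Theorem 3.5,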


with X = Xj, Y = Yj, Z = Zj, h = N and A = 6 X Rj = |rj(d)| = d

1− 1 ! X X X 1 X X X N k log d = an − an + O ϕ(d) ϕ(d) d

1− 1 ! X X X 1 X X X N k log d ≤ an − an + O √ ϕ(d) √ ϕ(d) XjYj n

12 12 12 2 N N log Yj ≤ log y < log N < N 3 = ≤ = Xj y xj In addition N N XjYj = min(y, (1 + ε)xj) ≤ (1 + ε)xj = N(1 + ε) ≤ 2N <> log N 1 since clearly log y = 3 log N >> log N and log((1 + ε)xj) > log xj ≥ log x0 = log z = 1 k log N >> log N. Moreover, by Lemma B.4

1 1 X 1 X 1 (XjYj) 2 N 2 ≤ << log 6 << log 6 √ ϕ(d) √ ϕ(d) log Yj log N XjYj XjYj d< 6 d< 6 log Yj log Yj d|P (y)

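Therefore
$$R_j \ll \frac{N\log^2 N}{\log^6 N} + N^{1-\frac{1}{k}}\log^2\frac{N^{1/2}}{\log^6 N} \ll \frac{N}{\log^4 N}$$
Hence
$$S(B^j, P, y) < \left(\Phi\left(\frac{\log D}{\log y}\right) + O(\varepsilon)\right)|B^j|V(y) + O\left(\frac{N}{\log^4 N}\right)$$
where
$$\frac{\log D}{\log y} = \frac{\frac{1}{2}\log N - 7\log\log N}{\frac{1}{3}\log N} = \frac{3}{2} - 21\frac{\log\log N}{\log N}$$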


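By Lemma 2.4
$$\Phi\left(\frac{\log D}{\log y}\right) = \frac{2e^{\gamma}}{\frac{3}{2} - 21\frac{\log\log N}{\log N}} = \frac{4e^{\gamma}}{3} + O\left(\frac{\log\log N}{\log N}\right) = \frac{4e^{\gamma}}{3} + O(\varepsilon)$$
Next, by Theorem 4.3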

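$$\frac{V(y)}{V(z)} = \frac{\frac{3e^{-\gamma}S(N)}{\log N}\left(1 + O\left(\frac{1}{\log N}\right)\right)}{\frac{ke^{-\gamma}S(N)}{\log N}\left(1 + O\left(\frac{1}{\log N}\right)\right)} = \frac{3}{k}(1 + O(\varepsilon)) = \frac{3}{k} + O(\varepsilon)$$
whence
$$S(B^j, P, y) < \left(\frac{4e^{\gamma}}{3} + O(\varepsilon)\right)|B^j|\left(\frac{3}{k} + O(\varepsilon)\right)V(z) + O\left(\frac{N}{\log^4 N}\right) = \left(\frac{4e^{\gamma}}{k} + O(\varepsilon)\right)|B^j|V(z) + O\left(\frac{N}{\log^4 N}\right)$$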

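Recall that the number of sets $B^j$ is $O\left(\frac{\log N}{\varepsilon}\right)$. Thus
$$S(B, P, y) \le \sum_j S(B^j, P, y) < \left(\frac{4e^{\gamma}}{k} + O(\varepsilon)\right)\sum_j |B^j| V(z) + \sum_j O\left(\frac{N}{\log^4 N}\right) = \left(\frac{4e^{\gamma}}{k} + O(\varepsilon)\right)|B'|V(z) + O\left(\frac{N}{\varepsilon\log^3 N}\right)$$
Finally, $|B'|$ is to be estimated. From the definition of $B'$ it follows that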

$$B' \subseteq \{N - p_1p_2p_3 : z \le p_1 < y \le p_2 \le p_3,\ p_1p_2p_3 < (1+\varepsilon)N\}$$

since, given $N - p_1p_2p_3 \in B^j$, for any $j$

p1p2p3 < (1 + ε)xjp2p3 < (1 + ε)N Then X |B0| ≤ 1

z≤p1


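since $p_1p_2 < \frac{N}{p_3} \le \frac{N}{y} = N^{2/3}$. There exists $N(\varepsilon) > 0$ such that the error term $1 + O\left(\frac{1}{\log N}\right) < 1 + \varepsilon$ for $N \ge N(\varepsilon)$, in which case

$$(1+\varepsilon)\left(1 + O\left(\frac{1}{\log N}\right)\right) < (1+\varepsilon)^2 = 1 + 2\varepsilon + \varepsilon^2 < 1 + 3\varepsilon$$

Thus
$$\pi\left(\frac{(1+\varepsilon)N}{p_1p_2}\right) < \frac{(1+3\varepsilon)N}{p_1p_2\log\frac{N}{p_1p_2}}$$
for $N$ big enough. Therefore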

X X X (1 + ε)N  |B0| ≤ 1 ≤ 1 ≤ π p1p2 z≤p1

−1 1 where w = ((1 + ε)Np ) 2 for convenience purposes only. In a similar way as in the the P 1 bound of z≤q

1 f(x) = log N p1x defined for 0 < x < N . Then, by twice integrating by parts and using Lemma A.3 in the p1 form X 1  1  = log log x + a + O p log x p 0

Z w ! X 1 X f(p2) X 1 N = = f(x) d p2 log p2 y p y≤p2

First
$$O\left(\frac{f(w)}{\log w}\right) + O\left(\frac{f(y)}{\log y}\right) = O\left(\frac{f(w)}{\log y}\right) = O\left(\frac{1}{\log^2 N}\right)$$

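since $f$ is increasing and
$$\frac{f(w)}{\log y} = \frac{1}{\log y\,\log\frac{N}{p_1w}} = \frac{1}{\log y\cdot\frac{1}{2}\log\frac{N}{(1+\varepsilon)p_1}} \ll \frac{1}{\log y\,\log\frac{N}{y}} = \frac{1}{\frac{1}{3}\log N\cdot\frac{2}{3}\log N} \ll \frac{1}{\log^2 N}$$
Second
$$\int_y^w \frac{df(x)}{\log x} = \int_y^w \frac{dx}{x\log x\,\log^2\frac{N}{p_1x}} \le \int_y^w \frac{dx}{x\log y\,\log^2\frac{N}{p_1w}} = \frac{1}{\frac{1}{3}\log N\left(\frac{1}{2}\log\frac{N}{(1+\varepsilon)p_1}\right)^2}\int_y^w \frac{dx}{x}$$
$$= \frac{12(\log w - \log y)}{\log N\,\log^2\frac{N}{(1+\varepsilon)p_1}} \le \frac{12\left(\frac{1}{2}\log N - \frac{1}{3}\log N\right)}{\log N\,\log^2\frac{N}{(1+\varepsilon)p_1}} \ll \frac{\log N}{\log N\,\log^2\frac{N}{y}} = \frac{1}{\left(\frac{2}{3}\log N\right)^2} = O\left(\frac{1}{\log^2 N}\right)$$
Third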

$$\int_y^w f(x)\,d(\log\log x + a) = \int_y^{(N/p_1)^{1/2}} f(x)\,d\log\log x + \int_{(N/p_1)^{1/2}}^w f(x)\,d\log\log x$$
where, using the change of variables $x = \left(\frac{N}{p_1}\right)^{1/2}u$

√ q Z w Z w Z 1+ε N du dx p1 f(x) d log log x = = 1 1 N q q  ( N )2 ( N )2 x log x log 1 N N N p p p1x u log u log 1 1 p1 p1 p N p1 u p1 √ Z 1+ε du =     1 u 1 log N + log u 1 log N − log u 2 p1 2 p1 √ √ Z 1+ε du 1 Z 1+ε du =   << 2 1 u 1 log2 N − log2 u log N 1 u 4 p1 1 log(1 + ε) 1 log 2 1 = 2 < 2 << log2 N log2 N log2 N since    √   2  1 log2 N − log2 u > 1 log2 N − log2 1 + ε = 1 2 log N − log2 2 >> log2 N 4 p1 4 y 4 3 Whence 1 N 2 Z ( p )   X 1 1 1 N = f(x) d log log x + O 2 p2 log y log N y≤p2

1 N 2 Z ( p )   0 X 1 f(x) d log log x (1 + 3ε)N X 1 |B | < (1 + 3ε)N + O 2 y p1 log N p1 z≤p1

since, by Lemma A.3

$$\sum_{z \le p_1 < y}\frac{1}{p_1} = \log\left(\tfrac{1}{3}\log N\right) - \log\left(\tfrac{1}{k}\log N\right) + O(1) = \log\frac{k}{3} + O(1) = O(1)$$

$$g(t) = \int_y^{(N/t)^{1/2}} \frac{d\log\log x}{\log\frac{N}{tx}}$$
defined for $z \le t \le y$. Then, once again, by twice integrating by parts and Lemma A.3

1 N 2 ! Z ( p ) Z y X 1 d log log x X g(p1) X 1 N = = g(t) d y p1 log p1 z p z≤p1

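$$\frac{g(z)}{\log z} = \frac{1}{\log z}\int_y^{(N/z)^{1/2}} \frac{d\log\log x}{\log\frac{N}{zx}} = \frac{1}{\log z}\int_y^{N^{\frac{k-1}{2k}}} \frac{d\log\log x}{\log\frac{N^{1-\frac{1}{k}}}{x}} = \frac{1}{\log z}\int_{\frac{1}{3}}^{\frac{k-1}{2k}} \frac{du}{u\left(1 - \frac{1}{k} - u\right)\log N}$$
$$= \frac{1}{\log z\,\log N}\int_{\frac{1}{3}}^{\frac{k-1}{2k}} \frac{du}{u\left(1 - \frac{1}{k} - u\right)} \ll \frac{1}{\log z\,\log N} \ll \frac{1}{\log^2 N}$$
Second, by the chain rule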

N 0 1 − t2 −2 g (t) = = 2 N q N q N N q N log log √ 2 t log t t t N t t t and therefore Z y Z y Z y Z y dg(t) 2dt 2dt 2 dt = ≤ = log t 2 N 2 N 1 2 2 t z z t log t log t z t log z log y k log N( 3 log N) z 9k( 1 log N − 1 log N) 1 = 3 k << 2 log3 N log2 N


Last but not least, by the changes of variables $x = N^u$ and $t = N^w$

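$$\int_z^y g(t)\,d(\log\log t + a) = \int_z^y \int_y^{(N/t)^{1/2}} \frac{d\log\log x}{\log\frac{N}{tx}}\,d\log\log t$$
$$= \int_{\frac{1}{k}}^{\frac{1}{3}}\int_{\frac{1}{3}}^{\frac{1-w}{2}} \frac{du\,dw}{uw(1-w-u)\log N} = \frac{1}{\log N}\int_{\frac{1}{k}}^{\frac{1}{3}} \frac{1}{w}\int_{\frac{1}{3}}^{\frac{1-w}{2}} \frac{du}{u(1-w-u)}\,dw$$
where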

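$$\int_{\frac{1}{3}}^{\frac{1-w}{2}} \frac{du}{u(1-w-u)} = \frac{1}{1-w}\int_{\frac{1}{3}}^{\frac{1-w}{2}}\left(\frac{1}{u} + \frac{1}{1-w-u}\right)du$$
$$= \frac{\log\frac{1-w}{2} - \log\frac{1}{3} - \log\left(1-w-\frac{1-w}{2}\right) + \log\left(1-w-\frac{1}{3}\right)}{1-w}$$
$$= \frac{\log\frac{1-w}{2} + \log 3 - \log\frac{1-w}{2} + \log\left(\frac{2}{3}-w\right)}{1-w} = \frac{\log(2-3w)}{1-w}$$
whence
$$\int_z^y g(t)\,d(\log\log t + a) = \frac{1}{\log N}\int_{\frac{1}{k}}^{\frac{1}{3}} \frac{\log(2-3w)}{w(1-w)}\,dw$$
Let
$$I_k = \int_{\frac{1}{k}}^{\frac{1}{3}} \frac{\log(2-3w)}{w(1-w)}\,dw$$
Then
$$\sum_{z \le p_1 < y}\frac{1}{p_1}\int_y^{(N/p_1)^{1/2}} \frac{d\log\log x}{\log\frac{N}{p_1x}} = \frac{I_k}{\log N} + O\left(\frac{1}{\log^2 N}\right)$$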

$$S(B, P, y) < \left(\frac{4e^{\gamma}}{k} + O(\varepsilon)\right)(I_k + O(\varepsilon))\frac{N}{\log N}V(z) + O\left(\frac{N}{\varepsilon\log^3 N}\right)$$

Whence

Theorem 4.8. Let $k > 3$. Then
$$S(B, P, y) < \left(\frac{4e^{\gamma}I_k}{k} + O(\varepsilon)\right)\frac{N}{\log N}V(z) + O\left(\frac{N}{\varepsilon\log^3 N}\right)$$
for large enough $N$.


4.6 Completion of the Proof

Let $N$ be big enough and $6 < k \le 8$. Recall Theorem 4.4, Theorem 4.6 and Theorem 4.8
$$S(A, P, z) > \left(\frac{4e^{\gamma}\log\frac{k-2}{2}}{k} + O(\varepsilon)\right)\frac{N}{\log N}V(z) + O\left(\frac{N}{\log^3 N}\right)$$

X 4eγ log(k − 2)  N  N  S(Aq, P, z) < + O(ε) V (z) + O k log N log3 N z≤q

1 X 1 1− 1 1 1 R(N) > S(A, P, z) − S(A , P, z) − S(B, P, y) − N k − N 3 2 q 2 2 z≤q 2 log − log(k − 2) − Ik + O(ε) V (z) + O 2 k log N ε log3 N      k−2  2N 1 N = log − Ik + O(ε) S(N) 1 + O + O 4 log2 N log N ε log3 N since     N 1− 1 1 1 N O − N k − N 3 = O log3 N 2 ε log3 N The above bound becomes of any use when

$$I(k) = \log\frac{k-2}{4} - I_k = \log\frac{k-2}{4} - \int_{\frac{1}{k}}^{\frac{1}{3}} \frac{\log(2-3x)}{x(1-x)}\,dx > 0$$
Set $k = 8$. Then

$$I(8) = \log\frac{3}{2} - \int_{\frac{1}{8}}^{\frac{1}{3}} \frac{\log(2-3x)}{x(1-x)}\,dx = 0.04238\ldots > 0$$
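The value of $I(8)$ can be checked numerically. The following is a minimal sketch, not part of the original computation: it approximates the integral with a composite Simpson rule, and the helper names `integrand`, `simpson`, `I` and `sign_change` are hypothetical names chosen for this illustration.

```python
import math


def integrand(x: float) -> float:
    """Integrand log(2 - 3x) / (x (1 - x)) appearing in I_k."""
    return math.log(2.0 - 3.0 * x) / (x * (1.0 - x))


def simpson(f, a: float, b: float, n: int = 2000) -> float:
    """Composite Simpson rule on [a, b] with an even number n of subintervals."""
    if n % 2:
        n += 1
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3.0


def I(k: float) -> float:
    """I(k) = log((k - 2) / 4) - integral of the integrand over [1/k, 1/3]."""
    return math.log((k - 2.0) / 4.0) - simpson(integrand, 1.0 / k, 1.0 / 3.0)


def sign_change(lo: float, hi: float, steps: int = 60) -> float:
    """Bisection for a zero of I, assuming I(lo) < 0 < I(hi)."""
    for _ in range(steps):
        mid = 0.5 * (lo + hi)
        if I(mid) < 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


if __name__ == "__main__":
    print("I(8) =", I(8.0))
    print("zero of I between 7 and 8:", sign_change(7.0, 8.0))
```

The same function `I` can be evaluated at any other $k > 3$; `sign_change` is a plain bisection and is only meaningful on an interval where $I$ changes sign.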

Pick a small enough value $0 < \varepsilon < \frac{1}{200}$ such that $I(8) + O(\varepsilon) > 0.04$

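For said fixed value of $\varepsilon$
$$O\left(\frac{N}{\varepsilon\log^3 N}\right) = O\left(\frac{N}{\log^3 N}\right) = O\left(\frac{S(N)N}{\log^3 N}\right) = S(N)\frac{N}{\log^2 N}\,O\left(\frac{1}{\log N}\right)$$
since $S(N) \gg 1$. Therefore

$$R(N) > 0.08\,S(N)\frac{N}{\log^2 N}\left(1 + O\left(\frac{1}{\log N}\right)\right)$$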


Thus, for big enough $N$
$$R(N) \gg S(N)\frac{N}{\log^2 N}$$
which completes the proof of Theorem 4.1.

Nevertheless, $k = 8$ is not the best choice for $k$. Of course, the statement of Theorem 4.1 does not depend on its specific value. However, it has been made clear throughout the proof that the closer $z = N^{1/k}$ is to $y = N^{1/3}$, the more information is obtained about the range where the prime $p$ and the prime factors of $N - p$ lie. The problem thus becomes that of finding a zero of $I(k)$ as a function of $k$. Its derivative is positive whenever $6 < k \le 8$, meaning $I(k)$ is increasing with $k$. Moreover

I(7) = −0.06761... < 0

This is the reason why k = 8 is finally used to conclude Theorem 4.1, since it is the only possible integer value of k. The ideal value for k lies between 7.585 and 7.586. Take

k = 7.586

Then $I(7.586) = 1.0126\ldots\cdot 10^{-4} > 0$ and
$$N^{\frac{1}{7.586}} = N^{0.1318\ldots}$$
Thus, the complete and unabbreviated statement of Theorem 4.1 ought to be:

Let $N$ be a large enough even natural number and let $R(N)$ denote the number of representations of $N$ as $N = p + (N - p)$, where $p < N$ is a prime not dividing $N$ and

(i) $N - p = 1$, or
(ii) $N - p$ is a prime greater than or equal to $N^{0.1319}$, or
(iii) $N - p$ is the product of exactly two primes, both greater than or equal to $N^{1/3}$, or
(iv) $N - p$ is the product of exactly two primes, one of them greater than or equal to $N^{0.1319}$ and smaller than $N^{1/3}$, and the other greater than or equal to $N^{1/3}$.

Then
$$R(N) > 2.025\cdot 10^{-4}\,S(N)\frac{N}{\log^2 N}\left(1 + O\left(\frac{1}{\log N}\right)\right)$$
In particular
$$R(N) \gg S(N)\frac{N}{\log^2 N}$$
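For small even $N$ the kind of representation counted above can be listed by brute force. The sketch below is only an illustration under simplifying assumptions: it counts pairs $(p, N - p)$ with $p$ an odd prime and $N - p$ a product of one or two primes, and it ignores the size conditions (ii) to (iv), the case $N - p = 1$ and the requirement that $p$ does not divide $N$. The function names are hypothetical.

```python
def omega_with_multiplicity(m: int) -> int:
    """Number of prime factors of m counted with multiplicity."""
    count, p = 0, 2
    while p * p <= m:
        while m % p == 0:
            m //= p
            count += 1
        p += 1
    if m > 1:
        count += 1
    return count


def is_prime(m: int) -> bool:
    """True when m is a prime number."""
    return m >= 2 and omega_with_multiplicity(m) == 1


def chen_representations(n: int) -> list[tuple[int, int]]:
    """Pairs (p, n - p), p an odd prime, n - p a product of one or two primes."""
    return [
        (p, n - p)
        for p in range(3, n - 1)
        if is_prime(p) and 1 <= omega_with_multiplicity(n - p) <= 2
    ]


if __name__ == "__main__":
    for n in (10, 20, 100):
        print(n, chen_representations(n))
```

Trial division is enough here because the sketch is meant for small $N$ only; it says nothing about the asymptotic bound itself.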

Appendices

A Arithmetic Functions

Lemma A.1 (Abel's Summation Formula). Let $\{a_n\}_{n \ge 1}$ be a sequence of real numbers, $A(x) = \sum_{n \le x} a_n$, and $f$ a function with continuous derivative on $[1, \infty)$. Then
$$\sum_{n \le x} a_n f(n) = A(x)f(x) - \int_1^x A(u)f'(u)\,du$$
for any $x \ge 1$.

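Proof. Let $x \ge 1$ and set $k = [x]$ and $a_0 = 0$. Then
$$\sum_{n \le x} a_n(f(x) - f(n)) = \sum_{n \le k}(A(n) - A(n-1))(f(x) - f(n))$$
$$= \sum_{n \le k} A(n)(f(x) - f(n)) - \sum_{n \le k-1} A(n)(f(x) - f(n+1))$$
$$= \sum_{n \le k-1} A(n)(f(n+1) - f(n)) + A(k)(f(x) - f(k))$$
$$= \sum_{n \le k-1} A(n)\int_n^{n+1} f'(u)\,du + A(k)\int_k^x f'(u)\,du$$
$$= \sum_{n \le k-1}\int_n^{n+1} A(u)f'(u)\,du + \int_k^x A(u)f'(u)\,du = \int_1^x A(u)f'(u)\,du$$
Since the left-hand side equals $A(x)f(x) - \sum_{n \le x} a_n f(n)$, the formula follows.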
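Since $A(u)$ is constant on each interval $[n, n+1)$, both sides of Abel's formula can be evaluated exactly for a concrete sequence. The following sketch does so with rational arithmetic; the function name `abel_sides` and the sample choices $a_n = n^2$, $f(u) = 1/u$ are assumptions made only for this illustration.

```python
from fractions import Fraction
from math import floor


def abel_sides(a, f, x):
    """Return (left, right) sides of Abel's summation formula at x >= 1.

    a(n) is the sequence, f the test function; only values of f are needed,
    because A(u) is a step function and the integral of A(u) f'(u) splits
    into exact differences of f.
    """
    k = floor(x)
    # Partial sums A(0), A(1), ..., A(k), with A(0) = 0.
    partial = [Fraction(0)]
    for n in range(1, k + 1):
        partial.append(partial[-1] + a(n))
    left = sum(a(n) * f(n) for n in range(1, k + 1))
    # Integral of A(u) f'(u) over [1, x], piece by piece.
    integral = sum(partial[n] * (f(n + 1) - f(n)) for n in range(1, k))
    integral += partial[k] * (f(x) - f(k))
    right = partial[k] * f(x) - integral
    return left, right


if __name__ == "__main__":
    lhs, rhs = abel_sides(
        lambda n: Fraction(n * n), lambda u: 1 / Fraction(u), Fraction(21, 2)
    )
    print(lhs, rhs)
```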

Lemma A.2. Let $d(n) = \sum_{d \mid n} 1$ be the divisor function. Then

$$\sum_{n \le x}\frac{d(n)}{n} \gg \log^2 x$$

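Proof. Let

$$A(x) = \sum_{n \le x} d(n) = \sum_{n \le x}\sum_{d \mid n} 1 = \sum_{d \le x}\sum_{\substack{n \le x \\ d \mid n}} 1 = \sum_{d \le x}\left[\frac{x}{d}\right] = x\sum_{d \le x}\frac{1}{d} + O(x)$$
where
$$\sum_{d \le x}\frac{1}{d} = 1 + \sum_{2 \le d \le x}\frac{1}{d} < 1 + \int_1^x \frac{du}{u} = 1 + \log x$$
which implies
$$\sum_{d \le x}\frac{1}{d} = \log x + O(1)$$
Then
$$A(x) = x\log x + O(x)$$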


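By Lemma A.1 with $a_n = d(n)$ and $f(x) = \frac{1}{x}$
$$\sum_{n \le x}\frac{d(n)}{n} = \frac{A(x)}{x} + \int_1^x \frac{A(u)}{u^2}\,du = \log x + O(1) + \int_1^x \frac{\log u}{u}\,du + O\left(\int_1^x \frac{du}{u}\right)$$
$$= \log x + O(1) + \frac{\log^2 x}{2} + O(\log x) \gg \log^2 x$$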
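The two counting steps of this proof are easy to observe numerically. The sketch below, with hypothetical function names, checks the exact identity $\sum_{n \le x} d(n) = \sum_{d \le x} [x/d]$ and prints the ratio of $\sum_{n \le x} d(n)/n$ to $\log^2 x$, which should stay bounded away from zero.

```python
import math


def divisor_counts(x: int) -> list[int]:
    """d(n) for 0 <= n <= x, by adding 1 at every multiple of each k."""
    d = [0] * (x + 1)
    for k in range(1, x + 1):
        for multiple in range(k, x + 1, k):
            d[multiple] += 1
    return d


def summatory_divisor(x: int) -> int:
    """A(x): the sum of d(n) over n <= x."""
    return sum(divisor_counts(x)[1:])


def floor_sum(x: int) -> int:
    """Sum of [x / d] over d <= x, the other side of the identity."""
    return sum(x // k for k in range(1, x + 1))


def weighted_sum(x: int) -> float:
    """Sum of d(n) / n over n <= x."""
    d = divisor_counts(x)
    return sum(d[n] / n for n in range(1, x + 1))


if __name__ == "__main__":
    for x in (10, 100, 1000, 10000):
        print(x, summatory_divisor(x) == floor_sum(x),
              weighted_sum(x) / math.log(x) ** 2)
```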

Lemma A.3. Let x ≥ 1. Then X 1 = log log x + O(1) p p 0. Proof. Let π(x) be the prime counting function. By the Prime Number Theorem x   1  π(x) = 1 + O log x log x Then X 1 X π(n) − π(n − 1) X π(n) X π(n) = = − p n n n + 1 p
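Lemma A.3 can be observed numerically by comparing the sum of reciprocals of the primes below $x$ with $\log\log x$. The sketch below is an illustration only; the sieve helper `primes_below` is a hypothetical name, and the reference value 0.2615 for the limiting difference (the Meissel-Mertens constant) is a standard fact that is not taken from the text.

```python
import math


def primes_below(x: int) -> list[int]:
    """All primes p < x, by the sieve of Eratosthenes."""
    if x < 3:
        return []
    flags = bytearray([1]) * x
    flags[0] = flags[1] = 0
    for p in range(2, math.isqrt(x - 1) + 1):
        if flags[p]:
            # Cross out the multiples of p starting at p * p.
            flags[p * p :: p] = bytearray(len(range(p * p, x, p)))
    return [p for p in range(2, x) if flags[p]]


def reciprocal_prime_sum(x: int) -> float:
    """Sum of 1 / p over the primes p < x."""
    return sum(1.0 / p for p in primes_below(x))


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        print(x, reciprocal_prime_sum(x) - math.log(math.log(x)))
```

The printed differences should settle near a constant, in line with the $O(1)$ term of the lemma.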

Lemma A.4. Let $\omega(n) = \sum_{p \mid n} 1$ be the prime divisor function. Then $\omega(n) \le 2\log n$ for all $n \in \mathbb{N}$.


Proof. Assume there exists $n$ with $\omega(n) > 2\log n$. Then
$$n \ge \prod_{p \mid n} p \ge \prod_{p \mid n} 2 = 2^{\omega(n)} > 2^{2\log n} = e^{2\log n\log 2} > e^{\log n} = n$$
which is a contradiction. Thus, no such $n$ can exist.

Lemma A.5. Let $f$ be a multiplicative function. Then $f([m, n])f((m, n)) = f(m)f(n)$ for every $m, n \in \mathbb{N}$.

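Proof. Let $p_1, \ldots, p_r$ be the primes dividing $m$ or $n$. Then, write $m = \prod_{i=1}^r p_i^{\alpha_i}$ and $n = \prod_{i=1}^r p_i^{\beta_i}$, where $\alpha_i, \beta_i$ are nonnegative integers. Then
$$[m, n] = \prod_{i=1}^r p_i^{\max(\alpha_i, \beta_i)}$$
and
$$(m, n) = \prod_{i=1}^r p_i^{\min(\alpha_i, \beta_i)}$$

If $\max(\alpha_i, \beta_i) = \alpha_i$, then $\min(\alpha_i, \beta_i) = \beta_i$, and vice versa. Therefore
$$f([m, n])f((m, n)) = f\left(\prod_{i=1}^r p_i^{\max(\alpha_i, \beta_i)}\right) f\left(\prod_{i=1}^r p_i^{\min(\alpha_i, \beta_i)}\right) = \prod_{i=1}^r f(p_i^{\alpha_i})\prod_{i=1}^r f(p_i^{\beta_i}) = f(m)f(n)$$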
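Lemmas A.4 and A.5 can both be spot-checked on small integers. The sketch below is only an illustration: it uses trial-division factorisation, takes Euler's totient as the sample multiplicative function for Lemma A.5, and all function names are hypothetical.

```python
import math


def prime_factorization(n: int) -> dict[int, int]:
    """Map each prime divisor of n to its exponent (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors


def omega(n: int) -> int:
    """Number of distinct prime divisors of n."""
    return len(prime_factorization(n))


def euler_phi(n: int) -> int:
    """Euler's totient, via n times the product of (1 - 1/p) over p | n."""
    result = n
    for p in prime_factorization(n):
        result -= result // p
    return result


def lcm_gcd_identity_holds(f, m: int, n: int) -> bool:
    """Check f([m, n]) f((m, n)) == f(m) f(n) for one pair (m, n)."""
    g = math.gcd(m, n)
    return f(m * n // g) * f(g) == f(m) * f(n)


if __name__ == "__main__":
    print(all(omega(n) <= 2 * math.log(n) for n in range(1, 5000)))
    print(all(lcm_gcd_identity_holds(euler_phi, m, n)
              for m in range(1, 80) for n in range(1, 80)))
```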

For brevity's sake, the following result will not be proved. For a detailed proof see pages 165-166 of [5] and pages 65-67 of [1], for instance.

Theorem A.6 (Mertens' formula). Let $x \ge 2$. Then

$$\prod_{p < x}\left(1 - \frac{1}{p}\right) = \frac{e^{-\gamma}}{\log x}\left(1 + O\left(\frac{1}{\log x}\right)\right)$$

Lemma A.7. Let x ≥ 2 and ε > 0. Then, there exists n1(ε) such that −1 Y  1 log x 1 − < (1 + ε) p log u u≤p


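Proof. Let $s = \frac{e^{\gamma}\varepsilon}{2+\varepsilon}$. Then
$$\frac{e^{\gamma} + s}{e^{\gamma} - s} = \frac{(2+\varepsilon)e^{\gamma} + e^{\gamma}\varepsilon}{(2+\varepsilon)e^{\gamma} - e^{\gamma}\varepsilon} = \frac{2 + 2\varepsilon}{2} = 1 + \varepsilon$$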

By Mertens’ formula (Theorem A.6), there exists n1(s) = n1(ε) such that

−1 Y  1 (eγ − s) log x < 1 − < (eγ + s) log x p p u ≥ n1(ε). Then

−1  1   −1 Q 1 − γ Y 1 p

B The Möbius and Euler's Totient Functions

The Möbius function, defined by
$$\mu(d) = \begin{cases} 1 & \text{if } d = 1 \\ (-1)^{\omega(d)} & \text{if } d > 1 \text{ is square-free} \\ 0 & \text{otherwise} \end{cases}$$
is a multiplicative function with many interesting and useful properties.

Theorem B.1. Let $f$ be a multiplicative function with $f(1) = 1$. Then
$$\sum_{d \mid n}\mu(d)f(d) = \prod_{p \mid n}(1 - f(p))$$
for every $n \in \mathbb{N}$. In particular
$$\sum_{d \mid n}\mu(d) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{otherwise} \end{cases}$$

n1 nr Proof. The result trivially holds for n = 1. Assume n > 1. Let n = p1 . . . pr , where p1, . . . , pr are distinct primes, ni > 0 and r = ω(n) > 0. Let N = p1 . . . pr. Then X X X X X µ(d)f(d) = µ(d)f(d) = f(1) − f(pi) + f(pipj) − f(pipjpk) + ...

d|n d|N pi pi

pi pi 1, and clearly P d|1 µ(d) = µ(1) = 1.


The Dirichlet convolution of two arithmetic functions $f$ and $g$ is defined by
$$(f * g)(n) = \sum_{d \mid n} f(d)\,g\left(\frac{n}{d}\right) = \sum_{d \mid n} f\left(\frac{n}{d}\right)g(d)$$

Therefore $f * g = g * f$. Moreover, the Dirichlet convolution is associative, meaning $f * (g * h) = (f * g) * h$ for all arithmetic functions $f$, $g$ and $h$, and preserves multiplicativity, that is, $f * g$ is multiplicative if both $f$ and $g$ are multiplicative. Let
$$\delta(n) = \begin{cases} 1 & \text{if } n = 1 \\ 0 & \text{otherwise} \end{cases}$$

Then, for every $f$
$$(f * \delta)(n) = \sum_{d \mid n} f(d)\,\delta\left(\frac{n}{d}\right) = f(n)$$

Finally, let 1(n) = 1. Then, Theorem B.1 can be stated as follows

µ ∗ 1 = δ

Theorem B.2 (Möbius inversion). Let $D$ be a divisor-closed set of natural numbers and $f$ and $g$ two arithmetic functions defined on $D$. Then
$$f(n) = (g * \mu)(n) = \sum_{d \mid n} g(d)\,\mu\left(\frac{n}{d}\right)$$
if and only if
$$g(n) = (f * 1)(n) = \sum_{d \mid n} f(d)$$

Proof. Assume first $f = g * \mu$. Then

f ∗ 1 = (g ∗ µ) ∗ 1 = g ∗ (µ ∗ 1) = g ∗ δ = g

Assume now that g = f ∗ 1. Then

g ∗ µ = (f ∗ 1) ∗ µ = f ∗ (1 ∗ µ) = f ∗ δ = f
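The identity $\mu * 1 = \delta$ and the inversion formula can be checked directly on small arguments. The sketch below uses naive divisor enumeration; the function names and the sample function $f(n) = n^2 + 1$ are assumptions made only for this illustration.

```python
def mobius(n: int) -> int:
    """Moebius function by trial division."""
    result, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    if n > 1:
        result = -result
    return result


def divisors(n: int) -> list[int]:
    """All positive divisors of n."""
    return [d for d in range(1, n + 1) if n % d == 0]


def dirichlet(f, g):
    """Dirichlet convolution f * g, returned as a function of n."""
    return lambda n: sum(f(d) * g(n // d) for d in divisors(n))


def one(n: int) -> int:
    """The constant function 1(n) = 1."""
    return 1


def delta(n: int) -> int:
    """The convolution identity: 1 at n = 1 and 0 elsewhere."""
    return 1 if n == 1 else 0


if __name__ == "__main__":
    mu_star_one = dirichlet(mobius, one)
    print(all(mu_star_one(n) == delta(n) for n in range(1, 300)))

    f = lambda n: n * n + 1           # arbitrary sample function
    g = dirichlet(f, one)             # g = f * 1
    recovered = dirichlet(g, mobius)  # g * mu should give back f
    print(all(recovered(n) == f(n) for n in range(1, 200)))
```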

Theorem B.3 (Dual Möbius inversion). Let $D$ be a divisor-closed set of natural numbers and $f$ an arithmetic function defined on $D$. Let
$$g(n) = \sum_{\substack{d \in D \\ n \mid d}} f(d)$$
for all $n \in D$. Then
$$f(n) = \sum_{\substack{d \in D \\ n \mid d}} \mu\left(\frac{d}{n}\right)g(d)$$


Proof. Let n ∈ D. Then

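$$\sum_{\substack{d \in D \\ n \mid d}} \mu\left(\frac{d}{n}\right)g(d) = \sum_{\substack{d \in D \\ n \mid d}} \mu\left(\frac{d}{n}\right)\sum_{\substack{m \in D \\ d \mid m}} f(m) = \sum_{nr \in D}\mu(r)\sum_{\substack{m \in D \\ nr \mid m}} f(m)$$
$$= \sum_{nr \in D}\mu(r)\sum_{nrs \in D} f(nrs) = \sum_{na \in D} f(na)\sum_{\substack{r \in D \\ r \mid a}}\mu(r) = \sum_{na \in D} f(na)\sum_{r \mid a}\mu(r) = f(n)$$
by Theorem B.1.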

Euler's totient function is a multiplicative function defined by

$$\varphi(n) = \sum_{\substack{d \le n \\ (d, n) = 1}} 1 = n\prod_{p \mid n}\left(1 - \frac{1}{p}\right)$$

and hence $\varphi(p) = p\left(1 - \frac{1}{p}\right) = p - 1$. Moreover

Lemma B.4. Let $x > 1$. Then
$$\sum_{n < x}\frac{1}{\varphi(n)} \ll \log x$$

Proof. Given any r ∈ N, let Y R = p p|r Then X 1 X 1 Y 1 X 1 X 1 X 1 X 1 X 1 X 1 = = = = ϕ(n) n 1 − 1 n r r n r hR n

X 1 X 1 Y  1 1  Y  1  1  = = 1 + + + ... = 1 + 1 + + ... rR m p2 p3 p2 p r≥1 m≥1 p p p2|m,∀p|m ! Y 1 1 Y  1  = 1 + = 1 + < +∞ p2 1 − 1 p(p − 1) p p p
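The bound of Lemma B.4 can be observed numerically. The sketch below computes $\varphi(n)$ for all $n < x$ with a sieve and prints the ratio of $\sum_{n < x} 1/\varphi(n)$ to $\log x$, which should remain bounded; the function names are hypothetical and the code is only an illustration.

```python
import math


def totients(x: int) -> list[int]:
    """phi(n) for 0 <= n < x, multiplying in (1 - 1/p) for each prime p | n."""
    phi = list(range(x))
    for p in range(2, x):
        if phi[p] == p:  # untouched so far, hence p is prime
            for multiple in range(p, x, p):
                phi[multiple] -= phi[multiple] // p
    return phi


def reciprocal_totient_sum(x: int) -> float:
    """Sum of 1 / phi(n) over 1 <= n < x."""
    phi = totients(x)
    return sum(1.0 / phi[n] for n in range(1, x))


if __name__ == "__main__":
    for x in (100, 1000, 10000, 100000):
        print(x, reciprocal_totient_sum(x) / math.log(x))
```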


C Dirichlet Characters

A Dirichlet character modulo $d \in \mathbb{N}$ is a completely multiplicative function $\chi : \mathbb{N} \to \mathbb{C}$ such that $\chi(n+d) = \chi(n)$ for all $n \in \mathbb{N}$, and which is supported only on the integers coprime to $d$, meaning $\chi(n) \ne 0$ if and only if $(n, d) = 1$. This implies that $\chi(1) = 1$, since $\chi(1) = \chi(1)\chi(1) = \chi(1)^2$ and $\chi(1) \ne 0$ because $(1, d) = 1$. Moreover, $\chi(n_1) = \chi(n_2)$ for every $n_1 \equiv n_2 \pmod{d}$. In particular, given $n$ such that $(n, d) = 1$
$$\chi(n)^{\varphi(d)} = \chi(n^{\varphi(d)}) = \chi(1) = 1$$
since $n^{\varphi(d)} \equiv 1 \pmod{d}$ by Euler's Theorem. Hence, $\chi(n)$ is a $\varphi(d)$-th root of unity when $(n, d) = 1$, and 0 otherwise. In particular, $|\chi(n)| = 1$ for every $(n, d) = 1$. The principal character modulo $d$ is denoted by $\chi_0$ and defined by
$$\chi_0(n) = \begin{cases} 1 & \text{if } (n, d) = 1 \\ 0 & \text{otherwise} \end{cases}$$

The total number of different characters modulo d is ϕ(d), since χ are completely determined by the value at a single n such that (n, d) = 1 (for which there are ϕ(d) possible values), because of complete multiplicativity. Lemma C.1. ( X ϕ(d) if n ≡ 1 (mod d) χ(n) = 0 otherwise χ mod d Proof. Let m be such that (m, d) = 1. Then X X X χ(n) = χ(nm) = χ(m) χ(n) χ mod d χ mod d χ mod d P Assume χ(n) 6= 0. Then, χ(m) = 1, for all (m, d) = 1, which implies χ = χ0, and χPmod d P in this case χ mod d χ(n) = χ mod d 1 = ϕ(d).

Let $n$ be such that $(n, d) = 1$. Then $\chi(n) = e^{\frac{2\pi i r}{\varphi(d)}}$ for some $r$, since $\chi(n)$ is some $\varphi(d)$-th root of unity. Then
$$\bar{\chi}(n) = \overline{\chi(n)} = e^{-\frac{2\pi i r}{\varphi(d)}} = \chi(n)^{-1} = \chi(n^{-1})$$
where $\chi(n^{-1})$ is to be understood as $\chi(a)$, where $n^{-1} \equiv a \pmod{d}$.

Dirichlet characters modulo $d$ are not defined to be $d$-periodic, but rather by the requirement that the values at any $n$ and $n + d$ coincide. A Dirichlet character modulo $d$ is said to be primitive when it is in fact $d$-periodic, restricted to coprimality with $d$. Otherwise, the character is said to be imprimitive, meaning it has period strictly less than $d$. Define the Gauss sum associated to a Dirichlet character $\chi$ modulo $d$ by

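$$\tau(\chi) = \sum_{h \le d}\chi(h)\,e^{2\pi i\frac{h}{d}}$$

Lemma C.2. Let $\chi$ be a primitive Dirichlet character modulo $d$. Then

$$\chi(n) = \frac{1}{\tau(\bar{\chi})}\sum_{h \le d}\bar{\chi}(h)\,e^{2\pi i n\frac{h}{d}}$$
for all $n \in \mathbb{N}$.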


Proof. Assume first (n, d) = 1. Then

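$$\chi(n)\tau(\bar{\chi}) = \sum_{h \le d}\chi(n)\bar{\chi}(h)\,e^{2\pi i\frac{h}{d}} = \sum_{h \le d}\bar{\chi}(n^{-1}h)\,e^{2\pi i\frac{h}{d}}$$

Let $a \equiv n^{-1}h \pmod{d}$. Then $\chi(n^{-1}h) = \chi(a)$ and $e^{2\pi i\frac{h}{d}} = e^{2\pi i n\frac{a}{d}}$. Therefore

$$\chi(n)\tau(\bar{\chi}) = \sum_{a \le d}\bar{\chi}(a)\,e^{2\pi i n\frac{a}{d}}$$
where the condition that $\chi$ is primitive is left unused. Assume now $(n, d) > 1$ and $\chi$ primitive. Then $\chi(n) = 0$. It therefore needs to be proved that
$$\sum_{h \le d}\bar{\chi}(h)\,e^{2\pi i n\frac{h}{d}} = 0$$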

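Write the fraction $\frac{n}{d}$ as an irreducible fraction $\frac{n_0}{d_0}$. Then $(n_0, d_0) = 1$ and $d_0$ divides $d$. Moreover $d_0 < d$, since $(n, d) > 1$. If $d_0 = 1$, then $n$ is a multiple of $d$ and the result holds, since $\chi(n) = 0$ and $e^{2\pi i n\frac{h}{d}} = 1$ for every $h$, so that the sum reduces to $\sum_{h \le d}\bar{\chi}(h)$, which vanishes because a primitive character modulo $d > 1$ is not principal. Assume then $d_0 > 1$. Let $d' = \frac{d}{d_0}$ and write $h = sd_0 + r$, where $0 \le s < d'$ and $1 \le r \le d_0$. Then, as $h$ runs from 1 to $d$, $s$ and $r$ range over $0 \le s < d'$ and $1 \le r \le d_0$, respectively. Hence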

$$\sum_{h \le d}\bar{\chi}(h)\,e^{2\pi i n\frac{h}{d}} = \sum_{h \le d}\bar{\chi}(h)\,e^{2\pi i n_0\frac{h}{d_0}} = \sum_{s=0}^{d'-1}\sum_{r=1}^{d_0}\bar{\chi}(sd_0 + r)\,e^{2\pi i n_0 s}\,e^{2\pi i n_0\frac{r}{d_0}} = \sum_{r=1}^{d_0} e^{2\pi i n_0\frac{r}{d_0}}\sum_{s=0}^{d'-1}\bar{\chi}(sd_0 + r)$$
It is therefore enough to prove that the inner sum, say $f(r)$, is zero. That is

$$f(r) = \sum_{s=0}^{d'-1}\bar{\chi}(sd_0 + r) = 0$$
for every $1 \le r \le d_0$. Note that

$$f(r + d_0) = \sum_{s=0}^{d'-1}\bar{\chi}(sd_0 + r + d_0) = \sum_{s=0}^{d'-1}\bar{\chi}((s+1)d_0 + r) = \sum_{s=1}^{d'}\bar{\chi}(sd_0 + r) = \sum_{s=0}^{d'-1}\bar{\chi}(sd_0 + r) = f(r)$$

since $\chi(d'd_0 + r) = \chi(d + r) = \chi(r)$. However, $\chi$ is primitive. Thus, $\chi$ is $d$-periodic. In particular, $\chi$ is not $d_0$-periodic. Hence, there exist $m_1$ and $m_2$ such that $(m_1, d) = (m_2, d) = 1$ and $m_1 \equiv m_2 \pmod{d_0}$, such that $\chi(m_1) \ne \chi(m_2)$, since otherwise $\chi$ would be $d_0$-periodic. Let $m \equiv m_1m_2^{-1} \equiv 1 \pmod{d_0}$. In particular $(m, d) = 1$ and $m = 1 + kd_0$. Then

$$\bar{\chi}(m)f(r) = \sum_{s=0}^{d'-1}\bar{\chi}(smd_0 + rm) = \sum_{s=0}^{d'-1}\bar{\chi}(sd_0 + r + (sd_0 + r)kd_0) = \sum_{s=0}^{d'-1}\bar{\chi}(sd_0 + r) = f(r)$$

whence $f(r) = 0$, since $\bar{\chi}(m) \ne 1$.

Lemma C.3. Let $\chi$ be a primitive Dirichlet character modulo $d$. Then

|τ(χ)|2 = d

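Proof. If $\chi$ is primitive, then so is $\bar{\chi}$. By Lemma C.2

$$\overline{\chi(n)}\,\tau(\chi) = \sum_{h \le d}\chi(h)\,e^{2\pi i n\frac{h}{d}}$$
for all $n \in \mathbb{N}$. Hence
$$|\tau(\chi)|^2\varphi(d) = |\tau(\chi)|^2\sum_{\substack{1 \le n \le d \\ (n, d) = 1}} 1 = |\tau(\chi)|^2\sum_{n \le d}|\chi(n)|^2 = \sum_{n \le d}|\chi(n)|^2|\tau(\chi)|^2$$
$$= \sum_{n \le d}\sum_{h_1 \le d}\sum_{h_2 \le d}\chi(h_1)\bar{\chi}(h_2)\,e^{2\pi i n\frac{h_1}{d}}\,e^{-2\pi i n\frac{h_2}{d}} = \sum_{h_1 \le d}\sum_{h_2 \le d}\chi(h_1)\bar{\chi}(h_2)\sum_{n \le d} e^{2\pi i n\frac{h_1 - h_2}{d}}$$
$$= \sum_{h_1 \le d}\chi(h_1)\bar{\chi}(h_1)\sum_{n \le d} 1 = d\sum_{h \le d}|\chi(h)|^2 = d\sum_{\substack{1 \le h \le d \\ (h, d) = 1}} 1 = d\,\varphi(d)$$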
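Lemma C.3 can be checked numerically on a concrete primitive character. The sketch below is an illustration under an assumption not made in the text: it uses the quadratic character modulo an odd prime $p$ (the Legendre symbol), which is a standard example of a primitive character. The function names are hypothetical.

```python
import cmath


def legendre_character(p: int):
    """Quadratic character modulo an odd prime p, with values in {-1, 0, 1}."""
    def chi(n: int) -> int:
        r = pow(n % p, (p - 1) // 2, p)  # Euler's criterion: r is 0, 1 or p - 1
        return -1 if r == p - 1 else r
    return chi


def gauss_sum(chi, d: int) -> complex:
    """tau(chi): the sum of chi(h) * exp(2 pi i h / d) over 1 <= h <= d."""
    return sum(chi(h) * cmath.exp(2j * cmath.pi * h / d) for h in range(1, d + 1))


if __name__ == "__main__":
    for p in (3, 5, 7, 11, 13, 101):
        tau = gauss_sum(legendre_character(p), p)
        print(p, abs(tau) ** 2)
```

Each printed value should agree with the modulus $p$ up to floating-point error.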

References

[1] Cojocaru, A.C. and Murty, M.R., An Introduction to Sieve Methods and their Applications. London Mathematical Society 66. Cambridge University Press, 2006.

[2] Davenport, H., Multiplicative Number Theory. Graduate Texts in Mathematics 74, second edition. Springer-Verlag, 1980.

[3] Halberstam, H. and Richert, H.E., Sieve Methods. London Mathematical Society. Academic Press, 1974.

[4] Hooley, C., Applications of sieve methods to the theory of numbers. Cambridge Tracts in Mathematics 70. Cambridge University Press, 1976.

[5] Nathanson, M.B., Additive Number Theory. The Classical Bases. Graduate Texts in Mathematics 164. Springer-Verlag, 1996.