
ANALYTIC NUMBER THEORY (Spring 2019)

Eero Saksman

Chapter 1

Arithmetic functions

1.1 Basic properties of arithmetic functions

We denote by $\mathbb{P} := \{2, 3, 5, 7, 11, \dots\}$ the set of prime numbers. Any function $f : \mathbb{N} \to \mathbb{C}$ is called an arithmetic function¹. Some examples:
$$I(n) := \begin{cases} 1 & \text{if } n = 1,\\ 0 & \text{if } n > 1, \end{cases}$$
$$u(n) := 1, \qquad n \geq 1,$$
$$\tau(n) := \sum_{d \mid n} 1 \qquad \text{(divisor function)},$$
$$\varphi(n) := \#\{k \in \{1, \dots, n\} \,:\, (k, n) = 1\} \qquad \text{(Euler's $\varphi$-function)},$$
$$\sigma(n) := \sum_{d \mid n} d \qquad \text{(divisor sum)},$$
$$\omega(n) := \#\{p \in \mathbb{P} : p \mid n\},$$
$$\Omega(n) := \sum_{k=1}^{\ell} \alpha_k \quad \text{if } n = \prod_{k=1}^{\ell} p_k^{\alpha_k}.$$

Above $n = \prod_{k=1}^{\ell} p_k^{\alpha_k}$ was the prime decomposition of $n$, and we employed the notation
$$\sum_{d \mid n} f(d) := \sum_{\substack{d \in \{1, \dots, n\} \\ d \mid n}} f(d).$$
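The definitions above are easy to experiment with numerically. The following brute-force Python sketch (our own illustration, not part of the notes; all function names are ours) computes $\tau$, $\sigma$, $\varphi$, $\omega$ and $\Omega$ for small $n$:

```python
from math import gcd

def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]

def tau(n):                      # number of divisors
    return len(divisors(n))

def sigma(n):                    # sum of divisors
    return sum(divisors(n))

def phi(n):                      # Euler's phi-function
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)

def factorize(n):
    """Prime decomposition of n as a dict {p_k: alpha_k}."""
    f, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def omega(n):                    # number of distinct prime factors
    return len(factorize(n))

def big_omega(n):                # number of prime factors with multiplicity
    return sum(factorize(n).values())

# For n = 12 = 2^2 * 3:
print(tau(12), sigma(12), phi(12), omega(12), big_omega(12))  # 6 28 4 2 3
```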

Definition 1.1. If $f$ and $g$ are arithmetic functions, their multiplicative (i.e. Dirichlet) convolution is the arithmetic function $f * g$, where
$$f * g(n) := \sum_{d \mid n} f(d)\, g(n/d).$$

¹In this course $\mathbb{N} := \{1, 2, 3, \dots\}$.
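As an illustration (a sketch of ours, not from the notes), Definition 1.1 transcribes directly into code; `dconv` is a hypothetical helper name:

```python
def dconv(f, g):
    """Dirichlet convolution: (f*g)(n) = sum over d | n of f(d) g(n/d)."""
    def h(n):
        return sum(f(d) * g(n // d) for d in range(1, n + 1) if n % d == 0)
    return h

u = lambda n: 1                     # u(n) = 1
I = lambda n: 1 if n == 1 else 0    # the function I of Section 1.1

tau = dconv(u, u)                   # tau = u * u counts divisors
print([tau(n) for n in range(1, 13)])   # [1, 2, 2, 3, 2, 4, 2, 4, 3, 4, 2, 6]
```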

Theorem 1.2. (i) $f * g = g * f$, (ii) $f * I = I * f = f$, (iii) $f * (g + h) = f * g + f * h$, (iv) $f * (g * h) = (f * g) * h$.

Proof. (i) follows from the symmetric representation
$$f * g(n) = \sum_{k \ell = n} f(k)\, g(\ell)$$
(note that we automatically assume above that $k, \ell$ are positive integers). In a similar vein, (iv) follows by iterating this to write
$$(f * g) * h(n) = \sum_{k_1 k_2 k_3 = n} f(k_1)\, g(k_2)\, h(k_3).$$
The other claims are easy.

Theorem 1.3. If $f(1) \neq 0$, the arithmetic function $f$ has a unique inverse $f^{-1}$ such that $f * f^{-1} = f^{-1} * f = I$.

Proof. Assume that $f(1) \neq 0$. The condition
$$f * f^{-1}(k) = I(k) \tag{1.1}$$
is satisfied for $k = 1$ if we set $f^{-1}(1) := 1/f(1)$. Assume then that $n \geq 2$ and $f^{-1}(1), \dots, f^{-1}(n-1)$ are chosen so that (1.1) holds true for $k \leq n - 1$. Then it holds also for $k = n$ if we choose
$$f^{-1}(n) := \frac{-1}{f(1)} \sum_{\substack{d \mid n \\ d < n}} f(n/d)\, f^{-1}(d).$$
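The recursion in the proof of Theorem 1.3 can be run as it stands. The sketch below (our own illustration) computes the inverse of $u$; by Theorem 1.5 below, this inverse is the Möbius function:

```python
from fractions import Fraction

def dirichlet_inverse(f, N):
    """Values f^{-1}(1), ..., f^{-1}(N) via the recursion of Theorem 1.3."""
    inv = {1: Fraction(1) / f(1)}
    for n in range(2, N + 1):
        # f^{-1}(n) = -(1/f(1)) * sum_{d | n, d < n} f(n/d) f^{-1}(d)
        s = sum(f(n // d) * inv[d] for d in range(1, n) if n % d == 0)
        inv[n] = -s / f(1)
    return inv

u = lambda n: Fraction(1)
inv_u = dirichlet_inverse(u, 12)
print([int(inv_u[n]) for n in range(1, 13)])
# [1, -1, -1, 0, -1, 1, -1, 0, 0, 1, -1, 0]  -- the Mobius function
```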

Definition 1.4. The Möbius function $\mu : \mathbb{N} \to \mathbb{R}$ satisfies
$$\mu(n) = \begin{cases} (-1)^{\ell}, & \text{if } n = q_1 \cdots q_{\ell} \text{ with distinct primes } q_1, \dots, q_{\ell},\\ 0, & \text{otherwise.} \end{cases}$$

In other words, $\mu(n) = (-1)^{\omega(n)}$ if $n$ is squarefree, and otherwise $\mu(n) = 0$. Especially, $\mu(1) = 1$.

Theorem 1.5. $\mu * u = I$.

Proof. We thus need to verify that $\sum_{d \mid n} \mu(d) = 0$ if $n \geq 2$. Assume that $p_1, \dots, p_{\ell}$ are the distinct primes that divide $n$. As $\mu(d) = 0$ if $d$ has a square factor, we simply have
$$\sum_{d \mid n} \mu(d) = \mu(1) + \big(\mu(p_1) + \dots + \mu(p_{\ell})\big) + \big(\mu(p_1 p_2) + \dots + \mu(p_{\ell-1} p_{\ell})\big) + \dots + \mu(p_1 \cdots p_{\ell})$$
$$= 1 - \binom{\ell}{1} + \binom{\ell}{2} - \dots + (-1)^{\ell} \binom{\ell}{\ell} = (1 - 1)^{\ell} = 0.$$
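Theorem 1.5 is easy to confirm numerically; the following sketch (ours, not from the notes) implements $\mu$ from the definition and checks $\sum_{d \mid n} \mu(d) = I(n)$:

```python
def mu(n):
    """Mobius function, directly from Definition 1.4."""
    count, p = 0, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0        # square factor, so mu vanishes
            count += 1
        p += 1
    if n > 1:
        count += 1
    return (-1) ** count

for n in range(1, 200):
    s = sum(mu(d) for d in range(1, n + 1) if n % d == 0)
    assert s == (1 if n == 1 else 0)
print("mu * u = I verified for n < 200")
```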

Corollary 1.6. (Möbius inversion formula) If $f(n) := \sum_{d \mid n} g(d)$ for all $n \geq 1$, then $g(n) = \sum_{d \mid n} \mu(d)\, f(n/d)$.

Proof. According to the assumption, $f = u * g$. The claim follows by taking the convolution of both sides with $\mu$ and recalling Theorem 1.5.

Definition 1.7. We denote the identity function by $j$. Thus $j(n) = n$ for all $n \geq 1$. Moreover, given $\alpha \in \mathbb{C}$ we set $j_{\alpha}(n) = n^{\alpha}$ for $n \geq 1$.

In the course NT it was shown that $\sum_{d \mid n} \varphi(d) = n$, or in other words, $\varphi * u = j$. It follows that
$$\varphi = j * \mu. \tag{1.2}$$

Definition 1.8. An arithmetic function $f$ is multiplicative if $f \not\equiv 0$ and $f(mn) = f(m)\, f(n)$ whenever $(m, n) = 1$. Moreover, $f$ is completely multiplicative if the above condition holds even without the condition $(m, n) = 1$.

Example 1.9. The function $\varphi$ is multiplicative. The same is true for $\mu$, directly from the definition. The functions $j$ and $j_{\alpha}$ are completely multiplicative.

Theorem 1.10. If $f$ and $g$ are multiplicative, then also $f * g$ is multiplicative.

Proof. Let $(m, n) = 1$. If $d \mid mn$, we may write uniquely $d = d_1 d_2$, where $d_1 \mid m$ and $d_2 \mid n$. Hence
$$(f * g)(mn) = \sum_{d \mid mn} f(d)\, g(mn/d) = \sum_{d_1 \mid m,\ d_2 \mid n} f(d_1 d_2)\, g\Big(\frac{m}{d_1} \frac{n}{d_2}\Big)$$
$$= \sum_{d_1 \mid m,\ d_2 \mid n} f(d_1)\, f(d_2)\, g(m/d_1)\, g(n/d_2) = \Big(\sum_{d_1 \mid m} f(d_1)\, g(m/d_1)\Big) \Big(\sum_{d_2 \mid n} f(d_2)\, g(n/d_2)\Big)$$
$$= (f * g)(m)\, (f * g)(n).$$
Lemma 1.11. If $g$ ($\not\equiv 0$) and $h := f * g$ are multiplicative, then also $f$ is.

Proof. Clearly, unless $f \equiv 0$, we must have $f(1) = 1$; of course also $g(1) = 1$. Assume then that $f(1) = 1$ and that $N \geq 2$ is such that we have proven the multiplicativity condition $f(mn) = f(m)\, f(n)$ for all $mn \leq N - 1$ with $(m, n) = 1$. Let then $N = mn$ with $(m, n) = 1$. We may compute as in the proof of Theorem 1.10:
$$h(mn) = \sum_{\substack{d_1 \mid m,\ d_2 \mid n \\ d_1 d_2 < mn}} f(d_1 d_2)\, g(m/d_1)\, g(n/d_2) + f(mn)$$
$$= \sum_{\substack{d_1 \mid m,\ d_2 \mid n \\ d_1 d_2 < mn}} f(d_1)\, f(d_2)\, g(m/d_1)\, g(n/d_2) + f(mn)$$
$$= h(m)\, h(n) + f(mn) - f(m)\, f(n).$$

Since $h(mn) = h(m)\, h(n)$, it follows that $f(mn) = f(m)\, f(n)$. The claim now follows by induction on $N$.

Corollary 1.12. If $f$ is multiplicative and $f \not\equiv 0$, also $f^{-1}$ is multiplicative.

Proof. This follows from Lemma 1.11, since now $f$ and $f * f^{-1} = I$ are multiplicative.

Corollary 1.13. Arithmetic functions equipped with addition and convolution (as the product operation) form a ring. An element $f$ is invertible iff $f(1) \neq 0$. The multiplicative functions form a subgroup (with respect to convolution) of the set of invertible elements.

Theorem 1.14. (i) If $f$ is completely multiplicative, then $f^{-1}(n) = \mu(n)\, f(n)$ for all $n \geq 1$. (ii) Conversely, if $f$ is multiplicative and $f^{-1}(n) = \mu(n)\, f(n)$ for all $n \geq 1$, then $f$ is completely multiplicative.

Proof. Exercise.

In NT it was shown that
$$\varphi(n) = n \prod_{p \mid n} (1 - 1/p).$$

If the prime decomposition of $n$ is $n = \prod_{k=1}^{\ell} p_k^{\alpha_k}$, we also have (Exercise)
$$\tau(n) = \prod_{k=1}^{\ell} (\alpha_k + 1), \qquad \sigma(n) = \prod_{k=1}^{\ell} \frac{p_k^{\alpha_k + 1} - 1}{p_k - 1}, \qquad u * u * u(n) = \prod_{k=1}^{\ell} \frac{(\alpha_k + 1)(\alpha_k + 2)}{2}.$$
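These product formulas are easily checked against brute force for small $n$ (a quick sketch of ours, not from the notes):

```python
def factorize(n):
    """Prime decomposition of n as a dict {p_k: alpha_k}."""
    f, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            f[p] = f.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        f[n] = f.get(n, 0) + 1
    return f

def tau_prod(n):
    out = 1
    for a in factorize(n).values():
        out *= a + 1                          # product of (alpha_k + 1)
    return out

def sigma_prod(n):
    out = 1
    for p, a in factorize(n).items():
        out *= (p ** (a + 1) - 1) // (p - 1)  # geometric series per prime
    return out

for n in range(2, 300):
    divs = [d for d in range(1, n + 1) if n % d == 0]
    assert tau_prod(n) == len(divs)
    assert sigma_prod(n) == sum(divs)
print("product formulas for tau and sigma verified for n < 300")
```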

1.2 Estimation of summatory functions

Many arithmetic functions behave fairly irregularly. However, it makes perfect sense, and is mathematically useful, to try to understand their average behaviour. This is most easily done by studying the corresponding summatory functions, of which the prime counting function is an important example:
$$\pi(x) := \sum_{p \leq x} 1 = \sum_{n \leq x} \chi_{\mathbb{P}}(n).$$

This function describes the average density of the primes and the large scale irregularities of the prime distribution, so it is especially important in number theory. We shall study it later on, as well as the Möbius sum function
$$M(x) := \sum_{n \leq x} \mu(n).$$

At this stage we make a closer study of the divisor sum
$$D(x) := \sum_{n \leq x} \tau(n)$$
and of the following sum that counts the number of squarefree numbers:
$$Q(x) := \sum_{n \leq x} |\mu(n)|.$$

For that purpose (and for the sake of the whole course) let us revisit the standard Landau symbols. Let $F$ and $G$ be functions that depend on $x \in [a, \infty)$. We say that
$$F(x) = O(G(x))$$
("big O of $G$") if there is a finite constant $C > 0$ such that $|F(x)| \leq C\, G(x)$ for all large enough $x$. Similarly,
$$F(x) = o(G(x))$$
("small o of $G$") if $\lim_{x \to \infty} F(x)/G(x) = 0$. If
$$\lim_{x \to \infty} \frac{F(x)}{G(x)} = 1,$$
we denote $F(x) \sim G(x)$ (as $x \to \infty$). The less stringent requirement that $F$ and $G$ grow (or decrease) at about the same rate means that
$$C^{-1} \leq \frac{F(x)}{G(x)} \leq C \quad \text{for } x \text{ large enough},$$
and this is denoted by $F(x) \asymp G(x)$. The above definitions are modified in a natural way when one considers e.g. the situations $x \to x_0$ or $n \to \infty$.

Example 1.15.
$$e^{-x} = o(1) \quad \text{as } x \to \infty,$$
$$-2x^2 + x - 5 = O(x^2) \quad \text{as } x \to \infty,$$
$$\frac{1}{x^2 + x} \sim \frac{1}{x^2} \quad \text{as } x \to \infty,$$
$$\frac{2}{x^2 + \sin^2(2x)} \asymp \frac{1}{x^2} \quad \text{as } x \to 0.$$

Definition 1.16. Let $f$ be an arithmetic function. The corresponding summatory function (or sum function) is $F : \mathbb{R} \to \mathbb{C}$, where $F(x) = 0$ for $x < 1$ and
$$F(x) = \sum_{n \leq x} f(n) \quad \text{for } x \geq 1.$$

Example 1.17. The sum function obtained from $u$ is $\lfloor x \rfloor$ if $x > 0$, and $0$ otherwise.

Lemma 1.18. Let $f$ and $g$ be arithmetic functions with $F$ and $G$ the corresponding summatory functions. Then
$$\sum_{n \leq x} (f * g)(n) = \sum_{m \leq x} F(x/m)\, g(m) = \sum_{\ell \leq x} G(x/\ell)\, f(\ell).$$

Proof. The left hand side is
$$\sum_{n \leq x} \sum_{m\ell = n} f(\ell)\, g(m) = \sum_{\ell m \leq x} f(\ell)\, g(m) = \sum_{m \leq x} g(m) \sum_{\ell \leq x/m} f(\ell) = \sum_{m \leq x} F(x/m)\, g(m).$$
The second claim then follows by symmetry.

Example 1.19. As $\tau(n) = u * u(n)$, and the sum function corresponding to $u$ is $\lfloor x \rfloor$, we obtain by the above lemma
$$D(x) = \sum_{n \leq x} \tau(n) = \sum_{m \leq x} \lfloor x/m \rfloor. \tag{1.3}$$
This can also be deduced directly by noting that $\lfloor x/m \rfloor$ is the number of positive integers at most $x$ that are divisible by $m$.
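Identity (1.3) can be confirmed directly (an illustrative sketch of ours, not from the notes):

```python
def D_direct(x):
    """D(x) = sum_{n <= x} tau(n), computed from the definition of tau."""
    return sum(sum(1 for d in range(1, n + 1) if n % d == 0)
               for n in range(1, x + 1))

def D_floor(x):
    """D(x) via identity (1.3): sum_{m <= x} floor(x/m)."""
    return sum(x // m for m in range(1, x + 1))

for x in (1, 4, 10, 100):
    assert D_direct(x) == D_floor(x)
print(D_floor(100))
```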

1.3 Dirichlet’s divisor problem

A classical challenge is to understand the average size of the divisor function, i.e. the goal is to estimate the growth of the function
$$D(x) = \sum_{n \leq x} \tau(n).$$
For this purpose, let us state our first integral approximation result for discrete sums (later on we shall provide more precise results).

Lemma 1.20. Let $f$ be monotone and integrable on $[a, b]$, where $a, b \in \mathbb{Z}$ and $a < b$. Then there is $\theta \in [0, 1]$ so that
$$\sum_{a < n \leq b} f(n) = \int_a^b f(t)\, dt + \theta\, \big(f(b) - f(a)\big).$$

Proof. We may assume that $f$ is increasing (otherwise we may consider $-f$). Obviously
$$f(a + j) \leq \int_{a+j}^{a+j+1} f(t)\, dt \leq f(a + j + 1), \qquad 0 \leq j \leq b - a - 1.$$
By summing over $j$ it follows that
$$\sum_{a \leq n < b} f(n) \leq \int_a^b f(t)\, dt \leq \sum_{a < n \leq b} f(n).$$
Since the two sums differ exactly by $f(b) - f(a)$, the claim follows.

By (1.3) we have $D(x) = \sum_{m \leq x} \lfloor x/m \rfloor = x \sum_{m \leq x} m^{-1} + O(x)$. Let us denote $\lfloor x \rfloor = N$. We may estimate the sum term by Lemma 1.20 as follows:
$$\sum_{m=1}^{N} m^{-1} = 1 + \sum_{m=2}^{N} m^{-1} = 1 + \int_1^N t^{-1}\, dt + O(1) = \log N + O(1) = \log x + O(1).$$
Thus we finally have
$$D(x) = x \big(\log x + O(1)\big) + O(x) = x \log x + O(x).$$
Especially, it follows that
$$D(x) \sim x \log x \quad \text{as } x \to \infty,$$
which tells us in some sense that the numbers below $x$ have about $\log x$ divisors on the average.

One may actually improve the error term we obtained above considerably by employing so-called Dirichlet's hyperbola method. For that end we first need an improvement to the estimate of the partial sums of the harmonic series.

Lemma 1.21. $\displaystyle\sum_{m \leq x} m^{-1} = \log x + \gamma + O(1/x).$

Proof. We shall prove this later on.

Theorem 1.22. $D(x) = x (\log x + 2\gamma - 1) + \Delta(x)$, where $\Delta(x) = O(\sqrt{x})$.

Proof. Let us observe that
$$D(x) = \sum_{n \leq x} \tau(n) = \sum_{uv \leq x} 1.$$

Write $A_1 := \{(u, v) : uv \leq x,\ u \leq \sqrt{x}\}$ and $A_2 := \{(u, v) : uv \leq x,\ v \leq \sqrt{x}\}$. Moreover, let $A_3 := \{(u, v) : u, v \leq \sqrt{x}\}$. Then $A_3 = A_1 \cap A_2$ and we obtain by symmetry
$$D(x) = \#A_1 + \#A_2 - \#A_3 = 2\, \#A_1 - \lfloor \sqrt{x} \rfloor^2. \tag{1.4}$$
Here one easily checks that $\lfloor \sqrt{x} \rfloor^2 = x + O(\sqrt{x})$. Moreover,
$$\#A_1 = \sum_{u \leq \sqrt{x}} \sum_{v \leq x/u} 1 = \sum_{u \leq \sqrt{x}} \lfloor x/u \rfloor = \sum_{u \leq \sqrt{x}} \big(x/u + O(1)\big) = x \Big(\log(\sqrt{x}) + \gamma + O(1/\sqrt{x})\Big) + O(\sqrt{x})$$
$$= \frac{1}{2}\, x \log x + \gamma x + O(\sqrt{x}).$$
By combining these estimates with (1.4), the statement follows.

Remark 1.23. It is a well-known open problem to give an essentially optimal size estimate for the error term, i.e. to determine the smallest $\alpha$ such that $\Delta(x) = O(x^{\alpha + \varepsilon})$ for all $\varepsilon > 0$. Voronoi proved in 1904 that $\alpha_{\mathrm{opt}} \leq 1/3$, and Hardy observed in 1916 that $\alpha_{\mathrm{opt}} \geq 1/4$. The best known result at the moment is due to Huxley, who showed that $\alpha_{\mathrm{opt}} \leq 131/416$.
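A numerical look (our own sketch) at Theorem 1.22: the normalized error $\Delta(x)/\sqrt{x}$ indeed stays bounded, and is in fact quite small, in line with Remark 1.23:

```python
from math import log, sqrt

GAMMA = 0.5772156649015329      # Euler-Mascheroni constant (hard-coded)

def D(x):
    return sum(x // m for m in range(1, x + 1))   # identity (1.3)

for x in (10**3, 10**4, 10**5):
    delta = D(x) - x * (log(x) + 2 * GAMMA - 1)
    print(x, delta / sqrt(x))
    assert abs(delta) < 2 * sqrt(x)
```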

1.4 Density of the squarefree numbers

We claim that the squarefree numbers actually occupy a positive proportion of all positive integers. Recall that $n$ is squarefree if $p^2 \nmid n$ for all primes $p$. The number of squarefree numbers at most $x$ is denoted by
$$Q(x) := \sum_{n \leq x} |\mu(n)|.$$

We shall first give a heuristic derivation of the result. Thus, denote by $p_k$ the $k$:th prime, i.e. $p_1 = 2$, $p_2 = 3, \dots$, and let us denote by $A_k$ the 'probabilistic event' that a 'randomly picked' positive integer $n$ is divisible by $p_k^2$. Since every $p_k^2$:th integer is divisible by $p_k^2$, it is natural to set
$$\mathrm{Prob}(A_k) = p_k^{-2}.$$
In a similar way, since $n$ is divisible by both $p_k^2$ and $p_{k'}^2$ ($k \neq k'$) if and only if $p_k^2 p_{k'}^2$ divides $n$, we have
$$\mathrm{Prob}(A_k \cap A_{k'}) = p_k^{-2} p_{k'}^{-2},$$
and a similar conclusion holds for any number of the sets $A_k$. Hence the events $A_k$ are independent. This enables us to compute
$$\mathrm{Prob}(n \text{ is squarefree}) = \mathrm{Prob}\Big(\bigcap_{k \geq 1} A_k^c\Big) = \prod_{k \geq 1} \mathrm{Prob}(A_k^c) = \prod_{k \geq 1} \big(1 - \mathrm{Prob}(A_k)\big) = \prod_{p} (1 - p^{-2}).$$
The following lemma computes this constant.

Lemma 1.24. (i) $\displaystyle\sum_{n=1}^{\infty} n^{-2} = \frac{\pi^2}{6}$, (ii) $\displaystyle\sum_{n=1}^{\infty} \mu(n)\, n^{-2} = \frac{6}{\pi^2}$, and (iii) $\displaystyle\prod_{p \in \mathbb{P}} (1 - p^{-2}) = \frac{6}{\pi^2}$.

Proof. The first result (i) is due to Euler, and we give a proof in the next section. Then, due to the absolute convergence of the series, we may compute
$$\Big(\sum_{n=1}^{\infty} n^{-2}\Big) \Big(\sum_{d=1}^{\infty} \mu(d)\, d^{-2}\Big) = \sum_{n, d \geq 1} \frac{\mu(d)}{(nd)^2} = \sum_{k \geq 1} k^{-2} \sum_{d \mid k} \mu(d) = 1,$$
where we applied Theorem 1.5. This yields (ii). Finally, (iii) is left as an easy exercise (one may e.g. make use of the Euler product of the zeta function).

When the previous lemma is combined with the heuristics, we obtain the plausible asymptotics $Q(x) \sim 6\pi^{-2} x$ as $x \to \infty$. One may actually make this rigorous.

Theorem 1.25. As $x \to \infty$, we have $Q(x) = \dfrac{6}{\pi^2}\, x + O(x^{1/2})$.

Proof. Define $g := \mu * |\mu|$. Then $g$ is multiplicative as the Dirichlet convolution of two multiplicative functions. Observe that if $p$ is a prime, we obtain

$$g(p^k) = \sum_{j=0}^{k} \mu(p^j)\, |\mu(p^{k-j})|.$$
If $k \geq 3$, then in each term either $j \geq 2$ or $k - j \geq 2$, so all the terms in the sum vanish and we have $g(p^k) = 0$. On the other hand, $g(1) = 1$, $g(p) = \mu(1)|\mu(p)| + \mu(p)|\mu(1)| = 1 - 1 = 0$, and $g(p^2) = \mu(1)|\mu(p^2)| + \mu(p)|\mu(p)| + \mu(p^2)|\mu(1)| = -1$. Thus $g(p^{2k}) = \mu(p^k)$ for all integers $k \geq 0$, and $g(p^{2k+1}) = 0$. By multiplicativity we deduce that actually
$$g(m^2) = \mu(m) \quad \forall m \geq 1, \qquad \text{and} \qquad g(m) = 0 \text{ if } m \text{ is not a square.} \tag{1.5}$$
As we convolve both sides of the equation $g = \mu * |\mu|$ by $u$, it follows that $|\mu| = u * g$. By employing this equality we may then compute, recalling (1.5) and Lemma 1.18,

$$Q(x) = \sum_{n \leq x} |\mu(n)| = \sum_{n \leq x} u * g(n) = \sum_{n \leq x} g(n) \lfloor x/n \rfloor \overset{(1.5)}{=} \sum_{k \leq x^{1/2}} \mu(k) \lfloor x/k^2 \rfloor$$
$$= \sum_{k \leq x^{1/2}} \mu(k) \Big(\frac{x}{k^2} + O(1)\Big) = x \sum_{k \geq 1} \mu(k)\, k^{-2} - x \sum_{k > x^{1/2}} \mu(k)\, k^{-2} + O(x^{1/2})$$
$$= \frac{6}{\pi^2}\, x + O\Big(x \int_{x^{1/2} - 1}^{\infty} y^{-2}\, dy\Big) + O(x^{1/2}) = \frac{6}{\pi^2}\, x + O(x^{1/2}).$$
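Theorem 1.25 can be illustrated numerically (our own sketch; `squarefree` is a brute-force test):

```python
from math import pi, sqrt

def squarefree(n):
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return False
        d += 1
    return True

def Q(x):
    return sum(1 for n in range(1, x + 1) if squarefree(n))

for x in (10**2, 10**3, 10**4):
    err = Q(x) - 6 / pi**2 * x
    print(x, Q(x), err / sqrt(x))   # normalized error stays small
    assert abs(err) < 2 * sqrt(x)
```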

Chapter 2

Euler and Poisson summation formulas. Gauss sums. The functional equation of ζ

2.1 A primer of Fourier analysis

If $f \in L^1(\mathbb{R})$, we define the Fourier transform of $f$ by setting
$$\widehat{f}(\xi) := \int_{-\infty}^{\infty} e^{-2\pi i x \xi} f(x)\, dx, \qquad \xi \in \mathbb{R}.$$
In case $\mathrm{supp}(f) \subset [0, 1]$ one usually considers the Fourier coefficients of $f$:
$$\widehat{f}(n), \qquad n \in \mathbb{Z}.$$

Remark 2.1. We write $\mathbb{T} := [0, 1) = \mathbb{R}/\mathbb{Z}$, and our aim here is mainly to consider the convergence of the Fourier series of a given function $f : \mathbb{T} \to \mathbb{R}$. When a function $f$ is defined on $\mathbb{T}$, we automatically denote also by $f$ the 1-periodic extension of $f$ to the whole of $\mathbb{R}$. Especially, we then have e.g. that
$$\int_{\mathbb{T}} f(t)\, dt = \int_a^{a+1} f(t)\, dt$$
for any $a \in \mathbb{R}$. This makes many formulas easy to write. Also, we say that $f \in C^k(\mathbb{T})$ if the periodic extension of $f$ is in $C^k(\mathbb{R})$. As usual, we denote the $L^p$-norm of $f$ by
$$\|f\|_{L^p(\mathbb{T})} := \Big(\int_{\mathbb{T}} |f(t)|^p\, dt\Big)^{1/p}.$$

¹We give a brief and sketchy account here, basically enough for our purposes. If you are not familiar with the material, it is useful while reading this to take simultaneously a look at the corresponding material of the course 'Fourier Analysis'.

Definition 2.2. We set $e(x) := \exp(2\pi i x)$ and write (formally at this point) the Fourier series of a given $f \in L^1(\mathbb{T})$ as
$$f(t) \sim \sum_{n \in \mathbb{Z}} \widehat{f}(n)\, e(nt). \tag{2.1}$$

Here the $n$:th Fourier coefficient of $f \in L^1(\mathbb{T})$ is $\widehat{f}(n) := \int_0^1 f(u)\, e(-nu)\, du$. One says that the Fourier series of $f$ converges at the point $t_0$ if the limit
$$\lim_{N \to \infty} S_N f(t_0)$$
exists, where the $N$:th partial sum is
$$S_N f(t) := \sum_{n=-N}^{N} \widehat{f}(n)\, e(nt).$$
We shall provide sufficient conditions for the convergence of the Fourier series.

Lemma 2.3. The $N$:th Fejér kernel is the trigonometric polynomial

$$K_N(t) := \sum_{n=-N}^{N} \Big(1 - \frac{|n|}{N+1}\Big)\, e(nt).$$

It satisfies: (1) $\displaystyle K_N(t) = \frac{1}{N+1} \Big(\frac{\sin((N+1)\pi t)}{\sin(\pi t)}\Big)^2$; (2) $K_N(t) \geq 0$ for all $t \in \mathbb{T}$; (3) $\displaystyle\int_{\mathbb{T}} K_N(t)\, dt = 1$; (4) $\displaystyle\int_{(-1/2, 1/2) \setminus (-\delta, \delta)} K_N(t)\, dt \to 0$ as $N \to \infty$, for all $\delta > 0$.

Proof. First of all, we observe that $\sin^2(\pi t) = \tfrac{1}{2} - \tfrac{1}{4}\big(e(t) + e(-t)\big)$. Hence multiplying a trigonometric polynomial by $\sin^2(\pi t)$ amounts basically to taking a second difference of the coefficients. We thus obtain after a small computation that
$$(N+1) \sin^2(\pi t)\, K_N(t) = \Big(\tfrac{1}{2} - \tfrac{1}{4}\big(e(t) + e(-t)\big)\Big) \sum_{n=-N}^{N} \big((N+1) - |n|\big)\, e(nt) = \tfrac{1}{2} - \tfrac{1}{4}\big(e((N+1)t) + e(-(N+1)t)\big) = \sin^2((N+1)\pi t),$$
which implies (1). In turn, (1) clearly yields (2) and (4), and (3) is a consequence of the fact that $\int_{\mathbb{T}} e(nx)\, dx = 0$ if $n \neq 0$.

The convolution of two 1-periodic functions is defined as usual by
$$f * g(t) = g * f(t) := \int_0^1 f(t - u)\, g(u)\, du.$$

In case $g$ is a trigonometric polynomial, substituting $g(t - u) = \sum_{n=-K}^{K} \widehat{g}(n)\, e(nt)\, e(-nu)$ above, we obtain that
$$f * g(t) = \sum_{n=-K}^{K} \widehat{g}(n)\, \widehat{f}(n)\, e(nt).$$

In particular, by choosing $g = K_N$ above it follows that
$$\sigma_N(f)(t) := K_N * f(t) = \sum_{n=-N}^{N} \Big(1 - \frac{|n|}{N+1}\Big)\, \widehat{f}(n)\, e(nt). \tag{2.2}$$
This is called the $N$:th Fejér partial sum of (the Fourier series of) $f$. One may note that it corresponds to applying Cesàro summation to the original Fourier series. The usefulness of these partial sums is due to the properties (1)-(4) of the Fejér kernels listed in Lemma 2.3. Namely, they tell us that the Fejér kernels form an 'approximation of the identity' in the sense of the course 'Real Analysis', and hence we obtain immediately

Theorem 2.4. If $f \in L^p(\mathbb{T})$ with $1 \leq p < \infty$, then
$$\|K_N * f - f\|_{L^p(\mathbb{T})} \to 0 \quad \text{as } N \to \infty.$$
Moreover, if $f \in C(\mathbb{T})$, then
$$\|K_N * f - f\|_{L^{\infty}(\mathbb{T})} \to 0 \quad \text{as } N \to \infty.$$

Proof. The latter statement is an exercise using the uniform continuity of $f$ and the properties of the Fejér kernel stated in Lemma 2.3. The first statement then follows by observing that convolution by the Fejér kernel does not increase the $L^p$-norm and using the fact that continuous functions are dense in $L^p$. We refer to the 'Real Analysis' course notes for the details.

As the elements of $L^1$ are defined up to sets of measure zero, we immediately obtain

Corollary 2.5. If $f, g \in L^1(\mathbb{T})$ satisfy $\widehat{f}(n) = \widehat{g}(n)$ for all $n \in \mathbb{Z}$, then $f(t) = g(t)$ for almost every $t \in \mathbb{T}$.

It is useful to note that the functions $e(n\,\cdot)$ form an orthonormal basis of $L^2(\mathbb{T})$. First of all, it is readily verified that
$$\int_{\mathbb{T}} e(nt)\, \overline{e(mt)}\, dt = \delta_{m,n}, \tag{2.3}$$
so these functions are orthogonal and have norm one in $L^2(\mathbb{T})$. The completeness follows from

Theorem 2.6. (Parseval formula) Let $f \in L^2(\mathbb{T})$. Then
$$\|f\|_{L^2(\mathbb{T})}^2 = \int_0^1 |f(t)|^2\, dt = \sum_{n \in \mathbb{Z}} |\widehat{f}(n)|^2.$$

Proof. If $f(t) = \sum_{n=-N}^{N} \widehat{f}(n)\, e(nt)$ is a finite trigonometric polynomial, the orthogonality relations (2.3) yield immediately that
$$\int_0^1 |f(t)|^2\, dt = \sum_{m,n=-N}^{N} \widehat{f}(m)\, \overline{\widehat{f}(n)} \int_{\mathbb{T}} e(mt)\, \overline{e(nt)}\, dt = \sum_{n=-N}^{N} |\widehat{f}(n)|^2.$$
Especially, we have
$$\int_{\mathbb{T}} |K_N * f(t)|^2\, dt = \sum_{n=-N}^{N} \Big(1 - \frac{|n|}{N+1}\Big)^2 |\widehat{f}(n)|^2.$$
The claim now follows by letting $N \to \infty$, applying Theorem 2.4 on the left hand side and monotone convergence on the right hand side.

We are then ready to prove our first convergence result.

Theorem 2.7. Assume that $f \in L^1(\mathbb{T})$ satisfies
$$\sum_{n=-\infty}^{\infty} |\widehat{f}(n)| < \infty.$$
Then, by modifying $f$ on a set of measure zero (if needed), it becomes continuous and its Fourier series converges uniformly to $f$. No modification is needed if $f$ is continuous.

Proof. Simply note that since $|e(nt)| = 1$ for all $t$, the condition of the theorem implies that the Fourier series converges uniformly; denote the limit function by
$$f_0(t) := \sum_{n \in \mathbb{Z}} \widehat{f}(n)\, e(nt).$$

Thus it remains to prove that $f_0(t) = f(t)$ for almost every $t$. But this follows immediately from Corollary 2.5 as soon as we check that $\widehat{f_0}(n) = \widehat{f}(n)$ for all $n$. Fix $n$, and note that by the uniform convergence and the orthogonality relations (2.3) we obtain
$$\widehat{f_0}(n) = \lim_{N \to \infty} \int_{\mathbb{T}} \Big(\sum_{k=-N}^{N} \widehat{f}(k)\, e(kt)\Big)\, e(-nt)\, dt = \widehat{f}(n).$$

Definition 2.8. A function $f$ on $\mathbb{T}$ is piecewise continuously differentiable if we may partition $\mathbb{T}$ into finitely many half-open intervals $[0, 1) = I_1 \cup \dots \cup I_{\ell}$, $I_j = [a_j, b_j)$, so that $f'$ is continuous on the interior of each $I_j$, and the one-sided limits $f'(a_j+)$ and $f'(b_j-)$ exist for all $j$.

One may readily verify that if $f$ is piecewise continuously differentiable, then $f'$ is bounded, and also $f$ has left and right limits at each endpoint of the intervals $I_j$ (exercise).

Lemma 2.9. Assume that $f$ is both continuous and piecewise continuously differentiable. Then $\widehat{f'}(n) = (2\pi i n)\, \widehat{f}(n)$ for all $n \in \mathbb{Z}$.

Proof. The assumptions enable us (why?) to integrate by parts to obtain
$$\widehat{f'}(n) = \int_0^1 e^{-2\pi i n t} f'(t)\, dt = 2\pi i n \int_0^1 e^{-2\pi i n t} f(t)\, dt = 2\pi i n\, \widehat{f}(n).$$

We then obtain an effective criterion for applying Theorem 2.7 without having to compute the Fourier coefficients.

Theorem 2.10. Assume that $f$ is both continuous and piecewise continuously differentiable. Then $\sum_{n \in \mathbb{Z}} |\widehat{f}(n)| < \infty$, and the Fourier series of $f$ converges to $f$ both pointwise and uniformly.

Proof. According to Theorem 2.7 we only need to verify the absolute summability of the Fourier coefficients. Now $f' \in L^2(\mathbb{T})$, so
$$\infty > \sum_{n=-\infty}^{\infty} |\widehat{f'}(n)|^2 = 4\pi^2 \sum_{n=-\infty}^{\infty} n^2 |\widehat{f}(n)|^2$$
according to Theorem 2.6 and Lemma 2.9. We may then apply the Cauchy-Schwarz inequality to estimate
$$\sum_{n \in \mathbb{Z}} |\widehat{f}(n)| = |\widehat{f}(0)| + \sum_{n \neq 0} \frac{1}{|n|}\, |n \widehat{f}(n)| \leq |\widehat{f}(0)| + \Big(\sum_{n \neq 0} \frac{1}{n^2}\Big)^{1/2} \Big(\sum_{n \neq 0} |n \widehat{f}(n)|^2\Big)^{1/2} < \infty.$$

Whether the Fourier series of $f$ converges at a point $t_0$ or not actually depends only on the behaviour of $f$ in a neighbourhood of $t_0$. This follows from

Theorem 2.11. (Principle of locality) Let $f \in L^1(\mathbb{T})$, $t_0 \in \mathbb{T}$, and $f|_{(t_0 - \delta, t_0 + \delta)} = 0$. Then
$$\lim_{N \to \infty} S_N f(t_0) = 0.$$
A fortiori, if the Fourier series of $g \in L^1(\mathbb{T})$ converges at $t_0$ and $f \equiv g$ in a neighbourhood of $t_0$, then the Fourier series of $f$ converges at $t_0$ to the same value.

Proof. Exercise with instructions (or see the notes of the course 'Fourier analysis').

The function $\mathrm{sgn}_{\mathbb{T}}(x)$ is defined by setting
$$\mathrm{sgn}_{\mathbb{T}}(x) = \begin{cases} -1 & \text{if } x \in (-1/2, 0),\\ 1 & \text{if } x \in (0, 1/2),\\ 0 & \text{if } x \in \{-1/2, 0, 1/2\}, \end{cases}$$
and extending 1-periodically to $\mathbb{R}$, so that we may alternatively view it as defined on $\mathbb{T}$. The proof of the following lemma is a simple computation.

Lemma 2.12.
$$\widehat{\mathrm{sgn}_{\mathbb{T}}}(n) = \begin{cases} \dfrac{2}{\pi i n} & \text{if } n \text{ is odd},\\[1mm] 0 & \text{if } n \text{ is even.} \end{cases}$$

Since the above sequence of Fourier coefficients is odd, we obtain

Corollary 2.13. The Fourier series of $\mathrm{sgn}_{\mathbb{T}}$ converges at $0$.

Corollary 2.14. (Euler) $\displaystyle\sum_{n=1}^{\infty} n^{-2} = \frac{\pi^2}{6}$.

Proof. By Lemma 2.12 and the Parseval formula it follows that
$$\|\mathrm{sgn}_{\mathbb{T}}\|_{L^2(\mathbb{T})}^2 = \int_0^1 1\, dt = 1 = \sum_{n \in \mathbb{Z}} |\widehat{\mathrm{sgn}_{\mathbb{T}}}(n)|^2 = \frac{8}{\pi^2} \sum_{k=1}^{\infty} \frac{1}{(2k-1)^2},$$
or in other words,
$$1^{-2} + 3^{-2} + 5^{-2} + 7^{-2} + \dots = \frac{\pi^2}{8}. \tag{2.4}$$
One may then compute
$$\sum_{k=1}^{\infty} \frac{1}{k^2} = \Big(1 + \frac{1}{2^2} + \frac{1}{4^2} + \frac{1}{8^2} + \dots\Big) \sum_{k=1}^{\infty} \frac{1}{(2k-1)^2} = \frac{4}{3} \cdot \frac{\pi^2}{8} = \frac{\pi^2}{6},$$
since every $k \geq 1$ can be written uniquely as a power of two times an odd number.
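Both (2.4) and Euler's identity are easy to check numerically; the tails of the series are of size $O(1/N)$, so the partial sums converge slowly but visibly (our own sketch):

```python
from math import pi

N = 10**5
odd_sum = sum(1 / (2 * k - 1) ** 2 for k in range(1, N + 1))
all_sum = sum(1 / k ** 2 for k in range(1, N + 1))

print(odd_sum, pi**2 / 8)    # agree up to a tail of size about 1/(4N)
print(all_sum, pi**2 / 6)    # agree up to a tail of size about 1/N
```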

We are then ready for the convergence theorem that suffices for our purposes.

Theorem 2.15. Assume that $f : \mathbb{T} \to \mathbb{C}$ is piecewise continuously differentiable. Then for every $t \in \mathbb{T}$ one has
$$\lim_{N \to \infty} S_N f(t) = \frac{1}{2}\big(f(t^-) + f(t^+)\big).$$

Proof. (sketch) By translation invariance we may assume that $t = 0$. If $f$ is continuous at $0$, we may easily find a continuous and piecewise continuously differentiable $g$ on $\mathbb{T}$ so that $f = g$ in a neighbourhood of the origin. Then the claim follows from the principle of locality (Theorem 2.11) and Theorem 2.10. We may reduce the general case to this one by choosing the constant $A$ so that $f - A\, \mathrm{sgn}_{\mathbb{T}}$ is continuous at $0$, and finally invoking Corollary 2.13.

Example 2.16. By the previous theorem and Lemma 2.12, the Fourier series
$$\frac{2}{\pi i} \sum_{k \in \mathbb{Z}} \frac{1}{2k-1}\, e((2k-1)t) = \frac{4}{\pi} \sum_{k=1}^{\infty} \frac{\sin((2k-1) 2\pi t)}{2k-1}$$
converges at each point to $\mathrm{sgn}_{\mathbb{T}}(t)$. However, since the limit function is not continuous, the convergence cannot be uniform. Abel gave this as a natural example of a pointwise converging series of continuous functions whose sum is not continuous.

2.2 Poisson summation formula

We now return to functions defined on the whole of $\mathbb{R}$. Many important identities can be obtained from the following result.

Theorem 2.17. (Poisson summation formula) Assume that $f \in C(\mathbb{R})$ satisfies, for some constants $\varepsilon > 0$ and $C > 0$,
$$|f(x)| \leq C (1 + |x|)^{-(1+\varepsilon)} \tag{2.5}$$
and that its Fourier transform satisfies an analogous bound
$$|\widehat{f}(\xi)| \leq C (1 + |\xi|)^{-(1+\varepsilon)}. \tag{2.6}$$
Then
$$\sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \widehat{f}(n).$$

Proof. Define $g(x) := \sum_{k \in \mathbb{Z}} f(x + k)$. Then condition (2.5) implies uniform convergence of the series, so $g$ is continuous and 1-periodic. We may hence consider $g$ as a function on $\mathbb{T}$. Let us compute its Fourier coefficients:
$$\widehat{g}(n) = \int_0^1 e(-nt)\, g(t)\, dt = \sum_{k \in \mathbb{Z}} \int_0^1 e(-nx)\, f(x + k)\, dx = \int_{\mathbb{R}} e^{-2\pi i n x} f(x)\, dx = \widehat{f}(n).$$
Thus, by (2.6) we see that $\sum_{n \in \mathbb{Z}} |\widehat{g}(n)| < \infty$, which implies that the Fourier series of $g$ converges at each point $t$ to $g(t)$ by Theorem 2.7. Using this knowledge at $x = 0$, the desired statement follows.

We will later on give applications of the Poisson summation formula. Sometimes it is useful to have a version where $f$ has compact support. Let us thus agree on the convention:

Definition 2.18.
$${\sum_{n \leq x}}' f(n) := \begin{cases} \sum_{n \leq x} f(n), & x \notin \mathbb{Z},\\[1mm] \sum_{n \leq x-1} f(n) + \frac{1}{2} f(x), & x \in \mathbb{Z}. \end{cases}$$
In a similar manner, in a sum ${\sum}'_{a \leq n \leq b}$ over an interval with integer endpoints, the terms $n = a$ and $n = b$ are weighted by the factor $\frac{1}{2}$.

Theorem 2.19. Assume that $f$ is continuous on $[a, b]$ and piecewise continuously differentiable there, where $a, b \in \mathbb{Z}$, $a < b$. Then
$${\sum_{a \leq n \leq b}}' f(n) = \sum_{n \in \mathbb{Z}} \widehat{f}(n), \qquad \text{where } \widehat{f}(n) := \int_a^b e(-nx)\, f(x)\, dx.$$

Proof. One runs the proof of Theorem 2.17, where we simply extend $f$ to the whole of $\mathbb{R}$ by setting $f(x) = 0$ if $x \notin [a, b]$. In this case $g$ is piecewise continuously differentiable with
$$\frac{1}{2}\big(g(0-) + g(0+)\big) = {\sum_{a \leq n \leq b}}' f(n),$$
and the claim follows from Theorem 2.15.

Remark 2.20. Here and below, a conditionally convergent integral such as $\int_{-\infty}^{\infty} e(z^2)\, dz$ is understood as the symmetric limit $\lim_{A \to \infty} \int_{-A}^{A}$.

2.3 Gauss sums

Let $a \in \mathbb{Z}$ and $N \in \mathbb{N}$. The Gauss sum
$$G(a, N) := \sum_{k=1}^{N} e(a k^2 / N) \tag{2.7}$$
plays a significant role in number theory. The following result is due to Gauss².

Theorem 2.21. Let $N \in \mathbb{N}$. Then
$$G(1, N) = \begin{cases} (1 + i) \sqrt{N} & \text{if } N \equiv 0 \pmod 4,\\ \sqrt{N} & \text{if } N \equiv 1 \pmod 4,\\ 0 & \text{if } N \equiv 2 \pmod 4,\\ i \sqrt{N} & \text{if } N \equiv 3 \pmod 4. \end{cases}$$

Proof. In order to fit into the formulation of Theorem 2.19, we write
$$G(1, N) = \frac{1}{2} + \sum_{0 < k < N} e(k^2/N) + \frac{1}{2} = {\sum_{0 \leq k \leq N}}' e(k^2/N).$$
By Theorem 2.19, applied to $f(x) = e(x^2/N)$ on $[0, N]$, we thus have
$$G(1, N) = \sum_{n \in \mathbb{Z}} \int_0^N e(x^2/N)\, e(-nx)\, dx = N \sum_{n \in \mathbb{Z}} e(-N n^2/4) \int_0^1 e\big(N (y - n/2)^2\big)\, dy = N \sum_{n \in \mathbb{Z}} e(-N n^2/4) \int_{-n/2}^{-n/2+1} e(N z^2)\, dz.$$
Above we substituted first $x = N y$, and then $y = n/2 + z$. Write $n = 2m + \epsilon$, where $\epsilon \in \{0, 1\}$, perform the sum separately with respect to even and odd values of $n$, and observe that
$$e(-N n^2/4) = \begin{cases} 1, & n \text{ even},\\ i^{-N}, & n \text{ odd.} \end{cases}$$
Thus we obtain
$$G(1, N) = N \sum_{m \in \mathbb{Z}} \int_{-m}^{-m+1} e(N z^2)\, dz + N i^{-N} \sum_{m \in \mathbb{Z}} \int_{-m-1/2}^{-m+1/2} e(N z^2)\, dz = N (1 + i^{-N}) \int_{-\infty}^{\infty} e(N z^2)\, dz = \sqrt{N}\, (1 + i^{-N}) \int_{-\infty}^{\infty} e(z^2)\, dz.$$
In order to determine the integral $\int_{-\infty}^{\infty} e(z^2)\, dz$ (recall that in view of Remark 2.20 this integral should be understood as the limit $\lim_{A \to \infty} \int_{-A}^{A} e(z^2)\, dz$, but see the remark below), take $N = 1$ and note that by definition $G(1, 1) = 1$, whence we obtain $1 = (1 + i^{-1}) \int_{-\infty}^{\infty} e(z^2)\, dz$, or in other words $\int_{-\infty}^{\infty} e(z^2)\, dz = (1 + i)/2$. The stated result follows by substituting this into the above computation and considering the different residues of $N$ (mod 4).

²It took Gauss four years to find a proof of the theorem.
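Theorem 2.21 can be verified directly for small $N$ (our own numerical sketch, not from the notes):

```python
import cmath
from math import sqrt

def e(x):
    return cmath.exp(2j * cmath.pi * x)

def G(a, N):
    """The Gauss sum (2.7), computed by brute force."""
    return sum(e(a * k * k / N) for k in range(1, N + 1))

expected = {0: lambda N: (1 + 1j) * sqrt(N),
            1: lambda N: sqrt(N),
            2: lambda N: 0,
            3: lambda N: 1j * sqrt(N)}

for N in range(1, 40):
    assert abs(G(1, N) - expected[N % 4](N)) < 1e-8
print("Theorem 2.21 verified for 1 <= N < 40")
```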

Remark 2.22. It is useful to note (exercise!) that the integral $\int_{-\infty}^{\infty} e(z^2)\, dz$ actually exists as the double limit
$$\lim_{A \to -\infty,\, B \to \infty} \int_A^B e(z^2)\, dz,$$
whence there is actually nothing dubious in our notation in the above proof. As a side remark, by taking real and imaginary parts and making a simple change of variables we obtain the values of the Fresnel integrals
$$\int_{-\infty}^{\infty} \sin(u^2)\, du = \sqrt{\pi/2} = \int_{-\infty}^{\infty} \cos(u^2)\, du.$$

We refer to the course NT for the statement and notation related to the famous Law of Quadratic Reciprocity. In order to prove it we need a further lemma on Gauss sums.

Lemma 2.23. (i) If $p \geq 3$ is a prime and $(a, p) = 1$, then
$$G(a, p) = \Big(\frac{a}{p}\Big)\, G(1, p),$$
where $\big(\frac{a}{p}\big)$ is the Legendre symbol. (ii) If $m, n \geq 3$ satisfy $(m, n) = 1$, then
$$G(m, n)\, G(n, m) = G(1, nm).$$

Proof. (i) If $k, k' \in \{1, \dots, p-1\}$, one has $k^2 \equiv k'^2 \pmod p$ if and only if $k = k'$ or $k = p - k'$. Thus, as $k$ runs over $\{1, \dots, p-1\}$, the square $k^2$ runs exactly twice over the set $N_p^+$ of quadratic residues (mod $p$). Denote by $N_p^-$ the set of non-residues, so that
$$\{1, \dots, p-1\} = N_p^+ \cup N_p^-.$$
We may thus write
$$G(a, p) = 1 + 2 \sum_{r \in N_p^+} e(ar/p).$$
Since $\sum_{r=1}^{p} e(ar/p) = 0$ by the geometric series, we see that
$$0 = 1 + \sum_{r \in N_p^+} e(ar/p) + \sum_{r \in N_p^-} e(ar/p).$$
Also, as $\big(\frac{ar}{p}\big) = \big(\frac{a}{p}\big) \big(\frac{r}{p}\big)$ and, moreover, $ar \not\equiv ar' \pmod p$ if $r \not\equiv r' \pmod p$, we obtain
$$a N_p^+ = \begin{cases} N_p^+ & \text{if } \big(\frac{a}{p}\big) = 1,\\ N_p^- & \text{if } \big(\frac{a}{p}\big) = -1. \end{cases}$$
Hence, if $\big(\frac{a}{p}\big) = 1$, we deduce
$$G(a, p) = 1 + 2 \sum_{r \in a N_p^+} e(r/p) = 1 + 2 \sum_{r' \in N_p^+} e(r'/p) = G(1, p).$$
In turn, in case $\big(\frac{a}{p}\big) = -1$ it follows that
$$G(a, p) = 1 + 2 \sum_{r' \in N_p^-} e(r'/p) = 1 + 2 \Big(-1 - \sum_{r' \in N_p^+} e(r'/p)\Big) = -1 - 2 \sum_{r' \in N_p^+} e(r'/p) = -G(1, p),$$
which concludes the proof of (i).

(ii) The proof of this part will be an exercise.

Theorem 2.24. (Law of Quadratic Reciprocity) Let $p, q \geq 3$ be distinct prime numbers. Then
$$\Big(\frac{p}{q}\Big) \Big(\frac{q}{p}\Big) = (-1)^{(p-1)(q-1)/4}.$$

Proof. By applying both parts of Lemma 2.23 we see that
$$\Big(\frac{p}{q}\Big) \Big(\frac{q}{p}\Big) = \frac{G(1, pq)}{G(1, p)\, G(1, q)}.$$
We then observe that Theorem 2.21 for odd $N$ can be written in the equivalent form
$$G(1, N) = \sqrt{N}\, e\Big(\frac{(N-1)^2}{16}\Big).$$
When we combine this with the previous formula, it follows that
$$\Big(\frac{p}{q}\Big) \Big(\frac{q}{p}\Big) = e\Big(\frac{(pq-1)^2 - (p-1)^2 - (q-1)^2}{16}\Big).$$
Finally, one writes $p = 4k + \epsilon$ and $q = 4k' + \epsilon'$, and a small computation shows that the result depends only on the values of $\epsilon, \epsilon' \in \{\pm 1\}$.

2.4 Jacobi theta function

For $\mathrm{Im}\, z > 0$ the $\nu$-function is defined through the formula
$$\nu(z) := \sum_{n \in \mathbb{Z}} e^{\pi i n^2 z} = \sum_{n \in \mathbb{Z}} q^{n^2},$$
where $q := e^{\pi i z}$. Since $|q^{n^2}| = |e^{\pi i n^2 z}| = e^{-\pi n^2 \mathrm{Im}\, z}$, the series converges uniformly on $\{\mathrm{Im}\, z \geq \varepsilon\}$ for any $\varepsilon > 0$, and hence defines an analytic function in the upper half plane.

Theorem 2.25. For $\mathrm{Im}\, z > 0$ one has
$$\nu(-1/z) = \sqrt{-i z}\; \nu(z), \tag{2.8}$$
where the branch of $\sqrt{-i z}$ is chosen to be positive on the upper half of the imaginary axis. A fortiori, if we define the Jacobi theta function on $(0, \infty)$ by
$$\theta(x) := \nu(ix) = \sum_{n \in \mathbb{Z}} e^{-\pi n^2 x}, \qquad x > 0,$$
then $\theta$ satisfies the transformation formula
$$\theta(1/x) = x^{1/2}\, \theta(x), \qquad x > 0. \tag{2.9}$$

Proof. Fix $a > 0$ and denote $f(y) := e^{-\pi a y^2}$ for $y \in \mathbb{R}$. The Fourier transform of $f$ takes the form
$$\widehat{f}(\xi) = a^{-1/2} e^{-\pi \xi^2 / a}. \tag{2.10}$$
In order to prove this, we note that by the very quick decay of the exponential function one checks that it is possible to differentiate under the integral in the definition of $\widehat{f}(\xi)$; an integration by parts then yields
$$\frac{d}{d\xi} \widehat{f}(\xi) = \int_{\mathbb{R}} (-2\pi i y)\, e^{-2\pi i \xi y - \pi a y^2}\, dy = i a^{-1} \int_{\mathbb{R}} \frac{d}{dy}\big(e^{-\pi a y^2}\big)\, e^{-2\pi i \xi y}\, dy = -2\pi a^{-1} \xi\, \widehat{f}(\xi).$$
Since $\widehat{f}(0) = \int_{\mathbb{R}} e^{-\pi a y^2}\, dy = a^{-1/2}$, we see that $\widehat{f}$ is the unique solution of the ordinary differential equation $g'(\xi) = -2\pi a^{-1} \xi\, g(\xi)$, $g(0) = a^{-1/2}$, and one checks that indeed (2.10) yields the solution.

In order to deduce (2.9), we simply apply the Poisson summation formula to the function $f$ to obtain
$$\sum_{n \in \mathbb{Z}} e^{-\pi n^2 a} = \sum_{n \in \mathbb{Z}} f(n) = \sum_{n \in \mathbb{Z}} \widehat{f}(n) = a^{-1/2} \sum_{n \in \mathbb{Z}} e^{-\pi n^2 / a}.$$
This yields (2.9), i.e. (2.8) on the positive imaginary axis, and finally (2.8) follows for all $z$ in the upper half plane by analytic continuation.
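The transformation formula (2.9) is easy to test numerically (our own sketch; the series is truncated, which is harmless since the terms decay like $e^{-\pi n^2 x}$):

```python
from math import exp, pi, sqrt

def theta(x, terms=100):
    """Truncated theta(x) = sum over n in Z of exp(-pi n^2 x), for x > 0."""
    return 1 + 2 * sum(exp(-pi * n * n * x) for n in range(1, terms))

for x in (0.3, 0.5, 1.0, 2.0, 5.0):
    assert abs(theta(1 / x) - sqrt(x) * theta(x)) < 1e-12
print("theta(1/x) = sqrt(x) * theta(x) verified numerically")
```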

2.5 Analytic continuation and functional equation of the Riemann ζ function

Recall that the famous Riemann zeta function $\zeta(s)$ is defined by the formula
$$\zeta(s) = \sum_{n=1}^{\infty} n^{-s}, \qquad \mathrm{Re}\, s > 1. \tag{2.11}$$

Remark 2.26. In analytic number theory it is customary to denote the complex variable by $s = \sigma + it$! Thus, from now on, if not otherwise stated or clear from the context, $s$ is always a complex variable with real part $\sigma$ and imaginary part $t$.

Participants of the course 'Complex Analysis II' have seen the extension of the Riemann zeta function to a meromorphic function on the whole complex plane, and a proof of the functional equation by the calculus of residues. That was one of the two proofs that Riemann gave in his famous 1859 article. We will here follow the second proof of Riemann, which applies the transformation formula of the Jacobi theta function we just proved. For that purpose, let us define the following function³, which is clearly analytic in $\sigma > 1$:
$$\xi(s) := \frac{1}{2}\, s (s-1)\, \pi^{-s/2}\, \Gamma(s/2)\, \zeta(s).$$

³We follow the lead of Titchmarsh's classic book 'The Theory of the Riemann Zeta-function' by including the factor $1/2$ in the definition of the $\xi$-function.

In terms of the function $\xi$ it is now very easy to state the meromorphic extendability and the functional equation for $\zeta$:

Theorem 2.27. The function $\xi$ has an analytic continuation to an entire function that satisfies
$$\xi(s) = \xi(1 - s) \qquad \forall s \in \mathbb{C}.$$

Proof. We assume that $\sigma > 1$ and substitute $t = \pi n^2 x$ and $s/2$ in place of $s$ in the definition of the Gamma function
$$\Gamma(s) := \int_0^{\infty} e^{-t}\, t^{s-1}\, dt, \qquad \sigma > 0. \tag{2.12}$$
It follows that
$$\Gamma(s/2) = \int_0^{\infty} e^{-\pi n^2 x} (\pi n^2 x)^{s/2 - 1}\, \pi n^2\, dx.$$
Multiplying both sides by $n^{-s} \pi^{-s/2}$ and summing over $n = 1, 2, \dots$, we obtain
$$\pi^{-s/2}\, \Gamma(s/2)\, \zeta(s) = \int_0^{\infty} x^{s/2 - 1} \Big(\sum_{n \geq 1} e^{-\pi n^2 x}\Big)\, dx = \int_0^{\infty} x^{s/2 - 1}\, \omega(x)\, dx, \tag{2.13}$$
where we denoted $\omega(x) := (\theta(x) - 1)/2$. It is important to observe that $\omega$ decays exponentially as $x \to \infty$. In terms of $\omega$, the transformation formula (2.9) takes the form
$$\omega(1/x) = -\frac{1}{2} + \frac{1}{2}\, x^{1/2} + x^{1/2}\, \omega(x).$$
We then divide the last integral in (2.13) into the parts $\int_0^1$ and $\int_1^{\infty}$, and perform the change of variables $x \to 1/x$ in the first integral. In this way we obtain
$$2 \xi(s) = s(s-1) \Big(\int_1^{\infty} x^{s/2-1}\, \omega(x)\, dx + \int_0^1 x^{s/2-1}\, \omega(x)\, dx\Big)$$
$$= s(s-1) \Big(\int_1^{\infty} x^{s/2-1}\, \omega(x)\, dx + \int_1^{\infty} x^{-s/2-1}\, \omega(1/x)\, dx\Big)$$
$$= s(s-1) \Big(\int_1^{\infty} \omega(x) \big(x^{s/2-1} + x^{(1-s)/2-1}\big)\, dx + \int_1^{\infty} \Big(-\frac{1}{2}\, x^{-s/2-1} + \frac{1}{2}\, x^{(1-s)/2-1}\Big)\, dx\Big)$$
$$= s(s-1) \int_1^{\infty} \omega(x) \big(x^{s/2-1} + x^{(1-s)/2-1}\big)\, dx + 1.$$

Since $\omega$ decays exponentially, the last written integral defines an entire function of $s$, and this yields the analytic continuation of $\xi$. Moreover, the integrand, and hence the integral, is invariant under $s \mapsto 1 - s$, which verifies the stated functional equation.

Corollary 2.28. The Riemann zeta function extends meromorphically to $\mathbb{C}$ with only one pole, which is located at $1$ and is simple. Moreover, the functional equation of $\zeta$ can be written in the form
$$\zeta(s) = \chi(s)\, \zeta(1 - s), \qquad \text{where} \quad \chi(s) := \pi^{s-1}\, 2^s \sin\big(\tfrac{1}{2}\pi s\big)\, \Gamma(1 - s).$$

Proof. The proof of the expression for $\chi$ is just a computation (Exercise!), which expresses $\xi$ in terms of $\zeta$ and substitutes this expression into the functional equation of $\xi$. To get to the expression given above one applies the well-known identities (proven in Complex Analysis II) for the Gamma function:
$$\Gamma(z)\, \Gamma(1 - z) = \frac{\pi}{\sin(\pi z)} \qquad \text{(Euler reflection formula)}$$
and
$$\sqrt{\pi}\, \Gamma(2z) = 2^{2z-1}\, \Gamma(z)\, \Gamma(z + 1/2) \qquad \text{(Legendre duplication formula).}$$
Since $\xi$ is entire, it follows that $\zeta$ is meromorphic, as we recall that the Gamma function is meromorphic. Since the $\Gamma$-function never vanishes, the only possible poles of $\zeta$ are at $0$ and $1$, and they can only be simple poles. However, $s\, \Gamma(s/2)$ is regular and non-zero at $0$, so $0$ is not a pole. Finally, we have
$$\lim_{s \to 1+} \zeta(s) = \lim_{s \to 1+} \sum_{n=1}^{\infty} n^{-s} = \sum_{n=1}^{\infty} n^{-1} = \infty,$$
so that $1$ must be a simple pole.

2.6 Euler summation formula

A useful formula for approximating sums or for deriving asymptotic formulas is the Euler (or Euler-Maclaurin) summation formula:

Theorem 2.29. (Euler's summation formula) Assume that f ∈ C¹[a, b], where a, b ∈ ℝ, a < b. Then
\[
\sum_{a<n\le b} f(n) = \int_a^b f(x)\,dx + \Big[B_1(x)f(x)\Big]_a^b - \int_a^b B_1(x)f'(x)\,dx, \tag{2.14}
\]
where \(B_1(x) := \frac12 - (x - \lfloor x\rfloor)\) for x ∈ ℝ.

Proof. It is useful to draw a picture of the function B₁ (note especially that it has average 0 on each interval of length 1 and takes the value 1/2 at each integer). We divide (a, b] into the pieces
\[
(a, \lfloor a\rfloor+1],\ (\lfloor a\rfloor+1, \lfloor a\rfloor+2],\ \dots,\ (\lfloor b\rfloor-1, \lfloor b\rfloor],\ (\lfloor b\rfloor, b].
\]
Since both sides of (2.14) are (directly due to the definition) 'additive' with respect to the interval (a, b], it is thus enough to treat the case where m ≤ a < b ≤ m + 1 for an integer m. By translation invariance we may actually assume that 0 ≤ a < b ≤ 1. In the case b < 1 the left hand side of (2.14) is zero, and so is the right hand side by a simple integration by parts (note that B₁′(x) = −1 for x ∈ (0, 1)). In order to treat the case b = 1 one simply notes that the LHS and the RHS of (2.14), as functions of b, make a jump f(1) at b = 1. More precisely, as functions of b ∈ (a, 1] we have by direct inspection
\[
\mathrm{LHS}(1) - \mathrm{LHS}(1-) = f(1) = \mathrm{RHS}(1) - \mathrm{RHS}(1-).
\]
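As a concrete sanity check of (2.14) (my addition; the choice f(x) = x² on (0.5, 7.25] is arbitrary), both sides can be compared numerically, with the integral against B₁ approximated by a fine midpoint rule:

```python
import math

def B1(x):
    return 0.5 - (x - math.floor(x))

f = lambda x: x * x
fprime = lambda x: 2 * x
a, b = 0.5, 7.25
h = 1e-5
n = round((b - a) / h)

int_f = 0.0       # integral of f over (a, b)
int_B1fp = 0.0    # integral of B1(x) f'(x) over (a, b)
for k in range(n):
    x = a + (k + 0.5) * h
    int_f += f(x) * h
    int_B1fp += B1(x) * fprime(x) * h

lhs = sum(f(m) for m in range(1, 8))   # the integers in (0.5, 7.25]
rhs = int_f + B1(b) * f(b) - B1(a) * f(a) - int_B1fp
```

Both sides should agree up to the quadrature error of the midpoint rule.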

One may note that the above method may be developed further by introducing higher order correction terms, but the above formulation is enough for our purposes.

Proof of Lemma 1.21. By applying Thm 2.29 with f(t) = 1/t, a = 1, and b = x > 1 one obtains
\[
\sum_{1\le n\le x} n^{-1} = 1 + \int_1^x t^{-1}\,dt + \Big[B_1(t)t^{-1}\Big]_1^x + \int_1^x B_1(t)t^{-2}\,dt.
\]
Here \([B_1(t)t^{-1}]_1^x = -\frac12 + O(1/x)\) and \(\int_1^x B_1(t)t^{-2}\,dt = \int_1^\infty B_1(t)t^{-2}\,dt + O(1/x)\), since \(\int_x^\infty B_1(t)t^{-2}\,dt = O\big(\int_x^\infty t^{-2}\,dt\big) = O(x^{-1})\). Put together, we have
\[
\sum_{1\le n\le x} n^{-1} = \log(x) + C + O(1/x),
\]
and the statement of the lemma follows as soon as we rename C = γ and note that by the asymptotics we just proved, \(\gamma = \lim_{N\to\infty}\big(\sum_{n=1}^N n^{-1} - \log N\big)\).

Also the classical Stirling approximation for the factorial function is obtained easily by the same method, although the determination of the constant in front requires an additional argument.

Lemma 2.30. \(\displaystyle \log(n!) = n(\log n - 1) + \frac12\log n + \frac12\log(2\pi) + O(1/n).\)
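Lemma 2.30 is easy to test numerically (my addition); `math.lgamma(n + 1)` gives log n! up to rounding, and the error of the approximation should decrease like 1/n:

```python
import math

def stirling(n):
    # n(log n - 1) + (1/2) log n + (1/2) log(2 pi), as in Lemma 2.30
    return n * (math.log(n) - 1) + 0.5 * math.log(n) + 0.5 * math.log(2 * math.pi)

err_1000 = abs(math.lgamma(1001) - stirling(1000))
err_10000 = abs(math.lgamma(10001) - stirling(10000))
```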

Proof. We apply Euler summation with f(t) = log t on the interval (a, b] = (1, N]:
\[
\log(N!) = \sum_{n=1}^N \log n = \int_1^N \log x\,dx + \Big[B_1(t)\log t\Big]_1^N - \int_1^N B_1(t)t^{-1}\,dt
= N(\log N - 1) + \frac12\log N + C + E_N,
\]
where \(C = 1 - \int_1^\infty B_1(t)t^{-1}\,dt\) and \(E_N = \int_N^\infty B_1(t)t^{-1}\,dt\). Thus the statement (and the validity of the above manipulation) follows if we verify that
\[
\int_x^\infty B_1(t)t^{-1}\,dt = O(1/x).
\]
For that end we observe that \(\int_k^{k+1}B_1(t)\,dt = 0\) for any k ∈ ℕ, whence one may apply the 'subtraction of a constant trick' to estimate
\[
\Big|\int_k^{k+1} B_1(t)t^{-1}\,dt\Big| = \Big|\int_k^{k+1} B_1(t)\big(t^{-1}-k^{-1}\big)\,dt\Big| \le \int_k^{k+1}\frac{dt}{k(k+1)} = \frac{1}{k(k+1)},
\]
and what we need follows by observing that \(\sum_{k\ge N}\frac{1}{k(k+1)} = N^{-1}\).

Finally, the determination of the value of the constant C is left as an exercise⁴.

In analytic number theory it is important to understand in detail the asymptotic behaviour of the Γ(s)-function also for complex s, especially for large s at a finite distance from the imaginary axis. It actually turns out that the Stirling formula remains valid in great generality.

Theorem 2.31. As s → ∞ we have
\[
\log(\Gamma(s)) = \Big(s-\frac12\Big)\log s - s + \frac12\log(2\pi) + O(|s|^{-1})
\]
uniformly in a fixed angle arg(s) ∈ (−π + δ, π − δ). Above, the branches of log(Γ(s)) and log(s) are real on (0, ∞).

Proof. Assume first that s > 0. This time we apply Euler's summation formula to the function t ↦ log(s + t) on the interval (0, N] to obtain

\[
\sum_{n=1}^N \log(n+s) = \int_0^N \log(t+s)\,dt + \frac12\big(\log(s+N) - \log(s)\big) - \int_0^N \frac{B_1(t)}{t+s}\,dt.
\]

⁴Hint: denote \(I_n = \int_0^{\pi/2}\sin^n(x)\,dx\). Use partial integration and induction to compute \(I_{2n} = \frac{\pi}{2}\prod_{k=1}^n\frac{2k-1}{2k}\) and \(I_{2n+1} = \prod_{k=1}^n\frac{2k}{2k+1}\). Apply these and the already proven form of the Stirling formula to determine the constant C.

By observing that log(N + s) = log N + log(1 + s/N) = log(N) + sN^{−1} + O(N^{−2}) we see that the contribution of the first two terms on the RHS above equals
\[
\begin{aligned}
\int_0^N \log(t+s)\,dt + \frac12\big(\log(s+N)-\log(s)\big)
&= (N+s)\log(N+s) - s\log s - N + \frac12\log(s+N) - \frac12\log(s)\\
&= (N+1/2)\log N + s\log N - N + s - (s+1/2)\log s + O(N^{-1}).
\end{aligned}
\]
We then recall the Gauss product formula (e.g. from the course Complex Analysis II):
\[
\Gamma(s) = \lim_{N\to\infty}\frac{N!\,N^s}{s(s+1)\cdots(s+N)}.
\]
By invoking Lemma 2.30 and the above computations we hence obtain for s > 0
\[
\begin{aligned}
\log(\Gamma(s)) = \lim_{N\to\infty}\Big(&N\log N - N + \frac12\log N + \frac12\log 2\pi + s\log N - \log s\\
&- (N+1/2)\log N - s\log N + N - s + (s+1/2)\log s + O(N^{-1}) + \int_0^N\frac{B_1(t)}{t+s}\,dt\Big)\\
= \lim_{N\to\infty}\Big[&\Big(s-\frac12\Big)\log s - s + \frac12\log 2\pi + O(N^{-1}) + \int_0^N\frac{B_1(t)}{t+s}\,dt\Big].
\end{aligned}
\]
This proves the existence of \(\lim_{N\to\infty}\int_0^N\frac{B_1(t)}{t+s}\,dt\), and we have proven the beautiful formula
\[
\log(\Gamma(s)) = \Big(s-\frac12\Big)\log s - s + \frac12\log 2\pi + \int_0^\infty\frac{B_1(t)}{t+s}\,dt. \tag{2.15}
\]
In order to estimate the error term and to extend this to complex values of s by analytic continuation, we use again the fact that \(\int_k^{k+1}B_1(t)\,dt = 0\) for all k ∈ ℕ, so that we may write
\[
\int_k^{k+1}\frac{B_1(t)}{t+s}\,dt = \int_k^{k+1}B_1(t)\Big(\frac{1}{t+s} - \frac{1}{k+1/2+s}\Big)\,dt = \int_k^{k+1}\frac{(B_1(t))^2}{(t+s)(\lfloor t\rfloor+1/2+s)}\,dt,
\]
where in the last step we recalled that B₁(t) = ⌊t⌋ + 1/2 − t. We finally have
\[
\log(\Gamma(s)) = \Big(s-\frac12\Big)\log s - s + \frac12\log 2\pi + E(s), \quad\text{with} \tag{2.16}
\]
\[
E(s) = \int_0^\infty \frac{(B_1(t))^2}{(t+s)(\lfloor t\rfloor + 1/2 + s)}\,dt.
\]

We may obviously continue each term here analytically to ℂ∖(−∞, 0], and hence (2.16) holds in all of ℂ∖(−∞, 0]. Given δ > 0, the error term may be bounded for large s in a fixed angle arg(s) ∈ (−π + δ, π − δ) by noting that, according to elementary geometry, inside this angle we have |t + s| ≥ c(δ)(t + |s|) for any t > 0, whence for (say) |s| ≥ 2
\[
|E(s)| = O\Big(\int_0^\infty \frac{dt}{(t+|s|)^2}\Big) = O(|s|^{-1}).
\]
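The Gauss product used in the proof also gives a direct, if slowly converging, way to evaluate log Γ(s) for complex s, so Theorem 2.31 can be spot-checked numerically. A sketch (my addition; the sample point 20 + 5i and the generous tolerances are arbitrary):

```python
import cmath
import math

def log_gamma_gauss(s, N=100000):
    # log of N! N^s / (s(s+1)...(s+N)): a truncation of the
    # Gauss product for Gamma(s), accurate to roughly O(1/N)
    total = math.lgamma(N + 1) + s * math.log(N)
    for k in range(N + 1):
        total -= cmath.log(s + k)
    return total

def log_gamma_stirling(s):
    # (s - 1/2) log s - s + (1/2) log(2 pi), dropping the O(1/|s|) term
    return (s - 0.5) * cmath.log(s) - s + 0.5 * math.log(2 * math.pi)

s = 20 + 5j
gauss = log_gamma_gauss(s)
stirl = log_gamma_stirling(s)
```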

Chapter 3

Dirichlet series

3.1 Convergence of Dirichlet series

In this section we study basic properties of analytic functions that can be represented as series of the form
\[
f(s) = \sum_{n=1}^\infty a_n n^{-s}
\]
that converge at least at one point s ∈ ℂ. Such series are called Dirichlet series, and they have special importance for number theory. A useful tool in estimating and manipulating Dirichlet series (or series overall) is

Theorem 3.1. (Abel summation formula) Let λ₁ ≤ λ₂ ≤ … with λₙ → ∞, and assume that x ∈ ℝ with x > λ₁. Then for any sequence (aₙ)ₙ₌₁^∞ and a piecewise differentiable and continuous function f : [λ₁, x] → ℂ one has
\[
\sum_{\lambda_n\le x} a_n f(\lambda_n) = A(x)f(x) - \int_{\lambda_1}^x A(t)f'(t)\,dt,
\]
where \(A(x) := \sum_{\lambda_n\le x} a_n\).

Proof. We may compute
\[
A(x)f(x) - \sum_{\lambda_n\le x}a_n f(\lambda_n) = \sum_{\lambda_n\le x}a_n\big(f(x) - f(\lambda_n)\big)
= \sum_{\lambda_n\le x}a_n\int_{\lambda_n}^x f'(t)\,dt = \int_{\lambda_1}^x\Big(\sum_{\lambda_n\le t}a_n\Big)f'(t)\,dt.
\]

A version of the above for sequences ('partial summation of series') is given by

Theorem 3.2. Let (aₙ)ₙ₌₁^∞ and (bₙ)ₙ₌₁^∞ be arbitrary sequences. Denote \(A_n := \sum_{j=n_0}^n a_j\). Then
\[
\sum_{j=n_0}^n a_j b_j = A_n b_n - \sum_{j=n_0}^{n-1} A_j\big(b_{j+1} - b_j\big).
\]
Proof. Exercise.

Corollary 3.3. Let b₁ ≥ b₂ ≥ … with bₙ → 0 as n → ∞, and assume that |Aₙ| ≤ C for all n ≥ 1, where \(A_n := \sum_{j=1}^n a_j\). Then
\[
\sum_{n=1}^\infty a_n b_n \quad\text{converges.}
\]
Proof. Exercise.
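Both statements are easy to test numerically. The sketch below (my addition) verifies the identity of Theorem 3.2 for the arbitrary choice a_j = j, b_j = 1/j, and illustrates Corollary 3.3 with the alternating harmonic series (a_n = (−1)^{n+1}, b_n = 1/n), whose sum is log 2:

```python
import math

# Theorem 3.2 with n0 = 1, a_j = j, b_j = 1/j, n = 50
n0, n = 1, 50
a = lambda j: float(j)
b = lambda j: 1.0 / j
A = lambda m: sum(a(j) for j in range(n0, m + 1))
lhs = sum(a(j) * b(j) for j in range(n0, n + 1))
rhs = A(n) * b(n) - sum(A(j) * (b(j + 1) - b(j)) for j in range(n0, n))

# Corollary 3.3: partial sums of (-1)^(j+1) are bounded, 1/j decreases to 0
s_alt = sum((-1) ** (j + 1) / j for j in range(1, 1000001))
```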

Lemma 3.4. The series \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) converges for some value of s if and only if |aₙ| is at most polynomially increasing, i.e. we have for some b ∈ ℝ
\[
a_n = O(n^b).
\]
Proof. Observe that \(|a_n n^{-s}| = |a_n| n^{-\sigma}\). Hence, if |aₙ| ≤ cn^b, we have at s = b + 2
\[
\sum_{n=1}^\infty |a_n n^{-b-2}| \le c\sum_{n=1}^\infty n^{-2} < \infty.
\]
Conversely, if the series converges at s₀ = σ₀ + it₀, we must have \(|a_n n^{-s_0}| = |a_n| n^{-\sigma_0} \le c\) for all n, whence aₙ = O(n^{σ₀}).

A Dirichlet series \(\sum_{n=1}^\infty a_n n^{-s}\) that converges at least at one point is called a convergent Dirichlet series. It turns out that if a Dirichlet series converges at a point, then it converges everywhere strictly to the right of that point. The precise statement is given by the following result, which is fundamental for understanding the basic convergence properties of Dirichlet series.

Theorem 3.5. If the series \(f(s) := \sum_{n=1}^\infty a_n n^{-s}\) converges at s₀ = σ₀ + it₀, then it converges uniformly in any closed angle (δ > 0 is fixed)
\[
A_\delta(s_0) := \{s_0\}\cup\Big\{s : |\arg(s-s_0)| \le \frac{\pi}{2} - \delta\Big\}.
\]
As a consequence, if we denote by \(f(s) := \sum_{n=1}^\infty a_n n^{-s}\) the sum of this series, f is well-defined, continuous and analytic in the open half-plane {σ > σ₀}. Moreover, lim_{s→s₀} f(s) = f(s₀) when s → s₀ in the angle A_δ(s₀).

Proof. Assume that δ > 0 is fixed and ε > 0 is given. By the assumed convergence at s₀ and Cauchy's criterion we may choose N₀ so large that |S(x)| ≤ ε for x ≥ N₀, where
\[
S(x) := \sum_{N_0\le n\le x} a_n n^{-s_0}.
\]
Assume that s ∈ A_δ(s₀)∖{s₀} and use Abel summation (Theorem 3.1) applied to the sequence (aₙ′), where aₙ′ = 0 if n < N₀ and aₙ′ = aₙ if n ≥ N₀, to deduce
\[
\sum_{n=N_0}^N a_n n^{-s} = \sum_{n=N_0}^N a_n n^{-s_0}\, n^{s_0-s} = S(N)N^{s_0-s} + (s-s_0)\int_{N_0}^N t^{s_0-s-1}S(t)\,dt.
\]
This implies that
\[
\Big|\sum_{n=N_0}^N a_n n^{-s}\Big| \le \varepsilon + \varepsilon|s-s_0|\int_{N_0}^N t^{\sigma_0-\sigma-1}\,dt \le \varepsilon\Big(1 + \frac{|s-s_0|}{\sigma-\sigma_0}\Big) \le c(\delta)\varepsilon,
\]
where we simply observed that for s ∈ A_δ(s₀) we have \(\frac{|s-s_0|}{\sigma-\sigma_0} \le (\sin\delta)^{-1}\). Since the stated estimate clearly is true also at s₀, this proves the claimed uniform convergence.

The uniform convergence then verifies that the series converges and the limit function f is continuous on A_δ(s₀). Especially, we have that lim_{s→s₀} f(s) = f(s₀) when s → s₀ in the angle A_δ(s₀). Further, f is analytic in the interior of A_δ(s₀) as the uniform limit of analytic functions (Weierstrass theorem). Finally, analyticity in the whole open half-plane {σ > σ₀} follows as any point in this half-plane has a neighbourhood that is contained in an angle A_δ(s₀) when δ > 0 is chosen appropriately.

Definition 3.6. Let \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) be a convergent Dirichlet series. We define the abscissa of convergence of f to be the quantity σ_c = σ_c(f) ∈ [−∞, ∞), where
\[
\sigma_c := \inf\Big\{\sigma : \sum_{n=1}^\infty a_n n^{-s} \text{ converges at some } s = \sigma + it\Big\}.
\]
In a similar way, the abscissa of absolute convergence of f is the quantity σ_a = σ_a(f) ∈ [−∞, ∞), where
\[
\sigma_a := \inf\Big\{\sigma : \sum_{n=1}^\infty a_n n^{-s} \text{ converges absolutely at some } s = \sigma + it\Big\}.
\]

Theorem 3.7. Assume that \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) is a convergent Dirichlet series. Then the different convergence abscissae of f satisfy:

(i) σ_c ≤ σ_a ≤ σ_c + 1.

(ii) The series of f converges for σ > σ_c, and it diverges for σ < σ_c.

(iii) The series of f converges absolutely for σ > σ_a, and it fails to do so for σ < σ_a.

(iv) f is analytic in the half-plane {σ > σ_c}, and we may express its derivatives therein as the convergent series
\[
f^{(k)}(s) = \sum_{n=1}^\infty a_n(-\log n)^k n^{-s}, \qquad \sigma > \sigma_c.
\]

Proof. Towards (ii) we note that the divergence for σ < σ_c follows directly from the definition of σ_c, and the convergence for σ > σ_c is stated already in Theorem 3.5. This proves (ii). Actually, in that theorem it was shown that there is locally uniform convergence in {σ > σ_c}. This implies (iv) by the general Weierstrass theorem concerning uniformly convergent series of analytic functions (see e.g. 'Complex Analysis II'). To treat (i) we note first that obviously σ_a ≥ σ_c. On the other hand, fix ε > 0 and note that the series of f converges at σ_c + ε. This implies that \(|a_n n^{-\sigma_c-\varepsilon}| \le C\) for all n ≥ 1, whence \(|a_n n^{-\sigma_c-1-2\varepsilon}| \le Cn^{-1-\varepsilon}\) for all n ≥ 1, and the series converges absolutely at σ_c + 1 + 2ε, so that σ_a ≤ σ_c + 1 + 2ε. Since ε > 0 is arbitrary, it follows that σ_a ≤ σ_c + 1. Finally, (iii) is left as an exercise.

Example 3.8. (i) For the Riemann zeta function it is clear (why?) that σ_c = σ_a = 1.

(ii) Consider the Dirichlet series
\[
f_1(s) = \sum_{n=1}^\infty \frac{1}{n\log^2(n+1)}\, n^{-s} \qquad\text{and}\qquad f_2(s) = \sum_{n=1}^\infty 2^{-ns}
\]
(observe that also the latter one is an ordinary Dirichlet series). Then σ_c(f₁) = σ_c(f₂) = 0, and f₁ obviously converges on the whole boundary {σ = 0}, while f₂ fails to converge at any point of the boundary.

(iii) Define ζ̃ by setting
\[
\tilde\zeta(s) := 1 - 2^{-s} + 3^{-s} - \dots = \sum_{n=1}^\infty (-1)^{n+1} n^{-s}.
\]
Obviously in this case σ_a = 1. By the Leibniz criterion on alternating series the sum \(\sum_{n=1}^\infty(-1)^{n+1}n^{-s}\) converges for real s ∈ (0, 1). Thus σ_c ≤ 0, but obviously also σ_c ≥ 0, so that σ_c = 0 for ζ̃. This shows that the inequalities in Theorem 3.7 are optimal. Actually this particular Dirichlet series has another interesting property: it extends to an entire function although its abscissa of convergence is 0. This shows that although the class of Dirichlet series has many analogies with Taylor series, there is an important difference here, since if a Taylor series defines an entire function then its radius of convergence is necessarily ∞. In order to verify the stated analytic extension we relate ζ̃ to ζ. Assume that σ > 1 and denote
\[
a(s) := 1 + 3^{-s} + 5^{-s} + \dots = \sum_{k=1}^\infty (2k-1)^{-s}.
\]
Writing each n uniquely as a power of 2 times an odd number, we then have
\[
\zeta(s) = \big(1 + 2^{-s} + 4^{-s} + 8^{-s} + \dots\big)\big(1 + 3^{-s} + 5^{-s} + \dots\big) = \frac{a(s)}{1-2^{-s}}.
\]
On the other hand,
\[
\tilde\zeta(s) = \big(1 - 2^{-s} - 4^{-s} - 8^{-s} - \dots\big)\big(1 + 3^{-s} + 5^{-s} + \dots\big) = \Big(1 - \frac{2^{-s}}{1-2^{-s}}\Big)a(s) = \frac{a(s)\big(1-2^{1-s}\big)}{1-2^{-s}}.
\]
By comparing the formulas we see that

\[
\tilde\zeta(s) = \big(1 - 2^{1-s}\big)\zeta(s).
\]
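This identity is easy to test numerically in the half-plane of absolute convergence (a quick check, my addition); at s = 2 it predicts Σ(−1)^{n+1}/n² = (1 − 2^{−1})ζ(2) = π²/12:

```python
import math

N = 100000
eta2 = sum((-1) ** (n + 1) / n ** 2 for n in range(1, N + 1))
target = (1 - 2 ** (1 - 2)) * math.pi ** 2 / 6   # (1 - 2^{1-s}) zeta(s) at s = 2
```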

The factor (1 − 2^{1−s}) vanishes at s = 1 and thus kills the simple pole of ζ(s) at 1, and hence ζ̃ is entire. There is a formula for the abscissa of convergence that is sometimes useful.

Theorem 3.9. Let σ_c be the abscissa of convergence of the Dirichlet series \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\). Then:

(i) If \(\sum_{n=1}^\infty a_n\) diverges, then σ_c ≥ 0 and we have
\[
\sigma_c = \inf\big\{\beta : |a_1 + \dots + a_n| = O(n^\beta)\big\}.
\]
(ii) If \(\sum_{n=1}^\infty a_n\) converges, then σ_c ≤ 0 and we have
\[
\sigma_c = \inf\big\{\beta : |a_n + a_{n+1} + a_{n+2} + \dots| = O(n^\beta)\big\}.
\]

Proof. Exercise.

We end this section by observing that for a convergent Dirichlet series the coefficients are unique. The basic statement says that two Dirichlet series which yield the same analytic function in a half-plane where they both converge have identical coefficients. In fact, a considerably stronger statement is true:

Theorem 3.10. Let \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) and \(g(s) = \sum_{n=1}^\infty b_n n^{-s}\) be two convergent Dirichlet series. If f(s_k) = g(s_k) for a sequence (s_k) such that σ_k → ∞, then aₙ = bₙ for all n ≥ 1.

Proof. By considering f − g, it is enough to show that f(s_k) = 0 for all k implies that aₙ = 0 for all n. We may assume that σ_k ≥ 1 + σ_a(f) for all k ≥ 1. Observe that since \(|a_n n^{-s_k}| = |a_n n^{-s_1}|\, n^{\sigma_1-\sigma_k}\), we obtain by dominated convergence
\[
a_1 = \lim_{k\to\infty} f(s_k) = \lim_{k\to\infty} 0 = 0.
\]
Assume then that we have already shown that a_j = 0 for j < n, where n ≥ 2. Write
\[
0 = n^{s_k} f(s_k) = a_n + \sum_{m=n+1}^\infty a_m (m/n)^{-s_k}
\]
and observe that
\[
|a_m (m/n)^{-s_k}| \le n^{\sigma_1}\,|a_m m^{-\sigma_1}|\,(n/m)^{\sigma_k-\sigma_1},
\]
so that the dominated convergence theorem again yields that \(a_n = \lim_{k\to\infty} n^{s_k}f(s_k) = 0\).

Corollary 3.11. Let f and g be two convergent Dirichlet series. If f(s) = g(s) in a non-empty open set where both series converge, then the series coincide (i.e. their coefficients coincide).

Proof. By analyticity it follows that f ≡ g in the half-plane {σ > max(σ_c(f), σ_c(g))}, and the claim then follows from Thm 3.10.

Example 3.12. One may note that Thm 3.10 does not hold by just assuming that |s_k| → ∞, since e.g. the non-trivial Dirichlet series f(s) = 2^{−s} − 4^{−s} converges everywhere and has infinitely many zeroes on the imaginary axis.

3.2 The Ingham-Newman Tauberian theorem

The statement of the following theorem is essentially due to Ingham from 1935. However, our proof is a minor modification of the ingenious short argument due to Newman (1980). We later apply the result to give a very succinct proof of the prime number theorem.

Theorem 3.13 (Ingham-Newman). Assume that the Dirichlet series \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) has an analytic extension over the line {σ = 1}¹ and satisfies |aₙ| ≤ C for all n ≥ 1. Then \(\sum_{n=1}^\infty a_n n^{-1}\) converges and one has
\[
f(1) = \sum_{n=1}^\infty \frac{a_n}{n}.
\]
Proof. We may assume that C = 1. It is notationally easier to make a translation so that the vertical line {σ = 1} is moved to the imaginary axis. Thus, denote \(G_N(s) := \sum_{n=1}^{N-1} a_n n^{-1-s}\) and G(s) := f(s + 1). By assumption, G is then defined and analytic in an open domain Ω which contains the closed right half-plane {σ ≥ 0}. We are to show that
\[
G_N(0) \to G(0) \quad\text{as } N\to\infty. \tag{3.1}
\]
Fix R > 0 and choose δ > 0 so small that
\[
\overline{\Omega_{\delta,R}} \subset \Omega, \quad\text{where}\quad \Omega_{\delta,R} := B(0,R)\cap\{\sigma > -\delta\}.
\]
Let us write ∂Ω_{δ,R} = C₁ ∪ C₂ with
\[
C_1 := \partial B(0,R)\cap\{\sigma > 0\} \quad\text{and}\quad C_2 := \partial\Omega_{\delta,R}\cap\{\sigma \le 0\},
\]
and denote also C₃ := {σ ≤ 0} ∩ ∂B(0, R). We use Cauchy's integral theorem to write²
\[
G(0) - G_N(0) = \frac{1}{2\pi i}\int_{\partial\Omega_{\delta,R}} \big(G(s) - G_N(s)\big)N^s\Big(1 + \frac{s^2}{R^2}\Big)\frac{ds}{s}
= \frac{1}{2\pi i}\int_{C_1} + \frac{1}{2\pi i}\int_{C_2} =: I_1 + I_2.
\]
¹I.e., f extends to an analytic function (denoted again by f) in an open domain which contains the closed half-plane {σ ≥ 1}.
²The simplicity of Newman's proof is based on his clever idea to add the factor (1 + s²/R²) to the integrand; it vanishes at s = ±iR and thus cancels the factor |σ|^{−1} that would otherwise destroy the estimates.

On the circle ∂B(0, R) we have (using 1/s = s̄/R²)
\[
\Big|N^s\Big(1+\frac{s^2}{R^2}\Big)\frac{1}{s}\Big| = N^\sigma\Big|\frac{1}{s} + \frac{s}{R^2}\Big| = N^\sigma\,\frac{|\bar s + s|}{R^2} = \frac{2|\sigma|N^\sigma}{R^2}. \tag{3.2}
\]
Moreover, on C₁ we have (now 0 < σ ≤ R)
\[
|G_N(s) - G(s)| \le \sum_{n=N}^\infty |a_n| n^{-1-\sigma} \le \sum_{n=N}^\infty n^{-1-\sigma} \le N^{-1-\sigma} + \int_N^\infty x^{-1-\sigma}\,dx = N^{-\sigma}\Big(\frac1N + \frac1\sigma\Big). \tag{3.3}
\]
By combining (3.2) and (3.3) it follows that
\[
|I_1| \le \frac{1}{2\pi}\int_{C_1}\frac{2}{R^2}\Big(\frac{\sigma}{N} + 1\Big)|ds| \le \frac{1}{N} + \frac{1}{R}. \tag{3.4}
\]
In turn, we divide I₂ into two pieces:
\[
I_2 = I_{2,1} - I_{2,2} := \frac{1}{2\pi i}\int_{C_2} G(s)N^s\Big(1+\frac{s^2}{R^2}\Big)\frac{ds}{s} - \frac{1}{2\pi i}\int_{C_2} G_N(s)N^s\Big(1+\frac{s^2}{R^2}\Big)\frac{ds}{s}.
\]
In order to treat I_{2,2} we observe first that, since G_N is entire, by analyticity the path C₂ can be replaced by C₃. For σ < 0 we have
\[
|G_N(s)| \le \sum_{n=1}^{N-1}|a_n| n^{-1-\sigma} \le \sum_{n=1}^{N-1} n^{-1+|\sigma|} \le \int_0^N x^{|\sigma|-1}\,dx = \frac{N^{|\sigma|}}{|\sigma|} = \frac{N^{-\sigma}}{|\sigma|}.
\]
By combining this again with (3.2) we obtain
\[
|I_{2,2}| \le \frac{1}{2\pi}\int_{C_3}\frac{2\,|ds|}{R^2} \le \frac{1}{R}. \tag{3.5}
\]
Finally, in order to treat I_{2,1}, observe that \(|G(s)N^s| \le |G(s)| \le c_0(R,\delta)\) on C₂, and \(|G(s)N^s| \to 0\) pointwise as N → ∞ if σ < 0. A fortiori, by the dominated convergence theorem
\[
I_{2,1} \to 0 \quad\text{as } N\to\infty. \tag{3.6}
\]
Putting together the estimates (3.4), (3.5) and (3.6) we see that
\[
\limsup_{N\to\infty}|G(0) - G_N(0)| \le \limsup_{N\to\infty}\Big(\frac2R + \frac1N\Big) \le \frac{2}{R}.
\]
Since R > 0 is arbitrary, the theorem follows.

3.3 Products and negative powers of Dirichlet series

We next take a look at multiplication of Dirichlet series. If
\[
f(s) = \sum_{n=1}^\infty a(n)n^{-s} \quad\text{and}\quad g(s) = \sum_{n=1}^\infty b(n)n^{-s}, \tag{3.7}
\]
and both series converge absolutely at s, then we may compute
\[
f(s)g(s) = \sum_{n,m=1}^\infty a(n)b(m)(nm)^{-s} = \sum_{k=1}^\infty\Big(\sum_{nm=k}a(n)b(m)\Big)k^{-s} = \sum_{k=1}^\infty (a*b)(k)\,k^{-s} =: H(s). \tag{3.8}
\]
We may record this as

Theorem 3.14. Let f, g be convergent Dirichlet series with coefficients as in (3.7). Then the coefficients of the product Dirichlet series H(s) := f(s)g(s) are given by the convolution a ∗ b, and one has σ_a(H) ≤ max(σ_a(f), σ_a(g)).

Since u ∗ u = τ, we obtain
\[
\zeta^2(s) = \sum_{n=1}^\infty \tau(n)n^{-s}, \qquad \sigma > 1.
\]
Also, we know that u ∗ µ = I, so that
\[
\frac{1}{\zeta(s)} = \sum_{n=1}^\infty \mu(n)n^{-s}, \qquad \sigma > 1. \tag{3.9}
\]
On the other hand, the equality (1.2), which states that φ = j ∗ µ, shows that the Euler φ-function corresponds to the Dirichlet series
\[
\sum_{n=1}^\infty \varphi(n)n^{-s} = \Big(\sum_{n=1}^\infty j(n)n^{-s}\Big)\Big(\sum_{n=1}^\infty \mu(n)n^{-s}\Big) = \frac{\zeta(s-1)}{\zeta(s)}, \qquad \sigma > 2. \tag{3.10}
\]
Further examples are contained in the Exercises.

Our next goal is to verify that if f is a convergent Dirichlet series, then so is 1/f.

Lemma 3.15. Let a be an invertible arithmetic function (i.e. a(1) ≠ 0).

(i) Assume that a(1) = 1 and |a(n)| ≤ 1 for n ≥ 2. Then |a^{−1}(n)| ≤ n².

(ii) Assume that a is of polynomial growth, i.e. a(n) = O(n^β) for some β ∈ ℝ. Then also a^{−1} is of polynomial growth.

Proof. In order to verify (i), assume that the claim is true for the values 1, 2, …, n − 1, where n ≥ 2. Then by Lemma 1.24 we have
\[
|a^{-1}(n)| = \Big|\sum_{d|n,\ d>1} a(d)a^{-1}(n/d)\Big| \le \sum_{d|n,\ d>1}|a^{-1}(n/d)| \le \sum_{d>1}(n/d)^2 = n^2\Big(\frac{\pi^2}{6}-1\Big) < n^2.
\]
Hence (i) follows by induction. Towards (ii) we may clearly assume that a(1) = 1, but otherwise a is arbitrary. Denote b(n) := a(n)n^{−α}, where α ≥ 0 is chosen large enough so that |b(n)| ≤ 1 for all n ≥ 2. By part (i) we then have |b^{−1}(n)| ≤ n², and as a direct computation (exercise!) shows that a^{−1}(n) = n^α b^{−1}(n), we obtain that |a^{−1}(n)| ≤ n^{α+2}.

In view of Lemma 3.4 we obtain immediately:

Corollary 3.16. Let \(f(s) = \sum_{n=1}^\infty a(n)n^{-s}\) be a convergent Dirichlet series. Then 1/f(s) (which is well-defined for large σ) is a convergent Dirichlet series if and only if a(1) ≠ 0, and for σ large enough we may write
\[
\frac{1}{f(s)} = \sum_{n=1}^\infty a^{-1}(n)n^{-s},
\]
where a^{−1} is the Dirichlet inverse of the arithmetic function a.
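The recursion from Theorem 1.3 underlying Corollary 3.16 is directly computable. The sketch below (my addition) implements it and confirms, for small n, that the Dirichlet inverse of u ≡ 1 is the Möbius function, in accordance with Theorem 1.5:

```python
def dirichlet_inverse(f, nmax):
    """Inverse of an arithmetic function f (with f[1] != 0) under Dirichlet
    convolution, via the recursion of Theorem 1.3; lists are 1-indexed
    (index 0 unused)."""
    inv = [0.0] * (nmax + 1)
    inv[1] = 1.0 / f[1]
    for n in range(2, nmax + 1):
        s = 0.0
        for d in range(1, n):          # proper divisors d < n
            if n % d == 0:
                s += f[n // d] * inv[d]
        inv[n] = -s / f[1]
    return inv

def moebius(n):
    # mu(n) by trial division: 0 if n has a square factor, else (-1)^omega(n)
    mu, p = 1, 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0
            mu = -mu
        p += 1
    return mu if n == 1 else -mu
```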

3.4 Euler products

The famous Euler product for the Riemann zeta function states:

Lemma 3.17. \(\displaystyle \zeta(s) = \prod_{p\in\mathbb{P}}\big(1-p^{-s}\big)^{-1}\) for σ > 1.

Proof. An easy proof is obtained by writing, for σ > 1, \((1-p^{-\sigma})^{-1} = \sum_{k=0}^\infty p^{-k\sigma}\), so that certainly
\[
\sum_{n=1}^N n^{-\sigma} \le \prod_{k=1}^N\big(1-p_k^{-\sigma}\big)^{-1} \le \sum_{n=1}^\infty n^{-\sigma} = \zeta(\sigma).
\]
By letting N → ∞ above, the claim follows for real s > 1, and the general case is obtained by analytic continuation.

Before treating more general Euler products we record a simple lemma, whose proof is left as an exercise.

Lemma 3.18. Let A_{j,k} ∈ ℂ for j, k ≥ 1 and assume that
\[
\sum_{j,k\ge 1}|A_{j,k}| < \infty.
\]
Then
\[
\prod_{j=1}^\infty\Big(1 + \sum_{k=1}^\infty A_{j,k}\Big) = 1 + \sum_{\ell\ge 1}\ \sum_{1\le j_1<\dots<j_\ell}\ \sum_{m_1,\dots,m_\ell} A_{j_1,m_1}\cdots A_{j_\ell,m_\ell}.
\]

Theorem 3.19. Assume that \(f(s) = \sum_{n=1}^\infty a(n)n^{-s}\) is a convergent Dirichlet series. Then:

(i) The arithmetic function a is multiplicative if and only if
\[
f(s) = \prod_{p\in\mathbb{P}}\Big(1 + a(p)p^{-s} + a(p^2)p^{-2s} + \dots\Big), \qquad \sigma > \sigma_a(f).
\]
(ii) The arithmetic function a is completely multiplicative if and only if
\[
f(s) = \prod_{p\in\mathbb{P}}\big(1 - a(p)p^{-s}\big)^{-1}, \qquad \sigma > \sigma_a(f).
\]
Proof. Assuming σ > σ_a(f), we may multiply out the product thanks to Lemma 3.18, and apply uniqueness of the coefficients (Theorem 3.10) to obtain that the stated equality in (i) holds if and only if a(1) = 1 and for any \(n = p_1^{\beta_1}\cdots p_\ell^{\beta_\ell}\) we have
\[
a(n) = a\big(p_1^{\beta_1}\big)\cdots a\big(p_\ell^{\beta_\ell}\big).
\]
This is equivalent to a being multiplicative. The proof of (ii) is analogous.
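For a = u the theorem reduces to Lemma 3.17, which one can check numerically: truncating the Euler product for ζ(2) over primes up to 10⁵ should reproduce π²/6 with error about Σ_{p>10⁵} p^{−2} < 10^{−5} (a sketch, my addition):

```python
import math

def primes_upto(N):
    # simple sieve of Eratosthenes
    sieve = [True] * (N + 1)
    sieve[0] = sieve[1] = False
    for p in range(2, int(N ** 0.5) + 1):
        if sieve[p]:
            for q in range(p * p, N + 1, p):
                sieve[q] = False
    return [p for p in range(2, N + 1) if sieve[p]]

prod = 1.0
for p in primes_upto(100000):
    prod *= 1.0 / (1.0 - p ** -2.0)
```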

3.5 Perron's formula

For many purposes, e.g. in analytic number theory, one needs to estimate the sum function of an arithmetic function a with a(n) =: aₙ by using only information from the function \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\). A basic quantity involved in Perron's formula is the integral
\[
I(y,T) := \frac{1}{2\pi i}\int_{a-iT}^{a+iT} y^s\,\frac{ds}{s},
\]
where T ≥ 2 and a, y > 0 are positive reals (the path of integration is, unless otherwise indicated, the vertical line segment between the points a ± iT). Our estimates will be based on the following result.

Lemma 3.20. Let a, y > 0 and T ≥ 2. Denote
\[
\Theta(y) := \begin{cases} 1, & \text{if } y > 1,\\ 1/2, & \text{if } y = 1,\\ 0, & \text{if } y < 1.\end{cases}
\]
Then
\[
\big|I(y,T) - \Theta(y)\big| \le \begin{cases} aT^{-1}, & \text{if } y = 1,\\[2pt] y^a\min\Big(1, \dfrac{1}{T|\log y|}\Big), & \text{if } y \ne 1.\end{cases}
\]

Proof. Let C = ∂B(0, R) be the circle (taken with positive orientation) with center 0 and radius R := |a + iT|. Write
\[
C_1 = C\cap\{\sigma \ge a\} \quad\text{and}\quad C_2 = C\cap\{\sigma \le a\}.
\]
Denote by L = [a − iT, a + iT] the line segment between the points a ± iT.

Case y > 1. By Cauchy's integral theorem,
\[
\frac{1}{2\pi i}\int_{L\cup C_2} y^s\,\frac{ds}{s} = 1 = \Theta(y).
\]
Hence, by the definition of I(y, T),
\[
\big|I(y,T) - \Theta(y)\big| = \Big|\frac{1}{2\pi i}\int_{C_2} y^s\,\frac{ds}{s}\Big| \le \frac{1}{2\pi}\int_{C_2}\frac{y^a}{R}\,|ds| \le y^a. \tag{3.11}
\]
This gives one part of the 'min'. To get the other, choose a′ > 0 large and apply Cauchy's theorem on the boundary of the rectangle with sides L, L₊, L′, L₋, where L₊ = [a + iT, −a′ + iT], L′ = [−a′ + iT, −a′ − iT] and L₋ = [−a′ − iT, a − iT]. This time
\[
\big|I(y,T) - \Theta(y)\big| = \Big|\frac{1}{2\pi i}\int_{L_+\cup L'\cup L_-} y^s\,\frac{ds}{s}\Big|.
\]
We observe first that
\[
\Big|\int_{L'} y^s\,\frac{ds}{s}\Big| \le 2T\,\frac{y^{-a'}}{a'}.
\]
Secondly, as |1/s| ≤ T^{−1} on L±, it follows that
\[
\Big|\int_{L_\pm} y^s\,\frac{ds}{s}\Big| \le T^{-1}\int_{-a'}^a y^\sigma\,d\sigma = \frac{y^a - y^{-a'}}{T\log y}.
\]
Putting these estimates together gives us
\[
\big|I(y,T) - \Theta(y)\big| \le \frac{1}{2\pi}\Big(\frac{2Ty^{-a'}}{a'} + \frac{2\big(y^a - y^{-a'}\big)}{T\log y}\Big).
\]
Letting a′ → ∞ gives the upper bound \(y^a(\pi T\log y)^{-1}\), which is the other part of the 'min'.

The case y < 1 is similar to the previous one, the only differences being that one uses the path C₁ in the first integration, and in the second one the new corner points for the rectangular path of integration are a′ ± iT.

Case y = 1. Denote s = a + it and obtain
\[
I(1,T) - \frac12 = \frac{1}{2\pi i}\int_{-T}^T \frac{i\,dt}{a+it} - \frac12 = \frac{1}{2\pi}\int_{-T}^T \frac{a\,dt}{a^2+t^2} - \frac12 = -\frac1\pi\int_T^\infty\frac{a\,dt}{a^2+t^2}, \tag{3.12}
\]
whence
\[
\big|I(1,T) - \tfrac12\big| \le \frac1\pi\int_T^\infty\frac{a\,dt}{t^2} = \frac{a}{\pi T}.
\]

As an immediate corollary we see that, for any a, y > 0,
\[
\lim_{T\to\infty} I(y,T) = \frac{1}{2\pi i}\int_{a-i\infty}^{a+i\infty} y^s\,\frac{ds}{s} = \Theta(y). \tag{3.13}
\]
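The kernel I(y, T) and the error bounds of Lemma 3.20 can be explored numerically. The sketch below (my addition; the sample values y, a, T are arbitrary) approximates the line integral by a midpoint rule in t, using ds = i dt:

```python
import math

def I(y, a, T, h=0.01):
    # (1/(2 pi i)) * integral of y^s ds/s over the segment [a - iT, a + iT]
    n = round(2 * T / h)
    total = 0j
    for k in range(n):
        s = complex(a, -T + (k + 0.5) * h)
        total += y ** s / s * h
    return total / (2 * math.pi)
```

For T = 300 the lemma predicts errors of size at most a few thousandths at y = 2, 1, 1/2.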

The following lemma has independent interest, since it yields a useful estimate for the growth of a Dirichlet series in the vertical direction.

Lemma 3.21. Let \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) be a convergent Dirichlet series and denote σ_c = σ_c(f). Then for each ε > 0 there is C_ε = C(ε, f) such that
\[
|f(s)| \le C_\varepsilon(1+|t|)^{1-(\sigma-\sigma_c)+\varepsilon} \quad\text{for all } s \text{ with } \sigma_c + \varepsilon \le \sigma \le \sigma_c + 1.
\]

Proof. Assume s ∈ ℂ with σ_c + ε ≤ σ ≤ σ_c + 1. Denote \(R(x) := \sum_{n>x} a_n n^{-\sigma_c-\varepsilon/2}\), so that |R(x)| ≤ C for all x > 0 and \(|a_n n^{-\sigma_c-\varepsilon/2}| \le C\) for all n ≥ 1, since the Dirichlet series converges at σ_c + ε/2. The latter estimate may be reformulated as \(|a_n n^{-\sigma}| \le Cn^{\sigma_c+\varepsilon/2-\sigma}\). Let N ≥ 2. Since \(\sum_{N<n\le M} a_n n^{-\sigma_c-\varepsilon/2} = R(N) - R(M)\), Abel summation yields
\[
\sum_{n=N+1}^\infty a_n n^{-s} = -\Big(\sigma_c + \frac\varepsilon2 - s\Big)\int_N^\infty\big(R(N) - R(x)\big)x^{\sigma_c+\varepsilon/2-s-1}\,dx,
\]
or, in other words,
\[
f(s) = \sum_{n=1}^N a_n n^{-s} + R(N)N^{\sigma_c+\varepsilon/2-s} + \Big(\sigma_c + \frac\varepsilon2 - s\Big)\int_N^\infty R(x)x^{\sigma_c+\varepsilon/2-s-1}\,dx.
\]
Invoking the bounds for \(|a_n n^{-\sigma}|\) and R(x), we obtain
\[
|f(s)| \le C\Big(\sum_{n=1}^N n^{\sigma_c+\varepsilon/2-\sigma} + N^{\sigma_c+\varepsilon/2-\sigma} + (1+|t|)\int_N^\infty x^{\sigma_c+\varepsilon/2-\sigma-1}\,dx\Big)
\le c(\varepsilon)C\Big(N^{1+\sigma_c+\varepsilon/2-\sigma} + (1+|t|)N^{\sigma_c+\varepsilon/2-\sigma}\Big).
\]

The desired conclusion follows by choosing N ∼ 1 + |t|.

We recall the normalized sum convention (compare to Definition 2.18):
\[
\sideset{}{'}\sum_{n\le x} a_n := \begin{cases} \displaystyle\sum_{n\le x} a_n, & x\notin\mathbb{N},\\[6pt] \displaystyle\sum_{n<x} a_n + \frac12 a_x, & x\in\mathbb{N}.\end{cases}
\]

Theorem 3.22 (Perron's formula). Let \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) be a convergent Dirichlet series. Then for any b > max(0, σ_c(f)) and x > 0 it holds that
\[
\sideset{}{'}\sum_{n\le x} a_n = \frac{1}{2\pi i}\int_{b-i\infty}^{b+i\infty} f(s)x^s\,\frac{ds}{s},
\]
where the integral is understood as the limit \(\lim_{T\to\infty}\int_{b-iT}^{b+iT}\).

Proof. Let us first assume that b > σ_c + 1, so that the series converges absolutely on the line {σ = b}. Employing the already familiar notation \(I(y,T) = \frac{1}{2\pi i}\int_{b-iT}^{b+iT} y^s\frac{ds}{s}\), we may then write for T > 2
\[
\frac{1}{2\pi i}\int_{b-iT}^{b+iT} f(s)x^s\,\frac{ds}{s} - \sideset{}{'}\sum_{n\le x} a_n = \sum_{n=1}^\infty a_n I(x/n, T) - \sideset{}{'}\sum_{n\le x} a_n = \sum_{n=1}^\infty a_n\big(I(x/n,T) - \Theta(x/n)\big). \tag{3.14}
\]
In case x ∉ ℕ, Lemma 3.20 shows that the absolute value of the right hand side is dominated by
\[
\sum_{n=1}^\infty |a_n|\,\frac{(x/n)^b}{T|\log(x/n)|} \le \frac{c(x)x^b}{T}\sum_{n=1}^\infty |a_n| n^{-b} \xrightarrow[T\to\infty]{} 0.
\]
If x ∈ ℕ, Lemma 3.20 tells us that the extra error term induced by the term n = x is O(T^{−1}), and the result follows again. In the general case where just b > σ_c, we know that the conclusion is true if b is replaced by b + 1. We use Cauchy's integral theorem to compare the integral over the segment [b − iT, b + iT] to the integral over the segment [b + 1 − iT, b + 1 + iT]. The error induced is the sum of integrals over the segments L± := [b ± iT, b + 1 ± iT], each of length one. By Lemma 3.21 there is δ > 0 so that |f(s)| ≤ cT^{1−δ} on L±, which verifies that the integrals over the segments L± tend to zero as T → ∞. This concludes the proof.

As an immediate corollary of Perron's formula we obtain a new proof of the uniqueness of coefficients (Theorem 3.10). What is more, Perron's formula offers a concrete formula for recovering the coefficients. Most often, though, it is used to estimate the summatory function of the coefficients, and for that purpose it is important to estimate the error made when the integration is truncated at height T. The following formulation is a typical and useful result in this direction.

Theorem 3.23 (An effective Perron formula). Let \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) be a convergent Dirichlet series. Then for any b > max(0, σ_a(f)) and any x ≥ 2 that is not an integer it holds that

\[
\sideset{}{'}\sum_{n\le x} a_n - \frac{1}{2\pi i}\int_{b-iT}^{b+iT} f(s)x^s\,\frac{ds}{s} = O\Big(x^b\sum_{n=1}^\infty \frac{|a_n|\, n^{-b}}{1 + T|\log(x/n)|}\Big).
\]
If x is an integer, then the term corresponding to n = x in the above sum may be replaced by \(b|a_x|T^{-1}\).

Proof. As \(\sum_{n=1}^\infty |a_n| n^{-b}\) now converges absolutely, we may simply repeat the argument from the beginning of the proof of Theorem 3.22. This time, in the case where x is not an integer, we use the estimate \(|I(y,T) - \Theta(y)| \le 2y^b\big(1 + T|\log y|\big)^{-1}\), which follows easily from Lemma 3.20. If x is an integer, we use for n = x the estimate \(|I(1,T) - \Theta(1)| \le bT^{-1}\), again from Lemma 3.20.

We leave it as an exercise to derive from the above another form of the error estimate:

Theorem 3.24 (Second effective Perron formula). Let \(f(s) = \sum_{n=1}^\infty a_n n^{-s}\) be a convergent Dirichlet series. Then for any b > max(0, σ_a(f)) and x ≥ 1 it holds that
\[
\sideset{}{'}\sum_{n\le x} a_n - \frac{1}{2\pi i}\int_{b-iT}^{b+iT} f(s)x^s\,\frac{ds}{s} = O\Big(\frac{x^b}{T}\sum_{n=1}^\infty |a_n| n^{-b} + b(x)\Big(1 + \frac{x\log x}{T}\Big)\Big),
\]
where \(b(x) := \max_{\frac34 x\le n\le \frac54 x}|a_n|\).
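As a concrete illustration of the truncated Perron integral (my addition; the parameter choices are arbitrary), take the finite Dirichlet polynomial f(s) = Σ_{n=1}^{20} n^{−s}, for which Σ′_{n≤10.5} aₙ = 10, and evaluate the integral at b = 2, T = 400 by a midpoint rule:

```python
import math

x, b, T, h = 10.5, 2.0, 400.0, 0.01
steps = round(2 * T / h)
total = 0j
for k in range(steps):
    s = complex(b, -T + (k + 0.5) * h)
    f = sum(n ** (-s) for n in range(1, 21))   # f(s) = sum_{n <= 20} n^{-s}
    total += f * x ** s / s * h
perron = (total / (2 * math.pi)).real
```

By Theorem 3.23 the truncation error is bounded by a sum of terms of size roughly (x/n)^b / (T |log(x/n)|), dominated here by the n near x.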

Chapter 4

Prime number theorem

Let us start by recalling von Mangoldt's arithmetic function:
\[
\Lambda(n) := \begin{cases} \log p, & \text{if } n = p^k,\ k\ge 1,\\ 0, & \text{otherwise.}\end{cases}
\]
Thus Λ(n) vanishes unless n is a power of a prime, and one has (Exercise!) the important relation
\[
-\frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^\infty \Lambda(n)n^{-s}, \qquad \sigma > 1, \tag{4.1}
\]
which we will use in later sections but not for our first proof of the PNT. Directly from the definition one obtains that
\[
\Lambda * u(n) = \log(n). \tag{4.2}
\]
The corresponding summatory function is called Chebyshev's function:
\[
\psi(x) := \sum_{n\le x}\Lambda(n).
\]
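Both Λ and ψ are easy to compute for small n, and relation (4.2) can be verified directly (a sketch, my addition):

```python
import math

def mangoldt(n):
    # Lambda(n) = log p if n = p^k (k >= 1), else 0
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            m = n
            while m % p == 0:
                m //= p
            return math.log(p) if m == 1 else 0.0
        p += 1
    return math.log(n)   # n itself is prime

def psi(x):
    return sum(mangoldt(n) for n in range(1, int(x) + 1))

# Lambda * u = log: summing Lambda over the divisors of n should give log n
max_err = max(
    abs(sum(mangoldt(d) for d in range(1, n + 1) if n % d == 0) - math.log(n))
    for n in range(1, 201)
)
```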

The equivalence between (i) and (ii) in the following result is the most common starting point of a proof of the PNT. That it is also elementarily equivalent to (iv) was observed by Landau.

Theorem 4.1. The following four conditions are equivalent.

(i) The PNT holds, i.e. π(x) ∼ x(log x)^{−1} as x → ∞.

(ii) \(\displaystyle\lim_{x\to\infty}\frac{\psi(x)}{x} = 1\).

(iii) \(\displaystyle M(x) := \sum_{n\le x}\mu(n) = o(x)\).

(iv) \(\displaystyle\sum_{n=1}^\infty\frac{\mu(n)}{n} = 0\).

We prove just that (iv) ⟹ (iii) ⟹ (ii) ⟺ (i), since only the implication (iv) ⟹ (i) is used in our proof of the PNT. The remaining implications are left as a guided exercise. Before the proof it is useful to write down a slightly more general form of the simple Dirichlet hyperbola method that we already applied before in order to estimate the divisor sum.

Lemma 4.2. Let f and g be number theoretic functions, and let F and G stand for the corresponding summatory functions. Then, for any y ∈ (1, x) we have
\[
\sum_{n\le x}(f*g)(n) = \sum_{d\le y} f(d)G(x/d) + \sum_{d'\le x/y} F(x/d')g(d') - F(y)G(x/y).
\]
Proof. The left hand side can be written as
\[
\sum_{dd'\le x} f(d)g(d') = \Big(\sum_{\substack{dd'\le x\\ d\le y}} + \sum_{\substack{dd'\le x\\ d>y}}\Big) f(d)g(d')
= \sum_{d\le y} f(d)\sum_{d'\le x/d} g(d') + \sum_{d'\le x/y} g(d')\sum_{y<d\le x/d'} f(d),
\]
and the claim follows since \(\sum_{y<d\le x/d'} f(d) = F(x/d') - F(y)\) and \(\sum_{d'\le x/y} g(d')F(y) = F(y)G(x/y)\).

Proof of Theorem 4.1. Let us start with (ii) ⟺ (i). Note that \(\psi(x) = \sum_{p^k\le x}\log p\), so that
\[
\psi(x) = \sum_{p\le x}\Big\lfloor\frac{\log x}{\log p}\Big\rfloor\log p \le \sum_{p\le x}\log x = \pi(x)\log x. \tag{4.3}
\]
On the other hand, if 1 < y < x, then
\[
\pi(x) - \pi(y) \le \sum_{y<p\le x}\frac{\log p}{\log y} \le \frac{\psi(x)}{\log y}.
\]
By choosing y = x/log²x and using π(y) ≤ y we obtain
\[
\pi(x) \le \frac{x}{\log^2 x} + \frac{\psi(x)}{\log x - 2\log\log x}.
\]
Combining this with (4.3) it follows that
\[
\frac{\psi(x)}{x} \le \frac{\pi(x)}{x/\log x} \le \frac{1}{\log x} + \frac{\psi(x)}{x}\Big(1 - \frac{2\log\log x}{\log x}\Big)^{-1},
\]
and in the limit x → ∞ this clearly yields the equivalence between (i) and (ii).

Assume then that (iv) holds true, so that \(m(x) := \sum_{n\le x}\frac{\mu(n)}{n} = o(1)\). Then (iii) follows by an application of partial summation:
\[
\frac{M(x)}{x} = \frac1x\sum_{n\le x} n\cdot\frac{\mu(n)}{n} = m(x) - \frac1x\int_1^x m(u)\,du \to 0 \quad\text{as } x\to\infty.
\]
We then assume that (iii) is satisfied, and establish (ii). For that end, let us recall the divisor theorem concerning the average behaviour of the divisor function τ(n), and denote for n ≥ 1 the following expression, close to the remainder in the divisor theorem (Thm 1.22):
\[
E(n) := \tau(n) - \log(n) - 2\gamma
\]
(γ is Euler's constant). We claim that
\[
\psi(x) = x - \sum_{n\le x}(E*\mu)(n) + O(1). \tag{4.4}
\]
Observe first that by Möbius inversion and (4.2) we have Λ(n) = (µ ∗ log)(n). Also τ = u ∗ u, so that u = µ ∗ τ. Hence (4.4) follows from the computation
\[
\lfloor x\rfloor - \psi(x) - 2\gamma = \sum_{n\le x}\big(u(n) - \Lambda(n) - 2\gamma I(n)\big) = \sum_{n\le x}\big(\mu*(\tau - \log - 2\gamma u)\big)(n) = \sum_{n\le x}(\mu*E)(n),
\]
where we also used the fact that I = µ ∗ u. According to (4.4), statement (ii) follows if we are able to show that
\[
H(x) := \sum_{n\le x}(E*\mu)(n) = o(x). \tag{4.5}
\]

First of all, by the definition of E, Theorem 1.22 and Stirling's formula we see that
\[
F(x) := \sum_{n\le x}E(n) = \sum_{n\le x}\tau(n) - \sum_{n\le x}\log n - 2\gamma\lfloor x\rfloor
= x\log x + (2\gamma-1)x + O(x^{1/2}) - \big(x\log x - x + O(\log x)\big) - 2\gamma x + O(1) = O(x^{1/2}). \tag{4.6}
\]
We then fix y > 1 and apply the Dirichlet hyperbola method via Lemma 4.2 to write for x > y
\[
H(x) = \sum_{dd'\le x}E(d)\mu(d') = \sum_{d\le y}E(d)M(x/d) + \sum_{d'\le x/y}F(x/d')\mu(d') - F(y)M(x/y).
\]
If we divide both sides above by x and recall that y is fixed, our assumption M(x) = o(x) yields
\[
\limsup_{x\to\infty}\frac{|H(x)|}{x} \le \limsup_{x\to\infty}\frac1x\sum_{n\le x/y}|F(x/n)| \le \limsup_{x\to\infty}\frac{C}{x}\sum_{n\le x/y}\sqrt{x/n}
\le \limsup_{x\to\infty}\frac{C}{\sqrt x}\Big(1 + 2\sqrt{x/y}\Big) \le 2C\,y^{-1/2}. \tag{4.7}
\]
This yields (4.5), since y > 1 is arbitrary.

We are almost ready to prove the PNT. One element of the proof is the elementary number theory contained in the implication (iv) ⟹ (i) of the previous theorem (Landau's observation). In turn, the analytic part is Theorem 3.13, whose simple proof was based on Newman's ingenious trick. As an input for Theorem 3.13 we just need the important fact that ζ has no zeroes on the line {σ = 1}.

Lemma 4.3. ζ(1 + it) ≠ 0 for all t ∈ R.

Proof. Starting from the Euler product formula (Lemma 3.17) and using absolute convergence we obtain
\[
\log\zeta(s) = \sum_{k=1}^{\infty}\sum_{p\in P} \frac{1}{k\,p^{ks}}, \qquad \sigma > 1.
\]
By taking the real part it follows that
\[
\log|\zeta(s)| = \sum_{k=1}^{\infty}\sum_{p\in P} \frac{\cos(kt\log p)}{k\,p^{k\sigma}}, \qquad \sigma > 1.
\]
Applying the trigonometric identity 3 + 4cos(x) + cos(2x) = 2(1 + cos(x))² for real values of x and using the previous formula, we see that for any σ > 1 and any t it holds that
\[
3\log\zeta(\sigma) + 4\log|\zeta(\sigma+it)| + \log|\zeta(\sigma+2it)| \ge 0.
\]
In other words, there is the interesting inequality

\[
\zeta(\sigma)^3\,|\zeta(\sigma+it)|^4\,|\zeta(\sigma+2it)| \ge 1 \quad \text{for } \sigma > 1. \tag{4.8}
\]
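Inequality (4.8) is easy to test numerically by truncating the Dirichlet series of ζ for σ > 1. A minimal sketch (the truncation point 5000 and the sample values of t are arbitrary choices, not from the text; the truncation introduces a small error, which is harmless here since the products stay well above 1):

```python
# Approximate zeta(s) for Re(s) > 1 by a truncated Dirichlet series and
# evaluate the product zeta(sigma)^3 |zeta(sigma+it)|^4 |zeta(sigma+2it)|,
# which inequality (4.8) asserts is >= 1.
def zeta_approx(s, terms=5000):
    return sum(n ** (-s) for n in range(1, terms + 1))

sigma = 2.0
prods = []
for t in (0.5, 1.0, 2.0):
    prods.append(abs(zeta_approx(sigma)) ** 3
                 * abs(zeta_approx(complex(sigma, t))) ** 4
                 * abs(zeta_approx(complex(sigma, 2 * t))))
print(prods)  # each entry exceeds 1, as (4.8) predicts
```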

Let us assume, contrary to the claim, that ζ(1 + it₀) = 0 for some t₀ ≠ 0. We choose t = t₀ in (4.8) and let σ → 1⁺. Since ζ has a simple pole at s = 1 and ζ(σ + it₀) has a zero at σ = 1, the left side of (4.8) is of size O((σ−1)⁻³)·O((σ−1)⁴) = O(σ−1), which tends to zero as σ → 1⁺. This contradicts the lower bound 1 in (4.8). Finally, the Prime Number Theorem:

Theorem 4.4. (PNT, Hadamard and de la Vallée Poussin 1896)

\[
\lim_{x\to\infty}\frac{\pi(x)}{x/\log x} = 1.
\]

Proof. Denote F(s) = \(\sum_{n=1}^{\infty}\mu(n)n^{-s} = 1/\zeta(s)\) for σ > 1. By Lemma 4.3 it extends analytically over the line {σ = 1}, and the extension vanishes at s = 1. Theorem 3.13 applies and it follows that
\[
0 = F(1) = \sum_{n=1}^{\infty} \frac{\mu(n)}{n}.
\]
Now the PNT follows from Theorem 4.1.
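The convergence of the partial sums of Σ μ(n)/n towards 0 can be observed empirically. A small sketch (the sieve limit and the tolerance are arbitrary choices):

```python
# Tabulate the Moebius function mu(n) by a sieve, then examine the partial
# sum m(x) = sum_{n <= x} mu(n)/n, which the theorem asserts tends to 0.
N = 100_000
mu = [1] * (N + 1)
is_prime = [True] * (N + 1)
for p in range(2, N + 1):
    if is_prime[p]:
        for k in range(2 * p, N + 1, p):
            is_prime[k] = False
        for k in range(p, N + 1, p):
            mu[k] *= -1          # one factor -1 per distinct prime divisor
        for k in range(p * p, N + 1, p * p):
            mu[k] = 0            # mu vanishes on non-squarefree numbers
m = sum(mu[n] / n for n in range(1, N + 1))
print(m)  # small in absolute value, consistent with sum mu(n)/n = 0
```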

Corollary 4.5. All the statements (i)–(iv) of Thm 4.1 are true.

Remark 4.6. The above proof of the PNT used (besides trivialities) basically the following 4 ingredients:

1. ζ extends meromorphically over the line {σ = 1} with a pole at 1.

2. ζ does not have zeroes on the line {σ = 1}.

3. Ingham-Newman tauberian theorem (Thm 3.13)

4. Landau’s equivalence, based on elementary NT (Thm 4.1).

Thus, the knowledge that is needed from ζ is coded in conditions 1. & 2., of which 2. is the deeper one (an exercise asks you to apply Abel summation to give a very simple proof of part 1.). Basically all known analytic proofs of the PNT use these conditions in one way or another. The original proofs from 1896 used many complex analytic properties of ζ besides conditions 1. and 2. The first proof that uses from ζ only the conditions 1. and 2. is due to Wiener. One should also note that it is possible to organise Newman's proof so that it does not refer to Landau's equivalence, see [Zagier: On Newman's proof of the PNT, American Mathematical Monthly 1997].

Chapter 5

Dirichlet characters, primes in arithmetic sequences and quadratic excess

In 1837 Dirichlet gave a striking start to analytic number theory by showing that every arithmetic sequence a + b, 2a + b, 3a + b, . . . , where (a, b) = 1, a ≥ 1, contains infinitely many primes. In order to get some a priori insight into Dirichlet's ideas, let us recall that in the NT course we gave a rather easy proof of

Claim: There are infinitely many primes with p ≡ 1 (mod 4).

Our proof was based on applying the Legendre symbol (−1/p). Let us now see how one may apply suitable Dirichlet series to obtain another proof of the result. To that end, let us define the arithmetic function χ (a character) by setting
\[
\chi(n) := \begin{cases} 0 & \text{if } n \text{ is even},\\ 1 & \text{if } n \equiv 1 \pmod 4,\\ -1 & \text{if } n \equiv -1 \pmod 4. \end{cases}
\]
We observe that χ is completely multiplicative: χ(nm) = χ(n)χ(m) for all n, m ≥ 1. Define the corresponding Dirichlet series (or "L-function") by setting
\[
L_\chi(s) := \sum_{n=1}^{\infty} \chi(n)n^{-s} = 1 - 3^{-s} + 5^{-s} - 7^{-s} + \cdots.
\]

Then L_χ is a convergent Dirichlet series with σ_c = 0 and σ_a = 1. To see this, note that the series is alternating and convergent for s ∈ (0, ∞), and L_χ is analytic in σ > 0. Thus by Theorem 3.19 we have for σ > 1

\[
L_\chi(s) = \prod_p \bigl(1 - \chi(p)p^{-s}\bigr)^{-1}
= \prod_{p\equiv 1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-s}\bigr)^{-1} \prod_{p\equiv -1\ (\mathrm{mod}\ 4)} \bigl(1 + p^{-s}\bigr)^{-1}.
\]

Since we may write the Euler product of the Riemann zeta function ζ in the form

\[
\zeta(s) = \bigl(1 - 2^{-s}\bigr)^{-1} \prod_{p\equiv 1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-s}\bigr)^{-1} \prod_{p\equiv -1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-s}\bigr)^{-1},
\]
it follows that

\[
\zeta(s)L_\chi(s) = \bigl(1 - 2^{-s}\bigr)^{-1} \prod_{p\equiv 1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-s}\bigr)^{-2} \prod_{p\equiv -1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-2s}\bigr)^{-1}. \tag{5.1}
\]

We now let s = σ > 1 be real and consider the limit σ → 1+ in (5.1). Clearly

\[
\lim_{\sigma\to 1^+} L_\chi(\sigma) = L_\chi(1) = 1 - \frac13 + \frac15 - \cdots \ne 0.
\]
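In fact the value of this alternating series is known: 1 − 1/3 + 1/5 − ⋯ = π/4 (Leibniz' series), which is in particular nonzero. A quick numerical sketch (the truncation point is an arbitrary choice; the alternating series test bounds the error by the first omitted term):

```python
import math

# Partial sum of the alternating series 1 - 1/3 + 1/5 - ...; the error
# after N terms is at most the next term, 1/(2N + 1).
N = 200_000
L = sum((-1) ** k / (2 * k + 1) for k in range(N))
print(L, math.pi / 4)  # the two values agree to within 1/(2N + 1)
```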

Moreover, there is obviously the finite limit \(\lim_{\sigma\to 1^+} \prod_{p\equiv -1\ (4)} (1 - p^{-2\sigma})^{-1} = \prod_{p\equiv -1\ (4)} (1 - p^{-2})^{-1} > 0\). Thus, because ζ has a pole at s = 1, (5.1) yields that

\[
\infty = \lim_{\sigma\to 1^+} \prod_{p\equiv 1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-\sigma}\bigr)^{-1} = \prod_{p\equiv 1\ (\mathrm{mod}\ 4)} \bigl(1 - p^{-1}\bigr)^{-1}.
\]

This obviously yields that there are infinitely many primes congruent to 1 (mod 4). In order to be able to generalise the above arguments we need to be sure that the key facts in the above argument have the right analogues in the case of arbitrary arithmetic sequences with parameters a, b such that (a, b) = 1. More precisely, we will be facing the following questions:

• Are there completely multiplicative characters χ (mod a) that can be suitably employed to separate the integers congruent to b (mod a)?

• Do the corresponding L-functions satisfy L_χ(1) ≠ 0?

We start by giving a positive answer to the first question, and to that end we will have a little bit of fun learning what finite abelian groups look like.

5.1 Structure of finite abelian groups

In this section all groups are abelian (i.e. commutative) and finite. Let us recall some basic notions. The order (or size) of a group G is denoted by |G| := #(G). If g ∈ G, then the order of g is ord(g) := min{n ≥ 1 : ng = 0}. As seen in the above formula, we use additive notation for groups, i.e. we consider (G, +, 0) and write 3g := g + g + g etc. An element g ∈ G generates the cyclic subgroup

⟨g⟩ := {ng : 0 ≤ n ≤ ord(g) − 1}.

We say that G is cyclic if G = ⟨g⟩ for some g ∈ G. Obviously, we then have the group isomorphism G ≈ Z_m with m = ord(g). Here, as in the course on number theory, we denote for simplicity (not an orthodox notation!) Z_m := Z/mZ, where Z stands for the additive group of integers.

Definition 5.1. A direct product of groups G1 and G2 is the product set G1 × G2 equipped with the operation

(g₁, g₂) + (g₁′, g₂′) = (g₁ + g₁′, g₂ + g₂′)

for any g₁, g₁′ ∈ G₁ and g₂, g₂′ ∈ G₂.

Lemma 5.2. Let G be a finite abelian group. Then for any g ∈ G we have

(i) mg = 0 if and only if ord(g) | m;

(ii) ord(mg) = ord(g)/(m, ord(g));

(iii) if k ≥ 0, m ≥ 1, p ∈ P and ord(p^k g) = p^m, then ord(g) = p^{k+m}.

Proof. Exercise.

Definition 5.3. G is a p-group, where p ∈ P, if ord(g) is a power of p for all g ∈ G.

Observe that always |G|g = 0 for all g ∈ G (why?), whence we have

ord (g)| |G| for all g ∈ G. (5.2)

Recall that if H ⊂ G is a subgroup, then it is automatically normal since G is abelian. We may hence define the quotient group G/H as the set of cosets

ḡ := g + H, g ∈ G, and G/H is again abelian. The following lemma looks a little bit specialised, but turns out to be useful.

Lemma 5.4. Let G be a finite abelian p-group. Let g₁ ∈ G have maximal order ord(g₁) = p^{α₁}, α₁ ≥ 1. Denote by H = ⟨g₁⟩ the cyclic subgroup generated by g₁. Assume that g ∈ G is such that ḡ ∈ G/H \ {0} has order p^α (in G/H). Then there is a representative g′ ∈ G of ḡ such that ord(g′) = p^α (in G).

Proof. By assumption p^α g ∈ H, or in other words

p^α g = ℓg₁ with 1 ≤ ℓ ≤ p^{α₁}.

If ℓ = p^{α₁}, we have p^α g = 0 (in G), so we may clearly choose g′ = g, since in any case the order of g (in G) cannot be smaller than the order of ḡ (in G/H). Assume then

that ℓ < p^{α₁} and write ℓ = p^k m with (m, p) = 1 and k < α₁. By Lemma 5.2(ii) we have
\[
\mathrm{ord}(p^{\alpha} g) = \mathrm{ord}(m p^{k} g_1) = \frac{p^{\alpha_1}}{(p^{\alpha_1},\, m p^{k})} = p^{\alpha_1 - k} > 1,
\]
so that Lemma 5.2(iii) yields that

ord(g) = p^{α+α₁−k}, and since p^{α₁} was the maximal order in G, we must have α ≤ k. Denote

h := p^{k−α} m g₁ ∈ H and choose g′ := g − h.

Then ḡ′ = ḡ and p^α g′ = p^α g − p^k m g₁ = 0, which implies that ord(g′) ≤ p^α. On the other hand, we obviously have ord(g′) ≥ ord(ḡ′) = ord(ḡ) = p^α, so that ord(g′) = p^α.

It is useful to observe that

G ≈ G₁ × ⋯ × G_ℓ for cyclic factors G_j with |G_j| = m_j, 1 ≤ j ≤ ℓ, if and only if

G ≈ Z_{m₁} × ⋯ × Z_{m_ℓ},

and for this it is clearly equivalent that we can find a 'basis' for G (elements g₁, ..., g_ℓ ∈ G) so that
\[
\text{(A)}\quad G = \Bigl\{ \sum_{j=1}^{\ell} a_j g_j : a_1, \ldots, a_\ell \in \mathbb{Z} \Bigr\},
\]
\[
\text{(B)}\quad \sum_{j=1}^{\ell} a_j g_j = 0 \;\Longleftrightarrow\; m_j \mid a_j \ \ \forall\, j = 1, \ldots, \ell.
\]
We can now first settle the structure of p-groups.

Theorem 5.5. Let p ∈ P and assume that G is a finite abelian p-group. Then G is isomorphic to a direct product of cyclic p-groups.

Proof. We induct on the size |G| of the group. The statement is obviously true if |G| ≤ 2. Thus, assume that m ≥ 3 and that the claim is true whenever |G| ≤ m − 1. Assume that |G| = m. Pick g₁ ∈ G with maximal order ord(g₁) = p^{α₁} (α₁ ≥ 1). Write m₁ = p^{α₁} and let H := ⟨g₁⟩ = {0, g₁, 2g₁, ..., (m₁ − 1)g₁} stand for the cyclic subgroup generated by g₁. By the induction hypothesis we may write

G/H ≈ Z_{m₂} × ⋯ × Z_{m_ℓ},

where m_j ≥ 2 for each j. Here we have observed that G/H is also a p-group, since if ḡ ∈ G/H we have p^β ḡ = 0, where p^β = ord(g) (in G), and hence ord(ḡ) (in G/H) divides p^β by Lemma 5.2(i). Especially, we may pick a basis

b̄₂, b̄₃, ..., b̄_ℓ ∈ G/H so that they satisfy conditions (A) and (B) above, and so that ord(b̄_j) = m_j for all j = 2, ..., ℓ. Lemma 5.4 allows us to pick representatives g_j ∈ G so that ḡ_j = b̄_j and

ord(g_j) = ord(b̄_j) = m_j, 2 ≤ j ≤ ℓ.   (5.3)

Define the map

Φ : Z_{m₁} × ⋯ × Z_{m_ℓ} → G, where (for any representatives a_j ∈ Z of the elements ā_j ∈ Z_{m_j}) we set
\[
\Phi((\bar a_1, \ldots, \bar a_\ell)) = \sum_{j=1}^{\ell} a_j g_j.
\]

We next verify that Φ is a group isomorphism.

Claim 1. Φ is a well-defined homomorphism.

The map is well-defined (i.e. does not depend on the choice of the representatives a_j) since ord(g_j) = m_j, so that a_j g_j = a_j′ g_j whenever a_j ≡ a_j′ (mod m_j). Clearly it is a homomorphism.

Claim 2. Φ is surjective.

Take an arbitrary g ∈ G and note that, by our use of the induction hypothesis, there are integers a₂, ..., a_ℓ so that ḡ = a₂ḡ₂ + ⋯ + a_ℓḡ_ℓ in G/H. In other words, the element g − (a₂g₂ + ⋯ + a_ℓg_ℓ) lies in H, and hence can be written as a₁g₁ with some integer a₁. All in all,
\[
g = \sum_{j=1}^{\ell} a_j g_j.
\]

Claim 3. Φ is injective.

Assume that Φ((ā₁, ..., ā_ℓ)) = 0. We are to show that m_j | a_j for all j ∈ {1, ..., ℓ}. Projecting the equation to G/H gives 0 = ā₂b̄₂ + ⋯ + ā_ℓb̄_ℓ, which by condition (B) implies that m_j | a_j whenever j ≥ 2. Since ord(g_j) = m_j by (5.3), we then have a_j g_j = 0 for j ≥ 2, and the equation Φ((ā₁, ..., ā_ℓ)) = 0 reduces to a₁g₁ = 0. Finally, since ord(g₁) = m₁, we must have m₁ | a₁.

Lemma 5.6. Let G be a finite abelian group and let the subgroups G₁, G₂ ⊂ G satisfy

G₁ + G₂ = G and G₁ ∩ G₂ = {0}.

Then G ≈ G₁ × G₂.

Proof. Consider the group homomorphism

Φ : G₁ × G₂ → G, (g₁, g₂) ↦ g₁ + g₂. Simply observe that the first (resp. second) condition verifies the surjectivity (resp. injectivity) of Φ.

Definition 5.7. A group G has exponent m if ord(g) | m for all g ∈ G. Equivalently, mg = 0 for all g ∈ G.

Lemma 5.8. Assume that G is a finite abelian group of exponent mn, where m, n ≥ 2 and (m, n) = 1. Then there are subgroups G₁, G₂ ⊂ G so that

G ≈ G₁ × G₂, where G₁ has exponent m and G₂ has exponent n.

Proof. Simply define

G₁ := {g ∈ G : mg = 0},  G₂ := {g ∈ G : ng = 0}.

Then clearly G₁ has exponent m and G₂ has exponent n. It is also clear that both are subgroups of G. Thus the statement follows as soon as we verify the conditions of Lemma 5.6. Assume first that g ∈ G₁ ∩ G₂. Pick integers r, s so that rm + sn = 1. We obtain g = (rm + sn)g = r(mg) + s(ng) = 0 + 0 = 0, whence G₁ ∩ G₂ = {0}. Moreover, if g ∈ G is arbitrary, we have mng = 0 since G has exponent mn, and this implies that ng ∈ G₁ and mg ∈ G₂. Finally, g = s(ng) + r(mg), which implies that G = G₁ + G₂.

We are ready for the general result:

Theorem 5.9. Let G be a non-trivial finite abelian group. Then G is isomorphic to a direct product of cyclic groups, i.e.

G ≈ Z_{m₁} × ⋯ × Z_{m_ℓ}   (5.4)

for some integers m₁, ..., m_ℓ ≥ 2.

Proof. We induct on the order |G|. Assume that the claim is true for |G| ≤ r − 1, where r ≥ 3, and assume then that |G| = r = p₁^{α₁} ⋯ p_k^{α_k}, where the p_j are distinct primes. Repeated application of Lemma 5.8 allows us to write G as a direct product of p-groups, and finally the claim follows by applying Theorem 5.5 to each of the factors.

Remark 5.10. One may actually write some abelian groups in several different ways in the form (5.4). E.g., one may check that Z₂ × Z₃ ≈ Z₆.
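The isomorphism Z₂ × Z₃ ≈ Z₆ mentioned in Remark 5.10 is just the Chinese remainder theorem, and one can verify it mechanically. A minimal sketch (additive notation, moduli 2 and 3 as in the remark):

```python
# The map a mod 6 -> (a mod 2, a mod 3) is a bijective homomorphism
# from Z_6 onto Z_2 x Z_3, illustrating Remark 5.10.
phi = {a: (a % 2, a % 3) for a in range(6)}
assert len(set(phi.values())) == 6  # bijective onto the 6-element product
for a in range(6):
    for b in range(6):
        image_of_sum = phi[(a + b) % 6]
        sum_of_images = ((phi[a][0] + phi[b][0]) % 2,
                         (phi[a][1] + phi[b][1]) % 3)
        assert image_of_sum == sum_of_images  # additive homomorphism
```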

5.2 Characters of finite Abelian groups

As is customary, we now shift to multiplicative notation. We denote, with slight ambiguity, by 1 ∈ G the neutral element. Especially, Theorem 5.9 now takes the form: if G is a finite abelian group, then for a suitable choice of the basis g₁, ..., g_ℓ we have
\[
G = \{ g_1^{a_1} g_2^{a_2} \cdots g_\ell^{a_\ell} : 0 \le a_j \le \mathrm{ord}(g_j) - 1 \}, \tag{5.5}
\]
where the representation is unique. Especially, we have

\[
|G| = \prod_{j=1}^{\ell} \mathrm{ord}(g_j).
\]

Definition 5.11. Let G be an abelian group. A map χ : G → C is a character (of G) if χ(gh) = χ(g)χ(h) for all g, h ∈ G, and χ is not identically zero. The set of all characters is denoted by Ĝ.

Lemma 5.12. Let χ ∈ Ĝ, where G is finite and abelian. Then (i) χ(1) = 1; (ii) |χ(g)| = 1 for all g ∈ G; (iii) if ord(g) = ℓ, then χ(g) is an ℓ-th root of unity.

Proof. Exercise.

Lemma 5.13. Let G be a finite abelian group. Then Ĝ is an abelian group when it is endowed with the standard product of functions on G: if χ₁, χ₂ ∈ Ĝ, we set

(χ1χ2)(g) = χ1(g)χ2(g) for g ∈ G.

The identity element is the principal character χ1 :

χ1(g) := 1 for all g ∈ G.

Proof. Exercise.

Our next three results express, each in its own way, that Ĝ is 'rich' enough for our later purposes.

Theorem 5.14. Let G be a finite abelian group. Then |Ĝ| = |G|.

Proof. We apply the representation (5.5). Denote m_j := ord(g_j) ≥ 2, 1 ≤ j ≤ ℓ. By Lemma 5.12, χ(g_j) is an m_j-th root of unity, i.e.
\[
\chi(g_j) = \exp\Bigl(\frac{2\pi i}{m_j} k_j\Bigr) \tag{5.6}
\]

for some k_j ∈ {0, 1, ..., m_j − 1}. Conversely, if we fix k_j ∈ {0, 1, ..., m_j − 1} for all 1 ≤ j ≤ ℓ and define a character by the formula
\[
\chi(g_1^{a_1} \cdots g_\ell^{a_\ell}) = \prod_{j=1}^{\ell} \exp\Bigl(\frac{2\pi i}{m_j} a_j k_j\Bigr), \tag{5.7}
\]
it is obviously well-defined and multiplicative. It is now clear that the number of characters equals m₁ ⋯ m_ℓ = |G|.

Lemma 5.15. If g ∈ G and g ≠ 1, then χ(g) ≠ 1 for some χ ∈ Ĝ.

Proof. Let g = g₁^{a₁} ⋯ g_ℓ^{a_ℓ} as in (5.5). Since g ≠ 1, we may assume that m_{j₀} := ord(g_{j₀}) does not divide a_{j₀} for some index j₀ ∈ {1, ..., ℓ}. Define a character χ by choosing k_{j₀} = 1 and k_j = 0 for j ≠ j₀ in (5.7). Then χ(g) = exp(2πi a_{j₀}/m_{j₀}) ≠ 1.

The following result expresses an important orthogonality property of characters. The notation Σ_χ is a shorthand for the sum Σ_{χ∈Ĝ}.

Theorem 5.16. Let G be a finite abelian group. Then for any g, h ∈ G we have
\[
\frac{1}{|G|}\sum_{\chi} \overline{\chi(h)}\,\chi(g) = \begin{cases} 1, & h = g,\\ 0, & \text{otherwise}. \end{cases}
\]

Proof. We have χ(h⁻¹)χ(h) = 1, whence \(\overline{\chi(h)} = \chi(h^{-1})\). Thus \(\overline{\chi(h)}\chi(g) = \chi(h^{-1}g)\), which shows that it is enough to consider the case h = 1, i.e. to show that
\[
\sum_{\chi} \chi(g) = \begin{cases} |G|, & g = 1,\\ 0, & \text{otherwise}. \end{cases} \tag{5.8}
\]

The case g = 1 is clear, so assume that g ≠ 1 and use Lemma 5.15 to pick χ₀ so that χ₀(g) ≠ 1. As χ runs over Ĝ, the elements χ₀χ do so as well. Hence
\[
\sum_{\chi} \chi(g) = \sum_{\chi} (\chi_0\chi)(g) = \chi_0(g) \sum_{\chi} \chi(g),
\]
which implies that Σ_χ χ(g) = 0.
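Theorem 5.16 is easy to see in action for a cyclic group Z_n, whose characters are χ_k(g) = e^{2πikg/n} by (5.7). A numerical sketch (n = 12 and the sampled group elements are arbitrary choices):

```python
import cmath

# Characters of the cyclic group Z_n: chi_k(g) = exp(2*pi*i*k*g/n).
# The orthogonality relation says that
# (1/|G|) * sum_k conj(chi_k(h)) * chi_k(g) = 1 if g == h, else 0.
n = 12

def chi(k, g):
    return cmath.exp(2j * cmath.pi * k * g / n)

def pairing(h, g):
    return sum(chi(k, h).conjugate() * chi(k, g) for k in range(n)) / n

print(pairing(3, 3))  # close to 1
print(pairing(3, 7))  # close to 0
```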

5.3 Dirichlet characters

Fix m ≥ 2. Recall that the reduced residue system Z*_m consists of all units of Z_m, or in other words, of all residue classes k (mod m) with (k, m) = 1. We have

|Z*_m| = φ(m),

and (Z*_m, ·) is a finite abelian group. Dirichlet characters are the m-periodic extensions to Z of the characters of this group:

Definition 5.17. Let χ′ ∈ \(\widehat{\mathbb{Z}_m^*}\), i.e. let χ′ be a character of the finite abelian group Z*_m. Then the corresponding Dirichlet character (mod m) is the arithmetic function defined on Z by setting
\[
\chi(n) := \begin{cases} \chi'(\bar n), & \text{if } (n, m) = 1,\\ 0, & \text{if } (n, m) > 1, \end{cases} \qquad n \in \mathbb{Z},
\]
where n̄ denotes the residue class of n (mod m).

Theorem 5.18. Let m ≥ 2. There are φ(m) distinct Dirichlet characters (mod m). Each of them is completely multiplicative:

χ(nk) = χ(n)χ(k) for all k, n ∈ Z,   (5.9)

m-periodic:

χ(n + m) = χ(n) for all n ∈ Z,   (5.10)

and satisfies the vanishing condition:

χ(n) = 0 iff (n, m) > 1. (5.11)

Conversely, if an arithmetic function χ on Z satisfies the above three conditions (5.9)–(5.11), it is a Dirichlet character (mod m).

Proof. The claims follow almost directly from the definitions, and we leave the details as an exercise.

Theorem 5.19. Let Σ_χ denote the sum over all Dirichlet characters (mod m). Assume that (b, m) = 1. Then
\[
\frac{1}{\varphi(m)}\sum_{\chi} \overline{\chi(b)}\,\chi(n) = \begin{cases} 1, & \text{if } n \equiv b \pmod m,\\ 0, & \text{if } n \not\equiv b \pmod m. \end{cases}
\]

Proof. In case (n, m) = 1 this is just a restatement of Theorem 5.16. If (n, m) > 1, the statement follows by noting that χ(n) = 0 for all Dirichlet characters (mod m).

Examples and remarks.

• Fix m ≥ 2. The Dirichlet character χ1 that satisfies χ1(n) = 1 whenever (n, m) = 1 is called the principal character (mod m).

• m = 1: Since φ(1) = 1, there is only one Dirichlet character, namely the principal character.

• m = 3: Now φ(3) = 2, so there is one character besides the principal one; let us call it χ₂. For any character χ(2) = ±1, and we get a simple table:

n ≡ (mod 3):  1   2   3
χ₁:           1   1   0
χ₂:           1  −1   0

• m = 4: Again φ(4) = 2, and there is one character besides the principal one. For any character χ(3) = ±1, and we get a simple table:

n ≡ (mod 4):  1   2   3   4
χ₁:           1   0   1   0
χ₂:           1   0  −1   0

• m = 5: This time φ(5) = 4, and there are 3 characters besides the principal one. We observe that 2⁴ = 16 ≡ 1 (mod 5); actually 2 is a primitive root (mod 5). Especially, every character satisfies (χ(2))⁴ = χ(16) = χ(1) = 1, so χ(2) ∈ {±1, ±i}, and this value determines all the other values of χ. For example χ(3) = χ(2)³, as 2³ ≡ 3 (mod 5). We obtain the table:

n ≡ (mod 5):  1    2    3    4   5
χ₁:           1    1    1    1   0
χ₂:           1    i   −i   −1   0
χ₃:           1   −1   −1    1   0
χ₄:           1   −i    i   −1   0

We may also note that each column sum vanishes when n ≢ 1 (mod 5), as it should (why?).
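The table above can also be generated mechanically from the primitive root 2, exactly as described: if n ≡ 2^k (mod 5), set χ_j(n) = e^{2πijk/4}. A small sketch (the labelling of the characters by j = 0, ..., 3 is an arbitrary choice, with j = 0 the principal character):

```python
import cmath

# Dirichlet characters mod 5 built from the primitive root 2:
# if n = 2^k (mod 5), then chi_j(n) = exp(2*pi*i*j*k/4).
m = 5
dlog = {pow(2, k, m): k for k in range(4)}  # discrete logarithm base 2 mod 5

def chi(j, n):
    if n % m == 0:
        return 0
    return cmath.exp(2j * cmath.pi * j * dlog[n % m] / 4)

# Column sums over all 4 characters: phi(5) = 4 when n = 1 (mod 5), else 0.
for n in range(1, 6):
    col = sum(chi(j, n) for j in range(4))
    expected = 4 if n % m == 1 else 0
    assert abs(col - expected) < 1e-12
```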

5.4 Nonvanishing of L_χ(1)

Definition 5.20. Let χ be a Dirichlet character (mod m), m ≥ 2. The corresponding (Dirichlet) L-function is the Dirichlet series
\[
L_\chi(s) := \sum_{n=1}^{\infty} \chi(n)n^{-s}.
\]
Before stating some basic properties of Dirichlet L-functions let us make a couple of simple observations.

Lemma 5.21. Let χ be a non-principal Dirichlet character (mod m), i.e. χ ≠ χ₁. Then
\[
\sum_{k=1}^{m} \chi(k) = 0, \tag{5.12}
\]
and
\[
A_\chi(x) := \sum_{n\le x} \chi(n) = O(1). \tag{5.13}
\]

Proof. Let χ′ be the corresponding character of Z*_m. As χ′ is not the principal character, there is a ∈ Z*_m with χ′(a) ≠ 1. We obtain
\[
\sum_{b\in \mathbb{Z}_m^*} \chi'(b) = \sum_{b\in \mathbb{Z}_m^*} \chi'(ab) = \chi'(a)\sum_{b\in \mathbb{Z}_m^*} \chi'(b).
\]
This implies that \(\sum_{b\in \mathbb{Z}_m^*} \chi'(b) = 0\), and (5.12) follows by observing that \(\sum_{k=1}^m \chi(k) = \sum_{b\in \mathbb{Z}_m^*} \chi'(b)\). Finally, (5.13) is obvious by (5.12) and the m-periodicity of χ.

Theorem 5.22. Let χ be a Dirichlet character (mod m).

(i) If χ ≠ χ₁, then σ_c(L_χ) = 0, whence L_χ extends analytically to {σ > 0}.

(ii) For χ = χ₁ we have
\[
L_{\chi_1}(s) = \zeta(s) \prod_{p\mid m} \bigl(1 - p^{-s}\bigr),
\]
so that L_{χ₁} extends analytically to {σ > 0} \ {1}, and it has a simple pole at s = 1.

(iii) In both cases \(L_\chi(s) = \prod_p \bigl(1 - \chi(p)p^{-s}\bigr)^{-1}\) for σ > 1.

Proof. Statement (i) follows from (5.13) by easy Abel summation, or by invoking directly Theorem 3.9(i). In turn, (iii) is a consequence of Theorem 3.19(ii). Finally, (ii) is a special case of (iii), where one notes that χ₁(p) = 1 unless p | m, in which case χ₁(p) = 0.

Towards the non-vanishing statement, we record three auxiliary lemmas.

Lemma 5.23. \(\prod_{\chi} L_\chi(s) \ge 1\) for s > 1.

Proof. By Theorem 5.22(iii) and the power series \(-\log(1-z) = \sum_{k=1}^\infty z^k/k\) (for |z| < 1) we obtain a branch of the logarithm of the L-function as follows:
\[
\log L_\chi(s) = -\sum_p \log\bigl(1 - \chi(p)p^{-s}\bigr) = \sum_p \sum_{k=1}^{\infty} \frac{1}{k}\chi(p^k)p^{-ks}. \tag{5.14}
\]
By choosing b = 1 in Theorem 5.19 we see that
\[
\sum_{\chi} \chi(n) = \begin{cases} \varphi(m), & \text{if } n \equiv 1 \pmod m,\\ 0, & \text{otherwise}. \end{cases}
\]
In any case, this sum is always non-negative. Thus, if we sum (5.14) over χ (mod m), we obtain for s > 1
\[
\sum_{\chi} \log L_\chi(s) = \sum_p \sum_{k=1}^{\infty} \frac{\varphi(m)\,\beta(p^k)}{k} p^{-ks} \ge 0,
\]
where β(p^k) ∈ {0, 1} for all prime powers p^k. This clearly implies the claim.

Lemma 5.24. Assume that χ is a real character, i.e. χ = χ̄; in other words, χ is real-valued. Let
\[
A(n) := (u * \chi)(n) = \sum_{d\mid n} \chi(d).
\]
Then A(n) ≥ 0 for all n ≥ 1. If n is a square, we have A(n) ≥ 1.

Proof. Assume first that n = p^k, p ∈ P. Then
\[
A(p^k) = \sum_{j=0}^{k} \chi(p^j) = \sum_{j=0}^{k} (\chi(p))^j.
\]
Since χ is a real character we must have χ(p) ∈ {1, −1, 0}. We may easily compute the sum in each of the following 5 possibilities:

1. χ(p) = 1, k even: A(p^k) = k + 1.

2. χ(p) = 1, k odd: A(p^k) = k + 1.

3. χ(p) = −1, k even: A(p^k) = 1.

4. χ(p) = −1, k odd: A(p^k) = 0.

5. χ(p) = 0: A(p^k) = 1 for k ≥ 1.

In any case A(p^k) ≥ 0 always, and A(p^k) ≥ 1 if k is even. The statements of the lemma now follow since A is multiplicative.

Lemma 5.25. For T ≥ 1 we have \(\sum_{k\le T} k^{-1/2} = 2\sqrt T + C + O(T^{-1/2})\).

Proof. Euler's summation formula yields

\[
\sum_{k\le T} k^{-1/2} = 1 + \int_1^T t^{-1/2}\,dt + \Bigl[B_1(t)t^{-1/2}\Bigr]_1^T + \frac12\int_1^T B_1(t)t^{-3/2}\,dt.
\]
Here \(\int_1^T t^{-1/2}\,dt = 2\sqrt T - 2\), and \(\bigl[B_1(t)t^{-1/2}\bigr]_1^T = -1/2 + O(T^{-1/2})\). Moreover,
\[
\frac12\int_1^T B_1(t)t^{-3/2}\,dt = \frac12\int_1^\infty B_1(t)t^{-3/2}\,dt + O\Bigl(\int_T^\infty t^{-3/2}\,dt\Bigr) = C' + O(T^{-1/2}).
\]
The claim follows by combining these observations.

Theorem 5.26. Let χ be a non-principal character (mod m). Then

L_χ(1) ≠ 0.

Moreover, if χ is a real character we have L_χ(1) > 0.

Proof. Assume first that χ is a complex character, by which we mean that χ is different from χ̄. Consider the function
\[
H(s) := \prod_{\chi} L_\chi(s), \qquad \sigma > 0.
\]

By Theorem 5.22, H extends meromorphically to the half-plane {σ > 0}, and it can have a pole only at s = 1. At that point only one of the factors, namely L_{χ₁}(s), has a pole; the other factors are analytic at that point. If we had L_χ(1) = 0 for our particular χ, we would also have \(L_{\bar\chi}(1) = \overline{L_\chi(1)} = 0\). Hence in the product defining H at least two different factors would have a zero at 1, so that these zeros would cancel the pole and make H(1) = 0. However, this would contradict Lemma 5.23, which yields that lim_{s→1⁺} H(s) ≥ 1. This settles the case of a complex character.

Assume then that χ is a real character. The proof in this case is more involved. Define the arithmetic function A as in Lemma 5.24, i.e. A := u ∗ χ, and consider the sum
\[
S(N) := \sum_{n\le N^2} A(n)n^{-1/2}.
\]
By Lemma 5.24 we have
\[
S(N) \ge \sum_{m\le N} \frac{A(m^2)}{m} \ge \sum_{m\le N} \frac{1}{m} \to \infty \quad \text{as } N \to \infty. \tag{5.15}
\]

On the other hand we shall show that

S(N) = 2N L_χ(1) + O(1).   (5.16)

Together, (5.15) and (5.16) imply that L_χ(1) > 0. It remains to prove (5.16). For that purpose let us compute
\[
S(N) = \sum_{n\le N^2} n^{-1/2}\sum_{d\mid n}\chi(d) = \sum_{dk\le N^2} \chi(d)(dk)^{-1/2}
= \sum_{d\le N} \chi(d)d^{-1/2}\sum_{k\le N^2/d} k^{-1/2} + \sum_{k<N} k^{-1/2}\sum_{N<d\le N^2/k} \chi(d)d^{-1/2}
=: S_1(N) + S_2(N).
\]

A partial summation (e.g. Theorem 3.1) and (5.13) yield that

\[
\sum_{d\le x} \chi(d)d^{-1/2} = A_\chi(x)x^{-1/2} + \frac12\int_1^x A_\chi(t)t^{-3/2}\,dt
= O(x^{-1/2}) + \frac12\int_1^\infty A_\chi(t)t^{-3/2}\,dt + O\Bigl(\int_x^\infty t^{-3/2}\,dt\Bigr)
= c_0 + O(x^{-1/2}).
\]

Hence

\[
\Bigl|\sum_{N<d\le N^2/k} \chi(d)d^{-1/2}\Bigr| = \bigl|c_0 + O\bigl((N^2/k)^{-1/2}\bigr) - c_0 - O(N^{-1/2})\bigr| = O(N^{-1/2}),
\]
so that

\[
S_2(N) = O\Bigl(N^{-1/2}\sum_{k<N} k^{-1/2}\Bigr) = O(1). \tag{5.17}
\]

In order to treat S₁(N), recall first from Theorem 5.22 that σ_c(L_χ) = 0, so that the series of L_χ converges at s = 1/2. Hence
\[
\sum_{d\le N} \chi(d)d^{-1/2} = O(1) \quad \text{as } N \to \infty. \tag{5.18}
\]
In turn, partial summation and (5.13) yield that
\[
\sum_{d>N} \chi(d)d^{-1} = O(N^{-1}) + O\Bigl(\int_N^\infty t^{-2}\,dt\Bigr) = O(N^{-1}),
\]
which yields that
\[
\sum_{d\le N} \chi(d)d^{-1} = L_\chi(1) + O(N^{-1}). \tag{5.19}
\]

We may now estimate S₁(N) as follows: by Lemma 5.25 (applied with T = N²/d),
\[
S_1(N) = \sum_{d\le N} \chi(d)d^{-1/2}\Bigl(2N d^{-1/2} + C + O\bigl(d^{1/2}N^{-1}\bigr)\Bigr),
\]
and by (5.18) and (5.19) this equals
\[
2N\bigl(L_\chi(1) + O(N^{-1})\bigr) + C\cdot O(1) + O\Bigl(N^{-1}\sum_{d\le N} 1\Bigr) = 2N L_\chi(1) + O(1).
\]
Now (5.16) is a consequence of this estimate combined with (5.17).

5.5 Proof of Dirichlet’s theorem

Theorem 5.27. Let m ≥ 2 and (m, b) = 1. Then the arithmetic sequence {mk + b : k ≥ 1} contains infinitely many primes.

Proof. Let us consider Dirichlet characters (mod m), so that again Σ_χ stands for summing over all Dirichlet characters (mod m). For any n ≥ 1 we have by Theorem 5.19
\[
\frac{1}{\varphi(m)}\sum_{\chi}\overline{\chi(b)}\,\chi(n) = \begin{cases} 1, & n \equiv b \pmod m,\\ 0, & n \not\equiv b \pmod m. \end{cases} \tag{5.20}
\]
According to (5.14) we have for s > 1

\[
\log L_\chi(s) = \sum_p \sum_{k=1}^{\infty} \frac{1}{k}\chi(p^k)p^{-ks}.
\]

As we multiply both sides by \(\overline{\chi(b)}\) and sum over χ, it follows by (5.20) that
\[
\frac{1}{\varphi(m)}\sum_{\chi}\overline{\chi(b)}\log L_\chi(s) = \sum_{p^k\equiv b\ (\mathrm{mod}\ m)} \frac{1}{k}\,p^{-ks}.
\]

Let us write this in the form
\[
\sum_{p\equiv b\ (\mathrm{mod}\ m)} p^{-s} = \frac{1}{\varphi(m)}\log L_{\chi_1}(s) + R_1(s) + R_2(s), \tag{5.21}
\]
where
\[
R_1(s) := -\sum_{\substack{p^k\equiv b\ (\mathrm{mod}\ m)\\ k\ge 2}} \frac{1}{k}\,p^{-ks}
\quad\text{and}\quad
R_2(s) := \frac{1}{\varphi(m)}\sum_{\chi\ne\chi_1} \overline{\chi(b)}\log L_\chi(s).
\]
We shall verify that
\[
\limsup_{s\to 1^+} |R_j(s)| < \infty, \qquad j = 1, 2. \tag{5.22}
\]

Assuming this, and recalling that by Theorem 5.22 L_{χ₁} has a pole at s = 1, we may let s → 1⁺ in (5.21) to obtain

\[
\sum_{p\equiv b\ (\mathrm{mod}\ m)} \frac{1}{p} = \infty,
\]
which immediately yields the theorem. In order to verify (5.22), observe first that we have for s > 1

\[
|R_1(s)| \le \sum_p \sum_{k=2}^{\infty} p^{-ks} = \sum_p \frac{p^{-2s}}{1 - p^{-s}} \le 2\sum_p p^{-2} \le 4.
\]

The boundedness of R₂ in the limit s → 1⁺ in turn follows immediately from Theorems 5.22 and 5.26.

Actually, we proved a slightly stronger result. Namely, we obtain (Exercise) immediately from the proof that

\[
\sum_{p\equiv b\ (\mathrm{mod}\ m)} \frac{1}{p^s} = \frac{1}{\varphi(m)}\log\Bigl(\frac{1}{s-1}\Bigr) + O(1) \quad \text{for } s > 1.
\]

5.6 Quadratic excess

Definition 5.28. Let p ∈ P, p ≥ 3, and denote by \(\bigl(\frac np\bigr)\) the Legendre symbol (mod p). The quadratic excess (mod p) is the quantity E_p := E_p⁺ − E_p⁻, where
\[
E_p^+ := \#\Bigl\{ n \in \{1, 2, \ldots, \tfrac{p-1}{2}\} \;\Big|\; \Bigl(\tfrac np\Bigr) = 1 \Bigr\},
\qquad
E_p^- := \#\Bigl\{ n \in \{-1, -2, \ldots, -\tfrac{p-1}{2}\} \;\Big|\; \Bigl(\tfrac np\Bigr) = 1 \Bigr\}.
\]

In other words, E_p is the difference between the numbers of quadratic residues lying in the intervals (0, p/2) and (−p/2, 0).

Lemma 5.29. (i) Ep = 0 if p ≡ 1 (mod 4).

(ii) E_p ≠ 0 if p ≡ −1 (mod 4).

Proof. (i) This follows by observing that now n and −n are simultaneously quadratic residues, since p ≡ 1 (mod 4) implies that
\[
\Bigl(\frac{-n}{p}\Bigr) = \Bigl(\frac{-1}{p}\Bigr)\Bigl(\frac{n}{p}\Bigr) = \Bigl(\frac{n}{p}\Bigr).
\]

(ii) By the NT course, the total number of quadratic residues among the numbers ±1, ±2, ..., ±(p − 1)/2 equals (p − 1)/2, and since this is odd in this case we cannot have the equality E_p⁺ = E_p⁻.

After the previous lemma it would be natural to expect that E_p changes sign rather irregularly as p runs over the primes congruent to −1 (mod 4). Let us experiment with some primes. In the following table the quadratic residues are marked with an asterisk.

p = 3:  −1, 1*
p = 7:  −3*, −2, −1, 1*, 2*, 3
p = 11: −5, −4, −3, −2*, −1, 1*, 2, 3*, 4*, 5*
p = 19: −9, −8*, −7, −6, −5, −4, −3*, −2*, −1, 1*, 2, 3, 4*, 5*, 6*, 7*, 8, 9*
p = 23: −11*, −10*, −9, −8, −7*, −6, −5*, −4, −3, −2, −1, 1*, 2*, 3*, 4*, 5, 6*, 7, 8*, 9*, 10, 11

This leads to the table

p:   3  7  11  19  23
E_p: 1  1   3   3   3.

Surprisingly enough it looks like Ep is always positive! This turns out to be true, and it was proven by Dirichlet. We shall give a proof after first stating a simple lemma.
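Before turning to the proof, the phenomenon is easy to confirm by brute force with Euler's criterion (n is a quadratic residue mod p iff n^{(p−1)/2} ≡ 1 mod p). A numerical sketch (the range of primes tested is an arbitrary choice):

```python
# Quadratic excess E_p = E_p^+ - E_p^- computed via Euler's criterion.
def is_qr(n, p):
    return pow(n % p, (p - 1) // 2, p) == 1

def excess(p):
    plus = sum(1 for n in range(1, (p - 1) // 2 + 1) if is_qr(n, p))
    minus = sum(1 for n in range(1, (p - 1) // 2 + 1) if is_qr(-n, p))
    return plus - minus

# Reproduce the table above, then test positivity for p = 3 (mod 4).
assert [excess(p) for p in (3, 7, 11, 19, 23)] == [1, 1, 3, 3, 3]
primes = [p for p in range(3, 500)
          if all(p % d for d in range(2, int(p ** 0.5) + 1))]
assert all(excess(p) > 0 for p in primes if p % 4 == 3)
assert all(excess(p) == 0 for p in primes if p % 4 == 1)  # Lemma 5.29(i)
```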

Lemma 5.30. Let p be a prime, p ≡ −1 (mod 4). Define the arithmetic function χ_L by setting
\[
\chi_L(n) := \begin{cases} \bigl(\frac np\bigr) & \text{if } n \text{ is odd},\\ 0 & \text{if } n \text{ is even}. \end{cases}
\]

Then χL is a non-principal real Dirichlet character (mod 2p).

Proof. Since the map n ↦ \(\bigl(\frac np\bigr)\) is p-periodic, we see that χ_L is 2p-periodic by definition. Moreover, by the complete multiplicativity of n ↦ \(\bigl(\frac np\bigr)\), one checks again by definition that χ_L is completely multiplicative (note that n₁n₂ is even exactly when either n₁ or n₂ is even). Finally, χ_L is not identically zero, and it clearly satisfies χ_L(n) = 0 iff (n, 2p) > 1. Thus Theorem 5.18 yields that χ_L is a real Dirichlet character. It is non-principal, as one sees e.g. by noting that χ_L(−1) = −1.

Let us also recall the definition of the periodised signum function (from p. 14), which is 1-periodic on R and satisfies
\[
\mathrm{sgn}_T(x) = \begin{cases} -1 & \text{if } x \in (-1/2, 0),\\ 1 & \text{if } x \in (0, 1/2),\\ 0 & \text{if } x \in \{-1/2, 0, 1/2\}. \end{cases}
\]
By Example 2.16 its Fourier series is given by
\[
\mathrm{sgn}_T(t) = \frac{4}{\pi}\sum_{m=1}^{\infty} \frac{1}{2m-1}\sin\bigl((2m-1)2\pi t\bigr), \tag{5.23}
\]
and the stated equality holds (in particular the series converges) for all t ∈ R.

Theorem 5.31. E_p > 0 for all primes p congruent to −1 (mod 4).

Proof. In the proof of Lemma 2.23 we already recalled that when j takes the values 1, 2, ..., p − 1, then j² runs twice over the quadratic residues (mod p). We deduce (why?) that
\[
E_p = \frac12\sum_{j=1}^{p-1} \mathrm{sgn}_T(j^2/p), \tag{5.24}
\]
where we noted that sgn_T is 1-periodic and here clearly j²/p ≠ k, k + 1/2 with k ∈ Z. On the other hand, as χ_L is a real non-principal character, Theorem 5.26 yields for the corresponding L-function
\[
L_{\chi_L}(1) = \sum_{n=1}^{\infty} \frac{\chi_L(n)}{n} = \sum_{m=1}^{\infty} \frac{1}{2m-1}\Bigl(\frac{2m-1}{p}\Bigr) > 0. \tag{5.25}
\]

Recall also the definition of the Gauss sum \(G(a, N) = \sum_{k=1}^{N} e(ak^2/N)\) from (2.7). We may then compute
\[
E_p \overset{(5.24)\,\&\,(5.23)}{=} \frac{2}{\pi}\sum_{j=1}^{p-1}\sum_{m=1}^{\infty} \frac{1}{2m-1}\sin\bigl((2m-1)2\pi j^2/p\bigr)
= \frac{2}{\pi}\sum_{m=1}^{\infty} \frac{1}{2m-1}\Bigl(\sum_{j=1}^{p-1}\sin\bigl((2m-1)2\pi j^2/p\bigr)\Bigr)
\]
\[
= \frac{2}{\pi}\sum_{m=1}^{\infty} \frac{1}{2m-1}\,\mathrm{Im}\Bigl(\sum_{j=1}^{p-1} e\bigl((2m-1)j^2/p\bigr)\Bigr)
= \frac{2}{\pi}\sum_{m=1}^{\infty} \frac{1}{2m-1}\,\mathrm{Im}\,G(2m-1, p)
\]
(the term j = p contributes 1 to the Gauss sum, which is real)
\[
\overset{\text{Lemma 2.23}}{=} \frac{2}{\pi}\sum_{m=1}^{\infty} \frac{1}{2m-1}\Bigl(\frac{2m-1}{p}\Bigr)\mathrm{Im}\,G(1, p)
= \frac{2}{\pi}\,L_{\chi_L}(1)\,\mathrm{Im}\,G(1, p).
\]
By Theorem 2.21 we have Im G(1, p) = √p since p ≡ −1 (mod 4). The claim then follows from (5.25).

Remark 5.32. The above proof yields the formula
\[
E_p = \frac{2\sqrt p}{\pi}\,L_{\chi_L}(1) > 0 \quad \text{if } p \equiv -1 \pmod 4.
\]
Remarkably, no 'elementary' proof (i.e. an argument that does not use analytic tools) is known for the positivity of E_p.
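The key value Im G(1, p) = √p used above is easy to confirm numerically. A small sketch (the sampled primes are an arbitrary choice):

```python
import cmath

# Gauss sum G(a, p) = sum_{k=1..p} exp(2*pi*i*a*k^2/p). For a prime
# p = 3 (mod 4) one has G(1, p) = i*sqrt(p), so Im G(1, p) = sqrt(p).
def gauss_sum(a, p):
    return sum(cmath.exp(2j * cmath.pi * a * k * k / p)
               for k in range(1, p + 1))

for p in (3, 7, 11, 19, 23):
    g = gauss_sum(1, p)
    assert abs(g.imag - p ** 0.5) < 1e-9  # imaginary part is sqrt(p)
    assert abs(g.real) < 1e-9             # real part vanishes
```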

Chapter 6

Basic properties of ζ-function

6.1 The Hadamard product formula

In this section we shall show that entire functions that do not grow too fast can be expressed (essentially) as a product of factors of the type (z − w) or (1 − z/w), where the w's are the zeroes of the function. This fact can be thought of as a far-reaching generalisation of the fundamental theorem of algebra.

Definition 6.1. Let f be an entire function. We say that f is of finite order if there is α ∈ (0, ∞) so that
\[
|f(z)| = O\bigl(\exp(|z|^\alpha)\bigr) \quad \text{as } |z| \to \infty. \tag{6.1}
\]
The order of f is the infimum of all α > 0 such that (6.1) is valid (for some C = C(α)).

For example, the exponential function z ↦ e^z is of order 1. So is z ↦ z³e^z − cos(3z). Every polynomial is of order 0. The function z ↦ exp(−5z²) is of order 2. Hadamard's theorem can be stated for all functions of finite order, but for our purposes it is enough to consider functions whose order is at most one, i.e. entire functions with the property that for any ε > 0 one has

\[
|f(z)| \le C_\varepsilon \exp\bigl(|z|^{1+\varepsilon}\bigr).
\]

We now proceed with a couple of auxiliary results and recall first a basic fact on the convergence of products of analytic functions (for the proof see 'Complex Analysis II').

Lemma 6.2. If the sum \(\sum_{n=1}^{\infty} |f_n(z)|\) converges locally uniformly in the domain Ω ⊂ C, and each f_n is analytic, then the product
\[
F(z) = \prod_{n=1}^{\infty} (1 + f_n(z))
\]
converges in Ω to an analytic function. Moreover, if f_n(z) ≠ −1 for all n ≥ 1, then F(z) ≠ 0.

The first of the following results shows that a one-sided majorant for Re f(z) over a circle leads to a majorant of the absolute value |f(z)| over a smaller circle.

Lemma 6.3. (Borel-Caratheodory) Assume that f is analytic in B(0, R) with f(0) = 0 and that it satisfies Re f(z) ≤ M on {|z| = R}. Let 0 < r < R. Then f satisfies
\[
|f(z)| \le \frac{2r}{R - r}\,M \quad \text{on } \{|z| \le r\}.
\]
Proof. It is easy (Exercise) to see that from the assumption it follows that Re f(z) ≤ M in the whole disc {|z| ≤ R}. By considering the function h(z) = f(Rz)/M we may assume M = R = 1. The Möbius map u ↦ u/(2 − u) maps the half-plane {Re u ≤ 1} into the closed disc B(0, 1). Hence the analytic function
\[
g(z) := \frac{f(z)}{z(2 - f(z))}
\]
satisfies |g(z)| ≤ 1 on {|z| = 1}, and by the maximum modulus principle this holds in the whole closed disc B(0, 1). On the other hand, we may solve
\[
f(z) = \frac{2zg(z)}{1 + zg(z)},
\]
and for the values |z| ≤ r we thus have |zg(z)| ≤ r and, a fortiori,
\[
|f(z)| \le \frac{2r}{1 - r}.
\]

It is not difficult to see that if an entire function f satisfies |f(z)| ≤ C(1 + |z|^k), then f is a polynomial (Exercise). The next lemma shows that for this conclusion it is enough to assume only that the real part of f has a (one-sided) majorant that grows at most polynomially.

Lemma 6.4. (Borel-Caratheodory) Assume that f is an entire function such that

Re f(z) ≤ C(1 + |z|)^a,   (6.2)

where a > 0. Then f is a polynomial with deg(f) ≤ a. The same conclusion holds true under the weaker assumption that condition (6.2) holds only for z lying on circles {|z| = r_k}, where r_k → ∞ as k → ∞.

Proof. We may clearly replace f by f − f(0) if needed, and hence assume that f(0) = 0. Then the previous lemma yields that the modulus |f| satisfies a bound similar to (6.2). Choose k ∈ N with k ≤ a < k + 1. Write the Taylor series \(f(z) = \sum_{j=0}^{\infty} a_j z^j\) and consider the entire function h, where

\[
h(z) := \frac{f(z) - \sum_{j=0}^{k} a_j z^j}{z^{k+1}}.
\]

By the assumption, it follows that \(\max_{|z|=r_k} |h(z)| \to 0\) as k → ∞, whence the maximum modulus principle yields that h is identically zero. Thus f is a polynomial of degree at most k.

Lemma 6.5. Assume that f is analytic and non-zero in B(0, R), where 0 < R ≤ ∞. Then one may define an analytic branch of log f(z) in B(0, R). Moreover, for every r ∈ (0, R) the following mean value formula holds true:

2π 1 Z log |f(0)| = log(|f(reit)|) dt. (6.3) 2π 0 Proof. The definability of log(f) follows from more general results we will prove later on. However, a simple proof is obtained by observing that since f 0/f is analytic in B(0,R), it has (by FT1) an integral function H with H(0) = 0 in the disc B(0,R). We may compute

(f exp(−H))0 = (f 0 − f(f 0/f)) exp(−H) = 0 so that f = f(0) exp(H). Hence H +log(f(0)) (with any choice of branch for log(f(0))) yields the desired branch of log(f). Since the above defined branch of log(f) is analytic in B(0, r) it satisfies the Gauss mean value principle (FT1/theorem 9.27), i.e.

2π 1 Z log(f(0)) = log(f(reit)) dt, 2π 0 and (6.3) follows by taking real parts on both sides.
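Formula (6.3) is easy to test numerically. In the following sketch (an illustration, not part of the notes) the function $f(z) = (z-2)(z-3)e^z$ is a sample function without zeroes in $B(0,2)$, and the integral is approximated by a Riemann sum over the circle of radius $r = 1.5$.

```python
import cmath
import math

# Check of the mean value formula (6.3) for a function with no zeroes in B(0, 2):
# log|f(0)| should equal the average of log|f(r e^{it})| over [0, 2*pi).
f = lambda z: (z - 2) * (z - 3) * cmath.exp(z)

r, N = 1.5, 20000
avg = sum(math.log(abs(f(r * cmath.exp(2j * math.pi * k / N))))
          for k in range(N)) / N
assert abs(avg - math.log(abs(f(0)))) < 1e-8
```

Here $\log|f(0)| = \log 6$, and the equispaced Riemann sum converges very fast because the integrand is smooth and periodic.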

Lemma 6.6. The map

\[ \varphi(z) = \lambda\, \frac{a - z}{1 - \bar a z}, \quad \text{where } |a| < 1 \text{ and } |\lambda| = 1, \]

is analytic in $\mathbb{D}$ and satisfies $|\varphi(z)| = 1$ for $|z| = 1$. Proof. Exercise.

By this observation it follows for $r > 0$ that

\[ \left| \frac{a/r - z/r}{1 - \bar a z / r^2} \right| = 1 \quad \text{for } |z| = r,\ |a| < r, \]

or equivalently

\[ \left| \frac{r(a - z)}{r^2 - \bar a z} \right| = 1 \quad \text{for } |z| = r,\ |a| < r. \tag{6.4} \]

Assume that the entire function $f$ satisfies $f(0) \ne 0$, has zeroes $w_1, w_2, \ldots, w_n$ in the disc $\{|z| < r\}$ and, in addition, $f(z) \ne 0$ for $|z| = r$. Denote

\[ g(z) := f(z) \Big/ \prod_{j=1}^n \frac{r(w_j - z)}{r^2 - \bar w_j z} = f(z) \prod_{j=1}^n \frac{r^2 - \bar w_j z}{r(w_j - z)}. \]

Then $g$ is analytic and $g \ne 0$ in $\{|z| \le r\}$. We may apply Lemma 6.5 and write

\[ \log|g(0)| = \frac{1}{2\pi} \int_0^{2\pi} \log|g(re^{it})|\, dt. \]

By using (6.4) (which gives $|g(re^{it})| = |f(re^{it})|$) and the definition of $g$, computing $g(0)$ in terms of $f$ we obtain the first part of

Theorem 6.7. (Jensen's formula) Let $f$ be an entire function with $f(0) \ne 0$ and with zeroes $w_1, w_2, \ldots, w_n$ in $B(0,r)$ (counted with multiplicities). Then

\[ \sum_{j=1}^n \log\frac{r}{|w_j|} = \frac{1}{2\pi} \int_0^{2\pi} \log|f(re^{it})|\, dt - \log|f(0)| \tag{6.5} \]
\[ = \int_0^r n_f(u)\, \frac{du}{u}, \tag{6.6} \]

where we define $n_f(u) := \#\{\text{zeroes of } f \text{ in } B(0,u)\}$ (counted with multiplicities).

Proof. We already proved (6.5) in the case $f(z) \ne 0$ on the circle $\{|z| = r\}$. The case where $f$ possibly has zeroes on $\{|z| = r\}$ follows by applying the already proven case with $r'$ in place of $r$, where $r' < r$ is very close to $r$, and letting $r' \uparrow r$ (the details are an Exercise). Finally, (6.6) follows from the first statement by observing that each zero $w_j$ in $\{|z| < r\}$ makes the contribution

\[ \int_{|w_j|}^r \frac{du}{u} = \log\frac{r}{|w_j|} \]

to the integral in (6.6).

Lemma 6.8. Let $f$ be an entire function that has only finitely many zeroes and is of order at most 1. Then $f(z) = p(z)\exp(az + b)$, where $a, b \in \mathbb{C}$ and $p$ is a polynomial.

Proof. Take a polynomial $p$ that has exactly the same zeroes (with multiplicities counted) as $f$. According to Lemma 6.5 we may define an analytic branch $g = \log(f/p)$ in the whole complex plane. Take any $\beta > 1$. We infer that $\mathrm{Re}\, g$ satisfies $\mathrm{Re}\, g(z) \le C'(1 + |z|^{\beta})$ for all $z$. Then Lemma 6.4 yields that $g$ is a polynomial with $\deg(g) \le 1$, and the statement follows.

Lemma 6.9. Write $E_1(u) := (1-u)e^u$ for $u \in \mathbb{C}$. Then

\[ |E_1(u)| \ge \begin{cases} e^{-|u|^2} & \text{for } |u| \le 1/2, \\ e^{-|u|} & \text{for } |u| \ge 2. \end{cases} \]

Moreover, $|E_1(u) - 1| \le 6|u|^2$ for $|u| \le 1/2$.

68 Proof. Exercise. We are now ready for

Theorem 6.10. (Hadamard's product formula) Let $f$ be an entire function of order at most 1. Assume that $f$ has an $m$:th order zero at the origin and infinitely many other zeroes (with multiplicities counted) $w_1, w_2, \ldots$. Then

(i) $\displaystyle\sum_{j=1}^{\infty} |w_j|^{-(1+\varepsilon)} < \infty$ for all $\varepsilon > 0$.

(ii) $\displaystyle f(z) = z^m e^{Az + B} \prod_{j=1}^{\infty} \Big(1 - \frac{z}{w_j}\Big) e^{z/w_j}$, where $A, B \in \mathbb{C}$.

(iii) If $\displaystyle\sum_{j=1}^{\infty} |w_j|^{-1} < \infty$, then $f(z) = O(e^{C|z|})$ for some $C$ and one may write

\[ f(z) = z^m e^{A'z + B'} \prod_{j=1}^{\infty} \Big(1 - \frac{z}{w_j}\Big). \]

Proof. We may clearly assume that $f(0) \ne 0$, i.e. $m = 0$, since otherwise one may consider $f(z)/z^m$.

(i) Let $R \ge 1$ and choose $r = 2R$ in Jensen's formula (Theorem 6.7) to obtain

\[ n_f(R) \le 2 \int_R^{2R} n_f(u)\, \frac{du}{u} = O(R^{1+\varepsilon}). \]

Since $n_f(R) R^{-(1+2\varepsilon)} \to 0$ as $R \to \infty$, partial summation yields

\[ \sum_{j=1}^{\infty} |w_j|^{-(1+2\varepsilon)} = (1 + 2\varepsilon) \int_{|w_1|}^{\infty} n_f(u)\, u^{-2-2\varepsilon}\, du \le C \int_{|w_1|}^{\infty} u^{-1-\varepsilon}\, du < \infty. \]

As $\varepsilon > 0$ was arbitrary, (i) follows.

(ii) We first claim that the function

\[ h(z) := \prod_{j=1}^{\infty} \Big(1 - \frac{z}{w_j}\Big) e^{z/w_j} = \prod_{j=1}^{\infty} E_1(z/w_j) \tag{6.7} \]

is a well-defined entire function. Fix any $R \ge 1$ and then choose $j_0 \ge 1$ so large that $|w_j| \ge 2R$ for $j \ge j_0$. Then for $|z| \le R$ we may estimate by Lemma 6.9

\[ |E_1(z/w_j) - 1| \le 6|z/w_j|^2 \le 6R^2 |w_j|^{-2} \quad \text{for } j \ge j_0. \]

Since $\sum_{j=j_0+1}^{\infty} |w_j|^{-2}$ converges by part (i) of the Theorem, we see that the series $\sum_{j=j_0+1}^{\infty} |E_1(z/w_j) - 1|$ has a uniform majorant series that converges, and then Lemma 6.2 yields that the full product converges to an analytic function on $B(0,R)$. Since $R \ge 1$ was arbitrary, the claim follows. We may thus define the function

\[ H(z) := \frac{f(z)}{h(z)}. \tag{6.8} \]

A priori $H$ is only meromorphic, but since both numerator and denominator have exactly the same zeros (with the same multiplicities), it is an entire function. Our second claim is that

\[ \text{the order of } H \text{ is less than or equal to } 1. \tag{6.9} \]

If this is known, then since $H$ has no zeroes, Lemma 6.8 yields that $H$ is of the form $H = \exp(g)$, where $g$ is a polynomial of degree at most 1, and part (ii) of the Theorem follows. In order to prove (6.9), fix $R \ge 1$ and $\beta \in (1,2)$. Since the order of $f$ is at most 1, there is a constant $C > 0$ so that

\[ |f(z)| \le C \exp(|z|^{\beta}) \quad \text{for all } z \in \mathbb{C}. \tag{6.10} \]

Let us write $h$ as a product of two functions

\[ h(z) = h_1(z) h_2(z) := \prod_{|w_j| \le 2R} E_1(z/w_j) \cdot \prod_{|w_j| > 2R} E_1(z/w_j). \]

Observe that for $|z| = R$ we have $|z/w_j| \le 1/2$ in the definition of $h_2$, so we may use Lemma 6.9 to estimate for $|z| = R$

\[ |h_2(z)| \ge \prod_{|w_j| > 2R} |E_1(z/w_j)| \ge \exp\Big( -\sum_{|w_j| > 2R} (|z|/|w_j|)^2 \Big). \tag{6.11} \]

In the last written formula one always has $|z|/|w_j| \le 1/2$, whence

\[ \sum_{|w_j| > 2R} (|z|/|w_j|)^2 \le \sum_{|w_j| > 2R} (|z|/|w_j|)^{\beta} \le C' R^{\beta}, \]

since by part (i) the sum $\sum_{|w_j| > 2R} |w_j|^{-\beta}$ converges. In particular, we obtain that

\[ \inf_{|z| = R} |h_2(z)| \ge \exp(-C' R^{\beta}). \tag{6.12} \]

We next estimate $h_1(z)$ from below on the circle $|z| = 4R$ (this extremely clever idea is due to Landau!). On this circle one has $|z/w_j| \ge 2$ for all the terms involved in the definition of $h_1(z)$, whence we may apply Lemma 6.9 to obtain

\[ |h_1(z)| \ge \prod_{|w_j| \le 2R} |E_1(z/w_j)| \ge \exp\Big( -\sum_{|w_j| \le 2R} |z|/|w_j| \Big). \tag{6.13} \]

This time one uses $|z|/|w_j| \ge 2$ to majorize the last written sum as follows:

\[ \sum_{|w_j| \le 2R} |z|/|w_j| \le \sum_{|w_j| \le 2R} (|z|/|w_j|)^{\beta} \le c' (4R)^{\beta} = C'' R^{\beta}, \]

again by the convergence of the series $c' = \sum_j |w_j|^{-\beta}$, and we deduce

\[ \inf_{|z| = 4R} |h_1(z)| \ge \exp(-C'' R^{\beta}). \tag{6.14} \]

By combining (6.14) with the estimate (6.10) it follows that

\[ \sup_{|z| = 4R} |f(z)/h_1(z)| \le C \exp((4R)^{\beta}) \exp(C'' R^{\beta}) \le C''' \exp(C''' R^{\beta}). \]

At this stage the maximum principle (observe that $f/h_1$ is analytic) yields that

\[ \sup_{|z| = R} |f(z)/h_1(z)| \le C''' \exp(C''' R^{\beta}). \tag{6.15} \]

Finally, joining this knowledge with (6.12) it follows for $R \ge 1$ that

\[ \sup_{|z| = R} |H(z)| \le \sup_{|z| = R} |f(z)/h_1(z)| \cdot \sup_{|z| = R} |1/h_2(z)| \le C''' \exp((C' + C''') R^{\beta}). \]

This verifies that $H$ is of order at most 1, since we may choose $\beta$ as close to 1 as we want.

(iii) Assume that $\sum_{j=1}^{\infty} |w_j|^{-1} < \infty$. Then the product $\prod_{j=1}^{\infty} e^{z/w_j}$ converges and has the form $e^{\tilde A z}$, where $\tilde A := \sum_{j=1}^{\infty} w_j^{-1}$. This observation yields the stated representation. In order to estimate the growth of $f$ we note that $|(1-u)e^u| \le (1 + |u|)e^{|u|} \le e^{2|u|}$. When this estimate is applied to the product in (ii), the exponential bound $f(z) = O(\exp(C|z|))$ follows.

Example 6.11. The function $\frac{\sin(\pi z)}{\pi z}$ is of order 1 and it has zeroes $\pm 1, \pm 2, \ldots$. By Hadamard's theorem we obtain (Exercise) the famous Euler product formula

\[ \frac{\sin(\pi z)}{\pi z} = \Big(1 - \frac{z^2}{1^2}\Big)\Big(1 - \frac{z^2}{2^2}\Big)\Big(1 - \frac{z^2}{3^2}\Big) \cdots, \quad \text{for all } z \in \mathbb{C}. \]

6.2 Riemann's 1860 memoir

In this paper Riemann made a deep (although mainly conjectural) study of the properties of the $\zeta$-function in the complex domain and their connection to the distribution of primes. In somewhat more detail,

Riemann proved:

• meromorphic continuation and the functional equation of ζ

71 Riemann sketched or conjectured:

• $\zeta$ has infinitely many zeroes in the critical strip $\{0 < \sigma < 1\}$, and their number $N(T)$ in the rectangle $(0,1) \times (0,T)$ satisfies

\[ N(T) = \frac{T}{2\pi} \log\Big(\frac{T}{2\pi}\Big) - \frac{T}{2\pi} + O(\log T). \]

• the function $\xi$ has the product representation

\[ \xi(s) = e^{as + b} \prod_{\rho} (1 - s/\rho) e^{s/\rho}, \]

where the $\rho$ are the nontrivial zeroes of $\zeta$, i.e. those that are located in the critical strip. Recall here from Section 2.5 that the function $\xi$ is defined as

\[ \xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s), \]

and it is an entire function.

• There is a (somewhat heuristic and complicated) explicit formula for the 'error term' in the prime number theorem

\[ R(x) := \pi(x) - \mathrm{Li}(x), \]

where $\mathrm{Li}(x) := \int_2^x \frac{dt}{\log t}$ is the 'logarithmic integral' function. Especially, $\mathrm{Li}(x)$ should be a better approximation to $\pi(x)$ than $x/\log(x)$.

• Most, and likely all, of the nontrivial zeroes of $\zeta$ lie on the critical line $\{\sigma = 1/2\}$. In the form 'all of' this is the famous Riemann hypothesis.

Nearly all of Riemann's statements have since been verified, except the last one. In what follows we shall prove many of them, and we shall start with the product formula.

6.3 Product formula for ζ

Recall that the function ξ is entire and symmetric: ξ(s) = ξ(1 − s).

Proposition 6.12. The zeros of the Riemann zeta function consist of the trivial zeroes $\{-2n : n \in \mathbb{Z}_+\}$ and of the nontrivial zeroes, which all lie in the critical strip $\{0 < \sigma < 1\}$. In turn, the nontrivial zeroes are exactly the zeroes of the function $\xi$.

Proof. Let us first note that by the Euler product representation, $\zeta$ has no zeroes in $\{\sigma > 1\}$. Moreover, by Lemma 4.3, $\zeta$ has no zeroes on the line $\{\sigma = 1\}$. Thus $\zeta$ has no zeroes in the closed half-plane $\{\sigma \ge 1\}$. Assume then that $\zeta(s_0) = 0$, where $\sigma_0 \le 0$. Since the functional equation may be written as

\[ \zeta(s) = \pi^{s-1} 2^s \sin\Big(\frac{1}{2}\pi s\Big) \Gamma(1-s) \zeta(1-s), \]

and $\mathrm{Re}(1 - s_0) \ge 1$, we see that necessarily the factor $\sin(\frac{1}{2}\pi s)\Gamma(1-s)$ vanishes at $s_0$. This meromorphic function has zeroes $s = 0, -2, -4, \ldots$ in the closed half-plane $\{\sigma \le 0\}$. In turn, $\zeta(1-s)$ is non-zero at $s = -2, -4, \ldots$, but has a pole at $s = 0$ which cancels the zero of the other factor at this point. We deduce that $s_0 \in \{-2, -4, \ldots\}$, and in turn all these points are zeroes of $\zeta$. Especially, all other zeroes lie in the (open) critical strip $\{0 < \sigma < 1\}$. As for the zeroes of the function

\[ \xi(s) := \frac{1}{2} s(s-1) \pi^{-s/2} \Gamma(s/2) \zeta(s), \]

simply note that the meromorphic factor $s(s-1)\pi^{-s/2}\Gamma(s/2)$ has a zero at 1 and poles at $-2, -4, \ldots$, and hence exactly cancels from $\zeta$ its pole and its trivial zeroes.

We next want to understand the growth of $\zeta$ (or $\xi$) in order to obtain a product formula for it. Let us start with an easy estimate verifying that the growth of $\zeta$ is at most linear in the half-plane $\sigma \ge 1/2$; the rest can then be deduced from the functional equation.

Lemma 6.13. (i) For $\sigma > 0$,

\[ \zeta(s) = \frac{s}{s-1} - s \int_1^{\infty} (x - \lfloor x \rfloor)\, x^{-s-1}\, dx. \]

(ii) $|\zeta(s)| < C|s|$ if $\sigma \ge 1/2$ and $|s|$ is large.

Proof. (i) Assume first that $\sigma > 2$. Then

\[ \zeta(s) = \sum_{n=1}^{\infty} n^{-s} = \sum_{n=1}^{\infty} n\big(n^{-s} - (n+1)^{-s}\big) = s \sum_{n=1}^{\infty} n \int_n^{n+1} x^{-s-1}\, dx = s \int_1^{\infty} \lfloor x \rfloor\, x^{-s-1}\, dx. \]

The stated formula follows by adding and subtracting $\frac{s}{s-1} = s \int_1^{\infty} x \cdot x^{-s-1}\, dx$. Moreover, since the integral in the formula of (i) clearly defines an analytic function on $\{\sigma > 0\}$, meromorphic continuation shows the validity of (i) in the half-plane $\{\sigma > 0\}$.

(ii) If $\sigma \ge 1/2$, we have $\big| \int_1^{\infty} (x - \lfloor x \rfloor)\, x^{-s-1}\, dx \big| \le \int_1^{\infty} x^{-3/2}\, dx < \infty$, and the claim follows immediately from (i).

Remark 6.14. One may note that the proof of (i) actually yields an easy meromorphic continuation of $\zeta$ to $\{\sigma > 0\}$.
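The representation of Lemma 6.13(i) is easy to test numerically at $s = 2$, where $\zeta(2) = \pi^2/6$. In the sketch below (an illustration, not part of the notes) the integral over each interval $[n, n+1]$ is evaluated in closed form and the tail beyond $10^6$ is negligible.

```python
import math

# zeta(2) = 2/(2-1) - 2 * int_1^inf (x - floor(x)) x^{-3} dx, per Lemma 6.13(i).
total = 0.0
for n in range(1, 10**6):
    # closed form of int_n^{n+1} (x - n) x^{-3} dx
    total += (1.0 / n - 1.0 / (n + 1)) - (n / 2.0) * (1.0 / n**2 - 1.0 / (n + 1)**2)
approx = 2.0 - 2.0 * total
assert abs(approx - math.pi**2 / 6) < 1e-4
```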

Lemma 6.15. The entire function $\xi$ is of order one. More precisely, there is the bound $|\xi(s)| \le \exp(C|s|\log|s|)$ for large $|s|$, and, on the other hand, $\xi(s) \ne O(\exp(C|s|))$ for any $C > 0$.

Proof. Since $\xi(s) = \xi(1-s)$, it is enough to consider $s$ with $\sigma \ge 1/2$. By part (ii) of the previous lemma, $\zeta(s)$ grows at most linearly in this range. Thus we just need to verify the stated growth for the factor $s(s-1)\pi^{-s/2}\Gamma(s/2)$. Clearly $s(s-1)\pi^{-s/2}$ grows at most exponentially in $|s|$ as $|s| \to \infty$, and for the Gamma function factor the growth is at most like $\exp(C|s|\log|s|)$ directly from Theorem 2.31 (Stirling's formula). This gives the stated upper bound and shows that the order of $\xi$ is at most 1. In order to obtain the lower bound we note that for real values $s \gg 1$ we have $|\zeta(s)| \ge 1$ and hence, again by the definition and Stirling's formula,

\[ \log|\xi(s)| \ge \frac{s}{2}\log\Big(\frac{s}{2}\Big) - \frac{s}{2}\log \pi - \frac{s}{2} - C \ge \frac{s}{2}\log s - Cs, \]

which shows that $\xi$ grows faster than exponentially. Hence the order is exactly 1.

Theorem 6.16. Denote by $\rho_1, \rho_2, \ldots$ the non-trivial zeroes of $\zeta$ (equivalently, the zeroes of $\xi$). There are infinitely many non-trivial zeroes and

\[ \sum_{j=1}^{\infty} |\rho_j|^{-1-\varepsilon} < \infty \quad \text{for all } \varepsilon > 0, \tag{6.16} \]

but

\[ \sum_{j=1}^{\infty} |\rho_j|^{-1} = \infty. \tag{6.17} \]

There are constants $A, B \in \mathbb{C}$ so that $\xi$ has the representation as the product

\[ \xi(s) = e^{A + Bs} \prod_{\rho} (1 - s/\rho)\, e^{s/\rho}. \tag{6.18} \]

Proof. By Lemma 6.15, $\xi$ is of order 1, which yields the stated product representation by part (ii) of Hadamard's product formula, Theorem 6.10. Also, since Lemma 6.15 verifies that $\xi$ grows faster than exponentially, parts (i) and (iii) of Theorem 6.10 yield (6.16) and (6.17), respectively. In particular, there are infinitely many nontrivial zeroes.

Remark 6.17. From now on the expression $\sum_{\rho}$ (and similarly for products etc.) stands for a sum over the nontrivial zeroes of $\zeta$!

Lemma 6.18. In (6.18) we have $A = -\log 2$ and $B = -\gamma/2 - 1 + \frac{1}{2}\log(4\pi)$. Proof. Exercise.

Corollary 6.19. Whenever $s \ne 1$ and $\zeta(s) \ne 0$ it holds that

\[ \frac{\zeta'(s)}{\zeta(s)} = \frac{-1}{s-1} + B + \frac{1}{2}\log \pi - \frac{1}{2}\, \frac{\Gamma'(\frac{1}{2}s + 1)}{\Gamma(\frac{1}{2}s + 1)} + \sum_{\rho} \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big). \]

Proof. If the product $f(z) = \prod_{n=1}^{\infty} f_n(z)$ of analytic functions converges locally uniformly (recall these basic facts from 'Complex Analysis II'), then by Weierstrass we also have $f'(z) = \lim_{N \to \infty} \big( \prod_{n=1}^N f_n(z) \big)'$, and we obtain, whenever $f(z) \ne 0$,

\[ \frac{f'(z)}{f(z)} = \sum_{n=1}^{\infty} \frac{f_n'(z)}{f_n(z)}. \]

When this is applied to (6.18) it follows (the convergence in the Hadamard theorem is locally uniform) that outside the non-trivial zeroes we have

\[ \frac{\xi'(s)}{\xi(s)} = B + \sum_{\rho} \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big), \]

and the rest follows by invoking the definition of $\xi$ and differentiating logarithmically.

6.4 A zero free region for ζ

As we shall see in the next chapter, we may use the zeta function to obtain quantitative error bounds in the PNT. Explicit error bounds are based on establishing a quantitative zero-free zone inside the critical strip, immediately to the left of the line $\sigma = 1$. The 'classical' zero-free region is given by the following result.

Theorem 6.20. There is $c > 0$ so that $\zeta(s) \ne 0$ in the region

\[ \sigma \ge 1 - \frac{c}{\log|t|}, \quad |t| \ge 2. \]

Proof. Recall that for $\sigma > 1$ we have $1/\zeta(s) = \sum_{n=1}^{\infty} \mu(n) n^{-s}$. On the other hand, $\zeta'(s) = -\sum_{n=1}^{\infty} \log(n)\, n^{-s}$, whence we have $\zeta(s) \sum_{n=1}^{\infty} \Lambda(n) n^{-s} = -\zeta'(s)$ in view of the formula $\log = \Lambda * u$ (equivalently $\Lambda = \mu * \log$; see (4.2) and recall that $\Lambda$ was defined on p. 42). In particular, we have the fundamental relation

\[ \frac{\zeta'(s)}{\zeta(s)} = -\sum_{n=1}^{\infty} \Lambda(n) n^{-s}, \quad \sigma > 1. \tag{6.19} \]

Taking real parts yields

\[ -\mathrm{Re}\, \frac{\zeta'(s)}{\zeta(s)} = \sum_{n=1}^{\infty} \Lambda(n) n^{-\sigma} \cos(t \log n). \]

By again observing that $3 + 4\cos(x) + \cos(2x) \ge 0$, we deduce as in Lemma 4.3 that for any $\sigma > 1$ and arbitrary $t$

\[ 0 \le 3\Big( -\frac{\zeta'(\sigma)}{\zeta(\sigma)} \Big) + 4\Big( -\mathrm{Re}\, \frac{\zeta'(\sigma + it)}{\zeta(\sigma + it)} \Big) + \Big( -\mathrm{Re}\, \frac{\zeta'(\sigma + 2it)}{\zeta(\sigma + 2it)} \Big). \tag{6.20} \]
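The positivity of $3 + 4\cos(x) + \cos(2x)$, used here and earlier in Lemma 4.3, is a one-line computation, recorded for completeness:

```latex
3 + 4\cos x + \cos 2x
  = 3 + 4\cos x + (2\cos^2 x - 1)
  = 2\,(1 + \cos x)^2 \;\ge\; 0 .
```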

Since $\zeta'/\zeta$ has a simple pole at $s = 1$ with residue $-1$, there is a constant $c_0$ so that

\[ -\frac{\zeta'(\sigma)}{\zeta(\sigma)} < \frac{1}{\sigma - 1} + c_0, \quad 1 < \sigma \le 2. \tag{6.21} \]

By Corollary 6.19 we have

\[ -\frac{\zeta'(s)}{\zeta(s)} = \frac{1}{s-1} - B - \frac{1}{2}\log \pi + \frac{1}{2}\, \frac{\Gamma'(\frac{1}{2}s + 1)}{\Gamma(\frac{1}{2}s + 1)} - \sum_{\rho} \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big). \tag{6.22} \]

The following lemma helps to estimate the term involving the Gamma function.

Lemma 6.21. $\Gamma'(z)/\Gamma(z) = \log z + O(1/|z|)$ in any angle of the form $|\arg z| \le \pi - \varepsilon$ as $|z| \to \infty$.

Proof. A differentiation of formula (2.15) yields the nice equality

\[ \frac{\Gamma'(z)}{\Gamma(z)} = \log(z) - \int_0^{\infty} \frac{B_1(t)\, dt}{(t + z)^2}. \]

In the angle $|\arg z| \le \pi - \varepsilon$ one checks that $|t + z| \ge c_1(t + |z|)$, whence

\[ \int_0^{\infty} \frac{B_1(t)\, dt}{(t + z)^2} = O\Big( \int_0^{\infty} \frac{dt}{(t + |z|)^2} \Big) = O\Big( \frac{1}{|z|} \Big). \]

By this Lemma we obtain

\[ \mathrm{Re}\, \Big( \frac{1}{2}\, \frac{\Gamma'(\frac{1}{2}s + 1)}{\Gamma(\frac{1}{2}s + 1)} \Big) \le c_2 \log t \quad \text{for } t \ge 2 \text{ and } \sigma \in [1, 2]. \tag{6.23} \]

Hence, for the same values of $\sigma, t$, (6.22) yields the inequality

\[ -\mathrm{Re}\, \frac{\zeta'(s)}{\zeta(s)} \le c_3 \log t - \sum_{\rho} \mathrm{Re}\, \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big). \tag{6.24} \]

We shall follow the tradition of analytic number theory and denote $\rho = \beta + i\gamma$ with $\beta, \gamma \in \mathbb{R}$ (here $\gamma$ is not the Euler constant!), and deduce that above

\[ \mathrm{Re}\, \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big) = \frac{\sigma - \beta}{|s - \rho|^2} + \frac{\beta}{|\rho|^2} \ge 0. \]

In combination with (6.24) this yields that

\[ -\mathrm{Re}\, \frac{\zeta'(s)}{\zeta(s)} \le c_3 \log t \quad \text{for } t \ge 2 \text{ and } \sigma \in [1, 2]. \]

In particular, we have

\[ -\mathrm{Re}\, \frac{\zeta'(\sigma + 2it)}{\zeta(\sigma + 2it)} \le c_3 \log t \quad \text{for } t \ge 2 \text{ and } \sigma \in [1, 2]. \tag{6.25} \]

Assume then that $\rho_0 =: \beta_0 + i\gamma_0$ is a nontrivial zero, i.e. $\beta_0 \in (0,1)$ and $\zeta(\rho_0) = 0$. All the terms in the sum on the right hand side of (6.24) are nonnegative. Hence, taking $t = \gamma_0$ and keeping only the term $\mathrm{Re}\, \frac{1}{s - \rho_0} = \frac{1}{\sigma - \beta_0}$, it follows that

\[ -\mathrm{Re}\, \frac{\zeta'(\sigma + it)}{\zeta(\sigma + it)} \le c_3 \log t - \frac{1}{\sigma - \beta_0}. \tag{6.26} \]

We then substitute the inequalities (6.21), (6.25), and (6.26) into (6.20) and deduce

\[ 3\Big( \frac{1}{\sigma - 1} + c_0 \Big) + 4\Big( c_3 \log t - \frac{1}{\sigma - \beta_0} \Big) + c_3 \log t > 0. \]

This yields that

\[ \frac{4}{\sigma - \beta_0} \le \frac{3}{\sigma - 1} + 5 c_3 \log t + 3 c_0 \le \frac{3}{\sigma - 1} + c_4 \log t. \]

We choose $\sigma = 1 + (2 c_4 \log t)^{-1}$ and deduce

\[ \frac{4}{\sigma - \beta_0} \le 7 c_4 \log t, \]

which yields that

\[ \beta_0 \le \sigma - \frac{4}{7 c_4 \log t} = 1 - \frac{1}{14 c_4 \log t}. \]

We may thus choose $c = 1/(14 c_4)$ in the Theorem.

Remark 6.22. Still today the widest known zero-free domain is essentially the one due to Korobov and Vinogradov from the late 1950's:

\[ \sigma > 1 - \frac{c}{(\log|t|)^{2/3} (\log\log|t|)^{1/3}}. \]

6.5 The Riemann - von Mangoldt Theorem

Definition 6.23. For T > 0 we denote by N(T ) the number (counted with multiplic- ity) of zeroes of the Riemann ζ-function in the rectangle

0 < σ < 1 and 0 < t < T.

In the above rectangle $\zeta$ and $\xi$ have the same zeroes, so in order to estimate $N(T)$ we may consider $\xi$ if needed. Also, since $\xi$ has no zeroes outside the critical strip, we may equally well consider the zeroes in the rectangle

\[ R(T) := \{-1 < \sigma < 2 \ \text{ and } \ 0 < t < T\}. \]

Observe that $\xi$ is non-zero on the vertical sides of $R(T)$ by Proposition 6.12. Moreover, it does not vanish on the bottom side:

\[ \xi(s) \ne 0 \quad \text{for } s \in [-1, 2]. \tag{6.27} \]

This follows from Proposition 6.12 for $s \in [-1, 0] \cup [1, 2]$. To see it for $s \in (0,1)$ it is enough to check that $\zeta(s) \ne 0$ for these values. To that end recall from Example 3.8 that $\zeta(s) = (1 - 2^{1-s})^{-1} \tilde\zeta(s)$, where $\tilde\zeta(s) = \sum_{n=1}^{\infty} (-1)^{n-1} n^{-s} \ne 0$ for $s \in (0,1)$, as we see from the alternating nature of the series. Thus, if $T \gg 1$ is chosen so that the segment $\{t = T,\ 0 < \sigma < 1\}$ does not contain nontrivial zeroes (i.e. $\gamma \ne T$ for all $\rho = \beta + i\gamma$), we obtain by the argument principle

\[ N(T) = \frac{1}{2\pi} \Delta_{\partial R(T)} \arg \xi(s), \tag{6.28} \]

where the boundary $\partial R(T)$ of $R(T)$ is traversed in the positive direction. Here $\Delta$ refers to the total change in the argument. Let us denote

LT := {line segment from 2 to 2+iT} ∪ {line segment from 2 + iT to 1/2 + iT }.

We recall that $\xi(\bar s) = \overline{\xi(s)}$ since $\xi$ is real on $\mathbb{R}$. Moreover, it satisfies $\xi(1 - s) = \xi(s)$. Hence the change of argument of $\xi(s)$ over the polygonal path

\[ (1/2 + iT, -1 + iT) \cup (-1 + iT, -1) \]

equals that over $L_T$, since we have

\[ \xi(\sigma + it) = \xi(1 - \sigma - it) = \overline{\xi(1 - \sigma + it)}. \]

Moreover, $\arg \xi(s)$ does not change on $(-1, 2)$ since $\xi$ is real and, by (6.27), non-vanishing there. Hence

\[ N(T) = \frac{1}{\pi} \Delta_{L_T} \arg \xi(s). \tag{6.29} \]

By writing $\xi(s) = (s - 1)\pi^{-s/2}\Gamma(1 + s/2)\zeta(s)$ we will estimate the change in the argument separately for each factor. In what follows $\arg z$ refers to the principal value of $\mathrm{Im} \log z$, i.e. $\arg z \in (-\pi, \pi)$ for $z \notin (-\infty, 0]$.

1. $\Delta_{L_T} \arg(s - 1) = \arg(iT - 1/2) = \pi/2 + \arctan\big(\frac{1}{2T}\big) = \pi/2 + O(1/T)$.

2. $\Delta_{L_T} \arg \Gamma(1 + s/2) = \mathrm{Im} \log \Gamma(5/4 + iT/2)$. In Stirling's formula (Theorem 2.31) the branch of $\log \Gamma$ is real on the positive real line, so we see that the above quantity equals

\[ \mathrm{Im}\Big( (3/4 + iT/2)\log(5/4 + iT/2) - (5/4 + iT/2) + \frac{1}{2}\log(2\pi) + O(T^{-1}) \Big). \]

Here

\[ \log(5/4 + iT/2) = \log(iT/2) + \log\Big(1 + \frac{5}{2iT}\Big) = \log(T/2) + \frac{i\pi}{2} + O(1/T). \]

Plugging this into the previous formula and simplifying we find

\[ \Delta_{L_T} \arg \Gamma(1 + s/2) = \frac{T}{2}\log(T/2) - \frac{T}{2} + \frac{3\pi}{8} + O(1/T). \]

3. $\Delta_{L_T} \arg(\pi^{-s/2}) = \Delta_{L_T} \big( -\frac{t}{2}\log \pi \big) = -\frac{T}{2}\log \pi$.

Put together, we have proven an intermediate result.

Lemma 6.24.
\[ N(T) = \frac{T}{2\pi}\log\Big(\frac{T}{2\pi}\Big) - \frac{T}{2\pi} + O(1) + S(T), \]
where
\[ S(T) := \frac{1}{\pi} \Delta_{L_T} \arg \zeta(s). \]

Thus we finally need to estimate the change of the argument of $\zeta$ itself. For this we cannot hope for a very precise formula, and for this reason we did not produce a sharper form of the previous Lemma (there one could even write down an asymptotic formula). The main point in invoking first the function $\xi$ was that in this way we reduced the task to estimating the change of the argument over $L_T$, and in this range $\zeta$ is more amenable to estimation. For this purpose we need some a priori knowledge of the local density of the zeroes, and a local version of Corollary 6.19. Recall that $\{\rho = \beta + i\gamma\}$ is the set of non-trivial zeroes.

Proposition 6.25. (i) For any $T \ge 2$ it holds that

\[ \sum_{\rho} \frac{1}{1 + (T - \gamma)^2} = O(\log T). \]

(ii) $N(T + 1) - N(T) = O(\log T)$.

(iii) Moreover, for $s = \sigma + it$ with $t \ge 4$ and $-1 \le \sigma \le 2$ we have

\[ \frac{\zeta'(s)}{\zeta(s)} = \sum_{\rho:\, |t - \gamma| \le 1} \frac{1}{s - \rho} + O(\log t). \]

Proof. Recall that by Corollary 6.19 we have

\[ \frac{\zeta'(s)}{\zeta(s)} = \frac{-1}{s-1} + B + \frac{1}{2}\log \pi - \frac{1}{2}\, \frac{\Gamma'(\frac{1}{2}s + 1)}{\Gamma(\frac{1}{2}s + 1)} + \sum_{\rho} \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big). \]

According to Lemma 6.21 we have above

\[ \frac{1}{2}\, \frac{\Gamma'(\frac{1}{2}s + 1)}{\Gamma(\frac{1}{2}s + 1)} = \frac{1}{2}\log\Big(\frac{1}{2}s + 1\Big) + O(1/t) = O(\log t), \]

so that in the range $\sigma \in [-1, 2]$, $t \ge 2$ we have

\[ \frac{\zeta'(s)}{\zeta(s)} = \sum_{\rho} \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big) + O(\log t). \tag{6.30} \]

Let us substitute here $s = 2 + iT$ and note that the left hand side is uniformly bounded on the vertical line $\{\sigma = 2\}$ to obtain

\[ \mathrm{Re} \sum_{\rho} \Big( \frac{1}{s - \rho} + \frac{1}{\rho} \Big) = O(\log T). \tag{6.31} \]

Now $\mathrm{Re}\, \frac{1}{\rho} \ge 0$ and

\[ \mathrm{Re}\, \frac{1}{2 + iT - \rho} = \frac{2 - \beta}{(2 - \beta)^2 + (\gamma - T)^2} \ge \frac{1}{4 + (\gamma - T)^2} \ge \frac{1/4}{1 + (\gamma - T)^2}. \]

Claim (i) follows as we substitute this estimate in (6.31). For $\gamma \in [T, T+1]$ we have $\frac{1}{1 + (\gamma - T)^2} \ge 1/2$, whence (ii) is a consequence of (i).

By substituting in (6.30) first $s$ and then $2 + it$, and subtracting the latter from the former, we obtain

\[ \frac{\zeta'(s)}{\zeta(s)} = \sum_{\rho} \Big( \frac{1}{s - \rho} - \frac{1}{2 + it - \rho} \Big) + O(\log t) = \sum_{|\gamma - t| \le 1} \frac{1}{s - \rho} + E(t) + O(\log t), \tag{6.32} \]

where

\[ E(t) = O\Big( \sum_{|\gamma - t| \le 1} \frac{1}{|2 + it - \rho|} + \sum_{|\gamma - t| \ge 1} \Big| \frac{1}{s - \rho} - \frac{1}{2 + it - \rho} \Big| \Big). \]

In the above expression for $E(t)$ we have $\frac{1}{|2 + it - \rho|} \le \frac{1}{\mathrm{Re}(2 + it - \rho)} \le 1$, so that by part (i) the first sum is of size $O(\log t)$. In addition,

\[ \Big| \frac{1}{s - \rho} - \frac{1}{2 + it - \rho} \Big| = \frac{2 - \sigma}{|s - \rho||2 + it - \rho|} \le \frac{3}{|\gamma - t|^2}, \]

so that the last sum is dominated by

\[ \sum_{|\gamma - t| \ge 1} \frac{3}{|\gamma - t|^2} \le 6 \sum_{\rho} \frac{1}{1 + |\gamma - t|^2} = O(\log t), \]

according to part (i). We deduce that $E(t) = O(\log t)$, and part (iii) follows from (6.32).

We are then ready to prove the Riemann - von Mangoldt bound for the count of non-trivial zeroes.

Theorem 6.26. $S(T) = O(\log T)$ and

\[ N(T) = \frac{T}{2\pi}\log\Big(\frac{T}{2\pi}\Big) - \frac{T}{2\pi} + O(\log T). \]

Proof. After Lemma 6.24 it is enough to prove the statement on $S(T)$. To that end we note first that the change of the argument of $\zeta(s)$ along the line $\{\sigma = 2\}$ is uniformly bounded, since $|\log \zeta(2 + it)|$ is uniformly bounded for $t \in \mathbb{R}$. Hence

\[ S(T) = O(1) + \frac{1}{\pi}\, \mathrm{Im} \int_{2 + iT}^{1/2 + iT} \frac{\zeta'(s)}{\zeta(s)}\, ds \overset{\text{Prop. 6.25(iii)}}{=} \frac{1}{\pi} \sum_{\rho:\, |\gamma - T| \le 1} \mathrm{Im} \int_{2 + iT}^{1/2 + iT} \frac{ds}{s - \rho} + O(\log T). \]

All this is perfectly fine as soon as there are no zeroes $\rho$ with imaginary part $T$. Now the change of argument of $(s - \rho)$ along the line of integration above is bounded by $\pi$, whence the imaginary part of each integral in the last formula is bounded by $\pi$. Put together, we obtain

\[ S(T) = O\big( \#\{\gamma : |\gamma - T| \le 1\} \big) + O(\log T) = O(\log T), \]

where we used Proposition 6.25(i). Finally, the obtained estimate for $N(T)$ is valid for any $T$ (regardless of whether some $\rho$ has imaginary part $T$), since we may in any case pick an arbitrarily small $\varepsilon > 0$ so that none of the $\rho$ has imaginary part $T + \varepsilon$.

Remark 6.27. It is possible to use slightly sharpened versions of Lemma 6.24 to verify the Riemann conjecture numerically up to a given height.

6.6 The explicit formula

Denote

\[ \psi_0(x) = {\sum_{n \le x}}' \Lambda(n) = \begin{cases} \psi(x), & x \ne p^{\alpha}, \\ \psi(x) - \tfrac{1}{2}\log p, & x = p^{\alpha}, \end{cases} \]

where the dash means that for $x = p^{\alpha}$ the last term of the sum is taken with weight $\frac{1}{2}$.
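For concreteness, $\psi(x)$ is straightforward to compute directly from the definition of $\Lambda$; a small brute-force sketch (an illustration, not part of the notes):

```python
import math

# psi(x) = sum of Lambda(n) over n <= x, where Lambda(n) = log p if n = p^k
# (p prime, k >= 1) and Lambda(n) = 0 otherwise.  By the PNT, psi(x) ~ x.
def mangoldt(n):
    if n < 2:
        return 0.0
    p = next(d for d in range(2, n + 1) if n % d == 0)   # least prime factor
    m = n
    while m % p == 0:
        m //= p
    return math.log(p) if m == 1 else 0.0

def psi(x):
    return sum(mangoldt(n) for n in range(2, x + 1))

assert abs(psi(100) - 100) < 10        # psi(100) = 94.04...
```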

Also, let us denote by $\langle x \rangle$ the distance from $x$ to the nearest prime power (not equal to $x$!). In other words,

\[ \langle x \rangle = \min\{ |x - p^{\alpha}| : p \in \mathbb{P},\ \alpha \in \mathbb{N},\ p^{\alpha} \ne x \}. \tag{6.33} \]

Theorem 6.28. For $x, T \ge 4$ we have

\[ \psi_0(x) = x - \sum_{\rho:\, |\mathrm{Im}\, \rho| < T} \frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac{1}{2}\log(1 - x^{-2}) + R(x, T), \tag{6.34} \]

where

\[ R(x, T) = O\Big( \frac{x \log^2(xT)}{T} + \log(x) \min\Big(1, \frac{x}{T \langle x \rangle}\Big) \Big). \tag{6.35} \]

Letting $T \to \infty$ we obtain

\[ \psi_0(x) = x - \sum_{\rho} \frac{x^{\rho}}{\rho} - \frac{\zeta'(0)}{\zeta(0)} - \frac{1}{2}\log(1 - x^{-2}). \tag{6.37} \]

Remark 6.29. • The sum in (6.37) must be understood as the symmetric summation

\[ \sum_{\rho} \frac{x^{\rho}}{\rho} := \lim_{T \to \infty} \sum_{\rho:\, |\mathrm{Im}\, \rho| < T} \frac{x^{\rho}}{\rho}. \]

Moreover, $x^{\rho} := \exp(\rho \log x)$.

• For applications (6.34) is in most cases much more useful than the (beautiful though) formula (6.37).

• The exact value of $\frac{\zeta'(0)}{\zeta(0)}$ is $\log 2\pi$, but it is traditional to write it as such in the explicit formula.

Proof of Theorem 6.28. Recall by (6.19) that $\frac{\zeta'(s)}{\zeta(s)} = -\sum_{n=1}^{\infty} \Lambda(n) n^{-s}$ for $\sigma > 1$. Assume first that $x$ is not a prime power. An application of Perron's effective formula (Theorem 3.23) yields that

\[ \Big| \psi_0(x) + \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}\, ds \Big| \le \sum_{n=1}^{\infty} \Lambda(n) \Big(\frac{x}{n}\Big)^c \min\Big(1, \frac{1}{T|\log(x/n)|}\Big) \tag{6.38} \]
\[ = \sum_{n:\, |n - x| \le x/4} (\cdots) + \sum_{n:\, |n - x| > x/4} (\cdots) =: R_1 + R_2, \]

where we assume that $c > 1$. We shall actually make the choice

\[ c = 1 + \frac{1}{\log x}, \]

which yields that $x^c = ex = O(x)$. In order to estimate $R_2$ we simply use that in this range of summation one has $|\log(x/n)| \ge \log(5/4)$ to obtain

\[ R_2 = O\Big( \frac{x^c}{T} \sum_{n=1}^{\infty} \Lambda(n) n^{-c} \Big) = O\Big( \frac{x}{T} \cdot \Big| \frac{\zeta'}{\zeta}\Big(1 + \frac{1}{\log x}\Big) \Big| \Big) = O\Big( \frac{x \log x}{T} \Big), \tag{6.41} \]

since $\zeta'(1 + s)/\zeta(1 + s) = -1/s + O(1)$ for small $s$. For $R_1$ we observe that $\Lambda(n) = 0$ for every summand unless $|n - x| \ge \langle x \rangle$ and $|n - x| \le x/4$. However, in the range $|n - x| \le x/4$ we have

\[ \Lambda(n) = O(\log x) \quad \text{and} \quad x^c n^{-c} = O(1). \]

Also, if $\pm(n - x) = \langle x \rangle + j$, then

\[ \Big| \log \frac{x}{n} \Big| = \Big| \log \frac{n}{x} \Big| = \Big| \log\Big( 1 \pm \frac{\langle x \rangle + j}{x} \Big) \Big|. \tag{6.43} \]

If $\langle x \rangle \ge x/8$, we obtain $|\log(x/n)|^{-1} = O(1)$ and the outcome is simply

\[ R_1 = O\Big( \log x \sum_{|n - x| \le x/4} \frac{1}{T} \Big) = O\Big( \frac{x \log x}{T} \Big). \tag{6.44} \]

In turn, if $\langle x \rangle < x/8$ we apply (6.43) and the estimate $|\log(1 + u)| \approx |u|$ for $|u| \le 1/2$ to estimate

\[ R_1 = O\Big( 2 \sum_{j=0}^{\lfloor x/4 \rfloor} \log(x) \min\Big(1, \frac{x}{T(\langle x \rangle + j)}\Big) \Big) = O\Big( \log(x) \min\Big(1, \frac{x}{T \langle x \rangle}\Big) + \log(x) \sum_{j=1}^{\lfloor x/4 \rfloor} \frac{x}{Tj} \Big) \]
\[ = O\Big( \frac{x \log^2(x)}{T} + \log(x) \min\Big(1, \frac{x}{T \langle x \rangle}\Big) \Big). \]

This covers also the case of (6.44) and dominates the estimate we obtained before for $R_2$ in (6.41). Put together, up to now we have shown that

\[ \Big| \psi_0(x) + \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}\, ds \Big| = O\Big( \frac{x \log^2(x)}{T} + \log(x) \min\Big(1, \frac{x}{T \langle x \rangle}\Big) \Big). \tag{6.45} \]

We assumed above that $x$ is not a prime power, but if it is, the error term of the Perron effective formula corresponding to the $x$:th term produces an additional error that is bounded by $2T^{-1}\psi(x) = O(x/T)$, which can be subsumed in our previous error term. Thus (6.45) is valid in any case.

Our next duty is to apply residues to transform the integral in (6.45) into the desired sum. Recall that $\zeta$ has simple zeroes at $-2n$, $n \in \mathbb{N}$. Thus we may compute all the residues of $\frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}$ as follows:

\[ \mathrm{Res}_{s = -2n} \Big( \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \Big) = \frac{x^{-2n}}{-2n}, \quad n = 1, 2, \ldots, \]
\[ \mathrm{Res}_{s = 0} \Big( \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \Big) = \frac{\zeta'(0)}{\zeta(0)}, \]
\[ \mathrm{Res}_{s = 1} \Big( \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \Big) = -x, \]
\[ \mathrm{Res}_{s = \rho} \Big( \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s} \Big) = \frac{x^{\rho}}{\rho}, \]

where in the last formula the stated residue at $s = \rho$ is multiplied by the (possible) multiplicity of the non-trivial zero $\rho$. Choose $T \ge 4$ so that no $\rho$ has imaginary part $\pm T$ and let $U$ be an odd positive integer. The idea is to simply apply the residue theorem on the rectangle with the sides $(c - iT, c + iT)$, $L_1 := (c + iT, -U + iT)$, $L_0 := (-U + iT, -U - iT)$, $L_{-1} := (-U - iT, c - iT)$. We thus obtain

\[ \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \frac{\zeta'(s)}{\zeta(s)} \frac{x^s}{s}\, ds = -x + \sum_{\rho:\, |\mathrm{Im}\, \rho| < T} \frac{x^{\rho}}{\rho} + \frac{\zeta'(0)}{\zeta(0)} + \sum_{n=1}^{\lfloor U/2 \rfloor} \frac{x^{-2n}}{-2n} + I_1 + I_0 + I_{-1}, \]

where $I_j$ denotes the corresponding integral over $L_j$. For $y \in (0, 1/2]$ we have the elementary estimate

\[ \log(1 - y) + \sum_{n=1}^{\lfloor U/2 \rfloor} \frac{y^n}{n} = O(2^{-U/2}). \]

This leads to

\[ \sum_{n=1}^{\lfloor U/2 \rfloor} \frac{x^{-2n}}{-2n} - \frac{1}{2}\log(1 - x^{-2}) = O(2^{-U/2}). \tag{6.46} \]

We next estimate the integrands of the $I_j$:s. In order to estimate $\zeta$ on $\{\sigma = -U\}$ we substitute $1 - s$ in place of $s$ in the functional equation to obtain

\[ \zeta(1 - s) = 2^{1-s} \pi^{-s} \cos(\pi s / 2)\, \Gamma(s)\, \zeta(s) \]

and differentiate logarithmically:

\[ -\frac{\zeta'(1 - s)}{\zeta(1 - s)} = -\log(2\pi) - \frac{\pi}{2}\tan(\pi s / 2) + \frac{\Gamma'(s)}{\Gamma(s)} + \frac{\zeta'(s)}{\zeta(s)}. \tag{6.47} \]

If $s = 1 + U - it$ with $U$ an odd positive integer, we have $\big| \frac{\zeta'(1 + U - it)}{\zeta(1 + U - it)} \big| \le C$ (independent of $t$ or $U$) and

\[ \Big| \tan\Big( \frac{1}{2}\pi(1 + U - it) \Big) \Big| = \Big| \tan\Big( \frac{1}{2}\pi i t \Big) \Big| = \frac{\sinh(\pi t / 2)}{\cosh(\pi t / 2)} \le 1. \]

Moreover, the now already familiar estimate of Lemma 6.21 yields

\[ \frac{\Gamma'(1 + U - it)}{\Gamma(1 + U - it)} = O(\log|1 + U - it|) = O(\log|U + it|). \]

Substituting all these estimates into (6.47) yields an estimate for $I_0$:

\[ I_0 = O\Big( x^{-U} \int_{-T}^{T} \frac{\log|U + it|}{|U + it|}\, dt \Big) = O\Big( x^{-U}\, \frac{T \log U}{U} \Big). \tag{6.48} \]

We will now modify $T$ so that, if needed, we replace $T$ by $T'$ with $|T - T'| \le 1$ and so that

\[ \min_{\rho} |\gamma - T'| \ge \frac{c}{\log T} \qquad (\rho = \beta + i\gamma), \tag{6.49} \]

which is possible since the number of $\gamma$:s close to $T$ is $O(\log T)$ by Proposition 6.25. We shall later on check that this replacement is harmless. We next observe that for $|t| \ge 1$ one has

\[ \Big| \tan\Big( \frac{1}{2}\pi(\sigma + it) \Big) \Big| = \left| \frac{\tan(\frac{1}{2}\pi\sigma) + i\tanh(\frac{1}{2}\pi t)}{1 - i\tan(\frac{1}{2}\pi\sigma)\tanh(\frac{1}{2}\pi t)} \right| \le C \quad \text{for } |t| \ge 1, \]

since now $0 < c_1 \le \tanh(\frac{1}{2}\pi t) \le 1$. As this is substituted in (6.47) we obtain

\[ \frac{\zeta'(1 - s)}{\zeta(1 - s)} = O(1) + O\Big( \frac{\Gamma'(s)}{\Gamma(s)} \Big) = O(\log|s|) \quad \text{for } \sigma \ge 2,\ |t| \ge 1. \]

Hence,

\[ \frac{\zeta'(s)}{\zeta(s)} = O(\log|s|) \quad \text{if } s \in L_{\pm 1} \cap \{\sigma \le -1\}. \tag{6.50} \]

Over $L_{\pm 1} \cap \{-1 \le \sigma \le 2\}$ we use Proposition 6.25(iii) and (6.49) to estimate

\[ \frac{\zeta'(\sigma + iT')}{\zeta(\sigma + iT')} = O\Big( \log T' + \sum_{\rho:\, |\gamma - T'| \le 1} \frac{1}{|s - \rho|} \Big) \overset{\text{Prop. 6.25(ii)}}{=} O\big( \log(T') + \log^2(T') \big) = O\big( \log^2(T') \big). \]

We may now substitute all this in the definition of the integrals $I_{\pm 1}$ to estimate

\[ |I_1| + |I_{-1}| = O\Big( \int_{-1}^{c} \frac{x^{\sigma}}{|\sigma + iT'|} \log^2(T')\, d\sigma + \int_{-U}^{-1} \frac{x^{\sigma}}{|\sigma + iT'|} \log(T' + |\sigma|)\, d\sigma \Big) \]
\[ = O\Big( \frac{\log^2(T')}{T'} \int_{-\infty}^{c} x^{\sigma}\, d\sigma \Big) = O\Big( \frac{\log^2(T')}{T'} \frac{x^c}{\log x} \Big) = O\Big( \frac{\log^2(T')}{T'} \frac{x}{\log x} \Big). \]

Altogether, by combining this with (6.48), we have in the limit $U \to \infty$

\[ |I_0 + I_1 + I_{-1}| = O\Big( \frac{\log^2(T')}{T'} \frac{x}{\log x} \Big). \]

By recalling (6.45), this means that we have derived the following bound for the error term:

\[ R(x, T') = O\Big( \frac{x \log^2(x)}{T'} + \log(x)\min\Big(1, \frac{x}{T'\langle x \rangle}\Big) + \frac{\log^2(T')}{T'} \frac{x}{\log x} \Big) = O\Big( \frac{x \log^2(xT')}{T'} + \log(x)\min\Big(1, \frac{x}{T'\langle x \rangle}\Big) \Big). \]

This is exactly the stated bound, but for $T'$ instead of $T$. However, the above estimate does not change if $T'$ is replaced by $T$. Moreover,

\[ |R(x, T') - R(x, T)| \le \sum_{\rho:\, ||\gamma| - T| \le 1} \Big| \frac{x^{\rho}}{\rho} \Big| = O\Big( \frac{x}{T}\log T \Big), \]

where we noted that the number of summands is $O(\log T)$. This error is subsumed by the general error term. Finally, the statement when $x$ is an integer follows from the given general bound, and the explicit formula (6.37) follows by letting $T \to \infty$ in (6.34) and (6.35).

6.7 Some estimates for ζ

Let us finish this chapter by establishing some basic estimates for $\zeta$ on or near the line $\{\sigma = 1\}$. We already know that $\zeta$ is zero-free in a quantitative region around this line; recall Theorem 6.20.

Theorem 6.30. If $|t| \ge 10$ and $1 - \frac{1}{\log|t|} \le \sigma \le 2$, then

\[ |\zeta(s)| = O(\log|t|) \quad \text{and} \quad |\zeta'(s)| = O(\log^2|t|). \]

Proof. We may assume that $t$ is positive. Analogously to what we did before while proving Lemma 6.13(i), write

\[ \sum_{n=N}^{\infty} n^{-s} := \zeta(s) - \sum_{n=1}^{N-1} n^{-s} = \sum_{n=N}^{\infty} n\big(n^{-s} - (n+1)^{-s}\big) - (N-1)N^{-s} \tag{6.51} \]
\[ = s \sum_{n=N}^{\infty} n \int_n^{n+1} x^{-s-1}\, dx - (N-1)N^{-s} \]
\[ = \frac{1}{s-1} N^{1-s} + N^{-s} - s \int_N^{\infty} (x - \lfloor x \rfloor)\, x^{-s-1}\, dx. \]

Assume that $1 - \frac{2}{\log|t|} \le \sigma \le 3$. For $n \le t$ we have

\[ |n^{-s}| = n^{-\sigma} \le n^{-1} n^{2/\log|t|} \le n^{-1} t^{2/\log|t|} = e^2/n. \]

Hence

\[ \sum_{n \le \lfloor t \rfloor - 1} |n^{-s}| = O\Big( \sum_{n \le \lfloor t \rfloor - 1} n^{-1} \Big) = O(\log t). \tag{6.54} \]

By choosing $N = \lfloor t \rfloor$ in (6.51) it follows that

\[ \zeta(s) - \sum_{n=1}^{N-1} n^{-s} = O\big( t^{-1} t^{2/\log t} \big) + O\Big( t \int_{\lfloor t \rfloor}^{\infty} x^{-2 + 2/\log t}\, dx \Big) = O(1). \]

Together with (6.54) this yields the first claim, actually in the larger range $3 \ge \sigma \ge 1 - 2(\log|t|)^{-1}$.

For the second claim, assume that $\sigma \ge 1 - (\log t)^{-1}$ and $t \ge 10$. We apply the first claim and the Cauchy integral formula on the disc $B(s, (\log t)^{-1})$ (remember that we actually proved the first claim in a slightly larger region than is stated in the theorem). It follows that

\[ |\zeta'(s)| \le (\log t) \sup_{w \in B(s, (\log t)^{-1})} |\zeta(w)| = O(\log^2 t). \]

Finally, we give a simple estimate for the logarithmic derivative of the zeta function inside the zero-free zone we established previously.

Theorem 6.31. There is a constant $c_0 > 0$ so that for $t \ge 2$ and $\sigma \ge 1 - \frac{c_0}{\log t}$ we have

\[ \frac{\zeta'(s)}{\zeta(s)} = O(\log^2 t). \]

Proof. We combine two things. First, by Proposition 6.25(iii) we have for the considered values of $s$ the estimate

\[ \frac{\zeta'(s)}{\zeta(s)} = \sum_{\rho:\, |t - \gamma| \le 1} \frac{1}{s - \rho} + O(\log t), \]

and the number of summands is $O(\log t)$ by part (i) of the same proposition. By Theorem 6.20 any $\rho = \beta + i\gamma$ in the above sum satisfies $\beta \le 1 - c(\log(t + 1))^{-1}$. Thus, if we choose $c_0 = c/2$, it follows that in the sum $\mathrm{Re}(s - \rho) \ge (c/2)(\log(t + 1))^{-1}$, so that each term in the sum is uniformly $O(\log t)$ and the total sum is thus $O(\log^2 t)$.

Chapter 7

On the distribution of primes. Miscellaneous

Recall that the PNT states π(x) ∼ x/log x as x → ∞. A better approximation is obtained from Gauss's form of the conjecture (of the PNT):

π(x) ∼ Li(x) as x → ∞, where Li(x) := ∫₂^x dt/log t.  (7.1)

We already encountered the integral logarithm function Li while discussing Riemann's memoir. If one is not interested in the size of the error term, (7.1) is equivalent to the standard form of the PNT, since by partial integration

Li(x) = x/log x − 2/log 2 + ∫₂^x dt/log² t,  where  (7.2)

∫₂^x dt/log² t = ∫₂^{√x} dt/log² t + ∫_{√x}^x dt/log² t ≤ (√x − 2) log^{−2} 2 + (x − √x) · 4/log² x = O(x/log² x).

However, it is of great interest to understand how good the approximation is, so we write

π(x) = Li(x) + R(x)  (7.3)

and ask for upper bounds for the error term R(x). We shall soon show that R grows more slowly than x(log x)^{−2}, whence Li(x) indeed yields a better approximation than x/log x. The estimates for R are usually obtained via those for ψ(x) − x. That the two are essentially equivalent follows from the following observation, which is a refined version of the equivalence between (i) and (ii) in our earlier Theorem 4.1.

Lemma 7.1. Assume that the increasing function R̃ : (0, ∞) → (0, ∞) satisfies R̃(x) = O(x) as x → ∞ and that ψ(x) − x = O(R̃(x)) as x → ∞.

Then the error term R in (7.3) satisfies

R(x) = O(R̃(x)/log x + √x/log x) as x → ∞.

Proof. We define the (first Chebyshev) function θ by setting

θ(x) := Σ_{p≤x} log p.

Recall that ψ(x) = Σ_{p^α≤x} log p. If p ≥ 2 is a prime, the number of prime powers p^α ≤ x equals ⌊log x/log p⌋. Hence

0 ≤ ψ(x) − θ(x) = Σ_{p^α≤x, α≥2} log p = Σ_{p≤√x} (⌊log x/log p⌋ − 1) log p ≤ log(x) π(√x) = O(√x).

Especially, we have θ(x) − x = O(√x + R̃(x)). Next we apply Abel partial summation to write

π(x) = Σ_{p≤x} log p · (1/log p) = θ(x)/log x + ∫₂^x θ(t)/(t log² t) dt

 = x/log x + ∫₂^x dt/log² t + (θ(x) − x)/log x + ∫₂^x (θ(t) − t)/(t log² t) dt

 = Li(x) + 2/log 2 + (θ(x) − x)/log x + ∫₂^x (θ(t) − t)/(t log² t) dt,

where in the last step we used (7.2). A fortiori, we may express R in terms of θ as follows:

R(x) = 2/log 2 + (θ(x) − x)/log x + ∫₂^x (θ(t) − t)/(t log² t) dt.  (7.4)

Since θ(x) − x = O(√x + R̃(x)), and R̃ is increasing, we obtain

R(x) = O((√x + R̃(x))/log x) + O(∫₂^x (√t + R̃(t))/(t log² t) dt).  (7.5)

Here the first O-term is of the desired form. In turn, we have

∫₂^x √t/(t log² t) dt ≤ (log 2)^{−2} ∫₂^x t^{−1/2} dt = 2(log 2)^{−2}(√x − √2) = O(√x).  (7.6)

On the other hand, by the increasing nature of R̃ and the bound R̃(x) = O(x) we obtain

∫₂^x R̃(t)/(t log² t) dt = ∫₂^{√x} + ∫_{√x}^x = O(∫₂^{√x} dt/log² t) + O(R̃(x) ∫_{√x}^x dt/(t log² t)).  (7.7)

The first term above is handled by splitting once more the integration interval in a geometric way:

∫₂^{√x} dt/log² t = O(∫₂^{x^{1/4}} dt) + O(log^{−2}(x)·(√x − x^{1/4})) = O(√x (log x)^{−2}),

and the second term may be computed exactly:

∫_{√x}^x dt/(t log² t) = [−1/log t]_{√x}^x = 2/log x − 1/log x = 1/log x.

All in all, ∫₂^x R̃(t)/(t log² t) dt = O(√x (log x)^{−2}) + O(R̃(x)(log x)^{−1}), and in combination with (7.5) and (7.6) this yields the claim.

Remark 7.2. One may easily prove (exercise!) a converse result: if

π(x) − Li(x) = O(S(x)), where S is increasing and S(x) = O(x),

then it follows that

ψ(x) − x = O(√x + log(x) S(x)).
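As a quick numerical sanity check (an illustration of ours, not part of the notes; the helper names are chosen freely), one can compare π(x) with both x/log x and Li(x). Below π(x) is computed with a sieve and Li(x) = ∫₂^x dt/log t by the midpoint rule; already at x = 10⁶ the approximation Li(x) is dramatically closer than x/log x.

```python
import math

def prime_pi(x: int) -> int:
    """pi(x): number of primes <= x, via the sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            # strike out multiples of p starting from p^2
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    return sum(sieve)

def Li(x: float, steps: int = 200_000) -> float:
    """Li(x) = integral_2^x dt / log t, approximated by the midpoint rule."""
    h = (x - 2.0) / steps
    return h * sum(1.0 / math.log(2.0 + (k + 0.5) * h) for k in range(steps))

x = 10**6
pi_x = prime_pi(x)
print(pi_x, abs(x / math.log(x) - pi_x), abs(Li(x) - pi_x))
```

Here π(10⁶) = 78498, the error of x/log x is roughly 6·10³, while the error of Li(x) is of size ≈ 130, in line with the discussion above.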

7.1 PNT with estimate for the error term

Theorem 7.3. There is a positive constant c > 0 so that as x → ∞, we have

ψ(x) = x + O(x exp(−c log^{1/2} x))  (7.8)

and

π(x) = Li(x) + O(x exp(−c log^{1/2} x)).  (7.9)

Proof. By Lemma 7.1 it is enough to prove the first estimate (7.8). Also, we may assume that x ≥ 8 is an integer. Then the explicit formula (6.34) with error term (6.50) yields that

ψ(x) = x − Σ_{ρ: |Im ρ|<T} x^ρ/ρ + O((x/T) log²(xT)) + O(log x),

where the O(log x) term comes from replacing ψ₀(x) by ψ(x). The term x^ρ/ρ (we write as usual ρ = β + iγ) can be estimated as follows:

|x^ρ/ρ| ≤ x^β/|ρ| ≤ (x/|ρ|) exp(−c log x/log T),

where the last estimate comes from Theorem 6.20, which ensures that β < 1 − c(log |γ|)^{−1} ≤ 1 − c(log T)^{−1}, since in the sum |γ| ≤ T. In order to estimate the sum of the reciprocals |ρ|^{−1} we note that by Proposition 6.25(ii) we have N(m + 1) − N(m) ≤ c₀ log m, so that

Σ_{ρ: |Im ρ|<T} 1/|ρ| = O(Σ_{m=1}^{⌊T⌋+1} (log m)/m) = O(∫₁^T (log x/x) dx) = O(log² T).  (7.10)

Putting the estimates together we obtain

ψ(x) − x = O(log x + (x/T) log²(xT) + x (log² T) e^{−c log x/log T}).

Choose here T = e^{√log x}. Then log T = √log x and it follows that

ψ(x) − x = O(x log²(x)(e^{−√log x} + e^{−c√log x})) = O(x e^{−c′√log x}).
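The estimate (7.8) can be illustrated numerically (an illustration of ours, not from the notes; the constant c is unspecified in the theorem, so we merely test the bound with c = 1 at one value of x):

```python
import math

def chebyshev_psi(x: int) -> float:
    """psi(x) = sum of log p over all prime powers p^a <= x."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, x + 1, p)))
    # each prime p <= x contributes floor(log x / log p) copies of log p
    return sum(math.floor(math.log(x) / math.log(p)) * math.log(p)
               for p in range(2, x + 1) if sieve[p])

x = 10**6
err = abs(chebyshev_psi(x) - x)
print(err, x * math.exp(-math.sqrt(math.log(x))))
```

At x = 10⁶ the error |ψ(x) − x| is only a few hundred, far below the (very generous) bound x·e^{−√log x} ≈ 2.4·10⁴.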

Remark 7.4. Still today the best known estimate for the error term is

π(x) − Li(x) = O(x exp(−c (log x)^{3/5} (log log x)^{−1/5})),

due to Vinogradov and Korobov in 1958! The improvement is based on establishing a larger zero-free region, which in turn is based on applying Vinogradov's method of estimating exponential sums.

7.2 The Riemann hypothesis

Recall the Riemann hypothesis (RH): Re ρ = 1/2 for each non-trivial zero ρ of the Riemann zeta function. Equivalently, ζ(s) ≠ 0 for σ > 1/2. This is perhaps the most famous (and for many the most important) open problem in all of mathematics. The significance is clear from the following result:

Theorem 7.5. For any θ ∈ [1/2, 1) the following are equivalent:

(i) ψ(x) − x = O(x^{θ+ε}) as x → ∞ for any ε > 0.

(ii) π(x) − Li(x) = O(x^{θ+ε}) as x → ∞ for any ε > 0.

(iii) sup{Re ρ} ≤ θ.

Proof. Assume first that (iii) holds true. We follow the proof of Theorem 7.3. This time

|x^ρ/ρ| ≤ x^θ/|ρ|,

so that we obtain

ψ(x) − x = O(log x + x^θ Σ_{ρ: |Im ρ|<T} 1/|ρ| + (x/T) log²(xT)) = O(log x + x^θ log² T + (x/T) log²(xT)).

Choosing T = x yields ψ(x) − x = O(x^θ log² x) = O(x^{θ+ε}), i.e. (i). Assume then that (i) holds. For σ > 1 we have

−ζ′(s)/ζ(s) = Σ_{n=1}^∞ Λ(n) n^{−s} = s ∫₁^∞ ψ(x) x^{−s−1} dx = s/(s−1) + s ∫₁^∞ (ψ(x) − x) x^{−s−1} dx.

By assumption (i) the last integral continues analytically to {σ > θ}, which shows that −ζ′(s)/ζ(s) − s/(s−1) is analytic in this domain. Especially, none of the ρ:s may lie in this domain, and we obtain (iii). Finally, (i) and (ii) are equivalent due to Lemma 7.1 and Remark 7.2.

Corollary 7.6. (i) sup{Re ρ} ≥ 1/2.

(ii) π(x) − Li(x) ≠ O(x^{1/2−ε}) as x → ∞ for any ε > 0.

(iii) ψ(x) − x ≠ O(x^{1/2−ε}) as x → ∞ for any ε > 0.

Proof. We already know that the set {ρ} is nonempty. Moreover, if ρ is a nontrivial zero, then so is 1 − ρ by the functional equation, whence (i) follows. This implies (iii) by our proof of the implication (i) ⇒ (iii) in Theorem 7.5. Finally, that (i) implies (ii) will be a guided exercise that modifies the proof of Theorem 3.27 of the course 'Complex Analysis II'.

Let us finally make a couple of quick observations on the behaviour of M(x) := Σ_{n≤x} μ(n).

Theorem 7.7. We have M(x) ≠ O(x^{1/2−ε}) for any ε > 0. Moreover, the following are equivalent:

(i) M(x) = O(x^{1/2+ε}) for all ε > 0.

(ii) RH is true.

Proof. This time we start from

1/ζ(s) = Σ_{n=1}^∞ μ(n) n^{−s} = s ∫₁^∞ M(x) x^{−s−1} dx.

If (i) is true, this formula shows that 1/ζ(s) extends analytically to the half-plane {σ > 1/2}, which obviously implies the RH. We will postpone the proof that RH implies (i).
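The square-root cancellation conjectured in (i) is plainly visible numerically (a sketch of ours, not part of the notes; the routine computes μ(k) for all k ≤ 10⁶ with a standard linear sieve and records partial sums at a few checkpoints):

```python
def mertens_values(n: int, checkpoints: tuple) -> dict:
    """Partial sums M(x) = sum_{k<=x} mu(k), with mu computed by a linear sieve."""
    mu = [1] * (n + 1)
    is_comp = [False] * (n + 1)
    primes = []
    for k in range(2, n + 1):
        if not is_comp[k]:
            primes.append(k)
            mu[k] = -1
        for p in primes:
            if k * p > n:
                break
            is_comp[k * p] = True
            if k % p == 0:
                mu[k * p] = 0      # k*p is divisible by p^2, so mu vanishes
                break
            mu[k * p] = -mu[k]
    out, M = {}, 0
    for k in range(1, n + 1):
        M += mu[k]
        if k in checkpoints:
            out[k] = M
    return out

M = mertens_values(10**6, (10**4, 10**5, 10**6))
print(M)
```

One finds M(10⁴) = −23, M(10⁵) = −48 and M(10⁶) = 212: minuscule compared with the corresponding √x ≈ 100, 316, 1000.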

7.3 Lindelöf hypothesis (LH)

We shall next consider the growth of the Riemann zeta function as t → ∞ inside the critical strip. For purposes of this kind E. Lindelöf devised a natural quantity, the indicator function.

Definition 7.8. Let f be an analytic function on the half strip S = (a, b) × (0, ∞). For any σ ∈ (a, b) we set

μ_f(σ) := inf{A ≥ 0 : |f(σ + it)| ≤ C_A t^A for t ≥ 1}

(recall that the infimum of an empty set is ∞).¹

Theorem 7.9. (Lindelöf) Assume in the situation of Definition 7.8 that there is Ã ∈ ℝ such that |f(σ + it)| ≤ C t^{Ã} for t ≥ 1 and all σ ∈ (a, b). Then the indicator function μ_f is convex on (a, b).

Proof. By replacing (a, b) by a subinterval if needed, and by a linear change of coordinates, it is enough to consider the case where f is continuous in the strip S₁ := [0, 1] × [1, ∞), analytic in the interior, bounded by C t^{Ã} in S₁, and satisfies |f(it)| ≤ C₀ t^{A₀} and |f(1 + it)| ≤ C₁ t^{A₁} for t ≥ 1, where A₀, A₁ ≥ 0. We should now show that this implies

|f(σ + it)| ≤ C̃ t^{(1−σ)A₀ + σA₁} for t ≥ 1.  (7.11)

To that end choose δ > 0 and consider the auxiliary function

g(s) := f(s) exp(−((1 − s)A₀ + sA₁) log(s/i) + iδs).

One verifies (Stirling again!) that for any δ > 0 the function g is bounded by a constant depending only on C, C₀, C₁, A₀, A₁ (not on δ) on the boundary of the rectangle (0, 1) × (1, T) as soon as T is large enough, depending on δ > 0. By the maximum principle the same bound holds inside the rectangle, and since T may be taken arbitrarily large, actually inside the whole strip S₁. Especially we may let δ → 0⁺ in the estimate. In the resulting estimate we express f in terms of g and check that (7.11) follows. Details are an exercise.

¹In our definition A ≥ 0, but this condition could be dropped.

Lemma 7.10. Write the functional equation in the form ζ(s) = χ(s)ζ(1 − s), where

χ(s) := 2(2π)^{s−1} Γ(1 − s) sin(πs/2).

Then we have, uniformly for σ in any bounded interval,

|χ(s)| ∼ (|t|/2π)^{1/2−σ} as |t| → ∞.

Proof. This is an easy application of Stirling's formula.
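Lemma 7.10 can be checked numerically without a complex Γ-function: by the functional equation χ(s) = ζ(s)/ζ(1 − s), so |χ| is computable from ζ alone. Below ζ is evaluated by a truncated Euler–Maclaurin formula (our own sketch, not from the notes; N and t are chosen so that the truncation error is negligible):

```python
import math

def zeta_em(s: complex, N: int = 200) -> complex:
    """Truncated Euler-Maclaurin approximation of zeta(s), good for modest |s|."""
    tail = N**(1 - s) / (s - 1) + 0.5 * N**(-s) + s * N**(-s - 1) / 12
    return sum(n**(-s) for n in range(1, N)) + tail

sigma, t = 0.75, 100.0
s = complex(sigma, t)
# |chi(s)| = |zeta(s)| / |zeta(1 - s)| should be close to (t/2pi)^(1/2 - sigma)
ratio = abs(zeta_em(s)) / abs(zeta_em(1 - s))
print(ratio, (t / (2 * math.pi)) ** (0.5 - sigma))
```

At t = 100 the two quantities agree to within the O(1/t) accuracy of the asymptotic relation.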

Theorem 7.11. (i) For any fixed σ ∈ [0, 1] and ε > 0 we have as |t| → ∞

|ζ(σ + it)| = O(|t|(1−σ)/2+ε).

(ii) Especially, |ζ(1/2 + it)| = O(|t|^{1/4+ε}). For fixed ε > 0 the constant in the O-statement may be chosen the same for all σ ∈ [0, 1].

Proof. By symmetry we need only consider positive values of t. According to Lemma 6.13 we have ζ(σ + it) = O(t) for σ ∈ (1/2, 2], say. Then the functional equation together with Lemma 7.10 implies that for some finite a we have ζ(σ + it) = O(t^a) for σ ∈ [−1, 2]. By Theorem 7.9 the indicator function of ζ is bounded from above, and hence convex on [0, 1]. Theorem 6.30 verifies that μ_ζ(1) = 0 (also in the direction of negative t). Using again Lemma 7.10 together with the functional equation, this implies that μ_ζ(0) = 1/2. The convexity of μ_ζ due to Theorem 7.9 now states for σ ∈ (0, 1) that

μ_ζ(σ) ≤ (1 − σ)μ_ζ(0) + σμ_ζ(1) = (1 − σ)·(1/2) + σ·0 = (1 − σ)/2,

and part (i) of the theorem follows. Statement (ii) is just a restatement of the most interesting special case σ = 1/2. Finally, the statement of uniformity follows directly from the proof of Theorem 7.9, see especially (7.11).

After Lindelöf proved the previous theorem, observing the convexity of the indicator function and the fact that μ_ζ(σ) = 1/2 − σ for all σ ≤ 0, he conjectured that one has

μ_ζ(σ) = max(0, 1/2 − σ) for all σ ∈ ℝ.

This is usually stated in the form:

Definition 7.12. The Lindelöf hypothesis (LH) states that μ_ζ(1/2) = 0. Equivalently,

|ζ(1/2 + it)| = O(t^ε) as t → ∞ for any ε > 0.

A lot of brainpower has been devoted to proving the LH or improving the above bound for μ_ζ(1/2). More than 100 years ago Hardy and Littlewood showed with a not too difficult proof that μ_ζ(1/2) ≤ 1/6. Currently the best known bound, due to Bourgain some 100 years later, states that μ_ζ(1/2) ≤ 13/84. Here one might observe that

1/6 − 13/84 = 1/84 !!

We next turn to proving that the RH implies the LH. To that end we shall employ another interpolation result (the previous one was the convexity of the indicator function):

Theorem 7.13. (Hadamard's 3 circles theorem) Let R₁ > R₀ > 0 and let A be the annulus A := B(0, R₁) \ B(0, R₀). Assume that f : A̅ → ℂ is continuous and analytic inside A. Denote for r ∈ [R₀, R₁] the maximum of |f| on the circle {|z| = r} by M(r) := max{|f(z)| : |z| = r}. Then we have for any r ∈ (R₀, R₁)

log M(r) ≤ ((log R₁ − log r)/(log R₁ − log R₀)) log M(R₀) + ((log r − log R₀)/(log R₁ − log R₀)) log M(R₁).

Proof. One may observe that the theorem states that log M is a convex function of log r. Towards the proof, we first note that one may assume R₀ = 1, since the general case may be reduced to this by considering f̃(z) := f(R₀z). Fix arbitrary integers n ∈ ℤ and m ≥ 1 and apply the maximum principle to the analytic function g(z) := (f(z))^m z^n in the annulus A. By looking at the boundary values it follows that (recall that now R₀ = 1)

|f(z)|^m |z|^n = |g(z)| ≤ max(M(1)^m, M(R₁)^m R₁^n),  1 ≤ |z| ≤ R₁.

Dividing by |z|^n and taking m:th roots on both sides we obtain

|f(z)| ≤ |z|^{−n/m} max(M(1), M(R₁) R₁^{n/m}).

Choosing a suitable sequence of rational approximations n/m to the real number

−log(M(R₁)/M(1))/log R₁,

we get in the limit

|f(z)| ≤ |z|^{(log M(R₁) − log M(1))/log R₁} M(1).

Denoting |z| = r and u := log r/log R₁, it follows that

log M(r) ≤ u(log M(R₁) − log M(1)) + log M(1) = (1 − u) log M(1) + u log M(R₁),

which was to be proven.
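A quick numerical check of the three circles inequality (an illustration of ours; f(z) = exp(z²) is entire, hence analytic on any annulus, and here M(r) = e^{r²} exactly):

```python
import cmath, math

def circle_max(f, r: float, samples: int = 2000) -> float:
    """Approximate M(r) = max |f| over the circle |z| = r by dense sampling."""
    return max(abs(f(r * cmath.exp(2j * math.pi * k / samples)))
               for k in range(samples))

f = lambda z: cmath.exp(z * z)
R0, R1, r = 1.0, 4.0, 2.0
lam = (math.log(r) - math.log(R0)) / (math.log(R1) - math.log(R0))
lhs = math.log(circle_max(f, r))                       # log M(r) = r^2 = 4
rhs = (1 - lam) * math.log(circle_max(f, R0)) + lam * math.log(circle_max(f, R1))
print(lhs, rhs)   # convexity: lhs <= rhs
```

With R₀ = 1, R₁ = 4, r = 2 one gets log M(r) = 4 against the convex bound (1/2)·1 + (1/2)·16 = 8.5.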

Lemma 7.14. Assume the RH. Then, for any fixed δ > 0, we have the estimate

log ζ(s) = O(log t) for σ ≥ 1/2 + δ as t → ∞.

Here the constant in the O-term depends only on δ > 0.

Proof. Note that assuming the RH, the function g(s) := log ζ(s) extends to an analytic function in the domain {σ > 1/2} \ (1/2, 1], and the branch is fixed e.g. by requiring that g be real on the real axis. By a by now standard argument that uses the Euler product we have for any a > 1 that

g(s) = O(1) in {σ ≥ a},  (7.12)

where the implied constant may depend on a. On the other hand, we know by Lemma 6.13(ii) that

ζ(s) = O(|s|) in {σ ≥ 1/2, t ≥ 10}. (7.13)

Fix t₀ ≥ 10 and σ₀ ∈ [1/2, 2]. Note that (7.13) implies that in the disc B(2 + it₀, 3/2) it holds that

Re g(s) ≤ c₀ + log t₀.

The Borel-Carathéodory lemma (Lemma 6.3) allows us to deduce that in the slightly smaller disc B(2 + it₀, 3/2 − δ) we then have

|g(s) − i Im g(2 + it₀)| ≤ c(δ)(c₀ + log t₀).

Since Im g(2 + it₀) has a bound independent of δ due to (7.12), and as σ₀ + it₀ ∈ B(2 + it₀, 3/2 − δ) whenever σ₀ ≥ 1/2 + δ, we finally deduce that

|g(σ₀ + it₀)| = O(log t₀).

Here we assumed that σ₀ ≤ 2, but if σ₀ > 2 there is even a bound of the form O(1) according to (7.12). We are now ready for:

Theorem 7.15. (i) RH implies LH.

(ii) RH also implies that μ_{1/ζ}(σ) = 0 for all σ > 1/2.

Proof. In order to prove that the LH holds, it is enough to fix an arbitrary σ₀ ∈ (1/2, 1) and show that

μ_ζ(σ₀) = 0,  (7.14)

since then convexity of μ_ζ, applied on the interval [0, σ₀], implies that

μ_ζ(1/2) ≤ ((σ₀ − 1/2)/σ₀) μ_ζ(0) + ((1/2)/σ₀) μ_ζ(σ₀) = (1/2) · (σ₀ − 1/2)/σ₀.

This can be made arbitrarily small by choosing σ₀ close to 1/2, and we deduce μ_ζ(1/2) = 0, i.e. the LH. Assume then that the RH is true. Then the function g(s) := log ζ(s) is analytic in the quadrant {σ > 1/2, t > 0} (we take the branch that has real values as one approaches the real axis in the range σ > 1). Fix σ₀ := 1/2 + 2δ. Consider an arbitrary t₀ ≥ 100, say.

Our aim is to estimate ζ(σ₀ + it₀) by applying the 3-circle theorem to g in the annulus

A := B(t₀ + 1/2 + δ + it₀, t₀) \ B(t₀ + 1/2 + δ + it₀, t₀ − 1).

Thus in our application of the 3-circle theorem the center point is t₀ + 1/2 + δ + it₀, the inner radius is R₀ := t₀ − 1, the outer one R₁ := t₀, and, in order to estimate |ζ(σ₀ + it₀)|, we need an estimate for M(r) with r := t₀ − δ. Thus our next duty is to estimate M(R₀) and M(R₁). In what follows, all the constants in the O-terms may depend on δ (which is fixed throughout). First of all, the inner boundary of the annulus A is contained in the half plane {σ ≥ 1 + δ}, where g = log ζ is certainly uniformly bounded, so we may write

M(R₀) = O(1).  (7.15)

In turn, by Lemma 7.14 and simple geometry we see that

M(R₁) = O(log t₀).  (7.16)

Then, for the above defined radius r, the Hadamard 3 circle theorem yields that

log M(r) ≤ ((log R₁ − log r)/(log R₁ − log R₀)) O(1) + ((log r − log R₀)/(log R₁ − log R₀))(log log t₀ + O(1))
 = ((log r − log R₀)/(log R₁ − log R₀)) log log t₀ + O(1).  (7.17)

Above

(log r − log R₀)/(log R₁ − log R₀) = log((t₀−δ)/(t₀−1)) / log(t₀/(t₀−1)) = log(1 + (1−δ)/(t₀−1)) / log(1 + 1/(t₀−1)) → 1 − δ as t₀ → ∞.  (7.18)

In view of this observation and (7.17) we have for large enough t₀ that log M(r) ≤ (1 − δ/2) log log t₀ + O(1), and especially

|g(σ₀ + it₀)| = O(exp((1 − δ/2) log log t₀)) = O((log t₀)^{1−δ/2}) ≤ ε log t₀  (7.19)

for any ε > 0 and t₀ large enough (depending on ε). This implies that |ζ(σ₀ + it₀)| = O(t₀^ε), and since ε > 0 was arbitrary, we get that μ_ζ(σ₀) = 0, and we have established (i). Exactly in a similar way (7.19) implies (ii).

Remark 7.16. It is clear from the proofs that we actually have, for any δ > 0, a uniform bound:

max(|ζ(s)|, |ζ(s)|^{−1}) = O(|s|^ε) as |s| → ∞ in {σ ≥ 1/2 + δ}, for any ε > 0.

We are now able to complete the proof of Theorem 7.7.

Proof of (ii) ⇒ (i) in Theorem 7.7. Assume that x ≥ 10 is a half-integer (i.e. x − 1/2 ∈ ℕ). Applying again Perron's effective formula (e.g. in the form of Thm 3.24) and the identity 1/ζ(s) = Σ_{n=1}^∞ μ(n)/n^s, we may write

M(x) = Σ_{n≤x} μ(n) = (1/2πi) ∫_{c−iT}^{c+iT} (x^s/(s ζ(s))) ds + O(T^{−1} x^c ζ(c) + 1 + T^{−1} x log x).  (7.20)

We choose c = 1 + 1/log x and T = √x. Then the error term above is of order √x log x, i.e. not more than the growth that we want to prove for the left hand side. Thus it is enough to get a good estimate for the integral. Under the RH the integrand is analytic in {σ > 1/2}, so we choose a small δ > 0 and use Cauchy's theorem to replace the integral by the sum of the integrals over the three paths γ₁ := (1/2 + δ − i√x, 1/2 + δ + i√x) and γ± := (1/2 + δ ± i√x, c ± i√x). Clearly by Theorem 7.15(ii) we have

∫_{γ₁} = O(x^{1/2+δ} ∫_{−√x}^{√x} (1 + |u|)^δ (1 + |u|)^{−1} du) = O(x^{1/2+3δ/2}),

and we may estimate the remaining integrals as

∫_{γ±} = O(x^c · x^{−1/2} · x^δ),

where the factor x^δ estimates ζ^{−1} à la Remark 7.16. Putting everything together, we obtain that M(x) = O(x^{1/2+3δ/2}), and the claim follows since δ > 0 is arbitrary.

7.4 Hardy's theorem on the number of zeroes on the critical line

We learned before that one may get a rather good estimate for the number of non-trivial zeroes ρ = β + iγ satisfying 0 < γ < T. It is considerably harder to get one's hands on the zeroes that lie on the critical line. As the last theorem of the course we prove the 1914 theorem of Hardy, which states that infinitely many of the zeroes actually lie on {σ = 1/2}. For that purpose we need a couple of simple estimates for exponential sums. As a side remark, many of the sharpest estimates in analytic number theory are based on analyzing the size of suitable exponential sums. Recall that e(x) := exp(2πix). In the following lemma a main point is that the result does not depend at all on a, b!

Lemma 7.17. Assume that f : [a, b] → ℝ is C² so that the derivative f′ is monotone and bounded away from zero (especially it does not change sign). Set

0 < m := min_{a≤x≤b} |f′(x)|.

Then

|∫_a^b e(f(x)) dx| ≤ 2/(πm).

Proof. We may assume that f′ is positive and non-increasing (the remaining cases follow by replacing f(x) by −f(x) or by f(a + b − x)), and compute

∫_a^b e(f(x)) dx = (1/2πi) ∫_a^b (1/f′(x)) d(e(f(x)))
 = (1/2πi) [e(f(x))/f′(x)]_a^b − (1/2πi) ∫_a^b e(f(x)) d(1/f′(x)).

Since 1/f′ is monotone and bounded by m^{−1}, its total variation on [a, b] is at most m^{−1}, and we obtain

|∫_a^b e(f(x)) dx| ≤ (1/2π)(m^{−1} + m^{−1}) + (1/2π) ∫_a^b |d(1/f′(x))| ≤ 3/(2πm) ≤ 2/(πm).
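Lemma 7.17 is easy to test numerically (a sketch of ours, not part of the notes; here f(x) = T₀ log x has f′(x) = T₀/x monotone and nonvanishing, so m = T₀/b on [a, b]):

```python
import cmath, math

def osc_integral(f, a: float, b: float, n: int = 200_000) -> complex:
    """Midpoint-rule approximation of int_a^b e(f(x)) dx, where e(u) = exp(2*pi*i*u)."""
    h = (b - a) / n
    return h * sum(cmath.exp(2j * math.pi * f(a + (k + 0.5) * h)) for k in range(n))

T0, a, b = 50.0, 1.0, 2.0
m = T0 / b                                  # min of |f'| on [a, b]
val = abs(osc_integral(lambda x: T0 * math.log(x), a, b))
print(val, 2 / (math.pi * m))               # the integral obeys the bound 2/(pi*m)
```

The oscillatory integral is of size about 0.0085, comfortably below the bound 2/(πm) ≈ 0.025, even though the integrand has absolute value 1 on an interval of length 1.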

The previous lemma alone is often useful, but the result we actually need is the following:

Theorem 7.18. Assume that f : [a, b] → ℝ is C² so that the second derivative f″ does not change sign and is bounded away from zero. Set

0 < M := min_{a≤x≤b} |f″(x)|.

Then

|∫_a^b e(f(x)) dx| ≤ 4 √(2/(πM)).

Proof. We may well assume that f″(x) ≤ −M on [a, b]. Now f′ is strictly decreasing and it may have at most one zero on the interval. Assume first that f′(c) = 0 for some a < c < b, pick a small δ > 0 and divide the integral into 3 parts

I := ∫_a^b e(f(x)) dx = ∫_a^{c−δ} + ∫_{c−δ}^{c+δ} + ∫_{c+δ}^b =: I₁ + I₂ + I₃.

If |x − c| ≥ δ, we have |f′(x)| = |∫_c^x f″(y) dy| ≥ Mδ. Hence we may apply Lemma 7.17 to the integrals I₁ and I₃ with the outcome

|I₁| + |I₃| ≤ 4/(πMδ).

On the other hand we obviously have |I₂| ≤ 2δ, so that

|I| ≤ 4/(πMδ) + 2δ.

The claim now follows by minimising this expression over δ, i.e. choosing δ = √(2/(πM)). Finally, it may happen that f′ has no zero on [a, b], or that δ > min(c − a, b − c) for the δ we chose above, but in these cases the proof is actually easier.

In preparation for Hardy's theorem we introduce the functions χ and Z. First of all, recall the meromorphic auxiliary function χ from Lemma 7.10, defined by the formula

χ(s) = 2(2π)^{s−1} Γ(1 − s) sin(πs/2),

and set for t ∈ ℝ

Z(t) := χ(1/2 + it)^{−1/2} ζ(1/2 + it).

Here the branch of the square root is chosen to be real for t = 0. The function Z is clearly analytic at least in a neighbourhood of the real axis (χ does not vanish on the critical line). Here is some more precise information:

Lemma 7.19. (i) ζ(s) = χ(s)ζ(1 − s).

(ii) χ(s) is analytic and nonzero in {0 < σ < 2, t > 0}.

(iii) |χ(1/2 + it)| = 1 for t ∈ ℝ.

(iv) Z(t) is continuous and real-valued on the real axis.

Proof. (i) is just a restatement of the functional equation. For (ii) we note that the only possible zero in this domain produced by the sin-factor is killed by the only potential pole of the Γ-factor. By first writing (i), then replacing s by 1 − s, and multiplying sidewise, we obtain

χ(s)χ(1 − s) = 1.

Especially, letting t ∈ ℝ yields χ(1/2 + it)χ(1/2 − it) = 1, and since χ is real on the real axis, we have χ(1/2 − it) = χ(1/2 + it)‾. In other words, |χ(1/2 + it)|² = 1, which proves (iii). Finally, Z is continuous on the real axis by analyticity, and letting t be real, statement (iv) follows from the computation (bars denoting complex conjugation)

Z(t)‾ = (χ(1/2 + it)^{−1/2})‾ · ζ(1/2 + it)‾ = χ(1/2 + it)^{1/2} ζ(1/2 − it)
 = χ(1/2 + it)^{1/2} · ζ(1/2 + it)/χ(1/2 + it) = χ(1/2 + it)^{−1/2} ζ(1/2 + it) = Z(t),

where we used (iii) in the second equality and (i) in the third.

The key estimate is the following:

Lemma 7.20. Denote I₁ := ∫_T^{2T} Z(t) dt and I₂ := ∫_T^{2T} |Z(t)| dt. Then

(i) I₁ = O(T^{7/8}),

(ii) I₂ ≥ c₀T for large enough T, where c₀ > 0 is a constant.

Proof. We first estimate I₁ by applying Cauchy's theorem on the rectangular domain with the sides

L_r := (5/4 + iT, 5/4 + 2iT), L₊ := (5/4 + 2iT, 1/2 + 2iT), L_ℓ := (1/2 + 2iT, 1/2 + iT), and L₋ := (1/2 + iT, 5/4 + iT). The application of Cauchy's theorem is legitimate, since Lemma 7.19(ii) implies that Z continues analytically over the above rectangle. The integrals of Z over the different sides are denoted I_r, I_ℓ, I±. We note that I₁ = −iI_ℓ, and hence by the Cauchy integral theorem

I₁ = −iI_ℓ = i(I_r + I₊ + I₋).  (7.21)

We start by estimating the integrals I±. By Lemma 7.10 and Theorem 7.11 we have

(χ(s))^{−1} = O(|t|^{σ−1/2}) and ζ(σ + it) = O(t^{ε + max((1−σ)/2, 0)}), σ ∈ [1/2, 2], as t → ∞.

It follows that

Z(σ + it) = O(t^{3/8+ε}), σ ∈ [1/2, 5/4].

This especially implies that

I± = O(T^{1/2}).  (7.22)

In order to estimate I_r we first note that, by using the complex Stirling formula, a somewhat tedious but straightforward computation (EXERCISE!) yields that

χ(5/4 + it)^{−1/2} = c (t/2π)^{3/8 + it/2} e^{−it/2} (1 + O(1/t)).  (7.23)

In absolute value the part above containing the O(1/t) is of size O(T^{−5/8}) on our path of integration. Since ζ is bounded on the integration path independently of T, the corresponding contribution to the integral I_r is of order T × O(T^{−5/8}) = O(T^{3/8}). To estimate the contribution of the main term we simply replace ζ by its Dirichlet series (which converges absolutely and uniformly on the range of integration), and obtain the sum

c Σ_{n=1}^∞ n^{−5/4} ∫_T^{2T} (t/2π)^{3/8} e(Fₙ(t)) dt,  (7.24)

where

Fₙ(t) := (t/4π) log(t/2π) − t/(4π) − (t/2π) log n.

In order to estimate this exponential integral we note first that

Fₙ″(t) = 1/(4πt) ≥ (8πT)^{−1} for t ∈ [T, 2T],

so that Theorem 7.18 verifies that (independently of n!)

G(u) := ∫_T^u e(Fₙ(t)) dt = O(T^{1/2}) for all u ∈ [T, 2T].

We thus obtain by partial integration

∫_T^{2T} (t/2π)^{3/8} e(Fₙ(t)) dt = [G(t)(t/2π)^{3/8}]_T^{2T} − (3/(16π)) ∫_T^{2T} G(t)(t/2π)^{−5/8} dt
 = O(T^{1/2} T^{3/8}) + O(T · T^{−5/8} T^{1/2}) = O(T^{7/8}).

Since this is uniform in n, we may substitute it in the sum (7.24) and obtain the result

I_r = O(T^{7/8}).  (7.25)

In combination with (7.22) this completes the proof of (i). The proof of (ii) is much easier and uses the observation that, due to Lemma 7.19, we have

I₂ = ∫_T^{2T} |Z(t)| dt = ∫_T^{2T} |ζ(1/2 + it)| dt ≥ |∫_T^{2T} ζ(1/2 + it) dt|,

and we hence only need to estimate from below the size of ∫_T^{2T} ζ(1/2 + it) dt. To that end we apply the Cauchy theorem over the rectangle with vertices 1/2 + iT, 2 + iT, 2 + 2iT, 1/2 + 2iT. The integrals over the horizontal sides are O(T^{1/2+ε}) by Theorem 7.11. The integral over the rightmost vertical side can be computed by using the uniform convergence of the Dirichlet series of ζ:

∫_{2+iT}^{2+2iT} ζ(s) ds = iT + i Σ_{n=2}^∞ n^{−2} ∫_T^{2T} n^{−it} dt
 = iT + Σ_{n=2}^∞ (1/(n² log n))(n^{−iT} − n^{−2iT})
 = iT + O(1).

Hence by Cauchy's theorem |∫_T^{2T} ζ(1/2 + it) dt| ≥ T − O(T^{1/2+ε}), so I₂ grows at least at the rate T, which proves (ii).

It is customary to denote

N0(T ) := #{ρ on the critical line with Im ρ ∈ (0,T )}.

Theorem 7.21. (Hardy). There is a positive constant c so that N0(T ) ≥ c log T for large T . Especially, there are infinitely many zeroes of the Riemann zeta function on the critical line.

Proof. We apply the following very simple observation: Lemma 7.20 implies that for large enough integers k the continuous and real-valued function Z satisfies

∫_{2^k}^{2^{k+1}} |Z(t)| dt > |∫_{2^k}^{2^{k+1}} Z(t) dt|,

since by the lemma (applied with T = 2^k) the left hand side is at least c₀2^k while the right hand side is O(2^{7k/8}). But this means that Z changes sign, and a fortiori ζ has to have a zero, on (2^k, 2^{k+1}). Since this holds for each large k, the statement on N₀ follows.
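Hardy's sign-change mechanism can be watched in action numerically (a sketch of ours, not part of the notes; ζ(1/2 + it) is evaluated by a truncated Euler–Maclaurin formula and the phase of χ(1/2 + it)^{−1/2} by the standard Stirling expansion of the Riemann–Siegel angle, both amply accurate for 10 ≤ t ≤ 30). Each sign change of Z certifies a zero of ζ on the critical line:

```python
import cmath, math

def zeta_half(t: float, N: int = 60) -> complex:
    """Euler-Maclaurin approximation of zeta(1/2 + it), accurate for modest t."""
    s = 0.5 + 1j * t
    tail = N**(1 - s) / (s - 1) + 0.5 * N**(-s) + s * N**(-s - 1) / 12
    return sum(n**(-s) for n in range(1, N)) + tail

def theta(t: float) -> float:
    """Riemann-Siegel angle via Stirling; the argument of chi(1/2+it)^(-1/2)."""
    return t / 2 * math.log(t / (2 * math.pi)) - t / 2 - math.pi / 8 + 1 / (48 * t)

def Z(t: float) -> float:
    """Hardy's function Z: real-valued with |Z(t)| = |zeta(1/2 + it)|."""
    return (cmath.exp(1j * theta(t)) * zeta_half(t)).real

ts = [10 + 0.1 * k for k in range(201)]          # grid on [10, 30]
sign_changes = sum(1 for u, v in zip(ts, ts[1:]) if Z(u) * Z(v) < 0)
print(sign_changes)
```

On [10, 30] one finds exactly three sign changes, matching the first three zeros of ζ on the critical line, near t ≈ 14.13, 21.02 and 25.01.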

Remark 7.22. Actually, Hardy originally just showed that N₀(T) is not bounded. Later, in 1922, he and Littlewood jointly produced the quantitative bound

N0(T ) ≥ c1T for T ≥ T0.

In 1942 Selberg showed that a certain uniform percentage of the zeroes actually lies on the critical line by showing that N₀(T) ≥ c₂ T log T. In other words,

N0(T ) ≥ αN(T ) for T ≥ T0 for some positive α. Selberg’s proof gives a rather poor value of α, but this was much improved by Levinson in 1974, who showed that one may take α = 1/3. Later Conrey verified that α = 2/5 also works.

– THE END –
