
ANALYTIC NUMBER THEORY NOTES

AARON LANDESMAN

1. INTRODUCTION

Kannan Soundararajan taught a course (Math 249A) on Analytic Number Theory at Stanford in Fall 2017. These are my “live-TeXed” notes from the course. Conventions are as follows: each lecture gets its own “chapter,” and appears in the table of contents with the date.

Of course, these notes are not a faithful representation of the course, either in the mathematics itself or in the quotes, jokes, and philosophical musings; in particular, the errors are my fault. By the same token, any virtues in the notes are to be credited to the lecturer and not the scribe. [1]

Please email suggestions to [email protected]

[1] This introduction has been adapted from Akhil Mathew’s introduction to his notes, with his permission.

2. 9/26/17

2.1. Overview. This will be somewhat of an introductory course in analytic methods, but more like a second introduction. We’ll assume familiarity with the prime number theorem, which connects the count of primes with the zeros of the zeta function. You might look at roughly the first half of Davenport’s book as a prerequisite. We’ll assume the students know how to prove there are infinitely many primes in arithmetic progressions.

To get started, we’ll spend the first four or five lectures proving Vinogradov’s three primes theorem:

Theorem 2.1 (Vinogradov). Every large odd number is the sum of three primes.

When we say “large,” one can actually compute the bound explicitly (i.e., it is effective).

Remark 2.2. Helfgott, a few years back, made the bound accessible so that one could compute exactly which odd numbers are not expressible as the sum of three primes. He showed something like: all odd numbers more than 7 can be written as the sum of three primes.

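To start the proof, write $N = p_1 + p_2 + p_3$, and we’ll count the number of ways to write $N$ as such. In fact, we’ll consider
\[ \sum_{N = n_1 + n_2 + n_3} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3), \]
where
\[ \Lambda(n) := \begin{cases} \log p & \text{if } n = p^k, \\ 0 & \text{else.} \end{cases} \]
We define
\[ \Psi(x) := \sum_{n \le x} \Lambda(n) = x + O\left(x e^{-c\sqrt{\log x}}\right). \]
This is equivalent to saying
\[ \pi(x) = \mathrm{li}(x) + o\left(x e^{-c\sqrt{\log x}}\right). \]
Here
\[ \mathrm{li}(x) = \int_2^x \frac{dt}{\log t}, \]
and $\mathrm{li}(x)$ is about $x/\log x$.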
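To get a feel for $\Lambda$ and $\Psi$, here is a minimal numerical sketch. The helper names `von_mangoldt_table` and `chebyshev_psi` are illustrative choices, not notation from the course; the sieve is just one convenient way to tabulate $\Lambda(n)$.

```python
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        # p is prime: cross out its multiples ...
        for multiple in range(p * p, limit + 1, p):
            is_composite[multiple] = True
        # ... and record Lambda(p^k) = log p for every power p^k <= limit.
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def chebyshev_psi(x):
    """Psi(x): the sum of Lambda(n) over n <= x (x a positive integer)."""
    return sum(von_mangoldt_table(x))


if __name__ == "__main__":
    for x in (10**3, 10**4, 10**5):
        # The ratio Psi(x)/x should be close to 1.
        print(x, chebyshev_psi(x) / x)
```

The printed ratios give a quick sanity check of $\Psi(x) \sim x$ in a small range; they say nothing about the shape of the error term.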

2.2. Heuristic of proof. A first guess is that there are about $\pi(N)$ choices for each of $p_1, p_2, p_3$. Their sum must add up to a given number $N$. The chance that $p_1 + p_2 + p_3$ is exactly $N$ is roughly $1/N$. Hence, the number of such ways is approximately
\[ \left(\frac{N}{\log N}\right)^3 \frac{1}{N} = \frac{N^2}{(\log N)^3}. \]
We can also estimate
\[ R_3(N) := \sum_{N = n_1+n_2+n_3} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3) \sim N^3/N = N^2 + O(N^{1+1/2} + \log N), \]
where the error comes from contributions of powers of primes.

2.3. Hardy and Littlewood’s circle method. Let
\[ S(\alpha) := \sum_{n \le N} \Lambda(n) e(n\alpha), \]
where $e(x) = e^{2\pi i x}$. Then,
\[ \int_0^1 S(\alpha)^3 e(-N\alpha)\, d\alpha = \int_0^1 \sum_{n_1,n_2,n_3 \le N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3)\, e\big((n_1+n_2+n_3-N)\alpha\big)\, d\alpha = R_3(N). \]
To bound this, note that $S(0) = \Psi(N) \sim N$. We’d like to bound it by about $N^2$. Also, $S(1/2)$ is pretty big because
\[ S(1/2) = (\Lambda(2) + \Lambda(4) + \cdots) - \sum_{n \le N,\ n \text{ odd}} \Lambda(n), \]
since $e(n/2) = -1$ if $n$ is odd. Then, for all
\[ |\alpha| \le \frac{10^{-6}}{N}, \]
we have $|S(\alpha)| \ge .99N$. Then,
\[ \int_{|\alpha| \le 10^{-6}/N} S(\alpha)^3 e(-N\alpha)\, d\alpha \gtrsim 10^{-6} N^2. \]
We could similarly make an argument in a small neighborhood of $1/2$. This gives an analytic reason that the number of representations might be on the scale of $N^2$. So there are portions of the integral which give the correct answer.
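The identity $\int_0^1 S(\alpha)^3 e(-N\alpha)\,d\alpha = R_3(N)$ can be checked numerically for small $N$. The sketch below makes one substitution that is not in the lecture: the integral is replaced by an average over the $K = 3N$ points $\alpha = j/K$. This average is exact, because the integrand is a trigonometric polynomial whose frequencies $n_1 + n_2 + n_3 - N$ all have absolute value less than $K$. All function names are illustrative.

```python
import cmath
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            is_composite[multiple] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def r3_direct(N, lam):
    """R_3(N) as a triple sum over n1 + n2 + n3 = N with all n_i >= 1."""
    total = 0.0
    for n1 in range(1, N):
        for n2 in range(1, N - n1):
            total += lam[n1] * lam[n2] * lam[N - n1 - n2]
    return total


def r3_circle(N, lam):
    """R_3(N) from S(alpha)^3 e(-N alpha), averaged over alpha = j/K."""
    K = 3 * N  # assumption: any K > 2N makes the discrete average exact
    total = 0j
    for j in range(K):
        alpha = j / K
        S = sum(lam[n] * e(n * alpha) for n in range(1, N + 1))
        total += S**3 * e(-N * alpha)
    return (total / K).real


if __name__ == "__main__":
    N = 61
    lam = von_mangoldt_table(N)
    print(r3_direct(N, lam), r3_circle(N, lam))
```

The two printed numbers should agree up to floating-point error, which is just orthogonality of the exponentials $e(m\alpha)$.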

Exercise 2.3 (Waring’s problem). We want to know whether we can write $N = x_1^k + x_2^k + \cdots + x_s^k$ (i.e., as a sum of four squares or nine cubes, etc.)
(1) First, find a probabilistic guess for the number of such representations.
(2) Use the circle method
\[ \int_0^1 \Big( \sum_{n \le N^{1/k}} e(n^k \alpha) \Big)^s e(-N\alpha)\, d\alpha. \]
Then, find portions of the integrand that correspond to the right probabilistic answer.

Returning to our integral for three primes, let’s think about when $S(\alpha)$ is big.

Example 2.4. Let’s try $\alpha = 1/3$, that is,
\[ \sum_{n \le N} \Lambda(n) e(n/3). \]
We have a contribution from powers of 3 which is about $\log N$, so
\[ \sum_{n \le N} \Lambda(n) e(n/3) = O(\log N) + e(1/3)\Psi(N; 3, 1) + e(2/3)\Psi(N; 3, 2) \sim -N/2, \]
since $e(1/3) + e(2/3) = -1$ and
\[ \Psi(N; 3, 1) \sim \frac{N}{\phi(3)} = \frac{N}{2}, \]
where $\Psi(N; a, b)$ counts, with weight $\Lambda$, the prime powers up to $N$ which are $b$ mod $a$.

Remark 2.5. Note that $S(a/q)$ approximately measures the distribution of primes in progressions mod $q$ with $(a, q) = 1$. Sometimes when $q$ is small, since we’re only counting primes coprime to $q$, we will get an answer substantially away from 0. We’ll later need to think through the uniformity in $q$ in terms of $N$.
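As a rough numerical illustration of Example 2.4, the following sketch evaluates $S(\alpha)/N$ at $\alpha = 0, 1/2, 1/3$. The function name `prime_exponential_sum` and the choice $N = 20000$ are assumptions made for this illustration only.

```python
import cmath
import math


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            is_composite[multiple] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def prime_exponential_sum(alpha, N, lam):
    """S(alpha) = sum over n <= N of Lambda(n) e(n alpha)."""
    # Only prime powers carry a nonzero weight, so skip everything else.
    return sum(lam[n] * e(n * alpha) for n in range(1, N + 1) if lam[n] > 0)


if __name__ == "__main__":
    N = 20000
    lam = von_mangoldt_table(N)
    for label, alpha in (("0", 0.0), ("1/2", 0.5), ("1/3", 1 / 3)):
        print(label, prime_exponential_sum(alpha, N, lam) / N)
```

One should see values near $1$, $-1$ and $-1/2$ respectively, matching $S(0) \sim N$, the computation of $S(1/2)$, and the example above.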

Remark 2.6 (Insight). $S(\alpha)$ is big near most rational numbers with small denominators. It’s not big near $1/4$, so we’re only saying it’s big near certain ones. This might have something to do with whether the denominator of the rational number is square free: if it is not square free, you essentially get translates over roots of unity of that prime whose square divides $q$, and things cancel out.

Exercise 2.7. For $(a, q) = 1$, show
\[ \sum^{*}_{k \bmod q} e\left(\frac{ak}{q}\right) = \mu(q), \]
where the $*$ means $(k, q) = 1$ and $\mu$ is the Möbius function.

Goal 2.8. If $\alpha$ is not near a rational number with small denominator then $|S(\alpha)|$ is small.

To accomplish this, Hardy and Littlewood decided to split $[0, 1]$ into two parts: major arcs $\mathfrak{M}$ and minor arcs $\mathfrak{m}$. The major arcs are the $\alpha$ close to $a/q$ for $q$ small, and the minor arcs are the rest. The major arcs have small measure, while the minor arcs have nearly full measure. So, there is a very small set on which the generating function is big. There is also a big set on which the generating function is small.

There is a trivial bound
\[ |S(\alpha)| \le \Psi(N) \sim N, \]
using the triangle inequality. One can also work out

Lemma 2.9. We have
\[ \int_0^1 |S(\alpha)|^2\, d\alpha \sim N \log N. \]

Proof.
\[ \int_0^1 |S(\alpha)|^2\, d\alpha = \sum_{n_1, n_2 \le N} \Lambda(n_1)\Lambda(n_2) \int_0^1 e\big((n_1 - n_2)\alpha\big)\, d\alpha = \sum_{n \le N} \Lambda(n)^2 = \sum_{n \le N} (\log n)\Lambda(n) + O(\sqrt{N} \log N) \sim N \log N. \]

Exercise 2.10. Verify the above, where the $O(\sqrt{N} \log N)$ difference is coming from prime powers. The idea for the last step is that most numbers less than $N$ are on the order of $N$. One might use partial summation, which is integration by parts. □

Usually,
\[ |S(\alpha)| \sim \sqrt{N \log N}. \]
If $\alpha$ is far from every rational number, such as the golden ratio $\varphi$, we might try to compute $S(\varphi)$. We might expect that $S(\varphi) \ll N^{1/2 + \varepsilon}$. We don’t know whether this is true, but we do know
\[ \sum_{n \le N} \Lambda(n) e(n\varphi) \ll N^{1 - \delta} \]
for some $\delta > 0$ (and it will probably even be a pretty large $\delta$). We will now develop a technique saying that once you are far away from a rational number, you can get this sort of power saving.
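The expected cancellation at the golden ratio can be observed experimentally. The sketch below compares $|S(\varphi)|$ with the trivial size $N$ and with the mean-square size $\sqrt{N \log N}$ suggested by Lemma 2.9. It is only a numerical experiment with illustrative names, and it proves nothing about the true order of $S(\varphi)$.

```python
import cmath
import math

GOLDEN_RATIO = (1 + math.sqrt(5)) / 2


def von_mangoldt_table(limit):
    """Return lam with lam[n] = Lambda(n) for 0 <= n <= limit."""
    lam = [0.0] * (limit + 1)
    is_composite = [False] * (limit + 1)
    for p in range(2, limit + 1):
        if is_composite[p]:
            continue
        for multiple in range(p * p, limit + 1, p):
            is_composite[multiple] = True
        power = p
        while power <= limit:
            lam[power] = math.log(p)
            power *= p
    return lam


def prime_exponential_sum(alpha, N, lam):
    """S(alpha) = sum over n <= N of Lambda(n) e(n alpha)."""
    return sum(
        lam[n] * cmath.exp(2j * math.pi * n * alpha)
        for n in range(1, N + 1)
        if lam[n] > 0
    )


if __name__ == "__main__":
    for N in (2000, 20000):
        lam = von_mangoldt_table(N)
        size = abs(prime_exponential_sum(GOLDEN_RATIO, N, lam))
        # Columns: N, |S(phi)|, sqrt(N log N), trivial bound Psi(N).
        print(N, size, math.sqrt(N * math.log(N)), sum(lam))
```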

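2.4. Strategy for determining asymptotics for $R_3(N)$. We have the integral
\[ \int_0^1 S(\alpha)^3 e(-N\alpha)\, d\alpha = \left( \int_{\mathfrak{M}} + \int_{\mathfrak{m}} \right) S(\alpha)^3 e(-N\alpha)\, d\alpha. \]
We want the second part, over the minor arcs, to be smaller than $N^2$. The idea is that we can bound
\[ \left| \int_{\mathfrak{m}} S(\alpha)^3 e(-N\alpha)\, d\alpha \right| \le \Big( \sup_{\alpha \in \mathfrak{m}} |S(\alpha)| \Big) \int_0^1 |S(\alpha)|^2\, d\alpha \ll (N \log N) \sup_{\alpha \in \mathfrak{m}} |S(\alpha)|. \]
So, it is enough to have
\[ \sup_{\alpha \in \mathfrak{m}} |S(\alpha)| \le \frac{\varepsilon N}{\log N}. \]
This will show the contribution from the minor arcs is less than that of the major arcs, assuming we know the major arcs contribute $N^2$.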

Exercise 2.11 (Roth). For all $\delta > 0$, there is $N_0 = N(\delta)$ so that for all $N \ge N_0$, every subset $\mathcal{A} \subset [1, N]$ with $|\mathcal{A}| \ge \delta N$ has a (nontrivial) three term arithmetic progression. Letting
\[ \mathcal{A}(\alpha) = \sum_{a \in \mathcal{A}} e(a\alpha), \]
we obtain that
\[ \int_0^1 \mathcal{A}(\alpha)^2 \mathcal{A}(-2\alpha)\, d\alpha \]
counts the number of triples $(x, y, z)$ in $\mathcal{A}$ with $x + z = 2y$. This includes $|\mathcal{A}|$ trivial solutions, so we want to see this integral is larger. We might expect $\delta^3 N^2$ solutions. But now, it’s a bit hard to see how to actually bound this integral.

Exercise 2.12 (Vague exercise). If, “away from 0,” $|\mathcal{A}(\alpha)| \le \varepsilon |\mathcal{A}|$, then the contribution of that portion of the integral
\[ \int_0^1 \mathcal{A}(\alpha)^2 \mathcal{A}(-2\alpha)\, d\alpha \]
is bounded by $\varepsilon |\mathcal{A}|^2$. (We’d like to know something like $\varepsilon \le \delta/10^6$.)

The idea is that either we have this bound above, or else we get some additive structure in $\mathcal{A}$ which we exploit to get a bigger density set. There are notes on this on Sound’s web-page from a course he taught on additive combinatorics.

Now, we want to focus on showing that for some definition of the minor arcs, the sum $S(\alpha)$ has a little bit of cancellation.

2.5. Vinogradov’s method. Here is the key idea from Vinogradov’s method. This comes up many times throughout analytic number theory. We’d like to understand the sum
\[ S(\alpha) = \sum_{n \le N} \Lambda(n) e(n\alpha). \]
We could similarly study
\[ \sum_{n \le N} \Lambda(n) e(f(n)), \]
where, say, $f(n)$ is $\sqrt{n}$ or $\sqrt{n} + (\log n)^2$. We could similarly study
\[ \sum_{n \le N} e(\sqrt{n}) \quad \text{or} \quad \sum_{n \le N} e(t \log n) \]
for looking at zeros of the zeta function.

Let’s start with the simplest version of these, where instead of summing over primes and prime powers, we sum over all the integers. Say we want to consider
\[ \sum_{n \le N} e(n\alpha). \]
This is a geometric progression, so it is easy to sum:
\[ \sum_{n \le N} e(n\alpha) = \frac{e(\alpha)\,(1 - e(N\alpha))}{1 - e(\alpha)} = \frac{x}{\sin \pi\alpha}, \]
where $|x|$ is bounded by 2, and the numerator is approximately $\sin \pi N \alpha$.
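Here is a small numerical check of the geometric sum. It uses the explicit constant coming from $|\sin \pi\alpha| \ge 2\|\alpha\|$, where $\|\alpha\|$ is the distance from $\alpha$ to the nearest integer. That constant is a choice made for this sketch, and the function names are illustrative.

```python
import cmath
import math


def dist_to_nearest_integer(x):
    """||x||: the distance from x to the nearest integer."""
    return abs(x - round(x))


def geometric_sum(alpha, N):
    """Sum of e(n alpha) over 1 <= n <= N, computed term by term."""
    return sum(cmath.exp(2j * math.pi * n * alpha) for n in range(1, N + 1))


def geometric_bound(alpha, N):
    """min(N, 1/(2||alpha||)), using |sin(pi alpha)| >= 2||alpha||."""
    d = dist_to_nearest_integer(alpha)
    return N if d == 0 else min(N, 1 / (2 * d))


if __name__ == "__main__":
    N = 500
    for alpha in (0.0001, 0.1, 0.25, 1 / 3, 0.4999, math.sqrt(2)):
        print(alpha, abs(geometric_sum(alpha, N)), geometric_bound(alpha, N))
```

In every row the middle column should not exceed the last one: the sum has size about $N$ only when $\alpha$ is within roughly $1/N$ of an integer.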

Exercise 2.13. Show
\[ \left| \sum_{n \le N} e(n\alpha) \right| \le \min\left( N, \frac{2}{|\sin \pi\alpha|} \right) \ll \min\left( N, \frac{1}{\|\alpha\|} \right), \]
letting $\|\alpha\|$ denote the distance from $\alpha$ to the nearest integer.

Let $\Phi$ be a smooth function. Then,
\[ \sum_n \Phi(n/N) e(n\alpha) \]
is some smooth version of what we are trying to approximate. We might try to use the Poisson summation formula. We can write
\[ \sum_n \Phi(n/N) e(n\alpha) = N \sum_k \hat{\Phi}(N(k + \alpha)) \]
and work out the Poisson summation formula. For $\Phi$ smooth, the Fourier transform is rapidly decreasing.

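3. 9/28/17

Recall last time we had
\[ R_3(N) := \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1)\Lambda(n_2)\Lambda(n_3). \]
The goal was to show this is asymptotically of size $N^2$. We set
\[ S(\alpha) = \sum_{n \le N} \Lambda(n) e(n\alpha), \]
and found
\[ R_3(N) = \int_0^1 S(\alpha)^3 e(-N\alpha)\, d\alpha. \]
The idea is to show that $S(\alpha)$ is large only near rational numbers with small denominators (the major arcs). On the complement (the minor arcs $\mathfrak{m}$), we want to show $|S(\alpha)|$ is small, and then bound
\[ \left| \int_{\mathfrak{m}} S(\alpha)^3 e(-N\alpha)\, d\alpha \right| \le \Big( \sup_{\alpha \in \mathfrak{m}} |S(\alpha)| \Big) \int_0^1 |S(\alpha)|^2\, d\alpha. \]
We then could use Parseval’s identity to bound the last integral by $N \log N$. Toward the end of last time, we found
\[ \sum_{n \le N} e(n\alpha) \ll \min\left( N, \frac{1}{\|\alpha\|} \right). \]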

Exercise 3.1. Count the number of ways of writing $N = n_1 + n_2 + n_3$ asymptotically by writing down the associated integral using the circle method. The generating function will only be big for $\alpha$ near 0. There is only one major arc in this case. The answer should be about $N^2/2$, and the point is to see where the $1/2$ comes from.

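Recall from elementary number theory that $\Lambda(n) = \sum_{ab = n} \mu(a) \log b$. Indeed, look at the Dirichlet series identity
\[ -\frac{\zeta'}{\zeta}(s) = \frac{1}{\zeta(s)} \cdot \big( -\zeta'(s) \big), \]
where the left side has coefficients $\Lambda(n)$, $1/\zeta(s)$ has Dirichlet series with coefficients $\mu$, and $-\zeta'(s)$ has Dirichlet series with coefficients $\log$. So the convolution of $\mu$ and $\log$ is $\Lambda$. Then, by partial summation,
\[ \sum_{n \le N} (\log n)\, e(n\alpha) = \int_{1^-}^{N} \log t \; d\Big( \sum_{n \le t} e(n\alpha) \Big) = \log N \sum_{n \le N} e(n\alpha) - \int_1^N \frac{1}{t} \sum_{n \le t} e(n\alpha)\, dt \ll (\log N) \min\left( N, \frac{1}{\|\alpha\|} \right). \]
Then,
\[ \sum_{n \le N} \Lambda(n) e(n\alpha) = \sum_a \mu(a) \sum_{b \le N/a} (\log b)\, e(ab\alpha). \]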
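The convolution identity $\Lambda = \mu * \log$ is easy to confirm numerically. In this sketch, `von_mangoldt_direct` computes $\Lambda(n)$ from the definition and `von_mangoldt_by_convolution` computes $\sum_{ab=n} \mu(a) \log b$; both names, and the sieve used for $\mu$, are illustrative choices.

```python
import math


def mobius_table(limit):
    """Return mu with mu[n] = Mobius function of n for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if not is_prime[p]:
            continue
        for m in range(p, limit + 1, p):
            if m > p:
                is_prime[m] = False
            mu[m] = -mu[m]  # one more prime factor flips the sign
        for m in range(p * p, limit + 1, p * p):
            mu[m] = 0  # divisible by a square
    return mu


def von_mangoldt_direct(n):
    """Lambda(n) from the definition: log p if n = p^k with k >= 1, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:  # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0


def von_mangoldt_by_convolution(n, mu):
    """Sum of mu(a) log(b) over factorizations n = a b."""
    return sum(mu[a] * math.log(n // a) for a in range(1, n + 1) if n % a == 0)


if __name__ == "__main__":
    mu = mobius_table(30)
    for n in (1, 6, 8, 9, 12, 29, 30):
        print(n, von_mangoldt_direct(n), von_mangoldt_by_convolution(n, mu))
```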

Example 3.2. First, let’s try the case $\alpha = 0$. Then,
\[ \sum_{n \le N} \Lambda(n) = \sum_a \mu(a) \sum_{b \le N/a} \log b. \]
If we knew $\sum_{a=1}^{\infty} \frac{\mu(a)}{a} = 0$ we could then prove the prime number theorem. This is essentially equivalent to proving the prime number theorem, so it would take some work. Things are good when $a$ is small, but there is a problem when $a$ is big.

Goal 3.3. Our overall aim is to bound $S(\alpha)$.

3.1. Vinogradov’s idea. We’d like to somehow decompose $\Lambda(n)$ into pieces, where for each piece we either use a simple exponential sum, or use the following idea. The idea has to do with bilinear forms. We write $m \sim M$ to mean $M < m \le 2M$.

\[ B(M, N) := \sum_{m \sim M} \sum_{n \sim N} a_m b_n \cdot f(m, n), \]
with $a_m, b_n$ arbitrary complex numbers, and $f(m, n)$ an oscillatory term, such as $f(m, n) = e(mn\alpha)$. Intuitively $f(m, n)$ should have some “cancellation.”

Goal 3.4. We’d like to bound the sum $B(M, N)$ by something like
\[ \Big( \sum_{m \sim M} |a_m|^2 \Big)^{1/2} \Big( \sum_{n \sim N} |b_n|^2 \Big)^{1/2} \cdot N_f, \]
where $N_f$ is some sort of operator norm of the matrix $f(m, n)$.
(1) We think of $f(m, n)$ as something that cancels out. It does not always have the same sign.
(2) We typically have $|f(m, n)|$ small, e.g., $\le 1$.
(3) We might also imagine $|a_m|, |b_n| \le 1$.
We’d then like to compare the bound we obtain to the trivial bound $MN$. We’d like to beat this trivial bound. This will be impossible if
(1) $f(m, n) = 1$, or
(2) $f(m, n) = \alpha(m)\beta(n)$.
(3) We also need both $M$ and $N$ to be big (or at least the associated matrix to have large rank).
In order to avoid these impossibilities, we will need to exploit that $f(m, n)$ is genuinely a 2-variable function, and does not decouple.

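To obtain the bound, we will use Cauchy-Schwarz. We have
\[ \Big| \sum_{m \sim M} \sum_{n \sim N} a_m b_n f(m, n) \Big|^2 \le \Big( \sum_{m \sim M} |a_m|^2 \Big) \Big( \sum_{m \sim M} \Big| \sum_{n \sim N} b_n f(m, n) \Big|^2 \Big). \]
Let $*$ denote
\[ \sum_{m \sim M} \Big| \sum_{n \sim N} b_n f(m, n) \Big|^2. \]
Then,
\[ * = \sum_{m \sim M} \Big| \sum_{n \sim N} b_n f(m, n) \Big|^2 \le \sum_{n_1, n_2 \sim N} |b_{n_1} b_{n_2}| \, \Big| \sum_{m \sim M} f(m, n_1) \overline{f(m, n_2)} \Big|. \]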

What we have gained is that we have replaced the unknown $a_m, b_n$ by inner products of our known matrix $f(m, n)$. If we knew the columns of $f(m, n)$ were orthogonal, then the sum amounts to terms with $n_1 = n_2$ of the form
\[ \sum_{n \sim N} |b_n|^2 M. \]
Things will never be quite so good that we will precisely get orthogonality. But, we might have some approximate orthogonality. For example, if $n_1 \ne n_2$, maybe we can bound the correlation by 1. Then, the off-diagonal terms are of the form
\[ \sum_{n_1 \ne n_2} |b_{n_1} b_{n_2}|. \]
Using Cauchy’s inequality, we obtain
\[ |b_{n_1} b_{n_2}| \ll |b_{n_1}|^2 + |b_{n_2}|^2. \]
Hence,
\[ \sum_{n_1 \ne n_2} |b_{n_1} b_{n_2}| \ll N \sum_{n \sim N} |b_n|^2. \]
In the above favorable circumstances, putting the above together, we get a bound
\[ \Big| \sum_{m \sim M} \sum_{n \sim N} a_m b_n f(m, n) \Big| \ll \Big( \sum_{m \sim M} |a_m|^2 \Big)^{1/2} \Big( \sum_{n \sim N} |b_n|^2 \Big)^{1/2} \big( \sqrt{M} + \sqrt{N} \big). \]

We might now try setting all $|a_m| = |b_n| = 1$, and then our bound is $M\sqrt{N} + N\sqrt{M}$ instead of $MN$, so we save a factor of $\frac{1}{\sqrt{M}} + \frac{1}{\sqrt{N}}$. Again, this bound holds under various assumptions that the $f(m, n)$ are approximately orthogonal. This is the key strategy.

We now want to implement the above strategy in the situation we are in. The key point of the strategy is that we have transferred the problem from understanding the unknown $a_m, b_n$ to the known problem of understanding the correlations of $f(m, n)$.

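3.2. Applying Vinogradov’s idea. We now want to bound
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha), \]
thinking of the $a_m$ as $\mu(m)$ and the $b_n$ as $\mu(n)$. We then want to bound
\[ * = \sum_{n_1, n_2 \sim N} |b_{n_1} b_{n_2}| \, \Big| \sum_{m \sim M} e\big( m(n_1 - n_2)\alpha \big) \Big|. \]
Suppose we write $n_1 - n_2 =: k$. Then $|k| \le N$. We then have
\[ * \ll \sum_{n_1, n_2 \sim N} \big( |b_{n_1}|^2 + |b_{n_2}|^2 \big) \min\left( M, \frac{1}{\|(n_1 - n_2)\alpha\|} \right) \ll \Big( \sum_{n_1 \sim N} |b_{n_1}|^2 \Big) \Big( \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right) \Big). \]
We conclude
\[ \Big| \sum_{m, n} a_m b_n e(mn\alpha) \Big| \ll \Big( \sum_m |a_m|^2 \Big)^{1/2} \Big( \sum_n |b_n|^2 \Big)^{1/2} \Big( \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right) \Big)^{1/2}. \]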
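The Cauchy-Schwarz argument can be tested numerically with random coefficients. The sketch below tracks constants explicitly, which the notes do not: it uses $|\sum_{m \sim M} e(mk\alpha)| \le \min(M, 1/(2\|k\alpha\|))$ and $|b_{n_1} b_{n_2}| \le \frac{1}{2}(|b_{n_1}|^2 + |b_{n_2}|^2)$, which give the bound $|B|^2 \le \sum |a_m|^2 \sum |b_n|^2 \sum_{|k| < N} \min(M, 1/(2\|k\alpha\|))$. The random unit coefficients and all function names are assumptions of this illustration.

```python
import cmath
import math
import random


def e(x):
    """e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def random_unit_coefficients(M, N, seed=0):
    """Random a_m (M < m <= 2M) and b_n (N < n <= 2N) of modulus 1."""
    rng = random.Random(seed)
    a = {m: e(rng.random()) for m in range(M + 1, 2 * M + 1)}
    b = {n: e(rng.random()) for n in range(N + 1, 2 * N + 1)}
    return a, b


def bilinear_form(a, b, M, N, alpha):
    """B = sum of a_m b_n e(m n alpha) over m ~ M and n ~ N."""
    return sum(
        a[m] * b[n] * e(m * n * alpha)
        for m in range(M + 1, 2 * M + 1)
        for n in range(N + 1, 2 * N + 1)
    )


def vinogradov_bound(a, b, M, N, alpha):
    """Bound from Cauchy-Schwarz plus the geometric sum estimate."""
    norm_a = sum(abs(x) ** 2 for x in a.values())
    norm_b = sum(abs(x) ** 2 for x in b.values())
    correlation = 0.0
    for k in range(-(N - 1), N):  # k = n1 - n2 with n1, n2 in (N, 2N]
        d = abs(k * alpha - round(k * alpha))
        correlation += M if d == 0 else min(M, 1 / (2 * d))
    return math.sqrt(norm_a * norm_b * correlation)


if __name__ == "__main__":
    M, N = 30, 30
    alpha = (1 + math.sqrt(5)) / 2
    a, b = random_unit_coefficients(M, N)
    # Columns: |B|, the bound above, the trivial bound MN.
    print(abs(bilinear_form(a, b, M, N, alpha)), vinogradov_bound(a, b, M, N, alpha), M * N)
```

For the golden ratio the bound comes out well below the trivial bound $MN$, while for $\alpha = 1/2$ it does not, in line with the discussion of rationals with small denominator.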

We’d like to show we get something from this if $\alpha$ is not close to a rational number with small denominator. So we keep in mind that $\alpha$ might be irrational. We start with Dirichlet’s theorem:

Theorem 3.5 (Dirichlet). For all $Q > 1$ and all $\alpha \in \mathbb{R}$, there exists a rational number $a/q$ with $(a, q) = 1$ and $q \le Q$ so that
\[ \left| \alpha - \frac{a}{q} \right| < \frac{1}{qQ}. \]

So, we can get pretty good approximations to irrational numbers by rationals with small denominators. A crude version of this is
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}. \]
We can get approximations of this type by continued fraction expansions. Let $(*)$ denote
\[ (*) := \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right). \]
We should expect that if $q$ is small, then we might revert to the trivial bound $MN$. Perhaps there is some inverse relationship with $q$. So maybe we get something like $MN/q$ or $MN/\sqrt{q}$. So, the larger $q$ gets, the more saving we should get over the trivial bound. So, very small values of $q$ are not good, but if $q$ is in some intermediate range, we might hope to be in a good situation. So, the bound we will write down will depend on the Diophantine properties of $\alpha$ and the scale on which we are operating.

So, assume $\alpha$ has the rational approximation given by Dirichlet’s theorem satisfying
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}. \]
Split the range of $k$ into several intervals of length $q$. How does $\|k\alpha\|$ vary on such an interval? There are only boundedly many values of $k$ for which it is very close to an integer, and the remaining values are spread out at spacing about $1/q$. Then, we have
\[ \sum_{0 \le a \le q} \min\left( M, \frac{q}{a} \right) \ll M + q \log q. \]
The $\log q$ is unimportant and we can remove it if we’d like, using Poisson summation if we had a smooth function. It would then be $\min(M, q/a^2)$ and we could remove the log. We now bound the following by dividing the range $|k| \le N$ into about $N/q + 1$ intervals of length $q$:
\[ (*) = \sum_{|k| \le N} \min\left( M, \frac{1}{\|k\alpha\|} \right) \ll (N/q + 1)(M + q \log q) \ll (M + q)(N + q) \frac{\log q}{q}. \]
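The following sketch computes continued fraction convergents (which satisfy $|\alpha - a/q| < 1/q^2$) and compares the sum $(*)$ with a block bound of the shape $(N/q + 1)(M + q \log q)$. The explicit constant 4 and the exact form of `block_bound` come from a direct count for this illustration (at most four values of $k$ per block land in each range $t/q \le \|k\alpha\| < (t+1)/q$); they are not from the lecture. The convergents are computed from a floating-point $\alpha$, so only the first several are trustworthy.

```python
import math
from fractions import Fraction


def convergents(alpha, count):
    """First `count` continued-fraction convergents of a float alpha."""
    result = []
    p_prev, q_prev = 1, 0  # p_{-1}, q_{-1}
    p, q = math.floor(alpha), 1  # p_0, q_0
    x = alpha
    for _ in range(count):
        result.append(Fraction(p, q))
        fractional = x - math.floor(x)
        if fractional == 0:
            break
        x = 1 / fractional
        a_k = math.floor(x)  # next partial quotient
        p, p_prev = a_k * p + p_prev, p
        q, q_prev = a_k * q + q_prev, q
    return result


def reciprocal_distance_sum(alpha, M, N):
    """(*) = sum over |k| <= N of min(M, 1/||k alpha||)."""
    total = 0.0
    for k in range(-N, N + 1):
        d = abs(k * alpha - round(k * alpha))
        total += M if d == 0 else min(M, 1 / d)
    return total


def block_bound(M, N, q):
    """4 * (number of blocks of length q) * (M + q (1 + log q))."""
    blocks = (2 * N + 1) / q + 1
    return 4 * blocks * (M + q * (1 + math.log(q)))


if __name__ == "__main__":
    alpha = (1 + math.sqrt(5)) / 2
    M, N = 20, 50
    for c in convergents(alpha, 8):
        q = c.denominator
        print(c, reciprocal_distance_sum(alpha, M, N), block_bound(M, N, q), M * N)
```

For the golden ratio the convergents are ratios of consecutive Fibonacci numbers, and the block bound is most useful when $q$ is neither very small nor very large compared with $M$ and $N$.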

We have proven:

Proposition 3.6. If $\left| \alpha - \frac{a}{q} \right| < \frac{1}{q^2}$ and $(a, q) = 1$, then
\[ \Big| \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha) \Big| \ll \Big( \sum_m |a_m|^2 \Big)^{1/2} \Big( \sum_n |b_n|^2 \Big)^{1/2} \Big( \frac{\log q}{q} \Big)^{1/2} \Big( \sqrt{MN} + \sqrt{Mq} + \sqrt{Nq} + q \Big). \]

Question 3.7. Why is the above bound useful?

Just to summarize, it might be helpful to think of the case that the $a_m, b_n$ are bounded in absolute value by 1. We then get approximately a bound by
\[ \frac{\sqrt{MN}}{\sqrt{q}} \Big( \sqrt{MN} + \sqrt{q}\big( \sqrt{M} + \sqrt{N} \big) + q \Big) = \frac{MN}{\sqrt{q}} + MN \Big( \frac{1}{\sqrt{M}} + \frac{1}{\sqrt{N}} \Big) + \sqrt{q}\sqrt{MN}. \]
The middle term is what we would get from the orthogonality relation. If $q$ is small or large, we don’t beat the trivial bound, as expected. This is the crucial bound for our particular bilinear form.

Next time, we’ll try to rewrite the coefficients as a bilinear form as a function of multiple summands. The final ingredient is to write a combinatorial identity to express $\Lambda(n)$ in terms of things we understand. That is, we want to write the sum as something like
\[ \sum_b (\log b)\, e(b\alpha) + \sum_{m, n} a_m b_n e(mn\alpha), \]
where both $m$ and $n$ are large in the second sum.

4. 10/3/17

4.1. Review. Last time, we were trying to bound sums like
\[ \Big| \sum_m \sum_n a_m b_n f(m, n) \Big|^2 \le \Big( \sum_m |a_m|^2 \Big) \Big( \sum_m \Big| \sum_n b_n f(m, n) \Big|^2 \Big), \]
where the second factor is at most
\[ \sum_{n_1, n_2} |b_{n_1} b_{n_2}| \, \Big| \sum_m f(m, n_1) \overline{f(m, n_2)} \Big| \ll \sum_{n_1, n_2} \big( |b_{n_1}|^2 + |b_{n_2}|^2 \big) \Big| \sum_m f(m, n_1) \overline{f(m, n_2)} \Big|, \]
and we hoped that the correlations, i.e., the terms $\sum_m f(m, n_1) \overline{f(m, n_2)}$, were bounded by $M$ if $n_1 = n_2$ and $O(1)$ if $n_1 \ne n_2$. We then could bound the above by $(M + N) \sum_n |b_n|^2$.

Recall that last time we were trying to bound sums like
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha), \]
where $m \sim M$ means $M < m \le 2M$. We had
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}, \]
with $(a, q) = 1$. We proved last time
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha) \ll \Big( \sum_m |a_m|^2 \Big)^{1/2} \Big( \sum_n |b_n|^2 \Big)^{1/2} \Big( \sqrt{MN} + \sqrt{q}\big( \sqrt{M} + \sqrt{N} \big) + q \Big) \frac{\sqrt{\log q}}{\sqrt{q}}. \]
We had
\[ \Lambda(n) = \sum_{ab = n} \mu(a) \log b, \qquad S(\alpha) = \sum_{n \le N} \Lambda(n) e(n\alpha). \]
Our tools are
(1) the sum
\[ \sum_{n \le x} (\log n)\, e(n\alpha), \]
which is some sort of geometric progression after factoring out a log, and which we understand;
(2) the sum
\[ \sum_{m \sim M} \sum_{n \sim N} a_m b_n e(mn\alpha), \]
which is well bounded using Vinogradov’s method discussed last time (and above in this lecture), assuming the correlations are small for off-diagonal terms and on the order of the number of elements for the diagonal terms.
Our goal is now the following combinatorial one:

Goal 4.1. Write $\Lambda(n)$ in a form where we can use the above two tools.

Theorem 4.2 (Vaughan’s identity). We have
\[ \sum_{n=1}^{\infty} \frac{\Lambda(n)}{n^s} = -\frac{\zeta'}{\zeta}(s), \qquad \frac{1}{\zeta(s)} = \sum_a \frac{\mu(a)}{a^s}, \qquad -\zeta'(s) = \sum_b \frac{\log b}{b^s}. \]

Proof. This follows from straightforward manipulations of Dirichlet series; the first one comes from the derivative of $\log \zeta(s)$. □

4.2. Mollifying $\zeta(s)$. We now want to mollify $\zeta(s)$. For this, we will use Selberg’s sieve.

One way to study the zeta function could be an appropriate truncation. We may consider
\[ M(s) = \sum_{n \le U} \frac{\mu(n)}{n^s}, \]
which is a sort of approximation to the inverse of the Riemann $\zeta$ function, using the above identity that
\[ \frac{1}{\zeta(s)} = \sum_a \frac{\mu(a)}{a^s}. \]
One can compute
\[ \zeta(s) M(s) = \sum_{n=1}^{\infty} \frac{a(n)}{n^s}, \]
where $a(n)$ is defined by
\[ a(n) = \sum_{d \mid n,\ d \le U} \mu(d) = \begin{cases} 1 & \text{if } n = 1, \\ 0 & \text{if } 1 < n \le U, \\ \text{bounded in absolute value by } d(n) & \text{if } n > U, \end{cases} \]
where $d(n)$ is the number of divisors of $n$.

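We have
\[ -\frac{\zeta'}{\zeta}(s) = -\frac{\zeta'}{\zeta}(s) \big( 1 - \zeta M + \zeta M \big) = -\zeta'(s) M(s) - \frac{\zeta'}{\zeta}(s) \big( 1 - \zeta M(s) \big). \]
First, we should be fairly happy with the term $-\zeta'(s) M(s)$, which has Dirichlet series given by
\[ \Big( \sum_b \frac{\log b}{b^s} \Big) \Big( \sum_{n \le U} \frac{\mu(n)}{n^s} \Big), \]
and we’ll have a long sum in the $b$’s, where we can hope to get some cancellation. Next, we need to understand
\[ \frac{\zeta'}{\zeta}(s) \big( 1 - \zeta M(s) \big). \]
We try to think of this product as a sort of bilinear form. The terms from $1 - \zeta M(s)$ only matter for $n$ larger than $U$. Thinking of this term as a bilinear term, we’re happy because $1 - \zeta M(s)$ is supported on large $n$. But, we have to ensure that $\zeta'/\zeta$ is not too “skinny.” To deal with this, we can subtract out the small primes, and then later add them back. To accomplish this, we define
\[ P(s) := \sum_{n \le V} \frac{\Lambda(n)}{n^s}. \]
Then,
\[ -\frac{\zeta'}{\zeta}(s) = \Big( -\frac{\zeta'}{\zeta}(s) - P(s) \Big) + P(s) = \sum_{n > V} \frac{\Lambda(n)}{n^s} + \sum_{n \le V} \frac{\Lambda(n)}{n^s}. \]
Then,
\[ -\frac{\zeta'}{\zeta}(s) = -\zeta'(s) M(s) - \frac{\zeta'}{\zeta}(s) \big( 1 - \zeta M(s) \big) \]
\[ = -\zeta'(s) M(s) + \Big( -\frac{\zeta'}{\zeta}(s) - P(s) \Big) \big( 1 - \zeta M(s) \big) + P(s) \big( 1 - \zeta M(s) \big) \]
\[ = -\zeta'(s) M(s) + \Big( -\frac{\zeta'}{\zeta}(s) - P(s) \Big) \big( 1 - \zeta M(s) \big) + P(s) - \zeta(s) M(s) P(s). \]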
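Since the decomposition above is an identity of Dirichlet series, it can be verified coefficient by coefficient. The sketch below computes, for each $n$, the coefficient of $n^{-s}$ in each of the four pieces and checks that they add up to $\Lambda(n)$. The function `vaughan_pieces` and the sample values $U = 5$, $V = 7$ are illustrative assumptions.

```python
import math


def divisors(n):
    """All positive divisors of n (naive, fine for small n)."""
    return [d for d in range(1, n + 1) if n % d == 0]


def mobius(n):
    """Mobius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # p^2 divides the original n
            result = -result
        p += 1
    return -result if n > 1 else result


def von_mangoldt(n):
    """Lambda(n) = log p if n = p^k with k >= 1, else 0."""
    if n < 2:
        return 0.0
    p = 2
    while p * p <= n:
        if n % p == 0:
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
        p += 1
    return math.log(n)  # n itself is prime


def vaughan_pieces(n, U, V):
    """Coefficients of n^{-s} in the four pieces of the decomposition."""

    def one_minus_zeta_m(m):
        # Coefficient of m^{-s} in 1 - zeta(s) M(s).
        return (1 if m == 1 else 0) - sum(mobius(d) for d in divisors(m) if d <= U)

    # -zeta'(s) M(s): n = a b with mu(a), a <= U, and log b.
    first = sum(mobius(a) * math.log(n // a) for a in divisors(n) if a <= U)
    # (-zeta'/zeta - P)(1 - zeta M): n = k m with Lambda(k), k > V.
    type2 = sum(von_mangoldt(k) * one_minus_zeta_m(n // k) for k in divisors(n) if k > V)
    # P(s): Lambda(n) for n <= V.
    small = von_mangoldt(n) if n <= V else 0.0
    # -zeta(s) M(s) P(s): n = a b c with mu(a), a <= U, Lambda(c), c <= V.
    last = -sum(
        mobius(a) * von_mangoldt(c)
        for a in divisors(n)
        if a <= U
        for c in divisors(n // a)
        if c <= V
    )
    return first, type2, small, last


if __name__ == "__main__":
    U, V = 5, 7
    for n in (2, 30, 36, 64, 77, 97):
        print(n, von_mangoldt(n), vaughan_pieces(n, U, V))
```

Note that the second piece vanishes for $n \le UV$, which is the sense in which both of its variables are forced to be large.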

The point of this breakdown is that we now have four terms we can handle using the two tools we have. The middle term is a product of two factors, both of which are supported on large $n$, which gives a bilinear form. The last term has a long sum from the $\zeta(s)$ factor with simple coefficients. For $P(s)$, we can just ignore it because $V$ is small. The first term is similarly a long sum from the $\zeta'$ factor.

Remark 4.3. A term which we handle via our first summation technique is called a “type 1 sum,” and one handled via our second, bilinear form, summation technique is called a “type 2 sum.”

Example 4.4. Let’s say we want to write
\[ -\frac{\zeta'}{\zeta}(s) = -\frac{\zeta'}{\zeta}(s) \Big( (1 - \zeta M)^2 + 2\zeta M - \zeta^2 M^2 \Big) = \Big( -\frac{\zeta'}{\zeta} + \zeta' M \Big) (1 - \zeta M) - 2\zeta' M + \zeta \zeta' M^2. \]
The first is a type 2 sum, the second is a type 1. The $\zeta$ and $\zeta'$ together behave somewhat like a simple divisor function, because if the product of $\zeta, \zeta'$ is summed over a long range then at least one of them must be summed over a long range. So, the third term is also a type 1 sum.

Remark 4.5 (Heath-Brown identity, aka binomial theorem). Given
\[ -\frac{\zeta'}{\zeta}(s) (1 - \zeta M)^k, \]
one can try expanding this via the binomial theorem, and try to bound the various terms.

4.3. Proving Vinogradov’s theorem. Recall we have
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2}, \]
with $(a, q) = 1$. Our goal is to bound $S(\alpha)$ in terms of $q$. Trivially we know $S(\alpha)$ is bounded by $N$, and we want to save a bit more than one log on the minor arcs, and then we’ll have to concentrate on the major arcs. We use Vaughan’s identity (where we have not yet specified $U$ and $V$). There are three type 1 sums and one type 2 sum (from the bilinear form). Recall we have
\[ -\frac{\zeta'}{\zeta}(s) = -\zeta'(s) M(s) + \Big( -\frac{\zeta'}{\zeta}(s) - P(s) \Big) \big( 1 - \zeta M(s) \big) + P(s) - \zeta(s) M(s) P(s), \]
and we are trying to bound the four sums. First we deal with the $P(s)$ term, which is
\[ \sum_{n \le V} \Lambda(n) e(n\alpha) \ll V. \]
Next, we try to bound the first term, which is the contribution coming from $\zeta' M$. This term is
\[ \sum_{n \le U} \mu(n) \sum_{r \le N/n} (\log r)\, e(nr\alpha) \ll (\log N) \sum_{n \le U} \min\left( \frac{N}{n}, \frac{1}{\|n\alpha\|} \right). \]
It is convenient to split over dyadic blocks $2^k < n \le 2^{k+1}$.

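Exercise 4.6. Carry out the argument from week 1 for dealing with sums like
\[ \sum_{|n| \le N} \min\left( N, \frac{1}{\|n\alpha\|} \right). \]

There is a small lie in what we will next do, and your job is to fix it by Thursday. You should check what happens for smaller $n$ as well.

Pretending that only the large range matters, we can bound
\[ \sum_{n \le U} \mu(n) \sum_{r \le N/n} (\log r)\, e(nr\alpha) \ll (\log N) \sum_{n \le U} \min\left( \frac{N}{n}, \frac{1}{\|n\alpha\|} \right) \ll (\log N) \Big( \frac{U}{q} + 1 \Big) \Big( \frac{N}{U} + q \log q \Big) \ll (\log N)^2 \Big( \frac{N}{q} + U + \frac{N}{U} + q \Big). \]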

Now, we’ll aim to attack the last type 1 sum, which is the term corresponding to $\zeta(s) M(s) P(s)$, namely
\[ \sum_{n \le U} \sum_{\ell \le V} \mu(n) \Lambda(\ell) \sum_{r \le N/n\ell} e(n\ell r\alpha). \]
If we let $n\ell = a$ then $a \le UV$. Then, the coefficients in $a$ are bounded by something like $\sum_{n\ell = a} \Lambda(\ell) = \log a$ (using that the left hand side is the convolution of $\zeta$ with $\zeta'/\zeta$, which is $\zeta'$, which has coefficients given by log). Therefore,
\[ \sum_{n \le U} \sum_{\ell \le V} \mu(n) \Lambda(\ell) \sum_{r \le N/n\ell} e(n\ell r\alpha) \ll \sum_{a \le UV} \log a \, \Big| \sum_{r \le N/a} e(ar\alpha) \Big| \ll (\log N) \sum_{a \le UV} \min\left( \frac{N}{a}, \frac{1}{\|\alpha a\|} \right) \ll (\log N)^2 \Big( \frac{N}{q} + q + UV + \frac{N}{UV} \Big). \]

Exercise 4.7. Verify the above bounds using a method similar to the type 1 bound for the first $\zeta' M$ term.

Adding up our three type 1 sums, and removing terms trivially bounded by others, we get
\[ (\log N)^2 \Big( \frac{N}{q} + q + UV + \frac{N}{U} \Big). \]
This handles three of the four terms. The last term remaining to be handled is the type 2 sum corresponding to
\[ \Big( -\frac{\zeta'}{\zeta}(s) - P(s) \Big) \big( 1 - \zeta M(s) \big). \]
The first factor only contains terms larger than $V$ and the second only contains terms larger than $U$. Using
\[ 1 - \zeta M(s) = \sum_n \frac{1}{n^s} \Big( \sum_{d \mid n,\ d > U} \mu(d) \Big), \]
we obtain the sum
\[ \sum_{n > V} \Lambda(n) \sum_{m > U} \Big( \sum_{d \mid m,\ d > U} \mu(d) \Big) e(mn\alpha) \]
with $mn \le N$.

Remark 4.8. We now have two terms with variables in our bilinear form, both with large values. Both will range over dyadic intervals. It starts to look like a bilinear form, though there is the caveat that the two variables are connected by the condition that $mn \le N$. Hence, we need some technical device to separate the variables.

Essentially, this is saying these are like points lying below a hyperbola and we would like to approximate the hyperbola by some rectangle.

Morally, we have
\[ \sum_{n > V} \Lambda(n) \sum_{m > U} \Big( \sum_{d \mid m,\ d > U} \mu(d) \Big) e(mn\alpha) \]
with $mn \le N$. Ignoring the condition $mn \le N$ (which we will fix next time), the above sum is approximated by sums
\[ \sum_{m \sim A} \sum_{n \sim B} a(m) b(n) e(mn\alpha) \]
for $A > U$, $B > V$, $AB \le N$. This is the kind of bilinear form we want for our type 2 sum. Using the bilinear form estimate, we get the bound
\[ \Big( \sum_{m \sim A} a(m)^2 \Big)^{1/2} \Big( \sum_{n \sim B} b(n)^2 \Big)^{1/2} \frac{\sqrt{\log q}}{\sqrt{q}} \Big( \sqrt{AB} + \big( \sqrt{A} + \sqrt{B} \big) \sqrt{q} + q \Big). \]
Note that the correlation is as needed because we have checked it for the particular bilinear form $a_m b_n e(mn\alpha)$. Here, $b_n = \Lambda(n)$. Then,
\[ \sum_{n \sim B} \Lambda(n)^2 \ll B \log B. \]
Next,
\[ \sum_{m \sim A} a(m)^2 \ll \sum_{m \sim A} d(m)^2. \]

Exercise 4.9. Show
\[ \sum_{n \le x} d(n) \sim x \log x \]
(write this as $\sum_{n \le x} \sum_{d \mid n} 1$ and interchange the two sums).

It turns out $\sum_{n \le x} d(n)^2 \sim C x (\log x)^3$. We end up getting a bound from
\[ \sum_{m \sim A} a(m)^2 \ll \sum_{m \sim A} d(m)^2 \ll A (\log A)^3. \]
So, we have some loose ends which we shall address next time, including
(1) thinking about these sums of divisor functions up to $x$,
(2) thinking through the type 1 bounds for the first and fourth terms,
(3) and putting these things all together.

5. 10/5/17

5.1. Recap of last time. Recall that we have defined
\[ S(\alpha) := \sum_{n \le N} \Lambda(n) e(n\alpha). \]
Our goal is to bound these exponential sums. We assume
\[ \left| \alpha - \frac{a}{q} \right| \le \frac{1}{q^2} \]
for $(a, q) = 1$. We had a way of approaching this bound with exponential sums and bilinear forms. We used the combinatorial identity
\[ -\frac{\zeta'}{\zeta}(s) = P(s) - \zeta'(s) M(s) - \zeta(s) M(s) P(s) + \Big( -\frac{\zeta'}{\zeta}(s) - P(s) \Big) \big( 1 - \zeta(s) M(s) \big), \]
where
\[ P(s) = \sum_{n \le V} \frac{\Lambda(n)}{n^s} \quad \text{and} \quad M(s) = \sum_{n \le U} \frac{\mu(n)}{n^s}. \]
The first three terms are type 1 sums, and the last term is a type 2 sum. Last time, we discussed the bound for the type 1 sums. We saw they were bounded by
\[ \ll (\log N)^2 \Big( \frac{N}{q} + q + UV + \frac{N}{U} \Big). \]
For example, to bound the term $\zeta'(s) M(s)$, we had to bound
\[ \sum_{n \le U} \min\left( \frac{N}{n}, \frac{1}{\|n\alpha\|} \right). \]
We could split this into dyadic blocks and carry out the usual sum. When $1 \le n \le q$, we cannot split it into intervals of length $q$.

Exercise 5.1. For the blocks, we should take a dyadic sum over intervals $2^k q \le n \le 2^{k+1} q$, and then we should pay attention to the case $1 \le n \le q$, and we should get a bound around $q \log q$ or something like that for the sum of the first $q$ terms.

At the end of last class, we were discussing the type 2 sums. There were many small things we needed to keep track of. We wrote the sum as
\[ \sum_{n > V} \Lambda(n) \sum_{m > U,\ mn \le N} \Big( \sum_{d \mid m,\ d > U} \mu(d) \Big) e(mn\alpha). \]
Last time, we divided this sum into dyadic intervals with $m \sim A$ and $n \sim B$.

Remark 5.2. We have to justify why the sum can be split into dyadic blocks subject to the condition that $mn \le N$.

We then bounded the above by
\[ \Big( \sum_{m \sim A} d(m)^2 \Big)^{1/2} \Big( \sum_{n \sim B} \Lambda(n)^2 \Big)^{1/2} \frac{\sqrt{\log q}}{\sqrt{q}} \Big( \sqrt{AB} + \big( \sqrt{A} + \sqrt{B} \big) \sqrt{q} + q \Big) \]
\[ \ll \Big( \sum_{m \sim A} d(m)^2 \Big)^{1/2} (B \log B)^{1/2} \frac{\sqrt{\log q}}{\sqrt{q}} \Big( \sqrt{AB} + \big( \sqrt{A} + \sqrt{B} \big) \sqrt{q} + q \Big). \]

5.2. Bounding the sum of the divisor function. We can see
\[ \sum_{n \le x} d(n) = \sum_{a \le x} \sum_{b \le x/a} 1 = \sum_{a \le x} \Big( \frac{x}{a} + O(1) \Big) = x \log x + O(x). \]
This is a wasteful $O(1)$ when $a$ is large. Dirichlet’s idea was to deal with the hyperbola $ab = x$ and count the points with $a \le A$ and those with $b \le B$, correcting for the points counted twice. When one carries this out, one gets an error term of the size of $A + B$, instead of $x$ (with $AB = x$). One can take $A = B = \sqrt{x}$. One ends up getting
\[ \sum_{n \le x} d(n) = x \log x + (2\gamma - 1) x + O(\sqrt{x}). \]

Exercise 5.3. Carry out Dirichlet’s idea and check this error term.

Then, we can compute
\[ d_k(n) = \sum_{a_1 \cdots a_k = n} 1 = \sum_{ab = n} d_{k-1}(b), \]
and use induction. If we knew $\sum_{b \le y} d_{k-1}(b)$, we could then use the hyperbola method for $a \le A$, $b \le B$ and choose the parameters $A, B$ with $AB = x$.

5.3. A second method for bounding the sum of the divisor function. We now want a second method of calculating this. We start from
\[ \zeta(s)^2 = \sum_n \frac{d(n)}{n^s}. \]
We have
\[ \sum_{n \le x} d(n) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \zeta(s)^2 \frac{x^s}{s}\, ds \]
for $c > 1$.

Exercise 5.4. Show
\[ \frac{1}{2\pi i} \int_{(c)} y^s \frac{ds}{s} = \begin{cases} 1 & \text{if } y > 1, \\ 0 & \text{if } y < 1 \end{cases} \]
(see Davenport’s book), where the path $(c)$ means the line from $c - i\infty$ to $c + i\infty$.

Essentially, one can prove this by noting the integral is 0 for $y < 1$ (shift the contour far to the right), and for $y > 1$ one shifts the contour to the left, where the only pole is at $s = 0$, which has residue 1.

We expand
\[ x^s = x \big( 1 + (s - 1) \log x + \cdots \big) \quad \text{and} \quad \zeta(s) = \frac{1}{s - 1} + \gamma + O(s - 1). \]
The residue of the pole at $s = 1$ is
\[ x \log x + (2\gamma - 1) x. \]
It would be useful, and can be done easily, to have some bounds like
\[ |\zeta(s)| \ll (1 + |t|)^{1/2}, \]
where $s = \sigma + it$, $0 \le \sigma \le 1$. We are trying to evaluate
\[ \frac{1}{2\pi i} \int_{(c)} \zeta(s)^2 \frac{x^s}{s}\, ds = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \zeta(s)^2 \frac{x^s}{s}\, ds + O\Big( \frac{x^c}{T} \Big). \]
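Dirichlet’s hyperbola method from Section 5.2 translates into a very short computation. With $A = B = \lfloor \sqrt{x} \rfloor$, counting lattice points under $ab = x$ gives $\sum_{n \le x} d(n) = 2 \sum_{a \le \sqrt{x}} \lfloor x/a \rfloor - \lfloor \sqrt{x} \rfloor^2$. The sketch below compares this with the naive count and with the main term $x \log x + (2\gamma - 1)x$; the function names and the hard-coded value of Euler’s constant are conveniences of this illustration.

```python
import math

EULER_GAMMA = 0.5772156649015329


def divisor_sum_naive(x):
    """Sum of d(n) for n <= x, as the sum over a <= x of floor(x/a)."""
    return sum(x // a for a in range(1, x + 1))


def divisor_sum_hyperbola(x):
    """Same sum using only a <= sqrt(x): points with a <= r, plus points
    with b <= r, minus the r*r points counted twice."""
    r = math.isqrt(x)
    return 2 * sum(x // a for a in range(1, r + 1)) - r * r


def main_term(x):
    """x log x + (2 gamma - 1) x."""
    return x * math.log(x) + (2 * EULER_GAMMA - 1) * x


if __name__ == "__main__":
    for x in (10**2, 10**4, 10**6):
        exact = divisor_sum_hyperbola(x)
        print(x, exact, exact - main_term(x), math.sqrt(x))
```

The third printed column is the error term, to be compared with $\sqrt{x}$ in the last column.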

We can then try to evaluate this integral as something like $x \log x + (2\gamma - 1) x$ plus an error term, similarly to the way done in Davenport’s book.

In the hope of formalizing this method, suppose we want to bound
\[ \sum_{n \le x} d_k(n), \]
and we consider
\[ \zeta(s)^k = \sum_{n=1}^{\infty} \frac{d_k(n)}{n^s}. \]
We then can compute these sums by examining
\[ \frac{1}{2\pi i} \int_{(c)} \zeta(s)^k \frac{x^s}{s}\, ds. \]
So, we want to know vaguely what the residue of this integrand at $s = 1$ is. The residue is something like
\[ \frac{x (\log x)^{k-1}}{(k - 1)!} \]
with some lower order terms we can work out. The main term will be $x$ times a polynomial in $\log x$ of degree $k - 1$.

Exercise 5.5. Show we end up getting an asymptotic of the form
\[ \sum_{n \le x} d_k(n) = x P_k(\log x) + O(x^{1 - \delta_k}) \]
for $P_k$ a degree $k - 1$ polynomial.

Remark 5.6. Gauss’s circle problem is to show that the number of lattice points in a circle of radius $R$ is
\[ N(R) = \pi R^2 + O(R^{1/2 + \varepsilon}) \]
(the best currently known is only error $R^{2/3 - \delta}$). Dirichlet’s divisor problem is to show
\[ \sum_{n \le x} d(n) = x \log x + (2\gamma - 1) x + O(x^{1/4 + \varepsilon}). \]
In both cases, the main term is the area of the region you are considering. The best error known is only about $O(x^{1/3 - \delta})$.

5.4. Calculating the sum of squares of the divisor function. Now, we’d like to calculate
\[ \sum_{n \le x} d(n)^2. \]
We will instead calculate
\[ \sum_{n=1}^{\infty} \frac{d(n)^2}{n^s} = \prod_p \Big( 1 + \frac{4}{p^s} + \frac{9}{p^{2s}} + \cdots \Big). \]
Note that this will converge absolutely whenever we are to the right of 1. We have
\[ \zeta(s) = \prod_p \Big( 1 + \frac{1}{p^s} + \frac{1}{p^{2s}} + \cdots \Big). \]
We can write
\[ \sum_{n=1}^{\infty} \frac{d(n)^2}{n^s} = \prod_p \Big( 1 + \frac{4}{p^s} + \frac{9}{p^{2s}} + \cdots \Big) = \zeta(s)^4 F(s), \]
where
\[ F(s) = \prod_p \Big( 1 + \frac{\alpha}{p^{2s}} + \cdots \Big) \]
for some $\alpha$, which converges absolutely if $\mathrm{Re}(s) > 1/2$. One then obtains that
\[ \sum_{n \le x} d(n)^2 \sim C x (\log x)^3, \]
using the asymptotic for the coefficients of $\zeta^4$ as $x P_k(\log x)$ with $P_k$ of degree $k - 1$, with a power saving error term. We could similarly use the hyperbola method, writing
\[ d(n)^2 = (d_4 * f)(n) \]
for a multiplicative function $f$ with $f(p) = 0$.

Remark 5.7. When one actually calculates what
\[ \sum_n \frac{d(n)^2}{n^s} \]
is, one might find something like
\[ \frac{\zeta(s)^4}{\zeta(2s)}, \]
although this identity is not relevant to finding the correct asymptotic formula.

Exercise 5.8 (Fun exercise!). Let $a(n)$ be the number of abelian groups of order $n$. First make a guess for
\[ \sum_{n \le x} a(n) \]
asymptotically. Then compute the asymptotics. Hint: Use the identity for the partition function
\[ \sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} (1 - x^n)^{-1} \]
to get the constant in the asymptotics, which ends up being something like $\zeta(2) \cdot \zeta(3) \cdots$.

Remark 5.9. One might also try computing
\[ \sum_{n \le x} d_\pi(n) \quad \text{or} \quad \sum_{n \le x} d_i(n), \]
where $d_\pi(n)$ are the coefficients of the Dirichlet series of $\zeta(s)^\pi$. Relatedly, one might try to count
\[ \sum_{n \le x,\ n = a^2 + b^2} 1, \]
and one can work out
\[ \sum_{n = a^2 + b^2} \frac{1}{n^s} = \zeta(s)^{1/2} L(s, \chi_{-4})^{1/2} F(s), \]
where $F(s)$ is regular a bit to the left of 1.

It’s not completely obvious how these functions continue analytically. We could make sense of
\[ \zeta(s)^\pi = \exp\big( \pi \log \zeta(s) \big), \]
which makes sense to the right of 1. But, it also can be extended to regions where there are no zeros or poles of the $\zeta$ function. If we understand the zero-free region of the zeta function, then we can make sense of this function in this zero-free region. In the region
\[ \sigma > 1 - \frac{c}{\log T} \]
(for $s = \sigma + it$ with $|t| \le T$), $\zeta(s) \ne 0$, as shown in Davenport. It turns out this function has a singularity at $s = 1$ which is not a pole (nor essential nor removable), and the sum turns out to be something like
\[ \frac{x (\log x)^{\pi - 1}}{\Gamma(\pi)} \]
for the function $d_\pi(n)$.

This idea is called the Selberg-Delange method (or in a paper today on arXiv, the LSD method).

Remark 5.10. We only wanted an upper bound for
\[ \sum_{n \le x} d(n)^2. \]
We didn’t need an asymptotic. The following trick is called Rankin’s method in analytic number theory. For $\alpha > 1$, we can bound
\[ \sum_{n \le x} d(n)^2 \le x^\alpha \sum_{n \le x} \frac{d(n)^2}{n^\alpha} \le x^\alpha \sum_{n=1}^{\infty} \frac{d(n)^2}{n^\alpha} = x^\alpha \prod_p \Big( 1 + \frac{4}{p^\alpha} + \cdots \Big). \]
Then, we want to optimize to choose the best $\alpha$. Making $\alpha$ close to 1, the product blows up and $x^\alpha$ gets small. From calculus, there will be some choice of $\alpha$ which minimizes this product. For example, if you guess $\alpha = 1 + \frac{1}{\log x}$, you find $x^\alpha$ is about $x$ and
\[ \prod_p \Big( 1 + \frac{4}{p^\alpha} + \cdots \Big) \sim \zeta\Big( 1 + \frac{1}{\log x} \Big)^4. \]

This yields x (log x)4 as a bound, and you are only off by one log. Exercise 5.11. Verify this. Exercise 5.12. Let p(n) denote the number of partitions of n. Prove that  √  p(n) ≤ exp π 2/3n .

Moreover, find the optimal constant $\alpha$ so that $p(n) \le e^{\alpha \sqrt{n}}$. (Sound thinks the constant above is optimal.)
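As a numerical sanity check on Exercise 5.12, here is a minimal sketch (not part of the lecture) that computes $p(n)$ by expanding the product $\prod_k (1 - x^k)^{-1}$ and compares it with the claimed bound and with the Hardy-Ramanujan asymptotic. The function names and the cutoff 200 are illustrative choices.

```python
import math

def partition_numbers(limit):
    """Return [p(0), ..., p(limit)] by multiplying out prod_{k>=1} (1 - x^k)^{-1}."""
    p = [1] + [0] * limit
    for part in range(1, limit + 1):
        # Multiplying by 1/(1 - x^part) allows any number of parts of this size.
        for total in range(part, limit + 1):
            p[total] += p[total - part]
    return p

def exercise_bound(n):
    """The upper bound exp(pi * sqrt(2n/3)) from Exercise 5.12."""
    return math.exp(math.pi * math.sqrt(2 * n / 3))

def hardy_ramanujan(n):
    """The Hardy-Ramanujan main term exp(pi * sqrt(2n/3)) / (4 n sqrt(3))."""
    return exercise_bound(n) / (4 * n * math.sqrt(3))

p = partition_numbers(200)
for n in (10, 50, 100, 200):
    # Columns: n, p(n), p(n)/bound (stays below 1), p(n)/asymptotic (tends to 1).
    print(n, p[n], p[n] / exercise_bound(n), p[n] / hardy_ramanujan(n))
```

The last column approaches 1 slowly, which is consistent with the exponential constant $\pi\sqrt{2/3}$ being the right one.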

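Hardy and Ramanujan found
\[ p(n) \sim \frac{\exp\left(\pi \sqrt{2n/3}\right)}{4 n \sqrt{3}}. \]
Hint: Show
\[ \sum_{n=0}^{\infty} p(n) x^n = \prod_{n=1}^{\infty} (1 - x^n)^{-1}. \]
Then, for $0 < x < 1$,
\[ p(N) \le x^{-N} \prod_{n=1}^{\infty} (1 - x^n)^{-1}. \]
Exercise 5.13 (Fun mathoverflow problem). Let $N$ be a parameter. How many subsets $S \subset [1, N]$ are there so that
\[ \sum_{s \in S} \frac{1}{s} < 1? \]
Obviously the answer is $\le 2^N$, and the exercise is to find a better bound. Hint: This is not an application of what we've discussed, but it is an application of the ideas we've discussed.
5.5. Returning to our type 2 sum. Recall we had $A > U$, $B > V$. We can now bound
\[ \left(\sum_{m \sim A} d(m)^2\right)^{1/2} \left(\sum_{n \sim B} \Lambda(n)^2\right)^{1/2} \log q \left(\frac{\sqrt{AB}}{\sqrt{q}} + \sqrt{A} + \sqrt{B} + \sqrt{q}\right) \]
\[ \ll \left(A (\log A)^3\right)^{1/2} (B \log B)^{1/2} \log q \left(\frac{\sqrt{AB}}{\sqrt{q}} + \sqrt{A} + \sqrt{B} + \sqrt{q}\right) \]
\[ \ll (\log N)^3 \left(\frac{AB}{\sqrt{q}} + A\sqrt{B} + B\sqrt{A} + \sqrt{qAB}\right) \]
\[ \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \frac{N}{\sqrt{V}} + \frac{N}{\sqrt{U}} + \sqrt{qN}\right), \]
using for the last step that $AB \le N$, $A > U$ and $B > V$. We are doing well here because both variables $A$ and $B$ vary only in long ranges (i.e., $U$ and $V$ are reasonably large). We then have to add the error from the type 1 sum, which was
\[ \ll (\log N)^2 \left(\frac{N}{q} + q + UV + \frac{N}{U}\right). \]
Adding these together, we get
\[ S(\alpha) \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \frac{N}{\sqrt{V}} + \frac{N}{\sqrt{U}} + \sqrt{qN} + UV\right). \]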

By symmetry, we may as well choose $U = V$, and so we should optimize by choosing $N/\sqrt{U} = U^2$. Hence, $U = N^{2/5}$. Then, one obtains
\[ S(\alpha) \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \sqrt{qN} + N^{4/5}\right). \]
There is one small caveat: we still need to separate the variables $m$ and $n$, which are tied together by the condition $mn \le N$. We'll have to finish this next time. Believing this for the moment, we've proven the bound above. So, if
\[ \frac{N}{(\log N)^{10}} > q > (\log N)^{10}, \]
each term on the right-hand side is $\ll N/(\log N)^2$. So, as long as we can approximate $\alpha$ by some rational $a/q$ with $q$ in this range, we get a good bound. These will be called the "minor arcs."
Theorem 5.14. Let $\varphi$ be the golden ratio. Then,
\[ \sum_{n \le N} \Lambda(n) e(n\varphi) \ll N^{4/5} (\log N)^3. \]
Proof. We can take $q$ to be around $\sqrt{N}$ using Fibonacci number approximations to $\varphi$, plugged into the above formula, and then the bound is $(\log N)^3 N^{4/5}$. ∎
Remark 5.15. The bound also works well for bounding things like
\[ \sum_{n \le N} \Lambda(n) e^{in}, \]
using that
\[ \left|\frac{1}{\pi} - \frac{a}{q}\right| \ge \frac{C}{q^{20}}, \]
so one can always find a pretty good rational approximation whose denominator is not too small. So, we get a bound of about
\[ \sum_{n \le N} \Lambda(n) e^{in} \ll N^{.99}. \]
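The Fibonacci approximations used in the proof of Theorem 5.14 can be checked directly. The sketch below (an illustration, not from the lecture) lists the convergents $F_{k+1}/F_k$ of the golden ratio, verifies $|\varphi - a/q| < 1/q^2$ in floating point, and picks the largest Fibonacci denominator below $\sqrt{N}$. The helper names are hypothetical.

```python
import math

PHI = (1 + math.sqrt(5)) / 2  # the golden ratio

def fibonacci_convergents(count):
    """Yield (a, q) = (F_{k+1}, F_k), the continued fraction convergents of PHI."""
    a, q = 1, 1
    for _ in range(count):
        yield a, q
        a, q = a + q, a

def fibonacci_denominator_near_sqrt(N):
    """Largest Fibonacci number q with q <= sqrt(N)."""
    target = math.isqrt(N)
    best = 1
    for _, q in fibonacci_convergents(200):
        if q > target:
            break
        best = q
    return best

for a, q in fibonacci_convergents(15):
    # Second column: q^2 * |PHI - a/q|, which stays below 1 (it tends to 1/sqrt(5)).
    print(f"{a}/{q}", q * q * abs(PHI - a / q))

print(fibonacci_denominator_near_sqrt(10**6))
```

Consecutive Fibonacci numbers differ by a bounded factor, so a denominator within a constant multiple of $\sqrt{N}$ always exists, which is what the proof needs.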

6. 10/10/17 6.1. Exercises and questions. Last time, we let q be a number with

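\[ \left|\alpha - \frac{a}{q}\right| < \frac{1}{q^2} \]
and showed
\[ \sum_{n \le N} \Lambda(n) e(n\alpha) \ll (\log N)^3 \left(\frac{N}{\sqrt{q}} + \sqrt{qN} + N^{4/5}\right). \]
Exercise 6.1. Let $\phi$ be the Euler totient function, and consider
\[ \sum_{n \le N} \left(\frac{\phi(n)}{n}\right)^k. \]
Find asymptotics for this. Why might these asymptotics be interesting?
6.2. Recapping what we have seen in the Proof of Vinogradov's theorem. Last time, we had some sum in terms of $m \sim A$, $n \sim B$ with a condition $mn \le N$. We want to separate $m$ and $n$. For $c > 0$ we have
\[ \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \left(\frac{N}{mn}\right)^s \frac{ds}{s} = \begin{cases} 1 & \text{if } mn \le N \\ 0 & \text{if } mn > N. \end{cases} \]
When we plug this into our bilinear form
\[ \sum_{m \sim A} \sum_{n \sim B} f_1(m) f_2(n) e(mn\alpha) \]
(for appropriate $f_1, f_2$) we get
\[ \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \sum_{m \sim A} \sum_{n \sim B} \frac{f_1(m) f_2(n) e(mn\alpha)}{m^s n^s} N^s \frac{ds}{s}. \]
This removes the condition $mn \le N$ tying the variables together, at the cost of a factor of $\log N$ when we integrate $\frac{ds}{s}$.
Question 6.2 (Possibly open question). We have
\[ \sum_{n \le N} \Lambda(n) e(n\varphi) \ll N^{4/5 + \varepsilon} \]
for $\varphi$ the golden ratio. Can one say something better? Presumably the right answer is $N^{1/2}$, though that may be hard. Maybe one could show something like $N^{3/4}$. The key is that we have rational approximations at every scale.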

Recapping what we have done so far, we were trying to bound
\[ \int_0^1 S(\alpha)^3 e(-N\alpha) \, d\alpha. \]
We split this up into major and minor arcs. On the minor arcs $\mathfrak{m}$, we bounded this by
\[ \left|\int_{\mathfrak{m}} S(\alpha)^3 e(-N\alpha) \, d\alpha\right| < \left(\max_{\alpha \in \mathfrak{m}} |S(\alpha)|\right) N \log N. \]
We expect a main term on the order of $N^2$. We have a good bound on $S(\alpha)$ so long as
\[ \frac{N}{(\log N)^{10}} \ge q \ge (\log N)^{10}. \]
The minor arcs will be all points which satisfy an approximation of this type, and the major arcs will be all points which do not. Let $Q = \frac{N}{(\log N)^{10}}$. By Dirichlet's theorem, for all $\alpha \in (0, 1)$, there exist $(a, q) = 1$, $q \le Q$, with
\[ \left|\alpha - \frac{a}{q}\right| \le \frac{1}{qQ} \le \frac{1}{q^2}. \]
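Dirichlet's theorem is constructive enough to test by brute force. The following sketch (an illustration under the stated conventions, with a hypothetical function name) searches denominators $q \le Q$ for one with $|q\alpha - a| \le 1/Q$, which is the same as $|\alpha - a/q| \le 1/(qQ)$.

```python
import math

def dirichlet_approximation(alpha, Q):
    """Return coprime (a, q) with 1 <= q <= Q and |alpha - a/q| <= 1/(q*Q).

    Dirichlet's theorem guarantees such a pair exists; we find the first
    admissible q by direct search, which is fine for small Q.
    """
    for q in range(1, Q + 1):
        a = round(alpha * q)
        if abs(alpha * q - a) <= 1 / Q:
            g = math.gcd(a, q)
            # Reducing the fraction only shrinks q, so the inequality still holds.
            return a // g, q // g
    raise ValueError("no approximation found (floating-point rounding issue?)")

alpha = math.sqrt(2) - 1
for Q in (10, 100, 1000):
    a, q = dirichlet_approximation(alpha, Q)
    print(Q, (a, q), abs(alpha - a / q), 1 / (q * Q))
```

In the circle method one then sorts $\alpha$ into a major or minor arc according to whether the resulting $q$ is below or above $(\log N)^{10}$.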

Definition 6.3. We say $\alpha \in \mathfrak{m}$ (in a minor arc) if there exists such an approximation with
\[ q \ge (\log N)^{10}. \]
Otherwise, $\alpha \in \mathfrak{M}$ (in a major arc). That is,
\[ \left|\alpha - \frac{a}{q}\right| \le \frac{1}{qQ} \]
with $q < (\log N)^{10}$. The major arcs $\mathfrak{M}$ are disjoint. The total measure of the major arcs is
\[ |\mathfrak{M}| = \sum_{q \le (\log N)^{10}} \frac{2\phi(q)}{qQ} \sim \frac{C (\log N)^{10}}{Q}, \]
which is roughly $(\log N)^{20}/N$.

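We now wish to understand $S(\alpha)$ for $\alpha$ on a major arc. Let $\alpha = \frac{a}{q} + \beta$ for $q$ small, $|\beta| \le \frac{1}{qQ}$. The idea is to first understand $S(\frac{a}{q})$. Let's instead try to understand
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right). \]
6.3. and counting primes. To start, let us recall what the Riemann hypothesis says about counting primes up to $x$. Let $\Psi(x) = \sum_{n \le x} \Lambda(n)$ be the weighted count of prime powers up to $x$. The Riemann hypothesis implies
\[ \Psi(x) = x + O\left(x^{1/2 + \varepsilon}\right). \]
If one further assumes the generalized Riemann hypothesis, one finds, for $(a, q) = 1$,
\[ \Psi(x; q, a) = \frac{x}{\phi(q)} + O\left(x^{1/2 + \varepsilon}\right). \]
Further, the constant in $O(x^{1/2 + \varepsilon})$ is independent of $q$. In particular, since $\phi(q)$ is roughly $q$, we have a nice asymptotic for $q \le x^{1/2 - \varepsilon}$.
Conjecture 6.4 (Montgomery). We have
\[ \Psi(x; q, a) = \frac{x}{\phi(q)} + O\left(\frac{x^{1/2 + \varepsilon}}{\sqrt{q}}\right). \]
Plugging in the generalized Riemann hypothesis, we get
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{na}{q}\right) = \sum^{*}_{k \bmod q} \ \sum_{n \equiv k \bmod q,\ n \le x} \Lambda(n) \exp\left(\frac{ak}{q}\right) \]
\[ = \sum^{*}_{k \bmod q} \exp\left(\frac{ak}{q}\right) \left(\frac{x}{\phi(q)} + O\left(x^{1/2 + \varepsilon}\right)\right) \]
\[ = \frac{\mu(q)}{\phi(q)} x + O\left(q x^{1/2 + \varepsilon}\right), \]
where the superscript $*$ again means $k$ is coprime to $q$, and we are using the following exercise.
Exercise 6.5. Show $\sum^{*}_{k \bmod q} \exp\left(\frac{k}{q}\right) = \mu(q)$.
Now, suppose $(n, q) = 1$ and we want to express $\exp(n/q)$ in terms of characters $\chi \bmod q$.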

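Let $\chi_0$ denote the trivial character on $(\mathbb{Z}/q\mathbb{Z})^\times$. By orthogonality of characters, we can write
\[ \exp\left(\frac{n}{q}\right) = \sum^{*}_{k \bmod q} \exp\left(\frac{k}{q}\right) \frac{1}{\phi(q)} \sum_{\chi \bmod q} \chi(k) \overline{\chi}(n) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(n) \tau(\chi), \]
where
\[ \tau(\chi) = \sum_{k \bmod q} \chi(k) \exp\left(\frac{k}{q}\right). \]
Then,
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\chi) \sum_{n \le x} \Lambda(n) \overline{\chi}(an). \]
Define
\[ \Psi(x, \chi) := \sum_{n \le x} \Lambda(n) \chi(n). \]
Then,
\[ \Psi(x; q, a) = \sum_{n \le x,\ n \equiv a \bmod q} \Lambda(n) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(a) \Psi(x, \chi). \]
The generalized Riemann hypothesis (GRH) is essentially the statement that for $\chi = \chi_0$, we have $\Psi(x, \chi_0) = \Psi(x)$ up to a small error (so this case is just the usual Riemann hypothesis), and for $\chi \ne \chi_0$, we have
\[ |\Psi(x, \chi)| = O\left(x^{1/2 + \varepsilon}\right). \]
In the case $\chi = \chi_0$, we get the main term with
\[ \tau(\chi_0) = \sum^{*}_{k \bmod q} \exp\left(\frac{k}{q}\right) = \mu(q). \]
So, the main term is
\[ \frac{\mu(q)}{\phi(q)} \Psi(x). \]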
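The identity $\tau(\chi_0) = \mu(q)$ (Exercise 6.5) is easy to test numerically. This is a minimal sketch, with function names chosen here for illustration, that evaluates the Ramanujan sum directly from its definition, using the convention $\exp(x) = e^{2\pi i x}$ of these notes, and compares it with a trial-division Möbius function.

```python
import cmath
import math

def mobius(n):
    """Moebius function by trial division."""
    result = 1
    p = 2
    while p * p <= n:
        if n % p == 0:
            n //= p
            if n % p == 0:
                return 0  # a squared prime factor
            result = -result
        p += 1
    if n > 1:
        result = -result  # one remaining prime factor
    return result

def ramanujan_sum(q, n):
    """c_q(n): sum of e^{2 pi i a n / q} over 1 <= a <= q with gcd(a, q) = 1."""
    return sum(
        cmath.exp(2j * math.pi * a * n / q)
        for a in range(1, q + 1)
        if math.gcd(a, q) == 1
    )

for q in range(1, 13):
    # tau(chi_0) = c_q(1) should agree with mu(q) up to rounding error.
    print(q, mobius(q), round(ramanujan_sum(q, 1).real, 10))
```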

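Exercise 6.6 (A bit tricky, perhaps). Using orthogonality of characters, show that if $\chi$ is primitive mod $q$ (meaning it does not have a period properly dividing $q$) then $|\tau(\chi)| = \sqrt{q}$. Hint: See Davenport's section on Gauss sums.
Plugging this into the above formulas and the GRH bounds, we see a refined GRH bound
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x^{1/2 + \varepsilon}\right) + O\left(\sqrt{q} \, x^{1/2 + \varepsilon}\right). \]
So, compared to our previous bound with $O(q x^{1/2 + \varepsilon})$ error, the error is now only $O(\sqrt{q} \, x^{1/2 + \varepsilon})$.
We are not assuming GRH; rather, we want an unconditional proof, so the above discussion assuming GRH can now be ignored. We have
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \chi(a) \tau(\overline{\chi}) \Psi(x, \chi) = \frac{\mu(q)}{\phi(q)} \Psi(x, \chi_0) + O\left(\frac{\sqrt{q}}{\phi(q)} \sum_{\chi \ne \chi_0} |\Psi(x, \chi)|\right). \]
We have
\[ \Psi(x, \chi_0) = \Psi(x) + O\left((\log x)^2\right). \]
From the prime number theorem, we have
\[ \Psi(x) = x + O\left(x \exp\left(-c \sqrt{\log x}\right)\right) \]
for some $c > 0$. The key step in the proof of this is that the region
\[ \sigma > 1 - \frac{c}{\log(2 + |t|)} \]
has no zeros of the zeta function $\zeta(s)$, where $s = \sigma + it$. We therefore get
\[ \Psi(x, \chi_0) = \Psi(x) + O\left((\log x)^2\right) \sim x - \sum_{|\rho| \le T} \frac{x^\rho}{\rho} + O\left(\frac{x}{T}\right). \]
In the best case, we might have
\[ \Psi(x; \chi) \ll x \exp\left(-c \sqrt{\log x}\right) \]
for $\chi \ne \chi_0 \bmod q$ and $q \le \exp\left(\sqrt{\log x}\right)$.
The conclusion is that if
\[ q \le \exp\left(c \sqrt{\log x}\right) \]
for $c$ small, then
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c \sqrt{\log x}\right)\right). \]
In a similar way, one would like to show
\[ \Psi(x; q, a) = \frac{x}{\phi(q)} + O\left(x \exp\left(-c \sqrt{\log x}\right)\right). \]
The short version of the story is that we can basically do this, but with one important caveat, which is called a Landau-Siegel zero.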

6.4. Siegel Zeros. We want to understand $\Psi(x; \chi)$. If $\chi \ne \chi_0$, we have
\[ \Psi(x, \chi) = -\sum_{\rho_\chi,\ |\rho_\chi| \le T} \frac{x^{\rho_\chi}}{\rho_\chi} + O\left(\frac{x (\log x)^2}{T}\right), \]
where $\rho_\chi$ are the zeros of
\[ L(s, \chi) := \sum_{n=1}^{\infty} \frac{\chi(n)}{n^s}. \]
One can find proofs of all of these things in Davenport. If $\chi$ is primitive then $L(s, \chi)$ has a functional equation in terms of
\[ \left(\frac{q}{\pi}\right)^{s/2} \Gamma\left(\frac{s + \alpha}{2}\right) L(s, \chi), \]
where $\alpha$ is either 0 or 1 depending on whether $\chi(-1) = 1$ (so $\alpha = 0$) or $\chi(-1) = -1$ (so $\alpha = 1$). This relates the value at $s$ to the value at $1 - s$. One can count the number of zeros of $\zeta(s)$ or $L(s, \chi)$ up to height $T$, which is approximately
\[ \frac{T \log qT}{2\pi}. \]
It is also useful to know the Hadamard factorization, which, once you know this is an order 1 function, tells you it has a factorization in terms of its zeros. That is,
\[ L(s, \chi) = e^{A + Bs} \prod_\rho \left(1 - \frac{s}{\rho}\right) e^{s/\rho}. \]
So the sum of the reciprocals of the squares of the zeros converges, but possibly not the sum of the reciprocals of the zeros.
For the zeta function we have
\[ \xi(s) = s(s - 1) \pi^{-s/2} \Gamma(s/2) \zeta(s), \]
where the factor $s - 1$ kills the pole at $s = 1$. This satisfies the functional equation $\xi(s) = \xi(1 - s)$.
The main difference between the $L$-functions and the $\zeta$ function lies in what we know about the zero-free region. For $\zeta(s)$, there is a zero-free region of the form
\[ \sigma > 1 - \frac{c}{\log(2 + |t|)}, \]
with $\sigma = \mathrm{Re}\, s$. We want the region
\[ \sigma > 1 - \frac{c}{\log(q(2 + |t|))} \]
to be free of zeros of $L(s, \chi)$. This would imply
\[ \Psi(x, \chi) = O\left(x \exp\left(-c \sqrt{\log x}\right)\right) \]
for $q \le \exp\left(c \sqrt{\log x}\right)$. This holds if $\chi$ is a complex character mod $q$. But for quadratic characters $\chi \bmod q$, there is the unfortunate possibility that there could be one exceptional real simple zero $\beta$.
Theorem 6.7 (Siegel). Let $\beta$ be the possible zero of $L(s, \chi)$ as above. Then,
\[ \beta < 1 - \frac{C(\varepsilon)}{q^\varepsilon} \]
for any $\varepsilon > 0$, for some constant $C(\varepsilon)$ which cannot be computed (i.e., the proof is ineffective).
Next time we'll say a bit more about Siegel's theorem. It might be helpful to review things about the prime number theorem in progressions, which we will go over as needed on Thursday. Sound also says he is happy to look at or discuss solutions if you do end up solving problems.

7. 10/12/17
7.1. Review. Let $\chi$ be some character $(\mathbb{Z}/q\mathbb{Z})^\times \to \mathbb{C}^\times$. We have
\[ \Psi(x, \chi) = \sum_{n \le x} \Lambda(n) \chi(n). \]
If $\chi = \chi_0$, we have
\[ \Psi(x, \chi_0) = \Psi(x) + O\left((\log x)^2\right) = x + O\left(x \exp\left(-c \sqrt{\log x}\right)\right). \]
If $\chi \ne \chi_0$, GRH implies $|\Psi(x, \chi)| \ll x^{1/2 + \varepsilon}$. We would like an unconditional bound around
\[ (7.1) \qquad |\Psi(x, \chi)| \ll x \exp\left(-c \sqrt{\log x}\right), \]
and we would like this to hold when $q \le \exp\left(\sqrt{\log x}\right)$. We have
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\overline{\chi}) \chi(a) \Psi(x, \chi). \]
The main term comes from $\chi = \chi_0$, where $\tau(\chi_0) = \mu(q)$, using an exercise on computing Gauss sums from last time. The error term, assuming GRH, is of the form $q^{1/2} x^{1/2 + \varepsilon}$. If Equation 7.1 holds, then we obtain
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c \sqrt{\log x}\right)\right) \]
in the range $q \le \exp\left(c \sqrt{\log x}\right)$. We don't actually know Equation 7.1, but for our application to sums of three primes, we thought of $q$ as only going up to $(\log x)^{10}$ and not all the way to $\exp\left(c \sqrt{\log x}\right)$.
If $\chi$ is complex (i.e., not a real character) then
\[ |\Psi(x, \chi)| \ll x \exp\left(-c \sqrt{\log x}\right) \]
for $q \le \exp\left(c \sqrt{\log x}\right)$, because then there are no zeros of $L(s, \chi)$ for
\[ \sigma > 1 - \frac{c}{\log(q(2 + |t|))}, \]
with $\sigma = \mathrm{Re}\, s$. If instead $\chi$ is real (i.e., quadratic), then the zero-free region above holds except possibly for one real simple zero.
Theorem 7.1 (Siegel). The real zero $\beta$ (if it exists) must satisfy
\[ \beta < 1 - \frac{C(\varepsilon)}{q^\varepsilon} \]
for any $\varepsilon > 0$ and some ineffective constant $C(\varepsilon)$.
Remark 7.2. If the zero does not exist, then we can obtain Equation 7.1. If there does exist a Siegel zero for $\chi \bmod q$, then
\[ \Psi(x, \chi) = -\frac{x^\beta}{\beta} + O\left(x \exp\left(-c \sqrt{\log x}\right)\right). \]
If $q \le (\log x)^A$, then by choosing $\varepsilon$ small enough we can ensure $\beta < 1 - \frac{C}{\sqrt{\log x}}$, and then we can absorb the main term $-\frac{x^\beta}{\beta}$ for $\Psi(x, \chi)$ into the error term. In the presence of the Siegel zero, we can only get this desired uniform result for $q \le (\log x)^A$, but not for $q \le \exp\left(c \sqrt{\log x}\right)$. This is also ineffective. Therefore, we obtain
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c \sqrt{\log x}\right)\right). \]
Remark 7.3. Suppose $\beta$ is very close to 1. Pretend $\beta = 1$. Then there is one character $\chi \bmod q$ so that $\Psi(x, \chi)$ is approximately $-x$. Think of
\[ \Psi(x; q, a) = \frac{1}{\phi(q)} \sum_{\chi \bmod q} \overline{\chi}(a) \Psi(x, \chi) \approx \frac{x}{\phi(q)} - \frac{\chi(a)}{\phi(q)} \frac{x^\beta}{\beta}, \]
where the exceptional $\chi$ is real, so $\overline{\chi}(a) = \chi(a)$. Then half of the progressions get most of the primes and the other half get almost none of them (depending on whether $\chi(a) = \pm 1$).
7.2. Proving Vinogradov's theorem using Siegel's theorem. For the moment, we'll assume Siegel's theorem and finish the proof of Vinogradov's theorem. We'll later come back to discuss Siegel's theorem. We have seen that if $q \le (\log x)^A$ ($A$ is around 10) then
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right) = \frac{\mu(q)}{\phi(q)} x + O\left(x \exp\left(-c \sqrt{\log x}\right)\right). \]
The major arcs are of the form

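\[ \left|\alpha - \frac{a}{q}\right| \le \frac{1}{qQ}, \]
with
\[ Q = \frac{N}{(\log N)^{10}} \]
for $q \le (\log N)^{10}$. Recall we have already bounded the minor arcs, and we are now trying to evaluate the major arcs. Set $\alpha = \frac{a}{q} + \beta$. We would like to understand
\[ S(\alpha) := \sum_{n \le N} \Lambda(n) \exp\left(\frac{an}{q}\right) \exp(n\beta). \]
We can think of the summand as the product of $\Lambda(n) \exp\left(\frac{an}{q}\right)$, whose partial sums we understand, and $\exp(n\beta)$, which doesn't vary very much. So,
\[ S(\alpha) = \int_1^N \exp(x\beta) \, d\left(\sum_{n \le x} \Lambda(n) \exp\left(\frac{an}{q}\right)\right) \]
\[ = \frac{\mu(q)}{\phi(q)} \int_1^N \exp(x\beta) \, dx + O\left(N \exp\left(-c \sqrt{\log N}\right)\right) + O\left(\int_1^N |\beta| x \exp\left(-c \sqrt{\log x}\right) dx\right) \]
\[ = \frac{\mu(q)}{\phi(q)} \int_1^N \exp(x\beta) \, dx + O\left((1 + N|\beta|) N \exp\left(-c \sqrt{\log N}\right)\right) \]
\[ = \frac{\mu(q)}{\phi(q)} \int_1^N \exp(x\beta) \, dx + O\left(N \exp\left(-c \sqrt{\log N}\right)\right). \]
Here we used integration by parts to get the above bounds on the error terms, and then we used that $|\beta| \le \frac{1}{qQ}$; we might have to adjust the constant $c$ to absorb some factors of $\log N$.
Remark 7.4. The above bound makes sense: If $\alpha$ is very close to $\frac{a}{q}$, we pick up the same error term we had before. But if $\alpha$ is farther away, then the error term should grow approximately proportionally to $N|\beta|$, which indeed it does.
We now want to evaluate the major arc contribution
\[ \int_{\mathfrak{M}} S(\alpha)^3 e(-N\alpha) \, d\alpha. \]
We are hoping this is of size $N^2 \cdot C$ with $C$ some constant we can evaluate. Indeed,
\[ \int_{\mathfrak{M}} S(\alpha)^3 e(-N\alpha) \, d\alpha = \sum_{q \le (\log N)^{10}} \ \sum^{*}_{a \bmod q} \int_{-1/qQ}^{1/qQ} S\left(\frac{a}{q} + \beta\right)^3 \exp\left(-N\left(\frac{a}{q} + \beta\right)\right) d\beta. \]
We know
\[ S\left(\frac{a}{q} + \beta\right)^3 = \frac{\mu(q)}{\phi(q)^3} \left(\int_0^N \exp(x\beta) \, dx\right)^3 + O\left(N^3 \exp\left(-c \sqrt{\log N}\right)\right). \]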

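The error term in the integral over the major arcs is then
\[ O\left(\sum_{q \le (\log N)^{10}} N^3 \exp\left(-c \sqrt{\log N}\right) \frac{1}{Q}\right) = O\left(N^2 \exp\left(-c \sqrt{\log N}\right)\right). \]
So the error terms are under control. We now want to understand the main term. The main term is almost independent of $a$ except for the factor $\exp\left(-N\left(\frac{a}{q} + \beta\right)\right)$. We want to understand the main term of
\[ \int_{\mathfrak{M}} S(\alpha)^3 e(-N\alpha) \, d\alpha, \]
which is
\[ \sum_{q \le (\log N)^{10}} \ \sum^{*}_{a \bmod q} \int_{-1/qQ}^{1/qQ} \frac{\mu(q)}{\phi(q)^3} \left(\int_0^N \exp(x\beta) \, dx\right)^3 \exp\left(-N\left(\frac{a}{q} + \beta\right)\right) d\beta. \]
Recall the Ramanujan sum
\[ c_q(N) := \sum_{(a, q) = 1} \exp\left(\frac{aN}{q}\right). \]
The main term is then
\[ \sum_{q \le (\log N)^{10}} \frac{\mu(q)}{\phi(q)^3} c_q(N) \int_{-1/qQ}^{1/qQ} \left(\int_0^N \exp(x\beta) \, dx\right)^3 e(-N\beta) \, d\beta. \]
We can now replace
\[ \int_0^N \exp(x\beta) \, dx = N \int_0^1 \exp(Nx\beta) \, dx. \]
This yields, after the further substitution $\beta \mapsto \beta/N$,
\[ \int_{-1/qQ}^{1/qQ} \left(\int_0^N \exp(x\beta) \, dx\right)^3 e(-N\beta) \, d\beta = N^3 \int_{-1/qQ}^{1/qQ} \left(\int_0^1 \exp(Nx\beta) \, dx\right)^3 e(-N\beta) \, d\beta \]
\[ = N^2 \int_{-N/qQ}^{N/qQ} \left(\int_0^1 \exp(x\beta) \, dx\right)^3 e(-\beta) \, d\beta \]
\[ = N^2 \left(\int_{-\infty}^{\infty} \left(\int_0^1 \exp(x\beta) \, dx\right)^3 e(-\beta) \, d\beta + O\left(\frac{q^2 Q^2}{N^2}\right)\right), \]
where the last step uses that the tail is
\[ \int_{|\beta| > N/qQ} O\left(\frac{d\beta}{|\beta|^3}\right) = O\left(\frac{q^2 Q^2}{N^2}\right). \]
The integral over all of $\mathbb{R}$ above is called the singular integral. Plugging in this remainder term, and using $|c_q(N)| \le \phi(q)$, we get that the error contribution is
\[ O\left(Q^2 \sum_{q \le (\log N)^{10}} \frac{\phi(q) q^2}{\phi(q)^3}\right) = O\left(Q^2 (\log N)^{10}\right) = O\left(\frac{N^2}{(\log N)^{10}}\right). \]
So, we can replace our integral from $-N/qQ$ to $N/qQ$ by an integral going off to infinity. This integral is essentially computing the number of ways to write $N$ as a sum of three numbers, which is essentially $N^2/2$. But, we can also compute it since this is essentially a Fourier transform. That is,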

\[ N^2 \int_{-\infty}^{\infty} \left(\int_0^1 \exp(x\beta) \, dx\right)^3 e(-\beta) \, d\beta \]
involves the cube of the Fourier transform of $\chi_{[0,1]}$, which is the Fourier transform of the convolution $\chi_{[0,1]} * \chi_{[0,1]} * \chi_{[0,1]}$, and for this convolution evaluated at 1 we get
\[ \int_{t_1, t_2, t_3 \in [0,1]} \delta(t_1 + t_2 + t_3 = 1) = \frac{1}{2}. \]
So one can use Parseval's identity to compute the integral. Here, Parseval is counting the number of ways of writing 1 as a sum of three real numbers in $[0, 1]$. Before, we were writing $N$ as a sum of three integers.
Now, let's finish our calculation. We were trying to compute
\[ \sum_{q \le (\log N)^{10}} \frac{\mu(q)}{\phi(q)^3} c_q(N) \int_{-1/qQ}^{1/qQ} \left(\int_0^N \exp(x\beta) \, dx\right)^3 e(-N\beta) \, d\beta, \]
which is approximated, using our above discussion, by
\[ \frac{N^2}{2} \sum_{q \le (\log N)^{10}} \frac{\mu(q)}{\phi(q)^3} c_q(N). \]
This sum is called the singular sum. The tail of this sum is roughly
\[ O\left(\sum_{q > (\log N)^{10}} \frac{1}{\phi(q)^2}\right) = O\left((\log N)^{-10}\right). \]

Therefore, the main term is of the form

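\[ \frac{N^2}{2} \sum_{q=1}^{\infty} \frac{\mu(q)}{\phi(q)^3} c_q(N). \]
Let
\[ \mathfrak{S}(N) := \sum_{q=1}^{\infty} \frac{\mu(q)}{\phi(q)^3} c_q(N). \]
Using the Chinese remainder theorem, we have $c_{p_1 p_2}(N) = c_{p_1}(N) c_{p_2}(N)$ for distinct primes $p_1, p_2$. So the sum factors as an Euler product,
\[ \mathfrak{S}(N) = \prod_p \left(1 - \frac{c_p(N)}{(p - 1)^3}\right). \]
Then,
\[ c_p(N) = \sum_{a=1}^{p-1} \exp\left(\frac{aN}{p}\right) = \begin{cases} -1 & \text{if } p \nmid N \\ p - 1 & \text{if } p \mid N. \end{cases} \]
Then,
\[ \mathfrak{S}(N) = \prod_p \left(1 - \frac{c_p(N)}{(p - 1)^3}\right) = \prod_{p \nmid N} \left(1 + \frac{1}{(p - 1)^3}\right) \cdot \prod_{p \mid N} \left(1 - \frac{1}{(p - 1)^2}\right). \]
Remark 7.5. We see this vanishes when $N$ is even, since the factor at $p = 2$ is then $1 - 1 = 0$. When $N$ is even, the major arc at 0 is cancelled by the major arc at 1/2. And in general, the major arc at $a/q$ is cancelled by a similar one at $a/2q$.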
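To see the singular series in action, the sketch below (an illustration, not from the lecture) compares a truncation of the sum $\sum_q \mu(q) c_q(N)/\phi(q)^3$ with a truncation of the Euler product, and confirms numerically that both are essentially zero for even $N$. The Ramanujan sums are computed straight from the definition; the truncation points and helper names are arbitrary choices made here.

```python
import math

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent}."""
    factors = {}
    p = 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def mobius(n):
    f = factorize(n)
    return 0 if any(e > 1 for e in f.values()) else (-1) ** len(f)

def phi(n):
    result = n
    for p in factorize(n):
        result -= result // p
    return result

def ramanujan_sum(q, n):
    """c_q(n) from the definition; the imaginary parts cancel, so sum cosines."""
    total = sum(math.cos(2 * math.pi * a * n / q)
                for a in range(1, q + 1) if math.gcd(a, q) == 1)
    return round(total)

def singular_series_sum(N, q_max):
    """Truncation of sum_q mu(q) c_q(N) / phi(q)^3 at q <= q_max."""
    return sum(mobius(q) * ramanujan_sum(q, N) / phi(q) ** 3
               for q in range(1, q_max + 1) if mobius(q) != 0)

def singular_series_product(N, p_max):
    """Truncation of the Euler product at primes p <= p_max."""
    value = 1.0
    for p in range(2, p_max + 1):
        if factorize(p) == {p: 1}:  # p is prime
            if N % p == 0:
                value *= 1 - 1 / (p - 1) ** 2
            else:
                value *= 1 + 1 / (p - 1) ** 3
    return value

for N in (15, 101, 30):
    print(N, singular_series_sum(N, 300), singular_series_product(N, 300))
```

For odd $N$ the two truncations agree to a few decimal places and stay bounded away from zero, which is the lower bound on $\mathfrak{S}(N)$ the proof relies on.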

Finally, we have

\[ \int_0^1 S(\alpha)^3 e(-N\alpha) \, d\alpha = \sum_{n_1 + n_2 + n_3 = N} \Lambda(n_1) \Lambda(n_2) \Lambda(n_3) = \frac{N^2}{2} \mathfrak{S}(N) + O\left(\frac{N^2}{\log N}\right). \]
Because the contribution of prime squares and cubes is negligible, we get that every sufficiently large odd number is the sum of three primes (and in fact it is the sum of three primes in many ways), where here we are using that $\mathfrak{S}(N) \ge c$ for all odd $N$, where $c$ is some universal constant bounded below by
\[ 2 \cdot \prod_{p \ge 3} \left(1 - \frac{1}{(p - 1)^2}\right). \]
This finishes the proof.
Remark 7.6 (Philosophy). In suitable situations, we'd like to say we can get an answer by counting contributions at each place and then multiplying them together. For example, say we'd like to count the number of ways to write $2N = p_1 + p_2$. We can try to do the same computation mod $p$ (i.e., counting the number of $(a, b)$ so that $2N \equiv a + b \bmod p$ for $a, b$ relatively prime to $p$, and similarly over the infinite place). We could then guess
\[ \sum_{n_1 + n_2 = 2N} \Lambda(n_1) \Lambda(n_2) \sim \mathfrak{S}(N) \, 2N, \]
and we can then try to use the circle method to approximate $\int_0^1 S(\alpha)^2 e(-2N\alpha) \, d\alpha$. But we can no longer use Parseval's identity to bound the minor arcs because
\[ \int_0^1 |S(\alpha)|^2 \, d\alpha = \sum_{n \le 2N} \Lambda(n)^2 \sim 2N \log N, \]
which is already larger than the expected main term.
Remark 7.7. At the beginning of this course, we mentioned we could try to count the number of ways to write $N$ as a sum
\[ N = x_1^k + \cdots + x_s^k. \]
Letting $P = N^{1/k}$, this count is given by the integral
\[ \int_0^1 \left(\sum_{x \le P} \exp\left(x^k \alpha\right)\right)^s e(-N\alpha) \, d\alpha. \]
Then, we might expect about $P^s / N = P^{s - k}$ solutions. There might then be local obstructions (e.g., squares are always 0, 1 or 4 mod 8). Then, if $s$ is large enough in terms of 4k, one might try to show this can be done. Instead of trying to understand exponential sums over primes, we would want to understand exponential sums over powers. But, once $s \ge k + 1$ and there are no congruence obstructions, this sort of result should hold. For example, every large number should be a sum of four squares. But for three squares, there is a congruence obstruction: a number which is 7 mod 8 can never be written as a sum of three squares. It turns out you can write numbers as sums of 7 cubes. But for fifth powers, the problem turns out to be much harder.
Next time we'll talk about effectivity and Siegel's theorem.

8. 10/17/17
8.1. Exercises to solidify the ideas thus far. Here are two exercises, which are a bit longer and harder than usual.
Exercise 8.1 (Difficult exercise). Assume GRH. Give a bound for
\[ \sum_{n \le x} \Lambda(n) \exp(n\alpha) \]
for $\left|\alpha - \frac{a}{q}\right| \le \frac{1}{q^2}$ without using bilinear forms, but instead using GRH and thinking about the prime number theorem in arithmetic progressions. Hint: We discussed how to write
\[ \sum_{n \le x} \Lambda(n) \exp\left(\frac{na}{q}\right) \sim \frac{1}{\phi(q)} \sum_{\chi \bmod q} \chi(a) \tau(\overline{\chi}) \Psi(x, \chi), \]
and one can input information about $\Psi$ using GRH. One would then try to pass from $\exp\left(\frac{na}{q}\right)$ to $\exp\left(\frac{na}{q}\right) \exp(n\beta)$ with $|\beta| \le \frac{1}{qQ}$, and then one should try to obtain good minor arc estimates for this summation using a "quasi-Riemann hypothesis" (i.e., assuming there are no zeros with $\sigma \ge 2/3$). Unconditionally, we know information about primes in progressions up to some modulus. Assuming GRH, we know estimates for primes in progressions with modulus up to $\sqrt{x}$. We can then find approximations for numbers with the denominator up to $\sqrt{x}$. We can then use the prime number theorem for everything, and then we won't even have to worry about major and minor arcs; we can hit the whole problem in both cases using GRH. If one is more careful (via a result due to Hardy and Littlewood) it is enough to assume there are no zeros with $\sigma \ge 3/4$.
Remark 8.2. On GRH, one should be able to prove
\[ \sum_{n \le x} \Lambda(n) \exp(n\varphi) \ll x^{3/4 + \varepsilon}, \]
whereas Vinogradov's method only gave $x^{4/5}$. To get the 3/4 estimate, we would need Hardy and Littlewood's refinement, which is a refinement of the Gauss sum idea. One might decompose $\exp\left(\frac{an}{q}\right)$ as a sum of multiplicative characters. This incurs a loss of $\sqrt{q}$. When one writes it as
\[ \frac{1}{\phi(q)} \sum_{\chi \bmod q} \tau(\overline{\chi}) \chi(an), \]
one rewrites a number on the order of 1 as a sum whose terms add up, in absolute value, to a number on the order of $\sqrt{q}$. For $n \le x$, one can write $\exp(n\beta)$ in terms of an integral of the form
\[ \int_{|y| \le x} f(y) n^{iy} \, dy. \]
We try to replace this additive character in $n$ by multiplicative characters $n^{iy}$. A Gauss sum is then of the form
\[ \tau(\chi) = \sum_n \exp\left(\frac{n}{q}\right) \chi(n). \]
There will then be an integral which is an analog of a Gauss sum. One then saves a factor of $\sqrt{q}$, instead of just writing it naively by breaking it up into progressions.
Exercise 8.3 (Difficult exercise). This exercise is to prove a theorem of Davenport. Let $\mu(n)$ be the Möbius function. Show
\[ \sup_{\alpha \in \mathbb{R}} \left|\sum_{n \le x} \mu(n) \exp(n\alpha)\right| \ll_A \frac{x}{(\log x)^A} \]
for any $A > 0$. You will have to figure out what happens when $\alpha$ is on a major or minor arc. When $\alpha$ is on a minor arc, you will want to use Vinogradov's method and a bilinear estimate, which will give terms like $x/\sqrt{q} + \sqrt{qx}$. We used Vaughan's identity for obtaining bilinear forms for $-\frac{\zeta'}{\zeta}(s)$, and we would need an analog for $\frac{1}{\zeta}(s)$. One would take $\zeta \cdot M$ for $M$ a mollifier, and play around with powers of that. The case when $\alpha$ is on a major arc has to do with understanding
\[ \sum_{n \le x} \mu(n) \exp\left(\frac{an}{q}\right) \]
for $q \le (\log x)^A$. One might rewrite this in terms of $\chi \bmod q$. The goal would then be to understand
\[ \sum_{n \le x} \mu(n) \chi(n). \]
Many results holding for prime numbers also hold for the Möbius function. For studying primes we look at something like
\[ \frac{1}{2\pi i} \int_{(c)} -\frac{\zeta'}{\zeta}(s) x^s \frac{ds}{s}, \]
and in this case we would be looking at
\[ \frac{1}{2\pi i} \int_{(c)} \frac{1}{L(s, \chi)} x^s \frac{ds}{s}. \]
For primes there is a pole at $s = 1$, but the pole at $s = 1$ becomes a zero for $1/L$. So there is some cancellation. On the major arcs, there are savings with powers of log, and on the minor arcs there is another method using bilinear forms which gives savings of powers of log.
Exercise 8.4. Assuming GRH, show
\[ \sup_{\alpha \in \mathbb{R}} \left|\sum_{n \le x} \mu(n) \exp(n\alpha)\right| \ll x^{3/4 + \varepsilon}. \]
Maybe 5/6 instead of 3/4 would be easier to prove. Presumably the correct answer is $x^{1/2 + \varepsilon}$. The supremum is at least of order $x^{1/2}$ because Parseval tells us
\[ \int_0^1 \left|\sum_{n \le x} \mu(n) \exp(n\alpha)\right|^2 d\alpha = \sum_{n \le x} \mu(n)^2. \]
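The Parseval identity at the end of Exercise 8.4 can be confirmed on a computer: for any integer $M > x$, averaging $|\sum_{n \le x} \mu(n) e(n\alpha)|^2$ over the points $\alpha = j/M$ reproduces the integral exactly. The sketch below is illustrative only; the sieve and the choice $x = 100$, $M = 128$ are made here, not in the lecture.

```python
import cmath
import math

def mobius_upto(x):
    """Return a list mu with mu[n] = Moebius function of n for 0 <= n <= x (mu[0] = 0)."""
    mu = [1] * (x + 1)
    mu[0] = 0
    is_prime = [True] * (x + 1)
    for p in range(2, x + 1):
        if is_prime[p]:
            for m in range(p, x + 1, p):
                if m > p:
                    is_prime[m] = False
                mu[m] = -mu[m]          # one more prime factor
            for m in range(p * p, x + 1, p * p):
                mu[m] = 0               # divisible by a square
    return mu

def exponential_sum(mu, alpha):
    """S(alpha) = sum_{n <= x} mu(n) e^{2 pi i n alpha}."""
    return sum(mu[n] * cmath.exp(2j * math.pi * n * alpha)
               for n in range(1, len(mu)))

x, M = 100, 128
mu = mobius_upto(x)
values = [abs(exponential_sum(mu, j / M)) for j in range(M)]
mean_square = sum(v * v for v in values) / M
squarefree_count = sum(m * m for m in mu)
# The mean square equals the squarefree count, so max |S| >= sqrt(count).
print(mean_square, squarefree_count, max(values), math.sqrt(squarefree_count))
```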

Remark 8.5. The minor arc technology gave us an estimate of the form
\[ \frac{N}{\sqrt{q}} + \sqrt{qN} + E, \]
where $E$ is some error term endemic to the method. The first two terms are optimized when $q$ is on the order of $\sqrt{N}$, in which case the first two terms give $N^{3/4}$, so you cannot really do better than $N^{3/4}$ with this minor arcs method. We can write $\exp(n\alpha)$ in terms of
\[ \sum_\chi \int_t \chi(n) n^{it} f \]
for some function $f$. Here $n \le N$, $\alpha \in (0, 1)$. One then writes $\alpha = \frac{a}{q} + \beta$. One does not want to use too many $q$'s and too many $t$'s. Roughly, one uses $q$ characters $\chi$ and integrates over $t$ up to roughly $1 + |\beta| x$. One needs to balance what weight to put on the sum and what weight to put on the integral. You can always choose $q \le Q$, $|\beta| \le \frac{1}{qQ}$. Then, $1 + |\beta| N \le 1 + \frac{N}{Q}$. One can choose $Q \sim \sqrt{N}$ so the integral goes up to $\sqrt{N}$ and the sum adds over $\sqrt{N}$ terms. One loses $N^{1/2}$ in complexity when doing the above procedure. So it is very hard to beat $N^{3/4}$ in these major and minor arc estimates.
8.2. Zeros of ζ and L-functions. It's good to have some intuition for where the zeros come from and how you might prove these functions have a zero-free region. We'd like to prove $\zeta(1 + it) \ne 0$ and $L(1, \chi) \ne 0$, where
\[ L(1, \chi) = \prod_p \left(1 - \frac{\chi(p)}{p}\right)^{-1}. \]
We can consider the product
\[ \zeta(1 + it) = \prod_p \left(1 - \frac{1}{p^{1 + it}}\right)^{-1}. \]
How would you find $t$ with $|\zeta(1 + it)|$ being large or small? The small primes have a much bigger impact on this product than the large primes. The maximum impact occurs when the factors at the small primes are all as big, or all as small, as possible.

To make $|\zeta(1 + it)|$ large, we would like $p^{it} \sim 1$, and to make it small we would like $p^{it} \sim -1$, for many small primes $p$.
Exercise 8.6 (Simple exercise). Show that for any $N$, there are quadratic characters $\chi$ with $\chi(p) = 1$ for all $p \le N$ (and similarly $\chi(p) = -1$ for all $p \le N$).
Remark 8.7. Then, $\chi$ will occur to some modulus $q$, and $q$ might be very large in terms of $N$. If one tries to construct one of these via the Chinese remainder theorem, one might then have the modulus exponentially large in $N$.
Conjecture 8.8 (Vinogradov). If $\chi(p) = 1$ for all $p \le N$, then $q > N^A$ for arbitrarily large $A$. Similarly, if $\chi(p) = -1$ for all $p \le N$ then $q > N^A$ for all large $A$.
Example 8.9. The least quadratic non-residue must be smaller than $q^\varepsilon$, if Conjecture 8.8 were to hold true.
Remark 8.10. Think of the values $\chi(p)$ as coin tosses. The chance that all of the first $N$ primes land heads should be around $1/2^N$. So, one would expect the modulus to have to be exponentially large in the number of primes. Every once in a while, there can be a surprise. For example, take $D = -163$. Then,
\[ \left(\frac{-163}{p}\right) = -1 \]
for $p < 41$. There are 12 such primes. If you think of things as coin tosses, there would only be a $1/2^{12}$ chance of this (since there are 12 primes up to and including 37), but 163 is substantially smaller than $2^{12} = 4096$. So $L(1, \chi_{-163})$ should be very small. Indeed, the class number formula gives
\[ L(1, \chi_{-163}) = \frac{\pi h\left(\mathbb{Q}(\sqrt{-163})\right)}{\sqrt{163}} \sim \frac{\pi}{13}, \]
and the class number is 1 here (and $h$ denotes class number). Recall that the class number formula implies that for $\mathbb{Q}(\sqrt{-D})$,
\[ L(1, \chi_{-D}) = \frac{\pi h\left(\mathbb{Q}(\sqrt{-D})\right)}{\sqrt{D}}. \]
Goldfeld and Gross-Zagier's result gives an effective lower bound of the shape
\[ L(1, \chi_{-D}) \ge \frac{C \log D}{\sqrt{D}}. \]
Siegel's theorem implies that if $\chi$ is a quadratic character mod $q$, then
\[ L(1, \chi) \ge C(\varepsilon) q^{-\varepsilon} \]
for all $\varepsilon > 0$.
Remark 8.11. The zero-free region is determined as follows. If $L(\beta, \chi) = 0$, then by the mean value theorem
\[ L(1, \chi) = (1 - \beta) L'(\sigma, \chi) \]
for some $\beta \le \sigma \le 1$.
Exercise 8.12. Prove that if
\[ 1 - \frac{1}{\log q} \le \sigma \le 1, \]
then
\[ |L'(\sigma, \chi)| \le C (\log q)^2. \]
See Davenport. Essentially you can make sense of the series a little left of the 1 line. Then you can differentiate it and deduce this. So, a lower bound on $L(1, \chi)$ gives a bound on how close a zero can be to 1. So, if there's a bad Siegel zero it has to be bounded away from 1.
8.3. The zero-free region of the Zeta function. We'd like to instead look at the completed zeta function
\[ \xi(s) = s(s - 1) \pi^{-s/2} \Gamma(s/2) \zeta(s). \]
The functional equation says $\xi(s) = \xi(1 - s)$. The Hadamard product formula gives
\[ \xi(s) = e^{A + Bs} \prod_\rho \left(1 - \frac{s}{\rho}\right) e^{s/\rho}. \]
The trivial zeros of $\zeta$ come about because the $\Gamma(s/2)$ function has poles at $s = 0, -2, -4, \cdots$.
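The surprise in Remark 8.10 is easy to reproduce. The sketch below (an illustration, not from the lecture) uses the standard fact, from quadratic reciprocity, that for the prime $163 \equiv 3 \bmod 4$ the character $\chi_{-163}(n)$ equals the Legendre symbol $(n/163)$; this identification and the function name are assumptions made for the example. It checks the sign at the 12 primes below 41 and compares a partial sum of $L(1, \chi_{-163})$ with $\pi/\sqrt{163}$.

```python
import math

Q = 163  # |D| for D = -163

def chi(n):
    """chi_{-163}(n), computed as the Legendre symbol (n/163) via Euler's criterion."""
    r = pow(n % Q, (Q - 1) // 2, Q)
    return -1 if r == Q - 1 else r

small_primes = [2, 3, 5, 7, 11, 13, 17, 19, 23, 29, 31, 37]
print([chi(p) for p in small_primes])  # all twelve coin tosses land on -1
print(chi(41))                         # the streak ends at 41

# Partial sum of L(1, chi) over 2000 full periods, versus the class number
# formula value pi * h / sqrt(163) with h = 1.
partial_sum = sum(chi(n) / n for n in range(1, Q * 2000 + 1))
print(partial_sum, math.pi / math.sqrt(Q))
```

Summing over complete periods keeps the truncation error small, since the character sums to zero over each period.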

Exercise 8.13. The Riemann hypothesis is equivalent to the statement that $|\xi(\sigma + it)|$ is monotonically increasing in $\sigma \ge 1/2$. Hint: Show that, zero by zero, each factor of the Hadamard product above will be increasing. Furthermore, unconditionally, $|\xi(\sigma + it)|$ is monotone increasing on $\sigma \ge 1$.
Exercise 8.14. Let $\chi$ be an even character, i.e., $\chi(-1) = 1$. Define
\[ \xi(s, \chi) := \pi^{-s/2} \Gamma(s/2) L(s, \chi). \]
Then prove $|\xi(s, \chi)|$ is monotone increasing in $\sigma > 1$.
If $\zeta(1 + it) = 0$, then $p^{it} \sim -1$ for many small primes $p$. This implies $p^{2it} \sim 1$ for many small primes $p$. This implies $\zeta(1 + 2it)$ is very big. This relates to the classical inequality
\[ \zeta(\sigma)^3 |\zeta(\sigma + it)|^4 |\zeta(\sigma + 2it)| \ge 1. \]
Similarly, if $L(1 + it, \chi) = 0$, then $\chi(p) p^{it} \sim -1$, which implies $\chi(p)^2 p^{2it} \sim 1$, which implies $L(1 + 2it, \chi^2)$ is big. This would yield a contradiction unless $\chi^2 = \chi_0$ and $t = 0$. In this case, we are considering the $\zeta$ function at 1, which is big because it has a pole. This is the Siegel zero situation, where we have a quadratic character and want a lower bound for $L(1, \chi)$.
8.4. Siegel zero situation. We'll now discuss a proof due to Goldfeld of Siegel's theorem. We want to show a lower bound for $L(1, \chi)$ of the form
\[ L(1, \chi) \gg C(\varepsilon) q^{-\varepsilon} \]
for $\chi$ a quadratic character mod $q$. We look at the region
\[ \left[1 - \frac{\varepsilon}{10}, 1\right]. \]
Either
(1) All quadratic Dirichlet $L$-functions have no zero in this region. We take $\beta = 1 - \frac{\varepsilon}{10}$. We define $\Psi$ to be some character mod 3.
(2) There is some quadratic character $\Psi \bmod r$ for some $r$ with $L(\beta, \Psi) = 0$ with $1 \ge \beta \ge 1 - \frac{\varepsilon}{10}$.
Consider
\[ \zeta(s) L(s, \chi) L(s, \Psi) L(s, \chi\Psi) = \sum_n \frac{a(n)}{n^s}. \]

Exercise 8.15. Check $a(n) \ge 0$ for all $n$ by just expanding the definitions of Dirichlet characters for the various $L$-functions.
This function above is the Dedekind $\zeta$ function for the biquadratic extension defined by $\chi$ and $\Psi$. Its coefficients are always non-negative on primes because $a(p) = (1 + \chi(p))(1 + \Psi(p)) \ge 0$, as both $\chi$ and $\Psi$ take values in $\{\pm 1, 0\}$. Then, for $c > 1$, consider
\[ I := \frac{1}{2\pi i} \int_{(c)} \zeta(s + \beta) L(s + \beta, \chi) L(s + \beta, \Psi) L(s + \beta, \chi\Psi) \Gamma(s) X^s \, ds, \]
where $X$ is some large parameter that we haven't yet defined, which is roughly $(qr)^{10}$.
Exercise 8.16. Show that
\[ \frac{1}{2\pi i} \int_{(c)} X^s \Gamma(s) \, ds = e^{-1/X}. \]
Look at the poles of $\Gamma$, compute the residues, and you will see the Taylor expansion for $e^{-1/X}$.
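For Exercise 8.16, recall that $\Gamma(s)$ has a simple pole at $s = -k$ with residue $(-1)^k / k!$ for each $k \ge 0$, so the residue of $X^s \Gamma(s)$ there is $(-1)^k X^{-k} / k!$. The short sketch below (illustrative only, with a hypothetical function name) sums these residues and compares with $e^{-1/X}$.

```python
import math

def residue_series(X, terms=60):
    """Sum over k < terms of the residue of X^s * Gamma(s) at s = -k.

    Gamma has residue (-1)^k / k! at s = -k, so each term is
    (-1)^k * X^(-k) / k!, the Taylor series of exp(-1/X).
    """
    return sum((-1) ** k * X ** (-k) / math.factorial(k) for k in range(terms))

for X in (0.5, 2.0, 10.0, 1000.0):
    print(X, residue_series(X), math.exp(-1 / X))
```

This is the smooth weight that replaces the sharp cutoff of Perron's formula: applied to $(X/n)^s$ it gives $e^{-n/X}$, which is close to 1 for $n$ much smaller than $X$ and decays rapidly for $n$ much larger than $X$.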

Here we have nice absolutely convergent integrals. But instead of picking up the characteristic function, we pick up a "smoothed" version of the characteristic function. Then, we have
\[ I = \sum_{n=1}^{\infty} \frac{a(n)}{n^\beta} e^{-n/X}, \]
where we are plugging in $\sum_n \frac{a(n)}{n^{s + \beta}}$ and applying the exercise to $(X/n)^s$. Since $a(1) = 1$ and $a(n) \ge 0$, we have
\[ I = \sum_{n=1}^{\infty} \frac{a(n)}{n^\beta} e^{-n/X} \ge e^{-1/X} \ge 1/2. \]
Dirichlet $L$-functions are entire. The only $L$-function with a pole is the one for the trivial character. For any other character, the values $\chi(n)$ cancel out every $q$ steps. For example, using integration by parts,
\[ L(s, \chi) = \int_1^{\infty} \frac{1}{y^s} \, d\left(\sum_{n \le y} \chi(n)\right) = s \int_1^{\infty} \frac{1}{y^{s+1}} \left(\sum_{n \le y} \chi(n)\right) dy. \]
Moving the line of integration to the left, we encounter a pole at $1 - \beta$ from the $\zeta$ function, and there are poles from the $\Gamma$ function. We take $\mathrm{Re}\, s = -\beta + 1/2$, so this is negative, but not as negative as $-1$. We encounter poles at $s = 1 - \beta$ and $s = 0$, coming from $\zeta(s + \beta)$ and $\Gamma(s)$. Computing the residue at $1 - \beta$ we get
\[ L(1, \chi) L(1, \Psi) L(1, \chi\Psi) X^{1 - \beta} \Gamma(1 - \beta). \]
The residue at 0 is given as follows: near 0, we have $\Gamma(s) \sim \frac{1}{s}$, using that $s \cdot \Gamma(s) = \Gamma(s + 1)$ and that $\Gamma$ is smooth near 1 with $\Gamma(1) = 1$. The residue at 0 is
\[ \zeta(\beta) L(\beta, \chi) L(\beta, \Psi) L(\beta, \chi\Psi) \le 0. \]
Indeed, in the first case no quadratic Dirichlet $L$-function has a zero in the region, so $L(\beta, \chi)$, $L(\beta, \Psi)$ and $L(\beta, \chi\Psi)$ are positive, and $\zeta(\beta)$ is negative. In the second case $L(\beta, \Psi) = 0$. We then have a lower bound for the residue at $1 - \beta$. This is what we want, because we want a lower bound for $L(1, \chi)$. We would be done if we had upper bounds for the latter Dirichlet $L$-functions. We can get these using the integration by parts formula
\[ L(s, \chi) = s \int_1^{\infty} \frac{1}{y^{s+1}} \left(\sum_{n \le y} \chi(n)\right) dy \]
as above.
Exercise 8.17. Show that for $\chi$ a nontrivial character mod $q$,
\[ |L(1, \chi)| \ll \log q. \]
Taking $X$ a large power of $qr$, the remaining integral on $\mathrm{Re}(s) = -\beta + 1/2$ is negligible. Therefore, we would conclude a bound of the form
\[ L(1, \chi) \gg (qr)^{-\varepsilon}. \]
We could get an effective bound, but we don't know what $r$ is. In case 1, $r = 3$, so things would be fine. But, if there is some violation to the Riemann hypothesis, then $r$ depends on what the violation to the Riemann hypothesis is. So this $r$ is the source of the ineffectivity in Siegel's theorem.
Next time, we'll discuss effectivity of the 3-prime theorem. Then we'll move on to discussing a theorem of Maynard:
Theorem 8.18. There are infinitely many primes with no 7 in their decimal expansion.

9. 10/24/17 9.1. Quick recap of the proof of Siegel’s theorem. Recall that last time we proved Siegel’s theorem: Theorem 9.1 (Siegel). We have C(ε) L(1, χ) > qε (with C(ε) ineffective). ANALYTIC NUMBER THEORY NOTES 55

The idea of the proof was to construct an auxiliary character Ψ mod r. There were two cases. In the first case, all characters have no ze-  1  ros [1 − ε/10, 1] and we took ψ mod r = 3 In the second case we ε assume there exists some r with a zero β ≥ 1 − 10 . The idea was to consider 1 Z a(n)e−n/x ζ(s + β)L(s + β, χ)L(s + β, Ψ)L(s + β, χΨ)XsΓ(s)ds = ∑ ≥ e−1/x 2πi (c) nβ We then move the line of integration to Re(s) = 1/2 − β. This has a pole at 1 − β. We obtain L(1, χ)L(1, Ψ)L(1, χΨ)X1−βΓ(1 − β). Then, at s = 0, we have ζ(β)L(β, χ)L(β, Ψ)L(β, χΨ) ≤ 0. At the end of last time, we claimed 1 Lemma 9.2. The integral on 2 − β is negligible. That is, 1 Z ζ(s + β)L(s + β, χ)L(s + β, Ψ)L(s + β, χΨ)XsΓ(s)ds  1 1 2πi ( 2 −β) for appropriate values of x (we will take (qr)20). Proof. Indeed, 1 Z ζ(s + β)L(s + β, χ)L(s + β, Ψ)L(s + β, χΨ)XsΓ(s)ds 1 2πi ( 2 −β) Z ∞ − −| | 1 1 1 1  x1/2 β e t ζ( + it)L( + it, χ)L( + it, Ψ)L( + it, χΨ) dt −∞ 2 2 2 2 We want some kind of polynomial bound to show this integral is negligible. We have ξ(s) = s (s − 1) π−s/2Γ(s/2)ζ(s) is entire of order 1 (meaning it doesn’t grow more than exponen- tially). We want to use the maximum modulus principal in a com- plex strip with real part between −1 and 2. It’s easy to bound ξ because ζ is a bounded function on Re(s) = 2. Similarly, we can un- derstand asymptotics of the other terms. By the functional equation, we then also understand the value at Re(s) = −1. So, by this variant of the maximum modulus principal, we can bound |ξ (1/2 + it)| by, essentially, |ξ(2 + it)|. This cannot literally be true because it would imply the Riemann hypothesis, but if we restrict to a rectangular re- gion, bounding things from above and below, we will have good 56 AARON LANDESMAN enough bounds. But, in any case, after making this precise, we can bound Γ by a sterling approximation, and then bound |ζ (1/2 + it)|  (1 + |t|) .

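Remark 9.3. If we instead carry this out between $-\varepsilon$ and $1+\varepsilon$, one can obtain the convexity bound
\[ |\zeta(1/2+it)| \ll (1+|t|)^{1/4+\varepsilon}. \]
The Lindelöf hypothesis says we can replace $1/4+\varepsilon$ by any positive exponent.

Altogether, we can bound
\[ \frac{1}{2\pi i}\int_{(\frac12-\beta)} \zeta(s+\beta)L(s+\beta,\chi)L(s+\beta,\Psi)L(s+\beta,\chi\Psi)\,x^{s}\,\Gamma(s)\,ds \ll x^{1/2-\beta}\int_{-\infty}^{\infty} e^{-|t|}\left|\zeta(\tfrac12+it)L(\tfrac12+it,\chi)L(\tfrac12+it,\Psi)L(\tfrac12+it,\chi\Psi)\right|dt \]
\[ \ll x^{1/2-\beta}\int_{-\infty}^{\infty} e^{-|t|}\left((1+|t|)\,qr\right)^{4}dt \ll (qr)^{4}x^{-.4}. \]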

Now, choose $x = (qr)^{20}$ so that $(qr)^{4}x^{-.4} = (qr)^{-4} \ll 1$. □

Using the above bound from the lemma, together with
\[ \frac{1}{2\pi i}\int_{(c)} \zeta(s+\beta)L(s+\beta,\chi)L(s+\beta,\Psi)L(s+\beta,\chi\Psi)\,x^{s}\,\Gamma(s)\,ds = \sum_{n}\frac{a(n)}{n^{\beta}}e^{-n/x} \geq e^{-1/x} \]
(with the last term bounded below by $.9$) we get
\[ L(1,\chi)L(1,\Psi)L(1,\chi\Psi) \geq \frac13\,\frac{x^{\beta-1}}{\Gamma(1-\beta)} \geq \frac15(1-\beta)\,x^{-\varepsilon/10} = \frac{(1-\beta)}{5}(qr)^{-2\varepsilon}. \]
Then, note
\[ L(1,\Psi) \leq c\log r, \qquad L(1,\chi\Psi) \leq c\log qr. \]

So, one obtains
\[ L(1,\chi) \geq C(1-\beta)(qr)^{-3\varepsilon}. \]
The constant $C$ is computable, but the reason for the ineffectivity is that we do not know what $r$ and $\beta$ are.

Remark 9.4. One can effectively prove
\[ L(1,\chi) \geq \frac{C}{\sqrt{q}}, \]
so $\beta$ must be at least $\frac{1}{\sqrt{r}}$ away from 1, or something like that. So really the constant above only depends on $r$, since we can get a bound on $\beta$ from that.

9.2. Effectivity of ternary Goldbach. Returning to ternary Goldbach, we considered
\[ \sum_{n\leq N}\Lambda(n)\exp\left(\frac{an}{q}\right). \]
Using $q \leq (\log N)^{10}$, we found that
\[ \sum_{n\leq N}\Lambda(n)\chi(n) \]
is bounded by something like $-\frac{N^{\beta}}{\beta}$, where the effective bound above gives $\beta \leq 1 - \frac{c}{\sqrt{q}}$. Then,
\[ N^{\beta} \leq N^{1-c/\sqrt{q}} \]
is small compared to $N$ only when $q \leq (\log N)^{1.99}$. But we wanted $q$ to go up to $(\log N)^{10}$ rather than $(\log N)^{2}$. So, we will have to use Siegel's theorem in some range. Even though Siegel's theorem is not effective, we can use that Siegel zeros are rare to still get effectivity of ternary Goldbach.

Lemma 9.5. There cannot exist two primitive quadratic characters
\[ \chi_1 \ (\mathrm{mod}\ q_1), \qquad \chi_2 \ (\mathrm{mod}\ q_2) \]
with $Q \leq q_1, q_2 \leq Q^{100}$ and both $L$-functions having a zero that is at least $1 - \frac{10^{-10}}{\log Q}$.

Proof. Suppose we have two such characters $\chi_1, \chi_2$. We'll now play these two characters against each other. Consider
\[ \zeta(s)L(s,\chi_1)L(s,\chi_2)L(s,\chi_1\chi_2). \]
This is the Dedekind zeta function of a biquadratic field, so its Dirichlet coefficients are all nonnegative.

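Instead, consider
\[ \xi(s)\,\xi(s,\chi_1)\,\xi(s,\chi_2)\,\xi(s,\chi_1\chi_2). \]
Consider its logarithmic derivative and evaluate it at some real number $\sigma > 1$. We have
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2). \]
Using the Hadamard product formula we have
\[ \xi(s) = e^{A+Bs}\prod_{\rho}\left(1-\frac{s}{\rho}\right)e^{s/\rho}, \]
and then
\[ \frac{\xi'}{\xi}(s) = \sum_{\rho}\frac{1}{s-\rho}. \]
Then,
\[ \mathrm{Re}\,\frac{1}{s-\rho} = \frac{\sigma-\beta}{|s-\rho|^{2}} \]
with $s = \sigma+it$, $\rho = \beta+i\gamma$. On the one hand, the expression
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2) \]
is always positive. On the other hand, if we have two real zeros $\beta_1, \beta_2$, we would obtain
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2) \geq \frac{1}{\sigma-\beta_1} + \frac{1}{\sigma-\beta_2}. \]
We also know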

\[ \xi(\sigma) = \sigma(\sigma-1)\pi^{-\sigma/2}\Gamma(\sigma/2)\zeta(\sigma), \]
and, with $\alpha$ equal to either 0 or 1,
\[ \xi(\sigma,\chi_1) = \left(\frac{q_1}{\pi}\right)^{\sigma/2}\Gamma\left(\frac{\sigma+\alpha}{2}\right)L(\sigma,\chi_1). \]
We get similar expressions for the other two $\xi$ functions. Taking logarithmic derivatives, we obtain
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2) = \frac{1}{\sigma-1} + \frac12\log q_1 + \frac12\log q_2 + \frac12\log q_1q_2 + O(1) + \frac{\zeta'}{\zeta}(\sigma) + \frac{L'}{L}(\sigma,\chi_1) + \frac{L'}{L}(\sigma,\chi_2) + \frac{L'}{L}(\sigma,\chi_1\chi_2). \]
The sum
\[ \frac{\zeta'}{\zeta}(\sigma) + \frac{L'}{L}(\sigma,\chi_1) + \frac{L'}{L}(\sigma,\chi_2) + \frac{L'}{L}(\sigma,\chi_1\chi_2) \]
is given by
\[ -\sum_{n}\frac{\Lambda(n)}{n^{\sigma}}\left(1+\chi_1(n)+\chi_2(n)+\chi_1\chi_2(n)\right), \]
which has all Dirichlet coefficients nonpositive. Therefore, we have
\[ \frac{\xi'}{\xi}(\sigma) + \frac{\xi'}{\xi}(\sigma,\chi_1) + \frac{\xi'}{\xi}(\sigma,\chi_2) + \frac{\xi'}{\xi}(\sigma,\chi_1\chi_2) \leq \frac{1}{\sigma-1} + \log q_1q_2 + O(1). \]

If $\beta_1, \beta_2$ were close to 1, we have a lower bound by something close to $2/(\sigma-1)$.

Exercise 9.6. If $q_1, q_2$ are comparable to each other, say $q_1 \leq q_2^{100}$ and $q_2 < q_1^{100}$, then we can't have both a lower bound by $\frac{1}{\sigma-\beta_1} + \frac{1}{\sigma-\beta_2}$ and an upper bound by
\[ \frac{1}{\sigma-1} + \log(q_1q_2) + O(1). \]
Here, we are choosing $\sigma$ to be around $1 + \frac{10^{-6}}{\log q_1q_2}$. □
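The arithmetic behind Exercise 9.6 can be checked numerically. The sketch below is only an illustration under stated assumptions: both zeros are placed at distance exactly $10^{-10}/\log Q$ from 1 with $Q = \min(q_1, q_2)$, and the $O(1)$ term is replaced by a hypothetical explicit constant `big_o_constant`. The function name is ours, not from the lecture.

```python
import math


def exercise_9_6_bounds(q1, q2, big_o_constant=100.0):
    """Return (lower, upper) bounds for the sum of the four xi'/xi terms.

    Assumptions (hypothetical, for illustration only):
    - beta_1 = beta_2 = 1 - 1e-10 / log(Q) with Q = min(q1, q2);
    - the O(1) term is replaced by the explicit value big_o_constant;
    - sigma = 1 + 1e-6 / log(q1 * q2), as suggested in the exercise.
    """
    log_q1q2 = math.log(q1) + math.log(q2)
    log_Q = math.log(min(q1, q2))
    sigma_minus_1 = 1e-6 / log_q1q2
    one_minus_beta = 1e-10 / log_Q
    # sigma - beta = (sigma - 1) + (1 - beta); work with the gaps directly
    # to avoid floating-point cancellation near 1.
    sigma_minus_beta = sigma_minus_1 + one_minus_beta
    lower = 2.0 / sigma_minus_beta
    upper = 1.0 / sigma_minus_1 + log_q1q2 + big_o_constant
    return lower, upper


if __name__ == "__main__":
    lower, upper = exercise_9_6_bounds(10**6, 10**9)
    print(f"lower bound {lower:.4e} exceeds upper bound {upper:.4e}: {lower > upper}")
```

With these choices the lower bound is roughly $2/(\sigma-1)$ while the upper bound is roughly $1/(\sigma-1)$, which is the contradiction the exercise asks for.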

9.3. Discriminants of number fields. Let $K$ be a number field over $\mathbb{Q}$ (with $K \neq \mathbb{Q}$). Then, some prime must be ramified in $K$ because the discriminant is more than 1. In general, if $K$ has degree $n$ over $\mathbb{Q}$, what can we say about $d_K := \operatorname{disc} K$?

Question 9.7 (Open question). Take $f \in \mathbb{Z}[x]$ irreducible of degree $n$. How does $\operatorname{disc}(f)$ grow?

Theorem 9.8 (Minkowski, Stark-Odlyzko). The discriminant of a number field $K$ of degree $n$ is bounded below by $c^{n}$ with $c > 1$.

Remark 9.9. Minkowski got this by thinking about lattices and using the geometry of numbers. One way to think of this is the following idea going back to Stark: Let $r_1$ be the number of real embeddings and $r_2$ the number of pairs of complex embeddings, so that $r_1 + 2r_2 = n$. Consider the completed Dedekind zeta function
\[ \xi_K(s) = s(s-1)\,d_K^{s/2}\,\zeta_K(s)\left(\pi^{-s/2}\Gamma(s/2)\right)^{r_1}\left((2\pi)^{-s}\Gamma(s)\right)^{r_2} = (\cdots)\prod_{\rho_K}\left(1-\frac{s}{\rho_K}\right)(\cdots), \]
where $\cdots$ indicate factors we must include to make the product converge. This will satisfy a functional equation $\xi_K(1-s) = \xi_K(s)$, will have a Hadamard product, and so on. Then,
\[ \frac{\xi_K'}{\xi_K}(\sigma) = \sum_{\rho_K}\frac{1}{\sigma-\rho_K} \geq 0. \]
Then,
\[ \frac{\xi_K'}{\xi_K}(\sigma) = \frac{1}{\sigma} + \frac{1}{\sigma-1} + \frac12\log d_K + r_1\left(\frac12\log\pi^{-1} + \frac12\frac{\Gamma'}{\Gamma}(\sigma/2)\right) + r_2(\cdots) + \frac{\zeta_K'}{\zeta_K}(\sigma). \]
Using that the last term $\frac{\zeta_K'}{\zeta_K}(\sigma)$ is negative and the whole sum is positive, choosing $\sigma$ near 1 optimally, together with knowledge of $\frac{\Gamma'}{\Gamma}(1/2)$ and $\frac{\Gamma'}{\Gamma}(1)$, gives a lower bound for the discriminant. One must choose an appropriate value of $\sigma$; Sound suggests something like $\sigma = 1 + \frac{c}{n}$.

So, if you have a field with small discriminant, this also means there are not many primes of small norm. The zeros of such an $L$-function are then also nicely behaved.

Exercise 9.10. Work out the details in the above remark.

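There is a nice survey by Odlyzko (if one searches "discriminants Odlyzko") and also an article by Serre on "Minorations of discriminants" (in French).

Remark 9.11. The ring of integers in a number field may not be monogenic, so the discriminant of a polynomial may be much larger than the discriminant of a number field. We don't have a good lower bound on the discriminant of a polynomial.

We now return to the effectivity of ternary Goldbach. Suppose
\[ (\log N)^{1.9} \leq q \leq (\log N)^{10}, \]
and suppose there is some $\chi \bmod q_0$ with a Siegel zero at $\beta_0$. All we have to worry about are $\alpha = \frac{a}{q} + \beta$ with $q_0 \mid q$. We have an expression of the form
\[ \sum_{n\leq N}\Lambda(n)\exp\left(\frac{an}{q}\right) = \frac{N\mu(q)}{\phi(q)} + (\cdots) + \frac{\tau(\chi)}{\phi(q)}\chi(a)\Psi(N,\chi). \]
The last $\Psi(N,\chi)$ is bounded by something like $N^{\beta_0}/\beta_0$. So, the above is approximated by
\[ \frac{N\mu(q)}{\phi(q)} + \frac{\tau(\chi)}{\phi(q)}\chi(a)\frac{N^{\beta_0}}{\beta_0}, \]
and we can then approximate these things on major and minor arcs. We then have to replace whatever main term we had before with this new main term coming from
\[ \frac{\tau(\chi)}{\phi(q)}\chi(a)\frac{N^{\beta_0}}{\beta_0}. \]
That is, we have
\[ \int_{\mathfrak{M}} S(\alpha)^{3}\exp(-N\alpha)\,d\alpha = \sum_{\substack{q\leq(\log N)^{10}\\ q_0\mid q}}\ \sideset{}{^*}\sum_{a \bmod q}\int_{|\beta|\leq\frac{1}{qQ}} S\left(\frac{a}{q}+\beta\right)^{3}\exp\left(-N\left(\frac{a}{q}+\beta\right)\right)d\beta. \]
Then, to find the contribution of the cube of this main term, we find
\[ \sideset{}{^*}\sum_{a \bmod q}\frac{\tau(\chi)^{3}}{\phi(q)^{3}}\frac{N^{3\beta_0}}{\beta_0^{3}}\,\chi(a)\exp\left(\frac{-aN}{q}\right). \]
From $\chi(a)\exp\left(\frac{-aN}{q}\right)$ we get another Gauss sum, so
\[ \sideset{}{^*}\sum_{a \bmod q}\frac{\tau(\chi)^{3}}{\phi(q)^{3}}\frac{N^{3\beta_0}}{\beta_0^{3}}\,\chi(a)\exp\left(\frac{-aN}{q}\right) \sim \frac{\tau(\chi)^{4}}{\phi(q)^{3}}\frac{N^{3\beta_0-1}}{\beta_0^{3}}, \]
where the $-1$ in $3\beta_0-1$ is coming from the integral. Then, $\tau(\chi)$ is bounded by $\sqrt{q}$. So, the above is bounded by
\[ \frac{q^{2}}{q^{3}}\frac{N^{3\beta_0-1}}{\beta_0^{3}}. \]
So, in conclusion, we get a bound like
\[ \sum_{\substack{q\leq(\log N)^{10}\\ q_0\mid q}}\frac{N^{2}}{q} \ll \frac{N^{2}}{q_0}\log\log N \]
for $q_0 > (\log N)^{1.9}$. Therefore, the proof is effective.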

10. 10/26/17

Remark 10.1. Goldfeld-Gross-Zagier says
\[ L(1,\chi) \geq \frac{c\log|D|}{\sqrt{|D|}} \]
for imaginary quadratic fields. Then, effectively, $h(-D) > c\log|D|$.

Remark 10.2 (Euler's idoneal numbers). Consider $\mathbb{Q}(\sqrt{-D})$. Can it be that all $p \leq \sqrt{|D|}$ are either ramified or inert? Gauss' genus theorem tells us
\[ h(-D) \geq \#(\text{2-part of the class group}) = 2^{\#\{\text{primes } p \mid D\}} = d(|D|). \]
The divisor function grows as $O(|D|^{\varepsilon})$. The problem is to find all discriminants $-D < 0$ where
\[ \operatorname{cl}\left(\mathbb{Q}(\sqrt{-D})\right) = (\mathbb{Z}/2\mathbb{Z})^{r}. \]

Remark 10.3. There is work of Biró on class numbers of fields of the form $\mathbb{Q}(\sqrt{n^{2}+4})$.

10.1. Primes with missing digits. In general it is quite hard to answer questions of the form:
(1) If $p$ is a prime, is $p+2$ a prime?
(2) If $n$ is even, when is $n^{2}+1$ prime?
Here are some theorems coming out of sieve methods.

Theorem 10.4 (Piatetski-Shapiro, 1950s). If $1 < \alpha < 1.1$, then there are infinitely many primes of the form $\lfloor n^{\alpha}\rfloor$.

Remark 10.5. This is quite a sparse set of numbers: up to $x$, there are only $x^{1/\alpha}$ such numbers.

Recall from Fermat that every prime $p \equiv 1 \bmod 4$ can be written as a sum of two squares.

Theorem 10.6 (Fouvry and Iwaniec). Infinitely often, one can write a prime $p \equiv 1 \bmod 4$ as $p = n^{2}+m^{2}$ with $n$ a prime.

Theorem 10.7 (Friedlander and Iwaniec). There are infinitely many primes $p \equiv 1 \bmod 4$ which can be written as $p = m^{2}+n^{4}$.

Theorem 10.8 (Heath-Brown and Li). There are infinitely many primes $p \equiv 1 \bmod 4$ with $p = m^{2}+q^{4}$ for $q$ prime.

Theorem 10.9 (Heath-Brown). There are infinitely many primes of the form $a^{3}+2b^{3}$.

Remark 10.10. This answers an old question of Hardy and Littlewood asking if there are infinitely many primes which are sums of three cubes. The result of Friedlander and Iwaniec only involves pairs $m, n$ ranging over a set of size $x^{3/4} = x^{1/2}\cdot x^{1/4}$, and Heath-Brown's result only involves a set of size $x^{2/3}$ up to $x$.

Question 10.11. Are there infinitely many primes $p = a^{2}+b^{3}$?

Remark 10.12. This is analogous to the question of whether there are infinitely many elliptic curves with prime discriminant or conductor (since the discriminant of an elliptic curve in short Weierstrass form is something like $4a^{3}+27b^{2}$).

The main result we'll spend the next few lectures proving is of a similar flavor.

Theorem 10.13 (Maynard). Let $q$ be a sufficiently large base (e.g. $q = 10^{7}$), and write $n = \sum_{j=0}^{k} n_jq^{j}$ with $0 \leq n_j \leq q-1$ as a base $q$ expansion. Select a forbidden digit $0 \leq a_0 \leq q-1$. Let
\[ \mathcal{A} := \{n \in \mathbb{N} : n \text{ does not have the digit } a_0 \text{ in base } q\}. \]
Then,
\[ \#\{n < q^{k} : n \in \mathcal{A}\} = (q-1)^{k} = \left(q^{k}\right)^{\log(q-1)/\log q}. \]
Then,
\[ \sum_{\substack{n<q^{k}\\ n\in\mathcal{A}}}\Lambda(n) \sim \kappa_{a_0}(q)\,(q-1)^{k}. \]

One can also find elements of A that are, say, squares.

Theorem 10.15 (Mauduit and Rivat). Write primes in binary as $p = \sum_j a_j2^{j}$, and let $s(p) := \sum_j a_j$ be the sum of digits. Then, $s(p)$ is equally likely to be 0 or 1 mod 2.

More generally, this can be done with any base replacing 2, with obvious exceptions. There is also the following cute result: One might ask if one can find Fermat primes, with two 1’s in the binary expansion and all other digits 0. This might be a hard problem because the set is quite sparse, but, one can try to further ask if there are infinitely many primes with k 1’s. One might try an easier problem asking if there simply exist primes with exactly k 1’s in their binary expansion. The following theorem shows the answer is yes.

Theorem 10.16 (Drmota, Mauduit, and Rivat). Let $K$ be an integer and let $k$ be on the scale of $K/2$ (say $k - K/2 = O(\sqrt{k})$). Then,
\[ \#\{p < 2^{K} : p \text{ prime, there are } k \text{ digits equal to } 1\} \]
has an asymptotic formula, with about $1/k$ of the numbers in this set prime.

Remark 10.17. One would expect about $\binom{K}{K/2} \sim \frac{2^{K}}{\sqrt{K}}$

Theorem 10.18 (Bourgain). There are primes $p \leq 2^{K}$ for which you can specify any $\alpha K$ of the binary digits, for some fixed $\alpha > 0$, where the last digit must be 1 (so that the number is not even).

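10.2. Beginning the proof of Maynard's theorem. We now turn to proving Maynard's theorem on primes without a specified digit. Recall we have fixed a base $q$ and an integer $k$, and defined $\mathcal{A}$ as the set of integers up to $q^{k}$ without the digit $a_0$.

We are trying to count
\[ \sum_{\substack{n<q^{k}\\ n\in\mathcal{A}}}\Lambda(n) = \sum_{n<q^{k}}\Lambda(n)1_{\mathcal{A}}(n). \]
By orthogonality of additive characters mod $q^{k}$, we have
\[ \frac{1}{q^{k}}\sum_{a \bmod q^{k}} S\left(\frac{a}{q^{k}}\right)A\left(\frac{-a}{q^{k}}\right) = \sum_{\substack{m,n<q^{k}\\ m\equiv n \bmod q^{k}}}\Lambda(m)1_{\mathcal{A}}(n) = \sum_{n\leq q^{k}}\Lambda(n)1_{\mathcal{A}}(n). \]
The last equality holds because $m, n < q^{k}$. For the penultimate one, we write
\[ S\left(\frac{a}{q^{k}}\right) = \sum_{m<q^{k}}\Lambda(m)\exp\left(\frac{am}{q^{k}}\right) \]
and
\[ A\left(\frac{-a}{q^{k}}\right) = \sum_{n<q^{k}}1_{\mathcal{A}}(n)\exp\left(\frac{-na}{q^{k}}\right), \]
and the only terms that survive are those with $m \equiv n \bmod q^{k}$. This is an excellent approximation to the integral from the circle method.

Exercise 10.19. Verify the above equalities.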
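For Exercise 10.19, the discrete orthogonality step can be sanity-checked numerically for a tiny base. The sketch below is an illustration, not part of the lecture: the helper names are ours, `e(x)` denotes $\exp(2\pi i x)$, and membership in $\mathcal{A}$ is taken over all $k$ base-$q$ digits with leading zeros included (an assumption that matches the count $(q-1)^k$).

```python
import cmath
import math


def von_mangoldt(n):
    """Return Lambda(n): log p if n is a power of a prime p, else 0."""
    if n < 2:
        return 0.0
    for p in range(2, n + 1):
        if n % p == 0:  # p is the smallest prime factor of n
            while n % p == 0:
                n //= p
            return math.log(p) if n == 1 else 0.0
    return 0.0


def avoids_digit(n, q, k, a0):
    """True if none of the k base-q digits of n (leading zeros included) is a0."""
    for _ in range(k):
        if n % q == a0:
            return False
        n //= q
    return True


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def both_sides(q, k, a0):
    """Return (direct sum, Fourier-side sum) of Lambda(n) 1_A(n) over n < q^k."""
    N = q ** k
    lam = [von_mangoldt(n) for n in range(N)]
    in_A = [avoids_digit(n, q, k, a0) for n in range(N)]
    direct = sum(lam[n] for n in range(N) if in_A[n])
    fourier = 0j
    for a in range(N):
        S = sum(lam[m] * e(a * m / N) for m in range(N))
        A = sum(e(-a * n / N) for n in range(N) if in_A[n])
        fourier += S * A
    return direct, (fourier / N).real


if __name__ == "__main__":
    print(both_sides(3, 3, 1))
```

The two returned numbers should agree up to floating-point error, since only the terms with $m \equiv n \bmod q^k$ survive the sum over $a$.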

We now separate terms into major and minor arcs, as in the circle method. We write
\[ \frac{a}{q^{k}} = \frac{\ell}{d} + \beta \]
with $d \leq q^{k/2}$ and $|\beta| \leq \frac{1}{dq^{k/2}}$. That is, we approximate $a/q^{k}$ using rational numbers with denominator at most $q^{k/2}$. By Dirichlet's theorem, we can always write numbers in this form. We write $A$ for some large positive number. The major arcs are those values of $a$ with
\[ d \leq \left(\log q^{k}\right)^{A}, \qquad |\beta| \leq \frac{\left(\log q^{k}\right)^{A}}{q^{k}}. \]
The minor arcs are the remaining values of $a$. The major arcs are disjoint because the denominators are small and we are taking small intervals around each rational number. We'll first deal with the major arcs. The harder part will come later when we deal with the minor arcs.

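10.4. The major arc contribution. There are two cases:
(1) The denominator $d$ is a small power of $q$.
(2) The denominator $d$ is not a small power of $q$.
The main terms will come from the first case. Consider the sum $S(\alpha)$ on the major arcs of the first case, so
\[ \alpha = \frac{\ell}{d} + \frac{b}{q^{k}} \]
with $d$ a power of $q$. Here $b$ is small (at most $d$) because $\beta$ is bounded. Consider
\[ S(\ell/d) = \sum_{n<q^{k}}\Lambda(n)\exp\left(\frac{\ell n}{d}\right). \]
Then,
\[ S\left(\frac{\ell}{d}+\frac{b}{q^{k}}\right) = \frac{\mu(d)}{\phi(d)}\sum_{n<q^{k}}\exp\left(\frac{nb}{q^{k}}\right) + O\left(\frac{q^{k}}{\left(\log q^{k}\right)^{3A}}\right). \]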

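Then, splitting contributions from $j = 1, \ldots, k-1$ and $j = 0$, we get
\[ A\left(\frac{-\ell}{q}\right) = (q-1)^{k-1}\left(\sum_{n_0=0}^{q-1}e\left(\frac{-n_0\ell}{q}\right) - \exp\left(\frac{-a_0\ell}{q}\right)\right) = -\exp\left(\frac{-a_0\ell}{q}\right)(q-1)^{k-1}. \]
Adding all these main terms together we get
\[ \frac{1}{q^{k}}\left(q^{k}(q-1)^{k} + \sum_{1\leq\ell\leq q-1}A\left(\frac{-\ell}{q}\right)q^{k}\left(\frac{-1}{q-1}\right)\right) = (q-1)^{k} + \frac{(q-1)^{k-1}}{q-1}\sum_{\ell=1}^{q-1}\exp\left(\frac{-a_0\ell}{q}\right) = \begin{cases}(q-1)^{k}\,\dfrac{q}{q-1} & \text{if } a_0 = 0,\\[2mm] (q-1)^{k}\left(1-\dfrac{1}{(q-1)^{2}}\right) & \text{if } a_0 \neq 0.\end{cases} \]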
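The closed form for the major arc main term can be checked against a brute-force evaluation for a small prime base. This is a sketch under assumptions: $q$ is prime (so that $\mu(q)/\phi(q) = -1/(q-1)$ and only $d = 1$ and $d = q$ contribute), the main term of $S(\ell/q)$ is taken to be exactly $\frac{\mu(q)}{\phi(q)}q^k$, and the function names are ours.

```python
import cmath
import math


def members_of_A(q, k, a0):
    """All n < q^k none of whose k base-q digits (leading zeros included) is a0."""
    result = []
    for n in range(q ** k):
        digits = [(n // q ** j) % q for j in range(k)]
        if a0 not in digits:
            result.append(n)
    return result


def main_term_brute_force(q, k, a0):
    """(1/q^k) * [S-main(0) A(0) + sum over l of S-main(l/q) A(-l/q)], q prime."""
    N = q ** k
    members = members_of_A(q, k, a0)
    total = complex(N * len(members))      # d = 1: S ~ q^k, A(0) = (q-1)^k
    for l in range(1, q):                  # d = q: S ~ mu(q)/phi(q) * q^k
        A = sum(cmath.exp(-2j * math.pi * n * l / q) for n in members)
        total += A * (-N / (q - 1))
    return (total / N).real


def main_term_closed_form(q, k, a0):
    """The two-case formula from the notes."""
    if a0 == 0:
        return (q - 1) ** k * q / (q - 1)
    return (q - 1) ** k * (1 - 1 / (q - 1) ** 2)


if __name__ == "__main__":
    for a0 in (0, 2):
        print(a0, main_term_brute_force(5, 3, a0), main_term_closed_form(5, 3, a0))
```

The case split comes entirely from $\sum_{\ell=1}^{q-1} e(-a_0\ell/q)$, which is $q-1$ when $a_0 = 0$ and $-1$ otherwise.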

11. 10/31/17 11.1. review. Let q be a large but fixed base and let A be the set of numbers missing a0 ∈ [0, q − 1]. Goal 11.1. Count ∑ Λ(n) n 0. Let

A (α) Ak (α) := ∑ 1A (n)e (nα) , n

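\[ \left|\frac{a}{q^{k}}-\frac{\ell}{d}\right| < \frac{1}{dq^{k/2}} \]
with $d \leq q^{k/2}$. Last time, we were trying to work out the major arc contributions over intervals with $d \leq \left(\log q^{k}\right)^{A}$ and
\[ \left|\frac{a}{q^{k}}-\frac{\ell}{d}\right| \leq \frac{\left(\log q^{k}\right)^{A}}{q^{k}}. \]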

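Last time we computed the major arc contribution in the case that $d$ was a power of $q$ less than $\left(\log q^{k}\right)^{A}$, where we got $c_{a_0}(q)(q-1)^{k}$ with
\[ c_{a_0}(q) = \begin{cases}\dfrac{q}{q-1} & \text{if } a_0 \equiv 0 \bmod q,\\[2mm] 1-\dfrac{1}{(q-1)^{2}} & \text{if } a_0 \neq 0.\end{cases} \]

11.2. Remaining major arcs. We next deal with the remaining major arcs. Namely, we show those centered at $\frac{\ell}{d}$ for $d$ not a power of $q$ are negligible. We'll put a trivial bound on $S$, and the cancellation will come from $A(\alpha)$. We know $|S(\alpha)| \leq q^{k}(1+o(1))$ trivially. We now look for cancellation in
\[ A\left(\frac{a}{q^{k}}\right) = A\left(\frac{\ell}{d}+c\right) \]
for $c$ at most $\frac{\left(\log q^{k}\right)^{A}}{q^{k}}$, as we are on a major arc. We have
\[ A_k(\alpha) = \prod_{j=0}^{k-1}\left(\sum_{\substack{n_j=0\\ n_j\neq a_0}}^{q-1} e\left(n_jq^{j}\alpha\right)\right). \]
A crude $L^{\infty}$ bound will in fact work for us, as we now explain. Pairing up two consecutive allowed digits $n, n+1$, we can bound
\[ \left|\sum_{n_j\neq a_0} e\left(n_j\theta\right)\right| \leq (q-3) + \left|e(n\theta)+e((n+1)\theta)\right| \leq (q-3) + 2|\cos(\pi\theta)| = (q-1) - 2\left(1-|\cos\pi\theta|\right) \leq (q-1)\exp\left(-c_q\|\theta\|^{2}\right) \]
for some small $c_q > 0$, where
\[ \|x\| = \min_{n\in\mathbb{Z}}|x-n|. \]
Then,
\[ |A_k(\alpha)| \leq (q-1)^{k}\exp\left(-c_q\sum_{j=0}^{k-1}\left\|q^{j}\alpha\right\|^{2}\right). \]
Assuming $\alpha = \frac{\ell}{d} + O\left(\frac{(\log q^{k})^{A}}{q^{k}}\right)$, we get
\[ |A_k(\alpha)| \leq (q-1)^{k}\exp\left(-c_q\sum_{j=0}^{k-1}\left\|q^{j}\alpha\right\|^{2}\right) \ll (q-1)^{k}\exp\left(-c_q\sum_{j=0}^{k/2}\left\|q^{j}\frac{\ell}{d}\right\|^{2}\right). \]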

Remark 11.2. Note that if $\|\theta\| \leq \frac{1}{2q}$ then $\|q\theta\| = q\|\theta\|$.

Lemma 11.3. For $k$ in an interval of length $\frac{\log d}{\log q}+1$, we can find $k_0$ with
\[ \left\|q^{k_0}\frac{\ell}{d}\right\| \geq \frac{1}{2q}. \]

Proof. We have
\[ \left\|q^{k}\frac{\ell}{d}\right\| \geq \frac{1}{d}, \]
and now, using Remark 11.2, we see that these values grow by a factor of $q$ at each step, and eventually the value "wraps around 1," so we get some term which is not too small. □

Therefore,
\[ |A_k(\alpha)| \ll (q-1)^{k}\exp\left(-c_1\frac{k}{\log k}\right). \]
Therefore, these other major arcs contribute
\[ \frac{1}{q^{k}}\sum_{\substack{d<(\log q^{k})^{A}\\ (\ell,d)=1\\ d \text{ not a power of } q}} q^{k}(q-1)^{k}\exp\left(-c_q\frac{k}{\log k}\right) \ll (q-1)^{k}\left(\log q^{k}\right)^{2A}\exp\left(\frac{-c_qk}{\log k}\right), \]
which is negligible compared to the main term computed last class.

11.3. The minor arcs. The real crux of the matter for dealing with minor arcs is that it is possible to get good bounds for the $L^{1}$ norm of $|A(\alpha)|$. We want to bound either
\[ \sum_{a \bmod q^{k}}\left|A_k\left(\frac{a}{q^{k}}\right)\right| \]
or
\[ \int_0^{1}|A_k(\alpha)|\,d\alpha. \]
There will be a huge amount of cancellation here. Let's look at

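\[ |A_k(\alpha)| = \prod_{j=0}^{k-1}\left|\sum_{n_j=0}^{q-1}e(n_jq^{j}\alpha) - e(a_0q^{j}\alpha)\right|. \]
We'll now try to bound
\[ \left|\sum_{n_j=0}^{q-1}e(n_jq^{j}\alpha) - e(a_0q^{j}\alpha)\right|. \]
Let $\theta := q^{j}\alpha$. Then, we have
\[ \left|\sum_{n_j=0}^{q-1}e(n_jq^{j}\alpha) - e(a_0q^{j}\alpha)\right| \leq \min\left(q-1,\ 1+\left|\frac{1-e(q\theta)}{1-e(\theta)}\right|\right). \]
We have
\[ \left|\frac{1-e(q\theta)}{1-e(\theta)}\right| \leq \frac{2}{2|\sin\pi\theta|} \leq \frac{1}{2\|\theta\|}. \]
Therefore,
\[ \left|\sum_{n_j=0}^{q-1}e(n_jq^{j}\alpha) - e(a_0q^{j}\alpha)\right| \leq \min\left(q-1,\ 1+\left|\frac{1-e(q\theta)}{1-e(\theta)}\right|\right) \leq \min\left(q-1,\ 1+\frac{1}{2\|\theta\|}\right). \]
We now plug this in above for each $\theta = q^{j}\alpha$. We have
\[ |A_k(\alpha)| \leq \prod_{j=0}^{k-1}\min\left(q-1,\ 1+\frac{1}{2\|q^{j}\alpha\|}\right). \]
Let's write
\[ \alpha = \frac{b_1}{q}+\frac{b_2}{q^{2}}+\frac{b_3}{q^{3}}+\cdots \]
for $0 \leq b_j \leq q-1$. Multiplying by $q^{j}$ yields
\[ q^{j}\alpha = z + \frac{b_{j+1}}{q} + \varepsilon_j \]
with $z$ an integer and $0 < \varepsilon_j \leq \frac{1}{q}$. Therefore, the distance from $q^{j}\alpha$ to the nearest integer is well determined by $b_{j+1}$. We have good bounds whenever $b_{j+1} \neq 0, q-1$, while at $b_{j+1} = 0, q-1$ we need to use the rather poor bound of $q-1$. Putting the above together, we have
\[ |A_k(\alpha)| \leq \prod_{j=0}^{k-1}\min\left(q-1,\ 1+\frac{1}{2\|q^{j}\alpha\|}\right), \]
where each factor is at most
\[ \begin{cases} q-1 & \text{if } b_{j+1} = 0 \text{ or } q-1,\\[1mm] 1+\dfrac{q}{2b_{j+1}} & \text{if } 1 \leq b_{j+1} \leq \dfrac{q-1}{2},\\[2mm] 1+\dfrac{q}{2(q-1-b_{j+1})} & \text{if } \dfrac{q-1}{2} \leq b_{j+1} \leq q-2.\end{cases} \]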
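Both ingredients used above, the product formula for $A_k(\alpha)$ over digits and the per-digit bound $\min\left(q-1, 1+\frac{1}{2\|\theta\|}\right)$, are easy to test numerically. The following is a small sketch with helper names of our own choosing; when $\|\theta\| = 0$ the bound is read as $q-1$.

```python
import cmath
import math


def e(x):
    """The additive character e(x) = exp(2 pi i x)."""
    return cmath.exp(2j * math.pi * x)


def dist_to_nearest_integer(x):
    """||x||, the distance from x to the nearest integer."""
    return abs(x - round(x))


def digit_sum(q, a0, theta):
    """Sum of e(n * theta) over digits n in [0, q-1] with n != a0."""
    return sum(e(n * theta) for n in range(q) if n != a0)


def A_k_product(q, k, a0, alpha):
    """A_k(alpha) computed as a product over the k digit positions."""
    prod = 1 + 0j
    for j in range(k):
        prod *= digit_sum(q, a0, q ** j * alpha)
    return prod


def A_k_direct(q, k, a0, alpha):
    """A_k(alpha) computed directly as a sum over n < q^k avoiding digit a0."""
    total = 0j
    for n in range(q ** k):
        digits = [(n // q ** j) % q for j in range(k)]
        if a0 not in digits:
            total += e(n * alpha)
    return total


def per_digit_bound(q, theta):
    """min(q - 1, 1 + 1 / (2 ||theta||)), read as q - 1 when ||theta|| = 0."""
    d = dist_to_nearest_integer(theta)
    if d == 0:
        return q - 1
    return min(q - 1, 1 + 1 / (2 * d))


if __name__ == "__main__":
    print(abs(A_k_product(5, 3, 2, 0.137) - A_k_direct(5, 3, 2, 0.137)))
    print(abs(digit_sum(7, 3, 0.3)), per_digit_bound(7, 0.3))
```

The product form is what makes the $L^1$ computation in the next step factor over digits.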

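Plugging this in and summing over all possibilities for the digits, we have
\[ \sum_{a \bmod q^{k}}\left|A_k\left(\frac{a}{q^{k}}\right)\right| \leq \prod_{j=1}^{k}\left(2(q-1) + 2\sum_{b=1}^{(q-1)/2}\left(1+\frac{q}{2b}\right)\right) \leq \prod_{j=1}^{k}\left(3q+q\log q\right). \]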

From the above, we have deduced the following lemma.

Lemma 11.4. We have
\[ \sum_{a \bmod q^{k}}\left|A_k\left(\frac{a}{q^{k}}\right)\right| \leq \left(3q+q\log q\right)^{k} \]
and
\[ \int_0^{1}|A_k(\alpha)|\,d\alpha \leq \left(3+\log q\right)^{k}. \]
So if $q$ is large, there is a lot of cancellation, but if $q$ is small, we won't get very much.

Before continuing the proof, let's motivate this. Recall that in our sum of three primes problem, we had some bound of the form
\[ \sum_{n\leq x}\Lambda(n)e(n\alpha) \ll x^{4/5} + \frac{x}{\sqrt{q}} + \sqrt{qx}. \]
Here $\left|\alpha-\frac{a}{q}\right| \leq \frac{1}{q^{2}}$ for $x^{.9} \geq q > x^{.1}$. For the $L^{1}$ norm in this lemma, we're only using a very small power of $x$. As long as the denominators are not too small or too large, we are doing well on the minor arcs. Then, we want to bound
\[ \frac{1}{q^{k}}\sum_{a}A_k(a/q^{k})\,S(-a/q^{k}), \]
where the latter factor is bounded by $q^{k}/(\log q^{k})^{A}$ and the former is bounded by the lemma. This will work out when $d \geq q^{.01k}$ in $\ell/d$. So we would be happy for large $q$.

The second idea will be used to estimate $\sum_{j=1}^{N}|A_k(\alpha_j)|$ for relatively few values of $\alpha_j$. We'll need an additional spacing condition that $\|\alpha_i-\alpha_j\| \geq \delta$ if $j \neq i$. This is natural because
\[ \left|\frac{\ell_1}{d_1}-\frac{\ell_2}{d_2}\right| \geq \frac{1}{d_1d_2}. \]
Estimates like this are called large sieve estimates. These are usually done in $L^{2}$, but here we'll do an $L^{1}$ estimate.
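The discrete $L^1$ bound of Lemma 11.4 can be compared with a direct computation for small parameters. The sketch below (function names are ours) evaluates $\sum_{a \bmod q^k}|A_k(a/q^k)|$ using the product over digits and prints it next to $(3q+q\log q)^k$; it is a numerical illustration only, not a proof.

```python
import cmath
import math


def l1_norm_over_rationals(q, k, a0):
    """Sum over a mod q^k of |A_k(a / q^k)|, via the product over digits."""
    N = q ** k
    total = 0.0
    for a in range(N):
        prod = 1.0
        for j in range(k):
            # Reduce n * q^j * a modulo q^k before dividing, to keep floats accurate.
            prod *= abs(sum(
                cmath.exp(2j * math.pi * ((n * q ** j * a) % N) / N)
                for n in range(q) if n != a0
            ))
        total += prod
    return total


def lemma_bound(q, k):
    """The bound (3q + q log q)^k from Lemma 11.4."""
    return (3 * q + q * math.log(q)) ** k


if __name__ == "__main__":
    for q, k, a0 in [(5, 3, 2), (10, 2, 7)]:
        print(q, k, a0, l1_norm_over_rationals(q, k, a0), lemma_bound(q, k))
```

Dividing both quantities by $q^k$ gives the normalized form, to be compared with $(3+\log q)^k$, whereas the trivial bound would be $(q-1)^k$.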

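Lemma 11.5. With the spacing condition that $\|\alpha_i-\alpha_j\| \geq \delta$ if $j \neq i$, we have
\[ \sum_{j=1}^{N}\left|A_k(\alpha_j)\right| \ll \left(\frac{1}{\delta}+q^{k}\right)\left(3+\log q\right)^{k}. \]
Our hope is to get a bound that is something like $N\int_0^{1}|A(\alpha)|\,d\alpha$.

Remark 11.6 (Sobolev inequality). We have
\[ f(t) = f(u) - \int_t^{u}f'(v)\,dv. \]
Integrating both sides over $u \in \left[t-\frac{\delta}{2}, t+\frac{\delta}{2}\right]$, we have
\[ \delta|f(t)| \leq \int_{t-\delta/2}^{t+\delta/2}|f(u)|\,du + \int_{t-\delta/2}^{t+\delta/2}\delta|f'(v)|\,dv. \]
Then
\[ |f(t)| \ll \frac{1}{\delta}\int_{t-\delta/2}^{t+\delta/2}|f(u)|\,du + \int_{t-\delta/2}^{t+\delta/2}\left|f'(v)\right|dv. \]
Since all points $\alpha_j$ were at least $\delta$ apart, these intervals will not overlap when proving the lemma.

Proof. We have the bound
\[ \sum_{j=1}^{N}\left|A_k(\alpha_j)\right| \ll \frac{1}{\delta}\int_0^{1}|A_k(\alpha)|\,d\alpha + \int_0^{1}\left|A_k'(\alpha)\right|d\alpha \ll \frac{1}{\delta}\left(3+\log q\right)^{k} + \int_0^{1}\left|A_k'(\alpha)\right|d\alpha. \]
Using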

A (α) = ∑ e(nα)1A(n), n

12. 11/2/17

12.1. Review. Last time, we wanted to evaluate
\[ \frac{1}{q^{k}}\sum_{a \bmod q^{k}}A_k\left(\frac{a}{q^{k}}\right)S\left(\frac{-a}{q^{k}}\right). \]
We had the major arcs, which were
\[ \left\{\frac{a}{q^{k}} : \frac{a}{q^{k}} = \frac{\ell}{d}+\eta,\ d \leq \left(\log q^{k}\right)^{A},\ |\eta| \leq \frac{\left(\log q^{k}\right)^{A}}{q^{k}}\right\}. \]
We gave an asymptotic formula for these major arcs of the form $c_{a_0}(q)(q-1)^{k}$. This did not depend on the size of $q$ and is true for any base at least 3 or so.

It remains to deal with the minor arcs. Last time, we saw we could estimate the $L^{1}$ norm of $A_k$. We showed
\[ \frac{1}{q^{k}}\sum_{a}\left|A_k\left(\frac{a}{q^{k}}\right)\right| \ll \left(3+\log q\right)^{k}. \]
We also found
\[ \int_0^{1}|A_k(\alpha)|\,d\alpha \ll \left(3+\log q\right)^{k}. \]
By a Sobolev type argument, we saw that if $\alpha_1, \ldots, \alpha_N$ are $\delta$ spaced (i.e., $\|\alpha_i-\alpha_j\| \geq \delta$ if $i \neq j$), then
\[ \sum_{j=1}^{N}\left|A_k(\alpha_j)\right| \leq \left(\frac{1}{\delta}+q^{k}\right)\left(3+\log q\right)^{k}. \]

12.2. Bounding the minor arcs. Recall that, by Dirichlet's theorem with $Q = q^{k/2}$, we have
\[ \left|\frac{a}{q^{k}}-\frac{\ell}{d}\right| \leq \frac{1}{dq^{k/2}} \]
with $d \leq q^{k/2}$. We can assume $d \geq \left(\log q^{k}\right)^{A}$ as we are on the minor arcs. For now, fix a choice of $B$ and $D$ (where we will split $d$ into dyadic intervals based on $D$ and $q^{k}|\eta|$ into dyadic intervals based on $B$). We will later range over different possibilities of $D$ and $B$.

We will split this into terms with $D \leq d \leq 2D$. We won't worry about over-counting because we'll ultimately estimate things by taking absolute values. Write
\[ \frac{a}{q^{k}} = \frac{\ell}{d}+\eta, \]
so that $B \leq q^{k}|\eta| \leq 2B$ and
\[ q^{k}\left(\eta+\frac{\ell}{d}\right) \in \mathbb{Z}. \]
We also need to consider the case where $q^{k}|\eta| \leq 1$.

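The number of choices for $\eta$ with $q^{k}|\eta|$ between $B$ and $2B$ is roughly $2B$ (since $\eta$ can be negative). We have $D \leq q^{k/2}$. Then $q^{k}|\eta| \ll q^{k/2}/D$. We can assume $B \ll q^{k/2}/D$. Being on a minor arc means either
(1) $D \geq \left(\log q^{k}\right)^{A}$,
(2) or, if $D$ is small, then $B \geq \left(\log q^{k}\right)^{A}$.
So, being on a minor arc means $BD$ is somewhat large.

Goal 12.1. We now want to understand the contribution of one of these dyadic blocks. We want to understand
\[ \sum_{D\leq d\leq 2D}\ \sum_{(\ell,d)=1}\ \sum_{\substack{\eta\\ q^{k}|\eta|\sim B\\ q^{k}(\eta+\ell/d)\in\mathbb{Z}}}\left|A_k\left(\frac{\ell}{d}+\eta\right)\right|. \]
This is a sum containing about $D^{2}B$ terms. The number of terms in the sum is then at most $q^{k/2}D \ll q^{k}$ since $B \ll q^{k/2}/D$.

Recall that we are trying to estimate the number of primes up to $q^{k}$ not containing the digit $a_0$. Now, our set $\mathcal{A}_k$ is self similar, meaning
\[ A_k\left(\frac{\ell}{d}+\eta\right) = \sum_{\substack{n_0,n_1,\ldots,n_{k-1}\\ 0\leq n_i\leq q-1\\ n_j\neq a_0}} e\left(\left(n_0+n_1q+\cdots+n_{k-1}q^{k-1}\right)\alpha\right), \]
where $\alpha = \frac{a}{q^{k}} = \frac{\ell}{d}+\eta$.

Proposition 12.2. We have
\[ \sum_{D\leq d\leq 2D}\ \sum_{(\ell,d)=1}\ \sum_{\substack{\eta\\ q^{k}|\eta|\sim B\\ q^{k}(\eta+\ell/d)\in\mathbb{Z}}}\left|A_k\left(\frac{\ell}{d}+\eta\right)\right| \ll (q-1)^{k}\left(D^{2}B\right)^{\alpha_q}, \]
where
\[ \alpha_q = \frac{\log\left(\frac{q}{q-1}\left(3+\log q\right)\right)}{\log q}. \]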

Then,
\[ D^{2\alpha_q} = q^{k_1\alpha_q} = \left(\frac{q}{q-1}\right)^{k_1}\left(3+\log q\right)^{k_1}, \]
where $q^{k_1} \sim D^{2}$ and $q^{k_2} \sim B$.

Proof. We'll now split this sum into
- the first $k_1$ digits,
- the middle $k-k_1-k_2$ digits,
- the last $k_2$ digits,
with $k_1+k_2 \leq k$. The first $k_1$ digits are controlled by $A_{k_1}(\alpha)$. For the middle digits, there are $(q-1)^{k-k_1-k_2}$ terms, each bounded by 1, so we get $(q-1)^{k-k_1-k_2}$ as a bound. For the last digits, we get
\[ \left(n_{k-k_2}+n_{k-k_2+1}q+\cdots\right)q^{k-k_2}\alpha. \]
Therefore, multiplying the contributions from all the digits, we get
\[ \left|\sum e\left(\left(n_0+n_1q+\cdots+n_{k-1}q^{k-1}\right)\alpha\right)\right| \leq \left|A_{k_1}(\alpha)\right|\,(q-1)^{k-k_1-k_2}\,\left|A_{k_2}\left(q^{k-k_2}\alpha\right)\right|. \]

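Remark 12.3. Thinking about what we are doing, there are $D^{2}$ points of the form $\ell/d$ and about $B$ points $\eta$ near each $\ell/d$. Given a fixed $\ell/d$, we are pairing it with each of the $B$ well spaced values of $\eta$.

Then, we choose $q^{k_1}$ on the scale of $D^{2}$ and $q^{k_2}$ on the scale of $B$. We can bound
\[ \left|A_k\left(\frac{\ell}{d}+\eta\right)\right| \ll (q-1)^{k-k_1-k_2}\left(\sup_{|\eta|\sim B/q^{k}}\left|A_{k_1}\left(\frac{\ell}{d}+\eta\right)\right|\right)\left|A_{k_2}\left(\frac{1}{q^{k_2}}\,q^{k}\left(\frac{\ell}{d}+\eta\right)\right)\right|. \]
Note that the last $A_{k_2}$ term corresponds to $B$ well spaced points mod 1. That is, there are $B$ points that are $\frac{1}{q^{k_2}}$ spaced (since we chose $q^{k_2} \sim B$, and in fact we will need $q^{k_2} < B$). Now, we sum this over $\ell, d, \eta$. By a lemma from last time, fixing $\ell, d$ and summing over $\eta$, we get
\[ \sum_{\eta}\left|A_{k_2}\left(\frac{1}{q^{k_2}}\,q^{k}\left(\frac{\ell}{d}+\eta\right)\right)\right| \ll q^{k_2}\left(3+\log q\right)^{k_2}. \]
Then, we want to compute
\[ \sum_{\ell,d,\ d\sim D}(q-1)^{k-k_1-k_2}\left(\sup_{|\eta|\sim B/q^{k}}\left|A_{k_1}\left(\frac{\ell}{d}+\eta\right)\right|\right) \ll (q-1)^{k-k_1-k_2}\,q^{k_1}\left(3+\log q\right)^{k_1}, \]
using the lemma from last time again and the fact that rational numbers with denominator on the order of $D$ are $\frac{1}{D^{2}}$ spaced. □

The bound of the above proposition is useful when $D^{2}B$ is small compared to $q^{k}$, where this bound starts to beat the $L^{1}$ bound. We want
\[ \sum_{d\sim D}\sum_{\ell}\sum_{\eta,\ q^{k}|\eta|\sim B}\left|A_k\left(\frac{\ell}{d}+\eta\right)\right|\left|S\left(-\left(\frac{\ell}{d}+\eta\right)\right)\right| \ll (q-1)^{k}\left(D^{2}B\right)^{\alpha_q}\max_{d\sim D,\ q^{k}|\eta|\sim B}\left|S\left(\frac{\ell}{d}+\eta\right)\right|. \]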

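We want this to be small compared to $(q-1)^{k}\cdot q^{k}$. We already know how to bound the size of
\[ \left|S\left(\frac{\ell}{d}+\eta\right)\right| \]
at these points. Recall
\[ \left|S\left(\frac{\ell}{d}+\eta\right)\right| = \left|\sum_{n\leq q^{k}}\Lambda(n)e(n\alpha)\right|. \]

Remark 12.4. Recall that, in the proof of Vinogradov's theorem, we found that when we had an approximation of the form
\[ \left|\alpha-\frac{a}{q}\right| \leq \frac{1}{q^{2}}, \]
the sum
\[ \sum_{n\leq x}\Lambda(n)e(n\alpha) \]
was bounded by
\[ \left(x^{4/5}+\frac{x}{\sqrt{q}}+\sqrt{xq}\right)(\log x)^{3}. \]
We can use the same bound here.

We have $d \leq q^{k/2}$ and $|\eta| \leq \frac{1}{dq^{k/2}}$. Therefore,
\[ \left|S\left(\frac{\ell}{d}+\eta\right)\right| \ll \left(q^{4k/5}+\frac{q^{k}}{\sqrt{D}}+\sqrt{q^{k}D}\right)\left(\log q^{k}\right)^{3}. \]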

The only worry is that if $D$ is small then there is an issue, because $1/\sqrt{D}$ is big. In this case, we would like to look for savings in $B$. So, this is not quite enough when $D$ is small.

Recall that the approximations $\frac{\ell}{d}$ to $\frac{a}{q^{k}}$ are convergents of the continued fraction. We have
\[ \left|\frac{a}{q^{k}}-\frac{\ell}{d}\right| \sim \frac{B}{q^{k}}. \]
Perhaps this approximation is not too good. We could try taking later approximations from the continued fraction, taking the next convergent. Choose a modulus $Q$ and pick an approximation $\frac{u}{v}$ with $v \leq Q$ and
\[ \left|\frac{a}{q^{k}}-\frac{u}{v}\right| \leq \frac{1}{vQ}. \]
Arrange this so that $\frac{1}{dQ} \ll \frac{B}{10q^{k}}$. That is, choose $Q = \frac{10^{3}q^{k}}{BD}$. Then, this $u/v$ is not the same as $\ell/d$ because it is a closer approximation. Further,
\[ \frac{1}{dv} \leq \left|\frac{\ell}{d}-\frac{u}{v}\right| \leq \frac{1}{vQ}+\frac{2B}{q^{k}}. \]
Further, $\frac{1}{vQ}$ is small compared to $\frac{1}{dv}$, so $\frac{1}{dv} \leq \frac{3B}{q^{k}}$, and we get
\[ \frac{10^{3}q^{k}}{BD} \gg v \geq \frac{1}{10}\frac{q^{k}}{BD}. \]
So, in this case, we can redo our previous argument with a larger denominator. Using the bound from before, we see
\[ \left|S\left(\frac{\ell}{d}+\eta\right)\right| \ll \left(q^{4k/5}+\frac{q^{k}}{\sqrt{BD}}\right)\left(\log q^{k}\right)^{3}, \]
using that
\[ \sqrt{q^{k}BD} \ll \sqrt{q^{k}q^{k/2}} = q^{3k/4}, \]
and so we can absorb the third term into $q^{4k/5}$.

We are now basically done. We know $BD$ is at least some power of $\log q^{k}$. We want to bound
\[ \frac{1}{q^{k}}\sum_{\text{minor arcs}}A_k\left(\frac{a}{q^{k}}\right)S\left(\frac{-a}{q^{k}}\right) \ll \left(\log q^{k}\right)^{5}(q-1)^{k}\max_{\substack{D,B\\ DB\geq(\log q^{k})^{A}}}\left(\frac{(D^{2}B)^{\alpha_q}}{q^{k/5}}+\frac{(D^{2}B)^{\alpha_q}}{\sqrt{DB}}\right). \]
Now, $DB \leq q^{k/2}$ and $D \leq q^{k/2}$, so we just need $\alpha_q < \frac15$ for the first term to be sufficiently small, and we need $\alpha_q < \frac14$ for the second term to be sufficiently small. So we just need to check $\alpha_q < \frac15$. Let's now examine this constraint. Recall
\[ \alpha_q = \frac{\log\left(\frac{q}{q-1}\left(3+\log q\right)\right)}{\log q}. \]
If $q$ is sufficiently large, this will hold. Indeed, for $q > 2\cdot10^{6}$ this will hold.

For example, if $\alpha_q = .19$, the first term is bounded by $q^{-.01k}$ and the second term is bounded by $(DB)^{-.12} \leq \left(\log q^{k}\right)^{-.12A}$, and we can choose $A$ as large as we want so that this savings dominates $\left(\log q^{k}\right)^{5}$.

Exercise 12.5. Work out any changes in the case that $q$ is composite. Hint: There is essentially no difference. We only used some simplifications for computing the major arcs. We divided major arcs into cases according to whether the denominators are powers of the modulus $q$. One would then have to work out differences when the denominator divides a power of $q$, or something like that.

Remark 12.6. For further ideas along this line, look at the work of Piatetski-Shapiro yielding primes of the form $\lfloor n^{\alpha}\rfloor$ for $\alpha = 1.01$.
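The threshold on the base can be located numerically from the formula for $\alpha_q$. The sketch below assumes $\alpha_q = \log\left(\frac{q}{q-1}(3+\log q)\right)/\log q$ with natural logarithms, and uses the fact that this expression is decreasing in $q$ to bisect; the function names are ours.

```python
import math


def alpha_q(q):
    """alpha_q = log((q / (q - 1)) * (3 + log q)) / log q, natural logarithms."""
    return math.log(q / (q - 1) * (3 + math.log(q))) / math.log(q)


def smallest_base_with_alpha_below(threshold, lo=3, hi=10**8):
    """Least integer q in [lo, hi] with alpha_q(q) < threshold.

    Uses bisection, assuming alpha_q is decreasing on [lo, hi].
    """
    if alpha_q(lo) < threshold:
        return lo
    if alpha_q(hi) >= threshold:
        raise ValueError("no base in the given range meets the threshold")
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if alpha_q(mid) < threshold:
            hi = mid
        else:
            lo = mid
    return hi


if __name__ == "__main__":
    print(alpha_q(10**6), alpha_q(2 * 10**6))
    print(smallest_base_with_alpha_below(0.2))
```

This is consistent with the remark that $q > 2\cdot 10^6$ suffices: the exponent is still above $1/5$ at $q = 10^6$ and below it at $q = 2\cdot 10^6$.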

13. 11/7/17

Today we'll start talking about something new, the Bombieri-Vinogradov Theorem. This tells us about the distribution of primes in arithmetic progressions. Let $\Psi(x;q,a)$ denote the count of prime powers up to $x$ congruent to $a \bmod q$, weighted by $\Lambda$. Let $(a,q) = 1$. We are looking to estimate
\[ E(x;q,a) := \Psi(x;q,a) - \frac{x}{\phi(q)}. \]
The generalized Riemann hypothesis implies
\[ |E(x;q,a)| \ll x^{1/2}(\log x)^{2}, \]
which is good for $q \leq x^{1/2}/(\log x)^{2}$. Unconditionally, we'll need to deal with Siegel zeros.

Theorem 13.1 (Bombieri-Vinogradov). For every $A > 0$ there exists a $B > 0$ so that
\[ \sum_{q\leq Q}\max_{(a,q)=1}\max_{y\leq x}|E(y;q,a)| \ll \frac{x}{(\log x)^{A}} \]
provided $Q \leq \frac{x^{1/2}}{(\log x)^{B}}$.

Remark 13.2. The generalized Riemann hypothesis yields a bound of the form $Q\cdot x^{1/2}(\log x)^{2}$, which is essentially the same.

Remark 13.3. There is a trivial bound of the form
\[ |E(x;q,a)| \ll \frac{x\log x}{q}, \]
so one trivially obtains the bound $\ll x(\log x)^{2}$, and Bombieri-Vinogradov lets us save arbitrary powers of $\log x$.

Remark 13.4. The key ideas in the proof are
(1) Bilinear forms and Vaughan's identity
(2) Primes in progressions and Siegel zeros
(3) Large Sieve inequalities

13.1. Large Sieve for Additive characters. Suppose we have $\alpha_1, \ldots, \alpha_R \in \mathbb{R}/\mathbb{Z}$ which are $\delta$ well-spaced, i.e., $\|\alpha_r-\alpha_s\| \geq \delta$ for $r \neq s$.

Goal 13.5. Our goal is to bound
\[ \sum_{j=1}^{R}\left|\sum_{n=1}^{N}a_ne\left(n\alpha_j\right)\right|^{2} \]
for $a_n \in \mathbb{C}$.

Instead of taking the sum in the above goal from $n = 1$ to $N$ we can re-parameterize the sum as
\[ \sum_{n=M+1}^{M+N}a_ne(n\alpha) = \sum_{n=1}^{N}a_{M+n}e((M+n)\alpha) = \sum_{n=1}^{N}a_{M+n}e(M\alpha)\cdot e(n\alpha), \]
so this is no more general.

Remark 13.6. We can bound
\[ \left(\sum_n|a_n|\right)^{2} \leq N\sum_n|a_n|^{2} \]
by Cauchy-Schwarz, and we shouldn't expect anything better than this. Suppose, on the other hand, that the $a_n$ are "wiggling around in all directions randomly" and all have norm 1. If the $a_n$ are behaving independently for different values of $n$, then we might expect some kind of square-root cancellation. That is, we might have a bound like
\[ \left(\sum_n|a_n|^{2}\right)\cdot R. \]
Maybe in place of $R$, we might get $\frac{1}{\delta}$, because if the $R$ points were evenly spaced, we would be using square root cancellation and averaging.

We now want to get estimates of the above form. We’ll prove something stronger, but here’s a first pass:

Theorem 13.7. We have
\[ \sum_{j=1}^{R}\left|\sum_{n=1}^{N}a_ne\left(n\alpha_j\right)\right|^{2} \ll \left(N+\frac{1}{\delta}\right)\sum_{n=1}^{N}|a_n|^{2}. \]

We give two proofs.

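First Proof. The first step to prove this is a Sobolev argument. We have
\[ \left|\sum_{n=1}^{N}a_ne\left(n\alpha_j\right)\right|^{2} \ll \frac{1}{\delta}\int_{\alpha_j-\delta/2}^{\alpha_j+\delta/2}\left|\sum_na_ne(n\alpha)\right|^{2}d\alpha + \int_{\alpha_j-\delta/2}^{\alpha_j+\delta/2}\left|\sum_na_ne(n\alpha)\right|\left|\sum_nna_ne(n\alpha)\right|d\alpha. \]
Summing over $j$ from 1 to $R$, we have
\[ \sum_{j=1}^{R}\left|\sum_{n=1}^{N}a_ne\left(n\alpha_j\right)\right|^{2} \ll \frac{1}{\delta}\int_0^{1}\left|\sum_na_ne(n\alpha)\right|^{2}d\alpha + \left(\int_0^{1}\left|\sum_na_ne(n\alpha)\right|^{2}d\alpha\right)^{1/2}\left(\int_0^{1}\left|\sum_nna_ne(n\alpha)\right|^{2}d\alpha\right)^{1/2} \]
\[ \ll \frac{1}{\delta}\sum_n|a_n|^{2} + \left(\sum_n|a_n|^{2}\right)^{1/2}\left(\sum_n|na_n|^{2}\right)^{1/2} \ll \frac{1}{\delta}\sum_n|a_n|^{2} + \left(\sum_n|a_n|^{2}\right)^{1/2}\left(N^{2}\sum_n|a_n|^{2}\right)^{1/2}, \]
using Cauchy-Schwarz and Parseval's identity. □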

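Second proof. This argument is based on duality. Say we have $(a_{mn})_{M\times N}$, an $M \times N$ matrix. From this we can consider three kinds of statements.
(1)
\[ \sum_{m=1}^{M}\left|\sum_{n=1}^{N}a_{mn}y_n\right|^{2} \leq C\sum_{n=1}^{N}|y_n|^{2}. \]
(2)
\[ \left|\sum_{m=1}^{M}\sum_{n=1}^{N}a_{mn}x_my_n\right| \leq \sqrt{C}\left(\sum_m|x_m|^{2}\right)^{1/2}\left(\sum_n|y_n|^{2}\right)^{1/2}. \]
(3)
\[ \sum_{n=1}^{N}\left|\sum_{m=1}^{M}a_{mn}x_m\right|^{2} \leq C\sum_m|x_m|^{2}. \]

Exercise 13.8. Show one of the above three inequalities holds for all choices of $x, y$ if and only if the other two do. I.e., show the above three statements are equivalent.

By duality, in order to give the desired bound, it suffices to bound
\[ \sum_{n=1}^{N}\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^{2} \]
in terms of the $L^{2}$ norm of $b$ for all choices of $b$. Expanding this out, we get
\[ \sum_{n=1}^{N}\left|\sum_{r=1}^{R}b_re(n\alpha_r)\right|^{2} = \sum_{r,s\leq R}b_r\overline{b_s}\sum_{n=1}^{N}e\left(n(\alpha_r-\alpha_s)\right). \]
Since the $\alpha_r$ are all well spaced, the differences $\alpha_r-\alpha_s$ won't be close to integers very often.

There are two types of terms. First, there are those with $r = s$, in which case we get a contribution of $N\sum_r|b_r|^{2}$. There are also the off-diagonal terms with $r \neq s$. Here,
\[ \left|\sum_{n=1}^{N}e(n\theta)\right| \ll \frac{1}{\|\theta\|}, \]
where $\|\theta\|$ is the distance from $\theta$ to the nearest integer. We can then estimate the sum of the off-diagonal terms by
\[ \sum_{r\neq s}\left(|b_r|^{2}+|b_s|^{2}\right)\frac{1}{\|\alpha_r-\alpha_s\|} \ll \sum_r|b_r|^{2}\sum_{s\neq r}\frac{1}{\|\alpha_r-\alpha_s\|} \ll \left(\sum_r|b_r|^{2}\right)\left(\sum_{j=1}^{R}\frac{1}{j\delta}\right) \ll \left(\sum_r|b_r|^{2}\right)\frac{1}{\delta}\log R \ll \left(\sum_r|b_r|^{2}\right)\frac{1}{\delta}\log\frac{1}{\delta}, \]
using symmetry to bound $|b_r|^{2}+|b_s|^{2}$ by (an implicit factor of 2 times) $|b_r|^{2}$. So, we have proved, in the dual form, that
\[ \sum_{n=1}^{N}\left|\sum_rb_re(n\alpha_r)\right|^{2} \leq \left(N+O\left(\frac{1}{\delta}\log\frac{1}{\delta}\right)\right)\sum_r|b_r|^{2}. \]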

We now have an extra factor of log, and we will now explain how to remove this factor of log. We will set it up, but won't really carry it out.

Recall we were trying to estimate
\[ \sum_{n=1}^{N}\left|\sum_{r=1}^{R}b_re\left(n\alpha_r\right)\right|^{2}. \]
Say we start with the characteristic function of $[1,N]$ and take a smoothing $\Phi$ of this characteristic function, supported on a small neighborhood of $[1,N]$ and always nonnegative. We instead try to estimate
\[ \sum_n\Phi(n)\left|\sum_{r=1}^{R}b_re\left(n\alpha_r\right)\right|^{2}. \]
One could imagine one might be able to smooth on the interval
\[ \left(1-\frac{1}{\delta},\ N+\frac{1}{\delta}\right), \]
so that $\Phi$ is supported on this interval. We have
\[ \sum_n\Phi(n)\left|\sum_{r=1}^{R}b_re\left(n\alpha_r\right)\right|^{2} = \sum_{r,s}b_r\overline{b_s}\sum_n\Phi(n)e\left(n(\alpha_r-\alpha_s)\right) = \sum_{r,s}b_r\overline{b_s}\sum_k\hat{\Phi}\left(k+\alpha_r-\alpha_s\right), \]
using Poisson summation. The Fourier transform is large at 0 (around $N+\frac{2}{\delta}$). You can get the rate of decay by integrating by parts many times. One can learn about the decay from the derivatives of $\Phi$. The Fourier transform is approximately supported on an interval of length $\delta$, apart from some small fluctuations. Since $k+\alpha_r-\alpha_s$ never gets within $\delta$ of 0 when $r \neq s$, the value $\hat{\Phi}(k+\alpha_r-\alpha_s)$ is always close to 0 in that case. Therefore, including the contribution at $r = s$, we get that the sum is well estimated by
\[ \hat{\Phi}(0)\sum_{r=1}^{R}|b_r|^{2}, \]
and we save the log term.

Exercise 13.9 (Involved exercise). Complete the above sketch into a proof. □

Remark 13.10. Here is a problem: Can one choose $\Phi \ge 0$, $\Phi \ge \chi_{[1,N]}$ and $\hat{\Phi}$ supported in $(-\delta, \delta)$, minimizing $\hat{\Phi}(0)$?

There is a solution discovered by Beurling and Selberg. One obtains something like $\hat{\Phi}(0) \le N + \frac{1}{\delta} - 1$.

We will next deduce the large sieve from the above theorem. Say we have the points
\[ \left\{ \frac{a}{q} : q \le Q, \ (a, q) = 1 \right\}, \]
which is about $Q^2$ points, each $\frac{1}{Q^2}$ spaced. Then

\[ \sum_{q=1}^{Q} \sideset{}{^*}\sum_{a \bmod q} \Big| \sum_{n=M+1}^{M+N} a_n e\left( \frac{an}{q} \right) \Big|^2 \le \left( N + O(Q^2) \right) \sum_{n=M+1}^{M+N} |a_n|^2 . \]
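As a sanity check on the large sieve inequality just stated, here is a minimal numerical sketch. The function name `large_sieve_lhs` and the parameters $N = 50$, $Q = 5$ are illustrative choices, not from the lecture, and the constant in $O(Q^2)$ is taken to be 1, which is an assumption compatible with the sharp form of the inequality.

```python
import cmath
from math import gcd

def large_sieve_lhs(coeffs, Q, M=0):
    """Sum over q <= Q and reduced residues a mod q of
    |sum_{n=M+1}^{M+N} a_n e(an/q)|^2, where coeffs[i] = a_{M+1+i}."""
    total = 0.0
    for q in range(1, Q + 1):
        for a in range(1, q + 1):
            if gcd(a, q) != 1:
                continue  # the starred sum only runs over (a, q) = 1
            s = sum(c * cmath.exp(2j * cmath.pi * a * (M + 1 + i) / q)
                    for i, c in enumerate(coeffs))
            total += abs(s) ** 2
    return total

N, Q = 50, 5
coeffs = [1.0] * N
lhs = large_sieve_lhs(coeffs, Q)
rhs = (N + Q * Q) * sum(abs(c) ** 2 for c in coeffs)
print(lhs, rhs)
```

With all $a_n = 1$ the term $q = 1$ alone contributes $N^2$, which is the "one bad term" of size $N \sum |a_n|^2$; the remaining fractions contribute comparatively little.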

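Example 13.11 (Important example). Take $a_n = 1$ if $n \in [M+1, M+N]$ is prime and 0 otherwise. To examine the left hand side,
\begin{align*}
\sideset{}{^*}\sum_{a \bmod q} \ \sum_{\substack{n \in [M+1, M+N] \\ n \text{ prime}}} e\left( \frac{an}{q} \right)
&= \sum_{\substack{n \in [M+1, M+N] \\ n \text{ prime}}} \ \sideset{}{^*}\sum_{a \bmod q} e\left( \frac{an}{q} \right) \\
&= \sum_{\substack{n \in [M+1, M+N] \\ n \text{ prime}}} \mu(q) \\
&= \mu(q) \left( \pi(M+N) - \pi(M) \right),
\end{align*}
where $\pi(k)$ is the number of primes up to $k$ (here we use that the primes in the interval are coprime to $q$). Using Cauchy-Schwarz, we get

\[ \varphi(q) \sideset{}{^*}\sum_{a \bmod q} \Big| \sum_{n \text{ prime}} e\left( \frac{an}{q} \right) \Big|^2 \ge \mu(q)^2 \left( \pi(M+N) - \pi(M) \right)^2 . \]

Combining the above, the left hand side of the large sieve is bounded below by

\[ \sum_{q=1}^{Q} \sideset{}{^*}\sum_{a \bmod q} \Big| \sum_{n=M+1}^{M+N} a_n e\left( \frac{an}{q} \right) \Big|^2 \ge \sum_{q \le Q} \frac{\mu(q)^2}{\varphi(q)} \left( \pi(M+N) - \pi(M) \right)^2 \]
and the large sieve implies

\[ \sum_{q \le Q} \frac{\mu(q)^2}{\varphi(q)} \left( \pi(M+N) - \pi(M) \right)^2 \le \left( N + O(Q^2) \right) \left( \pi(M+N) - \pi(M) \right) . \]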

This implies that the number of primes in the interval $M$ to $M + N$ is bounded by
\[ \left( N + O(Q^2) \right) \left( \sum_{q \le Q} \frac{\mu(q)^2}{\varphi(q)} \right)^{-1} . \]
If we make $Q$ too big, this $O(Q^2)$ term will start to dominate. So, we might want $Q$ to be something like $Q = N^{1/2 - \varepsilon}$. The bound then becomes
\[ N (1 + o(1)) \frac{1}{\sum_{q \le N^{1/2 - \varepsilon}} \mu(q)^2 / \varphi(q)} . \]

Exercise 13.12. Show
\[ \sum_{n \le x} \frac{\mu(n)^2}{\varphi(n)} \sim \log x . \]

Then, the number of primes between $M$ and $M + N$ satisfies
\[ N (1 + o(1)) \frac{1}{\sum_{q \le N^{1/2 - \varepsilon}} \mu(q)^2 / \varphi(q)} \le \frac{2N (1 + o(1))}{\log N} . \]
This yields

Theorem 13.13 (Brun-Titchmarsh theorem). We have
\[ \pi(M+N) - \pi(M) \le \frac{2 (1 + o(1)) N}{\log N} . \]

Remark 13.14. The constants in this inequality can be made explicit. In fact, one can replace the bound above by
\[ \pi(M+N) - \pi(M) \le \frac{2N}{\log N} \]
without any error terms. That is, the number of primes from $M$ to $M + N$ is no more than twice the number of primes from 1 to $N$.

Remark 13.15. In fact, one might expect $\pi(x) + \pi(y) \ge \pi(x + y)$. This contradicts a conjecture of Hardy and Littlewood, so is expected to be false.
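The asymptotic in Exercise 13.12 is easy to test numerically. The following sketch (function names are hypothetical, and the cutoff $x = 10^4$ is an arbitrary choice) sieves Euler's function and a squarefree indicator, then compares the sum with $\log x$. It is only a plausibility check, not a proof.

```python
import math

def phi_and_squarefree(limit):
    """Return (phi, squarefree) lists indexed by 0..limit."""
    phi = list(range(limit + 1))
    squarefree = [True] * (limit + 1)
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for m in range(p, limit + 1, p):
                if m > p:
                    is_prime[m] = False
                phi[m] -= phi[m] // p  # multiply phi[m] by (1 - 1/p)
            for m in range(p * p, limit + 1, p * p):
                squarefree[m] = False
    return phi, squarefree

def sieve_weight(x):
    """Compute the sum of mu(n)^2 / phi(n) over n <= x."""
    phi, squarefree = phi_and_squarefree(x)
    return sum(1.0 / phi[n] for n in range(1, x + 1) if squarefree[n])

x = 10_000
print(sieve_weight(x), math.log(x))
```

The sum exceeds $\log x$ by a bounded amount, which is what makes the factor 2 in Brun-Titchmarsh come out of the choice $Q = N^{1/2 - \varepsilon}$.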

Exercise 13.16. Generalize the Brun-Titchmarsh theorem as follows. Use the large sieve appropriately to show
\[ \pi(x; q, a) \le \frac{(2 + o(1)) x}{\varphi(q) \log(x/q)} . \]

For example, suppose $x = q^{1{,}000{,}000}$. Then $\pi(x; q, a)$ is at most 2.00001 times the expected number of primes. I.e.,
\[ \pi(x; q, a) \le (2.00001) \frac{x}{\varphi(q) \log x} . \]
This constant being more than 2 is significant because of Siegel zeros. If there is a Siegel zero $\beta$, then

\[ \Psi(x; q, a) = \frac{x}{\varphi(q)} - \chi(a) \frac{x^{\beta}}{\varphi(q) \beta} \]
(up to lower order terms) for $\chi$ a quadratic character. If one could replace the 2 by 1.99, this would imply there are no Siegel zeros.

Exercise 13.17 (What is large about the large sieve). For primes we used that one residue class is forbidden and so we get some imbalances. Now, more generally, suppose we have $S \subset [1, N]$ with $|S \pmod p| \le \frac{p+1}{2}$. Use the large sieve to show $|S| \le N^{1/2 + \varepsilon}$.

Here, the sieve is large because we are forbidding a large number of residue classes. Say here the primes $p$ range up to $p \le \sqrt{N}$.

Remark 13.18. This bound is tight because if we take $S$ to be the set of squares, we get the claimed number of residue classes.

Remark 13.19. There is a conjecture of Helfgott and Venkatesh saying that if a set misses half the residue classes modulo each prime (and does occupy the other half), it should look like the set of values of some quadratic polynomial.
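Remark 13.18 rests on the fact that the squares occupy exactly $(p+1)/2$ residue classes modulo an odd prime $p$. A short check of this count (the helper name `square_classes` is just for illustration):

```python
def square_classes(p):
    """Number of distinct residues n^2 mod p as n ranges over all residues."""
    return len({(n * n) % p for n in range(p)})

for p in [3, 5, 7, 11, 13, 101]:
    print(p, square_classes(p), (p + 1) // 2)
```

So the set of squares up to $N$ has about $N^{1/2}$ elements and meets the hypothesis of Exercise 13.17 with equality for every odd prime.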

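14. 11/9/17

Last time we discussed the large sieve in its additive form. That is, if $\alpha_1, \ldots, \alpha_R$ are $\delta$ well spaced, then

\[ \sum_{r=1}^{R} \Big| \sum_{n=M+1}^{M+N} a_n e(n\alpha_r) \Big|^2 \le \left( N + O\left( \frac{1}{\delta} \right) \right) \sum_n |a_n|^2 . \]

One way we'll apply this is by taking the points $\frac{a}{q}$ with $(a, q) = 1$, $q \le Q$, so that $R \sim Q^2$ and $\delta \sim \frac{1}{Q^2}$, obtaining
\[ \sum_{q \le Q} \sideset{}{^*}\sum_{a \bmod q} \Big| \sum_{n=M+1}^{M+N} a(n) e\left( \frac{an}{q} \right) \Big|^2 \le \left( N + O(Q^2) \right) \sum_n |a(n)|^2 . \]

14.1. A multiplicative version of the large sieve. We'll now formulate the large sieve in a multiplicative form in order to prove Bombieri-Vinogradov. We'll average over all primitive characters $\chi \bmod q$ and sum over $q \le Q$:
\[ \sum_{q \le Q} \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{n=M+1}^{M+N} a(n) \chi(n) \Big|^2 \le \left( N + O(Q^2) \right) \sum_n |a(n)|^2 . \]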

Remark 14.1. Here the term of size $N$ corresponds to a particular "bad character," and the $Q^2$ corresponds to the sum over the remaining characters, with square-root savings, bounded by some $L^2$ norm. We'd like to think characters of different moduli are orthogonal to each other, but we don't want to recount characters, so we have the star on our sum to indicate we are summing over primitive characters $\chi$ (not induced by characters of smaller modulus).

In fact, we'll obtain something slightly more precise than the above. We want to go from multiplicative characters to something involving additive characters. We'll want to pass between $\chi(n)$ and $e\left( \frac{n}{q} \right)$. Let $\chi \bmod q$ be a primitive character. Let
\[ \tau(\chi) = \sum_{a \bmod q} \chi(a) e\left( \frac{a}{q} \right) \]
be the Gauss sum. Suppose $(n, q) = 1$. Then, consider
\begin{align*}
\sum_{a \bmod q} \chi(a) e\left( \frac{an}{q} \right)
&= \sum_{a \bmod q} \chi(a) e\left( \frac{an}{q} \right) \chi(n) \overline{\chi}(n) \\
&= \tau(\chi) \overline{\chi}(n),
\end{align*}
noting that $\chi(n) \overline{\chi}(n) = 1$ if $n$ is coprime to $q$. Applying this with $\overline{\chi}$ in place of $\chi$,
\[ \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) e\left( \frac{an}{q} \right) . \]
This holds for all $\chi \bmod q$ so long as $(n, q) = 1$.

Exercise 14.2. If χ is primitive, then in fact the above equality is true for all n. That is, if n has factor in common with q, then the left hand side is 0, and we have to check the right hand side is also zero so long as χ is primitive. For example, consider q a prime. Then, every character except the principal character has right hand side evaluating to 0 when q | n.
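To make the passage between multiplicative and additive characters concrete, here is a small numerical sketch for the quadratic character modulo a prime $p$ (the Legendre symbol, which is real, so $\overline{\chi} = \chi$). The choice $p = 13$ and the helper names are assumptions made only for illustration. It checks $|\tau(\chi)| = \sqrt{p}$ and the identity $\sum_a \chi(a) e(an/p) = \tau(\chi) \overline{\chi}(n)$ for every $n$, including $p \mid n$ as in Exercise 14.2.

```python
import cmath
import math

def legendre(a, p):
    """Quadratic character mod an odd prime p, with value 0 when p divides a."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def e(t):
    """The additive character e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

def twisted_sum(p, n):
    """Sum over a mod p of chi(a) e(an/p)."""
    return sum(legendre(a, p) * e(a * n / p) for a in range(p))

def gauss_sum(p):
    return twisted_sum(p, 1)

p = 13
tau = gauss_sum(p)
print(abs(tau), math.sqrt(p))
```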

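We have

\[ \sum_{n=M+1}^{M+N} a(n) \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) \sum_{n=M+1}^{M+N} a(n) e\left( \frac{an}{q} \right) . \]

Let

\[ S\left( \frac{a}{q} \right) := \sum_{n=M+1}^{M+N} a(n) e\left( \frac{an}{q} \right) . \]
We wanted to bound
\begin{align*}
\sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_n a(n) \chi(n) \Big|^2
&= \frac{1}{q} \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{a \bmod q} \overline{\chi}(a) S\left( \frac{a}{q} \right) \Big|^2 \\
&\le \frac{1}{q} \sum_{\chi \bmod q} \Big| \sum_{a \bmod q} \overline{\chi}(a) S\left( \frac{a}{q} \right) \Big|^2 \\
&= \frac{\varphi(q)}{q} \sideset{}{^*}\sum_{a \bmod q} \Big| S\left( \frac{a}{q} \right) \Big|^2 ,
\end{align*}
using that $|\tau(\chi)| = \sqrt{q}$ and, in the last step, orthogonality of characters. So, using the above and the large sieve,

\begin{align*}
\sum_{q \le Q} \frac{q}{\varphi(q)} \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{n=M+1}^{M+N} a(n) \chi(n) \Big|^2
&\le \sum_{q \le Q} \sideset{}{^*}\sum_{a \bmod q} \Big| S\left( \frac{a}{q} \right) \Big|^2 \\
&\le \left( N + O(Q^2) \right) \sum_{n=M+1}^{M+N} |a(n)|^2 .
\end{align*}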

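Remark 14.3. The idea is that we are estimating some quantity on average, and one term is very bad and the rest of the terms have square-root cancellation.

14.2. Proving Bombieri Vinogradov. Let $Q = \frac{\sqrt{x}}{(\log x)^B}$. Our goal is to bound

\[ \sum_{q \le Q} \max_{(a,q)=1} \Big| \Psi(x; q, a) - \frac{x}{\varphi(q)} \Big| \ll \frac{x}{(\log x)^A} . \]
In our original statement, we also had a maximum over $y$ up to $x$, which we will forget about, as it is not so important. Recall
\begin{align*}
\Psi(x; q, a) &= \frac{1}{\varphi(q)} \sum_{\chi} \overline{\chi}(a) \Psi(x, \chi) \\
&= \frac{1}{\varphi(q)} \sum_{\chi \neq \chi_0} \overline{\chi}(a) \Psi(x, \chi) + \frac{x}{\varphi(q)} + O\left( \frac{x}{\varphi(q) (\log x)^{A+100}} \right)
\end{align*}
and this error term is bounded, after summing over $q$, using
\[ \sum_{q \le Q} \frac{1}{\varphi(q)} \ll \log x . \]
Here we are using
\[ 1_{n \equiv a \bmod q} = \frac{1}{\varphi(q)} \sum_{\chi} \overline{\chi}(a) \chi(n) \]
and so
\[ \Psi(x; q, a) = \frac{1}{\varphi(q)} \sum_{\chi} \overline{\chi}(a) \sum_{n \le x} \Lambda(n) \chi(n) . \]
Then, up to the error term above, we get

\[ \max_{(a,q)=1} \Big| \Psi(x; q, a) - \frac{x}{\varphi(q)} \Big| \le \frac{1}{\varphi(q)} \sum_{\chi \neq \chi_0} |\Psi(x, \chi)| . \]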

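Suppose $\chi \bmod q$ is induced by some primitive character $\tilde{\chi} \bmod \tilde{q}$. We'll assume $\tilde{q} > 1$ so the principal character does not show up, and then $\tilde{q} \mid q$. Then,
\[ \sum_{q \le Q} \frac{1}{\varphi(q)} \sum_{\substack{\chi \bmod q \\ \chi \neq \chi_0}} |\Psi(x, \chi)| = \sum_{1 < \tilde{q} \le Q} \ \sideset{}{^*}\sum_{\tilde{\chi} \bmod \tilde{q}} \ \sum_{\substack{q \le Q \\ \tilde{q} \mid q}} \frac{1}{\varphi(q)} |\Psi(x, \chi)| . \]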

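We have $\chi(n) = \tilde{\chi}(n)$ if $(n, q) = 1$. If $(n, q) > 1$ but $(n, \tilde{q}) = 1$ then the two could be different.

We have the bound
\[ |\Psi(x, \chi) - \Psi(x, \tilde{\chi})| \le \sum_{\substack{n \le x \\ (n,q) > 1 \\ (n, \tilde{q}) = 1}} \Lambda(n) \ll \log x \cdot \# \{ p \mid q : p \nmid \tilde{q} \} \ll (\log x)^2 . \]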

Exercise 14.4. Show that we can bound ∗ 1 |Ψ(x, χ)| . ∑ ∑ ∑ φ(q) 1

Remark 14.6. We'd like to bound
\[ \sum_{q \le \sqrt{x}/(\log x)^B} \max_{a} |E(x; q, a)| \ll \frac{x}{(\log x)^A} . \]

Even if we are only interested in $q \ge x^{1/3}$, we still will have to deal with small moduli because of imprimitive characters. We could avoid dealing with small moduli if we only sum over primes.

Exercise 14.7. Work out a Bombieri-Vinogradov theorem in the range $x^{1/3}$ to $Q = \frac{\sqrt{x}}{(\log x)^B}$ for integers all of whose prime factors are bigger than $x^{1/10}$.

14.3. The case R is small. Here we can use Siegel zeros and what we know about zero-free regions. If $q \le (\log x)^{10A}$ then
\[ |\Psi(x, \chi)| \ll \frac{x}{(\log x)^{100A + 100}} \]
using Siegel's theorem. These easily yield a bound of Equation 14.2 by
\[ \frac{x}{(\log x)^{10A + 10}} . \]

14.4. The case R is large. Now, let's deal with the range
\[ (\log x)^{10A} \le R \le \frac{\sqrt{x}}{(\log x)^B} . \]
We now use the trick of decomposing $\Lambda(n)$ via Vaughan's identity. Recall that for $\sum_n \Lambda(n) e(n\alpha)$, Vaughan's identity reduces us to sums of the form

\[ \sum_m \sum_n a_m b_n e(mn\alpha), \]
for which we had a good bound.

There is an issue: if we only wanted to estimate $\sum_n \Lambda(n) \chi(n)$ for one character $\chi$, we would get something like $\sum_{m,n} a_m b_n \chi(m) \chi(n)$, which we cannot bound well. But we're only trying to bound an average over all characters, with $q$ ranging around $R$. The idea is now to write down Vaughan's identity and then use the large sieve.

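Recall Vaughan's identity says that for
\[ P(s) = \sum_{n \le U} \frac{\Lambda(n)}{n^s}, \qquad M(s) = \sum_{n \le V} \frac{\mu(n)}{n^s} \]
we have
\[ -\frac{\zeta'}{\zeta}(s) = \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta(s) M(s) \right) + P(s) - \zeta'(s) M(s) - \zeta(s) M(s) P(s) \]
with the first term on the right a type 2 sum and the latter three terms type 1 sums. We'd now like to try to bound what all these terms give us. First, let's consider the type 2 sum.

14.5. Bounding the type 2 sum. Recall we are trying to bound some sum of terms of the form $\sum_n \Lambda(n) \chi(n)$. Expanding $\Lambda(n)$ using Vaughan's identity, the type 2 term contributes
\[ \sum_{R \le q \le 2R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m > U} \sum_{\substack{n > V \\ mn \le x}} \Lambda(m) \Big( \sum_{\substack{d \mid n \\ d > V}} \mu(d) \Big) \chi(m) \chi(n) \Big| . \]

But note that the term
\[ \sum_{\substack{d \mid n \\ d > V}} \mu(d) \]
is bounded by $d(n)$, so we can essentially ignore this.

Exercise 14.8. Use a Perron type integral to separate the variables $m$ and $n$. Then, group them into dyadic blocks with $M \le m \le 2M$, $N \le n \le 2N$, with the conditions $M \ge U$, $N \ge V$, $MN \ll x$, to remove the dependence $mn \le x$. Then, using the above, show
\begin{align*}
&\sum_{R \le q \le 2R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m > U} \sum_{\substack{n > V \\ mn \le x}} \Lambda(m) \Big( \sum_{\substack{d \mid n \\ d > V}} \mu(d) \Big) \chi(m) \chi(n) \Big| \\
&\qquad \ll (\log x)^3 \sum_{q \sim R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m \sim M} \Lambda(m) \chi(m) \Big| \, \Big| \sum_{n \sim N} a(n) \chi(n) \Big| .
\end{align*}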

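We now use Cauchy-Schwarz and $\sum_{m \sim M} \Lambda(m)^2 \ll M \log M$, $\sum_{n \sim N} d(n)^2 \ll N (\log N)^3$ to obtain

\begin{align*}
&(\log x)^3 \sum_{q \sim R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m \sim M} \Lambda(m) \chi(m) \Big| \, \Big| \sum_{n \sim N} a(n) \chi(n) \Big| \\
&\quad \ll (\log x)^3 \Big( \sum_q \sideset{}{^*}\sum_{\chi} \Big| \sum_{m \sim M} \Lambda(m) \chi(m) \Big|^2 \Big)^{1/2} \Big( \sum_q \sideset{}{^*}\sum_{\chi} \Big| \sum_{n \sim N} a(n) \chi(n) \Big|^2 \Big)^{1/2} \\
&\quad \ll (\log x)^3 \Big( (M + R^2) \sum_{m \sim M} \Lambda(m)^2 \Big)^{1/2} \Big( (N + R^2) \sum_{n \sim N} d(n)^2 \Big)^{1/2} \\
&\quad \ll (\log x)^5 \left\{ M^2 + M R^2 \right\}^{1/2} \left\{ N^2 + N R^2 \right\}^{1/2} \\
&\quad \ll (\log x)^5 \left( MN + \frac{MN}{\sqrt{M}} R + \frac{MN}{\sqrt{N}} R + \sqrt{MN} R^2 \right) \\
&\quad \ll (\log x)^5 \left( x + \frac{xR}{\sqrt{U}} + \frac{xR}{\sqrt{V}} + \sqrt{x} R^2 \right) .
\end{align*}
Then, taking $U = V = x^{1/10}$, we see
\begin{align*}
&\max_{(\log x)^{10A} \le R \le \sqrt{x}/(\log x)^B} \frac{1}{R} (\log x)^5 \left( x + \frac{xR}{\sqrt{U}} + \frac{xR}{\sqrt{V}} + \sqrt{x} R^2 \right) \\
&\quad \ll \frac{x}{(\log x)^{10A - 5}} + x (\log x)^5 \left( \frac{1}{\sqrt{U}} + \frac{1}{\sqrt{V}} \right) + \frac{x (\log x)^5}{(\log x)^B} .
\end{align*}
For $B > A + 10$ or so we can bound all the terms by $x/(\log x)^A$. So, this completes the type 2 sum case. It only remains to deal with the type 1 sum case. We'll do the trivial type 1 sum, which comes from $P(s)$. This is
\[ \sum_{n \le U} \frac{\Lambda(n)}{n^s} . \]
We then have to bound
\[ \frac{1}{R} \sum_{q \sim R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{n \le U} \Lambda(n) \chi(n) \Big| \ll UR \ll \sqrt{x} \, U . \]

Since $U$ is small, around $x^{1/10}$, we have bounded this sum.

Next time we'll deal with the other two terms. Let's just give an idea of how to deal with one of them now. We're trying to bound the term corresponding to $\zeta(s) M(s) P(s)$. We want to bound
\[ \sum_{q \sim R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m \le U, \, n \le V} \Lambda(m) \mu(n) \sum_{k \le x/mn} \chi(k) \chi(m) \chi(n) \Big| . \]
We then have the problem of evaluating
\[ \sum_{k \le x/mn} \chi(k) \]
which is certainly bounded by $q$, and we'd like to even improve this a bit to $\sqrt{q}$, plug it in, and take trivial estimates on everything.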

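15. 11/14/17

15.1. Brun-Titchmarsh. Recall a few classes ago, we were trying to bound $\pi(M+N) - \pi(M)$. We used the identity
\[ \sideset{}{^*}\sum_{a \bmod q} \ \sum_{M+1 \le p \le M+N} e\left( \frac{ap}{q} \right) = \mu(q) \left( \pi(M+N) - \pi(M) \right) . \]
We used Cauchy-Schwarz to bound
\[ \mu(q)^2 \left( \pi(M+N) - \pi(M) \right)^2 \le \Big( \sideset{}{^*}\sum_{a \bmod q} 1 \Big) \Big( \sideset{}{^*}\sum_{a \bmod q} \Big| \sum_{M+1 \le p \le M+N} e\left( \frac{ap}{q} \right) \Big|^2 \Big) . \]
One then gets a bound to which one can now apply the large sieve.

15.2. Back to Bombieri Vinogradov. Recall we have reduced the proof to bounding
\[ (\log x)^3 \max_{R \le \sqrt{x}/(\log x)^B} \frac{1}{R} \sum_{R \le q \le 2R} \ \sideset{}{^*}\sum_{\chi \bmod q} |\Psi(x; \chi)| . \]

We had two ranges. If $R$ is small, like $R \le (\log x)^{10A}$, we could use our bounds for $|\Psi(x, \chi)|$. To conclude, we only needed to deal with
\[ (\log x)^{10A} \le R \le \frac{\sqrt{x}}{(\log x)^B} \]
using Vaughan's identity and the large sieve.

Recall
\[ \left( -\frac{\zeta'}{\zeta}(s) - P(s) \right) \left( 1 - \zeta(s) M(s) \right) = -\frac{\zeta'}{\zeta}(s) - P(s) + \zeta'(s) M(s) + \zeta(s) M(s) P(s) \]
with
\[ P(s) = \sum_{n \le U} \frac{\Lambda(n)}{n^s}, \qquad M(s) = \sum_{n \le V} \frac{\mu(n)}{n^s} . \]
We were able to bound the type 2 sum by
\[ (\log x)^5 \left( \frac{x}{R} + \frac{x}{\sqrt{U}} + \frac{x}{\sqrt{V}} + \sqrt{x} R \right) \ll \frac{x}{(\log x)^A} . \]
We also bounded the contribution of $P$ by $UR \ll x^{0.6}$. Next, we bound the type 1 sum $\zeta M P$ given by

\[ \frac{1}{R} \sum_{R \le q \le 2R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m \le U, \, n \le V} \Lambda(m) \mu(n) \sum_{k \le x/mn} \chi(kmn) \Big| . \]
We'll also bound this crudely, forgetting about the sums over $m$ and $n$, and get cancellation from the sum over $k$.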

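15.3. Polya Vinogradov theorem. We'll prove the Polya-Vinogradov theorem:

Theorem 15.1 (Polya-Vinogradov Theorem). Suppose $\chi \bmod q$ is primitive. Then,

\[ \max_x \Big| \sum_{n \le x} \chi(n) \Big| \ll \sqrt{q} \log q . \]

Remark 15.2. Assuming Polya-Vinogradov, we can bound the type 1 sum $\zeta M P$ by

\begin{align*}
\frac{1}{R} \sum_{R \le q \le 2R} \ \sideset{}{^*}\sum_{\chi \bmod q} \Big| \sum_{m \le U, \, n \le V} \Lambda(m) \mu(n) \sum_{k \le x/mn} \chi(kmn) \Big|
&\ll \frac{1}{R} R^2 \, UV \sqrt{R} \log R \\
&\ll x^{1 - \varepsilon} .
\end{align*}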

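Exercise 15.3. Show
\[ \sum_{n \le x} \chi(n) \log n \ll \sqrt{q} (\log q)(\log x) \]
using partial summation. Hint: Consider
\[ \int_1^x \Big( \sum_{n \le t} \chi(n) \Big) \frac{dt}{t} \]
and obtain a $\log n$ from this integral.

Exercise 15.4. Bound $\zeta'(s) M(s)$ in a similar way using the previous exercise. In fact, one can bound this term by
\[ \frac{1}{R} R^2 \, V \sqrt{R} (\log R)(\log x) . \]

It only remains to prove the Polya-Vinogradov Theorem.

Proof. The idea is to rewrite the character $\chi$ in terms of additive characters. For $(n, q) = 1$,
\[ \sum_{a \bmod q} \chi(a) e\left( \frac{an}{q} \right) = \overline{\chi}(n) \sum_{a \bmod q} \chi(a) \chi(n) e\left( \frac{an}{q} \right) = \tau(\chi) \overline{\chi}(n) . \]
Applying this to $\overline{\chi}$ yields
\[ \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) e\left( \frac{an}{q} \right) . \]
Therefore, we have
\[ \sum_{n \le x} \chi(n) = \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) \sum_{n \le x} e\left( \frac{an}{q} \right) . \]
Summing
\[ \sum_{n \le x} e\left( \frac{an}{q} \right) \]
as a geometric progression, we have
\[ \Big| \sum_{n \le x} e\left( \frac{an}{q} \right) \Big| \le \min\left( x, \frac{1}{\| a/q \|} \right) . \]

We also have

\[ \frac{1}{|\tau(\overline{\chi})|} = \frac{1}{\sqrt{q}} . \]

We know $\big| \sum_{n \le x} \chi(n) \big| \le x$. Say $-q/2 \le a \le q/2$. Then $\| a/q \| = |a|/q$, and we use the bound $x$ if $|a| \le q/x$ and the bound $q/|a|$ if $|a| > q/x$. Therefore, we obtain a bound

\begin{align*}
\sum_{n \le x} \chi(n) &= \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) \sum_{n \le x} e\left( \frac{an}{q} \right) \\
&\ll \frac{1}{\sqrt{q}} (q \log q) \\
&= \sqrt{q} \log q .
\end{align*}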
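The Polya-Vinogradov bound can be observed numerically for the quadratic character modulo a prime. The sketch below (helper names and the sample primes are illustrative assumptions) computes the largest partial sum of the Legendre symbol and compares it with $\sqrt{q} \log q$, taking the implied constant to be 1.

```python
import math

def legendre(a, p):
    """Quadratic character mod an odd prime p, with value 0 when p divides a."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1

def max_partial_sum(p):
    """max over x < p of |sum_{n <= x} chi(n)| for chi the Legendre symbol."""
    total = 0
    best = 0
    for n in range(1, p):
        total += legendre(n, p)
        best = max(best, abs(total))
    return best

for p in [101, 1009, 10007]:
    print(p, max_partial_sum(p), math.sqrt(p) * math.log(p))
```

Since the character sum is periodic with period $p$ and vanishes over a full period, restricting to $x < p$ loses nothing.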



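Remark 15.5. Here is an alternate heuristic explanation of Polya-Vinogradov. For a smooth cutoff $\Phi$ we have

\begin{align*}
\sum_n \chi(n) \Phi\left( \frac{n}{x} \right)
&= \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) \sum_n \Phi\left( \frac{n}{x} \right) e\left( \frac{an}{q} \right) \\
&= \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) \sum_k x \hat{\Phi}\left( x \left( k + \frac{a}{q} \right) \right) \\
&= \frac{1}{\tau(\overline{\chi})} \sum_{a \bmod q} \overline{\chi}(a) \sum_k x \hat{\Phi}\left( x \, \frac{kq + a}{q} \right) \\
&= \frac{1}{\tau(\overline{\chi})} \sum_m \overline{\chi}(m) \, x \hat{\Phi}\left( \frac{xm}{q} \right),
\end{align*}
where we let $m = kq + a$. So the left hand side is a sum of $\chi$ of length about $x$ and the right hand side is a sum of $\overline{\chi}$ of length about $q/x$. This is an involution. This explains why Polya-Vinogradov holds. If $x < \sqrt{q}$, we get a bound by $\sqrt{q}$ trivially. If not, do the flip and bound the right hand side trivially, which gives a bound $\frac{x}{\sqrt{q}} \cdot \frac{q}{x} \ll \sqrt{q}$.

Say we want to understand $L\left( \frac{1}{2}, \chi \right)$. We have
\begin{align*}
L\left( \frac{1}{2}, \chi \right) &= \sum_{n=1}^{\infty} \frac{\chi(n)}{\sqrt{n}} \\
&= \int_1^{\infty} \frac{1}{\sqrt{y}} \, d\Big( \sum_{n \le y} \chi(n) \Big) \\
&= \frac{1}{2} \int_1^{\infty} \frac{1}{y^{3/2}} \Big( \sum_{n \le y} \chi(n) \Big) dy
\end{align*}
where we can bound
\[ \sum_{n \le y} \chi(n) \ll \min\left( y, \sqrt{q} \log q \right) \]
using Polya-Vinogradov.

Exercise 15.6. Show the above is bounded by $\ll q^{1/4} \log q$.

The kind of argument we were discussing in Remark 15.5 yields, approximately,
\[ L\left( \frac{1}{2}, \chi \right) = \sum_{n \le \sqrt{q}} \frac{\chi(n)}{\sqrt{n}} + \varepsilon(\chi) \sum_{n \le \sqrt{q}} \frac{\overline{\chi}(n)}{\sqrt{n}} \]
where $\varepsilon(\chi)$ is a complex number of size 1.

So, $q^{1/4} \log q$ is called the convexity bound for $L\left( \frac{1}{2}, \chi \right)$. The Riemann hypothesis implies the Lindelöf hypothesis, which implies
\[ L\left( \frac{1}{2}, \chi \right) \ll q^{\varepsilon} \]
for any $\varepsilon > 0$. In fact, we can slightly improve the convexity bound.

Theorem 15.7 (Burgess). We have
\[ L\left( \frac{1}{2}, \chi \right) \ll q^{3/16 + \varepsilon} \]
for $q$ cube free.

If $\chi$ is quadratic, there is an even better result:

Theorem 15.8 (Conrey and Iwaniec). For $\chi$ quadratic,
\[ L\left( \frac{1}{2}, \chi \right) \ll q^{1/6 + \varepsilon} . \]

Burgess has a result saying that if $q$ is cubefree then

\[ \sum_{n \le x} \chi(n) = o(x) \]
if $x \ge q^{1/4 + \varepsilon}$.

Exercise 15.9. If $\chi$ is quadratic mod $q$ for $q$ a prime, then the least quadratic non-residue (lqnr) mod $q$ is at most $q^{1/2} \log q$. Gauss showed $\mathrm{lqnr} \le q^{1/2}$. A trick of Vinogradov allows you to save a factor of $\sqrt{e}$ in the exponent, and Polya-Vinogradov yields
\[ \mathrm{lqnr} \le q^{1/(2\sqrt{e}) + \varepsilon} . \]
Burgess implies
\[ \mathrm{lqnr} \le q^{1/(4\sqrt{e}) + \varepsilon} . \]

15.4. Some more extended exercises.

Exercise 15.10 (Difficult, theorem of Goldfeld).

Question 15.11 (Open question, Sophie Germain). Are there infinitely many primes $p$ with $p - 1 = 2q$ with $q$ a prime?

A weakening of the above statement would be to find primes $p$ so that $p - 1$ has a large prime factor. Let $P(x)$ be the largest prime dividing $x$. The exercise is to show that there are lots of primes $p \le x$ so that
\[ P(p - 1) \ge x^{1/2 + \delta} \]
for some $\delta > 0$. Hint: The point of this exercise is to use Bombieri-Vinogradov. Here is the idea of the proof. We have
\[ \sum_{p \le x} \Big( \sum_{q \mid p - 1} \Lambda(q) \Big) = \sum_{p \le x} \log(p - 1) \sim x . \]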

Suppose, for contradiction, that $q \le x^{1/2 + \delta}$ always. Let $Q = x^{1/2 + \delta}$. We can now exchange these two sums to obtain
\[ \sum_{q \le Q} \log q \cdot \pi(x; q, 1) . \]
The range $q \le x^{1/2}/(\log x)^B$ can be handled by Bombieri-Vinogradov. So, we write
\[ \sum_{q \le Q} \log q \cdot \pi(x; q, 1) = \sum_{q \le x^{1/2}/(\log x)^B} \log q \cdot \pi(x; q, 1) + \sum_{x^{1/2}/(\log x)^B \le q \le Q} \log q \cdot \pi(x; q, 1) \]
and the first term is asymptotic to
\[ \sum_{q \le x^{1/2}/(\log x)^B} (\log q) \frac{\pi(x)}{\varphi(q)} \sim \frac{x}{2}, \]
where we can bound the error term by Bombieri-Vinogradov. So, there must be a large contribution from the second term. We don't know how to control $\pi(x; q, 1)$ there. But, we do have an upper bound by Brun-Titchmarsh:
\[ \pi(x; q, 1) \le \frac{2x (1 + o(1))}{\varphi(q) \log(x/q)} . \]
Use this to show that the second term is at most $.49x$ if $\delta \le .01$. This finishes the proof, because then these two terms cannot add up to be as big as $x$, which is a contradiction.

Remark 15.12. The above theorem was published under Morris Goldfeld, but this is the same person as Dorian Goldfeld, who changed his name after he published this.

Exercise 15.13 (Difficult exercise, Titchmarsh divisor problem). Pick a number at random and look at, say, its largest prime factor. "Does $p - 1$ look in some ways like a random number?" Prove the following lemma.

Lemma 15.14. We have
\[ \sum_{n \le x} d(n) \sim x \log x \]
and
\[ \sum_{p \le x} d(p - 1) \sim Cx \]
for some $C > 0$.

Sketch of proof. We know (up to the contribution of perfect squares)
\[ d(n) = 2 \sum_{d \mid n, \, d \le \sqrt{n}} 1 \]
and so it suffices to estimate
\[ \sum_{p \le x} \ \sum_{d \le \sqrt{x}, \, d \mid p - 1} 1 = \sum_{d \le \sqrt{x}} \pi(x; d, 1) . \]
One can solve this problem by using Bombieri-Vinogradov in the range $d \le \sqrt{x}/(\log x)^B$. For the small region $\sqrt{x}/(\log x)^B \le d \le \sqrt{x}$, try to bound $\pi(x; d, 1)$ by Brun-Titchmarsh and hope it becomes an error term.

Question 15.15 (Open problem). Let $d_3(n)$ be defined by
\[ \zeta(s)^3 = \sum_{n=1}^{\infty} \frac{d_3(n)}{n^s}, \]
so $d_3(n)$ is the number of ways of writing $n = abc$. Bound

\[ \sum_{p \le x} d_3(p - 1) . \]
One can keep track of small prime factors, but occasionally $p - 1$ might have very large prime factors.

Conjecture 15.16 (Montgomery's conjecture). We have
\[ \Psi(x; q, a) = \frac{\Psi(x)}{\varphi(q)} + O\left( \frac{x^{1/2 + \varepsilon}}{\sqrt{q}} \right) . \]

Remark 15.17. The reasoning behind this is that the error term is approximately
\[ \frac{1}{\varphi(q)} \sum_{\chi \bmod q, \, \chi \neq \chi_0} \overline{\chi}(a) \Psi(x, \chi) \]
and we can bound $\Psi(x, \chi)$ by $x^{1/2}$ and then get some cancellation in the sums of the characters.

Montgomery's conjecture would imply the Elliott-Halberstam conjecture.

Conjecture 15.18 (Elliott-Halberstam conjecture). We have

\[ \sum_{q \le x^{1 - \varepsilon}} \max_{(a,q)=1} \Big| \Psi(x; q, a) - \frac{\Psi(x)}{\varphi(q)} \Big| \ll \frac{x}{(\log x)^A} \]
for any $A > 0$.

16. 11/16/17

Today we'll begin a discussion of gaps between primes. Let $p_n$ be the $n$th prime. By the prime number theorem, $p_n \sim n \log n$. Hence, on average, $p_{n+1} - p_n \sim \log n$.

Question 16.1 (Open question). What can we say about the distribution of
\[ \frac{p_{n+1} - p_n}{\log n} \]
as $n$ varies?

To make sense of this question, we can pick an interval $(\alpha, \beta) \subset \mathbb{R}_{>0}$ and ask about
\[ \lim_{N \to \infty} \frac{1}{N} \# \left\{ n \le N : \frac{p_{n+1} - p_n}{\log p_n} \in (\alpha, \beta) \right\} . \]
How might we guess this? There is a naive model called the Cramer model. This is clearly bogus, but also reasonably successful. Define the random variable $X_n$ by
\[ X_n := \begin{cases} 1 & \text{with probability } \frac{1}{\log n} \\ 0 & \text{with probability } 1 - \frac{1}{\log n} \end{cases} \]
for $n \ge 3$. Now, let's compute the probability

\[ \mathrm{Prob} \left( X_{n+1} = X_{n+2} = \cdots = X_{n+h-1} = 0, \ X_{n+h} = 1, \ \text{given } X_n = 1 \right) . \]
This is asking for the chance of a gap of size $h$. To calculate this, thinking of $h$ as being of size comparable to $\log n$, we see this is approximately
\[ \left( 1 - \frac{1}{\log n} \right)^{h-1} \frac{1}{\log n} \sim \frac{1}{\log n} e^{-h/\log n} . \]
Thinking of this a different way, looking at the interval $[n, n + h]$, we can ask for the chance there are exactly $k$ values for which $X_m = 1$. We can also handle this quite easily. We can pick $k$ numbers to be primes. Calculating using the binomial distribution, we see this chance is
\[ \binom{h}{k} \frac{1}{(\log n)^k} \left( 1 - \frac{1}{\log n} \right)^{h-k} \sim \left( \frac{h}{\log n} \right)^k \frac{1}{k!} e^{-h/\log n} . \]
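The passage from the binomial count to the Poisson shape can be checked numerically. In the sketch below the value $n = 10^{12}$ and the window $h \approx \log n$ are arbitrary illustrative choices, and the function names are hypothetical; the two columns printed are the exact binomial probability in the Cramer model and its Poisson approximation.

```python
import math

def binomial_prob(h, k, p):
    """Chance of exactly k successes among h independent trials of probability p."""
    return math.comb(h, k) * p ** k * (1 - p) ** (h - k)

def poisson_prob(lam, k):
    """Poisson probability e^{-lam} lam^k / k!."""
    return math.exp(-lam) * lam ** k / math.factorial(k)

n = 1e12
p = 1 / math.log(n)          # Cramer model: each integer near n is "prime" with this chance
h = round(math.log(n))       # window of length about log n
lam = h * p                  # expected number of "primes" in the window

for k in range(6):
    print(k, binomial_prob(h, k, p), poisson_prob(lam, k))
```

The agreement improves as $n$ grows, since the discrepancy is controlled by $h p^2 \approx 1/\log n$.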

Example 16.2. So the guess one might obtain from this is that in the interval $[n, n + \log n]$ the chance of finding $k$ primes is about
\[ \frac{1}{k!} e^{-1} . \]

So, we might make the following conjecture.

Conjecture 16.3. If $h = \lambda \log n$, then as $n \to \infty$ is chosen randomly, the number of primes in $[n, n + h]$ is Poisson with parameter $\lambda$. That is,
\[ \frac{1}{N} \# \{ n \le N : [n, n + h] \text{ contains } k \text{ primes} \} \sim \frac{\lambda^k}{k!} e^{-\lambda} . \]

Remark 16.4. Saying this another way,
\[ \frac{1}{N} \# \left\{ n \le N : \frac{p_{n+1} - p_n}{\log p_n} \in (\alpha, \beta) \right\} \sim \int_{\alpha}^{\beta} e^{-x} \, dx . \]

Example 16.5. One could ask the same sort of question about any subset of the integers (or discrete subset of the real numbers). For example, say we would like to know the spacing of the zeros of the zeta function. Say they are of the form $1/2 + i\gamma_n$ with
\[ \gamma_n \sim \frac{2\pi n}{\log n} . \]
The spacings satisfy
\[ \frac{\log n}{2\pi} \left( \gamma_{n+1} - \gamma_n \right) \sim 1 \]
on average. We can ask now about the distribution. These are not expected to behave like a Poisson process.

Question 16.6. Why should we believe the above conjecture?

Well, there is in fact a better conjecture going back to Hardy and Littlewood.

Definition 16.7. Let $\mathcal{H} = \{ h_1, \ldots, h_k \}$ be a set of distinct integers. The singular series of $\mathcal{H}$ is
\[ \mathfrak{S}(\mathcal{H}) := \prod_p \left( 1 - \frac{\nu_{\mathcal{H}}(p)}{p} \right) \left( 1 - \frac{1}{p} \right)^{-k} . \]

Remark 16.8. The singular series $\mathfrak{S}(\mathcal{H})$ is the constant conjecturally appearing in

\[ \sum_{n \le N} \Lambda(n + h_1) \cdots \Lambda(n + h_k) \sim \mathfrak{S}(\mathcal{H}) N . \]

Conjecture 16.9 (Hardy and Littlewood). Let $\mathcal{H} = \{ h_1, \ldots, h_k \}$ be a set of distinct integers. Then,
\[ \# \{ n \le N : n + h_1, \ldots, n + h_k \text{ are all primes} \} \sim \mathfrak{S}(\mathcal{H}) \frac{N}{(\log N)^k} \]
with $\mathfrak{S}(\mathcal{H})$ the singular series of $\mathcal{H}$.

We next justify the above definition of singular series.

Remark 16.10. The Cramer model predicts
\[ \# \{ n \le N : n + h_1, \ldots, n + h_k \text{ are all primes} \} \sim \frac{N}{(\log N)^k} . \]

Exercise 16.11. Let $n, n + 2$ be both prime. We would like to conjecture a value for the singular series $\mathfrak{S}(\{0, 2\})$. We might expect an approximation via the circle method like
\[ \int_0^1 \Big( \sum_{n \le N} \Lambda(n) e(n\alpha) \Big) \Big( \sum_{m \le N} \Lambda(m) e(-m\alpha) \Big) e(2\alpha) \, d\alpha . \]
Compute what the major arc contribution is. When one computes this, one might have a guess as to what the answer should be. There will be a major arc around 0 and a major arc around 1. Hint: Here, take $\alpha$ close to $\frac{a}{q}$ with error roughly $\frac{1}{N}$. Consider
\[ \sum_n \Lambda(n) e\left( n \frac{a}{q} + n\beta \right) . \]
Assume $\Lambda(n)$ behaves like 1 on average, from the prime number theorem, and similarly put in an estimate from the prime number theorem in arithmetic progressions. One should also check we get 0 as our main term if we put $e(\alpha)$ instead of $e(2\alpha)$ above.

Exercise 16.12. Suppose $n, n + 2, n + 6$ are all primes. Then, we might want to compute
\[ \int_0^1 \Big( \sum_{n \le N} \Lambda(n) \Lambda(n + 2) e(n\alpha) \Big) \Big( \sum_{m \le N} \Lambda(m) e(-m\alpha) \Big) e(6\alpha) \, d\alpha . \]
Here again, compute an estimate for the major arcs.

16.1. A probabilistic argument for the distribution of primes. One could also think probabilistically (which, according to Hardy and Littlewood's paper, is not a notion in mathematics but rather a notion in physics or philosophy). The idea is to add in by hand all the density information for any prime $p$ we have. That is, given a prime $p$, we ask

Question 16.13. What is the probability that $n + h_1, \ldots, n + h_k$ are all coprime to $p$ for $n$ chosen randomly?

This is asking that $n$ not be in the classes $-h_1, \ldots, -h_k \bmod p$. So, we need that $n$ does not lie in $\nu_{\mathcal{H}}(p)$ congruence classes mod $p$, where

\[ \nu_{\mathcal{H}}(p) = \# \{ h_1 \bmod p, \ldots, h_k \bmod p \} . \]

So, the probability that $n + h_1, \ldots, n + h_k$ are all coprime to $p$ should be
\[ 1 - \frac{\nu_{\mathcal{H}}(p)}{p} . \]
We want to keep track of the difference between this probability and the Cramer model. The Cramer model only uses the fact that $k$ random numbers are all coprime to $p$ with probability $\left( 1 - \frac{1}{p} \right)^k$. Now, the guess is to take
\[ \mathfrak{S}(\mathcal{H}) := \prod_p \left( 1 - \frac{\nu_{\mathcal{H}}(p)}{p} \right) \left( 1 - \frac{1}{p} \right)^{-k} . \]

If $p > \max(h_j)$ then $\nu_{\mathcal{H}}(p) = k$, so the factor above is $1 + O(1/p^2)$ for large $p$, and hence the product converges absolutely. This implies the above product is 0 if and only if one of the terms is equal to 0. This means there is some prime $p$ for which

\[ \nu_{\mathcal{H}}(p) = p . \]
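The definition of the singular series is directly computable. The following is a minimal sketch, not an optimized routine: the helper names are hypothetical and the infinite product is truncated at a chosen prime bound, which is an approximation. It evaluates $\nu_{\mathcal{H}}(p)$, tests whether some prime has $\nu_{\mathcal{H}}(p) = p$, and approximates $\mathfrak{S}(\{0, 2\})$.

```python
def primes_up_to(n):
    """Sieve of Eratosthenes returning all primes <= n."""
    if n < 2:
        return []
    sieve = bytearray([1]) * (n + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, int(n ** 0.5) + 1):
        if sieve[p]:
            sieve[p * p :: p] = bytearray(len(range(p * p, n + 1, p)))
    return [p for p in range(2, n + 1) if sieve[p]]

def nu(H, p):
    """Number of distinct residue classes mod p occupied by H."""
    return len({h % p for h in H})

def is_admissible(H):
    """nu_H(p) <= k always, so only primes p <= k = |H| can have nu_H(p) = p."""
    return all(nu(H, p) < p for p in primes_up_to(len(H)))

def singular_series(H, prime_bound):
    """Truncated Euler product for the singular series of H."""
    k = len(H)
    product = 1.0
    for p in primes_up_to(prime_bound):
        product *= (1 - nu(H, p) / p) * (1 - 1 / p) ** (-k)
    return product

print(singular_series([0, 2], 10 ** 5))   # twin prime tuple
print(is_admissible([0, 2, 4]), singular_series([0, 2, 4], 100))
```

For $\{0, 2, 4\}$ the prime 3 occupies all three classes, so the product vanishes, matching the fact that one of $n, n+2, n+4$ is always divisible by 3.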

Remark 16.14. Hardy and Littlewood did not like this probabilistic argument because it is assuming divisibility by different primes is independent. For instance, the same heuristic applied to $\pi(N)$ would give
\[ \pi(N) \sim N \prod_{p \le \sqrt{N}} \left( 1 - \frac{1}{p} \right) \sim \frac{2 e^{-\gamma} N}{\log N}, \]
which is off from the truth by the constant factor $2 e^{-\gamma}$.

Exercise 16.15. Show the above conjecture predicts the number of twin primes is roughly
\[ 1.32 \frac{N}{(\log N)^2} . \]
That is, because $N, N + 1$ cannot both be prime, there is a little higher chance of $N$ and $N + 2$ being prime.

Exercise 16.16 (Extended exercise, due to Gallagher).

\[ \sum_{\substack{h_1, \ldots, h_k \le H \\ h_i \text{ distinct}}} \mathfrak{S}(\{ h_1, \ldots, h_k \}) \sim \sum_{\substack{h_1, \ldots, h_k \le H \\ h_i \text{ distinct}}} 1 . \]

Exercise 16.17 (Lead in to previous exercise). Pick a prime $p$. Show
\[ \mathbb{E} \left[ \left( 1 - \frac{\nu_{\mathcal{H}}(p)}{p} \right) : \mathcal{H} \in \{ 0, \ldots, p - 1 \}^k \right] = \left( 1 - \frac{1}{p} \right)^k \]
where $\mathbb{E}$ denotes expected value. Hint: Use a sort of double counting argument.

Let's now construct some sets where the singular series is nonzero.

Example 16.18. (1) Consider

\[ \{ h_1, \ldots, h_k \} := \{ 0, k!, 2k!, \ldots, (k-1)k! \} . \]
Then $\nu_{\mathcal{H}}(p) = 1$ if $p \le k$, while $p - \nu_{\mathcal{H}}(p) > 0$ if $p > k$ (since $\nu_{\mathcal{H}}(p) \le k$).

(2) Take $k$ distinct primes all larger than $k$ for $\mathcal{H}$. This has a nonzero singular series.

Question 16.19. What is the largest set in $[1, 10^7]$ which is admissible (i.e., has a nonzero singular series)?

Question 16.20. If we find such a large set, what is it good for?

Say the set in $[1, 10^7]$ has size $k$. If we do find such a large set, then the Hardy-Littlewood conjecture implies there are intervals $[n, n + 10^7]$ with at least $k$ primes. On the other hand, Hardy and Littlewood's other conjecture, $\pi(x + y) \le \pi(x) + \pi(y)$, implies that the number of primes up to $10^7$ is an upper bound for the number of primes in $[n, n + 10^7]$.
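Both constructions in Example 16.18 can be verified mechanically for small $k$. The sketch below uses hypothetical helper names and small sample values ($k = 5$ and the primes $5, 7, 11$) chosen only for illustration.

```python
from math import factorial

def nu(H, p):
    """Number of distinct residue classes mod p occupied by H."""
    return len({h % p for h in H})

def small_primes(n):
    """Primes <= n by trial division (fine for the tiny n used here)."""
    return [p for p in range(2, n + 1)
            if all(p % d for d in range(2, int(p ** 0.5) + 1))]

def is_admissible(H):
    """Only primes p <= |H| can have every class mod p occupied."""
    return all(nu(H, p) < p for p in small_primes(len(H)))

def factorial_tuple(k):
    """The set {0, k!, 2 k!, ..., (k-1) k!} from Example 16.18 (1)."""
    return [j * factorial(k) for j in range(k)]

H = factorial_tuple(5)
print(H, [nu(H, p) for p in small_primes(5)], is_admissible(H))
print(is_admissible([5, 7, 11]))   # k = 3 distinct primes larger than k
```

In case (1) every element is divisible by each prime $p \le k$, so only the class $0$ is occupied; in case (2) the class $0 \bmod p$ is never occupied for $p \le k$.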

Remark 16.21. Hensley and Richards (with a nice paper by Richards in the Bulletin of the AMS in 1974) studied this problem computationally. For example, for $y$ around $20{,}000$ one can construct an admissible set in an interval of length $y$ with more than $\pi(y)$ elements. They showed that Hardy and Littlewood's conjecture above contradicts Hardy and Littlewood's conjecture that $\pi(x + y) \le \pi(x) + \pi(y)$.

Remark 16.22. The above is really a problem in . In more detail, given an interval $[1, y]$, for each small prime $p$, remove one progression $a_p \bmod p$. The aim is to keep as many numbers as possible. Stop once the prime exceeds the number of remaining integers. For example, we start at 2, and remove either 0 or 1 mod 2. Then, go to 3, and remove some residue mod 3. Then, we stop once there are fewer integers left than the prime we have reached. This is sort of like a greedy algorithm.

Remark 16.23. Here is another variant: Consider the interval $[1, y]$. For each prime $p \le z$, remove one progression mod $p$ until nothing is left. How small can one make $z$?

We did prove something interesting about Remark 16.22 using the large sieve. Essentially, we get an upper bound: the number is always at most $2\pi(y)$, as follows from the large sieve.

Remark 16.24. One could use the above problems to show
\[ \limsup \frac{p_{n+1} - p_n}{\log n} = \infty . \]

For a while, the best result was

Theorem 16.25 (Erdos-Rankin). For infinitely many $n$,
\[ p_{n+1} - p_n \ge \frac{c \log p_n \, \log_2 p_n \, \log_4 p_n}{(\log_3 p_n)^2} \]
where $\log_n$ denotes the $n$th iterated log, $\log_n = \log \log_{n-1}$.

But, in 2014, there were some improvements:

Theorem 16.26 (Ford, Green, Konyagin, Tao, Maynard). For infinitely many $n$, one can bound

\[ p_{n+1} - p_n \ge \frac{c \log p_n \, \log_2 p_n \, \log_4 p_n}{\log_3 p_n} . \]

The other side of the problem is to ask whether
\[ \liminf \frac{p_{n+1} - p_n}{\log p_n} = 0 . \]

In fact, this was shown by Goldston, Pintz, and Yildirim in 2005. However, their method did not show
\[ \liminf \frac{p_{n+2} - p_n}{\log p_n} = 0 . \]
Their work was the basis for an amazing result of Zhang in 2013 yielding bounded gaps between primes: for infinitely many $n$,
\[ p_{n+1} - p_n < 7 \times 10^7 . \]
The main ingredient was Zhang's version of Bombieri-Vinogradov: Let $a \neq 0$ be any integer. Then,
\[ \sum_{\substack{q \le x^{1/2 + \delta} \\ p \mid q \implies p \le x^{\delta}}} |E(x; q, a)| \ll \frac{x}{(\log x)^A} . \]
Using this and the work of Goldston, Pintz, and Yildirim, he was able to get bounded gaps. However, in the same year, Maynard and Tao showed that instead of getting two primes, one could get many primes in a bounded interval. Further, one could use the original version of Bombieri-Vinogradov instead of Zhang's variant.

Theorem 16.27 (Maynard, and independently, Tao). For any $\ell$ there exists $k$ such that for any admissible set $\mathcal{H}$ of size $k$ (meaning there is no prime $p$ with all residues mod $p$ appearing in $\mathcal{H}$), there exist many $n$ with at least $\ell$ primes among $n + h_1, n + h_2, \ldots, n + h_k$.

So, one can find 3 or 4 or more primes in a bounded interval. In other words:

Corollary 16.28.

\[ \liminf_{n} \left( p_{n+\ell} - p_n \right) < \infty . \]

17. 11/28/17

Last time, we were discussing the Hardy-Littlewood conjecture:

Conjecture 17.1 (Hardy-Littlewood). If you have a set

\[ \mathcal{H} := \{ h_1, \ldots, h_k \} \]
then
\[ \# \{ n \le x : n + h_i \text{ all primes} \} \sim \mathfrak{S}(\mathcal{H}) \frac{x}{(\log x)^k} \]
where
\[ \mathfrak{S}(\mathcal{H}) = \prod_p \left( 1 - \frac{1}{p} \right)^{-k} \left( 1 - \frac{\nu_{\mathcal{H}}(p)}{p} \right) \]
with

\[ \nu_{\mathcal{H}}(p) = \# (\mathcal{H} \bmod p) . \]

Note $\nu_{\mathcal{H}}(p) = k$ if $p$ is large enough. At the end of last time we stated recent work of Maynard and Tao:

Theorem 17.2 (Maynard-Tao). For any $s$, there exists a suitably large $k$ such that for every admissible set $\mathcal{H} = \{ h_1, \ldots, h_k \}$ (meaning $\mathfrak{S}(\mathcal{H}) \neq 0$) there are infinitely many $n$ with at least $s$ primes in $n + h_1, \ldots, n + h_k$.

Remark 17.3. Sieve methods can give upper bounds for the number of prime $k$-tuples of the order of $\frac{x}{(\log x)^k}$. We have already seen one version of this, which is the large sieve.

Exercise 17.4 (Extended exercise). Use the large sieve to give such an upper bound. Recall the large sieve looked at some exponential sum. One can try to bound the number of inadmissible tuples over all possible primes.

But, now we describe a different sieve method, known as Selberg's sieve.

17.1. Selberg's sieve. Selberg's sieve is based on the simple idea that squares of real numbers are non-negative. Consider
\[ \sum_{\sqrt{x} \le n \le x} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \]

for $\lambda_d \in \mathbb{R}$. The square of the quantity in parentheses is always non-negative, and we want to arrange that it is at least 1 if $n + h_1, \ldots, n + h_k$ are prime. We will assume $\lambda_d = 0$ for $d > R$, with $R = R(x)$ chosen as some function of $x$, to be decided later. We should choose $\lambda_1 = 1$. So, with these stipulations on $\lambda_d$, we have
\[ \sum_{\sqrt{x} \le n \le x} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 \ge \# \{ \sqrt{x} \le n \le x : n + h_1, \ldots, n + h_k \text{ are all prime} \} . \]

On the other hand, expanding, we get
\[ \sum_{\sqrt{x} \le n \le x} \Big( \sum_{d \mid (n + h_1) \cdots (n + h_k)} \lambda_d \Big)^2 = \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \Big( \sum_{\substack{\sqrt{x} \le n \le x \\ [d_1, d_2] \mid (n + h_1) \cdots (n + h_k)}} 1 \Big) \]
where $[d_1, d_2]$ is the least common multiple of $d_1$ and $d_2$. The expression above is a quadratic form in the $\lambda_d$'s, and the problem is to minimize this quadratic form subject to the linear condition that $\lambda_1 = 1$. We now try and minimize this quadratic form. Suppose $p \mid [d_1, d_2]$. Then, there is some $i$ with $n \equiv -h_i \bmod p$. Then, $n$ lies in one of $\nu_{\mathcal{H}}(p)$ residue classes mod $p$. We define $f$ to be multiplicative with $f(p) = \nu_{\mathcal{H}}(p)$. It follows that $n$ lies in one of $f([d_1, d_2])$ residue classes mod $[d_1, d_2]$. So, the quadratic form is
\[ \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \left( x \frac{f([d_1, d_2])}{[d_1, d_2]} + O\left( f([d_1, d_2]) \right) \right) . \]
The function $f$ is multiplicative by the Chinese remainder theorem, though some annoying things might happen on squares of primes. For simplicity, we'll make the additional assumption that $\lambda_d$ is 0 unless $d$ is squarefree. Then,
\[ O\left( f([d_1, d_2]) \right) = O\left( k^{\omega([d_1, d_2])} \right) = O(x^{\varepsilon}) . \]
Here $\omega(n)$ is the number of distinct prime factors of $n$. The above is useful if $[d_1, d_2] \le x^{1 - \varepsilon}$. This is good if $R \le x^{1/2 - \varepsilon}$. We will also assume $|\lambda_d| \ll d^{\varepsilon}$. We will justify this later. In this case, the total contribution of the error terms is bounded by the number of terms times the bound, which is $R^2 x^{\varepsilon} \le x^{1 - \varepsilon}$. Our quadratic form is then

\[ \sum_{d_1, d_2} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} f([d_1, d_2]) \]
and we want to minimize this subject to the constraints

(1) $\lambda_1 = 1$
(2) $\lambda_d = 0$ unless $d \le R = x^{1/2 - \varepsilon}$ and $d$ is squarefree
(3) $|\lambda_d| \ll d^{\varepsilon}$.

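We'd now like to diagonalize this quadratic form and read off the minimum from the diagonal entries by Cauchy-Schwarz. Now, $d_1, d_2$ are tied together by the lcm function. Let $a = (d_1, d_2)$ be the gcd of $d_1, d_2$. Then, let $d_1 = a r_1$, $d_2 = a r_2$. We obtain

\[ \sum_a \sum_{\substack{r_1, r_2 \\ (r_1, r_2) = 1}} \frac{\lambda_{a r_1} \lambda_{a r_2}}{a r_1 r_2} f(a r_1 r_2) . \]
By multiplicativity, we have

\[ f(a r_1 r_2) = f(a) f(r_1) f(r_2) . \]
Next, use Möbius inversion to remove the coprimality condition. We have
\[ \sum_{b \mid (r_1, r_2)} \mu(b) = \begin{cases} 1 & \text{if } (r_1, r_2) = 1 \\ 0 & \text{else.} \end{cases} \]

Write $r_1 = b s_1$, $r_2 = b s_2$. Define

\[ \xi_d = \sum_s \frac{\lambda_{ds} f(s)}{s} . \]
Then,

\begin{align*}
\sum_a \sum_{\substack{r_1, r_2 \\ (r_1, r_2) = 1}} \frac{\lambda_{a r_1} \lambda_{a r_2}}{a r_1 r_2} f(a r_1 r_2)
&= \sum_a \frac{f(a)}{a} \sum_b \frac{\mu(b) f(b)^2}{b^2} \sum_{s_1, s_2} \frac{\lambda_{a b s_1} \lambda_{a b s_2}}{s_1 s_2} f(s_1) f(s_2) \\
&= \sum_a \sum_b \frac{f(a)}{a} \frac{\mu(b) f(b)^2}{b^2} \Big( \sum_s \frac{\lambda_{abs}}{s} f(s) \Big)^2 \\
&= \sum_d \xi_d^2 \sum_{ab = d} \frac{f(a) f(b)^2}{a b^2} \mu(b) \\
&= \sum_d \xi_d^2 \frac{f(d)}{d} \Big( \sum_{b \mid d} \mu(b) \frac{f(b)}{b} \Big) .
\end{align*}
Then, let
\[ h(d) := \sum_{b \mid d} \mu(b) \frac{f(b)}{b} . \]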

We may observe that, for squarefree $d$,
\[
h(d) = \prod_{p \mid d} \Big( 1 - \frac{f(p)}{p} \Big).
\]
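As a quick sanity check of the product formula for $h$, here is a small sketch in exact rational arithmetic. It assumes, purely for illustration, the tuple `H = (0, 2, 6)` and takes $f(p) = \nu_{\mathcal{H}}(p)$; the function names are hypothetical.

```python
from fractions import Fraction
from itertools import combinations
from math import prod

H = (0, 2, 6)  # hypothetical tuple, for illustration only


def nu(p):
    """f(p) = nu_H(p): number of distinct classes -h_i mod p."""
    return len({(-h) % p for h in H})


def h_sum(primes):
    """Sum over b | d of mu(b) f(b) / b, where d is the product of the given distinct primes."""
    total = Fraction(0)
    for r in range(len(primes) + 1):
        for subset in combinations(primes, r):
            b = prod(subset)                  # divisor of d
            fb = prod(nu(p) for p in subset)  # f(b), by multiplicativity
            total += Fraction((-1) ** r * fb, b)
    return total


def h_product(primes):
    """Product over p | d of (1 - f(p)/p)."""
    return prod((1 - Fraction(nu(p), p) for p in primes), start=Fraction(1))


assert h_sum([3, 5, 7]) == h_product([3, 5, 7])
assert h_sum([2, 3, 5, 7, 11]) == h_product([2, 3, 5, 7, 11])
```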

This is starting to look related to the singular series. Therefore, our quadratic form can be written as
\[
\sum_d \xi_d^2 \frac{f(d)}{d} \Big( \sum_{b \mid d} \mu(b) \frac{f(b)}{b} \Big) = \sum_d \frac{f(d)}{d} h(d) \xi_d^2.
\]

We have now diagonalized our quadratic form, but we now need to transform our constraint $\lambda_1 = 1$ into a constraint in terms of the $\xi_d$. So, we'd like to invert our linear change of variables. We want to write down $\lambda_d$ in terms of the $\xi_i$. To do this, using Möbius inversion, we compute
\begin{align*}
\lambda_d &= \sum_s \lambda_{ds} \frac{f(s)}{s} \Big( \sum_{b \mid s} \mu(b) \Big) \\
&= \sum_b \mu(b) \sum_t \frac{\lambda_{dbt} f(bt)}{bt} \\
&= \sum_b \frac{\mu(b) f(b)}{b} \xi_{db}.
\end{align*}

Then, we want $\xi_d = 0$ unless $d \le R$ and $d$ is squarefree. Also, $\lambda_1 = 1$ if and only if
\[
\sum_b \frac{\mu(b) f(b)}{b} \xi_b = 1.
\]
By Cauchy-Schwarz, we have
\[
1 = \Big( \sum_b \frac{\mu(b)}{b} f(b) \xi_b \Big)^2 \le \Big( \sum_b \frac{\mu(b)^2 f(b)}{b \, h(b)} \Big) \Big( \sum_d \frac{f(d)}{d} h(d) \xi_d^2 \Big).
\]
The equality case of Cauchy-Schwarz occurs when the vectors are proportional to each other, which occurs when
\[
\xi_b \sim \frac{\mu(b)}{h(b)}.
\]

Therefore, the minimum of the quadratic form given by
\[
\sum_d \frac{f(d)}{d} h(d) \xi_d^2
\]
is bounded below by
\[
\Big( \sum_{b \le R} \frac{\mu(b)^2 f(b)}{b \, h(b)} \Big)^{-1}
\]
and, by the equality case of Cauchy-Schwarz, it actually attains this bound. Further, one can determine the constant of proportionality $c$ by
\[
1 = c \sum_{b \le R} \frac{\mu(b)^2 f(b)}{b \, h(b)}.
\]
We obtain that
\[
\# \{ n \le x : n + h_1, \ldots, n + h_k \text{ all prime} \} \le \frac{x}{\sum_{b \le R} \frac{\mu(b)^2 f(b)}{b \, h(b)}} + O(x^{1-\varepsilon}),
\]
where one needs to verify the following.

Exercise 17.5. Verify $\xi_d \ll d^{\varepsilon}$, $\lambda_d \ll d^{\varepsilon}$.

Here $R \le x^{1/2-\varepsilon}$. Now, $f(p)$ is about $k$, so $f(n)$ is roughly $d_k(n)$, the $k$-divisor function of $n$. Next, $h(n)$ is roughly $1$. Then,
\[
\sum_{b \le R} \frac{d_k(b)}{b} = \frac{1}{2\pi i} \int \zeta(s+1)^k R^s \frac{ds}{s} \sim \frac{(\log R)^k}{k!}.
\]
So, we should expect
\[
\sum_{b \le R} \frac{\mu(b)^2 f(b)}{b \, h(b)} \sim \frac{(\log R)^k}{k!}
\]
up to multiplying by some convergent Euler factor to correct for our estimates above.

Exercise 17.6. Verify that this Euler factor is $\mathfrak{S}(\mathcal{H})^{-1}$, meaning
\[
\sum_{b \le R} \frac{\mu(b)^2 f(b)}{b \, h(b)} = \mathfrak{S}(\mathcal{H})^{-1} \frac{(\log R)^k}{k!} + \text{lower order terms}.
\]
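The whole diagonalization can be verified exactly on a small example. The sketch below assumes a hypothetical tuple `H = (0, 2)` and a hypothetical level `R = 30` (neither is fixed by the lecture). It builds $\xi_d = c\,\mu(d)/h(d)$, recovers $\lambda_d$ by the inversion formula, and checks in rational arithmetic that $\lambda_1 = 1$ and that the original quadratic form equals $\big(\sum_{b \le R} \mu(b)^2 f(b)/(b\,h(b))\big)^{-1}$.

```python
from fractions import Fraction
from math import gcd

H = (0, 2)  # hypothetical tuple (twin primes), for illustration only
R = 30      # hypothetical sieve level


def prime_factors(n):
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps


def squarefree(n):
    return all(n % (p * p) != 0 for p in prime_factors(n))


def mu(n):
    return (-1) ** len(prime_factors(n)) if squarefree(n) else 0


def f(n):
    """Multiplicative on squarefree n, with f(p) = nu_H(p)."""
    out = 1
    for p in prime_factors(n):
        out *= len({(-h) % p for h in H})
    return out


def h(n):
    """h(n) = prod over p | n of (1 - f(p)/p)."""
    out = Fraction(1)
    for p in prime_factors(n):
        out *= 1 - Fraction(f(p), p)
    return out


D = [d for d in range(1, R + 1) if squarefree(d)]

# S = sum over squarefree b <= R of f(b) / (b h(b)); the constant c is 1/S.
S = sum(Fraction(f(b), b) / h(b) for b in D)
xi = {d: Fraction(mu(d)) / (h(d) * S) for d in D}

# Inversion: lambda_d = sum_b mu(b) f(b)/b * xi_{db}, with xi supported on D.
lam = {d: sum(Fraction(mu(b) * f(b), b) * xi[d * b] for b in D if d * b in xi)
       for d in D}

# Original (undiagonalized) quadratic form in the lambda's.
Q = sum(lam[d1] * lam[d2] * Fraction(f(d1 * d2 // gcd(d1, d2)), d1 * d2 // gcd(d1, d2))
        for d1 in D for d2 in D)

assert lam[1] == 1
assert Q == 1 / S
```

Because everything is a `Fraction`, the two assertions test exact equalities rather than numerical agreement.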

Combining the above, we conclude

Theorem 17.7.
\[
\# \{ n \le x : n + h_1, \ldots, n + h_k \text{ all prime} \} \le \mathfrak{S}(\mathcal{H}) \frac{k! \, 2^k x}{(\log x)^k} (1 + o(1)).
\]
This is a typical application of Selberg's sieve.

Remark 17.8. When $k = 2$ we can do a funny trick which gives a better bound. We can replace the $2! \, 2^2$ by a factor of $4$.

Remark 17.9. Recall that the optimal choice is $\xi_d = \frac{\mu(d)}{h(d)}$, up to scaling. Then,
\[
\lambda_d = \sum_s \frac{\mu(s) f(s)}{s} \xi_{ds}.
\]
Imagine that $\xi_{ds} \sim \mu(ds)$. Then, we get
\[
\lambda_d \sim \mu(d) \frac{\sum_{s \le R/d} \frac{\mu(s)^2 f(s)}{s \, h(s)}}{\sum_{s \le R} \frac{\mu(s)^2 f(s)}{s \, h(s)}}.
\]
Then, approximately,
\[
\lambda_d = \mu(d) \Big( \frac{\log R/d}{\log R} \Big)^k.
\]

Exercise 17.10. Show
\[
\sum_{d \mid n} \mu(d) \Big( \log \frac{n}{d} \Big)^k = 0 \text{ unless } n \text{ has at most } k \text{ prime factors.}
\]

Remark 17.11. Consider the simplest case $k = 1$. This could be tricky because we might be trying to count primes in a short interval.

Exercise 17.12. Check that one gets exactly the same upper bound for $\pi(x + y) - \pi(x)$ as with the large sieve using Theorem 17.7. (Not only asymptotically the same, but rather exactly the same expression.)

Recall our quadratic form in the case of sieving out a 1-tuple is
\[
\sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]}
\]
with
\[
\lambda_d = \mu(d) \frac{\log(R/d)}{\log R}.
\]

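The optimal answer ended up being
\[
\sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} = \frac{1}{\log R}.
\]
Then,
\[
\sum_{d_1, d_2 \le R} \frac{\lambda_{d_1} \lambda_{d_2}}{[d_1, d_2]} = \sum_{\substack{a, r, s \\ r \le R/a \\ s \le R/a}} \frac{\mu(a)^2}{a} \frac{\mu(r)}{r} \frac{\mu(s)}{s}
\]
for $R/2 \le a \le R$.

Exercise 17.13 (difficult extended exercise). Show that
\[
\sum_{d_1, d_2 \le R} \frac{\mu(d_1) \mu(d_2)}{[d_1, d_2]} \to c \ne 0
\]
as $R \to \infty$.

The sieve accounts for the above by replacing
\[
\sum_{\substack{a, r, s \\ r \le R/a \\ s \le R/a}} \frac{\mu(a)^2}{a} \frac{\mu(r)}{r} \frac{\mu(s)}{s}
\]
by
\[
\sum_{\substack{a, r, s \\ r \le R/a \\ s \le R/a}} \frac{\mu(a)^2}{a} \frac{\mu(r)}{r} \frac{\mu(s)}{s} \Big( \frac{\log R/ar}{\log R} \Big) \Big( \frac{\log R/as}{\log R} \Big).
\]
Here is a variant useful for what we will do next. We want to find when $n + h_1, \ldots, n + h_k$ are all prime. Fix $n + h_1 = p$. Sieve $n + h_1, \ldots, n + h_k$ by considering
\[
\sum_{n + h_1 = p \le x} \Big( \sum_{d \mid (n+h_2) \cdots (n+h_k)} \lambda_d \Big)^2
\]
with $\lambda_1 = 1$, $\lambda_d = 0$ unless $d \le R$ and $d$ is squarefree. This lets us count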

\[
\pi(x; [d_1, d_2], a)
\]
and handle this using Bombieri-Vinogradov with $R \le x^{1/4-\varepsilon}$.

Exercise 17.14. Show that
\[
\pi(x; [d_1, d_2], a) \le \mathfrak{S}(\mathcal{H}) \, 4^{k-1} (k-1)! \, \frac{x}{(\log x)^k}
\]
using that we have a $k - 1$ dimensional sieve. When $k = 2$ this gives a better bound with $4$ instead of $8$, but it is worse for $k > 2$.
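Exercise 17.10 above can be explored numerically before attempting a proof. This is a floating point sketch only (so it is evidence, not a verification), and the function name `lambda_k` is hypothetical. It evaluates $\sum_{d \mid n} \mu(d) (\log n/d)^k$ and checks that it vanishes when $n$ has more than $k$ distinct prime factors.

```python
from math import log


def prime_factors(n):
    ps, p = [], 2
    while p * p <= n:
        if n % p == 0:
            ps.append(p)
            while n % p == 0:
                n //= p
        p += 1
    if n > 1:
        ps.append(n)
    return ps


def mu(n):
    """Moebius function: 0 if n has a square factor, else (-1)^(number of primes)."""
    ps = prime_factors(n)
    if any(n % (p * p) == 0 for p in ps):
        return 0
    return (-1) ** len(ps)


def lambda_k(n, k):
    """sum over d | n of mu(d) * (log(n/d))^k."""
    return sum(mu(d) * log(n / d) ** k for d in range(1, n + 1) if n % d == 0)


# n = 30 and n = 210 have 3 and 4 distinct prime factors respectively.
assert abs(lambda_k(30, 2)) < 1e-9
assert abs(lambda_k(210, 3)) < 1e-9

# With exactly k distinct primes the sum need not vanish: here it is 2 log 2 log 3.
assert abs(lambda_k(6, 2) - 2 * log(2) * log(3)) < 1e-9
```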

18. 11/30/17

Last time, we discussed Selberg's sieve. We proved bounds like
\[
\# \{ n \le x : n + h_1, \ldots, n + h_k \text{ are all prime} \} \le (1 + o(1)) \, 2^k k! \, \frac{\mathfrak{S}(\mathcal{H}) x}{(\log x)^k}.
\]
This was shown by considering a quadratic form
\[
\Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2
\]
with
\[
\lambda_d \sim \mu(d) \Big( \frac{\log R/d}{\log R} \Big)^k
\]
with $R = x^{1/2-\varepsilon}$.

Exercise 18.1. Another way to find this is to consider
\[
\sum_{\substack{n \le x \\ n + h_1 \text{ prime}}} \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2
\]
taking $R = x^{1/4-\varepsilon}$. Then, show that one gets a bound which is better when $k = 2$, but not for other $k$, of the form
\[
\sum_{\substack{n \le x \\ n + h_1 \text{ prime}}} \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2 \le (1 + o(1)) \, 4^{k-1} (k-1)! \, \frac{\mathfrak{S}(\mathcal{H}) x}{(\log x)^k}.
\]

Exercise 18.2. Use Selberg's sieve to bound
\[
\# \{ n \le x : n^2 + 1 \text{ is prime} \}.
\]
Use sieve weights summing over polynomial values. That is, bound
\[
\# \{ n \le x : n^2 + 1 \text{ is prime} \} \le \sum_{n \le x} \Big( \sum_{d \mid n^2 + 1} \lambda_d \Big)^2
\]
with $\lambda_1 = 1$. We get the condition $n^2 + 1 \equiv 0 \bmod [d_1, d_2]$. Probably take $\lambda_d = 0$ when $d$ is divisible by a prime which is $3 \bmod 4$, since such primes won't divide $n^2 + 1$. Then diagonalize the quadratic form and see what you get. The number of solutions will be given in terms of some multiplicative function $f$ in the form
\[
x \frac{f([d_1, d_2])}{[d_1, d_2]},
\]
where $f(p)$ is $2$ if the prime $p$ is $1 \bmod 4$ and $0$ if it is $3 \bmod 4$. Derive a similar bound for other polynomials.

Theorem 18.3 (Goldston-Pintz-Yildirim).
\[
\liminf_{n \to \infty} \frac{p_{n+1} - p_n}{\log p_n} = 0.
\]
The idea of proof is relatively simple. Start with an admissible $k$-tuple, meaning $\mathfrak{S}(\mathcal{H}) \ne 0$. The idea is to look for a non-negative function $a(n) \ge 0$ so that we can make
\[
\sum_{n \le x} a(n) < \sum_{j=1}^k \sum_{\substack{n \le x \\ n + h_j \text{ prime}}} a(n).
\]

Then, there is some $n$ with at least two primes among $n + h_1, \ldots, n + h_k$, essentially by the pigeonhole principle. Then, motivated by Selberg's sieve, we will take
\[
a(n) = \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2
\]
with $\lambda_1 = 1$ and $\lambda_d = 0$ unless $d \le R$ is squarefree. We'd like the desired inequality above with $k$ as small as possible. We won't be able to achieve this (it would imply bounded gaps), but we can tweak it a bit to get Theorem 18.3. In Selberg's sieve, we wished to minimize a quadratic form given a linear constraint. Here, the real problem is to maximize the ratio of two quadratic forms. Let

\[
Q_1(\lambda) := \sum_{n \le x} a(n)
\]
and
\[
Q_2(\lambda) := \sum_{\substack{n \le x \\ n + h_j \text{ prime}}} a(n).
\]

Then, we can write $Q_1(\lambda)$ as
\[
x \cdot \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \frac{f([d_1, d_2])}{[d_1, d_2]} + O(R^2 x^{\varepsilon})
\]
with

f (p) := νH (p)

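(recall $\nu_{\mathcal{H}}(p)$ is usually $k$). This is good if $R \le x^{1/2-\varepsilon}$. We can write $Q_2(\lambda)$ as
\[
k \sum_{d_1, d_2} \lambda_{d_1} \lambda_{d_2} \sum_{\substack{n \le x \\ n + h_1 \text{ prime} \\ (n+h_2) \cdots (n+h_k) \equiv 0 \bmod [d_1, d_2]}} 1.
\]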

To evaluate this sum, we take all possible choices in $\mathcal{H}$ mod $p$, and rule out the single choice $n \equiv -h_1 \bmod p$. So, $n$ lies in $g([d_1, d_2])$ residue classes mod $[d_1, d_2]$ with $g(p) = f(p) - 1$. For our inner sum, we get an estimate of the form
\[
\frac{x}{\log x} \frac{g([d_1, d_2])}{\phi([d_1, d_2])}.
\]
Averaging over all $d_1, d_2$ and using Bombieri-Vinogradov, the error terms are under control so long as $R \le x^{1/4-\varepsilon}$. So, we can approximate $Q_2(\lambda)$ by
\[
k \frac{x}{\log x} \sum_{d_1, d_2} \frac{\lambda_{d_1} \lambda_{d_2}}{\phi([d_1, d_2])} g([d_1, d_2]).
\]

Let's simplify and assume that
\[
\lambda_d = \mu(d) P\Big( \frac{\log R/d}{\log R} \Big),
\]
where $P$ is a polynomial vanishing to order $k$ at $0$. It is now just a calculation to figure out these two quadratic forms and see if we can find a suitable polynomial $P$.

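Again $Q_1(\lambda)$ is given by
\begin{align*}
\sum_a \frac{f(a)}{a} \sum_{\substack{s_1, s_2 \\ (s_1, s_2) = 1}} \frac{\lambda_{a s_1} \lambda_{a s_2}}{s_1 s_2} f(s_1) f(s_2)
&= \sum_{a, b} \frac{f(a)}{a} \frac{\mu(b) f(b)^2}{b^2} \Big( \sum_s \frac{\lambda_{abs} f(s)}{s} \Big)^2 \\
&= \sum_d \frac{f(d)}{d} \prod_{p \mid d} \Big( 1 - \frac{f(p)}{p} \Big) \Big( \sum_s \frac{\lambda_{ds} f(s)}{s} \Big)^2.
\end{align*}
Similarly, we can write $Q_2(\lambda)$ as
\begin{align*}
& \frac{kx}{\log x} \sum_{a, b} \frac{g(a) \mu(b) g(b)^2}{\phi(a) \phi(b)^2} \Big( \sum_s \frac{\lambda_{abs} g(s)}{\phi(s)} \Big)^2 \\
&= \frac{kx}{\log x} \sum_d \frac{g(d)}{\phi(d)} \prod_{p \mid d} \Big( 1 - \frac{g(p)}{\phi(p)} \Big) \Big( \sum_s \frac{\lambda_{ds} g(s)}{\phi(s)} \Big)^2.
\end{align*}
For both these cases, the first step is to understand the rightmost terms
\[
\Big( \sum_s \frac{\lambda_{ds} f(s)}{s} \Big)^2 \quad \text{and} \quad \Big( \sum_s \frac{\lambda_{ds} g(s)}{\phi(s)} \Big)^2.
\]
So, for $d \le R$, we want to evaluate
\[
\sum_{l \le R/d} \mu(dl) \frac{h(l)}{l} P\Big( \frac{\log R/dl}{\log R} \Big)
\]
where $h(l)$ is either $f(l)$ or $\frac{g(l) l}{\phi(l)}$. In both cases, $h$ is multiplicative. Usually $h(p) \sim w$ with $w = k - 1$ or $k$. Take
\[
P(y) = \sum_{t \ge k} \frac{P^{(t)}(0)}{t!} y^t.
\]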

Lemma 18.4. For $c > 0$ and $(c)$ the corresponding vertical line in the complex plane,
\[
\frac{1}{2\pi i} \int_{(c)} \frac{z^s}{s^{t+1}} \, ds =
\begin{cases}
\frac{(\log z)^t}{t!} & \text{if } z > 1 \\
0 & \text{if } z < 1.
\end{cases}
\]

Proof. If $z > 1$, move the line of integration to the left, picking up the pole at $s = 0$. If $z < 1$, move the line of integration to the right, and there is no pole so the integral is $0$. $\square$

We now want to understand
\[
\sum_{t \ge k} P^{(t)}(0) \frac{1}{2\pi i} \int_{(c)} \Big( \sum_{l=1}^{\infty} \frac{\mu(dl)}{l^{1+s}} h(l) \Big) \Big( \frac{R}{d} \Big)^s \frac{ds}{s^{t+1}} \frac{1}{(\log R)^t}.
\]
Using the lemma, we can evaluate this, which only gives a nonzero result if $d < R$. The sum
\[
\sum_{l=1}^{\infty} \frac{\mu(dl)}{l^{1+s}} h(l)
\]
can be approximated by something like $\frac{1}{\zeta(s+1)^w}$ with $w$ either $k - 1$ or $k$, using that a power of $\zeta$ gives the series for the divisor function, and the Möbius function inverts this. Then, we get $\zeta(s+1)^{-w}$ up to some Euler product involving terms of primes squared, which can be thought of as quite tame. Note that
\[
\frac{1}{\zeta(s+1)^w} \frac{1}{s^{t+1}}
\]
has a pole of order $t - w + 1$ at $s = 0$. The idea to evaluate our desired integral is to move contours using the zero free region for $\zeta(s)$. We will pick up a contribution of the pole at $s = 0$. Then, we get
\[
\sum_{t \ge k} \frac{P^{(t)}(0)}{(\log R)^t} \frac{(\log R/d)^{t-w}}{(t-w)!} \cdot T|_{s=0}
\]
for $T$ the tame Euler factor from above, and $T|_{s=0}$ denoting the evaluation of $T$ at $s = 0$. We can simplify the above to
\[
\frac{T|_{s=0}}{(\log R)^w} P^{(w)}\Big( \frac{\log R/d}{\log R} \Big).
\]
To finish this argument, we compute
\begin{align*}
T|_{s=0} &= \mu(d) \prod_p \Big( 1 - \frac{1}{p} \Big)^{-w} \prod_{p \nmid d} \Big( 1 - \frac{h(p)}{p} \Big) \\
&= \Big( \prod_{p \mid d} \mu(p) \Big( 1 - \frac{1}{p} \Big)^{-w} \Big) \prod_{p \nmid d} \Big( 1 - \frac{h(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-w}.
\end{align*}
Let's plug this in to our first quadratic form and see what we get.

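For $Q_1(\lambda)$, we get
\[
\frac{x}{(\log R)^{2k}} \sum_d \mu(d)^2 \frac{f(d)}{d} \Big( \prod_{p \mid d} \Big( 1 - \frac{f(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big) \Big( \prod_{p \nmid d} \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big( 1 - \frac{f(p)}{p} \Big)^2 \Big) P^{(k)}\Big( \frac{\log R/d}{\log R} \Big)^2.
\]
We want to find
\[
\sum_n \frac{\mu(n)^2 f(n)}{n^{s+1}} \Big( \prod_{p \mid n} \Big( 1 - \frac{f(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big) \Big( \prod_{p \nmid n} \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big( 1 - \frac{f(p)}{p} \Big)^2 \Big) = \zeta(s+1)^k \, T_2,
\]
and then we want to understand the corresponding tame factor $T_2|_{s=0}$. We have
\[
\prod_p \Big( 1 - \frac{1}{p} \Big)^k \Big\{ \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big( 1 - \frac{f(p)}{p} \Big)^2 + \frac{f(p)}{p} \Big( 1 - \frac{f(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big\}.
\]
This should, hopefully, turn out to be the Hardy-Littlewood constant. Indeed,
\begin{align*}
& \prod_p \Big( 1 - \frac{1}{p} \Big)^k \Big\{ \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big( 1 - \frac{f(p)}{p} \Big)^2 + \frac{f(p)}{p} \Big( 1 - \frac{f(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big\} \\
&= \prod_p \Big( 1 - \frac{f(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-k} \\
&= \mathfrak{S}(\mathcal{H}).
\end{align*}
Then, by partial summation,
\begin{align*}
& \frac{x}{(\log R)^{2k}} \sum_d \mu(d)^2 \frac{f(d)}{d} \Big( \prod_{p \mid d} \Big( 1 - \frac{f(p)}{p} \Big) \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big) \Big( \prod_{p \nmid d} \Big( 1 - \frac{1}{p} \Big)^{-2k} \Big( 1 - \frac{f(p)}{p} \Big)^2 \Big) P^{(k)}\Big( \frac{\log R/d}{\log R} \Big)^2 \\
&\sim \frac{x}{(\log R)^{2k}} \mathfrak{S}(\mathcal{H}) \int_1^R P^{(k)}\Big( \frac{\log R/z}{\log R} \Big)^2 d\Big( \frac{(\log z)^k}{k!} \Big) \\
&\sim \frac{x \, \mathfrak{S}(\mathcal{H})}{(\log R)^k} \int_0^1 P^{(k)}(1 - y)^2 \frac{y^{k-1}}{(k-1)!} \, dy.
\end{align*}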

One does a similar calculation for $Q_2(\lambda)$. One can similarly compute the Euler product, and one again gets
\[
Q_2(\lambda) = k \frac{x \, \mathfrak{S}(\mathcal{H})}{\log x \, (\log R)^{k-1}} \int_0^1 \frac{y^{k-2}}{(k-2)!} P^{(k-1)}(1 - y)^2 \, dy.
\]

All we need is to find a polynomial $P$ to make the ratio $Q_2(\lambda)/Q_1(\lambda) > 1$ subject to the condition that $R < x^{1/4}$. It's advantageous to make $R$ as large as possible in terms of $x$, but we need $R < x^{1/4}$. These quadratic forms only depend on the polynomial $P$. Next time, we'll finish this. It will end up happening that when $R = x^{1/4-\varepsilon}$ you get this ratio to be just under $1$. But, one can even get this ratio to tend to infinity by thinking of this as a higher dimensional problem, using an argument of Maynard.

19. 12/5/17

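Let $\mathcal{H} := \{h_1, \ldots, h_k\}$ be an admissible tuple so that $\mathfrak{S}(\mathcal{H}) \ne 0$. We want to compare
\[
Q_1 := \sum_{n \le N} \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2
\]
with
\[
Q_2 := \sum_{j=1}^k \sum_{\substack{n \le N \\ n + h_j \text{ prime}}} \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2.
\]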

We took $\lambda_d = 0$ unless $d \le R$ is squarefree. Then, we need $R \le N^{1/4-\varepsilon}$ by Bombieri-Vinogradov. We can take
\[
\lambda_d = \mu(d) P\Big( \frac{\log R/d}{\log R} \Big)
\]
for $P$ a polynomial vanishing to order $k$ at $0$. We computed the main terms of these two sums. We found that for $R \le N^{1/2-\varepsilon}$
\[
Q_1 \sim \frac{\mathfrak{S}(\mathcal{H}) N}{(\log R)^k} \int_0^1 \frac{y^{k-1}}{(k-1)!} P^{(k)}(1 - y)^2 \, dy.
\]

We found also that for $R \le N^{1/4-\varepsilon}$
\[
Q_2 \sim \frac{k \, \mathfrak{S}(\mathcal{H}) N}{(\log N)(\log R)^{k-1}} \int_0^1 \frac{y^{k-2}}{(k-2)!} P^{(k-1)}(1 - y)^2 \, dy.
\]
Let $Q(y) = P^{(k-1)}(y)$ be a polynomial vanishing to order at least $1$ at $0$. Then, we want to know if
\[
\frac{k \log R}{\log N} \int_0^1 \frac{y^{k-2}}{(k-2)!} Q(1 - y)^2 \, dy > \int_0^1 \frac{y^{k-1}}{(k-1)!} Q'(1 - y)^2 \, dy \tag{19.1}
\]
so that we can understand the ratio $Q_2/Q_1$. In Selberg's sieve one typically takes $Q(y) = y$ so that $P \sim y^k$. In GPY, they took $Q(y) = y^l$ with $l$ chosen to be large. We now have to compute these integrals, which are examples of the $\beta$ function. Recall

Lemma 19.1.
\[
\int_0^1 y^a (1 - y)^b \, dy = \frac{a! \, b!}{(a + b + 1)!}.
\]

Proof. Take $n = a + b + 1$. We put down $n$ numbers at random
\[
x_1, \ldots, x_n \in (0, 1)
\]
independently and uniformly. We now ask what the chance is that $x_n$ is in position $a + 1$ when the numbers are sorted. It is easy to see, by symmetry, that there is a $\frac{1}{n}$ chance. On the other hand, we can integrate over the possible positions of the $n$th object such that it is in position $a + 1$. By choosing which $a$ of the other objects lie below it, we see this probability is
\[
\binom{a + b}{a} \int_0^1 x_n^a (1 - x_n)^b \, dx_n.
\]
Therefore,
\[
\frac{1}{a + b + 1} = \binom{a + b}{a} \int_0^1 x_n^a (1 - x_n)^b \, dx_n,
\]
which rearranges to the claim. $\square$

Simplifying (19.1) with $Q(y) = y^l$, and multiplying both sides by $(k-1)!$, we get that it suffices to check
\[
\frac{k \log R}{\log N} (k - 1) \frac{(k-2)! \, (2l)!}{(k - 1 + 2l)!} > \frac{(k-1)! \, l^2 \, (2l-2)!}{(k + 2l - 2)!}.
\]

Simplifying further, we want to check
\[
\frac{k}{k - 1 + 2l} \frac{\log R}{\log N} \, 2l (2l - 1) > l^2.
\]
Now say $k$ is large and $l \sim \sqrt{k}$. We get roughly that it suffices to check
\[
(4 - \varepsilon) \frac{\log R}{\log N} > 1.
\]
But, we chose $R = N^{1/4-\varepsilon}$. If we could choose $R$ larger than $N^{1/4}$ we could prove bounded gaps between primes. But, with Bombieri-Vinogradov, this barely fails to give bounded gaps between primes.
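The "barely fails" phenomenon is easy to see numerically. The sketch below is an illustration under the choice $Q(y) = y^l$ from the text, with $\theta$ standing for $\log R / \log N$ (the names `beta`, `gpy_ratio`, `simplified` are hypothetical). It evaluates both sides of (19.1) exactly using Lemma 19.1 and compares with the simplified form.

```python
from fractions import Fraction
from math import factorial


def beta(a, b):
    """Exact value of the integral of y^a (1-y)^b over [0, 1] (Lemma 19.1)."""
    return Fraction(factorial(a) * factorial(b), factorial(a + b + 1))


def gpy_ratio(k, l, theta):
    """(left side) / (right side) of (19.1) for Q(y) = y^l, theta = log R / log N."""
    lhs = theta * k * beta(k - 2, 2 * l) / factorial(k - 2)
    rhs = l * l * beta(k - 1, 2 * l - 2) / factorial(k - 1)
    return lhs / rhs


def simplified(k, l, theta):
    """The same ratio after cancelling factorials."""
    return theta * k * 2 * l * (2 * l - 1) / (l * l * (k + 2 * l - 1))


quarter = Fraction(1, 4)
for k in range(2, 40):
    for l in range(1, 12):
        r = gpy_ratio(k, l, quarter)
        assert r == simplified(k, l, quarter)
        assert r < 1  # at theta = 1/4 the ratio never reaches 1

# With any theta > 1/4 (here 0.26) the ratio exceeds 1 once k and l are large.
print(float(simplified(10**4, 50, Fraction(13, 50))))
```

At $\theta = 1/4$ the ratio equals $\frac{k}{k + 2l - 1} \cdot \frac{2l - 1}{2l}$, which tends to $1$ from below, matching the discussion above.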

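Exercise 19.2. Assuming Elliott-Halberstam, obtain a bound for $\liminf (p_{n+1} - p_n)$.

We'll next talk about Maynard's refinement. The additional idea in GPY is the following. We have
\[
Q_1 = \sum_{\substack{h_1, \ldots, h_k \le H \\ \text{distinct}}} \sum_{n \le N} \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2,
\]
\[
Q_2 = \sum_{h \le H} \sum_{\substack{h_1, \ldots, h_k \le H \\ \text{distinct}}} \sum_{\substack{n \le N \\ n + h \text{ prime}}} \Big( \sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d \Big)^2.
\]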

If every interval $[n, n + H]$ contains at most one prime then $Q_2 \le Q_1$. If $h \ne h_1, \ldots, h_k$ the second form is of size
\[
\frac{N}{(\log N)(\log R)^k}.
\]

One has many more possibilities coming from $h$ not in $h_1, \ldots, h_k$. The contribution to $Q_2$ from $h \in \mathcal{H}$ is almost $Q_1$, but it is then pushed over by allowing $h \notin \mathcal{H}$: we multiply the size above by the number of choices of $h$, where $H$ is about $\delta \log N$. Therefore, we get enough extra help from elements of $[n, n + H]$ not lying in the tuple $\mathcal{H}$. Here, $k$ is very large, so we are looking at a high dimensional sieve. This is a different optimization problem and has some surprises. Maynard and Tao have a method which gives many primes in bounded gaps. GPY gives only 2 primes in bounded gaps, but not many. Let

\[
\mathcal{H} = \{h_1, \ldots, h_k\}
\]
be admissible. Consider
\[
\sum_{n \sim x} \Big( \sum_{\substack{d_1 \mid (n+h_1) \\ d_2 \mid (n+h_2) \\ \vdots \\ d_k \mid (n+h_k)}} \lambda_{d_1, \ldots, d_k} \Big)^2
\]
and compare it to
\[
\sum_{\substack{n \sim x \\ n + h_j \text{ prime}}} \Big( \sum_{d_i \mid (n+h_i)} \lambda_{d_1, \ldots, d_k} \Big)^2.
\]
Before we had
\[
\sum_{d \mid (n+h_1) \cdots (n+h_k)} \lambda_d
\]
and
\[
\lambda_d = \sum_{d_1 \cdots d_k = d} \lambda_{d_1, \ldots, d_k}
\]
where we might have in mind that
\[
\lambda_{d_1, \ldots, d_k} = \mu(d_1) \cdots \mu(d_k)
\]
times a function
\[
F\Big( \frac{\log d_1}{\log R}, \ldots, \frac{\log d_k}{\log R} \Big),
\]
whereas we are now allowing any $F(x_1, \ldots, x_k)$ supported on $x_1 + \cdots + x_k \le 1$ rather than just a function $G(x_1 + \cdots + x_k)$. So there is more flexibility in allowing functions of many variables rather than just of a single variable.

We now introduce a trick, previously known as using a small sieve, but after Green-Tao it is known as the $W$-trick. The most naive sieve can be used to count
\[
\sum_{\substack{n \le x \\ p \mid n \implies p > z}} 1.
\]

To count this, if
\[
\prod_{p \le z} p
\]
is very small, we can easily sieve this. The product above is around $e^z$. For example, if $z \le \frac{\log x}{10^6}$ this is very easy to sieve. For example, take
\[
W = \prod_{p \le \log \log \log x} p,
\]
then $W = (\log \log x)^{O(1)}$. When studying these, we insist $n$ lies in some progression $\nu \bmod W$. That is, we want to understand
\[
\sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \Big( \sum_{\substack{d_1 \mid (n+h_1) \\ d_2 \mid (n+h_2) \\ \vdots \\ d_k \mid (n+h_k)}} \lambda_{d_1, \ldots, d_k} \Big)^2
\]
and compare it to
\[
\sum_{\substack{n \sim x \\ n + h_j \text{ prime} \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid (n+h_i)} \lambda_{d_1, \ldots, d_k} \Big)^2.
\]
Then, $(n + h_i, n + h_j) = 1$ for all $i \ne j$ with $n \equiv \nu \bmod W$. So, we want to understand the quadratic form
\[
\sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k \\ [d_i, e_i] \mid (n+h_i)}} \lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k}.
\]
We have $\lambda_{d_1, \ldots, d_k} = 0$ unless

(1) $d_1, \ldots, d_k \le R$ and are squarefree,
(2) $d_1 \cdots d_k$ is coprime to $W$,
(3) $(d_i, d_j) = 1$.

These above conditions are automatic, but there is an additional condition: note that if $i \ne j$ we must have $(d_i, e_j) = 1$, as otherwise the sum would be $0$.

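So, we now have
\[
\sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k \\ [d_i, e_i] \mid (n+h_i)}} \lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k}
= \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k}} \lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k} \frac{x}{W \prod_{i=1}^k [d_i, e_i]}.
\]
The error term is ok if $R \le x^{1/2-\varepsilon}$. If $d_i, e_i$ have a common factor this will only appear once in the denominator. But if $d_i$ and $e_j$ with $i \ne j$ have a common factor, this will appear in the denominator at least to the power $2$. It turns out these form part of the tail of a convergent sum which will go to $0$. Hence, we will ignore the condition that $i \ne j$ implies $(d_i, e_j) = 1$. To justify this, we need to check that the terms with $(d_i, e_j) > 1$ contribute a small amount compared to the main term (assuming we have removed small prime factors). Then,
\begin{align*}
\sum_{\substack{n \sim x \\ n \equiv \nu \bmod W}} \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k \\ [d_i, e_i] \mid (n+h_i)}} \lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k}
&= \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k}} \lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k} \frac{x}{W \prod_{i=1}^k [d_i, e_i]} \\
&\sim \frac{x}{W} \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k}} \frac{\lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k}}{d_1 \cdots d_k \, e_1 \cdots e_k} \prod_{i=1}^k (d_i, e_i).
\end{align*}
Then, we can write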

\[
(d_i, e_i) = \sum_{f_i \mid (d_i, e_i)} \phi(f_i).
\]
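The identity used here is the classical $\sum_{f \mid m} \phi(f) = m$ applied to $m = (d_i, e_i)$, combined with $[d, e] = de/(d, e)$. A brute-force sketch on a few arbitrarily chosen pairs (the pairs and helper names are assumptions for illustration):

```python
from math import gcd


def phi(n):
    """Euler's totient, computed directly from the definition."""
    return sum(1 for a in range(1, n + 1) if gcd(a, n) == 1)


def divisors(n):
    return [d for d in range(1, n + 1) if n % d == 0]


for d, e in [(6, 15), (10, 35), (30, 42), (7, 11)]:
    g = gcd(d, e)
    lcm = d * e // g
    # (d, e) = sum over f | (d, e) of phi(f)
    assert sum(phi(f) for f in divisors(g)) == g
    # 1 / [d, e] = (d, e) / (d e)
    assert lcm * g == d * e
```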

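We can write our quadratic form $Q_1$ as approximately
\[
Q_1 \sim \frac{x}{W} \sum_{f_1, \ldots, f_k} \prod_{i=1}^k \phi(f_i) \Big( \sum_{\substack{d_1, \ldots, d_k \\ f_i \mid d_i}} \frac{\lambda_{d_1, \ldots, d_k}}{d_1 \cdots d_k} \Big)^2.
\]
We then set
\[
y_{f_1, \ldots, f_k} = \Big( \prod_{i=1}^k \mu(f_i) \phi(f_i) \Big) \Big( \sum_{f_i \mid d_i} \frac{\lambda_{d_1, \ldots, d_k}}{d_1 \cdots d_k} \Big).
\]
Then,
\[
\lambda_{d_1, \ldots, d_k} \sim \mu(d_1) \cdots \mu(d_k).
\]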

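The Möbius functions of the $f_i$ then cancel out and the $f_i$ mostly cancel with the $\phi(f_i)$. Therefore, the choice of $y_{f_1, \ldots, f_k}$ will look like
\[
y_{f_1, \ldots, f_k} \sim F\Big( \frac{\log f_1}{\log R}, \ldots, \frac{\log f_k}{\log R} \Big).
\]
Then, after an invertible change of variables,
\[
\lambda_{d_1, \ldots, d_k} = \Big( \prod_j \mu(d_j) d_j \Big) \sum_{d_j \mid a_j} \frac{y_{a_1, \ldots, a_k}}{\phi(a_1) \cdots \phi(a_k)}.
\]
Then,
\[
Q_1 \sim \frac{x}{W} \sum_{f_1, \ldots, f_k} \frac{y_{f_1, \ldots, f_k}^2}{\phi(f_1) \cdots \phi(f_k)}.
\]
So, $y_{f_1, \ldots, f_k} = 0$ unless $f_1 \cdots f_k \le R$ is squarefree and coprime to $W$. We make the choice
\[
y_{f_1, \ldots, f_k} = F\Big( \frac{\log f_1}{\log R}, \ldots, \frac{\log f_k}{\log R} \Big).
\]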

Then, $Q_1$ is approximately
\[
\frac{x}{W} \Big( \frac{\phi(W)}{W} \Big)^k (\log R)^k \int_{x_1, \ldots, x_k} F(x_1, \ldots, x_k)^2 \, dx_1 \cdots dx_k
\]
where
\[
F(x_1, \ldots, x_k) = 0 \text{ unless } x_1 + \cdots + x_k \le 1.
\]

Remark 19.3. The $(\log R)^k$ in the numerator (instead of the denominator) is a scaling factor relating to how we chose the $y_{f_1, \ldots, f_k}$.

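Let's now start understanding $Q_2$, which we will finish on Thursday. This will be similar. We have
\begin{align*}
Q_2 &\sim k \sum_{\substack{n \sim x \\ n + h_k \text{ prime} \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid n + h_i} \lambda_{d_1, \ldots, d_k} \Big)^2 \\
&= k \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k \\ d_k = 1, \, e_k = 1}} \lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k} \sum_{\substack{n \sim x \\ n \equiv \nu \bmod W \\ n + h_k \text{ prime} \\ n + h_i \equiv 0 \bmod [d_i, e_i]}} 1.
\end{align*}
Again, on the last line above there will be a coprimality condition on $(d_i, e_j)$ which we can ignore as was done above. Then, we can simplify
\[
\sum_{\substack{n \sim x \\ n \equiv \nu \bmod W \\ n + h_k \text{ prime} \\ n + h_i \equiv 0 \bmod [d_i, e_i]}} 1 \sim \frac{x}{(\log x) \, \phi(W) \prod_{i=1}^{k-1} \phi([d_i, e_i])}.
\]
We'll finish understanding this quadratic form on Thursday.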

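20. 12/7/17

Recall last time we had an admissible set
\[
\mathcal{H} = \{h_1, \ldots, h_k\}
\]
and we chose
\[
W = \prod_{p \le \log \log \log x} p
\]
together with a residue class
\[
\nu \bmod W \text{ with } (\nu + h_i, W) = 1 \text{ for all } i.
\]
We had
\[
Q_1 = \sum_{\substack{n \le x \\ n \equiv \nu \bmod W}} \Big( \sum_{d_i \mid n + h_i} \lambda_{d_1, \ldots, d_k} \Big)^2.
\]
When $R \le x^{1/2-\varepsilon}$, we evaluated the above as
\[
\frac{x}{W} \sum_{r_1, \ldots, r_k} \frac{y_{r_1, \ldots, r_k}^2}{\prod_j \phi(r_j)}
\]
with
\[
y_{r_1, \ldots, r_k} = \Big( \prod_i \mu(r_i) \phi(r_i) \Big) \sum_{r_i \mid d_i} \frac{\lambda_{d_1, \ldots, d_k}}{d_1 \cdots d_k}
\]
where
\[
\lambda_{d_1, \ldots, d_k} = \Big( \prod_i \mu(d_i) d_i \Big) \sum_{d_i \mid r_i} \frac{y_{r_1, \ldots, r_k}}{\phi(r_1) \cdots \phi(r_k)}.
\]
We chose
\[
y_{r_1, \ldots, r_k} = F\Big( \frac{\log r_1}{\log R}, \ldots, \frac{\log r_k}{\log R} \Big)
\]
with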

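$F(x_1, \ldots, x_k)$ supported on $x_1 + \cdots + x_k \le 1$. We then get
\[
Q_1 = \frac{x}{W} \Big( \frac{\phi(W)}{W} \Big)^k (\log R)^k \int_{x_1, \ldots, x_k} F(x_1, \ldots, x_k)^2 \, dx_1 \cdots dx_k.
\]
We have
\[
\frac{1}{\phi([d, e])} = \frac{\phi((d, e))}{\phi(d) \phi(e)}
\]
and we use that
\[
\phi(n) = \sum_{d \mid n} g(d)
\]
with $g$ multiplicative (on relatively prime inputs) with $g(p) = p - 2$. We compare this $Q_1$ to (with the function $g$ defined above)
\begin{align*}
Q_2 &= \sum_{\substack{n \le x \\ n + h_k \text{ prime} \\ n \equiv \nu \bmod W}} \Big( \sum \lambda_{d_1, \ldots, d_k} \Big)^2 \\
&= \frac{x}{\phi(W) \log x} \sum_{\substack{d_1, \ldots, d_k \\ e_1, \ldots, e_k \\ d_k = e_k = 1}} \frac{\lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k}}{\prod \phi([d_i, e_i])} \\
&= \frac{x}{\phi(W) \log x} \sum_{f_1, \ldots, f_k} \Big( \prod_{i=1}^k g(f_i) \Big) \sum_{\substack{d_i, e_i \\ f_i \mid d_i, \, f_i \mid e_i \\ d_k = e_k = 1}} \frac{\lambda_{d_1, \ldots, d_k} \lambda_{e_1, \ldots, e_k}}{\prod \phi(d_j) \prod \phi(e_j)}.
\end{align*}
Note that above we had the conditions $[e_i, d_i] \mid n + h_i$ and the coprimality condition $(d_i, e_j) = 1$ for $i \ne j$. Then, we let
\[
y^{(k)}_{f_1, \ldots, f_k} = \Big( \prod \mu(f_i) g(f_i) \Big) \cdot \sum_{\substack{d_1, \ldots, d_k \\ d_k = 1, \, f_i \mid d_i}} \frac{\lambda_{d_1, \ldots, d_k}}{\phi(d_1) \cdots \phi(d_k)}.
\]
By convention we set $y^{(k)}_{f_1, \ldots, f_k} = 0$ unless $f_k = 1$.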

We then have
\[
Q_2 = \frac{x}{\phi(W) \log x} \sum_{f_1, \ldots, f_k} \frac{\big( y^{(k)}_{f_1, \ldots, f_k} \big)^2}{\prod_{j=1}^k g(f_j)}.
\]

Lemma 20.1. Letting $f_k = d_k = 1$,
\[
y^{(k)}_{f_1, \ldots, f_k} \sim \sum_{r_k} \frac{y_{f_1, \ldots, f_{k-1}, r_k}}{\phi(r_k)}.
\]

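Proof. We have
\begin{align*}
y^{(k)}_{f_1, \ldots, f_k} &= \Big( \prod \mu(f_i) g(f_i) \Big) \sum_{\substack{d_1, \ldots, d_k \\ f_i \mid d_i}} \frac{\lambda_{d_1, \ldots, d_k}}{\phi(d_1) \cdots \phi(d_k)} \\
&= \Big( \prod \mu(f_i) g(f_i) \Big) \sum_{\substack{f_i \mid d_i \\ d_k = 1}} \frac{1}{\phi(d_1) \cdots \phi(d_k)} \Big( \prod \mu(d_j) d_j \Big) \sum_{\substack{r_1, \ldots, r_k \\ d_j \mid r_j}} \frac{y_{r_1, \ldots, r_k}}{\phi(r_1) \cdots \phi(r_k)} \\
&= \Big( \prod \mu(f_j) g(f_j) \Big) \sum_{\substack{r_1, \ldots, r_k \\ f_j \mid r_j}} \frac{y_{r_1, \ldots, r_k}}{\prod \phi(r_j)} \sum_{\substack{d_1, \ldots, d_k \\ d_k = 1 \\ f_i \mid d_i \mid r_i}} \prod_{j=1}^k \frac{\mu(d_j) d_j}{\phi(d_j)}.
\end{align*}
Fixing $f_i, r_j$ we want to compute
\[
\sum_{\substack{d_1, \ldots, d_k \\ d_k = 1 \\ f_i \mid d_i \mid r_i}} \prod_{j=1}^k \frac{\mu(d_j) d_j}{\phi(d_j)}.
\]
If $j = k$, then the term is $1$. If $j < k$ the term is
\[
\frac{f_j \mu(f_j)}{\phi(f_j)} \prod_{p \mid r_j / f_j} \Big( 1 - \frac{p}{p - 1} \Big).
\]
Since $1 - \frac{p}{p-1} = -\frac{1}{p-1}$, the above term is relatively small, unless $r_j = f_j$, in which case the product is the empty product and goes away.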

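Therefore, we can simplify
\begin{align*}
& \Big( \prod \mu(f_j) g(f_j) \Big) \sum_{\substack{r_1, \ldots, r_k \\ f_j \mid r_j}} \frac{y_{r_1, \ldots, r_k}}{\prod \phi(r_j)} \sum_{\substack{d_1, \ldots, d_k \\ d_k = 1 \\ f_i \mid d_i \mid r_i}} \prod_{j=1}^k \frac{\mu(d_j) d_j}{\phi(d_j)} \\
&\sim \Big( \prod_j \frac{\mu(f_j) g(f_j) f_j \mu(f_j)}{\phi(f_j)} \Big) \sum_{r_k} \frac{y_{f_1, \ldots, f_{k-1}, r_k}}{\phi(f_1) \cdots \phi(f_{k-1}) \phi(r_k)} \\
&= \Big( \prod_j \frac{\mu(f_j)^2 g(f_j) f_j}{\phi(f_j)^2} \Big) \sum_{r_k} \frac{y_{f_1, \ldots, f_{k-1}, r_k}}{\phi(r_k)} \\
&\sim \sum_{r_k} \frac{y_{f_1, \ldots, f_{k-1}, r_k}}{\phi(r_k)}.
\end{align*}
$\square$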



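Using the lemma, let's continue to evaluate $Q_2$. Recall we chose $F$ so that
\[
y_{r_1, \ldots, r_k} = F\Big( \frac{\log r_1}{\log R}, \ldots, \frac{\log r_k}{\log R} \Big).
\]
By Lemma 20.1,
\begin{align*}
y^{(k)}_{f_1, \ldots, f_k} &\sim \sum_{r_k} \frac{y_{f_1, \ldots, f_{k-1}, r_k}}{\phi(r_k)} \\
&= \frac{\phi(W)}{W} (\log R) \int F\Big( \frac{\log f_1}{\log R}, \ldots, \frac{\log f_{k-1}}{\log R}, x_k \Big) \, dx_k.
\end{align*}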

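Plugging this in for $Q_2$, we have
\begin{align*}
Q_2 &= \frac{x}{\phi(W) \log x} \sum_{f_1, \ldots, f_k} \frac{\big( y^{(k)}_{f_1, \ldots, f_k} \big)^2}{\prod_{j=1}^k g(f_j)} \\
&= \frac{x}{\phi(W) \log x} \Big( \frac{\phi(W)}{W} \log R \Big)^2 \Big( \frac{\phi(W)}{W} \log R \Big)^{k-1} \int_{x_1, \ldots, x_{k-1}} \Big( \int_{x_k} F(x_1, \ldots, x_{k-1}, x_k) \, dx_k \Big)^2 dx_1 \cdots dx_{k-1}
\end{align*}
(where here we are really multiplying by $k$ for choosing a particular $h_i$, and we are assuming $F(x_1, \ldots, x_k)$ is symmetric).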

So, for comparison, we have
\begin{align*}
Q_1 &= \frac{x}{W} \Big( \frac{\phi(W)}{W} \log R \Big)^k \int_{x_1, \ldots, x_k} F(x_1, \ldots, x_k)^2 \, dx_1 \cdots dx_k, \\
Q_2 &= \frac{kx}{W} \Big( \frac{\phi(W)}{W} \log R \Big)^k \frac{\log R}{\log x} \int_{x_1, \ldots, x_{k-1}} \Big( \int_{x_k} F(x_1, \ldots, x_k) \, dx_k \Big)^2 dx_1 \cdots dx_{k-1}.
\end{align*}
We then see that this almost matches up with our first quadratic form $Q_1$. The only difference is that we have two different quadratic forms in the function $F$ we are choosing. So, we have boiled everything down to a problem of optimizing $F$. Recall we want $F(x_1, \ldots, x_k)$ to be symmetric and supported on $x_1 + \cdots + x_k \le 1$. We will choose
\[
F(x_1, \ldots, x_k) = \prod_{i=1}^k g(k x_i)
\]
for some fixed function $g$, on $x_1 + \cdots + x_k \le 1$. The numerator of the ratio $Q_2/Q_1$ is given by
\begin{align*}
& k \frac{\log R}{\log x} \int_{x_1, \ldots, x_{k-1}} \Big( \int_{\substack{x_k \\ x_1 + \cdots + x_k \le 1}} \prod_{i=1}^k g(k x_i) \, dx_k \Big)^2 dx_1 \cdots dx_{k-1} \\
&= \frac{k}{k^{k+1}} \frac{\log R}{\log x} \int_{u_1, \ldots, u_{k-1}} \Big( \int_{\substack{u_k \\ u_1 + \cdots + u_k \le k}} g(u_k) \, du_k \Big)^2 \prod_{j=1}^{k-1} g(u_j)^2 \, du_j
\end{align*}
with $k x_i = u_i$. Then, the denominator is similarly given by
\[
\frac{1}{k^k} \int_{\substack{u_1, \ldots, u_k \\ u_1 + \cdots + u_k \le k}} \prod g(u_j)^2 \, du_j.
\]
Next, we will give an upper bound on the denominator and a lower bound on the numerator. For the denominator, we have the upper bound
\[
\int_{\substack{u_1, \ldots, u_k \\ u_1 + \cdots + u_k \le k}} \prod g(u_j)^2 \, du_j \le \Big( \int_0^{\infty} g(u)^2 \, du \Big)^k.
\]
So, $g$ is a function on the positive reals, but we may as well take $g$ supported on $[0, k]$, since we are only integrating over $u_i$ with $\sum_i u_i \le k$. Let's assume that $g$ is supported on $[0, B]$ with $B$ a bit smaller than $k$, say $B \sim k/100$. Now, let's obtain a lower bound for the numerator.

We'll let $u_k$ go up to $B$. We'll then bound the numerator from below by
\[
\int_{\substack{u_1, \ldots, u_{k-1} \\ u_1 + \cdots + u_{k-1} \le k - B}} \Big( \int_{u_k} g(u_k) \, du_k \Big)^2 \prod_{j=1}^{k-1} g(u_j)^2 \, du_j. \tag{20.1}
\]

If we ignore the restriction that $\sum_i u_i \le k - B$, then $Q_2/Q_1$ is bounded below by, up to some constant,
\[
\frac{\big( \int g(u) \big)^2}{\int g(u)^2} \cdot \frac{\log R}{\log x}.
\]

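Remark 20.2. But now, how can we ignore the restriction that $u_1 + \cdots + u_{k-1} \le k - B$? This might seem like a serious issue, but we now discuss the answer. The key additional observation is the following. If we know $\int u g(u)^2 \, du \le \frac{1}{2} \int g(u)^2$, then most of the weight is concentrated on values of $u \le 1/2$. Then, $u_1 + \cdots + u_{k-1}$ is at most $k/2$, most of the time. So we should then be able to ignore the condition $\sum_{i=1}^{k-1} u_i \le k - B$.

Let's now make this idea more precise. We now observe that
\begin{align*}
& \int_{\substack{u_1, \ldots, u_{k-1} \\ u_1 + \cdots + u_{k-1} \le k - B}} \Big( \int_{u_k} g(u_k) \, du_k \Big)^2 \prod_{j=1}^{k-1} g(u_j)^2 \, du_j \\
&\ge \Big( \int g(u) \Big)^2 \int_{u_1, \ldots, u_{k-1}} \Big( 1 - \Big( \frac{u_1 + \cdots + u_{k-1}}{k - B} \Big)^2 \Big) \prod_{j=1}^{k-1} g(u_j)^2 \, du_j \\
&\ge \Big( \int g(u) \Big)^2 \Big\{ \Big( \int g(u)^2 \Big)^{k-1} - \frac{k-1}{(k-B)^2} \Big( \int u^2 g(u)^2 \, du \Big) \Big( \int g(u)^2 \Big)^{k-2} \\
&\qquad - \frac{(k-1)(k-2)}{(k-B)^2} \Big( \int u g(u)^2 \Big)^2 \Big( \int g(u)^2 \Big)^{k-3} \Big\}.
\end{align*}
Then, since $g$ is supported on $u \le B$,
\[
\int u^2 g(u)^2 \, du \le \frac{B}{2} \Big( \int g(u)^2 \Big)
\]
and the whole term satisfies
\[
\frac{k-1}{(k-B)^2} \Big( \int u^2 g(u)^2 \, du \Big) \Big( \int g(u)^2 \Big)^{k-2} \le \frac{1}{200} \Big( \int g(u)^2 \Big)^{k-1},
\]
up to a factor close to $1$.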

So, we can bound Equation 20.1 from below by
\[
\frac{1}{2} \Big( \int g(u) \Big)^2 \Big( \int g(u)^2 \Big)^{k-1}.
\]
So, we now want
\[
\int u g(u)^2 \le \frac{1}{2} \int g(u)^2 \, du
\]
and we want to make $\big( \int g(u) \big)^2$ large in comparison to $\int g(u)^2$. The condition vaguely means that most of the mass should appear on small numbers. You can start to guess what function might work for $g$ (or you can try to use calculus of variations). We can try $g(u) = \frac{1}{u}$, though this has a pole at $0$. Let's try
\[
g(u) =
\begin{cases}
\frac{1}{u} & \text{if } \frac{1}{A} \le u \le B \\
0 & \text{else.}
\end{cases}
\]
Then,
\begin{align*}
\int g(u) &= \log AB, \\
\int g(u)^2 &\sim A, \\
\int u g(u)^2 \, du &= \log AB.
\end{align*}
So, we need something like $\log AB \le A/2$. So, let's take $B = k/100$. We then have $\log AB \sim \log k$. We can take $A = 3 \log k$. It then meets the conditions that $B \le k/100$ and
\[
\int u g(u)^2 \le \frac{1}{2} \int g(u)^2 \, du.
\]
To conclude, we now want to compute $Q_2/Q_1$. Indeed,
\begin{align*}
\frac{\big( \int g(u) \big)^2}{\int g(u)^2} &= \frac{(\log AB)^2}{A} \\
&\sim \frac{(\log k)^2}{3 \log k} \\
&= \frac{\log k}{3}.
\end{align*}
And indeed, this goes to $\infty$ so long as $k \to \infty$. So, in any tuple where $k$ is sufficiently large, where you expect to find $k$ primes, you can at least find on the order of $\log k$ primes.
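The final computation can be tabulated for concrete $k$. The sketch below assumes the truncated choice $g(u) = 1/u$ on $[1/A, B]$ with $A = 3 \log k$ and $B = k/100$, as in the text, and uses the closed forms of the three integrals; the function name `maynard_integrals` is hypothetical. Note that the printed ratio still has to be multiplied by $\log R / \log x$ and the constant from the lower bound before it bounds $Q_2/Q_1$.

```python
from math import log


def maynard_integrals(k):
    """Closed forms for g(u) = 1/u on [1/A, B], with A = 3 log k and B = k/100."""
    A = 3 * log(k)
    B = k / 100
    int_g = log(A * B)    # integral of 1/u over [1/A, B]
    int_g2 = A - 1 / B    # integral of 1/u^2 over [1/A, B]
    int_ug2 = log(A * B)  # integral of u * (1/u^2) over [1/A, B]
    return int_g, int_g2, int_ug2


ratios = []
for k in (10**3, 10**6, 10**12):
    int_g, int_g2, int_ug2 = maynard_integrals(k)
    # Mass condition: integral of u g^2 is at most half the integral of g^2.
    assert int_ug2 <= int_g2 / 2
    ratios.append(int_g ** 2 / int_g2)
    print(k, round(ratios[-1], 2), round(log(k) / 3, 2))

# The ratio (int g)^2 / int g^2 grows with k, at the rate of about (log k)/3.
assert ratios[0] < ratios[1] < ratios[2]
```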