arXiv:1604.01041v2 [math.NT] 28 Oct 2019 in h ubro lmnsof elements of number The sion. Let c hw h xsec fifiieymn neesin integers many a infinitely Dartyge of of work existence the the particular shows In structure. Fourier this ploiting enmc rvoswr 1 ,4 ,6 1 3 studying 13] 11, T 6, size. 5, in 4, small 2, unusually [1, work often previous is much and been description, analytic explicit nient etesto ubr hc aen ii qa to equal digit no have which numbers of set the be rmswe h e nqeto osse oeadtoa multiplicat infin additional set contain some The sets possesses question sparse in that c set show set the only the when analy can whether primes several we as presents Typically such way questions primes. this arithmetic many in answer sparse to being tries one set A numbers. ral hsrsl eyn ntefc that fact the on relying result this eso htteeaeifiieymn rmsin primes many infinitely are there presc that by show created We set Bourga sparse digits. of binary the work the in the of and primes proportion w well-distributed, of [17] is Rivat existence primes and the Mauduit of showed of digits work of related the sum mention the also We 16]. 12, [7, atclr emk e s fteFuirsrcueof structure Fourier Marko the a of with use comparison key a make and biline numbers we of of particular, method geometry the the sieve, sieve, Harman’s large method, circle the of combination o (10 log = a 0 { ∈ Abstract. ubr hc onthv h digit the have not do which numbers n oetetmtsotie ycmaio ihaMro pr t Markov numbers, a with of comparison geometry by the obtained of estimates digit combination moment restricted a and with on condit numbers rely digital of from estimates transform large Fourier is the mi primes which when the the conditions control of Diophantine to transform decorrelating sieve Fourier and by Harman’s I’ obtained in ‘Type use is suitable for obtaining information on metic rests and problem, binary A 1 0 / h ro sa plcto fteHryLtlwo icem circle Hardy-Littlewood the of application an is proof The , . . . , 9) a nsal iesrcuei htisFuirtasomhsacon a has transform Fourier its that in structure nice unusually has / o 10 log 9 A RMSWT ETITDDIGITS RESTRICTED WITH PRIMES } 1 Let n let and = ≈ a n 0 0 0 ≤ . X { ∈ 046 i ≤ k 0 , . . . , > n i 1. 10 .I particular, In 0. AE MAYNARD JAMES 9 i A Introduction } : eso hr r nntl ayprime many infinitely are there show We . A 1 n 1 swl-itiue naihei progressions arithmetic in well-distributed is i { ∈ hc r esthan less are which a 1 0 nterdcmlexpansion. decimal their in 0 , . . . , A 9 A }\{ 1 1 A sasas usto h natu- the of subset sparse a is iha ot2piefactors, prime 2 most at with a 1 a u ro sbsdo a on based is proof Our . 0 A } 0 A k , 1 nterdcmlexpan- decimal their in oswihdictate which ions 1 itt hnthe when dictate n eae esb ex- by sets related and ntesm prtas spirit same the in , slre These large. is s ≥ Tp I arith- II’ ‘Type x o rs This arcs. nor elresieve large he 0 is ocess. o to oa to ethod dMuut[,8] [7, Mauduit nd nan infinitely ontains iigapositive a ribing O i iclisif difficulties tic ( x v structure. ive rcs.In process. v rsm,the sums, ar 1 n[]which [3] in − tl many itely oshowed ho c ,where ), eehas here ve- 2 JAMES MAYNARD the aforementioned works. Somewhat surprisingly, the Fourier structure allows us to successfully apply the circle method to a binary problem. i Theorem 1.1. Let X ≥ 4 and A = { 0≤i≤k ni10

Here, and throughout the paper, f ≍ g means that there are absolute constants c1,c2 > 0 such that c1f

Thus there are infinitely many primes with no digit a0 when written in base 10. Since #A/Xlog 9/ log 10 oscillates as X → ∞, we cannot expect an asymptotic for- mula of the form (c + o(1))Xlog 9/ log 10/ log X. Nonetheless, we expect that #A #{p ∈ A} = (κ + o(1)) , A log X where 10(φ(10)−1) 9φ(10) , if (10,a0) = 1, (1.1) κA = 10 ( 9 , otherwise.

Indeed, there are (φ(10)κA/10+ o(1))#A elements of A which are coprime to 10, and (1+o(1))X/ log X primes less than X which are coprime to 10, and (φ(10)/10+ o(1))X integers less than X coprime to 10. Thus if the properties ‘being in A’ and ‘being prime’ where independent for integers n

′ i A = niq

The final case of Theorem 1.2 when B = {0,...,s − 1} and s ≈ q − q57/80 shows the existence many primes in a set of integers A′ with #A′ ≈ X57/80 = X0.7125, a rather thin set. The exponent here can be improved slightly with more effort. The estimates in Theorem 1.2 can be improved to asymptotic formulae if we restrict s slightly further. For general B with s = #B ≤ q1/4−δ and any q sufficiently large in terms of δ > 0 we obtain #A #{p ∈ A′} = (κ + o(1)) , B log X where, if B contains exactly t elements coprime to q, we have q(φ(q) − t) κ = . B φ(q)(q − s) In the case of just one excluded digit, we can obtain this asymptotic formula for q ≥ 12. In the case of B = {0,...,s − 1}, we obtain the above asymptotic formula provided s ≤ q − q3/4+δ. We expect several of the techniques introduced in this paper might be useful more generally in other digit-related questions about arithmetic . Our general approach to counting primes in A and our analysis of the minor arc contribution might also be of independent interest, with potential application to other ques- tions on primes involving sets whose Fourier transform is unrelated to Diophantine properties of the argument.

2. Outline

Our argument is fundamentally based on an application of the circle method. Clearly for the purposes of Theorem 1.1 we can restrict X to a power of 10 for convenience. The number of primes in A is the number of solutions of the binary equation p − a = 0 over primes p and integers a ∈ A, and so is given by 1 a −a #{p ∈ A} = S S , X A X P X 0≤Xa

SA(θ)= e(aθ), aX∈A SP(θ)= e(pθ). p

It turns out that the Fourier transform SA(θ) has some somewhat remarkable fea- tures which cause it to typically have better than square-root cancellation. (A closely related phenomenon is present and crucial in the work of Mauduit and Rivat [17] and Bourgain [3].) Indeed, we establish the ℓ1 bound a (2.1) S ≪ #A X0.36. A X 0≤a

2 1.526 By combining an ℓ bound for SP(a/X) with an ℓ bound for SA(a/X), we are able to show that it is indeed the case that ‘generic’ a

We expect that SP(θ) is large only when θ is close to a rational with small de- nominator, and SA(θ) is large when θ has a decimal expansion containing many 0’s or 9’s. Thus we expect the product to be large only when both of these con- ditions hold, which is essentially when θ is well approximated by a rational whose denominator is a small power of 10. By obtaining suitable estimates for A in arithmetic progressions via the large sieve, one can verify that amongst all a in the major arcs M where a/X is well- approximated by a rational of small denominator we obtain our expected main term, and this comes from when a/X is well-approximated by a rational with de- nominator 10. Thus we are left to show when a ∈ E and a/X is not close to a rational with small denominator, the product SA(a/X)SP(−a/X) is small on average. By using an expansion of the indicator function of the primes as a sum of bilinear terms (similar to Vaughan’s identity), we are led to bound expressions such as a a X a n − a n −1 (2.2) S 1 S 1 min , 1 1 2 2 , A X A X N X a1,a2∈E\M     n1,n2≤N   X X which is a weighted and averaged form of the typical expressions one encounters when obtaining a ℓ∞ bound for exponential sums over primes. Here k·k is the distance to the nearest integer. PRIMES WITH RESTRICTED DIGITS 5

2 The double sum over n1,n2 in (2.2) is of size O(N ) for ‘typical’ pairs (a1,a2), and if it is noticeably larger than this then a1 and a2 must share some Diophantine 3 structure. We find that the pair (a1,a2) must lie close to the projection from Z to Z2 of some low height plane or low height line if this quantity is large, where the arithmetic height of the line or plane is bounded in terms of the size of the double sum. (For example, the diagonal terms a1 = a2 give a large contribution and lie on a low height line, and a1,a2 which are both small give a large contribution and lie in a low height plane.)

This restricts the number and nature of pairs (a1,a2) which can give a large con- tribution. Since we expect the size of SA(a1/X)SA(a2/X) to be determined by digital rather than Diophantine conditions on a1,a2, we expect to have a smaller total contribution when restricted to these sets. By using the explicit description of such pairs (a1,a2) we succeed in obtaining such a superior bound on the sum over these pairs. It is vital here that we are restricted to a1,a2 lying in the small set E (for points on a line) and outside of the set M of major arcs (for points in a lattice). This ultimately allows us to get suitable bounds for (2.2) provided N ∈ [X0.36,X0.425]. If this ‘Type II range’ were larger, we would be able to express the indicator function of the primes as a combination of such bilinear expressions and easily controlled terms. We would then obtain an asymptotic estimate for #{p ∈ A}. Unfortu- nately our range is not large enough to do this. Instead we work with a minorant for the indicator function of the primes throughout our argument, which is chosen such that it is essentially a combination of bilinear expressions which do fall into this range. It is this feature which means we obtain a lower bound rather than an asymptotic estimate for the number of primes in A. Such a minorant is constructed via Harman’s sieve, and, since it is essentially a com- bination of Type II terms and easily handled terms, we can obtain an asymptotic formula for elements of A weighed by it. This gives a lower bound #A #{p ∈A}≥ (c + o(1)) log X for some constant c. We use numerical integration to verify that we (just) have c > 0, and so we obtain our asymptotic lower bound for #{p ∈ A}. The upper bound is a simple sieve estimate. Remark. For the method used to prove Theorem 1.1, strong assumptions such as the Generalized Riemann Hypothesis appear to be only of limited benefit. In particular, even under GRH one only gets pointwise bounds of the strength SP(θ) ≪ X3/4+o(1) for ‘generic’ θ, which is not strong enough to give a non-trivial minor arc bound on its own. The assumption of GRH and the above pointwise bound is sufficient to deal with the entire minor arc contribution in the regime where we obtain asymptotic formulae (i.e. when the base is sufficiently large).

3. Notation

We use the asymptotic notation ≪, ≫, O(·), o(·) throughout, denoting a depen- dence of the implied constant on a parameter t by a subscript. As mentioned earlier, 6 JAMES MAYNARD we use f ≍ g to denote that both f ≪ g and g ≪ f hold. Throughout the paper ǫ will denote a single fixed positive constant which is sufficiently small; ǫ = 10−100 would probably suffice. In particular, any implied constants may depend on ǫ. We will assume that X is always a suitably large integral power of 10 throughout. We will exclusively use the letter p to denote a , without always making this restriction explicit. We will use the nonstandard notation that n ∼ X to mean that n lies in the interval (X/10,X] throughout the paper. Several variables will be assumed to be non-negative integers, without directly specifying this. Thus sums such as n

− log 9/ log 10 (3.1) FY (θ)= Y 1A1 (n)e(nθ) . n

We use 1A1 for the indicator function of the set A1 of integers with restricted digits. Here e(x)= e2πix is the complex exponential function. We need to make use of various numerical estimates throughout the paper, some of which succeed only by a small margin. We have endeavored to avoid too many explicit calculations and we encourage the reader to not pay too much attention to the numerical constants appearing on a first reading.

4. Structure of the paper

In Section 6, we use a sieve decomposition to reduce the proof of Theorem 1.1 to the proof of Proposition 6.1 and Proposition 6.2, which are asymptotic estimates for particular types of terms arising from sieve decompositions. These propositions are established in Section 7. In Section 7, we use sieve theory to reduce the proof of Proposition 6.1 and Propo- sition 6.2 to the proof of Proposition 7.1 and Proposition 7.2, which are our ‘Type I’ and ‘Type II’ estimates. These will be established in Section 8 and Section 9 respectively. In Section 8 we use a large sieve argument to reduce the proof of our Type I estimate Proposition 7.1 to that of Lemma 8.1 and Lemma 8.2, which are Fourier ℓ∞ and ℓ1 bounds. These will be established in Section 10. In Section 9 we use the circle method and geometric decompositions to reduce the proof of our Type II estimate Proposition 7.2 to that of Proposition 9.1, Proposition 9.2 and Proposition 9.3, which are our estimates for the ‘major arcs’, the ‘generic minor arcs’ and the ‘exceptional minor arcs’. These will be established in Sections 11, 12 and 13 respectively. PRIMES WITH RESTRICTED DIGITS 7

In Section 10 we establish various Fourier estimates. In particular we establish Lemma 8.1 and Lemma 8.2, as well as several auxiliary lemmas which will be used in later sections. In Section 11 use results on primes in arithmetic progressions to establish our major arc estimate Proposition 9.1, making use of the estimates of Section 10. In Section 12 we use Fourier moment bounds from Section 10 to establish our generic minor arc estimate Proposition 9.2. In Section 13 we use the geometry of numbers to reduce the proof of the exceptional minor arc estimate Proposition 9.3 to the proof of Proposition 13.3 and Proposition 13.4, which are estimates from frequencies constrained to lie in low height lattices or low height lines. These will be established in Section 14 and Section 15. In Section 14 we establish our estimate for low height lattices Proposition 13.3, using the estimates of Section 10. In Section 15 we establish our estimate for low height lines Proposition 13.4 , using the geometric counting estimates and the results of Section 10. This completes the proof of Theorem 1.1. In Section 16, we sketch the modifications in the argument required to establish Theorem 1.2. In particular, the dependency graph between the main statements in the proof of Theorem 1.1 is as follows:

Lemma 8.1

Proposition 6.2 Lemma 8.2 Proposition 7.1

Proposition 9.1 Theorem 1.1

Proposition 7.2 Proposition 9.2 Proposition 6.1 Proposition 13.3 Proposition 9.3 Proposition 13.4

5. Basic estimates

We will make frequent use of some well-known facts in analytic without extra comment. In particular, we make use of the Prime Number Theorem in short intervals and arithmetic progressions with error term (see [10, Chapter 22], for example). This states that for any A> 0 we have ∆Y Y (5.1) Λ(n)= + O φ(q) A (log Y )A Y ≤n≤Y +∆Y n≡a X(mod q)   8 JAMES MAYNARD provided ∆ ≥ (log Y )−A and q ≤ (log Y )A and gcd(a, q) = 1. We recall the following sieve estimate (see, for example, [18, Theorem 7.11]): For u> 1+1/(log Y )1/2 uY (5.2) #{n < Y : p|n ⇒ p ≥ Y 1/u} = (ω(u)+ o (1)) , u log Y where ω(u) is the Buchstab function defined by the delay-differential equation ω(u)=1/u, 1 ≤ u ≤ 2, ω′(u)= ω(u − 1) − ω(u), u> 2. We recall some results from the geometry of numbers and Minkowski’s theory of successive minima (see, for example, [9, Page 110]). A lattice in Rk is a discrete subgroup of the additive group Rk. For any lattice Λ there is a Minkowski-reduced k basis {v1,..., vr} of linearly independent vectors in R such that

Λ= v1Z + ··· + vrZ, and for any x1,...,xr ∈ R we have r kx1v1 + ··· + xrvrk2 ≍ kxivik2, i=1 X and with kv1k2 ···kvrk2 ≍ det(Λ), where these implied constants depend only on the ambient dimension k. Here det(Λ) is the r-dimensional volume of the funda- mental parallelepiped, given by r xivi : x1,...,xr ∈ [0, 1] . i=1 nX o We say r is the rank of the lattice. We see the properties of the Minkowski-reduced basis above indicate that each generating vector vi has a positive proportion of its length in a direction orthogonal to all the other basis vectors.

6. Sieve Decomposition and proof of Theorem 1.1

First, we prove Theorem 1.1 assuming two key propositions, given below. This reduces the problem to establishing Proposition 6.1 and Proposition 6.2 which we do over the remaining sections. As remarked in Section 2, it suffices to consider X as a power of 10. If X = 10k we will think of all elements of A as having k digits, none of which is equal to a0. This is equivalent to slightly changing the definition of A in the case when a0 =0 (since it restricts A to (X/10,X]), but by considering X, X/10, X/100, ... we see that we can easily recover Theorem 1.1 for the original set A from this situation. We will make a decomposition of #{p ∈ A} into various terms following Harman’s sieve (see [15] for more details). Each of these terms can then be asymptotically estimated by Proposition 6.1 or Proposition 6.2 (given below), or can be trivially bounded below by 0. To keep track of the terms in this decomposition we apply the same decomposition to the set B = {0 ≤ n

Let wn be weights supported on non-negative integers n

(We recall that 1A is the indicator function of A, and κA is the constant given by (1.1).) For a set C we define

Cd = {c : cd ∈ C}, S(C,z) = #{c ∈C : p|c ⇒ p>z}. Given an integer d> 0 and a real number z > 0, let κ #A (6.2) S (z)= w = S(A ,z) − A S(B ,z). d nd d X d nz

We expect that Sd(z) is typically small for a wide range of d and z. The following two propositions show that this is the case for certain d, z.

Proposition 6.1 (Sieve asymptotic terms). Fix an integer ℓ ≥ 0. Let θ1 =9/25+2ǫ ℓ and θ2 = 17/40 − 2ǫ. Let L be a set of O(1) affine linear functions L : R → R. Then we have ∗ #A S (Xθ2−θ1 )= o , p1···pℓ L log X θ2−θ1 X ≤p1≤···≤pℓ   X 1−θ1 p1···pℓ≤X where ∗ indicates the summation is restricted by the conditions log p log p P L 1 ,..., ℓ ≥ 0 log X log X   for all L ∈ L.

Proposition 6.1 includes the case ℓ = 0, where we interpret the statement as #A (6.3) S (Xθ2−θ1 )= o . 1 log X   Proposition 6.2 (Type II terms). Fix an integer ℓ ≥ 1. Let θ1,θ2, L be as in Proposition 6.1, and let I⊆{1,...,ℓ} and j ∈{1,...,ℓ}. Then we have ∗ #A S (p )= o , p1···pℓ j L log X θ2−θ1 X ≤p1≤···≤pℓ   θ1 X θ2 X ≤Qi∈I pi≤X p1···pℓ≤X/pj and ∗ #A S (p )= o , p1···pℓ j L log X θ2−θ1 X ≤p1≤···≤pℓ   1−θ2 X 1−θ1 X ≤Qi∈I pj ≤X p1···pℓ≤X/pj where ∗ indicates the same restriction of summation to L ≥ 0 for all L ∈ L as in Proposition 6.1. P 10 JAMES MAYNARD

We note that by inclusion-exclusion the same result holds if some of the inequalities L ≥ 0 are replaced by the strict inequality L> 0.

Proof of Theorem 1.1 assuming Proposition 6.1 and Proposition 6.2. Let θ1 =9/25+ 2ǫ and θ2 = 17/40 − 2ǫ as in Proposition 6.1. We first consider the upper bound for Theorem 1.1, which is essentially a standard sieve upper bound. Since θ2 − θ1 < 1/2, we have

#{p ∈ A} = S(A,X1/2)+ O(X1/2) ≤ S(A,Xθ2−θ1 )+ O(X1/2).

Thus, using (6.3) and the fact (5.2) that there are O(X/ log X) integers in [0,X] with no prime factors smaller than Xθ2−θ1 , we have

#{p ∈A}≤ S(A,Xθ2−θ1 )+ O(X1/2) #A = κ S(B,Xθ2−θ1 )+ S (Xθ2−θ1 )+ O(X1/2) A X 1 #A #A = κ #{nXθ2−θ1 } + o A X log X #A   ≪ . log X Thus it suffices to establish the lower bound.

To simplify notation, we let z1 ≤ z2 ≤ z3 ≤ z4 ≤ z5 ≤ z6 be given by

θ2−θ1 θ1 θ2 z1 = X , z2 = X , z3 = X ,

1/2 1−θ2 1−θ1 z4 = X , z5 = X , z6 = X .

We have κ #A #{p ∈ A} = #{p ∈ A : p>X1/2} + O(X1/2)= S (z )+(1+ o(1)) A . 1 4 log X

Thus we wish to bound S1(z4) from below. By Buchstab’s identity (i.e. inclusion- exclusion on the least prime factor) we have

S1(z4)= S1(z1) − Sp(p). z1

The term S1(z1) is o(#A/ log X) by (6.3) from Proposition 6.1. We split the sum over p into ranges (zi,zi+1], and see that all the terms with p ∈ (z2,z3] are also negligible by Proposition 6.2. This gives #A S (z )= − S (p) − S (p)+ o . 1 4 p p log X z1X then there are additional terms in Sp((X/p) ) from primes in the interval ((X/p)1/2,p]. For δ = 1/(log X)1/2, by the prime PRIMES WITH RESTRICTED DIGITS 11 number theorem and Proposition 6.1, we have

1/2 0 ≤ S(Ap, min(p, (X/p) )) − S(Ap,p) 1/2 p

z1

Spq(q)= Spq(q)+ Spq(q)

z1q z6

(6.8) Spq(q)= Spq(q)+ Spq(q).

z1

For the second term of (6.8) when q2p is large, we first separate the contribu- tion from products of three primes. By an essentially identical argument to when 1/2 we replaced Sp(p) by Sp(min(p, (X/p) )) in (6.4), we may replace Spq(q) by 1/2 Spq(min(q, (X/pq) )) at the cost of a negligible error term (since pq < z6). By Buchstab’s identity we have (with r restricted to being prime)

1/2 1/2 Spq(min(q, (X/pq) )) = Spq((X/pq) )+ Spqr(r). z1

6.2. For the terms with qr∈ / [z2,z3] we just take the trivial lower bound. Thus

Spqr(r)= Spqr(r)+ Spqr(r)

z1z3 #A + o log X  κ #A 1 − u − v − w dudvdw (6.10) ≥−(1 + o(1)) A ω log X w uvw2 (u,v,wZZZ)∈R1   κ #A 1 − u − v − w dudvdw (6.11) − (1 + o(1)) A ω , log X w uvw2 (u,v,wZZZ)∈R2   where R1 and R2 are given by

R1 = (u,v,w): θ2 − θ1 θ2 . o Together (6.9), (6.10) and (6.11) give a suitable lower bound for the terms in (6.8) 2 with q p ≥ z6.

2 When q p

#A S (q)= S (min(q, (X/pq)1/2)) + o pq pq log X z1

#A = o − S (z )+ S (s) log X pqr 1 pqrs z1

Spqrs(s)

z1

R3 = (u,v,w,t): θ2 − θ1

n u + v + w +2t< 1, θ2

{u + v,u + w,u + t, v + w, v + t, w + t} ∩ [θ1,θ2]= ∅ . o This completes our decomposition of the terms from (6.8), coming from the second term of (6.6). We note that we could have imposed various further restrictions such as u + v + w∈ / [θ1,θ2] in R3, but for ease of calculation we do not include these. We perform decompositions to the third term of (6.6) in a similar way to how we 2 3/2 3/2 dealt with the second term. We have q p < (qp) < z2 < z6 so, as above, we can apply two Buchstab iterations and use Proposition 6.1 to evaluate the terms 2 Spqr(z1) since we have pqr ≤ pq

Spq(q)= Spq(z1) − Spqr(r) z1

κ #A 1 − u − v − w − t dudvdwdt (6.13) ≥−(1 + o(1)) A ω , log X t uvwt2 (u,v,w,tZZZZ)∈R4   where

R4 = (u,v,w,t): θ2 − θ1

n u + v + w + t∈ / [θ1,θ2] ∪ [1 − θ2, 1 − θ1],

{u + v + w,u + v + t,u + w + t, v + w + t} ∩ [θ1,θ2]= ∅ . o We note that for R4 we have dropped different constraints to those we dropped in R3. Together (6.7), (6.9), (6.10), (6.11), (6.12) and (6.13) give our lower bound for all the terms occurring in (6.6), and so gives a lower bound for first term from (6.5) which covers all terms with p ≤ z2. We are left to consider the second term from (6.5), which is the remaining terms with p ∈ (z3,z4]. We treat these in a similar manner to those with p ≤ z2. We first split the sum according to the size of qp. Terms with qp ∈ [z5,z6] are negligible by Proposition 6.2, so we are left to consider qp ∈ (z3,z5) or qp > z6. We then split 2 the terms with qp ∈ (z3,z5) according to the size of q p compared with z6. This gives #A S (q)= S + S + S + o , pq 1 2 3 log X z3

S1 = Spq(q)

z3

S2 = Spq(q)

z3

S3 = Spq(q).

z3

We apply two further Buchstab iterations to S3 (we can handle the intermediate 2 terms using Proposition 6.1 as before since q p

R5 = (u,v,w,t): θ2 − θ1

n u + v +2w< 1, u + v + w +2t< 1, θ2

{u + v,u + w,u + t, v + w, v + t, w + t} ∈/ [θ1,θ2] . Together (6.14), (6.15), (6.16) give our lower bound for the second termo from (6.5), which is all the terms with p ∈ [z3,z4]. This completes our lower bound for S1(z4).

Let I1,...,I9 denote the integrals in (6.7), (6.9), (6.10), (6.11), (6.12), (6.13), (6.14), (6.15) and (6.16) respectively. Putting everything together, we obtain κ #A #{p ∈ A} = (1+ o(1)) A + S (z ) log X 1 4 κ #A ≥ (1 + o(1)) A (1 − I − I − I − I − I − I − I − I − I ). log X 1 2 3 4 5 6 7 8 9 In particular, we have κ #A (6.17) #{p ∈A}≥ (1 + o(1)) A 1000 log X PRIMES WITH RESTRICTED DIGITS 17

1 provided that I1 + ···+ I9 ≤ 0.999. Numerical integration then gives the following bounds on I1,...,I9 in the case when θ1 and θ2 in the definition of I1,...,I9 are replaced by 9/25 and 17/40 respectively.

I1 ≤ 0.02895, I2 ≤ 0.35718,

I3 ≤ 0.01402, I4 ≤ 0.04238,

I5 ≤ 0.05547, I6 ≤ 0.06622,

I7 ≤ 0.21879, I8 ≤ 0.20339,

I9 ≤ 0.00924.

Thus in this case we have I1 + ··· + I9 < 0.996, and so by continuity we have I1 + ··· + I9 < 0.996+ O(ǫ) when θ1 =9/25+2ǫ and θ2 = 17/40 − 2ǫ. Thus, taking ǫ suitably small, we see that (6.17) holds, and so we have completed the proof of Theorem 1.1 for X sufficiently large. If X ≥ 4 is bounded by a constant, then Theorem 1.1 follows (after potentially adjusting the implied constants) on noting that either 2 or 3 is a prime in A and so Theorem 1.1 also holds for bounded X ≥ 4. 

We note that there are various ways in which one can improve the numerical esti- mates, but we have restricted ourselves to the above decomposition in the interests of clarity. Judiciously employing further Buchstab decompositions would give small numerical improvements, for example. Thus it suffices to establish Propositions 6.1 and 6.2.

7. Sieve Asymptotics

In this section we prove Proposition 6.1 and Proposition 6.2 assuming Proposition 7.1 and Proposition 7.2, given below. This reduces the problem to proving standard ‘Type I’ and ‘Type II’ estimates. These propositions will then be proven in Sections 8 and 9. Before we state the propositions, we set up some extra notation. Let ℓ Qℓ(η)= {(x1,...,xℓ) ∈ R : η ≤ x1 ≤···≤ xℓ, x1 + ··· + xℓ =1}. By a closed convex polytope in Rℓ we mean a region R defined by a finite number of non-strict affine linear inequalities in the coordinates (equivalently, this is the convex hull of a finite set of points in Rℓ). Given a closed convex polytope R ⊆ Qℓ(η), we let

log p1 log pℓ 1, if a = p1 ··· pℓ for some p1,...,pℓ with log a ,..., log a ∈ R, 1R(a)= (0, otherwise.  

We caution that 1R counts numbers with a particular type of prime factorization, and should not be confused with 1A, the indicator function of the set A. We recall B = {n ∈ Z : 0 ≤ n

1A Mathematica R file detailing this computation is included with this article on arxiv.org. 18 JAMES MAYNARD

Proposition 7.1 (Type I estimate). Let A > 0 and Q ≤ X50/77(log X)−2A−2. Then we have #A #A #{a ∈ A : q|a, (a, 10) = 1}− κ ≪ , q A (log X)A q 0, and let ℓ ≤ 2η . Let R⊆Qℓ(η) be a closed convex polytope in Rℓ which has the property that 9 17 e ∈ R ⇒ e ∈ + ǫ, − ǫ i 25 40 Xi∈I h i for some set I⊆{1,...,ℓ}. Then we have #A #A 1 (a)= κ 1 (n)+ O , R A #B R R,η log X log log X aX∈A n

Proposition 6.2 follows quickly from Proposition 7.2, but it will be convenient to establish a slightly more general version where the primes can be as small as Xη. Lemma 7.3 (Type II terms, alternative formulation). Fix an integer ℓ ≥ 1 and a quantity η > 0. Let θ1 = 9/25+2ǫ, θ2 = 17/40 − 2ǫ, and L be as in Proposition 6.2, and let I⊆{1,...,ℓ} and j ∈{1,...,ℓ}. Then we have ∗ #A Sp1···pℓ (pj )= oL,η , η log X X ≤p1≤···≤pℓ θ1 X θ2   X ≤Qi∈I pi≤X p1···pℓ≤X/pj and ∗ #A Sp1···pℓ (pj )= oL,η , η log X X ≤p1≤···≤pℓ 1−θ2 X 1−θ1   X ≤Qi∈I pj ≤X p1···pℓ≤X/pj where ∗ indicates the same restriction of summation to L ≥ 0 for all L ∈ L as in Proposition 6.2. P As before, we note that by inclusion-exclusion the same result holds if some of the constraints L ≥ 0 are replaced with L > 0. We see Proposition 6.2 follows immediately from Lemma 7.3 on choosing η = θ2 − θ1.

Proof of Lemma 7.3 assuming Proposition 7.2. We just deal with the case when θ1 θ2 i∈I pi ∈ [X ,X ]; the other case is entirely analogous with θ1 and θ2 simply replaced with 1 − θ2 and 1 − θ1 throughout. (Notice that if e ∈R⊆Qℓ(η) satisfies Q PRIMES WITH RESTRICTED DIGITS 19

i∈I ei ∈ [23/40 + ǫ, 16/25 − ǫ], then i/∈I ei ∈ [9/25 + ǫ, 17/40 − ǫ]. Thus the interval [9/25 + ǫ, 17/40 − ǫ] in Proposition 7.2 can be replaced by the interval P[23/40+ ǫ, 16/25 − ǫ], and so PropositionP 7.2 applies similarly in both cases.)

Recall the definition (6.2) of Sd(z). We see that Sp1···pℓ (pj ) is a sum of wn only involving integers n with at most 1/η prime factors, since all prime factors are of size at least Xη. The terms with exactly r prime factors (for some r ≤ 1/η) are a sum of wp1···pr over p1,...,pr with the summation only restricted by a bounded number of linear inequalities on log p1/ log X,..., log pr/ log X. (These are the previous restrictions on p1,...,pℓ, and the restriction pj ≤ pℓ+1 ≤ ··· ≤ pr). We may η ℓ write the condition X ≤ p1 and the restriction on the size of i∈I pi and i=1 pi as linear conditions only involving log p1/ log X,..., log pℓ/ log X with coefficients having constants depending only on η. Thus, after increasingQL to includeQ these conditions, it suffices to show that

∗ #A (7.1) w = o , p1···pr L,η log X p1≤···≤pℓ   pj ≤pℓ+1X≤···≤pr where ∗ indicates that the summation is restricted by the conditions log p log p (7.2) P L 1 ,..., ℓ ≥ 0 log X log X   for all L ∈ L.

Let δ =1/ log log X. We first trivially discard the contribution from n = p1 ··· pr < 1−δ X . Each n appears Oη(1) times in (7.1), so recalling the definition (6.1) of wn and dropping the other constraints, the total contribution from such terms is #A #A #A (7.3) ≪ 1+ 1 ≪ #A1−δ + = o . η #B Xδ η log X n∈A n

1−η Since we have the constraint p1 ··· pℓ ≤ X/pj ≤ X , the result follows immedi- ately if r = ℓ (if η<δ the result is trivial). Thus we may assume that r>ℓ, so none of the constraints involve all the pi. We now wish to replace log pi/ log X with 1−δ log pi/ log n in the conditions (7.2). For n ∈ [X ,X], we have log p log p log p i ≤ i ≤ (1+2δ) i , log X log n log X

log p1 log pℓ log p1 log pℓ and so if exactly one of L log X ,..., log X and L log n ,..., log n is non-negative, we must have     log p log p (7.5) L 1 ,..., ℓ ≪ δ. log n log n L  

20 JAMES MAYNARD

To bound the contribution of such terms, let γ > 0 be a parameter and #A G(γ,L) := 1A(p1 ··· pr)+ 1B(p1 ··· pr) . η #B n ≤p1,..., pr   log pX1 log pℓ −γ≤L( log n ,..., log n )≤γ θ1 θ2+ǫ n ≤Qi∈I pi≤n

(Here the summation is over all choices of primes p1,...,pr, and for any such choice 1−δ n = p1 ··· pr. We do not restrict to n ≥ X in the summation.) We wish to show that if γ = oL,η(1) then G(γ,L) = oL,η(#A/ log X), and we will do this by first thinking of γ fixed but very small.

We split the sum into at most r!= Oη(1) subsums where the variables are ordered (we potentially double-count the contribution from pi = pi′ for an upper bound). Thus, after relabelling the pi, we see that #A G(γ,L) ≪ sup 1A(p1 ··· pr)+ 1B(p1 ··· pr) i1,...,iℓ∈{1,...,r} η #B n ≤p1≤···≤pr distinct log p X log p   i1 iℓ −γ≤L( log n ,..., log n )≤γ θ1 θ2+ǫ n ≤Qi∈I′ pi≤n ′ for some set I ⊆{1,...,r}. Let R = R(γ,L,η) ⊆Qr(η) be given by

R = (x1,...,xr) ∈Qr(η): −γ ≤ L(xi1 ,...,xiℓ ) ≤ γ, xi ∈ [θ1,θ2 + ǫ] . ′ n iX∈I o Then R satisfies the conditions of Proposition 7.2, so

1A(p1 ··· pr)= 1R(n) η n ≤p1≤···≤pr n∈A log p X log p X i1 iℓ −γ≤L( log n ,..., log n )≤γ θ1 θ2+ǫ n ≤Qi∈I′ pi≤n #A #A = 1 (n)+ o . #B R R log X log log X n

We see from (7.5) that the error introduced to (7.4) by replacing log pi/ log X with log pi/ log n in the conditions (7.2) is O( L∈L G(γ,L)) for some γ ≪L δ = oL(1). By the above discussion, this is oL,η(#A/ log X), which is negligible. P After making this change, we may reintroduce the terms with n

∗ ∗∗ #A w = w + o , p1···pr p1···pr L,η log X p1≤···≤pℓ p1≤···≤pℓ   pj ≤pℓ+1X≤···≤pr pj ≤pℓ+1X≤···≤pr 1−δ p1···pr ≥X where ∗∗ indicates the sum is constrained to log p log p P L 1 ,..., ℓ ≥ 0 log n log n   θ1 θ2 for all L ∈ L. Moreover, since we had the constraint i∈I pi ∈ [X ,X ] in (7.2), this second sum includes the constraint p ∈ [nθ1 ,nθ2 ]. We now split the i∈I i Q summation into Oη(1) subsums where the pi are totally ordered. After relabelling the coordinates, Proposition 7.2 appliesQ to each of these sums, since the linear constraints L ≥ 0 for L ∈ L define a closed convex polytope (depending only on L), and the ordering of the variables ensures that this lies within Qr(η) (recall that the η η η constraint X ≤ p1 becomes n ≤ p1, so all primes are at least n ). The constraint θ1 θ2 i∈I pi ∈ [n ,n ] corresponds to the sum of a subset of the coordinates of all points in the polytope lying in [θ1,θ2]. Proposition 7.2 shows that the contribution Q from each such sum is oL,η(#A/ log X). Since there are Oη(1) such sums, the total contribution is oL,η(#A/ log X), giving the result. 

Our aim for the remainder of this section is to establish Proposition 6.1 using Proposition 7.1 and Proposition 7.2. We first establish an auxiliary lemma. Lemma 7.4 (Fundamental Lemma). For δ > 0 we have κ #A exp(−δ−2/3) #A S(A ,Xδ) − A S(B ,Xδ) ≪ #A + . d #B d log X (log X)100 50/77−ǫ d

The implied constant is independent of δ.

Proof of Lemma 7.4 assuming Proposition 7.1. If δ>ǫ4 then since S(C,Xt) is non- negative and decreasing in t for any set C, we have

#A 4 #A −κ S(B ,Xǫ ) ≤ S(A ,Xδ) − κ S(B ,Xδ) A #B d d A #B d 4 κ #A 4 #A 4 ≤ S(A ,Xǫ ) − A S(B ,Xǫ ) + κ S(B ,Xǫ ). d #B d A #B d   ǫ4 1−ǫ Since S(Bd,X ) ≪ X/(d log X) for d

#A 4 #A 4 #A S(A ,Xδ)−κ S(B ,Xδ) = S(A ,Xǫ )−κ S(B ,Xǫ ) +O . d A #B d d A #B d d log X  

22 JAMES MAYNARD

By the rough number estimate (5.2) again, we see that the sum of 1/d over dǫ follows from the result for δ = ǫ4, so we may assume without loss of generality that δ ≤ ǫ4. Let A′ = {a ∈ A : (a, 10) = 1}. ′ Then #A = κ#A, where κ is the constant given in Proposition 7.1. Let Rd(e) be defined by κ#A #{a ∈ A′ : e|a} = + R (e). d de d We put q = de and see from Proposition 7.1 that for any A > 0 the error terms Rd(e) satisfy κ#A R (e) ≪ #A′ − d q q 50/77−ǫ ǫ/2 50/77−ǫ/2 dXδ (e,10)=1 (q,10)=1 p|e⇒p≤Xδ #A (7.6) ≪ . A (log X)A By the fundamental lemma of sieve methods (see, for example, [14, Theorem 6.9]) we have −ǫ κ#A 1 S(A′ ,Xδ)= 1 − O exp 1 − + O R (e) . d 2δ d p d     p≤Xδ    eXδ p∤10

ǫ 1 1 #A ≪ exp − 1 − #A + . 2δ p d (log X)100 δ 50/77−ǫ   p≤YX   dXδ The product in the final bound is O(δ−1(log X)−1), and the inner sum over d is seen to be O(δ−1) by an Euler product upper bound. Finally, since we are assuming that δ ≤ ǫ4, we have that δ−2 exp(−ǫ/(2δ)) ≪ exp(−δ−2/3). Thus (7.7) κ#A 1 exp(−δ−2/3)#A #A S(A′ ,Xδ) − 1 − ≪ + . d d p log X (log X)100 50/77−ǫ δ dXδ p∤10

An identical argument works for the set B′ = {nXδ p∤10

PRIMES WITH RESTRICTED DIGITS 23

′ δ δ ′ δ We see that for (d, 10) = 1 we have S(Ad,X ) = S(Ad,X ), that S(Bd,X ) = δ ′ S(Bd,X ), and that #B = φ(10)#B/10. Thus, by the triangle inequality 10κ#A S(A ,Xδ) − S(B ,Xδ) d φ(10)#B d 50/77−ǫ d

κ#A 1 ≤ S(A′ ,Xδ) − 1 − d d p 50/77−ǫ δ dXδ p∤10

10κ#A #B′ 1 + S(B′ ,Xδ) − 1 − φ(10)#B d d p 50/77−ǫ δ dXδ p∤10

κ#A 1 10κ#A#B′ 1 + 1 − − 1 − . d p φ(10)d#B p 50/77−ǫ δ δ dXδ p∤10 p∤10

We bound the first summation by (7.7), the second summation by (7.8), and note ′ that since #B = φ(10)#B/10, the third summation is zero. Since κA = 10κ/φ(10), this gives

κ #A exp(−δ−2/3) #A S(A ,Xδ) − A S(B ,Xδ) ≪ #A + .  d #B d log X (log X)100 50/77−ǫ d

Using Lemma 7.4 we can now prove Proposition 6.1.

Proof of Proposition 6.1 assuming Lemma 7.3 and Lemma 7.4. Recall that θ1 = 9/25+ 2ǫ, θ2 = 17/40 − 2ǫ. Let θ := θ2 − θ1, and let δ ≥ 1/ log log X be a small quantity which we will eventually choose to tend to 0 in a suitable manner. In particular, δ will be small compared with ǫ.

θ1 We first consider the contribution from p1 ··· pℓ < X . Given a set C and an integer d, we let

δ T (C; d)= S(C ′ ′ ,X ), m p1···pm δ ′ ′ θ X

Um(C; d)= Tm(C; d) − Um+1(C; d) − Vm+1(C; d). 24 JAMES MAYNARD

δ θ1 We define T0(C; d)= S(C; X ) and V0(C; d) = 0. This gives for d ≤ X

θ m S(C,X )= T0(C; d) − V1(C; d) − U1(C; d)= (−1) (Tm(C; d)+ Vm(C; d)). mX≥0 −1 We apply the above decomposition to Ad. This gives an expression with O(δ ) terms since trivially Tm(Ad; d)= Um(Ad; d)= Vm(Ad; d) = 0 if m> 1/δ. Applying the same decomposition to Bd, taking the weighted difference, and summing over d = p1 ··· pℓ we obtain

′ κ #A ′ S(A ,Xθ) − A S(B ,Xθ) d X d p ,...,p p ,...,p 1X ℓ 1X ℓ ′ κ #A ≪ T (A ; d) − A T (B ; d) m d X m d 0≤m≪1/δ p1,...,pℓ X X ′ κA#A (7.9) + V (A ; d) − V (B ; d) . m d X m d 0≤m≪1/δ p1,...,pℓ  X X ′ Here indicates we are summing over all choices of p1,...,pℓ which appear in the summation in Proposition 6.1 with the additional condition that d = p1 ··· pℓ < Xθ1 . P

θ We note that p1,...,pℓ ≥ X , so d has O(1) prime factors and any integer e can ′ ′ ′ ′ be represented O(1) times as dp1 ··· pm for some primes pm ≤···≤ p1 and some choice of p1,...,pℓ defining d. Thus, expanding the definition of Tm, if δ ≤ ǫ we have

′ κ #A T (A ; d) − A T (B ; d) m d #B m d 0≤m≪1/δ p1,...,pℓ X X κA# A ≪ S(A ,Xδ) − S(B ,Xδ) e #B e eX

δ−1 exp(−δ−2/3)#A (7.10) ≪ . log X

Here we applied by Lemma 7.4 in the last line, using δ ≥ 1/ log log X.

We now consider the Vm terms. We expand the definition of Vm as a sum. We note ′ θ θ2−θ1 θ1 ′ ′ that pm ≤ X = X , so the summation is constrained by X ≤ dp1 ··· pm ≤ θ2 ′ ′ ′ X , which is our Type II constraint. We see that all terms have dp1 ··· pm ≤ X/pm, so we can insert this condition without changing the sum. We recall p1,...,pℓ are constrained only by some linear conditions on log p1/ log X,..., log pℓ/ log X. Thus we see that the sum is of the form considered in Lemma 7.3 with η = δ, since all the conditions in the summation can be written as linear constraints on log pi/ log X ′ for 1 ≤ i ≤ ℓ and log pj / log X for 1 ≤ j ≤ m. Thus, by Lemma 7.3, we have

′ κA#A #A (7.11) Vm(Ad; d) − Vm(Bd; d) = oδ,L . − #B log X m≪δ 1 p1,...,pℓ    X X

PRIMES WITH RESTRICTED DIGITS 25

Putting together (7.9), (7.10) and (7.11), we obtain ′ κ #A ′ #A S(A ,Xθ) − A S(B ,Xθ) ≪ exp(−δ−1/2)+ o (1) . d #B d δ,L log X p ,...,p p ,...,p 1X ℓ 1X ℓ   Letting δ → 0 sufficiently slowly then gives the result for d

The contribution from d with Xθ2 1 − 17/40+2ǫ = 1 − θ2, the terms corresponding to Tm can still be handled by Lemma 7.4.

Finally, the contribution from d with Xθ1 ≤ d ≤ Xθ2 or X1−θ2 ≤ d ≤ X1−θ1 can be bounded almost immediately by Lemma 7.3. One Buchstab iteration gives θ δ Sd(X )= Sd(X ) − Sdp(p). δ θ X

1−θ1 Together these cover the whole range p1 ··· pℓ ≤ X , giving the result. 

Thus, since Lemma 7.3 and Lemma 7.4 follow from Proposition 7.1 and Proposition 7.2, it suffices to establish Proposition 7.1 and Proposition 7.2.

8. Type I estimate

In this section we establish our ‘Type I’ estimate Proposition 7.1, assuming the more technical Lemmas 8.1 and 8.2, which we will establish later in Section 10. We recall that Proposition 7.1 describes the number of elements of A in arithmetic progressions to modulus up to X50/77−ǫ ≈ X0.65 on average. Our Type I estimate is based on suitable bounds on the Fourier Transform

SA(θ)= e(aθ) aX∈A of the set A. We recall our definition of the function FY from (3.1), which is a normalized version of SA. In particular, |SA(θ)| = #A· FX (θ). The two key lemmas which we use in this section are the following. Lemma 8.1 (Large sieve estimate). We have a Q2 F ≪ Q54/77 + . Y q Y 50/77 q≤Q 0

∞ 1/3 Lemma 8.2 (ℓ bound). Let q < Y be of the form q = q1q2 with (q1, 10) = 1 −2/3 and q1 > 1, and let |η| < Y /2. Then for any integer a coprime with q we have a log Y F + η ≪ exp −c Y q log q     for some absolute constant c> 0.

Proof of Proposition 7.1 assuming Lemma 8.1 and Lemma 8.2. By M¨obius inver- sion and using additive characters, we have for (q, 10) = 1 ′ #Aq = #{a ∈ A : q|a, (a, 10) = 1} = µ(d) a∈A d|(10,a) Xq|a X 1 ab = µ(d) e dq dq dX|10 aX∈A 0≤Xb1 (b ,q )=1

1 #A b′′ = #{a ∈ A : (a, 10) = 1} + O F . q q X d′q′ d′|10 q′|q 0≤b′′1 (b ,dXq )=1 We note that #{a ∈ A : (a, 10) = 1} = κ#A. Summing over q1 (b ,d q )=1

#A b′′ 1 ≪ ′ FX ′ ′ ′′ ′ q ′ ′′ ′ ′ d q ′′ ′ q 11 ′ ′ Here we recall our notation that q ∼ Q1 means q ∈ (Q1/10,Q1]. By Lemma 8.1 we have for any d|10

1 a 1 Q1 FX ≪ 23/77 + 50/77 , Q1 dq X q∼Q1 0≤a

4A+8 which gives the required bound if Q1 > (log X) on recalling that Q1 ≤ Q ≤ 50/77 −2A−2 4A+8 X (log X) . In the case Q1 ≤ (log X) we instead use Lemma 8.2, which gives

1 a a Q1 FX ≪ Q1 sup FX ≪A 100(A+1) . Q1 dq (a,q)=1 dq (log X) q∼Q1 a≤dq (q,X10)=1 (a,dqX)=1   11 (q,10)=1 d|10 A Thus we see that the bound (8.1) is OA(#A/(log X) ) in either case, as required. 

We are left to establish Proposition 7.2 and Lemma 8.1 and Lemma 8.2.

9. Type II estimate

In this section we reduce our ‘Type II’ estimate to various major arc and minor arc estimates. In particular, we will reduce the proof of Proposition 7.2 to the proof of Propositions 9.1, 9.2 and 9.3. We first recall the statement of Propositon 7.2 which allows us to count integers in A with a specific type of prime factorization provided such numbers always have a ‘conveniently sized’ factor. Proposition (Type II estimate Proposition 7.2 restated). Let η > 0, and let ℓ ≤ −1 ℓ 2η . Let R⊆Qℓ(η) be a closed convex polytope in R which has the property that 9 17 e ∈ R ⇒ e ∈ + ǫ, − ǫ i 25 40 Xi∈I h i for some set I⊆{1,...,ℓ}. Then we have #A #A 1 (a)= κ 1 (n)+ O , R A #B R R,η log X log log X aX∈A n

To avoid technical issues due to the fact that n

We note that in ΛR the conditions are on log pi/ log X, whereas in 1R the conditions are on log pi/ log n. If every e ∈ R has e1 ≤···≤ eℓ then at most one term occurs in the summation, so ΛR simplifies to

ℓ log p1 log pℓ i=1 log pi, if n = p1 ··· pℓ and ( log X ,..., log X ) ∈ R, ΛR(n)= (Q0, otherwise. 28 JAMES MAYNARD

We prove Proposition 7.2 by an application of the Hardy-Littlewood circle method, whereby we study the functions

SA(θ)= e(aθ), SR(θ)= ΛR(n)e(nθ). aX∈A n 0 and let ℓ ∈ Z satisfy 1 ≤ ℓ ≤ 2/η. Let −1 δ = (log log X) , and let RX = RX (a1,...,aℓ−1) be given by ℓ ℓ−1 η R = e ∈ Rℓ : e ∈ (a ,a +δ] for 1 ≤ i ≤ ℓ−1, e ≤ 1, e ≥ max , 1− a −ℓδ , X i i i i ℓ 4 i i=1 i=1 n X  X o ℓ−1 for some a1,...,aℓ−1 ∈ R satisfying mini ai ≥ η/2 and i=1 ai < 1 − η/2. Let M = M(C) be given by P a b (log X)C M = 0 ≤ a

1 a −a #A #A S S = κ Λ (n)+ O . X A X RX X A X RX C,η (log X)C 0≤a

Here κA is the constant given in Proposition 7.2. The implied constant depends on C and η, but not on RX or a1 ...,aℓ−1. Proposition 9.2 (Generic minor arcs). Fix η> 0 and let ℓ ∈ Z satisfy 1 ≤ ℓ ≤ 2/η. Let R⊆ Rℓ be a closed convex polytope. Let M = M(C) be as in Proposition 9.1. Then there is some exceptional set E ⊆ [0,X] with #E ≤ X23/40, such that 1 a −a #A S S ≪ . X A X R X η Xǫ a

Proposition 9.3 (Exceptional minor arcs). Let A> 0. Let η, ℓ, RX = RX (a1,...,aℓ−1) and M = M(C) be as given in Proposition 9.1. Let a1,...,aℓ−1 in the definition of RX satisfy i∈I ai ∈ [9/25 + ǫ/2, 17/40 − ǫ/2] ∪ [23/40+ ǫ/2, 16/25 − ǫ/2] for some I⊆{1,...,ℓ − 1}, and let C = C(A, η) in the definition of M be sufficiently large in termsP of A and η. Let E ⊆ [0,X] be any set such that #E ≤ X23/40. Then we have 1 a −a #A S S ≪ . X A X RX X η,A (log X)A a∈E a/X∈M     The implied constant depends on η and A, but not on RX or a1,...,aℓ−1.

We expect the contribution from the major arcs M to give the main contribution. Proposition 9.1 shows that we can get an asymptotic formula from frequencies in M. Proposition 9.2 shows that most frequencies contribute negligibly, and that PRIMES WITH RESTRICTED DIGITS 29 any significant contribution must come from some small exceptional set E. (In view of Proposition 9.1, we must have E contains elements of M and so E is non- empty). We would expect that we can take E = M, but cannot quite show this. However, Proposition 9.3 shows that E\M contributes negligibly to our sum, which is sufficient for our purposes.

Proof of Proposition 7.2 assuming Propositions 9.1, 9.2 and 9.3 and Lemma 7.4. Let δ = (log log X)−1. Clearly we may assume that δ is sufficiently small in terms of η, since otherwise the result is trivial. We note that ℓ ≥ 2, since the sum of co- ordinates of points in R is 1 but a non-trivial subset of them lies in [9/25, 17/40]. ℓ Given reals a1,...,aℓ−1 ≥ 0 and γ > 0 and a set S ∈ R , let

C(a; γ) := a1,a1 + γ ×···× aℓ−1,aℓ−1 + γ ,

 i  i ℓ ℓ−1 + ℓ C (a; γ) := e ∈ [η/4, 1] : (e1,...,eℓ−1) ∈C(a; γ), ei ≤ 1, eℓ ≥ 1 − ai − ℓδ , i=1 i=1 n X X o log p1 log pℓ 1, n = p1 ··· pℓ for some p1,...,pℓ with log X ,..., log X ∈ S, 1˜S (n) := (0, otherwise.  

We see that 1S and 1˜S differ in that the denominators of the fractions are log n and log X respectively. We cover [η, 1]ℓ−1 by O(δ−(ℓ−1)) disjoint hypercubes C(a,δ) of side length δ (for example, we can take all a ∈ {0, δ, 2δ, . . . , ⌈δ−1⌉δ}ℓ−1). Let R ⊆ [η, 1]ℓ−1 denote the projection of R onto the first ℓ − 1 coordinates (which is also a closed convex 2 polytope). We see that if n ∈ [X1−δ ,X] then log n and log X differ by a factor of at 2 most 1−δ . In particular, if log pj / log X ∈ [aj ,aj +δ] then certainly log pj / log n ∈ [aj ,aj +2δ]. This means that if C(a;2δ) ⊆ R and log pj/ log X ∈ [aj ,aj + δ] for 1−δ2 all j ≤ ℓ − 1, then 1R(p1 ··· pℓ) = 1 for all pℓ ∈ [X /p1 ··· pℓ−1,X/p1 ··· pℓ−1]. 2 Thus for n ∈ [X1−δ ,X]

0, if R∩C(a;2δ)= ∅, ˜ ˜ (9.2) 1R(n)1C+(a;δ)(n)= 1C+(a;δ)(n), if C(a;2δ) ⊆ R, O(1˜C+(a;δ)(n)), otherwise.  If C(a;2δ) ∩ R 6= ∅ but C(a;2δ)6⊆ R then C(a;2δ) intersects the boundary ∂R of R.

η Since 1R(n) is supported on n with ℓ prime factors all at least n , if n = p1 ··· pℓ ≥ 1−δ2 X and 1R(n) = 1 then there is an a with ai ≥ η/2 such that 1˜C(a;δ)(p1 ··· pℓ−1)= 2 2 ℓ−1 1−δ 1−δ 1−P ai−ℓδ 1. Moreover, since n ≥ X we have pℓ ≥ X /p1 ··· pℓ−1 ≥ X i=1 , so in fact 1˜C+(a;δ)(n) = 1. Since the cubes are disjoint, this happens for exactly one 2 choice of a. Therefore we have for any n ∈ [X1−δ ,X]

1R(n)= 1˜C+(a;δ)(n)1R(n). a X 30 JAMES MAYNARD

Using this with (9.2) to split the summation over hypercubes C, we find

κ #A 1 (m) − A 1 (n) R X R m∈A X1−δ2

κA#A ≤ 1˜C+(a;δ)(m) − 1˜C+(a;δ)(n) a X m∈A X1−δ2

κA#A + O 1˜C+(a;δ)(m)+ 1˜C+(a;δ)(n) . a X m∈A X1−δ2

2 2 Re-inserting terms with m ≤ X1−δ and n ≤ X1−δ , we obtain

κA#A κA#A 1 (m) − 1 (n) ≤ 1˜ + a (m) − 1˜ + a (n) R X R C ( ;δ) X C ( ;δ) m∈A n

κA#A + O 1˜ + a (m)+ 1˜ + a (n) C ( ;δ) X C ( ;δ) a m∈A n

The final two terms above satisfy

#A 2 #A δ#A (9.4) 1+ κ 1 ≪ #A1−δ + ≪ . A X Xδ2 log X m∈A 1−δ2 2 n≤X m≤XX1−δ X

We now consider the contribution to (9.3) from C(a;2δ)∩∂R= 6 ∅. Since R⊆ [η, 1]ℓ, we must have ai ≥ η/2 and since the coordinates of points in R sum to 1 we ℓ−1 ˜ also have i=1 ai ≤ 1 − η/2. Since 1C+(a;δ)(n) and ΛC+(a;δ)(n) have the same support, which is restricted to integers with no factor less than Xη/4, we have P −ℓ 1˜C+(a;δ)(n) ≪η (log X) ΛC+(a;δ)(n). Thus we have

κA#A 1˜ + a (m)+ 1˜ + a (n) C ( ;δ) X C ( ;δ) mX∈A n

Here we used the triangle inequality in the final line. By the prime number theorem, for any choice of a ∈ [0, 2]ℓ−1 we have ℓ−1

ΛC+(a;δ)(n) ≤ log pi log pℓ p1,...,p −1 n

ℓ−1 −(ℓ−2) Since R is a closed convex polytope, so is R⊆ R . Therefore there are OR(δ ) hypercubes C(a;2δ) which intersect ∂R. Thus the contribution to (9.3) from the final term of (9.5) is #A δℓ−1#A ≪ Λ + a (n) ≪ 1 X(log X)ℓ C ( ;δ) log X a n

We now consider the terms with C(a;2δ) ⊆ R. Since R⊆Qℓ(η), if e ∈ R then ′ ′ ′ e1 ≤···≤ eℓ, so if e ∈ R then e1 ≤···≤ eℓ−1. Therefore, since C(a;2δ) ⊆ R,

(9.7) aj + δ

(1 + Oη(δ))ΛC+(a;δ)(n) 1˜C+(a;δ)(n)= ℓ−1 ℓ−1 ℓ (1 − i=1 ai)( i=1 ai)(log X) Λ + a (n) P C (Q;δ) (9.9) = + Oη(δ1˜C+(a;δ)(n)). ℓ−1 ℓ−1 ℓ (1 − i=1 ai)( i=1 ai)(log X) Thus P Q κA#A 1˜ + a (m) − 1˜ + a (n) C ( ;δ) X C ( ;δ) a m∈A n

1 κA#A ≪ Λ + a (m) − Λ + a (n) η (log X)ℓ C ( ;δ) X C ( ;δ) a m∈A n

κA#A (9.10) + δ 1˜ + a (m)+ 1˜ + a (n) . C ( ;δ) X C ( ;δ) a m∈A n

Since any n = p1 ··· pℓ contributing to the second term above is counted at most once and has all prime factors at least Xη/4, we have

κA#A η/4 #A η/4 δ 1˜ + a (m)+ 1˜ + a (n) ≪ δS(A,X )+ δ S(B,X ) C ( ;δ) X C ( ;δ) X a m∈A n

Thus to establish Proposition 7.2 it is sufficient to show that for any A > 0, we have #A #A (9.12) Λ + a (m)= Λ + a (n)+ O , C ( ;δ) X C ( ;δ) A,η (log X)A mX∈A n

Since i∈I ei ∈ [9/25+ǫ, 17/40−ǫ] if e ∈ R, by taking J = I or J = {1,...,ℓ}\I, we must have that aj ∈ [9/25+ǫ/2, 17/40−ǫ/2]∪[23/40+ǫ/2, 16/25−ǫ/2] for P i∈J some J ⊆{1,...,ℓ − 1} for any a such that C(a;2δ) ∩ R 6= ∅. Since R⊆ [η, 1]ℓ, we P ℓ−1 have mini ai ≥ η/2 and i=1 ai < 1 − η/2 if C(a;2δ) ∩ R= 6 ∅. Thus all hypercubes under consideration satisfy the assumptions on RX of Propositions 9.1-9.3. P By Fourier expansion we have 1 b −b Λ + a (m)= S S + a . C ( ;δ) X A X C ( ;δ) X mX∈A 0≤Xb

Since Lemma 7.4 follows from Proposition 7.1, which in turn follows from Lem- mas 8.1 and 8.2, we are left to establish Lemma 8.1, Lemma 8.2, Proposition 9.1, Proposition 9.2 and Proposition 9.3. PRIMES WITH RESTRICTED DIGITS 33

10. Fourier Estimates

In this section we collect various distributional bounds on the Fourier transform

SA(θ)= e(aθ), aX∈A which will underpin our later analysis. In particular, we establish Lemma 8.1 and Lemma 8.2, as well as several other related estimates. Specifically, Lemma 8.1 is a special case of Lemma 10.5, and Lemma 8.2 is the same as Lemma 10.1.

We recall our normalized version of SA(θ) from (3.1)

− log 9/ log 10 FY (θ)= Y 1A1 (n)e(nθ) . n

(10.1) FY (θ) ≤ 1 for all θ and Y . The key property of FY which we exploit is that it has an excep- k k−1 i tionally nice product form. If Y = 10 , then letting n = i=0 ni10 have decimal digits nk−1,...,n0, we find P k−1 1 F (θ)= e n 10iθ Y 9k i i=0 n0,...,nk−1∈{X0,...,9}\{a0} X  k−1 1 = e(n 10iθ) 9 i i=0 ni∈{0,...,9}\{a0} Y X k 1 e(10iθ) − 1 (10.2) = − e(a 10i−1θ) . 9 e(10i−1θ) − 1 0 i=1 Y We note that FY is periodic modulo 1, and that the above product formula gives the identity

(10.3) FUV (θ)= FU (θ)FV (Uθ). (We recall that we assume that U and V are both powers of 10 in such a statement.) Lemma 10.1 (ℓ∞ bound, Lemma 8.2 restated). Let q < Y 1/3 be of the form −2/3 q = q1q2 with (q1, 10) = 1 and q1 > 1, and let |η| < Y /2. Then for any integer a coprime with q we have a log Y F + η ≪ exp −c Y q log q     for some absolute constant c> 0.

Proof. From the bounds coming from truncated Taylor expansions, we have that |e(nθ)+ e((n + 1)θ)|2 = 2+2cos(2πkθk) ≤ 4 − 4π2kθk2 +4π4kθk4/3 ≤ 4 − 4kθk2 ≤ 4 exp(−kθk2). 34 JAMES MAYNARD

We recall that k·k denotes the distance to the nearest integer. This implies that kθk2 e(n θ) ≤ 7+2exp(−kθk2/2) ≤ 9exp − . i 20 ni∈{0,...,9}\{a0}   X For the final inequality we used the convexity of exp(−x2). We substitute this k bound into our expression (10.2) for FY , which gives for Y = 10 k−1 1 F (t)= e(n 10it) Y 9 i i=0 ni∈{0,...,9}\{a0} Y X 1 k−1 ≤ exp − k10itk2 . 20 i=0  X  i If t = a/q1q2 with q1 > 1, (q1, 10)= 1 and (a, q1) = 1, then k10 tk≥ 1/q1q2 for all −2/3 i. Similarly, if t = a/q1q2 + η with a, q1, q2 as above, with |η| < Y /2 and with 1/3 i i q = q1q2 < Y then for i ≤ k/3 we have k10 tk≥ 1/q − 10 |η|≥ 1/2q. However, if k10itk < 1/20 then k10i+1tk = 10k10itk. Thus, for any interval I ⊆ [0,k/3] of length log q/ log10, there must be some integer i ∈ I such that k10i(a/q + η)k > 1/200. This implies that k a 2 1 log Y 10i + η ≥ . q 105 3 log q i=0   j k X Substituting this into the bound for F , and recalling we assume q < Y 1/3 gives the result. 

Lemma 10.2 (Markov moment bound). Let J be a positive integer. Let λt,J be J J the largest eigenvalue of the 10 × 10 matrix Mt, given by t J ℓ−1 J ℓ−1 G(a1,...,aJ+1) , if i − 1= ℓ=1 aℓ+110 , j − 1= ℓ=1 aℓ10 (M ) = for some a ,...,a ∈{0,... 9}, t i,j  P1 J+1 P 0, otherwise, where  J −j J 1 e( j=0 tj 10 + 10γ) − 1 a0tj G(t0,...,tJ ) = sup − e + a0γ . − − J −j−1 j+1 |γ|≤10 J 1 9 e( tj 10 + γ) − 1 10 Pj=0 j=0  X Then we have that P t a k F k ≪ λ . 10 10k J,t t,J k 0≤Xa<10  

Proof. We recall the product formula (10.3) with Y = 10k k 1 e(10iθ) − 1 F (θ)= − e(a 10i−1θ) , Y 9 e(10i−1θ) − 1 0 i=1 Y where we interpret the term in parentheses as 9 if k10i−1θk = 0. Writing θ = k −i th i=1 ti10 for ti ∈ {0,..., 9}, we see that the (k − j) term in the product depends only on tk−j ,...,tk. Moreover, the value of the term is mainly dependent Pon the first few of these digits by continuity. Thus we may approximate the absolute PRIMES WITH RESTRICTED DIGITS 35

th value of FY (θ) by a product where the j term depends only on tj ,...,tj+J for some constant J. Explicitly, we have

k k J ti+j J t 1 e( j + 10γ) − 1 t F i ≤ sup j=0 10 − e a i+j + a γ Y i J t + 0 j+1 0 −J−1 i j 10 |γ|≤10 9 e( +1 + γ) − 1 10 i=1  i=1 P j=0 10j  j=0  X Y X k P = G(ti,...,ti+J ), i=1 Y where we put tj = 0 for j > k. With this formulation we can interpret the above bound in terms of the probability of a walk on {0,..., 9, ∞}k. Let t ∈ R be given. Consider an order-J Markov chain X1,X2,... where for a,a1,...,an ∈{0,..., 9} we have for n>J t P(Xn = a|Xn−i = ai for 1 ≤ i ≤ J)= cG(a,a1,a2,...,aJ ) for some suitably small constant c (so that the probability that Xn ∈{0,..., 9} is less than 1). To make this a genuine Markov chain we choose the probability that Xn = ∞ given Xn−1,...,Xn−J to be such that the probabilities add up to 1, and if Xn = ∞ then we have that Xn+1 = ∞ with probability 1. Then we have that k a t F i ≤ c−kP(X = a for J

J J Explicitly, let Mt be the 10 × 10 matrix given by t J ℓ−1 J ℓ−1 G(a1,...,aJ+1) , if i − 1= ℓ=1 aℓ+110 , j − 1= ℓ=1 aℓ10 (M ) = for some a ,...,a ∈{0,..., 9}, t i,j  P1 J+1 P 0, otherwise, and let λt,Jbe the absolute value of the largest eigenvalue of Mt. Since G(t1,...,tJ+1) > 0 for all t1,...,tJ+1, we have that Mt is irreducible, and so each eigenspace corre- sponding to an eigenvalue of modulus λt,J has dimension 1 by the Perron-Frobenius th Theorem. Let (Mt)i,j = mi,j . By expanding out the k power, we have

k (Mt )i,j = mi,i1 mi1,i2 ··· mik−1,j . J i1,...,ik−1∈{X0,...,10 −1} We recall that mi,j = 0 unless there is a1,...,aJ+1 ∈{0,..., 9} such that J−1 i − 1= a2 + 10a3 + ··· + 10 aJ+1, J−1 j − 1= a1 + 10a2 + ··· + 10 aJ . 36 JAMES MAYNARD

Thus the product mi,i1 mi1,i2 ··· mik−1,j is non-zero only if there are a1,...,ak+J ∈ {0,..., 9} such that

J−1 j − 1= a1 + 10a2 + ··· + 10 aJ , J−1 ik−1 − 1= a2 + 10a3 + ··· + 10 aJ+1, . . J−1 i1 − 1= ak + 10ak+1 + ··· + 10 aJ+k−1, J−1 i − 1= ak+1 + 10ak=2 + ··· + 10 aJ+k. If this is the case then we have k t mi,i1 mi1,i2 ··· mik−1,j = G(ai,ai+1,...,ai+J ) . i=1 Y Thus, fixing i = 1 so that ak+1 = ··· = aJ+k = 0, and summing over j, we have that

10J −1 k (Mt )1,j = m1,i1 mi1,i2 ··· mik−1,j j=0 J X i1,...,ik−1,jX∈{0,...,10 −1} t t = G(a1,...,aJ+1) ··· G(ak,...,ak+J )

a1,...,ak ∈{0,...,9} ak+1=···X=ak+J =0

k 10 −1 a t ≥ F . Y 10k a=0 X   On the other hand, by the eigenvalue expansion of Mt, we have

10J −1 k k (Mt )1,j ≪t,J λt,J . j=0 X This gives the result. 

Lemma 10.3 (ℓ1 bound). We have for any k ∈ N

k 27k/77 G(ti,...,ti+4) ≪ 10 . t k i=1 ∈{0X,...,9} Y

In particular, we have for Y1 ≍ Y2 ≍ Y3

a 27/77 sup FY2 β + ≪ Y1 , β∈R Y3 a

Proof. This follows from Lemma 10.2 and a numerical bound on λ1,4. Specifically, by Lemma 10.2 taking J = 4 we find

k−4 k−4 k (10.4) G(ti,...,ti+J ) ≤ (M1 )1,j ≪ λ1,4. t k i=1 j ∈{0X,...,9} Y X A numerical calculation2 reveals that 27/77 (10.5) λ1,4 < 2.24190 < 10

k k 27/77 for all choices of a0 ∈ {0,..., 9}. Thus, letting Y = 10 we have λ1,4 < Y , which gives the first result.

For the second bound, let U1 = max(1, Y3/Y2). Since Y3 ≍ Y2, we have U1 ≪ 1. Any a < Y1 can be written as a = a1 + U1a2 + Y3a3 for some 0 ≤ a1 < U1 ≪ 1, 0 ≤ a2 < Y3/U1 = min(Y3, Y2) and 0 ≤ a3 < Y1/Y3 ≪ 1. Since there are O(1) choices of a1,a3 and these can be absorbed into the supremum over β, we see that it suffices to show a2 27/77 sup FY2 β + ≪ Y2 . β∈R Y2 a2

Since FY2 ≥ 0 we can extend the summation to a2 < Y2. Thus without loss of k generality we may assume that Y1 = Y2 = Y3 = Y = 10 . We see that

k t k−4 F i + η ≤ G(t ,...,t )+ O(10i−1η) Y 10i i i+4 i=1 i=1 X  Y  k−4 (10.6) =(1+ OJ (Y η)) G(ti,...,ti+4). i=1 Y Here we used the fact that G(ti,...,ti+4) is bounded away from 0 for all t1,...,tk ∈ {0,..., 9} since it is the maximal absolute value of a trigonometric polynomial over an interval. Since F is periodic modulo 1 we see that

k k ti ti sup FY i + β = sup FY i + η , β∈R 10 −1 10 t k i=1 η∈[0,Y ] t k i=1 ∈{0X,...,9} X  ∈{0X,...,9} X  and so the second bound of the lemma follows from (10.6), (10.4) and (10.5) on k i −1 letting a = i=1 ti/10 . For the final bound we integrate (10.6) over η ∈ [0, Y ] and sum over t1,...,tk ∈{0,..., 9}, giving P 1 Y −1 1/Y FY (t)dt = FY (a/Y + η)dη 0 a=0 0 Z X Z 1 k−4 ≪ G(t ,...,t ) Y i i+4 t k i=1 ∈{0X,...,9} Y 1 ≪ .  Y 50/77

2A Mathematica R file detailing this computation is included with this article on arxiv.org. 38 JAMES MAYNARD

Lemma 10.4 (235/154th moment bound). We have that a 1 # 0 ≤ a < Y : F ∼ ≪ B235/154Y 59/433. Y Y B n   o Here 235/154 ≈ 1.5 and 59/433 ≈ 0.14. We recall that n ∼ X means that X/10 < n ≤ X.

Proof. This follows from Lemma 10.2 and a numerical bound for λ235/154,4. Ex- plicitly, we take J = 4 and Y = 10k. By Lemma 10.2 we have a 1 a 235/154 # 0 ≤ a < Y : F ∼ ≤ B235/154 F Y Y B Y Y n   o 0≤Xa

a 27/77 q sup sup FY + β + η ≪ 1+ δq q + 50/77 , β∈R |η|<δ q Y Xa≤q      a Q2 sup sup F + β + η ≪ 1+ δQ2 Q54/77 + , Y q Y 50/77 β∈R q≤Q 0

Proof. For each a ≤ q, let |ηa| maximize FU (a/q + η) over |η| < δ. Since the fractions a/q are all separated from one another by at least 1/q, we have for any t a 1 1 # a ≤ q : η + ∈ t − ,t + ≪ 1+ qδ. a q 2q 2q n h io Thus, considering t = b/q − β, we see that a b (10.7) sup FU + β + η ≪ (1 + qδ) sup FU + η . |η|<δ q |η|≤1/2q q Xa≤q   Xb≤q   We have that t ′ FU (t)= FU (s)+ FU (v)dv. Zs Thus integrating over s ∈ [t − γ,t + γ] for some γ > 0, we have 1 t+γ t+γ F (t) ≪ F (s)ds + |F ′ (s)|ds. U γ U U Zt−γ Zt−γ 3A Mathematica R file detailing this computation is included with this article on arxiv.org. PRIMES WITH RESTRICTED DIGITS 39

This implies that 1 t+2γ t+2γ sup F (t + η) ≪ F (s)ds + |F ′ (s)|ds. U γ U U |η|≤γ Zt−2γ Zt−2γ Taking γ =1/2q, we obtain b/q+1/q b/q+1/q b ′ sup FU + η ≪ Q FU (s)ds + |FU (s)|ds |η|≤1/2q q b/q−1/q b/q−1/q Xb≤q   Xb≤q Z Z  1 1 ′ (10.8) ≪ q FU (t)dt + |FU (t)|dt. Z0 Z0 u u−1 i Writing U = 10 and n = i=0 ni10 , we see that 2π |FP′ (t)| = n1 (n)e(nt) . U 9u A n<10u X u−1 j−1 Writing n = j=0 nj10 and using the triangle inequality, we have u−1 2πP |F ′ (t)|≤ 10j n 1 (n )e(n 10jt) 1 (n )e(n 10it) U 9u j A j j A i i j=0 0≤nj <10 0≤i≤u−1 0≤ni<10 X X Yi6=j X 10u 1 i ≪ u sup A(ni)e(ni10 t) . 9 j≤u 0≤i≤u−1 0≤ni<10 Yi6=j X

We recall the function G from Lemma 10.2. Since G(t1,...,t1+J ) is bounded away from 0, we see that for η ≪ U −1 u u t F ′ i + η ≪ U G(t ,...,t )+ O(10iη) U 10i i i+J i=1  i=1  X Y u 2 ≪ (U + O(U η)) G(ti,...,ti+J ). i=1 Y Thus, integrating over η ∈ [0,U −1], taking J = 4, and using Lemma 10.3, we obtain 1 u ′ 27/77 (10.9) |FU (t)|dt ≪ G(ti,...,ti+4) ≪ U . 0 t u i=1 Z ∈{0X,...,9} Y By Lemma 10.3 we have 1 1 (10.10) F (t)dt ≪ . U U 50/77 Z0 Combining (10.10), (10.9), (10.8) and (10.7), we obtain

a 27/77 q sup FU + β + η ≪ 1+ δq U + 50/77 . |η|<δ q U Xa≤q      Combining this with the trivial bound

FY (t) ≤ FU (t) for U ≤ Y , and choosing U maximally subject to U ≤ q and U ≤ Y gives the first result of the lemma. 40 JAMES MAYNARD

The other bounds follow from entirely analogous arguments. In particular we note that for (a, q) = 1, q

Lemma 10.6 (Hybrid Bounds). Let E ≥ 1. Then we have

a qE F + η ≪ (qE)27/77 + , Y q Y 50/77 a≤q |η|≤E/Y   X (η+a/qX)Y ∈Z a Q2E 27/77 Q2E F + η ≪ + . Y q d dY 50/77 q

In the above lemma, we emphasize that a,q,d are all integers, bu the summation over η is over real numbers which are well-spaced from the condition Y (η+a/q) ∈ Z.

Proof. We first note that the summand a/q + η runs through fractions b/Y with |b| ≤ E + Y since we have the condition (η + a/q)Y ∈ Z. Each fraction b/Y is represented O(1 + min(qE/Y,q)) times, since if a1/q + η1 = a2/q + η2 then a2 = a1 + O(qE/Y ) and η2 is determined by a1,a2, η1. There are O(1 + E/Y ) choices of b giving the same fraction (mod 1), and since FY is periodic (mod 1) these all give the same value of FY (b/Y ). Thus we may consider only b < Y with each fraction b/Y occurring O((1 + E/Y ) min(qE/Y,q)) times. Thus we see that if 10qE ≥ Y then a qE E b F + η ≪ min , q 1+ F Y q Y Y Y Y a≤q |η|≤E/Y      0≤b

FY (θ)= FU (θ)FV (Uθ)FY/UV (UVθ).

We also have the trivial bound FV (Uθ) ≤ 1 of (10.1). For UV ≤ Y and |η| < E/Y these give a UVa a FY + η ≤ FY/UV + UV η sup FU + γ . q q |γ|≤E/Y q       We choose V and then U to be the largest powers of 10 such that V ≤ Y/qE and U ≤ Y/VE. Note that this choice gives U, V ≥ 1 since qE < Y/10 and q, E ≥ 1. PRIMES WITH RESTRICTED DIGITS 41

Thus a a UVa FY + η ≤ sup FU + γ FY/UV + UV η q |γ|≤E/Y q q a≤q |η|≤E/Y   a≤q   |η|≤E/Y   X (η+a/qX)Y ∈Z X (η+a/qX)Y ∈Z

≤ Σ1Σ2, where a Σ1 = sup FU + γ , |γ|≤E/Y q Xa≤q   Σ2 = sup FY/UV UVβ + UV η β∈R |η|≤E/Y   Y (ηX+β)∈Z

′ UVa ≤ sup FY/UV β + . β′∈R Y aX≤2E   Since we chose U and V maximally, we have V ≥ Y/10qE, so q/100 ≤ U ≤ 10q. Since qE

27/77 Σ1 ≪ q . Similarly, since Y/UV ≍ E, by Lemma 10.3 we have

27/77 Σ2 ≪ E . Putting this together gives the first result. The second bound follows from an entirely analogous argument. We first split the argument depending on whether Q2E/d ≥ Y/10 or not, and use the final bound of Lemma 10.5 instead of the first bound to handle Σ2. 

The argument giving the first bound of Lemma 10.6 is essentially sharp if the ℓ1 bounds used in the proof are sharp and if q is a divisor of a power of 10 or if QE ≥ Y . When QE ≤ Y 1−ǫ and q is not a divisor of a power of 10, however, we trivially bounded a factor FV (U(a/q +η)) by 1 in the proof, which we expect not to be tight. Lemma 10.7 below allows us to obtain superior bounds (in certain ranges) provided the denominators do not have large powers of 2 or 5 dividing them.

Lemma 10.7 (Alternative Hybrid Bound). Let D,E,Y,Q1 ≥ 1 be integral powers u of 10 with DE ≪ Y . Let q1 ∼ Q1 with (q1, 10) = 1 and let d ∼ D satisfy d|10 for some u ≥ 0. Let a S = S(d, q1,Q2,E,Y )= FY + η . dq1q2 q2∼Q2 a

In particular, if q = dq′ with (q′, 10) = 1 and d|10u for some integer u ≥ 0, then we have a E5/6d3/2q F + η ≪ (dE)27/77q1/21 + . Y q Y 10/21 a

For example, if (q, 10) = 1 and qE is a sufficiently small power of Y , then we improve the first bound (qE)27/77 of Lemma 10.6 in the q-aspect to E27/77q1/21. This improvement is important for our later estimates.

Proof. Choose E′ ≍ E and D′ ≍ D with E′,D′ ≥ 1 integral powers of 10 such that E′D′ ≤ Y . Let V be the largest integral power of 10 such that V 2 ≤ Y/D′E′. ′ ′ ′ Since D E ≤ Y we have that V ≥ 1. Let d = d1d2d3 where d3 = (d, D ) and ′ d2d3 = (d,VD ).

By the periodicity of F modulo one, the fact (q1q2, d) = 1, and the Chinese remain- der theorem, we have a FY + η dq1q2 a

′ ′ ′ D Va b1(D V/d2d3) β3 = + . q1q2 d1

These give

′ ′ a b1 b2 b3 FY + + + + η ′ q1q2 d1d2d3 d2d3 d3 a

′ 2 ′ 2 ′ 2 ′ D V a Σ1 = sup FE′ D V β + D V η ≤ sup FE′ β + . β∈R β′∈R Y |η|≤E/Y   a≤2E   Y (ηX+β)∈Z X

′ ′ Since (d1d2d3,D ) = d3 and (q1q2, d) = 1, as a , b1 and b2 go through all residue ′ classes (mod q1q2), (mod d1) and (mod d2) respectively subject to (a , q1q2) = ′ (b1 +d1b2, d1d2) = 1, wesee that D β2 goes through all values of c/q1q2d1d2 (mod 1) for 0

2 b3 ′ ′ sup FD′ β2 + + γ FV D β2 + D γ ≪ Σ2Σ3, ′ |γ|≤E/Y d3 a

We note that only Σ3 and Σ5 depend on q2. Thus, summing over q2 ∼ Q2 with (q2, 10) = 1 we obtain

a ′ ′ (10.12) FY + η ≤ Σ1(Σ2Σ3 +Σ4Σ5), q1q2d q2∼Q2 a

We have d2d3 ≤ d ≤ D and DE ≪ Y , so E/Y ≪ 1/d2d3. Thus, by Lemma 10.5, we have 27/77 (10.14) Σ2 ≪ d3 , 27/77 (10.15) Σ4 ≪ (d2d3) . ′ ′ We are left to bound Σ3 and Σ5, which are very similar. Let 2 ′ ′ a1 Σ =Σ (q1, d1, d2)= sup FV + γ . |γ|≤D′EV/Y d1d2q1q2 q2∼Q2 a1

r Since FR(θ) ≥ FV (θ) for R ≤ V , we may replace FV with FR where R = 10 is the 2 ′ largest power of 10 less than min(V, d1d2Q1Q2). Since R ≤ V and D EV/Y ≪ 1/V , we see all quantities γ occurring in the supremum are of size at most O(1/R). Given any choice of reals ηa,q2 ≪ 1/R for a ≤ d1d2q1q2 and q2 ∼ Q2 with (a, d1d2q1q2) = 1, PRIMES WITH RESTRICTED DIGITS 45

2 the numbers a/d1d2q1q2 +ηa,q2 can be arranged into O(d1d2Q1Q2/R) sets such that all numbers in any set are separated by ≫ 1/R. (Recall that r is chosen such that 2 R ≤ d1d2Q1Q2.) Thus, as in the proof of Lemma 10.5 (specifically the argument leading up to (10.8)), we find that 2 ′ a Σ ≤ sup FR + η |η|≪1/R d1d2q1q2 q2∼Q2 a

1 1 1/2 1 ′ ′ 2 2 R |FR(t)|FR(t)dt ≪ FR(t) dt FR(t) dt ≪ r . 0 0 0 9 Z Z  Z  Putting this together gives d d Q Q2 Σ′ ≪ 1 2 1 2 . 9r r 2 1/2 We recall that R = 10 ∼ min(V, d1d2Q1Q2) and V ≍ (Y/DE) , and note that 20/21 < log 9/ log 10. This gives Y −10/21 (10.16) Σ′ ≪ (d d Q Q2)1/21 + d d Q Q2 . 1 2 1 2 1 2 1 2 DE ′ ′ ′   ′ This gives a bound for Σ3 since Σ3 ≤ Σ , and we obtain an analogous bound for Σ5 with d2 replaced by 1. Combining (10.16) with our earlier bounds (10.13), (10.14) and (10.15) and substituting these into (10.12) gives a b FY + + η q1q2d d q2∼Q2 a

The second statement of the lemma is simply the case when Q2 = 1 and q = dq1. 

We see that Lemma 8.1 follows immediately from Lemma 10.5, and Lemma 8.2 is the same as Lemma 10.1. Thus we are left to establish Proposition 9.1, Proposition 9.2 and Proposition 9.3, which we do over the next few sections. 46 JAMES MAYNARD

11. Major arcs

In this section we establish Proposition 9.1 using the prime number theorem in arithmetic progressions and short intervals, making use of Lemma 10.1.

Proof of Proposition 9.1. We split M up as three disjoint sets

M = M1 ∪M2 ∪M3, where a b (log X)C M = a ∈M : − ≤ for some b, q ≤ (log X)C , q ∤ X , 1 X q X n o a b (log X)C M = a ∈M : = + ν for some b, q ≤ (log X)C , q|X, 0 < |ν|≤ , 2 X q X n a b o M = a ∈M : = for some b, q ≤ (log X)C, q|X . 3 X q n o

By Lemma 10.1 and recalling X is a power of 10, we have a a sup SA = #A sup FX ≪ #A exp(− log X). a∈M1 X a∈M1 X     p ℓ Using the trivial bound SR X (θ) ≪ X(log X) , where ℓ ≤ 2/η and noting #M1 ≪ (log X)3C, we obtain 1 a −a #A (11.1) S S ≪ . X A X RX X C,η (log X)C aX∈M1     This gives the result for M1.

We now consider M2. Recalling the definition of RX , we have that for n

(11.2) ΛRX (n)= log pi = ΛC(m) log p, n=p1···pℓ i=1 n=mp aj aj +δ η/4 pj ∈(X ,XX ] for j<ℓ Y p≥XX 1−P − η/4 1−P ai−ℓδ i ai ℓδ pℓ≥X ,X i p≥X where C = (a1,a1 + δ] ×···× (aℓ−1,aℓ−1 + δ] is the projection of RX onto the first ℓ − 1 coordinates. We note the crude bound Λ (m) log p ℓ−1 (11.3) C ≤ ≪ (log X)ℓ−1. m p m

If mp = j∆X + O(∆X) and p ≡ r (mod q) we have b c brm e mp + = e e(jc∆) + O(∆(log X)C). q X q      By the prime number theorem in short intervals and arithmetic progressions (5.1), for m

∆X ∆2X log p = + O mφ(q) C,η mφ(q) p∈[j∆X/m,(j+1)∆X/m)   p≡r X(mod q)

Thus a Λ (m) brm sup S = ∆X sup C e e(j∆c) RX X mφ(q) q a∈M2 b≤q 1−η/3 −1 C m

e(j∆c)= −e(c)= −1= O(1). −1 1≤j

1RX (n), and that Xν ∈ Z for a ∈M2. 48 JAMES MAYNARD

3C Using the trivial bounds SA(θ) ≤ #A and #M2 ≪ (log X) along with (11.4), we obtain 1 a −a #A (11.5) S S ≪ . X A X RX X C,η (log X)C aX∈M2     Finally, we consider M3. By the prime number theorem in arithmetic progressions as above, we have for (r, q) = 1 and q ≤ (log X)C that X Λ (m) X Λ (n)= C + O RX φ(q) m η,C (log X)4C n

12. Generic minor arcs

In this section we establish Proposition 9.2 and obtain some bounds on the excep- tional set E by using the distributional estimates of Lemma 10.4. PRIMES WITH RESTRICTED DIGITS 49

Lemma 12.1 (ℓ2 bound for primes). We have that

a X # 0 ≤ a

Proof. This follows from the ℓ2 bound coming from Parseval’s identity.

a X C2 a 2 # 0 ≤ a

Lemma 12.2 (Generic frequency bounds). Let

a 1 E = 0 ≤ a

#E ≪ X23/40−ǫ, a F ≪ X23/80−ǫ, X X aX∈E   and 1 a −a 1 F S ≪ . X X X R X η Xǫ a

Proof. The first bound on the size of E follows from using Lemma 10.4 with B = X23/80 and verifying that (23 × 235)/(80 × 154)+ 59/433 < 23/40. For the second bound we see from Lemma 10.4 that a a F ≪ # 0 ≤ aX2 makes a contribution O(1/X)). 50 JAMES MAYNARD

This gives

1 a −a F S X X X R X a

We concentrate on the inner sum. Using Lemmas 10.4 and 12.1 we see that the sum contributes X a 1 −a X ≪ # a : F ∼ , S ∼ BC X X B R X C n Oη (1)    o X(log X) ≪ min C2, B 235/154X59 /433 BC X59/866   ≪ X1+ǫ . η B73/308

Here we used the bound min(x, y) ≤ x1/2y1/2 in the last line. In particular, we 1−2ǫ 23/80 see this is Oη(X ) if B ≥ X on verifying that 23/80 × 73/308 > 59/866. Substituting this into our bound above gives the result. 

13. Exceptional minor arcs

In this section we reduce Proposition 9.3 to the task of establishing Proposition 13.3 and Proposition 13.4, given below. We do this by making use of the bilinear structure of ΛRX (n) which is supported on integers of the form n1n2 with n1 of convenient size, and then showing that if these resulting bilinear expressions are large then the Fourier frequencies must lie in a smaller additively structured set. Propositions 13.3 and 13.4 then show that we have superior Fourier distributional estimates inside such sets. Thus we conclude that the bilinear sums are always small. To make the bilinear bound explicit, we establish the following lemma, from which Proposition 9.3 follows quickly.

Lemma 13.1 (Bilinear sum bound). Let N,M,Q ≥ 1 and E satisfy X9/25 ≤ N ≤ X17/40, Q ≤ X1/2, NM ≤ 1000X and E ≤ 100X1/2/Q, and either E ≥ 1/X or E =0. Let F = F(Q, E) be given by

a b F = a

a −anm X(log X)O(1) F α β γ e ≪ . X X n m a X (Q + E)ǫ/10 a∈F∩E n∼N X mX∼M     PRIMES WITH RESTRICTED DIGITS 51

Proof of Proposition 9.3 assuming Lemma 13.1. By symmetry, we may assume that I = {1,...,ℓ1} for some ℓ1 < ℓ. By Dirichlet’s theorem on Diophantine approxi- mation, any a ∈ [0,X) has a representation a b = + ν X q for some integers (b, q) = 1 with q ≤ X1/2 and some real |ν|≤ 1/X1/2q. Thus we can divide [0,X) into O(log X)2 sets F(Q, E) as defined by Lemma 13.1 for different parameters Q, E satisfying 1 ≤ Q ≤ X1/2 and E =0or1/X ≤ E ≤ 100X1/2/Q. Moreover, if a∈M / then a ∈F = F(Q, E) for some Q, E, with Q + E ≥ (log X)C. Thus, provided C is sufficiently large compared with A and η, we see it is sufficient to show that 1 a −a #A (13.1) S S ≪ . X A X RX X (Q + E)ǫ/20 a∈F∩E     X From the definition (9.1) of ΛRX and shape of RX given by Proposition 9.3, we have that for n

ΛRX (n)= ΛR1 (n1)ΛR2 (n2) log p, n1n2p=n − − Xη/4,X1XPi ai ℓδ ≤p where R1 is the projection of RX onto the first ℓ1 coordinates, and R2 is the projection onto the subsequent ℓ − ℓ1 − 1 coordinates.

Since n1, n2, p and X are integers, | log ((X − 1/2)/n1n2p)| ≫ 1/X. Thus, by Perron’s formula (see, for example, [10, Chapter 17]), we have for n1,n2,p

We will use this to remove the constraint n = n1n2p

1 a ΛR1 (n1)ΛR2 (n2)cp −anmp #A SA s s s e ≪ ǫ/15 , X X n1n2p X (Q + E) a∈F∩E n1∼N1 X   n2X∼N2   p∼P

η/4 1−P ai−ℓδ where cp = log p if p ≥ X ,X i and 0 otherwise. (The integral over s 4 and the choices of N1,N2, P contribute a factor of O(log X) , which is acceptable for establishing (13.1) if C is sufficiently large.)

ℓ1 ℓ1 Pi=1 ai Pi=1 ai+ℓδ Since ΛR1 (n1) is supported on n1 ∈ [X ,X ] and ΛR2 (n2) is sup- ℓ−1 Pℓ +1 ai 1−ℓδ ported on n2 ≥ X 1 , we only need to consider N1N2P ≥ X and N1 ∈ Pℓ1 a Pℓ1 a +ǫ/6 [X i=1 i ,X i=1 i ]. But, by assumption, ℓ 1 9 ǫ 17 ǫ 23 ǫ 16 ǫ a ∈ + , − ∪ + , − , i 25 2 40 2 40 2 25 2 i=1 X h i h i 52 JAMES MAYNARD

9/25 17/40 so either N1 or N2P lie in [X ,X ]. Since ΛR1 (n1), ΛR2 (n2), log p ≪ℓ (log X)ℓ−1, 1 a −anm #A (13.2) S α β e ≪ X A X n m X (Q + E)ǫ/12 a∈F∩EX   nX∼N mX∼M   uniformly over all choices of N ∈ [X9/25,X17/40] and M ≤ 1000X/N and uniformly ℓ over all 1-bounded complex sequences αn,βm. (Setting αn = ΛR1 (n)/(log X) ℓ ℓ1 and βm = pn2=m,p∼P,n2∼N2 ΛR2 (n2)cp/(log X) gives the bound when i=1 ai ∈ [9/25+ ǫ/2, 17/40 − ǫ/2]; the other case is analogous with α and β swapped.) P n m P Finally, let γa be the 1-bounded sequence satisfying SA(a/X) = #AγaFX (a/X). After substituting this expression for SA, we see that (13.2) follows immediately from Lemma 13.1 for C sufficiently large in terms of η, thus giving the result. 

Thus it remains to establish Lemma 13.1. The key estimate constraining Fourier frequencies to additively structured sets is the following lemma.

Lemma 13.2 (Geometry of numbers). Let K0 be a sufficiently large constant, let 3 t ∈ R with ktk2 =1 and let N > 1 >δ> 0. Let 3 R = {v ∈ R : kvk2 ≤ N, |v · t|≤ δ} 3 2 3 satisfy #R ∩ Z ≥ δKN for some K>K0. Then there exists a lattice Λ ⊂ Z of rank at most 2 such that δKN 2 #{v ∈ Λ ∩R} ≥ . 2

If a cuboid R ⊆ R3 of volume V lies in a the region |z| ≤ ǫ, then it can easily contain rather more than V lattice points from the plane z = 0. Lemma 13.2 says that such a situation is essentially the only way a cuboid can contain many lattice points; if any cuboid has substantially more than V lattice points in R ∩ Z3, then these lattice points must come from some lower dimensional linear subspace. The region R which we are interested in is a slightly thickened disc through the origin in the plane orthogonal to t.

Proof of Lemma 13.2. Let φ : R3 → R3 be the linear map which is a dilation by a 3 3 factor N/δ in the t-direction (i.e. φ(v)= v+t(N/δ−1)(v·t).) Let Λ1 = φ(Z ) ⊂ R be the lattice which is the image of Z3 under φ. Since the determinant of a lattice is the volume of the fundamental parallelepiped, we see that det(Λ1)= N/δ.

Let {v1, v2, v3} be a Minkowski-reduced basis of Λ1. We recall that this means that any v ∈ Λ1 can be written uniquely as n1v1 + n2v2 + n3v3 for some n1,n2,n3 ∈ Z, and for any n1,n2,n3 ∈ Z we have

3 kn1v1 + n2v2 + n3v3k2 ≍ knivik2, i=1 X and that kv1k2kv2k2kv3k2 ≍ det(Λ1) = N/δ. Without loss of generality let kv1k2 ≤kv2k2 ≤kv3k2. PRIMES WITH RESTRICTED DIGITS 53

We now notice that any element of R∩Z3 is mapped injectively by φ to an element of {x ∈ Λ1 : kxk2 ≤ 2N}. Thus for a sufficiently large constant C, we have 3 3 3 3 n ∈ Z : nivi ∈ φ(R) ⊆ n ∈ Z : nivi ≤ 2N 2 n i=1 o n i=1 o X X 3 N ⊆ n ∈ Z : | ni|≤ C . kvik2 3n o If kv3k2 > CN, then there are no n ∈ Z counted above with n3 6= 0. If instead kv3k2 ≤ CN then since kv1k2 ≤kv2k2 ≤kv3k2, the number of n is C3N 3 N 3 ≪ ≪ ≪ δN 2. 3 det(Λ ) i=1 kvik2 1 2 Thus in either case there areQO(δN ) points with n3 6= 0. However, by assumption of the lemma we have that K is sufficiently large and 2 3 δKN ≤ #{x ∈ Z ∩ R} = #{x ∈ Λ1 : x ∈ φ(R)}.

This means that most of the contribution must come from terms with n3 = 0. Indeed, we have 2 2 #{(n1,n2) ∈ Z : n1v1 + n2v2 ∈ φ(R)} = #{x ∈ Λ1 : φ(x) ∈ R}− O(δN ) ≥ δKN 2 − O(δN 2). 2 We may choose K0 such that if K ≥ K0 then the right hand side is at least δKN /2. −1 −1 3 Thus, we see if Λ is the lattice φ (v1)Z + φ (v2)Z then Λ ⊆ Z and #{v ∈ Λ ∩R} ≥ δKN 2/2. 

We establish Lemma 13.1 assuming two key propositions, Proposition 13.3 and Proposition 13.4, given below. These propositions will be proven over the next two sections. Proposition 13.3 (Bound for angles generating lattices). Let X,K,N,Q ≥ 1 and δ > 0, E ≥ 0 satisfy X17/40 ≤ NK, δ ≥ N/X, E ≤ 100X1/2/Q and Q ≤ X1/2. 2 2 Let B1 = B1(N,K,δ) ⊆ [0,X) be the set of pairs (a1,a2) ∈ Z such that there is a lattice Λ ⊆ Z3 of rank 2 such that 2 #{n ∈ Λ: |n1a1 + n2a2 + n3X|≤ δX, knk2 ≤ N}≥ δKN , and not all of these points lie on a line through the origin. Let F = F(Q, E) be given by a b F = a

Given B ≤ X23/80, let E′ = E′(B) be given by a 1 E′ = a

Proof of Lemma 13.1 assuming Propositions 13.3 and 13.4. We split E into O(log X) subsets of the form a 1 E′ = E′(B)= a ∈ [0,X): F ∼ X X B n   o for some B ∈ [1,X23/80]. By Cauchy-Schwarz, we have a −anm F α β γ e ≪ Σ1/2Σ1/2, X X n m a X 1 2 a∈F∩E′ n∼N X mX∼M     where X Σ = |β |2 ≪ , 1 m N m≪XX/N a −anm 2 Σ2 = αnγaFX e ′ X X m≪X/N a∈F∩E n∼N     X X X a1 a2 m(a1n1 − a2n2) = FX FX αn1 αn2 γa1 γa2 e ′ X X X a1,aX2∈F∩E     n1,nX2∼N m≪XX/N   −1 a1 a2 X a1n1 − a2n2 ≪ FX FX min , . ′ X X N X a1,a2∈F∩E     n1,n2∼N   X X Thus it suffices to show −1 O(1) a1 a2 X a1n1 − a2n2 NX(log X) FX FX min , ≪ ǫ/5 , ′ X X N X (Q + E) a1,a2∈F∩E     n1,n2≤N   X X provided X9/25 ≤ N ≤ X17/40, Q ≤ X1/2 and E ≤ 100X1/2 /Q.

′ Let G(K) denote the set of pairs (a1,a2) ∈F∩E such that X n a − n a −1 min , 1 1 2 2 ∼ N 2K. N X n1,n2≤N   X We consider 1 ≤ K ≤ X/N taking values which are integral powers of 10, and split the contribution of our sum according to these sets. We see it is therefore sufficient to show that for each K a a X(log X)O(1) F 1 F 2 ≪ . X X X X (Q + E)ǫ/5NK (a1,a2)∈G(K) X ′     a1,a2∈F∩E ′ Let G(K,δ) denote the set of pairs (a1,a2) ∈F∩E such that n a − n a − n X # n ∈ Z3 : 1 1 2 2 3 ≤ δ, knk ≤ 10N ≥ δN 2K. X 2 n o

PRIMES WITH RESTRICTED DIGITS 55

By considering δ =2−j and using the pigeonhole principle, we see that if X n a − n a −1 min , 1 1 2 2 ∼ N 2K, N X n1,n2≤N   X then there is some δ ≥ N/X and some K/ log X ≪ K′ ≤ K such that ′ (a1,a2) ∈ G(K ,δ). Thus is suffices to show for all K′,δ that O(1) a1 a2 X(log X) (13.3) FX FX ≪ ǫ/5 ′ . ′ X X (Q + E) NK (a1,a2)∈G(K ,δ) X ′     a1,a2∈F∩E From Lemma 12.2, we have the bound 2 a1 a2 a1 23/40−2ǫ FX FX ≪ FX ≪ X , ′ X X ′ X (a1,a2)∈G(K ,δ) a1∈E X ′      X   a1,a2∈F∩E which gives (13.3) in the case when NK′ ≪ X17/40+ǫ. Thus we may assume that NK′ ≫ X17/40+ǫ. By assumption, we also have that N ≤ X17/40, so we only consider K′ ≫ Xǫ. In particular, we may use Lemma 13.2 to conclude that either there is a rank 2 lattice Λ ⊆ Z3 such that ′ 2 #{n ∈ Λ: knk2 ≤ 10N, |n1a1 + n2a2 + n3X|≤ δX}≥ δK N /2, and not all of these points lie on a line through the origin, or there is a line L ⊆ Z3 such that ′ 2 #{n ∈ L : knk2 ≤ 10N, |n1a1 + n2a2 + n3X|≤ δX}≥ δK N /2. In either case (13.3) follows from Proposition 13.3 or Proposition 13.4 (taking ‘N’ and ‘K’ in the propositions to be 10N and K′/1000 ≥ 1 in our notation here). 

Thus it remains to establish Proposition 13.3 and Proposition 13.4.

14. Lattice Estimates

In this section we establish Proposition 13.3, which controls the contribution from pairs of angles which cause a large contribution to the bilinear sums considered in Section 13 to come from a lattice. A low height lattice Λ makes a significant contribution only if (a1,a2,X) is approximately orthogonal to the plane of the lattice, and so only if (a1,a2,X) lies close to the line through the origin orthogonal to this lattice. We note that we only make small use of the fact that these angles lie in a small set, but it is vital that the angles lie outside the major arcs. Lemma 14.1 (Lattice generating angles have simultaneous approximation). Let 2 δ > 0 and X,N,K ≥ 1 be such that δ ≥ N/X. Let B1 = B1(N,K,δ) ⊆ [0,X) be 2 3 the set of pairs (a1,a2) ∈ Z such that there is a lattice Λ ⊆ Z of rank 2 such that 2 #{n ∈ Λ: |n1a1 + n2a2 + n3X|≤ δX, knk2 ≤ N}≥ δKN , and moreover the points counted above do not all lie on a line through the origin. 56 JAMES MAYNARD

Then all pairs (a1,a2) ∈B1 have the simultaneous rational approximations a b 1 1 = 1 + O , X q NKq a b  1  2 = 2 + O , X q NKq   for some integer q ≪ X/NK.

3 We see Lemma 14.1 restricts the pair (a1,a2) to lie in a set of size O(X/NK) , which is noticeably smaller than X2 for the range of NK under consideration. This allows us to obtain superior bounds for the sum over a1,a2, by exploiting the estimates of Lemma 10.6 which show F is not abnormally large on such a set.

Proof. Clearly we may assume that NK is sufficiently large, since otherwise the result is trivial. By assumption of the lemma, for any pair (a1,a2) ∈B1 there is a 2 rank 2 lattice Λ = Λa1,a2 such that #(Λ ∩ H) ≥ δKN where 3 H = {x ∈ R : |x1a1 + x2a2 + x3X|≤ δX, kxk2 ≤ N}. Moreover, not all the points in Λ ∩ H lie in a line through the origin. Let a = 3 3 (a1,a2,X), and let φ : R → R be a dilation by a factor N/δ in the a-direction, and let Λ′ = φ(Λ). Then we see that ′ φ(Λ ∩ H) ⊆{x ∈ Λ : kxk2 ≤ 2N}. Moreover, not all the points on the right hand hand side lie in a line through the origin, since φ−1 preserves lines through the origin. Let Λ′ have a Minkowski- reduced basis {v1, v2}, and let V1 = kv1k2 and V2 = kv2k2. Since km1v1 + m2v2k2 ≍ |m1|V1 + |m2|V2, for a suitably large constant C we have

′ CN CN {x ∈ Λ : kxk2 ≤ 2N}⊆ m1v1 + m2v2 : |m1|≤ , |m2|≤ . V1 V2 n o Since not all of the points in the final set lie in a line through the origin, we see that V1, V2 ≤ CN. Thus N 2 δKN 2 ≤ #(Λ ∩ H) = #(Λ′ ∩ φ(H)) ≪ . V1V2

In particular, V1V2 ≪ 1/δK.

−1 −1 Let w1 = φ (v1) and w2 = φ (v2), so w1 and w2 are linearly independent 3 vectors in Λ ⊆ Z . Since φ can only increase the length of vectors, kw1k2 ≤ V1 and kw2k2 ≤ V2. Let ǫ1 = |w1 · a| and ǫ2 = |w2 · a|. Trivially we have |v1 · a| ≪ V1X and |v2 · a| ≪ V2X, and so recalling that φ is a dilation by a factor N/δ in the a-direction, we see that ǫ1 ≪ δXV1/N and ǫ2 ≪ δXV2/N.

Putting this together, we see that for any pair (a1,a2) ∈ B1 there are linearly 3 independent vectors w1, w2 ∈ Z and quantities V1, V2 such that 1 V V ≪ , kw k ≤ V , kw k ≤ V , 1 2 δK 1 2 1 2 2 2 δXV δXV |a · w | ≪ 1 , |a · w | ≪ 2 . 1 N 2 N PRIMES WITH RESTRICTED DIGITS 57

This puts considerable constraints on the possibilities for (a1,a2), since it must lie in an infinite cylinder with axis parallel to w1 × w2 with short radius, for some low 3 height vectors w1, w2. (Here × is the standard cross product on R .) Explicitly, 3 let e1, e2, e3 be an orthonormal basis of R with e1 orthogonal to w1 and w2, and with e2 orthogonal to w2. Then we see that e1 ∝ w1 × w2, e2 ∝ w2 × e1 and e3 ∝ w2. In particular, we have that |e3 · w2| = kw2k2, and

|w1 · (w2 × (w1 × w2))| kw1 × w2k2 |e2 · w1| = = . kw2k2kw1 × w2k2 kw2k2

(Here we used the identity a · (b × c)= c · (a × b).) Thus, if x = x1e1 + x2e2 + x3e3 has |x · w1| ≪ δXV1/N and |x · w2| ≪ δXV2/N, then δXV 2 ≫ |x · w | = |x |kw k , N 2 3 2 2 δXV1 |x2|kw1 × w2k2 ≫ |x · w1| = + O |x3|kw1k2 . N kw2k2   Since kw1k2 ≪ V1, kw2k2 ≪ V2 and kw1 × w2k2 ≤kw1k2kw2k2, this implies that

δXV2 δXV1V2 |x3| ≪ ≪ , Nkw2k2 Nkw1 × w2k2 δXV1V2 |x3|kw1k2 kw2k2 δXV1V2 |x2| ≪ + ≪ . Nkw1 × w2k2 kw1 × w2k2 Nkw1 × w2k2

Thus, since V1V2 ≪ 1/δK, we see that any vector x with |x · w1| ≪ δXV1/N and |x · w2| ≪ δXV2/N satisfies X x = λ(w1 × w2)+ O NKkw1 × w2k2   for some λ ∈ R. We note that the error term is o(X) since w1, w2 are linearly independent integer vectors and NK is assumed sufficiently large. Let the com- 3 ponents of w1 × w2 be c1,c2,c3 (with respect to the standard basis of R ). Since 3 w1, w2 ∈ Z , we have c1,c2,c3 ∈ Z. Thus if a is of the above form we must have a = λ(w1 × w2)+ o(X) for some λ. Since kak2 ≥ X and a1,a2 ≤ a3 = X, we must have that |c1|, |c2| ≪ |c3|. In particular, |c3|≍kw1 × w2k2. Dividing through by X = λc3 + O(X/NK|c3|) then gives a /X c /c 1 (14.1) 1 − 1 3 ≪ . a2/X c2/c3 2 NK|c3|    

Finally, we note that since δ ≥ N/X and V1V 2 ≪ 1/δK we have 1 X c ,c ,c ≤kw × w k ≤kw k kw k ≤ V V ≪ ≪ . 1 2 3 1 2 2 1 2 2 2 1 2 δK NK

Thus, we see that for any pair (a1,a2) ∈ B1 there must be integers c1,c2,c3 ≪ X/NK such that (14.1) holds. This gives the result. 

Lemma 14.2 (Size of rational approximations). Let B1(N,K,δ) and F = F(Q, E) 2 be as in Proposition 13.3. If B1(N,K,δ) ∩F 6= ∅ then X 2 Q + E ≪ . NK   58 JAMES MAYNARD

Proof. By Lemma 14.1, if (a1,a2) ∈B1(N,K,δ) then a b 1 = 1 + ν , X q 1 a b 2 = 2 + ν , X q 2 for some q ≪ X/NK and |ν1|, |ν2| ≪ 1/NKq. By clearing common factors we may assume that (b1,b2, q) = 1.

2/3 If NK > X (and X is sufficiently large) then we see that b1/q and b2/q are 1/3 the best rational approximations to a1/X and a2/X with denominator O(X ), 2/3 since the error in the approximation is O(1/(qX )). Thus if we also have a1,a2 ∈ F(Q, E) then we must have q ≫ Q and |ν1|, |ν2| ∼ E/X. In particular, we must have Q + E ≪ X/NK. If instead NK ≤ X2/3 then since Q + E ≪ X1/2 we have Q + E ≪ (X/NK)2. Thus in either case we have that there are no such pairs 2 (a1,a2) in both B1(N,K,δ) and in F×F unless Q + E ≪ (X/NK) . 

17/40 Lemma 14.3. Let NK ≥ X , and let B1(N,K,δ), F = F(Q, E) and E be as in Proposition 13.3. Then we have

a1 a2 5 FX FX ≪ (log X) sup min(S1S2,S1S3), X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ) d0,d1∈V     D0,D1,E0 a1X,a2∈E d0X∼D0 d1∼D1 u v where V = {2 5 : u, v ∈ Z≥0}, the supremum is over all choices of Q1, G1, G2,D0,D1, E0 ≥ 1 which are powers of 10 and satisfy Q1G1G2D0D1E0 ≪ X/NK and G1 ≪ G2, and S1,S2,S3 are given by ′ b2 S1 = sup FX + ν2 , ′ ′ ′ q ∼Q1 ′ ′ ′ ′ d0d1q g1 ′ g1∼G1 b2

Proof. By Lemma 14.1 we are considering pairs (a1,a2) ∈B1(N,K,δ) such that a b 1 = 1 + ν , X q 1 a b 2 = 2 + ν , X q 2 for some q ≪ X/NK and |ν1|, |ν2| ≪ 1/NKq.

By clearing common factors we may assume that (b1,b2, q) = 1. We let g1 = (b1, q) and g2 = (b2, q). By symmetry we may assume that g1 ≤ g2. We let d1 be the part u ′ of g1 not coprime to 10 (i.e. d1|10 for some integer u, and g1 = g1d1 for some PRIMES WITH RESTRICTED DIGITS 59

′ (g1, 10) = 1). Similarly we let d0 be the part of q/g1g2 which is not coprime to 10. ′ ′ ′ ′ To ease notation we let b1 = b1/g1, b2 = b2/g2, q = q/g1g2d0 and g1 = g1/d1. Thus ′ ′ ′ ′ ′ ′ ′ ′ ′ ′ q = g1g2d0d1q , b1 = b1d1g1 and b2 = b2g2 with (b1, d0q g2) = (b2, d0d1q g1)=1 ′ ′ and (q , 10) = (g1, 10) = 1. 5 We split the contribution of pairs (a1,a2) ∈B1 into O(log X) subsets. We consider ′ ′ terms where we have the restrictions q ∼ Q1, g1 ∼ G1, g2 ∼ G2, d0 ∼ D0 and d1 ∼ D1 for some Q1, G1, G2,D0,D1 ≥ 1 all integer powers of 10 with Q0 := ′ Q1G1G2D0D1 ≪ X/NK. Since g1 = g1d1 ≤ g2 we have G1D1 ≪ G2. We relax the restriction |ν1|, |ν2| ≪ 1/NKq to |ν1|, |ν2| ≤ E0/X for a suitable power of 10 5 E0 ≍ X/NKQ0 with E0 ≥ 1. We see there are O(log X) sets with such restrictions which cover all possible (b1,b2,q,ν1,ν2) and hence all (a1,a2) ∈B1. For simplicity, the reader might like to consider the special case G1 = G2 = D0 = D1 =1ona first reading.

u v To ease notation we let V = {2 5 : u, v ∈ Z≥0}, and note that we have d0, d1 ∈V. ′ ′ ′ ′ By summing over all possibilities of q ,g1,g2, d0, d1,b1,b2, we see that

a1 a2 5 FX FX ≪ (log X) sup S0, X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ) d0,d1∈V     D0,D1,E0 a1X,a2∈E d0X∼D0 d1∼D1

where the supremum is over all choices of Q1, G1, G2,D0,D1, E0 ≥ 1 which are powers of 10 and satisfy Q1G1G2D0D1E0 ≪ X/NK and G1D1 ≪ G2 and S0 is given by

′ ′ ′ ′ ′ b1 b2 S0 = FX ′ + ν1 FX ′ ′ + ν2 . d0q g2 d0d1q g ′ ′ ′ |ν |≤E /X 1 q ∼Q1 b1

′ In S0, we have used to indicate that the summation is further constrained by the conditions P

′ ′ ′ ′ ′ ′ ′ (q , 10) = (g1, 10) = (b1, d0q g2) = (b2, d0d1q g1)=1, ′ ′ ′ ′ ′ X(b1/d0q g2 + ν1) ∈ Z, X(b2/d0d1q g1 + ν2) ∈ Z,

′ ′ ′ which we suppressed for notational simplicity. We see that g1,g2,b1,b2,ν1,ν2 each ′ occur in only one of the two FX terms, and so given d0, d1, q the remaining summa- tion in S0 factors into a product of two sums. Taking a supremum over all choices of q′ in the first of these then gives

a1 a2 5 (14.2) FX FX ≪ (log X) sup S1S2, X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ) d0,d1∈V     D0,D1,E0 a1X,a2∈F d0X∼D0 d1∼D1 60 JAMES MAYNARD where ′ b2 (14.3) S1 = sup FX + ν2 , ′ ′ ′ q ∼Q1 ′ ′ ′ ′ d0d1q g1 ′ g1∼G1 b2

The bound (14.2) will be useful when Q0 is small, but when Q0 is large it is wasteful to sum over all these possibilities since we have not made use of the fact that a1,a2 ∈ E, a small set. To obtain an alternative bound we first sum over all a1 ∈E, then all possibilities of q, b2, ν2. This shows that a1 a2 5 ′ (14.5) FX FX ≪ (log X) sup S0, X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ) d0,d1∈V     D0,D1,E0 a1X,a2∈E d0X∼D0 d1∼D1 ′ where the supremum has the same constraints as before, and S0 is given by ′ ′ ′ ′ ′ ′ ′ a1 b2 S0 = FX FX ′ ′ + ν2 . ′ ′ ′ ′ ′ X d0d1q g1 a1∈E q ∼Q1 g ∼G |ν |≤E /X X X 1X 1 b2

Again, taking a supremum over q′ and factorizing the summation, we find that

′ (14.6) S0 ≪ S1S3, where S1 is as given by (14.3) above, and S3 is given by a (14.7) S = F 1 N(a , d ), 3 X X 1 0 aX1∈E   where ′ ′ ′ a1 b1 E0 ′ ′ N(a1, d0) = # q ∼ Q1 : ∃ b1,g2 s.t. − ′ ≤ , (b1, d0q g2)=1, g2 ∼ G2 . X q d0g2 X n o Putting together (14.2), (14.5), (14.6) we obtain

a1 a2 5 FX FX ≪ (log X) sup min(S1S2,S1S3), X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ) d0,d1∈V     D0,D1,E0 a1X,a2∈E d0X∼D0 d1∼D1 as required. 

17/40 Lemma 14.4. Let NK ≥ X and let S1,S2,S3 be as in Lemma 14.3. Let Q1, G1, G2,D0,D1, E0 ≥ 1 be powers of 10 which satisfy Q1G1G2D0D1E0 ≪ X/NK and G1 ≪ G2. Then we have 1−ǫ 1−ǫ min(S1S2,S1S3) ≪ Q0 E0 , PRIMES WITH RESTRICTED DIGITS 61 where Q0 = Q1G1G2D0D1.

Proof. We first bound S1,S2,S3 individually using Lemma 12.2, Lemma 10.6 and Lemma 10.7. We will then combine these bounds to give the desired result.

′ We first consider the quantity N(a1, d0) occurring in S3. If q and q are both counted by N(a, d) then there exists b,g and b′,g′ such that (b,qdg) = (b′, q′dg′) = 1 and a b 1 b′ 1 = + O = ′ ′ + O . X qdg NKQ0 q dg NKQ0     Here we used the fact that E0/X ≪ 1/NKQ0. The variables we consider satisfy ′ ′ q, q ∼ Q1 ≪ Q0/G1G2D0D1 and g,g ∼ G2 and d ∼ D0. Thus

′ ′ ′ Q0 Q0 bq g − b qg ≪ 2 2 ≪ . D0D1G1NK D0D1NK ′ ′ ′ Let h ≪ Q0/D0D1NK be such that bq g −b qg = h. There are O(1+Q0/D0D1NK) such choices of h. Given q,g,b,h with (qg,b)= 1, we then see

q′g′ ≡ hb−1 (mod qg), b′ ≡ h(qg)−1 (mod b).

Since q′g′ ≍ qg and b′ ≍ b, there are O(1) choices of b′ and q′g′. Thus there are ǫ ′ ′ ′ O(Q0) such choices of q ,g ,b by the divisor bound. Thus we find that 1+ǫ ǫ Q0 N(a1, d0) ≪ Q0 + . D0D1NK Combining this with Lemma 12.2 gives the bound

23/80 23/80 Q0X (14.8) S3 ≪ X + . D0D1NK

We recall Q0 = Q1G1G2D0D1 is the approximate size of q and that G1 ≪ G2, E0Q0 ≪ X/NK ≪ X. By Lemma 10.6 we have E D D Q G2 S ≪ (E D D Q G2)27/77 + 0 0 1 1 1 1 0 0 1 1 1 X50/77 27/77 27/77 (14.9) ≪ Q0 E0 , Q2G2E D S ≪ (E D Q2G2)27/77 + 1 2 0 0 2 0 0 1 2 X50/77 2 27/77 2 Q0E0 Q0E0 (14.10) ≪ 2 2 + 50/77 . D0D1G1 X D0D1G1   Alternatively, we may bound S1 using Lemma 10.7, which gives

Q G2(D D )3/2E5/6 S ≪ (D D E )27/77(Q G2)1/21 + 1 1 0 1 0 1 0 1 0 1 1 X10/21 1/2 5/6 1/21 27/77 Q0G1(D0D1) E0 (14.11) ≪ Q0 (D0D1E0) + 10/21 . G2X 62 JAMES MAYNARD

If the first term in (14.11) dominates, then since E0 ≪ X/NKQ0, the bounds (14.11) and (14.10) give Q2+1/21E2 S S ≪ E54/77Q54/77+1/21 + 0 0 1 2 0 0 X50/77 1 X 1+1/21+ǫ ≪ Q1−ǫE1−ǫ 1+ . 0 0 X50/77 NK 1−ǫ 1−ǫ     17/40 This shows S1S2 ≪ Q0 E0 in this case by recalling that NK ≫ X and verifying that 22/21 × 23/40 < 50/77. If instead the second term in (14.11) dominates, then by (14.9) and (14.11) (using 5/6 G1 ≪ G2 and replacing E0 with E0 to simplify the expression), we have Q E (D D )1/2 (14.12) S ≪ min (Q E )27/77, 0 0 0 1 . 1 0 0 X10/21 Combining this with (14.10), we obtain  2 27/77 1/3 1/2 2/3 E0Q0 27/77 Q0E0(D0D1) S1S2 ≪ 2 2 (E0Q0) 10/21 G1D0D1 X  2   1/2  Q0E0 Q0E0(D0D1) + 50/77 10/21 X D0D1G1 X Q3/2E6/5 Q3E2 ≪ 0 0 + 0 0 . X3/10 X9/8 Here we have simplified the exponents appearing for an upper bound. We recall 17/40 that Q0E0 ≪ X/NK and (by assumption of the lemma) NK ≫ X . These give Q3/2E6/5 Q E Q E 0 0 ≪ 0 0 (X23/40)1/2 ≪ 0 0 . X3/10 X3/10 X1/80 1−ǫ 1−ǫ Thus this term is O(Q0 E0 ), and so Q3E2 (14.13) S S ≪ Q1−ǫE1−ǫ + 0 0 . 1 2 0 0 X9/8 Similarly, we find that combining (14.12) and (14.8) gives 1/2 23/80 23/80 27/77 Q0E0(D0D1) X Q0 S1S3 ≪ X (Q0E0) + 10/21 X D0D1NK Q2E ≪ X23/80(Q E )27/77 + 0 0 . 0 0 X3/16NK 17/40 Here we used 10/21 − 23/80 > 3/16. Since Q0E0 ≪ X/NK and NK ≫ X ≫ X13/32+ǫ, we see that Q2E X13/16 0 0 ≪ Q ≪ Q1−ǫ ≪ Q1−ǫE1−ǫ. X3/16NK 0 (NK)2 0 0 0 Thus we have 1−ǫ 1−ǫ 23/80 27/77 (14.14) S1S3 ≪ Q0 E0 + X (Q0E0) . Combining (14.13) and (14.14), we obtain Q3E2 min(S S ,S S ) ≪ Q1−ǫE1−ǫ + min X23/80(Q E )27/77, 0 0 . 1 2 1 3 0 0 0 0 X9/8   PRIMES WITH RESTRICTED DIGITS 63

We find that Q3E2 77/100 Q3E2 23/100 min X23/80(Q E )27/77, 0 0 ≪ X23/80(Q E )27/77 0 0 0 0 X9/8 0 0 X9/8    96/100 73/100    Q E = 0 0 X(90−77)×23/8000 1−ǫ 1−ǫ ≪ Q0 E0 . 1−ǫ 1−ǫ  Thus we have min(S1S2,S3S2) ≪ Q0 E0 in all cases, as desired.

Having established the technical Lemma 14.3 and Lemma 14.4, we are now in a position to prove Proposition 13.3.

Proof of Proposition 13.3. We wish to show that a a (log X)5 X F 1 F 2 ≪ X X X X (Q + E)ǫ/4 NK (a1,a2)∈B1(N,K,δ)     a1,aX2∈F∩E 17/40 2 2 in the region X ≤ NK. Since B1(N,K,δ) ∩F = ∅ unless Q + E ≪ (X/NK) by Lemma 14.2, we may assume that Q + E ≪ (X/NK)2. By Lemma 14.3 and Lemma 14.4 we have

a1 a2 5 FX FX ≪ (log X) sup min(S1S2,S1S3) X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ) d0,d1∈V     D0,D1,E0 a1,aX2∈F∩E d0X∼D0 d1∼D1 5 1−ǫ 1−ǫ ≪ (log X) sup Q0 E0 . Q1,G1,G2 d0,d1∈V D0,D1,E0 d0X∼D0 d1∼D1 ǫ/2 There are O(Q0 ) elements d0, d1 ∈ V with d0, d1 ≪ Q0. Thus, recalling that Q0E0 ≪ X/NK, we have a1 a2 5 1−ǫ/2 1−ǫ FX FX ≪ sup (log X) Q0 E0 X X Q1,G1,G2 (a1,a2)∈B1(N,K,δ)     D0,D1,E0 a1,aX2∈F∩E X 1−ǫ/2 ≪ (log X)5 . NK We recall that Q + E ≪ (X/NK)2, and so this gives   a a (log X)5X F 1 F 2 ≪ , X X X X (Q + E)ǫ/4NK (a1,a2)∈B1(N,K,δ)     a1,aX2∈F∩E as required. 

15. Line Estimates

In this section we establish Proposition 13.4, which controls the contribution from pairs of angles which cause a large contribution to the bilinear sums considered in Section 13 to come from a line. If a line L makes a large contribution, then 64 JAMES MAYNARD

(a1,a2,X) must lie close to the low height plane orthogonal to this line. We note that we do not make use of the fact that these angles lie outside the major arcs, but it is vital that the angles are restricted to the small set E. Lemma 15.1 (Line angles lie in low height plane). Let 0 <δ< 1 and K,N,X > 1 17/40 be reals with δ ≥ N/X and NK ≥ X . Let B2 = B2(N,K,δ) be the set of 2 integer pairs (a1,a2) ∈ [0,X) such that there is a line L through the origin such that

3 2 #{n ∈ L ∩ Z : |n1a1 + n2a2 + n3X|≤ δX, knk2 ≤ N} ≫ δN K.

Then all pairs (a1,a2) ∈B2 satisfy

v1a1 + v2a2 + v3X + v4 =0

2 for some integers v1, v2, v3, v4 ≪ X/N K not all zero.

3 Proof. Let v = (v1, v2, v3) be a non-zero element of Z ∩ L of smallest norm, and 3 let V = kvk2 and ǫ1 = |v1a1 + v2a2 + v3X|. Then all of Z ∩ L is generated by v, and so

3 N δX #{n ∈ L ∩ Z : |n1a1 + n2a2 + n3X|≤ δX, knk2 ≤ N} ≪ min , . V ǫ1   By assumption, this is also ≫ δN 2K, and so we obtain 1 X X V ≪ ≪ , ǫ ≪ . NKδ N 2K 1 N 2K

Letting v4 = −(v1a1 + v2a2 + v3X) ∈ {±ǫ1} gives the result. 

Lemma 15.2 (Sparse sets restricted to low height planes). Let C ⊆ [0,X) be a set of integers. Then we have for any V ≥ 1

2 4 # (a1,a2) ∈C : ∃(v1, v2, v3, v4) ∈ [−V, V ] \{0} s.t. v1a1 + v2a2 + v3X + v4 =0 n #C3/2V 3 o ≪ Xo(1) #C5/4V 2 + . X1/2   2 Proof. Trivially there are O(#C ) choices of a1,a2 ∈ C, which gives the required bound if V > #C3/8. In particular, we may assume that V < #C ≤ X. There are O(#C) points with a1 =0 or a2 = 0, so we may assume that a1,a2 6= 0. We first claim that there are (15.1) O(#CV 2Xo(1)) choices of v1, v2, v3, v4, a1, and a2 satisfying v1a1 +v2a2 +v3X +v4 = 0 with at least one of v1, v2, v3, v4 equal to 0 and at least one of v1, v2, v3, v4 non-zero. For example, 2 if v1 = 0 then there are O(#CV ) choices of a1, v3, v4, which then determines v2a2. Since there are no non-zero solutions to v3X + v4 = 0, this is non-zero and so there ǫ are O(X ) choices of v2,a2. The other cases are entirely analogous. Thus it suffices to consider pairs (a1,a2) such that v1a1 +v2a2 +v3X +v4 = 0 for some v1, v2, v3, v4 all non-zero. We let C2 denote the set of such pairs. PRIMES WITH RESTRICTED DIGITS 65

2 2 1/2 Given a ∈ Z, let Ma be the smallest value of (c1 + c2) over all non-zero integers 2 c1,c2 such that c1 ≡ c2X (mod a). We divide C into O(log X) subsets localizing the size of a

C(A, M)= {a ∈C : a ∼ A, Ma ∼ M}. 2 2 2 1/2 There are O(M ) choices of c1,c2 with (c1 + c2) ≤ M, and given any such choice o(1) with M

(15.3) N2 ≪ (log X) sup N3(V1), V1 where ′ ′ ′ ′ ′ N3(V1) = # (a2,a2,a1, d, v˜1, v˜1, v2, v3, v4, v2, v3, v4): 0 < d ≤ V/V1,a1 ∈ C\{0}, n v a v v′ a′ v′ a dv˜ v˜′ = 2 2 + v + 4 v˜′ = 2 2 + v′ + 4 v˜ , a ,a′ ∈C(A, M)\{0}, 1 1 1 X 3 X 1 X 3 X 1 2 2 ′ ′ ′  ′  ′ ′ 0 < |v˜1|, |v˜1|, |v2|, |v2|, |v3|, |v3|, |v4|, |v4|≤ V, max(|v˜1|, |v˜1|) ∼ V1, gcd(˜v1, v˜1)=1 .

o(1) 3/2 4 2 6 o We wish to show that N3(V1) ≪ X (#C V + #C V /X) for any choice of ′ ′ 0 < V1 < V . By symmetry we may assume |v˜1| ≥ |v˜1|, so |v˜1|∼ V1. Let b1 =v ˜1v2, ′ ′ ′ ′ ′ b2 = −v˜1v2, b3 =v ˜1v3 − v˜1v3 and b4 =v ˜1v4 − v˜1v4. We see that any solution counted by N3(V1) must give a solution to ′ b1a2 + b2a2 + b3X + b4 =0 with 0 ≤ |b1|, |b2|, |b3|, |b4|≤ 2V1V and b1,b2 6= 0.

3 3 ′ There are O(V1 V ) choices of b2,b3,b4 and O(#C) choices of a2. Given such a ′ o(1) choice of b2,b3,b4,a2, there are O(X ) choices of b1 and a2 by the divisor bound, ′ o(1) since b1a2 = −b2a2 −b3X −b4 and b1a2 is non-zero. Given b1,b2 there are O(X ) ′ ′ ′ choices ofv ˜1, v˜1, v2, v2 by the divisor bound (recall b1,b2 6= 0). Givenv ˜1, v˜1 and b3 we see that ′ −1 v3 ≡ b3v˜1 (modv ˜1). 66 JAMES MAYNARD

′ Thus there are O(V/V1) choices of v3 (here we use the fact that gcd(˜v1, v˜1) = 1). ′ Given v1, v˜1,b3 and such a choice of v3 there is just one choice of v3. Similarly, ′ ′ there are O(V/V1) choices of v4, v4 givenv ˜1, v˜1 and b4. Givenv ˜1, v2, v3, v4,a2, there o(1) are O(X ) choices of d, a1 since da1v˜1X = v2a2 + v3X + v4 and da1v˜1X 6= 0. Putting this all together, we have o(1) 5 (15.4) N3(V1) ≪ X #CV1V .

This bound will be good for us if V1 is small, but we need a different argument if V1 is large. We note that b a + b a′ + b V V A b = − 1 2 2 2 4 ≪ 1 . 3 X X ′ o(1) 4 2 We make a choice of a2,a2,b1, for which there are ≪ V V1X min(M , #C ) possibilities counted by N3(V1). We see that b3,b4 satisfy ′ b3X + b4 ≡ b1a2 (mod a2). 2 2 Let b3,0,b4,0 be a solution to this congruence with b3,0 + b4,0 minimal. We may assume that b3,0 ≪ V V1A/X and b4,0 ≪ V V1 since otherwise there are no possible b3,b4. All pairs b3,b4 satisfying the congruence are then of the form (b3,b4) = ′ ′ ′ ′ ′ ′ ′ (b3,0 + b3,b4,0 + b4) for some integers b3,b4 satisfying b3X + b4 ≡ 0 (mod a2) and ′ ′ ′ ′ 2 b3 ≪ V V1A/X, b4 ≪ V V1. This forces b3e1 + b4e2 to lie in a lattice Λ ⊂ Z of ′ 2 2 2 determinant a2, where e1, e2 are the standard basis vector of Z . Let φ : R → R be the linear map which is a dilation by a factor X/A in the e1 direction, and ′ 2 Λ = φ(Λ), a lattice in R of determinant a2X/A ≍ X.

′ Let Λ have a Minkowski-reduced basis {v1, v2}. We recall this means that kv1k2 · ′ kv2k2 ≍ det(Λ) = a2X/A ≍ X and kn1v1 + n2v2k2 ≍ kn1v1k2 + kn2v2k2. From the definition of Ma, we see that the smallest non-zero vector in Λ has length at least M/10, and so since φ can only increase the length of vectors we have kv1k2, kv2k2 ≥ M/10.

′ ′ ′ The set of vectors b3e1 + b4e2 in Λ inside the bounded region |b3| ≪ V V1A/X, ′ ′ |b4| ≪ V V1 can be injected by φ into the set {x ∈ Λ : kxk2 ≤ CV V1} for some suitably large constant C. Thus, provided C is sufficiently large so that we also ′ ′ have kn1v1 + n2v2k2 ≥ maxi knivik2/C, we see that the number of pairs (b3,b4) is bounded by

′ 2 #{x ∈ Λ : kxk2 ≤ CV V1} = # (n1,n2) ∈ Z : kn1v1 + n2v2k2 ≤ CV V1}

n 2 2 V V1 2 V V1 ≤ # (n1,n2) ∈ Z : |n1|≤ C , |n2|≤ C kv1k2 kv2k2 n V V V V o ≪ 1+ 1 1+ 1 kv1k2 kv2k2    V V V 2V 2 ≪ 1+ 1 + 1 M det(Λ′) V V V 2V 2 ≪ 1+ 1 + 1 . M X ′ Here we used the fact that kv1k2, kv2k2 ≫ M and kv1k2 ·kv2k2 ≍ det(Λ ) in the penultimate line, and det(Λ′) ≍ X in the final line. PRIMES WITH RESTRICTED DIGITS 67

′ Given any choice of a2,a2,b1,b3,b4, we see that b2 is then determined uniquely by ′ b1a2 +b2a2 = b3X +b4, since we have already chosen all the other terms. As before, ′ o(1) 2 2 ′ ′ given a2, a2, b1, b2, b3, b4 there are O(X V /V1 ) choices ofv ˜1,v ˜1, v2, v3, v4, v2, ′ ′ v3, v4, d, a1. Putting this all together, we obtain the bound 3 2 2 o(1) V 4 2 V V1 V V1 N3(V1) ≪ X min(M , #C ) 1+ + . V1 M X   Since min(M 4, #C2) ≤ min(M#C3/2, #C2) this gives 3 2 6 2 V 3/2 4 #C V o(1) (15.5) N3(V1) ≪ #C + #C V + X . V1 X   Combining (15.4) and (15.5), we obtain 3 2 6 o(1) 5 2 V 3/2 4 #C V N3(V1) ≪ X min #CV1V , #C + #C V + V1 X  1/2 3 1/2  2 6 o(1) 5 2 V 3/2 4 #C V ≪ X #CV1V #C + #C V + V1 X   #C2V 6   (15.6) ≪ Xo(1) #C3/2V 4 + . X We substitute (15.3) and (15.6) into (15.2), and obtain #C3/2V 3 1 ≪ Xo(1) #C5/4V 2 + . X1/2 (a1,aX2)∈C2   2 o(1) We recall from (15.1) that terms with v1v2v3v4a1a2 = 0 contribute a total O(#CV X ), which is negligible compared with the #C5/4V 2 term above. Thus we obtain the result. 

We see that Lemma 15.2 improves on the trivial bound O(Xo(1) min(V 3#C, #C2)) if V 8/3+ǫ ≪ #C ≪ V 4−ǫ + X1−ǫ.

Proof of Proposition 13.4. We wish to show that a a X1−ǫ F 1 F 2 ≪ X X X X NK (a1,a2)∈B2(N,K,δ) X ′     a1,a2∈E in the region N ≫ X9/25. We recall that a 1 E′ = a

57/80 Alternatively, if NK ≫ X /B, we use Lemmas 15.1 and 15.2 to bound #(B2 ∩ (E′)2), and obtain

a a #(B (N,K,δ) ∩ (E′)2) F 1 F 2 ≤ 2 X X X X B2 (a1,a2)∈B2(N,K,δ) X ′     a1,a2∈E 1 X ≤ # a ,a ∈E′ : ∃v ∈ Z4\{0} s.t. kvk ≪ , v · a =0 B2 1 2 2 N 2K Xo(1)n X 2 (#E′)3/2 X 3 o (15.8) ≪ (#E′)5/4 + . B2 N 2K X1/2 N 2K       4 Here we have written a for the vector (a1,a2,X, 1) ∈ Z . Since NK ≫ X57/80/B, we have X/NK ≪ X23/80B. Combining this bound with (15.7), we obtain a bounds for (#E′)5/4B−2X/NK and (#E′)3/2B−2X−1/2(X/NK)2 of the form XaBb for some b > 0. Since we are only considering B ≪ X23/80, these expressions are maximized when B ≍ X23/80. When B ≍ X23/80 we have #E′ ≪ X23/40 and X/NK ≪ X23/40. Thus we obtain the bounds

(#E′)5/4 X ≪ X115/160 = X23/32, B2 NK (#E′)3/2 X 2 ≪ X75/80 = X15/16. B2X1/2 NK   Substituting these bounds into (15.8) gives

a a X23/32 X15/16 X1+o(1) F 1 F 2 ≪ + . X X X X N 2 N 3 NK (a1,a2)∈B2(N,K,δ) X ′       a1,a2∈E

We can then verify that 2 × 9/25 > 23/32 and that 3 × 9/25 > 15/16, so for N ≫ X9/25 this is O(X1−ǫ/NK), as required. 

16. Modifications for Theorem 1.2

Theorem 1.2 follows from essentially the same overall approach as in Theorem 1.1. We only provide a brief sketch the proof, leaving the complete details to the interested reader. When q is large, there is negligible benefit from using the 235/154th moment, so we just use ℓ1 bounds. For Y = qk a power of q, we let

k−1 1 F (θ)= Y − log(q+s)/ log q 1 (n)e(nθ) = e(n qiθ) . Y A q − s i n

PRIMES WITH RESTRICTED DIGITS 69

The inner sum is ≤ min(q − s, s +2/kqiθk). Thus, similarly to Lemma 10.3, we find k−1 t 1 q q F ≪ min q − s, + + s Y Y (q − s)k t q − t i=0 t

17. Acknowledgments

We thank Ben Green for introducing the author to this problem, Xuancheng Shao for useful discussions and Fabian Karwatowski for some important corrections. We also thank the anonymous referee for many helpful suggestions and corrections. The author is supported by a Clay Research Fellowship and a Fellowship by Examination of Magdalen College, Oxford. Part of this work was performed whilst the author was visiting Stanford university, whose hospitality is gratefully acknowledged. 70 JAMES MAYNARD

References

[1] William D. Banks, Alessandro Conflitti, and Igor E. Shparlinski. Character sums over integers with restricted g-ary digits. Illinois J. Math., 46(3):819–836, 2002. [2] William D. Banks and Igor E. Shparlinski. Arithmetic properties of numbers with restricted digits. Acta Arith., 112(4):313–332, 2004. [3] Jean Bourgain. Prescribing the binary digits of primes, II. Israel J. Math., 206(1):165–182, 2015. [4] Sylvain Col. Diviseurs des nombres ellips´ephiques. Period. Math. Hungar., 58(1):1–23, 2009. [5] Jean Coquet. On the uniform distribution modulo one of some subsequences of polynomial sequences. J. Number Theory, 10(3):291–296, 1978. [6] Jean Coquet. On the uniform distribution modulo one of subsequences of polynomial se- quences. II. J. Number Theory, 12(2):244–250, 1980. [7] C´ecile Dartyge and Christian Mauduit. Nombres presque premiers dont l’´ecriture en base r ne comporte pas certains chiffres. J. Number Theory, 81(2):270–291, 2000. [8] C´ecile Dartyge and Christian Mauduit. Ensembles de densit´enulle contenant des entiers poss´edant au plus deux facteurs premiers. J. Number Theory, 91(2):230–255, 2001. [9] Harold Davenport. Indefinite quadratic forms in many variables. II. Proc. London Math. Soc. (3), 8:109–126, 1958. [10] Harold Davenport. Multiplicative number theory, volume 74 of Graduate Texts in Mathemat- ics. Springer-Verlag, New York, third edition, 2000. Revised and with a preface by Hugh L. Montgomery. [11] Michael Drmota and Christian Mauduit. Weyl sums over integers with affine digit restrictions. J. Number Theory, 130(11):2404–2427, 2010. [12] Paul Erd˝os, Christian Mauduit, and Andr´as S´ark¨ozy. On arithmetic properties of integers with missing digits. I. Distribution in residue classes. J. Number Theory, 70(2):99–120, 1998. [13] Paul Erd˝os, Christian Mauduit, and Andr´as S´ark¨ozy. On arithmetic properties of integers with missing digits. II. Prime factors. Discrete Math., 200(1-3):149–164, 1999. Paul Erd˝os memorial collection. [14] and . Opera de cribro, volume 57 of American Mathematical Society Colloquium Publications. American Mathematical Society, Providence, RI, 2010. [15] Glyn Harman. Prime-detecting sieves, volume 33 of London Mathematical Society Mono- graphs Series. Princeton University Press, Princeton, NJ, 2007. [16] Sergei Konyagin. Arithmetic properties of integers with missing digits: distribution in residue classes. Period. Math. Hungar., 42(1-2):145–162, 2001. [17] Christian Mauduit and Jo¨el Rivat. Sur un probl`eme de Gelfond: la somme des chiffres des nombres premiers. Ann. of Math. (2), 171(3):1591–1646, 2010. [18] Hugh L. Montgomery and Robert C. Vaughan. Multiplicative number theory. I. Classical theory, volume 97 of Cambridge Studies in Advanced Mathematics. Cambridge University Press, Cambridge, 2007.

Magdalen College, Oxford, England, OX1 4AU

E-mail address: [email protected]