
Generalized divisor functions in arithmetic progressions: I

The k-fold divisor function in arithmetic progressions to large moduli

David T. Nguyen, Department of Mathematics, South Hall, University of California, Santa Barbara, CA 93106.

Abstract We prove some distribution results for the k-fold divisor function in arithmetic progressions to moduli that exceed the square root of the length X of the sum, with appropriate constraints and averaging on the moduli, saving a power of X from the trivial bound. Assuming the Generalized Lindelöf Hypothesis, we obtain uniform power-saving error terms that are independent of k. We follow and specialize Y. T. Zhang's method on bounded gaps between primes to our setting. Our arguments are essentially self-contained, with the exception of the use of Deligne's work on the Riemann Hypothesis for varieties over finite fields. In particular, we avoid reliance on Siegel's theorem, leading to some effective estimates.

Keywords: Divisor functions, equidistribution estimates, Bombieri-Vinogradov theorem, Elliott-Halberstam conjecture, Siegel-Walfisz theorem

Email address: [email protected] (DTN)

Preprint Friday 5th March, 2021

Contents

1 Introduction and statement of results

2 Notation and sketch of proof

3 Preliminary lemmas

4 Proof of the main result Theorem 1

5 Proof of uniform power savings Theorem 2

6 Proofs of Theorems 3 and 4

List of Tables

1 Only for k = 1, 2, 3 is the exponent of distribution θk for τk(n) known to hold for a value larger than 1/2
2 Known results for τk(n) averaged over moduli d, and references
3 Table of parameters and their first appearance

1. Introduction and statement of results

Let $n \ge 1$ and $k \ge 1$ be integers. Let $\tau_k(n)$ denote the k-fold divisor function
\[ \tau_k(n) = \sum_{n_1 n_2 \cdots n_k = n} 1, \]
where the sum runs over ordered k-tuples $(n_1, n_2, \dots, n_k)$ of positive integers for which $n_1 n_2 \cdots n_k = n$. Thus $\tau_k(n)$ is the coefficient of $n^{-s}$ in the Dirichlet series
\[ \zeta(s)^k = \sum_{n=1}^{\infty} \tau_k(n) n^{-s}. \]
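As a numerical illustration of this definition (a sketch only, not part of the argument), the values $\tau_k(n)$ can be tabulated by convolving the constant function 1 with itself k times, mirroring $\zeta(s)^k = \zeta(s)^{k-1}\zeta(s)$. The function name `tau_k_table` is a hypothetical name chosen here.

```python
def tau_k_table(k, limit):
    """Return a list t with t[n] = tau_k(n) for 1 <= n <= limit (t[0] is unused).

    Uses tau_{j+1}(m) = sum over d | m of tau_j(d), i.e. Dirichlet
    convolution with the constant function 1, starting from tau_1 = 1.
    """
    if k < 1 or limit < 1:
        raise ValueError("k and limit must be positive integers")
    table = [0] + [1] * limit          # tau_1(n) = 1 for every n >= 1
    for _ in range(k - 1):
        nxt = [0] * (limit + 1)
        for d in range(1, limit + 1):
            value = table[d]
            # add tau_j(d) to every multiple m of d
            for m in range(d, limit + 1, d):
                nxt[m] += value
        table = nxt
    return table
```

For example, `tau_k_table(2, 12)[12]` counts the ordered factorizations of 12 into two factors, which is the usual divisor count.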


It is well known that the function $\tau_k$ is closely related to prime numbers. This paper is concerned with the distribution of $\tau_k(n)$ in arithmetic progressions to moduli d that exceed the square root of the length of the sum and, in particular, provides a sharpening of the result in [32]. We next give a brief background of the problem and present our main result.

1.1. Survey and main result

Towards the end of the 18th century, Gauss conjectured the celebrated Prime Number Theorem concerning the sum
\[ \sum_{p \le X} 1 \]
as X approaches infinity, where p denotes a prime. It is more convenient to count primes with weight $\log p$ instead of weight 1, cf. Chebyshev; this leads to consideration of the sum
\[ \sum_{p \le X} \log p. \]

To access the Riemann zeta function more conveniently we also count powers of primes, leading to the sum
\[ \sum_{\substack{p^{\alpha} \le X \\ \alpha \ge 1}} \log p, \]
which is equal to the unconstrained sum over n
\[ \sum_{n \le X} \Lambda(n), \]
where $\Lambda(n)$ is the von Mangoldt function, the coefficient of $n^{-s}$ in the series $-\zeta'(s)/\zeta(s)$. In 1837, Dirichlet considered the deep question of primes in arithmetic progressions, leading him to consider sums of the form
\[ \sum_{\substack{n \le X \\ n \equiv a \ (\mathrm{mod}\ d)}} \Lambda(n) \]
for $(d, a) = 1$. More generally, the function $\Lambda(n)$ is replaced by an arithmetic function $f(n)$ satisfying certain growth conditions, and we arrive at the study of the congruence sum
\[ \sum_{\substack{n \le X \\ n \equiv a \ (\mathrm{mod}\ d)}} f(n). \tag{1.1} \]

This sum (1.1) is our main object of study. For most f appearing in applications, it is expected that f is distributed equally among the reduced residue classes $a \ (\mathrm{mod}\ d)$ with $(a, d) = 1$, i.e., that the sum (1.1) is well approximated by the average
\[ \frac{1}{\varphi(d)} \sum_{\substack{n \le X \\ (n,d)=1}} f(n) \tag{1.2} \]
since there are $\varphi(d)$ reduced residue classes modulo d, where $\varphi(n)$ is Euler's totient function. The quantity (1.2) is often thought of as the 'main term'. Different main terms are also considered. Thus, the study of (1.1) is reduced to studying the 'error term'
\[ \Delta(f; X, d, a) := \sum_{\substack{n \le X \\ n \equiv a \ (\mathrm{mod}\ d)}} f(n) - \frac{1}{\varphi(d)} \sum_{\substack{n \le X \\ (n,d)=1}} f(n), \quad \text{for } (a, d) = 1, \]
measuring the discrepancy between the sum (1.1) and the expected value (1.2). If f satisfies $f(n) \le C \tau^{B}(n) \log^{B} X$ for some constants $B, C > 0$, which is the case for most f in applications, then a trivial bound for the discrepancy $\Delta(f; X, d, a)$ is
\[ \Delta(f; X, d, a) \le C' X \log^{B'} X, \]
for some constants $B', C' > 0$. The objective is then to obtain a non-trivial upper bound such as
\[ \Delta(f; X, d, a) \ll \frac{1}{\varphi(d)} \frac{X}{\log^{A} X}, \quad A > 0, \tag{1.3} \]
or
\[ \Delta(f; X, d, a) \ll \frac{1}{\varphi(d)} X^{1-\delta}, \quad 0 < \delta < 1, \tag{1.4} \]
with d in a certain range depending on X. For $f(n) = \Lambda(n)$, the von Mangoldt function, the classical Siegel-Walfisz theorem implies that (1.3) holds uniformly in the range $d < \log^{B} X$, where $B > 0$ with A depending on B. For $f(n) = \tau_k(n)$, the estimate (1.4) is valid uniformly in the range $d \le X^{\theta_k - \epsilon}$, where the known exponents of distribution $\theta_k$ are summarized in Table 1.
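For small parameters the discrepancy $\Delta(f; X, d, a)$ can be computed directly from its definition. The following is a brute-force sketch for experimentation only; the function name `discrepancy` and the use of exact rationals are choices made here, not part of the paper.

```python
from fractions import Fraction
from math import gcd


def discrepancy(f, X, d, a):
    """Delta(f; X, d, a): sum of f over n <= X with n = a (mod d),
    minus 1/phi(d) times the sum of f over n <= X coprime to d."""
    if gcd(a, d) != 1:
        raise ValueError("a must be coprime to d")
    in_class = sum(f(n) for n in range(1, X + 1) if (n - a) % d == 0)
    coprime = sum(f(n) for n in range(1, X + 1) if gcd(n, d) == 1)
    phi = sum(1 for r in range(1, d + 1) if gcd(r, d) == 1)
    return Fraction(in_class) - Fraction(coprime, phi)
```

Summing `discrepancy(f, X, d, a)` over all reduced residues a modulo d gives zero, since the reduced classes partition the integers coprime to d; this is a convenient sanity check.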

Table 1: Only for k = 1, 2, 3 is the exponent of distribution θk for τk(n) known to hold for a value larger than 1/2.

k        θk                  References
k = 2    θ2 = 2/3            Selberg, Linnik, Hooley (independently, unpublished, 1950's); Heath-Brown (1979) [17, Corollary 1, p. 409].
k = 3    θ3 = 1/2 + 1/230    Friedlander and Iwaniec (1985) [14, Theorem 5, p. 338].
         θ3 = 1/2 + 1/82     Heath-Brown (1986) [18, Theorem 1, p. 31].
         θ3 = 1/2 + 1/46     Fouvry, Kowalski, and Michel (2015) [11, Theorem 1.1, p. 122] (for prime moduli, polylog saving).
k = 4    θ4 = 1/2            Linnik (1961) [24, Lemma 5, p. 197].
k ≥ 4    θk = 8/(3k + 4)     Lavrik (1965) [23, Teopema 1, p. 1232].
k = 5    θ5 = 9/20           Friedlander and Iwaniec (1985) [13, Theorem I, p. 273].
k = 6    θ6 = 5/12           Friedlander and Iwaniec (1985) [13, Theorem II, p. 273].
k ≥ 7    θk = 8/3k           Friedlander and Iwaniec (1985) [13, Theorem II, p. 273].
k ≥ 5    θk ≥ 1/2            Open.

In many problems in analytic number theory, it suffices to prove that (1.3) holds on average, in the sense that
\[ \sum_{d \le X^{\theta - \epsilon}} \max_{(a,d)=1} |\Delta(f; X, d, a)| \ll \frac{X}{\log^{A} X} \tag{1.5} \]
for any $\epsilon > 0$, $A > 0$, and some $0 < \theta \le 1$. For $f(n) = \Lambda(n)$, one form of the celebrated Bombieri-Vinogradov Theorem [1][31] (1965) asserts that (1.5) holds with $\theta = 1/2$. By a general version of the Bombieri-Vinogradov theorem (see, e.g., [25] or [33]), the bound (1.5) holds for a wide class of arithmetic functions, including $f(n) = \tau_k(n)$ for all k; see Table 2 for a summary.

Table 2: Known results for τk(n) averaged over moduli d, and references.

k        θk                  References
k = 2    θ2 = 1              Fouvry (1985) [10, Corollaire 5, p. 74] (exponential saving); Fouvry and Iwaniec (1992) [? , Theorem 1, p. 272].
k = 3    θ3 = 1/2 + 1/42     Heath-Brown (1986) [18, Theorem 2, p. 32].
         θ3 = 1/2 + 1/34     Fouvry, Kowalski, and Michel (2015) [11, Theorem 1.2, p. 123] (for prime moduli, polylog saving).
k ≥ 4    θk < 1/2            Follows from the general version of the Bombieri-Vinogradov theorem; see, e.g., [25] or [33] (polylog saving).
k ≥ 4    θk ≥ 1/2            Open.

It is believed that (1.5) should hold with θ = 1 for a large class of function f(n), including Λ(n) and τk(n); however, going beyond θ > 1/2 proves to be very difficult. In the recent breakthrough work of Y. Zhang [34] on bounded gaps between primes, a crucial step is to show that, for any fixed a 6= 0, X X |∆(Λ; X, d, a)|  , logA X d∈D 1 1 d

X 2 1/2 µ(d) |∆(τk; X, d, a)|  X exp(− log X), (1.6) d∈D d X71/584}, (1.7) p

and the implied constant depends on k and a. The condition on the moduli d in (1.7) allows d to have some, but not too many, prime factors larger than $X^{1/1168}$. The error term and, more importantly, the exponent of distribution $\theta_k = 293/584 = 1/2 + 1/584$ in (1.6) hold uniformly in k. In the main result of this paper, we provide a sharpening of the error term in (1.6), saving a power of X from the trivial bound, with a constraint on the moduli d not having too many very small prime factors. Our arguments follow closely those of [34] in treating the contribution coming from large moduli; see Section 2.2 below for more discussion.

Theorem 1 (Main theorem). Let
\[ \varpi = \frac{1}{1168} \tag{1.8} \]
and
\[ \theta_k = \min\left\{ \frac{1}{12(k+2)},\ \varpi^{2} \right\}. \tag{1.9} \]
For $a \ne 0$, let
\[ \mathcal{D} = \Big\{ d \ge 1 : (d, a) = 1,\ |\mu(d)| = 1,\ \Big(d, \prod_{p \le X^{\varpi^{2}}} p\Big) < X^{\varpi},\ \text{and}\ \Big(d, \prod_{p \le X^{\varpi}} p\Big) > X^{71/584} \Big\}, \]
where $\mu$ is the Möbius function. Then for each $k \ge 4$ we have
\[ \sum_{\substack{d \in \mathcal{D} \\ d < X^{1/2 + 2\varpi}}} \Bigg| \sum_{\substack{n \le X \\ n \equiv a \ (\mathrm{mod}\ d)}} \tau_k(n) - \frac{1}{\varphi(d)} \sum_{\substack{n \le X \\ (n,d)=1}} \tau_k(n) \Bigg| \ll X^{1 - \theta_k}. \tag{1.10} \]

X X 1 X X τk(n) − τk(n)  ϕ(q) (log X)1− Q≤q≤2Q X

Theorem 2. On the Generalized Lindelöf Hypothesis, the estimate (1.10) holds with the right side replaced by $X^{1 - \varpi^{2}}$, where the $\theta_k$ power saving is replaced by a positive constant independent of k.

This uniform power saving is the result of sharper estimates of $L(s, \chi)^{k}$ on the critical line that are independent of k. We next present two results when we are allowed to take an extra averaging over the residue classes $a \ (\mathrm{mod}\ d)$.

1.2. Results on further averaging

In a function field variant, the work of Keating, Rodgers, Roditty-Gershon, and Rudnick in [22] led Rodgers and Soundararajan [29, Conjecture 1] to the following conjecture over the integers for the variance of $\tau_k$.

Conjecture 1. For $X, d \to \infty$ such that $\log X / \log d \to c \in (0, k)$, we have
\[ \sum_{\substack{a = 1 \\ (a,d)=1}}^{d} \Delta(\tau_k; X, d, a)^{2} \sim a_k(d)\, \gamma_k(c)\, X (\log d)^{k^{2}-1}, \]
where $a_k(d)$ is the arithmetic constant
\[ a_k(d) = \lim_{s \to 1^{+}} (s - 1)^{k^{2}} \sum_{\substack{n = 1 \\ (n,d)=1}}^{\infty} \frac{\tau_k(n)^{2}}{n^{s}}, \]

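and $\gamma_k(c)$ is a piecewise polynomial of degree $k^{2} - 1$ defined by
\[ \gamma_k(c) = \frac{1}{k!\, G(k+1)^{2}} \int_{[0,1]^{k}} \delta_c(w_1 + \cdots + w_k)\, \Delta(w)^{2}\, d^{k}w, \]
where $\delta_c(x) = \delta(x - c)$ is a Dirac delta function centered at c, and $\Delta(w) = \prod_{i<j}(w_i - w_j)$.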

\[ \sum_{d \le D} \sum_{\substack{a = 1 \\ (a,d)=1}}^{d} \Delta(\tau_k; X, d, a)^{2} \ll \big( D + X^{1 - \frac{1}{6(k+2)}} \big) X (\log X)^{k^{2}-1}. \tag{1.11} \]

This result is of Barban-Davenport-Halberstam type. In forthcoming work [27], we replace the upper bound in (1.11) by an asymptotic equality for the ternary divisor function $\tau_3(n)$ with the condition $(a, d) = 1$ removed.

Lastly, motivated by the recent work [19] of Heath-Brown and Li in 2017, we also prove analogous bilinear estimates over hyperbolas $m \equiv an \ (\mathrm{mod}\ d)$ for pairs of $\tau_k(n)$'s and $\tau_k(n)\Lambda(n)$ to moduli d that can be taken to be almost as large as $X^{2-\epsilon}$.

Theorem 4. For $k \ge 4$ and any $\epsilon > 0$ there holds
\[ \sum_{d \le D} \sum_{\substack{a = 1 \\ (a,d)=1}}^{d} \Bigg( \sum_{\substack{m, n \le X \\ m \equiv an \ (\mathrm{mod}\ d)}} \tau_k(m)\tau_k(n) - \frac{1}{\varphi(d)} \Bigg( \sum_{\substack{n \le X \\ (n,d)=1}} \tau_k(n) \Bigg)^{2} \Bigg)^{2} \ll X^{4 - \frac{1}{3(k+4)}} \tag{1.12} \]
for any $D \le X^{2 - \frac{1}{3(k+2)}}$. In particular, the above estimate is valid if one of the $\tau_k$ is replaced by the von Mangoldt function $\Lambda$. We have
\[ \sum_{d \le D} \sum_{\substack{a = 1 \\ (a,d)=1}}^{d} \Bigg( \sum_{\substack{m, n \le X \\ m \equiv an \ (\mathrm{mod}\ d)}} \tau_k(m)\Lambda(n) - \frac{X}{\varphi(d)} \sum_{\substack{n \le X \\ (n,d)=1}} \tau_k(n) \Bigg)^{2} \ll X^{4 - \frac{1}{3(k+4)}} \tag{1.13} \]
for any $D \le X^{2 - \frac{1}{3(k+2)}}$.

It might look surprising at first that the moduli in Theorem 4 can be taken almost as large as $X^{2}$, but the proof is in fact rather simple; the proof of Theorem 4 follows essentially from the large sieve inequality. Assuming the Generalized Riemann Hypothesis, it might be possible to show that the estimates (1.12) and (1.13) hold in a larger range for d with the right side replaced by
\[ \begin{cases} X^{2-\delta}, & \text{for } 1 \le D \le X^{1+\epsilon}, \\ X^{2} D (\log X)^{k^{2}}, & \text{for } X^{1+\epsilon} < D \le X^{2}, \end{cases} \]
for some constant $\delta > 0$. We note that the moduli d in Theorems 3 and 4 need not be smooth as in Theorems 1 and 2.

1.3. Acknowledgments

I wish to express gratitude to my Ph.D. advisor Zhang YiTang for introducing me to this problem, and for his guidance and encouragement. I would also like to thank the referee for their careful reading and very helpful suggestions, which greatly improved the presentation of the paper. Additionally, I am grateful to M. Ram Murty, Matthew Welsh, Carl Pomerance, Kim SungJin, Mits Kobayashi, and Garo Sarajian for helpful mathematical conversations. I'd also like to thank Birge Huisgen-Zimmermann, Jeff Stopple, and Brad Rodgers for their feedback and interest in this project. Further thanks to Hector Ceniceros, Dave Morrison, Mihai Putinar, Alan Krinik, Eugene Lipovetsky, Kai S. Lam, Ester Trujillo and the UCSB Graduate Scholars Program for their support in the early stage. Lastly I acknowledge the Mathematics department at UCSB, in particular Medina Price, my office mates and neighbors for a comfortable working environment leading up to the completion of this paper.

2. Notation and sketch of proof

2.1. Notation

N = {1, 2, 3, ...}.
p–a prime number.
a, b, c–integers.
d, n, m, k, q, r, s, Q, R–positive integers.
Λ(q)–the von Mangoldt function.
τk(q)–the k-fold divisor function; τ2(q) = τ(q).
ϕ(n)–Euler's totient function.
s = σ + it.
X–a large real number.
L = log X.
χ(n)–a Dirichlet character.
e(y)–the additive character exp{2πiy}.
$e_d(y) := \exp\{2\pi i y / d\}$.
$\hat{f}$–the Fourier transform of f, i.e.,
\[ \hat{f}(z) = \int_{-\infty}^{\infty} f(y) e(yz)\, dy. \]
m ≡ a(q) means m ≡ a (mod q).
q ∼ Q means Q ≤ q < 2Q.
ε–any sufficiently small, positive constant, not necessarily the same in each occurrence.
B–some positive constant, not necessarily the same in each occurrence.
$\|\alpha\|$–the $\ell^{2}$ norm of $\alpha = (\alpha(m))$, i.e.,
\[ \|\alpha\| = \Big( \sum_{m} |\alpha(m)|^{2} \Big)^{1/2}. \]
$\chi_N$–the characteristic function of the subset $[N, (1+\rho)N) \subset \mathbb{R}$.
$\sum'_{\chi (\mathrm{mod}\ d)}$–a summation over nonprincipal characters χ (mod d).
$\sum^{*}_{\chi (\mathrm{mod}\ d)}$–a summation over primitive characters χ (mod d).
$\sum_{b (\mathrm{mod}\ q)}$–means $\sum_{b=1}^{q}$.
$\sum^{*}_{b (\mathrm{mod}\ q)}$–means $\sum_{b=1,\ (b,q)=1}^{q}$.

Table 3: Table of parameters and their first appearance.

Parameters                                          First appearance
$\varpi = 1/1168$                                   (1.8)
$\theta_k = \min\{\frac{1}{12(k+2)}, \varpi^{2}\}$   (1.9)
$Q_0 = X^{1/12(k+1)}$                               (4.1)
$D_0 = X^{\varpi^{4/3}}$                            (4.93)
$D_1 = X^{\varpi}$                                  (3.28)
$D_2 = X^{1/2 - 1/12(k+1)}$                         (4.2)
$D_3 = X^{1/2 + 2\varpi}$                           (3.23)
$P_0 = \prod_{p \le D_0} p$                         (3.24)
$P_1 = \prod_{p \le D_1} p$                         (3.25)
$\rho = X^{-\varpi}$                                (4.11)
$X_1 = X^{3/8 + 8\varpi}$                           (4.61)
$X_2 = X^{1/2 - 4\varpi}$                           (4.61)
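The rational exponents in Table 3 can be checked with exact arithmetic. The sketch below records each parameter as the exponent of X; the names `VARPI`, `theta`, and `exponents` are hypothetical labels introduced here, and $D_0$ is omitted because its exponent is not rational.

```python
from fractions import Fraction

VARPI = Fraction(1, 1168)  # the constant in (1.8)


def theta(k):
    """theta_k = min{1/(12(k+2)), varpi^2} as in (1.9)."""
    return min(Fraction(1, 12 * (k + 2)), VARPI ** 2)


def exponents(k):
    """Exponents e such that the named parameter equals X**e."""
    return {
        "Q0": Fraction(1, 12 * (k + 1)),
        "D1": VARPI,
        "D2": Fraction(1, 2) - Fraction(1, 12 * (k + 1)),
        "D3": Fraction(1, 2) + 2 * VARPI,
        "rho": -VARPI,
        "X1": Fraction(3, 8) + 8 * VARPI,
        "X2": Fraction(1, 2) - 4 * VARPI,
    }
```

With these values, $1/2 + 2\varpi = 293/584$ and $1/8 - 4\varpi = 71/584$. Taking (1.9) as written, the minimum equals $\varpi^{2}$ for every k up to 113683, so the term $1/12(k+2)$ only matters for extremely large k.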

We follow standard notation and write $f(X) = O(g(X))$ or $f(X) \ll g(X)$ to mean that $|f(X)| \le C g(X)$ for some fixed constant C, and $f(X) = o(g(X))$ if $|f(X)| \le c(X) g(X)$ for some function $c(X)$ that goes to zero as X goes to infinity. The sequences $\alpha(n)$ and $\beta(n)$ we consider are all real; in particular, the absolute value sign is not needed in several expressions.

2.2. Sketch of the proof of the main theorem

Here, and in the rest of the paper, we fix an integer $k \ge 4$, unless specified otherwise. To prove (1.10) we follow standard practice and split the summation over moduli d into two sums: one over $d < X^{1/2 - \delta}$, which are called small moduli, and the other over $X^{1/2 - \delta} \le d < X^{1/2 + 2\varpi}$, which are called large moduli.

For small moduli, we estimate (1.10) directly using the large sieve inequality together with a direct substitute for the Siegel-Walfisz condition. For the von Mangoldt function $\Lambda(n)$, the Möbius function $\mu(n)$ is involved and, hence, the Siegel-Walfisz theorem is needed to handle very small moduli. For us, fortunately, $\tau_k$ is simpler than $\Lambda$ in that $\mu$ is absent; this feature of $\tau_k$ allows us to get a sharper bound in place of the Siegel-Walfisz theorem; see Lemma 8 below. The constant here is effective.

For large moduli, we adapt the methods of Zhang in [34] to bound the error term, as follows. After applying suitable combinatorial arguments, we split $\tau_k$ into appropriate convolutions of Type I, II, and III, as modeled in [34]. We treat Types I and II in our Case (b), Type III in our Case (c), and Case (a) corresponds to a trivial case which we treat directly. The main ingredients in Case (b) are the dispersion method and the Weil bound on Kloosterman sums. Case (c) depends crucially on the factorization $d = qr$ of the moduli to Weyl shift a certain incomplete Kloosterman sum to the modulus r. The shift modulo this r then induces a Ramanujan sum, which is known to have better than square-root cancellation. This allows for a saving of a power of r, and since d is a multiple of r, and d is less than X, this saves a small power of X from the trivial bound.

3. Preliminary lemmas

We collect here lemmas that shall be used to prove our theorems. Some lemmas are standard and we quote directly from the literature.

Lemma 1. For any $\epsilon > 0$ we have
\[ \tau_j(n) \ll n^{\epsilon}. \tag{3.1} \]
Proof. See [21, Equation (1.81)].

Lemma 2. Let $\gamma$ be an arithmetic function. If $\chi \ (\mathrm{mod}\ d)$ is nonprincipal, then there exists a unique $q \mid d$, $q > 1$, and a unique primitive character $\chi^{*} \ (\mathrm{mod}\ q)$, such that, with $r = d/q$,
\[ \sum_{n} \gamma(n)\chi(n) = \sum_{(n,r)=1} \gamma(n)\chi^{*}(n). \]

Proof. See, e.g., [9, Section 5] for the definition of characters and proofs.

In reducing nonprincipal characters, whose moduli may not be too small, to primitive characters for the application of the large sieve inequality, very small moduli of the primitive characters may occur. We treat contributions from those small moduli via the following lemma.

Lemma 3. Let $\chi$ be a primitive character (mod d). For $d < X^{\frac{1}{3(k+1)}}$ we have
\[ \sum_{n \le X} \tau_k(n)\chi(n) \ll X^{1 - \frac{1}{3(k+2)}}. \tag{3.2} \]

Proof. Decompose the interval $[1, X]$ into dyadic intervals of the form $[N, 2N)$. Denote by
\[ \psi(\chi) = \sum_{n \sim N} \tau_k(n)\chi(n). \]
Let $0 < \eta < 1$ be a parameter to be specified later (see (3.14) below). Let $f(x)$ be a function of $C^{\infty}(-\infty, \infty)$ class such that $0 \le f(y) \le 1$,
\[ f(y) = 1 \quad \text{if } N \le y \le 2N, \]
\[ f(y) = 0 \quad \text{if } y \notin [N - N^{\eta},\, 2N + N^{\eta}], \]
and obeying the derivative bound
\[ f^{(j)}(y) \ll N^{-j\eta}, \quad j \ge 1, \tag{3.3} \]
where the implied constant depends on $\eta$ and j at most. Let
\[ \psi^{*}(\chi) = \sum_{n=1}^{\infty} \tau_k(n)\chi(n) f(n). \tag{3.4} \]

By (3.1), we have
\[ \psi^{*}(\chi) - \psi(\chi) = \sum_{N - N^{\eta} \le n \le N} \tau_k(n)\chi(n) + \sum_{2N \le n \le 2N + N^{\eta}} \tau_k(n)\chi(n) \ll N^{\eta + \epsilon} \]
for any $\epsilon > 0$. Let
\[ F(s) = \int_{0}^{\infty} f(x) x^{s-1}\, dx \]
be the Mellin transform of $f(x)$. The integral defining $F(s)$ is absolutely convergent for $\sigma > 0$, with inverse Mellin transform
\[ f(x) = \frac{1}{2\pi i} \int_{(2)} F(s) x^{-s}\, ds, \tag{3.5} \]
where $\int_{(c)}$ denotes the integration $\int_{c - i\infty}^{c + i\infty}$ over the vertical line $c + it$ where t runs from $-\infty$ to $\infty$. Substituting (3.5) into (3.4) and changing the order of summation and integration, we get
\begin{align*}
\psi^{*}(\chi) &= \sum_{n=1}^{\infty} \tau_k(n)\chi(n) \left( \frac{1}{2\pi i} \int_{(2)} F(s) n^{-s}\, ds \right) \\
&= \frac{1}{2\pi i} \int_{(2)} F(s) \left( \sum_{n=1}^{\infty} \tau_k(n)\chi(n) n^{-s} \right) ds \\
&= \frac{1}{2\pi i} \int_{(2)} F(s) L(s, \chi)^{k}\, ds, \tag{3.6}
\end{align*}
where $L(s, \chi) = \sum_{n=1}^{\infty} \chi(n) n^{-s}$ is the Dirichlet series for $\chi$. Since the function $L(s, \chi)$, and thus $F(s)L(s, \chi)^{k}$, has no poles in $\sigma \ge 0$, we may move the line of integration in (3.6) from $\sigma = 2$ to $\sigma = 1/2$ and obtain
\[ \psi^{*}(\chi) = \frac{1}{2\pi i} \int_{(1/2)} F(s) L(s, \chi)^{k}\, ds. \tag{3.7} \]
We next estimate this integral by bounding the integrand and splitting the line of integration into two parts, over $|t| < T$ and $|t| \ge T$, then choosing T suitably (see (3.13) below). For $\sigma = 1/2$, we have the convexity bound; see, e.g., [21, Theorem 5.23],

\[ |L(s, \chi)|^{k} \ll d^{k/4} |s|^{k}. \tag{3.8} \]

We next obtain an upper bound for $F(s)$. On the line $\sigma = 1/2$, we have, by the definitions of $F(s)$ and $f(x)$,
\[ |F(s)| = \left| \int_{N - N^{\eta}}^{2N + N^{\eta}} f(x) x^{s-1}\, dx \right| \le \int_{N - N^{\eta}}^{2N + N^{\eta}} x^{-1/2}\, dx \ll N^{1/2}. \tag{3.9} \]
This bound is sufficient for bounding small $|t|$ in (3.7), but too large for $|t|$ large. To bound the contribution from large $|t|$ we fix an
\[ \ell > k + 1 \tag{3.10} \]

and apply integration by parts $\ell$ times to $F(s)$:
\[ F(s) = \frac{(-1)^{\ell}}{s(s+1)\cdots(s+\ell-1)} \int_{0}^{\infty} f^{(\ell)}(x) x^{s+\ell-1}\, dx. \]
Hence, by the derivative bound (3.3), $F(s)$ is bounded by
\[ |F(s)| \ll \frac{1}{|s|^{\ell}} N^{-\ell\eta + 1/2 + \ell} \ll \frac{1}{|s|^{\ell}} N^{(1-\eta)\ell + 1/2}. \tag{3.11} \]
This bound allows us to save an arbitrary negative power of $|s|$; we will use this bound for large $|t|$. We now split the integral in (3.7) into two and estimate each part individually. Let $s = 1/2 + it$. For $T > 2$, we can write $\psi^{*}(\chi)$ in (3.7) as
\[ \psi^{*}(\chi) = \frac{1}{2\pi i} \int_{|t| < T} F(s) L(s, \chi)^{k}\, ds + \frac{1}{2\pi i} \int_{|t| \ge T} F(s) L(s, \chi)^{k}\, ds. \]

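\[ \ll X^{5/6 + \epsilon} d^{k/4} + X^{1 - \frac{1}{3(k+1)} + \epsilon}. \]
Thus, if $d < X^{\frac{1}{3(k+1)}}$, then the above estimate is
\[ \ll X^{1 - \frac{1}{3(k+1)} + \epsilon} \ll X^{1 - \frac{1}{3(k+2)}} \]
for small enough $\epsilon$. This gives the estimate (3.2).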
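The exponent comparison behind the last step of Lemma 3 can be verified with exact fractions. This sketch assumes the two terms are $X^{5/6} d^{k/4}$ and $X^{1 - 1/3(k+1)}$ with $d$ replaced by its largest admissible value $X^{1/3(k+1)}$, and ignores the $\epsilon$ losses; the helper name `lemma3_exponents` is hypothetical.

```python
from fractions import Fraction


def lemma3_exponents(k):
    """Exponents of X in the closing estimate of Lemma 3 (epsilon ignored).

    first  : exponent of X^{5/6} * d^{k/4} at d = X^{1/(3(k+1))}
    second : 1 - 1/(3(k+1)), the exponent of the other term
    target : 1 - 1/(3(k+2)), the exponent claimed in (3.2)
    """
    d_exponent = Fraction(1, 3 * (k + 1))
    first = Fraction(5, 6) + Fraction(k, 4) * d_exponent
    second = 1 - Fraction(1, 3 * (k + 1))
    target = 1 - Fraction(1, 3 * (k + 2))
    return first, second, target
```

For every $k \ge 3$ the three exponents are strictly increasing, which leaves room to absorb the $\epsilon$ in the exponent.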

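Lemma 4. Let $\gamma$ be an arithmetic function. For $(a, d) = 1$ we have
\[ \Delta(\gamma; X, d, a) = \frac{1}{\varphi(d)} \sum_{\chi (\mathrm{mod}\ d)}{}^{\!\!\prime}\ \overline{\chi}(a) \left( \sum_{n \le X} \gamma(n)\chi(n) \right). \tag{3.15} \]

Proof. By the orthogonality relation
\[ \frac{1}{\varphi(d)} \sum_{\chi (\mathrm{mod}\ d)} \overline{\chi}(a)\chi(n) = \begin{cases} 1, & \text{if } n \equiv a\ (d), \\ 0, & \text{otherwise}, \end{cases} \]
we may write
\[ \sum_{\substack{n \le X \\ n \equiv a (d)}} \gamma(n) = \sum_{n \le X} \gamma(n) \left( \frac{1}{\varphi(d)} \sum_{\chi (\mathrm{mod}\ d)} \overline{\chi}(a)\chi(n) \right) = \frac{1}{\varphi(d)} \sum_{\chi (\mathrm{mod}\ d)} \overline{\chi}(a) \left( \sum_{n \le X} \gamma(n)\chi(n) \right). \tag{3.16} \]
If $\chi \ (\mathrm{mod}\ d)$ is principal, then
\[ \overline{\chi}(a) = 1, \qquad \sum_{n \le X} \gamma(n)\chi(n) = \sum_{\substack{n \le X \\ (n,d)=1}} \gamma(n). \]
Hence the contribution from the principal character gives the main term in (3.16) and the discrepancy $\Delta(\gamma; X, d, a)$ is given by a sum over nonprincipal characters. This gives (3.15).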
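The identity of Lemma 4 can be tested numerically. The sketch below is restricted to an odd prime modulus p, an assumption made here so that all characters can be built from a primitive root; the helper names are hypothetical.

```python
import cmath


def primitive_root(p):
    """Smallest primitive root modulo an odd prime p, found by brute force."""
    for g in range(2, p):
        x, seen = 1, set()
        for _ in range(p - 1):
            x = x * g % p
            seen.add(x)
        if len(seen) == p - 1:
            return g
    raise ValueError("no primitive root found; p must be an odd prime")


def characters_mod_prime(p):
    """All p - 1 Dirichlet characters mod p; index 0 is the principal one."""
    g = primitive_root(p)
    index, x = {}, 1
    for m in range(p - 1):          # index[g^m mod p] = m
        index[x] = m
        x = x * g % p

    def make(j):
        def chi(n):
            r = n % p
            if r == 0:
                return 0j
            return cmath.exp(2j * cmath.pi * j * index[r] / (p - 1))
        return chi

    return [make(j) for j in range(p - 1)]


def orthogonality_sum(p, a, n):
    """(1/phi(p)) * sum over chi of conj(chi(a)) * chi(n)."""
    chars = characters_mod_prime(p)
    return sum(chi(a).conjugate() * chi(n) for chi in chars) / (p - 1)


def delta_via_characters(gamma, X, p, a):
    """Right side of (3.15): the sum over nonprincipal characters only."""
    nonprincipal = characters_mod_prime(p)[1:]
    total = 0j
    for chi in nonprincipal:
        inner = sum(gamma(n) * chi(n) for n in range(1, X + 1))
        total += chi(a).conjugate() * inner
    return total / (p - 1)
```

Comparing `delta_via_characters` with the direct definition of the discrepancy, up to floating-point error, confirms that the principal character accounts exactly for the main term.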

The next lemma is the well-known multiplicative large sieve inequality.

Lemma 5. Let $\chi$ be a primitive character mod q. For $a(n)$ a sequence of complex numbers, we have
\[ \sum_{q \le Q} \sum_{\chi (\mathrm{mod}\ q)}{}^{\!\!*}\ \Bigg| \sum_{n \le N} a(n)\chi(n) \Bigg|^{2} \ll (Q^{2} + N) \sum_{n \le N} |a(n)|^{2}. \tag{3.17} \]
Proof. See [21, Theorem 7.13].

The next lemma is a truncated Poisson formula.

Lemma 6. Suppose that $\eta^{*} > 1$ and $X^{1/4} < M < X^{2/3}$. Let f be a function of $C^{\infty}(-\infty, \infty)$ class such that $0 \le f(y) \le 1$,
\[ f(y) = 1 \quad \text{if } M \le y \le \eta^{*} M, \]
\[ f(y) = 0 \quad \text{if } y \notin [(1 - M^{-\epsilon})M,\ (1 + M^{-\epsilon})\eta^{*} M], \]
and
\[ f^{(j)}(y) \ll M^{-j(1-\epsilon)}, \quad j \ge 1, \]
with the implied constant depending on $\epsilon$ and j at most. Then we have
\[ \sum_{m \equiv a (d)} f(m) = \frac{1}{d} \sum_{|h| < H} \hat{f}(h/d)\, e_d(-ah) + O(d^{-1}) \]
for any $H \ge d M^{-1 + 2\epsilon}$, where $\hat{f}$ is the Fourier transform of f.
Proof. See [2, Lemma 2].

Lemma 7. Suppose that $1 \le N < N' < 2X$, $N' - N > X^{\epsilon} d$, and $(c, d) = 1$. Then for $j, \nu \ge 1$ we have
\[ \sum_{N \le n \le N'} \tau_j(n)^{\nu} \ll (N' - N) L^{j^{\nu} - 1}, \tag{3.18} \]
and
\[ \sum_{\substack{N \le n \le N' \\ n \equiv c (d)}} \tau_j(n)^{\nu} \ll \frac{N' - N}{\varphi(d)} L^{j^{\nu} - 1}. \]
The implied constants depend on $\epsilon$, j, and $\nu$ at most.
Proof. See [30, Theorem 2].

In the next lemma we verify a substitute for the "Siegel-Walfisz" condition.

Lemma 8. Let $\beta = \beta_{i_1} * \cdots * \beta_{i_\ell}$, $1 \le i_1 \le i_2 \le \cdots \le i_\ell \le k$, and $\beta_j = \chi_{N_j}$, with $N := N_{i_1} N_{i_2} \cdots N_{i_\ell} \gg X^{\kappa}$ for some constant $\kappa > 0$. For $\chi$ a primitive character modulo $r \ll X^{\kappa}$, we have
\[ \sum_{n} \beta(n)\chi(n) \ll X^{-\kappa/12} N. \tag{3.19} \]

Proof. We first verify (3.19) for a single $\beta = \beta_i$. For the general case, it suffices to check that if $\beta_i$ and $\beta_j$ satisfy (3.19), then so does their convolution $\beta_i * \beta_j$.

Let $\beta = \chi_{N_i}$, $N = N_i \gg X^{\kappa}$. We proceed analogously to the proof of Lemma 3. Let $f(x)$ be a function of $C^{\infty}(-\infty, \infty)$ class such that $0 \le f(y) \le 1$,
\[ f(y) = 1 \quad \text{if } N \le y \le (1 + \rho)N, \]
\[ f(y) = 0 \quad \text{if } y \notin [N - N^{11/12},\ (1 + \rho)N + N^{11/12}], \]
and obeying the derivative bound
\[ f^{(j)}(y) \ll N^{-11j/12}, \quad j \ge 1, \]
where the implied constant depends on j. Let
\[ F(s) = \int_{0}^{\infty} f(x) x^{s-1}\, dx \]
denote the Mellin transform of $f(x)$. Let
\[ \psi(\chi) = \sum_{n} \beta(n)\chi(n) \]

15 and X ψ∗(χ) = β(n)χ(n)f(n). n Analogously, we have X X ψ∗(χ) − ψ(χ) = χ(n) + χ(n)  N 11/12 N−N 11/12≤n≤N (1+ρ)N≤n≤(1+ρ)N+N 11/12 and |L(1/2 + it, χ)|  r1/4|s|. Thus, 1 Z ψ∗(χ) = F (s)L(s, χ)ds 2πi 1 ( 2 ) 1 Z = F (1/2 + it)L(1/2 + it, χ)ds 2πi |t|

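Assume $r \ll X^{\kappa}$. We deduce
\[ \psi(\chi) \ll N^{11/12} + N^{2/3} r^{1/4} \ll N \frac{1}{N^{1/12}} + N \frac{r^{1/4}}{N^{1/3}} \ll N X^{-\kappa/12} + N \frac{X^{\kappa/4}}{X^{\kappa/3}} \ll N X^{-\kappa/12}. \]

Now assume $\beta_i$ and $\beta_j$ satisfy (3.19) with $N := N_i N_j \gg X^{\kappa}$. Write $N_i = X^{\kappa_i}$ and $N_j = X^{\kappa_j}$ so that $\kappa_i + \kappa_j \ge \kappa$. Since $\beta_i$ and $\beta_j$ satisfy (3.19), we have
\[ \sum_{n} \beta_i(n)\chi(n) \ll N_i X^{-\kappa_i/12} \]
and
\[ \sum_{n} \beta_j(n)\chi(n) \ll N_j X^{-\kappa_j/12}. \]
Thus, writing n as mn and separating variables, we get
\[ \sum_{n} \beta_i * \beta_j(n)\chi(n) = \sum_{m} \beta_i(m)\chi(m) \sum_{n} \beta_j(n)\chi(n) \ll N_i X^{-\kappa_i/12} N_j X^{-\kappa_j/12} \ll N X^{-\kappa/12}. \]
This completes the proof of Lemma 8.

Lemma 9. Let $\beta$ be given as in (4.58), with N given in (4.59) satisfying (4.60). Assume $R \le X^{-\varpi/6} N$. Then for any $q \ge 1$ and $(r, \ell)$, we have
\[ \sum_{r \sim R} \sum_{\ell (\mathrm{mod}\ r)}{}^{\!\!*}\ \Bigg( \sum_{\substack{n \equiv \ell (r) \\ (n,q)=1}} \beta(n) - \frac{1}{\varphi(r)} \sum_{(n,qr)=1} \beta(n) \Bigg)^{2} \ll \tau(q)^{B} N^{2} X^{-\frac{\varpi}{12}}. \tag{3.20} \]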

16 Proof. By M¨obiusinversion, the condition (n, q) = 1 may be removed at the cost of removing the τ(q)B factor on the right side of (3.20); see, e.g., [15, p. 21-22]. Thus it suffices to show X X∗ ∆(β; X, r, `)2  N 2X−$/12. (3.21) r∼R `(mod r) By (3.15), we have ! 2 0 2 1 X X ∆(β; X, r, `) = χ(a) β(n)χ(n) ϕ(r)2 χ(mod r) n≤X ! ! 1 X0 X X0 X = χ (a) β(n)χ (n) χ (a) β(n)χ (n) . ϕ(r)2 1 1 2 2 χ1(mod r) n≤X χ2(mod r) n≤X Summing over primitive `(mod r) and changing the order of summation, we get ! ! X∗ 1 X0 X0 X X X∗ ∆(β; X, r, `)2 = τ (n)χ (n) τ (n)χ (n) χ (a)χ (a). ϕ(r)2 k 1 k 2 1 2 `(mod r) χ1(mod r) χ2(mod r) n≤X n≤X a(mod r) By the orthogonality relation ( 1 X∗ 1, if χ1 = χ2, χ1(a)χ2(a) = (3.22) ϕ(r) 0, if χ 6= χ , a(mod r) 1 2 this becomes ! 2 ∗ 0 X 2 1 X X ∆(β; X, r, `) = β(n)χ(n) . ϕ(r) `(mod r) χ(mod r) n≤X We now reduce to primitive characters as in the proof of Proposition1. By Lemma2, we have  ! 2 ∗ ∗ X X 2 X 1 X 1 X X ∆(β; X, r, `)  log L  β(n)χ(n)  . s q r∼R `(mod r) s≤R 1

! 2 !2 ∗   1 X X X 1 2 X N k−1 β(n)χ(n)  (Q + N) β(n)  Q + NL . Q Q Q q∼Q χ(mod q) n≤X n≤X For R ≤ X−$/6N, this leads to (3.21).

The next lemma restricts d to moduli that have a 'well-factorable' property.

Lemma 10 (Factorization lemma). Write
\[ D_3 = X^{1/2 + 2\varpi}. \tag{3.23} \]
Suppose d is square-free such that $D_2 < d < D_3$,
\[ (d, P_0) < X^{\varpi}, \quad \varpi = 1/1168, \tag{3.24} \]
and
\[ (d, P_1) > X^{1/8 - 4\varpi}. \tag{3.25} \]
Then, for any $R^{*}$ satisfying
\[ X^{2\varpi} \le R^{*} \le X^{45\varpi} \tag{3.26} \]
or
\[ X^{3/8 + 7\varpi} \le R^{*} \le X^{1/2 - 2\varpi}, \tag{3.27} \]
there is a factorization $d = qr$ such that $X^{-\varpi} R^{*} < r < R^{*}$ and $(q, P_0) = 1$.

Proof. Since d is square-free, we may write d as $d = d_0 d_1 d_2$ with
\[ d_0 = (d, P_0), \]
\[ d_1 = \frac{(d, P_1)}{(d, P_0)} = \prod_{j=1}^{n} p_j, \quad D_0 < p_1 < p_2 < \cdots < p_n < D_1, \quad n \ge 2, \tag{3.28} \]
and
\[ d_2 = \prod_{\substack{p \mid d \\ p > D_1}} p. \]
We have $d_0 < X^{\varpi}$. By the first inequality in (3.26), $R^{*} \ge X^{2\varpi}$, and there is an $n' < n$ such that
\[ d_0 \prod_{j=1}^{n'} p_j < R^{*} \quad \text{and} \quad d_0 \prod_{j=1}^{n'+1} p_j \ge R^{*}. \]
Similarly, by (3.25), we have
\[ d_2 = \frac{d}{(d, P_1)} < X^{3/8 + 6\varpi}. \]
By the first inequality in (3.27), $R^{*} \ge X^{3/8 + 7\varpi}$, and there is an $n'' < n$ such that
\[ d_2 \prod_{j=1}^{n''} p_j < R^{*} \quad \text{and} \quad d_2 \prod_{j=1}^{n''+1} p_j \ge R^{*}. \]
The assertion follows by choosing
\[ r = d_0 r_i, \quad i = 1, 2, \]
where
\[ r_1 = d_0 \prod_{j=1}^{n'} p_j \quad \text{and} \quad r_2 = d_0 d_2 \prod_{j=1}^{n''} p_j \]
and noting that $r_i \ge R^{*}/p_{n'+1}$.

Lemma 11 ([34, Lemma 9]). Suppose that $H, N \ge 2$ and $(c, d) = 1$. Then we have
\[ \sum_{\substack{n \le N \\ (n,d)=1}} \min\{ H, \| c\bar{n}/d \|^{-1} \} \ll (dN)^{\epsilon} (H + N), \]
where $\bar{n}/d$ means $a/d \ (\mathrm{mod}\ 1)$ with $an \equiv 1 \ (\mathrm{mod}\ d)$.

We quote a crucial bound on an incomplete Kloosterman sum obtained in [34, Lemma 11].

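Lemma 12 ([34, Lemma 11]). Suppose that $N \ge 1$, $d_1 d_2 > 10$, and $|\mu(d_1)| = |\mu(d_2)| = 1$. Then for any $c_1$, $c_2$, and $\ell$, we have
\[ \sum_{\substack{n \le N \\ (n,d_1)=1 \\ (n+\ell,d_2)=1}} e\left( \frac{c_1 \bar{n}}{d_1} + \frac{c_2 \overline{(n+\ell)}}{d_2} \right) \ll (d_1 d_2)^{1/2} \tau(d_1 d_2) + \frac{(c_1, d_1)(c_2, d_2)(d_1, d_2)^{2} N}{d_1 d_2}. \tag{3.29} \]

In the case $d_2 = 1$, (3.29) becomes a Ramanujan sum
\[ \sum_{\substack{n \le N \\ (n,d_1)=1}} e_{d_1}(c_1 \bar{n}) \ll d_1^{1/2} \tau(d_1) + \frac{(c_1, d_1) N}{d_1}; \tag{3.30} \]
see, e.g., [2, Lemma 6] for a proof.

This next lemma is the Birch-Bombieri bound.

Lemma 13 ([34, Lemma 12]). Let
\[ T(k; m_1, m_2; q) = \sum_{\substack{\ell (q) \\ (\ell(\ell+k),q)=1}} \ \sum_{t_1 (q)}{}^{\!\!*} \ \sum_{t_2 (q)}{}^{\!\!*}\ e_q\big( \bar{\ell} t_1 - \overline{(\ell+k)} t_2 + m_1 \bar{t}_1 - m_2 \bar{t}_2 \big). \]
Suppose that q is square-free. Then we have
\[ T(k; m_1, m_2; q) \ll (k, q)^{1/2} q^{3/2} \tau(q). \]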
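Lemmas 12 and 13 concern incomplete and multi-variable sums, which are beyond a short example. As a simpler illustration of the square-root cancellation they rely on, the sketch below computes the classical complete Kloosterman sum $S(a, b; p) = \sum_{x=1}^{p-1} e_p(ax + b\bar{x})$ for a prime p and checks Weil's bound $|S(a, b; p)| \le 2\sqrt{p}$ for $p \nmid ab$. This is an illustrative assumption-level example, not the bound used in the paper; the function names are hypothetical.

```python
import cmath
from math import sqrt


def kloosterman(a, b, p):
    """Complete Kloosterman sum S(a, b; p) for a prime p."""
    total = 0j
    for x in range(1, p):
        x_inv = pow(x, -1, p)  # modular inverse; needs Python 3.8+
        total += cmath.exp(2j * cmath.pi * ((a * x + b * x_inv) % p) / p)
    return total


def weil_bound_holds(p):
    """Check |S(a, b; p)| <= 2*sqrt(p) for all a, b not divisible by p."""
    bound = 2 * sqrt(p) + 1e-9  # small tolerance for rounding error
    return all(
        abs(kloosterman(a, b, p)) <= bound
        for a in range(1, p)
        for b in range(1, p)
    )
```

When $b = 0$ and $p \nmid a$ the sum degenerates to a Ramanujan sum with value $-1$, which is far smaller than $\sqrt{p}$ and reflects the "better than square-root cancellation" mentioned in Section 2.2.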

4. Proof of the main result Theorem 1

We start with the proof of Theorem 1, which is the longest of the four. We begin by making some preliminary reductions. Writing
\[ Q_0 = X^{\frac{1}{12(k+1)}} \tag{4.1} \]
and
\[ D_2 = \frac{X^{1/2}}{Q_0} = X^{\frac{1}{2} - \frac{1}{12(k+1)}}, \tag{4.2} \]
we first show that contributions coming from moduli $d \le D_2$, which we call small moduli, are acceptable. (See Remark 3 below for a discussion on the dependency of $D_2$ on k.) The main ingredients in this first step are the multiplicative large sieve inequality (3.17) in conjunction with Lemma 3 to control contributions from primitive characters with very small moduli.

Proposition 1. Let
\[ \tau_k = \tau_\ell * \tau_s \tag{4.3} \]
with $k = \ell + s$, $\tau_\ell$ supported on $[M, 2M)$, $\tau_s$ supported on $[N, 2N)$, $M, N > X^{\frac{1}{6(k+1)}}$, and $MN = X$. For $D \le D_2$ we have
\[ \sum_{d \le D} \max_{(a,d)=1} |\Delta(\tau_k; X, d, a)| \ll X^{1 - \frac{1}{12(k+2)}}. \tag{4.4} \]

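Proof. As mentioned above, we estimate each $|\Delta(\tau_k; X, d, a)|$ directly using the large sieve inequality (3.17). By Lemma 4, we first reduce the task of estimating $|\Delta(\tau_k; X, d, a)|$ to a sum over nonprincipal characters; then, with Lemmas 2 and 3 and the factorization $d = qr$, we further reduce this sum to one involving only primitive characters, to which we apply the large sieve inequality (3.17) to obtain (4.4). By Lemma 4, the left side of (4.4) is
\[ \le \sum_{d \le D} \frac{1}{\varphi(d)} \sum_{\chi (\mathrm{mod}\ d)}{}^{\!\!\prime}\ \Bigg| \sum_{n \le X} \tau_k(n)\chi(n) \Bigg|. \]
By the bound $\frac{1}{\varphi(d)} \ll \frac{\log L}{d}$ and Lemma 2, this is
\[ \ll \log L \sum_{d \le D} \frac{1}{d} \sum_{\substack{d = qr \\ q > 1}} \ \sum_{\chi^{*} (\mathrm{mod}\ q)}{}^{\!\!*}\ \Bigg| \sum_{\substack{n \le X \\ (n,r)=1}} \tau_k(n)\chi^{*}(n) \Bigg| = \log L \sum_{r \le D} \frac{1}{r} \Bigg( \sum_{1 < q \le D/r} \frac{1}{q} \sum_{\chi (\mathrm{mod}\ q)}{}^{\!\!*}\ \Bigg| \sum_{\substack{n \le X \\ (n,r)=1}} \tau_k(n)\chi(n) \Bigg| \Bigg). \]
The sum $\sum_{r \le D} \frac{1}{r}$ contributes a factor of $\log D$. Thus, to show (4.4), it suffices to show that, for each fixed $r \le D$,
\[ \sum_{1 < q \le D/r} \frac{1}{q} \sum_{\chi (\mathrm{mod}\ q)}{}^{\!\!*}\ \Bigg| \sum_{\substack{n \le X \\ (n,r)=1}} \tau_k(n)\chi(n) \Bigg| \ll X^{1 - \frac{1}{12(k+1)}}. \tag{4.5} \]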

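Fix an $r \le D$. Recall $Q_0$ given as in (4.1). We split the range of primitive conductors $q \in (1, D/r]$ into two, one over $q < Q_0$ and the other over $Q_0 \le q \le D/r$, with the intention of applying Lemma 3 to the former and the large sieve inequality (3.17) to the latter. By Lemma 3, we have, for $q \le Q_0$,
\[ \sum_{n \le X} \tau_k(n)\chi(n) \ll X^{1 - \frac{1}{3(k+1)}}. \]
And since the number of characters with modulus less than $Q_0$ is at most $Q_0^{2}$, we get
\[ \sum_{1 < q < Q_0} \frac{1}{q} \sum_{\chi (\mathrm{mod}\ q)}{}^{\!\!*}\ \Bigg| \sum_{n \le X} \tau_k(n)\chi(n) \Bigg| \ll Q_0^{2} X^{1 - \frac{1}{3(k+1)}} \sum_{1 < q < Q_0} \frac{1}{q} \ll X^{1 - \frac{1}{6(k+2)}}. \]
For the remaining range $Q_0 \le q \le D/r$, a dyadic subdivision reduces the task to showing that
\[ \frac{1}{Q} \sum_{Q \le q \le 2Q} \sum_{\chi (\mathrm{mod}\ q)}{}^{\!\!*}\ \Bigg| \sum_{(n,r)=1} \tau_k(n)\chi(n) \Bigg| \ll X^{1 - \frac{1}{12(k+1)}} \tag{4.6} \]
for any $Q_0 \le Q \le D$.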

for any Q0 ≤ Q ≤ D. By (4.3), we have     X X X τk(n)χ(n) =  τ`(m)χ(m)  τs(n)χ(n) . (4.7) (n,r)=1 (m,r)=1 (n,r)=1 By (4.7) and Cauchy’s inequality, the left side of (4.6) is  21/2  21/2

1 X X∗ X X X∗ X ≤  τ`(m)χ(m)   τs(n)χ(n)  . Q Q≤q≤2Q χ(mod q) (m,r)=1 Q≤q≤2Q χ(mod q) (n,r)=1 By the large sieve inequality (3.17), the above quantity is 1 1 ≤ (Q2 + M)M 21/2 (Q2 + N)N 21/2 = (Q2 + M)1/2(Q2 + N)1/2X1/2+. Q Q √ √ √ By the inequality x + y < x + y, the above is bounded by 1 √ √ 1 √ √ < (Q + M)(Q + N)X1/2+ = (Q2 + Q M + Q N + X1/2)X1/2+ Q Q  √ √ X1/2  = Q + M + N + X1/2+. (4.8) Q 1/6(k+1) Since Q0 < Q < D2 and N, M > X , we have  1/2−1/12(k+1) Q < D2 = X , √ r r  X X 1/2−1/12(k+1)  M = < < X ,  N X1/6(k+1) √ r X r X  N = < < X1/2−1/12(k+1),  M X1/6(k+1)  1/2 1/2 X X 1/2−1/12(k+1)  < = X . Q Q0

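Combining the above estimates, (4.8) is
\[ \ll X^{1 - \frac{1}{12(k+1)} + \epsilon} \ll X^{1 - \frac{1}{12(k+2)}}. \]
This leads to (4.4).

Thus, by Proposition 1, Theorem 1 holds for $1 \le d \le D_2$. From this, to prove (1.10), it suffices to show that
\[ \sum_{\substack{d \in \mathcal{D} \\ D_2 < d < D_3}} |\Delta(\tau_k; X, d, a)| \ll X^{1 - \theta_k}. \tag{4.9} \]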

Remark 3. The cutoff parameter $D_2$ introduced in (4.2) separating small and large moduli unfortunately depends on k. We are unable to remove this dependency without appealing to GRH or the Lindelöf Hypothesis. This dependency on k is the result of the convexity bound (3.8) for $L(s, \chi)^{k}$ on the critical line, which depends on k. On GRH, the bound on $L(1/2 + it, \chi)^{k}$ can be made uniform in k; see Section 5 below for more.

4.1. Combinatorial argument

The goal of this subsection is to apply combinatorial arguments to reduce the proof of (4.9) to showing that
\[ \sum_{\substack{d \in \mathcal{D} \\ D_2 < d < D_3}} |\Delta(\gamma; X, d, a)| \ll X^{1 - \theta_k}, \tag{4.10} \]
where
\[ \gamma = \beta_1 * \beta_2 * \cdots * \beta_k \]
is a convolution of simpler arithmetic functions $\beta_j$.

Following the fundamental work of Friedlander and Iwaniec in their treatment of the ternary divisor function $\tau_3(n)$ in [14, Section 3], after decomposing the interval $[1, X]$ into $O(L^{B})$ dyadic intervals of the form $[N, 2N)$, we perform a finer-than-dyadic subdivision of the interval $[N, 2N)$ as follows. Let
\[ \rho = X^{-\varpi}. \tag{4.11} \]
Let R be the largest positive integer r for which $(1 + \rho)^{r} < 2X$. We have the following bound for R:
\[ R \le \frac{\log 2X}{\log(1 + \rho)} \ll \rho^{-1} \log X. \]
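The size of R can be explored numerically. The sketch below computes the largest r with $(1+\rho)^{r} < 2X$ for $\rho = X^{-\varpi}$ and compares it with the explicit bound $\log(2X)/(\rho \log 2)$, which follows from $\log(1+\rho) \ge \rho \log 2$ for $0 \le \rho \le 1$. The function name `subdivision_count` and the floating-point treatment are choices made here for illustration.

```python
import math


def subdivision_count(X, varpi):
    """Return (R, rho), where rho = X**(-varpi) and R is the largest
    integer r with (1 + rho)**r < 2X."""
    if X <= 1 or varpi <= 0:
        raise ValueError("need X > 1 and varpi > 0")
    rho = X ** (-varpi)
    ratio = math.log(2 * X) / math.log1p(rho)
    # largest integer strictly below ratio
    R = math.ceil(ratio) - 1
    return R, rho
```

For instance, `subdivision_count(1e6, 0.1)` uses a deliberately large value of the exponent so that $\rho$ is visibly small; with $\varpi = 1/1168$ and any X within floating-point range, $\rho$ stays close to 1 and R is of size comparable to $\log X$.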

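For $n \sim N$, we have $\tau_k(n) = T_1(n)$ where
\[ T_1(n) = \sum_{\mathbf{N} = (N_1, N_2, \cdots, N_k)} \big( \chi_{N_k} * \chi_{N_{k-1}} * \cdots * \chi_{N_1} \big)(n). \tag{4.12} \]
Here $N_1, N_2, \dots, N_k \ge 1$ run over the powers of $1 + \rho$ satisfying
\[ [N_k \cdots N_1,\ (1 + \rho)^{k} N_k \cdots N_1) \cap [N, 2N) \ne \emptyset. \tag{4.13} \]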

22 Let T2 have the same expression as T1 but with the constraint (4.13) replaced by k [Nk ··· N1, (1 + ρ) Nk ··· N1) ⊂ [N, 2N). (4.14) −k k Since T1 − T2 is supported on [(1 + ρ) N, (1 + ρ) N] and (T1 − T2)(n)  τk, by Lemma7 we have X 1−$2 |∆(T1 − T2; X, d, a)|  X . d∈D D2

γ = χNk ∗ χNk−1 ∗ · · · . ∗ χN1

νi with Nk,...,N1 satisfying (4.14) and Nk ≤ · · · ≤ N1. Write Ni = X . We have

0 ≤ νk ≤ · · · ≤ ν1, (4.15) and log ρ 0 ≤ ν + ··· + ν < 1 − k . (4.16) k 1 L We deduce the proof of (4.9) from the following Proposition 2. With the same notation as above, we have X 7/2 |∆(γ; X, d, a)|  X1−$ . (4.17) d∈D D2

By Lemma8, all the choices of βi = χNi above satisfy the Siegel-Walfisz condition (3.19). Noting that the sum in (4.12) contains R  ρ−1 log X  X$15/16 terms, by the above discussion, we conclude that (4.17) implies (4.9). We start with the proof of Case (a), the simplest of the three.

4.2. Proof of Case (a) This is the simplest case of the three. Let

β = β1,N = N1, and α = β2 ∗ · · · ∗ βk,M = N2 ··· Nk, 5/8−8$ so that γ = α ∗ β. Since ν1 ≥ 5/8 − 8$ in this case, by Lemma8 with κ = X , we have ∆(β; X, d, a)  X5/16−4$N.

By definition of ∆(γ; X, d, a) and bounding

X X B α(m)  τk(m)  ML m∼M m∼M trivially, we get

X X ∆(γ; X, d, a) ≤ α(m) |∆(β; X, d, am)|  MLB max ∆(α; X, d, a)  X5/16−5$. (a,d)=1 (m,d)=1 m∼M Thus, X 5/16−5$ 13/16−3$ ∆(γ; X, d, a)  X D3 ≤ X .

d≤D3 This leads to (4.10).

4.3. Proof of Case (b) This case corresponds to the Type III estimate in [34, §§13, 14], which we will follow closely. The main tool we need is the Birch-Bombieri bound from Lemma 13. We will reduce ∆(γ; X, d, a) into an exponential sum of that form; see (4.51). Write α = β4 ∗ · · · ∗ βk so that γ = α ∗ β1 ∗ β2 ∗ β3. Let M = N4 ··· Nk.

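Note that $\alpha * \beta_1$ is supported on $[M N_1, 2 M N_1)$. We have the following lemma.

Lemma 14. Suppose that
\[ \nu_1 < \frac{5}{8} - 8\varpi \tag{4.18} \]
and, for any subset $I \subseteq \{1, 2, \dots, k\}$,
\[ \sum_{i \in I} \nu_i \notin \left[ \frac{3}{8} + 8\varpi, \ \frac{5}{8} - 8\varpi \right]. \tag{4.19} \]
Then we have
\[ \nu_3 + \nu_2 > \frac{5}{8} - 8\varpi. \]

Proof. By virtue of (4.19) with $I = \{2, 3\}$, it suffices to show that
\[ \nu_3 + \nu_2 > \frac{3}{8} + 8\varpi. \]
By (4.18) and (4.19) we have
\[ \nu_1 < \frac{3}{8} + 8\varpi. \]
By (4.15),
\[ \nu_2 \le \nu_1 < \frac{3}{8} + 8\varpi. \]
By the first inequality in (4.16),
\[ \nu_2 + \cdots + \nu_k = (\nu_k + \cdots + \nu_1) - \nu_1 > 1 - \left( \frac{3}{8} + 8\varpi \right) = \frac{5}{8} - 8\varpi. \]
Hence, by (4.16), there is an $\ell$ with $3 \le \ell \le k$ such that
\[ \nu_2 + \cdots + \nu_{\ell - 1} < \frac{3}{8} + 8\varpi \quad \text{and} \quad \nu_2 + \cdots + \nu_\ell > \frac{5}{8} - 8\varpi, \]
so that
\[ \nu_\ell > \frac{1}{4} - 16\varpi. \]
Thus, by (4.15) and (4.19), we conclude
\[ \nu_3 + \nu_2 \ge 2\nu_\ell > \frac{1}{2} - 32\varpi > \frac{3}{8} + 8\varpi. \]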
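The last inequality in the proof, $\frac{1}{2} - 32\varpi > \frac{3}{8} + 8\varpi$, holds exactly when $40\varpi < \frac{1}{8}$. The following sketch checks this with exact rational arithmetic; the function name `final_step_holds` is hypothetical, and the threshold is computed here for illustration rather than quoted from the paper.

```python
from fractions import Fraction


def final_step_holds(varpi: Fraction) -> bool:
    """Check 1/2 - 32*varpi > 3/8 + 8*varpi, the last step of the proof of Lemma 14."""
    return Fraction(1, 2) - 32 * varpi > Fraction(3, 8) + 8 * varpi


# Rearranging gives 1/8 > 40*varpi, i.e. varpi < (1/8)/40.
threshold = Fraction(1, 8) / 40
print(threshold)
print(final_step_holds(Fraction(1, 321)), final_step_holds(Fraction(1, 320)))
```

Exact fractions are used so that the boundary case is decided without floating-point error.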

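We have
\[ N_1 \ge N_2 \ge \left( \frac{X}{M N_1} \right)^{1/2} \ge X^{5/16 - 4\varpi}, \tag{4.20} \]
and
\[ N_3 \ge \frac{X}{M N_1 N_2} \ge X^{1/4 - 16\varpi}. \tag{4.21} \]
By Lemma 14 we have
\[ \nu_3 + \nu_2 > \frac{5}{8} - 8\varpi. \]
Hence,
\[ M N_1 \ll \frac{X}{N_2 N_3} \ll X^{3/8 + 8\varpi}. \tag{4.22} \]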

Let $f$ be as in Lemma 6 with $\eta^* = \eta$, $\varepsilon = \varpi$, and with $N_1$ in place of $M$. Note that the function $\beta_1 - f$ is supported on $[N_1^-, N_1] \cup [\eta N_1, \eta N_1^+]$ with $N_1^\pm = (1 \pm N_1^{-\varpi}) N_1$. Letting $\gamma^* = \alpha * \beta_2 * \beta_3 * f$, we have

X ∗ X (γ − γ )(n)  α ∗ β2 ∗ β3(n)β1(m) − α ∗ β2 ∗ β3(n)f(m) (n,d)=1 (mn,d)=1 X X = α ∗ β2 ∗ β3(n) (β1(m) − f(m)) (n,d)=1 (m,d)=1 ϕ(d)  MN N M (N 1−$ + ηN 1−$) d 2 3 1 1 ϕ(d)  X1−$/2 d and

X ∗ X X X X (γ − γ )(n)  τk(`) + τk(`) n≡a(d) − 1≤`<3x/q + 1≤`<3x/q N1 ≤q≤N1 ηN1≤q≤ηN1 (q,d)=1 `q≡a(d) (q,d)=1 `q≡a(d) ϕ(d) X1−$/2  MN N N 1−$LB  . d2 2 3 1 d It therefore suffices to prove (4.10) with γ replaced by γ∗. We shall prove the bound X1−$/3 ∆(γ∗; d, c)  . (4.23) d We remove the condition (n, d) = 1 by means of M¨obiusinversion formula ( X 1, if (n, d) = 1, µ(δ) = (4.24) 0, otherwise. δ|(n,d) By (4.24) we have X X X X X f(n) = f(n) µ(δ) = µ(δ) f(n). (n,d)=1 n δ|n δ|d n≡0(δ) δ|d For δ > N 1−2, the inner sum on the right side is X N f(n)   N 2. N 1−2 n≡0(δ) This yields X X X X µ(δ) f(n) = µ(δ) f(n) + O(N ) = fˆ(0) + O(N ). δ (n,d)=1 δ|d n≡0(δ) δ|d δ≤N 1−2 δ≤N 1−2

26 Since X µ(δ) ϕ(d) = , δ d δ|d the above becomes X ϕ(d) f(n) = fˆ(0) + O(N ). (4.25) d (n,d)=1 By (4.20), this yields

1 X fˆ(0) X X X γ∗(n) = α(m) + O(d−1X3/4). ϕ(d) d (n,d)=1 (m,d)=1 n3'N3 n2'N2 (n3,d)=1 (n2,d)=1

Here $n \simeq N$ stands for $N \le n \le \eta N$. On the other hand, we have

X ∗ X X X X γ (n) = α(m) f(n1).

n≡a(d) (m,d)=1 n3'N3 n2'N2 mn3n2n1≡a(d) (n3,d)=1 (n2,d)=1

The innermost sum is, by Lemma6, equal to

1 X fˆ(h/d)e (−ahmn n ) + O(d−1) d d 3 2 |h|

where ∗ −1+2 H = dN1 . It follows that the left side of (4.23) is

1 X X X X = α(m) fˆ(h/d)e (−ahmn n ) + O(d−1X3/4). d d 3 2 m'M n3'N3 n2'N2 1≤|h|

Hence the proof of (4.23) is reduced to showing that

X X X ˆ 1−$/2+2 −1 f(h/d)ed(ahn3n2)  X M (4.26) ∗ 1≤h

for any a with (a, d) = 1. Substituting d1 = d/(h, d) and applying M¨obiusinversion, the left side of (4.26) can be written as

X X X X ˆ f(h/d1)ed1 (ahn3n2)

d1|d 1≤h

d1d2=d b3|d2 b2|d2 1≤h

27 where −1+2 H = d1N1 . (4.27) It therefore suffices to show that

X X X ˆ 1−$/2+ −1 f(h/d1)ed1 (bhn3n2)  X M (4.28) 0 0 1≤h

for any $d_1$, $b$, $N_3'$, and $N_2'$ satisfying
\[ d_1 \mid d, \quad (b, d_1) = 1, \quad \frac{d_1 N_3}{d} \le N_3' \le N_3, \quad \frac{d_1 N_2}{d} \le N_2' \le N_2, \tag{4.29} \]
which are henceforth assumed. By the lower bound (4.20) we have
\[ H \ll X^{3/16 + 6\varpi + \varepsilon}. \tag{4.30} \]

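By (4.27), the left side of (4.28) is void if $d_1 \le N_1^{1 - 2\varepsilon}$, so we may assume that $d_1 > N_1^{1 - 2\varepsilon}$. By the trivial bound
\[ \hat{f}(z) \ll N_1 \tag{4.31} \]
and the bound (3.30), we find that the left side of (4.28) is
\[ \ll H N_3 N_1 (d_1^{1/2 + \varepsilon} + d_1^{-1} N_2) \ll d_1^{3/2 + \varepsilon} N_1^{2\varepsilon} N_3. \]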

If $d_1 \le X^{5/12 - 6\varpi}$, then the right side of the above is $\ll X^{1 - \varpi + 3\varepsilon}$ by (4.22) and (4.61), and this leads to (4.28). Thus we may further assume that
\[ d_1 > X^{5/12 - 6\varpi}. \tag{4.32} \]
We appeal to the Weyl shift and the factorization $d_1 = rq$. By Lemma 10, with $d_1$ in place of $d$, we can choose a factor $r$ of $d_1$ such that
\[ X^{44\varpi} < r < X^{45\varpi}. \tag{4.33} \]

Write X X X ˆ N (d1, k) = f(h/d1)ed1 (bh(n2 + hkr)n3), 0 0 1≤h 0. We have

N (d1, k) − N (d1, 0) = Q1(d1, k) − Q2(d1, k), (4.34) where X X X ˆ Qi(d1, k) = f(h/d1)ed1 (bh`n3), i = 1, 2, 0 1≤h

28 By M¨obiusinversion, we have X X X X ˆ Qi(d1, k) = µ(s) f(h/t)et(bh`n3). 0 st=d1 1≤h

The $h$ sum is empty unless $s < H$. Since $H^2 = o(d_1)$ by (4.30) and (4.32), it follows, by changing the order of summation, that

X X X X ˆ |Qi(d1, k)| ≤ f(h/t)et(bh`n3) , 0 st=d1 n3'N3 `∈Ii(H) h∈Ji(s,`) t>H (n3,d1)=1 (`,d1)=1 where Ji(s, `) is some interval of length at most H depending on s and `. By integration by parts, we have d fˆ(z)  min{N 2, |z|−2N }, dz 1 1 and by partial summation and (4.31) we obtain

X ˆ 1+ −1 f(h/t)et(bh`n3)  N1 min{H, kb`n3/tk }.

h∈Ji(s,`) Thus, 1+ X X X −1 Qi(d1, k)  N1 min{H, kb`n3/tk }.

t|d1 `∈Ii(H) n3<2N3 t>H (`,d1)=1 (n3,d1)=1 1+ Since H = o(N3) by (4.21) and (4.30), the inner most sum is  N3 , by Lemma 11. By (4.27), this leads to 1+ Qi(d1, k)  d1 krN3. (4.35) We now introduce the parameter

\[ K = [X^{-1/2 - 48\varpi} N_1 N_2], \]
which is $\gg X^{1/8 - 56\varpi}$ by (4.20). By the second inequality in (4.33), the right side of (4.35) is $\ll X^{1 - \varpi + \varepsilon} M^{-1}$ if $k < 2K$. Hence, by (4.34), the proof of (4.28) is reduced to showing that
\[ \frac{1}{K} \sum_{k \sim K} \mathcal{N}(d_1, k) \ll X^{1 - \varpi/2 + \varepsilon} M^{-1}. \tag{4.36} \]
We now prove (4.36). By the relation

h(n2 + hkr) ≡ ` + kr (mod d1)

for (h, d1) = (n2 + hkr, d1) = 1, where ` ≡ hn2 (mod d1), we may rewrite N (d1, k) as X X N (d1, k) = ν(`; d1) ed1 (b(` + kr)n3), 0 ` (mod d1) n3'N3 (`+kr,d1)=1 (n3,d1)=1

29 where X0 ˆ ν(`; d1) = f(h/d1).

hn2≡`(d1) 0 X 0 Here is the restriction to 1 ≤ h < H,(h, d1) = 1, and n2 ' N2. It follows by Cauchy’s inequality that 2

X N (d1, k) ≤ P1P2, (4.37) k∼K where 2

X 2 X X X P1 = |ν(`; d1)| and P2 = ed1 (b(` + kr)n) . 0 ` (mod d1) ` (mod d1) k∼K n'N3 (`+kr,d1)=1 (n,d1)=1 By (4.31) we have 2 P1  N1 N4, where

0 N4 = #{(h1, h2; n1, n2): h2n1 ≡ h1n2 (mod d1), 1 ≤ hi < H, ni ' N2}  2 X  X    τ(m) .   ` (mod d1) 1≤m<2HN2 m≡`(d1)

1+ Since HN2  d1 by (4.27), we get 1+ 2 P1  d1 N1 . (4.38) We claim that 3/16+52$+ 2 P2  d1X K . (4.39) With this, the estimate (4.36) follows from (4.37), (4.38), and (4.39) immediately since

3/8+8$ −1 1/2+2$ N1 ≤ X M , d1 < X , and 31 $ + 36$ = 1 − . (4.40) 32 2

It remains to prove (4.39). Write d1 = rq. Note that N 0 3  X1/6−69$ (4.41) r by (4.29), (4.32), (4.21), and the second inequality in (4.33). Since X X X ed1 (b(` + kr)n) = ed1 (b(` + kr)(nr + s)) + O(r), 0 0 n'N3 0≤s

30 we get X X ed1 (b(` + kr)n) = U(`) + O(Kr), 0 k∼K n'N3 (`+kr,d1)=1 (n,d1)=1 where X X X U(`) = ed1 (b(` + kr)(rn + s)). 0 0≤s

` (mod d1) By the second inequality in (4.33), the second term on the right side of the above is admissible for (4.39). On the other hand, we have

X 2 X X X X |U(`)| = V (k2 − k1; s1, s2), (4.42)

` (mod d1) k1∼K k2∼K 0≤s1

`(n1r + s1) − (` + kr)(n2r + s2)

d1 r2` (n r + s ) − r2(` + k)(n r + s ) q2s s ` (s − s ) ≡ 1 1 1 1 2 2 + 1 2 2 2 1 (mod 1). q r Thus, by the Chinese remainder theorem, the innermost sum in (4.43) is equal to

X 2 2 Cr(s2 − s1) eq(br `(n1r + s1) − br (` + k)(n2r + s2)), ` (mod q) (`(`+k),q)=1 and, thus, V (k; s1, s2) = W (k; s1, s2)Cr(s2 − s1), (4.44) where 0 X X X 2 2 W (k, s1, s2) = eq(br `(n1r + s1) − br (` + k)(n2r + s2)). 0 0 n1'N3/r n2'N3/r ` (mod q) (n1r+s1,q)=1 (n2r+s2,q)=1

Here $\sum'$ is the restriction to $(\ell(\ell + k), q) = 1$.

We first estimate the contribution from terms with k1 = k2 on the right side of (4.42) as follows. For (n1r + s1, q) = (n2r + s2, q) = 1, we have

∗ X 2 2 eq((br `(n1r + s1) − br `(n2r + s2)) = Cq((n1 − n2)r + s1 − s2). ` (mod q)

0 1/3 Since N3  X , by (4.32) and the second inequality in (4.33), we have N 0 3  X−1/12+6$  r−1. (4.45) d1

0 This implies N3/r = o(q), giving

X 1+ |Cq(nr + m)|  q 0 n'N3/r for any m. Thus 1+ −1 0 W (0; s1, s2)  q r N3. Substituting the above into (4.44) and using the simple estimate

X X 2+ |Cr(s2 − s1)|  r ,

0≤s1

X X X X 3/16+52$+ 2 V (k2 − k1; s1, s2)  d1X K . (4.46)

k1∼K k2∼K 0≤s1

0 0 00 0 n = min{n : n ' N3/r}, n = max{n : n ' N3/r}, we may rewrite W (k; s1, s2) as     X X X0 n1 n2   F F e br2`(n r + s ) − br2(` + k)(n r + s ) , q q q 1 1 2 2 n1≤q n2≤q ` (mod q) (n1r+s1,q)=1 (n2r+s2,q)=1 where F (y) is a function of C2[0, 1] class such that

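\[ 0 \le F(y) \le 1, \]
\[ F(y) = \begin{cases} 1, & \text{if } y \in \left[ \dfrac{n'}{q}, \dfrac{n''}{q} \right], \\[2mm] 0, & \text{if } y \notin \left[ \dfrac{n'}{q} - \dfrac{1}{2q}, \ \dfrac{n''}{q} + \dfrac{1}{2q} \right], \end{cases} \]
for which the Fourier coefficient
\[ \kappa(m) = \int_0^1 F(y) e(-my) \, dy \]
satisfies
\[ \kappa(m) \ll \kappa^*(m) := \min \left\{ \frac{1}{r}, \frac{1}{|m|}, \frac{q}{m^2} \right\}. \tag{4.47} \]
Here we have used (4.45). Expanding $F(y)$ in a Fourier series, we obtain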

∞ ∞ X X W (k; s1, s2) = κ(m1)κ(m2)Y (k; m1, m2; s1, s2), (4.48)

m1=−∞ m2=−∞ where X X X0 Y (k; m1, m2; s1, s2) = eq(δ(`, k; m1, m2; n1, n2; s1, s2)),

n1≤q n2≤q ` (mod q) (n1r+s1,q)=1 (n2r+s2,q)=1

with

2 2 δ(`, k; m1, m2; n1, n2; s1, s2) = br `(n1r + s1) − br (` + k)(n2r + s2) + m1n1 + m2n2.

Moreover, if njr + sj ≡ tj (mod q), then nj ≡ r(tj − sj) (mod q) so that

m1n1 + m2n2 ≡ r(m1t1 + m2t2) − r(m1s1 + m2s2) (mod q).

Hence, on substituting njr + sj = tj, we may write Y (k; m1, m2; s1, s2) as

Y (k; m1, m2; s1, s2) = Z(k; m1, m2)eq (−r(m1s1 + m2s2)) , (4.49)

where ∗ ∗ 0 X X X 2 2 Z(k; m1, m2) = eq(br `t1 − b(r (` + k)t2) + r(m1t1 + m2t2)).

t1 (mod q) t2 (mod q) ` (mod q)

By (4.44), (4.48), and (4.49) we get

∞ ∞ X X X X V (k; s1, s2) = κ(m1)κ(m2)Z(k; m1, m2)J(m1, m2), (4.50)

0≤s1

where X X J(m1, m2) = eq(−r(m1s1 + m2s2))Cr(s2 − s1).

0≤s1

33 We now appeal to Lemma 13. By simple substitution we have 3 3 Z(k; m1, m2) = T (k, bm1r , −bm2r ; q), so Lemma 13 gives 1/2 3/2+ Z(k; m1, m2)  (k, q) q , (4.51) the right side does not depend on m1 and m2. We claim the following estimate ∞ ∞ X X ∗ ∗ 1+ κ (m1)κ (m2)|J(m1, m2)|  r . (4.52)

m1=−∞ m2=−∞ Combining the above two estimates together with (4.50) we obtain

X X 1/2 3/2+ 1+ V (k; s1, s2)  (k, q) q r .

0≤s1

X X 1/2  2 (k1 − k1, q)  q K ,

k1∼K k2∼K k26=k1 whence (4.39) follows. We now prove (4.52). We rewrite the left side of (4.52) as ∞ ∞ 1 X X X κ∗(m )κ∗(m + k)|J(m , m + k)|. r 1 2 1 2 m1=−∞ m2=−∞ 0≤k

X 2+ |J(m1, m2 + k)|  r (4.53) 0≤k

|t|

X X  |Cr(t)| eq(r(m1 + m2)s) ,

|t|

where $I_t$ is some interval of length less than $r$ depending on $t$. For any fixed $t$ and any square-free $r_1$, there are exactly $\tau(r_1/(t, r_1))$ distinct residue classes modulo $r_1$ such that
\[ s(s + t) \equiv 0 \pmod{r_1} \]
if and only if $s$ belongs to one of these classes. On the other hand, if $r = r_1 r_2$, then

X −1 eq (r(m1 + m2)s)  min{r2, kr2(m1 + m2)/qk }

s∈It s≡a(r1)

for any a. Hence the inner sum on the right side of (4.54) is

X −1  τ(r) min{r2, kr2(m1 + m2)/qk },

r2|r

which does not depend on t. This, together with the simple estimate X |Cr(t)|  τ(r)r, |t|

yields 2 X −1 J(m1, m2)  τ(r) r min{r2, kr2(m1 + m2)/qk }.

r2|r Thus the left side of (4.53) is

2 X X X −1  τ(r) r min{r2, kr2(m1 + m2 + k1r2 + k2)/qk }. (4.55)

r1r2=r 0≤k1

Assume r2|r. By the relation r q 1 2 ≡ − + (mod 1), q r2 qr2 we have X −1 min{r2, kr2(m + k)/qk }  r2L (4.56)

0≤k

4.4. Proof of Case (c)

We finish the proof of Theorem 1 with the last and most involved case. This case corresponds to the Type I and II estimates in [34, §§7-12]. Without loss of generality, assume there is a subset $I$ of $\{1, 2, \dots, k\}$ such that
\[ \frac{3}{8} + 8\varpi < \sum_{i \in I} \nu_i < \frac{1}{2} + \frac{\log 2}{2\mathcal{L}}. \tag{4.57} \]

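Let $J$ be the complement of $I$ in $\{1, 2, \dots, k\}$. Write
\[ \alpha = \beta_{j_1} * \beta_{j_2} * \cdots * \beta_{j_m}, \quad J = \{j_1, j_2, \dots, j_m\}, \]
and
\[ \beta = \beta_{i_1} * \beta_{i_2} * \cdots * \beta_{i_\ell}, \quad I = \{i_1, i_2, \dots, i_\ell\}, \tag{4.58} \]
so that $\gamma = \alpha * \beta$. We have $\alpha$ supported on $[M, 2M)$ and $\beta$ supported on $[N, 2N)$, where
\[ M = \prod_{j \in J} N_j, \quad N = \prod_{i \in I} N_i. \tag{4.59} \]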
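The regrouping $\gamma = \alpha * \beta$ relies only on the associativity and commutativity of Dirichlet convolution. The following toy sketch illustrates this under a simplifying assumption: every factor is the constant function 1 instead of the localized functions $\chi_{N_i}$ of the paper, so the convolutions are the divisor functions $\tau_k$. The helper `dirichlet_convolve` and the cutoff `limit` are hypothetical names introduced for illustration.

```python
def dirichlet_convolve(f, g, limit):
    """Return h with h[n] = sum over ab = n of f[a] * g[b], for 1 <= n <= limit.

    Index 0 of every list is unused padding.
    """
    h = [0] * (limit + 1)
    for a in range(1, limit + 1):
        if f[a] == 0:
            continue
        for b in range(1, limit // a + 1):
            h[a * b] += f[a] * g[b]
    return h


limit = 60
one = [0] + [1] * limit                      # the constant function 1
tau2 = dirichlet_convolve(one, one, limit)   # tau_2 = 1 * 1
tau3 = dirichlet_convolve(tau2, one, limit)  # tau_3 = 1 * 1 * 1

# Group five factors as (1 * 1) * (1 * 1 * 1), mimicking gamma = alpha * beta.
tau5_grouped = dirichlet_convolve(tau2, tau3, limit)

# Convolve the same five factors one at a time.
tau5_chain = one
for _ in range(4):
    tau5_chain = dirichlet_convolve(tau5_chain, one, limit)

print(tau2[12], tau3[12], tau5_grouped == tau5_chain)
```

Truncating at `limit` loses nothing for $n \le$ `limit`, since every divisor of $n$ is at most $n$.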

By (4.57), we have
\[ X^{3/8 + 8\varpi} < N \ll X^{1/2}. \tag{4.60} \]
We now treat (4.63) via the methods in [14] and [2, §§3-7], following [34]. Write
\[ X_1 = X^{3/8 + 8\varpi} \quad \text{and} \quad X_2 = X^{1/2 - 4\varpi}. \tag{4.61} \]

We apply Lemma 10 with
\[ R^* = \begin{cases} X^{-\varpi/6} N, & \text{if } X_1 < N \le X_2, \\ X^{-3\varpi} N, & \text{if } X_2 < N \le 2X^{1/2}. \end{cases} \tag{4.62} \]

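Hence, by Lemma 10, the proof of Case (c) is reduced to showing that
\[ \sum_{\substack{q \sim Q \\ (q, r P_0) = 1}} \sum_{\substack{r \sim R \\ (r, a) = 1}} \mu(qr)^2 |\Delta(\gamma; X, qr, a)| \ll X^{1 - \varpi^{5/3}} \tag{4.63} \]
subject to the conditions
\[ X^{-\varpi} R^* < R < R^* \quad \text{and} \quad \frac{1}{2} D_2 < QR < X^{1/2 + 2\varpi}. \]
Therefore, it suffices to prove that
\[ \mathcal{B}(\gamma; Q, R) := \sum_{\substack{r \sim R \\ (r, a) = 1}} |\mu(r)| \sum_{\substack{q \sim Q \\ q \mid P \\ (q, r P_0) = 1}} |\Delta(\gamma; X, qr, a)| \ll X^{1 - \varpi^{5/3}} \tag{4.64} \]
subject to the constraints
\[ X^{-\varpi} R^* < R < R^* \tag{4.65} \]
and
\[ D_2 \ll QR \ll X^{1/2 + 2\varpi}, \tag{4.66} \]
which are henceforth assumed. In what follows we assume that
\[ r \sim R, \quad |\mu(r)| = 1, \quad \text{and} \quad (r, a) = 1. \tag{4.67} \]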

Let $c(q, r)$ denote
\[ c(q, r) = \begin{cases} \operatorname{sign} \Delta(\gamma; X, qr, a), & \text{if } q \sim Q, \ q \mid P, \text{ and } (q, r P_0) = 1, \\ 0, & \text{otherwise.} \end{cases} \]

Splitting $\gamma = \alpha * \beta$, writing $n$ as $mn$, and changing the order of summation, the inner sum over $q$ in (4.64) becomes
\[ \sum_{\substack{q \sim Q \\ q \mid P \\ (q, r P_0) = 1}} |\Delta(\gamma; X, qr, a)| = \sum_{(m, r) = 1} \alpha(m) D(r, m), \tag{4.68} \]
where
\[ D(r, m) := \sum_{(q, m) = 1} c(q, r) \left( \sum_{mn \equiv a (qr)} \beta(n) - \frac{1}{\varphi(qr)} \sum_{(n, qr) = 1} \beta(n) \right). \]
Substituting (4.68) into (4.64) and applying Cauchy's inequality to the $m$ variable, we get
\[ \mathcal{B}(\gamma; Q, R)^2 \ll M R \mathcal{L}^B \sum_{r \sim R} |\mu(r)| \sum_{(m, r) = 1} f(m) D(r, m)^2, \tag{4.69} \]
where $f(y)$ is as in Lemma 6. Squaring out $D(r, m)$ and summing over $m$, we have

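\[ \sum_{(m, r) = 1} f(m) D(r, m)^2 = S_1(r) - 2 S_2(r) + S_3(r), \tag{4.70} \]
where $S_j(r)$, $j = 1, 2, 3$, are defined by
\[ S_1(r) = \sum_{(m, r) = 1} f(m) \left( \sum_{(q, m) = 1} c(q, r) \sum_{mn \equiv a (qr)} \beta(n) \right)^2, \]
\[ S_2(r) = \sum_{q_1} \sum_{q_2} \frac{c(q_1, r) c(q_2, r)}{\varphi(q_2 r)} \sum_{n_1} \sum_{(n_2, q_2 r) = 1} \beta(n_1) \beta(n_2) \sum_{\substack{m n_1 \equiv a (q_1 r) \\ (m, q_2) = 1}} f(m), \]
\[ S_3(r) = \sum_{q_1} \sum_{q_2} \frac{c(q_1, r) c(q_2, r)}{\varphi(q_1 r) \varphi(q_2 r)} \sum_{(n_1, q_1 r) = 1} \sum_{(n_2, q_2 r) = 1} \beta(n_1) \beta(n_2) \sum_{(m, q_1 q_2 r) = 1} f(m). \]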

By (4.69) and (4.70), the proof of (4.64) is reduced to showing that
\[ \sum_r (S_1(r) - 2 S_2(r) + S_3(r)) \ll N R^{-1} X^{1 - \varpi^{5/3}}, \tag{4.71} \]
where $r$ is constrained as in (4.67). We begin with the evaluation of $S_3(r)$, which is the simplest of the three sums. We make frequent use of the trivial bound
\[ \hat{f}(z) \ll M. \tag{4.72} \]

Similar to the proof of (4.25), we have, for $q_j \sim Q$, $j = 1, 2$,
\[ \sum_{(m, q_1 q_2 r) = 1} f(m) = \hat{f}(0) \frac{\varphi(q_1 q_2 r)}{q_1 q_2 r} + O(X^{\varepsilon}). \]
This yields
\[ S_3(r) = \hat{f}(0) \sum_{q_1} \sum_{q_2} \frac{c(q_1, r) c(q_2, r)}{\varphi(q_1 r) \varphi(q_2 r)} \frac{\varphi(q_1 q_2 r)}{q_1 q_2 r} \sum_{(n_1, q_1 r) = 1} \sum_{(n_2, q_2 r) = 1} \beta(n_1) \beta(n_2) + O(X^{\varepsilon} N^2 R^{-2}). \]
If $(q_1 q_2, P_0) = 1$, then either $(q_1, q_2) = 1$ or $(q_1, q_2) > D_0$. Thus, on the right side of the above, the contribution from terms with $(q_1, q_2) > 1$ is, by (4.72) and trivial estimation,
\[ \ll X N D_0^{-1} R^{-2} \mathcal{L}^B. \]
It follows that
\[ S_3(r) = \hat{f}(0) \mathcal{X}(r) + O(X N D_0^{-1} R^{-2} \mathcal{L}^B), \tag{4.73} \]
where
\[ \mathcal{X}(r) = \sum_{q_1} \sum_{(q_2, q_1) = 1} \frac{c(q_1, r) c(q_2, r)}{q_1 q_2 r \varphi(r)} \sum_{(n_1, q_1 r) = 1} \sum_{(n_2, q_2 r) = 1} \beta(n_1) \beta(n_2). \tag{4.74} \]
The value of $\mathcal{X}(r)$ is not essential, since it will cancel out, with acceptable error (see (4.98) below), when we insert it back into the left side of (4.71).

We next evaluate $S_2(r)$. Our next goal is to show that

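\[ S_2(r) = \hat{f}(0) \mathcal{X}(r) + O(X N D_0^{-1} R^{-2} \mathcal{L}^B) \tag{4.75} \]
with $\mathcal{X}(r)$ given as in (4.74). The main tool we need is the Ramanujan bound (3.30). Assume $c(q_1, r) c(q_2, r) \ne 0$. On substituting $m n_1 = n$ and applying Lemma 7 we have
\[ \sum_{n_1} \beta(n_1) \sum_{\substack{m n_1 \equiv a (q_1 r) \\ (m, q_2) = 1}} f(m) \ll \sum_{\substack{n < 2X \\ n \equiv a (q_1 r)}} \tau_k(n) \ll \frac{X \mathcal{L}^B}{q_1 r}. \]
It follows that the contribution from terms with $(q_1, q_2) > 1$ in $S_2(r)$ is
\[ \ll X N D_0^{-1} R^{-2} \mathcal{L}^B, \]
so that
\[ S_2(r) = \sum_{q_1} \sum_{(q_2, q_1) = 1} \frac{c(q_1, r) c(q_2, r)}{\varphi(q_2 r)} \sum_{n_1} \sum_{(n_2, q_2 r) = 1} \beta(n_1) \beta(n_2) \sum_{\substack{m n_1 \equiv a (q_1 r) \\ (m, q_2) = 1}} f(m) + O(X N D_0^{-1} R^{-2} \mathcal{L}^B). \tag{4.76} \]
Note that the innermost sum over $m$ in (4.76) is empty unless $(n_1, q_1 r) = 1$. For $|\mu(q_1 q_2 r)| = 1$ and $(q_2, P_0) = 1$ we have
\[ \frac{q_2}{\varphi(q_2)} = 1 + O(\tau(q_2) D_0^{-1}) \]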

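and, by Lemma 7,
\[ \sum_{(n_1, q_1 r) = 1} \beta(n_1) \sum_{\substack{m n_1 \equiv a (q_1 r) \\ (m, q_2) > 1}} f(m) \ll \sum_{\substack{n < 2X \\ n \equiv a (q_1 r) \\ (n, q_2) > 1}} \tau_k(n) \ll \frac{\tau_k(q_2) X \mathcal{L}^B}{q_1 r D_0}. \]
Thus (4.76) still holds with the constraint $(m, q_2) = 1$ removed and with $\varphi(q_2 r)$ replaced by $q_2 \varphi(r)$. That is, we have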

X X c(q1, r)c(q2, r) X X X −1 −2 B S2(r) = β(n1)β(n2) f(m)+O(XND0 R L ). q2ϕ(r) q1 (q2,q1)=1 n1 (n2,q2r)=1 mn1≡a(q1r) (4.77) By Lemma6, for ( n1, q1r) = 1, we have   X 1 X ˆ h −1 f(m) = f eq1r(−hm) + O(d ), q1r q1r mn1≡a(q1r) |h|

ˆ −1 −2 B S2(r) = f(0)X(r) + R2(r) + O(XND0 R L ), (4.78) where   X X c(q1, r)c(q2, r) X R2(r) =  β(n2) q1q2ϕ(r) q1 (q2,q1)=1 (n2,q2r)=1   X X ˆ h × β(n1) f eq1r(−hm). q1r (n1,q1r)=1 1≤|h|

The proof of (4.75) is now reduced to estimating R2(r). Note that, by the second inequality in (4.66), we have −1/2+2$+2 H2  X N (4.79) −1 −1 since M  X N. This implies that R2(r) = 0 if X1 < N < X2, since H2 < 1 in this case. 1/2 Now assume that X2 < N < 2X . By the ‘reciprocity’ relation m aq n arn ≡ 1 1 + 1 (mod 1), q1r r q1 we get 1+ −2 X ∗ R2(r)  N R |R (r, n)|, (4.80) n∼N (n,r)=1

39 where X c(q, r) X  h  −ahqn ahrn R∗(r, n) = fˆ e − . q qr r q (q,n)=1 1≤|h|

X X c(q, r)c(q0, r) |R∗(r, n)|2 = qq0 (q,n)=1 (q0,n)=1    0   0 0 0  X X ˆ h ˆ h a(h q − hqn ahrn ah rn × f f 0 e − + 0 . 0 qr q r r q q 1≤|h|

0 −2 X ∗ 2 X X |c(q, r)c(q , r)| X X 0 0 M |R (r, n)|  0 |W(q, r; q , h )|, (4.81) 0 qq 0 n∼N q q 1≤|h|

−1 −3$+ H2Q  X . (4.82) It follows that, on the right side of (4.81), the contribution from terms with h0q = hq0 is X X  NQ−2 τ(hq)  X−3$+N. (4.83)

1≤|h|

0 0 0 0 Now assume that c(q, r)c(q , r) 6= 0, 1 ≤ |h| < H2, 1 ≤ |h | < H2, and h q 6= hq . Letting d = [q, q0]r, we have a(h0q0 − hq) ahr ah0r c − + ≡ (mod 1) r q q0 d for some c with (c, r) = (h0q0 − hq, r). By the estimate (3.30) it follows that (c, d)N W(q, r; q0, h0)  d1/2+ + . (4.84) d

Since $N > X_2$, by the first inequality in (4.65), (4.62), and (4.61), we have
\[ R^{-1} < X^{4\varpi} N^{-1} < X^{-1/2 + 8\varpi}. \tag{4.85} \]
This and the second inequality in (4.66) imply that
\[ Q \ll X^{10\varpi}. \tag{4.86} \]

40 Thus we have d1/2  (Q2R)1/2  X1/4+6$. Note that h0q0 − hq ≡ (h0q − hq0)qq0 (mod r). This implies 0 0 (c, d) ≤ (c, r)[q, q ]  [q, q ]H2Q. (4.87) This together with (4.79), (4.85), and (4.86), give

(c, d)N  H NQR−1  X16$+. d 2 Combining these estimates with (4.84) we deduce that

W(q, r; q0, h0)  X1/4+7$.

This together with (4.79) imply that the contribution from terms with h0q 6= hq0 on the right side of (4.81) is  X1/4+12$, which is sharper than the right side of (4.83). Combining these estimates with (4.81) we conclude that X |R∗(r, n)|  X1−3$/2+. n∼N (n,r)=1

Substituting this into (4.80) we obtain

\[ \mathcal{R}_2(r) \ll N R^{-2} X^{1 - \varpi}, \tag{4.88} \]
which is sharper than the big-Oh term in (4.78). The relation (4.75) follows from the bounds (4.78) and (4.88) immediately.

It remains to deal with $S_1(r)$, whose evaluation is the most difficult of the three. The main tool we need is Lemma 12. We shall instead establish an averaged bound on $S_1(r)$ of the form

X X ˆ −1 −$5/3 S1(r) = (f(0)X(r) + R1(r)) + O(XNR X ) (4.89) r r with R1(r) to be specified below in (4.95). By the estimates (4.75) for S2(r) and (4.73) for S3(r), the proof of (4.71) will be reduced to estimating R1(r). By definition of S1(r), expanding out the square, we have X X X X X S1(r) = c(q1, r)c(q2, r) β(n1)β(n2) f(m). (4.90)

q1 q2 n1 n2≡n1(r) mn1≡a(q1r) mn2≡a(q2)

Let $\mathcal{U}(r, q_0)$ denote the sum of terms on the right side of the above with $(q_1, q_2) = q_0$. The sum $\mathcal{U}(r, q_0)$ vanishes unless
\[ q_0 < 2Q, \quad q_0 \mid P, \quad \text{and} \quad (q_0, r P_0) = 1, \]
which we will assume. We first show that the contribution
\[ \sum_r \sum_{q_0 > 1} \mathcal{U}(r, q_0) \ll X N (D_0 R)^{-1} \mathcal{L}^B \tag{4.91} \]
coming from terms with $q_0 > 1$ is admissible. Assume that, for $j = 1, 2$,
\[ q_j \sim Q, \quad q_j \mid P, \quad (q_j, r P_0) = 1, \quad \text{and} \quad (q_1, q_2) = q_0. \]

0 0 Writing q1 = q1/q0, q2 = q2/q0, we get

X X X X XX 0 00 X X X U(r, q0) = c(q0q , r)c(q0q , r) β(n1)β(n2) f(m) 0 00 r q0>1 r q0>1 (q ,q )=1 n1 n2≡n1(r) m≡µ(q1q2r) (a,q0q00)=1 where µ(mod q1q2r) is a common solution to

µn1 ≡ a(mod q1r) and µn2 ≡ a(mod q2r). (4.92)

Since q1 and q2 have no prime factor less than D0, we have either q0 = 1 or q0 ≥ D0. By Cauchy’s inequality we have

X X X X X X X 2 X X U(r, q0)  f(m) |β(n1)| 1. 0 0 r q0>1 D0

By Lemma7 the two inner sums of the above is

X B B −1 B  τ (mn − a)τ (q0)  N(D0R) L .

n≡am(q0r)

This and Lemma7 imply

X X −1 B X X B −1 B −1 U(r, q0)  N(D0R) L τ (mn − a)  NR L XD0 . r q0>1 m n

This bound is admissible provided $4/3 D0 = X (4.93) which we henceforth assume. We now turn to U(r, 1). Assume |µ(q1q2r)| = 1. In the case (n1, q1r) = (n2, q2r) = 1, the innermost sum in (4.90) is, by Lemma6, equal to   1 X ˆ h −1 f eq1q2r(−µh) + O(d ), q1q2r q1q2r |h|

42 It follows that ˆ ∗ U(r, 1) = f(0)X (r) + R1(r) + O(1), (4.94) where ∗ X X c(q1, r)c(q2, r) X X X (r) = β(n1)β(n2) q1q2r q1 (q2,q1)=1 (n1,q1r)=1 n2≡n1(r) (n2,q2)=1 and   X X c(q1, r)c(q2, r) X X ˆ h R1(r) = β(n1)β(n2) f eq1q2r(−µh). q1q2r q1q2r q1 (q2,q1)=1 n2≡n1(r) 1≤|h|

X X ˆ ∗ −1 B S1(r) = (f(0)X (r) + R1(r)) + O(XN(D0R) L ). r r In view of (4.72), the proof of (4.89) is thus reduced to showing that

X 5/3 (X∗(r) − X(r))  N 2R−1X−$ . (4.96) r We have ∗ X X c(q1, r)c(q2, r) X (r) − X(r) = V(r; q1, q2), q1q2r q1 (q2,q1)=1 with X X 1 X X V(r; q , q ) = β(n )β(n ) − β(n )β(n ). 1 2 1 2 ϕ(r) 1 2 (n1,q1r)=1 n2≡n1(r) (n1,q1r)=1 (n2,q2r)=1 (n2,q2)=1 It follows that

X ∗ 1 X X 1 X (X (r) − X(r))  |V(r; q1, q2)|. (4.97) R q1q2 r q1∼Q q2∼Q r∼R (r,q1q2)=1 Noting that     X∗  X 1 X   X 1 X  V(r; q1, q2) =  β(n) − β(n)  β(n) − β(n) ,  ϕ(r)   ϕ(r)  `(mod r) n≡`(r) (n,q1r)=1 n≡`(r) (n,q2r)=1 (n,q1)=1 (n,q2)=1 and by Cauchy’s inequality and Lemma9, we find that the innermost sum in (4.97) is

B 2 −$/12  τ(q1q2) N X , which leads to (4.96).

Combining (4.73), (4.75), and (4.89) leads to

X X −1 −$5/3 (S1(r) − 2S2(r) + S3(r)) = R1(r) + O(XNR X ). r r Note that µ aq q n aq rn aq rn ≡ 1 2 1 + 2 1 + 1 2 (mod 1) q1q2r r q1 q2

by (4.92). Hence, on substituting n2 = n1 + kr, we may write R1(r) as 1 X R (r) = R (r, k), 1 r 1 |k|

(n,q1r)=1 (n+kr,q2)=1 with aq1q2n aq2rn aqr(n + kr) ξ(r; q1, q2; n, k) = + + . r q1 q2 Thus, the proof of (4.71) will follow from the bound

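\[ \mathcal{R}_1(r, k) \ll X^{1 - \varpi/2} \tag{4.98} \]
which we shall prove in the next two subsections. We note that the bound (4.98) amounts to saving a power of $X$ from the trivial estimate; indeed, it trivially follows from (4.72) that
\[ \mathcal{R}_1(r, k) \ll X^{1 + \varepsilon} H_1. \]
On the other hand, in view of (4.61), since
\[ H_1 \ll X^{\varepsilon} (QR)^2 (MN)^{-1} N R^{-1}, \]
and, by the first inequality in (4.65) and (4.62),
\[ N R^{-1} < \begin{cases} X^{\varpi + \varepsilon}, & \text{if } X_1 < N \le X_2, \\ X^{4\varpi}, & \text{if } X_2 < N < 2X^{1/2}, \end{cases} \tag{4.99} \]
it follows from the second inequality in (4.66) that
\[ H_1 \ll \begin{cases} X^{5\varpi + 2\varepsilon}, & \text{if } X_1 < N \le X_2, \\ X^{8\varpi + \varepsilon}, & \text{if } X_2 < N \le 2X^{1/2}, \end{cases} \tag{4.100} \]
is bounded by a small power of $X$.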
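The exponents in (4.100) follow from (4.99) and (4.66) by adding exponents of $X$. The following sketch carries out this bookkeeping, under the assumption (taken from the earlier remark $M^{-1} \ll X^{-1} N$) that $(MN)^{-1} \ll X^{-1}$. An exponent $c + v\varpi + e\varepsilon$ is stored as the triple $(c, v, e)$; the helper names `add` and `scale` are hypothetical.

```python
from fractions import Fraction as F


def add(*terms):
    """Add exponent triples (constant, coefficient of varpi, coefficient of eps)."""
    return tuple(sum(t[i] for t in terms) for i in range(3))


def scale(k, t):
    """Multiply an exponent triple by the integer k (i.e. raise the quantity to the k-th power)."""
    return tuple(k * x for x in t)


EPS = (F(0), F(0), F(1))          # the factor X^eps
QR = (F(1, 2), F(2), F(0))        # QR << X^{1/2 + 2 varpi}, second inequality in (4.66)
MN_INV = (F(-1), F(0), F(0))      # assumption: (MN)^{-1} << X^{-1}
N_OVER_R = {
    "X1 < N <= X2": (F(0), F(1), F(1)),        # N/R < X^{varpi + eps}, from (4.99)
    "X2 < N <= 2X^(1/2)": (F(0), F(4), F(0)),  # N/R < X^{4 varpi}, from (4.99)
}

# H1 << X^eps (QR)^2 (MN)^{-1} N R^{-1}
H1 = {case: add(EPS, scale(2, QR), MN_INV, nr) for case, nr in N_OVER_R.items()}
for case, exponent in H1.items():
    print(case, exponent)
```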

4.4.1. Estimation of $\mathcal{R}_1(r, k)$: The Type I case

In this and the next subsection we assume that $|k| < N R^{-1}$, and we write
\[ \mathcal{R}_1, \quad c(q_1), \quad c(q_2), \quad \text{and} \quad \xi(q_1, q_2; n) \]
for $\mathcal{R}_1(r, k)$, $c(q_1, r)$, $c(q_2, r)$, and $\xi(r; q_1, q_2; n, k)$, respectively, with the goal of proving (4.98). The variables $r$ and $k$ may also be omitted for notational simplicity. The proof is analogous to the estimation of $\mathcal{R}_2(r)$; the main ingredient is Lemma 12.

Assume that $X_1 < N \le X_2$ and $R^*$ is as in (4.62). We have

 X c(q1) X R1  N |F(q1, n)|, (4.101) q1 q1 n∼N (n,q1r)=1 where   X X c(q2) ˆ h F(q1, n) = f e(−hξ(q1, q2; n)). q2 q1q2r 0<|h|

We assume c(q1) 6= 0. To bound the sum of |F(q1, n)| we observe that, similar to (4.81),

0 −2 X 2 X X |c(q2)c(q2)| X X 0 0 M |F(q1, n)|  0 |G(h, h ; q1, q2; q2)|, 0 q2q2 0 n∼N (q2,q1)=1 (q2,q1)=1 0<|h|

1≤|h|

0 0 0 0 Now assume that c(q2)c(q2) 6= 0, (q2q2, q1) = 1, and h q2 6= hq2. We have

h0q0 − hq aq n h0ξ(q , q0 ; n) − hξ(q , q ; n) ≡ 2 2 1 1 2 1 2 r 0 0 0 h q2 − hq2arn h aq1r(n + kr) haq1r(nkr) + + 0 − (mod 1). q1 q2 q2

45 0 Letting d1 = q1r and d2 = [q2, q2], we may write h0q0 − hq aq h0q0 − hq ar c 2 2 1 + 2 2 ≡ 1 (mod 1) r q1 d1

for some c1 with 0 0 (c1, r) = (h q2 − hq2, r), and 0 h aq1r haq1r c2 0 − ≡ (mod 1) q2 q2 d2

for some c2, so that

0 0 c1n c2(n + kr) h ξ(q1, q2; n) − hξ(q1, q2; n) ≡ + (mod 1). d1 d2

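Since $(d_1, d_2) = 1$, it follows by Lemma 12 that
\[ \mathcal{G}(h, h'; q_1, q_2; q_2') \ll (d_1 d_2)^{1/2 + \varepsilon} + \frac{(c_1, d_1) N}{d_1}. \tag{4.104} \]
By the condition $N > X_1$ and (4.99), we have
\[ R^{-1} < X^{\varpi + \varepsilon} N^{-1} < X^{-3/4 - 15\varpi + \varepsilon} N. \tag{4.105} \]
Together with (4.66), this yields
\[ (d_1 d_2)^{1/2} \ll (Q^3 R)^{1/2} \ll X^{3/4 + 3\varpi} R^{-1} \ll X^{-12\varpi + \varepsilon} N. \]
A sharper bound for the second term on the right side of (4.104) can be obtained as follows. In a way similar to the proof of (4.87), we find that
\[ (c_1, d_1) \le (c_1, r) q_1 \ll H_1 Q^2. \]
It follows by (4.100), (4.66), and the first inequality in (4.105) that
\[ \frac{(c_1, d_1)}{d_1} \ll H_1 (QR) R^{-2} \ll X^{1/2 + 9\varpi + 4\varepsilon} N^{-2} \ll X^{-1/4 - 6\varpi}. \]
Here the condition $N > X_1$ is used again. Combining these estimates with (4.104) we deduce that
\[ \mathcal{G}(h, h'; q_1, q_2; q_2') \ll X^{-12\varpi + \varepsilon} N. \]
Together with (4.100), this implies that, on the right side of (4.102), the contribution from terms with $h' q_2 \ne h q_2'$ is
\[ \ll X^{-12\varpi + \varepsilon} H_1^2 N \ll X^{-2\varpi + 5\varepsilon} N, \]
which has essentially the same order of magnitude as the right side of (4.103). Combining these estimates with (4.102) we obtain
\[ \sum_{\substack{n \sim N \\ (n, q_1 r) = 1}} |\mathcal{F}(q_1, n)|^2 \ll X^{1 - 2\varpi + 5\varepsilon} M. \]
This yields, by Cauchy's inequality,
\[ \sum_{\substack{n \sim N \\ (n, q_1 r) = 1}} |\mathcal{F}(q_1, n)| \le \left( \sum_{\substack{n \sim N \\ (n, q_1 r) = 1}} 1 \right)^{1/2} \left( \sum_{\substack{n \sim N \\ (n, q_1 r) = 1}} |\mathcal{F}(q_1, n)|^2 \right)^{1/2} \ll N^{1/2} (X^{1 - 2\varpi + 5\varepsilon} M)^{1/2} \ll X^{1 - \varpi + 3\varepsilon}. \tag{4.106} \]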

The estimate (4.98) follows from (4.101) and (4.106) immediately.

4.4.2. Estimation of R1(r, k): The Type II case 1/2 ∗ We now assume that X2 < N < 2X and R is as in (4.62). We have

 X R1  N |K(n)|, (4.107) n∼N (n,r)=1 where   X X c(q1)c(q2) X ˆ h K(n) = f e(−hξ(q1, q2; n)). q1q2 q1q2r (q1,n)=1 (q2,q1(n+kr))=1 1≤|h|

# X 0 0 Let stands for a summation over the tuples (q1, q2; q1, q2) with

0 0 (q1, q2) = (q1, q2) = 1. To estimate the sum of K(n) we observe that, similar to (4.81),

# 0 0 −2 X 2 X c(q1)c(q2)c(q1)c(q2) X X 0 0 0 M |K(n)|  0 0 |M(h, h ; q1, q2; q1, q2)|, q1q2q1q2 0 n∼N 1≤|h|

0 0 0 0 0 0 0 c(q1)c(q2)c(q1)c(q2) 6= 0, (q1, q2) = (q1, q2) = 1, and h q1q2 6= hq1q2.

47 We have 0 0 0 0 0 sn t1n t1n t2(n + kr) t2(n + kr) h ξ(q1, q2; n) − hξ(q1, q2; n) ≡ + + 0 + + 0 (mod 1) (4.110) r q1 q1 q2 q2 with

0 0 0 s ≡ a(h q1q2 − hq1q2) (mod r),

t1 ≡ −ahq2r (mod q1), 0 0 0 0 t1 ≡ ah q2r (mod q1),

t2 ≡ −ahq1r (mod q2), 0 0 0 0 t2 ≡ ah q1r (mod q2). 0 0 Letting d1 = [q1, q1]r and d2 = [q2, q2], we may rewrite (4.110) as

0 0 0 c1n c2(nkr) h ξ(q1, q2; n) − hξ(q1, q2; n) ≡ + (mod 1) d1 d2

for some c1 and c2 with 0 0 0 (c1, r) = (h q1q2 − hq1q2, r). It follows by Lemma 12 that 2 1/2+ (c1, d1)(d1, d2) N M  (d1d2) + . (4.111) d1 By (4.66) and (4.86) we have 1/2 4 1/2 1/4+16$ (d1d2)  (Q R)  X . 0 0 2 0 On the other hand, we have (d1, d2) ≤ (q1q1, q2q2)  Q , since (q2q2, r) = 1, and, similar to (4.87), 0 0 2 (c1, d1) ≤ (c1, r)[q1, q1]  [q1, q1]H1Q . It follows by (4.99), (4.86), and the first inequality in (4.85) that 2 (c1, d1)(d1, d2) N 6 −1 72$  H1NQ R  X . d1 Combining these estimates with (4.111) we deduce that 0 0 0 1/4+16$+ M(h, h ; q1, q2; q1, q2)  X . Together with (4.99), this implies that, on the right side of (4.108), the contribution from 0 0 0 terms with h q1q2 6= hq1q2 is 1/4+16$+ 2 1/4+33$ X H1  X , which is sharper than the right side of (4.109). Combining these estimates with (4.108) we obtain X |K(n)|  X1−$. (4.112) n∼N (n,r)=1 The estimate (4.98) follows from (4.107) and (4.112) immediately. This completes the proof of Theorem1.

5. Proof of uniform power savings Theorem 2

Let $\chi$ be a primitive character (mod $d$) and let $L(s, \chi)$ denote its Dirichlet $L$-function. On the Generalized Lindelöf Hypothesis, we have, for $\sigma \ge 1/2$,
\[ L(s, \chi)^k \ll (d|s|)^{k\varepsilon} \tag{5.1} \]
for any $\varepsilon > 0$; see, e.g., [3]. This bound will allow us to significantly improve the estimate in (3.2). The following lemma is a truncated Perron formula.

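Lemma 15. Let
\[ \delta(X) = \begin{cases} 0, & \text{if } 0 < X < 1, \\ 1/2, & \text{if } X = 1, \\ 1, & \text{if } X > 1, \end{cases} \]
and
\[ I(X, T) = \frac{1}{2\pi i} \int_{c - iT}^{c + iT} \frac{X^s}{s} \, ds. \]
Then
\[ \delta(X) = \frac{1}{2\pi i} \int_{c - i\infty}^{c + i\infty} \frac{X^s}{s} \, ds, \]
and, for $X > 0$, $c > 0$, and $T > 0$, we have
\[ |I(X, T) - \delta(X)| < \begin{cases} X^c \min\{1, T^{-1} |\log X|^{-1}\}, & \text{if } X \ne 1, \\ c/T, & \text{if } X = 1. \end{cases} \]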

Proof. See [26, Theorem 4.1.4].

With (5.1) we can strengthen Lemma 3, which was used in the proof of Theorem 1, to

Proposition 3. Assume the Generalized Lindelöf Hypothesis. For $\chi$ a primitive character (mod $d$) we have
\[ \sum_{n \le X} \tau_k(n) \chi(n) \ll X^{7/8} d^{1/2}. \tag{5.2} \]

Proof. The proof of this proposition is in principle very similar to that of Lemma 3. Indeed, we estimate directly the left side of (5.2) using the truncated Perron formula, getting

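\[ \sum_{n \le X} \tau_k(n) \chi(n) = \frac{1}{2\pi i} \int_{9/8 - iT}^{9/8 + iT} L(s, \chi)^k \frac{X^s}{s} \, ds + O\left( \frac{X^{9/8}}{T} \right). \tag{5.3} \]
Since $\chi$ is nonprincipal, the function $L(s, \chi)^k$ is analytic and has no poles in $\sigma \ge 1/2$. We move the line of integration to $\sigma = 1/2$ and apply Cauchy's theorem. On the Generalized Lindelöf Hypothesis, we apply (5.1) with $\varepsilon = 1/(2k)$, giving
\[ L(s, \chi)^k \ll (d|s|)^{1/2}, \quad \sigma \ge 1/2. \]
The contribution from the horizontal segments is
\[ \frac{1}{2\pi i} \left( \int_{9/8 + iT}^{1/2 + iT} + \int_{9/8 - iT}^{1/2 - iT} \right) L(s, \chi)^k \frac{X^s}{s} \, ds \ll d^{1/2} \frac{X^{9/8}}{T^{1/2}}, \]
and the contribution from the vertical segment $\sigma = 1/2$ is
\[ \frac{1}{2\pi i} \int_{1/2 - iT}^{1/2 + iT} L(s, \chi)^k \frac{X^s}{s} \, ds \ll d^{1/2} X^{1/2} T^{1/2}. \]
Hence, by Cauchy's theorem, (5.3) becomes
\[ \sum_{n \le X} \tau_k(n) \chi(n) \ll d^{1/2} \frac{X^{9/8}}{T^{1/2}} + d^{1/2} X^{1/2} T^{1/2} + \frac{X^{9/8}}{T}. \]
We choose $T = X^{1/2}$. Thus, the error term of the above is
\[ \ll d^{1/2} X^{7/8}. \]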

This leads to the right side of (5.2).

Proof of Theorem 2. By Proposition 2, we have
\[ \sum_{\substack{d \in \mathcal{D} \\ D_2 \cdots}} |\Delta(\tau_k; X, d, a)| \ll X^{1 - \varpi^2}. \tag{5.4} \]
By Proposition 3 above together with the large sieve inequality (3.17), in a way similar to the proof of Proposition 1, for $D \le D_2$, we get
\[ \sum_{d \le D} \max_{(a, d) = 1} |\Delta(\tau_k; X, d, a)| \ll X^{1 - \varpi^2}. \]
This, together with (5.4), gives the desired estimate
\[ \sum_{\substack{d \in \mathcal{D} \\ d \cdots}} |\Delta(\tau_k; X, d, a)| \ll X^{1 - \varpi^2}. \]
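The choice $T = X^{1/2}$ in the proof of Proposition 3 can be checked by comparing exponents of $X$ in the three error terms. The following is a minimal sketch of that arithmetic; the function name `error_exponents` is hypothetical, and the common factor $d^{1/2}$ is dropped (the truncation term carries no such factor, so including it there only weakens that term).

```python
from fractions import Fraction as F


def error_exponents(t):
    """Exponents of X in the three error terms when T = X**t."""
    return {
        "horizontal": F(9, 8) - t / 2,  # X^{9/8} / T^{1/2}
        "vertical": F(1, 2) + t / 2,    # X^{1/2} T^{1/2}
        "truncation": F(9, 8) - t,      # X^{9/8} / T
    }


exps = error_exponents(F(1, 2))
worst = max(exps.values())
print(exps, worst)
```

With $t = 1/2$ the largest exponent comes from the horizontal segments, which accounts for the power $X^{7/8}$ in (5.2).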

6. Proofs of Theorems3 and4 6.1. Proof of theorem3 We proceed analogously as in the proof of Lemma9 and Proposition1. By (3.15), we have ! ! 1 X0 X X0 X ∆(τ ; X, d, a)2 = χ (a) τ (n)χ (n) χ (a) τ (n)χ (n) . k ϕ(d)2 1 k 1 2 k 2 χ1(mod d) n≤X χ2(mod d) n≤X

50 Summing over primitive a(mod d) and changing the order of summation, we get ! ! X∗ 1 X0 X0 X X X∗ ∆(τ ; X, d, a)2 = τ (n)χ (n) τ (n)χ (n) χ (a)χ (a). k ϕ(d)2 k 1 k 2 1 2 a(mod d) χ1(mod d) χ2(mod d) n≤X n≤X a(mod d) By the orthogonality relation (3.22) this becomes

! 2 ∗ 0 X 2 1 X X ∆(τk; X, d, a) = τk(n)χ(n) . ϕ(d) a(mod d) χ(mod d) n≤X We now reduce to primitive characters. By Lemma2, we have

 ! 2 ∗ ∗ X X 2 X 1 X 1 X X ∆(τk; X, d, a)  log L  τk(n)χ(n)  . r q d≤D a(mod d) r≤D 1

By Lemma3, we get, for 1 < q ≤ X1/3(k+2), 2 ! X 1 X∗ X X 1  2− 2  2− 1 τk(n)χ(n)  ϕ(q)X 3(k+2)  X 3(k+2) . q q 1

! 2 !2 ∗   1 X X X 1 2 X X k−1 τk(n)χ(n)  (Q + X) τk(n)  Q + XL . Q Q Q q∼Q χ(mod q) n≤X n≤X This leads to (1.11).

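6.2. Proof of Theorem 4

Denote
\[ E(f_1, f_2; X, d, a) = \sum_{\substack{m, n \le X \\ m \equiv an (d)}} f_1(m) f_2(n) - \frac{1}{\varphi(d)} \left( \sum_{\substack{m \le X \\ (m, d) = 1}} f_1(m) \right) \left( \sum_{\substack{n \le X \\ (n, d) = 1}} f_2(n) \right). \]
We start by writing
\[ \psi_1(\chi) = \sum_{m \le X} f_1(m) \chi(m) \quad \text{and} \quad \psi_2(\chi) = \sum_{n \le X} f_2(n) \chi(n). \]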

We first prove the estimate (1.12). Let f1 = f2 = τk. By (3.15), we have 1 X0 E(τ , τ ; X, d, a) = χ(a)ψ (χ)ψ (χ). k k ϕ(d) 1 2 χ(mod d)

Taking the square of the modulus of both sides yields

1 X0 X0 E(τ , τ ; X, d, a)2 = χ (a)ψ (χ )ψ (χ ) χ (a)ψ (χ )ψ (χ ). k k ϕ(d)2 1 1 1 2 1 2 1 2 2 2 χ1(mod d) χ2(mod d)

Summing over primitive a(mod d) and changing the order of summation, we get

X∗ 1 X0 X0 X∗ E(τ , τ ; X, d, a)2 = ψ (χ )ψ (χ )ψ (χ )ψ (χ ) χ (a)χ (a). k k ϕ(d)2 1 1 2 1 1 2 2 2 1 2 a(mod d) χ1(mod d) χ2(mod d) a(mod d) By the orthogonality relation (3.22) we get

2 ∗ 0 X 2 1 X E(τk, τk; X, d, a) = ψ1(χ)ψ2(χ) . ϕ(d) a(mod d) χ(mod d) We now reduce to primitive characters. By Lemma2, we have  2 ∗ ∗ X X 2 X 1 X 1 X E(τk, τk; X, d, a)  log L  ψ1(χ)ψ2(χ)  . r q d≤D a(mod d) r≤D 1

By Lemma3, we get, for 1 < q ≤ X1/3(k+2),

2 2 X 1 X∗ X 1  2− 2  4− 2 ψ1(χ)ψ2(χ)  ϕ(q)X 3(k+2)  X 3(k+2) . q q 1

2 ∗ 1 X X 4−1/3(k+3) ψ1(χ)ψ2(χ)  X . Q q∼Q χ(mod q) By the large sieve inequality (3.17) and the bound (3.18), the left-side of the above is

!2 1 X X∗ 1 X  X2  ≤ |ψ (χ)ψ (χ)|2  (Q2 +X2) τ (n)  Q + X2L2k−2. (6.2) Q 1 2 Q k Q q∼Q χ(mod q) n≤X

For D ≤ X2−1/3(k+2), the above is

 X2−1/3(k+2) + X2−1/3(k+2)X2L2k  X4−1/3(k+3).

This, together with (6.1), lead to the right side of (1.12). Note also that, for X2−1/3(k+2) < D ≤ X2,(6.2) becomes

 X2  D + X2L2k−2  DX2L2k−2. X2−1/3(k+2)

This gives an estimate for (1.12) in this range.

We next prove (1.13). Let $f_1 = \tau_k$ and $f_2 = \Lambda$. The proof of (1.13) is analogous to that of (1.12), except that in (6.1) and (6.2) we estimate the sum over $\Lambda$ by
\[ \psi_2(\chi) = \sum_{n \le X} \Lambda(n) \chi(n) \ll X \quad \text{and} \quad \sum_{n \le X} \Lambda(n) \ll X. \]
Thus (6.1) becomes

2

X 1 X∗ 4− 1 ψ1(χ)ψ2(χ)  X 3(k+2) , q 1

References

[1] E. Bombieri, On the large sieve, Mathematika 12 (1965), 201-225. https://doi.org/10.1112/S0025579300005313

[2] E. Bombieri, J. B. Friedlander, and H. Iwaniec, Primes in arithmetic progressions to large moduli, Acta Math. 156 (1986), 203-251. https://doi.org/10.1007/BF02399204

[3] J. B. Conrey and A. Ghosh, Remarks on the generalized Lindelöf Hypothesis, Funct. Approx. Comment. Math. 36 (2006), 71-78. https://doi.org/10.7169/facm/1229616442

[4] J. B. Conrey and S. M. Gonek, High moments of the Riemann zeta-function, Duke Math. J. 107 (2001), no. 3, 577-604. https://doi.org/10.1215/S0012-7094-01-10737-0

[5] J. B. Conrey and J. P. Keating, Moments of zeta and correlations of divisor-sums: I, Phil. Trans. R. Soc. A 373:20140313 (2015). http://dx.doi.org/10.1098/rsta.2014.0313

[6] J. B. Conrey and J. P. Keating, Moments of zeta and correlations of divisor-sums: II, in Advances in the Theory of Numbers: Proceedings of the Thirteenth Conference of the Canadian Number Theory Association, Fields Institute Communications (Editors: A. Alaca, S. Alaca & K. S. Williams), 75-85, 2015. https://doi.org/10.1007/978-1-4939-3201-6_3

[7] J. B. Conrey and J. P. Keating, Moments of zeta and correlations of divisor-sums: III, Indagationes Mathematicae 26 (2015), no. 5, 736-747. https://doi.org/10.1016/j.indag.2015.04.005

[8] J. B. Conrey and J. P. Keating, Moments of zeta and correlations of divisor-sums: IV, Research in Number Theory 2 (2016), no. 24. https://doi.org/10.1007/s40993-016-0056-4

[9] H. Davenport, Multiplicative Number Theory, Graduate Texts in Mathematics 74, third edition, 2000. https://doi.org/10.1007/978-1-4757-5927-3

[10] É. Fouvry, Sur le problème des diviseurs de Titchmarsh (in French) (The Titchmarsh divisor problem), J. Reine Angew. Math. 357 (1985), 51-76. https://doi.org/10.1515/crll.1985.357.51

[11] É. Fouvry, E. Kowalski, and P. Michel, On the exponent of distribution of the ternary divisor function, Mathematika 61 (2015), no. 1, 121-144. https://doi.org/10.1112/S0025579314000096

[12] É. Fouvry and M. Radziwiłł, Level of distribution of unbalanced convolutions, Preprint arXiv:1811.08672 [math.NT].

[13] J. B. Friedlander and H. Iwaniec, The divisor problem for arithmetic progressions, Acta Arith. 45 (1985), 273-277. https://doi.org/10.4064/aa-45-3-273-277

[14] J. B. Friedlander and H. Iwaniec, Incomplete Kloosterman sums and a divisor problem, Ann. of Math. (2) 121 (1985), no. 2, 319-350. https://doi.org/10.2307/1971175

[15] J. B. Friedlander and H. Iwaniec, Close encounters among the primes, Preprint arXiv:1312.2926.

[16] A. J. Harper and K. Soundararajan, Lower bounds for the variance of sequences in arithmetic progressions: Primes and divisor functions, The Quarterly Journal of Mathematics 68 (2017), no. 1, 97-123. https://doi.org/10.1093/qmath/haw005

[17] D. R. Heath-Brown, The fourth power moment of the Riemann zeta-function, Proc. London Math. Soc. (3) 38 (1979), 385-422. https://doi.org/10.1112/plms/s3-38.3.385

[18] D. R. Heath-Brown, The divisor function $d_3(n)$ in arithmetic progressions, Acta Arith. 47 (1986), 29-56. https://doi.org/10.4064/aa-47-1-29-56

[19] D. R. Heath-Brown and X. Li, Prime values of $a^2 + p^4$, Invent. Math. 208 (2017), no. 2, 441-499. https://doi.org/10.1007/s00222-016-0694-0

[20] C. Hooley, An asymptotic formula in the theory of numbers, Proc. London Math. Soc. (3) 7 (1957), 396-413. https://doi.org/10.1112/plms/s3-7.1.396

[21] H. Iwaniec, E. Kowalski, Analytic Number Theory, American Mathematical Society Colloquium Publications Vol. 53, 2004. https://doi.org/10.1090/coll/053

54 [22] J. Keating, B. Rodgers, E. Roditty-Gershon, and Z. Rudnick, Sums of divisor functions in Fq[t] and matrix integrals, Math. Z. (2018) 288: 167-198. https://doi.org/10. 1007/s00209-017-1884-1

[23] A. F. Lavrik, On the problem of in segments of arithmetical progressions (in Russian), Dokl. Akad. Nauk SSSR 164:6 (1965), 1232-1234. MR0205953

[24] Yu. V. Linnik, All large numbers are sums of a prime and two squares (A problem of Hardy and Littlewood) II, Mat. Sb. (N. S.) 53 (95) (1961), 3-38; Amer. Math. Soc. Transl., 37 (1964), 197-240. https://doi.org/10.1016/0370-1573(79)90002-4

[25] Y. Motohashi, An induction principle for the generalization of Bombieri’s prime number theorem, Proc. Japan Acad. 52(6) (1976), 273-275. https://doi.org/10.3792/pja/ 1195518296

[26] M. R. Murty, Problems in Analytic Number Theory, Graduate Texts in Math- ematics (Book 206), Springer; 2nd edition (2007). https://doi.org/10.1007/ 978-1-4757-3441-6

[27] D. T. Nguyen, Generalized divisor functions in arithmetic progressions: II, preprint.

[28] D.H.J. Polymath, New equidistribution estimates of Zhang type, Algebra Number The- ory 8 (2014), no. 9, 2067-2199. https://doi.org/10.3792/pja/1195518296

[29] B. Rodgers and K. Soundararajan, The variance of divisor sums in arithmetic pro- gressions, Forum Math. 30 (2018), no. 2, 269-293. https://doi.org/10.1515/ forum-2016-0227

[30] P. Shiu, A Brun-Titchmarsh theorem for multiplicative functions, J. Reine Angew. Math. 313 (1980), 161-170. https://doi.org/10.1515/crll.1980.313.161

[31] A. I. Vinogradov, The density Hypothesis for Dirichlet L-series, Izv. Akad. Nauk SSSR Ser. Mat. 29 (1965), 903-934. MR0194397

[32] F. Wei, B. Xue, and Y. Zhang, General divisor functions in arithmetic progressions to large moduli, Sci. China Math. 59 (2016), no. 9, 1663-1668. https://doi.org/10. 1007/s11425-016-0355-4

[33] D. Wolke, Uber¨ die mittlere Verteilung der Werte zahlentheoretischer Funktionen auf Restklassen, I. (in German), Math. Ann. 202 (1973), 125. https://doi.org/10.1007/ BF01351202

[34] Y. Zhang, Bounded gaps between primes, Ann. of Math. 179 (2014), no. 3, 1121-1174. http://doi.org/10.4007/annals.2014.179.3.7

55