The Pennsylvania State University The Graduate School

ON A VARIANCE ASSOCIATED WITH THE DISTRIBUTION OF

REAL SEQUENCES IN ARITHMETIC PROGRESSIONS

A Dissertation in Mathematics by Pengyong Ding

© 2021 Pengyong Ding

Submitted in Partial Fulfillment of the Requirements for the Degree of

Doctor of Philosophy

May 2021

The dissertation of Pengyong Ding was reviewed and approved by the following:

Robert Charles Vaughan Professor of Mathematics Dissertation Adviser Committee Chair

Wen-Ching Winnie Li Distinguished Professor of Mathematics

Ae Ja Yee Professor of Mathematics

Mary Kathleen Heid Distinguished Professor of Education

Mark Levi Professor of Mathematics Head of the Department

Abstract

This dissertation is composed of two parts. The first part concerns the general result on the following variance associated with the distribution of a real sequence $\{a_n\}$ in arithmetic progressions:

\[ V(x, Q) = \sum_{q \le Q} \sum_{a=1}^{q} |A(x; q, a) - f(q, a)M(x)|^2, \]

where $A(x; q, a)$ represents the sum of $\{a_n\}$ in the residue class of $a$ modulo $q$, and $f(q, a)$ and $M(x)$ approximately reflect the local and global properties of $\{a_n\}$ respectively. We will give a brief history of the problem and introduce two powerful methods, and then we will provide the standard initial procedure. The second part is an example on calculating the variance $V(x, Q)$ when $a_n = r_3(n)$, the number of (ordered) representations of $n$ as the sum of three positive cubes:

\[ r_3(n) = \sum_{\substack{x_1, x_2, x_3 \\ x_1^3 + x_2^3 + x_3^3 = n}} 1. \]

We will introduce the properties of the function and show how to calculate the main terms and estimate the error terms. The conclusion will be stated as Theorem 5.1. Finally, several special cases and similar questions will be listed at the end of the dissertation.

Table of Contents

List of Symbols

Acknowledgments

Chapter 1 Introduction
1.1 An Introduction to Analytic Number Theory
1.2 An Introduction to the Problem
1.3 The History of the Problem

Chapter 2 Two Powerful Methods
2.1 The Hardy-Littlewood Circle Method
2.2 The Farey Sequence

Chapter 3 The Standard Initial Procedure
3.1 The Treatment of S1
3.2 Further Arrangement

Chapter 4 An Example: an = r3(n)
4.1 Introduction to r3(n)
4.2 Upper and Lower Exponents

Chapter 5 The Variance in the Special Case
5.1 Several Lemmata
5.2 The Main Term S3
5.3 The Error Term S2 − S3
5.4 The Major Arcs
5.5 The Optimal Choice for R
5.6 The Main Term S6
5.7 The Calculation of W (X)
5.8 Conclusion: Theorem 5.1

Chapter 6 Several Notes
6.1 Results for Large Q
6.2 Similar Questions

Bibliography

List of Symbols

$(a, b)$: The greatest common divisor of integers $a$ and $b$, p.1

$\mathbb{C}$: The set of complex numbers, p.1

$\zeta(s)$: The Riemann zeta function of a complex number $s$, p.2

$f(x) \sim g(x)$: $\lim f(x)/g(x) = 1$, p.2

$\log x$: The natural logarithm of a real number $x$, p.2

$\mathbb{N}$: The set of natural numbers, p.3

$\mathbb{R}$: The set of real numbers, p.3

$a \equiv b \pmod{c}$: $(a - b)/c$ is an integer, where $a$, $b$, $c$ are integers, p.3

$f(x) \ll g(x)$: $|f(x)| \le Cg(x)$, where $C$ is an absolute constant, p.3

$f(x) = O(g(x))$: $f(x) \ll g(x)$, p.4

$\Lambda(n)$: The von Mangoldt function of a natural number $n$, p.4

$\varphi(n)$: The Euler totient function of a natural number $n$, p.4

$\exp(x)$: $e^x$, where $x$ is a real number, p.5

$a \mid b$: $b/a$ is an integer, where $a$ and $b$ are integers and $a \ne 0$, p.6

$d(n)$: The divisor function of a natural number $n$, p.7

$[x]$: The integer part of a real number $x$, p.9

$e(z)$: $e^{2\pi i z}$, where $z$ is a complex number, p.9

$\mathbb{Z}$: The set of integers, p.9

$a \nmid b$: $b/a$ is not an integer, where $a$ and $b$ are integers, p.14

$\Gamma(t)$: The gamma function of a real number $t$, p.17

$B(a, b)$: The beta function of real numbers $a$ and $b$, p.28

$\|r\|$: The distance from a real number $r$ to the nearest integer, p.29

$\gamma$: The Euler constant, p.32

$\Im s$: The imaginary part of a complex number $s$, p.46

$f(x) \asymp g(x)$: $f(x) \ll g(x)$ and $g(x) \ll f(x)$, p.50

Acknowledgments

First, I would like to thank my advisor, Professor Robert C. Vaughan, for his guidance during my Ph.D. career. Next, I would like to thank Professors Wen-Ching W. Li, Ae Ja Yee and M. Kathleen Heid for serving on my committee. I would also like to thank Allyson Borger, the administrative support assistant of the graduate studies program in the Department of Mathematics, for her kind assistance over the past four years. Finally, I would like to thank my parents for their love and support, both physically and mentally. I would never have achieved this without them.

Chapter 1 | Introduction

This dissertation is composed of two parts. The first part, comprising the first three chapters, presents the general result on a variance associated with the distribution of a real sequence $\{a_n\}$ in arithmetic progressions: Chapter 1 introduces the problem and gives a brief historical overview; Chapter 2 presents two powerful methods for studying the problem, namely the Hardy-Littlewood circle method and the Farey sequence; and Chapter 3 gives a standard initial procedure for estimating the variance. The second part, comprising Chapter 4 onward, treats the specific example $a_n = r_3(n)$: Chapter 4 introduces the function $r_3(n)$ and shows some preliminary results; Chapter 5 shows further results and the conclusion on the variance; and finally, Chapter 6 gives several notes on the problem, including several special cases for $r_3(n)$ as well as similar questions.

1.1 An Introduction to Analytic Number Theory

Number theory is the branch of pure mathematics devoted to the study of integers and integer-valued functions, and analytic number theory is the branch of number theory that uses methods from mathematical analysis, such as real analysis and complex analysis, to solve problems about integers. It may be said to begin with Dirichlet's introduction of L-functions in 1837, when he proved the existence of primes in arithmetic progressions: for any integers $a$ and $q$ such that $(a, q) = 1$, there are infinitely many primes of the form $a + nq$, where $n$ is also an integer. He defined the L-function as follows:

\[ L(s, \chi) := \sum_{n=1}^{\infty} \chi(n) n^{-s}, \tag{1.1} \]

where $s \in \mathbb{C}$ and $\Re s > 1$, and $\chi$ is a Dirichlet character modulo $q$, i.e. a complex-valued function with the following properties:

(1) $\chi(n) = \chi(n + q)$ for all integers $n$,
(2) $\chi(n) \ne 0$ if and only if $(n, q) = 1$,
(3) $\chi(mn) = \chi(m)\chi(n)$ for all integers $m$ and $n$.

If $\chi_0$ is the character such that $\chi_0(n) = 1$ for all integers $n$, then the Riemann zeta function is defined by

\[ \zeta(s) := L(s, \chi_0) = \sum_{n=1}^{\infty} n^{-s}. \tag{1.2} \]

Since then, many well-known results have been discovered, including:

(1) The prime number theorem, namely $\pi(x) \sim x/\log x$, where $\pi(x)$ denotes the number of prime numbers less than or equal to $x$.

(2) The existence of zeros of the Riemann zeta function $\zeta(s)$. For a fixed character $\chi$, if $L(s_0, \chi) = 0$ for a negative real number $s_0$, then $s_0$ is called a trivial zero of $L(s, \chi)$. It is proved that there are infinitely many trivial zeros of $\zeta(s)$: $s = -2n$ where $n$ is a positive integer. A zero of $L(s, \chi)$ is called non-trivial if it is not trivial. It is also proved that if a non-trivial zero $s_0$ of $L(s, \chi)$ exists, then it satisfies 0 ≤

will introduce the Hardy-Littlewood circle method in Chapter 2, and we will use it to study the problem on a variance associated with the distribution of a real sequence in arithmetic progressions.

1.2 An Introduction to the Problem

Assume that there is a function $a(n)$ whose domain is $\mathbb{N}$ and whose range is a subset of $\mathbb{R}$. We usually write such a function as a sequence $\{a_n\}$. Now define $A(x; q, a)$ as the sum of $\{a_n\}$ over an arithmetic progression:

\[ A(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a \,(\mathrm{mod}\, q)}} a_n. \tag{1.3} \]

For some sequences, there exist real functions f(q, a) and M(x), such that

A(x; q, a) ∼ f(q, a)M(x). (1.4)

If such functions exist, we say that $f(q, a)$ and $M(x)$ approximately reflect the local and global properties of $\{a_n\}$ respectively. In order to measure how far the sums in arithmetic progressions are spread out from their approximation, we are interested in the variance

\[ V(x, Q) = \sum_{q \le Q} \sum_{a=1}^{q} |A(x; q, a) - f(q, a)M(x)|^2, \tag{1.5} \]

where the value of $Q$ is not greater than $x$. If $Q > x$, then there is at most one term in the sum (1.3), which will result in large deviation. From now on, we always assume that $1 \le Q \le x$.

The difficulty of the problem largely depends on the sequence $\{a_n\}$, especially on whether the sum

\[ \sum_{n \le x} a_n \ll x \]

and the second moment sum

\[ \sum_{n \le x} a_n^2 \ll x, \]

where in general, $f(x) \ll g(x)$ means $f(x) = O(g(x))$, that is, there is a constant $C > 0$ such that $|f(x)| \le Cg(x)$ for all $x$ in the appropriate domain, especially when $x$ is large enough. Equivalently, it depends on whether the mean value

\[ \frac{1}{x} \sum_{n \le x} a_n \]

and the mean square

\[ \frac{1}{x} \sum_{n \le x} a_n^2 \]

are bounded or not. In §1.3, we will have a brief historical overview of the problem, and we will see how the properties of the mean value and the mean square of $\{a_n\}$ affect the difficulty of the problem.

1.3 The History of the Problem

The first example of the variance (1.5) was studied by Barban [1966]. He studied the special case $a_n = \Lambda(n)$, the von Mangoldt function, which is defined as follows:

\[ \Lambda(n) = \begin{cases} \log p, & \text{when } n = p^k,\ p \text{ is a prime},\ k > 0 \text{ is an integer}, \\ 0, & \text{otherwise}, \end{cases} \]

and in this case,

\[ f(q, a) = \frac{1}{\varphi(q)}, \]

where $\varphi(q)$ is Euler's totient function, which is defined as the number of integers $a$ such that $1 \le a \le q$ and $(a, q) = 1$, and

\[ M(x) = x. \]

The special case was studied successively by Davenport and Halberstam [1966], Gallagher [1967], Montgomery [1970] and Hooley [1975a]. Montgomery established an asymptotic formula in the special case, and Hooley simplified the proof and gave a more refined result. He showed that if $a_n = \Lambda(n)$, then for any positive constant $A$,

\[ V(x, x) = x^2 \log x + D_1 x^2 + O\left(\frac{x^2}{(\log x)^A}\right) \tag{1.6} \]

and

\[ V(x, Q) = Qx \log Q + D_2 Qx + O(Q^{5/4} x^{3/4}) + O\left(\frac{x^2}{(\log x)^A}\right), \tag{1.7} \]

where $D_1$ and $D_2$ are constants. If one assumes the generalized Riemann hypothesis, which is mentioned in §1.1, then the last error terms in (1.6) and (1.7) can be replaced by $O(x^{\frac{11}{7}+\varepsilon})$, where $\varepsilon$ stands for an arbitrary positive constant. Later, Hooley [1975b], [1975c], [1998a], [1998b], [2002], [2005], [2007] developed the subject further by studying wide-reaching generalizations. In particular, his method established that an asymptotic formula can be obtained for a wide class of $\{a_n\}$ if one has an asymptotic formula of the kind

\[ A(x; q, a) \sim x f(q, a) \]

with a reasonable error term, analogous to the Siegel-Walfisz theorem, and some control over the behaviour of the mean square

\[ \sum_{n \le x} a_n^2. \]

In the special case of the von Mangoldt function, further refinements occur in Friedlander and Goldston [1996], and in Goldston and Vaughan [1997]. The latter paper shows that some advantage could be accrued by applying a version of the Hardy-Littlewood circle method. In detail, suppose that the generalized Riemann hypothesis holds and let

\[ U(x, Q) = V(x, Q) - Qx \log Q - D_2 Qx, \]

then for any constant $\varepsilon > 0$,

\[ U(x, Q) \ll Q^2 (x/Q)^{\frac{1}{4}+\varepsilon} + x^{\frac{3}{2}} (\log 2x)^{\frac{5}{2}} (\log\log 3x)^2. \]

Note that for any constant $c > 0$,

\[ \sum_{n \le x} \Lambda(n) = x + O\left(\frac{x}{\exp(c\sqrt{\log x})}\right), \]

and

\[ \sum_{n \le x} \Lambda(n)^2 = x \log x + O(x). \]

In other words, $\Lambda(n)$ is a sequence whose mean value is a constant but whose mean square is unbounded. Having shown the usefulness of this method in that specific question, Vaughan [1998a], [1998b] then considered the more general problem. Suppose that

M(x) = x

and $f(q, a)$ depends only on $q$ and $(q, a)$, and the real sequence $\{a_n\}$ satisfies

\[ \sum_{n \le x} a_n^2 \ll x, \]

i.e. the mean square is bounded. For convenience write $f(q, (q, a))$ for $f(q, a)$ and suppose further that

\[ A(x; q, a) = x f(q, (q, a)) + O\left(\frac{x}{\Psi(x)}\right), \]

where $\Psi(x)$ is an increasing function with $\Psi(x) > \log x$ for all large $x$, $\Psi(1) > 0$ and

\[ \int_1^x \Psi(y)^{-1}\,dy \ll x\Psi(x)^{-1}. \]

Then it was shown that

\[ V(x, Q) = Q \sum_{n \le x} a_n^2 - Qx \sum_{q=1}^{\infty} g(q) + U(x, Q), \]

where

\[ U(x, Q) \ll x^{3/2} \log x + x^2 (\log 2x)^{9/2} \Psi(x)^{-1} + x^2 (\log x)^{4/3} \Psi(x)^{-2/3} + Q^2 E(x/Q), \]

\[ E(z) = \int_0^z \sum_{q > y} g(q)\,dy, \]

and

\[ g(q) = \varphi(q) \Big( \sum_{r \mid q} f(q, r)\mu(q/r) \Big)^2. \]

After that, more examples of the application of the Hardy-Littlewood circle method were given. Another example of a sequence whose mean value is a constant but whose mean square is unbounded is $a_n = r(n)$, the number of ways of writing a positive integer $n$ as the sum of two squares. This was studied in the context of the above variance by Dancs [2002]. It is, therefore, of some interest to consider examples in which both the mean square and the mean value of $\{a_n\}$ are unbounded. An example of such a sequence is $a_n = d(n)$, the divisor function, which is defined as the number of positive divisors of $n$:

\[ d(n) = \sum_{m \mid n} 1. \]

It is shown that

\[ \sum_{n \le x} d(n) = x \log x + Cx + O(x^{1/3}) \]

for some constant $C$, so the mean value of $d(n)$ is unbounded, let alone the mean square. The above variance has been studied in this special case by Pongsriiam [2012] and Pongsriiam and Vaughan [2015]. They proved that if $a_n = d(n)$, then for any $\varepsilon > 0$,

\[ V(x, x) = \frac{x^2}{\pi^2} (\log x)^3 + g_2 x^2 (\log x)^2 + g_1 x^2 \log x + g_0 x^2 + O(x^{5/3+\varepsilon}), \]

and for $1 \le Q < x$ and $0 < \Theta < 1$,

\[ V(x, Q) = \frac{Qx}{\pi^2} \Big( \log \frac{Q^2}{x} \Big)^3 + h_3 Qx \Big( \log \frac{Q^2}{x} \Big)^2 + h_2 Qx \log \frac{Q^2}{x} + h_1 Qx \log x + h_0 Qx + O\big(x^{5/3+\varepsilon} + Q^{1+\Theta} x^{1-\Theta} (\log x)^2\big), \]

where $g_i$ ($i = 0, 1, 2$) and $h_j$ ($j = 0, 1, 2, 3$) are constants.

Chapter 2 | Two Powerful Methods

In this chapter, we will introduce two powerful methods: the Hardy-Littlewood circle method and the Farey sequence. The introduction to the Hardy-Littlewood circle method follows R. C. Vaughan [1997], and the introduction to the Farey sequence follows I. Niven, H. S. Zuckerman and H. L. Montgomery [1991], and D. A. Goldston and R. C. Vaughan [1997]. We will use these methods in Chapter 3 when applying the standard initial procedure to the variance (1.5).

2.1 The Hardy-Littlewood Circle Method

The idea of the Hardy-Littlewood circle method was first introduced in 1916 or 1917 in work on the asymptotics of the partition function. It was then taken up by many other researchers in analytic number theory, including I. M. Vinogradov, who introduced a number of notable refinements. It became a powerful method in the study of Waring's problem, the binary and ternary Goldbach problems (see §1.1), as well as Diophantine equations and inequalities. In recent years, the circle method has also been applied in a wide range of settings outside the area of analytic number theory. The initial version of the circle method is concerned with the generating function of a series. For example, for a fixed integer $k > 0$, let $R_s(n)$ be the number of (ordered) representations of $n$ as the sum of $s$ $k$-th powers of positive integers. In order to find $G(k)$ in Waring's problem (see §1.1), it is sufficient to find the least possible $s$ such that $R_s(n) > 0$ for every sufficiently large $n$. Consider

\[ F(z) = \sum_{m=1}^{\infty} z^{m^k}, \quad \text{where } z \in \mathbb{C},\ |z| < 1, \]

and its $s$-th power

\[ F(z)^s = \sum_{m_1=1}^{\infty} \sum_{m_2=1}^{\infty} \cdots \sum_{m_s=1}^{\infty} z^{m_1^k + m_2^k + \cdots + m_s^k} = \sum_{n=0}^{\infty} R_s(n) z^n. \]

The latter equality is proved by comparing the coefficients of $z^n$. By Cauchy's integral formula,

\[ R_s(n) = \frac{1}{2\pi i} \int_{\mathcal{C}} F(z)^s z^{-n-1}\,dz, \tag{2.1} \]

where $\mathcal{C}$ is a circle centered at 0 of radius $\rho$ and $0 < \rho < 1$. To estimate the above contour integral, Hardy and Littlewood introduced the terms major arcs and minor arcs to describe the separated parts of $\mathcal{C}$, where major arcs usually lead to the main terms and minor arcs usually lead to the error terms. Later, I. M. Vinogradov introduced several refinements, one of which was to replace the generating function by a finite sum of exponential functions. In the case of Waring's problem, he replaced $F(z)$ by

\[ f(\alpha) = \sum_{m=1}^{N} e(\alpha m^k), \]

where $N = [n^{1/k}]$, the integer part of $n^{1/k}$, and

\[ e(z) := e^{2\pi i z}, \quad z \in \mathbb{C}. \tag{2.2} \]

To find Rs(n), we need the following lemmas:

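Lemma 2.1. We have $e(z_1)e(z_2) = e(z_1 + z_2)$.

Proof. $e(z_1)e(z_2) = e^{2\pi i z_1} e^{2\pi i z_2} = e^{2\pi i (z_1 + z_2)} = e(z_1 + z_2)$.

Lemma 2.2. For any $r \in \mathbb{R}$ and $h \in \mathbb{Z}$,

\[ \int_r^{1+r} e(\alpha h)\,d\alpha = \begin{cases} 1, & \text{when } h = 0, \\ 0, & \text{when } h \ne 0. \end{cases} \tag{2.3} \]

Proof. If $h = 0$, then

\[ \int_r^{1+r} e(0)\,d\alpha = \int_r^{1+r} d\alpha = 1. \]

If $h \ne 0$, then

\[ \int_r^{1+r} e(\alpha h)\,d\alpha = \frac{e((1 + r)h) - e(rh)}{2\pi i h} = \frac{e(rh)(e(h) - 1)}{2\pi i h} = 0. \]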

Now by Lemma 2.1,

\[ f(\alpha)^s = \sum_{m_1=1}^{N} \sum_{m_2=1}^{N} \cdots \sum_{m_s=1}^{N} e(\alpha(m_1^k + m_2^k + \cdots + m_s^k)) = \sum_{m=1}^{sN^k} R_s(m, n) e(\alpha m), \]

where $R_s(m, n)$ is the number of (ordered) representations of $m$ as the sum of $s$ $k$-th powers of positive integers, none of which exceeds $n$. Therefore, if $m \le n$, then $R_s(m, n) = R_s(m)$. By Lemma 2.2,

\[ \int_r^{1+r} f(\alpha)^s e(-\alpha n)\,d\alpha = R_s(n, n) = R_s(n). \tag{2.4} \]

Under this refinement, the major arcs and the minor arcs are subintervals of the unit interval $[r, 1 + r]$, where they lead to main terms and error terms respectively.

To estimate the variance (1.5), we will use the refined version of the Hardy-Littlewood circle method. In other words, we will use a finite sum of exponential functions and a line integral similar to (2.4). After that, to construct major arcs and minor arcs on the unit interval, we need to use the Farey sequence.
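The identity (2.4) can be checked numerically for small parameters. The following is a minimal sketch, not part of the original argument: the function names are hypothetical, and the integral over the unit interval is replaced by an average over $M$ equally spaced points, which is exact here because $M$ exceeds every frequency $|m - n|$ that occurs in $f(\alpha)^s e(-\alpha n)$.

```python
import cmath
from itertools import product


def e(z):
    """e(z) = exp(2*pi*i*z), as in (2.2)."""
    return cmath.exp(2j * cmath.pi * z)


def kth_root_floor(n, k):
    """Largest integer N with N**k <= n (n >= 1)."""
    N = int(round(n ** (1.0 / k)))
    while N ** k > n:
        N -= 1
    while (N + 1) ** k <= n:
        N += 1
    return N


def count_brute(n, s, k):
    """R_s(n) by enumerating ordered s-tuples of positive integers."""
    N = kth_root_floor(n, k)
    return sum(1 for ms in product(range(1, N + 1), repeat=s)
               if sum(m ** k for m in ms) == n)


def count_by_orthogonality(n, s, k):
    """R_s(n) from f(alpha)^s e(-alpha n), averaged over M sample points.

    Assumption of this sketch: a discrete average stands in for the
    integral in (2.4); it is exact since all frequencies are below M.
    """
    N = kth_root_floor(n, k)
    M = max(n, s * N ** k) + 1
    total = 0
    for j in range(M):
        alpha = j / M
        f = sum(e(alpha * m ** k) for m in range(1, N + 1))
        total += f ** s * e(-alpha * n)
    return round((total / M).real)
```

For instance, with $k = 2$, $s = 2$ and $n = 25$, both functions count the two ordered representations $9 + 16$ and $16 + 9$.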

2.2 The Farey Sequence

For any given integer $n > 0$, a Farey fraction of order $n$ is defined as a fraction $a/q$ such that $a, q \in \mathbb{Z}$, $1 \le q \le n$ and $(a, q) = 1$, and the Farey sequence of order $n$ is defined as the sequence of all Farey fractions of order $n$, listed in order of their size (see I. Niven, H. S. Zuckerman and H. L. Montgomery [1991] p. 300). For example, the Farey sequence of order 3 is

\[ \cdots, \frac{-2}{1}, \frac{-5}{3}, \frac{-3}{2}, \frac{-4}{3}, \frac{-1}{1}, \frac{-2}{3}, \frac{-1}{2}, \frac{-1}{3}, \frac{0}{1}, \frac{1}{3}, \frac{1}{2}, \frac{2}{3}, \frac{1}{1}, \frac{4}{3}, \frac{3}{2}, \frac{5}{3}, \frac{2}{1}, \cdots \]

Now let $a_-/q_-$, $a/q$ and $a_+/q_+$ be three consecutive fractions, ordered from the smallest to the largest, in the Farey sequence of order $n$. Then obviously

\[ \frac{a_-}{q_-} < \frac{a_- + a}{q_- + q} < \frac{a}{q} < \frac{a + a_+}{q + q_+} < \frac{a_+}{q_+}. \]

Note that we may also treat $a/q$ as $(a_-)_+/(q_-)_+$ or $(a_+)_-/(q_+)_-$. So if we define

\[ M(q, a) := \left( \frac{a + a_-}{q + q_-}, \frac{a + a_+}{q + q_+} \right] \tag{2.5} \]

then all these $M(q, a)$ such that $a, q \in \mathbb{Z}$, $1 \le q \le n$ and $(a, q) = 1$ form a partition of $\mathbb{R} = (-\infty, \infty)$. If we further assume that $1 \le a \le q \le n$, then $a/q$ runs over all Farey fractions of order $n$ between $1/n$ and $1/1$. Note that the preceding term of $1/n$ is $0/1$, and the succeeding term of $1/1$ is $(n + 1)/n$, so the lower endpoint of $M(n, 1)$ is $1/(n + 1)$, and the upper endpoint of $M(1, 1)$ is $(n + 2)/(n + 1)$. Therefore, all $M(q, a)$ such that $a, q \in \mathbb{Z}$, $1 \le a \le q \le n$ and $(a, q) = 1$ form a partition of the unit interval

\[ \left( \frac{1}{n + 1}, \frac{n + 2}{n + 1} \right]. \]

The following lemma describes the length of each subinterval M(q, a).

Lemma 2.3. If $a/b$ and $c/d$ are Farey fractions of order $n$ such that no other Farey fraction of order $n$ lies between them, then

\[ \left| \frac{a}{b} - \frac{a + c}{b + d} \right| = \frac{1}{b(b + d)} \le \frac{1}{b(n + 1)} \]

and

\[ \left| \frac{c}{d} - \frac{a + c}{b + d} \right| = \frac{1}{d(b + d)} \le \frac{1}{d(n + 1)}. \]

Proof. See I. Niven, H. S. Zuckerman and H. L. Montgomery [1991] p. 301-302, Theorem 6.7.
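Lemma 2.3 lends itself to an exact check with rational arithmetic. The sketch below is illustrative only: the helper names are hypothetical, and for simplicity it restricts the Farey sequence to the fractions in $[0, 1]$, which is an assumption of the example rather than of the lemma.

```python
from fractions import Fraction
from math import gcd


def farey(n):
    """Farey fractions a/q of order n with 0 <= a/q <= 1, in increasing order."""
    return sorted({Fraction(a, q)
                   for q in range(1, n + 1)
                   for a in range(0, q + 1)
                   if gcd(a, q) == 1})


def check_lemma_2_3(n):
    """Verify both identities and both inequalities of Lemma 2.3 for order n."""
    seq = farey(n)
    for left, right in zip(seq, seq[1:]):
        a, b = left.numerator, left.denominator
        c, d = right.numerator, right.denominator
        mediant = Fraction(a + c, b + d)
        # Distances from each neighbour to the mediant, computed exactly.
        if mediant - left != Fraction(1, b * (b + d)):
            return False
        if right - mediant != Fraction(1, d * (b + d)):
            return False
        # The inequalities follow from b + d >= n + 1 for adjacent fractions.
        if b + d < n + 1:
            return False
    return True
```

Calling `farey(3)` returns the portion $0/1, 1/3, 1/2, 2/3, 1/1$ of the sequence displayed earlier, and `check_lemma_2_3(n)` can be run for any small order.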

In general, if R ≥ 1 and R is not necessarily an integer, then define the Farey sequence of order R as the Farey sequence of order [R], where [R] stands for the integer part of R, which is the unique integer satisfying

R − 1 < [R] ≤ R.

To construct the major arc and the minor arc from M(q, a) under the Farey sequence of order R, we need the following lemma:

Lemma 2.4. If $a, q \in \mathbb{Z}$, $1 \le a \le q \le R$ and $(a, q) = 1$, and $a_-/q_-$ and $a_+/q_+$ are the preceding and succeeding terms of $a/q$ in the Farey sequence of order $R$ respectively, then

\[ \frac{1}{2qR} \le \frac{a}{q} - \frac{a + a_-}{q + q_-} < \frac{1}{qR} \]

and

\[ \frac{1}{2qR} \le \frac{a + a_+}{q + q_+} - \frac{a}{q} < \frac{1}{qR}. \]

Proof. By Lemma 2.3,

\[ \frac{a}{q} - \frac{a + a_-}{q + q_-} = \frac{1}{q(q + q_-)} \le \frac{1}{q([R] + 1)} < \frac{1}{qR}, \]

and

\[ q + q_- \le 2R. \]

Hence the first inequality holds. The second inequality can be proved similarly.

Now define

\[ N(q, a) := \left( \frac{a}{q} - \frac{1}{2qR}, \frac{a}{q} + \frac{1}{2qR} \right] \tag{2.6} \]

under the Farey sequence of order $R$. Then by Lemma 2.4,

\[ N(q, a) \subset M(q, a). \tag{2.7} \]

Therefore, we let all $N(q, a)$ and all $M(q, a) \setminus N(q, a)$ be the major arcs and minor arcs on the unit interval

\[ \left( \frac{1}{[R] + 1}, \frac{[R] + 2}{[R] + 1} \right] \tag{2.8} \]

respectively. In Chapter 3, we will see how they lead to main terms and error terms of the variance $V(x, Q)$.
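The construction of the arcs can be illustrated for a small, non-integer order $R$. This is a hedged sketch with hypothetical function names; it assumes the half-open convention (left endpoint excluded, right endpoint included) for both $M(q, a)$ and $N(q, a)$, and it builds the Farey fractions in $[0, 2]$ so that every $a/q$ in $(0, 1]$ has both of its neighbours available.

```python
from fractions import Fraction
from math import floor, gcd


def arcs(R):
    """Return (a/q, M_low, M_high, N_low, N_high) for 1 <= a <= q <= R, (a, q) = 1."""
    n = floor(R)
    seq = sorted({Fraction(a, q)
                  for q in range(1, n + 1)
                  for a in range(0, 2 * q + 1)
                  if gcd(a, q) == 1})
    out = []
    for prev, cur, nxt in zip(seq, seq[1:], seq[2:]):
        if not (0 < cur <= 1):
            continue
        q = cur.denominator
        m_low = Fraction(prev.numerator + cur.numerator, prev.denominator + q)
        m_high = Fraction(cur.numerator + nxt.numerator, q + nxt.denominator)
        half = 1 / (2 * q * Fraction(R))  # 1/(2qR), kept exact
        out.append((cur, m_low, m_high, cur - half, cur + half))
    return out


def check_arcs(R):
    """Check Lemma 2.4, the inclusion (2.7) and the partition of (2.8)."""
    n = floor(R)
    data = arcs(R)
    if data[0][1] != Fraction(1, n + 1) or data[-1][2] != Fraction(n + 2, n + 1):
        return False
    for (_, _, hi, _, _), (_, lo, _, _, _) in zip(data, data[1:]):
        if hi != lo:  # consecutive arcs M(q, a) must share an endpoint
            return False
    for cur, m_low, m_high, n_low, n_high in data:
        bound = 1 / (cur.denominator * Fraction(R))  # 1/(qR)
        if not (m_low <= n_low and n_high <= m_high):
            return False
        if not (cur - m_low < bound and m_high - cur < bound):
            return False
    return True
```

With $R = 7.5$, for example, the arcs $M(q, a)$ tile the interval from $1/8$ to $9/8$.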

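Chapter 3 | The Standard Initial Procedure

Now we focus on the variance $V(x, Q)$. By (1.3), (1.5) and expanding the square, we have

\[ V(x, Q) = 2S_1 - 2S_2 + S_3 + [Q] \sum_{n \le x} a_n^2, \tag{3.1} \]

where

\[ S_1 = \sum_{q \le Q} \sum_{\substack{m < n \le x \\ m \equiv n \,(\mathrm{mod}\, q)}} a_m a_n, \tag{3.2} \]

\[ S_2 = M(x) \sum_{q \le Q} \sum_{a=1}^{q} f(q, a) \sum_{\substack{n \le x \\ n \equiv a \,(\mathrm{mod}\, q)}} a_n, \tag{3.3} \]

\[ S_3 = M(x)^2 \sum_{q \le Q} \sum_{a=1}^{q} f(q, a)^2, \tag{3.4} \]

and $[Q]$ is the integer part of $Q$.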
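The decomposition (3.1) is a purely algebraic identity, so it can be confirmed exactly on a toy sequence. The sketch below uses hypothetical names and an arbitrary choice of $a_n$, $f$ and $M$ (these choices are assumptions made only for illustration); $Q$ is taken to be an integer so that $[Q] = Q$.

```python
from fractions import Fraction


def A(a, q, r):
    """A(x; q, r): sum of a[n] over 1 <= n <= x with n ≡ r (mod q); a[0] is unused."""
    return sum(a[n] for n in range(1, len(a)) if (n - r) % q == 0)


def variance_direct(a, Q, f, M):
    """V(x, Q) computed straight from definition (1.5)."""
    return sum((A(a, q, r) - f(q, r) * M) ** 2
               for q in range(1, Q + 1) for r in range(1, q + 1))


def variance_expanded(a, Q, f, M):
    """V(x, Q) computed as 2*S1 - 2*S2 + S3 + Q * sum(a_n^2), following (3.1)-(3.4)."""
    x = len(a) - 1
    S1 = sum(a[m] * a[n]
             for q in range(1, Q + 1)
             for n in range(1, x + 1)
             for m in range(1, n)
             if (n - m) % q == 0)
    S2 = M * sum(f(q, r) * A(a, q, r)
                 for q in range(1, Q + 1) for r in range(1, q + 1))
    S3 = M ** 2 * sum(f(q, r) ** 2
                      for q in range(1, Q + 1) for r in range(1, q + 1))
    return 2 * S1 - 2 * S2 + S3 + Q * sum(v * v for v in a[1:])
```

Using `Fraction` values keeps the comparison exact, so the two functions should agree identically rather than up to rounding.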

It is possible to rewrite $-2S_2 + S_3$ as $-S_3 + 2(S_3 - S_2)$, where $S_3 - S_2$ is usually an error term. Also, in order to avoid $[Q]$ when $Q$ is not an integer, we would rather use

\[ Q \sum_{n \le x} a_n^2 + O\Big( \sum_{n \le x} a_n^2 \Big) \]

than

\[ [Q] \sum_{n \le x} a_n^2. \]

The most important part is to estimate $S_1$. To do this, we use the refined version of the Hardy-Littlewood circle method to rewrite the sum as a line integral, and then we use the Farey sequence to construct the major and minor arcs.

3.1 The Treatment of S1

Here we give the main conclusions, following §3 of D. A. Goldston and R. C. Vaughan [1997], §4 of R. C. Vaughan [1998a], and §4 of R. C. Vaughan [1998b]. Let

\[ F(\alpha) = \sum_{q \le Q} \sum_{r \le x/q} e(\alpha q r), \tag{3.5} \]

\[ G(\alpha) = \sum_{n \le x} a_n e(n\alpha). \tag{3.6} \]

Then by Lemma 2.1 and Lemma 2.2, for any $r \in \mathbb{R}$,

\[ S_1 = \int_r^{1+r} F(\alpha) |G(\alpha)|^2\,d\alpha. \tag{3.7} \]

We also have

F (α) = Fq(α) + Hq(α), where X  X X  Fq(α) = + e(αlm) (3.8) √ √ l≤ x m≤x/l x

\[ 2\sqrt{x} \le R \le \frac{1}{2} x. \tag{3.9} \]

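Let $a, q \in \mathbb{Z}$, $1 \le a \le q \le R$ and $(a, q) = 1$, and define $M(q, a)$ under the Farey sequence of order $R$ as in (2.5). By §2.2, all these $M(q, a)$ form a partition of the unit interval (2.8). So, taking

\[ r = \frac{1}{[R] + 1} \]

in (3.7), we have

\[ S_1 = \sum_{q \le R} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_{M(q,a)} F_q(\alpha) |G(\alpha)|^2\,d\alpha + \sum_{q \le R} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_{M(q,a)} H_q(\alpha) |G(\alpha)|^2\,d\alpha. \tag{3.10} \]

Now let $\alpha \in M(q, a)$ and $\beta = \alpha - a/q$. By (2.5), Lemma 2.4 and (3.9),

\[ |\beta| < \frac{1}{qR} \le \frac{1}{2q\sqrt{x}}, \]

so

\[ H_q(\alpha) \ll (\sqrt{x} + q) \log 2q \tag{3.11} \]

and

\[ F_q(\alpha) \ll \frac{x \log(2\sqrt{x}/q)}{q + qx|\beta|} \quad \text{if } q \le \sqrt{x}. \tag{3.12} \]

See D. A. Goldston and R. C. Vaughan [1997] p. 121-122, formulas (3.6) and (3.11).

To estimate $S_1$, note that $H_q(\alpha) \ll R \log x$ when $\alpha \in M(q, a)$, since $\sqrt{x} \le R/2$ and $q \le R \le x/2$. So the second term on the right-hand side of (3.10) is

\[ \ll R(\log x) \sum_{q \le R} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_{M(q,a)} |G(\alpha)|^2\,d\alpha = R(\log x) \sum_{n \le x} a_n^2. \]

Therefore, we can write it as

\[ O\Big( R(\log x) \sum_{n \le x} a_n^2 \Big), \]

which dominates

\[ O\Big( \sum_{n \le x} a_n^2 \Big) \]

since $R \log x$ is sufficiently large.

By (3.8), $F_q(\alpha) = 0$ when $q > \sqrt{x}$. Suppose that $q \le \sqrt{x}$ and define $N(q, a)$ under the Farey sequence of order $R$ as in (2.6). Then $F_q(\alpha) \ll R \log x$ if $\alpha \in M(q, a) \setminus N(q, a)$, or if $\alpha \in N(q, a)$ but $q > x/R$. So the first term on the right-hand side of (3.10) is

\[ S_4 + O\Big( R(\log x) \sum_{n \le x} a_n^2 \Big), \]

where

\[ S_4 := \sum_{q \le x/R} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_{N(q,a)} F_q(\alpha) |G(\alpha)|^2\,d\alpha = \sum_{q \le x/R} \sum_{\substack{a=1 \\ (a,q)=1}}^{q} \int_{-1/2qR}^{1/2qR} F_q(\beta) \Big| G\Big(\beta + \frac{a}{q}\Big) \Big|^2\,d\beta. \tag{3.13} \]

The last part of the equality comes from the substitution $\alpha = \beta + a/q$ and the property that $F_q(\alpha)$ is periodic with period $1/q$. Hence

\[ S_1 = S_4 + O\Big( R(\log x) \sum_{n \le x} a_n^2 \Big). \tag{3.14} \]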

3.2 Further Arrangement

So far, we have

\[ V(x, Q) = 2S_4 - S_3 + 2(S_3 - S_2) + Q \sum_{n \le x} a_n^2 + O\Big( R \log x \sum_{n \le x} a_n^2 \Big), \tag{3.15} \]

where $2S_4$, $-S_3$ and $Q \sum_{n \le x} a_n^2$ are expected to be the main terms, and $2(S_3 - S_2)$ and $O\big( R \log x \sum_{n \le x} a_n^2 \big)$ are expected to be the error terms. Note that

\[ Q \sum_{n \le x} a_n^2 \]

can dominate

\[ O\Big( R \log x \sum_{n \le x} a_n^2 \Big) \]

only when $Q/(R \log x)$ is large; otherwise, the result is trivial. So from now on, we always assume that

\[ x^{1/2} \log x \le Q \le x. \tag{3.16} \]

The treatment of $S_4$ depends on obtaining an approximation to $G(\beta + a/q)$ which can be considered a product of local factors. In the examples discussed in §1.3, it is possible to estimate $G(\beta + a/q)$ as $\nu(q)J(\beta)$ for some $\nu(q)$ and $J(\beta)$ that are independent of $a$, at least when $(a, q) = 1$, so the integral will be changed to

\[ \int_{-1/2qR}^{1/2qR} F_q(\beta) |J(\beta)|^2\,d\beta \tag{3.17} \]

when factoring out $\varphi(q)|\nu(q)|^2$. In other situations, we have to deal with more complex approximations, for example, $\nu(q, a)J(\beta)$. Nevertheless, in this case, it is possible to factor out

\[ \sum_{\substack{a=1 \\ (a,q)=1}}^{q} |\nu(q, a)|^2 \]

for a given $q$, and the integral will still be changed to (3.17), which is likely to be close to

\[ \int_{-1/2}^{1/2} F_q(\beta) |J(\beta)|^2\,d\beta. \]

Chapter 4 | An Example: an = r3(n)

Chapter 3 shows the standard initial procedure to estimate the variance V (x, Q) from the Hardy-Littlewood circle method and the Farey sequence. As an application, we will focus on the example an = r3(n), the number of (ordered) representations of n as the sum of three positive cubes. §4.1 introduces r3(n) and gives a brief history on the study of the function, which helps to find the corresponding expressions A(x; q, a), f(q, a), M(x) and the variance V (x, Q) under this circumstance, and §4.2 redefines the upper and lower exponents to estimate the size of the error terms. Again, in this case, we still assume that x is sufficiently large and Q satisfies (3.16).

4.1 Introduction to r3(n)

By definition, the mathematical expression of r3(n) is

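\[ r_3(n) = \sum_{\substack{x_1, x_2, x_3 \\ x_1^3 + x_2^3 + x_3^3 = n}} 1. \tag{4.1} \]

In spite of significant work by Hooley, this function is only poorly understood. The best results known about the mean value and the mean square of $r_3(n)$ (Vaughan [2021]) are

\[ \sum_{n \le x} r_3(n) = \Gamma\Big(\frac{4}{3}\Big)^3 x - \frac{\Gamma(\frac{4}{3})^2}{2\Gamma(\frac{5}{3})} x^{\frac{2}{3}} + O\big(x^{\frac{5}{9}} (\log x)^{\frac{1}{3}}\big), \tag{4.2} \]

and

\[ \sum_{n \le x} r_3(n)^2 \ll x^{\frac{7}{6}} (\log x)^{\varepsilon - \frac{5}{2}}, \tag{4.3} \]

where $\Gamma(t)$ is the gamma function:

\[ \Gamma(t) = \int_0^{\infty} x^{t-1} e^{-x}\,dx, \quad t > 0. \tag{4.4} \]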

Note that in all of the special cases studied hitherto, which are mentioned in §1.3, the behavior of the mean square is at least well understood even if it is not bounded.

However, in the case of r3(n), the size of its mean square is unknown, and we are not sure whether the mean square is bounded or not. Therefore, this case is more difficult than the previous ones.

There are several conjectures on the mean square of $r_3(n)$. Hooley [1996] has shown that

\[ \sum_{n \le x} r_3(n)^2 \ll x^{1+\varepsilon} \]

on the assumption that a certain Hasse-Weil L-function satisfies the Riemann Hypothesis. It may be true that

\[ \sum_{n \le x} r_3(n)^2 \sim Cx \]

for some positive constant $C$, but we only know a lower bound of this order with a value of $C$ larger than the obvious guess. See Hooley [1986b]. Also, by refining the deep work of Hooley [1981], Vaughan [2021] has shown that for every $\varepsilon > 0$ there exists a positive constant $\delta$ such that for all $Q \le x$,

\[ \sum_{q \le Q} \max_{a} \sup_{y \le x} \Big| \Upsilon(y; q, a) - \Gamma\Big(\frac{4}{3}\Big)^3 y \rho(q, a) q^{-3} \Big| \ll_{\varepsilon} x^{\frac{8}{9}+\varepsilon} + x^{\frac{1}{3}} Q^{\frac{2}{9}} \big(Q^{\frac{10}{9}} + x^{\frac{5}{9}}\big) (\log x)^{-\delta}, \tag{4.5} \]

where

\[ \Upsilon(x; q, a) = \sum_{\substack{n \le x \\ n \equiv a \,(\mathrm{mod}\, q)}} r_3(n), \tag{4.6} \]

and $\rho(q, a)$ denotes the number of solutions of the congruence $l_1^3 + l_2^3 + l_3^3 \equiv a \pmod{q}$.

When comparing (4.2), (4.5) and (4.6) with (1.3) and (1.4), we have: if $a_n = r_3(n)$, then

\[ A(x; q, a) = \Upsilon(x; q, a), \tag{4.7} \]

\[ f(q, a) = \rho(q, a) q^{-3}, \tag{4.8} \]

\[ M(x) = \Gamma\Big(\frac{4}{3}\Big)^3 x, \tag{4.9} \]

and the variance (1.5) becomes

\[ V(x, Q) = \sum_{q \le Q} \sum_{a=1}^{q} \Big| \Upsilon(x; q, a) - \Gamma\Big(\frac{4}{3}\Big)^3 x \rho(q, a) q^{-3} \Big|^2. \tag{4.10} \]
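For small $x$ and $Q$, every quantity in (4.10) can be computed by brute force. The following sketch is only an illustration with hypothetical function names; it enumerates cubes and residues directly and is not meant to reflect how the estimates in this dissertation are obtained.

```python
from math import gamma


def r3_table(x):
    """r[n] = r_3(n) for 0 <= n <= x (ordered triples of positive cubes)."""
    cubes = []
    m = 1
    while m ** 3 <= x:
        cubes.append(m ** 3)
        m += 1
    r = [0] * (x + 1)
    for c1 in cubes:
        for c2 in cubes:
            for c3 in cubes:
                if c1 + c2 + c3 <= x:
                    r[c1 + c2 + c3] += 1
    return r


def rho(q, a):
    """Number of (l1, l2, l3) mod q with l1^3 + l2^3 + l3^3 ≡ a (mod q)."""
    return sum(1
               for l1 in range(q) for l2 in range(q) for l3 in range(q)
               if (l1 ** 3 + l2 ** 3 + l3 ** 3 - a) % q == 0)


def upsilon(r, q, a):
    """Upsilon(x; q, a) as in (4.6), with x = len(r) - 1."""
    return sum(r[n] for n in range(1, len(r)) if (n - a) % q == 0)


def variance(x, Q):
    """V(x, Q) as in (4.10), for integer Q."""
    r = r3_table(x)
    c = gamma(4 / 3) ** 3
    return sum((upsilon(r, q, a) - c * x * rho(q, a) / q ** 3) ** 2
               for q in range(1, Q + 1) for a in range(1, q + 1))
```

Two sanity checks follow directly from the definitions: for each $q$ the values $\rho(q, a)$ sum to $q^3$ over $a$, and the values $\Upsilon(x; q, a)$ sum over $a$ to $\sum_{n \le x} r_3(n)$.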

Besides, by (3.3), (3.4) and (3.6), we need to update the expressions for $S_2$, $S_3$ and $G(\alpha)$:

\[ S_2 = \Gamma\Big(\frac{4}{3}\Big)^3 x \sum_{q \le Q} \sum_{a=1}^{q} \frac{\rho(q, a)}{q^3} \sum_{\substack{n \le x \\ n \equiv a \,(\mathrm{mod}\, q)}} r_3(n), \tag{4.11} \]

\[ S_3 = \Gamma\Big(\frac{4}{3}\Big)^6 x^2 \sum_{q \le Q} \sum_{a=1}^{q} \frac{\rho(q, a)^2}{q^6}, \tag{4.12} \]

\[ G(\alpha) = \sum_{n \le x} r_3(n) e(n\alpha), \tag{4.13} \]

while $F_q(\alpha)$ and $S_4$ are still defined by (3.8) and (3.13) respectively. Besides, $R$, the order of the Farey sequence, still satisfies (3.9), and the major arcs $N(q, a)$ are still defined by (2.6), where $1 \le a \le q \le R$ and $(a, q) = 1$. So far, there are two sets of definitions: those of the general problem and those of the special case $a_n = r_3(n)$. To avoid confusion, from now on, we will use the definitions in the special case unless otherwise stated.

4.2 Upper and Lower Exponents

By (3.15), the variance is

\[ V(x, Q) = 2S_4 - S_3 + 2(S_3 - S_2) + Q \sum_{n \le x} r_3(n)^2 + O\Big( R(\log x) \sum_{n \le x} r_3(n)^2 \Big), \tag{4.14} \]

so the mean square of $r_3(n)$, or equivalently, the second-moment sum

\[ \sum_{n \le x} r_3(n)^2 \tag{4.15} \]

is contained in both the main term and the error term. However, from the discussion in §4.1, its behavior is still unknown. Therefore, to estimate the size of the main term and the error term, we need to redefine the exponent of $x$ in an expression $E(x)$.

First, redefine the upper exponent $E_+$ and lower exponent $E_-$ of $x$ in $E(x)$ as follows:

\[ E_+ := \limsup_{x \to \infty} \frac{\log E(x)}{\log x} \tag{4.16} \]

and

\[ E_- := \liminf_{x \to \infty} \frac{\log E(x)}{\log x}. \tag{4.17} \]

If $E_+ = E_-$, then redefine the exponent $E$ of $x$ in $E(x)$ as

\[ E := E_+ = E_-. \tag{4.18} \]

To get the exponent of $x$ in (4.15), note that $r_3(n)$ is always a non-negative integer, so $r_3(n) \le r_3(n)^2$. Hence by (4.2) and (4.3),

\[ x \ll \sum_{n \le x} r_3(n)^2 \ll x^{\frac{7}{6}} (\log x)^{\varepsilon - \frac{5}{2}}. \]

Therefore, if we denote by $A_+$, $A_-$ and $A$ the upper exponent, the lower exponent and the exponent of $x$ in (4.15), then

\[ 1 \le A_- \le A_+ \le \frac{7}{6}. \tag{4.19} \]
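As a rough illustration of definitions (4.16) and (4.17), one can tabulate $\log E(x)/\log x$ at a finite $x$ for the first and second moments of $r_3(n)$. This sketch uses hypothetical names, and a value at a single finite $x$ is only a proxy: it says nothing rigorous about the limit superior or limit inferior.

```python
from math import log


def r3_table(x):
    """r[n] = r_3(n) for 0 <= n <= x (ordered triples of positive cubes)."""
    cubes = []
    m = 1
    while m ** 3 <= x:
        cubes.append(m ** 3)
        m += 1
    r = [0] * (x + 1)
    for c1 in cubes:
        for c2 in cubes:
            for c3 in cubes:
                if c1 + c2 + c3 <= x:
                    r[c1 + c2 + c3] += 1
    return r


def exponent_proxies(x):
    """Return (log(first moment)/log x, log(second moment)/log x) for x >= 3."""
    r = r3_table(x)
    first = sum(r[1:])
    second = sum(v * v for v in r[1:])
    return log(first) / log(x), log(second) / log(x)
```

Because $r_3(n) \le r_3(n)^2$, the second value is never smaller than the first, which mirrors the lower bound $1 \le A_-$ in (4.19).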

Later, we will show that the optimal choice for R depends on the value of A+, so the error term also depends on A+. Once the value of A+ is found after some research on the mean square of r3(n), the results mentioned in this dissertation will be finalized accordingly.

Chapter 5 | The Variance in the Special Case

By (4.14), in order to estimate the variance $V(x, Q)$, we still need to work on $S_4$, $S_3$ and $S_2 - S_3$. We start in §5.1 with several important lemmas, and we will use these lemmas to estimate the main term $S_3$ in §5.2, the error term $S_2 - S_3$ in §5.3, and the main term $S_4$ from §5.4 to §5.7, where the optimal choice for $R$ is given in §5.5. Finally, the conclusion will be stated in §5.8 as Theorem 5.1.

5.1 Several Lemmata

First, define the exponential sum S(q, a) as follows:

\[ S(q, a) := \sum_{m=1}^{q} e\Big(\frac{am^3}{q}\Big). \tag{5.1} \]

The following few lemmas show the properties of S(q, a).

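Lemma 5.1. Let $\rho(q, a)$ denote the number of solutions of the congruence $l_1^3 + l_2^3 + l_3^3 \equiv a \pmod{q}$. Then

\[ \rho(q, a) = \frac{1}{q} \sum_{b=1}^{q} e\Big(-\frac{ba}{q}\Big) S(q, b)^3, \tag{5.2} \]

and

\[ \sum_{a=1}^{q} \rho(q, a)^2 = \frac{1}{q} \sum_{b=1}^{q} |S(q, b)|^6 = q^5 \sum_{r \mid q} \frac{1}{r^6} \sum_{\substack{c=1 \\ (c,r)=1}}^{r} |S(r, c)|^6. \tag{5.3} \]

Proof. The value of

\[ \frac{1}{q} \sum_{b=1}^{q} e\Big(\frac{b(l_1^3 + l_2^3 + l_3^3 - a)}{q}\Big) \]

is 1 if $(l_1^3 + l_2^3 + l_3^3 - a)/q$ is an integer, and 0 otherwise. In other words,

\[ \frac{1}{q} \sum_{b=1}^{q} e\Big(-\frac{ba}{q}\Big) e\Big(\frac{bl_1^3}{q}\Big) e\Big(\frac{bl_2^3}{q}\Big) e\Big(\frac{bl_3^3}{q}\Big) = 1 \]

iff $(l_1, l_2, l_3) \pmod{q}$ is a solution of the congruence $l_1^3 + l_2^3 + l_3^3 \equiv a \pmod{q}$. So the number of all such solutions is

\begin{align*}
\rho(q, a) &= \sum_{l_1=1}^{q} \sum_{l_2=1}^{q} \sum_{l_3=1}^{q} \frac{1}{q} \sum_{b=1}^{q} e\Big(-\frac{ba}{q}\Big) e\Big(\frac{bl_1^3}{q}\Big) e\Big(\frac{bl_2^3}{q}\Big) e\Big(\frac{bl_3^3}{q}\Big) \\
&= \frac{1}{q} \sum_{b=1}^{q} e\Big(-\frac{ba}{q}\Big) \sum_{l_1=1}^{q} e\Big(\frac{bl_1^3}{q}\Big) \sum_{l_2=1}^{q} e\Big(\frac{bl_2^3}{q}\Big) \sum_{l_3=1}^{q} e\Big(\frac{bl_3^3}{q}\Big) \\
&= \frac{1}{q} \sum_{b=1}^{q} e\Big(-\frac{ba}{q}\Big) S(q, b)^3.
\end{align*}

We have $\rho(q, a) = \overline{\rho(q, a)} = |\rho(q, a)|$ since $\rho(q, a)$ is a nonnegative integer. So

\begin{align*}
\sum_{a=1}^{q} \rho(q, a)^2 &= \sum_{a=1}^{q} \frac{1}{q} \sum_{b_1=1}^{q} e\Big(-\frac{b_1 a}{q}\Big) S(q, b_1)^3 \cdot \frac{1}{q} \sum_{b_2=1}^{q} e\Big(\frac{b_2 a}{q}\Big) \overline{S(q, b_2)}^3 \\
&= \frac{1}{q^2} \sum_{b_1=1}^{q} \sum_{b_2=1}^{q} S(q, b_1)^3 \overline{S(q, b_2)}^3 \sum_{a=1}^{q} e\Big(\frac{(b_2 - b_1)a}{q}\Big) \\
&= \frac{1}{q^2} \sum_{1 \le b_1 = b_2 \le q} S(q, b_1)^3 \overline{S(q, b_2)}^3 \cdot q \\
&= \frac{1}{q} \sum_{b=1}^{q} |S(q, b)|^6.
\end{align*}

To complete the proof, note that if $c/r$ is the simplified form of $b/q$, in other words, if $c/r = b/q$ and $(c, r) = 1$, then $S(q, b) = (q/r)S(r, c)$. So

\[ \sum_{b=1}^{q} |S(q, b)|^6 = \sum_{r \mid q} \sum_{\substack{c=1 \\ (c,r)=1}}^{r} \Big| \frac{q}{r} S(r, c) \Big|^6 = q^6 \sum_{r \mid q} \frac{1}{r^6} \sum_{\substack{c=1 \\ (c,r)=1}}^{r} |S(r, c)|^6. \]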
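Both identities of Lemma 5.1 can be tested numerically for small moduli. The sketch below uses hypothetical names, evaluates $S(q, a)$ in floating point, and compares the three expressions in (5.3) up to a small tolerance; the tolerance and the reduction of exponents modulo $q$ are choices made for this example.

```python
import cmath
from math import gcd


def e(z):
    """e(z) = exp(2*pi*i*z)."""
    return cmath.exp(2j * cmath.pi * z)


def S(q, a):
    """Cubic exponential sum (5.1); exponents are reduced mod q for accuracy."""
    return sum(e(((a % q) * pow(m, 3, q)) % q / q) for m in range(1, q + 1))


def rho_brute(q, a):
    """Direct count of solutions of l1^3 + l2^3 + l3^3 ≡ a (mod q)."""
    return sum(1
               for l1 in range(q) for l2 in range(q) for l3 in range(q)
               if (l1 ** 3 + l2 ** 3 + l3 ** 3 - a) % q == 0)


def rho_exponential(q, a):
    """Right-hand side of (5.2)."""
    return sum(e(-((b * a) % q) / q) * S(q, b) ** 3 for b in range(1, q + 1)) / q


def second_moment_forms(q):
    """The three expressions appearing in (5.3), in order."""
    lhs = sum(rho_brute(q, a) ** 2 for a in range(1, q + 1))
    mid = sum(abs(S(q, b)) ** 6 for b in range(1, q + 1)) / q
    rhs = q ** 5 * sum(
        sum(abs(S(r, c)) ** 6 for c in range(1, r + 1) if gcd(c, r) == 1) / r ** 6
        for r in range(1, q + 1) if q % r == 0)
    return lhs, mid, rhs
```

For a modulus such as $q = 12$ the three values returned by `second_moment_forms` should agree up to rounding error.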

A character is called principal if it assumes the value 1 for arguments coprime to its modulus and otherwise is 0. We have

Lemma 5.2. Suppose that $p$ is a prime number and $p \nmid a$. Then

\[ S(p, a) = \sum_{\chi \in \mathcal{A}} \overline{\chi}(a) \tau(\chi), \tag{5.4} \]

where $\overline{\chi}$ represents the complex conjugate of $\chi$, $\mathcal{A}$ denotes the set of non-principal characters $\chi$ modulo $p$ for which $\chi^3$ is principal, $\operatorname{card} \mathcal{A} = (3, p - 1) - 1$ and $\tau(\chi)$ satisfies $|\tau(\chi)| = p^{1/2}$.

Proof. See Vaughan [1997] p.45-46, Lemma 4.3, and let $k = 3$.

Corollary 5.1. Suppose that $p$ is a prime number and $p \nmid a$. Then (i) $S(p, a) = 0$ if $p = 3$ or $p \equiv 2 \pmod{3}$, (ii) $|S(p, a)| \le 2p^{1/2}$ if $p \equiv 1 \pmod{3}$.

Proof. If $p = 3$ or $p \equiv 2 \pmod{3}$, then $\operatorname{card} \mathcal{A} = 0$, so there is no term on the right-hand side of (5.4). Hence $S(p, a) = 0$.

If $p \equiv 1 \pmod{3}$, then $\operatorname{card} \mathcal{A} = 2$, so let $\mathcal{A} = \{\chi_1, \chi_2\}$. By (5.4),

\[ S(p, a) = \overline{\chi_1}(a)\tau(\chi_1) + \overline{\chi_2}(a)\tau(\chi_2), \]

where

\[ |\tau(\chi_1)| = |\tau(\chi_2)| = p^{1/2}. \]

Also, by the definition of a character (see §1.1),

\[ |\chi_1(a)| = |\chi_2(a)| = 1 \]

since $a$ is coprime to $p$. Hence $|S(p, a)| \le 2p^{1/2}$.

Lemma 5.3. Suppose that $p$ is a prime number, $p \nmid a$ and $l$ is an integer. Then

\[ S(p^l, a) = \begin{cases} p, & \text{when } l = 2 \text{ and } p \ne 3, \\ p^2, & \text{when } l = 3, \\ p^2 S(p^{l-3}, a), & \text{when } l > 3. \end{cases} \tag{5.5} \]

Proof. See Vaughan [1997] p.46, Lemma 4.4, and let k = 3. Note that γ = 2 when p = 3 and γ = 1 otherwise (see Vaughan [1997] p.22).

Corollary 5.2. Suppose that $p$ is a prime number, $p \nmid a$, $u$ is a non-negative integer and $v = 1$, $2$ or $3$. Then $S(p^{3u+v},a) = p^{2u}S(p^v,a)$.

Proof. Use Lemma 5.3 and induction on $u$.
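As a sanity check, Lemma 5.3 and the first nontrivial case of Corollary 5.2 can be verified numerically for a few small primes. This is only a sketch: it samples $a = 1, \dots, p-1$ rather than all residues coprime to $p$, and the function names are hypothetical.

```python
import cmath


def cubic_sum(q, a):
    """S(q, a) = sum_{m=1}^{q} e(a m^3 / q)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * m**3) % q) / q)
               for m in range(1, q + 1))


def check_lemma_5_3(p, tol=1e-8):
    """Check S(p^2,a) = p (p != 3), S(p^3,a) = p^2 and S(p^4,a) = p^2 S(p,a)."""
    for a in range(1, p):  # sample of residues coprime to p
        if p != 3 and abs(cubic_sum(p**2, a) - p) > tol:
            return False
        if abs(cubic_sum(p**3, a) - p**2) > tol:
            return False
        # Corollary 5.2 with u = 1, v = 1
        if abs(cubic_sum(p**4, a) - p**2 * cubic_sum(p, a)) > tol:
            return False
    return True


if __name__ == "__main__":
    for p in (2, 3, 5, 7):
        print(p, check_lemma_5_3(p))
```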

Lemma 5.4. Suppose that (q, r) = (qr, a) = 1. Then

\[
S(qr,a) = S(q, ar^2)\,S(r, aq^2). \tag{5.6}
\]

Proof. See Vaughan [1997] p.47, Lemma 4.5, and let k = 3.
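The factorization (5.6) can also be confirmed directly for small coprime moduli. A minimal sketch, with illustrative names:

```python
import cmath
from math import gcd


def cubic_sum(q, a):
    """S(q, a) = sum_{m=1}^{q} e(a m^3 / q)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * m**3) % q) / q)
               for m in range(1, q + 1))


def check_lemma_5_4(q, r, tol=1e-8):
    """Check S(qr, a) = S(q, a r^2) S(r, a q^2) for all a coprime to qr."""
    if gcd(q, r) != 1:
        raise ValueError("q and r must be coprime")
    for a in range(1, q * r):
        if gcd(a, q * r) != 1:
            continue
        lhs = cubic_sum(q * r, a)
        rhs = cubic_sum(q, a * r * r) * cubic_sum(r, a * q * q)
        if abs(lhs - rhs) > tol:
            return False
    return True


if __name__ == "__main__":
    for q, r in ((4, 7), (7, 9), (8, 13)):
        print(q, r, check_lemma_5_4(q, r))
```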

Lemma 5.5. Suppose that $(q,a) = 1$. Then

\[
S(q,a) \ll q^{2/3}. \tag{5.7}
\]

Proof. See Vaughan [1997] p.47, Theorem 4.2, and let k = 3.

By prime factorization, every positive integer r can be written uniquely as

\[
r = r_1 r_2^2 r_3^3 \tag{5.8}
\]

where $r_1$ and $r_2$ are squarefree, and $(r_1, r_2) = 1$. Then the following lemma holds.
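The decomposition (5.8) is computed prime by prime: if $p^\alpha \,\|\, r$ and $\alpha = 3u+v$ with $v \in \{1,2,3\}$, then $p$ contributes $p^u$ (or $p^{u+1}$ when $v=3$) to $r_3$, and a single factor $p$ to $r_1$ or $r_2$ when $v = 1$ or $2$. A short sketch by trial division (the function name is illustrative):

```python
def cube_decomposition(r):
    """Return (r1, r2, r3) with r = r1 * r2**2 * r3**3, r1 and r2 squarefree and coprime."""
    if r < 1:
        raise ValueError("r must be a positive integer")
    r1 = r2 = r3 = 1
    n, p = r, 2
    while n > 1:
        if p * p > n:
            p = n  # the remaining cofactor is prime
        if n % p == 0:
            alpha = 0
            while n % p == 0:
                n //= p
                alpha += 1
            u, v = divmod(alpha - 1, 3)
            v += 1  # alpha = 3u + v with v in {1, 2, 3}
            if v == 1:
                r1 *= p
                r3 *= p**u
            elif v == 2:
                r2 *= p
                r3 *= p**u
            else:
                r3 *= p ** (u + 1)
        p += 1
    return r1, r2, r3


if __name__ == "__main__":
    for r in (1, 27, 64, 720):
        print(r, cube_decomposition(r))
```

For instance $720 = 2^4\cdot 3^2\cdot 5$ gives $(r_1, r_2, r_3) = (10, 3, 2)$.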

Lemma 5.6. If $(r,a) = 1$, then for every $\varepsilon > 0$,

\[
|S(r,a)| \ll r^{\varepsilon}\, r_1^{1/2} r_2 r_3^2, \tag{5.9}
\]

where $r_1$, $r_2$ and $r_3$ are given by (5.8).

Proof. The result is trivial when $r = 1$. So suppose that $r \ge 2$. First consider the case when $r = p^\alpha$, where $p$ is a prime number and $\alpha$ is a positive integer. Let $\alpha = 3u + v$, where $u$ is a nonnegative integer and $v = 1$, $2$ or $3$.

(i). If $v = 3$, then $r_1 = 1$, $r_2 = 1$, $r_3 = p^{u+1}$. By Corollary 5.2 and Lemma 5.3,

\[
S(p^\alpha,a) = S(p^{3u+3},a) = p^{2u+2} = r_1^{1/2} r_2 r_3^2.
\]

(ii). If $v = 2$, then $r_1 = 1$, $r_2 = p$, $r_3 = p^{u}$. By Corollary 5.2,

\[
S(p^\alpha,a) = S(p^{3u+2},a) = p^{2u}S(p^2,a).
\]

If $p \ne 3$, then by Lemma 5.3,

\[
S(p^\alpha,a) = p^{2u+1} = r_1^{1/2} r_2 r_3^2.
\]

If $p = 3$, then an immediate calculation gives $|S(p^2,a)| \le 9$ $(= 3p)$. So

\[
|S(p^\alpha,a)| \le 3p^{2u+1} = 3 r_1^{1/2} r_2 r_3^2.
\]

(iii). If $v = 1$, then $r_1 = p$, $r_2 = 1$, $r_3 = p^{u}$. By Corollary 5.2 and Corollary 5.1,

\[
|S(p^\alpha,a)| = |S(p^{3u+1},a)| = p^{2u}|S(p,a)| \le 2p^{2u+1/2} = 2 r_1^{1/2} r_2 r_3^2.
\]

To sum up, we always have

\[
|S(p^\alpha,a)| \le 3 r_1^{1/2} r_2 r_3^2. \tag{5.10}
\]

Finally, for every integer $r \ge 2$, use prime factorization: $r = \prod_{i=1}^{t} p_i^{\alpha_i}$, where the $p_i$ are distinct primes and the $\alpha_i$ are positive integers for $1 \le i \le t$. Then by Lemma 5.4 and induction, there exist $a_i$ for $1 \le i \le t$, such that $(p_i, a_i) = 1$ and

\[
S(r,a) = \prod_{i=1}^{t} S(p_i^{\alpha_i}, a_i).
\]

By definition, $r_1$, $r_2$ and $r_3$ are multiplicative as functions of $r$. So by (5.10),

\[
|S(r,a)| \le 3^{t}\, r_1^{1/2} r_2 r_3^2,
\]

where $t = \omega(r)$, the number of distinct primes dividing $r$. Since $3^t \ll r^{\varepsilon}$ for every $\varepsilon > 0$ (see Montgomery and Vaughan [2007], p.55, Theorem 2.10), the lemma follows.

The following two lemmas concern the function $T(r)$, which is defined as

\[
T(r) := \frac{1}{r^7}\sum_{\substack{c=1\\(c,r)=1}}^{r} |S(r,c)|^6. \tag{5.11}
\]

Lemma 5.7. We have $T(r) \ll r^{-2}$.

Proof. By Lemma 5.5, $|S(r,c)| \ll r^{2/3}$ if $(c,r) = 1$. So

\[
T(r) \ll \frac{1}{r^7}\sum_{\substack{c=1\\(c,r)=1}}^{r} (r^{2/3})^6 = \frac{\varphi(r)}{r^3} \le \frac{1}{r^2}.
\]

Lemma 5.8. $T(r)$ is a multiplicative function.

Proof. Obviously $T(1) = 1$, so it is sufficient to prove that if $(r_1, r_2) = 1$, then $T(r_1r_2) = T(r_1)T(r_2)$. By (5.11) and Lemma 5.4,

\[
T(r_1r_2) = \frac{1}{r_1^7 r_2^7}\sum_{\substack{c=1\\(c,r_1r_2)=1}}^{r_1r_2} |S(r_1, cr_2^2)|^6\, |S(r_2, cr_1^2)|^6.
\]

From elementary number theory, we know that if $1 \le c \le r_1r_2$, $(c, r_1r_2) = 1$ and $(r_1, r_2) = 1$, then there exists a unique pair of numbers $c_1$ and $c_2$, such that $c \equiv c_2r_1 + c_1r_2 \pmod{r_1r_2}$, $1 \le c_i \le r_i$ and $(c_i, r_i) = 1$ where $i = 1$ or $2$. In this situation, we have $S(r_1, cr_2^2) = S(r_1, c_1)$ and $S(r_2, cr_1^2) = S(r_2, c_2)$ by (5.1). So the identity above becomes

\[
T(r_1r_2) = \frac{1}{r_1^7 r_2^7}\sum_{\substack{c_1=1\\(c_1,r_1)=1}}^{r_1}\ \sum_{\substack{c_2=1\\(c_2,r_2)=1}}^{r_2} |S(r_1,c_1)|^6\, |S(r_2,c_2)|^6 = T(r_1)T(r_2),
\]

which completes the proof.
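The multiplicativity of $T$ can be observed numerically straight from definition (5.11). The sketch below is a brute-force evaluation and is only practical for small $r$; the names are illustrative.

```python
import cmath
from math import gcd


def cubic_sum(q, a):
    """S(q, a) = sum_{m=1}^{q} e(a m^3 / q)."""
    return sum(cmath.exp(2j * cmath.pi * ((a * m**3) % q) / q)
               for m in range(1, q + 1))


def T(r):
    """T(r) = r^{-7} * sum over 1 <= c <= r, (c, r) = 1, of |S(r, c)|^6, as in (5.11)."""
    total = sum(abs(cubic_sum(r, c)) ** 6
                for c in range(1, r + 1) if gcd(c, r) == 1)
    return total / r**7


if __name__ == "__main__":
    for r1, r2 in ((4, 7), (7, 9), (4, 13)):
        print(r1, r2, T(r1 * r2), T(r1) * T(r2))
```

The two printed values in each row should coincide up to rounding. Note that $T(2) = 0$ because $S(2,1) = 0$.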

To estimate $S_2 - S_3$, we need the following two lemmas.

Lemma 5.9. Suppose that $(q,a) = 1$. Then for every $\varepsilon > 0$,

\[
\sum_{x \le n^{1/3}} e\Big(\frac{ax^3}{q}\Big) = q^{-1}S(q,a)\,n^{1/3} + O(q^{1/2+\varepsilon}). \tag{5.12}
\]

Proof. See Vaughan [1997] p.43, Theorem 4.1. Let $\beta = 0$ (thus $\alpha = a/q$), $k = 3$ and use the refined version for $V(\alpha, q, a)$, i.e. replace $v(\beta)$ by $v_1(\beta)$; then the lemma follows.

Lemma 5.10. Suppose that $(q,a) = 1$. Then for every $\varepsilon > 0$,

\[
\sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big) = \Gamma\Big(\frac43\Big)^3 x\, q^{-3}S(q,a)^3 + O\big(x^{2/3} q^{1/2+\varepsilon}\big). \tag{5.13}
\]

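Proof. By (4.1),

\[
\begin{aligned}
\sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big)
&= \sum_{n \le x}\ \sum_{\substack{x_1,x_2,x_3\\ x_1^3+x_2^3+x_3^3=n}} e\Big(\frac{a(x_1^3+x_2^3+x_3^3)}{q}\Big)\\
&= \sum_{\substack{x_1,x_2\\ x_1^3+x_2^3\le x}} e\Big(\frac{a(x_1^3+x_2^3)}{q}\Big) \sum_{x_3 \le (x-x_1^3-x_2^3)^{1/3}} e\Big(\frac{ax_3^3}{q}\Big).
\end{aligned}
\]

Consider the innermost sum. By Lemma 5.9,

\[
\sum_{x_3 \le (x-x_1^3-x_2^3)^{1/3}} e\Big(\frac{ax_3^3}{q}\Big) = q^{-1}S(q,a)(x-x_1^3-x_2^3)^{1/3} + O(q^{1/2+\varepsilon}).
\]

So

\[
\sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big)
= q^{-1}S(q,a) \sum_{\substack{x_1,x_2\\ x_1^3+x_2^3\le x}} e\Big(\frac{a(x_1^3+x_2^3)}{q}\Big)(x-x_1^3-x_2^3)^{1/3}
+ O\Big(\sum_{\substack{x_1,x_2\\ x_1^3+x_2^3\le x}} q^{1/2+\varepsilon}\Big). \tag{5.14}
\]

The second term above is

\[
\ll q^{1/2+\varepsilon} \sum_{\substack{x_1,x_2\\ x_1^3+x_2^3\le x}} 1 \le q^{1/2+\varepsilon} x^{2/3}.
\]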

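For the first term, repeat this argument on the sum over $x_2$. The first term is

\[
\begin{aligned}
&= q^{-1}S(q,a) \sum_{x_1 \le x^{1/3}} e\Big(\frac{ax_1^3}{q}\Big) \sum_{x_2 \le (x-x_1^3)^{1/3}} e\Big(\frac{ax_2^3}{q}\Big)(x-x_1^3-x_2^3)^{1/3}\\
&= q^{-1}S(q,a) \sum_{x_1 \le x^{1/3}} e\Big(\frac{ax_1^3}{q}\Big) \sum_{x_2 \le (x-x_1^3)^{1/3}} e\Big(\frac{ax_2^3}{q}\Big) \int_{x_2}^{(x-x_1^3)^{1/3}} (x-x_1^3-y^3)^{-2/3} y^2\, dy\\
&= q^{-1}S(q,a) \sum_{x_1 \le x^{1/3}} e\Big(\frac{ax_1^3}{q}\Big) \int_{0}^{(x-x_1^3)^{1/3}} (x-x_1^3-y^3)^{-2/3} y^2 \Big(\sum_{x_2 \le y} e\Big(\frac{ax_2^3}{q}\Big)\Big) dy
\end{aligned}
\]

by changing the order of the sum and the integral. For the innermost sum in the parentheses, by Lemma 5.9,

\[
\sum_{x_2 \le y} e\Big(\frac{ax_2^3}{q}\Big) = q^{-1}S(q,a)\,y + O(q^{1/2+\varepsilon}).
\]

Hence the first term of (5.14) is

\[
\begin{aligned}
& q^{-2}S(q,a)^2 \sum_{x_1 \le x^{1/3}} e\Big(\frac{ax_1^3}{q}\Big) \int_{0}^{(x-x_1^3)^{1/3}} (x-x_1^3-y^3)^{-2/3} y^3\, dy\\
&\quad + O\Big(q^{-1}|S(q,a)| \sum_{x_1 \le x^{1/3}} \int_{0}^{(x-x_1^3)^{1/3}} (x-x_1^3-y^3)^{-2/3} y^2 q^{1/2+\varepsilon}\, dy\Big).
\end{aligned}
\tag{5.15}
\]

Since $S(q,a) \ll q$, the second term of (5.15) is

\[
\ll \sum_{x_1 \le x^{1/3}} \int_{0}^{(x-x_1^3)^{1/3}} (x-x_1^3-y^3)^{-2/3} y^2 q^{1/2+\varepsilon}\, dy
= \sum_{x_1 \le x^{1/3}} (x-x_1^3)^{1/3} q^{1/2+\varepsilon} \le q^{1/2+\varepsilon} x^{2/3}.
\]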

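Using integration by substitution, the integral in the first term of (5.15) is

\[
\frac13 (x-x_1^3)^{2/3}\, B\Big(\frac43, \frac13\Big),
\]

where $B(a,b)$ represents the beta function:

\[
B(a,b) = \int_0^1 x^{a-1}(1-x)^{b-1}\, dx, \qquad a > 0,\ b > 0, \tag{5.16}
\]

with the property that

\[
B(a,b) = \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)}. \tag{5.17}
\]

Now repeat the argument on the sum over $x_1$. The first term of (5.15) is

\[
\begin{aligned}
&= \frac13 B\Big(\frac43,\frac13\Big) q^{-2}S(q,a)^2 \sum_{x_1 \le x^{1/3}} e\Big(\frac{ax_1^3}{q}\Big)(x-x_1^3)^{2/3}\\
&= \frac23 B\Big(\frac43,\frac13\Big) q^{-2}S(q,a)^2 \sum_{x_1 \le x^{1/3}} e\Big(\frac{ax_1^3}{q}\Big) \int_{x_1}^{x^{1/3}} (x-y^3)^{-1/3} y^2\, dy\\
&= \frac23 B\Big(\frac43,\frac13\Big) q^{-2}S(q,a)^2 \int_{0}^{x^{1/3}} (x-y^3)^{-1/3} y^2 \Big(\sum_{x_1 \le y} e\Big(\frac{ax_1^3}{q}\Big)\Big) dy\\
&= \frac23 B\Big(\frac43,\frac13\Big) q^{-2}S(q,a)^2 \int_{0}^{x^{1/3}} (x-y^3)^{-1/3} y^2 \big(q^{-1}S(q,a)\,y + O(q^{1/2+\varepsilon})\big)\, dy,
\end{aligned}
\]

including a main term and an error term, and the error term is

\[
\ll 2\int_{0}^{x^{1/3}} (x-y^3)^{-1/3} y^2 q^{1/2+\varepsilon}\, dy = q^{1/2+\varepsilon} x^{2/3}.
\]

Therefore, by (5.14) and (5.15), we have

\[
\sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big) = \frac23 B\Big(\frac43,\frac13\Big) q^{-3}S(q,a)^3 \int_{0}^{x^{1/3}} (x-y^3)^{-1/3} y^3\, dy + O\big(x^{2/3} q^{1/2+\varepsilon}\big).
\]

Finally, after using integration by substitution, the integral above is

\[
\frac13\, x\, B\Big(\frac43,\frac23\Big),
\]

so the coefficient of the main term is

\[
\frac23 B\Big(\frac43,\frac13\Big) \cdot \frac13 B\Big(\frac43,\frac23\Big) = \Gamma\Big(\frac43\Big)^3
\]

by (5.17). The lemma follows.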
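The closing evaluation of the coefficient relies only on (5.17) and $\Gamma(5/3) = \tfrac23\Gamma(2/3)$. It can be confirmed numerically with the standard library; the helper `beta` below is defined through (5.17) and is an illustrative name.

```python
import math


def beta(a, b):
    """Beta function via property (5.17): B(a, b) = Gamma(a) Gamma(b) / Gamma(a + b)."""
    return math.gamma(a) * math.gamma(b) / math.gamma(a + b)


# Coefficient of the main term in Lemma 5.10, before simplification.
coefficient = (2 / 3) * beta(4 / 3, 1 / 3) * (1 / 3) * beta(4 / 3, 2 / 3)
# Claimed closed form.
target = math.gamma(4 / 3) ** 3

if __name__ == "__main__":
    print(coefficient, target)
```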

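Now define

\[
\nu(q,a) := \frac{S(q,a)^3}{q^3}, \tag{5.18}
\]

\[
J(\beta) := \Gamma\Big(\frac43\Big)^3 \sum_{n \le x} e(\beta n). \tag{5.19}
\]

We have $\nu(q,a) \ll 1$ since $S(q,a) \ll q$. More precisely, by Lemma 5.5,

\[
\nu(q,a) \ll \frac1q \quad\text{if } (q,a) = 1. \tag{5.20}
\]

There are two estimates for $J(\beta)$:

\[
J(\beta) \ll \sum_{n \le x} 1 \le x, \tag{5.21}
\]

and

\[
J(\beta) \ll \frac{1}{\|\beta\|} = \frac{1}{|\beta|} \quad\text{when } \beta \in [-1/2, 1/2] \setminus \{0\}. \tag{5.22}
\]

So if $\beta \in [-1/2, 1/2]$, then

\[
J(\beta) \ll \frac{x}{1 + x|\beta|}. \tag{5.23}
\]

By Lemma 2.1 and Lemma 2.2, we also have

\[
\int_{-1/2}^{1/2} |J(\beta)|^2\, d\beta = \Gamma\Big(\frac43\Big)^6 [x]. \tag{5.24}
\]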

The lemma below tells us that ν(q, a)J(β) is a good approximation of G(β + a/q).

Lemma 5.11. Suppose that $(q,a) = 1$ and $x \ge 1$. Then for every $\varepsilon > 0$,

\[
G\Big(\beta + \frac aq\Big) = \nu(q,a)J(\beta) + O\big(x^{2/3} q^{1/2+\varepsilon}(1 + x|\beta|)\big), \tag{5.25}
\]

where $G(\alpha)$, $\nu(q,a)$ and $J(\beta)$ are defined by (4.13), (5.18) and (5.19) respectively.

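Proof. First consider the case $\beta = 0$. By Lemma 5.10, the LHS of (5.25) is

\[
G\Big(\frac aq\Big) = \sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big) = \Gamma\Big(\frac43\Big)^3 x\, q^{-3}S(q,a)^3 + O\big(x^{2/3} q^{1/2+\varepsilon}\big),
\]

while the RHS is

\[
\nu(q,a)J(0) + O\big(x^{2/3} q^{1/2+\varepsilon}\big) = \Gamma\Big(\frac43\Big)^3 [x]\, q^{-3}S(q,a)^3 + O\big(x^{2/3} q^{1/2+\varepsilon}\big),
\]

and the difference between the main terms is

\[
\ll \Gamma\Big(\frac43\Big)^3 q^{-3}|S(q,a)|^3 \ll 1.
\]

Thus we obtain the lemma in the case $\beta = 0$. Now consider the case when $\beta$ is not necessarily $0$. The LHS of (5.25) is

\[
\begin{aligned}
G\Big(\beta + \frac aq\Big)
&= \sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big) e(\beta n)\\
&= \sum_{n \le x} r_3(n)\, e\Big(\frac{an}{q}\Big)\Big(e(\beta x) - \int_n^x 2\pi i\beta\, e(\beta t)\, dt\Big)\\
&= e(\beta x)\, G\Big(\frac aq\Big) - \int_0^x 2\pi i\beta\, e(\beta t) \sum_{n \le t} r_3(n)\, e\Big(\frac{an}{q}\Big) dt
\end{aligned}
\]

by inverting the order of summation and integration. Applying Lemma 5.10 again, the expression above is

\[
\begin{aligned}
&= \Gamma\Big(\frac43\Big)^3 x\, q^{-3}S(q,a)^3 e(\beta x) + O\big(x^{2/3} q^{1/2+\varepsilon}\big)\\
&\quad - \int_0^x 2\pi i\beta\, e(\beta t)\, \Gamma\Big(\frac43\Big)^3 t\, q^{-3}S(q,a)^3\, dt + O\Big(\int_0^x |\beta|\, t^{2/3} q^{1/2+\varepsilon}\, dt\Big)\\
&= \Gamma\Big(\frac43\Big)^3 q^{-3}S(q,a)^3 \Big(x\, e(\beta x) - \int_0^x 2\pi i\beta\, e(\beta t)\, t\, dt\Big) + O\big(x^{2/3} q^{1/2+\varepsilon}(1 + x|\beta|)\big).
\end{aligned}
\]

Integrating by parts gives

\[
\int_0^x 2\pi i\beta\, e(\beta t)\, t\, dt = x\, e(\beta x) - \int_0^x e(\beta t)\, dt,
\]

so

\[
\begin{aligned}
G\Big(\beta + \frac aq\Big)
&= \Gamma\Big(\frac43\Big)^3 q^{-3}S(q,a)^3 \int_0^x e(\beta t)\, dt + O\big(x^{2/3} q^{1/2+\varepsilon}(1 + x|\beta|)\big)\\
&= \nu(q,a)J(\beta) + \Gamma\Big(\frac43\Big)^3 \nu(q,a)\Big(\int_0^x e(\beta t)\, dt - \sum_{n \le x} e(\beta n)\Big) + O\big(x^{2/3} q^{1/2+\varepsilon}(1 + x|\beta|)\big).
\end{aligned}
\]

Finally, a similar argument applied to

\[
\int_0^x e(\beta t)\, dt - \sum_{n \le x} e(\beta n)
\]

gives

\[
\begin{aligned}
\sum_{n \le x} e(\beta n)
&= \sum_{n \le x}\Big(e(\beta x) - \int_n^x 2\pi i\beta\, e(\beta t)\, dt\Big)\\
&= e(\beta x) \sum_{n \le x} 1 - \int_0^x 2\pi i\beta\, e(\beta t)\Big(\sum_{n \le t} 1\Big) dt\\
&= x\, e(\beta x) - \int_0^x 2\pi i\beta\, e(\beta t)\, t\, dt + O(1 + x|\beta|)\\
&= \int_0^x e(\beta t)\, dt + O(1 + x|\beta|)
\end{aligned}
\]

after integrating by parts. Hence by (5.20), the lemma is proved.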

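Now define $\Delta(q,a,\beta)$ as follows:

\[
\Delta(q,a,\beta) := G\Big(\beta + \frac aq\Big) - \nu(q,a)J(\beta). \tag{5.26}
\]

Lemma 5.11 says that if $(q,a) = 1$ and $x \ge 1$, then for every $\varepsilon > 0$,

\[
\Delta(q,a,\beta) \ll x^{2/3} q^{1/2+\varepsilon}(1 + x|\beta|). \tag{5.27}
\]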

5.2 The Main Term S3

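By (4.12), Lemma 5.1 and (5.11), we have

\[
\begin{aligned}
\Gamma\Big(\frac43\Big)^{-6} x^{-2} S_3
&= \sum_{q \le Q} \frac1q \sum_{r \mid q} rT(r) \qquad (5.28)\\
&= \sum_{r \le Q} \sum_{l \le Q/r} \frac{1}{rl} \cdot rT(r)\\
&= \sum_{r \le Q} T(r)\Big(\log\frac Qr + \gamma + O\Big(\frac rQ\Big)\Big),
\end{aligned}
\]

where $\gamma$ is Euler's constant:

\[
\gamma = \lim_{n \to \infty}\Big(-\log n + \sum_{k=1}^{n} \frac1k\Big). \tag{5.29}
\]

So

\[
\Gamma\Big(\frac43\Big)^{-6} x^{-2} S_3 = (\log Q + \gamma) \sum_{r \le Q} T(r) - \sum_{r \le Q} T(r)\log r + O\Big(\frac1Q \sum_{r \le Q} rT(r)\Big).
\]

By Lemma 5.7, $T(r) \ll r^{-2}$, so the series $\sum_{r=1}^{\infty} T(r)$ and $\sum_{r=1}^{\infty} T(r)\log r$ are convergent, and

\[
\sum_{r > Q} T(r) \ll \sum_{r > Q} \frac{1}{r^2} \ll \frac1Q, \tag{5.30}
\]

\[
\sum_{r > Q} T(r)\log r \ll \sum_{r > Q} \frac{\log r}{r^2} \ll \frac{\log Q}{Q}, \tag{5.31}
\]

\[
\sum_{r \le Q} rT(r) \ll \sum_{r \le Q} \frac1r \ll \log Q. \tag{5.32}
\]

So

\[
\Gamma\Big(\frac43\Big)^{-6} x^{-2} S_3 = (\log Q + \gamma) \sum_{r=1}^{\infty} T(r) - \sum_{r=1}^{\infty} T(r)\log r + O\Big(\frac{\log Q}{Q}\Big).
\]

Now define the following constants:

\[
C_0 := \Gamma\Big(\frac43\Big)^6 \sum_{r=1}^{\infty} T(r), \tag{5.33}
\]

and

\[
C_1 := \Gamma\Big(\frac43\Big)^6 \sum_{r=1}^{\infty} T(r)\log r. \tag{5.34}
\]

Therefore,

\[
S_3 = C_0 x^2 \log Q + (\gamma C_0 - C_1)x^2 + O\Big(\frac{x^2 \log Q}{Q}\Big). \tag{5.35}
\]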

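5.3 The Error Term S2 − S3

We are concerned with

\[
S_2 - S_3 = \Gamma\Big(\frac43\Big)^3 x \sum_{q \le Q} \sum_{a=1}^{q} \frac{\rho(q,a)}{q^3}\Big(\sum_{\substack{n \le x\\ n \equiv a \,(\mathrm{mod}\ q)}} r_3(n) - \frac{\rho(q,a)}{q^3}\Gamma\Big(\frac43\Big)^3 x\Big). \tag{5.36}
\]

By the proof of (5.3), we have:

\[
\sum_{a=1}^{q} \frac{\rho(q,a)^2}{q^6}\, \Gamma\Big(\frac43\Big)^3 x = \frac{1}{q^7} \sum_{b=1}^{q} S(q,b)^3\, \overline{S(q,b)}^{\,3}\, \Gamma\Big(\frac43\Big)^3 x.
\]

By (5.2), we have:

\[
\begin{aligned}
\sum_{a=1}^{q} \frac{\rho(q,a)}{q^3} \sum_{\substack{n \le x\\ n \equiv a \,(\mathrm{mod}\ q)}} r_3(n)
&= \sum_{a=1}^{q} \frac{1}{q^4} \sum_{b=1}^{q} e\Big(-\frac{ba}{q}\Big) S(q,b)^3 \sum_{\substack{n \le x\\ n \equiv a \,(\mathrm{mod}\ q)}} r_3(n)\\
&= \frac{1}{q^4} \sum_{b=1}^{q} S(q,b)^3 \sum_{a=1}^{q} \sum_{\substack{n \le x\\ n \equiv a \,(\mathrm{mod}\ q)}} e\Big(-\frac{ba}{q}\Big) r_3(n)\\
&= \frac{1}{q^4} \sum_{b=1}^{q} S(q,b)^3 \sum_{n \le x} e\Big(-\frac{bn}{q}\Big) r_3(n).
\end{aligned}
\]

So

\[
\begin{aligned}
&\sum_{a=1}^{q} \frac{\rho(q,a)}{q^3}\Big(\sum_{\substack{n \le x\\ n \equiv a \,(\mathrm{mod}\ q)}} r_3(n) - \frac{\rho(q,a)}{q^3}\Gamma\Big(\frac43\Big)^3 x\Big)\\
&\qquad = \frac{1}{q^4} \sum_{b=1}^{q} S(q,b)^3 \Big(\sum_{n \le x} e\Big(-\frac{bn}{q}\Big) r_3(n) - \Gamma\Big(\frac43\Big)^3 x\, q^{-3} S(q,-b)^3\Big),
\end{aligned}
\]

since $\overline{S(q,b)} = S(q,-b)$. By taking out the common factor $(q,b)$, we can observe that the expression above is

\[
\begin{aligned}
&= \frac{1}{q^4} \sum_{r \mid q} \sum_{\substack{a=1\\(a,r)=1}}^{r} \frac{q^3}{r^3} S(r,a)^3 \Big(\sum_{n \le x} e\Big(-\frac{an}{r}\Big) r_3(n) - \Gamma\Big(\frac43\Big)^3 x\, q^{-3} q^3 r^{-3} S(r,-a)^3\Big)\\
&= \frac{1}{q} \sum_{r \mid q} \sum_{\substack{a=1\\(a,r)=1}}^{r} \frac{S(r,a)^3}{r^3} \Big(\sum_{n \le x} e\Big(-\frac{an}{r}\Big) r_3(n) - \Gamma\Big(\frac43\Big)^3 x\, r^{-3} S(r,-a)^3\Big),
\end{aligned}
\]

and note that a similar argument was used when proving Lemma 5.1. By Lemma 5.10, this is

\[
\ll \frac1q \sum_{r \mid q} \sum_{\substack{a=1\\(a,r)=1}}^{r} \frac{|S(r,a)|^3}{r^3}\, x^{2/3} r^{1/2+\varepsilon}.
\]

So we have

\[
\begin{aligned}
S_2 - S_3
&\ll x^{5/3} \sum_{q \le Q} \frac1q \sum_{r \mid q} \sum_{\substack{a=1\\(a,r)=1}}^{r} |S(r,a)|^3 r^{-5/2+\varepsilon}\\
&= x^{5/3} \sum_{r \le Q} \sum_{l \le Q/r} \frac1l \sum_{\substack{a=1\\(a,r)=1}}^{r} |S(r,a)|^3 r^{-7/2+\varepsilon}\\
&\ll x^{5/3} (\log Q) \sum_{r \le Q} r^{-7/2+\varepsilon} \sum_{\substack{a=1\\(a,r)=1}}^{r} |S(r,a)|^3
\end{aligned}
\]

by inserting the approximation above in (5.36). By Lemma 5.6, if $(r,a) = 1$, then for every $\varepsilon > 0$, $|S(r,a)| \ll r^{\varepsilon} r_1^{1/2} r_2 r_3^2$, where $r_1$, $r_2$ and $r_3$ are defined by (5.8). So

\[
\begin{aligned}
S_2 - S_3
&\ll x^{5/3} (\log Q) \sum_{r \le Q} r^{-7/2+\varepsilon} \sum_{\substack{a=1\\(a,r)=1}}^{r} r^{3\varepsilon} r_1^{3/2} r_2^3 r_3^6\\
&\le x^{5/3} (\log Q) \sum_{r \le Q} r^{-5/2+4\varepsilon} r_1^{3/2} r_2^3 r_3^6\\
&= x^{5/3} (\log Q) \sum_{\substack{r_1, r_2 \text{ squarefree and coprime}\\ r_1 r_2^2 r_3^3 \le Q}} r_1^{4\varepsilon-1} r_2^{8\varepsilon-2} r_3^{12\varepsilon-3/2}\\
&\le x^{5/3} (\log Q) \Big(\sum_{r_1 \le Q} r_1^{4\varepsilon-1}\Big)\Big(\sum_{r_2=1}^{\infty} r_2^{8\varepsilon-2}\Big)\Big(\sum_{r_3=1}^{\infty} r_3^{12\varepsilon-3/2}\Big).
\end{aligned}
\]

If $\varepsilon$ is small enough, then the two infinite series above are convergent. Also,

\[
\sum_{r_1 \le Q} r_1^{4\varepsilon-1} \le Q^{4\varepsilon} \sum_{r_1 \le Q} r_1^{-1} \ll Q^{4\varepsilon} \log Q \ll Q^{5\varepsilon}.
\]

Hence $S_2 - S_3 \ll x^{5/3} Q^{6\varepsilon}$. Finally, since $\varepsilon$ represents an arbitrary positive real number, so does $6\varepsilon$, and we can rewrite $6\varepsilon$ as $\varepsilon$. Therefore,

\[
S_2 - S_3 = O\big(x^{5/3} Q^{\varepsilon}\big). \tag{5.37}
\]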

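5.4 The Major Arcs

By (5.26), we have

\[
\begin{aligned}
\Big|G\Big(\beta + \frac aq\Big)\Big|^2
&= \big(\nu(q,a)J(\beta) + \Delta(q,a,\beta)\big)\big(\overline{\nu(q,a)} \cdot \overline{J(\beta)} + \overline{\Delta(q,a,\beta)}\big)\\
&= |\nu(q,a)|^2 |J(\beta)|^2 + \Delta_1 + \Delta_2,
\end{aligned}
\]

where

\[
\Delta_1 := 2\Re\big(\nu(q,a) \cdot J(\beta) \cdot \overline{\Delta(q,a,\beta)}\big)
\quad\text{and}\quad
\Delta_2 := |\Delta(q,a,\beta)|^2.
\]

So if we define

\[
S_5 := \sum_{q \le x/R} \int_{-1/2qR}^{1/2qR} F_q(\beta) |J(\beta)|^2 \sum_{\substack{a=1\\(a,q)=1}}^{q} |\nu(q,a)|^2\, d\beta, \tag{5.38}
\]

then from the definition of $S_4$ (see (3.13)),

\[
S_4 = S_5 + \Sigma_1 + \Sigma_2, \tag{5.39}
\]

where

\[
\Sigma_1 := \sum_{q \le x/R} \int_{-1/2qR}^{1/2qR} F_q(\beta) \sum_{\substack{a=1\\(a,q)=1}}^{q} \Delta_1\, d\beta,
\qquad
\Sigma_2 := \sum_{q \le x/R} \int_{-1/2qR}^{1/2qR} F_q(\beta) \sum_{\substack{a=1\\(a,q)=1}}^{q} \Delta_2\, d\beta.
\]

By (5.27) and the property of $F_q(\alpha)$ (see (3.12)), we have

\[
\begin{aligned}
\Sigma_2
&\ll \sum_{q \le x/R} \int_{-1/2qR}^{1/2qR} \frac{x \log(2\sqrt{x}/q)}{q + qx|\beta|} \sum_{\substack{a=1\\(a,q)=1}}^{q} x^{4/3} q^{1+2\varepsilon} (1 + x|\beta|)^2\, d\beta\\
&\ll \sum_{q \le x/R} \int_{-1/2qR}^{1/2qR} x^{7/3} (\log x)\, q^{1+2\varepsilon} (1 + x|\beta|)\, d\beta\\
&= x^{7/3} (\log x) \sum_{q \le x/R} q^{1+2\varepsilon} \cdot \frac{1}{2qR} \cdot \Big(2 + \frac{x}{2qR}\Big)\\
&\ll x^{10/3} (\log x) R^{-2} \sum_{q \le x/R} q^{-1+2\varepsilon}\\
&\ll x^{10/3+2\varepsilon} (\log x)^2 R^{-2},
\end{aligned}
\]

while

\[
\begin{aligned}
\Sigma_1
&\ll \sum_{q \le x/R} \int_{-1/2qR}^{1/2qR} \frac{x \log(2\sqrt{x}/q)}{q + qx|\beta|} \sum_{\substack{a=1\\(a,q)=1}}^{q} \frac1q \cdot \frac{x}{1 + x|\beta|} \cdot x^{2/3} q^{1/2+\varepsilon} (1 + x|\beta|)\, d\beta\\
&\ll \sum_{q \le x/R} x^{8/3} (\log x)\, q^{-1/2+\varepsilon} \int_{-1/2qR}^{1/2qR} \frac{d\beta}{1 + x|\beta|}\\
&= x^{8/3} (\log x) \sum_{q \le x/R} q^{-1/2+\varepsilon} \cdot \frac2x \log\Big(1 + \frac{x}{2qR}\Big)\\
&\ll x^{5/3} (\log x)^2 \sum_{q \le x/R} q^{-1/2+\varepsilon}\\
&\ll x^{13/6+\varepsilon} (\log x)^2 R^{-1/2}
\end{aligned}
\]

if we also apply the properties (5.20) and (5.23). Hence by (5.39),

\[
S_4 = S_5 + O\big(x^{10/3+2\varepsilon} (\log x)^2 R^{-2} + x^{13/6+\varepsilon} (\log x)^2 R^{-1/2}\big). \tag{5.40}
\]

To estimate $S_5$, note that by (5.18) and (5.11),

\[
\sum_{\substack{a=1\\(a,q)=1}}^{q} |\nu(q,a)|^2 = \frac{1}{q^6} \sum_{\substack{a=1\\(a,q)=1}}^{q} |S(q,a)|^6 = qT(q),
\]

so

\[
S_5 = \sum_{q \le x/R} qT(q) \int_{-1/2qR}^{1/2qR} F_q(\beta) |J(\beta)|^2\, d\beta.
\]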

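Define $S_{11/2}$ and $S_6$ as follows:

\[
S_{11/2} := \sum_{q \le x/R} qT(q) \int_{-1/2}^{1/2} F_q(\beta) |J(\beta)|^2\, d\beta
\]

and

\[
S_6 := \sum_{q \le \sqrt{x}} qT(q) \int_{-1/2}^{1/2} F_q(\beta) |J(\beta)|^2\, d\beta. \tag{5.41}
\]

By the property of $F_q(\alpha)$, we have the cruder estimate $F_q(\beta) \ll q^{-1} x \log x$. Thus by Lemma 5.7, (5.22) and (5.24),

\[
\begin{aligned}
S_{11/2} - S_5
&\le \sum_{q \le x/R} qT(q) \Big(\int_{-1/2}^{-1/2qR} + \int_{1/2qR}^{1/2}\Big) |F_q(\beta)| |J(\beta)|^2\, d\beta\\
&\ll x \log x \sum_{q \le x/R} T(q) \Big(\int_{-1/2}^{-1/2qR} + \int_{1/2qR}^{1/2}\Big) \frac{1}{|\beta|^2}\, d\beta\\
&= 2x \log x \sum_{q \le x/R} T(q)(2qR - 2)\\
&\ll Rx \log x \sum_{q \le x/R} q^{-1}\\
&\ll Rx (\log x)^2,
\end{aligned}
\]

and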

Z 1/2 X 2 S6 − S 11 ≤ qT (q) |Fq(β)||J(β)| dβ 2 √ −1/2 x/R

Hence

\[
S_5 = S_6 + O\big(Rx (\log x)^2\big). \tag{5.42}
\]

Therefore, by (4.14), (5.35), (5.37), (5.40) and (5.42),

\[
V(x,Q) = 2S_6 - C_0 x^2 \log Q + (C_1 - \gamma C_0)x^2 + Q \sum_{n \le x} r_3(n)^2 + U(x,Q), \tag{5.43}
\]

where

\[
U(x,Q) \ll R (\log x) \sum_{n \le x} r_3(n)^2 + \frac{x^{10/3+2\varepsilon} (\log x)^2}{R^2} + \frac{x^{13/6+\varepsilon} (\log x)^2}{R^{1/2}} + Rx (\log x)^2 + \frac{x^2 \log Q}{Q} + x^{5/3} Q^{\varepsilon}. \tag{5.44}
\]

5.5 The Optimal Choice for R

By (5.41), $S_6$ does not depend on $R$, so by (5.43), $R$ has no effect on the main terms of $V(x,Q)$ and is only contained in the error terms. Therefore, we can find the optimal choice for $R$ such that the sum of the error terms on the RHS of (5.44) has the minimum size; in other words, it has the minimum upper exponent of $x$, which equals the minimum exponent of $x$ if it exists. Recall that the upper exponent, the lower exponent and the exponent of $x$ in an expression $E(x)$ are defined by (4.16), (4.17) and (4.18) respectively. Now assume that the exponent of $x$ in $R$ exists and equals $B$. Since $\log x \ll x^{\varepsilon}$ for any $\varepsilon > 0$, and $\varepsilon$ does not enter the limit, the upper exponents of $x$ in the first four terms on the RHS of (5.44) are $B + A^+$, $10/3 - 2B$, $13/6 - B/2$ and $B + 1$ respectively, where $A^+$ is the upper exponent of (4.15) with the range given by (4.19). For the last two terms, we just need to consider $x^{5/3} Q^{\varepsilon}$, in which the exponent of $x$ is $5/3$, because it dominates the other one due to the assumption that $x^{1/2} \log x \le Q \le x$ (see (3.16)):

\[
\frac{x^2 \log Q}{Q} \le \frac{x^2 \log x}{x^{1/2} \log x} \ll x^{5/3} Q^{\varepsilon}.
\]

Hence the upper exponent of $x$ on the RHS of (5.44) is

\[
I(B) = \max\Big\{B + A^+,\ \frac{10}{3} - 2B,\ \frac{13}{6} - \frac B2,\ B + 1,\ \frac53\Big\}.
\]

If $A^+ \in [1, 7/6]$, then the function $I(B)$ attains its minimum value when

\[
B = \frac{10}{9} - \frac{A^+}{3}, \tag{5.45}
\]

and in this case,

\[
I\Big(\frac{10}{9} - \frac{A^+}{3}\Big) = \frac{10}{9} + \frac23 A^+. \tag{5.46}
\]

Note that

\[
\frac{13}{18} \le \frac{10}{9} - \frac{A^+}{3} \le \frac79
\quad\text{and}\quad
\frac{16}{9} \le I\Big(\frac{10}{9} - \frac{A^+}{3}\Big) \le \frac{17}{9},
\]

so if the exponent of $x$ in $R$ is (5.45), then $R$ still satisfies (3.9) if $x$ is large enough, and the RHS of (5.44) is strictly $\ll x^2$. Since the behavior of (4.15) remains unknown so far, we may use any expression for $R$ in which the exponent of $x$ is (5.45). There are two possible expressions:

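Possibility 1. We may use the exponential function. In other words, let

\[
R = x^{\frac{10}{9} - \frac{A^+}{3}}. \tag{5.47}
\]

By (4.18), for any $\varepsilon > 0$,

\[
\sum_{n \le x} r_3(n)^2 \ll x^{A^+ + \varepsilon},
\]

and note that $\log x \ll x^{\varepsilon}$. By calculation, all the terms on the RHS of (5.44) are

\[
\ll x^{\frac{10}{9} + \frac23 A^+ + 4\varepsilon}.
\]

Since $4\varepsilon$ represents an arbitrary positive number and $\varepsilon$ has the same meaning, we can rewrite $4\varepsilon$ as $\varepsilon$. Therefore, the error term satisfies

\[
U(x,Q) \ll x^{\frac{10}{9} + \frac23 A^+ + \varepsilon}. \tag{5.48}
\]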

Possibility 2. If the exponent of $x$ in (4.15) exists, then $A = A^+$, so we may replace $A^+$ by $A$ in (5.47) and (5.48). Moreover, we can avoid $A$ or $A^+$ if we replace $x^{A^+}$ (which is also $x^{A}$) by (4.15) instead. Thus

\[
R = x^{\frac{10}{9}} \Big(\sum_{n \le x} r_3(n)^2\Big)^{-\frac13}. \tag{5.49}
\]

Using an argument similar to the one in Possibility 1, we can prove that the error term satisfies

\[
U(x,Q) \ll x^{\frac{10}{9} + \varepsilon} \Big(\sum_{n \le x} r_3(n)^2\Big)^{\frac23}. \tag{5.50}
\]

Note that (5.50) shows the relationship between the error term $U(x,Q)$ and the second moment sum (4.15), while (5.48) does not show it directly. However, (5.48) is always true no matter how (4.15) behaves, while (5.50) may not hold if the exponent of $x$ in (4.15) does not exist. Hence at present we do not know which approximation is better, so both of them will be included in the conclusion. If the behavior of (4.15) becomes known in the future, then we can choose the better approximation between (5.48) and (5.50). We may also find other expressions for $R$ and the error term with the same exponent of $x$.
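The choice (5.45) and the value (5.46) can be double-checked with exact rational arithmetic. The sketch below evaluates $I(B)$ on a grid of values of $B$ and compares the grid minimum with $I(10/9 - A^+/3)$. The grid range $[0,2]$ and the sample values of $A^+$ are illustrative assumptions, not taken from the text.

```python
from fractions import Fraction as F


def upper_exponent(B, A_plus):
    """I(B): the largest exponent of x among the error terms on the RHS of (5.44)."""
    return max(B + A_plus,
               F(10, 3) - 2 * B,
               F(13, 6) - B / 2,
               B + 1,
               F(5, 3))


def optimal_B(A_plus):
    """The choice (5.45)."""
    return F(10, 9) - A_plus / 3


def grid_minimum(A_plus, steps=360):
    """Minimum of I(B) over B = 0, 1/steps, ..., 2."""
    return min(upper_exponent(F(j, steps), A_plus) for j in range(2 * steps + 1))


if __name__ == "__main__":
    for A_plus in (F(1), F(13, 12), F(7, 6)):
        B = optimal_B(A_plus)
        print(A_plus, B, upper_exponent(B, A_plus), grid_minimum(A_plus))
```

In each case the value at (5.45) equals $10/9 + \tfrac23 A^+$ and no grid point does better, which is consistent with the minimum arising where $B + A^+$ and $10/3 - 2B$ cross.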

5.6 The Main Term S6

To finalize the conclusion, we need to calculate the integral (5.41). By the definitions of $F_q(\alpha)$ and $J(\beta)$ (see (3.8) and (5.19)) as well as Lemma 2.1 and Lemma 2.2,

Z 1/2  6   2 4 X X X Fq(β)|J(β)| dβ = Γ + ([x] − lm), −1/2 3 √ √ l≤ x m≤x/l x

  X + X ([x] − lm) √ m≤x/l x

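\[
S_6 = \Gamma\Big(\frac43\Big)^6 \sum_{q \le \sqrt{x}} qT(q) \sum_{\substack{l \le \sqrt{x}\\ q \mid l}} K(x,l,Q) = \Gamma\Big(\frac43\Big)^6 \sum_{l \le \sqrt{x}} \Big(\sum_{q \mid l} qT(q)\Big) K(x,l,Q).
\]

Define

\[
h(l) := \sum_{q \mid l} qT(q), \tag{5.51}
\]

then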

6 4  X X  S6 =Γ h(l)K(x, l, Q) + h(l)K(x, l, Q) 3 √ l≤x/Q x/Q

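Actually, we are more interested in $2S_6$ than $S_6$, so if we define

\[
W(X) := \sum_{l \le X} \frac{h(l)}{l} (X - l)^2, \tag{5.52}
\]

then

\[
2S_6 = \Gamma\Big(\frac43\Big)^6 \Big(x^2 \sum_{l \le \sqrt{x}} \frac{h(l)}{l} + xW(\sqrt{x}) - Q^2 W\Big(\frac xQ\Big)\Big) + O\Big(x \sum_{l \le \sqrt{x}} h(l)\Big). \tag{5.53}
\]

By Lemma 5.7 and (5.51), the error term is

\[
\ll x \sum_{l \le \sqrt{x}} \sum_{q \mid l} qT(q) = x \sum_{q \le \sqrt{x}} qT(q) \sum_{\substack{l \le \sqrt{x}\\ q \mid l}} 1 \le x^{3/2} \sum_{q \le \sqrt{x}} T(q) \ll x^{3/2}.
\]

The first term on the RHS of (5.53) is

\[
\Gamma\Big(\frac43\Big)^6 x^2 \sum_{l \le \sqrt{x}} \frac{h(l)}{l} = \Gamma\Big(\frac43\Big)^6 x^2 \sum_{l \le \sqrt{x}} \frac1l \sum_{q \mid l} qT(q),
\]

which looks similar to the following expression for $S_3$ (see (5.28)):

\[
S_3 = \Gamma\Big(\frac43\Big)^6 x^2 \sum_{q \le Q} \frac1q \sum_{r \mid q} rT(r).
\]

Replacing $Q$, $q$ and $r$ in (5.35) by $\sqrt{x}$, $l$ and $q$ respectively gives

\[
\begin{aligned}
\Gamma\Big(\frac43\Big)^6 x^2 \sum_{l \le \sqrt{x}} \frac{h(l)}{l}
&= C_0 x^2 \log\sqrt{x} + (\gamma C_0 - C_1)x^2 + O\Big(\frac{x^2 \log\sqrt{x}}{\sqrt{x}}\Big)\\
&= \frac12 C_0 x^2 \log x + (\gamma C_0 - C_1)x^2 + O(x^{3/2} \log x),
\end{aligned}
\tag{5.54}
\]

where $\gamma$ is Euler's constant (see (5.29)), and the constants $C_0$ and $C_1$ are defined by (5.33) and (5.34). So by (5.53) and (5.54),

\[
2S_6 = \frac12 C_0 x^2 \log x + (\gamma C_0 - C_1)x^2 + \Gamma\Big(\frac43\Big)^6 \Big(xW(\sqrt{x}) - Q^2 W\Big(\frac xQ\Big)\Big) + O(x^{3/2} \log x). \tag{5.55}
\]

Hence to finalize the conclusion, we only need to calculate $W(X)$ for $X = \sqrt{x}$ and $X = x/Q$.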

5.7 The Calculation of W(X)

Now we calculate $W(X)$ when $1 \le X \le \sqrt{x}$. We start from the following lemma:

Lemma 5.12. Assume that $s = \sigma + it$ is a complex number, where $\sigma, t \in \mathbb{R}$ and $\sigma > -2$. Then the following results hold:

(1). If $p \ne 3$ is a prime number, then

\[
\sum_{k=0}^{\infty} \frac{T(p^k)}{p^{ks}} = \frac{1}{1 - p^{-(3s+6)}} \Big(1 + \frac{1}{p^{s+7}} \sum_{c=1}^{p-1} |S(p,c)|^6 + \frac{p-1}{p^{2s+7}} - \frac{1}{p^{3s+7}}\Big).
\]

(2).

\[
\sum_{k=0}^{\infty} \frac{T(3^k)}{3^{ks}} = \frac{1}{1 - 3^{-(3s+6)}} \Big(1 + \frac{1}{3^{2s+14}} \sum_{\substack{c=1\\(c,3)=1}}^{9} |S(9,c)|^6 - \frac{1}{3^{3s+7}}\Big).
\]

Proof. By the proof of Lemma 5.6, if $p$ is prime, $(p,a) = 1$ and $u \in \mathbb{Z}$, $u \ge 0$, then

\[
S(p^{3u+3},a) = p^{2u+2},
\qquad
S(p^{3u+2},a) =
\begin{cases}
p^{2u+1} & \text{when } p \ne 3,\\
3^{2u} S(9,a) & \text{when } p = 3,
\end{cases}
\qquad
S(p^{3u+1},a) = p^{2u} S(p,a).
\]

So by (5.11), we have

\[
T(p^{3u+3}) = \frac{p-1}{p^{6u+7}},
\qquad
T(p^{3u+2}) =
\begin{cases}
\dfrac{p-1}{p^{6u+7}} & \text{when } p \ne 3,\\[2ex]
\dfrac{1}{3^{6u+14}} \displaystyle\sum_{\substack{c=1\\(c,3)=1}}^{9} |S(9,c)|^6 & \text{when } p = 3,
\end{cases}
\]

and

\[
T(p^{3u+1}) = \frac{1}{p^{6u+7}} \sum_{c=1}^{p-1} |S(p,c)|^6,
\]

in particular $T(3^{3u+1}) = 0$ since $S(3,1) = S(3,2) = 0$ by Corollary 5.1. Note that the identity

\[
\sum_{k=0}^{\infty} \frac{T(p^k)}{p^{ks}} = 1 + \sum_{u=0}^{\infty} \frac{T(p^{3u+1})}{p^{(3u+1)s}} + \sum_{u=0}^{\infty} \frac{T(p^{3u+2})}{p^{(3u+2)s}} + \sum_{u=0}^{\infty} \frac{T(p^{3u+3})}{p^{(3u+3)s}} \tag{5.56}
\]

holds when the series on the RHS are convergent. First assume that $p \ne 3$. Then from the discussion above, the RHS of (5.56) becomes

\[
1 + \Big(\frac{1}{p^{s+7}} \sum_{c=1}^{p-1} |S(p,c)|^6 + \frac{p-1}{p^{2s+7}} + \frac{p-1}{p^{3s+7}}\Big) \sum_{u=0}^{\infty} \frac{1}{p^{(3s+6)u}}.
\]

If $\sigma > -2$, then the series above is absolutely convergent since $|p^{3s+6}| = p^{3\sigma+6} > 1$, and it converges to $(1 - p^{-(3s+6)})^{-1}$. Hence the first result is proved. Now assume that $p = 3$. Then the RHS of (5.56) becomes

\[
1 + \Big(\frac{1}{3^{2s+14}} \sum_{\substack{c=1\\(c,3)=1}}^{9} |S(9,c)|^6 + \frac{2}{3^{3s+7}}\Big) \sum_{u=0}^{\infty} \frac{1}{3^{(3s+6)u}}.
\]

Similarly, the series above converges to $(1 - 3^{-(3s+6)})^{-1}$ when $\sigma > -2$, thus the second result is also proved.
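Part (1) of Lemma 5.12 can be tested numerically by comparing a truncated sum of $T(p^k)p^{-ks}$, computed by brute force from (5.11), with the closed form. The following is a minimal sketch for real $s$ and $p \ne 3$. The truncation points are chosen only so that the computation stays small, and all names are illustrative.

```python
import cmath


def T_prime_power(p, k):
    """T(p^k) from definition (5.11), by brute force over c coprime to p."""
    q = p**k
    roots = [cmath.exp(2j * cmath.pi * j / q) for j in range(q)]
    cubes = [pow(m, 3, q) for m in range(q)]  # m runs over a full residue system
    total = 0.0
    for c in range(1, q + 1):
        if c % p == 0:
            continue
        total += abs(sum(roots[(c * m3) % q] for m3 in cubes)) ** 6
    return total / q**7


def truncated_series(p, s, K):
    """Partial sum of T(p^k) / p^{ks} for 0 <= k <= K."""
    return sum(T_prime_power(p, k) / p ** (k * s) for k in range(K + 1))


def closed_form(p, s):
    """Right-hand side of Lemma 5.12 (1); valid only for primes p != 3."""
    sixth_moment = T_prime_power(p, 1) * p**7  # sum over c of |S(p, c)|^6
    bracket = (1 + sixth_moment / p ** (s + 7)
               + (p - 1) / p ** (2 * s + 7) - 1 / p ** (3 * s + 7))
    return bracket / (1 - p ** (-(3 * s + 6)))


if __name__ == "__main__":
    print(truncated_series(2, 0, 6), closed_form(2, 0))
    print(truncated_series(7, 0, 3), closed_form(7, 0))
```

For $p = 2$ and $s = 0$ the closed form equals $64/63$, and the partial sum up to $k = 6$ already agrees with it to about five decimal places.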

Now define

\[
D(s) := \sum_{l=1}^{\infty} \frac{h(l)}{l} \cdot \frac{1}{l^s} = \sum_{l=1}^{\infty} \frac{h(l)}{l^{s+1}} \tag{5.57}
\]

if $\sigma > 0$. Then by (5.51), Lemma 5.8 and Lemma 5.12, we have

\[
\begin{aligned}
D(s)
&= \sum_{l=1}^{\infty} \frac{1}{l^{s+1}} \sum_{q \mid l} qT(q)\\
&= \sum_{q=1}^{\infty} \sum_{m=1}^{\infty} \frac{qT(q)}{q^{s+1} m^{s+1}}\\
&= \zeta(s+1) \sum_{q=1}^{\infty} \frac{T(q)}{q^s} \qquad (5.58)\\
&= \zeta(s+1) \prod_{p} \Big(\sum_{k=0}^{\infty} \frac{T(p^k)}{p^{ks}}\Big)\\
&= \zeta(s+1)\,\zeta(3s+6)\,D_0(s), \qquad (5.59)
\end{aligned}
\]

where $D_0(s)$ is defined as

\[
\prod_{p \ne 3} \Big(1 + \frac{1}{p^{s+7}} \sum_{c=1}^{p-1} |S(p,c)|^6 + \frac{p-1}{p^{2s+7}} - \frac{1}{p^{3s+7}}\Big) \cdot \Big(1 + \frac{1}{3^{2s+14}} \sum_{\substack{c=1\\(c,3)=1}}^{9} |S(9,c)|^6 - \frac{1}{3^{3s+7}}\Big). \tag{5.60}
\]

By definition (5.1), $|S(9,c)| \le 9$ when $c = 1, 2, 4, 5, 7, 8$. By Corollary 5.1, $|S(p,c)| \le 2p^{1/2}$ when $p$ is a prime and $1 \le c \le p-1$. So

\[
|D_0(s)| \le \prod_{p \ne 3} \Big(1 + \frac{64}{p^{\sigma+3}} + \frac{1}{p^{2\sigma+6}} + \frac{1}{p^{3\sigma+7}}\Big) \cdot \Big(1 + \frac{2}{3^{2\sigma+1}} + \frac{1}{3^{3\sigma+7}}\Big).
\]

The product on the RHS of the inequality converges locally uniformly when $\sigma > -2$, so $D_0(s)$ converges absolutely and locally uniformly for $\Re s > -2$. Note that Lemma 5.12 also holds when $\Re s > -2$, so (5.59) affords an analytic continuation of $D(s)$ to that open half plane. Thus from the properties of the Riemann zeta function, $D(s)$ is meromorphic in $\Re s > -2$ with simple poles at $s = 0$ (from $s + 1 = 1$) and $s = -5/3$ (from $3s + 6 = 1$). Moreover, we have the following identity if we compare (5.58) with (5.59) without considering the analytic continuation:

\[
\sum_{q=1}^{\infty} \frac{T(q)}{q^s} = \zeta(3s+6)\,D_0(s). \tag{5.61}
\]

The identity is correct when the series converges, for example, when $s$ satisfies both $\Re(3s+6) > 1$ and $\Re s > -2$. Therefore, we have:

Lemma 5.13. The series $\sum_{q=1}^{\infty} qT(q)$ and $\sum_{q=1}^{\infty} T(q)$ converge to $\zeta(3)D_0(-1)$ and $\zeta(6)D_0(0)$ respectively.

Proof. Both $s = -1$ and $s = 0$ satisfy $\Re(3s+6) > 1$ and $\Re s > -2$, so substitute $s = -1, 0$ into (5.61). The lemma is then proved.

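Now consider the relationship between $W(X)$ and $D(s)$. We need the following lemma on Cesàro weights:

Lemma 5.14. If $\{b_n\}$ is a real sequence and $x > 0$, then

\[
\frac12 \sum_{n \le x} b_n (x - n)^2 = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} \alpha(s) \cdot \frac{x^{s+2}}{s(s+1)(s+2)}\, ds \tag{5.62}
\]

when $\sigma_0 > \max\{0, \sigma_c\}$, where

\[
\alpha(s) = \sum_{n=1}^{\infty} b_n n^{-s} \tag{5.63}
\]

and $\sigma_c$ is the abscissa of convergence of $\alpha(s)$ with the property that $\alpha(s)$ converges for all $s$ with $\Re s > \sigma_c$ and for no $s$ with $\Re s < \sigma_c$.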

Let $b_n = h(n)/n$ in Lemma 5.14. By (5.52) and (5.57), (5.62) becomes

\[
\frac12 W(X) = \frac{1}{2\pi i} \int_{\sigma_0 - i\infty}^{\sigma_0 + i\infty} D(s) \cdot \frac{X^{s+2}}{s(s+1)(s+2)}\, ds. \tag{5.64}
\]

Note that $D(s)$ always converges when $\Re s > 0$, so $D(s)$ has an abscissa of convergence $\sigma_c \le 0$. Therefore, (5.64) is true if $\sigma_0 > \max\{0, \sigma_c\} = 0$. We need to calculate the integral on the RHS of (5.64). Note that the integrand is holomorphic in the open half plane $\Re s > -2$ except for simple poles at both $s = -1$ and $s = -5/3$, and a double pole at $s = 0$. Let $0 < \delta < 1/3$ and let $C(t_1, t_2)$ be the counterclockwise rectangular contour with vertices $\sigma_0 - it_1$, $\sigma_0 + it_2$, $-2 + \delta + it_2$ and $-2 + \delta - it_1$ where $t_1, t_2 > 1$. Then by the residue formula, we have:

\[
\frac{1}{2\pi i} \int_{C(t_1,t_2)} D(s) \cdot \frac{X^{s+2}}{s(s+1)(s+2)}\, ds = \operatorname{Res}(f, 0) + \operatorname{Res}(f, -1) + \operatorname{Res}\Big(f, -\frac53\Big), \tag{5.65}
\]

where $f$ represents the integrand. The residues are:

\[
\operatorname{Res}(f, 0) = \lim_{s \to 0} \frac{d}{ds}\Big(\frac{s^2 D(s) X^{s+2}}{s(s+1)(s+2)}\Big) = D_1 X^2 \log X + D_2 X^2,
\]

\[
\operatorname{Res}(f, -1) = \lim_{s \to -1} \frac{(s+1) D(s) X^{s+2}}{s(s+1)(s+2)} = D_3 X,
\]

\[
\operatorname{Res}\Big(f, -\frac53\Big) = \lim_{s \to -5/3} \frac{(s + 5/3) D(s) X^{s+2}}{s(s+1)(s+2)} = D_4 X^{1/3},
\]

where $D_1$, $D_2$, $D_3$ and $D_4$ are constants, which can be calculated from (5.59), Lemma 5.13 and the fact that $\lim_{s \to 0} s\zeta(s+1) = 1$ (see Montgomery and Vaughan [2007], p.26, Corollary 1.16):

\[
D_1 = \lim_{s \to 0} \frac{s D(s)}{(s+1)(s+2)} = \frac12 \zeta(6) D_0(0) \lim_{s \to 0} s\zeta(s+1) = \frac12 \sum_{q=1}^{\infty} T(q),
\]

\[
D_3 = \lim_{s \to -1} \frac{D(s)}{s(s+2)} = -D(-1) = -\zeta(0)\zeta(3)D_0(-1) = \frac12 \sum_{q=1}^{\infty} qT(q),
\]

and

\[
D_4 = \lim_{s \to -5/3} \frac{(s + 5/3) D(s)}{s(s+1)(s+2)} = \frac{9}{10}\, \zeta\Big(-\frac23\Big) D_0\Big(-\frac53\Big).
\]

Hence the RHS of (5.65) is

\[
\Big(\frac12 \sum_{q=1}^{\infty} T(q)\Big) X^2 \log X + D_2 X^2 + \Big(\frac12 \sum_{q=1}^{\infty} qT(q)\Big) X + \frac{9}{10}\, \zeta\Big(-\frac23\Big) D_0\Big(-\frac53\Big) X^{1/3}. \tag{5.66}
\]

Note that the value of $D_2$ is not essential since the related terms will be cancelled at the end. Now consider the LHS of (5.65). First, by (5.59), rewrite the integrand as

\[
\frac{\zeta(s+1)\,\zeta(3s+6)\,D_0(s)\,X^{s+2}}{s(s+1)(s+2)}, \tag{5.67}
\]

and D0(s) is uniformly bounded when

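Lemma 5.15. If $w$ is a complex number, then for any $\varepsilon > 0$,

\[
\zeta(w) - \frac{1}{w - 1} \ll \tau^{\lambda(u) + \varepsilon}, \tag{5.68}
\]

where $u = \Re w$, $v = \Im w$, $\tau = 4 + |v|$ and

\[
\lambda(u) =
\begin{cases}
0 & \text{when } u > 1,\\[0.5ex]
\dfrac12 - \dfrac12 u & \text{when } 0 < u \le 1,\\[1.5ex]
\dfrac12 - u & \text{when } u \le 0.
\end{cases}
\tag{5.69}
\]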

Proof. See Titchmarsh [1986], p.95-96, including formulas (5.1.2), (5.1.3) and (5.1.4), and in case that w → 1 or v → 0 (so that ζ(w) → ∞ or |v|λ(u)+ε → 0), replace ζ(w) by ζ(w) − 1/(w − 1) and replace |v| by τ = 4 + |v|.

Consider the integral along the line segment $\sigma_0 + it_2 \to -2 + \delta + it_2$, which is a side of the contour $C(t_1, t_2)$. If $s$ lies on the line segment, then $\Re(s+1) \ge -1 + \delta > -1$ and $\Re(3s+6) \ge 3\delta > 0$. Also, both $|(s+1) - 1|^{-1}$ and $|(3s+6) - 1|^{-1}$ are bounded by $1$ since $\Im s = t_2 > 1$. So by Lemma 5.15,

λ(

Therefore, the integrand (5.67) is

\[
\ll t_2^{-1+2\varepsilon} X^{\sigma_0+2}
\]

since $t_2 > 1$. Hence for fixed $X$ and $\sigma_0$, the integral along this line segment tends to $0$ when $t_2 \to \infty$ and $\varepsilon$ is small enough, since the line segment is of finite length. Similarly, the integral along the line segment $-2 + \delta - it_1 \to \sigma_0 - it_1$ also tends to $0$ when $t_1 \to \infty$.

Now consider the integral along the line segment $-2 + \delta + it_2 \to -2 + \delta - it_1$. By substitution, the integral (with coefficient) is

\[
-\frac{1}{2\pi} X^{\delta} \int_{-t_1}^{t_2} \frac{\zeta(-1+\delta+it)\,\zeta(3\delta+3it)\,D_0(-2+\delta+it)\,X^{it}}{(-2+\delta+it)(-1+\delta+it)(\delta+it)}\, dt = -\frac{1}{2\pi} X^{\delta} (I_1 + I_2 + I_3), \tag{5.70}
\]

where $I_1$, $I_2$ and $I_3$ are defined as the integrals with the same integrand as the LHS with different intervals of integration, namely $[-t_1, -1]$, $[-1, 1]$ and $[1, t_2]$ respectively. Note that $0 < \delta < 1/3$, so $-1 < -1 + \delta < -2/3$ and $0 < 3\delta < 1$. Therefore, if $t = \Im s$ satisfies $|t| \ge 1$, then by Lemma 5.15,

\[
\zeta(-1+\delta+it) \ll (4 + |t|)^{\lambda(-1+\delta)+\varepsilon} = (4 + |t|)^{3/2-\delta+\varepsilon}
\]

and

\[
\zeta(3\delta+3it) \ll (4 + 3|t|)^{\lambda(3\delta)+\varepsilon} = (4 + 3|t|)^{1/2-3\delta/2+\varepsilon}.
\]

We also have that $D_0(-2+\delta+it)$ is bounded and $|X^{it}| = 1$, so the integrand on the LHS of (5.70) is

\[
\ll |t|^{3/2-\delta+\varepsilon+1/2-3\delta/2+\varepsilon-3} = |t|^{-1-5\delta/2+2\varepsilon}
\]

when $|t| \ge 1$. Moreover, let $\varepsilon = \delta$; then the integrand is

\[
\ll |t|^{-1-\delta/2}.
\]

So for a fixed $\delta$, $I_1 \ll 1 - t_1^{-\delta/2}$ and $I_3 \ll 1 - t_2^{-\delta/2}$. If $|t| \le 1$, then the integrand on the LHS of (5.70) is bounded, so $I_2 \ll 1$. Hence the integral along the line segment $-2 + \delta + it_2 \to -2 + \delta - it_1$ tends to $O(X^{\delta})$ when $t_1 \to \infty$ and $t_2 \to \infty$.

Finally, from (5.64) we know that the integral along the line segment $\sigma_0 - it_1 \to \sigma_0 + it_2$ (with coefficient) tends to $W(X)/2$ when $t_1 \to \infty$ and $t_2 \to \infty$. So letting $t_1, t_2 \to \infty$ on both sides of (5.65), and noting that the RHS of (5.65) is (5.66), we have

\[
W(X) = \Big(\sum_{q=1}^{\infty} T(q)\Big) X^2 \log X + 2D_2 X^2 + \Big(\sum_{q=1}^{\infty} qT(q)\Big) X + \frac95\, \zeta\Big(-\frac23\Big) D_0\Big(-\frac53\Big) X^{1/3} + O(X^{\delta}). \tag{5.71}
\]

5.8 Conclusion: Theorem 5.1

To get the conclusion, substitute $X = \sqrt{x}$ and $X = x/Q$ into (5.71). Note that for $W(\sqrt{x})$, we can ignore all the other terms except for the first two terms:

\[
W(\sqrt{x}) = \frac12 \Big(\sum_{q=1}^{\infty} T(q)\Big) x \log x + 2D_2 x + O(\sqrt{x}),
\]

since the other terms, including $\sqrt{x}$, $x^{1/6}$ and $O(x^{\delta/2})$, are all $\ll \sqrt{x}$. Hence

\[
xW(\sqrt{x}) = \frac12 \Big(\sum_{q=1}^{\infty} T(q)\Big) x^2 \log x + 2D_2 x^2 + O(x^{3/2}). \tag{5.72}
\]

For $W(x/Q)$, on the other hand, we need to consider all terms in (5.71). Then by (5.55) and (5.33),

\[
2S_6 = C_0 x^2 \log Q + (\gamma C_0 - C_1)x^2 - A_1 Qx + A_2 Q^{5/3} x^{1/3} + O(x^{3/2} \log x) + O(Q^{2-\delta} x^{\delta}), \tag{5.73}
\]

where

\[
A_1 = \Gamma\Big(\frac43\Big)^6 \sum_{q=1}^{\infty} qT(q), \tag{5.74}
\]

and

\[
A_2 = -\frac95\, \Gamma\Big(\frac43\Big)^6 \zeta\Big(-\frac23\Big) D_0\Big(-\frac53\Big). \tag{5.75}
\]

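Note that both $A_1$ and $A_2$ are positive constants. Finally, by equality (5.43), the two possibilities for the error terms (5.48) and (5.50), as well as definitions (5.1), (5.11) and (5.60) which are related to the constants $A_1$ and $A_2$, we have the final conclusion:

Theorem 5.1. Assume that $\varepsilon$ and $\delta$ are positive numbers satisfying $0 < \delta < \frac13$ and let

\[
U_0(x,Q) = V(x,Q) - Q \sum_{n \le x} r_3(n)^2 + A_1 Qx - A_2 Q^{5/3} x^{1/3}
\]

where

\[
A_1 = \Gamma\Big(\frac43\Big)^6 \sum_{q=1}^{\infty} \frac{1}{q^6} \sum_{\substack{c=1\\(c,q)=1}}^{q} |S(q,c)|^6,
\qquad
A_2 = -\frac95\, \Gamma\Big(\frac43\Big)^6 \zeta\Big(-\frac23\Big) D_0\Big(-\frac53\Big),
\]

\[
D_0(s) = \prod_{p \ne 3} \Big(1 + \frac{1}{p^{s+7}} \sum_{c=1}^{p-1} |S(p,c)|^6 + \frac{p-1}{p^{2s+7}} - \frac{1}{p^{3s+7}}\Big) \cdot \Big(1 + \frac{1}{3^{2s+14}} \sum_{\substack{c=1\\(c,3)=1}}^{9} |S(9,c)|^6 - \frac{1}{3^{3s+7}}\Big),
\]

and

\[
S(q,c) = \sum_{m=1}^{q} e\Big(\frac{cm^3}{q}\Big).
\]

Then when $x^{1/2} \log x \le Q \le x$ one has

\[
U_0(x,Q) \ll x^{\frac{10}{9} + \frac23 A^+ + \varepsilon} + Q^{2-\delta} x^{\delta}
\]

where $A^+$ is the upper exponent of $x$ in the representation of $\sum_{n \le x} r_3(n)^2$, or

\[
U_0(x,Q) \ll x^{\frac{10}{9} + \varepsilon} \Big(\sum_{n \le x} r_3(n)^2\Big)^{\frac23} + Q^{2-\delta} x^{\delta}
\]

if the exponent of $x$ in the representation of $\sum_{n \le x} r_3(n)^2$ exists.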

Chapter 6 Several Notes

In this chapter, we will discuss several notes on the problem. §6.1 still focuses on the example where an = r3(n), and shows more precise results for large Q, for example, when Q = x. §6.2 lists some similar questions, which can be pretty useful in the study of Waring’s problem (see §1.1).

6.1 Results for Large Q

Theorem 5.1 gives us a general result for the problem. However, if $Q$ is very close to $x$, say $Q \asymp x$, where $f(x) \asymp g(x)$ means that $f(x)$ and $g(x)$ have the same order of magnitude, i.e. both $f(x) \ll g(x)$ and $g(x) \ll f(x)$, then

\[
Qx \asymp Q^{5/3} x^{1/3} \asymp Q^{2-\delta} x^{\delta} \asymp x^2.
\]

In other words, the exponent of $x$ in $Q^{2-\delta} x^{\delta}$ is the same as that in the main terms $-A_1 Qx + A_2 Q^{5/3} x^{1/3}$, which is $2$. Therefore, we need to improve the error term $O(Q^{2-\delta} x^{\delta})$ and find a better result. Note that the error term $O(Q^{2-\delta} x^{\delta})$ comes from $Q^2 W(x/Q)$, so to avoid this, instead of applying (5.71) to approximate $Q^2 W(x/Q)$, we can calculate the exact value directly. Assume that $x/(k+1) < Q \le x/k$, where $k \ge 1$ is an integer. Then by (5.52),

\[
Q^2 W\Big(\frac xQ\Big) = Q^2 \sum_{l \le x/Q} \frac{h(l)}{l} \Big(\frac xQ - l\Big)^2 = \sum_{l \le k} \frac{h(l)}{l} (x - lQ)^2. \tag{6.1}
\]

For $xW(\sqrt{x})$, we may still apply (5.72). However, in this case the term $2D_2 x^2$ cannot be cancelled, so we need to calculate $D_2$, which is not easy if we calculate the corresponding residue as mentioned in §5.7. The following lemma shows a better way to investigate $W(\sqrt{x})$:

Lemma 6.1. If $Y \ge 1$, then

\[
\sum_{m \le Y} \frac1m (Y - m)^2 = Y^2 \log Y + C_2 Y^2 + Y + O(1), \tag{6.2}
\]

where

\[
C_2 = -\frac{11}{12} - 2\int_1^{\infty} \frac{B_2(u)}{u^3}\, du \tag{6.3}
\]

and

\[
B_2(u) = \frac12 (u - [u])^2 - \frac12 (u - [u]) + \frac{1}{12}. \tag{6.4}
\]

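Proof. See Vaughan [1998b], p.805-806, where $C_2$ is defined as in formula (6.12) and the result is stated as in formula (6.11).

By (5.52), (5.51), Lemma 6.1 (let $Y = \sqrt{x}/q$) and Lemma 5.7, we have

\[
\begin{aligned}
W(\sqrt{x})
&= \sum_{l \le \sqrt{x}} \frac1l (\sqrt{x} - l)^2 \sum_{q \mid l} qT(q)\\
&= \sum_{q \le \sqrt{x}} \sum_{r \le \sqrt{x}/q} \frac{1}{qr} (\sqrt{x} - qr)^2\, qT(q)\\
&= \sum_{q \le \sqrt{x}} q^2 T(q) \sum_{r \le \sqrt{x}/q} \frac1r \Big(\frac{\sqrt{x}}{q} - r\Big)^2\\
&= \sum_{q \le \sqrt{x}} q^2 T(q) \Big(\frac{x}{q^2} \log\frac{\sqrt{x}}{q} + \frac{C_2 x}{q^2} + \frac{\sqrt{x}}{q} + O(1)\Big)\\
&= \Big(\frac12 x \log x + C_2 x\Big) \sum_{q \le \sqrt{x}} T(q) - x \sum_{q \le \sqrt{x}} T(q) \log q + \sqrt{x} \sum_{q \le \sqrt{x}} qT(q) + O(\sqrt{x}).
\end{aligned}
\]

Replacing $r$ and $Q$ by $q$ and $\sqrt{x}$ respectively in (5.30), (5.31) and (5.32), we have

\[
\sum_{q > \sqrt{x}} T(q) \ll \frac{1}{\sqrt{x}},
\qquad
\sum_{q > \sqrt{x}} T(q) \log q \ll \frac{\log x}{\sqrt{x}}
\qquad\text{and}\qquad
\sum_{q \le \sqrt{x}} qT(q) \ll \log x,
\]

so

\[
W(\sqrt{x}) = \Big(\frac12 x \log x + C_2 x\Big) \sum_{q=1}^{\infty} T(q) - x \sum_{q=1}^{\infty} T(q) \log q + O(\sqrt{x} \log x).
\]

Hence

\[
xW(\sqrt{x}) = \frac12 \Big(\sum_{q=1}^{\infty} T(q)\Big) x^2 \log x + \Big(C_2 \sum_{q=1}^{\infty} T(q) - \sum_{q=1}^{\infty} T(q) \log q\Big) x^2 + O(x^{3/2} \log x). \tag{6.5}
\]

Comparing (6.5) with (5.72), we obtain

\[
D_2 = \frac12 \Big(C_2 \sum_{q=1}^{\infty} T(q) - \sum_{q=1}^{\infty} T(q) \log q\Big). \tag{6.6}
\]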
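Lemma 6.1, on which (6.5) and (6.6) rest, lends itself to a quick numerical check. The sketch below uses the closed form $C_2 = \gamma - \tfrac32$; this identification is not stated in the text and is an assumption of the example, obtained by comparing (6.3) with the Euler–Maclaurin expression $\gamma = \tfrac{7}{12} - 2\int_1^\infty B_2(u)u^{-3}\,du$. The function names are illustrative.

```python
import math

EULER_GAMMA = 0.5772156649015329
# Assumed closed form of the constant in (6.3): C2 = gamma - 3/2.
C2 = EULER_GAMMA - 1.5


def cesaro_sum(Y):
    """Left-hand side of (6.2): sum over m <= Y of (Y - m)^2 / m."""
    return sum((Y - m) ** 2 / m for m in range(1, math.floor(Y) + 1))


def remainder(Y):
    """Difference between the LHS of (6.2) and its three main terms."""
    return cesaro_sum(Y) - (Y * Y * math.log(Y) + C2 * Y * Y + Y)


if __name__ == "__main__":
    for Y in (1, 1.5, 10, 37.25, 1000):
        print(Y, remainder(Y))
```

The remainder stays bounded as $Y$ grows, in agreement with the $O(1)$ term in (6.2).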

By (5.43), (5.55), (6.5), and the definition of constants (5.33) and (5.34), we have

   6   X 2 2 x 4 2 x V (x, Q) =Q r3(n) + x C0 log + C0C2 − C1 − Γ Q W n≤x Q 3 Q + U(x, Q) + O(x3/2 log x). (6.7)

Finally, by (5.1), (5.11), (5.51), (6.1) and (6.7), two possibilities for the error terms (5.48) and (5.50), as well as definitions (5.33), (5.34), (6.3) and (6.4) which are related to constants C0, C1 and C2, we have the following conclusion on V (x, Q) when Q is large enough:

Theorem 6.1. Assume that ε is a positive number and k ≥ 1 is an integer, and let

   6 X 2 2 x 4 X h(l) 2 U0(x, Q) = V (x, Q)−Q r3(n) −x C0 log +C0C2 −C1 −Γ (x−lQ) n≤x Q 3 l≤k l where  6 ∞ q 4 X 1 X 6 C0 = Γ 7 |S(q, c)| , 3 q=1 q c=1 (c,q)=1

 6 ∞ q 4 X log q X 6 C1 = Γ 7 |S(q, c)| , 3 q=1 q c=1 (c,q)=1

∞ 11 Z B2(u) C2 = − − 2 du, 12 1 u3 1 1 1 B (u) = (u − [u])2 − (u − [u]) + , 2 2 2 12 q X 1 X 6 h(l) = 6 |S(q, c)| , q|l q c=1 (c,q)=1 and q cm3  S(q, c) = X e . m=1 q

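Then when $\frac{x}{k+1} < Q \le \frac{x}{k}$ one has

\[
U_0(x,Q) \ll x^{\frac{10}{9} + \frac{2}{3}A^{+} + \varepsilon},
\]

where $A^{+}$ is the upper exponent of x in the representation of $\sum_{n\le x} r_3(n)^2$, or

\[
U_0(x,Q) \ll x^{\frac{10}{9}+\varepsilon}\Bigl(\sum_{n\le x} r_3(n)^2\Bigr)^{\frac{2}{3}}
\]

if the exponent of x in the representation of $\sum_{n\le x} r_3(n)^2$ exists.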

By §5.5, the error terms in Theorem 6.1 are $\ll x^{17/9+\varepsilon}$, so the result is more precise than Theorem 5.1. We can also apply Theorem 6.1 to several special cases and list the results as corollaries. The first corollary concerns the cases x/3 < Q ≤ x/2 and x/2 < Q ≤ x. These were treated as special cases when Goldston and Vaughan [1997] studied the variance for $a_n = \Lambda(n)$ (see p. 119 for the theorem and p. 142 for the proof).

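Corollary 6.1. Define C0, C1 and C2 as in Theorem 6.1. Then

(i) when $\frac{x}{3} < Q \le \frac{x}{2}$ one has

\[
V(x,Q) = Q\sum_{n\le x} r_3(n)^2 + x^2\Bigl(C_0\log\frac{x}{Q} + C_0C_2 - C_1 - \frac{3}{2}\Gamma\Bigl(\frac{4}{3}\Bigr)^6\Bigr) + 4\Gamma\Bigl(\frac{4}{3}\Bigr)^6 Qx - 3\Gamma\Bigl(\frac{4}{3}\Bigr)^6 Q^2 + U_0(x,Q),
\]

(ii) when $\frac{x}{2} < Q \le x$ one has

\[
V(x,Q) = Q\sum_{n\le x} r_3(n)^2 + x^2\Bigl(C_0\log\frac{x}{Q} + C_0C_2 - C_1 - \Gamma\Bigl(\frac{4}{3}\Bigr)^6\Bigr) + 2\Gamma\Bigl(\frac{4}{3}\Bigr)^6 Qx - \Gamma\Bigl(\frac{4}{3}\Bigr)^6 Q^2 + U_0(x,Q),
\]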

where U0(x, Q) has the same property as in Theorem 6.1.

Proof. By (5.1), (5.11) and (5.51), we have h(1) = 1 and h(2) = 1. So

\[
\sum_{l\le k}\frac{h(l)}{l}(x-lQ)^2 =
\begin{cases}
x^2 - 2Qx + Q^2 & \text{when } \frac{x}{2} < Q \le x, \\[4pt]
\frac{3}{2}x^2 - 4Qx + 3Q^2 & \text{when } \frac{x}{3} < Q \le \frac{x}{2}.
\end{cases}
\]

The corollary follows on applying Theorem 6.1.
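The two facts used in this proof, h(1) = h(2) = 1 and the closed forms of the truncated sum, can be checked numerically. The sketch below evaluates S(q, c) and h(l) directly from the definitions in Theorem 6.1, in floating point; the function names `S`, `h` and `truncated_sum` are hypothetical and chosen only for this illustration, and the brute-force evaluation is meant for small l only.

```python
import cmath
import math


def S(q, c):
    """Cubic exponential sum S(q, c) = sum_{m=1}^{q} e(c m^3 / q), e(t) = exp(2 pi i t)."""
    # Reduce c * m^3 modulo q first to keep the argument of exp small.
    return sum(cmath.exp(2j * math.pi * ((c * m ** 3) % q) / q) for m in range(1, q + 1))


def h(l):
    """h(l) = sum over q | l of q^(-6) * sum over 1 <= c <= q, gcd(c, q) = 1, of |S(q, c)|^6."""
    total = 0.0
    for q in range(1, l + 1):
        if l % q == 0:
            inner = sum(abs(S(q, c)) ** 6 for c in range(1, q + 1) if math.gcd(c, q) == 1)
            total += inner / q ** 6
    return total


def truncated_sum(x, Q):
    """sum_{l <= k} h(l)/l * (x - l Q)^2, where k = floor(x / Q), i.e. x/(k+1) < Q <= x/k."""
    k = math.floor(x / Q)
    return sum(h(l) / l * (x - l * Q) ** 2 for l in range(1, k + 1))


x = 100
print(h(1), h(2))
# Case x/2 < Q <= x, compared with x^2 - 2Qx + Q^2.
print(truncated_sum(x, 70), x ** 2 - 2 * 70 * x + 70 ** 2)
# Case x/3 < Q <= x/2, compared with (3/2)x^2 - 4Qx + 3Q^2.
print(truncated_sum(x, 40), 1.5 * x ** 2 - 4 * 40 * x + 3 * 40 ** 2)
```

Here h(2) = 1 because S(2, 1) = e(1/2) + e(4) = 0, so the divisor q = 2 contributes nothing to the sum defining h(2).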

The following corollary concerns the case for the largest possible Q, i.e. Q = x.

Corollary 6.2. Define C0, C1 and C2 as in Theorem 6.1. Then

\[
V(x,x) = x\sum_{n\le x} r_3(n)^2 + x^2(C_0C_2 - C_1) + U_0(x),
\]
where U0(x) has the same property as U0(x, Q) in Theorem 6.1.

Proof. Let Q = x in Corollary 6.1, case (ii). Then this corollary is proved.

Finally, the last corollary concerns the case when Q and x are linearly dependent, i.e. Q = x/m where m ≥ 1 is a constant (not necessarily an integer).

Corollary 6.3. Define C0, C1, C2 and h(l) as in Theorem 6.1. Then

\[
V\Bigl(x,\frac{x}{m}\Bigr) = \frac{x}{m}\sum_{n\le x} r_3(n)^2 + x^2\Bigl(C_0\log m + C_0C_2 - C_1 - \Gamma\Bigl(\frac{4}{3}\Bigr)^6\frac{1}{m^2}\sum_{l\le m}\frac{h(l)}{l}(m-l)^2\Bigr) + U_0(x),
\]

where m ≥ 1 is a constant, and U0(x) has the same property as U0(x, Q) in Theorem 6.1.

Proof. We have
\[
\frac{x}{[m]+1} < \frac{x}{m} \le \frac{x}{[m]}.
\]
So let Q = x/m in Theorem 6.1; then k = [m], and the sum over l ≤ [m] is the same as the sum over l ≤ m. Thus the corollary is proved.
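For readers who want the substitution spelled out, the following LaTeX fragment records how the two Q-dependent terms of Theorem 6.1 transform when Q = x/m. It is only a worked restatement of the step used in the proof, not an additional result.

```latex
% Substituting Q = x/m (so that x/Q = m and k = [m]) into Theorem 6.1:
\begin{align*}
\log\frac{x}{Q} &= \log m, \\
\sum_{l\le k}\frac{h(l)}{l}(x-lQ)^2
  &= \sum_{l\le [m]}\frac{h(l)}{l}\Bigl(x-\frac{lx}{m}\Bigr)^2
   = \frac{x^2}{m^2}\sum_{l\le m}\frac{h(l)}{l}(m-l)^2 .
\end{align*}
```

The factor x²/m² is what allows the sum over l to be absorbed into the bracket multiplying x² in Corollary 6.3.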

6.2 Similar Questions

There are some similar questions which can be used for further research on the variance V (x, Q). One of them is to change the sum of three cubes to the sum of s k-th powers.

In other words, let $a_n = r_{k,s}(n)$ be the number of (ordered) representations of n as the sum of s positive k-th powers, that is,

\[
r_{k,s}(n) = \sum_{\substack{x_1,x_2,\dots,x_s\\ x_1^k+x_2^k+\dots+x_s^k=n}} 1.
\]

Note that we can rewrite $r_3(n)$ as $r_{3,3}(n)$ under this notation. As with $r_3(n)$, little is known about the behavior of $r_{k,s}(n)$ when k ≥ 3 and s ≥ 3, including its mean value and mean square, but the study of such functions is useful in Waring’s problem (see §1.1).
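For small n the function r_{k,s}(n) can be tabulated directly from its definition. The following Python sketch does so by repeatedly adding one positive k-th power at a time; the function name `representation_counts` is hypothetical, and the approach is a plain enumeration intended for experimentation, not a method taken from the text.

```python
def representation_counts(x, k, s):
    """Return a list counts with counts[n] = r_{k,s}(n) for 0 <= n <= x.

    r_{k,s}(n) is the number of ordered s-tuples of positive integers
    (x_1, ..., x_s) with x_1^k + ... + x_s^k = n.
    """
    # All positive k-th powers not exceeding x, in increasing order.
    powers = []
    m = 1
    while m ** k <= x:
        powers.append(m ** k)
        m += 1

    counts = [0] * (x + 1)
    counts[0] = 1  # the empty sum represents 0 in exactly one way
    for _ in range(s):
        new_counts = [0] * (x + 1)
        for n, c in enumerate(counts):
            if c == 0:
                continue
            for p in powers:
                if n + p > x:
                    break
                new_counts[n + p] += c
        counts = new_counts
    return counts


r3 = representation_counts(251, 3, 3)
print(r3[3], r3[10], r3[251])
# Partial sum of the mean square, sum_{n <= 10} r_3(n)^2.
print(sum(c * c for c in r3[1:11]))
```

For example, r3[251] counts the six orderings of 2³ + 3³ + 6³ together with the three orderings of 1³ + 5³ + 5³.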

It is conjectured that if k = s is large enough, then there exists a constant Ck such that

\[
\sum_{n\le x} r_{k,k}(n)^2 \sim C_kx.
\]

If the conjecture is true, i.e. the mean square of $r_{k,k}(n)$ is bounded if k is large enough, then it will not be too difficult to estimate the variance V(x, Q). However, the study of the conjecture itself is still challenging.

Another question is to study the congruence on the sum of three cubes. In other words, let
\[
a_n = \sum_{\substack{x_1,x_2,x_3\le n\\ x_1^3+x_2^3+x_3^3\equiv a \pmod{q}}} 1
\]
and estimate V(x, Q). A similar question focuses on the congruence on the sum of s k-th powers:
\[
a_n = \sum_{\substack{x_1,x_2,\dots,x_s\le n\\ x_1^k+x_2^k+\dots+x_s^k\equiv a \pmod{q}}} 1.
\]
This function is also useful in Waring’s problem, since it is necessary to find the number of solutions of the congruence

\[
x_1^k + x_2^k + \dots + x_s^k \equiv a \pmod{q}
\]
with $(x_1, q) = 1$ (see Vaughan [1997], p. 22). The behavior of the function mainly depends on the local property, so it may not be hard to study the variance V(x, Q) on such a function.
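The congruence counts above are easy to explore by brute force for small parameters. The sketch below assumes that the variables range over positive integers 1 ≤ x_i ≤ n, which the text does not state explicitly; the function name `congruence_count` and its default arguments k = 3, s = 3 are hypothetical.

```python
from itertools import product


def congruence_count(n, q, a, k=3, s=3):
    """Count s-tuples (x_1, ..., x_s) with 1 <= x_i <= n (assumed range)
    such that x_1^k + ... + x_s^k is congruent to a modulo q."""
    # Only the residues of the k-th powers modulo q matter.
    residues = [pow(x, k, q) for x in range(1, n + 1)]
    target = a % q
    return sum(1 for t in product(residues, repeat=s) if sum(t) % q == target)


# Cubes are 0, 1 or 8 modulo 9, so no sum of three cubes is 4 modulo 9.
print(congruence_count(9, 9, 4))
# Both 1^3 and 2^3 are 1 modulo 7, so every triple from {1, 2} sums to 3 modulo 7.
print(congruence_count(2, 7, 3))
```

The enumeration costs n^s steps, so it is only practical for small n; it is meant to illustrate the local nature of the count, which depends on the x_i only through their residues modulo q.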

Bibliography

[1966] M. B. Barban, The large sieve method and its applications in the theory of numbers, Russian Math Surveys 21 (1966), 49–103.

[2002] M. Dancs, On a variance arising in the Gauss circle problem, Ph.D. Thesis, The Pennsylvania State University (2002), 51 pp.

[1966] H. Davenport and H. Halberstam, Primes in arithmetic progressions, Michigan Math. J. 13 (1966), 485–489.

[1996] J. B. Friedlander and D. A. Goldston, Variance of the distribution of primes in residue classes, Quart. J. Math. Oxford Ser (2) 47 (1996), 313–336.

[1967] P. X. Gallagher, The large sieve, Mathematika 14 (1967), 14–20.

[1997] D. A. Goldston and R. C. Vaughan, On the Montgomery-Hooley Asymptotic Formula, Sieve Methods, Exponential Sums, and their Applications in Number Theory (Cardiff, 1995), Cambridge University Press, 1997.

[1975a] C. Hooley, On the Barban-Davenport-Halberstam theorem. I., J. Reine Angew. Math. 274/275 (1975), 206–223.

[1975b] C. Hooley, On the Barban-Davenport-Halberstam theorem. II., J. London Math. Soc. (2) 9 (1975), 625–636.

[1975c] C. Hooley, On the Barban-Davenport-Halberstam theorem. III., J. London Math. Soc. (2) 10 (1975), 249–256.

[1981] C. Hooley, On Waring’s problem for two squares and three cubes, J. für die reine und angewandte Mathematik 328 (1981), 161–207.

[1986b] C. Hooley, On some topics connected with Waring’s problem, J. für die reine und angewandte Mathematik 369 (1986), 110–153.

[1996] C. Hooley, On Hypothesis K∗ in Waring’s problem, in: Sieve Methods, Exponential Sums, and their Applications in Number Theory, G. Greaves, G. Harman, and M.N. Huxley, Eds, Cambridge University Press, 1996, 175–185.

[1998a] C. Hooley, On the Barban-Davenport-Halberstam theorem. IX., Acta Arith. 83 (1998), 17–30.

[1998b] C. Hooley, On the Barban-Davenport-Halberstam theorem. X., Hardy-Ramanujan J. 21 (1998), 9 pp.

[2002] C. Hooley, On the Barban-Davenport-Halberstam theorem. XIV., Acta Arith. 101 (2002), 247–292.

[2005] C. Hooley, On the Barban-Davenport-Halberstam theorem. XVIII., Illinois J. Math. 49 (2005), 581–643.

[2007] C. Hooley, On the Barban-Davenport-Halberstam theorem. XIX., Hardy-Ramanujan J. 30 (2007), 56–67.

[1970] H. L. Montgomery, Primes in arithmetic progressions, Michigan Math. J. 17 (1970), 33–39.

[2007] H. L. Montgomery and R. C. Vaughan, Multiplicative Number Theory, Cambridge University Press, 2007.

[1991] I. Niven, H. S. Zuckerman and H. L. Montgomery, An Introduction to the Theory of Numbers, 5th edn, John Wiley & Sons, Inc., 1991.

[2012] P. Pongsriiam, The distribution of the divisor function in arithmetic progressions, Ph.D. Thesis, The Pennsylvania State University (2012), 103 pp.

[2015] P. Pongsriiam and R. C. Vaughan, The divisor function on residue classes I, Acta Arith. 168 (2015), 369–381.

[1986] E. C. Titchmarsh, The theory of the Riemann zeta-function, 2nd edn, revised by D. R. Heath-Brown, Oxford Univ. Press, 1986.

[1997] R. C. Vaughan, The Hardy-Littlewood Method, 2nd edn, Cambridge University Press, 1997.

[1998a] R. C. Vaughan, On a variance associated with the distribution of general sequences in arithmetic progressions. I, Phil. Trans. R. Soc. Lond. A 356 (1998), 781–791.

[1998b] R. C. Vaughan, On a variance associated with the distribution of general sequences in arithmetic progressions. II, Phil. Trans. R. Soc. Lond. A 356 (1998), 793–809.

[2021] R. C. Vaughan, On some questions of partitio numerorum: Tres cubi, Glasgow Mathematical Journal 63 (2021), 223–244.

Vita

Pengyong Ding

EDUCATION
(2021) Ph.D. in Mathematics, Pennsylvania State University
(2016) B.Sc. in Mathematics, Nankai University

TALKS
AMS Fall Eastern Sectional Meeting, Special Session on Analytic Number Theory
PAlmetto Joint Arithmetic, Modularity, and Analysis Series