<<

Character Sum Estimates in Finite Fields and Applications

by

Brandon Hanson

A thesis submitted in conformity with the requirements
for the degree of Doctor of Philosophy
Graduate Department of Mathematics
University of Toronto

© Copyright 2015 by Brandon Hanson

Abstract

Character Sum Estimates in Finite Fields and Applications

Brandon Hanson
Doctor of Philosophy
Graduate Department of Mathematics
University of Toronto
2015

In this thesis we present a number of character sum estimates for sums of various types occurring in finite fields. The sums in question generally have an arithmetic combinatorial flavour and we give applications of such estimates to problems in arithmetic combinatorics and analytic number theory. Conversely, we demonstrate ways in which the theory of arithmetic combinatorics can be used to obtain certain character sum estimates.

Dedication

To my friends, for all the laughs. To my family, for their support. To my teachers, for inspiring me. To John, for his patience and commitment. To Michelle, for everything.

Acknowledgements

First and foremost I must thank John Friedlander, my thesis advisor, for giving me so much of his time and patience. This thesis would not have been possible without all of the helpful discussions we had. I also want to thank Leo Goldmakher for getting me interested in the field and providing much encouragement along the way. Thanks to Kumar Murty and Antal Balog for being a part of my thesis committee and providing fruitful discussion. Finally, like every graduate student in math at the University of Toronto, I am indebted to Ida Bulat and Jemima Merisca. They made my life in the graduate program so much easier. I want to thank them for ensuring that my headaches were purely mathematical ones.

Contents

1 Introduction and Motivation
    1.1 Primes in arithmetic progressions: A fundamental example of equidistribution in number theory
    1.2 Random oscillatory sums and the square-root law: The nature of random sequences
    1.3 Weyl’s Equidistribution Criterion: Fourier analysis enters the scene
    1.4 The Sum-Product Phenomenon: A source of equidistribution
    1.5 Character sums: The star of the show
    1.6 An outline of this thesis

2 Notation and relevant background
    2.1 Asymptotic Notation
    2.2 Fourier analysis on finite abelian groups
    2.3 Finite fields
    2.4 Additive combinatorics and the Sum-Product Phenomenon
    2.5 Bohr sets and their structure
    2.6 Character sums

3 Capturing forms in dense subsets of finite fields
    3.1 Introduction
    3.2 Statement of results
    3.3 Upper Bound
    3.4 Lower Bound
    3.5 Remarks for Composite Modulus

4 Character sum estimates for Bohr sets and applications
    4.1 Introduction
    4.2 Statement of Results and Applications
        4.2.1 Main Results
        4.2.2 Applications
    4.3 The Pólya-Vinogradov Argument
    4.4 The Burgess Argument
    4.5 Application to Recurrence

5 Character sum estimates for various convolutions
    5.1 Introduction
    5.2 Statement of Results
    5.3 Trivariate sums
    5.4 Mixed multivariate sums

Bibliography

Chapter 1

Introduction and Motivation

Much of this thesis is concerned with equidistribution as it pertains to arithmetic. This notion is a fundamental one in number theory which measures the extent to which an object behaves randomly. In the next five sections we illustrate some results in number theory, old and new, which will motivate the thesis. The hope is that through this exposition, our train of thought will be made clear, so that the reader has context for the results of the following chapters. In the first section we present the quintessential example of equidistribution in analytic number theory - the distribution of primes in arithmetic progressions. Though not explicitly related to the results of this thesis, the problem of understanding the distribution of primes seems like the most natural starting point for any discussion about equidistribution in number theory. In the second section we digress a bit in order to recall some of the properties of uniform random sequences. We hope this diversion will suggest which qualities a deterministic object should have in order to deem it random-like. The third section of this introduction is devoted to Weyl’s Equidistribution Criterion. This is a basic result which relates the problem of measuring the uniformity of a sequence with its Fourier analytic behaviour. As the criterion suggests, Fourier analysis plays a large role in analytic number theory and it will be used at length in this thesis. In the fourth section of the introduction we discuss the Sum-Product Problem of combinatorial number theory. This problem seeks to quantify the extent to which additive structure and multiplicative structure are uncorrelated. The spirit of the Sum-Product Problem was the motivation for work on the character sum estimates proved in Chapters 4 and 5. In the fifth section, we hope to capture the reader’s interest in the question of character sum estimates.
Such questions began with Dirichlet’s work on the distribution of primes in arithmetic progressions, but we hope that throughout this chapter we can convince the reader that these estimates are interesting in their own right. In the final section of this introduction we give an outline for the rest of this thesis and a statement of the results to come.

1.1 Primes in arithmetic progressions: A fundamental example of equidistribution in number theory

The first, and perhaps most famous, instance of equidistribution of arithmetic objects is the equidistribution of the primes into arithmetic progressions. While we do not investigate the distribution of primes in this thesis, the question provides a good starting point for our exposition. The primes are mysterious numbers, mostly because they are defined by what they are not rather than by what they are. As such, stating facts about primes is rarely easy. We begin by examining some basic properties. Certainly, each prime other than 2 is odd. And no primes other than 3 and 5 should have a common factor with 15. In general, when we divide a prime p by q (with p not dividing q), which is to say we write $p = nq + a$ with $0 \le a \le q - 1$, the remainder a is necessarily relatively prime to q. Indeed, if q and a had a factor in common, that factor would also divide p. In short, if $p \equiv a \bmod q$ then $(a, q) = 1$. Beyond this obvious pattern, it is hard to deduce anything structural about the number a. Arguments going back to Euclid tell us that if we divide the odd primes by 4 then the remainders 1 and 3 each occur infinitely often (0 and 2 are forbidden). This fact was famously generalized by Dirichlet, who proved that each of the eligible remainders that come from dividing a prime p by a number q also occurs infinitely often as we run over the primes. His work and subsequent work in analytic number theory led to the Prime Number Theorem in Arithmetic Progressions, which says that each eligible remainder occurs with roughly the same frequency. In other words, primes fall uniformly into the $\varphi(q)$ eligible residue classes modulo q.

Theorem (Prime Number Theorem in Arithmetic Progressions). Let $\pi(x, q, a)$ denote the number of primes up to x which have remainder a when divided by q, and let $\pi(x)$ denote the total number of primes up to x. If $(a, q) = 1$ then as $x \to \infty$ we have
\[ \frac{\pi(x, q, a)}{\pi(x)} \to \frac{1}{\varphi(q)}. \]

Suppose we were given a large prime p and asked which residue class p lies in modulo q. Without any further information this task seems hopeless - the prime is really, really big. We should not be too hard on ourselves however, because the above theorem is telling us we might do just as well to choose one class at random. So, it is not the case that we do not understand patterns in the distribution of primes beyond the obvious ones, but rather that (at least at this scope) there aren’t any. The main tool for the study of primes in arithmetic progressions is the Dirichlet character. These characters will be of central interest in Chapter 4 and Chapter 5. We will give further exposition to Dirichlet characters in Section 1.5.
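To make the statement of the theorem concrete, here is a minimal Python sketch that counts primes up to x in each eligible residue class modulo q. It is an illustration only and not part of the original argument; the function names `primes_up_to` and `residue_counts` and the choice of sieve are our own assumptions.

```python
from math import gcd


def primes_up_to(x):
    """Sieve of Eratosthenes: return the list of all primes <= x (x >= 1)."""
    sieve = [True] * (x + 1)
    sieve[0:2] = [False, False]
    for n in range(2, int(x ** 0.5) + 1):
        if sieve[n]:
            for m in range(n * n, x + 1, n):
                sieve[m] = False
    return [n for n in range(2, x + 1) if sieve[n]]


def residue_counts(x, q):
    """Map each residue a with (a, q) = 1 to pi(x, q, a)."""
    counts = {a: 0 for a in range(q) if gcd(a, q) == 1}
    for p in primes_up_to(x):
        if p % q in counts:  # primes dividing q are skipped
            counts[p % q] += 1
    return counts


if __name__ == "__main__":
    x, q = 10 ** 5, 7
    total = len(primes_up_to(x))
    for a, c in sorted(residue_counts(x, q).items()):
        # Each ratio should be close to 1/phi(7) = 1/6.
        print(a, c, c / total)
```

Already for modest x the proportions are close to one another, although the theorem itself is only an asymptotic statement.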

1.2 Random oscillatory sums and the square-root law: The nature of random sequences

Most of the equidistribution problems investigated in this thesis are concerned with points on the unit circle in the complex plane,
\[ S^1 = \{z \in \mathbb{C} : |z| = 1\}. \]

We identify the circle $S^1$ with the group $\mathbb{R}/\mathbb{Z} = [0, 1)$, the group operation being addition modulo 1. This identification is via the map $e : \mathbb{R}/\mathbb{Z} \to S^1$ defined by $e(\theta) = e^{2\pi i\theta}$. Given a real number $\alpha$, the expression $\alpha \bmod 1$ means the fractional part $\{\alpha\}$ of $\alpha$, that is, $\alpha$ up to translation by integers. We now divert briefly from the topic of equidistribution to discuss what we might expect on random grounds when summing complex unit vectors.

Suppose we choose N numbers $\theta_1, \ldots, \theta_N$ uniformly at random from [0, 1] and send them to the circle by the map e defined above. These new points have uniformly distributed angles and so are likely to point in all sorts of directions. In particular, given an arc C of length l(C) on the circle, we would expect that a proportion l(C)/2π - the proportion of the circle occupied by C - of the points lie in the arc C.

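The (probabilistic) expectation of $e(\theta_n)$ is $\mathbb{E}(e(\theta_n)) = 0$. The numbers $e(\theta_n)$ are all unit vectors pointing in various directions and so when adding them, we expect to see a lot of cancellation - their total expectation is
\[ \sum_{n \le N} \mathbb{E}(e(\theta_n)) = 0. \]
How close to this expectation is the sum typically? Well, while the sum
\[ S_N = \sum_{n=1}^{N} e(\theta_n) \]
could be as large as N, being a sum of N complex numbers of unit modulus, by Chebyshev’s inequality we have
\[ \mathbb{P}(|S_N| > k\sqrt{N}) \le \frac{1}{k^2 N}\mathbb{E}(|S_N|^2) = \frac{1}{k^2 N}\sum_{1 \le m, n \le N} \mathbb{E}\big(e(\theta_m)\overline{e(\theta_n)}\big). \]

Since $\theta_m$ is independent of $\theta_n$ when $m \ne n$ we have
\[ \mathbb{E}\big(e(\theta_m)\overline{e(\theta_n)}\big) = \mathbb{E}(e(\theta_m))\,\overline{\mathbb{E}(e(\theta_n))} = 0. \]

Each of the N diagonal terms $m = n$ contributes $\mathbb{E}(|e(\theta_n)|^2) = 1$, so that $\mathbb{E}(|S_N|^2) = N$. Thus
\[ \mathbb{P}(|S_N| > k\sqrt{N}) \le \frac{1}{k^2} \]
and so we typically have $S_N \ll \sqrt{N}$. In fact, the Central Limit Theorem tells us that
\[ \mathbb{P}\left(a < \frac{S_N}{\sqrt{N}} < b\right) \to \frac{1}{\sqrt{2\pi}}\int_a^b e^{-t^2/2}\,dt. \]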

When considering the distribution of complex unit vectors, a quantitative way to measure their randomness is to see cancellation in their sum. The “Holy Grail” of this business is to prove that these sums exhibit the same square-root cancellation as random sums do. We call this the square-root law. With all of this in mind, let us continue with our discussion of equidistribution.
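The square-root law for random sums can be observed empirically. The following Python sketch, which is our own illustration with hypothetical function names and a fixed seed chosen only for reproducibility, averages $|S_N|^2/N$ over many independent trials. Since the expectation of $|S_N|^2$ is N, the average should be close to 1.

```python
import cmath
import random


def random_unit_sum(N, rng):
    """S_N = sum of e(theta_n) for N independent uniform theta_n in [0, 1)."""
    return sum(cmath.exp(2j * cmath.pi * rng.random()) for _ in range(N))


def mean_square_ratio(N, trials, seed=0):
    """Average of |S_N|^2 / N over independent trials (expected value 1)."""
    rng = random.Random(seed)
    total = sum(abs(random_unit_sum(N, rng)) ** 2 for _ in range(trials))
    return total / (trials * N)


if __name__ == "__main__":
    for N in (100, 400, 1600):
        print(N, mean_square_ratio(N, trials=300))
```

The trivial bound $|S_N| \le N$ always holds, but a typical sample is of size about $\sqrt{N}$.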

1.3 Weyl’s Equidistribution Criterion: Fourier analysis enters the scene

Suppose we are given a sequence of numbers $(\alpha_n)$ in $\mathbb{R}/\mathbb{Z}$, which appear to have no obvious patterns. The sequence is a deterministic one, and in our case, will usually have a number theoretic origin. We would like to quantify how close to a uniform random sequence these numbers are and we will do this by comparing their distributions. The basic tool for doing this sort of thing is Fourier analysis, which is illustrated by the Weyl Criterion. Recall that e was the function from the reals modulo 1 ($\mathbb{R}/\mathbb{Z}$) to the circle ($S^1$), defined by $e(\theta) = e^{2\pi i\theta}$. How often do the points $e(\alpha_n)$ lie in the right side of the circle and how often do they lie in the left side? If the sequence were unbiased, then we would expect the answer to be half and half. In general and as was discussed in Section 1.2, given an arc C of length l(C) on the circle, we expect that the proportion of elements of the sequence which lie in C is about l(C)/2π, the proportion of the circle occupied by C. To be precise, we would expect that
\[ \lim_{N \to \infty} \frac{|\{n \le N : e(\alpha_n) \in C\}|}{N} = \frac{l(C)}{2\pi}. \]

If this holds for any arc C, we say the sequence $(\alpha_n)$ is equidistributed. Weyl’s Criterion turns the problem of showing a sequence is equidistributed into a question about estimating oscillatory sums.

Theorem (Weyl’s Criterion). A sequence $(\alpha_n)$ in $\mathbb{R}/\mathbb{Z}$ is equidistributed if and only if for each integer $k \ne 0$, we have
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n \le N} e(k\alpha_n) = 0. \]

It is a simple consequence of Weyl’s Criterion that the sequence $\alpha_n = n\alpha \bmod 1$ is equidistributed if and only if the number $\alpha$ is irrational - that is if $\alpha$ doesn’t satisfy any rational linear equation. This suggests that predicting the long-term behaviour of consecutive translation by $\alpha$ is a complicated problem. Given a huge number N and limited computing power, it would be tough to figure out where $e(N\alpha)$ is on the circle. On the other hand, when $\alpha$ is rational, say $\alpha = a/q$, then the sequence of numbers $n\alpha \bmod 1$ begins with $0, 1/q, 2/q, \ldots, (q-1)/q$ and repeats, so that long-term behaviour is pretty simple: to find $e(N\alpha)$ on $S^1$ we just need to work out $N \bmod q$. Our best guess for where $N\alpha$ lands when N is large and $\alpha$ is irrational, according to Weyl’s Criterion, is to choose uniformly at random on the circle. We remark here that there are plenty of number theoretic sequences whose distribution mod 1 is the subject of ongoing research. Complicated sequences like $b^n\pi$, which is concerned with the distribution of the digits of π in base b, remain a mystery still today. Here is a rough idea of the proof of Weyl’s Criterion. The basic strategy is a common one and highlights the usefulness of Fourier analysis in analytic number theory. The theory of Fourier analysis, at least as far as finite abelian groups are concerned, is developed in Section 2.2. We warn that our argument is quite imprecise, but the ideas can be made rigorous. Let (a, b) be an interval in $\mathbb{R}/\mathbb{Z}$ (which can be identified with an arc on the circle $S^1$). Our task is to show that the proportion of $n \le N$ for which $a < \alpha_n < b$ is the length of (a, b) (which we denote l(a, b)) if and only if for each $k \ne 0$ we have
\[ \lim_{N \to \infty} \frac{1}{N}\sum_{n \le N} e(k\alpha_n) = 0. \]

In essence we approximate the indicator function $1_{(a,b)}$ by a trigonometric polynomial
\[ F(\theta) = \sum_{|m| \le M} c_m e(m\theta) \]
with $c_0$ approximately l(a, b). That one can make such an approximation is usually proved in a standard first course in Fourier analysis. The proportion of $n \le N$ for which $a < \alpha_n < b$ is about
\[ \frac{1}{N}\sum_{n \le N} F(\alpha_n) = \sum_{|m| \le M} c_m \frac{1}{N}\sum_{n \le N} e(m\alpha_n) \approx l(a, b) + \sum_{\substack{|m| \le M \\ m \ne 0}} c_m \frac{1}{N}\sum_{n \le N} e(m\alpha_n) \]
and the right hand side tends to l(a, b) as N tends to infinity, which is what we wanted. On the other hand, for any $k \ne 0$, we divide $S^1$ into arcs $C_m$ of length $2\pi/M$ around the points $e^{2\pi i m/M}$ with M very large and relatively prime to k. Each of these arcs contains $N/M + E_m$ points $e(\alpha_n)$ where $E_m$ is an error and $E_m/N \to 0$. Since k and M are relatively prime, the complex numbers $e^{2\pi i k m/M}$ are just a permutation of the complex numbers $e^{2\pi i m/M}$. Thus we would expect
\[ \frac{1}{N}\sum_{n \le N} e(k\alpha_n) \approx \frac{1}{N}\sum_{m=1}^{M} e^{2\pi i m/M}\left(\frac{N}{M} + E_m\right) \to 0 \]
as $N \to \infty$.

Actually, Weyl’s Criterion provides more than just a way of measuring the randomness of a sequence. If we can estimate the necessary exponential sums, the criterion allows us to approximate our sequence by a random one. This means we can analyse the sequence based on random heuristics which is usually a much easier problem. If we have a certain special configuration, such as an arithmetic progression, and wanted to estimate how often it occurred in the given sequence, we can do so by counting the expected number of occurrences. We will make use of this idea in Chapter 3 and in Chapter 4.
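The dichotomy between rational and irrational translation can be seen numerically. The sketch below is an illustration under our own naming (the function `weyl_average` is hypothetical) and uses floating point arithmetic, so it gives evidence and not a proof. It computes the normalized sums appearing in Weyl’s Criterion for the sequence nα mod 1.

```python
import cmath
from math import sqrt


def weyl_average(alpha, k, N):
    """(1/N) * sum_{n <= N} e(k * n * alpha), where e(t) = exp(2*pi*i*t)."""
    total = sum(cmath.exp(2j * cmath.pi * k * n * alpha) for n in range(1, N + 1))
    return total / N


if __name__ == "__main__":
    N = 10 ** 4
    for k in (1, 2, 3):
        # Irrational alpha: the averages are tiny for every non-zero k.
        print(k, abs(weyl_average(sqrt(2), k, N)))
    # Rational alpha = 1/4 with k = 4: every term equals 1, so no cancellation.
    print(abs(weyl_average(0.25, 4, N)))
```

For α = 1/4 the frequency k = 4 detects the periodicity of the sequence, which is exactly the failure of equidistribution.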

1.4 The Sum-Product Phenomenon: A source of equidistribution

Many interesting questions in number theory are concerned with the additive structure of multiplicative objects or vice versa. For instance, Goldbach’s Conjecture asks whether each even number at least four is a sum of two primes. This question is difficult because the natural questions about primes are concerned with their multiplicative properties rather than their additive properties. It is a general phenomenon that the interaction of addition and multiplication is a complicated one. One way of quantifying this complexity leads to a famous and unsolved problem of Erdős and Szemerédi [ES]. For a finite set A of integers, we define the sumset of A to be
\[ A + A = \{a + a' : a, a' \in A\} \]
and the productset to be
\[ A \cdot A = \{a \cdot a' : a, a' \in A\}. \]

There are potentially $|A|(|A| + 1)/2$ different sums that could occur in $A + A$, accounting for the identity $a + a' = a' + a$. On the other hand if A was very structured, like an arithmetic progression, then many of these sums would repeat, and so $|A + A|$ could be as small as $2|A| - 1$. A similar analysis shows that
\[ 2|A| - 1 \le |A \cdot A| \le |A|(|A| + 1)/2 \]
though the sets A for which $|A \cdot A|$ is small look like geometric progressions rather than arithmetic progressions. Erdős and Szemerédi conjectured that while one of the quantities $|A + A|$ and $|A \cdot A|$ could be small, it is impossible for both quantities to be small simultaneously. Erdős and Szemerédi even went as far as to conjecture that:

Conjecture (Erdős-Szemerédi). For finite sets $A \subset \mathbb{Z}$ and $\varepsilon > 0$,
\[ \max\{|A + A|, |A \cdot A|\} \gg_\varepsilon |A|^{2-\varepsilon} \]
so that one of the two should be almost as big as possible.

The ε above is to some extent necessary as the set $A = \{1, \ldots, n\}$ has sumset $\{2, \ldots, 2n\}$ which has size $2n - 1$ and a product set which is of size $o(n^2)$, as was proved by Erdős in [E]. The study of the Erdős-Szemerédi Conjecture is referred to as the Sum-Product Problem. Currently the best-known result, which holds not only for sets A consisting of integers but also for sets of complex numbers, is
\[ \max\{|A + A|, |A \cdot A|\} \gg |A|^{4/3-\varepsilon}. \]

This was proved in the case of real sets A by Solymosi in [So] with a beautiful, and completely elementary geometric argument. The argument was cleverly extended to complex sets A in [KR]. The Sum-Product Problem is equally sensible in the finite field setting, however the possible presence of finite subfields (which are obstructions to the conjecture) makes the problem more difficult. We restrict our attention to prime fields Fp in order to get around this, though it is true that certain Sum-Product type statements are valid in extensions of Fq under additional hypotheses, see for instance [LRN]. The Sum-Product Phenomenon essentially tells us that additive sequences (or elements of additively structured sets) should appear random from a multiplicative point of view. Most of the work in this thesis can be interpreted in this spirit. In Chapter 4 and Chapter 5 we will make use of the known Sum-Product results in finite fields to estimate certain character sums.
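The extreme cases discussed in this section are easy to compute for small sets. The following Python sketch is our own illustration, with hypothetical helper names `sumset` and `productset`. It compares an arithmetic progression with a geometric progression of the same size.

```python
def sumset(A):
    """A + A = {a + b : a, b in A}."""
    return {a + b for a in A for b in A}


def productset(A):
    """A . A = {a * b : a, b in A}."""
    return {a * b for a in A for b in A}


n = 50
arithmetic = set(range(1, n + 1))       # {1, ..., n}
geometric = {2 ** i for i in range(n)}  # {1, 2, 4, ..., 2^(n-1)}

if __name__ == "__main__":
    for name, A in (("arithmetic", arithmetic), ("geometric", geometric)):
        print(name, len(sumset(A)), len(productset(A)))
```

The arithmetic progression has the smallest possible sumset, of size 2n − 1, and a large productset, while the geometric progression has the smallest possible productset and the largest possible sumset, of size n(n + 1)/2.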

1.5 Character sums: The star of the show

In the last section of this chapter we introduce the principal object of study in this thesis, the character sum. We will give a more extensive introduction to abstract characters in Section 2.2, but for now by a character we mean a multiplicative character over a prime field $\mathbb{F}_p$. This is a function on the group of units $\mathbb{F}_p^\times$ satisfying $\chi(ab) = \chi(a)\chi(b)$ which takes values in $S^1$. We extend this function to all of $\mathbb{F}_p$ by setting $\chi(0) = 0$. These very useful functions were introduced by Dirichlet in his work on primes in arithmetic progressions, which we discussed in Section 1.1. A character sum is just a quantity of the form
\[ S = \sum_{a \in A} w(a)\chi(a) \]
where χ is a character and $w : A \to \mathbb{C}$ is some weight function. Of course, since χ takes values in the unit disc we always have what we shall call the trivial estimate
\[ |S| \le \sum_{a \in A} |w(a)|. \]

The goal of studying character sums is to understand when this estimate can be improved. It is sometimes impossible to improve this bound, which happens when A possesses too much multiplicative structure. A non-trivial estimate is then evidence that A is unstructured. To motivate the need for character sum estimates, we consider the following classical problem in analytic number theory:

Problem. Let p be a prime and consider the complete set of non-zero residue classes modulo p given by $\{1, 2, \ldots, p - 1\}$. Of these, precisely half are quadratic residues and half are not. Let $n_p$ denote the smallest integer in this set which is not a quadratic residue. In terms of p, how big is $n_p$?

The most famous instance of a multiplicative character is the Legendre symbol which is defined by
\[ \left(\frac{a}{p}\right) = \begin{cases} 0 & \text{if } a \equiv 0 \bmod p \\ 1 & \text{if } a \not\equiv 0 \bmod p \text{ and } a \text{ is a quadratic residue modulo } p \\ -1 & \text{if } a \not\equiv 0 \bmod p \text{ and } a \text{ is not a quadratic residue modulo } p. \end{cases} \]

It follows that $n_p$ is the smallest positive integer N satisfying $\left(\frac{N}{p}\right) = -1$, or equivalently the smallest value of N for which we can improve upon the trivial estimate in the sum
\[ S = \sum_{1 \le n \le N} \left(\frac{n}{p}\right). \]
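For small primes the quantity $n_p$ can be found by direct search. The sketch below is our own illustration. It evaluates the Legendre symbol through Euler’s criterion, $a^{(p-1)/2} \equiv \pm 1 \bmod p$, which is a standard fact that is not used in the text above, and the function names are hypothetical.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def least_nonresidue(p):
    """Smallest positive integer that is not a quadratic residue modulo the odd prime p."""
    n = 1
    while legendre(n, p) != -1:
        n += 1
    return n


if __name__ == "__main__":
    for p in (7, 23, 71, 311):
        print(p, least_nonresidue(p))
```

Up to $N = n_p - 1$ every term of the sum S equals 1, so the trivial estimate is attained. The first cancellation appears at $N = n_p$.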

There is a classical estimate for S going back to Pólya and Vinogradov, which holds for other characters as well.

Theorem (Pólya-Vinogradov). Let χ be a non-trivial multiplicative character modulo p. Then
\[ \sum_{M \le n \le M+N} \chi(n) \ll \sqrt{p}\log p. \]

This estimate is better than the trivial estimate provided $N \gg \sqrt{p}\log p$ and is simple to prove. A remarkable feature of this bound is that it is uniform in the length of the interval of summation. In [P], Paley proved that the bound is in fact nearly sharp for longer sums. One needs to work harder to get non-trivial estimates for shorter intervals. One reason for this is that many of the methods we have to estimate character sums extend to sums over arbitrary finite fields. This is problematic because in finite fields which are not prime, certain characters may not oscillate on an interval. It could be the case that the variable of summation ranges over some subfield to which a non-trivial character restricts trivially. Because the proper subfields of a finite field with q elements have at most $\sqrt{q}$ elements, sums with $\sqrt{q}$ or fewer terms tend to be much more difficult to estimate even when we expect a lot of cancellation. This obstacle has come to be known as the square-root barrier. In the early 1960’s ([Bu1], [Bu2]), D. A. Burgess gave an ingenious argument to break the square-root barrier.

Theorem (Burgess). Let χ be a non-trivial multiplicative character modulo p. Then for any positive integer k and $\varepsilon > 0$ we have
\[ \sum_{M \le n \le M+N} \chi(n) \ll_{k,\varepsilon} N^{1-1/k} p^{(k+1)/4k^2+\varepsilon}. \]

This result is better than trivial provided $N \gg p^{1/4+\delta}$ which can be seen by taking the parameter k to be sufficiently large. Obtaining estimates for even shorter intervals remains a major open problem in analytic number theory. Further reading can be found in Chapter 12 of [IK]. Both the Pólya-Vinogradov and the Burgess estimates are leveraging the additive structure of the integers in an interval. Multiplicative characters are just the multiplicative analogs of the exponential functions which were used in Weyl’s Criterion. Since the Sum-Product Phenomenon tells us that such additively structured sets should appear random from the point of view of the multiplicative group $\mathbb{F}_p^\times$, we expect non-trivial estimates for these sums to hold. In this thesis we find other settings in which the methods of Burgess and Pólya-Vinogradov prove fruitful. The basic Burgess method will be given in Section 2.6. In Chapter 5 we use Burgess’ ideas to estimate certain smoother, combinatorial character sums. In Chapter 4 we prove Burgess and Pólya-Vinogradov type estimates for character sums on a Bohr set.
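The uniformity of the Pólya-Vinogradov bound in the length of the interval can be checked by brute force for a small prime. The following Python sketch is an illustration only: it treats the Legendre symbol, it compares against $\sqrt{p}\log p$ with implied constant 1 (an assumption on our part, since the theorem as stated leaves the constant unspecified), and the function names are our own.

```python
from math import log, sqrt


def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, via Euler's criterion."""
    a %= p
    if a == 0:
        return 0
    return 1 if pow(a, (p - 1) // 2, p) == 1 else -1


def max_interval_sum(p):
    """Largest |sum_{M < n <= M+N} (n/p)| over 0 <= M < p and 1 <= N <= p."""
    prefix = [0]
    for n in range(1, 2 * p + 1):
        prefix.append(prefix[-1] + legendre(n, p))
    return max(abs(prefix[M + N] - prefix[M])
               for M in range(p) for N in range(1, p + 1))


if __name__ == "__main__":
    for p in (7, 101, 211):
        print(p, max_interval_sum(p), sqrt(p) * log(p))
```

By periodicity it suffices to consider starting points M in one period and lengths N at most p, which is why the search above is finite.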

1.6 An outline of this thesis

The next chapter is devoted to the necessary background needed for Chapters 3, 4 and 5. While many of these well-known results are quoted without proof, we hope the exposition is still enlightening. Where appropriate, we attempt to give context and intuition for these facts in order to further motivate the results of subsequent chapters. Chapter 3 gives a first taste of the application of character sums to combinatorial number theory. The results of that chapter represent progress toward a finite field analog to a long standing and open conjecture of Hindman. This conjecture is as follows. Suppose the natural numbers are each coloured by any of r possible colours. Must there always be two numbers $x, y \in \mathbb{N}$ for which $x + y$ and $xy$ are coloured the same? We ask a similar question over finite fields $\mathbb{F}_q$ where the theory of characters can be used. We give estimates on the size of a subset $A \subset \mathbb{F}_q$ needed to guarantee the existence of $x, y \in \mathbb{F}_q$ satisfying $xy, x + y \in A$. We also construct a subset $A \subset \mathbb{F}_q$ of size on the order of $\log q$ for which $xy, x + y \in A$ has no solutions. In fact, the result is slightly more general in that, provided certain non-degeneracy conditions hold, one can replace $x + y$ with a linear form in x and y, and one can replace $xy$ with a quadratic form in x and y. Our main theorem of Chapter 3 is:

Theorem. Let $\mathbb{F}_q$ be a finite field of odd order. Let $Q \in \mathbb{F}_q[X, Y]$ be a binary quadratic form with non-zero discriminant and let $L \in \mathbb{F}_q[X, Y]$ be a binary linear form not dividing Q. Then we have
\[ \log q \ll N_q(L, Q) \ll \sqrt{q}. \]

Chapter 4 contains estimates analogous to the classical estimates of Pólya-Vinogradov and Burgess for character sums over Bohr sets. In addition, we provide applications of these estimates to discrete analogs of questions in Diophantine approximation. Our first main theorem in Chapter 4 is:

Theorem (Pólya-Vinogradov for Bohr sets). Let $B = B(\Gamma, \varepsilon)$ be a Bohr set with $|\Gamma| = d$. Then for any non-trivial multiplicative character χ
\[ \sum_{x \in B} \chi(x) \ll_d \sqrt{p}(\log p)^d. \]

This estimate is non-trivial for Bohr sets which are larger than $\sqrt{p}$ in size. For smaller sets, we have the second main theorem of Chapter 4:

Theorem (Burgess for Bohr sets). Let $B = B(\Gamma, \varepsilon)$ be a regular Bohr set with $|\Gamma| = d$. Let $k \ge 1$ be an integer and let χ be a non-trivial multiplicative character. When $|B| \ge \sqrt{p}$ we have the estimate
\[ \sum_{x \in B} \chi(x) \ll_{k,d} |B| \cdot p^{5d/16k^2+o(1)} \left(\frac{|B|}{\varepsilon^d p}\right)^{5/16k} \left(\frac{p}{|B|}\right)^{-1/8k}. \]

When $|B| < \sqrt{p}$ we have the estimate
\[ \sum_{x \in B} \chi(x) \ll_{k,d} |B| \cdot p^{5d/16k^2+o(1)} \left(\frac{|B|}{\varepsilon^d p}\right)^{5/16k} \left(\frac{|B|^5}{p^2}\right)^{-1/8k}. \]

From here we move on to our applications - discrete analogs of Schmidt’s Theorem on approximation by squares. Our first application is the recurrence of small powers:

Theorem (Recurrence of k’th powers). Let Γ be a set of d integers, let p be a prime and let k be a positive integer. There is an integer $x \le p$ for which
\[ \max_{r \in \Gamma} \left\| x^k \frac{r}{p} \right\| \ll_d p^{-1/2d} \log p \cdot k^{1/d}. \]

Next we move on to recurrence of generators of $\mathbb{F}_p^\times$:

Theorem (Recurrence of primitive roots). Let Γ be a set of d integers and let p be a prime. There is an integer $1 < x < p$ which generates $\mathbb{F}_p^\times$ and such that
\[ \max_{r \in \Gamma} \left\| x \frac{r}{p} \right\| \ll_d \frac{p^{1/2d} \log p}{\varphi(p - 1)^{1/d}}. \]

Chapter 5 is concerned with estimates for character sums of three and four variables. The sums under investigation are smoother versions of well-known sums where breaking the square-root barrier is thought to be quite difficult. This work was originally motivated by the character sum estimates used in Chapter 3, however the general problem is well-known. The first main theorem of Chapter 5 is that we are able to breach the square-root barrier for triple convolutions:

Theorem. Given subsets $A, B, C \subset \mathbb{F}_p$ each of size $|A|, |B|, |C| \ge \delta\sqrt{p}$, for some $\delta > 0$, and a non-trivial character χ, then we have
\[ |S_\chi(A, B, C)| = o_\delta(|A||B||C|). \]

Unfortunately, we are unable to save a power of p in the above sum, which is usually what one seeks. By introducing a multiplicative fourth variable, we can obtain such a saving:

Theorem. Suppose $A, B, C, D \subset \mathbb{F}_p$ are sets with $|A|, |B|, |C|, |D| > p^\delta$, $|C| < \sqrt{p}$ and $|D|^4|A|^{56}|B|^{28}|C|^{33} \ge p^{60+\varepsilon}$ for some $\delta, \varepsilon > 0$. There is a constant $\tau > 0$ depending only on δ and ε such that
\[ |H_\chi(A, B, C, D)| \ll |A||B||C||D|p^{-\tau}. \]

In the case that $|A|, |B|, |D| > p^\delta$, $|C| \ge \sqrt{p}$ and $|D|^8|A|^{112}|B|^{56} \ge p^{87+\varepsilon}$ then there is a constant $\tau > 0$ depending only on δ and ε such that
\[ |H_\chi(A, B, C, D)| \ll |A||B||C||D|p^{-\tau}. \]

Chapter 2

Notation and relevant background

2.1 Asymptotic Notation

We will usually be interested in studying a quantity asymptotically with respect to some parameter. To do so we introduce the following standard notation. Given a complex valued function f and a real, non-negative function g of some variable t tending to infinity, we say $f = O(g)$ if $|f(t)| \le Cg(t)$ for $|t|$ sufficiently large and some constant C independent of t. We sometimes write $f \ll g$ to mean the same thing. In the case that there is a dependence on one or more further parameters $u_1, \ldots, u_k$, i.e. if $|f(t)| \le C_{u_1,\ldots,u_k} g(t)$ for sufficiently large t but the constant $C_{u_1,\ldots,u_k}$ depends on $u_1, \ldots, u_k$, then we write $f = O_{u_1,\ldots,u_k}(g)$ or $f \ll_{u_1,\ldots,u_k} g$. In the case that $f(t)/g(t) \to 0$ as $t \to \infty$ we write $f = o(g)$, and $f = o_{u_1,\ldots,u_k}(g)$ if there is a dependence on other parameters.

2.2 Fourier analysis on finite abelian groups

In this section we develop the basic theory of Fourier analysis on a finite abelian group. The books [TV] and [N] give a nice treatment of the subject. We will make use of the theory of $L^p$ spaces on finite sets, though because we shall reserve the letter p for a prime number, we will denote the spaces by $L^u$. For a finite set X, the space $L^u(X)$ is the set of functions $f : X \to \mathbb{C}$ endowed with the norm
\[ \|f\|_u^u = \frac{1}{|X|}\sum_{x \in X} |f(x)|^u. \]

The case $u = 2$ is of particular interest because by setting $\langle f, g\rangle = \sum_{x \in X} f(x)\overline{g(x)}$, we endow $L^2(X)$ with the structure of an inner product space. Given a subset $X' \subset X$ we define the indicator function of $X'$ by
\[ 1_{X'}(x) = \begin{cases} 1 & \text{if } x \in X' \\ 0 & \text{if } x \notin X'. \end{cases} \]

We also write $\delta_x = 1_{\{x\}}$.

Definition (Character). Let G be an abelian group. A character on G is a function $\gamma : G \to S^1$ such that for each $x, y \in G$ we have $\gamma(x + y) = \gamma(x)\gamma(y)$.

The set of characters on G is called the dual group of G and denoted $\widehat{G}$. It is in fact a group under pointwise multiplication $(\gamma\gamma')(x) = \gamma(x)\gamma'(x)$ and its identity is the constant function $1_G(x) = 1$, which will be called the trivial character. A crucial property of characters is that they are orthogonal as functions in $L^2(G)$.

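Proposition 2.1 (Orthogonality relations). Let $\gamma, \gamma' \in \widehat{G}$ then
\[ \sum_{x \in G} \gamma(x)\overline{\gamma'(x)} = \begin{cases} |G| & \text{if } \gamma = \gamma', \\ 0 & \text{if } \gamma \ne \gamma'. \end{cases} \]

Let $x, x' \in G$ then
\[ \sum_{\gamma \in \widehat{G}} \gamma(x)\overline{\gamma(x')} = \begin{cases} |G| & \text{if } x = x', \\ 0 & \text{if } x \ne x'. \end{cases} \]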

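The characters of the cyclic group $\mathbb{Z}/N\mathbb{Z}$ are given by the functions $x \mapsto e^{2\pi i k x/N}$ where $k \in \mathbb{Z}/N\mathbb{Z}$. These functions are well-defined since they have period dividing N. We obtain all characters of $\mathbb{Z}/N\mathbb{Z}$ in this fashion as k varies over the elements of $\mathbb{Z}/N\mathbb{Z}$ thus producing an isomorphism $\mathbb{Z}/N\mathbb{Z} \to \widehat{\mathbb{Z}/N\mathbb{Z}}$. Given a direct sum of cyclic groups $(\mathbb{Z}/N_1\mathbb{Z}) \oplus \cdots \oplus (\mathbb{Z}/N_l\mathbb{Z})$ and characters $\gamma_1, \ldots, \gamma_l$ on the individual groups, we define a character γ on the direct sum by
\[ \gamma(x_1 \oplus \cdots \oplus x_l) = \gamma_1(x_1) \cdots \gamma_l(x_l). \]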

In fact all characters on the direct sum are produced in this way. From the classification of finite abelian groups we deduce the following theorem.

Theorem 2.1. For any finite abelian group G we have an isomorphism $G \cong \widehat{G}$.

Usually, we shall be interested in some quantitative statement about the structure of a subset of an abelian group. For instance, we may wish to count the number of solutions to $a + b = c + d$ with $a, b, c, d \in A \subset G$. This quantity is called the additive energy of A, denoted $E_+(A, A)$ and will be discussed further in Section 2.4. While counting typically is done with indicator functions, such as
\[ E_+(A, A) = \sum_{a+b=c+d} 1_A(a)1_A(b)1_A(c)1_A(d), \]
characters provide a basis of functions that capture (using the orthogonality relations) the identity $a + b = c + d$. Thus the expression of an indicator function as linear combinations of characters gives a useful method for estimating such quantities. The fact that we can express any function as a linear combination of characters is called Fourier inversion.
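The characters of a cyclic group and the first orthogonality relation of Proposition 2.1 can be checked numerically. The sketch below is our own illustration for $G = \mathbb{Z}/N\mathbb{Z}$. The names `character` and `inner_sum` are hypothetical, and equalities hold only up to floating point error.

```python
import cmath


def character(k, N):
    """The character x -> e^{2 pi i k x / N} of Z/NZ."""
    return lambda x: cmath.exp(2j * cmath.pi * k * x / N)


def inner_sum(k, l, N):
    """sum over x in Z/NZ of gamma_k(x) * conjugate(gamma_l(x))."""
    g, h = character(k, N), character(l, N)
    return sum(g(x) * h(x).conjugate() for x in range(N))


if __name__ == "__main__":
    N = 12
    print(abs(inner_sum(3, 3, N)))  # approximately |G| = 12
    print(abs(inner_sum(3, 5, N)))  # approximately 0
```

The same computation with the roles of k and x exchanged illustrates the second orthogonality relation, since $\mathbb{Z}/N\mathbb{Z}$ is identified with its dual.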

Definition (Fourier Transform). Given a function $f : G \to \mathbb{C}$ and a character $\gamma \in \widehat{G}$, we define the Fourier transform of f at γ to be
\[ \widehat{f}(\gamma) = \sum_{x \in G} f(x)\overline{\gamma(x)} = \langle f, \gamma\rangle. \]

Recall that the convolution of functions is defined as:

Definition (Convolution). For functions $f, g : G \to \mathbb{C}$ we define their convolution $f * g : G \to \mathbb{C}$ by
\[ f * g(x) = \sum_{y \in G} f(x - y)g(y). \]

Perhaps the most useful property of the Fourier transform is that it turns convolution into multiplication. We record this and some other useful properties here.

Lemma 2.1 (Properties of the Fourier Transform). Let f, g : G → C, then we have

1. Fourier inversion: f(x) = 1 P f(γ)γ(x). |G| γ∈Gb b

2. Parseval’s identity: P f(x)g(x) = 1 P f(γ)g(γ). x∈G |G| γ∈Gb b b

3. Plancherel’s identity: P |f(x)|2 = 1 P |f(γ)|2. x∈G |G| γ∈Gb b

4. Convolution to multiplication: f[∗ g(γ) = fb(γ)gb(γ).

When A is a subset of G and 1A is the indicator function of A then the number of solutions to x = a + b with a, b ∈ A is just the convolution 1A ∗ 1A(x). Using the properties of the Fourier transform, we have a neat formula for additive energy:

\[ E_+(A, A) = \sum_{x \in G} (1_A * 1_A(x))^2 = \frac{1}{|G|} \sum_{\gamma \in \widehat{G}} |\widehat{1_A * 1_A}(\gamma)|^2 = \frac{1}{|G|} \sum_{\gamma \in \widehat{G}} |\widehat{1_A}(\gamma)|^4. \]

The second equality here is Plancherel’s identity, and the third follows from the convolution to multiplication property.

Weyl’s Criterion in the finite group setting can be viewed as a simple identity. Recall that Weyl’s Criterion in Section 1.3 said that a sequence of numbers $(\alpha_n)$ modulo 1 was equidistributed if and only if we have cancellation in the exponential sums

\[ \sum_{n \le N} e(k\alpha_n) = o(N) \]

when $k \in \mathbb{Z}$ is non-zero. The functions $\theta \mapsto e(k\theta)$ are just the characters on the group $\mathbb{R}/\mathbb{Z}$ and those with $k \neq 0$ are the non-trivial characters. Thus Weyl’s Criterion says that we have equidistribution provided there is cancellation when non-trivial characters are summed over the elements of the sequence. In the finite abelian group setting, we can establish the same fact easily from Fourier inversion. Suppose $(\alpha_n)_n$ is a sequence in $G$. We would like to say the sequence is equidistributed if, for each $x \in G$, the number of $n \le N$ with $\alpha_n = x$ is about $N/|G|$. Let

\[ A_N(x) = |\{n \le N : \alpha_n = x\}| - \frac{N}{|G|}, \]

then we have by Plancherel’s identity,

\[ \sum_{x \in G} (A_N(x))^2 = \frac{1}{|G|} \sum_{\gamma \in \widehat{G}} |\widehat{A_N}(\gamma)|^2. \]

When $\gamma$ is trivial,

\[ \widehat{A_N}(\gamma) = \sum_{x \in G} A_N(x) = 0. \]

Thus we see that we get closer to uniform distribution of the sequence as we get more cancellation of the Fourier transform at the non-trivial characters.

We end with the Poisson Summation Formula, which further illustrates how correlation with group structure leads to a concentration in the Fourier transform. Given a subgroup $H \subset G$, any character on $G$ can be viewed as a character on $H$ by restriction. We let

\[ H^\perp = \{\gamma \in \widehat{G} : \gamma|_H = 1\} \]

be the set of characters which restrict trivially to $H$.

Proposition 2.2 (Poisson Summation Formula). Let f : G → C be a function. Then

\[ \frac{1}{|H|} \sum_{x \in H} f(x) = \frac{1}{|G|} \sum_{\gamma \in H^\perp} \widehat{f}(\gamma). \]

This will be used in Section 4.5 to deduce an application of character sums over Bohr sets.
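The Fourier formula for additive energy recorded earlier in this section can be checked numerically in a small cyclic group. The following is only a sketch under stated assumptions: the modulus N = 17 and the set A are hypothetical choices made for illustration, and the transform is computed by brute force rather than by a fast algorithm.

```python
import cmath
from itertools import product


def additive_energy_direct(A, N):
    """Count quadruples (a, b, c, d) in A^4 with a + b = c + d in Z/NZ."""
    return sum(1 for a, b, c, d in product(A, repeat=4)
               if (a + b - c - d) % N == 0)


def indicator_transform(A, N):
    """Fourier transform of 1_A at each character x -> e^{2 pi i k x / N}."""
    return [sum(cmath.exp(-2j * cmath.pi * k * x / N) for x in A)
            for k in range(N)]


def additive_energy_fourier(A, N):
    """Evaluate (1/|G|) times the sum over characters of |hat(1_A)|^4."""
    return sum(abs(value) ** 4 for value in indicator_transform(A, N)) / N


N = 17                      # hypothetical modulus
A = [1, 2, 3, 5, 8, 13]     # hypothetical subset of Z/17Z
energy_direct = additive_energy_direct(A, N)
energy_fourier = additive_energy_fourier(A, N)
print(energy_direct, round(energy_fourier, 6))
```

The two printed values should agree up to floating point error, and the transform at the trivial character (k = 0) is simply |A|.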

2.3 Finite fields

In this section we set the stage for the problems that are investigated in this thesis, covering the basic facts concerning finite fields. All of the work in subsequent chapters concerns problems in this context. These facts can be found in the first three chapters of [LN].

A finite field is of course a field containing finitely many elements. The basic example is $\mathbb{F}_p = \mathbb{Z}/p\mathbb{Z}$, the field of residue classes modulo a prime integer $p$. Other examples (in fact, all other examples) are the algebraic extensions $\mathbb{F}_p(\alpha)$ of $\mathbb{F}_p$, obtained from $\mathbb{F}_p$ by adjoining an element $\alpha$ which satisfies some polynomial relation $\alpha^n + c_{n-1}\alpha^{n-1} + \ldots + c_1\alpha + c_0 = 0$ with coefficients $c_i \in \mathbb{F}_p$. The characteristic of a field $F$ with multiplicative identity 1 is the smallest integer $p > 0$ (if it exists) such that $p \cdot 1 = 0$, and it is necessarily a prime number. Otherwise we say the field has characteristic 0.

Theorem 2.2 (The Structure of Finite Fields). We have the following facts concerning finite fields.

1. Any finite field $F$ has characteristic $p$ with $p$ prime. Such a field necessarily contains $q = p^n$ elements for some integer $n > 0$. Each element $a \in F$ then satisfies the relation $a^q = a$.

2. Conversely, given a prime power $q = p^n$, there is a finite field $\mathbb{F}_q$ containing exactly $q$ elements which is unique up to field isomorphism. It is the splitting field of the polynomial $X^q - X \in \mathbb{F}_p[X]$.

Given the above theorem, we will henceforth denote all finite fields by $\mathbb{F}_q$ where $q = p^n$ is some prime power.

Theorem 2.3 (The Subfield Criterion). The subfields of $\mathbb{F}_q$ with $q = p^n$ consist precisely of the finite fields $\mathbb{F}_r$ with $r = p^m$ and $m | n$.

The Subfield Criterion tells us that there are no proper subfields of $\mathbb{F}_q$ of size bigger than $\sqrt{q}$. This is the source of the so-called square-root barrier that arises in the estimation of character sums. We will talk about this barrier in Section 2.6.

Theorem 2.4 (The Structure of Units). The set of non-zero elements of $\mathbb{F}_q$, denoted $\mathbb{F}_q^\times$, is a group which we call the multiplicative group of $\mathbb{F}_q$. It is a cyclic group of order $q - 1$.

Definition (Primitive Roots). The generators of the group $\mathbb{F}_q^\times$ are called primitive roots. There are exactly $\varphi(q - 1)$ primitive roots, where $\varphi$ is the Euler totient function.
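As a small illustration of the count of primitive roots, one can enumerate the generators of the multiplicative group of a prime field and compare with the totient of q - 1. This is a sketch only: it treats the prime case q = p, and the prime p = 31 is a hypothetical choice.

```python
from math import gcd


def multiplicative_order(a, p):
    """Order of a in the multiplicative group of F_p (p prime, p does not divide a)."""
    order, power = 1, a % p
    while power != 1:
        power = (power * a) % p
        order += 1
    return order


def primitive_roots(p):
    """All generators of the multiplicative group of F_p, found by brute force."""
    return [a for a in range(1, p) if multiplicative_order(a, p) == p - 1]


def euler_phi(n):
    """Euler totient function, computed directly from the definition."""
    return sum(1 for k in range(1, n + 1) if gcd(k, n) == 1)


p = 31                      # hypothetical prime
roots = primitive_roots(p)
print(len(roots), euler_phi(p - 1))
```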

Having discussed the theory of Fourier analysis on an arbitrary finite abelian group in the previous section, we now review the theory tailored specifically to finite fields. In this setting there are two groups with respect to which we perform Fourier analysis, namely the additive group $\mathbb{F}_q$ and the multiplicative group $\mathbb{F}_q^\times$. First we recall the trace map on an extension field.

Definition (Trace). Suppose $m \ge 1$ and $\mathbb{F}_{q^m}$ is the finite field of $q^m$ elements extending the finite field $\mathbb{F}_q$. Then the trace map is the $\mathbb{F}_q$-linear map $\mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q} : \mathbb{F}_{q^m} \to \mathbb{F}_q$ defined by

\[ \mathrm{Tr}_{\mathbb{F}_{q^m}/\mathbb{F}_q}(a) = \sum_{j=0}^{m-1} a^{q^j}. \]

With this in mind, we describe the characters of $\mathbb{F}_q$, which will be called additive characters or exponentials. They are parametrized by the elements of $\mathbb{F}_q$: all additive characters of $\mathbb{F}_q$ are of the form $x \mapsto e_p(\mathrm{Tr}(ax)) = e^{2\pi i \mathrm{Tr}(ax)/p}$ with $a \in \mathbb{F}_q$, where $\mathrm{Tr} = \mathrm{Tr}_{\mathbb{F}_q/\mathbb{F}_p}$ is the trace map. We will henceforth abbreviate this with the notation $e_q(ax) = e_p(\mathrm{Tr}(ax))$. The element $a \in \mathbb{F}_q$ is sometimes referred to as the frequency of the additive character.

The characters of the multiplicative group $\mathbb{F}_q^\times$ will be called multiplicative characters, or sometimes just characters if there is no risk of ambiguity. We also extend multiplicative characters $\chi$ to the whole of $\mathbb{F}_q$ by setting $\chi(0) = 0$. The multiplicative characters of a prime field are extended to a completely multiplicative function on the integers given by first reducing modulo $p$. These multiplicative functions are the Dirichlet characters, which Dirichlet introduced in his work on primes in arithmetic progressions and which are objects of great interest in analytic number theory.

2.4 Additive combinatorics and the Sum-Product Phenomenon

Additive combinatorics is a fairly young subject that is starting to see many applications in analytic number theory. It can loosely be thought of as the conversion of combinatorial information into algebraic information. One of its aims is to understand the nature of sumsets and the like. Most of the material here can be found in the standard reference [TV] except for the quoted version of the Balog-Szemerédi-Gowers Theorem, in which case references are supplied.

Definition (Sumset, difference set and partial analogs). Let A and B be finite subsets of an abelian group G. Their sumset is the set

A + B = {a + b : a ∈ A, b ∈ B}.

The difference set of A and B is the set

A − B = {a − b : a ∈ A, b ∈ B}.

If $E \subset A \times B$ then we define the partial sumset with respect to $E$ to be

\[ A \overset{E}{+} B = \{a + b : (a, b) \in E\} \]

and the partial difference set with respect to $E$ to be

\[ A \overset{E}{-} B = \{a - b : (a, b) \in E\}. \]

We are often interested in combinatorial information about A + B. How large is A + B? The quantity |A + A|/|A| is referred to as the doubling constant of A and much of additive combinatorics seeks to understand the sets A which have small doubling constant. Closely related to the sumset of two sets is their additive energy:

Definition (Additive energy). Let A and B be finite subsets of an abelian group G. The additive energy between A and B is the quantity

\[ E_+(A, B) = |\{(a, a', b, b') \in A \times A \times B \times B : a + b = a' + b'\}|. \]

Let r(s) denote the number of ways an element s ∈ A + B can be represented as s = a + b with a ∈ A and b ∈ B. We have the simple identities

\[ |A||B| = \sum_{s \in A+B} r(s) \]

and

\[ E_+(A, B) = \sum_{s \in A+B} r(s)^2. \]

A simple application of the Cauchy-Schwarz inequality shows that the size of $A + B$ is tied to the additive energy between $A$ and $B$.

Lemma 2.2. Let A and B be finite subsets of an abelian group G. Then we have

\[ |A|^2 |B|^2 \le |A + B| \cdot E_+(A, B). \]

Proof. We have

\[ |A||B| = \sum_{s \in A+B} r(s), \]

so by Cauchy-Schwarz we have

\[ |A|^2 |B|^2 = \left( \sum_{s \in A+B} 1 \cdot r(s) \right)^2 \le |A + B| \sum_{s \in A+B} r(s)^2 = |A + B| \cdot E_+(A, B). \]
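The inequality of Lemma 2.2 and the two identities for r(s) are easy to test on explicit sets of integers. The sketch below is illustrative only; the sets A and B are hypothetical choices, and the ambient abelian group is taken to be the integers.

```python
from collections import Counter


def representation_counts(A, B):
    """r(s) = number of (a, b) in A x B with a + b = s, for each s in A + B."""
    return Counter(a + b for a in A for b in B)


def additive_energy(A, B):
    """E_+(A, B), computed as the sum of r(s)^2 over the sumset."""
    return sum(r * r for r in representation_counts(A, B).values())


A = set(range(1, 11))            # hypothetical set: an interval
B = {1, 2, 4, 8, 16, 32}         # hypothetical set: a geometric progression
r = representation_counts(A, B)
sumset_size = len(r)
energy = additive_energy(A, B)
lhs = len(A) ** 2 * len(B) ** 2
print(lhs, sumset_size * energy)
```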

In fact, a converse to this result holds too, and comes in the form of the following theorem.

Theorem 2.5 (Balog-Szemerédi-Gowers). Suppose $A$ is a finite subset of an abelian group $G$ and

\[ E_+(A, A) \ge \frac{|A|^3}{K}. \]

Then there is a subset $A' \subset A$ of size $|A'| \gg \frac{|A|}{K(\log(e|A|))^2}$ with

\[ |A' - A'| \ll K^4 \frac{|A'|^3 (\log(|A|))^8}{|A|^2}. \]

The implied constants are absolute.

We remark here that it is sometimes necessary to pass to subsets $A'$ and $B'$ in order to obtain a small sumset. Indeed, suppose we take $A = B = \{1, \ldots, N\} \cup \{2^1, \ldots, 2^N\}$. Then the interval part $\{1, \ldots, N\}$ will cause the additive energy of $A$ to be large, while the geometric progression part $\{2^1, \ldots, 2^N\}$ will produce a large sumset. The version of the Balog-Szemerédi-Gowers Theorem we have quoted has very good explicit bounds, and is due to Bourgain and Garaev. The proof is essentially a combination of the following two lemmas from [BG], and we shall record it for convenience. It was communicated to us by O. Roche-Newton.
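To make the remark about the set consisting of an interval together with a geometric progression concrete, the following sketch computes its additive energy and sumset size for a hypothetical value N = 20. It is meant only as an illustration: both the energy (relative to |A|^3) and the sumset (relative to |A|^2) are large, while the interval alone has a sumset of size 2N - 1.

```python
from collections import Counter


def additive_energy(A):
    """E_+(A, A): number of (a, b, c, d) in A^4 with a + b = c + d."""
    counts = Counter(a + b for a in A for b in A)
    return sum(r * r for r in counts.values())


def sumset(A):
    """The sumset A + A."""
    return {a + b for a in A for b in A}


N = 20                                        # hypothetical parameter
interval = set(range(1, N + 1))               # {1, ..., N}
powers = {2 ** j for j in range(1, N + 1)}    # {2^1, ..., 2^N}
A = interval | powers

print("|A| =", len(A))
print("E_+(A, A) / |A|^3 =", additive_energy(A) / len(A) ** 3)
print("|A + A| / |A|^2 =", len(sumset(A)) / len(A) ** 2)
print("|A' + A'| for A' the interval:", len(sumset(interval)))
```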

Lemma 2.3. Let $G$ be an abelian group and $A, B \subset G$ finite subsets. Suppose $E \subset A \times B$ is such that $|E| \ge \frac{|A||B|}{K}$. There is a subset $A' \subset A$ of size $|A'| \ge \frac{1}{10K}|A|$ with

\[ |A \overset{E}{-} B|^4 \ge \frac{|A' - A'||A||B|^2}{10^4 K^5}. \]

The second lemma, below, is stated in [BG] with $A \overset{E}{+} B$ but works just as well with $A \overset{E}{-} B$.

Lemma 2.4. Let $G$ be an abelian group and $A, B \subset G$ finite subsets. There is a subset $E \subset A \times B$ such that

\[ E_+(A, B) \le \frac{8|E|^2}{|A \overset{E}{-} B|} (\log(e|A|))^2. \]

Proof of Theorem 2.5. Let $E \subset A \times A$ be any subset (which has size $|E| = \frac{|E|}{|A|^2}|A|^2$). Then by Lemma 2.3, there is a subset $A' \subset A$ of size at least $\frac{|E|}{10|A|^2}|A|$ and such that

\[ |A \overset{E}{-} A|^4 \ge \frac{|A' - A'||A|^3|E|^5}{10^4 |A|^{10}} = \frac{|A' - A'||E|^5}{10^4 |A|^7}. \tag{2.1} \]

By Lemma 2.4 there is a subset $E \subset A \times A$ with

\[ E_+(A, A) \le \frac{8|E|^2}{|A \overset{E}{-} A|} (\log(e|A|))^2. \]

Using this bound for $|A \overset{E}{-} A|$ in (2.1) gives

\[ |A' - A'| \ll \frac{|A|^7 |E|^3 \log(|A|)^8}{E_+(A, A)^4} \ll \frac{K^4 |E|^3 \log(|A|)^8}{|A|^5} \]

after using our lower bound on $E_+(A, A)$. Now we have that

\[ |E| \ll |A||A'| \]

so upon inserting this we have

\[ |A' - A'| \ll \frac{K^4 |A'|^3 \log(|A|)^8}{|A|^2} \]

which is what we wanted. We just need to check the lower bound on $|A'|$. Since any element of $A \overset{E}{-} A$ can be represented in at most $|A|$ ways, we have

\[ |A \overset{E}{-} A| \ge \frac{|E|}{|A|}. \]

On the other hand we have

\[ |A \overset{E}{-} A| \ll \frac{|E|^2 (\log(|A|))^2}{E_+(A, A)} \]

showing that

\[ |E| \gg \frac{E_+(A, A)}{|A|(\log(|A|))^2} \ge \frac{|A|^2}{K(\log(|A|))^2}. \]

This gives the desired bound on $|A'|$.

One useful tool for working with sumsets and difference sets is Ruzsa’s Triangle Inequality.

Lemma 2.5 (Ruzsa’s Triangle Inequality). Let A, B, C be finite subsets of an abelian group G. Then |A − B| ≤ |A − C||C − B|/|C|.

Proof. We produce an injection $i : C \times (A - B) \to (A - C) \times (C - B)$. For each element $d \in A - B$ fix a representation $d = a_d - b_d$ for some $a_d \in A$ and $b_d \in B$. Then we define $i(c, d) = (a_d - c, c - b_d)$. The sum of the two co-ordinates in the image of $i$ is $d$. From $d$ we recover $a_d$ and $b_d$ since each $d$ was assigned fixed summands. We can then recover $c$, and so $i$ is indeed injective.

Since we shall usually prefer to work with sumsets rather than difference sets, we also need the following lemma, which is a simple and standard consequence of the Ruzsa Triangle Inequality.

Lemma 2.6. Suppose A is a finite subset of an abelian group G. Then

\[ |A - A| \le \left( \frac{|A + A|}{|A|} \right)^2 |A|. \]

Proof. In the Triangle Inequality, take A = B and C = −A.
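Both Ruzsa’s Triangle Inequality and its consequence for |A - A| can be sanity-checked on small sets of integers. The following sketch uses hypothetical sets A, B and C chosen purely for illustration, and clears denominators so that only integer arithmetic is involved.

```python
def difference_set(A, B):
    """The difference set A - B."""
    return {a - b for a in A for b in B}


def sumset(A, B):
    """The sumset A + B."""
    return {a + b for a in A for b in B}


A = {0, 1, 3, 7, 12}        # hypothetical sets of integers
B = {2, 5, 6, 20}
C = {1, 4, 9, 16, 25, 36}

# Ruzsa's Triangle Inequality: |A - B| |C| <= |A - C| |C - B|.
triangle_lhs = len(difference_set(A, B)) * len(C)
triangle_rhs = len(difference_set(A, C)) * len(difference_set(C, B))

# Consequence (take B = A and C = -A): |A - A| |A| <= |A + A|^2.
doubling_lhs = len(difference_set(A, A)) * len(A)
doubling_rhs = len(sumset(A, A)) ** 2

print(triangle_lhs, triangle_rhs, doubling_lhs, doubling_rhs)
```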

It is sometimes easier to work with the energy between a set and itself rather than between distinct sets. Fortunately, the following lemma allows us to reduce to this scenario.

Lemma 2.7. We have

\[ E_+(A, B)^2 \le E_+(A, A) E_+(B, B). \]

Proof. Let $r_A(d)$ denote the number of ways $d = a_1 - a_2$ with $a_1, a_2 \in A$ and $r_B(d)$ denote the number of ways $d = b_1 - b_2$ with $b_1, b_2 \in B$. We have

\[ E_+(A, B)^2 = \left( \sum_d r_A(d) r_B(d) \right)^2 \le \left( \sum_d r_A(d)^2 \right) \left( \sum_d r_B(d)^2 \right) \]

by Cauchy-Schwarz. The right hand side above is just $E_+(A, A) E_+(B, B)$.

In order to execute a Burgess-type argument for character sums, we will need estimates on multiplicative energy. This is the same thing as additive energy, but in the context of the multiplicative group $\mathbb{F}_p^\times$. For two sets $A, B \subset \mathbb{F}_p$ we call

\[ E_\times(A, B) = |\{(a_1, a_2, b_1, b_2) \in A \times A \times B \times B : a_1 b_1 = a_2 b_2\}| \]

the multiplicative energy between $A$ and $B$. We observe that if

\[ r_\times(x) = |\{(a, b) \in A \times B : ab = x\}| \]

then

\[ E_\times(A, B) = \sum_{x \in \mathbb{F}_p} r_\times(x)^2. \]

As with additive energy, such quantities appear regularly in additive combinatorics, particularly in reference to the Sum-Product Problem. For our purposes, we need to bound the multiplicative energy between two sets with additive structure. We achieve this by using the following Sum-Product estimate from [R].¹ The estimate presented here is not explicitly written, but it is proved on the way to proving Theorem 1 of that article.

Theorem 2.6 (Rudnev). Let $A \subset \mathbb{F}_p$ satisfy $|A| < \sqrt{p}$. Then

\[ E_\times(A, A) \ll |A||A + A|^{7/4} \log |A|. \]

There is often a restriction like in the above theorem that $|A| < \sqrt{p}$, for otherwise it is impossible for the sets $A \cdot A$ and $A + A$ to have size $|A|^2$. Thus one cannot hope for estimates as strong as the Erdős-Szemerédi Conjecture 1.4 for integers. On the other hand one can prove very strong and nearly-optimal Sum-Product estimates for $|A| \ge \sqrt{p}$ using Fourier analytic methods, as was shown in [G2].

2.5 Bohr sets and their structure

The material here can be found in Section 4.4 of [TV]. We begin by defining Bohr sets in the setting of an arbitrary abelian group and then refine the theory to the more specific setting of a finite field. These sets were introduced into additive combinatorics and number theory by Bourgain in his work on arithmetic progressions in sumsets and improvements to Roth’s Theorem on three-term progressions, see [Bou1] and [Bou2] respectively.

Suppose $G$ is a finite abelian group and $\Gamma \subset \widehat{G}$ is some collection of characters. The kernel of $\gamma \in \Gamma$ is the subset $\mathrm{Ker}(\gamma)$ of $G$ on which $\gamma$ restricts to the trivial character. Since $\gamma$ is a homomorphism, $\mathrm{Ker}(\gamma)$ is a subgroup of $G$. We extend this definition and define the kernel of the set $\Gamma$ to be the subset $\mathrm{Ker}(\Gamma)$ of $G$ on which each character $\gamma \in \Gamma$ restricts to the trivial character. It is straightforward that

\[ \mathrm{Ker}(\Gamma) = \bigcap_{\gamma \in \Gamma} \mathrm{Ker}(\gamma) \]

and hence that $\mathrm{Ker}(\Gamma)$ is also a subgroup. It may be the case, for instance when $G$ is the group $\mathbb{F}_p$, which is the situation we are interested in, that there are no non-trivial proper subgroups. In this case $\mathrm{Ker}(\Gamma)$ is the trivial subgroup unless $\Gamma$ consists solely of the trivial character. Hence we need to settle for a weaker structure - an approximate kernel. The Bohr sets fill this role. Given a set of characters $\Gamma$ and a parameter $\varepsilon > 0$, the Bohr set $B(\Gamma, \varepsilon)$ is the set on which each character in $\Gamma$ is approximately trivial:

¹ Recently, Rudnev’s sum-product estimate was improved in [RNRS]. Turning this bound into an energy estimate may give a small improvement to our Burgess-type estimates. However, sum-product estimates are still far from optimal and are likely to see further improvement.

Definition (Bohr set, abelian group version). Let $G$ be an abelian group, let $\Gamma \subset \widehat{G}$ be a finite set of characters and let $\varepsilon > 0$ be a real number. Then we define the Bohr set to be

B(Γ, ε) = {x ∈ G : |γ(x) − 1| ≤ ε for each γ ∈ Γ} .

The size of Γ is called the rank of the Bohr set and the parameter ε is called the radius.

Remarkably, a Bohr set retains quite a bit of the group structure of G. For instance, Bohr sets are symmetric in the sense that B(Γ, ε) = −B(Γ, ε), and they contain the identity element of the group G. Furthermore, from the triangle inequality it is straightforward that

B(Γ, ε) + B(Γ, ε) ⊂ B(Γ, 2ε).

It follows that if B(Γ, 2ε) is not too much bigger than B(Γ, ε), then the Bohr set is closed under addition in a weak sense. Sets with these properties are referred to in the literature as “approximate groups”, and are discussed in more detail in Section 2.4 of [TV].

We now refine our discussion of Bohr sets to the additive group of $\mathbb{F}_p$. Such sets are quite useful in additive combinatorics. For instance it is often the case that one will transfer a problem in $\mathbb{Z}$ to a problem in $\mathbb{F}_p$, where one has a simpler version of Fourier analysis and a legal division operation. The drawback to working in $\mathbb{F}_p$ is that there are no non-trivial proper subgroups, so one needs to work with the weaker structure of a Bohr set. As we saw in Section 2.3, the additive characters of $\mathbb{F}_p$ are given by exponentials $x \mapsto e_p(rx)$ with $r \in \mathbb{F}_p$. We identify a set $\Gamma$ of exponentials with their frequencies $r$. Thus for a subset $\Gamma \subset \mathbb{F}_p$, the Bohr set $B(\Gamma, \varepsilon)$ consists of elements $x \in \mathbb{F}_p$ for which $|e_p(rx) - 1| \le \varepsilon$ for each $r \in \Gamma$. What is roughly equivalent (after renormalizing $\varepsilon$) is that $B(\Gamma, \varepsilon)$ is the set of $x \in \mathbb{F}_p$ for which $\|rx/p\| \le \varepsilon$ where $\|\cdot\|$ is the distance to the nearest integer, and we have interpreted $x$ as an integer up to equivalence modulo $p$. Here we have just used the fact that $\|rx/p\|$ and $|e_p(rx) - 1|$ are comparable. This is the definition we shall take in the $\mathbb{F}_p$ setting.

Definition (Bohr set, finite field version). Let Γ ⊂ Fp and let ε > 0 be a real number. Then we define the Bohr set to be

\[ B(\Gamma, \varepsilon) = \{x \in \mathbb{F}_p : \|rx/p\| \le \varepsilon \text{ for each } r \in \Gamma\}. \]

Again, the size of Γ is called the rank of the Bohr set and the parameter ε is called the radius.

This definition of a Bohr set should be somewhat reminiscent of Dirichlet’s Theorem on rational approximation:

Theorem (Dirichlet approximation). For real numbers $\alpha_1, \ldots, \alpha_d$ there is a positive integer $n \le Q$ so that

\[ \max\{\|n\alpha_k\| : 1 \le k \le d\} \le Q^{-1/d}. \]

Bohr sets consist of the numbers guaranteed by Dirichlet’s theorem, though we are working with discrete approximation - rational numbers with denominator p. In fact, using Dirichlet’s box principle we get the following estimates on the size of a Bohr set.

Lemma 2.8 (The size of Bohr sets). Let Γ ⊂ Fp with |Γ| = d and ε > 0. Then

\[ |B(\Gamma, \varepsilon)| \ge \varepsilon^d p \quad \text{and} \quad |B(\Gamma, 2\varepsilon)| \le 4^d |B(\Gamma, \varepsilon)|. \]

As we mentioned above, B(Γ, ε) + B(Γ, ε) ⊂ B(Γ, 2ε) by the triangle inequality, and we can immediately deduce the following bound.

Corollary 2.1. Let Γ ⊂ Fp with |Γ| = d and ε > 0. Then

\[ |B(\Gamma, \varepsilon) + B(\Gamma, \varepsilon)| \le 4^d |B(\Gamma, \varepsilon)|. \]
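The size estimates for Bohr sets in the prime field setting can be explored by direct enumeration. In the sketch below the prime p = 1009, the frequency set and the radius are hypothetical choices; distances to the nearest integer are computed exactly with rational arithmetic so that the comparison with the radius does not depend on floating point rounding.

```python
from fractions import Fraction


def distance_to_nearest_integer(r, x, p):
    """Return ||r x / p|| exactly, as a Fraction."""
    m = (r * x) % p
    return Fraction(min(m, p - m), p)


def bohr_set(frequencies, radius, p):
    """B(Gamma, eps) = {x in F_p : ||r x / p|| <= eps for each r in Gamma}."""
    return {x for x in range(p)
            if all(distance_to_nearest_integer(r, x, p) <= radius
                   for r in frequencies)}


p = 1009                     # hypothetical prime
frequencies = [1, 31]        # hypothetical frequency set Gamma, of rank d = 2
radius = Fraction(1, 5)      # hypothetical radius eps
d = len(frequencies)

B1 = bohr_set(frequencies, radius, p)
B2 = bohr_set(frequencies, 2 * radius, p)
B1_plus_B1 = {(x + y) % p for x in B1 for y in B1}

print(len(B1), float(radius ** d * p))         # size versus the lower bound
print(len(B2), 4 ** d * len(B1))               # doubling bound
print(len(B1_plus_B1), 4 ** d * len(B1))       # sumset bound
```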

Given Γ ⊂ Fp, there are certain values of ε for which |B(Γ, ε + κ)| varies nicely for small values κ. More precisely, we define a regular Bohr set as follows.

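Definition (Regular values and regular Bohr sets). Suppose $\Gamma \subset \mathbb{F}_p$ is a set of size $d$. We say $\varepsilon$ is a regular value for $\Gamma$ if whenever $|\kappa| < \frac{1}{100d}$ we have

\[ 1 - 100d|\kappa| \le \frac{|B(\Gamma, (1 + \kappa)\varepsilon)|}{|B(\Gamma, \varepsilon)|} \le 1 + 100d|\kappa|. \]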

We say the Bohr set B(Γ, ε) is regular.

The natural first question to ask is if a given Γ has any regular values. As it turns out, one can always find a regular value close to any desired radius. The following results are due to Bourgain.

Lemma 2.9. Let Γ be a set of size d and let δ ∈ (0, 1). There is an ε ∈ (δ, 2δ) which is regular for Γ.

The crucial property of regular Bohr sets is that they are almost invariant under translation by Bohr sets of small radius. This will allow us to replace a character sum over a Bohr set by something “smoother” in the next chapter.

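Corollary 2.2. Let $B(\Gamma, \varepsilon)$ be a regular Bohr set with $|\Gamma| = d$. If $\eta \le \delta\varepsilon/200d$ for some $0 < \delta < 1$ then for any natural number $n \ge 1$ and $y_1, \ldots, y_n \in B(\Gamma, \eta)$ we have

\[ \sum_{x \in \mathbb{F}_p} |1_{B(\Gamma,\varepsilon)}(x + y_1 + \ldots + y_n) - 1_{B(\Gamma,\varepsilon)}(x)| \le n\delta|B(\Gamma, \varepsilon)|. \]

Proof. By the triangle inequality it suffices to prove the result for $n = 1$. For $y = y_1$, the value of $|1_{B(\Gamma,\varepsilon)}(x + y) - 1_{B(\Gamma,\varepsilon)}(x)|$ is 0 unless exactly one of $x$ and $x + y$ lies in $B(\Gamma, \varepsilon)$, in which case there is a contribution of 1. However, if the latter happens then $x \in B(\Gamma, \varepsilon + \eta) \setminus B(\Gamma, \varepsilon - \eta)$. Owing to the regularity of $B(\Gamma, \varepsilon)$, for any $y \in B(\Gamma, \eta)$, there is a contribution of at most

\[ \left| B\left(\Gamma, \varepsilon\left(1 + \frac{\delta}{200d}\right)\right) \right| - \left| B\left(\Gamma, \varepsilon\left(1 - \frac{\delta}{200d}\right)\right) \right| \le \delta |B(\Gamma, \varepsilon)|. \]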

2.6 Character sums

Here we recall well-known facts concerning character sums over finite fields. For details, we refer to Chapter 11 of [IK]. Multiplicative characters are the characters $\chi$ of the group $\mathbb{F}_q^\times$ which are extended to $\mathbb{F}_q$ by setting $\chi(0) = 0$. For example, in the next chapter we will be particularly interested in the quadratic character on $\mathbb{F}_q$, that is the character given by

\[ \chi(c) = \begin{cases} 1 & \text{if } c \neq 0 \text{ is a square} \\ -1 & \text{if } c \neq 0 \text{ is not a square} \\ 0 & \text{if } c = 0. \end{cases} \]

Suppose $\chi$ is a non-trivial multiplicative character. For $a \in \mathbb{F}_q$ the Fourier transform of $\chi$ at $a$ is

\[ \tau(\chi, -a) = \sum_{y \in \mathbb{F}_q} \chi(y) e_q(-ay) \]

which is known as the Gauss sum. By expanding the square modulus, it is not hard to prove the following.

Lemma 2.10. For non-zero $a \in \mathbb{F}_q$ we have $|\tau(\chi, -a)| = \sqrt{q}$, and $\tau(\chi, 0) = 0$.

Proof. That $\tau(\chi, 0) = 0$ is immediate from orthogonality on $\mathbb{F}_q^\times$. Suppose then that $a \neq 0$. Then

\[ |\tau(\chi, -a)|^2 = \sum_{\substack{y_1, y_2 \in \mathbb{F}_q \\ y_2 \neq 0}} \chi(y_1/y_2) e_q(-a(y_1 - y_2)) = \sum_{z_1 \in \mathbb{F}_q} \chi(z_1) \sum_{z_2 \in \mathbb{F}_q^\times} e_q(-az_2(z_1 - 1)). \]

In the second step here we have made the change of variables $z_1 = y_1/y_2$ and $z_2 = y_2$. Since $z_2$ ranges over all non-zero elements of $\mathbb{F}_q$, provided $z_1 \neq 1$ orthogonality tells us that the inner sum is $-1$. If $z_1 = 1$ then the inner sum is $q - 1$, and $\chi(z_1) = 1$. Thus we have

\[ |\tau(\chi, -a)|^2 = q - \sum_{z_1 \in \mathbb{F}_q} \chi(z_1) = q \]

by orthogonality on $\mathbb{F}_q^\times$.

The exponential sums over the squares in $\mathbb{F}_q$ are also called Gauss sums by abuse of notation. This can be reconciled with the identity:

Lemma 2.11. Suppose $a \in \mathbb{F}_q^\times$ and $\chi$ is the quadratic character on $\mathbb{F}_q$. We have

\[ \tau(\chi, a) = \sum_{x \in \mathbb{F}_q} e_q(ax^2). \]

Proof. It is easy to see that $x^2 = y$ has exactly $\chi(y) + 1$ solutions $x$. So the right hand side above is

\[ \sum_{x \in \mathbb{F}_q} e_q(ax^2) = \sum_{y \in \mathbb{F}_q} e_q(ay)(\chi(y) + 1) = \tau(\chi, a) + \sum_{y \in \mathbb{F}_q} e_q(ay) = \tau(\chi, a) \]

by orthogonality.
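The two lemmas on Gauss sums can be verified numerically over a prime field. The sketch below is restricted to the case q = p with the hypothetical prime p = 23 and the quadratic character, computed by Euler’s criterion; equalities are only checked up to floating point error.

```python
import cmath
from math import sqrt


def quadratic_character(c, p):
    """Quadratic character on F_p via Euler's criterion; returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1


def e_p(t, p):
    """The additive character value e^{2 pi i t / p}."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)


def gauss_sum(a, p):
    """tau(chi, a) = sum over y of chi(y) e_p(a y), chi the quadratic character."""
    return sum(quadratic_character(y, p) * e_p(a * y, p) for y in range(p))


def square_exponential_sum(a, p):
    """The sum over x in F_p of e_p(a x^2)."""
    return sum(e_p(a * x * x, p) for x in range(p))


p = 23                       # hypothetical odd prime
for a in range(1, p):
    print(a, round(abs(gauss_sum(a, p)), 6), round(sqrt(p), 6))
```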

This is a pretty remarkable result. It says that the squares in $\mathbb{F}_q$ are perfectly equidistributed. We hit the square-root law right on the nose. Estimates of this strength are quite rare in the character sums business, and the strength of this is what allows us to prove the Pólya-Vinogradov bound.

In order to carry out the proof of a Burgess-type estimate, we shall need Weil’s bound for character sums with polynomial arguments. Recall that the subgroups of $\mathbb{F}_q^\times$ consist of the $l$’th powers for some $l$ dividing $q - 1$, owing to the fact that $\mathbb{F}_q^\times$ is a cyclic group. Weil’s bound tells us that a polynomial cannot take on values in such a subgroup unless the polynomial is itself an $l$’th power, in which case it clearly must.

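Theorem 2.7 (Weil). Let $f \in \mathbb{F}_p[x]$ be a polynomial with $r$ distinct roots over $\mathbb{F}_p$. Then if $\chi$ has order $l$ and provided $f$ is not an $l$’th power over $\mathbb{F}_p[x]$ we have

\[ \left| \sum_{x \in \mathbb{F}_p} \chi(f(x)) \right| \le r\sqrt{p}. \]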
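As a quick numerical illustration of Weil’s bound, the sketch below evaluates the sum for the quadratic character and one particular cubic. All of the choices are assumptions made for the example: the prime p = 101, the polynomial x^3 + x + 1 (which has odd degree and so is not a square, and whose discriminant -31 is non-zero modulo 101, so that it has r = 3 distinct roots in an algebraic closure), and the use of Euler’s criterion to evaluate the character.

```python
from math import sqrt


def quadratic_character(c, p):
    """Quadratic character on F_p via Euler's criterion; returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1


def character_sum(f, p):
    """Sum of chi(f(x)) over all x in F_p, for the quadratic character chi."""
    return sum(quadratic_character(f(x), p) for x in range(p))


def cubic(x):
    """Hypothetical test polynomial f(x) = x^3 + x + 1."""
    return x ** 3 + x + 1


p = 101                      # hypothetical prime
r = 3                        # number of distinct roots of the cubic
S = character_sum(cubic, p)
print(S, r * sqrt(p))
```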

We now record a general version of Burgess’ argument which is an application of Hölder’s inequality and Weil’s bound. This proof is distilled from the proof of Burgess’s original estimate in Chapter 12 of [IK]. The first ingredient we need is a basic consequence of Weil’s bound.

Lemma 2.12. Let $k$ be a positive integer and $\chi$ a non-trivial multiplicative character. Then for any subset $A \subset \mathbb{F}_p$ we have

\[ \sum_{x \in \mathbb{F}_p} \left| \sum_{a \in A} \chi(a + x) \right|^{2k} \le |A|^{2k} 2k\sqrt{p} + (2k|A|)^k p. \]

Proof. Expanding the $2k$’th power and using that $\overline{\chi}(y) = \chi(y^{p-2})$, we have

\[ \sum_{a_1, \ldots, a_{2k} \in A} \sum_x \chi\left( (x + a_1) \cdots (x + a_k)(x + a_{k+1})^{p-2} \cdots (x + a_{2k})^{p-2} \right) = \sum_{\mathbf{a} \in A^{2k}} \sum_x \chi(f_{\mathbf{a}}(x)). \]

Here $f_{\mathbf{a}}(X)$ is the polynomial

\[ f_{\mathbf{a}}(X) = (X + a_1) \cdots (X + a_k)(X + a_{k+1})^{p-2} \cdots (X + a_{2k})^{p-2}. \]

By Weil’s theorem, $\left| \sum_x \chi(f_{\mathbf{a}}(x)) \right| \le 2k\sqrt{p}$ unless $f_{\mathbf{a}}$ is an $l$’th power, where $l$ is the order of $\chi$. If any $a_i$ is distinct from all other $a_j$ then the corresponding root $-a_i$ of $f_{\mathbf{a}}$ occurs in the above expression with multiplicity 1 or $p - 2$. Both 1 and $p - 2$ are prime to $l$ since $l$ divides $p - 1$. Hence $f_{\mathbf{a}}$ is an $l$’th power only provided all of its roots can be grouped into pairs. So, for all but at most $\frac{(2k)!}{2^k k!}|A|^k \le (2k|A|)^k$ vectors $\mathbf{a} \in A^{2k}$, we have the estimate $2k\sqrt{p}$ for the inner sum. For the remaining $\mathbf{a}$ we bound the sum trivially by $p$. Hence the upper bound

\[ \sum_{x \in \mathbb{F}_p} \left| \sum_{a \in A} \chi(a + x) \right|^{2k} \le |A|^{2k} 2k\sqrt{p} + (2k|A|)^k p. \]

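Lemma 2.13. Let $A, B, C \subset \mathbb{F}_p$ and suppose $\chi$ is a non-trivial multiplicative character. Define

\[ r(x) = |\{(a, b) \in A \times B : ab = x\}|. \]

Then for any positive integer $k$, we have the estimate

\[ \left| \sum_{x \in \mathbb{F}_p} r(x) \sum_{c \in C} \chi(x + c) \right| \le (|A||B|)^{1-1/k} E_\times(A, A)^{1/4k} E_\times(B, B)^{1/4k} \cdot \left( |C|^{2k} 2k\sqrt{p} + (2k|C|)^k p \right)^{1/2k}. \]

Proof. Call the sum inside the absolute value on the left hand side above $S$. Applying Hölder’s inequality,

\[ |S| \le \left( \sum_{x \in \mathbb{F}_p} r(x) \right)^{1-1/k} \left( \sum_{x \in \mathbb{F}_p} r(x)^2 \right)^{1/2k} \left( \sum_{x \in \mathbb{F}_p} \left| \sum_{c \in C} \chi(x + c) \right|^{2k} \right)^{1/2k} = T_1^{1-1/k} T_2^{1/2k} T_3^{1/2k}. \]

Now $T_1$ is precisely $|A||B|$ and $T_2$ is the multiplicative energy $E_\times(A, B)$. By the Cauchy-Schwarz inequality, we have

\[ E_\times(A, B) \le \sqrt{E_\times(A, A) E_\times(B, B)}. \]

The estimate for $T_3$ is immediate from Lemma 2.12.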

We conclude this section with a brief discussion of the square-root barrier. This is the phenomenon where one can estimate sums efficiently in the field $\mathbb{F}_q$ only when the range of summation is bigger than $\sqrt{q}$. The usual method for doing so is called “completing the sum”, where one is able to replace a given sum over a small set by a complete sum over the whole field. Complete sums can then be handled with some sort of orthogonality. The problem with the completion method is that the tools involved - usually Fourier analysis and the Cauchy-Schwarz inequality - are not sensitive to a given field’s structure. In particular, the tools are not able to distinguish between $\mathbb{F}_p$ with $p$ prime and other non-prime fields. This is troublesome because the variables in our sum could lie in a subfield to which a non-trivial character restricts trivially. In the $\mathbb{F}_p$ setting, where we want to make such estimates, this is plainly impossible since there are no proper subfields. So we need to use tools that are sensitive to this fact. For this reason, the Sum-Product theory has been very useful because it is a theory tailored to prime fields. On the other hand, Burgess’ method is the only way we know to estimate character sums past the square-root barrier, and it is inefficient for sums shorter than $p^{1/4}$. This is because the Burgess method makes use of completion again, but replaces the use of orthogonality with the Weil bound. Perhaps if one could handle incomplete Weil type sums, non-trivial estimates could be made in a broader range.

Chapter 3

Capturing forms in dense subsets of finite fields

3.1 Introduction

Ramsey theory is concerned with finding small, structured objects inside of large objects. For instance, if you flip a coin enough times, you are likely to see a sequence of one hundred consecutive heads. The prototypical result in the area is appropriately named Ramsey’s Theorem.

Theorem 3.1 (Ramsey’s Theorem). Let $G = (V, E)$ be the complete graph with vertices $V$ indexed by the natural numbers. Suppose we have a function $c : E \to \{1, \ldots, r\}$ on the edges of $G$ for some finite integer $r \ge 1$. Then there is an infinite set of vertices $V' \subset V$ and a number $i$ such that for any distinct $v_1, v_2 \in V'$ we have $c(\{v_1, v_2\}) = i$.

We typically think of the edges of $G$ in Ramsey’s Theorem as having been coloured by $r$ different colours. The theorem then says that if we colour the edges of $G$ with finitely many colours then there is an infinite subset of the vertices such that the induced graph on that subset has monochromatic edges, i.e. they all have the same colour.

In arithmetic Ramsey theory, we are interested in finding configurations of numbers satisfying some arithmetic relation. For instance, it is a simple consequence of Ramsey’s Theorem that if we colour the natural numbers by $c : \mathbb{N} \to \{1, \ldots, r\}$, we can find a pair (and in fact infinitely many pairs) of numbers $x, y$ with each of $x$, $y$ and $x + y$ the same colour. To see this consider the complete graph with vertex set $\mathbb{N}$, and colour the edge $\{x, y\}$ by the colour $c(|x - y|)$. Then by Ramsey’s Theorem, we can certainly find three vertices $a < b < c$ in the graph such that all edges on these three vertices are coloured the same. But then the numbers $b - a$, $c - a$ and $c - b$ are coloured the same, so that $x = b - a$ and $y = c - b$ are the desired pair.

From this, we can give a similar result concerning pairs $x, y$ with $x$, $y$ and $xy$ coloured the same. Define a new colouring of the natural numbers given by colouring $k$ the same way we coloured $2^k$ in the original colouring. Finding $x'$ and $y'$ with $x'$, $y'$ and $x' + y'$ all the same colour in this new colouring gives the pair $x = 2^{x'}$, $y = 2^{y'}$ with $x$, $y$ and $xy$ monochromatic.

An open problem of Hindman asks if we can satisfy both equations monochromatically at the same time. That is, given a colouring of the natural numbers with finitely many colours, can we find $x, y$ with $x$, $y$, $x + y$ and $xy$ monochromatic? Even the easier question of finding $x$ and $y$ with $x + y$ and $xy$ the same colour (never mind the colours of $x$ and $y$) seems intractable. The problem is difficult largely because we have a hard time controlling the additive structure and multiplicative structure of integers at the same time. As a first step we can reduce the question to one about solving quadratic equations. Indeed, suppose we want to find $x$ and $y$ with $xy$ and $x + y$ of a fixed colour $i$. Let $A$ be the set of all natural numbers $n$ with $c(n) = i$. Then we want to find $a, b \in A$ with $x + y = a$ and $xy = b$. This is equivalent to $x$ and $y$ being the roots of the quadratic polynomial $Q(X) = X^2 - aX + b$. So our question is reduced to asking if any of the quadratics $X^2 - aX + b$ with $a, b \in A$ have natural roots, which as we learned in high school is equivalent to knowing when $(a \pm \sqrt{a^2 - 4b})/2$ is a natural number. There are two obstructions to this. First, the numerator needs to be even. The second and much more severe obstruction is that the discriminant $a^2 - 4b$ needs to be a perfect square. In this chapter we ask a question similar to Hindman’s conjecture, but in a setting where dividing by 2 is legal and perfect squares are easier to come by - a finite field of odd characteristic.

Before proceeding to the problem, we remark that the similar question of colouring $x + y$ and $x/y$ the same with $x, y \in \mathbb{N}$ and $y$ dividing $x$ is much easier. As we shall see, this question is a linear one.

Proposition 3.1. Let c : N → {1, . . . , r} be a finite colouring of the natural numbers. Then there exist x, y ∈ N with y|x and c(x + y) = c(x/y).

Proof. Write $z = x/y$, so that $x = yz$ - thus $x$ is linear in $y$. We want to have $c(z) = c(x + y) = c(y(z + 1))$. Now just take numbers $a_i$ with $c(a_i) = i$ for $i = 1, \ldots, r$. Then if the colour of $(a_1 + 1) \cdots (a_r + 1)$ is $k$, we can take $z = a_k$ and $y = \prod_{i \neq k} (a_i + 1)$. Letting $x = yz$ gives the desired pair.
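The proof above is constructive, and the construction can be sketched in a few lines of code. In this sketch the function name `find_pair`, the search limit and the sample colouring `colour_mod3` are all hypothetical; representatives are only taken for colours that actually occur among the first few natural numbers, which is harmless provided the colour of the product is among them.

```python
from math import prod


def find_pair(colour, search_limit=10_000):
    """Return (x, y) with y | x and colour(x + y) == colour(x // y)."""
    # One representative a_i for each colour seen among 1, ..., search_limit.
    representatives = {}
    for n in range(1, search_limit + 1):
        representatives.setdefault(colour(n), n)
    # The colour k of the product of all the (a_i + 1) decides z = a_k.
    total = prod(a + 1 for a in representatives.values())
    k = colour(total)
    if k not in representatives:
        raise ValueError("colour of the product was not seen; raise search_limit")
    z = representatives[k]
    y = prod(a + 1 for i, a in representatives.items() if i != k)
    return y * z, y


def colour_mod3(n):
    """Hypothetical 3-colouring of the natural numbers."""
    return n % 3 + 1


x, y = find_pair(colour_mod3, search_limit=100)
print(x, y, colour_mod3(x + y), colour_mod3(x // y))
```

Note that x + y = y(z + 1) is exactly the product of all the (a_i + 1), which is why its colour agrees with that of z = a_k.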

3.2 Statement of results

One might suspect that in fact a stronger result than Hindman’s Conjecture might hold, namely that any sufficiently dense set of natural numbers contains the elements x + y and xy for some x and y. This would immediately solve the problem since one of the colours in any finite colouring must be sufficiently dense. Such a result is impossible however, since the odd numbers provide a counterexample and are fairly dense in many senses of the word. This simple parity obstruction disappears in the finite field setting. In [Shk], the following was proved.¹

5 Theorem (Shkredov). Let p be a prime number, and A1,A2,A3 ⊂ Fp be any sets, |A1||A2||A3| ≥ 40p 2 . Then there are x, y ∈ Fp such that x + y ∈ A1, xy ∈ A2 and x ∈ A3.

Now, let $q = p^n$ be an odd prime power and $\mathbb{F}_q$ a finite field of order q. Given a binary linear form L(X,Y) and a binary quadratic form Q(X,Y), define $N_q(L,Q)$ to be the smallest integer k such that for any subset $A \subset \mathbb{F}_q$ with |A| ≥ k, there exists $(x,y) \in \mathbb{F}_q^2$ with L(x, y), Q(x, y) ∈ A. That is,

\[ N_q(L,Q) = \min\left\{k : \forall\, A \subset \mathbb{F}_q \text{ with } |A| \ge k,\ \exists\, (x,y) \in \mathbb{F}_q^2 \text{ with } L(x,y), Q(x,y) \in A\right\}. \]

In this chapter we give estimates on the size of Nq(L, Q). We will prove the following theorem.

¹ This result was communicated to us by J. Solymosi after the results of this section were made available.

Theorem 3.2. Let $\mathbb{F}_q$ be a finite field of odd order. Let $Q \in \mathbb{F}_q[X,Y]$ be a binary quadratic form with non-zero discriminant and let $L \in \mathbb{F}_q[X,Y]$ be a binary linear form not dividing Q. Then we have

\[ \log q \ll N_q(L,Q) \ll \sqrt{q}. \]

This theorem is the content of the next two sections. In the final section, we briefly remark on the analogous problem in the ring of integers modulo N when N is composite, where the situation is much akin to that of the integers.

3.3 Upper Bound

Let L(X,Y) be a linear form and Q(X,Y) be a quadratic form, both with coefficients in $\mathbb{F}_q$. Suppose A is an arbitrary subset of $\mathbb{F}_q$. We will reduce the problem of solving L(x, y), Q(x, y) ∈ A to estimating a character sum. Recall that we defined the quadratic character χ in Section 2.6 to be

\[ \chi(c) = \begin{cases} 1 & \text{if } c \neq 0 \text{ is a square} \\ -1 & \text{if } c \neq 0 \text{ is not a square} \\ 0 & \text{if } c = 0. \end{cases} \]

Lemma 3.1. Let Q ∈ Fq[X,Y ] be a binary quadratic form and let L ∈ Fq[X,Y ] be a binary linear form. Suppose a, b ∈ Fq. Then there exist r, s, t ∈ Fq depending only on L and Q such that

\[ |\{(x,y) \in \mathbb{F}_q^2 : L(x,y) = a \text{ and } Q(x,y) = b\}| = |\{y \in \mathbb{F}_q : ry^2 + say + ta^2 = b\}|. \]

Furthermore, r = 0 if and only if L|Q and r = s = 0 if and only if L2|Q.

Proof. Write $L(X,Y) = a_1X + a_2Y$ where without loss of generality we can assume $a_1 \neq 0$. We can factor

\[ Q(X,Y) = tL(X,Y)^2 + sL(X,Y)Y + rY^2. \]

If L(x, y) = a then we obtain

\[ Q(x,y) = ta^2 + say + ry^2. \]

The $y^2$ coefficient vanishes if and only if Q = LM for some linear form M. The $y$ and $y^2$ coefficients vanish if and only if $Q = tL^2$. Certainly, any solution to L(x, y) = a and Q(x, y) = b gives a solution y of $ry^2 + say + ta^2 = b$. Conversely, if y is such a solution, setting $x = a_1^{-1}(a - a_2y)$ produces a solution (x, y).

Corollary 3.1. Let Q ∈ Fq[X,Y ] be a binary quadratic form and let L ∈ Fq[X,Y ] be a binary linear form not dividing Q. For a, b ∈ Fq, the number of solutions to L(x, y) = a and Q(x, y) = b is

\[ 1 + \chi((s^2 - 4rt)a^2 + 4rb) \]

where χ is the quadratic character.

Proof. The quantity $(sa)^2 - 4r(ta^2 - b)$ is the discriminant of $ry^2 + say + ta^2 - b$. The result follows from the definition of χ and the quadratic formula.
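The count in Corollary 3.1 can be checked by brute force over a small prime field. The sketch below assumes q = p is an odd prime and $a_1 \neq 0$; the helper names (`legendre`, `rst`, `count_solutions`) and the sample forms are illustrative, and the formulas for r, s, t come from substituting $X = a_1^{-1}(L - a_2Y)$ into $Q = b_1X^2 + b_2XY + b_3Y^2$.

```python
def legendre(c, p):
    """Quadratic character of F_p (p an odd prime): returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1

def rst(lin, quad, p):
    """Coefficients (r, s, t) with Q = t L^2 + s L Y + r Y^2 over F_p."""
    a1, a2 = lin            # L = a1 X + a2 Y, with a1 invertible mod p
    b1, b2, b3 = quad       # Q = b1 X^2 + b2 XY + b3 Y^2
    u = pow(a1, -1, p)      # modular inverse (Python 3.8+)
    t = b1 * u * u % p
    s = (b2 * u - 2 * b1 * u * u * a2) % p
    r = (b1 * u * u * a2 * a2 - b2 * u * a2 + b3) % p
    return r, s, t

def count_solutions(lin, quad, a, b, p):
    """Brute-force count of (x, y) in F_p^2 with L(x, y) = a and Q(x, y) = b."""
    a1, a2 = lin
    b1, b2, b3 = quad
    return sum(
        (a1 * x + a2 * y - a) % p == 0
        and (b1 * x * x + b2 * x * y + b3 * y * y - b) % p == 0
        for x in range(p) for y in range(p)
    )

# Hypothetical example: L = X + 2Y, Q = X^2 + Y^2 over F_11 (L does not divide Q).
p, lin, quad = 11, (1, 2), (1, 0, 1)
r, s, t = rst(lin, quad, p)
D = (s * s - 4 * r * t) % p
print(count_solutions(lin, quad, 3, 4, p), 1 + legendre(D * 9 + 4 * r * 4, p))
```

The two printed numbers should agree for every choice of a and b, not only for the pair (3, 4) used here.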

In fact, from Lemma 3.1, we can essentially handle the situation when L|Q.

Corollary 3.2. Let $Q \in \mathbb{F}_q[X,Y]$ be a binary quadratic form and let $L \in \mathbb{F}_q[X,Y]$ be a binary linear form dividing Q. Then $N_q(L,Q) = 1$ if $L^2$ does not divide Q, otherwise $N_q(L,Q) \ge \frac{q+1}{2}$.

Proof. Let A ⊂ Fq. The number of pairs (x, y) with L(x, y),Q(x, y) ∈ A is

\[ \sum_{x,y} 1_A(L(x,y))\,1_A(Q(x,y)) = \sum_{a \in A} \sum_{y \in \mathbb{F}_q} 1_A(say + ta^2) \]

by the above lemma. If $sa \neq 0$ then $say + ta^2$ ranges over $\mathbb{F}_q$ as y does, and the inner sum is |A|. In this case there are in fact $|A|^2$ solutions (x, y). If a = 0 then 0 ∈ A and we can take (x, y) = (0, 0). If s = 0 then the sum is $q \sum_{a \in A} 1_A(a^2t)$. If we set

\[ A = \begin{cases} t \cdot N = \{tn : n \in N\} & \text{if } t \neq 0 \\ N & \text{if } t = 0 \end{cases} \]

where N is the set of non-squares in $\mathbb{F}_q$, then there are no solutions. This shows that $N_q(L,Q) \ge \frac{q+1}{2}$.

We now handle the case that L does not divide Q. The following estimate is essentially due to Vinogradov (see for instance the exercises of Chapter 6 in [V] for the analogous result for exponentials).

Lemma 3.2. Let $A, B \subset \mathbb{F}_q$ and suppose χ is a non-trivial multiplicative character. Then if $u, v \in \mathbb{F}_q^\times$

\[ \left| \sum_{a \in A} \sum_{b \in B} \chi(ua^2 + vb) \right| \le 2\sqrt{q|A||B|}. \]

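Proof. Let S denote the sum in question. Then

\[ |S| \le \sum_{b \in B} \left| \sum_{a \in A} \chi(ua^2 + vb) \right| \le |B|^{1/2} \left( \sum_{b \in \mathbb{F}_q} \left| \sum_{a \in A} \chi(ua^2 + vb) \right|^2 \right)^{1/2} \]

by Cauchy’s inequality. Expanding the sum in the second factor, we get

\[ \begin{aligned} \sum_{a_1,a_2 \in A} \sum_{\substack{b \in \mathbb{F}_q \\ ua_2^2 + vb \neq 0}} \chi\left( \frac{ua_1^2 + vb}{ua_2^2 + vb} \right) &= \sum_{a_1,a_2 \in A} \sum_{\substack{b \in \mathbb{F}_q \\ ua_2^2 + vb \neq 0}} \chi\left( 1 + \frac{u(a_1^2 - a_2^2)}{ua_2^2 + vb} \right) \\ &= \sum_{a_1,a_2 \in A} \sum_{b \in \mathbb{F}_q^\times} \chi\left( 1 + u(a_1^2 - a_2^2)b \right) \end{aligned} \]

after the change of variables $(ua_2^2 + vb)^{-1} \mapsto b$. When $a_1^2 \neq a_2^2$, the values of $1 + u(a_1^2 - a_2^2)b$ range over all values of $\mathbb{F}_q$ save 1 as b traverses $\mathbb{F}_q^\times$. Hence, in this case, the sum amounts to −1. It follows that the total is at most 4q|A|.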
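As a sanity check on Lemma 3.2, the bilinear sum can be evaluated directly for the quadratic character of a small prime field. This is only a numerical illustration under assumed parameters (p = 101, random sets of size 30, u = 3, v = 5), not part of the proof; the function names are hypothetical.

```python
import math
import random

def legendre(c, p):
    """Quadratic character of F_p (p an odd prime): returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1

def bilinear_sum(A, B, u, v, p):
    """Sum of chi(u a^2 + v b) over a in A and b in B."""
    return sum(legendre(u * a * a + v * b, p) for a in A for b in B)

random.seed(0)
p = 101
A = random.sample(range(p), 30)
B = random.sample(range(p), 30)
S = bilinear_sum(A, B, 3, 5, p)
bound = 2 * math.sqrt(p * len(A) * len(B))
# The trivial bound is |A||B| = 900; the lemma gives roughly 603.
print(abs(S), "<=", round(bound, 1))
```

The gap between the observed sum and the bound is typically large, in line with the square-root law heuristic for random signs.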

Recall that the discriminant of a quadratic form $Q(X,Y) = b_1X^2 + b_2XY + b_3Y^2$ is defined to be $\mathrm{disc}(Q) = b_2^2 - 4b_1b_3$.

Proposition 3.2. Let $Q \in \mathbb{F}_q[X,Y]$ be a binary quadratic form and let $L \in \mathbb{F}_q[X,Y]$ be a binary linear form not dividing Q. Then $N_q(L,Q) \le 2\sqrt{q} + 1$ if $\mathrm{disc}(Q) \neq 0$, otherwise $N_q(L,Q) \ge \frac{q-1}{2}$.

Proof. Let A ⊂ Fq. By Corollary 3.1, the number of pairs (x, y) with L(x, y),Q(x, y) ∈ A is

\[ \sum_{x,y} 1_A(L(x,y))\,1_A(Q(x,y)) = \sum_{a,b \in A} \left(1 + \chi(Da^2 + 4rb)\right) \]

where $D = s^2 - 4rt$. One can check that in fact $D = a_1^{-2}\,\mathrm{disc}(Q)$. If D = 0 then $\chi(Da^2 + 4rb) + 1 = \chi(r)\chi(b) + 1$. This will be identically zero if A is chosen to be the squares or non-squares according to the value of χ(r). Hence, if disc(Q) = 0 then $N_q(L,Q) \ge \frac{q-1}{2}$. Now assume $D \neq 0$. Summing over a, b ∈ A the number of solutions is

\[ |A|^2 + \sum_{a,b \in A} \chi(Da^2 + 4rb) = |A|^2 + E(A). \]

By Lemma 3.2, $|E(A)| < |A|^2$ when $|A| \ge 2\sqrt{q} + 1$ and the result follows.

In the case that A has particularly nice structure, we can improve the upper bound. Suppose q = p is prime and A is an interval. Then as above the number of pairs (x, y) with L(x, y),Q(x, y) ∈ A is

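\[ |A|^2 + \sum_{a,b \in A} \chi(Da^2 + 4rb). \]

Now

\[ \left| \sum_{a,b \in A} \chi(Da^2 + 4rb) \right| \le \sum_{a \in A} \left| \sum_{b \in A} \chi(Da^2/4r + b) \right|. \]

Using the classical Burgess estimate, the inner sum (which is also over an interval) is o(|A|) whenever $|A| \gg p^{\frac{1}{4} + \varepsilon}$.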

3.4 Lower Bound

In this section we give a lower bound for $N_q(L,Q)$ in the case that L does not divide Q and $\mathrm{disc}(Q) \neq 0$. To do so we need to produce a set A such that L(x, y) and Q(x, y) are never both elements of A. Equivalently, we need to produce a set A for which $\chi(Da^2 + 4rb) = -1$ for all pairs (a, b) ∈ A × A.

Let $a \in \mathbb{F}_q$ and define

\[ X_a(b) = \begin{cases} 1 & \text{if } \chi(Da^2 + 4rb) = \chi(Db^2 + 4ra) = -1 \\ 0 & \text{otherwise.} \end{cases} \]

Thus the desired set A will have Xa(b) = 1 for a, b ∈ A. The idea behind our argument is probabilistic. Suppose we create a graph Γ with vertex set

V = {a ∈ Fq : Xa(a) = 1} and edge set

E = {{a, b} : Xa(b) = Xb(a) = 1}.

These edges appear to be randomly distributed and occur with probability roughly $\frac{1}{4}$. In this setting, $N_q(L,Q)$ is one more than the clique number of Γ (i.e. the size of the largest complete subgraph of Γ). Let G(n, δ) be the graph with n vertices that is the result of connecting two vertices randomly and independently with probability δ. Such a graph has clique number roughly log n (see [AS], chapter 10). One is tempted to treat Γ as such a graph and construct a clique by greedily choosing vertices, and indeed this is how the set A is constructed. It is worth mentioning that this model suggests that the right upper bound for $N_q(L,Q)$ is closer to log q in magnitude.

Lemma 3.3. Let B ⊂ Fq. Then for a ∈ Fq, we have

\[ \sum_{b \in B} X_a(b) = \frac{1}{4} \sum_{b \in B} (1 - \chi(Da^2 + 4rb))(1 - \chi(Db^2 + 4ra)) + O(1). \]

Proof. The summands on the right are

\[ (1 - \chi(Da^2 + 4rb))(1 - \chi(Db^2 + 4ra)) = \begin{cases} 4 & \text{if } \chi(Da^2 + 4rb) = \chi(Db^2 + 4ra) = -1 \\ 2 & \text{if } \{\chi(Da^2 + 4rb), \chi(Db^2 + 4ra)\} = \{0, -1\} \\ 1 & \text{if } \chi(Da^2 + 4rb) = \chi(Db^2 + 4ra) = 0 \\ 0 & \text{otherwise.} \end{cases} \]

For fixed a, the second and third cases can only occur for O(1) values of b.

Proposition 3.3. Let $A, B \subset \mathbb{F}_q$ with $|A|, |B| > \sqrt{q}$. Then

\[ \sum_{a \in A} \sum_{b \in B} X_a(b) = \frac{|A||B|}{4} + O\left(|A||B|^{1/2}q^{1/4}\right). \]

Proof. By the preceding lemma, it suffices to estimate

\[ \begin{aligned} &\sum_{a \in A} \left( \frac{1}{4} \sum_{b \in B} (1 - \chi(Da^2 + 4rb))(1 - \chi(Db^2 + 4ra)) + O(1) \right) \\ &\quad = \frac{|A||B|}{4} - \frac{1}{4} \sum_{a \in A} \sum_{b \in B} \chi(Da^2 + 4rb) - \frac{1}{4} \sum_{a \in A} \sum_{b \in B} \chi(Db^2 + 4ra) \\ &\qquad + \frac{1}{4} \sum_{a \in A} \sum_{b \in B} \chi((Da^2 + 4rb)(Db^2 + 4ra)) + O(|A|). \end{aligned} \]

By Lemma 3.2, the first two sums above are $O(\sqrt{q|A||B|}) = O(|A||B|^{1/2}q^{1/4})$. By Cauchy’s inequality, the final sum is bounded by

\[ |B|^{1/2} \left( \sum_{b \in \mathbb{F}_q} \left| \sum_{a \in A} \chi((Da^2 + 4rb)(Db^2 + 4ra)) \right|^2 \right)^{1/2}. \]

Expanding the square modulus, the second factor is the square-root of

\[ \sum_{a_1,a_2 \in A} \sum_{b \in \mathbb{F}_q} \chi\left((Da_1^2 + 4rb)(Db^2 + 4ra_1)(Da_2^2 + 4rb)(Db^2 + 4ra_2)\right). \]

By Weil’s Theorem, the inner sum is bounded by $6\sqrt{q}$ when the polynomial

\[ f(b) = (Da_1^2 + 4rb)(Db^2 + 4ra_1)(Da_2^2 + 4rb)(Db^2 + 4ra_2) \]

is not a square. This happens for all but O(|A|) pairs $(a_1, a_2)$. Hence the bound is $O(|A|q + |A|^2\sqrt{q})$. Since $|A| > \sqrt{q}$, this is $O(|A|^2\sqrt{q})$ and the overall bound is $O(|A||B|^{1/2}q^{1/4})$.

We immediately deduce the following.

Corollary 3.3. There is an absolute constant c > 0 such that if $B \subset \mathbb{F}_q$ with $|B| \ge c\sqrt{q}$ then there is an element a ∈ B such that

\[ |\{b \in B : X_a(b) = 1\}| \ge \frac{1}{8}|B|. \]

Proof. Indeed, taking A = B in the preceding proposition,

\[ \max_{a \in B} \left\{ \sum_{b \in B} X_a(b) \right\} \ge \frac{1}{|B|} \sum_{a,b \in B} X_a(b) = \frac{|B|}{4} + O\left(q^{1/4}|B|^{1/2}\right) \ge \frac{|B|}{8} \]

when $|B| > c\sqrt{q}$ for some appropriately chosen c.

Proposition 3.4. Let $Q \in \mathbb{F}_q[X,Y]$ be a binary quadratic form and let $L \in \mathbb{F}_q[X,Y]$ be a binary linear form not dividing Q. Then if $\mathrm{disc}(Q) \neq 0$ we have $N_q(L,Q) \gg \log q$.

Proof. We will construct a clique in the graph Γ introduced above. First we claim that $|V| = \frac{q-1}{2} + O(1)$. Indeed

\[ \sum_{a \in \mathbb{F}_q^\times} \chi(Da^2 + 4ra) = \sum_{a \in \mathbb{F}_q^\times} \chi(a^{-2})\chi(Da^2 + 4ra) = \sum_{a \in \mathbb{F}_q^\times} \chi(D + 4ra^{-1}) = O(1) \]

by orthogonality. The final term is O(1) and the claim follows since χ takes on the values ±1 on $\mathbb{F}_q^\times$.

Now set $V_0 = V$ and assume q is large. Write $|V_0| = c'q > c\sqrt{q}$ (with c as in the preceding corollary and $c' \approx \frac{1}{2}$). For $a \in V_0$, let N(a) denote the neighbours of a (i.e. those b which are joined to a by an edge). Then there is an $a_1 \in V_0$ such that $|N(a_1)| \ge c'q/8$. Let $A_1 = \{a_1\}$, let $V_1 = N(a_1) \subset V_0$, and for $a \in V_1$ let $N_1(a) = N(a) \cap V_1$. By choice, all elements of $V_1$ are connected to $a_1$. Now $|V_1 \setminus A_1| \ge c'q/8 - 1 \ge c'q/16$ so, provided this is at least $c'q/16$, there is some element $a_2$ of $V_1 \setminus A_1$ such that $|N_1(a_2)| \ge |V_1 \setminus A_1|/8$. Let $A_2 = A_1 \cup \{a_2\}$, $V_2 = N_1(a_2) \subset V_1$ and define $N_2(a) = N(a) \cap V_2$.

Once again each element of $V_2$ is connected to each element of $A_2$. We repeat this process provided that at stage i there exists an element $a_{i+1} \in V_i \setminus A_i$ with $|N_i(a_{i+1})| \ge |V_i \setminus A_i|/8$. We set $A_{i+1} = A_i \cup \{a_{i+1}\}$ and observe that $A_{i+1}$ induces a clique. We may iterate provided $|V_i \setminus A_i| > c\sqrt{q}$, which is guaranteed for $i \ll \log q$. The final set $A_i$ (which has size i) will be the desired set A.
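The greedy construction can be imitated numerically. The sketch below is a variant rather than the exact procedure of the proof: at each stage it picks the candidate with the most neighbours among the remaining candidates, instead of any vertex meeting the 1/8 threshold, and it runs until no candidates remain. It assumes q = p is an odd prime; the function names and the sample values D = −4, r = 5 (which correspond to the hypothetical choice L = X + 2Y, Q = X² + Y²) are illustrative.

```python
def legendre(c, p):
    """Quadratic character of F_p (p an odd prime): returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1

def greedy_avoiding_set(D, r, p):
    """Greedily build A with chi(D a^2 + 4 r b) = -1 for all a, b in A."""
    def joined(a, b):  # the indicator X_a(b) from the text
        return (legendre(D * a * a + 4 * r * b, p) == -1
                and legendre(D * b * b + 4 * r * a, p) == -1)

    candidates = [a for a in range(p) if joined(a, a)]  # the vertex set V
    chosen = []
    while candidates:
        # Take the vertex with the most neighbours among current candidates.
        best = max(candidates, key=lambda a: sum(joined(a, b) for b in candidates))
        chosen.append(best)
        # Keep only vertices joined to everything chosen so far.
        candidates = [b for b in candidates if b != best and joined(best, b)]
    return chosen

p, D, r = 101, -4, 5
A = greedy_avoiding_set(D, r, p)
print(len(A), sorted(A))
```

By Corollary 3.1, no pair (x, y) then has both L(x, y) and Q(x, y) in the returned set, so its size plus one is a lower bound for $N_p(L,Q)$ in this example.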

The combination of this proposition and Proposition 3.2 completes the proof of Theorem 3.2.

3.5 Remarks for Composite Modulus

Consider the analogous question in the ring Z/NZ with N odd. Let $L(X,Y) = a_1X + a_2Y$ with $(a_1, N) = 1$ and $Q(X,Y) = b_1X^2 + b_2XY + b_3Y^2$. We then let A ⊂ Z/NZ and wish to find $(x, y) \in (\mathbb{Z}/N\mathbb{Z})^2$ such that L(x, y), Q(x, y) ∈ A. As before, this amounts to finding a solution to

\[ Q(a_1^{-1}(a - a_2Y), Y) = b \]

for some a, b ∈ A. In general, one cannot find a solution based on the size of A alone unless A is very large. Indeed, if p is a small prime dividing N and t mod p is chosen such that the discriminant of

\[ Q(a_1^{-1}(t - a_2Y), Y) - t \]

is a non-residue modulo p then taking A = {a mod N : a ≡ t mod p} provides a set of density 1/p which fails to admit a solution.

Chapter 4

Character sum estimates for Bohr sets and applications

4.1 Introduction

In Chapter 1 we discussed the Sum-Product Problem. This problem seeks to quantify the extent to which additive structure and multiplicative structure are independent phenomena. In this section we establish a result which is dual to this: if one can control additive characters on a given set, one cannot hope to control multiplicative characters on that set. Recall that in Section 2.5 we defined the Bohr set

\[ B(\Gamma, \varepsilon) = \{x \in \mathbb{F}_p : \max\{\|rx/p\| : r \in \Gamma\} \le \varepsilon\}. \]

Thus B(Γ, ε) is the set on which the exponentials with frequencies in Γ approximate the trivial character. We think of B(Γ, ε) as behaving like a kernel for this set of characters (there are of course no actual kernels since $\mathbb{F}_p$ has no non-trivial subgroups). Can such a set also behave like a kernel for a multiplicative character? In this chapter we show that the answer is no. We will prove that any non-trivial multiplicative character must oscillate on a Bohr set.
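For small primes the Bohr set can be enumerated directly from its definition. The following is a minimal sketch, assuming elements of $\mathbb{F}_p$ are represented by the integers 0, ..., p − 1 and ‖·‖ is the distance to the nearest integer; the function name `bohr_set` is illustrative.

```python
def bohr_set(gamma, eps, p):
    """Elements x of F_p with ||r x / p|| <= eps for every r in gamma."""
    def norm(m):
        m %= p
        return min(m, p - m) / p  # distance from m/p to the nearest integer
    return [x for x in range(p) if all(norm(r * x) <= eps for r in gamma)]

p = 101
# With a single frequency r = 1 the Bohr set is an interval around 0.
interval = bohr_set([1], 10 / p, p)
# With two frequencies it is the intersection of two dilated intervals.
B = bohr_set([3, 7], 0.35, p)
print(interval)
print(len(B), B)
```

In the first example the output is the set of residues of −10, ..., 10, which matches the interval picture: a Bohr set with one frequency is a dilate of an interval centred at 0.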

4.2 Statement of Results and Applications

4.2.1 Main Results

In Section 2.5 we discussed several facts concerning Bohr sets, and in particular we stressed that they possess a lot of additive structure. Indeed, elements x ∈ B(Γ, ε) dilate Γ into a short interval, and much of the additive structure of this interval carries over to B. We shall put this structure to use in order to obtain very strong estimates on large Bohr sets. In Section 4.3 we obtain the following analog of the Pólya-Vinogradov estimate which is non-trivial for large Bohr sets.

Theorem 4.1 (Pólya-Vinogradov for Bohr sets). Let B = B(Γ, ε) be a Bohr set with |Γ| = d. Then for any non-trivial multiplicative character χ

\[ \sum_{x \in B} \chi(x) \ll_d \sqrt{p}(\log p)^d. \]

This result is comparable to [Sh] in which a Pólya-Vinogradov estimate is established for generalized arithmetic progressions of rank d, that is, sets of the form

A = {a0 + x1a1 + ··· + xdad : 1 ≤ xi ≤ Ni}

for some integers $N_i$. For non-trivial estimates when the Bohr set is on the order of $\sqrt{p}$ or smaller, we appeal to Burgess’ method. We are able to prove non-trivial results provided the Bohr set satisfies a certain niceness condition known as regularity, see Definition 2.5.

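Theorem 4.2 (Burgess for Bohr sets). Let B = B(Γ, ε) be a regular Bohr set with |Γ| = d. Let k ≥ 1 be an integer and let χ be a non-trivial multiplicative character. When $|B| \ge \sqrt{p}$ we have the estimate

\[ \sum_{x \in B} \chi(x) \ll_{k,d} |B| \cdot p^{5d/16k^2 + o(1)} \left( \frac{|B|}{\varepsilon^d p} \right)^{5/16k} \left( \frac{p}{|B|} \right)^{-1/8k}. \]

When $|B| < \sqrt{p}$ we have the estimate

\[ \sum_{x \in B} \chi(x) \ll_{k,d} |B| \cdot p^{5d/16k^2 + o(1)} \left( \frac{|B|}{\varepsilon^d p} \right)^{5/16k} \left( \frac{|B|^5}{p^2} \right)^{-1/8k}. \]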

The statement appears complicated, but usually one has $|B| \approx \varepsilon^d p$ so the middle factor in the estimate is harmless. If the rank d is bounded, one can take k much larger than d and obtain a non-trivial estimate in the range $|B| \gg p^{2/5 + \delta}$ for some positive δ. This is comparable to character sum estimates of M.-C. Chang for generalized arithmetic progressions of similar rank, proved in [C1]. As in her proof, we make use of Sum-Product Phenomena in $\mathbb{F}_p$.

4.2.2 Applications

Recall Dirichlet’s approximation theorem states that for real numbers $\alpha_1, \ldots, \alpha_d$ there is an integer n ≤ Q so that $\max_k \{\|n\alpha_k\|\} \le Q^{-1/d}$. Schmidt proved in [Sch] that, at the cost of weakening the approximation, we can take n to be a perfect square. Specifically, he proved the following.

Theorem (Schmidt). Given real numbers α1, . . . , αd and Q a positive integer, there is an integer 1 ≤ n ≤ Q and a positive absolute constant c such that

\[ \max\{\|n^2\alpha_k\| : 1 \le k \le d\} \ll dQ^{-c/d^2}. \]

This result was also proved by Green and Tao in [GT] and extended to different systems of polynomials in [LM]. An elementary proof of a slightly weaker estimate was also given in [CLR].

When Γ is a subset of $\mathbb{F}_p$ and ε > 0 then the elements of B(Γ, ε) are precisely the elements guaranteed by Dirichlet’s approximation theorem. Here we are replacing approximation in the continuous torus R/Z with approximation in the discrete torus $\mathbb{F}_p$. We will prove the following $\mathbb{F}_p$ analog of Schmidt’s theorem.

Theorem 4.3 (Recurrence of k’th powers). Let Γ be a set of d integers, let p be a prime and let k be a positive integer. There is an integer x ≤ p for which

\[ \max_{r \in \Gamma} \left\| x^k \frac{r}{p} \right\| \ll_d p^{-1/2d} \log p \cdot k^{1/d}. \]

In a similar fashion, we can prove a result about recurrence of primitive roots.

Theorem 4.4 (Recurrence of primitive roots). Let Γ be a set of d integers and let p be a prime. There is an integer 1 < x < p which generates $\mathbb{F}_p^\times$ and such that

\[ \max_{r \in \Gamma} \left\| x \frac{r}{p} \right\| \ll_d \frac{p^{1/2d} \log p}{\phi(p-1)^{1/d}}. \]

4.3 The Pólya-Vinogradov Argument

The Pólya-Vinogradov argument is an effective way of obtaining good character sum estimates over sets whose Fourier transform has a small $L^1$ norm. Indeed, suppose $A \subset \mathbb{F}_p$, then by Parseval’s identity and the Gauss sum estimate we have

\[ \left| \sum_{a \in A} \chi(a) \right| = \left| \frac{1}{p} \sum_{x \in \mathbb{F}_p} \widehat{1_A}(x)\tau(\chi, -x) \right| \le \sqrt{p}\,\|\widehat{1_A}\|_1. \]

One can get a fairly strong estimate on this $L^1$ norm for Bohr sets. We do so now and establish Theorem 4.1.

Proof of Theorem 4.1. Write Γ = {r1, . . . , rd} and r = (r1, . . . , rd). Since x ∈ B if and only if

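rx ∈ [−εp, εp] = I for each r ∈ Γ, we have

\[ \begin{aligned} \widehat{1_B}(y) &= \sum_{x \in B} e_p(-yx) \\ &= \sum_{x \in \mathbb{F}_p} \prod_{k=1}^{d} 1_I(xr_k)\, e_p(-yx) \\ &= \frac{1}{p^d} \sum_{x \in \mathbb{F}_p} \prod_{k=1}^{d} \sum_{v_k \in \mathbb{F}_p} \widehat{1_I}(v_k)\, e_p(v_k r_k x)\, e_p(-yx) \\ &= \frac{1}{p^d} \sum_{v \in \mathbb{F}_p^d} \widehat{1_{I^d}}(v) \sum_{x \in \mathbb{F}_p} e_p(x(v \cdot r - y)) \\ &= \frac{1}{p^{d-1}} \sum_{\substack{v \in \mathbb{F}_p^d \\ v \cdot r = y}} \widehat{1_{I^d}}(v). \end{aligned} \]

Here we have set $I^d = I \times \cdots \times I$ and

\[ \widehat{1_{I^d}}((v_1, \ldots, v_d)) = \widehat{1_I}(v_1) \cdots \widehat{1_I}(v_d). \]

Plugging this in, we obtain

\[ \|\widehat{1_B}\|_1 \le \frac{1}{p^d} \sum_{y \in \mathbb{F}_p} \sum_{\substack{v \in \mathbb{F}_p^d \\ v \cdot r = y}} |\widehat{1_{I^d}}(v)| = \frac{1}{p^d} \sum_{v \in \mathbb{F}_p^d} |\widehat{1_{I^d}}(v)| = \|\widehat{1_I}\|_1^d. \]

As in the classical proof of the Pólya-Vinogradov inequality, writing I = [−N, N] with N = ⌊εp⌋, for $v \neq 0$ we have

\[ |\widehat{1_I}(v)| = \left| \sum_{k=-N}^{N} e_p(-kv) \right| = \left| \sum_{k=0}^{2N} e_p(-kv) \right| = \left| \frac{e_p(v(2N+1)) - 1}{e_p(v) - 1} \right| \ll \frac{p}{|v|}, \]

where v is represented by an integer with 0 < |v| < p/2. It follows that $\|\widehat{1_I}\|_1 \ll \log p$ and the theorem is proved.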

Remark. If one takes Γ = {1} and ε = N/p for some positive integer N then B(Γ, ε) = [−N, N], thought of as a subset of $\mathbb{F}_p$. This recovers the classical Pólya-Vinogradov estimate

\[ \sum_{|n| \le N} \chi(n) \ll \sqrt{p} \log p. \]
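The two ingredients of the argument, a logarithmic bound on the normalised $L^1$ norm of the Fourier transform of an interval and the resulting character sum bound, can be observed numerically. The sketch below assumes the normalisation $\|f\|_1 = \frac{1}{p}\sum_v |f(v)|$ used in the proof above and takes χ to be the quadratic character; the function names and the values p = 101, N = 10 are illustrative.

```python
import cmath
import math

def legendre(c, p):
    """Quadratic character of F_p (p an odd prime): returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1

def fourier_l1_norm_of_interval(N, p):
    """(1/p) * sum over v in F_p of |sum_{|k| <= N} e_p(-k v)|."""
    total = 0.0
    for v in range(p):
        coeff = sum(cmath.exp(-2j * math.pi * k * v / p) for k in range(-N, N + 1))
        total += abs(coeff)
    return total / p

p, N = 101, 10
norm = fourier_l1_norm_of_interval(N, p)
char_sum = sum(legendre(n, p) for n in range(-N, N + 1))
print(round(norm, 3), abs(char_sum), round(math.sqrt(p) * norm, 3))
```

The first number stays of size log p as N varies, while the second is bounded by the third, which is the inequality displayed at the start of this section specialised to an interval.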

4.4 The Burgess Argument

In this section we prove Theorem 4.2. The method is the same as in the proof of Burgess’ estimate for character sums over an interval, which can be found in Chapter 12 of [IK]. The main difference lies in estimating the multiplicative energy between two Bohr sets and for this we use Rudnev’s Sum-Product result quoted in Section 2.4. Sum-Product estimates were used for the same purpose in [C1] with methods taken from [KS]. It is likely that the argument presented here is not efficient. Indeed, Bohr sets are highly structured and the current Sum-Product estimates are weaker than expected, and certainly weaker than what is predicted by the Erdős-Szemerédi Conjecture. For example, one of the energy estimates proved in [C1] was improved in [K] using the geometry of numbers. We were unable to adapt that argument to the present situation.

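Proof of Theorem 4.2. Suppose $\Gamma \subset \mathbb{F}_p$ has size d and ε is a regular value for Γ. We may as well assume that Γ ≠ {0} for otherwise $B = \mathbb{F}_p$ and the result is trivial. Write B = B(Γ, ε) and let χ be a non-trivial character of $\mathbb{F}_p^\times$. Then we wish to estimate

\[ S(\chi) = \sum_{x \in B} \chi(x). \]

We begin by first using Corollary 2.2. Let $\eta = p^{-1/k}\varepsilon/(200d)$ and let y ∈ B(Γ, η). For any natural number $n \le p^{1/2k}$ we have

\[ \begin{aligned} S(\chi) &= \sum_{x \in \mathbb{F}_p} 1_B(x)\chi(x) \\ &= \sum_{x \in \mathbb{F}_p} 1_B(x + ny)\chi(x) + O\left(n|B|p^{-1/k}\right) \\ &= \sum_{x \in B} \chi(x - ny) + O\left(n|B|p^{-1/k}\right). \end{aligned} \]

Averaging this over all values $1 \le n \le p^{1/2k}$ and over all values $y \in B' = B(\Gamma, \eta) \setminus \{0\}$ we obtain

\[ S(\chi) = \frac{1}{p^{1/2k}|B'|} \sum_{x \in B} \sum_{y \in B'} \sum_{1 \le n \le p^{1/2k}} \chi(x - ny) + O\left(|B|p^{-1/2k}\right). \]

It remains to estimate

\[ T(\chi) = \frac{1}{p^{1/2k}|B'|} \sum_{x \in B} \sum_{y \in B'} \sum_{1 \le n \le p^{1/2k}} \chi(x - ny). \]

We begin by assuming that $|B| < \sqrt{p}$. Then, applying Lemma 2.13 (where r(x) is now the number of ways of writing x as ab with a ∈ B and $b \in (B')^{-1}$), we have

\[ \begin{aligned} |T(\chi)| &\ll \frac{1}{p^{1/2k}|B'|} \sum_{x \in \mathbb{F}_p} r(x) \left| \sum_{1 \le n \le p^{1/2k}} \chi(x - n) \right| \\ &\le \frac{(|B||B'|)^{1-1/k} E_\times(B,B)^{1/4k} E_\times(B',B')^{1/4k}}{p^{1/2k}|B'|} \cdot \left( 2kp^{3/2} + (2k)^k p^{3/2} \right)^{1/2k} \\ &\le |B|(|B||B'|)^{-3/4k}(|B+B||B'+B'|)^{7/16k}(\log p)^{1/2k}\sqrt{k}\,p^{1/4k} \end{aligned} \]

after applying Theorem 2.6. Applying Corollary 2.1, we get the bound

\[ |T(\chi)| \ll |B|(|B||B'|)^{-5/16k}\,4^{7d/8k}(\log p)^{1/2k}\sqrt{k}\,p^{1/4k}. \]

Using Lemma 2.8,

\[ |B'| \ge \eta^d p = \left( \frac{\varepsilon}{p^{1/k}200d} \right)^d p \]

so that

\[ |T(\chi)| \ll_{d,k} |B| \cdot p^{5d/16k^2 + o(1)} \left( \frac{|B|}{\varepsilon^d p} \right)^{5/16k} \left( \frac{|B|^{5/2}}{p} \right)^{-1/4k}. \]

Now if $|B| \ge \sqrt{p}$, first split B into disjoint sets $B_i$ with $\sqrt{p} \ll |B_i| < \sqrt{p}$. Then

\[ |T(\chi)| \ll \frac{|B|}{\sqrt{p}} \cdot \max_i \frac{1}{p^{1/2k}|B'|} \left| \sum_{x \in B_i} \sum_{y \in B'} \sum_{1 \le n \le p^{1/2k}} \chi(x - ny) \right|. \]

Proceeding as before, this time bounding $|B_i| < \sqrt{p}$ and $|B_i + B_i| \le |B + B|$, we obtain

\[ \begin{aligned} |T(\chi)| &\ll |B|(\sqrt{p}|B'|)^{-3/4k}(|B+B||B'+B'|)^{7/16k}(\log p)^{1/2k}\sqrt{k}\,p^{1/4k} \\ &= \left( \frac{|B|}{\sqrt{p}} \right)^{3/4k} \left( |B|(|B||B'|)^{-3/4k}(|B+B||B'+B'|)^{7/16k} \right) \cdot \left( (\log p)^{1/2k}\sqrt{k}\,p^{1/4k} \right) \\ &\ll_{d,k} |B| \cdot p^{5d/16k^2 + o(1)} \left( \frac{|B|}{\varepsilon^d p} \right)^{5/16k} \left( \frac{p}{|B|} \right)^{-1/8k}. \end{aligned} \]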

It is worth remarking that the Burgess estimate just proved gives a genuine improvement over the Pólya-Vinogradov estimate in some cases. To see this, we need a Bohr set whose size is $|B| \approx \varepsilon^d p \approx p^\gamma$ with 2/5 < γ < 1/2. To find such a set, we need only note that the bound in Lemma 2.8 is sharp on average. Averaging over all subsets of $\mathbb{F}_p$ of size d we have (where I is the interval [−εp, εp])

\[ \begin{aligned} \frac{1}{\binom{p}{d}} \sum_{|A|=d} |B(A, \varepsilon)| &= \frac{1}{\binom{p}{d}} \sum_{|A|=d} \sum_{x \in \mathbb{F}_p} \prod_{a \in A} 1_I(ax) \\ &= \frac{1}{\binom{p}{d}} \sum_{|A|=d} \sum_{x \in \mathbb{F}_p^\times} \prod_{a \in A} 1_{x^{-1}I}(a) + O(1) \\ &= \frac{1}{\binom{p}{d}} \sum_{x \in \mathbb{F}_p^\times} \sum_{|A|=d} \prod_{a \in A} 1_{x^{-1}I}(a) + O(1). \end{aligned} \]

The summand vanishes unless $A \subset x^{-1}I$, so the inner sum contributes $\binom{|I|}{d}$. Thus the total sum is roughly $\binom{|I|}{d}\binom{p}{d}^{-1} p \gg \varepsilon^d p$. It follows that for the typical choice of A of size d and appropriate choice of ε, which we can take to be regular by Lemma 2.9, we find a regular Bohr set with size in the desired range.

4.5 Application to Polynomial Recurrence

We are now going to prove Theorem 4.3 and Theorem 4.4. Their proofs will follow the standard method of counting using characters, which we mentioned in Section 2.2. First we prove an analog of Schmidt’s theorem for squares. This proof is quite simple and does not need character sums, but it will give a good idea of what to aim for when we move to higher powers.

Let $\Gamma \subset \mathbb{F}_p$ be a set of size d and let ε > 0 be a parameter. Then B = B(Γ, ε) contains a non-zero square provided $\varepsilon^{2d}p > 1$. To see this, observe that Bohr sets have the dilation property $xB = B(x^{-1}\Gamma, \varepsilon)$, which follows immediately from the definition of a Bohr set. If the non-zero elements of B are all non-squares, then for any non-square element x, xB(Γ, ε) ∩ B(Γ, ε) = {0}. But this intersection contains $B(\Gamma \cup x^{-1}\Gamma, \varepsilon)$ which has size at least $\varepsilon^{2d}p$ by Lemma 2.8 yielding a contradiction. It follows that there is a non-zero integer 1 ≤ a < p such that

\[ \max_{r \in \Gamma} \left\| a^2 \frac{r}{p} \right\| \ll p^{-1/2d}. \]
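Both facts used in this argument, the dilation property and the existence of a non-zero square once $\varepsilon^{2d}p > 1$, are easy to observe for a small prime. The sketch below is an illustration under assumed parameters (p = 101, Γ = {40, 47}, ε = 0.35, chosen so that 1 itself is not in the Bohr set); the function names are hypothetical.

```python
def bohr_set(gamma, eps, p):
    """Elements x of F_p with ||r x / p|| <= eps for every r in gamma."""
    def norm(m):
        m %= p
        return min(m, p - m) / p
    return {x for x in range(p) if all(norm(r * x) <= eps for r in gamma)}

def nonzero_squares(p):
    """Set of non-zero squares in F_p."""
    return {x * x % p for x in range(1, p)}

p, gamma, eps = 101, [40, 47], 0.35
B = bohr_set(gamma, eps, p)

# Dilation property: x * B(gamma, eps) = B(x^{-1} gamma, eps).
x = 5
x_inv = pow(x, -1, p)  # modular inverse (Python 3.8+)
dilated = {x * b % p for b in B}
B_dilated_frequencies = bohr_set([r * x_inv % p for r in gamma], eps, p)

# Since eps^(2d) * p > 1 here, a non-zero square must lie in B.
squares_in_B = sorted(B & nonzero_squares(p))
print(dilated == B_dilated_frequencies, squares_in_B)
```

The threshold $\varepsilon^{2d}p > 1$ is only sufficient; for most choices of Γ a non-zero square appears in the Bohr set for much smaller ε.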

The above argument does not immediately generalize to higher powers because there is no dichotomy: an element can be in any of the k cosets of the set of k’th powers. Instead, we will use Theorem 4.1 to find higher powers and primitive roots in Bohr sets.

Proof of Theorem 4.3. Write B for B(Γ, ε). Observe that when (k, p − 1) = l then the k’th powers are the same as the l’th powers. So we suppose k|(p − 1) and K is the subgroup of $\mathbb{F}_p^\times$ consisting of the k’th powers. This group has index k. The problem is then showing that B(Γ, ε) ∩ K is non-empty. Let $K^\perp$ be the group of multiplicative characters which restrict to the trivial character on K. This group has size $|K^\perp| = k$. The Poisson Summation Formula, Proposition 2.2, states that

\[ 1_K(x) = \frac{1}{k} \sum_{\chi \in K^\perp} \chi(x). \]

Thus,

\[ |K \cap B| = \frac{1}{k} \sum_{\chi \in K^\perp} \sum_{b \in B} \chi(b). \]

After extracting the contribution from the trivial character $\chi_0$ we have

\[ \left| |K \cap B| - \frac{|B|}{k} \right| \le \max_\chi |S(\chi)| \]

where $S(\chi) = \sum_{b \in B} \chi(b)$ and the maximum is taken over all non-trivial characters $\chi \in K^\perp$. Thus if we can show that the maximum value of |S(χ)| is less than $\frac{|B|}{k}$ then B must contain an element of K. By Theorem 4.1, B contains a k’th power provided $|B| \gg_d kp^{1/2}(\log p)^d$, which is certainly the case when $\varepsilon^d \gg_d kp^{-1/2}(\log p)^d$ in view of Lemma 2.8. Thus

\[ \max_{r \in \Gamma} \left\| x^k \frac{r}{p} \right\| \ll_d p^{-1/2d} \log p \cdot k^{1/d}. \]

We now turn to primitive roots.

Proof of Theorem 4.4. We can also find primitive roots in a Bohr set. Recall that the group $\mathbb{F}_p^\times$ is cyclic and a primitive element of $\mathbb{F}_p$ is a generator of this group. Denote the primitive roots of $\mathbb{F}_p$ by P. The characteristic function of P has a nice expansion in terms of characters, due to Vinogradov, see exercise 5.14 of [LN]:

\[ 1_P(x) = \frac{\phi(p-1)}{p-1} \sum_{d \mid (p-1)} \frac{\mu(d)}{\phi(d)} \sum_{\chi_d} \chi(x) \]

where φ is Euler’s totient function and $\sum_{\chi_d}$ is the sum over all characters with order exactly d. Summing over the elements of a Bohr set B and extracting the contribution from the trivial character, we obtain

\[ \left| |B \cap P| - \frac{\phi(p-1)}{p-1}|B| \right| \ll_d \sqrt{p}(\log p)^d. \]

We deduce that B will contain a primitive root whenever $\varepsilon \gg \frac{p^{1/2d}}{\phi(p-1)^{1/d}} \cdot \log p$. Thus there is a primitive root 1 < x < p with

\[ \max_{r \in \Gamma} \left\| x \frac{r}{p} \right\| \ll_d \frac{p^{1/2d} \log p}{\phi(p-1)^{1/d}}. \]

We close by mentioning that use of Theorem 4.2 would allow for smaller choices of ε but for the power of $|B|/(\varepsilon^d p)$ appearing in the estimate. As we mentioned in the preceding section, this factor is usually harmless, but we wanted uniform results for all sets Γ, which come more easily by way of Theorem 4.1.

Chapter 5

Character sum estimates for various convolutions

5.1 Introduction

This chapter is motivated by the following question of Sárközy:

Problem. Are the quadratic residues modulo p a sumset? That is, do there exist sets A, B ⊂ Fp, each of size at least two with the set A + B equal to the set of quadratic residues?

The general consensus is that the answer to the above question is no. Indeed, if B contains two elements b, b′ then we would require that A + b and A + b′ are both subsets of the quadratic residues. But we expect that a + b is a quadratic residue half of the time, and we expect that a + b′ also be a residue half of the time, independent of whether or not a + b is a quadratic residue. So if A + B consisted entirely of quadratic residues then many unlikely events must have occurred. For A + B to consist of all the quadratic residues would be shocking. In [Shk2], Shkredov showed that one could not take A = B. By estimating certain character sums, Sárközy [Sar] and later Shparlinski [Shp] were able to prove that:

Theorem. If $A, B \subset \mathbb{F}_p$, each of size at least two with the set A + B equal to the set of quadratic residues then |A| and |B| are within a constant factor of $\sqrt{p}$.

This provides a bit of tension since $|A + B| \le |A||B| \ll p$ but on the other hand A + B must contain at least (p − 1)/2 elements. A tempting way to approach the question is to understand the sum

\[ \sum_{a \in A} \sum_{b \in B} \left( \frac{a + b}{p} \right). \]

Using the Cauchy-Schwarz inequality and orthogonality it is not too hard to show that this sum is at most $\sqrt{p|A||B|}$, which is non-trivial for |A||B| > p. This argument was used in the proof of Lemma 3.2. This estimate just fails to answer our question. Thus improving upon it even by a constant factor (1/4 would suffice, for instance) would be worthwhile. Unfortunately, the proof of this estimate is not sensitive to the fact that $\mathbb{F}_p$ is a prime field; this is the square-root barrier that was discussed at the end of Section 2.6. And in general finite fields, the presence of subfields makes this estimate sharp. In this chapter we show that if one is willing to accept a sum which is made smoother by introducing more variables, then we can leverage the structure of $\mathbb{F}_p$ and obtain non-trivial estimates past the square-root barrier. Suppose χ is a non-trivial multiplicative character of the finite field $\mathbb{F}_p$ with p prime. Then given subsets $A, B \subset \mathbb{F}_p$ we wish to estimate sums of the form

\[ S_\chi(A, B) = \sum_{a \in A} \sum_{b \in B} \chi(a + b). \]

As was mentioned earlier, there is a simple estimate that comes from the Cauchy-Schwarz inequality. We have already seen what is essentially the same result in Chapter 3, Lemma 3.2.

Lemma 5.1. Given subsets A, B ⊂ Fp and a non-trivial character χ we have

\[ |S_\chi(A, B)| \le \sqrt{p|A||B|}. \]

Proof. We have

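\[ |S_\chi(A, B)|^2 \le \left( \sum_{a \in A} \left| \sum_{b \in B} \chi(a + b) \right| \right)^2 \le |A| \left( \sum_{x \in \mathbb{F}_p} \left| \sum_{b \in B} \chi(x + b) \right|^2 \right). \]

Expanding the second factor, we get

\[ \sum_{b_1,b_2 \in B} \sum_{-b_2 \neq x \in \mathbb{F}_p} \chi(x + b_1)\overline{\chi}(x + b_2) = \sum_{b_1,b_2 \in B} \sum_{-b_2 \neq x \in \mathbb{F}_p} \chi\left( \frac{x + b_1}{x + b_2} \right). \]

The inner sum over x is −1 when $b_1 \neq b_2$, so we are left with the contribution when $b_1 = b_2$ which is at most p. Thus the double sum is at most |B|p and the lemma follows.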
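The key step, that the inner sum over x equals −1 off the diagonal, can be verified exhaustively for the quadratic character of a small prime. The sketch below assumes χ is the Legendre symbol, for which $\overline{\chi} = \chi$ and $\chi(y^{-1}) = \chi(y)$, so the quotient may be replaced by a product; the function names and the prime p = 13 are illustrative.

```python
def legendre(c, p):
    """Quadratic character of F_p (p an odd prime): returns 1, -1 or 0."""
    c %= p
    if c == 0:
        return 0
    return 1 if pow(c, (p - 1) // 2, p) == 1 else -1

def shifted_correlation(b1, b2, p):
    """Sum over x in F_p of chi(x + b1) * chi(x + b2) for the quadratic chi."""
    return sum(legendre(x + b1, p) * legendre(x + b2, p) for x in range(p))

p = 13
off_diagonal = {shifted_correlation(b1, b2, p)
                for b1 in range(p) for b2 in range(p) if b1 != b2}
diagonal = {shifted_correlation(b, b, p) for b in range(p)}
print(off_diagonal, diagonal)
```

Only the |B| diagonal pairs contribute a term of size about p, which is where the factor |B|p in the proof comes from.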

This bound is non-trivial provided |A||B| > p and improving it for smaller sets is a difficult open problem. With various further assumptions on the sets A and B, Friedlander and Iwaniec improved the range in which one can obtain non-trivial estimates, see [FI]. However, the additional constraints on the sets A and B in their work are quite rigid. These constraints were weakened by Mei-Chu Chang in [C1], allowing us to estimate sums $S_\chi(A, B)$ when |A + A| is very small.

Theorem (Chang). Suppose $A, B \subset \mathbb{F}_p$ with $|A|, |B| \ge p^\alpha$ for some $\alpha > \frac{4}{9}$ and such that |A + A| ≤ K|A|. Then there is a constant τ = τ(K, α) such that for p sufficiently large and any non-trivial character χ, we have

\[ |S_\chi(A, B)| \le |A||B|p^{-\tau}. \]

We remark that in light of Freiman’s Theorem, which we will discuss a bit later, the condition that |A + A| has to be so small is still very limiting. In this chapter we aim to establish non-trivial bounds for sums with more variables. These results are different from those mentioned above since they hold for all sets which are sufficiently large; there are no further assumptions made about their structure. There is precedent for such results: by interchanging the roles of addition and multiplication one can also prove that

\[ \left| \sum_{a \in A} \sum_{b \in B} e_p(xab) \right| \le \sqrt{p|A||B|}. \]

This inequality is non-trivial in the range |A||B| > p but is in fact nearly sharp, even in prime fields, since one can take $A = B = \{1, \ldots, p^{1/2 - \varepsilon}\}$ and x = 1 and see very little cancellation. However, Bourgain [Bou3] proved that with more variables one can extend the range in which the estimate is non-trivial.

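Theorem. There is a constant C such that the following holds. Suppose δ > 0 and $k \ge C\delta^{-1}$, then for $A_1, \ldots, A_k \subset \mathbb{F}_p$ with $|A_i| \ge p^\delta$ and $x \in \mathbb{F}_p^\times$, we have

\[ \left| \sum_{a_1 \in A_1} \cdots \sum_{a_k \in A_k} e_p(xa_1 \cdots a_k) \right| < |A_1| \cdots |A_k|\,p^{-\tau} \]

where $\tau > C^{-k}$.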

We cannot prove results of this strength. The reason is that non-trivial additive characters are parameterized by elements of $\mathbb{F}_p^\times$ acting on $\mathbb{F}_p$. Thus there is tension between the inherently additive structure of the frequencies for which the sum is large and the multiplicative nature of the variables of summation. We can then utilize Sum-Product estimates to exploit this tension and conclude something about how large the sum can be. This property of additive characters (i.e. that they have frequencies) is simply not present in the case of multiplicative characters, and we must rely on Burgess’ method instead.

5.2 Statement of Results

In Section 5.3 we investigate the trivariate analog of Lemma 5.1. We consider the problem of estimating the sum

\[ S_\chi(A, B, C) = \sum_{a \in A} \sum_{b \in B} \sum_{c \in C} \chi(a + b + c). \]

Using Chang’s estimate, we are able to give bounds for trivariate sums which are non-trivial just past the square-root barrier:

Theorem 5.1. Given subsets $A, B, C \subset \mathbb{F}_p$ each of size $|A|, |B|, |C| \ge \delta\sqrt{p}$, for some δ > 0, and a non-trivial character χ, then we have

\[ |S_\chi(A, B, C)| = o_\delta(|A||B||C|). \]

Typically, in the estimation of character sums one really seeks a power saving. In Theorem 5.1 we would prefer a bound of the form $|S_\chi(A, B, C)| \le |A||B||C|p^{-\tau}$ for some positive τ. However, our estimate relies on Chang’s Theorem which only allows one to estimate $S_\chi(A, B)$ past the square-root barrier under the hypothesis that |A + A| ≤ K|A| for some constant K. This hypothesis plays a crucial part in the proof of her bound because it allows one to appeal to Freiman’s Classification Theorem:

Theorem (Freiman’s Theorem). Suppose A is a finite set of integers such that |A + A| ≤ K|A|. Then there is a generalized arithmetic progression P containing A and such that P is of dimension at most K and $\log(|P|/|A|) \ll K^c$ for some absolute constant c.

Using this classification theorem, one can make a change of variables $a \mapsto a + bc$, which is the first step in a Burgess type argument. To see this in action, see the proof of Theorem 4.2 in Chapter 4. Freiman’s Theorem is simply unable to accommodate the situation $|A + A| \le |A|^{1+\delta}$, even for small values of $\delta > 0$. This creates a barrier which prevents us from extending Chang’s estimates for $S_\chi(A, B)$ to such sets $A$. However, this is the sort of estimate we would need in order to get a power saving in our bound for three variable sums. To circumvent the use of Freiman’s Theorem, in Section 5.4 we replace triple sums with sums of four variables. This time, by incorporating both additive and multiplicative convolutions we arrive at sums of the form

\[ H_\chi(A, B, C, D) = \sum_{a \in A} \sum_{b \in B} \sum_{c \in C} \sum_{d \in D} \chi(a + b + cd). \]

In this way we have essentially forced a scenario where we can make use of the Burgess argument. While such sums may seem contrived, we believe they are worth studying. Indeed, even for these sums it is only by using very recent ideas from additive combinatorics that we are able to obtain estimates beyond the square-root barrier. By introducing both arithmetic operations, we are able to weigh the additive structure in one of the variables against the multiplicative structure of that variable in order to use a Sum-Product estimate. We are able to prove the following theorem.

Theorem 5.2. Suppose $A, B, C, D \subset \mathbb{F}_p$ are sets with $|A|, |B|, |C|, |D| > p^{\delta}$, $|C| < \sqrt{p}$ and $|D|^{4}|A|^{56}|B|^{28}|C|^{33} \ge p^{60+\varepsilon}$ for some $\delta, \varepsilon > 0$. There is a constant $\tau > 0$ depending only on $\delta$ and $\varepsilon$ such that

\[ |H_\chi(A, B, C, D)| \ll |A||B||C||D|\, p^{-\tau}. \]

In the case that $|A|, |B|, |D| > p^{\delta}$, $|C| \ge \sqrt{p}$ and $|D|^{8}|A|^{112}|B|^{56} \ge p^{87+\varepsilon}$, there is a constant $\tau > 0$ depending only on $\delta$ and $\varepsilon$ such that

\[ |H_\chi(A, B, C, D)| \ll |A||B||C||D|\, p^{-\tau}. \]

5.3 Trivariate sums

We begin this section by giving a simple estimate which is non-trivial past the square-root barrier provided we can control certain additive energy.

Corollary 5.1. Given subsets $A, B, C \subset \mathbb{F}_p$ and a non-trivial character $\chi$ we have

\[ |S_\chi(A, B, C)| \le \sqrt{p}\,(|A||B||C|)^{2/3}. \]

Proof. By the above lemma, we have

\[ |S_\chi(A, B, C)| \le \sum_{a \in A} |S_\chi(a + B, C)| \le |A| \sqrt{p|B||C|}. \]

Interchanging the roles of A, B, C and taking geometric averages gives the result.
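As a numerical sanity check of Corollary 5.1, the following minimal sketch evaluates $S_\chi(A, B, C)$ directly for the quadratic character and compares it with the bound $\sqrt{p}\,(|A||B||C|)^{2/3}$. The prime $p = 101$, the three sets, and the helper names `legendre` and `trivariate_sum` are illustrative assumptions and do not come from the text.

```python
from itertools import product
from math import sqrt


def legendre(x: int, p: int) -> int:
    """Quadratic character of F_p for an odd prime p, with chi(0) = 0."""
    x %= p
    if x == 0:
        return 0
    # Euler's criterion: x^((p-1)/2) is 1 for squares and p-1 otherwise.
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def trivariate_sum(A, B, C, p: int) -> int:
    """S_chi(A, B, C): sum of chi(a + b + c) over all triples."""
    return sum(legendre(a + b + c, p) for a, b, c in product(A, B, C))


# Illustrative data (assumptions, not taken from the text).
p = 101
A = list(range(0, 30))                           # an interval
B = list(range(0, 60, 2))                        # an arithmetic progression
C = sorted({(x * x) % p for x in range(1, 31)})  # thirty nonzero squares

S = trivariate_sum(A, B, C, p)
trivial = len(A) * len(B) * len(C)
bound = sqrt(p) * trivial ** (2 / 3)
print(f"|S| = {abs(S)}, Corollary 5.1 bound = {bound:.1f}, trivial = {trivial}")
```

Here $|A||B||C| = 27000$ exceeds $p^{3/2}$, so the bound of the corollary is smaller than the trivial one, which is exactly the regime in which the estimate carries information.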

Once again this bound is only non-trivial when $|A||B||C| \ge p^{3/2}$, and so says nothing for sets $A, B, C$ of size $\sqrt{p}$. This is yet another instance of the square-root barrier: this estimate is also sharp in the presence of subfields. However, using Sum-Product theory (which is not valid for $\mathbb{F}_{p^2}$ without modification) we may be able to improve upon the estimate. First we show that if there are not too many additive relations among the sets, then we obtain something non-trivial.

Lemma 5.2. Given subsets $A, B, C \subset \mathbb{F}_p$ and a non-trivial character $\chi$ we have

\[ |S_\chi(A, B, C)| \le \sqrt{p|A|E_+(B, C)}. \]

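Proof. Let $r(x)$ be the number of ways in which $x \in \mathbb{F}_p$ is a sum $x = b + c$ with $b \in B$ and $c \in C$. Then, by the Cauchy-Schwarz inequality,

\[ |S_\chi(A, B, C)| \le \sum_{x \in \mathbb{F}_p} r(x) \Big| \sum_{a \in A} \chi(a + x) \Big| \le \Big( \sum_{x \in \mathbb{F}_p} r(x)^2 \Big)^{1/2} \Big( \sum_{x \in \mathbb{F}_p} \Big| \sum_{a \in A} \chi(a + x) \Big|^2 \Big)^{1/2}. \]

It is straightforward to check that the first factor above is $\sqrt{E_+(B, C)}$ and, as before, the second factor is at most $\sqrt{p|A|}$.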
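The quantities in this proof are easy to compute for small primes. The sketch below, under the assumption that $\chi$ is the quadratic character and with arbitrarily chosen sets, builds the representation function $r(x)$, computes $E_+(B, C) = \sum_x r(x)^2$, and compares $|S_\chi(A, B, C)|$ with $\sqrt{p|A|E_+(B, C)}$. The names `additive_energy` and `lemma_5_2` and all data are hypothetical.

```python
import random
from collections import Counter
from math import sqrt


def legendre(x: int, p: int) -> int:
    """Quadratic character of F_p for an odd prime p, with chi(0) = 0."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def additive_energy(B, C, p: int):
    """Return (E_+(B, C), r) where r[x] = #{(b, c) : b + c = x in F_p}."""
    r = Counter((b + c) % p for b in B for c in C)
    return sum(v * v for v in r.values()), r


def lemma_5_2(A, B, C, p: int):
    """Return (|S_chi(A, B, C)|, sqrt(p |A| E_+(B, C)), E_+(B, C))."""
    energy, r = additive_energy(B, C, p)
    # S_chi(A, B, C) = sum over x of r(x) * sum over a of chi(a + x).
    S = sum(r_x * sum(legendre(a + x, p) for a in A) for x, r_x in r.items())
    return abs(S), sqrt(p * len(A) * energy), energy


# Illustrative data (assumptions, not taken from the text).
rng = random.Random(0)
p = 211
A = rng.sample(range(p), 40)
B = list(range(0, 40))                # an arithmetic progression
C_structured = list(range(100, 140))  # a second progression, so B + C is small
C_random = rng.sample(range(p), 40)

for name, C in (("structured", C_structured), ("random", C_random)):
    abs_S, bound, energy = lemma_5_2(A, B, C, p)
    print(f"{name}: |S| = {abs_S}, bound = {bound:.1f}, E_+ = {energy}")
```

The bound of the lemma shrinks with $E_+(B, C)$, so it is most useful when $B$ and $C$ share few additive relations.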

Now $E_+(B, C) \le \min\{|B|^2|C|, |C|^2|B|\}$, so that we recover Corollary 5.1. On the other hand, $E_+(B, C)$ may be much smaller, in which case we have a better estimate. So in order to improve upon Corollary 5.1, we may assume that $E_+(B, C) \gg \min\{|B|^2|C|, |C|^2|B|\}$, which tells us the sets in question have a lot of additive structure. Using the Balog-Szemerédi-Gowers theorem, we can therefore deduce a doubling bound for $B$ and $C$. In fact, we shall only need the symmetric version of the theorem, when the sets are identical. Before proceeding, we record a technical lemma.

Lemma 5.3. Let $z_1, \dots, z_n$ be complex numbers with $|\arg z_1 - \arg z_j| \le \delta$. Then

\[ |z_1 + \dots + z_n| \ge (1 - \delta)(|z_1| + \dots + |z_n|). \]

Proof. We have

\[ |z_1| + \dots + |z_n| = \theta_1 z_1 + \dots + \theta_n z_n = \theta_1(z_1 + \dots + z_n) + (\theta_2 - \theta_1)z_2 + \dots + (\theta_n - \theta_1)z_n \]

for some complex numbers $\theta_k$ of modulus 1 with $|\theta_1 - \theta_j| \le \delta$. Thus by the triangle inequality

\[ |z_1| + \dots + |z_n| \le |z_1 + \dots + z_n| + \delta(|z_2| + \dots + |z_n|) \]

and the result follows.
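Lemma 5.3 can be tested numerically. The following sketch draws complex numbers whose arguments lie within $\delta$ of the argument of the first one and compares the two sides of the inequality. The helper names `random_cluster` and `lemma_5_3_sides`, the moduli range, and the sample sizes are assumptions made only for this illustration.

```python
import cmath
import random


def random_cluster(n: int, delta: float, rng: random.Random):
    """n complex numbers whose arguments are within delta of the first one's."""
    base = rng.uniform(-cmath.pi, cmath.pi)
    angles = [base] + [base + rng.uniform(-delta, delta) for _ in range(n - 1)]
    return [cmath.rect(rng.uniform(0.1, 5.0), t) for t in angles]


def lemma_5_3_sides(zs, delta: float):
    """Return (|z_1 + ... + z_n|, (1 - delta)(|z_1| + ... + |z_n|))."""
    lhs = abs(sum(zs))
    rhs = (1 - delta) * sum(abs(z) for z in zs)
    return lhs, rhs


rng = random.Random(0)
results = []
for delta in (0.1, 0.5, 1.0):
    for _ in range(200):
        zs = random_cluster(10, delta, rng)
        results.append(lemma_5_3_sides(zs, delta))

# A small tolerance absorbs floating-point rounding.
print(all(lhs >= rhs - 1e-9 for lhs, rhs in results))
```

In the proofs below the lemma is applied with $\delta = 1/2$, which loses only a factor of 2 when passing from a sum of absolute values to the absolute value of a sum.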

We are now able to make an improvement to estimates for $S_\chi(A, B, C)$. The idea of the proof is simple. Ignoring technical details for the moment, either we are in a situation where Lemma 5.2 improves upon the trivial estimate, or else we can appeal to the Balog-Szemerédi-Gowers Theorem and deduce that $A$ has a subset with small sumset. In the latter case we can make use of Chang’s Theorem and also arrive at a non-trivial estimate, even saving a power of $p$. Unfortunately, this scenario does not come into play until one of the sets has a lot of additive energy. This means that the saving from Lemma 5.2 will become quite poor before we are rescued by Chang’s estimate. We proceed with the proof proper.

Proof of Theorem 5.1. Suppose, by way of contradiction, that the theorem does not hold. This means that there is some positive constant $\varepsilon > 0$ such that for $p$ arbitrarily large, we have sets $A, B, C \subset \mathbb{F}_p$ with $|A|, |B|, |C| \ge \delta\sqrt{p}$, and a non-trivial character $\chi$ of $\mathbb{F}_p^{\times}$ satisfying

\[ |S_\chi(A, B, C)| \ge \varepsilon|A||B||C|. \]

It follows that

\[ \varepsilon|A||B||C| \le \sum_{a \in A} |S_\chi(B, a + C)|. \]

If we let

\[ A' = \Big\{ a \in A : |S_\chi(B, a + C)| \ge \frac{\varepsilon}{2}|B||C| \Big\} \]

then

\[ \frac{\varepsilon}{2}|A||B||C| \le \sum_{a \in A'} |S_\chi(B, a + C)| \]

and $|A'| \ge |A|\varepsilon/2$. Now by the same argument as in the proof of Lemma 5.2, we must have

\[ \frac{\varepsilon^2}{4}|A|^2|B|^2|C|^2 \le p|C|E_+(A', B) \le p|C|E_+(A', A')^{1/2}E_+(B, B)^{1/2}, \]

the last inequality being a consequence of Lemma 2.7. So, using that $|A|, |B|, |C| \ge \delta\sqrt{p}$ and $E_+(B, B) \le |B|^3$, we have

\[ E_+(A', A') \ge \frac{\varepsilon^4\delta^4}{16}|A'|^3 \]

and so by Theorem 2.5 and Lemma 2.6 we can find a subset $A'' \subset A'$ of size at least $(\varepsilon\delta)^{t}\sqrt{p}$ such that $|A'' + A''| \le (\varepsilon\delta)^{-t}|A''|$ for some $t = O(1)$. Now since $A'' \subset A'$, we have

\[ \frac{\varepsilon}{2}|A''||B||C| \le \sum_{a \in A''} |S_\chi(B, a + C)|. \]

By the pigeon-hole principle, after passing to a subset $A''' \subset A''$ of size at least $|A''|/16$, we can assume that the complex numbers $S_\chi(B, a + C)$ with $a \in A'''$ all have argument within $\frac{1}{2}$ of each other. Thus, by Lemma 5.3, we have

\[ \frac{\varepsilon}{4}|A'''||B||C| \le |S_\chi(A''', B, C)|, \]

we have $|A'''| \ge (\varepsilon\delta)^{t}\sqrt{p}/16$, and we have

\[ |A''' + A'''| \le |A'' + A''| \le (\varepsilon\delta)^{-t}|A''| \le 16(\varepsilon\delta)^{-t}|A'''|. \]

However, by the triangle inequality, this implies that

\[ \frac{\varepsilon}{4}|A'''||B + c| \le \max_{c \in C} |S_\chi(A''', B + c)|. \]

This is in clear violation of Chang’s Theorem, applied to the set $A'''$ of small doubling, provided $p$ is sufficiently large in terms of $\delta$ and $\varepsilon$. Thus we have arrived at the desired contradiction.
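The pigeon-hole step used in the proof above can be made concrete. A minimal sketch, assuming we simply cut the unit circle into 16 equal sectors: each sector has angular width $2\pi/16 < 1/2$, so the most popular sector contains at least one sixteenth of the points, and Lemma 5.3 with $\delta = 1/2$ applies to it. The function name `largest_sector` and the random data are hypothetical.

```python
import cmath
import math
import random


def largest_sector(zs, sectors: int = 16):
    """Return the largest sub-list of zs whose arguments share one sector."""
    width = 2 * math.pi / sectors
    buckets = {}
    for z in zs:
        angle = cmath.phase(z) % (2 * math.pi)  # normalise to [0, 2*pi)
        buckets.setdefault(int(angle / width) % sectors, []).append(z)
    return max(buckets.values(), key=len)


# Illustrative data (assumption): 400 points with random arguments.
rng = random.Random(1)
zs = [
    cmath.rect(rng.uniform(0.5, 2.0), rng.uniform(-math.pi, math.pi))
    for _ in range(400)
]
best = largest_sector(zs)

# All arguments in `best` lie within 1/2 of each other, so Lemma 5.3 gives
# |sum| >= (1/2) * (sum of moduli) on this subset.
print(len(best), abs(sum(best)) >= 0.5 * sum(abs(z) for z in best))
```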

It is likely, provided $|A| > p^{\delta}$ for some positive $\delta$, that we have $|S_\chi(A, A)| = o(|A|^2)$. Some such size requirement is necessary, for if $\chi$ were the Legendre symbol, then we could take $A$ to be an arithmetic progression of size $\log p$ contained in the quadratic residues; such progressions are known to exist. However, one might suspect that for a given set $A$ there is a “smooth enough” sum in which one can find cancellation.

Problem. Suppose $A \subset \mathbb{F}_p$ has size $|A| = p^{\delta}$. Then there is an integer $k$ depending only on $\delta$ such that

\[ S_\chi(A; k) = \sum_{a_1, \dots, a_k \in A} \chi(a_1 + \dots + a_k) = o(|A|^k). \]

5.4 Mixed multivariate sums

In the final section we turn to a different sort of sum where a power saving is possible. First we consider a different trivariate character sum with a multiplicative convolution:

\[ M_\chi(A, B, C) = \sum_{a \in A} \sum_{b \in B} \sum_{c \in C} \chi(a + bc). \]

This type of sum appears in the proof of Burgess’ estimate. An important quantity which arises in the study of this sum, and appears frequently in additive combinatorics, is the multiplicative energy

\[ E_\times(X, Y) = |\{(x_1, x_2, y_1, y_2) \in X \times X \times Y \times Y : x_1 y_1 = x_2 y_2\}|. \]

This quantity is again bounded using Cauchy-Schwarz by

\[ E_\times(X, Y)^2 \le E_\times(X, X)E_\times(Y, Y). \]
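For small primes the multiplicative energy can be computed by counting representations of each product. The sketch below does this and checks the Cauchy-Schwarz bound above for one pair of sets of nonzero residues. The function name `multiplicative_energy`, the prime $p = 101$, and the choice of a geometric progression against an interval are assumptions for illustration only.

```python
from collections import Counter


def multiplicative_energy(X, Y, p: int) -> int:
    """E_x(X, Y) = #{(x1, x2, y1, y2) : x1*y1 = x2*y2 in F_p}."""
    # r[t] counts the pairs (x, y) with x*y = t; the energy is the sum of r[t]^2.
    r = Counter((x * y) % p for x in X for y in Y)
    return sum(v * v for v in r.values())


# Illustrative data (assumptions): subsets of the nonzero residues mod 101.
p = 101
X = sorted({pow(2, j, p) for j in range(20)})  # a geometric progression
Y = list(range(1, 21))                         # an interval

E_XY = multiplicative_energy(X, Y, p)
E_XX = multiplicative_energy(X, X, p)
E_YY = multiplicative_energy(Y, Y, p)
print(E_XY, E_XX, E_YY, E_XY ** 2 <= E_XX * E_YY)
```

One would expect the geometric progression to have much larger multiplicative energy than the interval, which is the kind of contrast that sum-product estimates quantify.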

Improving on an estimate for this sum remains elusive. In fact, even the case when the sets are highly structured is open [C2]. For instance, we still have no non-trivial estimates beyond the square-root barrier when both sets are multiplicative subgroups. Now using sum-product estimates, if the sets had additive structure, we could bound the multiplicative energy non-trivially and make an improvement. This is essentially Burgess’ argument, though he did not use sum-product theory; rather, since he was working with arithmetic progressions, the multiplicative energy could be bounded directly.

We turn instead to a quadravariate sum which has enough operations to force a Sum-Product type problem. Let $A, B, C, D$ be subsets of $\mathbb{F}_p$ and $\chi$ a non-trivial character. We consider the sum

\[ H_\chi(A, B, C, D) = \sum_{a \in A} \sum_{b \in B} \sum_{c \in C} \sum_{d \in D} \chi(a + b + cd). \]

By fixing one element in this sum, we can view $H_\chi$ as a trivariate sum in two different ways. First,

\[ H_\chi(A, B, C, D) = \sum_{d \in D} S_\chi(A, B, d \cdot C) \]

where $d \cdot C$ is the dilate of $C$ by $d$. Loosely, we can use Lemma 5.2 to bound this non-trivially if $E_+(C, C)$ is small. If not, we can write

\[ H_\chi(A, B, C, D) = \sum_{a \in A} M_\chi(a + B, C, D) \]

and try to bound this non-trivially using Lemma 2.13, which we can do if $E_\times(C, C)$ is small. By making some simple manipulations to $H_\chi$ and using a sum-product estimate, we will be able to guarantee one of these facts holds.
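The two ways of slicing $H_\chi$ are exact identities, which a brute-force computation confirms. This is a minimal sketch assuming the quadratic character, a small prime, and a set $D$ of nonzero residues (so that each dilate $d \cdot C$ has the same size as $C$); all function names and data are hypothetical.

```python
def chi(x: int, p: int) -> int:
    """Quadratic character of F_p for an odd prime p, with chi(0) = 0."""
    x %= p
    if x == 0:
        return 0
    return 1 if pow(x, (p - 1) // 2, p) == 1 else -1


def S(A, B, C, p):
    """S_chi(A, B, C) = sum of chi(a + b + c)."""
    return sum(chi(a + b + c, p) for a in A for b in B for c in C)


def M(A, B, C, p):
    """M_chi(A, B, C) = sum of chi(a + b*c)."""
    return sum(chi(a + b * c, p) for a in A for b in B for c in C)


def H(A, B, C, D, p):
    """H_chi(A, B, C, D) = sum of chi(a + b + c*d)."""
    return sum(chi(a + b + c * d, p) for a in A for b in B for c in C for d in D)


def dilate(C, d, p):
    """The set d*C in F_p (d nonzero, so no two elements collide)."""
    return sorted({(d * c) % p for c in C})


def translate(B, a, p):
    """The set a + B in F_p."""
    return sorted({(a + b) % p for b in B})


# Illustrative data (assumptions, not taken from the text).
p = 103
A, B = list(range(0, 8)), list(range(10, 22))
C, D = list(range(30, 40)), list(range(1, 7))

total = H(A, B, C, D, p)
by_dilates = sum(S(A, B, dilate(C, d, p), p) for d in D)
by_translates = sum(M(translate(B, a, p), C, D, p) for a in A)
print(total, by_dilates, by_translates)
```

The first slicing exposes the additive energy of dilates of $C$, the second exposes multiplicative energy, which is why the proof of Theorem 5.2 can play one against the other.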

Proof of Theorem 5.2. Let $2 \le k \ll \log p$ be a (large) parameter. First we handle the case $|C| < \sqrt{p}$. Let us write

\[ |H_\chi(A, B, C, D)| = \Delta|A||B||C||D| \]

so that our purpose is to estimate $\Delta$. Let

\[ C_1 = \Big\{ c \in C : |S_\chi(A, B, c \cdot D)| \ge \frac{\Delta|A||B||D|}{2} \Big\}. \]

We have

\[ \frac{1}{2}|H_\chi(A, B, C, D)| \le \sum_{c \in C_1} |S_\chi(A, B, c \cdot D)|, \]

and using that the inner quantities are at most $|A||B||D|$, we have

\[ |C_1| \ge \frac{\Delta}{2}|C|. \]

Now, passing to a subset $C_2$ of $C_1$ of size at least $\frac{|C_1|}{16} \ge \frac{\Delta}{32}|C|$, we can assume that the complex numbers $S_\chi(A, B, c \cdot D)$ with $c \in C_2$ all have arguments within $\frac{1}{2}$ of each other, so that for any $C_2' \subset C_2$ we have

\[ \frac{|C_2'|}{2|C|}|H_\chi(A, B, C, D)| = |C_2'|\frac{\Delta|A||B||D|}{2} \le \sum_{c \in C_2'} |S_\chi(A, B, c \cdot D)| \]

and so by Lemma 5.3 we have

\[ \frac{|C_2'|}{4|C|}|H_\chi(A, B, C, D)| \le \Big| \sum_{c \in C_2'} S_\chi(A, B, c \cdot D) \Big| = |H_\chi(A, B, C_2', D)|. \quad (5.1) \]

In particular, if $C_2' = C_2$ we have

\[ \frac{\Delta^2}{128}|A||B||C||D| \le \frac{|C_2|}{4|C|}|H_\chi(A, B, C, D)| \le \sum_{d \in D} |S_\chi(A, B, d \cdot C_2)|. \]

Now in view of Lemma 5.2, we see that

\[ \frac{\Delta^2}{128}|A||B||C||D| \le |D| \max_{d \in D} \sqrt{p|A|E_+(B, d \cdot C_2)} \le \sqrt{p}\,|D||A|^{1/2}|B|^{3/4}E_+(C_2, C_2)^{1/4}. \]

Thus

\[ E_+(C_2, C_2) \ge \frac{\Delta^8}{128^4}|A|^2|B||C|^4p^{-2} \ge \Big( \frac{\Delta^8}{128^4}|A|^2|B||C|p^{-2} \Big)|C_2|^3. \]

For convenience, write $K^{-1} = \frac{\Delta^8}{128^4}|A|^2|B||C|p^{-2}$. By Theorem 2.5 there is a subset $C_3 \subset C_2$ of size at least $\frac{|C_2|}{K(\log p)^2}$ and such that

\[ |C_3 - C_3| \ll K^4 \frac{|C_3|^2(\log p)^8}{|C_2|^2}|C_3|. \]

In particular, by Theorem 2.6 we have

\[ E_\times(C_3, C_3) \ll |C_3|K^7 \Big( \frac{|C_3|^2(\log p)^8}{|C_2|^2} \Big)^{7/4}|C_3|^{7/4}\log p = K^7|C_3|^{25/4}|C_2|^{-7/2}(\log p)^{15}. \]

Now we take $C_2' = C_3$ in equation (5.1) so that we get

\[ \frac{\Delta}{4}|A||B||C_3||D| = \frac{|C_3|}{4|C|}|H_\chi(A, B, C, D)| \le |H_\chi(A, B, C_3, D)| \le \sum_{a \in A} |M_\chi(a + B, C_3, D)|. \]

Now we apply Lemma 2.13 to obtain that

\[ \frac{\Delta}{4}|A||B||C_3||D| \ll |A|(|D||C_3|)^{1 - \frac{1}{k}}\big(E_\times(D, D)E_\times(C_3, C_3)\big)^{1/4k}\big(|B|^{2k}\,2k\sqrt{p} + (2k|B|)^{k}p\big)^{1/2k} \]

which implies (after bounding $E_\times(D, D)$ trivially by $|D|^3$)

\[ \Delta^{4k} \ll |D|^{-1}|C_3|^{-4}E_\times(C_3, C_3)\big(2k\sqrt{p} + (2k|B|^{-1})^{k}p\big)^{2}. \]

Since $2 \le k \ll \log p$ and $|B| \ge p^{\delta}$, the final factor is at most $O(p(\log p)^{2k})$ as long as $k > \frac{1}{2\delta}$, and after inserting the upper bound for $E_\times(C_3, C_3)$ we have

\[ \Delta^{4k} \ll |D|^{-1}K^7|C_3|^{9/4}|C_2|^{-7/2}(\log p)^{2k+15}p. \]

Now we substitute $K^{-1} = \frac{\Delta^8}{128^4}|A|^2|B||C|p^{-2}$ and see

\[ \Delta^{4k+56} \ll |D|^{-1}|A|^{-14}|B|^{-7}|C|^{-7}|C_3|^{9/4}|C_2|^{-7/2}(\log p)^{2k+15}p^{15}. \]

Bounding $|C_3| \le |C_2|$ and $|C_2| \gg \Delta|C|$ we get

\[ \Delta^{4k+\frac{229}{4}} \ll |D|^{-1}|A|^{-14}|B|^{-7}|C|^{-\frac{33}{4}}(\log p)^{2k+15}p^{15}. \]
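For readers tracking the exponents, the last step can be unpacked as follows. This is only a bookkeeping sketch of the two bounds just quoted, with implied constants suppressed.

```latex
% |C_3| <= |C_2| is used with the positive exponent 9/4, and
% |C_2| >> \Delta |C| is used with the negative exponent -5/4:
\[
  |C_3|^{9/4}\,|C_2|^{-7/2}
  \;\le\; |C_2|^{\frac{9}{4}-\frac{7}{2}}
  \;=\; |C_2|^{-5/4}
  \;\ll\; (\Delta\,|C|)^{-5/4}.
\]
% Moving \Delta^{-5/4} to the left-hand side and collecting powers of |C|:
\[
  4k + 56 + \tfrac{5}{4} \;=\; 4k + \tfrac{229}{4},
  \qquad
  -7 - \tfrac{5}{4} \;=\; -\tfrac{33}{4}.
\]
```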

Upon taking $4k$’th roots we have

\[ \Delta^{1+229/16k} \ll \big( |D|^{-1}|A|^{-14}|B|^{-7}|C|^{-\frac{33}{4}}p^{15} \big)^{1/4k}(\log p)^{1/2+15/4k}. \]

Since $|D|^4|A|^{56}|B|^{28}|C|^{33} \ge p^{60+\varepsilon}$, the quantity in brackets on the right is at most $p^{-\varepsilon/4}$. This shows that we must have $\Delta < p^{-\tau}$ for some $\tau > 0$ depending only on $\varepsilon$ and $\delta$. This is because we only needed $k$ to be sufficiently large in terms of $\delta$.

If $|C| > \sqrt{p}$ then we can break $C$ into a disjoint union of $m \approx \frac{|C|}{\sqrt{p}}$ sets $C_1, \dots, C_m$ of size at most $\sqrt{p}$. Then

\[ |H_\chi(A, B, C, D)| \le \sum_{j} |H_\chi(A, B, C_j, D)|. \]

We obtain a savings of $p^{-\tau}$ for each $H_\chi(A, B, C_j, D)$ and hence for $H_\chi(A, B, C, D)$ provided

\[ |D|^4|A|^{56}|B|^{28}|C_j|^{33} \gg |D|^4|A|^{56}|B|^{28}p^{33/2} \ge p^{60+\varepsilon}, \]

which is guaranteed by hypothesis (with $2\varepsilon$ in place of $\varepsilon$).

Bibliography

[AS] N. Alon and J. Spencer, The Probabilistic Method, 3rd edition, John Wiley and Sons, Inc., 2008.

[Bou1] J. Bourgain, On arithmetic progressions in sums of sets of integers, A tribute to Paul Erdős, 105-109, Cambridge Univ. Press, Cambridge, 1990.

[Bou2] J. Bourgain, On triples in arithmetic progression, Geom. Funct. Anal. 9 (1999), no. 5, 968-984.

[Bou3] J. Bourgain, Multilinear exponential sums in prime fields under optimal entropy condition on the sources, Geom. Funct. Anal. 18 (2009), no. 5, 1477-1502.

[BG] J. Bourgain and M. Z. Garaev, On a variant of sum-product estimates and explicit exponential sum bounds in prime fields, Math. Proc. Cambridge Philos. Soc. 146 (2009), no. 1, 1-21.

[BGK] J. Bourgain, A. A. Glibichuk and S. V. Konyagin, Estimates for the number of sums and products and for exponential sums in fields of prime order, J. Lond. Math. Soc. (2) 73 (2006), no. 2, 380-398.

[BKT] J. Bourgain, N. Katz and T. Tao, A sum-product estimate in finite fields, and applications, Geom. Funct. Anal. 14 (2004), no. 1, 27-57.

[Bu1] D. A. Burgess, On character sums and L-series, Proc. London Math. Soc. (3) 12 (1962), 193-206.

[Bu2] D. A. Burgess, On character sums and L-series. II, Proc. London Math. Soc. (3) 13 (1963), 524-536.

[C1] M.-C. Chang, On a question of Davenport and Lewis and new character sum bounds in finite fields, Duke Math. J. 145 (2008), no. 3, 409-442.

[C2] M.-C. Chang, Character sums in finite fields, Finite fields: theory and applications, 83-98, Contemp. Math., 518, Amer. Math. Soc., Providence, RI, 2010.

[CLR] E. Croot, N. Lyall and A. Rice, A purely combinatorial approach to simultaneous polynomial recurrence modulo 1, arXiv:1307.0779.

[E] P. Erdős, An asymptotic inequality in the theory of numbers, (Russian. English summary) Vestnik Leningrad. Univ. 15 (1960), no. 13, 41-49.

[ES] P. Erdős and E. Szemerédi, On sums and products of integers, Studies in pure mathematics, 213-218, Birkhäuser, Basel, 1983.

[FI] J. Friedlander and H. Iwaniec, Estimates for character sums, Proc. Amer. Math. Soc. 119 (1993), no. 2, 365-372.

[G1] M. Z. Garaev, An explicit sum-product estimate in $\mathbb{F}_p$, Int. Math. Res. Not. IMRN 2007, no. 11, Art. ID rnm035, 11 pp.

[G2] M. Z. Garaev, The sum-product estimate for large subsets of prime fields, Proc. Amer. Math. Soc. 136 (2008), no. 8, 2735-2739.

[GT] B. Green and T. Tao, New bounds for Szemerédi’s theorem. II. A new bound for $r_4(N)$, Analytic number theory, 180-204, Cambridge University Press, Cambridge, 2009.

[HLS] N. Hindman, I. Leader and D. Strauss, Open Problems in Partition Regularity, Combinatorics, Probability and Computing, no. 12, 571-583.

[IK] H. Iwaniec and E. Kowalski, Analytic Number Theory, American Mathematical Society Colloquium Publications. Amer. Math. Soc., Providence, RI, 2004.

[KS] N. H. Katz and C.-Y. Shen, A slight improvement to Garaev’s sum product estimate, Proc. Amer. Math. Soc. 136 (2008), 2499-2504.

[K] S. V. Konyagin, Estimates for character sums in finite fields, (Russian) Mat. Zametki 88 (2010), no. 4, 529-542; translation in Math. Notes 88 (2010), no. 3-4, 503-515.

[KR] S. V. Konyagin and M. Rudnev, On new sum-product-type estimates, SIAM J. Discrete Math. 27 (2013), no. 2, 973-990.

[LN] R. Lidl and H. Niederreiter, Finite fields, Encyclopedia of mathematics and its applications. Cambridge University Press, 1997.

[LM] N. Lyall and A. Magyar, Simultaneous polynomial recurrence, Bull. Lond. Math. Soc. 43 (2011), no. 4, 765-785.

[LRN] L. Li and O. Roche-Newton, An improved sum-product estimate for general finite fields, SIAM J. Discrete Math. 25 (2011), no. 3, 1285-1296.

[N] M. B. Nathanson, Elementary Methods in Number Theory, Graduate Texts in Mathematics. Springer, 2000.

[P] R. E. A. C. Paley, A theorem on characters, J. Lond. Math. Soc. 7 (1932), 28-32.

[RNRS] O. Roche-Newton, M. Rudnev and I. Shkredov, New sum-product type estimates over finite fields, arXiv:1408.0542v1.

[R] M. Rudnev, An improved sum-product inequality in fields of prime order, Int. Math. Res. Not. IMRN 2012, no. 16, 3693-3705.

[Sar] A. Sárközy, On additive decompositions of the set of quadratic residues modulo p, Acta Arith. 155 (2012), no. 1, 41-51.

[Sch] W. M. Schmidt, Small fractional parts of polynomials, CBMS Regional Conference Series in Math., 32, Amer. Math. Soc., 1977.

[Sh] X. Shao, On character sums and exponential sums over generalized arithmetic progressions, Bull. Lond. Math. Soc. 45 (2013), no. 3, 541-550.

[Shk] I. D. Shkredov, On monochromatic solutions of some nonlinear equations in Z/pZ, Mathematical Notes 88 (2010), no. 3-4, 603-611.

[Shk2] I. D. Shkredov, Sumsets in quadratic residues, Acta Arith. 164 (2014), no. 3, 221-243.

[So] J. Solymosi, Bounding multiplicative energy by the sumset, Adv. Math. 222 (2009), no. 2, 402-408.

[Shp] I. Shparlinski, Additive decompositions of subgroups of finite fields, SIAM J. Discrete Math. 27 (2013), no. 4, 1870-1879.

[TV] T. Tao and V. Vu, Additive Combinatorics, Cambridge Studies in Advanced Mathematics, 105. Cambridge University Press, Cambridge, 2006.

[V] I. M. Vinogradov, An Introduction to the Theory of Numbers, 6th edition (translated from Russian), Pergamon Press, 1952.