THE DICHOTOMY BETWEEN STRUCTURE AND RANDOMNESS AND APPLICATIONS TO COMBINATORIAL NUMBER THEORY
DISSERTATION
Presented in partial fulfillment of the requirements for the degree Doctor of Philosophy in the Graduate School of The Ohio State University
By
Florian K. Richter, MASt, BSc
Graduate Program in Mathematics
The Ohio State University
2018
Dissertation Committee:
Vitaly Bergelson, Advisor
Alexander Leibman
David Penneys
Copyright by
Florian K. Richter
2018

ABSTRACT
The study of the long-term behavior of dynamical systems has far-reaching applications to other areas of mathematics. The employment of analytic tools coming from measurable, topological, and symbolic dynamics offers novel possibilities for analyzing seemingly static number-theoretic and combinatorial situations and has proven to be a powerful method in solving numerous open problems in Ramsey theory and combinatorial number theory that previously appeared to be intractable.
In this thesis we develop new techniques that are inspired by dynamical heuristics and lead to a variety of applications in discrete mathematics. One theme featured prominently in this work is the idea of dichotomy between structure and randomness. This dichotomy manifests itself via decomposition theorems that deal with splittings of arithmetic functions into two components, one of which is structured and the other is pseudo-random. From these decomposition theorems we then derive results in ergodic theory and density Ramsey theory.
Among other things, we obtain generalizations and refinements of Szemerédi’s theorem and Sárközy’s theorem, and present a solution to a long-standing open sumset conjecture of Erdős.
ACKNOWLEDGEMENTS
My accomplishments as a graduate student would not have been possible without the support of the kind people around me.
First and foremost, I would like to express my sincere gratitude to Vitaly Bergelson, my advisor, whose mentorship has formed me as a mathematician. His never-ending patience and care towards me and all his other students has set a great example of how to become an effective and successful advisor. I could not have imagined having a better mentor for my Ph.D. studies.
I also want to thank the members of the Ohio State University mathematics department, in particular Alexander Leibman, Nimish Shah and Daniel Thompson, for their help and guidance throughout the years. My thanks also go to Mariusz Lemańczyk. I benefited greatly from my collaboration with him.
I am grateful to the graduate school at the Ohio State University for awarding me the Presidential Fellowship, and to the Phi Kappa Phi Honor Society for awarding me the Louise B. Vetter Award, which provided me with the time and resources to finish this dissertation.
To my colleagues, co-authors and friends, including Daniel Glasscock, John H. Johnson, Andreas Koutsogiannis, Joanna Kułaga-Przymus, Joel Moreira and Donald Robertson, thank you for sharing with me your skills and enthusiasm for mathematics.
Finally, I would like to thank my loving family for their help and support and for all the sacrifices that they have made to help me achieve my goals. Thank you!
VITA
June 2010 ...... Bachelor of Science
Technische Universität Wien
June 2011 ...... Master of Advanced Studies in Mathematics
Cambridge University
September 2012 to Present ...... Graduate Teaching Associate
Department of Mathematics
Ohio State University
PUBLICATIONS
• V. Bergelson, F. K. Richter. “On the density of coprime tuples of the form $(n, \lfloor f_1(n) \rfloor, \ldots, \lfloor f_k(n) \rfloor)$, where $f_1, \ldots, f_k$ are functions from a Hardy field”. In: Number Theory – Diophantine Problems, Uniform Distribution and Applications, Festschrift in Honour of Robert F. Tichy’s 60th Birthday (C. Elsholtz and P. Grabner, eds.), Springer International Publishing, Cham, (2017), pp. 109 – 135.
• J. H. Johnson, F. K. Richter. “Revisiting the Nilpotent Polynomial Hales-Jewett
Theorem”. In: Advances in Mathematics 321 (2017), pp. 269 – 286.
• J. Moreira, F. K. Richter. “Large subsets of discrete hypersurfaces in $\mathbb{Z}^d$ contain arbitrarily many collinear points”. In: European Journal of Combinatorics 54 (2016), pp. 163 – 176.
FIELDS OF STUDY
Major Field: Mathematics
Specialization: Ergodic Theory, Additive and Combinatorial Number Theory
TABLE OF CONTENTS
Abstract
Acknowledgements
Vita
List of Figures
1 Introduction
  1.1 The dichotomy between structure and randomness
  1.2 Notions of structure
  1.3 Notions of randomness
  1.4 Decomposition theorems
    1.4.1 Decomposition theorems for arithmetic functions
    1.4.2 Decomposition theorems for multicorrelation sequences
    1.4.3 A dichotomy theorem for multiplicative functions and a structure theorem for level sets of multiplicative functions
  1.5 Main Results
    1.5.1 Generalizations of Furstenberg’s multiple recurrence theorem and refinements of Szemerédi’s theorem
    1.5.2 The Erdős sumset conjecture
2 Proofs of decomposition theorems
  2.1 Proofs of decomposition theorems for $L^2(\mathbb{N}, \Phi)$
    2.1.1 A completeness lemma for $L^2(\mathbb{N}, \Phi)$
    2.1.2 Decomposing functions from $L^2(\mathbb{N}, \Phi)$ into almost periodic and coperiodic components
    2.1.3 The Jacobs–de Leeuw–Glicksberg splitting for $L^2(\mathbb{N}, \Phi)$
  2.2 Proofs of decomposition theorems for multicorrelation sequences
    2.2.1 Preliminaries on nilmanifolds, nilsystems and nilsequences
    2.2.2 Preliminaries on almost periodic functions
    2.2.3 Proving Theorem 18 for the special case of nilsystems
    2.2.4 Host-Kra-Ziegler factors
    2.2.5 Spectrum of the orbit of the diagonal
    2.2.6 A useful reduction
    2.2.7 The subgroup H
    2.2.8 A Theorem for eliminating the rational spectrum
  2.3 Multiplicative functions and their level sets
    2.3.1 Preliminaries
    2.3.2 Dichotomy theorem for $M_0$
    2.3.3 Structure theorem for D
3 Applications of decomposition theorems to the theory of multiple recurrence and to combinatorial number theory
  3.1 Multiple ergodic averages along Beatty sequences and a proof of Theorem 28
  3.2 Multiple ergodic averages along rational sets and applications
    3.2.1 Rational sequences are good weights for polynomial multiple convergence
    3.2.2 Divisible rational sets are good for polynomial multiple recurrence
    3.2.3 Applications to additive combinatorics
  3.3 Multiple ergodic averages along level sets of multiplicative functions and applications to ergodic theory and combinatorics
    3.3.1 The class $D_{\mathrm{rat}}$
    3.3.2 Proofs of Theorem 35 and Proposition 36
  3.4 Completing the proof of the Erdős sumset conjecture
    3.4.1 An ultrafilter reformulation of the Erdős sumset conjecture
    3.4.2 Proving the ultrafilter reformulation
    3.4.3 Establishing properties U1 - U4
    3.4.4 An application of the pointwise ergodic theorem
    3.4.5 A variant of an argument of Beiglböck
Bibliography

LIST OF FIGURES
1.1 Notions of structure
1.2 Notions of randomness

CHAPTER 1
INTRODUCTION
1.1 The dichotomy between structure and randomness
Density Ramsey theory is a rich and active area of research in mathematics at the interface of combinatorics and measure theory. Broadly speaking, it deals with finding arithmetic, geometric or combinatorial patterns in large subsets of spaces that admit a natural notion of density. The most classical and most studied space of this kind – and also the space that we focus on in this dissertation – is the set of positive integers N := {1, 2, 3,...}. On N, a notion of density that is natural to consider is the so-called upper density. Given a subset
A ⊂ N, the upper density of A is defined as
\[ d(A) := \limsup_{N\to\infty} \frac{|A \cap \{1,\ldots,N\}|}{N}. \]
One of the basic principles of density Ramsey theory is that any set A with d(A) > 0 is combinatorially and arithmetically rich. Two celebrated theorems that showcase this principle are Szemerédi’s theorem on arithmetic progressions and Sárközy’s theorem.
Szemerédi’s theorem ([Sze75]). Any set A ⊂ N with d(A) > 0 contains arbitrarily long arithmetic progressions.
Sárközy’s theorem ([Sár78], see also [Fur77]). Any set A ⊂ N with d(A) > 0 contains two elements whose difference is a perfect square.
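The two theorems concern infinite sets of positive upper density, so no finite computation can verify them. As a purely illustrative sketch, the following Python helpers (all names are hypothetical and not taken from the dissertation) compute the finite ratio |A ∩ {1,...,N}|/N and search a finite set by brute force for the two patterns in question, an arithmetic progression of length k and a pair of elements whose difference is a perfect square.

```python
from math import isqrt


def density_ratio(A, N):
    """|A ∩ {1,...,N}| / N for a finite collection A of positive integers."""
    return sum(1 for a in set(A) if 1 <= a <= N) / N


def find_arithmetic_progression(A, k):
    """Return (a, d) with a, a+d, ..., a+(k-1)d all in A and d >= 1, or None."""
    S = set(A)
    if not S:
        return None
    top = max(S)
    for a in sorted(S):
        d = 1
        while a + (k - 1) * d <= top:
            if all(a + j * d in S for j in range(k)):
                return a, d
            d += 1
    return None


def find_square_difference(A):
    """Return (a, b) with a < b in A and b - a a perfect square, or None."""
    S = sorted(set(A))
    for i, a in enumerate(S):
        for b in S[i + 1:]:
            if isqrt(b - a) ** 2 == b - a:
                return a, b
    return None


if __name__ == "__main__":
    # Toy input: the integers in {1,...,30} that are congruent to 1 modulo 3.
    A = [n for n in range(1, 31) if n % 3 == 1]
    print(density_ratio(A, 30))
    print(find_arithmetic_progression(A, 4))
    print(find_square_difference(A))
```

For the toy set above every difference is a multiple of 3, so the first square difference found is 9.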
How does one prove results such as Szemerédi’s theorem or Sárközy’s theorem? The class of all subsets of N with positive upper density is so large that, if no further restrictions are made, the nature of an arbitrary set A with d(A) > 0 can be rather intricate and difficult to describe, especially since A might exhibit a blend of different qualities. An effective approach for proving such theorems is to decompose A into manageable pieces.
At the center of any such approach lies a decomposition theorem which guarantees that an arbitrary set, no matter how disorderly, can be viewed as a “superposition” of well-behaved components. We will focus on decomposition theorems that are based on a rather simple but powerful idea: the dichotomy between structure and randomness.
Sometimes also interpreted as the dichotomy between order and chaos, this heuristic plays a central role in all known proofs of Szemerédi’s theorem [Sze75; Fur77; Gow01; Tao06; GT10c], where it manifests itself in diverse ways, for example as a regularity lemma [Sze78; RS04; Gow07; LS07; GT10a], an inverse theorem for uniformity norms [GTZ12; HK09], or a structure theorem for arbitrary probability measure preserving systems [FKO82; EW11]. Moreover, the idea of a dichotomy between structure and randomness finds applications in other areas as well; see for example the expository articles on this topic by Tao [Tao07; Tao08] and Gowers [Gow10].
In this thesis we are concerned with refining old and developing new results that deal with decompositions of arithmetic functions on the integers into two components, one of which is structured and the other is random. In this context, we use the term structured very broadly to describe objects that exhibit some form of low complexity or almost periodicity, and the term random (or sometimes pseudo-random) for objects that within our framework exhibit behavior similar to that of a sequence of independent and identically distributed random variables. Some of the decomposition theorems that we obtain are rather general and yield a decomposition for any bounded arithmetic function f : N → C; others are more specialized and apply only to certain classes of arithmetic functions, such as multiplicative functions, indicator functions of level sets of multiplicative functions, or multicorrelation sequences (cf. Sections 1.4.2 and 1.4.3 below for definitions). All decomposition theorems are collectively formulated in Section 1.4, whereas the pertinent notions of structure and randomness are introduced beforehand in Sections 1.2 and 1.3. The main results of this doctoral dissertation are then stated in Section 1.5 and include dynamical and combinatorial corollaries that can be derived from the decomposition theorems presented in Section 1.4. Among many other things, this includes far-reaching generalizations and analogues of Szemerédi’s theorem and Sárközy’s theorem (see Theorems 28, 32, 35 and Corollaries 29, 33, 34, 37, 38) as well as a solution to the long-standing Erdős sumset conjecture (see Theorem 40). Partial results of the presented work have already appeared in [MR16a; BKPLR16; BKPLR17; MRR17; BMR17; MRR18] and some theorems and passages have been taken verbatim from these sources.
1.2 Notions of structure
All decomposition theorems that we will encounter in Section 1.4 follow a similar scheme in the sense that they allow us to split an arithmetic function f : N → C as
\[ f = f_{\mathrm{structured}} + f_{\mathrm{random}}, \tag{1.2.1} \]
where the two components appearing in the splitting, $f_{\mathrm{structured}}$ and $f_{\mathrm{random}}$, exhibit special qualities. The first component is structured in the sense that an explicit description of its arithmetic architecture is known, which allows it to be studied using hands-on computations. The second component is the opposite: it is pseudo-random, which means that even though not much is known about its explicit structure, its statistical behavior is easy to predict and can be studied using probabilistic heuristics.
The purpose of this section is to introduce a rigorous framework for defining the type of structured objects that play the role of $f_{\mathrm{structured}}$ in (1.2.1). We do the same for $f_{\mathrm{random}}$ in Section 1.3 below. Ergodic theory and topological dynamics will play a crucial role in defining these notions and so we begin by recalling the definition of a topological dynamical system.
Definition 1. A topological dynamical system is a pair (X,T ), where X is a compact metric space and T : X → X is a homeomorphism.
There is a natural way of associating to every topological dynamical system a collection of arithmetic functions belonging to $\ell^\infty(\mathbb{N}) := \{f : \mathbb{N} \to \mathbb{C} : \sup_{n\in\mathbb{N}} |f(n)| < \infty\}$ by evaluating continuous functions along orbits of points. This leads to the following definition.
Definition 2. We say that an arithmetic function f : N → C is generated by the topological dynamical system (X, T) if there exists a point x ∈ X and a continuous function F ∈ C(X) such that $f(n) = F(T^n x)$ for all n ∈ N.
Note that many dynamical qualities of the system are inherited by the sequences that it gives rise to. In particular, if (X,T ) is a structurally simple system then so are all the sequences that (X,T ) generates. We provide now some illustrations of this principle.
Bohr rationally almost periodic functions. The simplest type of a topological dynamical system is a translation on a finite group, by which we mean a system (X, T) where the underlying space X is a finite group and the transformation T : X → X is multiplication by a fixed element in this group, i.e., there exists a ∈ X such that T x = a · x for all x ∈ X. An arithmetic function f : N → C is called Bohr rationally almost periodic if for every ε > 0 there exists a function g : N → C that is generated by a translation on a finite group such that $\sup_{n\in\mathbb{N}} |f(n) - g(n)| < \varepsilon$. Write $\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N})$ for the set of all arithmetic functions that are Bohr rationally almost periodic. We remark that Bohr rationally almost periodic functions form a subfamily of the more general class of Bohr almost periodic functions introduced later in this section.
Example 3. Any periodic function is clearly Bohr rationally almost periodic. However, there are plenty of non-periodic Bohr rationally almost periodic functions, such as functions of the form $f(n) = \sum_{i=1}^{\infty} c_i e^{2\pi i q_i n}$, where $q_1, q_2, \ldots \in \mathbb{Q}$ and $c_1, c_2, \ldots \in \mathbb{C}$ with $\sum_{i=1}^{\infty} |c_i| < \infty$.
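The reason such a series is Bohr rationally almost periodic can be checked numerically: each partial sum is periodic, and the tail is uniformly small. The sketch below assumes the illustrative choices $c_i = 2^{-i}$ and $q_i = 1/i$, which are not taken from the text; the function names are likewise hypothetical. With these choices the M-term partial sum has period lcm(1, ..., M), and any longer partial sum differs from it by at most $2^{-M}$ uniformly in n.

```python
import cmath
from fractions import Fraction
from math import lcm  # requires Python 3.9 or later


def e(t):
    """Return exp(2*pi*i*t)."""
    return cmath.exp(2j * cmath.pi * t)


def truncated_sum(n, M):
    """Partial sum of c_i * e(q_i * n) for i = 1..M with c_i = 2**-i and q_i = 1/i."""
    # n * q_i is reduced modulo 1 exactly before the conversion to float.
    return sum(2.0 ** -i * e(float(Fraction(n, i) % 1)) for i in range(1, M + 1))


def period(M):
    """A period of the M-term partial sum: the least common multiple of 1..M."""
    return lcm(*range(1, M + 1))


if __name__ == "__main__":
    M = 5
    worst = max(abs(truncated_sum(n, 12) - truncated_sum(n, M)) for n in range(1, 500))
    print(period(M), worst, 2.0 ** -M)
```

The periodic partial sums play the role of the approximating functions g in the definition, since a periodic function is generated by a translation on a finite cyclic group.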
Besicovitch rationally almost periodic functions. An important generalization of Bohr rationally almost periodic functions is given by Besicovitch rationally almost periodic functions, which were introduced in [BKPLR16] and find applications in multiplicative number theory (cf. [BKPLR17]).
A Følner sequence on N is a sequence $\Phi = (\Phi_N)_{N\in\mathbb{N}}$ of finite non-empty subsets of N satisfying
\[ \lim_{N\to\infty} \frac{|(\Phi_N - 1) \cap \Phi_N|}{|\Phi_N|} = 1. \]
The Besicovitch seminorm $\|\cdot\|_\Phi$ is then defined as
\[ \|f\|_\Phi := \limsup_{N\to\infty} \left( \frac{1}{|\Phi_N|} \sum_{n\in\Phi_N} |f(n)|^2 \right)^{1/2}. \tag{1.2.2} \]
An arithmetic function f : N → C is called Besicovitch rationally almost periodic with respect to Φ if, for every ε > 0, there exists an arithmetic function g : N → C generated by a translation on a finite group such that $\|f - g\|_\Phi < \varepsilon$. We define $\mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi)$ to be the set of all Besicovitch rationally almost periodic functions with respect to Φ.
Example 4. Certainly, $\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N}) \subset \mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi)$ for any Følner sequence Φ. The reverse inclusion is not true: Let $\Phi^{(\text{Cesàro})}$ denote the Følner sequence of initial intervals (meaning that $\Phi^{(\text{Cesàro})}_N = \{1, 2, \ldots, N\}$ for all N ∈ N). Then the indicator function of the squarefree numbers $Q := \{n \in \mathbb{N} : p^2 \nmid n \text{ for all primes } p\}$ is an example of a function that is Besicovitch rationally almost periodic with respect to $\Phi^{(\text{Cesàro})}$ but not Bohr rationally almost periodic. Note that the indicator function of the squarefree numbers is multiplicative, meaning that for all m, n ∈ N with gcd(m, n) = 1 one has $1_Q(mn) = 1_Q(m)1_Q(n)$ (cf. Section 1.4.3 below). It is shown in [BKPLR17] that not just $1_Q$, but actually any bounded multiplicative function f : N → [0, ∞) is Besicovitch rationally almost periodic with respect to the Følner sequence of initial intervals $\Phi^{(\text{Cesàro})}$.
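The squarefree example can be explored numerically. The sketch below is only a finite-N proxy for the limsup in the Besicovitch seminorm along initial intervals, and its approximating functions are an assumption made for illustration: $g_P(n) = 1$ when no $p^2$ with p in a finite set P of primes divides n, which is periodic with period $\prod_{p \in P} p^2$. All function names are hypothetical. Enlarging P shrinks the computed distance, and the squared distance should be close to $\prod_{p \in P}(1 - p^{-2}) - 6/\pi^2$, the density of the integers on which $g_P$ and $1_Q$ differ.

```python
def squarefree_indicator(N):
    """ind[n] = 1 if n is squarefree, else 0, for 1 <= n <= N (ind[0] is unused)."""
    ind = [1] * (N + 1)
    ind[0] = 0
    p = 2
    while p * p <= N:
        # Striking multiples of p*p for composite p as well is harmless.
        for m in range(p * p, N + 1, p * p):
            ind[m] = 0
        p += 1
    return ind


def periodic_approximation(N, primes):
    """g[n] = 1 if no p**2 with p in `primes` divides n; periodic in n."""
    return [0] + [int(all(n % (p * p) for p in primes)) for n in range(1, N + 1)]


def besicovitch_distance(f, g, N):
    """Finite-N proxy for the Besicovitch seminorm of f - g along {1,...,N}."""
    return (sum(abs(f[n] - g[n]) ** 2 for n in range(1, N + 1)) / N) ** 0.5


if __name__ == "__main__":
    N = 20000
    f = squarefree_indicator(N)
    for primes in ([2], [2, 3, 5], [2, 3, 5, 7, 11, 13]):
        print(primes, besicovitch_distance(f, periodic_approximation(N, primes), N))
```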
Bohr almost periodic functions. A more general class of low complexity systems, which includes all translations on finite groups, is the class of translations on compact groups. Translations on compact groups are topological dynamical systems (X,T ) where X is a compact Hausdorff topological group and the transformation T is again multiplication by a fixed element in the group. An arithmetic function f : N → C is called Bohr almost periodic if it is generated by a translation on a compact group. We denote the collection of all Bohr almost periodic functions by Bohr(N).
Example 5. Bohr almost periodic functions were introduced by Harald Bohr (see [Boh25a; Boh25b]) and include, among other examples, all functions of the form $f(n) = \sum_{i=1}^{\infty} c_i e^{2\pi i \lambda_i n}$, where $\lambda_1, \lambda_2, \ldots \in \mathbb{R}$ and $c_1, c_2, \ldots \in \mathbb{C}$ with $\sum_{i=1}^{\infty} |c_i| < \infty$. In particular, all trigonometric polynomials are Bohr almost periodic.
Besicovitch almost periodic functions (cf. [Bes26; Bes55; BL85]). Let $\Phi = (\Phi_N)_{N\in\mathbb{N}}$ be a Følner sequence on N. If, for every ε > 0, there exists g : N → C that is generated by a translation on a compact group and such that $\|f - g\|_\Phi < \varepsilon$, then f is called Besicovitch almost periodic with respect to Φ. Write $\mathrm{Bes}(\mathbb{N}, \Phi)$ for the space of arithmetic functions that are Besicovitch almost periodic with respect to Φ.
Example 6. $\mathrm{Bes}(\mathbb{N}, \Phi)$ contains all Bohr almost periodic functions and all Besicovitch rationally almost periodic functions. An example of a function that is neither Bohr almost periodic nor Besicovitch rationally almost periodic, but still belongs to $\mathrm{Bes}(\mathbb{N}, \Phi)$, is the indicator function of the set $\{\lfloor n\alpha \rfloor : n \in \mathbb{N}\}$ for any $\alpha \in \mathbb{R} \setminus \mathbb{Q}$. It is worth noting that the set of all trigonometric polynomials (and therefore also $\mathrm{Bohr}(\mathbb{N})$) is dense in $\mathrm{Bes}(\mathbb{N}, \Phi)$ with respect to the Besicovitch seminorm $\|\cdot\|_\Phi$, that is, for any $f \in \mathrm{Bes}(\mathbb{N}, \Phi)$ and any ε > 0 there exists a trigonometric polynomial p(n) such that $\|f - p\|_\Phi < \varepsilon$ (cf. [Bes55; BL85]).
Nilsequences (cf. [BHK05; HK08; HK]). A class of dynamical systems that includes all translations on compact groups and which finds important applications in density Ramsey theory is the class of nilsystems: Let k ∈ N, let G be a k-step nilpotent Lie group and let Γ be a uniform and discrete subgroup of G (here uniform means that the quotient G/Γ is compact). The quotient space G/Γ is called a k-step nilmanifold and for any fixed group element a ∈ G the transformation T : G/Γ → G/Γ given by T(bΓ) = (a · b)Γ for all bΓ ∈ G/Γ is called a niltranslation. If X = G/Γ is a k-step nilmanifold and T is a niltranslation on it then the resulting topological dynamical system (X, T) is called a k-step nilsystem.
Any arithmetic function f : N → C generated by a k-step nilsystem is called a basic k-step nilsequence. We call f : N → C a k-step nilsequence if for every ε > 0 there exists a basic k-step nilsequence g : N → C such that $\sup_{n\in\mathbb{N}} |f(n) - g(n)| < \varepsilon$. Write $\mathrm{Nil}_k(\mathbb{N})$ for the set of all k-step nilsequences. It is straightforward to show that $\mathrm{Nil}_1(\mathbb{N}) = \mathrm{Bohr}(\mathbb{N})$.
Example 7. A classical example of a k-step nilsequence is $f(n) = e^{2\pi i p(n)}$, where p is any polynomial in R[x] of degree at most k.
Besicovitch Nilsequences. Fix a Følner sequence $\Phi = (\Phi_N)_{N\in\mathbb{N}}$. We call f : N → C a Besicovitch k-step nilsequence with respect to Φ if for every ε > 0 there exists a sequence g : N → C coming from a k-step nilsystem such that $\|f - g\|_\Phi < \varepsilon$. Let $\mathrm{BesNil}_k(\mathbb{N}, \Phi)$ denote the set of Besicovitch k-step nilsequences. In analogy to $\mathrm{Nil}_1(\mathbb{N}) = \mathrm{Bohr}(\mathbb{N})$, we have $\mathrm{BesNil}_1(\mathbb{N}, \Phi) = \mathrm{Bes}(\mathbb{N}, \Phi)$.
Example 8. All nilsequences are obviously Besicovitch nilsequences with respect to Φ for any Φ, but the converse is not true. For instance, given two rationally independent numbers α, β ∈ R, the sequence $n \mapsto e(\lfloor n\alpha \rfloor n\beta)$, where $e(x) := e^{2\pi i x}$, belongs to $\mathrm{BesNil}_2(\mathbb{N}, \Phi)$ for any Følner sequence Φ, but does not belong to $\mathrm{Nil}_2(\mathbb{N})$ (see [HK08, Appendix C]).
Our next goal is to introduce measure theoretic analogues of almost periodic functions and nilsequences; for this we first need to recall the definition of a measure preserving dynamical system and thereafter establish a measure theoretic analogue of Definition 2.
Definition 9. Let (X, B, µ) be a probability space and let T : X → X be a measurable map that is measure preserving, i.e., $\mu(T^{-1}B) = \mu(B)$ for all B ∈ B. We call the quadruple (X, B, µ, T) a measure preserving dynamical system. In the case where X is a metric space and B is the σ-algebra of Borel sets, we call (X, B, µ, T) a metric measure preserving system, or m.m.p.s. for short. (Note that in this thesis we will only deal with m.m.p.s. where T : X → X is continuous.) Two measure preserving systems (X, B, µ, T) and (Y, C, ν, S) are called isomorphic if there exists a measurable subset X0 ⊂ X with µ(X0) = 1 and an injective measurable map ϕ: X0 → Y such that ϕ ◦ T = S ◦ ϕ and ν coincides with the push-forward of µ under the map ϕ.
For the remainder of this section let $\Phi = (\Phi_N)_{N\in\mathbb{N}}$ be a Følner sequence on N.
Definition 10. Given a m.m.p.s. (X, B, µ, T), we say a point x ∈ X is generic along Φ for the measure µ if for every F ∈ C(X),
\[ \lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n\in\Phi_N} F(T^n x) = \int_X F \, d\mu. \]
We say f : N → C is generated along Φ by the m.m.p.s. (X, B, µ, T) if there exists a point x ∈ X, which is generic along Φ for µ, and a continuous function F ∈ C(X) such that $f(n) = F(T^n x)$ for all n ∈ N.
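Genericity can be observed numerically in the simplest case. The sketch below assumes a standard example that is not taken from the text: the rotation T(t) = t + α (mod 1) on the circle R/Z with Lebesgue measure, along the initial intervals Φ_N = {1,...,N}. For irrational α this rotation is uniquely ergodic, so every point is generic; for α = 1/2 the orbit of 0 is periodic and the averages converge to a different value. The function names are hypothetical.

```python
import math


def ergodic_average(F, alpha, x, N):
    """Average of F(T^n x) over n = 1..N for the rotation T(t) = t + alpha (mod 1)."""
    return sum(F((x + n * alpha) % 1.0) for n in range(1, N + 1)) / N


def F(t):
    """A continuous test function on R/Z whose integral over [0, 1) equals 1/2."""
    return math.cos(2 * math.pi * t) ** 2


if __name__ == "__main__":
    # Irrational rotation: the averages approach the integral 1/2.
    print(ergodic_average(F, math.sqrt(2), 0.0, 10000))
    # Rational rotation by 1/2 started at 0: the orbit is {1/2, 0} and F equals 1 on it.
    print(ergodic_average(F, 0.5, 0.0, 10000))
```

The sequence n ↦ F(T^n x) produced in the irrational case is an instance of a function generated along Φ by a metric measure preserving system.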
Compact functions (cf. [MRR18]). On any compact Hausdorff topological group the normalized Haar measure is a Borel probability measure which is invariant under all group translations. Therefore, any translation on a compact group can be viewed as a metric measure preserving system. We call any m.m.p.s. (X, B, µ, T) that is measurably isomorphic to a translation on a compact group with Haar measure a measure-theoretic Kronecker system. An arithmetic function f : N → C is then called compact with respect to Φ if it is generated along Φ by a measure-theoretic Kronecker system. Such functions are called compact because their orbit under shifts $\{n \mapsto f(n+h) : h \in \mathbb{N}\}$ is pre-compact with respect to the Besicovitch seminorm $\|\cdot\|_\Phi$. We use $\mathrm{Comp}(\mathbb{N}, \Phi)$ to denote the set of all functions that are compact with respect to Φ.
Measure-theoretic nilsequences. Let k ∈ N. On any k-step nilmanifold G/Γ there exists a unique Borel probability measure $\mu_{G/\Gamma}$ that is invariant under all niltranslations, called the Haar measure of the nilmanifold G/Γ (cf. [Rag72]). Therefore any nilsystem together with its Haar measure is a metric measure preserving dynamical system. A m.m.p.s. (X, B, µ, T) is then called a measure-theoretic k-step nilsystem if it is measurably isomorphic to a nilsystem with its Haar measure.
An arithmetic function f : N → C is called a measure-theoretic k-step nilsequence with respect to Φ if it is generated along Φ by a measure-theoretic k-step nilsystem. We use $\mathrm{MeasNil}_k(\mathbb{N}, \Phi)$ to denote the set of all measure-theoretic k-step nilsequences with respect to Φ. Clearly, measure-theoretic 1-step nilsequences are the same as compact functions.
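Figure 1.1: Notions of structure
\[
\begin{array}{cccccccc}
\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N}) & \subset & \mathrm{Bohr}(\mathbb{N}) & \subset & \mathrm{Nil}_2(\mathbb{N}) & \subset & \mathrm{Nil}_3(\mathbb{N}) & \cdots \\
\cap & & \cap & & \cap & & \cap & \\
\mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi) & \subset & \mathrm{Bes}(\mathbb{N}, \Phi) & \subset & \mathrm{BesNil}_2(\mathbb{N}, \Phi) & \subset & \mathrm{BesNil}_3(\mathbb{N}, \Phi) & \cdots \\
 & & \cap & & \cap & & \cap & \\
 & & \mathrm{Comp}(\mathbb{N}, \Phi) & \subset & \mathrm{MeasNil}_2(\mathbb{N}, \Phi) & \subset & \mathrm{MeasNil}_3(\mathbb{N}, \Phi) & \cdots
\end{array}
\]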
We end this section with a diagram (see Fig. 1.1) that illustrates the relationship between the different notions of structure that we have introduced thus far. We remark that all inclusions appearing in this diagram are in fact proper inclusions.
1.3 Notions of randomness
Let us now discuss notions of randomness that serve as the counterparts to the notions of structure introduced in the previous section. We start with what can be regarded as the complementary notion to that of a Besicovitch rationally almost periodic function.
Aperiodic functions. Let $\Phi = (\Phi_N)_{N\in\mathbb{N}}$ be a Følner sequence. An arithmetic function f : N → C is aperiodic with respect to Φ if
\[ \lim_{N\to\infty} \frac{1}{|\Phi_N \cap (a\mathbb{N} + b)|} \sum_{n \in \Phi_N \cap (a\mathbb{N} + b)} f(n) = 0 \]
for all a ∈ N and all b ∈ {0, 1, . . . , a − 1}. Write $\mathrm{Aper}(\mathbb{N}, \Phi)$ for the space of aperiodic arithmetic functions.
Coperiodic functions. The next notion that we consider is a natural generalization of aperiodicity. For θ ∈ [0, 1) write $e(n\theta) := e^{2\pi i n\theta}$. We call f : N → C coperiodic with respect to Φ if
\[ \lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n\in\Phi_N} f(n) e(n\theta) = 0, \quad \forall \theta \in [0, 1). \]
Let $\mathrm{Coper}(\mathbb{N}, \Phi)$ be the set of all coperiodic functions with respect to Φ.
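The averages in this definition are easy to compute for finite N. The following sketch, along the initial intervals Φ_N = {1,...,N}, contrasts two test sequences chosen here for illustration and not taken from the text: the linear phase e(nα), which correlates perfectly with the frequency θ = 2 − α, and the quadratic phase e(n²α), whose twisted averages tend to 0 for every θ when α is irrational (a classical consequence of Weyl's equidistribution theorem). The value α = √2 and all function names are assumptions of this sketch, and a finite N only suggests the limiting behavior.

```python
import cmath
import math

ALPHA = math.sqrt(2)


def e(t):
    """Return exp(2*pi*i*t)."""
    return cmath.exp(2j * math.pi * t)


def twisted_average(f, theta, N):
    """(1/N) * sum over n = 1..N of f(n) * e(n * theta)."""
    return sum(f(n) * e(n * theta) for n in range(1, N + 1)) / N


def linear_phase(n):
    """e(n * alpha): a structured sequence."""
    return e(n * ALPHA)


def quadratic_phase(n):
    """e(n^2 * alpha): its twisted averages are expected to be small for every theta."""
    return e((n * n * ALPHA) % 1.0)


if __name__ == "__main__":
    N = 20000
    print(abs(twisted_average(linear_phase, 2 - ALPHA, N)))
    for theta in (0.0, 0.25, 2 - ALPHA):
        print(theta, abs(twisted_average(quadratic_phase, theta, N)))
```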
Conilsequences. Let k ∈ N. We say $f \in \ell^\infty(\mathbb{N})$ is a k-step conilsequence with respect to Φ if
\[ \lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n\in\Phi_N} f(n)\varphi(n) = 0 \]
for all $\varphi \in \mathrm{Nil}_k(\mathbb{N})$. Note that f : N → C is a 1-step conilsequence with respect to Φ if and only if it is coperiodic with respect to Φ. We use $\mathrm{Conil}_k(\mathbb{N}, \Phi)$ to denote the set of all bounded k-step conilsequences with respect to Φ.
Weak mixing functions. A function f : N → C is weak mixing with respect to Φ if, for every bounded function g : N → C and every subsequence $(N_k)_{k\in\mathbb{N}}$ with the property that
\[ \lim_{k\to\infty} \frac{1}{|\Phi_{N_k}|} \sum_{n\in\Phi_{N_k}} f(n+h) g(n) \]
exists for all h ∈ N, one has for all ε > 0,
\[ d\left( \left\{ h \in \mathbb{N} : \left| \lim_{k\to\infty} \frac{1}{|\Phi_{N_k}|} \sum_{n\in\Phi_{N_k}} f(n+h) g(n) \right| > \varepsilon \right\} \right) = 0. \]
Let $\mathrm{WM}(\mathbb{N}, \Phi)$ denote the set of all bounded arithmetic functions that are weak mixing with respect to Φ.
Example 11. A measure preserving system (X, B, µ, T) is called weakly mixing if for all A, B ∈ B one has $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} |\mu(T^{-n}A \cap B) - \mu(A)\mu(B)| = 0$. One can show that any f : N → C that is generated along Φ by a weakly mixing m.m.p.s. and satisfies $\lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n\in\Phi_N} f(n) = 0$ belongs to $\mathrm{WM}(\mathbb{N}, \Phi)$.
Uniform functions. The uniformity (semi)norms provide a very useful way of measuring pseudo-randomness. For different spaces there exist different but related versions of the uniformity norms, such as the Gowers uniformity norms for functions on a finite group (cf. [Gow01; Tao12]), the Gowers uniformity norms for functions on {1,...,N} (cf. [GTZ12; Tao12]), the Host-Kra norms on $L^\infty$-functions in the framework of measure preserving systems (see [HK05b]), and the localized Host-Kra seminorms introduced in [HK09] for functions belonging to $\ell^\infty(\mathbb{N})$. In this thesis we utilize the Gowers uniformity norms for functions on {1,...,N} and the localized Host-Kra seminorms on $\ell^\infty(\mathbb{N})$.
Given f : N → C and h ∈ N ∪ {0} define the multiplicative derivative $\partial_h f$ as
\[ \partial_h f(n) := f(n)\overline{f(n+h)}, \quad \forall n \in \mathbb{N}. \]
Let N ∈ N and let Z/NZ denote the finite cyclic group with N elements. Given k ∈ N, the Gowers uniformity norm $\|\cdot\|_{U^k_{\mathbb{Z}/N\mathbb{Z}}}$ on Z/NZ (cf. [Gow01; GTZ12]) is defined as
\[ \|f\|_{U^k_{\mathbb{Z}/N\mathbb{Z}}} := \left( \frac{1}{N^{k+1}} \sum_{n, h_1, \ldots, h_k \in \mathbb{Z}/N\mathbb{Z}} \partial_{h_k} \cdots \partial_{h_1} f(n) \right)^{1/2^k}. \]
Write [N] for the interval {1, 2, . . . , N}. To define the Gowers uniformity seminorm $\|\cdot\|_{U^s_{[N]}}$ for a function f : N → C, set $\tilde{N} := 2^s N$ and define a function $f_N : \mathbb{Z}/\tilde{N}\mathbb{Z} \to \mathbb{C}$ by $f_N(n) = f(n)$ for n ∈ [N] and $f_N(n) = 0$ for $n \in [\tilde{N}] \setminus [N]$ (where we identify $\mathbb{Z}/\tilde{N}\mathbb{Z}$ with the interval $[\tilde{N}]$). Also, let $1_{[N]}$ be the indicator function of the interval [N], and define (cf. Subsections A.1 and A.2 of Appendix A in [FH17a] and Appendix B in [GT10b])
\[ \|f\|_{U^s_{[N]}} := \frac{\|f_N\|_{U^s_{\mathbb{Z}/\tilde{N}\mathbb{Z}}}}{\|1_{[N]}\|_{U^s_{\mathbb{Z}/\tilde{N}\mathbb{Z}}}}. \tag{1.3.1} \]
A bounded function f : N → C is called $U^s$-uniform if $\|f\|_{U^s_{[N]}}$ converges to zero as N → ∞. A function f is called uniform if it is $U^s$-uniform for every s > 1. We use $\mathrm{Uni}(\mathbb{N})$ to denote the set of all uniform functions.
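For small N the Gowers uniformity norm on Z/NZ can be evaluated by brute force directly from its definition. The sketch below assumes the standard convention $\partial_h f(n) = f(n)\overline{f(n+h)}$, with complex conjugation on the shifted factor, and the exponent $1/2^k$; the function names are hypothetical. As a cross-check it also evaluates the $U^2$ norm through the classical identity $\|f\|_{U^2}^4 = \sum_\xi |\hat{f}(\xi)|^4$, where $\hat{f}(\xi) = \frac{1}{N}\sum_n f(n) e^{-2\pi i n\xi/N}$.

```python
import cmath
from itertools import product


def multiplicative_derivative(f, h):
    """(d_h f)(n) = f(n) * conj(f(n + h)) on Z/NZ, with f given as a list of length N."""
    N = len(f)
    return [f[n] * f[(n + h) % N].conjugate() for n in range(N)]


def gowers_norm(f, k):
    """Brute-force U^k norm on Z/NZ, averaging over n and all shifts h_1..h_k."""
    N = len(f)
    total = 0j
    for hs in product(range(N), repeat=k):
        g = f
        for h in hs:
            g = multiplicative_derivative(g, h)
        total += sum(g)
    # The average is real and non-negative up to rounding error.
    return max(total.real / N ** (k + 1), 0.0) ** (1 / 2 ** k)


def u2_via_fourier(f):
    """U^2 norm computed from the fourth moments of the Fourier coefficients."""
    N = len(f)
    fhat = [
        sum(f[n] * cmath.exp(-2j * cmath.pi * n * xi / N) for n in range(N)) / N
        for xi in range(N)
    ]
    return sum(abs(c) ** 4 for c in fhat) ** 0.25


if __name__ == "__main__":
    N = 11
    character = [cmath.exp(2j * cmath.pi * 3 * n / N) for n in range(N)]
    quadratic = [cmath.exp(2j * cmath.pi * n * n / N) for n in range(N)]
    print(gowers_norm(character, 2))
    print(gowers_norm(quadratic, 2), u2_via_fourier(quadratic))
    print(gowers_norm(quadratic, 3))
```

For prime N the quadratic phase has small $U^2$ norm but $U^3$ norm equal to 1, which illustrates why higher-order norms detect higher-order structure.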
The uniformity seminorm of order k of f associated to Φ is defined as
\[ \|f\|_{U^k(\Phi)} := \lim_{H\to\infty} \frac{1}{H^k} \sum_{h_1, \ldots, h_k = 1}^{H} \lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n\in\Phi_N} \partial_{h_k} \cdots \partial_{h_1} f(n) \tag{1.3.2} \]
whenever all involved limits exist. If the expression on the right hand side of (1.3.2) is not well defined then we say that $\|f\|_{U^k(\Phi)}$ is not well defined. We call an arithmetic function f : N → C k-step uniform with respect to Φ if $\|f\|_{U^k(\Phi)}$ is well defined and equals 0. We denote by $\mathrm{Uni}_k(\mathbb{N}, \Phi)$ the set of all arithmetic functions that are k-step uniform with respect to Φ. If f is k-step uniform with respect to Φ for every k > 1 then we say that f is uniform with respect to Φ.
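Remark 12. Recall that $\Phi^{(\text{Cesàro})}$ denotes the Følner sequence of initial intervals (see Example 4). We remark that any function $f \in \ell^\infty(\mathbb{N})$ that is k-step uniform with respect to $\Phi^{(\text{Cesàro})}$ is also $U^k$-uniform (see [HK, Chapter 22]). However, the reverse implication does not hold (see also [HK, Chapter 22]).
Figure 1.2: Notions of randomness
\[
\begin{array}{cccccccc}
\mathrm{Aper}(\mathbb{N}, \Phi) & \supset & \mathrm{Coper}(\mathbb{N}, \Phi) & \supset & \mathrm{Conil}_2(\mathbb{N}, \Phi) & \supset & \mathrm{Conil}_3(\mathbb{N}, \Phi) & \cdots \\
 & & \cup & & \cup & & \cup & \\
 & & \mathrm{WM}(\mathbb{N}, \Phi) & \supset & \mathrm{Uni}_2(\mathbb{N}, \Phi) & \supset & \mathrm{Uni}_3(\mathbb{N}, \Phi) & \cdots
\end{array}
\]
The diagram in Fig. 1.2 showcases the relationship between Aper, Coper, Conil, WM and Uni. All inclusions appearing in this diagram are strict.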
1.4 Decomposition theorems
In Section 1.2 we have introduced a variety of classes of arithmetic functions which we consider to be structured, and in Section 1.3 we have introduced a variety of classes of arithmetic functions which we consider to be pseudo-random. The goal of this section is to bring together these notions of structure and randomness by formulating decomposition theorems in the spirit of (1.2.1).
1.4.1 Decomposition theorems for arithmetic functions
Fix a Følner sequence Φ on N. The first decomposition results that we present in this section are rather general and apply to any function in the space
\[ L^2(\mathbb{N}, \Phi) := \{f : \mathbb{N} \to \mathbb{C} : \|f\|_\Phi < \infty\}, \]
where $\|\cdot\|_\Phi$ is the Besicovitch seminorm defined in (1.2.2). We begin with a theorem involving Besicovitch almost periodic functions which was obtained by the author in joint work with Joel Moreira and Donald Robertson in [MRR18].
Theorem 13 ([MRR18]). For every Følner sequence Φ on N and any $f \in L^2(\mathbb{N}, \Phi)$ there is a Følner subsequence Ψ of Φ and functions $f_{\mathrm{Bes}} \in \mathrm{Bes}(\mathbb{N}, \Psi)$ and $f_{\mathrm{Coper}} \in \mathrm{Coper}(\mathbb{N}, \Psi)$ such that $f = f_{\mathrm{Bes}} + f_{\mathrm{Coper}}$. Moreover, $f_{\mathrm{Bes}}$ minimizes the distance between f and $\mathrm{Bes}(\mathbb{N}, \Psi)$ in the sense that $\|f - f_{\mathrm{Bes}}\|_\Psi = \inf\{\|f - g\|_\Psi : g \in \mathrm{Bes}(\mathbb{N}, \Psi)\}$.
Instead of providing a proof of Theorem 13, we will establish a more general decomposition result for $L^2(\mathbb{N}, \Phi)$ that contains Theorem 13 as a special case. We will need the following definition.
Definition 14. Suppose that for every Følner sequence Φ we are given a subspace U(Φ) of $L^2(\mathbb{N}, \Phi)$ satisfying the following properties:
• U(Φ) contains the constant functions and is closed under pointwise complex conjugation;
• for all u, v ∈ U(Φ) the inner product $\langle u, v \rangle_\Phi$ exists;
• if u, v ∈ U(Φ) are real valued, then the function $n \mapsto \max\{u(n), v(n)\}$ is in U(Φ);
• U(Φ) is closed with respect to $\|\cdot\|_\Phi$;
• if Ψ is a subsequence of Φ then U(Ψ) ⊃ U(Φ).
Call any such assignment U of subspaces to Følner sequences a projection family. Given a projection family one can consider, for each Følner sequence Φ, the subspace
\[ U(\Phi)^\perp := \{ v \in L^2(\mathbb{N}, \Phi) : \langle u, v \rangle_\Phi \text{ exists and equals } 0 \text{ for all } u \in U(\Phi) \} \]
of $L^2(\mathbb{N}, \Phi)$.
It is straightforward to verify that $\Phi \mapsto \mathrm{Bes}(\mathbb{N}, \Phi)$ is a projection family. In light of this fact, the following result is a generalization of Theorem 13.
Theorem 15 ([MRR18]). Let U be a projection family and let Φ be a Følner sequence. For every $f \in L^2(\mathbb{N}, \Phi)$ there exists a subsequence Ψ of Φ and $f_U \in U(\Psi)$ such that $f - f_U \in U(\Psi)^\perp$. Moreover, $f_U$ minimizes the distance between f and U(Ψ) in the sense that $\|f - f_U\|_\Psi = \inf\{\|f - g\|_\Psi : g \in U(\Psi)\}$.
We give a proof of Theorem 15 in Subsection 2.1.2.
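The shape of Theorem 15 (a structured part, an orthogonal remainder, and a distance-minimizing property) has an elementary finite-dimensional analogue, sketched below. This is a toy model only and not the argument of Subsection 2.1.2: the space $L^2(\mathbb{N}, \Phi)$ is replaced by real-valued functions on Z/NZ with the normalized inner product, and the subspace U by the q-periodic functions for a divisor q of N, which contain the constants and are closed under pointwise maxima. All names are hypothetical. The projection is averaging over residue classes modulo q.

```python
from fractions import Fraction


def project_onto_periodic(f, q):
    """Average f over residue classes mod q: orthogonal projection onto q-periodic functions."""
    N = len(f)
    assert N % q == 0, "q must divide the length of f"
    class_means = [Fraction(sum(f[r::q]), N // q) for r in range(q)]
    return [class_means[n % q] for n in range(N)]


def inner(u, v):
    """Normalized inner product (1/N) * sum of u(n) * v(n) for real-valued u, v."""
    return Fraction(sum(a * b for a, b in zip(u, v)), len(u))


def distance_squared(u, v):
    """Squared normalized L^2 distance between u and v."""
    diff = [a - b for a, b in zip(u, v)]
    return inner(diff, diff)


if __name__ == "__main__":
    N, q = 12, 3
    f = [(n * n) % 7 for n in range(N)]
    f_U = project_onto_periodic(f, q)
    remainder = [a - b for a, b in zip(f, f_U)]
    # The remainder is orthogonal to the indicator of every residue class mod q.
    print([inner(remainder, [int(n % q == r) for n in range(N)]) for r in range(q)])
    # Any other q-periodic g is at least as far from f as f_U is.
    g = [n % q for n in range(N)]
    print(distance_squared(f, f_U), distance_squared(f, g))
```

In this toy setting the minimizing property follows from the Pythagorean identity, since the remainder is orthogonal to the difference between $f_U$ and any other q-periodic function.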
2 The next decomposition theorem, which represents any f ∈ L (N, Φ) as a sum of a weak mixing function and a compact function, also first appeared in [MRR18] and can be viewed
13 as a discrete version of the Jacobs–de Leeuw–Glicksberg splitting [Jac56; LG61] on Banach spaces (see also [Kre85, Chapter 2.4] and [EFHN15, Example 16.25]).
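Theorem 16 ([MRR18]). For every Følner sequence $\Phi$ on $\mathbb{N}$ and every $f \in L^2(\mathbb{N}, \Phi)$ there is a Følner subsequence $\Psi$ of $\Phi$ and functions $f_{\mathrm{Comp}} \in \mathrm{Comp}(\mathbb{N}, \Psi)$ and $f_{\mathrm{WM}} \in \mathrm{WM}(\mathbb{N}, \Psi)$ such that $f = f_{\mathrm{Comp}} + f_{\mathrm{WM}}$. Moreover, $f_{\mathrm{Comp}}$ minimizes the distance between $f$ and $\mathrm{Comp}(\mathbb{N}, \Psi)$ in the sense that $\|f - f_{\mathrm{Comp}}\|_\Psi = \inf\{\|f - g\|_\Psi : g \in \mathrm{Comp}(\mathbb{N}, \Psi)\}$.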
Just as compact and weakly mixing functions can be thought of as the measure-theoretic analogues of Besicovitch almost periodic and coperiodic functions respectively, Theorem 16 can be interpreted as a measure-theoretic analogue of Theorem 13.
Theorems 13 and 16 are instrumental in the proof of the Erdős sumset conjecture; this is discussed in more detail in Subsection 1.5.2 below.
1.4.2 Decomposition theorems for multicorrelation sequences
An intimate connection between Szemerédi’s theorem and dynamical systems was discovered by Furstenberg, who recast the problem of finding arithmetic progressions in sets of positive density in terms of ergodic theory. In [Fur77] Furstenberg demonstrated that Szemerédi’s theorem follows from a sophisticated generalization of the classical Poincaré recurrence theorem. This generalization is known as Furstenberg’s multiple recurrence theorem and states that for any probability measure preserving system $(X, B, \mu, T)$, any $k \in \mathbb{N}$, and any measurable set $A \in B$ with $\mu(A) > 0$, there exists an integer $n \in \mathbb{N}$ such that
$$\mu\big(A \cap T^{-n}A \cap T^{-2n}A \cap \ldots \cap T^{-kn}A\big) > 0. \tag{1.4.1}$$
In Subsection 1.5.1 we explain in more detail how (1.4.1) implies Szemerédi’s theorem.
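The quantity in (1.4.1) can be evaluated by hand in a toy system. The following minimal sketch is an illustration only and is not taken from the thesis: it assumes the rotation $Tx = x + 1$ on $\mathbb{Z}/m\mathbb{Z}$ with normalized counting measure, and the function name `multiple_intersection_measure`, the modulus `m` and the set `A` are hypothetical choices.

```python
from fractions import Fraction

def multiple_intersection_measure(A, m, n, k):
    """Return mu(A ∩ T^{-n}A ∩ ... ∩ T^{-kn}A) for T x = x + 1 on Z/mZ.

    mu is the normalized counting measure, and x lies in T^{-jn}A
    exactly when x + j*n (mod m) lies in A.
    """
    A = {a % m for a in A}
    hits = sum(
        1 for x in range(m)
        if all((x + j * n) % m in A for j in range(k + 1))
    )
    return Fraction(hits, m)

m, k = 12, 2
A = {0, 3, 6, 7}  # hypothetical set of measure 4/12
values = {n: multiple_intersection_measure(A, m, n, k) for n in range(1, m + 1)}
# All n for which the multiple intersection has positive measure.
recurrent_n = [n for n, v in values.items() if v > 0]
```

In a finite system the multiples of `m` always give positive measure, so the example is interesting only for the other return times, such as `n = 3` here, which corresponds to the progression 0, 3, 6 inside `A`.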
Actually, Furstenberg proved something more general than (1.4.1). He showed that
$$\liminf_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N-1} \mu\big(A \cap T^{-n}A \cap T^{-2n}A \cap \ldots \cap T^{-kn}A\big) > 0. \tag{1.4.2}$$
In Subsection 1.5.1 we obtain several refinements of Szemerédi’s theorem by considering variations of (1.4.2). Our work in this direction utilizes fine properties of multicorrelation sequences, i.e., sequences
$$\alpha(n) = \int_X f_0 \cdot T^{n} f_1 \cdot \ldots \cdot T^{kn} f_k \, d\mu, \tag{1.4.3}$$
where $f_0, \ldots, f_k \in L^\infty(X)$. Note that the sequence $n \mapsto \mu\big(A \cap T^{-n}A \cap T^{-2n}A \cap \ldots \cap T^{-kn}A\big)$ appearing in (1.4.1) and (1.4.2) is, in fact, a multicorrelation sequence, which can be seen by choosing $f_0 = \ldots = f_k = 1_A$ in (1.4.3). Bergelson, Host and Kra established in [BHK05] a decomposition of multicorrelation sequences $\alpha(n) = \int_X f_0 \cdot T^{n} f_1 \cdot \ldots \cdot T^{kn} f_k \, d\mu$ in the following way:
$$\alpha(n) = \varphi(n) + \omega(n) \tag{1.4.4}$$
where $\varphi$ is a $(k+1)$-step nilsequence (in other words $\varphi \in \mathrm{Nil}_{k+1}$) and $\omega$ is a null-sequence, i.e., $\lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} |\omega(n)| = 0$. Our first decomposition theorem involving multicorrelation sequences concerns a spectral refinement of the Bergelson-Host-Kra decomposition
(1.4.4) and first appeared in [MR16b]. For the formulation we need to recall the notions of the spectrum of a system and of the spectrum of a sequence.
Definition 17. The discrete spectrum $\sigma(T)$ of a probability measure preserving system $(X, B, \mu, T)$ is the set of eigenvalues $\theta \in \mathbb{T} := \mathbb{R}/\mathbb{Z}$ for which there exists a non-zero eigenfunction $g \in L^2(X, B, \mu)$ satisfying $Tg := g \circ T = e(\theta)g$, where $e(\theta) := e^{2\pi i \theta}$. The spectrum $\sigma(f)$ of an arithmetic function $f : \mathbb{N} \to \mathbb{C}$ is the set of frequencies $\theta \in \mathbb{T}$ for which
$$\limsup_{N\to\infty} \left| \frac{1}{N} \sum_{n=1}^{N} f(n) e(-\theta n) \right| > 0.$$
By examining the nilsystem from which the nilsequence in (1.4.4) arises, we will show that the spectrum of the multicorrelation sequence $\alpha(n)$ is contained in the discrete spectrum of its originating system $(X, B, \mu, T)$.
Theorem 18 ([MR16a]). Let $k \in \mathbb{N}$, let $(X, B, \mu, T)$ be an ergodic measure preserving system and let $f_0, f_1, \ldots, f_k \in L^\infty(X)$. For every $\varepsilon > 0$ there exists a decomposition of the form
$$\alpha(n) := \int_X f_0 \cdot T^{n} f_1 \cdot \ldots \cdot T^{kn} f_k \, d\mu = \varphi(n) + \omega(n) + \gamma(n),$$
where $\omega(n)$ is a null-sequence, $\gamma$ satisfies $\sup_{n \in \mathbb{N}} |\gamma(n)| < \varepsilon$ and $\varphi(n) = F(R^n y)$ for some $F \in C(Y)$ and $y \in Y$, where $(Y, R)$ is a $k$-step nilsystem whose discrete spectrum is contained in the discrete spectrum of $(X, B, \mu, T)$.
The following is an immediate corollary of Theorem 18.
Corollary 19 ([MR16a]). Under the same assumptions as in Theorem 18, the spectrum
σ(α) of the multicorrelation sequence α is contained in the discrete spectrum σ(T ) of the underlying system (X, B, µ, T ).
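The containment of spectra in Corollary 19 can be observed numerically in a toy case. The sketch below is an illustration under stated assumptions, not part of the thesis: it takes the ergodic rotation $Tx = x + 1$ on $\mathbb{Z}/m\mathbb{Z}$, whose eigenvalues are the frequencies $j/m$, and replaces the limit superior of Definition 17 by a single finite average over a whole number of periods. The names `correlation`, `fourier_average`, `m` and `A` are hypothetical.

```python
import cmath

def correlation(A, m, n, k):
    """alpha(n) = mu(A ∩ T^{-n}A ∩ ... ∩ T^{-kn}A) for T x = x + 1 on Z/mZ."""
    A = {a % m for a in A}
    hits = sum(
        1 for x in range(m)
        if all((x + j * n) % m in A for j in range(k + 1))
    )
    return hits / m

def fourier_average(seq, theta):
    """|1/N * sum_{n=1}^{N} seq[n-1] * e(-theta*n)| with e(t) = exp(2*pi*i*t)."""
    N = len(seq)
    total = sum(
        val * cmath.exp(-2j * cmath.pi * theta * n)
        for n, val in enumerate(seq, start=1)
    )
    return abs(total / N)

m, k = 12, 2
A = {0, 3, 6, 7}      # hypothetical set of positive measure
N = 50 * m            # a whole number of periods of alpha
alpha = [correlation(A, m, n, k) for n in range(1, N + 1)]

at_eigenvalue = fourier_average(alpha, 0.0)  # theta = 0 is an eigenvalue of T
off_spectrum = fourier_average(alpha, 0.1)   # 0.1 is not of the form j/12
```

Since `alpha` is periodic with period `m`, its average against `e(-theta*n)` vanishes (up to rounding) at `theta = 0.1`, while it stays bounded away from zero at the eigenvalue `theta = 0`.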
In Section 1.5 we derive from Theorem 18 and Corollary 19 a useful multiple ergodic theorem (see Theorem 28) and also formulate an application to additive combinatorics (see
Corollary 29).
Next, we turn our attention to a polynomial version of a multiple correlation sequence. A polynomial multiple correlation sequence is a sequence of the form
$$\alpha(n) = \int_X f_0 \cdot T^{p_1(n)} f_1 \cdot \ldots \cdot T^{p_k(n)} f_k \, d\mu$$
where $f_0, f_1, \ldots, f_k \in L^\infty(X)$ and $p_1, \ldots, p_k \in \mathbb{Z}[x]$. Polynomial multiple correlation sequences are connected to Sárközy’s theorem and to Bergelson-Leibman’s polynomial extension of Szemerédi’s theorem [BL96]. A generalization of (1.4.4) from linear to polynomial multiple correlation sequences was obtained by Leibman in [Lei10a; Lei15]. In Subsection
2.2.8 we prove the following refinement of Leibman’s theorem.
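Theorem 20. Fix $k \in \mathbb{N}$. Let $(X, B, \mu, T)$ be an ergodic measure preserving system, $A \in B$ with $\mu(A) > 0$ and $p_1, \ldots, p_k \in \mathbb{Z}[x]$ with $p_i(0) = 0$ for all $i = 1, \ldots, k$. Then there exists $\delta > 0$ such that for every $\varepsilon > 0$ there exists a decomposition of the form
$$\alpha(n) := \int_X 1_A \cdot T^{p_1(n)} 1_A \cdot \ldots \cdot T^{p_k(n)} 1_A \, d\mu = \rho(n) + \varphi(n) + \omega(n) + \gamma(n),$$
where
• $\omega(n)$ is a null-sequence,
• $\gamma$ satisfies $\sup_{n \in \mathbb{N}} |\gamma(n)| < \varepsilon$,
• $\rho$ is a periodic sequence with the property that
$$\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \rho(qn) > \delta, \quad \forall q \in \mathbb{N},$$
• and $\varphi(n) \in \mathrm{Nil}_s(\mathbb{N}) \cap \mathrm{Coper}(\mathbb{N}, \Phi^{(\text{Cesàro})})$ for some $s \in \mathbb{N}$.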
Theorem 20 is instrumental in the proofs of Theorems 32 and 35 and their combinatorial corollaries, namely Corollaries 33, 34, 37 and 38, all of which are formulated in Section 1.5 below.
1.4.3 A dichotomy theorem for multiplicative functions and a structure theorem for level sets of multiplicative functions
An arithmetic function $f : \mathbb{N} \to \mathbb{C}$ is called multiplicative if $f(1) = 1$ and $f(mn) = f(m) \cdot f(n)$ for all relatively prime $m, n \in \mathbb{N}$, and it is called completely multiplicative if $f(1) = 1$ and $f(mn) = f(m) \cdot f(n)$ for all $m, n \in \mathbb{N}$. Let $M$ denote the set of all multiplicative functions bounded in modulus by 1. We start this subsection by formulating a dichotomy theorem for the class $M_0$ of all multiplicative functions $f \in M$ with the property that $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} f(qn + r)$ exists for all $q, r \in \mathbb{N}$. This dichotomy theorem for $M_0$ is taken from [BKPLR17]. It follows quickly by combining the work of Daboussi and Delange [DD74; DD82; Del72], Frantzikinakis and Host [FH17a] and Bellow and Losert [BL85], and is closely related to [FH17a, Theorem 1.1]. For convenience, we will refer to functions $f : \mathbb{N} \to \mathbb{C}$ that are Besicovitch rationally almost periodic with respect to the Følner sequence of initial intervals $\Phi^{(\text{Cesàro})}$ simply as Besicovitch rationally almost periodic functions.
Theorem 21 ([BKPLR17]). Let $f \in M_0$. Then either
(i) $f$ is Besicovitch rationally almost periodic, or
(ii) $f$ is uniform.
Given a multiplicative function $f : \mathbb{N} \to \mathbb{C}$ and a point $z \in \mathbb{C}$ let $E(f, z)$ denote the set of solutions to the equation $f(n) = z$, i.e.,
$$E(f, z) := \{n \in \mathbb{N} : f(n) = z\}.$$
We will refer to E(f, z) as a level set of f and we use D to denote the collection of all sets of the form E(f, z), where f ranges over all multiplicative functions and z ranges over all complex numbers.
Example 22. Examples of sets belonging to $D$ include many classical sets of number-theoretical origin, such as: the squarefree numbers $Q := \{n \in \mathbb{N} : p^2 \nmid n \text{ for all primes } p\}$, the multiplicatively even numbers $E := \{n \in \mathbb{N} : \Omega(n) \text{ is even}\}$ and the multiplicatively odd numbers $O := \{n \in \mathbb{N} : \Omega(n) \text{ is odd}\}$, where $\Omega(n)$ denotes the number of prime factors of $n$ counted with multiplicities. In Subsection 2.3.1 we provide more examples of sets in $D$ (see
Example 75).
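The sets in Example 22 can be listed explicitly as level sets. The sketch below is a minimal illustration and not part of the thesis: it assumes the standard choices $\mu^2$ (the indicator of the squarefree numbers, a multiplicative function) and the Liouville function $\lambda(n) = (-1)^{\Omega(n)}$ (a completely multiplicative function), and the helper names `omega_big`, `mu_squared`, `liouville` and `level_set` are hypothetical.

```python
def omega_big(n):
    """Omega(n): the number of prime factors of n counted with multiplicity."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    if n > 1:  # whatever remains is a single prime factor
        count += 1
    return count

def mu_squared(n):
    """Indicator of the squarefree numbers (a multiplicative function)."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return 0
        p += 1
    return 1

def liouville(n):
    """Liouville function (-1)^Omega(n) (completely multiplicative)."""
    return (-1) ** omega_big(n)

def level_set(f, z, N):
    """The part of E(f, z) = {n : f(n) = z} lying in {1, ..., N}."""
    return [n for n in range(1, N + 1) if f(n) == z]

N = 30
Q = level_set(mu_squared, 1, N)      # squarefree numbers up to N
E = level_set(liouville, 1, N)       # multiplicatively even numbers up to N
O = level_set(liouville, -1, N)      # multiplicatively odd numbers up to N
```

With these choices, `Q`, `E` and `O` are the initial segments of $E(\mu^2, 1)$, $E(\lambda, 1)$ and $E(\lambda, -1)$ respectively.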
The next result is a structure theorem for sets belonging to D in the spirit of Theorem 21 and was first obtained in [BKPLR17]. To formulate this theorem we need to introduce set-theoretic analogues of uniform functions and of Besicovitch rationally almost periodic functions.
Definition 23.
(i) Let us call a set $A \subset \mathbb{N}$ uniform if $d(A) := \lim_{N\to\infty} \frac{|A \cap [N]|}{N}$ exists and the function $1_A - d(A)$ is a uniform function. Given $E, R \subset \mathbb{N}$ for which the densities $d(E)$ and $d(R)$ exist, we say that $E$ is uniform relative to $R$ if $E \subset R$ and the function $d(R)1_E - d(E)1_R$ is uniform. Note that a set $A$ is uniform if and only if it is uniform relative to $\mathbb{N}$.
(ii) A set $A \subset \mathbb{N}$ is called rational if for every $\varepsilon > 0$ there exists a set $B$ that is a union of finitely many arithmetic progressions such that $d(A \triangle B) < \varepsilon$ (see [BR02, Definition 2.1] and [BKPLR16]). Equivalently, a set $A$ is rational if and only if its indicator function is Besicovitch rationally almost periodic.
(iii) Let $D_{\mathrm{rat}}$ be the collection of all level sets $E(f, z)$, where $f$ is a Besicovitch rationally almost periodic multiplicative function and $z$ is an arbitrary complex number. We show in Subsection 3.3.1 that any set in $D_{\mathrm{rat}}$ is a rational set.
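The notion of a rational set in Definition 23 (ii) can be explored numerically. The following sketch is a hedged illustration, not the argument of the thesis: it assumes the squarefree numbers $Q$ as the test set and approximates them by the periodic sets $B_P := \{n : p^2 \nmid n \text{ for all primes } p \le P\}$, each of which is a finite union of arithmetic progressions. The density of the symmetric difference is replaced by its relative frequency in $\{1, \ldots, N\}$, and all function names are hypothetical.

```python
def squarefree_flags(N):
    """flags[n] is True exactly when n is squarefree, for 1 <= n <= N."""
    flags = [True] * (N + 1)
    p = 2
    while p * p <= N:
        for multiple in range(p * p, N + 1, p * p):
            flags[multiple] = False
        p += 1
    return flags

def primes_up_to(P):
    """All primes p <= P, found by trial division."""
    return [p for p in range(2, P + 1)
            if all(p % q != 0 for q in range(2, int(p ** 0.5) + 1))]

def in_periodic_approx(n, primes):
    """Membership in B_P: n is not divisible by p^2 for any listed prime p."""
    return all(n % (p * p) != 0 for p in primes)

def sym_diff_frequency(N, P, flags):
    """Relative frequency in {1, ..., N} of the symmetric difference Q △ B_P."""
    primes = primes_up_to(P)
    mismatches = sum(
        1 for n in range(1, N + 1)
        if flags[n] != in_periodic_approx(n, primes)
    )
    return mismatches / N

N = 10000
flags = squarefree_flags(N)
errors = [sym_diff_frequency(N, P, flags) for P in (2, 3, 5, 7)]
```

Because $Q \subset B_P$ and the sets $B_P$ shrink as $P$ grows, the entries of `errors` decrease, which is the behaviour one expects of approximations to a rational set.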
Before we state our main result regarding level sets of multiplicative functions, we remark that the density $d(R)$ of any rational set $R$ exists, which follows quickly from the definition of rationality, and the density $d(E)$ of any set $E$ in $D$ also exists, which is a result established in [Ruz77] (see Corollary 86 below).
Theorem 24. For any set $E \in D$ with positive density there exists $R \in D_{\mathrm{rat}}$ such that $E$ is uniform relative to $R$. If $d(E) \neq 1$ then $R \in D_{\mathrm{rat}}$ with this property is unique.
As an immediate corollary of Theorem 24 we obtain the following decomposition theorem for indicator functions of level sets of multiplicative functions:
Corollary 25. For any set $E \in D$ there exist $f_{\mathrm{rat}} \in \mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi^{(\text{Cesàro})})$ and $f_{\mathrm{uni}} \in \mathrm{Uni}(\mathbb{N})$ such that $1_E = f_{\mathrm{rat}} + f_{\mathrm{uni}}$.
Theorem 24 allows us to study multiple ergodic averages along level sets of multiplicative functions, such as
$$\frac{1}{N} \sum_{n=1}^{N} 1_E(n)\, T^{-p_1(n)} f_1 \cdots T^{-p_\ell(n)} f_\ell, \tag{1.4.5}$$
where $T$ is an invertible measure preserving transformation on a probability space $(X, B, \mu)$, $f_1, \ldots, f_\ell \in L^\infty(X, B, \mu)$, $p_1, \ldots, p_\ell$ are polynomials with integer coefficients and $E$ belongs to $D$. This, in turn, leads to new enhancements of Szemerédi’s theorem. Our results in this direction are presented in the next section (see Theorem 35 and Corollaries 37 and 38).
1.5 Main Results
1.5.1 Generalizations of Furstenberg’s multiple recurrence theorem and refinements of Szemerédi’s theorem
As was mentioned in Subsection 1.4.2, Furstenberg showed in [Fur77] that (1.4.1) implies Szemerédi’s theorem. This is done through what today is called the Furstenberg correspondence principle.
Proposition 26 (Furstenberg’s correspondence principle, see [Fur81, Lemma 3.17] and Lemma 47 below). Let $E \subset \mathbb{N}$ be a set with positive upper density $\overline{d}(E) > 0$. Then there exist an invertible measure preserving system $(X, B, \mu, T)$ and a set $A \in B$ with $\mu(A) > \overline{d}(E)$ such that for all $n_1, \ldots, n_k \in \mathbb{N}$, one has