THE DICHOTOMY BETWEEN STRUCTURE AND RANDOMNESS AND APPLICATIONS TO COMBINATORIAL NUMBER THEORY

DISSERTATION

Presented in partial fulfillment of the requirements for the degree Doctor of Philosophy in the Graduate School of The Ohio State University

By Florian K. Richter, MASt, BSc

Graduate Program in Mathematics

The Ohio State University 2018

Dissertation Committee: Vitaly Bergelson, Advisor; Alexander Leibman; David Penneys

Copyright by Florian K. Richter 2018

ABSTRACT

The study of the long-term behavior of dynamical systems has far-reaching applications to other areas of mathematics. The employment of analytic tools coming from measurable, topological, and symbolic dynamics offers novel possibilities for analyzing seemingly static number-theoretic and combinatorial situations and has proven to be a powerful method in solving numerous open problems in Ramsey theory and combinatorial number theory that previously appeared to be intractable.

In this thesis we develop new techniques that are inspired by dynamical heuristics and lead to a variety of applications in discrete mathematics. One theme featured prominently in this work is the idea of dichotomy between structure and randomness. This dichotomy manifests itself via decomposition theorems that deal with splittings of arithmetic functions into two components, one of which is structured and the other is pseudo-random. From these decomposition theorems we then derive results in ergodic theory and density Ramsey theory.

Among other things, we obtain generalizations and refinements of Szemerédi’s theorem and Sárközy’s theorem, and present a solution to a long-standing open sumset conjecture of Erdős.

ACKNOWLEDGEMENTS

My accomplishments as a graduate student would not have been possible without the support of the kind people around me.

First and foremost, I would like to express my sincere gratitude to Vitaly Bergelson, my advisor, whose mentorship has formed me as a mathematician. His never-ending patience and care towards me and all his other students has set a great example of how to become an effective and successful advisor. I could not have imagined having a better mentor for my Ph.D. studies.

I also want to thank the members of the Ohio State University mathematics department, in particular Alexander Leibman, Nimish Shah and Daniel Thompson, for their help and guidance throughout the years. My thanks also go to Mariusz Lemańczyk. I benefited greatly from my collaboration with him.

I am grateful to the graduate school at the Ohio State University for awarding me the Presidential Fellowship, and to the Phi Kappa Phi Honor Society for awarding me the Louise B. Vetter Award, which provided me with the time and resources to finish this dissertation.

To my colleagues, co-authors and friends, including Daniel Glasscock, John H. Johnson, Andreas Koutsogiannis, Joanna Kułaga-Przymus, Joel Moreira and Donald Robertson, thank you for sharing with me your skills and enthusiasm for mathematics.

Finally, I would like to thank my loving family for their help and support and for all the sacrifices that they have made to help me achieve my goals. Thank you!

VITA

June 2010 ...... Bachelor of Science, Technische Universität Wien

June 2011 ...... Master of Advanced Studies in Mathematics, Cambridge University

September 2012 to Present ...... Graduate Teaching Associate, Department of Mathematics, Ohio State University

PUBLICATIONS

• V. Bergelson, F. K. Richter. “On the density of coprime tuples of the form $(n, \lfloor f_1(n) \rfloor, \ldots, \lfloor f_k(n) \rfloor)$, where $f_1, \ldots, f_k$ are functions from a Hardy field”. In: Number Theory – Diophantine Problems, Uniform Distribution and Applications, Festschrift in Honour of Robert F. Tichy’s 60th Birthday (C. Elsholtz and P. Grabner, eds.), Springer International Publishing, Cham, (2017), pp. 109 – 135.

• J. H. Johnson, F. K. Richter. “Revisiting the Nilpotent Polynomial Hales-Jewett Theorem”. In: Advances in Mathematics 321 (2017), pp. 269 – 286.

• J. Moreira, F. K. Richter. “Large subsets of discrete hypersurfaces in $\mathbb{Z}^d$ contain arbitrarily many collinear points”. In: European Journal of Combinatorics 54 (2016), pp. 163 – 176.

FIELDS OF STUDY

Major Field: Mathematics

Specialization: Ergodic Theory, Additive and Combinatorial Number Theory

TABLE OF CONTENTS

Abstract
Acknowledgements
Vita
List of Figures

1 Introduction
1.1 The dichotomy between structure and randomness
1.2 Notions of structure
1.3 Notions of randomness
1.4 Decomposition theorems
1.4.1 Decomposition theorems for arithmetic functions
1.4.2 Decomposition theorems for multicorrelation sequences
1.4.3 A dichotomy theorem for multiplicative functions and a structure theorem for level sets of multiplicative functions
1.5 Main Results
1.5.1 Generalizations of Furstenberg’s multiple recurrence theorem and refinements of Szemerédi’s theorem
1.5.2 The Erdős sumset conjecture

2 Proofs of decomposition theorems
2.1 Proofs of decomposition theorems for $L^2(\mathbb{N}, \Phi)$
2.1.1 A completeness lemma for $L^2(\mathbb{N}, \Phi)$
2.1.2 Decomposing functions from $L^2(\mathbb{N}, \Phi)$ into almost periodic and coperiodic components
2.1.3 The Jacobs–de Leeuw–Glicksberg splitting for $L^2(\mathbb{N}, \Phi)$
2.2 Proofs of decomposition theorems for multicorrelation sequences
2.2.1 Preliminaries on nilmanifolds, nilsystems and nilsequences
2.2.2 Preliminaries on almost periodic functions
2.2.3 Proving Theorem 18 for the special case of nilsystems
2.2.4 Host-Kra-Ziegler factors
2.2.5 Spectrum of the orbit of the diagonal
2.2.6 A useful reduction
2.2.7 The subgroup H
2.2.8 A Theorem for eliminating the rational spectrum
2.3 Multiplicative functions and their level sets
2.3.1 Preliminaries
2.3.2 Dichotomy theorem for $M_0$
2.3.3 Structure theorem for $D$

3 Applications of decomposition theorems to the theory of multiple recurrence and to combinatorial number theory
3.1 Multiple ergodic averages along Beatty sequences and a proof of Theorem 28
3.2 Multiple ergodic averages along rational sets and applications
3.2.1 Rational sequences are good weights for polynomial multiple convergence
3.2.2 Divisible rational sets are good for polynomial multiple recurrence
3.2.3 Applications to additive combinatorics
3.3 Multiple ergodic averages along level sets of multiplicative functions and applications to ergodic theory and combinatorics
3.3.1 The class $D_{\mathrm{rat}}$
3.3.2 Proofs of Theorem 35 and 36
3.4 Completing the proof of the Erdős sumset conjecture
3.4.1 An ultrafilter reformulation of the Erdős sumset conjecture
3.4.2 Proving the ultrafilter reformulation
3.4.3 Establishing properties U1 - U4
3.4.4 An application of the pointwise ergodic theorem
3.4.5 A variant of an argument of Beiglböck

Bibliography

LIST OF FIGURES

1.1 Notions of structure
1.2 Notions of randomness

CHAPTER 1

INTRODUCTION

1.1 The dichotomy between structure and randomness

Density Ramsey theory is a rich and active area of research in mathematics at the interface of combinatorics and measure theory. Broadly speaking, it deals with finding arithmetic, geometric or combinatorial patterns in large subsets of spaces that admit a natural notion of density. The most classical and most studied space of this kind – and also the space that we focus on in this dissertation – is the set of positive integers N := {1, 2, 3,...}. On N, a notion of density that is natural to consider is the so-called upper density. Given a subset

A ⊂ N, the upper density of A is defined as

\[ d(A) := \limsup_{N \to \infty} \frac{|A \cap \{1, \ldots, N\}|}{N}. \]
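As a small numerical illustration of this definition, the following Python sketch computes the finite ratios |A ∩ {1,...,N}| / N for two sets. The function names are hypothetical and chosen only for this example, and finite ratios can merely suggest the value of the limit superior; they do not compute it.

```python
from math import isqrt

def density_up_to(indicator, N):
    """Return |A ∩ {1,...,N}| / N, where A = {n : indicator(n) is True}."""
    return sum(1 for n in range(1, N + 1) if indicator(n)) / N

def is_even(n):
    return n % 2 == 0

def is_square(n):
    r = isqrt(n)
    return r * r == n

# The even numbers keep ratio 1/2, while the ratios for the
# perfect squares shrink like 1/sqrt(N).
for N in (100, 10_000, 1_000_000):
    print(N, density_up_to(is_even, N), density_up_to(is_square, N))
```

The even numbers are an example of a set of positive upper density, whereas the perfect squares have upper density zero, so the theorems below say nothing about them.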

One of the basic principles of density Ramsey theory is that any set A with d(A) > 0 is combinatorially and arithmetically rich. Two celebrated theorems that showcase this principle are Szemerédi’s theorem on arithmetic progressions and Sárközy’s theorem.

Szemerédi’s theorem ([Sze75]). Any set A ⊂ N with d(A) > 0 contains arbitrarily long arithmetic progressions.

Sárközy’s theorem ([Sár78], see also [Fur77]). Any set A ⊂ N with d(A) > 0 contains two elements whose difference is a perfect square.

How does one prove results such as Szemerédi’s theorem or Sárközy’s theorem? The class of all subsets of N with positive upper density is so large that, if no further restrictions are made, the nature of an arbitrary set A with d(A) > 0 can be rather intricate and difficult to describe, especially since A might exhibit a blend of different qualities. An effective approach for proving such theorems is to decompose A into manageable pieces.

At the center of any such approach lies a decomposition theorem which guarantees that an arbitrary set, no matter how disorderly, can be viewed as a “superposition” of well-behaved components. We will focus on decomposition theorems that are based on a rather simple but powerful idea: the dichotomy between structure and randomness.

Sometimes also interpreted as the dichotomy between order and chaos, this heuristic plays a central role in all known proofs of Szemerédi’s theorem [Sze75; Fur77; Gow01; Tao06; GT10c], where it manifests itself in diverse ways, such as, for example, a regularity lemma [Sze78; RS04; Gow07; LS07; GT10a], an inverse theorem for uniformity norms [GTZ12; HK09], or a structure theorem for arbitrary probability measure preserving systems [FKO82; EW11]. Moreover, the idea of dichotomy between structure and randomness finds applications in other areas as well; see for example the expository articles on this topic by Tao [Tao07; Tao08] and Gowers [Gow10].

In this thesis we are concerned with refining old and developing new results that deal with decompositions of arithmetic functions on the integers into two components, one of which is structured and the other is random. In this context, we use the term structured very broadly to describe objects that exhibit some form of low complexity or almost periodicity, and the term random (or sometimes pseudo-random) for objects that within our framework exhibit behavior similar to that of a sequence of independent and identically distributed random variables. Some of the decomposition theorems that we obtain are rather general and yield a decomposition for any bounded arithmetic function f : N → C; others are more specialized and apply only to certain classes of arithmetic functions, such as multiplicative functions, indicator functions of level sets of multiplicative functions, or multicorrelation sequences (cf. Sections 1.4.2 and 1.4.3 below for definitions). All decomposition theorems are collectively formulated in Section 1.4, whereas the pertinent notions of structure and randomness are introduced beforehand in Sections 1.2 and 1.3. The main results of this doctoral dissertation are then stated in Section 1.5 and include dynamical and combinatorial corollaries that can be derived from the decomposition theorems presented in Section 1.4. Among many other things, this includes far-reaching generalizations and analogues of Szemerédi’s theorem and Sárközy’s theorem (see Theorems 28, 32, 35 and Corollaries 29, 33, 34, 37, 38) as well as a solution to the long-standing Erdős sumset conjecture (see Theorem 40). Partial results of the presented work have already appeared in [MR16a; BKPLR16; BKPLR17; MRR17; BMR17; MRR18] and some theorems and passages have been taken verbatim from these sources.

1.2 Notions of structure

All decomposition theorems that we will encounter in Section 1.4 follow a similar scheme in the sense that they allow us to split an arithmetic function f : N → C as

\[ f = f_{\mathrm{structured}} + f_{\mathrm{random}}, \tag{1.2.1} \]

where the two components appearing in the splitting, $f_{\mathrm{structured}}$ and $f_{\mathrm{random}}$, exhibit special qualities: The first component is structured in the sense that an explicit description of its arithmetic architecture is known and allows for it to be studied using hands-on computations. The second, on the other hand, is the opposite. It is pseudo-random, which means that even though not much is known about its explicit structure, its statistical behavior is easy to predict and can be studied using probabilistic heuristics.

The purpose of this section is to introduce a rigorous framework for defining the type of structured objects that play the role of $f_{\mathrm{structured}}$ in (1.2.1). We do the same for $f_{\mathrm{random}}$ in Section 1.3 below. Ergodic theory and topological dynamics will play a crucial role in defining these notions and so we begin by recalling the definition of a topological dynamical system.

Definition 1. A topological dynamical system is a pair (X,T ), where X is a compact metric space and T : X → X is a homeomorphism.

There is a natural way of associating to every topological dynamical system a collection of arithmetic functions belonging to $\ell^\infty(\mathbb{N}) := \{f : \mathbb{N} \to \mathbb{C} : \sup_{n \in \mathbb{N}} |f(n)| < \infty\}$ by evaluating continuous functions along orbits of points. This leads to the following definition.

Definition 2. We say that an arithmetic function f : N → C is generated by the topological dynamical system (X, T) if there exists a point x ∈ X and a continuous function F ∈ C(X) such that $f(n) = F(T^n x)$ for all n ∈ N.

Note that many dynamical qualities of the system are inherited by the sequences that it gives rise to. In particular, if (X,T ) is a structurally simple system then so are all the sequences that (X,T ) generates. We provide now some illustrations of this principle.

Bohr rationally almost periodic functions. The simplest type of a topological dynamical system is a translation on a finite group, by which we mean a system (X, T) where the underlying space X is a finite group and the transformation T : X → X is multiplication by a fixed element in this group, i.e., there exists a ∈ X such that T x = a · x for all x ∈ X. An arithmetic function f : N → C is called Bohr rationally almost periodic if for every ε > 0 there exists a function g : N → C that is generated by a translation on a finite group such that $\sup_{n \in \mathbb{N}} |f(n) - g(n)| < \varepsilon$. Write $\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N})$ for the set of all arithmetic functions that are Bohr rationally almost periodic. We remark that Bohr rationally almost periodic functions form a subfamily of the more general class of Bohr almost periodic functions introduced later in this section.
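To make the notion of a function generated by a translation on a finite group concrete, here is a minimal Python sketch. It assumes the cyclic group Z/6Z written additively, the translation x ↦ x + 4, the base point x = 0 and an arbitrary function F (every function on a finite discrete space is continuous). All names and parameter values are hypothetical choices made for illustration.

```python
def generated_sequence(F, T, x, length):
    """Return [f(1), ..., f(length)] where f(n) = F(T^n x)."""
    values = []
    point = x
    for _ in range(length):
        point = T(point)          # advance along the orbit of x
        values.append(F(point))   # evaluate F at T^n x
    return values

m, a = 6, 4  # hypothetical choice: the group Z/6Z and the element a = 4

def T(x):
    """Translation by a on Z/mZ (the group law is written additively)."""
    return (x + a) % m

def F(x):
    """An arbitrary function on Z/mZ."""
    return 1 if x in (0, 2) else 0

f = generated_sequence(F, T, 0, 12)
print(f)
```

Since T^m is the identity on Z/mZ, any sequence produced this way is periodic with period dividing m. In this example the orbit of 0 is 4, 2, 0, 4, 2, 0, ..., so the sequence already repeats with period 3.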

Example 3. Any periodic function is clearly Bohr rationally almost periodic. However, there are plenty of non-periodic Bohr rationally almost periodic functions, such as functions of the form $f(n) = \sum_{i=1}^{\infty} c_i e^{2\pi i q_i n}$ where $q_1, q_2, \ldots \in \mathbb{Q}$ and $c_1, c_2, \ldots \in \mathbb{C}$ with $\sum_{i=1}^{\infty} |c_i| < \infty$.

Besicovitch rationally almost periodic functions. An important generalization of Bohr rationally almost periodic functions is given by Besicovitch rationally almost periodic functions, which were introduced in [BKPLR16] and find applications in multiplicative number theory (cf. [BKPLR17]).

A Følner sequence on N is a sequence $\Phi = (\Phi_N)_{N \in \mathbb{N}}$ of finite non-empty subsets of N satisfying $\lim_{N \to \infty} \frac{|(\Phi_N - 1) \cap \Phi_N|}{|\Phi_N|} = 1$. The Besicovitch seminorm $\|\cdot\|_\Phi$ is then defined as

\[ \|f\|_\Phi := \left( \limsup_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} |f(n)|^2 \right)^{1/2}. \tag{1.2.2} \]

An arithmetic function f : N → C is called Besicovitch rationally almost periodic with respect to Φ if, for every ε > 0, there exists an arithmetic function g : N → C generated by a translation on a finite group such that $\|f - g\|_\Phi < \varepsilon$. We define $\mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi)$ to be the set of all Besicovitch rationally almost periodic functions with respect to Φ.

Example 4. Certainly, $\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N}) \subset \mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi)$ for any Følner sequence Φ. The reverse inclusion is not true: Let $\Phi^{(\text{Cesàro})}$ denote the Følner sequence of initial intervals (meaning that $\Phi^{(\text{Cesàro})}_N = \{1, 2, \ldots, N\}$ for all N ∈ N). Then the indicator function of the squarefree numbers $Q := \{n \in \mathbb{N} : p^2 \nmid n \text{ for all primes } p\}$ is an example of a function that is Besicovitch rationally almost periodic with respect to $\Phi^{(\text{Cesàro})}$ but not Bohr rationally almost periodic. Note that the indicator function of the squarefree numbers is multiplicative, meaning that for all m, n ∈ N with gcd(m, n) = 1 one has $1_Q(mn) = 1_Q(m) 1_Q(n)$ (cf. Section 1.4.3 below). It is shown in [BKPLR17] that not just $1_Q$, but actually any bounded multiplicative function f : N → [0, ∞) is Besicovitch rationally almost periodic with respect to the Følner sequence of initial intervals $\Phi^{(\text{Cesàro})}$.
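The squarefree example can be explored numerically. The sketch below is only an illustration under stated assumptions: it replaces the limit superior in (1.2.2) by a single finite average over {1,...,N}, and it uses as periodic approximants the indicators of integers not divisible by p² for p in a finite list of primes (a natural choice made here, not a construction taken from the text). All function names are hypothetical.

```python
def squarefree_indicator(n):
    """Return 1 if no square d*d with d >= 2 divides n, else 0."""
    d = 2
    while d * d <= n:
        if n % (d * d) == 0:
            return 0
        d += 1
    return 1

def periodic_approximant(primes):
    """Indicator of {n : p^2 does not divide n for every p in primes}.
    This function is periodic with period equal to the product of the p^2."""
    def g(n):
        return 0 if any(n % (p * p) == 0 for p in primes) else 1
    return g

def besicovitch_distance(f, g, N):
    """Finite-N proxy for ||f - g||_Phi along Phi_N = {1,...,N}."""
    total = sum(abs(f(n) - g(n)) ** 2 for n in range(1, N + 1))
    return (total / N) ** 0.5

N = 5000
prime_lists = [(2,), (2, 3), (2, 3, 5), (2, 3, 5, 7)]
distances = [besicovitch_distance(squarefree_indicator, periodic_approximant(ps), N)
             for ps in prime_lists]
for ps, dist in zip(prime_lists, distances):
    print(ps, round(dist, 4))
```

Each approximant dominates the squarefree indicator pointwise and decreases as primes are added, so the computed distances decrease. The set where the approximant for the primes up to P differs from the indicator consists of integers divisible by p² for some prime p > P, and its density is at most the sum of 1/p² over such p, which tends to 0 as P grows.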

Bohr almost periodic functions. A more general class of low complexity systems, which includes all translations on finite groups, is the class of translations on compact groups. Translations on compact groups are topological dynamical systems (X,T ) where X is a compact Hausdorff topological group and the transformation T is again multiplication by a fixed element in the group. An arithmetic function f : N → C is called Bohr almost periodic if it is generated by a translation on a compact group. We denote the collection of all Bohr almost periodic functions by Bohr(N).

Example 5. Bohr almost periodic functions were introduced by Harald Bohr (see [Boh25a; Boh25b]) and include, among other examples, all functions of the form $f(n) = \sum_{i=1}^{\infty} c_i e^{2\pi i \lambda_i n}$ where $\lambda_1, \lambda_2, \ldots \in \mathbb{R}$ and $c_1, c_2, \ldots \in \mathbb{C}$ with $\sum_{i=1}^{\infty} |c_i| < \infty$. In particular, all trigonometric polynomials are Bohr almost periodic.

Besicovitch almost periodic functions (cf. [Bes26; Bes55; BL85]). Let $\Phi = (\Phi_N)_{N \in \mathbb{N}}$ be a Følner sequence on N and let f : N → C be an arithmetic function. If, for every ε > 0, there exists g : N → C that is generated by a translation on a compact group and such that $\|f - g\|_\Phi < \varepsilon$, then f is called Besicovitch almost periodic with respect to Φ. Write Bes(N, Φ) for the space of arithmetic functions that are Besicovitch almost periodic with respect to Φ.

Example 6. Bes(N, Φ) contains all Bohr almost periodic functions and all Besicovitch rationally almost periodic functions. An example of a function that is neither Bohr almost periodic nor Besicovitch rationally almost periodic, but still belongs to Bes(N, Φ), is the indicator function of the set $\{\lfloor n\alpha \rfloor : n \in \mathbb{N}\}$ for any α ∈ R\Q. It is worth noting that the set of all trigonometric polynomials (and therefore also Bohr(N)) is dense in Bes(N, Φ) with respect to the Besicovitch seminorm $\|\cdot\|_\Phi$, that is, for any f ∈ Bes(N, Φ) and any ε > 0 there exists a trigonometric polynomial p(n) such that $\|f - p\|_\Phi < \varepsilon$ (cf. [Bes55; BL85]).

Nilsequences (cf. [BHK05; HK08; HK]). A class of dynamical systems that includes all translations on compact groups and which finds important applications in density Ramsey theory is the class of nilsystems: Let k ∈ N, let G be a k-step nilpotent Lie group and let Γ be a uniform and discrete subgroup of G (here uniform means that the quotient G/Γ is compact). The quotient space G/Γ is called a k-step nilmanifold and for any fixed group element a ∈ G the transformation T : G/Γ → G/Γ given by T(bΓ) = (a·b)Γ for all bΓ ∈ G/Γ is called a niltranslation. If X = G/Γ is a k-step nilmanifold and T is a niltranslation on it then the resulting topological dynamical system (X, T) is called a k-step nilsystem.

Any arithmetic function f : N → C generated by a k-step nilsystem is called a basic k-step nilsequence. We call f : N → C a k-step nilsequence if for every ε > 0 there exists a basic k-step nilsequence g : N → C such that $\sup_{n \in \mathbb{N}} |f(n) - g(n)| < \varepsilon$. Write $\mathrm{Nil}_k(\mathbb{N})$ for the set of all k-step nilsequences. It is straightforward to show that $\mathrm{Nil}_1(\mathbb{N}) = \mathrm{Bohr}(\mathbb{N})$.

Example 7. A classical example of a k-step nilsequence is $f(n) = e^{2\pi i p(n)}$, where p is any polynomial in R[x] of degree at most k.

Besicovitch Nilsequences. Fix a Følner sequence $\Phi = (\Phi_N)_{N \in \mathbb{N}}$. We call f : N → C a Besicovitch k-step nilsequence with respect to Φ if for every ε > 0 there exists a sequence g : N → C coming from a k-step nilsystem such that $\|f - g\|_\Phi < \varepsilon$. Let $\mathrm{BesNil}_k(\mathbb{N}, \Phi)$ denote the set of Besicovitch k-step nilsequences. In analogy to $\mathrm{Nil}_1(\mathbb{N}) = \mathrm{Bohr}(\mathbb{N})$, we have $\mathrm{BesNil}_1(\mathbb{N}, \Phi) = \mathrm{Bes}(\mathbb{N}, \Phi)$.

Example 8. All nilsequences are obviously Besicovitch nilsequences with respect to Φ for any Φ, but the converse is not true. For instance, given two rationally independent numbers α, β ∈ R the sequence $e(\lfloor n\alpha \rfloor n\beta)$ belongs to $\mathrm{BesNil}_2(\mathbb{N}, \Phi)$ for any Følner sequence Φ, but does not belong to $\mathrm{Nil}_2(\mathbb{N})$ (see [HK08, Appendix C]).

Our next goal is to introduce measure theoretic analogues of almost periodic functions and nilsequences; for this we first need to recall the definition of a measure preserving dynamical system and thereafter establish a measure theoretic analogue of Definition 2.

Definition 9. Let $(X, \mathcal{B}, \mu)$ be a probability space and let T : X → X be a measurable map that is measure preserving, i.e., $\mu(T^{-1}B) = \mu(B)$ for all $B \in \mathcal{B}$. We call the quadruple $(X, \mathcal{B}, \mu, T)$ a measure preserving dynamical system. In the case where X is a metric space and $\mathcal{B}$ is the σ-algebra of Borel sets, we call $(X, \mathcal{B}, \mu, T)$ a metric measure preserving system, or m.m.p.s. for short. (Note that in this thesis we will only deal with m.m.p.s. where T : X → X is continuous.) Two measure preserving systems $(X, \mathcal{B}, \mu, T)$ and $(Y, \mathcal{C}, \nu, S)$ are called isomorphic if there exists a measurable subset $X_0 \subset X$ with $\mu(X_0) = 1$ and an injective measurable map $\varphi : X_0 \to Y$ such that $\varphi \circ T = S \circ \varphi$ and ν coincides with the push-forward of µ under the map $\varphi$.

For the remainder of this section let $\Phi = (\Phi_N)_{N \in \mathbb{N}}$ be a Følner sequence on N.

Definition 10. Given a m.m.p.s. $(X, \mathcal{B}, \mu, T)$, we say a point x ∈ X is generic along Φ for the measure µ if for every F ∈ C(X),

\[ \lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} F(T^n x) = \int_X F \, d\mu. \]

We say f : N → C is generated along Φ by the m.m.p.s. $(X, \mathcal{B}, \mu, T)$ if there exists a point x ∈ X, which is generic along Φ for µ, and a continuous function F ∈ C(X) such that $f(n) = F(T^n x)$ for all n ∈ N.

Compact functions (cf. [MRR18]). On any compact Hausdorff topological group the normalized Haar measure is a Borel probability measure which is invariant under all group translations. Therefore, any translation on a compact group can be viewed as a metric measure preserving system. We call any m.m.p.s. $(X, \mathcal{B}, \mu, T)$ that is measurably isomorphic to a translation on a compact group with Haar measure a measure-theoretic Kronecker system. An arithmetic function f : N → C is then called compact with respect to Φ if it is generated along Φ by a measure-theoretic Kronecker system. Such functions are called compact because their orbit under shifts $\{n \mapsto f(n+h) : h \in \mathbb{N}\}$ is pre-compact with respect to the Besicovitch seminorm $\|\cdot\|_\Phi$. We use Comp(N, Φ) to denote the set of all functions that are compact with respect to Φ.

Measure-theoretic nilsequences. Let k ∈ N. On any k-step nilmanifold G/Γ there exists a unique Borel probability measure $\mu_{G/\Gamma}$ on G/Γ that is invariant under all niltranslations, called the Haar measure of the nilmanifold G/Γ (cf. [Rag72]). Therefore any nilsystem together with its Haar measure is a metric measure preserving dynamical system. A m.m.p.s. $(X, \mathcal{B}, \mu, T)$ is then called a measure-theoretic k-step nilsystem if it is measurably isomorphic to a nilsystem with its Haar measure.

An arithmetic function f : N → C is called a measure-theoretic nilsequence with respect to Φ if it is generated along Φ by a measure-theoretic k-step nilsystem. We use $\mathrm{MeasNil}_k(\mathbb{N}, \Phi)$ to denote the set of all measure-theoretic k-step nilsequences with respect to Φ. Clearly, measure-theoretic 1-step nilsequences are the same as compact functions.

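Figure 1.1: Notions of structure

\[
\begin{array}{cccccccc}
\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N}) & \subset & \mathrm{Bohr}(\mathbb{N}) & \subset & \mathrm{Nil}_2(\mathbb{N}) & \subset & \mathrm{Nil}_3(\mathbb{N}) & \cdots \\
\cap & & \cap & & \cap & & \cap & \\
\mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi) & \subset & \mathrm{Bes}(\mathbb{N}, \Phi) & \subset & \mathrm{BesNil}_2(\mathbb{N}, \Phi) & \subset & \mathrm{BesNil}_3(\mathbb{N}, \Phi) & \cdots \\
 & & \cap & & \cap & & \cap & \\
 & & \mathrm{Comp}(\mathbb{N}, \Phi) & \subset & \mathrm{MeasNil}_2(\mathbb{N}, \Phi) & \subset & \mathrm{MeasNil}_3(\mathbb{N}, \Phi) & \cdots
\end{array}
\]

Here each $\cap$ between two rows indicates that the class above it is contained in the class below it.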

We end this section with a diagram (see Fig. 1.1) that illustrates the relationship between the different notions of structure that we have introduced thus far. We remark that all inclusions appearing in this diagram are in fact proper inclusions.

1.3 Notions of randomness

Let us now discuss notions of randomness that serve as the counterparts to the notions of structure introduced in the previous section. We start with what can be regarded as the complementary notion to that of a Besicovitch rationally almost periodic function.

Aperiodic functions. Let $\Phi = (\Phi_N)_{N \in \mathbb{N}}$ be a Følner sequence. An arithmetic function f : N → C is aperiodic with respect to Φ if

\[ \lim_{N \to \infty} \frac{1}{|\Phi_N \cap (a\mathbb{N} + b)|} \sum_{n \in \Phi_N \cap (a\mathbb{N} + b)} f(n) = 0 \]

for all a ∈ N and all b ∈ {0, 1, . . . , a − 1}. Write Aper(N, Φ) for the space of aperiodic arithmetic functions.

Coperiodic functions. The next notion that we consider is a natural generalization of aperiodicity. For θ ∈ [0, 1) write $e(n\theta) := e^{2\pi i n \theta}$. We call f : N → C coperiodic with respect to Φ if

\[ \lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(n) e(n\theta) = 0, \quad \forall \theta \in [0, 1). \]

Let Coper(N, Φ) be the set of all coperiodic functions with respect to Φ.
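The averages in this definition are easy to experiment with. The following Python sketch evaluates them along the initial intervals {1,...,N} for a single finite N, which can only hint at the limit. The two test functions are choices made for this illustration: the alternating sequence (−1)^n, which correlates perfectly with θ = 1/2, and the quadratic phase e(n²√2), whose averages tend to 0 for every θ by Weyl's equidistribution theorem (a standard fact that is not stated in the text above). All names are hypothetical.

```python
import cmath
import math

def e(x):
    """The character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def fourier_average(f, theta, N):
    """Finite-N proxy along Phi_N = {1,...,N}: (1/N) * sum of f(n) * e(n*theta)."""
    return sum(f(n) * e(n * theta) for n in range(1, N + 1)) / N

def alternating(n):
    return (-1) ** n

def quadratic_phase(n):
    return e(n * n * math.sqrt(2))

N = 4000
for theta in (0.0, 0.25, 0.5):
    print(theta,
          abs(fourier_average(alternating, theta, N)),
          abs(fourier_average(quadratic_phase, theta, N)))
```

The alternating sequence has average of modulus 1 at θ = 1/2, so it is not coperiodic with respect to the initial intervals, while the averages of the quadratic phase are small at each tested θ.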

Conilsequences. Let k ∈ N. We say $f \in \ell^\infty(\mathbb{N})$ is a k-step conilsequence with respect to Φ if

\[ \lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(n) \varphi(n) = 0 \]

for all $\varphi \in \mathrm{Nil}_k(\mathbb{N})$. Note that f : N → C is a 1-step conilsequence with respect to Φ if and only if it is coperiodic with respect to Φ. We use $\mathrm{Conil}_k(\mathbb{N}, \Phi)$ to denote the set of all bounded k-step conilsequences with respect to Φ.

Weak mixing functions. A function f : N → C is weak mixing with respect to Φ if, for every bounded function g : N → C and every subsequence $(N_k)_{k \in \mathbb{N}}$ with the property that

\[ \lim_{k \to \infty} \frac{1}{|\Phi_{N_k}|} \sum_{n \in \Phi_{N_k}} f(n + h) g(n) \]

exists for all h ∈ N, one has for all ε > 0,

\[ d\left( \left\{ h \in \mathbb{N} : \left| \lim_{k \to \infty} \frac{1}{|\Phi_{N_k}|} \sum_{n \in \Phi_{N_k}} f(n + h) g(n) \right| > \varepsilon \right\} \right) = 0. \]

Let WM(N, Φ) denote the set of all bounded arithmetic functions that are weak mixing with respect to Φ.

Example 11. A measure preserving system $(X, \mathcal{B}, \mu, T)$ is called weakly mixing if for all $A, B \in \mathcal{B}$ one has $\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} |\mu(T^{-n}A \cap B) - \mu(A)\mu(B)| = 0$. One can show that any f : N → C that is generated along Φ by a weakly mixing m.m.p.s. and satisfies $\lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(n) = 0$ belongs to WM(N, Φ).

Uniform functions. The uniformity (semi)norms provide a very useful way of measuring pseudo-randomness. For different spaces there exist different but related versions of the uniformity norms, such as the Gowers uniformity norms for functions on a finite group (cf. [Gow01; Tao12]), the Gowers uniformity norms for functions on {1,...,N} (cf. [GTZ12; Tao12]), the Host-Kra norms on $L^\infty$-functions in the framework of measure preserving systems (see [HK05b]), and the localized Host-Kra seminorms introduced in [HK09] for functions belonging to $\ell^\infty(\mathbb{N})$. In this thesis we utilize the Gowers uniformity norms for functions on {1,...,N} and the localized Host-Kra seminorms on $\ell^\infty(\mathbb{N})$.

Given f : N → C and h ∈ N ∪ {0} define the multiplicative derivative $\partial_h f$ as

\[ \partial_h f(n) := f(n) \overline{f(n + h)}, \quad \forall n \in \mathbb{N}. \]

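Let N ∈ N and let Z/NZ denote the finite cyclic group with N elements. Given k ∈ N, the Gowers uniformity norm $\|\cdot\|_{U^k_{\mathbb{Z}/N\mathbb{Z}}}$ on Z/NZ (cf. [Gow01; GTZ12]) is defined as

\[ \|f\|_{U^k_{\mathbb{Z}/N\mathbb{Z}}} := \left( \frac{1}{N^{k+1}} \sum_{n, h_1, \ldots, h_k \in \mathbb{Z}/N\mathbb{Z}} \partial_{h_k} \cdots \partial_{h_1} f(n) \right)^{1/2^k}. \]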
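For k = 2 the definition can be checked by brute force on a small cyclic group. The Python sketch below computes the U² norm on Z/NZ directly from the definition, with the multiplicative derivative taken with a complex conjugate on the shifted factor, and compares it with the standard Fourier-analytic identity that the fourth power of the U² norm equals the sum of |f̂(ξ)|⁴ over all frequencies ξ, where f̂(ξ) is the average of f(n) e(−nξ/N). That identity is a well-known fact used here as a cross-check; it is not stated in the text, and all function names are hypothetical.

```python
import cmath
import math

def e(x):
    """The character e(x) = exp(2*pi*i*x)."""
    return cmath.exp(2j * math.pi * x)

def u2_norm_direct(f):
    """U^2 norm on Z/NZ from the definition; f is a list of N numbers."""
    N = len(f)
    total = 0
    for n in range(N):
        for h1 in range(N):
            for h2 in range(N):
                # Expansion of the double multiplicative derivative at n.
                total += (f[n]
                          * f[(n + h1) % N].conjugate()
                          * f[(n + h2) % N].conjugate()
                          * f[(n + h1 + h2) % N])
    # The exact sum is real and non-negative; abs() discards rounding noise.
    return abs(total / N ** 3) ** 0.25

def u2_norm_fourier(f):
    """U^2 norm via the identity ||f||^4 = sum over xi of |fhat(xi)|^4."""
    N = len(f)
    coeffs = [sum(f[n] * e(-n * xi / N) for n in range(N)) / N
              for xi in range(N)]
    return sum(abs(c) ** 4 for c in coeffs) ** 0.25

constant = [1] * 8
character = [e(3 * n / 8) for n in range(8)]
signs = [1, 1, -1, 1, -1, -1, 1, -1]
for name, f in (("constant", constant), ("character", character), ("signs", signs)):
    print(name, u2_norm_direct(f), u2_norm_fourier(f))
```

Constants and characters have U² norm 1, which reflects that the norm does not see linear phases, whereas a sign pattern that is not a character has strictly smaller norm.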

Write [N] for the interval {1, 2,...,N}. To define the Gowers uniformity seminorm $\|\cdot\|_{U^s_{[N]}}$ for a function f : N → C, set $\tilde{N} := 2^s N$, define a function $f_N : \mathbb{Z}/\tilde{N}\mathbb{Z} \to \mathbb{C}$ as $f_N(n) = f(n)$ for $n \in [N]$ and $f_N(n) = 0$ for $n \in [\tilde{N}] \setminus [N]$ (where we identify $\mathbb{Z}/\tilde{N}\mathbb{Z}$ with the interval $[\tilde{N}]$). Also, let $1_{[N]}$ be the indicator function of the interval [N], and define (cf. Subsections A.1 and A.2 of Appendix A in [FH17a] and Appendix B in [GT10b])

\[ \|f\|_{U^s_{[N]}} := \frac{\|f_N\|_{U^s_{\mathbb{Z}/\tilde{N}\mathbb{Z}}}}{\|1_{[N]}\|_{U^s_{\mathbb{Z}/\tilde{N}\mathbb{Z}}}}. \tag{1.3.1} \]

A bounded function f : N → C is called $U^s$-uniform if $\|f\|_{U^s_{[N]}}$ converges to zero as N → ∞. A function f is called uniform if it is $U^s$-uniform for every s > 1. We use Uni(N) to denote the set of all uniform functions.

The uniformity seminorm of order k of f associated to Φ is defined as

\[ \|f\|_{U^k(\Phi)} := \lim_{H \to \infty} \frac{1}{H^k} \sum_{h_1, \ldots, h_k = 1}^{H} \left( \lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \partial_{h_k} \cdots \partial_{h_1} f(n) \right) \tag{1.3.2} \]

whenever all involved limits exist. If the expression on the right hand side of (1.3.2) is not well defined then we say that $\|f\|_{U^k(\Phi)}$ is not well defined. We call an arithmetic function f : N → C k-step uniform with respect to Φ if $\|f\|_{U^k(\Phi)}$ is well defined and equals 0. We denote by $\mathrm{Uni}_k(\mathbb{N}, \Phi)$ the set of all arithmetic functions that are k-step uniform with respect to Φ. If f is k-step uniform with respect to Φ for every k > 1 then we say that f is uniform with respect to Φ.

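Remark 12. Recall that $\Phi^{(\text{Cesàro})}$ denotes the Følner sequence of initial intervals (see Example 4). We remark that any function $f \in \ell^\infty(\mathbb{N})$ that is k-step uniform with respect to $\Phi^{(\text{Cesàro})}$ is also k-step uniform (see [HK, Chapter 22]). However, the reverse inclusion does not hold (see also [HK, Chapter 22, ]).

Figure 1.2: Notions of randomness

\[
\begin{array}{cccccccc}
\mathrm{Aper}(\mathbb{N}, \Phi) & \supset & \mathrm{Coper}(\mathbb{N}, \Phi) & \supset & \mathrm{Conil}_2(\mathbb{N}, \Phi) & \supset & \mathrm{Conil}_3(\mathbb{N}, \Phi) & \cdots \\
 & & \cup & & \cup & & \cup & \\
 & & \mathrm{WM}(\mathbb{N}, \Phi) & \supset & \mathrm{Uni}_2(\mathbb{N}, \Phi) & \supset & \mathrm{Uni}_3(\mathbb{N}, \Phi) & \cdots
\end{array}
\]

Here each $\cup$ between the two rows indicates that the class above it contains the class below it.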

The diagram in Fig. 1.2 showcases the relationship between Aper, Coper, Conil, WM and Uni. All inclusions appearing in this diagram are strict.

1.4 Decomposition theorems

In Section 1.2 we have introduced a variety of classes of arithmetic functions which we consider to be structured, and in Section 1.3 we have introduced a variety of classes of arithmetic functions which we consider to be pseudo-random. The goal of this section is to bring together these notions of structure and randomness by formulating decomposition theorems in the spirit of (1.2.1).

1.4.1 Decomposition theorems for arithmetic functions

Fix a Følner sequence Φ on N. The first decomposition results that we present in this section are rather general and apply to any function in the space

\[ L^2(\mathbb{N}, \Phi) := \{ f : \mathbb{N} \to \mathbb{C} : \|f\|_\Phi < \infty \}, \]

where $\|\cdot\|_\Phi$ is the Besicovitch seminorm defined in (1.2.2). We begin with a theorem involving Besicovitch almost periodic functions which was obtained by the author in joint work with Joel Moreira and Donald Robertson in [MRR18].

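Theorem 13 ([MRR18]). For every Følner sequence Φ on N and any $f \in L^2(\mathbb{N}, \Phi)$ there is a Følner subsequence Ψ of Φ and functions $f_{\mathrm{Bes}} \in \mathrm{Bes}(\mathbb{N}, \Psi)$ and $f_{\mathrm{Coper}} \in \mathrm{Coper}(\mathbb{N}, \Psi)$ such that $f = f_{\mathrm{Bes}} + f_{\mathrm{Coper}}$. Moreover, $f_{\mathrm{Bes}}$ minimizes the distance between f and Bes(N, Ψ) in the sense that $\|f - f_{\mathrm{Bes}}\|_\Psi = \inf\{\|f - g\|_\Psi : g \in \mathrm{Bes}(\mathbb{N}, \Psi)\}$.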

Instead of providing a proof of Theorem 13, we will establish a more general decomposition result for $L^2(\mathbb{N}, \Phi)$ that contains Theorem 13 as a special case. We will need the following definition.

Definition 14. Suppose that for every Følner sequence Φ we are given a subspace U(Φ) of $L^2(\mathbb{N}, \Phi)$ satisfying the following properties:

• U(Φ) contains the constant functions and is closed under pointwise complex conjugation;

• for all u, v ∈ U(Φ) the inner product $\langle u, v \rangle_\Phi$ exists;

• if u, v ∈ U(Φ) are real valued, then the function $n \mapsto \max\{u(n), v(n)\}$ is in U(Φ);

• U(Φ) is closed with respect to $\|\cdot\|_\Phi$;

• if Ψ is a subsequence of Φ then U(Ψ) ⊃ U(Φ).

Call any such assignment U of subspaces to Følner sequences a projection family. Given a projection family one can consider, for each Følner sequence Φ, the subspace

\[ U(\Phi)^\perp := \left\{ v \in L^2(\mathbb{N}, \Phi) : \langle u, v \rangle_\Phi \text{ exists and equals } 0 \text{ for all } u \in U(\Phi) \right\} \]

of $L^2(\mathbb{N}, \Phi)$.

It is straightforward to verify that $\Phi \mapsto \mathrm{Bes}(\mathbb{N}, \Phi)$ is a projection family. In light of this fact, the following result is a generalization of Theorem 13.

Theorem 15 ([MRR18]). Let U be a projection family and let Φ be a Følner sequence. For every $f \in L^2(\mathbb{N}, \Phi)$ there exists a subsequence Ψ of Φ and $f_U \in U(\Psi)$ such that $f - f_U \in U(\Psi)^\perp$. Moreover, $f_U$ minimizes the distance between f and U(Ψ) in the sense that $\|f - f_U\|_\Psi = \inf\{\|f - g\|_\Psi : g \in U(\Psi)\}$.

We give a proof of Theorem 15 in Subsection 2.1.2.

The next decomposition theorem, which represents any $f \in L^2(\mathbb{N}, \Phi)$ as a sum of a weak mixing function and a compact function, also first appeared in [MRR18] and can be viewed as a discrete version of the Jacobs–de Leeuw–Glicksberg splitting [Jac56; LG61] on Banach spaces (see also [Kre85, Chapter 2.4] and [EFHN15, Example 16.25]).

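Theorem 16 ([MRR18]). For every Følner sequence $\Phi$ on $\mathbb{N}$ and every $f \in L^2(\mathbb{N},\Phi)$ there is a Følner subsequence $\Psi$ of $\Phi$ and functions $f_{\mathrm{Comp}} \in \mathrm{Comp}(\mathbb{N},\Psi)$ and $f_{\mathrm{WM}} \in \mathrm{WM}(\mathbb{N},\Psi)$ such that $f = f_{\mathrm{Comp}} + f_{\mathrm{WM}}$. Moreover, $f_{\mathrm{Comp}}$ minimizes the distance between $f$ and $\mathrm{Comp}(\mathbb{N},\Psi)$ in the sense that $\|f - f_{\mathrm{Comp}}\|_\Psi = \inf\{\|f - g\|_\Psi : g \in \mathrm{Comp}(\mathbb{N},\Psi)\}$.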

Just as compact and weakly mixing functions can be thought of as the measure-theoretic analogues of Besicovitch almost periodic and coperiodic functions respectively, Theorem 16 can be interpreted as a measure-theoretic analogue of Theorem 13.

Theorems 13 and 16 are instrumental in the proof of the Erdős sumset conjecture; this is discussed in more detail in Subsection 1.5.2 below.

1.4.2 Decomposition theorems for multicorrelation sequences

An intimate connection between Szemerédi’s theorem and dynamical systems was discovered by Furstenberg, who recast the problem of finding arithmetic progressions in sets of positive density in terms of ergodic theory. In [Fur77] Furstenberg demonstrated that Szemerédi’s theorem follows from a sophisticated generalization of the classical Poincaré recurrence theorem. This generalization is known as Furstenberg’s multiple recurrence theorem and states that for any probability measure preserving system $(X, \mathcal{B}, \mu, T)$, any $k \in \mathbb{N}$, and any measurable set $A \in \mathcal{B}$ with $\mu(A) > 0$, there exists an integer $n \in \mathbb{N}$ such that

\[ \mu\big(A \cap T^{-n}A \cap T^{-2n}A \cap \dots \cap T^{-kn}A\big) > 0. \tag{1.4.1} \]

In Subsection 1.5.1 we explain in more detail how (1.4.1) implies Szemerédi’s theorem.
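For a finite toy system the quantity in (1.4.1) can be computed exactly. The following Python sketch is only an illustration and plays no role in the arguments of this thesis; it assumes the rotation $T(x) = x + 1$ on $\mathbb{Z}/m\mathbb{Z}$ with normalized counting measure, and the function names are hypothetical.

```python
from fractions import Fraction


def multiple_recurrence_measure(A, m, n, k):
    """Return mu(A ∩ T^{-n}A ∩ ... ∩ T^{-kn}A) for T(x) = x + 1 on Z/mZ.

    mu is the normalized counting measure on Z/mZ (an assumption of this sketch).
    """
    A = {a % m for a in A}
    # x lies in T^{-jn}A exactly when x + j*n (mod m) lies in A.
    hits = [x for x in range(m) if all((x + j * n) % m in A for j in range(k + 1))]
    return Fraction(len(hits), m)


def first_return_time(A, m, k):
    """Smallest n in {1, ..., m} for which the measure above is positive."""
    for n in range(1, m + 1):
        if multiple_recurrence_measure(A, m, n, k) > 0:
            return n
    return None  # only reached when A is empty


if __name__ == "__main__":
    print(first_return_time({0, 1, 4}, 7, 2))
```

In this toy system $n = m$ always works for non-empty $A$, since $T^m$ is the identity; the content of Furstenberg's theorem is that a suitable $n$ exists in every measure preserving system.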

Actually, Furstenberg proved something more general than (1.4.1). He showed that

\[ \liminf_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N-1} \mu\big(A \cap T^{-n}A \cap T^{-2n}A \cap \dots \cap T^{-kn}A\big) > 0. \tag{1.4.2} \]

In Subsection 1.5.1 we obtain several refinements of Szemerédi’s theorem by considering variations of (1.4.2). Our work in this direction utilizes fine properties of multicorrelation sequences, i.e., sequences

\[ \alpha(n) = \int_X f_0 \cdot T^{n} f_1 \cdots T^{kn} f_k \, d\mu, \tag{1.4.3} \]

where $f_0, \dots, f_k \in L^\infty(X)$. Note that the sequence $n \mapsto \mu(A \cap T^{-n}A \cap T^{-2n}A \cap \dots \cap T^{-kn}A)$ appearing in (1.4.1) and (1.4.2) is, in fact, a multicorrelation sequence, which can be seen by choosing $f_0 = \dots = f_k = 1_A$ in (1.4.3). Bergelson, Host and Kra established in [BHK05] a decomposition of multicorrelation sequences $\alpha(n) = \int_X f_0 \cdot T^{n} f_1 \cdots T^{kn} f_k \, d\mu$ in the following way:

α(n) = φ(n) + ω(n) (1.4.4)

where $\varphi$ is a $(k+1)$-step nilsequence (in other words $\varphi \in \mathrm{Nil}_{k+1}$) and $\omega$ is a null-sequence, i.e., $\lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} |\omega(n)| = 0$. Our first decomposition theorem involving multicorrelation sequences concerns a spectral refinement of the Bergelson-Host-Kra decomposition (1.4.4) and first appeared in [MR16b]. For the formulation we need to recall the notions of the spectrum of a system and of the spectrum of a sequence.

Definition 17. The discrete spectrum $\sigma(T)$ of a probability measure preserving system $(X, \mathcal{B}, \mu, T)$ is the set of eigenvalues $\theta \in \mathbb{T} := \mathbb{R}/\mathbb{Z}$ for which there exists a non-zero eigenfunction $g \in L^2(X, \mathcal{B}, \mu)$ satisfying $Tg := g \circ T = e(\theta) g$, where $e(\theta) := e^{2\pi i \theta}$. The spectrum $\sigma(f)$ of an arithmetic function $f\colon \mathbb{N} \to \mathbb{C}$ is the set of frequencies $\theta \in \mathbb{T}$ for which

\[ \limsup_{N\to\infty} \Big| \frac{1}{N} \sum_{n=1}^{N} f(n) e(-\theta n) \Big| > 0. \]

By examining the nilsystem from which the nilsequence in (1.4.4) arises, we will show that the spectrum of the multicorrelation sequence $\alpha(n)$ is contained in the discrete spectrum of its originating system $(X, \mathcal{B}, \mu, T)$.
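The spectrum of an arithmetic function can be explored numerically by evaluating the averages in Definition 17 at a large but finite $N$. The sketch below is a heuristic illustration under stated assumptions: the test function $f(n) = e(n/4) + \tfrac12 e(n/3)$, the cut-off $N$ and all function names are chosen here for the example, and a finite average can of course only suggest which frequencies lie in $\sigma(f)$.

```python
import cmath


def e(t):
    """e(t) = exp(2*pi*i*t), as in Definition 17."""
    return cmath.exp(2j * cmath.pi * t)


def fourier_average(f, theta, N):
    """Absolute value of (1/N) * sum_{n=1}^{N} f(n) e(-theta n)."""
    return abs(sum(f(n) * e(-theta * n) for n in range(1, N + 1))) / N


def example_function(n):
    # Hypothetical test function with frequencies 1/4 and 1/3.
    return e(n / 4) + 0.5 * e(n / 3)


if __name__ == "__main__":
    for theta in (1 / 4, 1 / 3, 1 / 10):
        print(theta, round(fourier_average(example_function, theta, 1200), 6))
```

For this function the averages stay bounded away from zero at $\theta = 1/4$ and $\theta = 1/3$ and are negligible at $\theta = 1/10$, in line with $\sigma(f) = \{1/4, 1/3\}$.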

Theorem 18 ([MR16a]). Let $k \in \mathbb{N}$, let $(X, \mathcal{B}, \mu, T)$ be an ergodic measure preserving system and let $f_0, f_1, \dots, f_k \in L^\infty(X)$. For every $\varepsilon > 0$ there exists a decomposition of the form

\[ \alpha(n) := \int_X f_0 \cdot T^{n} f_1 \cdots T^{kn} f_k \, d\mu = \varphi(n) + \omega(n) + \gamma(n), \]

where $\omega(n)$ is a null-sequence, $\gamma$ satisfies $\sup_{n\in\mathbb{N}} |\gamma(n)| < \varepsilon$ and $\varphi(n) = F(R^n y)$ for some $F \in C(Y)$ and $y \in Y$, where $(Y, R)$ is a $k$-step nilsystem whose discrete spectrum is contained in the discrete spectrum of $(X, \mathcal{B}, \mu, T)$.

The following is an immediate corollary of Theorem 18.

Corollary 19 ([MR16a]). Under the same assumptions as in Theorem 18, the spectrum $\sigma(\alpha)$ of the multicorrelation sequence $\alpha$ is contained in the discrete spectrum $\sigma(T)$ of the underlying system $(X, \mathcal{B}, \mu, T)$.

In Section 1.5 we derive from Theorem 18 and Corollary 19 a useful multiple ergodic theorem (see Theorem 28) and also formulate an application to additive combinatorics (see Corollary 29).

Next, we turn our attention to a polynomial version of a multiple correlation sequence. A polynomial multiple correlation sequence is a sequence of the form

\[ \alpha(n) = \int_X f_0 \cdot T^{p_1(n)} f_1 \cdots T^{p_k(n)} f_k \, d\mu, \]

where $f_0, f_1, \dots, f_k \in L^\infty(X)$ and $p_1, \dots, p_k \in \mathbb{Z}[x]$. Polynomial multiple correlation sequences are connected to Sárközy’s theorem and to Bergelson-Leibman’s polynomial extension of Szemerédi’s theorem [BL96]. A generalization of (1.4.4) from linear to polynomial multiple correlation sequences was obtained by Leibman in [Lei10a; Lei15]. In Subsection 2.2.8 we prove the following refinement of Leibman’s theorem.

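Theorem 20. Fix $k \in \mathbb{N}$. Let $(X, \mathcal{B}, \mu, T)$ be an ergodic measure preserving system, $A \in \mathcal{B}$ with $\mu(A) > 0$ and $p_1, \dots, p_k \in \mathbb{Z}[x]$ with $p_i(0) = 0$ for all $i = 1, \dots, k$. Then there exists $\delta > 0$ such that for every $\varepsilon > 0$ there exists a decomposition of the form

\[ \alpha(n) := \int_X 1_A \cdot T^{p_1(n)} 1_A \cdots T^{p_k(n)} 1_A \, d\mu = \rho(n) + \varphi(n) + \omega(n) + \gamma(n), \]

where

• $\omega(n)$ is a null-sequence,

• $\gamma$ satisfies $\sup_{n\in\mathbb{N}} |\gamma(n)| < \varepsilon$,

• $\rho$ is a periodic sequence with the property that $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \rho(qn) > \delta$ for all $q \in \mathbb{N}$,

• and $\varphi \in \mathrm{Nil}_s(\mathbb{N}) \cap \mathrm{Coper}(\mathbb{N}, \Phi^{(\text{Cesàro})})$ for some $s \in \mathbb{N}$.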

Theorem 20 is instrumental in the proofs of Theorems 32 and 35 and their combinatorial corollaries, namely Corollaries 33, 34, 37 and 38, all of which are formulated in Section 1.5 below.

1.4.3 A dichotomy theorem for multiplicative functions and a structure theorem for level sets of multiplicative functions

An arithmetic function $f\colon \mathbb{N} \to \mathbb{C}$ is called multiplicative if $f(1) = 1$ and $f(mn) = f(m) \cdot f(n)$ for all relatively prime $m, n \in \mathbb{N}$, and it is called completely multiplicative if $f(1) = 1$ and $f(mn) = f(m) \cdot f(n)$ for all $m, n \in \mathbb{N}$. Let $\mathcal{M}$ denote the set of all multiplicative functions bounded in modulus by 1. We start this subsection by formulating a dichotomy theorem for the class $\mathcal{M}_0$ of all multiplicative functions $f \in \mathcal{M}$ with the property that $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} f(qn + r)$ exists for all $q, r \in \mathbb{N}$. This dichotomy theorem for $\mathcal{M}_0$ is taken from [BKPLR17]; it follows quickly by combining the work of Daboussi and Delange [DD74; DD82; Del72], Frantzikinakis and Host [FH17a] and Bellow and Losert [BL85], and is closely related to [FH17a, Theorem 1.1]. For convenience, we will refer to functions $f\colon \mathbb{N} \to \mathbb{C}$ that are Besicovitch rationally almost periodic with respect to the Følner sequence of initial intervals $\Phi^{(\text{Cesàro})}$ simply as Besicovitch rationally almost periodic functions.

Theorem 21 ([BKPLR17]). Let $f \in \mathcal{M}_0$. Then either

(i) $f$ is Besicovitch rationally almost periodic, or

(ii) $f$ is uniform.

Given a multiplicative function $f\colon \mathbb{N} \to \mathbb{C}$ and a point $z \in \mathbb{C}$ let $E(f, z)$ denote the set of solutions to the equation $f(n) = z$, i.e.,

E(f, z) := {n ∈ N : f(n) = z}.

We will refer to E(f, z) as a level set of f and we use D to denote the collection of all sets of the form E(f, z), where f ranges over all multiplicative functions and z ranges over all complex numbers.

Example 22. Examples of sets belonging to $\mathcal{D}$ include many classical sets of number-theoretical origin, such as: the squarefree numbers $Q := \{n \in \mathbb{N} : p^2 \nmid n \text{ for all primes } p\}$, the multiplicatively even numbers $E := \{n \in \mathbb{N} : \Omega(n) \text{ is even}\}$ and the multiplicatively odd numbers $O := \{n \in \mathbb{N} : \Omega(n) \text{ is odd}\}$, where $\Omega(n)$ denotes the number of prime factors of $n$ counted with multiplicities. In Subsection 2.3.1 we provide more examples of sets in $\mathcal{D}$ (see Example 75).
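The three sets of Example 22 are level sets of familiar multiplicative functions: $Q = E(\mu^2, 1)$ for the Möbius function $\mu$, while $E$ and $O$ are the level sets $E(\lambda, 1)$ and $E(\lambda, -1)$ of the Liouville function $\lambda(n) = (-1)^{\Omega(n)}$. The following Python sketch lists their initial segments; it is meant only as an illustration, and the helper names are hypothetical.

```python
def big_omega(n):
    """Number of prime factors of n counted with multiplicity (trial division)."""
    count, p = 0, 2
    while p * p <= n:
        while n % p == 0:
            n //= p
            count += 1
        p += 1
    return count + (1 if n > 1 else 0)


def is_squarefree(n):
    """True if no square of a prime divides n."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True


def liouville(n):
    """Liouville function, a completely multiplicative function."""
    return (-1) ** big_omega(n)


def mu_squared(n):
    """Square of the Moebius function, a multiplicative function."""
    return 1 if is_squarefree(n) else 0


def level_set(f, z, N):
    """Initial segment E(f, z) ∩ [1, N] of the level set of f at z."""
    return [n for n in range(1, N + 1) if f(n) == z]


if __name__ == "__main__":
    print("Q:", level_set(mu_squared, 1, 20))
    print("E:", level_set(liouville, 1, 20))
    print("O:", level_set(liouville, -1, 10))
```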

The next result is a structure theorem for sets belonging to D in the spirit of Theorem 21 and was first obtained in [BKPLR17]. To formulate this theorem we need to introduce set-theoretic analogues of uniform functions and of Besicovitch rationally almost periodic functions.

Definition 23.

(i) Let us call a set $A \subset \mathbb{N}$ uniform if $d(A) := \lim_{N\to\infty} \frac{|A \cap [N]|}{N}$ exists and the function $1_A - d(A)$ is a uniform function. Given $E, R \subset \mathbb{N}$ for which the densities $d(E)$ and $d(R)$ exist, we say that $E$ is uniform relative to $R$ if $E \subset R$ and the function $d(R) 1_E - d(E) 1_R$ is uniform. Note that a set $A$ is uniform if and only if it is uniform relative to $\mathbb{N}$.

(ii) A set $A \subset \mathbb{N}$ is called rational if for every $\varepsilon > 0$ there exists a set $B$ that is a union of finitely many arithmetic progressions such that $d(A \triangle B) < \varepsilon$ (see [BR02, Definition 2.1] and [BKPLR16]). Equivalently, a set $A$ is rational if and only if its indicator function is Besicovitch rationally almost periodic.

(iii) Let $\mathcal{D}_{\mathrm{rat}}$ be the collection of all level sets $E(f, z)$, where $f$ is a Besicovitch rationally almost periodic multiplicative function and $z$ is an arbitrary complex number. We show in Subsection 3.3.1 that any set in $\mathcal{D}_{\mathrm{rat}}$ is a rational set.
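Rationality can be illustrated with the squarefree numbers $Q$. For a bound $P$, the set $B_P$ of integers not divisible by $p^2$ for any prime $p \le P$ is periodic, hence a finite union of arithmetic progressions, and it contains $Q$. The sketch below estimates the density of $Q \triangle B_P$ on an initial interval. It is a numerical illustration only; the cut-off $N$ and the function names are assumptions made for the example.

```python
def primes_up_to(P):
    """Primes p <= P by trial division (adequate for small P)."""
    return [p for p in range(2, P + 1) if all(p % q for q in range(2, int(p ** 0.5) + 1))]


def is_squarefree(n):
    """True if no square of a prime divides n."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True


def in_periodic_approximation(n, P):
    """Membership in B_P = {n : p^2 does not divide n for all primes p <= P}."""
    return all(n % (p * p) != 0 for p in primes_up_to(P))


def symmetric_difference_density(N, P):
    """Proportion of n <= N lying in exactly one of Q and B_P."""
    primes = primes_up_to(P)
    bad = sum(
        1
        for n in range(1, N + 1)
        if is_squarefree(n) != all(n % (p * p) != 0 for p in primes)
    )
    return bad / N


if __name__ == "__main__":
    for P in (2, 3, 5, 7):
        print(P, symmetric_difference_density(10_000, P))
```

The printed proportions decrease as $P$ grows, which is the finite shadow of $d(Q \triangle B_P) \to 0$.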

Before we state our main result regarding level sets of multiplicative functions, we remark that the density $d(R)$ of any rational set $R$ exists, which follows quickly from the definition of rationality, and the density $d(E)$ of any set $E$ in $\mathcal{D}$ also exists, which is a result established in [Ruz77] (see Corollary 86 below).

Theorem 24. For any set $E \in \mathcal{D}$ with positive density there exists $R \in \mathcal{D}_{\mathrm{rat}}$ such that $E$ is uniform relative to $R$. If $d(E) \neq 1$ then $R \in \mathcal{D}_{\mathrm{rat}}$ with this property is unique.

As an immediate corollary of Theorem 24 we obtain the following decomposition theorem for indicator functions of level sets of multiplicative functions:

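Corollary 25. For any set $E \in \mathcal{D}$ there exist $f_{\mathrm{rat}} \in \mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi^{(\text{Cesàro})})$ and $f_{\mathrm{uni}} \in \mathrm{Uni}(\mathbb{N})$ such that $1_E = f_{\mathrm{rat}} + f_{\mathrm{uni}}$.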

Theorem 24 allows us to study multiple ergodic averages along level sets of multiplicative functions, such as

\[ \frac{1}{N} \sum_{n=1}^{N} 1_E(n) \, T^{-p_1(n)} f_1 \cdots T^{-p_\ell(n)} f_\ell, \tag{1.4.5} \]

where $T$ is an invertible measure preserving transformation on a probability space $(X, \mathcal{B}, \mu)$, $f_1, \dots, f_\ell \in L^\infty(X, \mathcal{B}, \mu)$, $p_1, \dots, p_\ell$ are polynomials with integer coefficients and $E$ belongs to $\mathcal{D}$. This, in turn, leads to new enhancements of Szemerédi’s theorem. Our results in this direction are presented in the next section (see Theorem 35 and Corollaries 37 and 38).

1.5 Main Results

1.5.1 Generalizations of Furstenberg’s multiple recurrence theorem and refinements of Szemerédi’s theorem

As was mentioned in Subsection 1.4.2, Furstenberg showed in [Fur77] that (1.4.1) implies Szemerédi’s theorem. This is done through what today is called the Furstenberg correspondence principle.

Proposition 26 (Furstenberg’s correspondence principle, see [Fur81, Lemma 3.17] and Lemma 47 below). Let $E \subset \mathbb{N}$ be a set with positive upper density $\bar{d}(E) > 0$. Then there exist an invertible measure preserving system $(X, \mathcal{B}, \mu, T)$ and a set $A \in \mathcal{B}$ with $\mu(A) \geq \bar{d}(E)$ such that for all $n_1, \dots, n_k \in \mathbb{N}$, one has

\[ \bar{d}\big(E \cap (E - n_1) \cap \dots \cap (E - n_k)\big) \geq \mu\big(A \cap T^{-n_1}A \cap \dots \cap T^{-n_k}A\big). \tag{1.5.1} \]

It is clear that by combining (1.4.1) and Proposition 26 (with $n_1 = n, n_2 = 2n, \dots, n_k = kn$) one obtains for any $E \subset \mathbb{N}$ with $\bar{d}(E) > 0$,

\[ \bar{d}\big(E \cap (E - n) \cap \dots \cap (E - kn)\big) > 0 \quad \text{for some } n \in \mathbb{N}. \]

This implies that there exists at least one positive integer $n$ such that $E \cap (E - n) \cap \dots \cap (E - kn) \neq \emptyset$, or equivalently, $E$ contains a $(k+1)$-term arithmetic progression.
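The last equivalence is elementary: $x$ belongs to $E \cap (E - n) \cap \dots \cap (E - kn)$ exactly when $x, x + n, \dots, x + kn$ all belong to $E$. A minimal Python sketch of this bookkeeping for a finite set follows; the function name and the sample set are hypothetical.

```python
def shifted_intersection(E, n, k):
    """Elements x of E ∩ (E - n) ∩ ... ∩ (E - kn), in increasing order.

    Each such x starts the (k+1)-term progression x, x + n, ..., x + k*n inside E.
    """
    E = set(E)
    return sorted(x for x in E if all(x + j * n in E for j in range(1, k + 1)))


if __name__ == "__main__":
    sample = {1, 2, 4, 8, 9, 10, 15, 16}
    for n in (1, 3, 7):
        print(n, shifted_intersection(sample, n, 2))
```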

In this subsection we utilize Proposition 26 to obtain various refinements of Szemerédi’s theorem. Write $\langle I \rangle$ for the subgroup of $\mathbb{T}$ generated by a subset $I \subset \mathbb{T}$. Subsets of $\mathbb{R}$ are tacitly identified with their projections mod 1 onto $\mathbb{T}$. From Theorem 18 one can derive the following result.

Theorem 27 (cf. [Fra04, Theorem 6.4]). Let $q, r \in \mathbb{N}$, and let $(X, \mathcal{B}, \mu, T)$ be an ergodic measure preserving system whose discrete spectrum $\sigma(T)$ satisfies $\sigma(T) \cap \langle q^{-1} \rangle = \{0\}$. For any $f_1, \dots, f_k \in L^\infty(X)$,

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{j(qn+r)} f_j = \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{k} T^{jn} f_j, \tag{1.5.2} \]

where convergence takes place in $L^2(X)$. In particular, if $(X, \mathcal{B}, \mu, T)$ is totally ergodic, then equation (1.5.2) holds for all $q, r \in \mathbb{N}$.

The case k = 3 of Theorem 27 was proven by Host and Kra in [HK02]. In the same paper Theorem 27 for k > 3 was posed as a question ([HK02, Question 2]).

Theorem 27 features multicorrelation sequences along infinite arithmetic progressions $q\mathbb{Z} + r$. The next theorem, which was established by the author in joint work with Joel Moreira [MR16a], is a generalization of Theorem 27 in which infinite arithmetic progressions are replaced by more general Beatty sequences $\{\lfloor \theta n + \gamma \rfloor : n \in \mathbb{Z}\}$.

Theorem 28 ([MR16a]). Let $\theta, \gamma \in \mathbb{R}$ with $\theta > 0$, and let $(X, \mathcal{B}, \mu, T)$ be an ergodic measure preserving system whose discrete spectrum $\sigma(T)$ satisfies $\sigma(T) \cap \langle \theta^{-1} \rangle = \{0\}$. For any $f_1, \dots, f_k \in L^\infty(X)$,

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{j\lfloor \theta n + \gamma \rfloor} f_j = \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{k} T^{jn} f_j, \tag{1.5.3} \]

where convergence takes place in $L^2(X)$. In particular, since discrete spectra are always countable, we have that for any fixed system $(X, \mathcal{B}, \mu, T)$ and for all but countably many $\theta > 0$ equation (1.5.3) holds for all $\gamma \in \mathbb{R}$.

We provide a proof of Theorem 28 in Subsection 3.1.

Theorem 28, together with the fact that the discrete spectrum of any system is countable and a standard application of Proposition 26, implies the following generalization of Szemerédi’s theorem.

Corollary 29 ([MR16a]). Let $k \in \mathbb{N}$, and let $A \subset \mathbb{N}$ have positive upper density. Then for all but countably many $\theta \in \mathbb{R}$ and every $\gamma \in \mathbb{R}$ there exists a $k$-term arithmetic progression in $A$ with common difference in the Beatty sequence $\{\lfloor \theta n + \gamma \rfloor : n \in \mathbb{N}\}$.
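The objects in Corollary 29 are easy to experiment with on finite data. The sketch below generates an initial segment of a Beatty sequence and searches a finite set for a $k$-term arithmetic progression whose common difference lies in that segment. It illustrates the statement only and says nothing about the exceptional countable set of $\theta$; the function names, the choice $\theta = \sqrt{2}$ and the sample set are assumptions of the example.

```python
import math


def beatty(theta, gamma, N):
    """First N terms floor(theta * n + gamma), n = 1, ..., N, of a Beatty sequence."""
    return [math.floor(theta * n + gamma) for n in range(1, N + 1)]


def ap_with_beatty_difference(A, k, theta, gamma, N):
    """Return (a, d) with a, a + d, ..., a + (k-1)d in A and d a positive Beatty term.

    Only the first N Beatty terms are searched; returns None if nothing is found.
    """
    A = set(A)
    differences = sorted({d for d in beatty(theta, gamma, N) if d > 0})
    for d in differences:
        for a in sorted(A):
            if all(a + j * d in A for j in range(k)):
                return a, d
    return None


if __name__ == "__main__":
    multiples_of_three = range(3, 61, 3)
    print(beatty(math.sqrt(2), 0, 10))
    print(ap_with_beatty_difference(multiples_of_three, 3, math.sqrt(2), 0, 10))
```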

Next, we consider multiple ergodic averages of the form

\[ \lim_{N\to\infty} \frac{1}{|R \cap [1,N]|} \sum_{n=1}^{N} 1_R(n) \, \mu\big(A \cap T^{-p_1(n)}A \cap \dots \cap T^{-p_\ell(n)}A\big). \tag{1.5.4} \]

In Subsection 3.2.1 we show that for any rational set $R$ (cf. Definition 23) the limit in (1.5.4) exists. Furthermore, we give necessary and sufficient conditions on $R$ for this limit to be positive. This, in turn, allows us to obtain new refinements of the polynomial Szemerédi theorem.

Definition 30 (cf. [BH96, Definition 1.5]). We say that $R \subset \mathbb{N}$ is an averaging set of polynomial multiple recurrence if for any invertible measure preserving system $(X, \mathcal{B}, \mu, T)$, $A \in \mathcal{B}$ with $\mu(A) > 0$, $\ell \in \mathbb{N}$ and polynomials $p_1, \dots, p_\ell \in \mathbb{Z}[x]$ with $p_i(0) = 0$ for all $i \in \{1, \dots, \ell\}$, the limit in (1.5.4) exists and is positive. If $\ell = 1$ then we speak of an averaging set of polynomial single recurrence.

An averaging set of (single or multiple) polynomial recurrence $R \subset \mathbb{N}$ must also be a set of recurrence, i.e., for each measure preserving system $(X, \mathcal{B}, \mu, T)$ and each $A \in \mathcal{B}$ with $\mu(A) > 0$ there exists $n \in R$ such that $\mu(A \cap T^{-n}A) > 0$. If we assume that the density $d(R) = \lim_{N\to\infty} \frac{1}{N} |R \cap [1,N]|$ exists and is positive then it follows, by considering cyclic rotations on finitely many points, that the density of $R \cap u\mathbb{N}$ also exists and is positive for any positive integer $u$. This divisibility property is a rather trivial but necessary condition for a positive density set to be “good” for averaging recurrence. This leads to the following definition.

Definition 31. Let R ⊂ N. We say that R is divisible if d(R ∩ uN) exists and is positive for all u ∈ N.
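A quick numerical experiment shows what Definition 31 demands. The squarefree numbers $Q$ contain no multiple of 4, so $Q$ is not divisible, whereas the shift $Q - 1 = \{n : n + 1 \in Q\}$ meets $4\mathbb{N}$ in a set of positive density. The sketch below estimates $d(R \cap u\mathbb{N})$ on an initial interval; it is an illustration under the assumption that a finite cut-off $N$ is a reasonable proxy for the density, and the function names are hypothetical.

```python
def is_squarefree(n):
    """True if no square of a prime divides n."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True


def density_in_multiples(indicator, u, N):
    """Finite proxy |R ∩ uN ∩ [1, N]| / N for d(R ∩ uN), with R given by its indicator."""
    return sum(1 for n in range(u, N + 1, u) if indicator(n)) / N


def shifted_squarefree(n):
    """Indicator of Q - 1 = {n : n + 1 is squarefree}."""
    return is_squarefree(n + 1)


if __name__ == "__main__":
    N = 10_000
    for u in (2, 4, 9):
        print(u, density_in_multiples(is_squarefree, u, N),
              density_in_multiples(shifted_squarefree, u, N))
```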

Note that for rational sets the existence of $d(R \cap u\mathbb{N})$ follows immediately from the definition. Therefore, to verify divisibility, it suffices to check the positivity of $d(R \cap u\mathbb{N})$ for all $u \in \mathbb{N}$. The next theorem, which first appeared in [BKPLR16] and which we will prove in Subsection 3.2.2 using Theorem 20, asserts that for rational sets divisibility is not only a necessary but also a sufficient condition for averaging recurrence:

Theorem 32 ([BKPLR16]). Let R ⊂ N be a rational set and assume d(R) > 0. The following are equivalent:

1. R is divisible.

2. R is an averaging set of polynomial single recurrence.

3. R is an averaging set of polynomial multiple recurrence.

The following combinatorial corollary of Theorem 32 follows with the help of Furstenberg’s correspondence principle (see Proposition 26).

Corollary 33 ([BKPLR16]). Let $R \subset \mathbb{N}$ be rational and divisible. Then for any set $E \subset \mathbb{N}$ with $\bar{d}(E) > 0$ and any polynomials $p_1, \dots, p_\ell \in \mathbb{Z}[x]$ with $p_i(0) = 0$ for all $i \in \{1, \dots, \ell\}$, there exists $\beta > 0$ such that the set

\[ \big\{ n \in R : \bar{d}\big(E \cap (E - p_1(n)) \cap \dots \cap (E - p_\ell(n))\big) > \beta \big\} \]

has positive lower density.

We provide a proof of Corollary 33 in Section 3.2.3.

We also obtain the following amplified version of Corollary 33, a proof of which is also contained in Section 3.2.3.

Corollary 34 ([BKPLR16]). Let $R \subset \mathbb{N}$ be rational and divisible. Then for any $E \subset \mathbb{N}$ with $\bar{d}(E) > 0$ and any polynomials $p_1, \dots, p_\ell \in \mathbb{Z}[x]$ with $p_i(0) = 0$ for all $i \in \{1, \dots, \ell\}$, there exists a subset $R' \subset R$ satisfying $d(R') > 0$ and such that for any finite subset $F \subset R'$, we have

\[ \bar{d}\Big( \bigcap_{n \in F} \big( E \cap (E - p_1(n)) \cap \dots \cap (E - p_\ell(n)) \big) \Big) > 0. \]

The next result, which is obtained by combining Theorem 24 with Theorem 32, can be viewed as a generalization of [FH17b, Theorem 1.2, part (i)] and was first proven in [BKPLR17].

Theorem 35 ([BKPLR17]). Let E ∈ D have positive density and let r ∈ N ∪ {0}. Then the following are equivalent:

• E − r is divisible;

• E − r is an averaging set of recurrence;

• E − r is an averaging set of polynomial multiple recurrence.

In view of Theorem 35, it is of interest to determine for which integers r the set E − r is divisible.

Proposition 36 ([BKPLR17]). Suppose E ∈ D has positive density. Let R ∈ Drat be as guaranteed by Theorem 24. Then for all r ∈ R the set E − r is divisible.

In [BR02], it was proven by Bergelson and Ruzsa that every self-shift of the set of squarefree numbers $Q$ (i.e., any set of the form $Q - r$ for $r \in Q$) is good for polynomial multiple recurrence. Combining Theorem 35 and Proposition 36 yields a result of similar nature for all sets of positive density belonging to $\mathcal{D}$.

Corollary 37 ([BKPLR17]). Suppose $E \in \mathcal{D}$ has positive density. Then every self-shift of $E$ is an averaging set of polynomial multiple recurrence.

Corollary 37 implies, via Furstenberg’s correspondence principle, the following combinatorial result.

Corollary 38 ([BKPLR17]). Let $E$ be a set that belongs to $\mathcal{D}$ and suppose $E$ has positive density. Then for any set $D \subset \mathbb{N}$ with positive upper density, any polynomials $p_i \in \mathbb{Z}[t]$, $i = 1, \dots, \ell$, which satisfy $p_i(0) = 0$ for all $i \in \{1, \dots, \ell\}$, and any $r \in E$ there exists $\beta > 0$ such that the set

\[ \big\{ n \in E - r : \bar{d}\big(D \cap (D - p_1(n)) \cap \dots \cap (D - p_\ell(n))\big) > \beta \big\} \]

has positive lower density.

1.5.2 The Erdős sumset conjecture

We begin this section by stating a classical result of Hindman [Hin79].

Theorem 39 (Hindman’s theorem, [Hin79]). Let $r \in \mathbb{N}$. For any partition of $\mathbb{N}$ into $r$ cells, $\mathbb{N} = C_1 \cup \dots \cup C_r$, one of the cells contains all finite sums of some infinite set, i.e., there exist $i \in \{1, \dots, r\}$ and an infinite set $B \subset \mathbb{N}$ such that $\{\sum_{n \in F} n : F \subset B,\ 0 < |F| < \infty\} \subset C_i$.

As a corollary of Theorem 39 it follows that, whenever N is finitely partitioned, one of the cells contains B + C := {b + c : b ∈ B, c ∈ C} for some infinite sets B,C ⊂ N. The following conjectured density analogue, attributed to Erdős in [Nat80], is called an “old problem” in [EG80, p. 85].

Erdős sumset conjecture. If $A \subset \mathbb{N}$ satisfies $\bar{d}(A) > 0$ then $A$ contains $B + C := \{b + c : b \in B, c \in C\}$, where $B$ and $C$ are infinite subsets of $\mathbb{N}$.

Nathanson [Nat80] showed that a set $A$ with positive upper density contains a sum $B + C$ for a set $B$ of positive density and a set $C$ of any finite cardinality. More recently Di Nasso, Goldbring, Jin, Leth, Lupini and Mahlburg [DNGJLLM15] employed non-standard analysis and ideas from ergodic theory to show that a set $A \subset \mathbb{N}$ with upper density greater than 1/2 contains a sum $B + C$ where $B$ and $C$ are infinite sets. As a corollary, derived using Ramsey’s theorem and a result of Hindman, it follows that if $A$ has positive upper density, then for some $t \in \mathbb{N}$ the union $A \cup (A - t)$ contains a sum $B + C$ where $B$ and $C$ are infinite sets. Some further progress on a variant of the Erdős sumset conjecture was also made in [ACG17].

This thesis contains a proof of the Erdős sumset conjecture. In fact we prove a slightly stronger result (see Theorem 40 below). Recall that a Følner sequence in $\mathbb{N}$ is any sequence $\Phi\colon N \mapsto \Phi_N$ of finite, non-empty subsets of $\mathbb{N}$ satisfying

\[ \lim_{N\to\infty} \frac{|(\Phi_N - 1) \cap \Phi_N|}{|\Phi_N|} = 1. \]
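The ratio in this definition is easy to compute for concrete finite sets. The Python sketch below evaluates it exactly; for the initial intervals $\Phi_N = \{1, \dots, N\}$ it equals $(N-1)/N$, which tends to 1, while for $\Phi_N = \{2, 4, \dots, 2N\}$ it is always 0, so the latter is not a Følner sequence. The function name is hypothetical and the code is only an illustration.

```python
from fractions import Fraction


def folner_ratio(Phi_N):
    """Exact value of |(Phi_N - 1) ∩ Phi_N| / |Phi_N| for a finite non-empty set."""
    Phi_N = set(Phi_N)
    shifted = {n - 1 for n in Phi_N}  # the set Phi_N - 1
    return Fraction(len(shifted & Phi_N), len(Phi_N))


if __name__ == "__main__":
    for N in (10, 100, 1000):
        print(N, folner_ratio(range(1, N + 1)), folner_ratio(range(2, 2 * N + 1, 2)))
```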

Given a Følner sequence $\Phi$ and a set $A \subset \mathbb{N}$ the quantity

\[ \bar{d}_\Phi(A) := \limsup_{N\to\infty} \frac{|A \cap \Phi_N|}{|\Phi_N|} \]

is the upper density of $A$ with respect to $\Phi$. If the limit exists we denote it by $d_\Phi(A)$ and call it the density of $A$ with respect to $\Phi$. The next theorem, which was obtained by the author in joint work with Joel Moreira and Donald Robertson [MRR18], establishes a generalization of the Erdős sumset conjecture to Følner sequences.

Theorem 40. Let $\Phi$ be a Følner sequence on $\mathbb{N}$. For every $A \subset \mathbb{N}$ that satisfies $\bar{d}_\Phi(A) > 0$ one can find infinite sets $B, C \subset \mathbb{N}$ with $B + C \subset A$.

A proof of Theorem 40 is contained in Section 3.4. Two key ingredients used in this proof are Theorems 13 and 16, which were formulated in Section 1.4 above.

CHAPTER 2

PROOFS OF DECOMPOSITION THEOREMS

This chapter contains the proofs of the decomposition theorems that were formulated in Section 1.4, namely Theorems 15, 16, 18, 20, 21 and 24.

2.1 Proofs of decomposition theorems for $L^2(\mathbb{N},\Phi)$

We begin with proving Theorems 15 and 16. Some of the content of this section was taken verbatim from [MRR18].

2.1.1 A completeness lemma for $L^2(\mathbb{N},\Phi)$

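Given a Følner sequence $\Phi$ on $\mathbb{N}$ and functions $f, h\colon \mathbb{N} \to \mathbb{C}$, define the “inner product”

\[ \langle f, h \rangle_\Phi = \lim_{N\to\infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(n) \overline{h(n)} \]

whenever the limit exists.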

Observe that $L^2(\mathbb{N},\Phi)$ is not a Hilbert space. Indeed, $\|\cdot\|_\Phi = \sqrt{\langle \cdot, \cdot \rangle_\Phi}$ is not a norm, the limit defining the inner product $\langle f, h \rangle_\Phi$ need not exist for all $f, h \in L^2(\mathbb{N},\Phi)$, and the space $L^2(\mathbb{N},\Phi)$ need not be complete with respect to $\|\cdot\|_\Phi$. To address the latter issue, we make use of the following proposition. We say that a sequence $j \mapsto f_j$ of functions $\mathbb{N} \to \mathbb{C}$ is Cauchy with respect to $\|\cdot\|_\Phi$ if, for all $\varepsilon > 0$, there exists $N \in \mathbb{N}$ such that for all $j, k \geq N$ one has $\|f_k - f_j\|_\Phi \leq \varepsilon$.

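Proposition 41. Let $\Phi$ denote a Følner sequence on $\mathbb{N}$. Let $j \mapsto f_j$ be a sequence in $L^2(\mathbb{N},\Phi)$ that is Cauchy with respect to $\|\cdot\|_\Phi$. Then there exists a subsequence $\Psi$ of $\Phi$ and $f \in L^2(\mathbb{N},\Psi)$ such that $\|f - f_j\|_\Psi \to 0$ as $j \to \infty$.

Proof. Since $j \mapsto f_j$ is Cauchy and the Besicovitch seminorm $\|\cdot\|_\Phi$ satisfies the triangle inequality, it suffices to prove that a subsequence of $f_j$ has a limit. We therefore assume, by passing to a subsequence if necessary, that for all $j \in \mathbb{N}$ and all $k \geq j$ we have $\|f_k - f_j\|_\Phi^2 \leq \frac{1}{j}$. In particular, with $C := (\|f_1\|_\Phi + 1)^2$ the estimate $\|f_k\|_\Phi^2 \leq C$ is valid for all $k \in \mathbb{N}$. Now, for every $k \in \mathbb{N}$, pick $N(k) \in \mathbb{N}$ such that $N(k+1) > N(k)$ and that, for all $N \geq N(k)$ and all $j \in \{1, \dots, k\}$, one has

\[ \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} |f_j(n) - f_k(n)|^2 \leq \frac{2}{j} \quad\text{and}\quad \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} |f_j(n)|^2 \leq 2C. \]

Also, by further refining the subsequence $k \mapsto N(k)$ if necessary, we can assume that

\[ |\Phi_{N(k)}| \geq k^2 \max\Big\{ \sum_{n \in \Phi_{N(i)}} |f_k(n) - f_i(n)|^2 : 1 \leq i < k \Big\} \]

for all $k > 1$. Define the Følner sequence $\Psi$ by $\Psi_k := \Phi_{N(k)}$ for all $k \in \mathbb{N}$.

Let $\Xi_M := \Psi_M \setminus \big(\bigcup_{k=1}^{M-1} \Psi_k\big)$ and set $\zeta_M := \Psi_M \setminus \Xi_M$, the latter being a subset of $\bigcup_{i=1}^{M-1} \Psi_i$. Define $f\colon \mathbb{N} \to \mathbb{C}$ by

\[ f(n) := \sum_{M=1}^{\infty} 1_{\Xi_M}(n) f_M(n) \]

for all $n \in \mathbb{N}$. Using $|x + y|^2 / 2 \leq |x|^2 + |y|^2$, for each $j \leq M \in \mathbb{N}$ we have the estimate

\[ \frac{1}{2} \sum_{n \in \Psi_M} |f_j(n) - f(n)|^2 \leq \sum_{n \in \Psi_M} |f_j(n) - f_M(n)|^2 + \sum_{n \in \zeta_M} |f_M(n) - f(n)|^2 \]
\[ \leq \frac{2|\Psi_M|}{j} + \sum_{i=1}^{M-1} \sum_{n \in \Xi_i} |f_M(n) - f_i(n)|^2 \]
\[ \leq \frac{2|\Psi_M|}{j} + \frac{|\Psi_M|}{M}, \]

where the second step uses that $f = f_M$ on $\Xi_M$ and $f = f_i$ on $\Xi_i \subset \Psi_i$. Dividing by $|\Psi_M|$ and letting $M \to \infty$ proves that $\|f - f_j\|_\Psi^2 \leq 4/j$, which tends to 0 as $j \to \infty$.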
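The gluing step in this proof, in which $f$ agrees with $f_M$ on the part $\Xi_M$ of $\Psi_M$ not already covered by $\Psi_1, \dots, \Psi_{M-1}$, can be sketched in a few lines of Python for finitely many finite sets. This is only a toy model of the construction: the sets and functions in the usage example are made up, and the helper names are hypothetical.

```python
def fresh_parts(Psi):
    """Xi_M := Psi_M minus the union of Psi_1, ..., Psi_{M-1}, for each M."""
    seen, parts = set(), []
    for Psi_M in Psi:
        Psi_M = set(Psi_M)
        parts.append(Psi_M - seen)
        seen |= Psi_M
    return parts


def glue(functions, parts):
    """f(n) := sum over M of 1_{Xi_M}(n) * f_M(n); the parts Xi_M are pairwise disjoint."""
    def f(n):
        for f_M, Xi_M in zip(functions, parts):
            if n in Xi_M:
                return f_M(n)
        return 0  # n lies in no Xi_M, so every term of the sum vanishes
    return f


if __name__ == "__main__":
    Psi = [{1, 2}, {1, 2, 3, 4}, {3, 4, 5, 6, 7, 8}]
    Xi = fresh_parts(Psi)
    f = glue([lambda n: 1, lambda n: 2, lambda n: 3], Xi)
    print(Xi, [f(n) for n in range(1, 10)])
```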

We will also make use of the following version of Bessel’s inequality.

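Lemma 42 (Bessel’s inequality). Let $u_1, u_2, \dots$ be a sequence in $L^2(\mathbb{N},\Phi)$ such that $\|u_j\|_\Phi = 1$ for all $j \in \mathbb{N}$ and $\langle u_j, u_k \rangle_\Phi$ exists and is 0 for all $j \neq k$. If $u \in L^2(\mathbb{N},\Phi)$ is such that $\langle u, u_j \rangle_\Phi$ exists for all $j \in \mathbb{N}$, then

\[ \sum_{j=1}^{\infty} |\langle u, u_j \rangle_\Phi|^2 \leq \|u\|_\Phi^2 \]

holds.

Proof. It suffices to show that

\[ \sum_{j=1}^{J} |\langle u, u_j \rangle_\Phi|^2 \leq \|u\|_\Phi^2 \tag{2.1.1} \]

for every $J \in \mathbb{N}$. Fix $N \in \mathbb{N}$ and write

\[ [f, h] = \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(n) \overline{h(n)} \]

for all $f, h\colon \mathbb{N} \to \mathbb{C}$. We have

\[ 0 \leq \Big[ u - \sum_{j=1}^{J} u_j [u, u_j],\ u - \sum_{k=1}^{J} u_k [u, u_k] \Big] = [u, u] - 2 \sum_{j=1}^{J} |[u, u_j]|^2 + \sum_{j,k=1}^{J} [u, u_j] \overline{[u, u_k]} [u_j, u_k], \]

whence

\[ \sum_{j=1}^{J} |[u, u_j]|^2 \leq [u, u] + \Big( \sum_{j,k=1}^{J} [u, u_j] \overline{[u, u_k]} [u_j, u_k] - \sum_{j=1}^{J} |[u, u_j]|^2 \Big) \tag{2.1.2} \]

holds. Since the $u_j$ are pairwise orthogonal,

\[ \lim_{N\to\infty} \Big( \sum_{j,k=1}^{J} [u, u_j] \overline{[u, u_k]} [u_j, u_k] - \sum_{j=1}^{J} |[u, u_j]|^2 \Big) = 0, \]

so taking the limit $N \to \infty$ in (2.1.2) gives (2.1.1) as desired.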
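The expansion of the finite-$N$ bracket used in this proof is a purely algebraic identity and can be checked numerically. The sketch below does so for exponential functions on an interval; the choice of $u$, of the $u_j$ and of $\Phi_N$ is an assumption made for the example, and the function names are hypothetical.

```python
import cmath


def e(t):
    return cmath.exp(2j * cmath.pi * t)


def bracket(f, h, Phi_N):
    """[f, h] = (1/|Phi_N|) * sum over n in Phi_N of f(n) * conj(h(n))."""
    Phi_N = list(Phi_N)
    return sum(complex(f(n)) * complex(h(n)).conjugate() for n in Phi_N) / len(Phi_N)


def bessel_terms(u, us, Phi_N):
    """Return ([u,u], S, D, R) with S = sum |[u,u_j]|^2, D the double sum, R = [r, r]."""
    c = [bracket(u, uj, Phi_N) for uj in us]
    S = sum(abs(cj) ** 2 for cj in c)
    D = sum(
        c[j] * c[k].conjugate() * bracket(us[j], us[k], Phi_N)
        for j in range(len(us))
        for k in range(len(us))
    ).real

    def residual(n):
        # r = u - sum_j u_j [u, u_j]
        return u(n) - sum(cj * uj(n) for cj, uj in zip(c, us))

    R = bracket(residual, residual, Phi_N).real
    return bracket(u, u, Phi_N).real, S, D, R


if __name__ == "__main__":
    us = [lambda n, j=j: e(j * n / 7) for j in (1, 2, 3)]
    u = lambda n: 2 * e(n / 7) + e(3 * n / 7) + (n % 2)
    uu, S, D, R = bessel_terms(u, us, range(1, 211))
    print(uu, S, D, R)
```

Here $R = [u,u] - 2S + D \geq 0$ up to rounding, which is the inequality from which (2.1.2) is derived.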

2.1.2 Decomposing functions from $L^2(\mathbb{N},\Phi)$ into almost periodic and coperiodic components

In this subsection we prove Theorem 15, which contains Theorem 13 as a special case. We will need the following lemma.

Lemma 43. Let $U$ be a projection family and let $\Phi$ be a Følner sequence on $\mathbb{N}$. For every $f \in L^2(\mathbb{N},\Phi)$ there exists a subsequence $\tilde\Phi$ of $\Phi$ such that for every subsequence $\Psi$ of $\tilde\Phi$, the inner product $\langle f, u \rangle_\Psi$ exists whenever $u \in U(\Psi)$.

Proof. We start by constructing for every $k \in \mathbb{N} \cup \{0\}$ a Følner sequence $\Phi^{(k)}$ and elements $u_0, u_1, \dots, u_k \in U(\Phi^{(k)})$ inductively. Define $\Phi^{(0)} := \Phi$ and $u_0 := 0$.

Suppose we have defined functions $u_0, \dots, u_{k-1} \in U(\Phi^{(k-1)})$ and a Følner sequence $\Phi^{(k-1)}$ such that $\langle f, u_i \rangle_{\Phi^{(k-1)}}$ exists for all $0 \leq i \leq k-1$. For each Følner subsequence $\Phi'$ of $\Phi^{(k-1)}$, let

\[ O_{k-1}(\Phi') := \{ u \in U(\Phi') : \langle u, u_i \rangle_{\Phi'} = 0,\ \forall i \in \{0, \dots, k-1\} \} \]

and define

\[ \delta_k := \sup\big\{ |\langle f, u \rangle_{\Phi'}| : \Phi' \text{ is a Følner subsequence of } \Phi^{(k-1)},\ u \in O_{k-1}(\Phi') \text{ with } \|u\|_{\Phi'} = 1 \text{ and } \langle f, u \rangle_{\Phi'} \text{ exists} \big\}. \]

Choose a Følner subsequence $\Phi^{(k)}$ of $\Phi^{(k-1)}$ and $u_k \in O_{k-1}(\Phi^{(k)})$ with $\|u_k\|_{\Phi^{(k)}} \leq 1$ such that $\langle f, u_k \rangle_{\Phi^{(k)}}$ exists and $|\langle f, u_k \rangle_{\Phi^{(k)}}| \geq \delta_k - \frac{1}{k}$. Then $\langle f, u_i \rangle_{\Phi^{(k)}}$ exists for all $0 \leq i \leq k$. Define $\tilde\Phi_N := \Phi^{(N)}_N$. It is a subsequence of $\Phi^{(1)}$ and therefore itself a Følner sequence.

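We claim that for every subsequence $\Psi$ of $\tilde\Phi$ and for all $u \in U(\Psi)$ the inner product $\langle f, u \rangle_\Psi$ exists. More precisely, we claim that

\[ \langle f, u \rangle_\Psi = \sum_{i=1}^{\infty} \langle f, u_i \rangle_\Psi \overline{\langle u, u_i \rangle_\Psi}. \]

Note that the terms in the above series are well defined, since $\langle u, u_i \rangle_\Psi$ exists because $u, u_i \in U(\Psi)$ and $\langle f, u_i \rangle_\Psi$ exists by construction of $\Psi$. Moreover, this series is absolutely convergent, because Lemma 42 implies that the sequences $i \mapsto \langle f, u_i \rangle_\Psi$ and $i \mapsto \langle u, u_i \rangle_\Psi$ are in $\ell^2(\mathbb{N})$.

For each $k \in \mathbb{N}$, define

\[ v_k := u - \sum_{i=1}^{k-1} u_i \langle u, u_i \rangle_\Psi \]

and observe that $v_k \in O_{k-1}(\Psi)$ and that $\|v_k\|_\Psi \leq \|u\|_\Psi$. Therefore

\[ \limsup_{N\to\infty} \Big| \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} f(n) \overline{u(n)} - \sum_{i=1}^{\infty} \langle f, u_i \rangle_\Psi \overline{\langle u, u_i \rangle_\Psi} \Big| \]
\[ \leq \limsup_{N\to\infty} \Big| \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} f(n) \overline{v_k(n)} \Big| + \Big| \sum_{i=k}^{\infty} \langle f, u_i \rangle_\Psi \overline{\langle u, u_i \rangle_\Psi} \Big| \]
\[ \leq \delta_k \|v_k\|_\Psi + \Big| \sum_{i=k}^{\infty} \langle f, u_i \rangle_\Psi \overline{\langle u, u_i \rangle_\Psi} \Big|. \]

It thus suffices to show that $\delta_k \to 0$ as $k \to \infty$. But by Lemma 42, we get

\[ \|f\|_\Psi^2 \geq \sum_{k=1}^{\infty} |\langle f, u_k \rangle_\Psi|^2 \geq \sum_{k=1}^{\infty} \Big( \delta_k - \frac{1}{k} \Big)^2 \]

and since $f \in L^2(\mathbb{N},\Phi)$, the series converges, which implies that indeed $\delta_k \to 0$ as $k \to \infty$.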

Proof of Theorem 15. As guaranteed by Lemma 43, let $\Psi$ be a Følner subsequence of $\Phi$ such that for every $u \in U(\Psi)$ the limit $\langle f, u \rangle_\Psi$ exists. Define

\[ \delta := \inf\{ \|f - u\|_\Psi^2 : u \in U(\Psi) \}. \]

For all $k \in \mathbb{N}$ choose $a_k \in U(\Psi)$ with $\|f - a_k\|_\Psi^2 < \delta + \frac{1}{k}$. An application of the parallelogram law to the vectors $f - a_j$ and $f - a_k$ shows that $\|a_j - a_k\|_\Psi^2 \leq \frac{2}{j} + \frac{2}{k}$, which implies that $k \mapsto a_k$ is a Cauchy sequence with respect to $\|\cdot\|_\Psi$. Using Proposition 41 and by refining $\Psi$ if necessary, we can find $f_U \in L^2(\mathbb{N},\Psi)$ such that $\lim_{k\to\infty} \|f_U - a_k\|_\Psi = 0$. Hence $f_U$ belongs to $U(\Psi)$ and $\|f - f_U\|_\Psi^2 = \delta$. Observe that $f_U$ minimizes the distance between $f$ and $U(\Psi)$.

Write $h := f - f_U$. We claim that $h$ belongs to $U(\Psi)^\perp$. First note that $\langle h, a \rangle_\Psi$ exists for all $a \in U(\Psi)$ because both $\langle f, a \rangle_\Psi$ and $\langle f_U, a \rangle_\Psi$ exist. Next, fix $a \in U(\Psi)$ with $\|a\|_\Psi \leq 1$ and define $I := \langle h, a \rangle_\Psi$. We have

\[ \|h - Ia\|_\Psi^2 = \lim_{N\to\infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} \Big( |h(n)|^2 - h(n) \overline{I a(n)} - \overline{h(n)} I a(n) + |I|^2 |a(n)|^2 \Big) \leq \|h\|_\Psi^2 - |I|^2 (2 - \|a\|_\Psi^2). \]

Since $\|a\|_\Psi^2 \leq 1$ and $\|h\|_\Psi^2 = \delta$, we conclude that $\|h\|_\Psi^2 - |I|^2 (2 - \|a\|_\Psi^2) \leq \delta - |I|^2$. Therefore

\[ \|h - Ia\|_\Psi^2 \leq \delta - |I|^2. \tag{2.1.3} \]

On the other hand, $h - Ia = f - (f_U + Ia)$ and $f_U + Ia \in U(\Psi)$. So

\[ \|h - Ia\|_\Psi^2 \geq \delta. \tag{2.1.4} \]

Combining (2.1.3) and (2.1.4) proves that $I = 0$.

2.1.3 The Jacobs–de Leeuw–Glicksberg splitting for $L^2(\mathbb{N},\Phi)$

The next theorem that we prove is Theorem 16. Beforehand, let us recall the Jacobs–de Leeuw–Glicksberg splitting for Hilbert space isometries, after which Theorem 16 is modeled.

Fix an isometry U on a Hilbert space (H , k · kH ).

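Definition 44. An element $x \in \mathscr{H}$ is compact if $\{U^n x : n \in \mathbb{N}\}$ is a pre-compact subset of $(\mathscr{H}, \|\cdot\|_{\mathscr{H}})$. Equivalently, $x$ is compact if for all $\varepsilon > 0$ there exists $K \in \mathbb{N}$ such that

\[ \min\{ \|U^m x - U^k x\|_{\mathscr{H}} : 1 \leq k \leq K \} \leq \varepsilon \]

for all $m \in \mathbb{N}$.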

Definition 45. An element $x \in \mathscr{H}$ is called weak mixing if for all $\varepsilon > 0$ and all $y \in \mathscr{H}$ the set $\{n \in \mathbb{N} : |\langle U^n x, y \rangle| > \varepsilon\}$ has zero density with respect to every Følner sequence on $\mathbb{N}$.

The set of all compact elements in H , denoted HComp, is a closed and U invariant subspace of H , as is the set HWM of weak mixing elements.

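Theorem 46 (The Jacobs–de Leeuw–Glicksberg splitting, [Jac56; LG61]; see also [Kre85, Chapter 2.4]). Let $U$ be an isometry on a Hilbert space $\mathscr{H}$. Then $\mathscr{H} = \mathscr{H}_{\mathrm{Comp}} \oplus \mathscr{H}_{\mathrm{WM}}$. In particular, for any $x \in \mathscr{H}$ there exist $x_{\mathrm{Comp}} \in \mathscr{H}_{\mathrm{Comp}}$ and $x_{\mathrm{WM}} \in \mathscr{H}_{\mathrm{WM}}$ such that $x = x_{\mathrm{Comp}} + x_{\mathrm{WM}}$.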

The proof of Theorem 16 requires some lemmas, the first of which is essentially [Fur81, Lemma 4.23].

The next lemma, which represents an arbitrary bounded sequence as a continuous function evaluated along the orbit of a point in a transitive topological dynamical system, can be seen as a version of the Furstenberg correspondence principle (see Proposition 26). In fact, it allows one to represent a countable collection of bounded sequences with the help of the same transitive topological dynamical system; in this strengthened form it will contribute to the proof of Theorem 143.

Lemma 47. Let $J$ be a finite or countably infinite set and let $\{a_i : i \in J\}$ be a collection of bounded functions from $\mathbb{N}$ to $\mathbb{C}$. Then there exists a compact metric space $X$, a continuous map $S\colon X \to X$, functions $F_i \in C(X)$ for each $i \in J$, and a point $x \in X$ with a dense orbit under $S$ such that

\[ a_i(n) = F_i(S^n x) \quad \forall n \in \mathbb{N},\ \forall i \in J. \tag{2.1.5} \]

Proof. Let $D_i \subset \mathbb{C}$ be a compact set containing the image of $a_i$ and let $Y := \prod_{i \in J} D_i^{\mathbb{N} \cup \{0\}}$. We can identify $Y$ with the space of all sequences $y\colon J \times (\mathbb{N} \cup \{0\}) \to \mathbb{C}$ that satisfy $y(i, n) \in D_i$ for all $n \in \mathbb{N} \cup \{0\}$ and $i \in J$. Given a point $y \in Y$ we define $S(y)$ as

\[ (Sy)(i, n) = y(i, n + 1). \]

Let $x$ be the point $x(i, n) := a_i(n)$ and let $X$ be the orbit closure of $x$ under the action of $S$. Then $X$ is a compact metric space. Moreover, if we define $F_i(y) := y(i, 0)$ then (2.1.5) is satisfied.
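The mechanism behind (2.1.5) is just bookkeeping with the shift map, and can be mimicked in Python by modelling points of $Y$ as functions of $(i, n)$. The sketch below is an illustration only: it ignores the topology, assumes that each $a_i$ is also defined at $n = 0$, and uses hypothetical names throughout.

```python
def shift(y):
    """The map S on points y(i, n): (S y)(i, n) = y(i, n + 1)."""
    return lambda i, n: y(i, n + 1)


def evaluate_at_zero(i):
    """The function F_i(y) := y(i, 0)."""
    return lambda y: y(i, 0)


def orbit_point(x, n):
    """The point S^n x, obtained by applying the shift n times."""
    y = x
    for _ in range(n):
        y = shift(y)
    return y


if __name__ == "__main__":
    # Two made-up bounded sequences a_0 and a_1.
    a = {0: lambda n: (-1) ** n, 1: lambda n: n % 3}
    x = lambda i, n: a[i](n)
    print([evaluate_at_zero(1)(orbit_point(x, n)) for n in range(8)])
```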

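Lemma 48. Let $\Phi$ be a Følner sequence on $\mathbb{N}$ and let $f \in L^2(\mathbb{N},\Phi)$. Suppose for all $\varepsilon > 0$ there exists $K \in \mathbb{N}$ such that $\min\{ \|R^m f - R^k f\|_\Phi : 1 \leq k \leq K \} < \varepsilon$. Then there exists a subsequence $\Psi$ of $\Phi$ such that $f \in \mathrm{Comp}(\mathbb{N},\Psi)$.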

Proof. In light of Lemma 47 we can find a compact metric space $X$, a continuous map $S\colon X \to X$, a function $F \in C(X)$ and a point $x \in X$ with a dense orbit under $S$ such that $F(S^n(x)) = f(n)$ for all $n \in \mathbb{N}$. Since $X$ is a compact metric space, we can find a subsequence $\Psi$ of $\Phi$ such that the measures

\[ \mu_N := \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} \delta_{S^n x} \]

weak$^*$ converge to an $S$-invariant probability measure $\mu$ on $X$. Since $\|R^m f - R^k f\|_\Phi = \|S^m F - S^k F\|_{L^2(\mu)}$, it follows that for all $\varepsilon > 0$ there exists $K \in \mathbb{N}$ such that $\min\{ \|S^m F - S^k F\|_{L^2(\mu)} : 1 \leq k \leq K \} < \varepsilon$. Therefore, $F$ is a compact element in $L^2(\mu)$. Let $\mathcal{C}$ denote the smallest $S$-invariant sub-sigma-algebra of the Borel sigma-algebra of $X$ with respect to which the function $F$ is measurable. Since $F$ is compact, it belongs to the closure of the subspace spanned by all eigenfunctions (cf. [Kre85, Chapter 2, Theorem 4.5]) and hence $(X, \mathcal{C}, \mu, S)$ has discrete spectrum. In other words, $(X, \mathcal{C}, \mu, S)$ is a measure-theoretic Kronecker system. Since $f(n) = F(S^n x)$ and $x$ is generic for $\mu$ along $\Psi$, we conclude that $f$ is generated along $\Psi$ by $(X, \mathcal{C}, \mu, S)$. In other words, $f \in \mathrm{Comp}(\mathbb{N},\Psi)$.

We are finally ready to prove Theorem 16.

Proof of Theorem 16. We will first deal with the case where $f \in L^2(\mathbb{N}, \Phi)$ is bounded and then derive from it the general case.

Using Lemma 47 we can find a compact metric space $X$, a continuous map $S : X \to X$, a function $F \in C(X)$ and a point $x \in X$ with a dense orbit under $S$ such that $F(S^n(x)) = f(n)$ for all $n \in \mathbb{N}$. Since $X$ is a compact metric space, we can find a subsequence $\Psi$ of $\Phi$ such that the measures

$$\mu_N := \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} \delta_{S^n x}$$

converge in the weak$^*$ topology to an $S$-invariant probability measure $\mu$ on $X$. Let $\mathcal{B}$ denote the Borel σ-algebra on $X$. The transformation $S$ induces an isometry $U$ of $L^2(X, \mathcal{B}, \mu)$ via $U(H) = H \circ S$ for all $H \in L^2(X, \mathcal{B}, \mu)$. Let $F = F_{\mathrm{Comp}} + F_{\mathrm{WM}}$ be the Jacobs–de Leeuw–Glicksberg decomposition of $F$ given by Theorem 46.

Next, for each $j \in \mathbb{N}$, let $H_j \in C(X)$ be such that $\|F_{\mathrm{Comp}} - H_j\|_\mu < 1/j$. Let $h_j(n) = H_j(S^n x)$ for all $n \in \mathbb{N}$ and observe that

$$\|h_j - h_\ell\|_\Psi^2 = \limsup_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} \big|H_j(S^n x) - H_\ell(S^n x)\big|^2 = \int_X |H_j - H_\ell|^2 \, d\mu = \|H_j - H_\ell\|_\mu^2,$$

which implies in particular that $j \mapsto h_j$ is a Cauchy sequence in $L^2(\Psi)$. Using Proposition 41, after refining $\Psi$ if necessary, we can find a function $f_{\mathrm{Comp}} \in L^2(\mathbb{N}, \Psi)$ such that $\|h_j - f_{\mathrm{Comp}}\|_\Psi \to 0$ as $j \to \infty$. We also define $f_{\mathrm{WM}}$ to be $f - f_{\mathrm{Comp}}$.

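To show that $f_{\mathrm{Comp}}$ is compact with respect to $\Psi$, fix $\varepsilon > 0$ and let $K \in \mathbb{N}$ be such that $\min\{\|S^m F_{\mathrm{Comp}} - S^k F_{\mathrm{Comp}}\|_\mu : 1 \le k \le K\} < \varepsilon$ for every $m \in \mathbb{N}$. Then taking $j > 1/\varepsilon$ large enough so that $\|h_j - f_{\mathrm{Comp}}\|_\Psi < \varepsilon$ we have

$$\begin{aligned}
\|R^m f_{\mathrm{Comp}} - R^k f_{\mathrm{Comp}}\|_\Psi &\le \|R^m h_j - R^k h_j\|_\Psi + 2\varepsilon \\
&= \|S^m H_j - S^k H_j\|_\mu + 2\varepsilon \\
&\le \|S^m F_{\mathrm{Comp}} - S^k F_{\mathrm{Comp}}\|_\mu + 4\varepsilon,
\end{aligned}$$

and hence $\min\{\|R^m f_{\mathrm{Comp}} - R^k f_{\mathrm{Comp}}\|_\Psi : 1 \le k \le K\} < 5\varepsilon$. In light of Lemma 48, after refining $\Psi$ if necessary, we obtain that $f_{\mathrm{Comp}} \in \mathrm{Comp}(\mathbb{N}, \Psi)$. Also, since $F_{\mathrm{Comp}}$ realizes the minimal distance between $F$ and all compact functions in $L^2(X, \mathcal{B}, \mu)$, it follows that $f_{\mathrm{Comp}}$ realizes the minimal distance between $f$ and $\mathrm{Comp}(\mathbb{N}, \Psi)$.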

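To prove that $f_{\mathrm{WM}}$ is weak mixing with respect to $\Psi$, let $h : \mathbb{N} \to \mathbb{C}$ be bounded and let $\Psi'$ be a Følner subsequence of $\Psi$ such that the correlations $\langle R^n f, h \rangle_{\Psi'}$ exist for every $n \in \mathbb{N}$. Using Lemma 47 again, we can find another compact metric space $\tilde{X}$, a continuous map $\tilde{S} : \tilde{X} \to \tilde{X}$, a function $\tilde{F} \in C(\tilde{X})$ and a point $\tilde{x} \in \tilde{X}$ with a dense orbit under $\tilde{S}$ such that $\tilde{F}(\tilde{S}^n(\tilde{x})) = h(n)$ for all $n \in \mathbb{N}$. Let $Z \subset X \times \tilde{X}$ be the orbit closure of $(x, \tilde{x})$ under $S \times \tilde{S}$. Since $Z$ is a compact metric space, we can find a subsequence $\Psi''$ of $\Psi'$ such that the measures

$$\nu_N := \frac{1}{|\Psi''_N|} \sum_{n \in \Psi''_N} \delta_{(S \times \tilde{S})^n (x, \tilde{x})}$$

converge in the weak$^*$ topology to an invariant probability measure $\nu$ on $Z$. For all $\varepsilon > 0$, if $j$ is sufficiently large, then

$$\begin{aligned}
\big|\langle R^m f_{\mathrm{WM}}, h \rangle_{\Psi'}\big| &\le \big|\langle R^m (f - h_j), h \rangle_{\Psi''}\big| + \varepsilon \\
&= \Big| \lim_{N \to \infty} \frac{1}{|\Psi''_N|} \sum_{n \in \Psi''_N} (f - h_j)(n + m)\, h(n) \Big| + \varepsilon \\
&= \Big| \lim_{N \to \infty} \frac{1}{|\Psi''_N|} \sum_{n \in \Psi''_N} (F - H_j)(S^{n+m} x)\, \tilde{F}(\tilde{S}^n \tilde{x}) \Big| + \varepsilon \\
&= \Big| \int_Z (S \times \tilde{S})^m \big((F - H_j) \otimes 1\big)\,(1 \otimes \tilde{F}) \, d\nu \Big| + \varepsilon \\
&\le \Big| \int_Z (S \times \tilde{S})^m (F_{\mathrm{WM}} \otimes 1)\,(1 \otimes \tilde{F}) \, d\nu \Big| + 2\varepsilon.
\end{aligned}$$

For every $\varphi \in C(X)$ and every $\psi \in C(\tilde{X})$ we have

$$\big|\langle F_{\mathrm{WM}} \otimes 1, \varphi \otimes \psi \rangle_\nu\big| \le \big|\langle F_{\mathrm{WM}}, \varphi \rangle_\mu\big| \sup_{z \in \tilde{X}} |\psi(z)|,$$

which implies that $F_{\mathrm{WM}} \otimes 1$ is a weak mixing function in $L^2(Z, \nu)$. This implies that the set

$$\Big\{ n \in \mathbb{N} : \Big| \int_Z (S \times \tilde{S})^n (F_{\mathrm{WM}} \otimes 1)\,(1 \otimes \tilde{F}) \, d\nu \Big| > \varepsilon \Big\}$$

has zero density with respect to every Følner sequence. Hence the set

$$\big\{ n \in \mathbb{N} : \big|\langle R^n f_{\mathrm{WM}}, h \rangle_\Psi\big| > 3\varepsilon \big\}$$

has zero density with respect to every Følner sequence, finishing the proof in the case where $f$ is bounded.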

Finally, if $f \in L^2(\mathbb{N}, \Phi)$ is arbitrary, let $j \mapsto f_j$ be a sequence of bounded functions such that $\|f - f_j\|_\Phi \to 0$ as $j \to \infty$. Apply the decomposition to each $f_j$ to obtain a Følner subsequence $\Psi^{(j)}$ of $\Phi$ and decompositions $f_j = f_{j,\mathrm{Comp}} + f_{j,\mathrm{WM}}$ where $f_{j,\mathrm{Comp}}$ is compact with respect to $\Psi^{(j)}$ and $f_{j,\mathrm{WM}}$ is weak mixing with respect to $\Psi^{(j)}$. Note that we can do this in such a way that $\Psi^{(j+1)}$ is a subsequence of $\Psi^{(j)}$. Define now a new Følner subsequence $\Psi$ of $\Phi$ as $\Psi_N := \Psi^{(N)}_N$ for all $N \in \mathbb{N}$.

Note that $\langle f_{j,\mathrm{Comp}}, f_{\ell,\mathrm{WM}} \rangle_\Psi = 0$ for every $j, \ell$ and hence $\|f_j - f_\ell\|_\Psi^2 = \|f_{j,\mathrm{Comp}} - f_{\ell,\mathrm{Comp}}\|_\Psi^2 + \|f_{j,\mathrm{WM}} - f_{\ell,\mathrm{WM}}\|_\Psi^2$. Since $j \mapsto f_j$ is a Cauchy sequence with respect to $\Phi$ (and hence with respect to $\Psi$), it follows that $j \mapsto f_{j,\mathrm{Comp}}$ is also a Cauchy sequence with respect to $\Psi$. Using Proposition 41, and after refining $\Psi$ if needed, we find a function $f_{\mathrm{Comp}}$ in $L^2(\mathbb{N}, \Psi)$ such that $\|f_{j,\mathrm{Comp}} - f_{\mathrm{Comp}}\|_\Psi \to 0$ as $j \to \infty$. It follows that $f_{\mathrm{Comp}}$ is compact with respect to $\Psi$. Then let $f_{\mathrm{WM}} = f - f_{\mathrm{Comp}}$ and observe that $\|f_{\mathrm{WM}} - f_{j,\mathrm{WM}}\|_\Psi \to 0$ as $j \to \infty$, which implies that $f_{\mathrm{WM}}$ is weak mixing.

2.2 Proofs of decomposition theorems for multicorrelation sequences

The purpose of this section is to prove Theorems 18 and 20. Some of the material in this section appeared in [MR16a].

2.2.1 Preliminaries on nilmanifolds, nilsystems and nilsequences

Let $G$ be a Lie group with identity $1_G$. The lower central series of $G$ is the sequence

$$G = G_1 \trianglerighteq G_2 \trianglerighteq G_3 \trianglerighteq \dots \trianglerighteq \{1_G\}$$

where $G_{i+1} := [G_i, G]$ is, as usual, the subgroup of $G$ generated by all the commutators $aba^{-1}b^{-1}$ with $a \in G_i$ and $b \in G$. If $G_{s+1} = \{1_G\}$ for some finite $s \in \mathbb{N}$ we say that $G$ is ($s$-step) nilpotent. Each $G_i$ is a closed normal subgroup of $G$ (cf. [Lei05c, Section 2.11]). Given a nilpotent Lie group $G$ and a uniform¹ and discrete subgroup $\Gamma$ of $G$, the quotient space $G/\Gamma$ is called a nilmanifold. Naturally, $G$ acts continuously and transitively on $G/\Gamma$ via left-multiplication.
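As a concrete instance of the lower central series, one can compute commutators in the group of upper unitriangular $3 \times 3$ matrices, which has the same shape as the group appearing later in Example 63. The following is a minimal sketch with hypothetical helper names (`mul`, `inv`, `commutator`); a matrix with entries $a$, $b$, $c$ above the diagonal (in positions $(1,2)$, $(2,3)$, $(1,3)$) is stored as the triple `(a, b, c)`.

```python
# Illustrative sketch (hypothetical helper names): the upper unitriangular
# matrix [[1, a, c], [0, 1, b], [0, 0, 1]] is stored as the triple (a, b, c).

IDENTITY = (0, 0, 0)

def mul(g, h):
    """Matrix product of two unitriangular matrices, in triple form."""
    a1, b1, c1 = g
    a2, b2, c2 = h
    return (a1 + a2, b1 + b2, c1 + c2 + a1 * b2)

def inv(g):
    """Inverse matrix, in triple form."""
    a, b, c = g
    return (-a, -b, a * b - c)

def commutator(g, h):
    """The commutator g h g^{-1} h^{-1}."""
    return mul(mul(g, h), mul(inv(g), inv(h)))

g, h = (1, 0, 0), (0, 1, 0)
central = commutator(g, h)  # an element of G_2 = [G, G]
assert central == (0, 0, 1)

# Elements of the form (0, 0, c) commute with everything, so commutators of
# G_2 with G are trivial: G_3 = {1_G} and the group is 2-step nilpotent.
assert commutator(central, (2, 3, 7)) == IDENTITY
assert commutator((0, 0, 5), (-4, 1, 2)) == IDENTITY
```

In this example a direct calculation gives $[g, h] = (0, 0, a_1 b_2 - a_2 b_1)$, so $G_2$ consists of matrices whose only nonzero off-diagonal entry is in the top right corner.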

Any element $g \in G$ with the property that $g^n \in \Gamma$ for some $n \in \mathbb{N}$ is called rational (or rational with respect to $\Gamma$). A closed subgroup $H$ of $G$ is then called rational (or rational with respect to $\Gamma$) if rational elements are dense in $H$. For example, the subgroups $G_j$ in the lower central series of $G$ are rational with respect to any uniform and discrete subgroup $\Gamma$ of $G$. (A proof of this fact can be found in [Rag72, Corollary 1 of Theorem 2.1] for connected $G$ and in [Lei05c, Section 2.11] for the general case.)

Remark 49. It is shown in [Lei06] that a closed subgroup H is rational if and only if H ∩Γ is a uniform discrete subgroup of H if and only if HΓ is closed in G.

If X = G/Γ is a nilmanifold, then a sub-nilmanifold Y of X is any closed set of the form Y = Hx, where x ∈ X and where H is a closed subgroup of G. It is not true that for every closed subgroup H of G and for every element x = gΓ in X = G/Γ the set Hx is a sub-nilmanifold of X; as a matter of fact, from Remark 49 it follows that Hx is closed in

X (and hence a sub-nilmanifold) if and only if the subgroup g−1Hg is rational with respect to Γ.

In the following we will use $R$ (or $R_a$ if we want to emphasize the dependence on $a$) to denote the translation by a fixed element $a \in G$, i.e. $R : x \mapsto ax$. The map $R : X \to X$ is called a nilrotation and the pair $(X, R)$ is called an ($s$-step) nilsystem.

Every nilmanifold $X = G/\Gamma$ possesses a unique $G$-invariant probability measure called the Haar measure on $X$ (cf. [Rag72, Lemma 1.4]). We will use $\mu_X$ to denote this measure. Let us state some classical results regarding the dynamics of niltranslations.

¹ A closed subgroup $\Gamma$ of $G$ is called uniform if $G/\Gamma$ is compact or, equivalently, if there exists a compact set $K$ such that $K\Gamma = G$.

Theorem 50 (see [AGH63; Par69] in the case of connected $G$ and [Lei05c] in the general case). Suppose $(X, R)$ is a nilsystem. Then the following are equivalent:

(i) $(X, R)$ is transitive²;

(ii) $(X, \mu_X, R)$ is ergodic;

(iii) $(X, R)$ is strictly ergodic³.

Moreover, the following are equivalent:

(iv) X is connected and (X, µX ,R) is ergodic.

(v) (X, µX ,R) is totally ergodic.

A theorem by Lesigne [Les91] asserts that for any nilmanifold $X = G/\Gamma$ with connected $G$ and any $b \in G$ the closure of the set $b^{\mathbb{Z}} x := \{b^n x : n \in \mathbb{Z}\}$ is a sub-nilmanifold of $X$. (Actually, he shows that the sequence $(b^n x)_{n \in \mathbb{N}}$ equidistributes with respect to the Haar measure on some sub-nilmanifold of $X$, but in virtue of Theorem 50 these two assertions are equivalent.) Leibman has extended this result as follows.

Theorem 51 ([Lei05b, Corollary 1.9]). Let $G$ be a nilpotent Lie group and let $\Gamma \subset G$ be a uniform and discrete subgroup. Assume $Y$ is a connected sub-nilmanifold of $X = G/\Gamma$ and $b \in G$. Then $\overline{b^{\mathbb{Z}} Y} := \overline{\bigcup_{n \in \mathbb{Z}} b^n Y}$ is a disjoint union of finitely many connected sub-nilmanifolds of $X$.

Let $L$ be a normal, closed and rational subgroup of $G$. Since $L\Gamma$ is closed, the quotient topology on $L\backslash X \cong G/L\Gamma$ is Hausdorff and the map $\eta : X \to L\backslash X$ that sends elements $x \in X$ to their right cosets $Lx$ is continuous and commutes with the action of $G$. Therefore the nilsystem $(L\backslash X, R)$ is a factor of $(X, R)$ with factor map $\eta$.

An important tool in studying equidistribution of orbits on nilmanifolds is a theorem by Leon Green (see [AGH63; Gre61; Par70]). In [Lei05c] Leibman offers a refinement of this classical result of Green, a special case of which we state now. Here and throughout the text we denote by $G^\circ$ the connected component of $G$ containing the group identity $1_G$.

² A topological dynamical system $(X, T)$ is called transitive if there exists at least one point with dense orbit.

³ A topological dynamical system $(X, T)$ is called strictly ergodic if there exists a unique $T$-invariant probability measure on $X$ and additionally the orbit of every point in $X$ is dense.

Theorem 52 (cf. [Lei05c, Theorem 2.17]). Let $X = G/\Gamma$ be a connected nilmanifold, let $u \in G$ and let $N = [\langle G^\circ, u \rangle, \langle G^\circ, u \rangle]$, where $\langle G^\circ, u \rangle$ denotes the group generated by $G^\circ$ and $u$. Then $R_u$ is ergodic on $X$ if and only if $R_u$ is ergodic on $N\backslash X$.

Note that in Theorem 52 it is not explicitly stated but implied that N is a normal, closed and rational subgroup of G and hence the factor space N\X is well defined.

Given a measure preserving dynamical system (X, B, µ, T ) let K denote the smallest sub-σ-algebra of B such that any eigenfunction of (X, B, µ, T ) becomes measurable with respect to K. The resulting factor system (X, K, µ, T ) is called the (measure-theoretic)

Kronecker factor of (X, B, µ, T ).

The following corollary of Theorem 52 describes the Kronecker factor of a connected ergodic nilsystem.

Corollary 53. Let $X = G/\Gamma$ be a connected nilmanifold, let $u \in G$ and assume $R_u$ is ergodic. Define $N := [\langle G^\circ, u \rangle, \langle G^\circ, u \rangle]$. Then the Kronecker factor of $(X, R_u)$ is $(N\backslash X, R_u)$.

For the proof of Corollary 53 it will be convenient to recall the definition of vertical characters: Let $G/\Gamma$ be a connected nilmanifold and let $G = G_1 \trianglerighteq G_2 \trianglerighteq \dots \trianglerighteq G_s \trianglerighteq \{1_G\}$ be the lower central series of $G$. The quotient $T := G_s/(\Gamma \cap G_s)$ is a connected compact abelian group and hence isomorphic to a torus $\mathbb{T}^d$. We call $T$ the vertical torus of $G/\Gamma$. Since $G_s$ is contained in the center of $G$, the vertical torus $T$ acts naturally on $G/\Gamma$. A measurable function $f \in L^2(G/\Gamma)$ is called a vertical character if there exists a continuous group character $\chi$ of $T$ such that $f(tx) = \chi(t) f(x)$ for all $t \in T$ and almost every $x \in X$.

Proof of Corollary 53. Notice that the nilsystem $(X, R_u)$ is isomorphic to the nilsystem $(X', R_u)$, where $X' := \langle G^\circ, u \rangle / (\Gamma \cap \langle G^\circ, u \rangle)$. We can therefore assume without loss of generality that $G = \langle G^\circ, u \rangle$. We proceed by induction on the nilpotency class of $G$. Suppose $G$ is an $s$-step nilpotent Lie group. If $s = 1$, $G$ is abelian and the result is trivial. Next, assume that $s > 1$ and that Corollary 53 has already been proven for all nilpotent Lie groups of step $s - 1$.

Observe that $N\backslash X$ is a compact group and hence $(N\backslash X, R_u)$ is a factor of the Kronecker factor of $(X, R_u)$. It thus suffices to show that for all eigenfunctions $f$ of the system $(X, R_u)$ one has

$$\forall v \in N \quad f \circ R_v = f \ \text{ in } L^2(X, \mu_X). \tag{2.2.1}$$

Let $\theta \in \mathbb{T}$ be an eigenvalue of the Koopman operator associated with $R_u$, let $E_\theta \subset L^2(X, \mu_X)$ be its (non-trivial) eigenspace and let $f \in E_\theta$.

Let T := Gs/(Γ∩Gs) denote the vertical torus of X = G/Γ and note that the action of T on X commutes with the action of Ru. In particular, T leaves the eigenspace Eθ invariant.

It thus follows from the Peter-Weyl theorem that Eθ decomposes into a direct sum of eigenspaces for the Koopman representation of T . In other words, any Ru-eigenfunction f ∈ Eθ can be further decomposed into a sum of vertical characters that are also contained in Eθ. It therefore suffices to establish (2.2.1) in the special case where f ∈ Eθ is a vertical character.

Now assume f ∈ Eθ, χ is a group character of T and f(tx) = χ(t)f(x) for all t ∈ T and almost every x ∈ X. We distinguish two cases; the first case where χ is trivial and the second case where χ is non-trivial.

Let us first assume that $\chi$ is trivial, i.e. $\chi(t) = 1$ for all $t \in T$. This implies that $f$ is $G_s$-invariant. Let $G'$ denote the nilpotent Lie group $G/G_s$ and let $\xi : G \to G'$ denote the natural quotient map. We define $\Gamma' := \xi(\Gamma)$, which is a uniform and discrete subgroup of $G'$, and we define $X' := G'/\Gamma'$. Since $f$ is $G_s$-invariant it can be identified with a function $f'$ on the nilmanifold $X'$ and $f'$ is then an eigenfunction for $R_{u'}$, where $u' = \xi(u)$. Since $G'$ is an $(s-1)$-step nilpotent Lie group, we can invoke the induction hypothesis and conclude that

$$\forall v' \in N' := \xi(N) \quad f' \circ R_{v'} = f' \ \text{ in } L^2(X', \mu_{X'}). \tag{2.2.2}$$

However, $f$ is $G_s$-invariant, and therefore (2.2.2) implies (2.2.1).

Now assume that $\chi$ is non-trivial. Since $T$ is connected, any non-trivial character has full range in the unit circle. In particular, there exists $t \in T$ such that $\chi(t) = e(-\theta)$. Pick any element $g \in G_s$ such that $g(\Gamma \cap G_s) = t$ and define $b := ug$. Then from $R_u f = e(\theta) f$ and from $R_g f = e(-\theta) f$ it follows that $R_b f = f$. Also, note that since the actions of $R_u$ and $R_b$ on the factor $N\backslash X$ are identical (because $G_s \subset N$), it follows from the ergodicity of $R_u$ that $R_b$ acts ergodically on $N\backslash X$. Finally, the groups $N = [G, G] = [\langle G^\circ, u \rangle, \langle G^\circ, u \rangle]$ and $[\langle G^\circ, b \rangle, \langle G^\circ, b \rangle]$ are identical and hence it follows from Theorem 52 that the ergodicity of $R_b$ lifts from $N\backslash X$ to $X$. We conclude that $f$ has to be a constant function, thereby satisfying (2.2.1).

2.2.2 Preliminaries on almost periodic functions

In this section we collect some pertinent facts about almost periodic functions (in particular, functions belonging to $\mathrm{Bohr}_{\mathrm{rat}}(\mathbb{N})$, $\mathrm{Bohr}(\mathbb{N})$, $\mathrm{Bes}_{\mathrm{rat}}(\mathbb{N}, \Phi)$ and $\mathrm{Bes}(\mathbb{N}, \Phi)$); we refer the reader to the book of Besicovitch [Bes55] for a complete treatment of the theory of almost periodic functions.

Almost periodic functions were first introduced by Bohr in [Boh25a]. In his second paper on this subject [Boh25b] Bohr proves that any almost periodic function can be approximated uniformly by trigonometric polynomials whose frequencies are all contained in the spectrum $\sigma(\varphi)$ (see Definition 17). This theorem is known as Bohr’s approximation theorem. An analogue of Bohr’s approximation theorem for Besicovitch almost periodic sequences was later obtained by Besicovitch. He showed that the spectrum of a Besicovitch almost periodic function is at most countable and then proved that any Besicovitch almost periodic function can be approximated in the Besicovitch seminorm by trigonometric polynomials whose frequencies are all contained in the spectrum $\sigma(\varphi)$. Let us now give the precise statement and a proof of Besicovitch’s result.

Theorem 54 (cf. [Bes55, Theorem II.8.2° (page 105)] and [BL85, Lemma 3.11]). Let $f : \mathbb{N} \to \mathbb{C}$ be a Besicovitch almost periodic function with spectrum $\sigma(f)$. Then for every $\varepsilon > 0$ there exists a trigonometric polynomial $P(n) = \sum_{i=1}^{k} c_i e(\theta_i n)$ with $c_1, \dots, c_k \in \mathbb{C}$ and $\theta_1, \dots, \theta_k \in \sigma(f)$ such that $\|f - P\|_{\Phi(\text{Cesàro})} \le \varepsilon$. This includes the case $\sigma(f) = \emptyset$, where one can take $P \equiv 0$ for all $\varepsilon > 0$ (i.e., $f$ has empty spectrum if and only if $\|f\|_{\Phi(\text{Cesàro})} = 0$).

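Proof. Given $n_0, n_1, \dots, n_s \in \mathbb{N}$ and $\beta_1, \dots, \beta_s \in \mathbb{R}$ such that the set $\{2\pi, \beta_1, \beta_2, \dots, \beta_s\}$ is linearly independent over $\mathbb{Q}$, the discrete Bochner-Fejér kernel with parameter

$$B = \begin{pmatrix} n_0 & n_1 & \dots & n_s \\ 2\pi & \beta_1 & \dots & \beta_s \end{pmatrix}$$

is defined as

$$K_B(k) := \sum \Big(1 - \frac{|\nu_1|}{n_1}\Big) \cdots \Big(1 - \frac{|\nu_s|}{n_s}\Big)\, e^{-i\left(\frac{\nu_0}{n_0} 2\pi + \nu_1 \beta_1 + \dots + \nu_s \beta_s\right) k},$$

where the sum ranges over $\nu_0 = 1, \dots, n_0$, $|\nu_1| < n_1, \dots, |\nu_s| < n_s$. The corresponding discrete Bochner-Fejér polynomial is

$$\sigma_B^f(n) := \lim_{N \to \infty} \frac{1}{N} \sum_{k=1}^{N} f(n + k) K_B(k).$$

It is shown in [BL85, Lemma 3.11] (also cf. [Bes55, Theorem II.8.2° (page 105)]) that there exists a sequence of Bochner-Fejér polynomials $\sigma^f_{B_m}$, $m \in \mathbb{N}$, such that $\|f - \sigma^f_{B_m}\|_{\Phi(\text{Cesàro})}$ goes to $0$ as $m \to \infty$. It is not hard to see that

$$\widehat{\sigma^f_{B_m}}(\theta) = \hat{f}(\theta) \cdot \widehat{K_{B_m}}(1 - \theta) = \hat{f}(\theta) \cdot \widehat{K_{B_m}}(\theta)$$

and hence $\sigma(\sigma^f_{B_m}) \subset \sigma(f)$. This finishes the proof.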

We will also make use of the following lemma regarding the spectrum of the product of two Besicovitch almost periodic functions.

Lemma 55. Let $f_1, f_2 : \mathbb{N} \to \mathbb{C}$ be bounded Besicovitch almost periodic functions. Then the product $f_1 \cdot f_2$ is also Besicovitch almost periodic with spectrum $\sigma(f_1 \cdot f_2) \subset \langle \sigma(f_1) \cup \sigma(f_2) \rangle$.

Proof. In view of Theorem 54, we can approximate each $f_i$ with a trigonometric polynomial $\rho_i$ whose spectrum is contained in $\sigma(f_i)$. Observe that the product $\rho_1 \rho_2$ is a trigonometric polynomial with spectrum contained in $\langle \sigma(f_1) \cup \sigma(f_2) \rangle$. Finally, it is not hard to show that $\rho_1 \rho_2$ approximates $f_1 f_2$, which finishes the proof.
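The observation about the spectrum of a product of trigonometric polynomials can be checked mechanically. The sketch below is only an illustration under simplifying assumptions: frequencies are taken to be rational (so that exact arithmetic modulo 1 is available), a polynomial is stored as a dictionary mapping each frequency to its coefficient, and the names `multiply`, `evaluate`, `rho1`, `rho2` are hypothetical.

```python
import cmath
from fractions import Fraction
from itertools import product

def multiply(p1, p2):
    """Product of trigonometric polynomials stored as {frequency mod 1: coefficient}."""
    out = {}
    for (t1, c1), (t2, c2) in product(p1.items(), p2.items()):
        t = (t1 + t2) % 1  # e(t1 n) e(t2 n) = e((t1 + t2) n)
        out[t] = out.get(t, 0) + c1 * c2
    return {t: c for t, c in out.items() if c != 0}

def evaluate(p, n):
    """Value at n of sum_t c_t e(t n), where e(x) = exp(2 pi i x)."""
    return sum(c * cmath.exp(2j * cmath.pi * float(t) * n) for t, c in p.items())

rho1 = {Fraction(0): 1, Fraction(1, 3): 2}
rho2 = {Fraction(1, 4): 1, Fraction(3, 4): -1}
rho = multiply(rho1, rho2)

# Every frequency of the product is a sum of a frequency of rho1 and one of rho2,
# hence lies in the group generated by the two spectra (here: multiples of 1/12).
assert set(rho) == {Fraction(1, 4), Fraction(3, 4), Fraction(7, 12), Fraction(1, 12)}
assert all(t.denominator in (1, 2, 3, 4, 6, 12) for t in rho)

# Sanity check: the dictionary really represents the pointwise product.
assert abs(evaluate(rho, 5) - evaluate(rho1, 5) * evaluate(rho2, 5)) < 1e-9
```

Cancellation can make the spectrum of the product strictly smaller than the set of all pairwise sums, which is why the lemma asserts an inclusion rather than an equality.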

2.2.3 Proving Theorem 18 for the special case of nilsystems

In this section we will prove Theorem 18 for the special case of nilsystems. This will serve as an intermediate step in obtaining Theorem 18 in its full generality.

Theorem 56. Let $k \in \mathbb{N}$, let $(X, R)$ be an ergodic $k$-step nilsystem and let $f_0, f_1, \dots, f_k \in C(X)$. Then

$$\int f_0 \cdot R^n f_1 \cdot R^{2n} f_2 \cdot \ldots \cdot R^{kn} f_k \, d\mu_X = \varphi(n) + \omega(n), \tag{2.2.3}$$

where $\omega(n)$ is a null-sequence and $\varphi(n) = F(S^n y)$ for some $F \in C(Y)$ and $y \in Y$, where $(Y, S)$ is a $k$-step nilsystem whose discrete spectrum $\sigma(Y, S)$ is contained in $\sigma(X, R)$, the discrete spectrum of $(X, R)$.

The key ingredient in the proof of Theorem 56 is the following result.

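Theorem 57. Let $k \in \mathbb{N}$, let $X$ be a connected nilmanifold and let $R : X \to X$ be an ergodic niltranslation. Define $S := R \times R^2 \times \ldots \times R^k$ and

$$Y_{X^\Delta} := \overline{\{S^n(x, x, \dots, x) : x \in X,\ n \in \mathbb{Z}\}} \subset X^k.$$

Then $\sigma(X, R) = \sigma(Y_{X^\Delta}, S)$.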

The proof of Theorem 57 is postponed to Section 2.2.5.

Most of the ideas used in the rest of the proof of Theorem 56 were already present in

[BHK05] and [Lei10a]. For completeness, we repeat the same arguments here, adapting them to our situation as needed.

Let X = G/Γ be a nilmanifold and let π : G → X denote the natural projection of G onto X. We will use 1X to denote the point π(1G). Consider a closed subgroup H of G. As noticed in Remark 49 the set Y := π(H) is a sub-nilmanifold of X if and only if H is rational. Let L denote the normal closure of H in G, that is, let L be the smallest normal subgroup of G containing H. One can show that if H is closed and rational then so is L

(cf. [Lei10a]). In particular, the set Z := π(L) is a sub-nilmanifold of X containing Y . We call Z the normal closure of Y .

Note, every sub-nilmanifold Y = Hx of X can be viewed as a nilmanifold on its own and in particular it has its own Haar measure µY . Moreover, for any a ∈ G, the Haar measure of the sub-nilmanifold aY coincides with the push forward of µY under Ra.

Proposition 58 (cf. [Lei10a, Proposition 3.1]). Assume $V = H/\Gamma_H$ is a connected nilmanifold, $\pi : H \to V$ is the natural projection of $H$ onto $V$ and $W$ is a connected sub-nilmanifold of $V$ containing $1_V = \pi(1_H)$. Let $b \in H$ and assume $b^{\mathbb{Z}} W$ is dense in $V$. If $Z$ denotes the normal closure of $W$, then for all $f \in C(V)$ we have

$$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} \Big| \int_{b^n W} f \, d\mu_{b^n W} - \int_{b^n Z} f \, d\mu_{b^n Z} \Big| = 0. \tag{2.2.4}$$

Proposition 59. Let $(X, S)$ be a nilsystem, let $W \subset X$ be a connected sub-nilmanifold containing the origin $1_X$ and assume that $V := \overline{S^{\mathbb{Z}} W}$ is also a connected sub-nilmanifold of $X$. Then there exists a factor $(Y, S)$ of $(V, S)$, and a point $y \in Y$ such that for any continuous function $f \in C(X)$, there exists $F \in C(Y)$ such that

$$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} \Big| \int_{S^n W} f \, d\mu_{S^n W} - F(S^n y) \Big| = 0. \tag{2.2.5}$$

Proof. Since $V$ is invariant under $S$ and $1_X \in V$, we can find a closed rational subgroup $H$ of $G$ such that $V = \pi(H)$. Therefore, $\Gamma_H := \Gamma \cap H$ is a uniform discrete subgroup of $H$ and the nilsystem $(V, S)$ is isomorphic to $(H/\Gamma_H, S)$. In the following we will identify $V$ with $H/\Gamma_H$ and vice versa. Let $Z$ be the normal closure of the sub-nilmanifold $W$ in $V$ and let $L$ denote the corresponding normal subgroup of $H$ such that $\pi(L) = Z$.

Define $Y := L\backslash V \cong H/(L\Gamma_H)$ and let $\eta : V \to Y$ denote the natural projection. Recall that $(Y, S)$ is a well defined factor of $(V, S)$ with factor map $\eta$.

Define $y := \eta(1_X)$ and observe that $\eta(W) = \{y\}$. Note that for every $z = \eta(\pi(h)) \in Y$, the set $\eta^{-1}(z) = \pi(hL)$ is a sub-nilmanifold of $V$ and therefore it possesses a Haar measure, which we denote by $\mu_{\eta^{-1}(z)}$. Let $f \in C(X)$ and define the function $F$ as

$$F(z) := \int_{\eta^{-1}(z)} f \, d\mu_{\eta^{-1}(z)}.$$

Finally, observe that $F(S^n y) = \int_{S^n Z} f \, d\mu_{S^n Z}$ and so (2.2.5) follows immediately from Eq. (2.2.4) in Proposition 58.

To prove Theorem 56 we will also require a technical lemma:

Lemma 60. Let $(X, R)$ be an ergodic connected nilsystem of step $s$ and let $q \in \mathbb{N}$. Then there exists an ergodic nilsystem $(Y, S)$ of step $s$ with exactly $q$ connected components and such that the restriction of $S^q$ to each connected component of $Y$ yields a system isomorphic to $(X, R)$.

Proof. First, we claim that one can embed the connected nilsystem $(X, R)$ into a nilflow $(X', (R^t)_{t \in \mathbb{R}})$, so that $X$ is a sub-nilmanifold of $X'$ invariant under $R = R^1$. Indeed, say $X = G/\Gamma$. One can assume that the identity component $G^\circ$ of $G$ is simply connected, by passing to the universal cover if needed. Next one can use [Rag72, Theorem 2.20] to find a connected simply connected nilpotent Lie group $G'$ such that $G \subset G'$ and $\Gamma$ is a uniform discrete subgroup of $G'$. In particular $X$ is a sub-nilmanifold of $X' := G'/\Gamma$. Since $G'$ is connected and simply connected, for any element $a \in G'$ the associated one-parameter subgroup $(a^t)_{t \in \mathbb{R}}$ is well defined (cf. [Lei05c, Subsection 2.4]). In particular, the niltranslation $R = R_a : X \to X$ can be extended to a nilflow $(R^t)_{t \in \mathbb{R}}$ on $X'$ by defining $R^t x := R_{a^t} x = a^t x$ for all $x \in X'$, $t \in \mathbb{R}$.

Next, consider the product nilsystem $(Y', S) := (X', R^{1/q}) \times (\mathbb{Z}/(q\mathbb{Z}), +1)$, so that as a nilmanifold $Y' = X' \times \{0, 1, \dots, q - 1\}$ and the niltranslation $S$ is defined as $S(x, r) = (R^{1/q} x, r + 1 \bmod q)$. Finally, let $Y \subset Y'$ be the orbit of $X \times \{0\}$ under $S$. Since $S^q = R^1 \times \mathrm{Id} = R \times \mathrm{Id}$ preserves $X \times \{0\}$, we deduce that $Y$ has precisely $q$ connected components. In fact, $S^q$ preserves each component $R^{r/q} X \times \{r\}$, and moreover $(R^{r/q} X \times \{r\}, S^q)$ is isomorphic to $(X, R)$ via the map $x \mapsto (R^{r/q} x, r)$.
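The construction can be made concrete in the simplest abelian case, which is offered here only as an illustration and is not taken from the text: $X = \mathbb{R}/\mathbb{Z}$ with $R x = x + \alpha$ for an irrational $\alpha$, so that $R^t x = x + t\alpha$ and, in this toy case, $X' = X$ and $Y = Y'$. In the sketch below (all names hypothetical) a circle point of the form $u + v\alpha$ with rational $u, v$ is stored exactly as the pair `(u mod 1, v)`, which is legitimate because $\alpha$ is irrational.

```python
from fractions import Fraction

# Points of the circle R/Z of the form u + v*alpha (alpha a fixed irrational
# number) are stored exactly as pairs (u mod 1, v) with u, v rational.

def translate(p, u, v):
    """Add u + v*alpha to the circle point p."""
    return ((p[0] + u) % 1, p[1] + v)

def R(p):
    """The rotation x -> x + alpha."""
    return translate(p, 0, 1)

def make_S(q):
    """S(x, r) = (R^{1/q} x, r + 1 mod q), where R^t x = x + t*alpha."""
    def S(point):
        x, r = point
        return (translate(x, 0, Fraction(1, q)), (r + 1) % q)
    return S

def power(f, n, point):
    """Apply f to point n times."""
    for _ in range(n):
        point = f(point)
    return point

q = 3
S = make_S(q)
x = (Fraction(1, 5), Fraction(0))

# S^q acts as R x Id, so it preserves each of the q components.
assert power(S, q, (x, 0)) == (R(x), 0)

# The map x -> (R^{r/q} x, r) intertwines R with S^q on the r-th component.
for r in range(q):
    embed = lambda p, r=r: (translate(p, 0, Fraction(r, q)), r)
    assert power(S, q, embed(x)) == embed(R(x))
```

The two assertions correspond to the last two sentences of the proof; ergodicity of $(Y, S)$ is not tested by this sketch.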

We are now ready to prove Theorem 56.

Proof of Theorem 56. Using invariance of the measure $\mu_X$ under $R$ yields

$$\int f_0 \cdot R^n f_1 \cdot R^{2n} f_2 \cdot \ldots \cdot R^{kn} f_k \, d\mu_X = \int R^n f_0 \cdot R^{2n} f_1 \cdot \ldots \cdot R^{(k+1)n} f_k \, d\mu_X.$$

Hence, by changing $k$ to $k + 1$ and renaming the functions $f_0, f_1, \dots, f_k$ to $f_1, f_2, \dots, f_k$, we see that in order to prove Theorem 56 it is equivalent to prove that for any $k \in \mathbb{N}$, any ergodic $(k-1)$-step nilsystem $(X, R)$ and any $f_1, f_2, \dots, f_k \in C(X)$ we have

$$\int R^n f_1 \cdot R^{2n} f_2 \cdot \ldots \cdot R^{kn} f_k \, d\mu_X = \varphi(n) + \omega(n), \tag{2.2.6}$$

where $\omega(n)$ is a null-sequence and $\varphi(n) = F(S^n y)$ is a nilsequence coming from a $(k-1)$-step nilsystem $(Y, S)$ with $\sigma(Y, S) \subset \sigma(X, R)$.

Let $\alpha(n)$ denote the sequence

$$\alpha(n) := \int R^n f_1 \cdot R^{2n} f_2 \cdot \ldots \cdot R^{kn} f_k \, d\mu_X.$$

We first deal with the case when $X$ is connected. Let $X^\Delta$ be the diagonal of $X^k$ and let $S = R \times R^2 \times \cdots \times R^k$. Note that since $X$ is connected, the diagonal $X^\Delta$ is also connected. We can write $\alpha(n)$ as

$$\alpha(n) = \int_{S^n X^\Delta} f_1 \otimes f_2 \otimes \cdots \otimes f_k \, d\mu_{S^n X^\Delta}.$$

It is shown in [Lei10b, Corollary 6.5] and also in [Fra08, Corollary 2.10] that if $X$ is connected then $Y_{X^\Delta} := \overline{\bigcup_{n \in \mathbb{Z}} S^n X^\Delta}$ is connected. It thus follows from Theorem 51 that $Y_{X^\Delta}$ is a sub-nilmanifold of $X^k$. Also, from Theorem 57 we have that $Y_{X^\Delta}$ satisfies $\sigma(Y_{X^\Delta}, S) = \sigma(X, R)$. This observation allows us to apply Proposition 59 with $W = X^\Delta$ and $V = Y_{X^\Delta}$. Therefore we can find a factor $(Y, S)$ of $(Y_{X^\Delta}, S)$, a point $y \in Y$ and a continuous function $F \in C(Y)$ such that (2.2.5) is satisfied. Observe that since $(Y, S)$ is a factor of $(Y_{X^\Delta}, S)$, the discrete spectra satisfy $\sigma(Y, S) \subset \sigma(Y_{X^\Delta}, S)$ and hence $\sigma(Y, S) \subset \sigma(X, R)$. Besides that, (2.2.5) can be written as

$$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} |\alpha(n) - F(S^n y)| = 0.$$

Therefore, setting $\varphi(n) = F(S^n y)$ and $\omega(n) = \alpha(n) - \varphi(n)$, we obtain a decomposition of $\alpha(n)$ satisfying (2.2.6).

Next we deal with the case when $X$ is not connected. Since $X$ is compact, it has a finite number of connected components $X_0, \dots, X_{q-1}$. It is not hard to see that $R$ permutes the components $X_0, \dots, X_{q-1}$ cyclically, that is, $R X_\ell = X_{\ell + 1 \bmod q}$. In particular, $R^q$ preserves each $X_\ell$ and for each $n \in \mathbb{Z}$ the map $R^n$ is an isomorphism between the systems $(X_\ell, R^q)$ and $(X_{\ell + n \bmod q}, R^q)$. Also, observe that for each $\ell \in \{0, \dots, q - 1\}$ the system $(X_\ell, R^q)$ is totally ergodic. This shows that the function $\sum_{\ell=0}^{q-1} e\big(\tfrac{\ell}{q}\big) 1_{X_\ell}$ is an eigenfunction for $R$ with eigenvalue $1/q$ and so

$$\sigma(X, R) \cap \mathbb{Q} = \Big\{0, \frac{1}{q}, \frac{2}{q}, \dots, \frac{q-1}{q}\Big\}.$$

On the other hand, an irrational point $\theta \in \mathbb{T}$ is an eigenvalue for $R$ if and only if $q\theta$ is an eigenvalue for $R^q$; therefore we conclude that

$$\sigma(X, R) = \tfrac{1}{q} \sigma(X_0, R^q) \oplus \{1/q, \dots, (q-1)/q\}. \tag{2.2.7}$$
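To see the eigenfunction computation in a concrete case, consider the toy system $X = (\mathbb{R}/\mathbb{Z}) \times (\mathbb{Z}/q\mathbb{Z})$ with $R(x, \ell) = (x + \alpha, \ell + 1 \bmod q)$, whose components are $X_\ell = (\mathbb{R}/\mathbb{Z}) \times \{\ell\}$. This example and all names in the sketch are assumptions made for illustration; the irrational $\alpha$ is replaced by the floating-point value of $\sqrt{2}$, so the checks hold only up to rounding.

```python
import cmath

def e(t):
    """e(t) = exp(2 pi i t)."""
    return cmath.exp(2j * cmath.pi * t)

q = 4
alpha = 2 ** 0.5  # stands in for an irrational rotation number

def R(point):
    """R(x, l) = (x + alpha mod 1, l + 1 mod q) on (R/Z) x (Z/qZ)."""
    x, l = point
    return ((x + alpha) % 1.0, (l + 1) % q)

def f(point):
    """The function sum_l e(l/q) 1_{X_l}, where X_l = (R/Z) x {l}."""
    _, l = point
    return e(l / q)

def character(n, m):
    """The character (x, l) -> e(n x + m l / q)."""
    return lambda point: e(n * point[0] + m * point[1] / q)

samples = [(0.1 * i, l) for i in range(10) for l in range(q)]

# f is an eigenfunction with eigenvalue 1/q: f(R p) = e(1/q) f(p).
assert all(abs(f(R(p)) - e(1 / q) * f(p)) < 1e-9 for p in samples)

# More generally, e(n x + m l / q) has eigenvalue n*alpha + m/q.
g = character(2, 3)
assert all(abs(g(R(p)) - e(2 * alpha + 3 / q) * g(p)) < 1e-9 for p in samples)
```

In this toy model $R^q$ restricted to $X_0$ is the rotation by $q\alpha$, and the eigenvalues $n\alpha + m/q$ found above are exactly the points $\theta$ with $q\theta \in \{nq\alpha : n \in \mathbb{Z}\}$, which matches the shape of (2.2.7).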

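For each $r \in \{0, \dots, q - 1\}$ and $i \in \{1, \dots, k\}$ let $f_{i,r} := R^{ir} f_i$. For $\ell \in \{0, \dots, q - 1\}$ denote by

$$\alpha_{\ell,r}(n) := \int_{X_\ell} R^{qn} f_{1,r} \cdot R^{2qn} f_{2,r} \cdot \ldots \cdot R^{kqn} f_{k,r} \, d\mu_{X_\ell}$$

and observe that

$$\alpha(qn + r) = \int_X \prod_{i=1}^{k} R^{i(qn + r)} f_i \, d\mu_X = \sum_{\ell=0}^{q-1} \int_{X_\ell} \prod_{i=1}^{k} R^{iqn} f_{i,r} \, d\mu_{X_\ell} = \sum_{\ell=0}^{q-1} \alpha_{\ell,r}(n).$$

Since $R^\ell$ is an isomorphism between $(X_0, R^q)$ and $(X_\ell, R^q)$ we have

$$\alpha_{\ell,r}(n) = \int_{X_0} \prod_{i=1}^{k} R^{qin} R^\ell f_{i,r} \, d\mu_{X_0} = \int_{X_0} \prod_{i=1}^{k} R^{qin} f_{i,r,\ell} \, d\mu_{X_0},$$

where $f_{i,r,\ell} = R^\ell f_{i,r} = R^{\ell + ir} f_i$.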

By applying the above argument for connected nilsystems to $(X_0, R^q)$ we find a nilsystem $(Y_0, S_0)$ with spectrum $\sigma(Y_0, S_0) \subset \sigma(X_0, R^q)$, a point $y_0 \in Y_0$ and, for each $r, \ell \in \{0, \dots, q - 1\}$, a function $F_{\ell,r} \in C(Y_0)$ such that

$$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} |\alpha_{\ell,r}(n) - F_{\ell,r}(S_0^n y_0)| = 0.$$

Letting $F_r := \sum_{\ell=0}^{q-1} F_{\ell,r}$ we deduce that

$$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} |\alpha(qn + r) - F_r(S_0^n y_0)| = 0. \tag{2.2.8}$$

Invoking Lemma 60 we find an ergodic nilsystem $(Y, S)$ with $q$ connected components and such that $Y_0$ can be identified with one of the connected components of $Y$ in such a way that the restriction of $S^q$ to $Y_0$ is precisely $S_0$. The same argument which led to (2.2.7) implies that

$$\sigma(Y, S) = \tfrac{1}{q} \sigma(Y_0, S_0) \oplus \Big\{0, \frac{1}{q}, \dots, \frac{q-1}{q}\Big\}.$$

Together with (2.2.7) and the fact that $\sigma(Y_0, S_0) \subset \sigma(X_0, R^q)$, the previous equation implies that $\sigma(Y, S) \subset \sigma(X, R)$.

Each point in $Y$ can be represented uniquely as $S^r y$ for some $r \in \{0, \dots, q - 1\}$ and $y \in Y_0 \subset Y$. Let $F \in C(Y)$ be defined as $F(S^r y) = F_r(y)$, define $\varphi(n) := F(S^n y_0)$ and let $\omega(n) := \alpha(n) - \varphi(n)$. This way, we obtain a decomposition as in (2.2.3), where $\varphi(n)$ is a nilsequence coming from a $(k-1)$-step nilsystem $(Y, S)$ with $\sigma(Y, S) \subset \sigma(X, R)$. It only remains to show that $\omega(n)$ is a null-sequence:

$$\begin{aligned}
\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} |\omega(n)|
&= \frac{1}{q} \sum_{r=0}^{q-1} \lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} \big|\alpha(qn + r) - F(S^{qn + r}(y_0))\big| \\
&= \frac{1}{q} \sum_{r=0}^{q-1} \lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} |\alpha(qn + r) - F_r(S_0^n y_0)|,
\end{aligned}$$

which together with (2.2.8) shows that $\omega(n)$ is a null-sequence.

2.2.4 Host-Kra-Ziegler factors

In Section 2.2.3 we have established Theorem 18 under the additional assumptions that $(X, \mathcal{B}, \mu, T)$ is a nilsystem and each $f_i$ is continuous. In order to close the gap between nilsystems and general measure preserving systems, we rely on the theory of the Host-Kra-Ziegler factors, which was developed by Host and Kra in [HK05b] and independently by Ziegler in [Zie07]. Let us briefly summarize their theory; for details we refer the reader to [HK05a; HK05b; Zie07].

Suppose $(X, \mathcal{B}, \mu, T)$ is a measure preserving system. For $s \in \mathbb{N}$ the $s$-th Host-Kra-Ziegler factor of $(X, \mathcal{B}, \mu, T)$, denoted by $\mathcal{Z}_s$, is a $T$-invariant sub-σ-algebra of $\mathcal{B}$ which serves as a characteristic factor for multiple ergodic averages. For every $s$, the system $(X, \mathcal{Z}_s, \mu, T)$ is an inverse limit of $s$-step nilsystems, meaning that there exists a nested sequence of $T$-invariant sub-σ-algebras $\mathcal{Z}_s^{(1)} \subset \mathcal{Z}_s^{(2)} \subset \ldots \subset \mathcal{Z}_s$ such that $\bigvee_{m \ge 1} \mathcal{Z}_s^{(m)} = \mathcal{Z}_s$ and such that for every $m$ the system $(X, \mathcal{Z}_s^{(m)}, \mu, T)$ is measurably isomorphic to an $s$-step nilsystem.

As mentioned above, the factors $\mathcal{Z}_s$ are characteristic factors for multiple ergodic averages.

Theorem 61 ([BHK05, Corollary 4.5]). For any $k \in \mathbb{N}$, any ergodic measure preserving system $(X, \mathcal{B}, \mu, T)$ and any $f_0, \dots, f_k \in L^\infty(X)$,

$$\lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N} \bigg| \int_X \bigg( \prod_{j=0}^{k} T^{jn} f_j - \prod_{j=0}^{k} T^{jn} E(f_j \mid \mathcal{Z}_k) \bigg) d\mu \bigg| = 0.$$

Proof of Theorem 18. We are given an ergodic measure preserving system $(X, \mathcal{B}, \mu, T)$ as well as bounded measurable functions $f_0, f_1, \dots, f_k \in L^\infty(X)$ and some $\varepsilon > 0$; we seek to decompose the multicorrelation sequence

$$\alpha(n) := \int f_0 \cdot T^n f_1 \cdot \ldots \cdot T^{kn} f_k \, d\mu$$

into $\alpha(n) = \varphi(n) + \omega(n) + \gamma(n)$, satisfying the stated properties.

Using Theorem 61, and after changing $\alpha$ by a null-sequence if necessary, we may assume that $E(f_j \mid \mathcal{Z}_k) = f_j$ for all $j \in \{0, 1, \dots, k\}$. Let $\mathcal{Z}_k^{(m)}$, $m \in \mathbb{N}$, be a nested sequence of $T$-invariant sub-σ-algebras of $\mathcal{Z}_k$ such that $\bigvee_{m \ge 1} \mathcal{Z}_k^{(m)} = \mathcal{Z}_k$ and such that for every $m$ the system $(X, \mathcal{Z}_k^{(m)}, \mu, T)$ is measurably isomorphic to a compact $k$-step nilsystem. Since $(X, \mathcal{Z}_k^{(m)}, \mu, T)$ is a factor of $(X, \mathcal{B}, \mu, T)$, we have the inclusion of the spectra $\sigma(X, \mathcal{Z}_k^{(m)}, \mu, T) \subset \sigma(X, \mathcal{B}, \mu, T)$.

For each $m \in \mathbb{N}$, let $\alpha_m : \mathbb{Z} \to \mathbb{R}$ be defined as

$$\alpha_m(n) := \int E(f_0 \mid \mathcal{Z}_k^{(m)}) \cdot T^n E(f_1 \mid \mathcal{Z}_k^{(m)}) \cdot \ldots \cdot T^{kn} E(f_k \mid \mathcal{Z}_k^{(m)}) \, d\mu.$$

It follows from Doob’s martingale convergence theorem (cf., for instance, [Dur10, Theorem 5.4.5]) that $\alpha_m(n)$ converges to $\alpha(n)$ as $m \to \infty$, uniformly in $n$. In other words, choosing a large enough $m \in \mathbb{N}$ we have that $\|\alpha - \alpha_m\|_\infty < \varepsilon/2$.

Next, one can approximate the functions $E(f_i \mid \mathcal{Z}_k^{(m)})$ (when identified with measurable functions on the respective compact $k$-step nilsystems) by continuous functions in $\|\cdot\|_{L^p}$-norm, for every $p < \infty$. More precisely, there exists a $k$-step nilsystem $(\tilde{X}, \tilde{\mu}, \tilde{T})$, which is measurably isomorphic to $(X, \mathcal{Z}_k^{(m)}, \mu, T)$, and continuous functions $\tilde{f}_0, \dots, \tilde{f}_k \in C(\tilde{X})$ such that the sequence

$$\beta(n) := \int_{\tilde{X}} \tilde{f}_0 \cdot \tilde{T}^n \tilde{f}_1 \cdot \ldots \cdot \tilde{T}^{kn} \tilde{f}_k \, d\tilde{\mu}$$

satisfies $\|\alpha_m - \beta\|_\infty < \varepsilon/2$ and hence $\gamma(n) := \alpha(n) - \beta(n)$ satisfies $\|\gamma\|_\infty < \varepsilon$.

Finally, it follows from Theorem 56 that $\beta$ can be written as

$$\beta(n) = \varphi(n) + \omega(n),$$

where $\omega$ is a null-sequence and $\varphi(n) = F(R^n y)$ for some $F \in C(Y)$ and $y \in Y$, where $(Y, R)$ is a $k$-step nilsystem whose discrete spectrum satisfies $\sigma(Y, R) \subset \sigma(\tilde{X}, \tilde{\mu}, \tilde{T}) = \sigma(X, \mathcal{Z}_k^{(m)}, \mu, T) \subset \sigma(X, \mathcal{B}, \mu, T)$, finishing the proof.

2.2.5 Spectrum of the orbit of the diagonal

The purpose of this section is to give a proof of Theorem 57. For convenience, let us restate the theorem here.

Theorem 57. Let $k \in \mathbb{N}$, let $X$ be a connected nilmanifold and let $R : X \to X$ be an ergodic niltranslation. Define $S := R \times R^2 \times \ldots \times R^k$ and

$$Y_{X^\Delta} := \overline{\{S^n(x, x, \dots, x) : x \in X,\ n \in \mathbb{Z}\}} \subset X^k. \tag{2.2.9}$$

Then $\sigma(X, R) = \sigma(Y_{X^\Delta}, S)$.

We will derive Theorem 57 from the following more general result.

Theorem 62. Let $k \in \mathbb{N}$, let $X$ be a connected nilmanifold and let $R : X \to X$ be an ergodic niltranslation. Define $S := R \times R^2 \times \ldots \times R^k$ and

$$Y_x := \overline{\{S^n(x, x, \dots, x) : n \in \mathbb{Z}\}} \subset X^k. \tag{2.2.10}$$

Then for almost every $x \in X$ the Kronecker factor of $(Y_x, S)$ is isomorphic to the Kronecker factor $(K, R)$ of $(X, R)$. In particular, $\sigma(X, R) = \sigma(Y_x, S)$ for almost every $x \in X$.

We remark that in the statement of Theorem 62 the restriction of x to a full measure subset of X is necessary, because there may be points x ∈ X for which the spectrum of the system (Yx,S) is strictly larger than the spectrum of (X,R), as the following example illustrates.

Example 63. Consider the matrix group

\[ G := \left\{ \begin{pmatrix} 1 & n & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix} : n \in \mathbb{Z},\ b, c \in \mathbb{R} \right\}. \]

This group is a 2-step nilpotent Lie group and it acts continuously and transitively on T^2 via

\[ \begin{pmatrix} 1 & n & c \\ 0 & 1 & b \\ 0 & 0 & 1 \end{pmatrix} (y, z) = (y + b,\ z + c + ny), \qquad \forall (y, z) \in \mathbb{T}^2. \]

Also, (T^2,G) is a nilsystem since it is isomorphic to (G/Γ,G), where Γ is the uniform and discrete subgroup of G given by

\[ \Gamma := \left\{ \begin{pmatrix} 1 & n & m \\ 0 & 1 & k \\ 0 & 0 & 1 \end{pmatrix} : n, m, k \in \mathbb{Z} \right\}. \]

Let α be an arbitrary irrational number and let a ∈ G denote the element

\[ a = \begin{pmatrix} 1 & 1 & 0 \\ 0 & 1 & \alpha \\ 0 & 0 & 1 \end{pmatrix}. \]

Then the niltranslation R_a: T^2 → T^2, which takes the form (y, z) ↦ (y + α, z + y), is totally ergodic. However, if x denotes the point x := (1/4, 0), then the closure Y of the set

\[ \{(R_a^n x, R_a^{2n} x) : n \in \mathbb{Z}\} \]

in T^2 × T^2 is not connected. In fact, straightforward calculations reveal that it consists of two connected components. This implies that 1/2 ∈ σ(Y, R_a × R_{a^2}) but 1/2 ∉ σ(T^2, R_a).
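The "straightforward calculations" in Example 63 can be checked along the forward orbit with exact arithmetic. The sketch below is an illustration only: the helper names and the choice of the test quantity z2 − 4·z1 − y1 (coordinates of a point (y1, z1, y2, z2) of T^2 × T^2) are assumptions made here, not taken from the text. Each coordinate r + c·α is stored as the pair (r, c), so no numerical value of α is needed.

```python
from fractions import Fraction

# A coordinate r + c*alpha (alpha irrational) is stored exactly as the pair (r, c).
START = ((Fraction(1, 4), 0), (Fraction(0), 0))  # the point x = (1/4, 0)

def translate(point):
    """Apply R_a: (y, z) -> (y + alpha, z + y)."""
    (yr, yc), (zr, zc) = point
    return ((yr, yc + 1), (zr + yr, zc + yc))

def orbit_point(n):
    """Return R_a^n x for an integer n >= 0."""
    point = START
    for _ in range(n):
        point = translate(point)
    return point

def invariant(n):
    """Evaluate z2 - 4*z1 - y1 at (R_a^n x, R_a^{2n} x).

    Returns (rational part reduced mod 1, coefficient of alpha).
    """
    (y1, z1), (_, z2) = orbit_point(n), orbit_point(2 * n)
    rational = (z2[0] - 4 * z1[0] - y1[0]) % 1
    alpha_coefficient = z2[1] - 4 * z1[1] - y1[1]
    return rational, alpha_coefficient
```

Along the orbit the α-part cancels and the value alternates between 3/4 and 1/4 mod 1. Hence the continuous function e(z2 − 4·z1 − y1) takes only the two values e(3/4) and e(1/4) on Y and changes sign under R_a × R_{a^2}, which is consistent with Y being disconnected and with 1/2 lying in the spectrum.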

The proof of Theorem 62 is presented in Sections 2.2.6 through 2.2.7. For now we will present the deduction of Theorem 57 from Theorem 62.

Proof of Theorem 57 assuming Theorem 62. Let k ∈ N, let X be a connected nilmanifold and let R: X → X be an ergodic niltranslation. Define S := R × R^2 × ... × R^k and let Y_{X_∆} be defined by (2.2.9). Given θ ∈ σ(X,R), let f ∈ L^2(X) be an eigenfunction of the system (X,R) with eigenvalue θ. Since the function f̃ ∈ L^2(Y_{X_∆}) defined by f̃(x_1, ..., x_k) = f(x_1) is an eigenfunction for the system (Y_{X_∆},S) with eigenvalue θ, it follows that σ(X,R) ⊂ σ(Y_{X_∆},S).

Next we prove the converse inclusion. Let ν be the Haar measure of the nilmanifold Y_{X_∆} and let ν_x be the Haar measure of the nilmanifold Y_x defined by (2.2.10). Observe that the sets Y_x are precisely the atoms of the invariant σ-algebra of the system (Y_{X_∆},S). Therefore, the measures ν_x form the ergodic decomposition of ν.

Let θ ∈ σ(Y_{X_∆},S) and let f ∈ L^2(Y_{X_∆}, ν) be an eigenfunction with eigenvalue θ. In other words, for almost every y ∈ Y_{X_∆} we have Sf(y) = e(θ)f(y). Since f cannot be 0 ν-a.e., there exists a positive measure set of x ∈ X for which the restriction of f to the system (Y_x, ν_x, S) is not the zero function. But for any such x, the restriction of f to the system (Y_x, ν_x, S) is an eigenfunction with eigenvalue θ. Finally, we invoke Theorem 62 to conclude that θ ∈ σ(X,R), finishing the proof.

2.2.6 A useful reduction

Let G be an s-step nilpotent Lie group, let Γ be a uniform and discrete subgroup of G and assume that X = G/Γ is connected. This is equivalent to the assertion that G = G◦Γ, where G◦ is the connected component of the identity in G.

Let a ∈ G be arbitrary and consider the group G' := ⟨G◦, a⟩ generated by G◦ and a. We have the following useful lemma regarding G'.

Lemma 64. The group G' is a closed rational subgroup of G and hence an s-step nilpotent Lie group. If

\[ G' = G'_1 \trianglerighteq G'_2 \trianglerighteq G'_3 \trianglerighteq \ldots \trianglerighteq G'_s \trianglerighteq \{1_{G'}\} \]

denotes the lower central series of G', then G'_i is connected for 2 ≤ i ≤ s.

Proof. First, let us show that G' is closed. Since G◦ is a clopen subset of G we deduce that G' is a disjoint union of clopen sets and therefore clopen. In particular, G' is closed.

Next, we show that G'_i is connected for 2 ≤ i ≤ s. Since G◦ is a normal subgroup of G it holds that ⟨G◦, a⟩ = G◦a^Z. Therefore

\[ G'_2 = [G^\circ a^{\mathbb{Z}}, G^\circ a^{\mathbb{Z}}] = [G^\circ, G^\circ][G^\circ, a^{\mathbb{Z}}][a^{\mathbb{Z}}, a^{\mathbb{Z}}] = [G^\circ, G^\circ][G^\circ, a^{\mathbb{Z}}]. \]

Note that groups generated by connected sets are connected. Hence the group [G◦,G◦][G◦,a^Z] is connected, as it is generated by the connected set

\[ \bigcup_{g \in G^\circ} [G^\circ, g] \ \cup \ \bigcup_{n \in \mathbb{Z}} [G^\circ, a^n]. \]

Hence, G'_2 is connected. Analogous arguments can be used to show by induction on i that G'_i is connected for all i = 3, ..., s.

Finally, to see why G' is a rational subgroup of G, let π: G → G/Γ denote the natural projection from G onto G/Γ. In view of Remark 49, G' is rational if and only if π(G') is closed. Since G/Γ is connected it follows that π(G◦) = G/Γ and so it follows from the fact that G◦ is contained in G' that π(G') = G/Γ is closed. This finishes the proof.

Remark 65. Let (X, G, R_a) be an ergodic nilsystem with connected phase space X = G/Γ and let G' := ⟨G◦, a⟩ be the group generated by G◦ and a. By Lemma 64 the group G' is rational, which means that Γ' := Γ ∩ G' is a uniform and discrete subgroup of G'. Let X' denote the nilmanifold G'/Γ' and consider the nilsystem (X', G', R_a).

We claim that the two nilsystems (X, G, R_a) and (X', G', R_a) are isomorphic. To verify this claim consider the map η: X' → X defined by the formula η(gΓ') = gΓ for all g ∈ G'. This map is well defined, continuous and injective. Moreover, using the fact that G = G◦Γ and that G◦ ⊂ G', we conclude that η is also surjective. Since any continuous bijection between compact Hausdorff spaces is a homeomorphism we get that X' and X are indeed homeomorphic spaces. Finally, since R_a ∘ η = η ∘ R_a, we conclude that (X, G, R_a) and (X', G', R_a) are isomorphic dynamical systems.

2.2.7 The subgroup H

Let G be an s-step nilpotent Lie group, let Γ be a uniform and discrete subgroup such that X = G/Γ is connected and let R = R_a be an ergodic niltranslation. It will be convenient for us to assume that G is generated by G◦ and a. Note that in view of Remark 65 this assumption can be made without loss of generality. Let G = G_1 ⊵ G_2 ⊵ ... ⊵ G_s ⊵ {1_G} denote the lower central series of G and fix some positive integer k ∈ N. We define the subsets H^(1), ..., H^(k−1) of G^k as

\[ H^{(i)} := \bigl\{ \bigl(g^{\binom{1}{i}}, g^{\binom{2}{i}}, \ldots, g^{\binom{k}{i}}\bigr) : g \in G_i \bigr\}, \tag{2.2.11} \]

where \binom{j}{i} = 0 for j < i, and we define H as

\[ H := H^{(1)} H^{(2)} \cdots H^{(k-1)} G_k^k. \tag{2.2.12} \]

The set H is in fact a subgroup of G^k and can be used to explicitly describe the orbit closure of the diagonal in X^k under the niltranslation S := R_a × R_a^2 × ... × R_a^k. The next proposition lists some of the well known properties of H; for a more comprehensive discussion on H see [Zie05; BHK05; Lei10b].
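To make definitions (2.2.11) and (2.2.12) concrete, the following worked case spells out H for k = 3. It assumes that the exponents in (2.2.11) are the binomial coefficients \binom{j}{i}, with \binom{j}{i} = 0 for j < i; it is an illustration only and not part of the original argument.

```latex
\begin{align*}
H^{(1)} &= \{(g, g^{2}, g^{3}) : g \in G_1\},
  && \text{since } \tbinom{1}{1} = 1,\ \tbinom{2}{1} = 2,\ \tbinom{3}{1} = 3, \\
H^{(2)} &= \{(1_G, g, g^{3}) : g \in G_2\},
  && \text{since } \tbinom{1}{2} = 0,\ \tbinom{2}{2} = 1,\ \tbinom{3}{2} = 3, \\
H &= H^{(1)} H^{(2)} G_3^{3}.
\end{align*}
```

In particular, if G is 2-step nilpotent then G_3 is trivial and H = H^(1)H^(2) in this case.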

Proposition 66. Let G be an s-step nilpotent Lie group, let Γ be a uniform and discrete subgroup of G such that X = G/Γ is connected, let a ∈ G and assume that G = ⟨G◦, a⟩. Let k ∈ N and define H as in (2.2.12). Then

(i) H is a closed and rational subgroup of G^k.

(ii) The commutator subgroup [H,H] of H satisfies

\[ [H, H] = H \cap [G^k, G^k]. \]

(iii) Define ∆ := H ∩ Γ^k. Then ∆ is a uniform and discrete subgroup of H and Y := H/∆ is a connected nilmanifold.

(iv) For b ∈ G define

\[ S_b := R_b \times R_b^2 \times \ldots \times R_b^k \]

and for x = gΓ ∈ X define

\[ Y_x := \overline{\{S_a^n(x, x, \ldots, x) : n \in \mathbb{Z}\}} \subset X^k. \]

For almost every x = gΓ ∈ X the nilsystems (Y_x, S_a) and (Y, S_{g^{-1}ag}) are isomorphic.

Proof. For the proofs of items (i)–(iii) we refer the reader to [Lei98], [BHK05, Theorem 5.1] and [Lei10b, Proposition 5.7].

For the proof of (iv) we repeat a short argument that appeared in [Fra08, Subsection 2.5]. We need the following theorem.

Theorem 67 (see [Zie05, Theorem 2.2] or [BHK05, Theorem 5.4]). Let G be an s-step nilpotent Lie group, let Γ be a uniform and discrete subgroup of G such that X = G/Γ is connected, let a ∈ G be such that R_a is ergodic and assume that G = ⟨G◦, a⟩. Let k ∈ N, define H as in (2.2.12) and put ∆ := H ∩ Γ^k and Y := H/∆. If f_1, ..., f_k ∈ L^∞(X), then for a.e. x = gΓ ∈ X we have

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f_1(g^{-1} a^{n} g\Gamma) \cdot \ldots \cdot f_k(g^{-1} a^{kn} g\Gamma) = \int_Y f_1 \otimes \ldots \otimes f_k \, d\mu_Y. \tag{2.2.13} \]

In the following let us identify Y with its embedding into X^k in the obvious manner. Consider the injective map R_{g^{-1}} × ... × R_{g^{-1}}: X^k → X^k. It follows from the definition of the group H that for any n ∈ Z the image of the point S_a^n(x, ..., x) under R_{g^{-1}} × ... × R_{g^{-1}} lies in Y. Since points of the form S_a^n(x, x, ..., x) are dense in Y_x we conclude that

\[ (R_{g^{-1}} \times \ldots \times R_{g^{-1}})(Y_x) \subset Y. \]

Hence, it only remains to show that for almost all x ∈ X we have (R_{g^{-1}} × ... × R_{g^{-1}})(Y_x) = Y. This, however, follows right away from the fact that {S_{g^{-1}ag}^n(x, x, ..., x) : n ∈ Z} is contained in (R_{g^{-1}} × ... × R_{g^{-1}})(Y_x) and that (2.2.13) implies {S_{g^{-1}ag}^n(x, x, ..., x) : n ∈ Z} is dense in Y for almost all x ∈ X.

Proof of Theorem 62. Let k ∈ N, let X be a connected nilmanifold and let R = R_a be an ergodic niltranslation. In view of Remark 65 we can assume without loss of generality that G is generated by G◦ and a. Let S = S_a := R × R^2 × ... × R^k, and let (K,R) denote the Kronecker factor of (X,R). We want to show that for almost every x ∈ X the Kronecker factor of (Y_x,S) is isomorphic (as measure preserving system) to the system (K,R), where Y_x is defined by (2.2.10).

Let H be as in (2.2.12). Following the notation of Proposition 66, we define Y := H/∆, where ∆ := H ∩ Γ^k, and S_b := R_b × R_b^2 × ... × R_b^k. By Proposition 66 part (iv) the system (Y_x, S_a) is isomorphic to (Y, S_{g^{-1}ag}), for almost every x = gΓ ∈ X. Hence, to finish the proof it suffices to show that the Kronecker factor of (Y, S_b) is isomorphic to (K,R) for every b of the form g^{-1}ag.

Let H^(1), ..., H^(k−1) be as in (2.2.11). According to Lemma 64 the groups G_2, G_3, ..., G_s are connected, from which we deduce that the group G_k^k and the sets H^(2), ..., H^(k−1) are also connected. Since G is generated by G◦ and a, we get that the group generated by H^(1) is generated by its identity component and (a, a^2, ..., a^k). Putting everything together, this implies that H is generated by H◦ and (a, a^2, ..., a^k). Moreover, since H is normalized by the diagonal G_∆, we conclude that H is in fact generated by H◦ and

\[ h_g := (g^{-1} a g,\ g^{-1} a^2 g,\ \ldots,\ g^{-1} a^k g) \]

for any g ∈ G.

Now, the fact that H is generated by H◦ and h_g allows us to apply Corollary 53 to the system (Y, S_b), noting that this system is isomorphic to systems of the form (Y_x, S_a) which are, by construction, transitive and hence, in view of Theorem 50, ergodic. It now follows that for any b = g^{-1}ag the Kronecker factor of (Y, S_b) is given by (K_Y, S_b), where N_Y = [H,H] and K_Y := N_Y\Y.

Finally, we need to show that the systems (K_Y, S_b) and (K,R) are isomorphic. Observe that K_Y can be identified with H/(N_Y∆). Moreover, by definition we have ∆ = H ∩ Γ^k and from Proposition 66, part (ii), we have N_Y = H ∩ N^k, where N = [G,G]. This implies that N_Y∆ = H ∩ (N^kΓ^k) and hence K_Y = H/(H ∩ N^kΓ^k). It follows from the second isomorphism theorem that K_Y ≅ (HN^kΓ^k)/(N^kΓ^k). Finally, since H^(i) ⊂ N^k for every i > 1, it follows that HN^k = H^(1)N^k and thus K_Y ≅ (H^(1)N^kΓ^k)/(N^kΓ^k).

Applying Corollary 53 to the original system (X,R), we see that K = N\X = G/NΓ; therefore K_Y embeds naturally into K^k = G^k/(N^kΓ^k). Looking at the definition of H^(1) we deduce that

\[ K_Y \cong \{(v, 2v, \ldots, kv) : v \in K\} \cong K. \]

Finally, since K is an abelian group and b = g^{-1}ag, the projections of a and b onto K coincide. Hence (K_Y, S_b) is indeed isomorphic (as a measure preserving system) to (K, R_a) as desired.

2.2.8 A Theorem for eliminating the rational spectrum

The purpose of this subsection is to give a proof of Theorem 20. We will need the following lemma.

Lemma 68. Let X be a nilmanifold, let R = R_a be a niltranslation on X. Suppose x ∈ X and F ∈ C(X). Then there exist a periodic sequence ρ: N → C and φ ∈ Coper(N, Φ^(Cesàro)) such that

\[ F(R^n x) = \rho(n) + \varphi(n), \qquad \forall n \in \mathbb{N}. \]

Proof. Since X is compact, it only possesses finitely many connected components, which we denote by X_0, X_1, ..., X_{r−1}. Since R cyclically permutes these connected components, after reordering if necessary, we can assume without loss of generality that R^n x ∈ X_{n mod r} for all n ∈ N. Now define

\[ \rho(n) := \int_{X_{n \bmod r}} F \, d\mu_{X_{n \bmod r}} \]

and

\[ \varphi(n) := F(R^n x) - \rho(n). \]

Clearly, ρ is periodic with period r. Moreover, it follows from Theorem 50 that (X_t, R^r, µ_{X_t}) is totally ergodic for all t ∈ {0, 1, ..., r − 1}.

Suppose q, s ∈ N. Let l := lcm(q, r) and pick m ∈ N such that qm = l. Since q(mn + j) + s = ln + qj + s, we get

\begin{gather*}
\lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \varphi(qn + s)
 = \frac{1}{m} \sum_{j=0}^{m-1} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \varphi(q(mn + j) + s) \\
 = \frac{1}{m} \sum_{j=0}^{m-1} \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \varphi(ln + qj + s) \\
 = \frac{1}{m} \sum_{j=0}^{m-1} \Bigl( \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F(R^{ln + qj + s} x) - \int_{X_{(qj+s) \bmod r}} F \, d\mu_{X_{(qj+s) \bmod r}} \Bigr).
\end{gather*}

Since R^r is totally ergodic on X_{(qj+s) mod r} and r divides l, the sequence R^{ln+qj+s}x is uniformly distributed on X_{(qj+s) mod r}, from which it follows that

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} F(R^{ln + qj + s} x) = \int_{X_{(qj+s) \bmod r}} F \, d\mu_{X_{(qj+s) \bmod r}}. \]

We conclude that lim_{N→∞} (1/N) Σ_{n=1}^{N} φ(qn + s) = 0 for all q, s ∈ N and this implies that φ ∈ Coper(N, Φ^(Cesàro)).
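The decomposition in Lemma 68 can be illustrated numerically on a toy system. The sketch below is not the general nilpotent setting: it assumes the 1-step example X = (Z/3Z) × T with R(i, t) = (i + 1 mod 3, t + α), α = √2, and a hypothetical observable F(i, t) = c_i + cos(2πt); all names and constants are chosen here for illustration. In this example the component integrals are the constants c_i, so ρ(n) = c_{n mod 3}.

```python
import math

R_COMPONENTS = 3               # number r of connected components of the toy space
ALPHA = math.sqrt(2)           # irrational rotation number (an assumption of this sketch)
OFFSETS = [0.5, -1.0, 2.0]     # hypothetical constants c_i, one per component

def observable(i, t):
    """F(i, t) = c_i + cos(2*pi*t), a continuous function on (Z/3Z) x T."""
    return OFFSETS[i] + math.cos(2 * math.pi * t)

def orbit(n, t0=0.1):
    """R^n x for the starting point x = (0, t0)."""
    return n % R_COMPONENTS, (t0 + n * ALPHA) % 1.0

def rho(n):
    """Periodic part: the integral of F over the component containing R^n x."""
    return OFFSETS[n % R_COMPONENTS]

def phi(n):
    """Remainder F(R^n x) - rho(n)."""
    return observable(*orbit(n)) - rho(n)

def cesaro_average(q, s, N):
    """Average of phi along the progression qn + s, n = 1, ..., N."""
    return sum(phi(q * n + s) for n in range(1, N + 1)) / N
```

For small q and s the averages `cesaro_average(q, s, N)` are already close to 0 for N around 20000, in line with the conclusion of the lemma for this toy example.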

Next, we also need a uniform version of the following polynomial multiple recurrence theorem obtained in [BL96]:

Theorem 69 (see [BL96, Theorem A]). Let ℓ, u ∈ N and let p_{i,j} ∈ Q[t] be polynomials satisfying p_{i,j}(Z) ⊂ Z and p_{i,j}(0) = 0, i = 1, ..., ℓ, j = 1, ..., u. Then for any probability space (X, B, µ), any u-tuple of commuting invertible measure preserving transformations T_1, ..., T_u on (X, B, µ) and any A ∈ B with µ(A) > 0 one has

\[ \liminf_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\Bigl( A \cap \prod_{j=1}^{u} T_j^{-p_{1,j}(n)} A \cap \prod_{j=1}^{u} T_j^{-p_{2,j}(n)} A \cap \ldots \cap \prod_{j=1}^{u} T_j^{-p_{\ell,j}(n)} A \Bigr) > 0. \]

The uniform version in question is given by the following theorem and we will use a special case of it to prove Theorem 20.

Theorem 70. For all ℓ, d ∈ N and all ε > 0 there exists δ > 0 such that the following holds: For any u ∈ N, for any polynomials p_{i,j} ∈ Q[t], i = 1, ..., ℓ, j = 1, ..., u, satisfying deg(p_{i,j}) ≤ d, p_{i,j}(Z) ⊂ Z, p_{i,j}(0) = 0, for any probability space (X, B, µ), for any u-tuple of commuting invertible measure preserving transformations T_1, ..., T_u on (X, B, µ), for any A ∈ B with µ(A) > ε and for any s ∈ N one has

\[ \lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N-1} \mu\Bigl( A \cap \prod_{j=1}^{u} T_j^{-p_{1,j}(sn)} A \cap \prod_{j=1}^{u} T_j^{-p_{2,j}(sn)} A \cap \ldots \cap \prod_{j=1}^{u} T_j^{-p_{\ell,j}(sn)} A \Bigr) > \delta. \tag{2.2.14} \]

We remark that a slightly less general version of Theorem 70 is stated in [FHK13, Theorem 4.1] without a proof.

In the course of proving Theorem 70 we will make use of the following equivalent combinatorial form of Theorem 69.

Theorem 71 (see [BHMP00, Theorem 3.2]). Let ℓ, u ∈ N, let ε > 0 and let p_{i,j} ∈ Z[t] be polynomials satisfying p_{i,j}(0) = 0, i = 1, ..., ℓ, j = 1, ..., u. Then there exists a positive integer N = N(ℓ, u, ε, p_{i,j}) such that for all sets A ⊂ Z^u with

\[ \frac{|A \cap [1, N]^u|}{N^u} > \varepsilon \]

there exist n ∈ N and a ∈ A such that a + (p_{i,1}(n), ..., p_{i,u}(n)) ∈ A for all i ∈ {1, 2, ..., ℓ}.

We will need the following theorem, which is of independent interest and can be interpreted as a polynomial extension of Theorem F2 in [BHMP00].

Theorem 72. For every ℓ, d ∈ N and every ε > 0 there exist K ∈ N and β > 0 such that for any probability space (X, B, µ), any commuting invertible measure preserving transformations T_{i,j}, 1 ≤ i ≤ ℓ and 1 ≤ j ≤ d, and any A ∈ B with µ(A) > ε there exists n ∈ {1, ..., K} such that

\[ \mu\Bigl( A \cap \prod_{j=1}^{d} T_{1,j}^{-n^j} A \cap \prod_{j=1}^{d} T_{2,j}^{-n^j} A \cap \ldots \cap \prod_{j=1}^{d} T_{\ell,j}^{-n^j} A \Bigr) > \beta. \tag{2.2.15} \]

Moreover,

\[ \lim_{N - M \to \infty} \frac{1}{N - M} \sum_{n=M}^{N-1} \mu\Bigl( A \cap \prod_{j=1}^{d} T_{1,j}^{-n^j} A \cap \prod_{j=1}^{d} T_{2,j}^{-n^j} A \cap \ldots \cap \prod_{j=1}^{d} T_{\ell,j}^{-n^j} A \Bigr) \geq \frac{\beta}{K^2}. \tag{2.2.16} \]

Proof. Let u := dℓ and, for 1 ≤ i ≤ ℓ and 1 ≤ t ≤ u, define

\[ p_{i,t}(n) = \begin{cases} n^j, & \text{if } t = (i-1)d + j \text{ with } 1 \leq i \leq \ell \text{ and } 1 \leq j \leq d; \\ 0, & \text{otherwise.} \end{cases} \tag{2.2.17} \]

Let K = N(ℓ, u, ε/2, p_{i,t}) as guaranteed by Theorem 71. For the remainder of this proof let us call a set of the form {a} ∪ {a + (p_{i,1}(n), ..., p_{i,u}(n)) : 1 ≤ i ≤ ℓ} for some a = (a_1, ..., a_u) ∈ N^u and n ∈ N a basic arrangement. Let J denote the collection of all basic arrangements contained in {1, ..., K}^u. Set β := ε/(4|J|). We claim that (2.2.15) and (2.2.16) are satisfied with this choice of K and β.

Let (X, B, µ) be an arbitrary probability space, let T_{i,j}, 1 ≤ i ≤ ℓ and 1 ≤ j ≤ d, be commuting invertible measure preserving transformations on X and let A ∈ B with µ(A) > ε. For 1 ≤ t ≤ u let S_t := T_{i,j} where (i, j) ∈ {1, ..., ℓ} × {1, ..., d} is such that t = (i − 1)d + j. It thus follows from (2.2.17) that

\[ \prod_{j=1}^{d} T_{i,j}^{n^j} = \prod_{t=1}^{u} S_t^{p_{i,t}(n)}. \tag{2.2.18} \]

Define

\[ f(x) := \frac{1}{K^u} \sum_{(n_1, \ldots, n_u) \in [1,K]^u} 1_A\Bigl( \prod_{t=1}^{u} S_t^{n_t}(x) \Bigr). \]

Clearly, f is a non-negative function and ∫_X f dµ > ε. Therefore the set B := {x ∈ X : f(x) > ε/2} satisfies µ(B) > ε/2. Also, for every x ∈ B the set

\[ E_x := \Bigl\{ (n_1, \ldots, n_u) \in [1,K]^u : \prod_{t=1}^{u} S_t^{n_t}(x) \in A \Bigr\} \]

has density at least ε/2 in [1,K]^u, i.e. |E_x| > (ε/2)K^u. By our choice of K, we are guaranteed to find at least one basic arrangement contained in E_x.

We have shown that for every x ∈ B there exists a basic arrangement contained in E_x ⊂ [1,K]^u. Since there are |J|-many basic arrangements in [1,K]^u, by the pigeonhole principle there exists a set C ⊂ B with µ(C) > ε/(2|J|) such that E_x contains the same basic arrangement for every x ∈ C. Suppose this basic arrangement is given by {(a_1, ..., a_u)} ∪ {(a_1, ..., a_u) + (p_{i,1}(n), ..., p_{i,u}(n)) : 1 ≤ i ≤ ℓ}. Let C' := ∏_{t=1}^{u} S_t^{a_t} C. Then for any x' ∈ C' and any i ∈ {1, ..., ℓ}, if x := ∏_{t=1}^{u} S_t^{-a_t}(x') then by (2.2.18) and the definition of E_x we have

\[ \prod_{j=1}^{d} T_{i,j}^{n^j}(x') = \prod_{t=1}^{u} S_t^{p_{i,t}(n)}(x') = \prod_{t=1}^{u} S_t^{a_t + p_{i,t}(n)}(x) \in A. \]

This shows that C' is contained in the intersection A ∩ ∏_{j=1}^{d} T_{1,j}^{-n^j}A ∩ ∏_{j=1}^{d} T_{2,j}^{-n^j}A ∩ ... ∩ ∏_{j=1}^{d} T_{ℓ,j}^{-n^j}A. Since µ(C') = µ(C) > β, this finishes the proof of (2.2.15).

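Next, we give a proof of (2.2.16). Let M ≥ 1 be arbitrary. Note that for all m with M(K − 1) < m ≤ MK and all k with 1 ≤ k ≤ K the products mk are pairwise distinct. For 1 ≤ j ≤ d, 1 ≤ m ≤ M and 1 ≤ i ≤ ℓ define R_{i,j,m} := T_{i,j}^{m^j}. It follows that

\begin{gather*}
\frac{1}{MK^2} \sum_{n=1}^{MK^2} \mu\Bigl( A \cap \prod_{j=1}^{d} T_{1,j}^{-n^j} A \cap \prod_{j=1}^{d} T_{2,j}^{-n^j} A \cap \ldots \cap \prod_{j=1}^{d} T_{\ell,j}^{-n^j} A \Bigr) \\
\geq \frac{1}{MK^2} \sum_{m=M(K-1)+1}^{MK} \sum_{k=1}^{K} \mu\Bigl( A \cap \prod_{j=1}^{d} T_{1,j}^{-(mk)^j} A \cap \prod_{j=1}^{d} T_{2,j}^{-(mk)^j} A \cap \ldots \cap \prod_{j=1}^{d} T_{\ell,j}^{-(mk)^j} A \Bigr) \\
= \frac{1}{MK^2} \sum_{m=M(K-1)+1}^{MK} \sum_{k=1}^{K} \mu\Bigl( A \cap \prod_{j=1}^{d} R_{1,j,m}^{-k^j} A \cap \prod_{j=1}^{d} R_{2,j,m}^{-k^j} A \cap \ldots \cap \prod_{j=1}^{d} R_{\ell,j,m}^{-k^j} A \Bigr).
\end{gather*}

In light of (2.2.15) we have

\[ \sum_{k=1}^{K} \mu\Bigl( A \cap \prod_{j=1}^{d} R_{1,j,m}^{-k^j} A \cap \prod_{j=1}^{d} R_{2,j,m}^{-k^j} A \cap \ldots \cap \prod_{j=1}^{d} R_{\ell,j,m}^{-k^j} A \Bigr) > \beta \]

for all 1 ≤ m ≤ M. Therefore,

\begin{gather*}
\frac{1}{MK^2} \sum_{m=M(K-1)+1}^{MK} \sum_{k=1}^{K} \mu\Bigl( A \cap \prod_{j=1}^{d} R_{1,j,m}^{-k^j} A \cap \prod_{j=1}^{d} R_{2,j,m}^{-k^j} A \cap \ldots \cap \prod_{j=1}^{d} R_{\ell,j,m}^{-k^j} A \Bigr) \\
\geq \frac{1}{MK^2} \sum_{m=M(K-1)+1}^{MK} \beta = \frac{\beta}{K^2}.
\end{gather*}

This proves that

\[ \liminf_{M \to \infty} \frac{1}{MK^2} \sum_{n=1}^{MK^2} \mu\Bigl( A \cap \prod_{j=1}^{d} T_{1,j}^{-n^j} A \cap \prod_{j=1}^{d} T_{2,j}^{-n^j} A \cap \ldots \cap \prod_{j=1}^{d} T_{\ell,j}^{-n^j} A \Bigr) \geq \frac{\beta}{K^2}. \tag{2.2.19} \]

Finally, it follows from the results in [Wal12] that the limits on the left hand side of (2.2.19) and on the left hand side of (2.2.16) exist and are equal. This finishes the proof of (2.2.16).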

Proof of Theorem 70. Depending only on ℓ, d ∈ N and ε > 0, choose β > 0 and K ≥ 1 as guaranteed by Theorem 72. Note that coefficients of integer-valued polynomials of degree at most d can be written as fractions with denominator q := d!. Define b := q! and pick any δ > 0 such that δ < β/(bK^2). We claim that (2.2.14) holds with this choice of δ.

Let u, s ∈ N and let p_{i,j} ∈ Q[t], i = 1, ..., ℓ, j = 1, ..., u, with deg(p_{i,j}) ≤ d, p_{i,j}(Z) ⊂ Z, p_{i,j}(0) = 0 and such that the denominators of the coefficients of p_{i,j} (when written as reduced fractions) are at most q. Furthermore, let T_1, ..., T_u be commuting invertible measure preserving transformations on a probability space (X, B, µ) and let A ∈ B with µ(A) > ε. It follows from [Wal12] that the limit on the left hand side of (2.2.14) exists and is equal to

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\Bigl( A \cap \prod_{j=1}^{u} T_j^{-p_{1,j}(sn)} A \cap \prod_{j=1}^{u} T_j^{-p_{2,j}(sn)} A \cap \ldots \cap \prod_{j=1}^{u} T_j^{-p_{\ell,j}(sn)} A \Bigr). \tag{2.2.20} \]

It thus suffices to show that (2.2.20) is bigger than δ.

For 1 ≤ i ≤ ℓ and 1 ≤ j ≤ u find a_{i,j}^{(1)}, ..., a_{i,j}^{(d)} such that

\[ p_{i,j}(n) = a_{i,j}^{(1)} n + a_{i,j}^{(2)} n^2 + \ldots + a_{i,j}^{(d)} n^d. \]

By assumption, b a_{i,j}^{(k)} ∈ Z and hence s^k b^k a_{i,j}^{(k)} ∈ Z for all s ∈ N. Define

\[ R_{i,k} := \prod_{j=1}^{u} T_j^{s^k b^k a_{i,j}^{(k)}}, \qquad 1 \leq k \leq d. \]

Clearly,

\[ \prod_{j=1}^{u} T_j^{p_{i,j}(bsn)} = \prod_{j=1}^{d} R_{i,j}^{n^j}, \qquad \forall n \in \mathbb{N}. \]

We thus have

\begin{gather*}
\frac{1}{N} \sum_{n=1}^{N} \mu\Bigl( A \cap \prod_{j=1}^{u} T_j^{-p_{1,j}(sn)} A \cap \prod_{j=1}^{u} T_j^{-p_{2,j}(sn)} A \cap \ldots \cap \prod_{j=1}^{u} T_j^{-p_{\ell,j}(sn)} A \Bigr) \\
\geq \frac{1}{N} \sum_{n=1}^{\lfloor N/b \rfloor} \mu\Bigl( A \cap \prod_{j=1}^{u} T_j^{-p_{1,j}(bsn)} A \cap \prod_{j=1}^{u} T_j^{-p_{2,j}(bsn)} A \cap \ldots \cap \prod_{j=1}^{u} T_j^{-p_{\ell,j}(bsn)} A \Bigr) \\
= \frac{1}{N} \sum_{n=1}^{\lfloor N/b \rfloor} \mu\Bigl( A \cap \prod_{j=1}^{d} R_{1,j}^{-n^j} A \cap \prod_{j=1}^{d} R_{2,j}^{-n^j} A \cap \ldots \cap \prod_{j=1}^{d} R_{\ell,j}^{-n^j} A \Bigr).
\end{gather*}

From (2.2.16) it follows that

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{\lfloor N/b \rfloor} \mu\Bigl( A \cap \prod_{j=1}^{d} R_{1,j}^{-n^j} A \cap \prod_{j=1}^{d} R_{2,j}^{-n^j} A \cap \ldots \cap \prod_{j=1}^{d} R_{\ell,j}^{-n^j} A \Bigr) \geq \frac{\beta}{bK^2} > \delta. \]

Therefore,

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \mu\Bigl( A \cap \prod_{j=1}^{u} T_j^{-p_{1,j}(sn)} A \cap \prod_{j=1}^{u} T_j^{-p_{2,j}(sn)} A \cap \ldots \cap \prod_{j=1}^{u} T_j^{-p_{\ell,j}(sn)} A \Bigr) > \delta. \]
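The proof of Theorem 70 uses that the coefficients of an integer-valued polynomial of degree at most d have denominators dividing d!. A small exact check of this fact is sketched below. It relies on the standard fact that integer-valued polynomials with p(0) = 0 are integer combinations of the binomial polynomials binom(t, j), j ≥ 1; the function names and the sample weights are hypothetical and only serve the illustration.

```python
from fractions import Fraction
from math import factorial

def binomial_coefficients(j):
    """Coefficients (constant term first) of binom(t, j) = t(t-1)...(t-j+1)/j!."""
    coeffs = [Fraction(1)]
    for i in range(j):
        shifted = [Fraction(0)] + coeffs                   # t * current polynomial
        scaled = [-i * c for c in coeffs] + [Fraction(0)]  # -i * current polynomial
        coeffs = [a + b for a, b in zip(shifted, scaled)]  # multiply by (t - i)
    return [c / factorial(j) for c in coeffs]

def integer_valued_polynomial(weights):
    """Coefficients of sum_j weights[j-1] * binom(t, j); integer-valued with p(0) = 0."""
    degree = len(weights)
    total = [Fraction(0)] * (degree + 1)
    for j, weight in enumerate(weights, start=1):
        for power, c in enumerate(binomial_coefficients(j)):
            total[power] += weight * c
    return total

def evaluate(coeffs, n):
    """Evaluate the polynomial with the given coefficients at the integer n."""
    return sum(c * n**k for k, c in enumerate(coeffs))
```

For a polynomial of degree d built this way, multiplying every coefficient by d! gives an integer. The constant b = (d!)! used in the proof is a multiple of d!, so it clears the denominators as well.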

Proof of Theorem 20. Fix k ∈ N. Let (X, B, µ, T) be an ergodic measure preserving system, A ∈ B with µ(A) > 0 and p_1, ..., p_k ∈ Z[x] with p_i(0) = 0 for all i = 1, ..., k. Write α(n) for the multicorrelation function ∫_X 1_A · T^{p_1(n)}1_A · ... · T^{p_k(n)}1_A dµ. Using the Bergelson-Host-Kra decomposition (see (1.4.4)) we can write

\[ \alpha(n) = \phi(n) + \omega(n) \]

where ϕ is a (k + 1)-step nilsequence and ω is a null-sequence. Since ϕ ∈ Nil_{k+1}, for every ε > 0 there exists a nilmanifold Y, a niltranslation R on Y, a point x ∈ Y and F ∈ C(Y) such that ϕ(n) = F(R^n x) + γ(n), where sup_{n∈N} |γ(n)| < ε. In light of Lemma 68 we can find ρ: N → C and φ ∈ Coper(N, Φ^(Cesàro)) such that

\[ F(R^n x) = \rho(n) + \varphi(n), \qquad \forall n \in \mathbb{N}. \]

Finally, note that

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \rho(qn) = \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \alpha(qn), \qquad \forall q \in \mathbb{N}, \]

and therefore it follows from Theorem 70 that there exists δ > 0 such that

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} \rho(qn) > \delta, \qquad \forall q \in \mathbb{N}. \]

2.3 Multiplicative functions and their level sets

The goal of this section is to provide proofs of the dichotomy theorem for M0 (Theorem 21) and of the structure theorem for level sets of multiplicative functions (Theorem 24). These proofs are based on [BKPLR17].

2.3.1 Preliminaries

In this subsection we present some basic results and ideas regarding multiplicative functions and level sets of multiplicative functions which will be used in the subsequent subsections.

Multiplicative functions

Recall that M denotes the set of all multiplicative functions f: N → C with |f(n)| ≤ 1 for all n ∈ N. The set M can be endowed with a “distance” function D: M × M → [0, ∞], which serves as a useful tool for cataloging the class of multiplicative functions bounded in modulus by 1. Let P denote the set of prime numbers. For f, g ∈ M define

\[ \mathbb{D}(f, g) := \sqrt{ \sum_{p \in \mathbb{P}} \frac{1}{p} \Bigl( 1 - \operatorname{Re}\bigl(f(p)\overline{g(p)}\bigr) \Bigr) }. \]

Remark 73. Let us list some important properties of D. For more details and proofs the reader is referred to the book of Granville and Soundararajan [GS].

(1) D(f, g) = D(g, f) = D(f̄, ḡ);

(2) D satisfies the triangle inequality, D(f, g) ≤ D(f, h) + D(h, g);

(3) mD(f, g) ≥ D(f^m, g^m) for all m ∈ N;

(4) D(f, g) < ∞ implies D(|f|, |g|) < ∞.

When D(f, g) < ∞ then, borrowing the terminology from [GS], we say that f pretends to be g. In this case, many properties of f are shared by g and vice versa.
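Since D(f, g) only depends on the values of f and g at primes, truncations of the defining series are easy to compute. The following sketch evaluates the partial sum of D(f, g)^2 over primes p ≤ x. It is an illustration under stated assumptions: the truncation point x, the helper names and the sample functions are chosen here, and a finite truncation can only suggest, not prove, whether D(f, g) is finite.

```python
import math

def primes_up_to(x):
    """Return all primes p <= x (for x >= 2) using a sieve of Eratosthenes."""
    sieve = bytearray([1]) * (x + 1)
    sieve[0:2] = b"\x00\x00"
    for p in range(2, math.isqrt(x) + 1):
        if sieve[p]:
            sieve[p * p::p] = bytearray(len(range(p * p, x + 1, p)))
    return [p for p in range(2, x + 1) if sieve[p]]

def truncated_distance_squared(f, g, x):
    """Partial sum over primes p <= x of (1 - Re(f(p) * conj(g(p)))) / p."""
    return sum((1 - (f(p) * complex(g(p)).conjugate()).real) / p
               for p in primes_up_to(x))

def one(n):
    """The constant function 1."""
    return 1

def minus_one_on_primes(p):
    """Agrees with the Liouville and Moebius functions at primes (only used there)."""
    return -1

def archimedean(t):
    """The Archimedean character n -> n^{it}."""
    return lambda n: complex(math.cos(t * math.log(n)), math.sin(t * math.log(n)))
```

For instance, the truncated value for the pair (1, `minus_one_on_primes`) equals twice the sum of 1/p over p ≤ x, which grows without bound as x increases, whereas two functions that agree at all primes have truncated distance 0.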

In Subsections 2.3.1, 2.3.1 and 2.3.1 below we will see that one can often determine whether a multiplicative function f ∈ M:

– has a mean value,

– is Besicovitch almost periodic,

– is aperiodic, or

– is uniform

by measuring the D-distance between f and Archimedean characters and Dirichlet characters. An Archimedean character is a function of the form n ↦ n^{it} = e^{it log n} with t ∈ R. Any Archimedean character is a completely multiplicative element of M. An arithmetic function χ is called a Dirichlet character if there exists a number d ∈ N, called a modulus of χ, such that:

(1) χ(n + d) = χ(n) for all n ∈ N;

(2) χ(n) = 0 whenever gcd(d, n) > 1, and χ(n) is a ϕ(d)-th root of unity whenever gcd(d, n) = 1, where ϕ denotes Euler’s totient function;

(3) χ(nm) = χ(n)χ(m) for all n, m ∈ N.

Any Dirichlet character is periodic and completely multiplicative.⁴ We also remark that χ: N → C is a Dirichlet character of modulus k if and only if there exists a group character χ̃ of the multiplicative group (Z/kZ)^* such that χ(n) = χ̃(n mod k) for all n ∈ N. The Dirichlet character determined by the trivial (constant equal to 1) character of (Z/kZ)^* is called the principal character of modulus k. It is denoted by χ_1. Note that if d | k and χ is a Dirichlet character of modulus d then

\[ \chi' := \chi \cdot \chi_1 \tag{2.3.1} \]

is a Dirichlet character of modulus k.
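The construction in (2.3.1) can be tried out on a small case. The sketch below takes the non-principal Dirichlet character of modulus d = 4 and multiplies it by the principal character of modulus k = 12; the specific moduli and the function names are choices made here for illustration.

```python
from math import gcd

def chi_mod_4(n):
    """The non-principal Dirichlet character of modulus 4."""
    return {0: 0, 1: 1, 2: 0, 3: -1}[n % 4]

def principal_character(k):
    """The principal character chi_1 of modulus k."""
    return lambda n: 1 if gcd(n, k) == 1 else 0

def induced_character(chi, k):
    """The product chi * chi_1 from (2.3.1), for a character chi whose modulus divides k."""
    chi_1 = principal_character(k)
    return lambda n: chi(n) * chi_1(n)

chi_prime = induced_character(chi_mod_4, 12)
```

The resulting function `chi_prime` has period 12, is completely multiplicative, and vanishes exactly at the integers sharing a factor with 12, as required of a Dirichlet character of modulus 12.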

Lemma 74 (cf. [GS, Lemma 4.6] and [FKPL, Remark after Lemma 2.2]). For every t ≠ 0 and every Dirichlet character χ we have D(χ, n^{it}) = ∞. In particular, for t ≠ 0 one has D(1, n^{it}) = ∞.

We end this subsection with a list containing examples of multiplicative functions belonging to M and examples of sets in D that can be obtained from functions in M.

Example 75.

(1) The Liouville function λ is defined as λ(n) := (−1)^{Ω(n)} and is completely multiplicative (for the definition of Ω(n) see Example 22). The non-trivial level sets of λ are exactly the multiplicatively even and odd numbers E and O defined in Example 22.

(2) The Möbius function µ is defined as µ(n) := λ(n)1_Q(n). Note that µ is multiplicative but not completely multiplicative.

(3) Throughout this paper we identify the torus T := R/Z with the unit interval [0, 1) mod 1 or, when convenient, with the unit circle in the complex plane. Also, we introduce the notation e(x) := e^{2πix} for all x ∈ R. Given ξ ∈ T, define the multiplicative functions λ_ξ, µ_ξ and κ_ξ as

\[ \lambda_\xi(n) := e(\xi\,\Omega(n)), \qquad \mu_\xi(n) := \lambda_\xi(n) 1_Q(n), \]

and

\[ \kappa_\xi(n) := e(\xi\,\omega(n)), \]

where ω(n) denotes the number of distinct prime divisors of n (counted without multiplicities). It is clear that κ_ξ, λ_ξ, µ_ξ ∈ M. Observe that λ_{1/2} = λ and µ_{1/2} = µ. The following examples of sets belong to D because they can be viewed as level sets of the functions κ_ξ, λ_ξ, and µ_ξ, respectively, where ξ is any primitive b-th root of unity:

S_{ω,b,r} := {n ∈ N : ω(n) ≡ r mod b},

S_{Ω,b,r} := {n ∈ N : Ω(n) ≡ r mod b},

U_{b,r} := {n ∈ N : n is squarefree and Ω(n) ≡ r mod b}.

Note that E = S_{Ω,2,0} and O = S_{Ω,2,1}.

(4) If f: N → N is multiplicative and b is either 2, 4, p or 2p, where p stands for an odd prime number, then for any r ∈ {0, 1, ..., b − 1} with gcd(b, r) = 1 the set

V_{f,b,r} := {n ∈ N : f(n) ≡ r mod b}

is an element of D. This is because for any such b the multiplicative group of integers mod b is cyclic and hence there exists a Dirichlet character χ which spans (Z/bZ)^*. Therefore V_{f,b,r} can be realized as a level set of the multiplicative function χ ∘ f, which belongs to M. In particular, the set

S_{τ,b,r} := {n ∈ N : τ(n) ≡ r mod b}

belongs to D, where τ(n) := Σ_{d|n} 1 is the number of divisors function.

⁴ The converse of this statement is also true: Any periodic and completely multiplicative function is a Dirichlet character.
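The functions in Example 75 are straightforward to compute for small arguments. The following sketch uses trial division, with helper names chosen here for illustration; it implements Ω, ω, the Liouville function λ and the Möbius function µ = λ·1_Q, where Q is the set of squarefree numbers.

```python
def prime_exponents(n):
    """Return the exponents in the prime factorization of an integer n >= 1."""
    exponents, p = [], 2
    while p * p <= n:
        if n % p == 0:
            e = 0
            while n % p == 0:
                n //= p
                e += 1
            exponents.append(e)
        p += 1
    if n > 1:
        exponents.append(1)  # one remaining prime factor
    return exponents

def big_omega(n):
    """Number of prime factors of n counted with multiplicity."""
    return sum(prime_exponents(n))

def small_omega(n):
    """Number of distinct prime factors of n."""
    return len(prime_exponents(n))

def is_squarefree(n):
    return all(e == 1 for e in prime_exponents(n))

def liouville(n):
    return (-1) ** big_omega(n)

def moebius(n):
    return liouville(n) if is_squarefree(n) else 0
```

For example, the level set {n : λ(n) = 1} restricted to n ≤ 10 is {1, 4, 6, 9, 10}, the beginning of the set E of multiplicatively even numbers. The values µ(2)µ(2) = 1 and µ(4) = 0 show that µ is not completely multiplicative.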

Mean value theorems of Wirsing and Halász

We say a function f ∈ M has a mean value, and denote it by M(f), if the limit

\[ M(f) := \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(n) \tag{2.3.2} \]

exists. In general, the mean value of a multiplicative function f ∈ M does not exist; for example, if t ≠ 0 then the mean of n^{it} does not exist, cf. [GS, Section 4.3].

Two classical theorems in multiplicative number theory are Wirsing’s celebrated mean value theorem regarding real-valued multiplicative functions bounded in modulus by 1, and Halász’s generalization of Wirsing’s theorem to all functions in M.

Theorem 76 (Wirsing; see [Wir61] and [Ell79, Theorem 6.4]). For any real-valued g ∈ M the mean value M(g) exists.

Theorem 77 (Halász; see [Ell79, Theorem 6.3]). Let g ∈ M. Then the mean value M(g) exists if and only if one of the following mutually exclusive conditions is satisfied:

(i) there is at least one positive integer k so that g(2^k) ≠ −1 and, additionally, the series Σ_{p∈P} (1/p)(1 − g(p)) converges;

(ii) there is a real number t such that D(g, n^{it}) < ∞ and, moreover, for each positive integer k we have g(2^k) = −2^{itk};

(iii) D(g, n^{it}) = ∞ for each t ∈ R.

When condition (i) is satisfied then M(g) is non-zero and can be computed explicitly using the formula

\[ M(g) = \prod_{p \in \mathbb{P}} \Bigl(1 - \frac{1}{p}\Bigr) \Bigl(1 + \sum_{m=1}^{\infty} p^{-m} g(p^m)\Bigr). \tag{2.3.3} \]

In the case when g satisfies either (ii) or (iii) then the mean value M(g) equals zero.
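Formula (2.3.3) can be sanity-checked on g = µ^2, the indicator function of the squarefree numbers. Here g(p) = 1 and g(p^m) = 0 for m ≥ 2, so condition (i) holds and each Euler factor equals (1 − 1/p)(1 + 1/p). The sketch below compares a finite average of µ^2 with a truncated Euler product; the cut-offs and function names are assumptions of this illustration, and the agreement is only approximate.

```python
import math

def squarefree_indicator(N):
    """flags[n] = 1 if n is squarefree and 0 otherwise, for 0 <= n <= N (flags[0] = 0)."""
    flags = bytearray([1]) * (N + 1)
    flags[0] = 0
    for d in range(2, math.isqrt(N) + 1):
        square = d * d
        flags[square::square] = bytearray(len(range(square, N + 1, square)))
    return flags

def empirical_mean(N):
    """The average (1/N) * sum_{n <= N} mu(n)^2."""
    return sum(squarefree_indicator(N)) / N

def truncated_euler_product(P):
    """Product over primes p <= P of (1 - 1/p)(1 + 1/p), the factors of (2.3.3) for mu^2."""
    product = 1.0
    for p in range(2, P + 1):
        if all(p % q for q in range(2, math.isqrt(p) + 1)):  # p is prime
            product *= (1 - 1 / p) * (1 + 1 / p)
    return product
```

Both quantities are close to 6/π^2 ≈ 0.6079, the classical density of the squarefree numbers.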

Connection between multiplicative functions and Besicovitch almost periodic functions

Recall the definition of the Besicovitch seminorm ‖·‖_{Φ^(Cesàro)} and the definition of Besicovitch almost periodic functions Bes(N, Φ^(Cesàro)) and Besicovitch rationally almost periodic functions Bes_rat(N, Φ^(Cesàro)) given in Section 1.2.

Any periodic function is clearly Besicovitch rationally almost periodic. In particular, any Dirichlet character χ is Besicovitch rationally almost periodic, since Dirichlet characters are periodic. There are, however, many other natural examples of multiplicative functions that are Besicovitch rationally almost periodic. For instance, µ^2 and ϕ(n)/n are such. More generally, it will be shown at the end of this subsection (see Remark 83 below) that any bounded multiplicative function with values in [0, ∞) is Besicovitch rationally almost periodic.

For any Besicovitch almost periodic function f: N → C and any θ ∈ [0, 1) the limit

\[ \hat{f}(\theta) := \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(n) e(-n\theta) \]

exists; moreover, f̂(θ) differs from 0 for at most countably many values of θ (cf. [Bes55, pp. 104 – 105]). Then the spectrum of f is given by σ(f) := {θ ∈ [0, 1) : f̂(θ) ≠ 0} (cf. Definition 17). See [BL85; Bes55] for more information on the Fourier analysis of almost periodic functions.

We say that a Besicovitch almost periodic function f : N → C has rational spectrum if

σ(f) is a subset of Q ∩ [0, 1). Note that if f is periodic then its spectrum is rational. Note also that each Besicovitch almost periodic function f has a mean given by fˆ(0). It easily follows that there are f ∈ M that are not Besicovitch almost periodic (indeed, take any f ∈ M which has no mean). However, it follows from the next theorem, which is due to

Daboussi, that whenever f ∈ M is Besicovitch almost periodic then its spectrum has to be rational (this fact is used later, cf. Corollary 82 part (i) and (ii)).

Theorem 78 (cf. [DD82, Theorem 1]). Let f ∈ M. Then for all irrational θ,

\[ \lim_{N \to \infty} \frac{1}{N} \sum_{n=1}^{N} f(n) e(\theta n) = 0. \]

In Corollary 79 below we show that a Besicovitch almost periodic function is Besicovitch rationally almost periodic if and only if it has rational spectrum. We will derive this as a corollary from Theorem 54.

Corollary 79. Let f: N → C be Besicovitch almost periodic. Then f is Besicovitch rationally almost periodic if and only if f has rational spectrum.

Proof. First assume f has rational spectrum. By Theorem 54, f can be approximated in the seminorm ‖·‖_{Φ^(Cesàro)} by trigonometric polynomials of the form P(n) = Σ_{i=1}^{k} c_i e(θ_i n) with c_1, ..., c_k ∈ C and θ_1, ..., θ_k ∈ σ(f) ⊂ Q. Since θ_1, ..., θ_k are rational numbers, the function P(n) is periodic. In other words, f satisfies the definition of Besicovitch rationally almost periodic functions.

Next, let f be Besicovitch rationally almost periodic and let θ be an irrational number. We will show that f̂(θ) = 0. Let ε > 0 be arbitrary and let P: N → C be a periodic function with ‖f − P‖_{Φ^(Cesàro)} ≤ ε. Then

\begin{gather*}
|\hat{f}(\theta)| = \lim_{N \to \infty} \Bigl| \frac{1}{N} \sum_{n=1}^{N} f(n) e(-\theta n) \Bigr| \\
\leq \lim_{N \to \infty} \Bigl| \frac{1}{N} \sum_{n=1}^{N} P(n) e(-\theta n) \Bigr| + \varepsilon = \varepsilon,
\end{gather*}

where the last equality holds because P is periodic and θ is irrational. Since ε > 0 was chosen arbitrarily, we conclude that f̂(θ) = 0. This shows that no irrational number θ belongs to σ(f).

The next lemma is a consequence of Theorem 77 and establishes a connection between the distance function D, defined in Subsection 2.3.1, and the Besicovitch seminorm ‖·‖_{Φ^(Cesàro)}.

Lemma 80. Suppose f ∈ M. Then ‖f‖_{Φ^(Cesàro)} = 0 if and only if D(|f|, 1) = ∞.

Proof. First, observe that ‖f‖_{Φ^(Cesàro)} = 0 if and only if the mean value of the multiplicative function |f| is zero, i.e., M(|f|) = 0. In view of Theorem 77, the mean value of |f| is zero if and only if |f| satisfies either condition (ii) or condition (iii) of the theorem. Since

\[ 1 - |f(p)| \cos(t \log p) \geq \min\bigl(1,\ 1 - \cos(t \log p)\bigr), \qquad \forall p \in \mathbb{P}, \]

and D(1, n^{it}) = ∞ for all t ≠ 0 (cf. Lemma 74), it follows that D(|f|, n^{it}) = ∞ for all t ≠ 0. Therefore |f| cannot satisfy condition (ii) of Theorem 77. Hence M(|f|) = 0 if and only if |f| satisfies condition (iii) of Theorem 77. Finally, observe that |f| satisfies condition (iii) if and only if D(|f|, 1) = ∞.

In [DD82; DD74] Daboussi and Delange give necessary and sufficient conditions for a bounded multiplicative function to be Besicovitch almost periodic:

Theorem 81 ([DD82, Theorem 6]). A function f ∈ M is Besicovitch almost periodic if and only if either ‖f‖_{Φ^(Cesàro)} = 0 or there exists a Dirichlet character χ such that Σ_{p∈P} (1/p)(1 − f(p)χ(p)) converges.

From Theorem 81 we obtain the following corollary.

Corollary 82. Let f ∈ M. The following are equivalent:

(i) f is Besicovitch almost periodic;

(ii) f is Besicovitch rationally almost periodic;

(iii) either ‖f‖_{Φ^(Cesàro)} = 0 or there exists a Dirichlet character χ such that Σ_{p∈P} (1/p)(1 − f(p)χ(p)) converges.

Proof. The equivalence of (i) and (iii) is given by Theorem 81. Also, the fact that (ii) implies (i) is obvious. It thus remains to show that (i) implies (ii). However, from Theorem 78 we deduce that any Besicovitch almost periodic function f ∈ M has rational spectrum, which, in view of Corollary 79, implies that f is Besicovitch rationally almost periodic.

Remark 83. We claim that any bounded multiplicative function f taking values in [0, ∞) is Besicovitch rationally almost periodic.

Let us first prove the claim for the special case when 0 ≤ f(n) ≤ 1 for all n ∈ N. If $\|f\|_{\Phi(\text{Cesàro})} = 0$ then f is Besicovitch rationally almost periodic for trivial reasons; it thus suffices to verify the claim for f with $\|f\|_{\Phi(\text{Cesàro})} > 0$. In view of Lemma 80 it follows from $\|f\|_{\Phi(\text{Cesàro})} > 0$ that D(f, 1) < ∞. Since f only takes values in the interval [0, 1], the assertion D(f, 1) < ∞ is equivalent to the fact that the series $\sum_{p\in\mathbb{P}} \frac{1}{p}\bigl(1 - f(p)\bigr)$ converges. Therefore, using (iii) ⇒ (ii) of Corollary 82, we conclude that f is Besicovitch rationally almost periodic.

Next, assume f takes values in [0, b) for some b > 1. Define two new multiplicative functions g and h via

\[
g(p^k) := \begin{cases} f(p^k), & \text{if } f(p^k) \leq 1,\\ 1, & \text{if } f(p^k) > 1, \end{cases}
\qquad\text{and}\qquad
h(p^k) := \begin{cases} 1, & \text{if } f(p^k) \leq 1,\\ f(p^k), & \text{if } f(p^k) > 1. \end{cases}
\]

Clearly, f(n) = g(n)h(n) for all n ∈ N. Moreover, g and $\frac{1}{h}$ are multiplicative functions taking values in [0, 1]. It follows from the previous paragraph that both g and $\frac{1}{h}$ are Besicovitch rationally almost periodic. Since $\frac{1}{h}$ is Besicovitch rationally almost periodic, for every ε > 0 there exists a periodic function P with $\|\frac{1}{h} - P\|_{\Phi(\text{Cesàro})} \leq \varepsilon$. Since $\frac{1}{b} \leq \frac{1}{h(n)} \leq 1$, we can assume without loss of generality that $\frac{1}{b} \leq P(n) \leq 1$. It is then straightforward to show that $\|h - \frac{1}{P}\|_{\Phi(\text{Cesàro})} \leq b^2\varepsilon$, which proves that h is also Besicovitch rationally almost periodic. Finally, observe that f, as a product of two Besicovitch rationally almost periodic functions, is itself Besicovitch rationally almost periodic.
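The bound by $b^2\varepsilon$ at the end of Remark 83 can be obtained pointwise. The following is a sketch under the reading that P is a periodic function approximating $1/h$ with $1/b \leq P(n) \leq 1$ and $1 \leq h(n) \leq b$, and assuming that the seminorm $\|\cdot\|_{\Phi(\text{Cesàro})}$ is monotone with respect to pointwise domination of absolute values:

```latex
\[
\Bigl| h(n) - \frac{1}{P(n)} \Bigr|
= \frac{h(n)}{P(n)} \, \Bigl| P(n) - \frac{1}{h(n)} \Bigr|
\leq b \cdot b \cdot \Bigl| \frac{1}{h(n)} - P(n) \Bigr|
\qquad \text{for all } n \in \mathbb{N},
\]
\[
\text{and therefore}\qquad
\Bigl\| h - \frac{1}{P} \Bigr\|_{\Phi(\text{Cesàro})}
\leq b^2 \, \Bigl\| \frac{1}{h} - P \Bigr\|_{\Phi(\text{Cesàro})}
\leq b^2 \varepsilon .
\]
```

Since $1/P$ is again periodic, this exhibits h as a limit of periodic functions in the seminorm.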

Ruzsa’s theorem and some of its corollaries

In this short section we formulate a theorem of Ruzsa which shows that the density of a level set of a multiplicative function always exists and which gives necessary and sufficient conditions for this density to be positive. We also derive additional corollaries from this theorem which will be used in later sections.

Let r ∈ N. A function $f = (f_1, \ldots, f_r) : \mathbb{N} \to \mathbb{C}^r$ is called multiplicative if each of its coordinate components $f_i : \mathbb{N} \to \mathbb{C}$ is a multiplicative function. We say that a point $z \in \mathbb{C}^r$ is a concentration point for a multiplicative function $f : \mathbb{N} \to \mathbb{C}^r$ if the set $P := \{p \in \mathbb{P} : f(p) = z\}$ satisfies $\sum_{p\in P} \frac{1}{p} = \infty$. In the following, we denote by im(f) the image of f and, for z ∈ im(f), we use E(f, z) := {n ∈ N : f(n) = z} to denote level sets of f.

Definition 84 (cf. [Ruz77, Definition 3.8]). Assume that a multiplicative function $f : \mathbb{N} \to (\mathbb{C}\setminus\{0\})^r$ possesses at least one concentration point $z = (z_1, \ldots, z_r)$. The subgroup G of the multiplicative group $((\mathbb{C}\setminus\{0\})^r, \cdot)$ generated by all concentration points of f is called the concentration group of f.

Theorem 85 (cf. [Ruz77, Theorem 3.10]). Let $f : \mathbb{N} \to (\mathbb{C}\setminus\{0\})^r$ be a multiplicative function.

(1) Assume that f satisfies the following three conditions:

(a) f has at least one concentration point,

(b) the concentration group G of f is finite, and

(c) $\sum_{p\in\mathbb{P},\ f(p)\notin G} \frac{1}{p} < \infty$.

Then d(E(f, z)) exists and is strictly positive for all z ∈ im(f). Moreover,

\[ \sum_{z\in \mathrm{im}(f)} d(E(f, z)) = 1. \]

(2) If f does not satisfy (at least) one of the conditions (a), (b) or (c) of part (1), then

d(E(f, z)) = 0 for all z ∈ im(f).

Although we formulated Theorem 85 for arbitrary r ∈ N, we will mostly deal with the special case r = 1; the only exception is the proof of Lemma 107 below, where we also need the case r = 2.
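To illustrate part (1) of Theorem 85 in the case r = 1, one can look at the Liouville function λ(n) = (−1)^Ω(n), which is our choice of example here and does not appear in the statement above. Every prime is mapped to −1, so −1 is the only concentration point, the concentration group is {1, −1}, and condition (c) holds trivially. The sketch below estimates the densities of the two level sets from an initial segment; the function names and the cutoff are hypothetical, and a finite computation can only suggest, not prove, the limiting behaviour.

```python
def liouville_values(limit):
    """Return lam with lam[n] = (-1)**Omega(n) for 1 <= n <= limit (lam[0] is unused)."""
    # Smallest-prime-factor sieve.
    spf = list(range(limit + 1))
    for p in range(2, int(limit ** 0.5) + 1):
        if spf[p] == p:  # p is prime
            for multiple in range(p * p, limit + 1, p):
                if spf[multiple] == multiple:
                    spf[multiple] = p
    lam = [0] * (limit + 1)
    lam[1] = 1
    for n in range(2, limit + 1):
        # Complete multiplicativity: removing one prime factor flips the sign.
        lam[n] = -lam[n // spf[n]]
    return lam


def level_set_fraction(values, z, limit):
    """Proportion of n in {1, ..., limit} with values[n] == z."""
    return sum(1 for n in range(1, limit + 1) if values[n] == z) / limit


lam = liouville_values(10000)
print(level_set_fraction(lam, 1, 10000), level_set_fraction(lam, -1, 10000))
```

Both printed proportions should be close to 1/2 and they add up to 1, in line with the identity for the sum of the densities over im(f).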

Corollary 86 (cf. [Ruz77, Corollary 1.6 and the subsequent remark]). For any multiplicative function f : N → C and any z ∈ C the density of E(f, z) exists.

Definition 87 (cf. [Ruz77, Definition 3.9]). A multiplicative function $f : \mathbb{N} \to (\mathbb{C}\setminus\{0\})^r$ is called concentrated if it satisfies conditions (a), (b) and (c) in part (1) of Theorem 85.

Corollary 88. Let f : N → C be a multiplicative function and z ∈ C\{0}. If d(E(f, z)) > 0 then there exists a concentrated multiplicative function g : N → C\{0} such that

E(f, z) = E(g, z).

Moreover, the set $P := \{p \in \mathbb{P} : f(p) \neq g(p)\}$ satisfies $\sum_{p\in P} \frac{1}{p} < \infty$.

Proof. Define J := im(f)\{0}. Since J is a countable subset of C\{0}, there exists y ∈ C\{0} such that $(y^n \cdot J) \cap J = \emptyset$ for all n ∈ N. We define a new multiplicative function g as

\[
g(p^k) := \begin{cases} f(p^k), & \text{if } f(p^k) \neq 0,\\ y, & \text{if } f(p^k) = 0, \end{cases}
\qquad \forall k \in \mathbb{N},\ \forall p \in \mathbb{P}.
\]

It follows from $(y^n \cdot J) \cap J = \emptyset$ that E(f, z′) = E(g, z′) for all z′ ∈ J, so, in particular, E(f, z) = E(g, z). Since g(n) ≠ 0 for all n ∈ N, we can apply Theorem 85 and deduce that g must satisfy conditions (a), (b) and (c) in part (1) of Theorem 85.

Note that $\|f\|_{\Phi(\text{Cesàro})} > 0$, because z ≠ 0 and d(E(f, z)) > 0. Therefore, by Lemma 80, we conclude that $P' = \{p \in \mathbb{P} : f(p) = 0\}$ satisfies $\sum_{p\in P'} \frac{1}{p} < \infty$. Finally, note that f(p) ≠ g(p) if and only if p ∈ P′, which completes the proof.

Remark 89. Consider $f : \mathbb{N} \to (\mathbb{C}\setminus\{0\})^r$. Notice that, by Theorem 85, if f is concentrated then d(E(f, 1)) > 0, because 1 ∈ im(f). Moreover, f is concentrated if and only if each of its coordinates $f_i$ is concentrated.

Uniform functions

It follows from the work of Green and Tao [GT12] and Green, Tao and Ziegler [GTZ12] that the classical Möbius function µ is a uniform function. A more general result was obtained by Frantzikinakis and Host in [FH17a]. In order to state their theorem, we need the following definition.

Definition 90. We call a multiplicative function f aperiodic if for all b ∈ N and all r ∈ {0, 1, . . . , b − 1} we have $\lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} f(bn + r) = 0$.

Theorem 91 (Theorem 2.4, [FH17a]). A multiplicative function f ∈ M is uniform if and only if it is aperiodic.
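Definition 90 can be probed numerically for the Möbius function µ, which is uniform by the results cited above and hence aperiodic by Theorem 91. The following sketch computes the averages of µ along an arithmetic progression bn + r; the helper names and cutoffs are hypothetical choices made for this illustration, and a finite average only hints at the limit being zero.

```python
def mobius_values(limit):
    """Return mu with mu[n] equal to the Moebius function for 1 <= n <= limit."""
    mu = [1] * (limit + 1)
    mu[0] = 0
    is_prime = [True] * (limit + 1)
    for p in range(2, limit + 1):
        if is_prime[p]:
            for multiple in range(p, limit + 1, p):
                if multiple > p:
                    is_prime[multiple] = False
                mu[multiple] = -mu[multiple]  # one more distinct prime factor
            for multiple in range(p * p, limit + 1, p * p):
                mu[multiple] = 0  # not squarefree
    return mu


def progression_average(values, b, r, count):
    """Average of values[b*n + r] over n = 1, ..., count."""
    return sum(values[b * n + r] for n in range(1, count + 1)) / count


mu = mobius_values(70000)
print(progression_average(mu, 3, 1, 20000))
print(progression_average(mu, 1, 0, 20000))
```

The printed averages should be small in absolute value, which is the behaviour Definition 90 asks for in the limit N → ∞.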

In [Del72; Del83], Delange gives a full characterization of all aperiodic functions in M:

Proposition 92. Let f ∈ M. Then f is aperiodic if and only if $D(f, \chi \cdot n^{it}) = \infty$ for each Dirichlet character χ and t ∈ R.

Remark 93. It follows immediately from Proposition 92 and from the triangle inequality for D (see Remark 73) that if f, g ∈ M satisfy D(f, g) < ∞ then f is aperiodic if and only if g is aperiodic. Using Theorem 91 we can replace “aperiodic” with “uniform”. Hence, we get that if f, g ∈ M satisfy D(f, g) < ∞ then f is uniform if and only if g is uniform.

Proposition 94. Suppose f : N → C is bounded.

(a) If $f_n$ is uniform and $f_n \to f$ in $\|\cdot\|_{\Phi(\text{Cesàro})}$, then f is uniform.

(b) If f is uniform, q ∈ N and r ∈ {0, 1, . . . , q − 1} then $f \cdot 1_{q\mathbb{N}+r}$ is uniform.

(c) If f is uniform and if g is Besicovitch rationally almost periodic, then h := f · g is uniform.

(d) If f is uniform and t ∈ N then h(n) := f(tn) is uniform.

Proof. To prove part (a) it suffices to show that for all f : N → C bounded in modulus by 1 we have

\[ \|f_N\|_{U^{s}[N]}^{2^{s}} \leq \frac{1}{N}\sum_{n=1}^{N} |f(n)|. \tag{2.3.4} \]

We prove (2.3.4) by induction on s. For s = 1 the inequality in (2.3.4) follows immediately from the definition of the $U^1$-norm. Thus, assume (2.3.4) has already been proven for s ≥ 1. Then,

\[
\|f\|_{U^{s+1}[N]}^{2^{s+1}}
= \frac{1}{N}\sum_{h=1}^{N} \bigl\| f_N \cdot T^h f_N \bigr\|_{U^{s}[N]}^{2^{s}}
\leq \frac{1}{N}\sum_{h=1}^{N} \frac{1}{N}\sum_{n=1}^{N} \bigl| f_N(n) f_N(n+h) \bigr|
\leq \frac{1}{N}\sum_{n=1}^{N} |f_N(n)|.
\]

Part (b) follows directly from the inverse conjecture for the Gowers seminorms (see [GTZ12] or [Tao12, Theorem 1.6.12 and Theorem 1.6.14]).

For the proof of part (c) observe that it follows from part (b) and the triangle inequality for $\|\cdot\|_{U^{s}[N]}$ that for any uniform f and any g that is a finite linear combination of functions of the form $1_{q\mathbb{N}+r}$, q ∈ N and r ∈ {0, 1, . . . , q − 1}, the product f · g is uniform. Since any Besicovitch rationally almost periodic function can be approximated in the $\|\cdot\|_{\Phi(\text{Cesàro})}$-seminorm by finite linear combinations of functions of the form $1_{q\mathbb{N}+r}$, it follows from part (a) that for any uniform f and any Besicovitch rationally almost periodic g the function h = f · g is uniform.

Finally, for part (d), one can easily show by induction that

\[ \|f(tn)\|_{U^{s+1}[N]}^{2^{s+1}} \leq t^{2^{s+1}} \, \bigl\| f \cdot 1_{t\mathbb{N}} \bigr\|_{U^{s+1}[N]}^{2^{s+1}} + o(N) \]

and hence the claim follows from part (b).

2.3.2 Dichotomy theorem for M0

In this section we discuss some equivalent characterizations of M0 and give a proof of Theorem 21.

Equivalent characterizations of M0

The following subclass of M was introduced in Section 1.4.3:

\[ M_0 = \Bigl\{ f \in M : \lim_{N\to\infty} \frac{1}{N}\sum_{n=1}^{N} f(qn + r) \text{ exists for all } q, r \in \mathbb{N} \Bigr\}. \]

The next proposition offers an alternative characterization of functions in M0.

Proposition 95 (cf. [BKPLR17]). Let f ∈ M. Then f ∈ M0 if and only if for all Dirichlet characters χ the mean value M(χ · f) exists.

Remark 96. Let f ∈ M0 and let χ be a Dirichlet character. We claim the series $\sum_{p\in\mathbb{P}} \frac{1}{p}\bigl(1 - f(p)\chi(p)\bigr)$ converges if and only if D(f, χ) < ∞. To prove this claim it suffices to prove that D(f, χ) < ∞ implies that $\sum_{p\in\mathbb{P}} \frac{1}{p}\bigl(1 - f(p)\chi(p)\bigr)$ converges, as the other direction is obvious. Let q denote a modulus of χ. If q is even, then we set χ′ := χ and if q is odd then we set χ′ := χ · χ1, where χ1 denotes the principal character of modulus 2q (cf. (2.3.1)). Since f ∈ M0, by Proposition 95, the function f · χ′ has a mean. Therefore f · χ′ satisfies either (i), (ii) or (iii) of Theorem 77. However, f · χ′ cannot satisfy (ii) because χ′ has even modulus and hence $\chi'(2^k) = 0$ for all k. Also, χ′(p) = χ(p) for all but finitely many primes p and therefore D(f, χ) < ∞ implies D(f, χ′) < ∞. This implies that f · χ′ cannot satisfy (iii), because D(f · χ′, 1) = D(f, χ′) < ∞. Therefore f · χ′ must satisfy (i) of Theorem 77, from which it follows that $\sum_{p\in\mathbb{P}} \frac{1}{p}\bigl(1 - f(p)\chi(p)\bigr)$ converges.

Using the above observation we can now replace condition (iii) in Corollary 82 for functions f ∈ M0 with a slightly simpler condition (see (iii′) below):

Corollary 97. Let f ∈ M0. Then the following conditions are equivalent:

(i) f is Besicovitch almost periodic;

(ii) f is Besicovitch rationally almost periodic;

(iii′) either $\|f\|_{\Phi(\text{Cesàro})} = 0$ or there exists a Dirichlet character χ such that D(f, χ) < ∞ (in other words, f pretends to be a Dirichlet character).

Proposition 98. Let f, g ∈ M0 and suppose D(f, g) < ∞. Then f is Besicovitch rationally almost periodic if and only if g is.

Proof. Suppose f is Besicovitch rationally almost periodic. We distinguish two cases, the case $\|f\|_{\Phi(\text{Cesàro})} = 0$ and the case $\|f\|_{\Phi(\text{Cesàro})} > 0$.

If $\|f\|_{\Phi(\text{Cesàro})} = 0$ then, by Lemma 80, we have D(|f|, 1) = ∞. It follows from part (4) of Remark 73 that D(|f|, |g|) < ∞ and therefore, using the triangle inequality for D(·, ·), we get D(|g|, 1) = ∞. Another application of Lemma 80 shows that $\|g\|_{\Phi(\text{Cesàro})} = 0$. Since any function g with $\|g\|_{\Phi(\text{Cesàro})} = 0$ is trivially Besicovitch rationally almost periodic, this concludes the first case.

Now assume $\|f\|_{\Phi(\text{Cesàro})} > 0$. Then, by Theorem 81, there exists a Dirichlet character χ such that $\sum_{p\in\mathbb{P}} \frac{1}{p}\bigl(1 - f(p)\chi(p)\bigr)$ converges. This implies that D(f, χ) < ∞ and, combined with D(f, g) < ∞ and the triangle inequality for D(·, ·), we obtain D(g, χ) < ∞. Since g ∈ M0, we can now use (iii′) from Corollary 97 to deduce that g is Besicovitch rationally almost periodic.

Proof of Theorem 21

In this subsection we provide a proof of the dichotomy theorem for M0. The proof is rather short and follows from the results established in the previous subsection and in [DD74; DD82; Del72; FH17a] (see Theorem 91 and Proposition 92).

Lemma 99. Suppose f ∈ M0. Then for any Dirichlet character χ and any t ∈ R\{0} one has $D(f, \chi \cdot n^{it}) = \infty$.

Proof. Suppose there exist a Dirichlet character χ and some t ∈ R\{0} such that $D(f, \chi \cdot n^{it}) < \infty$. Let q denote a modulus of χ. If q is even, then we set χ′ := χ and if q is odd then we set χ′ := χ · χ1, where χ1 denotes the principal character of modulus 2q.

We now use an argument that has already appeared in Remark 96. Since f ∈ M0, by Proposition 95, the mean of the function f · χ′ exists. This means that f · χ′ satisfies either (i), (ii) or (iii) of Theorem 77. However, f · χ′ cannot satisfy (ii) because χ′ has even modulus and hence $\chi'(2^k) = 0$ for all k. Since χ′(p) = χ(p) for all but finitely many primes p, we deduce from $D(f, \chi \cdot n^{it}) < \infty$ that $D(f \cdot \chi', n^{it}) < \infty$. It follows that f · χ′ cannot satisfy (iii). Finally, $D(f \cdot \chi', n^{it}) < \infty$ together with property (2) listed in Remark 73 and Lemma 74 imply that D(f · χ′, 1) = ∞ and therefore f · χ′ cannot satisfy (i) of Theorem 77; we have arrived at a contradiction.

Proof of Theorem 21. Let f ∈ M0 be arbitrary. If $D(f, \chi \cdot n^{it}) = \infty$ for all t ∈ R and all Dirichlet characters χ then we deduce from Proposition 92 that f is aperiodic and therefore, in view of Theorem 91, f is a uniform function. If, on the other hand, $D(f, \chi \cdot n^{it}) < \infty$ for some t ∈ R and some Dirichlet character χ, then we first apply Lemma 99 to deduce that t = 0 and hence D(f, χ) < ∞ and thereafter, using Proposition 98, we conclude that f is Besicovitch rationally almost periodic because χ is periodic.

2.3.3 Structure theorem for D

The goal of this section is to give a proof of Theorem 24. In Subsection 2.3.3 we discuss in some detail relatively uniform sets. Subsection 2.3.3 is devoted to the proof of Theorem 24 for the special case of level sets of concentrated multiplicative functions. Finally, in Subsection 2.3.3 we establish Theorem 24 in full generality by reducing it to the special case established in Subsection 2.3.3.

Relative uniformity

In this subsection we provide additional examples of relatively uniform sets and prove a technical lemma which will be needed in the subsequent subsections.

We start by recalling the definition of relative uniformity of sets. Given sets E, R ⊂ N we say E is uniform relative to R if E ⊂ R, d(E) and d(R) exist and the function $d(R)1_E - d(E)1_R$ is uniform, i.e. $\|d(R)1_E - d(E)1_R\|_{U^{s}[N]}$ goes to zero as N → ∞ for all s ≥ 1. We list below some examples illustrating relative uniformity.

Example 100.

1. Let R ⊂ N be an arbitrary set whose density d(R) exists and is positive. Let $(X_n)_{n\in R}$ be a sequence of {0, 1}-valued independent and identically distributed random variables such that $X_n$ takes on the value 1 with probability $\frac{1}{2}$ and the value 0 with probability $\frac{1}{2}$. We then claim that almost surely the random set $E := \{n \in R : X_n = 1\}$ is uniform relative to R.

To verify this claim, let $f := 2 \cdot 1_E - 1_R$ and note that $\|f\|_{U^{s}[N]}^{2^{s}}$ is bounded from above by

\[ \frac{1}{N^{s-1}} \sum_{1\leq h_1,\ldots,h_{s-1}\leq N} \Bigl| \frac{1}{N}\sum_{n=1}^{N} g_{h_1,\ldots,h_{s-1}}(n) \Bigr|, \tag{2.3.5} \]

where

\[ g_{h_1,\ldots,h_{s-1}}(n) := f_N(n)\, f_N(n+h_1)\, f_N(n+h_2)\, f_N(n+h_1+h_2) \cdot \ldots \cdot f_N(n+h_1+\ldots+h_{s-1}). \]

Let $V_{h_1,\ldots,h_{s-1}} := R \cap (R-h_1) \cap (R-h_2) \cap (R-h_1-h_2) \cap \ldots \cap (R-h_1-\ldots-h_{s-1})$ and fix ε > 0. Let

\[ \Lambda := \bigl\{ (h_1,\ldots,h_{s-1}) \in \{1,\ldots,N\}^{s-1} : |V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}| > \varepsilon N \bigr\}. \]

If $n \notin V_{h_1,\ldots,h_{s-1}}$ then $g_{h_1,\ldots,h_{s-1}}(n) = 0$, which implies $\bigl|\sum_{n=1}^{N} g_{h_1,\ldots,h_{s-1}}(n)\bigr| \leq |V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}|$ and so

\[
\frac{1}{N^{s-1}} \sum_{1\leq h_1,\ldots,h_{s-1}\leq N} \Bigl| \frac{1}{N}\sum_{n=1}^{N} g_{h_1,\ldots,h_{s-1}}(n) \Bigr|
\leq \frac{1}{N^{s-1}} \sum_{(h_1,\ldots,h_{s-1})\in\Lambda} \Bigl| \frac{1}{N}\sum_{n=1}^{N} g_{h_1,\ldots,h_{s-1}}(n) \Bigr| + \varepsilon.
\]

On the other hand, if $n \in V_{h_1,\ldots,h_{s-1}}$, then $g_{h_1,\ldots,h_{s-1}}(n)$ equals 1 or −1 with probability $\frac{1}{2}$ respectively. Utilizing Hoeffding’s inequality for sums of independent random variables [Hoe63], applied to the sequence of random variables $g_{h_1,\ldots,h_{s-1}}(n)$ for $n \in V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}$, we get that the probability for the event

\[ \Bigl| \frac{1}{|V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}|} \sum_{n \in V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}} g_{h_1,\ldots,h_{s-1}}(n) \Bigr| \geq \varepsilon \]

is smaller than or equal to $2\exp\bigl(-\frac{\varepsilon^2}{2} |V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}|\bigr)$. If $(h_1,\ldots,h_{s-1}) \in \Lambda$, then $2\exp\bigl(-\frac{\varepsilon^2}{2} |V_{h_1,\ldots,h_{s-1}} \cap \{1,\ldots,N\}|\bigr) \leq 2\exp\bigl(-\frac{\varepsilon^3 N}{2}\bigr)$ and hence the probability for

\[ \sup_{1\leq h_1,\ldots,h_{s-1}\leq N} \Bigl| \frac{1}{N}\sum_{n=1}^{N} g_{h_1,\ldots,h_{s-1}}(n) \Bigr| \geq \varepsilon \]

is $O\bigl(N^{s-1}\exp\bigl(-\frac{\varepsilon^3 N}{2}\bigr)\bigr)$. In conclusion, the probability for the event (2.3.5) to be at least 2ε, which majorizes the probability for the event $\|f\|_{U^{s}[N]}^{2^{s}} \geq 2\varepsilon$, is bounded by $O\bigl(N^{s-1}\exp\bigl(-\frac{\varepsilon^3 N}{2}\bigr)\bigr)$. Since $d(R)1_E - d(E)1_R = d(E) f$, it follows that $\|d(R)1_E - d(E)1_R\|_{U^{s}[N]}$ goes to zero as N → ∞ almost surely.

2. Let ξ ∈ [0, 1) and let J be a Jordan measurable subset of the circle $S^1 := \{w \in \mathbb{C} : |w| = 1\}$. It was shown in [FH17b] that the set {n ∈ N : λξ(n) ∈ J} is uniform. It thus follows from Lemma 101 below that the set E = {n ∈ N : µξ(n) ∈ J} is uniform relative to the squarefree numbers Q, because Q is a rational set and E = {n ∈ N : λξ(n) ∈ J} ∩ Q.
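The Hoeffding bound used in the first example above can be compared with a simulation. The sketch below treats the idealized situation of n independent ±1 variables with equal probabilities and compares the empirical frequency of the event |mean| ≥ ε with the bound 2·exp(−ε²n/2); the function names, the seed and the parameter values are hypothetical choices for this illustration only.

```python
import math
import random


def hoeffding_bound(n, eps):
    """Upper bound 2*exp(-eps^2 * n / 2) for P(|mean of n fair +-1 variables| >= eps)."""
    return 2.0 * math.exp(-eps * eps * n / 2.0)


def empirical_tail(n, eps, trials, seed=0):
    """Empirical frequency of |X_1 + ... + X_n| >= eps * n over independent trials."""
    rng = random.Random(seed)
    exceed = 0
    for _ in range(trials):
        total = sum(rng.choice((-1, 1)) for _ in range(n))
        if abs(total) >= eps * n:
            exceed += 1
    return exceed / trials


print(empirical_tail(200, 0.2, 2000), hoeffding_bound(200, 0.2))
```

The first printed number is expected to lie below the second; the bound is far from sharp, which is harmless for the argument since only exponential decay in N is needed.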

One can show that if sets E,R,V ⊂ N are such that V is rational (see Definition 23) and E is uniform relative to R then E ∩ V is uniform relative to R ∩ V ; in fact we have the following slightly stronger result.

Lemma 101. Suppose E ⊂ R ⊂ N are sets such that d(E) and d(R) exist and suppose $d(R)1_E - d(E)1_R$ is uniform. Let t ∈ N, let V ⊂ N be any rational set and define E′ := tE ∩ V and R′ := tR ∩ V. If d(R′) exists, then d(E′) exists and satisfies the equation

\[ d(E)\,d(R') = d(R)\,d(E') \tag{2.3.6} \]

and the function $d(R')1_{E'} - d(E')1_{R'}$ is uniform.

Proof. If d(R) = 0 then d(E) = d(E′) = d(R′) = 0 and hence there is nothing to show.

Let us therefore assume that d(R) > 0. Since $d(R)1_E - d(E)1_R$ is uniform, it follows from Proposition 94 part (d) that the function $d(R)1_{tE} - d(E)1_{tR}$ is uniform. Then, using Proposition 94 part (c), it follows that $(d(R)1_{tE} - d(E)1_{tR}) \cdot 1_V = d(R)1_{E'} - d(E)1_{R'}$ is uniform as well. By definition, any uniform function has zero mean. From this we immediately obtain the identity d(E)d(R′) = d(R)d(E′) whenever d(R′) exists. Using this identity and multiplying the function $d(R)1_{E'} - d(E)1_{R'}$ by the constant d(R′)/d(R) we obtain the function $d(R')1_{E'} - d(E')1_{R'}$. This shows that $d(R')1_{E'} - d(E')1_{R'}$ is also uniform.
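For the reader's convenience, the rescaling step at the end of this proof can be written out as follows; this is merely the computation described in words above, using (2.3.6) in the form $d(E)\,d(R')/d(R) = d(E')$ and the assumption d(R) > 0:

```latex
\[
\frac{d(R')}{d(R)} \Bigl( d(R)\,1_{E'} - d(E)\,1_{R'} \Bigr)
= d(R')\,1_{E'} - \frac{d(E)\,d(R')}{d(R)}\,1_{R'}
= d(R')\,1_{E'} - d(E')\,1_{R'} .
\]
```

Since a constant multiple of a uniform function is uniform, the right-hand side inherits uniformity from the left-hand side.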

A proof of Theorem 24 for the special case of concentrated multiplicative functions

Let f : N → C\{0} be a concentrated multiplicative function (see Definition 87) and let G denote its concentration group. Clearly, $z^{|G|} = 1$ for all z ∈ G. Let us consider all pairs (k, χ), where k ∈ N and χ is a Dirichlet character, such that

\[ D(f^k, \chi) < \infty. \tag{2.3.7} \]

There is at least one such pair (k, χ), because we can pick k = |G| and χ to be the principal character of modulus 1 (i.e. χ(n) = 1 for all n ∈ N). This leads to the following definition.

Definition 102. Given a concentrated multiplicative function f with concentration group G let $k_G$ denote the smallest positive integer such that for some Dirichlet character $\chi_G$ equation (2.3.7) is satisfied.

The next theorem is a version of Theorem 24 for concentrated multiplicative functions and constitutes the main result of this subsection. In Subsection 2.3.3 we will show how Theorem 24 can be derived in its full generality from this special case.

Theorem 103. Let g be a concentrated multiplicative function with concentration group G and let $k_G$ be as in Definition 102. Then $g^{k_G}$ is Besicovitch rationally almost periodic and for every z ∈ C\{0} the set $E_g := E(g, z)$ is uniform relative to $R_g := E(g^{k_G}, z^{k_G})$.

For the proof of Theorem 103 we need three lemmas.

Lemma 104. Let p ∈ P and let k, m ∈ N and let c ≥ 1. Let f and g be multiplicative functions and suppose $f(q^\ell) = g(q^\ell)$ for all pairs $(q, \ell) \in \mathbb{P} \times \mathbb{N}$ with (q, ℓ) ≠ (p, k). Assume $f^m$ is Besicovitch rationally almost periodic and for every z ∈ C\{0} the set $E_f := E(f, z)$ is uniform relative to $R_f := E(f^m, z^m)$ and $c\,d(E_f) = d(R_f)$. Then $g^m$ is Besicovitch rationally almost periodic and for every z ∈ C\{0} the set $E_g := E(g, z)$ is uniform relative to $R_g := E(g^m, z^m)$ and $c\,d(E_g) = d(R_g)$.

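Proof. Let z ∈ C\{0} be arbitrary. Let

\[
T := \{ n \in \mathbb{N} : n = s \cdot p^k \text{ for some } s \in \mathbb{N} \text{ with } \gcd(s, p) = 1 \}
= \bigcup_{a=1}^{p-1} p^k \bigl( (p\mathbb{N} \cup \{0\}) + a \bigr)
\tag{2.3.8}
\]

and

\[ S := \mathbb{N} \setminus T. \tag{2.3.9} \]

Note that S is a multiplicative set. Clearly,

\[ E_g \cap S = E_f \cap S \quad\text{and}\quad R_g \cap S = R_f \cap S. \tag{2.3.10} \]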
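The two descriptions of T in (2.3.8) and the multiplicativity of the indicator function of S can be confirmed on an initial segment of N. The following sketch does this by brute force; the function names and the cutoffs are hypothetical and serve only this illustration.

```python
from math import gcd


def exact_power_multiples(p, k, limit):
    """Elements n <= limit of T, i.e. n = s * p**k with gcd(s, p) = 1."""
    q = p ** k
    return {s * q for s in range(1, limit // q + 1) if gcd(s, p) == 1}


def union_of_progressions(p, k, limit):
    """Elements n <= limit of the union over a = 1, ..., p-1 of p**k * ((pN u {0}) + a)."""
    q = p ** k
    members = set()
    for a in range(1, p):
        j = 0
        while q * (p * j + a) <= limit:
            members.add(q * (p * j + a))
            j += 1
    return members


def indicator_is_multiplicative(in_set, limit):
    """Check 1_S(a*b) == 1_S(a) * 1_S(b) for all coprime a, b with a*b <= limit."""
    for a in range(1, limit + 1):
        for b in range(1, limit // a + 1):
            if gcd(a, b) == 1 and in_set(a * b) != (in_set(a) and in_set(b)):
                return False
    return True


T = exact_power_multiples(3, 2, 500)
print(T == union_of_progressions(3, 2, 500))
print(indicator_is_multiplicative(lambda n: n not in T, 500))
```

Here p = 3 and k = 2, so T consists of the integers divisible by 9 but not by 27, and S is its complement.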

Define

\[
E'_f := \begin{cases} \Bigl\{ n \in \mathbb{N} : f(n) = \dfrac{z f(p^k)}{g(p^k)} \Bigr\}, & \text{if } g(p^k) \neq 0;\\[6pt] \emptyset, & \text{if } g(p^k) = 0 \end{cases}
\]

and

\[
R'_f := \begin{cases} \Bigl\{ n \in \mathbb{N} : f^m(n) = \Bigl( \dfrac{z f(p^k)}{g(p^k)} \Bigr)^{m} \Bigr\}, & \text{if } g(p^k) \neq 0;\\[6pt] \emptyset, & \text{if } g(p^k) = 0. \end{cases}
\]

If $g(p^k) \neq 0$ then, by assumption, $E'_f$ is uniform relative to $R'_f$ and $c\,d(E'_f) = d(R'_f)$. On the other hand, if $g(p^k) = 0$ then $E'_f = R'_f = \emptyset$ and hence it is trivially satisfied that $E'_f$ is uniform relative to $R'_f$ and $c\,d(E'_f) = d(R'_f)$. Let n ∈ T be arbitrary and write $n = s \cdot p^k$ with gcd(p, s) = 1. If $g(p^k) \neq 0$ then

\[
g(n) = z \iff g(s) = \frac{z}{g(p^k)} \iff f(s) = \frac{z}{g(p^k)} \iff f(n) = \frac{z f(p^k)}{g(p^k)}.
\]

If $g(p^k) = 0$ then g(n) = z holds for no n ∈ T, because z ≠ 0. This proves that

\[ E_g \cap T = E'_f \cap T. \tag{2.3.11} \]

An analogous calculation shows that

\[ R_g \cap T = R'_f \cap T. \tag{2.3.12} \]

Combining (2.3.10), (2.3.11) and (2.3.12) we obtain

\[ E_g = \bigl( E_f \cap S \bigr) \cup \bigl( E'_f \cap T \bigr), \tag{2.3.13} \]

\[ R_g = \bigl( R_f \cap S \bigr) \cup \bigl( R'_f \cap T \bigr). \tag{2.3.14} \]

Our goal is to show that the function $d(R_g)1_{E_g} - d(E_g)1_{R_g}$ is uniform. It follows from Corollary 86 that the density of $R_g$ exists. If $d(R_g) = 0$ then the $\|\cdot\|_{\Phi(\text{Cesàro})}$-norm of $d(R_g)1_{E_g} - d(E_g)1_{R_g}$ equals 0 and hence this function is uniform for trivial reasons. We can therefore assume without loss of generality that $d(R_g) > 0$.

Since $1_S$ is a {0, 1}-valued multiplicative function, we deduce from Remark 83 that S is a rational set. Moreover, $R_f \cap S = \{n \in \mathbb{N} : f^m(n)1_S(n) = z^m\}$ and therefore the density $d(R_f \cap S)$ exists by Corollary 86. Similarly $d(E_f \cap S)$ exists. Now, by (2.3.10) and Lemma 101 (applied to $E_f \subset R_f$ and S), we obtain that the function

\[
d(R_g \cap S)\,1_{E_f \cap S} - d(E_g \cap S)\,1_{R_f \cap S}
= \frac{d(R_g \cap S)}{d(R_g)} \Bigl( d(R_g)\,1_{E_f \cap S} - d(E_g)\,1_{R_f \cap S} \Bigr)
\]

is uniform. From this we conclude that

\[ \bigl( d(R_g)\,1_{E_f} - d(E_g)\,1_{R_f} \bigr) \cdot 1_S \tag{2.3.15} \]

is also uniform. Also, from (2.3.6) and $d(R_f) = c\,d(E_f)$ we get $d(R_f \cap S) = c\,d(E_f \cap S)$.

Analogous to the way we proved that $d(R_f \cap S)$ exists, one can show that $d(R'_f \cap S)$ exists. It follows that $d(R'_f \cap T) = d\bigl(R'_f \setminus (R'_f \cap S)\bigr) = d(R'_f) - d(R'_f \cap S)$ also exists. Additionally, since S is rational, the set N\S = T is rational. Using the fact that $E'_f$ is uniform relative to $R'_f$ together with (2.3.11), (2.3.12) and Lemma 101 (applied to $E'_f \subset R'_f$ and T) we deduce that $d(E'_f \cap T)$ exists and that

\[ \bigl( d(R_g)\,1_{E'_f} - d(E_g)\,1_{R'_f} \bigr) \cdot 1_T \tag{2.3.16} \]

is uniform. From (2.3.6) and $d(R'_f) = c\,d(E'_f)$ we obtain $d(R'_f \cap T) = c\,d(E'_f \cap T)$. Since the sum of two uniform functions remains uniform (due to the triangle inequality for $\|\cdot\|_{U^{s}[N]}$), we conclude by taking the sum of (2.3.15) and (2.3.16) and utilizing (2.3.13) and (2.3.14) that $d(R_g)1_{E_g} - d(E_g)1_{R_g}$ is uniform. Moreover, combining $d(R'_f \cap T) = c\,d(E'_f \cap T)$ and $d(R_f \cap S) = c\,d(E_f \cap S)$ with (2.3.13) and (2.3.14) we obtain $c\,d(E_g) = d(R_g)$.

It is straightforward to show that if h is a Besicovitch rationally almost periodic function then for any q ∈ N so is

\[ h_0(n) := \begin{cases} h\bigl(\frac{n}{q}\bigr), & \text{if } q \mid n;\\ 0, & \text{otherwise.} \end{cases} \]

In particular, the function

\[ h_1(n) := \begin{cases} f^m\bigl(\frac{n}{p^k}\bigr), & \text{if } p^k \mid n;\\ 0, & \text{otherwise} \end{cases} \]

is Besicovitch rationally almost periodic. Since S and T are rational sets, it follows that the functions $f^m \cdot 1_S$ and $h_1 \cdot 1_T$ are Besicovitch rationally almost periodic. Note that any n ∈ T satisfies $p^k \mid n$. Hence,

\[ h_3(n) := g^m(p^k)\, h_1 \cdot 1_T = \begin{cases} g^m(p^k)\, f^m\bigl(\frac{n}{p^k}\bigr), & \text{if } n \in T;\\ 0, & \text{otherwise,} \end{cases} \]

is Besicovitch rationally almost periodic. Therefore $g^m = f^m \cdot 1_S + h_3$ is Besicovitch rationally almost periodic.

Lemma 105. Let $P \subset \mathbb{P}$ with $\sum_{p\in\mathbb{P}\setminus P} \frac{1}{p} < \infty$ and let m ∈ N and c ≥ 1. Let f and g be multiplicative functions and suppose f(p) = g(p) for all $p \in \mathbb{P}\setminus P$. Assume $f^m$ is Besicovitch rationally almost periodic and for every z ∈ C\{0} the set $E_f := E(f, z)$ is uniform relative to $R_f := E(f^m, z^m)$ and $c\,d(E_f) = d(R_f)$. Then $g^m$ is Besicovitch rationally almost periodic and for every z ∈ C\{0} the set $E_g := E(g, z)$ is uniform relative to $R_g := E(g^m, z^m)$.

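Proof. Let $\Omega := \{(p, k) \in \mathbb{P} \times \mathbb{N} : f(p^k) \neq g(p^k)\}$. Note that Ω can be turned into a linearly ordered set (Ω, ≺) using the relation

\[ (p, k) \prec (q, \ell) \iff p^k < q^\ell. \]

Let $(p_1, k_1) \prec (p_2, k_2) \prec \ldots$ be an enumeration of Ω.

We now define inductively a sequence of multiplicative functions $f_0, f_1, f_2, \ldots$ as follows. First, we let $f_0 := f$; then we define

\[ f_{i+1}(p^k) := \begin{cases} f_i(p^k), & \text{if } (p, k) \neq (p_{i+1}, k_{i+1});\\ g(p^k), & \text{otherwise.} \end{cases} \]

Note that for a fixed n ∈ N there exists $i_n$ such that $f_i(n) = g(n)$ for all $i \geq i_n$. Since $\sum_{p\in\mathbb{P}\setminus P} \frac{1}{p} < \infty$, it follows that

\[ \sum_{(p,k)\in\Omega} \frac{1}{p^k} < \infty. \]

Also,

\[
d\bigl(\{ n \in \mathbb{N} : g^m(n) \neq f_i^m(n) \}\bigr)
\leq d\bigl(\{ n \in \mathbb{N} : g(n) \neq f_i(n) \}\bigr)
\leq d\Bigl( \bigcup_{\substack{(p,k)\in\Omega \\ (p_i,k_i)\prec(p,k)}} p^k\mathbb{N} \Bigr)
\leq \sum_{\substack{(p,k)\in\Omega \\ (p_i,k_i)\prec(p,k)}} \frac{1}{p^k}.
\tag{2.3.17}
\]

It follows that $\lim_{i\to\infty} \|g - f_i\|_{\Phi(\text{Cesàro})} = 0$ and $\lim_{i\to\infty} \|g^m - f_i^m\|_{\Phi(\text{Cesàro})} = 0$.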

Let z ∈ C\{0} be arbitrary. Recall that, by assumption, $E_f$ is uniform relative to $R_f$. Define $E_{f_i} := E(f_i, z)$ and $R_{f_i} := E(f_i^m, z^m)$. It clearly follows from Lemma 104 and induction on i that $f_i^m$ is Besicovitch rationally almost periodic, $E_{f_i}$ is uniform relative to $R_{f_i}$ and $c\,d(E_{f_i}) = d(R_{f_i})$. Therefore, $g^m$ is Besicovitch rationally almost periodic, because $\lim_{i\to\infty} \|g^m - f_i^m\|_{\Phi(\text{Cesàro})} = 0$. We deduce from (2.3.17) that

\[ \lim_{i\to\infty} d\bigl( E_g \triangle E_{f_i} \bigr) = 0 \quad\text{and}\quad \lim_{i\to\infty} d\bigl( R_g \triangle R_{f_i} \bigr) = 0, \tag{2.3.18} \]

where $E_g := E(g, z)$ and $R_g := E(g^m, z^m)$. Hence

\[
\Bigl\| \bigl( d(R_g)\,1_{E_g} - d(E_g)\,1_{R_g} \bigr) - \bigl( d(R_{f_i})\,1_{E_{f_i}} - d(E_{f_i})\,1_{R_{f_i}} \bigr) \Bigr\|_{\Phi(\text{Cesàro})} \xrightarrow{\ i\to\infty\ } 0.
\]

Finally, using part (a) of Proposition 94 we deduce that $E_g$ is uniform relative to $R_g$. This finishes the proof.

Lemma 106. Let m ∈ N, f a multiplicative function and χ a Dirichlet character. Assume that $f^j$ is aperiodic for all j ∈ {1, 2, . . . , m − 1} and that $f^m = \chi$. Let z ∈ C and set E := E(f, z) and $R := E(\chi, z^m)$. Then E is uniform relative to R and m d(E) = d(R).

Proof. First, using Theorem 91, we deduce that for all j ∈ {1, 2, . . . , m − 1} the function $f^j$ is uniform. Also, note that the density of E and R exists, due to Corollary 86. It remains to show that the function

\[ d(R)\,1_E - d(E)\,1_R \tag{2.3.19} \]

is uniform.

If z = 0 then R = E (because $f^m = \chi$) and so the function $d(R)1_E - d(E)1_R$ is constant 0 and hence uniform. We can therefore assume without loss of generality that z ≠ 0.

By assumption, for any n ∈ R we have $f^m(n) = \chi(n) = z^m$. Therefore, the number $z^{-1}f(n)$ is an m-th root of unity for any n ∈ R. It follows that for all n ∈ R,

\[ \frac{1}{m}\sum_{j=0}^{m-1} z^{-j} f^j(n) = \begin{cases} 1, & \text{if } f(n) = z;\\ 0, & \text{otherwise.} \end{cases} \]

So,

\[ 1_E = 1_R \cdot \Bigl( \frac{1}{m}\sum_{j=0}^{m-1} z^{-j} f^j \Bigr) \]

and after rearranging we get

\[ 1_E - \frac{1}{m}\,1_R = 1_R \cdot \Bigl( \frac{1}{m}\sum_{j=1}^{m-1} z^{-j} f^j \Bigr). \tag{2.3.20} \]

Since $1_R$ is Besicovitch rationally almost periodic and $f^j$ is uniform for j = 1, ..., m − 1, by Proposition 94 (c), we deduce that the right hand side of (2.3.20) is uniform. This implies that

\[ 1_E - \frac{1}{m}\,1_R \tag{2.3.21} \]

is uniform as well. Since any uniform function has zero mean, it follows that d(E)m = d(R) and so the function in (2.3.19) is a constant multiple of the function in (2.3.21) and hence also uniform.
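The averaging identity over m-th roots of unity that drives this proof is easy to test numerically. In the sketch below, w plays the role of f(n) for some n ∈ R, so that w/z is an m-th root of unity; the function name and the sample values of z and m are hypothetical.

```python
import cmath


def root_of_unity_filter(w, z, m):
    """Return (1/m) * sum_{j=0}^{m-1} z^(-j) * w^j for complex w and nonzero complex z."""
    ratio = w / z
    return sum(ratio ** j for j in range(m)) / m


m = 6
z = 2 + 1j
for r in range(m):
    # w ranges over all complex numbers with w**m == z**m.
    w = z * cmath.exp(2j * cmath.pi * r / m)
    print(r, abs(root_of_unity_filter(w, z, m)))
```

Up to rounding errors the printed modulus is 1 when w = z (the case r = 0) and 0 for the remaining roots, which is the indicator behaviour used to express $1_E$ through $1_R$ and the powers $f^j$.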

Proof of Theorem 103. Let G denote the concentration group of g and let $k_G$ and $\chi_G$ be as in Definition 102. Define $\Omega_G := \{(p, k) \in \mathbb{P} \times \mathbb{N} : g(p^k) \in G,\ g^{k_G}(p^k) = \chi_G(p^k)\}$. Since the pair $(k_G, \chi_G)$ satisfies (2.3.7), we have that $\sum_{(p,k)\notin\Omega_G} \frac{1}{p^k} < \infty$.

Given $(p, k) \notin \Omega_G$ let $\xi_{(p,k)}$ be any complex number that satisfies $\xi_{(p,k)}^{k_G} = \chi_G(p^k)$. Define a new multiplicative function f via

\[ f(p^k) := \begin{cases} g(p^k), & \text{if } (p, k) \in \Omega_G;\\ \xi_{(p,k)}, & \text{otherwise.} \end{cases} \]

Note that f satisfies the functional equation

\[ f^{k_G} = \chi_G. \tag{2.3.22} \]

We claim that $D(f^j, \chi \cdot n^{it}) = \infty$ for all $j \in \{1, 2, \ldots, k_G - 1\}$, for all t ∈ R and for all Dirichlet characters χ. To verify this claim we have to distinguish between the case t = 0 and the case t ∈ R\{0}.

The case t = 0 follows from the minimality assumption on $k_G$: $D(g^j, \chi) = \infty$ for each $j = 1, \ldots, k_G - 1$ and each Dirichlet character χ. Since $\sum_{p\in\mathbb{P},\ f(p)\neq g(p)} \frac{1}{p} < \infty$, it follows from the triangle inequality for D that $D(f^j, \chi) = \infty$ for each $j = 1, \ldots, k_G - 1$ and each Dirichlet character χ.

For the case t ≠ 0 we give a proof by contradiction. Let us assume that there are $j \in \{1, \ldots, k_G\}$, a Dirichlet character χ and a number t ∈ R\{0} such that $D(f^j, \chi \cdot n^{it}) < \infty$. Using part (3) of Remark 73 it follows that also $D(f^{j|G|}, \chi^{|G|} \cdot n^{it|G|}) < \infty$. However, for all primes p with g(p) ∈ G we have that $g^{j|G|}(p) = f^{j|G|}(p) = 1$. Hence, $D(f^{j|G|}, \chi^{|G|} \cdot n^{it|G|}) < \infty$ implies $D(\chi^{|G|}, n^{it|G|}) < \infty$. This contradicts the statement of Lemma 74.

Since $D(f^j, \chi \cdot n^{it}) = \infty$ for all $j \in \{1, 2, \ldots, k_G - 1\}$, all t ∈ R and all Dirichlet characters χ, it follows from Proposition 92 that $f^j$ is aperiodic. It therefore follows from Lemma 106 that for all z ∈ C\{0} the set $E_f := E(f, z)$ is uniform relative to $R_f := E(\chi_G, z^{k_G})$ and $k_G\,d(E_f) = d(R_f)$.

Finally, observe that f and g are two multiplicative functions that satisfy the conditions of Lemma 105 (with $c = m = k_G$), from which we conclude that $g^{k_G}$ is Besicovitch rationally almost periodic and for every z ∈ C\{0} the set $E_g := E(g, z)$ is uniform relative to $R_g := E(g^{k_G}, z^{k_G})$.

A proof of Theorem 24

In this subsection we give a proof of Theorem 24. The proof is based on the idea that any multiplicative function f either behaves like a concentrated multiplicative function, in which case Theorem 24 can be derived from Theorem 103, or all sets of the form E := {n ∈ N : f(n) = z} with z ≠ 0 have zero density. This only leaves the case z = 0, which can be taken care of by using the characterization of Besicovitch rationally almost periodic multiplicative functions due to Daboussi and Delange discussed in Subsection 2.3.1.

We will need the following lemma.

Lemma 107. Suppose E1, E2 ∈ D and 0 < d(E1), d(E2) < 1. Then $d(E_1 \triangle E_2) = 0$ if and only if E1 = E2.⁵

Proof. Clearly E1 = E2 implies $d(E_1 \triangle E_2) = 0$. To prove the other direction we assume that there exists n0 ∈ E1 with n0 ∉ E2 and show that this leads to a contradiction with $d(E_1 \triangle E_2) = 0$.

By definition of D there exist multiplicative functions f1, f2 : N → C and numbers z1, z2 ∈ C such that E1 = E(f1, z1) and E2 = E(f2, z2). We have to distinguish three cases, the case z1 = z2 = 0, the case z1 ≠ 0 and z2 ≠ 0 and finally the case z1 = 0 and z2 ≠ 0. We remark that the case z1 ≠ 0 and z2 = 0 is analogous to the case z1 = 0 and z2 ≠ 0 and is therefore omitted.

If z1 = z2 = 0 then for i ∈ {1, 2} we define $g_i(n) = 0$ if $f_i(n) = 0$ and $g_i(n) = 1$ if $f_i(n) \neq 0$. It is clear that $g_i = 1_{\mathbb{N}\setminus E_i}$ and $E_i = E(g_i, 0)$. Since $d(E_i) < 1$, we have that $\|g_i\|_{\Phi(\text{Cesàro})} > 0$ and hence, in view of Lemma 80, the sets $P_i := \{p \in \mathbb{P} : g_i(p) = 1\}$ satisfy $\sum_{p\in\mathbb{P}\setminus P_i} \frac{1}{p} < \infty$. Let P denote the set of all primes that belong to both $P_1$ and $P_2$ and that do not divide n0. Let $S_P \subset \mathbb{N}$ be defined as

\[ S_P := \{ n \in \mathbb{N} : \text{there exist distinct } p_1, \ldots, p_t \in P \text{ such that } n = p_1 \cdot \ldots \cdot p_t \}. \tag{2.3.23} \]

Then by Lemma 80 we have $d(S_P) > 0$. Since n0 ∈ E1 but n0 ∉ E2 and n0 is coprime to all numbers in $S_P$, it follows that E1\E2 contains the set $n_0 S_P$. In particular, $d(E_1 \setminus E_2) \geq d(n_0 S_P) > 0$. This, however, contradicts $d(E_1 \triangle E_2) = 0$.

⁵ Note that if d(E1) = d(E2) = 0 or d(E1) = d(E2) = 1 then $d(E_1 \triangle E_2) = 0$ does not necessarily imply E1 = E2. Take for instance E1 = {1, 2} and E2 = {1, 3} or E1 = N\{1, 2} and E2 = N\{1, 3}, which are sets belonging to D because the functions $1_{\{1,2\}}$ and $1_{\{1,3\}}$ are multiplicative.

Next, assume z1 ≠ 0 and z2 ≠ 0. Using Corollary 88 we can find two concentrated multiplicative functions g1, g2 : N → C\{0} such that E1 = E(g1, z1) and E2 = E(g2, z2). Define g := (g1, g2) and let $\mathrm{im}(g) \subset (\mathbb{C}\setminus\{0\})^2$ denote the image of g. Since g1 and g2 are concentrated multiplicative functions, also g is concentrated, see Remark 89. We now use an argument similar to the one used in the proof of Corollary 88. Choose $y \in (\mathbb{C}\setminus\{0\})^2$ such that $(y^n \cdot \mathrm{im}(g)) \cap \mathrm{im}(g) = \emptyset$ for all n ∈ N. We define a new multiplicative function h = (h1, h2) via

\[ h(p^k) := \begin{cases} g(p^k), & \text{if } p \nmid n_0,\\ y, & \text{if } p \mid n_0, \end{cases} \qquad \forall k \in \mathbb{N},\ \forall p \in \mathbb{P}. \]

It is straightforward to verify that g(n) = h(n) if and only if gcd(n, n0) = 1 and h(n) ∉ im(g) for all n with gcd(n, n0) > 1. Since g satisfies (a), (b) and (c) of Theorem 85, also h satisfies them because the number of primes p for which g(p) ≠ h(p) is finite. Thus, h is concentrated, whence the set E(h, (1, 1)) = {n ∈ N : h1(n) = 1 and h2(n) = 1} has positive density by Remark 89. Note that h(n) = (1, 1) if and only if g(n) = (1, 1) and gcd(n, n0) = 1. Hence

E(h, (1, 1)) = {n ∈ N : g1(n) = 1, g2(n) = 1, gcd(n, n0) = 1}.

We obtain that g1(n0m) = g1(n0) and g2(n0m) = g2(n0) for all m ∈ E(h, (1, 1)). In particular n0E(h, (1, 1)) ⊂ E1\E2, which contradicts $d(E_1 \triangle E_2) = 0$.

Finally, we deal with the case $z_1 = 0$ and $z_2 \neq 0$. Let $g_1$ denote the multiplicative function defined as $g_1(n) = 0$ if $f_1(n) = 0$ and $g_1(n) = 1$ if $f_1(n) \neq 0$. Let $P := \{p \in \mathbb{P} : p \nmid n_0,\ g_1(p) = 1\}$ and let $S_P \subset \mathbb{N}$ be defined as in (2.3.23). Arguing as in the case $z_1 = z_2 = 0$ above one can show that $d(S_P) > 0$. Next, using Corollary 88, we can find a concentrated multiplicative function $g_2 : \mathbb{N} \to \mathbb{C}\setminus\{0\}$ such that $E_2 = E(g_2, z_2)$. Then, using arguments similar to the ones utilized in the previous paragraph, we first find $y \in \mathbb{C}\setminus\{0\}$ such that $(y^n \cdot \operatorname{im}(g_2)) \cap \operatorname{im}(g_2) = \emptyset$ for all $n \in \mathbb{N}$ and then define a multiplicative function $h : \mathbb{N} \to \mathbb{C}\setminus\{0\}$ via

\[ h(p^k) := \begin{cases} g_2(p), & \text{if } p \in P \text{ and } k = 1, \\ y, & \text{if either } p \notin P \text{ or } k \geq 2, \end{cases} \qquad \forall k \in \mathbb{N},\ \forall p \in \mathbb{P}. \]

It is straightforward to verify that $g_2(n) = h(n)$ if and only if $n \in S_P$, and that $h(n) \notin \operatorname{im}(g_2)$ for all $n$ which are either not squarefree or satisfy $p \mid n$ for some $p \in \mathbb{P}\setminus P$. Since $g_2$ is concentrated and $\sum_{p \in \mathbb{P}\setminus P} \frac{1}{p} < \infty$, $h$ is concentrated too. It follows from Theorem 85 that $E(h, 1)$ has positive density. Since $h(n) = 1$ if and only if $g_2(n) = 1$ and $n \in S_P$, we obtain that $g_1(n_0 m) = g_1(n_0)$ and $g_2(n_0 m) = g_2(n_0)$ for all $m \in E(h, 1) \subset S_P$. In particular, $n_0 E(h, 1) \subset E_1 \setminus E_2$, which again contradicts $d(E_1 \triangle E_2) = 0$.

Proof of Theorem 24. Let $E \in D$ and suppose $d(E) > 0$. By definition there exists a multiplicative function $f$ such that $E = E(f, z)$. Our goal is to find a set $R \in D_{\mathrm{rat}}$ such that $E$ is uniform relative to $R$. We distinguish two cases, $z = 0$ and $z \neq 0$.

If $z = 0$ then let $g$ be the multiplicative function defined as $g(n) = 0$ if $f(n) = 0$ and $g(n) = 1$ if $f(n) \neq 0$. In view of Remark 83, $g$ is Besicovitch rationally almost periodic. Also, $E = E(f, z) = E(g, z)$, which proves that the set $E$ belongs to $D_{\mathrm{rat}}$. Since any set is uniform relative to itself, we can simply pick $R = E$ and are done.

Now assume $z \neq 0$. Using Corollary 88 we can find a concentrated multiplicative function $g : \mathbb{N} \to \mathbb{C}\setminus\{0\}$ such that $E = E(f, z) = E(g, z)$. According to Theorem 103 there exist a Besicovitch rationally almost periodic multiplicative function $h$ and $y \in \mathbb{C}\setminus\{0\}$ such that $E$ is uniform relative to $R := E(h, y)$ (namely $h = g^{k_G}$ and $y = z^{k_G}$). Clearly, the set $R$ belongs to $D_{\mathrm{rat}}$. This proves the claim.

Finally, we have to show that if $0 < d(E) < 1$ then the set $R \in D_{\mathrm{rat}}$ such that $E$ is uniform relative to $R$ is unique. Suppose $R' \in D_{\mathrm{rat}}$ is another set such that $E$ is uniform relative to $R'$. Since $1_{R'}$ is Besicovitch rationally almost periodic and $d(R)1_E - d(E)1_R$ is uniform, it follows from part (c) of Proposition 94 that the function

\[ \big(d(R)1_E - d(E)1_R\big) \cdot 1_{R'} = d(R)1_E - d(E)1_{R \cap R'} \tag{2.3.24} \]

is uniform. Since any uniform function has zero mean, we have that

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \Big( d(R)1_E(n) - d(E)1_{R \cap R'}(n) \Big) = 0, \]

which shows that $d(R) = d(R \cap R')$. By symmetry, it follows that $d(R) = d(R \cap R') = d(R')$ and hence $d(R \triangle R') = 0$. In view of Lemma 107, this proves that $R = R'$.

CHAPTER 3

APPLICATIONS OF DECOMPOSITION THEOREMS TO THE THEORY OF MULTIPLE RECURRENCE AND TO COMBINATORIAL NUMBER THEORY

In Chapter 3 we deal with applications of the decomposition theorems proved in Chapter 2. In particular, this chapter contains the proofs of Theorem 28, Theorem 32, Corollary 34, Theorem 35, Proposition 36 and Theorem 40.

3.1 Multiple ergodic averages along Beatty sequences and a proof of Theorem 28

The purpose of this section is to give a proof of Theorem 28. For the convenience of the reader, let us recall its statement here.

Theorem 28. Let $\theta, \gamma \in \mathbb{R}$ with $\theta > 0$, and let $(X, \mathcal{B}, \mu, T)$ be an ergodic measure preserving system whose discrete spectrum $\sigma(T)$ satisfies $\sigma(T) \cap \langle \theta^{-1} \rangle = \{0\}$. For any $f_1, \ldots, f_k \in L^\infty(X)$,

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{j\lfloor \theta n + \gamma \rfloor} f_j = \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{k} T^{jn} f_j, \tag{1.5.3} \]

where convergence takes place in $L^2(X)$. In particular, since discrete spectra are always countable, we have that for any fixed system $(X, \mathcal{B}, \mu, T)$ and for almost all $\theta > 0$ equation (1.5.3) holds for all $\gamma \in \mathbb{R}$.

The key ingredient in the proof of Theorem 28 is the following lemma, which relies on Theorem 18.

Lemma 108. Let $\theta \in \mathbb{T}$ and let $F_1 : \mathbb{T} \to \mathbb{T}$ be Riemann integrable with $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} F_1(n\theta) = 0$. Also, let $(X, \mathcal{B}, \mu, T)$ be an ergodic measure preserving system whose discrete spectrum satisfies $\sigma(T) \cap \langle \theta \rangle = \{0\}$. Then for any $f_1, \ldots, f_k \in L^\infty(X)$ we have

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} F_1(n\theta) \prod_{j=1}^{k} T^{jn} f_j = 0 \tag{3.1.1} \]

in $L^2$.

Proof. It follows from [HK09, Theorem 2.24] that the limit of the left hand side in (3.1.1) exists in $L^2$. It thus suffices to show that for all $f_0 \in L^\infty(X)$ we have

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} F_1(n\theta) \int f_0 \cdot T^{n} f_1 \cdots T^{kn} f_k \, d\mu = 0. \tag{3.1.2} \]

In view of Theorem 18, the multicorrelation sequence can be decomposed as

\[ \int f_0 \cdot T^{n} f_1 \cdots T^{kn} f_k \, d\mu = \psi(n) + \omega(n) + \gamma(n), \]

where $\|\gamma\|_\infty < \varepsilon$, $\omega$ is a null-sequence and $\psi(n) = F_2(R_2^n y)$, where $y \in Y_2$, $F_2 \in C(Y_2)$ and $(Y_2, R_2)$ is a nilsystem whose discrete spectrum satisfies $\sigma(Y_2, R_2) \subset \sigma(X, T)$. Let $\mu_{Y_2}$ denote the Haar measure of the nilmanifold $Y_2$.

Let us use $R_1 : \mathbb{T} \to \mathbb{T}$ to denote rotation by $\theta$ on $\mathbb{T}$ and let $Y_1$ denote the orbit closure of $0$ under $R_1$. Note that either $Y_1 = \mathbb{T}$, which corresponds to the case of irrational $\theta$, or $Y_1$ is a finite subgroup of $\mathbb{T}$, which corresponds to the case of rational $\theta$. Either way, $Y_1$ is a closed subgroup of $\mathbb{T}$ and we use $\mu_{Y_1}$ to denote its Haar measure. Putting everything together we can now rewrite (3.1.2) (up to a loss of $\varepsilon$) as

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} F_1(R_1^n 0)\, F_2(R_2^n y) = 0. \]

Since the discrete spectrum $\sigma(Y_1, R_1)$ of the system $(Y_1, R_1)$ is given by the group generated by $\theta$, we deduce that the systems $(Y_1, R_1)$ and $(Y_2, R_2)$ have mutually singular spectral types. In view of [Gla03, Theorem 6.28] we deduce that they are disjoint in the sense of Furstenberg [Fur67].

Observe that for any increasing sequence $(N_\ell)$ for which the limit exists, the limit measure

\[ \nu := \lim_{\ell\to\infty} \frac{1}{N_\ell} \sum_{n=1}^{N_\ell} (R_1 \times R_2)^n_* \, \delta_{(0,y)} \]

on $Y_1 \times Y_2$ is a joining of the systems $(Y_1, R_1)$ and $(Y_2, R_2)$. By disjointness, the only joining is the trivial one, and hence $\nu$ is the product of the two Haar measures $\mu_{Y_1}$ and $\mu_{Y_2}$. In particular,

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} F_1(R_1^n 0)\, F_2(R_2^n y) = \int_{Y_1 \times Y_2} F_1 \otimes F_2 \, d(\mu_{Y_1} \otimes \mu_{Y_2}) = \int_{Y_1} F_1 \, d\mu_{Y_1} \cdot \int_{Y_2} F_2 \, d\mu_{Y_2} = 0. \]

This finishes the proof.

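Proof of Theorem 28. Let $\theta, \gamma \in \mathbb{R}$ with $\theta > 0$ and let $(X, \mathcal{B}, \mu, T)$ be a measure preserving system whose discrete spectrum $\sigma(T)$ satisfies $\langle \theta^{-1} \rangle \cap \sigma(T) = \{0\}$. We need to show that for any $f_1, \ldots, f_k \in L^\infty(X, \mathcal{B}, \mu)$ we have

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{j\lfloor \theta n + \gamma \rfloor} f_j = \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{k} T^{jn} f_j, \tag{3.1.3} \]

where convergence takes place in $L^2$.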

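Let us first deal with the case $\theta > 1$. Define $A = \{\lfloor \theta n + \gamma \rfloor : n \in \mathbb{N}\}$ and observe that $m \in A$ if and only if $m\frac{1}{\theta} \bmod 1 \in (a, b] \subset \mathbb{T}$, where $a := (\gamma - 1)/\theta \bmod 1$ and $b := \gamma/\theta \bmod 1$. Therefore $1_A(m) = 1_{(a,b]}\big(m\frac{1}{\theta}\big)$. Define $F(x) := 1_{(a,b]}(x) - \frac{1}{\theta}$. Since $F$ is Riemann integrable with

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{m=1}^{N} F\Big(m\frac{1}{\theta}\Big) = 0, \]

it follows from Lemma 108 that

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{m=M}^{N} F\Big(m\frac{1}{\theta}\Big) \prod_{j=1}^{k} T^{jm} f_j = 0. \]

From $1_A(m) = F\big(m\frac{1}{\theta}\big) + \frac{1}{\theta}$ we deduce that

\[ \lim_{N-M\to\infty} \frac{1}{N-M} \sum_{m=M}^{N} 1_A(m) \prod_{j=1}^{k} T^{jm} f_j = \frac{1}{\theta} \left( \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \prod_{j=1}^{k} T^{jn} f_j \right). \]

From this and the observation that the density $d(A) = 1/\theta$, (3.1.3) follows at once.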

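Now assume that $\theta < 1$. If $\theta$ is rational, (3.1.3) follows immediately, so we assume also that $\theta$ is irrational. Let $k := \min\{j \in \mathbb{N} : j\theta > 1\}$. First, we observe that $\lfloor m\theta + \gamma \rfloor = n$ if and only if $m \in \big[\lceil \frac{n-\gamma}{\theta} \rceil, \lceil \frac{n+1-\gamma}{\theta} \rceil\big)$. Also, the number of $m$'s for which $\lfloor m\theta + \gamma \rfloor = n$ varies between $k - 1$ and $k$. Define the sets

\[ A_{k-1} := \Big\{ n : \big|\{m : \lfloor m\theta + \gamma \rfloor = n\}\big| = k - 1 \Big\} = \Big\{ n : n\tfrac{1}{\theta} \bmod 1 \in \big(\tfrac{\gamma}{\theta},\, 1 - \xi + \tfrac{\gamma}{\theta}\big] \Big\} \]

and

\[ A_{k} := \Big\{ n : \big|\{m : \lfloor m\theta + \gamma \rfloor = n\}\big| = k \Big\} = \Big\{ n : n\tfrac{1}{\theta} \bmod 1 \in \big(\tfrac{\gamma}{\theta} - \xi,\, \tfrac{\gamma}{\theta}\big] \Big\}, \]

where $\xi$ denotes the fractional part of $1/\theta$, and the Riemann integrable functions

\[ F(x) := 1_{(\frac{\gamma}{\theta} - \xi,\, \frac{\gamma}{\theta}]}(x) - d(A_k) \quad\text{and}\quad G(x) := 1_{(\frac{\gamma}{\theta},\, 1 - \xi + \frac{\gamma}{\theta}]}(x) - d(A_{k-1}). \]

Observe that $\int F(x)\, dx = \int G(x)\, dx = 0$. We have

\[ \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{j\lfloor \theta n + \gamma \rfloor} f_j = \frac{k}{N-M} \sum_{n=\lfloor \theta M + \gamma \rfloor}^{\lfloor \theta N + \gamma \rfloor} 1_{A_k}(n) \prod_{j=1}^{k} T^{jn} f_j + \frac{k-1}{N-M} \sum_{n=\lfloor \theta M + \gamma \rfloor}^{\lfloor \theta N + \gamma \rfloor} 1_{A_{k-1}}(n) \prod_{j=1}^{k} T^{jn} f_j. \tag{3.1.4} \]

Finally, using $1_{A_k}(n) = F\big(n\frac{1}{\theta}\big) + d(A_k)$ and $1_{A_{k-1}}(n) = G\big(n\frac{1}{\theta}\big) + d(A_{k-1})$ we conclude from Lemma 108 that

\[ \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{j\lfloor \theta n + \gamma \rfloor} f_j = \frac{1}{N-M} \sum_{n=M}^{N} \prod_{j=1}^{k} T^{jn} f_j. \]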
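The counting claim used in the case $\theta < 1$, namely that each value $n$ is attained by either $k - 1$ or $k$ integers $m$, can be explored with a short computation. The sketch below is an illustration under stated assumptions: $\theta = 29/70$ is a rational stand-in for an irrational number close to $\sqrt{2} - 1$ (the proof itself assumes $\theta$ irrational, but the counting fact only needs $1/\theta \notin \mathbb{N}$), $\gamma = 1/3$ and the cutoff are arbitrary, and the proportion of values with $k$ preimages is compared with the fractional part of $1/\theta$.

```python
from fractions import Fraction
from math import floor
from collections import Counter

theta, gamma = Fraction(29, 70), Fraction(1, 3)   # illustrative; theta < 1
k = min(j for j in range(1, 1000) if j * theta > 1)

# fibre[n] = number of m in [1, M] with floor(m*theta + gamma) == n
M = 20_000
fibre = Counter(floor(m * theta + gamma) for m in range(1, M + 1))

# Drop the smallest and largest values of n: their fibres may be truncated
# by the restriction 1 <= m <= M.
n_max = max(fibre)
interior = {n: c for n, c in fibre.items() if 1 <= n < n_max}
sizes = set(interior.values())

# Proportion of n with exactly k preimages, to be compared with frac(1/theta).
share_k = sum(1 for c in interior.values() if c == k) / len(interior)
xi = 1 / theta - (k - 1)
```

In this example `sizes` contains only the two values $k - 1$ and $k$, and `share_k` is close to `xi`, in line with the arcs of length $\xi$ and $1 - \xi$ appearing in the definitions of $A_k$ and $A_{k-1}$.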

3.2 Multiple ergodic averages along rational sets and applications

In this section we follow [BKPLR16] and consider multiple ergodic averages of the form

\[ \lim_{N\to\infty} \frac{1}{|R \cap [1, N]|} \sum_{n=1}^{N} 1_R(n)\, \mu\big(A \cap T^{-p_1(n)}A \cap \ldots \cap T^{-p_\ell(n)}A\big), \tag{3.2.1} \]

where $R$ is a rational set. In particular, we will give proofs of Theorem 32 and of Corollary 34.

3.2.1 Rational sequences are good weights for polynomial multiple convergence

To study expressions of the form (3.2.1), it will be convenient to show first that for any rational set $R$ with $d(R) > 0$ the limit in (3.2.1) actually exists. We make the following observation: If $d(R)$ exists and is positive then the limit in (3.2.1) exists and is positive if and only if the limit

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(n)\, \mu\big(A \cap T^{-p_1(n)}A \cap \ldots \cap T^{-p_\ell(n)}A\big) \tag{3.2.2} \]

exists and is positive. Since throughout this thesis we only consider sets $R$ for which $d(R)$ exists and is positive, it suffices to study the ergodic averages given by (3.2.2) instead of (3.2.1).

For the special case where $\ell = 1$ and $p_1(t) = t$, the existence of the limit in (3.2.2) follows from the work of Bellow and Losert in [BL85]. More precisely, they showed that for any bounded Besicovitch almost periodic function $x : \mathbb{N} \to \mathbb{C}$, the ergodic averages

\[ \frac{1}{N} \sum_{n=1}^{N} x(n)\, T^n f \]

converge almost everywhere as $N \to \infty$ for any function $f \in L^1(X, \mathcal{B}, \mu)$. From this, the existence of the limit in (3.2.2) for $\ell = 1$ and $p_1(t) = t$ follows immediately.

Definition 109. A sequence $x \in \{0, 1\}^{\mathbb{N}}$ is called a good weight for polynomial multiple convergence if for every invertible measure preserving system $(X, \mathcal{B}, \mu, T)$, for all $f_1, \ldots, f_\ell \in L^\infty(X, \mu)$ and for all polynomials $p_i \in \mathbb{Z}[x]$, $i \in \{1, \ldots, \ell\}$, the limit

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} x(n) \prod_{i=1}^{\ell} T^{p_i(n)} f_i \tag{3.2.3} \]

exists in $L^2(X, \mathcal{B}, \mu)$.

The following proposition shows that the limit in (3.2.2) exists in general.

Proposition 110. Let $x \in \{0, 1\}^{\mathbb{N}}$ be Besicovitch rationally almost periodic. Then $x$ is a good weight for polynomial multiple convergence.

Proof. It follows from the results of Host, Kra [HK05a] and Leibman [Lei05a] that the sequence

\[ \frac{1}{N} \sum_{n=1}^{N} T^{q_1(n)} f_1 \cdots T^{q_\ell(n)} f_\ell \]

converges in $L^2$, for any $q_i \in \mathbb{Z}[x]$, $i = 1, \ldots, \ell$. In particular, given arbitrary $a \in \mathbb{N}$ and $b \in \mathbb{Z}$ the averages

\[ \frac{1}{N} \sum_{n=1}^{N} T^{p_1(an+b)} f_1 \cdots T^{p_\ell(an+b)} f_\ell \]

converge in $L^2$ as $N \to \infty$. Equivalently, the limit

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_{a\mathbb{N}+b}(n)\, T^{p_1(n)} f_1 \cdots T^{p_\ell(n)} f_\ell \tag{3.2.4} \]

exists. Observe that any periodic function can be written as a finite linear combination of indicator functions $1_{a\mathbb{N}+b}$ of infinite arithmetic progressions. Therefore, it follows from (3.2.4) that for any periodic function $y \in \{0, 1\}^{\mathbb{N}}$ the limit

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} y(n)\, T^{p_1(n)} f_1 \cdots T^{p_\ell(n)} f_\ell \]

also exists in $L^2$.

Since any Besicovitch rationally almost periodic function $x$ can be approximated by periodic functions, we can find periodic functions $y_m$, $m \in \mathbb{N}$, satisfying $\|y_m - x\|_{\Phi^{(\text{Cesàro})}} \to 0$ as $m \to \infty$. Define

\[ L_m := \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} y_m(n)\, T^{p_1(n)} f_1 \cdots T^{p_\ell(n)} f_\ell. \]

Then

\[ \|L_{m_1} - L_{m_2}\|_{L^2} \leq \|y_{m_1} - y_{m_2}\|_{\Phi^{(\text{Cesàro})}} \, \|f_1\|_{L^\infty} \cdots \|f_\ell\|_{L^\infty}, \]

which shows that $(L_m)$ is a Cauchy sequence, whence the limit $L := \lim_{m\to\infty} L_m$ exists. Moreover,

\[ \limsup_{N\to\infty} \left\| L_m - \frac{1}{N} \sum_{n=1}^{N} x(n)\, T^{p_1(n)} f_1 \cdots T^{p_\ell(n)} f_\ell \right\|_{L^2} \]

can be bounded from above by $\|x - y_m\|_{\Phi^{(\text{Cesàro})}} \, \|f_1\|_{L^\infty} \cdots \|f_\ell\|_{L^\infty}$, which converges to zero as $m \to \infty$. Therefore, the limit in (3.2.3) exists and equals $L$.
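The approximation by periodic functions that drives the proof above can be made concrete in a standard example. The sketch below takes $x$ to be the indicator function of the squarefree numbers and $y_m(n) = 1$ unless $p^2 \mid n$ for one of the first $m$ primes, so that $y_m$ is periodic with period $\prod p^2$. The choice of example, the function names, the list of primes and the cutoff $N$ are assumptions made for illustration, and the finite average of $|x - y_m|$ over $[1, N]$ is only a proxy for the seminorm $\|x - y_m\|_{\Phi^{(\text{Cesàro})}}$.

```python
def is_squarefree(n):
    """True if no square larger than 1 divides n."""
    p = 2
    while p * p <= n:
        if n % (p * p) == 0:
            return False
        p += 1
    return True

PRIMES = [2, 3, 5, 7, 11, 13]

def y(m, n):
    """Periodic approximant: 1 unless p^2 divides n for one of the first m primes."""
    return int(all(n % (p * p) != 0 for p in PRIMES[:m]))

N = 20_000
x = [int(is_squarefree(n)) for n in range(1, N + 1)]

def cesaro_distance(m):
    """Average of |x(n) - y_m(n)| over 1 <= n <= N (finite proxy for the seminorm)."""
    return sum(abs(x[n - 1] - y(m, n)) for n in range(1, N + 1)) / N

distances = [cesaro_distance(m) for m in range(1, len(PRIMES) + 1)]
```

The list `distances` decreases as $m$ grows, which mirrors the convergence $\|y_m - x\|_{\Phi^{(\text{Cesàro})}} \to 0$ used to show that $(L_m)$ is a Cauchy sequence.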

3.2.2 Divisible rational sets are good for polynomial multiple recurrence

In this subsection we prove (1) ⇒ (3) in Theorem 32. Since (3) ⇒ (2) ⇒ (1) are trivial, this will complete the proof of Theorem 32. Let us state the implication that we want to prove as a separate theorem.

Theorem 111. Assume R ⊂ N is rational and divisible. Then R is an averaging set of polynomial multiple recurrence.

The only two ingredients that we need in the proof of Theorem 111 are Theorem 20 and Proposition 110, which we already proved above.

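Proof of Theorem 111. Let $(X, \mathcal{B}, \mu, T)$ be an invertible measure preserving system and assume that $R \subset \mathbb{N}$ is rational and divisible. Take any $A \in \mathcal{B}$ with $\mu(A) > 0$ and let $p_1, \ldots, p_\ell \in \mathbb{Z}[x]$ with $p_i(0) = 0$ for all $i = 1, \ldots, \ell$. We will show that

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(n)\varphi(n) > 0, \tag{3.2.5} \]

where $\varphi(n) = \mu\big(A \cap T^{-p_1(n)}A \cap \ldots \cap T^{-p_\ell(n)}A\big)$. This, in view of (3.2.2), suffices to conclude that $R$ is an averaging set of polynomial multiple recurrence. The existence of the limit in (3.2.5) follows immediately from Proposition 110, hence it only remains to show its positivity.

By Theorem 20, there exists $\delta > 0$ such that for every $\varepsilon > 0$ one can find a decomposition of the form

\[ \varphi(n) := \int_X f_0 \cdot T^{p_1(n)} f_1 \cdots T^{p_\ell(n)} f_\ell \, d\mu = \rho(n) + \phi(n) + \omega(n) + \gamma(n), \]

where $\omega$ is a null-sequence, $\gamma$ satisfies $\sup_{n\in\mathbb{N}} |\gamma(n)| < \varepsilon$, $\rho$ is a periodic sequence with the property that

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \rho(qn) > \delta, \qquad \forall q \in \mathbb{N}, \tag{3.2.6} \]

and $\phi \in \mathrm{Nil}_s(\mathbb{N}) \cap \mathrm{Coper}(\mathbb{N}, \Phi^{(\text{Cesàro})})$ for some $s \in \mathbb{N}$. Let $Q \in \mathbb{N}$ denote the period of $\rho$, that is, $\rho(Qn + r) = \rho(r)$ for all $n, r \in \mathbb{N}$. We get

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(n)\varphi(n) = \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \Big( 1_R(n)\rho(n) + 1_R(n)\phi(n) + 1_R(n)\omega(n) + 1_R(n)\gamma(n) \Big). \]

Observe that $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(n)\phi(n) = 0$ because $\phi$ is coperiodic and $1_R$ is Besicovitch rationally almost periodic, and $\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(n)\omega(n) = 0$ because $\omega$ is a null-sequence. Moreover, if $\varepsilon$ was chosen to be smaller than $\frac{\delta}{2}$ then $\frac{1}{N} \sum_{n=1}^{N} 1_R(Qn)\gamma(Qn) \geq -\frac{d(R \cap Q\mathbb{N})\delta}{2}$ for all $N$. Putting everything together we get

\begin{align*}
\lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(n)\varphi(n) &= \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \Big( 1_R(n)\rho(n) + 1_R(n)\gamma(n) \Big) \\
&\geq \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} \Big( 1_R(Qn)\rho(Qn) + 1_R(Qn)\gamma(Qn) \Big) \\
&\geq \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(Qn)\rho(Qn) - \frac{d(R \cap Q\mathbb{N})\delta}{2} \\
&\geq \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_R(Qn)\delta - \frac{d(R \cap Q\mathbb{N})\delta}{2} \\
&\geq \frac{d(R \cap Q\mathbb{N})\delta}{2},
\end{align*}

where the second to last inequality follows from the fact that $n \mapsto \rho(Qn)$ is constant together with (3.2.6). Since $R$ is divisible, $\frac{d(R \cap Q\mathbb{N})\delta}{2}$ is positive and this finishes the proof of (3.2.5).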

3.2.3 Applications to additive combinatorics

In this section we show how the results obtained in the previous sections allow us to derive new refinements of Szemerédi's theorem. In particular, we give proofs of Corollaries 33 and 34.

We have the following two propositions regarding averaging sets of polynomial multiple recurrence.

Proposition 112. Let $R \subset \mathbb{N}$ be an averaging set of polynomial multiple recurrence. Then for any set $E \subset \mathbb{N}$ with $d(E) > 0$ and any polynomials $p_1, \ldots, p_\ell \in \mathbb{Z}[x]$, which satisfy $p_i(0) = 0$ for all $i \in \{1, \ldots, \ell\}$, there exists $\beta > 0$ such that the set

\[ \Big\{ n \in R : d\big(E \cap (E - p_1(n)) \cap \ldots \cap (E - p_\ell(n))\big) > \beta \Big\} \]

has positive lower density.

Proposition 113. Let $R \subset \mathbb{N}$ be an averaging set of polynomial multiple recurrence. Then for any $E \subset \mathbb{N}$ with $d(E) > 0$ and any polynomials $p_1, \ldots, p_\ell \in \mathbb{Z}[t]$, which satisfy $p_i(0) = 0$ for all $i \in \{1, \ldots, \ell\}$, there exists a subset $R' \subset R$ satisfying $d(R') > 0$ such that for any finite subset $F \subset R'$ we have

\[ d\left( \bigcap_{n \in F} \Big( E \cap \big(E - p_1(n)\big) \cap \ldots \cap \big(E - p_\ell(n)\big) \Big) \right) > 0. \]

By combining Proposition 112 with Theorem 32, we immediately obtain a proof of Corollary 33. Likewise, by combining Proposition 113 with Theorem 32, we immediately obtain a proof of Corollary 34.

Proposition 112 is a consequence of Furstenberg's correspondence principle (Proposition 26) and of the definition of an averaging set of polynomial multiple recurrence.

For the proof of Proposition 113 we need the following theorem.

Theorem 114 (see [Ber85, Theorem 1.1]). Let $(X, \mathcal{B}, \mu)$ be a probability space and suppose $A_n \in \mathcal{B}$, $\mu(A_n) \geq \delta > 0$, for $n = 1, 2, \ldots$. Then there exists a set $P \subset \mathbb{N}$ with $d(P) \geq \delta$ such that for any finite subset $F \subset P$, we have

\[ \mu\left( \bigcap_{n \in F} A_n \right) > 0. \]

Proof of Proposition 113. Let $R \subset \mathbb{N}$ be an averaging set of polynomial multiple recurrence. Let $E \subset \mathbb{N}$ with $d(E) > 0$ and let $p_i \in \mathbb{Z}[t]$, $i = 1, \ldots, \ell$, with $p_i(0) = 0$ for all $i \in \{1, \ldots, \ell\}$. According to Proposition 26, we can find an invertible measure preserving system $(X, \mathcal{B}, \mu, T)$ and a set $A \in \mathcal{B}$ with $\mu(A) \geq d(E)$ such that (1.5.1) is satisfied. Next, since $R$ is an averaging set of polynomial multiple recurrence, we can find some $\delta > 0$ such that the set

\[ D := \Big\{ n \in R : \mu\big(A \cap T^{-p_1(n)}A \cap \ldots \cap T^{-p_\ell(n)}A\big) > \delta \Big\} \]

has positive lower density, i.e., $\underline{d}(D) = \liminf_{N\to\infty} \frac{|D \cap \{1, \ldots, N\}|}{N} > 0$. Let $n_1, n_2, n_3, \ldots$ be an enumeration of $D$ and let $A_i \in \mathcal{B}$ denote the set

\[ A_i := A \cap T^{-p_1(n_i)}A \cap \ldots \cap T^{-p_\ell(n_i)}A. \]

Then, according to Theorem 114, we can find a set $P \subset \mathbb{N}$ with $d(P) \geq \delta$ such that for any finite subset $F \subset P$, we have

\[ \mu\left( \bigcap_{n \in F} A_n \right) > 0. \tag{3.2.7} \]

Let $R' := \{n_i : i \in P\}$. Then $R' \subset R$ and it is straightforward to show that $d(R') > 0$. Moreover, combining (3.2.7) with (1.5.1), for any finite subset $\{n_1, \ldots, n_r\} \subset R'$, we obtain

\[ d\left( \bigcap_{i=1}^{r} \Big( E \cap (E - p_1(n_i)) \cap \ldots \cap (E - p_\ell(n_i)) \Big) \right) > 0. \]

From this the claim follows immediately.

3.3 Multiple ergodic averages along level sets of multiplicative functions and applications to ergodic theory and combinatorics

The purpose of this section is to provide proofs of Theorem 35 and Proposition 36. Our presentation is based on [BKPLR17].

We begin with a subsection concerning basic facts about elements in Drat (see Definition 23).

3.3.1 The class Drat

Given a set E ⊂ N consider the following two conditions:

(A) E is a rational set;

(B) for all q ∈ N and all r ∈ {0, 1, . . . , q − 1} either E ∩ (qN − r) = ∅ or d(E ∩ (qN − r)) exists and is positive.

Lemma 115. If f : N → C is a multiplicative function and 0 lies in the image of f, then the level set T := E(f, 0) satisfies conditions (A) and (B).

Proof. Let $g$ be the multiplicative function defined as $g(n) = 0$ if $f(n) = 0$ and $g(n) = 1$ if $f(n) \neq 0$. Then $T = E(g, 0)$. However, using Lemma 80, we either have $\|g\|_{\Phi^{(\text{Cesàro})}} = 0$ or $D(g, 1) < \infty$. If $\|g\|_{\Phi^{(\text{Cesàro})}} = 0$, then $d(T) = 1$, which implies that $T$ is rational. On the other hand, if $D(g, 1) < \infty$ then $D(g, 1) = \sum_{p \in \mathbb{P}} \frac{1}{p}(1 - g(p)) < \infty$ and therefore, using Corollary 82, we deduce that $g$ is Besicovitch rationally almost periodic, which implies that $T$ is rational. This shows that $T$ satisfies (A).

Next, let $q \in \mathbb{N}$ and $r \in \{0, 1, \ldots, q - 1\}$. Since $T$ is a rational set, the density $d(T \cap (q\mathbb{N} - r))$ exists. It remains to show that if $T \cap (q\mathbb{N} - r) \neq \emptyset$ then $d(T \cap (q\mathbb{N} - r))$ is positive. Suppose $x \in T \cap (q\mathbb{N} - r)$. Let $S := \{n \in \mathbb{N} : \gcd(x, n) = 1\}$. Then $xS \subset T$. Also, $xS$ is a finite union of infinite arithmetic progressions and hence $xS \cap (q\mathbb{N} - r)$ is a non-empty finite union of infinite arithmetic progressions. This shows that $d(xS \cap (q\mathbb{N} - r))$ exists and is positive, which, in turn, proves that $T$ satisfies (B).

Proposition 116. Suppose R ∈ Drat and d(R) > 0. Then R satisfies conditions (A) and (B).

Remark 117. Proposition 116 implies that any level set of a Besicovitch almost periodic multiplicative function is rational. This fails to be true for general (not necessarily multiplicative) Besicovitch rationally almost periodic functions. Indeed, let $D \subset \mathbb{N}$ be arbitrary and consider the function $f : \mathbb{N} \to \mathbb{C}$ defined as $f(n) = 0$ if $n \in D$ and $f(n) = \frac{1}{n}$ if $n \notin D$. Then $f$ is Besicovitch almost periodic, because $\|f\|_{\Phi^{(\text{Cesàro})}} = 0$, and $E(f, 0) = D$. This shows that any set whatsoever, and in particular any non-rational set, can be realized as a level set of a Besicovitch almost periodic function.

Before proving Proposition 116 we recall the definition of inner regular sets.

Definition 118 (see [BR02, Definition 2.3] and [BKPLR16]). A subset $R \subset \mathbb{N}$ is called inner regular if for each $\varepsilon > 0$ there exists $m \in \mathbb{N}$ such that for each $s \in \{0, 1, \ldots, m - 1\}$ the intersection $R \cap (m\mathbb{N} - s)$ is either empty or has lower density $> \frac{1-\varepsilon}{m}$.

Remark 119. It follows immediately from Definition 118 that any inner regular set satisfies condition (A). We claim that inner regular sets also satisfy condition (B). To prove this claim, let $q \in \mathbb{N}$ and $r \in \{0, 1, \ldots, q - 1\}$ be arbitrary and assume $R \cap (q\mathbb{N} - r) \neq \emptyset$. Fix any $x \in R \cap (q\mathbb{N} - r)$. Let $0 < \varepsilon < \frac{1}{q}$ and choose $m \in \mathbb{N}$ such that for each $s \in \{0, 1, \ldots, m - 1\}$ the intersection $R \cap (m\mathbb{N} + s)$ is either empty or has lower density $> \frac{1-\varepsilon}{m}$. Take $s \in \{0, 1, \ldots, m - 1\}$ such that $s \equiv x \bmod m$. Since $x \in R$ and $x \in m\mathbb{N} + s$, the intersection $R \cap (m\mathbb{N} + s)$ is non-empty and hence $d(R \cap (m\mathbb{N} + s)) > \frac{1-\varepsilon}{m}$. On the other hand, $d((q\mathbb{N} + r) \cap (m\mathbb{N} + s)) \geq \frac{1}{qm}$. It follows that $d(R \cap (m\mathbb{N} + s) \cap (q\mathbb{N} + r)) \geq \frac{1}{m}\big(\frac{1}{q} - \varepsilon\big) > 0$. This finishes the proof of the claim.

We need two lemmas for the proof of Proposition 116 which we state next.

Lemma 120 (see [BR02, Lemma 2.7] applied to $B = \{p^2 : p \in P\} \cup (\mathbb{P}\setminus P)$). Let $P \subset \mathbb{P}$ with $\sum_{p \in \mathbb{P}\setminus P} \frac{1}{p} < \infty$, and let $S_P$ be the set defined in formula (2.3.23). Then $S_P$ is inner regular. In particular, according to Remark 119, $S_P$ satisfies conditions (A) and (B).

Lemma 121. Let $P \subset \mathbb{P}$ with $\sum_{p \in \mathbb{P}\setminus P} \frac{1}{p} < \infty$ and let $f$ be a multiplicative function. Let $S_P$ be the set defined in formula (2.3.23) and for $t \in \mathbb{N}$ let $S_P^{(t)} := \{s \in S_P : \gcd(s, t) = 1\}$. If for all $t \in \mathbb{N}$ and $z \in \mathbb{C}$ the set $E(f, z) \cap S_P^{(t)}$ satisfies (A) and (B) then for all $z \in \mathbb{C}$ the set $E(f, z)$ satisfies (A) and (B).

Proof. Let $T_P$ be defined as

\[ T_P := \big\{ n \in \mathbb{N} : \text{for all } p \in P, \text{ if } p \mid n \text{ then } p^2 \mid n \big\}. \tag{3.3.1} \]

Since any natural number $n$ can be written uniquely as $st$, where $s \in S_P$, $t \in T_P$ and $\gcd(s, t) = 1$, $\mathbb{N}$ can be partitioned into

\[ \mathbb{N} = \bigcup_{t \in T_P} t S_P^{(t)}. \tag{3.3.2} \]

Note that $d(S_P) = M(1_{S_P})$ exists (due to Theorem 76) and $d(S_P) > 0$ because $\sum_{p \in \mathbb{P}\setminus P} \frac{1}{p} < \infty$ and therefore $\sum_{p \in \mathbb{P}} \frac{1}{p}\big(1 - 1_{S_P}(p)\big) < \infty$ (cf. Lemma 80). Likewise, $1_{S_P^{(t)}}$ is a multiplicative function and hence $d(S_P^{(t)}) = M(1_{S_P^{(t)}})$ exists (again due to Theorem 76) and is positive (also by Lemma 80). Using (3.3.2) and the fact that $d(tS_P^{(t)}) = t^{-1} d(S_P^{(t)})$ we obtain

\[ \sum_{t \in T_P} \frac{d(S_P^{(t)})}{t} = \sum_{t \in T_P} d\big(tS_P^{(t)}\big) \leq d\Big( \bigcup_{t \in T_P} tS_P^{(t)} \Big) = d(\mathbb{N}) = 1. \tag{3.3.3} \]

For $t \in T_P$ let $u_t := f(t)$ and define

\[ E_t := \begin{cases} E\big(f, \tfrac{z}{u_t}\big), & \text{if } u_t \neq 0; \\ \emptyset, & \text{if } u_t = 0 \text{ and } z \neq 0; \\ \mathbb{N}, & \text{if } u_t = 0 \text{ and } z = 0. \end{cases} \tag{3.3.4} \]

It is easy to check that $E(f, z) \cap tS_P^{(t)} = t(E_t \cap S_P^{(t)})$. Observe that if $E_t = E(f, \frac{z}{u_t})$ then $E_t \cap S_P^{(t)}$ satisfies (A) and (B) due to the assumptions stipulated in the statement of Lemma 121. Also, if $E_t = \emptyset$ then $E_t \cap S_P^{(t)} = \emptyset$ obviously satisfies (A) and (B). In light of Lemma 120, if $E_t = \mathbb{N}$ then $E_t \cap S_P^{(t)} = S_P^{(t)}$ satisfies (A) and (B). We see that for each of the three cases comprising the definition of $E_t$ in (3.3.4), the set $E_t \cap S_P^{(t)}$ satisfies (A) and (B). Since $E_t \cap S_P^{(t)}$ satisfies (A) and (B) and $E(f, z) \cap tS_P^{(t)} = t(E_t \cap S_P^{(t)})$, it follows that $E(f, z) \cap tS_P^{(t)}$ satisfies (A) and (B).

Note that any finite union of sets satisfying (A) and (B) also satisfies (A) and (B). Therefore, for every $M \geq 1$, the set

\[ B_M := \bigcup_{\substack{t \in T_P \\ t \leq M}} \big( E(f, z) \cap tS_P^{(t)} \big) \]

satisfies (A) and (B). Finally, since $d(E(f, z)\setminus B_M) \to 0$ as $M \to \infty$ (see equation (3.3.3)), we conclude that $E(f, z)$ satisfies (A). Since $B_1 \subset B_2 \subset \ldots$ and $E(f, z) = \bigcup_{M \geq 1} B_M$, we conclude that $E(f, z)$ satisfies (B). This finishes the proof.
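The partition (3.3.2) rests on the unique factorization $n = st$ with $s \in S_P$, $t \in T_P$ and $\gcd(s, t) = 1$: the factor $s$ collects exactly those primes of $P$ that divide $n$ to the first power. The following sketch computes this splitting; the choice $P = \mathbb{P}\setminus\{2, 3\}$, the helper names and the convention $1 \in S_P$ are assumptions made for illustration only.

```python
from math import gcd

def factorize(n):
    """Prime factorization of n as a dict {prime: exponent} (trial division)."""
    factors, p = {}, 2
    while p * p <= n:
        while n % p == 0:
            factors[p] = factors.get(p, 0) + 1
            n //= p
        p += 1
    if n > 1:
        factors[n] = factors.get(n, 0) + 1
    return factors

def in_P(p):
    """Hypothetical choice of P for this sketch: all primes except 2 and 3."""
    return p not in (2, 3)

def split(n):
    """Return (s, t) with n = s*t, s in S_P, t in T_P and gcd(s, t) = 1."""
    s = 1
    for p, e in factorize(n).items():
        if in_P(p) and e == 1:
            s *= p
    return s, n // s

def in_S_P(s):
    """s is a product of distinct primes from P (1 counts as the empty product)."""
    return all(in_P(p) and e == 1 for p, e in factorize(s).items())

def in_T_P(t):
    """Every prime of P dividing t divides it at least twice."""
    return all(e >= 2 for p, e in factorize(t).items() if in_P(p))
```

For instance, `split(2**2 * 5 * 7**2 * 11)` returns `(55, 196)`: the primes $5$ and $11$ occur to the first power and lie in $P$, while $2^2 \cdot 7^2$ is collected in $t$.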

Proof of Proposition 116. Let $R \in D_{\mathrm{rat}}$ with $d(R) > 0$ be given. Then there exist a Besicovitch rationally almost periodic multiplicative function $f$ and a complex number $z$ such that $R = E(f, z)$. Note that if $z = 0$ then it follows from Lemma 115 that $R$ satisfies (A) and (B). We can therefore assume without loss of generality that $z \neq 0$.

We now apply Corollary 88 to find a concentrated multiplicative function $g : \mathbb{N} \to \mathbb{C}\setminus\{0\}$ such that the set $P' := \{p \in \mathbb{P} : f(p) \neq g(p)\}$ satisfies $\sum_{p \in P'} \frac{1}{p} < \infty$. Since $f$ is Besicovitch rationally almost periodic, it follows from Corollary 82 that there exists a Dirichlet character $\chi$ such that $\sum_{p \in \mathbb{P}} \frac{1}{p}\big(1 - f(p)\overline{\chi(p)}\big)$ converges. In particular, $D(f, \chi) < \infty$.

The function $g$ is a concentrated multiplicative function and therefore its concentration group $G$ is a finite set of roots of unity and we have $\sum_{p \in \mathbb{P},\, g(p) \notin G} \frac{1}{p} < \infty$. Define $P'' := \{p \in \mathbb{P} : f(p) \neq \chi(p)\}$ and let $\rho := \min\{1 - \mathrm{Re}(x\overline{y}) : x \in G,\ y \in \operatorname{im}(\chi),\ x \neq y\}$. Note that $\rho > 0$ and

\begin{align*}
\sum_{p \in P''} \frac{1}{p} &\leq \sum_{\substack{p \in \mathbb{P} \\ g(p) \notin G}} \frac{1}{p} + \sum_{p \in P'} \frac{1}{p} + \sum_{\substack{p \in \mathbb{P},\ f(p) \neq \chi(p) \\ g(p) \in G,\ g(p) = f(p)}} \frac{1}{p} \\
&\leq \sum_{\substack{p \in \mathbb{P} \\ g(p) \notin G}} \frac{1}{p} + \sum_{p \in P'} \frac{1}{p} + \frac{1}{\rho} D(f, \chi) < \infty.
\end{align*}

Let $P := \mathbb{P}\setminus P''$, let $S_P$ be the set defined in formula (2.3.23) and, for $t \in \mathbb{N}$, let $S_P^{(t)} := \{s \in S_P : \gcd(s, t) = 1\}$. Since $f(p) = \chi(p)$ for all $p \in P$, we conclude that $E(f, z) \cap S_P^{(t)} = E(\chi, z) \cap S_P^{(t)}$. Recall that all Dirichlet characters are periodic functions. Therefore the set $E(\chi, z)$ is either empty or a finite union of infinite arithmetic progressions. In view of Lemma 120, the set $S_P^{(t)}$ is inner regular. Hence $E(\chi, z) \cap S_P^{(t)}$ is an inner regular set. Therefore, by Remark 119, for all $t \in \mathbb{N}$ the set $E(f, z) \cap S_P^{(t)} = E(\chi, z) \cap S_P^{(t)}$ satisfies (A) and (B). Finally, we can apply Lemma 121 and conclude that $E(f, z)$ satisfies (A) and (B).

Proposition 116 immediately gives the following corollary.

Corollary 122. Let R ∈ Drat with d(R) > 0. Then for all r ∈ R the set R − r is divisible (cf. Definition 31).

3.3.2 Proofs of Theorem 35 and Proposition 36

Proof of Proposition 36. Suppose E ∈ D has positive density, R ∈ Drat and E is uniform relative to R. Our goal is to show that for all r ∈ R the set E − r is divisible.

It follows from Proposition 116 and Corollary 122 that for all $r \in R$ and $q \in \mathbb{N}$ the density $d((R - r) \cap q\mathbb{N}) = d(R \cap (q\mathbb{N} + r))$ exists and is positive. However, since the function $d(R)1_E - d(E)1_R$ is uniform, it follows from Proposition 94, part (b), that $d(R)1_{E \cap (q\mathbb{N}+r)} - d(E)1_{R \cap (q\mathbb{N}+r)}$ is uniform. Since all uniform functions have zero mean, we deduce that $d(E \cap (q\mathbb{N} + r))$ also exists and that

\[ d(R)\, d(E \cap (q\mathbb{N} + r)) - d(E)\, d(R \cap (q\mathbb{N} + r)) = 0. \]

Thus, it follows from $d(R \cap (q\mathbb{N} + r)) > 0$ that $d(E \cap (q\mathbb{N} + r)) > 0$. This proves that $E - r$ is divisible.

Example 123. Consider the multiplicative function

\[ f(n) := \begin{cases} 1, & \text{if } n = 2^k m, \text{ where } k \in \{0, 2, 4, 6, \ldots\} \text{ and } 2 \nmid m; \\ 0, & \text{otherwise.} \end{cases} \]

Clearly, $f$ is Besicovitch rationally almost periodic (see Corollary 82) and therefore the level set $E = E(f, 1) = \{n \in \mathbb{N} : f(n) = 1\}$ belongs to $D_{\mathrm{rat}}$.

Note that E − r is divisible for all r ∈ N ∪ {0}. This should be juxtaposed with the fact that for the set of squarefree numbers Q one has that Q − r is divisible if and only if r ∈ Q.

Next, we embark on the proof of Theorem 35. We will need the following lemma.

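Lemma 124 (Lemma 3.5, [FHK13]). Let $(X, \mathcal{B}, \mu, T)$ be an invertible measure preserving system, $\ell \in \mathbb{N}$, $p_1, \ldots, p_\ell \in \mathbb{Z}[x]$, $f_1, \ldots, f_\ell \in L^\infty(X, \mathcal{B}, \mu)$ bounded by $1$ and let $F : \mathbb{N} \to \mathbb{C}$ be bounded by $1$ as well. Then there exists an integer $s \in \mathbb{N}$, that only depends on $\ell$ and the maximal degree of the polynomials $p_1, \ldots, p_\ell$, such that

\[ \left\| \frac{1}{N} \sum_{n=1}^{N} F(n)\, T^{p_1(n)} f_1 \cdots T^{p_\ell(n)} f_\ell \right\|_{L^2(X, \mathcal{B}, \mu)} = O\big( \|F\|_{U^s[N]} \big) + o(1). \]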

Proof of Theorem 35. Let E ∈ D and r ∈ N ∪ {0}. It suffices to show that if E − r is divisible then E − r is an averaging set of polynomial multiple recurrence, since all the other implications formulated in Theorem 35 are obvious.

Thus, assume $E - r$ is divisible. Note that by Theorem 24 there exists $R \in D_{\mathrm{rat}}$ such that $E$ is uniform relative to $R$. According to Proposition 116, the set $R$ is rational. Moreover, it follows from $E - r \subset R - r$ that $R - r$ is divisible.

Let $(X, \mathcal{B}, \mu, T)$ be an arbitrary invertible measure preserving system, let $A \in \mathcal{B}$ with $\mu(A) > 0$ and let $p_i \in \mathbb{Z}[x]$, $i = 1, \ldots, \ell$ with $p_i(0) = 0$ be given. Using Lemma 124 and the fact that $d(R)1_E - d(E)1_R$ is uniform, we get that the limit

\[ \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_{E-r}(n)\, \mu\big(A \cap T^{-p_1(n)}A \cap \ldots \cap T^{-p_\ell(n)}A\big) \tag{3.3.5} \]

is the same as the limit

\[ \frac{d(E)}{d(R)} \lim_{N\to\infty} \frac{1}{N} \sum_{n=1}^{N} 1_{R-r}(n)\, \mu\big(A \cap T^{-p_1(n)}A \cap \ldots \cap T^{-p_\ell(n)}A\big) \tag{3.3.6} \]

(meaning that the limit in (3.3.5) exists if and only if the limit in (3.3.6) exists and then they are equal). Using Theorem 32 and the fact that $R - r$ is rational and divisible, we conclude that the limit in (3.3.6) exists and is positive. It follows that the limit in (3.3.5) exists and is positive. Hence $E - r$ is an averaging set of polynomial multiple recurrence.

3.4 Completing the proof of the Erdős sumset conjecture

In this section we follow [MRR18] and show how one can derive a proof of Theorem 40 from the two decomposition theorems, Theorem 13 and Theorem 16, proved in Chapter 2.

3.4.1 An ultrafilter reformulation of the Erdős sumset conjecture

For the proof of Theorem 40 we found it useful to rely on the theory of ultrafilters, which has proven to be very useful in solving problems in Ramsey theory in the past. In this subsection we recall briefly some of the basic definitions and facts that we will utilize in this section and then reduce Theorem 40 to a functional statement (see Theorem 131 below).

Readers in want of a friendly introduction to ultrafilters may well enjoy [Ber96, Section 3]; for the comprehensive treatment see [HS12].

An ultrafilter on $\mathbb{N}$ is any non-empty collection $p$ of subsets of $\mathbb{N}$ that is closed under intersections and supersets and satisfies $A \in p \iff \mathbb{N}\setminus A \notin p$ for every $A \subset \mathbb{N}$. Given $n \in \mathbb{N}$, the collection $p_n := \{A \subset \mathbb{N} : n \in A\}$ is an ultrafilter; ultrafilters of this kind are called principal. For the existence of non-principal ultrafilters, which follows from the axiom of choice, see [HS12, Theorem 3.8].

The set of all ultrafilters on $\mathbb{N}$ is denoted by $\beta\mathbb{N}$. Given $A \subset \mathbb{N}$ write $\mathrm{cl}(A) := \{p \in \beta\mathbb{N} : A \in p\}$ for the closure of $A$ in $\beta\mathbb{N}$. The family $\{\mathrm{cl}(A) : A \subset \mathbb{N}\}$ forms a base for a topology on $\beta\mathbb{N}$ with respect to which $\beta\mathbb{N}$ is a compact Hausdorff space. The map $n \mapsto p_n$ embeds $\mathbb{N}$ densely in $\beta\mathbb{N}$. Endowed with this topology, $\beta\mathbb{N}$ can be identified with the Stone–Čech compactification of $\mathbb{N}$, which means that it has the following universal property: for any function $f : \mathbb{N} \to K$ into a compact Hausdorff space $K$ there is a unique continuous function $\beta f : \beta\mathbb{N} \to K$ such that $(\beta f)(p_n) = f(n)$ for all $n \in \mathbb{N}$. When no confusion may arise we denote $p_n$ simply by $n$.

Given a function $f : \mathbb{N} \to K$ with $K$ a compact Hausdorff space and given an ultrafilter $p \in \beta\mathbb{N}$, one can characterize $(\beta f)(p)$ as the unique point $x$ in $K$ such that, for any neighborhood $U$ of $x$, the set $\{n \in \mathbb{N} : f(n) \in U\}$ belongs to $p$. For this reason we write

\[ (\beta f)(p) = \lim_{n \to p} f(n) \]

on occasion.

Given a set $A \subset \mathbb{N}$ we define

\[ A - p := \{n \in \mathbb{N} : A - n \in p\} \]

for all ultrafilters $p$ on $\mathbb{N}$. Addition on $\mathbb{N}$ can be extended to a binary operation $+$ on $\beta\mathbb{N}$ by

\[ p + q = \{A \subset \mathbb{N} : A - q \in p\} = \lim_{n \to p} \lim_{m \to q} (n + m) \]

for all $p, q$ in $\beta\mathbb{N}$. We remark that despite being represented with the symbol $+$, this operation is not commutative. We mention this operation only to give the following criterion for a set to contain $B + C$; the criterion will not be used in the proof of Theorem 40. This lemma was independently discovered by Di Nasso and a proof was presented in [ACG17, Proposition 3.1].
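For principal ultrafilters the definitions of $A - p$ and $p + q$ can be unwound completely, and doing so may help to build intuition. The sketch below models a subset of $\mathbb{N}$ as a membership predicate and the principal ultrafilter $p_n$ as the test "does the set contain $n$?". All names are hypothetical and chosen for this illustration; non-principal ultrafilters, whose existence relies on the axiom of choice, cannot be represented in this way.

```python
def principal(n):
    """The principal ultrafilter p_n, viewed as the test 'does the set contain n?'."""
    return lambda A: A(n)

def shift(A, n):
    """The set A - n = {m : m + n in A}, as a membership predicate."""
    return lambda m: A(m + n)

def minus(A, p):
    """The set A - p = {n : A - n in p}, as a membership predicate."""
    return lambda n: p(shift(A, n))

def add(p, q):
    """The ultrafilter p + q = {A : A - q in p}."""
    return lambda A: p(minus(A, q))

def multiples_of_3(m):
    return m % 3 == 0

p5, p7 = principal(5), principal(7)

# Unwinding the definitions: A - p_5 is the shifted set A - 5, and
# p_5 + p_7 contains A exactly when 12 belongs to A.
shifted_agrees = all(minus(multiples_of_3, p5)(m) == multiples_of_3(m + 5)
                     for m in range(1, 100))
sum_contains_A = add(p5, p7)(multiples_of_3)
```

In this model `add(principal(a), principal(b))` behaves like `principal(a + b)`, which reflects that the operation $+$ on $\beta\mathbb{N}$ extends ordinary addition on $\mathbb{N}$.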

Lemma 125 (cf. [MRR18, Lemma 5.1]). Fix $A \subset \mathbb{N}$. There are non-principal ultrafilters $p$ and $q$ with the property that $A \in p + q$ and $A \in q + p$ if and only if there are infinite sets $B, C \subset \mathbb{N}$ with $B + C \subset A$.

Here is the main theorem of this section, which is inspired by the proof of [DNGJLLM15, Theorem 3.2].

Theorem 126. Let $A \subset \mathbb{N}$. If there exist a Følner sequence $\Phi$ in $\mathbb{N}$ and a non-principal ultrafilter $p \in \beta\mathbb{N}$ such that $d_\Phi\big((A - n) \cap (A - p)\big)$ exists for all $n \in \mathbb{N}$ and

\[ \lim_{n \to p} d_\Phi\big((A - n) \cap (A - p)\big) > 0 \tag{3.4.1} \]

then there exist infinite sets $B, C \subset \mathbb{N}$ such that $A \supset B + C$.

The following result of Bergelson [Ber85] will be crucial for the proof of Theorem 126.

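Lemma 127 ([Ber85, Theorem 1.1]). Let $(X, \mathcal{B}, \mu)$ be a probability space and let $n \mapsto B_n$ be a sequence in $\mathcal{B}$. Assume that there exists $\varepsilon > 0$ such that $\mu(B_n) \geq \varepsilon$ for all $n \in \mathbb{N}$. Then there exists an injective map $\sigma : \mathbb{N} \to \mathbb{N}$ such that

\[ \mu\big(B_{\sigma(1)} \cap \cdots \cap B_{\sigma(n)}\big) > 0 \tag{3.4.2} \]

for every $n \in \mathbb{N}$.

Given a Følner sequence $\Phi$ on $\mathbb{N}$ write $M(\Phi)$ for the set of Radon probability measures on $\beta\mathbb{N}$ that are weak$^*$ limit points of the sequence

\[ N \mapsto \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \delta_n \]

of measures on $\beta\mathbb{N}$, where $\delta_n$ is the unit mass at the principal ultrafilter $p_n$.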

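Corollary 128. Let $\Phi$ be a Følner sequence on $\mathbb{N}$ and, for each $n \in \mathbb{N}$, let $A_n \subset \mathbb{N}$. Assume $d_\Phi(A_n)$ exists for all $n \in \mathbb{N}$ and that there exists $\varepsilon > 0$ such that $d_\Phi(A_n) > \varepsilon$ for all $n \in \mathbb{N}$. Then there exists an injective sequence $\sigma : \mathbb{N} \to \mathbb{N}$ such that

\[ d_\Phi\big(A_{\sigma(1)} \cap \cdots \cap A_{\sigma(n)}\big) > 0 \]

for every $n \in \mathbb{N}$.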

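Proof. Let $\mu \in M(\Phi)$ and let $B_n = \mathrm{cl}(A_n)$. The set $B_n$ is clopen and the density of $A_n$ with respect to $\Phi$ exists, so $\mu(B_n) = d_\Phi(A_n)$ for all $n \in \mathbb{N}$. Apply Lemma 127 to the probability space $(\beta\mathbb{N}, \mathcal{B}, \mu)$, where $\mathcal{B}$ is the Borel $\sigma$-algebra on $\beta\mathbb{N}$, to find an injective map $\sigma : \mathbb{N} \to \mathbb{N}$ such that (3.4.2) holds for every $n \in \mathbb{N}$. Since $B_{\sigma(1)} \cap \cdots \cap B_{\sigma(n)} = \mathrm{cl}(A_{\sigma(1)} \cap \cdots \cap A_{\sigma(n)})$, this implies that

\[ d_\Phi\big(A_{\sigma(1)} \cap \cdots \cap A_{\sigma(n)}\big) \geq \mu\big(B_{\sigma(1)} \cap \cdots \cap B_{\sigma(n)}\big) > 0 \]

as desired.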
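The identity $\mu(B_n) = d_\Phi(A_n)$ used in this proof can be unpacked as follows. This is only a sketch under the standing definitions: it uses that the indicator function of a clopen set is continuous on $\beta\mathbb{N}$, and that a weak$^*$ limit point of a sequence of measures assigns to each continuous function a cluster point of the corresponding sequence of integrals.

```latex
% For any A \subset \mathbb{N} and any \mu \in M(\Phi):
\begin{align*}
\delta_n(\mathrm{cl}(A)) &= 1_A(n), \\
\frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} \delta_n(\mathrm{cl}(A))
  &= \frac{|A \cap \Phi_N|}{|\Phi_N|}, \\
\liminf_{N \to \infty} \frac{|A \cap \Phi_N|}{|\Phi_N|}
  \;\le\; \mu(\mathrm{cl}(A))
  &\;\le\; \limsup_{N \to \infty} \frac{|A \cap \Phi_N|}{|\Phi_N|}.
\end{align*}
% If d_\Phi(A) exists, the two outer quantities coincide,
% and therefore \mu(\mathrm{cl}(A)) = d_\Phi(A).
```

When the density of a set is not known to exist, only the two-sided bound in the last line is available; presumably this is the sense in which the final inequality of the proof is meant for the intersections $A_{\sigma(1)} \cap \cdots \cap A_{\sigma(n)}$.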

The next proposition, whose statement (and proof) is heavily influenced by the paper [DNGJLLM15], can be seen as an ultrafilter-free version of Theorem 126.

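Proposition 129. Let $A \subset \mathbb{N}$. If there exist a Følner sequence $\Phi$ in $\mathbb{N}$, a set $L \subset \mathbb{N}$ and $\varepsilon > 0$ such that $d_\Phi\big((A - m) \cap L\big)$ exists for every $m \in \mathbb{N}$, and for every finite subset $F \subset L$

\[ \bigcap_{\ell \in F} (A - \ell) \cap \big\{ m \in \mathbb{N} : d_\Phi\big((A - m) \cap L\big) > \varepsilon \big\} \quad \text{is infinite} \tag{3.4.3} \]

then there exist infinite sets $B, C$ such that $A \supset B + C$.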

Proof. Let $F_1 \subset F_2 \subset \cdots$ be an increasing exhaustion of $L$ by finite subsets. Construct a sequence $n \mapsto e_n$ in $\mathbb{N}$ of distinct elements such that

\[ e_n \in \bigcap_{\ell \in F_n} (A - \ell) \cap \big\{ m \in \mathbb{N} : d_\Phi\big((A - m) \cap L\big) > \varepsilon \big\} \]

for each $n \in \mathbb{N}$. This can be done because each of the sets above is infinite by hypothesis. In particular $d_\Phi\big((A - e_n) \cap L\big) > \varepsilon$ for all $n \in \mathbb{N}$. The Bergelson intersectivity lemma (Corollary 128) implies that, for some subsequence $n \mapsto e_{\sigma(n)}$ of $e$, the intersection

\[ \big((A - e_{\sigma(1)}) \cap L\big) \cap \cdots \cap \big((A - e_{\sigma(n)}) \cap L\big) \]

is infinite for all $n \in \mathbb{N}$.

Choose $b_1 \in F_{\sigma(1)}$ and put $j_1 = 1$. Choose $c_1 = e_{\sigma(1)}$. Thus $c_1 \in A - b_1$. Next choose $b_2 \in (A - c_1) \cap L$ outside $F_{\sigma(1)}$ and let $j_2$ be minimal with $b_2 \in F_{\sigma(j_2)}$. (In particular $b_2$ is not equal to $b_1$.) Then choose $c_2 = e_{\sigma(j_2)} \in (A - b_1) \cap (A - b_2)$. Continue this process inductively, choosing

\[ b_{n+1} \in (A - c_1) \cap \cdots \cap (A - c_n) \cap L = (A - e_{\sigma(j_1)}) \cap \cdots \cap (A - e_{\sigma(j_n)}) \cap L \]

outside $F_{\sigma(j_n)}$, choosing $j_{n+1}$ minimal with $b_{n+1} \in F_{\sigma(j_{n+1})}$, and then choosing

\[ c_{n+1} = e_{\sigma(j_{n+1})} \in (A - b_1) \cap \cdots \cap (A - b_{n+1}), \]

which is distinct from $c_1, \ldots, c_n$ because $e$ is injective. Take $B = \{b_n : n \in \mathbb{N}\}$ and $C = \{c_n : n \in \mathbb{N}\}$ to conclude the proof.
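For the reader's convenience, the inclusion $B + C \subset A$ can be checked pair by pair. The following is a sketch using only the notation of the proof; it relies on the observation that $b_{k+1} \in F_{\sigma(j_{k+1})} \setminus F_{\sigma(j_k)}$ together with the nestedness of the sets $F_k$ forces $\sigma(j_{k+1}) > \sigma(j_k)$, so that $F_{\sigma(j_1)} \subset F_{\sigma(j_2)} \subset \cdots$.

```latex
% Verification that b_i + c_n \in A for all i, n \in \mathbb{N}.
\begin{align*}
i \le n:&\quad b_i \in F_{\sigma(j_i)} \subset F_{\sigma(j_n)}
  \ \text{ and }\
  c_n = e_{\sigma(j_n)} \in \bigcap_{\ell \in F_{\sigma(j_n)}} (A - \ell)
  \ \Longrightarrow\ b_i + c_n \in A, \\
i > n:&\quad b_i \in (A - c_1) \cap \cdots \cap (A - c_{i-1}) \cap L
  \ \Longrightarrow\ b_i + c_n \in A.
\end{align*}
% B is infinite because b_{k+1} \notin F_{\sigma(j_k)} \ni b_1, \dots, b_k;
% C is infinite because \sigma(j_k) is strictly increasing and e is injective.
```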

The proof of Theorem 126 is now quite straightforward.

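Proof of Theorem 126. Let $L = A - p = \{\ell \in \mathbb{N} : A - \ell \in p\}$ and let

\[ \varepsilon = \frac{1}{2} \lim_{n \to p} d_\Phi\big((A - n) \cap (A - p)\big). \]

Then the set $\{n \in \mathbb{N} : d_\Phi\big((A - n) \cap L\big) > \varepsilon\}$ is in $p$. Since $A - \ell \in p$ for every $\ell \in L$ and $p$ is closed under finite intersections, for any finite set $F \subset L$ the intersection

\[ \bigcap_{\ell \in F} (A - \ell) \cap \big\{ m \in \mathbb{N} : d_\Phi\big((A - m) \cap L\big) > \varepsilon \big\} \]

is also in $p$. Since $p$ is non-principal, this intersection cannot be finite. The desired conclusion now follows from Proposition 129.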

In view of Theorem 126, the proof of Theorem 40 now follows from the following theorem.

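Theorem 130. Let $A \subset \mathbb{N}$ and let $\Phi$ be a Følner sequence on $\mathbb{N}$ such that $d_\Phi(A)$ exists. For every $\varepsilon > 0$ there exist a Følner subsequence $\Psi$ of $\Phi$ and a non-principal ultrafilter $p \in \beta\mathbb{N}$ such that $d_\Psi\big((A - m) \cap (A - p)\big)$ exists for all $m \in \mathbb{N}$ and

\[ \lim_{m \to p} d_\Psi\big((A - m) \cap (A - p)\big) \geq d_\Psi(A)^2 - \varepsilon \tag{3.4.4} \]

holds.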

We conclude this section by reformulating Theorem 130 in functional analytic language. Given a bounded function $f : \mathbb{N} \to \mathbb{C}$ define, for all $m \in \mathbb{N}$, the shift $R^m f : \mathbb{N} \to \mathbb{C}$ by

\[ (R^m f)(n) := f(n + m) \]

for all $n \in \mathbb{N}$. We extend this to all $p \in \beta\mathbb{N}$ by defining the function $R^p f : \mathbb{N} \to \mathbb{C}$ by

\[ (R^p f)(n) := \lim_{m \to p} f(n + m) \]

for all $n \in \mathbb{N}$. Observe that $(R^p f)(n) = (\beta f)(n + p)$ and hence the indicator function of the set $A - p$ is the function $R^p 1_A$, where $1_A$ is the indicator function of $A$.

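Given a Følner sequence $\Phi$ in $\mathbb{N}$ and functions $f, h : \mathbb{N} \to \mathbb{C}$, recall the definition of the Besicovitch seminorm of $f$ with respect to $\Phi$ given in Section 1.2,

\[ \|f\|_\Phi = \left( \limsup_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} |f(n)|^2 \right)^{1/2} \tag{3.4.5} \]

and the definition of the inner product, from Section 2.1.1,

\[ \langle f, h \rangle_\Phi = \lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(n) \overline{h(n)} \]

whenever the limit exists.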

The following result, whose proof is given in Section 3.4.2 using the material of Section 2.1, implies Theorem 130 by choosing $f = 1_A$.

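Theorem 131. Let $f$ be a non-negative bounded function on $\mathbb{N}$ and let $\Phi$ be a Følner sequence on $\mathbb{N}$ such that $\langle 1, f \rangle_\Phi$ exists. For every $\varepsilon > 0$ there exist a subsequence $\Psi$ of $\Phi$ and a non-principal ultrafilter $p \in \beta\mathbb{N}$ such that $\langle R^m f, R^p f \rangle_\Psi$ exists for all $m \in \mathbb{N}$ and

\[ \lim_{m \to p} \langle R^m f, R^p f \rangle_\Psi \geq \langle 1, f \rangle_\Psi^2 - \varepsilon \tag{3.4.6} \]

holds.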
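To see why the choice $f = 1_A$ recovers Theorem 130, it may help to write out the dictionary between inner products and densities. The lines below are a sketch that uses only the definitions of $R^m$, $R^p$ and $\langle \cdot, \cdot \rangle_\Psi$ recalled above, together with the convention $A - m = \{n \in \mathbb{N} : n + m \in A\}$.

```latex
\begin{align*}
(R^m 1_A)(n) &= 1_A(n + m) = 1_{A - m}(n), \qquad R^p 1_A = 1_{A - p}, \\
\langle R^m 1_A, R^p 1_A \rangle_\Psi
  &= \lim_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} 1_{A - m}(n)\, 1_{A - p}(n)
   = d_\Psi\big((A - m) \cap (A - p)\big), \\
\langle 1, 1_A \rangle_\Psi
  &= \lim_{N \to \infty} \frac{|A \cap \Psi_N|}{|\Psi_N|} = d_\Psi(A).
\end{align*}
% Substituting these identities into (3.4.6) gives exactly (3.4.4).
```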

3.4.2 Proving the ultrafilter reformulation

In Section 3.4.1 we reduced the proof of Theorem 40 to Theorem 131. In this section we use the decomposition theorems 13 and 16 to finish the proof of Theorem 131.

The main result of this section is the following theorem, which gives us an ultrafilter satisfying several convenient properties.

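Theorem 132. Fix $\varepsilon > 0$ and a Følner sequence $\Phi$ on $\mathbb{N}$. Given $f_{\mathrm{Bes}} \in \mathrm{Bes}(\mathbb{N}, \Phi)$ bounded and non-negative, $f_{\mathrm{Coper}} \in \mathrm{Coper}(\mathbb{N}, \Phi)$ real-valued, and $f_{\mathrm{Comp}} \in \mathrm{Comp}(\mathbb{N}, \Phi)$ bounded and non-negative, one can find a subsequence $\Psi$ of $\Phi$ and an ultrafilter $p \in \beta\mathbb{N}$ such that:

U1. $d_\Psi(E) > 0$ for all $E \in p$;

U2. $\{n \in \mathbb{N} : \|R^n f_{\mathrm{Comp}} - f_{\mathrm{Comp}}\|_\Psi < \frac{\varepsilon}{3}\} \in p$;

U3. $\|R^p f_{\mathrm{Bes}} - f_{\mathrm{Bes}}\|_\Psi < \frac{\varepsilon}{3}$;

U4. $\langle f_{\mathrm{Comp}}, R^p f_{\mathrm{Coper}} \rangle_\Psi$ is non-negative.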

The proof of Theorem 132 is given in Section 3.4.3. For now we show how, together with the decompositions Theorem 13 and Theorem 16, it implies Theorem 131.

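Proof of Theorem 131. Fix a bounded, non-negative function $f : \mathbb{N} \to \mathbb{R}$ and a Følner sequence $\Phi$ on $\mathbb{N}$ such that $\langle 1, f \rangle_\Phi$ exists. Fix also $\varepsilon > 0$. Our goal is to find a subsequence $\Psi$ of $\Phi$ and a non-principal ultrafilter $p \in \beta\mathbb{N}$ such that

\[ \lim_{n \to p} \langle R^n f, R^p f \rangle_\Psi \geq \langle 1, f \rangle_\Psi^2 - \varepsilon \]

holds.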

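Apply Theorem 13 and Theorem 16 to obtain, after passing to a subsequence $\Psi$ of $\Phi$, decompositions $f = f_{\mathrm{Bes}} + f_{\mathrm{Coper}}$ and $f = f_{\mathrm{Comp}} + f_{\mathrm{WM}}$. Since $f$ is bounded and non-negative, so are $f_{\mathrm{Bes}}$ and $f_{\mathrm{Comp}}$. In particular, $f_{\mathrm{Coper}}$ is real-valued. Next we apply Theorem 132 to get a finer subsequence, still denoted $\Psi$, and an ultrafilter $p$ satisfying U1 through U4. Finally, pass once more to a subsequence of $\Psi$ such that the inner products $\langle f_{\mathrm{Comp}}, f_{\mathrm{Bes}} \rangle_\Psi$, $\langle R^n f_{\mathrm{WM}}, R^p f \rangle_\Psi$, $\langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi$ and $\langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Coper}} \rangle_\Psi$ exist for all $n \in \mathbb{N} \cup \{0\}$. We then have

\[ \langle R^n f, R^p f \rangle_\Psi = \langle R^n f_{\mathrm{WM}}, R^p f \rangle_\Psi + \langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi + \langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Coper}} \rangle_\Psi \]

for all $n \in \mathbb{N}$. We claim that

\[ \lim_{n \to p} \langle R^n f_{\mathrm{WM}}, R^p f \rangle_\Psi = 0 \tag{3.4.7} \]

\[ \lim_{n \to p} \langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Coper}} \rangle_\Psi \geq -\frac{\varepsilon}{3} \tag{3.4.8} \]

\[ \lim_{n \to p} \langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi \geq \langle 1, f \rangle_\Psi^2 - \frac{2\varepsilon}{3} \tag{3.4.9} \]

are all true for our $p$. Once (3.4.7), (3.4.8) and (3.4.9) have been established, (3.4.6) follows immediately and the proof is complete.

Let us first show (3.4.7). Since $f_{\mathrm{WM}}$ is weakly mixing with respect to $\Psi$ we have, for every $\delta > 0$, that the set $\{n \in \mathbb{N} : |\langle R^n f_{\mathrm{WM}}, R^p f \rangle_\Psi| \geq \delta\}$ has zero density with respect to $\Psi$. It therefore does not belong to $p$ by U1. It follows that $\{n \in \mathbb{N} : |\langle R^n f_{\mathrm{WM}}, R^p f \rangle_\Psi| < \delta\}$ belongs to $p$ for all $\delta > 0$, giving (3.4.7).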

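For the proof of (3.4.8) and (3.4.9) note that in light of U2

\[ \lim_{n \to p} \langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Coper}} \rangle_\Psi \geq \langle f_{\mathrm{Comp}}, R^p f_{\mathrm{Coper}} \rangle_\Psi - \frac{\varepsilon}{3} \tag{3.4.10} \]

and

\[ \lim_{n \to p} \langle R^n f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi \geq \langle f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi - \frac{\varepsilon}{3}. \]

Thus (3.4.8) follows immediately from (3.4.10) and U4. Also, to prove (3.4.9) it suffices to show that

\[ \langle f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi \geq \langle 1, f \rangle_\Psi^2 - \frac{\varepsilon}{3} \tag{3.4.11} \]

holds. By U3 we have

\[ \langle f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi \geq \langle f_{\mathrm{Comp}}, f_{\mathrm{Bes}} \rangle_\Psi - \frac{\varepsilon}{3} \]

while $\langle f_{\mathrm{Comp}}, f_{\mathrm{Bes}} \rangle_\Psi = \|f_{\mathrm{Bes}}\|_\Psi^2 + \langle f_{\mathrm{Comp}} - f_{\mathrm{Bes}}, f_{\mathrm{Bes}} \rangle_\Psi$. Since $f_{\mathrm{Comp}} - f_{\mathrm{Bes}} = f_{\mathrm{Coper}} - f_{\mathrm{WM}}$ and every weak mixing function belongs to $\mathrm{Coper}(\mathbb{N}, \Psi)$, it follows that $\langle f_{\mathrm{Comp}} - f_{\mathrm{Bes}}, f_{\mathrm{Bes}} \rangle_\Psi = \langle f_{\mathrm{Coper}} - f_{\mathrm{WM}}, f_{\mathrm{Bes}} \rangle_\Psi = 0$ and hence $\langle f_{\mathrm{Comp}}, f_{\mathrm{Bes}} \rangle_\Psi = \|f_{\mathrm{Bes}}\|_\Psi^2$. Finally, we apply the Cauchy-Schwarz inequality to deduce that $\|f_{\mathrm{Bes}}\|_\Psi^2 \geq \langle 1, f_{\mathrm{Bes}} \rangle_\Psi^2$ and, using $\langle 1, f_{\mathrm{Coper}} \rangle_\Psi = 0$, we get $\langle 1, f_{\mathrm{Bes}} \rangle_\Psi^2 = \langle 1, f \rangle_\Psi^2$. This implies (3.4.11) and finishes the proof.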
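The bookkeeping at the end of this proof can be collected into two displays. This is only a summary sketch of the steps already given, using the same symbols; the one extra fact used is $\|1\|_\Psi = 1$ in the Cauchy-Schwarz step.

```latex
% Chain of estimates behind (3.4.11):
\begin{align*}
\langle f_{\mathrm{Comp}}, R^p f_{\mathrm{Bes}} \rangle_\Psi
  &\ge \langle f_{\mathrm{Comp}}, f_{\mathrm{Bes}} \rangle_\Psi - \tfrac{\varepsilon}{3}
     && \text{(by U3)} \\
  &= \| f_{\mathrm{Bes}} \|_\Psi^2 - \tfrac{\varepsilon}{3}
     && \text{(orthogonality of } f_{\mathrm{Coper}} - f_{\mathrm{WM}} \text{ to } f_{\mathrm{Bes}}) \\
  &\ge \langle 1, f_{\mathrm{Bes}} \rangle_\Psi^2 - \tfrac{\varepsilon}{3}
     && \text{(Cauchy-Schwarz, } \|1\|_\Psi = 1) \\
  &= \langle 1, f \rangle_\Psi^2 - \tfrac{\varepsilon}{3}
     && (\langle 1, f_{\mathrm{Coper}} \rangle_\Psi = 0).
\end{align*}
% Adding (3.4.7), (3.4.8) and (3.4.9) then yields (3.4.6):
\begin{align*}
\lim_{n \to p} \langle R^n f, R^p f \rangle_\Psi
  \;\ge\; 0 \;-\; \tfrac{\varepsilon}{3} \;+\; \Big( \langle 1, f \rangle_\Psi^2 - \tfrac{2\varepsilon}{3} \Big)
  \;=\; \langle 1, f \rangle_\Psi^2 - \varepsilon .
\end{align*}
```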

3.4.3 Establishing properties U1–U4

We begin with some preparatory definitions.

Definition 133. Given a Følner sequence Φ on N we say an ultrafilter p is Φ essential if dΦ(E) > 0 for every E ∈ p. Write Ess(Φ) for the set of Φ essential ultrafilters on N.

Recall from Section 3.4.1 the set M(Φ).

Definition 134. A Borel measurable property of ultrafilters is said to hold $\Psi$ almost everywhere if the set of ultrafilters $p$ with the property has full measure with respect to every $\mu \in M(\Psi)$.

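Lemma 135. Let $\Psi$ be a Følner sequence on $\mathbb{N}$. Then $\Psi$ almost every $p$ belongs to $\mathrm{Ess}(\Psi)$.

Proof. First, observe that

\[ \mathrm{Ess}(\Psi) = \bigcap_{E \subset \mathbb{N} :\, d_\Psi(E) = 0} \mathrm{cl}(\mathbb{N} \setminus E) = \beta\mathbb{N} \setminus \bigcup_{E \subset \mathbb{N} :\, d_\Psi(E) = 0} \mathrm{cl}(E) \]

so that it is a closed set (and hence Borel). Fix $\mu \in M(\Psi)$. We claim that the support of $\mu$ is contained in $\mathrm{Ess}(\Psi)$. Since $\mu$ is Radon this implies $\mu(\mathrm{Ess}(\Psi)) = 1$ as desired.

To prove the claim, fix $p \in \beta\mathbb{N} \setminus \mathrm{Ess}(\Psi)$. We need to show that there exists an open set $U \subset \beta\mathbb{N}$ containing $p$ such that $\mu(U) = 0$. But since $p \in \beta\mathbb{N} \setminus \mathrm{Ess}(\Psi)$, there exists $E \subset \mathbb{N}$ with $d_\Psi(E) = 0$ and $p \in \mathrm{cl}(E)$. The set $\mathrm{cl}(E)$ is then an open subset of $\beta\mathbb{N}$ containing $p$ and with $\mu(\mathrm{cl}(E)) \leq d_\Psi(E) = 0$.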

We will need also the following corollary about compact functions.

Definition 136. A Bohr set on $\mathbb{N}$ is any set of the form $a^{-1}(U)$ where $a$ is a homomorphism from $\mathbb{N}$ into a compact abelian group $K$ and $U \subset K$ is open. A Bohr set is a Bohr$_0$ set if $U$ contains the identity element of $K$.

Observe that the intersection of two Bohr$_0$ sets is still a Bohr$_0$ set.

Lemma 137. For every function $f \in L^2(\mathbb{N}, \Phi)$ which is compact with respect to $\Phi$ and every $\varepsilon > 0$, the set $\{n \in \mathbb{N} : \|R^n f - f\|_\Phi < \varepsilon\}$ is a Bohr$_0$ set.

Proof. Let $g(n) = \|R^{|n|} f - f\|_\Phi$ for every $n \in \mathbb{Z}$. Since $f$ is compact with respect to $\Phi$ it follows that the set $\{R^k g : k \in \mathbb{Z}\}$ has compact closure in the norm topology of $C(\mathbb{Z})$. By [BJM78, Remark 9.8] the function $g$ is of the form $\varphi \circ a$ for some continuous function $\varphi$ on a compact abelian group $K$ and some group homomorphism $a : \mathbb{Z} \to K$.

The following two theorems, proved in subsequent subsections, will be used in the proof of Theorem 132. The first, which will be used to guarantee U3, relies on the pointwise ergodic theorem. Its proof can be found in Section 3.4.4.

Theorem 138. Let $\Phi$ be a Følner sequence on $\mathbb{N}$ and let $f \in \mathrm{Bes}(\mathbb{N}, \Phi)$. For every $\varepsilon > 0$ there exist a Bohr$_0$ set $B$ and a subsequence $\Psi$ of $\Phi$ such that for $\Psi$ almost every ultrafilter $p \in \mathrm{cl}(B)$ we have $\|R^p f - f\|_\Psi < \varepsilon$.

The second is a modification of an argument due to Beiglböck [Bei11, Lemma 2] and will be used to guarantee U4. Its proof is given in Section 3.4.5.

Theorem 139. Suppose $f$ is a real-valued bounded function that belongs to $\mathrm{Coper}(\mathbb{N}, \Psi)$. Then for every non-empty Bohr set $B \subset \mathbb{N}$ and every bounded function $h : \mathbb{N} \to \mathbb{R}$ the set

\[ \left\{ p \in \mathrm{cl}(B) : \limsup_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{m \in \Psi_N} h(m) (R^p f)(m) \geq 0 \right\} \tag{3.4.12} \]

has positive measure with respect to every $\mu \in M(\Psi)$.

With these theorems we can give the proof of Theorem 132.

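Proof of Theorem 132. Fix $\varepsilon > 0$ and a Følner sequence $\Phi$ on $\mathbb{N}$ with respect to which $f_{\mathrm{Comp}} \in L^2(\mathbb{N}, \Phi)$ is compact, $f_{\mathrm{Bes}} \in L^2(\mathbb{N}, \Phi)$ is Besicovitch, and $f_{\mathrm{Coper}} \in \mathrm{Coper}(\mathbb{N}, \Phi)$. We need to find a subsequence $\Psi$ of $\Phi$ and an ultrafilter $p$ such that U1 through U4 are satisfied.

Lemma 137 gives that $B_{\mathrm{Comp}} = \{n \in \mathbb{N} : \|R^n f_{\mathrm{Comp}} - f_{\mathrm{Comp}}\|_\Phi < \frac{\varepsilon}{3}\}$ is a Bohr$_0$ set. Theorem 138 implies that, passing to a subsequence $\Psi$ of $\Phi$, there exists a Bohr$_0$ set $B_{\mathrm{Bes}}$ such that for $\Psi$ almost every $p \in \mathrm{cl}(B_{\mathrm{Bes}})$ we have $\|R^p f_{\mathrm{Bes}} - f_{\mathrm{Bes}}\|_\Psi < \varepsilon/3$. Let $B := B_{\mathrm{Comp}} \cap B_{\mathrm{Bes}}$. Note that $B$ is a Bohr$_0$ set and $\Psi$ almost any $p \in \mathrm{cl}(B)$ satisfies U2 and U3. Applying Theorem 139 with $f = f_{\mathrm{Coper}}$ and $h = f_{\mathrm{Comp}}$ we deduce that the set

\[ \left\{ p \in \mathrm{cl}(B) : \limsup_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{m \in \Psi_N} f_{\mathrm{Comp}}(m) (R^p f_{\mathrm{Coper}})(m) \geq 0 \right\} \tag{3.4.13} \]

has positive measure for any $\mu \in M(\Psi)$. Notice that any $p$ in the set (3.4.13) satisfies U4. Since any such $p$ belongs to $\mathrm{cl}(B)$, it follows that $\Psi$ almost every $p$ in the set (3.4.13) satisfies U2, U3 and U4.

Finally, in view of Lemma 135, $\Psi$ almost every $p \in \beta\mathbb{N}$ satisfies U1. This means that $\Psi$ almost every $p$ in the set (3.4.13) satisfies U1, U2, U3, and U4. Since the set (3.4.13) has positive measure with respect to any $\mu \in M(\Psi)$, at least one such $p$ exists.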

3.4.4 An application of the pointwise ergodic theorem

In this section we present a proof of Theorem 138. We start with the following lemma.

Lemma 140. Let $\Phi$ be a Følner sequence on $\mathbb{N}$. If $a$ is a trigonometric polynomial and $p \in \beta\mathbb{N}$ then $R^p a$ is a trigonometric polynomial and $\|R^p a\|_\Phi = \|a\|_\Phi$.

Proof. Choose $c_1, \ldots, c_J \in \mathbb{C}$ and $\theta_1, \ldots, \theta_J \in \mathbb{R}$ such that $a$ has the form

\[ a(n) = \sum_{j=1}^{J} c_j e^{2\pi i \theta_j n}. \tag{3.4.14} \]

Define $d_j := \lim_{m \to p} c_j e^{2\pi i \theta_j m}$. Notice that

\[ (R^p a)(n) = \sum_{j=1}^{J} d_j e^{2\pi i \theta_j n} \]

and, since $|c_j| = |d_j|$, it follows that $\|R^p a\|_\Phi = \|a\|_\Phi$.
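The two steps of this proof can be spelled out as follows. This is a sketch under one additional assumption that the text does not state explicitly: the frequencies $\theta_1, \ldots, \theta_J$ are taken pairwise distinct modulo 1 (terms with equal frequencies can be combined beforehand), so that the exponentials $n \mapsto e^{2\pi i \theta_j n}$ are orthonormal with respect to any Følner sequence.

```latex
% Step 1: the p-limit passes through the finite sum.
\begin{align*}
(R^p a)(n)
  = \lim_{m \to p} \sum_{j=1}^{J} c_j e^{2\pi i \theta_j (n + m)}
  = \sum_{j=1}^{J} \Big( \lim_{m \to p} c_j e^{2\pi i \theta_j m} \Big) e^{2\pi i \theta_j n}
  = \sum_{j=1}^{J} d_j e^{2\pi i \theta_j n},
  \qquad |d_j| = |c_j| .
\end{align*}
% Step 2 (assuming the \theta_j are pairwise distinct mod 1): for \theta \notin \mathbb{Z},
% the averages of e^{2\pi i \theta n} over \Phi_N tend to 0, hence
\begin{align*}
\| a \|_\Phi^2 = \sum_{j=1}^{J} |c_j|^2 = \sum_{j=1}^{J} |d_j|^2 = \| R^p a \|_\Phi^2 .
\end{align*}
```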

We will also need a version of the pointwise ergodic theorem. There are Følner sequences for which the pointwise ergodic theorem does not hold [AJ75]. However, every Følner sequence has a subsequence along which the pointwise ergodic theorem holds.

Definition 141. A Følner sequence $\Phi$ is called tempered if there exists $C > 0$ such that

\[ \left| \bigcup_{k=1}^{N} (\Phi_{N+1} - \Phi_k) \right| \leq C |\Phi_{N+1}| \]

for every $N \in \mathbb{N}$, where $\Phi_{N+1} - \Phi_k$ is the set of differences.

According to [Lin01, Proposition 1.4], every Følner sequence has a tempered subsequence. Here is the pointwise ergodic theorem of Lindenstrauss.

Theorem 142 (Lindenstrauss’ ergodic theorem [Lin01, Theorem 1.2]). Let $(X, \mu, T)$ be a measure preserving system and let $\Phi$ be a tempered Følner sequence. Then for every $f \in L^1(X)$ there exists an invariant function $\tilde{f} \in L^1(X)$ such that

\[ \lim_{N \to \infty} \frac{1}{|\Phi_N|} \sum_{n \in \Phi_N} f(T^n x) = \tilde{f}(x) \quad \text{a.e.} \]

Theorem 143. Let $\Phi$ be a Følner sequence on $\mathbb{N}$ and let $h \in \mathrm{Bes}(\mathbb{N}, \Phi)$ be bounded. Then there is a subsequence $\Psi$ of $\Phi$ with $\|R^p h\|_\Psi = \|h\|_\Psi$ for $\Psi$ almost every $p$.

Proof. First we pass to a tempered subsequence $\Psi$ of $\Phi$. For each $j \geq 1$ let $a_j$ be a trigonometric polynomial such that $\|h - a_j\|_\Psi \to 0$ as $j \to \infty$. Apply Lemma 47 to find a compact metric space $X$, a continuous map $S : X \to X$, a point $x \in X$ with a dense orbit under $S$ and functions $H, F_1, F_2, \ldots$ in $C(X)$ such that $a_j(n) = F_j(S^n x)$ and $h(n) = H(S^n x)$ for all $j, n \in \mathbb{N}$.

For each $p \in \beta\mathbb{N}$ define the map $S^p : X \to X$ by $S^p x = \lim_{n \to p} S^n x$ and notice that

\[ (R^p a_j)(n) = \lim_{m \to p} a_j(n + m) = \lim_{m \to p} F_j(S^n S^m x) = F_j(S^n S^p x) \tag{3.4.15} \]

for every $j, n \in \mathbb{N}$. We similarly have

\[ (R^p h)(n) = H(S^n S^p x) \tag{3.4.16} \]

for all $n \in \mathbb{N}$.

The map $\pi : \beta\mathbb{N} \to X$ defined by $p \mapsto S^p x$ is continuous and surjective. Lemma 140 and (3.4.15) imply that for every $y = S^p x \in X$ we have

\begin{align*}
\lim_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} |F_j(S^n y)|^2
  &= \lim_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} |F_j(S^n S^p x)|^2 \\
  &= \lim_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} |R^p a_j(n)|^2 \\
  &= \|R^p a_j\|_\Psi^2 \\
  &= \|a_j\|_\Psi^2.
\end{align*}

Let $\mu \in M(\Psi)$ be arbitrary and let $\nu = \pi_* \mu$ be its pushforward under $\pi$. By the mean ergodic theorem, for each $j$ the orthogonal projection in $L^2(X, \nu)$ of $|F_j|^2$ to the subspace of invariant functions is constant and therefore equal to the integral of $|F_j|^2$. The hypothesis $\|a_j - h\|_\Psi \to 0$ as $j \to \infty$ implies that $\|F_j - H\|_\nu \to 0$ and therefore the orthogonal projection of $|H|^2$ to the space of invariant functions is also equal to its integral.

Finally, we apply the pointwise ergodic theorem (Theorem 142) to deduce that

\[ \lim_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} |H(S^n y)|^2 = \int_X |H|^2 \, d\nu \]

for $\nu$ almost every $y \in X$. Since $H(S^n y) = H(S^n S^p x)$ for $y = S^p x$, it follows from (3.4.16) that $\|R^p h\|_\Psi = \|h\|_\Psi$ for $\mu$ almost every $p \in \beta\mathbb{N}$, as desired.

We are now ready to finish the proof of Theorem 138.

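Proof of Theorem 138. Let $\Phi$ be a Følner sequence on $\mathbb{N}$, let $f \in \mathrm{Bes}(\mathbb{N}, \Phi)$ and let $\varepsilon > 0$. Let $a$ be a trigonometric polynomial such that $\|f - a\|_\Phi < \varepsilon/3$. Notice that $f - a \in \mathrm{Bes}(\mathbb{N}, \Phi)$ and hence using Theorem 143 we can find a subsequence $\Psi$ of $\Phi$ such that for $\Psi$ almost every $p \in \beta\mathbb{N}$

\[ \|R^p f - f\|_\Psi \leq \|R^p (f - a)\|_\Psi + \|R^p a - a\|_\Psi + \|a - f\|_\Psi \leq \|R^p a - a\|_\Psi + \frac{2\varepsilon}{3}. \]

It now suffices to find a Bohr$_0$ set $B$ such that for every $p \in \mathrm{cl}(B)$ we have $\|R^p a - a\|_\Psi \leq \varepsilon/3$.

Write $a(n) = \sum_{j=1}^{J} c_j e^{2\pi i n \theta_j}$ for some $c_1, \ldots, c_J \in \mathbb{C}$ and $0 \leq \theta_1, \ldots, \theta_J < 1$. Let $M = \max_j |c_j|$ and let $\alpha : \mathbb{N} \to \mathbb{T}^J$ be the homomorphism $\alpha(n) = (n\theta_1, \ldots, n\theta_J)$ (where $\mathbb{T}$ is the torus $\mathbb{R}/\mathbb{Z}$ as usual). Let $U = \left( -\frac{\varepsilon}{3MJ}, \frac{\varepsilon}{3MJ} \right)^J \subset \mathbb{T}^J$ and let $B = \alpha^{-1}(U)$. Then $B$ is a Bohr$_0$ set. Notice that for every $m \in B$ and every $n \in \mathbb{N}$,

\[ \left| (R^m a)(n) - a(n) \right| = \left| \sum_{j=1}^{J} c_j e^{2\pi i n \theta_j} \left( e^{2\pi i m \theta_j} - 1 \right) \right| < \frac{\varepsilon}{3} \tag{3.4.17} \]

holds. Finally, let $p \in \mathrm{cl}(B)$. In view of (3.4.17), $|(R^p a)(n) - a(n)| \leq \varepsilon/3$ for every $n \in \mathbb{N}$, and therefore also $\|R^p a - a\|_\Psi \leq \varepsilon/3$.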
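The estimate behind (3.4.17) can be traced through the elementary bound $|e^{2\pi i t} - 1| \leq 2\pi \|t\|_{\mathbb{T}}$, where $\|t\|_{\mathbb{T}}$ denotes the distance from $t$ to the nearest integer. In the sketch below, $\delta$ is notation introduced here (it does not appear in the text) for the half-width of the neighbourhood $U$, so that $m \in B$ means $\|m\theta_j\|_{\mathbb{T}} < \delta$ for every $j$.

```latex
\begin{align*}
\left| (R^m a)(n) - a(n) \right|
  &\le \sum_{j=1}^{J} |c_j| \, \big| e^{2\pi i m \theta_j} - 1 \big|
   \le \sum_{j=1}^{J} M \cdot 2\pi \| m \theta_j \|_{\mathbb{T}}
   < 2\pi J M \delta .
\end{align*}
% Hence the bound \varepsilon/3 is guaranteed as soon as 2\pi J M \delta \le \varepsilon/3.
```

Shrinking $U$ if necessary does not affect the rest of the argument, since any smaller open neighbourhood of the identity in $\mathbb{T}^J$ still yields a Bohr$_0$ set.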

3.4.5 A variant of an argument of Beiglböck

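This subsection is devoted to the proof of Theorem 139. The ideas used in this proof were motivated by the proof of [Bei11, Lemma 2].

Proof of Theorem 139. Let $\mu \in M(\Psi)$. Since $B$ is a Bohr set, we have that $d_\Psi(B)$ exists and is positive. It follows that $\mu(\mathrm{cl}(B)) = d_\Psi(B) > 0$. Define a new probability measure $\mu_B$ on $\beta\mathbb{N}$ by

\[ \mu_B(\Omega) := \frac{\mu(\Omega \cap \mathrm{cl}(B))}{\mu(\mathrm{cl}(B))} \]

for all Borel sets $\Omega \subset \beta\mathbb{N}$.

For each $n \in \mathbb{N}$ the map $p \mapsto (R^p f)(n) = \lim_{m \to p} f(n + m)$ from $\beta\mathbb{N}$ to $\mathbb{R}$ is continuous, and hence measurable. Therefore also the map

\[ p \mapsto \limsup_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} h(n) (R^p f)(n) \]

is measurable. In order to show that the set in (3.4.12) has positive measure it suffices to establish the inequality

\[ \int_{\beta\mathbb{N}} \limsup_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} h(n) (R^p f)(n) \, d\mu_B(p) \geq 0. \]

Using Fatou’s lemma it thus suffices to prove that

\[ \limsup_{N \to \infty} \frac{1}{|\Psi_N|} \sum_{n \in \Psi_N} h(n) \int_{\beta\mathbb{N}} (R^p f)(n) \, d\mu_B(p) \geq 0. \tag{3.4.18} \]

Notice that

\begin{align*}
\left| \int_{\beta\mathbb{N}} (R^p f)(n) \, d\mu_B(p) \right|
  &= \frac{1}{\mu(\mathrm{cl}(B))} \left| \int_{\beta\mathbb{N}} 1_{\mathrm{cl}(B)}(p) (R^p f)(n) \, d\mu(p) \right| \\
  &\leq \frac{1}{\mu(\mathrm{cl}(B))} \limsup_{N \to \infty} \left| \frac{1}{|\Psi_N|} \sum_{m \in \Psi_N} 1_B(m) f(n + m) \right| \\
  &= \frac{1}{\mu(\mathrm{cl}(B))} \limsup_{N \to \infty} \left| \frac{1}{|\Psi_N|} \sum_{m \in \Psi_N} 1_{B + n}(m) f(m) \right|.
\end{align*}

Since $f \in \mathrm{Bes}(\mathbb{N}, \Psi)^{\perp}$ and $m \mapsto 1_{B + n}(m)$ is Besicovitch almost periodic with respect to $\Psi$, we conclude that

\[ \limsup_{N \to \infty} \left| \frac{1}{|\Psi_N|} \sum_{m \in \Psi_N} 1_{B + n}(m) f(m) \right| = 0 \]

and therefore

\[ \int_{\beta\mathbb{N}} (R^p f)(n) \, d\mu_B(p) = 0 \]

for every $n \in \mathbb{N}$. This implies (3.4.18) and finishes the proof.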

BIBLIOGRAPHY

[ACG17] U. Andrews, G. Conant, and I. Goldbring. “Definable sets containing prod-

uctsets in expansions of groups”. In: ArXiv e-prints (Jan. 2017). arXiv:

1701.07791 [math.LO].

[AGH63] L. Auslander, L. Green, and F. Hahn. Flows on homogeneous spaces. With

the assistance of L. Markus and W. Massey, and an appendix by L. Green-

berg. Annals of Mathematics Studies, No. 53. Princeton University Press,

Princeton, N.J., 1963, pp. vii+107.

[AJ75] M. A. Akcoglu and A. del Junco. “Convergence of averages of point trans-

formations”. In: Proc. Amer. Math. Soc. 49 (1975), pp. 265–266. issn: 0002-9939.

[Bei11] M. Beiglböck. “An ultrafilter approach to Jin’s theorem”. In: Israel J.

Math. 185 (2011), pp. 369–374. issn: 0021-2172.

[Ber85] V. Bergelson. “Sets of recurrence of $\mathbb{Z}^m$-actions and properties of sets of differences in $\mathbb{Z}^m$”. In: J. London Math. Soc. (2) 31.2 (1985), pp. 295–304. issn: 0024-6107.

[Ber87] V. Bergelson. “Ergodic Ramsey theory”. In: Logic and combinatorics (Arcata, Calif., 1985). Vol. 65. Contemp. Math. Amer. Math. Soc., Providence, RI, 1987, pp. 63–87.

[Ber96] V. Bergelson. “Ergodic Ramsey theory – an update”. In: Ergodic theory of $\mathbb{Z}^d$ actions (Warwick, 1993–1994). Vol. 228. London Math. Soc. Lecture Note Ser. Cambridge Univ. Press, Cambridge, 1996, pp. 1–61.

[Bes26] A. S. Besicovitch. “On Generalized Almost Periodic Functions”. In: Proc. London Math. Soc. (2) 25 (1926), pp. 495–512. issn: 0024-6115.

[Bes55] A. S. Besicovitch. Almost periodic functions. Dover Publications, Inc., New

York, 1955, pp. xiii+180.

[BH96] V. Bergelson and I. J. Håland. “Sets of recurrence and generalized polyno-

mials”. In: Convergence in ergodic theory and probability (Columbus, OH,

1993). Vol. 5. Ohio State Univ. Math. Res. Inst. Publ. de Gruyter, Berlin,

1996, pp. 91–110.

[BHK05] V. Bergelson, B. Host, and B. Kra. “Multiple recurrence and nilsequences”.

In: Invent. Math. 160.2 (2005). With an appendix by Imre Ruzsa, pp. 261–

303. issn: 0020-9910.

[BHMP00] V. Bergelson, B. Host, R. McCutcheon, and F. Parreau. “Aspects of uni-

formity in recurrence”. In: Colloq. Math. 84/85.part 2 (2000). Dedicated

to the memory of Anzelm Iwanik, pp. 549–576. issn: 0010-1354.

[BJM78] J. F. Berglund, H. D. Junghenn, and P. Milnes. Compact right topological semigroups and generalizations of almost periodicity. Vol. 663. Lecture Notes in Mathematics. Springer, Berlin, 1978, pp. x+243. isbn: 3-540-08919-5.

[BKPLR16] V. Bergelson, J. Kułaga-Przymus, M. Lemańczyk, and F. K. Richter.

“Rationally almost periodic sequences, polynomial multiple recurrence

and symbolic dynamics”. In: ArXiv e-prints (2016). arXiv: 1611.08392

[math.DS].

[BKPLR17] V. Bergelson, J. Kułaga-Przymus, M. Lemańczyk, and F. K. Richter. “A

structure theorem for level sets of multiplicative functions and applica-

tions”. In: ArXiv e-prints (2017). arXiv: 1705.07322 [math.DS].

[BL85] A. Bellow and V. Losert. “The weighted pointwise ergodic theorem and the individual ergodic theorem along subsequences”. In: Trans. Amer. Math. Soc. 288.1 (1985), pp. 307–345. issn: 0002-9947.

[BL96] V. Bergelson and A. Leibman. “Polynomial extensions of van der Waer-

den’s and Szemerédi’s theorems”. In: J. Amer. Math. Soc. 9.3 (1996),

pp. 725–753. issn: 0894-0347.

[BMR17] V. Bergelson, J. Moreira, and F. K. Richter. “Single and multiple recur-

rence along non-polynomial sequences”. In: ArXiv e-prints (2017).

[Boh25a] H. Bohr. “Zur Theorie der Fastperiodischen Funktionen”. In: Acta. Math.

45.1 (1925). I. Einer Verallgemeinerung der Theorie der Fourierreihen,

pp. 29–127.

[Boh25b] H. Bohr. “Zur Theorie der Fastperiodischen Funktionen”. In: Acta. Math.

46.1-2 (1925). II. Zusammenhang der fastperiodischen Funktionen mit

Funktionen von unendlich vielen Variabeln; gleichmässige Approximation

durch trigonometrische Summen, pp. 101–214.

[BR02] V. Bergelson and I. Ruzsa. “Squarefree numbers, IP sets and ergodic the-

ory”. In: Paul Erdős and his mathematics, I (Budapest, 1999). Vol. 11.

Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2002, pp. 147–

160.

[DD74] H. Daboussi and H. Delange. “Quelques propriétés des fonctions multi-

plicatives de module au plus égal à 1”. In: C. R. Acad. Sci. Paris Sér. A

278 (1974), pp. 657–660.

[DD82] H. Daboussi and H. Delange. “On multiplicative arithmetical functions

whose modulus does not exceed one”. In: J. London Math. Soc. (2) 26.2

(1982), pp. 245–264. issn: 0024-6107.

[Del72] H. Delange. “Sur les fonctions multiplicatives de module au plus égal à

un”. In: C. R. Acad. Sci. Paris Sér. A-B 275 (1972), pp. 781–784.

[Del83] H. Delange. “Sur les fonctions arithmétiques multiplicatives de module $\leq 1$”. In: Acta Arith. 42.2 (1983), pp. 121–151. issn: 0065-1036.

[DNGJLLM15] M. Di Nasso, I. Goldbring, R. Jin, S. Leth, M. Lupini, and K. Mahlburg.

“On a sumset conjecture of Erdős”. In: Canad. J. Math. 67.4 (2015),

pp. 795–809. issn: 0008-414X.

[Dur10] R. Durrett. Probability: theory and examples. Fourth. Cambridge Series in

Statistical and Probabilistic Mathematics. Cambridge University Press,

Cambridge, 2010, pp. x+428. isbn: 978-0-521-76539-8.

[EFHN15] T. Eisner, B. Farkas, M. Haase, and R. Nagel. Operator theoretic aspects of

ergodic theory. Vol. 272. Graduate Texts in Mathematics. Springer, Cham,

2015, pp. xviii+628. isbn: 978-3-319-16897-5; 978-3-319-16898-2.

[EG80] P. Erdős and R. L. Graham. Old and new problems and results in combi-

natorial number theory. Vol. 28. Monographies de L’Enseignement Math-

ématique [Monographs of L’Enseignement Mathématique]. Université de

Genève, L’Enseignement Mathématique, Geneva, 1980, p. 128.

[Ell79] P. D. T. A. Elliott. Probabilistic number theory. I. Vol. 239. Grundlehren

der Mathematischen Wissenschaften [Fundamental Principles of Mathe-

matical Science]. Mean-value theorems. Springer-Verlag, New York-Berlin,

1979, xxii+359+xxxiii pp. (2 plates). isbn: 0-387-90437-9.

[EW11] M. Einsiedler and T. Ward. Ergodic theory with a view towards number

theory. Vol. 259. Graduate Texts in Mathematics. Springer-Verlag London,

Ltd., London, 2011, pp. xviii+481. isbn: 978-0-85729-020-5.

[FH17a] N. Frantzikinakis and B. Host. “Higher order Fourier analysis of multi-

plicative functions and applications”. In: J. Amer. Math. Soc. 30.1 (2017),

pp. 67–157. issn: 0894-0347.

[FH17b] N. Frantzikinakis and B. Host. “Multiple ergodic theorems for arithmetic sets”. In: Trans. Amer. Math. Soc. 369.10 (2017), pp. 7085–7105. issn: 0002-9947.

[FHK13] N. Frantzikinakis, B. Host, and B. Kra. “The polynomial multidimensional

Szemerédi theorem along shifted primes”. In: Israel J. Math. 194.1 (2013),

pp. 331–348. issn: 0021-2172.

[FKO82] H. Furstenberg, Y. Katznelson, and D. Ornstein. “The ergodic theoreti-

cal proof of Szemerédi’s theorem”. In: Bull. Amer. Math. Soc. (N.S.) 7.3

(1982), pp. 527–552. issn: 0273-0979.

[FKPL] S. Ferenczi, J. Kułaga-Przymus, and M. Lemańczyk. “Sarnak’s Conjecture

– what’s new”. to appear in: Proceedings of the Chair Morlet semester

"Ergodic Theory and Dynamical Systems in their Interactions with Arith-

metic and Combinatorics" 1.08.2016–31.01.2017, Springer, Lecture Notes

in Math., arXiv:1710.04039 .

[Fra04] N. Frantzikinakis. “The structure of strongly stationary systems”. In: J.

Anal. Math. 93 (2004), pp. 359–388. issn: 0021-7670.

[Fra08] N. Frantzikinakis. “Multiple ergodic averages for three polynomials and

applications”. In: Trans. Amer. Math. Soc. 360.10 (2008), pp. 5435–5475.

issn: 0002-9947.

[Fur67] H. Furstenberg. “Disjointness in ergodic theory, minimal sets, and a prob-

lem in Diophantine approximation”. In: Math. Systems Theory 1 (1967),

pp. 1–49. issn: 0025-5661.

[Fur77] H. Furstenberg. “Ergodic behavior of diagonal measures and a theorem

of Szemerédi on arithmetic progressions”. In: J. Analyse Math. 31 (1977),

pp. 204–256. issn: 0021-7670.

[Fur81] H. Furstenberg. Recurrence in ergodic theory and combinatorial number theory. M. B. Porter Lectures. Princeton University Press, Princeton, N.J., 1981, pp. xi+203. isbn: 0-691-08269-3.

[Gla03] E. Glasner. Ergodic theory via joinings. Vol. 101. Mathematical Surveys

and Monographs. American Mathematical Society, Providence, RI, 2003,

pp. xii+384. isbn: 0-8218-3372-3.

[Gow01] W. T. Gowers. “A new proof of Szemerédi’s theorem”. In: Geom. Funct.

Anal. 11.3 (2001), pp. 465–588. issn: 1016-443X.

[Gow07] W. T. Gowers. “Hypergraph regularity and the multidimensional Sze-

merédi theorem”. In: Ann. of Math. (2) 166.3 (2007), pp. 897–946. issn: 0003-486X.

[Gow10] W. T. Gowers. “Decompositions, approximate structure, transference, and

the Hahn-Banach theorem”. In: Bull. Lond. Math. Soc. 42.4 (2010), pp. 573–

606. issn: 0024-6093.

[Gre61] L. W. Green. “Spectra of nilflows”. In: Bull. Amer. Math. Soc. 67 (1961),

pp. 414–415. issn: 0002-9904.

[GS] A. Granville and K. Soundararajan. “Multiplicative number theory: The pretentious approach”. In preparation - http://www.dms.umontreal.ca/~andrew/PDF/BookChaps1n2.pdf.

[GT10a] B. Green and T. Tao. “An arithmetic regularity lemma, an associated

counting lemma, and applications”. In: An irregular mind. Vol. 21. Bolyai

Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2010, pp. 261–334.

[GT10b] B. Green and T. Tao. “Linear equations in primes”. In: Ann. of Math. (2)

171.3 (2010), pp. 1753–1850. issn: 0003-486X.

[GT10c] B. Green and T. Tao. “Yet another proof of Szemerédi’s theorem”. In: An irregular mind. Vol. 21. Bolyai Soc. Math. Stud. János Bolyai Math. Soc., Budapest, 2010, pp. 335–342.

[GT12] B. Green and T. Tao. “The Möbius function is strongly orthogonal to nilsequences”. In: Ann. of Math. (2) 175.2 (2012), pp. 541–566. issn: 0003-486X.

[GTZ12] B. Green, T. Tao, and T. Ziegler. “An inverse theorem for the Gowers $U^{s+1}[N]$-norm”. In: Ann. of Math. (2) 176.2 (2012), pp. 1231–1372. issn: 0003-486X.

[Hin79] N. Hindman. “Ultrafilters and combinatorial number theory”. In: Number

theory, Carbondale 1979 (Proc. Southern Illinois Conf., Southern Illinois

Univ., Carbondale, Ill., 1979). Vol. 751. Lecture Notes in Math. Springer,

Berlin, 1979, pp. 119–184.

[HK] B. Host and B. Kra. “Nilpotent Structures in Ergodic Theory”. In preparation - http://www.math.northwestern.edu/~kra/papers/book.htm.

[HK02] B. Host and B. Kra. “An odd Furstenberg-Szemerédi theorem and quasi-

affine systems”. In: J. Anal. Math. 86 (2002), pp. 183–220. issn: 0021-7670.

[HK05a] B. Host and B. Kra. “Convergence of polynomial ergodic averages”. In:

Israel J. Math. 149 (2005). Probability in mathematics, pp. 1–19. issn: 0021-2172.

[HK05b] B. Host and B. Kra. “Nonconventional ergodic averages and nilmanifolds”.

In: Ann. of Math. (2) 161.1 (2005), pp. 397–488. issn: 0003-486X.

[HK08] B. Host and B. Kra. “Analysis of two step nilsequences”. In: Ann. Inst.

Fourier (Grenoble) 58.5 (2008), pp. 1407–1453. issn: 0373-0956.

[HK09] B. Host and B. Kra. “Uniformity seminorms on $\ell^\infty$ and applications”. In: J. Anal. Math. 108 (2009), pp. 219–276. issn: 0021-7670.

[Hoe63] W. Hoeffding. “Probability inequalities for sums of bounded random vari-

ables”. In: J. Amer. Statist. Assoc. 58 (1963), pp. 13–30. issn: 0162-1459.

[HS12] N. Hindman and D. Strauss. Algebra in the Stone-Čech Compactification – Theory and Applications. de Gruyter Textbook. Second revised and extended edition. Walter de Gruyter & Co., Berlin, 2012, pp. xviii+591. isbn: 978-3-11-025623-9.

[Jac56] K. Jacobs. “Ergodentheorie und fastperiodische Funktionen auf Halbgrup-

pen”. In: Math. Z. 64 (1956), pp. 298–338. issn: 0025-5874.

[Kre85] U. Krengel. Ergodic theorems. Vol. 6. De Gruyter Studies in Mathematics.

With a supplement by Antoine Brunel. Walter de Gruyter & Co., Berlin,

1985, pp. viii+357. isbn: 3-11-008478-3.

[Lei05a] A. Leibman. “Convergence of multiple ergodic averages along polynomials

of several variables”. In: Israel J. Math. 146 (2005), pp. 303–315. issn: 0021-2172.

[Lei05b] A. Leibman. “Pointwise convergence of ergodic averages for polynomial actions of $\mathbb{Z}^d$ by translations on a nilmanifold”. In: Ergodic Theory Dynam. Systems 25.1 (2005), pp. 215–225. issn: 0143-3857.

[Lei05c] A. Leibman. “Pointwise convergence of ergodic averages for polynomial

sequences of translations on a nilmanifold”. In: Ergodic Theory Dynam.

Systems 25.1 (2005), pp. 201–213. issn: 0143-3857.

[Lei06] A. Leibman. “Rational sub-nilmanifolds of a compact nilmanifold”. In: Ergodic Theory Dynam. Systems 26.3 (2006), pp. 787–798. issn: 0143-3857.

[Lei10a] A. Leibman. “Multiple polynomial correlation sequences and nilsequences”. In: Ergodic Theory Dynam. Systems 30.3 (2010), pp. 841–854. issn: 0143-3857.

[Lei10b] A. Leibman. “Orbit of the diagonal in the power of a nilmanifold”. In:

Trans. Amer. Math. Soc. 362.3 (2010), pp. 1619–1658. issn: 0002-9947.

[Lei15] A. Leibman. “Nilsequences, null-sequences, and multiple correlation sequences”. In: Ergodic Theory Dynam. Systems 35.1 (2015), pp. 176–191. issn: 0143-3857.

[Lei98] A. Leibman. “Polynomial sequences in groups”. In: J. Algebra 201.1 (1998),

pp. 189–206. issn: 0021-8693.

[Les91] E. Lesigne. “Sur une nil-variété, les parties minimales associées à une trans-

lation sont uniquement ergodiques”. In: Ergodic Theory Dynam. Systems

11.2 (1991), pp. 379–391. issn: 0143-3857.

[LG61] K. de Leeuw and I. Glicksberg. “Applications of almost periodic compact-

ifications”. In: Acta Math. 105 (1961), pp. 63–97. issn: 0001-5962.

[Lin01] E. Lindenstrauss. “Pointwise theorems for amenable groups”. In: Invent.

Math. 146.2 (2001), pp. 259–295. issn: 0020-9910.

[LS07] L. Lovász and B. Szegedy. “Szemerédi’s lemma for the analyst”. In: Geom.

Funct. Anal. 17.1 (2007), pp. 252–270. issn: 1016-443X.

[MR16a] J. Moreira and F. K. Richter. “A spectral refinement of the Bergelson-

Host-Kra decomposition and new multiple ergodic theorems”. In: ArXiv

e-prints (Sept. 2016). arXiv: 1609.03631 [math.DS].

[MR16b] J. Moreira and F. K. Richter. “Large subsets of discrete hypersurfaces in $\mathbb{Z}^d$ contain arbitrarily many collinear points”. In: European J. Combin. 54 (2016), pp. 163–176. issn: 0195-6698.

[MRR17] J. Moreira, F. K. Richter, and D. Robertson. “Disjointness for measurably

distal group actions and applications”. In: ArXiv e-prints (Aug. 2017).

arXiv: 1708.01934 [math.DS].

[MRR18] J. Moreira, F. K. Richter, and D. Robertson. “A proof of the Erdős

sumset conjecture”. In: ArXiv e-prints (Mar. 2018). arXiv: 1803.00498

[math.CO].

[Nat80] M. B. Nathanson. “Sumsets contained in infinite sets of integers”. In: J. Combin. Theory Ser. A 28.2 (1980), pp. 150–155. issn: 0097-3165.

[Par69] W. Parry. “Ergodic properties of affine transformations and flows on nil-

manifolds.” In: Amer. J. Math. 91 (1969), pp. 757–771. issn: 0002-9327.

[Par70] W. Parry. “Dynamical systems on nilmanifolds”. In: Bull. London Math.

Soc. 2 (1970), pp. 37–40. issn: 0024-6093.

[Rag72] M. S. Raghunathan. Discrete subgroups of Lie groups. Ergebnisse der

Mathematik und ihrer Grenzgebiete, Band 68. Springer-Verlag,

New York-Heidelberg, 1972, pp. ix+227.

[RS04] V. Rödl and J. Skokan. “Regularity lemma for k-uniform hypergraphs”.

In: Random Structures Algorithms 25.1 (2004), pp. 1–42. issn: 1042-9832.

[Ruz77] I. Z. Ruzsa. “General multiplicative functions”. In: Acta Arith. 32.4 (1977),

pp. 313–347. issn: 0065-1036.

[Sze75] E. Szemerédi. “On sets of integers containing no k elements in arithmetic

progression”. In: Proceedings of the International Congress of

Mathematicians (Vancouver, B. C., 1974), Vol. 2. Canad. Math. Congress, Montreal,

Que., 1975, pp. 503–505.

[Sze78] E. Szemerédi. “Regular partitions of graphs”. In: Problèmes

combinatoires et théorie des graphes (Colloq. Internat. CNRS, Univ. Orsay, Orsay,

1976). Vol. 260. Colloq. Internat. CNRS. CNRS, Paris, 1978, pp. 399–401.

[Sár78] A. Sárközy. “On difference sets of sequences of integers. I”. In: Acta Math.

Acad. Sci. Hungar. 31.1–2 (1978), pp. 125–149. issn: 0001-5954.

[Tao06] T. Tao. “A quantitative ergodic theory proof of Szemerédi’s theorem”. In:

Electron. J. Combin. 13.1 (2006), Research Paper 99, 49. issn: 1077-8926.

[Tao07] T. Tao. “Structure and randomness in combinatorics”. In: ArXiv e-prints

(July 2007). arXiv: 0707.4269 [math.CO].

[Tao08] T. Tao. Structure and randomness. Pages from year one of a mathematical

blog. American Mathematical Society, Providence, RI, 2008, p. 298. isbn: 978-0-8218-4695-7.

[Tao12] T. Tao. Higher order Fourier analysis. Vol. 142. Graduate Studies in

Mathematics. American Mathematical Society, Providence, RI, 2012, pp. x+187.

isbn: 978-0-8218-8986-2.

[Wal12] M. N. Walsh. “Norm convergence of nilpotent ergodic averages”. In: Ann.

of Math. (2) 175.3 (2012), pp. 1667–1688. issn: 0003-486X.

[Wir61] E. Wirsing. “Das asymptotische Verhalten von Summen über

multiplikative Funktionen”. In: Math. Ann. 143 (1961), pp. 75–102. issn: 0025-5831.

[Zie05] T. Ziegler. “A non-conventional ergodic theorem for a nilsystem”. In:

Ergodic Theory Dynam. Systems 25.4 (2005), pp. 1357–1370. issn: 0143-3857.

[Zie07] T. Ziegler. “Universal characteristic factors and Furstenberg averages”.

In: J. Amer. Math. Soc. 20.1 (2007), 53–97 (electronic). issn: 0894-0347.
