
Arithmetic , Gowers norms, and nilspaces

Pablo Candela (UAM, ICMAT)

JAE School 2019, ICMAT

Introduction

A central theme in combinatorics is the study of the occurrence of combinatorial structures in subsets of abelian groups, under assumptions on the subsets that are typically quite weak. A major result in this theme:

Theorem (Szemerédi, 1975 [9]). For every positive integer $k$ and every $\alpha > 0$ there exists $N_0 > 0$ such that the following holds. For every integer $N > N_0$, every subset $A \subset [N]$ of cardinality at least $\alpha N$ contains an arithmetic progression of length $k$.

Here $[N] = \{1, 2, \ldots, N\} \subset \mathbb{Z}$, the combinatorial structures in question are arithmetic progressions of length $k$ (or $k$-APs), and the assumption on $A$ is just that it has density $|A|/N$ at least $\alpha$ in $[N]$ (for $N$ large enough).

It is helpful to rephrase this central result in terms of counting $k$-APs in $A$.

Notation: given a finite set $X$ and $f : X \to \mathbb{C}$, we define the average of $f$ on $X$ by $\mathbb{E}_{x\in X} f(x) = \frac{1}{|X|}\sum_{x\in X} f(x)$. Given $A \subset X$ we write $1_A$ for the indicator function of $A$ ($1_A(x) = 1$ for $x \in A$ and $0$ otherwise).

E.g. for $A \subset [N]$, $\mathbb{E}_{x\in[N]} 1_A(x)$ is simply $|A|/N$, the density of $A$ in $[N]$ (sometimes also denoted by $\mathbb{P}(A)$).

An equivalent formulation of Szemerédi's theorem

If $A \subset \mathbb{Z}_N = \mathbb{Z}/N\mathbb{Z}$, we can express the count of $k$-APs in $A$ in a normalized form (i.e. divided by the total number of $k$-APs in $\mathbb{Z}_N$) as the following average:

$\mathbb{E}_{x,d\in\mathbb{Z}_N}\, 1_A(x)\, 1_A(x+d) \cdots 1_A(x+(k-1)d)$.

Notation: given $f_1,\ldots,f_k : \mathbb{Z}_N \to \mathbb{C}$, let us denote by $S_k(f_1,\ldots,f_k)$ the average $\mathbb{E}_{x,d\in\mathbb{Z}_N}\, f_1(x)\, f_2(x+d) \cdots f_k(x+(k-1)d)$. We abbreviate this to $S_k(f)$ when $f_1 = \cdots = f_k = f$.

It is a known fact (which follows from a result of Varnavides [11]) that Szemerédi's theorem has the following equivalent formulation.

Theorem. Let $k \in \mathbb{N}$ and $\alpha > 0$. There exists $c = c(\alpha, k) > 0$ such that for every $N \in \mathbb{N}$ and every $f : \mathbb{Z}_N \to [0,1]$ with $\mathbb{E}_{\mathbb{Z}_N} f \ge \alpha$, we have $S_k(f) \ge c$.

As part of a strategy toward proving this theorem, we can search for some useful way to control the value of $S_k(f)$ in terms of some property of $f$.
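To make the normalized count concrete, here is a minimal brute-force sketch of the average $S_k(f)$ on $\mathbb{Z}_N$. The function name `ap_average` and the example set are illustrative choices, not part of the course; note that the average includes the trivial progressions with $d = 0$.

```python
from itertools import product

def ap_average(f, k):
    """Return S_k(f) = E_{x,d in Z_N} f(x) f(x+d) ... f(x+(k-1)d).

    `f` is a list of length N holding the values of a function on Z_N.
    """
    n = len(f)
    total = 0.0
    for x, d in product(range(n), repeat=2):
        term = 1.0
        for j in range(k):
            term *= f[(x + j * d) % n]
        total += term
    return total / n**2

# Illustrative example: A = {0, 1, 3} in Z_7 (a hypothetical test set).
N = 7
A = {0, 1, 3}
one_A = [1.0 if x in A else 0.0 for x in range(N)]
s3 = ap_average(one_A, 3)
print(s3)  # normalized count of 3-APs in A, trivial ones included
```

In this particular example all pairwise differences of elements of $A$ are distinct, so only the three trivial progressions contribute and the average equals $3/49$.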

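The case k = 3 and Fourier analysis

We shall tacitly assume from now on that $N$ is an odd prime.

In the 1950s, K.F. Roth made the following observation [8]: given $A \subset \mathbb{Z}_N$, the average $S_3(1_A)$ can be controlled in terms of the Fourier transform of $1_A$, in a way that yields a proof of Szemerédi's theorem for $k = 3$.

Basics of discrete Fourier analysis (let $e : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$, $\theta \mapsto \exp(2\pi i\theta)$):

• The characters on $\mathbb{Z}_N$ are the functions $e_r : \mathbb{Z}_N \to \mathbb{C}$, $x \mapsto e(\frac{rx}{N})$ ($r \in \mathbb{Z}_N$). They form the dual group $\widehat{\mathbb{Z}_N} \cong \mathbb{Z}_N$ (identify $e(\frac{rx}{N})$ with its frequency $r$).
• We equip $\mathbb{C}^{\mathbb{Z}_N}$ with the inner product $\langle f, g\rangle = \mathbb{E}_{x\in\mathbb{Z}_N} f(x)\overline{g(x)}$. The characters form an orthonormal basis of $(\mathbb{C}^{\mathbb{Z}_N}, \langle\cdot,\cdot\rangle)$.
• The Fourier transform of $f \in \mathbb{C}^{\mathbb{Z}_N}$ is $\widehat{f} : \widehat{\mathbb{Z}_N} \to \mathbb{C}$, $r \mapsto \mathbb{E}_{x\in\mathbb{Z}_N} f(x)\,\overline{e(\frac{rx}{N})}$. Thus $\widehat{f}(r) = \langle f, e_r\rangle$. Call $\widehat{f}(r)$ the Fourier coefficient of $f$ at frequency $r$.
• Fourier inversion: for every $f \in \mathbb{C}^{\mathbb{Z}_N}$, we have $f(x) = \sum_{r\in\mathbb{Z}_N} \widehat{f}(r)\, e(\frac{rx}{N})$.
• We define the $L^p$ and $\ell^p$ norms on $\mathbb{C}^{\mathbb{Z}_N}$ by $\|f\|_{L^p} = (\mathbb{E}_{x\in\mathbb{Z}_N} |f(x)|^p)^{1/p}$ and $\|f\|_{\ell^p} = (\sum_{x\in\mathbb{Z}_N} |f(x)|^p)^{1/p}$, $p \in [1,\infty)$. We define also $\|f\|_\infty = \sup_x |f(x)|$.
• Plancherel's theorem: for every $f \in \mathbb{C}^{\mathbb{Z}_N}$, we have $\|f\|_{L^2} = \|\widehat{f}\|_{\ell^2}$.

These foundations of Fourier analysis are not hard to extend to a general finite abelian group $\mathrm{Z}$. In particular $\widehat{\mathrm{Z}} \cong \mathrm{Z}$. As an explicit isomorphism, we can identify a character $\chi \in \widehat{\mathrm{Z}}$ with some unique frequency $r \in \mathrm{Z}$, via the following notion that generalizes the map $(r,x) \mapsto \frac{rx}{N}$ we used on $\mathbb{Z}_N$.

Definition. A bilinear form on $\mathrm{Z}$ is a map $\mathrm{Z}\times\mathrm{Z} \to \mathbb{R}/\mathbb{Z}$, $(r,x) \mapsto r\cdot x$ with the following property: for every $r \in \mathrm{Z}$, the map $x \mapsto r\cdot x$ is a homomorphism, and for every $x \in \mathrm{Z}$ the map $r \mapsto r\cdot x$ is a homomorphism. A bilinear form $\cdot$ is non-degenerate if for every $r \in \mathrm{Z}\setminus\{0\}$ there is $x \in \mathrm{Z}$ such that $r\cdot x \neq 0$, and for every $x \in \mathrm{Z}\setminus\{0\}$ there is $r$ such that $r\cdot x \neq 0$. We say that the form $\cdot$ is symmetric if $r\cdot x = x\cdot r$ for all $r, x \in \mathrm{Z}$.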

Fact [10, §4.1]: every finite abelian group $\mathrm{Z}$ has a non-degenerate symmetric bilinear form. Having fixed such a form $\cdot$, each $\chi \in \widehat{\mathrm{Z}}$ is a map $x \mapsto e(r\cdot x)$ for some unique $r \in \mathrm{Z}$. All the basic facts of discrete Fourier analysis listed above for $\mathbb{Z}_N$ then hold similarly on $\mathrm{Z}$.

The only explicit examples that we shall use in the course are the following:

• On $\mathbb{Z}_N$, as we saw, $r\cdot x = \frac{rx}{N} \bmod 1$ (viewing $\mathbb{R}/\mathbb{Z}$ as $[0,1)$ with addition mod 1).
• Let $\mathbb{F}^n$ be a vector space over a finite field $\mathbb{F}$ of prime order $p$. Then for $r = (r_1,\ldots,r_n)$, $x = (x_1,\ldots,x_n) \in \mathbb{F}^n$, let $r\cdot x = \frac{1}{p}(r_1x_1 + \cdots + r_nx_n) \bmod 1$.

The above-mentioned observation of Roth can be stated as follows:

Control of the 3-AP-count-deviation using Fourier coefficients. If $A \subset \mathrm{Z}$, $|A| = \alpha|\mathrm{Z}|$, $|\mathrm{Z}|$ odd, then $|S_3(1_A) - \alpha^3| \le \sum_{r\neq 0} |\widehat{1_A}(r)|^2\, |\widehat{1_A}(-2r)|$.

Toward a more general version of this fact: let $f_A$ denote the balanced function of $A$, namely $f_A(x) = 1_A(x) - \alpha$ ("balanced" since $\mathbb{E}_{x\in\mathrm{Z}} f_A(x) = 0$). Then $S_3(1_A) = \mathbb{E}_{x,d\in\mathrm{Z}}\, (f_A+\alpha)(x)\,(f_A+\alpha)(x+d)\,(f_A+\alpha)(x+2d)$. A simple calculation shows that this equals $\alpha^3 + S_3(f_A)$. Thus we have $S_3(1_A) - \alpha^3 = S_3(f_A)$, and $\sum_{r\neq 0} \widehat{1_A}(r)^2\, \widehat{1_A}(-2r) = \sum_{r} \widehat{f_A}(r)^2\, \widehat{f_A}(-2r)$ (the latter equality since $\widehat{f_A}(0) = \mathbb{E} f_A = 0$ and $\widehat{f_A}(r) = \widehat{1_A}(r)$ for $r \neq 0$). Therefore, the above fact is implied by the following result.

Proposition (The Fourier transform controls averages over 3-APs). Let $f_1, f_2, f_3 : \mathrm{Z} \to \mathbb{C}$. Then $S_3(f_1,f_2,f_3) = \sum_{r\in\mathrm{Z}} \widehat{f_1}(r)\, \widehat{f_2}(-2r)\, \widehat{f_3}(r)$. Moreover, if $|\mathrm{Z}|$ is odd and $\|f_j\|_\infty \le 1$ for all $j \in [3]$, then $|S_3(f_1,f_2,f_3)| \le \min_{j\in[3]} \|\widehat{f_j}\|_\infty$.
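The identity and the bound in the proposition can be sanity-checked numerically on $\mathbb{Z}_N$. The following is only a sketch under stated assumptions: the helper names (`fourier`, `s3_direct`, `s3_fourier`), the choice $N = 11$ and the random test functions are illustrative, and NumPy's `fft` is used because it computes $\sum_x f(x)e(-rx/N)$, which matches the normalization above after dividing by $N$.

```python
import numpy as np

def fourier(f):
    """Fourier transform with the normalization hat f(r) = E_x f(x) e(-r x / N)."""
    return np.fft.fft(f) / len(f)

def s3_direct(f1, f2, f3):
    """S_3(f1, f2, f3) = E_{x,d} f1(x) f2(x+d) f3(x+2d), computed from the definition."""
    n = len(f1)
    x = np.arange(n)[:, None]
    d = np.arange(n)[None, :]
    return np.mean(f1[x % n] * f2[(x + d) % n] * f3[(x + 2 * d) % n])

def s3_fourier(f1, f2, f3):
    """Right-hand side of the proposition: sum_r hat f1(r) hat f2(-2r) hat f3(r)."""
    n = len(f1)
    r = np.arange(n)
    return np.sum(fourier(f1) * fourier(f2)[(-2 * r) % n] * fourier(f3))

# Illustrative data: three random real 1-bounded functions on Z_11 (N odd).
N = 11
rng = np.random.default_rng(0)
f1, f2, f3 = rng.uniform(-1.0, 1.0, size=(3, N))

direct = s3_direct(f1, f2, f3)
via_fourier = s3_fourier(f1, f2, f3)
bound = min(np.max(np.abs(fourier(f))) for f in (f1, f2, f3))
print(direct, via_fourier, bound)
```

The two computed values of $S_3$ should agree up to rounding, and their modulus should not exceed `bound`.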


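Proof. By Fourier inversion, for each $j \in [3]$ we have $f_j(x) = \sum_{r_j\in\mathrm{Z}} \widehat{f_j}(r_j)\, e(r_j\cdot x)$. We substitute this into $S_3(f_1,f_2,f_3) = \mathbb{E}_{x,d\in\mathrm{Z}}\, f_1(x)\, f_2(x+d)\, f_3(x+2d)$:

$S_3(f_1,f_2,f_3) = \mathbb{E}_{x,d} \sum_{r_1,r_2,r_3} \widehat{f_1}(r_1)\,\widehat{f_2}(r_2)\,\widehat{f_3}(r_3)\; e\big(r_1\cdot x + r_2\cdot(x+d) + r_3\cdot(x+2d)\big)$
$\qquad = \sum_{r_1,r_2,r_3} \widehat{f_1}(r_1)\,\widehat{f_2}(r_2)\,\widehat{f_3}(r_3)\; \mathbb{E}_{x,d}\, e\big(x\cdot(r_1+r_2+r_3)\big)\, e\big(d\cdot(r_2+2r_3)\big)$.

Using the orthonormality of characters, the inner average here is seen to be

$\big(\mathbb{E}_x\, e(x\cdot(r_1+r_2+r_3))\big)\big(\mathbb{E}_d\, e(d\cdot(r_2+2r_3))\big) = 1(r_1+r_2+r_3 = 0)\, 1(r_2 = -2r_3) = 1(r_1 = r_3)\, 1(r_2 = -2r_3)$.

We substitute this back into $S_3$ above:

$S_3(f_1,f_2,f_3) = \sum_{r_1,r_2,r_3:\; r_1 = r_3,\; r_2 = -2r_3} \widehat{f_1}(r_1)\,\widehat{f_2}(r_2)\,\widehat{f_3}(r_3) = \sum_{r} \widehat{f_1}(r)\,\widehat{f_2}(-2r)\,\widehat{f_3}(r)$.

To see the second claim, it will suffice to prove it for $j = 1$ (the other cases are similar). We have

$|S_3(f_1,f_2,f_3)| \le \sum_r |\widehat{f_1}(r)|\, |\widehat{f_2}(-2r)|\, |\widehat{f_3}(r)| \le \|\widehat{f_1}\|_\infty \sum_r |\widehat{f_2}(-2r)|\, |\widehat{f_3}(r)|$.

The Cauchy-Schwarz inequality implies $\sum_r |\widehat{f_2}(-2r)|\, |\widehat{f_3}(r)| \le \|\widehat{f_2}\|_{\ell^2}\, \|\widehat{f_3}\|_{\ell^2}$ (here we use that $r \mapsto -2r$ is a bijection of $\mathrm{Z}$, since $|\mathrm{Z}|$ is odd). By Plancherel's theorem, this is $\|f_2\|_{L^2}\, \|f_3\|_{L^2} \le \|f_2\|_\infty\, \|f_3\|_\infty \le 1$.

It follows from the linearity of the map $f \mapsto \widehat{f}$ that $\|\widehat{f}\|_\infty$ is a norm on $\mathbb{C}^{\mathrm{Z}}$, which is often denoted by $\|f\|_{u^2}$. Let us restate the proposition:

Proposition (The Fourier transform controls averages over 3-APs)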

Let $f_1, f_2, f_3 : \mathrm{Z} \to \mathbb{C}$, $\|f_i\|_\infty \le 1$, $|\mathrm{Z}|$ odd. Then $|S_3(f_1,f_2,f_3)| \le \min_{j\in[3]} \|f_j\|_{u^2}$.

This is one of the key analytic facts driving Roth's proof of the case $k = 3$ of Szemerédi's theorem. Indeed, given $A \subset \mathbb{Z}_N$ of density $\alpha$, applying the proposition to $f_1 = f_2 = f_3 = f_A$ yields $|\alpha^3 - S_3(1_A)| \le \|f_A\|_{u^2}$. Hence if $\|f_A\|_{u^2}$ is small compared to $\alpha^3$, then $A$ contains roughly $\alpha^3 N^2$ 3-APs. This leads to a proof of the theorem, using the following dichotomy:

• EITHER: $\|f_A\|_{u^2} = \sup_{r\neq 0} |\widehat{1_A}(r)|$ is sufficiently small and we are done with finding a 3-AP in $A$ (for $N$ large enough),

• OR: there is $r \neq 0$ such that $|\widehat{1_A}(r)|$ is at least a fraction of $\alpha$, and then this can be used to implement a so-called density-increment argument, using $e_r$ to find a long AP where $A$ has density markedly greater than $\alpha$.

We shall not look at the details of this argument for $k = 3$. Instead, we turn directly to extending this kind of argument to $k > 3$ (we will look at the density-increment argument in more detail for $k = 4$).

The case k = 4: Fourier analysis is not that helpful anymore

The key result that we could hope for, to make the above dichotomy work for $k = 4$, is the following analogue of the last proposition:

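$\forall\, \epsilon > 0$, $\exists\, \delta > 0$ such that if $f_1, f_2, f_3, f_4 : \mathbb{Z}_N \to \mathbb{C}$ satisfy $\|f_j\|_\infty \le 1$ and $\|\widehat{f_j}\|_\infty \le \delta$ for every $j \in [4]$, then $|\mathbb{E}_{x,d}\, f_1(x)\, f_2(x+d)\, f_3(x+2d)\, f_4(x+3d)| \le \epsilon$.

However, this hypothetical proposition is false.

Example: let $f_1(x) = e(\frac{x^2}{N})$, $f_2(x) = e(\frac{-3x^2}{N})$, $f_3(x) = e(\frac{3x^2}{N})$, $f_4(x) = e(\frac{-x^2}{N})$. Note: for all $x, d \in \mathbb{Z}_N$, we have $x^2 - 3(x+d)^2 + 3(x+2d)^2 - (x+3d)^2 = 0$. We therefore have

$S_4(f_1,f_2,f_3,f_4) = \mathbb{E}_{x,d\in\mathbb{Z}_N}\, e\big(\tfrac{1}{N}(x^2 - 3(x+d)^2 + 3(x+2d)^2 - (x+3d)^2)\big) = 1$.

However, for each $j$ we have $\|f_j\|_{u^2} \le N^{-1/2} \to 0$ as $N \to \infty$. Indeed, for every $r$,

$|\widehat{f_1}(r)|^2 = \overline{\mathbb{E}_x\, e\big(\tfrac{x^2 - rx}{N}\big)}\; \mathbb{E}_y\, e\big(\tfrac{y^2 - ry}{N}\big) = \mathbb{E}_{x,h}\, e\big(\tfrac{-x^2 + rx}{N}\big)\, e\big(\tfrac{(x+h)^2 - r(x+h)}{N}\big)$
$\qquad = \mathbb{E}_{x,h}\, e\big(\tfrac{(x+h)^2 - x^2 - rh}{N}\big) = \mathbb{E}_{x,h}\, e\big(\tfrac{2hx + h^2 - rh}{N}\big) \le \mathbb{E}_h \big|\mathbb{E}_x\, e\big(\tfrac{2hx}{N}\big)\big| = \mathbb{E}_h\, 1_{\{0\}}(h) = \tfrac{1}{N}$.

Thus, for these functions, we have $\|f_j\|_{u^2} = o(1)_{N\to\infty}$ for every $j \in [4]$, and yet $S_4(f_1,f_2,f_3,f_4) = 1$ is bounded away from 0 as $N \to \infty$.

Looking for other tools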

The last example shows that the norm $\|\widehat{f}\|_\infty$ is not delicate enough to control averages over $k$-APs for $k > 3$. What tool can we use instead?
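The failure of the Fourier transform for $k = 4$ can be observed numerically with the quadratic phases from the example. This is a minimal sketch, assuming $N = 101$ (any prime larger than 3 should behave the same way); the variable names are illustrative.

```python
import numpy as np

N = 101  # assumed: an odd prime larger than 3
x = np.arange(N)

def e(theta):
    """e(theta) = exp(2 pi i theta)."""
    return np.exp(2j * np.pi * theta)

# The four quadratic phases f_j(x) = e(c_j x^2 / N) with c = (1, -3, 3, -1).
coeffs = [1, -3, 3, -1]
fs = [e(((c * x**2) % N) / N) for c in coeffs]

# S_4(f_1, ..., f_4) = E_{x,d} f_1(x) f_2(x+d) f_3(x+2d) f_4(x+3d).
X = x[:, None]
D = x[None, :]
s4 = np.mean(
    fs[0][X % N] * fs[1][(X + D) % N] * fs[2][(X + 2 * D) % N] * fs[3][(X + 3 * D) % N]
)

# u^2 norms: largest Fourier coefficient in modulus, hat f(r) = E_x f(x) e(-rx/N).
u2_norms = [np.max(np.abs(np.fft.fft(f) / N)) for f in fs]

print(abs(s4))              # close to 1
print(u2_norms, N ** -0.5)  # each close to N^{-1/2}
```

So the average over 4-APs stays at 1 while every Fourier coefficient has modulus about $N^{-1/2}$.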

Idea: search, among quantities related to the norm $\|\widehat{f}\|_\infty$, for one that has a clear generalization likely to be useful for controlling longer progressions. In this direction, Gowers considered the norm $\|\widehat{f}\|_{\ell^4} = \big(\sum_r |\widehat{f}(r)|^4\big)^{1/4}$.

On one hand, if $\|f\|_\infty \le 1$, then the norms $\|\widehat{f}\|_{\ell^4}$ and $\|\widehat{f}\|_\infty$ can be used roughly equivalently. Indeed, for such $f$, we have the following easily-checked inequalities: $\|\widehat{f}\|_\infty^4 \le \|\widehat{f}\|_{\ell^4}^4 \le \|\widehat{f}\|_\infty^2$.

On the other hand, we have the following formula, valid on any finite abelian group $\mathrm{Z}$:

$\|\widehat{f}\|_{\ell^4} = \big(\mathbb{E}_{x,h_1,h_2\in\mathrm{Z}}\, f(x)\, \overline{f(x+h_1)}\, \overline{f(x+h_2)}\, f(x+h_1+h_2)\big)^{1/4}$.

This can be proved by Fourier inversion (Exercise 1).

We can view the quadruples $(x, x+h_1, x+h_2, x+h_1+h_2)$ as 2-dimensional cubes (or parallelograms) in $\mathrm{Z}$. It is then natural to think of generalizing this formula, by averaging instead over higher-dimensional cubes in $\mathrm{Z}$. Thus, one is led to the definition of the Gowers uniformity norms on $\mathrm{Z}$.

Uniformity norms on a finite abelian group Z

There is one such norm on $\mathbb{C}^{\mathrm{Z}}$ for each integer $d > 1$, denoted by $\|\cdot\|_{U^d}$. The $U^2$ norm is the average over 2-cubes that we have just seen:

$\|f\|_{U^2} = \big(\mathbb{E}_{x,h_1,h_2\in\mathrm{Z}}\, f(x)\, \overline{f(x+h_1)}\, \overline{f(x+h_2)}\, f(x+h_1+h_2)\big)^{1/4}$.

The $U^3$ norm is obtained by averaging over 3-cubes:

$\|f\|_{U^3} = \big(\mathbb{E}_{x,h_1,h_2,h_3\in\mathrm{Z}}\, f(x)\, \overline{f(x+h_1)}\, \overline{f(x+h_2)}\, f(x+h_1+h_2)\, \overline{f(x+h_3)}\, f(x+h_1+h_3)\, f(x+h_2+h_3)\, \overline{f(x+h_1+h_2+h_3)}\big)^{1/8}$.

The pattern for the general definition is now clear; let us state it formally.

Definition (Gowers uniformity norms on Z). For each integer $d \ge 2$, the $U^d$ norm is defined for $f \in \mathbb{C}^{\mathrm{Z}}$ by

$\|f\|_{U^d} = \Big(\mathbb{E}_{x,h_1,h_2,\ldots,h_d\in\mathrm{Z}} \prod_{v\in\{0,1\}^d} \mathcal{C}^{|v|} f(x + v_1h_1 + \cdots + v_dh_d)\Big)^{1/2^d}$,

where $\mathcal{C}f := \overline{f}$ is the complex-conjugation operator, and $|v| := v_1 + \cdots + v_d$.
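The definition can be turned directly into a brute-force computation on $\mathbb{Z}_N$ for small $N$. The sketch below (the name `gowers_norm` and the test data are illustrative assumptions) also compares the $U^2$ norm with $\|\widehat{f}\|_{\ell^4}$, as in the formula of Exercise 1.

```python
import numpy as np
from itertools import product

def gowers_norm(f, d):
    """Brute-force U^d norm of f on Z_N, straight from the definition.

    Cost is N^(d+1) * 2^d operations, so this is only meant for tiny N.
    """
    n = len(f)
    total = 0.0 + 0.0j
    for x, *hs in product(range(n), repeat=d + 1):
        term = 1.0 + 0.0j
        for v in product((0, 1), repeat=d):
            val = f[(x + sum(vi * hi for vi, hi in zip(v, hs))) % n]
            # Apply complex conjugation |v| times, i.e. iff |v| is odd.
            term *= np.conj(val) if sum(v) % 2 else val
        total += term
    # The average is a non-negative real number up to rounding errors.
    return max(total.real / n ** (d + 1), 0.0) ** (1.0 / 2**d)

# Illustrative data: a random complex 1-bounded function on Z_7.
N = 7
rng = np.random.default_rng(1)
f = rng.uniform(0.0, 1.0, N) * np.exp(2j * np.pi * rng.uniform(0.0, 1.0, N))

u2_direct = gowers_norm(f, 2)
u2_fourier = np.sum(np.abs(np.fft.fft(f) / N) ** 4) ** 0.25
print(u2_direct, u2_fourier)
```

The two printed values should coincide up to rounding, reflecting $\|f\|_{U^2} = \|\widehat{f}\|_{\ell^4}$.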

We can also give the following inductive definition of the $U^d$ norms:

• Define the seminorm $\|f\|_{U^1} := |\mathbb{E}_{x\in\mathrm{Z}} f(x)|$.
• Let $T$ denote the standard shift operator on $\mathbb{C}^{\mathrm{Z}}$, defined by $T^hf(x) := f(x+h)$ for each $h \in \mathrm{Z}$. We then define for each $d \ge 2$

$\|f\|_{U^d}^{2^d} = \mathbb{E}_{h\in\mathrm{Z}}\, \|f\, \overline{T^hf}\|_{U^{d-1}}^{2^{d-1}}$. (*)

E.g.

$\|f\|_{U^3}^8 = \mathbb{E}_{x,h_1,h_2,h}\, f(x)\, \overline{f(x+h)}\, \overline{f(x+h_1)}\, f(x+h_1+h)\, \overline{f(x+h_2)}\, f(x+h_2+h)\, f(x+h_1+h_2)\, \overline{f(x+h_1+h_2+h)}$
$\qquad = \mathbb{E}_h\, \mathbb{E}_{x,h_1,h_2}\, (f\,\overline{T^hf})(x)\, \overline{(f\,\overline{T^hf})(x+h_1)}\, \overline{(f\,\overline{T^hf})(x+h_2)}\, (f\,\overline{T^hf})(x+h_1+h_2)$
$\qquad = \mathbb{E}_h\, \|f\,\overline{T^hf}\|_{U^2}^4$.

Note that $\|f\|_{U^1}^2 = \mathbb{E}_h \mathbb{E}_x\, f(x)\overline{f(x+h)}$. Applying Cauchy-Schwarz over $h$, we obtain $\|f\|_{U^1}^2 \le \big(\mathbb{E}_h |\mathbb{E}_x\, f(x)\overline{f(x+h)}|^2\big)^{1/2} = \|f\|_{U^2}^2$, i.e. $\|f\|_{U^1} \le \|f\|_{U^2}$. A similar application of Cauchy-Schwarz in (*) shows by induction that

The $U^d$ norms form an increasing sequence: $\forall\, d \ge 2$, $\|f\|_{U^d} \le \|f\|_{U^{d+1}}$.

Let us establish other basic properties of the $U^d$ norms, including the fact that they obey the triangle inequality. We begin by defining, for each $d \ge 2$, an operation analogous to the inner product $\langle f, g\rangle = \mathbb{E}_x f(x)\overline{g(x)}$.

Definition ($U^d$ product on $\mathbb{C}^{\mathrm{Z}}$). Let $d \ge 2$ and let $(f_v)_{v\in\{0,1\}^d}$ be a sequence of $2^d$ functions $\mathrm{Z} \to \mathbb{C}$. The $U^d$ product of these functions is denoted by $\langle (f_v)_v\rangle_{U^d}$ and defined by

$\langle (f_v)_v\rangle_{U^d} = \mathbb{E}_{x,h_1,h_2,\ldots,h_d\in\mathrm{Z}} \prod_{v\in\{0,1\}^d} \mathcal{C}^{|v|} f_v(x + v_1h_1 + \cdots + v_dh_d)$.

E.g. for $d = 2$, the $U^2$ product of functions $f_{00}, f_{01}, f_{10}, f_{11} \in \mathbb{C}^{\mathrm{Z}}$ is

$\langle f_{00}, f_{10}, f_{01}, f_{11}\rangle_{U^2} = \mathbb{E}_{x,h_1,h_2}\, f_{00}(x)\, \overline{f_{10}(x+h_1)}\, \overline{f_{01}(x+h_2)}\, f_{11}(x+h_1+h_2)$.

Note: if all $f_v$ are the same function $f$ then $\langle (f_v)_v\rangle_{U^d} = \|f\|_{U^d}^{2^d}$.

The Cauchy-Schwarz inequality implies the triangle inequality for $\|\cdot\|_{L^2}$. Similarly, to prove the triangle inequality for $\|\cdot\|_{U^d}$, we shall use an analogue of the Cauchy-Schwarz inequality for the $U^d$ product.

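Proposition (Gowers-Cauchy-Schwarz inequality). Let $d \ge 2$ and let $(f_v)_{v\in\{0,1\}^d}$ be a sequence of $2^d$ functions $\mathrm{Z} \to \mathbb{C}$. Then

$|\langle (f_v)_{v\in\{0,1\}^d}\rangle_{U^d}| \le \prod_{v\in\{0,1\}^d} \|f_v\|_{U^d}$.

To see the idea of the proof, let us just treat the case $d = 2$. We have

$|\langle f_{00}, f_{10}, f_{01}, f_{11}\rangle_{U^2}| = |\mathbb{E}_{x,h_1,h_2}\, f_{00}(x)\, \overline{f_{10}(x+h_1)}\, \overline{f_{01}(x+h_2)}\, f_{11}(x+h_1+h_2)|$.

Writing $y = x + h_2$, this is $\big|\mathbb{E}_{h_1} \big(\mathbb{E}_x\, f_{00}(x)\,\overline{f_{10}(x+h_1)}\big)\big(\mathbb{E}_y\, \overline{f_{01}(y)}\, f_{11}(y+h_1)\big)\big|$. We apply Cauchy-Schwarz over $h_1$. This is then at most the square root of

$\mathbb{E}_{h_1} \big|\mathbb{E}_x\, f_{00}(x)\,\overline{f_{10}(x+h_1)}\big|^2 \;\cdot\; \mathbb{E}_{h_1} \big|\mathbb{E}_y\, \overline{f_{01}(y)}\, f_{11}(y+h_1)\big|^2$.

We observe that this is $\langle f_{00}, f_{10}, f_{00}, f_{10}\rangle_{U^2}\, \langle f_{01}, f_{11}, f_{01}, f_{11}\rangle_{U^2}$. Repeating this argument in each of these two factors, the first factor is bounded above by

$\langle f_{00}, f_{00}, f_{00}, f_{00}\rangle_{U^2}^{1/2}\, \langle f_{10}, f_{10}, f_{10}, f_{10}\rangle_{U^2}^{1/2} = \|f_{00}\|_{U^2}^2\, \|f_{10}\|_{U^2}^2$.

Similarly, the second factor above is at most $\|f_{01}\|_{U^2}^2\, \|f_{11}\|_{U^2}^2$. We conclude that $|\langle f_{00}, f_{10}, f_{01}, f_{11}\rangle_{U^2}| \le \|f_{00}\|_{U^2}\, \|f_{10}\|_{U^2}\, \|f_{01}\|_{U^2}\, \|f_{11}\|_{U^2}$.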

We can now prove the desired property of $\|\cdot\|_{U^d}$.

Proposition. We have that $\|\cdot\|_{U^d}$ is a norm on $\mathbb{C}^{\mathrm{Z}}$, for each $d \ge 2$.

Proof. We know that $\|f\|_{U^d} \ge \|f\|_{U^2}$ and $\|f\|_{U^2} \ge \|\widehat{f}\|_\infty$. Hence it is clear that $\|\cdot\|_{U^d}$ is non-negative and non-degenerate. To see the triangle inequality, we use that

$|\langle (f_v)_{v\in\{0,1\}^d}\rangle_{U^d}| \le \prod_{v\in\{0,1\}^d} \|f_v\|_{U^d}$. (*)

Let $f, g : \mathrm{Z} \to \mathbb{C}$, and consider $\|f+g\|_{U^d}^{2^d} = \langle (f+g)_{v\in\{0,1\}^d}\rangle_{U^d}$. This $U^d$-product is linear (or conjugate-linear) in each entry of $(f+g)_v$. Using this, we can expand $\|f+g\|_{U^d}^{2^d}$ into a sum of $2^{2^d}$ terms, each term of the form $\langle (f_v)_v\rangle_{U^d}$ with $f_v = f$ or $g$. Applying (*) to each of these terms, we obtain

$\|f+g\|_{U^d}^{2^d} \le \sum_{r=0}^{2^d} \binom{2^d}{r}\, \|f\|_{U^d}^{r}\, \|g\|_{U^d}^{2^d - r} = \big(\|f\|_{U^d} + \|g\|_{U^d}\big)^{2^d}$.

So the $U^d$ norms with $d > 2$ are natural generalizations of the $U^2$ norm. But what is the use of these norms in arithmetic combinatorics?

Combinatorial use of the uniformity norms

The $U^d$ norms enable us to control averages over structures more complex than 3-term progressions. A central example of this is that one can control averages $S_k(f_1,\ldots,f_k) = \mathbb{E}_{x,d}\, f_1(x)\, f_2(x+d) \cdots f_k(x+(k-1)d)$ using $\|\cdot\|_{U^{k-1}}$.

Theorem (Generalized Von Neumann theorem). Let $k \ge 3$, suppose that $\gcd(|\mathrm{Z}|, (k-1)!) = 1$, and let $f_1, \ldots, f_k : \mathrm{Z} \to \mathbb{C}$ with $\|f_i\|_\infty \le 1$ for all $i \in [k]$. Then $|S_k(f_1,\ldots,f_k)| \le \min_{i\in[k]} \|f_i\|_{U^{k-1}}$.

Combinatorial consequence: for $A \subset \mathrm{Z}$ with $|A| = \alpha|\mathrm{Z}|$ we have $|S_k(1_A) - \alpha^k| \le 2^k\, \|f_A\|_{U^{k-1}}$. (Exercise 2)

The theorem is proved by induction on $k$. The case $k = 3$ was done earlier. Indeed, recalling that $\|f\|_{u^2} := \sup_{r\in\mathrm{Z}} |\widehat{f}(r)|$, we proved that if $|\mathrm{Z}| > 2$ is odd then $|S_3(f_1,f_2,f_3)| \le \min_{i\in[3]} \|f_i\|_{u^2}$. Also, $\|f_i\|_{u^2} = \|\widehat{f_i}\|_\infty \le \|\widehat{f_i}\|_{\ell^4} = \|f_i\|_{U^2}$. Hence $|S_3(f_1,f_2,f_3)| \le \min_{i\in[3]} \|f_i\|_{U^2}$.
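As a sanity check of the theorem in the case $k = 4$, one can compare $|S_4|$ with the $U^3$ norms of random 1-bounded functions on $\mathbb{Z}_N$ with $\gcd(N, 6) = 1$. This is only an illustrative sketch: the helper names, the choice $N = 7$ and the random data are assumptions, and the $U^3$ norm is computed through the inductive definition combined with the identity $\|g\|_{U^2}^4 = \sum_r |\widehat{g}(r)|^4$.

```python
import numpy as np

def u2_fourth_power(g):
    """||g||_{U^2}^4 = sum_r |hat g(r)|^4, with hat g(r) = E_x g(x) e(-rx/N)."""
    return np.sum(np.abs(np.fft.fft(g) / len(g)) ** 4)

def u3_norm(f):
    """U^3 norm via the inductive definition ||f||_{U^3}^8 = E_h ||f conj(T^h f)||_{U^2}^4."""
    n = len(f)
    # np.roll(f, -h)[x] equals f[(x + h) % n], i.e. the shift T^h f.
    vals = [u2_fourth_power(f * np.conj(np.roll(f, -h))) for h in range(n)]
    return np.mean(vals) ** 0.125

def s4(f1, f2, f3, f4):
    """S_4(f1, ..., f4) = E_{x,d} f1(x) f2(x+d) f3(x+2d) f4(x+3d)."""
    n = len(f1)
    x = np.arange(n)[:, None]
    d = np.arange(n)[None, :]
    return np.mean(f1[x % n] * f2[(x + d) % n] * f3[(x + 2 * d) % n] * f4[(x + 3 * d) % n])

# Illustrative data: four random complex 1-bounded functions on Z_7 (gcd(7, 3!) = 1).
N = 7
rng = np.random.default_rng(2)
fs = rng.uniform(0.0, 1.0, (4, N)) * np.exp(2j * np.pi * rng.uniform(0.0, 1.0, (4, N)))

lhs = abs(s4(*fs))
rhs = min(u3_norm(f) for f in fs)
print(lhs, rhs)  # the theorem predicts lhs <= rhs
```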


Proof. The case $k = 3$ is done (see above). To see the idea of the inductive step, let us look at $k = 4$. By a change of variables, it suffices to show $|S_4(f_1,\ldots,f_4)| \le \min_{2\le i\le 4} \|f_i\|_{U^3}$. By Cauchy-Schwarz applied to $|\mathbb{E}_{x,d}\, f_1(x)\, f_2(x+d)\, f_3(x+2d)\, f_4(x+3d)|^2$ over $x$, this is at most $\|f_1\|_{L^2}^2\; \mathbb{E}_x \big|\mathbb{E}_d\, f_2(x+d)\, f_3(x+2d)\, f_4(x+3d)\big|^2$. This last average equals

$\mathbb{E}_{x,d,d'}\, f_2(x+d)\, \overline{f_2(x+d')}\, f_3(x+2d)\, \overline{f_3(x+2d')}\, f_4(x+3d)\, \overline{f_4(x+3d')}$.

We change variables: $y = x + d$, $h = d' - d$. Then this average becomes

$\mathbb{E}_{y,d,h}\, f_2(y)\, \overline{f_2(y+h)}\, f_3(y+d)\, \overline{f_3(y+d+2h)}\, f_4(y+2d)\, \overline{f_4(y+2d+3h)}$.

For each $i \in [3]$ and $h \in \mathrm{Z}$, we define $g_{i,h}(y) = f_{i+1}(y)\, \overline{f_{i+1}(y+ih)}$. Then the last average is

$\mathbb{E}_h\, \mathbb{E}_{y,d}\, g_{1,h}(y)\, g_{2,h}(y+d)\, g_{3,h}(y+2d) = \mathbb{E}_h\, S_3(g_{1,h}, g_{2,h}, g_{3,h})$.

Case $k = 4$ continued: so far we have established that $|S_4(f_1,f_2,f_3,f_4)|^2 \le \mathbb{E}_h\, |S_3(g_{1,h}, g_{2,h}, g_{3,h})|$ where, for each $i \in [3]$ and $h \in \mathrm{Z}$, we defined $g_{i,h}(x) = f_{i+1}(x)\,\overline{f_{i+1}(x+ih)}$.

By induction (i.e. the case $k = 3$) we have $\mathbb{E}_h\, |S_3(g_{1,h}, g_{2,h}, g_{3,h})| \le \mathbb{E}_h \min_{i\in[3]} \|g_{i,h}\|_{U^2}$. By Fatou's lemma (here simply the fact that an average of minima is at most the minimum of the averages), this is at most $\min_{i\in[3]} \mathbb{E}_h \|g_{i,h}\|_{U^2}$. By Jensen's inequality, this is at most $\min_{i\in[3]} \big(\mathbb{E}_h \|g_{i,h}\|_{U^2}^4\big)^{1/4}$.

As $|\mathrm{Z}|$ is prime to $(k-1)! = 6$, the map $h \mapsto ih$ is a permutation of $\mathrm{Z}$ for each $i \in [3]$. Hence $\min_{i\in[3]} \big(\mathbb{E}_h \|g_{i,h}\|_{U^2}^4\big)^{1/4} = \min_{i\in[3]} \big(\mathbb{E}_h \|f_{i+1}\,\overline{T^hf_{i+1}}\|_{U^2}^4\big)^{1/4}$. Recalling the inductive definition of the $U^d$ norms, we see that this is $\min_{i\in[3]} \|f_{i+1}\|_{U^3}^2 = \min_{i\in\{2,3,4\}} \|f_i\|_{U^3}^2$. The case $k = 4$ follows. The inductive step for $k > 4$ is similar.

A summary

• The Fourier transform is useful in the area of additive combinatorics, in particular because it enables us to control averages such as $S_3(f_1,f_2,f_3)$.
• The Fourier transform is not delicate enough to control averages over more complex structures, such as $S_k(f_1,\ldots,f_k)$ for $k > 3$.
• The $U^2$ norm is closely related to the Fourier transform, and generalizes naturally to the $U^d$ norms, $d > 2$.
• The $U^d$ norms are also relevant in this area, as they enable us to control averages $S_k(f_1,\ldots,f_k)$, $k > 3$ (the Generalized Von Neumann Theorem).
• For the $U^2$ norm we have $\|f\|_{U^2} \le \|\widehat{f}\|_\infty^{1/2}$ for any $f \in \mathbb{C}^{\mathrm{Z}}$ with $|f| \le 1$. Thus, if $\|f\|_{U^2} \ge \eta > 0$, we can deduce that there exists a character $e_r \in \widehat{\mathrm{Z}}$ such that $|\langle f, e_r\rangle| \ge \eta^2$.

For the $U^d$ norms to be as useful as the Fourier transform (in particular, to yield a proof of Szemerédi's theorem), we need to deduce similarly usable information from the assumption that $f \in \mathbb{C}^{\mathrm{Z}}$ has $\|f\|_{U^d} \ge c$. We expect this information to say that $f$ then has large inner product with some sufficiently simple function (analogous to a character). This type of result is known as an inverse theorem for the $U^d$ norm.

Toward an inverse theorem for the U3 norm

The goal: find a family $\mathcal{F}_d$ of functions in $\mathbb{C}^{\mathrm{Z}}$ with the following features. Firstly, the functions in $\mathcal{F}_d$ should be simple/structured enough to be useful (like characters). Secondly, the family should be rich enough so that if $f \in \mathbb{C}^{\mathrm{Z}}$ with $|f| \le 1$ satisfies $\|f\|_{U^{d+1}} \ge \eta > 0$ then there is a function $Q \in \mathcal{F}_d$ such that $|\langle f, Q\rangle| \ge c$, with $c = c(\eta) > 0$ independent of $|\mathrm{Z}|$.

For $d = 1$ we have seen that we can take the family $\widehat{\mathrm{Z}}$ of characters $e_r$, $r \in \mathrm{Z}$, that is $\mathcal{F}_1 = \widehat{\mathrm{Z}}$. Now we want to find $\mathcal{F}_2$ for $\|\cdot\|_{U^3}$.

Can we take $\mathcal{F}_2$ again to be the family of characters? No. Already for the $U^3$ norm, this family is not rich enough to give an inverse theorem.

Example: consider again $f(x) = e(x^2/N)$ on $\mathbb{Z}_N$. This has $\|f\|_{U^3} = 1$, because

$x^2 - (x+h_1)^2 - (x+h_2)^2 + (x+h_1+h_2)^2 - (x+h_3)^2 + (x+h_1+h_3)^2 + (x+h_2+h_3)^2 - (x+h_1+h_2+h_3)^2 \equiv 0$.

Yet we have $\sup_r |\langle f, e_r\rangle| \le \frac{1}{\sqrt{N}}$ (seen earlier), which is not bounded away from 0 as $N \to \infty$.

The last example indicates that we should add more functions to the characters to get an adequate family $\mathcal{F}_2$. But which functions?

A "discrete calculus" viewpoint on the $U^d$ norms: Let $\mathrm{Z}$ be a finite abelian group. For every $h \in \mathrm{Z}$ and $f \in \mathbb{C}^{\mathrm{Z}}$, let $\Delta_hf$ denote the (discrete) multiplicative derivative of $f$ of step $h$, defined by $\Delta_hf(x) = f(x+h)\,\overline{f(x)} = (T^hf\, \overline{f})(x)$.

We can write $\|f\|_{U^d}$ in terms of multiplicative derivatives of order $d$:

$\|f\|_{U^d}^{2^d} = \mathbb{E}_{h_1,\ldots,h_d,x\in\mathrm{Z}}\, \Delta_{h_d} \cdots \Delta_{h_1} f(x)$.

In particular, suppose that $f(x) = e(\phi(x))$ for some map $\phi : \mathrm{Z} \to \mathbb{R}/\mathbb{Z}$. For $h \in \mathrm{Z}$, let $h\cdot\nabla$ be the difference operator with step $h$, defined by $(h\cdot\nabla)\phi(x) = \phi(x+h) - \phi(x)$. Then the last average becomes

$\mathbb{E}_{h_1,\ldots,h_d,x\in\mathrm{Z}}\, e\big((h_d\cdot\nabla)(h_{d-1}\cdot\nabla) \cdots (h_1\cdot\nabla)\phi(x)\big)$.

Using this point of view, the characters can be seen to be extrema for the $U^2$ norm, in the following sense.

Proposition (Characters are extrema for the $U^2$ norm). Let $f : \mathrm{Z} \to \mathbb{C}$ with $\|f\|_\infty \le 1$. Then $\|f\|_{U^2} = 1 \Leftrightarrow f(x) = e(\phi(x) + \theta)$, for some homomorphism $\phi : \mathrm{Z} \to \mathbb{R}/\mathbb{Z}$ and some constant $\theta \in \mathbb{R}/\mathbb{Z}$.

Proof. Any such function $f$ must have constant modulus 1, so we have $f(x) = e(\psi(x))$ for some function $\psi : \mathrm{Z} \to \mathbb{R}/\mathbb{Z}$. Then the equation $\|f\|_{U^2} = 1$ is equivalent to the following holding for all $x, h_1, h_2 \in \mathrm{Z}$:

$(h_2\cdot\nabla)(h_1\cdot\nabla)\psi(x) = \psi(x+h_1+h_2) - \psi(x+h_1) - \psi(x+h_2) + \psi(x) = 0$.

This is equivalent to $\psi$ being of the form $\phi(x) + \psi(0)$ for some homomorphism $\phi : \mathrm{Z} \to \mathbb{R}/\mathbb{Z}$, and the result follows.

By a phase function on $\mathrm{Z}$ we shall mean a function $\mathrm{Z} \to \mathbb{R}/\mathbb{Z}$. The above result can be viewed as an extreme case of the inverse theorem for $\|\cdot\|_{U^2}$. There is an analogous result for the $U^3$ norm, in which the linear functions $x \mapsto \phi(x) + \theta$ are replaced by quadratic phase functions.

Note: from now on we assume that $|\mathrm{Z}|$ is odd and at least 3, and we fix a non-degenerate symmetric bilinear form $\mathrm{Z}\times\mathrm{Z} \to \mathbb{R}/\mathbb{Z}$, $(x,y) \mapsto x\cdot y$.

A homomorphism $M : \mathrm{Z} \to \mathrm{Z}$ is self-adjoint (or symmetric) if we have $Mx\cdot y = My\cdot x$ for every $x, y \in \mathrm{Z}$.

Proposition (Quadratic exponentials as extrema of $\|\cdot\|_{U^3}$). Let $f : \mathrm{Z} \to \mathbb{C}$ with $\|f\|_\infty \le 1$. Then $\|f\|_{U^3} = 1$ if and only if there is a self-adjoint homomorphism $M : \mathrm{Z} \to \mathrm{Z}$, some $\xi \in \mathrm{Z}$, and $\theta \in \mathbb{R}/\mathbb{Z}$ such that for all $x \in \mathrm{Z}$ we have

$f(x) = e\big(Mx\cdot x + \xi\cdot x + \theta\big)$. (*)

See Exercise 3 (this is not easy though; see [4, Lemma 3.1]).

This result may suggest the following thought: perhaps an inverse theorem for $\|\cdot\|_{U^3}$ holds if instead of linear exponentials $e(\phi(x) + \theta)$ we use the larger family of quadratic exponentials of the form (*). Unfortunately, this larger set of functions is still not rich enough to form a suitable family $\mathcal{F}_2$. To see this, we need to consider phase functions that are linear or quadratic only locally, i.e. on a proper subset of $\mathrm{Z}$.

We shall use the following notation: for $x, h_1, h_2, \ldots, h_d \in \mathrm{Z}$, let us write $c_{x,h_1,\ldots,h_d}$ to denote the $d$-cube $(x + v\cdot(h_1,\ldots,h_d))_{v\in\{0,1\}^d}$ in $\mathrm{Z}^{\{0,1\}^d}$.

For instance $c_{x,h_1,h_2} = (x,\, x+h_1,\, x+h_2,\, x+h_1+h_2)$.

Definition (Locally linear and quadratic phase functions). Let $X$ be a subset of an abelian group. A function $\phi : X \to \mathbb{R}/\mathbb{Z}$ is linear on $X$ if

$(h_2\cdot\nabla)(h_1\cdot\nabla)\phi(x) = \phi(x+h_1+h_2) - \phi(x+h_2) - \phi(x+h_1) + \phi(x) = 0$ for every $c_{x,h_1,h_2} \in X^{\{0,1\}^2}$.

We say that $\phi$ is quadratic on $X$ if $(h_3\cdot\nabla)(h_2\cdot\nabla)(h_1\cdot\nabla)\phi(x) = 0$ for every $c_{x,h_1,h_2,h_3} \in X^{\{0,1\}^3}$.

We can now see that the global quadratics $Mx\cdot x + \xi\cdot x + \theta$ do not suffice.

Example (Gowers, 2001): let $P \subset \mathbb{Z}_N$ be the 2-dimensional progression $P = \{x_1 + Kx_2 : -K/10 \le x_1, x_2 \le K/10\}$, with $K = \lfloor\sqrt{N}\rfloor$. Let $f(x) = e\big((x_1^2 + x_2^2)/N\big)\, 1_P(x)$. The following facts can be checked:

• $\phi(x) = (x_1^2 + x_2^2)/N$ is quadratic on $P$.
• We have $\|f\|_{U^3} \gg 1$ as $N \to \infty$, essentially because the only non-zero contributions to $\|f\|_{U^3}$ come from 3-cubes in $P^{\{0,1\}^3}$, and each of these contributions is 1, since $\phi$ is quadratic on $P$.
• Every global quadratic phase on $\mathbb{Z}_N$ has the form $\psi(x) = \frac{rx^2}{N} + \frac{sx}{N} + \theta$, and then $|\langle f, e(\psi)\rangle|^2 \le \mathbb{E}_{h\in\mathbb{Z}_N} |\mathbb{E}_{x\in\mathbb{Z}_N}\, \Delta_hf(x)\, e(-2rhx/N)| = o(1)_{N\to\infty}$.

Thus, if we want $\mathcal{F}_2$ to consist of quadratic phase functions, then we must take into account these quadratics defined on multidimensional APs in $\mathbb{Z}_N$. It turns out that with these more general quadratics we do obtain a rich enough family $\mathcal{F}_2$ for a $U^3$ inverse theorem on $\mathbb{Z}_N$.

For this course it is technically more convenient to focus on a simpler version of this theorem, working on vector spaces $\mathbb{F}^n$ over a finite field $\mathbb{F}$ rather than on $\mathbb{Z}_N$. (In such spaces, the analogues of multidimensional APs are affine subspaces, which are much more convenient to work with.) To state the theorem, we use the following notion valid on any finite abelian group $\mathrm{Z}$.

For $X \subset \mathrm{Z}$, the local polynomial bias of degree $d$ of $f : X \to \mathbb{C}$ is the quantity $\|f\|_{u^d(X)} = \sup_{\phi : X\to\mathbb{R}/\mathbb{Z}} |\langle f, e(\phi)\rangle|$, where the supremum is over all $\phi : X \to \mathbb{R}/\mathbb{Z}$ that are polynomial of degree $d$ on $X$, i.e. such that for every $d$-cube $c_{x,h_1,\ldots,h_d} \in X^{\{0,1\}^d}$ we have $(h_d\cdot\nabla) \cdots (h_1\cdot\nabla)\phi(x) = 0$.

Note: the inverse theorem for $\|\cdot\|_{U^2}$ can be restated as the following inequality, for every $f : \mathrm{Z} \to \mathbb{C}$ with $|f| \le 1$: $\|f\|_{U^2(\mathrm{Z})} \le \|f\|_{u^2(\mathrm{Z})}^{1/2}$.

Theorem (Inverse theorem for the $U^3$ norm over finite fields). Let $\mathbb{F}$ be a finite field of odd prime order, and let $f : \mathbb{F}^n \to \mathbb{C}$ with $|f| \le 1$ such that $\|f\|_{U^3} \ge \eta > 0$. Then there is a subspace $W \le \mathbb{F}^n$ with $\dim(W) \ge n - O(\eta^{-O(1)})$, such that $\mathbb{E}_{y\in\mathbb{F}^n} \|f\|_{u^3(y+W)} \gg \eta^{O(1)}$. In particular, for some $y \in \mathbb{F}^n$ we have $\|f\|_{u^3(y+W)} = \Omega(\eta^{O(1)})$.

Remarks:

• This theorem was proved by Green and Tao [4].
• The result is combinatorially useful. In particular, as we shall see, it yields a proof of Szemerédi's theorem for $k = 4$ on $\mathbb{F}^n$, using a density-increment argument.
• There is an analogous theorem on groups $\mathbb{Z}_N$, but it is technically more demanding than the one above. Moreover, the generalization to $\|\cdot\|_{U^d}$, $d > 3$ in terms of local polynomial phases on $\mathbb{Z}_N$ is not clear. (Gowers did prove Szemerédi's theorem using the $U^d$ norms on $\mathbb{Z}_N$, but using only a weaker form of an inverse theorem for $\|\cdot\|_{U^d}$.) More about the general inverse theorem later...

Proof of the inverse theorem for the U3 norm on Fn

Step 1. If $f : \mathbb{F}^n \to \mathbb{C}$ with $|f| \le 1$ has large $U^3$ norm, then the complex argument of the function $h \mapsto \Delta_h(f)$ is a "somewhat linear" function, in that it resembles a function $\xi$ whose graph has many additive quadruples.

Proposition. Let $f : \mathbb{F}^n \to \mathbb{C}$ with $|f| \le 1$ and $\|f\|_{U^3} \ge \eta$. Then there is $H \subset \mathbb{F}^n$ with $\mathbb{P}(H) \ge \eta^8/2$, and $\xi : H \to \mathbb{F}^n$ such that $|\widehat{\Delta_hf}(\xi(h))| \ge \eta^4/\sqrt{2}$ for all $h \in H$, and such that the graph $\Gamma$ of $\xi$ has at least $\frac{\eta^{64}}{2^8}|\mathbb{F}^n|^3$ additive quadruples (i.e. quadruples $(a_1,a_2,a_3,a_4)$ with $a_1 + a_2 = a_3 + a_4$).

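Proof. By the inductive definition of $\|\cdot\|_{U^3}$ we have $\|f\|_{U^3}^8 = \mathbb{E}_h \|\Delta_hf\|_{U^2}^4$. The assumption $\|f\|_{U^3} \ge \eta$ is thus equivalent to $\mathbb{E}_h \|\Delta_hf\|_{U^2}^4 \ge \eta^8$. By the inverse theorem for $U^2$, we have $\mathbb{E}_h \|\Delta_hf\|_{u^2}^2 \ge \eta^8$.

Let $H := \{h \in \mathbb{F}^n : \|\Delta_hf\|_{u^2}^2 \ge \eta^8/2\}$. Since $|f| \le 1$, we have $\|\Delta_hf\|_{u^2} \le 1$, so $\eta^8 \le \mathbb{E}_h \|\Delta_hf\|_{u^2}^2 \le \mathbb{P}(H) + \eta^8/2$. We deduce the bound $\mathbb{P}(H) \ge \eta^8/2$. By definition of $\|\cdot\|_{u^2}$, there is a function $\xi : H \to \mathbb{F}^n$ satisfying $\big|\mathbb{E}_x\, \Delta_hf(x)\, e(-\xi(h)\cdot x)\big|^2 \ge \eta^8/2$ for every $h \in H$.

We now show that the graph of $\xi$ has many additive quadruples, using the Cauchy-Schwarz inequality. First, using the bound $\mathbb{P}(H) \ge \eta^8/2$ for the density of $H$, and that $|\widehat{\Delta_hf}(\xi(h))|^2 \ge \eta^8/2$ for each $h \in H$, we have

$\eta^{16}/4 \le \mathbb{E}_h \big|\mathbb{E}_x\, T^hf(x)\,\overline{f(x)}\, e(-\xi(h)\cdot x)\big|^2\, 1_H(h) = \mathbb{E}_{h,x,k}\, \overline{T^hf(x)}\, f(x)\, \overline{T^kf(x)}\, T^k(T^hf)(x)\, e(-\xi(h)\cdot k)\, 1_H(h)$.

By the triangle inequality over $x, k$ (and $|f(x)\,\overline{T^kf(x)}| \le 1$), we have

$\mathbb{E}_{x,k} \big|\mathbb{E}_h\, T^k(T^hf)(x)\, \overline{T^hf(x)}\, e(-\xi(h)\cdot k)\, 1_H(h)\big| \ge \eta^{16}/4$.

By Cauchy-Schwarz over $x, k$, we have

$\mathbb{E}_{x,k,h,\ell}\, T^k(T^{h+\ell}f)(x)\, \overline{T^{h+\ell}f(x)}\, e(-\xi(h+\ell)\cdot k)\, 1_H(h+\ell) \cdot \overline{T^k(T^hf)(x)}\, T^hf(x)\, e(\xi(h)\cdot k)\, 1_H(h) \ge \eta^{32}/16$.

Note that here the two exponentials multiply up to $e\big(-(\ell\cdot\nabla)\xi(h)\cdot k\big)$. The function $T^k(T^{h+\ell}f)(x)\, \overline{T^{h+\ell}f(x)}\, \overline{T^k(T^hf)(x)}\, T^hf(x)$ is really a function of $y, k, \ell$ where $y = x + h$. Using this, and the triangle inequality over $y, k, \ell$, we deduce

$\mathbb{E}_{k,\ell} \big|\mathbb{E}_h\, e\big(-(\ell\cdot\nabla)\xi(h)\cdot k\big)\, 1_H(h+\ell)\, 1_H(h)\big| \ge \eta^{32}/16$.

By Cauchy-Schwarz over $k, \ell$, we deduce

$\mathbb{E}_{h,k,\ell_1,\ell_2}\, e\big(-(\ell_2\cdot\nabla)(\ell_1\cdot\nabla)\xi(h)\cdot k\big)\, 1_H(h+\ell_1+\ell_2)\, 1_H(h+\ell_2)\, 1_H(h+\ell_1)\, 1_H(h) \ge \frac{\eta^{64}}{2^8}$.

We are done! Why? Consider the average over $k$ as a function of $h, \ell_1, \ell_2$:

$(h, \ell_1, \ell_2) \mapsto \mathbb{E}_k\, e\big(-(\ell_2\cdot\nabla)(\ell_1\cdot\nabla)\xi(h)\cdot k\big)\, 1_H(h+\ell_1+\ell_2)\, 1_H(h+\ell_2)\, 1_H(h+\ell_1)\, 1_H(h)$.

This equals 1 if $\xi(h+\ell_1+\ell_2) - \xi(h+\ell_2) - \xi(h+\ell_1) + \xi(h) = 0$ and $h+\ell_1+\ell_2$, $h+\ell_2$, $h+\ell_1$, $h$ are all in $H$, and 0 otherwise. Each such triple $(h, \ell_1, \ell_2)$ gives an additive quadruple in the graph $\Gamma$.

Step 2. $\xi$ agrees with a perfectly linear map on a large subspace. To prove this, we shall apply the following three very useful results.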
Balog-Szemerédi-Gowers Theorem (BSG) (Schoen's bounds, 2013). There exists $c > 0$ such that if $A \subset \mathrm{Z}$ has at least $\delta|A|^3$ additive quadruples, then there is $B \subset A$ with $|B| \ge c\,\delta|A|$ and $|B - B| \le c^{-1}\delta^{-4}|B|$.

In the following result [10, Corollary 5.29], $p$ is the characteristic of $\mathbb{F}$:

Chang Theorem for $\mathbb{F}^n$. If $A \subset \mathbb{F}^m$ has $\delta|A|^3$ additive quadruples, then there is a subspace $V \le \mathbb{F}^m$ with $V \subseteq 2A - 2A$ and $|V| \ge p^{-O(\delta^{-O(1)})}|A|$.

The third result is another well-known one in this area:

Plünnecke-Ruzsa inequality. Let $A, B$ be finite non-empty subsets of an abelian group such that $|A + B| \le C|A|$, for some $C \ge 1$. Then for all integers $k, \ell \ge 0$ we have $|kB - \ell B| \le C^{k+\ell}|A|$.
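The quantity appearing in the hypotheses above, the number of additive quadruples of a set, is easy to compute for small examples. The following is a minimal sketch; the function names and the two example sets in $\mathbb{F}_3^2$ are illustrative assumptions. It counts quadruples by first tallying how many ordered pairs produce each sum.

```python
from collections import Counter
from itertools import product

def add_mod(p):
    """Return coordinatewise addition on F_p^n, with elements encoded as tuples."""
    return lambda a, b: tuple((x + y) % p for x, y in zip(a, b))

def additive_quadruples(A, add):
    """Count quadruples (a1, a2, a3, a4) in A^4 with a1 + a2 = a3 + a4."""
    # If a sum s is hit by c ordered pairs, it contributes c * c quadruples.
    sums = Counter(add(a, b) for a, b in product(A, repeat=2))
    return sum(c * c for c in sums.values())

add = add_mod(3)
line = [(0, 0), (1, 0), (2, 0)]      # a subspace of F_3^2
generic = [(0, 0), (1, 0), (0, 1)]   # a set with less additive structure

print(additive_quadruples(line, add), len(line) ** 3)
print(additive_quadruples(generic, add), len(generic) ** 3)
```

A subspace attains the maximum possible count $|A|^3$, whereas a less structured set of the same size has fewer additive quadruples.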

Recall: the graph $\Gamma \subset \mathbb{F}^n\times\mathbb{F}^n$ of $\xi$ has at least $\frac{\eta^{64}}{2^8}|\Gamma|^3$ additive quadruples.

Lemma. There is a subspace $V \subset \mathbb{F}^n\times\mathbb{F}^n$ with $\dim(V) \ge n - O(\eta^{-O(1)})$ and some $(h_1,\xi_1) \in \mathbb{F}^n\times\mathbb{F}^n$ such that $|\Gamma \cap (V + (h_1,\xi_1))| \gg \eta^{O(1)}|V|$.

Proof. By BSG with $\delta = \frac{\eta^{64}}{2^8}$, there is $B \subset \Gamma$ with $|B - B| \ll \eta^{-O(1)}|B|$. In particular $B$ has $\gg \eta^{O(1)}|B|^3$ additive quadruples (Exercise 4). The Chang Theorem gives $V \subset 2B - 2B$ with the claimed dimension. Plünnecke-Ruzsa $\Rightarrow$ $|B + V| \le |B + 2B - 2B| \ll \eta^{-O(1)}|B|$. Then $|\Gamma \cap (B + V)| \ge |\Gamma \cap B| = |B| \gg \eta^{O(1)}|B + V|$. We partition $B + V$ into $|B + V|/|V|$ cosets of $V$. Applying the pigeonhole principle, we deduce that there is some coset $V + (h_1,\xi_1)$ such that $|\Gamma \cap (V + (h_1,\xi_1))| \gg \eta^{O(1)}|V|$, as claimed.

We have thus refined the underlying structure by passing from $B$ to an affine subspace $V + (h_1,\xi_1)$. We now obtain the main result of this step.

Theorem. Let $f : \mathbb{F}^n \to \mathbb{C}$ with $|f| \le 1$ and $\|f\|_{U^3} \ge \eta > 0$. Then there is a subspace $W_1 \le \mathbb{F}^n$ with $\dim(W_1) \ge n - O(\eta^{-O(1)})$, some $x_0 \in \mathbb{F}^n$, some linear map $M : W_1 \to \mathbb{F}^n$ and some $\xi_0 \in \mathbb{F}^n$, such that

$\mathbb{E}_{h\in W_1} |\widehat{\Delta_{x_0+h}f}(M(h) + \xi_0)| \gg \eta^{O(1)}$.

30 / 47 Proof. Let V be the subspace obtained in the previous lemma, and let n Γ1 = Γ ∩ (V + (h1, ξ1)). Let V0 = V ∩ ({0} × F ). Since Γ is a graph of a function, all sums in Γ1 + V0 are different, so |Γ1 + V0| = |Γ1||V0|. O(1) Also |Γ1 + V0| ≤ |V |, and by the previous lemma |Γ1|  η |V |. −O(1) Hence |V0|  η . It follows from basic linear algebra (Exercise 5) that there is a subspace n n W1 ≤ F and a linear map M : W1 → F such that the subspace n n V1 := {(h, M(h)) : h ∈ W1} ≤ F × F satisfies V0 + V1 = V −O(1) (in particular dim(W1) = dim(V1) = dim(V ) − dim(V0) ≥ n − O(η )). Partition V into translates of V1. The lower bound on |Γ1| and the pigeonhole principle then imply that for some (x0, ξ0) we have  O(1) |Γ ∩ V1 + (x0, ξ0) |  η |V1|. By definition of Γ, this is equivalent to   O(1) Ph∈W1 h + x0 ∈ H, ξ(h + x0) = M(h) + ξ0  η . η4 Recall: by definition of ξ we have ∆\h+x0 f (ξ(h + x0)) ≥ 21/2 for h + x0 ∈ H. Combining all this, we finally obtain

$$\mathbb{E}_{h \in W_1} |\widehat{\Delta_{h+x_0} f}(M(h) + \xi_0)| \;\ge\; \mathbb{P}_{h \in W_1}\big(h + x_0 \in H,\ \xi(h + x_0) = M(h) + \xi_0\big)\,\frac{\eta^4}{2^{1/2}} \;\gg\; \eta^{O(1)}.$$

Proof of the inverse theorem for the $U^3$ norm on $\mathbb{F}^n$

Step 3 (the symmetry argument). Let $W$ be the subspace on which $M$ is symmetric: $W = \{h \in W_1 : M(x) \cdot h = M(h) \cdot x \text{ for all } x \in W_1\} \le W_1$.

Lemma: we have $|W| \gg \eta^{O(1)} |W_1|$, so $\dim(W) \ge n - O(\eta^{-O(1)})$.

We say that a function $f$ is 1-bounded if $|f| \le 1$ everywhere.

Proof. Start from the result of step 2: $\eta^{O(1)} \ll \mathbb{E}_{h \in W_1} |\widehat{\Delta_{x_0+h} f}(M(h) + \xi_0)|$. This is $\mathbb{E}_{h \in W_1}\, b(h)\, \mathbb{E}_{x \in \mathbb{F}^n}\, T^{h+x_0} f(x)\, f(x)\, e(-(M(h) + \xi_0) \cdot x)$, for some 1-bounded function $b$. To focus on the important terms, we will use the notation $b$ to group together unimportant 1-bounded functions. Thus the last expression is written

$$\mathbb{E}_{h \in W_1} \mathbb{E}_{x \in \mathbb{F}^n}\, b(h)\, b(x)\, b(x + h)\, e(-M(h) \cdot x) \gg \eta^{O(1)}.$$

Splitting $\mathbb{F}^n$ into cosets of $W_1$, by the pigeonhole principle there is $x_1 \in \mathbb{F}^n$ such that

$$\mathbb{E}_{x,h \in W_1}\, b(h)\, b(x + x_1)\, b(x + x_1 + h)\, e(-M(h) \cdot (x + x_1)) \gg \eta^{O(1)}.$$

Since $x_1$ is now fixed, we can ignore it in the $b$ notation. Thus the last inequality becomes

$$\eta^{O(1)} \ll \mathbb{E}_{x,h \in W_1}\, b(h)\, b(x)\, b(x + h)\, e(-M(h) \cdot x) \le \mathbb{E}_{h \in W_1} \big|\mathbb{E}_{x \in W_1}\, b(x)\, b(x + h)\, e(-M(h) \cdot x)\big|,$$

where $b(h)$ is eliminated. Applying Cauchy-Schwarz over $h \in W_1$, we deduce that

$$\eta^{O(1)} \ll \mathbb{E}_{x,y,h \in W_1}\, b(x)\, b(y)\, b(x + h)\, b(y + h)\, e(-M(h) \cdot (y - x)).$$

We want to obtain expressions combining $x, y$ the desired way in $M$. To this end we change the variable $h$ to $z - x - y$, rewriting the last inequality as

$$\eta^{O(1)} \ll \mathbb{E}_{x,y,z \in W_1}\, b(x, z)\, b(y, z)\, e(-M(z - x - y) \cdot (y - x)).$$

Now note that

$$e(-M(z - x - y) \cdot (y - x)) = e(M(x) \cdot y - M(y) \cdot x)\; e(-M(z) \cdot y + M(y) \cdot y)\; e(M(z) \cdot x - M(x) \cdot x).$$

This enables us to absorb more terms into the $b$ functions, rewriting the last inequality:

$$\eta^{O(1)} \ll \mathbb{E}_{x,y,z \in W_1}\, b(x, z)\, b(y, z)\, e(M(x) \cdot y - M(y) \cdot x).$$

Now we want to eliminate the $b$ functions. To this end we first apply the pigeonhole principle relative to $z$ in the last average, deducing that

$$\eta^{O(1)} \ll \mathbb{E}_{x,y \in W_1}\, b(x)\, b(y)\, e(M(x) \cdot y - M(y) \cdot x).$$

Applying Cauchy-Schwarz over $x$ (similarly as before), we deduce that

$$\eta^{O(1)} \ll \mathbb{E}_{x,y,h \in W_1}\, b(y)\, b(y + h)\, e(M(x) \cdot h - M(h) \cdot x).$$

Finally, by the triangle inequality over $y, h$, we eliminate $b(y)\, b(y + h)$, obtaining that

$$\eta^{O(1)} \ll \mathbb{E}_{h \in W_1} \big|\mathbb{E}_{x \in W_1}\, e(M(x) \cdot h - M(h) \cdot x)\big|.$$
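The inner average in the last bound takes only the values 0 and 1, which is what makes the symmetry argument close. The following is a minimal numerical sketch of that fact, under assumptions not taken from the slides: it takes $W_1$ to be all of $\mathbb{F}_p^n$, uses the convention $e(t) = \exp(2\pi i t/p)$, and the matrix `M_EXAMPLE` and all function names are hypothetical.

```python
import cmath
import itertools
import math


def dot(u, v, p):
    """Dot product of two vectors over F_p."""
    return sum(a * b for a, b in zip(u, v)) % p


def apply_map(M, x, p):
    """Apply the matrix M (a list of rows) to the vector x over F_p."""
    return tuple(dot(row, x, p) for row in M)


def e(t, p):
    """Assumed convention: e(t) = exp(2*pi*i*t/p) for t in F_p."""
    return cmath.exp(2j * cmath.pi * (t % p) / p)


def character_average(M, h, p, n):
    """E_x e(M(x).h - M(h).x), averaging over all x in F_p^n."""
    Mh = apply_map(M, h, p)
    total = sum(
        e(dot(apply_map(M, x, p), h, p) - dot(Mh, x, p), p)
        for x in itertools.product(range(p), repeat=n)
    )
    return total / p ** n


def in_W(M, h, p, n):
    """True if M(x).h == M(h).x for every x in F_p^n."""
    Mh = apply_map(M, h, p)
    return all(
        dot(apply_map(M, x, p), h, p) == dot(Mh, x, p)
        for x in itertools.product(range(p), repeat=n)
    )


def check_indicator(M, p, n):
    """Compare the character average with the indicator of W at every h."""
    for h in itertools.product(range(p), repeat=n):
        expected = 1.0 if in_W(M, h, p, n) else 0.0
        gap = abs(character_average(M, h, p, n) - expected)
        if not math.isclose(gap, 0.0, abs_tol=1e-9):
            return False
    return True


# Hypothetical example: a non-symmetric matrix over F_3.
P, N_DIM = 3, 2
M_EXAMPLE = [[1, 2], [0, 1]]
W_SIZE = sum(
    in_W(M_EXAMPLE, h, P, N_DIM)
    for h in itertools.product(range(P), repeat=N_DIM)
)
```

Here `check_indicator(M_EXAMPLE, P, N_DIM)` compares the average with the indicator of $W$ at every $h$, and `W_SIZE` counts the elements of $W$ for this particular matrix.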

We are done! Indeed, note that $\mathbb{E}_{x \in W_1}\, e(M(x) \cdot h - M(h) \cdot x) = 1_W(h)$ (since $x \mapsto e(M(x) \cdot h - M(h) \cdot x)$ is a character on $W_1$). Hence the last inequality implies that $\eta^{O(1)} \ll \mathbb{E}_{h \in W_1} 1_W(h) = |W|/|W_1|$, as claimed. We now move on to the last step of the proof.

Final step. We start again from the conclusion of step 2, namely $\eta^{O(1)} \ll \mathbb{E}_{h \in W_1} |\widehat{\Delta_{x_0+h} f}(M(h) + \xi_0)|$. Partition $W_1$ into cosets of $W$. By the pigeonhole principle there is $x_1 \in W_1$ such that

$$\eta^{O(1)} \ll \mathbb{E}_{h \in W} |\widehat{\Delta_{x_0+x_1+h} f}(M(h + x_1) + \xi_0)| = \mathbb{E}_{h \in W}\, b(h)\, \mathbb{E}_{x \in \mathbb{F}^n}\, T^{x_0+x_1+h} f(x)\, f(x)\, e(-(M(h + x_1) + \xi_0) \cdot x).$$

We simplify this using the $b$ notation as follows: note that for every $\xi, x$ we have $e(\xi \cdot x) = e(\xi \cdot (x + h))\, e(-\xi \cdot h)$, so the last inequality is written

$$\eta^{O(1)} \ll \mathbb{E}_{h \in W,\, x \in \mathbb{F}^n}\, b(h)\, b(h + x)\, f(x)\, e(-M(h) \cdot x).$$

Now split $\mathbb{F}^n$ into cosets $y + W$, $y \in \mathbb{F}^n$. We obtain that

$$\eta^{O(1)} \ll \mathbb{E}_{y \in \mathbb{F}^n} \mathbb{E}_{h,x \in W}\, b(h)\, b(h + x + y)\, f(x + y)\, e(-M(h) \cdot (x + y)) = \mathbb{E}_{y \in \mathbb{F}^n} \mathbb{E}_{h,x \in W}\, b(h, y)\, b(h + x, y)\, f(x + y)\, e(-M(h) \cdot x).$$

By the symmetry $M(h) \cdot x = M(x) \cdot h$ for $x, h \in W$, we have (using $p$ odd)

$$e(-M(h) \cdot x) = e\big(-\tfrac{1}{2} M(x + h) \cdot (x + h)\big)\; e\big(\tfrac{1}{2} M(x) \cdot x\big)\; e\big(\tfrac{1}{2} M(h) \cdot h\big).$$

Substituting this into the last inequality and grouping factors, we obtain

$$\eta^{O(1)} \ll \mathbb{E}_{y \in \mathbb{F}^n} \mathbb{E}_{h,x \in W}\, b(h, y)\, b(h + x, y)\, f(x + y)\, e\big(\tfrac{1}{2} M(x) \cdot x\big).$$

For each $y$ let $f_y(x) = f(x + y)\, e\big(\tfrac{1}{2} M(x) \cdot x\big)$. We deduce (using Exercise 6) that

$$\eta^{O(1)} \ll \mathbb{E}_{y \in \mathbb{F}^n} \|f_y\|_{u^2(W)} = \mathbb{E}_{y \in \mathbb{F}^n} \|f\|_{u^3(y+W)}.$$

The density increment argument for $k = 4$ in $\mathbb{F}^n$

Suppose $A \subset \mathbb{F}^n$ with $\mathbb{P}(A) = \alpha$ has no non-trivial 4-AP. Then, by the generalized von Neumann theorem, the balanced function $f_A$ satisfies $\|f_A\|_{U^3} \ge \eta(\alpha) > 0$. The following result then yields Szemerédi's theorem for $k = 4$.

Proposition: Let $f : \mathbb{F}^n \to [-1, 1]$ with $\mathbb{E}_{\mathbb{F}^n} f = 0$ and $\|f\|_{U^3} \ge \eta > 0$. Then there is a subspace $V \le \mathbb{F}^n$ with $\dim V \ge n/2 - O(\eta^{-O(1)})$, and some $x_0 \in \mathbb{F}^n$, such that $\mathbb{E}_{x \in x_0 + V} f(x) \gg \eta^{O(1)}$.

Proof. Note the identity $\max(y, 0) = (|y| + y)/2$. This combined with $\mathbb{E}_{y \in \mathbb{F}^n} \mathbb{E}_{x \in y+W} f(x) = \mathbb{E}_{\mathbb{F}^n} f = 0$ implies that $\mathbb{E}_{y \in \mathbb{F}^n} |\mathbb{E}_{x \in y+W} f(x)| = 2\, \mathbb{E}_{y \in \mathbb{F}^n} \max(\mathbb{E}_{y+W} f, 0)$. We can also assume that $\mathbb{E}_{y+W} f \le \eta^C / C$ for as large a fixed constant $C > 0$ as we need (otherwise we are done). It follows that $\mathbb{E}_{y \in \mathbb{F}^n} |\mathbb{E}_{x \in y+W} f(x)| \le 2\eta^C / C$. Choosing $C$ large enough, we deduce that

$$\mathbb{E}_{y \in \mathbb{F}^n} \big( \|f\|_{u^3(y+W)} - 2\,|\mathbb{E}_{x \in y+W} f(x)| \big) \gg \eta^{O(1)}.$$

Hence there is $y \in \mathbb{F}^n$ such that $\|f\|_{u^3(y+W)} \ge 2\,|\mathbb{E}_{x \in y+W} f(x)| + \Omega(\eta^{O(1)})$. Without loss of generality we can assume $y = 0$. By definition of $\|\cdot\|_{u^3}$, there is a self-adjoint linear map $M : W \to W$ and $\xi \in W$ such that

$$|\mathbb{E}_{x \in W} f(x)\, e(-M(x) \cdot x - \xi \cdot x)| \ge 2\,|\mathbb{E}_W f| + \Omega(\eta^{O(1)}).$$

We split $W$ into the level sets $S_j = \{x \in W : M(x) \cdot x + \xi \cdot x = \tfrac{j}{p}\}$, $j \in [p]$. By the triangle inequality,

$$\sum_{j=1}^{p} |\mathbb{E}_{x \in W} 1_{S_j}(x) f(x)| \ge 2\,|\mathbb{E}_W f| + \Omega(\eta^{O(1)}).$$

We now use again $\max(y, 0) = \tfrac{1}{2}(|y| + y)$, to deduce that

$$\sum_{j=1}^{p} \max(\mathbb{E}_{x \in W} 1_{S_j}(x) f(x), 0) \ge \tfrac{1}{2} \sum_{j=1}^{p} |\mathbb{E}_{x \in W} 1_{S_j}(x) f(x)| + \tfrac{1}{2} \mathbb{E}_W f \ge |\mathbb{E}_W f| + \tfrac{1}{2} \mathbb{E}_W f + \Omega(\eta^{O(1)}) \ge \Omega(\eta^{O(1)}) = \Omega(\eta^{O(1)}) \sum_{j=1}^{p} \frac{|S_j|}{|W|}.$$

By the pigeonhole principle there is $j$ such that $\mathbb{E}_{x \in W} 1_{S_j}(x) f(x) \gg \eta^{O(1)} \frac{|S_j|}{|W|}$.

Now we split $S_j$ into affine subspaces. Let $U \le W$ be of maximum size such that $Mx \cdot y = 0$ for all $x, y \in U$. We have $\dim(U) \ge \tfrac{1}{2} \dim(W) - \tfrac{3}{2} \ge \tfrac{n}{2} - O(\eta^{-O(1)})$ (Exercise 8). Split $S_j$ into cosets of $U$. There is $x_1 + U$ such that $\mathbb{E}_{x \in x_1 + U} 1_{S_j}(x) f(x) > \Omega(\eta^{O(1)})\, \mathbb{P}_{x_1 + U}(S_j)$. Let $V = S_j \cap (x_1 + U)$. Then $\mathbb{E}_{x \in V} f(x) \gg \eta^{O(1)}$, in particular $V \ne \emptyset$. Note that $V$ is an affine subspace (why?). Moreover $\dim V \ge \dim(U) - 1 \ge \tfrac{n}{2} - O(\eta^{-O(1)})$.

As mentioned earlier, for other abelian groups, especially for $\mathbb{Z}_N$, it is not clear even how to state an inverse theorem for $\|\cdot\|_{U^d}$, $d > 3$, in terms of polynomial phases. A different approach, replacing polynomial phases with more easily generalizable objects, was inspired by ergodic theory.
A fruitful interplay between combinatorics and ergodic theory

Ergodic theory is based on the study of measure-preserving systems. A measure-preserving system is a quadruple $(X, \mathcal{B}, \mu, T)$ consisting of a measure space $(X, \mathcal{B}, \mu)$ (typically a standard probability space), and a measure-preserving map $T : X \to X$, i.e. a map such that for every $A \in \mathcal{B}$ the preimage $T^{-1}A$ is in $\mathcal{B}$ and $\mu(T^{-1}A) = \mu(A)$.

Some central aspects of the interplay in question:

• In the late 1970s, H. Furstenberg used ergodic theory to give a deep proof of Szemerédi's theorem [2]. This involved the analysis of certain ergodic averages, functions $x \mapsto \frac{1}{N} \sum_{n=1}^{N} f(T^n x)\, f(T^{2n} x) \cdots f(T^{(k-1)n} x)$ for $f$ a bounded measurable function on $X$.

• In the early 2000s, Host and Kra introduced analogues of the $U^d$ norms in ergodic theory, and used them to make important progress in the analysis of these averages [6]. In particular, they showed that to understand the behavior of these averages it suffices to analyze them on certain interesting spaces known as nilmanifolds.

• This inspired Green and Tao to rephrase the $U^3$ inverse theorem using nilmanifolds [4, §12]. This gave much clearer generalizations for $d > 3$.

A taste of nilmanifolds

The circle group $\mathbb{R}/\mathbb{Z}$ is the quotient of the commutative Lie group $\mathbb{R}$ by the lattice $\mathbb{Z}$. We obtain a nilmanifold if instead of $\mathbb{R}$ we take a nilpotent connected Lie group $G$, and instead of $\mathbb{Z}$ we take a lattice $\Gamma \le G$, i.e. a discrete subgroup such that $G/\Gamma$ is a compact topological space.

For a group $G$ and two subgroups $H, K \le G$, the commutator subgroup $[H, K]$ is the subgroup of $G$ generated by the commutators $[h, k] = h k h^{-1} k^{-1}$ with $h \in H$, $k \in K$. The lower central series of $G$ is the sequence $G_1 = G$, $G_2 = [G, G]$, $G_3 = [G, G_2]$, etc. We say that $G$ is $d$-step nilpotent if $G_{d+1} = \{\mathrm{id}_G\}$.

A $d$-step nilmanifold is a quotient topological space $G/\Gamma$ where $G$ is a connected $d$-step nilpotent Lie group and $\Gamma$ is a lattice in $G$.

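1-step nilmanifolds are tori.

2-step example: the Heisenberg nilmanifold.

$$G = \begin{pmatrix} 1 & \mathbb{R} & \mathbb{R} \\ 0 & 1 & \mathbb{R} \\ 0 & 0 & 1 \end{pmatrix}, \qquad \Gamma = \begin{pmatrix} 1 & \mathbb{Z} & \mathbb{Z} \\ 0 & 1 & \mathbb{Z} \\ 0 & 0 & 1 \end{pmatrix}.$$

We can view $G/\Gamma$ as a bundle of circles $\mathbb{R}/\mathbb{Z}$ over the 2-torus $\mathbb{R}^2/\mathbb{Z}^2$.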
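To see concretely why the Heisenberg group is 2-step nilpotent, one can compute a commutator and check that it is central. The sketch below is an illustration only: it uses exact rationals in place of real entries, and the helper names `heis`, `mul`, `inv`, `commutator` are hypothetical.

```python
from fractions import Fraction


def heis(x, y, z):
    """The matrix [[1, x, z], [0, 1, y], [0, 0, 1]] in G."""
    return ((1, x, z), (0, 1, y), (0, 0, 1))


IDENTITY = heis(0, 0, 0)


def mul(A, B):
    """Product of two 3x3 matrices given as tuples of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3))
        for i in range(3)
    )


def inv(A):
    """Inverse of heis(x, y, z), namely heis(-x, -y, x*y - z)."""
    x, z, y = A[0][1], A[0][2], A[1][2]
    return heis(-x, -y, x * y - z)


def commutator(A, B):
    """[A, B] = A B A^{-1} B^{-1}."""
    return mul(mul(A, B), mul(inv(A), inv(B)))


A = heis(Fraction(1, 2), Fraction(3), Fraction(-1, 3))
B = heis(Fraction(2), Fraction(1, 5), Fraction(7))
C = commutator(A, B)  # an element of G_2 = [G, G]
```

The commutator `C` has only its top-right entry non-trivial, and commuting it once more with `A` or `B` gives the identity, which is the statement $G_3 = \{\mathrm{id}_G\}$ on these sample elements.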

An inverse theorem for $\|\cdot\|_{U^d}$ using nilmanifolds

If we phrase the $U^3$ inverse theorem for $\mathbb{Z}_N$ with a family $\mathcal{F}_2$ defined in terms of locally quadratic phases, then it is unclear how to generalize such functions for $U^d$, $d > 3$. Green and Tao reinterpreted these local quadratics using 2-step nilmanifolds, identifying them as 2-step nilsequences.

A (linear) $d$-step nilsequence is a map $\mathbb{Z} \to \mathbb{C}$ of the form $n \mapsto F(g^n \Gamma)$, where $G/\Gamma$ is a $d$-step nilmanifold, $g \in G$, and $F : G/\Gamma \to \mathbb{C}$ is continuous.

Example: a character $n \mapsto e(n\theta)$ on $\mathbb{Z}$ is a 1-step nilsequence, with $G/\Gamma = \mathbb{R}/\mathbb{Z}$, $F = e$, $g^n \Gamma = n\theta + \mathbb{Z}$.

Theorem ($U^3$ inverse theorem in terms of 2-step nilsequences)

Let $f : \mathbb{Z}_N \to \mathbb{C}$ with $|f| \le 1$ and $\|f\|_{U^3} \ge \eta > 0$. Then there is $M = M(\eta) > 0$ and a 2-step nilsequence $Q : n \mapsto F(g^n \Gamma)$ of complexity at most $M$ such that $|\mathbb{E}_{n \in [N]} f(n)\, Q(n)| \ge 1/M$.

The generalization of this theorem for $U^d$, $d > 3$, is clear. Green, Tao, and Ziegler eventually managed to prove this general inverse theorem [5].

Understanding the relation between $\|\cdot\|_{U^d}$ and nilmanifolds

The inverse theorem establishes a relation between the uniformity norms and nilmanifolds, but our understanding of this relation is not yet fully satisfactory. (In particular, a better such understanding should lead to better bounds in the theorem.) A conceptual approach to shed light on this relation was initiated by Host and Kra, and pursued by Szegedy.

To give an idea of this approach, let us think again about a character $x \mapsto e(rx/N)$ on $\mathbb{Z}_N$. This is a composition $e \circ \phi$ of a homomorphism $\phi : \mathbb{Z}_N \to \mathbb{R}/\mathbb{Z}$, $x \mapsto rx/N$, with the continuous function $e : \mathbb{R}/\mathbb{Z} \to \mathbb{C}$.

Let us focus on $\phi$. We can express the property of being an (affine) homomorphism using 2-cubes in $\mathbb{Z}_N$, and this is relevant to the $U^2$ norm. Recall that these 2-cubes are elements $(x, x + h_1, x + h_2, x + h_1 + h_2) \in \mathbb{Z}_N^4$.

Observation: a map $\phi : \mathbb{Z}_N \to \mathbb{R}/\mathbb{Z}$ is an affine homomorphism
⇔ $\phi$ preserves 2-cubes (i.e. additive quadruples)
⇔ for all $(x_1, x_2, x_3, x_4) \in \mathbb{Z}_N^4$, $x_1 - x_2 = x_3 - x_4 \Rightarrow \phi(x_1) - \phi(x_2) = \phi(x_3) - \phi(x_4)$.
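The observation can be tested by brute force for small $N$. The sketch below makes a simplifying assumption not in the slides: it only considers maps with values in $\frac{1}{N}\mathbb{Z}/\mathbb{Z}$, encoded as a list `phi` where `phi[x] = t` stands for the point $t/N$ of $\mathbb{R}/\mathbb{Z}$. The function name is hypothetical.

```python
def preserves_2_cubes(phi, N):
    """Check phi(x) - phi(x+h1) - phi(x+h2) + phi(x+h1+h2) = 0 for all x, h1, h2.

    phi is a list of length N; phi[x] = t encodes the point t/N of R/Z.
    """
    return all(
        (phi[x] - phi[(x + h1) % N] - phi[(x + h2) % N] + phi[(x + h1 + h2) % N]) % N == 0
        for x in range(N)
        for h1 in range(N)
        for h2 in range(N)
    )


N = 7
affine_map = [(3 * x + 2) % N for x in range(N)]     # x -> (3x + 2)/N
quadratic_map = [(x * x) % N for x in range(N)]      # x -> x^2/N
```

With these sample maps, the affine map passes the check and the quadratic one fails it, since its second difference is $2 h_1 h_2 / N$.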

Let us reason similarly for the $U^3$ norm: let us think about maps $\phi$ on $\mathbb{Z}_N$ that preserve 3-cubes.

By examining maps $\phi : \mathbb{Z}_N \to \mathbb{R}/\mathbb{Z}$ that preserve 3-cubes, can we recover at least some of the examples of functions in $\mathcal{F}_2$ seen so far?

Recall: we define 3-cubes in $\mathbb{Z}_N$ to be 8-tuples in $\mathbb{Z}_N^8$ of the form

$$(x,\ x + h_1,\ x + h_2,\ x + h_1 + h_2,\ x + h_3,\ x + h_1 + h_3,\ x + h_2 + h_3,\ x + h_1 + h_2 + h_3).$$

Let 3-cubes in $\mathbb{R}/\mathbb{Z}$ also be 8-tuples in $(\mathbb{R}/\mathbb{Z})^8$ of this form. What is then a map $\phi : \mathbb{Z}_N \to \mathbb{R}/\mathbb{Z}$ that preserves these 3-cubes?

Focusing on the first 4 coordinates above, we see that $\phi$ must in particular send any quadruple $(x, x + h_1, x + h_2, x + h_1 + h_2)$ to another quadruple of this form in $\mathbb{R}/\mathbb{Z}$. Hence $\phi$ is again just an affine homomorphism.

Thus, by using the same notion of 3-cube on $\mathbb{R}/\mathbb{Z}$ as on $\mathbb{Z}_N$, we obtain a family of maps $\phi$ that is not rich enough for a $U^3$ inverse theorem.

Idea: let us enrich the family of maps $\phi$, by extending (or weakening) the notion of 3-cube that we allow on the target space $\mathbb{R}/\mathbb{Z}$.

Note: an equivalent way to express a 3-cube

$$(x,\ x + h_1,\ x + h_2,\ x + h_1 + h_2,\ x + h_3,\ x + h_1 + h_3,\ x + h_2 + h_3,\ x + h_1 + h_2 + h_3)$$

is as an 8-tuple $(a_1, \ldots, a_8)$ satisfying the following four linear equations:

a1 − a2 + a4 − a3 = 0

a1 − a2 + a6 − a5 = 0

a1 − a3 + a7 − a5 = 0

a1 − a2 + a4 − a3 + a7 − a8 + a6 − a5 = 0.
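As a sanity check, the equivalence between the parametrized form and the four equations can be verified exhaustively for a small modulus. This is a sketch only: it works in $\mathbb{Z}_N$ rather than $\mathbb{R}/\mathbb{Z}$, and the function names are hypothetical.

```python
import itertools


def three_cubes(N):
    """All 8-tuples (x, x+h1, x+h2, x+h1+h2, x+h3, ...) in Z_N, in the order a1..a8."""
    cubes = set()
    for x, h1, h2, h3 in itertools.product(range(N), repeat=4):
        cubes.add(tuple(
            (x + v1 * h1 + v2 * h2 + v3 * h3) % N
            for v3 in (0, 1)
            for v2 in (0, 1)
            for v1 in (0, 1)
        ))
    return cubes


def satisfies_equations(a, N):
    """Check the four linear equations above, modulo N."""
    a1, a2, a3, a4, a5, a6, a7, a8 = a
    equations = (
        a1 - a2 + a4 - a3,
        a1 - a2 + a6 - a5,
        a1 - a3 + a7 - a5,
        a1 - a2 + a4 - a3 + a7 - a8 + a6 - a5,
    )
    return all(t % N == 0 for t in equations)


def solutions(N):
    """All 8-tuples in Z_N^8 satisfying the four equations (brute force)."""
    return {
        a for a in itertools.product(range(N), repeat=8)
        if satisfies_equations(a, N)
    }
```

For a small modulus such as $N = 3$, comparing `solutions(3)` with `three_cubes(3)` confirms that the two descriptions give the same set of 8-tuples.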

Observation: let us relax our definition of cubes on $\mathbb{R}/\mathbb{Z}$ by keeping just the last equation, and keep the original definition of 3-cubes on $\mathbb{Z}_N$. A map $\phi : \mathbb{Z}_N \to \mathbb{R}/\mathbb{Z}$ sends 3-cubes on $\mathbb{Z}_N$ to these more general 3-cubes on $\mathbb{R}/\mathbb{Z}$ if and only if, for every $x, h_1, h_2, h_3 \in \mathbb{Z}_N$,

$$\phi(x) - \phi(x + h_1) - \phi(x + h_2) + \phi(x + h_1 + h_2) - \phi(x + h_3) + \phi(x + h_1 + h_3) + \phi(x + h_2 + h_3) - \phi(x + h_1 + h_2 + h_3) = 0.$$

So: $\phi$ preserves 3-cubes in this sense ⇔ $\phi$ is a quadratic phase function!
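The relaxed condition is an alternating sum over the vertices of a 3-cube, which is easy to test numerically. The sketch below again assumes maps with values in $\frac{1}{N}\mathbb{Z}/\mathbb{Z}$ (encoded by `phi[x] = t` for the point $t/N$), and the sample polynomials are hypothetical choices.

```python
import itertools


def preserves_relaxed_3_cubes(phi, N):
    """Check that the alternating sum of phi over every 3-cube of Z_N vanishes.

    phi is a list of length N; phi[x] = t encodes the point t/N of R/Z.
    The vertex x + v1*h1 + v2*h2 + v3*h3 carries the sign (-1)^(v1+v2+v3).
    """
    for x, h1, h2, h3 in itertools.product(range(N), repeat=4):
        total = 0
        for v1, v2, v3 in itertools.product((0, 1), repeat=3):
            sign = (-1) ** (v1 + v2 + v3)
            total += sign * phi[(x + v1 * h1 + v2 * h2 + v3 * h3) % N]
        if total % N != 0:
            return False
    return True


N = 7
quadratic = [(3 * x * x + 5 * x + 1) % N for x in range(N)]  # x -> (3x^2 + 5x + 1)/N
cubic = [(x ** 3) % N for x in range(N)]                     # x -> x^3/N
```

On these samples the quadratic passes and the cubic fails, matching the claim that the relaxed cubes pick out exactly the quadratic phases among polynomial maps of this kind.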


So it seems promising to enlarge our family of maps $\phi$ on $\mathbb{Z}_N$ by generalizing the definition of 3-cubes on the target space of $\phi$. So far we have enlarged our family from affine homomorphisms to quadratic phases on $\mathbb{Z}_N$. We know this is not enough for a $U^3$ inverse theorem. To make this direction fruitful, we must go further and consider a more general notion of cubes.

This approach was pioneered by Host and Kra, who defined abstract parallelepiped structures, in terms of a few axioms. This program was developed further by Camarena and Szegedy [1]. They defined the notion of a nilspace, which can be viewed as a very general way to define cubes on a set. To define a nilspace, we need:

• An abstract notion of an $n$-dimensional cube: we take the set $\{0, 1\}^n$, which we call the discrete $n$-cube.

• Morphisms of discrete cubes: maps $\varphi : \{0, 1\}^n \to \{0, 1\}^m$ with the following property. Embedding $\{0, 1\}^n$ and $\{0, 1\}^m$ in $\mathbb{Z}^n$, $\mathbb{Z}^m$ respectively, the map $\varphi$ is a restriction of an affine homomorphism $\mathbb{Z}^n \to \mathbb{Z}^m$. Note: if $\varphi$ is also bijective, then it is a Euclidean isometry of the $n$-cube.

An $m$-face of $\{0, 1\}^n$ is a subset obtained by fixing $n - m$ coordinates of $v \in \{0, 1\}^n$.

Definition: a nilspace is a set $X$ together with a collection of cube sets $C^n(X) \subset X^{\{0,1\}^n}$, for each integer $n \ge 0$, satisfying the following axioms:

1. (Composition) For every morphism $\varphi : \{0, 1\}^m \to \{0, 1\}^n$ and every cube $c \in C^n(X)$, we have $c \circ \varphi \in C^m(X)$.
2. (Ergodicity) $C^1(X) = X^{\{0,1\}}$.
3. (Corner completion) Let $c' : \{0, 1\}^n \setminus \{1^n\} \to X$ be such that every restriction of $c'$ to an $(n - 1)$-face containing $0^n$ is in $C^{n-1}(X)$. Then there exists $c \in C^n(X)$ such that $c(v) = c'(v)$ for all $v \ne 1^n$.

We say that $X$ is a $d$-step nilspace if, for $n = d + 1$, in axiom 3 the cube $c$ completing the corner $c'$ is always unique.

Camarena and Szegedy proved a deep result: under certain natural topological assumptions, a $d$-step nilspace is essentially a $d$-step nilmanifold! This result is a fundamental part of a very interesting conceptual approach to proving inverse theorems for the $U^d$ norms.

A simple case of the Camarena-Szegedy theorem

Let us show that, for every 1-step nilspace $X$, one can use the cubes on $X$ to define an abelian group operation. Recall the 1-step nilspace axioms:

1. (Composition) For every morphism $\varphi : \{0, 1\}^m \to \{0, 1\}^n$ and every $c \in C^n(X)$, we have $c \circ \varphi \in C^m(X)$.
2. (Ergodicity) $C^1(X) = X^{\{0,1\}}$.
3. (Corner completion) For every corner $c' : \{0, 1\}^n \setminus \{1^n\} \to X$, there exists a unique $c \in C^n(X)$ such that $c(v) = c'(v)$ for all $v \ne 1^n$.

Fix any point $e \in X$. We define a binary operation $+$ on $X$ as follows: for all $x, y \in X$, let $x + y$ be the unique element $z \in X$ such that the map $c : \{0, 1\}^2 \to X$ with values $c(00) = e$, $c(10) = x$, $c(01) = y$, $c(11) = z$ is in $C^2(X)$.

Claim: this is a commutative group operation.

Commutativity: let $\theta : \{0, 1\}^2 \to \{0, 1\}^2$ be the transposition permuting $01$ and $10$ (this is a morphism). Axiom 1 ⇒ $c \circ \theta \in C^2(X)$. Then, by uniqueness of corner completion, we have $x + y = z = y + x$.
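The construction of $+$ from corner completion can be tried out on a toy model. The sketch below assumes a specific, hypothetical cube structure that is not taken from the slides: $X = \mathbb{Z}_m$ with 2-cubes the additive quadruples. With base point $e$, the resulting operation is ordinary addition shifted so that $e$ becomes the neutral element.

```python
def is_2_cube(c, m):
    """c = (c00, c10, c01, c11); assumed cube structure on Z_m (additive quadruples)."""
    c00, c10, c01, c11 = c
    return (c00 - c10 - c01 + c11) % m == 0


def add(x, y, e, m):
    """x + y: the unique z such that (e, x, y, z) is a 2-cube."""
    completions = [z for z in range(m) if is_2_cube((e, x, y, z), m)]
    assert len(completions) == 1  # unique corner completion, as in the 1-step axiom
    return completions[0]


M_SIZE, BASE = 5, 2  # hypothetical choices of modulus and base point e
ELEMENTS = range(M_SIZE)

commutative = all(
    add(x, y, BASE, M_SIZE) == add(y, x, BASE, M_SIZE)
    for x in ELEMENTS for y in ELEMENTS
)
neutral = all(add(BASE, x, BASE, M_SIZE) == x for x in ELEMENTS)
```

In this model `add(x, y, BASE, M_SIZE)` equals $x + y - e$ modulo $m$, so a different choice of base point gives an isomorphic group with a different neutral element.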

Associativity: suppose that $x, y, z \in X$ and consider the following 3-corner $c' : \{0, 1\}^3 \setminus \{1^3\} \to X$.

[Figure: the 3-corner drawn on a cube, with vertices labelled $e$, $x$, $y$, $z$, $x + y$, $x + z$, $y + z$.]

Here we have $c(000) = e$, $c(100) = x$, $c(010) = y$, $c(001) = z$, $c(110) = x + y$, etc. This corner has a unique completion $c \in C^3(X)$. Consider the following discrete-cube morphisms:

$\varphi_1 : \{0, 1\}^2 \to \{0, 1\}^3$, $(v(1), v(2)) \mapsto (v(1), v(1), v(2))$,
$\varphi_2 : \{0, 1\}^2 \to \{0, 1\}^3$, $(v(1), v(2)) \mapsto (v(1), v(2), v(2))$.

We then have

$$(x + y) + z = c \circ \varphi_1(11) = c(111) = c \circ \varphi_2(11) = x + (y + z).$$

Similar arguments show that $e$ is the neutral element, and also that every element has an inverse.

References

[1] O. A. Camarena, B. Szegedy, Nilspaces, nilmanifolds and their morphisms, arXiv:1009.3825.
[2] H. Furstenberg, Ergodic behavior of diagonal measures and a theorem of Szemerédi on arithmetic progressions, J. Analyse Math. 31 (1977), 204–256.
[3] W. T. Gowers, A new proof of Szemerédi's theorem, GAFA 11 (2001), 465–588.
[4] B. Green, T. Tao, An inverse theorem for the Gowers U3-norm, Proc. Edinb. Math. Soc. (1) 51 (2008), 73–153.
[5] B. Green, T. Tao, T. Ziegler, An inverse theorem for the Gowers Us+1[N]-norm, Ann. of Math. (2) 176 (2012), no. 2, 1231–1372.
[6] B. Host, B. Kra, Nonconventional ergodic averages and nilmanifolds, Ann. of Math. (2) 161 (2005), no. 1, 397–488.
[7] B. Host, B. Kra, Parallelepipeds, nilpotent groups and Gowers norms, Bull. Soc. Math. France 136 (2008), no. 3, 405–437.
[8] K. F. Roth, On certain sets of integers, J. London Math. Soc. 28 (1953), 104–109.
[9] E. Szemerédi, On sets of integers containing no k elements in arithmetic progressions, Acta Arith. 27 (1975), 199–245.
[10] T. Tao, V. Vu, Additive combinatorics, Cambridge University Press, 2006.
[11] P. Varnavides, On certain sets of positive density, J. London Math. Soc. 34 (1959), 358–360.