Ultraproducts and the Foundations of Higher Order Fourier Analysis
Evan Warner Advisor: Nicolas Templier
Submitted in partial fulfillment of the requirements for the degree of Bachelor of Arts Department of Mathematics Princeton University
May 2012 I hereby declare that this thesis represents my own work in accordance with University regulations.
Evan Warner Contents 1. Introduction 1 1.1. Higher order Fourier analysis 1 1.2. Ultraproduct analysis 1 1.3. Overview of the paper 2 2. Preliminaries and a motivating example 4 2.1. Ultraproducts 4 2.2. A motivating example: Szemer´edi’stheorem 11 3. Analysis on ultraproducts 15 3.1. The ultraproduct construction 15 3.2. Measurable functions on the ultraproduct 20 3.3. Integration theory on the ultraproduct 22 4. Product and sub σ algebras 28 − − 4.1. Definitions and the Fubini theorem 28 4.2. Gowers (semi-)norms 31 4.3. Octahedral (semi-)norms and quasirandomness 38 4.4. A few Hilbert space results 40 4.5. Weak orthogonality and coset σ-algebras 42 4.6. Relative separability and essential σ-algebras 45 4.7. Higher order Fourier σ-algebras 48 5. Further results and extensions 52 5.1. The second structure theorem 52 5.2. Prospects for extensions to infinite spaces 58 5.3. Extension to finitely additive measures 60 Appendix A. Proof ofLo´s’ ! theorem and the saturation theorem 62 References 65 1. Introduction 1.1. Higher order Fourier analysis. The subject of higher order Fourier analysis had its genesis in a seminal paper of Gowers, in which the Gowers norms U k for functions on finite groups, where k is a positive integer, were introduced. These norms were used to produce an alternative proof of Szemer´edi’stheorem (see [12] and [13]) on the existence of arbitrarily long arithmetic progressions in sets of integers of positive density.1 Briefly, while the U 2 norm measures noise from a “Fourier theoretic” point of view — that is, correlations with linear characters — there is substantial amount of structure uncaptured by this perspective. As it turns out, for example, the U 3 norm encapsulates a function’s correlations with “qua- dratic characters,” and a function can be structured from this perspective (i.e., possessing a large U 3 norm) while being random from a Fourier-analytic perspective (i.e., possessing a small U 2 norm, which is equivalent to having a small maximum Fourier coefficient). In the context of Gowers’ proof and in additive combinatorics more generally, these higher-order correlations serve to keep track of arithmetic progressions in sets. In the k =2 setting of ordinary Fourier analysis, one can manage progressions of length three, which is reflected in the fact that there is a Fourier-analytic proof of Szemer´edi’stheorem for progressions of length three (also known as Roth’s theorem; see [33] for a proof in this con- text). Unsurprisingly, in order to manage progressions of arbitrary length, Gowers employed the U k norms for arbitrarily high k. Although the proof did not use higher order Fourier analysis directly, the famous result of Tao and Green on the existence of arbitrarily large arithmetic progressions of primes [14] was heavily influenced by similar ideas. The subject has subsequently benefited from an influx of ergodic theory, starting with Host and Kra [17]. More recently, Balazs Szegedy has put the subject of higher order Fourier analysis on a more algebraic footing, relying heavily on a novel ultraproduct construction that was first developed in the context of hypergraph regularization (see [8]). In a series of three papers [27], [28], [29], Szegedy laid out a framework based around certain “higher order Fourier” σ-algebras k defined on the ultraproduct of a sequence of finite groups. The ultraproduct constructionF is employed because a satisfactory algebraic theory of higher order Fourier analysis, in contrast to ordinary Fourier analysis, seems to appear only in the limit of larger and larger ordinary structures. Although this avenue will not be pursued in this paper, one of the great strengths of the ultraproduct theory is that nearly everything proven in the setting of the ultraproduct has a corresponding version in the finite setting, usually at the cost of exactness (for example, replacing precise equalities with "-close approximations). Thus, a development in the ultraproduct setting gives rise to an analogous development in standard models. More recently in [31], Szegedy took the theory in a different direction when he proved that in a certain sense the higher-order structure arises from algebraic structures known as nilspaces. 1.2. Ultraproduct analysis. First developed in a model-theoretic context in the 1950s by Tarski,Lo´s,and ! others, the ultraproduct offers a method of taking very general limits objects of first-order logical structures. In a manner analogous to the construction of the re- als as equivalence classes of Cauchy sequences modulo the relation of “difference tending to zero,” an ultraproduct (or ultrapower, as it is called if all the original structures are equal) is
1In reference to the U k norms in his original paper, Gowers wrote, amusingly, “Although I cannot think of any potential applications, I still feel that it would be interesting to investigate them further.” 1 constructed as the set of equivalence classes of arbitrary sequences modulo the much stricter relation of “being equal on an ultrafilter.” Thus an infinite ultrapower of R will contain infi- nite and infinitesimal element, constructed as, respectively, equivalence classes of sequences of real numbers which tend to infinity and tend to zero (without eventually becoming zero). Models constructed using ultraproducts have two very desirable properties. First, there is a transfer principle, known asLo´s’theorem, ! that allows one to deduce statements in first-order logic in the ultraproduct from the corresponding statements in its constituent structures, and vice-versa. Second, such models turn out to be saturated, which in the context of this paper manifests itself mostly in a countable compactness result. Ultrapower models with these two properties, especially ultrapowers of the real numbers, are often called nonstan- dard models. Such models were exploited beginning with Robinson to provide a rigorous underpinning for the na¨ıve manipulation of infinitesimals in calculus (see [24]). To borrow an analogy from [33], the distinction between ultraproduct analysis and nonstandard anal- ysis is similar to the distinction between measure theory and probability theory (or more elementarily, the distinction between matrices and linear transformations); in the former areas one constructs all objects “by hand” in the same context as the base objects, while in the latter one proceeds axiomatically (but largely equivalently, in that results in one context are easily translated into the other context). In this paper, we will proceed exclusively in the framework of ultraproduct analysis, but intuition from nonstandard theory is often helpful. In 1975, Loeb [21] introduced a useful construction of ordinary measure spaces on ultra- products, now sometimes known as Loeb measures, that are used to great effect in Szegedy’s construction of algebraic higher order Fourier analysis. Besides the current application, Loeb measures have utility in a variety of contexts, including constructing measure spaces with various desirable properties, representing measures (such as Wiener measure) in a more man- ageable way, and proving existence results in analysis (especially in the field of stochastic differential equations). Nonstandard or ultraproduct analysis is also often useful in simplifying proofs in analysis or suggesting new ones, especially in arguments requiring the juggling of many “"-small quantities” that disappear or are simplified in the ultraproduct. The aforementioned Green- Tao theorem [14], for example, employs ultraproduct analysis in this context. The book [33] uses ultraproducts to develop results in quantitative algebraic geometry. As a motivating example below, we will show how Szemer´edi’stheorem can be quickly deduced from the Furstenberg recurrence theorem using simple ultraproduct arguments.
1.3. Overview of the paper. This paper contains the following: (i) A development for the nonspecialist of the background required to understand Szegedy’s results in higher order Fourier analysis. (ii) A careful proof of the first major structure theorem for the Fourier σ-algebras, ex- tending the context from finite groups to all measurable groups that are finite measure spaces, filling out Szegedy’s arguments and in some cases correcting them. (iii) A discussion and proof sketch of the second major structure theorem for Fourier σ- algebras. (iv) A discussion of possible extensions of these results to the contexts of infinite measure spaces and finitely additive measure spaces. No prior familiarity with either higher order Fourier analysis or ultraproduct analysis is presumed. 2 In the second section, we present a unified introduction to ultraproducts and ultraproduct analysis. Two model-theoretic results are relegated to an appendix. No new results are presented, although we include new proofs of known facts (for example, Proposition 2.13). As an illustration of the power of these techniques, we include a proof of Szemer´edi’stheorem from the Furstenberg recurrence theorem. In the third section, we construct the main measure space on the ultraproduct group A in a self-contained way, and then detour through a discussionA of the integration theory on the ultraproduct with a special focus on under what conditions the integral and ultralimit may be interchanged. This theory has a resemblance to the elementary real analysis question of when a limit and integral commute. The language is rooted in ultraproduct theory rather than nonstandard analysis, in contrast to the context in which these results appear in the literature. In the fourth and longest section, we introduce various useful measure spaces on the product of ultraproduct spaces and state and prove Keisler’s Fubini-type theorem before detouring again through a general discussion of Gowers norms. Here, in recognition of the lack of any discussion in the literature of Gowers norms over infinite spaces, we broaden to this context. Following Szegedy, we, though it makes the language somewhat more difficult to understand, mix ergodic-theoretic language with structures first defined in the context of additive combinatorics. Many results from the setting of finite groups carry over and are presented in the generality of finite measure spaces. Proposition 4.10 is a new result, while some results (such as Lemma 4.1) have novel, simplified proofs. The discussion then shifts to a careful exposition of Szegedy’s first structure theorem for higher order Fourier σ-algebras in the expanded context of ultraproducts of groups of finite measure, which states that a 2 function is measurable in k 1 if and only if it is orthogonal to every L function with a vanishing Gowers norm. TheF − often very sketchy proofs in [27] are fleshed out and in some cases corrected; for example, the original paper contained numerous errors centered around the distinction between the concepts of sets generating a σ-algebra and sets generating a σ-algebra which is dense in . Numerous smaller lemmas are addedA as necessary to ease the presentation.B A In the fifth section, we sketch a proof of the second structure theorem in Szegedy’s higher order Fourier analysis, which states that L2( ) can be decomposed into pairwise-orthogonal Fk modules over L∞( k 1), each generated by a single function. This has the corollary that for every function Ff − L2 measurable in the ultraproduct and every integer k>1, there is a unique decomposition∈ ∞ f = g + fi j=1 ! in the L2 sense, where g = 0 and the f are contained in different rank-one modules over || ||Uk i L∞( k 2). We then briefly discuss two possible extensions, to spaces of infinite measure andF to− finitely additive measures.
3 2. Preliminaries and a motivating example 2.1. Ultraproducts. Most of this material can be found in [34]. We start with a definition. An ultrafilter on the positive natural numbers N is a set ω consisting of subsets of N satisfying the following conditions: (i) The empty set does not lie in ω. (ii) If A N lies in ω, then any subset of N containing A lies in ω. (iii) If A ⊂and B lie in ω, then the intersection A B lies in ω. ∩ (iv) If A N, then exactly one of A and N A lies in ω. ⊂ \ If, furthermore, no finite set lies in ω, we say that ω is a nonprincipal ultrafilter. We would like to show that there exists a nonprincipal ultrafilter. Define a proper filter to be a set consisting of subsets of N that satisfies the first three conditions above. We say that a collection γ of subsets of N has the finite intersection property if for any finite collection A1,...,An of elements of γ, A1 ... An is nonempty. Certainly any proper filter has the finite intersection property. ∩ ∩ Lemma 2.1. If a collection γ has the finite intersection property, there is a proper filter φ containing γ.
Proof. Take φ to be the set of subsets B of N such that there exist A1,...,An in γ whose intersection A ... A is a subset of B. Conditions (i) and (ii) are immediate. If B and B 1∩ ∩ n 1 2 lie in φ, then there exist A ,...,A and A# ,...,A# lying in γ such that A ... A B 1 m 1 n 1 ∩ ∩ m ⊂ 1 and A# ... A# B . Therefore 1 ∩ ∩ n ⊂ 2 A ... A A# ... A# B B , 1 ∩ ∩ m ∩ 1 ∩ ∩ n ⊂ 1 ∩ 2 so B B lies in φ and condition (iii) is verified. 1 ∩ 2 ! Proposition 2.2. There exists a nonprincipal ultrafilter. Proof. The proof requires the use of the axiom of choice in the form of Zorn’s lemma. Let γ be the set of all cofinite subsets of N; it is clear that γ is a proper filter. Given any chain of proper filters, ordered by inclusion, take the union of all of them; it is immediate that this union is also a proper filter. Therefore we can apply Zorn’s lemma to find a maximal proper filter ω. Proceed by contradiction and assume that ω does not satisfy condition (iv) in the def- inition of an ultrafilter, so there is some A N such that neither A nor N A lie in ω. Let ⊂ \ γ = A B : B ω . { ∩ ∈ } Then ω# has the finite intersection property: if B1,...,Bn belongs to ω, then B1 ... Bn does as well (and is nonempty). If A B ... B were empty, then we would∩ ∩ have ∩ 1 ∩ ∩ n B1 ... Bn N A, so N A ω, contrary to hypothesis. ∩ ∩ ⊂ \ \ ∈ Therefore by the lemma we can extend γ to a filter ω#, which must contain A as well as every element of ω. Thus ω was not a maximal proper filter, which is a contradiction. We conclude that ω satisfies condition (iv) and is therefore an ultrafilter. Furthermore, since ω contains γ, by condition (iv) it cannot contain any finite set, so it is a nonprincipal ultrafilter. ! Lemma 2.3. If A ω for some ultrafilter ω, and A can be written as the disjoint union of finitely many sets A∈, then there is exactly one i such that A ω. i i ∈ 4 Proof. We use condition (iv) repeatedly. First, either A1 or N A1 is in ω; if the former, we’re done, while if the latter, \ n A (N A1)= Ai ∩ \ i=2 " is in ω. By induction, after a finite number of such steps we can find an i such that Ai ω. Only one such i can exist, for otherwise the intersection of two disjoint sets would be in∈ the ultrafilter, contradicting condition (i). !
An ultraproduct X of a sequence of sets X ∞ with respect to an ultrafilter ω is defined { i}i=1 as follows. First construct the Cartesian product i Xi. Define an equivalence relation ∈N x y, where x =(x1,x2,...) and y =(y1,y2,...), by ∼ # x y i N : xi = yi ω. ∼ ⇐⇒ { ∈ }∈ Then let X = X / . One can think of X as a sort of completion where one can i N i take the limit of arbitrary∈ ∼ sequences, rather than just Cauchy sequences: given a sequence # x ∞ , the equivalence class in X of this sequence will be denoted { i}i=1 x = lim xi. i ω → Thus, in this terminology, we have
lim xi = lim yi i ω i ω → → if and only if the set of i N such that xi = yi is a member of ω. Similarly, for subsets ∈ Hi Xi we denote by H or limi ω Hi the set of all elements of the ultraproduct arising from⊂ limits of points in the given→ subsets:
lim Hi = lim xi : xi Hi,i N i ω i ω ∈ ∈ → $ → % If all of the Xi are the same space, the ultraproduct is also called an ultrapower. Ultrapowers with respect to a nonprincipal ultrafilter will be denoted with a prior asterisk; e.g., the ultrapowers of N and R are written ∗N and ∗R, respectively. The latter object is called the set of hyperreal numbers. The order structure carries over into the hyperreals: for real sequences ai and bi whose ultralimits are a and b, respectively, by Lemma 2.3 exactly one of the sets i : ai
lim ri = lim ri , i ω i ω | | & → & → which will be a nonnegative hyperreal.& Given& x , we call x bounded if x f lim xi = lim fi(xi). i ω i ω → → It is an easy exercise to show that' finite( Boolean operations are preserved when taking ultraproducts; that is, if Gi i∞=1 and Hi i∞=1 are sequences of subsets of the Xi, then we have { } { } lim(Gi Hi)= lim Gi lim Hi , i ω i ω i ω → ∪ → ∪ → and similarly for intersections and complements.' ( ' ( The ultraproduct construction is properly a model-theoretic one, and can be defined for arbitrary structures. The investment in added generality is rewarded with two funda- mental theorems, which together comprise the core of the utility of ultraproducts in any context:Lo´s’theorem ! (sometimes called the fundamental theorem of ultraproducts) and the saturation theorem. To keep the model theory to a minimum, we will briefly define the model-theoretic ultraproduct here and state the two important theorems, but relegate their proofs to the appendix. For additional background, see [2] and [22]. To define the general ultraproduct, let i be a structure with universe Mi in a fixed language for each i I, where I is some indexM set (for our purposes, I will always be the L ∈ natural numbers N). Pick an ultrafilter ω on I. Define the ultraproduct = limi ω i as follows: let the universe of be M → M M M = M / , i ∼ i )∈N where is the equivalence relation given by the ultrafilter as previously described; we again ∼ denote the image in M of a sequence ai Mi by limi ω ai. For each constant symbol c of ∈ → the language, let the interpretation cM of c in be the equivalence class of (cM1 ,cM2 ,...) M in Mi. For an 1-ary function symbol in the language, if a = limi ω ai, define → i # f M(a) = lim f M (ai). i ω → It is easy to check that this is well-defined in the sense that it is independent of a particular choice of ai. A precisely analogous definition is made for n-ary function symbols. Similarly, if R is a one-place relation symbol and a = limi ω ai, we interpret → i RM = a : i N : ai RM ω , { { ∈ ∈ }∈ } which is similarly well-defined, and likewise for n-place relation symbols. It will be noted that many of the previous definitions in this section are particular instances of this general construction. With this out of the way, we can stateLo´s’theorem, ! which states that a first order sentence is true in the ultraproduct if and only if it is true for ω-many of the original structures: Theorem 2.4 (Lo´s’theorem)! . Let ω be an ultrafilter on I, let , i I, be a sequence of Mi ∈ structures over the same language, and let = limi ω i. Let φ be a first-order formula 1 n M → Mk k 1 n over σ with n arguments, and let a , . . . , a , where a = limi ω ai . Then φ(a , . . . , a ) ∈M → 6 1 n is true in if and only if the set of i for which φ(ai , . . . , ai ) is true in i is an element of ω. M M The following corollary will be useful. By an internal set we mean a set, viewed as a subset of X, which is equal to an ultraproduct of subsets of Xi. Corollary 2.5 (Overspill). Let φ(n) be a formula such that φ(n) is true for all finite n ∗N. Then there exists an infinite m such that φ(n) is true for all n m. ∈ ≤ Proof. First, we show that N ∗N is not an internal set. Note that the complement of ⊂ an internal set must be internal, so it suffices to show that the complement of N is not internal. Consider the statement in the language of the natural numbers that there exists a smallest element, which is certainly expressible in first-order logic. This statement is true for every subset of the natural numbers, and therefore byLo´s’theorem, ! it is true for every internal subset of N. However, the complement of N has no smallest element: given any element of the complement d = limi ω di, we can find a strictly smaller one given simply → d# = limi ω di# , where di# = di 1 if di > 0 and di# = 0 otherwise. Therefore N is not an internal space.→ − The set m ∗N such that φ(n) for all n Theorem 2.7 (Compactness). A set of sentences Γ= φi i I has a model if and only if every finite subset of Γ has a model. { } ∈ 7 Proof (sketch). Let J be the set of finite subsets of I. For each j J, by assumption there ∈ is a model j making true φi i j. Let Aj = k : j k for each j J. The collection M { } ∈ { ⊆ } ∈ of all such Aj generates a filter, so there is an ultrafilter ω on J containing Aj. For any formula φ, certainly A φ ω, so the set of all j such that φ holds in j is a superset of { } ∈ M A φ and hence in ω as well. Then byLo´s’theorem ! φ holds in the ultraproduct limj ω j, { } → M so this ultraproduct is a model of Γ, as desired. ! The saturation theorem has several down-to-earth consequences that will be very impor- tant. First, it yields a strong “countable compactness” result for internal sets. Corollary 2.8 (Countable compactness). Let ω be a nonprincipal ultrafilter. Then any cover of an internal set by countably many internal sets has a finite subcover. Proof. We first prove an intermediate result: let C1, C2,... be a sequence of ultrapdroduct spaces in X such that any finite collection of these sets has nonempty intersection; that is, N Cn = , n N. . ∅ ∀ ∈ n=1 * Then we claim that the entire sequence has nonempty intersection; that is, ∞ C = . n . ∅ n=1 * Write n n C = lim Ci i ω → n for subsets Ci Xi. Augment the language with countably many one-place relation sym- n ⊆ n bols C , which will be interpreted in each of the Xi as the subset Ci . By the definition of the ultraproduct, Cn will be interpreted as Cn in the ultraproduct X. Let p be the set consisting of the following collection of sentences: p(v)= Cn(v) , { } where the notation is interpreted by letting Cn(v) be true if and only if v Cn. We claim that p is an (incomplete) 1-type. By compactness, p is satisfiable∈ if it is finitely satisfiable, which is true because every finite collection has nonempty intersection. Our language is still countable, so we can apply the saturation theorem (combined with Proposition A.1 in the appendix, as p is incomplete) to conclude that p is satisfied in X. Therefore the intermediate result is proven. It remains to show that this implies countable compactness. Given a countable cover of X by internal spaces Di i∞=1, let Ci be the complement of Di in X for each i. We then have, for any set of indices{ } i ,i , . . . , i , { 1 2 r} r r C = X D . ij − ij j=1 j=1 * + Assume there is no finite subcover of the Di. There is no set of indices such that the right hand side of the above equality is nonempty, so no finite intersection of the Ci is nonempty. But because the Di form a cover, the infinite intersection of all the Ci is empty. This violates the countable saturation result above. ! 8 It is possible to prove countable compactness “by hand,” as follows, although the more general model-theoretic viewpoint will pay offlater: Alternate proof of countable compactness. We follow [34]. It suffices to prove the intermedi- ate result from above, so let notation be as before. Because finite intersections are preserved under ultraproducts, N N (i) lim Cn = Cn. i ω → ,n=1 - n=1 * * N (i) Therefore by assumption the set of i such that n=1Cn is nonempty, which we denote E , is an element of ω. Without loss of generality∩ we can arrange that E E ... N 1 ⊃ 2 ⊃ by replacing En by the intersection E1 ... En, which must still be a member of the ultrafilter. Additionally, we want to remove∩ ∩ any natural number less than N from each EN , so that N∞=1EN = . That elements of an ultrafilter are still in the ultrafilter after removing finitely∩ many elements∅ requires that the ultrafilter is nonprincipal (this is the only place in the proof where this condition is used); otherwise, if such a set were not in the ultrafilter, the intersection of its complement with the original set would be both finite and a member of the ultrafilter. For each i E1, let Ni be the largest number such that i EN and let xi be an (arbitrary) ∈ Ni (i) ∈ (i) element of the set n=1Cn , which is nonempty by assumption. By construction, xi Cn for all i E , which∩ is an element of ω. Therefore the object ∈ ∈ n x = lim xn i ω → lies in Cn for all n, so the infinite intersection of the Cn is nonempty. This proves the claim of countable saturation. ! Another form of countable compactness which is often useful is the following: Corollary 2.9 (Countable comprehension). Given any sequence (En)n of internal subsets ∈N of X, there is an internal function with domain ∗N (that is, an internal sequence) (En)n∗ N of subsets of X. ∈ Proof. Let Bm be the set of all sequences (Fn)n such that Fn = En for all n m and ∈N ≤ Fn X. It is easy to see that any finite collection of the Bm has nonempty intersection. Therefore⊂ by the intermediate result of the proof of countable compactness, the countable intersection of all the Bm for finite m is nonempty, so there is some extension of the En. ! In this paper, we will need to take ultraproducts of sequences of abelian groups Ai i∞=1. The group structure on the ultraproduct A is defined in the obvious way, per the{ above} general definition: we have lim(ai + bi) = lim ai + lim bi i ω i ω i ω → → → and 1 − 1 lim ai = lim(ai− ); i ω i ω → → it is clear both that these operations' are well-defined( and that they define a group structure on A. 9 Finite products do not technically commute with ultraproducts: that is, if we have se- quences of groups (or more general structures) A and B , then the space { i} { i} C = lim(Ai Bi) i ω → × is not the same space as A B = lim Ai lim Bi . i ω i ω × → × → However, in light of the following lemma,' we will( identify' them( without further concern: Lemma 2.10. With notation as above, there is a natural isomorphism C A B. 3 × Proof. An element of C is an ultralimit of pairs (ai,bi), which we can use to define an element lim ai lim bi A B; i ω × i ω ∈ × ' → ( ' → ( it is clear that this map is well-defined in that it is insensitive to the particular ai, bi chosen outside of an element of the ultrafilter. This map has an obvious inverse, and it is a trivial matter to check that it preserves group structure. ! The next lemma and propositions are not strictly necessary for the results that follow, but do they motivate two things. First, Proposition 2.12 explains why our basic construction will require taking a nonprincipal ultrafilter: otherwise, we get no extra structure — and in fact lose almost all information — upon taking the ultraproduct. Second, Proposition 2.13 elucidates why it will be uninteresting to take the ultraproduct of finite groups that are bounded in order. An ultrafilter which is not nonprincipal is (unsurprisingly) called principal. Lemma 2.11. If an ultrafilter ω is principal, then it is of the form ωn = A N : n A { ⊂ ∈ } for some n N. ∈ Proof. It is immediate to verify that ωn as defined above is an ultrafilter. By definition, ω must contain some finite set B = n1, . . . , nk . By Lemma 2.3, writing B as the union of singleton sets, there is some i such{ that n } ω. Because ω contains n , it must contain { i}∈ { } all subsets of N containing n; that is, it must contain ωn. If A/ωn, then N A ωn, so ∈ \ ∈ N A ω, so A/ω. Therefore ω = ωn. \ ∈ ∈ ! Proposition 2.12. The ultraproduct A of groups Ai with respect to the ultrafilter ωn is isomorphic to An. Proof. Two elements of the ultraproduct, considered as sequences a =(a1,a2,...) and b =(b ,b ,...) with a ,b A , are equal if and only if the set 1 2 i i ∈ i i N : ai = bi { ∈ } lies in ω , which is true if and only if a = b . Therefore the map Φ: A A given n n n → n by Φ(a)=an is a bijection. It obviously a homomorphism by the definition of the group structure on the ultraproduct. ! 10 Proposition 2.13. If A is a sequence of groups such that A is bounded, then there is an i | i| i such that the ultraproduct A is isomorphic to Ai. Conversely, if Ai is unbounded and ω is nonprincipal, then A is infinite. | | Proof. There are only finitely many isomorphism classes of groups of order less than a given bound. Let the indices i corresponding to a given isomorphism class be grouped into sets. Noting that their union N is certainly in ω, by Lemma 2.3 one such set I is in the ultrafilter ω; that is, there is an I ω such that Ai Aj for i, j I. Let N be the order of the groups in this isomorphism class.∈ 3 ∈ Without loss of generality, we can identify the groups in I via the isomorphisms and label them all as A. Consider the N elements of A given by the ultralimits of corresponding elements a A, which by abuse of notation we will denote by a# = limi ω a, even though the right hand∈ side is defined only on an element of the ultrafilter. The→ element in the ultralimit is nonetheless well-defined, because any choice for the elements of Ai, i/I will lead to an equivalent element in the ultralimit. We can think of this limit as a map∈A A → given by a a#. (→ First, we show that this map is injective. If a = b in A, then a# = b# in A: for otherwise, the set J of indices for which the sequences agree. would lie in ω,. but I and J would then be disjoint, so their intersection would also lie in ω, contrary to definition. ∅ Second, we show that this map is surjective. Consider an arbitrary element b = limi ω bi → of A; we will show that it must be equal to one of the N elements a# by another application of Lemma 2.3. For each a A, consider the set B of indices i such that b = a. We can ∈ a i certainly write I as the finite disjoint union of the Ba, so by the lemma one of the Ba is in ω. Then i : bi = a ω, so b = a#. Therefore the mapping A A described above is bijective, and by{ the definition}∈ of the group structure on the ultraproduct→ it is an isomorphism. The converse is straightforward: for any N we can find an index iN such that Aj >N k | | for j > iN . Then it is easy to construct N sequences ai ,1 k N, that differ on every index greater than j, which is a cofinite set and in the nonprincipal≤ ≤ ultrafilter ω. Therefore k k the elements of the ultraproduct a = limi ω ai are all distinct. Because we can do this for → every N, A is infinite. ! It should be noted that Proposition 2.13 has an alternate proof as an easy corollary of Lo´s’theorem,! together with the basic model-theoretic fact that for any finite structure in a finite language, there is a sentence φ precisely specifying the structure up to isomorphism. 2.2. A motivating example: Szemer´edi’s theorem. As an example of the utility of ultralimit analysis in additive number theory, we will follow [33] in showing how Szemer´edi’s theorem can be quickly deduced from the Furstenberg recurrence theorem, which we will assume: Theorem 2.14 (Furstenberg recurrence theorem). Let (X, , µ, T ) be a measure-preserving system, let A X have positive measure, and let k be a positiveB integer. Then there exists a positive integer⊂ r such that r (k 1)r A T A ...T − A ∩ ∩ is nonempty. ! By a measure-preserving system, it is meant that (X, ,µ) is a probability space and T : X X is a measure-preserving bijection such that bothBT and its inverse are measurable. → 11 Using ultralimits, we can use the Furstenberg recurrence theorem to prove Szemer´edi’s thereorem. First, a definition: let A be a set of integers. If A [ n, n] lim sup | ∩ − − | δ n [ n, n] ≥ →∞ | − | for some δ> 0, then we say that A has positive upper density. Theorem 2.15 (Szem´eredi’stheorem). Every set of integers of positive upper density con- tains arbitrarily long arithmetic progressions. Proof. Proceed by contradiction, assuming the existence of a set A of positive upper density and no progressions of length k for some k 1. Having positive upper density means ≥ precisely that there is a sequence of integers of ni tending to infinity and a δ> 0 such that A [ n ,n ] δ [ n ,n ] | ∩ − i i |≥ | − i i | for each i. Now consider the ultrapower ∗A with respect to some nonprincipal ultrafilter ω, and the ultralimit n = limi ω ni. The former is a subset of ∗N, while the latter is in ∗N (which can be considered a→ subset of the hyperreals). We claim that ∗A [ n, n] δ [ n, n] , | ∩ − |≥ | − | where the absolute value signs are as defined previously on the hyperreals (and δ = limi ω δ is considered as a hyperreal as well). This is a simple application ofLo´s’theorem; ! we→ will include here some of the details to get a sense of how the model-theoretic result is used in practice. Take the signature consisting of the binary operations + and , the binary relation , the unary relation (or set) A, and the constant δ. ByLo´s’theorem, ! · since the ≥ above inequality can be written as a first-order statement and is true for each i, and N is certainly an element of ω, by taking ultralimits of everything we get a true statement, which is precisely the claim. Additionally, it is true that ∗A has no progressions of length k, since that property can be expressed in a first-order sentence and is true for A. Call the space of all finite Boolean combinations of shifts of ∗A by elements of Z, which includes the empty set and all of ∗Z, the definable sets . The elements of form a field D D of sets. Define µ# : ∗R by D→ B [ n, n] µ#(B)=| ∩ − |, [ n, n] | − | which is a function from definable sets to the set ∗[0, 1], the ultrapower of the unit interval, such that µ#( ) = 0 and µ#(∗Z) = 1. Additionally, it is finitely additive: given two disjoint ∅ definable sets B1 and B2, (B1 B2) [ n, n] µ#(B B )=| ∪ ∩ − | 1 ∪ 2 [ n, n] | − | B [ n, n] + B [ n, n] = | 1 ∩ − | | 2 ∩ − | [ n, n] | − | = µ#(B1)+µ#(B2). 12 Finally, it is nearly translation invariant under translation by any standard integer j: we have B [ n j, n j] µ#(B + j)=| ∩ − − − | [ n, n] | − | B [ n, n] B [ n j, n 1] B [n j +1,n] = | ∩ − | + | ∩ − − − − | | ∩ − | [ n, n] [ n, n] − [ n, n] | − | | − | | − | = µ#(B)+"1 + "2, where "1 and "2 are both infinitesimal, since both numerators are finite but both denomi- nators are infinite. Define µ : R by D→ µ(B) = st(µ#(B)) and recall that st is a homomorphism. It is clear that µ is nonnegative, µ( ) = 1, µ(∗Z) = 1, and µ is finitely additive. Taking standard parts kills infinitesimal quantities,∅ so µ is a finitely additive pre-measure on the field of sets that is invariant under shifts by finite integers. By construction, µ(A) δ. D We will use µ to construct≥ a measure-preserving system on which to apply the Furstenberg recurrence theorem. Let X =2Z, the set of all subsets of the integers. We can view this space as the set of all doubly-infinite sequences of ones and zeroes. Define a topology on X as follows: first define the cylinder sets Em by E = B 2Z : m B . m { ∈ ∈ } From the viewpoint of the doubly-infinite sequences, Em consists of all sequences with 1 in the mth position. Let the set of all finite Boolean combinations of the Em, which is clearly a field of sets, be denoted . The topology is defined by letting be a basis. It then follows that consists precisely ofE the clopen sets of X. Let T : X E X be a shift-by-one map, shiftingE each entry of a doubly-infinite sequence to the left. To→ define a measure ν on this space, we will need to somehow reference our ultraproduct construction. This reference is provided by a map Φ: , which is defined on the E by E→D m Φ(Em)=∗A + m and then extended to all of by Boolean combinations. Let ν : R be defined for E by E E→ ∈E ν(E)=µ(Φ(E)), the pullback of µ by Φ. It is simple to verify for ν the properties we want: it is nonnegative, we have ν( )=µ( ) = 0, ν(X)=µ(∗Z) = 1, and it is clearly invariant under the shift T because µ is∅ invariant∅ under shifts by integers. Finite additivity follows because Φpreserves disjointness of sets, which follows because we have for E ,E that E E = = 1 2 ∈E 1 ∩ 2 ∅ ⇒ Φ(E1) Φ(E2)=Φ(E1 E2)= . Finally, we of course have ν(E) δ. By the∩ Carath´eodory∩ extension∅ theorem, ν extends to a measure≥ on the σ-algebra generated by on X. We may now finally apply the Furstenberg recurrence theorem on theB E measure-preserving system (X, ,ν,T ) and the set E0 of positive measure to deduce that there is a positive integer r suchB that r (k 1)r E T E ... T − E = . 0 ∩ 0 ∩ ∩ 0 . ∅ By applying Φ, it is immediate that ∗A (∗A + r) ... (∗A +(k 1)r) = . ∩ ∩ ∩ − . ∅ 13 Therefore ∗A contains an arithmetic progression of length k, which is a contradiction. ! 14 3. Analysis on ultraproducts 3.1. The ultraproduct construction. Let A ∞ be a sequence of abelian groups that { i}i=1 are also probability spaces, equipped with normalized measures µi, such that the measure spaces thus formed are compatible with the group structure in the sense that the action of the group on any measurable set is again measurable. In this situation we say that the σ-algebra in question is invariant with respect to the action of Ai on itself, or simply that the σ-algebra is invariant when there is no danger of confusion. To avoid the trivial case described by Lemma 2.13, we assume that sup A = . Although Szegedy’s sequence of i | i| ∞ papers ([27], [28], [29]) explicitly discusses only the case where the Ai are finite groups (with µi therefore the normalized counting measure), his arguments translate into the compact case with little alteration; we will flesh out these arguments in subsequent sections. Later, in the fifth section of this paper, we will discuss extensions of these results to other contexts. Let A be the ultraproduct of the Ai with respect to a fixed nonpricipal ultrafilter ω. We will construct a σ-algebra on A as follows. Say that a subset H A is an element of (A) if we can write it as ⊆ S H = lim Hi i ω → for some measurable sets Hi Ai. When it is clear from context, we will often omit writing the name of the ultraproduct⊆ group in labeling the σ-algebras on it. Note that (A) forms a field of sets because the ultralimit construction respects finite Boolean operations.S Recalling the countable compactness result of Corollary 2.8, one might think at this point that (A) is already a σ-algebra (it is certainly closed under complements). However, this fails becauseS not all countable unions of elements of (A) are internal sets, so the lemma does not apply. S Remark To clarify this statement, it may be helpful to develop a specific counterexample. Let the Ai be the groups Z/iZ equipped with normalized counting measure, and as usual let ω be a nonprincipal ultrafilter. Each element of the resulting ultraproduct A, considered as a singleton set, is trivially an internal space. In A there is an algebraic copy N of the natural numbers constructed as follows: for each natural number n, let ni = n Z/iZ when i>nand let n be the zero element otherwise. The set N A is then defined∈ to consist of i ⊂ the ultralimits n = limi ω ni. It is a countable union of internal sets (the singletons n ), but we have already shown→ in the proof of the overspill principle that it is not an internal{ } set, and therefore not possibly in (A). S 6 We define our main object of study (A) to be the completion of the σ-algebra generated by (A). The measure constructedA here on is sometimes called a Loeb measure (for example,S by [4], [5], and [6]). The constructionA can be carried out in two equivalent ways: first, a general construction based on the Carath´eodory extension theorem, and second, a hands-on method which gives us a helpful approximation result. By the uniqueness result in the Carath´eodory construction, the two extensions are equivalent. To put a measure µ on , we first define µ# : ∗R by A S→ µ#(H) = lim µi(Hi) i ω → for sets H = limi ω Hi. This is well-defined, for if H is the ultralimit of two sequences of (1) → (2) subsets Hi and Hi , then the sequences must agree on an element of the ultrafilter, so their measures will agree on an element of the ultrafilter, which uniquely fixes the ultralimit. 15 As the ultraproduct respects finite Boolean operations, µ# obeys the following: if Gj = (i) limi ω Gj , we have → (i) (i) µ#(G1 G2) = lim µi G1 G2 , i ω ∩ → ∩ ' (i) (i)( µ#(G1 G2) = lim µi G1 G2 , i ω ∪ → ∪ c ' c ( µ#(G1) = lim µi (G1) . i ω → It is clear that µ# is nonnegative and µ#( ) = 0. It is finitely additive: given disjoint sets (i) (i) ∅ (i) (i) G1 = limi ω G1 and G2 = limi ω G2 in , the set I of indices i such that G1 G2 = must be a→ member of the ultrafilter.→ For theseS indices, we can do the desired manipulation∩ ∅ because each of the µi are measures, and since I ω the other indices do not matter in the ultralimit: ∈ (i) (i) µ#(G1 G2) = lim µi G1 G2 i ω ∪ → ∪ ' (i) ( (i) = lim µi G1 + µi G2 i ω → ' ' (i) ( ' (((i) = lim µi G1 + lim µi G2 i ω i ω → ' ( → ' ( = µ#(G1)+µ#(G2). Now define µ : R by S→ µ(H) = st(µ#(H)), where the function st is extended so that it is equal to infinity when its argument is an infinite hyperreal. It is clear that µ is nonnegative and µ( ) = 0, and finite additivity follows because the function st is a homomorphism. In order to∅ prove that µ is a pre-measure on , it remains to be shown that it is countably additive, not just finitely additive. This followsS via the countable compactness result of Corollary 2.8 above: given a countable union j∞=1Gj of pairwise disjoint sets in which itself lies in and is therefore internal, we can write∪ it as a union of finitely manyS of the sets, which withoutS loss of generality we write as N G . By finite additivity, ∪j=1 j N N µ G = µ(G ), j j j=1 j=1 + ! so finite additivity here implies countable additivity. Therefore µ is a pre-measure on . It S is obvious that since the µi are probability measures, so will µ. By the Carath´eodory extension theorem, we can extend µ to a measure on the completion of the σ-algebra generated by (see [25]). See [7] or, especially, [4] for details of the ACarath´eodory construction in theS context of ultraproduct measures. Lemma 3.1. The measure µ defined above on is invariant under the group action. A Proof. First consider µ# : ∗R. By the definition of the group structure, for G and a A we have S→ ∈S ∈ µ#(G + a) = lim µi(Gi + ai) = lim µi(Gi)=µ#(G). i ω i ω → → 16 Taking standard parts, then, µ restricted to is invariant under the group action. It is clear that the Carath´eodory construction doesS not affect invariance with respect to the group action, for the outer measure (which restricts to the Carath´eodory measure) is defined as ∞ ∞ µ (E) = inf µ(Ej):E Ej, where Ej (A) for all j . ! ∗ ⊂ ∈S j=1 j=1 ! + For the second method of construction, we proceed as follows. We need one preliminary lemma, which shows us that “ultralimit convergence” gives us some control over actual convergence, at least for bounded sequences and for a set of indices in the ultrafilter: Lemma 3.2. Let a = limi ω ai, where each ai R and the ai are uniformly bounded. For → ∈ all real "> 0, considering R as a subset of ∗R, i N : ai [st(a) ", st(a)+"] ω. { ∈ ∈ − }∈ Proof. Since the ai are bounded, st(a) is finite and differs from a only by an infinitesimal. Without loss of generality, it suffices to assume that st(a) = 0, because we can then prove the result for any other sequence simply by translating; thus, a is infinitesimal and therefore in [ ","]. Consider the language of the real numbers extended by the constant ". ByLo´s’ ! − theorem in this language, a [ ","] implies that i N : ai [ ","] ω, as this formula ∈ − { ∈ ∈ − }∈ is certainly expressible in first-order language. ! Construct µ : (A) R exactly as before. Define a null set to be any subset E of A such that for eachS "> 0,→ there is a set F (A) such that E F and µ(F) <". Clearly, if E is a null set, then any subset of E is∈ alsoS a null set. It turns⊂ out that is almost a σ-algebra in a sense made precise by the following lemma: S Lemma 3.3. Let G , G ,... be an increasing family of sets in (A) and let 1 2 S ∞ E = Gj. j=1 + Then there is a set G (A) such that E G, µ(G) = limj µ(Gj), and G E is a null set. ∈S ⊂ →∞ \ Proof. Let the sets Gi A be given such that j ∈ i i Gj = lim Gj, i ω → and let i 1 Sj = i N : µi(G ) µ(Gj) , ∈ | j − |≤ 2j 8 i 9 a set of indices for which the measures of the Gj are particularly close to that of their ultralimit. By Lemma 3.2, Sj ω for each j. i ∈ i m i Let G Ai be defined as follows: if i S1,S2,...,Sm but i/Sm+1, let G = j=1 Gj. ∈ ∈ i i ∈ As one might expect, if i is in all of the S , let G = ∞ G . Then define j j=1 j : G = lim Gi. : i ω → We have to check three things, the third of which is an immediate consequence of the first two, so it suffices to verify that E G and that µ(G) = limj µ(Gj). First, fix j. ⊂ →∞ 17 i i The set of indices i for which Gj G is a superset of the intersection S1 S2 ... Sj, which is in ω, so ⊂ ∩ ∩ ∩ i i i N : G G ω, { ∈ j ⊂ }∈ which implies that G G for each j, which implies that E G. The second fact follows j ⊂ ⊂ by summing the measures over j and using the definition of Sj. ! Lemma 3.4. The set of null sets is closed under countable union. Proof. Suppose that E , E ,... are null sets, and choose any "> 0. Let F# be a set in (A) 1 2 j S containing Ej and such that " µ(F#) < . j 2j Define j # # Gj = Fi , i=1 + so the G# form an increasing family of sets in . By the previous lemma, there is a set j S G# containing all of the G# (and therefore all of the F#) and such that ∈S i i ∞ " µ(G#) = ", ≤ 2j j=1 ! so we are done. ! Finally, define a set E A to be measurable if there is a set F (A) such that the symmetric difference E F⊂is a null set, and define µ(E) in that case to∈ beS equal to µ(F). It 6 is easy to check that this is well-defined, for if we have two sets F1 and F2 such that both symmetric differences are null sets, their symmetric difference will be a null set in (A); that is, a set of measure zero, so their measures will agree. In particular, all null setsS are measurable, with measure zero. Let the measurable sets be denoted (A). It follows from the following theorem and the uniqueness of Carath´eodory extensionsA for σ-finite spaces that we have constructed “by hand” the same measure as before. Theorem 3.5. So defined, (A) is a σ-algebra with a probability measure µ, and this measure space is complete. A Proof. Call two measurable sets equivalent, with the symbol , if their symmetric differ- ≡ ence is a null set. Two equivalent sets clearly have the same measure. Let E1, E2,... be measurable sets, so there exist sets F1, F2,... in such that Ej Fj for each j. It is easy to check that S ≡ A E A F , \ 1 ≡ \ 1 E E F F , 1 ∪ 2 ≡ 1 ∪ 2 E E F F . 1 ∩ 2 ≡ 1 ∩ 2 This demonstrates that is closed under finite Boolean operations and µ is a finitely additive probability measureA on it. To prove that is closed under countable unions and µ A 18 is countably additive, it is enough to prove that if the Ej are disjoint, there is a E such that ∈S ∞ E E j ≡ j=1 + and ∞ µ(Ej)=µ(E). j=1 ! (This works because being closure under countable disjoint unions implies closure under arbitrary countable unions). As before, define j Gj = Ei, i=1 + so the G# form an increasing family of sets in , and j S ∞ ∞ Gj = Ej. j=1 j=1 + + By Lemma 3.3, there exists an E containing that union such that ∈S ∞ E E \ j j=1 + is a null set and ∞ µ(E) = lim µ(Gj)= µ(Ej), j →∞ j=1 ! so we are done. Completeness is clear; if E is a subset of a measure-zero set, then it certainly differs from that set by a null set and is therefore in . A ! Remark It is worth discussing why one has to deal with the σ-algebra instead of a topology on A. To define a topology on the ultraproduct, we would certainlyA want all ultraproducts of open sets to be open, so we could simply take these sets to be a basis for this topology. However, this topology will not have nice analytic properties: for example, if each Ai has the discrete topology (as would be natural on, for example, Z or a finite group), then every point of A will be made an open set; hence A would be endowed with the discrete topology. This is unhelpful for our purposes: we are interested in structures beyond the classical Fourier theory, so insofar as linear characters on locally compact groups (such as any discrete group) already span the relevant function spaces they do not give any new information. A partial way around this problem is by defining, instead of a topology, a σ-topology, which is a collection τ of subsets of A obeying the following axioms: (i) The empty set and A are in τ. (ii) The family τ is closed under finite intersection. (iii) The family τ is closed under countable union. 19 Thus a σ-topology is a weakening of the usual notion of a topology, allowing only countably many unions rather than arbitrary unions. We can define the standard σ-topology (A) as the collection of countable unions of ultraproducts of measurable sets; the axioms aboveO are immediate since ultraproducts respect finite Boolean operations. With this loosening, however, the group structure does not behave well. One would perhaps like the above-defined measure to be the σ-topological equivalent of a Haar measure, but this is not achievable. Firstly, the group addition function + : A A A will not be continuous with respect to (A) (A), as would be required for a “×σ-topological→ group.” O ×O Secondly (if we temporarily relax our assumption that the Ai are probability spaces), not all compact sets have finite measure: for example, take Ai = Zi, the cyclic group of order i, with the discrete topology, and let µi be the counting measure on Ai so that µi(Ai)=i. All of the Ai are obviously compact. By Tychonoff’s theorem, the product i Ai is compact, and as an arbitrary quotient of compact groups is compact, the ultraproduct A is compact as well. But # µ(A) = st lim µi(Ai) = st lim i = . i ω i ω → → ∞ Finally, we have nothing close to a result' like the( one on' uniqueness( of Haar measures; given topological groups Ai and their product A, there are many possibilities for the measure on A. For example, if we take all of the µi as probability measures, then µ(A) = 1, which is clearly not related to the measure of the above example by a rescaling. These considerations necessitate a more measure-theoretic, less topological approach than in the case of classical Fourier analysis. 6 3.2. Measurable functions on the ultraproduct. We will need some results about measurable functions on the ultraproduct space A. Let fi : Ai R be a sequence of → measurable functions on the Ai, and let f : A R be given by f = st(limi ω fi). Since → → we are not assuming that the fi are bounded, f is a function to the extended real numbers R = R . ∪{−∞}∪{∞} Lemma 3.6. So defined, f is measurable on A. Proof. Throughout, let ai i∞, ai Ai denote a sequence whose ultralimit is a A. For measurability, it suffices{ to} prove that∈ ∈ f[ ,d] = a A : f(a) d = a A : st lim fi(ai) d −∞ { ∈ −∞ ≤ ≤ } ∈ −∞ ≤ i ω ≤ $ ' → ( % is a measurable set for every d R. If b is a hyperreal, it is easy∈ to see that st(b) d if and only if, for every real −∞ ≤ ≤ "> 0, b d + ", or equivalently for all n N, b d +1/n. Therefore a A is in the above ≤ ∈ ≤ ∈ set if and only if for every n N, ∈ i N : fi(ai) d +1/n ω. { ∈ ≤ }∈ Let P = a : f (a ) d +1/n . n,i { i i i ≤ } Letting Pn = limi ω Pn,i, it is clear that → ∞ f[ ,d] = Pn. −∞ n=1 * But each Pn,i is measurable because fi is, so Pn and hence f[c,d] is as well. ! 20 In the other direction, we can ask to what extent we can represent a given µ-measurable function on A as an ultraproduct of µi-measurable functions on the Ai. The explicit con- struction of µ suggests that, as every µ-measurable set is “almost” (i.e., up to a null set) an ultraproduct of measurable sets in Ai, each µ-measurable function should be “almost” an ultraproduct of measurable functions. This is indeed the case. First, we need a lemma. For a function h into R or ∗R and n ∗R let ∈ h(a) if h(a) < n, (h n)(a)= ∧ ;n otherwise, so we have both (h n) h and (h n) n. ∧ ≤ ∧ ≤ Lemma 3.7. If fi : Ai R is a sequence of functions and f = st(limi ω fi), then → → f n = st lim(fi n) . ∧ i ω ∧ ' → ( Proof. This is a straightforward verification. Let a = limi ω ai. First presume that f(a) < n. Then on the one hand, (f n)(a)=f(a). On the other→ hand, by the definition of the standard part andLo´s’theorem, ! ∧ f(a) f(a) = st lim fi (a) = st lim(fi n) (a). i ω i ω ∧ ' → ( ' → ( Therefore the equality is verified if f(a) f(a) n = lim fi (a) n " = i N : fi(ai) n " ω, i ω ≥ ⇒ → ≥ − ⇒{ ∈ ≥ − }∈ so we have that ' ( i N : fi(ai) (fi n)(ai) <" ω { ∈ | − ∧ | }∈ for every real "> 0. This implies that lim fi (a) lim(fi n) (a) <" i ω − i ω ∧ &' → ( ' → ( & for every real "> 0, so if we& take standard parts we get equality:& & & f(a) = st lim fi (a) = st lim(fi n) (a). i ω i ω ∧ ' → ( ' → ( The equality is therefore verified. ! Proposition 3.8. For every measurable function g : A R, there exists a sequence of → measurable functions fi : Ai R such that if f = st(limi ω fi), then f = g almost → → everywhere with respect to µ. Furthermore, if g is bounded, then the fi can be chosen so as to be uniformly bounded (above or below) with the same bound. 21 Proof. Let (qn)n be an enumeration of all standard rationals, and let ∈N E = a A : g(a) q , n { ∈ ≤ n} which is measurable because g is. These sets form an “increasing sequence” in the sense that E E whenever q q . By the construction of (A), there are sets F (A) n ⊆ m n ≤ m A n ∈S such that µ(En Fn) = 0, and we can certainly choose the Fn to be increasing in the same sense as well. 6 By countable comprehension, extend the sequence of Fn to an internal sequence (Fn)n . ∈∗N We can express the fact that Fn Fm if qn qm as a first-order sentence, so by the overspill principle there is some infinite K⊆such that≤ this nesting property holds for all n, m K. ≤ Order the set qn n K as qi1 qij if a Fij Fij 1 f #(a)= ∈ \ − , q + 1 if a/F . ; iK ∈ iK where we have made the obvious convention F = . So defined, f is clearly the ultralimit of i0 ∅ functions fi : Ai R because it is a simple function defined with internal sets. Furthermore, outside the set → (E F ), n6 n n +∈N which is the countable union of null sets and therefore itself null, we have f #(a) q if and ≤ n only g(a) qn for all n N, hence if f = st(f #) we have f = g almost everywhere. If g is bounded≤ above,∈ say by n, then by Lemma 3.7 the sequence f n is a lifting of g. i ∧ The bounded below case follows analogously. ! Given a measurable g : A R, we will call a sequence fi : Ai R given by the above proposition a lifting of g. A lifting→ will be highly nonunique in general.→ 3.3. Integration theory on the ultraproduct. Standard measure theory gives rise to an integral defined on µ-measurable functions from the ultraproduct group A. For ultralimits of measurable functions, we can also integrate first on the Ai with respect to the measures µi, then take the ultralimit. The following proposition shows that for uniformly bounded functions, it does not matter these two operations — ultralimit and integral — are taken. Proposition 3.9. If the fi are uniformly bounded and f = st(limi ω fi), → f dµ = st lim fi dµi . i ω Proof. Fix N and approximate by simple functions: for each f and integer j 0, let i ≥ gi(ai)=j/N precisely when j j +1 f (a ) < , N ≤ i i N while for each integer j<0, let gi(ai)=(j + 1)/N precisely when this condition is satisfied. Then g is a measurable function on A , g (a ) f (a ) for every a (so in particular the i i | i i |≤| i i | i gi are uniformly bounded), and fi gi 1/N uniformly. Since we are integrating over probability spaces, | − |≤ 1 f g . i − i ≤ N & Corollary 3.10. If g : A R is µ-measurable and bounded, for any bounded lifting gi the following equality holds: → g dµ = st lim gi dµi . i ω Proof. By definition, a the ultraproduct of a lifting is µ-a.e. equal to the original function, so the result follows immediately when combined with the second part of Proposition 3.8. ! Without a uniform bound, we cannot expect to preserve the above equality. For a simple couterexample, let Ai = [0, 1] equipped with the usual Lebesgue measure and define 1 i if x i fi(x)= ≤ ;0 otherwise. The standard part of the ultraproduct, f : ∗[0, 1] R, is certainly equal to zero on all → non-infinitesimal elements of ∗[0, 1] (that is, all elements not infinitely close to zero). The set I of infinitesimal elements is a null set because I [0,"] ∗[0, 1] for all ", and by trivial calculation µ([0,"]) = ". Therefore f is zero except on⊂ a set⊂ of measure zero, so f dµ =0. <∗[0,1] On the other hand, it is clear that st lim fi dµi = st lim 1 =1. ,i ω [0,1] - i ω → < ' → ( For arbitrary positive µ-measurable functions, the best we can do is the following, which is reminiscent of Fatou’s Lemma: Proposition 3.11. Let fi and f be as above, where no boundedness is assumed but we do have f 0. Then ≥ f dµ st lim fi dµi . ≤ i ω Proof. We proceed somewhat analogously to the proof of Fatou’s lemma. Suppose g is a bounded measurable function on A with 0 g f. By Proposition 3.8, there is a lifting ≤ ≤ gi : Ai R such that → g = st lim gi i →∞ '23 ( µ-almost everywhere. Since we are ultimately only considering integrals of g, we can identify the two functions without harm. By the definition of the order relation on ∗R and the monotonicity of the integral, we have the following chain of inferences: g f ≤ = i N : gi fi ω ⇒{ ∈ ≤ }∈ = i N : gi dµi fi dµi ω ⇒{ ∈ ≤ }∈ = lim gi dµi lim fi dµi. ⇒ i ω ≤ i ω → st lim gi dµi st lim fi dµi . i ω ≤ i ω = → (i) limi ω fi dµi is finite, → Ai | | (ii) if E = limi ω Ei Ai is in (A) and µ(E) = 0, then ? → ⊂ S st lim fi dµi =0. i ω | | = → Proposition 3.12. Let fi : Ai R be a sequence of µi-measurable functions. Then the following are equivalent: → (i) the fi are S-integrable, (ii) if f = st(limi ω fi), → f dµ = st lim fi dµi . i ω + Proof. If fi and fi− are the positive and negative parts of the fi, respectively, then it is + clear that fi is S-integrable if and only if both fi and fi− are. Therefore it suffices to assume{ that} f 0 for each i. { } { } i ≥ 24 First assume (i). By the definition of S-integrability, the right-hand side is finite, so by Proposition 3.11 so is I = f dµ. (f n) dµ = st lim fi n dµi . ∧ i ω ∧ Certainly, then, we have 1 lim fi n dµi (f n) dµ + i ω ∧ ≤ ∧ n → st lim fi dµi =0, i ω = → st lim fi dµi I = f dµ. i ω ≤ = → st lim fi dµi = st lim fi dµi + st lim fi dµi i ω A i ω A E i ω E = → < i > , → < i\ i - = → < i > st lim fi dµi ≤ i ω A E , → < i\ i - f dµ ≤ A E < \ = f dµ. st lim fi dµi =0, i ω = → Proposition 3.13. A µ-measurable function f : A R is µ-integrable if and only if it has an S-integrable lifting. → Proof. If f has an S-integrable lifting f , then { i} f = st lim fi i ω → almost everywhere, and by Proposition 3.12' the latter( is integrable. + Conversely, assume f is integrable. We can consider f and f − separately, which means it suffices to assume f 0. Take a lifting fi such that fi 0, which we can do by the last part of Proposition≥ 3.8. By Lemma 3.7{ and} Corollary 3.10,≥ for each finite n, st lim (fi n) dµi = (f n) dµ, i ω ∧ ∧ = → st lim(fi ki) (a)= st lim fi (a)= i ω ∧ ∞ ⇐⇒ i ω ∞ ' → ( ' → ( precisely because limi ki is infinite. In the finite case, we also have equality, because the →∞ set of i such that ki is greater than any finite number is an element of the ultrafilter, so we can disregard the remainder. Taking standard parts of the above inequality, we get st lim (fi ki) dµi f dµ = st lim(fi ki) dµi i ω A ∧ ≤ A A i ω ∧ = → < i > < < ' → ( because by definition the standard part of the ultraproduct of a lifting is almost everywhere equal to the original function. By Proposition 3.11, the opposite inequality is also true, so st lim (fi ki) dµi = st lim(fi ki) dµi. i ω A ∧ A i ω ∧ = → < i > < ' → ( By Proposition 3.12, f k is an S-integrable lifting, as desired. { i ∧ i} ! 26 Finally, we have a result, first realized by Lindstr¨omin [20], that is especially important in the case p = 2: Proposition 3.14. If fi : Ai R is a sequence of µi-measurable functions with → p st lim fi dµi < i ω | | ∞ = → st lim fi dµi < . i ω | | ∞ = → st lim fi dµi =0. i ω | | = → 27 4. Product and sub σ algebras − − 4.1. Definitions and the Fubini theorem. So far products of ultraproduct σ-algebras have not been discussed, but they play a crucial role in the theory. The first observation is that we have two obvious ways of putting a σ-algebra on the product Ak. By Lemma 2.10, the product Ak is unambiguously defined in that we can assume that finite products and ultraproducts commute. However, in general the σ-algebras k k (A ), defined by starting with the product σ-algebras on Ai and performing the above procedureA on their ultraproduct, will be different from the product σ-algebra (A) ... (A). A × ×A In fact, much of the structure of the theory developed by Szegedy can be traced to the noncommutativity (A) ... (A) = (Ak). A × ×A . A We have the following inclusion: Lemma 4.1. Let A1 and A2 be ultraproduct spaces equipped with measures µ1, µ2 defined as above. Then (A ) (A ) (A A ) A 1 ×A 2 ⊆A 1 × 2 and the measure µ µ is unambiguously defined on the smaller σ-algebra. 1 × 2 Proof. That the product measure is unambiguously defined simply requires a retracing of the steps in the definition, using Lemma 2.10 wherever necessary. By definition, (A1) (A2) is generated by rectangles E1 E2, where E1 (A1) and E (A ).A It suffi×cesA to show that each such rectangle is in× (A A ). ∈A 2 ∈A 2 A 1 × 2 By construction, for j =1, 2 there is a Fj that is the ultraproduct of finite Boolean combinations of measurable sets such that µ (E F )=0. j j6 j Tracing the construction backwards, it is certainly the case that F1 F2 (A1 A2). We have the following, for purely set-arithmetical reasons as well as by× the definition∈A × of the product measure: (µ µ )((E E ) (F F )) 1 × 2 1 × 2 6 1 × 2 µ (E F )µ (E F )+µ (E F )µ (E F ) ≤ 1 16 1 2 2 ∪ 2 1 1 ∪ 1 2 26 2 =0. Therefore E E (A A ), as desired. 1 × 2 ∈A 1 × 2 ! In fact, the same proof gives that (A1) (A2) (A1 A2), where the bar indicates the completion of a measure. A ×A ⊆A × Several iterations of this lemma give that (A) ... (A) (Ak). A × ×A ⊆A In general the inclusion is proper, as we will see.2 2See [5] page 557 for a fairly down-to-earth example. 28 We need some notation for discussing such products. Let [k] denote the set 1, . . . , k and let S [k]. Let { } ⊆ Ai,S = Ai, j S B∈ and let S A = lim Ai,S i ω → be the ultraproduct of these sets, equipped with σ-algebra (AS) as described in the pre- vious section. A There is a natural projection map p : Ak AS S → which we can use to define σ-algebras on Ak: define k 1 S (A )=p− ( (A )), AS S A the σ-algebra of measurable sets that depend only on the coordinates in S. By this definition k [k] = , while is the trivial σ-algebra on A . It is obvious that A A A∅ , if T S. AT ⊆AS ⊆ S Let µS denote the measure on A ; by Lemma 4.1 it does not matter with respect to precisely what σ-algebra we are referring. Finally, let S∗ denote the system of sets T : T S, T = S 1 , and let be the { ⊂ | | | |− } AS∗ σ-algebra generated by the algebras such that T S∗. (Note that it is equivalent to AT ∈ define S∗ to consist of all proper subsets of S.) For example, in the case k = 2, we simply have that (A)= (A) (A), A[k]∗ A ×A while for higher k the σ-algebra lies somewhere in between the product σ-algebra and A[k]∗ = [k] itself, inclusion-wise. A WeA can use Fubini’s theorem to get results about interchanging the order of integration for functions in product σ-algebras. Because we do not in general have equality in Lemma 4.1, however, we cannot a priori say anything about interchanging the order of integration for (Ak)-measurable functions. Fortunately, we do have a strengthened Fubini’s theorem, developedA by Keisler. In order to prove it, we need to develop a lemma about product null sets that is manifestly false outside of the ultraproduct setting, but gives us the extra control needed for the full theorem: Lemma 4.2. Let A1 and A2 be ultraproduct spaces of the kind constructed above, with measures µ and µ , and let E A A . For each y A define 1 2 ⊂ 1 × 2 ∈ 2 E = x A :(x, y) E , y { ∈ 1 ∈ } the slice of E at y. Then the following are equivalent: (i) (µ µ )(E)=0. 1 × 2 (ii) For each n N there is a set En (A1 A2) containing E such that (µ1 µ2)(En) < 1/n. ∈ ∈S × × (iii) For each n N there is a set En (A1 A2) containing E such that ∈ ∈S × 1 1 µ y A : µ (E ) < 1 . 2 ∈ 2 1 n,y n ≥ − n =8 9> 29 (iv) For almost all y A , µ (E )=0. ∈ 2 1 y Proof (sketch). The implication (i) to (ii) is clear; if E is itself in (A A ) then we can S 1 × 2 take En = E for all n, while otherwise we can use Lemma 3.3 to find En. The implication (ii) to (iii) follows by using Fubini’s theorem on the A A , which works precisely because 1,i × 2,i En is an internal set. The implication (iii) to (iv) follows by the following argument: if we let L A be the set n ⊂ 2 1 y A : µ (E ) , ∈ 2 1 n,y ≥ n 8 9 then by the Borel-Cantelli lemma almost all y can be contained in only finitely many of the Ln. For these y, µ1(Ey) = 0. The implication (iv) to (i) is obvious. ! Theorem 4.3 (Keisler’s Fubini theorem). Let S, T [k], and let Sc denote the complement k Sc ⊆S of S in [k]. If f : A R and y A , let fy : A R be the function obtained from f by holding the Sc coordinates→ constant∈ at y. → k k Sc (i) If f : A R is measurable with respect to T (A ) and y A , then fy is measur- → k A ∈ able with respect to S T (A ) for almost all y. k A ∩ k (ii) If f : A R is integrable with respect to (A ), then → A f (dµ1 dµ2)= fy(x) dµ1(x) dµ2(y). A A × A A < 1× 2 < 2 =< 1 > Implicit in the second statement is the assertion that the inner function (of y) is measurable on A2, which needs to be proven. Let A1 = limi ω A1,i and A2 = limi ω A2,i. Let f be a lifting of f. Then by definition the→ set → { i} E = (x, y) : st lim fi (x, y) = f(x, y) { i ω . } ' → ( has measure zero. By the equivalence of (i) and (iv) in Lemma 4.2 and the existence of liftings, the functions fi,y(x)=fi(x, y) are liftings of fy for almost all y, so the first part is established. For the second part, let g(y)= f(x, y) dµ1(x). gi(y)= fi(x, y) dµ1,i(x). st lim gi (y)=g(y). i ω ' → ( Therefore gi is a lifting of g whenever fi,y is a lifting of fy; that is, almost everywhere, so in particular{ g} is measurable. Now we use Proposition 3.12 twice (once in the first line and once in the fourth), together with the usual Fubini theorem in the second line, to get the following chain of equalities: f (dµ1 dµ2) = st lim fi (dµ1,i dµ2,i) A A × i ω A A × < 1× 2 , → < 1,i× 2,i - = st lim fi dµ1,i dµ2,i i ω , → = st lim gi dµ2,i i ω , → = g dµ2, 2k f k = ... ∆ ... ∆ f(x) dµ(x) dµ(h ) . . . dµ(h ). || ||U h1 hk 1 k << < For example, for k = 2 we have 1/4 f 2 = f(x)f(x + h )f(x + h )f(x + h + h ) dµ(x) dµ(h ) dµ(h ) . || ||U 1 2 1 2 1 2 =< < < > while if k = 1 the norm reduces to 1/2 f 1 = f(x)f(x + h) dµ(x) dµ(h) || ||U =<< > 1/2 = f(x) f(h) dµ(h) dµ(x) =< C< D > = f(x) dµ(x) , &< & & & the absolute value of the integral& of f, where we& have made the obvious change of variables h h x in the inner integral.& & (→ −k Let G (A, µ) be the space of functions for which f U k is defined and finite (that is, f is measurable and if we replace f by f , all integrals are|| finite).|| Clearly, U 1 is only a seminorm on G1(A, µ)=L1(A, µ), and although| | we will refer to the U k seminorms as norms, they are in general only seminorms as well, with a nonzero kernel. By an obvious change of variables, the U k norms are invariant under the group action. It obviously remains to be shown that the U k are indeed seminorms on Gk(A, µ). To do this, we will prove several intermediate results. Lemma 4.4. For k 2, the following inductive formula holds: ≥ k k 1 2 2 − f k = ∆hf k 1 dµ(h). || ||U || ||U − < Proof. This is a trivial verification. ! In particular, the inductive formula shows that the U k map to nonnegative reals for all k; this is certainly true for k = 1 by the above calculation of U 1 and it is therefore true for all higher k by induction. We can put more structure on Gk(A, µ) by defining a sort of inner product taking 2k k k k k arguments instead of two. Given 2 functions fc G (A, µ), c 0, 1 , define the U product of these functions by ∈ ∈{ } c (fc)c 0,1 k U k = ... C| |[fc(x + c1h1 + ... + ckhk)] dµ(x) dµ(h1) . . . dµ(hk). 9 ∈{ } : << < c 0,1 k ∈{) } It is clear that this product is linear in all entries. Furthermore, if fc = f for all c, we have 2k (fc)c 0,1 k U k = f U k . 9 ∈{ } : || || 32 k k Proposition 4.5 (Gowers-Cauchy-Schwarz inequality). Let fc G (A, µ), c 0, 1 be 2k functions, where k 1. Then ∈ ∈{ } ≥ (fc)c 0,1 k U k fc U k . 9 ∈{ } : ≤ || || c 0,1 k & & ∈{) } & & Proof. Writing y = x + h1, we have c (fc)c 0,1 k U k = ... C| |[fc(x + c2h2 + ... + ckhk)] dµ(x) 9 ∈{ } : < < < c 0,1 k ∈{) } c1=0 c C| |[fc(y + c2h2 + ... + ckhk)] dµ(y) dµ(h2) . . . dµ(hk). < c 0,1 k ∈{) } c1=1 Temporarily denote the function of the variables h2, . . . , hk in the first set of parentheses by s(h2, . . . , hk) and the function in the second set of parentheses by t(h2, . . . , hk), so we end up with the expression ... s t dµ(h ) . . . dµ(h ). · 2 k < < k 1 We will apply Cauchy-Schwarz over A − to these two functions. We do not know a priori that the quantities involved are finite; however, we will shortly show that the right hand side is finite, which implies that the left hand side is. In all, (fc)c 0,1 k U k |9 ∈{ } : |≤ 1/2 1/2 ... s 2 dµ(h ) . . . dµ(h ) ... t 2 dµ(h ) . . . dµ(h ) . | | 2 k | | 2 k &< < & &< < & & & & & Consider the first& term, which upon changing& the& dummy variable from x to h & in the second & & & 1& term and putting the conjugation operator inside the integral immediately expands to c ... C| |fc(x + c2h2 + ... + ckhk) dµ(x) < << c 0,1 k ∈{) } c1=0 c C| |fc(h1 + c2h2 + ... + ckhk) dµ(h1) dµ(h2) . . . dµ(hk) < c 0,1 k ∈{) } c1=0 c = ... C| |fc(x + c2h2 + ... + ckhk)fc(h1 + c2h2 + ... + ckhk) dµ(x) dµ(h1) . . . dµ(hk) < < c 0,1 k ∈{) } c1=0 By making the change of variables h1 x + h1 and letting c# =(c2, . . . , ck) denote the projection of c along the first coordinate,(→ this can be rewritten as c ... C| |f0,c (x + c1h1 + ... + ckhk) dµ(x) . . . dµ(hk)= (f0,c )c 0,1 k U k # 9 # ∈{ } : < < c 0,1 k ∈{) } 33 The same can be done to the second term in the above inequality, except with the projection to f1,c# . In all, we have 1/2 1/2 (fc)c 0,1 k U k (f0,c )c 0,1 k U k (f1,c )c 0,1 k U k . |9 ∈{ } : |≤|9 # ∈{ } : | |9 # ∈{ } : | We now go through precisely the same procedure for each other coordinate. For example, if we denote (c3, . . . , ck) by c## in the same way as above we can show that 1/2 1/2 (f0,c )c 0,1 k U k (f0,0,c )c 0,1 k U k (f0,1,c )c 0,1 k U k . |9 # ∈{ } : |≤|9 ## ∈{ } : | |9 ## ∈{ } : | Repeating 2k 1 times, we arrive at the inequality − (fc)c 0,1 k U k fc U k . |9 ∈{ } : |≤ || || c 0,1 k ∈{) } k Since all of the fc were assumed in G (A, µ), the right hand side is finite, which justifies our earlier use of Cauchy-Schwarz. ! In much the same way that the triangle inequality for L2 norms follows from the Cauchy- Schwarz inequality, the triangle inequality for Gowers norms can be deduced from the Gowers-Cauchy-Schwarz inequality. Proposition 4.6. The U k are seminorms on Gk(A, µ). Proof. It is obvious that the U k are positively homogeneous (that is, if a is a complex k number, af U k = a f U k ) and we have already shown that U maps to the nonnegative reals. The|| triangle|| inequality| |·|| || is all that remains to be seen. k k By expanding f +g 2 , we get a sum of 22 different terms, each of which can be written || ||U k as a U k product of fs and gs. Each possible configuration of U k products of fs and gs is k represented exactly once in this sum. Letting jc, c 0, 1 represent one such configuration (so either j = f or j = g for every c 0, 1 k), by∈{ the Gowers-Cauchy-Schwarz} inequality c c ∈{ } (jc)c 0,1 k U k jc U k . 9 ∈{ } : ≤ || || c 0,1 k ∈{) } Group together all terms with d copies of f and 2k d copies of g. It is clear that there are precisely 2k choose d such terms. All in all, − k 2 k 2k 2 d 2k d 2k f + g k f k g − =( f k + g k ) , || ||U ≤ d || ||U || ||U k || ||U || ||U !d=0 = > k so upon taking 2 th roots we arrive at the triangle inequality. ! In the finite case (that is, when µ(A) < ), the U k are bounded in terms of the U k+1: ∞ Proposition 4.7. If µ(A) < , for every k 1 and f Gk+1(A, µ), ∞ ≥ ∈ f k C f k+1 || ||U ≤ || ||U for some constant C independent of f. k+1 Proof. For c 0, 1 , let fc = f when ck+1 = 0 and fc = 1 otherwise. By the Gowers- Cauchy-Schwarz∈{ inequality} for k + 1, we have (fc)c 0,1 k+1 U k+1 fc U k+1 |9 ∈{ } : |≤ || || c 0,1 k+1 ∈{)} 34 The left hand side is equal to c ... C| |f(x + c h + ... + c h ) dµ(x) . . . dµ(h ) dµ(h ) & 1 1 k k k k+1 & &< < c 0,1 k < & & ∈{) } & & 2k & & = f k µ(A), & & || ||U · & k k while the right hand side is equal to f 2 1 2 . It is a simple calculation that || ||U k+1 || ||U k+1 1/2 2k k/2+1 1 k+1 = ... dµ(x) dµ(h ) . . . dµ(h ) = µ(A) . || ||U 1 k+1 =< < > Plugging back into the Gowers-Cauchy-Schwarz inequality, we get 2k 2k k/2+1 f k µ(A)= f k+1 µ(A) , || ||U · || ||U · which reduces to k/2k+1 f k µ(A) f k+1 , || ||U ≤ || ||U thus proving the theorem. ! The form of C immediately gives the following: Corollary 4.8. If µ is a probability measure, then the Gowers norms are nondecreasing with increasing k. ! As another easy corollary, we have the following important fact, which implies that in the finite case the kernels of the Gowers seminorms are decreasing as k increases: Corollary 4.9. If µ(A) < and k 1, if f k+1 =0, then f k =0. ∞ ≥ || ||U || ||U ! It is also clear that, if µ(A) < , ∞ L1(A, µ)=G1(A, µ) G2(A, µ) G3(A, µ) .... ⊃ ⊃ ⊃ The form of the above proposition suggests that no similarly useful inequality should hold in the infinite case. In fact, it is easy to construct an explicit counterexample, taking advantage of the extra (multiplicative) structure of the real numbers and the fact that the Gowers norms do not scale nicely with this structure. Let A = R and let fn : R R for n>0 be given by → nx2 fn(x)=e− , which is in Gk(R,µ) for all k. Then a tedious but elementary calculation shows that k+1 k+1 2 2 n k+1 k(k 1) f k =2 − . || n||U π Therefore for any C, by letting ' ( 2k+1 2k+2 2k+2 2k+1 c = ,d= , k k +1− k +2 k (k + 1)k − k(k 1) − and choosing 1/c n> Cπck 2dk k , simple algebra shows that G H f k >C f k+1 . || n||U || n||U 35 Therefore there is no constant entirely independent of f such that Proposition 4.7 holds when A = R. The only situation in which such a bound can be recovered in the infinite case is if f has compact support on a set S and there exists a measurable set B such that the Minkowski sum/difference B + S S is contained in a set C of finite measure. That is, there is some measurable B such that− B + S S = x : x = b + s s ; b B, s ,s S − { 1 − 2 ∈ 1 2 ∈ } is contained in some C such that µ(C) < .3 This condition is not automatic: for example, ∞ consider the measure space on R consisting of all Lebesgue measurable sets, where the measure of a set is either 0 if that set is Lebesgue-null or otherwise. Then if C is the Cantor set, which has zero measure, its sumset 2C contains∞ an interval and therefore has infinite measure. Proposition 4.10. Let f Gk+1(A, µ) be supported on a set S, and let there exist a set B such that B + S S is contained∈ in a set of finite measure. Then − f k C f k+1 || ||U ≤ || ||U for some constant C depending on S, B, and k. Proof. Similarly to the above proof, for c 0, 1 k+1, let f = f when c = 0. Denote ∈{ } c k+1 the set of finite measure containing B + S S by C. Letting c# = (0, 0,..., 0, 1), define − fc# = χB and fc = χC when ck+1 = 1 and c = c#. By Gowers-Cauchy-Schwarz, . (fc)c 0,1 k+1 U k+1 fc U k+1 . |9 ∈{ } : |≤ || || c 0,1 k+1 ∈{)} The left-hand side is equal to the absolute value of c ... C| |f(x + c1h1 + ... + ckhk) < < c 0,1 k ∈{) } χB(hk+1) χC (c1h1 + ... + ckhk + hk+1) dµ(hk+1) dµ(x) . . . dµ(hk). < c 0,1 k ∈{) } c=c# + where in the inner integral we have made the change of variables hk+1 hk+1 x. Because f is supported on S, we can narrow down the domain of integration(→ considerably− due to the first product; i.e., for a particular set of coordinates (x, h1, . . . , hk) to contribute to the integration, all sums x + c1h1 + ... + ckhk must be in S. In particular, since x must be in S, subtracting off x, all sums of the form c1h1 + ... + ckhk must be in S S. Looking specifically at the inner integral, we must have h B to contribute to− the integral. k+1 ∈ Therefore every sum c1h1 + ... + ckhk + hk+1 is in B + S S, so the inner integral is greater than or equal to − χB(hk+1) dµ(hk+1)=µ(B), < 3This condition is slightly more permissible than the more obvious condition that µ(B + S S) < , in − ∞ recognition of the fact that if S is measurable, S S may not be. See [3]. − 36 and the left hand side as a whole is greater than or equal to 2k f k µ(B). || ||U · 2k 2k 1 Now consider the right hand side, which is equal to f U k+1 χB U k+1 χC U k−+1 . For any characteristic function on a set, we can very roughly|| estimate|| || the|| Gowers|| norm|| as follows. We can eliminate characteristic functions in the integrand at the cost of possibly increasing the integral. Therefore 2k+1 χ k+1 ... χ (x)χ (x + h ) ...χ (x + h ) dµ(x) . . . dµ(h ). || B||U ≤ B B 1 B k+1 k+1 < < By pulling in each dµ(hi), making the change of variables hi hi x in each term, and evaluating, we get (→ − 2k+1 k+2 (k+2)/2k+1 χ k+1 µ(B) = χ k+1 µ(B) . || B||U ≤ ⇒|| B||U ≤ Likewise 2k+1 k+2 2k 1 (k+2)(2k 1)/2k+1 χ k+1 µ(C) = χ − µ(C) − . || C ||U ≤ ⇒|| C ||U k+1 ≤ Putting this all together with the original inequality, 2k 2k (k+2)/2k+1 (k+2)(2k 1)/2k+1 f k µ(B) f k+1 µ(B) µ(C) − , || ||U · ≤|| ||U · · so (modulo some rearrangement) we are done. ! Another important fact about Gowers norms is the following, which characterizes U2 in terms of Fourier analysis in the compact case: Proposition 4.11. If A is compact, µ normalized Haar measure on A, and f G2(A, µ), then ∈ 1/4 ˆ 4 f U 2 = f(ξ) dµˆ(ξ) , || || ˆ | | = where Aˆ is the Pontryagin dual of A and µˆ is the dual measure of µ (chosen so that the Fourier inversion formula holds). Proof. If A is compact and µ normalized Haar measure, its Pontryagin dual Aˆ is discrete andµ ˆ is counting measure, so the above integral becomes a sum, and we have the Fourier expansion f(x)= fˆ(α)χα(x). α Aˆ !∈ The verification is then straightforward, if irritating: we have 4 f 2 = f(x)f(x + y)f(x + z)f(x + y + z) dµ(x) dµ(y) dµ(z) || ||U < < < = f(x)f(x + y) fˆ(α)fˆ(β)χβ(y)χα β(x + z) dµ(x) dµ(y) dµ(z) − < < < α,β Aˆ !∈ = f(x)f(x + y) fˆ(α)fˆ(β)χβ(y) χα β(x + z) dµ(z) dµ(x) dµ(y). − << α,β Aˆ =< > !∈ 37 The quantity in parenthesis is zero unless α = β, in which case it is equal to one, so this simplifies to 4 2 f 2 = f(x)f(x + y) fˆ(α) χ (y) dµ(x) dµ(y) || ||U α << α Aˆ !∈ 2 = fˆ(α) f(x)f(x + y)χα(y) dµ(x) dµ(y) α Aˆ << !∈ 2 = fˆ(α) fˆ(β)fˆ(γ)χβ γ (x)χα γ (y) dµ(x) dµ(y) − − α Aˆ << !∈ 2 = fˆ(α) fˆ(β)fˆ(γ) χβ γ (x) dµ(x) χα γ (y) dµ(y). − − α Aˆ < < !∈ Again, the integrals cancel unless α = β = γ, so we are left with 4 4 f 2 = fˆ(α) , || ||U α Aˆ !∈ as desired. ! This lemma has the following corollary: Corollary 4.12. If A is compact and µ normalized Haar measure on A, then the U k seminorms on (A, µ) are norms. Proof. Let f be in the kernel of U k. By Corollary 4.8, f must also be in the kernel of U 2. By the above proposition, this implies fˆ = 0 a.e., which implies that f = 0 a.e., so we are done. ! The failure of the U k to be norms in the ultraproduct case, which we will see below, therefore reflects the fact that none of the measure spaces constructed in this manner are compact topological groups. 4.3. Octahedral (semi-)norms and quasirandomness. In order to establish the basic equivalence result between the kernels of the Gowers seminorms and certain σ-algebras, we need to make a detour into a multidimensional version. For a group A and measure µ as above, let f : Ak C denote a measurable function with respect to the product measure on Ak. Then define→ the octahedral norm of f to be 2k c f k = ... C| |[f(x ,x , . . . , x )] dµ(x ) dµ(x ) . . . dµ(x ). || ||O 1,c1 2,c2 k,ck 1,0 1,1 k,1 << < c 0,1 k ∈{) } It is easy to check that if g : A C is a measurable function and → k g (x , . . . , x )=g x , k 1 k j j=1 ! then we have g k = g k . || ||U || k||O The importance of the octahedral norms comes from their connection to the σ-algebra on Ak defined above. A[k]∗ 38 First recall that a subspace H1 of a Hilbert space H2 is itself a Hilbert space if and only if it is closed. We can uniquely define a projection operator onto closed subspaces; we will denote the projection onto H by P . If f H and g H , the projection operator obeys 1 H1 ∈ 1 ∈ 2 f, g = f, P (g) . 9 : 9 H1 : Next, recall the definition of the conditional expectation. Given a probability space (X, ,µ), a sub-σ-algebra , and an -measurable function f : X C, the conditional A B A → expectation of f given is a -measurable function E(f ):X C such that B B |B → E(f ) dµ = f dµ |B Proof. Pick some B and let χB be the indicator function of B. For any f H1, by the projection property we∈B have ∈ f,χ = P (f),χ , 9 B: 9 H2 B: so f dµ = fχ dµ = f,χ = P (f),χ = P (f) dµ. B 9 B: 9 H2 B: H2 E(f ) dµ = P (f) dµ, |B H2 vector space 0= χE FχG dµ = µ((E F) G) k × × ∩ χ k =0. || G||O The integrand of the latter equation is nonzero (and equal to one) if and only if all points of the rectangular prism described by the 2k coordinates x1,0, . . . , xk,1 are in G. The octahedral norm therefore measures the total measure of rectangular prisms with all vertices in G. The first statement, on the other hand, is approximately a statement that G has no positive-measure intersection with any cylinder set. It should be clear that these statements are “morally” similar (they both state in some sense that G is nowhere dense) and it is likely that an alternate proof of the proposition could be developed along these lines. ! 4.4. A few Hilbert space results. Associated to each measure space (X, ,µ) is the B corresponding Hilbert space L2(X, ,µ) of square-integrable -measurable functions X C with inner product given by B B → f, g = fg dµ. 9 : Lemma 4.16. Let A1 and A2 be ultraproduct spaces with measures µ1 and µ2 constructed 2 as above, and let H L (A1, (A1),µ1) be a Hilbert space. Let f : A1 A2 C ⊆ A × → be an (A1 A2)-measurable function such that the function gy : A1 C defined by A × → 40 gy(x)=f(x, y) is in H for every y A2. Then g : A1 C defined by ∈ → g(x)= f(x, y) dµ2(y) g 2 = f(x, y)g(x) dµ (y) dµ (x) || ||2 1 2 = f(x, y)g(x) dµ2(x) dµ1(y) = f(x, y)PH (g(x)) dµ2(x) dµ1(y) = f(x, y)PH (g(x)) dµ1(y) dµ2(x) = g(x)PH (g(x)) dµ1(x) = PH (g(x))PH (g(x)) dµ1(x) Lemma 4.17. Let H and H be two L∞( )-modules. Then for every v H and f 1 2 B ∈ 1 ∈ L∞( ), we have B PH2 (vf)=fPH2 (v). Proof. Because the module operation is just pointwise multiplication, we can move elements of L∞( ) freely from one side of an inner product to the other, provided we take the complexB conjugate. Freely using the aforementioned property of projection operators, for every w H we have ∈ 2 P (vf),w = vf, w = v, wf = P (v),wf = fP (v),w . 9 H2 : 9 : 9 : 9 H2 : 9 H2 : Then set w = P (vf) fP (v) H , so the above equality yields w, w = 0. Therefore H2 − H2 ∈ 2 9 : w = 0, which is the desired result. ! The rank of H, denoted rk(H), is then defined to be the minimal cardinality of a subset S of H such that the L∞( )-module generated by S is dense in H. B 41 Lemma 4.18. Let H H be two L∞( )-modules. Then rk(H ) rk(H ). 1 ⊆ 2 B 1 ≤ 2 Proof. Let S be a subset of H2 such that the L∞( )-module generated by S is dense in H2. Let B S# = P (s):s S . { H1 ∈ } We claim that the L∞( )-module generated by S# is dense in H1. Consider some h HB and some "> 0. By assumption, there are elements s , . . . , s S ∈ 1 1 r ∈ and f , . . . , f L∞( ) such that 1 r ∈ B r h fisi ". && − && ≤ && i=1 &&2 && ! && The projection PH1 cannot increase&& distances, so&& applying PH1 we certainly have && && r h PH1 (fisi) ". && − && ≤ && i=1 &&2 && ! && By Lemma 4.17 above, this implies&& that && && r && h fiPH1 (si) ", && − && ≤ && i=1 &&2 && ! && so the L∞( )-module generated& by& S# is dense in H&&1. ! B && && 4.5. Weak orthogonality and coset σ-algebras. Let H1,H2 be two Hilbert spaces both contained in a larger Hilbert space H. We say that H1 and H2 are weakly orthogonal if for any h H , P (h) H . ∈ 1 H2 ∈ 1 Lemma 4.19. Weak orthogonality for Hilbert spaces is symmetric; that is, if H1 is weakly orthogonal to H2, then H2 is weakly orthogonal to H1. Proof. It is easy to see that that weak orthogonality means precisely that the orthogonal complements of H H in H and H are orthogonal, which is clearly a symmetric relation. 1 ∩ 2 1 2 ! There is an analogous notion for σ-algebras. Given two σ-algebras and both B1 B2 contained in a larger σ-algebra, we say that 1 and 2 are weakly orthogonal if for any 2 B B f L (X, 1), the conditional expectation E(f 2) is measurable in 1. By the connection between∈ HilbertB space projections and conditional|B expectations, we haveB the following: Lemma 4.20. Weak orthogonality for σ-algebras is symmetric. Proof. Weak orthogonality for σ-algebras when restricted to L2 functions is, by Proposition 4.13, equivalent to weak orthogonality on the corresponding Hilbert spaces. ! Back in the setting of the ultraproduct group A, consider an internal subgroup H; that is, a subgroup of the form H = lim Hi, i ω → where each Hi is a subgroup of Ai, or equivalently that ω-many of the Hi are subgroups of the corresponding Ai. Note that not every subgroup of A is of this form; for example, Z ∗R is a subgroup but as already discussed not an internal set. If H is an internal set that⊂ is a subgroup, however, the following lemma shows us that it is an internal subgroup as defined above: 42 Lemma 4.21. If H = limi ω Hi is a subgroup of A, then → i N : Hi is a subgroup of Ai ω. { ∈ }∈ Proof. Lo´s’theorem.! ! The coset σ-algebra (A, H) (A) is defined to be the subalgebra of consisting of unions of cosets of H.A For example,⊆A if H is the trivial subgroup then (AA, H)= (A), while if H = A then (A, H)= , A is the trivial σ-algebra. A A The importance ofA coset σ-algebras{∅ } is the following proposition. An invariant σ-algebra is one such that if E and a A then the translation E + a . B ∈B ∈ ∈B Proposition 4.22. A coset σ-algebra on A is weakly orthogonal to any invariant σ-algebra. Proof. Let H be the coset in question and let be an invariant σ-algebra. We can de- fine an averaging operator that we will show coincidesB with the projection operator into L2( (A, H)). Let f L2( ). Define the operator T on L2( ) by A ∈ B B T (f)(x)= f(x + h) dµH, T (f)(x),w = f(x + h) dµ (h) w(x) dµ(x) 9 : H = f(x) dµH(h) w(x) dµ(x) = f(x)w(x) dµ(x) 0, then there isB a set 9DB C: , suchA that ∈A # ∈ 9B C: µ(A D ) ". 6 # ≤ Theorem 4.24. Let (A, ,µ) be an ultraproduct space and let 1 and 2 be a sub-σ- algebras. Then there is anA independent complement of in ,B . B C B1 9B1 B2: ! We say that a function f measurable in a σ-algebra is almost measurable in a sub-σ- algebra if it can be approximated arbitrarily well in anAL1 sense by functions measurable in ; thatB is, for every "> 0 there is a f measurable in such that B # B f f ". || − #||1 ≤ Proposition 4.25. Let , be two weakly orthogonal σ-algebras on a probability B1 B2 ⊆A space (X, ,µ), and let W be a set measurable in 1, 2 . Let 3 be the σ-algebra generated by all theA functions 9B B : B E(χ t ),tL∞( ). W · |B1 ∈ B2 Then χ is almost measurable in , . W 9B2 B3: Proof. First assume that 1 and 2 are independent. By Lemma 4.23, there exist sets B1 and B2 ,1 Bj n, suchB that B2 is a partition of the space and if we let j ∈B1 j ∈B2 ≤ ≤ { j } n 1 2 W # = (B B ), j ∩ j j=1 + we have µ(W W #) ". 6 ≤ By construction, 2 1 2 χW # (x)χB (x)=χB B (x). j j ∩ j The projection of this function onto can be easily calculated: we claim that B1 2 E(χW χB2 1)=µ(B )χB1 . # j |B j j Checking the claim: for all B , using the independence of and , we have 1 ∈B1 B1 B2 1 2 1 χ 1 2 (x) dµ(x)=µ(B B B ) Bj Bj j j 1 ∩ ∩ ∩ 0, there is a C such that µ(C D) C". In the languageD of Hilbert space∈ modulesD above, relative separability∈C has a precise6 inter-≤ pretation: Lemma 4.26. If 1 2 are σ-algebras, then 2 is relative separable over 1 if and only 2 B ⊆B B B if the rank of L ( ) over the ring L∞( ) is at most countable. B2 B1 Proof. First assume that is relative separable over , where the countably many extra B2 B1 elements are denoted Fi, i N. Then the L∞( 1)-module generated by the characteristic functions χ is easily seen∈ to be dense in L2( B). Fi B2 45 2 In the other direction, if the rank of L ( 2) over L∞( 1) is countable, then there are 2 B B countably many fi L ( 2) such that the L∞( 1)-module generated by the fi is dense in L2( ). Each of the∈f canB be approximated arbitrarilyB well by a -measurable stepfunc- B2 i B2 tion, so if we take all of the sets in 2 used in these stepfunctions (which is still a countable set), we generate a σ-algebra that isB dense in . B2 ! For Proposition 4.28 below, we want a notion of relative separability defined even when 1 is not contained in 2. In this case, say that 2 is relative separable over 1 if 1 togetherB with at most countablyB extra elements togetherB generate a σ-algebra thatB is denseB in , . 9B1 B2: Likewise, with the same notation, define a relative atom of 2 over 1 to be an element E of with µ(E) > 0 such that for every subset F E thereB is a G B such that B2 ⊆ ∈B1 µ((E G) F)=0. ∩ 6 If is the trivial σ-algebra, then this reduces to the statement that the measure space with B1 σ-algebra 2 is atomless. These notionsB are relevant because relative separable, relative atomless extensions have a particularly nice structure. In particular, we have the following measure-theoretic fact, whose proof would take us too far astray: Lemma 4.27. Let be two σ-algebras on a space X such that is relative separable B1 ⊆B2 B2 and relative atomless over 1. Then there is a system of functions gi : X C such that B → E(g g )=δ i j|B1 i,j for all i, j in a countable set. ! We wish to study the slices of subsets of Ak in the last coordinate. Let E Ak be an arbitrary subset, and for x A define ⊂ ∈ Ex = (x1, . . . , xk 1) (x1, . . . , xk 1,x) E { − | − ∈ } k 1 to be the x-slice of E. Let (E) be the σ-algebra on A − generated by all x-slices. The following propositionS connects this notion back with the product σ-algebras defined k earlier. Recall that [k]∗ is a σ-algebra on A generated by the inverse images of the A k 1 natural σ-algebras (A − ) of all projections onto one fewer dimension. Likewise, [k 1]∗ A k 1 A − is similarly defined with respect to A − . k Proposition 4.28. The set E is in [k]∗ (A ) if and only if (E) is relative separable over k 1 A S [k 1] (A − ). A − ∗ Proof. In one direction, assume that E [k]∗ . Any particular element of a σ-algebra is certainly generated by countably many elements∈A of that σ-algebra (by transfinite induction, if you like), so by the definition of there are countably many sets E Ak such that A[k]∗ i,j ⊆ Ei,j [k] j , with i N and 1 j k, such that E is in the σ-algebra generated by the ∈A \{ } ∈ ≤ ≤ Ei,j. By definition, if j < k, then (Ei,j) [k 1] , S ∈A − ∗ so we don’t have to worry about those elements. If j = k, then Ei,k [k 1], so Ei,k is constant in the k coordinate, so all slices in that coordinate are the∈ same,A − so (E ) is S i,k 46 generated by that one element. Therefore (E) (Ei,j) i N, 1 j k S ⊆ 9S | ∈ ≤ ≤ : is generated by [k 1] together with at most countably many extra elements, so it is relative A − ∗ separable over [k 1] . A − ∗ In the other direction, let (E) be relative separable over [k 1] . We want to use Lemma S A − ∗ 4.27, but (E) may not be relative atomless over [k 1] . To remedy this, construct a σ- S A − ∗ algebra on Ak such that B k 1 [k 1] (E) (A − ) A − ∗ ⊆S ⊆B⊆A and the extension over [k 1]∗ is relative separable and relative atomless. We can con- struct such a byB addingA countably− many generators to (E), “filling in” all possible B S relative atoms. Now use Lemma 4.27 to find a relative basis g1,g2,... of over [k 1]∗ . Then, untangling the definitions, we have B A − ∞ χE(x1, . . . , xk)= gi(x1, . . . , xk 1)E(χEx (x1, . . . , xk 1) gj(x1, . . . , xk 1) [k 1] ), − k − · − |A − ∗ j=1 ! where as we recall Exk is the xk-slice of E. This formula shows that χE is measurable in , so E is a measurable set in the same σ-algebra. A[k]∗ ! With this definition, (E) may depend on a change of E on a set of measure zero. To eliminate this possibility,S it is convenient to introduce the somewhat more involved notion k 1 of the essential σ-algebra (E) on A − . Let [k]# denote the system of sets T : T [k],k E { ⊂ ∈ T, T = k 1 . This is similar to the system [k]∗, but with the added constraint that the last coordinate| | − must} always be included. Denote the σ-algebra generated by all of the algebras such that T [k]# by . Now define (E) to be the smallest σ-algebra containing AT ∈ A[k]# E [k 1] in which all of the functions A − ∗ g(x1, . . . , xk 1)= χE(x1, . . . , xk) t(x1, . . . , xk) dµ(xk) − · k k 1 Proposition 4.29. The following are satisfied by (E), where p : A A − denotes the projection canceling the last coordinate: E → (i) Whenever µ(E E#)=0, (E)= (E#) (insensitivity to measure-zero change). 6 E E (ii) If E [k]∗ , then (E) is relative separable over [k 1]∗ . ∈A E A − 1 (iii) If E [k]∗ , then the characteristic function χE is almost measurable in p− ( (E)), [k]# . (iv) If E ∈A , then the functions 9 E A : ∈A[k]∗ χE,x (x1, . . . , xk 1)=χE(x1, . . . , xk), k − are measurable in (E) for almost all x A. E k ∈ Proof. The first property is obvious, since (E) is defined in terms of an integral; the functions we get with a measure-zero changeE will be precisely the same as the functions we got originally. 47 From Lemma 4.16, since the characteristic functions of the slices are in the Hilbert L2 space corresponding to (E) by definition, we conclude that L2(E(E)) L2(S(E)). There- fore (E) (E). ByS Proposition 4.28, since E , (E) is relative⊆ separable over E ⊆S ∈A[k]∗ S [k 1] . Using the characterization of relative separability in terms of Hilbert spaces mod- A − ∗ ules in Lemma 4.26, by Lemma 4.18, since (E) is relative separable over [k 1]∗ , so is (E). S A − E For the third property, let 1 k 1 k 1 = p− ( (A − )) = [k 1](A ), B A A − = , B2 A[k]# 1 = p− ( (E)), B3 E be three σ-algebras on Ak. It is easy to see that and are weakly orthogonal, and by B1 B2 definition, is the σ-algebra generated by all the functions E(χ t ), where t L∞( ), B3 E · |B1 ∈ B2 since projection onto 1 is clearly just integrating over the last coordinate. Therefore by B 1 Proposition 4.25, E is almost measurable in p− ( (E)), [k]# , as desired. The fourth part follows from Fubini’s theorem9 E on theA ultraproduct,: together with the third part. ! 4.7. Higher order Fourier σ-algebras. Here we finally define the kth-order Fourier σ- algebras on the ultraproduct group A and prove the first major result about them. Let be a σ-algebra on A that is contained in and invariant under the group action, and letBG be an element of . We say G is relativelyA separable with respect to if there is A B a set of countably many elements ai A, i N such the set of translates ∈ ∈ G + ai : i N , { ∈ } together with , is dense the σ-algebra generated by all the translates, together with . Here density isB in the usual sense, where sets are arbitrarily well-approximated if theyB are arbitrarily well-approximated in terms of symmetric differences. In other words, as σ-algebras, countably many translates are dense in all the translates. We have the following: Lemma 4.30. If 1 2 are invariant σ-algebras and 2 is relative separable over 1 as a σ-algebra, then eachB ⊆ elementB of is relative separableB over . B B2 B1 Proof. Pick an element F , and let be the σ-algebra generated by and all of the ∈B2 C B1 translates of F. As 2, there are countably many elements of such that they, together with , generate aCσ⊆-algebraB that is dense in . By constructionC of , it is easy to see that B1 C C we can pick these elements to be themselves translates of F. ! Denote the set of relatively separable elements over by Υ( ). Using the monotone class theorem, Υ( ) must also be a σ-algebra, and it is obviousB thatB Υ( ) is invariant under the group actionB if is. In a failure of terminology, the set of relativelyB separable elements over a σ-algebra willBnot be relative separable over that σ-algebra. In particular, even if is the trivial σ-algebra, Υ( ) will be nonseparable. B The characterizationB of the higher order Fourier σ-algebras (once we define them) in this section will require one additional lemma, allowing us to preserve relative separability of sets when passing to coset σ-algebras. 48 Lemma 4.31. Let H be an internal subgroup of A, an invariant sub-σ-algebra of (A), and G (A, H) a relative separable element overB . Then G is also relative separableA over ∈A(A, H). B B∩A Proof. Note that it is a trivial verification that the intersection of two σ-algebras is also a σ-algebra, so (A, H) is indeed a σ-algebra and the statement of the lemma makes sense. B∩A As G is relative separable over , there are a countable set ai A such that all translates of G are arbitrarily well-approximatedB (in the sense of symmetric∈ differences) in the σ- algebra generated by and the sets T = G+ai . Therefore if a A and "> 0, by Lemma 4.23, there are sets BB1 T and B2 {,1 j } n, such that ∈ j ∈9 : j ∈B ≤ ≤ n G = (B1 B2) # j ∩ j j=1 + is such that µ((G + a) G ) ". 6 # ≤ By the definition of the coset σ-algebras, it is clear that n 1 2 E(χG! (A, H)) = χB E(χB (A, H)). |A j j |A j=1 ! No L2 norms can increase under projection, and the projection onto (A, H) obviously leaves G + a unchanged, so µ((G + a) G ) " implies that A 6 # ≤ E(χ (A, H)) χ ". || G! |A − G||2 ≤ By Lemma 4.22, and (A, H) are weakly orthogonal. Therefore all of the functions B A E(χB2 (A, H)) j |A are measurable in (A, H), which is sufficient to show that G + a is arbitrarily well- approximated by elementsB∩A of that intersection and countably many translates, as desired. ! It is now possible to define and characterize the basic objects of study of this subject, the Fourier σ-algebras on A. The definition is surprisingly straightforward; in terms of the Fi operator Υabove, we just define i recursively, letting 0 = , A be the trivial σ-algebra and defining F F {∅ } i =Υ( i 1) F F − for i 1. Of course, all of the are invariant under the group action. ≥ Fi Using the results developed above, we can link the i back to the Gowers norms, via a F k detour through product spaces. Let Dk denote the diagonal subgroup of A , consisting of elements (x , . . . , x ) with k x = 0. Let τ : Ak A be given by 1 k j=1 j k → I k τk(x1, . . . , xk)= xj. j=1 ! k Clearly, τk descends to a homomorphism from the quotient group A /Dk, which we will also denote by τk. 49 Lemma 4.32. The map τ : Ak/D A is a group isomorphism and gives rise to a k k → measure-preserving equivalence between (Ak, D ) and (A). A k A Proof. That the map is an isomorphism is true generally for any abelian group. k k For the second part, note that (A , Dk) is a σ-algebra on A , so we are using the map k A τk : A A. By a measure-preserving equivalence, we mean that images and preimages of measurable→ sets are measurable and the measure is unchanged by the map. We will show that we can factor τk through an intermediate measure space to get two evident measure- preserving equivalences. Let Ck be the subgroup of Ak with the kth coordinate is always equal to zero; that is, C = (x , . . . , x ) Ak : x =0 . { 1 k ∈ k } Let α : Ak Ak be the map given by k → k αk(x1, . . . , xk 1,xk)= x1, . . . , xk 1, xj . − − j=1 ! Note that αk is an invertible affine linear transformation with determinant one. By con- struction of (Ak), the measure is invariant under such transformations (in much the same A d way that the Lebesgue measure on R is, for example). Furthermore, αk takes Dk to Ck, so it is a measure-preserving equivalence between (A, Dk) and (A, Ck). k A Now let βk : A A be the projection onto the kth coordinate. It is obvious that, by → k the definition of the coset σ-algebras, β is an equivalence between (A , C ) and (A) k A k A (the elements of the former are precisely the unions of cosets of Ck which are measurable in (Ak); that is, cylinder sets in the kth coordinate). That measure is preserved follows fromA the definition of the product σ-algebra. Since τ = β α is the composition of two measure-preserving equivalences, it itself is k k ◦ k a measure-preserving equivalence. ! The following proposition ties together much of the above work in giving an understand- able lifting of k 1 to the product. F − k Proposition 4.33. The map τk : A /Dk A gives a measure-preserving equivalence k → between the σ-algebra (A /Dk) [k] and k 1. A ∩A ∗ F − Proof. The proof is by induction on k. If k = 1, the diagonal subgroup D is trivial, so (A1, D )= (A), while α is the 1 A 1 A [1]∗ trivial σ-algebra on A. By definition, 0 is the trivial σ-algebra on A as well, and τ1 is the identity map. The base case is thereforeF verified. Assume that the proposition holds for k 1, and let H be a set in (A). Let − A 1 Hk = τk− (H) be the corresponding subset of Ak, which is by Lemma 4.32 an element of (Ak). Therefore A k it suffices to prove that H is measurable in k 1 if and only if Hk is measurable in [k] (A ). F − A ∗ For convenience, let f = χH, the characteristic function of H. Then if we let k f (x , . . . , x )=f x , k 1 k j j=1 ! 50 as in the above discussion of octahedral norms, it is clear that fk = χHk . In one direction, assume that H is measurable in k 1, so H is relative separable over k 1 F − k 2. The slice σ-algebra (Hk) on A − consists of translates of the slices in the kth F − S coordinate of Hk, which are all the same and equal to Hk 1 (to see this, it is perhaps − easiest to consider the respective characteristic functions). Thus (Hk) is relative separable 1 S over τ − ( k 2). Therefore by the inductive hypothesis, (Hk) is relative separable over k F − S [k 1] . By Proposition 4.28, Hk is measurable in [k] . A − ∗ A ∗ In the other direction, assume that Hk [k]∗ . We claim that the essential σ-algebra k 1 ∈A (H ) is an invariant σ-algebra on A − . This is a routine verification from the definition. E k By the fourth property in Proposition 4.29, almost all of the slices of Hk are measurable in (Hk). But as previously mentioned, all the slices of Hk are translates of Hk 1, so as (Hk) E − E is invariant, Hk 1 itself must be in (Hk). By the second property in Proposition 4.29, − E (Hk) is relative separable over [k 1]∗ , so using Lemma 4.30, Hk 1 is relative separable E A − − k 1 over [k 1]∗ . By Lemma 4.31, Hk 1 is relative separable over [k 1]∗ (A − , Dk 1). By A − − 1 A − ∩A − the induction hypothesis, therefore, by mapping via τk− we get that H is relative separable over k 2. ! F − As a corollary, we get the first structure theorem in the subject, justifying the above definition of of the Fourier σ-algebras: Theorem 4.34. Let f be a function in L2(A, ), and define A k f (x , . . . , x )=f x . k 1 k j j=1 ! Then the following are equivalent: (i) The function f is measurable in k 1. F − 2 (ii) The function f is orthogonal to every g L (A, ) such that g k =0. ∈ A || ||U (iii) The function fk is measurable in [k]∗ . A 2 k (iv) The function f is orthogonal to every g L (A , ) such that g k =0. k ∈ A || ||O Proof. By Proposition 4.33, we immediately have the equivalence of (i) and (iii) and (ii) and (iv). Equivalence of (iii) and (iv) is given by Proposition 4.15. ! 51 5. Further results and extensions 5.1. The second structure theorem. As mentioned in the introduction, Szegedy’s second structure theorem is the following statement: Theorem 5.1. The Hilbert space L2(A, ) is generated as a module by rank-one modules Fk over L∞(A, k 1) which are pairwise orthogonal. F − The following corollary is immediate: Corollary 5.2. For each f L2(A, ) and each k>1, there is a unique decomposition ∈ A ∞ f = g + fj, j=1 ! where g U k =0and each fj is contained in a different rank-one module over L∞( k 2). || || F − Proof. By Theorem 4.34, we can uniquely decompose f = g + f #, 2 where g U k = 0 and f # L ( k=1). By Theorem 5.1 and the definition of a Hilbert space module|| (which|| requires only∈ thatF generators span a dense subspace of the module), we have a further decomposition ∞ f # = fj, j=1 ! where each fj is contained in a different rank-one module over L∞( k 2). Furthermore, each F − rank-one module is generated by a function α : A C such that α = 1 and α(x)α(x + t) → | | is measurable in k 1. ! F − Although we will not prove Theorem 5.1 in close detail, in what follows we will outline the pieces one needs to do so and give some indication of the proof. First, using language first employed in [17], we will discuss the cubic structure on powers of A. For a positive integer k, let Vk denote the set of all subsets of [k]= 1, . . . , k . We can k { } think of the 2 elements of Vk as vertices of a k-dimensional cube. For d < k,ad-dimensional face of Vk is defined in the natural way; that is, it is a maximal set of subsets of [k] that is constant on k d coordinates; that is, for each of the chosen coordinates j [k], either − ∈ all of the sets of the face contain j or all omit j. Define the subgroup B AVk as the k ⊂ collection of elements ai i V satisfying all possible equations of the form { } ∈ k a a + a a =0, p − q r − s where p, q is a one-dimensional face and p, q, r, s is a two-dimensional face. If v Vk is a vertex,{ } let S(v) refer to v together with{ its k neighbors} obtained by including an extra∈ element of [k] or omitting an element (if Vk is pictured as a k-dimensional cube, then the elements of S(v) are the literal geometrically closest neighbors of v). Somewhat fancifully, S(v) is referred to as the spider of v. Viewing Vk as a cube, there is enough symmetry so that any given vertex is interchange- able with any other. For definitiveness, therefore, we will isolate one, the zero vertex that is represented by the empty set in [k], and refer to it as 0. The following lemma, whose proof is in the same vein as Lemma 4.33, would apply equally well to any other vertex. 52 Lemma 5.3. Label the elements of the spider S(0) as 0, 1, 2 . . . , k, where 0 is the zero vertex S(0) Vk and j is the set j [k] for j =0. Let v Vk and let δk : A A denote the group homomorphism given{ }⊂ by . ∈ → δk(a0,a1, . . . , ak)v = a0 + ai. i v !∈ That is, the vth coordinate of δk is calculated by adding a0 to the sum of the ai such that k+1 i v. Then the action of δk on the measure space (A ) gives a measure-preserving { }⊂ k+1 A equivalence of the groups A and Bk. ! For a vertex v V , let τ denote the projection B A to the vth coordinate. We can ∈ k v k → pull back the σ-algebra (A) via this map to get a σ-algebra on Bk which will be denoted A k+1 k+1 v(Bk). Using the equivalence of Bk and A , we can define a similar map νv : A A byA the formula → ν = τ δ . v v ◦ k The following lemma is a trivial verification, expressing the Gowers norm in terms of the cubic structure. Lemma 5.4. If f : A C is a -measurable function, then → A 2k v f k = C| |[f(τ (x))] dµ(x), || ||U v Bk v V < )∈ k where v is the literal size of v as a subset of [k] and Cf = f is the complex conjugation | | operator. ! Finally, a third lemma relates the cubic structure with the Fourier σ-algebras. Its proof uses the previous lemma’s characterization of the Gowers norm together with Theorem 4.34. Lemma 5.5. For every v V , ∈ k 1 v w w = v = τv− ( k 1). A ∩ 9{A | . }: F − on Bk. ! An automorphism of a measure space is a measure-preserving equivalence from that measure space to itself. Let be an A-invariant σ-algebra on A sandwiched between two Fourier σ-algebras, as follows:H k 1 k. F − ⊆H⊆F We will call an automorphism of a k-automorphism if H (i) it commutes with the action of A on itself, and (ii) it fixes k 1. F − Let Autk( ) denote the set of k-automorphisms of . By transferring in the usual way between aHσ-algebra and the L2 space over it, a k-automorphismH σ induces an action on L2( ), which we will also denote by σ, such that H E(σ(f) k 1)=E(f k 1) |F − |F − 2 2 for f L ( ). Since the underlying measure spaces are finite, L∞ L , so σ acts on ∈ H ⊂ L∞( ) as well. H 53 Each k-automorphism of gives rise to a set of actions on the direct sum of 2k copies of H L∞( ) in the following way. Let T Vk+1 be a set of vertices. Define the action lT,σ on thisH direct sum by ⊂ l f = σ f , T,σ v v v v V v V ∈Bk+1 ∈Bk+1 where fv v V is a collection of functions each in L∞( ) and σv = σ whenever v T { } ∈ k+1 H ∈ and is the identity otherwise. If T is a face of V k+1, this action is called a face action. We define a second functional U˜ k on collections of functions on A, one for each v V : ∈ k fv v V ˜ k = fv(τv(x)). ||{ } ∈ k ||U Bk v V < )∈ k Note that this does not include the complex conjugation operator. The importance of this functional is that it is preserved by edge actions: Lemma 5.6. If is as before, E V is an edge (that is, a one-dimensional face), and H ⊂ k+1 σ Aut ( ), then the action l preserves U˜ k in the sense that ∈ k H E,σ l f = f && E,σ v&& && v&& && v Vk+1 && &&v Vk+1 && && ∈B &&U˜ k && ∈B && && && && && for all collections fv v Vk+1&& of functions in L&&∞( ).&& && ! { } ∈ && && H && && The proof of the above lemma is somewhat tricky and relies on Lemma 5.3 in order to pass from Bk to a power of A as well as the characterization of the Fourier σ-algebras in Lemma 5.5. In a rather different direction, we now turn to a discussion of what will be called pre- cocycles on the ultraproduct group A. These objects are variants of the cocycles used by Furstenberg and Weiss in [10] in their analysis of certain sparse averages in ergodic theory. The intuition here is borrowed from the construction of group extensions in group theory, but placed firmly in the context of ergodic theory. The discussion will be pitched at a somewhat higher level of generality than is strictly necessary, as eventually we will just take G to be a unitary group over the complex numbers. One should note at this stage that this is the first entrance of a nonabelian group into the theory; later work (for example, [31]) has made it clear that this noncommutativity is an essential part of the algebraic structure of higher order Fourier analysis. For now, though, let G be a second countable compact topological group with its associ- ated Borel σ-algebra and normalized Haar measure ν. As G will in general not be abelian, we will denote the groupG operation by multiplication. A function f : A G → is called a pre-cocycle of order k if f is measurable in and the function Fk 1 Ft(a)=f(a + t)f(a)− is measurable in k 1 for every fixed t A. This seemingly arbitrary form should be F − ∈ reminiscent of the promised generators α of the pairwise orthogonal L∞( k 1) modules in the statement of Theorem 5.1. In the context of the circle group (the unitF circle− of ), the inverse operation corresponds to complex conjugation, so if we construct pre-cocyclesC into 54 the unit circle they will give us exactly what we want. Studying objects of this form in more generality will allow us to prove the existence of such generators. Two pre-cocycles f1 and f2 of order k will be called equivalent if there exists a function h measurable in k 1 and an element g G such that F − ∈ f = f f g. 1 3 · 2 · Measurability is of course now defined with respect to the Borel σ-algebra on G rather G than the Borel σ-algebra on R or C. A pre-cocycle gives rise to two actions, one of A on A G and one of G on A G. The former is given by × × 1 a (b, h) = (b + a, f(b + a)f(b)− h) · and the latter is given by g (b, h)=(b, hg). · It is a routine verification that these two actions commute with each other, and by the invariance of the measures µ and ν both actions preserve the product measure µ ν. By the × definition of a cocycle, the product σ-algebra k 1 generated by rectangles is invariant under the action of A. F − ×G To use pre-cocycles, we will have to use some very technical results in ergodic theory. If is the set of A-invariant sets in k 1 , it can be checked that I F − ×G (i) The set is in fact a σ-algebra. (ii) The actionI of G, as defined above, leaves invariant. (iii) Furthermore, G acts ergodically on , whichI means that the only subsets of fixed by the action are the sets in the trivialIσ-algebra. I A general lemma that can be found in, for example, [11] then implies that there is a closed subgroup H G such that the action of G on is isomorphic to the action of G on the left coset space⊆ of H in G. This subgroup is calledI the Mackey group corresponding to the pre-cocycle f; it is defined only up to conjugacy class. If is trivial, then we say f is minimal. By another ergodic-theoretic lemma whose proof canI also be found in [11], every pre-cocycle f is equivalent to a minimal pre-cocycle f # : A H whose image is the Mackey group of f. → The argument then proceeds to connect these pre-cocycles back with structures that have already been defined. Using an averaging operator, Lemma 4.16, the invariance of various structures with respect to the defined actions, and a density argument, it is now possible to prove the following: Proposition 5.7. If f : A G is a minimal pre-cocyle, then there is a σ-algebra on A → H such that k 1 k and a measure-preserving equivalence φ : A G A between F − ⊆H⊆F × → k 1 and that commutes with the above-defined action of A. The equivalence φ also behavesF − ×G nicelyH when restricted to the trivial σ-algebra on G, giving an equivalence between k 1 ,G and k 1. ! F − ×{∅ } F − As a corollary, if is the σ-algebra constructed in the proposition, then G induces a faithful action on ,H meaning that two distinct elements of G induce different permutations of . Denote thisH action by H p : G Aut ( ). → k H We are now ready to put together the ergodic theory into one simple and useful result. Let f : A G be a minimal pre-cocycle of order k, let be the σ-algebra guaranteed by → H 55 ergodic theory in the form of Proposition 5.7, and let g1 and g2 be arbitrary elements of G. Pick a vertex w Vk+1 and let E1 and E2 be two edges of Vk+1 that meet at w. We make the clever definition∈ cw = lE1,p(g1),lE2,p(g2) , where the brackets indicate the commutatorJ of the actions.K Of course, cw acts on the space L∞( ), H v V ∈Bk+1 and it is easy to calculate that the action is given by cw = lw,p([g1,g2]). k From Lemma 5.6, the form U˜ is preserved by cw, so certainly even if we pick various other vertices w1,w2,... Vk+1 and compose them, the composition of the operators still k ∈ preserves U˜ . We can now use a trick to deduce that p([g1,g2]) itself must act trivially: for a function h L∞( ), define ∈ H g = h p([g ,g ])h. − 1 2 Then by definition 2k+1 v g k+1 = C| |[f(τ (x)) p([g ,g ])f(τ (x))] dµ(x). || ||U v − 1 2 v Bk v V < ∈)k+1 Expanding this product and using that all of the cw act trivially, we get perfect cancelation; that is, g k+1 . Since , by Theorem 4.34 g must be orthogonal to itself, hence zero, || ||U Fk hence p([g1,g2]) acts trivially. Because p is a faithful action, [g1,g2] is itself the identity element; that is, g1 commutes with g2. Since we picked g1 and g2 arbitrarily, we arrive at the following surprising conclusion: Proposition 5.8. Let f : A G be a minimal pre-cocycle of order k. Then G is abelian. → ! In the language of ergodic theory, this result is the statement that in this context isometric extensions are abelian. To proceed with the proof of Theorem 5.1, we introduce the “higher order group algebras,” a higher order analogue of a convolution algebra. First let M = L2(A A, ,µ µ) k × Fk ×Fk × be the space of Hilbert-Schmidt operators, which comes equipped with a multiplication operation in the usual way as follows: ◦ (K K )(x, y)= K (x, z)K (z, y) dµ(z). 1 ◦ 2 1 2 im(C∗C) = im(C), 56 so in fact im(C) is a separable Hilbert subspace of L2(A). By standard theory, we have a decomposition ∞ im(C)= Vi, i=1 B where the Vi are finite-dimensional eigenspaces corresponding to different eigenvectors, and each Vi L∞(A). The following⊂ is the last major proposition, using the abelian extension result to conclude the following: Proposition 5.9. As defined above, each Vi is contained in a finite-rank module over L∞( k 1), and this finite-rank module can be decomposed into rank one modules, each F − of which is generated by a function α : A C with α =1and α(x)α(x + t) k 1. → | | ∈F − Proof (sketch). Let Pi be the Hilbert space projection to Vi, which are all elements of Ck. d Let f , . . . , f L∞(A) be an orthonormal basis for V , and let f : A be defined as 1 d ∈ i →C f(x)=(f1(x), . . . , fd(x)). Let E A be the set of x A on which there exist elements t1, . . . , td A such that the matrix⊆ ∈ ∈ (f(x + t1),f(x + t2), . . . , f(x + td)) has full rank d. We can use orthonormality together with the fact that the Pi are elements of Ck to show that E is a set of positive measure that is contained in k 1. We can then employ the Gram-Schmidt process on E for the columnsF of− the above matrix. d d We get functions o1,o2, . . . , od : E C such that oi(x) i=1 is an orthonormal basis for every x E and we can express → { } ∈ d oi(x)= λi,j(x)f(x + tj) j=1 ! for certain coefficients λi,j. It is clear that these coefficients are measurable in k 1 as well. F − To extend the oi to the whole of A, find countably many subsets E1, E2,... of E, all measurable in k 1, such that A is a disjoint union of translates of these sets. That this is possible followsF from− a result in ergodic theory known as Rohlin’s lemma (see, for instance, [11]), together with the fact that the system of A acting on itself with σ-algebra k 1 is F − trivially ergodic. Then we can translate the basis oi i∞=1 accordingly to extend the basis to all of A. Using the above equation, any translate{ of}f can be expressed in this basis with coefficients measurable in k 1. Therefore Vi is contained in an L∞( )-module of rank d. To show that this moduleF − can be decomposed into rank-one modules,A we use the abelian extension result. If O(x) is the unitary matrix formed by the oi(x), the association x O(x) gives a pre-cocycle of degree k from A to the unitary group U(d). This is essentially(→ by construction; since the projections are in C we have that f(x),f(x + t) is measurable in k 9 : k 1 for each fixed t, which is the requirement for the map to be a pre-cocycle. Since every F − pre-cocycle is equivalent to a minimal pre-cocycle, O(x) is equivalent to an O#(x) going to some subgroup H of U(d) that, by Proposition 5.8, must be abelian. As H is abelian, by basic representation theory it decomposes into one-dimensional irreducible representations, so we can find pre-cocycles α taking values in C of absolute value 1. In the context of the unit circle, inverse is equivalent to complex conjugation, so that α is a pre-cocycle immediately implies that α has the desired form. ! 57 The proof of Theorem 5.1 can now proceed. We will prove that every invariant such H that k 1 k and is relative separable over k 1 is generated by rank-one modules of theF form− ⊂ givenH⊂ byF PropositionH 5.9. To do this, we willF − construct for each x A an infinite ∈ (that is, N N) matrix Mx,y such that the function x Mx,y for a certain fixed y generates a σ algebra× containing and such that each entry(→ of M is in C . This implies by H x,y k Proposition 5.9 that x Mx,y is measurable in a σ-algebra generated by rank-one modules of the desired form, so in(→ particular is. Because is the union of all invariant σ-algebras H Fk that are relative separable over k 1, this is sufficient to prove Theorem 5.1. F − 2 To construct such a matrix, let f1,f2,... be the relative orthonormal basis of L ( ) over 2 H L ( k 1) guaranteed by Lemma 4.27, meaning that F − E(fifj k 1)=δij. |F − For any f L2( ), therefore, we can uniquely write ∈ H ∞ f(x)= λj(x)fj(x), j=1 ! where the λj = E(ffj k 1) are measurable in k 1. Therefore in particular, we have |F − F − uniquely defined functions λi,j(t, x) such that ∞ fi(xt)= λi,j(t, x)fj(x). j=1 ! Let Mx,y be the infinite matrix whose (i, j)th entry is λi,j(y x, x). From the definition, it is straightforward to verify that − Mx,yMy,z = Mx,z for almost all (x, y, z), 1 Mx,y = My,x− = Mx,y∗ for almost all (x, y). Therefore there is some y such that the above equations are verified for almost all x and z; fix this y. Using these two equations, we can conclude that since Mx,yMx+t,y is measurable in k 1 for every fixed t, Mx,y is measurable in k; therefore each entry of Mx,y is in Ck. ByF construction,− the σ-algebra generated by x FM contains all of , and we are done. (→ x,y H 5.2. Prospects for extensions to infinite spaces. There are clear reasons why one would want to define higher order Fourier-analytic structures on infinite measure spaces. In ordinary Fourier analysis, for instance, the theory of the Fourier transform on R with the Lebesgue measure is fundamental and widely useful. If we are considering locally compact groups with Haar measure, the measure space is infinite if and only if the group is not compact. Of course, R falls into this category, as do many important number-theoretic groups, such as the adele groups of number fields. Ordinary Fourier analysis on these adele groups has famously yielded fantastic results, starting with Tate’s proof of the functional equation [32], and it is not unreasonable to want to extend the finer control given by higher order Fourier analysis to this setting. Unfortunately, there are several difficulties which make this prospect remote, although perhaps still possible. First, if we are given spaces Ai with infinite measures µi, the measure µ on the ultraproduct A is not σ-finite. For if (A, ,µ) were σ-finite, then it is the union of countably many elements of of finite measure.A Up to measure zero, therefore, it is the A 58 union of countably many elements of of finite measure. But by countable compactness, we can rewrite this countable union asS a finite union, which implies that µ(A) is finite. The failure of σ-finiteness has immediate analytic consequences. For example, Fubini’s theorem does not in general hold for such spaces (see Section 252 in [9] for the most general form of the theorem). This point is worth emphasizing because it is erroneously claimed in several sources, including for example in Section 3.9 of [26], that a Fubini-type theorem can be recovered for complete measure spaces even in the absence of σ-finiteness. In fact, we have the following counterexample, which can be found in [9]: Proposition 5.10. Let (X, ,µ) be the interval [0, 1] with Lebesgue measure and let (Y, ,ν) be the interval [0, 1] with countingA measure (i.e., every set is measurable, with the measureB given by the number of elements in the set if finite and infinity otherwise). Then there exists a function f such that f(x, y) d(µ ν)(x, y) < X Y | | × ∞ < × but f(x, y) dµ(x) dν(y) = f(x, y) dν(y) dµ(x). . f(x, y) dµ(x) dν(y)= µ(Wy) dν(y)=0 st lim fi =0. i ω = → 5.3. Extension to finitely additive measures. The especially astute reader will have noticed that, in the construction of the σ-algebra on the ultraproduct group A, it was A not strictly necessary to start with σ-additive measures on all of the original groups Ai. In fact, it suffices to take finitely additive measures µi on each Ai, and the construction proceeds entirely as before. The underlying reason is that σ-additivity of µ on A, rather than arising as a result of the σ-additivity of the µi, comes “for free” as a result of countable compactness (and therefore as a result of the saturation theorem). Once the additivity of µ has been checked, which requires only the additivity of the µi (or, properly, the additivity of ω-many of the µi, although this freedom is not particularly helpful or relevant), σ-additivity is automatic. A (discrete) group is called amenable if there is a left-invariant finitely additive probability measure that can be defined on it. Therefore if a sequence of abelian groups are all amenable, then they together with their invariant measures can be used to construct a higher order Fourier-analytic structure. Fortunately, we have the following: Proposition 5.11. All abelian groups are amenable. 60 Proof (sketch). First one shows that a group is amenable if and only if its finitely generated subgroups are (that is, locally amenable groups are amenable), and that the direct product of finitely many amenable groups is amenable. Then by the structure theorem for finitely generated abelian groups, it suffices to show that cyclic groups and Z are both amenable. All finite groups are clearly amenable because one can simply take the normalized counting measure. To show that Z is amenable, we have to construct a translation-invariant finitely additive probability measure on it. Though the usual proof uses the Hahn-Banach theorem, there is also a proof (found, for example, in [18]) in the spirit of this paper using a nonprincipal ultrafilter, as follows. Fix a nonprincipal ultrafilter ω on N. For each subset E Z and ⊂ i N, define ∈ E [ i, i] φ (E)=| ∩ − |, i 2i +1 the partial density of E in Z. Take any real p [0, 1] and define the sets ∈ + N (E)= i N : p>φi(E) , p { ∈ } N −(E)= i N : p φi(E) . p { ∈ ≤ } + Because these sets are complementary, one of them belongs to ω. Furthermore, if Np (E) ω + ∈ for some p, then certainly N (E) ω for all q > p, and likewise if N −(E) ω for some p, q ∈ p ∈ then N −(E) ω for all q < p. Therefore the following subsets of [0, 1] form a Dedekind cut q ∈ in [0, 1]: P +(E)= p [0, 1] : N +(E) ω , { ∈ p ∈ } P −(E)= p [0, 1] : N −(E) ω . { ∈ p ∈ } + Therefore either P (E) contains a least element or P −(E) contains a greatest element; call this element φ(A). We claim that this function is our translation-invariant finitely additive probability measure. It is certainly finitely additive and we have φ(Z) = 1; translation invariance follows because for any E Z, n Z, and real "> 0, we have that ∈ ∈ φ (E) φ (E + n) " | i − i |≤ for all sufficiently large i. ! The example of Z itself is perhaps the most interesting; however, this extension to finitely additive probability measure via amenability is possibly of limited utility because the method does not work for “ordinary” measures on infinite spaces, like the Haar measure of a locally compact group that is not compact. 61 Appendix A. Proof ofLo ! ´s’ theorem and the saturation theorem Proof ofLo´s’theorem. " We employ an induction on complexity of formulas. We will be slightly pedantic in recognition of the fact that this form of argument is not well-known outside of mathematical logic. If φ(v1, . . . , vn) is atomic then the theorem is true for φ(v1, . . . , vn) trivially, by the definition of the interpretation of constants, function sym- bols, and relations in the ultraproduct. It suffices to prove the following three induc- tion steps: if φ(v1, . . . , vn) satisfiesLo´stheorem, ! then φ(v1, . . . , vn) does (negation); if φ(v1, . . . , vn) and ψ(v1, . . . , vn) do, then φ(v1, . . . , vn) ¬ψ(v1, . . . , vn) does (conjunction); and if φ(v1, . . . , vn,x) does, then xφ(v1, . . . , vn,x) does∧ (existential quantification). ∃ j j For negation, we have the following chain of equivalences, assuming that a = limi ω ai for 1 j n and thatLo´s’theorem ! is true for φ(v1, . . . , vn): → ≤ ≤ = φ(a1, . . . , an) M| ¬ not = φ(a1, . . . , an) ⇐⇒ M| 1 n i N : i = φ(a , . . . , a ) / ω ⇐⇒ { ∈ M | i i } ∈ 1 n i N : not i = φ(a , . . . , a ) ω ⇐⇒ { ∈ M | i i }∈ 1 n i N : i = φ(a , . . . , a ) ω. ⇐⇒ { ∈ M | ¬ i i }∈ To go from the third to the fourth line, we used the ultrafilter property that exactly one of an index set and its complement are in ω. For conjunction, assumingLo´s’theorem ! is true for φ(v1, . . . , vn) and ψ(v1, . . . , vn) sepa- rately, we have the following chain of equivalences: = φ(a1, . . . , an) ψ(a1, . . . , an) M| ∧ = φ(a1, . . . , an) and = ψ(a1, . . . , an) ⇐⇒ M | M| 1 n 1 n i N : i = φ(a , . . . , a ) ω and i N : i = ψ(a , . . . , a ) ω ⇐⇒ { ∈ M | i i }∈ { ∈ M | i i }∈ 1 n 1 n i N : i = φ(a , . . . , a ) i N : i = ψ(a , . . . , a ) ω ⇐⇒ { ∈ M | i i }∩{ ∈ M | i i }∈ 1 n 1 n i N : i = φ(a , . . . , a ) and i = ψ(a , . . . , a ) ω ⇐⇒ { ∈ M | i i M | i i }∈ 1 n 1 n i N : i = φ(a , . . . , a ) ψ(a , . . . , a ) ω. ⇐⇒ { ∈ M | i i ∧ i i }∈ Here the key stage is between the third and fourth line, where we use the fact that A B ω if and only if A ω and B ω (this is a property of every filter, not just ultrafilters).∪ ∈ For existential∈ quantification,∈ assumingLo´s’theorem ! is true for φ(v1, . . . , vn,x), we have the following chain of equivalences: = xφ(a1, . . . , an,x) M| ∃ 1 n there exists b = lim bi M such that = φ(a , . . . , a ,b) i ω ⇐⇒ → ∈ M| 1 n there exists b = lim bi M such that i N : i = φ(ai , . . . , ai ,bi) ω. i ω ⇐⇒ → ∈ { ∈ M | }∈ 1 n The last statement certainly implies that i N : i = xφ(ai , . . . , ai ,x) ω. In the other direction, if this holds, we can pick b{such∈ thatM | =∃ φ(a1, . . . , an,x) for}∈ each i ω, i Mi | i i ∈ and pick bi for all other i arbitrarily, so the two statements are equivalent. This completes the induction and the proof. ! 62 We need one easy model-theoretic equivalence to prove the saturation theorem. This allows us to concentrate on 1-types instead of all n-types. Proposition A.1. Let κ be an infinite cardinal and a . Then the following are equivalent: M (i) is κ-saturated. (ii)M If A M with A <κ and p is a (possibly incomplete) n-type over A, then p is realized⊂ in . | | M (iii) If A M with A <κ and p SM(A), then p is realized in . ⊂ | | ∈ 1 M Proof. We prove that (i) implies (ii) implies (iii) implies (i). Statement (i) implies (ii) because for every imcomplete type p over A, there is a complete type p# over A such that p p#. The fact that p# is realized immediately implies that p is as well. ⊆ That (ii) implies (iii) is immediate. The last implication is by induction on n. The base case n = 1 is just (iii). Let p ∈ SnM(A), and let q SnM 1( ) be the type ∈ − ∅ q = φ(v1, . . . , vn 1):φ p . { − ∈ } n 1 By induction, q is realized by some (n 1)-tuple (a1, . . . , an 1) in M − . Let r S1M(A − − ∈ ∪ a1, . . . , an 1 ) be the type { − } r = ψ(a1, . . . , an 1,w):ψ(v1, . . . , vn) p . { − ∈ } By the base case, we can realize r by some b M. Then by construction the n-tuple ∈ (a1, . . . , an 1,b) realizes p, so is saturated. ! − M Proof of the saturation theorem. Let A M be countable. For this proof, if a A, choose ⊂ ∈ a sequence ai Mi such that a = limi ω ai. By the above∈ proposition, it suffices→ to show that every 1-type is satisfied in . Because the language is countable, the total number of formulas in the language is countable,M so certainly everyL 1-type is countable. Therefore take and enumerate a 1-type p = φ (v): { n n N , a set of A formulas such that p ThA( ) is satisfiable. By taking conjunctions, ∈ } L ∪ M it suffices to assume without loss of generality that φn+1(v)= φn(v) for all n. Let φn(v) n,1 n,mn ⇒ be θi(v, a , . . . , a ), where each θi is an -formula (we are just pulling out all of the constants not in the original language). L Let n,1 n,mn Dn = i N : i = vθi(v, a , . . . , a ) . { ∈ M | ∃ i i } Since p Th ( ) is satisfiable, certainly so is the existence statement vφ (v) Th ( ). ∪ A M ∃ n ∪ A M This implies that vφn(v) ThA( ), because the full theory of a model is complete. Therefore ∃ ∈ M = vφ (v)= = vθ (v, an,1, . . . , an,mn ), M| ∃ n ⇒M | ∃ n which implies byLo´s’theorem ! that D ω. n ∈ Define sets Jn = i N : n i . Each must be a member of the ultrafilter, since each is a cofinite set: otherwise,{ ∈ a finite≤ set} would be in the ultrafilter, contrary to the assumption that ω is nonprincipal. Therefore if we define X = J D , n n ∩ n then X ω as well. Because φ (v)= φ (v), the D are a descending chain, and the n ∈ n+1 ⇒ n n Jn trivially are as well. The intersection of all of the Jn is the empty set. Therefore the Xi 63 also form a descending chain whose intersection is the empty set. Thus for each i, there is a single greatest number n such that i X (if no such element exists, we just set n = 0). i ∈ ni i If ni = 0, define bi to to be an arbitrary element of Mi. Otherwise, let bi be such that = φ (b ,an,1, . . . , an,mn ). Mi | ni i i i Such an element clearly exists because the model makes true the corresponding existence statement. Then by construction, whenever n>0 and i Xn, by the definition of ni we have n n , so using that φ (v)= φ (v) several times∈ we get ≤ i m+1 ⇒ m = φ (b ,an,1, . . . , an,mn ). Mi | n i i i Except for n = 0, this holds whenever i Xn; that is, whenever n i and i Dn. ByLo´s’theorem, ! since X ω, for each∈ n we can conclude that≤ ∈ n ∈ = φ (b), M| n where b = limi ω bi. Note that the possible problems at n = 0 do not affect anything, since the ultraproduct→ is insensitive to any particular coordinate. Therefore we have found a b that realizes all of p, so we are done. ! 64 References [1] P. Candela, Uniformity norms and nilsequences (an introduction), notes accessible at http://www. dpmms.cam.ac.uk/~pc308/UNN-Notes.pdf. [2] C. C. Chang and H. J. Keisler, Model Theory, Studies in Logic and the Foundations of Mathematics, vol. 73, North-Holland, Amsterdam, 1977. [3] K. Ciesielski, H. Fejzi´c,and C. Freiling, Measure zero sets with non-measurable sum, Real Anal. Ex- change, 27:2, 783-794, 2001. [4] N. J. Cutland, Loeb Measures in Practice: Recent Advances, Lecture Notes in Mathematics, vol. 1751, Springer-Verlag, Berlin, 2001. [5] N. J. Cutland, Nonstandard measure theory and its applications, Bull. London Math. Soc., 15, 529-589, 1983. [6] N. J. Cutland, V. Nneves, F. Oliveira, and J. Sousa-Pinto, eds., Developments in Nonstandard Mathe- matics, Pitman Research Notes in Mathematics Series, vol. 336, Longman, Harlow, 1995. [7] G. Elek and B. Szegedy, Limits of hypergraphs, removal and regularity lemmas: a non-standard ap- proach, preprint, accessible at http://arxiv.org/abs/0705.2179. [8] G. Elek and B. Szegedy, A measure-theoretic approach to the theory of dense hypergraphs, preprint, accessible at http://arxiv.org/abs/0810.4062. [9] D. H. Fremlin, Measure Theory: Volume 2, 2nd edition, Torres Fremlin, 2010. 1 N n n2 [10] H. Furstenberg and B. Weiss, A mean ergodic theorem for N n =1 f(T x)g(T x), in Convergence in Ergodic Theory and Probability, Ohio State Univ. Math. Res. Inst. Publ., vol. 5, Walter de Gruyter P & Co., Berlin, 1996. [11] E. Glasner, Ergodic Theory via Joinings, Mathematical Surveys and Monographs, vol. 101, Amer. Math. Soc., 2003. [12] W. T. Gowers, Fourier analysis and Szemer´edi’stheorem, in Proceedings of the International Congress of Mathematicians, vol. I, Berlin, 1998. [13] W. T. Gowers, A new proof of Szemer´edi’stheorem, Geom. Funct. Anal., 11, 465-588, 2001. [14] B. Green and T. Tao, The primes contain arbitrarily long arithmetic progressions, Ann. Math., 167:2, 481-547, 2008. [15] P. R. Halmos, Measure Theory, Graduate Texts in Mathematics, vol. 18, Springer-Verlag, New York, 1974. [16] C. W. Henson, Unbouded Loeb measures, Proc. Amer. Math. Soc., 74, 143-150, 1979. [17] B. Host and B. Kra, Nonconventional ergodic averages and nilmanifolds, Ann. Math., 161:2, 397-488, 2005. [18] V. G. Kanove˘ι, Borel Equivalence Relations: Structure and Classification, University Lectures Series, vol. 44, Amer. Math. Soc., 2008. [19] H. J. Keisler, An Infinitesimal Approach to Stochastic Analysis, Mem. Amer. Math. Soc., 1984. [20] T. L. Lindstr¨om, Hyperfinite stochastic integration I: The nonstandard theory, Math. Scand., 46, 265- 292, 1980. [21] P. A. Loeb, Conversion from nonstandard to standard measure spaces and applications in probability theory, Trans. Amer. Math. Soc., 211, 113-122, 1975. [22] D. Marker, Model Theory: An Introduction, Graduate Texts in Mathematics, vol. 217, Springer-Verlag, New York, 2000. [23] D. Ramakrishnan and R. J. Valenza, Fourier Analysis on Number Fields, Graduate Texts in Mathe- matics, vol. 186, Springer-Verlag, New York, 1999. [24] A. Robinson, Non-standard analysis, Princeton University Press, 1996. [25] E. M. Stein and R. Shakarchi, Real analysis: measure theory, integration, & Hilbert spaces, Princeton University Press, 2005. [26] C. Swartz, Measure, integration, and function spaces, World Scientific, 1994. [27] B. Szegedy, Higher order Fourier analysis as an algebraic theory I, preprint, accessible at http:// arxiv.org/abs/1001.4282. [28] B. Szegedy, Higher order Fourier analysis as an algebraic theory II, preprint, accessible at http: //arxiv.org/abs/0911.1157. [29] B. Szegedy, Higher order Fourier analysis as an algebraic theory III, preprint, accessible at http: //arxiv.org/abs/0903.0897. 65 [30] B. Szegedy, Gowers norms, regularization and limits of functions on abelian groups, preprint, accessible at http://arxiv.org/abs/1010.6211. [31] B. Szegedy, On higher order Fourier analysis, preprint, accessible at http://arxiv.org/abs/1203.2260. [32] J. T. Tate, Fourier analysis in number fields and Hecke’s zeta-functions, Ph.D. thesis, Princeton, 1950. [33] T. Tao, Higher order Fourier analysis, preprint, accessible at http://terrytao.wordpress.com/books/ higher-order-fourier-analysis/. [34] T. Tao, Ultraproducts as a bridge between hard analysis and soft analysis, notes accessible at http://terrytao.wordpress.com/2011/10/15/ 254a-notes-6-ultraproducts-as-a-bridge-between-hard-analysis-and-soft-analysis/. 66