Furstenberg's Ergodic Theory Proof of Szemerédi's Theorem
ZIJIAN WANG

Date: August 28, 2018.

Abstract. We introduce the basics of ergodic theory and illustrate Furstenberg's proof of Szemerédi's theorem.

Contents
1. Introduction
2. A brief introduction to ergodic theory
2.1. Ergodicity and weak mixing
2.2. Compact systems
2.3. Factor and extension
2.4. Conditional measures
2.5. Weak mixing and compactness for extensions
2.6. The structure theorem
3. Furstenberg's proof of Szemerédi's theorem
3.1. General strategy
3.2. Szemerédi's theorem
3.3. Correspondence
3.4. Two fundamental systems
3.5. Extension principles
3.6. Conclusion
Acknowledgments
4. Bibliography
References

1. Introduction

The statement of Szemerédi's theorem is very simple.

Theorem 1.1 (Szemerédi). A subset of the integers with positive upper Banach density contains arbitrarily long arithmetic progressions.

It was first proved by Szemerédi in 1975 using a combinatorial and completely elementary approach. Although his method was extremely complicated, some important ideas, such as Szemerédi's regularity lemma in graph theory, came out of his proof. Two years later, Furstenberg introduced a totally different approach. He turned Szemerédi's theorem, a problem that looks extremely combinatorial, into an ergodic puzzle about multiple recurrence of a measure-preserving system. Later, in 2001, Gowers gave a Fourier-analytic proof. The fact that the original question, asked by Erdős and Turán in 1936, has been answered in three completely distinct ways already makes this problem highly interesting.

In this paper, we discuss Furstenberg's ergodic proof of Szemerédi's theorem. The proof is elegant, and its value goes well beyond solving the problem itself.
His proof sheds light on many important topics in ergodic theory, for instance, the classification of dynamical systems, conditional measures, and extensions.

2. A brief introduction to ergodic theory

Ergodic theory studies dynamical systems. By dynamical systems, we mean certain "good" actions on measure spaces that exhibit interesting long-term behavior. A $\mathbb{Z}$-action is just a function from the space to itself, or in other words, a dynamics. Obviously, not all functions are "well-behaved". In this section, we cover the basics of ergodic theory to set the foundation for our later discussions.

Definition 2.1. A measure space $(X, \mathcal{B}, \mu)$ is a space $X$ with a measure $\mu$ and the σ-algebra $\mathcal{B}$ of measurable sets.

Sometimes we ignore the σ-algebra associated to the measure space and just write $(X, \mu)$ when it is not so important. However, one should treat the σ-algebra with great caution when dealing with conditional measures, which will be discussed later in this paper.

Remark 2.2. In this paper, we mostly assume that we are dealing with probability spaces, in which the measure of the entire space is 1.

Definition 2.3. A map $T : (X, \mathcal{B}_X, \mu) \to (Y, \mathcal{B}_Y, \nu)$ is measure-preserving if for any set $A \in \mathcal{B}_Y$, $\mu(T^{-1}A) = \nu(A)$.

Definition 2.4. A measure-preserving map $\varphi$ is an invertible measure-preserving map if the inverse of $\varphi$ is measurable and well-defined almost everywhere.

Definition 2.5. We call $(X, \mathcal{B}, T, \mu)$ a measure-preserving system, or equivalently a dynamical system, if $T$ is a measure-preserving map on $X$.

Example 2.6. We define $2^{\mathbb{Z}}$ to be the infinite product of $\{0, 1\}$. This space is compact by Tychonoff's theorem. Given an element $x \in 2^{\mathbb{Z}}$, we denote the $k$th coordinate of $x$ by $x[k]$. We define a measure $\mu$ on the space $2^{\mathbb{Z}}$ by an infinite product. Let $\pi_j$ be the projection onto the $j$th coordinate. On each copy of $\{0, 1\}$, we use the "half-half" measure $\nu$, i.e. for each measurable set $A$ [1],

$$\nu(A) = \begin{cases} 1 & \text{if } A = \{0, 1\}, \\ 0 & \text{if } A \text{ is empty}, \\ \tfrac{1}{2} & \text{otherwise.} \end{cases}$$
We define $\mu$ by

$$\mu(B) = \prod_{n \in \mathbb{Z}} \nu(\pi_n B)$$

for all measurable rectangles of the form $B = \prod_{j \in \mathbb{Z}} A_j$ [2], and extend the measure to the entire σ-algebra of measurable sets. This way of defining a measure is valid, as explained in Remark 2.8. Moreover, one can define the Bernoulli shift $T_k$ on this space for any integer $k$. $T_k$ acts on an element $x \in 2^{\mathbb{Z}}$ by shifting each coordinate of $x$ to the left by $k$ bits, i.e. $x[s] = (T_k x)[s - k]$ for all $s \in \mathbb{Z}$.

[1] Since we have a finite set, every subset is measurable.
[2] Each $A_j$ is measurable in its own copy of $\{0, 1\}$.

Example 2.7. Given a circle $\mathbb{T} = \mathbb{R}/\mathbb{Z}$ equipped with the Haar measure $\mu$, we can define the rotation $R_\alpha$ acting by addition, i.e. $R_\alpha : x \mapsto x + \alpha$. This forms a measure-preserving system. In order to prove that $R_\alpha$ is measure-preserving, it suffices to show that $R_\alpha$ preserves the measure of all intervals [3]. Notice that for any interval $(a, b) \subset \mathbb{T}$ [4],

$$\mu(R_\alpha^{-1}(a, b)) = (b - \alpha) - (a - \alpha) = b - a = \mu((a, b)).$$

[3] See Remark 2.8.
[4] Here we view the circle as the interval $[0, 1)$ with endpoints identified.

Remark 2.8. Proving the measure-preserving property for every measurable set can be painful. However, it suffices to prove this property for a collection of sets that generates the σ-algebra. This is a standard trick that we will use repeatedly.

Example 2.9. Instead of rotation, we can define a different dynamics on the circle $\mathbb{T}$, namely the circle doubling map $M_2$, where $M$ stands for multiplication. $M_2 : \mathbb{T} \to \mathbb{T}$ is defined by $M_2(a) = 2a$. For an arbitrary interval $(a, b) \subset \mathbb{T}$,

$$\mu(M_2^{-1}(a, b)) = \mu\left(\left(\tfrac{a}{2}, \tfrac{b}{2}\right) \cup \left(\tfrac{a+1}{2}, \tfrac{b+1}{2}\right)\right) = \left(\tfrac{b}{2} - \tfrac{a}{2}\right) + \left(\tfrac{b+1}{2} - \tfrac{a+1}{2}\right) = b - a = \mu((a, b)).$$

Therefore, the circle doubling map $M_2$ is also a dynamics on $\mathbb{T}$. In fact, we can show that $M_k$, multiplication by $k$, is measure-preserving for every natural number $k$.

Remark 2.10. Now we have defined two different dynamics on the same space $\mathbb{T}$ (or $\mathbb{R}/\mathbb{Z}$). One natural question to ask is whether these two systems are equivalent. Here it is quite obvious that they are different, given that the action $a \mapsto a + \alpha$ is not even close to the action $a \mapsto 2a$. However, it is hard to tell whether two dynamical systems are different or "behave in some similar ways" when they are in different spaces. Therefore we introduce the notion of measurable isomorphism.

Definition 2.11. Let $(X, \mathcal{B}, \mu)$ be a probability measure space and $A \in \mathcal{B}$ a measurable set. $A$ is null if $\mu(A) = 0$. On the other hand, $A$ is conull if $\mu(A) = 1$.

This gives us a convenient way to talk about the special sets that have zero or full measure, which we will encounter a lot in our discussion of ergodic theory.

Definition 2.12. Let $(X, \mathcal{B}, T, \mu)$ be a dynamical system and $A \in \mathcal{B}$ a measurable set. We call $A$ T-invariant, or invariant to $T$, if $TA \subset A$. Moreover, if $TA = A$, we say that $A$ is strictly T-invariant, or strictly invariant to $T$.

Definition 2.13. Two systems $(X, \mathcal{B}_X, T_X, \mu)$ and $(Y, \mathcal{B}_Y, T_Y, \nu)$ are measurably isomorphic if there exist conull sets $X' \in \mathcal{B}_X$ invariant to $T_X$ and $Y' \in \mathcal{B}_Y$ invariant to $T_Y$, and an invertible measure-preserving map $f : X' \to Y'$ such that $f \circ T_X(x) = T_Y \circ f(x)$ for every $x \in X'$, i.e. the following diagram commutes a.e.

$$\begin{array}{ccc} X & \xrightarrow{\;T_X\;} & X \\ \downarrow{\scriptstyle f} & & \downarrow{\scriptstyle f} \\ Y & \xrightarrow{\;T_Y\;} & Y \end{array}$$

Notice that the above commutative diagram may only be defined on a set of full measure.

Example 2.14. $(\mathbb{T}, \mathcal{B}_{\mathbb{T}}, M_4, \mu)$ is isomorphic to $(\mathbb{T}^2, \mathcal{B}_{\mathbb{T}^2}, M_2 \otimes M_2, \mu \otimes \mu)$, where $\mu \otimes \mu$ is the product measure and $M_2 \otimes M_2 : \mathbb{T}^2 \to \mathbb{T}^2$ is defined by $M_2 \otimes M_2(t_1, t_2) = (2t_1, 2t_2)$. It is clear that $M_2 \otimes M_2$ is a measure-preserving map on $\mathbb{T}^2$. Here we construct a measure-preserving map $\varphi$ from $\mathbb{T}$ to $\mathbb{T}^2$ such that the diagram below commutes.

$$\begin{array}{ccc} \mathbb{T} & \xrightarrow{\;M_4\;} & \mathbb{T} \\ \downarrow{\scriptstyle \varphi} & & \downarrow{\scriptstyle \varphi} \\ \mathbb{T}^2 & \xrightarrow{\;M_2 \otimes M_2\;} & \mathbb{T}^2 \end{array}$$

We construct a sequence $\{\varphi_n\}_{n \in \mathbb{N}}$ of maps from $\mathbb{T}$ to $\mathbb{T}^2$, where each $\varphi_n$ is a measure-preserving map on a "small σ-algebra". When $n = 1$, we define $\mathcal{C}_1 \subset \mathcal{B}_{\mathbb{T}}$ to be the trivial σ-algebra, which only contains the entire interval $[0, 1)$ and the empty set.
Similarly, we define $\mathcal{D}_1 \subset \mathcal{B}_{\mathbb{T}^2}$ to be the σ-algebra that only contains the unit square [5]. We define $\varphi_1$ to be some bijective map from $[0, 1)$ to $[0, 1)^2$. It is clearly measure-preserving when viewed as a map from $(\mathbb{T}, \mathcal{C}_1)$ to $(\mathbb{T}^2, \mathcal{D}_1)$. When $n = 2$, we divide the interval into four subintervals $\{[0, \tfrac{1}{4}), [\tfrac{1}{4}, \tfrac{1}{2}), [\tfrac{1}{2}, \tfrac{3}{4}), [\tfrac{3}{4}, 1)\}$ and define $\mathcal{C}_2 \subset \mathcal{B}_{\mathbb{T}}$ to be the σ-algebra generated by these four subintervals. Similarly, we can divide $\mathbb{T}^2$ into four squares and define the σ-algebra $\mathcal{D}_2$. The function $\varphi_2$ is defined by sending the four subintervals of $\mathbb{T}$ into the four subsquares of $\mathbb{T}^2$ in counterclockwise order starting at the top left square.
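As a sanity check on why a map $\varphi$ making the diagram commute can exist at all, the following Python sketch tests one concrete candidate on finitely many points. It is an illustration under stated assumptions, not the construction of this example: the function names (`base4_digits`, `phi`, `intertwines`) are hypothetical, and the digit-splitting rule below labels the subsquares differently from the counterclockwise ordering described in the text. The idea is that a base-4 digit $d = 2a + b$ of $t$ contributes the bit $a$ to the first coordinate and the bit $b$ to the second, so multiplying $t$ by 4 shifts both binary expansions by one place.

```python
from fractions import Fraction


def base4_digits(t, n):
    """Return the first n base-4 digits of t in [0, 1)."""
    digits = []
    for _ in range(n):
        t *= 4
        d = int(t)  # integer part; t is non-negative, so this is the floor
        digits.append(d)
        t -= d
    return digits


def phi(t, n):
    """Hypothetical digit-splitting map T -> T^2, truncated to n digits.

    Each base-4 digit d = 2a + b sends bit a to x and bit b to y.
    """
    x, y = Fraction(0), Fraction(0)
    for i, d in enumerate(base4_digits(t, n), start=1):
        a, b = divmod(d, 2)
        x += Fraction(a, 2**i)
        y += Fraction(b, 2**i)
    return x, y


def M4(t):
    """Multiplication by 4 on the circle R/Z."""
    return (4 * t) % 1


def M2xM2(p):
    """Doubling in each coordinate of the torus."""
    return ((2 * p[0]) % 1, (2 * p[1]) % 1)


def intertwines(n):
    """Check phi(M4 t) == (M2 x M2)(phi t) on all t = k / 4**n."""
    points = [Fraction(k, 4**n) for k in range(4**n)]
    return all(phi(M4(t), n) == M2xM2(phi(t, n)) for t in points)


def is_bijection_on_grid(n):
    """Check that phi sends the 4**n grid points to 4**n distinct squares."""
    images = {phi(Fraction(k, 4**n), n) for k in range(4**n)}
    return len(images) == 4**n
```

Calling `intertwines(3)` checks the commuting relation on the 64 points $k/64$, and `is_bijection_on_grid(3)` checks that the 64 base-4 intervals of length $1/64$ are matched one-to-one with the 64 dyadic squares of area $1/64$, which is the finite-level analogue of being measure-preserving on a "small σ-algebra". Exact rational arithmetic is used so that the equalities are not affected by floating-point rounding.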