arXiv:math/0611621v2 [math.CA] 25 Jan 2007

Bourgain's Entropy Estimates Revisited

John T. Workman

Abstract

This serves as a near complete set of notes on Bourgain's well-known paper, Almost sure convergence and bounded entropy [2]. The two entropy results are treated, as is one of the applications. These notes are designed to be self-contained and independent of Bourgain's paper. There are differences between his notation and organization and my own, and the same goes, at times, for the proofs. However, the proofs herein are essentially his.

1 Preliminaries

Our setting will be a probability space $(X, \mathcal{F}, \mu)$. Denote by $\mathcal{M}(X)$ the set of measurable functions $f : X \to \mathbb{R}$. By $L^p(\mu)$ will always be meant the subset of $\mathcal{M}(X)$ which has finite $L^p$-norm. That is, $L^p(\mu)$ consists of real-valued functions only.

Notation. Given a pseudo-metric space $(Y, d)$ (that is, $d$ need not separate points) and a subset $S \subset Y$, define the $\delta$-entropy number of $S$ to be the minimal number of (closed) $d$-balls in the pseudo-metric needed to cover $S$. We denote this by $N(S, d, \delta)$.

We are interested in certain sequences of operators. In particular, given a sequence of operators on $L^2(\mu)$, what conditions must the sequence satisfy to obtain a uniform estimate on their entropy? What led Bourgain to conditions of this kind will become clear; first we want to make some definitions.

Definition. Let $T_j : \mathcal{M}(X) \to \mathcal{M}(X)$, $j \in \mathbb{N}$, be a sequence of linear operators. We say $(T_j)$ is a Bourgain sequence if the following are satisfied:

1. $T_j : L^1(\mu) \to L^1(\mu)$ are bounded,
2. $T_j : L^2(\mu) \to L^2(\mu)$ are isometries,
3. $T_j$ are positive, i.e., if $f \geq 0$ a.s.$[\mu]$, then $T_j f \geq 0$ a.s.$[\mu]$,
4. $T_j(1) = 1$ (here, $1$ refers to the constant function $1$),
5. $T_j$ satisfy a mean ergodic condition, i.e.,

\[
\frac{1}{J}\sum_{j=1}^{J} T_j f \to \int_X f(x)\, \mu(dx) \quad \text{in } L^1(\mu)\text{-norm, for all } f \in L^1(\mu).
\]

The mean ergodic condition will prove useful as is. The first four assumptions lead to the following properties.

Lemma 1. Let $T : \mathcal{M}(X) \to \mathcal{M}(X)$ be a linear operator which satisfies assumptions (1)-(4) above. Then, $T : L^\infty(\mu) \to L^\infty(\mu)$ with $\|Tf\|_{L^\infty(\mu)} \leq \|f\|_{L^\infty(\mu)}$. Further, $T(f^2) = T(f)^2$ a.s.$[\mu]$ for all $f \in L^2(\mu)$.

Proof. By assumption 4, $T(c) = c$ for all constant functions $c$. By assumption 3, $T(g) \leq T(h)$ a.s.$[\mu]$ whenever $g \leq h$ a.s.$[\mu]$. Let $f \in L^\infty(\mu)$. Then, $f \leq \|f\|_{L^\infty(\mu)}$ a.s.$[\mu]$, so that $T(f) \leq T(\|f\|_{L^\infty(\mu)}) = \|f\|_{L^\infty(\mu)}$ a.s.$[\mu]$. Similarly, $-T(f) = T(-f) \leq T(\|f\|_{L^\infty(\mu)}) = \|f\|_{L^\infty(\mu)}$ a.s.$[\mu]$. Thus, $|T(f)| \leq \|f\|_{L^\infty(\mu)}$ a.s.$[\mu]$, giving the first statement.

We now approach the second statement. First, suppose $A \in \mathcal{F}$. As $0 \leq \chi_A \leq 1$, we have $0 \leq T(\chi_A) \leq 1$. By assumption 2, $\mu(A) = \|\chi_A\|_{L^2(\mu)}^2 = \|T(\chi_A)\|_{L^2(\mu)}^2 = \int_X T(\chi_A)^2\, d\mu$. On the other hand, $\mu(A) = 1 - \|\chi_{A^c}\|_{L^2(\mu)}^2 = 1 - \|1 - \chi_A\|_{L^2(\mu)}^2 = 1 - \|T(1 - \chi_A)\|_{L^2(\mu)}^2 = 1 - \|1 - T(\chi_A)\|_{L^2(\mu)}^2 = 1 - \int_X (1 - T(\chi_A))^2\, d\mu = \int_X 2T(\chi_A) - T(\chi_A)^2\, d\mu$. Setting these two expressions of $\mu(A)$ equal, we have $\int_X T(\chi_A)\, d\mu = \int_X T(\chi_A)^2\, d\mu$. As $0 \leq T(\chi_A) \leq 1$ a.s.$[\mu]$, it must be that $T(\chi_A) = 0, 1$ a.s.$[\mu]$. Namely, $T$ takes indicator functions to a.s. indicator functions.

Now suppose $A, B \in \mathcal{F}$ are disjoint. Then, $\chi_A \chi_B = 0$. So, $0 = \int_X \chi_A \chi_B\, d\mu = \langle \chi_A, \chi_B \rangle = \langle T(\chi_A), T(\chi_B) \rangle = \int_X T(\chi_A) T(\chi_B)\, d\mu$. As the integrand is necessarily nonnegative a.s.$[\mu]$, we have $T(\chi_A) T(\chi_B) = 0$ a.s.$[\mu]$.

Let $s = \sum_{i=1}^{n} c_i \chi_{A_i}$ be a simple function, where the $A_i$ are pairwise disjoint. Then, $s^2 = \sum c_i^2 \chi_{A_i}$. Now, $T(s) = \sum c_i T(\chi_{A_i})$, which gives $T(s)^2 = \sum_i \sum_k c_i c_k T(\chi_{A_i}) T(\chi_{A_k}) = \sum_i c_i^2 T(\chi_{A_i})^2 = \sum_i c_i^2 T(\chi_{A_i}) = T(s^2)$ a.s.$[\mu]$.

Let $f \in L^2(\mu)$ and $\epsilon > 0$. Denote by $\|T\|$ the operator norm of $T$ on $L^1(\mu)$. Choose a simple function $s$ so that $|s| \leq |f|$, $\|s - f\|_{L^2(\mu)} < \epsilon/(4\|f\|_{L^2(\mu)})$ and $\|s^2 - f^2\|_{L^1(\mu)} < \epsilon/(2\|T\|)$. Then,

\begin{align*}
\|T(f^2) - T(f)^2\|_{L^1(\mu)}
&\leq \|T(f^2) - T(s^2)\|_{L^1(\mu)} + \|T(s^2) - T(s)^2\|_{L^1(\mu)} + \|T(s)^2 - T(f)^2\|_{L^1(\mu)} \\
&= \|T(f^2 - s^2)\|_{L^1(\mu)} + \|(T(f) - T(s))(T(f) + T(s))\|_{L^1(\mu)} \\
&\leq \|T\|\, \|f^2 - s^2\|_{L^1(\mu)} + \|T(f+s)\|_{L^2(\mu)} \|T(f-s)\|_{L^2(\mu)} \\
&< \epsilon/2 + 2\|f\|_{L^2(\mu)} \|f - s\|_{L^2(\mu)} \\
&< \epsilon.
\end{align*}
As $\epsilon$ is arbitrary, we have $T(f^2) = T(f)^2$ a.s.$[\mu]$.
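Lemma 1 is easy to test concretely. The sketch below is an illustration, not part of the notes: the finite uniform space, the rotation $\tau$, and $f$ are arbitrary choices. It checks that the composition operator $Tf = f \circ \tau$ of a measure-preserving bijection is an $L^2$ isometry, is positive, fixes constants, satisfies $T(f^2) = T(f)^2$, and obeys the mean ergodic condition.

```python
# Sanity check of conditions (2)-(4), Lemma 1, and the mean ergodic
# condition for T f = f o tau on X = {0, ..., m-1} with uniform measure.
# The space, tau, and f are arbitrary illustrative choices.
import math
import random

m = 12                                   # X = {0, ..., 11}, mu uniform
tau = lambda x: (x + 1) % m              # an ergodic rotation
T = lambda f: [f[tau(x)] for x in range(m)]

random.seed(3)
f = [random.uniform(-2, 2) for _ in range(m)]

l2 = lambda h: math.sqrt(sum(v * v for v in h) / m)

Tf = T(f)
assert abs(l2(Tf) - l2(f)) < 1e-12                   # (2): L^2 isometry
assert all(v >= 0 for v in T([abs(v) for v in f]))   # (3): positivity
assert T([1.0] * m) == [1.0] * m                     # (4): T(1) = 1
# Lemma 1: T(f^2) = T(f)^2 pointwise (exact for a composition operator).
assert all(abs(a - b * b) < 1e-12
           for a, b in zip(T([v * v for v in f]), Tf))

# Mean ergodic condition: Cesaro averages of T^j f equal the mean of f
# here, since j = 1..m runs over a full cycle of the rotation.
avg = [sum(f[(x + j) % m] for j in range(1, m + 1)) / m for x in range(m)]
mean = sum(f) / m
assert max(abs(a - mean) for a in avg) < 1e-12
```

For this particular $\tau$ the Cesàro averages are exactly constant after one full cycle; in general the mean ergodic condition only gives $L^1(\mu)$-convergence.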

Our results will focus on a more general sequence. Let $S_n : \mathcal{M}(X) \to \mathcal{M}(X)$, $n \in \mathbb{N}$, be a sequence of linear operators where each $S_n : L^2(\mu) \to L^2(\mu)$ is bounded (not necessarily uniformly). We have the following Banach principle-type statements.

Theorem 2. Let $S_n : L^2(\mu) \to L^2(\mu)$ be bounded. Suppose that for some $2 \leq p < \infty$, $\sup_n |S_n f|$ is finite a.s.$[\mu]$ for all $f \in L^p(\mu)$. Then, there is a finite-valued function $\theta(\epsilon)$ so that
\[
\mu\Big\{ x \in X : \sup_n |S_n f(x)| > \theta(\epsilon) \Big\} < \epsilon
\]
for every $\|f\|_{L^p(\mu)} \leq 1$.

Theorem 3. Let $S_n : L^2(\mu) \to L^2(\mu)$ be bounded. Suppose $S_n f$ converges a.s.$[\mu]$ for all $f \in L^\infty(\mu)$. Then, for each $\epsilon > 0$ and $\eta > 0$, there exists $\rho(\epsilon, \eta) > 0$ such that
\[
\mu\Big\{ x \in X : \sup_n |S_n f(x)| > \eta \Big\} < \epsilon
\]
for all $\|f\|_{L^\infty(\mu)} \leq 1$ and $\|f\|_{L^1(\mu)} \leq \rho(\epsilon, \eta)$.

Theorem 2 is a classical result, and Theorem 3 is due to Bellow and Jones [1]. We postpone the proofs of these two theorems until Section 5. The principal assumption we make on the sequence $(S_n)$ is that it commutes with a Bourgain sequence $(T_j)$, that is, $S_n T_j = T_j S_n$ for all $n, j$. More on this later.
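Though the notes do not spell it out at this point, a standard situation where this commutation holds can be sketched as follows (an illustration under the assumption that $\tau$ is an ergodic measure-preserving transformation of $(X, \mathcal{F}, \mu)$; the choice of $\tau$ is hypothetical here):

```latex
% Sketch of a motivating example (assumes tau : X -> X is an ergodic
% measure-preserving transformation; not introduced at this point).
T_j f = f \circ \tau^j, \qquad
S_n f = \frac{1}{n}\sum_{k=1}^{n} f \circ \tau^k .
% Each T_j is a positive L^2(mu)-isometry with T_j 1 = 1, the mean
% ergodic theorem supplies condition 5, and S_n T_j = T_j S_n because
% both operators are polynomials in the single map f -> f o tau.
```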

2 Normal Random Variables

The proofs of the two entropy results rely heavily on the theory of Gaussian (or normal) random variables and Gaussian processes. It is advantageous at this point to review a few fundamental results. Let $(\Omega, \mathcal{B}, P)$ be another probability space.

Definition. We say a random variable $g : \Omega \to \mathbb{R}$ is normal (or Gaussian) with mean $m$ and variance $\sigma^2$ if it has the density function
\[
f(x) = \frac{1}{\sqrt{2\pi\sigma^2}}\, \exp\Big( -\frac{(x-m)^2}{2\sigma^2} \Big),
\]
i.e., $P(g \in A) = \int_A f(x)\, dx$ for all Borel sets $A$. A normal random variable with mean 0 and variance 1 is called a standard normal random variable.

Recall, for a random variable $g$ with mean 0, the variance is given by $\sigma^2 = E(g^2) = \|g\|_{L^2(P)}^2$, where $E(\cdot)$ is expectation. Also, if $g$ is a normal random variable with mean 0, then it is centered, that is, $P(g > 0) = P(g < 0) = 1/2$. The following results are well-known in probability theory, and we present them without proof.

Lemma 4. If g is a random variable on Ω, then

\[
\int_\Omega |g(\omega)|\, P(d\omega) = \int_0^\infty P(|g| > t)\, dt
\qquad \text{and} \qquad
\int_\Omega |g(\omega)|^2\, P(d\omega) = 2\int_0^\infty t\, P(|g| > t)\, dt.
\]
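These two "layer cake" identities can be checked numerically. The sketch below is only a sanity check with arbitrary sample size and grid, using the empirical survival function of a standard normal; for a standard normal, $E|g| = \sqrt{2/\pi}$ and $E(g^2) = 1$.

```python
# Monte Carlo check of Lemma 4 for a standard normal, stdlib only.
# Sample size, grid step, and truncation at t = 8 are arbitrary choices.
import bisect
import math
import random

random.seed(0)
N = 100_000
samples = sorted(abs(random.gauss(0.0, 1.0)) for _ in range(N))

def survival(t):
    """Empirical P(|g| > t) from the sorted sample."""
    return (N - bisect.bisect_right(samples, t)) / N

# Left-hand sides: empirical first and second absolute moments.
m1 = sum(samples) / N
m2 = sum(x * x for x in samples) / N

# Right-hand sides: Riemann sums of the tail integrals on [0, 8].
dt = 0.005
grid = [i * dt for i in range(int(8 / dt))]
tail1 = sum(survival(t) for t in grid) * dt
tail2 = 2 * sum(t * survival(t) for t in grid) * dt

print(m1, tail1)   # both near E|g| = sqrt(2/pi)
print(m2, tail2)   # both near E(g^2) = 1
```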

Lemma 5. Let $g$ be a normal random variable with mean 0. Then, the moment generating function is given by
\[
\int_\Omega e^{\lambda g(\omega)}\, P(d\omega) = e^{\lambda^2 \sigma^2/2}, \qquad \lambda \in \mathbb{R}.
\]

Lemma 6. For each $1 \leq p < \infty$, there exists a constant $C_p > 0$ so that $\|g\|_{L^p(P)} \leq C_p \|g\|_{L^2(P)}$ for all normal random variables $g$ with mean 0.

Lemma 7. Let $g_1, g_2, \ldots, g_m$ be independent standard normal random variables. Then, for any constants $a_i$, $\sum a_i g_i$ is a normal random variable with mean 0 and variance $\sum a_i^2$.

Lemma 4 is proven by two simple applications of Fubini's Theorem. The proof of Lemma 5 is a standard result and is found in most probability texts. Lemma 7 follows immediately from independence. Only Lemma 6 is a somewhat deep result. In fact, something stronger is true; the $L^p$ and $L^q$ norms of a normal random variable are uniformly equivalent for any $1 \leq p, q < \infty$. We need only the case $q = 2$. A proof of the general result can be found in [7] (Corollary 3.2).

We now state an important estimate for normal random variables [3]. The proof is postponed until Section 6.
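Before moving on, Lemmas 5 and 7 can be illustrated by simulation. The sketch below (sample size, coefficients $a_i$, and $\lambda$ are arbitrary choices) checks that $\sum a_i g_i$ has mean 0 and variance $\sum a_i^2$, and that its moment generating function matches $e^{\lambda^2\sigma^2/2}$.

```python
# Monte Carlo check of Lemmas 5 and 7: for independent standard normals
# g_1..g_m and constants a_i, the sum G = sum a_i g_i should have mean 0,
# variance sum a_i^2, and E exp(lambda G) = exp(lambda^2 sigma^2 / 2).
# The constants below are arbitrary illustrative choices.
import math
import random

random.seed(1)
a = [0.5, -1.0, 2.0]
sigma2 = sum(x * x for x in a)          # = 5.25
N = 200_000

G = [sum(ai * random.gauss(0.0, 1.0) for ai in a) for _ in range(N)]

mean = sum(G) / N
var = sum(g * g for g in G) / N
lam = 0.4
mgf = sum(math.exp(lam * g) for g in G) / N

print(mean, var, mgf, math.exp(lam**2 * sigma2 / 2))
```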

Theorem 8. Let $G_1, \ldots, G_N$ be normal random variables, each with mean 0. If for some constant $s$ we have $P\{\omega \in \Omega : \sup_n |G_n(\omega)| \leq s\} \geq 1/2$, then $\big\| \sup_n |G_n| \big\|_{L^1(P)} \leq 6s$.

We now turn our attention to Gaussian processes.

Definition. Let $T$ be a countable indexing set. We say a collection of random variables $(G_t : t \in T)$ is a Gaussian process if each $G_t$ has mean 0 and all finite linear combinations $\sum a_t G_t$ are normal random variables.

Note that this definition is not entirely standard, in particular the requirement that each $G_t$ have mean 0. It is added here because, throughout, all Gaussian processes we deal with have this property, and it makes life simpler later on.

If each $G_t$ is itself a finite linear combination of mean 0 normal random variables, then $(G_t : t \in T)$ is trivially a Gaussian process. Define a pseudo-metric on $T$ by $d_G(s, t) = \|G_s - G_t\|_{L^2(P)}$. Denote the entropy number of $T$ by $N(T, d_G, \delta)$. The following fundamental result is Sudakov's inequality.

Theorem 9. There exists a universal constant $R$ such that if $(G_t : t \in T)$ is a Gaussian process, then
\[
\sup_{\delta > 0} \delta \sqrt{\log N(T, d_G, \delta)} \leq R\, \Big\| \sup_{t \in T} |G_t| \Big\|_{L^1(P)}.
\]
For the remainder of the paper, the notation $R$ is fixed as this constant. A proof of Sudakov's inequality can be found in [7] (Theorem 3.18).
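The interplay of entropy numbers, Sudakov's inequality, and Theorem 8 can be made concrete for the simplest Gaussian process, $K$ independent standard normals (so $d_G(s,t) = \sqrt{2}$ for $s \neq t$). The sketch below is only a numerical illustration: the factor 2 in the final comparison is an ad hoc stand-in for the universal $R$, chosen to make the check concrete, and the greedy covering only upper-bounds the entropy number (here exactly).

```python
# Numerical illustration of N(T, d_G, delta), Sudakov's inequality
# (Theorem 9), and Theorem 8's L^1 bound, for K independent standard
# normals.  All numeric choices are arbitrary illustrative ones.
import math
import random

random.seed(2)
K = 64
N = 20_000

def greedy_cover(points, dist, delta):
    """Greedy upper bound on the number of closed delta-balls covering points."""
    centers = []
    for p in points:
        if all(dist(p, c) > delta for c in centers):
            centers.append(p)
    return len(centers)

# d_G(s, t) = ||G_s - G_t||_{L^2(P)} = sqrt(2) for s != t here.
d = lambda s, t: 0.0 if s == t else math.sqrt(2.0)

sups = [max(abs(random.gauss(0.0, 1.0)) for _ in range(K)) for _ in range(N)]
e_sup = sum(sups) / N                      # ~ || sup |G_t| ||_{L^1(P)}
s = sorted(sups)[N // 2]                   # a median level: P(sup <= s) ~ 1/2

delta = 1.0                                # any delta < sqrt(2) isolates points
entropy = greedy_cover(range(K), d, delta) # = K here
lhs = delta * math.sqrt(math.log(entropy))

print(entropy, lhs, e_sup, 6 * s)
```

With these choices, `lhs` stays below a small multiple of `e_sup` (consistent with some universal $R$), and `e_sup` stays below $6s$ (consistent with Theorem 8).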

3 The First Entropy Result

Recall, we consider a sequence $S_n : L^2(\mu) \to L^2(\mu)$ of bounded operators. For $f \in L^2(\mu)$, define a pseudo-metric on $\mathbb{N}$ by $d_f(n, n') = \|S_n f - S_{n'} f\|_{L^2(\mu)}$. For $\delta > 0$, let $N_f(\delta) := N(\mathbb{N}, d_f, \delta)$, that is, the $\delta$-entropy number of the set $\{S_n f : n \in \mathbb{N}\}$ in $L^2(\mu)$. We now state and prove the first of Bourgain's entropy results.

Proposition 1. Let $S_n : L^2(\mu) \to L^2(\mu)$ be bounded (not necessarily uniformly), and assume $(S_n)$ commutes with a Bourgain sequence $(T_j)$. Suppose that for some $1 \leq p < \infty$, $\sup_n |S_n f| < \infty$ a.s.$[\mu]$ for all $f \in L^p(\mu)$. Then, there exists a constant $C > 0$ such that $\delta(\log N_f(\delta))^{1/2} \leq C\|f\|_{L^2(\mu)}$ for all $\delta > 0$ and $f \in L^2(\mu)$.

Proof. As $(X, \mu)$ is a probability space, $L^p(\mu) \supset L^q(\mu)$ when $p < q$. So, if $p < 2$, then $\sup_n |S_n f| < \infty$ a.s.$[\mu]$ for all $f \in L^q(\mu) \subset L^p(\mu)$, for any $q \geq 2$. Therefore, assume without loss of generality that $p \geq 2$.

For $M \in \mathbb{N}$, consider the initial segment $\{1, 2, \ldots, M\}$ and write $N(M, d_f, \delta)$ for its $\delta$-entropy number. Note, $\sup_M N(M, d_f, \delta) = N_f(\delta)$. Therefore, it suffices to find $C$, independent of $M$, such that $\delta(\log N(M, d_f, \delta))^{1/2} \leq C\|f\|_{L^2(\mu)}$ for all $f, \delta$. Fix $M \in \mathbb{N}$.

Suppose we could show $\delta(\log N(M, d_f, \delta))^{1/2} \leq C\|f\|_{L^2(\mu)}$ for all $f \in L^\infty(\mu)$ and all $\delta > 0$. Let $f \in L^2(\mu)$ and $\delta > 0$. Let $D = \max(\|S_1\|, \ldots, \|S_M\|)$ be the maximum of the $L^2(\mu)$ operator norms. Choose $f_1 \in L^\infty(\mu)$ with $|f_1| \leq |f|$ and $\|f - f_1\|_{L^2(\mu)} < \delta/(2D)$. Then, $\|S_n f - S_n f_1\|_{L^2(\mu)} < \delta/2$ for all $n \leq M$ and $N(M, d_f, \delta) \leq N(M, d_{f_1}, \delta/2)$. Hence, $\delta(\log N(M, d_f, \delta))^{1/2} \leq 2C\|f_1\|_{L^2(\mu)} \leq 2C\|f\|_{L^2(\mu)}$, and we have the desired estimate with $2C$. Therefore, it suffices to prove the result for all $L^\infty(\mu)$ functions. Fix $f \in L^\infty(\mu)$.

We will fix $J$ at some large integer. By the mean ergodic condition on $(T_j)$, we have
\[
\frac{1}{J}\sum_{j=1}^{J} T_j(f^2) \to \int_X f^2\, d\mu = \|f\|_{L^2(\mu)}^2
\]
in $L^1(\mu)$-norm. But, $|J^{-1}\sum_j T_j(f^2)| \leq \|f\|_{L^\infty(\mu)}^2$ a.s.$[\mu]$ by Lemma 1. Of course, if a sequence of functions $h_n$ converges to a constant $c$ in $L^1(\mu)$-norm and $|h_n| \leq B$ a.s.$[\mu]$ for all $n$, it follows that $h_n \to c$ in $L^q(\mu)$-norm for all $1 \leq q < \infty$. Hence, as $p \geq 2$, choose $J$ so large that
\[
\Big\| \frac{1}{J}\sum_{j=1}^{J} T_j(f^2) \Big\|_{L^{p/2}(\mu)} \leq 2\|f\|_{L^2(\mu)}^2. \tag{1}
\]
Similarly, for each pair $n, n' \leq M$, $J^{-1}\sum_j T_j((S_n f - S_{n'} f)^2) \to \|S_n f - S_{n'} f\|_{L^2(\mu)}^2 = d_f(n, n')^2$ in $L^1(\mu)$-norm, and thus in probability. So, for $J$ large enough,
\[
\mu(\mathcal{C}_{n,n'}) := \mu\Big\{ x \in X : \frac{1}{J}\sum_{j=1}^{J} T_j\big((S_n f - S_{n'} f)^2\big)(x) \geq \frac{1}{4}\, d_f(n, n')^2 \Big\} > 1 - \frac{1}{16M^2}.
\]
Choose $J$ big enough so that this holds for each pair $n, n'$.

Let $g_1, \ldots, g_J$ be a sequence of independent standard normal random variables on a probability space $(\Omega, \mathcal{B}, P)$. Define the functions $F, F^*$ on the product space $X \times \Omega$ by
\[
F(x, \omega) = J^{-1/2}\sum_{j=1}^{J} g_j(\omega)\, T_j f(x)
\qquad \text{and} \qquad
F^*(x, \omega) = \sup_{n \leq M} |S_n F(x, \omega)|.
\]
By the commutativity assumption, $S_n F(x, \omega) = J^{-1/2}\sum_j g_j(\omega)\, T_j S_n f(x)$. Note, for each fixed $x$ such that $T_j S_n f(x)$ is finite for all $j, n$, $(S_n F(x, \cdot) : n \leq M)$ is a Gaussian process. The focus of the proof will be finding the "correct" $x$ to fix.

We define four sets $\mathcal{A}, \mathcal{B}, \mathcal{C}, \mathcal{D} \subset X$. First, let
\[
\mathcal{A} = \big\{ x \in X : |T_j S_n f(x)| < \infty \text{ for all } 1 \leq j \leq J,\ n \leq M \big\}.
\]
As each $T_j S_n f \in L^2(\mu)$, it is clear that $\mu(\mathcal{A}) = 1$. Set

\[
\mathcal{B} = \Big\{ x \in X : T_j\big((S_n f - S_{n'} f)^2\big)(x) = \big(T_j(S_n f - S_{n'} f)(x)\big)^2 \text{ for all } 1 \leq j \leq J,\ n, n' \leq M \Big\}.
\]
By Lemma 1, $\mu(\mathcal{B}) = 1$. Let $\mathcal{C} = \bigcap_{n,n'} \mathcal{C}_{n,n'}$, i.e.,
\[
\mathcal{C} = \Big\{ x \in X : \Big( \frac{1}{J}\sum_{j=1}^{J} T_j\big((S_n f - S_{n'} f)^2\big)(x) \Big)^{1/2} \geq \frac{1}{2}\, d_f(n, n') \text{ for all } n, n' \leq M \Big\}.
\]
Now,
\[
\mu\Big( \bigcup_{n, n' \leq M} \mathcal{C}_{n,n'}^c \Big) \leq \sum_{n,n'} \mu(\mathcal{C}_{n,n'}^c) < \sum_{n,n'} \frac{1}{16M^2} = \frac{1}{16}.
\]
Equivalently, $\mu(\mathcal{C}) > 15/16$.

Finally, let $R' = 5\sqrt{2}\, C_p\, \theta(1/4)$, where $\theta$ is the finite-valued function from Theorem 2 and $C_p$ is the constant from Lemma 6. Note, $R'$ does not depend on $f$ or $M$. Set
\[
\mathcal{D} = \Big\{ x \in X : P\big\{ \omega \in \Omega : F^*(x, \omega) \leq R'\|f\|_{L^2(\mu)} \big\} \geq 1/2 \Big\}.
\]
Now, by Lemmas 1, 6, and 7, and (1), we have

\begin{align*}
\int_\Omega \|F(\cdot, \omega)\|_{L^p(\mu)}\, P(d\omega)
&= \int_\Omega \Big( \int_X |F(x, \omega)|^p\, \mu(dx) \Big)^{1/p}\, P(d\omega) \\
&\leq \Big( \int_\Omega \int_X |F(x, \omega)|^p\, \mu(dx)\, P(d\omega) \Big)^{1/p} \\
&= \Big( \int_X \int_\Omega \Big| J^{-1/2}\sum_{j=1}^{J} g_j(\omega)\, T_j f(x) \Big|^p\, P(d\omega)\, \mu(dx) \Big)^{1/p} \\
&= \Big( \int_X \Big\| J^{-1/2}\sum_{j=1}^{J} g_j(\cdot)\, T_j f(x) \Big\|_{L^p(P)}^p\, \mu(dx) \Big)^{1/p} \\
&\leq C_p \Big( \int_X \Big\| J^{-1/2}\sum_{j=1}^{J} g_j(\cdot)\, T_j f(x) \Big\|_{L^2(P)}^p\, \mu(dx) \Big)^{1/p} \\
&= C_p \Big( \int_X \Big( \frac{1}{J}\sum_{j=1}^{J} (T_j f)(x)^2 \Big)^{p/2}\, \mu(dx) \Big)^{1/p} \\
&= C_p \Big\| \frac{1}{J}\sum_{j=1}^{J} T_j(f^2) \Big\|_{L^{p/2}(\mu)}^{1/2} \\
&\leq C_p \sqrt{2}\, \|f\|_{L^2(\mu)}.
\end{align*}

By Chebyshev's inequality, $P\{\omega : \|F(\cdot, \omega)\|_{L^p(\mu)} > 5\sqrt{2}\, C_p \|f\|_{L^2(\mu)}\} \leq 1/5 < 1/4$, or equivalently,
\[
P\big\{ \omega \in \Omega : \|F(\cdot, \omega)\|_{L^p(\mu)} \leq 5\sqrt{2}\, C_p \|f\|_{L^2(\mu)} \big\} > 3/4.
\]
Fix an $\omega$ in the above set. Then, $F(\cdot, \omega) \in L^p(\mu)$. By Theorem 2,
\[
\mu\Big\{ x \in X : \sup_{n \in \mathbb{N}} |S_n F(x, \omega)| \leq \|F(\cdot, \omega)\|_{L^p(\mu)}\, \theta(1/4) \Big\} > 3/4.
\]
Of course, $F^*(x, \omega) \leq \sup_n |S_n F(x, \omega)|$. Further, we have a bound on $\|F(\cdot, \omega)\|_{L^p(\mu)}$ by the choice of $\omega$. Thus,
\[
\mu\Big\{ x \in X : F^*(x, \omega) \leq 5\sqrt{2}\, C_p\, \theta(1/4)\, \|f\|_{L^2(\mu)} \Big\} > 3/4.
\]
As this holds for all such $\omega$, we have
\[
P\Big\{ \omega \in \Omega : \mu\big\{ x \in X : F^*(x, \omega) \leq R'\|f\|_{L^2(\mu)} \big\} > 3/4 \Big\} > 3/4.
\]
We now apply Fubini's theorem. In particular,

\begin{align*}
3/4 &< \int_\Omega \chi_{\{\mu\{F^*(x,\omega) \leq R'\|f\|_{L^2(\mu)}\} > 3/4\}}(\omega)\, P(d\omega)
\leq \frac{4}{3}\int_\Omega \mu\big\{x : F^*(x, \omega) \leq R'\|f\|_{L^2(\mu)}\big\}\, P(d\omega) \\
&= \frac{4}{3}\int_\Omega \int_X \chi_{\{F^*(x,\omega) \leq R'\|f\|_{L^2(\mu)}\}}(x, \omega)\, \mu(dx)\, P(d\omega) \\
&= \frac{4}{3}\int_X \int_\Omega \chi_{\{F^*(x,\omega) \leq R'\|f\|_{L^2(\mu)}\}}(x, \omega)\, P(d\omega)\, \mu(dx) \\
&= \frac{4}{3}\int_X P\big\{\omega : F^*(x, \omega) \leq R'\|f\|_{L^2(\mu)}\big\}\, \mu(dx) \\
&= \frac{4}{3}\Big( \int_{\mathcal{D}} P\big\{F^*(x, \omega) \leq R'\|f\|_{L^2(\mu)}\big\}\, \mu(dx) + \int_{\mathcal{D}^c} P\big\{F^*(x, \omega) \leq R'\|f\|_{L^2(\mu)}\big\}\, \mu(dx) \Big) \\
&\leq \frac{4}{3}\big( \mu(\mathcal{D}) + 1/2 \big).
\end{align*}
The last line follows from the definition of $\mathcal{D}$. This gives $\mu(\mathcal{D}) > 1/16$.

The estimates $\mu(\mathcal{A}) = 1$, $\mu(\mathcal{B}) = 1$, $\mu(\mathcal{C}) > 15/16$, and $\mu(\mathcal{D}) > 1/16$ together imply that $\mu(\mathcal{A} \cap \mathcal{B} \cap \mathcal{C} \cap \mathcal{D}) > 0$. Fix $x \in \mathcal{A} \cap \mathcal{B} \cap \mathcal{C} \cap \mathcal{D}$. Define $G_n(\omega) = S_n F(x, \omega) = J^{-1/2}\sum_j g_j(\omega)\, T_j S_n f(x)$. As $x \in \mathcal{A}$, $(G_n : n \leq M)$ is a Gaussian process. By Sudakov's inequality and Theorem 8, and because $x \in \mathcal{D}$, we see that

\[
\sup_{\delta > 0} \delta\big(\log N(M, d_G, \delta)\big)^{1/2}
\leq R\int_\Omega \sup_{n \leq M} |G_n(\omega)|\, P(d\omega)
= R\int_\Omega F^*(x, \omega)\, P(d\omega)
\leq 6RR'\|f\|_{L^2(\mu)}.
\]
On the other hand, each $G_n - G_{n'}$ is a linear combination of independent standard normal random variables. It follows from Lemma 7 again, and because $x \in \mathcal{B} \cap \mathcal{C}$, that

\begin{align*}
d_G(n, n') = \|G_n - G_{n'}\|_{L^2(P)}
&= \Big( \frac{1}{J}\sum_{j=1}^{J} \big( T_j S_n f(x) - T_j S_{n'} f(x) \big)^2 \Big)^{1/2} \\
&= \Big( \frac{1}{J}\sum_{j=1}^{J} T_j\big((S_n f - S_{n'} f)^2\big)(x) \Big)^{1/2}
\geq \frac{1}{2}\, d_f(n, n').
\end{align*}
This implies $N(M, d_f, \delta) \leq N(M, d_G, \delta/2)$ for all $\delta > 0$. Hence, $\delta(\log N(M, d_f, \delta))^{1/2} \leq 12RR'\|f\|_{L^2(\mu)}$. We note that $12RR'$ is universal, and does not depend on $M$ or $f$. As $f \in L^\infty(\mu)$ was arbitrary, this holds for all a.s. bounded functions. By our earlier note, $\delta(\log N(M, d_f, \delta))^{1/2} \leq 24RR'\|f\|_{L^2(\mu)} =: C\|f\|_{L^2(\mu)}$ for all $f \in L^2(\mu)$ and all $\delta > 0$. Taking the supremum over $M$, $\delta(\log N_f(\delta))^{1/2} \leq C\|f\|_{L^2(\mu)}$.

4 The Second Entropy Result

It will now be necessary to assume the $(S_n)$ are uniformly bounded. Of course, by dividing out a constant, we may assume each $S_n$ is an $L^2(\mu)$-contraction, i.e., $\|S_n f\|_{L^2(\mu)} \leq \|f\|_{L^2(\mu)}$ for all $n$.

Proposition 2. Let $(S_n)$ be a sequence of $L^2(\mu)$-contractions that commute with a Bourgain sequence $(T_j)$. Suppose $S_n f$ converges a.s.$[\mu]$ for all $f \in L^\infty(\mu)$. Then, there exists a finite-valued function $C(\delta)$ such that $N_f(\delta) \leq C(\delta)$ for all $\delta > 0$ and $\|f\|_{L^2(\mu)} \leq 1$.

Proof. Suppose we can show the uniform entropy estimate for all $f \in L^\infty(\mu)$ with $\|f\|_{L^2(\mu)} \leq 1$. Let $\|f\|_{L^2(\mu)} \leq 1$ and $\delta > 0$. Choose $f_1 \in L^\infty(\mu)$, $|f_1| \leq |f|$, such that $\|f - f_1\|_{L^2(\mu)} < \delta/2$. Then, $\|S_n f - S_n f_1\|_{L^2(\mu)} < \delta/2$ for all $n$ and $N_f(\delta) \leq N_{f_1}(\delta/2) \leq C(\delta/2)$. As $\delta$ and $f$ are arbitrary, we have the uniform estimate with the function $C_0(\delta) = C(\delta/2)$. It therefore suffices to prove the result for a.s. bounded functions.

We proceed by contradiction. Suppose not, i.e., suppose there is some $\delta > 0$ such that $N_f(\delta)$ is unbounded over all such $f$. Define the constant $R' = \frac{\delta}{25R}$. As per Theorem 3, pick the constant $\rho(1/10, R'/10)$. Choose $K \in \mathbb{N}$, $K > 1$, big enough so that $\frac{1}{6}\big(R' - 1000(\log K)^{-1/2}\big) > R'/10$ and $2(\log K)^{-1/2} < \rho(1/10, R'/10)$.

Now, by our assumption, there is some $f \in L^\infty(\mu)$, $\|f\|_{L^2(\mu)} \leq 1$, such that $N_f(\delta) > K$. In particular, there is a subset $I \subset \mathbb{N}$ with $|I| = K$ (cardinality) and $\|S_n f - S_{n'} f\|_{L^2(\mu)} > \delta$ for all $n \neq n' \in I$.

As before, we will need to choose an appropriately large $J$. First, denote $B = \|f\|_{L^\infty(\mu)}$. Fix a number $T > 0$ such that $T > 3\sqrt{\log K}$ and $\exp\big( -\frac{T^2}{2B^2} \big) \leq \frac{7}{2KB^2}$. Note, the quantity $e^{\lambda^2(2 - B^2)} - e^{\lambda^2(1 - B^2)}$ is strictly positive for all $\lambda \in [\sqrt{\log K}, T/3]$. Let $\gamma > 0$ be the minimum value of this quantity for $\lambda$ in this interval. As $J^{-1}\sum_j T_j(f^2) \to \|f\|_{L^2(\mu)}^2$ in $L^1(\mu)$-norm, it converges in probability. So, pick $J$ large enough so that
\[
\mu(Y) := \mu\Big\{ x \in X : \Big| \frac{1}{J}\sum_{j=1}^{J} T_j(f^2)(x) - \|f\|_{L^2(\mu)}^2 \Big| > 1 \Big\} < \gamma. \tag{2}
\]
Recall from the proof of Proposition 1, $J^{-1}\sum_j T_j((S_n f - S_{n'} f)^2) \to \|S_n f - S_{n'} f\|_{L^2(\mu)}^2$ in probability for each pair $n, n' \in I$. So, just as we did before, take $J$ big enough so that if
\[
Z_1 = \Big\{ x \in X : \Big( \frac{1}{J}\sum_{j=1}^{J} T_j\big((S_n f - S_{n'} f)^2\big)(x) \Big)^{1/2} \geq \frac{1}{2}\|S_n f - S_{n'} f\|_{L^2(\mu)} \text{ for all } n, n' \in I \Big\},
\]
then
\[
\mu(Z_1) > 4/5. \tag{3}
\]

Again, define $F(x, \omega) = J^{-1/2}\sum_{j=1}^{J} g_j(\omega)\, T_j f(x)$. Write $F(x, \omega) = \varphi(x, \omega) + H(x, \omega)$, where
\[
\varphi(x, \omega) = F(x, \omega)\,\chi_{\{|F(x,\omega)| \leq 6\sqrt{\log K}\}}
\qquad \text{and} \qquad
H(x, \omega) = F(x, \omega)\,\chi_{\{|F(x,\omega)| > 6\sqrt{\log K}\}}.
\]
Define three subsets of $\Omega$ by
\begin{align*}
\mathcal{A} &= \Big\{ \omega \in \Omega : \int_X \sup_{n \in I} |S_n H(x, \omega)|\, \mu(dx) \leq 90 \Big\}, \\
\mathcal{B} &= \Big\{ \omega \in \Omega : \mu\Big\{ x \in X : \sup_{n \in I} |S_n F(x, \omega)| > R'(\log K)^{1/2} \Big\} > 1/5 \Big\}, \\
\mathcal{C} &= \Big\{ \omega \in \Omega : \int_X |\varphi(x, \omega)|\, \mu(dx) \leq 12 \Big\}.
\end{align*}
Suppose for the moment that we could choose $\omega \in \mathcal{A} \cap \mathcal{B} \cap \mathcal{C}$. Define $\psi(x) = \frac{1}{6}(\log K)^{-1/2}\varphi(x, \omega)$. Simply from the definition of $\varphi$, it follows

\[
|\psi| \leq 1.
\]
As $\omega \in \mathcal{C}$,
\[
\int_X |\psi(x)|\, \mu(dx) \leq 2(\log K)^{-1/2} < \rho(1/10, R'/10).
\]
By Theorem 3, we have that
\[
\mu\Big\{ x \in X : \sup_n |S_n \psi(x)| > R'/10 \Big\} < 1/10. \tag{4}
\]
On the other hand, as $\omega \in \mathcal{A}$, we have by Chebyshev that

\[
\mu\Big\{ x \in X : \sup_{n \in I} |S_n H(x, \omega)| > 1000 \Big\} \leq 90/1000 < 1/10,
\]
or equivalently,
\[
\mu\Big\{ x \in X : \sup_{n \in I} |S_n H(x, \omega)| \leq 1000 \Big\} > 9/10. \tag{5}
\]
As $\omega \in \mathcal{B}$,
\[
\mu\Big\{ x \in X : \sup_{n \in I} |S_n F(x, \omega)| > R'(\log K)^{1/2} \Big\} > 1/5. \tag{6}
\]
Now, $\sup_n |S_n \varphi| \geq \sup_{n \in I} |S_n \varphi| \geq \sup_{n \in I} |S_n F| - \sup_{n \in I} |S_n H|$. So, taking the intersection of the sets in (5) and (6), we have
\[
\mu\Big\{ x \in X : \sup_n |S_n \varphi(x, \omega)| \geq R'(\log K)^{1/2} - 1000 \Big\} > 1/10.
\]
Applying the definition of $\psi$ and the choice of $K$,
\[
\mu\Big\{ x \in X : \sup_n |S_n \psi(x)| > R'/10 \Big\} > 1/10.
\]
This clearly contradicts (4). Therefore, it suffices to find such an $\omega$.

Estimate of $\mathcal{A}$.

Fix $\lambda \in [\sqrt{\log K}, T/3]$. By considering $F(x, \omega)$ as a normal random variable (in $\omega$) with variance $J^{-1}\sum_j T_j f(x)^2$, it follows from Lemma 5 that
\[
\int_X \int_\Omega \exp(\lambda F(x, \omega))\, P(d\omega)\, \mu(dx)
= \int_X \exp\Big( \frac{\lambda^2}{2J}\sum_{j=1}^{J} T_j f(x)^2 \Big)\, \mu(dx)
= \int_X \exp\Big( \frac{\lambda^2}{2J}\sum_{j=1}^{J} T_j(f^2)(x) \Big)\, \mu(dx).
\]
Now, by (2), we see $J^{-1}\sum_j T_j(f^2) \leq \|f\|_{L^2(\mu)}^2 + 1 \leq 2$ on $Y^c$. On the other hand, we have $J^{-1}\sum_j T_j(f^2) \leq B^2$ a.s.$[\mu]$. So,
\begin{align*}
\int_X \int_\Omega \exp(\lambda F(x, \omega))\, P(d\omega)\, \mu(dx)
&= \int_X \exp\Big( \frac{\lambda^2}{2J}\sum_{j=1}^{J} T_j(f^2)(x) \Big)\, \mu(dx) \\
&\leq \int_{Y^c} e^{\lambda^2}\, \mu(dx) + \int_Y e^{\lambda^2 B^2}\, \mu(dx) \\
&\leq e^{\lambda^2} + \gamma\, e^{\lambda^2 B^2} \\
&\leq e^{\lambda^2} + \big( e^{\lambda^2(2 - B^2)} - e^{\lambda^2(1 - B^2)} \big)\, e^{\lambda^2 B^2} = e^{2\lambda^2}.
\end{align*}
Define $\mu_t(\omega) = \mu\{x \in X : |F(x, \omega)| > t\}$. Then, for all $t > 0$, we have from above that

\begin{align*}
e^{\lambda t}\int_\Omega \mu_t(\omega)\, P(d\omega)
&= \int_X e^{\lambda t}\, P\{|F(x, \omega)| > t\}\, \mu(dx) \\
&= 2\int_X e^{\lambda t}\, P\{F(x, \omega) > t\}\, \mu(dx) \\
&= 2\int_X \int_{\{F(x,\omega) > t\}} e^{\lambda t}\, P(d\omega)\, \mu(dx) \\
&\leq 2\int_X \int_\Omega e^{\lambda F(x,\omega)}\, P(d\omega)\, \mu(dx)
\leq 2e^{2\lambda^2}.
\end{align*}
Set $t = 3\lambda$. Then, $\int_\Omega \mu_t(\omega)\, P(d\omega) \leq 2e^{-t^2/9}$, and this holds for all $t \in [3\sqrt{\log K}, T]$.

On the other hand, $J^{-1}\sum_j T_j(f^2) \leq B^2$ a.s.$[\mu]$. So, for all $t, \lambda > 0$,
\begin{align*}
e^{\lambda t}\int_\Omega \mu_t(\omega)\, P(d\omega)
&\leq 2\int_X \int_\Omega e^{\lambda F(x,\omega)}\, P(d\omega)\, \mu(dx) \\
&= 2\int_X \exp\Big( \frac{\lambda^2}{2J}\sum_{j=1}^{J} T_j(f^2)(x) \Big)\, \mu(dx) \\
&\leq 2\int_X \exp\Big( \frac{\lambda^2 B^2}{2} \Big)\, \mu(dx) = 2e^{\lambda^2 B^2/2}.
\end{align*}
Set $t = \lambda B^2$ to see $\int_\Omega \mu_t(\omega)\, P(d\omega) \leq 2e^{\lambda^2 B^2/2 - \lambda t} = 2e^{-t^2/(2B^2)}$ for all $t > 0$.

From Lemma 4,

\begin{align*}
\int_\Omega \int_X |H(x, \omega)|^2\, \mu(dx)\, P(d\omega)
&= 2\int_\Omega \int_0^\infty t\, \mu\{|H(x, \omega)| > t\}\, dt\, P(d\omega) \\
&= 2\int_\Omega \int_0^{3\sqrt{\log K}} t\, \mu\{|H(x, \omega)| > t\}\, dt\, P(d\omega) \\
&\quad + 2\int_\Omega \int_{3\sqrt{\log K}}^\infty t\, \mu\{|H(x, \omega)| > t\}\, dt\, P(d\omega). \tag{7}
\end{align*}
By definition of $H$, and an application of Chebyshev,
\begin{align*}
2\int_\Omega \int_0^{3\sqrt{\log K}} t\, \mu\{|H(x, \omega)| > t\}\, dt\, P(d\omega)
&= 2\int_\Omega \int_0^{3\sqrt{\log K}} t\, \mu\big\{|H(x, \omega)| > 6\sqrt{\log K}\big\}\, dt\, P(d\omega) \\
&\leq 2\int_\Omega \int_0^{3\sqrt{\log K}} t\, \Big( \frac{1}{6\sqrt{\log K}} \Big)^2 \Big( \int_X |H(x, \omega)|^2\, \mu(dx) \Big)\, dt\, P(d\omega) \tag{8} \\
&= \frac{1}{4}\int_\Omega \int_X |H(x, \omega)|^2\, \mu(dx)\, P(d\omega) \\
&\leq \frac{1}{2}\int_\Omega \int_X |H(x, \omega)|^2\, \mu(dx)\, P(d\omega).
\end{align*}
As $|H| \leq |F|$ everywhere, $\mu\{|H(x, \omega)| > t\} \leq \mu_t(\omega)$ for all $t$ and $\omega$. So,
\begin{align*}
2\int_\Omega \int_{3\sqrt{\log K}}^\infty t\, \mu\{|H(x, \omega)| > t\}\, dt\, P(d\omega)
&\leq 2\int_\Omega \int_{3\sqrt{\log K}}^\infty t\, \mu_t(\omega)\, dt\, P(d\omega) \\
&= 2\int_\Omega \int_{3\sqrt{\log K}}^{T} t\, \mu_t(\omega)\, dt\, P(d\omega) + 2\int_\Omega \int_T^\infty t\, \mu_t(\omega)\, dt\, P(d\omega) \tag{9} \\
&\leq 4\int_{3\sqrt{\log K}}^{T} t\, e^{-t^2/9}\, dt + 4\int_T^\infty t\, e^{-t^2/(2B^2)}\, dt \\
&= 18K^{-1} - 18e^{-T^2/9} + 4B^2 e^{-T^2/(2B^2)} \\
&\leq 32K^{-1}
\end{align*}
by the choice of $T$. Combining (7), (8), and (9),
\[
\frac{1}{2}\int_\Omega \int_X |H(x, \omega)|^2\, \mu(dx)\, P(d\omega) \leq 32K^{-1}.
\]
Stated another way, $\int_\Omega \|H(\cdot, \omega)\|_{L^2(\mu)}^2\, P(d\omega) \leq 64/K$. This gives

\begin{align*}
\int_\Omega \int_X \sup_{n \in I} |S_n H(x, \omega)|\, \mu(dx)\, P(d\omega)
&= \int_\Omega \Big\| \sup_{n \in I} |S_n H(\cdot, \omega)| \Big\|_{L^1(\mu)}\, P(d\omega) \\
&\leq \int_\Omega \Big\| \sup_{n \in I} |S_n H(\cdot, \omega)| \Big\|_{L^2(\mu)}\, P(d\omega) \\
&= \int_\Omega \Big( \int_X \big( \sup_{n \in I} |S_n H(x, \omega)| \big)^2\, \mu(dx) \Big)^{1/2}\, P(d\omega) \\
&\leq \int_\Omega \Big( \int_X \sum_{n \in I} |S_n H(x, \omega)|^2\, \mu(dx) \Big)^{1/2}\, P(d\omega) \\
&= \int_\Omega \Big( \sum_{n \in I} \|S_n H(\cdot, \omega)\|_{L^2(\mu)}^2 \Big)^{1/2}\, P(d\omega) \\
&\leq \int_\Omega \Big( \sum_{n \in I} \|H(\cdot, \omega)\|_{L^2(\mu)}^2 \Big)^{1/2}\, P(d\omega) \\
&= K^{1/2}\int_\Omega \|H(\cdot, \omega)\|_{L^2(\mu)}\, P(d\omega) \\
&\leq K^{1/2}\Big( \int_\Omega \|H(\cdot, \omega)\|_{L^2(\mu)}^2\, P(d\omega) \Big)^{1/2}
\leq K^{1/2}(64/K)^{1/2} = 8.
\end{align*}
It follows by Chebyshev that $P\big\{ \int_X \sup_{n \in I} |S_n H(x, \omega)|\, \mu(dx) > 90 \big\} \leq 8/90 < 1/10$, or equivalently, $P(\mathcal{A}) > 9/10$.

Estimate of $\mathcal{B}$.

Denote $F^*(x, \omega) = \sup_{n \in I} |S_n F(x, \omega)|$. Define the sets $Z_2, Z_3 \subset X$ by

\begin{align*}
Z_2 &= \big\{ x \in X : |T_j S_n f(x)| < \infty \text{ for all } 1 \leq j \leq J,\ n \in I \big\}, \\
Z_3 &= \Big\{ x \in X : T_j\big((S_n f - S_{n'} f)^2\big)(x) = \big(T_j(S_n f - S_{n'} f)(x)\big)^2 \text{ for all } 1 \leq j \leq J,\ n, n' \in I \Big\}.
\end{align*}
As in the proof of Proposition 1, $\mu(Z_2) = \mu(Z_3) = 1$. Let $Z = Z_1 \cap Z_2 \cap Z_3$, so that $\mu(Z) > 4/5$ by (3).

Fix $x \in Z$. Define a Gaussian process by $G_n(\omega) = S_n F(x, \omega)$ and let $d_G(n, n') = \|G_n - G_{n'}\|_{L^2(P)}$ as before. By the definition of $Z$ and the original hypothesis,
\[
d_G(n, n') = \Big( \frac{1}{J}\sum_{j=1}^{J} \big( T_j(S_n f - S_{n'} f)(x) \big)^2 \Big)^{1/2}
= \Big( \frac{1}{J}\sum_{j=1}^{J} T_j\big((S_n f - S_{n'} f)^2\big)(x) \Big)^{1/2}
\geq \frac{1}{2}\|S_n f - S_{n'} f\|_{L^2(\mu)} > \delta/2
\]
for all $n \neq n' \in I$. So, $N(I, d_G, \delta/4) = |I| = K$. By Sudakov's inequality,
\[
\int_\Omega F^*(x, \omega)\, P(d\omega)
\geq \frac{1}{R}\cdot\frac{\delta}{4}\big( \log N(I, d_G, \delta/4) \big)^{1/2}
= 6\cdot\frac{\delta}{24R}(\log K)^{1/2} > 6R'(\log K)^{1/2}.
\]
It follows from Theorem 8 that $P\{\omega : F^*(x, \omega) \leq R'(\log K)^{1/2}\} < 1/2$, or equivalently, $P\{\omega : F^*(x, \omega) > R'(\log K)^{1/2}\} > 1/2$. As this holds for all $x \in Z$, we have
\[
\mu\Big\{ x \in X : P\big\{ \omega \in \Omega : F^*(x, \omega) > R'(\log K)^{1/2} \big\} > 1/2 \Big\} > 4/5.
\]
By the same Fubini trick as in the proof of Proposition 1, we see that
\begin{align*}
4/5 &< \int_X 2\chi_{\{P\{F^*(x,\omega) > R'(\log K)^{1/2}\} > 1/2\}}(x)\, \mu(dx)
\leq \int_X 2P\big\{F^*(x, \omega) > R'(\log K)^{1/2}\big\}\, \mu(dx) \\
&= \int_\Omega 2\mu\big\{x : F^*(x, \omega) > R'(\log K)^{1/2}\big\}\, P(d\omega) \\
&= 2\int_{\mathcal{B}} \mu\big\{x : F^*(x, \omega) > R'(\log K)^{1/2}\big\}\, P(d\omega)
+ 2\int_{\mathcal{B}^c} \mu\big\{x : F^*(x, \omega) > R'(\log K)^{1/2}\big\}\, P(d\omega) \\
&\leq 2\big( P(\mathcal{B}) + 1/5 \big),
\end{align*}
which implies $P(\mathcal{B}) > 1/5$.

Estimate of $\mathcal{C}$.

Now,

\begin{align*}
\int_\Omega \|\varphi(\cdot, \omega)\|_{L^1(\mu)}\, P(d\omega)
&\leq \int_\Omega \|F(\cdot, \omega)\|_{L^1(\mu)}\, P(d\omega)
= \int_X \int_\Omega \Big| J^{-1/2}\sum_{j=1}^{J} g_j(\omega)\, T_j f(x) \Big|\, P(d\omega)\, \mu(dx) \\
&\leq \int_X \Big( \int_\Omega \Big| J^{-1/2}\sum_{j=1}^{J} g_j(\omega)\, T_j f(x) \Big|^2\, P(d\omega) \Big)^{1/2}\, \mu(dx) \\
&= \int_X \Big( \frac{1}{J}\sum_{j=1}^{J} T_j f(x)^2 \Big)^{1/2}\, \mu(dx)
\leq \Big( \int_X \frac{1}{J}\sum_{j=1}^{J} T_j f(x)^2\, \mu(dx) \Big)^{1/2} \\
&= \Big( \frac{1}{J}\sum_{j=1}^{J} \|T_j f\|_{L^2(\mu)}^2 \Big)^{1/2}
= \Big( \frac{1}{J}\sum_{j=1}^{J} \|f\|_{L^2(\mu)}^2 \Big)^{1/2} \leq 1.
\end{align*}
From Chebyshev, we see $P\{\|\varphi(\cdot, \omega)\|_{L^1(\mu)} > 12\} \leq 1/12 < 1/10$, or equivalently, $P(\mathcal{C}) > 9/10$. It now follows that $P(\mathcal{A} \cap \mathcal{B} \cap \mathcal{C}) > 0$.

5 Proofs of Theorems 2 and 3

Recall, our setting is a probability space $(X, \mathcal{F}, \mu)$ with a sequence $S_n : L^2(\mu) \to L^2(\mu)$. Theorem 2 is a well-known result. The proof is as follows.

Proof of Theorem 2. Fix $\epsilon > 0$. Let $f \in L^p(\mu)$. Then, $S^*f := \sup_n |S_n f| < \infty$ a.s.$[\mu]$, by the hypothesis. So, there is some $n \in \mathbb{N}$ (depending on $f$) so that $\mu\{x : S^*f(x) > n\} \leq \epsilon/3$. Thus,
\[
L^p(\mu) = \bigcup_{n=1}^{\infty} \big\{ f \in L^p(\mu) : \mu\{S^*f > n\} \leq \epsilon/3 \big\}.
\]
Denote $B_n = \{f \in L^p(\mu) : \mu\{S^*f > n\} \leq \epsilon/3\}$. Let $S_N^* f(x) := \sup\{|S_n f(x)| : 1 \leq n \leq N\}$. It follows that
\[
B_n = \bigcap_{N=1}^{\infty} \big\{ f \in L^p(\mu) : \mu\{S_N^* f > n\} \leq \epsilon/3 \big\}.
\]
Denote $B_n^N = \{f \in L^p(\mu) : \mu\{S_N^* f > n\} \leq \epsilon/3\}$. Fix $n, N$. Now, it is clear that $\|S_N^* f\|_{L^2(\mu)} \leq \sum_{k=1}^{N} \big\| |S_k f| \big\|_{L^2(\mu)} \leq \big( \sum_{k=1}^{N} \|S_k\| \big)\|f\|_{L^2(\mu)} =: C\|f\|_{L^2(\mu)}$ for all $f \in L^2(\mu)$, where $\|S_k\|$ refers to the operator norm on $L^2(\mu)$. Suppose $(f_k) \subset B_n^N$ and $f_k \to f$ in $L^p(\mu)$-norm. Let $r \in \mathbb{N}$. Then, by Chebyshev,

\begin{align*}
\mu\{S_N^* f > n + 1/r\} &\leq \mu\{|S_N^* f - S_N^* f_k| > 1/r\} + \mu\{S_N^* f_k > n\} \\
&\leq r^2 \|S_N^*(f - f_k)\|_{L^2(\mu)}^2 + \epsilon/3 \\
&\leq C^2 r^2 \|f - f_k\|_{L^2(\mu)}^2 + \epsilon/3 \\
&\leq C^2 r^2 \|f - f_k\|_{L^p(\mu)}^2 + \epsilon/3 \to \epsilon/3.
\end{align*}
Because $\bigcup_r \{S_N^* f(x) > n + 1/r\} = \{S_N^* f(x) > n\}$ and these sets are nested, we have the limit $\mu\{S_N^* f > n + 1/r\} \to \mu\{S_N^* f > n\}$. Namely, $\mu\{S_N^* f > n\} \leq \epsilon/3$ and $f \in B_n^N$. Thus, $B_n^N$ is closed (in $L^p(\mu)$), which implies $B_n$ is closed. It follows from the Baire Category Theorem, because $L^p(\mu) = \bigcup B_n$, that one of the $B_n$ contains an open set. That is, there exist some $n \in \mathbb{N}$, $\delta > 0$, and $f_0 \in L^p(\mu)$ such that $f \in B_n$ for all $\|f - f_0\|_{L^p(\mu)} \leq \delta$. In particular, $\mu\{S^*(f_0 + \delta g) > n\} \leq \epsilon/3$ for all $\|g\|_{L^p(\mu)} \leq 1$. Thus, for all $\|g\|_{L^p(\mu)} \leq 1$, we have
\[
\mu\Big\{ S^* g > \frac{2n}{\delta} \Big\} \leq \mu\{S^*(f_0 + \delta g) > n\} + \mu\{S^* f_0 > n\} \leq 2\epsilon/3 < \epsilon.
\]
If we set $\theta(\epsilon) = 2n/\delta$, we have the desired result.

Theorem 3 and its proof are taken directly from Bellow and Jones [1]. It is included here only for completeness.

Denote the space $Y_0 = \{f \in L^\infty(\mu) : \|f\|_{L^\infty(\mu)} \leq 1\}$. We will be concerned with the $L^2(\mu)$-norm on $Y_0$. It is well-known that $Y_0$ is complete under $\|\cdot\|_{L^2(\mu)}$. Denote $\delta$-balls in $Y_0$ by $B_\delta(f) = \{g \in Y_0 : \|f - g\|_{L^2(\mu)} < \delta\}$. The first step is to prove the following lemma.

Lemma 10. If $f_0 \in Y_0$ and $\delta > 0$, then $B_\delta(0) \subseteq B_\delta(f_0) - B_\delta(f_0) = \{g_1 - g_2 : g_1, g_2 \in B_\delta(f_0)\}$.

Proof. Let $g \in B_\delta(0)$, so that $\|g\|_{L^2(\mu)} < \delta$. Define
\[
u_1(x) = \begin{cases} g(x) & \text{if } f_0(x), g(x) \text{ are both finite, and } f_0(x)g(x) \leq 0, \\ 0 & \text{otherwise}, \end{cases}
\]
\[
u_2(x) = \begin{cases} -g(x) & \text{if } f_0(x), g(x) \text{ are both finite, and } f_0(x)g(x) > 0, \\ 0 & \text{otherwise}. \end{cases}
\]
Let $g_1 = f_0 + u_1$ and $g_2 = f_0 + u_2$. Now, it is clear that $g = u_1 - u_2 = (f_0 + u_1) - (f_0 + u_2) = g_1 - g_2$ for a.s.$[\mu]$ $x \in X$. It is also clear that $\|f_0 - g_1\|_{L^2(\mu)} = \|u_1\|_{L^2(\mu)} \leq \|g\|_{L^2(\mu)} < \delta$. Similarly, $\|f_0 - g_2\|_{L^2(\mu)} < \delta$. So, we will be done if we can show $g_1, g_2 \in Y_0$.

Fix an $x$ such that $f_0(x), g(x)$ are both finite and $f_0(x)g(x) \leq 0$, i.e., $f_0(x)$ and $g(x)$ have opposite signs. Then, $|u_1(x)| = |g(x)|$ and $|g_1(x)| = |f_0(x) + u_1(x)| = |f_0(x) + g(x)|$. As they have opposite signs, $|f_0(x) + g(x)| \leq \max\{|f_0(x)|, |g(x)|\}$. Now suppose $x$ is such that $f_0(x), g(x)$ are finite and $f_0(x)g(x) > 0$. Then, $u_1(x) = 0$ and $|g_1(x)| = |f_0(x)|$. Hence, $|g_1| \leq \max\{|f_0|, |g|\}$ for a.s.$[\mu]$ $x$, which implies $\|g_1\|_{L^\infty(\mu)} \leq \max\{\|f_0\|_{L^\infty(\mu)}, \|g\|_{L^\infty(\mu)}\} \leq 1$. Namely, $g_1 \in Y_0$. Precisely the same argument shows $g_2 \in Y_0$.

We can now proceed to the proof of Theorem 3. This proof also relies on the Baire Category Theorem.

Proof of Theorem 3. Fix $\epsilon, \eta > 0$. Choose $0 < \alpha < 1/2$ so that $\alpha < \epsilon/3$ and $\eta > 2\alpha$. For $N \in \mathbb{N}$ define

\[
F_N(\alpha) = \Big\{ f \in Y_0 : \mu\Big\{ x \in X : \sup_{m \geq N} |S_N f(x) - S_m f(x)| \leq \alpha \Big\} \geq 1 - \alpha \Big\}.
\]
For $M > N$, define
\[
F_{N,M}(\alpha) = \Big\{ f \in Y_0 : \mu\Big\{ x \in X : \sup_{N \leq m \leq M} |S_N f(x) - S_m f(x)| \leq \alpha \Big\} \geq 1 - \alpha \Big\}.
\]

Fix $N$ and $M > N$. We wish to show $F_{N,M}(\alpha)$ is closed with respect to $(Y_0, \|\cdot\|_{L^2(\mu)})$. Let $(f_k) \subset F_{N,M}(\alpha)$ and $f_k \to f \in Y_0$ in $L^2(\mu)$-norm. Let $g_k(x) = \sup_{N \leq m \leq M} |S_N f_k(x) - S_m f_k(x)|$ and $g(x) = \sup_{N \leq m \leq M} |S_N f(x) - S_m f(x)|$. By $\|S_n\|$ it is meant the operator norm on $L^2(\mu)$. Now,
\begin{align*}
\|g - g_k\|_{L^2(\mu)}
&\leq \Big\| \sup_{N \leq m \leq M} |S_N f - S_m f - S_N f_k + S_m f_k| \Big\|_{L^2(\mu)} \\
&\leq \|S_N(f - f_k)\|_{L^2(\mu)} + \Big\| \sup_{N \leq m \leq M} |S_m(f_k - f)| \Big\|_{L^2(\mu)} \\
&\leq \|S_N(f - f_k)\|_{L^2(\mu)} + \Big\| \sum_{m=N}^{M} |S_m(f_k - f)| \Big\|_{L^2(\mu)} \\
&\leq \Big( \|S_N\| + \sum_{m=N}^{M} \|S_m\| \Big)\, \|f - f_k\|_{L^2(\mu)} \to 0.
\end{align*}
For $n \in \mathbb{N}$, we have that
\[
\mu\Big\{ x \in X : g(x) > \alpha + \tfrac{1}{n} \Big\}
\leq \mu\{x \in X : g_k(x) > \alpha\} + \mu\Big\{ x \in X : |g - g_k| > \tfrac{1}{n} \Big\}
\leq \alpha + n^2 \|g - g_k\|_{L^2(\mu)}^2 \to \alpha.
\]
As $\{g > \alpha\} = \bigcup_n \{g > \alpha + \frac{1}{n}\}$, and this is a nested sequence, we see $\mu\{g > \alpha\} \leq \alpha$. But this says $f \in F_{N,M}(\alpha)$, and $F_{N,M}(\alpha)$ is closed.

Now, as the relevant sets are nested and decreasing, it is easy to see that $F_N(\alpha) = \bigcap_{M > N} F_{N,M}(\alpha)$. So, each $F_N(\alpha)$ is closed. Let $g \in Y_0$. By hypothesis, $S_n g$ converges a.s.$[\mu]$. By Egoroff's Theorem, there is a set $E$ with $\mu(X - E) < \alpha$ so that $S_n g$ converges uniformly on $E$. Then, there is some $N$ such that $|S_N g(x) - S_n g(x)| \leq \alpha$ whenever $n \geq N$ and $x \in E$. This implies $g \in F_N(\alpha)$. Hence, $Y_0 = \bigcup_N F_N(\alpha)$. As $Y_0$ is complete under $\|\cdot\|_{L^2(\mu)}$, it follows by the Baire Category Theorem that at least one of the $F_N(\alpha)$ contains an open set. That is, there is some $N_0$, some $f_0 \in Y_0$, and some $\delta > 0$ such that $B_\delta(f_0) \subset F_{N_0}(\alpha)$.

Let $S_{N_0}^* f = \sup_{1 \leq n \leq N_0} |S_n f|$ and $S^* f = \sup_n |S_n f|$. Note, for each $g \in Y_0$, $\|S_n g\|_{L^2(\mu)} \leq \|S_n\|\,\|g\|_{L^2(\mu)}$, so that each $S_n$ is continuous at 0 in $(Y_0, \|\cdot\|_{L^2(\mu)})$. Hence, $S_{N_0}^*$ is continuous at 0, because $S_{N_0}^*(f) \leq \sum_{n=1}^{N_0} |S_n f|$. So, there is some $0 < \delta' < \delta$ such that $\|S_{N_0}^* f\|_{L^2(\mu)}^2 < \alpha(\eta - 2\alpha)^2$ when $f \in B_{\delta'}(0)$.

Fix $f \in B_{\delta'}(0)$. Now, by Lemma 10, we see $B_{\delta'}(0) \subset B_\delta(0) \subseteq B_\delta(f_0) - B_\delta(f_0)$. Thus, there are $g_1, g_2 \in B_\delta(f_0) \subset F_{N_0}(\alpha)$ such that $f = g_1 - g_2$ a.s.$[\mu]$. By definition of $F_{N_0}(\alpha)$, we see that

\[
\mu\Big\{ x \in X : \sup_{m \geq N_0} |S_{N_0} g_1(x) - S_m g_1(x)| \leq \alpha \Big\} \geq 1 - \alpha,
\]
\[
\mu\Big\{ x \in X : \sup_{m \geq N_0} |S_{N_0} g_2(x) - S_m g_2(x)| \leq \alpha \Big\} \geq 1 - \alpha.
\]
Now, except for a set of probability 0,
\begin{align*}
\sup_{m \geq N_0} |S_{N_0} f(x) - S_m f(x)|
&= \sup_{m \geq N_0} |S_{N_0} g_1(x) - S_m g_1(x) - S_{N_0} g_2(x) + S_m g_2(x)| \\
&\leq \sup_{m \geq N_0} |S_{N_0} g_1(x) - S_m g_1(x)| + \sup_{m \geq N_0} |S_{N_0} g_2(x) - S_m g_2(x)|.
\end{align*}
Therefore,
\begin{align*}
\mu\Big\{ x \in X : \sup_{m \geq N_0} |S_{N_0} f(x) - S_m f(x)| \leq 2\alpha \Big\}
&\geq \mu\Big( \Big\{ \sup_{m \geq N_0} |S_{N_0} g_1 - S_m g_1| \leq \alpha \Big\} \cap \Big\{ \sup_{m \geq N_0} |S_{N_0} g_2 - S_m g_2| \leq \alpha \Big\} \Big) \\
&\geq 1 - 2\alpha.
\end{align*}
Define

\begin{align*}
C &= \{x \in X : S^* f(x) > \eta\}, \\
D &= \{x \in X : S^* f(x) \leq S_{N_0}^* f(x) + 2\alpha\}, \\
E &= \Big\{ x \in X : \sup_{m \geq N_0} |S_{N_0} f(x) - S_m f(x)| \leq 2\alpha \Big\}.
\end{align*}
Then, $\mu(E) \geq 1 - 2\alpha$ by the above. Now, for any $x \in X$ and $m \leq N_0$ it is clear that
\[
|S_m f(x)| \leq S_{N_0}^* f(x).
\]
On the other hand, for $x \in E$ and any $m \geq N_0$ we have
\[
|S_m f(x)| \leq |S_{N_0} f(x)| + 2\alpha \leq S_{N_0}^* f(x) + 2\alpha.
\]
Namely, for every $x \in E$ we see $S^* f(x) \leq S_{N_0}^* f(x) + 2\alpha$, that is, $x \in D$. Thus, $\mu(D) \geq \mu(E) \geq 1 - 2\alpha$. Finally,
\begin{align*}
\mu(C) &= \mu(C \cap D) + \mu(C \cap D^c) \\
&\leq \mu\{x \in X : S_{N_0}^* f(x) > \eta - 2\alpha\} + \mu(D^c) \\
&\leq \frac{1}{(\eta - 2\alpha)^2}\,\|S_{N_0}^* f\|_{L^2(\mu)}^2 + 2\alpha \\
&< 3\alpha < \epsilon.
\end{align*}

This holds for all $f \in B_{\delta'}(0)$. If we set $\rho(\epsilon, \eta) = \delta'$, we have the desired result.

6 Proof of Theorem 8

For this section, it will be convenient to discuss normal random vectors, that is, measurable maps $G : \Omega \to \mathbb{R}^N$ where $G = (G_1, \ldots, G_N)$ and each $G_j$ is a normal random variable. We will say $G$ has mean 0 if each $G_j$ has mean 0. The concepts of distribution and independence are easily extended to this case. We say two random vectors $G$ and $H$ have the same distribution if $P(G \in E) = P(H \in E)$ for all measurable sets $E \subset \mathbb{R}^N$. We say $G$ and $H$ are independent if $P(G \in E)P(H \in F) = P(G \in E, H \in F)$ for all $E, F \subset \mathbb{R}^N$.

Lemma 11. Let $G$ and $H$ be mean 0 normal random vectors which are independent and have the same distribution. Then, for any $\theta \in \mathbb{R}$, the normal random vector $(G, H) : \Omega \to \mathbb{R}^{2N}$ has the same distribution as $(G\sin\theta + H\cos\theta,\ G\cos\theta - H\sin\theta)$.

Proof. A normal random vector (and its distribution) is completely determined by its covariance matrix $\mathrm{cov}(s, t) = E(G_s G_t)$. But, it is clear that for all $s, t$,
\begin{align*}
E(G_s G_t) &= E\big( (G_s\sin\theta + H_s\cos\theta)(G_t\sin\theta + H_t\cos\theta) \big), \\
E(H_s H_t) &= E\big( (G_s\cos\theta - H_s\sin\theta)(G_t\cos\theta - H_t\sin\theta) \big), \\
E(G_s H_t) &= 0 = E\big( (G_s\sin\theta + H_s\cos\theta)(G_t\cos\theta - H_t\sin\theta) \big).
\end{align*}
Hence, the covariance matrices of $(G, H)$ and $(G\sin\theta + H\cos\theta,\ G\cos\theta - H\sin\theta)$ are the same.

The above lemma is, in some sense, the statement that mean 0 normal random vectors are rotation invariant.

Now, for a mean 0 normal random vector $G$ on $(\Omega, \mathcal{B}, P)$, we can always find normal random vectors $H$ and $K$ on some probability space $(\Omega', \mathcal{B}', P')$ such that $H, K$ are independent, and $H, K$ have the same distribution as $G$. That is, $P(G \in E) = P'(H \in E) = P'(K \in E)$ for all measurable sets $E \subset \mathbb{R}^N$. We say $H, K$ are independent copies of $G$. We now prove Theorem 8. This proof is taken from [3].

Proof of Theorem 8. Write $G = (G_1, \ldots, G_N)$ as a normal random vector. Let $H, K$ be independent copies of $G$ on some probability space $(\Omega', \mathcal{B}', P')$. Define $S(G) : \Omega \to [0, \infty)$ by $S(G)(\omega) = \sup_n |G_n(\omega)|$. Define $S(H), S(K)$ similarly. It follows $S(G), S(H), S(K)$ have the same distribution and that $S(H), S(K)$ are independent. Then, for $t \ge s$

\[
2P\{S(G) \le s\}P\{S(G) > t\} = P'\{S(H) \le s\}P'\{S(K) > t\} + P'\{S(K) \le s\}P'\{S(H) > t\}
= P'\{S(H) \le s, S(K) > t\} + P'\{S(K) \le s, S(H) > t\}.
\]
Apply Lemma 11 to $H, K$ with $\theta = \pi/4$. Then, $(H, K)$ and $(\frac{H+K}{\sqrt 2}, \frac{H-K}{\sqrt 2})$ have the same distribution. It follows that $P'\{S(H) \le s, S(K) > t\} = P'\{S(\frac{H+K}{\sqrt 2}) \le s, S(\frac{H-K}{\sqrt 2}) > t\}$. Similarly for the second term, so that

\[
2P\{S(G) \le s\}P\{S(G) > t\}
= P'\Big\{S\Big(\tfrac{H+K}{\sqrt 2}\Big) \le s, S\Big(\tfrac{H-K}{\sqrt 2}\Big) > t\Big\} + P'\Big\{S\Big(\tfrac{H-K}{\sqrt 2}\Big) \le s, S\Big(\tfrac{H+K}{\sqrt 2}\Big) > t\Big\}
\]
\[
= P'\Big(\{S(H+K) \le s\sqrt 2, S(H-K) > t\sqrt 2\} \cup \{S(H-K) \le s\sqrt 2, S(H+K) > t\sqrt 2\}\Big),
\]
where the last equality follows as the sets are clearly disjoint.

Consider the first set in the union, $\{S(H+K) \le s\sqrt 2, S(H-K) > t\sqrt 2\}$. In this set, $\sqrt 2(t - s) < S(H-K) - S(H+K) \le S((H-K) - (H+K)) = S(2K) = 2S(K)$, so $\sqrt 2\, S(K) > t - s$. Similarly, $\sqrt 2(t - s) < S(H-K) - S(H+K) \le S((H-K) + (H+K)) = S(2H) = 2S(H)$, or $\sqrt 2\, S(H) > t - s$. The same calculations work in the other set. So, we have

\[
2P\{S(G) \le s\}P\{S(G) > t\} \le P'\{\sqrt 2\, S(H) > t - s, \sqrt 2\, S(K) > t - s\}
= P\{\sqrt 2\, S(G) > t - s\}^2. \tag{10}
\]
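Inequality (10) holds for any mean 0 normal random vector; when the coordinates are i.i.d. standard normals, $P\{S(G) \le u\} = \operatorname{erf}(u/\sqrt 2)^N$, so both sides can be evaluated in closed form. The sketch below checks (10) on a grid; the dimension $N = 5$ and the $(s, t)$ values are arbitrary choices.

```python
import math

# Check (10): 2 P{S(G) <= s} P{S(G) > t} <= P{sqrt(2) S(G) > t - s}^2
# for G with N i.i.d. standard normal coordinates, where
# P{S(G) <= u} = erf(u / sqrt(2))^N.  N and the (s, t) grid are arbitrary.
N = 5

def cdf_sup(u):
    """P{ sup_n |G_n| <= u } for N i.i.d. standard normal coordinates."""
    return math.erf(u / math.sqrt(2)) ** N if u > 0 else 0.0

ineq_ok = all(
    2 * cdf_sup(s) * (1 - cdf_sup(t))
    <= (1 - cdf_sup((t - s) / math.sqrt(2))) ** 2 + 1e-12
    for s in (0.5, 1.0, 1.5, 2.0)
    for t in (s, s + 0.5, s + 1.0, s + 2.0)
)
```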

We now define a sequence $(t_n)$. Set $t_0 = s$ and $t_{n+1} = t_n\sqrt 2 + s$. It is easily checked by induction that $t_n = (\sqrt 2 + 1)(2^{(n+1)/2} - 1)s$. Define $q = 2P\{S(G) \le s\}$ and $x_n = q^{-1}P\{S(G) > t_n\}$. By the hypothesis, $q \ge 1$. By construction and from (10),

\[
q^2x_{n+1} = qP\{S(G) > t_{n+1}\} = 2P\{S(G) \le s\}P\{S(G) > t_{n+1}\}
\le P\Big\{S(G) > \frac{t_{n+1} - s}{\sqrt 2}\Big\}^2 = P\{S(G) > t_n\}^2 = q^2x_n^2,
\]
or $x_{n+1} \le x_n^2$. This implies that $x_n \le x_0^{2^n}$. It is easily seen that $x_0 = \frac{2-q}{2q} \le 1/2$ as $q \ge 1$. Hence, $P\{S(G) > t_n\} = qx_n \le qx_0^{2^n} \le q2^{-2^n}$. By Lemma 4,

\[
\|S(G)\|_{L^1(P)} = \int_0^\infty P\{S(G) > t\}\,dt
= \int_0^s P\{S(G) > t\}\,dt + \sum_{n=0}^\infty \int_{t_n}^{t_{n+1}} P\{S(G) > t\}\,dt
\le s + \sum_{n=0}^\infty \int_{t_n}^{t_{n+1}} P\{S(G) > t_n\}\,dt
\le s + \sum_{n=0}^\infty (t_{n+1} - t_n)q2^{-2^n}.
\]
Of course, $q = 2P\{S(G) \le s\} \le 2$ trivially. From the induction characterization of $t_n$, we see $t_{n+1} - t_n = s2^{(n+1)/2}$. Therefore,

\[
\|S(G)\|_{L^1(P)} \le s + \sum_{n=0}^\infty s2^{3/2}2^{n/2}2^{-2^n}
\le s + \sum_{n=0}^\infty s2^{3/2}2^{n/2}2^{-2n}
= s + s2^{3/2}\sum_{n=0}^\infty 2^{-3n/2}
= s + \frac{s2^{3/2}}{1 - 2^{-3/2}} \le 6s.
\]
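Two computational steps above are easy to verify numerically: the closed form for $t_n$ and the closing geometric-series bound. A sketch, with the arbitrary normalization $s = 1$:

```python
import math

# Verify t_n = (sqrt(2)+1)(2^{(n+1)/2} - 1)s against the recursion
# t_0 = s, t_{n+1} = t_n sqrt(2) + s, and check the closing bound
# s + s 2^{3/2} / (1 - 2^{-3/2}) <= 6s.  s = 1 is an arbitrary normalization.
s = 1.0
t = s
closed_form_ok = True
for n in range(30):
    expected = (math.sqrt(2) + 1) * (2 ** ((n + 1) / 2) - 1) * s
    if abs(t - expected) > 1e-6 * max(1.0, expected):
        closed_form_ok = False
    t = t * math.sqrt(2) + s          # recursion step: t_{n+1} = t_n sqrt(2) + s

series_bound = s + s * 2 ** 1.5 / (1 - 2 ** -1.5)   # geometric series, summed exactly
```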

7 An Application

Let $\mathbb{T} = \mathbb{R}/\mathbb{Z}$. A function $f$ on $\mathbb{R}$ with period 1 can be viewed as a function on $\mathbb{T}$. Let $m$ be Lebesgue measure, and consider the probability space $(\mathbb{T}, m)$. Let $(a_j)$ be any sequence of non-zero real numbers which converges to 0. For $f : \mathbb{T} \to \mathbb{R}$, consider the operators
\[
S_nf(x) = \frac{1}{n}\sum_{j=1}^n f(x + a_j).
\]
Bellow asked whether $S_nf$ converges to $f$ a.s. for all $f \in L^1(m)$. The answer to this turns out to be no. In fact, it is not even true for all $f \in L^\infty(m)$.

We will prove this using the second entropy result. In this case, it is beneficial to consider complex-valued functions temporarily. Here, the operators $S_n$ make perfectly good sense applied to complex-valued functions. In fact, we also have the nice property that $S_n(\operatorname{Re} f) = \operatorname{Re}(S_nf)$, and similarly for the imaginary part. We will take advantage of this. First, we need two technical lemmas.
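Before turning to the lemmas, the operators $S_n$ are easy to transcribe directly. The sketch below uses the arbitrary choices $a_j = 1/j$ and $f(x) = \cos(2\pi x)$; note that for continuous $f$ these averages do converge pointwise (since $f(x + a_j) \to f(x)$), which is why the counterexample of Theorem 14 must come from a rougher bounded function.

```python
import math

# Direct transcription of S_n f(x) = (1/n) sum_{j=1}^n f(x + a_j) on T = R/Z.
# The sequence a_j = 1/j and the test point/function are arbitrary choices;
# for continuous f the averages converge pointwise, so a counterexample
# must be sought among rougher bounded functions.
def S(n, f, x, a):
    return sum(f((x + a(j)) % 1.0) for j in range(1, n + 1)) / n

f = lambda x: math.cos(2 * math.pi * x)
a = lambda j: 1.0 / j                      # non-zero, tends to 0

vals = [S(n, f, 0.25, a) for n in (10, 100, 1000)]   # drifts toward f(0.25) = 0
```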

Lemma 12. Let $(a_j)$ be a sequence of non-zero real numbers converging to 0. Then, given any $r \in \mathbb{N}$, there exist integers $J_1 < J_2 < \cdots < J_r$ and, for each $\bar\alpha = (\alpha_1, \ldots, \alpha_r) \in \{0,1\}^r$, an integer $n(\bar\alpha)$ such that

\[
\Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - 1\Big| < \frac{1}{10} \quad \text{if } \alpha_s = 0,
\]
\[
\Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - 1\Big| > \frac{1}{2} \quad \text{if } \alpha_s = 1,
\]

for all $1 \le s \le r$.

Lemma 13. Let $(Y, \mathcal{S}, \nu)$ be a measure space and define $\|h\| = (\int_Y |h|^p\,d\nu)^{1/p}$ for the measurable functions from $Y$ to $\mathbb{C}$, where $1 \le p < \infty$. Let $r_0 \in \mathbb{N}$ and $r = 4r_0^2 + 2r_0 \in \mathbb{N}$. Suppose $\{h_1, \ldots, h_r\}$ is a collection of complex-valued functions on $Y$ such that $\|h_j - h_k\| > \alpha$ for all $j \ne k$. Then, there exists a set $I \subset \{1, \ldots, r\}$, $|I| = r_0$, such that either $\|\operatorname{Re} h_j - \operatorname{Re} h_k\| > \alpha/4$ for all $j \ne k \in I$ or $\|\operatorname{Im} h_j - \operatorname{Im} h_k\| > \alpha/4$ for all $j \ne k \in I$.

We temporarily postpone the proofs of these lemmas and proceed to the solution of Bellow's question.

Theorem 14. Let $(a_j)$ be any sequence of real numbers which converges to 0 with $a_j \ne 0$ for all $j$. Then, there exists $f \in L^\infty(m)$ such that $S_nf$ does not converge a.s.$[m]$.

Proof. Let $T_jf(x) = f(x + b_j)$ for some real sequence $(b_j)$. Then, the fact that $T_j(1) = 1$ and $T_j$ are positive is obvious. By periodicity, $T_j$ is an isometry on $L^1(m)$ and $L^2(m)$. Also, $\|S_nf\|_{L^2(m)} \le \frac{1}{n}\sum \|f\|_{L^2(m)} = \|f\|_{L^2(m)}$ and $T_jS_n = S_nT_j$.

Let $w$ be an irrational number and $b_j = (j-1)w$. It then follows from the equidistribution theorem (or a special case of Birkhoff's Ergodic Theorem) that $(T_j)$ satisfies the mean ergodic condition. Thus, $(S_n)$ commutes with a Bourgain sequence, and the second entropy result can be applied to $(S_n)$.

It suffices to show that for some $\delta > 0$ we have $\sup\{N_f(\delta) : \|f\|_{L^2(m)} \le 1\} = \infty$. In fact, we will do this with $\delta = 1/40$. Let $r_0 \in \mathbb{N}$ and $r = 4r_0^2 + 2r_0$. By Lemma 12, choose integers $J_1 < \cdots < J_r$ and, for each $\bar\alpha \in \{0,1\}^r$, an integer $n(\bar\alpha)$. Define $g = 2^{-r/2}\sum_{\bar\alpha} e_{n(\bar\alpha)}$, where $e_n(x) = e^{2\pi inx}$, and for $1 \le s \le r$ set

\[
\beta_{s,\bar\alpha} = J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)}.
\]
Fix an $\bar\alpha$ and suppose $\alpha_s = 1$ and $\alpha_t = 0$. By Lemma 12, we have
\[
|\beta_{s,\bar\alpha} - \beta_{t,\bar\alpha}| \ge |\beta_{s,\bar\alpha} - 1| - |1 - \beta_{t,\bar\alpha}| > \frac{1}{2} - \frac{1}{10} = \frac{2}{5}.
\]
By symmetry, this holds so long as $\alpha_s \ne \alpha_t$. Hence, by orthogonality and the above,

\[
\|S_{J_s}g - S_{J_t}g\|_2 = 2^{-r/2}\Big(\sum_{\bar\alpha} |\beta_{s,\bar\alpha} - \beta_{t,\bar\alpha}|^2\Big)^{1/2}
\ge 2^{-r/2}\Big(\sum_{\bar\alpha : \alpha_s \ne \alpha_t} |\beta_{s,\bar\alpha} - \beta_{t,\bar\alpha}|^2\Big)^{1/2}
> 2^{-r/2}\Big((2/5)^2\sum_{\bar\alpha : \alpha_s \ne \alpha_t} 1\Big)^{1/2}
= 2^{-r/2}(2/5)2^{(r-1)/2} > 1/5.
\]

This holds for all $s \ne t$.

Apply Lemma 13 to the set $\{S_{J_1}g, \ldots, S_{J_r}g\}$ to find a subset $I \subset \{J_1, \ldots, J_r\}$, $|I| = r_0$, such that either $\|\operatorname{Re} S_{J_t}g - \operatorname{Re} S_{J_s}g\|_2 > 1/20$ or $\|\operatorname{Im} S_{J_t}g - \operatorname{Im} S_{J_s}g\|_2 > 1/20$ for all $J_s \ne J_t \in I$. If it is the first, set $f = \operatorname{Re} g$, and if it is the second, set $f = \operatorname{Im} g$. Then, $\|f\|_{L^2(m)} \le \|g\|_2 = 1$. Further, $\|S_{J_s}f - S_{J_t}f\|_{L^2(m)} > 1/20$ for all $J_s \ne J_t \in I$. As no two such $S_{J_s}f$ could be contained in the same $1/40$-ball in $L^2(m)$, we see $N_f(1/40) \ge |I| = r_0$. As $r_0$ is arbitrary, $\sup_f N_f(1/40) = \infty$.

To conclude, we need only establish Lemmas 12 and 13. First, we recall three simple results in complex arithmetic.

Claim. Let $a, b \in \mathbb{C}$, with $|a|, |b| \le 1$, and $\lambda \in \mathbb{R}$. Then,

1. $|ab - 1| \le |a - 1| + |b - 1|$,
2. $\operatorname{Re}(ab) \le |a - 1| + \operatorname{Re}(b)$,
3. $|1 - e^{2\pi i\lambda}| \le 2\pi|\lambda|$.

Proof. First, $|ab - 1| = |ab - a + a - 1| \le |ab - a| + |a - 1| \le |b - 1| + |a - 1|$. Second, $\operatorname{Re}(ab) = \operatorname{Re}(ab - b) + \operatorname{Re}(b) \le |ab - b| + \operatorname{Re}(b) \le |a - 1| + \operatorname{Re}(b)$. Third, recall $1 - \cos(2x) \le 2x^2$ for all real $x$. So, $|1 - e^{2\pi i\lambda}|^2 = (1 - \cos(2\pi\lambda))^2 + \sin^2(2\pi\lambda) = 2(1 - \cos(2\pi\lambda)) \le 4\pi^2\lambda^2$.
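The three inequalities in the claim can be spot-checked numerically on random inputs; the seed and sample count below are arbitrary.

```python
import cmath
import math
import random

# Spot-check the claim's three inequalities on random a, b in the closed
# unit disk and random real lambda.  Seed and sample count are arbitrary.
random.seed(1)

def unit_disk():
    while True:
        z = complex(random.uniform(-1, 1), random.uniform(-1, 1))
        if abs(z) <= 1:
            return z

claim_ok = True
for _ in range(10_000):
    a, b = unit_disk(), unit_disk()
    lam = random.uniform(-3, 3)
    claim_ok &= abs(a * b - 1) <= abs(a - 1) + abs(b - 1) + 1e-12
    claim_ok &= (a * b).real <= abs(a - 1) + b.real + 1e-12
    claim_ok &= abs(1 - cmath.exp(2j * math.pi * lam)) <= 2 * math.pi * abs(lam) + 1e-12
```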

Proof of Lemma 12. Fix $r \in \mathbb{N}$. If $r = 1$, then set $J_1 = 1$ and choose $n(\bar\alpha)$ accordingly. Assume $r > 1$. For each $1 \le s \le r$, we will construct integers $m_s$ simultaneously with the $J_s$.

Set $J_1 = 1$ and choose $m_1$ so that $|1 - e^{2\pi ia_1m_1}| > 3/4$. Assume $J_t$ and $m_t$ are known for all $t < s$, and set $M_s = \sum_{t < s} |m_t|$. As $a_j \to 0$, we can choose $L_s > 0$ such that $\sup_{j > L_s/100} |a_j| \le (400M_s\pi)^{-1}$. Further, we can choose $T_s > 0$ (depending on $a_j$ for $j \le J_{s-1}$) such that for each $z \in \mathbb{Z}$ there is a corresponding $t \in \mathbb{Z}$, $|t| \le T_s$, satisfying $|e^{2\pi ia_jz} - e^{2\pi ia_jt}| < 1/50r$ for all $j \le J_{s-1}$.

As $a_j \to 0$, we can choose $J_s$ such that $J_s > L_s$, $J_s > J_{s-1}$, and $J_s^{-1}\sum_{j \le J_s} |a_j| < (100T_s)^{-1}$. Also, as $a_j \ne 0$ for all $j$, note that

\[
\lim_{R\to\infty} \frac{1}{R}\Big|\int_0^R J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jx}\,dx\Big|
= \lim_{R\to\infty} \frac{1}{R}\Big|J_s^{-1}\sum_{j \le J_s} \frac{1}{2\pi ia_j}(e^{2\pi ia_jR} - 1)\Big|
\le \lim_{R\to\infty} \frac{1}{R}\,J_s^{-1}\sum_{j \le J_s} \frac{1}{\pi|a_j|} = 0.
\]

It follows there is $y_s > 0$ such that $\operatorname{Re}(J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jy_s}) < 1/10$; otherwise, this limit could not be 0. Set $z_s$ to be the integer part of $y_s$, and take $|t_s| \le T_s$ as prescribed above. Set $m_s = z_s - t_s$. Define all $J_s$ and $m_s$ in this manner.

Then, for each $1 < s \le r$, by the second and third statements of the above claim and by construction, we have

\[
\operatorname{Re}\Big(J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jm_s}\Big)
= J_s^{-1}\sum_{j \le J_s} \operatorname{Re}\big(e^{2\pi ia_jy_s}e^{2\pi ia_j(m_s - y_s)}\big)
\le J_s^{-1}\sum_{j \le J_s} \big(\operatorname{Re}(e^{2\pi ia_jy_s}) + |1 - e^{2\pi ia_j(m_s - y_s)}|\big)
\]
\[
= \operatorname{Re}\Big(J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jy_s}\Big) + J_s^{-1}\sum_{j \le J_s} |1 - e^{2\pi ia_j(z_s - y_s - t_s)}|
\le \frac{1}{10} + 2\pi J_s^{-1}\sum_{j \le J_s} |a_j|(|z_s - y_s| + |t_s|)
\le \frac{1}{10} + \frac{2\pi(T_s + 1)}{100T_s} \le \frac{1}{10} + \frac{4\pi}{100} < \frac{1}{4}.
\]
This gives

\[
J_s > L_s,
\]

\[
|1 - e^{2\pi ia_jm_s}| = |e^{2\pi ia_jz_s} - e^{2\pi ia_jt_s}| < 1/50r \quad \text{for all } j \le J_{s-1},
\]

\[
\Big|1 - J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jm_s}\Big| \ge 1 - \operatorname{Re}\Big(J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jm_s}\Big) > 3/4.
\]

But, $J_1 = 1$ and $m_1$ was chosen so the last condition is also true for $s = 1$. Further, if we rewrite the second condition, we have

\[
\sup_{j \le J_{t-1}} |1 - e^{2\pi ia_jm_t}| < 1/50r \quad \text{for all } 2 \le t \le r,
\]

\[
\Big|1 - J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jm_s}\Big| > 3/4 \quad \text{for all } 1 \le s \le r.
\]
Fix $\bar\alpha \in \{0,1\}^r$. Define $n_s = \alpha_sm_s$ and $n(\bar\alpha) = n_1 + \cdots + n_r$. Fix $1 \le s \le r$. Then, by the first statement in the claim,

\[
\Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn_s}\Big|
\le J_s^{-1}\sum_{j \le J_s} |e^{2\pi ia_jn_s}||e^{2\pi ia_j(n(\bar\alpha) - n_s)} - 1|
= J_s^{-1}\sum_{j \le J_s} \Big|\exp\Big(2\pi ia_j\sum_{t<s} n_t\Big)\exp\Big(2\pi ia_j\sum_{t>s} n_t\Big) - 1\Big|
\]
\[
\le J_s^{-1}\sum_{j \le J_s} \Big|\exp\Big(2\pi ia_j\sum_{t<s} n_t\Big) - 1\Big| + J_s^{-1}\sum_{j \le J_s} \Big|\exp\Big(2\pi ia_j\sum_{t>s} n_t\Big) - 1\Big| =: I + II.
\]

If $s = 1$, then $I = 0$. If $s > 1$, then as $J_s > L_s$,

\[
I = J_s^{-1}\sum_{j \le L_s/100} \Big|\exp\Big(2\pi ia_j\sum_{t<s} n_t\Big) - 1\Big| + J_s^{-1}\sum_{L_s/100 < j \le J_s} \Big|\exp\Big(2\pi ia_j\sum_{t<s} n_t\Big) - 1\Big|
\]

\[
\le J_s^{-1}\sum_{j \le L_s/100} 2 + J_s^{-1}\sum_{L_s/100 < j \le J_s} 2\pi|a_j|\Big|\sum_{t<s} n_t\Big|
\le \frac{1}{50} + 2M_s\pi\sup_{j > L_s/100} |a_j| < \frac{1}{50} + \frac{1}{200} = \frac{1}{40}.
\]

On the other hand, for all s, we have by applying the first statement of the claim several times

\[
II = J_s^{-1}\sum_{j \le J_s} \Big|\exp\Big(2\pi ia_j\sum_{t>s} n_t\Big) - 1\Big|
\le \sup_{j \le J_s} \Big|\exp\Big(2\pi ia_j\sum_{t>s} n_t\Big) - 1\Big|
= \sup_{j \le J_s} \Big|\prod_{t>s} e^{2\pi ia_jn_t} - 1\Big|
\]
\[
\le \sup_{j \le J_s} \sum_{t>s} |e^{2\pi ia_jn_t} - 1|
\le \sum_{t>s} \sup_{j \le J_s} |e^{2\pi ia_jn_t} - 1|
\le \sum_{t>s} \sup_{j \le J_{t-1}} |e^{2\pi ia_jm_t} - 1|
\le \sum_{t>s} \frac{1}{50r} < \frac{1}{50}.
\]

Hence, we have that for all $s$
\[
\Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn_s}\Big| < \frac{1}{40} + \frac{1}{50} < \frac{1}{20}.
\]
Now, if $\alpha_s = 0$, then $n_s = 0$ and

\[
\Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - 1\Big| < 1/20 < 1/10.
\]
If $\alpha_s = 1$, then $n_s = m_s$ and

\[
\Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - 1\Big|
\ge \Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jm_s} - 1\Big| - \Big|J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jn(\bar\alpha)} - J_s^{-1}\sum_{j \le J_s} e^{2\pi ia_jm_s}\Big|
> 3/4 - 1/20 > 1/2.
\]
This completes the proof.

Proof of Lemma 13. Suppose not. Then, there exist sets $M_1, M_2 \subset \{1, \ldots, r\}$ with $|M_1|, |M_2| \le r_0 - 1$ which are maximal with respect to the properties $\|\operatorname{Re} h_j - \operatorname{Re} h_k\| > \alpha/4$ for all $j \ne k \in M_1$ and $\|\operatorname{Im} h_j - \operatorname{Im} h_k\| > \alpha/4$ for all $j \ne k \in M_2$. Set $M = M_1 \cup M_2$. By maximality, each $j \notin M$ may be assigned a pair $(a, b) \in M_1 \times M_2$ with $\|\operatorname{Re} h_j - \operatorname{Re} h_a\| \le \alpha/4$ and $\|\operatorname{Im} h_j - \operatorname{Im} h_b\| \le \alpha/4$. But there are at least $r - 2(r_0 - 1) = 4r_0^2 + 2$ points outside $M$, while there are at most $(r_0 - 1)^2$ available pairs. Therefore, two distinct points $j, k \notin M$ must be assigned the same pair, say $(a, b)$. Namely,

\[
\|h_j - h_k\| \le \|\operatorname{Re} h_j - \operatorname{Re} h_k\| + \|\operatorname{Im} h_j - \operatorname{Im} h_k\|
\le \|\operatorname{Re} h_j - \operatorname{Re} h_a\| + \|\operatorname{Re} h_a - \operatorname{Re} h_k\| + \|\operatorname{Im} h_j - \operatorname{Im} h_b\| + \|\operatorname{Im} h_b - \operatorname{Im} h_k\| \le \alpha.
\]
This contradicts the hypothesis.
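The counting behind this pigeonhole is easy to check for every small $r_0$: after removing $M_1 \cup M_2$ (at most $2(r_0 - 1)$ points), the remaining points outnumber the at most $(r_0 - 1)^2$ available pairs. A sketch:

```python
# With r = 4 r0^2 + 2 r0, the points remaining after removing M1 and M2
# (each of size at most r0 - 1) outnumber the (r0 - 1)^2 possible pairs
# (a, b), so two remaining points must share a pair.  Checked for a range
# of r0 values.
counting_ok = all(
    (4 * r0**2 + 2 * r0) - 2 * (r0 - 1) > (r0 - 1) ** 2
    for r0 in range(1, 1000)
)
```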

8 Comments

1. The only significant difference between the proof of the first entropy result here and in [2] is the use of Theorem 8. Bourgain uses a different statement, namely that there is some $c > 0$ so that

\[
P\Big(\Big\{\omega : \mu\Big\{x : F^*(x, \omega) \ge c\int_\Omega F^*(x, \omega')\,P(d\omega')\Big\} > c\Big\}\Big) > c
\]
for all $N, J \in \mathbb{N}$ big enough and $f \in L^\infty(\mu)$, $\|f\|_{L^2(\mu)} \le 1$. Theorem 8 is used in an almost identical way to this statement in the proofs of both entropy results.

2. Aside from the above comment, the proofs of the second entropy result here and in [2] differ in only one other place, and only slightly. Bourgain states and uses a different Banach principle for $L^\infty(\mu)$. In particular, he states that if $S_nf$ converges almost surely for all $f \in L^\infty(\mu)$, then there is some function $\delta(\epsilon)$, which goes to 0 as $\epsilon \to 0$, such that $\int_X \sup_n |S_nf|\,d\mu < \delta(\epsilon)$ whenever $\|f\|_{L^\infty(\mu)} \le 1$ and $\|f\|_{L^1(\mu)} < \epsilon$. This result can be used in the same manner as Theorem 3. I have included the result from Bellow and Jones instead, because their proof is so easily understood.

3. For more on this and related topics, see work by Roger Jones [4, 5], Michael Lacey [6], and Michel Weber (with Mikhail Lifshits and Dominique Schneider) [8-19].

4. I will be happy to field any questions or concerns via e-mail. I would also appreciate being alerted to any typos. Be sure to use a descriptive subject, as I get lots of spam.

References

[1] Alexandra Bellow and Roger L. Jones, A Banach principle for $L^\infty$. Adv. Math. 120 (1996), 155–72.

[2] Jean Bourgain, Almost sure convergence and bounded entropy. Israel J. Math. 63 (1988), no. 1, 79–97.

[3] Xavier Fernique, Gaussian random vectors and their reproducing kernel Hilbert spaces. Technical Report Series of the Laboratory for Research in Statistics and Probability 34 (1985), 1–116.

[4] Roger L. Jones, A remark on singular integrals with complex homogeneity. Proc. AMS 114 (1992), 763–8.

[5] Roger L. Jones, Ergodic theory and connections with analysis and probability. New York J. Math. 3A (1997), 31–67.

[6] Michael Lacey, Bourgain's entropy criteria. Convergence in Ergodic Theory and Probability, Ohio State Mathematical Research Institute Publications 5, New York, 1996, 249–61.

[7] Michel Ledoux and Michel Talagrand, Probability in Banach Spaces. Springer-Verlag, New York, 1991.

[8] Mikhail Lifshits and Michel Weber, Oscillations of Gaussian Stein's elements. High Dimensional Probability, Progress in Probability 43, Birkhäuser, Boston, 1998, 249–61.

[9] Mikhail Lifshits and Michel Weber, Tightness of stochastic families arising from randomization procedure. Asymptotic Methods in Probability and Statistics with Applications, Birkhäuser, Boston, 2001, 143–58.

[10] Dominique Schneider and Michel Weber, Une remarque sur un théorème de Bourgain. Séminaire de Probabilités XXVII, Lecture Notes in Mathematics 1557, Springer-Verlag, New York, 1993, 202–6.

[11] Michel Weber, Opérateurs réguliers sur les espaces $L^p$. Séminaire de Probabilités XXVII, Lecture Notes in Mathematics 1557, Springer-Verlag, New York, 207–15.

[12] Michel Weber, GB and GC sets in ergodic theory. Probability in Banach Spaces 9, Progress in Probability 35, Birkhäuser, Boston, 1994, 129–51.

[13] Michel Weber, Borel matrix. Comment. Math. Univ. Car. 36 (1995), 401–15.

[14] Michel Weber, The Stein randomization procedure. Rendiconti di Matematica Serie VII 15 (1995), 569–605.

[15] Michel Weber, Coupling of the GB set property for ergodic averages. Journal of Theoretical Probability 9 (1996), 105–12.

[16] Michel Weber, Entropie Métrique et Convergence Presque Partout. Hermann, Paris, 1998.

[17] Michel Weber, Sur le caractère Gaussien de la convergence presque partout. C. R. Acad. Sci. Paris Ser. I Math. 329 (1999), 169–72.

[18] Michel Weber, Sur le caractère Gaussien de la convergence presque partout. Fund. Math. 167 (2001), 23–54.

[19] Michel Weber, Régularité ergodique de certaines classes de Donsker. J. Theory Prob. 48 (2004), 766–84.
