On Convolution Squares of Singular Measures
by
Vincent Chan
A thesis presented to the University of Waterloo in fulfillment of the thesis requirement for the degree of Master of Mathematics in Pure Mathematics
Waterloo, Ontario, Canada, 2010
© Vincent Chan 2010
I hereby declare that I am the sole author of this thesis. This is a true copy of the thesis, including any required final revisions, as accepted by my examiners.
I understand that my thesis may be made electronically available to the public.
Abstract
We prove that if 1 > α > 1/2, then there exists a probability measure μ such that the Hausdorff dimension of its support is α and μ ∗ μ is a Lipschitz function of class α − 1/2.
Acknowledgements
I would like to deeply thank my supervisor, Kathryn Hare, for her excellent supervision; her comments were useful and insightful, and led not only to finding and fixing mistakes, but also to a more elegant way of presenting various arguments. A careful comparison of earlier drafts of this document and its current version would reveal the subtle errors detected by her keen eye.
This thesis would not exist at all if it were not for Professor Thomas Körner, whom I thank for his help in clarifying points in his paper, which is the central focus in the pages to come.
I am indebted to my friends and colleagues for their support and advice, particularly that of Faisal Al-Faisal, who, being in a similar situation to myself (including the bizarre sleep schedule), has become comparable to a brother in arms.
Last but certainly not least, I am grateful for the support of my fiancée Vicki Nguyen, who, despite having no formal education in pure mathematics, suffered through reading dozens of pages while proofreading. She has never failed to provide encouragement when it was required most, and surely has kept me sane throughout the process.
Contents
1 Preliminaries
  1.1 Introduction
  1.2 A tightness condition
2 A Probabilistic Result
3 A Key Lemma
4 The Main Theorem
  4.1 α-Lipschitz functions
  4.2 Density results
APPENDICES
A Measure Theory
B Convergence of Fourier Series
C Probability Theory
D The Hausdorff Metric
Bibliography
Chapter 1
Preliminaries
Unless otherwise stated, we will be working on T = ℝ/ℤ = [−1/2, 1/2], and only with positive, Borel, regular measures. We will use λ to denote the Lebesgue measure on T (and dx, dy, dt, and so forth to denote integration with respect to Lebesgue measure), and write ∣E∣ = λ(E). Measures that are absolutely continuous or singular are taken with respect to λ in the case of T, and (left) Haar measure should we require a more general setting. When speaking of measures living in an L^p space or having other properties associated to functions, we mean the measure is absolutely continuous and we consider its Radon-Nikodym derivative to have this property.
1.1 Introduction
The absolutely continuous measures form an ideal in the algebra of measures, so that for an absolutely continuous measure μ, μ ∗ μ is still absolutely continuous and so the Radon-Nikodym theorem gives rise to an associated function f ∈ L¹ such that μ ∗ μ = fλ. For any σ-finite measure μ, Lebesgue's Decomposition theorem decomposes μ as μ = μ_ac + μ_s, where μ_ac is absolutely continuous and μ_s is singular. It is thus natural to ask whether a similar situation holds with singular measures; that is, given a singular measure μ, does there exist a function f ∈ L¹ with μ ∗ μ = fλ? This is easily seen to be false, for if μ = Σ_n a_n δ_{x_n} is a discrete measure, then μ ∗ μ = Σ_{n,m} a_n a_m δ_{x_n+x_m} is also a discrete measure.
To progress further, it may be helpful to appeal to an extension of Lebesgue's Decomposition theorem, which gives for any σ-finite measure μ three unique measures μ_ac, μ_cs and μ_d which are absolutely continuous, continuous singular, and discrete, respectively, such that μ = μ_ac + μ_cs + μ_d. Then we ask if we might recover the nice property of μ ∗ μ being absolutely continuous in the case of μ being a continuous singular measure. This is still not always the case, as there exist continuous singular measures on T whose nth convolution power is singular for every n; for such an example we will need a preliminary definition.
Definition 1.1. Let −1/2 ≤ a_k ≤ 1/2 and define the trigonometric polynomials on T

P_N(t) = ∏_{k=1}^N (1 + 2a_k cos(2π·3^k t)).
It is a standard result that the P_N λ are probability measures which converge weak-∗ in M(T) to a probability measure μ with

μ̂(n) = 1 if n = 0,
       ∏_{j=1}^M a_{k_j} if n = Σ_{j=1}^M ε_j 3^{k_j}, ε_j = ±1,
       0 otherwise.

This measure μ is called the Riesz product measure associated to {a_k}.
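The coefficient formula above can be probed numerically. The following sketch (my own illustration, not part of the thesis) builds a partial Riesz product with the constant sequence a_k = c = 1/2 and checks, via the FFT, that its Fourier coefficients vanish except at frequencies of the form Σ ε_j 3^{k_j}:

```python
import numpy as np

# Numerical probe (illustration only): the partial Riesz product
# P_N(t) = prod_{k=1}^N (1 + 2*c*cos(2*pi*3^k*t)) has integral 1, and its
# Fourier coefficients follow the product rule: nonzero only at
# frequencies n = sum_j eps_j * 3^{k_j} with eps_j = +/-1.
c = 0.5
N = 4
M = 4 * 3 ** (N + 1)      # sample rate, well above the top frequency 3+9+27+81
t = np.arange(M) / M
P = np.ones(M)
for k in range(1, N + 1):
    P *= 1 + 2 * c * np.cos(2 * np.pi * 3**k * t)

coeffs = np.fft.rfft(P) / M           # approximates hat{P_N}(n) for n >= 0
assert abs(coeffs[0] - 1) < 1e-9      # hat{P_N}(0) = 1: a probability density
assert abs(coeffs[3] - c) < 1e-9      # n = 3 = 3^1, coefficient c
assert abs(coeffs[6] - c * c) < 1e-9  # n = 6 = 3^2 - 3^1, coefficient c^2
assert abs(coeffs[1]) < 1e-9          # n = 1 is not of the required form
```

Letting N grow, the partial products converge weak-∗ to the Riesz product measure; the singularity criterion for the limit is discussed in what follows.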
Lemma 1.2. If μ₁ and μ₂ are Riesz product measures, then so is μ₁ ∗ μ₂. Moreover, if μ_i is associated to {a_{k,i}}, then μ₁ ∗ μ₂ is associated to {a_{k,1}a_{k,2}}.
Proof. We calculate the Fourier coefficients of the convolution:

(μ₁ ∗ μ₂)^∧(n) = μ̂₁(n)μ̂₂(n) = 1 if n = 0,
       ∏_{j=1}^M a_{k_j,1}a_{k_j,2} if n = Σ_{j=1}^M ε_j 3^{k_j}, ε_j = ±1,    (1.1)
       0 otherwise.
Then we can define the trigonometric polynomials on T

P_N(t) = ∏_{k=1}^N (1 + 2a_{k,1}a_{k,2} cos(2π·3^k t)),

noticing that −1/2 ≤ −1/4 ≤ a_{k,1}a_{k,2} ≤ 1/4 ≤ 1/2. The weak-∗ limit of the P_N λ produces a Riesz product measure whose coefficients satisfy (1.1), so we are finished by uniqueness.
A standard result (see, for example, [6] Theorem 7.2.2) gives that a Riesz product measure is singular if and only if Σ_{k=1}^∞ a_k² = ∞. In particular, suppose μ is a Riesz product measure associated to the constant sequence {c} for some constant 0 < c ≤ 1/2. Certainly Σ_{k=1}^∞ c² = ∞, so that μ is singular. Inductively applying Lemma 1.2, μⁿ = μ ∗ μ ∗ ⋯ ∗ μ (the nth convolution power of μ) is a Riesz product measure associated to {cⁿ}. Then Σ_{k=1}^∞ c^{2n} = ∞, and so μⁿ remains singular. It is also not hard to show that Riesz product measures are continuous, in general.
It should be remarked that the choice of frequencies 3^k is not essential, and Riesz product measures may be generalized further. In addition, the concept of Riesz product measures as a whole, and the conclusion that there exist continuous singular measures whose convolution powers always remain singular, can be extended to any infinite, compact, abelian group (see [6]).
We now know it is impossible to hope that the convolution square of a singular measure is absolutely continuous in general, even when restricting to the case of continuous singular measures. However, we may expect that there are still some positive results; convolution behaves as a smoothing operation, evidenced by the fact that both the continuous measures and the absolutely continuous measures form ideals in the space of measures. Indeed, a famous result due to Wiener and Wintner [16] produces a singular measure μ on T such that μ̂(n) = o(∣n∣^{−1/2+ε}) as n → ∞ for every ε > 0. By an application of the Hausdorff-Young inequality, such a measure can be shown to have the property that its convolution square μ ∗ μ is absolutely continuous, and in fact, μ ∗ μ ∈ L^p(T) for every p ≥ 1. Other authors have generalized this result; Hewitt and Zuckerman [9] constructed a singular measure μ on any non-discrete, locally compact, abelian group G such that μ ∗ μ ∈ L^p(G) for all p ≥ 1. It is worth noting that the condition of non-discreteness cannot be dropped, since otherwise all measures would be absolutely continuous. Using Rademacher-Riesz products, Karanikas and Koumandos [10] prove the existence of a singular measure μ with μ ∗ μ ∈ L¹(G) for any non-discrete, locally compact group G. Shortly thereafter, Dooley and Gupta [3] used the theory of compact Lie groups to produce a measure μ with μ ∗ μ ∈ L^p(G) for every p ≥ 1 in the case of compact, connected groups and compact Lie groups.
Saeki [15] took this concept further in a different direction by proving the existence of a singular measure μ on T with support having zero Lebesgue measure such that μ ∗ μ has a uniformly convergent Fourier series. This is an improvement on previous results; in this case the continuity of the partial Fourier sums for μ ∗ μ ensures that their limit, namely μ ∗ μ, is continuous. Then μ ∗ μ ∈ L^∞(T) and subsequently μ ∗ μ ∈ L^p(T) for p ≥ 1 since T is compact. Generalizing this, Gupta and Hare [7] show such a measure exists (now using Haar measure in place of Lebesgue measure) when replacing T with any compact, connected group.
Körner [11] recently expanded on Saeki's work; to discuss how, we need the following definitions.
Definition 1.3. For 1 ≥ α ≥ 0, we say a function f : T → ℂ is Lipschitz of class α¹, or simply α-Lipschitz, and write f ∈ Λ_α if

sup_{t∈T} sup_{ℎ≠0} ∣ℎ∣^{−α} ∣f(t + ℎ) − f(t)∣ < ∞. (1.2)
Some references define a function to be Lipschitz of class α if there is a constant C with ∣f(x) − f(y)∣ ≤ C∣x − y∣^α for every x, y ∈ T. These definitions are in fact equivalent, with a straightforward proof.

Lemma 1.4. f ∈ Λ_α if and only if ∣f(x) − f(y)∣ ≤ C∣x − y∣^α for some constant C, for every x, y ∈ T.
Proof. This is a simple matter of relabeling; use t = y and ℎ = x − y.
¹Functions with this property are also sometimes referred to as Hölder of class α.
It is useful to note that the Λ_α classes are nested downwards:
Lemma 1.5. Suppose 1 ≥ α_i ≥ 0 for i = 1, 2. If α₁ ≤ α₂, then Λ_{α₁} ⊇ Λ_{α₂}.

Proof. Since 1 ≥ α_i ≥ 0, for ℎ ∈ T = [−1/2, 1/2] we get that ∣ℎ∣^{−α₁} ≤ ∣ℎ∣^{−α₂}. Thus if

sup_{t∈T} sup_{ℎ≠0} ∣ℎ∣^{−α₂} ∣f(t + ℎ) − f(t)∣ < ∞,

we certainly have

sup_{t∈T} sup_{ℎ≠0} ∣ℎ∣^{−α₁} ∣f(t + ℎ) − f(t)∣ < ∞,

implying Λ_{α₁} ⊇ Λ_{α₂}.
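As a quick numerical illustration of these classes (my sketch, not from the thesis): f(t) = √t on [0, 1] lies in Λ_{1/2}, since ∣√a − √b∣ ≤ √∣a − b∣, but it belongs to no class Λ_α with α > 1/2, because at t = 0 the ratio ∣ℎ∣^{−α}∣f(ℎ) − f(0)∣ = ℎ^{1/2−α} blows up as ℎ → 0.

```python
import numpy as np

# f(t) = sqrt(t): the Lambda_{1/2} ratio stays bounded, while the ratio
# for alpha = 0.9 (evaluated at t = 0) diverges as h -> 0.
t = np.linspace(0.0, 1.0, 1001)
hs = [10.0 ** -k for k in range(1, 7)]
half_ratios = [
    np.max(np.abs(np.sqrt(np.minimum(t + h, 1.0)) - np.sqrt(t))) / h ** 0.5
    for h in hs
]
bad_ratios = [h ** 0.5 / h ** 0.9 for h in hs]   # ratio at t = 0, alpha = 0.9
assert max(half_ratios) <= 1.0 + 1e-12  # bounded: f is in Lambda_{1/2}
assert bad_ratios[-1] > 100             # h^{-0.4} with h = 1e-6 exceeds 100
```

This matches Lemma 1.5: √t sits in Λ_α for every α ≤ 1/2, and in no higher class.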
We will also require the notion of Hausdorff dimension, which we recall here:
Definition 1.6. Fix α ≥ 0 and let δ > 0. For a set E, define

ℋ_δ^α(E) = inf { Σ_{i=1}^∞ ∣E_i∣^α : ∪_{i=1}^∞ E_i ⊇ E, ∣E_i∣ ≤ δ },

where the E_i can be taken to be intervals. Notice ℋ_δ^α(E) is monotone increasing as δ decreases, since fewer permissible collections of sets are taken in the infimum. Then we take

ℋ^α(E) = lim_{δ→0⁺} ℋ_δ^α(E) = sup_{δ>0} ℋ_δ^α(E),

and we call ℋ^α(E) the α-Hausdorff measure of E; it is standard that this is indeed a measure (see [4], for example). Of special interest is the case α = 1, in which case we recover the usual Lebesgue measure. It is easy to show that there is a critical value for α such that ℋ^α(E) = ∞ for α less than this critical value, and ℋ^α(E) = 0 for α greater than this critical value. Then we define the Hausdorff dimension of E by

dim_H(E) = sup{α : ℋ^α(E) > 0} = sup{α : ℋ^α(E) = ∞} = inf{α : ℋ^α(E) < ∞} = inf{α : ℋ^α(E) = 0}.
Heuristically, the Hausdorff dimension allows us to compare sets that are too sparse for the Lebesgue measure to be of use.
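For intuition, the following numerical sketch (mine, not the thesis's) estimates the dimension of the middle-thirds Cantor set, whose Hausdorff dimension is log 2/log 3 ≈ 0.631, by box counting; for this self-similar set the box-counting and Hausdorff dimensions agree:

```python
import numpy as np

# Estimate the dimension of the middle-thirds Cantor set by counting the
# side-eps boxes its points occupy at triadic scales eps = 3^{-k}.
def cantor_points(depth):
    """Left endpoints of the 2^depth stage-`depth` intervals."""
    pts = np.array([0.0])
    for _ in range(depth):
        pts = np.concatenate([pts / 3, pts / 3 + 2 / 3])
    return pts

def box_count(pts, eps):
    """Number of side-eps boxes meeting the point set (the small nudge
    guards against floating-point ties at box boundaries)."""
    return len(np.unique(np.floor(pts / eps + 1e-9).astype(np.int64)))

pts = cantor_points(12)
scales = [3.0 ** -k for k in range(2, 10)]
counts = [box_count(pts, eps) for eps in scales]
# The slope of log N(eps) against log(1/eps) estimates the dimension.
slope = np.polyfit([np.log(1 / eps) for eps in scales], np.log(counts), 1)[0]
assert abs(slope - np.log(2) / np.log(3)) < 1e-6
```

At scale 3^{−k} exactly 2^k boxes are occupied, so the log-log data is exactly linear with slope log 2/log 3.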
5 For our results, we will need to have a way of describing where a measure “lives”; we make this formal by introducing the concept of the support of a measure.
Definition 1.7. Let (X, T) be a topological space, and μ be a Borel measure on (X, T). Then the support of μ is defined to be the set of all points x in X for which every open neighborhood N_x of x has positive measure:

supp μ = {x ∈ X : μ(N_x) > 0 for every open N_x ∈ T with x ∈ N_x}. (1.3)
Among other things, Körner demonstrated the existence of a probability measure μ whose support has a prescribed Hausdorff dimension (between 1/2 and 1) such that μ ∗ μ is a Lipschitz function. Our goal will be to prove one of his main results, which is the following:
Theorem 4.19. If 1 > α > 1/2, then there exists a probability measure μ such that dim_H(supp μ) = α and μ ∗ μ = fλ where f ∈ Λ_{α−1/2}.
We will show that this theorem is indeed an extension of Saeki's work. Using Lemma 1.4, we can see that f ∈ Λ_α yields a constant C such that

∣f(x + ℎ) − f(x)∣ ≤ C∣ℎ∣^α

for any x and ℎ, implying f(x + ℎ) − f(x) = o((ln ∣ℎ∣^{−1})^{−1}) (see Appendix B.1). By the Dini-Lipschitz test (see Appendix B.2), f has a uniformly convergent Fourier series. This says that the measure Körner produced has the same property as Saeki's, provided it were also singular. This fact will hold due to the Hausdorff dimension condition; indeed, the fact that dim_H(supp μ) < 1 implies that ∣supp μ∣ = 0 by definition of the dimension and recalling ℋ¹ is simply Lebesgue measure. Since we always have μ((supp μ)^c) = 0, this provides the necessary decomposition for μ to be singular.
Consider now the condition 1 > α > 1/2. As we have seen, the upper bound ensures the constructed measure will be singular, since sets of Hausdorff dimension less than 1 have zero Lebesgue measure; however, the converse is not true. Indeed, it is well known that there exists a set of Hausdorff dimension 1 with zero Lebesgue measure, so the case α = 1 could still give rise to singular measures. Since we are working on T, which has Hausdorff dimension 1, monotonicity of Hausdorff dimension (easily seen from monotonicity of the Hausdorff measure) ensures we need not consider the case α > 1. We shall see soon that we also need α ≥ 1/2 for a positive result involving this Lipschitz condition, after proving a tightness condition.
1.2 A tightness condition
In this section, we aim to show that the result in Theorem 4.19 is nearly the best we can hope for. In particular, we will show that if the support of a measure μ has Hausdorff dimension γ and μ ∗ μ ∈ Λ_α, then γ − 1/2 ≥ α. To begin, we mention a relationship between the Hausdorff dimension of the support of a measure and its Fourier coefficients.
Definition 1.8. We define the s-energy of a finite, compactly supported measure μ on ℝⁿ, denoted I_s(μ), to be

I_s(μ) = ∬ dμ(x) dμ(y) / ∣x − y∣^s.
Lemma 1.9. Suppose μ is a non-zero measure, 0 < β < 1, and

Σ_{k≠0} ∣μ̂(k)∣² / ∣k∣^{1−β} < ∞.

Then dim_H(supp μ) ≥ β.
Proof. By Theorem 2.2 in [8], with d = 1 and 0 < β < 1, we have for some constant b that

I_β(μ) ≤ b∣μ̂(0)∣² + b Σ_{k≠0} ∣μ̂(k)∣² / ∣k∣^{1−β} < ∞.

Then by Theorem 4.13 in [5], dim_H(supp μ) ≥ β.
Now, we develop a bound for portions of the ℓ^p norm of the Fourier coefficients of an α-Lipschitz function.

Lemma 1.10. If f ∈ Λ_α and 0 < p ≤ 2, then

Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣^p ≤ C₁ n^{1−p(α+1/2)},

for some constant C₁.
Proof. Fix ℎ ∈ [−1/2, 1/2]. Let g(t) = f(t + ℎ) − f(t − ℎ). A simple calculation shows that the Fourier coefficients of g are given by

ĝ(k) = 2i sin(2πkℎ) f̂(k).

By Parseval's identity, Σ_k ∣ĝ(k)∣² = ∥g∥₂², so that

4 Σ_k ∣f̂(k)∣² sin²(2πkℎ) = ∫ ∣f(t + ℎ) − f(t − ℎ)∣² dt.
Since f ∈ Λ_α, there is a constant C such that

∣f(t + ℎ) − f(t − ℎ)∣ ≤ C∣(t + ℎ) − (t − ℎ)∣^α = C∣2ℎ∣^α,

for all t ∈ T, so

4 Σ_k ∣f̂(k)∣² sin²(2πkℎ) ≤ ∫ (C∣2ℎ∣^α)² dt = 4^α C²∣ℎ∣^{2α} ≤ 4C²∣ℎ∣^{2α}.

Thus for any n,

Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣² sin²(2πkℎ) ≤ C²∣ℎ∣^{2α}.
As this holds for any ℎ ∈ [−1/2, 1/2], consider ℎ = 1/(8n). Notice sin²(2πkℎ) = sin²(πk/(4n)) ≥ 1/2 for n ≤ ∣k∣ ≤ 2n − 1, so

Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣² ≤ 2C²∣ℎ∣^{2α} = 2C²(1/8)^{2α} n^{−2α}. (1.4)
If p = 2, we are done; taking C₁ = 2C²(1/8)^{2α}, (1.4) gives

Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣² ≤ C₁ n^{−2α} = C₁ n^{1−2(α+1/2)}.
Otherwise, 0 < p < 2 and so 1 < 2/p, 1/(1 − p/2) < ∞; notice they are conjugate indices. By Hölder's inequality with counting measure on {∣f̂(k)∣^p}_{n≤∣k∣≤2n−1} and {1}_{n≤∣k∣≤2n−1}, we have

Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣^p ≤ (Σ_{n≤∣k∣≤2n−1} (∣f̂(k)∣^p)^{2/p})^{p/2} (Σ_{n≤∣k∣≤2n−1} 1^{1/(1−p/2)})^{1−p/2}
  = (Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣²)^{p/2} (2n)^{1−p/2}
  ≤ (2C²(1/8)^{2α} n^{−2α})^{p/2} (2n)^{1−p/2}    by (1.4)
  = C₁ n^{1−p(α+1/2)},

as required, where C₁ = 2(C/8^α)^p is a constant.
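The dyadic block bound of Lemma 1.10 can be observed numerically. In the sketch below (my own, not thesis code) we take p = 1 and f(t) = ∣sin(2πt)∣, which is Lipschitz of class α = 1, so the block sums should be O(n^{1−(α+1/2)}) = O(n^{−1/2}); the constant 1.0 is an ad hoc choice that happens to work for this f and is not the C₁ of the lemma.

```python
import numpy as np

# Block sums of |f^(k)| over n <= |k| <= 2n-1 for f(t) = |sin(2*pi*t)|,
# approximated via the FFT; they should stay below a multiple of n^{-1/2}.
M = 1 << 14
t = np.arange(M) / M
fhat = np.fft.fft(np.abs(np.sin(2 * np.pi * t))) / M  # f^(k), k = 0..M-1

def block_sum(n):
    ks = np.arange(n, 2 * n)              # n <= k <= 2n - 1
    return 2 * np.sum(np.abs(fhat[ks]))   # factor 2 counts negative k

for n in (8, 16, 32, 64):
    assert block_sum(n) <= 1.0 * n ** -0.5
```

In fact the coefficients of this f decay like k^{−2}, so the block sums decay like n^{−1}, comfortably inside the lemma's n^{−1/2} envelope.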
Lemma 1.11. If μ is a measure with dim_H(supp μ) = γ and μ ∗ μ = fλ where f ∈ Λ_α, then γ − 1/2 ≥ α.
Proof. Since f ∈ Λ_α, taking p = 1 in Lemma 1.10 yields

Σ_{n≤∣k∣≤2n−1} ∣f̂(k)∣ ≤ C₁ n^{1−(α+1/2)} = C₁ n^{(1−2α)/2},

for some constant C₁ depending on f. Since ∣f̂(k)∣ = ∣μ̂(k)∣², we have

Σ_{n≤∣k∣≤2n−1} ∣μ̂(k)∣² ≤ C₁ n^{(1−2α)/2},

and so if β > 0, we have

Σ_{n≤∣k∣≤2n−1} ∣μ̂(k)∣²/∣k∣^{1−β} ≤ Σ_{n≤∣k∣≤2n−1} ∣μ̂(k)∣²/n^{1−β} ≤ C₁ n^{(1−2α)/2}/n^{1−β} = C₁ n^{−(1+2α−2β)/2}

for all n ≥ 1. In particular, taking n = 2^j for each j ≥ 0,

Σ_{2^j≤∣k∣≤2^{j+1}−1} ∣μ̂(k)∣²/∣k∣^{1−β} ≤ C₁ (2^j)^{−(1+2α−2β)/2},

so that

Σ_{k≠0} ∣μ̂(k)∣²/∣k∣^{1−β} = Σ_{j=0}^∞ (Σ_{2^j≤∣k∣≤2^{j+1}−1} ∣μ̂(k)∣²/∣k∣^{1−β}) ≤ C₁ Σ_{j=0}^∞ (2^j)^{−(1+2α−2β)/2}.

Notice this converges whenever −(1+2α−2β)/2 < 0, that is, whenever (1+2α)/2 > β. Then by Lemma 1.9, dim_H(supp μ) ≥ β for every β that satisfies (1+2α)/2 > β. Thus dim_H(supp μ) ≥ (1+2α)/2, that is, γ − 1/2 ≥ α as required.
Lemma 1.11 now gives us a motivation for the lower bound on α; taking α = 0 we see that for μ ∗ μ ∈ Λ₀, we must have dim_H(supp μ) ≥ 1/2. Of course, this still leaves open the question of what happens in the case α = 1/2; the boundary cases α = 1 and α = 1/2 are beyond the scope of this paper, as the proof relies on the strict inequalities.
In Chapter 2, we prove a key lemma that provides us with a discrete measure satisfying useful boundedness properties which will help in our construction. The key to this lemma will be a probabilistic argument, which will span most of the chapter.

In Chapter 3, we utilize the key lemma from the previous chapter to construct an infinitely differentiable periodic function satisfying several key properties to be used in the main theorem.

In Chapter 4, the main theorem will be proved, first by developing some complete metric spaces and applying the key lemma from the previous chapter in a density argument. Employing the Baire Category theorem will bridge the final gap towards the proof of the main theorem.
Chapter 2
A Probabilistic Result
The key result that we will need is not easy to prove directly, so we will prove a probabilistic variant of it instead, which will imply the result we need. We proceed by proving the following result:
Lemma 2.1. Suppose that 0 < Np ≤ 1 and m ≥ 2. Then if Y₁, Y₂, ..., Y_N are independent random variables with

P(Y_j = 1) = p, P(Y_j = 0) = 1 − p,

it follows that

P(Σ_{j=1}^N Y_j ≥ m) ≤ 2(Np)^m / m!.
Proof. First note that Y_j ∈ {0, 1} almost surely, so that their sum almost surely cannot exceed N. That is, we may assume without loss of generality that m ≤ N. For 1 ≤ k ≤ N, define

u_k = C(N, k) p^k = (N!/(k!(N−k)!)) p^k.

Since 0 < Np ≤ 1 and k, p > 0, we have (N − k)p = Np − kp ≤ 1. Then as k ≥ 1,

u_{k+1}/u_k = (N − k)p/(k + 1) ≤ 1/(k + 1) ≤ 1/2.

Since Y_j takes on only 0 or 1 almost surely,

P(Σ_{j=1}^N Y_j ≥ m) = Σ_{k=m}^N P(Σ_{j=1}^N Y_j = k)
  = Σ_{k=m}^N C(N, k) p^k (1 − p)^{N−k}    since the Y_j are independent
  ≤ Σ_{k=m}^N C(N, k) p^k = Σ_{k=m}^N u_k
  ≤ Σ_{k=m}^N u_m / 2^{k−m} ≤ 2u_m    since u_{k+1}/u_k ≤ 1/2
  = 2 (N!/(m!(N−m)!)) p^m ≤ 2(Np)^m / m!.
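The bound of Lemma 2.1 can be verified against the exact binomial tail; the following sketch (my illustration, not thesis code) does so at the boundary case Np = 1:

```python
import math

# For independent Y_j with P(Y_j = 1) = p, P(Y_j = 0) = 1 - p and Np <= 1,
# the exact tail P(sum Y_j >= m) never exceeds 2 (Np)^m / m!.
def binomial_tail(N, p, m):
    """Exact P(sum of N independent Bernoulli(p) variables >= m)."""
    return sum(math.comb(N, k) * p**k * (1 - p) ** (N - k)
               for k in range(m, N + 1))

for N in (10, 50, 200):
    p = 1.0 / N                      # boundary case Np = 1
    for m in (2, 3, 5):
        assert binomial_tail(N, p, m) <= 2 * (N * p) ** m / math.factorial(m)
```

For large N with Np = 1 the tail approaches the Poisson(1) tail, which the factorial bound dominates with room to spare.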
Definition 2.2. For a set A ⊆ ℝ and a random variable X, we define the random variable χ_X(A) by

χ_X(A)(x) = 1 if X(x) ∈ A,
            0 otherwise.
Lemma 2.3. If 1 > α > 0 and ε > 0, there exists an M(α, ε) ≥ 1 with the following property. Suppose that n ≥ 2, n^α ≥ N for sufficiently large n, and X₁, X₂, ..., X_N are independent random variables each uniformly distributed on

Γ_n = {r/n : 1 ≤ r ≤ n}.

Then with probability at least 1 − ε/n,

Σ_{j=1}^N χ_{X_j}({r/n}) ≤ M(α, ε)

for all 1 ≤ r ≤ n.
Proof. Since 1 − α > 0, there exists an integer M(α, ε) ≥ 2 such that

n^{2−M(α,ε)(1−α)} < ε/2 (2.1)

for all n ≥ 2. Fix r and set Y_j = χ_{X_j}({r/n}). Note that the Y_j are independent random variables, as the X_j are. Since the X_j are uniformly distributed on Γ_n, we get

P(Y_j = 1) = 1/n,
P(Y_j = 0) = 1 − 1/n.

Since 1 > α > 0 and n^α ≥ N, we have n ≥ n^α ≥ N, yielding Nn^{−1} ≤ 1. Clearly 0 < Nn^{−1}. Then with p = n^{−1} and m = M(α, ε) ≥ 2, Lemma 2.1 tells us that

P(Σ_{j=1}^N χ_{X_j}({r/n}) ≥ M(α, ε)) = P(Σ_{j=1}^N Y_j ≥ M(α, ε)) ≤ 2(Nn^{−1})^{M(α,ε)} / M(α, ε)!
  ≤ 2n^{αM(α,ε)} n^{−M(α,ε)} / M(α, ε)! ≤ 2n^{−M(α,ε)(1−α)} < ε/n²,

where we used n^α ≥ N in the second inequality, and the final inequality follows from (2.1). Allowing r to vary, we get immediately

P(Σ_{j=1}^N χ_{X_j}({r/n}) ≥ M(α, ε) for some 1 ≤ r ≤ n) ≤ Σ_{r=1}^n P(Σ_{j=1}^N χ_{X_j}({r/n}) ≥ M(α, ε)) < n · ε/n² = ε/n.
Definition 2.4. Suppose P is a probability measure, X is a random variable with respect to the σ-algebra ℱ, and G ⊆ ℱ is a sub-σ-algebra. If Y is a random variable measurable with respect to the σ-algebra G such that for all A ∈ G,

∫_A X dP = ∫_A Y dP,

then we call Y the conditional expectation of X with respect to G, and denote it by E(X ∣ G). It is a consequence of the Radon-Nikodym Theorem that such a random variable always exists, and is unique almost surely. If X₁ is some random variable (with respect to the Borel σ-algebra ℬ), we can define

E(X ∣ X₁) := E(X ∣ σ(X₁)).

Recall σ(X₁) = {X₁^{−1}(B) : B ∈ ℬ} is the smallest σ-algebra such that X₁ is a random variable. More generally, if X₁, ..., X_n are random variables, we define

E(X ∣ X₁, ..., X_n) := E(X ∣ σ(X₁, ..., X_n)).
We will require two simple facts about conditional expectation.
Proposition 2.5. (i) If X is G-measurable, then E(X ∣ G) = X.
(ii) For any X, E(E(X ∣ G)) = E(X).
Proof. Both parts are immediate from the definition.
Lemma 2.6. Let δ > 0 and for j = 0, 1, ..., N, let W_j be a random variable that is measurable with respect to σ(X₁, ..., X_j). Define Y_j = W_j − W_{j−1} and suppose that

E(e^{sY_j} ∣ X₀, X₁, ..., X_{j−1}) ≤ e^{a_j s²/2}

for all ∣s∣ < δ and some a_j ≥ 0. Suppose further that A ≥ Σ_{j=1}^N a_j. Then provided that 0 ≤ x < Aδ, we have

P(∣W_N − W₀∣ ≥ x) ≤ 2 exp(−x²/(2A)).
Proof. Suppose −δ < s < δ. By definition of the W_j, we have e^{s(W_N−W₀)} = e^{s(W_N−W_{N−1})} e^{s(W_{N−1}−W₀)}. Since W_{N−1} and W₀ are σ(X₁, ..., X_{N−1})-measurable,

E(e^{s(W_N−W₀)} ∣ X₀, X₁, ..., X_{N−1}) = e^{s(W_{N−1}−W₀)} E(e^{s(W_N−W_{N−1})} ∣ X₀, X₁, ..., X_{N−1})
  = e^{s(W_{N−1}−W₀)} E(e^{sY_N} ∣ X₀, X₁, ..., X_{N−1})
  ≤ e^{s(W_{N−1}−W₀)} e^{a_N s²/2}.
By the properties of conditional expectation, taking expectations of both sides gives

E(e^{s(W_N−W₀)}) ≤ E(e^{s(W_{N−1}−W₀)}) e^{a_N s²/2}.

By induction, we get

E(e^{s(W_N−W₀)}) ≤ ∏_{j=1}^N e^{a_j s²/2} ≤ e^{As²/2},

since A ≥ Σ_{j=1}^N a_j. Then by Markov's inequality (see Appendix C.1),

P(W_N − W₀ ≥ x) = P(e^{s(W_N−W₀)} ≥ e^{sx}) ≤ E(e^{s(W_N−W₀)}) e^{−sx} ≤ e^{As²/2} e^{−sx}.

Setting s = xA^{−1}; 0 ≤ x < Aδ ensures that 0 ≤ xA^{−1} < δ, so that ∣s∣ < δ. Then by the above,

P(W_N − W₀ ≥ x) ≤ e^{As²/2} e^{−sx} = e^{Ax²A^{−2}/2} e^{−xA^{−1}x} = exp(−x²/(2A)).

The same argument applies to the sequence −W_j, so that we also have

P(W₀ − W_N ≥ x) ≤ exp(−x²/(2A)).

The result follows.
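Lemma 2.6 is a subgaussian concentration bound, and its conclusion can be sanity-checked on the simplest case (my sketch, not thesis code): a simple random walk. Increments Y_j = ±1 satisfy E(e^{sY_j}) = cosh s ≤ e^{s²/2}, so a_j = 1, A = N, and the lemma (with δ arbitrary) predicts P(∣W_N∣ ≥ x) ≤ 2 exp(−x²/(2N)).

```python
import math

# Exact two-sided tail of a simple random walk versus the subgaussian
# bound 2*exp(-x^2 / (2N)) predicted by the lemma.
def walk_tail(N, x):
    """Exact P(|W_N| >= x) for W_N a sum of N independent +/-1 steps."""
    total = 0.0
    for heads in range(N + 1):
        if abs(2 * heads - N) >= x:
            total += math.comb(N, heads) * 0.5 ** N
    return total

N = 100
for x in (10, 20, 30):
    assert walk_tail(N, x) <= 2 * math.exp(-x ** 2 / (2 * N))
```

This is exactly Hoeffding's inequality for bounded increments; the lemma generalizes it to the dependent, conditionally subgaussian increments used in the next proof.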
Lemma 2.7. Suppose φ : ℕ → ℝ is a sequence with φ(n)(log n)^{1/2} → ∞ as n → ∞ and, for any γ > 0, φ(n)n^{−γ} → 0 as n → ∞. Suppose 1 > α > 1/2, and the positive integer N = N(n) satisfies n^α ≥ N ≥ n^{1/2+η} for some η > 0, for all sufficiently large n. If ε > 0, there exists an M(α) and an n₀(φ, η, ε) ≥ 1 with the following property. Suppose that n ≥ n₀(φ, η, ε), and n is odd. Suppose further that X₁, X₂, ..., X_N are independent random variables each uniformly distributed on

Γ_n = {r/n ∈ T : 1 ≤ r ≤ n}.

Then, if we write μ = N^{−1} Σ_{j=1}^N δ_{X_j}, we have

∣μ ∗ μ({k/n}) − n^{−1}∣ ≤ εφ(n)(log n)^{1/2}/(Nn^{1/2}),

and

μ({k/n}) ≤ M(α)/N

for all 1 ≤ k ≤ n, with probability at least 1/2.
Proof. Set M(α) = M(α, 1/4) coming from Lemma 2.3. Without loss of generality, assume M(α) ≥ 3. Fix r for now, and define Y₁, Y₂, ..., Y_N as follows. If Σ_{v=1}^{j−1} χ_{X_v}({u/n}) ≤ M(α) for all u with 1 ≤ u ≤ n, set

Y_j = −(2j − 1)/n + χ_{2X_j}({r/n}) + 2 Σ_{v=1}^{j−1} χ_{X_v+X_j}({r/n}).

Otherwise, set Y_j = 0. Take X₀ = W₀ = 0 and W_j = Σ_{m=1}^j Y_m; notice Y_j = W_j − W_{j−1}. Suppose first that Σ_{v=1}^{j−1} χ_{X_v}({u/n}) ≤ M(α) for all u with 1 ≤ u ≤ n. Observe that if s is a fixed integer, then X_j + s/n and 2X_j are uniformly distributed over Γ_n (using addition modulo 1), since n is odd and the X_j are independent. Then by uniform distribution,

E(Y_j) = −(2j − 1)/n + E(χ_{2X_j}({r/n})) + 2 Σ_{v=1}^{j−1} E(χ_{X_v+X_j}({r/n}))
  = −(2j − 1)/n + 1/n + 2 Σ_{v=1}^{j−1} 1/n = −(2j − 1)/n + 1/n + (2j − 2)/n = 0.

On the other hand, if it is not the case that Σ_{v=1}^{j−1} χ_{X_v}({u/n}) ≤ M(α) for all u with 1 ≤ u ≤ n, then Y_j = 0 by construction, so that E(Y_j) = 0 automatically.

We wish to make use of Lemma 2.6, and so we bound E(e^{sY_j} ∣ X₀, X₁, ..., X_{j−1}). Notice as n^α ≥ N and α < 1, we have n > n^α ≥ N. Then (2N − 1)/n ≤ 2N/n ≤ 2. Suppose first that Σ_{v=1}^{j−1} χ_{X_v}({u/n}) ≤ M(α) for all u with 1 ≤ u ≤ n. Then

∣Y_j∣ ≤ (2j − 1)/n + χ_{2X_j}({r/n}) + 2 Σ_{v=1}^{j−1} χ_{X_v+X_j}({r/n}) ≤ (2N − 1)/n + 1 + 2M(α) ≤ 2 + 1 + 2M(α) ≤ 3M(α),

where we have used the fact that M(α) ≥ 3. Setting Z_j = Y_j + (2j − 1)/n, we notice that Z_j ≠ 0 only if χ_{X_v+X_j}({r/n}) ≠ 0 for some 1 ≤ v ≤ j. As the X_j are uniformly distributed, subadditivity gives P(Z_j ≠ 0) ≤ j/n. Since E(Y_j) = 0,

E(e^{sY_j}) = E(Σ_{k=0}^∞ (sY_j)^k / k!) = Σ_{k=0}^∞ (s^k/k!) E(Y_j^k)
  = 1 + Σ_{k=2}^∞ (s^k/k!) E(Y_j^k) ≤ 1 + Σ_{k=2}^∞ (∣s∣^k/k!) E∣Y_j∣^k
  = 1 + Σ_{k=2}^∞ (∣s∣^k/k!) [P(Z_j = 0) E(∣Y_j∣^k ∣ Z_j = 0) + P(Z_j ≠ 0) E(∣Y_j∣^k ∣ Z_j ≠ 0)]
  ≤ 1 + Σ_{k=2}^∞ (∣s∣^k/k!) E(∣Y_j∣^k ∣ Z_j = 0) + (j/n) Σ_{k=2}^∞ (∣s∣^k/k!) E(∣Y_j∣^k ∣ Z_j ≠ 0),

where the third line follows from the partition formula, and the final inequality follows since P(Z_j = 0) ≤ 1 and P(Z_j ≠ 0) ≤ j/n. Since Z_j = Y_j + (2j − 1)/n, in the case Z_j = 0 we have ∣Y_j∣ = (2j − 1)/n. Otherwise, our earlier bound gives ∣Y_j∣ ≤ 3M(α). Thus,
E(e^{sY_j}) ≤ 1 + Σ_{k=2}^∞ (∣s∣(2j−1)/n)^k / k! + (j/n) Σ_{k=2}^∞ (∣s∣3M(α))^k / k!
  ≤ 1 + Σ_{k=2}^∞ (∣s∣(2N−1)/n)^k / k! + (N/n) Σ_{k=2}^∞ (∣s∣3M(α))^k / k!,

since 1 ≤ j ≤ N. Notice if we have a constant 0 ≤ a ≤ 1, then

Σ_{k=2}^∞ a^k / k! ≤ Σ_{k=2}^∞ a² / 2^{k−1} = a².

For sufficiently large n, we have 0 ≤ ∣s∣(2N − 1)/n ≤ 1. If ∣s∣ < 1/(3M(α)), we also get 0 ≤ ∣s∣3M(α) ≤ 1, so by our quick result above,

E(e^{sY_j}) ≤ 1 + (∣s∣(2N−1)/n)² + (N/n)(∣s∣3M(α))² ≤ 1 + 4(N²/n²)s² + 9M(α)²(N/n)s²
  ≤ 1 + (4 + 9M(α)²)Nn^{−1}s² ≤ 1 + 10M(α)²Nn^{−1}s² ≤ exp(20M(α)²Nn^{−1}s²/2),

where we have used M(α)² ≥ 9 in the second-to-last inequality, and the fact that 1 + x ≤ e^x for every real x in the last inequality. Define a_j = 20M(α)²N/n > 0.

If it is not the case that Σ_{v=1}^{j−1} χ_{X_v}({u/n}) ≤ M(α) for all u with 1 ≤ u ≤ n, then by definition of Y_j, we get automatically

E(e^{sY_j}) = E(e⁰) = E(1) = 1 ≤ exp(20M(α)²Nn^{−1}s²/2).

Define

A = Σ_{j=1}^N a_j = Σ_{j=1}^N 20M(α)²Nn^{−1} = 20M(α)²N²n^{−1}.

Set δ = 1/(3M(α)) and x = εφ(n)(log n)^{1/2}Nn^{−1/2}. To apply Lemma 2.6, we must check that 0 ≤ x < Aδ for sufficiently large n. Indeed, φ(n)(log n)^{1/2} → ∞ as n → ∞, while ε > 0 and Nn^{−1/2} > 0 for every n, so that x ≥ 0 for sufficiently large n. In order to show x < Aδ, we must have

εφ(n)(log n)^{1/2}Nn^{−1/2} < (1/(3M(α))) · 20M(α)²N²n^{−1}
  ⟺ (3ε/(20M(α))) φ(n)(log n)^{1/2}n^{1/2} < N.

By hypothesis, N is such that there exists η > 0 with N ≥ n^{1/2+η} for all sufficiently large n. For this value of η > 0, we get φ(n)n^{−η/2} < 20M(α)/(3ε) for sufficiently large n by the decay property of φ. Then we easily get that for sufficiently large n,

(3ε/(20M(α))) φ(n)(log n)^{1/2}n^{1/2} < n^{η/2}(log n)^{1/2}n^{1/2} < n^{1/2+η} ≤ N,

as required. Then by Lemma 2.6,

P(∣W_N − W₀∣ ≥ x) ≤ 2 exp(−x²/(2A)) = 2 exp(−ε²φ(n)²(log n)N²n^{−1} / (40M(α)²N²n^{−1}))
  = 2 exp(−(ε²/(40M(α)²)) φ(n)² log n).
Notice −ε²/(40M(α)²) < 0 is a constant, and we are given φ(n)² log n → ∞ as n → ∞. Then we may choose n₀(φ, η, ε) ≥ 1 such that for all n ≥ n₀,

P(∣W_N − W₀∣ ≥ εφ(n)(log n)^{1/2}Nn^{−1/2}) ≤ 1/4. (2.2)

For the rest of the proof, we assume n satisfies this condition.
Recall M(α) = M(α, 1/4), as given by Lemma 2.3. By that lemma, we have that with probability at least 1 − 1/(4n),

Σ_{v=1}^N χ_{X_v}({r/n}) ≤ M(α), (2.3)

for all 1 ≤ r ≤ n. Then for every 1 ≤ r ≤ n and for every 1 ≤ j ≤ N,

Σ_{v=1}^j χ_{X_v}({r/n}) ≤ Σ_{v=1}^N χ_{X_v}({r/n}) ≤ M(α),

with probability at least 1 − 1/(4n). Then by construction,

Y_j = −(2j − 1)/n + χ_{2X_j}({r/n}) + 2 Σ_{v=1}^{j−1} χ_{X_v+X_j}({r/n}).

By definition of W_j,

W_N − W₀ = Σ_{j=1}^N (−(2j − 1)/n + χ_{2X_j}({r/n}) + 2 Σ_{v=1}^{j−1} χ_{X_v+X_j}({r/n})).

Notice

Σ_{j=1}^N −(2j − 1)/n = −N²/n,

while

Σ_{j=1}^N (χ_{2X_j}({r/n}) + 2 Σ_{v=1}^{j−1} χ_{X_v+X_j}({r/n})) = Σ_{v=1}^N Σ_{j=1}^N χ_{X_v+X_j}({r/n}). (2.4)

Combining (2.2) and (2.4), we get that with probability at least 1 − 1/(4n) − 1/4 ≥ 1/2, we have

∣W_N − W₀∣ = ∣Σ_{v=1}^N Σ_{j=1}^N χ_{X_v+X_j}({r/n}) − N²/n∣ < εφ(n)(log n)^{1/2}Nn^{−1/2}.

Writing μ = N^{−1} Σ_{j=1}^N δ_{X_j}, this becomes

∣μ ∗ μ({r/n}) − n^{−1}∣ < εφ(n)(log n)^{1/2}/(Nn^{1/2}),

and (2.3) becomes

μ({r/n}) ≤ M(α)/N.

Allowing r to take values from 1 to n, the result follows.
Lemma 2.8. Suppose α, φ, and N are as in Lemma 2.7. If ε > 0, there exists an M(α) and an n₀(φ, η, ε) ≥ 1 with the following property. Suppose that n ≥ n₀(φ, η, ε), and n is odd. Then we can find N points

x_j ∈ {r/n ∈ T : 1 ≤ r ≤ n}

(not necessarily distinct) such that, writing

μ = N^{−1} Σ_{j=1}^N δ_{x_j},

we have

∣μ ∗ μ({k/n}) − n^{−1}∣ ≤ εφ(n)(log n)^{1/2}/(Nn^{1/2}),

and

μ({k/n}) ≤ M(α)/N

for all 1 ≤ k ≤ n.
Proof. This follows immediately from Lemma 2.7, since an event with positive probability must have at least one occurrence.
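The behaviour promised by Lemma 2.8 is easy to observe empirically. The Monte Carlo sketch below is my own illustration with ad hoc sizes, seed, and tolerances, not the thesis's parameters: it draws N points uniformly from Γ_n, forms μ = N^{−1} Σ_j δ_{x_j}, and checks that μ ∗ μ({k/n}) is uniformly close to n^{−1} while each atom of μ stays small.

```python
import numpy as np

# Sample N points from Gamma_n = {r/n : 1 <= r <= n} and examine the
# atoms of the empirical measure mu and of mu * mu on Gamma_n.
rng = np.random.default_rng(0)
n, N = 1001, 400                      # n odd; N between n^{1/2} and n
counts = np.bincount(rng.integers(0, n, size=N), minlength=n)
assert counts.max() <= 8              # so mu({r/n}) <= 8/N everywhere

# Circular self-convolution of the counts gives N^2 * (mu * mu) on Gamma_n.
conv = np.real(np.fft.ifft(np.fft.fft(counts) ** 2))
mu_conv = conv / N ** 2               # mu * mu({k/n}) for each k
assert np.max(np.abs(mu_conv - 1 / n)) < 2 / n
```

The point of the lemma is quantitative: for N in the range n^{1/2+η} ≤ N ≤ n^α the deviation is not merely o(1/n) but of the much smaller order φ(n)(log n)^{1/2}/(Nn^{1/2}).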
Chapter 3
A Key Lemma
In this chapter, we will take steps to convert the discrete measure generated by Lemma 2.8 into an incredibly well-behaved function, being a periodic, positive, infinitely differentiable function satisfying several key properties; this procedure will be completed in four lemmas.
We will write 1_A for the indicator function of the set A.
Lemma 3.1. Suppose α, φ, and N are as in Lemma 2.7. If ε > 0, there exists an M(α) and an n₀(φ, η, ε) ≥ 1 with the following property. Suppose that n ≥ n₀(φ, η, ε), and n is odd. Then we can find N points

x_j ∈ {r/n ∈ T : 1 ≤ r ≤ n}

(not necessarily distinct) such that, writing

g = (n/N) Σ_{j=1}^N 1_{[x_j−(2n)^{−1}, x_j+(2n)^{−1})},

we have:

(i) ∥g ∗ g − 1∥∞ ≤ εφ(n)(log n)^{1/2}n^{1/2}/N.

(ii) For each t ∈ T, if 0 < ∣ℎ∣ < 1/n, then

∣ℎ∣^{−1} ∣g ∗ g(t + ℎ) − g ∗ g(t)∣ ≤ 4εφ(n)(log n)^{1/2}n^{3/2}/N.

(iii) ∣g(t)∣ ≤ nM(α)/N for all t ∈ T.

(iv) ∥g∥₁ = 1.
Proof. By Lemma 2.8, we can find N points

x_j ∈ {r/n ∈ T : 1 ≤ r ≤ n}

(not necessarily distinct) such that, writing

μ = N^{−1} Σ_{j=1}^N δ_{x_j},

we have

∣μ ∗ μ({k/n}) − n^{−1}∣ ≤ εφ(n)(log n)^{1/2}/(Nn^{1/2}),

and

μ({k/n}) ≤ M(α)/N

for all 1 ≤ k ≤ n. Observe that as δ_{x_j} is supported on {x_j}, for any a, b and any x ∈ T we have

δ_{x_j} ∗ 1_{[a,b)}(x) = ∫ 1_{[a,b)}(x − y) dδ_{x_j}(y) = 1_{[a,b)}(x − x_j) = 1_{[x_j+a, x_j+b)}(x).

As convolution respects addition,

nμ ∗ 1_{[−(2n)^{−1},(2n)^{−1})} = (n/N) Σ_{j=1}^N δ_{x_j} ∗ 1_{[−(2n)^{−1},(2n)^{−1})} = (n/N) Σ_{j=1}^N 1_{[x_j−(2n)^{−1}, x_j+(2n)^{−1})} = g.
Since 1_{[−(2n)^{−1},(2n)^{−1})} is symmetric almost everywhere,

1_{[−(2n)^{−1},(2n)^{−1})} ∗ 1_{[−(2n)^{−1},(2n)^{−1})}(x)
  = ∫ 1_{[−(2n)^{−1},(2n)^{−1})}(x − t) 1_{[−(2n)^{−1},(2n)^{−1})}(t) dt
  = ∫ 1_{[x−(2n)^{−1}, x+(2n)^{−1})}(t) 1_{[−(2n)^{−1},(2n)^{−1})}(t) dt
  = 0 if x < −1/n,
    ∫ 1_{[−(2n)^{−1}, x+(2n)^{−1})}(t) dt if −1/n ≤ x ≤ 0,
    ∫ 1_{[x−(2n)^{−1}, (2n)^{−1})}(t) dt if 0 ≤ x ≤ 1/n,
    0 if x > 1/n
  = 0 if ∣x∣ > 1/n,
    1/n − ∣x∣ if ∣x∣ ≤ 1/n
  = max{0, 1/n − ∣x∣}.
Writing

Δ_n(x) = max{0, 1 − n∣x∣},

the previous calculations give

g ∗ g = (nμ ∗ 1_{[−(2n)^{−1},(2n)^{−1})}) ∗ (nμ ∗ 1_{[−(2n)^{−1},(2n)^{−1})}) = μ ∗ μ ∗ nΔ_n.

In particular, we have

g ∗ g(r/n) = (μ ∗ μ) ∗ nΔ_n(r/n) = n ∫ max{0, 1 − n∣(r/n) − t∣} d(μ ∗ μ)(t).

Since μ is supported on {k/n : 1 ≤ k ≤ n}, so is μ ∗ μ. Then the above integrand vanishes except possibly when t = r′/n for some r′ ∈ ℤ, i.e. nt ∈ ℤ. Now, Δ_n((r/n) − t) is non-zero only if 1 − n∣(r/n) − t∣ > 0, yielding 1 > ∣r − nt∣. But nt ∈ ℤ, implying nt = r. Therefore,

g ∗ g(r/n) = n(1 − n∣(r/n) − (r/n)∣) μ ∗ μ({r/n}) = n μ ∗ μ({r/n}). (3.1)
It is routine to check that for x_i, x_j ∈ {r/n ∈ T : 1 ≤ r ≤ n}, we have

1_{[x_i−(2n)^{−1}, x_i+(2n)^{−1})} ∗ 1_{[x_j−(2n)^{−1}, x_j+(2n)^{−1})}(x)
  = 0 if x ∉ [x_i+x_j−1/n, x_i+x_j+1/n],
    x − (x_i+x_j) + 1/n if x ∈ [x_i+x_j−1/n, x_i+x_j],
    −x + (x_i+x_j) + 1/n if x ∈ [x_i+x_j, x_i+x_j+1/n].

Notice x_i + x_j is of the form r/n as well, so that this convolution is continuous and piecewise linear on each interval [(r−1)/n, r/n], for 1 ≤ r ≤ n. Then g ∗ g, being a linear combination of such convolutions, will be continuous¹ and piecewise linear on each interval [(r−1)/n, r/n], for 1 ≤ r ≤ n. Thus, g ∗ g attains its extreme values at points of {r/n ∈ T : 1 ≤ r ≤ n}. By the choice of μ, we have

∣μ ∗ μ({r/n}) − n^{−1}∣ ≤ εφ(n)(log n)^{1/2}/(Nn^{1/2}).

By (3.1), this becomes

∣g ∗ g(r/n) − 1∣ ≤ εφ(n)(log n)^{1/2}n^{1/2}/N.

Since g ∗ g attains its extreme values on {r/n ∈ T : 1 ≤ r ≤ n}, we get

∥g ∗ g − 1∥∞ = max_{1≤r≤n} ∣g ∗ g(r/n) − 1∣ ≤ εφ(n)(log n)^{1/2}n^{1/2}/N,

verifying part (i).

¹Continuity can also be obtained by noticing g ∈ L²(T), and L²(T) ∗ L²(T) ⊆ C(T).
To verify part (ii), we need to check the claimed inequality for each t ∈ T. We have two cases.
Case 1: t = r/n, for some 1 ≤ r ≤ n.
Suppose 0 < ℎ < 1/n. Since g ∗ g is linear on [r/n, (r + 1)/n], g ∗ g(r/n + ℎ) can be expressed as the convex combination: