
Lecture Notes in Harmonic Analysis

Lectures by Dr. Charles Moore

Throughout this document, $\square$ signifies the end of a proof, and N signifies the end of an example.

Table of Contents

Lecture 1 Introduction to Fourier Analysis
    1.1 Fourier Analysis
    1.2 In more general settings
Lecture 2 More Fourier Analysis
    2.1 Elementary Facts from Fourier Analysis
Lecture 3 Convolving Functions
    3.1 Properties of Convolution
Lecture 4 An Application
    4.1 Photographing a Star
    4.2 Results in L2
Lecture 5 Hilbert Spaces
    5.1 Fourier Series on L2
Lecture 6 More on Hilbert Spaces
    6.1 Haar Functions
    6.2 Fourier Transform on L2
Lecture 7 Inverse Fourier Transform
    7.1 Undoing Fourier Transforms
Lecture 8 Fejér Kernels
    8.1 Fejér Kernels and Approximate Identities
Lecture 9 Convergence of Cesàro Means
    9.1 Convergence of Fourier Sums
Lecture 10 Toward Convergence of Partial Sums
    10.1 Dirichlet Kernels
    10.2 Convergence for Continuous Functions
Lecture 11 Convergence in Lp
    11.1 Convergence in Lp
    11.2 Almost Everywhere Convergence
Lecture 12 Maximal Functions
    12.1 Hardy-Littlewood Maximal Functions
Lecture 13 More on Maximal Functions
    13.1 Proof of Hardy-Littlewood's Theorem
Lecture 14 Marcinkiewicz Interpolation
    14.1 Proof of the Marcinkiewicz Interpolation Theorem
Lecture 15 Lebesgue Differentiation Theorem
    15.1 A Note About Maximal Functions
    15.2 Lebesgue Differentiation Theorem
Lecture 16 Maximal Functions and Kernels
    16.1 Generalising the Lebesgue Differentiation Theorem
Lecture 17 Rising Sun Lemma
    17.1 Nontangential Maximal Functions
    17.2 Riesz's Proof of the Hardy-Littlewood Theorem
Lecture 18 Calderón-Zygmund Decomposition of Functions
    18.1 Higher-Dimensional Rising Sun Lemma
Lecture 19 Density of Sets
    19.1 Hardy-Littlewood's Theorem from Calderón-Zygmund
    19.2 Density of Sets
Lecture 20 Marcinkiewicz Integral
    20.1 Convergence of the Marcinkiewicz Integral
Lecture 21 Integral Operators
    21.1 Schur's Lemma
Lecture 22 Integral Operators, continued
    22.1 Singular Integrals
Lecture 23 Integral Operators, continued
    23.1 Finishing the Proof
Lecture 24 Integral Operators, continued
    24.1 Proof of the Lemma
Lecture 25 Integral Operators, continued
    25.1 Finalising the Proof
Index

Notes by Jakob Streipel. Last updated December 1, 2017.

Lecture 1 Introduction to Fourier Analysis

Harmonic analysis is a broad field concerning the art of decomposing functions into constituent parts. These constituent parts might be Fourier coefficients (breaking functions down into exponential parts), tools to deal with partial differential equations, or Sobolev spaces. This course will deal with the following:

• Fourier analysis,

• Harmonic functions,

• Singular integrals, and

• Maximal functions.

1.1 Fourier Analysis

Definition 1.1.1 (Inner product space). Let $V$ be a finite-dimensional vector space over $\mathbb{C}$. Then $V$ is called an inner product space if there is a mapping $\langle \cdot, \cdot \rangle \colon V \times V \to \mathbb{C}$ which satisfies the following for all vectors $x, y, z \in V$ and scalars $\alpha \in \mathbb{C}$:

(i) Conjugate symmetry, meaning that $\langle x, y \rangle = \overline{\langle y, x \rangle}$;

(ii) Linearity in the first argument, i.e. $\langle \alpha x, y \rangle = \alpha \langle x, y \rangle$ and $\langle x + y, z \rangle = \langle x, z \rangle + \langle y, z \rangle$;

(iii) Positive-definiteness, meaning that $\langle x, x \rangle \geq 0$, and $\langle x, x \rangle = 0$ if and only if $x = 0$.

All inner product spaces automatically induce a norm, namely $\|v\| = \langle v, v \rangle^{1/2}$. Moreover normed spaces are automatically metric spaces by $d(v, w) = \|v - w\|$. Ergo the resulting distance has the following properties:

Definition 1.1.2 (Metric). A function $d(\cdot, \cdot) \colon V \times V \to \mathbb{R}$ is called a metric if

(i) $d(x, y) \geq 0$, and $d(x, y) = 0$ if and only if $x = y$;

(ii) symmetry, i.e. $d(x, y) = d(y, x)$; and

(iii) $d(x, z) \leq d(x, y) + d(y, z)$

for all $x, y, z \in V$.

Since we have an inner product, we are able to define all manner of other interesting concepts.

Definition 1.1.3 (Orthogonal, orthonormal). A basis $B = \{ v_1, v_2, \ldots, v_n \}$ is called orthogonal if the basis elements are pairwise orthogonal, i.e. $\langle v_i, v_j \rangle = 0$ for all $i \neq j$. Moreover the basis is called orthonormal if in addition $\|v_i\| = 1$ for all $i$.

Since a basis spans the space, we can write $v = c_1 v_1 + c_2 v_2 + \ldots + c_n v_n$. Now if the basis in addition is orthonormal, we have the illuminating property that
\[
\langle v, v_1 \rangle = \langle c_1 v_1 + c_2 v_2 + \ldots + c_n v_n, v_1 \rangle = c_1 \langle v_1, v_1 \rangle + c_2 \langle v_2, v_1 \rangle + \ldots + c_n \langle v_n, v_1 \rangle = c_1 \|v_1\|^2 + 0 + \ldots + 0 = c_1,
\]
and similarly $\langle v, v_i \rangle = c_i$ for all $i$. Therefore
\[
v = \langle v, v_1 \rangle v_1 + \langle v, v_2 \rangle v_2 + \ldots + \langle v, v_n \rangle v_n,
\]
and the $\langle v, v_i \rangle$ are called the Fourier coefficients of $v$.

We can do the same thing on a slightly more interesting space than ordinary Euclidean space, namely $L^1$:

Definition 1.1.4. Suppose $f$ is a function on $[-\pi, \pi)$ with
\[
\int_{-\pi}^{\pi} |f(x)| \, dx < \infty,
\]
meaning that $f \in L^1[-\pi, \pi)$. Then for $n \in \mathbb{Z}$ we define
\[
\hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx,
\]
called the $n$th Fourier coefficient of $f$.

On $[-\pi, \pi)$, consider the set $\{ e^{inx} \}_{n \in \mathbb{Z}}$. Define
\[
\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{g(x)} \, dx.
\]

Exercise 1.1.5. Show that this is indeed an inner product.

Solution. First conjugate symmetry: clearly
\[
\overline{\langle g, f \rangle} = \overline{\frac{1}{2\pi} \int_{-\pi}^{\pi} g(x) \overline{f(x)} \, dx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} \overline{g(x)} f(x) \, dx = \langle f, g \rangle.
\]
Linearity follows from integration being linear on its own, and note for positive-definiteness that
\[
\langle f, f \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) \overline{f(x)} \, dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx \geq 0,
\]
since for any complex number $z = a + ib$ we have $z\bar{z} = (a + ib)(a - ib) = a^2 - i^2 b^2 = a^2 + b^2 = |z|^2$. For the final condition, note that part of it is automatic: taking $f = 0$ in linearity gives $\langle 0, 0 \rangle = \langle g, 0 \rangle + \langle -g, 0 \rangle = \langle g, 0 \rangle - \langle g, 0 \rangle = 0$; and conversely, if $\langle f, f \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(x)|^2 \, dx = 0$, then $f = 0$ almost everywhere, which is precisely equality with the zero element of this space. $\square$

Now suppose that $n \neq m$ are integers. Then
\[
\langle e^{inx}, e^{imx} \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} \overline{e^{imx}} \, dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{i(n-m)x} \, dx = \frac{1}{2\pi} \left[ \frac{e^{i(n-m)x}}{i(n-m)} \right]_{x=-\pi}^{\pi} = \frac{1}{2\pi} \left( \frac{e^{i(n-m)\pi}}{i(n-m)} - \frac{e^{-i(n-m)\pi}}{i(n-m)} \right) = 0,
\]
since $e^{i(n-m)\pi} = e^{-i(n-m)\pi}$ for the integer $n - m$, and
\[
\| e^{inx} \| = \langle e^{inx}, e^{inx} \rangle^{1/2} = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} e^{inx} \overline{e^{inx}} \, dx \right)^{1/2} = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} e^0 \, dx \right)^{1/2} = 1.
\]

Therefore $\{ e^{inx} \}_{n \in \mathbb{Z}}$ is an orthonormal set. Given a function $f$, we defined $\hat{f}(n) = \langle f, e^{inx} \rangle$. We would like for this to mean that
\[
f = \sum_{n \in \mathbb{Z}} \hat{f}(n) e^{inx},
\]
like $v = \langle v, v_1 \rangle v_1 + \langle v, v_2 \rangle v_2 + \ldots + \langle v, v_n \rangle v_n$ above. But is it a basis? In what sense does this infinite sum converge?
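The orthonormality just computed is also easy to observe numerically; the following Python sketch (an illustration, with the quadrature grid a convenience choice) approximates $\langle e^{inx}, e^{imx} \rangle$ by a midpoint rule:

```python
import numpy as np

def inner(f, g, m=100_000):
    # <f, g> = (1/2 pi) * integral over [-pi, pi) of f(x) * conj(g(x)) dx.
    # The mean over a uniform midpoint grid already carries the 1/(2 pi) factor.
    x = -np.pi + (np.arange(m) + 0.5) * (2 * np.pi / m)
    return np.mean(f(x) * np.conj(g(x)))

def e(n):
    return lambda x: np.exp(1j * n * x)

g33 = inner(e(3), e(3))  # should be about 1
g35 = inner(e(3), e(5))  # should be about 0
```

For trigonometric polynomials the midpoint rule is exact up to rounding, so both values match the computation above to machine precision.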

1.2 In more general settings

Given $f$ on $[-\pi, \pi)$ with $\int_{-\pi}^{\pi} |f(x)| \, dx < \infty$, we defined $\hat{f}(n) = \langle f, e^{inx} \rangle$. Then $\sum_{n \in \mathbb{Z}} \hat{f}(n) e^{inx}$ is called the Fourier series of $f$, often written
\[
f \sim \sum_{n \in \mathbb{Z}} \hat{f}(n) e^{inx}.
\]

Now it is extremely important to note that this means only that $\hat{f}(n)$ is defined as the integral we discussed previously, and nothing more.

In general, if $\mu$ is a measure on $[-\pi, \pi)$, then $\hat{\mu}(n) = \int_{-\pi}^{\pi} e^{-inx} \, d\mu(x)$, and indeed if $f$ is a function, then $f(x) \, dx$ is an example of a measure.

We can do this more generally in higher dimensions. If $f \colon \mathbb{R}^n \to \mathbb{C}$ and $\int_{\mathbb{R}^n} |f(x)| \, dx < \infty$, we define
\[
\hat{f}(x) = \int_{\mathbb{R}^n} e^{-ix \cdot t} f(t) \, dt,
\]
where $x = (x_1, x_2, \ldots, x_n)$, $t = (t_1, t_2, \ldots, t_n)$, and $x \cdot t = x_1 t_1 + x_2 t_2 + \ldots + x_n t_n$. This $\hat{f} \colon \mathbb{R}^n \to \mathbb{C}$ is called the Fourier transform.

Remark 1.2.1. This is of course not unlike, say, the Laplace transform,
\[
\mathcal{L}\{f\}(s) = \int_0^{\infty} f(t) e^{-st} \, dt.
\]

Indeed we could consider $n$-dimensional tori too, if we like, where we'd be doing the exact same thing, just over a different domain of integration (and with a different scalar in front). More generally we may do this over any locally compact Abelian group $G$ (which is what it sounds like, with the caveat that the group structure and the topological structure be connected, in that the group operations of addition and negation are both continuous). It is a fact that on $G$ there exists a measure $\mu$, called the Haar measure, which is translation invariant, i.e. $\mu(E + x) = \mu(E)$ for every Borel set $E \subseteq G$ and $x \in G$. If $\gamma \colon G \to \mathbb{C}$ is a mapping such that $\gamma(x + y) = \gamma(x)\gamma(y)$ for every $x, y \in G$, e.g. $\gamma(x) = e^{inx}$, then $\gamma$ is called a (multiplicative) character on $G$. In $L^2[-\pi, \pi)$, the $e^{inx}$ are characters. Moreover the set of all characters of a group is called the dual group, often denoted $\Gamma$. Now we can define
\[
\hat{f}(\gamma) = \int_G f(x) \overline{\gamma(x)} \, d\mu(x).
\]
(These things are discussed in Rudin's Fourier Analysis on Groups.)

Lecture 2 More Fourier Analysis

Recall from last time that we write
\[
f \sim \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{inx}
\]
to mean that
\[
\hat{f}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int} \, dt,
\]
for functions $f \in L^1([-\pi, \pi))$, which means that
\[
\int_{-\pi}^{\pi} |f(t)| \, dt < \infty.
\]

Similarly, if $f$ is defined on $\mathbb{R}^d$ we write
\[
\hat{f}(\xi) = \int_{\mathbb{R}^d} e^{-i\xi \cdot t} f(t) \, dt.
\]

2.1 Elementary Facts from Fourier Analysis

Proposition 2.1.1. With these definitions,

(i) $\displaystyle |\hat{f}(n)| \leq \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| \, dt$, and

(ii) $\displaystyle |\hat{f}(\xi)| \leq \int_{\mathbb{R}^d} |f(t)| \, dt$.

Proof. We prove the first one; the second one is almost identical:
\[
|\hat{f}(n)| = \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-int} \, dt \right| \leq \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| |e^{-int}| \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)| \, dt. \qquad \square
\]
We can do better, in fact.

Proposition 2.1.2 (Riemann-Lebesgue lemma). (i) If $f \in L^1([-\pi, \pi))$ then $|\hat{f}(n)| \to 0$ as $|n| \to \infty$, and

(ii) if $f \in L^1(\mathbb{R}^d)$ then $\hat{f}(\xi)$ is continuous and $|\hat{f}(\xi)| \to 0$ as $|\xi| \to \infty$.

Note that whilst (ii) might seem stronger, since $\hat{f}(n)$ is a sequence it is automatically continuous in the discrete topology.

Proof. We prove the first one (since, save for the continuity, the second one is similar), and we prove it only for indicator functions; this is fine since their linear combinations are dense in $L^1$. For an indicator function
\[
\chi_{(a,b)}(x) = \begin{cases} 1, & \text{if } x \in (a, b), \\ 0, & \text{otherwise}, \end{cases}
\]
we have
\[
\hat{\chi}_{(a,b)}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \chi_{(a,b)}(x) e^{-inx} \, dx = \frac{1}{2\pi} \int_a^b e^{-inx} \, dx = \frac{1}{2\pi} \left[ \frac{e^{-inx}}{-in} \right]_a^b = \frac{1}{2\pi} \cdot \frac{e^{-inb} - e^{-ina}}{-in} \to 0
\]
as $|n| \to \infty$, since $|e^{ix}|$ is bounded by 1. Now since linear combinations of characteristic functions are dense in $L^1$, for any $\varepsilon > 0$ there exists a function $g$ that is a linear combination of characteristic functions such that
\[
\int_{-\pi}^{\pi} |f(x) - g(x)| \, dx < \varepsilon.
\]
Since $\hat{g}(n) \to 0$ as $|n| \to \infty$, there exists an $M$ such that for all $|n| > M$ we have $|\hat{g}(n)| < \varepsilon$. Take such an $n$; then
\[
|\hat{f}(n)| = \left| \frac{1}{2\pi} \int_{-\pi}^{\pi} (f(t) - g(t)) e^{-int} \, dt + \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t) e^{-int} \, dt \right| \leq \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t) - g(t)| |e^{-int}| \, dt + |\hat{g}(n)| \leq \frac{\varepsilon}{2\pi} + \varepsilon,
\]
which can of course be made arbitrarily small.

For the continuity in the second case, let $\xi \in \mathbb{R}^d$ and moreover let $\xi_j \in \mathbb{R}^d$ be such that $\xi_j \to \xi$. Then
\[
|\hat{f}(\xi_j) - \hat{f}(\xi)| = \left| \int_{\mathbb{R}^d} f(t) e^{-i\xi_j \cdot t} \, dt - \int_{\mathbb{R}^d} f(t) e^{-i\xi \cdot t} \, dt \right| \leq \int_{\mathbb{R}^d} |f(t)| |e^{-i\xi_j \cdot t} - e^{-i\xi \cdot t}| \, dt \to 0
\]
as $j \to \infty$. $\square$

Remark 2.1.3. Note how without comment we took limits from outside of the integral to the inside of the integral just then. This is allowed by dominated convergence, since the integrand is bounded by the integrable function $2|f(t)|$. This is contrary to classical examples such as $f_n(x) = n\chi_{(0,1/n)}(x)$: this converges pointwise to 0 as $n$ goes to infinity, yet
\[
\int_0^1 \lim_{n \to \infty} f_n(x) \, dx = 0 \quad \text{and} \quad \lim_{n \to \infty} \int_0^1 f_n(x) \, dx = 1.
\]

Remark 2.1.4. The Riemann-Lebesgue lemma is not true for measures. Take for instance the Dirac measure $\delta$, for which we have
\[
\hat{\delta}(n) = \int_{-\pi}^{\pi} e^{-int} \, d\delta(t) = 1
\]
for all $n$, so it certainly does not go to 0.

Remark 2.1.5. We have now shown that the Fourier coefficients of an $L^1$ function $f$ form a two-sided decaying sequence. One might ask, then, whether for each decaying two-sided sequence we might find an $L^1$ function which has precisely that sequence as its Fourier coefficients. The answer to this question is no, by Bochner's theorem, but more on this later.

Proposition 2.1.6. Let $f \in L^1([-\pi, \pi))$. Extend $f$ to all of $\mathbb{R}$ as a $2\pi$-periodic function, i.e. $f(2\pi n + x) = f(x)$ for all $n \in \mathbb{Z}$. Then

(i) if $y \in \mathbb{R}$ and $g(x) = f(x - y)$, then $\hat{g}(n) = \hat{f}(n) e^{-iyn}$;

(ii) if $m \in \mathbb{Z}$ and $g(x) = f(x) e^{imx}$, then $\hat{g}(n) = \hat{f}(n - m)$.

Proof. (i) We compute, making a change of variable along the way:
\[
\hat{g}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(t) e^{-int} \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t - y) e^{-int} \, dt = \frac{1}{2\pi} \int_{-\pi-y}^{\pi-y} f(s) e^{-in(s+y)} \, ds = \frac{e^{-iny}}{2\pi} \int_{-\pi-y}^{\pi-y} f(s) e^{-ins} \, ds = e^{-iny} \hat{f}(n),
\]
where the last step uses the $2\pi$-periodicity of the integrand.

(ii) is similar. $\square$

Something similar is true for products of functions (almost).
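Proposition 2.1.6(i) can be sanity-checked numerically for a smooth periodic function; the Python sketch below (the test function and grid are arbitrary choices, not from the notes) shifts $f$ by a whole number of grid points and compares both sides:

```python
import numpy as np

M = 4096
x = -np.pi + 2 * np.pi * np.arange(M) / M
f = np.exp(np.cos(x))  # a smooth 2 pi - periodic test function

def fhat(values, n):
    # (1/2 pi) * integral of f(t) e^{-int} dt as a Riemann sum on the periodic grid
    return np.mean(values * np.exp(-1j * n * x))

# Shift by a whole number of grid points so that g(x) = f(x - y) exactly,
# using periodicity at the wrap-around.
shift = 500
y = shift * 2 * np.pi / M
g = np.roll(f, shift)

n = 3
lhs = fhat(g, n)
rhs = np.exp(-1j * y * n) * fhat(f, n)
```

Because the shift is exact on the grid, the two sides agree to machine precision, mirroring the identity $\hat{g}(n) = \hat{f}(n) e^{-iyn}$.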

Definition 2.1.7 (Convolution). If $f \in L^p(\mathbb{R}^d)$ and $g \in L^1(\mathbb{R}^d)$, then we let
\[
f * g(x) = \int_{\mathbb{R}^d} f(x - y) g(y) \, dy,
\]
called the convolution of $f$ and $g$.

Now it turns out that $\widehat{fg} \neq \hat{f} \cdot \hat{g}$ in general, but $\widehat{f * g} = \hat{f} \cdot \hat{g}$.

Lecture 3 Convolving Functions

3.1 Properties of Convolution

Recall that for $f, g \in L^1([-\pi, \pi))$ we define
\[
f * g(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x - y) g(y) \, dy,
\]
and for $f, g \in L^1(\mathbb{R}^d)$ we define
\[
f * g(x) = \int_{\mathbb{R}^d} f(x - y) g(y) \, dy.
\]

Proposition 3.1.1 (Young's inequality). If $f \in L^p$ and $g \in L^1$, then $\|f * g\|_p \leq \|f\|_p \|g\|_1$.

We will need the following lemma:

Lemma 3.1.2 (Hölder's inequality). Let $1 \leq p, q \leq \infty$ be such that $1/p + 1/q = 1$. Then
\[
\int_{\mathbb{R}^d} |h(x) k(x)| \, dx \leq \left( \int_{\mathbb{R}^d} |h(x)|^p \, dx \right)^{1/p} \left( \int_{\mathbb{R}^d} |k(x)|^q \, dx \right)^{1/q}.
\]

Proof. We prove the case of $\mathbb{R}^d$; the other one is similar.
\[
\begin{aligned}
\|f * g\|_p^p &= \int_{\mathbb{R}^d} |f * g(x)|^p \, dx = \int_{\mathbb{R}^d} \left| \int_{\mathbb{R}^d} f(x - y) g(y) \, dy \right|^p dx \\
&\leq \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} |f(x - y)| |g(y)|^{1/p} \cdot |g(y)|^{1/q} \, dy \right)^p dx \\
&\leq \int_{\mathbb{R}^d} \left( \int_{\mathbb{R}^d} |f(x - y)|^p |g(y)| \, dy \right) \left( \int_{\mathbb{R}^d} |g(y)| \, dy \right)^{p/q} dx \\
&= \|g\|_1^{p/q} \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |f(x - y)|^p |g(y)| \, dy \, dx \\
&= \|g\|_1^{p/q} \int_{\mathbb{R}^d} |g(y)| \int_{\mathbb{R}^d} |f(x - y)|^p \, dx \, dy \\
&= \|g\|_1^{p/q} \|f\|_p^p \|g\|_1 = \|f\|_p^p \|g\|_1^{p/q + 1} = \|f\|_p^p \|g\|_1^p,
\end{aligned}
\]
where we applied Hölder's inequality to $|f(x - y)||g(y)|^{1/p}$ and $|g(y)|^{1/q}$, switched the order of integration by Tonelli's theorem, and used translation invariance of Lebesgue measure. Taking $p$th roots gives the result. $\square$
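Young's inequality also holds in the discrete analogue of these convolutions, namely cyclic convolution on $\mathbb{Z}_M$ with normalised counting measure, which makes for a quick numerical illustration (the sizes and exponent below are arbitrary choices, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)
M, p = 512, 3.0
f = rng.standard_normal(M)
g = rng.standard_normal(M)

def norm(h, q):
    # L^q norm with respect to the normalised counting measure on Z_M
    return np.mean(np.abs(h) ** q) ** (1 / q)

# f*g(i) = mean over j of f(i - j) g(j): the discrete analogue of the
# normalised convolution on the circle defined above.
conv = np.array([np.mean(f[(i - np.arange(M)) % M] * g) for i in range(M)])

lhs = norm(conv, p)
rhs = norm(f, p) * norm(g, 1)
```

Since $\mathbb{Z}_M$ with the normalised counting measure is a (finite) group with translation-invariant probability measure, the same proof applies verbatim, and `lhs <= rhs` holds for any choice of data.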

Proposition 3.1.3. (i) If $f, g \in L^1([-\pi, \pi))$ then
\[
\widehat{f * g}(n) = \hat{f}(n) \hat{g}(n).
\]

(ii) If $f, g \in L^1(\mathbb{R}^d)$ then $\widehat{f * g}(\xi) = \hat{f}(\xi) \hat{g}(\xi)$.

Proof. We prove (i):
\[
\begin{aligned}
\widehat{f * g}(n) &= \frac{1}{2\pi} \int_{-\pi}^{\pi} f * g(t) e^{-int} \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t - s) g(s) \, ds \, e^{-int} \, dt \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t - s) g(s) e^{-int} \, ds \, dt = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(s) \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t - s) e^{-int} \, dt \, ds \\
&= \frac{1}{2\pi} \int_{-\pi}^{\pi} g(s) \frac{1}{2\pi} \int_{-\pi-s}^{\pi-s} f(v) e^{-in(s+v)} \, dv \, ds = \frac{1}{2\pi} \int_{-\pi}^{\pi} g(s) e^{-ins} \frac{1}{2\pi} \int_{-\pi-s}^{\pi-s} f(v) e^{-inv} \, dv \, ds = \hat{g}(n) \hat{f}(n),
\end{aligned}
\]
where we switched the order of integration by Fubini's theorem, substituted $t = s + v$, and used $2\pi$-periodicity in the last step. $\square$
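The convolution theorem has an exact discrete counterpart that the FFT makes easy to verify: with coefficients $\hat{f}(n) = \frac{1}{M} \sum_k f_k e^{-2\pi i nk/M}$ (mirroring the $\frac{1}{2\pi}\int$ convention) and the mean-normalised cyclic convolution, one has $\widehat{f * g} = \hat{f}\hat{g}$ exactly. A Python sketch (sizes arbitrary, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(2)
M = 256
f = rng.standard_normal(M)
g = rng.standard_normal(M)

def hat(h):
    # Discrete Fourier coefficients with a (1/M) normalisation,
    # the discrete analogue of (1/2 pi) * integral.
    return np.fft.fft(h) / M

# Mean-normalised cyclic convolution: f*g(i) = mean over j of f(i - j) g(j)
conv = np.array([np.mean(f[(i - np.arange(M)) % M] * g) for i in range(M)])

err = np.max(np.abs(hat(conv) - hat(f) * hat(g)))
```

The agreement is exact up to floating-point rounding, because the DFT of an (unnormalised) cyclic convolution is the product of the DFTs, and the two $1/M$ factors match.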

Proposition 3.1.4. (i) If $f \in L^1(\mathbb{R}^d)$ and $x_k f(x) \in L^1(\mathbb{R}^d)$ (where $x = (x_1, x_2, \ldots, x_d)$), then $\hat{f}(\xi)$ is differentiable with respect to $\xi_k$ and
\[
\frac{\partial}{\partial \xi_k} \hat{f}(\xi) = \widehat{-i x_k f}(\xi).
\]

(ii) If $f, \partial f / \partial x_k \in L^1(\mathbb{R}^d)$, then
\[
\widehat{\frac{\partial f}{\partial x_k}}(\xi) = i \xi_k \hat{f}(\xi).
\]

Proof. Set $h = (0, \ldots, 0, h, 0, \ldots, 0)$, with $h$ being in the $k$th position. Consider
\[
\frac{\hat{f}(\xi + h) - \hat{f}(\xi)}{h} = \frac{1}{h} \int_{\mathbb{R}^d} f(t) \left( e^{-i(\xi + h) \cdot t} - e^{-i\xi \cdot t} \right) dt = \int_{\mathbb{R}^d} f(t) e^{-i\xi \cdot t} \frac{e^{-iht_k} - 1}{h} \, dt.
\]
Thus taking limits we get the partial derivative in $\xi_k$, whereby
\[
\frac{\partial \hat{f}}{\partial \xi_k}(\xi) = \int_{\mathbb{R}^d} f(t) e^{-i\xi \cdot t} \lim_{h \to 0} \frac{e^{-iht_k} - 1}{h} \, dt = \int_{\mathbb{R}^d} f(t) e^{-i\xi \cdot t} (-it_k) \, dt = \widehat{-i t_k f}(\xi),
\]
where passing the limit inside the integral is justified by dominated convergence, since $|(e^{-iht_k} - 1)/h| \leq |t_k|$ and $t_k f \in L^1$. $\square$

Proposition 3.1.5. If $f, df/dx \in L^1([-\pi, \pi))$ then
\[
\widehat{\frac{df}{dx}}(n) = i n \hat{f}(n).
\]

Proof. By simple computation using integration by parts,
\[
\widehat{\frac{df}{dx}}(n) = \frac{1}{2\pi} \int_{-\pi}^{\pi} \frac{df}{dx}(x) e^{-inx} \, dx = \frac{1}{2\pi} \left[ f(x) e^{-inx} \right]_{-\pi}^{\pi} - \frac{1}{2\pi} \int_{-\pi}^{\pi} f(x) (-in) e^{-inx} \, dx = \frac{in}{2\pi} \int_{-\pi}^{\pi} f(x) e^{-inx} \, dx = i n \hat{f}(n),
\]
since the boundary term vanishes by $2\pi$-periodicity. $\square$
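Proposition 3.1.5 is again easy to sanity-check numerically with a smooth periodic function whose derivative we know in closed form (the test function below is an arbitrary choice, not from the notes):

```python
import numpy as np

M = 4096
x = -np.pi + 2 * np.pi * np.arange(M) / M
f = np.exp(np.cos(x))            # smooth and 2 pi - periodic
fprime = -np.sin(x) * f          # its derivative, computed by hand

def fhat(values, n):
    # (1/2 pi) * integral of values(t) e^{-int} dt on the periodic grid
    return np.mean(values * np.exp(-1j * n * x))

n = 4
lhs = fhat(fprime, n)
rhs = 1j * n * fhat(f, n)
```

For smooth periodic integrands the trapezoidal-type Riemann sum is spectrally accurate, so the two sides agree far beyond the asserted tolerance.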

Lecture 4 An Application

4.1 Photographing a Star

We will describe an application of the Fourier transform. Imagine taking a two-dimensional photograph of a star from Earth. This star looks like a disc from our perspective, and due to there being atmosphere between us and the star the picture will be blurry. A question one might ask oneself then is: despite the blurriness, would it be possible to determine the radius of the star?

To describe how this is accomplished using the Fourier transform, we first set the problem up in terms of functions on the plane. The true image of the star can be described as
\[
f(x) = \lambda \chi_B\left( \frac{x - y}{\varepsilon} \right),
\]
where $x = (x_1, x_2)$, $y = (y_1, y_2)$ is the centre of the disc (which is the image of the star), $\varepsilon$ is the radius of the disc, and $\lambda$ is the brightness or luminosity of the star. Moreover $\chi_B$ is the characteristic function of the unit disc, from which it follows that the above is the disc of radius $\varepsilon$ centred on $y$ with brightness $\lambda$.

The blurry photograph of the star can be described by
\[
f * k(x) = \int_{\mathbb{R}^2} f(y) k(x - y) \, dy,
\]
where $k$ is some sort of smooth function with a bump around the origin and 0 elsewhere. Such a function $k$ is called a mollifier, the name coming from the fact that the convolution above will produce an image of $f$ in which sharp edges have been smoothed out.

Now imagine us taking $n$ photos, getting $f * k_1, f * k_2, \ldots, f * k_n$. We superimpose these, yielding $f * k_1 + f * k_2 + \ldots + f * k_n$, and then take the Fourier transform, producing
\[
\hat{f}\hat{k}_1 + \hat{f}\hat{k}_2 + \ldots + \hat{f}\hat{k}_n = \hat{f}(\hat{k}_1 + \hat{k}_2 + \ldots + \hat{k}_n).
\]
Now $\hat{f}$ has zeros, as do $\hat{k}_1, \ldots, \hat{k}_n$. Let us assume that the zeros of this Fourier transform are the zeros of $\hat{f}$ (i.e. that the zeros of the Fourier transforms of the mollifiers don't happen to coincide with each other or with the zeros of $\hat{f}$).
Then, using the substitutions $s = x - y$ and $u = s/\varepsilon$ along the way,
\[
\begin{aligned}
\hat{f}(\xi) &= \int_{\mathbb{R}^2} e^{-ix \cdot \xi} \lambda \chi_B\left( \frac{x - y}{\varepsilon} \right) dx = \lambda \int_{\mathbb{R}^2} e^{-i(s + y) \cdot \xi} \chi_B(s/\varepsilon) \, ds \\
&= \lambda e^{-iy \cdot \xi} \int_{\mathbb{R}^2} e^{-is \cdot \xi} \chi_B(s/\varepsilon) \, ds = \varepsilon^2 \lambda e^{-iy \cdot \xi} \int_{\mathbb{R}^2} e^{-i\varepsilon u \cdot \xi} \chi_B(u) \, du \\
&= \varepsilon^2 \lambda e^{-iy \cdot \xi} \int_{\mathbb{R}^2} e^{-iu \cdot (\varepsilon \xi)} \chi_B(u) \, du = \varepsilon^2 \lambda e^{-iy \cdot \xi} \hat{\chi}_B(\varepsilon \xi).
\end{aligned}
\]
It happens to be a fact that the Fourier transform of the characteristic function of the unit ball is
\[
\hat{\chi}_B(\xi) = \frac{2\pi J_1(|\xi|)}{|\xi|},
\]
where $J_1$ is the Bessel function of the first kind. Therefore
\[
\hat{f}(\xi) = \varepsilon \lambda e^{-iy \cdot \xi} \, 2\pi \frac{J_1(\varepsilon |\xi|)}{|\xi|},
\]
so under our assumption the zeros of the superimposed image of the star should be the zeros of $\hat{f}$, which are then exactly characterised by the zeros of the Bessel function, which are well known. Knowing then that the distance between the observed zeros would be the distance between the zeros of the Bessel function scaled by the radius $\varepsilon$, we can recover this radius.

4.2 Results in $L^2$

A question we have asked ourselves previously is when a function $f \in L^1([-\pi, \pi))$ is equal to its own Fourier series. In other words, when is
\[
f(x) = \sum_{n=-\infty}^{\infty} \hat{f}(n) e^{inx}?
\]

Recall first that $f \in L^2([-\pi, \pi))$ means that
\[
\int_{-\pi}^{\pi} |f(t)|^2 \, dt < \infty.
\]

Proposition 4.2.1. Suppose $e_1, e_2, \ldots, e_N$ is an orthonormal set in an inner product space. Then
\[
\left\| \sum_{n=1}^{N} a_n e_n \right\|^2 = \sum_{n=1}^{N} |a_n|^2
\]
for all $a_n \in \mathbb{C}$.

Proof. That this is true is quite obvious: expanding the left-hand side we get mostly mixed terms $\langle e_n, e_m \rangle$ with $n \neq m$, which by orthogonality are zero, and the remaining terms $\langle e_n, e_n \rangle$ are 1 by normality. $\square$

Proposition 4.2.2. Let $f$ be an element of an inner product space, and $\{ e_1, e_2, \ldots, e_N \}$ an orthonormal set. Then
\[
\left\| \sum_{n=1}^{N} \langle f, e_n \rangle e_n - f \right\|^2 = \|f\|^2 - \sum_{n=1}^{N} |\langle f, e_n \rangle|^2.
\]

Proof. By computation:
\[
\begin{aligned}
\left\| \sum_{n=1}^{N} \langle f, e_n \rangle e_n - f \right\|^2 &= \left\langle \sum_{n=1}^{N} \langle f, e_n \rangle e_n - f, \sum_{n=1}^{N} \langle f, e_n \rangle e_n - f \right\rangle \\
&= \left\langle \sum_{n=1}^{N} \langle f, e_n \rangle e_n, \sum_{n=1}^{N} \langle f, e_n \rangle e_n \right\rangle - \left\langle \sum_{n=1}^{N} \langle f, e_n \rangle e_n, f \right\rangle - \left\langle f, \sum_{n=1}^{N} \langle f, e_n \rangle e_n \right\rangle + \langle f, f \rangle \\
&= \sum_{n=1}^{N} |\langle f, e_n \rangle|^2 - 2 \sum_{n=1}^{N} \langle f, e_n \rangle \overline{\langle f, e_n \rangle} + \|f\|^2 \\
&= \sum_{n=1}^{N} |\langle f, e_n \rangle|^2 - 2 \sum_{n=1}^{N} |\langle f, e_n \rangle|^2 + \|f\|^2 = \|f\|^2 - \sum_{n=1}^{N} |\langle f, e_n \rangle|^2. \qquad \square
\end{aligned}
\]

Corollary 4.2.3 (Bessel's inequality). In the same setting,
\[
\sum_{n=1}^{N} |\langle f, e_n \rangle|^2 \leq \|f\|^2.
\]

Proof. This is immediate by the previous proposition since norms are nonnegative. $\square$

Example 4.2.4. For any $N$,
\[
\sum_{n=-N}^{N} |\hat{f}(n)|^2 \leq \|f\|_2^2,
\]
whereby
\[
\sum_{n=-\infty}^{\infty} |\hat{f}(n)|^2 \leq \|f\|_2^2.
\]
N
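For a concrete instance of Bessel's inequality, one can take $f(x) = x$ on $[-\pi, \pi)$, for which direct integration by parts gives $\hat{f}(n) = i(-1)^n/n$ for $n \neq 0$ and $\hat{f}(0) = 0$, while $\|f\|_2^2 = \frac{1}{2\pi} \int_{-\pi}^{\pi} x^2 \, dx = \pi^2/3$. A short Python check (this worked example is mine, not from the notes) shows the partial sums increasing toward, but never exceeding, $\|f\|_2^2$:

```python
import numpy as np

norm_sq = np.pi ** 2 / 3   # ||f||_2^2 for f(x) = x on [-pi, pi)

def bessel_sum(N):
    # Sum of |f_hat(n)|^2 over 0 < |n| <= N, with |f_hat(n)|^2 = 1/n^2
    n = np.arange(1, N + 1)
    return 2 * np.sum(1.0 / n ** 2)

partials = [bessel_sum(N) for N in (10, 100, 10_000)]
```

That the partial sums in fact converge to $\|f\|_2^2$ in this example is Bessel's equality (Parseval), which the coming lectures establish for complete orthonormal sets.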

Lecture 5 Hilbert Spaces

5.1 Fourier Series on $L^2$

Recall that $L^2([-\pi, \pi))$ is the set of all functions $f$ satisfying
\[
\|f\|_2 = \left( \frac{1}{2\pi} \int_{-\pi}^{\pi} |f(t)|^2 \, dt \right)^{1/2} < \infty,
\]
the norm being induced by the inner product
\[
\langle f, g \rangle = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \overline{g(t)} \, dt,
\]
and the space, being normed, is automatically metric, by
\[
d(f, g) = \|f - g\|_2.
\]

Definition 5.1.1 (Hilbert space). If $H$ is an inner product space such that the resulting metric space is complete, then $H$ is called a Hilbert space.

Recall moreover that a space being complete means that every Cauchy sequence converges in the space.

Definition 5.1.2 (Complete set). An orthonormal set $\{ e_\alpha \} \subseteq H$, with $H$ a Hilbert space, is called complete or maximal if $\langle f, e_\alpha \rangle = 0$ for every $\alpha$ implies that $f = 0$.

Example 5.1.3. Let $H = \mathbb{R}^3$, with $e_1 = (1, 0, 0)$, $e_2 = (0, 1, 0)$, and $e_3 = (0, 0, 1)$, as well as $f = (f_1, f_2, f_3)$. Then $\langle f, e_i \rangle = 0$ for $i = 1, 2, 3$ implies that $f = 0$. N

Proposition 5.1.4. Let $\{ e_\alpha \}$ be a complete orthonormal set in a Hilbert space $H$. Then

(i) $\|f\|_2^2 = \sum_\alpha |\langle f, e_\alpha \rangle|^2$ (Bessel's equality);

(ii) $f = \sum_\alpha \langle f, e_\alpha \rangle e_\alpha$ in $H$, by which we mean that
\[
\left\| f - \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha \right\|
\]
can be made arbitrarily small by choosing an appropriate finite set $S$ of indices;

(iii) if in addition $g \in H$, then
\[
\langle f, g \rangle = \sum_\alpha \hat{f}(\alpha) \overline{\hat{g}(\alpha)},
\]
where $\hat{f}(\alpha) = \langle f, e_\alpha \rangle$, and again we mean convergence in the above sense.

Example 5.1.5. Let $H = L^2([-\pi, \pi))$ and $e_n = e^{inx}$; then functions are in fact the limits of their Fourier series. N

Remark 5.1.6. We don't actually know yet that $\{ e^{inx} \} \subseteq L^2([-\pi, \pi))$ is complete, i.e. that $\langle f, e^{inx} \rangle = 0$ for all $n \in \mathbb{Z}$ implies that $f = 0$. This is intuitively plausible: the scalar products somehow measure how much $f$ oscillates at the given frequency, so if $f$ is constant (i.e. doesn't oscillate at all), then all scalar products except the one for $n = 0$ would be 0, and for this last one to be 0 as well, the constant would indeed have to be 0.

Proof. (ii) We know that
\[
\sum_\alpha |\langle f, e_\alpha \rangle|^2 \leq \|f\|_2^2,
\]
i.e. Bessel's inequality. Suppose, toward a contradiction, that there exists some $\varepsilon > 0$ such that
\[
\left\| f - \sum_{\alpha \in S} a_\alpha e_\alpha \right\| > \varepsilon
\]
for every finite set $S$ and all $a_\alpha \in \mathbb{C}$. Let
\[
M = \overline{\left\{ \sum_{\alpha \in S} a_\alpha e_\alpha \;\middle|\; S \text{ finite and } a_\alpha \in \mathbb{C} \right\}}.
\]
An elementary fact from linear analysis is that we can write $f = f_M + f_{M^\perp}$, where $f_M \in M$ and $\langle f_{M^\perp}, m \rangle = 0$ for all $m \in M$. Since $f \notin M$ (being a positive distance away from everything in the set before taking the closure, it remains outside afterward; just put a ball of radius $\varepsilon/2$ around it), we have $f_{M^\perp} \neq 0$. But $\langle f_{M^\perp}, e_\alpha \rangle = 0$ for every $\alpha$, since $e_\alpha \in M$, so by completeness $f_{M^\perp} = 0$, which is a contradiction. Therefore no positive $\varepsilon$ bounds the distance between $f$ and its Fourier sums from below, which is (ii).

A simple computation shows that
\[
\left\| f - \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha \right\| \leq \left\| f - \sum_{\alpha \in S} a_\alpha e_\alpha \right\|,
\]
i.e. this sort of quantity is minimised by choosing the Fourier coefficients as the $a_\alpha$. Moreover from last lecture
\[
\left\| f - \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha \right\|^2 = \|f\|^2 - \sum_{\alpha \in S} |\langle f, e_\alpha \rangle|^2,
\]
and by (ii) we can make the left-hand side arbitrarily small, and so we have (i).

Furthermore,
\[
\langle f, g \rangle - \left\langle \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha, g \right\rangle = \langle f, g \rangle - \sum_{\alpha \in S} \langle f, e_\alpha \rangle \overline{\langle g, e_\alpha \rangle} = \langle f, g \rangle - \sum_{\alpha \in S} \hat{f}(\alpha) \overline{\hat{g}(\alpha)},
\]
but
\[
\langle f, g \rangle - \left\langle \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha, g \right\rangle = \left\langle f - \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha, g \right\rangle,
\]
which by the Cauchy-Schwarz inequality is bounded in modulus by
\[
\left\| f - \sum_{\alpha \in S} \langle f, e_\alpha \rangle e_\alpha \right\| \cdot \|g\|,
\]
and this we can make arbitrarily small by (ii), and so (iii) follows. $\square$

Theorem 5.1.7 (Riesz-Fischer). (i) Given $f \in H$ and an orthonormal set $\{ e_\alpha \}$ indexed by $\alpha \in A$, the sequence $\{ \hat{f}(\alpha) \}_{\alpha \in A}$ belongs to $\ell^2(A)$.

(ii) Conversely, given a sequence $\{ a_\alpha \} \in \ell^2(A)$, the sum
\[
\sum_{\alpha \in A} a_\alpha e_\alpha
\]
defines an element of $H$.

Recall before we proceed that $\ell^2(A)$ is the set of square summable sequences indexed by $A$, by which we mean that the supremum of all finite square sums is finite.

Example 5.1.8. For instance,
\[
\ell^2(\mathbb{N}) = \left\{ \{ a_n \} \;\middle|\; \sum_{n=1}^{\infty} |a_n|^2 < \infty \right\}.
\]
N

Proof. (i) By Bessel's inequality,
\[
\sum_{\alpha \in A} |\langle f, e_\alpha \rangle|^2 \leq \|f\|^2,
\]
and so the sequence of Fourier coefficients is square summable, and therefore in $\ell^2(A)$.

(ii) Choose finite sets $B_n \subseteq A$ such that
\[
\sum_{\alpha \in A} |a_\alpha|^2 - \sum_{\alpha \in B_n} |a_\alpha|^2 \leq \frac{1}{2^n}.
\]
We can assume $B_{n+1} \supseteq B_n$. Then if $n > m$, since we are working with an orthonormal set,
\[
\left\| \sum_{\alpha \in B_n} a_\alpha e_\alpha - \sum_{\alpha \in B_m} a_\alpha e_\alpha \right\|^2 = \sum_{\alpha \in B_n \setminus B_m} |a_\alpha|^2 \leq \frac{1}{2^m},
\]
which can be made arbitrarily small, whereby these partial sums form a Cauchy sequence in $H$; since $H$ is a Hilbert space, the sequence must converge to an element of $H$. $\square$

Example 5.1.9. Consider a positive function $h$ on $[-1, 1]$ such that
\[
\int_{-1}^{1} h(t) \, dt = 1.
\]
Suppose further that
\[
\int_{-1}^{1} t^n h(t) \, dt < \infty
\]
for all $n = 0, 1, 2, \ldots$. Then the set $1, x, x^2, \ldots$ isn't orthonormal (with respect to the inner product weighted by $h$), but we can orthogonalise by Gram-Schmidt, i.e. let $v_0 = 1$,
\[
v_1 = x - \frac{\langle x, v_0 \rangle}{\|v_0\|^2} v_0,
\]
and
\[
v_2 = x^2 - \frac{\langle x^2, v_1 \rangle}{\|v_1\|^2} v_1 - \frac{\langle x^2, v_0 \rangle}{\|v_0\|^2} v_0,
\]
and so on. Finally normalise by $u_i = v_i / \|v_i\|$, and we have an orthonormal set of functions (which will be complete for some choices of $h$).

For instance, $h(x) = 1/2$ yields the Legendre polynomials, $h(x) = (1 - x)^\alpha (1 + x)^\beta$ produces Jacobi polynomials, a Gaussian $h$ gives Hermite functions, and so on. These arise as solutions to certain ordinary differential equations, as orthonormal bases for certain $L^2$ spaces, or from certain recurrence relations. N

Lecture 6 More on Hilbert Spaces

6.1 Haar Functions

Example 6.1.1. Consider functions on $[0, 1]$ with the inner product
\[
\langle f, g \rangle = \int_0^1 f(t) \overline{g(t)} \, dt.
\]

Let $h_0(x) = 1$, and let $h(x) = \chi_{[0,1/2)}(x) - \chi_{[1/2,1)}(x)$. Now we recursively define
\[
h_{j,k}(x) = 2^{j/2} h(2^j x - k)
\]
for all $j = 0, 1, 2, \ldots$, where for each $j$ we have $k \in \{ 0, 1, 2, \ldots, 2^j - 1 \}$. It is then straightforward to verify that
\[
\|h_{j,k}\|^2 = \int_0^1 |h_{j,k}(x)|^2 \, dx = 1
\]
and that
\[
\langle h_{j,k}, h_{l,m} \rangle = \int_0^1 h_{j,k}(x) h_{l,m}(x) \, dx
\]
is 0 if $(j, k) \neq (l, m)$. This is clear since if $j \neq l$, then either their supports are disjoint, in which case we get 0, or they aren't, but then the finer function averages to 0 on the intersection of their supports. Similarly if $j = l$ but $k \neq m$, then the supports are disjoint, and again we have 0. Hence the Haar functions form an orthonormal set. Moreover the set is complete, and any function in $L^2([0, 1])$ can be written as a linear sum of Haar functions. N
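The orthonormality claims in this example can be confirmed numerically on a dyadic midpoint grid, where the integrals of these step functions are computed exactly. A Python sketch (the particular index pairs below are arbitrary choices):

```python
import numpy as np

M = 2 ** 12
t = (np.arange(M) + 0.5) / M   # midpoints of a uniform partition of [0, 1]

def h(x):
    # the mother function: +1 on [0, 1/2), -1 on [1/2, 1), 0 elsewhere
    return np.where((x >= 0) & (x < 0.5), 1.0,
                    np.where((x >= 0.5) & (x < 1.0), -1.0, 0.0))

def haar(j, k):
    return 2 ** (j / 2) * h(2 ** j * t - k)

def inner(u, v):
    return np.mean(u * v)   # approximates the integral over [0, 1]

norm_sq = inner(haar(0, 0), haar(0, 0))   # should be 1
o1 = inner(haar(1, 0), haar(1, 1))        # disjoint supports: should be 0
o2 = inner(haar(0, 0), haar(2, 1))        # averages to 0 on the overlap
```

Because all breakpoints are dyadic rationals and the grid midpoints avoid them, the mean over the grid equals the integral exactly for these step functions.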

6.2 Fourier Transform on L2

We will consider functions in $L^1(\mathbb{R}^d) \cap L^2(\mathbb{R}^d)$.

Lemma 6.2.1. Let $f(x) = e^{-a|x|^2}$, with $a > 0$. Then
\[
\hat{f}(\xi) = \left( \frac{\pi}{a} \right)^{d/2} e^{-|\xi|^2/(4a)}.
\]
In other words, the Fourier transform of a Gaussian is (up to a constant) again a Gaussian.

Proof. It is pretty much by straightforward, but long, computation:
\[
\hat{f}(\xi) = \int_{\mathbb{R}^d} e^{-ix \cdot \xi} e^{-a|x|^2} \, dx = \int_{\mathbb{R}^d} e^{-i(x_1 \xi_1 + \ldots + x_d \xi_d) - a(x_1^2 + \ldots + x_d^2)} \, dx = \prod_{j=1}^{d} \int_{\mathbb{R}} e^{-ix_j \xi_j - a x_j^2} \, dx_j,
\]
meaning that it is sufficient to evaluate this integral in one variable. So we compute, by completing the square:
\[
\int_{\mathbb{R}} e^{-ix\xi - ax^2} \, dx = \int_{\mathbb{R}} e^{-a(x^2 + i\xi x/a)} \, dx = e^{-\xi^2/(4a)} \int_{\mathbb{R}} e^{-a(x + i\xi/(2a))^2} \, dx.
\]
We evaluate this by means of a contour integral of $g(z) = e^{-az^2}$ around the curve $C$ which is the rectangle with corners in $N$, $N + i\xi/(2a)$, $-N + i\xi/(2a)$, and $-N$. Then
\[
\oint_C g(z) \, dz = \int_{-N}^{N} e^{-ax^2} \, dx + i \int_0^{\xi/(2a)} e^{-a(N + is)^2} \, ds - \int_{-N}^{N} e^{-a(x + i\xi/(2a))^2} \, dx - i \int_0^{\xi/(2a)} e^{-a(-N + is)^2} \, ds.
\]
The integrals on the vertical parts both vanish as $N \to \infty$, since they are products of a decaying exponential and bounded terms. Moreover the whole thing is 0, since we are integrating an entire function over a simple, closed curve, and therefore
\[
\int_{\mathbb{R}} e^{-ax^2} \, dx = \int_{\mathbb{R}} e^{-a(x + i\xi/(2a))^2} \, dx,
\]
and the left-hand side is well known to be $\sqrt{\pi/a}$. Therefore
\[
\hat{f}(\xi) = \prod_{j=1}^{d} \int_{\mathbb{R}} e^{-ix_j \xi_j - a x_j^2} \, dx_j = \prod_{j=1}^{d} \left( \frac{\pi}{a} \right)^{1/2} e^{-\xi_j^2/(4a)} = \left( \frac{\pi}{a} \right)^{d/2} e^{-|\xi|^2/(4a)}. \qquad \square
\]
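Lemma 6.2.1 can be verified numerically in dimension one by direct quadrature (the parameter values below are arbitrary choices, not from the notes):

```python
import numpy as np

a, xi = 0.7, 1.3
x = np.linspace(-20.0, 20.0, 400_001)
dx = x[1] - x[0]

# f_hat(xi) = integral of e^{-i x xi} e^{-a x^2} dx, by a Riemann sum
# on a wide, fine grid; the integrand is negligible at the endpoints.
numeric = np.sum(np.exp(-1j * x * xi - a * x ** 2)) * dx

closed = np.sqrt(np.pi / a) * np.exp(-xi ** 2 / (4 * a))
```

For a rapidly decaying smooth integrand the uniform Riemann sum is extremely accurate, so `numeric` matches the closed form far beyond the asserted tolerance.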

Lecture 7 Inverse Fourier Transform

7.1 Undoing Fourier Transforms

It is helpful to know that we can 'move the hat' inside of integrals:

Lemma 7.1.1. If $f, g \in L^1(\mathbb{R}^d)$, then
\[
\int_{\mathbb{R}^d} \hat{f}(y) g(y) \, dy = \int_{\mathbb{R}^d} f(y) \hat{g}(y) \, dy.
\]

Proof. We prove it by straightforward computation:
\[
\int_{\mathbb{R}^d} \hat{f}(y) g(y) \, dy = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} e^{-iy \cdot x} f(x) \, dx \, g(y) \, dy,
\]
and by Fubini's theorem we can switch the order of integration, since $|f(x)||g(y)|$ is integrable on the product space, so this equals
\[
\int_{\mathbb{R}^d} f(x) \int_{\mathbb{R}^d} e^{-ix \cdot y} g(y) \, dy \, dx = \int_{\mathbb{R}^d} f(x) \hat{g}(x) \, dx. \qquad \square
\]

Definition 7.1.2 (Inverse Fourier transform). For $f \in L^1(\mathbb{R}^d)$ we define
\[
\check{f}(x) = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} e^{ix \cdot t} f(t) \, dt,
\]
the so-called inverse Fourier transform of $f$.

Theorem 7.1.3 (Fourier inversion theorem). If $f, \hat{f} \in L^1(\mathbb{R}^d)$ and $f$ is bounded, then
\[
f(x) = \check{\hat{f}}(x)
\]
almost everywhere.

Proof. It follows from computation and our two latest lemmas. Let $\varphi(t) = e^{ix \cdot t - \varepsilon^2 |t|^2}$, with $x$ fixed. Then
\[
\hat{\varphi}(\xi) = \int_{\mathbb{R}^d} e^{ix \cdot t - \varepsilon^2 |t|^2} e^{-i\xi \cdot t} \, dt = \int_{\mathbb{R}^d} e^{-i(\xi - x) \cdot t - \varepsilon^2 |t|^2} \, dt,
\]
which by the lemma from last lecture equals
\[
\left( \frac{\pi}{\varepsilon^2} \right)^{d/2} e^{-|\xi - x|^2/(4\varepsilon^2)}.
\]
Now for convenience write $g(\xi) = (4\pi)^{-d/2} e^{-|\xi|^2/4}$, normalised so that its integral over $\mathbb{R}^d$ is 1; then $\hat{\varphi}(\xi) = (2\pi)^d \varepsilon^{-d} g((x - \xi)/\varepsilon)$. Therefore, moving the hat,
\[
\int_{\mathbb{R}^d} f(y) \frac{1}{\varepsilon^d} g\left( \frac{x - y}{\varepsilon} \right) dy = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} f(y) \hat{\varphi}(y) \, dy = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(y) \varphi(y) \, dy = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(y) e^{ix \cdot y - \varepsilon^2 |y|^2} \, dy.
\]
Now let $\varepsilon \to 0$. The right-hand side tends to $\check{\hat{f}}(x)$, where we can pass the limit inside the integral since the integrand is dominated by the integrable function $|\hat{f}|$. For the left-hand side, suppose $f$ is continuous at $x$ and consider
\[
\left| \int_{\mathbb{R}^d} f(y) \frac{1}{\varepsilon^d} g\left( \frac{x - y}{\varepsilon} \right) dy - f(x) \right|,
\]
which, since
\[
\int_{\mathbb{R}^d} \frac{1}{\varepsilon^d} g\left( \frac{x - y}{\varepsilon} \right) dy = 1,
\]
is the same if we multiply $f(x)$ by the above kernel and combine the integrals; it is then bounded by
\[
\int_{\mathbb{R}^d} |f(y) - f(x)| \frac{1}{\varepsilon^d} g\left( \frac{x - y}{\varepsilon} \right) dy,
\]
and if we let $u = (x - y)/\varepsilon$ this becomes
\[
\int_{\mathbb{R}^d} |f(x - \varepsilon u) - f(x)| g(u) \, du.
\]
Letting $\varepsilon$ tend to 0, this tends to 0 (by dominated convergence, using that $f$ is bounded), meaning that $\check{\hat{f}}$ equals $f$ in $L^1$; but we want almost everywhere. Since convergence in $L^1$ implies almost everywhere convergence along a subsequence, there exists a sequence $\varepsilon_j$ tending to 0 for which the above convergence holds pointwise almost everywhere. $\square$

Theorem 7.1.4 (Plancherel's theorem). Suppose $f, g \in L^1 \cap L^2(\mathbb{R}^d)$. Then

(i) $\displaystyle \int_{\mathbb{R}^d} f(x) \overline{g(x)} \, dx = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(\xi) \overline{\hat{g}(\xi)} \, d\xi$,

(ii) $\displaystyle \int_{\mathbb{R}^d} |f(x)|^2 \, dx = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} |\hat{f}(y)|^2 \, dy$.

Proof. We compute the first part:
\[
\begin{aligned}
\int_{\mathbb{R}^d} f(x) \overline{g(x)} \, dx &= \int_{\mathbb{R}^d} \check{\hat{f}}(x) \overline{g(x)} \, dx = \int_{\mathbb{R}^d} \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} e^{iy \cdot x} \hat{f}(y) \, dy \, \overline{g(x)} \, dx \\
&= \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(y) \overline{\int_{\mathbb{R}^d} e^{-ix \cdot y} g(x) \, dx} \, dy = \frac{1}{(2\pi)^d} \int_{\mathbb{R}^d} \hat{f}(y) \overline{\hat{g}(y)} \, dy.
\end{aligned}
\]
The second part follows immediately by letting $g = f$. $\square$
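Plancherel's theorem can be illustrated numerically for the Gaussian of Lemma 6.2.1, whose transform we know in closed form (the parameter values are arbitrary choices, not from the notes):

```python
import numpy as np

a = 0.5
x = np.linspace(-20.0, 20.0, 200_001)
dx = x[1] - x[0]
f = np.exp(-a * x ** 2)

xi = np.linspace(-20.0, 20.0, 200_001)
dxi = xi[1] - xi[0]
fhat = np.sqrt(np.pi / a) * np.exp(-xi ** 2 / (4 * a))  # closed form, d = 1

lhs = np.sum(np.abs(f) ** 2) * dx                       # integral of |f|^2
rhs = np.sum(np.abs(fhat) ** 2) * dxi / (2 * np.pi)     # (2 pi)^{-1} integral of |f_hat|^2
```

Both sides equal $\sqrt{\pi/(2a)}$ analytically, and the Riemann sums reproduce this to high accuracy since the integrands decay rapidly.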

Corollary 7.1.5. Let $B(0, r)$ be the ball of radius $r$ centred on 0 in $\mathbb{R}^d$, and let $f \in L^1 \cap L^2(\mathbb{R}^d)$. Then
\[
\left( \chi_{B(0,r)} \hat{f} \right)^{\vee} \to f
\]
in $L^2$ as $r \to \infty$. Compare this with how partial sums of Fourier series converge to the function in $L^2$.

One might ask similar questions for $L^p \cap L^2$. In dimension one this is true, but in dimensions 2 or greater it is not. This is due to Fefferman in the seventies, part of what earned him the Fields Medal.

Lecture 8 Fejér Kernels

8.1 Fejér Kernels and Approximate Identities

One of our fondest hopes in this course is that the Fourier series of a function converges, in some reasonable way, to the function itself. Another way of asking whether this happens is to study the partial sums
\[
S_n f(x) = \sum_{k=-n}^{n} \hat{f}(k) e^{ikx}
\]
and ask whether $S_n f(x) \to f(x)$ in some sense of convergence, be it in $L^p$ norm, almost everywhere, uniformly, and so on. What Fejér showed is that if we define
\[
\sigma_n = \frac{S_0 + S_1 + \ldots + S_n}{n + 1},
\]
i.e. the average of the first $n + 1$ partial sums, then $\sigma_n f \to f$ in $L^p$ and almost everywhere.

In general, suppose $\{ a_n \}_{n=0}^{\infty} \subseteq \mathbb{R}$ and $a_n \to L$; then the averages $\sigma_n = (a_0 + a_1 + \ldots + a_n)/(n + 1)$ converge to $L$ as well. We say that a sequence for which $\sigma_n$ converges is Cesàro summable. The converse is in general not true:

Example 8.1.1. Let $a_n = (-1)^n$. Clearly this does not converge to anything; it jumps between 1 and $-1$ indefinitely. However it does converge in the Cesàro sense, since $\sigma_0 = 1$, $\sigma_1 = 0$, $\sigma_2 = 1/3$, $\sigma_3 = 0$, $\sigma_4 = 1/5$, and so on, whence $\sigma_n \to 0$. N

There are other curious ways to sum things:

Example 8.1.2. For a sequence $a_0, a_1, \ldots$, let $0 < r < 1$ and consider $S(r) = a_0 + a_1 r + a_2 r^2 + \ldots$. If
\[
\lim_{r \to 1} S(r)
\]
exists, we say that the sequence $a_0, a_1, \ldots$ is Abel summable. N

Let $f \in L^1([-\pi, \pi))$; then
\[
S_n f(x) = \sum_{k=-n}^{n} \hat{f}(k) e^{ikx} = \sum_{k=-n}^{n} \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) e^{-ikt} \, dt \, e^{ikx} = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{k=-n}^{n} e^{ik(x-t)} \, dt.
\]

We identify the inner sum
\[
D_n(x) = \sum_{k=-n}^{n} e^{ikx},
\]
called the Dirichlet kernel. Therefore
\[
S_k f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) D_k(x - t) \, dt
\]
and
\[
\sigma_n f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) \sum_{k=0}^{n} \frac{D_k(x - t)}{n + 1} \, dt.
\]
Now we once more identify the inner sum as a new piece of notation; this time it will turn out to be very useful, after some algebra:
\[
K_n(x) = \sum_{k=0}^{n} \frac{D_k(x)}{n + 1}
\]
is the so-called Fejér kernel, and using it we have
\[
\sigma_n f(x) = \frac{1}{2\pi} \int_{-\pi}^{\pi} f(t) K_n(x - t) \, dt.
\]

We will now spend some time rewriting the Fejér kernel in a more practical way:
\[
K_n(x) = \frac{e^{i0x} + \sum_{k=-1}^{1} e^{ikx} + \sum_{k=-2}^{2} e^{ikx} + \ldots + \sum_{k=-n}^{n} e^{ikx}}{n + 1},
\]
and if we just count how many times each $e^{ikx}$ appears, we can clearly rewrite it as
\[
K_n(x) = \sum_{l=-n}^{n} \frac{(n + 1) - |l|}{n + 1} e^{ilx}.
\]
Now as an aside, note that
\[
\left( \sum_{j=0}^{n} e^{i(j - n/2)x} \right)^2 = \sum_{l=0}^{2n} \sum_{j + k = l} e^{i(l - n)x} = \sum_{l=0}^{2n} \left( (n + 1) - |l - n| \right) e^{i(l - n)x},
\]
and therefore
\[
K_n(x) = \frac{1}{n + 1} \left( \sum_{j=0}^{n} e^{i(j - n/2)x} \right)^2 = \frac{1}{n + 1} \left( e^{-inx/2} \sum_{j=0}^{n} e^{ijx} \right)^2,
\]
and the sum in the last step is geometric, so this is the same as
\[
\frac{1}{n + 1} \left( e^{-inx/2} \frac{1 - e^{i(n+1)x}}{1 - e^{ix}} \right)^2,
\]
and by multiplying and dividing by $e^{-ix/2}$ we get
\[
\frac{1}{n + 1} \left( \frac{e^{-i(n+1)x/2} - e^{i(n+1)x/2}}{e^{-ix/2} - e^{ix/2}} \right)^2 = \frac{1}{n + 1} \left( \frac{-2i \sin((n+1)x/2)}{-2i \sin(x/2)} \right)^2,
\]
and simplifying, this is just
\[
K_n(x) = \frac{1}{n + 1} \cdot \frac{\sin^2\left( \frac{n + 1}{2} x \right)}{\sin^2\left( \frac{x}{2} \right)}.
\]
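The algebra above can be double-checked numerically: the coefficient form and the closed sine form of $K_n$ should agree away from $x = 0$, and the mean of $K_n$ over the period should be 1. A Python sketch (the choices of $n$ and of evaluation points are arbitrary):

```python
import numpy as np

def fejer_sum(n, x):
    # K_n(x) = sum over l of ((n + 1 - |l|)/(n + 1)) e^{ilx}, |l| <= n
    l = np.arange(-n, n + 1)
    return np.real(np.sum((n + 1 - np.abs(l)) / (n + 1) * np.exp(1j * l * x)))

def fejer_closed(n, x):
    return np.sin((n + 1) * x / 2) ** 2 / np.sin(x / 2) ** 2 / (n + 1)

n = 8
xs = np.linspace(0.1, 3.0, 7)
err = max(abs(fejer_sum(n, xk) - fejer_closed(n, xk)) for xk in xs)

# Midpoint grid avoids x = 0, where the closed form has a removable singularity.
grid = -np.pi + 2 * np.pi * (np.arange(512) + 0.5) / 512
mean_val = np.mean([fejer_sum(n, xk) for xk in grid])   # approximates (1/2 pi) * integral of K_n
```

The two forms agree to machine precision, the values are nonnegative, and the mean is 1, previewing the three approximate-identity properties proved next.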

Lemma 8.1.3 (Properties of the Fejer kernel). (i) Kn(x) ≥ 0 for all x ∈ [−π, π).

(ii) Fix a δ > 0. Then Kn(x) → − as n → ∞ uniformly on [−π, π) \ (−δ, δ). 1 Z π (iii) Kn(x) dx = 1. 2π −π Proof. (i) is quite clear, since we have the square of a real number. (ii) is reconciled by noting that the sin in the numerator is bounded by 1, and in the bottom we can take x = δ since |x| > δ means that

\[ K_n(x) = \frac{1}{n+1} \frac{\sin^2\big(\frac{n+1}{2}x\big)}{\sin^2\big(\frac{x}{2}\big)} \le \frac{1}{n+1} \frac{1}{\sin^2(\delta/2)}, \]
which goes to $0$ uniformly as $n \to \infty$, since the estimate is independent of $x$ so long as $|x| \ge \delta$.

(iii) We just compute, recalling one of the earlier forms of the Fejér kernel:

\[ \frac{1}{2\pi} \int_{-\pi}^{\pi} K_n(x)\, dx = \frac{1}{2\pi} \int_{-\pi}^{\pi} \sum_{k=-n}^{n} \frac{n+1-|k|}{n+1} e^{ikx} \, dx = \sum_{k=-n}^{n} \frac{1}{2\pi} \frac{n+1-|k|}{n+1} \int_{-\pi}^{\pi} e^{ikx}\, dx, \]
and this last integral is $0$ unless $k = 0$, so the whole thing equals
\[ \frac{1}{2\pi} \cdot \frac{n+1}{n+1} \cdot 2\pi = 1. \]

Remark 8.1.4. Any sequence of functions satisfying (i)–(iii) is called an approximate identity.
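The three approximate-identity properties can be verified numerically from the closed form; the sketch below checks nonnegativity, the mean (up to quadrature error), and the uniform bound off $(-\delta,\delta)$ with $\delta = 0.5$:

```python
import numpy as np

def fejer(n, x):
    # closed form of the Fejer kernel
    return (np.sin((n + 1) * x / 2) / np.sin(x / 2)) ** 2 / (n + 1)

x = np.linspace(-np.pi, np.pi, 20001)
x = x[np.abs(x) > 1e-6]          # avoid the removable singularity at x = 0
for n in (4, 16, 64):
    Kn = fejer(n, x)
    assert np.all(Kn >= 0)                                        # property (i)
    mass = np.sum((Kn[1:] + Kn[:-1]) / 2 * np.diff(x)) / (2 * np.pi)
    assert abs(mass - 1) < 1e-2                                   # property (iii)
    off = Kn[np.abs(x) > 0.5]                                     # property (ii), delta = 0.5
    assert off.max() <= 1 / ((n + 1) * np.sin(0.25) ** 2)
```

The last assertion is exactly the estimate from the proof of (ii): off $(-\delta,\delta)$ the kernel is at most $1/((n+1)\sin^2(\delta/2))$.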

Theorem 8.1.5. (i) If $f \in L^p([-\pi,\pi))$, $1 \le p < \infty$, then $\sigma_n f \to f$ in the $L^p([-\pi,\pi))$ norm, i.e.

\[ \|\sigma_n f - f\|_p = \Big( \frac{1}{2\pi} \int_{-\pi}^{\pi} |\sigma_n f(x) - f(x)|^p \, dx \Big)^{1/p} \to 0 \]
as $n \to \infty$.

(ii) If $f \in L^1$, and $f$ is continuous at $x$, then $\sigma_n f(x) \to f(x)$.

(iii) If $f \in L^1([-\pi,\pi))$ then $\sigma_n f(x) \to f(x)$ almost everywhere.

Remark 8.1.6. The proof of this relies only on the fact that $K_n(x)$ is an approximate identity, and on no other special properties of $K_n(x)$. Therefore any other approximate identity has the same properties.

Remark 8.1.7. The answer to our original question—actual convergence of the partial sums, not of their means—has been the subject of much study. Carleson proved that $S_n f(x) \to f(x)$ almost everywhere in $L^2$, and Hunt later proved the same in $L^p$ for $1 < p < \infty$. It was shown to be false by Kolmogorov for $p = 1$.

Lecture 9 Convergence of Cesàro Means

9.1 Convergence of Fourier Sums We prove the theorem stated at the end of last lecture.

Theorem 9.1.1. (i) If $f \in L^p([-\pi,\pi))$, $1 \le p < \infty$, then $\sigma_n f \to f$ in the $L^p([-\pi,\pi))$ norm, i.e.

\[ \|\sigma_n f - f\|_p = \Big( \frac{1}{2\pi} \int_{-\pi}^{\pi} |\sigma_n f(x) - f(x)|^p \, dx \Big)^{1/p} \to 0 \]
as $n \to \infty$.

(ii) If $f \in L^1$, and $f$ is continuous at $x$, then $\sigma_n f(x) \to f(x)$.

(iii) If $f \in L^1([-\pi,\pi))$ then $\sigma_n f(x) \to f(x)$ almost everywhere.

Proof. We start by proving (ii). Fix an $x$. Then

\begin{align*}
\sigma_n f(x) - f(x) &= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) K_n(x-t)\, dt - f(x) \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x-t) K_n(t)\, dt - \frac{1}{2\pi}\int_{-\pi}^{\pi} f(x) K_n(t)\, dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \big(f(x-t) - f(x)\big) K_n(t)\, dt.
\end{align*}
Let $\varepsilon > 0$. By continuity there exists a $\delta > 0$ such that if $|t| < \delta$ then $|f(x-t) - f(x)| < \varepsilon$. By the second property of approximate identities, we can choose $n$ large enough that $|K_n(t)| < \varepsilon / (\|f\|_1 + |f(x)|)$ for all $t \in [-\pi,\pi) \setminus (-\delta,\delta)$. Then
\begin{align*}
|\sigma_n f(x) - f(x)| &\le \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x-t) - f(x)| K_n(t)\, dt \\
&= \frac{1}{2\pi}\int_{-\delta}^{\delta} |f(x-t) - f(x)| K_n(t)\, dt + \frac{1}{2\pi}\int_{[-\pi,\pi)\setminus(-\delta,\delta)} |f(x-t) - f(x)| K_n(t)\, dt \\
&\le \varepsilon \cdot 1 + \frac{\varepsilon(\|f\|_1 + |f(x)|)}{\|f\|_1 + |f(x)|} = 2\varepsilon.
\end{align*}
For (i), we compute and cleverly use Jensen's inequality at one point:

\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} |\sigma_n f(x) - f(x)|^p \, dx = \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big| \frac{1}{2\pi}\int_{-\pi}^{\pi} \big(f(x-t)-f(x)\big) K_n(t)\, dt \Big|^p dx \le \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x-t)-f(x)|^p K_n(t)\, dt\, dx. \]

Here we have used Jensen's inequality, noting that $d\mu = K_n(t)/(2\pi)\, dt$ is a measure with total mass $1$ and that $|\cdot|^p$ is convex. Switching the order of integration, this equals
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} \underbrace{\frac{1}{2\pi}\int_{-\pi}^{\pi} |f(x-t)-f(x)|^p\, dx}_{= h(t)} \, K_n(t)\, dt. \]

We therefore have
\[ \frac{1}{2\pi}\int_{-\pi}^{\pi} h(t) K_n(t)\, dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} h(0-t) K_n(t)\, dt, \]
but this goes to $h(0) = 0$ as $n \to \infty$, by (ii).

(iii) We have
\begin{align*}
|\sigma_n f(x)| &= \Big| \frac{1}{2\pi}\int_{-\pi}^{\pi} K_n(t) f(x-t)\, dt \Big| \le \frac{1}{2\pi}\int_{-\pi}^{\pi} K_n(t) |f(x-t)|\, dt \\
&= \frac{1}{2\pi}\int_{-\pi}^{\pi} \Big( \int_0^{K_n(t)} dr \Big) |f(x-t)|\, dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} \int_0^{\infty} \chi_{[0,K_n(t))}(r)\, dr\, |f(x-t)|\, dt \\
&= \int_0^{\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} \chi_{[0,K_n(t))}(r) |f(x-t)|\, dt\, dr.
\end{align*}

Now $\chi_{[0,K_n(t))}(r)$ is the same as $\chi_{[r,\infty)}(K_n(t))$, meaning that we have
\[ \int_0^{\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} \chi_{[r,\infty)}(K_n(t)) |f(x-t)|\, dt\, dr. \]

Letting $I_r$ be the subset of $[-\pi,\pi)$ where that characteristic function is $1$, we have
\[ \int_0^{\infty} \frac{|I_r|}{2\pi} \cdot \frac{1}{|I_r|} \int_{I_r} |f(x-t)|\, dt\, dr. \]
The inner average is bounded by the maximal average of $f$, defined by
\[ Mf(x) = \sup_{r>0} \frac{1}{|B_r(x)|} \int_{B_r(x)} |f(t)|\, dt, \]
which by a theorem of Hardy and Littlewood is in $L^p$ if $f$ is. So our expression is bounded by

\[ Mf(x) \|K_n\|_1 = Mf(x). \]

We have thus shown that $|\sigma_n f(x)| \le Mf(x)$ for every $n$. To use this, let

\[ Tf(x) = \limsup_{n\to\infty} |\sigma_n f(x) - f(x)|. \]

If we can show that $Tf(x) = 0$ almost everywhere, we are done. Let $N \in \mathbb{N}$, and choose a continuous $g$ such that
\[ \|f-g\|_1 = \frac{1}{2\pi}\int_{-\pi}^{\pi} |f(t)-g(t)|\, dt < \frac{1}{N}. \]

In fact (ii) essentially shows that $\sigma_n g \to g$ uniformly. Then

\[ |\sigma_n f(x) - f(x)| = |\sigma_n(f-g)(x) + \sigma_n g(x) - (f-g)(x) - g(x)| \]

\[ \le |\sigma_n(f-g)(x)| + |\sigma_n g(x) - g(x)| + |(f-g)(x)| \le M(f-g)(x) + |\sigma_n g(x) - g(x)| + |(f-g)(x)|. \]

Taking $\limsup$, we get
\[ Tf(x) \le M(f-g)(x) + |f(x)-g(x)| \]
and
\[ \{\, x \mid Tf(x) > \varepsilon \,\} \subset \{\, x \mid M(f-g)(x) > \tfrac{\varepsilon}{2} \,\} \cup \{\, x \mid |f(x)-g(x)| > \tfrac{\varepsilon}{2} \,\}, \]
and taking measures of these sets we have that

\[ |\{\, x \mid Tf(x) > \varepsilon \,\}| \le \frac{C/N}{\varepsilon/2} + \frac{1/N}{\varepsilon/2}, \]
using the weak $(1,1)$ bound for $M$ on the first set and Chebyshev's inequality on the second.

Now let $N \to \infty$; then $|\{\, x \mid Tf(x) > \varepsilon \,\}| = 0$. Taking a countable sequence of $\varepsilon_n$ going to $0$, we get $|\{\, x \mid Tf(x) > 0 \,\}| = 0$, and we're done.
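The convergence of the Cesàro means can be watched numerically. The sketch below (our own construction, not from the notes) builds $\sigma_n f$ for the square wave $f = \operatorname{sgn}(x)$ from its Fourier coefficients $\hat f(k) = \frac{2}{i\pi k}$ for odd $k$, and checks that the $L^1$ error shrinks with $n$:

```python
import numpy as np

def cesaro_mean_square_wave(n, x):
    # sigma_n f for f = sign(x) on [-pi, pi): Fourier series sum_{odd k} (4/(pi k)) sin(kx),
    # with the Cesaro weights (1 - k/(n+1)) on the frequencies |k| <= n
    s = np.zeros_like(x)
    for k in range(1, n + 1, 2):   # only odd k contribute
        s += (1 - k / (n + 1)) * (4 / (np.pi * k)) * np.sin(k * x)
    return s

x = np.linspace(-np.pi, np.pi, 4001)
f = np.sign(x)
errs = [np.mean(np.abs(cesaro_mean_square_wave(n, x) - f)) for n in (4, 64)]
assert errs[1] < errs[0]   # the Cesaro means improve with n
assert errs[1] < 0.3       # and are already close for n = 64
```

Note that $\sigma_n f$ stays between $-1$ and $1$ (no Gibbs overshoot), in contrast to the partial sums $S_n f$, which is one payoff of averaging.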

Lecture 10 Toward Convergence of Partial Sums

10.1 Dirichlet Kernels Recall that

\[ S_n f(x) = \sum_{k=-n}^{n} \hat f(k) e^{ikx} = \sum_{k=-n}^{n} \Big( \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) e^{-ikt}\, dt \Big) e^{ikx} = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) \sum_{k=-n}^{n} e^{ik(x-t)}\, dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) D_n(x-t)\, dt, \]
where
\[ D_n(s) = \sum_{k=-n}^{n} e^{iks} \]
is the Dirichlet kernel. We can rewrite:

\[ e^{is/2} D_n(s) - e^{-is/2} D_n(s) = \sum_{k=-n}^{n} e^{i(k+1/2)s} - \sum_{k=-n}^{n} e^{i(k-1/2)s} = e^{i(n+1/2)s} - e^{-i(n+1/2)s}. \]

Therefore

\[ D_n(s) = \frac{e^{i(n+1/2)s} - e^{-i(n+1/2)s}}{e^{is/2} - e^{-is/2}} = \frac{2i\sin\big((n+\tfrac12)s\big)}{2i\sin\big(\tfrac{s}{2}\big)} = \frac{\sin\big((n+\tfrac12)s\big)}{\sin\big(\tfrac{s}{2}\big)}. \]
So in all
\[ S_n f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) \frac{\sin\big((n+\tfrac12)(x-t)\big)}{\sin\big(\tfrac{x-t}{2}\big)}\, dt. \]

Remark 10.1.1. Note that the Dirichlet kernel Dn is not an approximate iden- tity; certainly it changes sign, and also

\begin{align*}
\frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(s)|\, ds &= \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{|\sin((n+\tfrac12)s)|}{|\sin(\tfrac{s}{2})|}\, ds \ge \frac{1}{2\pi}\int_{-\pi}^{\pi} \frac{|\sin((n+\tfrac12)s)|}{|s|/2}\, ds \\
&= \frac{1}{\pi}\int_{-(n+1/2)\pi}^{(n+1/2)\pi} \frac{|\sin u|}{|u|}\, du = \frac{2}{\pi}\int_0^{(n+1/2)\pi} \frac{|\sin u|}{u}\, du \\
&\ge \frac{2}{\pi}\sum_{k=0}^{n-1} \int_{k\pi}^{(k+1)\pi} \frac{|\sin u|}{u}\, du \ge \frac{2}{\pi}\sum_{k=0}^{n-1} \frac{1}{(k+1)\pi}\int_{k\pi}^{(k+1)\pi} |\sin u|\, du \\
&= \frac{2}{\pi^2}\sum_{k=0}^{n-1} \frac{2}{k+1} \approx \frac{4}{\pi^2}\log n,
\end{align*}
meaning that the $L^1$ norm of the Dirichlet kernel diverges as $n \to \infty$. There are now two principal things we wish to discuss. First: convergence of the partial sums for continuous functions.
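Before moving on, the logarithmic growth of these so-called Lebesgue constants can be observed numerically; the sketch below integrates $|D_n|$ by a Riemann sum (details of the grid are our choice):

```python
import numpy as np

def dirichlet_l1_norm(n, m=200001):
    # (1/2pi) * integral of |D_n| over [-pi, pi), D_n(s) = sin((n+1/2)s)/sin(s/2);
    # D_n is even, so integrate over (0, pi] and double
    s = np.linspace(1e-9, np.pi, m)
    Dn = np.sin((n + 0.5) * s) / np.sin(s / 2)
    return 2 * np.sum(np.abs(Dn)) * (np.pi / m) / (2 * np.pi)

ns = (8, 32, 128)
norms = [dirichlet_l1_norm(n) for n in ns]
assert norms[0] < norms[1] < norms[2]   # the L^1 norms grow with n...
ratios = [norms[i] / np.log(ns[i]) for i in range(3)]
# ...roughly like (4/pi^2) log n plus a bounded term
assert all(0.4 < r < 1.5 for r in ratios)
```

The classical asymptotic is $\frac{1}{2\pi}\|D_n\|_{L^1} = \frac{4}{\pi^2}\log n + O(1)$, so the ratios above hover near $4/\pi^2 \approx 0.405$ for large $n$.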

10.2 Convergence for Continuous Functions

For $f \in C([-\pi,\pi))$, set

\[ T_n f = S_n f(0) = \sum_{k=-n}^{n} e^{ik\cdot 0} \hat f(k) = \sum_{k=-n}^{n} \hat f(k). \]
Note that
\[ |\hat f(k)| = \Big| \frac{1}{2\pi}\int_{-\pi}^{\pi} f(t) e^{-ikt}\, dt \Big| \le \|f\|_{\infty} \]
if $f$ is bounded. Then also

\[ |T_n f| \le (2n+1)\|f\|_{\infty}. \]

In other words every $T_n$ is a bounded linear functional from $C([-\pi,\pi))$ to $\mathbb{C}$; however, the bound grows with $n$. We'll now do something clever: for a fixed $n$, construct the function $g$ with
\[ g(t) = \begin{cases} 1, & \text{if } D_n(t) \ge 0, \\ -1, & \text{if } D_n(t) < 0. \end{cases} \]
In other words $g$ is a bunch of line segments at $y = 1$ and $y = -1$, jumping between the two. Certainly $g$ is discontinuous at these jumps, but we can approximate it to any desired accuracy by $g_j \in C([-\pi,\pi))$ with $\|g_j\|_{\infty} \le 1$ and $g_j(t) \to g(t)$ pointwise, connecting the line segments with steeper and steeper lines for each $j$. Then
\[ \lim_{j\to\infty} T_n g_j = \lim_{j\to\infty} S_n g_j(0) = \lim_{j\to\infty} \frac{1}{2\pi}\int_{-\pi}^{\pi} g_j(t) D_n(0-t)\, dt = \frac{1}{2\pi}\int_{-\pi}^{\pi} |D_n(t)|\, dt \approx \log(n) \]
by dominated convergence (recall that $D_n$ is even). Therefore
\[ \|T_n\| = \sup_{g \in C([-\pi,\pi))} \frac{|T_n g|}{\|g\|_{\infty}} \gtrsim \log(n). \]
We therefore have a Banach space $C([-\pi,\pi))$ carrying a family of bounded linear functionals $T_n$ whose norms are unbounded. Recall the following from functional analysis:

Theorem 10.2.1 (Uniform boundedness principle). Suppose $X$ is a Banach space and $T_\alpha$, $\alpha \in \Lambda$, is a family of bounded linear functionals on $X$, i.e. for all $\alpha$

\[ \|T_\alpha\| = \sup_{\|x\| \le 1} |T_\alpha x| < \infty. \]

Then either
\[ \sup_{\alpha \in \Lambda} |T_\alpha x| = \infty \]
for all $x$ in a dense subset of $X$, or there exists an $M$ such that $\|T_\alpha\| \le M$ for all $\alpha \in \Lambda$.

In our case $\|T_n\| \ge c\log(n)$ (with some suitable constant $c$) for all $n$, and so by the uniform boundedness principle

\[ \sup_n |T_n f| = \sup_n |S_n f(0)| = \infty \]
for all $f$ in a dense subset of $C([-\pi,\pi))$. In other words $S_n f(0)$ doesn't converge to $f(0)$ for all $f$ in some dense subset, so even for nice, continuous functions we are very far indeed from having pointwise convergence of the Fourier series. Next time we'll tackle the same problem with convergence in $L^p$ instead.

Lecture 11 Convergence in Lp

Last time we established that there exists a dense subset of $C([-\pi,\pi))$ on which $S_n f(0)$ diverges. Recall that
\[ S_n f(x) = \frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(x-t) f(t)\, dt \]
with
\[ D_n(t) = \frac{\sin\big((n+\tfrac12)t\big)}{\sin\big(\tfrac{t}{2}\big)}. \]

11.1 Convergence in $L^p$

Theorem 11.1.1. If $f \in L^p([-\pi,\pi))$, with $1 < p < \infty$, then

\[ \|S_n f - f\|_p \to 0 \]
as $n \to \infty$.

In order to study this, we will make use of the conjugate series of $f$, defined as
\[ \tilde f \sim \sum_{k=-\infty}^{\infty} -i\,\mathrm{sgn}(k)\, \hat f(k)\, e^{ikx}. \]
This turns out to be an interesting construction:

Theorem 11.1.2 (M. Riesz). Given $f \in L^p([-\pi,\pi))$, $1 < p < \infty$, the conjugate series defines a unique function in $L^p$, i.e. there exists a unique $\tilde f \in L^p([-\pi,\pi))$ such that $\hat{\tilde f}(k) = -i\,\mathrm{sgn}(k)\hat f(k)$ for every $k$. Furthermore there exists a constant $C_p$ such that $\|\tilde f\|_p \le C_p\|f\|_p$.

Remark 11.1.3. Given a harmonic function $u$ on the unit disc, and assuming $u$ is somewhat well behaved, it has boundary values on the unit circle. It turns out then that
\[ \lim_{r\to 1^-} u(re^{i\theta}) = f(\theta), \]
and with $v$ being the harmonic conjugate of $u$ we have $\lim_{r\to 1^-} v(re^{i\theta}) = \tilde f(\theta)$. That is to say, this conjugate series does not come from nowhere!

For $f \in L^p([-\pi,\pi))$, define $P_- f = (f - i\tilde f)/2$ and $P_+ f = (f + i\tilde f)/2$. Then
\[ \|P_- f\|_p \le \frac{\|f\|_p}{2} + \frac{\|\tilde f\|_p}{2} \le \frac{\|f\|_p}{2} + \frac{C_p\|f\|_p}{2} = \frac{1+C_p}{2}\|f\|_p, \]

so $P_- \colon L^p([-\pi,\pi)) \to L^p([-\pi,\pi))$ is a bounded operator, since

\[ \|P_-\| = \sup_{f \in L^p} \frac{\|P_- f\|_p}{\|f\|_p} \le \frac{1+C_p}{2}. \]

In the same way $P_+$ is a bounded operator. We call these $P_-$ and $P_+$ because they are projections:

\[ P_+ f \sim \sum_{k=-\infty}^{\infty} \frac{\hat f(k)}{2} e^{ikx} + i\sum_{k=-\infty}^{\infty} \frac{-i\,\mathrm{sgn}(k)\hat f(k)}{2} e^{ikx} = \sum_{k=-\infty}^{\infty} \frac{\hat f(k) + \mathrm{sgn}(k)\hat f(k)}{2} e^{ikx} = \sum_{k=0}^{\infty} \hat f(k) e^{ikx}. \]

If we now apply $P_+$ again,

\[ P_+ \circ P_+ (f) \sim \sum_{k=0}^{\infty} \frac{\hat f(k)}{2} e^{ikx} + i\sum_{k=0}^{\infty} \frac{-i\,\mathrm{sgn}(k)\hat f(k)}{2} e^{ikx} = \sum_{k=0}^{\infty} \hat f(k) e^{ikx} = P_+ f. \]

In other words, $P_+$ sets negative Fourier coefficients to $0$, and $P_-$ sets nonnegative ones to $0$. The reason this is interesting is this: $S_n f$ is also the Fourier series of $f$ with a bunch of coefficients set to $0$;

\begin{align*}
S_n f(x) &= \sum_{k=-n}^{n} \hat f(k) e^{ikx} = e^{ix(n+1)} \sum_{k=-n}^{n} \hat f(k) e^{ix(k-n-1)} \\
&= e^{ix(n+1)} P_-\Big( \sum_{k=-n}^{\infty} \hat f(k) e^{ix(k-n-1)} \Big) \\
&= e^{ix(n+1)} P_-\Big( e^{-ix(2n+1)} \sum_{k=-n}^{\infty} \hat f(k) e^{ix(k+n)} \Big) \\
&= e^{ix(n+1)} P_-\Big( e^{-ix(2n+1)} P_+\Big( \sum_{k=-\infty}^{\infty} \hat f(k) e^{ix(k+n)} \Big) \Big) \\
&= e^{ix(n+1)} P_-\Big( e^{-ix(2n+1)} P_+\Big( e^{inx} \sum_{k=-\infty}^{\infty} \hat f(k) e^{ikx} \Big) \Big),
\end{align*}
where inserting $P_-$ is harmless since the frequencies $k-n-1$ are negative exactly for $k \le n$, and inserting $P_+$ is harmless since the frequencies $k+n$ are nonnegative exactly for $k \ge -n$,

which means that $\|S_n f\|_p \le C_p^2 \|f\|_p$, with a constant independent of $n$, since multiplication by $e^{iax}$ does not change the $L^p$ norm. Now we're ready to prove that $S_n f \to f$ in $L^p$:

Proof. Let $\varepsilon > 0$. Pick a trigonometric polynomial $q$ (a Cesàro mean of $f$, for instance) with $\|f - q\|_p < \varepsilon$. Then

\[ \|S_n f - f\|_p \le \|S_n f - S_n q\|_p + \|S_n q - q\|_p + \|q - f\|_p \]

\[ = \|S_n(f-q)\|_p + \|S_n q - q\|_p + \|q-f\|_p \le C_p^2\|f-q\|_p + \|S_n q - q\|_p + \|q-f\|_p, \]

which is bounded by $(C_p^2 + 1)\varepsilon$ for sufficiently large $n$, since the middle term is $0$ once $n$ exceeds the degree of $q$.
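For $p = 2$ the convergence of the partial sums is already visible by Parseval: the $L^2$ error is the coefficient mass outside $[-n,n]$, hence nonincreasing in $n$. A discrete sketch using the FFT (our own construction):

```python
import numpy as np

# Partial sums S_n f on a periodic grid via the FFT; by discrete Parseval the
# L^2 error is the coefficient mass outside [-n, n], so it is nonincreasing in n.
N = 512
x = 2 * np.pi * np.arange(N) / N
f = np.exp(np.cos(x)) + np.sign(np.sin(x))   # an arbitrary test function with a jump
c = np.fft.fft(f)                            # discrete Fourier coefficients (times N)
freqs = np.fft.fftfreq(N, d=1.0 / N)         # integer frequencies -N/2 .. N/2 - 1

def partial_sum(n):
    # keep only the coefficients with |k| <= n
    return np.fft.ifft(np.where(np.abs(freqs) <= n, c, 0)).real

errs = [np.sqrt(np.mean((partial_sum(n) - f) ** 2)) for n in (2, 8, 32, 128)]
assert errs[0] >= errs[1] >= errs[2] >= errs[3]
```

This only illustrates $L^2$ decay of the error; the content of Theorem 11.1.1 is that the analogous statement holds in every $L^p$ with $1 < p < \infty$, which is much deeper.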

11.2 Almost Everywhere Convergence

Kolmogorov proved in 1925 that there exists a function $f \in L^1([-\pi,\pi))$ such that $S_n f(x)$ diverges almost everywhere. He later showed that in fact there exists such an $f$ whose Fourier series diverges at every $x$, not just on a set of full measure. Carleson proved in 1966 that if you instead work in $L^2$, then $S_n f(x) \to f(x)$ almost everywhere. Hunt improved this in 1967 to $f \in L^p$ for $1 < p < \infty$. We will prove this under the assumption of a lemma we will not prove:

Lemma 11.2.1. Define
\[ Mf(x) = \sup_n |S_n f(x)| = \sup_n \Big| \frac{1}{2\pi}\int_{-\pi}^{\pi} D_n(x-t) f(t)\, dt \Big|. \]

For $1 < p < \infty$, there exists a constant $C_p$ such that $\|Mf\|_p \le C_p\|f\|_p$ for every $f \in L^p$.

Let us prove, given the lemma, that $S_n f$ converges almost everywhere to $f$:

Proof. Set
\[ Tf(x) = \limsup_{n\to\infty} |S_n f(x) - f(x)|. \]
We want $Tf(x) = 0$ almost everywhere. Let $N \in \mathbb{N}$, and choose a trigonometric polynomial $q$ such that $\|f - q\|_p < 1/N$. Then

\[ |S_n f - f| \le |S_n f - S_n q| + |S_n q - q| + |q - f|. \]

Note that $S_n q \to q$ at every $x$, since the two are equal for large $n$. Taking $\limsup$ we get $Tf \le M(f-q) + |f-q|$, and so

\[ |\{\, x \mid Tf(x) > \varepsilon \,\}| \le |\{\, x \mid M(f-q)(x) > \tfrac{\varepsilon}{2} \,\}| + |\{\, x \mid |f-q|(x) > \tfrac{\varepsilon}{2} \,\}|, \]
but by the lemma $\|M(f-q)\|_p \le C_p\|f-q\|_p$. By Chebyshev,
\[ |\{\, x \mid Tf(x) > \varepsilon \,\}| \le \frac{\|M(f-q)\|_p^p}{(\varepsilon/2)^p} + \frac{\|f-q\|_p^p}{(\varepsilon/2)^p} \le \frac{2^p}{\varepsilon^p}\big( C_p^p + 1 \big)\|f-q\|_p^p \le \frac{2^p}{\varepsilon^p} \cdot \frac{C_p^p + 1}{N^p}, \]
which goes to $0$ as $N \to \infty$.

Lecture 12 Maximal Functions

12.1 Hardy-Littlewood Maximal Functions

Definition 12.1.1. Suppose $f$ is a Lebesgue measurable function on $\mathbb{R}^d$. We say that $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$ if
\[ \int_B |f(x)|\, dx < \infty \]
for every ball $B \subset \mathbb{R}^d$. Note that we do not require
\[ \int_{\mathbb{R}^d} |f(x)|\, dx < \infty, \]
though clearly $L^1$ functions have the local property.

Definition 12.1.2 (Maximal function). Suppose $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$. We set
\[ Mf(x) = \sup_{r>0} \frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)|\, dy, \]
where by $B(x,r)$ we mean the ball centred on $x$ with radius $r$.

Remark 12.1.3. • Given a function $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, it is not clear whether $Mf(x) < \infty$ at any $x$.

• Suppose $f \in L^p(\mathbb{R}^d)$, and let $B \subset \mathbb{R}^d$ be a ball. Then, with $q$ the conjugate exponent,
\[ \int_B |f(y)|\, dy = \int_{\mathbb{R}^d} \chi_B(y)|f(y)|\, dy \le \Big( \int_{\mathbb{R}^d} |\chi_B(y)|^q\, dy \Big)^{1/q} \Big( \int_{\mathbb{R}^d} |f(y)|^p\, dy \Big)^{1/p} = |B|^{1/q}\|f\|_p. \]
In other words $L^p \subset L^1_{\mathrm{loc}}$, as hinted at above.

• If $\mu$ is a positive Borel measure, then we can define the maximal function of a measure analogously:
\[ M\mu(x) = \sup_{r>0} \frac{\mu(B(x,r))}{|B(x,r)|} = \sup_{r>0} \frac{1}{|B(x,r)|}\int_{B(x,r)} d\mu. \]
This is a generalisation of the previous definition, since $d\mu = |f(y)|\, dy$ defines a measure.

Proposition 12.1.4. If $\mu$ is a Borel measure, then $M\mu$ is a Borel measurable function.

Proof. Let $\lambda > 0$, and let $E_\lambda = \{\, x \mid M\mu(x) > \lambda \,\}$. Take $x \in E_\lambda$. Then there exists some $r_0 > 0$ such that
\[ \frac{\mu(B(x,r_0))}{|B(x,r_0)|} = t > \lambda. \]
Choose $\delta$ such that $(r_0+\delta)^d < r_0^d\, t/\lambda$, which is possible since $t/\lambda > 1$. Suppose $y \in B(x,\delta)$. Then $B(y, r_0+\delta) \supset B(x, r_0)$. This follows directly from the triangle inequality: take $z \in B(x, r_0)$; then

\[ d(z,y) \le d(x,y) + d(x,z) \le \delta + r_0. \]
Therefore

\[ \mu(B(y, r_0+\delta)) \ge \mu(B(x,r_0)) = t|B(x,r_0)| > \frac{(r_0+\delta)^d}{r_0^d}\, \lambda\, |B(x,r_0)| = \lambda |B(y, r_0+\delta)|, \]
since $(r_0+\delta)^d/r_0^d$ is the ratio of the volumes of the two balls. This means that $M\mu(y) > \lambda$ for every $y \in B(x,\delta)$, so $E_\lambda$ is open, making it measurable, which in turn makes $M\mu$ a measurable function.

Remark 12.1.5. One can also define $M\mu$ and $Mf$ using cubes instead of balls, say $Q(x,r) = \{\, y \in \mathbb{R}^d \mid |x_i - y_i| < r,\ i = 1,2,\dots,d \,\}$. Then
\[ M_Q\mu(x) = \sup_{r>0} \frac{\mu(Q(x,r))}{|Q(x,r)|}, \]
using which
\[ \frac{\mu(B(x,r))}{|B(x,r)|} \le \frac{\mu(Q(x,r))}{|B(x,r)|} = \frac{\mu(Q(x,r))}{|Q(x,r)|}\cdot\frac{|Q(x,r)|}{|B(x,r)|} = \frac{2^d}{c_d}\cdot\frac{\mu(Q(x,r))}{|Q(x,r)|}, \]

where $c_d = \pi^{d/2}/\Gamma(d/2+1)$ is the volume of the $d$-dimensional unit ball. In other words these different ways of defining the maximal function agree up to multiplication by some constant depending on the dimension.

Remark 12.1.6. One can also define it in terms of surface integrals over the boundaries of balls, but this is less well understood.

Theorem 12.1.7 (Hardy-Littlewood, 1930). If $\mu$ is a positive Borel measure on $\mathbb{R}^d$, then for every $\lambda > 0$,
\[ |\{\, x \in \mathbb{R}^d \mid M\mu(x) > \lambda \,\}| \le \frac{3^d}{\lambda}\mu(\mathbb{R}^d). \]

Remark 12.1.8. If $f \in L^1(\mathbb{R}^d)$ then
\[ \mu(E) = \int_E |f(y)|\, dy \]
defines a positive Borel measure, meaning that

\[ |\{\, x \in \mathbb{R}^d \mid Mf(x) > \lambda \,\}| \le \frac{3^d}{\lambda}\int_{\mathbb{R}^d} |f(y)|\, dy = \frac{3^d}{\lambda}\|f\|_1. \]

Remark 12.1.9. Let $\delta_0$ be the Dirac measure at $0$, i.e. $\delta_0(E) = 1$ if $0 \in E$, and $0$ otherwise. In $\mathbb{R}^1$ we then have
\[ \frac{\delta_0(B(x, |x|+\varepsilon))}{|B(x, |x|+\varepsilon)|} = \frac{1}{2(|x|+\varepsilon)}, \]
meaning that $M\delta_0(x) \ge 1/(2|x|)$. Rearranging, we therefore have
\[ \{\, x \mid M\delta_0(x) > \lambda \,\} \supset \Big\{\, x \;\Big|\; |x| < \frac{1}{2\lambda} \,\Big\}, \]
which, if we measure the sets, yields
\[ |\{\, x \mid M\delta_0(x) > \lambda \,\}| \ge \Big|\Big\{\, x \;\Big|\; |x| < \frac{1}{2\lambda} \,\Big\}\Big| = \frac{1}{\lambda}. \]
This serves to demonstrate that the bound in Hardy-Littlewood's theorem is about as good as it gets.

Remark 12.1.10. Suppose $f \in L^1(\mathbb{R}^d)$. Then the theorem says
\[ |\{\, x \in \mathbb{R}^d \mid Mf(x) > \lambda \,\}| \le \frac{3^d}{\lambda}\|f\|_1. \]
Suppose it were the case that we knew
\[ \int_{\mathbb{R}^d} Mf(x)\, dx \le C\int_{\mathbb{R}^d} |f(x)|\, dx. \]
Then
\[ \lambda|\{\, x \mid Mf(x) > \lambda \,\}| \le \int_{\{\, x \mid Mf(x) > \lambda \,\}} Mf(x)\, dx \le \int_{\mathbb{R}^d} Mf(x)\, dx \le C\int_{\mathbb{R}^d} |f(x)|\, dx. \]
It turns out, however, that what we assumed above is never true (unless $f = 0$).
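A concrete computation in the spirit of these remarks: for $f = \chi_{[0,1]}$ in $\mathbb{R}^1$ one can check that $Mf = 1$ on $[0,1]$ while $Mf(x) = 1/(2x)$ for $x \ge 1$, so $Mf$ decays too slowly to be integrable. The brute-force sketch below (our own, with the intersection length computed exactly) confirms both values:

```python
import numpy as np

def maximal_indicator(x, radii):
    # Mf(x) for f = chi_[0,1] on R: sup over r of |[x-r, x+r] ∩ [0,1]| / (2r)
    best = 0.0
    for r in radii:
        length = max(0.0, min(x + r, 1.0) - max(x - r, 0.0))
        best = max(best, length / (2 * r))
    return best

radii = np.linspace(0.01, 10, 2000)
assert abs(maximal_indicator(0.5, radii) - 1.0) < 1e-9   # inside [0,1]: Mf = 1
assert abs(maximal_indicator(2.0, radii) - 0.25) < 1e-3  # at x = 2: Mf = 1/(2*2)
```

The $1/(2|x|)$ tail is exactly the behaviour of $M\delta_0$ above, and it is why only the weak $(1,1)$ bound can hold.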

We will state and prove a lemma that takes us most of the way toward the Hardy-Littlewood theorem:

Lemma 12.1.11 (Wiener's covering lemma). Suppose $W$ is a set in $\mathbb{R}^d$, and that
\[ W \subset \bigcup_{i=1}^{N} B(x_i, r_i), \]
i.e. $W$ can be covered by a finite collection of balls. Then there exists a set of indices $S \subset \{1, 2, \dots, N\}$ such that

(i) the balls $\{\, B(x_i, r_i) \mid i \in S \,\}$ are disjoint,

(ii) $\displaystyle W \subset \bigcup_{i\in S} B(x_i, 3r_i)$, and

(iii) $\displaystyle |W| \le 3^d \sum_{i \in S} |B(x_i, r_i)|$.

This is a so-called covering lemma, since it tells us about covers. In particular it tells us that if we can cover a set by finitely many balls, then we can pick a disjoint subcollection that still covers the set once we blow each remaining ball up to three times its original radius.

Proof. (ii) implies (iii) quite trivially:

\[ |W| \le \Big| \bigcup_{i\in S} B(x_i, 3r_i) \Big| \le \sum_{i\in S} |B(x_i, 3r_i)| = 3^d \sum_{i\in S} |B(x_i, r_i)|. \]

By reordering, we may assume r1 ≥ r2 ≥ ... ≥ rN . Let i1 = 1, and consider the biggest ball B(x1, r1) = B(xi1 , ri1 ). Discard all balls that intersect this one (the idea being that the three-fold enlargement of this ball envelopes any ball that intersects it, so we don’t need them anyway).

If no balls remain, we simply stop. Otherwise, let $B(x_{i_2}, r_{i_2})$ be the largest remaining ball in the list. Throw away any ball that intersects it; if no balls remain, stop, and otherwise continue in the same fashion. This process must eventually terminate, since we started with a finite collection of balls. Thus $S = \{i_1, i_2, \dots, i_\ell\}$. Part (i), the balls being disjoint, is clear by construction. To get (ii) we consider some ball $B(x_j, r_j)$ in the original list. If $B(x_j, r_j)$ is in the new list, we're good to go, since trivially
\[ B(x_j, r_j) \subset B(x_j, 3r_j) \subset \bigcup_{i\in S} B(x_i, 3r_i). \]

If $B(x_j, r_j)$ was discarded, it is because it intersected some $B(x_{i_k}, r_{i_k})$ with $r_{i_k} \ge r_j$. Then the discarded ball is contained in the three-fold enlargement of $B(x_{i_k}, r_{i_k})$: if we take a point $z$ in the intersection and a point $y$ in the discarded ball, then

\[ d(y, x_{i_k}) \le d(y, x_j) + d(x_j, z) + d(z, x_{i_k}) < r_j + r_j + r_{i_k} \le 3r_{i_k}. \]
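The greedy selection in the proof is easy to implement; the sketch below (our own, in the plane) picks the largest remaining ball, discards everything meeting it, and repeats, then checks conclusions (i) and (ii):

```python
import numpy as np

def wiener_select(balls):
    # greedy selection from the proof: repeatedly keep the largest remaining
    # ball and discard every ball intersecting it; balls are (center, radius)
    remaining = sorted(balls, key=lambda b: -b[1])
    chosen = []
    while remaining:
        c, r = remaining.pop(0)
        chosen.append((c, r))
        remaining = [(c2, r2) for (c2, r2) in remaining
                     if np.linalg.norm(np.asarray(c2) - np.asarray(c)) >= r + r2]
    return chosen

balls = [((0.0, 0.0), 2.0), ((1.0, 0.0), 1.0), ((5.0, 0.0), 1.5), ((5.5, 0.5), 0.5)]
chosen = wiener_select(balls)
# the chosen balls are pairwise disjoint...
for i in range(len(chosen)):
    for j in range(i + 1, len(chosen)):
        (c1, r1), (c2, r2) = chosen[i], chosen[j]
        assert np.linalg.norm(np.asarray(c1) - np.asarray(c2)) >= r1 + r2
# ...and every original ball lies inside the three-fold enlargement of some
# chosen ball (B(c, r) ⊂ B(cc, 3rr) iff dist(c, cc) + r <= 3rr)
for (c, r) in balls:
    assert any(np.linalg.norm(np.asarray(c) - np.asarray(cc)) + r <= 3 * rr
               for (cc, rr) in chosen)
```

Sorting once up front mirrors the "reorder so $r_1 \ge r_2 \ge \dots$" step of the proof.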

Lecture 13 More on Maximal Functions

13.1 Proof of Hardy-Littlewood's Theorem

We wish to prove the theorem stated last lecture. The proof is almost done, given Wiener's covering lemma from the end of last lecture.

Theorem 13.1.1 (Hardy-Littlewood, 1930). If $\mu$ is a positive Borel measure on $\mathbb{R}^d$, then for every $\lambda > 0$,
\[ |\{\, x \in \mathbb{R}^d \mid M\mu(x) > \lambda \,\}| \le \frac{3^d}{\lambda}\mu(\mathbb{R}^d). \]

Proof. Fix $\lambda > 0$. Let $K \subset \{\, x \in \mathbb{R}^d \mid M\mu(x) > \lambda \,\}$ be any compact subset. Then for each $x \in K$, choose a ball $B(x, r_x)$ such that

\[ \frac{\mu(B(x, r_x))}{|B(x, r_x)|} > \lambda, \]
which is possible since the supremum of these quantities exceeds $\lambda$, so at least one of them must too. Now $K$ is compact, so some $B(x_1, r_1), \dots, B(x_N, r_N)$ form a finite subcover of $K$. Applying Wiener's covering lemma to this finite subcover, we get a further reduction to $\{\, B(x_i, r_i) \,\}_{i\in S}$ for some $S \subset \{1, 2, \dots, N\}$. Then

\[ |K| \le 3^d \sum_{i\in S} |B(x_i, r_i)| \le \frac{3^d}{\lambda} \sum_{i\in S} \mu(B(x_i, r_i)) \le \frac{3^d}{\lambda}\mu(\mathbb{R}^d), \]
since the last sum is over a collection of disjoint balls, so the sum of the measures is the measure of the union, which is naturally bounded by the measure of the entire space. Now by inner regularity,

\[ |\{\, x \in \mathbb{R}^d \mid M\mu(x) > \lambda \,\}| = \sup\{\, |K| \mid K \subset \{\, x \in \mathbb{R}^d \mid M\mu(x) > \lambda \,\},\ K \text{ compact} \,\}, \]
so the same bound holds for the entire set we considered.

In summary, what we know about Hardy-Littlewood’s maximal function is this:

Theorem 13.1.2 (Hardy-Littlewood, 1930). Let $f$ be a measurable function on $\mathbb{R}^d$. Then

(i) If $f \in L^p(\mathbb{R}^d)$, $1 \le p \le \infty$, then $Mf$ is finite almost everywhere.

(ii) If $f \in L^1(\mathbb{R}^d)$, then
\[ |\{\, x \in \mathbb{R}^d \mid Mf(x) > \lambda \,\}| \le \frac{3^d}{\lambda}\|f\|_1. \]

(iii) For $1 < p \le \infty$, there exists a constant $A_p$ such that $\|Mf\|_p \le A_p\|f\|_p$.

It turns out $A_p$ is monotone in $p$, and approaches $\infty$ as $p$ approaches $1$.

Remark 13.1.3. Note that
\[ Mf(x) = \sup_{r>0} \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)|\, dy \le \sup_{r>0} \frac{1}{|B(x,r)|}\int_{B(x,r)} \|f\|_{\infty}\, dy = \|f\|_{\infty}. \]

In other words we know A∞ = 1.

Definition 13.1.4 (Weak and (strong) type operators). Let $T$ map $L^p(\mathbb{R}^d)$ into the measurable functions on $\mathbb{R}^d$, with $1 \le p, q \le \infty$. Then

(i) $T$ is said to be of (strong) type $(p,q)$ if there exists a constant $A$ such that $\|Tf\|_q \le A\|f\|_p$ for all $f \in L^p(\mathbb{R}^d)$.

(ii) $T$ is called weak type $(p,q)$ if there exists a constant $A$ such that

\[ |\{\, x \in \mathbb{R}^d \mid |Tf(x)| > \lambda \,\}| \le \Big( \frac{A\|f\|_p}{\lambda} \Big)^{q} \]

for all $f \in L^p(\mathbb{R}^d)$ and all $\lambda > 0$, with $A$ independent of $f$ and $\lambda$.

Remark 13.1.5. If an operator $T$ is of type $(p,q)$, then it is also of weak type

$(p,q)$, by Chebyshev's inequality: we have $\|Tf\|_q \le A\|f\|_p$, and the left-hand side satisfies

\[ \|Tf\|_q = \Big( \int_{\mathbb{R}^d} |Tf(x)|^q\, dx \Big)^{1/q} \ge \Big( \int_{\{\, x \mid |Tf(x)| > \lambda \,\}} |Tf(x)|^q\, dx \Big)^{1/q} \ge \lambda\, |\{\, x \mid |Tf(x)| > \lambda \,\}|^{1/q}, \]
meaning that
\[ |\{\, x \mid |Tf(x)| > \lambda \,\}| \le \Big( \frac{A\|f\|_p}{\lambda} \Big)^{q}. \]

Remark 13.1.6. Therefore $M$ is of (strong) type $(\infty,\infty)$ and weak type $(1,1)$. We will introduce the following notation in order to make the upcoming discussion nicer:

Definition 13.1.7. The sum of $L^p$ spaces is defined as:

\[ L^{p_1}(\mathbb{R}^d) + L^{p_2}(\mathbb{R}^d) = \{\, f_1 + f_2 \mid f_1 \in L^{p_1}(\mathbb{R}^d),\ f_2 \in L^{p_2}(\mathbb{R}^d) \,\}. \]

Proposition 13.1.8. Suppose $p_1 < p < p_2$. Then $L^p(\mathbb{R}^d) \subset L^{p_1}(\mathbb{R}^d) + L^{p_2}(\mathbb{R}^d)$.

Proof. Let $f \in L^p(\mathbb{R}^d)$, and $\gamma > 0$. Set
\[ f_1(x) = \begin{cases} f(x), & \text{if } |f(x)| \ge \gamma, \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad f_2(x) = \begin{cases} f(x), & \text{if } |f(x)| < \gamma, \\ 0, & \text{otherwise.} \end{cases} \]

By construction $f = f_1 + f_2$. Then, since $p_1 - p < 0$ and $|f| \ge \gamma$ on the support of $f_1$,
\[ \int_{\mathbb{R}^d} |f_1(x)|^{p_1}\, dx = \int_{\{\, x \mid |f(x)| \ge \gamma \,\}} |f(x)|^{p_1-p}|f(x)|^{p}\, dx \le \gamma^{p_1-p}\int_{\{\, x \mid |f(x)| \ge \gamma \,\}} |f(x)|^{p}\, dx \le \gamma^{p_1-p}\|f\|_p^p < \infty, \]

so $f_1 \in L^{p_1}(\mathbb{R}^d)$, and similarly
\[ \int_{\mathbb{R}^d} |f_2(x)|^{p_2}\, dx = \int_{\{\, x \mid |f(x)| < \gamma \,\}} |f(x)|^{p_2-p}|f(x)|^{p}\, dx \le \gamma^{p_2-p}\|f\|_p^p < \infty, \]

meaning that $f_2 \in L^{p_2}(\mathbb{R}^d)$.

A remarkable theorem that we will not prove here is this:

Theorem 13.1.9 (Marcinkiewicz interpolation theorem, 1939). Suppose $p_1 < p_2$, and that $T$ is a mapping from $L^{p_1}(\mathbb{R}^d) + L^{p_2}(\mathbb{R}^d)$ to the space of measurable functions. Suppose further that

(i) $|T(f+g)(x)| \le |Tf(x)| + |Tg(x)|$, i.e. $T$ is sublinear;

(ii) T is weak type (p1, p1); and

(iii) T is weak type (p2, p2).

Then $T$ is of type $(p,p)$ for all $p_1 < p < p_2$.

With this in hand we can prove the very last of our statements about the maximal function:

Proof. Let $T = M$. Certainly $M(f+g)(x) \le Mf(x) + Mg(x)$, and moreover we know $M$ is weak type $(1,1)$ and strong type $(\infty,\infty)$, so in particular weak type $(\infty,\infty)$. Therefore it is of type $(p,p)$ for all $1 < p < \infty$ by the interpolation theorem.

It turns out that we can actually do better than the above interpolation theorem:

Theorem 13.1.10. Suppose $p_1 < p_2$, and that $T$ is a mapping from $L^{p_1}(\mathbb{R}^d) + L^{p_2}(\mathbb{R}^d)$ to the space of measurable functions. Suppose further that

(i) $|T(f+g)(x)| \le |Tf(x)| + |Tg(x)|$, i.e. $T$ is sublinear;

(ii) T is weak type (p1, q1); and

(iii) T is weak type (p2, q2). Then T is type (p, q) for p and q such that (1/p, 1/q) lies on the line segment between (1/p1, 1/q1) and (1/p2, 1/q2). MARCINKIEWICZ INTERPOLATION 36
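The splitting in Proposition 13.1.8 is the workhorse of the interpolation arguments to come; a discrete sketch (our own) makes the construction concrete:

```python
import numpy as np

def split_at_height(f, gamma):
    # split f = f1 + f2 at height gamma, as in the proof of Proposition 13.1.8:
    # f1 carries the large values (the L^{p1} part), f2 the small ones (L^{p2})
    f1 = np.where(np.abs(f) >= gamma, f, 0.0)
    f2 = np.where(np.abs(f) < gamma, f, 0.0)
    return f1, f2

f = np.array([0.1, -3.0, 0.5, 7.0, -0.2, 1.0])
f1, f2 = split_at_height(f, 1.0)
assert np.all(f1 + f2 == f)                       # the split reconstructs f
assert np.all((f1 == 0) | (np.abs(f1) >= 1.0))    # f1 is supported where |f| >= gamma
assert np.all(np.abs(f2) < 1.0)                   # f2 is bounded by gamma
```

The point of the proof is then just that a bounded function is locally in every $L^{p_2}$, while a function supported where it is large gains integrability at the lower exponent $p_1$.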

Lecture 14 Marcinkiewicz Interpolation

14.1 Proof of Marcinkiewicz Interpolation Theorem We will now prove the interpolation theorem we stated and used at the end of last lecture.

Proof. First suppose $p_2 \ne \infty$. We wish to show that $\|Tf\|_p \le A_p\|f\|_p$ for every $f \in L^p(\mathbb{R}^d)$, with $A_p$ not depending on $f$ (but probably depending on $p$, $T$, and $d$). Let $m(\lambda) = |\{\, x \in \mathbb{R}^d \mid |Tf(x)| > \lambda \,\}|$. Then
\[ \int_{\mathbb{R}^d} |Tf(x)|^p\, dx = \int_0^{\infty} p\lambda^{p-1} m(\lambda)\, d\lambda. \]
This is effectively the layer cake theorem. We therefore need to estimate $m(\lambda)$, so fix $\lambda > 0$ and let
\[ f_1(x) = \begin{cases} f(x), & \text{if } |f(x)| > \lambda, \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad f_2(x) = \begin{cases} f(x), & \text{if } |f(x)| \le \lambda, \\ 0, & \text{otherwise.} \end{cases} \]

Therefore f = f1 + f2, and by assumption we have sublinearity of T so

\[ |Tf(x)| = |T(f_1+f_2)(x)| \le |Tf_1(x)| + |Tf_2(x)|, \]
whereby
\[ m(\lambda) = |\{\, x \mid |Tf(x)| > \lambda \,\}| \le |\{\, x \mid |Tf_1(x)| > \lambda/2 \,\}| + |\{\, x \mid |Tf_2(x)| > \lambda/2 \,\}|, \]
but by weak type we have

\[ |\{\, x \mid |Tf_1(x)| > \lambda/2 \,\}| \le \Big( \frac{A_1\|f_1\|_{p_1}}{\lambda/2} \Big)^{p_1} \qquad \text{and} \qquad |\{\, x \mid |Tf_2(x)| > \lambda/2 \,\}| \le \Big( \frac{A_2\|f_2\|_{p_2}}{\lambda/2} \Big)^{p_2}. \]
Therefore
\[ \int_{\mathbb{R}^d} |Tf(x)|^p\, dx = \int_0^{\infty} p\lambda^{p-1} m(\lambda)\, d\lambda \le \underbrace{\int_0^{\infty} p\lambda^{p-1} \Big( \frac{A_1\|f_1\|_{p_1}}{\lambda/2} \Big)^{p_1} d\lambda}_{=I} + \underbrace{\int_0^{\infty} p\lambda^{p-1} \Big( \frac{A_2\|f_2\|_{p_2}}{\lambda/2} \Big)^{p_2} d\lambda}_{=II}. \]

Studying the two integrals one at a time (recall that $f_1$ and $f_2$ depend on $\lambda$), we have
\begin{align*}
I &= p(2A_1)^{p_1}\int_0^{\infty} \lambda^{p-p_1-1} \|f_1\|_{p_1}^{p_1}\, d\lambda = p(2A_1)^{p_1}\int_0^{\infty} \lambda^{p-p_1-1}\int_{\{\, y \mid |f(y)| > \lambda \,\}} |f(x)|^{p_1}\, dx\, d\lambda \\
&= p(2A_1)^{p_1}\int_{\mathbb{R}^d} |f(x)|^{p_1}\int_0^{|f(x)|} \lambda^{p-p_1-1}\, d\lambda\, dx = p(2A_1)^{p_1}\int_{\mathbb{R}^d} |f(x)|^{p_1}\frac{|f(x)|^{p-p_1}}{p-p_1}\, dx = \frac{p(2A_1)^{p_1}}{p-p_1}\int_{\mathbb{R}^d} |f(x)|^{p}\, dx,
\end{align*}
and
\begin{align*}
II &= p(2A_2)^{p_2}\int_0^{\infty} \lambda^{p-p_2-1} \|f_2\|_{p_2}^{p_2}\, d\lambda = p(2A_2)^{p_2}\int_0^{\infty} \lambda^{p-p_2-1}\int_{\{\, y \mid |f(y)| \le \lambda \,\}} |f(x)|^{p_2}\, dx\, d\lambda \\
&= p(2A_2)^{p_2}\int_{\mathbb{R}^d} |f(x)|^{p_2}\int_{|f(x)|}^{\infty} \lambda^{p-p_2-1}\, d\lambda\, dx = p(2A_2)^{p_2}\int_{\mathbb{R}^d} |f(x)|^{p_2}\frac{|f(x)|^{p-p_2}}{p_2-p}\, dx = \frac{p(2A_2)^{p_2}}{p_2-p}\int_{\mathbb{R}^d} |f(x)|^{p}\, dx.
\end{align*}
Therefore
\[ \int_{\mathbb{R}^d} |Tf(x)|^p\, dx \le \Big( \frac{p(2A_1)^{p_1}}{p-p_1} + \frac{p(2A_2)^{p_2}}{p_2-p} \Big)\|f\|_p^p. \]
Notice that this quantity blows up near $p_1$ and $p_2$—if not, we could have taken clever limits and turned weak type into strong type at the endpoints. Now suppose $p_2 = \infty$. We proceed almost as before, but split at a slightly different height:
\[ f_1(x) = \begin{cases} f(x), & \text{if } |f(x)| > \lambda/(2A_2), \\ 0, & \text{otherwise} \end{cases} \qquad \text{and} \qquad f_2(x) = \begin{cases} f(x), & \text{if } |f(x)| \le \lambda/(2A_2), \\ 0, & \text{otherwise.} \end{cases} \]

Once again $f = f_1 + f_2$, but this time $\|Tf_2\|_\infty \le A_2\|f_2\|_{\infty} \le A_2 \cdot \lambda/(2A_2) = \lambda/2$, whereby $|\{\, x \mid |Tf_2(x)| > \lambda/2 \,\}| = 0$. Therefore

\[ m(\lambda) \le |\{\, x \mid |Tf_1(x)| > \lambda/2 \,\}|, \]
and
\begin{align*}
\int_{\mathbb{R}^d} |Tf(x)|^p\, dx &= \int_0^{\infty} p\lambda^{p-1} m(\lambda)\, d\lambda \le \int_0^{\infty} p\lambda^{p-1} |\{\, x \mid |Tf_1(x)| > \lambda/2 \,\}|\, d\lambda \\
&\le \int_0^{\infty} p\lambda^{p-1} \Big( \frac{2A_1\|f_1\|_{p_1}}{\lambda} \Big)^{p_1} d\lambda = p(2A_1)^{p_1}\int_0^{\infty} \lambda^{p-p_1-1}\int_{\mathbb{R}^d} |f_1(x)|^{p_1}\, dx\, d\lambda \\
&= p(2A_1)^{p_1}\int_0^{\infty} \lambda^{p-p_1-1}\int_{\{\, y \mid |f(y)| > \lambda/(2A_2) \,\}} |f(x)|^{p_1}\, dx\, d\lambda \\
&= p(2A_1)^{p_1}\int_{\mathbb{R}^d} |f(x)|^{p_1}\int_0^{2A_2|f(x)|} \lambda^{p-p_1-1}\, d\lambda\, dx = \frac{p(2A_1)^{p_1}(2A_2)^{p-p_1}}{p-p_1}\|f\|_p^p,
\end{align*}
and we are done. The more general version hinted at at the end of the last lecture is proven in much the same way, but is messier. We can actually interpolate from even weaker assumptions:

Definition 14.1.1 (Restricted weak type). Let $1 \le p \le \infty$. An operator $T$ is said to be restricted weak type $(p,p)$ if

\[ |\{\, x \mid |T\chi_E(x)| > \lambda \,\}| \le \Big( \frac{A\|\chi_E\|_p}{\lambda} \Big)^{p} \]
for all measurable sets $E$.

Theorem 14.1.2 (Stein-Weiss). Suppose $1 \le p_1 < p_2 \le \infty$, and let $T$ be an operator from $L^{p_1}(\mathbb{R}^d) + L^{p_2}(\mathbb{R}^d)$ to the space of measurable functions. Assume $T$ is sublinear, and that $T$ is restricted weak type $(p_1,p_1)$ and $(p_2,p_2)$. Then $T$ is strong type $(p,p)$ for all $p_1 < p < p_2$.
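The layer cake formula used at the start of the Marcinkiewicz proof, $\int |g|^p = \int_0^\infty p\lambda^{p-1} m(\lambda)\, d\lambda$ with $m(\lambda) = |\{\, |g| > \lambda \,\}|$, can be sanity-checked on a step function (an illustrative sketch of our own):

```python
import numpy as np

# g takes each of four values on a set of measure 1, so the left-hand side is
# just the sum of the values raised to the p-th power
g = np.array([0.5, 2.0, 1.0, 3.0])
p = 2.5
lhs = np.sum(np.abs(g) ** p)

# midpoint-rule quadrature of the right-hand side on [0, 4]; m(lambda) = 0 beyond 3
h = 4 / 40000
lam = (np.arange(40000) + 0.5) * h
m = (np.abs(g)[None, :] > lam[:, None]).sum(axis=1)     # m(lambda)
rhs = np.sum(p * lam ** (p - 1) * m) * h
assert abs(lhs - rhs) / lhs < 1e-3
```

For each value $v$ of $g$, the inner contribution is exactly $\int_0^{v} p\lambda^{p-1}\, d\lambda = v^p$, which is the whole content of the identity.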

Lecture 15 Lebesgue Differentiation Theorem

15.1 A Note About Maximal Functions

Recall that the Hardy-Littlewood maximal function can be defined in terms of cubes instead of balls, but that this does not work over arbitrary rectangles. It does, however, work over rectangles whose sides are parallel to the axes, i.e.

\[ R = [a_1,b_1] \times [a_2,b_2] \times \dots \times [a_d,b_d]. \]
Then
\[ \frac{1}{|R|}\int_R |f(y)|\, dy = \frac{1}{|R|}\int_{a_d}^{b_d} \dots \int_{a_1}^{b_1} |f(y_1,\dots,y_d)|\, dy_1 \dots dy_d = \frac{1}{|[a_d,b_d]|}\int_{a_d}^{b_d} \dots \frac{1}{|[a_1,b_1]|}\int_{a_1}^{b_1} |f(y_1,\dots,y_d)|\, dy_1 \dots dy_d \]

\[ \le M_d(\dots(M_2(M_1(f)))\dots)(x), \]

where by $M_i$ we mean the one-dimensional maximal function in the $i$th variable. So if we define
\[ Mf(x) = \sup_{R \ni x} \frac{1}{|R|}\int_R |f(y)|\, dy, \]
where the rectangles $R$ have sides parallel to the axes, then

\[ Mf(x) \le M_d(\dots(M_2(M_1(f)))\dots)(x) \]
and
\[ \int_{\mathbb{R}^d} |Mf(x)|^p\, dx \le \int_{\mathbb{R}} \dots \int_{\mathbb{R}} (M_d \dots M_1 f(x))^p\, dx_d \dots dx_1 \le C\int_{\mathbb{R}} \dots \int_{\mathbb{R}} (M_{d-1} \dots M_1 f(x))^p\, dx_d \dots dx_1 \le \dots \le C'\|f\|_p^p, \]
applying the one-dimensional strong type $(p,p)$ bound one variable at a time. In other words rectangles with arbitrary orientation are bad, but rectangles in a fixed orientation are OK. One then asks how many directions are OK—it has been shown that, for instance, rectangles whose major axes make angles $\pi/2^k$ with a fixed axis work.

15.2 Lebesgue Differentiation Theorem

Theorem 15.2.1. Suppose $1 \le p \le \infty$ and $f \in L^p(\mathbb{R}^d)$. Then for almost every $x \in \mathbb{R}^d$,
\[ \lim_{r\to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\, dy = 0. \]

Corollary 15.2.2. If $f \in L^p(\mathbb{R}^d)$, then for almost every $x \in \mathbb{R}^d$
\[ \lim_{r\to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} f(y)\, dy = f(x). \]

Proof. Assuming the theorem, we have

\[ \Big| \frac{1}{|B(x,r)|}\int_{B(x,r)} f(y)\, dy - f(x) \Big| \le \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\, dy \to 0 \]
as $r \to 0$, from which the corollary follows.

Remark 15.2.3. In one dimension we have
\[ \lim_{r\to 0} \frac{1}{2r}\int_{x-r}^{x+r} f(y)\, dy = f(x) \]
almost everywhere by the fundamental theorem of calculus; the above is just a limit of a difference quotient.

A point $x \in \mathbb{R}^d$ is said to be a Lebesgue point of $f$ if
\[ \lim_{r\to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\, dy = 0. \]

We will let $L_f$ denote the set of all Lebesgue points.

Lemma 15.2.4. If $x$ is a point of continuity of $f$, then $x \in L_f$.

Proof. Let $\varepsilon > 0$. Then there exists some $\delta > 0$ such that if $|x-y| < \delta$, then $|f(x)-f(y)| < \varepsilon$. So if $r < \delta$, then

\[ \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\, dy \le \frac{1}{|B(x,r)|}\int_{B(x,r)} \varepsilon\, dy = \varepsilon. \]

Proof of the theorem. Let $f \in L^p$, $1 \le p < \infty$. Set
\[ T_r f(x) = \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\, dy. \]

Then
\[ Tf(x) = \limsup_{r\to 0} T_r f(x). \]
We want to show that $Tf(x) = 0$ almost everywhere. Let $\varepsilon > 0$, and let $k \in \mathbb{N}$. Choose a continuous function $g \in L^p(\mathbb{R}^d)$ such that $\|f-g\|_p < 1/k$. Then
\begin{align*}
T_r f(x) &= \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\, dy \\
&\le \frac{1}{|B(x,r)|}\int_{B(x,r)} |(f-g)(y)-(f-g)(x)|\, dy + \frac{1}{|B(x,r)|}\int_{B(x,r)} |g(y)-g(x)|\, dy \\
&\le \frac{1}{|B(x,r)|}\int_{B(x,r)} |(f-g)(y)|\, dy + |(f-g)(x)| + T_r g(x).
\end{align*}

Therefore $Tf(x) \le M(f-g)(x) + |(f-g)(x)| + Tg(x)$. The last term is $0$ by the lemma, and the first comes from bounding the averages by the maximal function before taking $\limsup$. Therefore

\[ |\{\, x \mid Tf(x) > \varepsilon \,\}| \le |\{\, x \mid M(f-g)(x) > \tfrac{\varepsilon}{2} \,\}| + |\{\, x \mid |(f-g)(x)| > \tfrac{\varepsilon}{2} \,\}|, \]
but
\[ |\{\, x \mid M(f-g)(x) > \tfrac{\varepsilon}{2} \,\}| \le \frac{C\|f-g\|_p^p}{(\varepsilon/2)^p} < C\frac{2^p}{k^p\varepsilon^p}, \]
and similarly for the second set by Chebyshev. Therefore
\[ |\{\, x \mid Tf(x) > \varepsilon \,\}| \le C'\frac{2^p}{k^p\varepsilon^p}. \]
Let $k \to \infty$, and we get $|\{\, x \mid Tf(x) > \varepsilon \,\}| = 0$. Thus

\[ \{\, x \mid Tf(x) > 0 \,\} = \bigcup_{n=1}^{\infty} \{\, x \mid Tf(x) > 1/n \,\}, \]
and so $Tf(x) = 0$ almost everywhere. For $p = \infty$, fix $N$ and consider $f\chi_{B(0,N)} \in L^1(\mathbb{R}^d)$. Then almost every $x$ is a Lebesgue point, in particular almost every $x \in B(0,N-1)$. Let $N$ go to $\infty$, and we capture everything.
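At a point of continuity the conclusion of the theorem can be watched directly: ball averages converge to the point value as the radius shrinks. A one-dimensional sketch (our own choice of function and grid):

```python
import numpy as np

def ball_average(f, x, r, m=10001):
    # average of f over the ball (interval) B(x, r) in one dimension
    y = np.linspace(x - r, x + r, m)
    return np.mean(f(y))

f = lambda y: np.sign(y) * np.sqrt(np.abs(y))   # continuous; f(0.25) = 0.5
x = 0.25
errs = [abs(ball_average(f, x, r) - f(x)) for r in (0.1, 0.01, 0.001)]
assert errs[0] > errs[2]   # the averages approach f(x) as r -> 0
assert errs[2] < 1e-3
```

The theorem's content is of course that this happens at almost every $x$ even for a merely integrable $f$, where no pointwise continuity is available.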

Corollary 15.2.5. If $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, then almost every $x$ is a Lebesgue point.

We can manage a generalisation of this theorem that seems quite powerful, but isn’t actually all that impressive.

Definition 15.2.6 (Regular set). A family of sets $\{\, E_k(x) \,\}_{k\in\mathbb{N}}$ is said to be regular at $x$ if there exists an $\alpha > 0$ and a sequence $k_i$ decreasing to $0$ such that $E_{k_i}(x) \subset B(x, k_i)$ and $|E_{k_i}(x)| > \alpha|B(x, k_i)|$ for every $i$.

Theorem 15.2.7. Suppose $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, and suppose $\{\, E_k(x) \,\}$ is a regular family at $x$. If $x \in L_f$, then
\[ \lim_{i\to\infty} \frac{1}{|E_{k_i}(x)|}\int_{E_{k_i}(x)} |f(y)-f(x)|\, dy = 0. \]

Corollary 15.2.8. Suppose $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, and suppose at each $x$ there exists a regular family of sets $\{\, E_k(x) \,\}$. Then
\[ \lim_{i\to\infty} \frac{1}{|E_{k_i}(x)|}\int_{E_{k_i}(x)} |f(y)-f(x)|\, dy = 0 \]
almost everywhere.

Proof of theorem. It's pretty much straightforward from what we already know:
\[ \frac{1}{|E_{k_i}(x)|}\int_{E_{k_i}(x)} |f(y)-f(x)|\, dy \le \frac{1}{|E_{k_i}(x)|}\int_{B(x,k_i)} |f(y)-f(x)|\, dy \le \frac{1}{\alpha|B(x,k_i)|}\int_{B(x,k_i)} |f(y)-f(x)|\, dy \to 0 \]

as $i \to \infty$, since $|E_{k_i}(x)| > \alpha|B(x,k_i)|$.

Lecture 16 Maximal Functions and Kernels

16.1 Generalising Lebesgue Differentiation Theorem

We showed that if $f \in L^1_{\mathrm{loc}}(\mathbb{R}^d)$, then
\[ \lim_{r\to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} f(y)\, dy = f(x) \]
almost everywhere. We can rewrite this in the following way:
\[ \frac{1}{|B(x,r)|}\int_{B(x,r)} f(y)\, dy = \int_{\mathbb{R}^d} \frac{\chi_{B(0,r)}(x-y)}{r^d|B(0,1)|} f(y)\, dy = \int_{\mathbb{R}^d} \frac{\chi_{B(0,1)}\big(\frac{x-y}{r}\big)}{r^d|B(0,1)|} f(y)\, dy = \int_{\mathbb{R}^d} \frac{1}{r^d}\varphi\Big(\frac{x-y}{r}\Big) f(y)\, dy, \]
where
\[ \varphi(s) = \frac{\chi_{B(0,1)}(s)}{|B(0,1)|}. \]
Note that
\[ \int_{\mathbb{R}^d} \varphi(s)\, ds = 1. \]
We ask ourselves the following question: given a $\varphi$ with total mass $1$, when is it true that
\[ \lim_{r\to 0} \int_{\mathbb{R}^d} \frac{1}{r^d}\varphi\Big(\frac{x-y}{r}\Big) f(y)\, dy = f(x) \]
almost everywhere, like with the above? In order to make life easier on ourselves we will often write
\[ \varphi_r(s) = \frac{1}{r^d}\varphi(s/r). \]
Consider $\varphi$ some function on $\mathbb{R}^d$ with total mass $1$, and let
\[ v(x,t) = \varphi_t * f(x) = \int_{\mathbb{R}^d} \varphi_t(x-y) f(y)\, dy = \int_{\mathbb{R}^d} \frac{1}{t^d}\varphi\Big(\frac{x-y}{t}\Big) f(y)\, dy. \]
We will think of $v(x,t)$ as a function on $\mathbb{R}^{d+1}_+ = \{\, (x,t) \mid x \in \mathbb{R}^d,\ t > 0 \,\}$. This means that the question we're asking, namely whether
\[ \lim_{t\to 0} v(x,t) = f(x) \]
almost everywhere, boils down to asking what happens as we project radially downwards onto $\mathbb{R}^d$. We can make a slight generalisation without trouble: consider

\[ \Gamma_\alpha(x) = \{\, (y,t) \mid y \in \mathbb{R}^d,\ t > 0,\ |x-y| < \alpha t \,\}, \]
i.e. a cone above the fixed point $x$. We may then consider limits of the form
\[ \lim_{\substack{(y,t)\to(x,0) \\ (y,t)\in\Gamma_\alpha(x)}} v(y,t) = f(x) \]
almost everywhere. This type of limit is called a nontangential limit, since the path we take approaching $(x,0)$ cannot be tangential to $\mathbb{R}^d$ at $x$. Recall how, when we proved the Lebesgue differentiation theorem, we wrote $f = (f - g) + g$ with $g$ a continuous function, proved the theorem for $g$, and finally controlled $f-g$ using the maximal function $M(f-g)$. We use exactly the same strategy for this more general result:

Lemma 16.1.1. Suppose $\varphi \in L^1(\mathbb{R}^d)$ is such that
\[
\int_{\mathbb{R}^d} \varphi(x) \, dx = 1,
\]
and let $\alpha > 0$. Finally let $g$ be continuous with compact support. Then
\[
\lim_{\substack{(y,t) \to (x,0) \\ (y,t) \in \Gamma_\alpha(x)}} v(y,t) = g(x)
\]
at all $x$.

Proof. Take (yj, tj) ∈ Γα(x) such that (yj, tj) → (x, 0) as j → ∞. Then

\begin{align*}
|v(y_j, t_j) - g(x)| &= |(\varphi_{t_j} * g)(y_j) - g(x)| = \left| \int_{\mathbb{R}^d} \frac{\varphi\big(\frac{y_j - s}{t_j}\big)}{t_j^d} g(s) \, ds - g(x) \right| \\
&= \left| \int_{\mathbb{R}^d} \frac{\varphi\big(\frac{y_j - s}{t_j}\big)}{t_j^d} g(s) \, ds - \int_{\mathbb{R}^d} \frac{\varphi\big(\frac{y_j - s}{t_j}\big)}{t_j^d} g(x) \, ds \right| \\
&\le \int_{\mathbb{R}^d} \frac{\big|\varphi\big(\frac{y_j - s}{t_j}\big)\big|}{t_j^d} |g(s) - g(x)| \, ds.
\end{align*}

We make a change of variable, taking $u = (y_j - s)/t_j$, meaning that $s = y_j - t_j u$ and $du = (-1)^d / t_j^d \, ds$. Note that the $(-1)^d$ will disappear when we switch the limits of integration in each variable, should $d$ happen to be odd. Then the above is equal to
\[
\int_{\mathbb{R}^d} |\varphi(u)| \, |g(y_j - t_j u) - g(x)| \, du,
\]
and since $g$ is continuous,

\[
|y_j - t_j u - x| \le |y_j - x| + t_j |u| < (\alpha + |u|) t_j
\]

(since $|y_j - x| < \alpha t_j$ in $\Gamma_\alpha(x)$) implies that

\[
\lim_{j \to \infty} |g(y_j - t_j u) - g(x)| = 0
\]
for every $u$. Moreover

\[
|\varphi(u)| \, |g(y_j - t_j u) - g(x)| \le 2 \|g\|_\infty |\varphi(u)|
\]
since $g$ is continuous with compact support. Therefore, by the Lebesgue dominated convergence theorem,
\[
\lim_{j \to \infty} \int_{\mathbb{R}^d} |\varphi(u)| \, |g(y_j - t_j u) - g(x)| \, du = \int_{\mathbb{R}^d} \lim_{j \to \infty} |\varphi(u)| \, |g(y_j - t_j u) - g(x)| \, du = 0.
\]
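The lemma is easy to sanity-check numerically in one dimension. Below is a minimal sketch (my own illustration, not from the notes; the test function and grid sizes are arbitrary choices) taking $\varphi = \frac{1}{2}\chi_{[-1,1]}$, so that $\varphi_t * g$ is just the average of $g$ over $[y-t, y+t]$, and following a sequence $(y_j, t_j)$ inside the cone $\Gamma_\alpha(x)$:

```python
import numpy as np

def g(s):
    # continuous with compact support: a "hat" function on [-1, 1]
    return np.maximum(0.0, 1.0 - np.abs(s))

def v(y, t, n=20001):
    # v(y, t) = (phi_t * g)(y) for phi = (1/2) chi_[-1,1],
    # i.e. the average of g over the interval [y - t, y + t]
    return float(g(np.linspace(y - t, y + t, n)).mean())

x, alpha = 0.0, 1.0
errs = []
for j in range(1, 8):
    t = 2.0 ** (-j)
    y = x + 0.5 * alpha * t      # (y, t) stays inside the cone |x - y| < alpha t
    errs.append(abs(v(y, t) - g(x)))

print(errs)
```

The errors shrink roughly linearly in $t_j$, consistent with $v(y,t) \to g(x)$ nontangentially at every point where $g$ is continuous.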

Definition 16.1.2. A function $\Psi$ on $\mathbb{R}^d$ is said to be radial if $\Psi(x) = \Psi(y)$ whenever $|x| = |y|$. In other words, and hence the name, the value of the function at a point depends only on the distance from the origin of that point. Sometimes, in an abuse of notation, we will write $\Psi(r)$, with $r \ge 0$ being the radius.

Lemma 16.1.3. Suppose $\Psi$ is radial, bounded and positive. Suppose moreover that $\Psi \in L^1(\mathbb{R}^d)$ and that $\Psi(x)$ is decreasing as $|x| \to \infty$. Set
\[
\tilde\Psi(x) = \sup_{\{ y \mid |y - x| < \alpha \}} \Psi(y).
\]

Then $\tilde\Psi(x)$ is a bounded, radial function in $L^1(\mathbb{R}^d)$, and $\tilde\Psi(x)$ is decreasing as $|x| \to \infty$.

Proof. By definition, we have that for $|x| \le \alpha$, $\tilde\Psi(x) = \Psi(0)$, and if $|x| > \alpha$, then $\tilde\Psi(x) = \Psi(|x| - \alpha)$. From there the boundedness and radialness follow by definition, and the function being decreasing is clear from the above.

Definition 16.1.4. For $\alpha > 0$, $\varphi \in L^1(\mathbb{R}^d)$, and $f \in L^p(\mathbb{R}^d)$ with $1 \le p \le \infty$, set $v(x,t) = \varphi_t * f(x)$ and let
\[
N_\alpha v(x) = \sup\{ |v(y,t)| \mid (y,t) \in \Gamma_\alpha(x) \},
\]
called the nontangential maximal function. In other words, it is the supremum of all the $v(y,t)$ in the cones discussed before. This plays the same role the Hardy-Littlewood maximal function did in the Lebesgue differentiation theorem. As it turns out, we needn't reinvent the wheel either: much of what we know about the Hardy-Littlewood maximal function transfers!

Theorem 16.1.5. Suppose $\varphi \in L^1(\mathbb{R}^d)$ is bounded. Let
\[
\Psi(x) = \operatorname*{ess\,sup}_{|y| \ge |x|} |\varphi(y)|,
\]
which we assume is integrable. Then if $f \in L^p(\mathbb{R}^d)$ with $1 \le p \le \infty$,
\[
N_\alpha v(x) \le C M f(x),
\]
where $C$ depends only on $d$, $\varphi$, and $\alpha$. The function $\Psi$ as defined above is called the least decreasing radial majorant of $\varphi$.

Proof. Take $(y,t) \in \Gamma_\alpha(x)$. Then
\[
|\varphi_t * f(y)| = \left| \int_{\mathbb{R}^d} \frac{\varphi\big(\frac{y-s}{t}\big)}{t^d} f(s) \, ds \right|
\le \int_{\mathbb{R}^d} \frac{\Psi\big(\frac{y-s}{t}\big)}{t^d} |f(s)| \, ds
\le \int_{\mathbb{R}^d} \frac{\tilde\Psi\big(\frac{x-s}{t}\big)}{t^d} |f(s)| \, ds
\]
since
\[
\left| \frac{x-s}{t} - \frac{y-s}{t} \right| < \alpha.
\]
Therefore
\[
N_\alpha v(x) \le \sup_{t > 0} \int_{\mathbb{R}^d} \frac{\tilde\Psi\big(\frac{x-s}{t}\big)}{t^d} |f(s)| \, ds.
\]

Lecture 17 Rising Sun Lemma

17.1 Nontangential Maximal Function

We start by proving the theorem stated at the end of last lecture.

Proof continued. We left off last time at
\[
N_\alpha v(x) \le \sup_{t > 0} \int_{\mathbb{R}^d} \frac{\tilde\Psi\big(\frac{x-s}{t}\big)}{t^d} |f(s)| \, ds.
\]
To finish the proof we would like to show that
\[
\sup_{t > 0} \int_{\mathbb{R}^d} \frac{\tilde\Psi\big(\frac{x-s}{t}\big)}{t^d} |f(s)| \, ds \le C M f(x) = C \sup_{r > 0} \int_{\mathbb{R}^d} \frac{\chi_{B(x,r)}(y)}{|B(x,r)|} |f(y)| \, dy.
\]
We restrict our study to the case $x = 0$; all other cases can be derived from this by a change of variable in the ordinary way. We therefore instead need to show that
\[
\sup_{t > 0} \int_{\mathbb{R}^d} \frac{\tilde\Psi\big(\frac{s}{t}\big)}{t^d} |f(s)| \, ds \le C M f(0) = C \sup_{r > 0} \int_{\mathbb{R}^d} \frac{\chi_{B(0,r)}(y)}{|B(0,r)|} |f(y)| \, dy.
\]
Fix $t$. Using Fubini's theorem to switch the order of integration, we get
\begin{align*}
\int_{\mathbb{R}^d} \frac{\tilde\Psi(s/t)}{t^d} |f(s)| \, ds
&= \int_{\mathbb{R}^d} \int_0^{\tilde\Psi(s/t)/t^d} dr \, |f(s)| \, ds
= \int_0^\infty \int_{\{ s \mid \tilde\Psi(s/t)/t^d > r \}} |f(s)| \, ds \, dr \\
&= \int_0^\infty |\{ s \mid \tilde\Psi(s/t)/t^d > r \}| \cdot \frac{1}{|\{ s \mid \tilde\Psi(s/t)/t^d > r \}|} \int_{\{ s \mid \tilde\Psi(s/t)/t^d > r \}} |f(s)| \, ds \, dr.
\end{align*}

Now since $\tilde\Psi$ is radial and decreasing, its level sets are balls centred at the origin. The maximal function is the supremum of averages over all such balls, so it dominates the inner average above, whence the last expression is
\[
\le \int_0^\infty |\{ s \mid \tilde\Psi(s/t)/t^d > r \}| \, M f(0) \, dr = M f(0) \int_{\mathbb{R}^d} \frac{\tilde\Psi(s/t)}{t^d} \, ds = M f(0) \int_{\mathbb{R}^d} \tilde\Psi(s) \, ds.
\]

Theorem 17.1.1. Suppose $\varphi \in L^1(\mathbb{R}^d)$ is bounded, that its integral is 1, and that its least decreasing radial majorant $\Psi$ is integrable. Then for $f \in L^p(\mathbb{R}^d)$, $1 \le p \le \infty$,
\[
\lim_{\substack{(y,t) \to (x,0) \\ (y,t) \in \Gamma_\alpha(x)}} \varphi_t * f(y) = f(x)
\]
almost everywhere.

Proof. Set
\[
T f(x) = \limsup_{\substack{(y,t) \to (x,0) \\ (y,t) \in \Gamma_\alpha(x)}} |\varphi_t * f(y) - f(x)|.
\]
Let $\varepsilon > 0$ and let $k$ be a positive integer. Choose a function $g$ that is continuous with compact support such that $\|f - g\|_p < 1/k$. Then

\[
\varphi_t * f(y) - f(x) = \varphi_t * f(y) - \varphi_t * g(y) + \varphi_t * g(y) - g(x) + g(x) - f(x).
\]

By the lemma from last time, T g(x) = 0. Thus

\[
T f(x) \le T(f-g)(x) + |g(x) - f(x)| \le N_\alpha(\varphi_t * (f-g))(x) + |g(x) - f(x)|,
\]
so
\[
T f(x) \le C M(f-g)(x) + |f(x) - g(x)|.
\]
Therefore
\[
|\{ x \mid T f(x) > \varepsilon \}| \le |\{ x \mid M(f-g)(x) > \varepsilon/(2C) \}| + |\{ x \mid |f(x) - g(x)| > \varepsilon/2 \}|,
\]
which, by the maximal function being of weak type $(p,p)$ and by Chebyshev's inequality on the second set, is bounded by
\[
\frac{D \|f - g\|_p^p}{(\varepsilon/(2C))^p} + \frac{\|f - g\|_p^p}{(\varepsilon/2)^p}
\]
for a constant $D$. All of this is therefore bounded by $1/k^p$ times some constant, and as we let $k \to \infty$ everything goes to 0. Therefore
\[
|\{ x \mid T f(x) > \varepsilon \}| = 0,
\]
and if we take some countable sequence of $\varepsilon_n$ going to 0 and take the union over these we get $|\{ x \mid T f(x) > 0 \}| = 0$, and we are done.

17.2 Riesz's Proof of the Hardy-Littlewood Theorem

We'll take a moment to appreciate the beautifully simple proof Riesz produced for the Hardy-Littlewood maximal function theorem. Consider
\[
M_R f(x) = \sup_{\xi > x} \frac{1}{\xi - x} \int_x^\xi |f(t)| \, dt,
\]
a right-handed maximal function, and similarly $M_L$. Then clearly $M f \le M_R f + M_L f$. Now set
\[
F(x) = \int_0^x |f(t)| \, dt,
\]
making $F$ an increasing function. Fix $\lambda$ and imagine rays of a sun infinitely far away shining down on the graph of $F$, the rays coming in with slope $\lambda$. Then there will be areas that are in the shadow, which we can characterise as intervals $(a_i, b_i)$. Now $M_R f(x) > \lambda$ if and only if there exists some $\xi > x$ such that
\[
\frac{1}{\xi - x} \int_x^\xi |f(t)| \, dt > \lambda,
\]
which in turn is true if and only if there exists some $\xi > x$ such that
\[
\frac{F(\xi) - F(x)}{\xi - x} > \lambda,
\]
which moreover is true if and only if $x$ is in the shadow. Therefore $\{ x \mid M_R f(x) > \lambda \}$ is the same as the set of $x$ in the shadows, which is the union $\bigcup_i (a_i, b_i)$.

For each $(a_i, b_i)$ we have exactly
\[
\frac{F(b_i) - F(a_i)}{b_i - a_i} = \lambda.
\]
Therefore
\[
|\{ x \mid M_R f(x) > \lambda \}| = \sum_i (b_i - a_i) = \frac{1}{\lambda} \sum_i (F(b_i) - F(a_i))
\le \frac{1}{\lambda} \sum_i \int_{a_i}^{b_i} |f(t)| \, dt \le \frac{1}{\lambda} \int_{\mathbb{R}} |f(t)| \, dt = \frac{\|f\|_1}{\lambda}.
\]

So $M_R$ is weak type $(1,1)$, by a really, really simple argument. The same argument gives that $M_L$ is weak type $(1,1)$, and together $M$ is weak type $(1,1)$, and we've proven the Hardy-Littlewood maximal function theorem.
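Riesz's argument is concrete enough to check on a computer. The following sketch (my own illustration; the grid size, the example $f$, and $\lambda$ are arbitrary choices) computes the right-handed maximal function $M_R f$ on a grid and compares the measure of $\{ M_R f > \lambda \}$ against $\|f\|_1 / \lambda$:

```python
import numpy as np

n = 2000
x = np.linspace(0.0, 1.0, n)
dx = x[1] - x[0]
f = np.where((x > 0.3) & (x < 0.4), 5.0, 0.0)       # a bump with ||f||_1 = 0.5

# F(x) = int_0^x |f(t)| dt, so averages are slopes of secant lines of F
F = np.concatenate([[0.0], np.cumsum(np.abs(f)) * dx])

MR = np.zeros(n)
for i in range(n - 1):
    j = np.arange(i + 1, n)
    MR[i] = np.max((F[j] - F[i]) / (x[j] - x[i]))   # sup of right-hand averages

lam = 2.0
shadow = np.sum(MR > lam) * dx                      # |{ M_R f > lam }|
bound = np.sum(np.abs(f)) * dx / lam                # ||f||_1 / lam
print(shadow, bound)
```

For this $f$ the weak-type inequality is essentially sharp: the shadow of the rising sun is (up to grid error) the interval $(0.15, 0.4)$, of length exactly $\|f\|_1 / \lambda = 0.25$.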

Lecture 18 Calder´on-Zygmund Decomposition of Functions

18.1 Higher-Dimensional Rising Sun Lemma

Theorem 18.1.1 (Calderón-Zygmund, 1952). Let $f \ge 0$ be a function in $L^1(\mathbb{R}^d)$, and let $\lambda > 0$. Then there exists a decomposition of $\mathbb{R}^d$ so that

(i) $\mathbb{R}^d = F \cup \Omega$ and $F \cap \Omega = \emptyset$;

(ii) $f \le \lambda$ almost everywhere on $F$;

(iii) $\Omega = \bigcup_k Q_k$, where the $Q_k$ are cubes whose interiors are disjoint and such that for each cube $Q_k$ we have
\[
\lambda < \frac{1}{|Q_k|} \int_{Q_k} f(x) \, dx \le 2^d \lambda.
\]

Remark 18.1.2. Note that F ⊂ { x | f ≤ λ } but they need not be equal.

Definition 18.1.3 (Dyadic cube). A dyadic cube in $\mathbb{R}^d$ is a cube of the form
\[
Q = \left[ \frac{k_1}{2^j}, \frac{k_1 + 1}{2^j} \right) \times \left[ \frac{k_2}{2^j}, \frac{k_2 + 1}{2^j} \right) \times \dots \times \left[ \frac{k_d}{2^j}, \frac{k_d + 1}{2^j} \right)
\]
for some integers $k_1, k_2, \dots, k_d$ and an integer $j$.

Proof of theorem. Since $f \in L^1(\mathbb{R}^d)$, we have
\[
\int_{\mathbb{R}^d} f(t) \, dt < \infty,
\]
and therefore we can choose $j$ such that
\[
\frac{1}{|Q|} \int_Q f(t) \, dt < \lambda
\]
for every dyadic cube $Q$ of side length $2^j$: since the integral over the whole space is some finite number, we can choose $j$ large enough that
\[
\frac{1}{2^{jd}} \int_{\mathbb{R}^d} f(t) \, dt < \lambda,
\]
and so if $Q$ has side length $2^j$, then
\[
\lambda > \frac{1}{|Q|} \int_{\mathbb{R}^d} f(t) \, dt \ge \frac{1}{|Q|} \int_Q f(t) \, dt.
\]
Now let $Q'$ be one of the cubes in this family. Divide $Q'$ into $2^d$ dyadic subcubes, and let $Q''$ be one of them. There are two possibilities:
\[
\text{Type 1:} \quad \frac{1}{|Q''|} \int_{Q''} f(x) \, dx \le \lambda,
\qquad
\text{Type 2:} \quad \frac{1}{|Q''|} \int_{Q''} f(x) \, dx > \lambda.
\]
If $Q''$ is a cube of Type 2, we leave it alone. Otherwise, we divide it into $2^d$ subcubes again and repeat. Continue this process, always leaving fixed cubes of Type 2 and dividing cubes of Type 1. Let $\{ Q_k \}$ be the collection of all Type 2 cubes, and set $\Omega = \bigcup_k Q_k$. By construction the $Q_k$ have disjoint interiors. Now consider a typical $Q_k$. It was the product of dividing a Type 1 cube, so if we let $\tilde Q_k$ denote the parent cube it stemmed from, we have
\[
\lambda < \frac{1}{|Q_k|} \int_{Q_k} f(x) \, dx \quad \text{and} \quad \frac{1}{|\tilde Q_k|} \int_{\tilde Q_k} f(x) \, dx \le \lambda.
\]
So we have

\[
\frac{1}{|Q_k|} \int_{Q_k} f(x) \, dx \le \frac{|\tilde Q_k|}{|Q_k|} \cdot \frac{1}{|\tilde Q_k|} \int_{\tilde Q_k} f(x) \, dx \le 2^d \lambda,
\]

since the ratio between $|\tilde Q_k|$ and $|Q_k|$ is precisely $2^d$: we divided the former into $2^d$ equal parts to get the latter. Now set $F = \Omega^c$. Suppose $x \notin \Omega$, meaning that $x$ is never in a Type 2 cube. Therefore there exist dyadic cubes $Q_\ell$ with side lengths going to 0 such that $x \in Q_\ell$ for all $\ell$ and each $Q_\ell$ is of Type 1. In other words, since these $Q_\ell$ are shrinking, they are regular at $x$, meaning that if $Q_\ell$ has side length $1/2^\ell$ we can cover it by a ball $B(x, \sqrt{d}/2^\ell) \supset Q_\ell$. So for almost every $x \in F$, by the Lebesgue differentiation theorem we have
\[
\lambda \ge \frac{1}{|Q_\ell|} \int_{Q_\ell} f(y) \, dy \to f(x),
\]
and so $f(x) \le \lambda$ for almost every $x \in F$, as claimed.

Note that this method of proof isn't all that unfamiliar: it is exactly the same strategy one uses when proving, say, the Bolzano-Weierstrass theorem about convergent subsequences of bounded sequences.
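The stopping-time construction in the proof is easy to implement. Here is a one-dimensional sketch (my own illustration, not from the lecture; the test function, $\lambda$, the sampling resolution, and the depth cutoff are arbitrary choices) that subdivides dyadic subintervals of $[0,1)$ and keeps exactly the Type 2 ones:

```python
import numpy as np

def average(f, a, b, n=256):
    # midpoint-rule approximation of the average of f over [a, b)
    s = np.linspace(a, b, n, endpoint=False) + (b - a) / (2 * n)
    return float(f(s).mean())

def cz_decompose(f, lam, lo=0.0, hi=1.0, min_size=2.0 ** -12):
    # assumes average(f, lo, hi) <= lam; returns the selected "Type 2"
    # dyadic intervals Q_k, on which lam < avg_Q f <= 2 * lam
    selected = []
    stack = [(lo, hi)]
    while stack:
        a, b = stack.pop()
        m = (a + b) / 2
        for c, d in ((a, m), (m, b)):
            if average(f, c, d) > lam:
                selected.append((c, d))          # Type 2: stop here
            elif d - c > min_size:
                stack.append((c, d))             # Type 1: keep subdividing
    return selected

f = lambda s: np.where((s > 0.43) & (s < 0.57), 20.0, 0.0)
lam = 4.0
cubes = sorted(cz_decompose(f, lam))
avgs = [average(f, a, b) for a, b in cubes]
print(cubes, avgs)
```

Each selected interval has average in $(\lambda, 2\lambda]$, inherited from its Type 1 parent, exactly as in part (iii) of the theorem.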

Lecture 19 Density of Sets

19.1 Hardy-Littlewood's Theorem from Calderón-Zygmund

We will show, mostly for fun, that if we have a Calderón-Zygmund decomposition of $\mathbb{R}^d$, then the weak type $(1,1)$ of $Mf$ follows. Fix $\lambda$ and suppose $x \in \mathbb{R}^d$ has $Mf(x) > c\lambda$ for some constant $c$ we'll fix later. Then there exists a ball $B(x,r)$ such that

\[
\frac{1}{|B(x,r)|} \int_{B(x,r)} |f(y)| \, dy > c\lambda.
\]

Let $j$ be the largest integer so that a dyadic cube of side length $2^j$ fits inside $B(x,r)$. Consider all dyadic cubes of side length $2^j$ that intersect $B(x,r)$; call these $c_1, c_2, \dots, c_m$, where the number $m$ of them depends only on the dimension $d$. Then
\[
\int_{\bigcup_{j=1}^m c_j} |f(y)| \, dy \ge \int_{B(x,r)} |f(y)| \, dy > c\lambda |B(x,r)| \ge c\lambda |c_j|
\]
for every $j$. So there exists at least one $c_j$ such that
\[
\int_{c_j} |f(y)| \, dy > \frac{c}{m} \lambda |c_j| = \lambda |c_j|,
\]
if we now fix $c = m$. We claim that $c_j \subset \bigcup_k Q_k$, where the $Q_k$ are the same cubes as in the Calderón-Zygmund decomposition. If not, then $c_j \cap Q_k = \emptyset$ for all $k$, meaning that $c_j \subset F$; but since $|f| \le \lambda$ almost everywhere on $F$, this gives rise to a contradiction, since it implies
\[
\frac{1}{|c_j|} \int_{c_j} |f(y)| \, dy \le \lambda,
\]
even though we know the same quantity is strictly greater than $\lambda$. Therefore
\[
\lambda < \frac{1}{|c_j|} \int_{c_j} |f(y)| \, dy \le 2^d \lambda.
\]
There exists a constant $D$ depending on the dimension $d$ such that the $D$-fold enlargement of $c_j$ contains $B(x,r)$. Call this enlargement $\tilde c_j$. Similarly, let $\tilde Q_k$ be the $D$-fold enlargements of the Calderón-Zygmund cubes, and consider the union $\bigcup_k \tilde Q_k$. Then
\[
\{ x \in \mathbb{R}^d \mid Mf(x) > c\lambda \} \subset \bigcup_k \tilde Q_k.
\]
The cubes $Q_k$ have disjoint interiors, so

\[
|\{ x \in \mathbb{R}^d \mid Mf(x) > c\lambda \}| \le \sum_k |\tilde Q_k| \le D^d \sum_k |Q_k|,
\]

and the measure of $Q_k$ can be bounded by an integral by rearranging their property from the decomposition, so the above is less than or equal to

\[
D^d \sum_k \frac{1}{\lambda} \int_{Q_k} |f(y)| \, dy \le \frac{D^d}{\lambda} \int_{\bigcup_k Q_k} |f(y)| \, dy.
\]
Thus
\[
|\{ x \in \mathbb{R}^d \mid Mf(x) > \lambda \}| \le \frac{c D^d}{\lambda} \int_{\bigcup_k Q_k} |f(y)| \, dy \le \frac{c D^d}{\lambda} \int_{\mathbb{R}^d} |f(y)| \, dy = \frac{c D^d}{\lambda} \|f\|_1,
\]
meaning that we have weak type $(1,1)$.

There is a problem with this line of reasoning: we used the Lebesgue differentiation theorem to prove the existence of the Calderón-Zygmund decomposition, but our proof of the Lebesgue differentiation theorem in turn required the Hardy-Littlewood theorem for the maximal function.

19.2 Density of Sets

Definition 19.2.1 (Point of density). Let $E \subset \mathbb{R}^d$ be a measurable set. We say that $x$ is a point of density of $E$ if

\[
\lim_{r \to 0} \frac{|E \cap B(x,r)|}{|B(x,r)|} = 1.
\]

Theorem 19.2.2. Let $E \subset \mathbb{R}^d$ be measurable. Then almost every point of $E$ is a point of density of $E$.

Proof. By the Lebesgue differentiation theorem we have
\[
\lim_{r \to 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} f(y) \, dy = f(x)
\]
for almost every $x$. In particular, taking $f = \chi_E$, we have
\[
\lim_{r \to 0} \frac{1}{|B(x,r)|} \int_{B(x,r)} \chi_E(y) \, dy = \chi_E(x)
\]
almost everywhere. But for $x \in E$ this is the same as

\[
\lim_{r \to 0} \frac{|B(x,r) \cap E|}{|B(x,r)|} = 1.
\]
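As a quick numerical illustration (my own, with arbitrary grid parameters): for $E = [0,1]$ the density quotient tends to 1 at an interior point, while at the endpoint 0 it sits at $1/2$, which is consistent with the theorem since the endpoints form a null set:

```python
import numpy as np

def density_ratio(indicator, x, r, n=100001):
    # |E ∩ B(x, r)| / |B(x, r)|, approximated on a uniform grid over B(x, r)
    s = np.linspace(x - r, x + r, n)
    return float(indicator(s).mean())

E = lambda s: ((s >= 0.0) & (s <= 1.0)).astype(float)   # E = [0, 1]

interior = [density_ratio(E, 0.5, r) for r in (0.1, 0.01, 0.001)]
endpoint = [density_ratio(E, 0.0, r) for r in (0.1, 0.01, 0.001)]
print(interior, endpoint)
```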

Example 19.2.3. Let $E$ be the set of irrational numbers in $\mathbb{R}$. Then
\[
\frac{|E \cap B(x,r)|}{|B(x,r)|} = 1
\]
for all $x$ and $r$, so every $x \in \mathbb{R}$ is a point of density. N

Example 19.2.4. Let $E = \mathbb{Q}$. Then $|\mathbb{Q} \cap B(x,r)| = 0$ for all $x$ and $r$, so $\mathbb{Q}$ has no points of density. N

Definition 19.2.5 (Distance to a set). Let $F \subset \mathbb{R}^d$. Set
\[
\delta(x, F) = \inf\{ |x - y| \mid y \in F \},
\]
called the distance from $x$ to the set $F$. We write $\delta(x)$ if the set $F$ is understood. It is a fact that if $F$ is closed, then $x \in F$ is equivalent to $\delta(x, F) = 0$.

Proposition 19.2.6. Let $F \subset \mathbb{R}^d$ be closed and let $x \in F$. Then for every $y \in \mathbb{R}^d$ we have $\delta(x + y) \le |y|$.

Proof. This is completely straightforward: certainly we have $|y| = |(x+y) - x|$ with $x \in F$, so taking the infimum over $F$ we get $\delta(x + y) \le |y|$.

Proposition 19.2.7. Let $F \subset \mathbb{R}^d$ be closed. If $x$ is a point of density of $F$, then $\delta(x + y) = o(|y|)$ as $|y| \to 0$. That is, if $x$ is a point of density of $F$, it has the property that given $\varepsilon > 0$, there exists some $\nu$ depending on $x$ and $\varepsilon$ such that $|y| < \nu$ implies $\delta(x + y) \le \varepsilon |y|$.

Remark 19.2.8. We must have $x \in F$, for otherwise there would be a ball around $x$ that doesn't meet $F$, so the quotient in the density would be 0. Moreover a point of density can be on the boundary of $F$: suppose, for instance, $x$ is at the point of a cusp.

Proof. Let $x$ be a point of density and $\varepsilon > 0$. Consider $B(x, |y| + \varepsilon|y|)$ and $B(x+y, \varepsilon|y|)$. Then clearly

\[
B(x+y, \varepsilon|y|) \subset B(x, |y| + \varepsilon|y|).
\]

Note that if $F \cap B(x+y, \varepsilon|y|) = \emptyset$, then
\begin{align*}
\frac{|F \cap B(x, |y| + \varepsilon|y|)|}{|B(x, |y| + \varepsilon|y|)|}
&\le \frac{|B(x, |y| + \varepsilon|y|)| - |B(x+y, \varepsilon|y|)|}{|B(x, |y| + \varepsilon|y|)|}
= \frac{C_d (|y| + \varepsilon|y|)^d - C_d (\varepsilon|y|)^d}{C_d (|y| + \varepsilon|y|)^d} \\
&= \frac{(1+\varepsilon)^d - \varepsilon^d}{(1+\varepsilon)^d} = 1 - \left( \frac{\varepsilon}{1+\varepsilon} \right)^d,
\end{align*}
where $C_d$ is the volume of the unit ball in $d$ dimensions. Since $x$ is a point of density of $F$, this is false for small $|y|$, because the density quotient tends to 1 as the radius shrinks. Thus there exists some $\nu$ such that $|y| < \nu$ implies that $B(x+y, \varepsilon|y|)$ contains points of $F$, and therefore $\delta(x + y) \le \varepsilon|y|$.

Definition 19.2.9 (Marcinkiewicz integral). Let $F$ be a closed set in $\mathbb{R}^d$ and set $\delta(x) = \delta(x, F)$. For $x \in \mathbb{R}^d$, define
\[
I(x) = \int_{|y| < 1} \frac{\delta(x+y)}{|y|^{d+1}} \, dy,
\]
called the Marcinkiewicz integral.

Theorem 19.2.10. (i) If $x \notin F$, then $I(x) = \infty$. (ii) For almost every $x \in F$, $I(x) < \infty$.

Lecture 20 Marcinkiewicz Integral

20.1 Convergence of Marcinkiewicz Integral

Before we go on to prove the theorem about the convergence of Marcinkiewicz integrals stated last time, note that in $\mathbb{R}^d$,
\[
\int_{|y| \le 1} \frac{1}{|y|^\alpha} \, dy = \int_0^1 \int_{S^{d-1}} \frac{1}{r^\alpha} r^{d-1} \, d\sigma \, dr,
\]
where $S^{d-1}$ is the $(d-1)$-dimensional sphere and $\sigma$ is the surface area measure. This is then equal to
\[
\sigma(S^{d-1}) \int_0^1 \frac{1}{r^{\alpha - d + 1}} \, dr = \sigma(S^{d-1}) \cdot
\begin{cases}
\left. \dfrac{r^{d-\alpha}}{d-\alpha} \right|_0^1 & \text{if } d \ne \alpha, \\[2ex]
\log(r) \big|_0^1 & \text{if } d = \alpha,
\end{cases}
\]
which is infinite if $d \le \alpha$ and otherwise equal to $\sigma(S^{d-1})/(d - \alpha)$.

Note moreover that the previous bounds on $\delta(x+y)$ we've discussed are insufficient to prove the theorem, since $\delta(x+y) \le |y|$ simply yields

\[
I(x) \le \int_{|y|<1} \frac{|y|}{|y|^{d+1}} \, dy = \int_{|y|<1} \frac{1}{|y|^d} \, dy = \infty,
\]
so no information is gained. Similarly for the second bound: if $x \in F$ is a point of density we have $\delta(x+y) = o(|y|)$, in other words for every $\varepsilon > 0$ there exists a $\nu > 0$ such that $|y| < \nu$ implies $\delta(x+y) < \varepsilon|y|$. This means that if we pick a sequence $\varepsilon_n$ going to 0, we can create a function $\gamma(y) = o(1)$ such that $\delta(x+y) < \gamma(y)|y|$. Therefore
\[
I(x) \le \int_{|y|<1} \frac{o(|y|)}{|y|^{d+1}} \, dy = \int_{|y|<1} \frac{\gamma(y)|y|}{|y|^{d+1}} \, dy = \int_{|y|<1} \frac{\gamma(y)}{|y|^d} \, dy.
\]

If, say, $\gamma(y) = |y|^\beta$ for $\beta > 0$, then the integral will converge, whereas if, perhaps, $\gamma(y) = 1/\log(1/|y|)$, the integral will be infinite. In other words, bounding the numerator by $o(|y|)$ is not enough either. All this to say, the theorem is quite subtle! We'll prove the theorem using the following lemma:

Lemma 20.1.1. Let $F$ be a closed set whose complement has finite measure. Set
\[
I_*(x) = \int_{\mathbb{R}^d} \frac{\delta(x+y)}{|y|^{d+1}} \, dy.
\]

Then $I_*(x) < \infty$ for almost every $x \in F$. Moreover
\[
\int_F I_*(x) \, dx \le C |F^c| < \infty
\]
for some constant $C$.

Note that $I_*(x) \ge I(x)$, since the former integrates over a larger set.

Proof. We show the second part; doing so implies the first, since the integrand has to be finite almost everywhere for the integral to be finite. This goes more or less as usual: we use Fubini's theorem to switch the order of integration. First, however, we make the change of variable $u = y + x$, so
\begin{align*}
\int_F I_*(x) \, dx &= \int_F \int_{\mathbb{R}^d} \frac{\delta(x+y)}{|y|^{d+1}} \, dy \, dx = \int_F \int_{\mathbb{R}^d} \frac{\delta(u)}{|u-x|^{d+1}} \, du \, dx \\
&= \int_F \int_{F^c} \frac{\delta(u)}{|u-x|^{d+1}} \, du \, dx = \int_{F^c} \int_F \frac{\delta(u)}{|u-x|^{d+1}} \, dx \, du,
\end{align*}
where the switch from $\mathbb{R}^d$ to $F^c$ stems from $\delta(u) = 0$ on $F$. Examine the inner integral, keeping $u \in F^c$ fixed. For $x \in F$ we have $\delta(u) \le |x - u|$, whereby
\begin{align*}
\int_F \frac{\delta(u)}{|u-x|^{d+1}} \, dx &\le \int_{\{ x \mid |x-u| \ge \delta(u) \}} \frac{\delta(u)}{|u-x|^{d+1}} \, dx = \delta(u) \int_{B(u,\delta(u))^c} \frac{1}{|x-u|^{d+1}} \, dx \\
&= \delta(u) \int_{B(0,\delta(u))^c} \frac{1}{|y|^{d+1}} \, dy = \delta(u) \int_{\delta(u)}^\infty \int_{S^{d-1}} \frac{1}{r^{d+1}} r^{d-1} \, d\sigma \, dr \\
&= \delta(u) \, \sigma(S^{d-1}) \int_{\delta(u)}^\infty \frac{1}{r^2} \, dr = \delta(u) \, \sigma(S^{d-1}) \frac{1}{\delta(u)} = C,
\end{align*}
whereby
\[
\int_F I_*(x) \, dx = \int_{F^c} \int_F \frac{\delta(u)}{|x-u|^{d+1}} \, dx \, du \le C \int_{F^c} du = C |F^c|.
\]

Proof of the Theorem. For (i), let $x \in F^c$. Then there exists some $r > 0$, and without loss of generality we may assume $r < 1$, such that $B(x,r) \subset F^c$, since $F^c$ is open. Then if $|y| < r/2$, we have $\delta(x+y) \ge r/2$. Hence
\[
I(x) = \int_{|y|<1} \frac{\delta(x+y)}{|y|^{d+1}} \, dy \ge \int_{|y|<r/2} \frac{\delta(x+y)}{|y|^{d+1}} \, dy \ge \frac{r}{2} \int_{|y|<r/2} \frac{1}{|y|^{d+1}} \, dy = \infty.
\]
Part (ii) follows from the lemma, since $I(x) \le I_*(x) < \infty$ for almost every $x \in F$ when $|F^c| < \infty$ (for general $F$ one applies this locally, intersecting with large balls). More generally one can consider integrals of the form
\[
\int_{|y|<1} \frac{\delta(x+y)^\lambda}{|y|^{d+\lambda}} \, dy
\]
for $\lambda > 0$. In other words our result is the special case $\lambda = 1$.
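The polar-coordinate formula $\int_{|y| \le 1} |y|^{-\alpha} \, dy = \sigma(S^{d-1})/(d-\alpha)$ for $\alpha < d$, used repeatedly in this lecture, can be spot-checked by Monte Carlo. The sketch below (my own, with the convergent case $d = 3$, $\alpha = 1$, so $\sigma(S^2) = 4\pi$) samples uniformly from the unit ball:

```python
import numpy as np

rng = np.random.default_rng(0)
d, alpha = 3, 1.0                                   # alpha < d: convergent case

exact = 4 * np.pi / (d - alpha)                     # sigma(S^2) / (d - alpha)

# uniform samples in the unit ball: random direction times radius r^(1/d)
m = 400000
u = rng.standard_normal((m, d))
u /= np.linalg.norm(u, axis=1, keepdims=True)
r = rng.random(m) ** (1.0 / d)
y = u * r[:, None]

vol_ball = 4.0 / 3.0 * np.pi                        # |B(0,1)| in R^3
estimate = vol_ball * np.mean(np.linalg.norm(y, axis=1) ** (-alpha))
print(estimate, exact)
```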

Lecture 21 Integral Operators

21.1 Schur's Lemma

As a change of topic, we will now concern ourselves for a bit with certain integral operators. In particular, let $(X, \mu)$ and $(Y, \nu)$ be measure spaces, and let $k \colon X \times Y \to \mathbb{C}$ be a (jointly) measurable function. Define $T$ as a map from the measurable functions on $Y$ to the measurable functions on $X$ by
\[
T f(x) = \int_Y k(x,y) f(y) \, d\nu(y).
\]
A fairly fundamental result about these is this:

Theorem 21.1.1 (Schur's lemma). Let $(X,\mu)$ and $(Y,\nu)$ be measure spaces, and let $k \colon X \times Y \to \mathbb{C}$ be measurable. Set
\[
T f(x) = \int_Y k(x,y) f(y) \, d\nu(y).
\]
Suppose that there exists a constant $C$ such that

(i) $\displaystyle\int_X |k(x,y)| \, d\mu(x) \le C$ independent of $y$, and

(ii) $\displaystyle\int_Y |k(x,y)| \, d\nu(y) \le C$ independent of $x$.

Then $T$ is bounded on $L^p$, i.e. for each $p$ with $1 \le p \le \infty$ there exists a constant $C_p^*$ such that $\|Tf\|_{L^p(X)} \le C_p^* \|f\|_{L^p(Y)}$.

Proof. First consider $p = \infty$. Then we have
\[
|Tf(x)| \le \int_Y |k(x,y)| |f(y)| \, d\nu(y) \le \|f\|_\infty \int_Y |k(x,y)| \, d\nu(y) \le C \|f\|_\infty,
\]
so $T$ maps a function in $L^\infty(Y)$ to a function in $L^\infty(X)$. Next consider $p = 1$. Then
\begin{align*}
\int_X |Tf(x)| \, d\mu(x) &= \int_X \left| \int_Y k(x,y) f(y) \, d\nu(y) \right| d\mu(x) \le \int_X \int_Y |k(x,y)| |f(y)| \, d\nu(y) \, d\mu(x) \\
&= \int_Y |f(y)| \int_X |k(x,y)| \, d\mu(x) \, d\nu(y) \le C \int_Y |f(y)| \, d\nu(y) = C \|f\|_{L^1(Y)}.
\end{align*}
Therefore $T$ is (strong) type $(1,1)$ and $(\infty,\infty)$, so by Marcinkiewicz interpolation $T$ is strong type $(p,p)$ for all $1 < p < \infty$.

It is possible to fashion this into a new and improved version of the same result; we won't prove it, but it's similar:

Theorem 21.1.2. Suppose $k$ is as in the previous theorem, except that it satisfies

(i) $\displaystyle\int_X |k(x,y)|^r \, d\mu(x) \le C$ independent of $y$, and

(ii) $\displaystyle\int_Y |k(x,y)|^r \, d\nu(y) \le C$ independent of $x$.

Then $\|Tf\|_{L^q(X)} \le C_{p,q}^* \|f\|_{L^p(Y)}$, where $1/q = 1/p + 1/r - 1$.

We will now try to answer the question of what happens when we don't have these strong convergence conditions on $k$. That is, define $T$ in the same way, but suppose we have
\[
\int_{\mathbb{R}^d} |k(x,y)| \, dx = \int_{\mathbb{R}^d} |k(x,y)| \, dy = \infty.
\]
Let $f \in L^1(\mathbb{R})$ and consider
\[
h(z) = \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)}{t - z} \, dt.
\]
Then $h$ is analytic in the upper half plane. Note that we can bound it by
\[
\left| \int_{\mathbb{R}} \frac{f(t)}{t-z} \, dt \right| \le \int_{\mathbb{R}} \frac{|f(t)|}{\operatorname{Im} z} \, dt = \frac{1}{\operatorname{Im} z} \int_{\mathbb{R}} |f(t)| \, dt.
\]
Let $z = x + iy$, and we get

\begin{align*}
h(z) &= \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)}{t - (x+iy)} \, dt = \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)}{(t-x) - iy} \cdot \frac{(t-x) + iy}{(t-x) + iy} \, dt \\
&= \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)(t-x) + iy f(t)}{(t-x)^2 + y^2} \, dt
= \frac{1}{2\pi} \int_{\mathbb{R}} \frac{f(t) \, y}{(t-x)^2 + y^2} \, dt - \frac{i}{2\pi} \int_{\mathbb{R}} \frac{f(t)(t-x)}{(t-x)^2 + y^2} \, dt.
\end{align*}
Suppose $f$ is real-valued. Then we have just decomposed $h$ into real and imaginary parts. It is a fact known as Fatou's theorem that

\[
\lim_{\varepsilon \to 0^+} h(x + i\varepsilon)
\]
exists for almost every $x$. Call this limit $h(x)$. We claim that
\[
\operatorname{Im} h(x) = \lim_{\varepsilon \to 0} -\frac{1}{2\pi} \int_{\{ t \mid |t-x| > \varepsilon \}} \frac{f(t)}{t - x} \, dt.
\]

That is, the imaginary part of $h$ is (up to a constant) the so-called Hilbert transform of $f$. Note for the record that
\[
\int_{\mathbb{R}} \frac{1}{|t-x|} \, dt = \infty
\]
for all $x$, so Schur's lemma doesn't work here. The proof of this claim largely boils down to symmetry. Fix $x$ and $\varepsilon > 0$. Write $T_\varepsilon = \{ t \mid |t-x| > \varepsilon \}$ and let us examine
\begin{align*}
&\left| \int_{T_\varepsilon} \frac{f(t)}{t-x} \, dt - \int_{\mathbb{R}} \frac{f(t)(t-x)}{(t-x)^2 + \varepsilon^2} \, dt \right| \\
&\qquad \le \left| \int_{T_\varepsilon} f(t) \left( \frac{1}{t-x} - \frac{t-x}{(t-x)^2 + \varepsilon^2} \right) dt \right| + \left| \int_{T_\varepsilon^c} f(t) \frac{t-x}{(t-x)^2 + \varepsilon^2} \, dt \right| \\
&\qquad = \left| \int_{T_\varepsilon} (f(t) - f(x)) \frac{\varepsilon^2}{(t-x)((t-x)^2 + \varepsilon^2)} \, dt \right| + \left| \int_{T_\varepsilon^c} (f(t) - f(x)) \frac{t-x}{(t-x)^2 + \varepsilon^2} \, dt \right|,
\end{align*}
where we may subtract $f(x)$ since each region of integration is symmetric with respect to $x$ and each fraction is odd about $x$, so the $f(x)$ terms integrate to zero. Bounding appropriately we thus get

\[
\le \int_{T_\varepsilon} |f(t) - f(x)| \frac{\varepsilon^2}{|t-x| \, ((t-x)^2 + \varepsilon^2)} \, dt + \frac{1}{\varepsilon} \int_{T_\varepsilon^c} |f(t) - f(x)| \, dt = I + II.
\]
By the Lebesgue differentiation theorem, $II$ goes to 0 almost everywhere as $\varepsilon \to 0$. For the first integral, let $T_n = \{ t \mid 2^{n+1}\varepsilon \ge |t-x| > 2^n \varepsilon \}$ and write

\begin{align*}
I &\le \sum_{n=0}^\infty \int_{T_n} \frac{|f(t) - f(x)| \, \varepsilon^2}{|t-x| \, ((t-x)^2 + \varepsilon^2)} \, dt \le \sum_{n=0}^\infty \int_{T_n} \frac{|f(t) - f(x)| \, \varepsilon^2}{2^n \varepsilon \, ((2^n \varepsilon)^2 + \varepsilon^2)} \, dt \\
&= \sum_{n=0}^\infty \frac{1}{2^n \varepsilon} \cdot \frac{1}{2^{2n} + 1} \int_{T_n} |f(t) - f(x)| \, dt = \sum_{n=0}^\infty \frac{1}{(2^n \varepsilon)(2^{2n} + 1)} \int_{T_n} |f(t) - f(x)| \, dt \\
&\le \sum_{n=0}^N \frac{\int_{T_n} |f(t) - f(x)| \, dt}{(2^n \varepsilon)(2^{2n} + 1)} + \sum_{n=N+1}^\infty \frac{\int_{T_n} |f(t) - f(x)| \, dt}{(2^n \varepsilon)(2^{2n} + 1)} \\
&\le \sum_{n=0}^N \frac{1}{(2^n \varepsilon)(2^{2n} + 1)} \int_{T_n} |f(t) - f(x)| \, dt + \sum_{n=N+1}^\infty \frac{2 M f(x)}{2^{2n} + 1},
\end{align*}
which goes to 0 as $\varepsilon \to 0$ (and then $N \to \infty$), again by the Lebesgue differentiation theorem. To conclude, then, if we take $f \in L^1(\mathbb{R})$ to be real-valued and set
\[
h(z) = \frac{1}{2\pi i} \int_{\mathbb{R}} \frac{f(t)}{t-z} \, dt,
\]
we have that
\[
h(x) := \lim_{y \to 0} h(x + iy)
\]
exists almost everywhere and

\[
\operatorname{Im} h(x) = \lim_{\varepsilon \to 0} -\frac{1}{2\pi} \int_{\{ t \mid |t-x| > \varepsilon \}} \frac{f(t)}{t - x} \, dt.
\]

It is also possible to show that $\operatorname{Re} h(x) = \frac{1}{2} f(x)$ almost everywhere, the first integral in the decomposition above being a multiple of the Poisson kernel.
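Before moving on, Schur's lemma from the start of the lecture has a finite-dimensional analogue that is easy to verify: if every row sum and every column sum of $|K|$ is at most $C$, then $\|Kf\|_p \le C\|f\|_p$ for every $p$. A quick sketch (my own illustration with a random matrix, not from the notes):

```python
import numpy as np

rng = np.random.default_rng(1)

n = 50
K = rng.random((n, n))                   # a nonnegative "kernel" matrix
# discrete analogues of conditions (i) and (ii) in Schur's lemma
C = max(np.abs(K).sum(axis=0).max(), np.abs(K).sum(axis=1).max())

f = rng.standard_normal(n)
for p in (1.0, 2.0, 4.0):
    # the Schur bound ||K f||_p <= C ||f||_p holds for all these p
    assert np.linalg.norm(K @ f, p) <= C * np.linalg.norm(f, p) + 1e-9
print(C)
```

The proof is the same interpolation between the easy $p = 1$ and $p = \infty$ cases as in the lemma.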

Lecture 22 Integral Operators continued

22.1 Singular Integrals

Theorem 22.1.1. Let $k \in L^2(\mathbb{R}^d)$. Suppose

(i) $|\hat k(\xi)| \le B$ for every $\xi \in \mathbb{R}^d$, and

(ii) $k$ is $C^1$ on $\mathbb{R}^d \setminus \{0\}$ and $|\nabla k(x)| \le B/|x|^{d+1}$.

For $f \in L^p \cap L^2$, set
\[
T f(x) = \int_{\mathbb{R}^d} k(x-y) f(y) \, dy.
\]

Then for $1 < p < \infty$ there exists a constant $A_{p,d,B}$ such that

\[
\|Tf\|_p \le A_{p,d,B} \|f\|_p
\]
for every $f \in L^p(\mathbb{R}^d)$.

This is due to Calderón and Zygmund in the early nineteen fifties, and uses their decomposition. Ultimately we want to consider $k(x,y)$, of course; this then is a special case. The integral operator $T$ thus defined is a singular integral operator, and
\[
\int_{\mathbb{R}^d} k(x-y) f(y) \, dy
\]
itself is a singular integral. The name comes from the fact that $k$ can have singularities:

Example 22.1.2. We saw last time that
\[
H f(x) = \lim_{\varepsilon \to 0} \int_{|x-y| > \varepsilon} \frac{1}{x-y} f(y) \, dy
\]
works in this framework, and it certainly has a singularity, since $k(x) = 1/x$. N

The proof is in four parts:

1. Show that $T$ is type $(2,2)$.

2. Show that $T$ is weak type $(1,1)$.

3. Apply Marcinkiewicz interpolation to get that $T$ is type $(p,p)$ for all $1 < p < 2$.

4. Use the fact that the dual space of $L^p$ is $L^q$, with $1/p + 1/q = 1$, so if $1 < p < 2$, then $2 < q < \infty$.

Proof of part 1. Let $f \in L^2$ and consider
\[
T f(x) = \int_{\mathbb{R}^d} k(x-y) f(y) \, dy.
\]
Since we can think of the above as a convolution, we have $\widehat{Tf}(\xi) = \hat k(\xi) \hat f(\xi)$. Hence
\[
\|Tf\|_2 = \|\widehat{Tf}\|_2 = \|\hat k \hat f\|_2 \le \|\hat k\|_\infty \|\hat f\|_2 \le B \|f\|_2.
\]

Unfortunately, step 2 is not nearly as simple:

Proof of part 2. We want to show that there exists some constant $C$ such that
\[
|\{ x \in \mathbb{R}^d \mid |Tf(x)| > \lambda \}| \le \frac{C}{\lambda} \int_{\mathbb{R}^d} |f(x)| \, dx
\]
for every $\lambda > 0$. Fix $\lambda$ and apply the Calderón-Zygmund decomposition to $|f(y)|$, obtaining $F$ and $\Omega$, with $\Omega$ composed of cubes $Q_j$. Define
\[
g(x) =
\begin{cases}
f(x) & \text{if } x \in F, \\[1ex]
\dfrac{1}{|Q_j|} \displaystyle\int_{Q_j} f(y) \, dy & \text{if } x \in Q_j \text{ for some } j,
\end{cases}
\]
and
\[
b(x) =
\begin{cases}
0 & \text{if } x \in F, \\[1ex]
f(x) - \dfrac{1}{|Q_j|} \displaystyle\int_{Q_j} f(y) \, dy & \text{if } x \in Q_j \text{ for some } j.
\end{cases}
\]
Then $f(x) = g(x) + b(x)$, where we call $g$ the good function and $b$ the bad function. The former name is due to $|g(x)| \le 2^d \lambda$ for almost all $x \in \mathbb{R}^d$. On the other hand, for each $Q_j$ we have
\[
\frac{1}{|Q_j|} \int_{Q_j} b(x) \, dx = \frac{1}{|Q_j|} \int_{Q_j} \left( f(x) - \frac{1}{|Q_j|} \int_{Q_j} f(y) \, dy \right) dx = 0,
\]
so the average of $b$ is 0 on each cube. By linearity of $T$ we moreover have $Tf = Tg + Tb$, which means that $|Tf| \le |Tg| + |Tb|$, whence

\[
|\{ x \mid |Tf(x)| > \lambda \}| \le |\{ x \mid |Tg(x)| > \lambda/2 \}| + |\{ x \mid |Tb(x)| > \lambda/2 \}|.
\]

So we estimate those two measures. First, $Tg$, by
\begin{align*}
\|g\|_2^2 &= \int_{\mathbb{R}^d} |g(x)|^2 \, dx = \int_F |g(x)|^2 \, dx + \int_\Omega |g(x)|^2 \, dx \\
&\le \int_F \lambda |f(x)| \, dx + (2^d \lambda)^2 |\Omega| \le \lambda \|f\|_1 + (2^d \lambda)^2 \frac{1}{\lambda} \int_\Omega |f(x)| \, dx \le C \lambda \|f\|_1.
\end{align*}
There are two things of particular note here. In the first inequality, on $F$ we have bounded one factor of $g$ by $\lambda$ and the other by $|f|$. Secondly, the inequality from $|\Omega|$ to the integral is the Calderón-Zygmund decomposition rearranged to solve for the measure of the cubes. Now by part 1 above,

\[
\|Tg\|_2^2 \le C \|g\|_2^2 \le C \lambda \|f\|_1,
\]
implying that
\[
|\{ x \mid |Tg(x)| > \lambda/2 \}| \le \frac{1}{(\lambda/2)^2} \|Tg\|_2^2 \le \frac{4}{\lambda^2} C \lambda \|f\|_1 = \frac{C}{\lambda} \|f\|_1.
\]
It remains to ponder $Tb$. Set, for convenience,
\[
b_j(x) =
\begin{cases}
b(x) & \text{if } x \in Q_j, \\
0 & \text{otherwise.}
\end{cases}
\]

Then $b(x) = \sum_j b_j(x)$, with $b_j$ supported on $Q_j$, and on $Q_j$ we have
\[
b_j(x) = f(x) - \frac{1}{|Q_j|} \int_{Q_j} f(y) \, dy.
\]
So $Tb(x) = \sum_j Tb_j(x)$, which works since, even though it's an infinite sum, the left-hand side is an integral which we decomposed into integrals over disjoint parts, so there is no problem with interchanging the sum and the integral. Fixing $j$, we then have
\[
Tb_j(x) = \int_{Q_j} k(x-y) b_j(y) \, dy.
\]

Let $c_j$ denote the centre of the cube $Q_j$. Then
\[
Tb_j(x) = \int_{Q_j} (k(x-y) - k(x-c_j)) b_j(y) \, dy,
\]
since
\[
\int_{Q_j} k(x-c_j) b_j(y) \, dy = 0
\]
due to the first factor not depending on $y$, so it comes out in front, and $b_j$ has zero average. Let $\tilde Q_j$ be the cube with centre $c_j$ but side length 2 times that of $Q_j$. Suppose $x \notin \tilde Q_j$. Then for $y \in Q_j$ we have

\[
|k(x-y) - k(x-c_j)| \le |y - c_j| \, |\nabla k(x - \tilde y)|,
\]
with $\tilde y$ on the line segment between $y$ and $c_j$. Think of this as nothing more than the mean value theorem, applied on that same line segment. So
\[
|k(x-y) - k(x-c_j)| \le (\operatorname{diam} Q_j) \frac{CB}{|x - \tilde y|^{d+1}} \le (\operatorname{diam} Q_j) \frac{CB}{|x - c_j|^{d+1}},
\]
since
\[
|x - c_j| \le |x - \tilde y| + |\tilde y - c_j| \le |x - \tilde y| + \operatorname{diam} Q_j \le 2 |x - \tilde y|.
\]
Therefore
\begin{align*}
|Tb_j(x)| &\le C \frac{\operatorname{diam} Q_j}{|x - c_j|^{d+1}} \int_{Q_j} |b_j(y)| \, dy
\le C \frac{\operatorname{diam} Q_j}{|x - c_j|^{d+1}} \int_{Q_j} \left| f(y) - \frac{1}{|Q_j|} \int_{Q_j} f(s) \, ds \right| dy \\
&\le 2C \frac{\operatorname{diam} Q_j}{|x - c_j|^{d+1}} \int_{Q_j} |f(y)| \, dy
\le \frac{C \operatorname{diam} Q_j}{|x - c_j|^{d+1}} 2^d \lambda |Q_j|
= D \frac{\operatorname{diam} Q_j}{|x - c_j|^{d+1}} \lambda |Q_j|.
\end{align*}
Therefore
\[
\int_{\tilde Q_j^c} |Tb_j(x)| \, dx \le D \operatorname{diam} Q_j \, |Q_j| \, \lambda \int_{\tilde Q_j^c} \frac{1}{|x - c_j|^{d+1}} \, dx,
\]
which, if we switch to polar coordinates, integrate over the complement of the ball inscribed in $\tilde Q_j$, call it $\tilde B_j^c$, and recentre at the origin, is bounded by
\[
C D \operatorname{diam} Q_j \, |Q_j| \, \lambda \int_{\operatorname{diam} Q_j}^\infty \int_{S^{d-1}} \frac{1}{r^{d+1}} r^{d-1} \, d\sigma \, dr
= C \operatorname{diam} Q_j \, |Q_j| \, \lambda \cdot \frac{1}{\operatorname{diam} Q_j} = D |Q_j| \lambda.
\]
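The good/bad splitting $f = g + b$ is mechanical once the cubes are in hand. A small sketch (my own; here the "cubes" are simply a fixed partition of $[0,1)$ into intervals, standing in for the Calderón-Zygmund cubes $Q_j$):

```python
import numpy as np

n = 1024
x = (np.arange(n) + 0.5) / n
f = 1.0 / np.sqrt(x)                     # integrable on (0, 1], unbounded at 0

edges = [0.0, 1 / 8, 1 / 4, 1 / 2, 1.0]  # stand-ins for the cubes Q_j
g = np.empty(n)
for a, c in zip(edges[:-1], edges[1:]):
    mask = (x >= a) & (x < c)
    g[mask] = f[mask].mean()             # g = average of f on each "cube"
b = f - g                                # the bad part: zero average per cube

means = [float(b[(x >= a) & (x < c)].mean()) for a, c in zip(edges[:-1], edges[1:])]
print(means)
```

By construction $f = g + b$ everywhere, $g$ is bounded by the largest cube average, and each entry of `means` vanishes (up to rounding), which is exactly the cancellation that makes $\int_{Q_j} k(x - c_j) b_j(y) \, dy = 0$ in the proof.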

Lecture 23 Integral Operators continued

23.1 Finishing the Proof

We continue from where we left off with the proof of Part 2:

Proof of Part 2, continued. Set $E = \big( \bigcup_j \tilde Q_j \big)^c$. Then $E \subset \tilde Q_j^c$ for every $j$, and

\[
\int_E |Tb(x)| \, dx \le \sum_j \int_E |Tb_j(x)| \, dx \le \sum_j \int_{\tilde Q_j^c} |Tb_j(x)| \, dx
\le \sum_j D |Q_j| \lambda \le D \lambda \sum_j \frac{1}{\lambda} \int_{Q_j} |f(x)| \, dx \le D \|f\|_1.
\]

So on $E$, $Tb$ is in $L^1$. Hence

\[
|\{ x \in E \mid |Tb(x)| > \lambda/2 \}| \le \frac{D \|f\|_1}{\lambda}
\]

by Chebyshev. Moreover, $E^c = \bigcup_j \tilde Q_j$, and thus

\[
|E^c| = \Big| \bigcup_j \tilde Q_j \Big| \le \sum_j |\tilde Q_j| \le C \sum_j |Q_j| \le C \sum_j \frac{1}{\lambda} \int_{Q_j} |f(x)| \, dx \le \frac{C}{\lambda} \|f\|_1.
\]

Therefore
\[
|\{ x \in E^c \mid |Tb(x)| > \lambda/2 \}| \le |E^c| \le \frac{C}{\lambda} \|f\|_1,
\]
so $T$ is weak type $(1,1)$.

Remark 23.1.1. The only place we used and will use the hypothesis $|\nabla k(x)| \le B/|x|^{d+1}$ is in the estimate
\[
\int_{\tilde Q_j^c} |Tb_j(x)| \, dx \le D \lambda |Q_j|.
\]
We can replace it by the weaker hypothesis
\[
\int_{|x| \ge 2|y|} |k(x-y) - k(x)| \, dx \le B.
\]

Proof of Step 3. We have shown that $T$ is type $(2,2)$ and weak type $(1,1)$, so by Marcinkiewicz interpolation $T$ is type $(p,p)$ for all $1 < p < 2$.

Proof of Step 4. We want to show that the operator $T$ is type $(p,p)$ for $2 < p < \infty$. Let $f \in L^p \cap L^1$, $2 < p < \infty$. Let $\varphi \in L^q$, with $1/p + 1/q = 1$, be such that $\|\varphi\|_q = 1$ and $\varphi$ is continuous with compact support. Then
\[
\int_{\mathbb{R}^d} Tf(x) \varphi(x) \, dx = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} k(x-y) f(y) \, dy \, \varphi(x) \, dx = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} k(x-y) f(y) \varphi(x) \, dx \, dy,
\]
where we used Fubini's theorem in the last equality. To see why this is valid, consider
\begin{align*}
\int_{\mathbb{R}^d} \int_{\mathbb{R}^d} |k(x-y) \varphi(x) f(y)| \, dx \, dy &= \int_{\mathbb{R}^d} |f(y)| \int_{\mathbb{R}^d} |k(x-y) \varphi(x)| \, dx \, dy \\
&\le \int_{\mathbb{R}^d} |f(y)| \left( \int_{\mathbb{R}^d} |k(x-y)|^2 \, dx \right)^{1/2} \left( \int_{\mathbb{R}^d} |\varphi(x)|^2 \, dx \right)^{1/2} dy \\
&= \|f\|_1 \|k\|_2 \|\varphi\|_2 < \infty.
\end{align*}
Set $\tilde k(x) = k(-x)$. Then
\[
\int_{\mathbb{R}^d} Tf(x) \varphi(x) \, dx = \int_{\mathbb{R}^d} \int_{\mathbb{R}^d} \tilde k(y-x) \varphi(x) \, dx \, f(y) \, dy = \int_{\mathbb{R}^d} \tilde T \varphi(y) f(y) \, dy,
\]
where $\tilde T$ is the operator with kernel $\tilde k$. Note that $\tilde k$ satisfies all the same estimates as $k$, so $\tilde T$ is type $(q,q)$ for $1 < q < 2$. Therefore, by Hölder's inequality,
\[
\left| \int_{\mathbb{R}^d} Tf(x) \varphi(x) \, dx \right| = \left| \int_{\mathbb{R}^d} \tilde T \varphi(x) f(x) \, dx \right| \le \|\tilde T \varphi\|_q \|f\|_p \le C \|\varphi\|_q \|f\|_p \le C \|f\|_p.
\]
Since
\[
\|Tf\|_p = \sup\left\{ \left| \int_{\mathbb{R}^d} Tf(x) \varphi(x) \, dx \right| \ \middle|\ \varphi \in L^q,\ \|\varphi\|_q = 1,\ \varphi \text{ continuous with compact support} \right\},
\]
we get $\|Tf\|_p \le C \|f\|_p$.

Theorem 23.1.2. Suppose $k(x)$, $x \in \mathbb{R}^d$, satisfies

(i) $|k(x)| \le B/|x|^d$ for $|x| > 0$;

(ii) $\displaystyle\int_{|x| \ge 2|y|} |k(x-y) - k(x)| \, dx \le B$ for all $y \in \mathbb{R}^d$;

(iii) $\displaystyle\int_{R_1 < |x| < R_2} k(x) \, dx = 0$ whenever $0 < R_1 < R_2 < \infty$.

For $\varepsilon > 0$, set
\[
T_\varepsilon f(x) = \int_{|x-y| > \varepsilon} k(x-y) f(y) \, dy.
\]

Then for $1 < p < \infty$ there exists a constant $A = A_{p,d,B}$ such that $\|T_\varepsilon f\|_p \le A \|f\|_p$ independent of $\varepsilon$, and
\[
\lim_{\varepsilon \to 0} T_\varepsilon f(x) = T f(x)
\]
exists in $L^p$ norm, with $\|Tf\|_p \le A \|f\|_p$.

It turns out that in fact the above limit also exists pointwise almost everywhere, but that is much more work.

Example 23.1.3. Consider the Hilbert transform
\[
H f(x) = \lim_{\varepsilon \to 0} \frac{1}{\pi} \int_{|y| \ge \varepsilon} \frac{1}{y} f(x-y) \, dy,
\]
i.e. the above with the kernel $k(x) = 1/(\pi x)$. Then $k$ satisfies the hypotheses of the theorem: clearly $|k(x)| \le B/|x|$ with $B = 1/\pi$; the last property is satisfied by symmetry, since $k$ is odd; and for (ii) we have
\begin{align*}
\int_{|x| \ge 2|y|} |k(x-y) - k(x)| \, dx &= \frac{1}{\pi} \int_{|x| \ge 2|y|} \left| \frac{1}{x-y} - \frac{1}{x} \right| dx = \frac{1}{\pi} \int_{|x| \ge 2|y|} \frac{|y|}{|x-y| \, |x|} \, dx \\
&\le \frac{1}{\pi} \int_{|x| \ge 2|y|} \frac{|y|}{(|x| - |y|) \, |x|} \, dx \le \frac{|y|}{\pi} \int_{|x| \ge 2|y|} \frac{1}{(|x| - |y|) \, |x|} \, dx \\
&\le \frac{|y|}{\pi} \int_{2|y|}^\infty \frac{2 \cdot 2}{|x|^2} \, dx = C. \quad \text{N}
\end{align*}
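Condition (ii) can also be checked numerically for $k(x) = 1/(\pi x)$. By the scaling $x \mapsto |y| x$, the integral is in fact independent of $y$; working out the two one-sided integrals by partial fractions gives the exact value $\log(3)/\pi$ (my own computation, not stated in the notes). A sketch:

```python
import numpy as np

def hormander_integral(y, R=1e6, n=400001):
    # ∫_{2|y| <= |x| <= R} |k(x - y) - k(x)| dx for k(x) = 1/(pi x),
    # via the trapezoid rule on a log-spaced grid on each side of the origin
    def trap(vals, xs):
        return float(np.sum((vals[1:] + vals[:-1]) * np.diff(xs)) / 2.0)

    total = 0.0
    for sign in (1.0, -1.0):
        xs = sign * np.geomspace(2 * abs(y), R, n)
        total += abs(trap(np.abs(1.0 / (xs - y) - 1.0 / xs) / np.pi, xs))
    return total

vals = [hormander_integral(y) for y in (0.01, 1.0, 100.0)]
print(vals)        # all close to log(3)/pi, independent of y
```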

Lemma 23.1.4. Suppose $k$ satisfies (i)-(iii) of the last theorem. Set
\[
k_\varepsilon(x) =
\begin{cases}
k(x) & \text{if } |x| \ge \varepsilon, \\
0 & \text{if } |x| < \varepsilon.
\end{cases}
\]

Then $k_\varepsilon \in L^2(\mathbb{R}^d)$ and $\|\hat k_\varepsilon\|_\infty \le CB$, where $B$ is as before and $C$ is a constant independent of $\varepsilon$ and $B$.

Proof. It suffices to consider $\varepsilon = 1$. Indeed, suppose we know $|\hat k_1(\xi)| \le CB$ for every kernel satisfying (i)-(iii) with constant $B$, and consider $\tilde k(x) = \varepsilon^d k(\varepsilon x)$. Then $\tilde k$ satisfies (i)-(iii) of the theorem with the same constant $B$. To see (i), consider

\[
|\tilde k(x)| = |\varepsilon^d k(\varepsilon x)| \le \varepsilon^d \frac{B}{|\varepsilon x|^d} = \frac{B}{|x|^d},
\]
and similarly for (ii) and (iii). Moreover
\[
k_\varepsilon(x) = \frac{1}{\varepsilon^d} (\tilde k)_1(x/\varepsilon) = \frac{1}{\varepsilon^d} \cdot
\begin{cases}
\tilde k(x/\varepsilon) & \text{if } |x/\varepsilon| \ge 1, \\
0 & \text{if } |x/\varepsilon| < 1,
\end{cases}
\]
and hence

\[
\hat k_\varepsilon(\xi) = \widehat{(\tilde k)_1}(\varepsilon \xi),
\]
so $|\hat k_\varepsilon(\xi)| = \big|\widehat{(\tilde k)_1}(\varepsilon \xi)\big| \le CB$.

Lecture 24 Integral Operators continued

24.1 Proof of the Lemma

We start by finishing up the proof of the lemma from last time.

Proof continued. As discussed, it suffices to consider $k_1$. We have
\begin{align*}
\hat k_1(y) &= \lim_{R \to \infty} \int_{|x| \le R} e^{-ix \cdot y} k_1(x) \, dx \\
&= \int_{|x| \le 2\pi/|y|} e^{-ix \cdot y} k_1(x) \, dx + \lim_{R \to \infty} \int_{2\pi/|y| \le |x| \le R} e^{-ix \cdot y} k_1(x) \, dx = I + II.
\end{align*}

We can bound $I$ using property (iii) from the assumptions of our theorem:
\[
I = \int_{|x| \le 2\pi/|y|} e^{-ix \cdot y} k_1(x) \, dx = \int_{|x| \le 2\pi/|y|} (e^{-ix \cdot y} - 1) k_1(x) \, dx,
\]
meaning that
\begin{align*}
|I| &\le \int_{|x| \le 2\pi/|y|} |e^{-ix \cdot y} - 1| \, |k_1(x)| \, dx \le \int_{|x| \le 2\pi/|y|} |x| |y| \, |k_1(x)| \, dx \le |y| \int_{|x| \le 2\pi/|y|} |x| \frac{B}{|x|^d} \, dx \\
&= B |y| \int_{|x| \le 2\pi/|y|} \frac{1}{|x|^{d-1}} \, dx = B |y| \int_{S^{d-1}} \int_0^{2\pi/|y|} \frac{1}{r^{d-1}} r^{d-1} \, dr \, d\sigma = C_d B,
\end{align*}

so $I$ is bounded by a constant independent of $y$. Note that we used the fact that $|e^{i\theta} - 1| = |e^{i\theta} - e^{i0}| \le |\theta|$. For $II$, set $z = \pi y / |y|^2$, and note that $e^{-iz \cdot y} = -1$. Then
\[
\int_{2\pi/|y| \le |x| \le R} e^{-ix \cdot y} k_1(x) \, dx = \int_{2\pi/|y| \le |x-z| \le R} k_1(x-z) e^{-i(x-z) \cdot y} \, dx = -\int_{2\pi/|y| \le |x-z| \le R} k_1(x-z) e^{-ix \cdot y} \, dx.
\]

Therefore our integral equals the average of these two expressions, and is thus bounded by
\[
\frac{1}{2} \left| \int_{2\pi/|y| \le |x| \le R} e^{-ix \cdot y} k_1(x) \, dx - \int_{2\pi/|y| \le |x-z| \le R} k_1(x-z) e^{-ix \cdot y} \, dx \right|,
\]
which in turn we can write as
\begin{align*}
\frac{1}{2} \bigg| &\int_{2\pi/|y| \le |x| \le R} e^{-ix \cdot y} k_1(x) \, dx - \int_{2\pi/|y| \le |x| \le R} e^{-ix \cdot y} k_1(x-z) \, dx \\
&+ \int_{2\pi/|y| \le |x| \le R} e^{-ix \cdot y} k_1(x-z) \, dx - \int_{2\pi/|y| \le |x-z| \le R} k_1(x-z) e^{-ix \cdot y} \, dx \bigg|.
\end{align*}

Call the first line $II_a$ and the second line $II_b$. Now
\[
|II_a| \le \frac{1}{2} \int_{2\pi/|y| \le |x| \le R} |k_1(x) - k_1(x-z)| \, dx = \frac{1}{2} \int_{2|z| \le |x| \le R} |k_1(x) - k_1(x-z)| \, dx \le B
\]
by assumption. For $II_b$, we have the same integrand but different regions of integration, so let

\[
\mathcal{R} = \{ x \mid 2\pi/|y| \le |x| \le R \} \,\triangle\, \{ x \mid 2\pi/|y| \le |x-z| \le R \},
\]
and therefore
\[
|II_b| \le \int_{\mathcal{R}} |e^{-ix \cdot y} k_1(x-z)| \, dx,
\]
and we can integrate over two large annuli which capture this symmetric difference (plus a bit more, but we have control over it):

Substituting \(u = x - z\) (recall \(|z| = \pi/|y|\)),
\[
|IIb| \le C \int_{\pi/|y|\le|u|\le 3\pi/|y|} \frac{B}{|u|^d}\,du
+ C \int_{R-\pi/|y|\le|u|\le R+\pi/|y|} \frac{B}{|u|^d}\,du
\]
\[
= C \int_{\pi/|y|}^{3\pi/|y|} \int_{S^{d-1}} \frac{B}{r^d} \, r^{d-1}\,d\sigma\,dr
+ C \int_{R-\pi/|y|}^{R+\pi/|y|} \int_{S^{d-1}} \frac{B}{r^d} \, r^{d-1}\,d\sigma\,dr
\]
\[
= CB \int_{\pi/|y|}^{3\pi/|y|} \frac{1}{r}\,dr + CB \int_{R-\pi/|y|}^{R+\pi/|y|} \frac{1}{r}\,dr
= CB \log 3 + CB \log \frac{R+\pi/|y|}{R-\pi/|y|} \le CB,
\]
and we are done.
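A quick numerical aside (ours, not from the notes) on the final logarithm: with \(a = \pi/|y|\), the annulus \(2\pi/|y| \le |x| \le R\) is nonempty only for \(R \ge 2a\), and on that range \((R+a)/(R-a)\) decreases in \(R\) from the value \(3\) at \(R = 2a\), so the second term is at most \(\log 3\) uniformly.

```python
import math

# The ratio (R + a)/(R - a) with a = pi/|y|; scale-invariant, so a = 1 suffices.
a = 1.0
vals = [math.log((R + a) / (R - a)) for R in (2*a, 4*a, 10*a, 1e6*a)]
print(vals)  # decreasing, starting at log 3 ~ 1.0986
```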

Lecture 25 Integral Operators continued

25.1 Finalising the Proof

We finish off the course by proving Theorem 23.1.2:

Proof. We need to show that each \(k_\varepsilon\) satisfies the second property of the previous theorem, namely that
\[
\int_{|x|\ge 2|y|} |k_\varepsilon(x-y) - k_\varepsilon(x)|\,dx \le B.
\]

To accomplish this, note that since \(k_\varepsilon(x) = k(x)\) for \(|x| > \varepsilon\) and \(k_\varepsilon(x) = 0\) otherwise, we can write this integral as
\[
\int_{\substack{|x|\ge 2|y| \\ |x-y|>\varepsilon,\; |x|>\varepsilon}} |k(x-y) - k(x)|\,dx
+ \int_{\substack{|x|\ge 2|y| \\ |x-y|<\varepsilon,\; |x|>\varepsilon}} |k(x)|\,dx
+ \int_{\substack{|x|\ge 2|y| \\ |x-y|>\varepsilon,\; |x|<\varepsilon}} |k(x-y)|\,dx,
\]
which we denote \(I\), \(II\), and \(III\) respectively.

The first one is bounded by \(B\) by hypothesis. For the second one we have \(|x| > \varepsilon\), hence \(1/|x|^d < 1/\varepsilon^d\), so
\[
II \le \int_{|x-y|<\varepsilon} \frac{B}{\varepsilon^d}\,dx = C_d \varepsilon^d \frac{B}{\varepsilon^d} = C_d B.
\]
Moreover, since \(|x-y| > \varepsilon\) in the third integral,
\[
III \le \int_{|x|<\varepsilon} \frac{B}{\varepsilon^d}\,dx = C_d \varepsilon^d \frac{B}{\varepsilon^d} = C_d B.
\]
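As an aside, the uniformity in \(\varepsilon\) just proved can be observed numerically for the one-dimensional Hilbert kernel \(k(x) = 1/x\); the choice of \(y\), the cutoff at \(|x| = 100\), and the grid are ours, purely for illustration.

```python
import numpy as np

# The integrals int_{|x| >= 2|y|} |k_eps(x - y) - k_eps(x)| dx for the
# truncated Hilbert kernel, computed by Riemann sums: bounded independently
# of eps, as the argument above shows in general.

def k_eps(x, eps):
    """Truncated Hilbert kernel: 1/x for |x| > eps, 0 otherwise."""
    out = np.zeros_like(x)
    mask = np.abs(x) > eps
    out[mask] = 1.0 / x[mask]
    return out

y = 0.3
half = np.linspace(2 * abs(y), 100.0, 1_000_000)
dx = half[1] - half[0]
xs = np.concatenate([-half, half])          # the region |x| >= 2|y|, cut at 100

ints = [np.sum(np.abs(k_eps(xs - y, eps) - k_eps(xs, eps))) * dx
        for eps in (1e-3, 1e-2, 0.1, 1.0)]
print(ints)  # all of comparable size, uniformly bounded in eps
```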

Hence by the previous theorem \(\|T_\varepsilon f\|_p \le A_{B,p,d} \|f\|_p\), independent of \(\varepsilon\). Now we need to prove the existence of the limit. Consider first \(f \in C^\infty\) with compact support. Then
\[
T_\varepsilon f(x) = \int_{|y|\ge\varepsilon} k(y) f(x-y)\,dy
= \int_{|y|\ge 1} k(y) f(x-y)\,dy + \int_{\varepsilon\le|y|\le 1} k(y)(f(x-y) - f(x))\,dy
= I + II,
\]
where we may subtract \(f(x)\) as we do since it is constant with respect to \(y\): in the second integral this only adds a multiple of the integral of \(k\) over an annulus, which is zero by hypothesis. Taking \(1/p + 1/q = 1\) and using Hölder's inequality we have

\[
|I| = \Bigl| \int_{\mathbb{R}^d} k_1(y) f(x-y)\,dy \Bigr|
\le \Bigl( \int_{\mathbb{R}^d} |k_1(y)|^q\,dy \Bigr)^{1/q} \Bigl( \int_{\mathbb{R}^d} |f(x-y)|^p\,dy \Bigr)^{1/p}
= C \|f\|_p.
\]
For the second one we have, by the mean value theorem,
\[
|II| \le \int_{\varepsilon\le|y|\le 1} |k(y)| \, \|\nabla f\|_\infty \, |y|\,dy
\le \|\nabla f\|_\infty \int_{\varepsilon\le|y|\le 1} \frac{B}{|y|^d} \, |y|\,dy
\le B \omega_d \|\nabla f\|_\infty,
\]
where by \(\omega_d\) we mean the volume of the unit ball in \(\mathbb{R}^d\). So if \(\varepsilon_1 > \varepsilon_2\), then
\[
|T_{\varepsilon_1} f(x) - T_{\varepsilon_2} f(x)|
\le \int_{\varepsilon_2\le|y|\le\varepsilon_1} |k(y)| \, \|\nabla f\|_\infty \, |y|\,dy
\le CB \int_{\varepsilon_2\le|y|\le\varepsilon_1} \frac{1}{|y|^{d-1}}\,dy
= CB \int_{S^{d-1}} \int_{\varepsilon_2}^{\varepsilon_1} \frac{1}{r^{d-1}} \, r^{d-1}\,dr\,d\sigma
\le CB |\varepsilon_1 - \varepsilon_2|,
\]
where \(C\) absorbs \(\|\nabla f\|_\infty\).
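The Cauchy-in-\(\varepsilon\) behaviour for smooth \(f\) can be illustrated numerically (an aside of ours, again with the Hilbert kernel \(k(y) = 1/y\) in \(d = 1\) and a Gaussian as a stand-in smooth function; all names and parameters below are our choices).

```python
import numpy as np

# Truncated transform T_eps f(x) = int_{eps <= |y| <= R} f(x - y)/y dy,
# folded onto y > 0 using the oddness of 1/y.
def T_eps(f, x, eps, R=30.0, n=400_000):
    y = np.linspace(eps, R, n)
    dy = y[1] - y[0]
    return np.sum((f(x - y) - f(x + y)) / y) * dy

f = lambda t: np.exp(-t**2)                 # smooth, rapidly decaying
vals = [T_eps(f, 0.5, eps) for eps in (0.5, 0.1, 0.02, 0.004)]
diffs = [abs(a - b) for a, b in zip(vals, vals[1:])]
print(diffs)  # successive differences shrink roughly like |eps_1 - eps_2|
```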

Hence \(T_\varepsilon f(x)\) converges uniformly as \(\varepsilon \to 0\). Now consider \(f \in L^p\) and let \(\eta > 0\). Write \(f = f_1 + f_2\), with \(f_1 \in C^\infty\) with compact support and \(\|f_2\|_p < \eta\), which is possible since smooth functions with compact support are dense in \(L^p\). Then \(T_\varepsilon f(x) = T_\varepsilon f_1(x) + T_\varepsilon f_2(x)\) by linearity of \(T_\varepsilon\), so

\[
\|T_{\varepsilon_1} f - T_{\varepsilon_2} f\|_p
\le \|T_{\varepsilon_1} f_1 - T_{\varepsilon_2} f_1\|_p + \|T_{\varepsilon_1} f_2 - T_{\varepsilon_2} f_2\|_p
\le \|T_{\varepsilon_1} f_1 - T_{\varepsilon_2} f_1\|_p + A\|f_2\|_p + A\|f_2\|_p
\le \|T_{\varepsilon_1} f_1 - T_{\varepsilon_2} f_1\|_p + 2A\eta.
\]

By the above we can choose \(\varepsilon_1\) and \(\varepsilon_2\) small enough that the first term is bounded by \(\eta\), and so \(T_\varepsilon f\) is Cauchy in \(L^p\), in the sense that \((T_{\varepsilon_n} f)\) is a Cauchy sequence for any sequence \(\varepsilon_n \to 0\). Hence
\[
\lim_{\varepsilon\to 0} T_\varepsilon f = Tf
\]
exists in \(L^p\). Note that we did not show that this limit exists pointwise almost everywhere. This is true, but to prove it one introduces the maximal operator
\[
T^* f(x) = \sup_{\varepsilon > 0} |T_\varepsilon f(x)|
\]
and carries out maximal function computations similar to those we did before.

Index

Abel summable, 19
approximate identity, 21
Cesàro summable, 19
complete, 12
conjugate series, 27
convolution, 7
Dirichlet kernel, 20, 24
distance to set, 51
dyadic cube, 47
Fejér kernel, 20
Fourier
  coefficients, 2
  series, 3
  transform, 3
Fourier transform
  inverse, 17
function
  radial, 43
Hilbert space, 12
Hilbert transform, 55, 62
inner product space, 1
Lebesgue point, 39
limit
  nontangential, 42
Marcinkiewicz integral, 51
maximal, 12
maximal function
  nontangential, 44
metric, 1
mollifier, 9
orthogonal, 1
orthonormal, 1
point of density, 50
regular set, 41
singular integral, 57
  operator, 57
type, 34
  strong, 34
  weak, 34
weak type
  restricted, 38
