<<

Lectures in Harmonic Analysis

Joseph Breen Lectures by: Monica Visan

Last updated: June 29, 2017

Department of Mathematics University of California, Los Angeles Contents

Preface 3

1 Preliminaries4 1.1 Notation and Conventions...... 4 1.2 Lp Spaces...... 4 1.3 Complex Interpolation...... 5 1.4 Some General Principles...... 7 1.4.1 Integrating |x|−α ...... 8 1.4.2 Dyadic Sums...... 8

2 The Fourier Transform 10 d 2.1 The Fourier Transform on R ...... 10 d 2.2 The Fourier Transform on T ...... 16

3 Lorentz Spaces and Real Interpolation 19 3.1 Lorentz Spaces...... 19 3.2 Duality in Lorentz Spaces...... 23 3.3 The Marcinkiewicz Interpolation Theorem...... 26

4 Maximal Functions 29 4.1 The Hardy-Littlewood Maximal Function...... 29 4.2 Ap Weights and the Weighted Maximal Inequality...... 30 4.3 The Vector-Valued Maximal Inequality...... 36

5 Sobolev Inequalities 41 5.1 The Hardy-Littlewood-Sobolev Inequality...... 41 5.2 The Sobolev Embedding Theorem...... 44 5.3 The Gagliardo-Nirenberg Inequality...... 47

6 Fourier Multipliers 49 6.1 Calderon-Zygmund Convolution Kernels...... 49 6.2 The Hilbert Transform...... 56 6.3 The Mikhlin Multiplier Theorem...... 57

7 Littlewood-Paley Theory 61 7.1 Littlewood-Paley Projections...... 61 7.2 The Littlewood-Paley Square Function...... 68 7.3 Applications to Fractional Derivatives...... 71

8 Oscillatory Integrals 79 8.1 Oscillatory Integrals of the First Kind, d = 1 ...... 80 8.2 The Morse Lemma...... 87 8.3 Oscillatory Integrals of the First Kind, d > 1 ...... 89

1 8.4 The Fourier Transform of a Surface ...... 92

9 Dispersive Partial Differential Equations 94 9.1 Dispersion...... 94 9.2 The Linear Schrodinger¨ Equation...... 95 9.2.1 The Fundamental Solution...... 95 9.2.2 Dispersive Estimates...... 96 9.2.3 Strichartz Estimates...... 96

10 Restriction Theory 97 10.1 The Restriction Conjecture...... 97 10.2 The Tomas-Stein Inequality...... 100 10.3 Restriction Theory and Strichartz Estimates...... 103

11 Rearrangement Theory 108 11.1 Definitions and Basic Estimates...... 108 11.2 The Riesz Rearrangement Inequality, d = 1 ...... 114 11.3 The Riesz Rearrangement Inequality, d > 1 ...... 117 11.4 The Polya-Szego inequality...... 122 11.5 The Energy of the Hydrogen Atom...... 125

12 Compactness in Lp Spaces 128 12.1 The Riesz Compactness Theorem...... 128 12.2 The Rellich-Kondrachov Theorem...... 131 12.3 The Strauss Lemma...... 134 12.4 The Refined Fatou’s Lemma...... 136

13 The Sharp Gagliardo-Nirenberg Inequality 137 13.1 The Focusing Cubic NLS...... 137

14 The Sharp Sobolev Embedding Theorem 138

15 Concentration Compactness in Partial Differential Equations 139

References 139

2 Preface

**Eventually I’ll probably write a better introduction, this is a placeholder for now**

The following text covers material from Monica Visan’s Math 247A and Math 247B classes, offered at UCLA in the winter and spring quarters of 2017. The content of each chapter is my own adaptation of lecture notes taken throughout the two quarters. Chapter1 contains a selection of background material which was not covered in lec- tures. The rest of the chapters roughly follow the chronological order of lectures given throughout the winter and spring, with some slight rearranging of my own choice. For example, basic results about the Fourier transform on the torus are included in Chapter 2, despite being the subject of lectures in the spring. A strictly chronological (and incom- plete) version of these notes can be found on my personal web page at http://www. math.ucla.edu/˜josephbreen/. Finally, any mistakes are likely of my own doing and not Professor Visan’s. Comments and corrections are welcome.

3 Chapter 1

Preliminaries

The material covered in these lecture notes is presented at the level of an upper division graduate course in harmonic analysis. As such, we assume that the reader is comfortable with real and complex analysis at a sophisticated level. However, for convenience and completeness we recall some of the basic facts and inequalities that are used throughout the text. We also establish conventions with notation, and discuss a few select topics that the reader may be unfamiliar with (for example, the Riesz-Thorin interpolation theorem). Any standard textbook in real analysis or harmonic analysis is a suitable reference for this material, for example, [3], [6], and [8].

1.1 Notation and Conventions

If x ≤ Cy for some constant C, we say x . y. If x . y and y . x, then x ∼ y. If the implicit constant depends on additional data, this is manifested as a subscript in the inequality d d sign. For example, if A denotes the ball of radius r > 0 in R and | · | denotes R -Lebesgue d measure, we have |A| ∼d r . However, the dependencies of implicit constants are usually clear from context, or stated otherwise. FINISH

1.2 Lp Spaces

ADD

Proposition 1.1 (Young’s Convolution inequality). For 1 ≤ p, q, r ≤ ∞,

kf ∗ gk r d kfk p d kgk q d L (R ) . L (R ) L (R )

1 1 1 whenever 1 + r = p + q . Young’s convolution inequality may be proved directly, or by using the Riesz-Thorin interpolation theorem (see Section 1.3). For details on both methods, refer to [3]. It is convenient to record a few special cases of Young’s inequality that arise frequently.

Corollary 1.2 (Young’s Convolution inequality, special cases). The following convolution in- equalities hold:

1. For any 1 ≤ p ≤ ∞, kf ∗ gkL∞ . kfkLp kgkLp0 . In particular, kf ∗ gkL∞ . kfkL2 kgkL2 and kf ∗ gkL∞ . kfkL1 kgkL∞ .

2. For any 1 ≤ p ≤ ∞, kf ∗ gkLp . kfkL1 kgkLp .

4 1.3 Complex Interpolation

An important tool in harmonic analysis is interpolation. Broadly speaking, interpolation considers the following question: given estimates of some kind on two different spaces, say Lp0 and Lp1 , what can be said about the corresponding estimate on Lp for values of p between p0 and p1? There are a number of theorems which answer this kind of question. In Chapter3, we will prove the Marcinkiewicz interpolation theorem using real analytic methods. Here, we prove a different interpolation theorem using complex analytic methods. Both theorems have their own advantages and disadvantages, and will be used frequently. Our treatment of the Riesz-Thorin interpolation theorem follows [8]. The main state- ment is as follows. Theorem 1.3 (Riesz-Thorin interpolation). Suppose that T is a bounded linear map from Lp0 + Lp1 → Lq0 + Lq1 and that

kT fkLq0 ≤ M0 kfkLp0 and kT fkLq1 ≤ M1 kfkLp1 for some 1 ≤ p0, p1, q0, q1 ≤ ∞. Then

1−t t kT fkLqt ≤ M0 M1 kfkLpt where 1 = 1−t + t and 1 = 1−t + t , for any 0 ≤ t ≤ 1. pt p0 p1 qt q0 q1 As noted above, the proof of this theorem relies on complex analysis. Specifically, we need the following lemma. Lemma 1.4 (Three Lines lemma). Suppose that Φ(z) is holomorphic in S = {0 < Re z < 1} and continuous and bounded on S¯. Let

M0 := sup |Φ(iy)| and M1 := sup |Φ(1 + iy)|. y∈R y∈R Then 1−t t sup |Φ(t + iy)| ≤ M0 M1 y∈R for any 0 ≤ t ≤ 1. Proof. We can assume without loss of generality that Φ is nonconstant, otherwise the con- clusion is trivial. We first prove a special case of the lemma. Suppose that M0 = M1 = 1 and that ¯ sup0≤x≤1 |Φ(x + iy)| → 0 as |y| → ∞. In this case, φ has a global (on S) supremum of M > 0. Let {zn} be a sequence such that |Φ(zn)| → M. Because sup0≤x≤1 |Φ(x + iy)| → 0 as |y| → ∞, the sequence {zn} is bounded, hence there is a point z∞ ∈ S¯ such that a subsequence of {zn} converges to z∞. By continuity, |Φ(z∞)| = M. Since Φ is nonconstant, by the maximum modulus principle, z∞ is on the boundary of S¯. This means that |Φ(z)| is globally bounded by 1. Since M0 = M1 = 1, this proves the special case. Next, we remove the decay assumption and only suppose that M0 = M1 = 1. For ε > 0, define ε(z2−1) Φε(z) := Φ(z)e .

First observe that |Φε(z)| ≤ 1 if Re z = 0 or Re z = 1. Indeed, note that for y ∈ R we have

2 2 ε((iy) −1) −ε(y +1) |Φε(iy)| = |Φ(iy)| e ≤ e ≤ 1 and

2 2 2 ε((1+iy) −1) ε(1+2iy−y −1) 2iyε −εy |Φε(1 + iy)| = |Φ(1 + iy)| e ≤ e ≤ e e ≤ 1.

5 Next, we claim that Φε satisfies the decay condition from the previous case. Indeed, note that

2 2 2 ε((x+iy) −1) ε(x −y −1+2xyi) |Φε(x + iy)| = |Φ(x + iy)| e = |Φ(x + iy)| e 2 2 ε(x −y −1) ≤ |Φ(x + iy)| e .

Because 0 ≤ x ≤ 1 and because |Φ| is bounded, as |y| → ∞, |Φε(x + iy)| → 0 uniformly in x. By the previous case, |Φε(z)| ≤ 1 uniformly in S¯. Since this holds for all ε > 0, letting ε → 0 gives the desired result for Φ. Finally we assume that M0 and M1 are arbitrary positive numbers. Define φ˜(z) := z−1 −z M0 M1 φ(z). Note that if Re z = 0, then ˜ iy−1 −iy iy −iy |Φ(z)| ≤ |M0 M1 |M0 ≤ |M0 | · |M1 | ≤ 1 and if Re z = 1, then ˜ iy −1−iy iy −iy |Φ(z)| ≤ |M0 M1 |M1 ≤ |M0 | · |M1 | ≤ 1. The claim then follows by applying the previous case.

Proof of Theorem 1.3. By scaling appropriately, we may assume without loss of generality that kfkLpt = 1. First, consider the case when f is a simple function. By duality, Z

kT fk qt = sup T f(x) g(x) dx L where the supremum is taken over simple functions g with kgk q0 = 1. Fix such a g. First, L t consider the case pt < ∞ and qt > 1. Let     1 − z z 0 1 − z z γ(z) := pt + and δ(z) := qt 0 + 0 . p0 p1 q0 q1 Define f g f := |f|γ(z) and g := |g|δ(z) . z |f| z |g|

Note that γ(t) = 1. Hence, ft = f. Also, if Re z = 0, then   Z Z p0 Z p0 p γ(z) pt−z pt+ p pt t p1 t kfzk p0 = |f| dx = |f| dx = |f| dx = kfk pt = 1. L L

A similar computation gives kfzk p = 1 if Re z = 1. Also, kgzk q0 = 1 if Re z = 0 and L 1 L 0 kgzk q0 = 1 if Re z = 1. L 1 Next, define Z Φ(z) := T fz(x) gz(x) dx. P As f and g are simple, we may write f = k akχEk where the sets Ek are disjoint and of P finite measure, and likewise g = j bjχFj . Then

X γ(z) ak X δ(z) bj fz = |ak| χEk and gz = |bj| χFj . |ak| |bj| k j

Also with this notation we have Z X γ(z) δ(z) ak bj Φ(z) = |ak| |bj| T (χEk ) χFj dx. |ak| |bj| j,k

6 From this it is evident that Φ(z) is holomorphic in S = {0 < Re z < 1} and continuous and bounded on S¯. Moreover, if Re z = 0, then Holder’s inequality gives Z |Φ(z)| ≤ |T fz(x) gz(x)| dx ≤ kT fzk q kgzk q0 ≤ M0 kfk p = M0. L 0 L 0 L 0

Similarly, if Re z = 1, |Φ(z)| ≤ M1. Therefore, by the Three Lines lemma, if Re z = t then 1−t t |Φ(z)| ≤ M0 M1. As Φ(t) = R T f(x) g(x) dx, this is the desired result. The passage from simple functions to general functions is a straightforward limiting argument, and the remaining cases pt = 1 and qt = ∞ are handled similarly. For more details, consult [8].

It is convenient to state a few special cases of Riesz-Thorin interpolation. These occur frequently, and also provide more concrete examples of interpolation at work. Corollary 1.5 (Riesz-Thorin interpolation, special cases). p p 1. Fix 1 ≤ p0 < p1 ≤ ∞. Suppose that T is bounded as a linear map from L 0 → L 0 and also p p p p from L 1 → L 1 . Then T is bounded from L → L for all p0 ≤ p ≤ p1. 2. Suppose that T is bounded as a linear map from L2 → L2 and also from L1 → L∞. Then T 0 is bounded from Lp → Lp for all 1 ≤ p ≤ 2.

Proof. 1. If T is bounded from Lp0 → Lp0 and Lp1 → Lp1 , then Riesz-Thorin implies that T is bounded from Lpt → Lqt for all 0 ≤ t ≤ 1 with 1 1 − t t 1 1 − t t = + and = + . pt p0 p1 qt p0 p1

Thus, pt = qt. Moreover, as t ranges from 0 to 1, pt ranges from p0 to p1. Therefore, T p p is bounded from L → L for all p0 ≤ p ≤ p1. 2. Riesz-Thorin gives boundedness of T : Lpt → Lqt for 0 ≤ t ≤ 1 with 1 1 − t t 1 1 − t t = + and = + . pt 1 2 qt ∞ 2 1 = 1 − t 1 = t q = p0 p = 2 t 0 Thus, pt 2 and qt 2 so that t t. Moreover, t 2−t , so as ranges from to 1, pt ranges from 1 to 2.

The first substantial application of interpolation is found in found in Chapter2, where we show that the Fourier transform is well-defined on Lp spaces for 1 ≤ p ≤ 2. As a final remark, the Riesz-Thorin interpolation theorem is an interpolation result based on complex analytic tools, namely, the maximum principle for holomorphic functions. In Chapter3, we prove a different interpolation theorem — the Marcinkiewicz interpolation theorem — based on real analytic techniques.

1.4 Some General Principles

Throughout the rest of this text, there are a number of straightforward principles that we will employ over and over again, often without explicit comment. For convenience, we include a short discussion of these recurring principles here.

7 1.4.1 Integrating |x|−α The first concern integration of functions of the form |x|−α. In words, integrating |x|−α over d a region of R increases the exponent by d. Explicitly, we have the following proposition. Proposition 1.6. Fix R > 0. Then

R −α 1. The integral |x|>R |x| dx is finite when α > d, in which case Z |x|−α dx ∼ R−α+d. |x|>R

R −α 2. The integral |x|

−α p d Consequently, |x| ∈/ L (R ) for 1 ≤ p < ∞, for any choice of α. Proof. Fix R > 0. r = |x|. Then dx = rd−1 dr. Thus, ∞ Z 1 Z 1 Z ∞ dx = rd−1 dr = r−α+d−1 dr ∼ r−α+d . α α |x|>R |x| r>R r r=R R This limit exists precisely when −α + d < 0, i.e., when α > d, in which case Z 1 −α+d α dx ∼ R . |x|>R |x|

Similarly, R Z 1 dx ∼ r−α+d . α |x|

1.4.2 Dyadic Sums Another computational tool ubiquitous in harmonic analysis is dyadic sums; that is, geo- 1 metric series with r = 2 . The following proposition is easy, but worth recording. n Proposition 1.7. Let 2Z = { 2 : n ∈ Z } be the set of dyadic numbers. Then X 1 1 = 2 · N N0 N∈2Z; N≥N0 and X N = 2 · N0

N∈2Z; N≤N0

8 n Proof. Fix a dyadic number N0 = 2 0 ∈ 2Z. Then

n 1 ∞ 1n 1  0 1 X X −n X 2 −n0 = 2 = = 1 = 2 · 2 = 2 · . N 2 1 − 2 N0 N∈2Z; N≥N0 n∈Z; n≥n0 n=n0

Essentially the same computation gives X N = 2 · N0.

N∈2Z; N≤N0

9 Chapter 2

The Fourier Transform

Our study of harmonic analysis naturally begins with the Fourier transform. Historically, the Fourier transform was first introduced by Joseph Fourier in his study of the heat equa- tion. In this context, Fourier showed how a periodic function could be decomposed into a sum of sines and cosines which represent the frequencies of the function. From a modern point of view, the Fourier transform is a transformation which accepts a function and re- turns a new function, defined via the frequency data of the original function. As a bridge between the physical domain and the frequency domain, the Fourier transform is the main tool of use in harmonic analysis. The basic theory of the Fourier transform is standard, and there are a wealth of refer- ences (for example, [3], [6], and [7]) for the following material.

2.1 The Fourier Transform on Rd

We begin by discussing the Fourier transform of complex-valued functions on the Eu- d clidean space R . The Fourier transform is defined as an integral transformation of a func- tion; as such, it is natural to first consider integrable functions.

1 d Definition 2.1. The Fourier transform of a function f ∈ L (R ) is given by Z F(f)(ξ) := fˆ(ξ) := (f)ˆ(ξ) := e−2πix·ξf(x) dx. Rd There are a number of conventions for placement of constants when defining the Fourier transform. For example, another common definition of the Fourier transform is Z ˆ 1 −ix·ξ f(ξ) = d e f(x) dx. (2π) 2 Rd

These choices clearly do not alter the theory in any meaningful way; they are simply a mat- ter of notational preference. In this text, we will use either convention when convenient. In this section, we will use the former. We have two immediate goals. One is to understand the basic properties of the Fourier transform and how it interacts with operations such as differentiation, convolution, etc. The other goal is to understand which classes of functions the Fourier transform can be reasonably defined on. As and aid towards both of these goals, we introduce (or recall) a suitably nice class of functions. Here, we use the following multi-index notation: for d α = (α1, . . . , αd) ∈ N , define

|α| α α1 αd α ∂ |α| := α1 + ··· αd; x := x1 ··· xd ; D := α1 αd . ∂x1 ··· ∂xd

10 ∞ d α β ∞ d Definition 2.2. A C -function f : R → C is Schwartz if x D f ∈ Lx (R ) for every d multiindex α, β ∈ N . The collection of Schwartz functions, Schwartz space, is denoted d S(R ). In words, a function is Schwartz if it is smooth, and if all of its derivatives decay faster 2 than any polynomial. An example of a Schwartz function is e−|x| . The set of Schwartz functions forms a Frechet space, i.e., a locally convex space (a vector space endowed with a family of seminorms {ρα} that separates points: if ρα(f) = 0 for all d α, then f = 0) which is metrizable and complete. In the case of S(R ), the collection of seminorms is given by {ρα,β}α,β∈Nd , where α β ρα,β(f) := x D f . ∞ d Lx (R ) d p The Frechet space structure of S(R ) is not our main concern; rather, density in L spaces ∞ d d (indeed, note that Cc (R ) ⊆ S(R )) and its interaction with the Fourier transform are of more importance. Here, we collect a number of basic and important properties of the Fourier transform of Schwartz functions. d Proposition 2.3. Let f ∈ S(R ). Then: 1. If g(x) = f(x − y), then gˆ(ξ) = e−2πiy·ξfˆ(ξ). 2. If g(x) = e2πix·ηf(x), then gˆ(ξ) = fˆ(ξ − η).

d −1 −t 3. If g(x) := f(T x) for T ∈ GL(R ), then gˆ(ξ) = | det T | fˆ(T ξ). In particular, if T is a rotation or reflection and f(T x) = f(x) (i.e. f is rotation or reflection invariant), then gˆ(ξ) = fˆ(ξ).

4. If g(x) := f(x), then gˆ(ξ) = fˆ(−ξ). 5. If g(x) := Dαf(x), then gˆ(ξ) = (2πiξ)αfˆ(ξ).

α i |α| α ˆ 6. If g(x) := x f(x), then gˆ(ξ) = 2π Dξ f(ξ).

1 d 7. If g(x) := (k ∗ f)(x) for k ∈ L (R ), then gˆ(ξ) = kˆ(ξ) · fˆ(ξ). ˆ 8. We have the basic estimate f ≤ kfk 1 . ∞ Lx Lξ Proof. 1. Making the change of variables z = x − y, we have Z Z Z gˆ(ξ) = e−2πix·ξf(x − y) dx = e−2πi(z+y)·ξf(z) dz = e−2πiy e−2πiz·ξf(z) dz

= e−2πiyfˆ(ξ).

2. Computing directly gives: Z Z gˆ(ξ) = e−2πix·ξe2πix·ηf(x) dx = e−2πix·(ξ−η)f(x) dx = fˆ(ξ − η).

d 3. For T ∈ GL(R ), make the change of variables y = T x. Then

Z Z −1 gˆ(ξ) = e−2πix·ξf(T x) dx = e−2πi(T y)·ξf(y)| det T |−1 dy

Z −t = | det T |−1 e−2πiy·(T ξ)f(y) dy

= | det T |−1fˆ(T −tξ).

11 If T is a rotation or reflection (or more generally, T is orthogonal), then T tT = TT t = I and det T = 1. In this case, gˆ(ξ) = | det T |−1fˆ(T −tξ) = fˆ(ξ).

4. Computing directly gives: Z Z Z gˆ(ξ) = e−2πix·ξf(x) dx = e−2πix·ξf(x) dx = e−2πix·(−ξ)f(x) dx = fˆ(−ξ).

5. Integrating by parts, we have: Z Z gˆ(ξ) = e−2πix·ξDαf(x) dx = (−1)|α| Dαe−2πix·ξf(x) dx Z Z = (−1)|α| (−2πiξ)αe−2πix·ξf(x) dx = (2πiξ)α e−2πix·ξf(x) dx

= (2πξ)αfˆ(ξ).

α −2πix·ξ α −2πix·ξ α −2πix·ξ 6. Note that Dx e = (−2πix) e , so that x e . Thus, integrating by parts again gives:

Z  i |α| gˆ(ξ) = e−2πix·ξxαf(x) dx = fˆ(ξ). 2π

7. Computing, we have Z Z Z gˆ(ξ) = e−2πix·ξ(k ∗ f)(x) dx = e−2πix·ξ k(x − y)f(y) dy dx ZZ = e−2πix·ξk(x − y)f(y) dx dy.

We make a change of variables z = x − y in the inner integral to get ZZ Z Z gˆ(ξ) = e−2πi(z+y)·ξk(z)f(y) dz dy = e−2πiz·ξk(z) dz · e−2πiy·ξf(y) dy

= kˆ(ξ) · fˆ(ξ).

8. This follows from the triangle inequality for integrals: Z Z Z −2πix·ξ −2πix·ξ e f(x) dx ≤ | |e ||f(x)| dx = |f(x)| dx.

From this proposition, it becomes immediately clear why the Fourier transform is such a powerful computational tool. For example, properties 5 and 7 above describe how the Fourier transform turns complicated function operations like differentiation and convolu- tion into multiplication. This becomes incredibly useful when studying partial differential equations, for instance. Also note that properties 1,2,3,4,7, and 8 extend immediately to 1 d functions in L (R ). d ˆ d d ˆ ˆ Proposition 2.4. If f ∈ S(R ), then f ∈ S(R ). Moreover, if fn → f ∈ S(R ), then fn → f ∈ d S(R ).

12 d Proof. Suppose that f ∈ S(R ). Note that Z α β α −2πix·ξ α β |ξ D fˆ(ξ)| ∼ |ξ xdβf(ξ)| ∼ |D\αxβf(ξ)| = e D (x f(x)) dx .

α β 1 d Because f is Schwartz, D (x f(x)) ∈ L (R ). Therefore,

ξαDβfˆ(ξ) Dα(xβf(x)) < ∞ ∞ . 1 Lξ L ˆ d d ˆ ˆ d so that f ∈ S(R ). It is clear that {fn} → f ∈ S(R ) if and only if {fn} → f ∈ S(R ) by properties 5 and 6 of the previous proposition.

1 d Corollary 2.5 (Riemann-Lebesgue Lemma). If f ∈ L (R ), then fˆis uniformly continuous and vanishes at ∞. d Proof. Let C0(R ) denote the space of continuous functions which vanish at ∞. 1 d Let {fn} be a sequence of Schwartz functions with fn → f in L (R ). Since

f\n − f ≤ kfn − fkL1 L∞ ˆ ˆ ∞ d ˆ d ˆ d d it follows that fn → f in L (R ). As f ∈ S(R ), f ∈ C0(R ). Since C0(R ) is closed under uniform convergence, we are done.

Aside from simple properties and estimates of the Fourier transform and its interaction with various function operations, we have not computed any Fourier transforms of actual functions. Lemma 2.6. Let A be a real, symmetric, positive definite d × d matrix. Then Z −x·Ax −2πix·ξ d − 1 −π2ξ·A−1ξ e e dx = π 2 (det A) 2 e .

Proof. Since A is real, symmetric, and positive definite, it is diagonalizable. Explicitly, there is an orthogonal matrix O and a diagonal matrix D = diag(λ1, . . . , λd), λj > 0, so that A = OT DO. Let y = Ox and η = Oξ. Then

d T X 2 x · Ax = x · O DOx = Ox · DOx = y · Dy = λjyj j=1 and x · ξ = OT y · OT η = y · η. Thus, d Z Z Pd 2 Z 2 −x·Ax −2πix·ξ − λj y −2πiy·η Y −λj y −2πiyj ηj e e dx = e j=1 j e dy = e j dyj. j=1 R Computing each of these one-variable integrals, we have:

Z Z 2 2 2 2 2 Z 2 2 −λy2−2πiyη −λ y− πiη − π η − π η −λy2 − π η − 1 1 e dy = e ( λ ) λ dy = e λ e dy = e λ λ 2 π 2 . R R R So 2 2 d π η 2 Z j 1 Pd π 2 Y − − 1 d 1 − j=1 ηj −x·Ax −2πix·ξ λj 2 2 2 − 2 λj e e dx = e λj π = π (λ1 ··· λd) e j=1 d − 1 −π2η·D−1η = π 2 (det A) 2 e .

The result follows from the fact that η · D−1η = Oξ · D−1Oξ = ξ · A−1ξ.

13 2 Corollary 2.7. The function e−π|x| is an eigenvalue of the Fourier transform with eigenvalue 1. 2 Explicitly, e\−π|x|2 = e−π|ξ| .

Proof. This follows from the previous lemma with A = πId. Indeed,

Z − 1 −π|x|2 −2πix·ξ d  d 2 −π2ξ· 1 ξ −π|ξ|2 e e dx = π 2 π e π = e .

Above, we computed that the Fourier transform of a Schwartz function is again Schwartz. In fact, the Fourier transform is an isometry on Schwartz space and extends to a unitary operator on L2. The next few results summarize these facts. d ˆ Theorem 2.8 (Fourier Inversion). If f ∈ S(R ), then (fˆ)(x) = f(−x). Equivalently, f(x) = (fˆ)ˆ(−x) =: (fˆ)ˇ(x) =: F −1(fˆ)(x). Proof. For ε > 0, let Z −πε2|ξ|2 2πix·ξ Iε(x) := e e fˆ(ξ) dξ.

R 2πix·ξ By the dominated convergence theorem, as ε → 0, Iε(x) → e fˆ(ξ) dξ. Thus, to prove the theorem, it suffices to show that Iε(x) → f(x) as ε → 0. We have Z Z Z −πε2|ξ|2 2πix·ξ −πε2|ξ|2 2πix·ξ −2πiy·ξ Iε(x) = e e fˆ(ξ) dξ = e e e f(y) dy dξ

Z Z 2 2 = f(y) e−πε |ξ| e−2πi(y−x)·ξ dξ dy.

Note that Z ˆ − 1 −πε2|ξ|2 −2πi(y−x)·ξ  −πε2|ξ|2  d  2 d 2 −π2(y−x)· 1 (y−x) e e dξ = e (y − x) = π 2 (πε ) e πε2

|y−x|2 −d −π = ε e ε2 . Consequently, Z |y−x|2 −d −π Iε(x) = f(y) ε e ε2 dy = (f ∗ φε)(x)

−d x  −π|x|2 where φε(x) = ε φ ε with φ(x) = e . It is a standard result that {φε} is an approxi- mation to the identity. Thus, as ε → 0,

Iε(x) = (f ∗ φε)(x) → f(x).

d R R R Lemma 2.9. If f, g ∈ S(R ), then fˆ(ξ)g(ξ) dξ = f(x)ˆg(x) dx. In particular, fˆ(ξ)gˆ(ξ) dx =

R ˆ d f(x)g(x) dx, so that f = kfkL2 . Hence, the Fourier transform is an isometry on S(R ). L2 Proof. Computing, we have Z ZZ Z Z fˆ(ξ)g(ξ) dξ = e−2πix·ξf(x) dx g(ξ) dξ = f(x) e−2πix·ξg(ξ) dξ dx Z = f(x)ˆg(x) dx.

For the “in particular” statement, let h = gˆ. Then

hˆ(x) = gˆ = (ˆg)ˆ(−x) = g(x).

14 d Theorem 2.10 (Plancharel). The Fourier transform extends from an operator on S(R ) to a uni- 2 d tary operator on L (R ). 2 2 d d L Proof. Fix f ∈ L (R ). Let {fn} ⊆ S(R ) such that fn −→ f. As the Fourier transform is an

d ˆ ˆ ˆ isometry on S(R ), we have fn − fm = kfn − fmkL2 . This shows that {fn} is Cauchy L2 2 2 in L . Let fˆ := limn→∞ fˆn, the limit being taken in the L -sense. We claim that fˆ is well-defined. Let {fn} and {gn} be two sequences of Schwartz func- L2 tions such that fn, gn −→ f. Define  fk if n = 2k − 1 hn := . gk if n = 2k

2 L 2 Then hn −→ f. By the argument above, {hˆn} converges in L , which implies by the unique- ness of limits that lim fˆ = lim gˆ . n→∞ n n→∞ n 2 d ˆ Next we claim that f ∈ L (R ), then f = kfkL2 , so the Fourier transform is an L2 2 d 2 isometry on L (R ). Because the norm function is continuous in the L topology, we have

ˆ ˆ f = lim fn = lim kfnkL2 = lim fn = kfkL2 . L2 n→∞ L2 n→∞ n→∞ L2 Before completing the proof, we remark that in infinite , an isometry is not nec- 2 2 essarily a unitary operator. For example, let T : ` (N) → ` (N) be the right-shift operator given by T (a0, a1, a2,... ) := (0, a0, a1,... ). Then ∗ T (a0, a1, a2,... ) := (a1, a2, a3,... ). Clearly T is an isometry, but TT ∗ 6= I. However, to prove that the isometry F is unitary, it suffices to prove that F is surjective. 2 d d We claim that im F is closed in L . From this claim, since S(R ) ⊆ im F and S(R ) is dense 2 d in L (R ), the proof is complete. 2 To demonstrate this claim, let g ∈ im F. Then there is a sequence {fn} of L functions 2 L 2 so that fˆn −→ g. As the Fourier transform is an isometry on L , this implies that {fn}

2 ˆ ˆ ˆ converges in L . Let f := limn→∞ fn. Then fn − f = kfn − fkL2 → 0 so that g = fn. L2

1 d To summarize our progress thus far, we have defined the Fourier transform on L (R ), investigated its properties on Schwartz functions, and used the class of Schwartz functions 2 d to define the Fourier transform on L (R ). Next, we used Riesz-Thorin interpolation (see p d Chapter1) to define the Fourier transform on L (R ) for 1 ≤ p ≤ 2. Even more, we will p d show that the Fourier transform cannot be defined as a bounded operator on L (R ) spaces for p > 2. d Theorem 2.11 (Hausdorff-Young). If f ∈ S(R ), then

ˆ f 0 . kfkLp Lp 1 1 for 1 ≤ p ≤ 2, where p + p0 = 1.

ˆ ˆ Proof. We have the estimates f ≤ kfkL1 and f = kfkL2 . Applying the Riesz- L∞ L2 Thorin interpolation theorem with p0 = 1, p1 = 2, q0 = ∞, and q1 = 2, it follows that the Fourier transform is bounded from Lpθ → Lqθ , where 1 1 − θ θ 1 θ = + and = pθ 1 2 qθ 2

15 1 θ 1 1 2 so that = 1 − , hence + = 1. Since pθ = , as θ ∈ [0, 1] we have pθ ∈ [1, 2] as pθ 2 pθ qθ 2−θ desired.

Conversely, we have the following. This is one of our first examples of the power of scaling arguments.

ˆ d Theorem 2.12. If f . kfkLp for all f ∈ S(R ) for some 1 ≤ p, q ≤ ∞, then 1 ≤ p ≤ 2 and Lq q = p0. d Proof. Fix f 6= 0 ∈ S(R ). For λ > 0, let fλ(x) := f(x/λ). Then Z x fˆ (ξ) = e−2πix·ξf dx = λdfˆ(λξ). λ λ

ˆ We have fλ . kfλkLp , which implies Lq − d d d q ˆ p λ · λ f . λ kfkLp . Lq

d d d d 1 1 d − 0 0 Thus, λ · λ q . λ p , so that λ q . λ p and hence λ q . λ p for all 0 < λ < ∞. Letting 1 1 1 1 0 λ → ∞ gives q0 ≤ p , and letting λ → 0 gives q0 ≥ p . Thus, so q = p . 0 ∞ d Next, we show 1 ≤ p ≤ 2. It suffices to show p ≤ p . Let ϕ ∈ Cc (R ) with supp ϕ ⊆ −2πix·λke1 B(0, 1/2). Let ϕk(x) = e ϕ(x − ke1). Using properties 1 and 2 of the Fourier transform, it follows that −2πiξ·ke1 ϕˆk(ξ) = e ϕˆ(ξ − λke1). 1 PN p Let f = ϕk. Since the supports of ϕk are disjoint, it follows that kfk p ∼ N . Next, k=1 L we compute, expanding out fˆ as follows: Lp0

N N X −2πiξ·ke1 X −2πiξ·ke1 C e ϕˆ(ξ − λke1)χ (ξ) + e ϕˆ(ξ − λke1)χ (ξ) B(λke1,λ/2) B(λke1,λ/2) k=1 k=1 Lp0 N X ≥ e−2πiξ·ke1 ϕˆ(ξ − λke )χ (ξ) 1 B(λke1,λ/2) k=1 Lp0 N X −2πiξ·ke1 C − e ϕˆ(ξ − λke1)χB(λke ,λ/2)(ξ) 0 1 Lp k=1 1 p0 C & N − N ϕˆ(ξ)χB(0,λ/2)(ξ) 0 . Lp Because ϕˆ is Schwartz,

1 ! p0 Z 1 − d ϕˆ(ξ)χC (ξ) dξ λ p0 B(0,λ/2) p0 . 2d . L |ξ|>λ/2 |ξ|

1 1 ˆ p0 p which → 0 as λ → ∞. Thus, f 0 . kfkLp if and only if N . N for all N, which is Lp true if and only if p ≤ p0.

2.2 The Fourier Transform on Td.

The primary focus of this text is harmonic analysis on , and consequently d the Fourier transform on R is the most important tool for our purposes. One can also d d d define the Fourier transform on the torus T := R /Z , equivalently, for periodic functions. In the interest of studying dispersive partial differential equations on the torus (see Chapter 9), we include a brief discussion of the Fourier transform in this setting.

16 ∞ d d Definition 2.13. For a C -function f : T → C, we define its Fourier transform fˆ : Z → C by Z fˆ(k) = e−2πik·xf(x) dx. Td ˆ 2πik·x Then f(k) = hf, eki where ek(x) := e .

The characters ek are orthonormal, since Z 2πi(k−l)·x hek, eli = e = δkl. Td We quickly establish some familiar properties of the Fourier transform.

∞ d Proposition 2.14 (Bessel’s Inequality). For f ∈ C (T ),

X 2 2 | hf, e i | ≤ kfk 2 d . k L (T ) k∈Zd

d Proof. Let S ⊆ Z be a finite set. Then 2 * +

X X X 0 ≤ f − hf, eki ek = f − hf, eki ek, f − hf, eli el

k∈S L2(Td) k∈S l∈S * + 2 X X X = kfk 2 d − 2 hf, e i he , fi + hf, e i e , hf, e i e L (T ) k k k k l l k∈S k∈S l∈S X X = kfk2 − 2 | hf, e i |2 + | hf, e i |2 L2(Td) k k k∈S where the last equality follows from orthogonality of the ek’s. Simplifying and rearranging gives X 2 2 | hf, e i | ≤ kfk 2 d k L (R ) k∈S for every finite set S, hence the desired inequality.

∞ d P 2 d f ∈ C ( ) d hf, e i e ∈ L ( ) This fact shows that if T , then k∈Z k k T . Unsurprisingly, these two objects can be identified.

∞ d Proposition 2.15 (Fourier inversion). If f ∈ C (T ), then

X X ˆ 2πik·x f = hf, eki ek = f(k)e . k∈Zd k∈Zd P f 6= d hf, e i e Proof. Suppose for the sake of contradiction that k∈Z k k. Then by Stone- Weierstrass, there is a trigonometric polynomial g such that * + X f − hf, eki ek, g 6= 0. k∈Zd

For any character el, * + X f − hf, eki ek, el = hf, eli − hf, eli = 0. k∈Zd This is a contradiction.

17 The following result is unique to the periodic case.

Lemma 2.16 (Poisson summation). Let ϕ ∈ S(R). Then X X ϕ(x + n) = ϕˆ(n)e2πinx n∈Z n∈Z x ∈ P ϕ(n) = P ϕˆ(n) for all R. In particular, n∈Z n∈Z . F (x) = P ϕ(x + n) F (x) = P ϕˆ(n)e2πinx F Proof. Define 1 n∈Z and 2 n∈Z . Note that both 1 and F2 are 1-periodic in x. Also, since ϕ is Schwartz, each sum converges absolutely in n and converges uniformly in x on compact sets. Therefore, F1 and F2 are both continuous functions on T. As such, to give the desired equality it suffices to prove that F1 and F2 have the same Fourier coefficients. The definition of F2 immediately gives Fˆ2(k) =ϕ ˆ(k) for all k ∈ Z. Next, we compute:

Z 1 Z n+1 X −2πikx X −2πik(y−n) Fˆ1(k) = ϕ(x + n)e dx = ϕ(y)e dy 0 n n∈Z n∈Z X Z n+1 = ϕ(y)e−2πiky dy n n∈Z Z = ϕ(y)e−2πiky dy R =ϕ ˆ(k).

The in particular statement follows by considering x = 0.

18 Chapter 3

Lorentz Spaces and Real Interpolation

In this chapter, we introduce a new class of functions space, namely, Lorentz spaces. These spaces measure the size of functions in a more refined manner than the classical Lp spaces. After developing some of the basic theory of Lorentz spaces, we then state and prove the Marcinkiewicz interpolation theorem in its most general setting. Some references for Lorentz spaces are [5] and [6].

3.1 Lorentz Spaces

Before defining Lorentz spaces in their full generality, we first consider a simpler and more familiar class of functions. d Definition 3.1. For 1 ≤ p ≤ ∞ and f : R → C measurable, define 1 ∗ n d o p p kfkL := sup λ x ∈ R : |f(x)| > λ . (3.1) weak λ>0 p p d The weak-L space, written Lweak(R ), is the family of measurable functions f for which ∗ kfk p is finite. Lweak The quantity defined in (3.1) contains an asterisk as a superscript because is not a norm. However, it is a quasinorm. Recall that a quasinorm satisfies the same properties as a norm, except that the triangle inequality is replaced by the quasi triangle inequality: kf + gk ≤ C (kfk + kgk) for some universal constant C > 0. p p d To clarify the definition of the weak-L (quasi)norm, consider a function f ∈ L (R ). Using the fundamental theorem of calculus, we can write Z Z Z |f(x)| kfkp = |f(x)|p dx = pλp−1 dλ dx. Lp(Rd) Rd Rd 0 By Tonelli’s theorem, Z ∞ Z Z ∞ n o kfkp = pλp−1 dx dλ = pλp−1 x ∈ d : |f(x)| > λ dλ Lp(Rd) R 0 { x∈Rd : |f(x)|>λ } 0 Z ∞  n o 1 p dλ d p = p λ x ∈ R : |f(x)| > λ . 0 λ

1 1 1 p  d p So kfk p d = p λ x ∈ : |f(x)| > λ . Using the convention p ∞ = L (R ) R p dλ L ((0,∞), λ ) 1, we can then write 1 ∗ 1 n d o p kfk p d = p ∞ λ x ∈ : |f(x)| > λ . L (R ) R weak ∞ dλ L ((0,∞), λ ) In this way, weak-Lp spaces are a clear generalization of Lp spaces.

19 d − p p d p d Example 3.2. Let f(x) = |x| . Then f ∈ Lweak(R ), but f∈ / L (R ). To see this, first note that Z Z Z ∞ Z ∞ p 1 1 d−1 1 |f(x)| dx = d dx ∼ d r dr ∼ dr. Rd Rd |x| 0 r 0 r

p p d This latter integral does not converge, hence the L -norm of f is not finite, so f∈ / L (R ). On the other hand,

1 p 1 1 ( ) p ( ) p n o 1  1  d x ∈ d : |f(x)| > λ p = x ∈ d : > λ = x ∈ d : |x| < . R R d R |x| p λ

The set being measured is a ball of radius (1/λ)p/d. The volume of a such a ball scales d according to (1/λ)p/d = (1/λ)p. Taking pth roots as above then gives

n o 1 1 x ∈ d : |f(x)| > λ p . R . λ Hence, ∗ 1 kfk p d sup λ = 1 < ∞ L (R ) . weak λ>0 λ p d so that f ∈ Lweak(R ). Lorentz spaces are an even further generalization of Lp and weak-Lp spaces.

p,q d Definition 3.3. For 1 ≤ p ≤ ∞ and 1 ≤ q ≤ ∞, the Lorentz space L (R ) is the space of d measurable functions f : R → C for which the quantity

1 1 n o p ∗ q d kfk p,q d := p λ x ∈ : |f(x)| > λ L (R ) R (3.2) q dλ L ((0,∞), λ ) is finite.

p d p,p d p d p,∞ d By our previous computation, L (R ) = L (R ), and Lweak(R ) = L (R ). Also, as with the weak-Lp norm, the Lp,q-norm defined in (3.2) is not actually a norm.

∗ k·k p,q d Lemma 3.4. The quantity L (R ) is a quasinorm. ∗ kfk p,q d = 0 Proof. First, note that if L (R ) , then

1 n d o p λ x ∈ R : |f(x)| > λ = 0 q dλ L ((0,∞), λ )

 d which implies that, for almost all λ > 0, x ∈ R : |f(x)| > λ = 0. It follows that f(x) = 0 almost everywhere. Thus, f = 0. Next, let a ∈ C. Then 1 1 n o p ∗ q d kafk p,q d = p λ x ∈ : |af(x)| > λ L (R ) R q dλ L ((0,∞), λ ) 1   p 1 λ d λ = |a|p q x ∈ : |f(x)| > . a R a q dλ L ((0,∞), λ )

∗ ∗ η = λ/a kafk p,q d = |a| kfk p,q d By making the change of variables , we have L (R ) L (R ).

20 It remains to prove that quasi-triangle inequality. To do this, we invoke the fact that for 0 < α < 1, the map x 7→ xα is concave. It is a standard fact that concave functions are subadditive, so that (x + y)α ≤ xα + yα. With this in mind, we compute: 1 1 n o p ∗ q d kf + gk p,q d = p λ x ∈ : |f(x) + g(x)| > λ . L (R ) R q dλ L ((0,∞), λ ) Observe that n o  λ   λ  x ∈ d : |f(x) + g(x)| > λ ⊆ x ∈ d : |f(x)| > ∪ x ∈ d : |g(x)| > . R R 2 R 2 Thus, 1 1   λ   λ   p ∗ q kf + gk p,q d ≤ p λ x : |f(x)| > + x : |g(x)| > . L (R ) 2 2 q dλ L ((0,∞), λ ) 1 By subbaditivity of the concave map x 7→ x p , we have 1 1 ! 1  λ  p  λ  p ∗ q kf + gk p,q d ≤ p λ x : |f(x)| > + x : |g(x)| > . L (R ) 2 2 q dλ L ((0,∞), λ ) Distributing the λ, factoring out a 2, and appliyng the usual Lq-Minkowski inequality then yields: 1 1 1 λ  λ  p λ  λ  p ∗ q kf + gk p,q d ≤ 2p x : |f(x)| > + x : |g(x)| > L (R ) 2 2 2 2 q dλ L ((0,∞), λ )    1 1 λ λ p ≤ 2p q x : |f(x)| >  2 2 q dλ L ((0,∞), λ )    1 λ λ p + x : |g(x)| > 2 2  q dλ L ((0,∞), λ )  ∗ ∗  = 2 kfk p,q d + kgk p,q d . L (R ) L (R )

∗ 1 < p < ∞ 1 ≤ q ≤ ∞ k·k p,q d For and , we will show that the quasinorm L (R ) is in fact equivalent to a norm. For p = 1, q 6= 1, the quasinorm is not equivalent to a norm. However, in this case, there does exist a metric that generates the same topology. Thus, in p,q d either of these cases, L (R ) is a complete . As another quick remark, we note that if |g| ≤ |f|, then for any λ > 0,

n d o n d o x ∈ R : |g(x)| > λ ⊆ x ∈ R : |f(x)| > λ . ∗ ∗ kgk p,q d ≤ kfk p,q d This implies that L (R ) L (R ). It is natural to ask whether different Lorentz spaces have any simple embedding prop- erties. The next result gives a partial answer to this question. 1 ≤ p < ∞ 1 ≤ q ≤ ∞ f ∈ Lp,q( d) f = P f Proposition 3.5. Let , , and R . Write m∈Z m, where f (x) := f(x) χ . m { x∈Rd : 2m≤|f(x)|≤2m+1 } Then ∗ kfk p,q d ∼ kf k p d . L (R ) p,q m L (R ) q `m(Z) p,q d p,q d In particular, L 1 (R ) ⊆ L 2 (R ) whenever q1 ≤ q2.

21 Proof. By the preceding remark, note that ∗ ∗

X m X 2 χ{ x : 2m≤|f|≤2m+1 } ≤ fm

m∈Z Lp,q(Rd) m∈Z Lp,q(Rd) ∗

X m+1 ≤ 2 χ{ x : 2m≤|f|≤2m+1 } .

m∈Z Lp,q(Rd) f(x) = P 2mχ Therefore, it suffices to prove the proposition for a function of the form m∈Z Em where {Em} is a pairwise disjoint collection of sets. 1 ∗ m p kfk p,q d ∼ 2 |E | In this case, we need to show that L (R ) p,q m q . We compute: `m(Z) ∞  q Z q dλ ∗ q p kfk p,q d = p λ | { x : |f(x)| > λ } | L (R ) 0 λ 2m X Z q dλ = p λq | { x : |f(x)| > λ } | p . 2m+1 λ m∈Z m+1 m S Next, observe that for λ ∈ [2 , 2 ), { x : |f(x)| > λ } = n≥m EM . As the Em’s are disjoint, we then have q m   p q Z 2  ∗  X q X dλ kfk p,q d = p λ |E | . L (R )  m  2m+1 λ m∈Z n≥m This removes the dependence on λ in the set inside the integral. Because m Z 2 dλ p p p λq = (2mq − 2(m+1)q) = (1 − 2q) 2mq 2m+1 λ q q it follows that q 1 q   p   p q  ∗  X mq X m X kfk p,q d ∼ 2 |E | = 2 |E | . L (R ) p,q  n   m  m∈ n≥m n≥m Z q `m(Z) Thus, 1   p ∗ X 1 m m p kfkLp,q( d) ∼p,q 2  |En| &p,q 2 (|Em|) q . R ` ( ) n≥m m Z q `m(Z) This gives half of the ∼p,q relation that we need. To get the other inequality, we compute as follows. First, invoking concavity of frac- tional powers as before, 1   p 1 ∗ m X m X p kfk p,q d ∼p,q 2  |En| ≤ 2 |En| . L (R ) n≥m n≥m q q `m( ) `m(Z) Z Next, making the change of variables n = m + k,

m X 1 X −k m+k 1 X −k m+k 1 = 2 |E | p = 2 2 |E | p ≤ 2 2 |E | p m+k m+k m+k q `m(Z) k≥0 q k≥0 q k≥0 `m(Z) `m(Z) X −k m 1 = 2 2 |Em| p `q ( ) k≥0 m Z 1 = 2m|E | p m q `m(Z)

22 1 ∗ m p kfk p,q d 2 |E | so that L (R ) .p,q m q . Therefore, `m(Z) 1 ∗ m p kfk p,q d ∼ 2 |E | L (R ) p,q m q `m(Z) as desired. q q The “in particular” statement follows from the fact that if q1 ≤ q2, then ` 1 (Z) ⊆ ` 2 (Z).

As noted in the proof of the above proposition, the monotonicty of the quasinorm allows for the following reduction in many computations involving Lorentz spaces. If p,q d f ∈ L (R ) is real-valued and nonnegative, then we can assume without loss of general- f = P 2mχ E ity that m∈Z Em , where the sets m are pairwise disjoint. In this case, 1 ∗ m p kfk p,q d = 2 |E | . L (R ) m q `m(Z) We can further decompose a general function into its real and imaginary parts, and then into their corresponding positive and negative parts, to apply this reduction.

3.2 Duality in Lorentz Spaces

One of the most important principles in Lp space theory is that of duality. The analogous fact for Lorentz spaces is given by the following proposition. Proposition 3.6. Let 1 < p < ∞ and 1 ≤ q ≤ ∞, and let p0 and q0 denote the respective Holder conjugates. Then Z kfk∗ ∼ sup f(x)g(x) dx . Lp,q(Rd) p,q (3.3) ∗ d kgk p0,q0 d ≤1 R L (R ) Proof. As the quasinorm is positively homogeneous, we can scale the function f appro- ∗ kfk p,q d = 1 priately and assume without loss of generality that L (R ) . Also, as remarked previously, we can assume without loss of generality that f and g are real-valued and non- f = P 2nχ F g = P 2mχ negative, hence we can take n∈Z Fn for disjoint sets n and m∈Z Em for disjoint sets Em. In this case, we have  q 1 q q ∗ n p X nq p 1 = kfk p,q d = 2 |F | = 2 |F | . L (R ) n q n `n(Z) n∈Z We decompose the above sum as follows. Let 2Z denote the set of dyadic numbers. Then  q q q q q X nq X X nq X X nq X X n 2 |Fn| p = 2 N p = N p 2 ∼ N p  2  . n∈Z N∈2Z n:|Fn|∼N N∈2Z n:|Fn|∼N N∈2Z n:|Fn|∼N The latter equivalence is a straightforward exercise in dealing with dyadic sums, together with the fact that the `q-norm is bounded by the ell1-norm. Thus,  q X X n 1 1 ∼  2 N p  . N∈2Z n:|Fn|∼N ∗ Similarly, since kgk p0,q0 d ≤ 1, the same computation gives L (R )  q0 1 X X m 0  2 |Em| p  . 1. M∈2Z m:|Em|∼M

23 With these computations and our reductions in mind, we demonstrate the equivalence in (3.3). We first show that the right hand side is bounded by the left. Z Z X n m X n m f(x)g(x) dx = 2 2 χFn∩Em (x) dx = 2 2 |Fn ∩ Em| n,m∈Z n,m∈Z X X n X m . 2 2 min{N,M}. N,M∈2Z n:|Fn|∼N m:|Em|∼M

1 1 0 By our above computations, we eventually want to introduce the quantities N p and |Em| p . We force them in the sum as follows: ( ) 1 1 N M X X n p X m p0 . 2 N 2 |Em| min 1 1 , 1 1 p p0 p p0 N,M∈2Z n:|Fn|∼N m:|Em|∼M N M N M 1 1 (  0   ) X X 1 X 1 N p M p = 2n|F | p 2m|E | p0 min , . n m M N N,M∈2Z n:|Fn|∼N m:|Em|∼M

q Next, we use Holder’s inequality on the above sum, viewing the n-sum as living in `N and q0 the m-sum as living in `M . This yields

1   q 1 1  q (  0   ) X X 1 N p M p 2n|F | p min , .   n  M N  N,M∈2Z n:|Fn|∼N 1 0  q  q0   ( 1 1 ) X X 1  N  p0 M  p ·  2M |E | p0 min ,  .   m  M N  N,M∈2Z m:|Em|∼M

In the first term, freeze N and sum over M. Because p > 1 and hence p0 < ∞, we have

( 1 1 ) 1 1 X  N  p0 M  p X M  p X  N  p0 min , = + 1 + 1 1. M N N M . . M∈2Z M≤N M>N

Similarly, in the second term we freeze M and sum over N. This gives

1 1  0  0   q q  q q Z X X 1 X X 1 n p  M p0  f(x)g(x) dx .   2 |Fn|   ·   2 |Em|   N∈2Z n:|Fn|∼N M∈2Z m:|Em|∼M

. 1.

This gives one inequality. Next, we consider the converse inequality. Again we can take f to be of the form P 2nχ kfk∗ ∼ 1 n Fn with Lp,q(Rd) . Let

 1 q−1 − 1 q X n p p0 X n(q−1) p −1 g = 2 |Fn| |Fn| χFn = 2 |Fn| χFn . n n

We make two claims, from which the desired inequality clearly follows: R 1. f(x)g(x) dx & 1; ∗ 2. kgk p0,q0 d 1. L (R ) .

24 The first claim follows from

Z Z q q X n(q−1) n p −1 X nq p f(x)g(x) dx = 2 2 |Fn| χFn = 2 |Fn| n n 1 q = 2n|F | p n q `n(Z) ∼ 1.

To see the second claim, write X g ∼ NχS F n∈Sn n N∈2Z q n n(q−1) −1 o where Sn = n : 2 |Fn| p ∼ N . Note that the Sn’s are disjoint sets. Then

q0 ! 0 q0 p  ∗  X q0 X kgk p0,q0 d ∼ N |Fn| L (R ) N∈2Z n∈Sn q0 p ! p0 X 0 X   q−p ∼ N q N2−n(q−1)

N∈2Z n∈Sn 0 X 0 X p · q −nq p−1 ∼ N q N q−p p0 2 q−p

N∈2Z n∈Sn where in the last equivalence we have used the fact about dyadic sums and `q norms from 0 q above and the fact that q = q−1 . It remains to compute and simplify all of the Holder exponents. Doing this gives

0 q  q  q−p  q−p p−q ∗ X n(q−1) p −nq q−p kgk p0,q0 d ∼ 2 |Fn| 2 L (R ) n∈Z q X nq ∼ |Fn| p 2 n∈Z . 1. This proves both claims, the the converse inequality, and hence completes the proof.

The right hand side of (3.3) defines a norm which is equivalent to the quasinorm Lorentz p,q d quasinorm. With respect to this norm, L (R ) for 1 < p < ∞, 1 ≤ q ≤ ∞, is a Banach space. The proof of completeness is the same as the proof of completeness in Lp spaces. p0,q0 d The dual space in this case is L (R ). As remarked above, if p = 1 and q 6= 1, there is no norm which is equivalent to the quasinorm. The following example demonstrates this explicitly. PN 1 Example 3.7. Consider the case p = 1, q = ∞, d = 1. Let f(x) = n=1 |x−n| . We ∗ ∗ saw in a previous example that 1 = 1 1. This implies that |x−n| 1,∞ |x−n| 1 . L (R) Lweak(R) ∗ PN 1 n=1 |x−n| . N. L1,∞(R) ∗ PN 1 We claim that n=1 |x−n| & N log N. We have L1,∞(R)

N ∗ ( N ) X 1 X 1 = sup λ x ∈ d : > λ . |x − n| R |x − n| 1,∞ λ>0 n=1 L (R) n=1

25 1 1 Fix x ∈ [0,N]. Note that, for n ≥ x, |x−n| < n, so that |x−n| > n . For n < x, we can consider P 1 each finite number and rearrange them so that their sum is comparable to n C log N ⊇ [0,N]. Therefore,

( N ) d X 1 kfk 1,∞ ≥ C log N x ∈ : > C log N ≥ CN log N L (R) R |x − n| n=1 as claimed. Now, suppose that the quasinorm was equivalent to a norm. Then by the triangle in- equality, ∗ N N ∗ X 1 X 1 N log N . . . N, |x − n| |x − n| 1,∞ 1,∞ L ( ) n=1 L (R) n=1 R which is a contradiction.

3.3 The Marcinkiewicz Interpolation Theorem

Finally, we arrive at the Marcinkiewicz interpolation theorem. Before stating and proving the theorem, we recall the notion of sublinearity and introduce some language that will be used throughout the remainder of this text.

Definition 3.8. A mapping T on a class of measurable functions is sublinear if |T (cf)| ≤ |c||T (f)| and |T (f + g)| ≤ |T (f)| + |T (g)| for all c ∈ C and f, g in the support of T . Any linear operator is obviously sublinear. A more interesting class of examples, and perhaps the primary motivation for defining sublinearity, is the following.

Example 3.9. Given a family of linear mappings {Tt}t∈I , the mapping defined by

T (f)(x) := kTt(f)(x)k q Lt is sublinear. When q = ∞, operators of this form are called maximal functions. When q = 2, the term square function is used.

Definition 3.10. Let 1 ≤ p, q ≤ ∞. A mapping of functions T is of (strong) type (p, q) if kT fk q d kfk p d T (p, q) L (R ) .T L (R ). That is, is of type if it is bounded as a mapping from p d q d ∗ L ( ) → L ( ) q < ∞ T (p, q) kT fk q,∞ d R R . Similarly, for , is of weak type if L (R ) .T 1 ∗ p kfk p d (p, q) kT χ k q,∞ d |F | L (R ), and is of restricted weak type if F L (R ) .T for all finite mea- sure sets F .

Note that type (p, q) implies weak type (p, q), which implies restricted weak type (p, q). An alternative characterization of restricted weak type operators is given by the following proposition.

Proposition 3.11. A mapping T is of restricted weak type (p, q) if and only if

Z 1 1 |T χF ||χE| dx . |E| p |F | q for all finite measure sets E,F .

26 1 ∗ p T (p, q) kT χ k q,∞ d |F | Proof. First, suppose that is of restricted type . Then F L (R ) .T . By the duality of Lorentz quasinorms, this implies Z ∗ ∗ |T χF ||χE| dx kT χF k q,∞ d kχEk q0,1 d . . L (R ) L (R )

Note that q0 Z 1  ∗  0 q0 dλ kχEk q0,1 d = q λ |{ x : χE(x) > λ }| ∼ |E|. L (R ) 0 λ Hence, Z 1 1 |T χF ||χE| dx . |E| p |F | q .

Conversely, suppose that the above inequality holds for all finite measure sets E and F . We again invoke duality to write Z Z ∗ kT χF kLq,∞( d) ∼ sup T χF g dx sup |T χF ||g| dx. R ∗ . ∗ kgk 0 ≤1 kgk 0 ≤1 Lq ,1 Lq ,1

P m Consider g := m 2 Em for an appropriate disjoint collection of sets Em. Then

Z 1 1 1 X m p q0 p ∗ |T χF ||g| dx 2 |F | |E| |F | kgk q0,1 d . . L (R ) m 1 . |F | p as desired.

The formulation of the Marcinkiewicz interpolation theorem that we state below is due to Hunt (see [5]). The classical statement of the theorem is given as Corollary 3.14 below.

Theorem 3.12 (Marcinkiewicz interpolation). Fix 1 ≤ p1, p2, q1, q2 ≤ ∞ such that p1 6= p2 and q1 6= q2. Let T be a sublinear operator of restricted weak type (p1, q1) and of restricted weak type (p2, q2). Then for any 1 ≤ r ≤ ∞ and 0 < θ < 1,

∗ ∗ kT fkLqθ,r . kfkLpθ,r where 1 = θ + 1−θ and 1 = θ + 1−θ . pθ p1 p2 qθ q1 q2

Before proving the theorem, we make a few remarks. First, if pθ ≤ qθ, taking r = qθ gives ∗ kT fkLqθ . kfkLpθ,qθ . kfkLpθ so that T is of strong type (pθ, qθ). The requirement that pθ ≤ qθ is essential for the strong type conclusion, as evidenced by the following example.

f(x) p Example 3.13. Define T (f)(x) := 1 . Then T is bounded as an operator from L (R) → |x| 2 2p 2p ,∞ p L p+2 (R) for any 2 ≤ p ≤ ∞, but is not bounded as an operator from L (R) → L p+2 (R). To prove this, we invoke Holder’s inequality in Lorentz spaces: for 1 ≤ p1, p2, p < ∞ 1 1 1 1 1 1 ∗ 1 ≤ q , q , q ≤ ∞ = + = + kfgk p,q and 1 2 satisfying p p1 p2 and q q1 q2 , we have L . ∗ ∗ kfkLp1,q1 kgkLp2,q2 . By this inequality, we have: ∗ ∗ ∗ − 1 kT fk 2p kfk p,∞ |x| 2 kfk p . ,∞ . L (R) 2,∞ . L (R) L p+2 (R) L (R)

27 − p+2 − 1   2p f(x) = |x| p log |x| + 1 On the other hand, consider the function |x| . We first claim p that f ∈ L (R). Indeed, we can write

p+2 Z   − 2 p 1 1 kfk p = log |x| + dx L (R) |x| |x| p+2 p+2 ! Z 2   − 2 Z ∞   − 2 1 1 1 1 = 2 log |x| + dx + log |x| + dx . 0 |x| |x| 2 |x| |x|

Consider the second integral. Since p ≥ 2, we have

p+2 Z ∞   − 2 Z ∞   −2 Z ∞ 1 1 1 1 1 log |x| + dx ≤ log |x| + dx ≤ 2 dx 2 |x| |x| 2 |x| |x| 2 x(log x) Z ∞ 1 = 2 du log 2 u < ∞.

p By symmetry, the singularity at 0 behaves similarly. Hence, f ∈ L (R). However, it is not difficult to show that the quantity

2p −1 Z 2  1  p+2 − p+2 kT fk 2p = |x| log |x| + dx L p+2 |x| is not finite.

We now prove Theorem 3.12.

Proof.

Corollary 3.14 (Marcinkiewicz interpolation).

28 Chapter 4

Maximal Functions

The notion of averaging is one which is ubiquitous in analysis. For example, convolution can be viewed as a sort of averaging against a given function. Convolving a function with a smooth function then smooths the original function, a fact which is immensely useful. In this chapter, we study averaging in more depth via maximal functions. We begin by recalling the classical Hardy-Littlewood maximal inequality, and spend the rest of the chapter developing generalizations.

4.1 The Hardy-Littlewood Maximal Function

The reader may already be familiar with the Hardy-Littlewood maximal function, defined as 1 Z M(f)(x) := sup |f(y)| dy. (4.1) r>0 |Br(x)| Br(x) 1 R |f(y)| dy The quantity |Br(x)| Br(x) represents the average value of the function on a ball centered at x. Thus, the maximal function measures the largest possible average values for f on various balls. The following theorem is standard. Theorem 4.1 (Hardy-Littlewood maximal inequality). Let M(f) denote the Hardy-Littlewood maximal function as in (4.1). Then

p d 1. For f ∈ L (R ), 1 ≤ p ≤ ∞, M(f) is finite almost everywhere. 2. The operator M is of weak type (1, 1), and of strong type (p, p) for 1 < p ≤ ∞.

Later in this chapter, we will prove a generalization of Theorem 4.1, and so we defer a proof until then. Unraveling statement 2 of Theorem 4.1, we see that M being of weak type (1, 1) means

∗ kMfk 1,∞ d kfk 1 d L (R ) . L (R ) and hence C | { x :(Mf)(x) > λ } | ≤ kfk 1 d λ L (R ) for all λ > 0. This is the usual formulation of the Hardy-Littlewood maximal inequality. We remark that the p > 1 condition in statement 2 is necessary, because M is not of strong ∞ d type (1, 1). To see this, let φ ∈ Cc (R ) with supp φ ⊆ B1/2(0). Fix |x| > 1. If r < |x| − 1/2, R φ(y) dy = 0 r > |x| + 1/2 then Br(x) . If , then 1 Z 1 Z 1 Z φ(y) dy = φ(y) dy ≤ φ(y) dy. |B (x)| |B (x)| |B (x)| r Br(x) r B|x|+1/2(x) |x|+1/2 B|x|+1/2(x)

29 Thus, 1 Z 1 (Mφ)(x) = sup φ(y) dy & d . |x|−1/2≤r≤|x|+1/2 |Br(x)| Br(x) |x| 1 1 d But |x|d is not in L (R ). We close this section with a classic application of the Hardy-Littlewood maximal in- equality.

1 d Theorem 4.2 (Lebesgue differentiation). Let f ∈ Lloc(R ). Then 1 Z lim f(y) dy = f(x) r→0 |Br(x)| Br(x)

d for almost every x ∈ R . Proof. ADD IN

More details on the classical Hardy-Littlewood maximal function can be found in [9].

4.2 Ap Weights and the Weighted Maximal Inequality

The main theorem of this section, which is a generalization of Theorem 4.1, is the following.

d Theorem 4.3 (Weighted maximal inequality). Let ω : R → [0, ∞) be locally integrable. Asso- R ciate to ω the measure defined by ω(E) := E ω(y) dy. Then M : L1(M(ω) dx) → L1,∞(ω dx) and M : Lp(M(ω) dx) → Lp(ω dx) for 1 < p ≤ ∞. Explicitly, 1 Z ω ({ x : |(Mω)(x)| > λ }) . |f(x)| (Mω)(x) dx λ Rd and Z Z p p |(Mf)(x)| ω(x) dx . |f(x)| (Mω)(x) dx. Rd Rd Note that if ω ≡ 1, then M(ω) ≡ 1, and we recover the classical Hardy-Littlewood maximal inequality in Theorem 4.1. Before proving the weighted maximal inequality, we establish some preliminary facts on Ap weights.

Definition 4.4. A function ω satisfies the A1 condition, written ω ∈ A1, if M(ω) . ω almost everywhere.

1 1,∞ Note that if ω ∈ A1, then Theorem 4.3 gives M : L (ω dx) → L (ω dx) and M : Lp(ω dx) → Lp(ω dx) for 1 < p ≤ ∞.

Lemma 4.5. The following are equivalent:

1. ω ∈ A1; ω(B) 2. |B| . ω(x) for almost all x ∈ B and all balls B;

1 R 1 R 3. |B| B f(y) dy . ω(B) B f(y)ω(y) dx for all f ≥ 0 and all balls B.

30 Proof. We first show (1) ⇒ (2). Fix a ball B0 of radius r0 and let x ∈ B0. Then 1 Z 1 Z ω(x) & (Mω)(x) = sup ω(y) dy ≥ ω(y) dy. r>0 |Br(x)| |B2r (x)| Br(x) 0 B2r0 (x)

Because B2r0 (x) ⊇ B0 (drawing a picture makes this clearer) and because |B2r0 (x)| .d |B0|, Z 1 ω(B0) ω(x) & ω(y) dy = |B0| B0 |B0| as desired. The implication (2) ⇒ (1) follows immediately from the definition of the maximal function. Next, we show (2) ⇒ (3). Fix a ball B and f ≥ 0. Then 1 Z 1 Z ω(B) 1 Z f(y)ω(y) dy & f(y) dy = f(y) dy. ω(B) B ω(B) B |B| |B| B Finally (3) ⇒ (2). Fix a ball B and let x ∈ B be a Lebesgue point of B (i.e., a point for which the conclusion of the Lebesgue differentiation theorem holds). Choose r << 1 so that Br(x) ⊆ B. Let f = χBr(x). Then by (3), Z Z Z 1 1 |Br(x)| 1 χBr(x)(y) dy . χBr(x)(y)ω(y) dx ⇒ . ω(y) dy. |B| B ω(B) B |B| ω(B) Br(x) Since x is a Lebesgue point, ω(B) 1 Z . ω(y) dy → ω(x) |B| |Br(x)| Br(x) as r → 0.

d Definition 4.6. A function ω : R → [0, ∞) satisfies the Ap condition, written ω ∈ Ap, if p  Z 0  0 ω(B) 1 − p p sup ω(y) p dy . 1, B |B| |B| B d where the supremum is taken over all open balls in R . We note for convenience that the above condition is equivalent to

p ω(B) Z p0  p0 − p sup p ω(y) dy . 1. B |B| B

Also, if 1 < p < q < ∞, then Ap ⊆ Aq. This follows from Holder’s inequality. The importance of the class of Ap weights is the following theorem, which characterizes when the maximal function is a bounded operator on a weighted Lp space. p p Theorem 4.7. Fix 1 < p < ∞. Then M : L (ω dx) → L (ω dx) if and only if ω ∈ Ap. Proof. Here, we only explicitly prove one direction (⇒) of this statement. For the other direction, we will provide an outline of the argument and leave the details as an exercise for the reader. p0 p p − Suppose that M : L (ω dx) → L (ω dx). Fix a ball B and let f = (ω + ε) p χB. For x ∈ B, Z 0 Z 0 1 − p 1 − p (Mf)(x) = sup (ω(y) + ε) p χB(y) dy ≥ (ω(y) + ε) p dy r>0 |B(x, r)| B(x,r) |2B| B Z 0 1 − p & (ω(y) + ε) p dy =: λ. |B| B

31 Here our definition of λ includes any implicit constants from the above inequality. Then  λ  1 Z ω(B) ≤ ω x :(Mf)(x) > |f(y)|pω(y) dy 2 . λp where the final inequality follows from the fact that M : Lp(M(ω) dx) → Lp(ω dx), and hence M : Lp(M(ω) dx) → Lp,∞(ω dx). Plugging in our definition of λ gives

−p Z p0  Z p − −p0 ω(B) . |B| (ω(y) + ε) p dy (ω(y) + ε) ω(y) dy B B −p Z p0  Z p − −p0+1 . |B| (ω(y) + ε) p dy (ω(y) + ε) dy B B Z 0 −p Z 0 − p − p = |B|p (ω(y) + ε) p dy (ω(y) + ε) p dy B B Z 0 −(p−1) − p = |B|p (ω(y) + ε) p dy . B Thus, p ω(B) Z p0  p0 − p p (ω(y) + ε) dy . 1. |B| B

Letting ε → 0 gives ω ∈ Ap.

Now, we sketch the proof of the converse direction. Suppose that ω ∈ Ap. Define the following weighted maximal function: 1 Z (Mωf)(x) := sup |f(y)|ω(y) dy. r>0 ω(B(x, r)) B(x,r)

1 1,∞ It can be shown (via techniques from this section) that Mω : L (ω dx) → L (ω dx), and p p moreover (Mf) . Mω(f ). Fix f ∈ Lp(ω dx). Then p  ∗  p p p p kMfkLp,∞(ω dx) = sup λ ω ({ x :(Mf)(x) > λ }) = sup λ ω ({ x :(Mf) (x) > λ }) λ>0 λ>0 = sup η ω ({ x :(Mf)p(x) > η }) . η>0

p p Since (Mf) (x) . Mω(|f| )(x), p p ω ({ x :(Mf) (x) > η }) . ω ({ x : Mω(|f| )(x) > η }) . Thus, p  ∗  p p kMfkLp,∞(ω dx . sup η ω ({ x : Mω(|f| )(x) > η }) = kMω(|f| )kL1,∞(ω dx) . η>0

1 1,∞ Since Mω : L (ω dx) → L (ω dx), Z p p p p kMω(|f| )kL1,∞(ω dx) . k|f| kL1(ω dx) = |f(x)| ω(x) dx = kfkLp(ω dx) .

Therefore, ∗ kMfkLp,∞(ω dx . kfkLp(ω dx) so that M : Lp(ω dx) → Lp,∞(ω dx).

32 Theorem 4.8. Fix 1 ≤ p ≤ ∞, and let dµ be a nonnegative Borel measure. If M : Lp(dµ) → p,∞ L (dµ), then dµ = ω dx for some ω ∈ Ap.

Proof. If we can prove that dµ is absolutely continuous, then in light of the previous proof, we are done. Decompose dµ = ω(x) dx + dν where dν is the singular part of dµ, i.e., there exists a compact set K with |K| = 0 but ν(K) > 0. Define Un = { x : d(x, K) < 1/n } and T fn = χUn\K . As Un \ K ⊇ Un+1 \ K, and (Un \ K) = ∅, it follows that fn → 0 pointwise. We claim that dµ is finite on compact sets. To see this, pick a measurable set E with 0 < µ(E) < ∞; this possible since µ is nontrivial. We may assume that E is compact by inner regularity. Then 1 Z (MχE)(x) = sup χE(y) dy. r>0 |B(x, r)| B(x,r)

Let r = d(x, E) + diam E. Then |E| (Mχ )(x) . E & rd Since E is compact, if we restrict x to a compact set then d(x, E) is bounded below and above. Thus, MχE &E,F 1 uniformly for x ∈ F with F compact. So suppose for the sake of contradiction that there is a compact set F with µ(F ) = ∞. Then Z ∞ = µ(F ) ≤ µ ({ x : MχE(x) &E,F 1 }) .E,F dµ = µ(E) < ∞ E which is a contradiction. R p With this claim proven, next, note that |fn| dµ → 0 by the dominated convergence theorem. Fix x ∈ K. Then since B(x, 1/n) ⊆ Un and since |K| = 0, 1 Z 1 Z (Mf)(x) = sup χUn\K (y) dy ≥ χUn\K (y) dy r>0 |B(x, r)| B(x,r) |B(x, 1/n)| B(x,1/n) 1 Z = χB(x,1/n)(y) dy |B(x, 1/n)| KC 1 Z = d χB(x,1/n)(y) dy R KC = 1.

So x ∈ { x :(Mfn)(x) > 1/2 }. Thus, K ⊆ { x :(Mfn)(x) > 1/2 }. So Z p 0 < ν(K) < µ(K) ≤ µ ({ x :(Mfn)(x) > 1/2 }) . |fn| dµ → 0 which is a contradiction. Thus, dµ = ω(x) dx, and by the previous theorem, ω ∈ Ap.

Lemma 4.9. We have ω ∈ Ap if and only if  Z p Z 1 1 p f(y) dy . f(y) ω(y) dy |B| B ω(B) B uniformly for all f ≥ 0 and all balls B.

Proof. First, suppose that ω ∈ Ap. Then

p ω(B) Z p0  p0 − p sup p w(y) . 1. B |B| B

33 Then, by Holder,

p p 1 Z  1 Z 1 1  p − p p f(y) dy = p f(y)ω(y) ω(y) dy |B| B |B| B p 1 Z Z p0  p0 p − p . p |f(y)| ω(y) dy ω(y) |B| B B Z p 1 p |B| . p |f(y)| ω(y) dy |B| B ω(B) 1 Z = f(y)pω(y) dy. ω(B) B To see the converse direction, suppose that  Z p Z 1 1 p f(y) dy . f(y) ω(y) dy |B| B ω(B) B

0 − p uniformly for all f ≥ 0 and all balls B. Let f = (ω + ε) p . Then

 0 p 1 Z p 1 Z 0 − p −p p (ω(y) + ε) dy . (ω(y) + ε) ω(y) dy |B| B ω(B) B 1 Z 0 ≤ (ω(y) + ε)−p +1 dy ω(B) B 1 Z p0 = (ω(y) + ε) p dy. ω(B) B So p−1 ω(B) Z p0  − p p (ω(y) + ε) dy . 1. |B| B Letting ε → 0 gives the result.

The final step before proving the weighted maximal inequality is a version of a covering lemma, due to Vitali.

d Lemma 4.10 (Vitali). Let F be a finite collection of open balls in R . Then there exists a subcollec- S S tion S of F such that distinct balls in S are disjoint, and B∈F ⊆ B∈S 3B. Proof. Run the following algorithm.

1. Set S := ∅.

2. Choose a ball in F of largest radius, and add it to S.

3. Discard all of the balls in F which intersect balls in S.

4. If all balls in F are removed, stop. Otherwise, return to step 2.

This algorithm terminates because at least one ball in F is discarded at every step. By construction, distinct balls in S are disjoint. Finally, if B is a ball in F which is not in S, it necessarily intersects some ball B0 ∈ S of larger radius. By the triangle inequality (draw a 0 S S picture), 3B contains B. Thus, B∈F ⊆ B∈S 3B.

At last, we prove the weighted maximal inequality.

34 Proof of Theorem 4.3. First, we claim that M : L∞(M(ω) dx) → L∞(ω dx). To see this, let f ∈ L∞(M(ω) dx). Then

kMfkL∞(ω dx) = inf sup |(Mf)(x)| ≤ inf sup |(Mf)(x)| ω(E)=0 x∈EC |E|=0 x∈EC R because |E| = 0 implies ω(E) = 0, asω(E) = E w(y) dy. We also have the trivial estimate that kMf(x)kL∞(dx) ≤ kfkL∞(dx), which follows from moving absolute value signs inside the integral definition of Mf. Then

kMfkL∞(ω dx) ≤ inf sup |f(x)|. |E|=0 x∈EC

Note that, unless ω ≡ 0, Mω(x) > 0 for all x. Thus,

kMfkL∞(ω dx) ≤ inf sup |f(x)| = inf sup |f(x)| = kfkL∞(Mω dx) |E|=0 x∈EC (Mω)(E)=0 x∈EC as desired. Since M is bounded from L∞(M(ω) dx) → L∞(ω dx), by the Marcinkiewicz interpo- lation theorem, it suffices to prove that M is bounded from L1(M(ω) dx) → L1,∞(ω dx). Indeed, set p1 = 1, q1 = 1 and p2 = ∞, q2 = ∞, and for θ ∈ (0, 1), define 1 θ 1 − θ 1 = + = θ ⇒ pθ = pθ p1 p2 θ and 1 θ 1 − θ 1 = + = θ ⇒ qθ = . qθ q1 q2 θ 1 For r = qθ = pθ = θ , the interpolation theorem gives boundedness of M as an operator 1 1 θ θ 1 from L (M(ω) dx) → L (ω dx). As 0 < θ < 1, 1 < θ < ∞, hence the theorem is proved. So it remains to prove M : L1(M(ω) dx) → L1,∞(ω dx). For f ∈ L1(M(ω) dx), we need ∗ to show that kMfkL1,∞(ω dx) . kfkL1(M(ω) dx), i.e., Z sup λ ω ({ x : |Mf(x)| > λ }) . |f(x)|M(ω)(x) dx. λ>0

Fix λ > 0 and consider { x : |Mf(x)| > λ }. Let K ⊆ { x : |Mf(x)| > λ } be compact. For x ∈ K, Mf(x) > λ, so by definition of the maximal function there exists rx > 0 so that 1 Z |f(y)| dy > λ. |B (x)| r Brx (x) S Then K ⊆ x∈K Brx (x), and compactness then gives a finite subcover of such balls. By Vitali’s covering lemma, there exists a subcollection S of these balls such that distinct balls S P in S are disjoint, and K ⊆ B∈S 3B. Then ω(K) ≤ B∈S ω(3B).

Consider a ball Bj ∈ S with radius rj. For y ∈ Bj, note that B4rj (y) ⊇ 3Bj. Thus, Z Z d ω(3Bj) = ω(x) dx ≤ ω(x) dx ≤ (Mω)(y) |B4rj (y)| = (Mω)(y) 4 |Bj|. 3Bj B4rj (y)

Therefore, Z Z d ω(3Bj) · |f(y)| dy ≤ 4 |Bj| (Mω)(y)|f(y)| dy Bj Bj so that Z Z 1 d ω(3Bj) · |f(y)| dy ≤ 4 (Mω)(y)|f(y)| dy. |Bj| Bj Bj

35 But 1 R |f(y)| dy > λ, so |Bj | Bj 1 Z ω(3Bj) .d (Mω)(y)|f(y)| dy. λ Bj

Since the balls Bj are all disjoint, we then have

X 1 Z 1 ω(K) ≤ ω(3Bj) . (Mω)(y)|f(y)| dy = kfkL1(M(ω) dx) . λ d λ Bj ∈S R

As this holds for all compact K ⊆ { x : |Mf(x)| > λ }, by the inner regularity of the mea- sure ω, we are done.

4.3 The Vector-Valued Maximal Inequality

Our first application of the weighted maximal inequality is a similar estimate for vector- valued maximal functions.

d 2 Definition 4.11. For f : R → ` (N) given by f(x) = {fn(x)}n≥1, define

kfk p := {fn(x)}`2 p . L n Lx

The vector-valued maximal function M¯ is given by ¯ M(f)(x) := k{Mfn(x)}k 2 . `n

d Note that Mf¯ : R → [0, ∞], so that Mf¯ is not itself vector-valued. Rather, the phrase “vector-valued” refers only to the input functions f. The main estimate on M¯ is the following vector-valued version of the classical Hardy- Littlewood maximal inequality.

Theorem 4.12 (Vector-valued maximal inequality).

1. The operator M¯ is of weak type (1, 1), that is, ¯ ∗ Mf L1,∞ . kfkL1

d 2 for f : R → ` (N); 2. M¯ is of strong type (p, p) for all 1 < p < ∞, that is, ¯ Mf Lp . kfkLp

d 2 for f : R → ` (N). Before proving Theorem 4.12, we give an outline of the argument. The case p = 2 is a straightforward computation, and the case p > 2 will follow from the weighted maximal inequality. Having established the case p = 2, by the Marcinkiewicz Interpolation theorem it will suffice to prove the weak type estimate for p = 1 to prove the entire theorem. To prove this case we employ a Calderon-Zygmund decomposition technique. d 2 We adopt the convention that for f : → ` ( ), |f(x)| := k{fn(x)}k 2 . R N `n 1 d Lemma 4.13 (Calderon-Zygmund decomposition). Given f ∈ L (R ) (possibly vector valued) and λ > 0, there is a decomposition f = g + b such that:

d 1. |g(x)| ≤ λ for almost all x ∈ R ;

36 2. b = fχQk , where {Qk} is a collection of cubes whose interiors are disjoint and such that 1 Z λ < |b(y)| dy ≤ 2dλ. |Qk| Qk

In this decomposition, g represents the good part of f, i.e., the part which is essentially uniformly bounded by λ, and b represents the bad part. The requirement of b in the decom- position says that the average of b on any given cube is on the order of λ.

d Proof. Decompose R into dyadic cubes of the form:

n n n n Qk = [2 k1, 2 (k1 + 1) × · · · × [2 kd, 2 (kd + 1)) where the diameter of each cube is chosen to be large enough so that 1 Z |f(y)| dy ≤ λ. Qk Qk R Because d |f(y)| dy < ∞, this is certainly possible. R Run the following algorithm. Fix a cube Q from the above decomposition. Divide Q into 2d equal sized cubes. Let Q0 denote one of these smaller cubes. If Q satisfies 1 Z 0 |f(y)| dy > λ, Q Q0 then stop and add Q0 to the collection of cubes which define the support of b. Note that, for such a cube, Z d Z 1 2 d λ < 0 |f(y)| dy ≤ |f(y)| dy ≤ 2 λ |Q | Q0 |Q| Q as required by the definition of b. If Q satisfies 1 Z 0 |f(y)| dy ≤ λ, Q Q0 then subdivide Q0 as we did with Q and continue until, if ever, we land in the previous case. S Having run this algorithm, let b = fχ Qk , where {Qk} is the collection of cubes from the above algorithm. Set g = f − b. By construction, the cubes Qk have disjoint interiors. S It only remains to check that |g| ≤ λ almost everywhere. If x∈ / Qk, then by our application there is a sequence of cubes, each containing x, with diameters shrinking to 0, such that the average value of |f| on these cubes is all ≤ λ. By the Lebesgue differentiation theorem, |g| ≤ λ almost everywhere.

Now, we prove the vector-valued maximal inequality.

2 d Proof Theorem 4.12. First consider the case p = 2. For a vector-valued f = {fn} ∈ L (R ), ¯ we need to show that Mf L2 . kfkL2 . By definition, Z Z ¯ 2 2 X 2 Mf 2 = k{Mfn(x)}k`2 dx = |Mfn(x)| dx. L d n d R R n≥1

Since all the quantities in question are nonnegative, we can interchange the integral and summation (by Tonelli’s theorem) to get Z ¯ 2 X 2 Mf 2 = |Mfn(x)| dx. L d n≥1 R

37 Previously, we proved that M is of strong type (2, 2). So Z Z Z ¯ 2 X 2 X 2 X 2 2 Mf 2 = |Mfn(x)| dx . |fn(x)| dx = |fn(x)| dx = kfkL2 L d d d n≥1 R n≥1 R R n≥1 as desired. ¯ Next, suppose that 2 < p < ∞. We wish to show that Mf Lp . kfkLp . Observe that 1 ¯ 2 2 Mf p = kMfn(x)k 2 = (kMfn(x)k 2 ) L `n p `n p/2 Lx Lx

¯ 2 ¯ 2 so that Mf Lp = (Mf(x)) Lp/2 . By the duality characterization of norms, Z ¯ 2 ¯ 2 ¯ 2 Mf Lp = (Mf(x)) Lp/2 = sup (Mf(x)) ω(x) dx kωk 0 =1 L(p/2) Z X 2 = sup |Mfn(x)| ω(x) dx kωk 0 =1 L(p/2) n≥1 Z X 2 = sup |Mfn(x)| ω(x) dx. kωk 0 =1 L(p/2) n≥1

For ω ≥ 0, the weighted maximal inequality that we proved earlier tells us that M : L2(M(ω) dx) → L2(ω dx). Thus, supposing without loss of generality that the supremum in the above expression is taken over ω ≥ 0, we have Z Z ¯ 2 X 2 X 2 Mf Lp = sup |Mfn(x)| ω(x) dx . sup |fn(x)| (Mω)(x) dx kωk 0 =1 kωk 0 =1 L(p/2) n≥1 L(p/2) n≥1 Z X 2 = sup |fn(x)| (Mω)(x) dx. kωk 0 =1 L(p/2) n≥1

Applying Holder’s inequality,

¯ 2 X 2 Mf p sup |fn| kMωk (p/2)0 . L . L kωk (p/2)0 =1 n≥1 L L(p/2)

By the standard Hardy-Little maximal inequality theorem, kMωkL(p/2)0 . kωkL(p/2)0 = 1. Finally, note that

X 2 2 2 |fn| = kfnk`2 = kfkLp n L(p/2) n≥1 L(p/2) ¯ so that Mf Lp . kfkLp as desired. As noted above, it remains to prove the weak type (1, 1) claim. Explicitly, given f = 1 d {fn} ∈ L (R ), we need to show that  ¯ sup λ x :(Mf)(x) > λ . kfkL1 . λ>0

Using the Calderon-Zygmund decomposition, write f = g + b with |g| ≤ λ almost every- S where and b = fχ Qk where Qk are cubes with disjoint interiors satisfying 1 Z |b(y)| dy ∼ λ. |Qk| Qk

38 Because M¯ is a sublinear operator,

 λ   λ   x :(Mf¯ )(x) > λ ⊆ x :(Mg¯ )(x) > ∪ x :(Mb¯ )(x) > . 2 2

Consider the first set. We have already shown that M¯ is of strong type (2, 2), so in particular it is of weak type (2, 2). Thus,   ¯ λ 1 2 1 2 1 1 1 x :(Mg)(x) > kgk 2 = |g| 1 ≤ kλ|g|k 1 = kgk 1 ≤ kfk 1 . 2 . λ L λ2 L λ2 L λ L λ L

 ¯ λ Therefore, it remains to prove the same type of estimate for the set x :(Mb)(x) > 2 . Let 2Qk denote the cube with same center as Qk and twice the side length. Then, be- cause the interior of the Qk’s are disjoint, we have: Z Z [ d X X 1 1 2Qk ≤ 2 |Qk| . |b(y)| dy . |b(y)| dy λ λ d k k Qk R 1 ≤ kfk 1 . λ L With this estimate, it remains to show that   [ ¯ λ 1 x∈ / 2Qk :(Mb)(x) > kfk 1 . 2 . λ L

avg P 1 R Define bn (x) = χQ (x) |bn(y)| dy. If x ∈ Qk, then k k |Qk| Qk Z Z Z avg 1 1 1 kb (x)k 2 = |bn(y)| dy ≤ kbn(y)k 2 dy = |b(y)| dy λ n `n `n . |Qk| Q 2 |Qk| Q |Qk| Q k `n k k

2 invoking Minkowski’s integral inequality to move the `n norm inside the integral. Thus,

avg avg [ 1 kb k 1 = kb (x)k 2 λ Qk λ · kfk 1 = kfk 1 . L n `n 1 . . L L Lx λ S Now, fix x∈ / 2Qk. Then

1 Z 1 X Z Mbn(x) = sup |bn(y)| dy = sup |bn(y)| dy. r>0 |B(x, r)| r>0 |B(x, r)| B(x,r) k B(x,r)∩Qk S Suppose that x∈ / 2Qk but that B(x, r) ∩ Qk 6= ∅. Let√l be the side√ length of Qk. Then necessarily r√ ≥ l/2. Note that the diameter of Qk is dl ≤ 2r d. This implies that B(x, r(1 + 2 d)) ⊇ Qk. Thus,

1 X Z Mbn(x) .d sup √ |bn(y)| dy r>0 |B(x, r(1 + 2 d)| k Qk ! 1 X Z Z  = sup √ √ χQk (z) dz |bn(y)| dy r>0 |B(x, r(1 + 2 d)| k B(x,r(1+2 d)) Qk 1 Z X 1 Z = sup √ √ χQk (z) · |bn(y)| dy dz r>0 |Qk| |B(x, r(1 + 2 d)| B(x,r(1+2 d)) k Qk avg = Mbn (x).

39 ¯ ¯ avg ¯ It follows that Mb . Mb . So we again apply the (2, 2) estimate of M to get   [ ¯ λ n [ ¯ avg o 1 avg 2 x∈ / 2Qk :(Mb)(x) > ≤ x∈ / 2Qk :(Mb )(x) λ kb k 2 2 & . λ2 L 2 1 avg 1 avg 2 = kb (x)k 2 = kb (x)k 2 2 n `n 2 2 n `n 1 λ Lx λ Lx 1 avg 1 avg kb (x)k 2 = kb k 1 . n `n 1 L λ Lx λ 1 kfk 1 . . λ L

40 Chapter 5

Sobolev Inequalities

This chapter contains a brief foray into the world of so-called Sobolev inequalities. A typical inequality of this type bounds a certain norm of a function by a norm of a derivative of the function. In vague terms, this kind of estimate bounds the average gradient (or energy) of a function below by its overall spread. This turns out to be a useful fact. The inequalities we prove in this chapter will appear again throughout the rest of the book in different contexts. For example. we will provide an alternate proof for the Gagliardo-Nirenberg inequality after we develop the machinery of Littlewood-Paley the- ory. Also, we will discuss the optimal constants of the Gagliardo-Nirenberg and Sobolev embedding inequalities in the future.

5.1 The Hardy-Littlewood-Sobolev Inequality

As a precursor to the Sobolev inequalities we will prove, we begin with an important con- volution estimate called the Hardy-Littlewood-Sobolev inequality. We actually prove two such inequalities in this section, the latter a generalization of the first. There are many ways to prove the following theorem. The approach we take is due to Hedberg [4] and is amenable to proving certain inverse inequalities. d Theorem 5.1 (Hardy-Littlewood-Sobolev I). Let f ∈ S(R ). Then

1 f ∗ α . kfkLp |x| Lr 1 1 α whenever 1 + r = p + d for 1 < p < r < ∞ and 0 < α < d. d p d Proof. We remark that requiring f ∈ S(R ) ensures that f ∈ L (R ) for all necessary p. Decompose the convolution as follows:  1  Z f(y) Z f(y) Z f(y) f ∗ α (x) = α dy = α dy + α dy |x| |x − y| |x−y|≤R |x − y| |x−y|>R |x − y| for some R > 0. We will estimate each integral and then optimize in R. Consider the first integral.

Z f(y) Z |f(y)| X Z |f(y)| dy ≤ dy ≤ dy α α α |x−y|≤R |x − y| |x−y|≤R |x − y| r<|x−y|<2r |x − y| r∈2Z;r≤R Z X −α . r |f(y)| dy |x−y|<2r r∈2Z;r≤R X rd−α Z . |f(y)| dy. |B(x, 2r)| |x−y|<2r r∈2Z;r≤R

41 1 R The quantity |B(x,2r)| |x−y|<2r |f(y)| dy is bounded by Mf(x), so

Z f(y) X dy ≤ Mf(x) rd−α. α |x−y|≤R |x − y| r∈2Z;r≤R

Because d − α > 0 and because the r’s are dyadic numbers, the sum is summable and is bounded (up to a constant) by the largest term. Thus,

Z f(y) dy ≤ Rd−αMf(x). α |x−y|≤R |x − y|

Now we consider the second integral. We have

Z f(y)   1  α dy = f ∗ α χ{|x|>R} (x). |x−y|≤R |x − y| |x|

By Young’s convolution inequality (or more directly, Holder’s inequality),

1   Z ! p0 1 1 1 f ∗ α χ{|x|>R} . kfkLp α χ{|x|>R} = kfkLp αp0 dx . |x| L∞ |x| Lp0 |x|>R |x|

d−αp0 0 0 If αp > d, then the integral quantity is finite, and in particular . R p . Since α 1 α 1 1 + > 1 ⇒ > 1 − = d p d p p0 this is indeed the case. Thus,   Z f(y) 1 d −α p0 dy ≤ f ∗ χ kfk p R . α α {|x|>R} . L |x−y|≤R |x − y| |x| L∞

We optimize our choice of R by requiring the two estimates to be comparable, so that p d −α  kfk  d d−α p0 Lp R Mf(x) ∼ R kfkp. Choose R so that R ∼ Mf(x) . With this choice, p !d−α  kfk  d d−α p 1 p d p 1− (d−α) f ∗ (x) Mf(x) = kfk p (Mf(x)) d . |x|α . Mf(x) L

Simplifying the exponents,

p 1 α p 1 − (d − α) = p − 1 + = . d p d r

So 1 1− p p r r f ∗ (x) kfk p (Mf(x)) . |x|α . L Taking the Lr norm and using the fact that M is of type (p, p) then gives

1 1− p p 1− p p 1− p p r r r r r r f ∗ kfk p (Mf(x)) = kfk p kMfk p kfk p kfk p = kfk p α . L r L L . L L L |x| Lr L as desired.

42 It is tempting to view the Hardy-Littlewood-Sobolev inequality as having more or less the same content as Young’s convolution inequality, as the statements of the estimates are similar. However, the fact that |x|−α is not in any Lp spaces makes Hardy-Littlewood- Sobolev much more subtle. −α d ,∞ d Note, however, that the function |x| lives in the weak space L α (R ). This suggests the following generalization of the previous theorem. Theorem 5.2 (Hardy-Littlewood-Sobolev II). For 1 < p < r < ∞ and 1 < q < ∞, we have

∗ kf ∗ gkLr . kfkLp kgkLq,∞ 1 1 1 whenever 1 + r = p + q . Proof. We begin with a few reductions. By rescaling, we can assume without loss of gener- ∗ ality that kfkLp = kgkLq,∞ = 1. Furthermore, note that for a fixed such g it suffices to prove that the operator f 7→ f ∗ g is of strong type (p, r). In fact, by the Marcinkiewicz interpo- lation theorem, it suffices to prove that this operator is of weak type (p, r). This is because the condition 1 < p < r < ∞ is an open condition, so that we can always find necessary (p1, r1) and (p2, r2) satisfying the relevant conditions to yield the strong type estimate in the interpolation theorem. ∗ −r So fix λ > 0. We need to show that kf ∗ gkLr,∞ . 1, i.e., |{ x : |f ∗ g|(x) > λ }| . λ . As in the previous proof, we decompose g = gχ|g|≤R + gχ|g|>R := g1 + g2. Then     λ λ |{ x : |f ∗ g|(x) > λ }| ≤ x : |f ∗ g1|(x) > + x : |f ∗ g2|(x) > . 2 2

We will show that the first quantity on the right is 0 for an appropriately chosen R. Intu- itively, since g1 is bounded by R, the convolution of f with g cannot be too large. Comput- ing,

1 Z ∞  0 p0 dα p kf ∗ g1kL∞ . kfkLp kg1kLp0 = kg1kLp0 . α |{ x : |g1(x)| > α }| 0 α 1 Z R  p0 p0 dα = α |{ x : |g1(x)| > α }| 0 α

0 where we have used the layer-cake representation for the Lp norm and (see Chapter3) and the fact that |g1| ≤ R. Continuing,

1 Z R  p0 q p0−q dα kf ∗ g1kL∞ . sup [α |{ x : |g1(x)| > α }|] α 0 α>0 α 1 1   0 Z R  p0 q p p0−q dα = sup [α |{ x : |g1(x)| > α }|] α . α>0 0 α

1 ∗ p0 The quantity on the left is precisely (kgkLq,∞ ) , which we are assuming is 1. The integral on the right is integrable if p0 − q > 0. By assumption, 1 1 1 1 1 1 − = − ⇒ < ⇒ p0 > q. p q r p0 q

1 p0−q  0  p0 R R p −q dα p0 So the integral is finite, and in particular, 0 α α . R . Putting this all together gives p0−q p0 kf ∗ g1kL∞ . R .

43  λ Recall that we are estimating x : |f ∗ g1|(x) > 2 . Thus, if we choose R small enough (dependent on λ, say R = cλ for some sufficiently small constant c, then kf ∗ g1kL∞ ≤ λ/2, and so   λ x : |f ∗ g1|(x) > = 0. 2  λ Thus, it remains to estimate x : |f ∗ g2|(x) > 2 . By Chebychev’s inequality,   λ −p p −p p x : |f ∗ g2|(x) > λ kf ∗ g2k p λ (kfk p kg2k 1 ) 2 . L . L L Z ∞ p −p dα = λ α |{ x : |g2(x)| > α }| . 0 α

We consider the integral separately. Z ∞ dα α |{ x : |g2(x)| > α }| 0 α Z R dα Z ∞ dα = α |{ x : |g2(x)| > α }| + α |{ x : |g2(x)| > α }| 0 α R α Z ∞ dα ≤ R · |{ x : |g(x)| > R }| + sup αq |{ x : |g(x)| > α }| α1−q . α>0 R α

q ∗ As before, supα>0 α |{ x : |g(x)| > α }| = kgkLq,∞ = 1. Furthermore, since q > 1, we have R ∞ 1−q dα 1−q R α α . R . So Z ∞ dα q 1−q 1−q α |{ x : |g2(x)| > α }| . R · |{ x : |g(x)| > R }| R + R . 0 α

q ∗ But since R · |{ x : |g(x)| > R }| ≤ kgkLq,∞ = 1, we then have Z ∞ dα 1−q α |{ x : |g2(x)| > α }| . R . 0 α

1− q Plugging this into our original estimate and using R p0 = cλ,

 p0  −p p(1−q) −p p(1−q) −p 0 p(1−q) −r |{ x : |f ∗ g2| > λ/2 }| . λ R . λ R . λ λ p −q . λ .

5.2 The Sobolev Embedding Theorem

As a consequence of the Hardy-Littlewood Sobolev inequalities, we prove a version of the Sobolev embedding theorem. Towards this goal, we begin with a computation. We wish to (formally) take the Fourier 1 p d transform of functions of the form |x|d−α . As this function is not in L (R ), we consider the 0 d Fourier transform in the sense of distributions. Recall that if T ∈ S (R ) is a tempered d distribution, its Fourier transform is defined by Tˆ(f) := T (fˆ) for f ∈ S(R ).

\1 1 Proposition 5.3. Let 0 < α < d. Then |x|d−α ∼ |x|α , in the sense of tempered distributions. Explicitly,    ˆ − d−α d − α 1 − α α 1 π 2 Γ = π 2 Γ . (5.1) 2 |x|d−α 2 |x|α

44 Proof. First, observe that

∞ ∞   d−α   Z 2 d−α dt Z u 2 du d−α d − α 1 −πt|x| 2 −u − 2 e t = e 2 = π Γ d−α . 0 t 0 π|x| u 2 |x|

d So for f ∈ S(R ),    ˆ     − d−α d − α 1 − d−α d − α 1 π 2 Γ (f) = π 2 Γ (fˆ) 2 |x|d−α 2 |x|d−α

Z d−α d − α 1 − 2 ˆ = π Γ d−α f(x) dx. Rd 2 |x| Our first observation, together with the definition of the Fourier transform, gives Z Z ∞ Z Z −πt|x|2 d−α dt −2πix·y = e t 2 e f(y) dy dx Rd 0 Rd t Rd Z Z ∞ Z  −πt|x|2 −2πix·y − d−α dt = e e dx t 2 f(y) dy. Rd 0 Rd t

2 The quantity in the parentheses is precisely the Fourier transform of e−π|x| . Recall that

−x·Axˆ d − 1 −π2ξ·A−1ξ e = π 2 (det A) 2 e .

Thus, Z 2 −πt|x|2 −2πix·y d − d −π2y· 1 y d − π|y| e e dx = π 2 (πt) 2 e πt = t 2 e t . Rd Continuing the computation,

Z Z ∞ 2 Z Z ∞ 2 − π|y| − α dt d − π|y| − d−α dt = e t t 2 f(y) dy = t 2 e t t 2 f(y) dy Rd 0 t Rd 0 t − α Z Z ∞ π|y|2  2 du = e−u f(y) dy Rd 0 u u Z Z ∞ − α −α α −u du = π 2 |y| u 2 e f(y) dy Rd 0 u Z − α α −α = π 2 Γ |y| f(y) dy Rd 2  − α α −α = π 2 Γ |y| (f). 2 This proves (5.1) in the sense of tempered distributions.

The are various forms of the Sobolev embedding theorem. The one we consider here involves a fractional derivative operator |∇|s, which generalizes the usual notion of deriva- tive. The operator |∇|s is defined via its action in the Fourier domain. d s Definition 5.4. Fix s > −d. For f ∈ S(R ), define |∇| via (|∇|sf)ˆ(ξ) := (2π|ξ|)sfˆ(ξ), where this equality is understood in the sense of tempered distributions. s d We require s > −d so that the quantity (2π|ξ|) makes sense as a distribution on R . As (Dαf)ˆ(ξ) = (2πiξ)αfˆ(ξ), it is clear why the operator |∇|s generalizes the usual notion of a derivative. It is natural to wonder how estimates on |∇|s compare to estimates on the usual derivative operator ∇. It turns out to be a fact that k|∇|fkLp . k∇fkLp . We will prove this in the future when we discuss Calderon-Zygmund convolution kernels.

45 d s Theorem 5.5 (Sobolev embedding). Fix 1 < p < ∞, s > 0. Let f ∈ S(R ) such that |∇| f ∈ p d L (R ).1 Then s kfkLq . k|∇| fkLp 1 1 s provided p = q + d .

d q0 p Proof. Recall that S(R ) is dense in L . Thus, by duality of L -norms and Plancharel’s theorem, D ˆ E kfkLq = sup hf, gi = sup f, gˆ g∈S( d):kgk 0 =1 g∈S:kgk 0 =1 R Lq Lq D E = sup (2π|ξ|)sfˆ(ξ), (2π|ξ|)−sgˆ(ξ) . g∈S:kgk 0 =1 Lq

s 0 d −s By our previous comments, (2π|ξ|) fˆ(ξ) ∈ S (R ). We would like for (2π|ξ|) gˆ(ξ) to be a Schwartz function but if g˜ does not vanish near the origin then the singularity at ξ = 0 may prevent this from being true. Thus, we consider a slightly smaller space of functions. We claim that the family

n d o F := g ∈ S(R ) :g ˆ vanishes on a neighborhood of ξ = 0 is dense in Lp for 1 < p < ∞. Obviously it suffices to show that F is dense in S. Fix d d g ∈ S(R ). Let ϕ be a smooth bump function on R such that ϕ(ξ) = 1 for |ξ| ≤ 1 and ϕ(ξ) = 0 for |ξ| ≥ 2. Define  |ξ| gˆ (ξ) :=g ˆ(ξ) 1 − ϕ . ε ε

Note that !ˆ  |ξ| |ξ|  |ξ|ˇ gˆ(ξ) 1 − ϕ =g ˆ(ξ) − gˆ(ξ)ϕ = g(x) − g ∗ ϕ (x) . ε ε ε

Hence   ˇ |ξ| − d d d p kg − gεkLp = g ∗ ϕ = g ∗ ε ϕˇ(ε|x|) . ε kgkL1 ε kϕˇkLp . ε Lp Lp Since g and ϕ are fixed and because p > 1, as ε → 0, the above quantity → 0. This estimate clearly fails for p = 1. Intuitively, F is not dense in L1 because gˆ(0) is the d 1 total integral over R of g. Functions with mean zero are certainly not dense in L . For 1 < p < ∞, we have a new dense subset F. So

D s ˆ −s E kfkLq = sup (2π|ξ|) f(ξ), (2π|ξ|) gˆ(ξ) . g∈F:kgk 0 =1 Lq

The functions (2π|ξ|)−sgˆ(ξ) are certainly Schwartz, since gˆ is a Schwartz function which vanishes in a neighborhood of 0. Applying Plancharel again gives

s −s s −s kfkLq = sup |∇| f, |∇| g . sup k|∇| fkLp |∇| g Lp0 . g∈F:kgk 0 =1 g∈F:kgk 0 =1 Lq Lq We then have |∇|−sg (x) = (2π|ξ|)−sgˆ(ξ)ˇ(x).

1In the sense of distributions, i.e., |∇|sf is given by integration against a function in Lp.

46 Now, recall that if T is a tempered distribution and g is a Schwartz function, then (T ∗ g)ˆ = Tˆ · gˆ. Since (2π|ξ|)−s is a tempered distribution and gˆ is Schwartz, it follows that

 ˆˇ |∇|−sg (x) = ((2π|ξ|)−s)ˇ∗ g (x).

−s ˇ s−d Our computation from (5.1) tells us that (2π|ξ|) ) ∼d,s |x| . Therefore,

−s s−d |∇| g ∼d,s |x| ∗ g.

By the first Hardy-Littlewood-Sobolev inequality,

1 0 g ∗ d−s . kgkLq |x| Lp0

1 1 d−s provided 1 + p0 = q0 + d . But this is true precisely when 1 1 s 1 1 s 1 1 s 1 + 1 − = 1 − + 1 − ⇐⇒ − = − − ⇐⇒ = + p q d p q d p q d which holds by assumption. Therefore,

s −s s s 0 kfkLq . sup k|∇| fkLp |∇| g Lp0 . sup k|∇| fkLp kgkLq = k|∇| fkLp g∈F:kgk 0 =1 g∈F:kgk 0 =1 Lq Lq as desired.

It will be useful to record the special case of Sobolev embedding when p = 2 and s = 1.

d Corollary 5.6 (Sobolev embedding, p = 2). Fix d ≥ 3. Suppose that f ∈ S(R ) with |∇|f ∈ 2 d L (R ). Then kfk 2d . k|∇|fkL2 . L d−2 1 1 1 Proof. By the Sobolev embedding theorem, kfkLq . k|∇|fkL2 provided 2 + q + d . This means 1 d − 2 = q 2d 2d and so q = d−2 as desired.

Using the yet-unproven fact that k|∇|fkL2 . k∇fkL2 , this special case of the Sobolev 1 d 2d d embedding theorem gives the inclusion H˙ (R ) ⊆ L d−2 (R ).

5.3 The Gagliardo-Nirenberg Inequality

In this section, we use the Sobolev embedding theorem to prove a version of the Gagliardo- Nirenberg inequality. As remarked at the beginning of this chapter, this theorem will be restated and revisited in different contexts after developing more sophisticated tools. The following estimate is stated for d ≥ 3, but in fact it holds for d = 1 and d = 2, where 0 < p < ∞. The case d = 1 is an easy calculus exercise, while the d = 2 case is more subtle.

d 4 Theorem 5.7 (Gagliardo-Nirenberg). Fix d ≥ 3 and f ∈ S(R ). Then for all 0 < p < d−2 ,

pd pd p+2 p+2− 2 2 kfkp+2 . kfk2 k|∇|fk2 .

47 Proof. For convenience, we recall the log-convexity of Lp norms: for any 0 < θ < 1 and 1 1−θ θ 1−θ θ r = p + q , we have kfkr ≤ kfkp kfkq . 4 d Let 0 < p < d−2 and f ∈ S(R ). We need to show that

pd pd 1− 2(p+2) 2(p+2) kfkp+2 ≤ kfk2 k|∇|fk2 .

pd 4 Let θ = 2(p+2) . Because p > 0, θ > 0. Because 0 < p < d−2 , p(d − 2) < 4, hence pd < 4 + 2p. Thus, pd pd θ = = < 1 2(p + 2) 4 + 2p and so 0 < θ < 1. Observe that pd 4 + 2p − pd 1 − θ = 1 − = . 2(p + 2) 2(p + 2)

Thus, for any A, B > 0,

1 − θ θ 4 + 2p − pd pd 1 4 + 2p − pd pd  + = + = + . A B 2A(p + 2) 2B(p + 2) p + 2 2A 2B

p 4+2p−pd pd With log-convexity of L norms in mind, we want to choose A and B so that 2A + 2B = 1. Towards our end goal, set A = 2. Then 4 + 2p − pd pd p pd pd 1 d d + = 1 ⇒ 1 + − + = 1 ⇒ − + = 0. 4 2B 2 4 2B 2 4 2B Solving for B yields d d 1 d − 2 2B 4 2d = − = ⇒ = ⇒ B = . 2B 4 2 4 d d − 2 d − 2 These computations demonstrate that 1 1 − θ θ = + . p + 2 2 2d d−2

Thus, 1−θ θ kfkp+2 ≤ kfk2 kfk 2d . d−2

Next, by the p = 2 Sobolev embedding theorem, kfk 2d k|∇|fk2. So d−2 .

1− pd pd 1−θ θ 2(p+2) 2(p+2) kfkp+2 . kfk2 k|∇|fk2 = kfk2 k|∇|fk2 as desired.

48 Chapter 6

Fourier Multipliers

In Chapter5, we introduced the operator |∇|s. This was our first experience with a nontriv- ial Fourier multiplier operator, i.e., an operator defined by inverting a multiplication operator in the Fourier domain. The operator |∇|s is a Fourier multiplier with symbol (2π|ξ|)s. We saw other simple examples of when first defining the Fourier transform. For example, the regular partial derivative operator ∂xk is a Fourier multiplier with symbol 2πiξk; the translation operator f(·) 7→ f(· − y) is a Fourier multiplier with symbol e2πiy·ξ, etc. More generally, the Fourier multiplier with symbol m(ξ) is the operator

 ˇ Z f(x) 7→ m(ξ)fˆ(ξ) (x) = e2πix·ξm(ξ)fˆ(ξ) dξ.

In this chapter, we study the Lp-boundedness of various Fourier multipliers.

6.1 Calderon-Zygmund Convolution Kernels

s When the operator |∇| was introduced in Chapter5, we remarked that the estimate k|∇|fkLp . k∇fkLp holds for Schwartz functions f. The usefulness of this estimate is apparent from the fact that |∇|s is a nonlocal operator, in that it was defined by multiplication in the Fourier domain and hence by convolution in the physical domain. On the other hand, ∇ is a purely local operator. Recall that |∇|df(ξ) = 2π|ξ|fˆ(ξ). Write

2 2 |ξ| X ξj X ξj X 2π|ξ| = 2π = 2π = −i · 2πiξ = m (ξ)2πiξ |ξ| |ξ| |ξ| j j j

ξj where mj(ξ) := −i |ξ| are called Riesz multipliers. To show the desired estimate k|∇|fkLp . p k∇fkLp , it is enough to prove that the Riesz multipliers are bounded on L , that is, the op- ˆ erators Rj defined via Rdjf(ξ) := mj(ξ)f(ξ), or equivalently Rjf =m ˇj ∗ f, are bounded on Lp. Indeed, X |∇|df(ξ) = 2π|ξ|fˆ(ξ) = mj(ξ)2πiξjfˆ(ξ) P so that |∇|f = Rj(∂jf). By the triangle inequality and the equivalence of norms on finite dimensional spaces, X X k|∇|fkLp ≤ kRj(∂jf)kLp . k∂jfkLp . k∇fkLp .

To prove that this is the case, we will prove Lp-boundedness for a more general class of objects. In particular, by passing to distributions we can view any Fourier multiplier as a convolution kernel, i.e., an operator defined by f 7→ K ∗ f for some distribution K. This follows immediately from the fact that the (inverse) Fourier transform turns multiplication into convolution. In this section, we consider a special kind of convolution kernel.

49 d Definition 6.1. A function K : R \{0} → C is a Calderon-Zygmund convolution kernel if it satisfies: 1 1. |K(x)| . |x|d uniformly in x 6= 0; 2. R K(x) dx = 0 for all 0 < R < R < ∞; R1≤|x|≤R2 1 2 R d 3. |x|≥2|y| |K(x) − K(x + y)| dx . 1 uniformly in y ∈ R . The first condition says that K is allowed to have a singularity at 0, as long as the singularity is not too bad. The second condition says that K satisfies a kind of cancellation property; note that if K is an odd function, this condition is satisfied. The third condition is a smoothness requirement. 1 As a first remark, we claim that if |∇K|(x) . |x|1+d , then K satisfies condition 3. Indeed, by the fundamental theorem of calculus, Z 1 K(x) − K(x + y) = − y · ∇K(x + θy) dθ. 0 So Z Z Z 1 1 |K(x) − K(x + y)| dx ≤ |y| · 1+d dθ dx. |x|≥2|y| |x|≥2|y| 0 |x + θy|

Since |x| ≥ 2|y|, |x + θy| & |x|. Thus, Z Z Z 1 1 Z 1 |K(x) − K(x + y)| dx . |y| · 1+d dθ dx = |y| · 1+d dx |x|≥2|y| |x|≥2|y| 0 |x| |x|≥2|y| |x| 1 |y| · = 1. . |y| Therefore, K satisfies the third as claimed. Relevant to our motivating discussion, we claim that the kernel of the Riesz multiplier defined above is a Calderon-Zygmund convolution kernels. To see this, first note that

 ξ ˇ  1 2πiξ ˇ K :=m ˇ = −i j = − · j . j j |ξ| 2π |ξ| In the Sobolev embedding theorem section, we computed the inverse Fourier transform of 1 functions of the form |ξ|s . Recalling this computation (see Equation (5.1)) we have 1  1  x K (x) ∼ − ∂ ∼ j j d 2π j |x|d−1 d |x|d+1

x since ∂ |x| = j . Clearly, |K (x)| 1 and R K (x) dx = 0 (since x 7→ x is odd). j |x| j .d |x|d R1≤|x|≤R2 j j 1 Finally, we wish to show |∇Kj| . |x|1+d to verify the third condition. Note that

−(d+1) −(d+1) −(d+1)−1 −1 ∂kKj(x) ∼ ∂kxj|x| = δjk|x| − (d + 1)|x| xk|x| δ x = jk − (d − 1) k , |x|d+1 |x|d+3

1 so certainly |∇Kj| . |x|1+d . Thus, Kj is a Calderon-Zygmund convolution kernel. The first result on Calderon-Zygmund convolution kernels is an L2 → L2 type estimate.

Theorem 6.2. Suppose K is a Calderon-Zygmund convolution kernel. For ε > 0, let Kε := K · χε≤|x|≤1/ε. Then kKε ∗ fkL2 . kfkL2 uniformly in ε > 0. Moreover, K ∗ f = limε→0 Kε ∗ f d 2 d and the operator K ∗ f extends as a bounded operator from S(R ) to L (R ).

50 Proof. First, we will show that Kε is a Calderon-Zygmund convolution kernel. Because Kε is just a restriction of K, conditions 1 and 2 are immediate. We only need to verify that R d |x|≥2|y| Kε(x) − Kε(x + y)| dy ≤ 1 uniformly in y ∈ R and ε > 0. To do this, consider the following three subregions of |x| ≥ 2|y|:

A = {|x| ≥ 2|y|, ε ≤ |x|, |x + y| ≤ 1/ε} B = {|x| ≥ 2|y|, ε ≤ |x| ≤ 1/ε, |x + y| < ε or |x + y| > 1/ε} C = {|x| ≥ 2|y|, ε ≤ |x + y| ≤ 1/ε, |x| < ε or |x| > 1/ε}.

Then Z Z |Kε(x) − Kε(x + y)| dy ≤ |K(x) − K(x + y)| dy |x|≥2|y| A Z Z + |K(x)| dy + |K(x + y)| dy. B C

The integral over A is . 1 because K is a Calderon-Zygmund convolution kernel. Consider the integral over B. Assume first that |x + y| < ε. Then |x| ≤ |x + y| + |y| < |x| ε + 2 , hence |x| < 2ε. The contribution of this part of the integral is bounded by Z Z 1 |K(x)| dx ≤ d dx . 1. ε≤|x|≤2ε ε≤|x|≤2ε |x|

2 If |x + y| > 1/ε, then |x| ≥ |x + y| − |y| > 1/ε − |x|/2, hence |x| ≥ 3ε . The contribution of this part of the integral is bounded by Z Z 1 |K(x)| dx ≤ dx . 1. 2 1 2 1 |x|d 3ε ≤|x|≤ ε 3ε ≤|x|≤ ε R Thus, B |K(x)| dy . 1. Bounding the integral over C is essentially the same argument, paired with a change R of variables. Thus, C |K(x + y)| dy . 1. This verifies the third condition, hence Kε is a Calderon-Zygmund convolution kernel. Next, we want to show kKε ∗ fkL2 . kfkL2 uniformly in ε > 0. By Plancharel, we have

ˆ ˆ ˆ ˆ ˆ kKε ∗ fkL2 = K\ε ∗ f = Kε · f ≤ Kε f Kε kfkL2 . L2 L2 L∞ L2 L∞

ˆ Thus, it suffices to prove that Kε . 1 uniformly in ε. L∞ We will decompose the Fourier transform of Kε into an integral without any oscillation and an integral with oscillation. Explicitly, Z Z Z −2πix·ξ −2πix·ξ −2πix·ξ Kˆε(ξ) = e Kε(x) dx = e Kε(x) dx + e Kε(x) dx. |x|≤1/|ξ| |x|≥1/|ξ|

In the first integral, |x · ξ| ≤ |x||ξ| ≤ 1, thus the term e−2πix·ξ is restricted in its oscilla- tion. Using the cancellation property of Calderon-Zygmund convolution kernels, we have R R |x|≤1/|ξ| Kε(x) dx = ε≤|x|≤1/|ξ| K(x) dx = 0. Thus,

Z Z   −2πix·ξ −2πix·ξ e Kε(x) dx = e − 1 Kε(x) dx |x|≤1/|ξ| |x|≤1/|ξ| Z −2πix·ξ ≤ |e − 1||Kε(x)| dx. |x|≤1/|ξ|

51 −d Because Kε is a Calderon-Zygmund convolution kernel, |Kε(x)| . |x| . The quantity |e−2πx·ξ − 1| is the distance from the point e−2πix·ξ of the unit circle to 1. This is bounded by the arc-length of that segment on the unit circle corresponds to the phase, which is |x · ξ| ≤ |x||ξ|. Therefore,

Z Z 1 −2πix·ξ −d e Kε(x) dx ≤ |x||ξ||x| dx . |ξ| · = 1. |x|≤1/|ξ| ε≤|x|≤1/|ξ| |ξ|

Next, consider the second integral. Here we invoke the smoothness condition of Calderon- Zygmund convolution kernels. Write

2πi ξ·ξ 1 − eiπ 1 − e 2|ξ|2 1 = = . 2 2 Then

 2πi ξ·ξ  Z Z 2|ξ|2 −2πix·ξ −2πix·ξ 1 − e e Kε(x) dx = e   Kε(x) dx |x|>1/|ξ| |x|>1/|ξ| 2

Z   ξ  1 −2πix·ξ −2πiξ· x− 2 = e − e 2|ξ| Kε(x) dx 2 |x|>1/|ξ| Z 1 −2πix·ξ = e Kε(x) dx 2 |x|>1/|ξ| Z   1 −2πix·ξ ξ − e Kε x + 2 dx 2 x+ ξ >1/|ξ| 2|ξ| 2|ξ|2

We want to combine these integrals, but have to account for the shifted circles which define the domains; it helps to draw a picture here. We have: Z    1 −2πix·ξ ξ = e Kε(x) − Kε x + 2 dx 2 |x|>1/|ξ| 2|ξ| Z   Z   1 −2πix·ξ ξ 1 −2πix·ξ ξ + e Kε x + 2 dx − e Kε x + 2 dx 2 A 2|ξ| 2 B 2|ξ| where   1 ξ A = x : |x| ≥ ≥ x + , |ξ| 2|ξ|2   1 ξ B = x : |x| ≤ ≤ x + . |ξ| 2|ξ|2

Estimate the first of these three integrals by employing the smoothness condition of the Calderon-Zygmund convolution kernels:

1 Z   ξ  e−2πix·ξ K (x) − K x + dx ε ε 2 2 |x|>1/|ξ| 2|ξ| Z   ξ . Kε(x) − Kε x + 2 dx . 1. |x|>2 ξ 2|ξ| 2|ξ|2

Next, consider the integral over A. By the reverse triangle inequality,

1 ξ ξ 1 1 |x| ≥ ≥ x + ⇒ x + ≥ |x| − ≥ . |ξ| 2|ξ|2 2|ξ|2 2|ξ| |ξ|

52 Thus, Z   Z   1 −2πix·ξ ξ ξ e Kε x + 2 dx . Kε x + 2 dx 2 A 2|ξ| 1 ≤ x+ ξ ≤ 1 2|ξ| 2|ξ| 2|ξ|2 |ξ| Z 1 . dx 1 1 |x|d 2|ξ| ≤|x|≤ |ξ| . 1 where we have made the obvious change of variables and invoked the first property of Calderon-Zygmund convolution kernels. Use a similar trick for bounding the integral over B, noting that

1 ξ ξ 1 3 |x| ≤ ≤ x + ⇒ x + ≤ |x| + ≤ . |ξ| 2|ξ|2 2|ξ|2 2|ξ| 2|ξ|

R e−2πix·ξK (x) dx 1 |Kˆ (ξ)| 1 Combining all of this gives |x|>1/|ξ| ε . , and hence ε . uni-

ˆ formly in ε and ξ, thus, Kε(ξ) . 1 uniformly in ε. Therefore, kKε ∗ fkL2 . kfkL2 . L∞ 2 Next, we wish to show that Kε ∗ f converges in L as ε → 0. It suffices to show that d {Kε ∗ f} is Cauchy. So fix 0 < ε2 < ε1 and fix f ∈ S(R ). Then Z Z

(Kε1 ∗ f − Kε2 ∗ f)(x) = K(y)f(x − y) dy − K(y)f(x − y) dy ε1≤|y|≤1/ε1 ε2≤|y|≤1/ε2 Z Z = − K(y)f(x − y) dy − K(y)f(x − y) dy. ε2≤|y|≤ε1 1/ε1≤|y|≤1/ε2

Consider the second integral.

Z

K(y)f(x − y) dy K · χ kfk 1 . 1/ε1≤|y|≤1/ε2 L2 L 1/ε ≤|y|≤1/ε 1 2 L2 ! 1 Z 1 2 .f 2d dx 1/ε1≤|y|≤1/ε2 |x| d 2 . ε1 which → 0 as ε1 → 0. For the first integral, the cancellation property of K gives Z Z K(y)f(x − y) dy = K(y)(f(x − y) − f(x)) dy ε2≤|y|≤ε1 ε2≤|y|≤ε1 Z Z 1 = K(y) y ∇f(x − θy) dθ dy. ε2≤|y|≤ε1 0

Thus,

Z Z 1 Z 1 K(y)f(x − y) dy |y| k∇f(x − θy)k 2 dy dθ . |y|d Lx ε2≤|y|≤ε1 2 0 ε2≤|y|≤ε1 Lx Z 1 .f d−1 dy ≤ ε1 ε2≤|y|≤ε1 |y| → 0 as ε1 → 0.

53 This shows that for a Schwartz function f, {Kε ∗ f} is Cauchy. By density, {Kε ∗ f} is 2 d Cauchy for any f ∈ L . To see this argument explicitly, fix δ > 0 and let g ∈ S(R ) be such that kf − gkL2 < δ. Then by the usual triangle inequality argument,

kKε1 ∗ f − Kε2 ∗ fkL2 ≤ kKε1 ∗ g − Kε2 ∗ gkL2 + kKε1 ∗ (f − g)kL2 + kKε2 ∗ (f − g)kL2 .

The first quantity → 0 as ε1, ε2 → 0 since g is Schwartz. We have already seen that Calderon-Zygmund convolution kernels are of type (2, 2), thus,

kKε1 ∗ (f − g)kL2 . kf − gkL2 ≤ δ and likewise for Kε2 . Therefore, taking ε1, ε2 → 0 and then taking δ → 0 gives the desired result. 2 2 Therefore, we can define K ∗ f := L − limε→0 Kε ∗ f. This operator extends to L by uniform boundedness of the Kε operators.

This theorem says that Calderon-Zygmund convolution kernels are operators of type (2, 2). Using this fact together with interpolation gives a wider range of estimates.

Theorem 6.3. Suppose that K is a Calderon-Zygmund convolution kernel. For ε > 0, let Kε := K · χε≤|x|≤1/ε. Then 1. K is of weak-type (1, 1), i.e.,

n d o 1 x ∈ : |K ∗ f| > λ kfk 1 d R ε . λ L (R ) uniformly in λ > 0 and ε > 0.

2. Kε is of (strong) type (p, p) for 1 < p < ∞, i.e.,

kK ∗ fk p d kfk p d ε L (R ) . L (R ) uniformly in ε > 0.

p d Moreover, for 1 < p < ∞, K ∗ f := L − limε→0 Kε ∗ f extends as a bounded map on S(R ) to a p d bounded map on L (R ).

Proof. Assume first that we have proven 1, that Kε is of weak-type (1, 1). Since Kε is of type (2, 2) by the previous theorem, it follows from the Marcinkiewicz interpolation theorem that Kε is of type (p, p) for 1 < p < 2. We then use a duality argument to achieve the estimate for 2 < p < ∞. Explicitly, for p in this range, ZZ kKε ∗ fkLp = sup hKε ∗ f, gi = sup Kε(x − y)f(y) g(x) dy dx kgk 0 =1 kgk 0 =1 Lp Lp ZZ = sup (Kε)R(y − x)g(x) dx f(y) dy kgk 0 =1 Lp D E = sup f, (Kε)R ∗ g kgk 0 =1 Lp where (Kε)R(x) = Kε(−x) is the reflection of Kε. By Holder’s inequality,

kKε ∗ fk p ≤ sup kfk p (Kε)R ∗ g . L L p0 kgk 0 =1 L Lp

0 Note that (Kε)R is also a Calderon-Zygmund convolution kernel. Since 1 < p < 2, we then

0 have (Kε)R ∗ g 0 . kgkLp = 1, so that kKε ∗ fkLp . kfkLp as desired. Lp

54 For the “moreover” statement, we wish to show that kKε1 ∗ f − Kε2 ∗ fkLp → 0 as ε1, ε2 → 0 as in the previous theorem. Though the same proof technique as before will work, we present an alternative argument here. For 1 < p < 2, choose 1 < q < p and compute, using the log-concavity of Lp-norms:

θ 1−θ kKε1 ∗ f − Kε2 ∗ fkLp . kKε1 ∗ f − Kε2 ∗ fkLq kKε1 ∗ f − Kε2 ∗ fkL2

1 θ 1−θ θ θ 1−θ where p = q + 2 . We have kKε1 ∗ f − Kε2 ∗ fkLq . kfkLq and kKε1 ∗ f − Kε2 ∗ fkL2 → 0 from previous results. For 2 < p < ∞, choose p < q < ∞ and argue similarly. 1 d Thus, to prove the entire theorem it only remains to show 1. Let f ∈ L (R ). We perform a modified version of the Calderon-Zygmund decomposition introduced earlier. For a fixed λ > 0, write f = g + b where:

- |g| . λ a.e.; S - b is supported on a union of cubes {Qk} whose interiors are disjoint and | Qk| . 1 R λ |f(y)| dy; 1 R R - b |Q = f |Q − f(y) dy, so that consequently b(y) dy = 0; k k |Qk| Qk Qk

- 1 R |b(y)| dy λ. |Qk| Qk . Then     λ λ |{ x : |Kε ∗ f| > λ }| ≤ x : |Kε ∗ g| > + x : |Kε ∗ b| > . 2 2

We estimate the first quantity on the right using the (2, 2) estimate for Kε. Using Cheby- chev’s inequality and the uniform bound on g, we have:

  λ −2 2 x : |Kε ∗ g| > λ kKε ∗ gk 2 2 . L −2 2 −2 2 −1 1 λ kgk 2 = λ |g| 1 λ kgk 1 ≤ kfk 1 . . L L . L λ L 1 kfk To estimate the second quantity, we first remove a set of comparable size to√λ L1 . Let ∗ Qk be the cube with same center, call it xk, as Qk but with side lengths 2 d`(Qk). (To draw a picture, draw Qk and circumscribe a circle; draw a circle with twice the radius, then circumscribe a cube around this circle.) Then √ [ ∗ X d X 1 Q ≤ |Q | ≤ (2 d) |Q | kfk 1 . k k k .d λ L

 S ∗ λ Thus, it suffices to estimate the quantity x∈ / Qk : |Kε ∗ b| > 2 . By Chebychev, we have   [ ∗ λ 1 x∈ / Qk : |Kε ∗ b| > . kKε ∗ bkL1((S Q∗)C ) . 2 λ k Writing out the definition of the convolution above gives

Z X Z (Kε ∗ b)(x) = Kε(x − y)b(y) dy = Kε(x − y)b(y) dy Qk X Z = (Kε(x − y) − Kε(x − xk)) b(y) dy Qk

55 where the last equality follows from the fact that b has mean 0. So Z kKε ∗ bkL1((S Q∗)C ) = |Kε ∗ b|(x) dx k S ∗ C ( Qk) ) X Z Z ≤ |Kε(x − y) − Kε(x − xk)| |b(y)| dy dx ∗ C (Qk) Qk X Z Z = |b(y)| |Kε(x − y) − Kε(x − xk)| dx dy. ∗ C Qk (Qk)

Making the change of variables x − xk 7→ x yields X Z Z kKε ∗ bkL1((S Q∗)C ) = |b(y)| |Kε(x + xk − y) − Kε(x)| dx dy k ∗ C Qk (Qk) −xk ∗ C ∗ C ∗ C where (Qk) −x√k denotes the translation√ of (Qk) by the point xk. Note that if x ∈ (Qk) − xk, then |x| > 2 d`(Qk). Since d`(Qk) is the diameter of Qk and y ∈ Qk, this implies that |x| ≥ 2|y−xk|. By the the smoothness property of Calderon-Zygmund convolution kernels, Z |Kε(x + xk − y) − Kε(x)| dx . 1. ∗ C (Qk) −xk Thus, X Z Z kKε ∗ bk 1 S ∗ C |b(y)| dy |f(y)| dy L (( Qk) ) . . Qk and we are done.

In the proof of this theorem, having already shown (2, 2) boundedness of K, the only prop- erty of Calderon-Zygmund convolution kernels that we explicitly used was the smoothness condition. Interpolation was enough to get boundedness for other p values. This gives us a more general fact. Proposition 6.4. If K is any convolution kernel (not necessarily Calderon-Zygmund) which is of R type (2, 2) and satisfies |x|≥2|y| |K(x + y) − K(x)| dx . 1 uniformly in y, then K extends to a p d bounded map on L (R ) for 1 < p < ∞.

6.2 The Hilbert Transform

As a first application of Lp-boundedness of Calderon-Zygmund convolution kernels, we discuss the Hilbert transform. The Hilbert transform is the marquee example of a singular integral operator, and is important in fields such as signal processing. Definition 6.5. The Hilbert transform is the convolution operator corresponding to the 1 kernel K : R \{0} → R given by K(x) = πx . Explicitly, the Hilbert transform of a function f : R → R is: Z 1 (Hf)(x) := lim f(x − y) dy. ε→0 1 πy ε≤|y|≤ ε A priori, it is not clear that the Hilbert transform is well-defined for even nice functions f. However, we claim that K is a Calderon-Zygmund convolution kernel. The estimate 1 |K(x)| . |x| is immediate from the definition, and the fact that K satisfies the cancellation 1 property over annuli in R follows from the fact that x is an odd function. Checking the smoothness condition, we have Z Z Z 1 1 1 1 |y| |K(x + y) − K(x)| dx = − dx = dx. |x|≥2|y| π |x|≥2|y| x + y x π |x|≥2|y| |x + y||x|

56 Since |x| ≥ 2|y|, |x + y| ≥ |x| − |y| & |x|, so that Z Z |y| |K(x + y) − K(x)| dx . 2 dx . 1. |x|≥2|y| |x|≥|y| |x| Thus, K is a Calderon-Zygmund convolution kernel. By the results from the previous section, this proves the following. p p Proposition 6.6. The Hilbert transform is a bounded operator from L (R) → L (R) for 1 < p < ∞. It is worth nothing that the strong-type estimates do indeed fail for p = 1 and p = ∞. 1 ∞ For example, let f = χ[a,b]. Then f ∈ L (R) and f ∈ L (R), and 1 Z 1 (Hf)(x) = lim χ[a,b](x − y) dy. π ε→0 1 y ε≤|y|≤ ε Consider the case where x − a, x + b > 0. Then for ε sufficiently small, Z x−b 1 1 1 x − a (Hf)(x) = dy = log π x−a y π x − b

1 ∞ for almost all x, which is neither in L (R) nor L (R). The cases x − a, x − b < 0 and x − a < 0 < x − b are handled similarly. It can be shown that the Hilbert transform is a Fourier multiplier with symbol −i sgn(ξ). That is, for f ∈ S(R), Hfd(ξ) = −i sgn(ξ)fˆ(ξ). ξj ˆ As such, the Riesz multipliers Rdjf(ξ) = −i |ξ| f(ξ) defined earlier in this chapter can be viewed as a higher dimensional analogue of the Hilbert transform.

6.3 The Mikhlin Multiplier Theorem

In this section we prove a multiplier theorem which gives Lp-boundedness of a large class of multipliers. This particular theorem, and its proof technique, offers a segue into the Littlewood-Paley theory developed in the next chapter. The main result is the following. d Theorem 6.7 (Mikhlin multiplier theorem). Let m : R \{0} → C satisfy

α 1 D m(ξ) (6.1) ξ . |ξ||α|

 d+1  ˆˇ p d uniformly in ξ for all 0 ≤ |α| ≤ 2 . Then f 7→ (m · f) =m ˇ ∗ f is bounded on L (R ) for 1 < p < ∞. To prove this theorem, it will useful to localize at different frequencies in the Fourier domain. With this in mind, we introduce a particular family of bump functions. Let ϕ ∈ ∞ d C (R ) be a smooth bump function satisfying  1 |x| ≤ 1 ϕ(x) = 11 . 0 |x| ≥ 10 Let ψ(x) = ϕ(x) − ϕ(2x). Then  1 0 |x| ≤ 2  11 ψ(x) = 1 20 ≤ |x| ≤ 1 .  11 0 |x| ≥ 10

57 Z x  For N ∈ 2 , define ψN (x) = ψ N . These annular bump functions which are ∼ 1 on a scale of N will be used in the proof of the Theorem 6.7, and are central to Littlewood-Paley theory.

Proof of Theorem 6.7. For p = 2, the desired conclusion follows from Plancharel. Indeed,

kmˇ ∗ fk 2 = m · fˆ ≤ kmk ∞ fˆ . Lx 2 Lξ 2 Lξ Lξ

By choosing α = 0 in (6.1) it follows that m is uniformly bounded, so kmk ∞ < ∞. Thus, Lξ ˆ kmˇ ∗ fk 2 f . Lx . 2 Lξ With this (2, 2) estimate, by Proposition 6.4, to prove boundedness on Lp for 1 < p < ∞ it suffices to prove that K =m ˇ satisfies the smoothness condition of Calderon-Zygmund convolution kernels. Before doing this with the hypotheses of the theorem, we prove a slightly easier case. In particular, suppose that (6.1) holds for all |α| ≤ d + 2 rather than all  d+1  1 |α| ≤ 2 . In this case, we will show that |∇K(x)| . |x|d+1 . By previous remarks, this implies that K satisfies the desired smoothness condition. Towards this goal, note that X ψN (x) = 1 N∈2Z

d for almost all x ∈ R . Write X X m(ξ) = m(ξ)ψN (ξ) =: mN (ξ). N∈2Z N∈2Z P Then |∇K(x)| ≤ N∈2Z |∇mˇN (x)|. By the properties of the Fourier transform,

α α kx ∇mˇN (x)kL∞ . Dξ ξmN (ξ) 1 . x Lξ

By the product rule,

α α X α1 α2 Dξ (ξmN (ξ)) = Dξ (ξm(ξ)ψN (ξ)) = Cα1,α2 Dξ (ξm(ξ)) Dξ (ψN (ξ)). α1+α2=α

α1 −|α1| α1 1−|α1| By assumption, |Dξ m(ξ)| . |ξ| . Therefore, |Dξ (ξm(ξ))| . |ξ| , again by the product rule. Next, by the chain rule,

  ξ   ξ  α2 α2 −|α2| α2 |D ψN (ξ)| = D ψ = N (D ψ) . ξ ξ N ξ N

Thus,  ξ  α X 1−|α1| −|α2| α2 |D ξmN (ξ)| |ξ| N (D ψ) . ξ . ξ N α1+α2=α

Since ψN and all of its derivatives are supported on an annulus of radius comparable to N, Z α α X 1−|α1| −|α2| kx ∇mˇN (x)k ∞ D ξmN (ξ) 1 |ξ| N dξ Lx . ξ L . ξ |ξ|∼N α1+α2=α X 1+d−|α | −|α | . N 1 N 2 α1+α2=α 1+d−|α| . N .

58 By choosing α = 0 and |α| = d + 2, it then follows that  1  |∇mˇ (x)| min N d+1, N . N|x|d+2 so that X X  1  |∇K(x)| ≤ |∇mˇ (x)| min N d+1, . N . N|x|d+2 N∈2Z N∈2Z Splitting this sum over small and large frequencies, chosen appropriately, X X 1 |∇K(x)| N d+1 + . . N|x|d+2 1 1 N≤ |x| N> |x|

Since both of these sums are over dyadic numbers, the first sum is bounded (up to a con- stant) by its largest term and the second sum is bounded (up to a constant) by its smallest term. Thus, 1 1 1 |∇K(x)| + . . |x|d+1 1 d+2 . |x|d+1 |x| |x| By our initial remark, we are done.  d+1  Next, we return to the original assumption that |α| ≤ 2 . The proof is similar, except 2 d ∞ d that we perform the computation in L (R ) instead of L (R ). Using Plancharel and then proceeding like before, we have:    α α X α1 α2 ξ k(−2πix) mˇ (x)k 2 = D m (ξ) D m(ξ) · D ψ N L ξ N L2 . ξ ξ x ξ N 2 α1+α2=α Lξ

X 1 · N −|α2| . |α | |ξ| 1 2 α1+α2=α Lξ(|ξ|∼N) ! 1 X Z 2 = N −|α2| |ξ|−2|α1| dξ |ξ|∼N α1+α2=α 1 X −|α |  −2|α |+d 2 . N 2 N 1 α1+α2=α −|α|+ d = N 2 .

d In particular, by choosing α = 0 we have kmˇN (x)k 2 N 2 . Thus, for a fixed A > 0, Lx . Holder’s inequality gives Z d d |mˇ (x)| dx kmˇ (x)k 2 χ N 2 A 2 N . N L |x|≤A L2 . |x|≤A and Z |α| −|α| |mˇN (x)| dx |x| mˇN (x) |x| χ|x|>A . . 2 2 |x|>A L L

d |α| −|α|+ 2 By the above computation, |x| mˇN (x) L2 . N . Computing the other norm,

1 Z ! 2 −|α| −2|α| −|α|+ d |x| χ|x|>A = |x| dx A 2 2 . L |x|>A

 d+1  provided |α| ≤ 2 . Combining these two estimates gives Z −d d+1 e+ d |mˇN (x)| dx . (NA) 2 2 . |x|>A

59 In particular, choosing A = 1/N, we have Z |mˇN (x)| dx . 1 uniformly in N ∈ 2Z. Essentially the same computation gives Z |∇mˇN (x)| dx . N where the implicit constant is independent of N. The only difference in the calculation is α ∂ α that we begin by estimating (−2πix) mˇN (x) 2 ∼ D ξmN (ξ) and consequently ∂x Lx ξ 2 Lξ pick up an extra power of N throughout. Thus, Z X Z |K(x + y) − K(x)| dx ≤ |mˇN (x + y) − mˇN (x)| dx |x|≥2|y| |x|≥2|y| N∈2Z X Z = |mˇN (x + y) − mˇN (x)| dx 1 |x|≥2|y| N≤ |y| X Z + |mˇN (x + y) − mˇN (x)| dx. 1 |x|≥2|y| N> |y| In the first sum, over low frequencies, we can use the fundamental theorem of calculus to get X Z X Z Z 1 |mˇN (x + y) − mˇN (x)| dx ≤ |y| |∇mˇN (x + θy)| dθ dx. 1 |x|≥2|y| 1 |x|≥2|y| 0 N≤ |y| N≤ |y|

d Using Fubini’s theorem and integrating over R instead of |x| ≥ 2|y| gives X Z X X |mˇN (x + y) − mˇN (x)| dx ≤ |y| k∇mˇN kL1 . |y|N . 1 1 |x|≥2|y| 1 1 N≤ |y| N≤ |y| N≤ |y| where the last inequality follows from the usual dyadic sum argument. For the second sum, we have the crude estimate X Z X Z |mˇN (x + y) − mˇN (x)| dx ≤ |mˇN (x + y)| + |mˇN (x)| dx 1 |x|≥2|y| 1 |x|≥2|y| N> |y| N> |y| X Z . |mˇN (x)| dx 1 |x|≥|y| N> |y| X −d d+1 e+ d . (N|y|) 2 2 . 1 N> |y|  d+1  d Here, we have used our previous estimate with A = |y|. Since − 2 + 2 < 0 and the 1 sum ranges over dyadic numbers greater than |y| , we have

d+1 d  −d 2 e+ 2 X − d+1 + d 1 (N|y|) d 2 e 2 |y| = 1. . |y| 1 N> |y| Therefore, Z |K(x + y) − K(x)| dx . 1 + 1 . 1 |x|≥2|y| and we are done.

60 Chapter 7

Littlewood-Paley Theory

In this chapter, we develop the theory of Littlewood-Paley projections and discuss a few of their immediate applications. More applications will become apparent in various contexts throughout the rest of this text. The philosophy behind Littlewood-Paley theory is to decompose functions by localiz- ing at different frequencies in the Fourier domain. As we shall see, this turns out to be a powerful idea. One source of motivation is the desire for a notion of orthogonality in p 2 P L spaces for p 6= 2. Indeed, suppose that we can write f ∈ L as f = j fj, where the functions fˆ have disjoint support in the frequency domain. Then by Plancharel, 2 2 2 2 X X ˆ X ˆ X 2 kfkL2 = fj = fj = fj = kfjk 2 . L2 L j j j j L2 L2 The calculation shows that the Fourier transform has some nice orthogonality properties in L2. Unfortunately, this is not strictly true in other Lp spaces. Littlewood-Paley theory gives a way of understanding the orthogonality properties of the Fourier transform in such spaces. Some references for the following material are [2] and [3].

7.1 Littlewood-Paley Projections

We begin by defining the Littlewood-Paley projection operators. Throughout the rest of this chapter, we will use the following bump functions: let ϕ be a smooth function (which we view on the frequency domain) such that  1 |ξ| ≤ 1 ϕ(ξ) = 11 . 0 |ξ| ≥ 10 Let ψ(ξ) = ϕ(ξ) − ϕ(2ξ). Then  1 0 |ξ| ≤ 2  11 ψ(ξ) = 1 20 ≤ |ξ| ≤ 1 .  11 0 |ξ| ≥ 10   Z ξ For N ∈ 2 , define ψN (ξ) := ψ N .

Definition 7.1. The Littlewood-Paley projection onto frequencies |ξ| ∼ N, denoted PN , is defined in the Fourier domain for Schwartz functions f via ˆ PdN f(ξ) = ψN (ξ)f(ξ). Equivalently, d ˇ PN f = f ∗ N ψ(N·).

61 In words, the operator PN localizes the function f to frequencies of order N. Note that 2 PN is not a true projection operator, despite its name, since PN 6= PN . Littlewood-Paley projections can naturally be defined on wider ranges of frequencies.

Definition 7.2. The Littlewood-Paley projection onto low frequencies, denoted P≤N , is defined in the Fourier domain for Schwartz functions f via  ξ  P\f(ξ) = ϕ fˆ(ξ). ≤N N

Equivalently, d P≤N f = f ∗ N ϕˇ(N·).

The Littlewood-Paley projection onto high frequencies is P>N := I − P≤N , and the P Littlewood-Paley projection onto medium frequencies is PM≤ ≤N := M≤K≤N PK .

We often use the shorthand notation fN , f≤N , f>N , and fM≤ ≤N for PN f, P≤N f, P>N f, and PM≤ ≤N f, respectively. The following proposition contains basic properties of the Littlewood-Paley projections that will be used frequently.

Proposition 7.3. Fix N ∈ 2Z.

p 1. The operators PN and P≤N are bounded on L for 1 ≤ p ≤ ∞, i.e., kfN kLp + kf≤N kLp . kfkLp .

2. We have the pointwise estimate |fN (x)| + |f≤N (x)| . (Mf)(x).

p p P L 3. For f ∈ L with 1 < p < ∞, N∈2Z PN f −→ f.

d d p − q 4. (Bernstein inequality I) kfN kLq . N kfN kLp for 1 ≤ p ≤ q ≤ ∞. s s 5. (Bernstein inequality II) k|∇| fN kLp ∼ N kfN kLp for 1 ≤ p ≤ ∞ and s ∈ R.

Proof.

1. We compute:

d ˇ d ˇ kfN kLp = f ∗ N ψ(N·) . kfkLp N ψ(N·) . Lp L1

d ˇ 1 d ˇ ˇ By construction, N ψ(N·) is L -normalized, so that N ψ(N·) L1 = ψ L1 . Thus, ˇ kfN kLp . kfkLp ψ L1 .

The same computation works for kf≤N kLp .

2. We show the estimate for |fN (x)|, and as before, the same proof works for |f≤N (x)|. We have Z  d ˇ  d ˇ fN (x) = f ∗ N ψ(N·) (x) = N f(x − y)ψ(Ny) dy.

Because ψ is Schwartz, ψˇ is Schwartz, hence is bounded and decays up to any order. Thus, Z |f(x − y)| |f (x)| N d dy N . hN|y|i100d

62 2 1 where we use the Japanese angle bracket notation hxi := (1 + |x| ) 2 . When |y| is small, we exploit the boundedness of the angle bracket, and when |y| is large we exploit its decay. Explicitly, Z Z d d |f(x − y)| |fN (x)| . N |f(x − y)| dy + N dy. 1 1 (N|y|)100d |y|≤ N |y|> N Because |B(0, 1/N)| ∼ N −d so that 1/|B(0, 1/N)| ∼ N d, we have by definition of the maximal function Z N d |f(x − y)| dy (Mf)(x). 1 . |y|≤ N To estimate the second integral, we sum over dyadic annuli. Z Z d |f(x − y)| d X |f(x − y)| N 100d dy . N 100d dy |y|> 1 (N|y|) M≤|y|≤2M (NM) N Z 1 M∈2 ; M> N d Z d X −100d M . N (NM) |f(x − y)| dy |B(0, 2M)| B(0,2M) Z 1 M∈2 ; M> N X −99d . (NM) (Mf)(x). Z 1 M∈2 ; M> N By the usual dyadic sum argument, −99d X  1  (NM)−99d(Mf)(x) M (Mf)(x) = (Mf)(x) . M Z 1 M∈2 ; M> N and we are done.

d p d 3. By the density of S(R ) ⊆ L (R ) and by property (1), it suffices by the usual approx- d imation argument to prove the claim for Schwartz functions. Fix f ∈ S(R ). For p = 2, by Plancharel we have  

ˆ X ˆ X ˆ X kf − fN≤ ≤M k 2 = f 1 − ψK  ≤ f ψK + f ψK . L N≤K≤M KM 2 L2 L L P P Since KM ψK ≤ χ|ξ|≥M , by the dominated convergence theorem, the left hands side tends to 0 as N → 0, and the right hand side tends to 0 as M → ∞. 1 θ 1−θ For 1 < p < 2, we use interpolation. Choose θ so that p = 1 + 2 . Then θ 1−θ kf − fN≤ ≤M kLp ≤ kf − fN≤ ≤M kL1 kf − fN≤ ≤M kL2 . By (1), θ θ θ kf − fN≤ ≤M kL1 ≤ kfkL1 + kfN≤ ≤M kL1 . kfkL1 . 1−θ Our previous computation shows that kf − fN≤ ≤M kL2 → 0 and thus

kf − fN≤ ≤M kLp → 0.

For 2 < p < ∞, the same trick works:

2 2 1− p p kf − fN≤ ≤M kLp ≤ kf − fN≤ ≤M kL1 kf − fN≤ ≤M kL2 . The left hand side is bounded by (1), and the right hand side → 0 by the previous calculation.

63 1 1 1 4. By Young’s convolution inequality, for r satisfying p + r = q + 1,

d d d− d ˇ ˇ r ˇ kfN kLq = f ∗ N ψ(N·) . kfkLp N ψ(N·) = kfkLp N ψ r Lq Lr L d d p − q . N kfkLp .

This is great but this isn’t the estimate we need. To recover fN on the right, we use ˜ ˜ fattened Littlewood-Paley projections PN := P N . Note that PN PN = PN . Also, 2 ≤ ≤2N we have   d˜  X  ˆ PN f(ξ) =  ψK  (ξ)f(ξ) N 2 ≤K≤2N so that  ˇ ˜  X  X d ˇ d X ˇ PN f = f ∗  ψK  = f ∗ K ψ(K·) ∼ f ∗ N ψ(K·) N N N 2 ≤K≤2N 2 ≤K≤2N 2 ≤K≤2N ˇ = f ∗ N dψ˜(N·)

for an appropriately defined ψ˜. 1 Thus,

˜ d ˜ˇ d ˜ˇ kfN kLq = PN fN ∼ fN ∗ N ψ(N·) . kfN kLp N ψ(N·) Lq Lq Lr d− d ˇ r ˜ = kfN kLp N ψ Lr d d p − q . N kfN kLp .

5. Fix s ∈ R. By definition, we have: |ξ|s  ξ  |∇|\sf (ξ) ∼ |ξ|sψ (ξ)fˆ(ξ) = N s ψ fˆ(ξ). N N N N

s ∞ d Since the support of ψ is localized away from the origin, ρ(ξ) := |ξ| ψ(ξ) ∈ Cc (R \   s s ξ ˆ {0}) for any value of s. So |∇|\fN (ξ) ∼ N ρ N f(ξ), hence

s s h d i |∇| fN = N f ∗ N ρˇ(N·) .

Thus,

s s d s s k|∇| fN kLp . N kfkLp N ρˇ(N·) = N kfkLp kρˇkL1 . N kfkLp . L1 Via the same fattened Littlewood-Paley projections technique, we get the estimate s s k|∇| fN kLp . N kfN kLp . To get the reverse inequality, observe that ˆ −s s ˆ −s s ˆ fN (ξ) = |ξ| |ξ| fN (ξ) ∼ (|∇| |∇| fN )(ξ).

−s s Thus, applying the inequality we already have to |∇| |∇| fN gives

−s s −s s kfN kLp ∼ |∇| |∇| fN Lp . N k|∇| fN kLp s s so that N kfN kLp . k|∇| fN kLp as desired.

1 P ˇ ˇ N  ˇ ˇ Note that, since we are summing over dyadic numbers, N ψ(K·) = ψ · + ψ (N·) + ψ (2N·). 2 ≤K≤2N 2

64 R ˆ We remark that (3) does not hold for p = 1 or p = ∞. Note that fN (x) dx = fN (0) = R P L1 0 for any N ∈ 2Z, so fN≤ ≤M (x) dx = 0. But PN f −→ f implies that their means converge. The claim fails for p = 1 by choosing an f ∈ L1 with R f(x) dx 6= 0. To see why ∞ ∞ the claim fails for p = ∞, note that fN≤ ≤M ∈ C . Since convergence in L is uniform P convergence, lim PN f is continuous. We also remark that the same proof technique in (4) gives the estimate kf≤N kLq . d d p − q N kf≤N kLp . Though (3) does not hold for p = 1, we do have the following result.

1 1 d R P L Proposition 7.4. Let f ∈ L ( ) with d f(x) dx = 0. Then P f −→ f. R R N Proof. First, we claim that f can be approximated by a smooth function with compact sup- port and mean 0. Indeed, fix ε > 0. Then since f ∈ L1, there exists an R > 0 such that R |x|>R |f(x)| dx < ε. Then Z f(x)χB(0,R)(x) + O(ε)χR≤|x|≤2R(x) dx = 0 for some appropriately chosen constant O(ε). Convolving with a smooth approximation to the identity the desired approximation. ∞ d R Thus, we can assume that f ∈ Cc (R ) and f(x) dx = 0. By the triangle inequality,

X X X f − PN f ≤ PK f + kPK fk 1 . L N≤K≤M KM L1 L First, consider the high frequencies. By Bernstein’s inequality,

X X −1 kPK fkL1 . K k|∇|fK kL1 . K>M K>M

∞ d Since f ∈ Cc (R ) and the sum is a geometric sum over dyadic numbers,

X −1 kPK fkL1 .f M K>M which → 0 as M → ∞. Next, consider the low frequencies. We exploit the fact that convolving with a mean 0 function is like taking a derivative, in the following sense: we have Z Z d d (P≤f)(x) = f(y)N ϕˇ(N(x − y)) dy = N f(y) [ϕ ˇ(N(x − y)) − ϕˇ(Nx)] dy since f has mean 0. By the fundamental theorem of calculus, Z 1 ϕˇ(N(x − y)) − ϕˇ(Nx) = Ny ∇ϕˇ(Nx − θNy) dθ. 0 Using the fact that the support of f is contained in B(0,R), along with the fact that ϕˇ is Schwartz and hence decays as fast as we need, ZZ Z 1 d+1 kP≤N fkL1 . N |y||f(y)| |∇ϕˇ(Nx − θNy)| dθ dy dx 0 ZZ 1 N d+1R |f(y)| dy dx. . hNxi100d

65 Here we have also used the fact that NR << 1 for N << 1, so we don’t need to include it in the Japanese angle bracket. So Z d+1 1 d+1 −d kP fk N R kfk 1 dx N R kfk 1 N ≤N L1 . L hNxi100d . L which → 0 as N → 0.

Broadly speaking, low frequencies of a function contribute to smoothness, whereas high frequencies contribute to irregularity. This is made somewhat precise by the following proposition. ∞ d Proposition 7.5. Let f ∈ L (R ) and fix 0 < α < 1. Then f is α-Holder continuous if and only −α if kP≥N fkL∞ . N for all N ≥ 1. d Proof. Suppose first that f is α-Holder continuous. Then for all x, y ∈ R with y 6= 0, α |f(x − y) − f(x)| . |y| . Fix N ∈ 2Z such that N ≥ 1. By definition of the Littlewood-Paley projections,

X X X d ˇ |P≥N f(x)| = PK f(x) ≤ |PK f(x)| = (f ∗ K ψ(K·)(x)

K≥N K≥N K≥N Z X d = K f(x − y)ψˇ(Ky) dy . d K≥N R

R ˇ ˇˆ ˇ Observe that d ψ(y) dy = ψ(0) = ψ(0) = 0. Thus, ψ has mean zero, hence R Z Kd f(x)ψˇ(Ny) dy = 0. Rd Inserting this into the above expression gives Z Z X d ˇ d ˇ |P≥N f(x)| ≤ K f(x − y)ψ(Ky) dy − K f(x)ψ(Ky) dy d d K≥N R R Z X d = K [f(x − y) − f(x)] ψˇ(Ky) dy d K≥N R X Z ≤ Kd |f(x − y) − f(x)||ψˇ(Ky)| dy. d K≥N R

Since f is α-Holder continuous, we have Z X d α ˇ |P≥N f(x)| . K |y| |ψ(Ky)| dy. d K≥N R

Making the change of variables z = Ky gives dy = K−d dz, hence Z α Z X d z ˇ −d X −α α ˇ |P≥N f(x)| . K |ψ(z)| K dz = K |z| |ψ(z)| dz. d K d K≥N R K≥N R

Since ψ is a smooth function with compact support on an annulus of radius ∼ 1, ψ is Schwartz, and hence ψˇ is Schwartz. Thus, ψˇ decays up to any polynomial order, and |z|α|ψˇ(z)| is bounded. Therefore, Z Z Z α ˇ α ˇ α 1 |z| |ψ(z)| dz . |z| |ψ(z)| dz + |z| 100d dz . 1 + 1 = 1. Rd |z|≤1 |z|>1 |z|

66 This gives X −α −α |P≥N f(x)| . K . N K≥N −α This bound is uniform in x. Thus, kP≥N fkL∞ . N for all dyadic N ≥ 1. Next, we show the converse statement. −α Suppose that kP≥N fkL∞ . N for all dyadic N ≥ 1. We need to show that |f(x − y) − α d d ∞ d f(x)| . |y| for all x ∈ R and y 6= 0 ∈ R . Because f ∈ L (R ) and hence |f(x − y) − f(x)| ≤ 2 kfkL∞ . 1, it suffices to consider 0 < |y| < 1. d So fix x, y ∈ R with 0 < |y| < 1. Decompose |f(x − y) − f(x)| as

|f(x − y) − f(x)| ≤ |f<1(x − y) − f<1(x)| + |f≥1(x − y) − f≥1(x)|. We estimate these terms separately. First, consider the projection of f onto frequencies < 1. By definition of the Littlewood-Paley projection P<1,

|f<1(x − y) − f<1(x)| = |(f ∗ ϕˇ)(x − y) − (f ∗ ϕˇ)(x)| Z Z

= f(x − y − z)ϕ ˇ(z) dz − f(x − z)ϕ ˇ(z) dz .

Changing variables in the first integral via y + z 7→ z gives Z Z

|f<1(x − y) − f<1(x)| = f(x − z)ϕ ˇ(z − y) dz − f(x − z)ϕ ˇ(z) dz

Z ≤ |f(x − z)| |ϕˇ(z − y) − ϕˇ(z)| dz.

By the fundamental theorem of calculus, Z Z 1 |f<1(x − y) − f<1(x)| ≤ |f(x − z)||y| |∇ϕˇ(z − θy)| dθ dz 0 ZZ 1 ≤ kfk∞ |y| |∇ϕˇ(z − θy)| dθ dz. 0 Because ϕ is Schwartz, ∇ϕˇ is Schwartz, hence is bounded and decays up to any order. Consequently, ZZ 1 Z 1 |∇ϕˇ(z − θy)| dθ dz . 100d dz . 1. 0 hzi α α Thus, |f<1(x − y) − f<1(x)| . |y|. Since |y| < 1, |y| ≥ |y|, so |f<1(x − y) − f<1(x)| . |y| . Next we consider |f≥1(x − y) − f≥1(x)|. For any N ≥ 1, by assumption, −α |f≥N (x − y) − f≥N (x)| ≤ |f≥N (x − y)| + |f≥N (x)| ≤ 2 kP≥N fkL∞ . N . (7.1) On the other hand, for any K ≥ 1 we compute as above with fattened Littlewood-Paley projections to get

|fK (x − y) − fK (x)| = |P˜K fK (x − y) − P˜K fK (x)|

d ˜ˇ d ˜ˇ = (fK ∗ K ψ(K·))(x − y) − (fK ∗ K ψ(K·))(x) Z Z d ˜ˇ d ˜ˇ = K fK (x − y − z)ψ(Kz) dz − K fK (x − z)ψ(Kz) dz Rd Rd Z Z d ˜ˇ d ˜ˇ = K fK (x − z)ψ(K(z − y)) dz − K fK (x − z)ψ(Kz) dz Rd Rd Z d h ˜ˇ ˜ˇ i = K ψ(K(z − y)) − ψ(Kz) fK (x − z) dz Rd Z d ˜ˇ ˜ˇ ≤ K kfK kL∞ ψ(Kz − Ky) − ψ(Kz) dz. Rd

67 Changing variables via Kz 7→ z and hence dz 7→ K−d dz gives Z ˜ˇ ˜ˇ |fK (x − y) − fK (x)| ≤ kfK kL∞ ψ(z − Ky) − ψ(z) dz. Rd

Since fK = f≥K − f≥2K ,

−α −α −α kfK kL∞ ≤ kf≥K kL∞ + kf≥2K kL∞ . K + (2K) . K .

Then by this fact and by the fundamental theorem of calculus,

ZZ 1 −α ˜ˇ |fK (x − y) − fK (x)| ≤ K K|y| ∇ψ(z − θNy dθ dz. 0

1 ˇ ψ RR ∇ψ˜(z − θNy dθ dz 1 Since is Schwartz, the same argument as above gives 0 . . Thus, for any K ≥ 1 we have the estimate

1−α |fK (x − y) − fK (x)| . K |y|. (7.2)

So for any dyadic N ≥ 1, by (7.1) and (7.2), X |f≥1(x − y) − f≥1(x)| ≤ |fK (x − y) − fK (x)| + |f≥N (x − y) − f≥N (x)| 1≤K

P 1−α 1−α Since the sum is over dyadic K, 1≤K

1−α −α |f≥1(x − y) − f≥1(x)| . N |y| + N .

Choose N ≥ 1 such that 1 1 ≤ N ≤ 2 . |y| |y| Since |y| < 1, this is certainly possible. Then N ∼ |y|−1, and thus

−1 1−α −1 −α α−1 α |f≥1(x − y) − f≥1(x)| . (|y| ) |y| + (|y| ) = |y| |y| + |y| α . |y| .

Putting the two estimates together yields

α α |f(x − y) − f(x)| ≤ |f<1(x − y) − f<1(x)| + |f≥1(x − y) − f≥1(x)| . |y| + |y| α . |y| so that f is α-Holder continuous as desired.

7.2 The Littlewood-Paley Square Function

In this section, we introduce the Littlewood-Paley square function, and the key fact that its Lp-norm is comparable to the Lp-norm of the input function.

d Definition 7.6. For f ∈ S(R ), the Littlewood-Paley square function of f is given by

1   2 X 2 (Sf)(x) =  |fN (x)|  . N∈2Z

68 To better estimate the size of the square function, we borrow an inequality from proba- bility theory.

Lemma 7.7 (Kinchin’s Inequality). Let Xn be independent identically distributed random vari- ables such that Xn = ±1 with equal probability. Then for any 0 < p < ∞,

 p 1 q X p X 2 E cnXN ∼ |cn| .

Proof. By considering real and imaginary parts, it suffices to assume that cn ∈ R. Then p Z ∞   dλ X p X E cnXN = p λ P cnXN > λ . 0 λ We have         X X X X P cnXN > λ ≤ P cnXN > λ + P cnXN < −λ = 2 P cnXN > λ because the Xn are i.i.d with Xn = ±1 with equal probability. Next, recall the exponential Chebychev’s inequality: for a random variable X and t > 0,

−tλ tX  P (X ≥ λ) ≤ e E e .

Thus, for t > 0,

   P  X −λt t cnXn P cnXN > λ ≤ 2e E e .

Because the Xn are independent, this gives

 X  Y Y 1 1  c X > λ ≤ 2e−λt etcnXn  = 2e−λt etcn + e−tcn P n N E 2 2 −λt Y = 2e cosh(tcn).

x2 Recall that cosh(x) ≤ e 2 ; one quick way to see this is by comparing Taylor series. So

2 2  X  −λt Y (tcn) −λt t P c2 2 2 n P cnXN > λ ≤ 2e e = 2e e .

2 P 2 λ Choosing t such that λt = t c , hence t = P 2 , gives n cn

λ2  X  −λt λt − λt − P 2 2 2 2 cn P cnXN > λ ≤ 2e e = 2e = 2e . So

p Z ∞ λ2 X p − P 2 dλ 2 cn E cnXN .p λ e . 0 λ z = √ λ Making the changes of variables P 2 yields cn

p p Z ∞ 2 p X X 2  2 p − z dz X 2  2 2 E cnXN . cn z e . cn . 0 z

This gives the . direction of the statement. Next, we need the & direction. We first consider the case 1 ≤ p < ∞. Note that P 2 P 2 |cn| = E | cnXn| . This is because the Xn are independent, and so

E(XnXm) = E(Xn) E(Xm) = 0.

69 Then by Holder, and by the above inequality,

1 1 0 1 1  p  p  p0  p   X 2 X p X X p X 2 2 |cn| ≤ E cnXn E cnXn . E cnXn cn .

This gives   p  p 1 X 2 2 X p cn . E cnXn as desired. For 0 < p < 1, we use Cauchy-Schwarz:

2  p 2− p  X 2 X X 2 X 2 |cn| = E cnXn = E cnXn cnXn

1 1  p  4−p 2 X 2 X ≤ E cnXn E cnXn .

Again by the first demonstrated inequality we have

 p 1   4−p   p  p 1 X 2 X 2 X 2 4 X 2 4 X 2 |cn| . E cnXn |cn| ⇒ |cn| . E cnXn .

2 Raising both sides to the power p gives the desired result. The main result of this section is the following theorem.

Theorem 7.8 (Square function estimate). For 1 < p < ∞, we have kSfkLp ∼ kfkLp .

Proof. We will first show kSfkLp . kfkLp . As a remark, the proof of this direction does not rely on the specific multiplier ψ defining P1, so this inequality holds in more generality. In ∞ d particular, ψ can be replaced by any element of Cc (R \{0}). P Let m(ξ) := N∈2Z ψN (ξ)XN , where the XN are i.i.d random variables such XN = ±1 with equal probability. Then kmkL∞ . 1, because for any given ξ, only finitely many of the summands give any nonzero contribution due to the compact supports of the φN . Also,   α X −|α| α ξ |D m(ξ)| ≤ N D ψ . ξ ξ N N∈2Z

Because ψN is smooth and has support |ξ| ∼ N, and because only finitely summands contribute, α −|α| |Dξ m(ξ)| . |ξ| . d As this holds for any α ∈ N , the Mikhlin multiplier theorem gives kmˇ ∗ fkLp . kfkLp for P all 1 < p < ∞. Note that mˇ ∗ f = N∈2Z fN XN . So by Kinchin’s inequality,

  1  p 1 2 p X 2 X (Sf)(x) =  |fN (x)|  ∼ E fN (x)XN 

N∈2Z N∈2Z p 1 = (E |mˇ ∗ f| ) p . Using Fubini’s theorem, Z p p p kSfkLp ∼ E |mˇ ∗ f| (x) dx . E kmˇ ∗ fkLp

Since the expectation of a constant is itself,

p p p p kSfkLp . E kmˇ ∗ fkLp . E kfkLp = kfkLp .

70 Next, we show kfkLp . kSfkLp using duality and the fattened Littlewood-Paley projec- tions. Recall that P˜N PN = PN . We have

DX E DX ˜ E kfkLp = sup hf, gi = sup PN f, g = sup PN PN f, g kgk 0 kgk 0 kgk 0 Lp =1 Lp =1 Lp =1 X D E = sup PN f, P˜N g kgk 0 Lp =1 since the operators PN are self-adjoint. By Cauchy-Schwarz and then Holder,

 1 1  1 X 2 2 X ˜ 2 2 X ˜ 2 2 kfkLp ≤ sup |PN f| , |PN g| ≤ sup kSfkLp |PN g| . 0 kgk 0 kgk 0 p Lp =1 Lp =1 L

  1 P ˜ 2 2 By the remark at the beginning of the proof, |PN g| . kgkLp0 . Thus, Lp0

kfkLp . kSfkLp .

7.3 Applications to Fractional Derivatives

One of the first properties of Littlewood-Paley projections that we proved was Bernstein’s s s inequality, which says k|∇| fN kLp ∼ N kfN kLp . This should not be surprising, as differ- entiation of a function amount to multiplication by ξ in the frequency domain, and the function fN is localized to frequencies |ξ| ∼ N. Because the Littlewood-Paley projections interact with differentiation so nicely, they naturally have applications in partial differen- tial equations. In this section, we prove a product rule and chain rule for the fractional derivative operator |∇|s. First, we provide a square function estimate involving the fractional derivative. This captures the notion that the derivative |∇|s can be expressed as a linear combination of Littlewood-Paley projection operators. ˆ ∞ d Proposition 7.9. Let 1 < p < ∞ and suppose that f ∈ Cc (R \{0}). Then

1. For s ∈ R, 1   2

s X s 2 k|∇| fkLp ∼  |N fN |  . N∈2Z Lp 2. For s > 0, 1   2

s X s 2 k|∇| fkLp ∼  |N f≥N |  . N∈2Z Lp Proof.

1. First, consider the & inequality. We have

1 1   2   2

X s 2 X s −s s 2  |N fN |  =  |N |∇| |∇| fN |  . N∈2Z N∈2Z Lp Lp

71 s s s Also, |∇| fN = |∇| PN f = PN (|∇| f), as these operators are given by multiplication on the Fourier side. Thus, 1 1   2   2

X s 2 X s −s s 2  |N fN |  =  |N |∇| PN (|∇| f)|  . N∈2Z N∈2Z Lp Lp Recall that in the proof of the Littlewood-Paley square function estimate from, we proved one direction in a greater generality; we showed that kSfkLp . kfkLp where ∞ d ˜ −s S is defined via any ψ ∈ Cc (R \{0}). In particular, this holds for ψ(ξ) := |ξ| ψ(ξ), ∞ d ˜ ∞ d where ψ is the usual Littlewood-Paley ψ. Because ψ ∈ Cc (R \{0}), ψ ∈ Cc (R \ ˜ s −s {0}). Defining ψN (ξ) := N |ξ| ψ(ξ/N), the general square function estimate gives

1   2

X s −s s 2 s  |N |∇| PN (|∇| f)|  . k|∇| fkLp . N∈2Z Lp Next, we show ., using duality and the fattened Littlewood-Paley projections.

s s DX s E k|∇| fkLp = sup h|∇| f, gi = sup PN (|∇| f), g kgk 0 =1 kgk 0 =1 Lp Lp DX s E = sup |∇| P˜N PN f, g kgk 0 =1 Lp X D s E = sup PN f, |∇| P˜N g kgk 0 =1 Lp X D s −s s E = sup N PN f, N |∇| P˜N g . kgk 0 =1 Lp Applying Cauchy-Schwarz and then Holder,

 1 1  s X 2 2 2 X −s s ˜ 2 2 k|∇| fkLp ≤ sup |N fN | , |N |∇| PN g| kgk 0 =1 Lp 1 1 X s 2 2 X −s s ˜ 2 2 ≤ sup |N fN | |N |∇| PN g| . 0 kgk 0 =1 p p Lp L L By the same remark from above,

1 X −s s ˜ 2 2 |N |∇| PN g| . kgkLp0 . Lp0 1 s P s 2 2 Thus, k|∇| fkLp . |N fN | . Lp

2. Note that PN = P≥N − P≥2N . Thus,

1 s X s 2 2 k|∇| fkLp ∼ |N fN | Lp 1 1 X s 2 2 X s 2 2 ≤ |N f≥N | + |N f≥2N | Lp Lp where we have invoked the `2 triangle inequality followed by the Lp triangle inequal- ity. But up to a factor of 2,

1 1 X s 2 2 X s 2 2 |N f≥2N | . |N f≥N | . Lp Lp

72 So 1 s X s 2 2 k|∇| fkLp . |N f≥N | . Lp Next, we show &. We have

X s 2 X 2s 2 X 2s X 2 |N f≥N | = N |f≥N | ≤ N |fK | N∈2Z K≥N     X 2s X X ≤ N  |fN1 |  |fN2 | . N∈2Z N1≥N N2≥N

Multiplying the latter sums out, rearranging, and picking up a factor of 2 from sym- metry gives X s 2 X 2s X |N f≥N | ≤ 2 N |fN1 ||fN2 |. N∈2Z N≤N1≤N2 Cleverly inserting constants yields

X s 2 X 2s X 1 s s |N f≥N | ≤ 2 N s s |N1 fN1 ||N2 fN2 | N1 N2 N∈2Z N≤N1≤N2 2s X N s s = 2 s s |N1 fN1 ||N2 fN2 |. N1 N2 N,N1,N2; N≤N1≤N2

Freeze N1 and N2 and consider the sum over N. This is a dyadic sum, and by the usual argument we can bound this by plugging in the largest term, which is N1. Thus,  s X s 2 X N1 s s |N f≥N | . |N1 fN1 ||N2 fN2 |. N2 N1≤N2

This is summable in both N1 and N2, since s > 0. By Schur’s test, it follows that

1 1   2   2 X s 2 X s 2 X s 2 X s 2 |N f≥N | .  |N2 fN2 |   |N1 fN1 |  = |N fN | . N2 N1

Taking the square root of both sides and then the Lp norm gives

1 1 X s 2 2 X s 2 2 |N f≥N | . |N fN | . Lp Lp Applying the result from the previous part gives the desired result.

Using these estimates, we now prove the product rule and chain rule for the operator |∇|s, both results due to Christ and Weinstein [1]. As |∇|s is a non-local operator, these rules are given as Lp norm estimates, rather than as pointwise facts.

Theorem 7.10 (Fractional product rule). Let 1 < p < ∞ and s > 0. Then

s s s k|∇| (fg)kLp . k|∇| fkLp1 kgkLp2 + k|∇| gkLq1 kfkLq2 where 1 = 1 + 1 = 1 + 1 . p p1 p2 q1 q2

73 1 s P 2s 2 2 Proof. The previous proposition gives k|∇| (fg)kLp ∼ N |PN (fg)| . We per- Lp form a paraproduct decomposition on fg:

fg = f N g + f N g = f N g + f N g N + f N g N . ≥ 8 < 8 ≥ 8 < 8 ≥ 8 < 8 < 8 Then PN (fg) = PN (f N g) + PN (f N g N ) + PN (f N g N ). ≥ 8 < 8 ≥ 8 < 8 < 8 N N Note that f N and g N have frequency supports bounded by 2 = . Thus, the maxi- < 8 < 8 8 4 N N N mum frequency attained by f N g N is bounded by + = . Thus, PN (f N g N ) = 0. < 8 < 8 4 4 2 < 8 < 8 Thus, it remains to consider the first two terms. Using the fact that the Littlewood-Paley projections are bounded by the maximal func- tion, h i |PN (fg)| M(f N g) + M(f N g N ) M(f N g) + M (Mf) g N . . ≥ 8 < 8 ≥ 8 . ≥ 8 ≥ 8 Therefore, the `2 and Lp triangle inequalities yield

1 X 2s 2 2 N |PN (fg)| Lp 1 1  2 2  2 2 X s X h s i M(N f N g) + M (Mf) N g N . . ≥ 8 ≥ 8 Lp Lp Next, apply the vector-valued Hardy-Littlewood maximal inequality to both norms:

1 1 1    2 2  2 2 X 2s 2 2 X s X s N |PN (fg)| . N f≥ N g + (Mf) N g≥ N p 8 8 L Lp Lp 1 1  2 2  2 2 X s X s = |g| N f N + |Mf| N g N . ≥ 8 ≥ 8 Lp Lp Applying Holder’s inequality,

1 X 2s 2 2 N |PN (fg)| Lp 1 1  2 2  2 2 X s X s kgkLp2 N f N + kMfkLq2 N g N . . ≥ 8 ≥ 8 Lp1 Lq1

Since the maximal function is of type (q2, q2), kMfkLq2 . kfkLq2 . Applying part 2 of the previous proposition to the p1 and q1 norms yields

1 s X 2s 2 2 s s k|∇| (fg)kLp ∼ N |PN (fg)| . kgkLp2 k|∇| fkLp1 + kfkLq2 k|∇| gkLq1 . Lp

Theorem 7.11 (Fractional chain rule). Suppose F : C → C satisfies

|F (u) − F (v)| . |u − v| · |G(u) − G(v)|

d for all functions u, v : R → C and for some function G : C → [0, ∞). Then for 1 < p < ∞ and 0 < s < 1, s s k|∇| F (u)kLp . k|∇| ukLp1 kG(u)kLp2 where 1 = 1 + 1 and 1 < p ≤ ∞. p p1 p2 2

74 Proof. As before, the previous proposition yields

1 s X 2s 2 2 k|∇| F (u)kLp ∼ N |PN F (u)| . Lp

Observe that the function ψˇ is a mean zero function. Indeed, Z ψˇ(x) dx = ψˇˆ(0) = ψ(0) = 0. Rd We use this and the fact that convolving with a mean zero function is like differentiation as follows: Z d ˇ d PN F (u)(x) = (F (u) ∗ N ψ(N·))(x) = N NyFˇ (u(x − y)) dy Z = N d ψˇ(Ny)[F (u(x − y)) − F (u(x))] dy and so Z d ˇ |PN F (u)(x)| . N |ψ(Ny)| |u(x − y) − u(x)| |G(u(x − y)) − G(u(x)| dy.

Decomposing |u(x − y) − u(x)| based on where we expect cancellation, we have X |u(x − y) − u(x)| ≤ |u>N (x − y)| + |u>N (x)| + |uK (x − y) − uK (x)|. K≤N

We proceed by considering each of these terms separately. First, we claim that

|uK (x − y) − uK (x)| . K|y| |(MuK )(x − y) + (MuK )(x)| .

Indeed, if K|y| & 1, then

|uK (x − y) − uK (x)| ≤ |uK (x − y)| + |uK (x)| . |(MuK )(x − y) + (MuK )(x)| . K|y| |(MuK )(x − y) + (MuK )(x)| . If K|y| << 1, then

uK (x − y) − uK (x) = P˜K (uK (x − y) − uK (x)) Z d ˜ˇ = K ψ(Ky)[uK (x − y − z) − uK (x − z)] dz Z d h ˜ˇ ˜ˇ i = K ψ(K(z − y)) − ψ(Kz) uK (x − z) dz via the change of variables y + z 7→ z. By the fundamental theorem of calculus, Z Z 1 d ˇ |uK (x − y) − uK (x)| ≤ K K|y| ∇ψ(Kz − θKy) dθ |uK (x − z)| dz. 0 ˇ Because ψ˜ is Schwartz and thus is bounded and decays as quickly as we need, we have Z 1 |u (x − y) − u (x)| K|y| Kd |u (x − z)| dz. K K . hKzi100d K

In previous arguments, by considering z < 1/K and z > 1/K separately, we have seen that Z 1 Kd |u (x − z)| dz (Mu )(x). hKzi100d K . K

75 Our claim then follows from this fact. Next, we consider the contribution of |u>N (x − y)| to |PN F (u)(x)|. The contribution is bounded by Z d ˇ N |ψ(Ny)| |u>N (x − y)| [G(u)(x − y) + G(u)(x)] dy.

By the same trick as above, use the quickly decaying and bounded nature of ψˇ to estimate this integral by

M (u>N · G(u)) (x)+G(u)(x)·M(u>N )(x) . M (u>N · G(u)) (x)+M(G(u))(x)·M(u>N )(x).

Similarly, the contribution of |u>N (x)| to |PN F (u)(x)| is bounded by Z d ˇ N |ψ(Ny)| |u>N (x)| [G(u)(x − y) + G(u)(x)] dy

. |u>N (x)| · M(G(u))(x) + |u>N (x)| · G(u)(x) . M(u>N )(x) · M(G(u))(x).

Thus, the contribution of |u>N (x − y)| + |u>N (x)| to |PN F (u)(x)| is bounded by

M (u>N · G(u)) (x) + M(G(u))(x) · M(u>N )(x).

s Estimate the contribution of M (u>N · G(u)) (x) to k|∇| F (u)kLp as follows:

1 1   2   2

X 2s 2 X s 2  N |M(u>N G(u))|  =  |M(N u>N G(u))|  N∈2Z N∈2Z Lp Lp 1   2

X s 2 .  |N u>N G(u)|  N∈2Z Lp using the vector-valued Hardy-Littlewood maximal inequality. Next, pull the G(u) out of the sum and apply Holder’s inequality followed by our derivative estimate from earlier:

1   2 X s 2 s kG(u)k p |N u | kG(u)k p k|∇| uk p . L 2  >N  . L 2 L 1 N∈2Z Lp1

s The contribution of M(G(u))(x) · M(u>N )(x) to k|∇| F (u)kLp is almost the exact same calculation. P Thus, it only remains to bound the contribution of the K≤N |uK (x − y) − uK (x)| term.

76 The contribution to |PN F (u)| is Z X d ˇ N |ψ(Ny)||uK (x − y) − uK (x)||G(u)(x − y) − G(u)(x)| dy K≤N Z X d ˇ . N |ψ(Ny)|K|y| |(MuK )(x − y) + (MuK )(x)| |G(u)(x − y) − G(u)(x)| dy K≤N X K Z = N d|ψˇ(Ny)|N|y| |(Mu )(x − y) + (Mu )(x)| |G(u)(x − y) − G(u)(x)| dy N K K K≤N X K [M(M(u )G(u))(x) + M(M(u ))(x) G(u)(x) . N K K K≤N

+ (M(uK )M(G(u)))(x) + (M(uK )G(u))(x)] X K [M(M(u )G(u))(x) + M(M(u ))(x) G(u)(x)] . N K K K≤N X K =: C . N K K≤N

s Towards the goal of estimating the contribution of this term to k|∇| F (u)kLp , we first com- pute:

2    

X 2s X K X X K X L N CK =  CK   CL N N N N K≤N N K≤N L≤N X X KL ≤ 2 N 2s C C N 2 K L N K≤L≤N where the factor of 2 comes from the symmetry of the summations. Inserting Ks and Ls terms and then summing in N gives

X 2(s−1) 1−s 1−s s s = 2 N K L (K CK )(L CL) N,K,L; K≤L≤N X 2(s−1) 1−s 1−s s s . L K L (K CK )(L CL) K≤L 1−s X K  = (KsC )(LsC ). L K L K≤L

This is summable in both K and L. Hence, by Schur’s test,

2 1 1 ! 2 ! 2 X 2s X K X s 2 X s 2 X 2s 2 N CK (K CK ) (L CL) = K C . N . K N K≤N K L K

s Using this, we can estimate the contribution to k|∇| F (u)kLp by

1 1 ! 2 ! 2 X 2s 2 X 2s 2 K |M(M(uK ) G(u))| + K |M(M(uK )) M(G(u))| .

K K Lp Lp The rest of the computation uses the vector valued maximal inequality in exactly the same manner as above. To avoid unnecessary repetition, the details are left to the reader.

77 An example of such a function F which appears in applications in partial differential equations is F (u) = |u|pu. Indeed, by the fundamental theorem of calculus,

|F (u) − F (v)| ≤ |u − v| · ||u|p + |v|p| .

Nonlinearities in equations such as the Schrodinger equation often take this form. Also, this fractional product rule is useful for the case s > 1, as we can typically write derivative operators as an integer part plus a fractional part 0 < s < 1.

78 Chapter 8

Oscillatory Integrals

In this chapter, we develop the basic tools to study oscillatory integrals. The fundamental example of an oscillatory integral is the Fourier transform: Z e−2πix·ξf(x) dx. Rd The exponential term in the above integral has unit modulus, but nontrivial phase oscilla- tion. The most naive way to estimate the size of a particular Fourier transform is to use the integral triangle inequality: Z Z Z −2πix·ξ −2πix·ξ e f(x) dx ≤ |e f(x)| dx = |f(x)| dx. Rd Rd Rd Though useful in some contexts, this estimate is crude because it ignores any oscillation coming from the phase e−2πix·ξ. For example, if the function f is not integrable, estimating in this way is useless. Oftentimes, taking advantage of phase oscillation and cancellation is a more effective way to bound the size of an integral like the Fourier transform. More generally, we define two different notions of oscillatory integrals.

Definition 8.1. An oscillatory integral of the first kind is an integral of the form Z I(λ) = eiλφ(x) ψ(x) dx Rd d d where φ : R → R, ψ : R → C, and λ > 0. The main goal of this chapter is to understand the asymptotic behavior of these inte- grals as λ → ∞. We will eventually see applications of this to various PDEs. Oscillatory integrals of the first kind will be our only focus, but we provide the follow- ing definition for the sake of completeness.

Definition 8.2. An oscillatory integral of the second kind is an operator of the form Z iλφ(x,y) (Tλf)(x) = e K(x, y) f(y) dy Rd d d d d d where φ : R × R → R, K : R × R → C, f : R → C, and λ > 0.

In this case, it is desirable to understand the asymptotic behavior of kTλkop as λ → ∞. We refer to [10] for more details on oscillatory integrals of the second kind, as well as the other material in this chapter.

79 8.1 Oscillatory Integrals of the First Kind, d = 1

We begin by studying oscillatory integrals of the first kind in one . In this setting, the theory is completely understood and admits a simpler presentation. There are three main techniques presented in this section: nonstationary phase, the van der Corput lemma, and stationary phase. We begin with nonstationary phase. Roughly, the principle of nonstationary phase says that if the phase eiλφ(x) is nonstationary in the sense that φ0(x) 6= 0, then the oscillatory integral I(λ) is rapidly decaying. In other words, integrating a wildly varying phase leads to cancellation, and hence decay. This should make sense from the perspective of adding sinusoidal waves: if the waves have similar phases, their sum amplifies, whereas summing waves with different phases leads to cancellation in amplitude. More directly, the principle of nonstationary phase uses the compact support of ψ to- gether with integration by parts to achieve decay up to any order. Proposition 8.3 (Nonstationary phase). Let φ : R → R and ψ : R → C be smooth with ψ compactly supported in (a, b). Assume φ0(x) 6= 0 for all x ∈ [a, b]. Then Z b iλφ(x) −N e ψ(x) dx .N λ (8.1) a for any N ≥ 0. Here the implicit constant does not depend on a and b. Proof. As remarked, the proof of this proposition is simply repeated applications of inte- gration by parts. The details are as follows. 0 iλφ(x) 1 d iλφ(x) 1 d iλφ(x) Because φ (x) 6= 0, we have e = iλφ0(x) dx e . Let D := iλφ0(x) dx . Then e = DN eiλφ(x) for any N ≥ 0. To integrate by parts with D, we need to compute its adjoint. For any smooth f, normal integration by parts gives Z Z 1 df Z d  1  Df(x) ψ(x) dx = 0 (x) ψ(x) dx = − f(x) 0 ψ(x) dx. R R iλφ (x) dx R dx iλφ (x) Here, the boundary terms vanish because of the compact support of ψ. Thus, the adjoint of D is given by d  1  tDf(x) := − f(x) . dx iλφ0(x) We have Z b Z b Z b eiλφ(x)ψ(x) dx = DN (eiλφ(x))ψ(x) dx = eiλφ(x) (tD)N ψ(x) dx a a a Z b  N iλφ(x) d 1 = e − 0 (ψ)(x) dx. a dx iλφ (x) Therefore,

Z b Z b  d 1 N eiλφ(x)ψ(x) dx ≤ λ−N (ψ)(x) dx. 0 a a dx φ (x) Inside this integral, the derivative can hit either the φ0(x) in the denominator or the ψ(x). So Z b  d 1 N (ψ)(x) dx 0 a dx φ (x) is some finite constant which depends on N + 1 derivatives of φ(x) and N derivatives of ψ(x). Thus, Z b iλφ(x) −N e ψ(x) dx .N λ a

80 for any N ≥ 0, where the implicit constant depends on N + 1 derivatives of φ(x) and N derivatives of ψ(x). If ψ does not have compact support, the best decay in λ we can expect is λ−1. For example, when ψ(x) = 1 and φ(x) = x, Z b iλb iλa iλx e − e −1 e dx = . λ . a iλ Summarizing our analysis so far, we have the following informal heuristics:  control of φ0 ⇒ I(λ) decays like λ−1 whereas  0   control of φ  & ⇒ I(λ) decays to any order.  compact support of ψ  The first heuristic is suggested by the preceding example, and the second heuristic is the principle of nonstationary phase. Here, “control” means bounding away from 0. As φ determines the phase oscillation of I(λ), these ideas suggest that a lot of oscillation leads to a lot of cancellation, hence decay. The next principle is a generalization of the first heuristic, which analyses the decay of I(λ) if we only have control of the k-th derivative of φ, rather than φ0. (k) Lemma 8.4 (van der Corput). Let φ : R → R be smooth. Fix k ≥ 1, and assume that |φ (x)| ≥ 1. If k = 1, assume further that φ0 is monotone. Then Z b iλφ(x) − 1 e dx .k λ k (8.2) a where the implicit constant does not depend on a or b. Proof. We proceed by induction on k. Consider the k = 1 case. Because φ0 6= 0, b b Z Z 1 d h 0 i eiλφ(x) dx = eiλφ (x) dx. a a iλφ(x) dx Integrating by parts (with boundary terms!) gives Z b Z b   iλφ(b) iλφ(a) iλφ(x) d 1 iλφ(x) e e e dx = − 0 e dx + 0 − 0 a a dx iλφ (x) iλφ (b) iλφ (a) " iφ(b) iλφ(a) # Z b   1 e e 1 iλφ(x) d 1 = 0 − 0 − e 0 dx. λ iλφ (b) iφ (a) iλ a dx φ (x) Thus, Z b Z b iλφ(x) 1 1 d 1 e dx . + 0 dx. a λ λ a dx φ (x)

φ0 d 1 = ± d 1 Because is monotone, dx φ0(x) dx φ0(x) . So we can estimate further by Z b Z b iλφ(x) 1 1 d 1 e dx . + 0 dx a λ λ a dx φ (x) Z b 1 1 d 1 = + 0 dx λ λ a dx φ (x)

1 1 1 1 = + − λ λ φ0(b) φ0(a) 1 . . λ

81 This establishes the base case k = 1. Next, assume by induction that the desired estimate holds for some k ≥ 1. Suppose that |φ(k+1)(x)| ≥ 1. By replacing φ with −φ if necessary, we may assume without loss of generality that φ(k+1)(x) ≥ 1. Then φ(k) is strictly increasing, and consequently there is at most one point c ∈ [a, b] such that φ(k)(c) = 0. Case 1. First, consider the case where there is such a c. Since φ(k+1) ≥ 1, |φ(k)(x)| ≥ δ for all x of distance at least δ from c, i.e., for all x ∈ [a, b] \ (c − δ, c + δ). Close to c, we have the trivial estimate Z iλφ(x) e dx ≤ 2δ. [a,b]∩(c−δ,c+δ) Away from c, we estimate as follows. Because |φ(k)(x)| ≥ δ on the complement of (c − − 1 δ, c + δ), we can rescale φ to use the inductive hypothesis. Let φδ(x) := φ(δ k x). Then (k) 1 (k) −1 (k) − k 1 φδ (x) = δ φ (δ x) and hence |φδ (x)| ≥ 1 outside of (c − δ, c + δ). Thus,

− 1 − 1 k k Z c−δ Z δ (c−δ) − 1 Z δ (c−δ) iλφ(x) − 1 iλφ(δ k y) − 1 iλφ (y) e dx = δ k e dy = δ k e δ dy − 1 − 1 k k a δ a δ a − 1 − 1 . δ k λ k .

R b iλφ(x) − 1 − 1 e dx δ k λ k The same computation gives c+δ . . Combining these estimates with the first trivial estimate yields

Z b iλφ(x) − 1 − 1 e dx . δ + δ k λ k a

− 1 − 1 − 1 for any δ > 0. We optimize the choice of δ by requiring δ = δ k λ k , and thus δ = λ k+1 . This gives Z b iλφ(x) − 1 e dx .k λ k+1 a as desired. Case 2. Next, suppose that no such c exists, i.e., φ(k)(x) 6= 0 for all x ∈ [a, b]. It is either the case that φ(k)(a) > 0 or φ(k)(a) < 0. Suppose first that φ(k)(a) > 0. Arguing as before, because φ(k+1)(x) ≥ 1 we know that φ(k)(x) ≥ δ for all x ∈ (a + δ, b]. Then

Z b Z a+δ Z b iλφ(x) iλφ(x) iλφ(x) e dx ≤ e dx + e dx . a a a+δ

The first quantity is trivially bounded by δ. In the second integral, we rescale by δ and use R b iλφ(x) − 1 − 1 e dx δ k λ k the inductive hypothesis exactly as before to get a+δ . . Optimizing in − 1 δ by choosing δ = λ k+1 gives

Z b iλφ(x) − 1 e dx .k λ k+1 a as desired. Suppose now that φ(k)(a) < 0. Then φ(k)(b) < 0. The argument from here is analogous, this time decomposing [a, b] into [a, b − δ] and (b − δ, b].

1The following integral is understood to be 0 if c − δ ≤ a.

82 If k = 1, then the monotonicity assumption on φ0 is in fact needed. Note that

Z b Z b Z b iφ(x) e dx = cos(φ(x)) + i sin(φ(x)) dx ≥ sin(φ(x)) dx . a a a

It is possible to construct a function φ such that φ0(x) is large when sin(φ(x)) is small, and φ0(x) is small when sin(φ(x)) is large. Consequently,

| { x : sin φ(x) } | << | { x : sin φ(x) > 0 } | and Z b

sin(φ(x)) dx → ∞ a as |a|, |b| → ∞. We leave the details to the reader. Finally, we come to the principle of stationary phase. By the principle of nonstationary phase, I(λ) decays as rapidly as we like when φ0 6= 0. Thus, it remains to consider what happens near critical points of φ, i.e., where the phase is not oscillating rapidly. The following proposition gives a complete description of the asymptotic behavior of I(λ) near a well-behaved critical point.

Proposition 8.5 (Stationary phase). Let φ : R → R be smooth, and assume that φ has a nonde- 2 3 generate critical point at x0. If ψ is smooth and supported in a sufficiently small neighborhood neighborhood of x0, then Z I(λ) = eiλφ(x)ψ(x) dx R satisfies

h√ 1 π 00 i 1  3  00 − 2 i sgn φ (x0) iλφ(x0) − − I(λ) = 2πi φ (x0) e 4 e ψ(x0) λ 2 + O λ 2 (8.3) as λ → ∞.

This statement of the principle of stationary phase gives an explicit constant in (8.3) that will prove to be useful. Before providing the proof of this fact, we can easily derive the general behavior if we ignore the explicit constant. ∞ In particular, let a ∈ Cc (R) be a smooth bump function satisfying a(x) = 1 for |x| ≤ 1 and a(x) = 0 for |x| ≥ 2. We use this bump function to localize close to and far away from x0 as follows: Z Z iλφ(x)  1  iλφ(x) h  1 i |I(λ)| ≤ e ψ(x) a λ 2 (x − x0) dx + e ψ(x) 1 − a λ 2 (x − x0) dx . R R

− 1 Consider the first integral. The support of the rescaled a function is λ 2 , so ignoring any oscillation and using the triangle inequality gives Z iλφ(x)  1  − 1 e ψ(x) a λ 2 (x − x0) dx . λ 2 . R

 1  Consider the second integral. As ψ(x) has compact support and 1 − a λ 2 (x − x0) has 0 h  1 i support away from x0, φ 6= 0 on the support of ψ(x) 1 − a λ 2 (x − x0) . We would like to use the principle of nonstationary phase to estimate this term, but because the function

2 0 00 That is, φ (x0) = 0 and φ (x0) 6= 0. 3The precise nature of sufficiently small will be clear from the proof. Primarily, the support of ψ must not contain any other critical points of φ.

83 h  1 i ψ(x) 1 − a λ 2 (x − x0) has support which scales with λ, the implicit constant in (8.1) 1 depends on λ. However, as a is rescaled by a factor of λ 2 , every iteration of integration by −1 1 − 1 parts in the proof of nonstationary phase picks up a a factor of λ · λ 2 = λ 2 . Thus, Z iλφ(x) h  1 i − N e ψ(x) 1 − a λ 2 (x − x0) dx . λ 2 R  − 3  for any order N. We can certainly choose N large enough so that this term is O λ 2 . Thus, − 1  − 3  |I(λ)| . λ 2 + O λ 2 as desired. Now we provide the full proof, keeping track of the explicit constants.

0 00 Proof of Proposition 8.5. Because φ (x0) = 0 and φ (x0) = 0, φ is well-approximated by its order-2 Taylor expansion near x0. Expanding gives us 1 φ(x) = φ(x ) + (x − x )φ0(x ) + (x − x )2φ00(x ) + O(|x − x |3) 0 0 0 2 0 0 0 00 2 = φ(x0) + φ (x0)(x − x0) (1 + η(x)) where η : R → R is smooth and η(x) = O(|x − x0|). Let U be an open neighborhood of x0, chosen sufficiently small enough so that |η(x)| < 1 for all x ∈ U, and φ(x) 6= 0 for all x 6= x0 ∈ U. Suppose that the support of ψ is contained in U. p Change variables via y = (x − x0) 1 + η(x). This is a diffeomorphism from U to a small neighborhood of y = 0. Then Z iλ φ(x )+ 1 φ00(x )(x−x )2(1+η(x)) I(λ) = e [ 0 2 0 0 ]ψ(x) dx R Z iλφ(x ) i 1 φ00(x )λy2 = e 0 e 2 0 ψ˜(y) dy R ˜ ∞ where ψ ∈ Cc (R) is the appropriately transformed function under the indicated diffeo- morphism. Note that ψ˜ has support contained the the small neighborhood surrounding ˜ 1 00 ∞ y = 0. Let λ := 2 φ (x0)λ and let γ ∈ Cc (R) be a function which is identically 1 on the support of ψ˜. Then

Z ˜ 2 I(λ) = eiλφ(x0) eiλy ψ˜(y) dy R Z ˜ 2 2 h 2 i = eiλφ(x0) eiλy e−y ey ψ˜(y) γ(y) dy. R 2 Taylor expand ey ψ˜(y) by N y2 ˜ X j N+1 e ψ(y) = ajy + y RN (y) =: P (N) j=0 where RN is smooth and bounded. With this, we decompose I(λ) as follows:

N Z 2 2 iλφ(x0) X iλy˜ −y j I(λ) = e aj e e y dy j=0 R

Z 2 2 iλφ(x0) iλy˜ −y N+1 + e e e y RN (y)γ(y) dy R Z ˜ 2 2 + eiλφ(x0) eiλy e−y P (y)[γ(y) − 1] dy. R

84 Let (1), (2), and (3) denote the first, second, and third lines of this decomposition, respec- tively. 2 Consider (3) first. Note that e−y P (y)[γ(y) − 1] is supported outside of a neighborhood of y = 0. As (y2)0 = 2y 6= 0 away from 0, by the principle of nonstationary phase we have ˜ −m |(3)| . |λ| for any order m. ∞ Next, we estimate (2). Let a ∈ Cc (R) be a smooth bump function satisfying a(x) = 1 for |x| ≤ 1 and a(x) = 0 for |x| ≥ 2. For any ε > 0, we have a decomposition

Z 2 2 iλφ(x0) iλy˜ −y N+1 (2) = e e e y RN (y)γ(y)ϕ(y/ε) dy R Z 2 2 iλφ(x0) iλy˜ −y N+1 + e e e y RN (y)γ(y) [1 − ϕ(y/ε)] dy. R Label the first line of this decomposition (2i), and the second line (2ii). We will estimate each term separately and then optimize in ε. Because the support of ϕ(y/ε) is small (of size ∼ ε, when ε is small we do not expect much oscillation from the phase in this integral. So −y2 to estimate (2i) we simply use the triangle inequality, boundedness of RN (y) and e , and the support of ϕ(y/ε) to get Z N+1 N+1 N+2 |(2i)| . y ϕ(y/ε) dy . ε ε = ε . R

˜ 2 To estimate (2ii) we use the nonstationary oscillation of the phase eiλy . Because we need to keep track of the ε’s, we will integrate by parts directly instead of applying our previous −y2 nonstationary phase result. To simplify notation, let X(y) := e RN (y)γ(y). Define the differential operator D by 1 d D := 2iλy˜ dy

˜ 2 ˜ 2 so that Deiλy = eiλy . Then the adjoint is given by

d  1  (tDf)(y) = − f(y) . dy 2iλy˜

Thus,

Z 2 2 iλφ(x0) iλy˜ −y N+1 (2ii) = e e e y RN (y)γ(y) [1 − ϕ(y/ε)] dy R Z ˜ 2 = eiλφ(x0) Dm[eiλy ] yN+1X(y) [1 − ϕ(y/ε)] dy R Z ˜ 2 = eiλφ(x0) eiλy (tD)m yN+1X(y) [1 − ϕ(y/ε)] dy. R We carefully analyze the (tD)m yN+1X(y) [1 − ϕ(y/ε)] term. For clarity, consider a single application of tD:

d  1  tD yN+1X(y) [1 − ϕ(y/ε)] = − yN+1X(y) [1 − ϕ(y/ε)] . dy 2iλy˜

By applying tD to yN+1X(y) [1 − ϕ(y/ε)] m times, we pick up m copies of y in the denominator. Moreover, for each application of tD, the derivative can either hit the y in the denominator, the yN+1 term, X(y), or 1 − ϕ(y/ε). Let k denote the number of times the derivative hits the y denominator, let α1 denote the number of times the deriva- N+1 tive hits the y , and let α2 the number of times the derivative hits 1 − ϕ(y/ε). As

85 −y2 X(y) = e RN (y)γ(y), all of its derivatives are bounded and hence uninteresting. Ap- plying the integral triangle inequality to (2ii) and summing over all possible combinations of derivations described above, we have

m 1 X X Z |y|N+1−α1 ε−α2 |(2ii)| dy. . m m+k |λ˜| |y|≥ε |y| k=0 α1+α2≤m−k

The ≤ sign in the second sum is to account for the fact that we are ignoring the derivatives of X(y). The region of integration is due to the support of 1 − ϕ(y/ε). Continuing the estimate, m Z ˜ −m X X −α2 N+1−α1−(m+k) |(2ii)| . |λ| ε |y| dy. |y|≥ε k=0 α1+α2≤m−k

The most singular integral in this sum corresponds to α1 = 0 and k = 0. This partic- ular integral is finite provided N + 1 − m < −1, i.e., N + 2 < m, and is bounded by εN+2−(α1+α2)−(m+k). So, as long as m is chosen so that N + 2 < m, we have

m ˜ −m X X N+2−(α1+α2)−(m+k) |(2ii)| . |λ| ε . k=0 α1+α2≤m−k

Because ε is small, the summand is majorized when α1 + α2 = m − k. Thus,

m m ˜ −m X X N+2−(m−k)−(m+k) ˜ −m X X N+2−2m |(2ii)| . |λ| ε = |λ| ε k=0 α1+α2≤m−k k=0 α1+α2≤m−k ˜ −m N+2−2m . |λ| ε .

We now optimize in ε. Choose ε so that the (2i) and (2ii) estimates are equal:

N+2 −m N+2−2m − 1 ε = |λ˜| ε ⇒ ε = |λ˜| 2 .

Then ˜ − N+2 |(2)| . |λ| 2 . This estimate is acceptable for N ≥ 1. It remains to estimate (1):

N Z 2 2 iλφ(x0) X iλy˜ −y j (1) = e aj e e y dy. j=0 R

Note that a0 = ψ˜(0) = ψ(x0). For j ≥ 0,

Z ˜ 2 2 Z ˜ 2 eiλy e−y yj dy = e−(1−iλ)y yj dy. R R

1 Let z = (1 − iλ˜) 2 . Then

Z ˜ 2 j 1 Z 2 −(1−iλ)y j ˜ − 2 − 2 −z j e y dy = (1 − iλ) 1 e z dz R (1−iλ˜) 2 R

− j − 1 where we have chosen a branch of the function z 2 2 with cut along the negative real axis. 2 Using the decay of e−z to shift the contour, we have Z Z iλy˜ 2 −y2 j − j − 1 −y2 j e e y dy = (1 − iλ˜) 2 2 e y dy R R

86 2 When j is odd, the function e−y yj is odd, and consequently the above integral is 0. When − j+1 j ≥ 2 is even, the asymptotic contribution of the above integral is |λ˜| 2 . As j ≥ 2, this − 3 quantity is O(λ 2 ) and hence is acceptable. It remains to consider the contribution of the ˜ iθ p ˜2 −λ˜ ˜ j = 0 term. Write (1 − iλ) = re . Then r = 1 + λ and tan θ = 1 = −λ. Recall that 2 √ R e−y dy = π. Thus, the contribution of the j = 0 term to (1) is therefore R

1 √ 1 θ iλφ(x0) − iλφ(x0) − −i e a0(1 − iλ˜) 2 π = e ψ(x0)r 2 e 2 .

˜ π ˜ π ˜ 1 00 As λ → ∞, θ → − 2 , and as λ → −∞, θ → 2 . Recall that λ = 2 φ (x0)λ. Thus, as λ → ∞ −i θ i π sgn φ00(x ) we have e 2 → e 4 0 . Thus, the contribution to (1) is   1 √ 2 2 π 00 iλφ(x0) i 4 sgn φ (x0) e ψ(x0) π 00 e . λ|φ (x0)|

Careful inspection shows that this is precisely the first term of (8.3), which completes the proof.

8.2 The Morse Lemma

The remainder of this chapter is dedicated to proving the principles of nonstationary and stationary phase in higher dimensions. In the proof of stationary phase, it will be useful to change variables in a particularly nice way, as described by the following lemma.

d Lemma 8.6 (Morse). Suppose that φ : R → R is smooth with a nondegenerate critical point at d ∂y x0 ∈ R . There is a change of coordinates x 7→ y(x) such that y(x0) = 0, ∂x (x0) = Id, and 1 1 φ(x) − φ(x ) = λ y2 + ··· λ y2, 0 2 1 1 2 d d 2 where λ1, . . . , λd are the eigenvalues of D φ(x0). We have dedicated an entire section to the Morse lemma, as its proof is technical and does not strictly involve any harmonic analysis or techniques from oscillatory integrals. The higher dimensional analogues of nonstationary and stationary phase are contained in the next section, and the reader may be inclined to skip ahead.

2 Proof. Because D φ(x0) is symmetric, we may change coordinates if necessary and assume 2 without loss of generality that D φ(x0) = diag (λ1, . . . , λd). By Taylor expansion and inte- gration by parts, we may write

Z 1 d2 φ(x) = φ(x0) + (x − x0)∇φ(x0) + (1 − t) 2 [φ(x0 + t(x − x0))] dt 0 dt d Z 1 X ∂2φ = φ(x ) + (1 − t) (x − x ) (x − x ) (x + t(x − x )) dt. 0 0 i 0 j ∂x ∂x 0 0 0 i,j=1 i j

Pd So φ(x) − φ(x0) = i,j=1(x − x0)i(x − x0)jµij(x), where

Z 1 ∂2φ µij(x) = (1 − t) (x0 + t(x − x0)) dt. 0 ∂xi ∂xj

Since φ is smooth, µij is smooth. Also, µij = µji and

1 ∂2φ µij(x0) = (x0). 2 ∂xi ∂xj

87 From here, we argue by induction. Suppose that we have found a change of variables ∂y x 7→ y(x) such that y(x0) = 0, ∂x (x0) = I, and

d 1 1 X φ(x) − φ(x ) = λ y2 + ··· λ y2 + y y µ˜ (y). 0 2 1 1 2 r−1 r−1 i j ij i,j≥r

For the base case, we take r = 1, y = x − x0, and µ˜ij(y) = µij(x). Next, we perform some computations assuming the above decomposition. First,

2   2 ∂ 1 2 ∂yk ∂yk ∂ yk λkyk = λk · + λk · yk . ∂xi ∂xj 2 ∂xi ∂xj ∂xi ∂xj x=x0 x0 x0 x0 x0

∂y The second summand is 0 because yk(x0) = 0. Because ∂x (x0) = I by assumption, we are left with 2   ∂ 1 2 λkyk = λkδijδik. ∂xi ∂xj 2 x=x0 2 As D φ(x0) = diag (λ1, . . . , λd), we get   d 2 X D  yiyjµ˜ij(y) = diag (λr, . . . , λd). i,j≥r x=x0

Note that   2 d ! ∂ X X ∂yi ∂yj ∂yi ∂yj  yiyjµ˜ij(y) = · + · µ˜ij(x = x0) ∂xl ∂xk ∂xl ∂xk ∂xk ∂xl i,j≥r x=x0 i,j≥r x0 x0 x0 x0 X = (δilδjk + δikδjl)˜µij(0) i,j≥r

= 2m ˜ lk(0).

This implies that µ˜ij(0) = diag (λr, . . . , λd). Next, perform the change of variables y 7→ y0(y) defined by

 0 yj := yj for j 6= r; q   . y0 := µ˜rr(y) y + P µ˜jr(y) y  r λr/2 r j≥r+1 µ˜rr(y) j

0 ∂y0 We need to show that this diffeomorphism satisfies y (x0) = 0, ∂x (x = x0) = I, and

d 1 1 X φ(x) − φ(x ) = λ (y0 )2 + ··· λ (y0 )2 + y0y0 µ˜0 (y). 0 2 1 1 2 r r i j ij i,j≥r+1

0 0 for some functions µ˜ij(y). It is clear that y (x0) = 0, since yk(x0) = 0 for all k. next, we have   0 s ∂yr µ˜rr(y(x0)) X = δir + δij · 0 = δir ∂xi λr/2 x0 j≥r+1

88 ∂y0 and so ∂x (x = x0) = I. Finally,

d X 1 y y µ˜ (y) − λ (y0 )2 i j ij 2 r r i,j≥r 2 d     X 2 X µ˜jr(y) X µ˜jr(y) = yiyjµ˜ij(y) − m˜ rr(y) yr + 2yr yj +  yj  µ˜rr(y) µ˜rr(y) i,j≥r j≥r+1 j≥r+1   X µ˜ir(y)˜µjr(y) = yiyj µ˜ij(y) − . µ˜rr(y) i,j≥j+1

µ˜0 =µ ˜ (y) − µ˜ir(y)˜µjr(y) Setting ij ij µ˜rr(y) gives the desired conclusion.

8.3 Oscillatory Integrals of the First Kind, d > 1

d In this section we consider the asymptotics of oscillatory integrals on R . We begin with the principle of nonstationary phase. The statement and proof is essentially the same as in the one-dimensional case. d d Proposition 8.7 (Nonstationary phase). Let φ : R → R and ψ : R → C be smooth. Assume that ψ has compact support, and that ∇φ 6= 0 on the support of ψ. Then Z iλφ(x) −N e ψ(x) dx . λ Rd for any N ≥ 0. Proof. As in the one-dimensional case, we simply use integration by parts to exploit the phase oscillation. Because ∇φ 6= 0 on the support of ψ, we have ∇φ(x) eiλφ(x) = ∇eiλφ(x). iλ|∇φ(x)|2 Then Z Z iλφ(x) ∇φ(x) iλφ(x) I(λ) = e ψ(x) dx = 2 ∇e ψ(x) dx Rd Rd iλ|∇φ(x)| Z   1 iλφ(x) ∇φ(x) = − e ∇ 2 ψ(x) dx. iλ Rd iλ|∇φ(x)| −1 Thus, |I(λ)| . λ , where the implicit constant depends on two derivatives of φ and one derivative of ψ. Iterating integration by parts as before gives decay of |I(λ)| to any order.

Next, we consider the principle of stationary phase. The one-dimensional principle of − 1 stationary phase described how I(λ) decays like λ 2 near a nondegenerate critical point of − d the phase. In the higher-dimensional case, the decay is like becomes λ 2 . The proof is structurally the same as in the one-dimensional case, with some slightly more involved computations. d d Proposition 8.8 (Stationary phase). Let φ : R → R and ψ : R → C be smooth. Assume that d φ has a nondegenerate critical point at x0 ∈ R , and that ψ is supported in a sufficiently small R iλφ(x) neighborhood of x0. Then I(λ) = d e ψ(x) dx satisfies R  d  d Y − 1 d  d 1  iλφ(x0) 2 2 − 2 − 2 − 2 I(λ) = e ψ(x0)(2πi) µj  λ + O λ (8.4) j=1

89 2 as λ → ∞, where µ1, . . . , µd are the eigenvalues of D φ(x0).

Proof. By the Morse lemma, there is a change of variables x 7→ y(x) such that y(x0) = 0, ∂y ∂x (x0) = Id, and 1 1 φ(x) − φ(x ) = µ y2 + ··· + µ y2. 0 2 1 1 2 d d With this change of variables, we have Z iλφ(x ) iλ Pd 1 µ y2 I(λ) = e 0 e j=1 2 j j ψ˜(y) dy where ψ˜(y) absorbs the Jacobian from the change of variables. As in the proof of the one- ∞ d ˜ dimensional case, let γ ∈ Cc (R ) be identically 1 on the support of ψ. Then Z iλφ(x ) iλ Pd 1 µ y2 −|y|2 h |y|2 i I(λ) = e 0 e j=1 2 j j e e ψ˜(y) γ(y) dy.

By Taylor expansion, we can write

|y|2 ˜ X α X β X β e ψ(y) = cαy + y Rβ(y) =: P (y) + y Rβ(y). |α|≤N |β|=N+1 |β|=N+1

As in the one-dimensional case, we then have a decomposition

Z Pd 1 2 2 iλφ(x0) X iλ µj y −|y| α I(λ) = e cα e j=1 2 j e y dy |α|≤N

Z Pd 1 2 2 iλφ(x0) X iλ µj y −|y| β + e e j=1 2 j e y Rβ(y)γ(y) dy |β|=N+1 Z iλφ(x ) iλ Pd 1 µ y2 −|y|2 + e 0 e j=1 2 j j e P (y)(γ(y) − 1) dy.

Let (1), (2), and (3) denote the first, second, and third lines of this decomposition, respec- tively. Estimating (1). For |α| ≤ N, we have

d iλµ2 Z d 1 Z j P 2 2 Y 2 2 αj iλ j=1 2 µj yj −|y| α 2 yj −yj e e y dy = e e yj dyj j=1 α − 1 − j d iλµ2 ! 2 2 Z Y j −y2 αj = 1 − e j y dy . 2 j j j=1

For |α| ≥ 2, this computation gives Z iλ Pd 1 µ y2 −|y|2 α  − d − |α|   − d −1 e j=1 2 j j e y dy = O |λ| 2 2 = O λ 2

2 α R −yj j as λ → ∞. When |α| = 1, the contribution is 0 because e yj dyj = 0 when αj is odd. iλµ2 q λ2µ2 j iθj j λµj When α = 0, write 1 − 2 = rje where rj = 1 + 4 and tan θj = − 2 . As λ → ∞, π θj → − 2 sgn µj. Thus,

d d d 1 θ  − 2 Y − −i j λ Y i π sgn µ − 1 r 2 e 2 → e 4 j (π|µ |) 2 j 2 j j=1 j=1

90 as λ → ∞. Finally, observe that c0 = ψ(x0). Rewriting the above constant and combining the estimates thus far gives

 d  d Y − 1 d  d 1  iλφ(x0) 2 2 − 2 − 2 − 2 (1) = e ψ(x0)(2πi) µj  λ + O λ . j=1

Estimating (2). ∞ d Let a ∈ Cc (R ) satisfy a(x) = 1 for |x| ≤ 1 and a(x) = 0 for |x| ≥ 2. Let Xβ(y) = −|y|2 e Rβ(y)γ(y). For ε > 0 (to be determined later), we have

Z Pd 1 2 iλφ(x0) X iλ µj y β (2) = e e j=1 2 j y Xβ(y)a(y/ε) dy |β|=N+1

Z Pd 1 2 iλφ(x0) X iλ µj y β + e e j=1 2 j y Xβ(y) [1 − a(y/ε)] dy. |β|=N+1

Let (2i) and (2ii) denote the first and second lines of this decomposition, respectively. As in the one-dimensional case, we do not expect any cancellation in (2i) because of the small support of a(·/ε), thus we estimate (2i) crudely by Z N+1 N+1+d |(2i)| . |y| dy . ε . |y|≤2ε Next we consider (2ii). We will effectively use nonstationary phase on this integral, but we have to be explicit with the integration by parts because we are optimizing our choice of ε. Let  |x|2  Γ = x ∈ d : x > . j R j d d Sd Then R = j=1 Γj. Fatten the regions as follows:  |x|2  Γ˜ = x ∈ d : x > . j R j 2d

We will define a particular partition of unity subordinate to this collection of sets. Let d−1 d ˜ d−1 ϕ˜j : S ⊆ R → R be functions with support in Γj ∩ S satisfying ϕ˜j(x) = 1 for d−1 ϕ˜j d x ∈ Γj ∩ S . Set ϕj = Pd . Finally, extend ϕj to all of R by setting ϕj(x) = ϕj(x/|x|). j=1 ϕ˜j Pd 1 2 Next, let Q(y) = j=1 2 µjyj . Then ∇Q(y) = (µ1y1, . . . , µdyd), and hence Q has one critical point at y = 0. We have

d Z iλφ(x0) X X iλQ(y) β (2ii) = e e y Xβ(y) [1 − a(y/ε)] ϕj(y) dy. j=1 |β|=N+1

Note that 1 ∂ eiλQ(y) = eiλQ(y). iλµjyj ∂yj The computation from here is similar to the one-dimensional case. Integrating by parts m 2 times via ∂ in each set Γ˜ , and using the fact that y ≥ |y| in Γ˜ , we compute as follows. ∂yj j j d j Let k denote the number of times the adjoint derivative hits the yi in the denominator of the above fraction. Then m 1 Z |y|N+1−|α1| X X −|α2| |(2ii)| . m m+k ε dy. λ |y|≥ε |y| k=0 α1+α2≤m−k

91 This integral is finite provided m + k − N − 1 + |α1| > d, in which case

m 1 X X |(2ii)| εN+1+d−(|α1|+|α2|)−(m−k) . λm k=0 α1+α2≤m−k 1 εN+1+d−2m. . λm To optimize our choice in ε > 0 between (2i) and (2ii), we require 1 εN+1+d = εN+1+d−2m λm

− 1 and thus ε = λ 2 . With this choice,

− d − N+1 |(2)| . λ 2 2 , which is acceptable for N ≥ 1. Estimating (3). Finally, we estimate (3). As γ(y) − 1 is supported away from the origin, and because we don’t need to worry about optimizing between different estimates, we can blindly apply −m the principle of nonstationary phase to get |(3)| . λ for any order m, which is obviously acceptable. This completes the proof.

8.4 The Fourier Transform of a Surface Measure

We end this chapter by applying the theory of oscillatory integrals to the surface measure d−1 d on the unit sphere S ⊆ R . This will have applications when we study dispersive PDEs in Chapter9, and also when we study restriction theory in Chapter 10. d−1 d Let dσ denote the surface measure supported on S ⊆ R . Define the inverse Fourier d transform of the measure dσ to be the function on R given by Z ˇ 1 ix·ξ dσ(x) := d e dσ(ξ). (2π) 2 Sd−1

This function satisfies the following decay property.

d−1 d ˇ − d−1 Lemma 8.9. If dσ is the surface measure on S ⊆ R , then |dσ(x)| . hxi 2 . Proof. First, observe that Z dσˇ (0) = dσ(ξ) ∼ 1. Sd−1 ˇ − d−1 Thus, to prove the lemma it suffices to prove |dσ(x)| . |x| 2 . As dσ is rotation invariant, dσˇ is also rotation invariant. In particular,

dσˇ (x) = dσˇ ((0,..., 0, |x|)) Z ∼ ei|x|ξd dσ(ξ). Sd−1 Let θ denote the angle between ξ and the north pole (0,..., 0, 1). Changing variables via spherical coordinates gives Z π dσˇ (x) ∼ ei|x| cos θ sind−2(θ) dθ. 0

92 d i|x| cos θ i|x| cos θ Note that dθ e = e (−i|x| sin θ). Thus, if d ≥ 4 we can integrate by parts to get Z π d h i 1 dσˇ (x) ∼ − ei|x| cos θ sind−2(θ) dθ 0 dθ i|x| sin θ 1 Z π d h i = − ei|x| cos θ sind−3(θ) dθ i|x| 0 dθ π 1 d − 3 Z π i|x| cos θ d−3 i|x| cos θ d−4 = − e sin (θ) + e sin (θ) cos θ dθ. i|x| i|x| 0 0 The boundary terms are 0, since sin 0 = sin π = 0. So if d ≥ 4, one iteration of integration by parts gives Z π dσˇ (x) ∼ |x|−1 ei|x| cos θ sind−4(θ) cos θ dθ. 0 In words, one iteration of integration by parts picked up a power of |x|−1, decreased the power of sin θ in the integral by 2, and picked up a power of cos θ. d−2 If d is even, we can integrate by parts 2 times to eliminate the original d − 2 copies of sin θ. Along the way we pick up more copies of sin θ and cos θ from differentiating products of sin θ and cos θ, but the elimination of the original d − 2 copies of sin θ ensures Z π − d−2 i|x| cos θ dσˇ (x) ∼ |x| 2 e P (θ) dθ 0 where P (θ) is a trigonometric polynomial containing at least one term without a sin θ. d−3 Similarly, if d is odd we can integrate by parts 2 times to get Z π − d−3 i|x| cos θ dσˇ (x) ∼ |x| 2 e sin θQ(θ) dθ 0 for some trigonometric polynomial Q(θ). Integrating by parts one more time gives the desired estimate: Z π − d−3 −1 d h i|x| cos θi dσˇ (x) ∼ |x| 2 |x| e Q(θ) dθ 0 dθ − d−1 ∼ |x| 2 .

∞ ∞ Thus, it remains to consider when d is even. Let a1 ∈ Cc (−ε, ε), a2 ∈ Cc (ε/2, π − ε/2), ∞ and a3 ∈ Cc (π − ε, π + ε) satisfy a1 + a2 + a3 = 1 on (0, π). In the support of a2, the phase φ(θ) = cos θ has no critical points, so nonstationary phase gives Z π i|x| cos θ −1 e P (θ)a2(θ) dθ . |x| . 0

On the support of a1, we have Z π Z ε i|x| cos θ i|x| cos θ e P (θ)a2(θ) dθ = e P (θ)a2(θ) dθ. 0 0 The phase cos θ has one nondegenerate critical point at 0. The principle of stationary phase − 1 gives a contribution of O(|x| 2 ), and similarly on the support of a3. Thus,

ˇ − d−2 − 1 − d−1 dσ(x) . |x| 2 |x| 2 = |x| 2 as desired. The lower dimensional cases are straightforward exercises.

93 Chapter 9

Dispersive Partial Differential Equations

ADD INTRODUCTION The most important examples of dispersive partial differential equations are the nonlin- ear Schr¨odingerequation i∂tu + ∆u = F (u, Du) (9.1) and the Korteweg-de Vries (KdV) equation

ut + uxxx + γuux = 0. (9.2)

d d In (9.1), u : Rx ×Rt → C or u : Tx ×Rt → C, and in (9.2), u : Rx ×Rt → R or u : Tx ×Rt → R. These equations arise in a number of physical contexts and are an active area of research. FINISH

9.1 Dispersion

What does it mean for a partial differential equation to be dispersive? Informally, an equa- tion is dispersive if its solutions are waves whose spatial profiles spread out over time, if no boundary conditions are imposed. This heuristic is nice, but is not mathematically rigorous in any way. The goal of this section is to give a (mostly) formal definition of dispersion. A slightly more tractable way of identifying a dispersive equation is via the frequency support of its fundamental solution. Consider the general linear evolution equation

∂tu = iP (D)u. (9.3)

d Here, u : Rx × Rt → C is a complex-valued function of space and time, and P (D) is a linear differential operator with Fourier symbol 2πϕ(ξ). Taking the space-time Fourier transformation of (9.3) gives 2πiτuˆ(ξ, τ) = 2πiϕ(ξ)ˆu(ξ, τ). Rearranging and dividing by 2πi gives [τ − ϕ(ξ)]u ˆ(ξ, τ) = 0. (9.4) This equation tells us that the space-time Fourier transform of a solution of (9.3) is sup- ported on the surface Σ := { (ξ, τ): τ = ϕ(ξ) }. We may identify an evolution equation of the form (9.3) as dispersive if the surface Σ is curved.1 Indeed, the surface corresponding to the Schrodinger¨ equation (9.1) is a parabaloid, and the surface corresponding to the KdV equation (9.2) is a cubic curve. An example of an equation which is not dispersive in this sense is the transport equation ut − cux = 0. (9.5)

1Here we are being intentionally vague about curvature to avoid any unnecessary excursions into differen- tial geometry.

94 The corresponding surface is the hyperplane {τ = cξ}, which is decidedly not curved. To be more formal, consider the evolution equation

 ∂ u = Lu t . (9.6) u(x, 0) = u0(x)

d d d Here, u : Rx × Rt → V , where V is either C or R , and L is a skew-adjoint constant-coefficient linear differential operator, that is, Z Z hLu(x), v(x)iV dx = − hu(x), Lv(x)iV dx Rd Rd 1 for all functions u and v. With D = i ∇x, we may write

X α Lu = cα∂x u := ih(D)u, |α|≤k

d where h : R → End V is the polynomial

X |α|−1 α h(ξ) ∼π i cαξ . |α|≤k

We refer to the symbol h as the dispersion relation of (9.6). For the linear Schrodinger¨ equa- tion, the dispersion relation is h(ξ) = −|ξ|2, and for the Airy equation (the linear part of the 3 d KdV equation) the dispersion relation is h(ξ) = ξ . For a fixed v ∈ R , the general transport equation ∂tu = −v · ∇xu (9.7) has dispersion relation h(ξ) = −v · ξ. By considering these dispersion relations, we are irrevocably led to the following defi- nition.

Definition 9.1. An evolution equation of the form (9.6) is dispersive if ∇h is not constant.

9.2 The Linear Schr¨odingerEquation

A necessary prerequisite for understanding the NLS equation (9.1) or the KdV equation (9.2) is to develop a solid understanding of their linear counterparts. Thus, we begin with the linear Schr¨odingerequation, given by

 iu + ∆ u t 2 . (9.8) u(x, 0) = u0(x)

d d d Here, u : Rx × Rt → C and u0 : Rx → C. For now, we will assume that u0 ∈ S(R ).

9.2.1 The Fundamental Solution Our first order of business is to compute a concrete representation of a solution to the linear Schrodinger¨ equation. To solve (9.8), we first apply the spatial Fourier transform.2 As ∆ˆ = (iξ) · (iξ) = −|ξ|2, this gives

( |ξ|2 iuˆt(ξ) = 2 uˆ(ξ) uˆ(0, ξ) =u ˆ0(ξ)

2 ˆ 1 R −ix·ξ In this chapter we will adopt the convention that f(ξ) = d d e f(x) dx. (2π) 2 R

95 This is an ODE with respect to t, and can be solved by separation of variables. Indeed, the solution is given on the Fourier side by

2 −i |ξ| t uˆ(t, ξ) = e 2 uˆ0(ξ). (9.9)

Inverting the Fourier transform, we have

1 Z |ξ|2 ix·ξ−i 2 t u(t, x) = d e uˆ0(ξ) dξ (2π) 2 Rd  2  Z x |ξ| 1 it t ·ξ− 2 = d e uˆ0(ξ) dξ. (2π) 2 Rd

9.2.2 Dispersive Estimates 9.2.3 Strichartz Estimates

96 Chapter 10

Restriction Theory

In this chapter, we study one of the most important unsolved problems in harmonic anal- ysis: the restriction problem. Vaguely, the restriction problem asks when one can mean- d ingfully restrict the Fourier transform of a function to a subspace of R . While this is an interesting question in harmonic analysis on its own, we will see that the restriction prob- lem has close ties with dispersive partial differential equations. There are also connections to other major problems in harmonic analysis, such as the Kakeya conjecture, though we will only focus on the dispersive PDE connection in this chapter. For more information on restriction theory and its connections to other parts of harmonic analysis, refer to [10] and [11]. We begin by formalizing the statement of the restriction problem.

10.1 The Restriction Conjecture

A simple motivator for the restriction problem stems from the following observation. Fix p d d ≥ 2. In Chapter2 we defined the Fourier transform for functions in L (R ) for 1 ≤ p ≤ 1 d ˆ d 2. If f ∈ L (R ), then by the Riemann-Lebesgue lemma we know that f ∈ C0(R ). In d particular, we can meaningfully restrict fˆ to a set of a measure 0 in R , as fˆ is continuous. 2 d 2 d 2 On the other hand, if f ∈ L (R ), we only know that fˆ ∈ L (R ). As an arbitrary L function can be redefined on sets of measure zero, we cannot restrict fˆto a sets of measure 0. p d What about functions f ∈ L (R ) for 1 < p < 2? By the above discussion, restriction to measure zero sets is completely well-defined for p = 1 and entirely not well-defined for p = 2. Ideally, we would like a critical value between 1 and 2 for which restriction is well-defined for p smaller than this critical value, but this is unfortunately not the case. It p d turns out that there are functions that belong to L (R ) for all p > 1 but for which fˆ blows up on a hyperplane.

d−1 d Example 10.1. Let ψ : R → C be a smooth bump function. Define f : R → C by

ψ(x2, . . . , xd) f(x1, . . . , xd) = . 1 + |x1|

Observe that Z Z Z p 1 |f(x)| dx = ψ(x2, . . . , xd) dx2 . . . dxd · p dx1 < ∞ Rd Rd−1 (1 + |x1|)

97 p d for any p > 1, hence f ∈ L (R ) for all p > 1. The Fourier transform of f is Z − d −ix·ξ ψ(x2, . . . , xd) fˆ(ξ) = (2π) 2 e dx 1 + |x1| Z −ix1ξ1 − 1 ˆ e = (2π) 2 ψ(ξ2, . . . , ξd) dx1. 1 + |x1|

R e−ix1ξ1 R 1 On the hyperplane {ξ1 = 0}, the integral dx1 = dx1 is divergent. Thus, the 1+|x1| 1+|x1| restriction of fˆ to this set is not well-defined.

Stein discovered (see ??) that if a surface S of measure 0 has “sufficient curvature,” then p d one may restrict fˆto S when f ∈ L (R ) for certain values of p. As such, we can informally state the question of restriction of the Fourier transform as follows: for which zero-measure p d sets S and which 1 < p < 2 can one meaningfully restrict fˆ to S if f ∈ L (R )? There are three surfaces which have received a lot of attention:

 d - The sphere, Ssphere = ξ ∈ R : |ξ| = 1 . 1  d 1 ¯ 2 - The parabaloid, Sparab = ξ ∈ R : ξd = 2 |ξ| .  d ¯ - The cone, Scone = ξ ∈ R : ξd = |ξ| . These three surfaces are endowed with the following canonical measures:

- The sphere is given the usual surface measure dσ.

- The parabaloid is given the measure dσ defined via

Z Z  1  f(ξ) dσ(ξ) := f ξ,¯ |ξ¯|2 dξ.¯ d−1 2 Sparab R

- The cone is given the measure dσ defined via

Z Z dξ¯ f(ξ) dσ(ξ) := f ξ,¯ |ξ¯| . d−1 ¯ Scone R |ξ|

With these surfaces and measures, we can give a quantitative formulation of the restriction problem. Explicitly, we seek an estimate of the form

ˆ f |S .p,q,S kfkLp( d) (10.1) Lq(S,dσ) R

We shall refer to an estimate of the form (10.1) by R(p → q). For now, our focus will be on the case S = Ssphere. First, a few remarks.

1. If p = 1, then R(p → q) holds for all 1 ≤ q ≤ ∞. Indeed, by Holder’s inequality,

1 ˆ q ˆ f |S . σ(S) f . kfkL1( d) Lq(S,dσ) L∞(S,dσ) R

because S has finite measure.

2. If p = 2, then by our initial discussion R(p → q) fails for every 1 ≤ q ≤ ∞.

1 ¯ Here we are using the notation ξ = (x1, . . . , xd−1).

98 3. If R(p → q) holds, then R(˜p → q˜) holds for all p˜ ≤ p and q˜ ≤ q. Indeed, by Holder’s inequality,

1 1 ˆ q˜− q ˆ f |S . σ(S) f |S . Lq˜(S,dσ) Lq(S,dσ) From here, let ϕ be a function such that ϕˆ is compactly supported and ϕˆ = 1 on a ball containing S. Then

ˆ ˆ f |S . f |S ·ϕˆ . f[∗ ϕ . kf ∗ ϕkLp( d) Lq˜(S,dσ) Lq(S,dσ) Lq(S,dσ) R

where the last inequality holds because R(p → q) holds. Next, we apply Young’s 1 1 1 inequality with r chosen so that 1 + p = p˜ + r . Note that, for such an r, we have 1 p˜−p r = 1 + pp˜ ≤ 1 so that this choice is possible. Then

ˆ f |S . kfkLp˜( d) kϕkLr( d) . kfkLp˜( d) Lq˜(S,dσ) R R R as desired. Because of this, when studying restriction problems we adopt a philosophy of maxi- mizing p and q for which R(p → q) holds.

In modern papers dealing with the restriction problem, the formulation is often pre- p d sented using the adjoint of the expression R(p → q). In particular, let R : L (R ) → q ˆ L (S, dσ) be the restriction operator given by Rf = f |S. Consider the adjoint operator ∗ q0 p0 d R : L (S, dσ) → L (R ). For a Schwartz function f, we have

Z d Z Z − 2 −ix·ξ hRf, giL2(S,dσ) = (Rf)(ξ)g(ξ) dσ(ξ) = (2π) e f(x) dx g(ξ) dσ(ξ) S S Rd Z Z − d = f(x)(2π) 2 eix·ξg(ξ) dσ(ξ) dx Rd S Z = f(x) (g dσ)ˇdx Rd D E = f, (g dσ)ˇ . L2(S,dσ)

This implies that R∗g = (g dσ)ˇ. Thus, R(p → q) holds if and only if R∗(q0 → p0) holds, i.e., if there is an estimate of the form

ˇ (g dσ) p0 d p,q,S kgk q0 . (10.2) L (R ) . L (S,dσ) We begin by considering necessary conditions for R∗(q0 → p0) to hold. It is natural to first ˇ − d−1 ˇ p0 analyze what happens when g ≡ 1. By Lemma 8.9, |(dσ)| . hxi 2 . Thus, (dσ) ∈ L d−1 0 2d precisely when 2 p > d, which is equivalent to p < d+1 . So for the function g ≡ 1, R∗(q0 → p0) holds when 2d p < . d + 1 This provides one necessary condition. Another condition comes from the following ex- ample.

Example 10.2 (Knapp). Let κ ⊆ S be a cap on the sphere centered at the point ξ0 = (0,..., 0, −1) of horizontal radius 1/R for some R >> 1. Near ξ0, we can parametrize the cap κ and Taylor expand via q  |ξ¯|2  ξ = − 1 − |ξ¯|2 = − 1 − + O(|ξ¯|4) = −1 + O(R−2) d 2

99 ¯ −2 because |ξ| ≤ 1/R on κ. This means that the ξd-height of the cap is on the order of R , i.e., the cap κ is contained in a cylinder D of radius 1/R and thickness ∼ 1/R2. Let T denote the “dual” cylinder to D, i.e., the cylinder centered at 0 with radius R and height ∼ R2. Let g = χκ. Then 1 d−1 q0 −1 q0 kgkLq0 (S,dσ) = σ(κ) . (R ) . Also, Z Z ˇ − d ix·ξ − d −ix ix(ξ−ξ ) (g dσ)(x) = (2π) 2 e dσ(ξ) = (2π) 2 e d e 0 dσ(ξ). κ κ Note that, for ξ ∈ κ,  1/R 1 ≤ i ≤ d − 1 |(ξ − ξ ) | 0 i . 1/R2 i = d and for x ∈ T ,  R 1 ≤ i ≤ d − 1 |x | . i . R2 i = d Thus, for most x ∈ T , we have

ˇ −(d−1) |(g dσ)(x)| & σ(κ) ∼ R .

So

1 d−1+2 d+1 ˇ ˇ −(d−1) 0 −(d−1) 0 −(d−1) 0 (g dσ) 0 (g dσ) 0 R |T | p ∼ R R p = R R p . Lp (Rd) & Lp (T ) &

d+1 d−1 ∗ 0 0 −(d−1) 0 − 0 Therefore, if R (q → p ) holds, then R R p . R q . Letting R → ∞ gives the condition d + 1 d − 1 ≤ . p0 q From these two examples, two necessary scaling conditions for R∗(q0 → p0) to hold are 2d d+1 d−1 1 ≤ d+1 and p0 ≤ q . One might wonder whether such conditions are also sufficient. Formally, we pose the following conjecture.

d−1 d Conjecture 10.3 (Restriction conjecture for the sphere). Let d ≥ 2, and let S = S ⊆ R be the unit sphere endowed with the canonical surface measure. Then

ˆ f |S .p,q kfkLp( d) , Lq(S,dσ) R or equivalently ˇ (g dσ) 0 p,q kgk q0 , Lp (Rd) . L (S,dσ) 2d d−1 0 whenever 1 ≤ p ≤ d+1 and 1 ≤ q ≤ d+1 p . Zygmund proved this conjecture for d = 2 (see ??); for all other d, this remains an open problem.

10.2 The Tomas-Stein Inequality

Though the restriction conjecture for the sphere is open for d > 2, we have the following result, due to Tomas and Stein.

100 d−1 d Theorem 10.4 (Tomas-Stein). Let d ≥ 2, and let S = S ⊆ R be the unit sphere endowed with the canonical surface measure. Then

ˆ f |S .p kfkLp( d) L2(S,dσ) R

2(d+1) whenever 1 ≤ p ≤ d+3 . 2d 0 When q = 2, the scaling conditions in Conjecture 10.3 become 1 ≤ p ≤ d+1 and p ≥ 2(d+1) 2(d+1) 2(d+1) 2d d−1 , which is equivalent to p ≤ d−3 . Moreover, for d > 1, d+3 < d+1 . Thus, the Tomas-Stein inequality is indeed the full restriction conjecture for the sphere in the special case q = 2.

Proof. For the sake of instruction, we begin with a proof attempt which will not prove the full statement of Theorem 10.4. − d−1 Attempt 1: The first approach relies on using the decay condition |σˇ(x)| . hxi 2 . We p d 2 wish to show that the operator R : L (R ) → L (S, dσ) is bounded, which is equivalent ∗ 2 p0 d to showing boundedness of R : L (S, dσ) → L (R ), which is equivalent to showing ∗ p d p0 d boundedness of R R : L (R ) → L (R ). Let f be a Schwartz function. By definition, Z Z ∗ ˇ − d ix·ξ − d ix·ξ (R Rf)(x) = (Rf dσ)(x) = (2π) 2 e (Rf)(ξ) dσ(ξ) = (2π) 2 e fˆ(ξ) dσ(ξ) S S Z Z − d − d ix·ξ −iy·ξ = (2π) 2 (2π) 2 e e f(y) dy dσ(ξ) S Rd Z − d = (2π) 2 f(y)ˇσ(x − y) dy. Rd

∗ − d−1 2d ,∞ d Hence, R Rf = f ∗ σˇ. Next, because |σˇ(x)| . hxi 2 , we have σˇ ∈ L d−1 (R ). By the Hardy-Littlewood-Sobolev inequality,

∗ kR Rfk p0 d kfk p d kσˇk 2d kfk p L ( ) . L (R ) ,∞ d . L R L d−1 (R ) 1 1 d−1 2 d−1 0 4d provided that 1 + p0 = p + 2d . This is equivalent to p = 2d , hence p = d−1 , hence 4d 4d p = 3d+1 . So R(p → 2) holds for all 1 ≤ p ≤ 3d+1 . But a quick calculation shows that 4d 2(d+1) 3d+1 < d+3 , so we do not have the full scaling range in the statement of the theorem. Attempt 2: To remedy this, we use both the decay of |σˇ(x)| and the oscillation. We have X ϕ(x) + ψN (x) = 1 N>1 where ϕ and ψN are the usual Littlewood projections, and N is a dyadic number. So we have a decomposition X f ∗ σˇ = f ∗ (ϕσˇ) + f ∗ (ψN σˇ). N>1 By the triangle inequality,

∗ X kR Rfk 0 ≤ kf ∗ (ϕσˇ)k 0 + kf ∗ (ψ σˇ)k 0 . Lp (Rd) Lp (Rd) N Lp (Rd) N>1

1 1 1 p0 0 Consider the first term. Note that 1 + p0 = p + r when r = 2 . Because p < 2, p > 2, hence p0 r = 2 is a sensible choice. So by Young’s convolution inequality, we have

kf ∗ (ϕσˇ)k 0 kfk p d kϕσˇk 0 . Lp (Rd) . L (R ) p L 2 (Rd)

101 As ϕ is a bounded function with compact support, kϕσˇk p0 < ∞. Thus, L 2 (Rd)

kf ∗ (ϕσˇ)k 0 kfk p d . Lp (Rd) . L (R ) Next, we consider the sum. In order to retain summability, we seek an estimate of the form

−ε(p) kf ∗ (ψ σˇ)k 0 N kfk N Lp (Rd) . Lp(Rd) for some ε(p) > 0. The strategy is to use real interpolation between a L1 → L∞ and L2 → L2 type estimate. First, because ψN is supported on an annulus of radius ∼ N, the decay of |σˇ| gives

− d−1 kf ∗ (ψ σˇ)k ∞ d kfk 1 d kψ σˇk ∞ d N 2 kfk 1 d . N L (R ) . L (R ) N L (R ) . L (R ) For the L2 → L2 type estimate, we use Plancharel to get

ˆ ˆ kf ∗ (ψN σˇ)k 2 d = f · (ψN σˇ) L (R ) 2 d L (R ) ˆ ˆ ˆ ≤ f ψN ∗ dσ = kfk 2 d ψN ∗ dσ . 2 d ∞ d L (R ) ∞ d L (R ) L (R ) L (R ) Expanding the convolution, Z Z ˆ ˆ d ˆ (ψN ∗ dσ)(x) = ψN (x − y)dσ(y) = N ψ(N(x − y)) dσ(y). S S So for any m, since ψ is Schwartz, Z ˆ d 1 |(ψN ∗ dσ)(x)| . N m dσ(y) S hN(x − y)i m Z X Z M  = N d dσ(y) + N d dσ(y). N |x−y|≤1/N; y∈S M≤N |x−y|∼1/M; y∈S R The first integral |x−y|≤1/N; y∈S dσ(y) is the measure of a cap on the sphere of radius . 1/N, R d−1 R hence |x−y|≤1/N; y∈S dσ(y) . (1/N) . Similarly, |x−y|∼1/M; y∈S dσ(y) is the measure of two caps on the sphere with radius . 1/M, so in total we have d−1 m d−1 m−(d−1)  1  X M   1  X M  |(ψˆ ∗ dσ)(x)| N d + N d = N + N . N . N N M N M≤N M≤N ˆ The sum in question is finite if m > d − 1. For such a choice of m, we then have |(ψN ∗ ˆ dσ)(x)| N, hence ψN ∗ dσ N. Therefore, kf ∗ (ψN σˇ)k 2 d N kfk 2 d . . ∞ d . L (R ) . L (R ) L (R ) − d−1 kf ∗ (ψ σˇ)k ∞ d N 2 kfk 1 d Now we interpolate between the estimates N L (R ) . L (R ) and kf ∗ (ψ σˇ)k 2 d N kfk 2 d N L (R ) . L (R ). Note that 1 2/p0 1 − 2/p0 = + . p0 2 ∞ Thus,

2 1− 2 p0 p0 kf ∗ (ψN σˇ)k p0 d kf ∗ (ψN σˇ)k kf ∗ (ψN σˇ)k L (R ) . L2(Rd) L∞(Rd) d−1   2 2 2 − 1− 2 0 1− 0 N p0 N 2 p0 kfk p kfk p . L2(Rd) L1(Rd) − d−1 + d+1 2 p0 N kfk p d . . L (R )

102 d−1 d+1 0 2(d+1) 2(d+1) Set ε(p) = − 2 + p0 . Then ε(p) > 0 precisely when p > d−1 , or equivalently p < d+3 . 2(d+1) This gives the desired estimate in all cases, save for the endpoint p = d+3 . Attempt 3: The above argument, which proves the theorem except for the endpoint case, is due to Tomas. The loss of the endpoint is essentially due to the use of the triangle in- equality in the dyadic sum. To recover the endpoint case, Stein used complex interpolation with estimates of the form

X d−1 +it N 2 f ∗ (ψ σˇ) kfk 1 d N . L (R )

N>1 L∞(Rd)

X −1+it N f ∗ (ψ σˇ) kfk 2 d . N . L (R )

N>1 L2(Rd) We will not consider the details of the argument here; for reference, see [10].

10.3 Restriction Theory and Strichartz Estimates

As alluded to in the introduction of this chapter, the restriction of the Fourier transform is deeply connected to dispersive partial differential equations. In this section, we will describe this connection by studying an analogue of Theorem 10.4 for the paraboloid, and showing its equivalence to a symmetric Strichartz inequality. In this setting, because Strichartz estimates involve space and time, it will be convenient to increase the dimension of our restriction estimates from d to d+1. As such, we now seek an estimate of the form

ˇ (g dσ) 2(d+2) kgk 2 . L (Sparab,dσ) (10.3) L d (Rd+1) where dσ is the canonical measure on the paraboloid. This is dual formulation of the end- point case for the Tomas-Stein inequality on the paraboloid. For the reader’s convenience, we recall some estimates from Chapter9. Recall that the it ∆ free propagator e 2 is defined for Schwartz functions by

2 ∆  |ξ|  it 2 −1 −it 2 e f = Fξ e f(ξ) where u0(x) = u(0, x). By interpolating the estimates

it ∆ it ∆ − d e 2 f = kfk 2 and e 2 f |t| 2 kfk 1 2 Lx ∞ . Lx Lx Lx we obtained (see Theorem ??) the dispersive estimate

 1 1  it ∆ −d − e 2 f |t| 2 p kfk 0 p . Lp (10.4) Lx x for 2 ≤ p ≤ ∞. From this, we derived (see Theorem ??) the following inequality.

d Theorem 10.5 (Strichartz inequality). Let f ∈ S(Rx). Then

it ∆ e 2 f kfk 2 q r . Lx Lt Lx

2 d d for 2 ≤ q, r ≤ ∞ satisfying q + r = 2 and (d, q, r) 6= (2, 2, ∞).

103 When q = r, the above inequality is known as a symmetric Strichartz inequality. In this case, the scaling relation gives 2 d d 2(d + 2) + = ⇒ q = . q q 2 d

2(d+2) As d > 2, this proves: d Theorem 10.6 (Symmetric Strichartz inequality). Let f ∈ S(Rx). Then

it ∆ 2 e f 2(d+2) . kfkL2 . d x Lt,x We will now prove the following result. Proposition 10.7. The Tomas-Stein inequality for the paraboloid is equivalent to the symmetric Strichartz inequality; that is, the estimate ˇ (g dσ) 2(d+2) kgk 2 . L (Sparab,dσ) L d (Rd+1) is equivalent to the estimate it ∆ 2 e f 2(d+2) . kfkL2 . d x Lt,x d d Proof. Let u0 ∈ S(R ) and ϕ ∈ S(R × R ). We adopt the notation ϕ˜ = Ft,xϕ to denote the −1 space-time Fourier transform of ϕ. Let ψ = Fx ϕ˜. it ∆ We wish to calculate the space-time Fourier transform of e 2 u0, in the distributional sense. We have    Z  ^it ∆ D it ∆ E − d it ∆ −ix·ξ e 2 u0, ϕ = e 2 u0, ϕ˜ = (2π) 2 e 2 u0, e ψ(t, ξ) dξ t,x d t,x R t,x Z Z Z − d  it ∆  −ix·ξ = (2π) 2 e 2 u0 (x)e ψ(t, ξ) dx dt dξ Rd R Rd Z Z 2 −it |ξ| = e 2 uˆ0(ξ)ψ(t, ξ) dt dξ Rd R 2 1 Z |ξ|  = (2π) 2 uˆ0(ξ)[Ftψ(ξ)] dξ. Rd 2  |ξ|2   |ξ|2  Observe that [Ftψ(ξ)] 2 = ϕ 2 , ξ . Thus,

 ∆  1 ^it 2 2 e u0, ϕ = (2π) huˆ0 dσparab, ϕit,x . t,x

1 So, in the distributional sense, and up to a factor of (2π) 2 ,

^it ∆ it ∆ e 2 u0 =u ˆ0 dσparab ⇒ e 2 u0 = Ft,x (u ˆ0 dσparab) .

 |ξ|2  Let g ξ, 2 =u ˆ0(ξ). Then

kgk 2 = kuˆ k 2 = ku k 2 L (Sparab,dσ) 0 L 0 L

ˇ it ∆ it ∆ 2 2 and (g dσparab) = e u0. Thus, the estimate e u0 2(d+2) . ku0kL2 yields the Tomas- d Lt,x Stein estimate ˇ (g dσ) 2(d+2) kgk 2 . . L (Sparab,dσ) L d (Rd+1) The converse implication is essentially the same.

104 This proves that the endpoint Tomas-Stein inequality holds for the paraboloid by way of the symmetric Strichartz estimate, which by our remark earlier in this chapter proves the full Tomas-Stein inequality for the paraboloid. However, it is of interest to prove that the Tomas-Stein inequality on the sphere also implies the Tomas-Stein estimate on the paraboloid, without appealing directly to Strichartz estimates.

Proposition 10.8. The Tomas-Stein estimate on the sphere implies the Tomas-Stein estimate on the paraboloid.

Before proving this proposition, we explicitly compute the surface area measure on the sphere.

d d+1 Lemma 10.9. The canonical surface measure on the unit sphere S ⊆ R is given explicitly by 1 dσSd = dξ. p1 − |ξ|2

d d+1 Proof. Let φ : R → R be a coordinate parametrization of the sphere given by φ(ξ) = p (ξ, 1 − |ξ|2). Then q T dσSd = det [(∇φ) (∇φ)] dξ. p It remains to compute det [(∇φ)T (∇φ)]. We have  1 0 ··· 0   0 1 ··· 0     . . . .  ∇φ(ξ) =  . . . .  .    0 0 ··· 1    √−ξ1 √−ξ2 ··· √−ξd 1−|ξ|2 1−|ξ|2 1−|ξ|2 Thus,   2 T ξiξj |ξ| (∇φ(ξ)) (∇φ(ξ)) = δij + 2 = I + 2 Pξ 1 − |ξ| 1≤i,j≤d 1 − |ξ|

h ξiξj i where Pξ = |ξ|2 is the projection matrix onto the ξ direction. Consequently,

 |ξ|2  |ξ|2  1  (∇φ(ξ))T (∇φ(ξ))ξ = I + P ξ = ξ + ξ = ξ 1 − |ξ|2 ξ 1 − |ξ|2 1 − |ξ|2

T 1 so that (∇φ) (∇φ) has an eigenvector ξ with eigenvalue 1−|ξ|2 . Next, observe that if x is a vector orthogonal to ξ, then  |ξ|2  I + P x = x 1 − |ξ|2 ξ so that (∇φ)T (∇φ) has a d − 1-dimensional eigenspace corresponding to the eigenvalue 1. Thus, q 1 det [(∇φ)T (∇φ)] = p1 − |ξ|2 as desired.

With this, we can prove Proposition 10.8.

105 Proof. Assume the Tomas-Stein estimate on the sphere, that is,

ˇ (g dσ d ) 2(d+2) kgk 2 d . S . L (S ,dσ d ) L d (Rd+1) S Because Tomas-Stein on the paraboloid is equivalent to the symmetric Strichartz estimate, our goal is to derive an estimate of the form

it ∆ 2 e u0 2(d+2) . ku0kL2 d d x Lt,x (R×R )

∞ d where we can take uˆ0 ∈ Cc (R ) by density. d Fix λ >> 1. Define a function gλ on the sphere S as follows: ( p d p g (ξ, 1 − |ξ|2) = λ 2 uˆ (λξ) 1 − |ξ|2 λ 0 . p 2 gλ(ξ, − 1 − |ξ| ) = 0

∞ d Because uˆ0 ∈ Cc (R ), for λ large, gλ is supported on a small ξ-region, i.e., a small cap around the north pole. Using the surface area lemma, we have Z 2 d 2 2 1 kg k 2 d = λ |uˆ (λξ)| (1 − |ξ| ) dξ λ L (S ,dσ d ) 0 p S 1 − |ξ|2 Z d 2p 2 = λ |uˆ0(λξ)| 1 − |ξ| dξ r Z |ξ|2 = |uˆ (ξ)|2 1 − dξ. 0 λ

2 So for very large λ we have kgλk 2 d ∼ kuˆ0k 2 = ku0k 2 . L (S ,dσSd ) Lξ Lx ˇ Next, we compute (gλ dσSd ):

d+1 Z d ˇ − itω+ix·ξ p 2 1 p 2 (gλ dσSd )(t, x) = (2π) 2 e λ 2 uˆ0(λξ) 1 − |ξ| dξ δ(ω = 1 − |ξ| ) p1 − |ξ|2 Z √ − d+1 it 1−|ξ|2+ix·ξ d = (2π) 2 e λ 2 uˆ0(λξ) dξ r |ξ|2 x d+1 d Z it 1− +i ·ξ − − λ2 λ = (2π) 2 λ 2 e uˆ0(ξ) dξ.

q |ξ|2 |ξ|2  |ξ|4  By Taylor expansion, we have 1 − λ2 = 1 − 2λ2 + O λ4 . Also note that

Z it|ξ|2 − d+1 − d it− +i x ·ξ − d+1 − d it h i t ∆ i x (2π) 2 λ 2 e 2λ2 λ uˆ (ξ) dξ = (2π) 2 λ 2 e · e λ2 2 u . 0 0 λ Thus, we can write

d+1 d h t ∆ i x ˇ − − it i 2 (g dσ d )(t, x) = (2π) 2 λ 2 e · e λ 2 u λ S 0 λ  r !  |ξ|2 |ξ|2 Z it|ξ|2 it 1− −1+ − d+1 − d it− +i x ·ξ λ2 2λ2 2 2 2λ2 λ   + (2π) λ e uˆ0(ξ) e − 1 dξ.

By the Taylor expansion comment and by the usual arc-length type estimate, for λ large we have ! r 2 2 it 1− |ξ| −1+ |ξ| 4 λ2 2λ2 |ξ| e − 1 |t| . . 4 λ

106 Consequently,

 r !  |ξ|2 |ξ|2 Z it|ξ|2 it 1− −1+ − d+1 − d it− +i x ·ξ λ2 2λ2 2 2 2λ2 λ   (2π) λ e uˆ0(ξ) e − 1 dξ

Z − d −4 4 − d −4 . λ 2 |t| |uˆ0(ξ)||ξ| dξ . λ 2 |t|

∞ d where the last inequality follows from the fact that uˆ0 ∈ Cc (R ). Thus,

ˇ ˇ (gλ dσSd ) 2(d+2) & (gλ dσSd ) 2(d+2) L d (Rd+1) L d (|t|≤λ2(1+ε),|x|≤λ1+ε) d h t ∆ i x − i 2 & λ 2 e λ 2 u0 2(d+2) λ L d (|t|≤λ2(1+ε),|x|≤λ1+ε) − d −4 2(1+ε) (2(1+ε)+d(1+ε)) d − λ 2 λ λ 2(d+2) where the final power of λ comes from considering the volume of the set. Changing vari- ables and simplifying the power of λ in the error term then gives

d d ∆ dε ˇ − (2+d) 2(d+2) it −2+2ε+ (gλ dσ d ) 2(d+2) λ 2 λ e 2 u0 2(d+2) − λ 2 . S d+1 & 2ε ε L d (R ) L d (|t|≤λ ,|x|≤|λ| ) Thus, it ∆  −2+ d+4 ε 2 2 e u0 2(d+2) + O λ . ku0kL2 . d 2ε ε x Lt,x (|t|≤λ ,|x|≤|λ| ) As λ → ∞, for ε << 1 we then get

it ∆ 2 e u0 2(d+2) . ku0kL2 d d+1 x Lt,x (R ) as desired.

In the end, via either method, we have proven the Tomas-Stein inequality on the para- baloid.

d+1 Theorem 10.10 (Tomas-Stein). Let d ≥ d, and let Sparab ⊆ R be the paraboloid endowed with the canonical surface measure. Then

ˇ (g dσ) 0 kgk 2 Lp (Rd+1) . L (Sparab,dσ)

0 2(d+2) whenever p ≥ d .

107 Chapter 11

Rearrangement Theory

Our next topic of study is the theory of rearrangements. Rearrangement inequalities have applications to problems involving the minimization of energy functionals and other vari- ational problems, and can be used in showing the existence and uniqueness of minimizers. Later in this chapter, we will see an explicit application of this type to the energy func- tional of the hydrogen atom. There are also connections to classical problems such as the isoperimetric inequality. A good reference for the following material is Chapter 3 of [6].

11.1 Definitions and Basic Estimates

d We begin by recalling the layer-cake decomposition of a Borel measurable function f : R → [0, ∞); namely, the identity Z ∞ f(x) = χ{f>λ}(x) dλ. 0 This equality follows from the observation that

 1 f(x) > λ  1 λ < f(x) χ (x) = = = χ (λ) {f>λ} 0 f(x) ≤ λ 0 λ ≥ f(x) [0,f(x)) and so Z ∞ Z ∞ Z f(x) χ{f>λ}(x) dλ = χ[0,f(x))(λ) dλ = dλ = f(x). 0 0 0 This decomposition will frequently be used in the proofs of the inequalities to come. With this in mind, we begin discussing rearrangements.

d Definition 11.1. If A ⊆ R is Borel measurable, its symmetric rearrangement is the set

∗ n d o A = x ∈ R : |x| < r where r is chosen so that |A∗| = |A|.

d Definition 11.2. Let f : R → C be a Borel measurable function which vanishes at infinity, that is, |{|f| > λ}| < ∞ for all λ > 0. The symmetric decreasing rearrangement of f is given by Z ∞ ∗ f (x) = χ{|f|>λ}∗ (x) dλ. 0 The intuition and geometry behind the rearrangement of a set A is clear: A∗ is simply the open ball centered at the origin with the same measure as A. To give intuition for

108 the symmetric rearrangement of a function, we compute a simple example. Suppose that f : R → C is given by f(x) = χ[−2,−1](x) + χ[1,2](x). Fix λ > 0. Then  ([−2, −1] ∪ [1, 2])∗ 0 < λ < 1  (−1, 1) 0 < λ < 1 {|f| > λ}∗ = = . ∅∗ λ ≥ 1 ∅ λ ≥ 1 Therefore, Z 1 Z 1  ∗ 1 x ∈ (−1, 1) f (x) = χ{|f|>λ}∗ (x) dλ = χ(−1,1)(x) dλ = = χ(−1,1)(x). 0 0 0 x∈ / (−1, 1) For a slightly more interesting example, suppose that g : R → C is given by g(x) = χ[−2,−1](x) + 2χ[1,2](x). Fix λ > 0. Then  ([−2, −1] ∪ [1, 2])∗ 0 < λ < 1  (−1, 1) 0 < λ < 1 ∗  ∗  1 1  {|g| > λ} = ([1, 2]) 1 ≤ λ < 2 = − 2 , 2 1 ≤ λ < 2 .  ∅∗ λ ≥ 2  ∅ λ ≥ 2 Therefore, Z 1 Z 2 ∗ g (x) = χ(−1,1)(x) dλ + χ(− 1 , 1 )(x) dλ = χ(−1,1)(x) + χ(− 1 , 1 )(x). 0 1 2 2 2 2 Roughly, the symmetric decreasing rearrangement of a function takes the mass under the graph of the function and rearranges it to return a function which is radially symmetric and decreasing with the same total mass. These examples suggest the following basic properties of symmetric decreasing rear- rangements. d Proposition 11.3. If f : R → C is a Borel measurable function which vanishes at infinity, then f ∗ is nonnegative, radially symmetric, radially non-increasing, and lower-semicontinuous. Proof. Nonnegativity is clear. Radial symmetry of f ∗ follows from the fact that the set {|f| > λ}∗ is radially symmetric. The fact that f ∗ is radially decreasing follows from the observation that if λ > µ, then {|f| > λ}∗ ⊆ {|f| > µ}∗. To prove lower-semicontinuity, recall that f ∗ is lower-semicontinuous if and only if {f ∗ > λ} is open for all λ. We have  Z ∞  n d ∗ o d [ ∗ x ∈ R : f (x) > λ = x ∈ R : χ{|f|>t}∗ (x) dt > λ = {|f| > t} . 0 t>λ Thus, {f ∗ > λ} is open.

Next, observe that the rearrangement function satisfies the following monotonicity property. d Proposition 11.4. If f, g : R → C are Borel measurable functions which vanishes at infinity, and if |f| ≤ |g|, then f ∗ ≤ g∗. Proof. If |f| ≤ |g|, then {|f| > λ} ⊆ {|g| > λ}. Thus, {|f| > λ}∗ ⊆ {|g| > λ}∗, and the claim follows.

Next, we make an observation that will be used frequently. By definition, Z ∞ ∗ f (x) = χ{|f|>λ}∗ (x) dλ. 0 On the other hand, we have the layer-cake decomposition of f ∗, which gives Z ∞ ∗ f (x) = χ{f ∗>λ}(x) dλ. 0 This suggests the equality {|f| > λ}∗ = {f ∗ > λ}. Indeed, this is the case.

109 d Proposition 11.5. If f : R → C is a Borel measurable function which vanishes at infinity, then {|f| > λ}∗ = {f ∗ > λ}.

Proof. Note that Z ∗ |{|f| > t} | = |{|f| > t}| = χ{|f|>t}(x) dx. Rd As t → λ from above, by the dominated convergence theorem, it then follows that Z ∗ ∗ |{|f| > t} | → χ{|f|>λ}(x) dx = |{|f| > λ}| = |{|f| > λ} |. Rd ∗ S ∗ ∗ ∗ Because {f > λ} = t>λ{|f| > t} , we have thus shown |{f > λ}| = |{|f| > λ} |. Because f ∗ is radially symmetric and lower-semicontinuous, {f ∗ > λ} is an open ball centered at the origin. By definition of a set rearrangement, {|f| > λ}∗ is also an open ball centered at the origin. As two open balls of equal measure with same center, {|f| > λ}∗ = {f ∗ > λ}.

d Corollary 11.6. If f : R → C is a Borel measurable function which vanishes at infinity, then

∗ kfk p d = kf k p d L (R ) L (R ) for 1 ≤ p ≤ ∞.

Proof. This follows from Z ∞ p p−1 kfkLp( d) = p λ |{|f| > λ}| dλ R 0 when 1 ≤ p < ∞. The case p = ∞ is similar.

In preparation for our first major result, we make a few remarks. We say that a function d f : R → [0, ∞) is strictly symmetric decreasing if f is radial and if |x| > |y| implies that d ∗ f(x) < f(y). Note that this implies f(x) > 0 for all x ∈ R , and furthermore f = f. d Theorem 11.7 (Hardy-Littlewood). Let f, g : R → [0, ∞) be vanishing at infinity. Then Z Z f(x)g(x) dx ≤ f ∗(x)g∗(x) dx.

Moreover, if f is strictly symmetric decreasing, then equality holds if and only if g = g∗ almost everywhere.

Proof. First, we demonstrate the inequality. We compute, using a layer-cake decomposi- tion: Z Z Z ∞ Z ∞ f(x)g(x) dx = χ{f>λ}(x) dλ χ{g>µ} dµ dx Rd 0 0 Z ∞ Z ∞ = |{f > λ} ∩ {g > µ}| dλ dµ. 0 0 Suppose without loss of generality that |{f > λ}| ≤ |{g > µ}|. Then {f > λ}∗ ⊆ {g > µ}∗. So

|{f > λ} ∩ {g > µ}| ≤ |{f > λ}| = |{f > λ}∗| = |{f > λ}∗ ∩ {g > µ}∗| Z = χ{f>λ}∗ (x)χ{g>µ}∗ (x) dx

110 and hence Z Z Z ∞ Z ∞ Z ∗ ∗ f(x)g(x) dx ≤ χ{f>λ}∗ (x) dλ χ{g>µ}∗ (x) dµ dx = f (x)g (x) dx. Rd 0 0 Next, we consider the case of equality. Suppose that f is strictly symmetric decreasing and that R f(x)g(x) dx = R f ∗(x)g∗(x) dx. Then R f(x)g(x) dx = R f(x)g∗(x) dx, hence Z ∞ Z Z ∞ Z f(x)χ{g>µ}(x) dx dµ = f(x)χ{g>µ}∗ (x) dx dµ. 0 Rd 0 Rd Because Z Z f(x)χ{g>µ}(x) dx ≤ f(x)χ{g>µ}∗ (x) dx Rd Rd by the demonstrated inequality, it follows that Z Z f(x)χ{g>µ}(x) dx = f(x)χ{g>µ}∗ (x) dx Rd Rd for almost every µ. We claim that |{g > µ}∆{g > µ}∗| = 0. To see this, first observe that all open balls centered at 0 occur as super-level sets of f. That is, {f > λ} = B(0, r(λ)) where λ 7→ r(λ) is non-increasing and r(λ) → ∞ as λ → 0, since f is strictly symmetric decreasing. Note that r(λ) is continuous. If it were not, then by monotonicity there would be a jump discontinuity at some point λ0, so that

lim |{f > λ0 + ε}| − |{f > λ0 − ε}| < 0 ε→0 and thus lim |{λ0 − ε ≤ f < λ0 + ε}| > 0. ε→0

This implies that |{f = λ0}| > 0, which is a contradiction, as f is strictly symmetric de- creasing. Now, by the above equality, we have Z ∞ Z Z ∞ Z χ{f>λ}(x)χ{g>µ}(x) dx dλ = χ{f>λ}(x)χ{g∗>µ}(x) dx dλ. 0 0 d If A ⊆ R is a Borel measurable function, then the function Z FA(λ) := χ{f>λ}(x)χA(x) dx = |A ∩ B(0, r(λ))| is continuous. By our demonstrated inequality, Z F{g>µ}(λ) ≤ χ{f>λ}(x)χ{g∗>µ}(x) dx = F{g∗>µ}(λ).

The assumed equality implies Z ∞ Z ∞ F{g>µ}(λ) dλ = F{g∗>µ}(λ) dλ 0 0 and thus F{g>µ}(λ) = F{g∗>µ}(λ) for almost every λ. Continuity of these functions gives equality for all λ > 0. That is, |B(0, r(λ)) ∩ {g > µ}| = |B(0, r(λ)) ∩ {g∗ > µ}| for all λ > 0. Because r(λ) is continuous and r(λ) → ∞ as λ → 0, |B(0, r) ∩ {g > µ}| = |B(0, r) ∩ {g∗ > µ}| for all r > 0. From this, it follows that |{g > µ}∆{g > µ}∗| = 0 for almost every µ as desired. From this, we have g = g∗ almost everywhere.

111 A quick corollary of this inequality is that the symmetric decreasing rearrangement compresses the L2 distance between functions.

2 d ∗ ∗ Corollary 11.8. Let f, g ∈ L (R → [0, ∞)). Then kf − g kL2 ≤ kf − gkL2 . Proof. We have Z Z ∗ ∗ 2 ∗ ∗ 2 ∗ 2 ∗ 2 ∗ ∗ kf − g kL2 = (f (x) − g (x)) dx = kf kL2 + kg kL2 − 2 f (x)g (x) dx Z 2 2 ≤ kfkL2 + kgkL2 − 2 f(x)g(x) dx 2 = kf − gkL2 .

This estimate holds for more general Lp spaces. In fact, we have the following more general theorem.

d Theorem 11.9. Let φ : R → [0, ∞) be convex with φ(0) = 0, and suppose that f, g : R → [0, ∞) are vanishing at infinity. Then Z Z φ(f ∗ − g∗)(x) dx ≤ φ(f − g)(x) dx.

Moreover, if φ is strictly convex and f is strictly symmetric decreasing, then equality holds if and only if g = g∗ almost everywhere.

Proof. Write φ = φ+ + φ−, where

 φ(t) t ≥ 0  0 t ≥ 0 φ (t) = ; φ (t) = . + 0 t < 0 − φ(t) t < 0

Both φ+ and φ− are convex, and it suffices to prove the inequality for only φ+. Recall that convex functions are Lipschitz on compact domains, and that if c < a < x < y < b < d, we have φ (a) − φ (c) φ (y) − φ (x) φ (d) − φ (b) + + ≤ + + ≤ + + . a − c y − x d − b

Thus, convex functions are absolutely continuous, hence differentiable almost everywhere, R t 0 0 and so φ+(t) = 0 φ+(s) ds. Moreover, φ+ is a nondecreasing function, and strictly increas- 0 ing if φ is strictly convex. Thus, φ+ has countably many (jump) discontinuities, and so we 0 0 may modify φ+ on a countable set to ensure that φ+ is left-continuous. So,

Z f(x)−g(x) Z f(x) 0 0 φ+(f(x) − g(x)) = φ+(t) dt = φ+(f(x) − µ) dµ 0 g(x) Z ∞ 0 = φ+(f(x) − µ)χ{g≤µ}(x) dµ 0

0 where the last equality follows from the fact that φ+(t) = 0 for t < 0. Thus, Z Z ∞ Z 0 φ+(f(x) − g(x)) dx = φ+(f(x) − µ)χ{g≤µ}(x) dx dµ. 0 We now make two claims, which are left as exercises to the reader.

112 1. We have the inequality Z Z ∗ f(x)χ{g≤µ}(x) dx ≥ f (x)χ{g∗≤µ}(x) dx.

2. If F is nondecreasing, left continuous, and satisfies F (0) = 0, then (F ◦ f)∗ = F (f ∗).

Using these two facts, we have Z Z ∞ Z 0 ∗ φ+(f(x) − g(x)) dx ≥ [φ+(f(x) − µ)] χ{g∗≤µ}(x) dx dµ 0 Z ∞ Z 0 ∗ = φ+(f (x) − µ)χ{g∗≤µ}(x) dx dµ 0 where we have adopted the obvious convention that (f − µ)∗ := f ∗ − µ to account for the fact that f − µ does not vanish at ∞. Unraveling the above integral, it follows that Z Z ∗ ∗ φ+(f(x) − g(x)) dx ≥ φ+(f (x) − g (x)) dx, which is the desired inequality. Next, we consider the equality case. Assume that φ is strictly convex, f is strictly sym- R R ∗ ∗ metric decreasing, and that φ+(f(x)−g(x)) dx = φ+(f (x)−g (x)) dx. Then for almost every µ, Z Z 0 0 φ+(f(x) − µ)χ{g≤µ}(x) dx = φ+(f(x) − µ)χ{g∗≤µ}(x) dx

∗ 0 because f = f . Let h(x) := φ+(f(x) − µ). Because f is strictly symmetric decreasing and 0 φ+ is strictly increasing, h is strictly symmetric decreasing on the set {f(x) > µ}. Rewriting the above equality with a layer-cake decomposition then gives Z ∞ Z Z ∞ Z χ{h>λ}(x)χ{g≤µ}(x) dx dλ = χ{h>λ}(x)χ{g∗≤µ}(x) dx dλ. 0 0 Define Z F{g≤µ}(λ) := χ{h>λ}(x)χ{g≤µ}(x) dx; Z F{g∗≤µ}(λ) := χ{h>λ}(x)χ{g∗≤µ}(x) dx.

R ∞ R ∞ By claim 1, F{g≤µ}(λ) ≥ F{g∗≤µ}(λ). Since 0 F{g≤µ}(λ) dλ = 0 F{g∗≤µ}(λ) dλ, we then have F{g≤µ}(λ) = F{g∗≤µ}(λ) for almost all λ > 0. Since these functions are continuous in λ, equality in fact holds for all λ > 0. So for all λ > 0,

|{h > λ} ∩ {g ≤ µ}| = |{h > λ} ∩ {g∗ ≤ µ}| and so

|{f > r}∩{g ≤ µ}| = |{f > r}∩{g∗ ≤ µ}| ⇒ |{f > r}∩{g > µ}| = |{f > r}∩{g∗ > µ}| for all r > µ, since h is strictly symmetric decreasing. Working through the exact same argument with φ− in place of φ+ gives the above equality for all r < µ. As before, it follows that |{g > µ}∆{g∗ > µ}| = 0 and consequently g = g∗ almost everywhere.

113 11.2 The Riesz Rearrangement Inequality, d = 1

Next, we consider a classical rearrangement inequality due to Riesz. The motivation for studying inequalities of the kind to follow originates with Poincare and his work on the shapes of fluid bodies in equilibrium. The resulting variational problem deals with inte- grals of the form ZZ f(x)h(y)|x − y|−1 dx dy.

More generally, variational problems involving symmetric decreasing functions h(x − y) 2 1 − |x−y| such as the heat kernel (4πt) 1 e t are of interest. The following inequality holds more generally in Euclidean space of any dimension, but because of the technical nature of the proof, we begin by considering the one dimen- sional case.

Theorem 11.10 (Riesz rearrangement inequality, d = 1). Let f, g, h : R → [0, ∞) be vanishing at infinity. Then ZZ ZZ I(f, g, h) := f(x)g(x − y)h(y) dx dy ≤ f ∗(x)g∗(x − y)h∗(y) dx dy = I(f ∗, g∗, h∗).

Moreover, if g is strictly symmetric decreasing, then equality holds if and only if there exists x0 ∈ R ∗ ∗ such that f(x) = f (x − x0) and h(x) = h (x − x0) for almost all x ∈ R. Proof. Using a layer-cake decomposition on f, g and h, we have Z ∞ Z ∞ Z ∞ ZZ I(f, g, h) = χ{f>λ}(x)χ{g>µ}(x − y)χ{h>ν}(y) dx dy dλ dµ dν. 0 0 0

Thus, it suffices to prove the inequality for f = χA, g = χB, and h = χC for finite-measure sets A, B, C. Approximating these sets from above by open sets An,Bn, and Cn, we have ∗ ∗ ∗ ∗ ∗ ∗ I(χAn , χBn , χCn ) → I(χA, χB, χC ) and I(χAn , χBn , χCn ) → I(χA , χB , χC ) by the dom- inated convergence theorem, hence we may reduce further and assume without loss of generality that A, B, and C are open finite-measure sets. S Because A is an open subset of R, we can write A = j≥1 Ij where Ij are open, disjoint Sm intervals such that |Ij| ≥ |Ij+1|. Let Am := j=1 Ij. Let Bm and Cm denote the analogous decompositions for B and C. By the monotone convergence theorem, I(χAm , χBm , χCm ) → ∗ ∗ ∗ ∗ ∗ ∗ I(χA, χB, χC ) and I(χAm , χBm , χCm ) → I(χA , χB , χC ). Consequently, we may assume without loss of generality that A, B, C are each finite unions of disjoint open intervals. Write

J X χA(x) = fj(x − aj) j=1 K X χB(x) = gk(x − bk) k=1 L X χC (x) = hl(x − cl) l=1 where fj, gk, hl are characteristic functions of open intervals which are centered at 0. Then we have

J K L X X X ZZ I(χA, χB, χC ) = fj(x − aj)gk(x − y − bk)hl(y − cl) dx dy. j=1 k=1 l=1

114 ∗ ∗ Observe that (fj(x − aj)) = fj . To demonstrate the desired inequality, our goal is to take each of the characteristic functions in the integral and shift them towards 0. For t ∈ [0, 1] we define ZZ Ijkl(t) := fj(x − taj)gk(x − y − tbk)hl(y − tcl) dx dy.

We claim that this function is non-increasing with respect to t. To see this, change variables via u = x − taj and u − v = x − y − tbk, so that y = v + t(aj − bk). Then ZZ Ijkl(t) = fj(u)gk(u − v)hl(v + t(aj − bk − cl)) du dv.

R Let wjk(v) = fj(u)gk(u − v) du. Because fj and gk are both characteristic functions of intervals centered at 0, this is a symmetric decreasing function in v. Thus Z Ijkl(t) = wjk(v)hl(v + t(aj − bk − cl)) dv is non-increasing in t by the same reasoning. Next, define

J K L X X X ZZ It(χA, χB, χC ) := fj(x − taj)gk(x − y − tbk)hl(y − tcl) dx dy. j=1 k=1 l=1

As t → 0, the intervals corresponding to the functions fj(x−taj), gk(x−y −tbk), and hl(y − tcl) shift towards the origin. As t descends from 1 to 0, when two intervals corresponding to the same set A, B, or C, touch, redefine the fj’s, gk’s, and hl’s accordingly. Carrying this out for finitely many times (because there are finitely many intervals comprising A, B, and C), we end up with A∗,B∗, and C∗ intervals centered at 0. Moreover, since the functions Ijkl(t) are non-increasing,

I(χA, χB, χC ) = I1(χA, χB, χC ) ≤ I0(χA, χB, χC ).

By construction, I0(χA, χB, χC ) = I(χA∗ , χB∗ , χC∗ ). Thus, we have demonstrated the in- equality I(χA, χB, χC ) ≤ I(χA∗ , χB∗ , χC∗ ) as desired. Next, we consider the case of equality. Assume that g is strictly symmetric decreasing and that I(f, g, h) = I(f ∗, g∗, h∗). Applying a layer-cake decomposition to f and h yields the equality

Z ∞ Z ∞ ZZ χ{f>λ}(x)g(x − y)χ{h>µ}(y)dxdy dλ dµ 0 0 Z ∞ Z ∞ ZZ = χ{f ∗>λ}(x)g(x − y)χ{h∗>µ}(y)dxdy dλ dµ 0 0 since g = g∗. Consider the inner dx dy integrals. By the demonstrated inequality, ZZ ZZ χ{f>λ}(x)g(x − y)χ{h>µ}(y) dx dy ≤ χ{f ∗>λ}(x)g(x − y)χ{h∗>µ}(y) dx dy.

Hence, equality of the above integrals implies that ZZ ZZ χ{f>λ}(x)g(x − y)χ{h>µ}(y) dx dy = χ{f ∗>λ}(x)g(x − y)χ{h∗>µ}(y) dx dy for almost all λ, µ > 0. Fix λ and µ for which this holds.

115 Set A = {f > λ} and B = {h > µ}. We wish to show that there exists x0 ∈ R such that ∗ ∗ f(x) = f (x − x0) and h(x) = h (x − x0) for almost all x ∈ R. That is, we wish to show that A and B are intervals (up to a zero-measure set) centered at the same point, and that this central point does not depend on λ or µ. Fixing λ and letting µ vary (and vice versa) shows that the central point will not depend on λ or µ. Performing a layer-cake decomposition on g in the above equality yields

Z ∞ ZZ χ{f>λ}(x)χ{g>σ}(x − y)χ{h>µ}(y) dx dy dσ 0 Z ∞ ZZ = χ{f ∗>λ}(x)χ{g>σ}(x − y)χ{h∗>µ}(y) dx dy dσ. 0 Thus, applying our demonstrated inequality to the dx dy integrals as before, it follows that ZZ χ{f>λ}(x)χ{g>σ}(x − y)χ{h>µ}(y) dx dy ZZ = χ{f ∗>λ}(x)χ{g>σ}(x − y)χ{h∗>µ}(y) dx dy for almost all σ > 0. But ZZ σ 7→ χ{f>λ}(x)χ{g>σ}(x − y)χ{h>µ}(y) dx dy; ZZ σ 7→ χ{f ∗>λ}(x)χ{g>σ}(x − y)χ{h∗>µ}(y) dx dy are continuous functions in σ, since g is strictly symmetric decreasing. Therefore, equality holds for all σ. The fact that g is strictly symmetric decreasing then gives the equality ZZ ZZ χA(x)χ r r (x − y)χB(y) dx dy = χA∗ (x)χ r r (x − y)χB∗ (y) dx dy (− 2 , 2 ) (− 2 , 2 ) for all r > 0, with A and B as defined above. We claim that A and B must be intervals. To see this, pick r > |A|+|B|. With this choice of r, ZZ χA∗ (x)χ r r (x − y)χB∗ (y) dx dy = |A| · |B|. (− 2 , 2 ) Then for all r > |A| + |B|, we must have ZZ |A| · |B| = χA(x)χ r r (x − y)χB(y) dx dy (− 2 , 2 ) Z Z = χ r r (x) χA(x + y)χB(y) dy dx. (− 2 , 2 ) R By this equality, the continuous function w(x) := χA(x + y)χB(y) dy must have support r r  in the interval − 2 , 2 . Let IA be the smallest interval such that |IA ∩ A| = |A|, let IB be the smallest interval such that |IB ∩ B| = |B|, and let IAB be the smallest interval containing the support of w. We leave it as an exercise to the reader to show that |IAB| = |IA| + |IB|. With this claim, we then have |IA| + |IB| < r for all r > |A| + |B|. Taking the infimum over such r gives |IA| + |IB| ≤ |A| + |B|. Because IA and IB were chosen to be intervals that essentially contain A and B respectively, it follows that IA = A and IB = B, up to sets of measure zero, so that A and B are intervals. Thus ZZ ZZ χA(x)χ r r (x − y)χB(y) dx dy = χA∗ (x)χ r r (x − y)χB∗ (y) dx dy (− 2 , 2 ) (− 2 , 2 )

116 for all r > 0, where A and B are intervals. Letting r → 0 yields Z Z χA(x)χB(x) dx = χA∗ (x)χB∗ (x) dx.

R From here we consider two cases. If |A| = |B|, then χA∗ (x)χB∗ (x) dx = |A|. But if A and R B are not centered at the same point, then χA(x)χB(x) dx < |A|, which is a contradiction. Thus, A and B are intervals centered at the same point, as desired. If |A| > |B|, let r = |A| − |B|. Then ZZ Z Z χA∗ (x)χ(− r , r )(x − y)χB∗ (y) dx dy = dx dy = r|B|. 2 2 ∗ ∗ r r B A ∩(y− 2 ,y+ 2 )

On the other hand, if A and B are not centered at the same point, then ZZ Z Z χA(x)χ(− r , r )(x − y)χB(y) dx dy = dx dy < r|B|. 2 2 r r B A∩(y− 2 ,y+ 2 )

This is a contradiction, thus A and B are centered at the same point. This completes the proof.

11.3 The Riesz Rearrangement Inequality, d > 1

d In this section, we prove a generalization of the previous theorem for Euclidean space R . In order to do this, it is necessary another notion of symmetrizing sets and functions.

d d−1 Definition 11.11. Let A ⊆ R be a Borel measurable set of finite measure. Let e ∈ S ⊆ d ∗e R . The Steiner symmetrization of A along e is the set A which is obtained from A by symmetrizing along lines parallel to e.

d Definition 11.12. Let f : R → [0, ∞) be Borel measurable and vanishing at infinity. For d−1 d d d −1 e ∈ S ⊆ R , let ρ : R → R be the rotation taking e to e1. Write ρf(x) := f(ρ (x)), ∗1 and let (ρf) be the one-dimensional symmetric decreasing rearrangement of ρf along x1, ∗e −1 ∗1 keeping x2, . . . , xd fixed. The Steiner symmetrization of f along e is f := ρ (ρf) .

To clarify the definition of the Steiner symmetrization of a function, observe that f ∗1 can be written as follows:

∗1 ∗ f (x1, x2, . . . , xd) = (x1 7→ f(x1, x2, . . . , xd)) .

Also note that these quantities are well-defined by the usual considerations of product measures and measurability of slices.

d Theorem 11.13 (Riesz rearrangement inequality). Let f, g, h : R → [0, ∞) be vanishing at infinity. Then ZZ ZZ I(f, g, h) := f(x)g(x − y)h(y) dx dy ≤ f ∗(x)g∗(x − y)h∗(y) dx dy = I(f ∗, g∗, h∗).

d Moreover, if g is strictly symmetric decreasing, then equality holds if and only if there exists x0 ∈ R ∗ ∗ d such that f(x) = f (x − x0) and g(x) = g (x − x0) for almost all x ∈ R . Proof. The proof proceeds as follows. We consider the inequality when d = 2 separately, then prove the general inequality by induction. Finally, we consider the case of equality. The case d = 2.

117 Suppose that d = 2. By the usual layer-cake decomposition argument that we have given multiple times, it suffices to prove the inequality when f = χA, g = χB, and h = 2 χC for finite measure sets A, B, C ⊆ R . To simplify notation, we write I(A, B, C) := ∗e ∗e ∗e 1 I(χA, χB, χC ). First, observe that I(A, B, C) ≤ I(A ,B ,C ) for any e ∈ S . This follows from Fubini’s theorem and the one-dimensional Riesz rearrangement inequality, since in this case Steiner symmetrization is essentially symmetric rearrangement one direction. Next, we define sets Ak,Bk, and Ck for k ≥ 1 by more or less iterating Steiner sym- ∗ ∗ metrization in various directions, with the goal of showing Ak → A ,Bk → B , and ∗ Ck → C . So let α be an irrational multiple of 2π and let Rα denote the rotation around the origin by α. Let X and Y denote the Steiner symmetrizations along the x and y axes, ∗1 ∗1 respectively. Note then that Xf = f and Y f = Rπ/2 R−π/2f . For k ≥ 1, define

k Ak = (YXRα) (A) k Bk = (YXRα) (B) k Ck = (YXRα) (C).

In words, we obtain Ak by rotating A, symmetrizing in the X direction and then sym- metrizing in the Y direction, and then iterating this process k times. Because these opera- tions are invariant under measure, |Ak| = A. Moreover, Ak has double-reflection symmetry over both the x and y axes, and hence we can write

 2 A Ak = (x, y) ∈ R : |y| < ωk (|x|)

A where ωk : [0, ∞) → [0, ∞) is non-increasing, symmetric, and lower-semicontinuous. The analogous statements obviously hold for B and C. L2 ∗ Our goal is to show that χAk −→ χA , and similarly for B and C. To see why this is sufficient, first observe that

∗ ∗ ∗ |I(A ,B ,C ) − I(Ak,Bk,Ck)| ZZ

≤ (χA∗ − χA )(x)χB∗ (x − y)χC∗ (y) dx dy k ZZ

+ χA (x)(χB∗ − χB )(x − y)χC∗ (y) dx dy k k ZZ

+ χA (x)χB (x − y)(χC∗ − χC )(y) dx dy k k k

∗ ∗ ∗ ∗ ∗ ≤ kχA − χAk kL2 kχB ∗ χC kL2 + kχB − χBk kL2 kχAk ∗ χC kL2

∗ + kχC − χCk k kχAk ∗ χBk kL2

∗ ∗ ∗ ∗ ∗ ≤ kχA − χAk kL2 kχB kL1 kχC kL2 + kχB − χBk kL2 kχAk kL2 kχC kL1

∗ + kχC − χCk k kχAk kL1 kχBk kL1 .

∗ Because A, B, and C are finite measure sets, if we can show that χAk → χA and likewise ∗ ∗ ∗ for B and C, then this shows that I(Ak,Bk,Ck) → I(A ,B ,C ). Next, observe that

I(Ak+1,Bk+1,Ck+1) = I(YXRαAk,YXRαBk,YXRαCk)

≥ I(XRαAk,XRαBk,XRαCk)

≥ I(RαAk,RαBk,RαCk)

= I(Ak,Bk,Ck).

The first and second inequalities follow from the one-dimensional Riesz rearrangement inequality, and the last equality is because the quantity I(f, g, h) is invariant under rotation,

118 via the obvious change of variables. Hence, I(Ak,Bk,Ck) is an increasing sequence which converges to I(A∗,B∗,C∗). In particular,

∗ ∗ ∗ I(A, B, C) = I(A0,B0,C0) ≤ I(A ,B ,C ) as desired. With this goal in mind, we make the following observations. Suppose that F and G are finite measure sets. Then kRαχF − RαχGkL2 = kχF − χGkL2 , and by the inequality ∗ ∗ kf − g kL2 ≤ kf − gkL2 we also have

kXχF − XχGkL2 ≤ kχF − χGkL2

kY χF − Y χGkL2 ≤ kχF − χGkL2 .

Consequently, we may assume without loss of generality that A, B, and C are bounded 0 sets. Indeed, for every ε > 0 there exists an r > 0 and a set A ⊆ Br(0) such that 0 k 0 kχA − χA0 kL2 < ε. Let Ak = (YXRα) (A ). Then by the above inequalities we have

0 0 χAk − χA ≤ kχA − χA kL2 < ε. k L2

0 Thus, it is enough to consider convergence of the finite measure sets Ak, and similarly for B and C. From now on we suppose that A, B, and C are bounded. This implies that the func- A B C tions ωk , ωk , and ωk are uniformly bounded. For now we focus on A, as the arguments for B and C are exactly the same. Using a diagonalization argument and passing to a subse- A quence l(k), we see that {ωl(k)} converges at every single rational number. By monotonicity, A {ωl(k)} converges everywhere except at countably many points, at most. This implies that ˜ ˜ χAl(k) → χA˜ almost everywhere for some set A. Necessarily, A has double reflection sym- metry across the x and y axes. To conclude that A˜ = A∗, which then concludes the proof of ˜ ˜ the d = 2 inequality, it suffices to show that |A| is a ball, since |A| = limk→∞ |Al(k)| = |A|. We will accomplish this by essentially showing that RαA˜ = A˜, and using that fact that α is an irrational multiple of 2π and thus A˜ has rotational symmetry on a dense set of angles. ∗ ∗ To do this, we take advantage of the equality case in the kf − g kL2 ≤ kf − gkL2 , which presumes f to be strictly symmetric decreasing. Let

2 2 γ(x, y) = e−|x| −|y|

2 be a Gaussian on R , which is indeed strictly symmetric decreasing. Let ak = kγ − χAk kL2 . Note that Rαγ = γ, Xγ = γ, and Y γ = γ. So

a = γ − χ = kYXR γ − YXR χ k ≤ kγ − χ k k+1 Ak+1 L2 α α Ak L2 Ak L2 = ak.

Thus, {ak} is a decreasing sequence. Because ak ≥ 0, a := limk→∞ ak = limk→∞ al(k) exists.

Moreover, by the dominated convergence theorem, a = γ − χA˜ L2 . Next, note that

al(k)+1 − γ − YXRαχA˜ 2 = γ − χAl(k)+1 − γ − YXRαχA˜ 2 L L2 L

≤ χAl(k)+1 − YXRαχA˜ L2

= YXRαχAl(k) − YXRαχA˜ L2

≤ χAl(k) − χA˜ L2

119 which → 0 as k → ∞. On the other hand,

al(k)+1 − γ − YXRαχA˜ L2 → a − γ − YXRαχA˜ L2 as k → ∞. Thus, a = γ − YXRαχA˜ L2 . But we have

a = γ − YXRαχA˜ L2 = Y γ − YXRαχA˜ L2 ≤ γ − XRαχA˜ L2

= Xγ − XRαχA˜ L2 ≤ γ − RαχA˜ L2

= γ − χA˜ L2 = a.

Therefore, we must have γ − RαχA˜ L2 = γ − XRαχA˜ L2 , i.e., ZZ ZZ 2 2 |γ − RαχA˜| dx dy = |γ − XRαχA˜| dx dy.

R 2 R 2 R 2 R Because |γ − RαχA˜| dx ≤ |γ − XRαχA˜| dx, it follows that |γ − RαχA˜| dx = |γ − 2 XRαχA˜| dx for almost every y. Thus, for almost every y, XRαχA˜ = RαχA˜ for almost every 2 x. By Fubini’s theorem, XRαχA˜ = RαχA˜ almost everywhere in R . The same argument 2 shows that YXRαχA˜ = XRαχA˜ almost everywhere in R , and hence YXRαχA˜ = RαχA˜ 2 almost everywhere in R . Thus, RαχA˜ has double reflection symmetry about both axes. If P denotes reflection across the y-axis, then

RαχA˜ = PRαχA˜ = R−αP χA˜ = R−αχA˜ which implies that χA˜ = R2αχA˜. Because α, and hence 2α, is an irrational multiple of 2π, the set {n(2α) mod 2π} is dense in [0, 2π). Thus, the function

µ(θ) = χA˜ − Rθ has dense set of zeroes. To conclude that A˜ is a circle, and thus conclude the proof of the Riesz rearrangement inequality in the d = 2 case, it suffices to prove that µ is continuous in θ. By writing and expanding µ2 as an inner product, it suffices to show continuity of the map ZZ θ 7→ χA˜ · RθχA˜ dx dy.

∞ To see this, let fn ∈ C such that fn → χA˜. Then

ZZ ZZ 1 2 χ ˜ · Rθχ ˜ dx dy − fn · Rθχ ˜ dx dy ≤ χ ˜ − fn 2 · |A| → 0 A A A A L RR uniformly in θ. Because θ 7→ fn · Rθχ ˜ dx dy is continuous, it then follows that θ 7→ RR A χA˜ · RθχA˜ dx dy is a uniform limit of continuous functions, hence continuous. Thus, µ is continuous, and so A˜ is a ball and therefore A˜ = A∗. The case d > 2. d−1 d d Let e ∈ S ⊆ R . Recall that the Steiner symmetrization of f : R → [0, ∞) along e ∗e −1 ∗1 is f (x) = (f ◦ ρ ) (ρx), where ρ is a rotation defined by ρe = e1. Define the Schwartz symmetrization along directions perpendicular to e by

∗e⊥ −1 ∗e⊥ f (x) := (f ◦ ρ ) 1 (ρx),

⊥ where ∗e1 indicates symmetrization in the variables x2, . . . , xd, keeping x1 fixed. As before, to prove the d > 2 case it suffices to consider f = χA, g = χB, and h = χC for finite and bounded measure sets A, B, C. Let R be a rotation such that Red =

120 ed−1. Let Y denote the Steiner symmetrization along ed, and let X denote the Schwartz symmetrizations perpendicular to ed. Let

k Ak = (YXR) A k Bk = (YXR) B k Ck = (YXR) C.

0 0 Write x = (x , xd) where x = (x1, . . . , xd−1). By induction, we have ZZ 0 0 0 0 0 0 f(x , xd)g(x − y , xd − yd)h(y , yd) dx dy ZZ 0 0 0 0 0 0 ≤ (Xf)(x , xd)(Xg)(x − y , xd − yd)(Xh)(y , yd) dx dy .

In particular, I(Ak+1,Bk+1,Ck+1) ≥ I(Ak,Bk,Ck), so by the same argument as before it L2 ∗ suffices to prove χAk −→ χA , and likewise for B and C. By construction, we can write

n d A 0 o Ak = x ∈ R : |xd| < ωk (|x |)

A A where ωk is a symmetric and non-increasing function. The ωk ’s are uniformly bounded by boundedness of A. Exactly as before, we may pass to a subsequence to get χAl(k) → χA˜ almost everywhere. As each Al(k) is rotationally symmetric around the xd axis, the limiting set A˜ is also rotationally symmetric around the xd axis. Also as before, define the Gaussian ˜ ˜ γ and set ak = kγ − χAk kL2 . Running through the same computation gives YXRA = RA almost everywhere. Hence, RA˜ is rotationally symmetric along the xd axis, which implies that A˜ is rotationally symmetric around the xd axis and the xd−1 axis. We will show that this implies that A˜ is a ball. Then because |A˜| = |A| = |A∗| it will follow that A˜ = A∗ almost everywhere, as desired. Let ϕε be a radial smooth approximation to the identity and set χε(x) = (ϕε ∗ χA˜)(x). Then χε and χA˜ have the same symmetry properties. As such, we can write

q 2 2 q 2 2 2 χε(x1, . . . , xd) = u( x1 + ··· + xd−1, xd) = v( x1 + ··· + xd−2 + xd, xd−1)

2 2 2 for smooth functions u and v of two variables. Writing ρ = x1 + ··· + xd−1, we have q q 2 2 2 2 u( ρ + xd−1, xd) = v( ρ + xd, xd−1) for all xd−1, xd ∈ R and for all ρ > 0. In particular, setting xd = 0 we have q 2 2 p 2 u( ρ + xd−1, 0) = v( ρ , xd−1) 2 2 2 2 for all xd−1 ∈ R and for all ρ > 0. Choosing ρ = x1 +···+xd−2 +xd gives χε(x) = u(|x|, 0), hence χε is spherically symmetric and so A˜ is a ball. This completes the proof of the inequality for d > 2. The case of equality. Finally, we consider the equality statement. Suppose that g is strictly symmetric de- creasing and that equality holds. As before, we assume that f = χA and h = χB for finite 0 measure sets A and B. Using the (x , xd) notation from above, we have Z Z Z Z 0 0 0 0 0 0 f(x , xd)g(x − y , xd − yd)h(y , yd) dx dy dxd dyd R R Rd−1 Rd−1 Z Z Z Z ∗ 0 ∗ 0 0 ∗ 0 0 0 = f (x , xd)g (x − y , xd − yd)h (y , yd) dx dy dxd dyd. R R Rd−1 Rd−1

121 By the inequality in the statement of the theorem, Z Z 0 0 0 0 0 0 f(x , xd)g(x − y , xd − yd)h(y , yd) dx dy Rd−1 Rd−1 Z Z ∗ 0 ∗ 0 0 ∗ 0 0 0 ≤ f (x , xd)g (x − y , xd − yd)h (y , yd) dx dy . Rd−1 Rd−1 Hence, the assumed equality gives Z Z 0 0 0 0 0 0 f(x , xd)g(x − y , xd − yd)h(y , yd) dx dy Rd−1 Rd−1 Z Z ∗ 0 ∗ 0 0 ∗ 0 0 0 = f (x , xd)g (x − y , xd − yd)h (y , yd) dx dy Rd−1 Rd−1 for almost all (xd, yd).

By induction we get sets Axd and Byd corresponding to the functions f(·, xd) and h(·, yd) d−1 which are balls in R centered at the same point, by the equality case of the one dimen- sional Riesz rearrangement inequality. Also, by the same argument as before, this central point is independent of xd and yd. Thus, A and B are rotationally symmetric with respect to a line L1 which is parallel to the xd axis. Similarly, A and B must also be rotationally symmetric with respect to another line L2 parallel to the ed−1 axis. We claim that L1 ∩L2 6= ∅. In d = 2, this is obvious, but the existence of skew-symmetric lines in higher dimensions makes the d ≥ 3 case more complicated. This can be seen via a 1 ping pong argument: if L1 and L2 do not intersect, then by rotating around L2 180 degrees it follows that the sets are rotationally symmetric about L1 and and an affine shift of L1. By rotating around this shift of L1, the sets are then also rotationally symmetric around an affine shift of L2. Continuing this process, we contradict the finite measure of A and B. Thus, A and B are rotationally symmetric about two axes. As we argued previously, it follows that A and B are balls.

11.4 The Polya-Szego inequality

Our first application of the Riesz rearrangement inequality is a special case of the Polya- Szego inequality. We begin with a lemma. d 1 1 Lemma 11.14. Let f : R → C be such that f ∈ Lloc and ∇f ∈ Lloc. Then ( (u∇u+v∇v)(x) if f = u + iv 6= 0 ∇|f|(x) = |f(x)| 0 if f = u + iv = 0 in the sense of distributions. In particular, |∇|f|| ≤ |∇f| almost everywhere. Proof. First we demonstrate the “in particular” statement, assuming the formula for ∇|f|. For f(x) 6= 0 (up to a choice of representative), we have by Cauchy-Schwarz u2|∇u|2 + v2|∇v|2 + 2uv∇u∇v (x) |∇|f(x)||2 = |f(x)|2 u2|∇u|2 + v2|∇v|2 + u2|∇v|2 + v2|∇u|2 (x) ≤ |f(x)|2 (u2 + v2)(|∇u|2 + |∇v|2) = (x) |f|2 = |∇f(x)|2

1Drawing a picture here is helpful.

122 as desired. ∞ To prove the formula in the statement, let ψ ∈ Cc . We wish to show Z Z u∇u + v∇v ∇ψ |f| dx = − ψ dx. {f6=0} |f| The first issue we need to deal with is moving the derivative onto |f|. To do this, for δ > 0 p p consider δ2 + |f|2. Note that for small δ, δ2 + |f|2 ≤ 1 + |f|, so by the dominated convergence theorem Z p Z ∇ψ δ2 + |f|2 dx → ∇ψ |f| dx as δ → 0. Next, let ϕε be an approximation to the identity, uε = u ∗ ϕε, and vε = v ∗ ϕε. 1 Then uε → u and vε → v almost everywhere and in Lloc. Also, ∇(uε + ivε) → ∇f almost 1 everywhere and in Lloc. Note that Z p 2 2 2 p 2 2 2 ∇ψ δ + u + v − δ + uε + vε dx Z (u − u )(u + u ) + (v − v )(v + v ) = ∇ψ √ ε ε ε ε dx 2 2 2 p 2 2 2 δ + u + v + δ + uε + vε so that Z p p  ∇ψ δ2 + u2 + v2 − δ2 + u2 + v2 dx ε ε Z |u − u |(|u| + |u |) + |v − v |(|v| + |v |) ≤ |∇ψ| √ ε ε ε ε dx 2 2 2 p 2 2 2 δ + u + v + δ + uε + vε Z ≤ 2 |∇ψ| (|u − uε| + |v − vε|) dx.

1 Because uε → u and vε → v in Lloc, since ψ has compact support, it follows that Z Z p 2 2 2 p 2 2 2 ∇ψ δ + uε + vε dx → ∇ψ δ + u + v dx as ε → 0. Integrating by parts, we then have Z Z p 2 2 2 p 2 2 2 ∇ψ δ + u + v dx = lim ∇ψ δ + uε + vε dx ε→0 Z u ∇u + v ∇v = − lim ψ ε ε ε ε dx ε→0 p 2 2 2 δ + uε + vε Z u ∇u + v ∇v = − lim ψ ε ε dx ε→0 p 2 2 2 δ + uε + vε Z u (∇u − ∇u) + v (∇v − ∇v) − lim ψ ε ε ε ε dx. ε→0 p 2 2 2 δ + uε + vε Consider the second integral. We have Z Z uε(∇uε − ∇u) + vε(∇vε − ∇v) ψ dx ≤ |ψ| (|∇uε − ∇u| + |∇vε − ∇v|) dx p 2 2 2 δ + uε + vε

1 Again by Lloc convergence and compact support of ψ, this integral converges to 0 as ε → 0. In the first integral, observe that

Z u ∇u + v ∇v Z ψ ε ε dx ≤ |ψ| (|∇u| + |∇v|) dx < ∞. p 2 2 2 δ + uε + vε

123 Thus, by the dominated convergence theorem, Z p Z u ∇u + v ∇v ∇ψ δ2 + u2 + v2 dx = − lim ψ ε ε dx ε→0 p 2 2 2 δ + uε + vε Z u∇u + v∇v = − ψ √ dx. δ2 + u2 + v2 Letting δ → 0 gives Z Z u∇u + v∇v ∇ψ|f| dx = − ψ dx |f| as desired.

d 1 Theorem 11.15 (Polya-Szego). Let f : R → C be vanishing at infinity such that f ∈ Lloc and 2 ∗ ∇f ∈ L . Then k∇f kL2 ≤ k∇fkL2 . 2 1 Proof. First, note that ∇f ∈ L implies ∇f ∈ Lloc, hence by the previous lemma we have |∇|f|| ≤ |∇f|. Consequently, it suffices to consider functions f satisfying f ≥ 0. Next, we make a further reduction. For ε > 0, define    0 f ≤ ε 1  −1 fε := min , [f − ε]+ = f − ε ε ≤ f ≤ ε + ε = ϕε ◦ f ε  1 −1 ε f ≥ ε + ε where  0 x ≤ ε  −1 ϕε(x) = x − ε ε ≤ x ≤ ε + ε .  1 −1 ε x ≥ ε + ε Note that Z Z 1 −1 |fε| dx = {f ≥ ε + ε } + |f − ε| dx ε {ε≤f≤ε+ε−1} 1 −1 1 ≤ {f ≥ ε + ε } + |{f ≥ ε}| . ε ε R Because f vanishes at infinity, the above super-level sets have finite measure, hence |fε| dx 1 2 is finite and so fε ∈ L . Similarly, it can be shown that fε ∈ L . Also note that

L2 ∇fε = ∇fχ{ε≤f≤ε+ε−1} −→∇f as ε → 0 by the monotone convergence theorem. ∗ ∗ Next, recall that because ϕε is symmetric and non-decreasing, (fε) = (ϕε ◦ f) = 2 ∗ ∗ ∗ ∗ L ∗ ϕε ◦ f = (f )ε. Thus, we have ∇(fε) = ∇(f )ε −→ ∇f . This implies that it suffices to consider f ∈ L1 ∩ L2 and ∇f ∈ L2. By Plancharel and the monotone convergence theorem, we have

Z Z Z 1 − e−t|ξ|2 |∇f|2 dx = |ξ|2|fˆ(ξ)|2 dξ = lim |fˆ(ξ)|2 dξ. t→0 t Because f ∈ L1 and f ≥ 0, we can distribute and use Fubini to get Z  Z Z Z  2 1 2 −d −t|ξ|2 −ix·ξ iy·ξ |∇f| dx = lim kfk 2 − (2π) e e f(x) dx e f(y) dy dξ t→0 t L  ZZ Z  1 2 −d −t|ξ|2−i(x−y)·ξ = lim kfk 2 − (2π) f(x)f(y) e dξ dx dy t→0 t L  2 1  ZZ Z i(x−y) |x−y|2  2 −d −t ξ+ 2t − = lim kfk 2 − (2π) f(x)f(y) e e 4t dξ dx dy t→0 t L

124 d 2 Pd 2 where we are using the convention that for z ∈ C , z = j=1 zj . Changing variables and integrating the resulting ξ-Gaussian then gives

Z  ZZ 2  2 1 2 −d − d d − |x−y| |∇f| dx = lim kfk 2 − (2π) t 2 π 2 f(x) e 4t f(y) dx dy . t→0 t L By the Riesz rearrangement inequality, and by equi-measurability of f ∗ with f,

Z  ZZ 2  2 1 ∗ 2 −d − d d ∗ − |x−y| ∗ |∇f| dx ≥ lim kf k 2 − (2π) t 2 π 2 f (x) e 4t f (y) dx dy . t→0 t L Beginning with the expression on the right and undoing all of the previous calculations, 2 ∗ 2 we recover the inequality k∇fkL2 ≥ k∇f kL2 .

2 − d − |x−y| We make passing observation that the heat kernel (4πt) 2 e 4t appears in the above proof. This use of the heat kernel is possible because the stated estimate is in L2. A gener- alization of the Polya-Szego inequality holds for other values of p, but the above method does not give this result.

11.5 The Energy of the Hydrogen Atom

As an application of the Polya-Szego inequality, we consider the energy functional Z |f(x)|2 E(f) := |∇f(x)|2 − dx R3 |x| 1 3 for f ∈ H (R ). Such a functional can be taken to represent the energy of a hydrogen atom. We will demonstrate that inf E(f) kfkL2 =1 is achieved by a symmetric decreasing function. Via a scaling argument, we show that the 3 ∞ 3 − 2 x  infimum is negative. Let ϕ ∈ Cc (R ) with kϕkL2 = 1. For λ > 0, let ϕλ(x) = λ ϕ λ . Then kϕλkL2 = kϕk2 = 1, and x  2 Z x 2 Z ϕ E(ϕ ) = λ−2λ−3 (∇ϕ) dx − λ−3 λ dx λ λ |x| Z 2 −2 2 −1 |ϕ(x)| = λ k∇ϕk 2 − λ dx L |x|

2 |ϕ(x)|2 Because k∇ϕkL2 and |x| dx are fixed numbers, for large λ the above quantity is negative. E := inf E(f) < 0 Hence, min kfk2=1 . To show that the problem is well-posed, we also must prove that the infimum is not identically −∞. To do this, we first recall Hardy’s inequality: d Proposition 11.16 (Hardy’s Inequality). Let f ∈ S(R ) and 0 ≤ s < d. Then

f(x) s s . k|∇| fkLp |x| Lp d for all 1 < p < s . By this inequality, we have

f(x) 1 1 1 1 1 |∇| 2 f(x) = |ξ| 2 fˆ(ξ) = |ξ| 2 |fˆ(ξ)| 2 |fˆ(ξ)| 2 1 . 2 2 2 2 L L L |x| L2 1 1 1 ≤ |ξ| 2 |fˆ(ξ)| 2 |fˆ(ξ)| 2 L4 L4 1 1 2 2 = k∇fkL2 kfkL2 .

125 2 So for kfkL2 = 1, E(f) ≥ k∇fkL2 − C k∇fkL2 where the constant C comes from Hardy’s 2 inequality. So Emin ≥ infx>0 x − Cx > −∞ as desired. 1 3 Next, we show that the infimum is achieved. Let fn ∈ H (R ) be a decreasing minimiz- ing sequence, that is, kfnkL2 = 1 for all n and E(fn) → Emin. First, we claim that we can find a minimizing sequence of radially symmetric decreasing functions. Indeed, consider ∗ ∗ ∗ fn. We have kfnkL2 = kfnkL2 = 1, and by the Polya-Szego inequality, k∇fnkL2 ≤ k∇fnkL2 . By the Hardy-Littlewood rearrangement estimate, Z |f (x)|2  1   1   1  Z |f ∗(x)|2 n dx = , |f (x)|2 ≤ , (|f (x)|2)∗ = , |f ∗(x)|2 = n dx. |x| |x| n |x| n |x| n |x| ∗ ∗ Hence, E(fn) ≤ E(fn), which implies that fn is also a minimizing sequence. Consequently, we may assume from now on that each fn is radially symmetric and decreasing. 1 3 Next, we claim that the sequence {fn} is bounded in H (R ). If it were not, then because 2 kfnkL2 = 1, the quantities k∇fkL2 are unbounded. But since E(f) ≥ k∇fkL2 − C k∇fkL2 , this implies that lim sup E(fn) = ∞. This contradicts the assumption that E(fn) decreases 1 3 1 3 to Emin < 0. Hence, {fn} is indeed bounded in H (R ). Because H (R ) can be reformu- lated on the Fourier side as an L2-space with a weighted measure, and bounded sets in L2 spaces are weakly compact, it follows that (passing to a subsequence) fn * f weakly in 1 3 H (R ). We need to demonstrate that the weak limit f is an optimizer. First, we will show that

E(f) ≤ lim inf E(fn) = Emin 2 as desired. By weak lower semi-continuity of the L norm, k∇fkL2 ≤ lim inf k∇fnkL2 . This is great, but it doesn’t help in demonstrating the energy inequality because it tells R |f(x)|2 us nothing about the potential energy term − |x| dx. To remedy this, we invoke the following fact. 1 3 p 3 Lemma 11.17 (Strauss). The space H (R ) embeds compactly into L (R ) for 2 < p < 6. The proof of this lemma is found in Chapter 12, after discussing some compactness results in Lp spaces. Here we only remark that the upper bound 6 for p comes from the scaling condition of the Sobolev embedding theorem. 1 3 In particular, this lemma implies that because fn * f weakly in H (R ), then fn → f p 3 strongly in L (R ) for 2 < p < 6. With this in mind, we estimate as follows: Z 2 Z 2 Z |fn(x)| |f(x)| (|fn| − |f|)(|fn| + |f|) dx − dx ≤ dx |x| |x| |x| Z |f − f|(|f | + |f|) ≤ n n dx. |x| p In order to use the fact that fn → f strongly in certain L spaces, we want to apply Holder’s 1 p inequality. But since |x| is only in weak L spaces, we split up the integral: Z 2 Z 2 |fn(x)| |f(x)| dx − dx |x| |x| Z |f − f|(|f | + |f|) Z |f − f|(|f | + |f|) ≤ n n dx + n n dx. |x|≤1 |x| |x|≥1 |x| 1 p 3 In the first integral, |x| ∈ L (B(0, 1) ⊆ R ) for p < 3. Applying Holder’s inequality twice gives Z |fn − f|(|fn| + |f|) 1 dx ≤ k|fn − f|(|fn| + |f|)kL2(B(0,1)) |x|≤1 |x| |x| L2(B(0,1))

1 ≤ kfn − fkL4 (kfnkL4 + kfkL4 ) . |x| L2(B(0,1))

126 The first norm is finite by the above comment. We claim that the quantities kfnkL4 + kfkL4 are bounded in n. To see this, recall the Gagliardo-Nirenberg inequality:

Proposition 11.18 (Gagliardo-Nirenberg inequality). Fix d ≥ 1 and 0 < p < ∞ for d = 1, 2, 4 d or 0 < p < d−2 for d ≥ 3. Then for all f ∈ S(R ),

pd pd p+2 p+2− 2 2 kfkLp+2 . kfkL2 k∇fkL2 .

A proof of this inequality is also contained in Chapter 12. When p = 2 and d = 3, 4 3 the inequality above takes the form kfkL4 . kfkL2 k∇fkL2 . Because {fn} is a bounded 1 3 sequence in H (R ), it follows that kfnkL4 +kfkL4 is bounded. Then because kfn − fkL4 → 0, Z |f − f|(|f | + |f|) n n dx → 0. |x|≤1 |x| Similarly, Z |fn − f|(|fn| + |f|) 1 dx ≤ kfn − fkL4 (kfnkL2 + kfkL2 ) . kfn − fkL4 . |x|≥1 |x| |x| L4(|x|≥1)

Thus, Z |f − f|(|f | + |f|) n n dx → 0. |x|

From this and the fact that k∇fkL2 ≤ lim inf k∇fnkL2 it follows that

E(f) ≤ lim inf E(fn) = Emin as desired. In fact, a stronger statement is true. All of the optimizers of the energy functional E(f) above are radially symmetric; in particular, any optimizer satisfies f = eiθf ∗. We sketch the argument here. If f is an optimizer, then as seen above it follows that f ∗ is also an optimizer. ∗ ∗ From E(f) = E(f ) and Polya-Szego we deduce the equalities k∇fk 2 = k∇f k 2 and L L f(x) f ∗(x) ∗ 1 = 1 . By the equality case of our first rearrangement inequality, |f| = f |x| 2 L2 |x| 2 L2 almost everywhere. Unfortunately this does not rule out the possibility that f itself is not radial. However, we then have

∗ k∇fkL2 = k∇f kL2 = k∇|f|kL2 .

Having previously shown |∇|f|| ≤ |∇f| almost everywhere, it follows that |∇|f|| = |∇f| almost everywhere. Then if f = u + iv, we have u∇v = v∇u almost everywhere. Writing f = reiθ where r and θ are functions of x, we then have r∇θ = 0 almost everywhere. Because f is not identically 0, θ is constant as desired.

127 Chapter 12

Compactness in Lp Spaces

The analysis of the energy of the hydrogen atom in Chapter 11 is a model problem in the calculus of variations. Given a functional on some spaces of functions, we wish to determine its optimimum value, either a supremum or infimum, and moreover show that this optimum is achieved. By definition of the supremum and infimum, we can always find a sequence of functions whose functional values tend towards the optimum. From here, we would ideally invoke some sort of compactness of the function space to extract a limiting function which optimizes the functional. However, as we know from previous mathematical experiences, compactness in infinite dimensional spaces, and in particular function spaces, is far from ideal. For example, it is well-known that a closed and bounded set in an infinite dimensional Banach space is not compact in the norm-topology. We can, however, achieve varying degrees of compactness by considering weaker topologies. Also well-known is the the following result, which we have invoked throughout the text already. Theorem 12.1 (Banach-Alaoglu). Let X be a Banach space. A closed and bounded set in X∗ is compact in the weak-* topology. If X is reflexive, it immediately follows that a closed and bounded set in X is weakly compact. In the context of Lp-spaces (where we also have separability), this is manifested by the following theorem.

p d Theorem 12.2 (Weak compactness). If {fn} is a bounded sequence in L (R ) for 1 < p < ∞, p d then, passing to a subsequence, fn * f for some f ∈ L (R ). Oftentimes, weak compactness is not enough to solve optimization problems. For ex- ample, if fn is a translation of a bump function whose profile marches off to infinity, then fn * 0. As such, we must seek compactness with more care. The next four chapters of this text may be taken a single unit; a careful study of com- pactness in various function spaces and its consequences in harmonic analysis and partial differential equations. In the present chapter, we begin by developing a foundation of standard compactness results, and in Chapters 13, 14, and 15 we study the delicate no- tion of concentration compactness, which measures the defects of compactness in various contexts.

12.1 The Riesz Compactness Theorem

The first theorem we prove addresses the question of precompactness in the norm topology. For continuous functions, the answer is given by the well-known Arzela-Ascoli theorem. The main compactness tool used in this chapter is an analogous result for Lp spaces.

p d Theorem 12.3 (Riesz compactness). Fix 1 ≤ p < ∞. A family F ⊆ L (R ) is precompact in p d L (R ) if and only if it satisfies the following three properties:

128 1. (Boundedness) There exists A > 0 such that kfkLp ≤ A for all f ∈ F. 2. (Equicontinuity) For every ε > 0 there exists a δ > 0 such that R |f(x + y) − f(x)|p dx < ε for all |y| < δ and for all f ∈ F.

R p 3. (Tightness) For every ε > 0, there exists an R > 0 such that |x|≥R |f(x)| dx < ε for all f ∈ F.

Proof. The forward direction is fairly straightforward. Suppose that F is precompact in p d Sn ε L (R ). Then for every ε > 0, there exists f1, . . . , fn ∈ F such that F ⊆ j=1 B(fj, 10 ). ε Because any function f ∈ F lies in one of these balls, kfkLp ≤ 10 + maxj kfjkLp . Thus, F is bounded. Next, for |y| < δ = δ(ε) << 1 sufficiently small, we have ε kf(· + y) − fk p ≤ 2 + max kfj(· + y) − fjk p < ε, L 10 j L so that F is equicontinuous. Finally, ε kfk p ≤ + max kfjk p < ε L (|x|≥R) 10 j L (|x|≥R) for R sufficiently large. Hence, F is tight. The other direction is more involved. Suppose that F is bounded, equicontinuous, and Sn tight. Fix ε > 0. We need to show the existence of f1, . . . , fn ∈ F so that F ⊆ j=1 B(fj, ε). We will approximate F by a continuous family of functions and then apply Arzela- ∞ R Ascoli. Let ϕ ∈ Cc be a radial bump function satisfying ϕ dx = 1 and

 c |x| ≤ 1 ϕ(x) = d 2 0 |x| > 1 for some dimensional constant cd. For R > 0 and f ∈ F, define  x  Z  x   x  Z f (x) = ϕ f x − ϕ(y) dy = ϕ f(y) Rdϕ(R(x − y)) dy. R R R R

The passage from f to fR “fuzzes” f at a scale 1/R. Define FR = { fR : f ∈ F }. Because fR is defined as convolution with a smooth function multiplied by a compactly supported function, FR is a continuous family of functions supported on B(0,R). Also,  Z    x   x   x  kf − fRk p = 1 − ϕ f(x) + ϕ f(x) − f x − ϕ(y) dy . L p R L R R Lp Because F is tight, we can estimate the first norm by   x  ε

1 − ϕ f(x) = kfkLp(|x|>R) < R Lp 10 for R = R(ε) >> 1 sufficiently large. For the second norm, we use the fact that ϕ has unit mass. We have  Z  Z  x   x    x  ϕ f(x) − f x − ϕ(y) dy ≤ f(x) − f ϕ(y) dx R R p R p L Lx Z  x  ≤ |ϕ(y)| f(x) − f dy p R Lx ε ≤ 10 ε for R sufficiently large, by equicontinuity of F. Thus, kf − fRkLp < 5 for R sufficiently large.

129 Next, we claim that FR is uniformly bounded and equicontinuous. To see boundedness, by Young’s inequality and a change of variables we have

 y  d d p p kf k ∞ f x − kϕk 0 R kfk p R A. R L . p Lp . L . R Ly To see equicontinuity, write     Z x + y  x   z  |fR(x + y) − fR(x)| ≤ ϕ − ϕ f x + y − ϕ(z) dz R R R Z  x  h  z   z i + ϕ f x + y − − f x − ϕ(z) dz R R R x + y   x  d p ϕ − ϕ R kfk p kϕk p0 . R R L L  z   z  + f x + y − − f x − kϕk 0 p Lp R R Lz x + y   x  d d p p ϕ − ϕ R A + R kf(· + y) − f(·)k p . . R R L Because ϕ is smooth and because F is tight, for |y| < δ = δ(ε, R) with δ sufficiently small, |fR(x + y) − fR(x)| < ε uniformly in f. Therefore, as FR is a continuous family of uniformly bounded and equicontinuous compactly supported functions, it follows by the Arzela-Ascoli theorem that FR is pre- ∞ compact in the L topology. So for every ε˜ > 0 there exists f1, . . . , fn ∈ F such that Sn FR ⊆ j=1 BL∞ ((fj)R, ε˜). Fix f ∈ F. There exists a j such that fR ∈ BL∞ ((fj)R, ε˜). By the triangle inequality, we have

kf − fjkLp ≤ kf − fRkLp + kfj − (fj)RkLp + kfR − (fj)RkLp ε ε d < + + c kf − (f ) k R p . 5 5 d j j R L∞

Because kfj − (fj)RkL∞ < ε˜, by choosing ε˜sufficiently small it follows that kf − fjkLp < ε, Sn and hence F ⊆ j=1 B(fj, ε) as desired.

Our first corollary of the Riesz compactness theorem is an alternative characterization of precompact families when p = 2. In particular, in this case we can replace the equiconti- nuity condition by tightness in the Fourier domain. 2 d Corollary 12.4. A family F ⊆ L (R ) is precompact if and only if it satisfies the following two properties:

1. There exists A > 0 such that kfkL2 ≤ A for all f ∈ F. 2. For every ε > 0, there exists R > 0 such that Z Z |f(x)|2 dx + |fˆ(ξ)|2 dξ < ε |x|≥R |ξ|≥R

for all f ∈ F. Proof. The forward direction is almost exactly the same as in the proof of the Riesz com- R ˆ 2 pactness theorem; tightness in the Fourier domain (i.e. the estimate |ξ|≥R |f(ξ)| dξ < ε) 2 d comes from the fact that the Fourier transform is an isometry on L (R ). Conversely, suppose the two conditions hold. By the Reisz compactness theorem, it suffices to verify that F is equicontinuous.

130 2 d Fix ε > 0. Using the fact that the Fourier transform is an isometry on L (R ), for any R > 0 we can write

iy·ξ kf(x + y) − f(x)k 2 = e fˆ(ξ) − fˆ(ξ) Lx 2 Lξ h i ≤ eiy·ξfˆ(ξ) − fˆ(ξ) + fˆ(ξ) eiy·ξ − 1 . L2(|ξ|≥R) L2(|ξ|≤R)

When |ξ| is large, there is a lot of oscillation coming from the eiy·ξ phase. Thus, we simply use the triangle quality and tightness of fˆ to estimate the first term: ε eiy·ξfˆ(ξ) − fˆ(ξ) ≤ 2 fˆ < L2(|ξ|≥R) L2(|ξ|≥R) 2 for R sufficiently large. For the second term, we have h i fˆ(ξ) eiy·ξ − 1 ≤ fˆ· |y| · |ξ| ≤ |y|RA. L2(|ξ|≤R) L2(|ξ|≤R)

Then fˆ(ξ) eiy·ξ − 1 < ε/2 for |y| < δ where δ(R, ε) is sufficiently small. For L2(|ξ|≤R) such a choice of δ, kf(x + y) − f(x)k 2 < ε uniformly in f and hence F is equicontinuous. Lx

12.2 The Rellich-Kondrachov Theorem

1 d In this section, we consider the issue of compactness in the Sobolev space H (R ) and the 1 d homogeneous Sobolev space H˙ (R ). For convenience, we recall their definitions here. Definition 12.5. 1 d d 1. The H (R ) is the completion of S(R ) under the norm kfkH1 := kfkL2 + k∇fkL2 . 1 d d 2. The Hilbert space H (R ) is the completion of S(R ) under the norm kfkH1 := kfkL2 + k∇fkL2 . 1 d 2 In words, functions in H (R ) are L -functions whose (distributional) derivative is also 2 d 1 d 2 d in L (R ). For a function in H˙ (R ) we only assume that the gradient is in L (R ); the 2 d function itself may not be in L (R ). 1 d The two most important estimates in the study of the Sobolev spaces H (R ) and 1 d H˙ (R ) are the Gagliardo-Nirenberg inequality and the p = 2 Sobolev embedding theo- rem, respectively. Both results have already been stated and proven in this text, but we recall them here for convenience. Proposition 12.6 (Gagliardo-Nirenberg). Fix d ≥ 1. Let p satisfy 2 ≤ p < ∞ if d = 1 or d = 2, 2d 1 d and 2 ≤ p < d−2 if d ≥ 3. Then for f ∈ H (R ),

2d−p(d−2) d(p−2) 2p 2p kfkLp . kfkL2 k∇fkL2 . d Proposition 12.7 (Sobolev embedding, p = 2). Fix d ≥ 3. Suppose that f ∈ S(R ) with 2 d ∇f ∈ L (R ). Then kfk 2d . k∇fkL2 . L d−2

From Gagliardo-Nirenberg we have the relationship kfkp . kfkH1 , and from Sobolev embedding we have kfk 2d . kfk ˙ 1 . As promised in Chapter5, we provide a slick proof L d−2 H of Proposition 12.6 using Littlewood-Paley theory, which we previously did not have at our disposal.

131 Proof of Proposition 12.6. It suffices to consider Schwartz functions f. Because the Littlewood-Paley projections of f converge to f in Lp for the prescribed values of p, we have X kfkLp ≤ kfN kLp . N∈2Z By Bernstein’s first inequality,

d d d d 2 − p 2 − p kfN kLp . N kfN kL2 . N kfkL2 .

On the other hand, by Bernstein’s second inequality we have

d d d d 2 − p −1 2 − p −1 kfN kLp . N N k∇fN kL2 . N k∇fkL2 .

Together, this implies that

n d d d d o X 2 − p 2 − p −1 kfkLp . min N kfkL2 ,N k∇fkL2 . N∈2Z

Because the Gagliardo-Nirenberg inequality is trivial when p = 2, we may assume that d d d d d−2 d p > 2. Consequently, 2 − p > 0. Also, 2 − p − 1 = 2 − p < 0. This latter statement is 2d clear when d = 1 or d = 2, and when d ≥ 3 this follows from the fact that p < d−2 . Using the positivity and negativity of the above exponents, we split the sum into two summable series to get the desired inequality. Explicitly,

d d d d X 2 − p X 2 − p −1 kfkLp . N kfkL2 + N k∇fkL2 k∇fk k∇fk L2 L2 N. kfk N& kfk L2 L2 d d d d d d d d 2 − p 1− 2 + p 2 − p 1+ p − 2 . k∇fkL2 kfkL2 + k∇fkL2 kfkL2 2d−p(d−2) d(p−2) 2p 2p . kfkL2 k∇fkL2 .

For an H1-bounded family of functions, the Reisz compactness theorem leads to the following important result.

1 d d Theorem 12.8 (Rellich-Kondrachov, H (R )). Let χR be a cutoff function on BR(0) ⊆ R , p d smooth or not. Then the family F := { χRf : kfkH1 ≤ 1 } is compact in L (R ) for ( 2 ≤ p < ∞ d = 1, 2 2d . 2 ≤ p < d−2 d ≥ 3

1 d Proof. Let B1 denote the unit ball in H (R ). We begin with the observation that F is closed. p To see this, let fn ∈ B1 be a sequence of functions such that χRfn converges in L . Because 2 fn ∈ B1, the sequence fn is bounded in L . Passing to a subsequence, we have fn * f 2 2 weakly in L . This implies that χRfn * χRf weakly in L . As weak limits are unique, it p follows that χRfn → χrf in L and hence F is closed. Thus, to prove the theorem it suffices to prove that F is precompact. We will invoke the Reisz compactness theorem. As such, we need to demonstrate boundedness, equicontinuity, and tightness. By Gagliardo-Nirenberg, kχRfkLp ≤ kfkLp . kfkH1 and hence the family F is bounded. Tightness of F is immediate, as each function in F has compact support.

132 It remains to show equicontinuity. Fix ε > 0. We have

kχ f(x + y) − χ f(x)k p R R Lx ≤ k[χ (x + y) − χ (x)]f(x + y)k p + kχ (x)[f(x + y) − f(x)]k p . R R Lx R Lx

To estimate the first term, we want to use Holder, the Lp continuity property of translations on χR to get smallness, and then Gagliardo-Nirenberg on the f term to get uniformity in ∞ f. However, because translation of χR is not continuous in the L -norm, we have to be careful. We have

k[χR(x + y) − χR(x)]f(x + y)k p ≤ kχR(x + y) − χR(x)k r kf(x + y)k q Lx Lx Lx for r and q satisfying ( ∞ d = 1, 2 1 1 1 p < q < ; = + . 2d p r q d−2 d ≥ 3

kf(x + y)k q f |y| < δ(ε) << 1 By Gagliardo-Nirenberg, Lx is uniformly bounded in . For where δ is sufficiently small, we have kχR(x + y) − χR(x)k r < ε. Lx To estimate the second term, we simply ignore the χR and use Gagliardo-Nirenberg:

kχ (x)[f(x + y) − f(x)]k p ≤ kf(x + y) − f(x)k p R Lx Lx θ 1−θ kf(x + y) − f(x)k 2 k∇x[f(x + y) − f(x)]k 2 . Lx Lx where θ is the correct exponent dictated by Gagliardo-Nirenberg. By the triangle inequality,

1−θ 1−θ k∇x[f(x + y) − f(x)]k 2 k∇fk 2 Lx . L and hence is uniformly bounded in f. By the fundamental theorem of calculus,

θ θ kf(x + y) − f(x)k 2 (|y| k∇fk 2 ) . Lx . L

|y| kχ (x)[f(x + y) − f(x)]k p < ε Thus, for sufficiently small, R Lx . These two estimates give equicontinuity of F. By the Riesz compactness theorem, the proof is complete.

1 d The same result holds for H˙ (R ). The proof is for the most part the same, with the Sobolev embedding theorem replacing the Gagliardo-Nirenberg inequality as the main es- timation tool. ˙ 1 d 2d Theorem 12.9 (Rellich-Kondrachov, H (R )). Fix d ≥ 3 and 2 ≤ p < d−2 . Let χR be a cutoff d  function on BR(0) ⊆ R , smooth or not. Then the family F := χRf : kfkH˙ 1 ≤ 1 is compact p d in L (R ). Proof. The proof of closedness of F is the same, this time using the fact that the sequence 2d d 2 d is bounded in L d−2 (R ) (by Sobolev embedding) rather than L (R ). To use the Riesz compactness theorem, we need to show boundedness, tightness, and equicontinuity. As before, tightness is immediate by compact support. To see bounded- ness, we first apply Holder, and then Sobolev embedding:

kχRfkLp ≤ kχRkLα kfk 2d . kfk ˙ 1 . 1 L d−2 H

1 1 1 where α = p − 2d/(d−2) .

133 To prove equicontinuity, write

kχ f(x + y) − χ f(x)k p R R Lx ≤ k[χ (x + y) − χ (x)]f(x + y)k p + kχ (x)[f(x + y) − f(x)]k p R R Lx R Lx as before. Estimating the first term, we have

k[χR(x + y) − χR(x)]f(x + y)k p ≤ kχR(x + y) − χR(x)k α kf(x + y)k 2d Lx Lx d−2 Lx

kχR(x + y) − χR(x)k α . . Lx

α d Continuity of translations in L (R ) gives equicontinuity in this term. To estimate the second term, write

kχ (x)[f(x + y) − f(x)]k p ≤ kf(x + y) − f(x)k p R Lx Lx 1−θ θ ≤ kf(x + y) − f(x)k 2 kf(x + y) − f(x)k 2d Lx d−2 Lx

 1 1  where θ = d 2 − 2d/(d−2) . By Sobolev embedding, the latter term is uniformly bounded. The first term we estimate by first using the fundamental theorem of calculus:

1−θ 1−θ 1−θ kf(x + y) − f(x)k 2 (|y| k∇fk 2 ) |y| Lx . L . which gives equicontinuity..

12.3 The Strauss Lemma

In Chapter 11, we studied the energy functional of the hydrogen atom. A crucial step in our analysis was the Strauss lemma, which we did not prove. In this section, we provide a proof. First, we have the following theorem, which describes the decay of radial functions in 1 d H˙ (R ). Proposition 12.10 (Radial Sobolev embedding). Fix d ≥ 2 and 1 ≤ p < ∞. For a radial 1 d p d function f ∈ H˙ (R ) ∩ L (R ), we have

2(d−1) p 2 p+2 2+p 2+p r |f(r)| . kfkLp k∇fkL2 for almost all r > 0.

d Proof. It suffices to prove the inequality for Schwartz functions. Indeed, S(R ) is dense 1 d p d in H˙ (R ) ∩ L (R ), and by passing to a subsequence we can ensure almost everywhere convergence. As |∇|f|| ≤ |∇f| almost everywhere, we can further assume that f ≥ 0. By the fundamental theorem of calculus we have Z ∞ Z ∞ d−1 1+ p d−1 0 p d−1 0 p r f(r) 2 = r f (σ)f(σ) 2 dσ . σ |f (σ)|f(σ) 2 dσ r r 1 1 Z ∞  2 Z ∞  2 d−1 0 2 d−1 p . σ |f (σ)| dσ σ |f(σ)| dσ r r p 2 . k∇fkL2 kfkLp .

The final inequality here comes from the fact that f is radial and noting that σd−1 dσ de- scribes a polar coordinate transformation. Raising both sides of this inequality to the power 2 p+2 completes the proof.

134 Observe that as p increases in the above inequality, the decay condition weakens. In 1 d 2 d 1 d particular, the case p = 2 gives particularly good decay. As H˙ (R ) ∩ L (R ) = H (R ), this proves the following.

1 d Corollary 12.11. If f ∈ Hrad(R ) for d ≥ 2, then

d−1 1 1 2 2 2 r |f(r)| . kfkL2 k∇fkL2 for almost all r > 0.

Finally, we prove the Strauss lemma.

2d Lemma 12.12 (Strauss). Fix d ≥ 2 and 2 < p < ∞ if d = 2 or 2 < p < d−2 if d ≥ 3. Then 1 d p d Hrad(R ) embeds compactly into L (R ). 1 Proof. By normalizing, it suffices to prove that the unit ball B1 ⊆ Hrad is precompact in p d L (R ). Naturally, we will apply the Riesz compactness theorem. The arguments for boundedness and equicontinuity are similar to those used in the proof of the Rellich-Kondrachov theorem. For f ∈ B1, Gagliardo-Nirenberg gives

θ 1−θ kfkLp . kfkL2 k∇fkL2 . 1

p for some the appropriate exponent θ. Thus, B1 is L -bounded. Equicontinuity follows from Gagliardo-Nirenberg and the fundamental theorem of calculus:

θ 1−θ kf(x + y) − f(x)k p kf(x + y) − f(x)k 2 k∇ (f(x + y) − f(x))k 2 Ly . L x L θ 1−θ . (|y| k∇fkL2 ) k∇fkL2 θ . |y| .

It remains to show tightness. We wish to use Corollary 12.11. If we try to use the theorem directly, we get Z Z Z ∞ p − d−1 p − d−1 p d−1 |f(x)| dx . r 2 dx = r 2 r dr |x|≥R |x|≥R R d− d−1 p . R 2

d−1 2d provided that d − 2 p < 0, hence p > d−1 . But the conclusion holds for p > 2, so this is 2 d not quite strong enough. Instead, we first use the fact that f ∈ L (R ). Writing Z Z |f(x)|p dx = |f(x)|2 |f(x)|p−2 dx |x|≥R |x|≥R and using the radial Sobolev embedding theorem on |f(x)|p−2 rather than |f(x)|p gives Z 2 d−1 2 p − 2 (p−2) |f(x)| dx . kfkL2 R kfkL2 . |x|≥R

This can be made uniformly small for p > 2 by choosing R sufficiently large. By the Riesz compactness theorem, the proof is complete.

2d The endpoint cases in the Strauss lemma, namely p = 2 and p = d−2 when d ≥ 3, do 2d 1 d 2 d d−2 d not hold. That is, Hrad(R ) does not embed compactly into L (R ) or L (R ). To see that 1 d 2 d Hrad(R ) does not embed compactly into L (R ), we choose a radial bump function and

135 d ∞ − 2 x rescale on a large scale. Explicitly, let ϕ ∈ Cc be radial, and define ϕλ(x) := λ ϕ( λ ). Then kϕλkL2 = kϕkL2 . Also, d d − 2 −1 2 −1 k∇ϕλkL2 = λ λ λ k∇ϕkL2 = λ k∇ϕkL2 1 d which is . 1 for λ >> 1, hence ϕλ ∈ H (R ). Note that d − 2 | hϕλ, ψi | . λ kϕkL∞ kψkL1 2 and so ϕ * 0as λ → ∞. However, as the L -norm of ϕλ is constant in λ, the ϕλ’s cannot 2 d 1 d converge strongly to 0 in L (R ). The argument for showing that Hrad(R ) does not embed 2d d−2 d−2 d − 2 x compactly into L (R ) is similar; here, we rescale to small scales. Let ϕλ(x) = λ ϕ( λ ). Then kϕλk 2d = kϕk 2d , and in fact kϕλk ˙ 1 = kϕk ˙ 1 . Also, L d−2 L d−2 H H d−2 d − 2 2 kϕλkL2 = λ λ kϕkL2 = λ kϕkL2 which is . 1 for λ << 1. We have ϕλ * 0 as λ → 0, but ϕλ cannot converge to 0 in 2d d L d−2 (R ).

12.4 The Refined Fatou’s Lemma

The last result that we include in this chapter is an improvement on Fatou’s lemma. It p d will be important for establishing so-called asymptotic decoupling in L (R ) when we study concentration compactness in Chapters 13 and 14. p Lemma 12.13 (Brezis-Lieb). Fix 1 ≤ p < ∞, and let fn be an L -bounded sequence of functions. Suppose that fn → f almost everywhere. Then Z p p p | |fn| − |f − fn| − |f| | dx = 0.

Proof. First, recall that for any real numbers a and b,

p p p p |a + b| − |a| ≤ ε|a| + Cε|b| (12.1) for any ε > 0, where Cε is some constant dependent on ε and p. Let  + ε p p p p gn := |fn| − |fn − f| − |f| − ε|fn − f| . ε Because fn → f almost everywhere, gn → 0 almost everywhere. By the triangle inequality and (12.1) with a = fn − f and b = f,  + ε p p p p gn ≤ |fn| − |fn − f| + |f| − ε|fn − f| p p + ≤ (Cε|f| + |f| ) p = (Cε + 1) |f| . p p p As f ∈ L (the sequence fn is L -bounded!), |f| is integrable and so by the Dominated R ε convergence theorem gn dx → 0. Thus, Z p p p | |fn| − |f − fn| − |f| | dx Z  + Z p p p p p ≤ |fn| − |fn − f| − |f| − ε|fn − f| dx + ε|fn − f| dx Z Z ε p = gn dx + ε |fn − f| dx.

p For any ε > 0, the first quantity goes to 0. As fn is L -bounded by assumption, the second term goes to 0 uniformly in ε. This completes the proof.

136 Chapter 13

The Sharp Gagliardo-Nirenberg Inequality

As alluded to in the introduction to Chapter 12, in some sense, the next three chapters are in some sense devoted to the notion of concentration compactness. Broadly, concentration com- pactness studies the defects of compactness, and describes how to deal with these defects. This will be described formally in great deal in the future. Although concentration compactness is the underlying theme of the next three chap- ters, it is contextualized by various sharp inequalities. Indeed, our primary use for the principle of concentration compactness will be to determine optimal constants for the Gagliardo-Nirenberg inequality, the Sobolev embedding theorem, and the Strichartz in- equality. In this chapter, our focus is the Gagliardo-Nirenberg inequality. For good measure, we recall the inequality here. Theorem 13.1 (Gagliardo-Nirenberg). Fix d ≥ 1. Let p satisfy 2 ≤ p < ∞ if d = 1 or d = 2, 2d 1 d and 2 ≤ p < d−2 if d ≥ 3. Then for f ∈ H (R ),

2d−p(d−2) d(p−2) 2p 2p kfkLp . kfkL2 k∇fkL2 . We will prove the sharp version of inequality first without using concentration com- pactness, and then spend the rest of the chapter developing the principle of concentra- tion compactness to provide and alternate proof. Chapters 14 and 15 develop analogous concentration compactness principles for the Sobolev embedding theorem and Strichartz inequality, respectively.

13.1 The Focusing Cubic NLS

Before we determine the optimal constant in the Gagliardo-Nirenberg inequality, we feel it necessary to provide a suitable source of motivation. After all, we have spent the entirety of this text generously employing estimates involving ., with no regard for explicit con- stants. Unsurprisingly, the optimal constant in Theorem 13.1 is important in the study of certain nonlinear Schrodinger¨ equations. In particular, we will consider the focusing cubic nonlinear Schr¨odingerequation in d = 2, given by  i∂ u + ∆u = −|u|2u t . (13.1) u(0) = u0

2 Here, t ∈ R and x ∈ R , and for now we take the initial data u0 to be a complex-valued Schwartz function. This is one of the most commonly studied nonlinear Schrodinger¨ equa- tions.

137 Chapter 14

The Sharp Sobolev Embedding Theorem

138 Chapter 15

Concentration Compactness in Partial Differential Equations

139 References

[1] F.M. Christ and M.I. Weinstein. “Dispersion of small amplitude solutions of the gen- eralized Korteweg-de Vries equation”. In: Journal of Functional Analysis 100.1 (1991), pp. 87–109. ISSN: 0022-1236. [2] J. Duoandikoetxea. Fourier Analysis. Crm Proceedings & Lecture Notes. American Mathematical Soc., 2001. ISBN: 9780821883846. [3] L. Grafakos. Classical Fourier Analysis. Second Edition. 489 pp: Springer, 2008. [4] L.I. Hedberg. “On Certain Convolution Inequalities”. In: Proceedings of the American Mathematical Society 36.2 (1972), pp. 505–510. [5] R.A. Hunt. “An extension of the Marcinkiewicz interpolation theorem to Lorentz spaces”. In: Bull. Amer. Math. Soc. 70.6 (Nov. 1964), pp. 803–807. [6] E.H. Lieb and M. Loss. Analysis. Second Edition. 348 pp: American Mathematical Society, 2001. [7] E. Stein and R. Shakarchi. Fourier Analysis: An Introduction. 326 pp: Princeton Univer- sity Press, 2003. [8] E. Stein and R. Shakarchi. Functional Analysis: Introduction to Further Topics in Analysis. 423 pp: Princeton University Press, 2011. [9] E. Stein and R. Shakarchi. Real Analysis: Measure Theory, Integration, and Hilbert Spaces. 402 pp: Princeton University Press, 2005. [10] E.M. Stein and T.S. Murphy. Harmonic Analysis: Real-variable Methods, Orthogonal- ity, and Oscillatory Integrals. Monographs in harmonic analysis. Princeton University Press, 1993. [11] T. Tao. “Recent progress on the restriction conjecture”. In: ArXiv Mathematics e-prints (Nov. 2003). eprint: math/0311181.

140