JUHA KINNUNEN Real Analysis

Department of Mathematics and Systems Analysis, Aalto University, 2020

Contents

1 Lp spaces
1.1 Lp functions
1.2 Lp norm
1.3 Lp spaces for 0 < p < 1
1.4 Completeness of Lp
1.5 L∞ space

2 The Hardy-Littlewood maximal function
2.1 Local Lp spaces
2.2 Definition of the maximal function
2.3 Hardy-Littlewood-Wiener maximal function theorems
2.4 The Lebesgue differentiation theorem
2.5 The fundamental theorem of calculus
2.6 Points of density
2.7 The Sobolev embedding

3 Convolutions
3.1 Two additional properties of Lp
3.2 Convolution
3.3 Approximations of the identity
3.4 Pointwise convergence
3.5 Convergence in Lp
3.6 Smoothing properties
3.7 The Poisson kernel

4 Differentiation of measures
4.1 Covering theorems
4.2 The Lebesgue differentiation theorem for Radon measures
4.3 The Radon-Nikodym theorem
4.4 The Lebesgue decomposition
4.5 Lebesgue and density points revisited

5 Weak convergence methods
5.1 The Riesz representation theorem for Lp
5.2 Partitions of unity
5.3 The Riesz representation theorem for Radon measures
5.4 Weak convergence and compactness of Radon measures
5.5 Weak convergence in Lp
5.6 Mazur's lemma

1 Lp spaces

The $L^p$ spaces are probably the most important function spaces in analysis. This section gives basic facts about $L^p$ spaces for general measures. These include Hölder's inequality, Minkowski's inequality, the Riesz-Fischer theorem, which shows the completeness, and the corresponding facts for the $L^\infty$ space.

In this section we study the $L^p$ spaces in order to be able to capture quantitative information on the average size of measurable functions and boundedness of operators on such functions. The cases $0 < p < 1$, $p = 1$, $p = 2$, $1 < p < \infty$ and $p = \infty$ are different in character, but they all play an important role in Fourier analysis, harmonic analysis, functional analysis and partial differential equations. The space $L^1$ of integrable functions plays a central role in measure and integration theory. The space $L^2$ of square integrable functions is important in the study of Fourier series. Many operators that arise in applications are bounded in $L^p$ for $1 < p < \infty$, but the limit cases $L^1$ and $L^\infty$ require special attention.

1.1 Lp functions

Definition 1.1. Let $\mu$ be an outer measure on $\mathbb{R}^n$, $A \subset \mathbb{R}^n$ a $\mu$-measurable set and $f : A \to [-\infty, \infty]$ a $\mu$-measurable function. Then $f \in L^p(A)$, $1 \le p < \infty$, if
\[
\|f\|_p = \left( \int_A |f|^p \, d\mu \right)^{1/p} < \infty.
\]

THE MORAL: For $p = 1$, $f \in L^1(A)$ if and only if $|f|$ is integrable in $A$. For $1 \le p < \infty$, $f \in L^p(A)$ if and only if $|f|^p$ is integrable in $A$.

Remark 1.2. The measurability assumption on $f$ is essential in the definition. For example, let $A \subset [0,1]$ be a non-measurable set with respect to the one-dimensional Lebesgue measure and consider $f : [0,1] \to \mathbb{R}$,
\[
f(x) = \begin{cases} 1, & x \in A, \\ -1, & x \in [0,1] \setminus A. \end{cases}
\]


Then $f^2 = 1$ is integrable on $[0,1]$, but $f$ is not a Lebesgue measurable function.

Example 1.3. Let $f : \mathbb{R}^n \to [0, \infty]$, $f(x) = |x|^{-n}$, and assume that $\mu$ is the Lebesgue measure. Let $A = B(0,1) = \{x \in \mathbb{R}^n : |x| < 1\}$ and denote $A_i = B(0, 2^{-i+1}) \setminus B(0, 2^{-i})$, $i = 1, 2, \dots$. Then
\[
\begin{aligned}
\int_{B(0,1)} |x|^{-np} \, dx &= \sum_{i=1}^\infty \int_{A_i} |x|^{-np} \, dx \\
&\le \sum_{i=1}^\infty \int_{A_i} 2^{npi} \, dx \qquad (x \in A_i \Rightarrow |x| \ge 2^{-i} \Rightarrow |x|^{-np} \le 2^{npi}) \\
&= \sum_{i=1}^\infty 2^{npi} |A_i| \le \sum_{i=1}^\infty 2^{npi} |B(0, 2^{-i+1})| \\
&= \Omega_n \sum_{i=1}^\infty 2^{npi} (2^{-i+1})^n \qquad (\Omega_n = |B(0,1)|) \\
&= \Omega_n \sum_{i=1}^\infty 2^{npi - ni + n} = 2^n \Omega_n \sum_{i=1}^\infty 2^{in(p-1)} < \infty,
\end{aligned}
\]
if $n(p-1) < 0 \iff p < 1$. Thus $f \in L^p(B(0,1))$ for $p < 1$.

On the other hand,
\[
\begin{aligned}
\int_{B(0,1)} |x|^{-np} \, dx &= \sum_{i=1}^\infty \int_{A_i} |x|^{-np} \, dx \\
&\ge \sum_{i=1}^\infty \int_{A_i} 2^{np(i-1)} \, dx \qquad (x \in A_i \Rightarrow |x| < 2^{-i+1} \Rightarrow |x|^{-np} > 2^{np(i-1)}) \\
&= \sum_{i=1}^\infty 2^{np(i-1)} |A_i| = \Omega_n (2^n - 1) 2^{-np} \sum_{i=1}^\infty 2^{npi} 2^{-in} \\
&= C(n, p) \sum_{i=1}^\infty 2^{in(p-1)} = \infty,
\end{aligned}
\]
if $n(p-1) \ge 0 \iff p \ge 1$. Here we used
\[
|A_i| = |B(0, 2^{-i+1})| - |B(0, 2^{-i})| = \Omega_n (2^{(-i+1)n} - 2^{-in}) = \Omega_n (2^n - 1) 2^{-in}.
\]
Thus $f \notin L^p(B(0,1))$ for $p \ge 1$. This shows that
\[
f \in L^p(B(0,1)) \iff p < 1.
\]
If $A = \mathbb{R}^n \setminus B(0,1)$, then we denote $A_i = B(0, 2^i) \setminus B(0, 2^{i-1})$, $i = 1, 2, \dots$, and a similar argument as above shows that
\[
f \in L^p(\mathbb{R}^n \setminus B(0,1)) \iff p > 1.
\]
Observe that $f \notin L^1(B(0,1))$ and $f \notin L^1(\mathbb{R}^n \setminus B(0,1))$. Thus $f(x) = |x|^{-n}$ is a borderline function in $\mathbb{R}^n$ as far as integrability is concerned.
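The dichotomy in Example 1.3 comes down to whether the geometric series $\sum_i 2^{in(p-1)}$ converges. The following is a minimal numerical sketch of that series only; the function name `dyadic_partial_sum` is an illustrative choice and not part of the notes.

```python
def dyadic_partial_sum(n: int, p: float, terms: int) -> float:
    """Partial sum of sum_{i=1}^{terms} 2^{i n (p - 1)} from Example 1.3.

    The common ratio is 2^{n (p - 1)}, which is < 1 exactly when p < 1.
    """
    ratio = 2.0 ** (n * (p - 1))
    return sum(ratio ** i for i in range(1, terms + 1))


if __name__ == "__main__":
    # p < 1: the partial sums stabilise near ratio / (1 - ratio).
    print(dyadic_partial_sum(1, 0.5, 50), dyadic_partial_sum(1, 0.5, 200))
    # p = 1: every term equals 1, so the partial sums grow linearly.
    print(dyadic_partial_sum(2, 1.0, 50))
    # p > 1: the terms themselves grow, so the series diverges.
    print(dyadic_partial_sum(1, 1.5, 20), dyadic_partial_sum(1, 1.5, 40))
```

The sketch only illustrates the behaviour of the bounding series; the integrability statements themselves follow from the estimates in the example.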

THE MORAL: The smaller the parameter $p$ is, the worse local singularities an $L^p$ function may have. On the other hand, the larger the parameter $p$ is, the more an $L^p$ function may spread out globally.

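Example 1.4. Assume that $f : \mathbb{R}^n \to [0, \infty]$ is radial. Thus $f$ depends only on $|x|$ and it can be expressed as $f(|x|)$, where $f$ is a function defined on $[0, \infty)$. Then
\[
\int_{\mathbb{R}^n} f(|x|) \, dx = \omega_{n-1} \int_0^\infty f(r) r^{n-1} \, dr, \tag{1.1}
\]
where
\[
\omega_{n-1} = \frac{2\pi^{n/2}}{\Gamma(n/2)}
\]
is the $(n-1)$-dimensional volume of the unit sphere $\partial B(0,1) = \{x \in \mathbb{R}^n : |x| = 1\}$.

Let us show how to use this formula to compute the volume of a ball $B(x, r) = \{y \in \mathbb{R}^n : |y - x| < r\}$, $x \in \mathbb{R}^n$ and $r > 0$. Denote $\Omega_n = m(B(0,1))$. By the translation and scaling invariance, we have
\[
\begin{aligned}
r^n \Omega_n &= r^n m(B(0,1)) = m(B(x, r)) = m(B(0, r)) \\
&= \int_{\mathbb{R}^n} \chi_{B(0,r)}(y) \, dy = \int_{\mathbb{R}^n} \chi_{(0,r)}(|y|) \, dy \\
&= \omega_{n-1} \int_0^r \rho^{n-1} \, d\rho = \omega_{n-1} \frac{r^n}{n}.
\end{aligned}
\]
In particular, it follows that $\omega_{n-1} = n \Omega_n$ and
\[
m(B(x, r)) = \frac{2\pi^{n/2}}{\Gamma(n/2)} \frac{r^n}{n} = \frac{\pi^{n/2}}{\Gamma(\frac{n}{2} + 1)} r^n.
\]
Let $r > 0$. Then
\[
\begin{aligned}
\int_{\mathbb{R}^n \setminus B(0,r)} \frac{1}{|x|^\alpha} \, dx &= \int_{\mathbb{R}^n} \frac{1}{|x|^\alpha} \chi_{\mathbb{R}^n \setminus B(0,r)}(x) \, dx \\
&= r^n \int_{\mathbb{R}^n} \frac{1}{|rx|^\alpha} \chi_{\mathbb{R}^n \setminus B(0,r)}(rx) \, dx \\
&= r^{n-\alpha} \int_{\mathbb{R}^n} \frac{1}{|x|^\alpha} \chi_{\mathbb{R}^n \setminus B(0,1)}(x) \, dx \\
&= r^{n-\alpha} \int_{\mathbb{R}^n \setminus B(0,1)} \frac{1}{|x|^\alpha} \, dx < \infty, \quad \alpha > n,
\end{aligned}
\]
and, in a similar way,
\[
\int_{B(0,r)} \frac{1}{|x|^\alpha} \, dx = r^{n-\alpha} \int_{B(0,1)} \frac{1}{|x|^\alpha} \, dx < \infty, \quad \alpha < n.
\]
Observe that here we formally make the change of variables $x = ry$.

On the other hand, the integrals can be computed directly by (1.1). This gives
\[
\begin{aligned}
\int_{\mathbb{R}^n \setminus B(0,r)} \frac{1}{|x|^\alpha} \, dx &= \omega_{n-1} \int_r^\infty \rho^{-\alpha} \rho^{n-1} \, d\rho \\
&= \frac{\omega_{n-1}}{-\alpha + n} \rho^{-\alpha + n} \Big|_r^\infty = \frac{\omega_{n-1}}{\alpha - n} r^{-\alpha + n} < \infty, \quad \alpha > n,
\end{aligned}
\]
and
\[
\begin{aligned}
\int_{B(0,r)} \frac{1}{|x|^\alpha} \, dx &= \omega_{n-1} \int_0^r \rho^{-\alpha} \rho^{n-1} \, d\rho \\
&= \frac{\omega_{n-1}}{-\alpha + n} \rho^{-\alpha + n} \Big|_0^r = \frac{\omega_{n-1}}{n - \alpha} r^{n - \alpha} < \infty, \quad \alpha < n.
\end{aligned}
\]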
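The closed formulas of Example 1.4 are easy to sanity-check numerically. The sketch below uses only `math.gamma` from the Python standard library; the function names `sphere_area`, `ball_volume` and `tail_integral` are illustrative assumptions, not notation from the notes.

```python
import math


def sphere_area(n: int) -> float:
    """omega_{n-1} = 2 pi^{n/2} / Gamma(n/2), the surface measure of the unit sphere in R^n."""
    return 2.0 * math.pi ** (n / 2) / math.gamma(n / 2)


def ball_volume(n: int, r: float = 1.0) -> float:
    """m(B(x, r)) = omega_{n-1} r^n / n, as derived from formula (1.1)."""
    return sphere_area(n) * r ** n / n


def tail_integral(n: int, alpha: float, r: float) -> float:
    """Integral of |x|^{-alpha} over R^n minus B(0, r), finite only for alpha > n."""
    if alpha <= n:
        raise ValueError("the integral is infinite unless alpha > n")
    return sphere_area(n) / (alpha - n) * r ** (n - alpha)


if __name__ == "__main__":
    print(ball_volume(1, 2.0))          # length of the interval (-2, 2)
    print(ball_volume(2), math.pi)      # area of the unit disc
    print(ball_volume(3), 4 * math.pi / 3)
    print(tail_integral(3, 4.0, 2.0))
```

For $n = 1, 2, 3$ the values agree with the familiar length, area and volume formulas, which is a quick consistency check on $\omega_{n-1} = n \Omega_n$.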

Remarks 1.5: Formula (1.1) implies the following claims:

(1) If $|f(x)| \le c|x|^{-\alpha}$ in a ball $B(0, r)$, $r > 0$, for some $\alpha < n$, then $f \in L^1(B(0, r))$. On the other hand, if $|f(x)| \ge c|x|^{-\alpha}$ in $B(0, r)$ for some $\alpha > n$, then $f \notin L^1(B(0, r))$.
(2) If $|f(x)| \le c|x|^{-\alpha}$ in $\mathbb{R}^n \setminus B(0, r)$ for some $\alpha > n$, then $f \in L^1(\mathbb{R}^n \setminus B(0, r))$. On the other hand, if $|f(x)| \ge c|x|^{-\alpha}$ in $\mathbb{R}^n \setminus B(0, r)$ for some $\alpha < n$, then $f \notin L^1(\mathbb{R}^n \setminus B(0, r))$.

Remark 1.6. $f \in L^p(A) \implies |f(x)| < \infty$ for $\mu$-almost every $x \in A$.

Reason. Let $A_i = \{x \in A : |f(x)| \ge i\}$, $i = 1, 2, \dots$. Then
\[
\{x \in A : |f(x)| = \infty\} = \bigcap_{i=1}^\infty A_i
\]
and
\[
\begin{aligned}
\mu(\{x \in A : |f(x)| = \infty\}) &\le \mu(A_i) = \int_{A_i} 1 \, d\mu \\
&\le \int_{A_i} \left( \frac{|f|}{i} \right)^p d\mu \qquad (|f| \ge i \text{ in } A_i) \\
&\le \frac{1}{i^p} \int_A |f|^p \, d\mu \to 0 \quad \text{as } i \to \infty,
\end{aligned}
\]
since $\int_A |f|^p \, d\mu < \infty$. ■

The converse is not true, as the previous example shows.

1.2 Lp norm

If $f \in L^p(A)$, $1 \le p < \infty$, the norm of $f$ is the number
\[
\|f\|_p = \left( \int_A |f|^p \, d\mu \right)^{1/p}.
\]
We shall see that this has the usual properties of the norm:

(1) (Nonnegativity) $0 \le \|f\|_p < \infty$,
(2) $\|f\|_p = 0 \iff f = 0$ $\mu$-almost everywhere,
(3) (Homogeneity) $\|af\|_p = |a| \|f\|_p$, $a \in \mathbb{R}$,
(4) (Triangle inequality) $\|f + g\|_p \le \|f\|_p + \|g\|_p$.

The claims (1) and (3) are clear. For $p = 1$, the claim (4) follows from the pointwise triangle inequality $|f(x) + g(x)| \le |f(x)| + |g(x)|$. For $p > 1$, the claim (4) is not trivial and we shall prove it later in this section. Let us recall how to prove (2). Recall that if a property holds except on a set of $\mu$ measure zero, we say that it holds $\mu$-almost everywhere.

$\Longleftarrow$ Assume that $f = 0$ $\mu$-almost everywhere in $A$. Then
\[
\int_A |f|^p \, d\mu = \int_{A \cap \{|f| = 0\}} |f|^p \, d\mu + \int_{A \cap \{|f| > 0\}} |f|^p \, d\mu = 0.
\]
The first integral is zero because $|f| = 0$ on the domain of integration, and the second is zero because $|f| = 0$ $\mu$-almost everywhere, that is, $\mu(A \cap \{|f| > 0\}) = 0$. Thus $\|f\|_p = 0$.

$\Longrightarrow$ Assume that $\|f\|_p = 0$. Let $A_i = \left\{ x \in A : |f(x)| \ge \frac{1}{i} \right\}$, $i = 1, 2, \dots$. Then
\[
\{x \in A : |f(x)| > 0\} = \bigcup_{i=1}^\infty A_i
\]
and, since $i|f| \ge 1$ in $A_i$,
\[
\mu(A_i) = \int_{A_i} 1 \, d\mu \le \int_{A_i} |i f|^p \, d\mu \le i^p \int_A |f|^p \, d\mu = 0.
\]
Thus $\mu(A_i) = 0$ for every $i = 1, 2, \dots$ and
\[
\mu\left( \bigcup_{i=1}^\infty A_i \right) \le \sum_{i=1}^\infty \mu(A_i) = 0.
\]
In other words, $f = 0$ $\mu$-almost everywhere in $A$.

For $\mu$-measurable functions $f$ and $g$ on a $\mu$-measurable set $A$, we are interested in the case $f(x) = g(x)$ for $\mu$-almost every $x \in A$, which means that
\[
\mu(\{x \in A : f(x) \ne g(x)\}) = 0.
\]
In the case $f = g$ $\mu$-almost everywhere, we do not usually distinguish $f$ from $g$. That is, we shall regard them as equal. We could be formal and consider the equivalence relation
\[
f \sim g \iff f = g \ \mu\text{-almost everywhere in } A,
\]
but this is hardly necessary. In practice, we think of $f$ as the equivalence class of all functions which are equal to $f$ $\mu$-almost everywhere in $A$. Thus $L^p(A)$ actually consists of equivalence classes rather than functions, but we shall not make the distinction. In measure and integration theory we cannot distinguish $f$ from $g$ if the functions are equal $\mu$-almost everywhere. In fact, if $f = g$ $\mu$-almost everywhere in $A$, then $f \in L^p(A) \iff g \in L^p(A)$ and $\|f - g\|_p = 0$. In particular, this implies that $\|f\|_p = \|g\|_p$. On the other hand, if $\|f - g\|_p = 0$, then $f = g$ $\mu$-almost everywhere in $A$.

Another situation that frequently arises is that the function $f$ is defined only almost everywhere. Then we say that $f$ is measurable if and only if its zero extension to the whole space is measurable. Observe that this does not affect the $L^p$ norm of $f$.

Next we show that $L^p(A)$ is a vector space.

Lemma 1.7. (i) If $f \in L^p(A)$, then $af \in L^p(A)$, $a \in \mathbb{R}$.
(ii) If $f, g \in L^p(A)$, then $f + g \in L^p(A)$.

Proof. (i) $\displaystyle \int_A |af|^p \, d\mu = |a|^p \int_A |f|^p \, d\mu < \infty$.

(ii) $p = 1$: The triangle inequality $|f + g| \le |f| + |g|$ implies
\[
\int_A |f + g| \, d\mu \le \int_A |f| \, d\mu + \int_A |g| \, d\mu < \infty.
\]
$1 < p < \infty$: The elementary inequality
\[
(a + b)^p \le (2\max(a, b))^p = 2^p \max(a^p, b^p) \le 2^p (a^p + b^p), \quad a, b \ge 0, \ 0 < p < \infty, \tag{1.2}
\]
implies
\[
\int_A |f + g|^p \, d\mu \le 2^p \left( \int_A |f|^p \, d\mu + \int_A |g|^p \, d\mu \right) < \infty. \qquad \square
\]

Remark 1.8. Note that the proof applies for $0 < p < \infty$. Thus $L^p(A)$ is a vector space for $0 < p < \infty$. However, it will be a normed space with the $L^p$ norm only for $p \ge 1$, as we shall see later.

Remark 1.9. A more careful analysis gives the useful inequality
\[
(a + b)^p \le 2^{p-1} (a^p + b^p), \quad a, b \ge 0, \ 1 \le p < \infty. \tag{1.3}
\]

Remarks 1.10:
(1) If $f : A \to \mathbb{C}$ is a complex-valued function, then $f$ is said to be $\mu$-measurable if and only if $\operatorname{Re} f$ and $\operatorname{Im} f$ are $\mu$-measurable. We say that $f \in L^1(A)$ if $\operatorname{Re} f \in L^1(A)$ and $\operatorname{Im} f \in L^1(A)$, and we define
\[
\int_A f \, d\mu = \int_A \operatorname{Re} f \, d\mu + i \int_A \operatorname{Im} f \, d\mu,
\]
where $i$ is the imaginary unit. This integral satisfies the usual linearity properties. It also satisfies the important inequality
\[
\left| \int_A f \, d\mu \right| \le \int_A |f| \, d\mu.
\]
The definition of the $L^p$ spaces and the norm extends in a natural way to complex-valued functions. Note that the property $\|af\|_p = |a| \|f\|_p$ holds for every $a \in \mathbb{C}$ and thus $L^p$ is a complex vector space.
(2) The space $L^2(A)$ is an inner product space with the inner product
\[
\langle f, g \rangle = \int_A f \overline{g} \, d\mu, \quad f, g \in L^2(A).
\]
Here $\overline{g}$ is the complex conjugate, which can be neglected if the functions are real-valued. This inner product induces the standard $L^2$-norm, since
\[
\|f\|_2 = \left( \int_A |f|^2 \, d\mu \right)^{1/2} = \left( \int_A f \overline{f} \, d\mu \right)^{1/2} = \langle f, f \rangle^{1/2}.
\]

(3) In the special case that $A = \mathbb{N}$ and $\mu$ is the counting measure, the $L^p(\mathbb{N})$ spaces are denoted by $l^p$ and
\[
l^p = \left\{ (x_i) : \sum_{i=1}^\infty |x_i|^p < \infty \right\}, \quad 1 \le p < \infty.
\]
Here $(x_i)$ is a sequence of real (or complex) numbers. In this case,
\[
\int_{\mathbb{N}} x \, d\mu = \sum_{i=1}^\infty x(i)
\]
for every nonnegative function $x$ on $\mathbb{N}$. Thus
\[
\|x\|_p = \left( \sum_{i=1}^\infty |x_i|^p \right)^{1/p}.
\]
Note that the theory of $L^p$ spaces applies to these sequence spaces as well.

Definition 1.11. Let $1 < p < \infty$. The Hölder conjugate $p'$ of $p$ is the number which satisfies
\[
\frac{1}{p} + \frac{1}{p'} = 1.
\]
For $p = 1$ we define $p' = \infty$ and if $p = \infty$, then $p' = 1$.

Remark 1.12. Note that
\[
\begin{aligned}
& p' = \frac{p}{p - 1}, \\
& p = 2 \implies p' = 2, \\
& 1 < p < 2 \implies p' > 2, \\
& 2 < p < \infty \implies 1 < p' < 2, \\
& p \to 1 \implies p' \to \infty, \\
& (p')' = p.
\end{aligned}
\]

Lemma 1.13 (Young's inequality). Let $1 < p < \infty$. Then for every $a \ge 0$, $b \ge 0$,
\[
ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'},
\]
with equality if and only if $a^p = b^{p'}$.

THE MORAL: Young's inequality is a very useful tool in splitting a product into a sum. Moreover, it shows where the conjugate exponent $p'$ comes from.

Proof. The claim is obviously true if $a = 0$ or $b = 0$. Thus we may assume that $a > 0$ and $b > 0$. Clearly
\[
ab \le \frac{a^p}{p} + \frac{b^{p'}}{p'}
\iff \frac{1}{p} \frac{a^p}{b^{p'}} + \frac{1}{p'} - a b^{1 - p'} \ge 0
\iff \frac{1}{p} \left( \frac{a}{b^{p'/p}} \right)^p + \frac{1}{p'} - \frac{a}{b^{p'/p}} \ge 0,
\]
where we used $p' - 1 = p'/p$. Let $t = a / b^{p'/p}$ and define $\varphi : (0, \infty) \to \mathbb{R}$,
\[
\varphi(t) = \frac{1}{p} t^p + \frac{1}{p'} - t.
\]
Then
\[
\varphi(0) = \frac{1}{p'}, \quad \lim_{t \to \infty} \varphi(t) = \infty \quad \text{and} \quad \varphi'(t) = t^{p-1} - 1.
\]
Note that $\varphi'(t) = 0 \iff t = 1$, from which we conclude
\[
\varphi(t) \ge \varphi(1) = \frac{1}{p} + \frac{1}{p'} - 1 = 0 \quad \text{for every } t > 0.
\]
Moreover, $\varphi(t) > 0$ if $t \ne 1$. It follows that $\varphi(t) = 0$ if and only if $a / b^{p'/p} = t = 1$. □

Remarks 1.14:
(1) Young's inequality for $p = 2$ follows immediately from
\[
(a - b)^2 \ge 0 \iff a^2 - 2ab + b^2 \ge 0 \iff \frac{a^2}{2} + \frac{b^2}{2} - ab \ge 0.
\]
(2) Young's inequality can also be proved geometrically. To see this, consider the curve $y = x^{p-1}$ and its inverse $x = y^{1/(p-1)} = y^{p'-1}$. Then
\[
\int_0^a x^{p-1} \, dx = \frac{a^p}{p} \quad \text{and} \quad \int_0^b y^{p'-1} \, dy = \frac{b^{p'}}{p'}.
\]
By comparing the areas under the curves that these integrals measure, we have
\[
ab \le \int_0^a x^{p-1} \, dx + \int_0^b y^{p'-1} \, dy = \frac{a^p}{p} + \frac{b^{p'}}{p'}.
\]

Theorem 1.15 (Hölder's inequality). Let $1 < p < \infty$ and assume that $f \in L^p(A)$ and $g \in L^{p'}(A)$. Then $fg \in L^1(A)$ and
\[
\int_A |fg| \, d\mu \le \left( \int_A |f|^p \, d\mu \right)^{1/p} \left( \int_A |g|^{p'} \, d\mu \right)^{1/p'}.
\]
Moreover, an equality occurs if and only if there exists a constant $c$ such that $|f(x)|^p = c |g(x)|^{p'}$ for $\mu$-almost every $x \in A$.

THE MORAL: Hölder's inequality is a very useful tool in estimating a product of functions.

Remark 1.16. Hölder's inequality states that $\|fg\|_1 \le \|f\|_p \|g\|_{p'}$, $1 < p < \infty$. Observe that for $p = 2$ this is the Cauchy-Schwarz inequality $|\langle f, g \rangle| \le \|f\|_2 \|g\|_2$.

Proof. If $\|f\|_p = 0$, then $f = 0$ $\mu$-almost everywhere in $A$ and thus $fg = 0$ $\mu$-almost everywhere in $A$. Thus the result is clear if $\|f\|_p = 0$ or $\|g\|_{p'} = 0$. We may assume that $\|f\|_p > 0$ and $\|g\|_{p'} > 0$. Let
\[
\tilde{f} = \frac{f}{\|f\|_p} \quad \text{and} \quad \tilde{g} = \frac{g}{\|g\|_{p'}}.
\]
Then
\[
\|\tilde{f}\|_p = \left\| \frac{f}{\|f\|_p} \right\|_p = \frac{\|f\|_p}{\|f\|_p} = 1 \quad \text{and} \quad \|\tilde{g}\|_{p'} = 1.
\]
By Young's inequality
\[
\frac{1}{\|f\|_p \|g\|_{p'}} \int_A |fg| \, d\mu = \int_A |\tilde{f} \tilde{g}| \, d\mu
\le \frac{1}{p} \int_A |\tilde{f}|^p \, d\mu + \frac{1}{p'} \int_A |\tilde{g}|^{p'} \, d\mu = \frac{1}{p} + \frac{1}{p'} = 1,
\]
since both integrals on the right-hand side are equal to one. An equality holds if and only if
\[
\int_A \left( \frac{1}{p} |\tilde{f}|^p + \frac{1}{p'} |\tilde{g}|^{p'} - |\tilde{f} \tilde{g}| \right) d\mu = 0,
\]
where the integrand is nonnegative, which implies that
\[
\frac{1}{p} |\tilde{f}|^p + \frac{1}{p'} |\tilde{g}|^{p'} - |\tilde{f} \tilde{g}| = 0 \quad \mu\text{-almost everywhere in } A.
\]
The equality occurs in Young's inequality if and only if $|\tilde{f}|^p = |\tilde{g}|^{p'}$ $\mu$-almost everywhere in $A$. In this case
\[
|f(x)|^p = \frac{\|f\|_p^p}{\|g\|_{p'}^{p'}} |g(x)|^{p'} \quad \text{for } \mu\text{-almost every } x \in A. \qquad \square
\]

WARNING: $f \in L^p(A)$ and $g \in L^p(A)$ does not imply that $fg \in L^p(A)$.

Reason. Let
\[
f : (0,1) \to \mathbb{R}, \quad f(x) = \frac{1}{\sqrt{x}}, \quad g = f,
\]
and assume that $\mu$ is the Lebesgue measure. Then $f \in L^1((0,1))$ and $g \in L^1((0,1))$, but
\[
(fg)(x) = f(x) g(x) = \frac{1}{x} \quad \text{and} \quad fg \notin L^1((0,1)). \qquad ■
\]

Remarks 1.17:
(1) For $p = 2$ we have Schwarz's inequality
\[
\int_A |fg| \, d\mu \le \left( \int_A |f|^2 \, d\mu \right)^{1/2} \left( \int_A |g|^2 \, d\mu \right)^{1/2}.
\]
(2) Hölder's inequality holds for arbitrary measurable functions with the interpretation that the integrals may be infinite. (Exercise)
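Since the theory applies to the sequence spaces $l^p$ (counting measure), Hölder's inequality can be checked on finite sequences. The following is a minimal sketch under that assumption; the helper names `conjugate`, `lp_norm` and `holder_gap` are hypothetical and chosen only for this illustration.

```python
def conjugate(p: float) -> float:
    """Hölder conjugate p' = p / (p - 1) for 1 < p < infinity."""
    return p / (p - 1)


def lp_norm(x, p: float) -> float:
    """l^p norm of a finite sequence, i.e. the L^p norm for the counting measure."""
    return sum(abs(t) ** p for t in x) ** (1 / p)


def holder_gap(x, y, p: float) -> float:
    """Right-hand side minus left-hand side of Hölder's inequality (nonnegative)."""
    lhs = sum(abs(a * b) for a, b in zip(x, y))
    return lp_norm(x, p) * lp_norm(y, conjugate(p)) - lhs


if __name__ == "__main__":
    # Generic sequences: strict inequality, so the gap is positive.
    print(holder_gap([1, 2, 3], [4, 5, 6], 3.0))
    # p = 2 and y = x: |x|^2 = |y|^2, the equality case, so the gap is about 0.
    print(holder_gap([1, 2, 3], [1, 2, 3], 2.0))
```

The second call illustrates the equality condition $|f|^p = c|g|^{p'}$ with $p = p' = 2$ and $c = 1$, up to floating-point rounding.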

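Lemma 1.18 (Jensen's inequality). Let $1 \le p < q < \infty$ and assume that $0 < \mu(A) < \infty$. Then
\[
\left( \frac{1}{\mu(A)} \int_A |f|^p \, d\mu \right)^{1/p} \le \left( \frac{1}{\mu(A)} \int_A |f|^q \, d\mu \right)^{1/q}.
\]

THE MORAL: The integral average is an increasing function of the power.

Proof. By Hölder's inequality with the exponents $q/p$ and $q/(q-p)$,
\[
\int_A |f|^p \, d\mu \le \left( \int_A |f|^{pq/p} \, d\mu \right)^{p/q} \left( \int_A 1^{q/(q-p)} \, d\mu \right)^{(q-p)/q}
= \left( \int_A |f|^q \, d\mu \right)^{p/q} \mu(A)^{1 - p/q}.
\]
Dividing both sides by $\mu(A)$ and raising to the power $1/p$ gives the claim. □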
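For the normalized counting measure on a finite set, Lemma 1.18 says that the power mean of a finite sequence is nondecreasing in the exponent. The sketch below checks this numerically; `power_mean` is an illustrative name, not notation from the notes.

```python
def power_mean(x, p: float) -> float:
    """Integral average (1/N * sum |x_i|^p)^(1/p) for the uniform measure on N points."""
    return (sum(abs(t) ** p for t in x) / len(x)) ** (1 / p)


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    # The averages increase with the exponent, as Jensen's inequality predicts.
    for p in (1.0, 2.0, 3.0, 10.0):
        print(p, power_mean(data, p))
    # For a constant sequence every power mean equals the constant.
    print(power_mean([3, 3, 3], 2.5))
```

The finiteness of the measure matters here: the normalization by $\mu(A)$ is what makes the averages comparable across exponents.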

Remark 1.19. If $1 \le p < q < \infty$ and $\mu(A) < \infty$, then $L^q(A) \subset L^p(A)$.

WARNING: Let $1 \le p < q < \infty$. In general, $L^q(A) \not\subset L^p(A)$ and $L^p(A) \not\subset L^q(A)$.

Reason. Let $f : (0, \infty) \to \mathbb{R}$, $f(x) = x^a$, and assume that $\mu$ is the Lebesgue measure. Then
\[
f \in L^1((0,1)) \iff a > -1 \quad \text{and} \quad f \in L^1((1, \infty)) \iff a < -1.
\]
Assume that $1 \le p < q < \infty$. Choose $b$ such that $1/q < b < 1/p$. Then the function $x^{-b} \chi_{(0,1)}(x)$ belongs to $L^p((0, \infty))$, but does not belong to $L^q((0, \infty))$. On the other hand, the function $x^{-b} \chi_{(1, \infty)}(x)$ belongs to $L^q((0, \infty))$, but does not belong to $L^p((0, \infty))$. ■

Examples 1.20:
(1) Let $A = (0,1)$, $\mu$ be the Lebesgue measure and $1 \le p < \infty$. Define $f : (0,1) \to \mathbb{R}$,
\[
f(x) = \frac{1}{x^{1/p} (\log(2/x))^{2/p}}.
\]
Then $f \in L^p((0,1))$, but $f \notin L^q((0,1))$ for any $q > p$. Thus for every $p$ with $1 \le p < \infty$, there exists a function $f$ which belongs to $L^p((0,1))$, but does not belong to any higher $L^q((0,1))$ with $q > p$. (Exercise)
(2) Let $1 \le p < q < \infty$. Assume that $A$ contains $\mu$-measurable sets of arbitrarily small positive measure. Then there are pairwise disjoint $\mu$-measurable sets $A_i \subset A$, $i = 1, 2, \dots$, such that $\mu(A_i) > 0$ and $\mu(A_i) \to 0$ as $i \to \infty$. Let
\[
f = \sum_{i=1}^\infty a_i \chi_{A_i},
\]
where $a_i \ge 0$ with $a_i \to \infty$ as $i \to \infty$ are chosen so that
\[
\sum_{i=1}^\infty a_i^q \mu(A_i) = \infty \quad \text{and} \quad \sum_{i=1}^\infty a_i^p \mu(A_i) < \infty.
\]
Then $f \in L^p(A) \setminus L^q(A)$. It can be shown that $L^p(A)$ is not contained in $L^q(A)$ if and only if $A$ contains measurable sets of arbitrarily small positive measure. (Exercise)
(3) Let $1 \le p < q < \infty$. Assume that $A$ contains $\mu$-measurable sets of arbitrarily large measure. Then there are pairwise disjoint $\mu$-measurable sets $A_i \subset A$, $i = 1, 2, \dots$, such that $\mu(A_i) > 0$ and $\mu(A_i) \to \infty$ as $i \to \infty$. Let
\[
f = \sum_{i=1}^\infty a_i \chi_{A_i},
\]
where $a_i \ge 0$ with $a_i \to 0$ as $i \to \infty$ are chosen so that
\[
\sum_{i=1}^\infty a_i^q \mu(A_i) < \infty \quad \text{and} \quad \sum_{i=1}^\infty a_i^p \mu(A_i) = \infty.
\]
Then $f \in L^q(A) \setminus L^p(A)$. It can be shown that $L^q(A)$ is not contained in $L^p(A)$ if and only if $A$ contains measurable sets of arbitrarily large measure. (Exercise)

Remark 1.21. There is a more general version of Jensen's inequality. Assume that $0 < \mu(A) < \infty$. Let $f \in L^1(A)$ be such that $a < f(x) < b$ for every $x \in A$. If $\varphi$ is a convex function on $(a, b)$, then
\[
\varphi\left( \frac{1}{\mu(A)} \int_A f \, d\mu \right) \le \frac{1}{\mu(A)} \int_A \varphi \circ f \, d\mu.
\]
The cases $a = -\infty$ and $b = \infty$ are not excluded. Observe that in this case it may happen that $\varphi \circ f$ is not integrable. We leave the proof as an exercise.

Theorem 1.22 (Minkowski's inequality). Assume $1 \le p < \infty$ and $f, g \in L^p(A)$. Then $f + g \in L^p(A)$ and
\[
\|f + g\|_p \le \|f\|_p + \|g\|_p.
\]
Moreover, an equality occurs if and only if there exists a positive constant $c$ such that $f(x) = c g(x)$ for $\mu$-almost every $x \in A$.

THE MORAL: Minkowski's inequality is the triangle inequality for the $L^p$ norm. It implies that the $L^p$ norm, with $1 \le p < \infty$, is a norm in the usual sense and that $L^p(A)$ is a normed space if the functions that coincide almost everywhere are identified.

Remark 1.23. Elementary inequalities (1.3) and (1.4) imply that
\[
\begin{aligned}
\|f + g\|_p &= \left( \int_A |f + g|^p \, d\mu \right)^{1/p} \le 2^{(p-1)/p} \left( \int_A (|f|^p + |g|^p) \, d\mu \right)^{1/p} \\
&\le 2^{(p-1)/p} \left( \left( \int_A |f|^p \, d\mu \right)^{1/p} + \left( \int_A |g|^p \, d\mu \right)^{1/p} \right) \\
&= 2^{(p-1)/p} (\|f\|_p + \|g\|_p), \quad 1 \le p < \infty.
\end{aligned}
\]
Observe that the factor $2^{(p-1)/p}$ is strictly greater than one for $p > 1$ and Minkowski's inequality does not follow from this.

Proof. $p = 1$: The triangle inequality, as in the proof of Lemma 1.7, shows that $\|f + g\|_1 \le \|f\|_1 + \|g\|_1$.

$1 < p < \infty$: If $\|f + g\|_p = 0$, there is nothing to prove. Thus we may assume that $\|f + g\|_p > 0$. Then by the pointwise inequality
\[
|f + g|^p = |f + g|^{p-1} |f + g| \le |f + g|^{p-1} (|f| + |g|)
\]
and Hölder's inequality
\[
\begin{aligned}
\int_A |f + g|^p \, d\mu &\le \int_A |f + g|^{p-1} |f| \, d\mu + \int_A |f + g|^{p-1} |g| \, d\mu \\
&\le \left( \int_A |f + g|^{(p-1)p'} \, d\mu \right)^{1/p'} \left( \int_A |f|^p \, d\mu \right)^{1/p} \\
&\quad + \left( \int_A |f + g|^{(p-1)p'} \, d\mu \right)^{1/p'} \left( \int_A |g|^p \, d\mu \right)^{1/p}.
\end{aligned}
\]
Since $(p - 1)p' = p$, $1/p' = (p-1)/p$ and $0 < \|f + g\|_p < \infty$, we may divide by the common factor and obtain
\[
\left( \int_A |f + g|^p \, d\mu \right)^{1 - (p-1)/p} \le \left( \int_A |f|^p \, d\mu \right)^{1/p} + \left( \int_A |g|^p \, d\mu \right)^{1/p},
\]
which is the claim, because $1 - (p-1)/p = 1/p$.

It remains to consider when the equality can occur. This happens if there is an equality in the pointwise inequality $|f + g|^p = |f + g|^{p-1} |f + g| \le |f + g|^{p-1} (|f| + |g|)$ as well as equality in the applications of Hölder's inequality. Equality occurs in the pointwise inequality above if $f(x)$ and $g(x)$ have the same sign. On the other hand, equality occurs in Hölder's inequality if
\[
c_1 |f(x)|^p = |f(x) + g(x)|^p = c_2 |g(x)|^p \quad \text{for } \mu\text{-almost every } x \in A.
\]
This completes the proof. □

Note that the normed space $L^p(A)$, $1 \le p < \infty$, is a metric space with the metric
\[
d(f, g) = \|f - g\|_p.
\]
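Minkowski's inequality can be illustrated on finite sequences, that is, for the counting measure on a finite set. The following sketch is only a numerical check under that assumption; the names `lp_norm` and `minkowski_gap` are hypothetical.

```python
def lp_norm(x, p: float) -> float:
    """l^p norm of a finite sequence (counting measure), for p >= 1."""
    return sum(abs(t) ** p for t in x) ** (1 / p)


def minkowski_gap(x, y, p: float) -> float:
    """||x||_p + ||y||_p - ||x + y||_p, which is nonnegative for p >= 1."""
    total = [a + b for a, b in zip(x, y)]
    return lp_norm(x, p) + lp_norm(y, p) - lp_norm(total, p)


if __name__ == "__main__":
    # Generic sequences: the triangle inequality is strict.
    print(minkowski_gap([1, -2, 3], [4, 0, -1], 3.0))
    # y = 2x is a positive multiple of x: the equality case, gap about 0.
    print(minkowski_gap([1, 2, 3], [2, 4, 6], 3.0))
```

The second call corresponds to the equality condition $f = cg$ with a positive constant $c$, up to floating-point rounding.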

1.3 Lp spaces for 0 < p < 1

It is sometimes useful to consider $L^p$ spaces for $0 < p < \infty$. Observe that Definition 1.1 makes sense also when $0 < p < 1$ and the space is a vector space by the same argument as in the proof of Lemma 1.7. However, $\|f\|_p$ is not a norm for $0 < p < 1$.

Reason. Let $f = \chi_{[0, \frac{1}{2})}$ and $g = \chi_{[\frac{1}{2}, 1]}$. Then $f + g = \chi_{[0,1]}$ so that $\|f + g\|_p = 1$. On the other hand, $\|f\|_p = 2^{-1/p}$ and $\|g\|_p = 2^{-1/p}$. Thus
\[
\|f\|_p + \|g\|_p = 2 \cdot 2^{-1/p} = 2^{1 - 1/p} < 1, \quad \text{when } 0 < p < 1.
\]
This shows that $\|f\|_p + \|g\|_p < \|f + g\|_p$. ■
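The counterexample above uses only the fact that the $L^p$ quasi-norm of a characteristic function $\chi_E$ equals $\mu(E)^{1/p}$. A minimal sketch of the computation follows; the function names are illustrative and not part of the notes.

```python
def indicator_norm(measure: float, p: float) -> float:
    """||chi_E||_p = mu(E)^(1/p) for a set E of the given measure."""
    return measure ** (1 / p)


def triangle_deficit(p: float) -> float:
    """||f + g||_p - (||f||_p + ||g||_p) for f = chi_[0,1/2), g = chi_[1/2,1].

    A positive value means that the triangle inequality fails.
    """
    return indicator_norm(1.0, p) - 2 * indicator_norm(0.5, p)


if __name__ == "__main__":
    for p in (0.25, 0.5, 1.0, 2.0):
        # Equals 1 - 2^(1 - 1/p): positive for p < 1, zero at p = 1, negative for p > 1.
        print(p, triangle_deficit(p))
```

The sign change at $p = 1$ matches the statement that $\|\cdot\|_p$ is a norm exactly when $p \ge 1$.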

Thus the triangle inequality does not hold true when $0 < p < 1$, but we have the following result.

Lemma 1.24. If $f, g \in L^p(A)$ and $0 < p < 1$, then $f + g \in L^p(A)$ and
\[
\|f + g\|_p^p \le \|f\|_p^p + \|g\|_p^p.
\]

Proof. The elementary inequality
\[
(a + b)^p \le a^p + b^p, \quad a, b \ge 0, \ 0 < p < 1, \tag{1.4}
\]
implies
\[
\|f + g\|_p^p = \int_A |f + g|^p \, d\mu \le \int_A |f|^p \, d\mu + \int_A |g|^p \, d\mu = \|f\|_p^p + \|g\|_p^p. \qquad \square
\]

However, $L^p(A)$ is a metric space with the metric
\[
d(f, g) = \|f - g\|_p^p = \int_A |f - g|^p \, d\mu.
\]
This metric is not induced by a norm, since $\|f\|_p^p$ does not satisfy the homogeneity required by the norm. On the other hand, $\|f\|_p$ satisfies the homogeneity, but not the triangle inequality.

Remarks 1.25:
(1) By (1.2), we have

\[
\|f + g\|_p \le \left( \|f\|_p^p + \|g\|_p^p \right)^{1/p} \le 2^{1/p} (\|f\|_p + \|g\|_p), \quad 0 < p < 1.
\]
Thus a quasi triangle inequality holds with a multiplicative constant.
(2) If $f, g \in L^p(A)$, $f \ge 0$, $g \ge 0$, then
\[
\|f + g\|_p \ge \|f\|_p + \|g\|_p, \quad 0 < p < 1.
\]
This is the triangle inequality in the wrong direction (exercise).

Remark 1.26. It is possible to define the $L^p$ spaces also when $p < 0$. A $\mu$-measurable function is in $L^p(A)$ for $p < 0$, if
\[
0 < \int_A |f|^p \, d\mu = \int_A \frac{1}{|f|^{p'/(1-p')}} \, d\mu < \infty, \quad \text{where } \frac{1}{p} + \frac{1}{p'} = 1.
\]
If $f \in L^p(A)$ for $p < 0$, then $f \ne 0$ $\mu$-almost everywhere and $|f| < \infty$ $\mu$-almost everywhere. However, this is not a vector space.

1.4 Completeness of Lp

Next we prove a famous theorem, which is not only important in the theory of $L^p$ spaces, but has a historical interest as well. The result was found independently by F. Riesz and E. Fischer in 1907, primarily in connection with the Fourier series, which culminates in showing completeness of $L^2$.

Recall that a sequence $(f_i)$ of functions $f_i \in L^p(A)$, $i = 1, 2, \dots$, converges in $L^p(A)$ to a function $f \in L^p(A)$, if for every $\varepsilon > 0$ there exists $i_\varepsilon$ such that
\[
\|f_i - f\|_p < \varepsilon \quad \text{when } i \ge i_\varepsilon.
\]
Equivalently,
\[
\lim_{i \to \infty} \|f_i - f\|_p = 0.
\]
A sequence $(f_i)$ is a Cauchy sequence in $L^p(A)$, if for every $\varepsilon > 0$ there exists $i_\varepsilon$ such that
\[
\|f_i - f_j\|_p < \varepsilon \quad \text{when } i, j \ge i_\varepsilon.
\]

WARNING: This is not the same condition as
\[
\|f_{i+1} - f_i\|_p < \varepsilon \quad \text{when } i \ge i_\varepsilon.
\]
Indeed, the Cauchy sequence condition implies this, but the converse is not true (exercise).

CLAIM: If $f_i \to f$ in $L^p(A)$, then $(f_i)$ is a Cauchy sequence in $L^p(A)$.

Reason. Let $\varepsilon > 0$. Since $f_i \to f$ in $L^p(A)$, there exists $i_\varepsilon$ such that $\|f_i - f\|_p < \frac{\varepsilon}{2}$ when $i \ge i_\varepsilon$. By Minkowski's inequality
\[
\|f_i - f_j\|_p \le \|f_i - f\|_p + \|f - f_j\|_p < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon
\]
when $i, j \ge i_\varepsilon$. ■

Theorem 1.27 (Riesz-Fischer). For every Cauchy sequence $(f_i)$ in $L^p(A)$, $1 \le p < \infty$, there exists $f \in L^p(A)$ such that $f_i \to f$ in $L^p(A)$ as $i \to \infty$.

THE MORAL: $L^p(A)$, $1 \le p < \infty$, is a Banach space with the norm $\|\cdot\|_p$. In particular, $L^2(A)$ is a Hilbert space.

Proof. Assume that $(f_i)$ is a Cauchy sequence in $L^p(A)$. We construct a subsequence as follows. Choose $i_1$ such that
\[
\|f_i - f_j\|_p < \frac{1}{2} \quad \text{when } i, j \ge i_1.
\]
We continue recursively. Suppose that $i_1, i_2, \dots, i_k$ have been chosen such that
\[
\|f_i - f_j\|_p < \frac{1}{2^k} \quad \text{when } i, j \ge i_k.
\]
Then choose $i_{k+1} > i_k$ such that
\[
\|f_i - f_j\|_p < \frac{1}{2^{k+1}} \quad \text{when } i, j \ge i_{k+1}.
\]
For the subsequence $(f_{i_k})$, we have
\[
\|f_{i_k} - f_{i_{k+1}}\|_p < \frac{1}{2^k}, \quad k = 1, 2, \dots.
\]
Let
\[
g_l = \sum_{k=1}^l |f_{i_{k+1}} - f_{i_k}| \quad \text{and} \quad g = \sum_{k=1}^\infty |f_{i_{k+1}} - f_{i_k}|.
\]
Then
\[
\lim_{l \to \infty} g_l = \lim_{l \to \infty} \sum_{k=1}^l |f_{i_{k+1}} - f_{i_k}| = \sum_{k=1}^\infty |f_{i_{k+1}} - f_{i_k}| = g
\]
and, as a limit of $\mu$-measurable functions, $g$ is a $\mu$-measurable function. Fatou's lemma and Minkowski's inequality imply
\[
\begin{aligned}
\left( \int_A g^p \, d\mu \right)^{1/p} &\le \liminf_{l \to \infty} \left( \int_A g_l^p \, d\mu \right)^{1/p} \\
&\le \liminf_{l \to \infty} \sum_{k=1}^l \left( \int_A |f_{i_{k+1}} - f_{i_k}|^p \, d\mu \right)^{1/p} \le \sum_{k=1}^\infty \frac{1}{2^k} = 1.
\end{aligned}
\]
Thus $g \in L^p(A)$ and consequently $g(x) < \infty$ for $\mu$-almost every $x \in A$. It follows that the series
\[
f_{i_1}(x) + \sum_{k=1}^\infty (f_{i_{k+1}}(x) - f_{i_k}(x))
\]
converges absolutely for $\mu$-almost every $x \in A$. Denote the sum of the series by $f(x)$ for those $x \in A$ at which it converges and set $f(x) = 0$ in the remaining set of measure zero. Then
\[
\begin{aligned}
f(x) &= f_{i_1}(x) + \sum_{k=1}^\infty (f_{i_{k+1}}(x) - f_{i_k}(x)) \\
&= \lim_{l \to \infty} \left( f_{i_1}(x) + \sum_{k=1}^{l-1} (f_{i_{k+1}}(x) - f_{i_k}(x)) \right) \\
&= \lim_{l \to \infty} f_{i_l}(x) = \lim_{k \to \infty} f_{i_k}(x)
\end{aligned}
\]
for $\mu$-almost every $x \in A$. Thus there is a subsequence which converges $\mu$-almost everywhere in $A$. Next we show that the original sequence converges to $f$ in $L^p(A)$.

CLAIM: $f_i \to f$ in $L^p(A)$ as $i \to \infty$.

Reason. Let $\varepsilon > 0$. Since $(f_i)$ is a Cauchy sequence in $L^p(A)$, there exists $i_\varepsilon$ such that $\|f_i - f_j\|_p < \varepsilon$ when $i, j \ge i_\varepsilon$. For a fixed $i \ge i_\varepsilon$, we have $f_{i_k} - f_i \to f - f_i$ $\mu$-almost everywhere in $A$ as $i_k \to \infty$. By Fatou's lemma
\[
\left( \int_A |f - f_i|^p \, d\mu \right)^{1/p} \le \liminf_{k \to \infty} \left( \int_A |f_{i_k} - f_i|^p \, d\mu \right)^{1/p} \le \varepsilon. \qquad ■
\]
This shows that $f - f_i \in L^p(A)$ and thus $f = (f - f_i) + f_i \in L^p(A)$. Moreover, for every $\varepsilon > 0$ there exists $i_\varepsilon$ such that $\|f_i - f\|_p \le \varepsilon$ when $i \ge i_\varepsilon$. This completes the proof. □

WARNING: In general, if a sequence has a converging subsequence, the original sequence need not converge. In the proof above, we used the fact that we have a Cauchy sequence.

We shall often use a part of the proof of the Riesz-Fischer theorem, which we now state.

Corollary 1.28. If $f_i \to f$ in $L^p(A)$, then there exists a subsequence $(f_{i_k})$ such that
\[
\lim_{k \to \infty} f_{i_k}(x) = f(x) \quad \text{for } \mu\text{-almost every } x \in A.
\]

Proof. The proof of the Riesz-Fischer theorem gives a subsequence $(f_{i_k})$ and a function $g \in L^p(A)$ such that
\[
\lim_{k \to \infty} f_{i_k}(x) = g(x) \quad \text{for } \mu\text{-almost every } x \in A
\]
and $f_{i_k} \to g$ in $L^p(A)$. On the other hand, $f_i \to f$ in $L^p(A)$, which implies that $f_{i_k} \to f$ in $L^p(A)$. By the uniqueness of the limit, we conclude that $f = g$ $\mu$-almost everywhere in $A$. □

Remarks 1.29:

Let us compare the various modes of convergence of a sequence $(f_i)$ of functions in $L^p(A)$.

(1) If $f_i \to f$ in $L^p(A)$, then
\[
\lim_{i \to \infty} \|f_i\|_p = \|f\|_p.
\]
Reason. $\|f_i\|_p = \|f_i - f + f\|_p \le \|f_i - f\|_p + \|f\|_p$ implies $\|f_i\|_p - \|f\|_p \le \|f_i - f\|_p$. In the same way $\|f\|_p - \|f_i\|_p \le \|f_i - f\|_p$. Thus
\[
\left| \|f_i\|_p - \|f\|_p \right| \le \|f_i - f\|_p \to 0,
\]
from which it follows that
\[
\lim_{i \to \infty} \|f_i\|_p = \|f\|_p. \qquad ■
\]

(2) $f_i \to f$ in $L^p(A)$ implies that $f_i \to f$ in measure.

Reason. By Chebyshev's inequality
\[
\mu(\{x \in A : |f_i(x) - f(x)| \ge \varepsilon\}) \le \frac{1}{\varepsilon^p} \int_A |f_i - f|^p \, d\mu = \frac{1}{\varepsilon^p} \|f_i - f\|_p^p \to 0 \quad \text{as } i \to \infty. \qquad ■
\]

(3) If $f_i \to f$ in $L^p(A)$, then there exists a subsequence $(f_{i_k})$ such that
\[
\lim_{k \to \infty} f_{i_k}(x) = f(x) \quad \text{for } \mu\text{-almost every } x \in A.
\]
Reason. The convergence in measure implies the existence of an almost everywhere converging subsequence. This gives another proof of the previous corollary. ■

(4) In the case $p = 1$, $f_i \to f$ in $L^1(A)$ implies not only that
\[
\lim_{i \to \infty} \int_A |f_i| \, d\mu = \int_A |f| \, d\mu
\]
but also that
\[
\lim_{i \to \infty} \int_A f_i \, d\mu = \int_A f \, d\mu.
\]
Reason.
\[
\left| \int_A (f_i - f) \, d\mu \right| \le \int_A |f_i - f| \, d\mu = \|f_i - f\|_1 \to 0 \quad \text{as } i \to \infty. \qquad ■
\]

THE MORAL: This is a useful tool in showing that a sequence does not converge in $L^p$.

Example 1.30. Let $f_i = \chi_{[i-1, i)}$, $i = 1, 2, \dots$, and $f = 0$. Assume that $\mu$ is the Lebesgue measure. Then
\[
\lim_{i \to \infty} f_i(x) = f(x) \quad \text{for every } x \in \mathbb{R}.
\]
However, $\|f_i\|_p = 1$ for every $i = 1, 2, \dots$ and $\|f\|_p = 0$. Thus the sequence $(f_i)$ does not converge to $f$ in $L^p(\mathbb{R})$, $1 \le p < \infty$.

Example 1.31. In the following examples we assume that $\mu$ is the Lebesgue measure.

(1) $f_i \to f$ almost everywhere does not imply $f_i \to f$ in $L^p$. Let
\[
f_i = i^2 \chi_{(0, \frac{1}{i})}, \quad i = 1, 2, \dots.
\]
Then
\[
\int_{\mathbb{R}} |f_i|^p \, dx = \int_{\mathbb{R}} i^{2p} \chi_{(0, \frac{1}{i})} \, dx = i^{2p} \frac{1}{i} = i^{2p-1} < \infty.
\]
Thus $f_i \in L^p(\mathbb{R})$, $1 \le p < \infty$, $f_i(x) \to 0$ for every $x \in \mathbb{R}$, but
\[
\|f_i\|_p = i^{2 - \frac{1}{p}} \ge i \to \infty \quad \text{as } i \to \infty.
\]
Thus $(f_i)$ does not converge in $L^p(\mathbb{R})$.
(2) $f_i \to f$ in $L^p$ does not imply $f_i \to f$ almost everywhere. Consider the sliding sequence of functions
\[
f_{2^k + j} = k \chi_{\left[ \frac{j}{2^k}, \frac{j+1}{2^k} \right]}, \quad k = 0, 1, 2, \dots, \quad j = 0, 1, 2, \dots, 2^k - 1.
\]
Then
\[
\|f_{2^k + j}\|_p = k 2^{-k/p} \to 0 \quad \text{as } k \to \infty,
\]
which implies that $f_i \to 0$ in $L^p(\mathbb{R})$, $1 \le p < \infty$, as $i \to \infty$. However, the sequence $(f_i(x))$ fails to converge for every $x \in [0,1]$, since
\[
\limsup_{i \to \infty} f_i(x) = \infty \quad \text{and} \quad \liminf_{i \to \infty} f_i(x) = 0
\]
for every $x \in [0,1]$. Note that there are many converging subsequences. For example, $f_{2^k + 1}(x) \to 0$ for every $x \in [0,1]$ as $k \to \infty$.
(3) A sequence can converge in $L^p$ without converging in $L^q$. Consider
\[
f_i = i^{-1} \chi_{(i, 2i)}, \quad i = 1, 2, \dots.
\]
Then $\|f_i\|_p = i^{-1 + 1/p}$. Thus $f_i \to 0$ in $L^p(\mathbb{R})$, $1 < p < \infty$, but $\|f_i\|_1 = 1$ for every $i = 1, 2, \dots$, so that the sequence $(f_i)$ does not converge to zero in $L^1(\mathbb{R})$.

The following theorem clarifies the difference between the pointwise convergence and $L^p$-convergence.
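As an aside on the sliding sequence of Example 1.31 (2), the following sketch evaluates $f_{2^k + j}$ at a fixed point and computes its norm $k 2^{-k/p}$. It is only an illustration of the example; the function names `sliding_term`, `sliding_value` and `sliding_norm` are hypothetical.

```python
def sliding_term(i: int):
    """Return (k, j) with i = 2**k + j and 0 <= j < 2**k, for an index i >= 1."""
    k = i.bit_length() - 1
    return k, i - (1 << k)


def sliding_value(i: int, x: float) -> float:
    """Value at x of f_i = k * chi_[j / 2^k, (j + 1) / 2^k]."""
    k, j = sliding_term(i)
    left, right = j / 2 ** k, (j + 1) / 2 ** k
    return float(k) if left <= x <= right else 0.0


def sliding_norm(i: int, p: float) -> float:
    """L^p norm of f_i with respect to the Lebesgue measure: k * 2^(-k / p)."""
    k, _ = sliding_term(i)
    return k * 2.0 ** (-k / p)


if __name__ == "__main__":
    x = 0.3
    for k in (5, 10, 15):
        block = range(2 ** k, 2 ** (k + 1))
        values = [sliding_value(i, x) for i in block]
        # Within each dyadic block the values at x hit both 0 and k,
        # while the norm of every function in the block is k * 2^(-k / p).
        print(k, min(values), max(values), sliding_norm(2 ** k, 2.0))
```

Within each dyadic block the values at a fixed point take both the value $0$ and the value $k$, which is the oscillation behind $\liminf = 0$ and $\limsup = \infty$, while the norms tend to zero.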

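Theorem 1.32. Assume that $f_i \in L^p(A)$, $i = 1, 2, \dots$, and $f \in L^p(A)$, $1 \le p < \infty$. If $f_i \to f$ $\mu$-almost everywhere in $A$ and $\lim_{i \to \infty} \|f_i\|_p = \|f\|_p$, then $f_i \to f$ in $L^p(A)$ as $i \to \infty$.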

Proof. Since $|f_i| < \infty$ and $|f| < \infty$ $\mu$-almost everywhere in $A$, by (1.2), we have
\[
2^p (|f_i|^p + |f|^p) - |f_i - f|^p \ge 0 \quad \mu\text{-almost everywhere in } A.
\]
The assumption $f_i \to f$ $\mu$-almost everywhere in $A$ implies
\[
\lim_{i \to \infty} \left( 2^p (|f_i|^p + |f|^p) - |f_i - f|^p \right) = 2^{p+1} |f|^p \quad \mu\text{-almost everywhere in } A.
\]
Applying Fatou's lemma, we obtain
\[
\begin{aligned}
\int_A 2^{p+1} |f|^p \, d\mu &\le \liminf_{i \to \infty} \int_A \left( 2^p (|f_i|^p + |f|^p) - |f_i - f|^p \right) d\mu \\
&\le \liminf_{i \to \infty} \left( 2^p \int_A |f_i|^p \, d\mu + 2^p \int_A |f|^p \, d\mu - \int_A |f_i - f|^p \, d\mu \right) \\
&= \lim_{i \to \infty} 2^p \int_A |f_i|^p \, d\mu + 2^p \int_A |f|^p \, d\mu - \limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu \\
&= 2^p \int_A |f|^p \, d\mu + 2^p \int_A |f|^p \, d\mu - \limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu.
\end{aligned}
\]
Here we used the facts that if $(a_i)$ is a converging sequence of real numbers and $(b_i)$ is an arbitrary sequence of real numbers, then
\[
\liminf_{i \to \infty} (a_i + b_i) = \lim_{i \to \infty} a_i + \liminf_{i \to \infty} b_i \quad \text{and} \quad \liminf_{i \to \infty} (-b_i) = -\limsup_{i \to \infty} b_i.
\]
Subtracting $\int_A 2^{p+1} |f|^p \, d\mu$ from both sides, we have
\[
\limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu \le 0.
\]
On the other hand, since the integrands are nonnegative,
\[
\limsup_{i \to \infty} \int_A |f_i - f|^p \, d\mu \ge 0.
\]
Thus
\[
\lim_{i \to \infty} \int_A |f_i - f|^p \, d\mu = 0. \qquad \square
\]

1.5 L∞ space

The definition of the $L^\infty$ space differs substantially from the definition of the $L^p$ space for $1 \le p < \infty$. The main difference is that instead of the integration the definition is based on the almost everywhere concept. The class $L^\infty$ consists of bounded measurable functions with the interpretation that we neglect the behaviour of functions on a set of measure zero.

Definition 1.33. Let $A \subset \mathbb{R}^n$ be a $\mu$-measurable set and $f : A \to [-\infty, \infty]$ a $\mu$-measurable function. Then $f \in L^\infty(A)$, if there exists $M$, $0 \le M < \infty$, such that
\[
|f(x)| \le M \quad \text{for } \mu\text{-almost every } x \in A.
\]
Functions in $L^\infty$ are sometimes called essentially bounded functions. If $f \in L^\infty(A)$, then the essential supremum of $f$ is
\[
\begin{aligned}
\operatorname*{ess\,sup}_{x \in A} f(x) &= \inf\{M : f(x) \le M \text{ for } \mu\text{-almost every } x \in A\} \\
&= \inf\{M : \mu(\{x \in A : f(x) > M\}) = 0\}
\end{aligned}
\]
and the essential infimum of $f$ is
\[
\begin{aligned}
\operatorname*{ess\,inf}_{x \in A} f(x) &= \sup\{m : f(x) \ge m \text{ for } \mu\text{-almost every } x \in A\} \\
&= \sup\{m : \mu(\{x \in A : f(x) < m\}) = 0\}.
\end{aligned}
\]
The $L^\infty$ norm of $f$ is
\[
\|f\|_\infty = \operatorname*{ess\,sup}_{x \in A} |f(x)|.
\]
It is clear that $f \in L^\infty(A)$ if and only if $\|f\|_\infty < \infty$. Note also that if $f \in L^\infty(A)$, then there exists $M$, $0 \le M < \infty$, such that $|f(x)| \le M$ for $\mu$-almost every $x \in A$. This implies
\[
-M \le f(x) \le M \quad \text{for } \mu\text{-almost every } x \in A,
\]
and thus $\operatorname*{ess\,sup}_{x \in A} f(x) < \infty$ and $\operatorname*{ess\,inf}_{x \in A} f(x) > -\infty$.

THE MORAL: $L^\infty$ consists of measurable functions that can be redefined on a set of measure zero so that the functions become bounded. The essential supremum is the supremum outside sets of measure zero. Observe that the standard supremum of a bounded function $f$ is
\[
\sup_{x \in A} f(x) = \inf\{M : \{x \in A : f(x) > M\} = \emptyset\}.
\]

WARNING: The $L^p$ norm for $1 \le p < \infty$ depends on the average size of the function, but the $L^\infty$ norm depends on the pointwise values of the function outside a set of measure zero. More precisely, the $L^p$ norm for $1 \le p < \infty$ depends very much on the underlying measure $\mu$ and is sensitive to any changes in $\mu$. The $L^\infty$ norm depends only on the class of sets of $\mu$-measure zero and not on the distribution of the measure $\mu$ itself.

Remark 1.34. In the special case that $A = \mathbb{N}$ and $\mu$ is the counting measure, the $L^\infty(\mathbb{N})$ space is denoted by $l^\infty$ and
\[
l^\infty = \Bigl\{(x_i) : \sup_{i \in \mathbb{N}} |x_i| < \infty\Bigr\}.
\]
Here $(x_i)$ is a sequence of real (or complex) numbers. Thus $l^\infty$ is the space of bounded sequences.

Example 1.35. Assume that $\mu$ is the Lebesgue measure.

(1) Let $f : \mathbb{R} \to \mathbb{R}$, $f(x) = \chi_{\mathbb{Q}}(x)$. Then $\|f\|_\infty = 0$, but $\sup_{x \in \mathbb{R}} |f(x)| = 1$.
(2) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = 1/|x|$. Then $f \notin L^\infty(\mathbb{R}^n)$.

Remarks 1.36:

(1) $\|f\|_\infty \le \sup_{x \in A} |f(x)|$.
(2) Let $f \in L^\infty(A)$. By the definition of infimum, for every $\varepsilon > 0$, we have
\[
\mu(\{x \in A : |f(x)| > \|f\|_\infty + \varepsilon\}) = 0
\quad\text{and}\quad
\mu(\{x \in A : |f(x)| > \|f\|_\infty - \varepsilon\}) > 0.
\]

(3) If $f \in C(A)$ and $\mu(A) > 0$, then $\|f\|_\infty = \sup_{x \in A} |f(x)|$. (Exercise)
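The difference between the supremum and the essential supremum in Example 1.35 (1) and Remarks 1.36 (1) can be illustrated on a finite set. The following is a minimal sketch, assuming a measure given by nonnegative weights on finitely many points; the helper names `ess_sup` and `plain_sup` are hypothetical and not part of the notes. A point of weight zero plays the role of a set of measure zero.

```python
def ess_sup(values, weights):
    """Essential supremum of |f| on a finite weighted set.

    Points with weight 0 form a set of measure zero and are ignored.
    If every weight is 0 we return 0.0 (a convention for this sketch).
    """
    charged = [abs(v) for v, w in zip(values, weights) if w > 0]
    return max(charged) if charged else 0.0


def plain_sup(values):
    """Ordinary supremum of |f| over all points, null or not."""
    return max(abs(v) for v in values)


values = [1.0, -3.0, 7.0, 2.0]
weights = [0.5, 1.0, 0.0, 2.0]  # the third point carries no mass

print(ess_sup(values, weights), plain_sup(values))
```

The value 7 is attained only on a null set, so it affects the supremum but not the essential supremum, in line with $\|f\|_\infty \le \sup_{x \in A} |f(x)|$.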

Lemma 1.37. Assume that $f \in L^\infty(A)$. Then

(1) $f(x) \le \operatorname*{ess\,sup}_{x \in A} f(x)$ for $\mu$-almost every $x \in A$ and
(2) $f(x) \ge \operatorname*{ess\,inf}_{x \in A} f(x)$ for $\mu$-almost every $x \in A$.

THE MORAL: In other words, $|f(x)| \le \|f\|_\infty$ for $\mu$-almost every $x \in A$. This means that if $f \in L^\infty(A)$, there exists a smallest number $M$ such that $|f(x)| \le M$ for $\mu$-almost every $x \in A$. This smallest number is $\|f\|_\infty$.

Proof. (1) For every $i = 1, 2, \dots$ there exists $M_i$ such that
\[
M_i < \operatorname*{ess\,sup}_{x \in A} f(x) + \frac{1}{i}
\]
and $f(x) \le M_i$ for $\mu$-almost every $x \in A$. Thus there exists $N_i \subset A$ with $\mu(N_i) = 0$ such that $f(x) \le M_i$ for every $x \in A \setminus N_i$. Let $N = \bigcup_{i=1}^\infty N_i$. Then $\mu(N) \le \sum_{i=1}^\infty \mu(N_i) = 0$. Observe that
\[
\bigcap_{i=1}^\infty (A \setminus N_i) = A \setminus \bigcup_{i=1}^\infty N_i = A \setminus N.
\]
Then
\[
f(x) \le M_i < \operatorname*{ess\,sup}_{x \in A} f(x) + \frac{1}{i} \quad \text{for every } x \in A \setminus N,\ i = 1, 2, \dots.
\]
Letting $i \to \infty$, we obtain $f(x) \le \operatorname*{ess\,sup}_{x \in A} f(x)$ for every $x \in A \setminus N$.

(2) (Exercise) □

Lemma 1.38 (Minkowski's inequality for $p = \infty$). If $f, g \in L^\infty(A)$, then
\[
\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty.
\]

Proof. By Lemma 1.37, $|f(x)| \le \|f\|_\infty$ for $\mu$-almost every $x \in A$ and $|g(x)| \le \|g\|_\infty$ for $\mu$-almost every $x \in A$. Thus
\[
|f(x) + g(x)| \le |f(x)| + |g(x)| \le \|f\|_\infty + \|g\|_\infty \quad \text{for $\mu$-almost every } x \in A.
\]
By the definition of the $L^\infty$ norm, we have $\|f + g\|_\infty \le \|f\|_\infty + \|g\|_\infty$. □

THE MORAL: This is the triangle inequality for the $L^\infty$ norm. It implies that the $L^\infty$ norm is a norm in the usual sense and that $L^\infty(A)$ is a normed space if the functions that coincide almost everywhere are identified.

Theorem 1.39 (Hölder's inequality for $p = \infty$ and $p' = 1$). If $f \in L^1(A)$ and $g \in L^\infty(A)$, then $fg \in L^1(A)$ and
\[
\|fg\|_1 \le \|g\|_\infty \|f\|_1.
\]

THE MORAL: In practice, we take the essential supremum out of the integral.

Proof. By Lemma 1.37, we have $|g(x)| \le \|g\|_\infty$ for $\mu$-almost every $x \in A$. This implies
\[
|f(x)g(x)| \le \|g\|_\infty |f(x)| \quad \text{for $\mu$-almost every } x \in A
\]
and thus
\[
\int_A |f(x)g(x)| \, d\mu \le \|g\|_\infty \|f\|_1. \qquad \Box
\]

Remark 1.40. There is also an $L^p$ version $\|fg\|_p \le \|g\|_\infty \|f\|_p$ of the previous theorem.

The next result justifies the notation $\|f\|_\infty$ as a limit of $\|f\|_p$ as $p \to \infty$.

Theorem 1.41. If $f \in L^p(A)$ for some $1 \le p < \infty$, then
\[
\lim_{p \to \infty} \|f\|_p = \|f\|_\infty.
\]

THE MORAL: In this sense, $L^\infty(A)$ is the limit of the $L^p(A)$ spaces as $p \to \infty$. Moreover, this gives a useful method to show that $f \in L^\infty$: it is enough to find a uniform bound for the $L^p$ norms as $p \to \infty$. This gives a method to pass from the average information $\|f\|_p$ to the pointwise information $\|f\|_\infty$ outside a set of measure zero.

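Proof. Denote $A_\lambda = \{x \in A : |f(x)| > \lambda\}$, $\lambda \ge 0$. Assume that $0 \le \lambda < \|f\|_\infty$. By the definition of the $L^\infty$ norm, we have $\mu(A_\lambda) > 0$. By Chebyshev's inequality, for $\lambda > 0$,
\[
\mu(A_\lambda) \le \int_A \left(\frac{|f|}{\lambda}\right)^p d\mu = \frac{1}{\lambda^p} \int_A |f|^p \, d\mu < \infty
\]
and thus $\|f\|_p \ge \lambda \mu(A_\lambda)^{1/p}$. Since $0 < \mu(A_\lambda) < \infty$, we have $\mu(A_\lambda)^{1/p} \to 1$ as $p \to \infty$. This implies
\[
\liminf_{p \to \infty} \|f\|_p \ge \lambda \quad \text{whenever } 0 \le \lambda < \|f\|_\infty,
\]
the case $\lambda = 0$ being trivial. By letting $\lambda \to \|f\|_\infty$, we have
\[
\liminf_{p \to \infty} \|f\|_p \ge \|f\|_\infty.
\]
On the other hand, for $q < p < \infty$, we have
\[
\|f\|_p = \left(\int_A |f|^p \, d\mu\right)^{1/p} = \left(\int_A |f|^q |f|^{p-q} \, d\mu\right)^{1/p} \le \|f\|_\infty^{1 - q/p} \|f\|_q^{q/p}.
\]
Since $\|f\|_q < \infty$ for some $q$, this implies
\[
\limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty.
\]
We have shown that
\[
\limsup_{p \to \infty} \|f\|_p \le \|f\|_\infty \le \liminf_{p \to \infty} \|f\|_p,
\]
which implies that the limit exists and
\[
\lim_{p \to \infty} \|f\|_p = \|f\|_\infty. \qquad \Box
\]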
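The convergence in Theorem 1.41 can be observed numerically. The following is a minimal sketch, assuming a measure given by positive weights on a finite set; the function `p_norm` is a hypothetical helper written for this illustration, not part of the notes.

```python
def p_norm(values, weights, p):
    """L^p norm of f on a finite weighted set, for 1 <= p < infinity.

    The largest charged value m is factored out so that large exponents
    underflow harmlessly instead of overflowing.
    """
    m = max(abs(v) for v, w in zip(values, weights) if w > 0)
    if m == 0:
        return 0.0
    s = sum(w * (abs(v) / m) ** p for v, w in zip(values, weights))
    return m * s ** (1.0 / p)


values = [1.0, -3.0, 2.0]
weights = [0.5, 0.25, 2.0]  # all positive, so ess sup |f| = 3

for p in (1, 10, 100, 1000):
    print(p, p_norm(values, weights, p))
```

In this example the printed norms approach $\|f\|_\infty = 3$ as $p$ grows. The factor $\mu(A_\lambda)^{1/p}$ from the proof is visible here: the point where $|f| = 3$ has mass $0.25$, and $0.25^{1/p} \to 1$.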

Remarks 1.42:
(1) The assumption $f \in L^p(A)$ for some $1 \le p < \infty$ can be replaced with the assumption $\mu(A) < \infty$.
(2) Recall that by Jensen's inequality, the integral average
\[
\left(\frac{1}{\mu(A)} \int_A |f|^p \, d\mu\right)^{1/p}
\]
is an increasing function of $p$.
(3) If $0 < \mu(A) < \infty$, then for every $\mu$-measurable function $f$
\[
\lim_{p \to \infty} \left(\frac{1}{\mu(A)} \int_A |f|^p \, d\mu\right)^{1/p} = \operatorname*{ess\,sup}_A |f|,
\]
\[
\lim_{p \to \infty} \left(\frac{1}{\mu(A)} \int_A |f|^{-p} \, d\mu\right)^{-1/p} = \operatorname*{ess\,inf}_A |f|
\]
and
\[
\lim_{p \to 0} \left(\frac{1}{\mu(A)} \int_A |f|^p \, d\mu\right)^{1/p} = \exp\left(\frac{1}{\mu(A)} \int_A \log |f| \, d\mu\right).
\]
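The three limits in Remarks 1.42 (3) and the monotonicity in (2) can be checked on a small example. This is a sketch only, assuming the normalized counting measure on a finite set of nonzero values, so that the integral averages become power means; `power_mean` and `geometric_mean` are hypothetical helper names.

```python
import math


def power_mean(values, p):
    """((1/n) * sum |v|^p)^(1/p) for p != 0, an integral average
    with respect to the normalized counting measure."""
    n = len(values)
    return (sum(abs(v) ** p for v in values) / n) ** (1.0 / p)


def geometric_mean(values):
    """exp((1/n) * sum log |v|), the limit of the power means as p -> 0."""
    return math.exp(sum(math.log(abs(v)) for v in values) / len(values))


values = [1.0, 2.0, 4.0]

print(power_mean(values, 200))    # close to max |v|
print(power_mean(values, -200))   # close to min |v|
print(power_mean(values, 1e-6), geometric_mean(values))
```

For these values the maximum is 4, the minimum is 1 and the geometric mean is 2, and the printed power means are close to these numbers for large, very negative and very small $p$ respectively.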

Theorem 1.43. L∞(A) is a Banach space.

THE MORAL: The claim and the proof are the same as in showing that the space of continuous functions with the supremum norm is complete. The only difference is that we have to neglect sets of measure zero.

Proof. Let $(f_i)$ be a Cauchy sequence in $L^\infty(A)$. By Lemma 1.37, we have
\[
|f_i(x) - f_j(x)| \le \|f_i - f_j\|_\infty \quad \text{for $\mu$-almost every } x \in A.
\]
Thus there exists $N_{i,j} \subset A$, $\mu(N_{i,j}) = 0$, such that
\[
|f_i(x) - f_j(x)| \le \|f_i - f_j\|_\infty \quad \text{for every } x \in A \setminus N_{i,j}.
\]
Since $(f_i)$ is a Cauchy sequence in $L^\infty(A)$, for every $k = 1, 2, \dots$, there exists $i_k$ such that
\[
\|f_i - f_j\|_\infty < \frac{1}{k} \quad \text{when } i, j \ge i_k.
\]
This implies
\[
|f_i(x) - f_j(x)| < \frac{1}{k} \quad \text{for every } x \in A \setminus N_{i,j},\ i, j \ge i_k.
\]
Let $N = \bigcup_{i=1}^\infty \bigcup_{j=1}^\infty N_{i,j}$. Then
\[
\mu(N) \le \sum_{i=1}^\infty \sum_{j=1}^\infty \mu(N_{i,j}) = 0
\]
and
\[
|f_i(x) - f_j(x)| < \frac{1}{k} \quad \text{for every } x \in A \setminus N,\ i, j \ge i_k.
\]
Thus $(f_i(x))$ is a Cauchy sequence for every $x \in A \setminus N$. Since $\mathbb{R}$ is complete, there exists
\[
\lim_{i \to \infty} f_i(x) = f(x) \quad \text{for every } x \in A \setminus N.
\]
We set $f(x) = 0$, when $x \in N$. Then $f$ is measurable as a pointwise limit of measurable functions. Letting $j \to \infty$ in the preceding inequality gives
\[
|f_i(x) - f(x)| \le \frac{1}{k} \quad \text{for every } x \in A \setminus N,\ i \ge i_k,
\]
which implies
\[
\|f_i - f\|_\infty \le \frac{1}{k} \quad \text{when } i \ge i_k.
\]
Since $\|f\|_\infty \le \|f_i\|_\infty + \|f_i - f\|_\infty < \infty$, we have $f \in L^\infty(A)$ and $f_i \to f$ in $L^\infty(A)$ as $i \to \infty$. □

Remark 1.44. The proof shows that $f_i \to f$ in $L^\infty(A)$ as $i \to \infty$ implies that $f_i \to f$ uniformly in $A \setminus N$ with $\mu(N) = 0$. This means that $L^\infty$ convergence is uniform convergence outside a set of measure zero. Uniform convergence outside a set of measure zero immediately implies pointwise convergence almost everywhere; compare to Corollary 1.28 for $L^p$ with $1 \le p < \infty$.

Example 1.45. Assume that $\mu$ is the Lebesgue measure. Let $f_i : \mathbb{R} \to \mathbb{R}$,
\[
f_i(x) =
\begin{cases}
0, & x \in (-\infty, 0), \\
ix, & x \in \left[0, \frac{1}{i}\right], \\
1, & x \in \left(\frac{1}{i}, \infty\right),
\end{cases}
\]
for $i = 1, 2, \dots$ and let $f = \chi_{(0, \infty)}$. Then $f_i(x) \to f(x)$ for every $x \in \mathbb{R}$ as $i \to \infty$, $\|f_i\|_\infty = 1$ for every $i = 1, 2, \dots$, $\|f\|_\infty = 1$ so that $\lim_{i \to \infty} \|f_i\|_\infty = \|f\|_\infty$, but $\|f_i - f\|_\infty = 1$ for every $i = 1, 2, \dots$. Thus $\lim_{i \to \infty} \|f_i - f\|_\infty = 1 \ne 0$. This shows that the claim of Theorem 1.32 does not hold when $p = \infty$.

The Hardy-Littlewood maximal function is a very useful tool in analysis. The maximal function theorem asserts that the maximal operator is bounded from $L^p$ to $L^p$ for $p > 1$ and for $p = 1$ there is a weak type estimate. The weak type estimate is used to prove the Lebesgue differentiation theorem, which gives a pointwise meaning for a locally integrable function. The Lebesgue differentiation theorem is a higher dimensional version of the fundamental theorem of calculus. It is applied to the study of the density points of a measurable set. As an application we prove a Sobolev embedding theorem.

2 The Hardy-Littlewood maximal function

In this section we restrict our attention to the Lebesgue measure on $\mathbb{R}^n$. We prove Lebesgue's theorem on differentiation of integrals, which is an extension of the one-dimensional fundamental theorem of calculus to the $n$-dimensional case. This theorem states that, for a (locally) integrable function $f : \mathbb{R}^n \to [-\infty, \infty]$, we have
\[
\lim_{r \to 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} f(y) \, dy = f(x)
\]
for almost every $x \in \mathbb{R}^n$. Recall that $B(x, r) = \{y \in \mathbb{R}^n : |y - x| < r\}$ is the open ball with the center $x$ and radius $r > 0$. In proving this result we need to investigate very carefully the behaviour of the integral averages above. This leads to the Hardy-Littlewood maximal function, where we take the supremum of the integral averages instead of the limit. The passage from the limiting expression to a corresponding maximal function is a situation that occurs often. Hardy and Littlewood wrote that they were led to study the one-dimensional version of the maximal function by the question of how a score in cricket can be maximized: “The problem is most easily grasped when stated in the language of cricket, or any other game in which the player compiles a series of scores of which average is recorded.” As we shall see, these concepts and methods have a universal significance in analysis.

2.1 Local Lp spaces

If we are interested in pointwise properties of functions, it is not necessary to require integrability conditions over the whole underlying domain. For example, in the Lebesgue differentiation theorem above, the limit is taken over integral averages over balls that shrink to the point $x$, so that the behaviour of $f$ far away from $x$ is irrelevant.

Definition 2.1. Let $\Omega \subset \mathbb{R}^n$ be an open set and assume that $f : \Omega \to [-\infty, \infty]$ is a measurable function. Then $f \in L^p_{\mathrm{loc}}(\Omega)$, if
\[
\int_K |f|^p \, dx < \infty, \quad 1 \le p < \infty,
\qquad\text{and}\qquad
\operatorname*{ess\,sup}_K |f| < \infty, \quad p = \infty,
\]
for every compact set $K \subset \Omega$.

Examples 2.2:
$L^p(\Omega) \subset L^p_{\mathrm{loc}}(\Omega)$, but the reverse inclusion is not true.
(1) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = 1$. Then $f \notin L^p(\mathbb{R}^n)$ for any $1 \le p < \infty$, but $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$ for every $1 \le p < \infty$.
(2) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = |x|^{-1/2}$. Then $f \notin L^1(\mathbb{R}^n)$, but $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.
(3) Let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = e^{|x|}$. Then $f \notin L^1(\mathbb{R}^n)$, but $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.
(4) Let $f : B(0,1) \setminus \{0\} \to \mathbb{R}$, $f(x) = |x|^{-n/p}$. Then $f \notin L^p(B(0,1) \setminus \{0\})$ for $1 \le p < n$, but $f \in L^p_{\mathrm{loc}}(B(0,1) \setminus \{0\})$ for $1 < p < \infty$. Moreover, $f \notin L^\infty(B(0,1) \setminus \{0\})$, but $f \in L^\infty_{\mathrm{loc}}(B(0,1) \setminus \{0\})$.
(5) For $p = \infty$, let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = |x|$. Then $f \notin L^\infty(\mathbb{R}^n)$, but $f \in L^\infty_{\mathrm{loc}}(\mathbb{R}^n)$.

Remarks 2.3:
(1) If $1 \le p \le q \le \infty$, then $L^\infty_{\mathrm{loc}}(\Omega) \subset L^q_{\mathrm{loc}}(\Omega) \subset L^p_{\mathrm{loc}}(\Omega) \subset L^1_{\mathrm{loc}}(\Omega)$.

Reason. By Jensen's inequality
\[
\frac{1}{|K|} \int_K |f| \, dx \le \left(\frac{1}{|K|} \int_K |f|^p \, dx\right)^{1/p} \le \left(\frac{1}{|K|} \int_K |f|^q \, dx\right)^{1/q} \le \operatorname*{ess\,sup}_K |f|,
\]
where $K$ is a compact subset of $\Omega$ with $|K| > 0$. ■

(2) $C(\Omega) \subset L^p_{\mathrm{loc}}(\Omega)$ for every $1 \le p \le \infty$.

Reason. Since $|f| \in C(\Omega)$ assumes its maximum in the compact set $K$ and $K$ has a finite Lebesgue measure, we have
\[
\int_K |f|^p \, dx \le |K| \bigl(\operatorname*{ess\,sup}_K |f|\bigr)^p \le |K| \bigl(\max_K |f|\bigr)^p < \infty. \qquad ■
\]

(3) $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$ $\iff$ $f \in L^p(B(0, r))$ for every $0 < r < \infty$ $\iff$ $f \in L^p(A)$ for every bounded measurable set $A \subset \mathbb{R}^n$.
(4) In general, the quantity
\[
\sup_{K \subset \mathbb{R}^n} \left(\int_K |f|^p \, dx\right)^{1/p}
\]
is not a norm in $L^p_{\mathrm{loc}}(\mathbb{R}^n)$, since it may be infinity for some $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$. Consider, for example, constant functions on $\mathbb{R}^n$.

2.2 Definition of the maximal function

We begin with the definition of the maximal function.

Definition 2.4. The centered Hardy-Littlewood maximal function $Mf : \mathbb{R}^n \to [0, \infty]$ of $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ is defined by
\[
Mf(x) = \sup_{r > 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy,
\]
where $B(x, r) = \{y \in \mathbb{R}^n : |y - x| < r\}$ is the open ball with the radius $r > 0$ and the center $x \in \mathbb{R}^n$.

THE MORAL: $Mf(x)$ gives the maximal integral average of the absolute value of the function on balls centered at $x$.

Remarks 2.5:
(1) It is enough to assume that $f : \mathbb{R}^n \to [-\infty, \infty]$ is a measurable function in the definition of the Hardy-Littlewood maximal function. The assumption $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ guarantees that the integral averages are finite.
(2) $Mf$ is defined at every point $x \in \mathbb{R}^n$. If $f = g$ almost everywhere in $\mathbb{R}^n$, then $Mf(x) = Mg(x)$ for every $x \in \mathbb{R}^n$.
(3) It may happen that $Mf(x) = \infty$ for every $x \in \mathbb{R}^n$. For example, let $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = |x|$. Then $Mf(x) = \infty$ for every $x \in \mathbb{R}^n$.
(4) There are several seemingly different definitions, which are comparable. Let
\[
\widetilde{M}f(x) = \sup_{B \ni x} \frac{1}{|B|} \int_B |f(y)| \, dy
\]
be the noncentered maximal function, where the supremum is taken over all open balls $B$ containing the point $x \in \mathbb{R}^n$. Then
\[
Mf(x) \le \widetilde{M}f(x) \quad \text{for every } x \in \mathbb{R}^n.
\]
On the other hand, if $B = B(z, r) \ni x$, then $B(z, r) \subset B(x, 2r)$ and
\begin{align*}
\frac{1}{|B|} \int_B |f(y)| \, dy
&\le \frac{|B(x, 2r)|}{|B(z, r)|} \frac{1}{|B(x, 2r)|} \int_{B(x, 2r)} |f(y)| \, dy \\
&= 2^n \frac{1}{|B(x, 2r)|} \int_{B(x, 2r)} |f(y)| \, dy \\
&\le 2^n Mf(x).
\end{align*}
This implies that $\widetilde{M}f(x) \le 2^n Mf(x)$ and thus
\[
Mf(x) \le \widetilde{M}f(x) \le 2^n Mf(x) \quad \text{for every } x \in \mathbb{R}^n.
\]
(5) It is possible to use cubes in the definition of the maximal function and this will give a comparable notion as well.

Examples 2.6:

(1) Let $f : \mathbb{R} \to \mathbb{R}$, $f = \chi_{[a,b]}$. Then $Mf(x) = 1$, if $x \in (a, b)$. For $x \ge b$ a calculation shows that the maximal average is obtained when $r = x - a$. Similarly, when $x \le a$, the maximal average is obtained when $r = b - x$. Thus
\[
Mf(x) =
\begin{cases}
\dfrac{b - a}{2|x - b|}, & x \le a, \\
1, & x \in (a, b), \\
\dfrac{b - a}{2|x - a|}, & x \ge b.
\end{cases}
\]
Note that the centered maximal function $Mf$ has jump discontinuities at $x = a$ and $x = b$.

THE MORAL: $f \in L^1(\mathbb{R})$ does not imply $Mf \in L^1(\mathbb{R})$.

(2) Consider the noncentered maximal function $\widetilde{M}f$ of $f : \mathbb{R} \to \mathbb{R}$, $f = \chi_{[a,b]}$. Again $\widetilde{M}f(x) = 1$, if $x \in (a, b)$. For $x > b$ a calculation shows that the maximal average over all intervals $(z - r, z + r)$ is obtained when $z = (x + a)/2$ and $r = (x - a)/2$. Similarly, when $x < a$, the maximal average is obtained when $z = (b + x)/2$ and $r = (b - x)/2$. Thus
\[
\widetilde{M}f(x) =
\begin{cases}
\dfrac{b - a}{|x - b|}, & x \le a, \\
1, & x \in (a, b), \\
\dfrac{b - a}{|x - a|}, & x \ge b.
\end{cases}
\]
Note that the noncentered maximal function $\widetilde{M}f$ does not have discontinuities at $x = a$ and $x = b$.

Lemma 2.7. If $f \in C(\mathbb{R}^n)$, then $|f(x)| \le Mf(x)$ for every $x \in \mathbb{R}^n$.

THE MORAL: This justifies the terminology, since the maximal function is pointwise greater than or equal to the absolute value of the original function.

Proof. Assume that $f \in C(\mathbb{R}^n)$ and let $x \in \mathbb{R}^n$. Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that $|f(x) - f(y)| < \varepsilon$ if $|x - y| < \delta$. This implies
\begin{align*}
\left| \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy - |f(x)| \right|
&= \left| \frac{1}{|B(x, r)|} \int_{B(x, r)} (|f(y)| - |f(x)|) \, dy \right| \\
&\le \frac{1}{|B(x, r)|} \int_{B(x, r)} \bigl| |f(y)| - |f(x)| \bigr| \, dy \\
&\le \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y) - f(x)| \, dy \le \varepsilon, \quad \text{if } r \le \delta.
\end{align*}
Thus
\[
|f(x)| = \lim_{r \to 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy \le Mf(x) \quad \text{for every } x \in \mathbb{R}^n. \qquad \Box
\]

The next thing we would like to show is that $Mf : \mathbb{R}^n \to [0, \infty]$ is a measurable function. Recall that a function $f : \mathbb{R}^n \to [-\infty, \infty]$ is lower semicontinuous, if the distribution set $\{x \in \mathbb{R}^n : f(x) > \lambda\}$ is open for every $\lambda \in \mathbb{R}$. Since open sets are Lebesgue measurable, it follows that every lower semicontinuous function is Lebesgue measurable.

Lemma 2.8. Assume that $f \in L^p_{\mathrm{loc}}(\mathbb{R}^n)$. Then $Mf$ is lower semicontinuous.

Proof. Let $A_\lambda = \{x \in \mathbb{R}^n : Mf(x) > \lambda\}$, $\lambda > 0$. For every $x \in A_\lambda$ there exists $r > 0$ such that
\[
\frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy > \lambda.
\]
By the properties of the integral
\[
\frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy = \lim_{\substack{r' \to r \\ r' > r}} \frac{1}{|B(x, r')|} \int_{B(x, r)} |f(y)| \, dy,
\]
which implies that there exists $r' > r$ such that
\[
\frac{1}{|B(x, r')|} \int_{B(x, r)} |f(y)| \, dy > \lambda.
\]
If $|x - x'| < r' - r$, then $B(x, r) \subset B(x', r')$, since $|y - x'| \le |y - x| + |x - x'| < r + (r' - r) = r'$ for every $y \in B(x, r)$. Thus
\begin{align*}
\lambda &< \frac{1}{|B(x, r')|} \int_{B(x, r)} |f(y)| \, dy \le \frac{1}{|B(x, r')|} \int_{B(x', r')} |f(y)| \, dy \\
&= \frac{1}{|B(x', r')|} \int_{B(x', r')} |f(y)| \, dy \le Mf(x'), \quad \text{if } |x - x'| < r' - r.
\end{align*}
This shows that $B(x, r' - r) \subset A_\lambda$ and thus $A_\lambda$ is an open set. □

Example 2.9. Let $R > 0$ and define $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = \chi_{B(0,R)}(x)$ for every $x \in \mathbb{R}^n$. Then $Mf(x) = 1$ for every $x \in B(0, R)$ and $Mf(x) < 1$ for every $x \in \partial B(0, R)$. The first claim holds since $B(0, R)$ is open, and the second follows from the fact that there exists a ball $B(y, \frac{r}{2}) \subset B(x, r) \setminus B(0, R)$ if $x \in \partial B(0, R)$ and $r > 0$. This shows that $Mf$ is not continuous at the points $x \in \partial B(0, R)$.
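The closed formula of Examples 2.6 (1) and the jump on the boundary in Example 2.9 can be checked numerically in dimension one. The sketch below is only an illustration under stated assumptions: the supremum over $r > 0$ is replaced by a maximum over a finite grid of radii, and the function names and the parameters `r_max` and `steps` are hypothetical choices.

```python
def average_indicator(a, b, x, r):
    """Average of the indicator of [a, b] over the interval (x - r, x + r)."""
    overlap = max(0.0, min(b, x + r) - max(a, x - r))
    return overlap / (2.0 * r)


def centered_maximal(a, b, x, r_max=50.0, steps=20000):
    """Approximate M f(x) for f = indicator of [a, b] by searching
    the radii r = r_max * k / steps, k = 1, ..., steps."""
    return max(average_indicator(a, b, x, r_max * k / steps)
               for k in range(1, steps + 1))


def closed_form(a, b, x):
    """The formula for M f(x) stated in Examples 2.6 (1)."""
    if a < x < b:
        return 1.0
    if x <= a:
        return (b - a) / (2.0 * abs(x - b))
    return (b - a) / (2.0 * abs(x - a))


for x in (-2.0, -0.5, 0.0, 0.3, 1.0, 3.0):
    print(x, centered_maximal(0.0, 1.0, x), closed_form(0.0, 1.0, x))
```

With $a = 0$ and $b = 1$ the grid search reproduces the formula at the sampled points, including the value $1/2$ at the endpoints $x = 0$ and $x = 1$, where $Mf$ jumps down from the value $1$ taken inside the interval.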

2.3 Hardy-Littlewood-Wiener maximal function theorems

Another point of view is to consider the Hardy-Littlewood maximal operator $f \mapsto Mf$. We shall list some properties of this operator below.

Lemma 2.10. Assume that $f, g \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.
(1) (Positivity) $Mf(x) \ge 0$ for every $x \in \mathbb{R}^n$.
(2) (Sublinearity) $M(f + g)(x) \le Mf(x) + Mg(x)$.
(3) (Homogeneity) $M(af)(x) = |a| Mf(x)$, $a \in \mathbb{R}$.
(4) (Translation invariance) $M(\tau_y f)(x) = (\tau_y Mf)(x)$, $y \in \mathbb{R}^n$, where $\tau_y f(x) = f(x + y)$.
(5) (Scaling invariance) $M(\delta_a f)(x) = (\delta_a Mf)(x)$, where $\delta_a f(x) = f(ax)$ with $a > 0$.

Proof. Exercise. □

Example 2.11. Let $0 < \alpha < n$ and define $f : \mathbb{R}^n \setminus \{0\} \to \mathbb{R}$, $f(x) = |x|^{-\alpha}$. Let $x \in \mathbb{R}^n \setminus \{0\}$ and write $z = \frac{x}{|x|}$. By Lemma 2.10 (5) and (3), we have
\[
Mf(x) = Mf(|x| z) = (\delta_{|x|} Mf)(z) = M(\delta_{|x|} f)(z) = |x|^{-\alpha} Mf(z).
\]
Thus $Mf(x) = Mf(z)|x|^{-\alpha}$ for every $x \in \mathbb{R}^n \setminus \{0\}$, where $z \in \partial B(0, 1)$. Since $f$ is radial, the value $Mf(z)$ is independent of $z \in \partial B(0, 1)$. Moreover,
\begin{align*}
Mf(z) &= \sup_{r > 0} \frac{1}{|B(z, r)|} \int_{B(z, r)} |f(y)| \, dy \\
&\le \sup_{0 < r < \frac12} \frac{1}{|B(z, r)|} \int_{B(z, r)} |f(y)| \, dy + \sup_{r \ge \frac12} \frac{1}{|B(z, r)|} \int_{B(z, r)} |f(y)| \, dy \\
&\le \left(\frac12\right)^{-\alpha} + C(n) \sup_{r \ge \frac12} \frac{1}{|B(0, r + 1)|} \int_{B(0, r + 1)} |y|^{-\alpha} \, dy \\
&\le 2^\alpha + C(n) \sup_{r \ge \frac12} \frac{1}{(r + 1)^n} \int_0^{r + 1} \rho^{-\alpha} \rho^{n - 1} \, d\rho \\
&\le 2^\alpha + C(n, \alpha) \sup_{r \ge \frac12} (r + 1)^{-\alpha} < \infty.
\end{align*}
This shows that $Mf$ is a constant multiple of $f$.

We are interested in the behaviour of the maximal operator in $L^p$ spaces. The following results were first proved by Hardy and Littlewood in the one-dimensional case and extended later by Wiener to the higher dimensional case.

Lemma 2.12. If $f \in L^\infty(\mathbb{R}^n)$, then $Mf \in L^\infty(\mathbb{R}^n)$ and $\|Mf\|_\infty \le \|f\|_\infty$.

THE MORAL: The maximal function is essentially bounded, and thus finite almost everywhere, if the original function is essentially bounded. Intuitively this is clear, since the integral averages cannot be larger than the essential supremum of the function.

Proof. For every $x \in \mathbb{R}^n$ and $r > 0$ we have
\[
\frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy \le \frac{1}{|B(x, r)|} \|f\|_\infty |B(x, r)| = \|f\|_\infty.
\]
Thus $Mf(x) \le \|f\|_\infty$ for every $x \in \mathbb{R}^n$ and $\|Mf\|_\infty \le \|f\|_\infty$. □

Another way to state the previous lemma is that $M : L^\infty(\mathbb{R}^n) \to L^\infty(\mathbb{R}^n)$ is a bounded operator. As we have seen before, $f \in L^1(\mathbb{R})$ does not imply that $Mf \in L^1(\mathbb{R})$ and thus the Hardy-Littlewood maximal operator is not bounded in $L^1(\mathbb{R}^n)$. We give another example of this phenomenon.

Example 2.13. Let $r > 0$. Then there are constants $c_1 = c_1(n)$ and $c_2 = c_2(n)$ such that
\[
\frac{c_1 r^n}{(|x| + r)^n} \le M(\chi_{B(0,r)})(x) \le \frac{c_2 r^n}{(|x| + r)^n}
\]
for every $x \in \mathbb{R}^n$ (exercise). Since these functions do not belong to $L^1(\mathbb{R}^n)$, we see that the Hardy-Littlewood maximal operator does not map $L^1(\mathbb{R}^n)$ to $L^1(\mathbb{R}^n)$.

Next we show an even stronger result: $Mf \notin L^1(\mathbb{R}^n)$ for every nontrivial $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$.

Remark 2.14. $Mf \in L^1(\mathbb{R}^n)$ implies $f = 0$.

Reason. Let $r > 0$ and let $x \in \mathbb{R}^n$ be such that $|x| \ge r$. Then
\begin{align*}
Mf(x) &\ge \frac{1}{|B(x, 2|x|)|} \int_{B(x, 2|x|)} |f(y)| \, dy \\
&\ge \frac{1}{|B(0, 2|x|)|} \int_{B(0, r)} |f(y)| \, dy \\
&= \frac{c}{|x|^n} \int_{B(0, r)} |f(y)| \, dy.
\end{align*}
In the second inequality we used $B(0, r) \subset B(x, 2|x|)$, which holds since $|y| < r$ implies $|y - x| \le |y| + |x| < r + |x| \le 2|x|$. If $f \ne 0$, we choose $r > 0$ large enough that
\[
\int_{B(0, r)} |f(y)| \, dy > 0.
\]
Then $Mf(x) \ge c/|x|^n$ for every $x \in \mathbb{R}^n \setminus B(0, r)$. Since $c/|x|^n \notin L^1(\mathbb{R}^n \setminus B(0, r))$, we conclude that $Mf \notin L^1(\mathbb{R}^n)$. This is a contradiction and thus $f = 0$ almost everywhere. ■

The remark above shows that the maximal function is essentially never in $L^1$, but the essential issue for this is what happens far away from the origin. The next example shows that the maximal function does not need to be even locally in $L^1$.

Example 2.15. Let $f : \mathbb{R} \to \mathbb{R}$,
\[
f(x) = \frac{\chi_{(0,1/2)}(x)}{x (\log x)^2}.
\]
Then $f \in L^1(\mathbb{R})$, since
\[
\int_{\mathbb{R}} |f(x)| \, dx = \int_0^{1/2} \frac{1}{x (\log x)^2} \, dx = \Bigl[-\frac{1}{\log x}\Bigr]_0^{1/2} < \infty.
\]
For $0 < x < 1/2$, we have
\begin{align*}
Mf(x) &\ge \frac{1}{2x} \int_0^{2x} f(y) \, dy \ge \frac{1}{2x} \int_0^x f(y) \, dy \\
&= \frac{1}{2x} \int_0^x \frac{1}{y (\log y)^2} \, dy = \frac{1}{2x} \Bigl[-\frac{1}{\log y}\Bigr]_0^x \\
&= -\frac{1}{2x \log x} \notin L^1((0, 1/2)).
\end{align*}
Thus $Mf \notin L^1_{\mathrm{loc}}(\mathbb{R})$.

After these considerations, the situation for $L^1$ boundedness looks rather hopeless. However, there is a substitute result, which says that if $f \in L^1$, then $Mf$ belongs to a weakened version of $L^1$.

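Definition 2.16. A measurable function $f : \mathbb{R}^n \to [-\infty, \infty]$ belongs to weak $L^1(\mathbb{R}^n)$, if there exists a constant $c$, $0 \le c < \infty$, such that
\[
|\{x \in \mathbb{R}^n : |f(x)| > \lambda\}| \le \frac{c}{\lambda} \quad \text{for every } \lambda > 0.
\]

Remarks 2.17:
(1) $L^1(\mathbb{R}^n) \subset$ weak $L^1(\mathbb{R}^n)$.

Reason. By Chebyshev's inequality
\begin{align*}
|\{x \in \mathbb{R}^n : |f(x)| > \lambda\}| &\le \frac{1}{\lambda} \int_{\{x \in \mathbb{R}^n : |f(x)| > \lambda\}} |f(y)| \, dy \\
&\le \frac{1}{\lambda} \|f\|_1 \quad \text{for every } \lambda > 0. \qquad ■
\end{align*}

(2) Weak $L^1(\mathbb{R}^n) \not\subset L^1(\mathbb{R}^n)$.

Reason. Let $f : \mathbb{R}^n \to [0, \infty]$, $f(x) = |x|^{-n}$. Then $f \notin L^1(\mathbb{R}^n)$, but
\[
|\{x \in \mathbb{R}^n : |f(x)| > \lambda\}| = |B(0, \lambda^{-\frac{1}{n}})| = \Omega_n (\lambda^{-\frac{1}{n}})^n = \Omega_n \lambda^{-1} \quad \text{for every } \lambda > 0.
\]
Here $\Omega_n = |B(0, 1)|$. Thus $f$ belongs to weak $L^1(\mathbb{R}^n)$. ■

The next goal is to show that the Hardy-Littlewood maximal operator maps $L^1$ to weak $L^1$. The proof is based on the extremely useful covering theorem.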

Theorem 2.18 (Vitali covering theorem). Let $\mathcal{F}$ be a collection of open balls $B$ such that
\[
\operatorname{diam}\Bigl(\bigcup_{B \in \mathcal{F}} B\Bigr) < \infty.
\]
Then there is a countable (or finite) subcollection of pairwise disjoint balls $B(x_i, r_i) \in \mathcal{F}$, $i = 1, 2, \dots$, such that
\[
\bigcup_{B \in \mathcal{F}} B \subset \bigcup_{i=1}^\infty B(x_i, 5r_i).
\]

THE MORAL: Let $A$ be a bounded subset of $\mathbb{R}^n$ and suppose that for every $x \in A$ there is a ball $B(x, r_x)$ with the radius $r_x > 0$ possibly depending on the point $x$. We would like to have a countable subcollection of pairwise disjoint balls $B(x_i, r_i)$, $i = 1, 2, \dots$, which covers the union of the original balls. In general, this is not possible, if we do not expand the balls. Thus
\begin{align*}
|A| &\le \Bigl|\bigcup_{x \in A} B(x, r_x)\Bigr| \le \Bigl|\bigcup_{i=1}^\infty B(x_i, 5r_i)\Bigr| \le \sum_{i=1}^\infty |B(x_i, 5r_i)| \\
&= 5^n \sum_{i=1}^\infty |B(x_i, r_i)| = 5^n \Bigl|\bigcup_{i=1}^\infty B(x_i, r_i)\Bigr| \le 5^n \Bigl|\bigcup_{x \in A} B(x, r_x)\Bigr|.
\end{align*}
Note that the measure of $A$ can be estimated by the measure of the union of the balls, and that the measures of $\bigcup_{x \in A} B(x, r_x)$, $\bigcup_{i=1}^\infty B(x_i, 5r_i)$ and $\bigcup_{i=1}^\infty B(x_i, r_i)$ are comparable.

THE STRATEGY OF THE PROOF: The greedy principle: The balls are selected inductively by taking the largest ball with the required properties that has not been chosen earlier.
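The greedy principle can be sketched for a finite family of one-dimensional balls, that is, open intervals $(x - r, x + r)$. This is an illustration under simplifying assumptions and not the construction used for general families: in a finite family a largest remaining ball always exists, so the sketch picks it directly instead of a ball with $r_i > \frac12 d_i$. The function names `greedy_vitali` and `covered` are hypothetical.

```python
def greedy_vitali(balls):
    """Select pairwise disjoint intervals from a finite list of
    (center, radius) pairs, always taking the largest admissible one."""
    selected = []
    for x, r in sorted(balls, key=lambda ball: ball[1], reverse=True):
        # Open intervals (x - r, x + r) and (xi - ri, xi + ri) are
        # disjoint exactly when |x - xi| >= r + ri.
        if all(abs(x - xi) >= r + ri for xi, ri in selected):
            selected.append((x, r))
    return selected


def covered(ball, selected, factor=5.0):
    """True if the ball lies inside some enlarged ball B(xi, factor * ri)."""
    x, r = ball
    return any(abs(x - xi) + r <= factor * ri for xi, ri in selected)


balls = [(0.0, 1.0), (0.5, 0.4), (1.8, 1.0), (3.0, 0.3), (3.2, 0.5), (10.0, 0.1)]
selected = greedy_vitali(balls)
print(selected)
print(all(covered(ball, selected) for ball in balls))
```

Every ball that is not selected meets an earlier selected ball of at least the same radius, which is why the enlarged selected balls cover the whole family.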

Proof. Assume that $B(x_1, r_1), \dots, B(x_{i-1}, r_{i-1}) \in \mathcal{F}$ have been selected. Define
\[
d_i = \sup\Bigl\{r : B(x, r) \in \mathcal{F} \text{ and } B(x, r) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset\Bigr\}.
\]
Observe that $d_i < \infty$, since $\sup_{B(x,r) \in \mathcal{F}} r < \infty$. If there are no balls $B(x, r) \in \mathcal{F}$ such that
\[
B(x, r) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset,
\]
the process terminates and we have selected the balls $B(x_1, r_1), \dots, B(x_{i-1}, r_{i-1})$. Otherwise, we choose $B(x_i, r_i) \in \mathcal{F}$ such that
\[
r_i > \frac12 d_i \quad\text{and}\quad B(x_i, r_i) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset.
\]
We can also choose the first ball $B(x_1, r_1)$ in this way.

The selected balls are pairwise disjoint. Let $B \in \mathcal{F}$ be an arbitrary ball in the collection $\mathcal{F}$. Then $B = B(x, r)$ intersects at least one of the selected balls $B(x_1, r_1), B(x_2, r_2), \dots$, since otherwise $B(x, r) \cap B(x_i, r_i) = \emptyset$ for every $i = 1, 2, \dots$ and, by the definition of $d_i$, we have $d_i \ge r$ for every $i = 1, 2, \dots$. This implies
\[
r_i > \frac12 d_i \ge \frac12 r > 0 \quad \text{for every } i = 1, 2, \dots,
\]
and by the fact that the balls are pairwise disjoint, we have
\[
\Bigl|\bigcup_{i=1}^\infty B(x_i, r_i)\Bigr| = \sum_{i=1}^\infty |B(x_i, r_i)| = \infty.
\]
This is impossible, since $\bigcup_{i=1}^\infty B(x_i, r_i)$ is bounded and thus $\bigl|\bigcup_{i=1}^\infty B(x_i, r_i)\bigr| < \infty$.

Since $B(x, r)$ intersects some ball $B(x_i, r_i)$, $i = 1, 2, \dots$, there is a smallest index $i$ such that $B(x, r) \cap B(x_i, r_i) \ne \emptyset$. This implies
\[
B(x, r) \cap \bigcup_{j=1}^{i-1} B(x_j, r_j) = \emptyset
\]
and by the selection process $r \le d_i < 2r_i$. Since $B(x, r) \cap B(x_i, r_i) \ne \emptyset$ and $r \le 2r_i$, we have $B(x, r) \subset B(x_i, 5r_i)$.

Reason. Let $z \in B(x, r) \cap B(x_i, r_i)$ and $y \in B(x, r)$. Then
\[
|y - x_i| \le |y - z| + |z - x_i| \le 2r + r_i \le 5r_i. \qquad ■
\]
This completes the proof. □

Remarks 2.19:
(1) The factor 5 in the theorem is not optimal. In fact, the same proof shows that this factor can be replaced with 3. A slight modification of the argument shows that $2 + \varepsilon$ for any $\varepsilon > 0$ will do. To obtain this, choose $B(x_i, r_i) \in \mathcal{F}$ such that
\[
r_i > \frac{1}{1 + \varepsilon} d_i
\]
in the proof above. This is the optimal result, since 2 does not work in general (exercise).
(2) A similar covering theorem holds true for cubes as well.
(3) Some kind of boundedness assumption is needed in the Vitali covering theorem.

Reason. Consider the balls $B(0, i)$, $i = 1, 2, \dots$. Since all balls intersect each other, the only subfamily of pairwise disjoint balls consists of one single ball $B(0, i)$ and the enlarged ball $B(0, 5i)$ does not cover $\bigcup_{i=1}^\infty B(0, i) = \mathbb{R}^n$. ■

Theorem 2.20 (Hardy-Littlewood I). Let $f \in L^1(\mathbb{R}^n)$. Then
\[
|\{x \in \mathbb{R}^n : Mf(x) > \lambda\}| \le \frac{5^n}{\lambda} \|f\|_1 \quad \text{for every } \lambda > 0.
\]

THE MORAL: The Hardy-Littlewood maximal operator maps $L^1$ to weak $L^1$. It is said that the Hardy-Littlewood maximal operator is of weak type $(1,1)$.

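Proof. Let $A_\lambda = \{x \in \mathbb{R}^n : Mf(x) > \lambda\}$, $\lambda > 0$. For every $x \in A_\lambda$ there exists $r_x > 0$ such that
\[
\frac{1}{|B(x, r_x)|} \int_{B(x, r_x)} |f(y)| \, dy > \lambda. \tag{2.1}
\]
We would like to apply the Vitali covering theorem, but the set $\bigcup_{x \in A_\lambda} B(x, r_x)$ is not necessarily bounded. To overcome this problem, we consider the sets $A_\lambda \cap B(0, k)$, $k = 1, 2, \dots$. Let $\mathcal{F}$ be the collection of balls $B(x, r_x)$ for which (2.1) holds and $x \in A_\lambda \cap B(0, k)$. If $B(x, r_x) \in \mathcal{F}$, then
\[
\Omega_n r_x^n = |B(x, r_x)| < \frac{1}{\lambda} \int_{B(x, r_x)} |f(y)| \, dy \le \frac{1}{\lambda} \|f\|_1,
\]
so that
\[
\operatorname{diam}\Bigl(\bigcup_{x \in A_\lambda \cap B(0,k)} B(x, r_x)\Bigr) < \infty.
\]
By the Vitali covering theorem, we obtain pairwise disjoint balls $B(x_i, r_i)$, $i = 1, 2, \dots$, such that
\[
A_\lambda \cap B(0, k) \subset \bigcup_{i=1}^\infty B(x_i, 5r_i).
\]
This implies
\begin{align*}
|A_\lambda \cap B(0, k)| &\le \Bigl|\bigcup_{i=1}^\infty B(x_i, 5r_i)\Bigr| \le \sum_{i=1}^\infty |B(x_i, 5r_i)| = 5^n \sum_{i=1}^\infty |B(x_i, r_i)| \\
&\le \frac{5^n}{\lambda} \sum_{i=1}^\infty \int_{B(x_i, r_i)} |f(y)| \, dy = \frac{5^n}{\lambda} \int_{\bigcup_{i=1}^\infty B(x_i, r_i)} |f(y)| \, dy \le \frac{5^n}{\lambda} \|f\|_1.
\end{align*}
Finally,
\[
|A_\lambda| = \lim_{k \to \infty} |A_\lambda \cap B(0, k)| \le \frac{5^n}{\lambda} \|f\|_1. \qquad \Box
\]

Remark 2.21. $f \in L^1(\mathbb{R}^n)$ implies $Mf < \infty$ almost everywhere in $\mathbb{R}^n$.

Reason.
\[
|\{x \in \mathbb{R}^n : Mf(x) = \infty\}| \le |\{x \in \mathbb{R}^n : Mf(x) > \lambda\}| \le \frac{5^n}{\lambda} \|f\|_1 \to 0 \quad \text{as } \lambda \to \infty. \qquad ■
\]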

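The next goal is to show that the Hardy-Littlewood maximal operator maps $L^p$ to $L^p$ if $p > 1$. We recall the following version of Cavalieri's principle.

Lemma 2.22. Assume that $\mu$ is an outer measure, $A \subset \mathbb{R}^n$ is a $\mu$-measurable set and $f : A \to [-\infty, \infty]$ is a $\mu$-measurable function. Then
\[
\int_A |f|^p \, d\mu = p \int_0^\infty \lambda^{p-1} \mu(\{x \in A : |f(x)| > \lambda\}) \, d\lambda, \quad 0 < p < \infty.
\]

Proof.
\begin{align*}
\int_A |f|^p \, d\mu &= \int_{\mathbb{R}^n} \chi_A(x) \, p \int_0^{|f(x)|} \lambda^{p-1} \, d\lambda \, d\mu(x) \\
&= p \int_{\mathbb{R}^n} \int_0^\infty \chi_A(x) \chi_{[0, |f(x)|)}(\lambda) \lambda^{p-1} \, d\lambda \, d\mu(x) \\
&= p \int_0^\infty \int_{\mathbb{R}^n} \chi_A(x) \chi_{[0, |f(x)|)}(\lambda) \lambda^{p-1} \, d\mu(x) \, d\lambda \quad \text{(Fubini's theorem)} \\
&= p \int_0^\infty \lambda^{p-1} \int_{\mathbb{R}^n} \chi_A(x) \chi_{\{x \in \mathbb{R}^n : |f(x)| > \lambda\}}(x) \, d\mu(x) \, d\lambda \\
&= p \int_0^\infty \lambda^{p-1} \mu(\{x \in A : |f(x)| > \lambda\}) \, d\lambda. \qquad \Box
\end{align*}

Remark 2.23. More generally, if $\varphi : [0, \infty) \to [0, \infty)$ is a nondecreasing continuously differentiable function with $\varphi(0) = 0$, then
\[
\int_A \varphi \circ |f| \, d\mu = \int_0^\infty \varphi'(\lambda) \mu(\{x \in A : |f(x)| > \lambda\}) \, d\lambda.
\]
(Exercise)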
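Cavalieri's principle in Lemma 2.22 can be verified exactly on a finite weighted set, where the distribution function $\lambda \mapsto \mu(\{|f| > \lambda\})$ is a step function. The sketch below assumes such a discrete measure; `lp_integral` and `layer_cake` are hypothetical helper names.

```python
def lp_integral(values, weights, p):
    """Left-hand side: the integral of |f|^p with respect to the weights."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights))


def layer_cake(values, weights, p):
    """Right-hand side: p * int_0^inf lam^(p-1) mu({|f| > lam}) dlam.

    Between two consecutive levels of |f| the distribution function is
    constant, and the integral of p * lam^(p-1) over [s, t] is t^p - s^p.
    """
    levels = sorted(set(abs(v) for v in values))
    total, previous = 0.0, 0.0
    for level in levels:
        # For previous <= lam < level, {|f| > lam} is the set {|f| >= level}.
        mass = sum(w for v, w in zip(values, weights) if abs(v) >= level)
        total += mass * (level ** p - previous ** p)
        previous = level
    return total


values = [1.0, -3.0, 2.0, 0.0]
weights = [0.5, 0.25, 2.0, 1.0]

for p in (0.5, 1.0, 2.5):
    print(p, lp_integral(values, weights, p), layer_cake(values, weights, p))
```

The two columns agree up to rounding for each $p$, including the exponent $p = 0.5 < 1$, which is allowed in the lemma.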

Now we are ready for the Hardy-Littlewood maximal function theorem.

Theorem 2.24 (Hardy-Littlewood II). Let $f \in L^p(\mathbb{R}^n)$, $1 < p \le \infty$. Then $Mf \in L^p(\mathbb{R}^n)$ and there exists $c = c(n, p)$ such that $\|Mf\|_p \le c \|f\|_p$.

THE MORAL: The Hardy-Littlewood maximal operator maps $L^p$ to $L^p$ if $p > 1$. It is said that the Hardy-Littlewood maximal operator is of strong type $(p, p)$.

WARNING: The result is not true for $p = 1$. Then we only have the weak type estimate.

Proof. The case $p = \infty$ is Lemma 2.12, so we may assume that $1 < p < \infty$. Let $\lambda > 0$ and write $f = f_1 + f_2$, where $f_1 = f \chi_{\{|f| > \lambda/2\}}$, that is,
\[
f_1(x) =
\begin{cases}
f(x), & |f(x)| > \lambda/2, \\
0, & |f(x)| \le \lambda/2.
\end{cases}
\]
Then $|f_1(x)| > \lambda/2$ if $|f(x)| > \lambda/2$ and thus
\[
\int_{\mathbb{R}^n} |f_1(x)| \, dx = \int_{\{x \in \mathbb{R}^n : |f(x)| > \lambda/2\}} |f_1(x)|^p |f_1(x)|^{1-p} \, dx \le \left(\frac{\lambda}{2}\right)^{1-p} \|f\|_p^p < \infty.
\]
This shows that $f_1 \in L^1(\mathbb{R}^n)$. On the other hand, $|f_2(x)| \le \lambda/2$ for every $x \in \mathbb{R}^n$, which implies $\|f_2\|_\infty \le \lambda/2$ and $f_2 \in L^\infty(\mathbb{R}^n)$. Thus every $L^p$ function can be represented as a sum of an $L^1$ function and an $L^\infty$ function. By Lemma 2.12, we have
\[
\|Mf_2\|_\infty \le \|f_2\|_\infty \le \frac{\lambda}{2}.
\]
From this we conclude using Lemma 2.10 that
\[
Mf(x) = M(f_1 + f_2)(x) \le Mf_1(x) + Mf_2(x) \le Mf_1(x) + \frac{\lambda}{2}
\]
for almost every $x \in \mathbb{R}^n$ and thus $Mf(x) > \lambda$ implies $Mf_1(x) > \lambda/2$ for almost every $x \in \mathbb{R}^n$. It follows that
\begin{align*}
|\{x \in \mathbb{R}^n : Mf(x) > \lambda\}| &\le \Bigl|\Bigl\{x \in \mathbb{R}^n : Mf_1(x) > \frac{\lambda}{2}\Bigr\}\Bigr| \\
&\le \frac{2 \cdot 5^n}{\lambda} \|f_1\|_1 \quad \text{(Theorem 2.20)} \\
&= \frac{2 \cdot 5^n}{\lambda} \int_{\{x \in \mathbb{R}^n : |f(x)| > \lambda/2\}} |f(x)| \, dx \quad \text{for every } \lambda > 0.
\end{align*}
By Cavalieri's principle
\begin{align*}
\int_{\mathbb{R}^n} |Mf|^p \, dx &= p \int_0^\infty \lambda^{p-1} |\{x \in \mathbb{R}^n : Mf(x) > \lambda\}| \, d\lambda \\
&\le p \cdot 2 \cdot 5^n \int_0^\infty \lambda^{p-2} \int_{\{x \in \mathbb{R}^n : |f(x)| > \lambda/2\}} |f(x)| \, dx \, d\lambda \\
&= p \cdot 2 \cdot 5^n \int_{\mathbb{R}^n} |f(x)| \int_0^{2|f(x)|} \lambda^{p-2} \, d\lambda \, dx \quad \text{(as in the proof of Lemma 2.22)} \\
&= \frac{p \cdot 2 \cdot 5^n}{p - 1} \int_{\mathbb{R}^n} |f(x)| |2f(x)|^{p-1} \, dx \\
&= \frac{p \cdot 2^p \cdot 5^n}{p - 1} \int_{\mathbb{R}^n} |f(x)|^p \, dx.
\end{align*}
This completes the proof. □

Remarks 2.25:
(1) The proof above gives
\[
\|Mf\|_p \le 2 \left(\frac{p 5^n}{p - 1}\right)^{1/p} \|f\|_p, \quad 1 < p < \infty,
\]
for the operator norm of $M : L^p(\mathbb{R}^n) \to L^p(\mathbb{R}^n)$. Note that the constant blows up as $p \to 1$ and converges to 2 as $p \to \infty$.
It can be shown that the constant in the strong type $(p, p)$ inequality for $p > 1$ can be chosen to be independent of the dimension $n$, see E. M. Stein and J.-O. Strömberg, Behavior of maximal functions in $\mathbb{R}^n$ for large $n$, Ark. Mat. 21 (1983), 259–269. However, it is an open question whether the constant in the weak type $(1,1)$ estimate is independent of the dimension.
(2) As a byproduct of the proof we get the following useful result. Let $1 \le p < r < q \le \infty$. Then for every $f \in L^r(\mathbb{R}^n)$ there exist $g \in L^p(\mathbb{R}^n)$ and $h \in L^q(\mathbb{R}^n)$ such that $f = g + h$. Hint: $g = f \chi_{\{|f| > 1\}}$.
(3) The proof above is a special case of the Marcinkiewicz interpolation theorem, which applies to more general operators as well. In this case, we interpolate between the weak type $(1,1)$ estimate and the strong type $(\infty, \infty)$ estimate.
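The behaviour of the constant in Remarks 2.25 (1) at the endpoints can be seen by evaluating it directly. The following short sketch only evaluates the formula $2\,(p5^n/(p-1))^{1/p}$ obtained in the proof; the function name `strong_type_constant` is a hypothetical choice.

```python
def strong_type_constant(n, p):
    """The bound 2 * (p * 5^n / (p - 1))^(1/p) from the proof, for 1 < p < infinity."""
    return 2.0 * (p * 5.0 ** n / (p - 1.0)) ** (1.0 / p)


for p in (1.001, 1.1, 2.0, 10.0, 1.0e6):
    print(p, strong_type_constant(1, p), strong_type_constant(3, p))
```

The values are very large for $p$ close to 1 and close to 2 for very large $p$, in both dimensions sampled here. This concerns only the constant produced by this particular proof and says nothing about the best possible constant.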

2.4 The Lebesgue differentiation theorem

The Lebesgue differentiation theorem is a remarkable result, which shows that a quantitative weak type estimate for the maximal function implies almost everywhere convergence of integral averages using the fact that the convergence is clear for a dense class of continuous functions.

Theorem 2.26. Assume $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. Then
\[
\lim_{r \to 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y) - f(x)| \, dy = 0
\]
for almost every $x \in \mathbb{R}^n$.

Remark 2.27. In particular, it follows that
\[
\lim_{r \to 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} f(y) \, dy = f(x)
\]
for almost every $x \in \mathbb{R}^n$.

Reason.
\begin{align*}
\left| \frac{1}{|B(x, r)|} \int_{B(x, r)} f(y) \, dy - f(x) \right| &= \left| \frac{1}{|B(x, r)|} \int_{B(x, r)} (f(y) - f(x)) \, dy \right| \\
&\le \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y) - f(x)| \, dy \to 0
\end{align*}
for almost every $x \in \mathbb{R}^n$ as $r \to 0$. ■

Note that this implies
\begin{align*}
|f(x)| &= \lim_{r \to 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy \\
&\le \sup_{r > 0} \frac{1}{|B(x, r)|} \int_{B(x, r)} |f(y)| \, dy = Mf(x)
\end{align*}
for almost every $x \in \mathbb{R}^n$.

THE MORAL: A locally integrable function is a limit of the integral averages at almost every point. Observe that Lebesgue's differentiation theorem tells us that the limit of the integral averages exists and that it coincides with the function almost everywhere. This gives a passage from average information to pointwise information.

Proof. We may assume that $f \in L^1(\mathbb{R}^n)$, since the theorem is local. Indeed, we may consider the functions $f_i = f\chi_{B(0,i)}$, $i = 1,2,\dots$. Define an infinitesimal version of the Hardy-Littlewood maximal function as
\[
f^*(x) = \limsup_{r \to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\,dy.
\]
We shall show that $f^*(x) = 0$ for almost every $x \in \mathbb{R}^n$. The proof is divided into six steps.

(1) Clearly $f^* \ge 0$.

(2) $(f+g)^* \le f^* + g^*$.

Reason.
\[
\frac{1}{|B(x,r)|}\int_{B(x,r)} |(f+g)(y)-(f+g)(x)|\,dy
\le \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\,dy + \frac{1}{|B(x,r)|}\int_{B(x,r)} |g(y)-g(x)|\,dy.
\] ■

(3) If $g$ is continuous at $x$, then $g^*(x) = 0$.

Reason. For every $\varepsilon > 0$ there exists $\delta > 0$ such that $|g(y)-g(x)| < \varepsilon$ whenever $|x-y| < \delta$. This implies
\[
\frac{1}{|B(x,r)|}\int_{B(x,r)} |g(y)-g(x)|\,dy < \varepsilon, \quad \text{if } 0 < r \le \delta.
\] ■

(4) If $g \in C(\mathbb{R}^n)$, then $(f-g)^* = f^*$.

Reason. By (2) and (3), we have
\[
(f-g)^* \le f^* + (-g)^* = f^* \quad\text{and}\quad f^* \le (f-g)^* + g^* = (f-g)^*,
\]
so that the equality holds. ■

(5) $f^* \le Mf + |f|$.

Reason.
\[
\frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-f(x)|\,dy
\le \frac{1}{|B(x,r)|}\int_{B(x,r)} (|f(y)|+|f(x)|)\,dy
\le \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)|\,dy + |f(x)|
\le Mf(x) + |f(x)|.
\] ■

(6) If $f^*(x) > \lambda$, by (5) we have $Mf(x) + |f(x)| > \lambda$, from which we conclude that $Mf(x) > \lambda/2$ or $|f(x)| > \lambda/2$. Thus
\[
|\{x \in \mathbb{R}^n : f^*(x) > \lambda\}|
\le \left|\left\{x \in \mathbb{R}^n : Mf(x) > \tfrac{\lambda}{2}\right\}\right| + \left|\left\{x \in \mathbb{R}^n : |f(x)| > \tfrac{\lambda}{2}\right\}\right|
\le \frac{2\cdot 5^n}{\lambda}\|f\|_1 + \frac{2}{\lambda}\|f\|_1
= \frac{2(5^n+1)}{\lambda}\|f\|_1,
\]
where the second inequality follows from Theorem 2.20 and Chebyshev's inequality.

Finally, we are ready to prove the theorem. Recall from measure and integration theory that compactly supported continuous functions are dense in $L^1(\mathbb{R}^n)$, see Theorem 3.3. Thus for every $\varepsilon > 0$ there exists $g \in C_0(\mathbb{R}^n)$ such that $\|f-g\|_1 < \varepsilon$. Then
\[
|\{x \in \mathbb{R}^n : f^*(x) > \lambda\}| = |\{x \in \mathbb{R}^n : (f-g)^*(x) > \lambda\}| \quad\text{(Property (4))}
\]
\[
\le \frac{2(5^n+1)}{\lambda}\|f-g\|_1 \quad\text{(Property (6))}
\]
\[
< \frac{2(5^n+1)}{\lambda}\,\varepsilon.
\]

Letting $\varepsilon \to 0$, we conclude that $|\{x \in \mathbb{R}^n : f^*(x) > \lambda\}| = 0$ for every $\lambda > 0$. It follows that
\[
|\{x \in \mathbb{R}^n : f^*(x) > 0\}| = \left|\bigcup_{i=1}^{\infty}\left\{x \in \mathbb{R}^n : f^*(x) > \tfrac{1}{i}\right\}\right|
\le \sum_{i=1}^{\infty} \left|\left\{x \in \mathbb{R}^n : f^*(x) > \tfrac{1}{i}\right\}\right| = 0,
\]
since every term in the sum is zero. This shows that $f^*(x) \le 0$ for almost every $x \in \mathbb{R}^n$ and (1) implies $f^*(x) = 0$ for almost every $x \in \mathbb{R}^n$. □

Definition 2.28. A point $x \in \mathbb{R}^n$ is a Lebesgue point of $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, if there exists $a \in \mathbb{R}$ such that
\[
\lim_{r \to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} |f(y)-a|\,dy = 0.
\]

THE MORAL: The Lebesgue differentiation theorem asserts that almost every point is a Lebesgue point for a locally integrable function. Thus a locally integrable function can be defined pointwise almost everywhere.

Remarks 2.29:

(1) We would like to define the Lebesgue point so that $a$ is replaced with $f(x)$, but there is a problem with this definition since the equivalence class of $f$ is defined only up to a set of measure zero. If $f = g$ almost everywhere, the functions have the same Lebesgue points. Thus the notion of a Lebesgue point is independent of the representative in the equivalence class in $L^1_{\mathrm{loc}}(\mathbb{R}^n)$.

(2) If $x$ is a Lebesgue point of $f$, then
\[
\lim_{r \to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} f(y)\,dy = a.
\]
Thus we may uniquely define the pointwise value of $f$ by the above limit at a Lebesgue point.

(3) Whether $x$ is a Lebesgue point of $f$ is completely independent of the value $f(x)$. In fact, the function $f$ does not even need to be defined at $x$. By the Lebesgue differentiation theorem, almost every point $x \in \mathbb{R}^n$ is a Lebesgue point of $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$. Moreover, if $f$ is a specific function in the equivalence class in $L^1_{\mathrm{loc}}(\mathbb{R}^n)$, then for almost every $x$ the number $a$ is $f(x)$.

Example 2.30. Let $f : \mathbb{R} \to \mathbb{R}$ be the Heaviside function
\[
f(x) = \begin{cases} 1, & x > 0, \\ \tfrac{1}{2}, & x = 0, \\ 0, & x < 0. \end{cases}
\]
Then
\[
\lim_{r \to 0} \frac{1}{2r}\int_{x-r}^{x+r} f(y)\,dy = f(x) \quad\text{for every } x \in \mathbb{R},
\]
but 0 is not a Lebesgue point of $f$.

Reason.
\[
\frac{1}{2r}\int_{-r}^{r} |f(y)-a|\,dy = \frac{1}{2r}\int_{-r}^{0} |a|\,dy + \frac{1}{2r}\int_{0}^{r} |1-a|\,dy
= \frac{1}{2}|a| + \frac{1}{2}|1-a| \ne 0 \quad\text{for every } a \in \mathbb{R},\ r > 0.
\] ■
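Example 2.30 can be checked numerically. The following is a minimal sketch, not part of the original notes: the helper names `heaviside`, `average` and `mean_deviation` are illustrative, and the integrals are approximated by a midpoint rule. At the origin the symmetric averages equal $1/2$ for every radius, while the mean deviation from any candidate value $a$ stays at $\tfrac12|a| + \tfrac12|1-a| \ge \tfrac12$.

```python
def heaviside(y):
    """Heaviside function with the value 1/2 at the origin, as in Example 2.30."""
    if y > 0:
        return 1.0
    if y < 0:
        return 0.0
    return 0.5


def average(func, x, r, n=20000):
    """Midpoint-rule approximation of (1/2r) * integral of func over (x-r, x+r)."""
    h = 2.0 * r / n
    return sum(func(x - r + (k + 0.5) * h) for k in range(n)) * h / (2.0 * r)


def mean_deviation(a, r, n=20000):
    """Approximates (1/2r) * integral over (-r, r) of |f(y) - a| dy."""
    return average(lambda y: abs(heaviside(y) - a), 0.0, r, n)


if __name__ == "__main__":
    for r in (1.0, 0.1, 0.001):
        # The average converges to f(0) = 1/2, but the deviation does not tend to 0.
        print(r, average(heaviside, 0.0, r), mean_deviation(0.5, r))
```

The deviation does not depend on $r$, which is exactly why no choice of $a$ makes 0 a Lebesgue point.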

Next we remark that the use of balls is not crucial in the Lebesgue differentiation theorem. The theory of maximal functions can be done with cubes instead of balls, for example. As we shall see, the geometry of the sets does not play a role here.

Definition 2.31. A sequence of measurable sets $A_i$, $i = 1,2,\dots$, converges regularly to a point $x \in \mathbb{R}^n$, if there exist a constant $c > 0$ and a sequence of positive numbers $r_i$, $i = 1,2,\dots$, such that

(1) $A_i \subset B(x,r_i)$, $i = 1,2,\dots$,
(2) $\lim_{i \to \infty} r_i = 0$ and
(3) $|A_i| \le |B(x,r_i)| \le c|A_i|$, $i = 1,2,\dots$.

THE MORAL: The conditions (1) and (2) ensure that the sets $A_i$ converge to $x$. The condition (3) ensures that the convergence is not too fast with respect to the Lebesgue measure: the volume of each $A_i$ is at least a certain percentage of the volume of $B(x,r_i)$. Note that $x$ does not have to belong to the sets $A_i$.

Examples 2.32:

(1) Let
\[
Q(x,l) = \left\{y \in \mathbb{R}^n : |y_i - x_i| < \tfrac{l}{2},\ i = 1,\dots,n\right\}
\]
be an open cube with the center $x \in \mathbb{R}^n$ and the side length $l > 0$.

CLAIM: $Q\left(x, \tfrac{2}{\sqrt{n}}r\right) \subset B(x,r)$.

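Reason. Let $y \in Q\left(x, \tfrac{2}{\sqrt{n}}r\right)$. Then $|y_i - x_i| < \tfrac{r}{\sqrt{n}}$, $i = 1,\dots,n$, which implies
\[
|y-x| = \left(\sum_{i=1}^{n} |y_i - x_i|^2\right)^{1/2} < \left(n\cdot\left(\frac{r}{\sqrt{n}}\right)^2\right)^{1/2} = r.
\]
Thus $y \in B(x,r)$. ■

CLAIM: $|B(x,r)| = c\left|Q\left(x, \tfrac{2}{\sqrt{n}}r\right)\right|$.

Reason.
\[
|B(x,r)| = \frac{|B(x,r)|}{\left|Q\left(x,\tfrac{2}{\sqrt{n}}r\right)\right|}\left|Q\left(x,\tfrac{2}{\sqrt{n}}r\right)\right|
= \frac{\Omega_n r^n}{\left(\tfrac{2}{\sqrt{n}}\right)^n r^n}\left|Q\left(x,\tfrac{2}{\sqrt{n}}r\right)\right|
= c\left|Q\left(x,\tfrac{2}{\sqrt{n}}r\right)\right|, \quad c = \frac{\Omega_n n^{n/2}}{2^n},
\]
where $\Omega_n = |B(0,1)|$. ■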

Thus the cubes $Q\left(x, \tfrac{2}{\sqrt{n}}r_i\right)$ converge regularly to $x$ if $r_i \to 0$ as $i \to \infty$.

(2) Let $A \subset B(0,1)$ be an arbitrary measurable set with $|A| > 0$ and denote
\[
A_r(x) = x + rA = \{y \in \mathbb{R}^n : y = x + rz,\ z \in A\}.
\]
Then $A_r(x) \subset x + rB(0,1) = B(x,r)$ and
\[
|B(x,r)| = \frac{|B(x,r)|}{|A_r(x)|}|A_r(x)| = \frac{r^n|B(0,1)|}{r^n|A|}|A_r(x)| = c|A_r(x)|, \quad c = \frac{|B(0,1)|}{|A|}.
\]
Thus the sets $A_{r_i}(x)$ converge regularly to $x$ if $r_i \to 0$ as $i \to \infty$. This means that we can construct a sequence that converges regularly from an arbitrary set $A \subset B(0,1)$ with $|A| > 0$.

For example, if $A = B(0,1) \setminus B(0,1/2)$, then $A_r(x) = B(x,r) \setminus B(x,r/2)$ and $x \notin A_r(x)$ for any $r > 0$.

Theorem 2.33. Assume that $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ and let $x$ be a Lebesgue point of $f$. If the sequence $A_i$, $i = 1,2,\dots$, converges regularly to $x$, then
\[
\lim_{i \to \infty} \frac{1}{|A_i|}\int_{A_i} |f(y)-f(x)|\,dy = 0.
\]

THE MORAL: The Lebesgue differentiation theorem holds for any regularly converging sets.

Proof.
\[
\frac{1}{|A_i|}\int_{A_i} |f(y)-f(x)|\,dy \le \frac{c}{|B(x,r_i)|}\int_{B(x,r_i)} |f(y)-f(x)|\,dy \to 0 \quad\text{as } i \to \infty.
\] □

Remark 2.34. The converse of the previous theorem is valid. Assume that $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ and let $x \in \mathbb{R}^n$. If for every sequence $A_i$, $i = 1,2,\dots$, that converges regularly to $x$, there exists
\[
\lim_{i \to \infty} \frac{1}{|A_i|}\int_{A_i} f(y)\,dy,
\]
then $x$ is a Lebesgue point of $f$. (Exercise)

Hint: By interlacing two sequences, show that the limit is independent of the sequence. Then show that we may assume that the limit is zero. Then assume that $r_i \to 0$ and take
\[
A_i = B(x,r_i) \cap \{y \in \mathbb{R}^n : f(y) \ge 0\} \quad\text{or}\quad A_i = B(x,r_i) \cap \{y \in \mathbb{R}^n : f(y) < 0\}
\]
depending on which choice satisfies $|A_i| \ge |B(x,r_i)|/2$. Show that
\[
\lim_{i \to \infty} \frac{1}{|B(x,r_i)|}\int_{B(x,r_i)} f(y)\,dy = 0.
\]

2.5 The fundamental theorem of calculus

As an application of the Lebesgue differentiation theorem, we prove the following theorem of Lebesgue in the one-dimensional case.

Theorem 2.35. Assume that $f \in L^1([a,b])$ and define $F : [a,b] \to \mathbb{R}$,
\[
F(x) = \int_{[a,x]} f(y)\,dy.
\]
Then $F'(x)$ exists and $F'(x) = f(x)$ for almost every $x \in [a,b]$.

THE MORAL: This is a general version of the fundamental theorem of calculus, which is elementary in the case $f \in C([a,b])$.

Proof. Define $f(x) = 0$ for every $x \in \mathbb{R} \setminus [a,b]$. Let $r_i > 0$ with $\lim_{i \to \infty} r_i = 0$ and denote $A_i = (x, x+r_i)$, $i = 1,2,\dots$. Then the sets $A_i$ converge regularly to $x$. By Theorem 2.33
\[
\lim_{i \to \infty} \frac{F(x+r_i)-F(x)}{r_i} = \lim_{i \to \infty} \frac{1}{r_i}\int_{(x,x+r_i)} f(y)\,dy = f(x)
\]
for almost every $x \in \mathbb{R}$. Since the sequence is arbitrary, we conclude that $F'_+(x)$ exists and $F'_+(x) = f(x)$ for almost every $x \in \mathbb{R}$.

Similarly, by choosing $A_i = (x-r_i, x)$, $i = 1,2,\dots$, we obtain
\[
\lim_{i \to \infty} \frac{F(x-r_i)-F(x)}{-r_i} = f(x)
\]
and $F'_-(x) = f(x)$ for almost every $x \in \mathbb{R}$. Therefore $F'(x)$ exists and $F'(x) = f(x)$ for almost every $x \in [a,b]$. □

Remark 2.36. Assume that $f \in L^1([a,b])$ and define $F : [a,b] \to \mathbb{R}$,
\[
F(x) = F(a) + \int_{[a,x]} f(y)\,dy.
\]
Then $F'(x) = f(x)$ for almost every $x \in [a,b]$ and thus
\[
F(x) = F(a) + \int_{[a,x]} F'(y)\,dy. \tag{2.2}
\]

PROBLEM: What do we have to assume about the function $F$ to guarantee that (2.2) holds?

(1) If $F \in C^1([a,b])$, then (2.2) holds.

(2) If $F = \chi_{[-1,1]}$, then $F' = 0$ almost everywhere in $\mathbb{R}$, but (2.2) does not hold.

(3) It is not enough that $F$ is differentiable everywhere.

Reason. Let $F : \mathbb{R} \to \mathbb{R}$,
\[
F(x) = \begin{cases} x^2 \sin\frac{1}{x}, & x \ne 0, \\ 0, & x = 0. \end{cases}
\]
Then $F'(x)$ exists for every $x \in \mathbb{R}$, but $F' \notin L^1(\mathbb{R})$ (exercise). Thus (2.2) does not make sense. ■

(4) It is not enough that $F \in C([a,b])$, $F'(x)$ exists for almost every $x \in [a,b]$ and $F' \in L^1([a,b])$.

Reason. For the Cantor-Lebesgue function (see Measure and Integral)
\[
F(1) = 1 \ne 0 = F(0) + \int_{[0,1]} F'(y)\,dy.
\] ■

THE FINAL ANSWER: The formula (2.2) defines an important class of functions: A function $F : [a,b] \to \mathbb{R}$ is absolutely continuous if there exists $f \in L^1([a,b])$ such that
\[
F(x) = F(a) + \int_{[a,x]} f(y)\,dy
\]
for every $x \in [a,b]$. It follows that $f(x) = F'(x)$ for almost every $x \in [a,b]$.
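The failure of (2.2) for the Cantor-Lebesgue function can be observed numerically. The sketch below is an illustration under stated assumptions rather than part of the original notes: the function `cantor` is a hypothetical helper that approximates the Cantor-Lebesgue function by reading ternary digits in floating point, truncated after `depth` digits. The function is constant on every removed middle third, so its difference quotients vanish there, while it still climbs from 0 to 1.

```python
def cantor(x, depth=40):
    """Approximate the Cantor-Lebesgue function on [0, 1] from ternary digits.

    Ternary digits 0 and 2 are mapped to binary digits 0 and 1; the first
    ternary digit 1 means x lies in a removed middle third, where the
    function is constant.
    """
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    value, scale = 0.0, 0.5
    for _ in range(depth):
        x *= 3.0
        digit = int(x)
        x -= digit
        if digit == 1:
            return value + scale
        if digit == 2:
            value += scale
        scale *= 0.5
    return value


if __name__ == "__main__":
    # Total increase over [0, 1] is 1 ...
    print(cantor(1.0) - cantor(0.0))
    # ... but the difference quotient vanishes inside the removed interval (1/3, 2/3).
    h = 1e-3
    print((cantor(0.5 + h) - cantor(0.5)) / h)
```

The removed intervals have total length 1, so $F' = 0$ almost everywhere although $F(1) - F(0) = 1$; such an $F$ is continuous but not absolutely continuous.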

2.6 Points of density

We discuss a special case of the Lebesgue differentiation theorem. Let $A \subset \mathbb{R}^n$ be a measurable set and consider $f = \chi_A$. By the Lebesgue differentiation theorem
\[
\lim_{r \to 0} \frac{1}{|B(x,r)|}\int_{B(x,r)} \chi_A(y)\,dy = \lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = \chi_A(x)
\]
for almost every $x \in \mathbb{R}^n$. In particular,
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 1 \quad\text{for almost every } x \in A
\]
and
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 0 \quad\text{for almost every } x \in \mathbb{R}^n \setminus A.
\]

Definition 2.37. Let $A$ be an arbitrary subset of $\mathbb{R}^n$. A point $x \in \mathbb{R}^n$ is a point of density of $A$, if
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 1.
\]

THE MORAL: Density points are measure theoretic interior points of the set. Loosely speaking, the small balls around $x$ are almost entirely covered by $A$. The points with zero density belong to the measure theoretic complement of the set. In this case, the small balls around $x$ are almost entirely covered by the complement of $A$. The Lebesgue differentiation theorem asserts that almost every point of a measurable set is a density point and almost every point of the complement of a measurable set is a point of zero density.

Examples 2.38:

(1) Let $I_i = [2^{-(2i+1)}, 2^{-2i}]$, $i = 1,2,\dots$. Then $|I_i| = 2^{-2i} - 2^{-(2i+1)} = 2^{-(2i+1)}$, $i = 1,2,\dots$. Let $A = \bigcup_{i=1}^{\infty} I_i$. Then
\[
A \cap B(0,2^{-2k}) = \bigcup_{i=k}^{\infty} I_i
\]
up to a set of measure zero, and thus
\[
|A \cap B(0,2^{-2k})| = \sum_{i=k}^{\infty} \frac{1}{2^{2i+1}} = \frac{4}{3}\cdot\frac{1}{2^{2k+1}}, \quad k = 1,2,\dots.
\]
This implies
\[
\frac{|A \cap B(0,2^{-2k})|}{|B(0,2^{-2k})|} = \frac{4}{3}\cdot\frac{1}{2^{2k+1}}\cdot\frac{2^{2k}}{2} = \frac{1}{3}
\]
and
\[
\frac{|A \cap B(0,2^{-(2k+1)})|}{|B(0,2^{-(2k+1)})|} = \frac{4}{3}\cdot\frac{1}{2^{2k+3}}\cdot\frac{2^{2k+1}}{2} = \frac{1}{6}.
\]
Thus the limit
\[
\lim_{r \to 0} \frac{|A \cap B(0,r)|}{|B(0,r)|}
\]
does not exist and $x = 0$ is not a density point of $A$.

(2) Let $A = \{x \in \mathbb{R}^2 : |x_i| < 1,\ i = 1,2\}$. Then
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = \begin{cases}
1, & x \in A, \\
\tfrac{1}{2}, & x \in \partial A \setminus \{(1,1),(-1,1),(-1,-1),(1,-1)\}, \\
\tfrac{1}{4}, & x \in \{(1,1),(-1,1),(-1,-1),(1,-1)\}, \\
0, & x \in \mathbb{R}^2 \setminus \overline{A}.
\end{cases}
\]

(3) Let $A = \{x = re^{i\theta} : r > 0,\ 0 \le \theta \le 2\pi\alpha\}$, $0 < \alpha < 1$. Then
\[
\lim_{r \to 0} \frac{|A \cap B(0,r)|}{|B(0,r)|} = \lim_{r \to 0} \frac{2\pi\alpha}{2\pi} = \alpha.
\]
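The oscillation in Example 2.38 (1) can be reproduced with exact rational arithmetic. This is a small sketch with assumed helper names (`measure_in_ball`, `density_ratio`); the union defining $A$ is truncated after `terms` intervals, which changes the ratios only by a negligible amount.

```python
from fractions import Fraction


def measure_in_ball(radius, terms=60):
    """|A ∩ (-radius, radius)| for A = union of I_i = [2^-(2i+1), 2^-2i], i = 1..terms."""
    total = Fraction(0)
    for i in range(1, terms + 1):
        left = Fraction(1, 2 ** (2 * i + 1))
        right = Fraction(1, 2 ** (2 * i))
        overlap = min(right, radius) - left  # length of I_i inside the ball
        if overlap > 0:
            total += overlap
    return total


def density_ratio(radius, terms=60):
    """|A ∩ B(0, radius)| / |B(0, radius)| in dimension one, where |B(0, r)| = 2r."""
    return measure_in_ball(radius, terms) / (2 * radius)


if __name__ == "__main__":
    for k in (1, 2, 3, 4):
        even = density_ratio(Fraction(1, 2 ** (2 * k)))
        odd = density_ratio(Fraction(1, 2 ** (2 * k + 1)))
        print(k, float(even), float(odd))
```

Along the radii $2^{-2k}$ the ratio stays near $1/3$ and along $2^{-(2k+1)}$ near $1/6$, so the limit as $r \to 0$ cannot exist.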

Remarks 2.39:

(1) There does not exist a Lebesgue measurable set $A \subset \mathbb{R}^n$ such that
\[
|A \cap B(x,r)| = \frac{1}{2}|B(x,r)| \quad\text{for every } x \in A,\ r > 0.
\]

Reason. Assume that there exists such a set $A$. Note that
\[
|A| \ge |A \cap B(x,r)| = \frac{1}{2}|B(x,r)| > 0, \quad\text{if } r > 0.
\]
By the Lebesgue differentiation theorem
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 1
\]
for almost every $x \in A$ and thus on a set of positive measure in $A$. This contradicts the fact that
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = \frac{1}{2}
\]
for every $x \in A$. ■

(2) Let $A \subset \mathbb{R}^n$ be a measurable set. Then
\[
|A| > 0 \iff A \text{ has a Lebesgue point.}
\]

Reason. $\Longrightarrow$ By the Lebesgue differentiation theorem the set of Lebesgue points of $f = \chi_A$ in the set $A$ has positive measure. Thus there exists at least one point with the required property.

$\Longleftarrow$ Assume that there exists $x \in A$ such that
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 1.
\]
Then for every $\varepsilon > 0$ there exists $\delta > 0$ such that
\[
\frac{|A \cap B(x,r)|}{|B(x,r)|} > 1 - \varepsilon \quad\text{when } 0 < r < \delta.
\]
This implies
\[
|A| \ge |A \cap B(x,r)| > (1-\varepsilon)|B(x,r)| > 0, \quad 0 < \varepsilon < 1.
\] ■

(3) If $A \subset \mathbb{R}^n$ is a measurable set such that
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} < 1
\]
for every $x \in A$, then $|A| = 0$.

(4) Assume that $\Omega \subset \mathbb{R}^n$ is an open set. If there exists $\gamma$, $0 < \gamma \le 1$, such that
\[
|\Omega \cap B(x,r)| \ge \gamma|B(x,r)| \quad\text{for every } x \in \partial\Omega,\ r > 0,
\]
then $|\partial\Omega| = 0$. Recall that the complement of a fat Cantor set is an open set whose boundary has positive measure. This shows that the claim above is nontrivial.

Reason. Since $\Omega \subset \mathbb{R}^n$ is open, we have $\partial\Omega \subset \mathbb{R}^n \setminus \Omega$. By the Lebesgue differentiation theorem
\[
\lim_{r \to 0} \frac{|\Omega \cap B(x,r)|}{|B(x,r)|} = 0 \quad\text{for almost every } x \in \partial\Omega.
\]
On the other hand,
\[
\lim_{r \to 0} \frac{|\Omega \cap B(x,r)|}{|B(x,r)|} \ge \gamma > 0 \quad\text{for every } x \in \partial\Omega.
\]
Thus $|\partial\Omega| = 0$. ■

(5) Let $A$ be an arbitrary subset of $\mathbb{R}^n$. Then
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 1 \quad\text{for almost every } x \in A.
\]
Note that this holds without the assumption that $A$ is measurable. Moreover, a set $A \subset \mathbb{R}^n$ is measurable if and only if
\[
\lim_{r \to 0} \frac{|A \cap B(x,r)|}{|B(x,r)|} = 0 \quad\text{for almost every } x \in \mathbb{R}^n \setminus A.
\]
For the proof, see [6, p. 464] and [8, Remarks 2.15 (2)].

2.7 The Sobolev embedding

This section discusses an application of the Hardy-Littlewood-Wiener theorem, see Theorem 2.24. We begin with considering the one-dimensional case. If $u \in C_0^1(\mathbb{R})$, there exists an interval $[a,b] \subset \mathbb{R}$ such that $u(x) = 0$ for every $x \in \mathbb{R} \setminus [a,b]$. By the fundamental theorem of calculus,
\[
u(x) = u(a) + \int_a^x u'(y)\,dy = \int_{-\infty}^x u'(y)\,dy, \tag{2.3}
\]
since $u(a) = 0$. On the other hand,
\[
0 = u(b) = u(x) + \int_x^b u'(y)\,dy = u(x) + \int_x^{\infty} u'(y)\,dy,
\]
so that
\[
u(x) = -\int_x^{\infty} u'(y)\,dy. \tag{2.4}
\]
Equalities (2.3) and (2.4) imply
\[
2u(x) = \int_{-\infty}^x u'(y)\,dy - \int_x^{\infty} u'(y)\,dy
= \int_{-\infty}^x \frac{u'(y)(x-y)}{|x-y|}\,dy + \int_x^{\infty} \frac{u'(y)(x-y)}{|x-y|}\,dy
= \int_{\mathbb{R}} \frac{u'(y)(x-y)}{|x-y|}\,dy,
\]
from which it follows that
\[
u(x) = \frac{1}{2}\int_{\mathbb{R}} \frac{u'(y)(x-y)}{|x-y|}\,dy \quad\text{for every } x \in \mathbb{R}.
\]
Next we extend this to $\mathbb{R}^n$.
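The one-dimensional representation formula can be sanity-checked numerically. The following sketch is an illustration only: the test function $u(x) = (1-x^2)^2$ on $[-1,1]$ (extended by zero) is an assumed example of a $C_0^1(\mathbb{R})$ function, and the integral is approximated by a midpoint rule, so agreement is only up to the discretization error.

```python
def u(x):
    """An assumed C^1 test function with support [-1, 1]."""
    return (1.0 - x * x) ** 2 if abs(x) < 1.0 else 0.0


def du(x):
    """Derivative of u; it vanishes at the endpoints, so u is C^1 on R."""
    return -4.0 * x * (1.0 - x * x) if abs(x) < 1.0 else 0.0


def representation(x, n=20000, a=-1.0, b=1.0):
    """Midpoint rule for (1/2) * integral over [a, b] of u'(y) (x - y)/|x - y| dy."""
    h = (b - a) / n
    total = 0.0
    for k in range(n):
        y = a + (k + 0.5) * h
        if y != x:
            # (x - y)/|x - y| is the sign of x - y.
            total += du(y) * (1.0 if x > y else -1.0)
    return 0.5 * total * h


if __name__ == "__main__":
    for x in (-0.7, 0.0, 0.3, 2.0):
        print(x, u(x), representation(x))
```

Since $u'$ vanishes outside $[-1,1]$, integrating over $[-1,1]$ is the same as integrating over $\mathbb{R}$; the formula also returns 0 for points outside the support.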

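Lemma 2.40. If $u \in C_0^1(\mathbb{R}^n)$, then
\[
u(x) = \frac{1}{\omega_{n-1}}\int_{\mathbb{R}^n} \frac{\nabla u(y)\cdot(x-y)}{|x-y|^n}\,dy \quad\text{for every } x \in \mathbb{R}^n,
\]
where $\omega_{n-1} = n\Omega_n$ is the $(n-1)$-dimensional measure of $\partial B(0,1)$ and
\[
\nabla u = \left(\frac{\partial u}{\partial x_1},\dots,\frac{\partial u}{\partial x_n}\right)
\]
is the gradient of $u$.

THE MORAL: This is a higher dimensional version of the fundamental theorem of calculus.

Proof. If $x \in \mathbb{R}^n$ and $e \in \partial B(0,1)$, by the fundamental theorem of calculus
\[
u(x) = -\int_0^{\infty} \frac{\partial}{\partial t}\big(u(x+te)\big)\,dt = -\int_0^{\infty} \nabla u(x+te)\cdot e\,dt.
\]
By the Fubini theorem
\[
\omega_{n-1}u(x) = u(x)\int_{\partial B(0,1)} 1\,dS(e)
= -\int_{\partial B(0,1)}\int_0^{\infty} \nabla u(x+te)\cdot e\,dt\,dS(e)
\]
\[
= -\int_0^{\infty}\int_{\partial B(0,1)} \nabla u(x+te)\cdot e\,dS(e)\,dt \quad\text{(Fubini)}
\]
\[
= -\int_0^{\infty}\int_{\partial B(0,t)} \nabla u(x+y)\cdot\frac{y}{t}\,\frac{1}{t^{n-1}}\,dS(y)\,dt \quad (y = te,\ dS(e) = t^{1-n}\,dS(y))
\]
\[
= -\int_0^{\infty}\int_{\partial B(0,t)} \frac{\nabla u(x+y)\cdot y}{|y|^n}\,dS(y)\,dt
= -\int_{\mathbb{R}^n} \frac{\nabla u(x+y)\cdot y}{|y|^n}\,dy \quad\text{(polar coordinates)}
\]
\[
= -\int_{\mathbb{R}^n} \frac{\nabla u(z)\cdot(z-x)}{|z-x|^n}\,dz \quad (z = x+y,\ dy = dz)
\]
\[
= \int_{\mathbb{R}^n} \frac{\nabla u(y)\cdot(x-y)}{|x-y|^n}\,dy.
\] □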

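By the Cauchy-Schwarz inequality and Lemma 2.40, we have
\[
|u(x)| = \left|\frac{1}{\omega_{n-1}}\int_{\mathbb{R}^n} \frac{\nabla u(y)\cdot(x-y)}{|x-y|^n}\,dy\right|
\le \frac{1}{\omega_{n-1}}\int_{\mathbb{R}^n} \frac{|\nabla u(y)||x-y|}{|x-y|^n}\,dy
= \frac{1}{\omega_{n-1}}\int_{\mathbb{R}^n} \frac{|\nabla u(y)|}{|x-y|^{n-1}}\,dy
= \frac{1}{\omega_{n-1}}I_1(|\nabla u|)(x), \tag{2.5}
\]
where $I_\alpha f$, $0 < \alpha < n$, is the Riesz potential
\[
I_\alpha f(x) = \int_{\mathbb{R}^n} \frac{f(y)}{|x-y|^{n-\alpha}}\,dy. \tag{2.6}
\]

Lemma 2.41. If $0 < \alpha < n$, there exists a constant $c = c(n,\alpha) > 0$, such that
\[
\int_{B(x,r)} \frac{|f(y)|}{|x-y|^{n-\alpha}}\,dy \le c\,r^{\alpha}Mf(x)
\]
for every $x \in \mathbb{R}^n$ and $r > 0$.

THE MORAL: Some other operator, in this case the Riesz potential, can be bounded by the maximal operator.

Proof. Let $x \in \mathbb{R}^n$ and denote $A_i = B(x, r2^{-i})$. Then
\[
\int_{B(x,r)} \frac{|f(y)|}{|x-y|^{n-\alpha}}\,dy = \sum_{i=0}^{\infty}\int_{A_i \setminus A_{i+1}} \frac{|f(y)|}{|x-y|^{n-\alpha}}\,dy
\le \sum_{i=0}^{\infty}\left(\frac{r}{2^{i+1}}\right)^{\alpha-n}\int_{A_i} |f(y)|\,dy
\]
\[
= \Omega_n\sum_{i=0}^{\infty}\left(\frac{1}{2}\right)^{\alpha-n}\left(\frac{r}{2^i}\right)^{\alpha}\frac{1}{\Omega_n}\left(\frac{r}{2^i}\right)^{-n}\int_{A_i} |f(y)|\,dy
= \Omega_n\sum_{i=0}^{\infty}\left(\frac{1}{2}\right)^{\alpha-n}\left(\frac{r}{2^i}\right)^{\alpha}\frac{1}{|A_i|}\int_{A_i} |f(y)|\,dy
\]
\[
\le c\,Mf(x)\,r^{\alpha}\sum_{i=0}^{\infty}\left(\frac{1}{2^{\alpha}}\right)^{i}
= c\,r^{\alpha}Mf(x).
\] □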

Theorem 2.42 (The Sobolev inequality for the Riesz potentials). Assume that $1 < p < n$ and $0 < \alpha < n/p$. Then there exists a constant $c = c(n,p,\alpha) > 0$, such that for every $f \in L^p(\mathbb{R}^n)$ we have
\[
\|I_\alpha f\|_{p^*} \le c\|f\|_p, \quad\text{where } p^* = \frac{pn}{n-\alpha p}.
\]

THE MORAL: The proof applies the Hardy-Littlewood-Wiener theorem to conclude a norm estimate for some other operator, in this case the Riesz potential.

Proof. If $f = 0$ almost everywhere, there is nothing to prove, and thus we may assume that $f \ne 0$ on a set of positive measure. This implies $Mf(x) > 0$ for every $x \in \mathbb{R}^n$. By Hölder's inequality
\[
\int_{\mathbb{R}^n \setminus B(x,r)} \frac{|f(y)|}{|x-y|^{n-\alpha}}\,dy
\le \left(\int_{\mathbb{R}^n \setminus B(x,r)} |f(y)|^p\,dy\right)^{1/p}\left(\int_{\mathbb{R}^n \setminus B(x,r)} |x-y|^{(\alpha-n)p'}\,dy\right)^{1/p'},
\]
where
\[
\int_{\mathbb{R}^n \setminus B(x,r)} |x-y|^{(\alpha-n)p'}\,dy = \int_r^{\infty}\int_{\partial B(x,\rho)} |x-y|^{(\alpha-n)p'}\,dS(y)\,d\rho
= \int_r^{\infty} \rho^{(\alpha-n)p'}\underbrace{\int_{\partial B(x,\rho)} 1\,dS(y)}_{=\,\omega_{n-1}\rho^{n-1}}\,d\rho
\]
\[
= \omega_{n-1}\int_r^{\infty} \rho^{(\alpha-n)p'+n-1}\,d\rho
= \frac{\omega_{n-1}}{(n-\alpha)p'-n}\,r^{n-(n-\alpha)p'}.
\]
The exponent can be written in the form
\[
n-(n-\alpha)p' = n-(n-\alpha)\frac{p}{p-1} = \frac{\alpha p-n}{p-1},
\]
and thus
\[
\int_{\mathbb{R}^n \setminus B(x,r)} \frac{|f(y)|}{|x-y|^{n-\alpha}}\,dy \le c\,r^{\alpha-n/p}\|f\|_p.
\]
Lemma 2.41 implies
\[
|I_\alpha f(x)| \le \int_{\mathbb{R}^n} \frac{|f(y)|}{|x-y|^{n-\alpha}}\,dy = \int_{B(x,r)} \dots\,dy + \int_{\mathbb{R}^n \setminus B(x,r)} \dots\,dy
\le c\left(r^{\alpha}Mf(x) + r^{\alpha-n/p}\|f\|_p\right).
\]
By choosing
\[
r = \left(\frac{Mf(x)}{\|f\|_p}\right)^{-p/n} > 0,
\]
we obtain
\[
|I_\alpha f(x)| \le c\,Mf(x)^{1-\alpha p/n}\|f\|_p^{\alpha p/n}.
\]
By raising both sides to the power $p^* = np/(n-\alpha p)$, we have
\[
|I_\alpha f(x)|^{p^*} \le c\,Mf(x)^{p}\|f\|_p^{(\alpha p/n)p^*}.
\]
The Hardy-Littlewood theorem II (Theorem 2.24) implies
\[
\int_{\mathbb{R}^n} |I_\alpha f(x)|^{p^*}\,dx \le c\|f\|_p^{(\alpha p/n)p^*}\int_{\mathbb{R}^n} (Mf(x))^p\,dx \le c\|f\|_p^{(\alpha p/n)p^*}\|f\|_p^p
\]
and thus
\[
\|I_\alpha f\|_{p^*} \le c\|f\|_p^{\alpha p/n+p/p^*} = c\|f\|_p.
\] □

Corollary 2.43 (The Sobolev inequality). If $1 < p < n$, there exists a constant $c = c(n,p)$ such that
\[
\|u\|_{p^*} \le c\|\nabla u\|_p
\]
for every $u \in C_0^1(\mathbb{R}^n)$.

Proof. By (2.5), we have
\[
|u(x)| \le \frac{1}{\omega_{n-1}}I_1(|\nabla u|)(x) \quad\text{for every } x \in \mathbb{R}^n.
\]
Thus Theorem 2.42 implies
\[
\|u\|_{p^*} \le c\|I_1(|\nabla u|)\|_{p^*} \le c\|\nabla u\|_p.
\] □

Remark 2.44. Let $\Omega \subset \mathbb{R}^n$ be an open set and $u \in C_0^1(\Omega)$. By defining $u(x) = 0$ for every $x \in \mathbb{R}^n \setminus \Omega$, we have
\[
\left(\int_{\Omega} |u|^{p^*}\,dx\right)^{1/p^*} \le c\left(\int_{\Omega} |\nabla u|^p\,dx\right)^{1/p}.
\]

In this section we consider the definition and properties of convolution. Convolutions are used to approximate and mollify $L^p$ functions. Moreover, many operators in harmonic analysis and partial differential equations can be written as a convolution. Approximations of the identity converge in $L^p$ and pointwise almost everywhere under appropriate assumptions. As an application we show that $C_0^{\infty}(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ for $1 \le p < \infty$. The solution to the Dirichlet problem with $L^p$ boundary values for the Laplace equation in the upper half space can be expressed as a convolution against the Poisson kernel.

3 Convolutions

In this section we work with the Lebesgue measure on Rn.

3.1 Two additional properties of Lp

First we consider the question of approximation of Lp functions by compactly supported continuous functions.

Definition 3.1. The support of a function $f : \mathbb{R}^n \to [-\infty,\infty]$ is
\[
\operatorname{supp} f = \overline{\{x \in \mathbb{R}^n : f(x) \ne 0\}}.
\]
If $f \in C(\mathbb{R}^n)$ and $\operatorname{supp} f$ is a compact set, then we denote $f \in C_0(\mathbb{R}^n)$ and say that $f$ is a compactly supported continuous function.

THE MORAL: A function is compactly supported if and only if it is zero in the complement of a sufficiently large ball. Thus a compactly supported function is identically zero far away from the origin.

Remark 3.2. $C_0(\mathbb{R}^n) \subset L^p(\mathbb{R}^n)$ for every $1 \le p \le \infty$. Thus compactly supported continuous functions belong to every $L^p(\mathbb{R}^n)$ with $1 \le p \le \infty$.

Reason. $1 \le p < \infty$:
\[
\int_{\mathbb{R}^n} |f(x)|^p\,dx = \int_{\operatorname{supp} f} |f(x)|^p\,dx \le \sup_{x \in \operatorname{supp} f} |f(x)|^p\,|\operatorname{supp} f| < \infty,
\]
since a continuous function assumes its maximum in a compact set and a compact set has finite Lebesgue measure.

$p = \infty$:
\[
|f(x)| \le \sup_{x \in \operatorname{supp} f} |f(x)| < \infty,
\]
from which it follows that $\|f\|_{\infty} < \infty$. ■


Theorem 3.3. Assume that $1 \le p < \infty$. Then $C_0(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$.

THE MORAL: This means that for every $f \in L^p(\mathbb{R}^n)$ and every $\varepsilon > 0$ there is a function $g \in C_0(\mathbb{R}^n)$ such that $\|f-g\|_p < \varepsilon$. Equivalently, any function $f \in L^p(\mathbb{R}^n)$ can be approximated by functions $f_i \in C_0(\mathbb{R}^n)$ in $L^p(\mathbb{R}^n)$, that is, $\|f_i-f\|_p \to 0$ as $i \to \infty$.

WARNING: $C_0(\mathbb{R}^n)$ is not dense in $L^{\infty}(\mathbb{R}^n)$, because the limit of continuous functions in $L^{\infty}(\mathbb{R}^n)$ is a continuous function. If $C_0(\mathbb{R}^n)$ were dense, this would imply that all functions in $L^{\infty}(\mathbb{R}^n)$ are continuous, which is not the case. There is also another reason why this is not true. The constant function $f : \mathbb{R}^n \to \mathbb{R}$, $f(x) = 1$, cannot be approximated by compactly supported functions in $L^{\infty}(\mathbb{R}^n)$.

Proof. Assume $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$.

(1) Let $f_i = f\chi_{B(0,i)}$, $i = 1,2,\dots$. Then
\[
\lim_{i \to \infty} f_i(x) = f(x) \quad\text{for every } x \in \mathbb{R}^n.
\]
By the Lebesgue dominated convergence theorem
\[
\lim_{i \to \infty}\int_{\mathbb{R}^n} |f_i-f|^p\,dx = \int_{\mathbb{R}^n} \lim_{i \to \infty} |f_i-f|^p\,dx = 0.
\]
Observe that
\[
|f_i(x)-f(x)|^p \le (|f_i(x)|+|f(x)|)^p \le 2^p|f(x)|^p, \quad |f|^p \in L^1(\mathbb{R}^n),
\]
gives an integrable majorant. Thus compactly supported functions in $L^p(\mathbb{R}^n)$ are dense in $L^p(\mathbb{R}^n)$ and we may assume that $f$ is such a function.

(2) Since $f = f^+ - f^-$, we may assume that $f \ge 0$ and that $f = 0$ outside a compact set. Indeed this set can be chosen to be a closed ball $\overline{B}(0,i)$ for $i$ large enough.

(3) Since $f \ge 0$ is measurable, there exists an increasing sequence of simple functions $g_i$ such that
\[
\lim_{i \to \infty} g_i(x) = f(x) \quad\text{for every } x \in \mathbb{R}^n.
\]
Since $0 \le g_i \le f$, we have
\[
\int_{\mathbb{R}^n} g_i^p\,dx \le \int_{\mathbb{R}^n} f^p\,dx < \infty
\]
and thus $g_i \in L^p(\mathbb{R}^n)$, $i = 1,2,\dots$. By the Lebesgue dominated convergence theorem
\[
\lim_{i \to \infty}\int_{\mathbb{R}^n} |g_i-f|^p\,dx = \int_{\mathbb{R}^n} \lim_{i \to \infty} |g_i-f|^p\,dx = 0.
\]
Observe that
\[
|g_i(x)-f(x)|^p \le (|g_i(x)|+|f(x)|)^p \le 2^p|f(x)|^p, \quad |f|^p \in L^1(\mathbb{R}^n),
\]
gives an integrable majorant. Thus we may assume that $f$ is a nonnegative simple function with a compact support.

(4) A simple function can be represented as the finite sum $f = \sum_{i=1}^{k} a_i\chi_{A_i}$, where the sets $A_i$ are bounded, measurable and pairwise disjoint, $a_i \ge 0$. Thus we may assume that $f = \chi_A$, where $A$ is a bounded measurable set.

(5) By the approximation result for measurable sets, there exists an open set $G \supset A$ and a closed set $F \subset A$ such that $|G \setminus F|^{1/p} < \varepsilon$, where $\varepsilon > 0$. The set $F$ is compact, since it is closed and bounded.

(6) We recall the following version of the Urysohn lemma. Assume that $G \subset \mathbb{R}^n$ is an open set and that $F \subset G$ is a compact set. Then there exists a continuous function $g : \mathbb{R}^n \to \mathbb{R}$ such that

(1) $0 \le g(x) \le 1$ for every $x \in \mathbb{R}^n$,
(2) $g(x) = 1$ for every $x \in F$ and
(3) $\operatorname{supp} g$ is a compact subset of $G$.

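Reason. Let
\[
U = \left\{x \in \mathbb{R}^n : \operatorname{dist}(x,F) < \tfrac{1}{2}\operatorname{dist}(F,\mathbb{R}^n \setminus G)\right\}.
\]
Then $F \subset U \subset \overline{U} \subset G$, $U$ is open and $\overline{U}$ is compact. Define $g : \mathbb{R}^n \to \mathbb{R}$,
\[
g(x) = \frac{\operatorname{dist}(x,\mathbb{R}^n \setminus U)}{\operatorname{dist}(x,F)+\operatorname{dist}(x,\mathbb{R}^n \setminus U)}, \tag{3.1}
\]
where $\operatorname{dist}(x,A) = \inf\{|x-y| : y \in A\}$ is the distance of $x$ from $A$.

It is clear that $0 \le g(x) \le 1$ for every $x \in \mathbb{R}^n$.

Let $x \in F$. Then $\operatorname{dist}(x,F) = 0$. Since $F \subset U$, there exists $r > 0$ such that $B(x,r) \subset U$. This implies $\operatorname{dist}(x,\mathbb{R}^n \setminus U) \ge r > 0$ and thus $g(x) = 1$.

Moreover, $\operatorname{supp} g = \overline{\{x \in \mathbb{R}^n : g(x) \ne 0\}} \subset \overline{U}$, which is a closed and bounded set and thus compact.

We claim that $x \mapsto \operatorname{dist}(x,A)$ is continuous for every $A \ne \emptyset$. Let $x, x' \in \mathbb{R}^n$. Then
\[
\operatorname{dist}(x,A) \le |x-y| \le |x-x'|+|x'-y|
\]
for every $y \in A$. This implies $\operatorname{dist}(x,A)-|x-x'| \le \operatorname{dist}(x',A)$ and thus $\operatorname{dist}(x,A)-\operatorname{dist}(x',A) \le |x-x'|$. By switching the roles of $x$ and $x'$ we have $\operatorname{dist}(x',A)-\operatorname{dist}(x,A) \le |x-x'|$, from which we conclude
\[
|\operatorname{dist}(x,A)-\operatorname{dist}(x',A)| \le |x-x'|.
\]
Note that $\operatorname{dist}(x,A)$ is even Lipschitz continuous with the constant 1. This implies that $g$ is continuous, since the denominator in (3.1) is strictly positive. Thus $g$ has all the required properties. ■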
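Formula (3.1) is concrete enough to evaluate directly. The sketch below is a one-dimensional illustration with assumed data, not part of the original notes: $F$ is a closed interval, $G$ an open interval containing it, and the helper names (`dist_to_interval`, `dist_to_complement`, `urysohn`) are hypothetical.

```python
def dist_to_interval(x, lo, hi):
    """Distance from x to the closed interval [lo, hi]."""
    return max(lo - x, 0.0, x - hi)


def dist_to_complement(x, lo, hi):
    """Distance from x to the complement of the open interval (lo, hi)."""
    return max(0.0, min(x - lo, hi - x))


def urysohn(x, f_set=(-1.0, 1.0), g_set=(-2.0, 2.0)):
    """The function (3.1) for F = [f_set] inside the open interval G = (g_set)."""
    gap = min(f_set[0] - g_set[0], g_set[1] - f_set[1])    # dist(F, R \ G)
    u_set = (f_set[0] - gap / 2.0, f_set[1] + gap / 2.0)   # U = {dist(x, F) < gap / 2}
    d_out = dist_to_complement(x, *u_set)                  # dist(x, R \ U)
    d_f = dist_to_interval(x, *f_set)                      # dist(x, F)
    # The denominator is positive: d_f = 0 forces x in F, where d_out >= gap / 2.
    return d_out / (d_f + d_out)


if __name__ == "__main__":
    for x in (0.0, 1.0, 1.25, 1.5, 1.9, 3.0):
        print(x, urysohn(x))
```

With the default data the function equals 1 on $[-1,1]$, decreases to 0 on $[1,1.5]$ and vanishes outside $(-1.5,1.5)$, so its support is a compact subset of $G = (-2,2)$.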

THE MORAL: This shows that there exists a continuous function $g$ which satisfies $\chi_F \le g \le \chi_G$. Note that it is easy to find semicontinuous functions with this property, since $\chi_F$ and $\chi_G$ will do.

(7) Let $g$ be a function as in (6). Note that $\chi_A(x)-g(x) = 1-1 = 0$ for every $x \in F$, $\chi_A(x)-g(x) = 0-0 = 0$ for every $x \in \mathbb{R}^n \setminus G$ and $|\chi_A-g| \le 1$. Thus
\[
\|f-g\|_p = \left(\int_{\mathbb{R}^n} |\chi_A-g|^p\,dx\right)^{1/p} \le |G \setminus F|^{1/p} < \varepsilon.
\] □

Remark 3.4. The proof above shows that simple functions are dense in $L^p(\mathbb{R}^n)$ for $1 \le p \le \infty$.

Remark 3.5. Let us briefly discuss the question of separability of the $L^p$ spaces. Recall that a metric space is separable, if there is a countable dense subset of the space. For the sequence spaces $l^p$ with $1 \le p < \infty$ this is an exercise. However, the space $l^{\infty}$ is not separable.

Reason. Let $S$ be the collection of all sequences consisting of ones and zeros. If $x, y \in S$ are distinct, then $\|x-y\|_{\infty} \ge 1$. Since $S$ is an uncountable subset of $l^{\infty}$ and every pair of points in $S$ is at least a unit distance apart, there can be no countable dense subset of $l^{\infty}$. ■

Generally, we can expect similar assertions for the $L^p$ spaces. Indeed, the space $L^{\infty}(\mathbb{R}^n)$ is not separable and $L^p(\mathbb{R}^n)$ with $1 \le p < \infty$ are separable. For example, $L^1([0,1])$ is separable, since the collection of rational linear combinations of the characteristic functions of those sets that are finite unions of intervals with rational endpoints gives a countable dense subset. However, for measures other than the Lebesgue measure, separability depends on the measure.

Then we study a useful continuity property of the integral. This will be an important tool in proving that convolution approximations converge to the original function. Moreover, it can be used to prove the Riemann-Lebesgue lemma, which asserts that the Fourier transform $\hat{f}(\xi)$ of a function $f \in L^1(\mathbb{R}^n)$ has the property $\lim_{|\xi| \to \infty} \hat{f}(\xi) = 0$ (exercise).

Theorem 3.6. Assume $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$. Then
\[
\lim_{y \to 0}\int_{\mathbb{R}^n} |f(x+y)-f(x)|^p\,dx = 0.
\]

THE MORAL: Let $\tau_y f(x) = f(x+y)$, $y \in \mathbb{R}^n$, be the translation. Then
\[
\lim_{y \to 0} \|\tau_y f-f\|_p = 0.
\]
Thus the translation $\tau_y f$ depends continuously on $y$ at $y = 0$.

WARNING: The claim does not hold when $p = \infty$. In fact, if $f \in L^{\infty}(\mathbb{R}^n)$ satisfies $\lim_{y \to 0} \|\tau_y f-f\|_{\infty} = 0$, then $f$ can be redefined on a set of measure zero so that it becomes uniformly continuous (exercise).

Reason. Let $f : \mathbb{R} \to \mathbb{R}$, $f = \chi_{[0,\infty)}$. Then
\[
\operatorname*{ess\,sup}_{x \in \mathbb{R}} |f(x+y)-f(x)| = 1 \quad\text{for every } y \ne 0.
\] ■

Proof. Let $\varepsilon > 0$ and $y \in \mathbb{R}^n$. By Theorem 3.3, there exists $g \in C_0(\mathbb{R}^n)$ such that
\[
\left(\int_{\mathbb{R}^n} |f(x)-g(x)|^p\,dx\right)^{1/p} < \frac{\varepsilon}{3}.
\]
The translation invariance of the Lebesgue integral implies
\[
\left(\int_{\mathbb{R}^n} |f(x+y)-g(x+y)|^p\,dx\right)^{1/p} = \left(\int_{\mathbb{R}^n} |f(x)-g(x)|^p\,dx\right)^{1/p}.
\]
Since $g \in C_0(\mathbb{R}^n)$, there exists $r > 0$ such that $g(x) = 0$ for every $x \in \mathbb{R}^n \setminus B(0,r)$ and thus $g$ is uniformly continuous in $\mathbb{R}^n$. Here we used the fact that a continuous function is uniformly continuous on compact sets. Therefore, there exists $0 < \delta \le 1$ such that
\[
|g(x+y)-g(x)| < \frac{\varepsilon}{3|B(0,r+1)|^{1/p}} \quad\text{for every } x \in \mathbb{R}^n,\ |y| < \delta.
\]
Since $g(x+y)-g(x) = 0$ for every $x \in \mathbb{R}^n \setminus B(0,r+1)$, we have
\[
\left(\int_{\mathbb{R}^n} |g(x+y)-g(x)|^p\,dx\right)^{1/p} = \left(\int_{B(0,r+1)} |g(x+y)-g(x)|^p\,dx\right)^{1/p}
< \frac{\varepsilon}{3|B(0,r+1)|^{1/p}}|B(0,r+1)|^{1/p} = \frac{\varepsilon}{3}.
\]
Therefore
\[
\left(\int_{\mathbb{R}^n} |f(x+y)-f(x)|^p\,dx\right)^{1/p}
\le \left(\int_{\mathbb{R}^n} |f(x+y)-g(x+y)|^p\,dx\right)^{1/p} + \left(\int_{\mathbb{R}^n} |g(x+y)-g(x)|^p\,dx\right)^{1/p} + \left(\int_{\mathbb{R}^n} |g(x)-f(x)|^p\,dx\right)^{1/p}
< \frac{\varepsilon}{3}+\frac{\varepsilon}{3}+\frac{\varepsilon}{3} = \varepsilon.
\] □
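The contrast between finite $p$ and $p = \infty$ in Theorem 3.6 can be seen on a simple discontinuous function. The following sketch is illustrative only: it takes $f = \chi_{[0,1]}$ as an assumed example and approximates the norms on a midpoint grid over $[-2,3]$, so the values carry a small discretization error. For this $f$ one has $\|\tau_y f - f\|_p^p = 2|y|$ when $|y| \le 1$.

```python
def indicator(x):
    """The indicator function of the interval [0, 1]."""
    return 1.0 if 0.0 <= x <= 1.0 else 0.0


def translation_gap(y, p, n=40000, lo=-2.0, hi=3.0):
    """Midpoint-rule approximation of ||tau_y f - f||_p for f = indicator."""
    h = (hi - lo) / n
    total = 0.0
    for k in range(n):
        x = lo + (k + 0.5) * h
        total += abs(indicator(x + y) - indicator(x)) ** p
    return (total * h) ** (1.0 / p)


def sup_gap(y, n=40000, lo=-2.0, hi=3.0):
    """Maximum of |f(x + y) - f(x)| over the same grid, a stand-in for the L-infinity norm."""
    h = (hi - lo) / n
    return max(abs(indicator(lo + (k + 0.5) * h + y) - indicator(lo + (k + 0.5) * h))
               for k in range(n))


if __name__ == "__main__":
    for y in (0.5, 0.1, 0.01):
        print(y, translation_gap(y, 1), translation_gap(y, 2), sup_gap(y))
```

The $L^1$ and $L^2$ gaps shrink with $y$, while the supremum gap stays at 1, in line with the warning above.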

3.2 Convolution

We begin with a formal definition of the convolution.

Definition 3.7. Assume that $f, g : \mathbb{R}^n \to [-\infty,\infty]$ are measurable functions. On a formal level, the convolution $f * g : \mathbb{R}^n \to [-\infty,\infty]$ is defined by
\[
(f * g)(x) = \int_{\mathbb{R}^n} f(y)g(x-y)\,dy,
\]
when this makes sense.

WARNING: It is not clear that the integrand is a measurable function and that the integral is well defined. This requires further analysis.

Remark 3.8. The function $y \mapsto f(y)g(x-y)$ is a measurable function for a fixed $x \in \mathbb{R}^n$.

Reason. Let $U \subset \mathbb{R}$ be an open set. The translation function $\Phi : \mathbb{R}^n \to \mathbb{R}^n$, $\Phi(y) = x-y$, maps measurable sets to measurable sets, so that
\[
(g \circ \Phi)^{-1}(U) = \Phi^{-1}(g^{-1}(U))
\]
is a measurable set. This shows that $y \mapsto (g \circ \Phi)(y) = g(x-y)$ is a measurable function. Thus $y \mapsto f(y)g(x-y)$ is a measurable function as a product of two measurable functions. ■

THE MORAL: The convolution is well defined for nonnegative functions $f$ and $g$, but the integral may be infinite for every $x \in \mathbb{R}^n$.

A more careful analysis is needed to deal with sign changing functions. Then we need conditions under which the integrals of the positive and negative parts are finite. We begin with considering the measurability question with respect to the product Lebesgue measure in $\mathbb{R}^n \times \mathbb{R}^n$. This is needed in the application of Fubini's theorem, which ensures almost everywhere finiteness of $|f| * |g|$ under appropriate conditions.

Remarks 3.9:

(1) Assume that $f : \mathbb{R}^n \to [-\infty,\infty]$ is a measurable function. There exists a Borel function $\varphi : \mathbb{R}^n \to [-\infty,\infty]$ such that $\varphi = f$ almost everywhere.

Reason. Recall that $\varphi$ is a Borel function, if the preimage $\varphi^{-1}(U) = \{x \in \mathbb{R}^n : \varphi(x) \in U\}$ is a Borel set whenever $U \subset \mathbb{R}$ is an open set.

Assume first that $f$ is nonnegative. Since $f$ is a nonnegative measurable function, there exists an increasing sequence of simple functions $f_i$, $i = 1,2,\dots$, such that
\[
f(x) = \lim_{i \to \infty} f_i(x)
\]
for every $x \in \mathbb{R}^n$. Each simple function $f_i$ can be written as a finite sum
\[
f_i = \sum_j a_j\chi_{A_j},
\]
where $A_j = f_i^{-1}(\{a_j\})$ are measurable sets. For each measurable set $A_j$ there is a Borel set $B_j \subset A_j$ such that $|A_j \setminus B_j| = 0$. Then $\varphi_i = \sum_j a_j\chi_{B_j}$ is a Borel function, $0 \le \varphi_i \le f_i$ and $\varphi_i = f_i$ almost everywhere. Note that the sequence $\varphi_i$ is not necessarily increasing and the limit $\lim_{i \to \infty} \varphi_i$ does not necessarily exist, but
\[
\varphi = \liminf_{i \to \infty} \varphi_i
\]
is a Borel function and $\varphi = f$ almost everywhere.

In the general case $f : \mathbb{R}^n \to [-\infty,\infty]$, we write $f = f^+ - f^-$ and consider the positive and negative parts separately. ■

Note that the integrals do not change if we replace $f$ with $\varphi$. In particular, we may assume that $f$ and $g$ are Borel functions in the definition of convolution.

(2) If $f : \mathbb{R}^n \to \mathbb{R}$ and $g : \mathbb{R}^n \to \mathbb{R}$ are Borel functions, then $F : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}$, $F(x,y) = f(y)g(x-y)$, is a Borel function.

Reason. The functions
\[
\Phi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n,\ \Phi(x,y) = x-y \quad\text{and}\quad \Psi : \mathbb{R}^n \times \mathbb{R}^n \to \mathbb{R}^n,\ \Psi(x,y) = y,
\]
are continuous and $F(x,y) = f(\Psi(x,y))g(\Phi(x,y))$, that is, $F = (f \circ \Psi)\cdot(g \circ \Phi)$. Let $U \subset \mathbb{R}$ be an open set. Since $f$ is a Borel function, the preimage $f^{-1}(U)$ is a Borel set in $\mathbb{R}^n$. Since $\Psi$ is a continuous function and the preimage of a Borel set under a continuous function is a Borel set, we conclude that
\[
(f \circ \Psi)^{-1}(U) = \Psi^{-1}(f^{-1}(U))
\]
is a Borel set in $\mathbb{R}^{2n}$. In the same way, we see that $g \circ \Phi$ is a Borel function and thus $F$ is a Borel function as a product of two Borel functions. Since the product Lebesgue measure on $\mathbb{R}^n \times \mathbb{R}^n$ is a Borel measure, we conclude that $F$ is measurable. ■

THE MORAL: We may assume that the functions $f$ and $g$ in the definition of convolution are Borel functions. In this case Fubini's theorem is available as a tool to show that the integral is well defined.

The next result settles the integrability questions in the definition of the convolution under certain assumptions.

Theorem 3.10 (Young's theorem). Assume that $f \in L^p(\mathbb{R}^n)$, $1 \le p \le \infty$, and $g \in L^1(\mathbb{R}^n)$. Then $(f * g)(x)$ exists for almost every $x \in \mathbb{R}^n$ and $\|f * g\|_p \le \|f\|_p\|g\|_1$.

THE MORAL: The convolution of an $L^p$ function and an $L^1$ function is well defined. Moreover, it is an $L^p$ function.

WARNING: $f, g \in L^1(\mathbb{R}^n)$ does not imply that the function $y \mapsto f(y)g(x-y)$ is in $L^1(\mathbb{R}^n)$ for a fixed $x \in \mathbb{R}^n$. A product of integrable functions is not necessarily integrable. However, $\|f * g\|_1 \le \|f\|_1\|g\|_1$ and thus $f * g \in L^1(\mathbb{R}^n)$.

Proof. $p = 1$: First assume that $f$ and $g$ are nonnegative. Then $f(y)g(x-y)$ is a nonnegative measurable function on $\mathbb{R}^{2n}$ and by Fubini's theorem for nonnegative functions
\[
\int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(y)g(x-y)\,dy\,dx = \int_{\mathbb{R}^n}\int_{\mathbb{R}^n} f(y)g(x-y)\,dx\,dy.
\]

On the other hand, Z Z Z (f g)(x) dx f (y)g(x y) d y dx Rn ∗ = Rn Rn − Z µZ ¶ f (y) g(x y) dx d y = Rn Rn − Z µZ ¶ f (y) g(x) dx d y = Rn Rn Z Z f (y) d y g(x) dx = Rn Rn

Thus we have f g 1 f 1 g 1 and the claim holds in this case. k ∗ k = k k k k Let us then consider the general case. By the begining of the proof f g | | ∗ | | exists almost everywhere. Thus for almost every x the function y f (y)g(x y) 7→ | − | is integrable. This means that for almost every x the function y f (y)g(x y) is 7→ − integrable and we conclude that f g exists almost everywhere. Since f g ∗ | ∗ | É f g , we have | | ∗ | | f g 1 f g 1 f 1 g 1. k ∗ k É k | ∗ | |k = k k k k p = ∞ Z (f g)(x) f (y) g(x y) d y | ∗ | É Rn | || − | Z esssup f (y) g(x y) d y f g 1. É y Rn | | Rn | − | = k k∞k k ∈

This implies that f g f g 1. k ∗ k∞ É k k∞k k 1 p By Hölder’s inequality < < ∞ Z (f g)(x) f (y) g(x y) d y | ∗ | É Rn | || − | Z f (y) g(x y) 1/p g(x y) 1/p0 d y = Rn | || − | | − | µZ ¶1/p µZ ¶1/p0 f (y) p g(x y) d y g(x y) d y . É Rn | | | − | Rn | − |

This implies Z p p/p0 p (f g)(x) g 1 f (y) g(x y) d y | ∗ | É k k Rn | | | − | and thus Z Z Z p p/p0 p (f g)(x) dx g 1 f (y) g(x y) d y dx Rn | ∗ | É k k Rn Rn | | | − | Z µZ ¶ p/p0 p g 1 f (y) g(x y) dx d y (Fubini) = k k Rn | | Rn | − |

p/p0 p p p g g 1 f g f . = k k1 k k k kp = k k1 k kp ä CHAPTER 3. CONVOLUTIONS 60
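As a sanity check of Young's theorem, the following sketch tests the inequality in a discrete setting. It is only an analogue and an assumption of this illustration: finite sequences on the integers with counting measure replace functions on $\mathbb{R}^n$ with Lebesgue measure, and the helper names `convolve`, `lp_norm` and `young_gap` are hypothetical. The same inequality $\|f * g\|_p \le \|f\|_p \|g\|_1$ holds there, with equality for $p = 1$ when both sequences are nonnegative.

```python
import random


def convolve(f, g):
    """Full discrete convolution (f*g)[k] = sum_j f[j] * g[k-j] of finite sequences."""
    out = [0.0] * (len(f) + len(g) - 1)
    for i, a in enumerate(f):
        for j, b in enumerate(g):
            out[i + j] += a * b
    return out


def lp_norm(f, p):
    """l^p norm of a finite sequence; p = float('inf') gives the sup norm."""
    if p == float("inf"):
        return max(abs(a) for a in f)
    return sum(abs(a) ** p for a in f) ** (1.0 / p)


def young_gap(f, g, p):
    """||f||_p * ||g||_1 - ||f*g||_p, which Young's inequality says is nonnegative."""
    return lp_norm(f, p) * lp_norm(g, 1) - lp_norm(convolve(f, g), p)


random.seed(0)
f = [random.uniform(-1, 1) for _ in range(40)]
g = [random.uniform(-1, 1) for _ in range(25)]
gaps = {p: young_gap(f, g, p) for p in (1, 2, 3.5, float("inf"))}

# Nonnegative sequences give equality for p = 1, as in the first step of the proof.
equality_gap = young_gap([1, 2, 3], [4, 5], 1)
```

The dictionary `gaps` holds one nonnegative number per exponent, up to rounding, and `equality_gap` vanishes because $\|f * g\|_1 = \|f\|_1 \|g\|_1$ for nonnegative data.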

Remark 3.11. Let $f \in L^1(\mathbb{R}^n) \setminus L^2(\mathbb{R}^n)$ such that $f(x) = f(-x)$. By Young's inequality $(f * f)(x) < \infty$ for almost every $x \in \mathbb{R}^n$. However,
\[
(f * f)(0) = \int_{\mathbb{R}^n} f(y)f(-y)\,dy = \int_{\mathbb{R}^n} |f(y)|^2\,dy = \infty,
\]
which shows that $f * f$ blows up at $x = 0$.

Theorem 3.12. Assume that $1 \le p \le \infty$, $f \in L^p(\mathbb{R}^n)$ and $g \in L^{p'}(\mathbb{R}^n)$. Then $(f * g)(x)$ exists for every $x \in \mathbb{R}^n$ and $\|f * g\|_\infty \le \|f\|_p \|g\|_{p'}$. Moreover, the function $f * g$ is uniformly continuous in $\mathbb{R}^n$.

THE MORAL: The convolution of an $L^p$ function and an $L^{p'}$ function is well defined. Moreover, it is a bounded and continuous function.

Proof. In the general case, either $p$ or $p'$ is finite (or both). Assume $p < \infty$. By Hölder's inequality
\begin{align*}
|(f * g)(x)| &= \left| \int_{\mathbb{R}^n} f(x - y)g(y)\,dy \right| \\
&\le \left( \int_{\mathbb{R}^n} |f(x - y)|^p\,dy \right)^{1/p} \left( \int_{\mathbb{R}^n} |g(y)|^{p'}\,dy \right)^{1/p'} \\
&= \|f\|_p \|g\|_{p'} < \infty
\end{align*}
for every $x \in \mathbb{R}^n$. This implies $\|f * g\|_\infty \le \|f\|_p \|g\|_{p'}$.

Moreover, by Hölder's inequality and the reflection and translation invariance of the Lebesgue integral, we have
\begin{align*}
|(f * g)(x) - (f * g)(z)| &= \left| \int_{\mathbb{R}^n} (f(x - y) - f(z - y))g(y)\,dy \right| \\
&\le \left( \int_{\mathbb{R}^n} |f(x - y) - f(z - y)|^p\,dy \right)^{1/p} \left( \int_{\mathbb{R}^n} |g(y)|^{p'}\,dy \right)^{1/p'} \\
&= \left( \int_{\mathbb{R}^n} |f(y - x) - f(y - z)|^p\,dy \right)^{1/p} \left( \int_{\mathbb{R}^n} |g(y)|^{p'}\,dy \right)^{1/p'} \\
&= \|\tau_{-x} f - \tau_{-z} f\|_p \|g\|_{p'} = \|\tau_{z - x} f - f\|_p \|g\|_{p'}.
\end{align*}
Theorem 3.6 implies that for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\|\tau_{z - x} f - f\|_p < \varepsilon$ whenever $|z - x| < \delta$. Thus
\[
|(f * g)(x) - (f * g)(z)| < \varepsilon \|g\|_{p'}
\]
for every $x, z \in \mathbb{R}^n$ with $|z - x| < \delta$. Therefore, $f * g$ is uniformly continuous. $\square$

The following lemma shows that the convolution regarded as a multiplication in $L^1(\mathbb{R}^n)$ satisfies certain standard algebraic laws.

Lemma 3.13. Assume $f, g, h \in L^1(\mathbb{R}^n)$ and $a, b \in \mathbb{R}$. Then the following claims are true:

(1) (Commutative law) $f * g = g * f$.
(2) (Associative law) $f * (g * h) = (f * g) * h$.
(3) (Distributive law) $(af + bg) * h = a(f * h) + b(g * h)$.

Proof. (Exercise) $\square$

3.3 Approximations of the identity

The previous lemma, the Riesz-Fischer theorem and Young's theorem show that $L^1(\mathbb{R}^n)$ is a commutative Banach algebra with the convolution as a product. This algebra does not have a multiplicative identity, that is, there does not exist $\phi \in L^1(\mathbb{R}^n)$ such that $\phi * f = f$ for every $f \in L^1(\mathbb{R}^n)$.

Reason. Assume that there exists such a $\phi$. Then, in particular, $\phi * f = f$ for every $f \in L^\infty(\mathbb{R}^n)$ with a compact support. Theorem 3.12 implies that $\phi * f$ is continuous. Since $\phi * f = f$, this shows that every $f \in L^\infty(\mathbb{R}^n)$ with a compact support is continuous. This is not true, take $f = \chi_{B(0,1)}$, for example. ■

However, there exist approximations of the identity in the sense that there exists a collection of functions $\phi_\varepsilon \in L^1(\mathbb{R}^n)$ such that $\phi_\varepsilon * f \to f$ in $L^1(\mathbb{R}^n)$ as $\varepsilon \to 0$. In fact, the limit exists in $L^p(\mathbb{R}^n)$ and pointwise under appropriate assumptions. This gives a very useful method to produce approximations of functions in $L^p(\mathbb{R}^n)$.

Definition 3.14. Let $\phi \in L^1(\mathbb{R}^n)$, $\varepsilon > 0$. Define
\[
\phi_\varepsilon(x) = \frac{1}{\varepsilon^n} \phi\left( \frac{x}{\varepsilon} \right), \quad x \in \mathbb{R}^n.
\]
Such a collection of functions is called an approximation of the identity.

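Remark 3.15. Let $\phi \in C_0(\mathbb{R}^n)$.

(1) Since a continuous function attains its maximum value in a compact set, we have
\[
\sup_{x \in \mathbb{R}^n} |\phi(x)| = \max_{x \in \mathbb{R}^n} |\phi(x)| < \infty.
\]
The definition of $\phi_\varepsilon$ implies
\[
\|\phi_\varepsilon\|_\infty = \max_{x \in \mathbb{R}^n} |\phi_\varepsilon(x)| = \frac{1}{\varepsilon^n} \max_{x \in \mathbb{R}^n} |\phi(x)| = \frac{1}{\varepsilon^n} \|\phi\|_\infty, \quad \varepsilon > 0.
\]
Unless $\phi(x) = 0$ for every $x \in \mathbb{R}^n$, we have
\[
\lim_{\varepsilon \to 0} \max_{x \in \mathbb{R}^n} |\phi_\varepsilon(x)| = \infty.
\]
For the supports we have
\[
\operatorname{supp} \phi_\varepsilon = \varepsilon \operatorname{supp} \phi, \quad \varepsilon > 0.
\]
Reason. Since
\begin{align*}
\{x \in \mathbb{R}^n : \phi_\varepsilon(x) \ne 0\} &= \left\{ x \in \mathbb{R}^n : \frac{1}{\varepsilon^n} \phi\left( \frac{x}{\varepsilon} \right) \ne 0 \right\} = \left\{ x \in \mathbb{R}^n : \phi\left( \frac{x}{\varepsilon} \right) \ne 0 \right\} \\
&= \{\varepsilon x \in \mathbb{R}^n : \phi(x) \ne 0\} = \varepsilon \{x \in \mathbb{R}^n : \phi(x) \ne 0\},
\end{align*}
we have
\[
\operatorname{supp} \phi_\varepsilon = \overline{\{x \in \mathbb{R}^n : \phi_\varepsilon(x) \ne 0\}} = \overline{\varepsilon \{x \in \mathbb{R}^n : \phi(x) \ne 0\}} = \varepsilon\, \overline{\{x \in \mathbb{R}^n : \phi(x) \ne 0\}} = \varepsilon \operatorname{supp} \phi. \quad ■
\]
Since $\operatorname{supp} \phi$ is compact, $\operatorname{supp} \phi_\varepsilon$ is compact for every $\varepsilon > 0$. Thus $|\operatorname{supp} \phi| < \infty$ and
\[
\lim_{\varepsilon \to 0} |\operatorname{supp} \phi_\varepsilon| = \lim_{\varepsilon \to 0} |\varepsilon \operatorname{supp} \phi| = \lim_{\varepsilon \to 0} \varepsilon^n |\operatorname{supp} \phi| = 0.
\]

(2) Let $1 \le p < \infty$. By the change of variables $y = \frac{x}{\varepsilon}$, $dx = \varepsilon^n\,dy$, we have
\begin{align*}
\int_{\mathbb{R}^n} |\phi_\varepsilon(x)|^p\,dx &= \int_{\mathbb{R}^n} \left| \frac{1}{\varepsilon^n} \phi\left( \frac{x}{\varepsilon} \right) \right|^p dx = \frac{1}{\varepsilon^{np}} \int_{\mathbb{R}^n} \left| \phi\left( \frac{x}{\varepsilon} \right) \right|^p dx \\
&= \frac{1}{\varepsilon^{np}}\, \varepsilon^n \int_{\mathbb{R}^n} |\phi(y)|^p\,dy = \varepsilon^{n(1 - p)} \int_{\mathbb{R}^n} |\phi(y)|^p\,dy
\end{align*}
and thus
\[
\|\phi_\varepsilon\|_p = \left( \frac{1}{\varepsilon^n} \right)^{(p - 1)/p} \|\phi\|_p.
\]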

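THE MORAL: Smaller values of $\varepsilon > 0$ produce higher peaks and smaller supports. Convolution with approximations of the identity is expected to act as the identity operator on a class of functions as $\varepsilon \to 0$.

Examples 3.16:

(1) Let $\phi : \mathbb{R}^n \to \mathbb{R}$,
\[
\phi(x) = \frac{\chi_{B(0,1)}(x)}{|B(0,1)|}.
\]
Then
\[
\phi_\varepsilon(x) = \frac{1}{\varepsilon^n} \frac{\chi_{B(0,1)}(x/\varepsilon)}{|B(0,1)|} = \frac{\chi_{B(0,\varepsilon)}(x)}{|B(0,\varepsilon)|}, \quad \varepsilon > 0.
\]
Assume $f \in L^1(\mathbb{R}^n)$. Then
\[
(f * \phi_\varepsilon)(x) = \int_{\mathbb{R}^n} f(y)\phi_\varepsilon(x - y)\,dy = \frac{1}{|B(x,\varepsilon)|} \int_{B(x,\varepsilon)} f(y)\,dy
\]
is the integral average of $f$ over the ball $B(x,\varepsilon)$. By Young's inequality
\[
\|f * \phi_\varepsilon\|_1 \le \|f\|_1 \|\phi_\varepsilon\|_1 = \|f\|_1 \quad \text{for every } \varepsilon > 0,
\]
since $\|\phi_\varepsilon\|_1 = 1$ for every $\varepsilon > 0$. By the Lebesgue differentiation theorem, Theorem 2.26, we have
\[
\lim_{\varepsilon \to 0} (f * \phi_\varepsilon)(x) = \lim_{\varepsilon \to 0} \frac{1}{|B(x,\varepsilon)|} \int_{B(x,\varepsilon)} f(y)\,dy = f(x)
\]
for almost every $x \in \mathbb{R}^n$.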

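(2) Let $\varphi : \mathbb{R}^n \to \mathbb{R}$,
\[
\varphi(x) = \begin{cases} \exp\left( \dfrac{1}{|x|^2 - 1} \right), & |x| < 1, \\ 0, & |x| \ge 1. \end{cases}
\]
Then $\varphi \in C_0(\mathbb{R}^n)$ and thus $\varphi \in L^1(\mathbb{R}^n)$ with $0 < \|\varphi\|_1 < \infty$. Let
\[
\phi(x) = \frac{\varphi(x)}{\|\varphi\|_1}, \quad x \in \mathbb{R}^n.
\]
Then $\phi_\varepsilon \in C_0(\mathbb{R}^n)$ and $\operatorname{supp} \phi_\varepsilon = \overline{B(0,\varepsilon)}$. Moreover
\begin{align*}
\int_{\mathbb{R}^n} \phi_\varepsilon(x)\,dx &= \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} \phi\left( \frac{x}{\varepsilon} \right) dx = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} \phi(y)\,\varepsilon^n\,dy \qquad \left( y = \frac{x}{\varepsilon},\ dx = \varepsilon^n\,dy \right) \\
&= \int_{\mathbb{R}^n} \phi(y)\,dy = \int_{\mathbb{R}^n} \frac{\varphi(x)}{\|\varphi\|_1}\,dx = \frac{\|\varphi\|_1}{\|\varphi\|_1} = 1, \quad \varepsilon > 0.
\end{align*}
Young's inequality implies
\[
\|f * \phi_\varepsilon\|_1 \le \|f\|_1 \|\phi_\varepsilon\|_1 = \|f\|_1 \quad \text{for every } \varepsilon > 0.
\]
The function $\phi_\varepsilon$ is called the standard mollifier. Write $f_\varepsilon = f * \phi_\varepsilon$. Since
\[
\int_{B(x,\varepsilon)} \phi_\varepsilon(x - y)\,dy = 1
\]
for every $\varepsilon > 0$, the Lebesgue differentiation theorem implies
\begin{align*}
|f_\varepsilon(x) - f(x)| &= \left| \int_{B(x,\varepsilon)} \phi_\varepsilon(x - y)f(y)\,dy - f(x) \int_{B(x,\varepsilon)} \phi_\varepsilon(x - y)\,dy \right| \\
&= \left| \int_{B(x,\varepsilon)} \phi_\varepsilon(x - y)(f(y) - f(x))\,dy \right| \\
&\le \frac{1}{\varepsilon^n} \int_{B(x,\varepsilon)} \phi\left( \frac{x - y}{\varepsilon} \right) |f(y) - f(x)|\,dy \\
&\le \Omega_n \|\phi\|_{L^\infty(\mathbb{R}^n)} \frac{1}{|B(x,\varepsilon)|} \int_{B(x,\varepsilon)} |f(y) - f(x)|\,dy \to 0
\end{align*}
for almost every $x \in \mathbb{R}^n$ as $\varepsilon \to 0$. Here $\Omega_n = |B(0,1)|$.

Next we discuss properties of a general approximation of the identity.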

Lemma 3.17. Let $\phi \in L^1(\mathbb{R}^n)$.

(1) $\displaystyle \int_{\mathbb{R}^n} \phi_\varepsilon(x)\,dx = \int_{\mathbb{R}^n} \phi(x)\,dx$ for every $\varepsilon > 0$.
(2) $\displaystyle \lim_{\varepsilon \to 0} \int_{\mathbb{R}^n \setminus B(0,r)} |\phi_\varepsilon(x)|\,dx = 0$ for every $r > 0$.

THE MORAL: The assertion (1) explains the scaling factors in the definition of $\phi_\varepsilon$. These are chosen so that the integral of $\phi_\varepsilon$ is independent of $\varepsilon > 0$. The assertion (2) tells that the integral of $|\phi_\varepsilon|$ outside any fixed ball centered at the origin becomes as small as we please for $\varepsilon > 0$ small enough. This indicates that $|\phi_\varepsilon|$ is concentrated in a small neighbourhood of the origin. Note that assertions (1) and (2) hold for compactly supported continuous functions.

Remark 3.18. The assertion (2) is clear, if the support of $\phi$ is a compact set.

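Proof. (1) By a change of variables
\[
\int_{\mathbb{R}^n} \phi_\varepsilon(x)\,dx = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} \phi\left( \frac{x}{\varepsilon} \right) dx = \int_{\mathbb{R}^n} \phi(y)\,dy \qquad \left( y = \frac{x}{\varepsilon},\ dx = \varepsilon^n\,dy \right).
\]

(2) By the same change of variables as above
\begin{align*}
\int_{\mathbb{R}^n \setminus B(0,r)} |\phi_\varepsilon(x)|\,dx &= \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n \setminus B(0,r)} \left| \phi\left( \frac{x}{\varepsilon} \right) \right| dx \\
&= \int_{\mathbb{R}^n \setminus B(0, \frac{r}{\varepsilon})} |\phi(y)|\,dy \qquad \left( y = \frac{x}{\varepsilon},\ dx = \varepsilon^n\,dy \right) \\
&= \int_{\mathbb{R}^n} |\phi(x)|\, \chi_{\mathbb{R}^n \setminus B(0, \frac{r}{\varepsilon})}(x)\,dx \to 0
\end{align*}
as $\varepsilon \to 0$ by the Lebesgue dominated convergence theorem with the integrable majorant
\[
|\phi|\, \chi_{\mathbb{R}^n \setminus B(0, \frac{r}{\varepsilon})} \le |\phi| \in L^1(\mathbb{R}^n) \quad \text{for every } \varepsilon > 0. \qquad \square
\]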
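Both assertions of Lemma 3.17 can be observed numerically in one dimension. The sketch below is an illustration under assumptions that are not part of the text: it takes the Gaussian $\phi(x) = \pi^{-1/2} e^{-x^2}$ as a sample kernel, uses a plain midpoint rule, and truncates the integrals at ten kernel widths where the Gaussian is negligible. The names `phi_eps`, `midpoint`, `mass` and `tail` are hypothetical helpers.

```python
import math


def phi(x):
    """Sample kernel: the Gaussian with integral 1 (an assumption of this sketch)."""
    return math.exp(-x * x) / math.sqrt(math.pi)


def phi_eps(x, eps):
    """Rescaled kernel phi_eps(x) = phi(x / eps) / eps in dimension n = 1."""
    return phi(x / eps) / eps


def midpoint(func, a, b, steps=20000):
    """Midpoint rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))


def mass(eps):
    """Integral of phi_eps over the real line, truncated to [-10 eps, 10 eps]."""
    return midpoint(lambda x: phi_eps(x, eps), -10 * eps, 10 * eps)


def tail(eps, r):
    """Integral of phi_eps over |x| > r, truncated to r < |x| < r + 10 eps."""
    return 2 * midpoint(lambda x: phi_eps(x, eps), r, r + 10 * eps)


masses = [mass(eps) for eps in (1.0, 0.1, 0.01)]    # assertion (1): all close to 1
tails = [tail(eps, 0.5) for eps in (1.0, 0.3, 0.1)]  # assertion (2): decreasing to 0
```

For this particular kernel the tail has the closed form $\operatorname{erfc}(r/\varepsilon)$, which gives an independent check of the quadrature.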

3.4 Pointwise convergence

There is a connection between the approximations of the identity and the Hardy-Littlewood maximal function. Recall that a function $\phi : \mathbb{R}^n \to \mathbb{R}$ is radial, if its value only depends on $|x|$. Thus a nonnegative radial function is of the form $f(x) = \varphi(|x|)$ for some function $\varphi : \mathbb{R}_+ \to \mathbb{R}_+$. We say that a radial function is decreasing, if $|x| \ge |y|$ implies $\phi(x) \le \phi(y)$.

Theorem 3.19. Assume that $\phi \in L^1(\mathbb{R}^n)$ is nonnegative, radial and decreasing. Then
\[
\sup_{\varepsilon > 0} |(\phi_\varepsilon * f)(x)| \le \|\phi\|_1 M f(x)
\]
for every $x \in \mathbb{R}^n$.

THE MORAL: The Hardy-Littlewood maximal operator gives a pointwise upper bound for many other operators as well. In this sense the maximal function controls the weighted averages of a function with respect to any radial and decreasing function.

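Proof. First assume in addition to the given hypotheses that $\phi$ is a simple function in the form
\[
\phi = \sum_i a_i \chi_{B(0,r_i)}
\]
with $a_i > 0$. The sum here is over finitely many indices only. Then
\[
\|\phi\|_1 = \int_{\mathbb{R}^n} \sum_i a_i \chi_{B(0,r_i)}\,dx = \sum_i \int_{\mathbb{R}^n} a_i \chi_{B(0,r_i)}\,dx = \sum_i a_i |B(0,r_i)|.
\]
Moreover,
\begin{align*}
|(\phi_\varepsilon * f)(x)| &= \left| \int_{\mathbb{R}^n} f(x - y)\phi_\varepsilon(y)\,dy \right| = \left| \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} f(x - y)\phi\left( \frac{y}{\varepsilon} \right) dy \right| \\
&= \left| \int_{\mathbb{R}^n} f(x - \varepsilon z)\phi(z)\,dz \right| \qquad \left( z = \frac{y}{\varepsilon},\ y = \varepsilon z,\ dy = \varepsilon^n\,dz \right) \\
&= \left| \sum_i \int_{B(0,r_i)} f(x - \varepsilon z) a_i\,dz \right| \le \sum_i a_i \int_{B(0,r_i)} |f(x - \varepsilon z)|\,dz \\
&= \sum_i a_i |B(0,r_i)| \frac{1}{|B(0,r_i)|} \int_{B(0,r_i)} |f(x - \varepsilon z)|\,dz,
\end{align*}
where
\begin{align*}
\frac{1}{|B(0,r_i)|} \int_{B(0,r_i)} |f(x - \varepsilon z)|\,dz &= \frac{1}{\varepsilon^n |B(0,r_i)|} \int_{B(x,\varepsilon r_i)} |f(y)|\,dy \qquad \left( y = x - \varepsilon z,\ z = \frac{x - y}{\varepsilon},\ dz = \varepsilon^{-n}\,dy \right) \\
&= \frac{1}{|B(x,\varepsilon r_i)|} \int_{B(x,\varepsilon r_i)} |f(y)|\,dy \le M f(x).
\end{align*}
Thus
\[
|(\phi_\varepsilon * f)(x)| \le \sum_i a_i |B(0,r_i)|\, M f(x) = \|\phi\|_1 M f(x).
\]

Then we consider the general case. Since $\phi$ is nonnegative, radial and decreasing, there is an increasing sequence of nonnegative simple functions $\phi_j$ of the form above such that $\phi_j(x) \to \phi(x)$ for every $x \in \mathbb{R}^n$ as $j \to \infty$. By the Lebesgue monotone convergence theorem
\begin{align*}
|(\phi_\varepsilon * f)(x)| &\le \int_{\mathbb{R}^n} |f(x - y)|\,\phi_\varepsilon(y)\,dy = \int_{\mathbb{R}^n} |f(x - y)| \lim_{j \to \infty} (\phi_j)_\varepsilon(y)\,dy \\
&= \lim_{j \to \infty} \int_{\mathbb{R}^n} |f(x - y)|\,(\phi_j)_\varepsilon(y)\,dy \le \lim_{j \to \infty} \|\phi_j\|_1 M f(x) = \|\phi\|_1 M f(x)
\end{align*}
for every $x \in \mathbb{R}^n$. $\square$

Theorem 3.20. Assume that $\phi \in L^1(\mathbb{R}^n)$ is nonnegative, radial and decreasing, $a = \int_{\mathbb{R}^n} \phi(x)\,dx$ and $f \in L^p(\mathbb{R}^n)$, $1 \le p \le \infty$. Then
\[
\lim_{\varepsilon \to 0} (\phi_\varepsilon * f)(x) = a f(x) \quad \text{for almost every } x \in \mathbb{R}^n.
\]

THE MORAL: This is the Lebesgue differentiation theorem for approximations of the identity. This shows that the convolution approximations can be seen as weighted averages of the function.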

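Proof. $1 \le p < \infty$: Define a maximal operator related to the approximation of the identity by
\[
M_\phi f(x) = \sup_{\varepsilon > 0} |(\phi_\varepsilon * f)(x)|.
\]
By Theorem 3.19
\[
M_\phi f(x) \le \|\phi\|_1 M f(x) \quad \text{for every } x \in \mathbb{R}^n.
\]
By the weak type estimate for the Hardy-Littlewood maximal function, see Theorem 2.20, for $f \in L^1(\mathbb{R}^n)$, we have
\[
|\{x \in \mathbb{R}^n : M_\phi f(x) > \lambda\}| \le |\{x \in \mathbb{R}^n : \|\phi\|_1 M f(x) > \lambda\}| \le \frac{5^n \|\phi\|_1}{\lambda} \|f\|_1
\]
for every $\lambda > 0$.

On the other hand, by Chebyshev's inequality and the strong type estimate for the Hardy-Littlewood maximal function, see Theorem 2.24, for $f \in L^p(\mathbb{R}^n)$, $1 < p < \infty$, we have
\[
|\{x \in \mathbb{R}^n : M_\phi f(x) > \lambda\}| \le \frac{1}{\lambda^p} \|M_\phi f\|_p^p \le \frac{c}{\lambda^p} \|f\|_p^p \tag{3.2}
\]
for every $\lambda > 0$, where the constant $c$ depends only on $n$, $p$ and $\|\phi\|_1$. Thus (3.2) holds for $1 \le p < \infty$.

The proof of the claim is based on these two estimates and is somewhat similar to the proof of Theorem 2.26. Let $\eta > 0$. Since compactly supported continuous functions are dense in $L^p(\mathbb{R}^n)$, there exists $g \in C_0(\mathbb{R}^n)$ such that $\|f - g\|_p < \eta$. Since $g$ is uniformly continuous, there exists $\delta > 0$ such that $|g(x - y) - g(x)| < \eta$ whenever $|y| < \delta$. By Lemma 3.17 (1), we have
\[
a g(x) = g(x) \int_{\mathbb{R}^n} \phi(y)\,dy = g(x) \int_{\mathbb{R}^n} \phi_\varepsilon(y)\,dy = \int_{\mathbb{R}^n} g(x)\phi_\varepsilon(y)\,dy.
\]
This implies
\begin{align*}
|(\phi_\varepsilon * g)(x) - a g(x)| &= \left| \int_{\mathbb{R}^n} g(x - y)\phi_\varepsilon(y)\,dy - \int_{\mathbb{R}^n} g(x)\phi_\varepsilon(y)\,dy \right| \\
&= \left| \int_{\mathbb{R}^n} (g(x - y) - g(x))\phi_\varepsilon(y)\,dy \right| \\
&\le \int_{\mathbb{R}^n} |g(x - y) - g(x)|\,\phi_\varepsilon(y)\,dy \\
&= \int_{B(0,\delta)} \dots\,dy + \int_{\mathbb{R}^n \setminus B(0,\delta)} \dots\,dy \\
&\le \eta \int_{B(0,\delta)} \phi_\varepsilon(y)\,dy + 2 \|g\|_\infty \int_{\mathbb{R}^n \setminus B(0,\delta)} \phi_\varepsilon(y)\,dy.
\end{align*}
By Lemma 3.17
\[
\int_{B(0,\delta)} \phi_\varepsilon(y)\,dy \le \|\phi_\varepsilon\|_1 = \|\phi\|_1
\]
and
\[
\int_{\mathbb{R}^n \setminus B(0,\delta)} \phi_\varepsilon(y)\,dy \to 0 \quad \text{as } \varepsilon \to 0.
\]
By letting first $\varepsilon \to 0$ and then $\eta \to 0$, we have
\[
\limsup_{\varepsilon \to 0} |(\phi_\varepsilon * g)(x) - a g(x)| = 0 \quad \text{for every } x \in \mathbb{R}^n.
\]
This shows that
\[
\lim_{\varepsilon \to 0} (\phi_\varepsilon * g)(x) = a g(x) \quad \text{for every } x \in \mathbb{R}^n,
\]
that is, the claim of the theorem holds for $g \in C_0(\mathbb{R}^n)$ at every point.

Then we consider the corresponding claim for $f \in L^p(\mathbb{R}^n)$. We note that
\begin{align*}
\limsup_{\varepsilon \to 0} |(\phi_\varepsilon * f)(x) - a f(x)| &\le \limsup_{\varepsilon \to 0} |(\phi_\varepsilon * (f - g))(x) - a(f - g)(x)| + \underbrace{\limsup_{\varepsilon \to 0} |(\phi_\varepsilon * g)(x) - a g(x)|}_{= 0} \\
&\le M_\phi (f - g)(x) + a |(f - g)(x)|.
\end{align*}
Let
\[
A_i = \left\{ x \in \mathbb{R}^n : \limsup_{\varepsilon \to 0} |(\phi_\varepsilon * f)(x) - a f(x)| > \frac{1}{i} \right\}, \quad i = 1, 2, \dots.
\]
As in the proof of Theorem 2.26
\[
A_i \subset \left\{ x \in \mathbb{R}^n : M_\phi (f - g)(x) > \frac{1}{2i} \right\} \cup \left\{ x \in \mathbb{R}^n : |f(x) - g(x)| > \frac{1}{2i} \right\}, \quad i = 1, 2, \dots,
\]
and by (3.2) and Chebyshev's inequality we have
\begin{align*}
|A_i| &\le \left| \left\{ x \in \mathbb{R}^n : M_\phi (f - g)(x) > \frac{1}{2i} \right\} \right| + \left| \left\{ x \in \mathbb{R}^n : |f(x) - g(x)| > \frac{1}{2i} \right\} \right| \\
&\le c\, i^p \|f - g\|_p^p + (2i)^p \|f - g\|_p^p \\
&= c\, i^p \|f - g\|_p^p \le c\, i^p \eta^p, \quad i = 1, 2, \dots,
\end{align*}
where the value of the constant $c$ may change from line to line. By letting $\eta \to 0$, we conclude $|A_i| = 0$ for every $i = 1, 2, \dots$ and thus $\left| \bigcup_{i=1}^\infty A_i \right| \le \sum_{i=1}^\infty |A_i| = 0$. This shows that
\[
\left| \left\{ x \in \mathbb{R}^n : \limsup_{\varepsilon \to 0} |(\phi_\varepsilon * f)(x) - a f(x)| > 0 \right\} \right| = 0,
\]
from which we conclude that
\[
\limsup_{\varepsilon \to 0} |(\phi_\varepsilon * f)(x) - a f(x)| = 0 \quad \text{for almost every } x \in \mathbb{R}^n.
\]

$p = \infty$: Let $f \in L^\infty(\mathbb{R}^n)$. We show that
\[
\lim_{\varepsilon \to 0} (\phi_\varepsilon * f)(x) = a f(x) \quad \text{for almost every } x \in B(0,r),\ r > 0.
\]
Let $f_1 = f \chi_{B(0,r+1)}$ and $f_2 = f - f_1$. Then $f_1 \in L^1(\mathbb{R}^n)$ and by the beginning of the proof
\[
\lim_{\varepsilon \to 0} (\phi_\varepsilon * f_1)(x) = a f_1(x) \quad \text{for almost every } x \in \mathbb{R}^n.
\]
We claim that
\[
\lim_{\varepsilon \to 0} (\phi_\varepsilon * f_2)(x) = 0 \quad \text{for almost every } x \in B(0,r),\ r > 0.
\]
To see this, let $x \in B(0,r)$ and $|y| < 1$. Then $x - y \in B(0,r+1)$ and thus $f_2(x - y) = 0$. This implies
\begin{align*}
|(\phi_\varepsilon * f_2)(x)| &= \left| \int_{\mathbb{R}^n} f_2(x - y)\phi_\varepsilon(y)\,dy \right| = \left| \int_{\mathbb{R}^n \setminus B(0,1)} f_2(x - y)\phi_\varepsilon(y)\,dy \right| \\
&\le \|f_2\|_\infty \int_{\mathbb{R}^n \setminus B(0,1)} \phi_\varepsilon(y)\,dy \to 0 \quad \text{as } \varepsilon \to 0.
\end{align*}
Since $f = f_1$ in $B(0,r)$ and $\phi_\varepsilon * f = \phi_\varepsilon * f_1 + \phi_\varepsilon * f_2$, the claim follows. $\square$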

Remark 3.21. If $f \in L^\infty(\mathbb{R}^n)$ is continuous at $x$, then
\[
\lim_{\varepsilon \to 0} (\phi_\varepsilon * f)(x) = a f(x).
\]
Moreover, if $f \in L^\infty(\mathbb{R}^n)$ is uniformly continuous, the convergence is uniform. (Exercise)

3.5 Convergence in Lp

Theorem 3.20 asserts that a convolution approximation of an $L^p$ function converges almost everywhere, but in general almost everywhere convergence does not imply convergence in $L^p$. However, the next result shows that this is true for approximations of the identity.

Theorem 3.22. Assume that $\phi \in L^1(\mathbb{R}^n)$, $a = \int_{\mathbb{R}^n} \phi(x)\,dx$ and $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$. Then
\[
\lim_{\varepsilon \to 0} \|\phi_\varepsilon * f - a f\|_p = 0.
\]

THE MORAL: Approximations of the identity converge in $L^p$ for $1 \le p < \infty$.

WARNING: The result does not hold true for $p = \infty$. In this case the corresponding claim is the following: If $f \in L^\infty(\mathbb{R}^n)$ is uniformly continuous, then $\phi_\varepsilon * f \to a f$ uniformly in $\mathbb{R}^n$, that is,
\[
\lim_{\varepsilon \to 0} \|\phi_\varepsilon * f - a f\|_\infty = 0.
\]

Proof. By Lemma 3.17 (1), we have
\[
a f(x) = f(x) \int_{\mathbb{R}^n} \phi(y)\,dy = f(x) \int_{\mathbb{R}^n} \phi_\varepsilon(y)\,dy = \int_{\mathbb{R}^n} f(x)\phi_\varepsilon(y)\,dy.
\]

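Thus
\begin{align*}
|(f * \phi_\varepsilon)(x) - a f(x)| &= \left| \int_{\mathbb{R}^n} (f(x - y) - f(x))\phi_\varepsilon(y)\,dy \right| \\
&\le \int_{\mathbb{R}^n} |f(x - y) - f(x)|\,|\phi_\varepsilon(y)|^{1/p} |\phi_\varepsilon(y)|^{1/p'}\,dy,
\end{align*}
where $\frac{1}{p} + \frac{1}{p'} = 1$, $1 < p < \infty$ (the case $p = 1$ is an exercise). By Fubini's theorem and Hölder's inequality
\begin{align*}
\int_{\mathbb{R}^n} |(f * \phi_\varepsilon)(x) - a f(x)|^p\,dx &\le \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p |\phi_\varepsilon(y)|\,dy \right) \left( \int_{\mathbb{R}^n} |\phi_\varepsilon(y)|\,dy \right)^{p/p'} dx \\
&= \|\phi\|_1^{p/p'} \int_{\mathbb{R}^n} |\phi_\varepsilon(y)| \left( \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,dx \right) dy,
\end{align*}
where
\[
\int_{\mathbb{R}^n} |\phi_\varepsilon(y)| \left( \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,dx \right) dy = \int_{B(0,r)} \dots\,dy + \int_{\mathbb{R}^n \setminus B(0,r)} \dots\,dy.
\]
Let $\eta > 0$. By Theorem 3.6 there exists $r > 0$ such that
\[
\int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,dx < \frac{\eta}{2(\|\phi\|_1^p + 1)} \quad \text{for every } y \in B(0,r).
\]
This shows that
\[
\|\phi\|_1^{p/p'} \int_{B(0,r)} |\phi_\varepsilon(y)| \left( \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,dx \right) dy \le \frac{\|\phi\|_1^{p/p' + 1}\, \eta}{2(\|\phi\|_1^p + 1)} \le \frac{\eta}{2} \qquad \left( \frac{p}{p'} + 1 = p \right).
\]
By Lemma 3.17 (2), there exists $\varepsilon_0 > 0$ such that
\[
\int_{\mathbb{R}^n \setminus B(0,r)} |\phi_\varepsilon(y)|\,dy < \frac{\eta}{2^{p+2} (\|f\|_p^p \|\phi\|_1^{p/p'} + 1)} \quad \text{as } 0 < \varepsilon < \varepsilon_0.
\]
Since $|f(x - y) - f(x)|^p \le 2^p (|f(x - y)|^p + |f(x)|^p)$, this shows that
\begin{align*}
\|\phi\|_1^{p/p'} \int_{\mathbb{R}^n \setminus B(0,r)} |\phi_\varepsilon(y)| \left( \int_{\mathbb{R}^n} |f(x - y) - f(x)|^p\,dx \right) dy &\le 2^{p+1} \|\phi\|_1^{p/p'} \|f\|_p^p \int_{\mathbb{R}^n \setminus B(0,r)} |\phi_\varepsilon(y)|\,dy \\
&< \frac{\eta}{2} \quad \text{as } 0 < \varepsilon < \varepsilon_0.
\end{align*}
Thus
\[
\|f * \phi_\varepsilon - a f\|_p^p < \frac{\eta}{2} + \frac{\eta}{2} = \eta. \qquad \square
\]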

Remark 3.23. An examination of the proof above shows that a more general result holds as well. Let $\phi_i \in L^1(\mathbb{R}^n)$, $i = 1, 2, \dots$, be a sequence with the properties

(1) $\displaystyle \lim_{i \to \infty} \int_{\mathbb{R}^n} \phi_i(x)\,dx = a$,
(2) $\displaystyle \sup_i \int_{\mathbb{R}^n} |\phi_i(x)|\,dx < \infty$ and
(3) $\displaystyle \lim_{i \to \infty} \int_{\mathbb{R}^n \setminus B(0,r)} |\phi_i(x)|\,dx = 0$ for every $r > 0$.

Then
\[
\lim_{i \to \infty} \|\phi_i * f - a f\|_p = 0.
\]
Note that here $\phi_i$ do not have to be nonnegative or given by the formula for the approximate identity.

3.6 Smoothing properties

For a positive integer $m$, let $C^m(\mathbb{R}^n)$ denote the class of functions $f : \mathbb{R}^n \to \mathbb{R}$, whose partial derivatives
\[
D^\alpha f = \frac{\partial^{\alpha_1 + \dots + \alpha_n} f}{\partial x_1^{\alpha_1} \cdots \partial x_n^{\alpha_n}}
\]
up to and including those of order $m$ exist and are continuous. The subset of $C^m(\mathbb{R}^n)$ with functions of compact support is denoted by $C_0^m(\mathbb{R}^n)$. Moreover, $C^\infty(\mathbb{R}^n)$ is the class of functions which have continuous partial derivatives of all orders, that is,
\[
C^\infty(\mathbb{R}^n) = \bigcap_{m=1}^\infty C^m(\mathbb{R}^n),
\]
and $C_0^\infty(\mathbb{R}^n)$ is the corresponding class of functions with a compact support. Note that there are such functions.

Example 3.24. The function $\phi$ in Example 3.16 (2) belongs to $C_0^\infty(\mathbb{R}^n)$ (exercise). Hint: First define $h : \mathbb{R} \to \mathbb{R}$,
\[
h(t) = \begin{cases} 0, & t \le 0, \\ \exp(-1/t), & t > 0. \end{cases}
\]
Then $h \in C^\infty(\mathbb{R})$. Prove by induction that $h^{(m)}(t) = P_m(1/t) \exp(-1/t)$ for $t > 0$, where $P_m$ is a polynomial of degree $2m$. Then prove by induction that $h^{(m)}(0) = 0$. Now $\varphi(x) = h(1 - |x|^2)$, with $\varphi$ as in Example 3.16 (2), belongs to $C^\infty(\mathbb{R}^n)$ as a composition of smooth functions. Moreover, if $|x| \ge 1$, then $1 - |x|^2 \le 0$ and thus $h(1 - |x|^2) = 0$. Therefore $\varphi$, and hence also $\phi = \varphi / \|\varphi\|_1$, belongs to $C_0^\infty(\mathbb{R}^n)$.

Theorem 3.25. If $f \in L^p(\mathbb{R}^n)$, $1 \le p \le \infty$, and $\phi \in C_0^\infty(\mathbb{R}^n)$, then $f * \phi_\varepsilon \in C^\infty(\mathbb{R}^n)$ and
\[
D^\alpha (f * \phi_\varepsilon)(x) = (f * D^\alpha \phi_\varepsilon)(x)
\]
for every $x \in \mathbb{R}^n$, $\varepsilon > 0$, $\alpha = (\alpha_1, \dots, \alpha_n) \in \mathbb{N}^n$.
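Returning to the hint in Example 3.24: differentiating $P_m(1/t)\exp(-1/t)$ once and writing $s = 1/t$ gives the recursion $P_{m+1}(s) = s^2 (P_m(s) - P_m'(s))$ with $P_0 = 1$, which is one way to organise the induction. The sketch below generates the polynomials from this recursion and reads off their degrees. The coefficient-list representation and the helper names `derivative`, `next_poly`, `poly_P` and `degree` are choices made for this illustration only.

```python
def derivative(coeffs):
    """Derivative of the polynomial sum_k coeffs[k] * s**k, as a coefficient list."""
    return [k * c for k, c in enumerate(coeffs)][1:] or [0]


def next_poly(coeffs):
    """One step of the recursion P_{m+1}(s) = s^2 * (P_m(s) - P_m'(s))."""
    d = derivative(coeffs)
    diff = [c - (d[k] if k < len(d) else 0) for k, c in enumerate(coeffs)]
    return [0, 0] + diff  # multiplying by s^2 shifts the coefficients by two places


def poly_P(m):
    """Coefficients of P_m, starting from P_0 = 1."""
    p = [1]
    for _ in range(m):
        p = next_poly(p)
    return p


def degree(coeffs):
    """Degree of a nonzero polynomial given by its coefficient list."""
    return max(k for k, c in enumerate(coeffs) if c != 0)


degrees = [degree(poly_P(m)) for m in range(8)]  # expected to be 0, 2, 4, ..., 14
```

The first two steps give $P_1(s) = s^2$ and $P_2(s) = s^4 - 2s^3$, in agreement with $h'(t) = t^{-2} e^{-1/t}$ and $h''(t) = (t^{-4} - 2t^{-3}) e^{-1/t}$. The leading coefficient is preserved by the recursion, which is why the degree grows by exactly two in each step.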

THE MORAL: Convolution inherits smoothness of the mollifier, since we differentiate under the integral sign. This is justified by the Lebesgue dominated convergence theorem.

WARNING: In general $f * \phi_\varepsilon \notin C_0^\infty(\mathbb{R}^n)$, that is, the convolution approximation does not have a compact support.

Remark 3.26. Theorem 3.22 implies that $C^\infty(\mathbb{R}^n) \cap L^p(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ for $1 \le p < \infty$.

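Proof. We observe that $f * \phi_\varepsilon$ is a continuous function (exercise). Let $e_i = (0, \dots, 1, \dots, 0)$ be the standard $i$th basis vector in $\mathbb{R}^n$, $i = 1, \dots, n$, and $0 < |h| < 1$, $h \in \mathbb{R}$. Then
\[
\frac{(f * \phi_\varepsilon)(x + h e_i) - (f * \phi_\varepsilon)(x)}{h} = \frac{1}{\varepsilon^n} \int_{\mathbb{R}^n} \frac{1}{h} \left[ \phi\left( \frac{x + h e_i - y}{\varepsilon} \right) - \phi\left( \frac{x - y}{\varepsilon} \right) \right] f(y)\,dy.
\]

CLAIM:
\[
\frac{1}{h} \left[ \phi\left( \frac{x + h e_i - y}{\varepsilon} \right) - \phi\left( \frac{x - y}{\varepsilon} \right) \right] \to \frac{1}{\varepsilon} \frac{\partial \phi}{\partial x_i}\left( \frac{x - y}{\varepsilon} \right) \quad \text{as } h \to 0.
\]
Reason. Let
\[
\varphi(x) = \phi\left( \frac{x - y}{\varepsilon} \right).
\]
Then
\[
\frac{\partial \varphi}{\partial x_i}(x) = \frac{1}{\varepsilon} \frac{\partial \phi}{\partial x_i}\left( \frac{x - y}{\varepsilon} \right). \quad ■
\]

Next we derive a bound so that we may apply the Lebesgue dominated convergence theorem. By the fundamental theorem of calculus
\[
\varphi(x + h e_i) - \varphi(x) = \int_0^h \frac{\partial}{\partial t} \big( \varphi(x + t e_i) \big)\,dt = \int_0^h D\varphi(x + t e_i) \cdot e_i\,dt,
\]
where $D\varphi = \left( \frac{\partial \varphi}{\partial x_1}, \dots, \frac{\partial \varphi}{\partial x_n} \right)$ is the gradient of $\varphi$. This implies
\begin{align*}
|\varphi(x + h e_i) - \varphi(x)| &\le \int_0^{|h|} |D\varphi(x + t e_i) \cdot e_i|\,dt = \frac{1}{\varepsilon} \int_0^{|h|} \left| D\phi\left( \frac{x + t e_i - y}{\varepsilon} \right) \cdot e_i \right| dt \\
&\le \frac{1}{\varepsilon} \int_0^{|h|} \left| D\phi\left( \frac{x + t e_i - y}{\varepsilon} \right) \right| dt \le \frac{|h|}{\varepsilon} \|D\phi\|_\infty.
\end{align*}
Define
\[
K = \left\{ y \in \mathbb{R}^n : \frac{x - y}{\varepsilon} \in \operatorname{supp} \phi \ \text{ or } \ \frac{x + h e_i - y}{\varepsilon} \in \operatorname{supp} \phi,\ 0 < |h| < 1 \right\}.
\]
Since $\operatorname{supp} \phi$ is compact, we see that $K$ is a bounded set. By the Lebesgue dominated convergence theorem
\begin{align*}
\frac{\partial (f * \phi_\varepsilon)}{\partial x_i}(x) &= \lim_{h \to 0} \frac{(f * \phi_\varepsilon)(x + h e_i) - (f * \phi_\varepsilon)(x)}{h} \\
&= \lim_{h \to 0} \frac{1}{\varepsilon^n} \int_K \frac{1}{h} \left[ \phi\left( \frac{x + h e_i - y}{\varepsilon} \right) - \phi\left( \frac{x - y}{\varepsilon} \right) \right] f(y)\,dy \\
&= \frac{1}{\varepsilon^n} \int_K \lim_{h \to 0} \frac{1}{h} \left[ \phi\left( \frac{x + h e_i - y}{\varepsilon} \right) - \phi\left( \frac{x - y}{\varepsilon} \right) \right] f(y)\,dy \qquad \left( \text{LDCT, } \tfrac{1}{\varepsilon} \|D\phi\|_\infty |f| \in L^1(K) \right) \\
&= \frac{1}{\varepsilon^n} \int_K \frac{1}{\varepsilon} \frac{\partial \phi}{\partial x_i}\left( \frac{x - y}{\varepsilon} \right) f(y)\,dy \\
&= \int_K \frac{\partial \phi_\varepsilon}{\partial x_i}(x - y) f(y)\,dy = \left( \frac{\partial \phi_\varepsilon}{\partial x_i} * f \right)(x).
\end{align*}
Since this partial derivative is given by a similar convolution as in the definition of $f * \phi_\varepsilon$ itself, it is a continuous function. By induction it follows that $f * \phi_\varepsilon$ possesses continuous partial derivatives of all orders. $\square$

Next we show that every function in $L^p(\mathbb{R}^n)$ can be approximated by compactly supported smooth functions for $1 \le p < \infty$. This result does not hold true for $p = \infty$. This is simply because the uniform limit of continuous functions is itself continuous.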

Remark 3.27. The closure of $C_0^\infty(\mathbb{R}^n)$ in $L^\infty(\mathbb{R}^n)$ is the subspace of $C(\mathbb{R}^n)$ consisting of functions satisfying
\[
\lim_{|x| \to \infty} f(x) = 0.
\]
(Exercise)

Theorem 3.28. $C_0^\infty(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ for $1 \le p < \infty$.

THE MORAL: Not only compactly supported continuous functions, but also compactly supported smooth functions are dense in $L^p$ for $1 \le p < \infty$.

Proof. Assume $f \in L^p(\mathbb{R}^n)$ and let $\eta > 0$. Theorem 3.3 shows that $C_0(\mathbb{R}^n)$ is dense in $L^p(\mathbb{R}^n)$ so that there exists $g \in C_0(\mathbb{R}^n)$ such that $\|f - g\|_{L^p(\mathbb{R}^n)} < \eta/2$. Let $\phi_\varepsilon$ be the standard mollifier in Example 3.16 (2). Theorem 3.25 shows that $g * \phi_\varepsilon \in C^\infty(\mathbb{R}^n)$.

CLAIM: $\operatorname{supp}(g * \phi_\varepsilon)$ is compact.

Reason. If $(g * \phi_\varepsilon)(x) \ne 0$, then there exists $y \in \mathbb{R}^n$ such that $g(y)\phi_\varepsilon(x - y) \ne 0$, which implies that $g(y) \ne 0$ and $\phi_\varepsilon(x - y) \ne 0$. If $g(y) \ne 0$, then $y \in \operatorname{supp} g$ and we denote $K = \operatorname{supp} g$. If $\phi_\varepsilon(x - y) \ne 0$, then $|x - y| \le \varepsilon$. Thus
\[
K_\varepsilon = \{x \in \mathbb{R}^n : \operatorname{dist}(x, K) \le \varepsilon\}
\]
is a compact set and $(g * \phi_\varepsilon)(x) = 0$ for every $x \in \mathbb{R}^n \setminus K_\varepsilon$. This implies that $g * \phi_\varepsilon$ has a compact support. ■

By Theorem 3.22 there exists $\varepsilon_0 > 0$ such that
\[
\|g - (g * \phi_\varepsilon)\|_p < \frac{\eta}{2} \quad \text{when } 0 < \varepsilon < \varepsilon_0.
\]
Thus
\[
\|f - (g * \phi_\varepsilon)\|_p \le \|f - g\|_p + \|g - (g * \phi_\varepsilon)\|_p < \frac{\eta}{2} + \frac{\eta}{2} = \eta. \qquad \square
\]

3.7 The Poisson kernel

We consider an example from the theory of partial differential equations. Assume that $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$. Let $P : \mathbb{R}^n \to \mathbb{R}$,
\[
P(x) = c(n) \frac{1}{(1 + |x|^2)^{(n+1)/2}}
\]
be the Poisson kernel, where $c(n) = \Gamma((n+1)/2) / \pi^{(n+1)/2}$ is chosen such that
\[
\int_{\mathbb{R}^n} P(x)\,dx = 1.
\]
Then
\[
P_\varepsilon(x) = \frac{1}{\varepsilon^n} P\left( \frac{x}{\varepsilon} \right) = c(n) \frac{\varepsilon}{(|x|^2 + \varepsilon^2)^{(n+1)/2}}, \quad \varepsilon > 0,
\]
is an approximation of the identity and we may apply the theory developed above. By Young's theorem, the Poisson integral of $f$
\[
u(x, \varepsilon) = (f * P_\varepsilon)(x) = \int_{\mathbb{R}^n} P_\varepsilon(x - y) f(y)\,dy
\]
is well defined and
\[
\|f * P_\varepsilon\|_p \le \|f\|_p \|P_\varepsilon\|_1 = \|f\|_p \quad \text{for every } \varepsilon > 0.
\]
Theorem 3.20 implies that
\[
\lim_{\varepsilon \to 0} (f * P_\varepsilon)(x) = f(x) \quad \text{for almost every } x \in \mathbb{R}^n.
\]
Theorem 3.25 shows that the function $x \mapsto u(x, \varepsilon) = (f * P_\varepsilon)(x)$ belongs to $C^\infty(\mathbb{R}^n)$ for every fixed $\varepsilon > 0$. Moreover, the function $u$ is a solution to the Laplace equation in the upper half space
\[
\mathbb{R}^{n+1}_+ = \{(x_1, \dots, x_n, \varepsilon) \in \mathbb{R}^{n+1} : \varepsilon > 0\},
\]
that is,
\[
\Delta u = \frac{\partial^2 u}{\partial x_1^2} + \dots + \frac{\partial^2 u}{\partial x_n^2} + \frac{\partial^2 u}{\partial \varepsilon^2} = 0 \quad \text{in } \mathbb{R}^{n+1}_+.
\]
Thus $u(x, \varepsilon) = (f * P_\varepsilon)(x)$ is a solution to the Dirichlet problem
\[
\begin{cases} \Delta u = 0 & \text{in } \mathbb{R}^{n+1}_+, \\ u = f & \text{on } \partial \mathbb{R}^{n+1}_+ = \mathbb{R}^n, \end{cases}
\]
in the sense that
\[
\lim_{\varepsilon \to 0} u(x, \varepsilon) = f(x) \quad \text{for almost every } x \in \mathbb{R}^n.
\]
Moreover, Theorem 3.22 shows that $u(\cdot, \varepsilon) \to f$ in $L^p(\mathbb{R}^n)$ as $\varepsilon \to 0$. Note also, that by Theorem 3.19, there exists a constant $c$ such that
\[
\sup_{\varepsilon > 0} |(f * P_\varepsilon)(x)| \le c M f(x) \quad \text{for every } x \in \mathbb{R}^n.
\]

THE MORAL: This gives a method to define and study a solution to a Dirichlet problem in the upper half space for boundary values that only belong to $L^p$. In particular, the boundary values do not have to be continuous or bounded. On the other hand, this gives another point of view to the convolution approximations. They can be seen as extensions of functions to the upper half space.
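The harmonicity of the Poisson kernel in the upper half space can be checked numerically. The following sketch evaluates $P_\varepsilon(x)$ from the formula above and approximates the Laplacian in the variables $(x_1, \dots, x_n, \varepsilon)$ by central second differences. The step size `h`, the sample points and the function names `c`, `poisson` and `laplacian` are assumptions of this illustration, and a finite difference check is of course not a proof.

```python
import math


def c(n):
    """Normalising constant c(n) = Gamma((n+1)/2) / pi^((n+1)/2)."""
    return math.gamma((n + 1) / 2) / math.pi ** ((n + 1) / 2)


def poisson(x, eps):
    """P_eps(x) = c(n) * eps / (|x|^2 + eps^2)^((n+1)/2) for a point x given as a tuple."""
    n = len(x)
    r2 = sum(t * t for t in x)
    return c(n) * eps / (r2 + eps * eps) ** ((n + 1) / 2)


def laplacian(x, eps, h=1e-3):
    """Central-difference approximation of the Laplacian in (x_1, ..., x_n, eps)."""
    centre = poisson(x, eps)
    # second difference in the eps direction
    total = (poisson(x, eps + h) - 2 * centre + poisson(x, eps - h)) / h**2
    # second differences in each x_i direction
    for i in range(len(x)):
        up = x[:i] + (x[i] + h,) + x[i + 1:]
        down = x[:i] + (x[i] - h,) + x[i + 1:]
        total += (poisson(up, eps) - 2 * centre + poisson(down, eps)) / h**2
    return total


lap_1d = laplacian((0.3,), 0.7)       # n = 1, a point in the upper half plane
lap_2d = laplacian((0.3, -0.4), 0.7)  # n = 2, a point in the upper half space
```

Both values are small compared with the size of the individual second differences, as expected for a harmonic function. For $n = 1$ and $n = 2$ the constant reduces to the familiar values $c(1) = 1/\pi$ and $c(2) = 1/(2\pi)$.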

Remark 3.29. Let $\mu$ be a finite Radon measure on $\mathbb{R}^n$. The convolution of $\mu$ with a function $f \in L^1(\mathbb{R}^n; \mu)$ is defined as
\[
(f * \mu)(x) = \int_{\mathbb{R}^n} f(x - y)\,d\mu(y).
\]
It can be shown that
\[
\|P_\varepsilon * \mu\|_1 \le \mu(\mathbb{R}^n) \quad \text{and} \quad \lim_{\varepsilon \to 0} \|P_\varepsilon * \mu\|_1 = \mu(\mathbb{R}^n).
\]
Moreover,
\[
\lim_{\varepsilon \to 0} \int_{\mathbb{R}^n} (P_\varepsilon * \mu)(x) f(x)\,dx = \int_{\mathbb{R}^n} f(x)\,d\mu(x) \quad \text{for every } f \in C_0(\mathbb{R}^n).
\]
(Exercise). This means that the measures $(P_\varepsilon * \mu)(x)\,dx$ converge weakly to $\mu$ as $\varepsilon \to 0$. We shall discuss the weak convergence of measures later. Note that this holds, in particular, when $\mu$ is Dirac's delta. In this case we obtain the fundamental solution, which is the Poisson kernel itself.

Derivatives of measures are very useful tools in representing measures as integrals with respect to another measure. The Radon-Nikodym theorem is a version of the fundamental theorem of calculus for measures. It has applications not only in analysis but also in probability theory. Differentiation of measures also extends the Lebesgue differentiation theorem for more general Radon measures. A powerful Besicovitch covering theorem is used in the arguments.

4 Differentiation of measures

There exists a useful differentiation theory for measures which has similar fea- tures as the differentiation theory for real functions. The first problem is to find a way to define the derivative of measures and to show that it exists.

4.1 Covering theorems

Let us recall the definition of a Radon measure from the measure and integration theory.

Definition 4.1. Let µ be an outer measure on Rn.

(1) $\mu$ is called a Borel outer measure, if all Borel sets are $\mu$-measurable.
(2) A Borel outer measure $\mu$ is called Borel regular, if for every set $A \subset \mathbb{R}^n$ there exists a Borel set $B$ such that $A \subset B$ and $\mu(A) = \mu(B)$.
(3) $\mu$ is a Radon outer measure, if $\mu$ is Borel regular and $\mu(K) < \infty$ for every compact set $K \subset \mathbb{R}^n$.

THE MORAL: The Lebesgue outer measure is a Radon measure. General Radon measures have many good approximation properties similar to the Lebesgue measure. There is also a natural way to construct Radon measures by the Riesz representation theorem. This will be discussed in this course.

For an arbitrary Radon measure $\mu$ on $\mathbb{R}^n$, there is no systematic way to control $\mu(B(x, 5r))$ in terms of $\mu(B(x, r))$. The measure $\mu$ is called doubling, if there is a constant $c$ such that
\[
\mu(B(x, 2r)) \le c\, \mu(B(x, r)) \quad \text{for every } x \in \mathbb{R}^n,\ r > 0.
\]


If $\mu$ is doubling, then
\[
\mu(B(x, 5r)) \le c\, \mu(B(x, 5r/2)) \le c^2 \mu(B(x, 5r/4)) \le c^3 \mu(B(x, 5r/8)) \le c^3 \mu(B(x, r)) \quad \text{for every } x \in \mathbb{R}^n,\ r > 0.
\]
Let $A$ be a bounded subset of $\mathbb{R}^n$ and suppose that for every $x \in A$ there is a ball $B(x, r_x)$ with the radius $r_x > 0$ possibly depending on the point $x$. By the Vitali covering theorem, see Theorem 2.18, we have a countable subcollection of pairwise disjoint balls $B(x_i, r_i)$, $i = 1, 2, \dots$, dilates of which cover the union of the original balls. Thus
\begin{align*}
\mu(A) &\le \mu\Big( \bigcup_{x \in A} B(x, r_x) \Big) \le \mu\Big( \bigcup_{i=1}^\infty B(x_i, 5r_i) \Big) \le \sum_{i=1}^\infty \mu(B(x_i, 5r_i)) \\
&\le c^3 \sum_{i=1}^\infty \mu(B(x_i, r_i)) = c^3 \mu\Big( \bigcup_{i=1}^\infty B(x_i, r_i) \Big) \le c^3 \mu\Big( \bigcup_{x \in A} B(x, r_x) \Big).
\end{align*}
This shows that for a doubling measure we can use similar covering arguments as for the Lebesgue measure. However, for a general Radon measure Theorem 2.18 is not useful. We need a covering theorem that does not require us to enlarge the balls, but allows the balls to have overlap. The claim is purely geometric and it will be an important tool to prove other covering theorems.

Theorem 4.2 (Besicovitch covering theorem). There exist integers $P$ and $Q$, depending only on $n$, with the following properties. Let $A \subset \mathbb{R}^n$ be a bounded set and let $\mathcal{F}$ be a collection of closed balls $B(x, r)$ such that every point of $A$ is a center of some ball in $\mathcal{F}$.

(1) There is a countable subcollection of balls $B(x_i, r_i) \in \mathcal{F}$, $i = 1, 2, \dots$, such that they cover the set $A$, that is,
\[
A \subset \bigcup_{i=1}^\infty B(x_i, r_i),
\]
and that every point of $\mathbb{R}^n$ belongs to at most $P$ balls $B(x_i, r_i)$, that is,
\[
\chi_A \le \sum_{i=1}^\infty \chi_{B(x_i, r_i)} \le P.
\]

(2) There are subcollections $\mathcal{F}_1, \dots, \mathcal{F}_Q \subset \mathcal{F}$ such that each $\mathcal{F}_k$ consists of countably many pairwise disjoint balls in $\mathcal{F}$ and
\[
A \subset \bigcup_{k=1}^Q \bigcup_{B(x_i, r_i) \in \mathcal{F}_k} B(x_i, r_i).
\]

THE MORAL: Property (1) asserts that the subcollection covers the set of center points of the original balls and that the balls in the subcollection have bounded overlap. Property (2) asserts that the subcollection can be distributed in a finite number of subcollections of disjoint balls.

Let $A$ be a bounded subset of $\mathbb{R}^n$ and suppose that for every $x \in A$ there is a ball $B(x, r_x)$ with the radius $r_x > 0$ possibly depending on the point $x$. By the Besicovitch covering theorem, we have a countable subcollection of balls $B(x_i, r_i)$, $i = 1, 2, \dots$, which covers the set $A$ and has bounded overlap. Thus
\[
\mu(A) \le \mu\Big( \bigcup_{i=1}^\infty B(x_i, r_i) \Big) \le \sum_{i=1}^\infty \mu(B(x_i, r_i)) \le P \mu\Big( \bigcup_{i=1}^\infty B(x_i, r_i) \Big) \le P \mu\Big( \bigcup_{x \in A} B(x, r_x) \Big).
\]

Example 4.3. Let $\mu$ be the Radon measure on $\mathbb{R}^2$ defined by
\[
\mu(A) = |\{x \in \mathbb{R} : (x, 0) \in A\}|,
\]
where $|\cdot|$ denotes the one-dimensional Lebesgue measure. The collection
\[
\mathcal{F} = \{B((x, y), y) : x \in \mathbb{R},\ 0 < y < \infty\}
\]
of closed balls covers the set $A = \{(x, 0) : x \in \mathbb{R}\}$, but for any countable subcollection we have
\[
\mu\Big( A \cap \bigcup_{i=1}^\infty B_i \Big) = 0.
\]

THE MORAL: The previous example shows that it is essential in the Besicovitch covering theorem, that every point of $A$ is (more or less) a center of some ball in the collection. In particular, it is not enough, that every point of $A$ belongs to a ball in the collection.

We need a couple of lemmas in the proof of the Besicovitch covering theorem.

Lemma 4.4. If $x, y \in \mathbb{R}^n$, $0 < |x| < |x - y|$ and $0 < |y| < |x - y|$, then
\[
\left| \frac{x}{|x|} - \frac{y}{|y|} \right| \ge 1.
\]

THE MORAL: This means that the angle between the vectors $x$ and $y$ is at least $60^\circ$.
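The statement of Lemma 4.4 can be tested on random data before reading the argument. The sketch below samples pairs of points, keeps those satisfying the hypotheses $0 < |x| < |x - y|$ and $0 < |y| < |x - y|$, and records the smallest observed value of $|x/|x| - y/|y||$. The sampling box, the dimension, the seed and the helper names `norm`, `unit_gap` and `smallest_gap` are arbitrary choices made for this illustration.

```python
import math
import random


def norm(v):
    """Euclidean norm of a vector given as a list of floats."""
    return math.sqrt(sum(t * t for t in v))


def unit_gap(x, y):
    """Distance |x/|x| - y/|y|| between the normalised vectors."""
    nx, ny = norm(x), norm(y)
    return norm([a / nx - b / ny for a, b in zip(x, y)])


def smallest_gap(trials=10000, dim=3, seed=0):
    """Smallest unit_gap over random pairs that satisfy the hypotheses of the lemma."""
    rng = random.Random(seed)
    worst = float("inf")
    for _ in range(trials):
        x = [rng.uniform(-1, 1) for _ in range(dim)]
        y = [rng.uniform(-1, 1) for _ in range(dim)]
        d = norm([a - b for a, b in zip(x, y)])
        if 0 < norm(x) < d and 0 < norm(y) < d:
            worst = min(worst, unit_gap(x, y))
    return worst


observed = smallest_gap()
```

The value `observed` stays at or above 1, in line with the lemma. Pairs whose angle is smaller than $60^\circ$ are filtered out by the hypotheses, since in the triangle with vertices $0$, $x$ and $y$ the side opposite the origin is then not the longest one.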

Reason. Since
\begin{align*}
\left| \frac{x}{|x|} - \frac{y}{|y|} \right|^2 &= \left\langle \frac{x}{|x|} - \frac{y}{|y|}, \frac{x}{|x|} - \frac{y}{|y|} \right\rangle \\
&= \frac{\langle x, x \rangle}{|x|^2} - \frac{\langle x, y \rangle}{|x||y|} - \frac{\langle y, x \rangle}{|x||y|} + \frac{\langle y, y \rangle}{|y|^2} \\
&= 2 - 2 \frac{\langle x, y \rangle}{|x||y|},
\end{align*}
we have
\begin{align*}
\left| \frac{x}{|x|} - \frac{y}{|y|} \right| \ge 1 &\iff 2 - 2 \frac{\langle x, y \rangle}{|x||y|} \ge 1 \\
&\iff \frac{\langle x, y \rangle}{|x||y|} \le \frac{1}{2} \\
&\iff \cos \angle(x, y) \le \frac{1}{2} \\
&\iff \angle(x, y) \ge 60^\circ. \quad ■
\end{align*}

Proof. We may assume that $n = 2$, since there is a plane containing $x$, $y$ and the origin. Moreover, we may assume that $x = (x_1, 0, \dots, 0)$. If $x \notin B(y, |y|)$ and $y \notin B(x, |x|)$, then by plane geometry $y_1 \le \frac{x_1}{2}$. Since
\[
\cos \alpha = \frac{|x|/2}{|x|} = \frac{1}{2}
\]
we conclude that $\angle(x, y) \ge 60^\circ$ (Figure required). Another way to prove the lemma is to use the cosine theorem. $\square$

The following lemma is the core of the proof of the Besicovitch covering theorem.

Lemma 4.5. There exists a constant $N = N(n)$ such that if $x_1, \dots, x_k \in \mathbb{R}^n$ and $r_1, \dots, r_k > 0$ are such that

(1) $x_j \notin B(x_i, r_i)$, whenever $i \ne j$ and
(2) $\bigcap_{i=1}^k B(x_i, r_i) \ne \emptyset$,

then $k \le N(n)$.

THE MORAL: Condition (1) asserts that the center of every ball belongs only to the ball of which it is the center and (2) asserts that all the balls intersect at some point. The claim is that there can be only finitely many such balls.

Remark 4.6. For example, the infinite dimensional Hilbert space $\ell^2$ does not have the property above, so that it is some kind of finite dimensionality condition.

Reason. Let $e_i$, $i = 1, 2, \dots$, be the standard orthonormal basis of $\ell^2$, that is, the $i$th term of $e_i$ is one and all other terms are zero. Then every closed ball $B(e_i, 1)$, $i = 1, 2, \dots$, contains the origin and every $e_i$ belongs only to the ball of which it is the center.

This example also shows that the Besicovitch covering theorem does not hold in $\ell^2$. Indeed, if we remove any ball $B(e_i, 1)$, then the center $e_i$ is not covered by the other balls. Moreover, the balls do not have bounded overlap at the origin. ■

Proof. We may assume that $0 \in \bigcap_{i=1}^k B(x_i, r_i)$. Then (1) implies that $x_i \ne 0$ for every $i = 1, \dots, k$. To see this, assume on the contrary that $x_{i_0} = 0$ for some $i_0 = 1, \dots, k$. Then $x_{i_0} \in B(x_i, r_i)$ for every $i = 1, 2, \dots, k$, which is not possible. We have
\[
0 < |x_i| < r_i < |x_i - x_j|, \quad j \ne i.
\]
Lemma 4.4 implies
\[
\left| \frac{x_i}{|x_i|} - \frac{x_j}{|x_j|} \right| \ge 1, \quad j \ne i. \tag{4.1}
\]
Since $\partial B(0,1) \subset \mathbb{R}^n$ is a compact set, it can be covered by finitely many balls $B(y_i, \frac{1}{2})$ with $y_i \in \partial B(0,1)$, $i = 1, \dots, N(n)$. Then $k \le N$, since otherwise for some indices $i, j \le k$, $i \ne j$, the points $\frac{x_i}{|x_i|}$ and $\frac{x_j}{|x_j|}$ would belong to the same ball $B(y_{i_0}, \frac{1}{2})$ with $i_0 \le N$. This implies
\[
\left| \frac{x_i}{|x_i|} - \frac{x_j}{|x_j|} \right| < 1,
\]
which contradicts (4.1). $\square$

Now we are ready for the proof of the Besicovitch covering theorem. This proof is technical and can be omitted in the first reading.

Proof. (1) Step 1 Since $A$ is bounded and for every $x \in A$ there exists $B(x, r_x) \in \mathcal{B}$, we may assume that

\[ M_1 = \sup\{r_x : x \in A\} < \infty. \]

(If $r_x > 2\operatorname{diam} A$ for some $x \in A$, then the single ball $B(x, r_x)$ satisfies the required properties.) Choose $x_1 \in A$ such that $r_{x_1} \ge \frac{M_1}{2}$. Then we choose recursively

\[ x_{j+1} \in A \setminus \bigcup_{i=1}^{j} B(x_i, r_{x_i}) \quad\text{such that}\quad r_{x_{j+1}} \ge \frac{M_1}{2}, \]

as long as this is possible. Since $|x_i - x_j| \ge \frac{M_1}{2}$, $i \ne j$, and $\frac{M_1}{2} \le r_{x_i} \le M_1$, we conclude that the balls $B\big(x_i, \frac{M_1}{4}\big)$, $i = 1,2,\dots$, are disjoint. Then $B\big(x_i, \frac{M_1}{4}\big) \subset B(x,R)$ with $R = \operatorname{diam}(A) + M_1$ and $x \in A$. This implies

\[ \sum_{i=1}^{k} \Big| B\Big(x_i, \frac{M_1}{4}\Big) \Big| = \Big| \bigcup_{i=1}^{k} B\Big(x_i, \frac{M_1}{4}\Big) \Big| \le |B(x,R)|. \]

On the other hand,

\[ \sum_{i=1}^{k} \Big| B\Big(x_i, \frac{M_1}{4}\Big) \Big| = k\,|B(0,1)| \Big(\frac{M_1}{4}\Big)^{n}, \]

which implies

\[ k \le \Big(\frac{4}{M_1}\Big)^{n} \frac{|B(x,R)|}{|B(0,1)|} < \infty, \]

and consequently there are only finitely many points $x_i$, $i = 1,\dots,k_1$.

Denote

\[ M_2 = \sup\Big\{ r_x : x \in A \setminus \bigcup_{i=1}^{k_1} B(x_i, r_{x_i}) \Big\} < \infty. \]

Choose

\[ x_{k_1+1} \in A \setminus \bigcup_{i=1}^{k_1} B(x_i, r_{x_i}) \quad\text{such that}\quad r_{x_{k_1+1}} \ge \frac{M_2}{2} \]

and recursively

\[ x_{j+1} \in A \setminus \bigcup_{i=1}^{j} B(x_i, r_{x_i}) \quad\text{such that}\quad r_{x_{j+1}} \ge \frac{M_2}{2}. \]

By the construction, we have $M_2 \le \frac{M_1}{2}$. Again we obtain finitely many points as above. By continuing this way, we obtain countably, or finitely, many

(1) indices $0 = k_0 < k_1 < k_2 < \dots$,
(2) numbers $M_i$ such that $M_{i+1} \le \frac{M_i}{2}$,
(3) balls $B(x_i, r_{x_i}) \in \mathcal{B}$ and
(4) classes of indices $I_j = \{k_{j-1}+1, \dots, k_j\}$, $j = 1,2,\dots$.

We shall show that the collection $B(x_i, r_{x_i})$, $i = 1,2,\dots$, has the desired properties.

CLAIM:

\[ \frac{M_j}{2} \le r_{x_i} \le M_j \le \frac{M_{j-1}}{2}, \quad\text{when } i \in I_j, \tag{4.2} \]

\[ x_{j+1} \in A \setminus \bigcup_{i=1}^{j} B(x_i, r_{x_i}) \quad\text{and} \tag{4.3} \]

\[ x_i \in A \setminus \bigcup_{m \ne k}\, \bigcup_{j \in I_m} B(x_j, r_{x_j}), \quad\text{when } i \in I_k. \tag{4.4} \]

Reason. The first two properties (4.2) and (4.3) follow from the construction. To prove (4.4), assume that $i \in I_k$, $m \ne k$ and $j \in I_m$. If $m < k$, then by (4.3) we have $x_i \notin B(x_j, r_{x_j})$. If $k < m$, then (4.2) and $m - 1 \ge k$ imply

\[ r_{x_j} \le M_m \le \frac{M_{m-1}}{2} \le \frac{M_k}{2} \le r_{x_i}. \]

Thus (4.3) implies $x_j \notin B(x_i, r_{x_i})$ and consequently $x_i \notin B(x_j, r_{x_j})$. ■

Observe that

\[ (4.2) \implies M_i \le 2^{1-i} M_1,\ i = 1,2,\dots \implies M_i \to 0,\ i \to \infty \implies r_{x_i} \to 0,\ i \to \infty. \]

CLAIM: $A \subset \bigcup_{i=1}^{\infty} B(x_i, r_{x_i})$.

Reason. Assume, on the contrary, that there exists $x \in A \setminus \bigcup_{i=1}^{\infty} B(x_i, r_{x_i})$. Then there exists $j$ such that $\frac{M_j}{2} \le r_x \le M_j$, which implies that $x \in \bigcup_{i \in I_j} B(x_i, r_{x_i})$. This is a contradiction. ■

Step 2 We shall show that every $x \in \mathbb{R}^n$ belongs to at most $P(n) = 16^n N(n)$ balls, where $N(n)$ is as in Lemma 4.5. Assume that $x \in \bigcap_{i=1}^{p} B(x_{m_i}, r_{x_{m_i}})$. Let $B_1,\dots,B_s$ be balls in the collection $\{B(x_{m_i}, r_{x_{m_i}})\}_{i=1,\dots,p}$ with the property that each ball belongs to a different class of indices $I_j$. Property (4.4) implies that $x \in \bigcap_{k=1}^{s} B_k$ and that no ball $B_k$ contains the center of any other ball $B_j$ with $k \ne j$. Lemma 4.5 implies $s \le N(n)$ and

\[ \#\big\{ j : I_j \cap \{m_i : i = 1,\dots,p\} \ne \emptyset \big\} \le N(n). \]

In other words, the indices $m_i$ can belong to at most $N(n)$ classes of indices $I_j$.

CLAIM: $\#\big( I_j \cap \{m_i : i = 1,\dots,p\} \big) \le 16^n$, $j = 1,2,\dots$.

Reason. Fix $j$ and denote

\[ I_j \cap \{m_i : i = 1,\dots,p\} = \{l_1,\dots,l_q\}. \]

Properties (4.2) and (4.3) imply that the balls $B(x_{l_i}, \tfrac14 r_{x_{l_i}})$, $i = 1,\dots,q$, are pairwise disjoint and they are contained in the ball $B(x, 2M_j)$. Thus

\[ q\,|B(0,1)| \Big(\frac{M_j}{8}\Big)^{n} \le \sum_{i=1}^{q} \Big| B\Big(x_{l_i}, \frac{M_j}{4}\Big) \Big| \le |B(x, 2M_j)| = |B(0,1)|\,(2M_j)^{n}. \]

This implies $q \le 16^n$. ■

(2) Let $B(x_i, r_{x_i})$, $i = 1,2,\dots$, be the collection of balls in claim (1) of the Besicovitch covering theorem. Since $M_j \to 0$ as $j \to \infty$, for every $\varepsilon > 0$ there are only a finite number of balls $B(x_i, r_{x_i})$ such that $r_{x_i} \ge \varepsilon$. Thus we may assume that $r_{x_1} \ge r_{x_2} \ge \dots$. Denote $B_i = B(x_i, r_{x_i})$, $i = 1,2,\dots$. Let $B_{1,1} = B_1$ and inductively $B_{1,j+1} = B_k$, where $k$ is the smallest index for which

\[ B_k \cap \bigcup_{i=1}^{j} B_{1,i} = \emptyset. \]

Continuing this way, we obtain a countable (or finite) subcollection

\[ \mathcal{B}_1 = \{B_{1,1}, B_{1,2}, \dots\}, \]

which consists of pairwise disjoint balls. If $\bigcup_{i=1}^{\infty} B_{1,i}$ does not cover the set $A$, we choose $B_{2,1} = B_k$, where $k$ is the smallest index for which $B_k \notin \mathcal{B}_1$. Inductively, let $B_{2,j+1} = B_k$, where $k$ is the smallest index for which

\[ B_k \cap \bigcup_{i=1}^{j} B_{2,i} = \emptyset. \]

This gives subcollections $\mathcal{B}_1, \mathcal{B}_2, \dots$ consisting of pairwise disjoint balls.

CLAIM: $A \subset \bigcup_{k=1}^{m} \bigcup_{i=1}^{\infty} B_{k,i}$ with $m = 4^n P + 1$.

Reason. We show that, if there exists $x \in A \setminus \bigcup_{k=1}^{m} \bigcup_{i=1}^{\infty} B_{k,i}$, then $m \le 4^n P$. Since $A \subset \bigcup_{i=1}^{\infty} B_i$, there exists $i$ such that $x \in B_i = B(x_i, r_{x_i})$. Then $B_i \notin \mathcal{B}_k$, $k \le m$, and, by the definition of $\mathcal{B}_k$, there exists $B_{k,i_k}$ such that $B_{k,i_k} \cap B_i \ne \emptyset$ and $r_{x_i} \le r_{x_{i_k}}$ for every $k \le m$. Thus for every $k \le m$ there exists a ball

\[ B'_k \subset B(x_i, 2r_{x_i}) \cap B_{k,i_k} \]

such that the radius of $B'_k$ is $r_{x_i}/2$. By (1), each point in $\mathbb{R}^n$ belongs to at most $P$ balls $B_{k,i_k}$, $k = 1,\dots,m$. This holds for the subballs $B'_k$ as well. This implies

\[ \sum_{k=1}^{m} \chi_{B'_k} \le P\, \chi_{\bigcup_{k=1}^{m} B'_k} \]

and consequently, since $B'_k \subset B(x_i, 2r_{x_i})$,

\[ 2^n r_{x_i}^n |B(0,1)| = |B(x_i, 2r_{x_i})| \ge \Big| \bigcup_{k=1}^{m} B'_k \Big| = \int_{\mathbb{R}^n} \chi_{\bigcup_{k=1}^{m} B'_k}\,dx \ge \frac{1}{P} \sum_{k=1}^{m} \int_{\mathbb{R}^n} \chi_{B'_k}\,dx = \frac{1}{P} \sum_{k=1}^{m} |B'_k| = \frac{m}{P}\,|B(0,1)| \Big(\frac{r_{x_i}}{2}\Big)^{n}. \]

This shows that $m \le 4^n P$. ■

Remarks 4.7:
(1) The assumption that $A$ is bounded can be replaced with

\[ \sup\{r : B(x, r) \in \mathcal{F}\} < \infty \]

in the Besicovitch covering theorem 4.2.
(2) The balls in the Besicovitch covering theorem 4.2 need not be closed.
(3) The balls can be replaced, for example, by cubes in the Besicovitch covering theorem 4.2.

We take another look at the Vitali covering theorem.

Theorem 4.8 (Infinitesimal Vitali covering theorem). Let $\mu$ be a Radon measure on $\mathbb{R}^n$, $A \subset \mathbb{R}^n$ and $\mathcal{F}$ a collection of closed balls such that each point of $A$ is a center of arbitrarily small balls in $\mathcal{F}$, that is,

\[ \inf\{r > 0 : B(x, r) \in \mathcal{F}\} = 0 \quad\text{for every } x \in A. \]

Then there are disjoint balls $B(x_i, r_i) \in \mathcal{F}$, $i = 1,2,\dots$, such that

\[ \mu\Big( A \setminus \bigcup_{i=1}^{\infty} B(x_i, r_i) \Big) = 0. \]

THE MORAL: The infinitesimal Vitali covering theorem implies that every open set can be exhausted by countably many disjoint balls up to a set of measure zero.

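Proof. We may assume that $\mu(A) > 0$, because otherwise the claim is clear. Assume first that $A$ is bounded. Let $Q$ denote the number of subcollections given by the Besicovitch covering theorem. By approximation properties of measurable sets for a Radon measure, there exists an open set $G \supset A$ such that

\[ \mu(G) \le \Big(1 + \frac{1}{4Q}\Big) \mu(A). \]

By the Besicovitch covering theorem, there are subcollections $\mathcal{F}_1,\dots,\mathcal{F}_Q$ such that the balls in each $\mathcal{F}_k$ are disjoint and

\[ A \subset \bigcup_{k=1}^{Q}\ \bigcup_{B(x_i,r_i) \in \mathcal{F}_k} B(x_i, r_i) \subset G. \]

This implies

\[ \mu(A) \le \sum_{k=1}^{Q} \mu\Big( \bigcup_{\mathcal{F}_k} B(x_i, r_i) \Big). \]

Thus there exists $k$ such that

\[ \mu(A) \le Q\,\mu\Big( \bigcup_{\mathcal{F}_k} B(x_i, r_i) \Big). \]

Reason. If $\mu(A) > Q\,\mu\big( \bigcup_{\mathcal{F}_k} B(x_i, r_i) \big)$ for every $k = 1,\dots,Q$, then

\[ Q\,\mu(A) = \sum_{k=1}^{Q} \mu(A) > Q \sum_{k=1}^{Q} \mu\Big( \bigcup_{\mathcal{F}_k} B(x_i, r_i) \Big). \]

This is impossible by the previous inequality. ■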

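Since the balls in $\mathcal{F}_k$ are disjoint and $\mu(A) > 0$, the measure of their union is the limit of the measures of finite subunions. Hence there exists a finite subcollection $\mathcal{F}'_1 \subset \mathcal{F}_k$ such that

\[ \mu(A) \le 2Q\,\mu\Big( \bigcup_{\mathcal{F}'_1} B(x_i, r_i) \Big). \]

Let

\[ A_1 = A \setminus \bigcup_{\mathcal{F}'_1} B(x_i, r_i). \]

Then

\[ \mu(A_1) \le \mu\Big( G \setminus \bigcup_{\mathcal{F}'_1} B(x_i, r_i) \Big) = \mu(G) - \mu\Big( \bigcup_{\mathcal{F}'_1} B(x_i, r_i) \Big) \le \Big(1 + \frac{1}{4Q} - \frac{1}{2Q}\Big) \mu(A) = \Big(1 - \frac{1}{4Q}\Big) \mu(A) = \gamma \mu(A), \quad \gamma = 1 - \frac{1}{4Q} < 1. \]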

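In practice, this means that the balls in $\mathcal{F}'_1$ cover a certain percentage of $A$ in the sense of measure. Then we apply the same argument to the collection

\[ \mathcal{F}' = \Big\{ B(x, r) \in \mathcal{F} : B(x, r) \cap \Big( \bigcup_{\mathcal{F}'_1} B(x_i, r_i) \Big) = \emptyset \Big\}. \]

Note that $A_1$ is a subset of the open set $G \setminus \bigcup_{\mathcal{F}'_1} B(x_i, r_i)$. There exists an open set $G_1$ such that

\[ A_1 \subset G_1 \subset G \setminus \bigcup_{\mathcal{F}'_1} B(x_i, r_i) \quad\text{and}\quad \mu(G_1) \le \Big(1 + \frac{1}{4Q}\Big) \mu(A_1). \]

Thus there exists a finite subcollection $\mathcal{F}'_2$ such that

\[ \Big( \bigcup_{\mathcal{F}'_1} B(x_i, r_i) \Big) \cap \Big( \bigcup_{\mathcal{F}'_2} B(x, r) \Big) = \emptyset \]

and

\[ \mu(A_2) \le \gamma \mu(A_1), \quad\text{where}\quad A_2 = A \setminus \bigcup_{\mathcal{F}'_1 \cup \mathcal{F}'_2} B(x, r). \]

By continuing this process, we obtain

\[ \mu\Big( A \setminus \bigcup_{\mathcal{F}'_1 \cup \dots \cup \mathcal{F}'_k} B(x, r) \Big) \le \gamma^k \mu(A) \]

and the result follows by letting $k \to \infty$, since $\gamma < 1$ and $\mu(A) < \infty$.

In order to remove the assumption that $A$ is bounded, we use the fact that $\mu(\partial B(0, r)) > 0$ for at most countably many radii $r > 0$, if $\mu$ is a Radon measure (exercise). Hence we may choose radii $0 < r_1 < r_2 < \dots$ such that $r_k \to \infty$ as $k \to \infty$ and $\mu(\partial B(0, r_k)) = 0$ for every $k = 1,2,\dots$. Denote

\[ A_1 = \{x \in \mathbb{R}^n : |x| < r_1\}, \qquad A_k = \{x \in \mathbb{R}^n : r_{k-1} < |x| < r_k\}, \quad k = 2,3,\dots, \]

and

\[ \mathcal{F}^k = \{B(x, r) \in \mathcal{F} : B(x, r) \subset A_k,\ x \in A\}. \]

The claim follows by applying the proof above to the sets $A \cap A_k$ and the coverings $\mathcal{F}^k$, $k = 1,2,\dots$. □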

4.2 The Lebesgue differentiation theorem for Radon measures

It is not immediately clear how to define the derivative of a measure. Let $f : [a, b] \to [0, \infty]$ be a nonnegative integrable function and define $F(x) = \int_{[a,x]} f(y)\,dy$. Then by Theorem 2.35, $F'(x) = f(x)$ for almost every $x \in [a, b]$. Let us rewrite this in another way. Let $\mu(A) = \int_A f(y)\,dy$ for every Lebesgue measurable set $A \subset \mathbb{R}$ and denote the one-dimensional Lebesgue measure by $\nu$. Then

\[ \frac{F(x + r) - F(x)}{r} = \frac{1}{r} \int_{[x,x+r]} f(y)\,dy = \frac{\mu([x, x + r])}{\nu([x, x + r])}. \]

Thus

\[ F'(x) = \lim_{r \to 0} \frac{\mu([x, x + r])}{\nu([x, x + r])} = f(x) \quad\text{for almost every } x \in [a, b]. \]

This suggests the following definition for the derivative of measures.

Definition 4.9. Let $\mu$ and $\nu$ be Radon measures on $\mathbb{R}^n$. The upper derivative of $\nu$ with respect to $\mu$ is

\[ \overline{D}_\mu \nu(x) = \limsup_{r \to 0} \frac{\nu(B(x, r))}{\mu(B(x, r))} \]

and the lower derivative of $\nu$ with respect to $\mu$ is

\[ \underline{D}_\mu \nu(x) = \liminf_{r \to 0} \frac{\nu(B(x, r))}{\mu(B(x, r))}. \]

We use the convention that $\overline{D}_\mu \nu(x) = \infty$ and $\underline{D}_\mu \nu(x) = \infty$, if $\mu(B(x, r)) = 0$ for some $r > 0$. At the points where the limit exists, we define the derivative of $\nu$ with respect to $\mu$ as

\[ D_\mu \nu(x) = \overline{D}_\mu \nu(x) = \underline{D}_\mu \nu(x) < \infty. \]

Examples 4.10:
(1) Let $A \subset \mathbb{R}^n$ be $\mu$-measurable. By measure theory, the restriction $\nu = \mu \lfloor A$ is a Radon measure and

\[ \lim_{r \to 0} \frac{\nu(B(x, r))}{\mu(B(x, r))} = \lim_{r \to 0} \frac{\mu(A \cap B(x, r))}{\mu(B(x, r))} \]

measures the density of $A$ at $x$.
(2) Assume that $\mu$ is a Radon measure on $\mathbb{R}^n$ and $f \in L^1(\mathbb{R}^n;\mu)$. Define $\nu(A) = \int_A |f|\,d\mu$ for every $\mu$-measurable set $A \subset \mathbb{R}^n$. Then

\[ \lim_{r \to 0} \frac{\nu(B(x, r))}{\mu(B(x, r))} = \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} |f|\,d\mu \]

is the limit of the integral averages as in the Lebesgue differentiation theorem.
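The quotient in Definition 4.9 can be evaluated by hand in simple cases. The sketch below is an illustration under stated assumptions, not part of the original notes: $\mu$ is the one-dimensional Lebesgue measure and $\nu$ has the density $f(y) = 3y^2$, so that $\nu([a,b]) = b^3 - a^3$. The function names are hypothetical.

```python
def density(y):
    """Density f(y) = 3 y^2 of nu with respect to Lebesgue measure (assumed example)."""
    return 3.0 * y ** 2


def nu_ball(x, r):
    """nu(B(x, r)) for the closed ball [x - r, x + r], using the antiderivative y^3."""
    return (x + r) ** 3 - (x - r) ** 3


def mu_ball(x, r):
    """mu(B(x, r)): the Lebesgue measure of [x - r, x + r]."""
    return 2.0 * r


def difference_quotient(x, r):
    """The quotient nu(B(x, r)) / mu(B(x, r)) from Definition 4.9."""
    return nu_ball(x, r) / mu_ball(x, r)


if __name__ == "__main__":
    x = 1.0
    for r in (1.0, 0.1, 0.01, 0.001):
        print(r, difference_quotient(x, r), density(x))
```

For this choice the quotient equals $3x^2 + r^2$ exactly, so it converges to $f(x)$ as $r \to 0$, in line with Example 4.10 (2).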

Lemma 4.11. $\overline{D}_\mu \nu$, $\underline{D}_\mu \nu$ and $D_\mu \nu$ are Borel measurable.

Proof. CLAIM: $\limsup_{y \to x} \mu(B(y, r)) \le \mu(B(x, r))$ for every $x \in \mathbb{R}^n$.

Reason. Let $\varepsilon > 0$. By an approximation result for measurable sets, there exists an open set $G \supset B(x, r)$ such that $\mu(G) < \mu(B(x, r)) + \varepsilon$. Observe that $B(x, r)$ denotes the closed ball with center $x$ and radius $r$. It follows that $B(y, r) \subset G$, if $|x - y| < \tfrac12 \operatorname{dist}(B(x, r), \mathbb{R}^n \setminus G)$. Thus

\[ \mu(B(y, r)) \le \mu(G) < \mu(B(x, r)) + \varepsilon, \]

if $|x - y| < \tfrac12 \operatorname{dist}(B(x, r), \mathbb{R}^n \setminus G)$. This implies

\[ \limsup_{y \to x} \mu(B(y, r)) \le \mu(B(x, r)) \]

and thus $x \mapsto \mu(B(x, r))$ is upper semicontinuous. Similarly $x \mapsto \nu(B(x, r))$ is upper semicontinuous and consequently the functions are Borel measurable. ■

CLAIM: $\overline{D}_\mu \nu(x) = \limsup_{r \to 0,\ r \in \mathbb{Q}_+} \dfrac{\nu(B(x, r))}{\mu(B(x, r))}$.

Reason. Since $B(x, r)$ is a closed ball,

\[ B(x, r) = \bigcap_{i=1}^{\infty} B\Big(x, r + \frac{1}{i}\Big) \quad\text{and}\quad \mu(B(x, r + 1)) < \infty, \]

we have

\[ \mu(B(x, r)) = \mu\Big( \bigcap_{i=1}^{\infty} B\Big(x, r + \frac{1}{i}\Big) \Big) = \lim_{i \to \infty} \mu\Big( B\Big(x, r + \frac{1}{i}\Big) \Big). \]

This implies that the functions $r \mapsto \mu(B(x, r))$ and $r \mapsto \nu(B(x, r))$ are continuous from the right and that we may replace the limes superior with a limes superior over the rationals. Consequently, $\overline{D}_\mu \nu$ is a countable limes superior of Borel functions and hence it is a Borel function. The measurability of $\underline{D}_\mu \nu$ and $D_\mu \nu$ is proved in a similar manner (exercise). ■

The following result will be an extremely useful tool in our analysis.

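Theorem 4.12. Assume that $\mu$ and $\nu$ are Radon measures in $\mathbb{R}^n$, $A \subset \mathbb{R}^n$ and $0 < t < \infty$.
(1) If $\underline{D}_\mu \nu(x) \le t$ for every $x \in A$, then $\nu(A) \le t\mu(A)$.
(2) If $\overline{D}_\mu \nu(x) \ge t$ for every $x \in A$, then $\nu(A) \ge t\mu(A)$.

THE MORAL: These inequalities give distribution set estimates

\[ \nu(\{x \in \mathbb{R}^n : \underline{D}_\mu \nu(x) \le t\}) \le t\mu(\mathbb{R}^n) \]

and

\[ \mu(\{x \in \mathbb{R}^n : \overline{D}_\mu \nu(x) \ge t\}) \le \frac{1}{t}\,\nu(\mathbb{R}^n), \]

which are Chebyshev type inequalities for Radon measures.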

Remark 4.13. The set $A \subset \mathbb{R}^n$ does not necessarily have to be measurable; compare with the Vitali covering theorem (Theorem 4.8).

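Proof. (1) If $\mu(A) = \infty$, the claim is clear, so that we may assume $\mu(A) < \infty$. Let $\varepsilon > 0$. There exists an open $G \supset A$ such that $\mu(G) < \mu(A) + \varepsilon$. Since $\underline{D}_\mu \nu(x) \le t$ for every $x \in A$, there are arbitrarily small $r > 0$ such that

\[ \nu(B(x, r)) \le (t + \varepsilon)\mu(B(x, r)) \quad\text{and}\quad B(x, r) \subset G. \]

By the Vitali covering theorem (Theorem 4.8), there is a countable subcollection of pairwise disjoint balls $B(x_i, r_i)$, $i = 1,2,\dots$, such that

\[ \nu(B(x_i, r_i)) \le (t + \varepsilon)\mu(B(x_i, r_i)) \quad\text{and}\quad B(x_i, r_i) \subset G \]

for every $i = 1,2,\dots$ and

\[ \nu\Big( A \setminus \bigcup_{i=1}^{\infty} B(x_i, r_i) \Big) = 0. \]

Thus

\[ \nu(A) \le \nu\Big( A \cap \bigcup_{i=1}^{\infty} B(x_i, r_i) \Big) + \underbrace{\nu\Big( A \setminus \bigcup_{i=1}^{\infty} B(x_i, r_i) \Big)}_{=0} \le \sum_{i=1}^{\infty} \nu(B(x_i, r_i)) \le (t + \varepsilon) \sum_{i=1}^{\infty} \mu(B(x_i, r_i)) \]
\[ = (t + \varepsilon)\,\mu\Big( \bigcup_{i=1}^{\infty} B(x_i, r_i) \Big) \quad\text{(the balls are disjoint)} \]
\[ \le (t + \varepsilon)\mu(G) \le (t + \varepsilon)(\mu(A) + \varepsilon) \quad\text{for every } \varepsilon > 0. \]

Letting $\varepsilon \to 0$, we have $\nu(A) \le t\mu(A)$.

(2) (Exercise) □

Theorem 4.14. If $\mu$ and $\nu$ are Radon measures on $\mathbb{R}^n$, then the derivative $D_\mu \nu(x)$ exists and is finite for $\mu$-almost every $x \in \mathbb{R}^n$.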

THE MORAL: This is a version of the Lebesgue differentiation theorem for general Radon measures.

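Proof. Step 1 CLAIM: $\underline{D}_\mu \nu = \overline{D}_\mu \nu$ $\mu$-almost everywhere in $\mathbb{R}^n$, that is,

\[ \mu(\{x \in \mathbb{R}^n : \overline{D}_\mu \nu(x) > \underline{D}_\mu \nu(x)\}) = 0. \]

Reason. Let $i, k \in \{1,2,\dots\}$, $p, q \in \mathbb{Q}$ with $0 < p < q$. Define

\[ A_{k,p,q} = \{x \in B(0, k) : \underline{D}_\mu \nu(x) \le p < q \le \overline{D}_\mu \nu(x)\} \]

and

\[ A_{k,i} = \{x \in B(0, k) : \overline{D}_\mu \nu(x) \ge i\}. \]

Theorem 4.12 implies

\[ q\mu(A_{k,p,q}) \le \nu(A_{k,p,q}) \le p\mu(A_{k,p,q}), \]

from which it follows that

\[ \mu(A_{k,p,q}) \le \frac{p}{q}\,\mu(A_{k,p,q}), \quad\text{where}\quad \mu(A_{k,p,q}) \le \mu(B(0, k)) < \infty. \]

Since $p < q$, we conclude that $\mu(A_{k,p,q}) = 0$. Thus

\[ \mu(\{x \in B(0, k) : \overline{D}_\mu \nu(x) > \underline{D}_\mu \nu(x)\}) = \mu\Big( \bigcup_{\substack{0 < p < q \\ p, q \in \mathbb{Q}}} A_{k,p,q} \Big) \le \sum_{\substack{0 < p < q \\ p, q \in \mathbb{Q}}} \mu(A_{k,p,q}) = 0 \]

for every $k = 1,2,\dots$, and consequently

\[ \mu(\{x \in \mathbb{R}^n : \overline{D}_\mu \nu(x) > \underline{D}_\mu \nu(x)\}) \le \sum_{k=1}^{\infty} \mu(\{x \in B(0, k) : \overline{D}_\mu \nu(x) > \underline{D}_\mu \nu(x)\}) = 0. \]

Since $\overline{D}_\mu \nu \ge \underline{D}_\mu \nu$ always, we conclude that $\overline{D}_\mu \nu = \underline{D}_\mu \nu$ $\mu$-almost everywhere in $\mathbb{R}^n$. ■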

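Step 2 CLAIM: $\overline{D}_\mu \nu < \infty$ $\mu$-almost everywhere in $\mathbb{R}^n$ or equivalently

\[ \mu(\{x \in \mathbb{R}^n : \overline{D}_\mu \nu(x) = \infty\}) = 0. \]

Reason. Theorem 4.12 implies

\[ \mu(A_{k,i}) \le \frac{1}{i}\,\nu(A_{k,i}) \le \frac{1}{i}\,\nu(B(0, k)), \quad\text{where}\quad \nu(B(0, k)) < \infty. \]

Thus

\[ \mu(\{x \in B(0, k) : \overline{D}_\mu \nu(x) = \infty\}) \le \mu(A_{k,i}) \le \frac{1}{i}\,\nu(B(0, k)) \quad\text{for every } i, k = 1,2,\dots. \]

By letting $i \to \infty$, we have

\[ \mu(\{x \in B(0, k) : \overline{D}_\mu \nu(x) = \infty\}) = 0 \quad\text{for every } k = 1,2,\dots. \]

Therefore

\[ \mu(\{x \in \mathbb{R}^n : \overline{D}_\mu \nu(x) = \infty\}) = \mu\Big( \bigcup_{k=1}^{\infty} \{x \in B(0, k) : \overline{D}_\mu \nu(x) = \infty\} \Big) \le \sum_{k=1}^{\infty} \mu(\{x \in B(0, k) : \overline{D}_\mu \nu(x) = \infty\}) = 0. \]

■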

4.3 The Radon-Nikodym theorem

Assume that $\mu$ and $\nu$ are Radon measures on $\mathbb{R}^n$. Let $f$ be a nonnegative $\mu$-measurable function and define $\nu(A) = \int_A f\,d\mu$, where $A$ is $\mu$-measurable. Then $\nu$ is a measure with the property that $\mu(A) = 0$ implies $\nu(A) = 0$. Conversely, if $\nu$ is a Radon measure on $\mathbb{R}^n$, does there exist a $\mu$-measurable function $f$ such that $\nu(A) = \int_A f\,d\mu$ for every $\mu$-measurable set $A$? The Radon-Nikodym theorem shows that this is the case if $\nu$ is absolutely continuous with respect to $\mu$.

Definition 4.15. An outer measure $\nu$ is absolutely continuous with respect to another outer measure $\mu$, if $\mu(A) = 0$ implies $\nu(A) = 0$. In this case we write $\nu \ll \mu$.

THE MORAL: $\nu \ll \mu$ means that $\nu$ is small if $\mu$ is small. When we are dealing with more than one measure, the term almost everywhere becomes ambiguous and we have to specify almost everywhere with respect to $\mu$ or $\nu$. If $\nu \ll \mu$ and a property holds $\mu$-almost everywhere, then it also holds $\nu$-almost everywhere.

Examples 4.16:
(1) Let $\mu$ be the Lebesgue measure and $\nu$ be the Dirac measure at the origin,

\[ \nu(A) = \begin{cases} 1, & 0 \in A, \\ 0, & 0 \notin A. \end{cases} \]

Then $\mu(\{0\}) = 0$, but $\nu(\{0\}) = 1$. Thus $\nu$ is not absolutely continuous with respect to $\mu$.
(2) Let $f \in L^1(\mathbb{R}^n;\mu)$ and define $\nu(A) = \int_A |f|\,d\mu$, where $A$ is $\mu$-measurable. Then $\mu(A) = 0$ implies $\nu(A) = \int_A |f|\,d\mu = 0$. Thus $\nu \ll \mu$.

Remarks 4.17:
(1) It is often useful, in particular in connection with integrals, to use the following $\varepsilon$, $\delta$ version of absolute continuity: If $\nu$ is a finite measure, then $\nu \ll \mu$ if and only if for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\nu(A) < \varepsilon$ for every $\mu$-measurable set $A$ with $\mu(A) < \delta$. In particular, if $f \in L^1(\mathbb{R}^n;\mu)$, then for every $\varepsilon > 0$ there exists $\delta > 0$ such that $\int_A |f|\,d\mu < \varepsilon$ for every $\mu$-measurable set $A$ with $\mu(A) < \delta$. (Exercise)
(2) It is easy to verify that the relation $\ll$ is reflexive ($\mu \ll \mu$) and transitive ($\mu_1 \ll \mu_2$ and $\mu_2 \ll \mu_3$ imply $\mu_1 \ll \mu_3$).

The following theorem on absolutely continuous measures is very important. It shows that differentiation of measures and integration are inverse operations. It has applications in the identification of continuous linear functionals on $L^p$, $1 \le p < \infty$. Moreover, a general version of the theorem is applied in the construction of the conditional expectation in probability theory.

Theorem 4.18 (Radon-Nikodym theorem). Assume that $\mu$ and $\nu$ are Radon measures on $\mathbb{R}^n$ and $\nu \ll \mu$. Then

\[ \nu(A) = \int_A D_\mu \nu\,d\mu \]

for every $\mu$-measurable set $A \subset \mathbb{R}^n$.

THE MORAL: The Radon-Nikodym theorem asserts that $\nu$ can be expressed as an integral with respect to $\mu$ and the Radon-Nikodym derivative $D_\mu \nu$ can be computed by differentiating $\nu$ with respect to $\mu$. This is a version of the fundamental theorem of calculus for Radon measures.
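For finitely supported measures on $\mathbb{R}$ the statement can be checked directly. The sketch below is an illustration under stated assumptions, not part of the original notes: measures are represented as dictionaries mapping atoms to weights, $\nu$ charges only atoms of $\mu$ (so $\nu \ll \mu$), and the derivative is approximated by the quotient of Definition 4.9 for one small radius. All names are hypothetical.

```python
def measure_of_ball(weights, x, r):
    """Measure of the closed ball [x - r, x + r] for a finitely supported measure on R."""
    return sum(w for point, w in weights.items() if abs(point - x) <= r)


def derivative(nu, mu, x, r=1e-6):
    """Quotient nu(B(x, r)) / mu(B(x, r)); x is assumed to be an atom of mu.

    For finitely supported measures the quotient is constant once r is smaller
    than the distance between distinct atoms, so it equals D_mu nu(x).
    """
    return measure_of_ball(nu, x, r) / measure_of_ball(mu, x, r)


# Assumed example data: nu charges only atoms of mu, hence nu << mu.
mu = {0.0: 2.0, 1.0: 0.5, 3.0: 4.0}
nu = {0.0: 1.0, 1.0: 1.5, 3.0: 0.0}


def nu_via_integral(atoms):
    """Integral of D_mu nu with respect to mu over a set of atoms of mu."""
    return sum(derivative(nu, mu, x) * mu[x] for x in atoms)


if __name__ == "__main__":
    for A in ([0.0], [0.0, 1.0], [0.0, 1.0, 3.0]):
        direct = sum(nu[x] for x in A)
        print(A, direct, nu_via_integral(A))
```

At an atom $x$ of $\mu$ the derivative reduces to $\nu(\{x\})/\mu(\{x\})$, and summing it against the weights of $\mu$ recovers $\nu(A)$.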

Remark 4.19. Note that $D_\mu \nu$ does not have to be integrable. In fact, $D_\mu \nu \in L^1(\mathbb{R}^n;\mu)$ if and only if $\nu(\mathbb{R}^n) < \infty$.

Proof. Step 1 Since $A$ is a $\mu$-measurable set and $\mu$ is a Borel regular measure, there exists a Borel set $B \supset A$ such that $\mu(B \setminus A) = 0$. Since $\nu \ll \mu$, we have $\nu(B \setminus A) = 0$ and thus $B \setminus A$ is $\nu$-measurable. This implies that $A = B \setminus (B \setminus A)$ is $\nu$-measurable.

Step 2 Let

\[ Z = \{x \in A : D_\mu \nu(x) = 0\}, \qquad I = \{x \in A : D_\mu \nu(x) = \infty\} \]

and

\[ N = \{x \in A : \underline{D}_\mu \nu(x) \ne \overline{D}_\mu \nu(x)\}. \]

By Theorem 4.14, $\mu(I) = 0$ and $\mu(N) = 0$. Since $\nu \ll \mu$, we also have $\nu(I) = 0$ and $\nu(N) = 0$. Theorem 4.12 implies

\[ \nu(Z \cap B(0, i)) \le t\mu(Z \cap B(0, i)) \quad\text{for every } i = 1,2,\dots,\ 0 < t < \infty, \]

where $\mu(Z \cap B(0, i)) < \infty$. By letting $t \to 0$, we obtain $\nu(Z \cap B(0, i)) = 0$ for every $i = 1,2,\dots$. Thus

\[ \nu(Z) = \nu\Big( \bigcup_{i=1}^{\infty} (Z \cap B(0, i)) \Big) \le \sum_{i=1}^{\infty} \nu(Z \cap B(0, i)) = 0. \]

Let

\[ A_i = \big\{ x \in A : t^i \le D_\mu \nu(x) < t^{i+1} \big\}, \quad i \in \mathbb{Z},\ 1 < t < \infty. \]

Then $A \setminus \bigcup_{i=-\infty}^{\infty} A_i \subset Z \cup I \cup N$ and

\[ \nu\Big( A \setminus \bigcup_{i=-\infty}^{\infty} A_i \Big) \le \nu(Z \cup I \cup N) \le \nu(Z) + \nu(I) + \nu(N) = 0. \]

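Step 3

\[ \nu(A) = \nu\Big( A \cap \bigcup_{i=-\infty}^{\infty} A_i \Big) + \underbrace{\nu\Big( A \setminus \bigcup_{i=-\infty}^{\infty} A_i \Big)}_{=0} \le \sum_{i=-\infty}^{\infty} \nu(A_i) \]
\[ \le \sum_{i=-\infty}^{\infty} t^{i+1} \mu(A_i) \quad\text{(Theorem 4.12 (1))} \]
\[ = t \sum_{i=-\infty}^{\infty} t^{i} \mu(A_i) \le t \sum_{i=-\infty}^{\infty} \int_{A_i} D_\mu \nu\,d\mu \quad (D_\mu \nu \ge t^i \text{ in } A_i) \]
\[ = t \int_A D_\mu \nu\,d\mu. \]

In the last equality we used the facts that $A = \bigcup_{i=-\infty}^{\infty} A_i \cup (Z \cup I \cup N)$, where the sets are disjoint, and

\[ \int_A D_\mu \nu\,d\mu = \int_{\bigcup_{i=-\infty}^{\infty} A_i} D_\mu \nu\,d\mu + \underbrace{\int_Z D_\mu \nu\,d\mu}_{=0} + \underbrace{\int_I D_\mu \nu\,d\mu}_{=0,\ (\mu(I) = 0)} + \underbrace{\int_N D_\mu \nu\,d\mu}_{=0,\ (\mu(N) = 0)}. \]

On the other hand,

\[ \nu(A) \ge \nu\Big( \bigcup_{i=-\infty}^{\infty} A_i \Big) = \sum_{i=-\infty}^{\infty} \nu(A_i) \quad (A_i \text{ disjoint}) \]
\[ \ge \sum_{i=-\infty}^{\infty} t^{i} \mu(A_i) \quad\text{(Theorem 4.12 (2))} \]
\[ = \frac{1}{t} \sum_{i=-\infty}^{\infty} t^{i+1} \mu(A_i) \ge \frac{1}{t} \sum_{i=-\infty}^{\infty} \int_{A_i} D_\mu \nu\,d\mu \quad (D_\mu \nu \le t^{i+1} \text{ in } A_i) \]
\[ = \frac{1}{t} \int_A D_\mu \nu\,d\mu \quad\text{(as above)}. \]

Thus

\[ \frac{1}{t} \int_A D_\mu \nu\,d\mu \le \nu(A) \le t \int_A D_\mu \nu\,d\mu, \quad 1 < t < \infty, \]

and by letting $t \to 1$, we arrive at

\[ \nu(A) = \int_A D_\mu \nu\,d\mu. \]

□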

Remark 4.20. The Radon-Nikodym derivative has many properties reminiscent of standard derivatives. Let $\nu$, $\mu$ and $\zeta$ be Radon measures on $\mathbb{R}^n$.

(1) If $\nu \ll \mu$ and $f$ is a nonnegative $\mu$-measurable function, then

\[ \int_A f\,d\nu = \int_A f\,D_\mu \nu\,d\mu \]

for every measurable set $A$. (Exercise)
Hint: if $g$ is a nonnegative $\mu$-measurable function and $\nu(A) = \int_A g\,d\mu$ for every $\mu$-measurable set $A \subset \mathbb{R}^n$, then for every nonnegative measurable function $f$ we have

\[ \int_A f\,d\nu = \int_A f g\,d\mu. \]

(2) If $\nu \ll \mu$ and $\zeta \ll \mu$, then $D_\mu(\nu + \zeta) = D_\mu \nu + D_\mu \zeta$ $\mu$-almost everywhere.
(3) If $\nu \ll \zeta \ll \mu$, then $D_\mu \nu = D_\zeta \nu\, D_\mu \zeta$ $\mu$-almost everywhere.
(4) If $\nu \ll \mu$ and $\mu \ll \nu$, then $D_\mu \nu = 1/D_\nu \mu$ $\mu$-almost everywhere.

Remark 4.21. The Radon-Nikodym derivative is unique: If $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n;\mu)$ is a nonnegative function and $\nu(A) = \int_A f\,d\mu$ for every $\mu$-measurable set $A \subset \mathbb{R}^n$, then $f = D_\mu \nu$ $\mu$-almost everywhere. (Exercise)

Remark 4.22. The Radon-Nikodym theorem holds in a more general context: If $\mu$ is a $\sigma$-finite measure on $X$ and $\nu$ is a $\sigma$-finite signed measure on $X$ such that $\nu \ll \mu$, then there exists a real-valued measurable function $f$ such that $\nu(A) = \int_A f\,d\mu$ for every measurable set $A \subset X$ with $|\nu|(A) < \infty$. If $g$ is another function such that $\nu(A) = \int_A g\,d\mu$ for every measurable set $A \subset X$ with $|\nu|(A) < \infty$, then $f = g$ $\mu$-almost everywhere. The function $f$ above is called the Radon-Nikodym derivative of $\nu$ with respect to $\mu$. However, in the general case there is no formula for the Radon-Nikodym derivative.

4.4 The Lebesgue decomposition

In this section we consider measures which are not necessarily absolutely continuous. The following definition describes an extreme form of non-absolute continuity.

Definition 4.23. The Radon measures $\mu$ and $\nu$ are mutually singular, if there exists a Borel set $B \subset \mathbb{R}^n$ such that

\[ \mu(\mathbb{R}^n \setminus B) = \nu(B) = 0. \]

In this case we write $\mu \perp \nu$.

THE MORAL: Mutually singular measures live on complementary sets.

Remark 4.24. Absolutely continuous and singular measures have the following general properties (exercise):

(1) If $\nu_1 \perp \mu$ and $\nu_2 \perp \mu$, then $(\nu_1 + \nu_2) \perp \mu$.
(2) If $\nu_1 \ll \mu$ and $\nu_2 \ll \mu$, then $(\nu_1 + \nu_2) \ll \mu$.
(3) If $\nu_1 \ll \mu$ and $\nu_2 \perp \mu$, then $\nu_1 \perp \nu_2$.
(4) If $\nu \ll \mu$ and $\nu \perp \mu$, then $\nu = 0$.

Theorem 4.25 (Lebesgue decomposition). Let $\mu$ and $\nu$ be Radon measures on $\mathbb{R}^n$.

(1) Then $\nu = \nu_a + \nu_s$, where $\nu_a$ and $\nu_s$ are Radon measures with $\nu_a \ll \mu$ and $\nu_s \perp \mu$.
(2) Furthermore, $D_\mu \nu = D_\mu \nu_a$ and $D_\mu \nu_s = 0$ $\mu$-almost everywhere in $\mathbb{R}^n$ and

\[ \nu(A) = \int_A D_\mu \nu_a\,d\mu + \nu_s(A) \]

for every Borel set $A \subset \mathbb{R}^n$.

TERMINOLOGY: We call $\nu_a$ the absolutely continuous part and $\nu_s$ the singular part of $\nu$ with respect to $\mu$.

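THE MORAL: A Radon measure can be split into an absolutely continuous part and a singular part. The absolutely continuous part lives in the set where $D_\mu \nu < \infty$ and the singular part in the set where $D_\mu \nu = \infty$.

Proof. Step 1 We have $\mathbb{R}^n = \bigcup_{i=1}^{\infty} B(0, i)$, so that by considering the restrictions $\mu \lfloor B(0, i)$ and $\nu \lfloor B(0, i)$, $i = 1,2,\dots$, we may assume $\mu(\mathbb{R}^n) < \infty$ and $\nu(\mathbb{R}^n) < \infty$.

Step 2 Define

\[ \mathcal{F} = \{A \subset \mathbb{R}^n : A \text{ Borel},\ \mu(\mathbb{R}^n \setminus A) = 0\}. \]

Then for every $i = 1,2,\dots$ there exists $B_i \in \mathcal{F}$ such that

\[ \nu(B_i) \le \inf_{A \in \mathcal{F}} \nu(A) + \frac{1}{i}. \]

Note that the infimum is finite, since $\mathbb{R}^n \in \mathcal{F}$ and $\nu(\mathbb{R}^n) < \infty$. The set $B = \bigcap_{i=1}^{\infty} B_i$ is a Borel set and

\[ \mu(\mathbb{R}^n \setminus B) = \mu\Big( \bigcup_{i=1}^{\infty} (\mathbb{R}^n \setminus B_i) \Big) \le \sum_{i=1}^{\infty} \underbrace{\mu(\mathbb{R}^n \setminus B_i)}_{=0,\ B_i \in \mathcal{F}} = 0. \]

Thus $B \in \mathcal{F}$ and

\[ \inf_{A \in \mathcal{F}} \nu(A) \le \nu(B) \le \inf_{A \in \mathcal{F}} \nu(A) + \frac{1}{i} \quad\text{for every } i = 1,2,\dots. \]

By letting $i \to \infty$ we conclude

\[ \nu(B) = \inf_{A \in \mathcal{F}} \nu(A). \]

Thus $B$ is the smallest possible set in the sense of $\nu$ measure such that the complement is of $\mu$-measure zero. Define

\[ \nu_a = \nu \lfloor B \quad\text{and}\quad \nu_s = \nu \lfloor (\mathbb{R}^n \setminus B). \]

By the properties of the restrictions of measures, $\nu_a$ and $\nu_s$ are Radon measures.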

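Step 3 CLAIM: $\nu_a \ll \mu$.

Reason. Suppose, on the contrary, that $A \subset B$, $A$ is a Borel set, $\mu(A) = 0$, but $\nu(A) > 0$. Then

\[ \mu(\mathbb{R}^n \setminus (B \setminus A)) = \mu(A \cup (\mathbb{R}^n \setminus B)) \le \underbrace{\mu(A)}_{=0} + \underbrace{\mu(\mathbb{R}^n \setminus B)}_{=0,\ B \in \mathcal{F}} = 0 \]

and thus $B \setminus A \in \mathcal{F}$. Since $0 < \nu(A) < \infty$, this implies

\[ \nu(B \setminus A) = \nu(B) - \nu(A) < \nu(B) = \inf_{E \in \mathcal{F}} \nu(E), \]

which is a contradiction. This shows that $\nu(A) = 0$, from which it follows that $\nu_a(A) = \nu(A \cap B) = 0$ and consequently $\nu_a \ll \mu$. ■

On the other hand,

\[ \mu(\mathbb{R}^n \setminus B) = 0 = \nu(\underbrace{(\mathbb{R}^n \setminus B) \cap B}_{=\emptyset}) = \nu_s(B), \]

so that $\nu_s \perp \mu$.

Step 4 Let

\[ C_i = \Big\{ x \in \mathbb{R}^n : \overline{D}_\mu \nu_s(x) \ge \frac{1}{i} \Big\}, \quad i = 1,2,\dots. \]

Then

\[ \frac{1}{i}\,\mu(C_i) = \frac{1}{i}\big( \mu(C_i \cap B) + \underbrace{\mu(C_i \cap (\mathbb{R}^n \setminus B))}_{=0,\ \mu(\mathbb{R}^n \setminus B) = 0} \big) \le \nu_s(C_i \cap B) \quad\text{(Theorem 4.12)} \]
\[ = \nu(\underbrace{(C_i \cap B) \cap (\mathbb{R}^n \setminus B)}_{=\emptyset}) = 0. \]

This shows that $\mu(C_i) = 0$ for every $i = 1,2,\dots$, and consequently

\[ \mu(\{x \in \mathbb{R}^n : \overline{D}_\mu \nu_s(x) > 0\}) = \mu\Big( \bigcup_{i=1}^{\infty} C_i \Big) \le \sum_{i=1}^{\infty} \mu(C_i) = 0. \]

Thus $D_\mu \nu_s(x) = 0$ for $\mu$-almost every $x \in \mathbb{R}^n$ and

\[ D_\mu \nu_a(x) = \lim_{r \to 0} \frac{\nu_a(B(x, r))}{\mu(B(x, r))} = \lim_{r \to 0} \frac{\nu(B(x, r)) - \nu_s(B(x, r))}{\mu(B(x, r))} = D_\mu \nu(x) - D_\mu \nu_s(x) = D_\mu \nu(x) \]

for $\mu$-almost every $x \in \mathbb{R}^n$.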

The Radon-Nikodym theorem implies

\[ \nu_a(A) = \int_A D_\mu \nu_a\,d\mu, \quad A \subset \mathbb{R}^n \ \mu\text{-measurable}. \]

Thus

\[ \nu(A) = \nu_a(A) + \nu_s(A) = \int_A D_\mu \nu_a\,d\mu + \nu_s(A) = \int_A D_\mu \nu\,d\mu + \nu_s(A). \]

□

Remark 4.26. It can be shown that $\mu \perp \nu$ if and only if $D_\mu \nu = 0$ $\mu$-almost everywhere. (Exercise)

Remark 4.27. The Lebesgue decomposition holds in a more general context: Let $\mu$ and $\nu$ be $\sigma$-finite signed measures on $X$. Then $\nu = \nu_a + \nu_s$, where $\nu_a$ and $\nu_s$ are $\sigma$-finite signed measures with $\nu_a \ll \mu$ and $\nu_s \perp \mu$. Moreover, this decomposition is unique.
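For finitely supported measures on $\mathbb{R}$ the decomposition of Theorem 4.25 can be written down explicitly. The sketch below is an illustration under stated assumptions, not part of the original notes: measures are dictionaries mapping atoms to positive weights, and the set of atoms of $\mu$ plays the role of the Borel set $B$ in the proof, since every Borel set with $\mu$-null complement must contain it. The function name `lebesgue_decomposition` is hypothetical.

```python
def lebesgue_decomposition(nu, mu):
    """Split a finitely supported measure nu into (nu_a, nu_s) with respect to mu.

    nu_a is the restriction of nu to the atoms of mu (absolutely continuous part),
    nu_s is the restriction of nu to the remaining atoms (singular part).
    """
    support_mu = {x for x, w in mu.items() if w > 0}
    nu_a = {x: w for x, w in nu.items() if x in support_mu}
    nu_s = {x: w for x, w in nu.items() if x not in support_mu}
    return nu_a, nu_s


# Assumed example data.
mu = {0.0: 2.0, 1.0: 0.5}
nu = {0.0: 1.0, 1.0: 1.5, 2.0: 3.0, 5.0: 0.25}
nu_a, nu_s = lebesgue_decomposition(nu, mu)

if __name__ == "__main__":
    print("absolutely continuous part:", nu_a)
    print("singular part:", nu_s)
```

Here $\nu_s$ vanishes on the atoms of $\mu$ and $\mu$ vanishes off them, so $\nu_s \perp \mu$ in the sense of Definition 4.23, while $\nu_a$ charges only atoms of $\mu$.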

4.5 Lebesgue and density points revisited

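We shall prove a version of the Lebesgue differentiation theorem for an arbitrary Radon measure on $\mathbb{R}^n$.

Theorem 4.28 (Lebesgue differentiation theorem). Let $\mu$ be a Radon measure on $\mathbb{R}^n$ and $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n;\mu)$. Then

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} f\,d\mu = f(x) \]

for $\mu$-almost every $x \in \mathbb{R}^n$.

Proof. Define

\[ \nu^{\pm}(B) = \int_B f^{\pm}\,d\mu, \]

when $B \subset \mathbb{R}^n$ is a Borel set and

\[ \nu^{\pm}(A) = \inf\{\nu^{\pm}(B) : A \subset B,\ B \text{ Borel}\} \]

for an arbitrary $A \subset \mathbb{R}^n$. Then $\nu^{+}$ and $\nu^{-}$ are Radon measures and $\nu^{\pm} \ll \mu$ (exercise). The Radon-Nikodym theorem (Theorem 4.18) implies

\[ \nu^{\pm}(A) = \int_A D_\mu \nu^{\pm}\,d\mu = \int_A f^{\pm}\,d\mu \]

for every $\mu$-measurable set $A \subset \mathbb{R}^n$. This implies that $D_\mu \nu^{\pm} = f^{\pm}$ $\mu$-almost everywhere in $\mathbb{R}^n$. Consequently,

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} f\,d\mu = \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \Big[ \int_{B(x,r)} f^{+}\,d\mu - \int_{B(x,r)} f^{-}\,d\mu \Big] \]
\[ = \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \big[ \nu^{+}(B(x, r)) - \nu^{-}(B(x, r)) \big] = D_\mu \nu^{+}(x) - D_\mu \nu^{-}(x) = f^{+}(x) - f^{-}(x) = f(x) \]

for $\mu$-almost every $x \in \mathbb{R}^n$. □

Corollary 4.29. Let $\mu$ be a Radon measure on $\mathbb{R}^n$ and $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n;\mu)$. Then

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} |f - f(x)|\,d\mu = 0 \]

for $\mu$-almost every $x \in \mathbb{R}^n$.

Proof. Let $\bigcup_{i=1}^{\infty} \{q_i\} = \mathbb{Q}$ be an enumeration of the rationals. By Theorem 4.28, for every $i = 1,2,\dots$ there exists $A_i \subset \mathbb{R}^n$ such that $\mu(A_i) = 0$ and

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} |f - q_i|\,d\mu = |f(x) - q_i| \quad\text{for every } x \in \mathbb{R}^n \setminus A_i. \]

Let $A = \bigcup_{i=1}^{\infty} A_i$. Then $\mu(A) \le \sum_{i=1}^{\infty} \mu(A_i) = 0$ and

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} |f - q_i|\,d\mu = |f(x) - q_i| \quad\text{for every } x \in \mathbb{R}^n \setminus A \text{ and every } i. \]

Fix $x \in \mathbb{R}^n \setminus A$ and $\varepsilon > 0$. Then there exists $q_i$ such that $|f(x) - q_i| < \varepsilon/2$. This implies

\[ \limsup_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} |f - f(x)|\,d\mu \le \limsup_{r \to 0} \frac{1}{\mu(B(x, r))} \Big[ \int_{B(x,r)} |f - q_i|\,d\mu + \int_{B(x,r)} |q_i - f(x)|\,d\mu \Big] \]
\[ = |f(x) - q_i| + |f(x) - q_i| < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]

□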

Remarks 4.30:
(1) We have already seen in Example 2.30 that

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} f\,d\mu = f(x) \]

does not necessarily imply

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} |f - f(x)|\,d\mu = 0 \]

at a given point $x \in \mathbb{R}^n$. The point in the proof above is that the former equality holds for every function $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n;\mu)$ for $\mu$-almost every $x \in \mathbb{R}^n$ and this implies the latter equality for $\mu$-almost every $x \in \mathbb{R}^n$.
(2) In contrast with the proof of the Lebesgue differentiation theorem (Theorem 2.26) based on the Hardy-Littlewood maximal function, this proof does not depend on the density of compactly supported continuous functions in $L^1(\mathbb{R}^n;\mu)$.

We discuss a special case of the Lebesgue differentiation theorem. Let $A \subset \mathbb{R}^n$ be a $\mu$-measurable set and consider $f = \chi_A$. By the Lebesgue differentiation theorem

\[ \lim_{r \to 0} \frac{1}{\mu(B(x, r))} \int_{B(x,r)} \chi_A\,d\mu = \lim_{r \to 0} \frac{\mu(A \cap B(x, r))}{\mu(B(x, r))} = \chi_A(x) \]

for $\mu$-almost every $x \in \mathbb{R}^n$. In particular,

\[ \lim_{r \to 0} \frac{\mu(A \cap B(x, r))}{\mu(B(x, r))} = 1 \quad\text{for } \mu\text{-almost every } x \in A \]

and

\[ \lim_{r \to 0} \frac{\mu(A \cap B(x, r))}{\mu(B(x, r))} = 0 \quad\text{for } \mu\text{-almost every } x \in \mathbb{R}^n \setminus A. \]

Thus the theory of density points extends to general Radon measures on $\mathbb{R}^n$.

In this chapter we show that Radon measures arise naturally in connection with linear functionals on compactly supported continuous functions. Moreover, we consider weak convergence of Radon measures and $L^p$ functions and obtain useful compactness theorems.

5 Weak convergence methods

Radon measures on $\mathbb{R}^n$ interact nicely with the Euclidean topology. Indeed, measurable sets can be approximated by open sets from outside and compact sets from inside and integrable functions can be approximated by compactly supported continuous functions. In this chapter we show that certain linear functionals on compactly supported continuous functions are characterized by integrals with respect to Radon measures. This fact constitutes an important link between measure theory and functional analysis and it also provides a useful tool for constructing such measures.

5.1 The Riesz representation theorem for Lp

A mapping $L : L^p(\mathbb{R}^n) \to \mathbb{R}$ is a linear functional, if

\[ L(af + bg) = aL(f) + bL(g) \]

for every $f, g \in L^p(\mathbb{R}^n)$ and $a, b \in \mathbb{R}$. The functional $L$ is bounded, if there exists a constant $M < \infty$ such that

\[ |L(f)| \le M \|f\|_p \quad\text{for every } f \in L^p(\mathbb{R}^n). \]

The norm of $L$ is the smallest constant $M$ for which the bound above holds, that is,

\[ \|L\| = \sup_{f \in L^p(\mathbb{R}^n),\ \|f\|_p \ne 0} \frac{|L(f)|}{\|f\|_p} = \sup_{f \in L^p(\mathbb{R}^n),\ \|f\|_p \ne 0} \frac{L(f)}{\|f\|_p} = \sup_{f \in L^p(\mathbb{R}^n),\ \|f\|_p \le 1} |L(f)|. \]


Recall that the linear functional

\[ L : L^p(\mathbb{R}^n) \to \mathbb{R} \text{ is continuous} \iff L \text{ is bounded} \iff \|L\| < \infty. \]

The space of bounded linear functionals on $L^p(\mathbb{R}^n)$ is called the dual space of $L^p(\mathbb{R}^n)$. The dual space is denoted by $L^p(\mathbb{R}^n)^*$. The main result of this section provides us with a representation for continuous linear functionals on $L^p(\mathbb{R}^n)$ with $1 \le p < \infty$. This is called the Riesz representation theorem and it gives a characterization for $L^p(\mathbb{R}^n)^*$ with $1 \le p < \infty$. We begin with the easier direction.

Theorem 5.1. Let $1 \le p \le \infty$ and assume that $\mu$ is a Radon measure. Then for every $g \in L^{p'}(\mathbb{R}^n)$, the functional $L : L^p(\mathbb{R}^n) \to \mathbb{R}$,

\[ L(f) = \int_{\mathbb{R}^n} f g\,d\mu, \]

is linear and bounded and thus belongs to $L^p(\mathbb{R}^n)^*$. Moreover, $\|L\| = \|g\|_{p'}$.

THE MORAL: This shows that for every function $g \in L^{p'}(\mathbb{R}^n)$ there is a bounded linear functional $L : L^p(\mathbb{R}^n) \to \mathbb{R}$ with $\|L\| = \|g\|_{p'}$. With this interpretation $L^{p'}(\mathbb{R}^n) \subset L^p(\mathbb{R}^n)^*$.

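Proof. The linearity follows from the linearity of the integral. If $\|g\|_{p'} = 0$, the claim is clear. Hence we may assume that $\|g\|_{p'} > 0$.

Case $1 < p < \infty$: By Hölder's inequality

\[ |L(f)| = \Big| \int_{\mathbb{R}^n} f g\,d\mu \Big| \le \int_{\mathbb{R}^n} |f||g|\,d\mu \le \Big( \int_{\mathbb{R}^n} |f|^p\,d\mu \Big)^{1/p} \Big( \int_{\mathbb{R}^n} |g|^{p'}\,d\mu \Big)^{1/p'} = \|f\|_p \|g\|_{p'}. \]

This implies

\[ \|L\| = \sup_{f \in L^p(\mathbb{R}^n),\ \|f\|_p \le 1} |L(f)| \le \|g\|_{p'} < \infty. \]

On the other hand, the function $f = |g|^{p'/p} \operatorname{sign} g$ belongs to $L^p(\mathbb{R}^n)$, since

\[ \|f\|_p = \Big( \int_{\mathbb{R}^n} |f|^p\,d\mu \Big)^{1/p} = \Big( \int_{\mathbb{R}^n} |g|^{p'}\,d\mu \Big)^{1/p} = \|g\|_{p'}^{p'/p} < \infty. \]

Thus, using $g \operatorname{sign} g = |g|$, $\tfrac{p'}{p} + 1 = p'$ and $|g|^{p'} = |f|^p$,

\[ |L(f)| = \int_{\mathbb{R}^n} |g|^{p'/p}\, g \operatorname{sign} g\,d\mu = \int_{\mathbb{R}^n} |g|^{p'}\,d\mu = \Big( \int_{\mathbb{R}^n} |g|^{p'}\,d\mu \Big)^{1/p} \Big( \int_{\mathbb{R}^n} |g|^{p'}\,d\mu \Big)^{1/p'} = \|f\|_p \|g\|_{p'} \]

and

\[ \|L\| = \sup_{f \in L^p(\mathbb{R}^n),\ \|f\|_p \ne 0} \frac{|L(f)|}{\|f\|_p} \ge \|g\|_{p'}. \]

Since the inequality in the reverse direction always holds, we have $\|L\| = \|g\|_{p'}$.

Case $p = \infty$: Assume that $g \in L^1(\mathbb{R}^n)$. Then

\[ |L(f)| \le \int_{\mathbb{R}^n} |f||g|\,d\mu \le \|f\|_\infty \|g\|_1, \]

which implies

\[ \|L\| = \sup_{f \in L^\infty(\mathbb{R}^n),\ \|f\|_\infty \le 1} |L(f)| \le \|g\|_1 < \infty. \]

On the other hand, the function $f = \operatorname{sign} g$ belongs to $L^\infty(\mathbb{R}^n)$ and thus

\[ |L(f)| = \int_{\mathbb{R}^n} g \operatorname{sign} g\,d\mu = \int_{\mathbb{R}^n} |g|\,d\mu = \|g\|_1. \]

This shows that

\[ \|L\| = \sup_{f \in L^\infty(\mathbb{R}^n),\ \|f\|_\infty \le 1} |L(f)| \ge \|g\|_1, \]

from which it follows that $\|L\| = \|g\|_1$.

Case $p = 1$: Let $g \in L^\infty(\mathbb{R}^n)$. Then

\[ |L(f)| \le \int_{\mathbb{R}^n} |f||g|\,d\mu \le \|f\|_1 \|g\|_\infty, \]

which implies

\[ \|L\| = \sup_{f \in L^1(\mathbb{R}^n),\ \|f\|_1 \le 1} |L(f)| \le \|g\|_\infty < \infty. \]

Assume first that $\mu(\mathbb{R}^n) < \infty$. Let $0 < \varepsilon < \|g\|_\infty$ and set

\[ A_\varepsilon = \{x \in \mathbb{R}^n : |g(x)| \ge \|g\|_\infty - \varepsilon\} \quad\text{and}\quad f = \frac{\chi_{A_\varepsilon} \operatorname{sign} g}{\mu(A_\varepsilon)}. \]

Observe that $0 < \mu(A_\varepsilon) \le \mu(\mathbb{R}^n) < \infty$. Then $f \in L^1(\mathbb{R}^n)$, since

\[ \|f\|_1 = \int_{\mathbb{R}^n} \frac{\chi_{A_\varepsilon}}{\mu(A_\varepsilon)}\,d\mu = 1. \]

Thus

\[ |L(f)| = \int_{A_\varepsilon} \frac{|g|}{\mu(A_\varepsilon)}\,d\mu \ge \|g\|_\infty - \varepsilon. \]

Since $\|f\|_1 = 1$, this gives $\|L\| \ge \|g\|_\infty - \varepsilon$, and by letting $\varepsilon \to 0$ we have

\[ \|L\| = \sup_{f \in L^1(\mathbb{R}^n),\ \|f\|_1 \le 1} |L(f)| \ge \|g\|_\infty. \]

It follows that $\|L\| = \|g\|_\infty$ in the case $\mu(\mathbb{R}^n) < \infty$.

The case $\mu(\mathbb{R}^n) = \infty$ follows by exhausting $\mathbb{R}^n$ with sets $A_i \subset A_{i+1}$, $\mathbb{R}^n = \bigcup_{i=1}^{\infty} A_i$, with $\mu(A_i) < \infty$ for every $i = 1,2,\dots$. Choosing

\[ f = \frac{\chi_{A_{i,\varepsilon}} \operatorname{sign} g}{\mu(A_{i,\varepsilon})}, \quad A_{i,\varepsilon} = A_i \cap A_\varepsilon, \]

in the argument above, we obtain

\[ \|g\|_{L^\infty(A_{i,\varepsilon})} - \varepsilon \le \|L\| \le \|g\|_{L^\infty(\mathbb{R}^n)} \quad\text{for every } i = 1,2,\dots. \]

The claim follows from this by first letting $\varepsilon \to 0$ and then $i \to \infty$. □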
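The equality case in the proof of Theorem 5.1 for $1 < p < \infty$ can be checked numerically. The sketch below is an illustration under stated assumptions, not part of the original notes: $\mu$ is a finitely supported weighted measure (so functions are finite lists of values), and the extremal function $f = |g|^{p'/p} \operatorname{sign} g$ is compared against Hölder's bound. All names and data are hypothetical.

```python
import math


def lp_norm(values, weights, p):
    """L^p norm with respect to a finitely supported measure with the given weights."""
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)


def extremal_function(g, p):
    """Return f = |g|^(p'/p) sign g, where p' = p / (p - 1) is the conjugate exponent."""
    p_conj = p / (p - 1.0)
    return [math.copysign(abs(v) ** (p_conj / p), v) for v in g]


def pairing(f, g, weights):
    """The functional L(f), i.e. the integral of f g with respect to the weighted measure."""
    return sum(w * a * b for a, b, w in zip(f, g, weights))


# Assumed example data.
weights = [0.5, 2.0, 1.0, 0.25]
g = [1.0, -2.0, 0.0, 3.0]
p = 3.0
p_conj = p / (p - 1.0)

f = extremal_function(g, p)
lhs = pairing(f, g, weights)
rhs = lp_norm(f, weights, p) * lp_norm(g, weights, p_conj)

if __name__ == "__main__":
    print("L(f)               =", lhs)
    print("||f||_p ||g||_p'   =", rhs)
```

Both numbers agree up to rounding, which reflects that Hölder's inequality becomes an equality for this choice of $f$ and hence $\|L\| \ge \|g\|_{p'}$.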

Then we show that the converse of the previous theorem holds.

Theorem 5.2 (Riesz representation theorem in $L^p$). Let $1 \le p < \infty$ and assume that $\mu$ is a Radon measure. For every bounded linear functional $L : L^p(\mathbb{R}^n) \to \mathbb{R}$ there exists a unique $g \in L^{p'}(\mathbb{R}^n)$ such that

\[ L(f) = \int_{\mathbb{R}^n} f g\,d\mu \quad\text{for every } f \in L^p(\mathbb{R}^n). \tag{5.1} \]

Moreover, $\|L\| = \|g\|_{p'}$.

THE MORAL: The dual space of $L^p(\mathbb{R}^n)$ with $1 \le p < \infty$ is isomorphic to $L^{p'}(\mathbb{R}^n)$.

WARNING: The result does not hold for $p = \infty$, since $L^\infty(\mathbb{R}^n)^*$ is not a subset of $L^1(\mathbb{R}^n)$.

Proof. By considering the positive and negative parts separately, we may assume $f \ge 0$.

(1) We assume first that $\mu(\mathbb{R}^n) < \infty$. For a measurable set $A \subset \mathbb{R}^n$, let

$$\nu(A) = L(\chi_A).$$

Since $\mu(\mathbb{R}^n) < \infty$, we have $\chi_A \in L^p(\mathbb{R}^n)$ and so $\nu$ is well defined.

(2) CLAIM: $\nu$ is countably additive on disjoint measurable sets.

Reason. If $A$ and $B$ are disjoint measurable sets, then $\chi_{A \cup B} = \chi_A + \chi_B$ and thus

$$\nu(A \cup B) = \nu(A) + \nu(B).$$

The continuity of $L$ allows us to extend this to countable additivity. If $A_1, A_2, \dots$ are pairwise disjoint measurable sets, $E = \bigcup_{i=1}^\infty A_i$ and $E_k = \bigcup_{i=1}^k A_i$, then

$$\|\chi_E - \chi_{E_k}\|_p^p = \int_{\mathbb{R}^n} |\chi_E - \chi_{E_k}|^p \, d\mu = \mu(E \setminus E_k) \xrightarrow{k \to \infty} 0,$$

since $E \setminus E_k = \bigcup_{i=k+1}^\infty A_i$ and $\mu(\mathbb{R}^n) < \infty$. It follows that $\chi_{E_k} \to \chi_E$ in $L^p(\mathbb{R}^n)$ and by the continuity of $L$, $L(\chi_{E_k}) \to L(\chi_E)$ as $k \to \infty$. Consequently,

$$\sum_{i=1}^\infty \nu(A_i) = \lim_{k \to \infty} \sum_{i=1}^k \nu(A_i) = \lim_{k \to \infty} \nu(E_k) = \nu(E) = \nu\left(\bigcup_{i=1}^\infty A_i\right).$$

■

(3) CLAIM: $\nu$ is absolutely continuous with respect to $\mu$.

Reason. If $\mu(A) = 0$, then $\chi_A = 0$ in $L^p(\mathbb{R}^n)$ and since a linear functional maps zero to zero, we have $\nu(A) = L(\chi_A) = 0$. ■

(4) It follows that $\nu$ is a Radon measure. By the Radon-Nikodym theorem, see Theorem 4.18, there exists $g \in L^1(\mathbb{R}^n)$ for which

$$\nu(A) = L(\chi_A) = \int_{\mathbb{R}^n} \chi_A g \, d\mu$$

for every measurable set $A \subset \mathbb{R}^n$. This proves (5.1) for the characteristic functions. We still have to show that $g \in L^{p'}(\mathbb{R}^n)$, that (5.1) holds for all $f \in L^p(\mathbb{R}^n)$ and that $\|L\| = \|g\|_{p'}$.

(5) The representation in (5.1) holds for every $f \in L^\infty(\mathbb{R}^n)$.

Reason. In (4) we showed that (5.1) holds for every characteristic function of a measurable set and consequently it holds for linear combinations of such functions. Thus (5.1) holds for simple functions. Every $f \in L^\infty(\mathbb{R}^n)$ can be approximated in $L^\infty(\mathbb{R}^n)$ by a sequence $(f_i)$ of simple functions. Thus

$$\|f_i - f\|_p = \left(\int_{\mathbb{R}^n} |f_i - f|^p \, d\mu\right)^{1/p} \le \|f_i - f\|_\infty \, \mu(\mathbb{R}^n)^{1/p} \xrightarrow{i \to \infty} 0$$

and

$$|L(f_i) - L(f)| \le \|L\| \|f_i - f\|_p \xrightarrow{i \to \infty} 0.$$

On the other hand,

$$\left|\int_{\mathbb{R}^n} f_i g \, d\mu - \int_{\mathbb{R}^n} f g \, d\mu\right| \le \int_{\mathbb{R}^n} |f_i - f||g| \, d\mu \le \|f_i - f\|_\infty \int_{\mathbb{R}^n} |g| \, d\mu \xrightarrow{i \to \infty} 0.$$

Thus

$$L(f) = \lim_{i \to \infty} L(f_i) = \lim_{i \to \infty} \int_{\mathbb{R}^n} f_i g \, d\mu = \int_{\mathbb{R}^n} f g \, d\mu.$$

This shows that (5.1) holds for every $f \in L^\infty(\mathbb{R}^n)$. ■

(6) CLAIM: $\|g\|_{p'} \le \|L\|$ and thus $g \in L^{p'}(\mathbb{R}^n)$.

Reason. $\boxed{1 < p < \infty}$ Let

$$A_i = \{x \in \mathbb{R}^n : |g(x)| \le i\} \quad\text{and}\quad f = \chi_{A_i} |g|^{p'-1} \operatorname{sign} g, \quad i = 1, 2, \dots.$$

Then $f \in L^\infty(\mathbb{R}^n)$ and $|f|^p = |g|^{p'}$ on $A_i$. Thus

$$\int_{A_i} |g|^{p'} \, d\mu = \int_{\mathbb{R}^n} f g \, d\mu = L(f) \le \|L\| \|f\|_p = \|L\| \left(\int_{A_i} |g|^{p'} \, d\mu\right)^{1/p}.$$

This implies

$$\int_{A_i} |g|^{p'} \, d\mu \le \|L\|^{p'} \quad\text{for every } i = 1, 2, \dots,$$

and letting $i \to \infty$, by the monotone convergence theorem, we have $\|g\|_{p'} \le \|L\|$.

$\boxed{p = 1,\ p' = \infty}$ Let

$$A_i = \left\{x \in \mathbb{R}^n : |g(x)| \ge \|L\| + \frac{1}{i}\right\}, \quad i = 1, 2, \dots.$$

Then

$$\|L\| \|\chi_{A_i}\|_1 \ge |L(\chi_{A_i} \operatorname{sign} g)| = \left|\int_{\mathbb{R}^n} \chi_{A_i} g \operatorname{sign} g \, d\mu\right| \ge \left(\|L\| + \frac{1}{i}\right) \mu(A_i),$$

which can happen only if $\mu(A_i) = 0$ for $i = 1, 2, \dots$. Since

$$\mu(\{x \in \mathbb{R}^n : |g(x)| > \|L\|\}) = \mu\left(\bigcup_{i=1}^\infty A_i\right) \le \sum_{i=1}^\infty \mu(A_i) = 0,$$

we have $|g(x)| \le \|L\|$ for almost every $x \in \mathbb{R}^n$. This implies $\|g\|_\infty \le \|L\|$. ■

(7) Thus $g \in L^{p'}(\mathbb{R}^n)$ and

$$L(f) = \int_{\mathbb{R}^n} f g \, d\mu \tag{5.2}$$

for every $f \in L^\infty(\mathbb{R}^n)$. Both sides of (5.2) are continuous linear functionals on $L^p(\mathbb{R}^n)$ and they coincide on the dense subset $L^\infty(\mathbb{R}^n)$. Consequently, they coincide on the whole of $L^p(\mathbb{R}^n)$. This proves that (5.1) holds for every $f \in L^p(\mathbb{R}^n)$.

(8) CLAIM: $\|g\|_{p'} = \|L\|$.

Reason. By (6) we have $\|g\|_{p'} \le \|L\|$. The opposite direction comes from Hölder's inequality, since

$$|L(f)| = \left|\int_{\mathbb{R}^n} f g \, d\mu\right| \le \|f\|_p \|g\|_{p'}$$

so that

$$\|L\| = \sup_{f \in L^p(\mathbb{R}^n),\ \|f\|_p \le 1} |L(f)| \le \|g\|_{p'}.$$

■

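(9) The proof is now complete in the case $\mu(\mathbb{R}^n) < \infty$. Now we consider the case $\mu(\mathbb{R}^n) = \infty$.

(10) For every $\sigma$-finite measure $\mu$ on $\mathbb{R}^n$, there exists a function $w \in L^1(\mathbb{R}^n; \mu)$ such that $0 < w(x) < 1$ for every $x \in \mathbb{R}^n$.

Reason. To say that $\mu$ is $\sigma$-finite means that $\mathbb{R}^n$ is the union of countably many measurable sets $A_i$, $i = 1, 2, \dots$, for which $\mu(A_i) < \infty$. Define $w_i(x) = 0$, if $x \in \mathbb{R}^n \setminus A_i$, and

$$w_i(x) = \frac{2^{-i}}{1 + \mu(A_i)}$$

if $x \in A_i$. Then $w = \sum_{i=1}^\infty w_i$ has the required properties. ■

(11) Define a measure

$$\tilde{\mu}(A) = \int_A w \, d\mu$$

for $\mu$-measurable sets $A \subset \mathbb{R}^n$. Then $\tilde{\mu}(\mathbb{R}^n) < \infty$ and $\tilde{f} \mapsto w^{1/p} \tilde{f}$ is a linear isometry from $L^p(\mathbb{R}^n; \tilde{\mu})$ onto $L^p(\mathbb{R}^n; \mu)$, because $w(x) > 0$ for every $x \in \mathbb{R}^n$. Hence

$$\tilde{L}(\tilde{f}) = L(w^{1/p} \tilde{f})$$

defines a bounded linear functional $\tilde{L} : L^p(\mathbb{R}^n; \tilde{\mu}) \to \mathbb{R}$ with $\|\tilde{L}\| = \|L\|$. The beginning of the proof shows that there exists $\tilde{g} \in L^{p'}(\mathbb{R}^n; \tilde{\mu})$ such that

$$\tilde{L}(\tilde{f}) = \int_{\mathbb{R}^n} \tilde{f} \tilde{g} \, d\tilde{\mu}$$

for every $\tilde{f} \in L^p(\mathbb{R}^n; \tilde{\mu})$. Set $g = w^{1/p'} \tilde{g}$ for $1 < p < \infty$ and $g = \tilde{g}$ for $p = 1$. Then

$$\int_{\mathbb{R}^n} |g|^{p'} \, d\mu = \int_{\mathbb{R}^n} |\tilde{g}|^{p'} \, d\tilde{\mu} = \|\tilde{L}\|^{p'} = \|L\|^{p'},$$

if $p > 1$. If $p = 1$, then

$$\|g\|_\infty = \|\tilde{g}\|_\infty = \|\tilde{L}\| = \|L\|.$$

Finally,

$$L(f) = \tilde{L}(w^{-1/p} f) = \int_{\mathbb{R}^n} w^{-1/p} f \tilde{g} \, d\tilde{\mu} = \int_{\mathbb{R}^n} f g \, d\mu$$

for every $f \in L^p(\mathbb{R}^n)$. □

Remarks 5.3: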

(1) For $p = p' = 2$ the Riesz representation theorem can be proved using the facts that $L^2(\mathbb{R}^n)$ is a complete space and therefore a Hilbert space, and that bounded linear functionals on a Hilbert space are given by inner products.

(2) It follows that $L^p(\mathbb{R}^n)^{**} = L^p(\mathbb{R}^n)$, and thus $L^p(\mathbb{R}^n)$ is a reflexive space when $1 < p < \infty$.

(3) The result does not hold for $p = 1$ without the $\sigma$-finiteness assumption. For $p > 1$, we do not in fact need to assume that $\mu$ is $\sigma$-finite, although the proof requires some different details.

Remarks 5.4:

(1) It follows that

$$\|f\|_p = \sup\left\{\int_{\mathbb{R}^n} f g \, dx : g \in L^{p'}(\mathbb{R}^n),\ \|g\|_{p'} \le 1\right\}.$$

(2) Since $C_0(\mathbb{R}^n)$ is dense in $L^{p'}(\mathbb{R}^n)$ for $1 < p < \infty$, we have

$$\|f\|_p = \sup\left\{\int_{\mathbb{R}^n} f g \, dx : g \in C_0(\mathbb{R}^n),\ \|g\|_{p'} \le 1\right\}.$$

5.2 Partitions of unity

In this section we briefly discuss partitions of unity which are tools to localize problems in analysis to a neighbourhood of each point. In general, a partition of unity is a collection $\Phi$ of continuous functions $\varphi$ with $0 \le \varphi \le 1$ for every $\varphi \in \Phi$ and $\sum_{\varphi \in \Phi} \varphi(x) = 1$ such that for every point $x$ there is a neighbourhood of $x$ such that only finitely many of the functions are nonzero. This ensures that the sum above has only finitely many nonzero terms at each point and that its partial sums are well defined continuous functions. We shall consider a special case of this.

Theorem 5.5. Let $K \subset \mathbb{R}^n$ be a compact set and $U_i$, $i = 1, 2, \dots, k$, be a collection of finitely many open subsets of $\mathbb{R}^n$ such that $K \subset \bigcup_{i=1}^k U_i$. There exist functions $\varphi_i \in C_0(\mathbb{R}^n)$, $i = 1, \dots, k$, such that

(1) $0 \le \varphi_i \le 1$ for every $i = 1, \dots, k$,
(2) $\operatorname{supp} \varphi_i$ is a compact subset of $U_i$ for every $i = 1, \dots, k$ and
(3) $\sum_{i=1}^k \varphi_i(x) = 1$ for every $x \in K$.

TERMINOLOGY: The collection of functions $\varphi_i$ is called a partition of unity related to the collection $U_i$, $i = 1, 2, \dots, k$.

THE MORAL: Partition of unity is a very useful tool to localize functions, since

$$f(x) = f(x) \sum_{i=1}^k \varphi_i(x) = \sum_{i=1}^k f(x) \varphi_i(x) \quad\text{for every } x \in K.$$

Thus every function can be represented as a sum of compactly supported functions.

Proof. For every $x \in K$, there exists a ball $B(x, r_x)$ such that $B(x, 2r_x) \subset U_i$ for some $i = 1, \dots, k$. Since $K$ is compact, there exists a finite subcollection $B(x_j, r_j)$, $j = 1, \dots, m$, such that

$$K \subset \bigcup_{j=1}^m B(x_j, r_j).$$

Let $A_i$ be the union of those closed balls $\overline{B}(x_j, r_j)$ for which $B(x_j, 2r_j) \subset U_i$. Then $A_i$ is a compact subset of $U_i$ and $K \subset \bigcup_{i=1}^k A_i$. As in (3.1), we can obtain a cutoff function $g_i \in C(\mathbb{R}^n)$ such that

(1) $0 \le g_i \le 1$,
(2) $g_i = 1$ in $A_i$ and
(3) $\operatorname{supp} g_i$ is a compact subset of $U_i$.

Define

$$\begin{aligned}
\varphi_1 &= g_1, \\
\varphi_2 &= (1 - g_1) g_2, \\
&\ \ \vdots \\
\varphi_k &= (1 - g_1) \cdots (1 - g_{k-1}) g_k.
\end{aligned}$$

Clearly $0 \le \varphi_i \le 1$ and $\operatorname{supp} \varphi_i \subset U_i$ for every $i = 1, 2, \dots, k$. By induction

$$\sum_{i=1}^k \varphi_i = 1 - (1 - g_1) \cdots (1 - g_k).$$

In addition,

$$0 \le \sum_{i=1}^k \varphi_i \le 1 \quad\text{and}\quad \sum_{i=1}^k \varphi_i = 1 \text{ on } K,$$

since if $x \in K$, then $x \in A_i$ for some $i = 1, \dots, k$ and thus $1 - g_i(x) = 0$. □

Remark 5.6. The proof above applies in any locally compact Hausdorff space.
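
The telescoping construction in the proof of Theorem 5.5 can be tried out in one dimension. The sketch below is only an illustration under stated assumptions: the compact set $K = [0, 3]$, the open cover $U_1 = (-1, 1.5)$, $U_2 = (0.5, 2.5)$, $U_3 = (1.5, 4)$, the piecewise linear cutoffs and all function names are hypothetical choices, not taken from the text. It builds $\varphi_i = (1 - g_1) \cdots (1 - g_{i-1}) g_i$ and checks the identity $\sum_i \varphi_i = 1 - \prod_i (1 - g_i)$, which equals $1$ on $K$.

```python
def cutoff(center, inner, outer):
    """Continuous g with g = 1 on |x - center| <= inner and g = 0 on |x - center| >= outer."""
    def g(x):
        d = abs(x - center)
        if d <= inner:
            return 1.0
        if d >= outer:
            return 0.0
        return (outer - d) / (outer - inner)
    return g


def partition_of_unity(gs):
    """Return phi_i = (1 - g_1) ... (1 - g_{i-1}) g_i for a list of cutoff functions."""
    def make(i):
        def phi(x):
            value = gs[i](x)
            for j in range(i):
                value *= 1.0 - gs[j](x)
            return value
        return phi
    return [make(i) for i in range(len(gs))]


# g_i = 1 on A_i and supp g_i is a compact subset of U_i, where
# A_1 = [-0.75, 1.25], A_2 = [0.8, 2.2], A_3 = [1.75, 3.75] cover K = [0, 3].
gs = [cutoff(0.25, 1.0, 1.2), cutoff(1.5, 0.7, 0.9), cutoff(2.75, 1.0, 1.2)]
phis = partition_of_unity(gs)

points_in_K = [3.0 * k / 300 for k in range(301)]
sums_on_K = [sum(phi(x) for phi in phis) for x in points_in_K]

# Check the induction identity on a larger interval as well.
points_all = [-2.0 + 7.0 * k / 700 for k in range(701)]
identity_gap = max(
    abs(sum(phi(x) for phi in phis)
        - (1.0 - (1.0 - gs[0](x)) * (1.0 - gs[1](x)) * (1.0 - gs[2](x))))
    for x in points_all
)
```

Every entry of `sums_on_K` equals $1$ up to rounding, and `identity_gap` is of the order of machine precision.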

Remark 5.7. Let us briefly discuss another approach to the partition of unity.

(1) Assume that the balls $B(x_i, r_i)$, $i = 1, 2, \dots$, have the property that even the enlarged balls $B(x_i, 2r_i)$, $i = 1, 2, \dots$, are of bounded overlap, see the Besicovitch covering theorem (Theorem 4.2) for the bounded overlap property. For every $i = 1, 2, \dots$, there is a continuous function $g_i$ such that $0 \le g_i \le 1$, $g_i = 1$ on $B(x_i, r_i)$ and $\operatorname{supp} g_i = \overline{B}(x_i, 2r_i)$. Define

$$\varphi_i(x) = \frac{g_i(x)}{\sum_{j=1}^\infty g_j(x)}, \quad i = 1, 2, \dots.$$

Observe that the sum is only over finitely many terms at a given point. Then

$$\sum_{i=1}^\infty \varphi_i = 1 \quad\text{in } \bigcup_{i=1}^\infty B(x_i, 2r_i).$$

(2) This method applies to other sets than balls as well. Let $U$ be a nonempty open set in $\mathbb{R}^n$. Denote $U_0 = \emptyset$ and

$$U_i = \left\{x \in U : \operatorname{dist}(x, \partial U) > \frac{1}{i}\right\} \cap B(0, i), \quad i = 1, 2, \dots.$$

Then $U = \bigcup_{i=1}^\infty U_i$ and $\overline{U}_i$ is a compact subset of $U_{i+1}$ for every $i = 1, 2, \dots$.

CLAIM: There exist $\varphi_i \in C_0^\infty(U_{i+2} \setminus \overline{U}_{i-1})$, $0 \le \varphi_i \le 1$ for every $i = 1, 2, \dots$, and

$$\sum_{i=1}^\infty \varphi_i = 1 \quad\text{in } U.$$

Reason. There exists $g_i \in C_0^\infty(U_{i+2} \setminus \overline{U}_{i-1})$ with $0 \le g_i \le 1$ and $g_i = 1$ in $\overline{U}_{i+1} \setminus U_i$ for every $i = 1, 2, \dots$. Define $\varphi_i : U \to \mathbb{R}$,

$$\varphi_i(x) = \frac{g_i(x)}{\sum_{j=1}^\infty g_j(x)}, \quad i = 1, 2, \dots.$$

■

Let $f \in L^p(U)$ with $1 \le p < \infty$ and assume that $\phi_\varepsilon$ is an approximation of the identity as in Definition 3.14. Then $\varphi_i f$ has a compact support and $\operatorname{supp}(\varphi_i f) \subset U_{i+2} \setminus \overline{U}_{i-1}$. Fix $\varepsilon > 0$. Choose $\varepsilon_i > 0$ so small that

$$\operatorname{supp}(\phi_{\varepsilon_i} * (\varphi_i f)) \subset U_{i+2} \setminus \overline{U}_{i-1}$$

and

$$\|\phi_{\varepsilon_i} * (\varphi_i f) - \varphi_i f\|_{L^p(U)} < \frac{\varepsilon}{2^i}, \quad i = 1, 2, \dots.$$

Define

$$g = \sum_{i=1}^\infty \phi_{\varepsilon_i} * (\varphi_i f).$$

This function belongs to $C^\infty(U)$, since in a neighbourhood of any point $x \in U$, there are only finitely many nonzero terms in the sum. Moreover,

$$\|f - g\|_{L^p(U)} = \left\|\sum_{i=1}^\infty \phi_{\varepsilon_i} * (\varphi_i f) - \sum_{i=1}^\infty \varphi_i f\right\|_{L^p(U)} \le \sum_{i=1}^\infty \left\|\phi_{\varepsilon_i} * (\varphi_i f) - \varphi_i f\right\|_{L^p(U)} \le \sum_{i=1}^\infty \frac{\varepsilon}{2^i} = \varepsilon.$$

This shows that $C^\infty(U)$ is dense in $L^p(U)$ for $1 \le p < \infty$ for an arbitrary open subset $U$ of $\mathbb{R}^n$. It can be shown that even $C_0^\infty(U)$ is dense in $L^p(U)$ for $1 \le p < \infty$ for an arbitrary open subset $U$ of $\mathbb{R}^n$ (exercise).

5.3 The Riesz representation theorem for Radon measures

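We denote the space of continuous functions $f : \mathbb{R}^n \to \mathbb{R}^m$ by $C(\mathbb{R}^n; \mathbb{R}^m)$, where $n, m = 1, 2, \dots$. The support of such a function $f$ is

$$\operatorname{supp} f = \overline{\{x \in \mathbb{R}^n : f(x) \ne 0\}}$$

and

$$C_0(\mathbb{R}^n; \mathbb{R}^m) = \{f \in C(\mathbb{R}^n; \mathbb{R}^m) : \operatorname{supp} f \text{ is a compact subset of } \mathbb{R}^n\}$$

is the space of compactly supported continuous functions $f : \mathbb{R}^n \to \mathbb{R}^m$.

Assume that $\mu$ is a Radon measure on $\mathbb{R}^n$ and let $\sigma : \mathbb{R}^n \to \mathbb{R}^m$ be a $\mu$-measurable function such that $|\sigma(x)| = 1$ for $\mu$-almost every $x \in \mathbb{R}^n$. Define $L : C_0(\mathbb{R}^n; \mathbb{R}^m) \to \mathbb{R}$,

$$L(f) = \int_{\mathbb{R}^n} f \cdot \sigma \, d\mu$$

for every $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$. Then

$$L(f + g) = L(f) + L(g) \quad\text{and}\quad L(af) = aL(f), \quad a \in \mathbb{R},$$

so that $L$ is a linear functional. Let $K \subset \mathbb{R}^n$ be a compact set and assume that $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ with $\operatorname{supp} f \subset K$ and $|f(x)| \le 1$ for every $x \in \mathbb{R}^n$. Then

$$|L(f)| = \left|\int_{\mathbb{R}^n} f \cdot \sigma \, d\mu\right| \le \int_{\mathbb{R}^n} |f \cdot \sigma| \, d\mu \le \int_{\mathbb{R}^n} |f||\sigma| \, d\mu = \int_{\mathbb{R}^n} |f| \, d\mu \le \mu(\operatorname{supp} f) \le \mu(K) < \infty,$$

which implies

$$\sup\{|L(f)| : f \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |f| \le 1,\ \operatorname{supp} f \subset K\} < \infty.$$

This is the norm of the linear functional $L$ over the class of functions $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ with $\operatorname{supp} f \subset K$. Thus this functional is locally bounded.

THE MORAL: The integral with respect to a Radon measure defines a locally bounded linear functional $L : C_0(\mathbb{R}^n; \mathbb{R}^m) \to \mathbb{R}$ as above.

The next theorem shows that the converse holds as well.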

Theorem 5.8 (Riesz representation theorem). Assume that $L : C_0(\mathbb{R}^n; \mathbb{R}^m) \to \mathbb{R}$ is a linear functional satisfying

$$\sup\{L(f) : f \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |f| \le 1,\ \operatorname{supp} f \subset K\} < \infty \tag{5.3}$$

for every compact set $K \subset \mathbb{R}^n$. Then there exists a Radon measure $\mu$ on $\mathbb{R}^n$ and a $\mu$-measurable function $\sigma : \mathbb{R}^n \to \mathbb{R}^m$ such that $|\sigma(x)| = 1$ for $\mu$-almost every $x \in \mathbb{R}^n$ and

$$L(f) = \int_{\mathbb{R}^n} f \cdot \sigma \, d\mu$$

for every $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$.

THE MORAL: A locally bounded linear functional on $C_0(\mathbb{R}^n; \mathbb{R}^m)$ can be characterized as an integral with respect to a Radon measure. This gives a method to construct Radon measures and motivates the study of Radon measures. The role of $\sigma$ is just to assign a sign so that the measure $\mu$ is nonnegative.

Example 5.9. Let $L : C_0(\mathbb{R}^n) \to \mathbb{R}$, $L(f) = f(x_0)$, be the evaluation map for a fixed $x_0 \in \mathbb{R}^n$. Let $K \subset \mathbb{R}^n$ be a compact set, $f \in C_0(\mathbb{R}^n)$ with $|f| \le 1$ and $\operatorname{supp} f \subset K$. Then

$$L(f) = f(x_0) \le 1 < \infty.$$

This functional is positive in the sense that $L(f) \ge 0$ whenever $f \ge 0$. By the Riesz representation theorem, there exists a Radon measure $\mu$ on $\mathbb{R}^n$ such that

$$L(f) = \int_{\mathbb{R}^n} f \, d\mu$$

for every $f \in C_0(\mathbb{R}^n)$. It follows that for the evaluation map, the measure $\mu$ is equal to Dirac's measure $\delta_{x_0}$ concentrated at $x_0$.

Definition 5.10. The variation measure $\mu$ is defined as

$$\mu(U) = \sup\{L(f) : f \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |f| \le 1,\ \operatorname{supp} f \subset U\}$$

for every open set $U \subset \mathbb{R}^n$.

Now we are ready for the proof of Theorem 5.8.

Proof. (1) Define the variation measure for open sets as above and set

$$\mu(A) = \inf\{\mu(U) : A \subset U,\ U \text{ open}\}$$

for an arbitrary $A \subset \mathbb{R}^n$.

(2) CLAIM: $\mu$ is an outer measure.

Reason. Let $U$ and $U_i$, $i = 1, 2, \dots$, be open subsets of $\mathbb{R}^n$ such that $U \subset \bigcup_{i=1}^\infty U_i$. Choose $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ such that $|f| \le 1$ and $\operatorname{supp} f \subset U$. Since $\operatorname{supp} f$ is a compact set and the collection of sets $U_i$, $i = 1, 2, \dots$, is an open covering of $\operatorname{supp} f$, there are finitely many sets $U_i$, $i = 1, \dots, k$, such that $\operatorname{supp} f \subset \bigcup_{i=1}^k U_i$.

Let $\varphi_i$, $i = 1, \dots, k$, be a partition of unity related to the collection $U_i$, $i = 1, \dots, k$, such that $0 \le \varphi_i \le 1$, $\operatorname{supp} \varphi_i \subset U_i$ for every $i = 1, \dots, k$, and

$$\sum_{i=1}^k \varphi_i(x) = 1 \quad\text{for every } x \in \operatorname{supp} f.$$

Then

$$f(x) = f(x) \sum_{i=1}^k \varphi_i(x) = \sum_{i=1}^k f(x) \varphi_i(x) \quad\text{for every } x \in \mathbb{R}^n,$$

$\operatorname{supp}(\varphi_i f) \subset U_i$ and $|\varphi_i f| \le 1$ for every $i = 1, \dots, k$. Thus

$$|L(f)| = \left|L\left(\sum_{i=1}^k f \varphi_i\right)\right| = \left|\sum_{i=1}^k L(f \varphi_i)\right| \le \sum_{i=1}^k |L(f \varphi_i)| \le \sum_{i=1}^\infty \mu(U_i).$$

By taking the supremum over such functions $f$, we have

$$\mu(U) \le \sum_{i=1}^\infty \mu(U_i).$$

Then let $A_i$, $i = 1, 2, \dots$, be arbitrary sets with $A \subset \bigcup_{i=1}^\infty A_i$. Fix $\varepsilon > 0$. For every $i = 1, 2, \dots$, choose an open set $U_i$ such that $A_i \subset U_i$ and

$$\mu(U_i) \le \mu(A_i) + \frac{\varepsilon}{2^i}.$$

Then

$$\mu(A) \le \mu\left(\bigcup_{i=1}^\infty U_i\right) \le \sum_{i=1}^\infty \mu(U_i) \le \sum_{i=1}^\infty \left(\mu(A_i) + \frac{\varepsilon}{2^i}\right) = \sum_{i=1}^\infty \mu(A_i) + \varepsilon.$$

■

(3) CLAIM: $\mu$ is a Radon measure.

Reason. Assume first that $U_1$ and $U_2$ are open sets with $\operatorname{dist}(U_1, U_2) > 0$. Let $f_i \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ be such that $|f_i| \le 1$ and $\operatorname{supp} f_i \subset U_i$, $i = 1, 2$. Then $f_1 + f_2 \in C_0(\mathbb{R}^n; \mathbb{R}^m)$, $|f_1 + f_2| \le 1$ and $\operatorname{supp}(f_1 + f_2) \subset U_1 \cup U_2$, so that

$$L(f_1) + L(f_2) = L(f_1 + f_2) \le \mu(U_1 \cup U_2).$$

By taking the supremum over all admissible functions $f_1$ and $f_2$, we obtain

$$\mu(U_1) + \mu(U_2) \le \mu(U_1 \cup U_2).$$

On the other hand, by the subadditivity in (2) we have $\mu(U_1 \cup U_2) \le \mu(U_1) + \mu(U_2)$. Thus

$$\mu(U_1) + \mu(U_2) = \mu(U_1 \cup U_2).$$

Assume then that $A_1$ and $A_2$ are arbitrary sets with $\operatorname{dist}(A_1, A_2) > 0$. Let $\varepsilon > 0$ and choose an open set $U \subset \mathbb{R}^n$ such that $A_1 \cup A_2 \subset U$ and

$$\mu(U) \le \mu(A_1 \cup A_2) + \varepsilon.$$

Take open sets $U_i \subset \mathbb{R}^n$ such that $A_i \subset U_i$ and $\operatorname{dist}(U_1, U_2) > 0$. For example, we may take

$$U_i = \left\{x \in \mathbb{R}^n : \operatorname{dist}(x, A_i) < \frac{1}{3} \operatorname{dist}(A_1, A_2)\right\}, \quad i = 1, 2.$$

Then $A_i \subset U_i \cap U$ and $\operatorname{dist}(U_1 \cap U, U_2 \cap U) > 0$. Thus

$$\mu(A_1) + \mu(A_2) \le \mu(U_1 \cap U) + \mu(U_2 \cap U) = \mu(U \cap (U_1 \cup U_2)) \le \mu(A_1 \cup A_2) + \varepsilon.$$

Letting $\varepsilon \to 0$, we have $\mu(A_1) + \mu(A_2) \le \mu(A_1 \cup A_2)$. Again, the reverse inequality follows by subadditivity, so that

$$\mu(A_1 \cup A_2) = \mu(A_1) + \mu(A_2).$$

This shows that $\mu$ is a metric outer measure and consequently it is a Borel measure.

To see that $\mu$ is Borel regular, let $A$ be an arbitrary subset of $\mathbb{R}^n$. Then there exist open sets $U_i$, $i = 1, 2, \dots$, such that $A \subset U_i$ and

$$\mu(U_i) < \mu(A) + \frac{1}{i}, \quad i = 1, 2, \dots.$$

Thus

$$\mu(A) \le \mu\left(\bigcap_{i=1}^\infty U_i\right) \le \mu(U_i) < \mu(A) + \frac{1}{i}$$

for every $i = 1, 2, \dots$, which implies

$$\mu(A) = \mu\left(\bigcap_{i=1}^\infty U_i\right).$$

Finally, we show that $\mu(K) < \infty$ for every compact set $K \subset \mathbb{R}^n$. It is enough to show that $\mu(B(x, r)) < \infty$ for every ball $B(x, r) \subset \mathbb{R}^n$. This is clear, since (5.3) gives

$$\begin{aligned}
\mu(B(x, r)) &= \sup\{L(f) : f \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |f| \le 1,\ \operatorname{supp} f \subset B(x, r)\} \\
&\le \sup\{L(f) : f \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |f| \le 1,\ \operatorname{supp} f \subset \overline{B}(x, r)\} < \infty.
\end{aligned}$$

Thus $\mu$ satisfies all conditions in the definition of the Radon measure. ■

(4) Denote $C_0^+(\mathbb{R}^n) = \{f \in C_0(\mathbb{R}^n) : f \ge 0\}$ and for every $f \in C_0^+(\mathbb{R}^n)$ define

$$\nu(f) = \sup\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le f\}.$$

Observe that if $f_1, f_2 \in C_0^+(\mathbb{R}^n)$ and $f_1 \le f_2$, then $\nu(f_1) \le \nu(f_2)$. Moreover, $\nu(af) = a\nu(f)$ for every $a \ge 0$ and $f \in C_0^+(\mathbb{R}^n)$.

(5) CLAIM: $\nu(f_1 + f_2) = \nu(f_1) + \nu(f_2)$ for every $f_1, f_2 \in C_0^+(\mathbb{R}^n)$.

Reason. If $g_1, g_2 \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ are such that $|g_1| \le f_1$ and $|g_2| \le f_2$, then $|g_1 + g_2| \le |g_1| + |g_2| \le f_1 + f_2$. In addition, we may assume that $L(g_1) \ge 0$ and $L(g_2) \ge 0$. Thus

$$|L(g_1)| + |L(g_2)| = L(g_1) + L(g_2) = L(g_1 + g_2) \le |L(g_1 + g_2)| \le \nu(f_1 + f_2).$$

By taking suprema over $g_1 \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ and $g_2 \in C_0(\mathbb{R}^n; \mathbb{R}^m)$, we have

$$\nu(f_1) + \nu(f_2) \le \nu(f_1 + f_2).$$

To prove the reverse inequality, let $g \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ with $|g| \le f_1 + f_2$. For $i = 1, 2$, set

$$g_i = \begin{cases} \dfrac{f_i g}{f_1 + f_2}, & \text{if } f_1 + f_2 > 0, \\ 0, & \text{if } f_1 + f_2 = 0. \end{cases}$$

Then $g_1, g_2 \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ and $g = g_1 + g_2$. Moreover, $|g_1| \le f_1$ and $|g_2| \le f_2$, so that

$$|L(g)| \le |L(g_1)| + |L(g_2)| \le \nu(f_1) + \nu(f_2),$$

from which it follows that $\nu(f_1 + f_2) \le \nu(f_1) + \nu(f_2)$. ■

(6) CLAIM: $\nu(f) = \int_{\mathbb{R}^n} f \, d\mu$ for every $f \in C_0^+(\mathbb{R}^n)$.

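Reason. Let $\varepsilon > 0$. Choose $0 = t_0 < t_1 < \dots < t_k$ such that

$$t_k = 2\|f\|_\infty, \quad 0 < t_i - t_{i-1} < \varepsilon \quad\text{and}\quad \mu(f^{-1}\{t_i\}) = 0 \text{ for every } i = 1, \dots, k.$$

Set $U_i = f^{-1}((t_{i-1}, t_i))$, then $U_i$ is open and $\mu(U_i) < \infty$ for every $i = 1, \dots, k$. By approximation properties of measurable sets with respect to a Radon measure, there exist compact sets $K_i \subset U_i$ such that

$$\mu(U_i \setminus K_i) < \frac{\varepsilon}{k}, \quad i = 1, \dots, k.$$

Furthermore, there exist functions $g_i \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ with $|g_i| \le 1$, $\operatorname{supp} g_i \subset U_i$ such that

$$|L(g_i)| \ge \mu(U_i) - \frac{\varepsilon}{k}, \quad i = 1, \dots, k.$$

Note also that there exist functions $h_i \in C_0^+(\mathbb{R}^n)$ such that $\operatorname{supp} h_i \subset U_i$, $0 \le h_i \le 1$ and $h_i = 1$ in the compact set $K_i \cup \operatorname{supp} g_i$. Then

$$\nu(h_i) \ge |L(g_i)| \ge \mu(U_i) - \frac{\varepsilon}{k}, \quad i = 1, \dots, k,$$

and

$$\begin{aligned}
\nu(h_i) &= \sup\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le h_i\} \\
&\le \sup\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le 1,\ \operatorname{supp} g \subset U_i\} = \mu(U_i).
\end{aligned}$$

Thus

$$\mu(U_i) - \frac{\varepsilon}{k} \le \nu(h_i) \le \mu(U_i), \quad i = 1, \dots, k.$$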

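Define

$$A = \left\{x \in \mathbb{R}^n : f(x)\left(1 - \sum_{i=1}^k h_i(x)\right) > 0\right\}.$$

Then $A$ is an open set. We have

$$\begin{aligned}
\nu\left(f - f\sum_{i=1}^k h_i\right) &= \sup\left\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le f - f\sum_{i=1}^k h_i\right\} \\
&\le \sup\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le \|f\|_\infty \chi_A\} \\
&= \|f\|_\infty \sup\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le \chi_A\} \\
&= \|f\|_\infty \mu(A) = \|f\|_\infty \mu\left(\bigcup_{i=1}^k (U_i \setminus \{h_i = 1\})\right) \\
&\le \|f\|_\infty \sum_{i=1}^k \mu(U_i \setminus K_i) \le \varepsilon \|f\|_\infty.
\end{aligned}$$

Thus

$$\begin{aligned}
\nu(f) &= \nu\left(f - f\sum_{i=1}^k h_i\right) + \nu\left(f\sum_{i=1}^k h_i\right) \\
&\le \varepsilon \|f\|_\infty + \sum_{i=1}^k \nu(f h_i) \\
&\le \varepsilon \|f\|_\infty + \sum_{i=1}^k t_i \mu(U_i)
\end{aligned}$$

and

$$\nu(f) \ge \sum_{i=1}^k \nu(f h_i) \ge \sum_{i=1}^k t_{i-1}\left(\mu(U_i) - \frac{\varepsilon}{k}\right) \ge \sum_{i=1}^k t_{i-1} \mu(U_i) - t_k \varepsilon.$$

Since

$$\sum_{i=1}^k t_{i-1} \mu(U_i) \le \int_{\mathbb{R}^n} f \, d\mu \le \sum_{i=1}^k t_i \mu(U_i),$$

we have

$$\left|\nu(f) - \int_{\mathbb{R}^n} f \, d\mu\right| \le \sum_{i=1}^k (t_i - t_{i-1}) \mu(U_i) + \varepsilon \|f\|_\infty + \varepsilon t_k \le \varepsilon \mu(\operatorname{supp} f) + 3\varepsilon \|f\|_\infty.$$

■

(7) There exists a $\mu$-measurable function $\sigma : \mathbb{R}^n \to \mathbb{R}^m$ such that

$$L(f) = \int_{\mathbb{R}^n} f \cdot \sigma \, d\mu$$

for every $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$.

Reason. Fix $e \in \mathbb{R}^m$ with $|e| = 1$. Define $\nu_e(f) = L(fe)$ for every $f \in C_0(\mathbb{R}^n)$. Then $\nu_e$ is linear and

$$|\nu_e(f)| = |L(fe)| \le \sup\{|L(g)| : g \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |g| \le |f|\} = \nu(|f|) = \int_{\mathbb{R}^n} |f| \, d\mu.$$

Thus we can extend $\nu_e$ to a bounded linear functional on $L^1(\mathbb{R}^n; \mu)$. Hence there exists $\sigma_e \in L^\infty(\mathbb{R}^n; \mu)$ such that

$$\nu_e(f) = \int_{\mathbb{R}^n} f \sigma_e \, d\mu, \quad f \in C_0(\mathbb{R}^n).$$

Let $e_1, \dots, e_m$ be the standard basis for $\mathbb{R}^m$ and define $\sigma = \sum_{i=1}^m \sigma_{e_i} e_i$. For $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$, we have

$$L(f) = \sum_{i=1}^m L((f \cdot e_i) e_i) = \sum_{i=1}^m \int_{\mathbb{R}^n} (f \cdot e_i) \sigma_{e_i} \, d\mu = \int_{\mathbb{R}^n} f \cdot \sigma \, d\mu.$$

■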

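(8) CLAIM: $|\sigma(x)| = 1$ for $\mu$-almost every $x \in \mathbb{R}^n$.

Reason. Let $U \subset \mathbb{R}^n$ be an open set with $\mu(U) < \infty$. By definition

$$\mu(U) = \sup\left\{\int_{\mathbb{R}^n} f \cdot \sigma \, d\mu : f \in C_0(\mathbb{R}^n; \mathbb{R}^m),\ |f| \le 1,\ \operatorname{supp} f \subset U\right\}.$$

Let $f_i \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ be such that $|f_i| \le 1$, $\operatorname{supp} f_i \subset U$ and $f_i \cdot \sigma \to |\sigma|$ $\mu$-almost everywhere. Thus

$$\int_U |\sigma| \, d\mu = \lim_{i \to \infty} \int_{\mathbb{R}^n} f_i \cdot \sigma \, d\mu \le \mu(U).$$

On the other hand, if $f \in C_0(\mathbb{R}^n; \mathbb{R}^m)$ is such that $|f| \le 1$, $\operatorname{supp} f \subset U$, then

$$\int_{\mathbb{R}^n} f \cdot \sigma \, d\mu \le \int_U |\sigma| \, d\mu$$

and consequently

$$\mu(U) \le \int_U |\sigma| \, d\mu.$$

Thus

$$\mu(U) = \int_U |\sigma| \, d\mu \quad\text{for every open } U \subset \mathbb{R}^n.$$

This implies $|\sigma| = 1$ $\mu$-almost everywhere. ■

Next we prove the Riesz representation theorem for positive linear functionals on $C_0(\mathbb{R}^n)$.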

Theorem 5.11. Assume that $L : C_0(\mathbb{R}^n) \to \mathbb{R}$ is a positive linear functional, that is, $L(f) \ge 0$ for every $f \in C_0(\mathbb{R}^n)$ with $f \ge 0$. Then there exists a Radon measure $\mu$ on $\mathbb{R}^n$ such that

$$L(f) = \int_{\mathbb{R}^n} f \, d\mu$$

for every $f \in C_0(\mathbb{R}^n)$.

Remarks 5.12:

(1) If $f, g \in C_0(\mathbb{R}^n)$ and $f \ge g$, then $L(f) - L(g) = L(f - g) \ge 0$ and thus $L(f) \ge L(g)$.

(2) Positive linear functionals on $C_0(\mathbb{R}^n)$ are not necessarily bounded, but they are locally bounded, as we shall see in the proof below.

Proof. Let $K$ be a compact subset of $\mathbb{R}^n$ and choose $\varphi \in C_0^\infty(\mathbb{R}^n)$ such that $\varphi = 1$ on $K$ and $0 \le \varphi \le 1$ (exercise). Then for every $f \in C_0^\infty(\mathbb{R}^n)$ with $\operatorname{supp} f \subset K$, set

$$g = \|f\|_\infty \varphi - f \ge 0.$$

Thus

$$0 \le L(g) = \|f\|_\infty L(\varphi) - L(f),$$

which implies

$$L(f) \le c\|f\|_\infty$$

with $c = L(\varphi)$. The mapping $L$ can thus be extended to a linear mapping from $C_0(\mathbb{R}^n)$ to $\mathbb{R}$, which satisfies the assumptions in the Riesz representation theorem (exercise). Hence there exists a Radon measure $\mu$ and a $\mu$-measurable function $\sigma : \mathbb{R}^n \to \mathbb{R}$ such that $|\sigma(x)| = 1$ for $\mu$-almost every $x \in \mathbb{R}^n$ and

$$L(f) = \int_{\mathbb{R}^n} f \sigma \, d\mu$$

for every $f \in C_0(\mathbb{R}^n)$. Then $\sigma(x) = \pm 1$ for $\mu$-almost every $x \in \mathbb{R}^n$ and positivity of the operator implies $\sigma(x) = 1$ for $\mu$-almost every $x \in \mathbb{R}^n$ (exercise). □

Remark 5.13. The Riesz representation theorem holds in a much more general context. The underlying space can be any locally compact Hausdorff space $X$ instead of $\mathbb{R}^n$.

5.4 Weak convergence and compactness of Radon measures

Let us recall the notion of weak convergence from functional analysis. Let $X$ be a normed space. A sequence $(x_i)$ in $X$ is said to be weakly converging, if there exists an element $x \in X$ such that

$$\lim_{i \to \infty} x^*(x_i) = x^*(x) \quad\text{for every } x^* \in X^*.$$

A sequence $(x_i^*)$ in the dual space $X^*$ is said to be weakly (weak star) converging, if there exists an element $x^* \in X^*$ such that

$$\lim_{i \to \infty} x_i^*(x) = x^*(x) \quad\text{for every } x \in X.$$

Both weak and weak star convergences are pointwise convergences tested on every element of the space $X^*$ and $X$, respectively.

By the Riesz representation theorem, every bounded linear functional on $C_0(\mathbb{R}^n)$ is an integral with respect to a Radon measure. This gives a motivation for the following definition.

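Definition 5.14. The sequence $(\mu_i)$ of Radon measures $\mu_i$, $i = 1, 2, \dots$, converges weakly to the Radon measure $\mu$, if

$$\lim_{i \to \infty} \int_{\mathbb{R}^n} f \, d\mu_i = \int_{\mathbb{R}^n} f \, d\mu \quad\text{for every } f \in C_0(\mathbb{R}^n).$$

In this case we write $\mu_i \rightharpoonup \mu$ as $i \to \infty$.

Examples 5.15:

(1) Let $\phi_\varepsilon$, $\varepsilon > 0$, be the standard mollifier in Example 3.16 (2) and let $\mu_i$ be a Radon measure on $\mathbb{R}^n$ defined by

$$\mu_i(A) = \mu_{\varepsilon_i}(A) = \int_A \phi_{\varepsilon_i}(y) \, dy,$$

with $\varepsilon_i \to 0$ as $i \to \infty$, for every Borel set $A \subset \mathbb{R}^n$. As in Example 3.16 (2), we have

$$\lim_{i \to \infty} \int_{\mathbb{R}^n} f(y) \, d\mu_i(y) = \lim_{i \to \infty} \int_{\mathbb{R}^n} f(y) \phi_{\varepsilon_i}(y) \, dy = f(0)$$

for every $f \in C_0(\mathbb{R}^n)$. This implies that $\mu_i \rightharpoonup \delta_0$ as $i \to \infty$, where $\delta_0$ is Dirac's measure at $0$.

(2) Let $\delta_i$ be Dirac's measure at $i = 1, 2, \dots$ on $\mathbb{R}$. Then $\delta_i \rightharpoonup 0$ as $i \to \infty$.

(3) Let

$$\mu_i = \frac{1}{i}\left(\delta_{\frac{1}{i}} + \delta_{\frac{2}{i}} + \dots + \delta_{\frac{i}{i}}\right), \quad i = 1, 2, \dots.$$

Then for every $f \in C_0(\mathbb{R})$, we have

$$\int_{\mathbb{R}} f \, d\mu_i = \sum_{j=1}^i \frac{1}{i} f\left(\frac{j}{i}\right) \to \int_0^1 f(x) \, dx,$$

since these are Riemann sums in $[0, 1]$. Thus $\mu_i \rightharpoonup m \llcorner [0, 1]$ as $i \to \infty$.

Lemma 5.16. Assume that $\mu_i$, $i = 1, 2, \dots$, are Radon measures on $\mathbb{R}^n$ with $\mu_i \rightharpoonup \mu$ as $i \to \infty$. Then the following claims are true:

(1) $\limsup_{i \to \infty} \mu_i(K) \le \mu(K)$ for every compact set $K \subset \mathbb{R}^n$ and
(2) $\mu(U) \le \liminf_{i \to \infty} \mu_i(U)$ for every open set $U \subset \mathbb{R}^n$.

Proof. (1) Let $K \subset \mathbb{R}^n$ be compact and choose an open set $U$ such that $K \subset U$. Choose $f \in C_0(\mathbb{R}^n)$ such that $0 \le f \le 1$, $\operatorname{supp} f \subset U$ and $f = 1$ on $K$. Then

$$\mu(U) \ge \int_{\mathbb{R}^n} f \, d\mu = \lim_{i \to \infty} \int_{\mathbb{R}^n} f \, d\mu_i \ge \limsup_{i \to \infty} \mu_i(K).$$

Taking the infimum over all open sets $U \supset K$, we have

$$\limsup_{i \to \infty} \mu_i(K) \le \inf\{\mu(U) : U \supset K,\ U \text{ open}\} = \mu(K).$$

(2) Let $U$ be open and $K \subset U$ compact. Assume that the function $f$ is as above. Then

$$\mu(K) \le \int_{\mathbb{R}^n} f \, d\mu = \lim_{i \to \infty} \int_{\mathbb{R}^n} f \, d\mu_i \le \liminf_{i \to \infty} \mu_i(U).$$

Thus

$$\mu(U) = \sup\{\mu(K) : K \subset U,\ K \text{ compact}\} \le \liminf_{i \to \infty} \mu_i(U).$$

□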
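
Examples 5.15 (2) and (3) can be checked numerically by integrating a fixed test function against the measures. The following is a minimal sketch: the test function `bump` and the helper names are hypothetical choices made for illustration. For the averaged Dirac masses the integral is the Riemann sum $\frac{1}{i} \sum_{j=1}^i f(j/i)$, and for Dirac's measure at $i$ it is the point value $f(i)$.

```python
def bump(x):
    """Continuous test function with compact support [-1, 1]."""
    return max(0.0, 1.0 - x * x)


def integrate_against_mu(f, i):
    """Integral of f with respect to mu_i = (1/i) * sum_{j=1}^{i} delta_{j/i}."""
    return sum(f(j / i) for j in range(1, i + 1)) / i


def integrate_against_dirac(f, i):
    """Integral of f with respect to Dirac's measure at the point i."""
    return f(float(i))


# The integral of 1 - x^2 over [0, 1] with respect to Lebesgue measure is 2/3.
lebesgue_value = 2.0 / 3.0
riemann_values = [integrate_against_mu(bump, i) for i in (10, 100, 1000)]
riemann_errors = [abs(v - lebesgue_value) for v in riemann_values]

# Dirac masses escaping to infinity: the integrals vanish although each
# measure has total mass one.
dirac_values = [integrate_against_dirac(bump, i) for i in (1, 2, 5)]
```

The entries of `riemann_errors` decrease as $i$ grows, in line with $\mu_i \rightharpoonup m \llcorner [0, 1]$, while `dirac_values` consists of zeros, in line with $\delta_i \rightharpoonup 0$.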

Next we prove a very useful weak compactness result for Radon measures.

Theorem 5.17. Let $(\mu_i)$, $i = 1, 2, \dots$, be a sequence of Radon measures on $\mathbb{R}^n$ with

$$\sup_i \mu_i(K) < \infty$$

for every compact set $K \subset \mathbb{R}^n$. Then there is a subsequence $(\mu_{i_j})$, $j = 1, 2, \dots$, and a Radon measure $\mu$ such that $\mu_{i_j} \rightharpoonup \mu$ as $j \to \infty$.

THE MORAL: Every locally bounded sequence of Radon measures on $\mathbb{R}^n$ has a weakly converging subsequence.

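Proof. (1) Assume first that $M = \sup_i \mu_i(\mathbb{R}^n) < \infty$.

(2) Let $\{f_j\}_{j=1}^\infty$ be a countable dense subset of $C_0(\mathbb{R}^n)$ with respect to the $\|\cdot\|_\infty$ norm (exercise). Assumption (1) implies that

$$\sup_i \left|\int_{\mathbb{R}^n} f_1 \, d\mu_i\right| \le \|f_1\|_\infty \sup_i \mu_i(\mathbb{R}^n) = M\|f_1\|_\infty < \infty.$$

This shows that $\int_{\mathbb{R}^n} f_1 \, d\mu_i$ is a bounded sequence in $\mathbb{R}$ and thus it has a converging subsequence. Hence there exists a subsequence $(\mu_i^1)$ of $(\mu_i)$ and $a_1 \in \mathbb{R}$ such that

$$\int_{\mathbb{R}^n} f_1 \, d\mu_i^1 \to a_1 \quad\text{as } i \to \infty.$$

Inductively, we choose a subsequence $(\mu_i^k)$ of $(\mu_i^{k-1})$ and $a_k \in \mathbb{R}$ such that

$$\int_{\mathbb{R}^n} f_k \, d\mu_i^k \to a_k \quad\text{as } i \to \infty.$$

Then the diagonal sequence $(\mu_i^i)$ satisfies

$$\lim_{i \to \infty} \int_{\mathbb{R}^n} f_k \, d\mu_i^i = a_k \quad\text{for every } k = 1, 2, \dots.$$

Let $S$ be the vector space spanned by $f_k$, $k = 1, 2, \dots$, that is,

$$S = \left\{g = \sum_{k=1}^m \lambda_k f_k : \lambda_k \in \mathbb{R},\ m \in \mathbb{N}\right\}.$$

Define

$$L(g) = \sum_{k=1}^m \lambda_k a_k, \quad\text{when } g = \sum_{k=1}^m \lambda_k f_k.$$

Then

$$L(g) = \sum_{k=1}^m \lambda_k a_k = \sum_{k=1}^m \lim_{i \to \infty} \lambda_k \int_{\mathbb{R}^n} f_k \, d\mu_i^i = \lim_{i \to \infty} \int_{\mathbb{R}^n} \sum_{k=1}^m \lambda_k f_k \, d\mu_i^i = \lim_{i \to \infty} \int_{\mathbb{R}^n} g \, d\mu_i^i$$

for every $g \in S$. Thus $L$ is a positive linear functional on $S$. Moreover,

$$|L(g)| = \lim_{i \to \infty} \left|\int_{\mathbb{R}^n} g \, d\mu_i^i\right| \le \sup_i \left(\|g\|_\infty \mu_i^i(\mathbb{R}^n)\right) \le M\|g\|_\infty. \tag{5.4}$$

This shows that $L$ is a bounded functional on $S$.

(3) The functional $L$ on $S$ can be uniquely extended to a bounded linear functional on $C_0(\mathbb{R}^n)$.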

Reason. Let $f \in C_0(\mathbb{R}^n)$. Since $\{f_j\}_{j=1}^\infty$ is dense in $C_0(\mathbb{R}^n)$, there exists a sequence $(f_j)$ such that $\|f_j - f\|_\infty \to 0$, that is, $f_j \to f$ uniformly in $\mathbb{R}^n$ as $j \to \infty$. Define

$$L(f) = \lim_{j \to \infty} L(f_j).$$

It follows from (5.4) that $L$ is a well defined functional on $C_0(\mathbb{R}^n)$ and that (5.4) holds for every $f \in C_0(\mathbb{R}^n)$. ■

(4) According to the Riesz representation theorem 5.8 there exists a Radon measure $\mu$ on $\mathbb{R}^n$ such that

$$L(f) = \int_{\mathbb{R}^n} f \, d\mu \quad\text{for every } f \in C_0(\mathbb{R}^n).$$

Observe that $L$ is a positive linear functional on $C_0(\mathbb{R}^n)$.

Reason. If $f \in C_0(\mathbb{R}^n)$ with $f \ge 0$, and $\|f_j - f\|_\infty \to 0$ with $f_j \in S$, then

$$\liminf_{j \to \infty} \left(\min_{\mathbb{R}^n} f_j\right) \ge 0$$

and hence

$$L(f) = \lim_{j \to \infty} L(f_j) = \lim_{j \to \infty} \lim_{i \to \infty} \int_{\mathbb{R}^n} f_j \, d\mu_i^i \ge 0,$$

since

$$\int_{\mathbb{R}^n} f_j \, d\mu_i^i \ge M \min\left\{0, \min_{\mathbb{R}^n} f_j\right\}.$$

■

(5) CLAIM: $\mu_i^i \rightharpoonup \mu$ as $i \to \infty$.

Reason. Let $\varepsilon > 0$ and $f \in C_0(\mathbb{R}^n)$. Choose $g \in S$ such that

$$\|f - g\|_\infty < \frac{\varepsilon}{2M}.$$

Then for large enough $i$, we have

$$\begin{aligned}
\left|L(f) - \int_{\mathbb{R}^n} f \, d\mu_i^i\right| &\le |L(f - g)| + \left|L(g) - \int_{\mathbb{R}^n} g \, d\mu_i^i\right| + \left|\int_{\mathbb{R}^n} (g - f) \, d\mu_i^i\right| \\
&\le M\|f - g\|_\infty + \varepsilon + M\|f - g\|_\infty \le 2\varepsilon.
\end{aligned}$$

This proves the claim. ■

(6) Finally, we remove the assumption $\sup_i \mu_i(\mathbb{R}^n) < \infty$. The assumption $\sup_i \mu_i(K) < \infty$ for every compact set $K \subset \mathbb{R}^n$ and the argument above show that for every $j$ there is a subsequence $(\mu_i^j)$ of $(\mu_i^{j-1})$ such that

$$\mu_i^j \llcorner B(0, j) \rightharpoonup \nu^j, \quad j = 1, 2, \dots,$$

where $\nu^j$ is a Radon measure with $\nu^j(\mathbb{R}^n \setminus B(0, j)) = 0$. Then the diagonal sequence $(\mu_i^i)$ satisfies

$$\mu_i^i \llcorner B(0, j) \rightharpoonup \nu^j \quad\text{as } i \to \infty \text{ for every } j = 1, 2, \dots.$$

Observe that $\nu^j \llcorner B(0, k) = \nu^k$, $k = 1, \dots, j$. Thus we may define a Radon measure

X∞ j µ(A) ν (A (B(0, j) \ B(0, j 1))). = j 2 ∩ + = When j is so large that supp f B(0, j), then ⊂ Z Z Z j i f dµ f dν lim f dµi, Rn = Rn = i Rn →∞ so that µi * µ as i . i → ∞ ä

5.5 Weak convergence in Lp.

Next we consider weak convergence in $L^p$. Recall that the Riesz representation theorem (Theorem 5.2) gives a characterization for $L^p(A)^*$ with $1 \le p < \infty$.

Definition 5.18. Let $1 \le p \le \infty$. A sequence $(f_i)$ of functions in $L^p(\mathbb{R}^n)$ converges weakly (weak star if $p = \infty$) in $L^p(\mathbb{R}^n)$ to a function $f \in L^p(\mathbb{R}^n)$, if

$$\lim_{i \to \infty} \int_{\mathbb{R}^n} f_i g \, dx = \int_{\mathbb{R}^n} f g \, dx \quad\text{for every } g \in L^{p'}(\mathbb{R}^n).$$

Here we use the interpretation that $p' = \infty$ if $p = 1$ and $p' = 1$ if $p = \infty$.

Remark 5.19. $f_i \to f$ strongly in $L^p(\mathbb{R}^n)$ implies $f_i \to f$ weakly in $L^p(\mathbb{R}^n)$ as $i \to \infty$.

Reason. By Hölder's inequality, we have

$$\left|\int_{\mathbb{R}^n} f_i g \, dx - \int_{\mathbb{R}^n} f g \, dx\right| \le \int_{\mathbb{R}^n} |f_i - f||g| \, dx \le \|f_i - f\|_p \|g\|_{p'} \to 0 \quad\text{as } i \to \infty$$

for every $g \in L^{p'}(\mathbb{R}^n)$. ■

We illustrate some typical features of the behaviour of a sequence which converges weakly but not strongly.

Examples 5.20:

(1) (Oscillation) Let $f_i : (0, 2\pi) \to \mathbb{R}$, $f_i(x) = \sin(ix)$, $i = 1, 2, \dots$. Then $f_i$ converges weakly to $f = 0$ in $L^p((0, 2\pi))$, but $\|f_i\|_p = c(p) > 0$ for every $i = 1, 2, \dots$, so that $f_i$ does not converge to $0$ in $L^p((0, 2\pi))$. Observe that

$$\|f\|_p < \liminf_{i \to \infty} \|f_i\|_p.$$

THE MORAL: Sequences of rapidly oscillating functions provide examples of weakly converging sequences that do not converge strongly.

(2) (Concentration) Let $f_i : (-1, 1) \to \mathbb{R}$,

$$f_i(x) = \begin{cases} i, & 0 \le x \le 1/i, \\ 0, & \text{otherwise.} \end{cases}$$

Then $f_i \to \delta_0$ weakly as measures in $(-1, 1)$. Observe that $f_i$ converges weakly to zero as measures in $(0, 1)$, but $f_i$ does not converge weakly in $L^1((0, 1))$.

THE MORAL: Sequences of concentrating functions provide examples of weakly converging sequences that do not converge strongly.

(3) Let $1 < p < \infty$ and $f_i : (-1, 1) \to \mathbb{R}$,

$$f_i(x) = \begin{cases} i^{1/p}, & 0 \le x \le 1/i, \\ 0, & \text{otherwise.} \end{cases}$$

Then $f_i \to 0$ weakly in $L^p((-1, 1))$, but the sequence $(f_i)$ does not converge in $L^p((-1, 1))$, since $\|f_i\|_p = 1$ for every $i = 1, 2, \dots$. This shows that the norms $\|f_i\|_p = 1$ concentrate. However, $f_i \to 0$ in $L^q((-1, 1))$ for every $q < p$. Indeed,

$$\int_{(-1, 1)} |f_i|^q \, dx = i^{q/p - 1} \to 0 \quad\text{as } i \to \infty.$$

In particular, the norms $\|f_i\|_q$, $q < p$, do not concentrate.

The next result shows that any weakly converging sequence is bounded.

Theorem 5.21. If $f_i \to f$ weakly in $L^p(\mathbb{R}^n)$ with $1 \le p \le \infty$, then

$$\|f\|_p \le \liminf_{i \to \infty} \|f_i\|_p.$$

THE MORAL: The $L^p$-norm is lower semicontinuous with respect to the weak convergence.

Proof. 1 p If f p 0, the claim is clear. Hence we may assume that < < ∞ k k = p/p p n f p 0. The function g f 0 sign f belongs L 0 (R ), since k k > = | |

µZ ¶1/p0 µZ ¶1/p0 p/p g g p0 dx f p dx f 0 . p0 p k k = Rn | | = Rn | | = k k < ∞ CHAPTER 5. WEAK CONVERGENCE METHODS 120

By the definition of the weak convergence Z Z Z Z p/p0 p p lim fi g dx f g dx f f sign f dx f dx f p. i Rn = Rn = Rn | | | {z } = Rn | | = k k →∞ f =| | On the other hand, by Hölder’s inequality ¯Z ¯ ¯ f g dx¯ f g f f p/p0 . ¯ i ¯ i p p0 i p p ¯ Rn ¯ É k k k k = k k k k Thus p p/p0 f p liminf fi p f p . k k É i k k k k →∞ Dividing through by f p/p0 0 implies the claim. k kp > p 1 Exercise. = n S p Exhaust R with sets A j A j 1, A ∞j 1 A j with A j for every = ∞ ⊂ + = = | | < ∞ j 1,2,.... Let ε 0 and set = >

A j, {x A j : f (x) f L (A ) ε} and g χA sign f . ε = ∈ | | Ê k k ∞ j − = j,ε Then Z Z lim f g dx f g dx ( f ) A . i L∞(A j) ε j,ε i A = A Ê k k − | | →∞ j j On the other hand, by Hölder’s inequality ¯ ¯ ¯Z ¯ ¯ ¯ ¯ fi g dx¯ fi A j,ε . ¯ A j ¯ É k k∞| |

Dividing through by A j, 0 implies | ε| >

f L (A ) ε liminf fi L (A ). k k ∞ j − É i k k ∞ j →∞ The claim follows for the set A j by letting ε 0. → The argument above gives

f L (A ) liminf fi L (A ) k k ∞ j É i k k ∞ j →∞ liminf fi L (A) for every j 1,2,.... É i k k ∞ = →∞ The claim follows from this. ä p Remark 5.22. Let 1 p . If fi f weakly in L (A), it does not follow that < < ∞ → that limi fi p f p. Nor does the reverse implication hold true. Example →∞ k k = k k 5.20 (1) gives a counterexample for both claims. The following result explains p the situation: If fi f weakly in L (A) with 1 p and limi fi p f p, →∞ → p < < ∞ k k = k k then fi f strongly in L (A). This result will not be proved here. → Remark 5.23. The previous theorem is a general fact in the functional analysis.

Let $X$ be a Banach space. If a sequence $(x_i)$ converges weakly to $x \in X$, then it is bounded and

\[ \|x\| \le \liminf_{i\to\infty} \|x_i\|. \]

The previous theorem asserts that a weakly converging sequence is bounded. The next result shows that the converse is true up to a subsequence. One of the most useful applications of the weak convergence is in compactness arguments. A bounded sequence in Lp does not need to have any convergent subsequence with convergence interpreted in the standard Lp sense. However, there exists a weakly converging subsequence.
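The following worked example is a sketch added for illustration; the specific sequence $f_i = \chi_{[i,i+1]}$ on $\mathbb{R}$ is an assumption chosen for simplicity and is not taken from the text above. It shows a bounded sequence in $L^p(\mathbb{R})$, $1 < p < \infty$, with no strongly convergent subsequence that nevertheless converges weakly, and for which the inequality of Theorem 5.21 is strict.

```latex
% Illustrative assumption: n = 1, 1 < p < \infty, f_i = \chi_{[i,i+1]}.
\begin{align*}
  &\|f_i\|_p = 1 \quad \text{for every } i,
  \qquad
  \|f_i - f_k\|_p = 2^{1/p} \quad \text{for } i \ne k, \\
  % so no subsequence is a Cauchy sequence in L^p.
  &\left| \int_{\mathbb{R}} f_i g \, dx \right|
    \le \left( \int_i^{i+1} |g|^{p'} \, dx \right)^{1/p'}
    \xrightarrow{\; i \to \infty \;} 0
    \quad \text{for every } g \in L^{p'}(\mathbb{R}), \\
  % hence f_i \to 0 weakly in L^p, and the inequality is strict:
  &0 = \|0\|_p < \liminf_{i\to\infty} \|f_i\|_p = 1.
\end{align*}
```

The second line uses Hölder's inequality on $[i, i+1]$ together with the fact that the tails of the convergent integral $\int_{\mathbb{R}} |g|^{p'} \, dx$ tend to zero.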

Theorem 5.24. Let $1 < p < \infty$. Assume that the sequence $(f_i)$ of functions $f_i \in L^p(\mathbb{R}^n)$, $i = 1, 2, \dots$, satisfies

\[ \sup_i \|f_i\|_p < \infty. \]

Then there exists a subsequence $(f_{i_j})$ and a function $f \in L^p(\mathbb{R}^n)$ such that $f_{i_j} \to f$ weakly in $L^p(\mathbb{R}^n)$.

THE MORAL: This shows that $L^p$ with $1 < p < \infty$ is weakly sequentially compact, that is, every bounded sequence in $L^p$ with $1 < p < \infty$ has a weakly converging subsequence. This is an analogue of the Bolzano-Weierstrass theorem.

Remark 5.25. The claim does not hold for $p = 1$. Indeed, if $(f_i)$ is a sequence of nonnegative functions in $L^1(\mathbb{R}^n)$ with $\sup_i \|f_i\|_1 < \infty$, there is no guarantee that some subsequence will converge weakly in $L^1(\mathbb{R}^n)$. However, let $\mu$ be a Radon measure on $\mathbb{R}^n$ and consider the measures defined by

\[ \nu_i(A) = \int_A f_i \, d\mu \]

for every Borel set $A \subset \mathbb{R}^n$. Then by Theorem 5.17, there exist a Radon measure $\nu$ on $\mathbb{R}^n$ and a subsequence $(f_{i_j})$ such that

\[ \lim_{j\to\infty} \int_{\mathbb{R}^n} \varphi f_{i_j} \, d\mu = \int_{\mathbb{R}^n} \varphi \, d\nu \quad \text{for every } \varphi \in C_0(\mathbb{R}^n). \]

Example 5.26. Let $\mu$ be the one-dimensional Lebesgue measure on $\mathbb{R}$. Then the sequence $f_i = i \chi_{[0, \frac{1}{i}]}$, $i = 1, 2, \dots$, converges in measure to zero, satisfies $\|f_i\|_1 = 1$ for every $i = 1, 2, \dots$, but does not converge weakly in $L^1(\mathbb{R})$. However, consider the measures

\[ \nu_i(A) = i \int_{A \cap [0, \frac{1}{i}]} 1 \, dx = i \left| A \cap \left[ 0, \tfrac{1}{i} \right] \right| \]

for every Borel set $A \subset \mathbb{R}$. Then $\nu_i \rightharpoonup \delta_0$ as $i \to \infty$.
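The convergence $\nu_i \rightharpoonup \delta_0$ in Example 5.26 can be checked directly. The computation below is a sketch, assuming that weak convergence of measures is tested against functions $\varphi \in C_0(\mathbb{R})$ as in Remark 5.25.

```latex
% Test the measures \nu_i from Example 5.26 against \varphi \in C_0(\mathbb{R}).
\begin{align*}
  \int_{\mathbb{R}} \varphi \, d\nu_i
    &= i \int_0^{1/i} \varphi(x) \, dx, \\
  \left| i \int_0^{1/i} \varphi(x) \, dx - \varphi(0) \right|
    &= \left| i \int_0^{1/i} \bigl( \varphi(x) - \varphi(0) \bigr) \, dx \right|
    \le \sup_{0 \le x \le 1/i} |\varphi(x) - \varphi(0)|
    \xrightarrow{\; i \to \infty \;} 0, \\
  \text{so} \quad
  \lim_{i\to\infty} \int_{\mathbb{R}} \varphi \, d\nu_i
    &= \varphi(0) = \int_{\mathbb{R}} \varphi \, d\delta_0 .
\end{align*}
```

The supremum tends to zero because $\varphi$ is continuous at the origin.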

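Proof of Theorem 5.24. (1) We may assume that $f_i \ge 0$ almost everywhere for every $i = 1, 2, \dots$, for otherwise we may consider the positive and negative parts $f_i^+$ and $f_i^-$. (Exercise)

(2) Define Radon measures $\mu_i$, $i = 1, 2, \dots$, by setting

\[ \mu_i(A) = \int_A f_i \, dx, \tag{5.5} \]

where $A \subset \mathbb{R}^n$ is a Borel set. Then for each compact set $K \subset \mathbb{R}^n$, Hölder's inequality gives

\[ \mu_i(K) = \int_K f_i \, dx \le \left( \int_K f_i^p \, dx \right)^{1/p} |K|^{1 - 1/p} \quad \text{for every } i = 1, 2, \dots. \]

This implies $\sup_i \mu_i(K) < \infty$. Thus we may apply Theorem 5.17 to find a Radon measure $\mu$ on $\mathbb{R}^n$ and a subsequence $(\mu_{i_j})$, $j = 1, 2, \dots$, such that $\mu_{i_j} \rightharpoonup \mu$ as $j \to \infty$.

(3) $\mu$ is absolutely continuous with respect to the Lebesgue measure.

Reason. Assume that $A \subset \mathbb{R}^n$ is a bounded set with $|A| = 0$. Let $\varepsilon > 0$ and choose an open and bounded set $U \supset A$ such that $|U| < \varepsilon$. Then

\begin{align*}
\mu(U) &\le \liminf_{j\to\infty} \mu_{i_j}(U) && \text{(Lemma 5.16)} \\
&= \liminf_{j\to\infty} \int_U f_{i_j} \, dx && \text{(5.5)} \\
&\le \liminf_{j\to\infty} \left( \int_U f_{i_j}^p \, dx \right)^{1/p} |U|^{1 - 1/p} \le c \, \varepsilon^{1 - 1/p}.
\end{align*}

Since $\mu(A) \le \mu(U)$ and $\varepsilon > 0$ is arbitrary, we obtain $\mu(A) = 0$. ■

(4) By the Radon-Nikodym theorem 4.18, there exists $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ satisfying

\[ \mu(A) = \int_A f \, dx \tag{5.6} \]

for every Borel set $A \subset \mathbb{R}^n$.

(5) CLAIM: $f \in L^p(\mathbb{R}^n)$.

Reason. Let $\varphi \in C_0(\mathbb{R}^n)$. Then

\begin{align*}
\int_{\mathbb{R}^n} \varphi f \, dx &= \int_{\mathbb{R}^n} \varphi \, d\mu && \text{(5.6)} \\
&= \lim_{j\to\infty} \int_{\mathbb{R}^n} \varphi \, d\mu_{i_j} && (\mu_{i_j} \rightharpoonup \mu) \\
&= \lim_{j\to\infty} \int_{\mathbb{R}^n} \varphi f_{i_j} \, dx && \text{(5.5)} \\
&\le \sup_i \|f_i\|_p \, \|\varphi\|_{p'} && \text{(Hölder)} \\
&\le c \, \|\varphi\|_{p'}. && \text{(assumption)}
\end{align*}

By Theorem 5.2,

\[ \|f\|_p = \sup \left\{ \int_{\mathbb{R}^n} \varphi f \, dx : \varphi \in C_0(\mathbb{R}^n), \ \|\varphi\|_{p'} \le 1 \right\} < \infty. \]

■

(6) CLAIM: $f_{i_j} \to f$ weakly in $L^p(\mathbb{R}^n)$.

Reason. We showed above that

\[ \lim_{j\to\infty} \int_{\mathbb{R}^n} \varphi f_{i_j} \, dx = \int_{\mathbb{R}^n} \varphi f \, dx \]

for every $\varphi \in C_0(\mathbb{R}^n)$. Assume that $g \in L^{p'}(\mathbb{R}^n)$. Let $\varepsilon > 0$ and choose $\varphi \in C_0(\mathbb{R}^n)$ with $\|g - \varphi\|_{p'} < \varepsilon$. Then

\begin{align*}
\left| \int_{\mathbb{R}^n} f_{i_j} g \, dx - \int_{\mathbb{R}^n} f g \, dx \right|
&\le \left| \int_{\mathbb{R}^n} f_{i_j} g \, dx - \int_{\mathbb{R}^n} \varphi f_{i_j} \, dx \right| + \left| \int_{\mathbb{R}^n} \varphi f_{i_j} \, dx - \int_{\mathbb{R}^n} \varphi f \, dx \right| + \left| \int_{\mathbb{R}^n} \varphi f \, dx - \int_{\mathbb{R}^n} g f \, dx \right| \\
&\le \|f_{i_j}\|_p \|\varphi - g\|_{p'} + \left| \int_{\mathbb{R}^n} \varphi f_{i_j} \, dx - \int_{\mathbb{R}^n} \varphi f \, dx \right| + \|f\|_p \|\varphi - g\|_{p'} \\
&\le \varepsilon \sup_i \|f_i\|_p + \left| \int_{\mathbb{R}^n} \varphi f_{i_j} \, dx - \int_{\mathbb{R}^n} \varphi f \, dx \right| + \varepsilon \|f\|_p.
\end{align*}

This implies

\[ \lim_{j\to\infty} \int_{\mathbb{R}^n} f_{i_j} g \, dx = \int_{\mathbb{R}^n} f g \, dx. \]

■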

Remark 5.27. There is a general theorem in functional analysis, which says that a Banach space is weakly sequentially compact if and only if it is reflexive. This is another manifestation of the fact that the spaces $L^p(\mathbb{R}^n)$ are reflexive for $1 < p < \infty$.

Corollary 5.28. Let $1 < p \le \infty$ and let $(f_i)$ be a sequence of functions $f_i \in L^p(\mathbb{R}^n)$, $i = 1, 2, \dots$. Then $f_i \to f$ weakly in $L^p(\mathbb{R}^n)$ if and only if

(1) $\sup_i \|f_i\|_p < \infty$ and

(2) $\displaystyle \lim_{i\to\infty} \int_A f_i \, dx = \int_A f \, dx$ for every measurable set $A \subset \mathbb{R}^n$ with $|A| < \infty$.

Proof. The result follows from the fact that simple functions are dense in $L^{p'}(\mathbb{R}^n)$ using the results above. $\Box$

THE MORAL: Property (2) asserts that the averages of the functions $f_i$ converge to the average of $f$ over $A$.

5.6 Mazur’s lemma

This section discusses a method to upgrade weak convergence to strong convergence. We are mainly interested in the case of $L^p(\mathbb{R}^n)$ with $1 < p < \infty$, compare to Theorem 5.24, but we consider a general normed space $X$.

We recall some facts related to the Hahn-Banach theorem, which are needed in the argument. A function $p : X \to \mathbb{R}$ is sublinear, if $p(x + y) \le p(x) + p(y)$ for every $x, y \in X$ and $p(\lambda x) = \lambda p(x)$ for every $x \in X$ and $\lambda \ge 0$. In particular, every seminorm on $X$ determines a sublinear function. More generally, if $C$ is a convex, open neighborhood of $0$ in a normed space $X$, then $p : X \to \mathbb{R}$,

\[ p(x) = \inf \{ \lambda > 0 : \lambda^{-1} x \in C \}, \]

is a sublinear function, called the Minkowski functional associated with $C$.

Let $p : X \to \mathbb{R}$ be a sublinear function, let $Y$ be a vector subspace of $X$ and let $f : Y \to \mathbb{R}$ be a linear function with $f(x) \le p(x)$ for every $x \in Y$. The Hahn-Banach theorem asserts that there exists a linear function $\bar{f} : X \to \mathbb{R}$ such that $\bar{f}|_Y = f$, that is, $\bar{f}(x) = f(x)$ for every $x \in Y$, and $\bar{f}(x) \le p(x)$ for every $x \in X$. The mapping $\bar{f}$ is called a linear extension of $f$ to $X$.
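As a quick sanity check of the definition of the Minkowski functional, consider the simplest convex open neighbourhood of the origin. The choice $C = B(0, r)$ with radius $r > 0$ is an illustrative assumption, not an example from the text.

```latex
% Illustrative assumption: C = B(0,r) = \{ x \in X : \|x\| < r \}, r > 0.
\begin{align*}
  p(x) &= \inf \{ \lambda > 0 : \lambda^{-1} x \in B(0, r) \}
        = \inf \{ \lambda > 0 : \|x\| < \lambda r \}
        = \frac{\|x\|}{r}, \\
  B(0, r) &= \{ x \in X : p(x) < 1 \}.
\end{align*}
```

In this case $p$ is a multiple of the norm, so sublinearity reduces to the triangle inequality and positive homogeneity of the norm.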

Theorem 5.29 (Mazur's lemma). Assume that $X$ is a normed space and that $x_i \to x$ weakly in $X$ as $i \to \infty$. Then for every $\varepsilon > 0$, there exists $m \in \mathbb{N}$ and a convex combination $\sum_{i=1}^{m} a_i x_i$ with $a_i \ge 0$ and $\sum_{i=1}^{m} a_i = 1$, such that

\[ \left\| x - \sum_{i=1}^{m} a_i x_i \right\| < \varepsilon. \]

If $x_i \to x$ weakly in $X$, Mazur's lemma gives the existence of a sequence $(\tilde{x}_i)$ of convex combinations

\[ \tilde{x}_i = \sum_{j=1}^{m_i} a_{i,j} x_j, \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=1}^{m_i} a_{i,j} = 1, \]

such that $\tilde{x}_i \to x$ in the norm of $X$ as $i \to \infty$.

THE MORAL: For every weakly converging sequence, there is a sequence of convex combinations that converges strongly. Thus weak convergence is upgraded to strong convergence for a sequence of convex combinations. Observe that some of the coefficients $a_i$ may be zero so that the convex combination is essentially for a subsequence.
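A standard illustration of Mazur's lemma, sketched here under the assumption $X = \ell^2$ with the standard unit vectors $e_i$ (this example is not part of the text above): the sequence $(e_i)$ converges weakly to $0$ but not in norm, while its arithmetic means, which are convex combinations, converge to $0$ in norm.

```latex
% Illustrative assumption: X = \ell^2, e_i = (0, \dots, 0, 1, 0, \dots) with 1 in the i-th slot.
\begin{align*}
  &\langle e_i, y \rangle = y_i \xrightarrow{\; i \to \infty \;} 0
    \quad \text{for every } y = (y_1, y_2, \dots) \in \ell^2,
    \qquad \|e_i\| = 1 \text{ for every } i, \\
  &\left\| \sum_{i=1}^{m} \frac{1}{m} \, e_i \right\|^2
    = \sum_{i=1}^{m} \frac{1}{m^2}
    = \frac{1}{m},
  \qquad \text{so} \qquad
  \left\| 0 - \sum_{i=1}^{m} \frac{1}{m} \, e_i \right\| = m^{-1/2} < \varepsilon
  \quad \text{whenever } m > \varepsilon^{-2}.
\end{align*}
```

Here the coefficients $a_i = 1/m$ satisfy $a_i \ge 0$ and $\sum_{i=1}^{m} a_i = 1$, as required in Theorem 5.29.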

Proof. Let $C$ be the set of all convex combinations of $x_i$, $i = 1, 2, \dots$, that is,

\[ C = \left\{ \sum_{i=1}^{m} a_i x_i : a_i \ge 0, \ \sum_{i=1}^{m} a_i = 1, \ m \in \mathbb{N} \right\}. \]

By replacing the sequence $(x_i)_{i=1}^{\infty}$ by the sequence $(x_i - x_1)_{i=1}^{\infty}$ and $x$ by $x - x_1$, we may assume that $0 \in C$. For a contradiction, assume that there exists $\varepsilon > 0$ with $\|x - y\| \ge 2\varepsilon$ for every $y \in C$. In particular, this implies $x \ne 0$.

The $\varepsilon$-neighbourhood of $C$ defined by

\[ C_\varepsilon = \{ y \in X : \operatorname{dist}(y, C) < \varepsilon \} \]

is a convex set and an open neighbourhood of $0$. Consider the Minkowski functional $p : X \to \mathbb{R}$ associated with $C_\varepsilon$ defined by

\[ p(y) = \inf \{ \lambda > 0 : \lambda^{-1} y \in C_\varepsilon \}. \]

Since $C_\varepsilon$ is convex, $p$ is a sublinear function on $X$. Since $C_\varepsilon$ is an open neighbourhood of $0$, $p$ is a continuous function on $X$ (exercise).

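Let $z \in C_\varepsilon$ and $y \in C$ with $\|z - y\| < \varepsilon$. Since $\|x - y\| \ge 2\varepsilon$ for every $y \in C$, by the triangle inequality, we conclude that

\[ \|x - z\| \ge \|x - y\| - \|y - z\| > 2\varepsilon - \varepsilon = \varepsilon \]

for every $z \in C_\varepsilon$. This implies that there exists $y_0 \in \overline{C_\varepsilon}$ such that $x = \lambda^{-1} y_0$ with $0 < \lambda < 1$ and $p(y_0) = 1$ and thus

\[ p(x) = p(\lambda^{-1} y_0) = \lambda^{-1} p(y_0) = \lambda^{-1} > 1. \]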

Consider the vector subspace $Y = \{ t y_0 : t \in \mathbb{R} \}$ of $X$ and the linear function $f : Y \to \mathbb{R}$ defined by $f(t y_0) = t$. Then $f(y) \le p(y)$ for every $y \in Y$ and by the Hahn-Banach theorem, there exists an extension of $f$ to a linear functional $f : X \to \mathbb{R}$ with $f(y) \le p(y)$ for every $y \in X$. Since $p$ is continuous, $f$ is continuous and thus $f \in X^*$. Since $x_i \to x$ weakly in $X$, we have

\[ f(x) = \lim_{i\to\infty} f(x_i). \]

Since $x \in Y$ we have $f(x) = p(x)$ and since $x_i \in C$ we have $p(x_i) \le 1$ for every $i = 1, 2, \dots$. It follows that

\[ 1 < p(x) = f(x) = \lim_{i\to\infty} f(x_i) \le \liminf_{i\to\infty} p(x_i) \le 1, \]

which is a contradiction. $\Box$

The following tail version of Mazur's lemma will be useful in applications. Observe that the indexing of the sequence of convex combinations starts from $i$ instead of $1$.

Lemma 5.30. Assume that $X$ is a normed space and that $x_i \to x$ weakly in $X$ as $i \to \infty$. Then there exists a sequence of convex combinations

\[ \tilde{x}_i = \sum_{j=i}^{m_i} a_{i,j} x_j, \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=i}^{m_i} a_{i,j} = 1, \]

such that $\tilde{x}_i \to x$ in the norm of $X$ as $i \to \infty$.

Proof. By applying Mazur's lemma repeatedly to the tail sequences $(x_j)_{j=i}^{\infty}$, $i = 1, 2, \dots$, each of which still converges weakly to $x$, we obtain convex combinations

\[ \tilde{x}_i = \sum_{j=i}^{m_i} a_{i,j} x_j, \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=i}^{m_i} a_{i,j} = 1, \]

such that $\|\tilde{x}_i - x\| < \frac{1}{i}$. This implies that $\tilde{x}_i \to x$ in the norm of $X$ as $i \to \infty$. $\Box$

We discuss briefly applications of Mazur's lemma for $L^p(\mathbb{R}^n)$ with $1 < p < \infty$. The following process upgrades the mode of convergence stepwise. This is applied, for example, in the direct methods of the calculus of variations.

(1) Assume that $(f_i)$ is a bounded sequence in $L^p(\mathbb{R}^n)$. By Theorem 5.24 there exists a subsequence $(f_{i_j})$ and a function $f \in L^p(\mathbb{R}^n)$ such that $f_{i_j} \to f$ weakly in $L^p(\mathbb{R}^n)$. Thus from boundedness we obtain weak convergence for a subsequence.

(2) If $f_{i_j} \to f$ weakly in $L^p(\mathbb{R}^n)$, by Mazur's lemma there is a sequence of convex combinations that converges in $L^p(\mathbb{R}^n)$. Thus from weak convergence we obtain strong convergence for a sequence of convex combinations of the subsequence.

(3) Strong convergence in $L^p(\mathbb{R}^n)$ implies almost everywhere convergence for a subsequence by Corollary 1.28. Thus from strong convergence we obtain almost everywhere convergence for a subsequence.

The following result is useful in identifying a weak limit in Lp(Rn).

Theorem 5.31. Let $1 \le p < \infty$ and assume that $f_i \to f$ weakly in $L^p(\mathbb{R}^n)$. If $f_i \to g$ almost everywhere in $\mathbb{R}^n$ as $i \to \infty$, then $f = g$ almost everywhere in $\mathbb{R}^n$.

Proof. By Lemma 5.30 there exists a sequence $(\tilde{f}_i)$ of convex combinations

\[ \tilde{f}_i = \sum_{j=i}^{m_i} a_{i,j} f_j, \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=i}^{m_i} a_{i,j} = 1, \]

such that $\tilde{f}_i \to f$ in $L^p(\mathbb{R}^n)$ as $i \to \infty$. By Corollary 1.28, there exists a subsequence, denoted again by $(\tilde{f}_i)$, such that $\tilde{f}_i \to f$ almost everywhere in $\mathbb{R}^n$. Since $f_i \to g$ almost everywhere in $\mathbb{R}^n$, we have

\[ \tilde{f}_i = \sum_{j=i}^{m_i} a_{i,j} f_j \to g \]

almost everywhere as $i \to \infty$. We conclude that $f = g$ almost everywhere in $\mathbb{R}^n$. $\Box$

Remark 5.32. The convergence $f_i(x) \to f(x)$ as $i \to \infty$ does not imply, for general convex combinations, that

\[ \sum_{j=1}^{m_i} a_{i,j} f_j(x) \xrightarrow{\; i \to \infty \;} f(x), \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=1}^{m_i} a_{i,j} = 1. \]

However, for tail convex combinations, we have

\[ \sum_{j=i}^{m_i} a_{i,j} f_j(x) \xrightarrow{\; i \to \infty \;} f(x), \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=i}^{m_i} a_{i,j} = 1. \]

This is the advantage of the tail version of Mazur's lemma, see Lemma 5.30.
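The two claims in Remark 5.32 can be verified in a few lines. The sketch below fixes a point $x$ with $f_j(x) \to f(x)$; the auxiliary symbols $\eta$ and $N$ and the particular choice of coefficients in the counterexample are assumptions introduced for illustration.

```latex
% Tail convex combinations: fix \eta > 0 and choose N with
% |f_j(x) - f(x)| < \eta for every j \ge N. Then for every i \ge N,
\begin{align*}
  \left| \sum_{j=i}^{m_i} a_{i,j} f_j(x) - f(x) \right|
    = \left| \sum_{j=i}^{m_i} a_{i,j} \bigl( f_j(x) - f(x) \bigr) \right|
    \le \sum_{j=i}^{m_i} a_{i,j} \, |f_j(x) - f(x)|
    < \eta .
\end{align*}
% General convex combinations: the choice m_i = 1, a_{i,1} = 1 gives
\begin{align*}
  \sum_{j=1}^{m_i} a_{i,j} f_j(x) = f_1(x) \quad \text{for every } i,
\end{align*}
% which converges to f(x) only if f_1(x) = f(x).
```

The first estimate uses only $a_{i,j} \ge 0$, $\sum_{j=i}^{m_i} a_{i,j} = 1$ and the fact that all indices in the sum satisfy $j \ge i \ge N$.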

Corollary 5.33. Let $1 < p < \infty$ and assume that $(f_i)$ is a bounded sequence in $L^p(\mathbb{R}^n)$. If $f_i \to f$ almost everywhere in $\mathbb{R}^n$, then $f_i \to f$ weakly in $L^p(\mathbb{R}^n)$.

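Proof. By Theorem 5.24 there exists a subsequence, still denoted by $(f_i)$, and a function $g \in L^p(\mathbb{R}^n)$ such that $f_i \to g$ weakly in $L^p(\mathbb{R}^n)$ as $i \to \infty$. By Lemma 5.30 there exists a sequence $(\tilde{f}_i)$ of convex combinations

\[ \tilde{f}_i = \sum_{j=i}^{m_i} a_{i,j} f_j, \quad \text{with } a_{i,j} \ge 0 \text{ and } \sum_{j=i}^{m_i} a_{i,j} = 1, \]

such that $\tilde{f}_i \to g$ in $L^p(\mathbb{R}^n)$ as $i \to \infty$. By Corollary 1.28, a subsequence of $(\tilde{f}_i)$ converges to $g$ almost everywhere in $\mathbb{R}^n$. Since $f_i \to f$ almost everywhere in $\mathbb{R}^n$ as $i \to \infty$, we have

\[ \tilde{f}_i = \sum_{j=i}^{m_i} a_{i,j} f_j \to f \]

almost everywhere as $i \to \infty$. We conclude that $f = g$ almost everywhere in $\mathbb{R}^n$ and $f_i \to f$ weakly in $L^p(\mathbb{R}^n)$ as $i \to \infty$ for the subsequence. The same argument applies to every subsequence of the original sequence, so the whole sequence converges weakly to $f$. $\Box$

Remark 5.34. Let $1 < p < \infty$. Since $L^p(\mathbb{R}^n)$ is a uniformly convex Banach space, the Banach–Saks theorem applies. It asserts that a weakly convergent sequence has a subsequence whose arithmetic means converge in the norm. Assume that a sequence $(f_i)_{i \in \mathbb{N}}$ converges to $f$ weakly in $L^p(\mathbb{R}^n)$ as $i \to \infty$. Then there exists a subsequence $(f_{i_j})_{j \in \mathbb{N}}$ for which the arithmetic mean $\frac{1}{k} \sum_{j=1}^{k} f_{i_j}$ converges to $f$ in $L^p(\mathbb{R}^n)$ as $k \to \infty$. The advantage of the Banach–Saks theorem compared to Mazur's lemma is that we can work with the arithmetic means instead of more general convex combinations.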
