Chapter 4  Differentiation
Does there exist a Lebesgue measurable set that fills up exactly half of each interval? To get a feeling for this question, consider the set
\[
E = \bigl[0, \tfrac{1}{8}\bigr] \cup \bigl[\tfrac{1}{4}, \tfrac{3}{8}\bigr] \cup \bigl[\tfrac{1}{2}, \tfrac{5}{8}\bigr] \cup \bigl[\tfrac{3}{4}, \tfrac{7}{8}\bigr].
\]
This set $E$ has the property that
\[
\bigl| E \cap [0, b] \bigr| = \frac{b}{2}
\]
for $b = 0, \tfrac{1}{4}, \tfrac{1}{2}, \tfrac{3}{4}, 1$. Does there exist a Lebesgue measurable set $E \subset [0, 1]$, perhaps constructed in a fashion similar to the Cantor set, such that the equation above holds for all $b \in [0, 1]$?

In this chapter we see how to answer this question by considering differentiation issues. We begin by developing a powerful tool called the Hardy–Littlewood maximal inequality. This tool is used to prove an almost everywhere version of the Fundamental Theorem of Calculus. These results lead us to an important theorem about the density of Lebesgue measurable sets.

[Photograph: Trinity College at the University of Cambridge in England. G. H. Hardy (1877–1947) and John Littlewood (1885–1977) were students and later faculty members here. If you have not already done so, you should read Hardy's remarkable book A Mathematician's Apology (do not skip the fascinating Foreword by C. P. Snow) and see the movie The Man Who Knew Infinity, which focuses on Hardy, Littlewood, and Srinivasa Ramanujan (1887–1920). Photo CC-BY-SA Rafa Esteve.]

© Sheldon Axler 2020. S. Axler, Measure, Integration & Real Analysis, Graduate Texts in Mathematics 282, https://doi.org/10.1007/978-3-030-33143-6_4

4A  Hardy–Littlewood Maximal Function

Markov's Inequality

The following result, called Markov's inequality, has a sweet, short proof. We will make good use of this result later in this chapter (see the proof of 4.10). Markov's inequality also leads to Chebyshev's inequality (see Exercise 2 in this section).

4.1  Markov's inequality

Suppose $(X, \mathcal{S}, \mu)$ is a measure space and $h \in \mathcal{L}^1(\mu)$. Then
\[
\mu(\{x \in X : |h(x)| \geq c\}) \leq \frac{1}{c} \|h\|_1
\]
for every $c > 0$.

Proof   Suppose $c > 0$. Then
\begin{align*}
\mu(\{x \in X : |h(x)| \geq c\})
&= \frac{1}{c} \int_{\{x \in X : |h(x)| \geq c\}} c \, d\mu\\
&\leq \frac{1}{c} \int_{\{x \in X : |h(x)| \geq c\}} |h| \, d\mu\\
&\leq \frac{1}{c} \|h\|_1,
\end{align*}
as desired.

[Photograph: St. Petersburg University along the Neva River in St. Petersburg, Russia. Andrei Markov (1856–1922) was a student and then a faculty member here. Photo CC-BY-SA A. Savin.]

Vitali Covering Lemma

4.2  Definition   3 times a bounded nonempty open interval

Suppose $I$ is a bounded nonempty open interval of $\mathbf{R}$. Then $3 * I$ denotes the open interval with the same center as $I$ and three times the length of $I$.

4.3  Example   3 times an interval

If $I = (0, 10)$, then $3 * I = (-10, 20)$.

The next result is a key tool in the proof of the Hardy–Littlewood maximal inequality (4.8).

4.4  Vitali Covering Lemma

Suppose $I_1, \ldots, I_n$ is a list of bounded nonempty open intervals of $\mathbf{R}$. Then there exists a disjoint sublist $I_{k_1}, \ldots, I_{k_m}$ such that
\[
I_1 \cup \cdots \cup I_n \subset (3 * I_{k_1}) \cup \cdots \cup (3 * I_{k_m}).
\]

4.5  Example   Vitali Covering Lemma

Suppose $n = 4$ and
\[
I_1 = (0, 10), \quad I_2 = (9, 15), \quad I_3 = (14, 22), \quad I_4 = (21, 31).
\]
Then
\[
3 * I_1 = (-10, 20), \quad 3 * I_2 = (3, 21), \quad 3 * I_3 = (6, 30), \quad 3 * I_4 = (11, 41).
\]
Thus
\[
I_1 \cup I_2 \cup I_3 \cup I_4 \subset (3 * I_1) \cup (3 * I_4).
\]
In this example, $I_1, I_4$ is the only sublist of $I_1, I_2, I_3, I_4$ that produces the conclusion of the Vitali Covering Lemma.

Proof of 4.4   Let $k_1$ be such that
\[
|I_{k_1}| = \max\{|I_1|, \ldots, |I_n|\}.
\]
Suppose $k_1, \ldots, k_j$ have been chosen. Let $k_{j+1}$ be such that $|I_{k_{j+1}}|$ is as large as possible subject to the condition that $I_{k_1}, \ldots, I_{k_{j+1}}$ are disjoint. If there is no choice of $k_{j+1}$ such that $I_{k_1}, \ldots, I_{k_{j+1}}$ are disjoint, then the procedure terminates. Because we start with a finite list, the procedure must eventually terminate after some number $m$ of choices.

[Marginal note: The technique used here is called a greedy algorithm because at each stage we select the largest remaining interval that is disjoint from the previously selected intervals.]

Suppose $j \in \{1, \ldots, n\}$. To complete the proof, we must show that
\[
I_j \subset (3 * I_{k_1}) \cup \cdots \cup (3 * I_{k_m}).
\]
If $j \in \{k_1, \ldots, k_m\}$, then the inclusion above obviously holds.

Thus assume that $j \notin \{k_1, \ldots, k_m\}$. Because the process terminated without selecting $j$, the interval $I_j$ is not disjoint from all of $I_{k_1}, \ldots, I_{k_m}$. Let $I_{k_L}$ be the first interval on this list not disjoint from $I_j$; thus $I_j$ is disjoint from $I_{k_1}, \ldots, I_{k_{L-1}}$. Because $j$ was not chosen in step $L$, we conclude that $|I_{k_L}| \geq |I_j|$. Because $I_{k_L} \cap I_j \neq \emptyset$, this last inequality implies (easy exercise) that $I_j \subset 3 * I_{k_L}$, completing the proof.
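The greedy procedure in the proof above is concrete enough to experiment with on a computer. The following Python sketch (an informal illustration, not part of the text; helper names such as `triple` and `greedy_sublist` are invented here) selects a disjoint sublist by repeatedly taking the longest remaining interval that is disjoint from the intervals already chosen, and then checks the conclusion of the Vitali Covering Lemma on the four intervals of Example 4.5.

```python
# Sketch (not from the book) of the greedy selection used in the proof of the
# Vitali Covering Lemma (4.4).  A bounded open interval is a pair (a, b), a < b.

def triple(I):
    """Return 3*I: the open interval with the same center and 3 times the length."""
    a, b = I
    center, half = (a + b) / 2, (b - a) / 2
    return (center - 3 * half, center + 3 * half)

def disjoint(I, J):
    """True if the open intervals I and J do not intersect."""
    return I[1] <= J[0] or J[1] <= I[0]

def greedy_sublist(intervals):
    """At each stage, take the largest remaining interval disjoint from those chosen."""
    chosen = []
    for I in sorted(intervals, key=lambda I: I[1] - I[0], reverse=True):
        if all(disjoint(I, J) for J in chosen):
            chosen.append(I)
    return chosen

def contained_in_union(I, others):
    """Check that I lies in the union of the given intervals by sampling a fine grid."""
    a, b = I
    points = [a + (b - a) * k / 1000 for k in range(1001)]
    return all(any(c < x < d for (c, d) in others) for x in points)

if __name__ == "__main__":
    intervals = [(0, 10), (9, 15), (14, 22), (21, 31)]      # the intervals of Example 4.5
    chosen = greedy_sublist(intervals)
    print(chosen)                                            # [(0, 10), (21, 31)]
    tripled = [triple(I) for I in chosen]
    print(all(contained_in_union(I, tripled) for I in intervals))   # True
```

Running the script selects $I_1$ and $I_4$ and confirms that each of the four original intervals lies inside $(3 * I_1) \cup (3 * I_4)$, in agreement with Example 4.5.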
Hardy–Littlewood Maximal Inequality

Now we come to a brilliant definition that turns out to be extraordinarily useful.

4.6  Definition   Hardy–Littlewood maximal function

Suppose $h : \mathbf{R} \to \mathbf{R}$ is a Lebesgue measurable function. Then the Hardy–Littlewood maximal function of $h$ is the function $h^* : \mathbf{R} \to [0, \infty]$ defined by
\[
h^*(b) = \sup_{t > 0} \frac{1}{2t} \int_{b-t}^{b+t} |h|.
\]
In other words, $h^*(b)$ is the supremum over all bounded intervals centered at $b$ of the average of $|h|$ on those intervals.

4.7  Example   Hardy–Littlewood maximal function of $\chi_{[0,1]}$

As usual, let $\chi_{[0,1]}$ denote the characteristic function of the interval $[0, 1]$. Then
\[
(\chi_{[0,1]})^*(b) =
\begin{cases}
\dfrac{1}{2(1 - b)} & \text{if } b \leq 0,\\[1ex]
1 & \text{if } 0 < b < 1,\\[1ex]
\dfrac{1}{2b} & \text{if } b \geq 1,
\end{cases}
\]
as you should verify.

[Figure: The graph of $(\chi_{[0,1]})^*$ on $[-2, 3]$.]

If $h : \mathbf{R} \to \mathbf{R}$ is Lebesgue measurable and $c \in \mathbf{R}$, then $\{b \in \mathbf{R} : h^*(b) > c\}$ is an open subset of $\mathbf{R}$, as you are asked to prove in Exercise 9 in this section. Thus $h^*$ is a Borel measurable function.

Suppose $h \in \mathcal{L}^1(\mathbf{R})$ and $c > 0$. Markov's inequality (4.1) estimates the size of the set on which $|h|$ is larger than $c$. Our next result estimates the size of the set on which $h^*$ is larger than $c$. The Hardy–Littlewood maximal inequality proved in the next result is a key ingredient in the proof of the Lebesgue Differentiation Theorem (4.10). Note that this next result is considerably deeper than Markov's inequality.

4.8  Hardy–Littlewood maximal inequality

Suppose $h \in \mathcal{L}^1(\mathbf{R})$. Then
\[
|\{b \in \mathbf{R} : h^*(b) > c\}| \leq \frac{3}{c} \|h\|_1
\]
for every $c > 0$.

Proof   Suppose $F$ is a closed bounded subset of $\{b \in \mathbf{R} : h^*(b) > c\}$. We will show that $|F| \leq \frac{3}{c} \int_{-\infty}^{\infty} |h|$, which implies our desired result [see Exercise 24(a) in Section 2D].

For each $b \in F$, there exists $t_b > 0$ such that
\[
\frac{1}{2 t_b} \int_{b - t_b}^{b + t_b} |h| > c. \tag{4.9}
\]
Clearly
\[
F \subset \bigcup_{b \in F} (b - t_b, b + t_b).
\]
The Heine–Borel Theorem (2.12) tells us that this open cover of a closed bounded set has a finite subcover. In other words, there exist $b_1, \ldots, b_n \in F$ such that
\[
F \subset (b_1 - t_{b_1}, b_1 + t_{b_1}) \cup \cdots \cup (b_n - t_{b_n}, b_n + t_{b_n}).
\]
To make the notation cleaner, relabel the open intervals above as $I_1, \ldots, I_n$.

Now apply the Vitali Covering Lemma (4.4) to the list $I_1, \ldots, I_n$, producing a disjoint sublist $I_{k_1}, \ldots, I_{k_m}$ such that
\[
I_1 \cup \cdots \cup I_n \subset (3 * I_{k_1}) \cup \cdots \cup (3 * I_{k_m}).
\]
Thus
\begin{align*}
|F| &\leq |I_1 \cup \cdots \cup I_n|\\
&\leq |(3 * I_{k_1}) \cup \cdots \cup (3 * I_{k_m})|\\
&\leq |3 * I_{k_1}| + \cdots + |3 * I_{k_m}|\\
&= 3(|I_{k_1}| + \cdots + |I_{k_m}|)\\
&< \frac{3}{c} \Bigl( \int_{I_{k_1}} |h| + \cdots + \int_{I_{k_m}} |h| \Bigr)\\
&\leq \frac{3}{c} \int_{-\infty}^{\infty} |h|,
\end{align*}
where the second-to-last inequality above comes from 4.9 (note that $|I_{k_j}| = 2 t_b$ for the choice of $b$ corresponding to $I_{k_j}$) and the last inequality holds because $I_{k_1}, \ldots, I_{k_m}$ are disjoint. The last inequality completes the proof.
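The closed-form expression in Example 4.7 is easy to test numerically. The Python sketch below (an informal illustration, not part of the text; the names `average`, `maximal`, and `formula` are invented here) approximates $(\chi_{[0,1]})^*(b)$ by maximizing the average of $\chi_{[0,1]}$ over a finite grid of radii $t$ and compares the result with the piecewise formula of 4.7.

```python
# Sketch (not from the book): numerically approximate the Hardy–Littlewood
# maximal function of h = chi_[0,1] and compare with the formula in Example 4.7.

def average(b, t):
    """Average of chi_[0,1] over (b - t, b + t), computed from the overlap length."""
    overlap = max(0.0, min(b + t, 1.0) - max(b - t, 0.0))
    return overlap / (2 * t)

def maximal(b, radii):
    """Approximate (chi_[0,1])*(b): supremum of the averages over a finite set of radii."""
    return max(average(b, t) for t in radii)

def formula(b):
    """The closed-form value from Example 4.7."""
    if b <= 0:
        return 1 / (2 * (1 - b))
    if b < 1:
        return 1.0
    return 1 / (2 * b)

if __name__ == "__main__":
    radii = [k / 1000 for k in range(1, 5001)]        # t ranges over (0, 5]
    for b in [-2.0, -0.5, 0.25, 0.75, 1.0, 2.0, 3.0]:
        print(f"b = {b:5.2f}   grid sup = {maximal(b, radii):.4f}"
              f"   formula = {formula(b):.4f}")
```

The grid value is only a supremum over finitely many radii, so in general it can undershoot $h^*(b)$ slightly; for this particular $h$ and these sample points it already agrees with the formula to the displayed precision.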
EXERCISES 4A

1  Suppose $(X, \mathcal{S}, \mu)$ is a measure space and $h : X \to \mathbf{R}$ is an $\mathcal{S}$-measurable function. Prove that
\[
\mu(\{x \in X : |h(x)| \geq c\}) \leq \frac{1}{c^p} \int |h|^p \, d\mu
\]
for all positive numbers $c$ and $p$.

2  Suppose $(X, \mathcal{S}, \mu)$ is a measure space with $\mu(X) = 1$ and $h \in \mathcal{L}^1(\mu)$. Prove that
\[
\mu\Bigl(\Bigl\{x \in X : \Bigl| h(x) - \int h \, d\mu \Bigr| \geq c\Bigr\}\Bigr) \leq \frac{1}{c^2} \Bigl( \int h^2 \, d\mu - \Bigl( \int h \, d\mu \Bigr)^2 \Bigr)
\]
for all $c > 0$.
[The result above is called Chebyshev's inequality; it plays an important role in probability theory. Pafnuty Chebyshev (1821–1894) was Markov's thesis advisor. A numerical sanity check of this inequality appears in the sketch after these exercises.]

3  Suppose $(X, \mathcal{S}, \mu)$ is a measure space.
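As a companion to Exercise 2, the following Python sketch (an informal illustration, not part of the text; `chebyshev_check` is an invented name) runs a Monte Carlo sanity check of Chebyshev's inequality. It takes the empirical measure of a large random sample as the probability measure, so $\mu(X) = 1$ and the integrals in the exercise become sample averages.

```python
# Sketch (not from the book): Monte Carlo sanity check of Chebyshev's inequality
# (Exercise 2), using the empirical measure of a sample as the probability measure.
import random

def chebyshev_check(sample, c):
    """Return (measure of the deviation set, the Chebyshev bound) for this sample."""
    n = len(sample)
    mean = sum(sample) / n                           # plays the role of the integral of h
    second_moment = sum(x * x for x in sample) / n   # plays the role of the integral of h^2
    variance = second_moment - mean ** 2
    deviation_measure = sum(1 for x in sample if abs(x - mean) >= c) / n
    return deviation_measure, variance / c ** 2

if __name__ == "__main__":
    random.seed(0)
    sample = [random.gauss(0.0, 1.0) for _ in range(100_000)]
    for c in [0.5, 1.0, 2.0, 3.0]:
        measure, bound = chebyshev_check(sample, c)
        print(f"c = {c}:   measure = {measure:.4f}   <=   bound = {bound:.4f}")
```

For every $c$, the printed measure of the deviation set stays below the printed bound, as the inequality requires (the bound is informative only when it is less than 1).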