
Partial Differential Equations

Mollification

(With one colorful figure)

1 $L^p$-spaces, local integrability

I will assume you know what it means for a function to be measurable, integrable, etc. In this course, measurability always means Lebesgue measurability, and integrability means Lebesgue integrability. One very nice feature of Lebesgue integration is that if $E$ is a measurable subset of $\mathbb{R}^n$ of positive measure¹ and $f : E \to \mathbb{R}$ is measurable and non-negative ($f(x) \ge 0$ for all $x \in E$), then
\[ \int_E f(x)\,dx \]
is always defined, possibly equal to $\infty$. A measurable function $f : E \to \mathbb{R}$ is integrable over $E$, in symbols $f \in L^1(E)$, if and only if
\[ \int_E |f(x)|\,dx < \infty. \]
More generally, one defines $L^p(E)$ for $1 \le p < \infty$ by: a measurable $f : E \to \mathbb{R}$ is in $L^p(E)$ if and only if
\[ \int_E |f(x)|^p\,dx < \infty. \]

One also defines $f \in L^\infty(E)$ iff $\operatorname{ess\,sup}_{x\in E}|f(x)| < \infty$. The essential supremum or ess sup of a function $f$ over a set $E$ can be defined as the infimum (strangely enough) of the suprema of all functions $g$ that are equal a.e. to $f$:

\[ \operatorname*{ess\,sup}_{x\in E} f = \inf\Bigl\{\sup_{x\in E} g(x) : g(x) = f(x) \text{ for a.e. } x \in E\Bigr\}. \]

This is a notation I’ll be using on occasion: Assume $p \in [1, \infty]$. The conjugate index of $p$ will be denoted by $p'$; it is defined to be the number in $[1, \infty]$ so that the equation
\[ \frac{1}{p} + \frac{1}{p'} = 1 \]
holds. If $p = 1$, one interprets $1/p' = 0$ as meaning $p' = \infty$; if $p = \infty$, one interprets $1/p = 0$, so $p' = 1$. It would have been, perhaps, easier to simply say
\[ p' = \frac{p}{p-1}, \]
but the equation $1/p + 1/p' = 1$ is usually preferred; it makes it clear at once that $p, p'$ play symmetric roles and that $(p')' = p$. A few things to notice are: $2' = 2$; if $1 \le p < 2$, then $2 < p' \le \infty$, and vice-versa: $2 < p \le \infty$ implies $1 \le p' < 2$.

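One defines a norm in the spaces $L^p(E)$ by

\[ \|f\|_p = \|f\|_{L^p(E)} = \begin{cases} \left( \int_E |f(x)|^p\,dx \right)^{1/p}, & \text{if } p < \infty, \\ \operatorname{ess\,sup}_{x\in E}|f(x)|, & \text{if } p = \infty. \end{cases} \]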

To see it is a norm is not totally trivial. That is, it is trivially a norm (if we interpret, as one does, $f = 0$ to mean $f = 0$ a.e.) for $p = 1$ and for $p = \infty$. Otherwise it is not so obvious. To prove it is a norm also for $1 < p < \infty$, one first proves a fundamental inequality: Hölder’s inequality. It states: Let $E$ be a measurable subset of $\mathbb{R}^n$ and let $f \in L^p(E)$, $g \in L^{p'}(E)$; then $fg \in L^1(E)$ and
\[ \int_E |f(x)g(x)|\,dx \le \|f\|_{L^p(E)} \|g\|_{L^{p'}(E)}. \tag{1} \]
The proof of (1) is based on the following result, which could be a nice exercise in a Calculus course: Assume $a, b$ are two non-negative real numbers. Then, if $1 < p < \infty$ and $p' = p/(p-1)$,

\[ ab \le \frac{1}{p} a^p + \frac{1}{p'} b^{p'}. \]
The hint one can give for this exercise is that it suffices to prove it for $a, b > 0$, to prove first that $\varphi(x) = x^p/p + x^{-p'}/p' \ge 1$ for all $x > 0$, and to apply it with $x = a^{1/p'} b^{-1/p}$. Once one has this inequality, given now $f \in L^p(E)$, $g \in L^{p'}(E)$ (we may assume $\|f\|_p, \|g\|_{p'} > 0$, since otherwise $fg = 0$ a.e.), one applies it with $a = |f(x)|/\|f\|_p$, $b = |g(x)|/\|g\|_{p'}$ to get
\[ \frac{|f(x)g(x)|}{\|f\|_p \|g\|_{p'}} \le \frac{1}{p} \frac{|f(x)|^p}{\|f\|_p^p} + \frac{1}{p'} \frac{|g(x)|^{p'}}{\|g\|_{p'}^{p'}} \]
for a.e. $x \in E$. Integrating over $E$ gives
\[ \frac{1}{\|f\|_p \|g\|_{p'}} \int_E |f(x)g(x)|\,dx \le \frac{1}{p} + \frac{1}{p'} = 1. \]
This proves (1) if $1 < p < \infty$; the proof when $p = 1$ (or equivalently when $p = \infty$) is trivial.

¹ Lebesgue integration over null sets is a rather stupid endeavor, done only by people who love the number zero so much they can’t keep their hands off it.

If p = 2, (1) is known as the Cauchy-Schwarz inequality in the Western part of the world. The Russians prefer to call it the Bunyakovsky inequality, since Viktor Bunyakovsky (1804-1889) had it some 25 years before Cauchy or Schwarz.
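Inequality (1) can be sanity-checked numerically. The sketch below is only an illustration, not part of the proof: it replaces the integral over $E$ by a weighted finite sum (a crude Riemann sum on $[0,1]$), for which Hölder’s inequality also holds. The helper names `lp_norm`, `conjugate` and `holder_gap`, and the two sample functions, are assumptions made for this example.

```python
import math

def lp_norm(values, weights, p):
    """Weighted discrete L^p norm: (sum w_i |v_i|^p)^(1/p), or max |v_i| for p = inf."""
    if math.isinf(p):
        return max(abs(v) for v in values)
    return sum(w * abs(v) ** p for v, w in zip(values, weights)) ** (1.0 / p)

def conjugate(p):
    """Conjugate index p' with 1/p + 1/p' = 1 (1 and inf are conjugate)."""
    if p == 1:
        return math.inf
    if math.isinf(p):
        return 1.0
    return p / (p - 1.0)

def holder_gap(f, g, weights, p):
    """Return ||f||_p * ||g||_p' - sum w_i |f_i g_i|, which (1) says is >= 0."""
    lhs = sum(w * abs(a * b) for a, b, w in zip(f, g, weights))
    return lp_norm(f, weights, p) * lp_norm(g, weights, conjugate(p)) - lhs

# Samples of two (arbitrarily chosen) functions on [0, 1] with equal weights.
n = 200
xs = [(i + 0.5) / n for i in range(n)]
w = [1.0 / n] * n
f = [math.sin(7 * x) for x in xs]
g = [x * x - 0.3 for x in xs]
gaps = {p: holder_gap(f, g, w, p) for p in (1, 1.5, 2, 3, math.inf)}
```

Every entry of `gaps` should be non-negative up to rounding; the entry for $p = 2$ is the Cauchy-Schwarz case.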

With (1) it is now easy to prove also for $1 < p < \infty$ that

\[ \|f + g\|_{L^p(E)} \le \|f\|_{L^p(E)} + \|g\|_{L^p(E)} \tag{2} \]
for all $f, g \in L^p(E)$; in other words, the main step in seeing $\|\cdot\|_{L^p(E)}$ is a norm. The cases $p = 1, \infty$ being, as mentioned, trivial (I hope that if I say a sufficient number of times that it is trivial you will believe it), one assumes $1 < p < \infty$ and proceeds as follows (notice that $(p-1)p' = p$):
\[ \begin{aligned}
\|f + g\|_p^p &= \int_E |f + g|^p\,dx = \int_E |f + g|\,|f + g|^{p-1}\,dx \le \int_E |f|\,|f + g|^{p-1}\,dx + \int_E |g|\,|f + g|^{p-1}\,dx \\
&\le \left( \int_E |f|^p\,dx \right)^{1/p} \left( \int_E |f + g|^{(p-1)p'}\,dx \right)^{1/p'} + \left( \int_E |g|^p\,dx \right)^{1/p} \left( \int_E |f + g|^{(p-1)p'}\,dx \right)^{1/p'} \\
&= \|f\|_p \|f + g\|_p^{p/p'} + \|g\|_p \|f + g\|_p^{p/p'}.
\end{aligned} \]
Dividing both sides by $\|f + g\|_p^{p/p'}$ (if $\|f + g\|_p = 0$ there is nothing to prove), (2) follows, since $p - (p/p') = p - (p - 1) = 1$.

FACT: The spaces $L^p(E)$ are complete in the metric defined by the $\|\cdot\|_{L^p(E)}$ norm, thus Banach spaces.

Definition 1 Let $U$ be open in $\mathbb{R}^n$ and let $f : U \to \mathbb{R}$ be measurable. We say $f$ is locally integrable in $U$ and write $f \in L^1_{\mathrm{loc}}(U)$ iff
\[ \int_K |f(x)|\,dx < \infty \]
for all compact sets $K \subset U$.

Notice that if the restriction of $f$ to a compact set $K$ satisfies $\|f\|_{L^p(K)} < \infty$, then
\[ \int_K |f(x)|\,dx < \infty. \]
This is clear if $p = 1$, since then both statements say the same thing. Otherwise, with $\chi_K$ denoting the characteristic function of $K$, $|K|$ the Lebesgue measure of $K$,
\[ \int_K |f(x)|\,dx = \int_K \chi_K(x)|f(x)|\,dx \le \|\chi_K\|_{L^{p'}(K)} \|f\|_{L^p(K)} = |K|^{1/p'} \|f\|_{L^p(K)} < \infty. \]

2 Convolutions

Let $f, g : \mathbb{R}^n \to \mathbb{R}$ be measurable. Then it is not too hard to show that for almost all $x \in \mathbb{R}^n$, the function $y \mapsto f(x - y)g(y)$ is measurable. If (and only if) it is also integrable for almost all $x \in \mathbb{R}^n$, one defines a function $f * g : \mathbb{R}^n \to \mathbb{R}$ by
\[ f * g(x) = \int_{\mathbb{R}^n} f(x - y)g(y)\,dy \]
for almost all $x \in \mathbb{R}^n$.

I do not know (and I don’t really care to know) the EXACT conditions on f, g so that this product is defined. But it is useful to know some basic conditions under which the product is defined.

• If $f \in L^p(\mathbb{R}^n)$, $g \in L^q(\mathbb{R}^n)$ and $1/p + 1/q \ge 1$, then $f * g$ is defined and $f * g \in L^r(\mathbb{R}^n)$, where $\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1$. Moreover,

\[ \|f * g\|_{L^r(\mathbb{R}^n)} \le \|f\|_{L^p(\mathbb{R}^n)} \|g\|_{L^q(\mathbb{R}^n)}. \tag{3} \]
Important subcases of this fact are: The case $p = q = 1$. In this case $r = 1$ and one sees that $L^1(\mathbb{R}^n)$ is an algebra under the convolution product. It is also frequently used with $p = 1$. Then $r = q$ and it becomes

\[ \|f * g\|_{L^q(\mathbb{R}^n)} \le \|f\|_{L^1(\mathbb{R}^n)} \|g\|_{L^q(\mathbb{R}^n)}. \]
The case $p = q = 2$. In this case $r = \infty$ so $f * g$ is bounded. One can prove a bit more: In this case, $f * g$ is continuous. More generally, if $q = p'$, then $r = \infty$ and one can prove also that $f * g$ is continuous.

• If $f, g \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, and at least one of them vanishes outside of a compact set, then $f * g$ is defined. If both vanish outside of compact sets, then $f * g$ is locally integrable and vanishes outside of a compact set. Generally speaking, if $f(x) = 0$ for $x \notin A$ and $g(x) = 0$ for $x \notin B$, then $f * g(x) = 0$ for $x \notin A + B$. Specifically,

\[ \operatorname{supp} f * g \subset \operatorname{supp} f + \operatorname{supp} g. \]
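The two bullet points have an exact discrete analogue, which is easy to experiment with. The following sketch is an illustration under an assumption not made in these notes: it works with finitely supported sequences on the integers (sums instead of integrals). The function names `convolve` and `l1_norm` and the sample sequences are made up for the example.

```python
def convolve(f, g):
    """Discrete convolution of finitely supported sequences given as {index: value} dicts."""
    out = {}
    for i, a in f.items():
        for j, b in g.items():
            out[i + j] = out.get(i + j, 0.0) + a * b
    return out

def l1_norm(f):
    """Sum of absolute values, the discrete L^1 norm."""
    return sum(abs(v) for v in f.values())

f = {0: 1.0, 1: -2.0, 2: 0.5}      # supported in A = {0, 1, 2}
g = {10: 3.0, 11: 1.0}             # supported in B = {10, 11}
h = convolve(f, g)

# A + B = {a + b : a in A, b in B}; the convolution vanishes outside of it.
support_sum = {i + j for i in f for j in g}
```

Here `set(h)` is contained in `support_sum`, `l1_norm(h)` does not exceed `l1_norm(f) * l1_norm(g)` (the case $p = q = r = 1$ of (3)), and `convolve(g, f)` gives the same result as `convolve(f, g)`.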

Proof of (3). Inequality (3) is usually known as Young’s inequality for convolutions; it is sometimes also called the Hausdorff, or Hausdorff-Young, inequality. Its proof consists in a moderately clever application of Hölder’s inequality, more precisely of a simple generalization of Hölder’s inequality: Let $p_1, \ldots, p_n \in [1, \infty]$ be such that
\[ \frac{1}{p_1} + \cdots + \frac{1}{p_n} = 1. \]
Then
\[ \int_E |f_1 f_2 \cdots f_n|\,dx \le \|f_1\|_{L^{p_1}(E)} \|f_2\|_{L^{p_2}(E)} \cdots \|f_n\|_{L^{p_n}(E)}. \tag{4} \]
This is easily proved by induction on $n$; the case $n = 2$ being just regular Hölder. We need the case $n = 3$.

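Assume thus that $p, q \in [1, \infty]$, that $\frac{1}{p} + \frac{1}{q} \ge 1$, and set $\frac{1}{r} = \frac{1}{p} + \frac{1}{q} - 1$ (notice, incidentally, that $1/p + 1/q \le 2$ so that $0 \le 1/r \le 1$ and $r \in [1, \infty]$). If $r = \infty$, then $q = p'$ and (3) is just Hölder’s inequality (1); so assume $r < \infty$. Let $f \in L^p(\mathbb{R}^n)$, $g \in L^q(\mathbb{R}^n)$; we may assume that $f, g$ only assume non-negative values. Now
\[ \frac{1}{q'} + \frac{1}{p'} + \frac{1}{r} = 2 - \frac{1}{q} - \frac{1}{p} + \frac{1}{r} = 1 \]
and we can use them as powers for an application of Hölder. Notice that we must have $p \le r$ because $1/p = 1/r + 1/q' \ge 1/r$; similarly $q \le r$. Thus $p/r, q/r \in [0, 1]$ and we can write (for $x \in \mathbb{R}^n$)
\[ \int_{\mathbb{R}^n} f(x - y)g(y)\,dy = \int_{\mathbb{R}^n} f(x - y)^{1 - \frac{p}{r}}\, g(y)^{1 - \frac{q}{r}} \left( f(x - y)^{\frac{p}{r}} g(y)^{\frac{q}{r}} \right) dy. \]
We now apply Hölder with $q', p', r$ to get

\[ \int_{\mathbb{R}^n} f(x - y)g(y)\,dy \le \left( \int_{\mathbb{R}^n} f(x - y)^{q'(1 - \frac{p}{r})}\,dy \right)^{1/q'} \left( \int_{\mathbb{R}^n} g(y)^{p'(1 - \frac{q}{r})}\,dy \right)^{1/p'} \left( \int_{\mathbb{R}^n} f(x - y)^p g(y)^q\,dy \right)^{1/r}. \]

Now
\[ q'\left(1 - \frac{p}{r}\right) = pq'\left(\frac{1}{p} - \frac{1}{r}\right) = pq'\left(1 - \frac{1}{q}\right) = pq' \cdot \frac{1}{q'} = p. \]

Similarly, $p'(1 - \frac{q}{r}) = q$. Moreover, a change of variables shows that $\int_{\mathbb{R}^n} h(x - y)\,dy = \int_{\mathbb{R}^n} h(y)\,dy$ so that we proved
\[ \int_{\mathbb{R}^n} f(x - y)g(y)\,dy \le \|f\|_{L^p(\mathbb{R}^n)}^{p/q'} \|g\|_{L^q(\mathbb{R}^n)}^{q/p'} \left( \int_{\mathbb{R}^n} f(x - y)^p g(y)^q\,dy \right)^{1/r} \]
for all $x \in \mathbb{R}^n$. Raising to the power $r$ and integrating with respect to $x$, using Fubini to change the order of integration:
\[ \begin{aligned}
\|f * g\|_{L^r(\mathbb{R}^n)}^r &= \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} f(x - y)g(y)\,dy \right)^r dx \le \|f\|_{L^p(\mathbb{R}^n)}^{pr/q'} \|g\|_{L^q(\mathbb{R}^n)}^{qr/p'} \int_{\mathbb{R}^n} \int_{\mathbb{R}^n} f(x - y)^p g(y)^q\,dy\,dx \\
&= \|f\|_{L^p(\mathbb{R}^n)}^{pr/q'} \|g\|_{L^q(\mathbb{R}^n)}^{qr/p'} \int_{\mathbb{R}^n} \left( \int_{\mathbb{R}^n} f(x - y)^p\,dx \right) g(y)^q\,dy = \|f\|_{L^p(\mathbb{R}^n)}^{pr/q'} \|g\|_{L^q(\mathbb{R}^n)}^{qr/p'} \|f\|_{L^p(\mathbb{R}^n)}^p \int_{\mathbb{R}^n} g(y)^q\,dy \\
&= \|f\|_{L^p(\mathbb{R}^n)}^{pr/q' + p} \|g\|_{L^q(\mathbb{R}^n)}^{qr/p' + q} = \|f\|_{L^p(\mathbb{R}^n)}^r \|g\|_{L^q(\mathbb{R}^n)}^r.
\end{aligned} \]
The last equality uses $pr/q' + p = pr(1/q' + 1/r) = r$ and, similarly, $qr/p' + q = r$. The result follows.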
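The exponent bookkeeping in this proof is easy to get wrong, so here is a small exact-arithmetic check of the identities used above. It is a sketch for one sample pair $(p, q)$ with $p, q > 1$ and $1/p + 1/q > 1$ (so that $r$, $p'$, $q'$ are all finite); the function names `young_exponents` and `check_identities` are invented for this illustration.

```python
from fractions import Fraction

def young_exponents(p, q):
    """Given finite p, q > 1 with 1/p + 1/q > 1, return (r, p', q') as exact Fractions."""
    p, q = Fraction(p), Fraction(q)
    inv_r = 1 / p + 1 / q - 1
    if inv_r <= 0 or p <= 1 or q <= 1:
        raise ValueError("need p, q > 1 and 1/p + 1/q > 1 so that r, p', q' are finite")
    r = 1 / inv_r
    return r, p / (p - 1), q / (q - 1)

def check_identities(p, q):
    """Evaluate the five exponent identities used in the proof of (3)."""
    p, q = Fraction(p), Fraction(q)
    r, pc, qc = young_exponents(p, q)
    return (
        1 / qc + 1 / pc + 1 / r == 1,   # Hölder exponents q', p', r
        qc * (1 - p / r) == p,          # power of f in the first factor
        pc * (1 - q / r) == q,          # power of g in the second factor
        p * r / qc + p == r,            # final power of ||f||_p
        q * r / pc + q == r,            # final power of ||g||_q
    )

results = check_identities(Fraction(3, 2), Fraction(4, 3))
```

For $p = 3/2$, $q = 4/3$ one gets $r = 12/5$, $p' = 3$, $q' = 4$, and all five entries of `results` are `True`.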

Other basic properties of this product are:
1. $f * g = g * f$.
2. $(f * g) * h = f * (g * h)$.
3. $f * (g + h) = f * g + f * h$.
These properties hold in the sense that if the convolutions on the right hand side are defined, so are those on the left hand side, and both sides are equal. Moving over to perhaps less basic properties, we have

Theorem 1 Assume $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$ and $g \in C_c^k(\mathbb{R}^n)$,² where $k \in \{0, 1, 2, \ldots\}$, or $k = \infty$. Then $f * g \in C^k(\mathbb{R}^n)$ and (if $k \ge 1$) $D^\alpha(f * g) = f * D^\alpha g$ for all multi-indices $\alpha$ such that $|\alpha| \le k$.

Proof. The simplest proof uses Lebesgue’s dominated convergence theorem, but I’ll prove it using only Riemann integration tools. Let $K = \operatorname{supp} g$. We also need to use the following rather simple facts: If a set $C$ is compact and we define for $\delta > 0$ the set $C_\delta$ by

\[ C_\delta = \{x \in \mathbb{R}^n : \operatorname{dist}(x, C) \le \delta\}, \]

then $C_\delta$ is compact. If $C$ is compact and $x \in \mathbb{R}^n$, then $x - C = \{x - y : y \in C\}$ is compact.

We prove first that $f * g$ is continuous if $g$ is continuous. For this let $x_0 \in \mathbb{R}^n$ and let $\varepsilon > 0$ be given. A compactly supported continuous function is uniformly continuous, so there exists $\delta > 0$ such that $|g(x) - g(y)| < \varepsilon$ if $|x - y| < \delta$. We may (and will) assume that $\delta \le 1$. If $|x - x_0| < \delta$, then

\[ \operatorname{supp} g(x - \cdot) \subset (x_0 - K)_1; \]
that is, the map $y \mapsto g(x - y)$ is supported by the compact set consisting of all points at distance less than or equal to 1 from $x_0 - K$. In fact, if $g(x - y) \ne 0$, then $x - y \in K$, so $x - y = z$ for some $z \in K$. Then $y = x - z$ and $|y - (x_0 - z)| = |x - x_0| < \delta \le 1$. Since $x_0 - z \in x_0 - K$, we see that $y \in (x_0 - K)_1$. We thus have, for $|x - x_0| < \delta$,
\[ |f * g(x) - f * g(x_0)| \le \int_{\mathbb{R}^n} |f(y)|\,|g(x - y) - g(x_0 - y)|\,dy = \int_{(x_0 - K)_1} |f(y)|\,|g(x - y) - g(x_0 - y)|\,dy \le \varepsilon \int_{(x_0 - K)_1} |f(y)|\,dy, \]
the last inequality being due to $|(x - y) - (x_0 - y)| = |x - x_0| < \delta$. Now
\[ \int_{(x_0 - K)_1} |f(y)|\,dy \]
depends on $x_0$, but it does not depend on $\varepsilon$ or $\delta$. It is finite because $f$ is locally integrable. Instead of taking $\delta > 0$ for $\varepsilon > 0$, given $\varepsilon > 0$ we could have taken the $\delta > 0$ that works for $\varepsilon / \int_{(x_0 - K)_1} |f(y)|\,dy$ and ended with $|f * g(x) - f * g(x_0)| < \varepsilon$ if $|x - x_0| < \delta$. One minor problem would be, however, that if $\int_{(x_0 - K)_1} |f(y)|\,dy = 0$, then we would be dividing by 0. There are (at least) two ways of solving this. One, in this case $f * g(x) = f * g(x_0)$ for $|x - x_0| < \delta$. Or one can add 1 to $\int_{(x_0 - K)_1} |f(y)|\,dy$ (see the next step).

² The subscript $c$ in $C_c$ means “compact support.” The case $k = 0$ refers to continuous functions.

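Next we prove that if $g \in C_c^1(\mathbb{R}^n)$, then $f * g \in C^1(\mathbb{R}^n)$ and $\frac{\partial (f * g)}{\partial x_j} = f * \frac{\partial g}{\partial x_j}$ for $j = 1, \ldots, n$. This is the main step. Let $1 \le j \le n$. Let $x_0 \in \mathbb{R}^n$. We want to see that $f * g$ has a partial derivative with respect to $x_j$ at $x_0$. Because $g$ has compact support, so does $\frac{\partial g}{\partial x_j}$; in fact, $\operatorname{supp} \frac{\partial g}{\partial x_j} \subset \operatorname{supp} g = K$. For simplicity of notation, let $J = (x_0 - K)_1$; the set of all points at distance at most 1 from $x_0 - K$. Let $e_j$ be the $j$-th element of the canonical basis of $\mathbb{R}^n$, $e_j = (\delta_{j1}, \delta_{j2}, \ldots, \delta_{jn})$, $\delta_{jk} = 0$ if $k \ne j$, $\delta_{jj} = 1$. Let $\varepsilon > 0$ be given. Because $\frac{\partial g}{\partial x_j}$ is continuous and has compact support, there is $\delta \in (0, 1]$ such that

\[ \left| \frac{\partial g}{\partial x_j}(x) - \frac{\partial g}{\partial x_j}(y) \right| < \frac{\varepsilon}{1 + \int_J |f(y)|\,dy} \quad \text{if } |x - y| < \delta. \]
Suppose that $h \in \mathbb{R}$, $0 < |h| < \delta$, $y \in \mathbb{R}^n$. By the (Calculus 1) mean value theorem, there is $\bar{h}$ between 0 and $h$, thus $|\bar{h}| < \delta$, such that

\[ \frac{g(x_0 + he_j - y) - g(x_0 - y)}{h} = \frac{\partial g}{\partial x_j}(x_0 + \bar{h}e_j - y); \]
it is likely that $\bar{h}$ depends on $y$, but
\[ |(x_0 + \bar{h}e_j - y) - (x_0 - y)| = |\bar{h}| \le |h| < \delta, \]
thus

\[ \left| \frac{g(x_0 + he_j - y) - g(x_0 - y)}{h} - \frac{\partial g}{\partial x_j}(x_0 - y) \right| = \left| \frac{\partial g}{\partial x_j}(x_0 + \bar{h}e_j - y) - \frac{\partial g}{\partial x_j}(x_0 - y) \right| < \frac{\varepsilon}{1 + \int_J |f(y)|\,dy}. \]

This being true for all $y \in \mathbb{R}^n$, and considering that for $|h| < \delta$ we have that $g(x_0 + he_j - y)$ and $\frac{\partial g}{\partial x_j}(x_0 - y)$ are zero if $y \notin J$, we see that
\[ \begin{aligned}
\left| \frac{f * g(x_0 + he_j) - f * g(x_0)}{h} - f * \frac{\partial g}{\partial x_j}(x_0) \right| &= \left| \int_J f(y) \left( \frac{g(x_0 + he_j - y) - g(x_0 - y)}{h} - \frac{\partial g}{\partial x_j}(x_0 - y) \right) dy \right| \\
&\le \int_J |f(y)| \left| \frac{g(x_0 + he_j - y) - g(x_0 - y)}{h} - \frac{\partial g}{\partial x_j}(x_0 - y) \right| dy \\
&\le \frac{\varepsilon}{1 + \int_J |f(y)|\,dy} \int_J |f(y)|\,dy < \varepsilon
\end{aligned} \]
if $0 < |h| < \delta$. This proves $\frac{\partial (f * g)}{\partial x_j} = f * \frac{\partial g}{\partial x_j}$. Since $\frac{\partial g}{\partial x_j}$ is continuous, it follows (by the first part of the proof) that $f * \frac{\partial g}{\partial x_j}$ is continuous. Thus $f * g \in C^1(\mathbb{R}^n)$.

Induction takes over now. Assume proved for some $m$, $1 \le m < k$, that $f * g \in C^m(\mathbb{R}^n)$ and $D^\alpha(f * g) = f * D^\alpha g$ for all multi-indices $\alpha$, $|\alpha| \le m$. Let $\beta$ be a multi-index with $|\beta| = m + 1$. Then there are $\alpha$ with $|\alpha| = m$ and $j$ such that $\beta = \alpha + e_j$. Since $m < k$, $D^\alpha g \in C_c^{k-m}(\mathbb{R}^n) \subset C_c^1(\mathbb{R}^n)$, and by the induction hypothesis and the case $k = 1$ we have that
\[ D^\beta(f * g) = \frac{\partial}{\partial x_j} D^\alpha(f * g) = \frac{\partial}{\partial x_j}(f * D^\alpha g) = f * \frac{\partial D^\alpha g}{\partial x_j} = f * D^\beta g. \]
Moreover, since $D^\beta g$ is continuous, so is $f * D^\beta g$.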

3 Mollification

Functions that are differentiable as many times as one wishes, in other words $C^\infty$ functions, are very smooth. However, they are still quite plastic, malleable. One can play with them. One step above these functions we have analytic functions; these are quite rigid. If you try to bend them, they break. A basic fact about $C^\infty$ functions is that given any set with non-empty interior, there is a non-zero $C^\infty$ function with support contained in that set. The starting point is:

Lemma 2 There exists $\psi \in C^\infty(\mathbb{R}^n)$ such that:
1. $\operatorname{supp} \psi = \overline{B(0, 1)}$, the closed unit ball.
2. $\psi(x) \ge 0$ for all $x \in \mathbb{R}^n$.
3. $\int_{\mathbb{R}^n} \psi(x)\,dx = 1$.
4. $\psi$ is radial; $\psi(x) = \varphi(|x|)$ where $\varphi : [0, \infty) \to \mathbb{R}$.

Proof. Of all possible such functions, the simplest is probably

\[ \psi(x) = \begin{cases} \frac{1}{c}\, e^{-\frac{1}{1 - |x|^2}}, & \text{if } |x| < 1, \\ 0, & \text{otherwise,} \end{cases} \]

where $c = \int_{|x| < 1} e^{-\frac{1}{1 - |x|^2}}\,dx$. Proving that it satisfies the conditions will be left partially as an exercise. Most of it is absolutely trivial. For example it is trivial that the support is the closed unit ball, that it is radial, non-negative and that it is $C^\infty$ at all points except possibly those where $|x| = 1$. All one needs to verify is that the pasting together is smooth. Here are some hints on how to do this in a more or less efficient way. Prove first³: If $\varphi : (0, \infty) \to \mathbb{R}$ is $C^\infty$, and if $\psi : \mathbb{R}^n \setminus \{0\} \to \mathbb{R}$ is defined by $\psi(x) = \varphi(|x|^2)$, then $\psi \in C^\infty(\mathbb{R}^n \setminus \{0\})$. It’s easier to work with $|x|^2$ instead of $|x|$; both are $C^\infty$ for $|x| \ne 0$, but $|x|^2$ is $C^\infty$ everywhere. This reduces the proof to showing that

\[ \varphi : t \mapsto \begin{cases} e^{-\frac{1}{1 - t}}, & \text{if } 0 < t < 1, \\ 0, & \text{if } t \ge 1, \end{cases} \]
is in $C^\infty(0, \infty)$. One can simplify this a bit more, since all that matters is what happens at $t = 1$. One has $\varphi(t) = \tilde{\varphi}(1 - t)$, where $\tilde{\varphi} : (-\infty, 1) \to \mathbb{R}$ is given by

\[ \tilde{\varphi} : s \mapsto \begin{cases} e^{-1/s}, & \text{if } 0 < s < 1, \\ 0, & \text{if } s \le 0. \end{cases} \]

For a function of one variable, it is an easy exercise to prove: If $I$ is an interval in $\mathbb{R}$, if $f : I \to \mathbb{R}$ is continuous, if $c \in I$, if $f$ is differentiable at all points of $I$ except possibly at $c$, and if $\lim_{t \to c} f'(t)$ exists, then $f$ is also differentiable at $c$ and $f'(c) = \lim_{t \to c} f'(t)$. Concerning $\tilde{\varphi}$, all derivatives are defined for $s < 0$ and for $s > 0$. What one needs to do is to show that $\tilde{\varphi}$ and all these derivatives have a limit at 0.
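To get a feeling for the function of Lemma 2, one can evaluate it numerically in the case $n = 1$. The sketch below is only an illustration: the names `phi_tilde`, `bump_unnormalized`, `midpoint_integral` and `psi` are chosen for this example, and the constant $c$ is approximated by a midpoint rule rather than computed exactly.

```python
import math

def phi_tilde(s):
    """s -> exp(-1/s) for s > 0 and 0 for s <= 0 (the one-variable building block)."""
    return math.exp(-1.0 / s) if s > 0 else 0.0

def bump_unnormalized(x):
    """exp(-1/(1 - x^2)) for |x| < 1, and 0 otherwise (the case n = 1)."""
    return phi_tilde(1.0 - x * x)

def midpoint_integral(func, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of func over [a, b]."""
    h = (b - a) / steps
    return h * sum(func(a + (k + 0.5) * h) for k in range(steps))

# Numerical approximation of the normalizing constant c for n = 1.
c = midpoint_integral(bump_unnormalized, -1.0, 1.0)

def psi(x):
    """The mollifier for n = 1, normalized with the approximate constant c."""
    return bump_unnormalized(x) / c

# exp(-1/s) beats every power of s near 0: even phi_tilde(s) / s**5 shrinks as s -> 0+.
quotients = [phi_tilde(s) / s ** 5 for s in (0.1, 0.05, 0.02)]
```

The rapidly shrinking `quotients` are the numerical shadow of the fact that $\tilde{\varphi}$ and all of its derivatives tend to 0 as $s \to 0^+$, which is what makes the pasting at $|x| = 1$ smooth.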

Of the properties of the mollifier, as the function ψ of Lemma 2 is sometimes called, the property of being radial is not so frequently used. It is however essential in the proof in Evans of the fact that harmonic functions are in C∞. An easy consequence of this lemma is:

Theorem 3 Let U be open in Rn, K compact, and assume K ⊂ U. There exists ϕ ∈ C∞(Rn) such that ϕ(x) = 1 for all x ∈ K and supp ϕ ⊂ U.

³ Our concrete ψ is clearly C∞ at 0, so we don’t need to worry what happens there. This statement corrects a mistake in a homework exercise.

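Proof. We assume $K \ne \emptyset$; otherwise the identically zero function fits the bill. There exists $V$ open, with $\bar{V}$ compact, such that $K \subset V \subset \bar{V} \subset U$. In fact, if $U = \mathbb{R}^n$, we can take $V = B(0, R)$, where $R$ is large enough so $K \subset B(0, R)$. Otherwise, let
\[ \delta = \operatorname{dist}(K, \mathbb{R}^n \setminus U) = \inf\{|x - y| : x \in K,\ y \in \mathbb{R}^n \setminus U\}. \]
Because $K$, $\mathbb{R}^n \setminus U$ are not empty, $\delta$ is finite. It is not hard to show that there will exist $x_0 \in K$, $y_0 \in \mathbb{R}^n \setminus U$ such that $\delta = |x_0 - y_0|$, thus $\delta > 0$. Let
\[ V = \{x \in \mathbb{R}^n : \operatorname{dist}(x, K) < \delta/3\}. \]
Then $V$ is open and $\bar{V} \subset \{x : \operatorname{dist}(x, K) \le \delta/3\} \subset U$ is compact. Since $\bar{V}$ is compact, we can repeat this; there is $W$ open, $\bar{W}$ compact, such that $\bar{V} \subset W \subset \bar{W} \subset U$. With $\chi_V$ denoting the characteristic function of $V$, let $\varphi = \chi_V * \psi_\varepsilon$, where $\psi$ is the mollifier of Lemma 2, we define for $\varepsilon > 0$, $\psi_\varepsilon(x) = \varepsilon^{-n} \psi(x/\varepsilon)$, and $\varepsilon > 0$ is chosen so that $\varepsilon < \operatorname{dist}(K, \mathbb{R}^n \setminus V)$ and also $\varepsilon < \operatorname{dist}(V, \mathbb{R}^n \setminus W)$. Here is a picture of the sets:

[Figure: the sets K, V, W and U.]

If $x \in K$, then $\psi_\varepsilon(y)\chi_V(x - y) = \psi_\varepsilon(y)$ for all $y \in \mathbb{R}^n$. In fact, if $|y| < \varepsilon$, then $x - y$ is at distance $< \varepsilon$ from $K$, hence in $V$ and $\chi_V(x - y) = 1$. Otherwise (if $|y| \ge \varepsilon$), then $\psi_\varepsilon(y) = 0$ and both sides of the proposed equality are 0, so it also holds. Thus, for $x \in K$,
\[ \varphi(x) = \psi_\varepsilon * \chi_V(x) = \int_{\mathbb{R}^n} \psi_\varepsilon(y)\chi_V(x - y)\,dy = \int_{\mathbb{R}^n} \psi_\varepsilon(y)\,dy = 1. \]
If $\varphi(x) \ne 0$, then $\int_{\mathbb{R}^n} \psi_\varepsilon(y)\chi_V(x - y)\,dy \ne 0$; there must be therefore at least one $y$ such that $\psi_\varepsilon(y) \ne 0$, thus $|y| < \varepsilon$, and $\chi_V(x - y) \ne 0$, thus $x - y \in V$. Thus $x$ is at distance $\le |y| < \varepsilon$ from $V$, hence has to be in $W$. This proves $\operatorname{supp} \varphi \subset \bar{W}$, hence $\operatorname{supp} \varphi$ is a compact subset of $U$. Since $\varphi \in C^\infty(\mathbb{R}^n)$ by Theorem 1, we are done.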

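This shows how malleable $C^\infty$ functions are. Next come the approximation results. For all of them, let $\psi \in C_c^\infty(\mathbb{R}^n)$, $\psi \ge 0$ (i.e., $\psi(x) \ge 0$ for all $x \in \mathbb{R}^n$), $\operatorname{supp} \psi \subset \overline{B(0, 1)}$ and $\int_{\mathbb{R}^n} \psi(x)\,dx = 1$. For $\varepsilon > 0$ define $\psi_\varepsilon$ by $\psi_\varepsilon(x) = \varepsilon^{-n} \psi(x/\varepsilon)$. Then $\psi_\varepsilon \in C_c^\infty(\mathbb{R}^n)$, $\psi_\varepsilon \ge 0$ and $\int_{\mathbb{R}^n} \psi_\varepsilon(x)\,dx = 1$. The existence of such a $\psi$ is guaranteed by Lemma 2.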

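Theorem 4 Let $f \in C_c(\mathbb{R}^n)$. For $\varepsilon > 0$ set $f_\varepsilon = f * \psi_\varepsilon$. Then $\lim_{\varepsilon \to 0+} f_\varepsilon(x) = f(x)$ uniformly in $\mathbb{R}^n$.

Proof. Because $\int_{\mathbb{R}^n} \psi_\varepsilon(y)\,dy = 1$, we can write for $x \in \mathbb{R}^n$,
\[ f_\varepsilon(x) - f(x) = \int_{\mathbb{R}^n} \psi_\varepsilon(y) f(x - y)\,dy - f(x) \int_{\mathbb{R}^n} \psi_\varepsilon(y)\,dy = \int_{\mathbb{R}^n} \psi_\varepsilon(y)[f(x - y) - f(x)]\,dy \]
for all $\varepsilon > 0$. Let $\eta > 0$ be given. Because $f$ is continuous of compact support, it is uniformly continuous; there is $\varepsilon_1 > 0$ such that $|f(y) - f(y')| < \eta$ if $y, y' \in \mathbb{R}^n$ and $|y - y'| \le \varepsilon_1$. Assume $0 < \varepsilon < \varepsilon_1$. Since $\operatorname{supp} \psi_\varepsilon \subset \overline{B(0, \varepsilon)} \subset B(0, \varepsilon_1)$ we see that in a product $\psi_\varepsilon(y)[f(x - y) - f(x)]$ either $\psi_\varepsilon(y) = 0$ (if $|y| \ge \varepsilon_1$) or $|(x - y) - x| = |y| < \varepsilon_1$ and then $|f(x - y) - f(x)| < \eta$. Thus

\[ \psi_\varepsilon(y)|f(x - y) - f(x)| \le \eta\,\psi_\varepsilon(y) \]
and
\[ |f_\varepsilon(x) - f(x)| \le \int_{\mathbb{R}^n} \psi_\varepsilon(y)|f(x - y) - f(x)|\,dy \le \eta \int_{\mathbb{R}^n} \psi_\varepsilon(y)\,dy = \eta. \]

We showed: Given $\eta > 0$, there exists $\varepsilon_1 > 0$ such that $|f_\varepsilon(x) - f(x)| \le \eta$ for all $x \in \mathbb{R}^n$ whenever $0 < \varepsilon < \varepsilon_1$. The theorem follows.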
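The uniform convergence in Theorem 4 can be observed numerically. The sketch below is an illustration in dimension $n = 1$ under several assumptions of my own: the test function is the tent function $\max(0, 1 - |x|)$, the integral defining $f * \psi_\varepsilon$ is replaced by a midpoint rule, and the normalization of $\psi_\varepsilon$ is done by dividing by the sum of the quadrature weights. The names `psi`, `tent`, `mollify` and `sup_error` are made up for the example.

```python
import math

def psi(x):
    """Unnormalized 1D bump supported in [-1, 1]; normalization is done numerically below."""
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1 else 0.0

def tent(x):
    """A continuous, compactly supported test function."""
    return max(0.0, 1.0 - abs(x))

def mollify(f, eps, x, steps=400):
    """Approximate (f * psi_eps)(x) = int psi_eps(y) f(x - y) dy by the midpoint rule on [-eps, eps]."""
    h = 2.0 * eps / steps
    num = 0.0
    den = 0.0
    for k in range(steps):
        y = -eps + (k + 0.5) * h
        wgt = psi(y / eps)
        num += wgt * f(x - y)
        den += wgt          # the same weights normalize psi_eps to total mass 1
    return num / den

def sup_error(f, eps, grid):
    """Largest observed |f_eps(x) - f(x)| over the sample points in grid."""
    return max(abs(mollify(f, eps, x) - f(x)) for x in grid)

grid = [-2.0 + 0.01 * i for i in range(401)]
errors = {eps: sup_error(tent, eps, grid) for eps in (0.5, 0.25, 0.1)}
```

Since the tent function satisfies $|f(x - y) - f(x)| \le |y|$, the argument in the proof gives $|f_\varepsilon(x) - f(x)| \le \varepsilon$, and the computed `errors` respect this bound and shrink with $\varepsilon$.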

We now need a few facts about measurable functions. A rather fundamental theorem is Lusin’s Theorem. You can find a proof in most measure theory textbooks, and there are also proofs available online. I am adapting the version found in W. Rudin’s Real and Complex Analysis, which is valid for regular Borel measures in locally compact spaces, to our case. As before, if $A \subset \mathbb{R}^n$ is measurable, I denote by $|A|$ its measure. Yes, the symbol $|\cdot|$ is quite overloaded!

Theorem 5 (Lusin’s Theorem) Let $f : \mathbb{R}^n \to \mathbb{R}$ be measurable and assume that $f$ is supported by a set of finite measure; that is, there is a measurable subset $A$ of $\mathbb{R}^n$, $|A| < \infty$, such that $f(x) = 0$ if $x \notin A$. For every $\varepsilon > 0$, there exists $g \in C_c(\mathbb{R}^n)$ such that
\[ |\{x \in \mathbb{R}^n : f(x) \ne g(x)\}| < \varepsilon \]
and such that
\[ \sup_{x \in \mathbb{R}^n} |g(x)| \le \sup_{x \in \mathbb{R}^n} |f(x)|. \]

One should not be fooled by the statement of this theorem. At first glance it seems to imply that a measurable function must be itself continuous at many points; shouldn’t $f$ be continuous everywhere that $g$ is continuous? The answer is no, as the following simple example shows. Let $f : \mathbb{R} \to \mathbb{R}$ be the Dirichlet function ($f(x) = 1$ if $x \in \mathbb{Q}$, $f(x) = 0$ if $x \in \mathbb{R} \setminus \mathbb{Q}$) and let $g(x) = 0$ for all $x \in \mathbb{R}$, so $g$ is very continuous at all points of $\mathbb{R}$. Then $|\{x \in \mathbb{R} : f(x) \ne g(x)\}| = 0$ yet $f$ is discontinuous at ALL points of $\mathbb{R}$.

The following two lemmas contain results from elementary measure theory that are also needed.

Lemma 6 Let $f : \mathbb{R}^n \to \mathbb{R}$ be integrable. Let $\varepsilon > 0$. There exists a measurable set $A \subset \mathbb{R}^n$ of finite measure ($|A| < \infty$) such that
\[ \int_{\mathbb{R}^n \setminus A} |f|\,dx < \varepsilon. \]

Proof. A simple proof relies on the definition of the integral of a positive function. We may assume $f \ge 0$, since only its absolute value plays a role. You might recall that if $f : \mathbb{R}^n \to [0, \infty)$ is measurable, then
\[ \int_{\mathbb{R}^n} f(x)\,dx = \sup\left\{ \int_{\mathbb{R}^n} s(x)\,dx : s \text{ a measurable simple function},\ 0 \le s \le f \right\}. \]
In words: The integral of $f$ is by definition the supremum of the set of integrals of all non-negative measurable simple functions that are everywhere less than or equal to $f$. A measurable simple function is one that can be written in the form
\[ s = \sum_{j=1}^m c_j \chi_{E_j}, \]
where $c_1, \ldots, c_m \in \mathbb{R}$, $E_1, \ldots, E_m$ are measurable sets, $E_i \cap E_j = \emptyset$ if $i \ne j$, and for a set $E$, $\chi_E$ denotes the characteristic function of $E$. By requiring that the sets $E_1, \ldots, E_m$ are pairwise disjoint, one ensures that the values taken by $s$ are precisely the numbers $c_1, \ldots, c_m$. One can also require that all the numbers $c_j$ be distinct,⁴ but that is not important here. If $s$ is expressed as above, and if all the $c_j$’s are non-negative, one defines

\[ \int_{\mathbb{R}^n} s(x)\,dx = \sum_{j=1}^m c_j |E_j|. \]

In this definition, if $|E_j| = \infty$, $c_j = 0$, one defines $0 \cdot \infty = 0$. This allows the extra assumption: All $c_j$’s are positive. If then any set $E_j$ has infinite measure, then $\int_{\mathbb{R}^n} s\,dx = \infty$. Now let $\varepsilon > 0$ be given. Since $\int_{\mathbb{R}^n} f\,dx < \infty$, we have $\int_{\mathbb{R}^n} f\,dx - \varepsilon < \int_{\mathbb{R}^n} f\,dx < \infty$ and by definition of supremum, there exists a measurable simple function $s$ such that $0 \le s \le f$ and
\[ \int_{\mathbb{R}^n} f\,dx - \varepsilon < \int_{\mathbb{R}^n} s\,dx \le \int_{\mathbb{R}^n} f\,dx < \infty. \]
Express $s$ in the form
\[ s = \sum_{j=1}^m c_j \chi_{E_j}. \]

We may assume that $c_j > 0$ for all $j$, except if $s \equiv 0$. Assuming first $s \not\equiv 0$, from

\[ \int_{\mathbb{R}^n} s(x)\,dx = \sum_{j=1}^m c_j |E_j| \le \int_{\mathbb{R}^n} f\,dx < \infty \]

we conclude that $|E_j| < \infty$ for $j = 1, \ldots, m$; thus, defining $A = \bigcup_{j=1}^m E_j$, we also have $|A| < \infty$. If $s \equiv 0$, define $A = \emptyset$. In either case we have $s(x) = 0$ if $x \notin A$, hence
\[ \int_{\mathbb{R}^n \setminus A} f(x)\,dx = \int_{\mathbb{R}^n \setminus A} (f(x) - s(x))\,dx \le \int_{\mathbb{R}^n} (f(x) - s(x))\,dx = \int_{\mathbb{R}^n} f(x)\,dx - \int_{\mathbb{R}^n} s(x)\,dx < \varepsilon. \]
We are done.

⁴ Actually, for the purpose of uniqueness, one assumes that all the $c_j$ are distinct, then proves that this distinctness is not necessary; it isn’t even necessary for the $E_j$’s to be pairwise disjoint as long as they are measurable.

Lemma 7 Let $f : \mathbb{R}^n \to \mathbb{R}$ be integrable. For every $\eta > 0$, there exists $\delta > 0$ such that if $E$ is measurable and $|E| < \delta$, then
\[ \int_E |f|\,dx < \eta. \]

Proof. This lemma is a particular case of a more general fact about absolute continuity. To prove it, we may again assume $f \ge 0$ and define a map $\nu$ from measurable subsets of $\mathbb{R}^n$ to real numbers by
\[ \nu(E) = \int_E f(x)\,dx \]
if $E$ is a measurable subset of $\mathbb{R}^n$. It is immediate that $\nu(E) \ge 0$ for all measurable $E \subset \mathbb{R}^n$. It takes a bit more effort to prove that $\nu$ is a measure; specifically that if $E_1, E_2, \ldots$ is a sequence of pairwise disjoint measurable sets, then

\[ \nu\Bigl( \bigcup_{j=1}^\infty E_j \Bigr) = \sum_{j=1}^\infty \nu(E_j). \]

As a measure, it is a finite measure, because for every measurable set $E$,
\[ \nu(E) = \int_E f(x)\,dx \le \int_{\mathbb{R}^n} f(x)\,dx < \infty. \]
To be contrarian, I will proceed by contradiction, so assume that for some $\eta > 0$ there is no $\delta > 0$ that works. For every $k \in \mathbb{N}$ we can then find a measurable set $E_k$ such that $|E_k| < 1/2^k$ and
\[ \int_{E_k} f(x)\,dx \ge \eta. \]

Now for $m \in \mathbb{N}$, let
\[ F_m = \bigcup_{k=m}^\infty E_k; \]
then $F_1 \supset F_2 \supset \cdots$ and
\[ |F_m| \le \sum_{k=m}^\infty |E_k| \le \sum_{k=m}^\infty \frac{1}{2^k} = \frac{1}{2^{m-1}}. \]

Since $|F_1| \le 1 < \infty$ and $\{F_m\}$ is a decreasing sequence of sets, it is a fact from general measure theory that setting

\[ F = \bigcap_{m=1}^\infty F_m, \]
one has
\[ |F| = \lim_{m \to \infty} |F_m|, \quad \text{thus } |F| = 0. \]

But $\nu$ being a finite measure (so $\nu(F_1) < \infty$), one also has
\[ \nu(F) = \lim_{m \to \infty} \nu(F_m); \quad \text{i.e.,} \quad \int_F f(x)\,dx = \lim_{m \to \infty} \int_{F_m} f(x)\,dx. \]

And here we have a contradiction; since $E_m \subset F_m$, we have
\[ \int_{F_m} f(x)\,dx \ge \int_{E_m} f(x)\,dx \ge \eta \]
for all $m$, thus
\[ \int_F f(x)\,dx = \lim_{m \to \infty} \int_{F_m} f(x)\,dx \ge \eta. \]
On the other hand, $|F| = 0$ so that
\[ \int_F f(x)\,dx = 0. \]
(That the integral of a non-negative measurable function over a null set is 0 is best established from the definition; it is sort of immediate.) This contradiction proves that there cannot be an $\eta$ without a $\delta$. We are done.

Here are two important consequences of Lusin’s Theorem.

Theorem 8 Let $p \in [1, \infty)$ and let $E$ be a measurable subset of $\mathbb{R}^n$, $|E| > 0$, possibly $E = \mathbb{R}^n$. The set of restrictions of functions of $C_c(\mathbb{R}^n)$ to $E$ is dense in $L^p(E)$. This means, precisely:

If $f \in L^p(E)$ and $\varepsilon > 0$, there exists $g \in C_c(\mathbb{R}^n)$ such that
\[ \|f - g\|_{L^p(E)} = \left( \int_E |f - g|^p\,dx \right)^{1/p} < \varepsilon. \]

Equivalently, for every $f \in L^p(E)$, there is a sequence $\{g_m\}$ in $C_c(\mathbb{R}^n)$ such that $\{g_m|_E\}$ converges to $f$ in the $L^p(E)$ norm.

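Proof. It suffices to prove the theorem assuming $E = \mathbb{R}^n$. In fact, assume it proved in this case, let $E \subset \mathbb{R}^n$, $|E| > 0$. Let $f \in L^p(E)$. Define $\tilde{f} : \mathbb{R}^n \to \mathbb{R}$ by $\tilde{f}(x) = f(x)$ if $x \in E$, $\tilde{f}(x) = 0$ if $x \in \mathbb{R}^n \setminus E$. (One usually says: extend $f$ as zero to all of $\mathbb{R}^n$.) It is a simple exercise in very elementary measure theory to verify that $\tilde{f}$ is measurable and in $L^p(\mathbb{R}^n)$; in fact $\|\tilde{f}\|_{L^p(\mathbb{R}^n)} = \|f\|_{L^p(E)}$. By our assumption, given $\varepsilon > 0$, there exists $g \in C_c(\mathbb{R}^n)$ such that $\|\tilde{f} - g\|_{L^p(\mathbb{R}^n)} < \varepsilon$. But then
\[ \|f - g|_E\|_{L^p(E)} = \left( \int_E |f(x) - g(x)|^p\,dx \right)^{1/p} = \left( \int_E |\tilde{f}(x) - g(x)|^p\,dx \right)^{1/p} \le \left( \int_{\mathbb{R}^n} |\tilde{f}(x) - g(x)|^p\,dx \right)^{1/p} < \varepsilon. \]

Assume thus that $f \in L^p(\mathbb{R}^n)$ and let $\varepsilon > 0$. Suppose first $f$ is bounded, say $|f(x)| \le M$ for all $x \in \mathbb{R}^n$, with $M > 0$. Since $|f|^p$ is integrable, by Lemma 6, there exists a measurable set $A$ of finite measure such that

\[ \int_{\mathbb{R}^n \setminus A} |f(x)|^p\,dx < \left( \frac{\varepsilon}{3} \right)^p. \tag{5} \]

Let $h = \chi_A f$; thus $h : \mathbb{R}^n \to \mathbb{R}$ satisfies $h(x) = f(x)$ if $x \in A$, $h(x) = 0$ otherwise. Lusin’s Theorem applies to $h$; there is thus $g \in C_c(\mathbb{R}^n)$ such that
\[ |\{x \in \mathbb{R}^n : h(x) \ne g(x)\}| < \left( \frac{\varepsilon}{6M} \right)^p \]
and such that
\[ \sup_{x \in \mathbb{R}^n} |g(x)| \le \sup_{x \in \mathbb{R}^n} |h(x)|. \]

Since $h = \chi_A f$, $h$ is also bounded by $M$, hence so is $g$. Let $S = \{x \in \mathbb{R}^n : h(x) \ne g(x)\}$ so that $|S| < \left( \frac{\varepsilon}{6M} \right)^p$. Then
\[ \int_A |f(x) - g(x)|^p\,dx = \int_A |h(x) - g(x)|^p\,dx = \int_{S \cap A} |h(x) - g(x)|^p\,dx \le \int_{S \cap A} (2M)^p\,dx \le (2M)^p |S| < (2M)^p \left( \frac{\varepsilon}{6M} \right)^p = \left( \frac{\varepsilon}{3} \right)^p. \tag{6} \]
We also have, since $h(x) = 0$ outside of $A$, so that if $x \notin A$ and $g(x) \ne 0$, then $x \in S \setminus A$,
\[ \int_{\mathbb{R}^n \setminus A} |g(x)|^p\,dx = \int_{S \setminus A} |g(x)|^p\,dx \le M^p |S \setminus A| \le M^p |S| < \left( \frac{\varepsilon}{6} \right)^p. \tag{7} \]
Inequalities (5), (6) and (7) can be written, respectively, in the form
\[ \|(1 - \chi_A) f\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{3}, \qquad \|\chi_A (f - g)\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{3}, \qquad \|(1 - \chi_A) g\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{6}. \]
Thus

\[ \begin{aligned}
\|f - g\|_{L^p(\mathbb{R}^n)} &= \|(1 - \chi_A)(f - g) + \chi_A (f - g)\|_{L^p(\mathbb{R}^n)} \\
&\le \|(1 - \chi_A) f\|_{L^p(\mathbb{R}^n)} + \|(1 - \chi_A) g\|_{L^p(\mathbb{R}^n)} + \|\chi_A (f - g)\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{3} + \frac{\varepsilon}{6} + \frac{\varepsilon}{3} < \varepsilon.
\end{aligned} \]
This takes care of the case in which $f \in L^p(\mathbb{R}^n)$ is bounded. In the general case, by Lemma 7 applied to $|f|^p$, given $\varepsilon > 0$, there is $\delta > 0$ such that if $E$ is measurable, and $|E| < \delta$, then
\[ \int_E |f(x)|^p\,dx < \left( \frac{\varepsilon}{2} \right)^p; \]
that is,
\[ \|\chi_E f\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{2}. \]
There is a measurable set $E$ of measure $|E| < \delta$ such that $f$ is bounded on $\mathbb{R}^n \setminus E$. In fact, if $E_M = \{x \in \mathbb{R}^n : |f(x)| > M\}$, then
\[ \infty > \int_{\mathbb{R}^n} |f(x)|^p\,dx \ge \int_{E_M} |f(x)|^p\,dx \ge M^p |E_M| \]
so that
\[ |E_M| \le \frac{1}{M^p} \int_{\mathbb{R}^n} |f(x)|^p\,dx \to 0 \]
as $M \to \infty$; selecting $M$ large enough we get $|E_M| < \delta$ and $|f(x)| \le M$ on $\mathbb{R}^n \setminus E_M$. Take $E = E_M$. Then $h = (1 - \chi_E) f$ is bounded, and there exists thus $g \in C_c(\mathbb{R}^n)$ such that
\[ \|h - g\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{2}. \]
Then
\[ \|f - g\|_{L^p(\mathbb{R}^n)} = \|\chi_E f + h - g\|_{L^p(\mathbb{R}^n)} \le \|\chi_E f\|_{L^p(\mathbb{R}^n)} + \|h - g\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{2} + \frac{\varepsilon}{2} = \varepsilon. \]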

This result is false if $p = \infty$; $C_c(\mathbb{R}^n)$ is NOT dense in $L^\infty(\mathbb{R}^n)$.

The next consequence of Lusin’s theorem (rather of Theorem 8) is:

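Theorem 9 Let $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$. For $x \in \mathbb{R}^n$ define $f_x : \mathbb{R}^n \to \mathbb{R}$ by $f_x(y) = f(y + x)$. The map $x \mapsto f_x : \mathbb{R}^n \to L^p(\mathbb{R}^n)$ is continuous. In other words, this theorem states no more nor less than: Let $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$. For every $\varepsilon > 0$ there is $\delta > 0$ such that if $|x| < \delta$, then
\[ \|f_x - f\|_{L^p(\mathbb{R}^n)} = \left( \int_{\mathbb{R}^n} |f(y + x) - f(y)|^p\,dy \right)^{1/p} < \varepsilon. \]

Proof. Let $\varepsilon > 0$ be given. By Theorem 8, there exists $g \in C_c(\mathbb{R}^n)$ such that
\[ \|f - g\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{3}. \]
Let $K = \operatorname{supp} g$; since $K$ is compact, so is the set $K_1 = \{x \in \mathbb{R}^n : \operatorname{dist}(x, K) \le 1\}$. Being compact, its measure $|K_1|$ is finite. Since $g$ is continuous of compact support, it is uniformly continuous; there is thus $\delta > 0$ such that if $x, y \in \mathbb{R}^n$, and $|x - y| < \delta$, then
\[ |g(x) - g(y)| < \frac{\varepsilon}{3(|K_1| + 1)^{1/p}}. \]
We may and will assume that $\delta \le 1$. Let $|x| < \delta$. If $y \in \mathbb{R}^n \setminus K_1$, then $y, x + y \notin K$ and $g(x + y) - g(y) = 0$. If $y \in K_1$, then we use that $|x + y - y| = |x| < \delta$ and $|g(x + y) - g(y)| < \varepsilon / [3(|K_1| + 1)^{1/p}]$. It follows that if $|x| < \delta$, then

\[ \begin{aligned}
\|g_x - g\|_{L^p(\mathbb{R}^n)} &= \left( \int_{\mathbb{R}^n} |g(x + y) - g(y)|^p\,dy \right)^{1/p} = \left( \int_{K_1} |g(x + y) - g(y)|^p\,dy \right)^{1/p} \le \left( \int_{K_1} \frac{\varepsilon^p}{3^p (|K_1| + 1)}\,dy \right)^{1/p} \\
&= \frac{\varepsilon}{3} \left( \frac{|K_1|}{|K_1| + 1} \right)^{1/p} < \frac{\varepsilon}{3}.
\end{aligned} \]

A simple change of variables proves that $\|f_x - g_x\|_{L^p(\mathbb{R}^n)} = \|f - g\|_{L^p(\mathbb{R}^n)}$ so that

\[ \begin{aligned}
\|f_x - f\|_{L^p(\mathbb{R}^n)} &\le \|f_x - g_x\|_{L^p(\mathbb{R}^n)} + \|g_x - g\|_{L^p(\mathbb{R}^n)} + \|g - f\|_{L^p(\mathbb{R}^n)} \\
&= \|g_x - g\|_{L^p(\mathbb{R}^n)} + 2\|g - f\|_{L^p(\mathbb{R}^n)} < \frac{\varepsilon}{3} + \frac{2\varepsilon}{3} = \varepsilon.
\end{aligned} \]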

We now come to what could be the main mollification result.

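Theorem 10 Let (as usual) $\psi$ be the mollifier of Lemma 2, $\psi_\varepsilon(x) = \varepsilon^{-n} \psi(x/\varepsilon)$ for $\varepsilon > 0$, $x \in \mathbb{R}^n$. If $f \in L^1_{\mathrm{loc}}(\mathbb{R}^n)$, set $f_\varepsilon = f * \psi_\varepsilon$. Then:

1. If $f \in L^p(\mathbb{R}^n)$, $1 \le p \le \infty$, then $f_\varepsilon \in C^\infty(\mathbb{R}^n) \cap L^p(\mathbb{R}^n)$ and

\[ \|f_\varepsilon\|_{L^p(\mathbb{R}^n)} \le \|f\|_{L^p(\mathbb{R}^n)}. \tag{8} \]

2. If $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$, then
\[ \lim_{\varepsilon \to 0+} \|f_\varepsilon - f\|_{L^p(\mathbb{R}^n)} = 0. \tag{9} \]
This result is not true if $p = \infty$.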

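Proof. Concerning 1, we already know $f_\varepsilon \in C^\infty(\mathbb{R}^n)$ by Theorem 1. That $f_\varepsilon \in L^p$ and (8) holds is a consequence of Hausdorff-Young (3) with $q = 1$ and $r = p$, since $\int_{\mathbb{R}^n} \psi_\varepsilon(y)\,dy = 1$.

To prove 2, we first notice that it is true if $f \in C_c(\mathbb{R}^n)$. In fact, assuming $f \in C_c(\mathbb{R}^n)$, let $K = \operatorname{supp} f$ and let $K_1$ be the set of all points at distance at most 1 from $K$. Then

\[ \operatorname{supp} f_\varepsilon \subset \operatorname{supp} f + \operatorname{supp} \psi_\varepsilon \subset K + \overline{B(0, \varepsilon)} \subset K_1 \]

if $0 < \varepsilon \le 1$. It follows that for $\varepsilon \in (0, 1]$, $\operatorname{supp}(f_\varepsilon - f) \subset K_1$ and we see that

\[ \|f_\varepsilon - f\|_{L^p(\mathbb{R}^n)} = \left( \int_{K_1} |f_\varepsilon(x) - f(x)|^p\,dx \right)^{1/p} \le \sup_{x \in K_1} |f_\varepsilon(x) - f(x)|\,|K_1|^{1/p}. \]
By Theorem 4, we see that (9) holds in this case.

Passing to the general case, let $f \in L^p(\mathbb{R}^n)$, $1 \le p < \infty$. Then, by Theorem 8, given $\eta > 0$, there is $g \in C_c(\mathbb{R}^n)$ such that
\[ \|f - g\|_{L^p(\mathbb{R}^n)} < \frac{\eta}{3}. \]
Then (we use (8) in the penultimate inequality below):

\[ \begin{aligned}
\|f - f_\varepsilon\|_{L^p(\mathbb{R}^n)} &\le \|f - g\|_{L^p(\mathbb{R}^n)} + \|g - g_\varepsilon\|_{L^p(\mathbb{R}^n)} + \|g_\varepsilon - f_\varepsilon\|_{L^p(\mathbb{R}^n)} \\
&= \|f - g\|_{L^p(\mathbb{R}^n)} + \|g - g_\varepsilon\|_{L^p(\mathbb{R}^n)} + \|(g - f) * \psi_\varepsilon\|_{L^p(\mathbb{R}^n)} \\
&\le 2\|f - g\|_{L^p(\mathbb{R}^n)} + \|g - g_\varepsilon\|_{L^p(\mathbb{R}^n)} < \frac{2\eta}{3} + \|g - g_\varepsilon\|_{L^p(\mathbb{R}^n)}.
\end{aligned} \]

By the already proved $C_c$ case of this theorem, there is $\varepsilon_1 > 0$ so that if $0 < \varepsilon < \varepsilon_1$ then
\[ \|g - g_\varepsilon\|_{L^p(\mathbb{R}^n)} < \frac{\eta}{3}, \]

proving that $\|f - f_\varepsilon\|_{L^p(\mathbb{R}^n)} < \eta$ if $0 < \varepsilon < \varepsilon_1$, and (9) follows. Note: One can avoid proving the result first for compactly supported continuous functions and shorten the proof somewhat, if one uses Theorem 9.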