NOTES ON DISTRIBUTIONS MATH 565, FALL 2017

1. Introduction and examples

The notion of a distribution (also called a generalized function) was introduced to make sense of objects such as the Dirac delta function, which in some sense we very much want to be a function, but to which we cannot apply the usual rules of integration and differentiation.

Given a nonempty open set $\Omega \subset \mathbb{R}^n$, let $C_c^\infty(\Omega)$ denote the collection of smooth compactly supported functions $\phi : \Omega \to \mathbb{C}$ satisfying $\operatorname{supp}(\phi) \subset \Omega$. In the theory of distributions, such functions are also called test functions, and very often the notation $\mathcal{D}(\Omega)$ is used instead, so treat it as the same as $C_c^\infty(\Omega)$. It is not hard to check that $\mathcal{D}(\Omega)$ is always a vector space, but it is not a trivial task to endow $\mathcal{D}(\Omega)$ with a topology for which $\mathcal{D}(\Omega)$ is complete in the sense that every Cauchy sequence converges. It turns out this can be done by passing to the theory of topological vector spaces, that is, vector spaces whose topology is not necessarily induced by a norm (or even a metric!). Nonetheless, good sense can be made of convergence in the following way.

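Definition 1.1. A sequence $\{\phi_j\}$ in $\mathcal{D}(\Omega)$ is said to converge to a function $\phi \in \mathcal{D}(\Omega)$ if there exists a compact set $K \subset \Omega$ such that $\operatorname{supp}(\phi_j) \subset K$ for every $j \in \mathbb{N}$ and
\[ \lim_{j\to\infty} \|\partial^\alpha \phi_j - \partial^\alpha \phi\|_\infty = \lim_{j\to\infty} \sup_{x\in\Omega} |\partial^\alpha \phi_j(x) - \partial^\alpha \phi(x)| = 0 \quad \text{for every multi-index } \alpha. \tag{1.1} \]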

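Note that since $\operatorname{supp}(\phi_j) \subset K$ for each $j$, we must have $\operatorname{supp}(\phi) \subset K$ as well, and hence we could replace $\sup_{x\in\Omega}$ by $\sup_{x\in K}$ here.

Definition 1.2. A distribution on $\Omega$ is a linear map $u : \mathcal{D}(\Omega) \to \mathbb{C}$ satisfying the following: for each compact $K \subset \Omega$, there exist a constant $C_K$ and an integer $N_K \ge 0$ such that
\[ |\langle u, \phi\rangle| \le C_K \sum_{|\alpha| \le N_K} \|\partial^\alpha \phi\|_\infty \tag{1.2} \]
for every $\phi \in \mathcal{D}(\Omega)$ with $\operatorname{supp}(\phi) \subset K$.

The space of distributions over $\Omega$ is denoted by $\mathcal{D}'(\Omega)$; it is a vector space under the natural addition and scalar multiplication operations
\[ \langle c_1 u_1 + c_2 u_2, \phi\rangle := c_1\langle u_1, \phi\rangle + c_2\langle u_2, \phi\rangle, \qquad u_j \in \mathcal{D}'(\Omega),\ c_j \in \mathbb{C},\ j = 1, 2. \tag{1.3} \]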

Note that our definition of convergence means that $u$ is sequentially continuous, in that if $\phi_j \to \phi$ in the sense of (1.1), then $\lim_{j\to\infty} \langle u, \phi_j\rangle = \langle u, \phi\rangle$. Since $u$ is a linear map $u : \mathcal{D}(\Omega) \to \mathbb{C}$, one might say that the notation for the action of $u$ on a test function $\phi$ should be $u(\phi)$ rather than $\langle u, \phi\rangle$, but there are advantages to viewing $u$ as a "pairing" with test functions.

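Definition 1.3. A sequence $\{u_k\}_{k=1}^\infty \subset \mathcal{D}'(\Omega)$ converges to $u$ in the topology of $\mathcal{D}'(\Omega)$ if
\[ \lim_{k\to\infty} \langle u_k, \phi\rangle = \langle u, \phi\rangle \quad \text{for every } \phi \in \mathcal{D}(\Omega). \]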

Remark 1.4. Note that if $\phi_p$ or $u_p$ instead depends on a parameter $p$ lying in a metric space with limit point $p_0$, then we can define $\lim_{p\to p_0} \phi_p$ and $\lim_{p\to p_0} u_p$ in the natural way: these limits are $\phi$ and $u$ respectively if and only if every sequence $p_k \to p_0$ as $k \to \infty$ satisfies $\phi_{p_k} \to \phi$ and $u_{p_k} \to u$ respectively.

1.1. Examples.

Example 1.5. Given an open set $\Omega \subset \mathbb{R}^n$, define the locally integrable functions on $\Omega$ as those which are integrable over any compact set $K \subset \Omega$:
\[ L^1_{loc}(\Omega) := \{ f \text{ measurable on } \Omega : f\mathbf{1}_K \in L^1(\mathbb{R}^n) \text{ for any compact } K \subset \Omega \}. \]
Any $f \in L^1_{loc}(\Omega)$ defines a distribution via $\langle f, \phi\rangle := \int_\Omega f(x)\phi(x)\,dx$. To see that this operation satisfies (1.2), observe that if $\operatorname{supp}(\phi) \subset K$, then $\phi = \phi\mathbf{1}_K$ and hence
\[ |\langle f, \phi\rangle| = \left| \int_\Omega f(x)\mathbf{1}_K(x)\phi(x)\,dx \right| = \left| \int_K f(x)\phi(x)\,dx \right| \le \left( \int_K |f(x)|\,dx \right) \|\phi\|_\infty \]
by Hölder's inequality. Hence (1.2) is satisfied with $N_K = 0$ and $C_K = \int_K |f(x)|\,dx$.

The previous example is of fundamental importance in the theory of distributions. Frequently we will conflate the idea of a locally integrable function with the distribution it defines.

Example 1.6. Given any open $\Omega \subset \mathbb{R}^n$ containing the origin, the Dirac distribution $\delta$ is defined by $\langle \delta, \phi\rangle = \phi(0)$. It is nearly trivial that $|\langle \delta, \phi\rangle| \le \|\phi\|_\infty$, and hence $\delta$ satisfies the inequality (1.2) with $C_K = 1$ and $N_K = 0$.

Moreover, if $\psi$ is integrable with $\int \psi(x)\,dx = a \in \mathbb{C}$, define $\psi_\epsilon(x) = \epsilon^{-n}\psi(x/\epsilon)$. Treating $\psi_\epsilon$ as a distribution as in the previous example, the $\psi_\epsilon$ converge to $a\delta$ in the sense of distributions:
\[ \lim_{\epsilon\to 0^+} \langle \psi_\epsilon, \phi\rangle = \lim_{\epsilon\to 0^+} \int \epsilon^{-n}\psi(x/\epsilon)\phi(x)\,dx = \lim_{\epsilon\to 0^+} \int \psi(x)\phi(\epsilon x)\,dx = \phi(0)\int \psi(x)\,dx = \langle a\delta, \phi\rangle, \]
where the penultimate identity follows from a standard application of the dominated convergence theorem. In the special case $a = 1$, this rigorously establishes the common non-rigorous definition of the Dirac distribution as a limit of functions which are increasingly peaked at the origin.
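The convergence $\psi_\epsilon \to a\delta$ in Example 1.6 can be observed numerically. The following is a minimal sketch in dimension $n = 1$, not part of the original notes: the profile `psi` (three times a Gaussian density, so $a = 3$), the bump test function `phi`, and the trapezoid helper `integrate` are all choices made here for illustration.

```python
import math

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def psi(x):
    # Integrable profile with total mass a = 3 (illustrative choice).
    return 3.0 * math.exp(-x * x / 2.0) / math.sqrt(2.0 * math.pi)

def phi(x):
    # Smooth bump supported in (-1, 1); a standard test function.
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def pair(eps):
    # <psi_eps, phi> after the substitution x = eps * y:
    # integral of psi(y) * phi(eps * y) dy, truncated to [-12, 12].
    return integrate(lambda y: psi(y) * phi(eps * y), -12.0, 12.0)

# Distance between <psi_eps, phi> and <a delta, phi> = 3 * phi(0).
errors = [abs(pair(eps) - 3.0 * phi(0.0)) for eps in (0.5, 0.1, 0.02)]
print(errors)
```

The printed errors should shrink as $\epsilon$ decreases, in line with the dominated convergence argument above.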

Example 1.7. Let $\Omega = (0, \infty) \subset \mathbb{R}$. Let $u$ be the linear map on $\mathcal{D}(\Omega)$ defined by
\[ \langle u, \phi\rangle = \sum_{n=1}^\infty n\,\phi(1/n). \]
That $u$ is indeed well-defined is implicit in the ensuing observation establishing (1.2) for this distribution. Suppose $K \subset (0, \infty)$ is compact. Then $d(K, (-\infty, 0]) > 0$, so there exists $N \in \mathbb{N}$ large such that $K \subset [1/N, \infty)$, and hence for any $\phi$ supported in $K$,
\[ |\langle u, \phi\rangle| = \left| \sum_{n=1}^N n\,\phi(1/n) \right| \le \sum_{n=1}^N n\,|\phi(1/n)| \le \|\phi\|_\infty \sum_{n=1}^N n = \frac{N(N+1)}{2}\|\phi\|_\infty. \]
Therefore (1.2) is satisfied with $C_K = \frac{N(N+1)}{2}$ and $N_K = 0$. Note that $u$ does not define a distribution on $\mathbb{R}$.

Example 1.8. Let $\Omega \subset \mathbb{R}^n$ and let $S$ be any parameterized $k$-surface with surface measure $dS$. Then
\[ \langle u, \phi\rangle := \int_S \phi\,dS \]
is easily verified to define a distribution. In particular, if $\operatorname{supp}(\phi) \subset K$, then the constants in (1.2) can be taken as $N_K = 0$ and $C_K = \int_{K\cap S} 1\,dS$ (see footnote 1). Cases of special interest will be when $S = S^{n-1}$ is the unit sphere or, more generally, $S$ is a quadratic surface.

Exercise 1.9. Check that $\langle u, \phi\rangle := \sum_{n=1}^\infty n\,\phi^{(n)}(n)$ defines a distribution on $\mathbb{R}$ satisfying (1.2) for any compact $K \subset \mathbb{R}$ (where $\phi^{(n)}$ is the $n$-th derivative of $\phi$).

Footnote 1. There is a slight technicality here in that it is not obvious that $\int_{K\cap S} 1\,dS$ is always well-defined, but it is possible to enlarge $K$ so that the integral is indeed well-defined.

2. Operations on distributions

2.1. Multiplication by a smooth function. As we saw, any function $f \in L^1_{loc}(\Omega)$ defines a distribution via
\[ \langle f, \phi\rangle = \int f(x)\phi(x)\,dx. \]
If $\psi \in C^\infty(\Omega)$, we of course define the product of $\psi$ and $f$ as simply the function $(\psi f)(x) = \psi(x)f(x)$. Its corresponding action on $\mathcal{D}(\Omega)$ is then
\[ \langle \psi f, \phi\rangle = \int \psi(x)f(x)\phi(x)\,dx = \int f(x)\psi(x)\phi(x)\,dx = \langle f, \psi\phi\rangle. \]
Indeed, the product $\psi\phi$ defines a test function and $\operatorname{supp}(\psi\phi) \subset \operatorname{supp}(\phi)$, so the right hand side here is meaningful.

Definition 2.1. Given $u \in \mathcal{D}'(\Omega)$ and $\psi \in C^\infty(\Omega)$, the product $\psi u$ is defined to be the distribution
\[ \langle \psi u, \phi\rangle := \langle u, \psi\phi\rangle. \]
Note that this does indeed define a distribution in that it satisfies (1.2): if $K \subset \Omega$ is compact, and $C_K, N_K$ are the constants satisfied by $u$ in (1.2), then $\sup_{|\alpha|\le N_K} \sup_{x\in K} |\partial^\alpha \psi(x)|$ is finite, and hence for $\operatorname{supp}(\phi) \subset K$, the Leibniz rule implies the existence of a constant $M$ depending on this quantity such that
\[ |\langle \psi u, \phi\rangle| = |\langle u, \psi\phi\rangle| \le C_K \sum_{|\alpha|\le N_K} \|\partial^\alpha(\psi\phi)\|_\infty \le M C_K \sum_{|\alpha|\le N_K} \|\partial^\alpha \phi\|_\infty. \]
Hence $\psi u$ also satisfies (1.2), with the constant $M C_K$ replacing $C_K$.

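2.2. Differentiation of a distribution. Now suppose $f \in C^k(\Omega)$ is such that $\partial^\beta f \in L^1_{loc}(\Omega)$ for every multi-index $|\beta| \le k$. Then for any fixed $|\beta| \le k$, $\partial^\beta f$ defines a distribution via
\[ \langle \partial^\beta f, \phi\rangle = \int \partial^\beta f(x)\phi(x)\,dx. \]
However, an integration by parts shows that
\[ \langle \partial^\beta f, \phi\rangle = \int \partial^\beta f(x)\phi(x)\,dx = (-1)^{|\beta|} \int f(x)\partial^\beta \phi(x)\,dx = (-1)^{|\beta|} \langle f, \partial^\beta \phi\rangle. \]
Indeed, $\partial^\beta \phi$ defines a test function, and hence we are led to make the following definition.

Definition 2.2. Given $u \in \mathcal{D}'(\Omega)$ and a multi-index $\beta$, the partial derivative $\partial^\beta u$ is defined as
\[ \langle \partial^\beta u, \phi\rangle := (-1)^{|\beta|} \langle u, \partial^\beta \phi\rangle. \]
Once again there is the matter of checking that this operation actually defines a distribution on $\Omega$, so suppose $K \subset \Omega$ is any compact subset and that $C_K, N_K$ are the constants satisfied by $u$ in (1.2). Then for $\operatorname{supp}(\phi) \subset K$,
\[ |\langle \partial^\beta u, \phi\rangle| = |\langle u, \partial^\beta \phi\rangle| \le C_K \sum_{|\alpha|\le N_K} \|\partial^\alpha(\partial^\beta \phi)\|_\infty \le C_K \sum_{|\alpha|\le N_K + |\beta|} \|\partial^\alpha \phi\|_\infty. \]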

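Example 2.3. Let $H(x)$ be the Heaviside function on $\mathbb{R}$,
\[ H(x) = \begin{cases} 1, & \text{if } x \ge 0, \\ 0, & \text{if } x < 0. \end{cases} \]
Since $H$ is locally integrable, $\langle H, \phi\rangle = \int_{-\infty}^\infty H(x)\phi(x)\,dx = \int_0^\infty \phi(x)\,dx$ defines a distribution on $\mathbb{R}$. Let us compute $\partial H$, the first derivative of $H$:
\[ \langle \partial H, \phi\rangle = -\langle H, \partial\phi\rangle = -\int_0^\infty \phi'(x)\,dx = -\phi(x)\Big|_0^\infty = \phi(0), \]
since $\phi$ is compactly supported and hence $\phi(x) \to 0$ as $x \to \infty$. In other words, $\partial H = \delta$.

2.3. Translation, reflection, and dilation. Given a function $f : \mathbb{R}^n \to \mathbb{C}$, $y \in \mathbb{R}^n$, and $r > 0$, define $\tau_y f$, $\tilde f$, and $\mu_r f$ to be the functions
\[ \tau_y f(x) := f(x - y), \qquad \tilde f(x) := f(-x), \qquad \mu_r f(x) := f(rx), \]
which of course define translation, reflection, and dilation operations on $f$. Once again, if $f \in L^1_{loc}(\mathbb{R}^n)$, then so are $\tau_y f$, $\tilde f$, and $\mu_r f$, and
\[ \langle \tau_y f, \phi\rangle = \int_{\mathbb{R}^n} f(x - y)\phi(x)\,dx = \int_{\mathbb{R}^n} f(x)\phi(x + y)\,dx = \langle f, \tau_{-y}\phi\rangle, \]
\[ \langle \tilde f, \phi\rangle = \int_{\mathbb{R}^n} f(-x)\phi(x)\,dx = \int_{\mathbb{R}^n} f(x)\phi(-x)\,dx = \langle f, \tilde\phi\rangle, \]
\[ \langle \mu_r f, \phi\rangle = \int_{\mathbb{R}^n} f(rx)\phi(x)\,dx = r^{-n} \int_{\mathbb{R}^n} f(x)\phi(x/r)\,dx = r^{-n} \langle f, \mu_{r^{-1}}\phi\rangle. \]
Again the consideration of locally integrable functions $f$ leads us to make the following definitions.

Definition 2.4. Given $u \in \mathcal{D}'(\mathbb{R}^n)$, $y \in \mathbb{R}^n$, and $r > 0$, define the following operations:
\[ \langle \tau_y u, \phi\rangle := \langle u, \tau_{-y}\phi\rangle \quad \text{(translation by } y\text{)}, \]
\[ \langle \tilde u, \phi\rangle := \langle u, \tilde\phi\rangle \quad \text{(reflection)}, \]
\[ \langle \mu_r u, \phi\rangle := r^{-n} \langle u, \mu_{r^{-1}}\phi\rangle \quad \text{(dilation by } r > 0\text{)}. \]
Moreover, if there exists $\lambda$ such that $\mu_r u = r^\lambda u$ for every $r > 0$, then $u$ is said to be homogeneous of degree $\lambda$.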

Exercise 2.5. Verify that $\tau_y u$, $\tilde u$, and $\mu_r u$ all satisfy (1.2), and establish how the constants appearing there depend on the original constants $C_K$ and $N_K$ satisfied by $u$.

Example 2.6. Let $\delta$ denote the Dirac distribution at the origin, $\langle \delta, \phi\rangle = \phi(0)$. Then it is easily verified that $\delta$ is invariant under reflection and that $\langle \tau_y \delta, \phi\rangle = \phi(y)$ (so that $\tau_y \delta$ is a Dirac distribution "centered at $y$"). Finally, $\delta$ is homogeneous of degree $-n$, since for every $\phi \in \mathcal{D}(\mathbb{R}^n)$ we have
\[ \langle \mu_r \delta, \phi\rangle = r^{-n} \langle \delta, \mu_{r^{-1}}\phi\rangle = r^{-n} \phi(x/r)\Big|_{x=0} = r^{-n}\phi(0) = \langle r^{-n}\delta, \phi\rangle. \]

Exercise 2.7. Let $f : \mathbb{R}^n \setminus \{0\} \to \mathbb{C}$ be a function which is locally integrable and homogeneous of degree $\lambda$. Check that this implies $\operatorname{Re}(\lambda) > -n$. Then verify that in the sense of distributions $\mu_r f = r^\lambda f$.
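The operations in Definitions 2.2 and 2.4 all act by moving the operation onto the test function, which can be mirrored directly in code. The sketch below is an illustration only, under stated assumptions: distributions on $\mathbb{R}$ are modelled as Python callables taking a test function to a number, derivatives of test functions are approximated by central differences (helper `d`), and the Heaviside pairing integrates over $[0, 2]$, which assumes test functions supported in $(-2, 2)$. None of these names come from the notes.

```python
import math

def integrate(f, lo, hi, steps=20000):
    """Composite trapezoid rule on [lo, hi]."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def phi(x):
    # Test function: smooth bump supported in (-1, 1).
    return math.exp(-1.0 / (1.0 - x * x)) if abs(x) < 1.0 else 0.0

def d(f, h=1e-5):
    """Central-difference approximation to f' (numerical stand-in for the derivative)."""
    return lambda x: (f(x + h) - f(x - h)) / (2.0 * h)

# Distributions modelled as callables: test function -> number.
delta = lambda f: f(0.0)
heaviside = lambda f: integrate(f, 0.0, 2.0)  # assumes supp(f) inside (-2, 2)

def derivative(u):
    # Definition 2.2 with |beta| = 1: <du, f> = -<u, f'>.
    return lambda f: -u(d(f))

def translate(u, y):
    # Definition 2.4: <tau_y u, f> = <u, tau_{-y} f>, where tau_{-y} f(x) = f(x + y).
    return lambda f: u(lambda x: f(x + y))

def dilate(u, r, n=1):
    # Definition 2.4: <mu_r u, f> = r^{-n} <u, mu_{1/r} f>, where mu_{1/r} f(x) = f(x / r).
    return lambda f: r ** (-n) * u(lambda x: f(x / r))

dH_phi = derivative(heaviside)(phi)        # Example 2.3: should be close to phi(0)
shifted = translate(delta, 0.5)(phi)       # Example 2.6: equals phi(0.5)
scaled = dilate(delta, 2.0)(phi)           # Example 2.6: equals 2^{-1} * phi(0)
print(dH_phi, phi(0.0), shifted, scaled)
```

Modelling a distribution as a function on test functions keeps each operation a one-line transcription of its definition; the price is that only the pairing, not a pointwise value, is ever available.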

3. Compactly supported and tempered distributions

3.1. Compactly supported distributions.

Definition 3.1. Given an open set $\Omega \subset \mathbb{R}^n$ and a further open subset $U \subset \Omega$, we say a distribution $u \in \mathcal{D}'(\Omega)$ vanishes on $U$ if $\langle u, \phi\rangle = 0$ for every $\phi \in \mathcal{D}(U) = C_c^\infty(U)$ (that is, the smooth functions compactly supported in $U$). The support of $u$ is then defined as the set whose complement is the union of all open sets on which $u$ vanishes, that is,
\[ (\operatorname{supp}(u))^C := \bigcup \{ U \subset \Omega \text{ open} : u \text{ vanishes on } U \}. \]
Equivalently, $\operatorname{supp}(u)$ is the intersection of all closed sets $F \subset \Omega$ such that $u$ vanishes on $\Omega \setminus F$. Either way, it is clear that $\operatorname{supp}(u)$ is always a closed set.

Exercise 3.2. Suppose $f : \Omega \to \mathbb{C}$ is continuous (and hence locally integrable). Check that the support of $f$ as a function is identical to its support as a distribution in $\mathcal{D}'(\Omega)$ in the sense of Definition 3.1. For those who know measure theory, show that more generally, if $f \in L^1_{loc}(\Omega)$ is merely locally integrable, then in the sense of distributions
\[ (\operatorname{supp}(f))^C = \bigcup \{ U \subset \Omega \text{ open} : f \text{ vanishes a.e. on } U \}. \]

Exercise 3.3. Show that the support of the Dirac distribution $\delta$ is the origin $\{0\}$.

Exercise 3.4. Show that for any multi-index $\alpha$, $\operatorname{supp}(\partial^\alpha u) \subset \operatorname{supp}(u)$. Construct an example for which the containment is proper.

Definition 3.5. A distribution $u \in \mathcal{D}'(\Omega)$ is said to be compactly supported if $\operatorname{supp}(u)$ is a compact set. The space of compactly supported distributions on an open set $\Omega$ is denoted by $\mathcal{E}'(\Omega)$.

The significance of a compactly supported distribution $u$ lies in the fact that it extends to a sequentially continuous linear mapping $u : C^\infty(\Omega) \to \mathbb{C}$. Indeed, the $C^\infty$ Urysohn lemma furnishes a bump function $\psi \in C_c^\infty(\Omega)$ such that $\psi(x) = 1$ in a small neighborhood of $\operatorname{supp}(u)$ (and hence $1 - \psi$ vanishes on a neighborhood of $\operatorname{supp}(u)$). Hence
\[ \langle (1 - \psi)u, \phi\rangle = \langle u, (1 - \psi)\phi\rangle = 0, \]
since $(1 - \psi)\phi$ is a test function supported in $(\operatorname{supp}(u))^C$. Thus $\langle u, \phi\rangle = \langle \psi u, \phi\rangle$ for every $\phi \in \mathcal{D}(\Omega)$, and we may extend $u$ to functions $\eta \in C^\infty(\Omega)$ which are not necessarily compactly supported via
\[ \langle u, \eta\rangle := \langle u, \psi\eta\rangle. \tag{3.1} \]
Moreover, this definition can be seen to be independent of the choice of bump function $\psi$ which is 1 on a neighborhood of $\operatorname{supp}(u)$.

Next note that with the definition (3.1) we may take $K = \operatorname{supp}(\psi)$, and the definition of a distribution furnishes constants $C_K, N_K$ as in (1.2) such that
\[ |\langle u, \eta\rangle| = |\langle u, \psi\eta\rangle| \le C_K \sum_{|\alpha|\le N_K} \sup_{x\in K} |\partial^\alpha(\psi\eta)(x)| \le \tilde C \sum_{|\alpha|\le N_K} \sup_{x\in K} |\partial^\alpha \eta(x)|, \]
where $\tilde C$ depends on $C_K$, $\sup_{|\alpha|\le N_K} \|\partial^\alpha \psi\|_\infty$, and the generalized binomial coefficients appearing in the Leibniz rule. The crucial aspect of this inequality is that the constants depend only on $\psi$, and not at all on $\eta$. In particular, if $\eta$ happens to be compactly supported, they are independent of its support. Hence, setting $C = \tilde C$ and $N = N_K$, there exists a compact set $K \supset \operatorname{supp}(u)$ such that
\[ |\langle u, \eta\rangle| \le C \sum_{|\alpha|\le N} \sup_{x\in K} |\partial^\alpha \eta(x)| \quad \text{for every } \eta \in C^\infty(\Omega). \tag{3.2} \]
This shows in particular that when $u$ is restricted to functions in $\mathcal{D}(\Omega)$, the constants appearing in (1.2) can be taken to be independent of $\operatorname{supp}(\phi)$.

We conclude this discussion by defining the notion of convergence of a sequence of $C^\infty(\Omega)$ functions.

Definition 3.6. A sequence of functions $\{\eta_j\}_{j=1}^\infty$ in $C^\infty(\Omega)$ is said to converge to $\eta$ in the topology of $C^\infty(\Omega)$ if for each multi-index $\alpha$, $\partial^\alpha \eta_j \to \partial^\alpha \eta$ uniformly on any compact set. In other words, for each $\alpha$ and compact $K \subset \Omega$,
\[ \lim_{j\to\infty} \left\| (\partial^\alpha \eta_j - \partial^\alpha \eta)\big|_K \right\|_\infty = 0. \]
It is then not hard to see that when $u \in \mathcal{E}'(\Omega)$, the extension (3.1) defines a sequentially continuous map in that
\[ \lim_{j\to\infty} \langle u, \eta_j\rangle = \langle u, \eta\rangle \quad \text{whenever } \eta_j \to \eta \text{ in } C^\infty(\Omega). \]

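3.2. Tempered distributions. In a nutshell, tempered distributions are continuous linear maps $\mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$. Much like in the previous cases, the definition is motivated by first defining convergence in the topology of $\mathcal{S}(\mathbb{R}^n)$.

Definition 3.7. Given a function $\phi \in \mathcal{S}(\mathbb{R}^n)$, define
\[ \|\phi\|_{(N,\alpha)} := \sup_{x\in\mathbb{R}^n} (1 + |x|)^N |\partial^\alpha \phi(x)|. \tag{3.3} \]
A sequence $\{\phi_j\}_{j=1}^\infty$ in $\mathcal{S}(\mathbb{R}^n)$ is said to converge to $\phi$ in the topology of $\mathcal{S}(\mathbb{R}^n)$ if for any multi-index $\alpha$ and any integer $N \ge 0$,
\[ \lim_{j\to\infty} \|\phi_j - \phi\|_{(N,\alpha)} = \lim_{j\to\infty} \sup_{x\in\mathbb{R}^n} (1 + |x|)^N |\partial^\alpha \phi_j(x) - \partial^\alpha \phi(x)| = 0. \]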

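Proposition 3.8. Let $N \ge 0$ be an integer. There exist constants $C_{N,1}, C_{N,2} > 0$ such that
\[ C_{N,1} \sum_{|\beta|\le N} |x^\beta| \le (1 + |x|)^N \le C_{N,2} \sum_{|\beta|\le N} |x^\beta|. \tag{3.4} \]
Consequently, for any $\phi \in \mathcal{S}(\mathbb{R}^n)$ and any multi-index $\alpha$,
\[ C_{N,1} \sum_{|\beta|\le N} \sup_{x\in\mathbb{R}^n} |x^\beta \partial^\alpha \phi(x)| \le \|\phi\|_{(N,\alpha)} \le C_{N,2} \sum_{|\beta|\le N} \sup_{x\in\mathbb{R}^n} |x^\beta \partial^\alpha \phi(x)|. \tag{3.5} \]

Proof. Once (3.4) is established, (3.5) follows by revisiting the definition (3.3). The first inequality in (3.4) is the easier of the two since $|x^\beta| \le (1 + |x|)^N$ for any $|\beta| \le N$; hence one can simply take $C_{N,1}$ to satisfy $C_{N,1}^{-1} = \#\{\beta : |\beta| \le N\}$.

For the second inequality in (3.4), which is trivial when $N = 0$, suppose $N \ge 1$. Given two positive continuous functions $f(x), g(x)$ on $\mathbb{R}^n \setminus \{0\}$ which are both homogeneous of degree $N$, observe that
\[ \frac{f(x)}{g(x)} = \frac{|x|^N f(x/|x|)}{|x|^N g(x/|x|)} = \frac{f(x/|x|)}{g(x/|x|)}. \]
Hence for any $x \ne 0$ the ratio $f(x)/g(x)$ is bounded above and below by the maximum and minimum values of the ratio attained on the unit sphere (which exist since $S^{n-1}$ is compact):
\[ \min_{\omega\in S^{n-1}} \frac{f(\omega)}{g(\omega)} \le \frac{f(x)}{g(x)} \le \max_{\omega\in S^{n-1}} \frac{f(\omega)}{g(\omega)}. \]
Applying this to $f(x) = |x|^N$ and $g(x) = \sum_{j=1}^n |x_j|^N$, there exists a positive constant $c_0$, which we may take to satisfy $c_0 \ge 1$, such that $|x|^N \le c_0 \sum_{j=1}^n |x_j|^N$ for every $x \in \mathbb{R}^n$. Hence
\[ (1 + |x|)^N \le 2^N (1 + |x|^N) \le 2^N \Big( 1 + c_0 \sum_{j=1}^n |x_j|^N \Big) \le 2^N c_0 \sum_{|\beta|\le N} |x^\beta|, \]
where the last step uses that $1 = |x^0|$ and $|x_j|^N = |x^{N e_j}|$ appear among the terms $|x^\beta|$, $|\beta| \le N$. ∎

Definition 3.9. A tempered distribution is a linear map $u : \mathcal{S}(\mathbb{R}^n) \to \mathbb{C}$ such that there exist integers $M, N \ge 0$ and a constant $C$ satisfying
\[ |\langle u, \phi\rangle| \le C \sum_{|\alpha|\le M} \|\phi\|_{(N,\alpha)} \quad \text{for each } \phi \in \mathcal{S}(\mathbb{R}^n). \tag{3.6} \]
The space of tempered distributions on $\mathbb{R}^n$ is denoted by $\mathcal{S}'(\mathbb{R}^n)$. It can be verified that it forms a vector space, defining addition and scalar multiplication as before.

It is not hard to see that the restriction of any tempered distribution to $C_c^\infty(\mathbb{R}^n)$ is a distribution in the sense of Definition 1.2. Also, as before, if $\phi_j \to \phi$ in the topology of $\mathcal{S}(\mathbb{R}^n)$, then it is not hard to see that $\lim_{j\to\infty} \langle u, \phi_j\rangle = \langle u, \phi\rangle$.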

Example 3.10. Let $f$ be the continuous, locally integrable function $f(x) = e^{|x|^2}$. Integration against $f$ does not define a tempered distribution. Indeed, the Gaussian $e^{-|x|^2/2}$ is in $\mathcal{S}(\mathbb{R}^n)$, but the integral
\[ \int e^{|x|^2} e^{-|x|^2/2}\,dx \]
is divergent. Hence integration of the Schwartz class function $e^{-|x|^2/2}$ against $f$ is ill defined.

Exercise 3.11. Let $f : \mathbb{R}^n \to \mathbb{C}$ be a locally integrable function such that $|f(x)| \le C(1 + |x|)^N$ for some constants $C$ and $N$. We call such functions slowly increasing. Show that $\langle f, \phi\rangle = \int_{\mathbb{R}^n} f(x)\phi(x)\,dx$ defines a tempered distribution.

These two examples lead to our intuition that tempered distributions "grow at most polynomially" at infinity.

Example 3.12. Let $u \in \mathcal{E}'(\mathbb{R}^n)$ be a compactly supported distribution. By the observations in §3.1, $u$ extends to a linear map $C^\infty(\mathbb{R}^n) \to \mathbb{C}$, and hence $\langle u, \phi\rangle$ is well-defined for $\phi \in \mathcal{S}(\mathbb{R}^n)$. Moreover, if $\psi$ is a bump function identically one on a neighborhood of $\operatorname{supp}(u)$, then (3.2) (with $\Omega = \mathbb{R}^n$ and $\eta = \phi$) implies that for some compact $K \subset \mathbb{R}^n$,
\[ |\langle u, \phi\rangle| \le C \sum_{|\alpha|\le N} \sup_{x\in K} |\partial^\alpha \phi(x)| \le C \sum_{|\alpha|\le N} \|\phi\|_{(0,\alpha)}. \]
Hence every compactly supported distribution defines a tempered distribution.

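4. Fourier transforms of tempered distributions

Definition 4.1. Given $u \in \mathcal{S}'(\mathbb{R}^n)$, define its Fourier transform as the tempered distribution
\[ \langle \hat u, \phi\rangle := \langle u, \hat\phi\rangle. \]
To see that $\hat u$ does indeed satisfy (3.6), first observe that by (3.6) and (3.5), there exist constants $C, \tilde C$ such that
\[ |\langle \hat u, \phi\rangle| = |\langle u, \hat\phi\rangle| \le C \sum_{|\alpha|\le M} \|\hat\phi\|_{(N,\alpha)} \le \tilde C \sum_{|\beta|\le N,\ |\alpha|\le M} \sup_{x\in\mathbb{R}^n} |x^\beta \partial^\alpha \hat\phi(x)|. \]
It can then be seen that $\hat u$ satisfies (3.6) once we have the following proposition.

Proposition 4.2. Suppose $M, N \ge 0$ are integers and that $\alpha, \beta$ are multi-indices such that $|\beta| \le N$, $|\alpha| \le M$. Then there exists a constant $C$ such that
\[ \sup_{x\in\mathbb{R}^n} |x^\beta \partial^\alpha \hat\phi(x)| \le C \sum_{|\gamma|\le N} \|\phi\|_{(M+n+1,\gamma)}. \]

Proof. Begin by observing that by the standard $L^\infty$ bound on the Fourier transform,
\[ \sup_{x\in\mathbb{R}^n} |x^\beta \partial^\alpha \hat\phi(x)| = \sup_{x\in\mathbb{R}^n} \left| [\partial^\beta(\xi^\alpha \phi)]^\wedge(x) \right| \le \|\partial^\beta(\xi^\alpha \phi)\|_{L^1(\mathbb{R}^n)}. \]
By the Leibniz rule there exist constants $c_{\gamma,\alpha}$ such that
\[ \partial^\beta(\xi^\alpha \phi) = \sum_{\beta - \min(\alpha,\beta) \le \gamma \le \beta} c_{\gamma,\alpha}\, \xi^{\alpha-(\beta-\gamma)} \partial^\gamma \phi \]
(where $\min(\alpha, \beta)$ is the multi-index $(\min(\alpha_1, \beta_1), \ldots, \min(\alpha_n, \beta_n))$). For each term here,
\[ \int_{\mathbb{R}^n} |\xi^{\alpha-(\beta-\gamma)} \partial^\gamma \phi(\xi)|\,d\xi \le \Big( \sup_{\xi\in\mathbb{R}^n} (1 + |\xi|)^{M+n+1} |\partial^\gamma \phi(\xi)| \Big) \int_{\mathbb{R}^n} |\xi^{\alpha-(\beta-\gamma)}| (1 + |\xi|)^{-M-n-1}\,d\xi. \]
Since $|\xi^{\alpha-(\beta-\gamma)}| \le (1 + |\xi|)^M$, the last integral is finite, and the right hand side here can be written as $C_{\alpha,\beta,\gamma} \|\phi\|_{(M+n+1,\gamma)}$ for some constant $C_{\alpha,\beta,\gamma}$ depending on $\alpha, \beta, \gamma$. ∎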

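Example 4.3. Consider the Dirac distribution $\delta$. The distribution is compactly supported, hence tempered. We now compute its Fourier transform by testing against any $\phi \in \mathcal{S}(\mathbb{R}^n)$:
\[ \langle \hat\delta, \phi\rangle = \langle \delta, \hat\phi\rangle = \hat\phi(0) = \int_{\mathbb{R}^n} e^{-i0\cdot x}\phi(x)\,dx = \int_{\mathbb{R}^n} 1 \cdot \phi(x)\,dx = \langle 1, \phi\rangle. \]
This shows that $\hat\delta = 1$, that is, $\hat\delta$ is the distribution determined by the constant function 1. Similarly, $\hat 1 = (2\pi)^n \delta$:
\[ \langle \hat 1, \phi\rangle = \langle 1, \hat\phi\rangle = \int \hat\phi(x)\,dx = (2\pi)^n \phi(0), \]
where we used the Fourier inversion formula in the last step.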
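Both identities in Example 4.3 can be sanity-checked numerically in dimension $n = 1$. The sketch below is illustrative only and assumes the convention $\hat\phi(\xi) = \int e^{-ix\xi}\phi(x)\,dx$ used in the computation above; the Gaussian test function `phi`, the truncation of all integrals to $[-10, 10]$, and the helper `integrate` are choices made here.

```python
import cmath
import math

def integrate(f, lo, hi, steps=400):
    """Composite trapezoid rule on [lo, hi]; works for complex-valued f."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def phi(x):
    # Gaussian: a Schwartz function with phi(0) = 1 (illustrative choice).
    return math.exp(-x * x / 2.0)

def phi_hat(xi):
    # Numerical Fourier transform, truncated to [-10, 10] where phi is negligible outside.
    return integrate(lambda x: cmath.exp(-1j * x * xi) * phi(x), -10.0, 10.0)

# <delta^, phi> = phi_hat(0) should agree with <1, phi> = integral of phi.
delta_hat_phi = phi_hat(0.0)
one_phi = integrate(phi, -10.0, 10.0)

# <1^, phi> = integral of phi_hat should agree with (2 pi)^n phi(0), here n = 1.
one_hat_phi = integrate(phi_hat, -10.0, 10.0)

print(abs(delta_hat_phi - one_phi), abs(one_hat_phi - 2.0 * math.pi * phi(0.0)))
```

Both printed differences should be close to zero, up to the truncation and quadrature error of the sketch.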

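Definition 4.4. Given $u \in \mathcal{S}'(\mathbb{R}^n)$, define its inverse Fourier transform as the tempered distribution satisfying
\[ \langle \check u, \phi\rangle := \langle u, \check\phi\rangle, \]
where as usual $\check\phi(x) = (2\pi)^{-n} \int e^{ix\cdot\xi} \phi(\xi)\,d\xi$. This operation is easily verified to invert the Fourier transform. Indeed, $(\hat u)^\vee = u$ since
\[ \langle (\hat u)^\vee, \phi\rangle = \langle \hat u, \check\phi\rangle = \langle u, (\check\phi)^\wedge\rangle = \langle u, \phi\rangle. \]
A similar computation shows $(\check u)^\wedge = u$. Hence we may conclude the following theorem.

Theorem 4.5. The Fourier transform $\mathcal{F}$ on the space of tempered distributions defines a sequentially continuous linear bijection $\mathcal{F} : \mathcal{S}'(\mathbb{R}^n) \to \mathcal{S}'(\mathbb{R}^n)$.

4.1. Fourier and Fourier-Laplace transforms of compactly supported distributions. We begin by recalling some properties of analytic functions on an open domain $U \subset \mathbb{C}$. Given a function $F : U \to \mathbb{C}$, let $\zeta = \xi + i\nu$ denote the complex variable in $U$ with $\xi, \nu \in \mathbb{R}$, so that $\operatorname{Re}(\zeta) = \xi$, $\operatorname{Im}(\zeta) = \nu$. Recall that $F$ is said to be analytic on $U$ if at each point $\xi + i\nu \in U$, $F$ satisfies the Cauchy-Riemann equations
\[ \frac{\partial u}{\partial \xi} = \frac{\partial v}{\partial \nu}, \qquad \frac{\partial u}{\partial \nu} = -\frac{\partial v}{\partial \xi}, \qquad \text{where } u = \operatorname{Re}(F),\ v = \operatorname{Im}(F). \tag{4.1} \]
Next, we introduce the partial differential operators
\[ \frac{\partial}{\partial \zeta} = \frac{1}{2}\left( \frac{\partial}{\partial \xi} - i\frac{\partial}{\partial \nu} \right), \qquad \frac{\partial}{\partial \bar\zeta} = \frac{1}{2}\left( \frac{\partial}{\partial \xi} + i\frac{\partial}{\partial \nu} \right). \]
Hence the Cauchy-Riemann equations can be rewritten as $\frac{\partial F}{\partial \bar\zeta} = 0$. Indeed, observing that
\[ \frac{\partial F}{\partial \bar\zeta} = \frac{1}{2}\left[ \left( \frac{\partial u}{\partial \xi} - \frac{\partial v}{\partial \nu} \right) + i\left( \frac{\partial u}{\partial \nu} + \frac{\partial v}{\partial \xi} \right) \right], \qquad u = \operatorname{Re}(F),\ v = \operatorname{Im}(F), \]
it easily follows that the Cauchy-Riemann equations (4.1) are satisfied if and only if $\frac{\partial F}{\partial \bar\zeta} = 0$. It is well known that if $F$ is analytic on $U$, then it is $C^\infty(U)$ as a function of the real variables $\xi, \nu \in \mathbb{R}$.

In higher dimensions, let $\zeta = (\zeta_1, \ldots, \zeta_n)$ denote coordinates on $\mathbb{C}^n$ with $\zeta_j = \xi_j + i\nu_j$, and let
\[ \frac{\partial}{\partial \zeta_j} = \frac{1}{2}\left( \frac{\partial}{\partial \xi_j} - i\frac{\partial}{\partial \nu_j} \right), \qquad \frac{\partial}{\partial \bar\zeta_j} = \frac{1}{2}\left( \frac{\partial}{\partial \xi_j} + i\frac{\partial}{\partial \nu_j} \right) \]
for $j = 1, \ldots, n$. Hence as vectors $\xi = \operatorname{Re}(\zeta) \in \mathbb{R}^n$ and $\nu = \operatorname{Im}(\zeta) \in \mathbb{R}^n$. Thus if $U \subset \mathbb{C}^n$ is an open domain, we say that $F : U \to \mathbb{C}$ is analytic on $U$ if at each point $\zeta \in U$ and for each $j = 1, \ldots, n$ we have $\frac{\partial F}{\partial \bar\zeta_j} = 0$. Equivalently, $F$ satisfies the Cauchy-Riemann equations in each variable. As in the 1 dimensional case, $F$ is $C^\infty$ as a function of the $n$ dimensional real variables $\xi, \nu \in \mathbb{R}^n$, essentially because one has higher dimensional versions of the Cauchy integral formula and power series developments.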

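One function of central interest to us is $e^{-ix\cdot\zeta}$ as a function of $\zeta$. It is not difficult to check that $\frac{\partial}{\partial \bar\zeta_j} e^{-ix\cdot\zeta} = 0$ for every $\zeta \in \mathbb{C}^n$, so that it defines an entire function on all of $\mathbb{C}^n$:
\[ \frac{\partial}{\partial \bar\zeta_j} e^{-ix\cdot\zeta} = \frac{1}{2}\left( \frac{\partial}{\partial \xi_j} + i\frac{\partial}{\partial \nu_j} \right)\left( e^{-ix\cdot\xi} e^{x\cdot\nu} \right) = \frac{1}{2}(-ix_j + ix_j)\left( e^{-ix\cdot\xi} e^{x\cdot\nu} \right) = 0. \tag{4.2} \]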
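The identity (4.2) can also be checked numerically in one complex variable. The following sketch is an illustration under stated assumptions: the operator $\frac{\partial}{\partial\bar\zeta}$ is approximated by central differences in the helper `dbar`, and the sample values of `x` and `zeta` are arbitrary. The non-analytic function $\zeta \mapsto \bar\zeta$ is included as a control.

```python
import cmath

def e_zeta(x, zeta):
    # The function e^{-i x zeta} for real x and complex zeta (n = 1).
    return cmath.exp(-1j * x * zeta)

def dbar(F, zeta, h=1e-6):
    # Central-difference approximation of (1/2)(d/dxi + i d/dnu) at zeta = xi + i nu.
    d_xi = (F(zeta + h) - F(zeta - h)) / (2 * h)
    d_nu = (F(zeta + 1j * h) - F(zeta - 1j * h)) / (2 * h)
    return 0.5 * (d_xi + 1j * d_nu)

x = 0.7               # arbitrary sample point
zeta = 1.3 - 0.4j     # arbitrary sample point

residual = abs(dbar(lambda z: e_zeta(x, z), zeta))   # should be near 0
control = abs(dbar(lambda z: z.conjugate(), zeta))   # d(conj zeta)/d(conj zeta) = 1
print(residual, control)
```

The residual for $e^{-ix\zeta}$ is zero up to finite-difference error, while the control equals 1, showing that the check does distinguish analytic from non-analytic functions.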

Lemma 4.6. Suppose $f \in C^2(\mathbb{R}^n)$ and let $e_j$ denote the $j$-th standard basis vector in $\mathbb{R}^n$. Then the difference quotients
\[ \frac{\tau_{-he_j} f(x) - f(x)}{h} \tag{4.3} \]
converge uniformly to the function $\partial_{x_j} f(x)$ on compact sets in $\mathbb{R}^n$ as $h \to 0$. Moreover, if $f \in C^{k+2}(\mathbb{R}^n)$ and $|\alpha| \le k$, then
\[ \partial^\alpha \frac{\tau_{-he_j} f(x) - f(x)}{h} \tag{4.4} \]
converge uniformly to the function $\partial^{\alpha+e_j} f(x)$ on compact sets in $\mathbb{R}^n$.

Proof. First observe that (4.4) is
\[ \frac{(\partial^\alpha f)(x + he_j) - (\partial^\alpha f)(x)}{h}. \]
Thus if $f \in C^{k+2}(\mathbb{R}^n)$ and $|\alpha| \le k$, then $\partial^\alpha f \in C^{k+2-|\alpha|}(\mathbb{R}^n) \subset C^2(\mathbb{R}^n)$, so the $\alpha \ne 0$ case follows from the $\alpha = 0$ case, and we thus restrict attention to the latter in what follows. Let $K$ be a compact set and denote $K + he_j = \{x + he_j : x \in K\}$. The union of all the $K + he_j$ for $|h| \le 1$,
\[ \tilde K := \bigcup_{|h|\le 1} (K + he_j), \]
is a closed, bounded set and hence compact. Thus since $f \in C^2(\mathbb{R}^n)$, there exists $M$ such that $\sup_{x\in\tilde K} |\partial_{x_j}^2 f(x)| \le M$. Thus if $x \in K$ and $|h| \le 1$, Taylor's theorem applied to $t \mapsto f(x + te_j)$ implies that there exists $c(h, x)$ such that $|c(h, x)| \le |h|$ and such that the difference between the expression in (4.3) and its claimed limit satisfies
\[ \left| \frac{f(x + he_j) - f(x)}{h} - \partial_{x_j} f(x) \right| = \left| \frac{h}{2} \partial_{x_j}^2 f(x + c(h, x)e_j) \right| \le \frac{M|h|}{2} \to 0 \quad \text{as } h \to 0. \quad ∎ \]

Lemma 4.7. Let $\phi \in \mathcal{S}(\mathbb{R}^n)$. Then for any multi-index $\alpha$ and any integer $N \ge 0$, there exists a constant $C_{N,\alpha}$ such that

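\[ \left\| \frac{\tau_{-he_j}\phi - \phi}{h} - \partial_{x_j}\phi \right\|_{(N,\alpha)} \le C_{N,\alpha}|h| \quad \text{for } |h| \le 1. \tag{4.5} \]
Hence the difference quotients $\frac{1}{h}(\tau_{-he_j}\phi - \phi)$ converge to $\partial_{x_j}\phi$ in the topology of $\mathcal{S}(\mathbb{R}^n)$ as $h \to 0$.

Proof. The proof is very similar to that of Lemma 4.6, and in particular, by the same considerations there, it suffices to treat the case $\alpha = 0$. Applying Taylor's theorem as before, for each $x \in \mathbb{R}^n$ and each $|h| \le 1$, we have a $c(x, h)$ satisfying $|c(x, h)| \le |h|$ and
\[ \frac{\tau_{-he_j}\phi(x) - \phi(x)}{h} - \partial_{x_j}\phi(x) = \frac{h}{2} \partial_{x_j}^2 \phi(x + c(x, h)e_j). \]
We use this to estimate, for $|h| \le 1$,
\[ (1 + |x|)^N \left| \frac{\tau_{-he_j}\phi(x) - \phi(x)}{h} - \partial_{x_j}\phi(x) \right| \le \frac{|h|}{2} \big( 1 + |x + c(x, h)e_j| + |c(x, h)| \big)^N |\partial_{x_j}^2 \phi(x + c(x, h)e_j)| \]
\[ \le \frac{|h|}{2} \sum_{k=0}^N \binom{N}{k} |c(x, h)|^{N-k} (1 + |x + c(x, h)e_j|)^k |\partial_{x_j}^2 \phi(x + c(x, h)e_j)| \le \frac{|h|}{2} \sum_{k=0}^N \binom{N}{k} \|\phi\|_{(k, 2e_j)}, \]
where we have used $|c(x, h)| \le |h| \le 1$ in the last expression. The sum in $k$ on the right determines the constant $C_{N,\alpha}$, which is independent of $x, h$. Hence (4.5) follows by taking the supremum over all $x \in \mathbb{R}^n$ on the left hand side. ∎

Theorem 4.8. Suppose $u \in \mathcal{E}'(\mathbb{R}^n)$ and for $\zeta \in \mathbb{C}^n$, let $e_\zeta(x) = e^{-ix\cdot\zeta} = e^{-ix\cdot\xi} e^{x\cdot\nu}$, so that as a function of $x$, $e_\zeta \in C^\infty(\mathbb{R}^n)$. Define a function $F : \mathbb{C}^n \to \mathbb{C}$ by
\[ F(\zeta) = \langle u, e_\zeta\rangle. \tag{4.6} \]
Then $F(\zeta)$ is analytic on $\mathbb{C}^n$.

Proof. Treating $F$ as a function of the real vectors $\xi, \nu \in \mathbb{R}^n$, observe that by linearity (see footnote 2 for the notation $u(x)$)
\[ \frac{F(\xi + he_j, \nu) - F(\xi, \nu)}{h} = \left\langle u(x), \frac{e^{-ix\cdot(\xi + he_j)} - e^{-ix\cdot\xi}}{h} e^{x\cdot\nu} \right\rangle = \left\langle u(x), e_\zeta(x) \frac{e^{-ihx_j} - 1}{h} \right\rangle. \]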

We now claim that $e_\zeta(x) \frac{e^{-ihx_j} - 1}{h}$ and its derivatives in $x$ converge uniformly to $-ix_j e_\zeta(x)$ and its corresponding derivatives on compact sets as $h \to 0$. Once this is established, then by continuity of $u$, the right hand side converges to $\langle u(x), -ix_j e_\zeta(x)\rangle$ as $h \to 0$, so that $\frac{\partial F}{\partial \xi_j}$ exists and equals this limit. The principle is essentially the same as in the proof of Lemma 4.6: a Taylor expansion of the function $f(t) = e^{-itx_j}$ yields the existence of a number $c(h)$ with $|c(h)| \le |h|$ such that
\[ e_\zeta(x) \left( \frac{e^{-ihx_j} - 1}{h} + ix_j \right) = h \left( \frac{1}{2} e_\zeta(x)(-ix_j)^2 e^{-ic(h)x_j} \right). \]
As before, given a compact set $K$, there is a uniform upper bound on the function in parentheses on the right as $x$ ranges over all points in $K$ and all $|h| \le 1$, implying that $e_\zeta(x)\left( \frac{e^{-ihx_j} - 1}{h} + ix_j \right)$ does indeed converge uniformly to 0 on $K$. Moreover, we may apply $\partial_x^\alpha$ to both sides of this identity for any multi-index $\alpha$, so that the same idea yields that uniformly on $K$,
\[ \partial_x^\alpha \left[ e_\zeta(x) \left( h^{-1}(e^{-ihx_j} - 1) + ix_j \right) \right] \to 0 \quad \text{as } h \to 0. \]
We have now shown that $\frac{\partial F}{\partial \xi_j}$ exists and equals $\langle u(x), (-ix_j)e_\zeta(x)\rangle$. Essentially the same argument establishes $\frac{\partial F}{\partial \nu_j} = \langle u, x_j e_\zeta(x)\rangle$, at which point $F$ satisfies the Cauchy-Riemann equations in each variable, implying $F$ is analytic on all of $\mathbb{C}^n$. ∎

Corollary 4.9. Define $F : \mathbb{R}^n \to \mathbb{C}$ by $F(\xi) = \langle u, e_\xi\rangle$. Then $F$ is $C^\infty(\mathbb{R}^n)$.

Proof. The function $F$ is just the restriction of the function in (4.6) to $\nu = \operatorname{Im}(\zeta) = 0$. Since analytic functions are always $C^\infty$ functions when regarded as functions of their real and imaginary parts $\xi, \nu$, $F$ is a $C^\infty$ function of $\xi$ when restricted to $\nu = 0$. ∎

Footnote 2. The use of $u(x)$ here should be taken as a reminder that $u$ acts on the function of $x$ given by $e_\zeta(x)$, and not to imply that $u(x)$ is an actual function instead of a distribution.

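Theorem 4.10. Suppose $u \in \mathcal{E}'(\mathbb{R}^n)$ and for $\zeta \in \mathbb{C}^n$, let $F(\zeta) = \langle u, e_\zeta\rangle$ as before. Suppose further that $\operatorname{supp}(u) \subset B(0, a)$, the closed ball of radius $a$ about the origin. Then there exist constants $C, N$ such that
\[ |F(\zeta)| \le C(1 + |\zeta|)^N e^{a|\operatorname{Im}(\zeta)|}. \tag{4.7} \]
In particular, if $\operatorname{Im}(\zeta) = 0$ so that $\zeta = \xi$, then $|F(\xi)| \le C(1 + |\xi|)^N$.

Proof. Let $\psi$ be a $C^\infty(\mathbb{R})$ function such that $\psi(t) = 0$ if $t \le -1$ and $\psi(t) = 1$ if $t \ge -\frac{1}{2}$. Then define
\[ \psi_\zeta(x) := \psi(|\zeta|(a - |x|)). \]
Hence $\psi_\zeta(x) = 1$ if $|\zeta|(a - |x|) \ge -\frac{1}{2}$ and $\psi_\zeta(x) = 0$ if $|\zeta|(a - |x|) \le -1$. Manipulating these identities reveals that
\[ \psi_\zeta(x) = 1 \text{ if } |x| \le a + \frac{1}{2|\zeta|} \quad \text{and} \quad \psi_\zeta(x) = 0 \text{ if } |x| \ge a + \frac{1}{|\zeta|}. \]
In particular, $\psi_\zeta(x) = 1$ in a neighborhood of the origin, so it can be verified that $\psi_\zeta \in C_c^\infty(\mathbb{R}^n)$ and $\operatorname{supp}(\psi_\zeta) \subset B(0, a + |\zeta|^{-1})$. Moreover, if $\alpha$ is a multi-index of order $k$, an induction argument shows that there exist functions of $x$, $c_{j,\alpha}(x)$, which are bounded where $\psi^{(j)}(|\zeta|(a - |x|)) \ne 0$ and satisfy
\[ \partial_x^\alpha \psi_\zeta(x) = \sum_{j=1}^k c_{j,\alpha}(x)\,|\zeta|^j\, \psi^{(j)}(|\zeta|(a - |x|)). \]
Therefore, there exists a constant $C_\alpha$ such that
\[ |\partial_x^\alpha \psi_\zeta(x)| \le C_\alpha (1 + |\zeta|)^{|\alpha|}. \]
From the Leibniz rule and the identity $\partial_x^\alpha e_\zeta(x) = (-i\zeta)^\alpha e_\zeta(x)$, we now have a constant $\tilde C_\alpha$ satisfying
\[ \sup_{x\in\mathbb{R}^n} |\partial^\alpha(\psi_\zeta(x)e_\zeta(x))| \le \tilde C_\alpha (1 + |\zeta|)^{|\alpha|} \sup_{x\in\operatorname{supp}(\psi_\zeta)} |e_\zeta(x)| \le \tilde C_\alpha (1 + |\zeta|)^{|\alpha|} e^{|\operatorname{Im}(\zeta)|(a + |\zeta|^{-1})}, \tag{4.8} \]
where the last inequality follows from $|e_\zeta(x)| = e^{x\cdot\operatorname{Im}(\zeta)} \le e^{|\operatorname{Im}(\zeta)|(a + |\zeta|^{-1})}$ for $x \in \operatorname{supp}(\psi_\zeta)$. Note that we may further bound $e^{|\operatorname{Im}(\zeta)|(a + |\zeta|^{-1})} \le e \cdot e^{|\operatorname{Im}(\zeta)|a}$ since $|\operatorname{Im}(\zeta)|/|\zeta| \le 1$. Since $\psi_\zeta = 1$ in a neighborhood of $\operatorname{supp}(u)$, there exist a constant $\tilde C$ and an integer $N$ such that
\[ |F(\zeta)| = |\langle u, e_\zeta\rangle| = |\langle u, \psi_\zeta e_\zeta\rangle| \le \tilde C \sum_{|\alpha|\le N} \|\partial^\alpha(\psi_\zeta e_\zeta)\|_\infty. \]
Given (4.8) and the ensuing observation, we now have for some larger constant $C$
\[ |F(\zeta)| \le C(1 + |\zeta|)^N e^{|\operatorname{Im}(\zeta)|a}. \quad ∎ \]

At this stage, it is natural to study the relation between the function $F(\xi)$ (the restriction of $F$ to $\operatorname{Im}(\zeta) = 0$) and $\hat u$, the Fourier transform of $u \in \mathcal{E}'(\mathbb{R}^n)$ in the sense of distributions. It turns out that they are indeed the same!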

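Theorem 4.11. Let $F(\xi) = \langle u, e_\xi\rangle$. Then the slowly increasing function $F$ defines a tempered distribution which satisfies $\hat u = F$. In other words, for every Schwartz class function $\phi \in \mathcal{S}(\mathbb{R}^n)$,
\[ \langle u, \hat\phi\rangle = \int_{\mathbb{R}^n} F(\xi)\phi(\xi)\,d\xi. \tag{4.9} \]

Proof. More precisely, this is a proof sketch since we will eschew a few difficult technicalities. In a nutshell, we want to justify that we can "pass the distribution under the integral sign" to obtain
\[ \langle u(x), \hat\phi(x)\rangle = \left\langle u(x), \int_{\mathbb{R}^n} e_\xi(x)\phi(\xi)\,d\xi \right\rangle = \int_{\mathbb{R}^n} \langle u(x), e_\xi(x)\rangle \phi(\xi)\,d\xi = \int_{\mathbb{R}^n} F(\xi)\phi(\xi)\,d\xi. \]
To achieve this, define
\[ g_\epsilon(x) := \epsilon^n \sum_{m\in\mathbb{Z}^n} e_{\epsilon m}(x)\phi(\epsilon m). \]
Note that $\epsilon^n$ is the volume of the cube of sidelength $\epsilon$ centered at the point $\epsilon m$ (or centered at any other point!). Consequently, $g_\epsilon(x)$ is a Riemann sum approximation to the integral
\[ g_\epsilon(x) \approx \int_{\mathbb{R}^n} e^{-ix\cdot\xi}\phi(\xi)\,d\xi = \hat\phi(x). \]
With some work it can be seen that this approximation converges uniformly on any compact set in $\mathbb{R}^n$ as $\epsilon \to 0$. Moreover, any derivative of $g_\epsilon(x)$ can be seen to converge uniformly to the corresponding derivative of $\hat\phi(x)$ on compact sets. By continuity of $u$, we have that
\[ \langle u, \hat\phi\rangle = \lim_{\epsilon\to 0} \langle u, g_\epsilon\rangle. \]
At the same time, by linearity and the uniform convergence observed above,
\[ \langle u, g_\epsilon\rangle = \epsilon^n \sum_{m\in\mathbb{Z}^n} \phi(\epsilon m)\langle u, e_{\epsilon m}\rangle = \epsilon^n \sum_{m\in\mathbb{Z}^n} F(\epsilon m)\phi(\epsilon m). \]
Similar to before, this is a Riemann sum approximation to the right hand side of (4.9) in the limit $\epsilon \to 0$, which completes the proof of this identity. ∎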
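The Riemann sum $g_\epsilon$ used in the proof can be examined numerically. The sketch below is an illustration in dimension $n = 1$ under stated assumptions: $\phi$ is taken to be a Gaussian (so that $\hat\phi$ has a closed form under the convention $\hat\phi(x) = \int e^{-ix\xi}\phi(\xi)\,d\xi$), the sum over $m \in \mathbb{Z}$ is truncated where $\phi$ is negligible, and the supremum over the compact set $[-3, 3]$ is sampled on a grid.

```python
import cmath
import math

def phi(xi):
    # Gaussian Schwartz function (illustrative choice).
    return math.exp(-xi * xi / 2.0)

def g(eps, x, cutoff=12.0):
    # eps * sum over integers m of e^{-i x eps m} * phi(eps m),
    # truncated to |eps * m| <= cutoff, where phi is negligible beyond.
    M = int(cutoff / eps)
    return eps * sum(cmath.exp(-1j * x * eps * m) * phi(eps * m)
                     for m in range(-M, M + 1))

def phi_hat(x):
    # Closed form of the Fourier transform of the Gaussian above.
    return math.sqrt(2.0 * math.pi) * math.exp(-x * x / 2.0)

def sup_error(eps):
    # Sampled sup-norm distance between g_eps and phi_hat on [-3, 3].
    return max(abs(g(eps, 0.1 * k) - phi_hat(0.1 * k)) for k in range(-30, 31))

errors = [sup_error(eps) for eps in (2.0, 1.0, 0.5)]
print(errors)
```

The sampled errors should decrease as $\epsilon$ shrinks, consistent with the uniform convergence on compact sets claimed in the proof sketch.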

It is now appropriate to write $\hat u(\xi) = \langle u, e_\xi\rangle$, given our convention to treat a locally integrable function and the distribution it determines as the same. Given Theorems 4.8 and 4.10, we actually have that $\hat u(\zeta)$ extends the Fourier transform of $u$ to an analytic function on $\mathbb{C}^n$, sometimes called the Fourier-Laplace transform. Recall that the Laplace transform of a function $f : (0, \infty) \to \mathbb{C}$ is defined as $\mathcal{L}(f)(s) = \int_0^\infty f(x)e^{-sx}\,dx$. In contrast to the Fourier transform, it pairs a function $f$ with an exponentially decaying function $e^{-sx}$ rather than the oscillatory $e^{-ix\xi}$. The term "Fourier-Laplace" is thus used since it mixes the Fourier and Laplace models.

In particular, Theorem 4.10 shows that $|F(\zeta)| = |\hat u(\zeta)| \le C(1 + |\zeta|)^N e^{a|\operatorname{Im}(\zeta)|}$ whenever $u \in \mathcal{E}'(\mathbb{R}^n)$ satisfies $\operatorname{supp}(u) \subset B(0, a)$. Interestingly enough, the converse of this theorem is also true: if $F(\zeta)$ is analytic on $\mathbb{C}^n$ and satisfies $|F(\zeta)| \le C(1 + |\zeta|)^N e^{a|\operatorname{Im}(\zeta)|}$, then $F$ is the Fourier-Laplace transform of a compactly supported distribution $u$ with $\operatorname{supp}(u) \subset B(0, a)$. These results go under the name of the Paley-Wiener-Schwartz theorem. Proving this would be doable for us, but it would take us astray from our bigger objectives. Nonetheless, being able to take Fourier transforms of distributions is an important aspect of what we want to do.

One more important consequence of Theorem 4.8 is that it shows that the Fourier transform of a nonzero compactly supported distribution is never compactly supported itself. Indeed, if $\hat u(\xi)$ vanishes for all $\xi \in \mathbb{R}^n \setminus B(0, R)$ for some $R > 0$, then the coefficients of any power series development of $\hat u(\zeta)$ centered at a point $\xi_0 \in \{\zeta : \operatorname{Im}(\zeta) = 0\} \cap B(0, R)^C$ would have to vanish (see footnote 3). Hence $\hat u$ would vanish on an open set, which by identity principles in several complex variables would imply that $\hat u = 0$ on $\mathbb{C}^n$ and hence $u = 0$.

Footnote 3. This may not be quite as clear as it seems, since at first we can only say that $\partial_\xi^\alpha \hat u(\xi_0) = 0$ for real derivatives in $\xi$, but by repeated use of the Cauchy-Riemann equations $\partial_{\xi_j}\hat u(\xi_0) = -i\partial_{\nu_j}\hat u(\xi_0)$ we can conclude that $\partial_\zeta^\alpha \hat u(\xi_0)$ vanishes for every $\alpha$.
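A concrete instance of the bound (4.7) in Theorem 4.10 can be checked numerically. The sketch below takes $u$ to be the indicator function of $[-a, a]$ on $\mathbb{R}$, an example chosen here for illustration and not taken from the notes. In that case $F(\zeta) = \int_{-a}^a e^{-ix\zeta}\,dx = 2\sin(a\zeta)/\zeta$, and the elementary estimate $|F(\zeta)| \le \int_{-a}^a e^{x\operatorname{Im}(\zeta)}\,dx \le 2a\,e^{a|\operatorname{Im}(\zeta)|}$ gives (4.7) with $C = 2a$ and $N = 0$.

```python
import cmath
import math

a = 1.5  # half-width of the support (illustrative value)

def integrate(f, lo, hi, steps=2000):
    """Composite trapezoid rule on [lo, hi]; works for complex-valued f."""
    h = (hi - lo) / steps
    total = 0.5 * (f(lo) + f(hi))
    for i in range(1, steps):
        total += f(lo + i * h)
    return total * h

def F(zeta):
    # Fourier-Laplace transform of the indicator function of [-a, a].
    if zeta == 0:
        return complex(2.0 * a)
    return 2.0 * cmath.sin(a * zeta) / zeta

def bound(zeta):
    # Right hand side of (4.7) with C = 2a and N = 0.
    return 2.0 * a * math.exp(a * abs(zeta.imag))

# Cross-check the closed form against direct numerical integration at one point.
z0 = 2.0 + 1.0j
numeric = integrate(lambda x: cmath.exp(-1j * x * z0), -a, a)

# Largest ratio |F| / bound over a grid of complex sample points.
grid = [complex(xi, nu) for xi in range(-20, 21) for nu in range(-6, 7)]
worst_ratio = max(abs(F(z)) / bound(z) for z in grid)
print(abs(numeric - F(z0)), worst_ratio)
```

The ratio never exceeds 1 on the sampled grid, with equality at $\zeta = 0$, and the exponential growth of $F$ in the imaginary directions is exactly of the type $e^{a|\operatorname{Im}(\zeta)|}$ permitted by the support of $u$.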

5. Convolutions

In this section we define convolutions of a function with a tempered distribution. Sometimes this can be done with distributions which are not necessarily tempered, but we restrict attention to the case of $u \in \mathcal{S}'(\mathbb{R}^n)$.

Proposition 5.1. Suppose $\psi, \phi \in \mathcal{S}(\mathbb{R}^n)$. Then $\psi * \phi \in \mathcal{S}(\mathbb{R}^n)$, and in particular for any integer $N \ge 0$ and any multi-index $\alpha$, there exists a constant $C$ such that
\[ \|\psi * \phi\|_{(N,\alpha)} \le C \|\psi\|_{(N,\alpha)} \|\phi\|_{(N+n+1,0)}. \tag{5.1} \]

Proof. Since any partial derivative of $\phi$ or $\psi$ is in $L^1(\mathbb{R}^n)$, we have $\partial^\alpha(\psi * \phi) = (\partial^\alpha \psi) * \phi$; hence it suffices to show (5.1) when $\alpha = 0$. In this case,
\[ (1 + |x|)^N |(\psi * \phi)(x)| \le \int (1 + |x|)^N |\psi(x - y)||\phi(y)|\,dy \le \int (2 + |x - y| + |y|)^N |\psi(x - y)||\phi(y)|\,dy \]
\[ \le \sum_{k=0}^N \binom{N}{k} \int \Big( (1 + |x - y|)^k |\psi(x - y)| \Big) \Big( (1 + |y|)^{N-k} |\phi(y)| \Big)\,dy \]
\[ \le \sum_{k=0}^N \binom{N}{k} \|\psi\|_{(N,0)} \|\phi\|_{(N+n+1,0)} \int (1 + |y|)^{-n-1}\,dy \le C \|\psi\|_{(N,0)} \|\phi\|_{(N+n+1,0)}. \]



We can now motivate the definition of a with a Schwartz class function. First observe n that if φ, ψ, η ∈ S(R ), then Z Z Z Z Z (ψ ∗ η)(x)φ(x) dx = ψ(x − y)η(y)φ(x) dydx = η(y) ψ(x − y)φ(x) dxdy Rn Rn Rn Rn Rn Z Z Z = η(y) ψ˜(y − x)φ(x) dxdy = η(y)(ψ˜ ∗ φ)(y) dy, Rn Rn Rn where we recall that ψ˜ is the reflection of ψ, ψ˜(x) = ψ(−x).

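Definition 5.2. The convolution of a tempered distribution $u \in \mathcal{S}'(\mathbb{R}^n)$ and a Schwartz class function $\psi \in \mathcal{S}(\mathbb{R}^n)$ is defined as the distribution which acts on $\varphi \in \mathcal{S}(\mathbb{R}^n)$ by
\[
\langle \psi * u, \varphi\rangle := \langle u, \tilde\psi * \varphi\rangle .
\]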
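As a quick check of the definition, one can take $u = \delta$, the Dirac mass at the origin. The sketch below assumes only the usual pairing $\langle \delta, \varphi\rangle = \varphi(0)$ and the reflection $\tilde\psi(x) = \psi(-x)$.

```latex
% Assumption: \langle \delta, \varphi\rangle = \varphi(0).
\begin{align*}
\langle \psi * \delta, \varphi\rangle
 &= \langle \delta, \tilde\psi * \varphi\rangle
  = (\tilde\psi * \varphi)(0) \\
 &= \int_{\mathbb{R}^n} \tilde\psi(0 - x)\,\varphi(x)\,dx
  = \int_{\mathbb{R}^n} \psi(x)\,\varphi(x)\,dx
  = \langle \psi, \varphi\rangle .
\end{align*}
```

So $\psi * \delta = \psi$ as distributions, which is the behavior one expects from the classical convolution.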

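Theorem 5.3. Given $u \in \mathcal{S}'(\mathbb{R}^n)$ and $\psi \in \mathcal{S}(\mathbb{R}^n)$, define $\eta : \mathbb{R}^n \to \mathbb{C}$ by $\eta(x) = \langle u, \tau_x\tilde\psi\rangle$. Then $\eta$ is a slowly increasing $C^\infty(\mathbb{R}^n)$ function such that
\[
\langle \psi * u, \varphi\rangle = \int_{\mathbb{R}^n} \eta(x)\,\varphi(x)\,dx = \langle \eta, \varphi\rangle, \tag{5.2}
\]
that is, $\psi * u$ as a distribution is identical to the distribution determined by $\eta$. Moreover,
\begin{align}
\partial^\alpha \eta &= (\partial^\alpha\psi) * u = \psi * (\partial^\alpha u), \tag{5.3} \\
\operatorname{supp}(\psi * u) &\subset \{x + y : x \in \operatorname{supp}(u),\ y \in \operatorname{supp}(\psi)\}. \tag{5.4}
\end{align}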
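To see what the theorem says in the simplest case, the sketch below takes $u = \delta_{x_0}$, the Dirac mass at a point $x_0 \in \mathbb{R}^n$ (a symbol introduced here for illustration). It assumes the conventions $\tau_x f(y) = f(y-x)$ and $\langle \partial^\alpha u, \varphi\rangle = (-1)^{|\alpha|}\langle u, \partial^\alpha\varphi\rangle$.

```latex
% Assumptions: \tau_x f(y) = f(y - x), \tilde\psi(y) = \psi(-y),
% \langle \delta_{x_0}, \varphi\rangle = \varphi(x_0), and
% \langle \partial^\alpha u, \varphi\rangle = (-1)^{|\alpha|}\langle u, \partial^\alpha\varphi\rangle.
\begin{align*}
\eta(x)
 &= \langle \delta_{x_0}, \tau_x\tilde\psi\rangle
  = \tilde\psi(x_0 - x)
  = \psi(x - x_0), \\
\partial^\alpha\eta(x)
 &= (\partial^\alpha\psi)(x - x_0)
  = \big((\partial^\alpha\psi) * \delta_{x_0}\big)(x), \\
\big(\psi * \partial^\alpha\delta_{x_0}\big)(x)
 &= \langle \partial^\alpha\delta_{x_0}, \tau_x\tilde\psi\rangle
  = (-1)^{|\alpha|}\,\partial_y^\alpha\big[\psi(x - y)\big]\Big|_{y = x_0}
  = (\partial^\alpha\psi)(x - x_0), \\
\operatorname{supp}(\psi * \delta_{x_0})
 &= \{x_0 + y : y \in \operatorname{supp}(\psi)\} .
\end{align*}
```

In this example convolution with $\delta_{x_0}$ is translation by $x_0$, both expressions in (5.3) agree, and the inclusion (5.4) is an equality.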

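Proof. That $\eta$ is slowly increasing follows from the following computation, which uses the seminorm estimate satisfied by the tempered distribution $u$ together with the inequality $1 + |y| \le (1+|x|)(1+|x-y|)$:
\begin{align*}
|\langle u, \tau_x\tilde\psi\rangle|
 &\le C \sum_{|\alpha|\le M} \sup_y\, (1+|y|)^N\, |\partial^\alpha\tilde\psi(y-x)| \\
 &\le C(1+|x|)^N \Big( \sum_{|\alpha|\le M} \sup_y\, (1+|x-y|)^N\, |\partial^\alpha\psi(x-y)| \Big) \\
 &\le C\,(1+|x|)^N \sum_{|\alpha|\le M} \|\psi\|_{(N,\alpha)} .
\end{align*}

Next we show that $\eta \in C^\infty(\mathbb{R}^n)$. Begin by observing that since $\tau_{y+x}f = \tau_y(\tau_x f)$, an application of Lemma 4.7 implies that in the sense of convergence in $\mathcal{S}(\mathbb{R}^n)$,
\[
\frac{\tau_{x+he_j}\tilde\psi(y) - \tau_x\tilde\psi(y)}{h}
 = \frac{\tau_{he_j}(\tau_x\tilde\psi(y)) - (\tau_x\tilde\psi(y))}{h}
 \to -\partial_{y_j}(\tau_x\tilde\psi(y)), \quad \text{as } h \to 0,
\]
where the minus sign results from the use of $\tau_{he_j}$ rather than $\tau_{-he_j}$ in the difference quotient. Moreover,
\[
-\partial_{y_j}(\tau_x\tilde\psi(y))
 = -\partial_{y_j}(\tilde\psi(y-x))
 = -\partial_{y_j}(\psi(x-y))
 = (\partial^{e_j}\psi)(x-y)
 = \tau_x\big(\widetilde{\partial^{e_j}\psi}\big)(y).
\]
Hence by linearity and continuity of $u$,
\[
\partial_{x_j}\eta(x)
 = \lim_{h\to 0} \frac{\langle u, \tau_{x+he_j}\tilde\psi\rangle - \langle u, \tau_x\tilde\psi\rangle}{h}
 = \lim_{h\to 0} \Big\langle u(y), \frac{\tau_{x+he_j}\tilde\psi(y) - \tau_x\tilde\psi(y)}{h} \Big\rangle
 = \big\langle u, \tau_x\widetilde{\partial^{e_j}\psi}\big\rangle .
\]
An induction argument on the order of $\alpha$ now shows that
\[
\partial^\alpha\eta(x) = \big\langle u, \tau_x\widetilde{\partial^\alpha\psi}\big\rangle . \tag{5.5}
\]

We now show (5.2), which is similar to the proof of Theorem 4.11. Once again some technicalities are neglected here. Define
\[
g_\epsilon(y) = \epsilon^n \sum_{m\in\mathbb{Z}^n} \varphi(\epsilon m)\,\tau_{\epsilon m}\tilde\psi(y),
\]
so that
\[
\lim_{\epsilon\to 0^+} \langle u, g_\epsilon\rangle
 = \lim_{\epsilon\to 0^+} \epsilon^n \sum_{m\in\mathbb{Z}^n} \varphi(\epsilon m)\,\langle u, \tau_{\epsilon m}\tilde\psi\rangle
 = \lim_{\epsilon\to 0^+} \epsilon^n \sum_{m\in\mathbb{Z}^n} \varphi(\epsilon m)\,\eta(\epsilon m)
 = \int_{\mathbb{R}^n} \eta(x)\,\varphi(x)\,dx .
\]
At the same time, in the topology of $\mathcal{S}(\mathbb{R}^n)$,
\[
\lim_{\epsilon\to 0^+} g_\epsilon(y)
 = \int_{\mathbb{R}^n} \tau_x\tilde\psi(y)\,\varphi(x)\,dx
 = \int_{\mathbb{R}^n} \tilde\psi(y-x)\,\varphi(x)\,dx
 = (\tilde\psi * \varphi)(y),
\]
where we have used that $\tau_x\tilde\psi(y) = \tilde\psi(y-x) = \psi(x-y)$. The identity (5.2) now follows since
\[
\langle \psi * u, \varphi\rangle
 = \langle u, \tilde\psi * \varphi\rangle
 = \lim_{\epsilon\to 0^+} \langle u, g_\epsilon\rangle
 = \int_{\mathbb{R}^n} \eta(x)\,\varphi(x)\,dx .
\]

Given (5.5), we may now replace $\psi$ by $\partial^\alpha\psi$ in (5.2) to obtain
\[
\langle \partial^\alpha\eta, \varphi\rangle = \langle (\partial^\alpha\psi) * u, \varphi\rangle,
\]
and since this holds for every $\varphi \in \mathcal{S}(\mathbb{R}^n)$, the first identity in (5.3) follows. The idea behind the second identity is that since we can now consider $((\partial^\alpha\psi) * u)(x) = \partial^\alpha\eta(x)$ to be a function, this function can equivalently be computed as $\psi * (\partial^\alpha u)(x) = \langle \partial^\alpha u, \tau_x\tilde\psi\rangle$. To prove it, observe that $\partial^\alpha\tilde\psi = (-1)^{|\alpha|}\,\widetilde{\partial^\alpha\psi}$ and that by the chain rule $\partial^\alpha(\tau_x f) = \tau_x(\partial^\alpha f)$, hence
\[
(\partial^\alpha\psi) * u(x)
 = \big\langle u, \tau_x\widetilde{\partial^\alpha\psi}\big\rangle
 = (-1)^{|\alpha|}\langle u, \tau_x(\partial^\alpha\tilde\psi)\rangle
 = (-1)^{|\alpha|}\langle u, \partial^\alpha(\tau_x\tilde\psi)\rangle
 = \langle \partial^\alpha u, \tau_x\tilde\psi\rangle
 = \psi * (\partial^\alpha u)(x).
\]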
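For reference, the sign in the difference quotient and the induction step behind (5.5) can be written out as follows. This is a sketch that assumes the convention $\tau_x f(y) = f(y - x)$ and states the first limit only pointwise for a smooth $f$, whereas the proof needs it in the topology of $\mathcal{S}(\mathbb{R}^n)$.

```latex
% Sign in the difference quotient, assuming \tau_x f(y) = f(y - x):
\[
\frac{\tau_{h e_j} f(y) - f(y)}{h}
 = \frac{f(y - h e_j) - f(y)}{h}
 \longrightarrow -\partial_{y_j} f(y)
 \quad (h \to 0).
\]
% Induction step: apply the first-order identity with \psi replaced by
% \partial^\alpha\psi, which is again a Schwartz function.
\[
\partial_{x_j}\partial^\alpha\eta(x)
 = \partial_{x_j}\big\langle u, \tau_x\widetilde{\partial^\alpha\psi}\big\rangle
 = \big\langle u, \tau_x\widetilde{\partial^{e_j}\partial^\alpha\psi}\big\rangle
 = \big\langle u, \tau_x\widetilde{\partial^{\alpha + e_j}\psi}\big\rangle .
\]
```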

To conclude, we show (5.4). Given $z \in \mathbb{R}^n$, observe that

\[
\operatorname{supp}(\tau_z\tilde\psi) = \{z - y : y \in \operatorname{supp}(\psi)\}.
\]

If $0 \ne (\psi * u)(z) = \langle u, \tau_z\tilde\psi\rangle$, then $\operatorname{supp}(u) \cap \operatorname{supp}(\tau_z\tilde\psi) \ne \emptyset$, so let $x$ be a member of this intersection. Thus $x \in \operatorname{supp}(u)$ and $x = z - y$ for some $y \in \operatorname{supp}(\psi)$, that is, $z = x + y$ with $y \in \operatorname{supp}(\psi)$. This now shows
\[
\{z : (\psi * u)(z) \ne 0\} \subset \{x + y : x \in \operatorname{supp}(u),\ y \in \operatorname{supp}(\psi)\},
\]
at which point (5.4) follows by taking closures. □

6. Further reading

The treatment above draws from the texts [Fol99, Ch. 9], [SS11, Ch. 3], [Fri98]. Folland's text [Fol99] is a rather comprehensive introduction to several aspects of analysis. It gives a concise introduction to topological vector spaces in an earlier chapter, so it does not hesitate to use some of the definitions introduced there. Nonetheless it is an accessible treatment of the subject. As in Folland's book, distributions account for only one chapter in the fourth volume of the series of Stein and Shakarchi [SS11]. Nonetheless, it is another well-written and concise treatment of the theory of distributions.

Friedlander's text [Fri98] is arguably a must for anyone who needs to use the theory of distributions routinely. Aside from the appendix, it avoids most of the theory of topological vector spaces, getting straight to the heart of the matter. More importantly, it thoroughly treats the wide range of topics one often needs to study harmonic analysis and PDE from this standpoint. Strichartz's text [Str94] is another accessible introduction to the subject.

The text of Rudin [Rud73] presents distributions (and other subjects in functional analysis) from the standpoint of topological vector spaces. In particular, it gives a comprehensive introduction to the topologies one endows on the vector spaces $C_c^\infty(\Omega)$, $C^\infty(\Omega)$, and $\mathcal{S}(\mathbb{R}^n)$. Finally, the first volume of Hörmander's series [Hör90] is a sophisticated treatment of distributions, seeking to set the stage for pseudodifferential operators and Fourier integral operators.

References

[Fol99] Gerald B. Folland, Real analysis, second ed., Pure and Applied Mathematics (New York), John Wiley & Sons, Inc., New York, 1999, Modern techniques and their applications, A Wiley-Interscience Publication. MR 1681462

[Fri98] F. G. Friedlander, Introduction to the theory of distributions, second ed., Cambridge University Press, Cambridge, 1998, With additional material by M. Joshi. MR 1721032

[Hör90] Lars Hörmander, The analysis of linear partial differential operators. I, second ed., Springer Study Edition, Springer-Verlag, Berlin, 1990, Distribution theory and Fourier analysis. MR 1065136

[Rud73] Walter Rudin, Functional analysis, McGraw-Hill Book Co., New York-Düsseldorf-Johannesburg, 1973, McGraw-Hill Series in Higher Mathematics. MR 0365062

[SS11] Elias M. Stein and Rami Shakarchi, Functional analysis, Princeton Lectures in Analysis, vol. 4, Princeton University Press, Princeton, NJ, 2011, Introduction to further topics in analysis. MR 2827930

[Str94] Robert S. Strichartz, A guide to distribution theory and Fourier transforms, Studies in Advanced Mathematics, CRC Press, Boca Raton, FL, 1994. MR 1276724