Gerrit van Dijk Distribution Theory De Gruyter Graduate Lectures

Gerrit van Dijk Distribution Theory

Convolution, Fourier Transform, and Laplace Transform Subject Classification 2010 35D30, 42A85, 42B37, 46A04, 46F05, 46F10, 46F12

Author Prof. Dr. Gerrit van Dijk Universiteit Leiden Mathematisch Instituut Niels Bohrweg 1 2333 CA Leiden The Netherlands E-Mail: [email protected]

ISBN 978-3-11-029591-7 e-ISBN 978-3-11-029851-2

Library of Congress Cataloging-in-Publication Data A CIP catalog record for this book has been applied for at the Library of Congress.

Bibliographic information published by the Deutsche Nationalbibliothek The Deutsche Nationalbibliothek lists this publication in the Deutsche Nationalbibliografie; detailed bibliographic data are available in the Internet at http://dnb.dnb.de.

© 2013 Walter de Gruyter GmbH, Berlin/Boston Cover image: Fourier-Transformation of a sawtooth wave. © TieR0815 Typesetting: le-tex publishing services GmbH, Leipzig Printing and binding: Hubert & Co.GmbH & Co. KG, Göttingen Printed on acid-free paper Printed in Germany

www.degruyter.com Preface

The mathematical concept of a distribution originates from physics. It was first used by O. Heaviside, a British engineer, in his theory of symbolic calculus and then by P.A. M. Dirac around 1920 in his research on quantum mechanics, in which he in- troduced the delta- (or delta-distribution). The foundations of the mathe- matical theory of distributions were laid by S. L. Sobolev in 1936, while in the 1950s L. Schwartz gave a systematic account of the theory. The theory of distributions has numerous applications and is extensively used in mathematics, physics and engineer- ing. In the early stages of the theory one used the term rather than distribution, as is still reflected in the term delta-function and the title of some textbooks on the subject. This book is intended as an introduction to distribution theory, as developed by Laurent Schwartz. It is aimed at an audience of advanced undergraduates or begin- ning graduate students. It is based on lectures I have given at Utrecht and Leiden University. Student input has strongly influenced the writing, and I hope that this book will help students to share my enthusiasm for the beautiful topics discussed. Starting with the elementary theory of distributions, I proceed to products of distributions, Fourier and Laplace transforms, tempered distributions, summable distributions and applications. The theory is illustrated by several exam- ples, mostly beginning with the case of the real line and then followed by examples in higher dimensions. This is a justified and practical approach in our view, it helps the reader to become familiar with the subject. A moderate number of exercises are added with hints to their solutions. There is relatively little expository literature on distribution theory compared to other topics in mathematics, but there is a standard reference [10], and also [6]. I have mainly drawn on [9] and [5]. The main prerequisites for the book are elementary real, complex and functional analysis and Lebesgue integration. In the later chapters we shall assume familiarity with some more advanced measure theory and functional analysis, in particular with the Banach–Steinhaus theorem. The emphasis is however on applications, rather than on the theory. For terminology and notations we generally follow N. Bourbaki. Sections with a star may be omitted at first reading. The index will be helpful to trace important notions defined in the text. Thanks are due to my colleagues and students in The Netherlands for their re- marks and suggestions. Special thanks are due to Dr J. D. Stegeman (Utrecht) whose help in developing the final version of the manuscript has greatly improved the pre- sentation.

Leiden, November 2012 Gerrit van Dijk

Contents

Preface V

1 Introduction 1

2 Definition and First Properties of Distributions 3 2.1 Test Functions 3 2.2 Distributions 4 2.3 of a Distribution 6

3 Differentiating Distributions 9 3.1 Definition and Properties 9 3.2 Examples 10 λ−1 3.3 The Distributions x+ (λ =0, −1, −2,...)* 12 3.4 Exercises 14 3.5 Green’s Formula and Harmonic Functions 14 3.6 Exercises 20

4 Multiplication and Convergence of Distributions 22 4.1 Multiplication with a C ∞ Function 22 4.2 Exercises 23 4.3 Convergence in D 23 4.4 Exercises 24

5 Distributions with Compact Support 26 5.1 Definition and Properties 26 5.2 Distributions Supported at the Origin 27 5.3 Taylor’s Formula for Rn 27 5.4 Structure of a Distribution* 28

6 Convolution of Distributions 31 6.1 Tensor Product of Distributions 31 6.2 Convolution Product of Distributions 33 6.3 Associativity of the Convolution Product 39 6.4 Exercises 39 6.5 Newton Potentials and Harmonic Functions 40 6.6 Convolution Equations 42 6.7 Symbolic Calculus of Heaviside 45 6.8 Volterra Integral Equations of the Second Kind 47 VIII Contents

6.9 Exercises 49 6.10 Systems of Convolution Equations* 50 6.11 Exercises 51

7TheFourierTransform52 7.1 Fourier Transform of a Function on R 52 7.2 The Inversion Theorem 54 7.3 Plancherel’s Theorem 56 7.4 Differentiability Properties 57 7.5 The Schwartz Space S(R) 58 7.6 The Space of Tempered Distributions S(R) 60 7.7 Structure of a Tempered Distribution* 61 7.8 Fourier Transform of a Tempered Distribution 63 7.9 Paley–Wiener Theorems on R* 65 7.10 Exercises 68 7.11 Fourier Transform in Rn 69 7.12 The Heat or Diffusion Equation in One Dimension 71

8TheLaplaceTransform74 8.1 Laplace Transform of a Function 74 8.2 Laplace Transform of a Distribution 75 8.3 Laplace Transform and Convolution 76 8.4 Inversion Formula for the Laplace Transform 79

9 Summable Distributions* 82 9.1 Definition and Main Properties 82 9.2 The Iterated Poisson Equation 83 9.3 Proof of the Main Theorem 84 9.4 Canonical Extension of a Summable Distribution 85 9.5 RankofaDistribution 87

10 Appendix 90 10.1 The Banach–Steinhaus Theorem 90 10.2 TheBetaandGammaFunction 97

11 Hints to the Exercises 102

References 107

Index 109 1Introduction

Differential equations appear in several forms. One has ordinary differential equa- tions and partial differential equations, equations with constant coefficients and with coefficients. Equations with constant coefficients are relatively well under- stood. If the coefficients are variable, much less is known. Let us consider the singular differential equation of the first order

xu = 0 . (∗)

Though the equation is defined everywhere on the real line, classically a solution is only given for x>0 and x<0.Inbothcasesu(x) = c with c a constant, different for x>0 and x<0 eventually. In order to find a global solution, we consider a weak form of the differential equation. Let ϕ be a C1 function on the real line, vanishing outside some bounded interval. Then equation (∗)canberephrasedas

∞ xu,ϕ = xu(x) ϕ(x) dx = 0 . −∞

Applying partial integration, we get xu,ϕ =− u, ϕ + xϕ = 0 .

Take u(x) = 1 for x ≥ 0, u(x) = 0 for x<0.Thenweobtain ∞ ∞ ∞ xu,ϕ =− ϕ(x) + xϕ(x) dx =− ϕ(x) dx + ϕ(x) dx = 0 . 0 0 0

Call this function H, known as the Heaviside function. We see that we obtain the following (weak) global solutions of the equation:

u(x) = c1 H(x) + c2 , with c1 and c2 constants. Observe that we get a two-dimensional solution space. One can show that these are all weak solutions of the equation. The functions ϕ are called test functions. Of course one can narrow the class of test functions to Ck functions with k>1, vanishing outside a bounded interval. This is certainly useful if the order of the differential equation is greater than one. It would be nice if we could assume that the test functions are C∞ functions, vanishing outside a bounded interval. But then there is really something to show: do such functions exist? The answer is yes (see Chapter 2). Therefore we can set up a nice theory of global solutions. This is important in several branches of mathematics and physics. Consider, for example, a point mass in R3 with force field having potential V = 1/r , 2 Introduction

r being the function. It satisfies the partial differential equation ΔV = 0 outside 0. To include the origin in the equation, one writes it as

ΔV =−4πδ with δ the functional given by δ, ϕ=ϕ(0). So it is the desire to go to global equations and global solutions to develop a theory of (weak) solutions. This theory is known as distribution theory. It has several applications also outside the theory of differential equations. To mention one, in representation theory of groups, a well- known concept is the character of a representation. This is perfectly defined for finite- dimensional representations. If the dimension of the space is infinite, the concept of distribution character can take over that role. 2 Definition and First Properties of Distributions

Summary In this chapter we show the existence of test functions, define distributions, give some examples and prove their elementary properties. Similar to the notion of support of a function we define the support of a distribution. This is a rather technical part, but it is important because it has applications in several other branches of mathematics, such as differential geometry and the theory of Lie groups.

Learning Targets

 Understanding the definition of a distribution.  Getting acquainted with the notion of support of a distribution.

2.1 Test Functions

n We consider the Euclidean space R , n ≥ 1,withelementsx = (x1,...,xn).One = 2 +···+ 2 defines x x1 xn ,thelength of x. Let ϕ be a complex-valued function on Rn. The closure of the set of points {x ∈ Rn : ϕ(x) =0} is called the support of ϕ and is denoted by Supp ϕ.

For any n-tuple k = (k1,...,kn) of nonnegative integers ki one defines the par- tial differential operator Dk as

∂ k1 ∂ kn ∂|k| Dk = ··· = . ∂x ∂x k1 ··· kn 1 n ∂x1 ∂xn

The symbol |k|=k1 +···+kn is called the order of the partial differential operator. Note that order 0 corresponds to the identity operator. Of course, in the special case n = 1 we simply have the differential operators dk/dxk (k ≥ 0). Afunctionϕ : Rn → C is called a Cm function if all partial Dkϕ of order |k|≤m exist and are continuous. The space of all Cm functions on Rn will be denoted by Em(Rn). In practice, once a value of n ≥ 1 is fixed, this space will be simply denoted by Em. Afunctionϕ : Rn → C is called a C∞ function if all its partial derivatives Dkϕ exist and are continuous. A C∞ function with compact support is called a test func- tion.ThespaceofallC∞ functions on Rn will be denoted by E(Rn),thespaceoftest functions on Rn by D(Rn). In practice, once a value of n ≥ 1 is fixed, these spaces will be simply denoted by E and D respectively. It is not immediately clear that nontrivial test functions exist. The requirement of being C∞ is easy (for example every polynomial function is), but the requirement of also having compact support is difficult. See the following example however. 4 Definition and First Properties of Distributions

Example of a Test Function First take n = 1, the real line. Let ϕ be defined by ⎧ ⎨⎪ 1 exp − if |x| < 1 = − 2 ϕ(x) ⎪ 1 x ⎩ 0 if |x|≥1 .

Then ϕ ∈D(R). Toseethis,itissufficienttoshowthatϕ is infinitely many times differentiable at the points x =±1 and that all derivatives at x =±1 vanish. After performing a translation, this amounts to showing that the function f defined by f(x) = e−1/x (x > 0), f(x) = 0 (x ≤ 0) is C∞ at x = 0 and that f (m)(0) = 0 for −1/x k all m = 0, 1, 2,.... This easily follows from the fact that limx↓0 e /x = 0 for all k = 0, 1, 2,.... For arbitrary n ≥ 1 we denote r = x and take ⎧ ⎨⎪ 1 exp − if r<1 ϕ(x) = 1 − r 2 ⎩⎪ 0 if r ≥ 1 .

Then ϕ ∈D(Rn). The space D is a complex linear space, even an algebra. And even more generally, if ϕ ∈Dand ψ a C∞ function, then ψϕ ∈D. It is easily verified that Supp(ψϕ) ⊂ Supp ϕ ∩ Supp ψ. We define an important convergence principle in D

Definition 2.1. A of functions ϕj ∈D(j = 1, 2,...)converges to ϕ ∈Dif the following two conditions are satisfied:

(i) The supports of all ϕj are contained in a compact set, not depending on j, k (ii) For any n-tuple k of nonnegative integers the functions D ϕj converge uniformly to Dkϕ(j→∞).

2.2 Distributions

We can now define the notion of a distribution. A distribution on the Euclidean space Rn (with n ≥ 1) is a continuous complex-valued linear function defined on D(Rn), the linear space of test functions on Rn. Explicitly, a function T : D(Rn) → C is a distribution if it has the following properties: a. T(ϕ1 + ϕ2) = T(ϕ1) + T(ϕ2) for all ϕ1,ϕ2 ∈D, b. T(λϕ) = λT(ϕ)for all ϕ ∈Dand λ ∈ C, c. If ϕj tends to ϕ in D then T(ϕj) tends T(ϕ).

Instead of the term linear function, the term linear form is often used in the literature. Distributions 5

 The set D of all distributions is itself a linear space: the sum T1 + T2 and the scalar product λT are defined by

T1 + T2,ϕ=T1,ϕ+T2,ϕ , λT, ϕ=λ T, ϕ for all ϕ ∈D. One usually writes T, ϕ for T(ϕ), which is convenient because of the double linearity.

Examples of Distributions 1. A function on Rn is called locally integrable if it is integrable over every compact subset of Rn. Clearly, continuous functions are locally integrable. Let f be a lo- n cally integrable function on R .ThenTf ,definedby   Tf ,ϕ = f(x) ϕ(x) dx(ϕ∈D) Rn

n is a distribution on R , a so-called regular distribution.ObservethatTf is well defined because of the compact support of the functions ϕ. One also writes in

this case f, ϕ for Tf ,ϕ. Because of this relationship between Tf and f ,it is justified to call distributions generalized functions,whichwascustomaryatthe start of the theory of distributions. Clearly, if the function f is continuous, then

the relationship is one-to-one: if Tf = 0 then f = 0. Indeed:

Lemma 2.2. Let f be a continuous function satisfying f, ϕ=0 for all ϕ ∈D. Then f is identically zero.

Proof. Let f satisfy f, ϕ=0 for all ϕ ∈Dand suppose that f(x0) =0 for some x0. Then we may assume that Re f(x0) =0 (otherwise consider if ), even Re f(x0)>0.Sincef is continuous, there is a neighborhood V of x0 where ∈D ≥ = ⊂ Re f(x 0)>0.Ifϕ , ϕ 0, ϕ 1 near x0 and Supp ϕ V ,then Re[ f(x) ϕ(x)dx] > 0,hencef, ϕ =0. This contradicts the assumption on f .

We shall see later on in Section 6.2 that this lemma is true for general locally

integrable functions: if Tf = 0 then f = 0 almost everywhere. 2. The Dirac distribution δ: δ, ϕ=ϕ(0)(ϕ ∈D). More generally for a ∈ Rn,

δ(a),ϕ=ϕ(a). One sometimes uses the terminology δ-function and writes

δ, ϕ= ϕ(x) δ(x) dx, δ(a),ϕ= ϕ(x) δ(a)(x) dx Rn Rn

for ϕ ∈D.Butclearlyδ is not aregularfunction. 6 Definition and First Properties of Distributions

3. For any n-tuple k of nonnegative integers and any a ∈ Rn, T, ϕ=Dkϕ(a) (ϕ ∈D) is a distribution on Rn.

We return to the definition of a distribution and in particular to the continuity property. An equivalent definition is the following:

Proposition 2.3. A distribution T is a linear function on D such that for each compact n subset K of R there exists a constant CK and an integer m with     k  |T, ϕ| ≤ CK sup D ϕ |k|≤m for all ϕ ∈Dwith Supp ϕ ⊂ K.

Proof. Clearly any linear function T that satisfies the above inequality is a distribu- tion. Let us show the converse by contradiction. If a distribution T does not satisfy such an inequality, then for some compact set K and for all C and m,e.g.forC = m

(m = 1, 2,...),wecanfindafunctionϕm ∈Dwith T, ϕm=1, Supp ϕm ⊂ K, k |D ϕm|≤1/m if |k|≤m.Clearlyϕm tends to zero in D if m tends to infinity, but T, ϕm=1 for all m. This contradicts the continuity condition of T . If m can be chosen independent of K,thenT is said to be of finite order.The smallest possible m is called the order of T . If, in addition, CK can be chosen inde- pendent of K,thenT is said to be a summable distribution (see Chapter 9).

In the above examples, Tf ,δ,δ(a) are of order zero, in Example 3 above the dis- tribution T is of order |k|, hence equal to the order of the partial differential operator.

2.3 Support of a Distribution

Let f be a nonvanishing continuous function on Rn. The support of f was defined in Section 2.1 as the closure of the set of points where f does not vanish. If now ϕ  = is a test function with support in the complement of the support of f then f, ϕ Rn f(x) ϕ(x) dx = 0. It is easy to see that this complement is the largest open set with this property. Indeed, if O is an open set such that f, ϕ=0 for all ϕ with Supp ϕ ⊂ O, then, similar to the proof of Lemma 2.2, f = 0 on O,henceSupp f ∩ O =∅. These observations permit us to define the support of a distribution in a similar way. We begin with two important lemmas, of which the proofs are rather terse, but they are of great value.

Lemma 2.4. Let K ⊂ Rn be a compact subset of Rn. Then for any open set O contain- ing K, there exists a function ϕ ∈Dsuch that 0 ≤ ϕ ≤ 1,ϕ= 1 on a neighborhood of K and Supp ϕ ⊂ O. Support of a Distribution 7

Proof. Let 0

Then f is C∞ and the same holds for

b  b F(x) = f(t)dt f(t)dt. x a  1 if x ≤ a Observe that F(x) = 0 if x ≥ b. Rn = 2 +··· 2 ∞ The function ψ on given by ψ(x1,...,xn) F(x1 xn) is C ,isequal to 1 for r 2 ≤ a and zero for r 2 ≥ b. Let B ⊂ B be two different concentric balls in Rn. By performing a linear trans- formation in Rn if necessary, we can now construct a function ψ in D that is equal to 1 on B and zero outside B. Now consider K. There are finitely many balls B1,...,Bm needed to cover K and m ⊂ we can arrange it such that i=1 Bi O. Onecanevenmanageinsuchawaythat   the balls B1,...,Bm which are concentric with B1,...,Bm, but have half the radius,  cover K as well. Let ψi ∈Dbe such that ψi = 1 on Bi and zero outside Bi.Set

ψ = 1 − (1 − ψ1)(1 − ψ2) ···(1 − ψm).

Then ψ ∈D, 0 ≤ ψ ≤ 1, ψ is equal to 1 on a neighborhood of K and Supp ψ ⊂ O.

Rn Lemma 2.5 (). Let O1,...,Om be open subsets of , K a compact Rn ⊂ m ∈D subset of and assume K i=1 Oi. Then there are functions ϕi with ⊂ ≥ m ≤ m = Supp ϕi Oi such that ϕi 0, i=1 ϕi 1 and i=1 ϕi 1 on a neighborhood of K.  ⊂ ⊂ m Proof. Select compact subsets Ki Oi such that K i−1 Ki.Bythepreviouslem- ma there are functions ψi ∈Dwith Supp ψi ⊂ Oi,ψi = 1 on a neighborhood of Ki, 0 ≤ ψi ≤ 1.Set

ϕ1 = ψ1,ϕi = ψi(1 − ψ1) ···(1 − ψi−1)(i= 2,...,m).  m = − − ··· − Then i=1 ϕi 1 (1 ψ1) (1 ψm), which is indeed equal to 1 on a neigh- borhood of K. Let T be a distribution. Let O be an open subset of Rn such that T, ϕ=0 for all ϕ ∈Dwith Supp ϕ ⊂ O.ThenwesaythatT vanishes on O.LetU be the union of such open sets O.ThenU is again open, and from Lemma 2.5 it follows that T vanishes on U.ThusU is the largest open set on which T vanishes. Therefore we can now give the following definition. 8 Definition and First Properties of Distributions

Definition 2.6. The support of the distribution T on Rn, denoted by Supp T ,isthe complement of the largest open set in Rn on which T vanishes.

Clearly, Supp T is a closed subset of Rn.

Examples

1. Supp δ(a) ={a}.

2. If f is a continuous function, then Supp Tf = Supp f . The following proposition is useful.

Proposition 2.7. Let T be a distribution on Rn of finite order m.ThenT, ψ=0 for all ψ ∈Dfor which the derivatives Dkψ with |k|≤m vanish on Supp T .

Proof. Let ψ ∈Dbe as in the proposition and assume that Supp ψ is contained in a compact subset K of Rn.SincetheorderofT is finite, equal to m, one has     k  |T, ϕ| ≤ CK sup D ϕ |k|≤m for all ϕ ∈Dwith Supp ϕ ⊂ K, CK being a positive constant. Applying Lemma 2.4, choose for any ε>0 afunctionϕε ∈Dsuch that ϕε = 1 k on a neighborhood of K ∩ Supp T and such that sup |D (ϕεψ)| <εfor all partial differential operators Dk with |k|≤m.Then     k  n |T, ψ| = |T, ϕεψ| ≤ CK sup D (ϕεψ) ≤ CK (m + 1) ε. |k|≤m

Since this holds for all ε>0, it follows that T, ψ=0.

Further Reading

Every book on distribution theory contains of course the definition and first proper- ties of distributions discussed in this chapter. See, e. g. [9], [10] and [5]. Schwartz’s book [10] contains a large amount of additional material. 3 Differentiating Distributions

Summary In this chapter we define the of a distribution and show that any distribu- tion can be differentiated an arbitrary number of times. We give several examples of distributions and their derivatives, both on the real line and in higher dimensions. The principal value and the finite part of a distribution are introduced. In higher di- mensions we derive Green’s formula and study harmonic functions only depending on the radius r , both as classical functions outside the origin and globally as distribu- tions. These functions play a prominent role in physics, when studying the potential ofafieldofapointmassorofanelectron.

Learning Targets

 Understanding that any distribution can be differentiated an arbitrary number of times.  Learning about the principal value and the finite part of a distribution. Deriving a formula of jumps.  What are harmonic functions and how do they behave as distributions?

3.1 Definition and Properties

If f is a continuously on R and ϕ ∈D(R),thenitiseasily verified by partial integration, using that ϕ has compact support, that

  ∞ ∞   df dϕ ,ϕ = f (x) ϕ(x) dx =− f(x)ϕ(x) dx =− f, . dx dx −∞ −∞

In a similar way one has for a continuously differentiable function f on Rn(n ≥ 1),     ∂f ∂ϕ ,ϕ =− f, ∂xi ∂xi for all i = 1, 2,...,nand all ϕ ∈D(Rn). The following definition is then natural for n a general distribution T on R : ∂T/∂xi,ϕ=−T, ∂ϕ/∂xi.Noticethat∂T/∂xi (ϕ ∈D(Rn)) is again a distribution. We resume:

Definition 3.1. Let T be a distribution on Rn. Then the ith partial derivative of T is defined by     ∂T ∂ϕ   ,ϕ =− T, ϕ ∈D(Rn) . ∂xi ∂xi 10 Differentiating Distributions

One clearly has

∂2T ∂2T = (i, j = 1,...,n) ∂xi∂xj ∂xj ∂xi because the same is true for functions in D(Rn). Let k be a n-tuple of nonnegative integers. Then one has       DkT, ϕ = (−1)|k| T, Dkϕ ϕ ∈D(Rn) .

We may conclude that a distribution can be differentiated an arbitrary number of times. In particular any locally integrable function is C∞ as a distribution. Also the δ-function can be differentiated an arbitrary number of times. = k ∈ Let D |k|≤m ak D be a differential operator with constant coefficients ak t t |k| k C. Then the adjoint or transpose D of D is by definition D = |k|≤m(−1) ak D . One clearly has ttD = D and     DT, ϕ= T,tDϕ ϕ ∈D(Rn)

t = for any distribution T .If D D we say that D is self-adjoint. The Laplace operator Δ = n 2 2 i=1 ∂ /∂xi is self-adjoint.

3.2 Examples

Asbefore,webeginwithsomeexamplesontherealline. 0 if x<0 1. Let Y(x) = 1 if x ≥ 0. The function Y is called the Heaviside function, named after Oliver Heaviside (1850–1925), a self-taught British electrical engineer, mathematician, and physi- cist. One has Y  = δ. Indeed, ∞ Y ,ϕ =− Y, ϕ =− ϕ(x) dx = ϕ(0) =δ, ϕ (ϕ ∈D). 0

2. For all m = 0, 1, 2,...one has δ(m),ϕ=(−1)m ϕ(m)(0) (ϕ ∈D). This is easy. 3. Let us generalize Example 1 to more general functions and higher derivatives. Consider a function f : R → C that is m times continuously differentiable for x =0 and such that

lim f (k)(x) and lim f (k)(x) x↓0 x↑0 exist for all k ≤ m. Let us denote the difference between these limits (the jump)

by σk,thus (k) (k) σk = lim f (x) − lim f (x) . x↓0 x↑0 Examples 11

Call f ,f,...the distributional derivatives of f and {f }, {f },...the distribu- tions defined by the ordinary derivatives of f on R\{0}. For example, if f = Y , then f  = δ, {f }=0. One easily verifies by partial integration that ⎧  ⎪  =  + ⎪ f f σ0 δ, ⎪    ⎨ f = f + σ0 δ + σ1 δ, ⎪ . ⎪ . ⎪ .  ⎩ (m) (m) (m−1) (m−2) f = f + σ0 δ + σ1 δ +···+σm−1 δ.

So, for example,   Y(x) cos x =−Y(x) sin x + δ,   Y(x) sin x = Y(x) cos x.

1 4. The distribution pv x . We now come to an important notion in distribution theory, the principal value of a function. We restrict ourselves to the function f(x) = 1/x and define pv(1/x) (the principal value of 1/x)by   ϕ(x) pv 1 ,ϕ = lim dx(ϕ∈D). x ε↓0 x |x|≥ε

ε By a well-known result from calculus, we know that limx↓0 x log x = 0 for any ε>0. Therefore, log |x| is a locally integrable function and thus defines a regu- lar distribution. Keeping this in mind and using partial integration we obtain for any ϕ ∈D

  ∞ pv 1 ,ϕ =−lim log |x| ϕ(x) dx =− log |x| ϕ(x) dx, x ε↓0 |x|≥ε −∞

and we see that pv(1/x) is a distribution since   1 = | |  pv x log x in the sense of distributions. 5. Partie finie. We now introduce another important notion, the so-called partie finie (or finite part) of a function. Again we restrict to particular functions, on this occasion to powers of 1/x, which occur when we differentiate the function 1/x. The notion of partie finie occurs when we differentiate the principal value of 1/x.Letus define for ϕ ∈D    −ε ∞ 1 ϕ(x) ϕ(x) 2ϕ(0) Pf 2 ,ϕ = lim dx + dx − . x ε↓0 x2 x2 ε −∞ ε 12 Differentiating Distributions

Though this is a rather complicated expression, it occurs in a natural way when we compute [pv(1/x)] (do it!). It turns out that ! " 1 =− 1 pv x Pf x2 . Thus Pf(1/x2) is a distribution. Keeping this in mind, let us define # $ d x−(n−1) Pf x−n =− Pf (n = 2, 3,...) dx n − 1 and we set Pf(1/x) = pv(1/x). For an alternative definition of Pf x−n,see[10],SectionII,2:26.

λ−1 3.3 The Distributions x+ (λ =0, −1, −2,...)*

Let us define for λ ∈ C with Re λ>0,   ∞ λ−1 λ−1 x+ ,ϕ = x ϕ(x) dx(ϕ∈D). (3.1) 0 Since xλ−1 is locally integrable on (0, ∞) for Re λ>0, we can regard xλ−1 as a dis- − − tribution. Expanding xλ λ0 = e(λ λ0) log x (x > 0) into a power series, we obtain λ−1 λ −1 −ε λ −1 for any (small) ε>0 the inequality |x − x 0 |≤x |λ − λ0||x 0 ||log x| (0 0. Applying partial integration we obtain

d λ λ−1 x+ = λx+ dx λ−1 if Re λ>0. Let us now define the distribution x+ for all λ =0, −1, −2,... by choosing a nonnegative integer k such that Re λ + k>0 and setting

k λ−1 1 d λ+k−1 x+ = x+ . λ(λ+ 1) ···(λ + k − 1) dx Explicitly one has ∞   − k k λ−1 ( 1) λ+k−1 d ϕ x+ ,ϕ = x dx(ϕ∈D). (3.2) λ(λ+ 1) ···(λ + k − 1) dxk 0 It is clear, using partial integration, that this definition agrees with equation (3.1) when Re λ>0, and that it is independent of the choice of k provided Re λ + k>0. It also gives an analytic function of λ on the set  Ω = λ ∈ C : λ =0, −1, −2,... . λ−1 The Distributions x+ (λ =0, −1, −2,...)* 13

λ λ−1 Note that (d/dx)x+ = λx+ for all λ ∈ Ω. The excluded points λ = 0, −1, −2,... λ−1 are simple poles of x+ . In order to compute the residues at these poles, we replace k by k + 1 in equation (3.2), and obtain     λ−1 λ−1 Resλ=−k x+ ,ϕ = lim (λ + k) x+ ,ϕ λ→−k ∞ & (−1)k+1 dk+1ϕ dkϕ = (x) dx = (0) k! . (−k + 1) ···(−1) dxk+1 dxk 0

λ−1 k (k) Hence Resλ=−kx+ ,ϕ=(−1) δ /k!,ϕ(ϕ ∈D). The gamma function Γ (λ),definedby

∞ Γ (λ) = e−x xλ−1 dx, 0 is also defined and analytic for Re λ>0. This can be shown as above, with ϕ re- placed by the function e−x. We refer to Section 10.2 for a detailed account of the gam- ma function. Using partial integration, one obtains Γ (λ + 1) = λ Γ (λ) for Re λ>0. This functional equation permits us, similar to the above procedure, to extend the gamma function to an analytic function on the whole complex plane minus the points λ = 0, −1, −2,...,thusonΩ. Furthermore, the gamma function has simple poles at these points with residue at the point λ =−k equal to (−1)k/k!.Awell-knownfor- mula for the gamma function is the following: π Γ (λ) Γ (1 − λ) = . sin πλ

We refer again to Section 10.2. From this formula we may conclude that Γ (λ) vanishes nowhere on C\Z.SinceΓ (k) = (k − 1)! for k = 1, 2,...,weseethatΓ (λ) vanishes in  no point of Ω. Define now Eλ ∈D by

λ−1 x+ E = (λ ∈ Ω), λ Γ (λ)

(k) E−k = δ (k = 0, 1, 2,...).

Then λ Eλ,ϕ (ϕ ∈D) is an entire function on C.Notethat

d Eλ = E − dx λ 1 for all λ ∈ C. 14 Differentiating Distributions

3.4 Exercises

Exercise 3.2. Let Y be the Heaviside function, defined in Section 3.2,Example1 and let λ ∈ C. Prove that in distributional sense the following equations hold: # $ d d2 Y(x) sin ωx − λ Y(x)eλx = δ, + ω2 = δ, dx dx2 ω # $ dm Y(x)xm−1 = δ for m a positive integer. dxm (m − 1)!

Exercise 3.3. Determine, in distributional sense, all derivatives of |x|.

Exercise 3.4. Find a distribution of the form F(t) = Y(t) f(t),withf atwotimes continuously differentiable function, satisfying the distribution equation

d2F dF a + b + cF = mδ + nδ dt2 dt with a, b, c, m, n complex constants. Now consider the particular cases (i) a = c = 1; b = 2; m = n = 1, (ii) a = 1; b = 0; c = 4; m = 1; n = 0, (iii) a = 1; b = 0; c =−4; m = 2; n = 1.

Exercise 3.5. Define the following generalizations of the partie finie:

∞ ( ∞ ) ϕ(x) ϕ(x) Pf dx = lim dx + ϕ(0) log ε , x ε↓0 x 0 ε ∞ ( ∞ ) ϕ(x) ϕ(x) ϕ(0) Pf dx = lim dx − + ϕ(0) log ε . x2 ε↓0 x2 ε 0 ε

These expressions define distributions, denoted by pv(Y (x)/x) and Pf(Y (x)/x2). Show this and determine the derivative of the first expression.

3.5 Green’s Formula and Harmonic Functions

We continue with examples in Rn for n ≥ 1. Green’s Formula and Harmonic Functions 15

1. Green’s formula. n Let S be a hypersurface F(x1,...,xn) = 0 in R . The function F is assumed to be a C1 function such that grad F does not vanish at any point of S.Such a hypersurface is called regular. Therefore, in each point of S a tangent surface exists and with a normal vector on it. Choose at each x ∈ S as normal vector grad F(x) n(x) = . | grad F(x)|

Clearly n(x) depends continuously on x.Givenx ∈ S and e ∈ Rn, the line x + te with (e, n(x)) =0 does not belong to S if t is small, t =0.Verifyit! Notice also that S has Lebesgue measure equal to zero. Let f be a C2 function on Rn\S such that for each x ∈ S and each partial deriva- tive Dkf on Rn\S (|k|≤2)thelimits

lim Dkf(y) and lim Dkf(y) y → x y → x F(y) > 0 F(y) < 0

exist. Observe that F increases in the direction of n and decreases in the direction of −n . Thedifferencewillbedenotedbyσ k:

σ k(x) = lim Dkf(y) − lim Dkf(y) (x ∈ S). y → x y → x F(y) > 0 F(y) < 0

Assume that σ k is a nice, e. g. continuous, function on S. Let us regard f as a distribution, Dkf as a distributional derivative, {Dkf } as the distribution given by the function Dkf on Rn\S.Forϕ ∈Dwe have     ∂f ∂ϕ ,ϕ =− f, ∂x1 ∂x1 ∂ϕ =− f(x) dx ∂x1 ∂ϕ =− ··· f(x1,...,xn) (x1,...,xn) dx1 ···dxn ∂x1 ∂ϕ =− dx2 ···dxn f(x1,x2,...,xn) (x1,x2,...,xn) dx1 . ∂x1

0 (0,...,0) From Section 3.2, Example 3, follows, with σ = σ and e 1 = (1, 0,...,0):     ∂f 0 ,ϕ = σ (x1,...,xn) sign e 1, n(x 1,...,xn) ϕ(x1,...,xn) dx1 ...dxn ∂x1 S ∞ ∂f + dx2 ...dxn ϕ dx1 . ∂x1 −∞ 16 Differentiating Distributions

Let ds be the surface element of S at x ∈ S. One has cos θ1 dx1 ∧ ds = dn(x) ∧ ds = dx1 ∧ dx2 ... ∧ dxn,sodx2 ...dxn =|cos θ1| ds, θ1 being the angle between e 1 and n(x) .Wethusobtain   0 σ (x1,...,xn) sign e 1, n(x 1,...,xn) ϕ(x1,...,xn) dx1 ...dxn = S 0 σ (s) ϕ(s) cos θ1(s) ds. S   = 0 ∈D Notice that T, ϕ S σ (s) ϕ(s) cos θ1(s) ds(ϕ ) defines a distribu- 0 tion T . Symbolically we shall write T = (σ cos θ1)δS .Hencewehavefor i = 1,...,n, in obvious notation, * +   ∂f ∂f 0 = + σ cos θi δS . ∂xi ∂xi Differentiating this expression once more gives  2 2    ∂ f = ∂ f + ∂ 0 + i 2 2 (σ cos θi)δS σ cos θi δS . ∂xi ∂xi ∂xi

Here σ i = σ (0,...,0,1,0,...,0),with1ontheith place. n ∂2 Δ = Let 2 be the Laplace operator. Then we have i=1 ∂xi

 n ∂   n   Δf = Δf + σ 0 cos θ δ + σ i cos θ δ . ∂x i S i S i=1 i i=1 Observe that n n   ∂f ∂f a. σ i cos θ is the jump of cos θ = , i i ∂x ∂ν i=1 i=1 i

the derivative in the direction of n(s) .Callitσν . , - n   n ∂ 0 ∂ϕ 0 ∂ϕ 0 b. σ cos θi δS ,ϕ =− cos θi σ ds =− σ ds. = ∂xi = ∂xi ∂ν i 1 S i 1 S 0 Symbolically we will denote this distribution by (∂/∂ν)(σ δS ).

We now come to a particular case. Let S be the border of an open volume V and assume that f is zero outside the closure V of V .Letν = ν(s) be the direction of the inner normal at s ∈ S.Set

∂f ∂f (s) = lim (x) ∂ν x → s ∂ν x ∈ V Green’s Formula and Harmonic Functions 17

and assume that this limit exists. Then one has Δf, ϕ = f, Δϕ = f(x)Δϕ(x) dx V        ∂f ∂ = Δf ,ϕ + δ ,ϕ + (f δ ), ϕ ∂ν S ∂ν S ∂f ∂ϕ = Δf(x) ϕ(x)dx + (s) ϕ(s) ds − f(s) (s) ds. ∂ν ∂ν V S S

Hence we get Green’s formula in n-dimensional space   ∂f ∂ϕ f Δϕ − Δfϕ dx = ϕ − f ds(ϕ∈D), ∂ν ∂ν V S

where still ν is the direction of the inner normal vector. Observe that this formula is even correct for any C2 function ϕ. 2. Harmonic functions. 2 n Let f be a C function on R \{0}, only depending on the radius r = x = 2 +···+ 2 = = x1 xn.Noticethatr r(x1,...,xn) is not differentiable at x 0. The function f is called a harmonic function if Δf = 0 on Rn\{0}. What is the general form of f ,whenwritingf(x) = F(r)?Wehaveforr =0

∂f dF ∂r x dF = = i , ∂xi dr ∂xi r dr # $ ∂2f ∂ x dF x ∂ dF dF r − x2/r = i = i + i 2 2 ∂xi ∂xi r dr r ∂xi dr dr r # $ x2 d2F 1 x2 dF = i + − i , r 2 dr 2 r r 3 dr

d2F n − 1 dF hence Δf(x) = + . dr 2 r dr We have to solve d2F/dr 2 + (n − 1)/r (dF/dr) = 0.Setu = dF/dr .Then u + [(n − 1)/r ] · u = 0,henceu(r ) = c/rn−1 for some constant c and thus ⎧ A ⎨⎪ + = − B if n 2 , F(r) = r n 2 ⎩⎪ A log r + B if n = 2 ,

A and B being constants. The constant functions are harmonic functions everywhere, while the functions 1/r n−2 (n>2)andlog r (n = 2) are harmonic functions outside the origin x = 0.Atx = 0 they have a singularity. Both functions are however locally integrable 18 Differentiating Distributions

(see below, change to spherical coordinates), hence they define distributions. We shall compute the distributional derivative Δ(1/r n−2) for n =2. Therefore we need some preparations. 3. Spherical coordinates and the area of the (n − 1)-dimensional sphere Sn−1. Spherical coordinates in Rn(n > 1) are given by

x1 = r sin θn−1 ··· sin θ2 sin θ1

x2 = r sin θn−1 ··· sin θ2 cos θ1

x3 = r sin θn−1 ··· cos θ2 . .

xn−1 = r sin θn−1 cos θn−2

xn = r cos θn−1 .

Here 0 ≤ r<∞;0≤ θ1 < 2π, 0 ≤ θj <π(j=1). ≥ 2 = 2 +···+ 2 = Let rj 0,rj x1 xj and assume that rj > 0 for all j 1.Then ⎧ x + ⎪ = j 1 ⎪ cos θj , ⎪ rj+1 ⎨⎪ rj x1 ⎪ sin θ = (j =1), sin θ = , ⎪ j 1 ⎪ rj+1 r2 ⎩⎪ r = rn .

n−1 One has dx1 ...dxn = r J(θ1,...,θn−1) dr dθ1 ...dθn−1 with   n−2 n−3 J θ1,...,θn−1 = sin θn−1 sin θn−2 ···sin θ2 .

These expressions can easily be verified by induction on n.Forn = 2 and n = 3 n−1 n they are well known. Let Sn−1 denote the area of S , the unit sphere in R . Then 2π π π   Sn−1 = ··· J θ1,...,θn−1 dθ1 ...dθn−1 . 0 0 0 To compute Sn−1, we can use the√ explicit form of J. Easier is the following ∞ −x2 method. Recall that −∞ e dx = π.Thenweobtain   ∞ n − x2+···+x2 n−1 −r 2 π 2 = e 1 n dx1 ...dxn = r e dr.Sn−1 Rn 0 ∞ 1 −t n −1 1 n = e t 2 dt.Sn−1 = Sn−1 Γ . 2 2 2 0

n/2 Hence Sn−1 = (2π )/Γ (n/2).IfwesetS0 = 2 then this formula for Sn−1 is valid for n ≥ 1. For properties of the gamma function Γ ,seeSection10.2. Green’s Formula and Harmonic Functions 19

4. Distributional derivative of a harmonic function. We return to the computation of the distributional derivative Δ(1/r n−2) for n =2. One has     1 1 Δ ,ϕ = , Δϕ r n−2 r n−2 1 1 = Δ = Δ ∈D − ϕ(x) dx lim − ϕ(x) dx(ϕ ). r n 2 ε→0 r n 2 Rn r ≥ε  n−2 Δ = Let us apply Green’s formula to the integral r ≥ε(1/r ) ϕ(x) dx.TakeV n−2 Vε ={x : r>ε}, S = Sε ={x : r = ε}, f(x) = 1/r (r>ε). Then ∂/∂ν = ∂/∂r.FurthermoreΔf = 0 on Vε.Wethusobtain 1 1 ∂ϕ −(n − 2) Δϕ(x) dx =− ds + ϕ(s) ds. r n−2 r n−2 ∂r εn−1 = = Vε r ε r ε One has             ∂ϕ  n x ∂ϕ   ∂ϕ    =  i  ≤ n sup  (x) .  ∂r   r ∂x  i.x  ∂x  i=1 i i Hence          1 ∂ϕ   ∂ϕ   ds  ≤ n sup   .εS − → 0  n−2  i,x   n 1 r ∂r ∂xi r =ε when ε → 0.Moreover −(n − 2) −(n − 2) ϕ(s) ds = ϕ(0) ds εn−1 εn−1 = = r ε r ε −(n − 2) + ϕ(s) − ϕ(0) ds, εn−1 r =ε and −(n − 2) ϕ(0) ds =−(n − 2)S − ϕ(0). εn−1 n 1 r =ε For the second summand, one has     √   √   | − |≤  ∂ϕ  ≤  ∂ϕ  ϕ(s) ϕ(0) s n sup x ≤ε   ε n supi,x   , 1≤i≤n ∂xi ∂xi

for s =ε.Hence      −(n − 2)   ϕ(s) − ϕ(0) ds  ≤ const.ε.  εn−1  r =ε We may conclude   1 Δ ,ϕ =−(n − 2)Sn−1 ϕ(0), r n−2 20 Differentiating Distributions

or 1 2π n/2 Δ =−(n − 2) δ(n=2). r n−2 Γ (n/2) In a similar way one can show in R2

Δ (log r) = 2πδ (n= 2).

Special Cases d2 n = 1: Δ |x|= |x|=2 δ. dx2

1 n = 3: Δ =−4πδ. r

3.6 Exercises

Exercise 3.6. In the (x, y)-plane, consider the square ABCD with

A = (1, 1), B= (2, 0), C = (3, 1), D= (2, 2).

Let T be the distribution defined by the function that is equal to 1 on the square and zero elsewhere. Compute in the sense of distributions

∂2T ∂2T − . ∂y2 ∂x2

Exercise 3.7. Show that, in the sense of distributions, # $# $ ∂ ∂ 1 + i = 2πδ. ∂x ∂y x + iy

Exercise 3.8. Show that, in the sense of distributions,

Δ(log r) = 2πδ in R2.

Exercise 3.9. In Rn one has, in the sense of distributions, for n ≥ 3, * + 1 1 Δ = Δ if m

Exercise 3.10. In the (x, t)-plane, let

Y(t) 2 E(x,t) = √ e−x /(4t) . 2 πt Show that, in the sense of distributions,

∂E ∂2E − = δ(x) δ(t) . ∂t ∂x2

Further Reading

Most of the material in this chapter can be found in [9], where many more exercises are available. Harmonic functions only depending on the radius are treated in several books on physics. Harmonic polynomials form the other side of the medal, they de- pend only on the angles (in other words, their domain of definition is the unit sphere). These so-called spherical harmonics arise frequently in quantum mechanics. For an account of these spherical harmonics, see [4], Section 7.3. 4 Multiplication and Convergence of Distributions

Summary In this short chapter we define two important notions: multiplication of a distribution with a C∞ function and a convergence principle for distributions. Multiplication with a C∞ function is a crucial ingredient when considering differential equations with variable coefficients.

Learning Targets

 Understanding the definition of multiplication of a distribution with a C∞ func- tion and its consequences.  Getting familiar with the notion of convergence of a sequence or series of distribu- tions.

4.1 Multiplication with a C ∞ Function

There is no multiplication possible of two arbitrary distributions. Multiplication of a distribution with a C∞ function can be defined. Let α be a C∞ function on Rn. Thenititeasilyverifiedthatthemappingϕ n αϕ is a continuous linear mapping of D(R ) into itself: if ϕj converges to ϕ in n n D(R ),thenαϕj converges to αϕ in D(R ) when j tends to infinity. Therefore, given a distribution T on Rn, the mapping defined by   ϕ T, αϕ ϕ ∈D(Rn) is again a distribution on Rn.Wethuscandefine:

Definition 4.1. Let α ∈E(Rn) and T ∈D(Rn). Then the distribution αT is defined by   αT, ϕ=T, αϕ ϕ ∈D(Rn) .

If f is a locally integrable function, then clearly αf ={αf } in the common notation.

Examples on the Real Line 1. αδ = α(0)δ,inparticularxδ = 0. 2. (αδ) = α(0)δ + α(0)δ.

The following result is important, it provides a kind of converse of the first example.

Theorem 4.2. Let T be a distribution on R.IfxT = 0 then T = cδ, c being a constant. Exercises 23

Proof. Let ϕ ∈Dand let χ ∈Dbe such that χ = 1 near x = 0.Define ⎧ ⎪ − ⎨ ϕ(x) ϕ(0)χ(x) = = if x 0 ψ(x) ⎪ x ⎩ ϕ(0) if x = 0 .

Then ψ ∈D(use the Taylor expansion of ϕ at x = 0,cf.Section5.3)andϕ(x) = ϕ(0) · χ(x) + xψ(x).HenceT, ϕ=ϕ(0) T, χ(ϕ ∈D), therefore T = cδ with c =T, χ.

Theorem 4.3. Let T ∈D(Rn) and α ∈E(Rn). Then one has for all 1 ≤ i ≤ n,

∂ ∂α ∂T (α T ) = T + α . ∂xi ∂xi ∂xi The proof is left to the reader.

4.2 Exercises

Exercise 4.4. a. Show, in a similar way as in the proof of Theorem 4.2,thatT  = 0 implies that T = const.   b. Let T and T be regular distributions, T = Tf , T = Tg. Assume both f and g are continuous. Show that f is continuously differentiable and f  = g.

Exercise 4.5. Determine all solutions of the distribution equation xT = 1.

4.3 Convergence in D

Definition 4.6. A sequence of distributions {Tj } converges to a distribution T if Tj ,ϕ converges to T, ϕ for every ϕ ∈D.

The convergence corresponds to the so-called weak on D,knownfrom functional analysis. The space D is sequentially complete: if for a given sequence of distributions {Tj }, the sequence of scalars Tj ,ϕ tends to T, ϕ for each ϕ ∈D, when j tends to infinity, then T ∈D. This important result follows from Banach– Steinhaus theorem from functional analysis, see [1], Chapter III, §3, n◦ 6, Théorème 2. A detailed proof of this result is contained in Section 10.1. ∞ Aseries T of distributions T is said to be convergent if the sequence {S }  j=o j j N = N D with SN j=0 Tj converges in .

Theorem 4.7. If the locally integrable functions fj converge to a locally integrable func- tion f for j →∞, point-wise almost everywhere, and if there is a locally integrable func- 24 Multiplication and Convergence of Distributions

tion g ≥ 0 such that |fj (x)|≤g(x) almost everywhere for all j, then the sequence of distributions fj converges to the distribution f . This follows immediately from Lebesgue’s theorem on dominated convergence.

 Theorem 4.8. Differentiation is a continuous operation in D :if{Tj } converges to T , k k then {D Tj } converges to D T for every n-tuple k. The proof is straightforward.

Examples 1. Let ⎧ ⎨ 0 if x >ε f (x) n (x ∈ Rn). ε ⎩ if x ≤ε n ε Sn−1

Then fε converges to the δ-function when ε ↓ 0. 2. The functions

= √1  −x2/(2ε2) ∈ R fε(x) n e (x ) εn 2π

converge to δ when ε ↓ 0.

4.4 Exercises

Exercise 4.9. a. Show that for every ϕ ∈ C∞(R) and every interval (a, b),

b b sin λx ϕ(x) dx and cos λx ϕ(x) dx a a

converge to zero when λ tends to infinity. b. Derive from this, by writing ϕ(x) = ϕ(0) + ϕ(x) − ϕ(0)

that the distributions (sin λx)/x tend to πδwhen λ tends to infinity. c. Show that, for real λ, the mapping

∞ cos λx cos λx ϕ Pf ϕ(x) dx = lim ϕ(x) dx(ϕ∈D) x ε↓0 x −∞ |x|≥ε

defines a distribution. Show also that these distributions tend to zero when λ tends to infinity. Exercises 25

Exercise 4.10. Show that multiplication by a C∞ function defines a continuous linear mapping in D. Determine the limits in D,whena ↓ 0,of a ax and . x2 + a2 x2 + a2

Further Reading

This chapter is self-contained and no reference to other literature is necessary. 5 Distributions with Compact Support

Summary Distributions with compact support form an important subset of all distributions. In this chapter we reveal their structure and properties. In particular the structure of distributions supported at the origin is shown.

Learning Targets  Understanding the structure of distributions supported at the origin.

5.1 Definition and Properties

We recall the space E of C∞ functions on Rn. The following convergence principle is defined in E:

k k Definition 5.1. Asequenceϕj ∈Etends to ϕ ∈Eif D ϕj tends to D ϕ(j→∞) for all n-tuples k of nonnegative integers, uniformly on compact subsets of Rn.

Thus, given a compact subset K, one has      k − k  → →∞ supx∈K D ϕj (x) D ϕ(x) 0 (j ) for all n-tuples k of nonnegative integers. In a similar way one defines a convergence principle in the space Em of Cm functions on Rn by restricting k to |k|≤m.Notice that D⊂Eand the injection is linear and continuous:ifϕj,ϕ∈Dand ϕj tends to ϕ in D,thenϕj tends to ϕ in E. Let {αj } be a sequence of functions in D with the property αj(x) = 1 on the ball n {x ∈ R : x ≤j} and let ϕ ∈E.Thenαj ϕ ∈Dand αj ϕ tends to ϕ(j→∞) in E.WemayconcludethatD is a dense subspace of E. Denote by E the space of continuous linear forms L on E. Continuity is defined as in the case of distributions: if ϕj tends to ϕ in E,thenL(ϕj ) tends to L(ϕ). Let T be a distribution with compact support and let α ∈Dbe such that α(x) = 1 for x in a neighborhood of Supp T . We can then extend T to a linear form on E by L, ϕ=T, αϕ(ϕ ∈E). This extension is obviously independent of the choice of α.MoreoverL is a continuous linear form on E and L = T on D.BecauseD is a dense subspace of E, this extension L of T to an element of E is unique. We have:

Theorem 5.2. Every distribution with compact support can be uniquely extended to a continuous linear form on E.

Conversely, any L ∈E is completely determined by its restriction to D.This restriction is a distribution since D is continuously embedded into E.

Theorem 5.3. The restriction of L ∈E to D is a distribution with compact support. Distributions Supported at the Origin 27

Proof. The proof is by contradiction. Assume that the restriction T of L does not have compact support. Then we can find for each natural number m afunctionϕm ∈D with Supp ϕm ∩{x : x

Proposition 5.4. Let T be a distribution. Then T has compact support if and only if there exists a constant C>0, a compact subset K and a positive integer m such that      | | ≤  k  T, ϕ C supx∈K D ϕ(x) |k|≤m for all ϕ ∈D.

Observe that if T satisfies the above inequality, then Supp T ⊂ K.Wealsomay conclude that a distribution with compact support has finite order and is summable.

5.2 Distributions Supported at the Origin

Rn Theorem 5.5. Let T be a distribution on supported at the origin. Then there exist k an integer m and complex scalars ck such that T = |k|≤m ck D δ. Proof. We know that T has finite order, say m. By Proposition 2.7 we have T, ϕ=0 for all ϕ ∈Dwith Dkϕ(0) = 0 for |k|≤m. The same then holds for the unique extension of T to E. Now write for ϕ ∈D, applying Taylor’s formula (see Section 5.3)

 xk ϕ(x) = Dkϕ(0) + ψ(x) (x ∈ Rn) k! |k|≤m

k = k1 k2 ··· kn = ··· ∈E defining x x1 x2 xn ,k! k1!k2! kn! and taking ψ such that Dkψ(0) = 0 for |k|≤m.Thenweobtain , -    (−1)|k| xk T, ϕ= T, Dkδ, ϕ (ϕ ∈D), k! |k|≤m  = k or T |k|≤m ckD δ with , - (−1)|k|xk c = T, . k k!

5.3 Taylor’s Formula for Rn

We start with Taylor’s formula for R.Letf ∈D(R). Then we can write for any m ∈ N

m x  xj 1 f(x) = f (j)(0) + (x − t)m f (m+1)(t) dt, = j! m! j 0 0 28 Distributions with Compact Support

which can be proved by integration by parts. The change of variable t → xs in the integral gives for the last term

1 xm+1 (1 − s)m f (m+1)(xs) ds. m! 0

This term is of the form (xm+1/m!)g(x)in which, clearly, g is a C∞ function on R, hence an element of E(R).Inparticular

m 1  f (j)(0) 1 f(1) = + (1 − s)m f (m+1)(s) ds. (5.1) = j! m! j 0 0

Now let ϕ ∈D(Rn) and let t ∈ R. Then we can easily prove by induction on j,

1 dj  xk   ϕ(xt) = Dkϕ (xt) . j! dtj k! |k|=j

Taking f(t) = ϕ(xt) in formula (5.1), we obtain Taylor’s formula

 k    k x k x ϕ(x) = D ϕ (0) + ψk(x) k! k! |k|≤m |k|=m+1 with 1   m k ψk(x) = (m + 1) (1 − t) D ϕ (xt) dt. 0 n Clearly ψk ∈E(R ).

5.4 Structure of a Distribution*

Theorem 5.6. Let T be a distribution on Rn and K an open bounded subset of Rn.Then there exists a continuous function f on Rn and a n-tuple k (both depending on K)such that T = Dkf on K.

Proof. Without loss of generality we may assume that K is contained in the cube

Q ={(x1,...,xn) : |xi| < 1 for all i}.LetusdenotebyD(Q) the space of all ϕ ∈D(Rn) with Supp ϕ ⊂ Q. Then, by the mean value theorem, we have      ∂ϕ  |ϕ(x)|≤sup   (x ∈ Rn) ∂xi for all ϕ ∈D(Q) and all i = 1,...,n.LetR = ∂/∂x1 ··· ∂/∂xn and let us denote ∈ for y Q  Qy = (x1,...,xn) : −1

We then have the following integral representation for ϕ ∈D(Q): ϕ(y) = (Rϕ)(x) dx.

Qy

For any positive integer m set     =  k  ϕ m sup|k|≤m, x D ϕ(x) .

Then we have for all ϕ ∈D(Q) the inequalities      m   m+1  ϕ m ≤ sup R ϕ ≤ R ϕ(x) dx. Q

Because T is a distribution, there exist a positive constant c and a positive integer m such for all ϕ ∈D(Q)    m+1  |T, ϕ| ≤ c ϕ m ≤ c R ϕ(x) dx. Q

m+1 Let us introduce the function T1 defined on the image of R by   m+1 T1 R ϕ =T, ϕ .

This is a well-defined linear function on Rm+1(D(Q)). It is also continuous in the following sense:

|T1(ψ)|≤c |ψ(x)| dx. Q

Therefore, using the Hahn–Banach theorem, we can extend T1 to a continuous linear form on L1(Q). So there exists an integrable function g on Q such that     m+1 m+1 T, ϕ=T1 R ϕ = g(x) R ϕ (x) dx Q for all ϕ ∈D(Q). Setting g(x) = 0 outside Q and defining

y1 yn

f(y) = ... g(x)dx1 ...dxn , −1 −1 and using integration by parts, we obtain   T, ϕ=(−1)n f(x) Rm+2ϕ (x) dx Rn for all ϕ ∈D(Q).HenceT = (−1)n+m+2 Rm+2f on K. For distributions with compact support we have a global result. 30 Distributions with Compact Support

Theorem 5.7. Let T be a distribution with compact support. Then there exist finitely n many continuous functions fk on R , indexed by n-tuples k,suchthat  k T = D fk . k

Moreover, the support of each fk can be chosen in an arbitrary neighborhood of Supp T .

Proof. Let K be an open bounded set such that Supp T ⊂ K. Select χ ∈D(Rn) such that χ(x) = 1 on a neighborhood of Supp T and Supp χ ⊂ K. Then by Theorem 5.6 there is a continuous function f on Rn such that T = Dlf on K for some n-tuple l. Hence      |l| l k T, ϕ=T, χ· ϕ=(−1) f, D (χϕ) = D fk,ϕ |k|≤|l|

n for suitable continuous functions fk with Supp fk ⊂ K and for all ϕ ∈D(R ).

Further Reading

The proof of the structure theorem in Section 5.4, of which the proof can be omitted at first reading, is due to M. Pevzner. Several other proofs are possible, see Section 7.7 for example. See also [10]. Distributions with compact support occur later in this book at several places. 6 Convolution of Distributions

Summary This chapter is devoted to the convolution product of distributions, a very useful tool with numerous applications in the theory of differential and integral equations. Several examples, some related to physics, are discussed. The symbolic calculus of Heaviside, one of the eldest applications of the theory of distributions, is discussed in detail.

Learning Targets

 Getting familiar with the notion of convolution product of distributions.  Getting an impression of the impact on the theory of differential and integral equa- tions.

6.1 Tensor Product of Distributions

Set X = Rm,Y= Rn,Z= X × Y( Rn+m).Iff is a function on X and g one on Y , then we define the following function on Z:

(f ⊗ g)(x, y) = f(x)g(y) (x ∈ X,y ∈ Y).

We call f ⊗ g the tensor product of f and g. If f is locally integrable on X and g locally integrable on Y ,thenf ⊗ g is locally integrable on Z. Moreover one has for u ∈D(X) and v ∈D(Y ),

f ⊗ g, u ⊗ v=f, ug, v .

If ϕ ∈D(X × Y) is arbitrary, not necessarily of the form u ⊗ v, then, by Fubini’s theorem, f ⊗ g, ϕ=f(x)⊗ g(x), ϕ(x,y) (notation)

=f(x), g(y), ϕ(x,y)

=g(y), f(x), ϕ(x,y) . The following proposition, of which we omit the technical proof (see [10], Chapter IV, §§ 2, 3, 4), is important for the theory we shall develop.

Proposition 6.1. Let S be a distribution on X and T one on Y . There exists one and only one distribution W on X × Y such that W, u⊗ v=S, uT, v for all u ∈D(X), v ∈D(Y ). W is called the tensor product of S and T and is denoted by W = S ⊗ T .

It has the following properties. For fixed ϕ ∈D(X × Y)set θ(x) =Ty ,ϕ(x,y). Then θ ∈D(X) and one has

W, ϕ=S, θ=Sx, Ty ,ϕ(x,y) , 32 Convolution of Distributions

and also

W, ϕ=Ty , Sx,ϕ(x,y) .

Proposition 6.2. Supp S ⊗ T = Supp S × Supp T .

Proof. Set A = Supp S, B = Supp T . (i) Let ϕ ∈D(X ×Y)be such that Supp ϕ∩(A×B) =∅. There are neighborhoods A of A and B of B such that Supp ϕ∩(A×B) =∅too. For x ∈ A the function y ϕ(x, y) vanishes on B, so, in the above notation, θ(x) = 0 for x ∈ A,

hence Sx,θ(x)=0 and thus W, ϕ=0.WemayconcludeSupp W ⊂ A× B. (ii) Let us prove the converse. Let (x, y) ∉ Supp W . Then we can find an open “rectangle” P × Q with center (x, y), that does not intersect Supp W .Nowas- sume that for some u ∈D(X) with Supp u ⊂ P one has S, u=1.Then T, v=0 for all v ∈D(Y ) with Supp v ⊂ Q.HenceQ ∩ B =∅, and therefore (x, y) ∉ A × B.

Examples and More Properties k l ⊗ = k ⊗ l 1. DxDy (Sx Ty ) DxSx Dy Ty .

2. δx ⊗ δy = δ(x,y).

3. A function on X × Y is independent of x if it is of the form 1x ⊗ g(y),withg afunctiononY . Similarly, a distribution is said to be independent of x if it is of

the form 1x ⊗ Ty with T a distribution on Y . One then has , -     1x ⊗ Ty ,ϕ(x,y) = Ty ,ϕ(x,y) dx = Ty , ϕ(x, y) dx X X for ϕ ∈D(X × Y). We remark that the tensor product is associative     Sx ⊗ Ty ⊗ Uξ = Sx ⊗ Ty ⊗ Uξ .

Observe that    Sx ⊗ Ty ⊗ Uξ , u(x)v(y)t(ξ) =S, uT, vU, t .

4. Define  1 if xi ≥ 0 (1 ≤ i ≤ n) Y(x ,...,x ) = 1 n 0 elsewhere.

Then Y(x1,...,xn) = Y(x1) ⊗···⊗Y(xn).Moreover ∂nY = δ(x1,...,xn) . ∂x1 ···∂xn Convolution Product of Distributions 33

6.2 Convolution Product of Distributions

Let S and T be distributions on Rn. We shall say that the convolution product S ∗ T of S and T exists if there is a “canonical” extension of S ⊗ T to functions of the form (ξ, η) ϕ(ξ + η)(ϕ ∈D),suchthat   S ∗ T, ϕ= Sξ ⊗ Tη,ϕ(ξ+ η) (ϕ ∈D) defines a distribution.

Comments n n 1. Sξ ⊗ Tη is a distribution on R × R . 2. The function (ξ, η) ϕ(ξ + η) (ϕ ∈D) has no compact support in general. The support is contained in a band of the form

@ @@ η @@ @@@ @@@@ @@@@ @@@@@ @@@@@@ @@@@@@@ ξ @@@@@@@ @@@@@@@ @@@@@@ @@@@@ @@@@@ @@@@ @@@ @@@ @@ @ 3. The distribution S ∗ T exists for example when the intersection of Supp(S ⊗ T) = Supp S × Supp T with such bands is compact (or bounded). The continuity

of S∗T is then a straightforward consequence of the relations (∂/∂ξi)ϕ(ξ+η) = (∂ϕ/∂ξi)(ξ + η) etc. In that case T ∗ S exists also and one has S ∗ T = T ∗ S. 4. Case by case one has to decide what “canonical” means.

We may conclude:

Theorem 6.3. A sufficient condition for the existence of the convolution product S ∗ T is that for every compact subset K ⊂ Rn one has

ξ ∈ Supp S, η∈ Supp T, ξ+ η ∈ K ⇒ ξ and η remain bounded.

The convolution product S ∗ T has then a canonical definition and is commutative: S ∗ T = T ∗ S.

Special Cases (Examples) 1. At least one of the distributions S and T has compact support. For example: S ∗ δ = δ ∗ S = S for all S ∈D(Rn). 34 Convolution of Distributions

2. (n = 1) We shall say that a subset A ⊂ R is bounded from the left if A ⊂ (a, ∞) for some a ∈ R. Similarly, A is bounded from the right if A ⊂ (−∞,b)for some b ∈ R. If Supp S and Supp T are both bounded from the left (right), then S ∗ T exists. Example: Y ∗ Y exists, with Y , as usual, the Heaviside function. 3. (n = 4)LetSupp S be contained in the positive light cone

t ≥ 0 ,t2 − x2 − y2 − z2 ≥ 0 ,

and let Supp T be contained in t ≥ 0.ThenS ∗ T exists. Verify that Theorem 6.3 applies.

Theorem 6.4. Let S and T be regular distributions given by the locally integrable func- tions f and g. Let their supports satisfy the conditions of Theorem 6.3. Then one has   (i) Rn f(x−t)g(t) dt and Rn f(t)g(x−t)dt exist for almost all x, and are equal almost everywhere to, say, h(x); (ii) h is locally integrable; (iii) S ∗ T is a function, namely h.

The proof can obviously be reduced to the well-known case of L1 functions f and g with compact support. Sometimes the conditions of Theorem 6.3 are not fulfilled, while the convolution product nevertheless exists. Let us give two examples. 1. If f and g are arbitrary functions in L1(Rn),thenf ∗ g,definedby f ∗ g(x) = f(t)g(x − t)dt Rn

1 n exists and is an element of L (R ) again. One has f ∗ g 1 ≤ f 1 g 1. 2. If f ∈ L1(Rn), g ∈ L∞(Rn),thenf ∗ g defined by f ∗ g(x) = f(t)g(x − t)dt Rn exists and is an element of L∞(Rn) again. It is even a continuous function. Fur-

thermore f ∗ g ∞ ≤ f 1 g ∞. Let now f and g be locally integrable on R, Supp f ⊂ (0, ∞) Supp g ⊂ (0, ∞). Then f ∗ g exists (see special case 2) and is a function (Theorem 6.4). One has Supp(f ∗ g) ⊂ (0, ∞) and ⎧ ⎪ ≤ ⎪ 0 if x 0 ⎨⎪ ∗ = x f g(x) ⎪ ⎪ − ⎩⎪ f(x t)g(t)dt if x>0 . 0 Convolution Product of Distributions 35

Examples 1. Let xα−1 Y α(x) = Y(x) eλx (α > 0,λ∈ C). λ Γ (α) α ∗ β = α+β One then has Yλ Yλ Yλ , applying the following formula for the beta func- tion: 1 Γ (α) Γ (β) B(α, β) = tα−1 (1 − t)β−1 dt = Γ (α + β) 0 (see Section 10.2). 2. Let √1 −x2/(2σ 2) Gσ (x) = e (σ > 0). σ 2π ∗ = √ Then one has Gσ Gτ G σ 2+τ2 .

Gσ is the Gauss distribution with expectation 0 and variance σ .IfGσ is the dis- tribution function of the stochastic variable x and Gτ the one of y,andifthe ∗ stochastic variables are independent,√ then Gσ Gτ is the distribution function of x + y. Its variance is apparently σ 2 + τ2. 3. Let 1 a P (x) = (a > 0), a π 2 x2 + a2 the so-called Cauchy distribution.

One has Pa ∗ Pb = Pa+b.Weshallprovethislateron.

Theorem 6.5. Let α ∈Eand let T be a distribution. Assume that the distribution T ∗{α} exists (e. g. if the conditions of Theorem 6.3 are satisfied). Then this distribution is regular and given by a function in E,namelyx Tt ,α(x− t).

We shall write T ∗ α for this function. It is called the regularization of T by α.

Proof. We shall only consider the case where T and {α} satisfy the conditions of ∞ Theorem 6.3. Then the function x Tt,α(x− t) exists and is C since x ∞ β(x) Tt ,α(x−t)=Tt , β(x) α(x −t) is C for every β ∈D, by Proposition 6.1. 36 Convolution of Distributions

We now show that T ∗{α} is a function: T ∗{α}={T ∗ α}. One has for ϕ ∈D   T ∗{α},ϕ= Tξ ⊗ α(η), ϕ(ξ + η)   = Tξ, α(η), ϕ(ξ + η) , -

= Tξ, α(η) ϕ(ξ + η) dη Rn , -

= Tξ, α(x − ξ)ϕ(x)dx  Rn  = Tξ, ϕ(x), α(x − ξ)   = Tξ ⊗ ϕ(x), α(x − ξ) .

Applying Fubini’s theorem, we obtain       Tξ ⊗ ϕ(x), α(x−ξ) = ϕ(x), Tξ ,α(x−ξ) = ϕ(x) Tξ ,α(x− ξ) dx Rn = (T ∗ α)(x) ϕ(x) dx.

Rn

Hence the result.

Examples 1. If T has compact support or T = f with f ∈ L1.ThenT ∗ 1 =T, 1. 2. If α is a polynomial of degree ≤ m,thenT ∗α is. Let us take n = 1 for simplicity. Then      xk (T ∗ α)(x) = T ,α(x− t) = T ,α(k)(−t) t k! t k≤m by using Taylor’s formula for the function x α(x−t).HereT is supposed to be a distribution with compact support or T = f with xkf ∈ L1 for k = 0, 1,...,m.

n n Definition 6.6. Let T be a distribution on R and a ∈ R . The translation τaT of T over a is defined by   τaT, ϕ= Tξ,ϕ(ξ+ a) (ϕ ∈D).

Proposition 6.7. Onehasthefollowingformulae: (i) δ ∗ T = T for any T ∈D(Rn);

(ii) δ(a) ∗ T = τ(a)T; δ(a) ∗ δ(b) = δ(a+b); (iii) (n = 1) δ ∗ T = T  for any T ∈D. Convolution Product of Distributions 37

The proof is left to the reader. In a similar way one has δ(m)∗T = T (m) on R,and, if D is a differential operator on Rn with constant coefficients, then Dδ ∗ T = DT, for all T ∈D(Rn).Inparticular,if

n  ∂2 Δ = Δ = Rn n 2 (Laplace operator in ) i=1 ∂xi and 1 ∂2 = − Δ (differential operator of d’Alembert in R4) , c2 ∂t2 3 then Δ δ ∗ T = Δ T and δ ∗ T = T .

 n  n Proposition 6.8. Let Sj tend to S(j→∞) in D (R ) and let T ∈D(R ).Then Sj ∗ T tends to S ∗ T(j→∞) if either T has compact support or the Supp Sj are uniformly bounded in Rn. ∈ The proof is easy and left to the reader. As application consider a function ϕ n n D(R ) satisfying ϕ ≥ 0, Rn ϕ(x) dx = 1,andsetϕk(x) = k ϕ(kx) for k = n 1, 2,....Thenδ = limk→∞ ϕk and Supp ϕk remains uniformly bounded in R . Hence for all T ∈D(Rn) one has

T = T ∗ δ = lim T ∗ ϕk . k→∞

∞ Notice that T ∗ ϕk is a C function, so any distribution is the limit, in the sense of distributions, of a sequence of C∞ functions of the form T ∗ ϕ with ϕ ∈D. Now take n = 1.Thenδ is also the limit of a sequence of polynomials. This follows from the Weierstrass theorem: any continuous function on a closed interval can be uniformly approximated by polynomials. Select now the sequence ofpolynomialsasfollows.Write(asabove)δ = limk→∞ ϕk with ϕk ∈D.For | − | each k choose a polynomial pk such that supx∈[−k,k] ϕk(x) pk(x) < 1/k by the Weierstrass theorem. Then δ = limk→∞ pk. More generally, if T ∈D(R) has compact support, then, by previous results, T is the limit of a sequence of polynomials (because T ∗ α is a polynomial for any polynomial α). We just mention, without proof, that this result can be extended to n>1. 1 n Let now f ∈ L (R ) and let again {ϕk} be the sequence of functions considered above. Then f = limk→∞ f ∗ϕk in the sense of distributions. One can however show a much stronger result in this case with important implications.

∈ 1 Rn Lemma 6.9. Let f L ( ).Thenonecanfindforeveryε>0 a neighborhood V of n x = 0 in R such that Rn |f(x − y)− f(x)| dx<εfor all y ∈ V .

Proof. Approximate f in L1-norm with a continuous function ψ with compact sup- port. Then the lemma follows from the uniform continuity of ψ. 38 Convolution of Distributions

1 n 1 Proposition 6.10. Let f ∈ L (R ).Thenf = limk→∞ f ∗ ϕk in L -norm. Proof. One has       f − f ∗ ϕk 1 =  f(x)− f(x − y) ϕk(y) dy dx Rn Rn     ≤ f(x)− f(y − x) ϕk(y) dx dy. Rn Rn

Given ε>0, this expression is less than ε for k large enough by Lemma 6.9. Hence the result.

Corollary 6.11. Let f ∈ L1(Rn).Iff = 0 as a distribution, then f = 0 almost every- where.

Proof. If f = 0 as a distribution, then f ∗ ϕk = 0 as a distribution for all k,because n f ∗ ϕk,ϕ=f, ϕˇ k ∗ ϕ for all ϕ ∈D(R ),definingϕˇ k(x) = ϕk(−x)(x ∈ n R ).Sincef ∗ ϕk is a continuous function, we conclude that f ∗ ϕk = 0.Thenby Proposition 6.10, f = 0 almost everywhere.

Corollary 6.12. Let f be a locally integrable function. If f = 0 as a distribution, then f = 0 almost everywhere.

Proof. Let χ ∈D(Rn) be arbitrary. Then χf ∈ L1(Rn) and χf = 0 as a distribu- tion. Hence χf = 0 almost everywhere by the previous corollary. Since this holds for all χ,weobtainf = 0 almost everywhere. We continue with some results on the support of the convolution product of two distributions.

Proposition 6.13. Let S and T be two distributions such that S ∗ T exists, Supp S ⊂ A, Supp T ⊂ B.ThenSupp S ∗ T is contained in the closure of the set A + B.

The proof is again left to the reader.

Examples 1. Let Supp S and Supp T be compact. Then S ∗ T has compact support and the support is contained in Supp S + Supp T . 2. (n = 1) If Supp S ⊂ (a, ∞), Supp T ⊂ (b, ∞),thenSupp S ∗ T ⊂ (a + b, ∞). 3. (n = 4) If both S and T have support contained in the positive light cone t ≥ 0, t2 − x2 − y2 − z2 ≥ 0,thenS ∗ T has. Associativity of the Convolution Product 39

6.3 Associativity of the Convolution Product

Let R, S, T be distributions on Rn with supports A, B, C, respectively. We define the distribution R ∗ S ∗ T by   R ∗ S ∗ T, ϕ= Rξ ⊗ Sη ⊗ Tζ ,ϕ(ξ+ η + ζ) (ϕ ∈D). This distribution makes sense if, for example,

ξ ∈ A, η ∈ B, ζ ∈ C, ξ + η + ζ bounded , implies ξ, η and ζ are bounded. The associativity of the tensor product then gives the associativity of the convolution product

R ∗ S ∗ T = (R ∗ S) ∗ T = R ∗ (S ∗ T).

The convolution product may not be associative if the above condition is not satisfied. The distributions (R ∗ S)∗ T and R ∗ (S ∗ T)might exist without being equal. So is (1 ∗ δ) ∗ Y = 0 while 1 ∗ (δ ∗ Y)= 1 ∗ δ = 1.

Theorem 6.14. The convolution product of several distributions makes sense, is com- mutative and associative, in the following cases: (i) All, but at most one distribution has compact support. (ii) (n = 1) The supports of all distributions are bounded from the left (right). (iii) (n = 4) All distributions have support contained in t ≥ 0 and, up to at most one, also in t ≥ 0, t2 − x2 − y2 − z2 ≥ 0.

The proof is left to the reader. Assume that S and T satisfy: ξ ∈ Supp S, η ∈ Supp T, ξ + η bounded, implies that ξ and η are bounded. Then one has:

Theorem 6.15. Translation and differentiation of the convolution product S ∗ T goes by translation and differentiation of one of the factors.

This follows easily from Theorem 6.14 by invoking the delta distribution and using  n τa T = δ(a) ∗ T,DT = Dδ ∗ T if T ∈D(R ), D being a differential operator with constant coefficients on Rn and a ∈ Rn.

Example (n = 2)

∂2 ∂2S ∂2T ∂S ∂T ∂S ∂T (S ∗ T) = ∗ T = S ∗ = ∗ = ∗ . ∂x∂y ∂x∂y ∂x∂y ∂x ∂y ∂y ∂x

6.4 Exercises

Exercise 6.16. Show that the following functions are in L1(R):

2 2 e−|x| , e−ax (a > 0), xe−ax (a > 0). 40 Convolution of Distributions

Compute (i) e−|x| ∗ e−|x|, 2 2 (ii) e−ax ∗ xe−ax , 2 2 (iii) xe−ax ∗ xe−ax .

Exercise 6.17. Determine all powers Fm = f ∗ f ∗···∗f (m times), where  1 if −1

 Determine also Supp Fm and Fm.

Exercise 6.18. In the (x, y)-plane we consider the set C :0< |y|≤x.Iff and g are functions with support contained in C, write then f ∗ g in a “suitable” form. Let χ be the characteristic function of C.Computeχ ∗ χ.

6.5 Newton Potentials and Harmonic Functions

Let μ be a continuous mass distribution in R3 and μ . μ(y1,y2,y3) dy1 dy2 dy3 U (x1,x2,x3) = 2 2 2 (x1 − y1) + (x2 − y2) + (x3 − y3) the potential belonging to μ. Generalized to Rn we get μ(y) dy 1 U μ = or U μ = μ ∗ (n > 2). x − y n−2 r n−2 Rn

Definition 6.19. The Newton potential of a distribution T on Rn(n > 2) is the distri- bution 1 U T = T ∗ . r n−2 In order to guarantee the existence of the convolution product, let us assume, for the time being, that T has compact support.

Theorem 6.20. U T satisfies the Poisson equation

n 2π 2 ΔU T =−NT with N = ! " . Γ n 2 The proof is easy, apply in particular Section 3.5. An elementary or fundamental solution of Δ U = V is, by definition, a solution of Δ U = δ.HenceE =−1/(Nr n−2) is an elementary solution of the Poisson equation. Call a distribution U harmonic if ΔU = 0. Newton Potentials and Harmonic Functions 41

Lemma 6.21 (Averaging theorem from potential theory). Let f be a harmonic function on Rn (i. e. f is a C2 function on Rn with Δf = 0). Then the average of f over any sphere around x with radius R is equal to f(x):

f(x) = f(x − y)dωR(y) = f(x − Rω)dω y =R Sn−1 with

|J(θ ,...,θ − )|dθ ···dθ − dω = 1 n 1 1 n 1 (in spherical coordinates) , Sn−1

dωR(y) = surface element of sphere around x = 0 with radius R, such that total mass is 1 .

Proof. Let n>2. In order to prove this lemma, we start with a C2 function f on Rn, afunctionϕ ∈D(Rn) and a volume V with boundary S, and apply Green’s formula (Section 3.5):   ∂ϕ ∂f f Δ ϕ − ϕ Δ f dx + f − ϕ dS = 0 , (i) ∂ν ∂νi V S with ν the inner normal on S.

Suppose now f harmonic, so Δ f = 0,andtakeV = VR(x),beingaballaroundx with radius R and boundary S.Takeϕ ∈D(Rn) such that ϕ = 1 on a neighborhood of V .Then(i)gives ∂f dS = 0 . (ii) ∂ν S n n−2 Choose then ϕ ∈D(R ) such that ϕ(y) = 1/r for 0

f(x − Rω)dω = f(x − R0 ω) dω. Sn−1 Sn−1

Let finally R0 tend to zero. Then the left-hand side tends to f(x). Theproofinthecasesn = 1 and n = 2 of this lemma is left to the reader. 42 Convolution of Distributions

Theorem 6.22. A harmonic distribution is a C∞ function.  Proof. Choose α ∈D(Rn) rotation invariant and such that α(x) dx = 1.Letf be a C2 function satisfying Δ f = 0. According to Lemma 6.21,

∞ n−1 f ∗ α(x) = f(x − t)α(t)dt = Sn−1 f(x − rω)r α(r ) dr dω Rn 0 Sn−1 ∞ n−1 = Sn−1 α(r ) r dr · f(x) = f(x). 0

Set ϕ(x)ˇ = ϕ(−x) for any ϕ ∈D.Thenalsoϕˇ ∈D.LetnowT be a harmonic distribution. Consider the regularization of T by ϕˇ: f = T ∗ ϕˇ .Thenwehave

T ∗α, ϕ=(T ∗α)∗ϕ(ˇ 0)=(T ∗ϕ)ˇ ∗α(0)=f ∗α(0)=f(0)=T ∗ϕ(ˇ 0)=T, ϕ for any ϕ ∈D,sincef = T ∗ ϕˇ is a harmonic function. Hence T ∗ α = T and thus T ∈E(Rn). Notice that the convolution product is associative (and commutative) in this case, since both α and ϕˇ have compact support. We remark that, by applying more advanced techniques from the theory of differ- ential equations, one can show that a harmonic function on Rn is even real analytic. The following corollary is left as an exercise.

Corollary 6.23. A sequence of harmonic functions, converging uniformly on compact subsets of Rn, converges to a harmonic function.

6.6 Convolution Equations

We begin with a few definitions. An algebra is a vector space provided with a bilinear and associative product. A convolution algebra A is a linear subspace of D, closed under taking convolu- tion products. The product should be associative (and commutative) and, moreover, δ should belong to A. A convolution algebra is thus in particular an algebra, even with unit element. Let us give an illustration of this notion of convolution algebra.

Examples of Convolution Algebras (i) E: the distributions with compact support on Rn.   (ii) D+: the distributions with support in [0, ∞) on R; similarly D−. (iii) In R4, the distributions with support contained in the positive light cone t ≥ 0, t2 − x2 − y2 − z2 ≥ 0. Convolution Equations 43

Let A be a convolution algebra. A convolution equation in A is an equation of the form A ∗ X = B.

Here A, B ∈A are given and X ∈A has to be determined.

Theorem 6.24. The equation A ∗ X = B is solvable for any B in A if and only if A has an inverse in A,thatisanelementA−1 ∈A such that

A ∗ A−1 = A−1 ∗ A = δ.

The inverse A−1 of A is unique and the equation has a unique solution. Namely X = A−1 ∗ B.

This algebraic result is easy to show. One just copies the method of proof from elementary algebra. Not every A has an inverse in general. If A is a C∞ function with compact support, then A−1 ∉ A,forA∗A−1 is then C∞, but on the other hand equal to δ.Inprinciple however A ∗ X = B might have a solution in A for several B. The Poisson equation Δ U = V has been considered in Section 6.5. This is a con- volution equation: Δ U = Δ δ ∗ U = V with elementary solution X =−1/(4πr) in R3.IfV has compact support, then all solutions of Δ U = V are

1 U =− ∗ V + harmonic C∞ functions . 4πr

Though Δ δ and V are in E(R3), X is not. There is no solution of Δ U = δ in E(R3) (see later), but there is one in D(R3). The latter space is however no convolution algebra. There are even infinitely many solutions of Δ U = V in D(R3).  We continue with convolution equations in D+.

Theorem 6.25. Consider the ordinary differential operator

dm dm−1 d D = + a +···+a − + a dxm 1 dxm−1 m 1 dx m  with complex constants a1,...,am.ThenDδis invertible in D+ and its inverse is of the form YZ,withY the Heaviside function and Z the solution of DZ = 0 with initial conditions

Z(0) = Z(0) =···=Z(m−2)(0) = 0,Z(m−1)(0) = 1 .

Proof. Any solution Z of DZ = 0 is an analytic function, so C∞. We have to show that D(YZ) = δ if Z satisfies the initial conditions of the theorem. Applying the formulae 44 Convolution of Distributions

of jumps from Section 3.2, Example 3, we obtain

(Y Z) = YZ + Z(0)δ (Y Z) = YZ + Z(0)δ+ Z(0)δ . . (Y Z)(m−1) = YZ(m−1) + Z(m−2)(0)δ+···+Z(0)δ(m−2) (Y Z)(m) = YZ(m) + Z(m−1)(0)δ+···+Z(0)δ(m−1) .

Hence (Y Z)(k) = YZ(k) if k ≤ m − 1 and (Y Z)(m) = YZ(m) + δ, therefore D(YZ) = YDZ + δ = δ.

Examples d 1. Let D = −λ,withλ ∈ C. One has Dδ= δ−λδand (δ−λδ)−1 = Y(x)eλx. dx # $ d2 sin ωx 2. Consider D = + ω2 with ω ∈ R.Then(δ + ω2 δ)−1 = Y(x) . dx2 ω

d m xm−1 3. Set D = − λ .Then(δ − λδ)−m = Y(x) eλx. dx (m − 1)!

As application we determine the solution in the sense of function theory of Dz = f if D is as in Theorem 6.25 and f a given function, and we impose the initial conditions (k) m z (0) = zk (0 ≤ k ≤ m − 1).Weshalltakez ∈ C and f continuous. We first compute Yz. As in the proof of Theorem 6.25, we obtain

m (k−1) D(Yz) = Y(Dz)+ ck δ k=1  = + +···+ = + m (k−1) with ck am−k z0 am−k−1 z1  zm−k.HenceD(Yz) Yf k=1 ck δ , = ∗ + m (k−1) ≥ and therefore Yz YZ (Y f k=1 ck δ ),or,forx 0

x m (k−1) z(x) = Z(x − t)f(t) dt + ck Z (x) . = 0 k 1

 Similarly we can proceed in D− with −Y(−x). One obtains the same formula. Verify that z satisfies Dz = f with the given initial conditions.

 −1 Proposition 6.26. If A1 and A2 are invertible in D+,thenA1∗A2 is and (A1∗A2) = −1 ∗ −1 A2 A1 . This proposition is obvious. It has a nice application. Let D be the above differ- ential operator and let

m m−1 P(z) = z + a1z +···+am−1z + am . Symbolic Calculus of Heaviside 45

Then P(z)can be written as

P(z) = (z − z1)(z − z2) ···(z − zm), with z1,...,zm being the complex roots of P(z) = 0.Hence

d d d D = − z1 − z2 ··· − zm dx dx dx or          Dδ = δ − z1δ ∗ δ − z2δ ∗···∗ δ − zmδ . Therefore − (Dδ) 1 = Y(x)ez1x ∗ Y(x)ez2x ∗···∗Y(x)ezmx

  m λx m−1 which is in D+.Forexample(δ − λδ) ;itsinverseisY(x)e x /(m − 1)!, which can easily be proved by induction on m; see also Example 3 above.

6.7 Symbolic Calculus of Heaviside

 It is known that D+ is a commutative algebra over C with unit element δ and without  zero divisors. The latter property says: if S ∗ T = 0 for S,T ∈D+,thenS = 0 or T = 0. For a proof we refer to [14], p. 325, Theorem 152 and [10], p. 173. Therefore, by a well-  known theorem from algebra, D+ has a quotient field, where convolution equations can be solved in the natural way by just dividing. Let us apply this procedure, called symbolic calculus of Heaviside.

Consider again, for complex numbers ai,

dm dm−1 d D = + a +···+a − + a . dxm 1 dxm−1 m 1 dx m

We will determine the inverse of Dδ again. Let

m m−1 P(z) = z + a1z +···+am−1z + am and write / kj P(z) = (z − zj) j the zj being the mutually different roots of P(z) = 0 with multiplicity kj . =  Set p δ and write zj for zjδ,1forδ0and just product for convolution product. kj  Then we have to determine the inverse of j (p − zj) in the quotient field of D+. By partial fraction decomposition we get   1 cj,kj cj,1 0 = +···+ , (z − z )kj (z − z )kj (z − z ) j j j j j 46 Convolution of Distributions

m z x m−1 the cj,l being complex scalars. The inverse of (p−zj ) is Y(x)e j x /(m − 1)!,  −1 whichisanelementofD+ again. Hence (Dδ) is a sum of such expressions # $  k −1 − x j (Dδ) 1 = Y(x) c +···+c ezj x . j,kj (k − 1)! j,1 j j

Let us consider a particular case. Let D = d2/dx2 + ω2 with ω ∈ R.Write

1 1 1 1 = − . z2 + ω2 2iω z − iω z + iω Thus 1   sin ωx (Dδ)−1 = Y(x) eiωx − e−iωx = Y(x) . 2iω ω Another application concerns integral equations. Consider the following integral equation on [0, ∞]:

x cos(x − t)f(t) dt = g(x) (g given,x≥ 0). 0

Let us assume that f is continuous and g is C1 for x ≥ 0.Replacingf and g by Yf and Yg, respectively, we can interpret this equation in the sense of distributions, and we actually have to solve Y(x) cos x ∗ Yf = Yg

 in D+.Wehave ( ) ( ) eix + e−ix 1 1 1 p Y(x) cos x = Y(x) = + = . 2 2 p − i p + i p2 + 1

The inverse is therefore

p2 + 1 1 = p + = δ + Y(x). p p

Hence

x x Yf = Yg∗(δ +Y(x)) = (Y g) +Y(x) g(t)dt = Yg +g(0)δ+Y(x) g(t)dt. 0 0

We conclude that g(0) has to be zero since we want a function solution. But this was clear a priori from the integral equation. Volterra Integral Equations of the Second Kind 47

6.8 Volterra Integral Equations of the Second Kind

Consider for x ≥ 0 the equation

x f(x)+ K(x − t)f(t) dt = g(x) . (6.1) 0

We assume (i) K and g are locally integrable, f locally integrable and unknown, (ii) K(x) = g(x) = f(x) = 0 for x ≤ 0.

 Then we can rewrite this equation as a convolution equation in D+

f + K ∗ f = g or (δ + K) ∗ f = g.

Equation (6.1) is a particular case of a Volterra integral equation of the second kind. The general form of a Volterra integral equation is:

x K(x,y)f(y)dy = g(x) (first kind), 0 x f(x)+ K(x,y)f(y)dy = g(x) (second kind). 0

Here we assume that x ≥ 0 and that K is a function of two variables on 0 ≤ y ≤ x. For an example of a Volterra integral equation of the first kind, see Section 6.7. Volter- ra integral equations of the second kind arise for example when considering linear ordinary differential equations with variable coefficients

dmu dm−1u du + a (x) +···+a − (x) + a (x) = g(x) dxm 1 dxm−1 m 1 dx m in which the ai(x) and g(x) are supposed to be continuous. Let us impose the initial conditions

 (m−1) u(0) = u0,u(0) = u1,...,u (0) = um−1 .

Then this differential equation together with the initial conditions is equivalent with the Volterra integral equation of the second kind

x v(x) + K(x,y)v(y)dy = w(x) 0 48 Convolution of Distributions

with

v(x) = u(m)(x) , (x−y)2 (x−y)m−1 K(x, y) = a1(x) + a2(x) (x−y)+ a3(x) +···+am(x) , 2! (m−1)!

m−1 m−1   xk−l w(x) = g(x) − u a − (x) . k m l (k − l)! l=0 k=l

In particular, we obtain a convolution Volterra equation of the second kind if all ai are constants, so in case of a constant coefficient differential equation. Of course we have to restrict to [0, ∞), but a similar treatment can be given on (−∞, 0],byworkingin  D−.

Example Consider d2u/dx2 + ω2u =|x|.Thenv = u satisfies

x 2 2 2 v(x) + ω (x − y)v(y)dy =|x|−ω u0 − ω u1x. 0 Theorem 6.27. For all locally bounded, locally integrable functions K on [0, ∞),the  −1 distribution A = δ + K is invertible in D+ and A is again of the form δ + H with H a locally bounded, locally integrable function on [0, ∞).

Proof. Let us write, symbolically, δ = 1 and K = q. Then we have to determine the   inverse of 1 + q, first in the quotient field of D+ and then, hopefully, in D+ itself. We have ∞ 1  = (−1)k qk 1 + q k=0  provided this series converges in D+.Set

E = δ − K + K∗2 +···+(−1)kK∗k +··· .

 Is this series convergent in D+? Yes, take an interval 0 ≤ x ≤ a and set Ma = | | ess. sup0≤x≤a K(x) .Forx in this interval one has      x       ∗2  =  −  ≤ 2 K (x)  K(x t)K(t) dt xMa , 0 hence, with induction on k,     xk−1 K∗k(x) ≤ Mk , (k − 1)! a | ∗k |≤ k k−1 − = ≤ ≤ and thus K (x) Maa /(k 1)! (k 0, 1, 2,...)for0 x a.Ontheright- Maa hand side we see terms of a convergent series, with sum Ma e . Therefore the series Exercises 49

 ∞ − k ∗k k=1( 1) K converges to a locally bounded, locally integrable, function H on [0, ∞) by Lebesgue’s theorem on dominated convergence, hence the series converges   in D+.ThusE = δ + H and E ∈D+.VerifythatE ∗ A = δ. Thus the solution of the convolution equation (6.1) is

x f = (δ + H) ∗ g, or f(x) = g(x) + H(x − t)g(t)dt(x≥ 0). 0

We conclude that the solution is given by a formula very much like the original equa- tion (6.1).

6.9 Exercises

 Exercise 6.28. Determine the inverses of the following distributions in D+:

δ − 5 δ + 6 δ; Y + δ; Y(x)ex + δ .

Exercise 6.29. Solve the following integral equation:

x (x − t) cos(x − t)f(t) dt = g(x) , 0 for g a given function and f an unknown function, both locally integrable and with support in [0, ∞).

Exercise 6.30. Let g be a function with support in [0, ∞).Findadistributionf(x) with support in [0, ∞) that satisfies the equation

x   e−t − sin t f(x − t)dt = g(x) . 0

Which conditions have to be imposed on g in order that the solution is a continuous function? Determine the solution in case g = Y , the Heaviside function.

Exercise 6.31. Solve the following integral equation:

x f(x)+ cos(x − t)f(t) dt = g(x) . 0

Here g is given, f unknown. Furthermore Supp f and Supp g are contained in [0, ∞). 50 Convolution of Distributions

Exercise 6.32. Let f(t)be the solution of the differential equation

u + 2u + u + 2u =−10 cos t, with initial conditions

u(0) = 0,u(0) = 2,u(0) =−4 .

Set F(t) = Y(t)f(t). Write the differential equation for F, in the sense of distributions.  Determine then F by symbolic calculus in D+.

6.10 Systems of Convolution Equations*

Consider the system of n equations with n unknowns Xj

A11 ∗ X1 + A12 ∗ X2 +···+A1n ∗ Xn = B1 . .

An1 ∗ X1 + An2 ∗ X2 +···+Ann ∗ Xn = Bn ,

   where Aij ∈A,Xj ∈A,Bj ∈A. The following considerations hold for any system of linear equations over an ar- bitrary commutative ring with unit element (for example A).

Let us write A = (Aij ), B = (Bj ), X = (Xj ).ThenwehavetosolveA ∗ X = B.

Theorem 6.33. The equation A∗X = B has a solution for all B if and only if Δ = det A has an inverse in A. The solution is then unique, X = A−1 ∗ B.

Proof.

(i) Suppose A∗X = B has a solution for all B. Choose B such that Bi = δ, Bj = 0 for j =i, and call the solution in this case Xi = (Cij ).Leti vary between 1 and n. Then  n 0 if i =j A ∗ C = ik kj δ if i = j, k=1

or A ∗ C = δI with C = (Cij ).Thendet A ∗ det C = δ,henceΔ has an inverse in A. −1  (ii) Assume that Δ exists in A .Letaij be the minor of A = (Aij ) equal to (−1)i+j × the determinant of the matrix obtained from A by deleting the jth row −1 and the ith column. Set Cij = aij ∗Δ .ThenA∗C = C ∗A = δI,soA∗X = B is solvable for any B, X = C ∗ B.

The remaining statements are easy to prove and left to the reader. Exercises 51

6.11 Exercises

 Exercise 6.34. Solve in D+ the following system of convolution equations:    δ ∗ X1 + δ ∗ X2 = δ   δ ∗ X1 + δ ∗ X2 = 0 .

Further Reading

Rigorous proofs of the existence of the tensor product and convolution product are in [5], [10] and also in [15]. For the impact on the theory of differential equations, see [7]. 7TheFourierTransform

Summary Fourier transformation is an important tool in mathematics, physics, computer sci- ence, chemistry and the medical and pharmaceutical sciences. It is used for analyzing and interpreting of images and sounds. In this chapter we lay the fundamentals of the theory. We define the Fourier transform of a function and of a distribution. Unfortu- nately not every distribution has a Fourier transform, we have to restrict to so-called tempered distributions. As an example we determine the tempered solutions of the heat or diffusion equation.

Learning Targets

 A thorough knowledge of the Fourier transform of an integrable function.  Understanding the definition and properties of tempered distributions.  How to determine a tempered solution of the .

7.1 Fourier Transform of a Function on R

For f ∈ L1(R) we define ∞ f(y)1 = f(x)e−2πixydx(y∈ R). −∞

We call f1 the Fourier transform of f , also denoted by Ff . We list some elementary properties without proof. ˆ 1 1 1 a. (c1f1 + c2f2) = c1f1 + c2f2 for c1,c2 ∈ C and f1,f2 ∈ L (R). 1 1 1 b. |f(y)|≤ f 1, f is continuous and lim|y|→∞ f(y) = 0 (Riemann–Lebesgue lemma)forallf ∈ L1(R). To prove the latter two properties, approximate f by a step function. See also Exercise 4.9 a. c. (f ∗ g)ˆ = f1 · g1 for all f,g ∈ L1(R). d. Let f(x) = f(−x)(x ∈ R).Then(f) ˆ = f1 for all f ∈ L1(R). e. Let (Lt f )(x) = f(x − t),(Mρf )(x) = f(ρx),ρ >0.Then

ˆ −2πity 1 (Ltf)(y) = e f(y), 2 3 ˆ 2πitx 1 e f(x (y) = (Lt f )(y) , # $ ˆ 1 1 y 1 (Mρf)(y) = f for all f ∈ L (R). ρ ρ Fourier Transform of a Function on R 53

Examples

1. Let A>0 and ΦA the characteristic function of the closed interval [−A, A]. sin 2πAy Then Φ1 (y) = (y =0), Φ1 (0) = 2A. A πy A 2. Set  1 −|x| for |x|≤1 Δ(x) = 0 for |x| > 1 .

(triangle function) # $ 2 sin πy Then Δ = Φ ∗ Φ ,soΔ1 (y) = (y =0), Δ1 (0) = 1 . 1/2 1/2 πy

3. Let T be the trapezoidal function ⎧ ⎪ ⎨⎪ 1 if |x|≤1 = −| | | |≤ T(x) ⎪ 1 x if 1 < x 2 ⎩⎪ 0 if |x|≥2 .

sin 3πysin πy Then T = Φ ∗ Φ ,henceT(y)1 = (y =0) and T(1 0) = 3. 1/2 3/2 (πy)2

2a 4. Let f(x) = e−a|x|,a>0.Thenf(y)1 = . a2 + 4π 2y2  2 5. Let f(x) = e−ax ,a > 0. Then we get by complex integration f(y)1 = π/a · −π 2y2/a F −πx2 = −πy2 e .Inparticular (e ) e . ∞ −ax2 = We know −∞ e dx π/a for a>0. From this result we shall deduce, ap- ∞ −a(t+ib)2 plying Cauchy’s theorem from complex analysis, that −∞ e dt = π/a for all a>0 and b ∈ R. Consider the closed path W in the complex plane consisting of the line segments

from −R to R (k1), from R to R + ib (k2), from R + ib to −R + ib (k3)andfrom −R + ib to −R (k4). Here R is a positive .

−R + ib ← k3 R + ib

k4 ↑

↓ k2

−R k1 → R 54 The Fourier Transform

5 −az2 −az2 = Since e is analytic in z, one has by Cauchy’s theorem W e dz 0.Let us split this integral into four pieces. One has

R 2 2 e−az dz = e−ax dx,

k1 −R

R 2 2 e−az dz =− e−a(t+ib) dt.

k3 −R

Parametrizing k2 with z(t) = R + it,weget

b 2 2 e−az dz = i e−a(R+it) dt.

k2 0 Since    2  2 2 2 2  e−a(R+it)  = e−a(R −t ) ≤ e−a(R −b ) (a ≤ t ≤|b|),

we get      2  2 2 2  e−az dz  ≤|b| e−a(R −b ) , hence lim e−az dz = 0 .   R→∞ k2 k2

In a similar way 2 lim e−az dz = 0 . R→∞ k4 Therefore ∞ ∞ 6 2 2 π e−a(t+ib) dt = e−ax dx = . a −∞ −∞

We now can determine f1 ∞ ∞   6 2 2 2 1 −ax2 −2πixy −π 2y2/a −a x+ πiy π − π y f(y) = e e dx = e e a dx = e a . a −∞ −∞

7.2 The Inversion Theorem

1 Theorem 7.1. Let f ∈ L (R) and x apointwheref(x + 0) = limx↓0 f(x) and f(x − 0) = limx↑0 f(x)exist. Then one has ∞ 2 2 f(x + 0) + f(x − 0) lim f(y)1 e2πixy e−4π αy dy = . α↓0 2 −∞ The Inversion Theorem 55

Proof. Step 1. One has ∞ ∞ ∞ 2 2 2 2 f(y)1 e2πixy e−4π αy dy = f(t)e2πi(x−t)y e−4π αy dtdy −∞ −∞ −∞

∞ ( ∞ ) 2 2 = f(t) e−4π αy e2πi(x−t)ydy dt (by Fubini’s theorem) −∞ −∞

∞ ∞ 2 2 1 (x−t) 1 − t = √ f(t)e 4α dt = √ f(x + t)e 4α dt. 2 πα 2 πα −∞ −∞ Step 2. We now get ∞ * + 2 2 f(x + 0) + f(x − 0) f(y)1 e2πixy e−4π αy dy − 2 −∞

∞   1 2 = √ f(x + t) − f(x − 0) + f(x − t) − f(x − 0) e−t /4αdt 2 πα 0

δ ∞ 1 1 = √ ···+ √ ···=I + I , 2 πα 2 πα 1 2 0 δ where δ>0 still has to be chosen. Step 3. Set φ(t) =|f(x + t) − f(x + 0) + f(x − t) − f(x − 0)|.Letε>0 be given. Choose δ such that φ(t) < ε/2 for |t| <δ,t>0. This is possible, since limt↓0 φ(t) = 0.Thenwehave|I1|≤ε/2.

Step 4. We now estimate I2.Wehave ∞ √ ∞ 1 2 2 α φ(t) |I |≤ √ φ(t) e−t /4αdt ≤ √ dt. 2 2 πα π t2 δ δ  ∞ 2 Now δ φ(t)/t dt is bounded, because ∞ ∞ ∞ ∞ φ(t) |f(x + t)| |f(x − t)| |f(x + 0) − f(x − 0)| dt ≤ dt + dt + dt t2 t2 t2 t2 δ δ δ δ 2 1  ≤ f 1 + |f(x + 0) − f(x − 0)| . δ2 δ

Hence limα↓0 I2 = 0,so|I2| <ε/2 as soon as α is small, say 0 <α<η. Step 5. Summarizing: if 0 <α<η,then    ∞   2 2 f(x + 0) + f(x − 0)   f(y)1 e2πixy e−4π αy dy −  <ε.  2  −∞ 56 The Fourier Transform

Corollary 7.2. If both f and f1 belong to L1(R) and f is continuous in x,then ∞ f(x) = f(y)1 e2πixy dy. −∞

Corollary 7.3. Let f ∈ L1(R) and f1 ≥ 0.Iff is continuous in x = 0, then one has f1 ∈ L1(R) and therefore ∞ f(0) = f(t)1 dt. −∞ Proof. Corollary 7.2 is easily proved by applying Lebesgue’s theorem on dominated convergence. Corollary 7.3 is a consequence of Fatou’s lemma.

Applications 2 2 1. Let Pa(x) = (1/π)[a/(x +a )] (a>0), see Section 6.2. Then we know by Sec- −2πa|y| tion 7.1, Example 4, that Pa(x) =F(e )(x). Therefore, by the inversion −2πa|y| theorem, e =F(Pa)(y).Hence

−2πa|y| −2πb|y| −2π(a+b)|y| F (Pa ∗ Pb)(y) = e e = e =F(Pa+b)(y) .

Again by the inversion theorem we get Pa ∗ Pb = Pa+b. 2. If f and g are in L1(R), and if both functions are continuous, then f1 = g1 implies f = g. Later on we shall see that this is true without the continuity assumption, or, otherwise formulated, that F is a one-to-one linear map.

7.3 Plancherel’s Theorem

1 2 1 2 1 Theorem 7.4. If f ∈ L ∩ L ,thenf ∈ L and f 2 = f 2.

Proof. Let f be a function in L1 ∩L2.Thenf ∗f is in L1 and is a continuous function. Moreover (f ∗f) ˆ =|f |2. According to Corollary 7.3we have |f1|2 ∈ L1,hencef1 ∈ L2. Furthermore ∞ ∞      2  2 f ∗ f( 0) = f(x) dx = f(y)1  du. −∞ −∞ 1 So f 2 = f 2. The Fourier transform is therefore an isometric linear mapping, defined on the dense linear subspace L1 ∩ L2 of L2. Let us extend this mapping F to L2, for instance by k Ff(x) = lim f(t)e−2πixtdt. (L2-convergence) k→∞ −k Differentiability Properties 57

We then have:

Theorem 7.5 (Plancherel’s theorem). The Fourier transform is a unitary operator on L2.

Proof. For any f, g ∈ L1 ∩ L2 one has ∞ ∞ f(x)g(x)1 dx = f(x)g(x)1 dx. −∞ −∞   ∞ ∞ 2 Since both −∞ f(x)(Fg)(x)dx and −∞(Ff )(x)g(x)dx are defined for f, g ∈ L , and because    ∞     F  ≤ F =  f(x)( g)(x)dx  f 2 g 2 f 2 g 2 , −∞    ∞     F  ≤ F =  ( f )(x)g(x)dx  f 2 g 2 f 2 g 2 , −∞ we have for any f, g ∈ L2 ∞ ∞ f(x)(Fg)(x)dx = (Ff )(x)g(x)dx. −∞ −∞

F 2 ⊂ 2 F 2 = 2 Notice that (L ) L is a closed linear subspace. We have to show that (L ) L . ∈ 2 F 2 ∞ F = ∈ 2 Let g L ,gorthogonal to (L ),so −∞( f )(x)g(x)dx 0 for all f L . ∞ 2 Then, by the above arguments, −∞ f(x)(Fg)(x)dx = 0 for all f ∈ L ,hence 2 2 F(g) = 0.Since g 2 = Fg 2,wegetg = 0.SoF(L ) = L .

7.4 Differentiability Properties

Theorem 7.6. Let f ∈ L1(R) be continuously differentiable and let f  be in L1(R) too. Then f7(y) = (2πiy)f(y)1 .

Proof. By partial integration and for y =0 we have

A  A e−2πixy x=A 1 f(x)e−2πixy dx = f(x) + f (x) e−2πixy dx. −2πiy x=−A 2πiy −A −A  = + x   ∈ 1 R Notice that lim|x|→∞ f(x) exists, since f(x) f(0) 0 f (t) dt and f L ( ). 1 Hence, because f ∈ L (R), lim|x|→∞ f(x) = 0.Thenweget ∞ ∞ 1 f(x)e−2πixy dx = f (x) e−2πixy dx, 2πiy −∞ −∞ 58 The Fourier Transform

or (2πiy)f(y)1 = f7(y) for y =0. Since both sides are continuous in y,the theorem follows.

Corollary 7.7. Let f be a Cm function and let f (k) ∈ L1(R) for all k with 0 ≤ k ≤ m. Then  f (m)(y) = (2πiy)m f(y).1

Notice that for f as above, f(y)1 = o(|y|−m).

Theorem 7.8. Let f ∈ L1(R) and let (−2πix)f(x) = g(x) be in L1(R) too. Then f1 is continuously differentiable and (f)1  = g1.

Proof. By definition

∞ ( ) e−2πixh − 1 1  −2πixy0 (f) (y0) = lim f(x)e dx. h→0 h −∞

We shall show that this limit exists and that we may interchange limit and integration. This is a consequence of Lebesgue’s theorem on dominated convergence. Indeed,  ( )  −2πixh   − e − 1   f(x)e 2πixy0  ≤  h 

|f(x)||2πx||sin 2πxθ1(h, x) − i cos 2πxθ2(h, x)|

1 with 0 ≤ θi(h, x) ≤ 1 (i = 1, 2), hence bounded by 2|g(x)|.Sinceg ∈ L (R) we are done.

1 m 1 Corollary 7.9. Let f ∈ L (R), gm(x) = (−2πix) f(x)and gm ∈ L (R) too. Then 1 m 1(m) f is C and f = g8m.

Both Corollary 7.7 and Corollary 7.9 are easily shown by induction on m.

7.5 The Schwartz Space S(R)

Let f be in L1(R). Then one obviously has, applying Fubini’s theorem, f,1 ϕ= f, ϕ1  for all ϕ ∈D(R).IfT is an arbitrary distribution, then a “natural” definition of T1 would be T,1 ϕ=T, ϕ1 (ϕ ∈D(R)).ThespaceD(R) is however not closed under taking Fourier transforms. Indeed, if ϕ ∈D(R),thenϕ1 can be continued to a complex entire analytic function ϕ(z)1 . This is seen as follows. Let Supp ϕ ⊂ [−A, A].Then

A ∞ A  (−2πiz)k ϕ(z)1 = ϕ(y) e−2πiyz dy = ϕ(y) yk dy = k! −A k 0 −A The Schwartz Space S(R) 59

and this is an everywhere convergent power series. Therefore, if ϕ1 were in D(R), then ϕ = 0. Thus we will replace D=D(R) with a new space, closed under taking Fourier transforms, the Schwartz space S=S(R).

Definition 7.10. The Schwartz space S consists of C∞ functions ϕ such that for all k, l ≥ 0, the functions xl ϕ(k)(x) are bounded on R.

Elements of S are called Schwartz functions, also functions whose derivatives are “rapidly decreasing” (à décroissance rapide). Observe that D⊂S.Anexampleof 2 a new Schwartz function is ϕ(x) = e−ax (a > 0).

Properties of S 1. S is a complex vector space. 2. If ϕ ∈S,thenxl ϕ(k)(x) and [xl ϕ(x)](k) belong to S as well, for all k, l ≥ 0. 3. S⊂Lp for all p satisfying 1 ≤ p ≤∞. 4. S=S1 .

The latter property follows from Section 7.4 (Corollaries 7.7 and 7.9).

One defines on S a convergence principle as follows: say that ϕj tendstozero (notation ϕj → 0)whenforallk, l ≥ 0,      l (k)  → sup x ϕj (x) 0 if j tends to infinity. OnethenhasanotionofcontinuityinS,inasimilarwayasinthespacesD and E.

Theorem 7.11. The Fourier transform is an isomorphism of S.

Proof. The linearity of the Fourier transform is clear. Let us show the continuity of the

Fourier transform. Let {ϕj} be a sequence of functions in S. l 1 (k) = 1 = − k (l) (i) Notice that (2πiy) ϕj (y) ψj (y) where ψj (x) [( 2πix) ϕj(x)] . Clearly ψj ∈S. | l 1 (k) |=|1 |≤ (ii) One has (2πiy) ϕj (y) ψj (y) ψj 1. (iii) The following inequalities hold:

∞ 1

ψj 1 ≤ |ψj(x)| dx = |ψj(x)| dx + |ψj (x)| dx −∞ −1 |x|≥1 2 |x ψj(x)| ≤ 2sup|ψ |+ dx ≤ 2sup|ψ |+2sup|x2 ψ (x)| . j x2 j j |x|≥1 60 The Fourier Transform

Now, if ϕj → 0 in S,thenψj → 0 in S,hence ψj 1 → 0 and therefore | l 1 (k) |→ →∞ 1 → S sup (2πiy) ϕj (y) 0 if j .Henceϕj 0 in . Thus the Fourier transform is a continuous mapping from S to itself. In a similar way one shows that the inverse Fourier transform is continuous on S. Hence the Fourier transform is an isomorphism. Observe that item (iii) in the above proof implies that the injection S⊂L1 is continuous; the same is true for S⊂Lp for all p satisfying 1 ≤ p ≤∞.

7.6 The Space of Tempered Distributions S(R)

One has the inclusions D⊂S⊂E. The injections are continuous, the images are dense. Compare with Section 5.1.

Definition 7.12. A tempered distribution is a distribution that can be extended to a con- tinuous linear form on S.

Let S =S(R) be the space of continuous linear forms on S. Then, by the re- mark at the beginning of this section, S can be identified with the space of tempered distributions.

Examples p q 1. Let f ∈ L , 1 ≤ p ≤∞.ThenTf is tempered since S⊂L ,forq such that 1/p + 1/q = 1, and the injection is continuous. 2. Let f be a locally integrable function that is slowly increasing (à croissance lente): there are A>0 and an integer k ≥ 0 such that |f(x)|≤A |x|k if |x|→∞.

Then Tf is tempered. This follows from the following inequalities:        ∞   1            ≤   +    f(x) ϕ(x) dx  f(x) ϕ(x)dx  f(x) ϕ(x)dx −∞ −1 |x|≥1 1        +  |f(x)| ≤ sup |ϕ| f(x) dx + sup ϕ(x) xk 2 dx. |x|k+2 −1 |x|≥1 3. Every distribution T with compact support is tempered. To see this, extend T to E (Theorem 5.2) and then restrict T to S. 4. Since ϕ → ϕ is a continuous linear mapping from S to S,thederivativeofatem- pered distribution is again tempered. 5. If T is tempered and α a polynomial, then αT is tempered, since ϕ → αϕ is a continuous linear map from S into itself. x2 6. Let f(x) = e .ThenTf is not tempered. Structure of a Tempered Distribution* 61

 ∞ Indeed let α ∈D,α ≥ 0,α(x) = α(−x) and −∞ α(x) dx = 1.Letχj be the characteristic function of the closed interval [−j, j] and set αj = χj ∗ α.Then −x2 αj ∈Dand αj ϕ → ϕ in S for all ϕ ∈S. Choose ϕ(x) = e .ThenTf ,αjϕ=  ∞  ∞ −∞ αj (x) dx.Butlimj→∞ −∞ αj (x) dx =∞. Similar to Propositions 2.3 and 5.4 one has:

Proposition 7.13. Let T be a distribution. Then T is tempered if and only if there exists aconstantC>0 and a positive integer m such that      |T, ϕ| ≤ C sup xkϕ(l)(x) k,l≤m for all ϕ ∈D.

7.7 Structure of a Tempered Distribution*

This section is devoted to the structure of tempered distributions on R.

Theorem 7.14. Let T be a tempered distribution on R. Then there exists a continuous, slowly increasing, function f on R and a positive integer m such that T = f (m) in the sense of distributions.

Proof. Since T is tempered, there exists a positive integer N such that      |T, ϕ| ≤ const. sup xkϕ(l)(x) k,l≤N for all ϕ ∈D. Consider (1 + x2)−N T .Wehave   ! "   9! " :   −N    −N (l)  + 2  ≤  k + 2   1 x T, ϕ  const. sup x 1 x ϕ(x)  k,l≤N      1 ≤ const. sup ϕ(k)(x) . 1 + x2 k≤N

By partial integration from −∞ to x,weobtain ∞ ∞ |ϕ(k)(x)| |2xϕ(k)(x)| |ϕ(k+1)(x)| ≤ dx + dx. 1 + x2 (1 + x2)2 1 + x2 −∞ −∞

Hence    ∞  ! "     1/2  −N   2  1 + x2 T, ϕ  ≤ const. ϕ(k)(x) dx k≤N+1 −∞ for all ϕ ∈D.LetHN+1 be the (Sobolev) space of all functions ϕ ∈ L2(R) for which the distributional derivatives ϕ,...,ϕ(N+1) also belong to L2(R),providedwiththe 62 The Fourier Transform

scalar product ∞ N+1 (ψ, ϕ) = ψ(k)(x) ϕ(k)(x) dx. k=0 −∞ It is not difficult to show that HN+1 is a (see also Exercise 7.24). There- fore, (1 + x2)−N T can be extended to HN+1 and thus (1 + x2)−N T = g for some g ∈ HN+1,i.e. ∞    N+1 −N 1 + x2 T, ϕ = (g, ϕ) = g(k)(x) ϕ(k)(x) dx. k=0 −∞ Thus ! " N+1 −N  1 + x2 T = (−1)k g(2k) k=0 in the sense of distributions. Now write x G(x) = g(t)dt. 0   Then G is continuous, slowly increasing because |G(x)|≤ |x| g 2,andG = g in the sense of distributions. Thus

! " N+1 −N  1 + x2 T = (−1)k G(2k+1) k=0 on D.HenceT is of the form N+1 T = dk Gk k=0 for some constants dk,whileeachGk is a linear combination of terms of the form ; <   (2k+1−l) 1 + x2 N (l)G (0 ≤ l ≤ 2k + 1).

So, a fortiori, each term of Gk is a slowly increasing continuous function. Integrating some terms of Gk a number of times and keeping in mind that if h is a continuous x slowly increasing function, then x 0 h(t)dt isagainsuchafunction,wegetthe following result: T = f (m) , with m = 2N + 3 and f a continuous, slowly increasing, function. This completes the proof. The following corollary turns out to be useful in Chapter 8.

Corollary 7.15. Let T be a tempered distribution on R with support contained in (0, ∞). There exists a continuous, slowly increasing, function f with Supp f ⊂ (0, ∞) and a positive integer m such that T = f (m) in the sense of distributions. Fourier Transform of a Tempered Distribution 63

Proof. Let T = f (m) as in Theorem 7.14. Since Supp T ⊂ (0, ∞),wecanfindβ ∈ E(R) with Supp β ⊂ (0, ∞) and β(x) = 1 for x in a neighborhood of Supp T .Then T = βT.Wethengetforallϕ ∈D # $   m   m (m−k) T, ϕ=T, βϕ= f (m),βϕ = (−1)k β(k)f ,ϕ , k k=0  ! " = m − k m (k) (m−k) hence T k=0( 1) k [β f] . Integrating some terms once more a num- ber of times gives the desired result.

7.8 Fourier Transform of a Tempered Distribution

Let T ∈S.

Definition 7.16. The Fourier transform T1 of T is defined by T,1 ϕ=T, ϕ1 (ϕ ∈S).

Observe that T1 ∈S.WeshallalsowriteT1 =FT . If ∞ F ϕ(x) = ϕ(y) e2πixy dy(ϕ ∈S), −∞ then F T can be defined in a similar way for T ∈S. One has FF=FF = id in S and in S.  Choose the following convergence principle in S : Tj → 0 if Tj ,ϕ→0 for all ϕ ∈S(j →∞). Compare this with Section 4.3. As said before, this leads to a notion of continuity in S.ThenF and F are isomorphisms of S.

Examples 1. δ1 = 1.  1 ∞ −2πixy 2. 1 = δ. This relation is sometimes symbolically written as −∞ e dy = δ(x). 3. F δ = 2πix,F δ(m) = (2πix)m. 2πiax 4. F δ(a) = e . 5. If T ∈S,thenF (T (m)) = (2πix)m (F T)and F ((−2πix)m T) = (F T)(m).

One also has ∈ 1 1 = –Iff L ,thenTf Tf1. This implies, in particular, using Corollary 6.12, that f1 = 0 gives f = 0. See also Section 7.2,Application 2. 2 1 ∈ = 1 –Iff L ,thenTf is tempered and, by Plancherel’s theorem, Tf Tf . –Iff ∈ Lp,thenf1 ∈ Lq with q such that 1/p + 1/q = 1, 1 ≤ p ≤ 2 (without proof; see [11]). 64 The Fourier Transform

 −2πixz Let T be a distribution with compact support, so T ∈E.ThenTx, e  (z ∈ C) is well defined and one has , - , - ∞ ∞  (−2πix)k  (−2πix)k T , e−2πixz= T , zk = T , zk , x x k! x k! k=0 k=0 for every z ∈ C, because the power series for e−2πixz not only converges uniformly −2πixz on compact sets (with z fixed), but also in E. The conclusion is: z →Tx, e  is an entire analytic function.

 −2πixy Theorem 7.17. For T ∈E,thefunctiony →Tx , e  is the Fourier transform of T . It is a slowly increasing function that can be extended to an entire analytic function −2πixz on the complex plane, given by z →Tx, e .

Proof. Let α ∈Dbe such that α(x) = 1 on a neighborhood of Supp T .CallK = Supp α. By Section 2.2, Proposition 2.3, there is an integer m ≥ 0 and a constant c > 0 such that   m   |T, ϕ| ≤ c sup ϕ(k)(x) k=0 for all ϕ ∈Dwith Supp ϕ ⊂ K.Forψ ∈Eset ϕ = αψ.Thenweget   m   | | = | | ≤  (k)  T, ψ T, ϕ c supx∈K ψ (x) k=0 for some constant c>0 and all ψ ∈E.Hence, m | −2πixy| ≤ | m|≤ | |m Tx , e c supx∈K (2πiy) const. y , k=0 −2πixy if |y|→∞.Soy →Tx , e  is a slowly increasing function and thus defines a tempered distribution. Moreover, one has for all ϕ ∈D   T,1 ϕ = T, ϕ1 , ∞ - −2πixy = Tx, ϕ(y) e dy −∞

, ∞ - −2πixy = Tx, α(x) ϕ(y)e dy −∞   −2πixy = Tx ⊗ 1y , α(x)ϕ(y) e ∞   −2πixy = ϕ(y) Tx, e dy. (“Fubini”) −∞

The tempered distributions T1 and (the distribution defined by) the function −2πixy x Tx, e  thus coincide. Paley–Wiener Theorems on R* 65

Fourier Transform and Convolution    ∞ If S,T ∈E ,thenS ∗ T ∈E and (S ∗ T)is a C function, which is clearly equal to y S(y)1 T(y)1 . We mention (without proof): if S ∈E and T ∈S,thenS ∗ T ∈S and  ∞ (S ∗ T) = S(y)1 T1.NoticethatS(y)1 is a slowly increasing C function. Further-  more, we already knew that for f,g ∈ L1, (f ∗ g)(y) = f(y)1 g(y)1 for all y ∈ R.

7.9 Paley–Wiener Theorems on R*

The Paley–Wiener theorems characterize the Fourier transforms of C∞ functions with compact support and distributions with compact support.

Theorem 7.18. a. Let ϕ ∈Dwith Supp ϕ ⊂ [−A, A].Thenϕ1 is an entire analytic function on C that satisfies

(∗) for every integer m ≥ 0 there is a constant cm > 0 such that for all w ∈ C

−m 2πA| Im w| |ϕ(w)1 |≤cm (1 +|w|) e . b. Let F be an entire analytic function satisfying (∗),forsomeA>0. Then there is afunctionϕ ∈Dwith Supp ϕ ⊂ [−A, A] and ϕ1 = F.

Theorem 7.19. a. Let T ∈E with Supp T ⊂ [−A, A].ThenT1 is an entire analytic function on C that satisfies the following: (∗∗) there exists an integer m ≥ 0 and a constant c>0 such that for all w ∈ C

|T(w)1 |≤c(1 +|w|)m e2πA| Im w| . b. Let F be an entire analytic function satisfying (∗∗) for some A>0. Then there is a distribution T ∈E with Supp T ⊂ [−A, A] and T1 = F.

Proof. (Theorem 7.18) a. Let ϕ ∈Dand Supp ϕ ⊂ [−A, A].Setw = u + iv.Then

A |ϕ(w)1 |≤ |ϕ(x)| e2πxv dx ≤ c e2πA|v| . −A

Let m be a positive integer. Set ψ(x) = 1/(2πi)m ϕ(m)(x).Thenψ(w)1 = m 1 ∈D  w ϕ(w).Sinceψ there is a constant cm > 0 such that

| 1 |≤  2πA|v| ψ(w) cm e , 66 The Fourier Transform

hence there is cm > 0 such that

1 |ϕ(w)1 |≤c e2πA|v| , m (1 +|w|)m

since (1 +|w|)m/|w|m is bounded for |w|→∞. b. Let f be the restriction of F to R and set ϕ = Ff .SinceF satisfies (∗), ϕ is C∞ by Corollary 7.9 and ∞ ϕ(x) = F(t)e2πixt dt. −∞ Itsufficestoshowthatϕ(x) = 0 for |x| >A. Consider the following path W (with u fixed):

−M + iu ← k3 M + iu

k4 ↑

↓ k2

−M k1 → M

5 2πxw = ∈ R Then W F(w)e dw 0 for all x .Wehave

M F(w)e2πxw dw = f(t)e2πixt dt.

k1 −M

M − F(w)e2πixw dw = e−2πxu F(t + iu) e2πixt dt.

k3 −M

Furthermore        u       2πixw  =  + −2πxs   F(w)e dw   F(M is) e ds 

k2 0

u c c ≤ m e2πA|s| e−2πxs ds ≤ const. m , (1 + M)m (1 + M)m 0 Paley–Wiener Theorems on R* 67

 and this tends to zero if M tends to infinity. In a similar way we obtain | |→0 k4 if M →∞. Therefore # ∞ $ ϕ(x) = e−2πxu F(t + iu) e2πixt dt −∞ for all u. Consequently ∞ c |ϕ(x)|≤e−2πxu 2 e2πA|u| dt = const. e2π(A|u|−xu) . 1 + t2 −∞

If x>A,letu →∞.Thenϕ(x) = 0.Ifx<−A,letu →−∞.Thenϕ(x) = 0 as well. Proof. (Theorem 7.19) a. Let T ∈E and Supp T ⊂ [−A, A].ByTheorem7.17,T1 is an entire analytic function. From Proposition 5.4 we obtain

k | 1 | ≤ | (j) | T, ψ c supx∈[−A, A] ψ (x) j=1

1 −2πixw for some c>0, some integer k>0 and all ψ∈E.MoreoverT(w)=Tx , e . From this relation we easily obtain

|T(w)1 |≤c(k+ 1)(2π)k (1 +|w|)k e2πA| Im w| .

 ∞ b. Let α ∈D,α≥ 0,α(x)= α(−x) and −∞ α(x) dx = 1.Setαε(x) = 1/ε α(x/ε) for any ε>0. Let F be an entire analytic function satisfying (∗∗). Call f the restriction of F to R.Thenf is a slowly increasing function, hence in S.SetT = Ff ,soT ∈S

as well. We shall show that Supp T ⊂ [−A, A].CallTε = αε ∗ T .Fromthe identity

1 1 1 1 ˇ α1εT, ϕ=T, α1εϕ=T, (αε ∗ ϕ)=T, (αε ∗ ϕ)=αε ∗ T, ϕˇ 

 1 1 for ϕ ∈Dand ϕ(x)ˇ = ϕ(−x) (x ∈ R),weseethatTε ∈S and Tε = α1ε ·T .Fur- 1 1 thermore, Tε is an entire analytic function, namely Tε = α1εF.SinceSupp αε ⊂ [−ε, ε] we have by Theorem 7.18: for any k there is a ck > 0 such that c |α1 (w)|≤ k e2πε| Im w| , ε (1 +|w|)k hence cc |T1 (w)|≤ k e2π(A+ε) | Im w| . ε (1 +|w|)k−m

By Theorem 7.18 b., we have Tε ∈Dand Supp Tε ⊂ [−A − ε, A + ε].Since T = limε↓0 Tε we get Supp T ⊂ [−A, A]. 68 The Fourier Transform

7.10 Exercises

Exercise 7.20. a. Show that the Fourier transform of an odd (even) distribution is an odd (even) dis- tribution. b. Determine all odd distributions satisfying xT = 1. c. Deduce from b. the Fourier transform of pv(1/x).

Exercise 7.21. Determine the Fourier transform f1 of  |x| if |x| < 1 f(x) = 0 elsewhere.

Show that f1 is a C∞ function.

Exercise 7.22. Consider the tempered distribution |x|. a. Compute δ ∗|x|. b. Deduce that F(|x|) = A[Pf(1/x2)] + Cδwith constants A and C. Determine A. c. Compute d/dx[pv(1/x)]. Determine F[Pf (1/x2)] and then the constant C.

Exercise 7.23. If the product of two entire analytic functions is zero, then at least one of both functions is identically zero. Apply this fact for showing that the convolution algebra E hasnozerodivisors.

Exercise 7.24. Let H1 be the space of functions f ∈ L2(R), whose first derivative (in the sense of distributions) belongs to L2(R) as well. Provide H1 with the scalar product ∞ ∞ df dg (f , g) = f(x) g(x)dx + dx. 1 dx dx −∞ −∞

1 = 1/2 a. Show that H is a Hilbert space with norm N1(f ) (f , f )1 . b. A tempered distribution f belongs to H1 if and only if  1 + λ2 f(λ)1 ∈ L2(R).

Here f1 is the Fourier transform of f .Proveit. c. For f ∈ H1 set = = = = f == 1 + λ2 f(λ)1 = . 2 1 Show that f and N1(f ) are equivalent norms on H . Fourier Transform in Rn 69

7.11 Fourier Transform in Rn  = =  = n If x (x1,...,xn), y (y1,...,yn),thenwewrite x, y i=1 xiyi.Weshall also write dx for dx1 ...dxn. For f ∈ L1(Rn), define the Fourier transform f1 =Ff of f by f(y)1 = f(x)e−2πix,y dx(y∈ Rn). Rn

This Fourier transform behaves in the same way as in case n = 1. One has, for exam- ple, F −πr2 = −πr2 2 = 2 +···+ 2 (e ) e (r x1 xn). The Schwartz space S(Rn) consists of all C∞ functions ϕ on Rn such that

| l k | ∞ supx∈Rn x D ϕ(x) < for all n-tuples l and k of nonnegative integers. The space S(Rn) carries the conver- gence principle set forward by its defining relations. Tempered distributions on Rn are elements of S(Rn), the space of continuous linear forms on S(Rn). Choose in S(Rn) the similar convergence principle as in case n = 1, see Section 7.8. The reader may verify that the structure theorem from Section 7.7 can be generalized to Rn, using the same method of proof. The Fourier transform of a tempered distribution is defined as in case n = 1. The same kind of theorems hold. Also the Paley–Wiener theorems can be generalized to Rn. Here are some formulae: 9 : ∂δ ∂δ F = 2πiyk , F [−2πixk] = (1 ≤ k ≤ n) . ∂xk ∂yk We proceed now with a detailed study of the Fourier transforms of so-called radial n functions f on R : f(x1,...,xn) depends only on r ,sof(x1,...,xn) = Φ(r ). Notice that the requirement f ∈ L1(Rn) is equivalent with r n−1Φ(r ) ∈ L1(0, ∞). Let f ∈ L1(Rn) be a radial function. We claim that f1 is also radial. This is seen as follows. Observe that f being radial is the same as saying that f(Ax) = f(x) almost everywhere for all orthogonal transformations A. Sowehavetoshowthat f(Ay)1 = f(y)1 for all y ∈ Rn and all A. This is however clear from the definition of f1 and the properties of A f(Ay)1 = f(x)e−2πix,Ay dx = f(Ax)e−2πix,y dx Rn Rn = f(x)e−2πix,y dx = f(y).1 Rn 70 The Fourier Transform

. 1 = Ψ = = 2 +···+ 2 Ψ Let f(y) (ρ) with ρ y y1 yn. We try to determine in terms of Φ, and conversely. One has for n ≥ 2

1 −2πixnρ Ψ(ρ) = f(0, 0,...,0,ρ)= e f(x1,...,xn) dx1 ···dxn Rn ∞ π −2πirρ cos θ n−1 n−2 = Sn−2 e Φ(r ) r sin θ dθ dr, 0 0 applying spherical coordinates and setting

n−1 2π 2 S − =   . n 2 Γ n−1 3 We observe that some special function occurs in this integral formula, a Bessel function. The Bessel function of the first kind of order ν has the following integral represen- tation: π ! " (x/2)ν J (x) = √ e−ix cos θ sin 2ν dθ Re(ν) > − 1 . ν π Γ (ν + 1/2) 2 0 For ν = (n − 2)/2, x = 2πρr one obtains

n−2 n−2 n−2 π π 2 ρ 2 r 2 −2πirρcos θ n−2 n−2 = √ J (2πρr) n−1 e sin θ dθ 2 π Γ ( ) 2 0

− − π n−32 n 2 n 2 π ρ 2 r 2 −2πirρcos θ n−2 = − e sin θ dθ. Γ ( n 1 ) 2 0 Therefore ∞ 2π n Ψ = 2 − Φ ≥ (ρ) n−2 r J n 2 (2πρr) (r ) dr(n2). (7.1) ρ 2 2 0 One also has, if ρn−1 Ψ(ρ) ∈ L1(0, ∞) and Φ is continuous, then ∞ 2π n Φ = 2 − Ψ ≥ (r ) n−2 ρ J n 2 (2πρr) (ρ) dρ(n2). (7.2) r 2 2 0 One calls the transform in equation (7.1) (and equation (7.2)) Fourier–Bessel transform or Hankel transform of order (n − 2)/2.ItisdefinedonL1((0, ∞), r n−1 dr). We leave the case n = 1 to the reader. There are many books on Bessel functions, Fourier–Bessel transforms and Han- keltransforms,seeforexample[16].Becauseofitsimportanceformathematical physics, it is useful to study them in some detail. The Heat or Diffusion Equation in One Dimension 71

7.12 The Heat or Diffusion Equation in One Dimension

The heat equation was first studied by Fourier (1768–1830) in his famous publication “Théorie analytique de la chaleur”. It might be considered as the origin of, what is now called, Fourier theory.

| x = 0 Let us consider a bar of length infinity, thin, homogeneous and heat conducting. Denote by U(x,t) the temperature of the bar at place x and time t.Letc be its heat capacity. This is defined as follows: the heat quantity in 1 cm of the bar with tem- perature U is equal to cU.Letγ denote the heat conducting coefficient: the heat quantity that flows from left to right at place x in one second is equal to −γ∂U/∂x, with −∂U/∂x the temperature decay at place x. Let us assume, in addition, that there are some heat sources along the bar with heat density ρ(x, t). This means that between t and t + dt the piece (x, x + dx) of the bar is supplied with ρ(x, t)dx dt calories of heat. We assume furthermore that there are no other kinds of sources in play for heat supply (neither positive nor negative). Let us determine how U behaves as a function of x and t. The increase of heat in + + (x, x9 dx) between t and t dt is : ∂U ∂U ∂2U 1. γ (x + dx, t) − γ (x, t) dt ∼ γ dx dt (by conduction), ∂x ∂x ∂x2 2. ρ(x, t)dx dt (by sources).

( ) ∂2U Together: γ + ρ(x, t) dxdt. ∂x2

On the other hand this is equal to c(∂U/∂t)dxdt. Thus one obtains the heat equa- tion ∂U ∂2U c − γ = ρ. ∂t ∂x2 Let us assume that the heat equation is valid for distributions ρ too, e. g. for – ρ(x, t) = δ(x):eachsecondonlyatplacex = 0 one unit of heat is supplied to the bar: – ρ(x, t) = δ(x) δ(t):onlyatt = 0 and only at place x = 0 one unit of heat is supplied to the bar, during one second.

We shall solve the following Cauchy problem:findafunctionU(x,t)for t ≥ 0,which 2 satisfies the heat equation with initial condition U(x,0) = U0(x),withU0 a C func- tion. 72 The Fourier Transform

We will consider the heat equation as a distribution equation and therefore set  U(x,t) if t ≥ 0 , U(x,t) = 0 if t<0 , and we assume that U is C2 as a function of x and C1 as a function of t for t ≥ 0.We have  * + ∂2U ∂2U ∂U ∂U = , = + U (x) δ(t) , ∂x2 ∂x2 ∂t ∂t 0 so that U satisfies

∂U ∂2U c − γ = ρ(x,t) + cU (x) δ(t). (7.3) ∂t ∂x2 0 We assume, in addition, – x ρ(x,t) is slowly increasing for all t, – x U(x,t) is slowly increasing for all t,

– x U0(x) is slowly increasing.

  Set V(y,t) =Fx U(x,t), σ(y,t) =Fx ρ(x, t), V0(y) =Fx U0(x).Equation(7.3) becomes, after taking the Fourier transform with respect to x,

∂V(y,t) c + 4π 2γy2 V(y,t) = σ(y,t) + cV (y) δ(t) . ∂t 0

Now fix y (that is: consider the distribution ϕ  ., ϕ(t)⊗ψ(y) for fixed ψ ∈D).  We obtain a convolution equation in D+,namely

 2 2  (c δ + 4π γy δ) ∗ V(y,t) = σ(y,t) + cV0(y) δ(t) .

2 2 Call A = cδ + 4π 2γy2 δ.ThenA−1 = (Y (t)/c) e−4π γy t. We obtain the solution

Y(t) 2 2 2 2 V(y,t) = σ(y,t) ∗ e−4π (γ/c)y t + V (y) Y (t) e−4π (γ/c)y t . t c 0

 = F  Observe that U(x,t) y V(y,t). Important special case: U0(x) = 0, ρ(x,t) = δ(x) δ(t). Then one asks in fact for an elementary solution of equation (7.3). We have σ(y,t) = δ(t), V0(y) = 0, hence Y(t) 2 2 V(y,t) = e−4π (γ/c)y t . c An elementary solution of the heat equation is thus given by

2 Y(t) 1 − x U(x,t) =  e 4(γ/c)t . c (γ/c)πt The Heat or Diffusion Equation in One Dimension 73

We leave it to the reader to check that U indeed satisfies

∂U ∂2U c − γ = δ(x) δ(t) . ∂t ∂x2

If there are no heat sources, then U(x,t) is given by

U(x,t) = cU0(x) ∗x E(x,t) if t>0 , with 1 − x2 E(x,t) =  e 4(γ/c)t (t > 0). 2c (γ/c)πt We leave it as an exercise to solve the heat equation in three dimensions

∂U c − γ ΔU = ρ. ∂t

Further Reading

More on the Fourier transform of functions is in [4] and the references therein. More on tempered solutions of differential equations one finds for example in [7]. Notice that the Poisson equation has a tempered elementary solution. Is this a general phe- nomenon for differential equations with constant coefficients? Another, more mod- ern, method for analyzing images and sounds is the wavelet transform. For an intro- duction, see [3]. 8TheLaplaceTransform

Summary The Laplace transform is a useful tool to find nonnecessarily tempered solutions of differential equations. We introduce it here for the real line. It turns out that there is an intimate connection with the symbolic calculus of Heaviside.

Learning Targets  Learning about how the Laplace transform works.

8.1 Laplace Transform of a Function

For any function f on R, that is locally integrable on [0, ∞),wedefinetheLaplace transform L by ∞ Lf(p) = f(t)e−pt dt(p∈ C). 0 Existence −ξ t 1 1. If for t ≥ 0 the function e 0 |f(t)| is in L ([0, ∞)) for some ξ0 ∈ R,then clearly Lf exists as an absolutely convergent integral for all p ∈ C with Re(p) ≥

ξ0.Furthermore,Lf is analytic for Re(p) > ξ0 and, in that region, we have for all positive integers m ∞ (Lf)(m)(p) = (−t)m f(t)e−pt dt. 0

Observe that it suffices to show the latter statements for ξ0 = 0 and m = 1.In that case we have for p with Re(p) > 0 the inequalities         e−(p+h)t − e−pt  e−ht − 1  |f(t)|   ≤|f(t)| e−ξt   ≤|f(t)| e−ξt t et|h|  h   h  ≤|f(t)| t e−(ξ−ε)t

if |h| <εand ξ = Re(p).Becauseξ>0, one has ξ − ε>0 for ε small. Set 1 δt ξ − ε = δ. Then there is a constant C>0 such that t ≤ C e 2 .Wethusmay conclude that the above expressions are bounded by C |f(t)| provided |h| <ε. Now apply Lebesgue’s theorem on dominated convergence. 2. If f(t) =O(ekt)(t →∞),thenLf exists for Re(p) > k. In particular, slowly increasing functions have a Laplace transform for Re(p) > 0.Iff has compact support, then Lf is an entire analytic function. 2 2 3. If f(t) = e−t ,thenLf is entire. If f(t) = et ,thenLf does not exist. Laplace Transform of a Distribution 75

Examples 1. L1 = 1/p, L(tm) = m!/pm+1 (Re(p) > 0).Byabuseofnotationwefrequently omit the argument p in the left-hand side of the equations. 2. Let f(t) = tα−1, α being a complex number with Re(α) > 0.ThenLf(p) = Γ (α)/pα for Re(p) > 0. 3. Set for c>0,  1 if t ≥ c , H (t) = c 0 elsewhere.

−cp Then L(Hc)(p) = e /p for all p with Re(p) > 0. 4. Let f(t) = e−αt (α complex). Then Lf(p) = 1/(p + α) for Re(p) > − Re(α). In particular, for a real, a – L (sin at) = for Re(p) > 0 , p2 + a2 p – L (cos at) = for Re(p) > 0 . p2 + a2

−at 5. Set h(t) = e f(t) with a ∈ R and assume that Lf exists for Re(p) > ξ0. Then Lh exists for Re(p) > ξ0 − a and Lh(p) =Lf(p+ a). In particular, b – L (e−at sin bt) = for Re(p) > −a, (p2 + a2) + b2 p + a – L (e−at cos bt) = for Re(p) > −a, (p2 + a2) + b2 # $ tα−1 eλt 1 – L = for Re(p) > Re(λ) , and Re(α) > 0 . Γ (α) (p − λ)α

6. Let h(t) = f (λt)(λ > 0) and let Lf exist for Re(p) > ξ0.ThenLh(p) exists for Re(p) > λξ0 and Lh(p) = (1/λ)Lf(p/λ).

8.2 Laplace Transform of a Distribution

 Let T ∈D+, thus the support of T is contained in [0, ∞). Assume that there exists −ξ0t  −pt  ξ0 ∈ R such that e Tt ∈S(R). Then evidently e Tt ∈S(R) as well for all ∞ p ∈ C with Re(p) ≥ ξ0.Letnowα be a C function with Supp α bounded from the left and α(t) = 1 in a neighborhood of [0, ∞).Thenα(t) ept ∈S(R) for all p ∈ C with Re(p) < 0. Therefore,   −ξ0t (−p+ξ0)t e Tt ,α(t)e

−pt  exists, commonly abbreviated by Tt , e  (Re(p) > ξ0). We thus define for T ∈D+ as above 76 The Laplace Transform

Definition 8.1.   −pt LT(p) = Tt , e (Re(p) > ξ0).

Comments 1. The definition does not depend on the special choice of α.

2. LT is analytic for Re(p) > ξ0 (without proof; a proof can be given by using a slight modification of Corollary 7.15, performing a translation of the interval). 3. The definition coincides with the one given for functions in Section 8.1 in case of a regular distribution. (m) m 4. For any positive integer m, (LT) (p) =L((−t) T )(p) for Re(p) > ξ0. 5. L(λT + μS) = λL(T ) + μL(S) for Re(p) large, provided both S and T admit a Laplace transform.

Examples (m) m −ap 1. Lδ = 1, Lδ = p , Lδ(a) = e . 2. If T ∈E then LT is an entire analytic function. 3. Based on the formulae Lδ(m) = pm, L(tl) = l!/pl+1 and L(e−at T )(p) = LT(p + a), one can determine the inverse Laplace transform of any rational  function P(p)/Q(p) (P, Q polynomials) as a distribution in D+.Onejustap- plies partial fraction decomposition.

8.3 Laplace Transform and Convolution

 −ξt −ξt Let S and T be two distributions in D+ such that e S and e T are tempered distributions for all ξ ≥ ξ0. Then according to [10], Chapter VIII (or, alternatively, using again the slight modification of Corollary 7.15) we have

−ξt  e (S ∗ T) ∈S for ξ>ξ0 .

Moreover   −p(t+u) L(S ∗ T )(p) = St ⊗ Tu, e   −pt −pu = St, e Tu, e

=LS(p)·LT(p) for Re(p) > ξ0 . Therefore, L(S ∗ T) =LS ·LT.

As a corollary we get:

(m) (m) m L(T )(p) =L(δ ∗ T )(p) = p LT(p) for Re(p) > max(ξ0, 0). Laplace Transform and Convolution 77

Let now f be a function on R with f(t) = 0 for t<0. Assume that e−ξt |f(t)| is 1 integrable for ξ>ξ0,and,inaddition,thatf is C for t>0,thatlimt↓0 f(t) = f(0)  and that limt↓0 f (t) is finite. We know then that

{f } ={f }+f(0)δ, in which {f } is the derivative of the distribution {f }.ThenLf  is clearly defined where L{f } is defined, hence where Lf is defined. Furthermore,

 Lf (p) = p(Lf )(p) − f(0)(Re(p) > ξ0).

With induction we obtain the following generalization. Let f be m times continuous- k ly differentiable on (0, ∞), and assume that limt↓0 f (t) exists for all k ≤ m.Set (k) (k) limt↓0 f (t) = f (0)(0 ≤ k ≤ m − 1).Then

Lf (m)(p) = pm Lf(p)−{f (m−1(0) + pf(m−2(0) +···+pm−1 f(0)}   Re(p) > ξ0 .

Examples 1. Let f = Y be the Heaviside function. Then f  = 0 and f(0) = 1.Thus,aswe know already, L(Y )(p) = 1/p. 2. Y(t) sin at hasasLaplacetransforma/(p2 + a2) for Re(p) > 0. Differentiation gives: aY (t) cos at has Laplace transform (ap)/(p2 + a2) for Re(p) > 0.See also Section 8.1. 3. Recall that tα−1 tβ−1 tα+β−1 Y(t)eλt ∗ Y(t)eλt = Y(t)eλt Γ (α) Γ (β) Γ (α + β) for Re(α) > 0, Re(β) > 0, see Section 6.2. Laplace transformation gives 1 1 1 · = (Re(p) > Re(λ)) . (p − λ)α (p − λ)β (p − λ)α+β The second relation is easier than the first one, of course. The second relation implies the first one, because of

 −ξt  Theorem 8.2. Let T ∈D+ satisfy e Tt ∈S(R) for ξ>ξ0.IfLT(p) = 0 for Re(p) > ξ0,thenT = 0.

Proof. One has e−ξt T, α(t)e−εt e2πiηt=0 for all η ∈ R and all ε>0.Here

α(t) is chosen as in the definition of LT and ξ is fixed, ξ>ξ0.Letϕ ∈S. Then e−ξt T, α(t)e−εt ϕ(η)1 e2πiηt=0 for all η ∈ R and all ε>0. Applying Fubini’s theorem in S(R) gives ∞     0 = e−ξt T, α(t)e−εt ϕ(η)1 e2πiηt dη = e−ξtT, α(t)ϕ(t)e−εt −∞ 78 The Laplace Transform

for all ϕ ∈Sand all ε>0.Hencee−(ξ+ε)t T, ϕ=0 for all ϕ ∈Sand all ε>0,thus,forϕ ∈D, T, ϕ=e−(ξ+ε)t T, e(ξ+ε)t ϕ=0,soT = 0.

“Symbolic calculus” is nothing else then taking Laplace transforms: L maps the   subalgebra of D+ generated by δ and δ onto the algebra of polynomials in p.Ifthe  polynomial P is the image of A: LA = P,andifLB = 1/P for some B ∈D+,thenB is the inverse of A by Theorem 8.2.

TheAbelIntegralEquation Theoriginofthisequationisin[N.Abel,Solutiondequelquesproblèmesàl’aide d’intégrales définies, Werke (Christiania 1881), I, pp. 11–27]. The equation plays a role in Fourier analysis on the upper half plane and other symmetric spaces. The general form of the equation is

t ϕ(τ) dτ = f(t) (t ≥ 0, 0 <α<1), (t − τ)α 0 a Volterra integral equation of the first kind. Abel considered only the case α = 1/2. The solution of this equation falls outside the scope of symbolic calculus. Set ϕ(t) = f(t) = 0 for t<0. Then the equation changes into a convolution  equation in D+ Y(t)t−α ∗ ϕ = f.

Let us assume that ϕ and f admit a Laplace transform. Set Lϕ = Φ, Lf = F.Then we obtain Γ (1 − α) Φ(p) = F(p), p1−α hence pF(p) Lf (p) + f(0) Φ(p) = = . Γ (1 − α) pα Γ (1 − α) pα Therefore ; < 1 ϕ(t) = Y(t)tα−1 ∗ f (t) + f(0)Y(t)tα−1 Γ (1 − α) Γ (α)

 t sin πα f (τ) f(0) = dτ + π (t − τ)1−α t1−α 0 for t ≥ 0, almost everywhere. We have assumed that f is continuous for t ≥ 0,thatf   exists and is continuous for t>0,andthatlimt↓0 f (t) is finite. We also applied here the formula Γ (α) Γ (1 − α) = (sin πα)/π for 0 <α<1. See equation (10.8). Inversion Formula for the Laplace Transform 79

8.4 Inversion Formula for the Laplace Transform

−ξt 1 Let f be a function satisfying f(t) = 0 for t<0 and e f(t) ∈ L for ξ>ξ0. Set F(p) =Lf(p),thus

∞ −ξt −iηt F(ξ + iη) = f(t)e e dt(ξ>ξ0). 0

1 If η |F(ξ + iη)|∈L for some ξ>ξ0,then ∞ 1 f(t)e−ξt = F(ξ + iη) eiηt du 2π −∞ at the points where f is continuous, or

∞ ξ+i∞ 1 1 f(t) = F(ξ + iη) e(ξ+iη)t dη = F(p)ept dp. 2π 2πi −∞ ξ−i∞

The following theorem holds:

Theorem 8.3. An analytic function F coincides with the Laplace transform of a distri-  bution T ∈D+ in some open right half plane if and only if there exists c ∈ R such that F(p)is defined for Re(p) > c and

|F(p)|≤ polynomial in |p| (Re(p) > c) .

 −ct  Proof. “Necessary”*. Let T ∈D+ be such that e Tt ∈S for some c>0.Then  −ct  T = S +U with S ∈D+ of compact support, Supp U ⊂ (0, ∞) and e Ut ∈S.This is evident: choose a function α ∈Dsuch that α(x) = 1 in a neighborhood of x = 0 and set S = αT, U = (1−α) T .ThenL S(p)is an entire function of p of polynomial −ct growth. To see this, compare with the proof of Theorem 7.17. Furthermore, e Ut is (m) ⊂ of the form f for some continuous, slowly increasing, function! "f with Supp f ∞ = ct (m) = m − k m k ct (m−k) (0, ) by Corollary 7.15. Then Ut e f k=0( 1) k c (e f) . Moreover e−ξt ect f(t) ∈ L1 for ξ>c,henceL (ect f )(p) exists and is bounded for Re(p) ≥ c + 1. Therefore # $ m m L U(p) = (−1)k ck L (ect f)(m−k)(p) = k k 0 # $ m m = (−1)k ck pm−k L (ect f )(p) k k=0 satisfies |L U(p)|≤polynomial in |p| of degree m,forRe(p) > c + 1.Thiscom- pletes this part of the proof. 80 The Laplace Transform

“Sufficient”. Let F be analytic and |F(p)|≤C/|p|2 for ξ = Re(p) > c > 0, C being a positive constant. Then |F(ξ + iη)|≤C/(ξ2 + η2) for ξ>cand all η. Therefore, ξ+i∞ 1 f(t) = F(p)ept dp 2πi ξ−i∞ exists for ξ>cand all t and does not depend on the choice of ξ>cby Cauchy’s theorem. Let us write ∞ 1 e−ξt f(t) = F(ξ + iη) eiηt dη(ξ>c). 2π −∞

−ξt ∈ 1 = Then we see that f is continuous. Moreover, if e f(t) L for ξ>c,thenF(p) ∞ −pt −∞ f(t)e dt for ξ = Re(p) > c, by Corollary 7.2 on the inversion of the Fourier transform. We shall now show (i) f(t) = 0 for t<0, (ii) e−ξt f(t) ∈ L1 for ξ>c.

It then follows that F =Lf . To show (i), observe that ∞ C eξt c dη C eξt |f(t)|≤ = (ξ > c) . 2πc c2 + η2 2c −∞

ξt For t<0 we thus obtain |f(t)|≤limξ→∞ C e /(2c) = 0,hencef(t) = 0. To show (ii), use again |f(t)|≤C eξt/(2c) for all ξ>c,hence|f(t)|≤ C ect/(2c). Therefore

C e−ξt |f(t)|≤ e−(ξ−c)t ∈ L1 for ξ>c and t ≥ 0 . 2c Let now, more generally, F be analytic and

|F(p)|≤polynomial in |p| of degree m, for ξ = Re(p) > c > 0. Then |F(p)/pm+2|≤C/|p|2 for ξ>cand some constant C. Therefore, there is a continuous function f such that (i) f(t) = 0 for t<0, (ii) e−ξt f(t) ∈ L1 for ξ>c,and

F(p) = pm+2 Lf(p) (ξ> c). Inversion Formula for the Laplace Transform 81

Then F(p) =L(f (m+2))(p) for ξ>cor, otherwise formulated, F =LT with T = dm+2f/dtm+2, the derivative being taken in the sense of distributions.

 Notice that the theorem implies that for any T ∈D+ which admits a Laplace transform there exists a positive integer m such that T is the mth derivative (in the sense of distributions) of a continuous function f with support in [0, ∞) and such −ξ t 1 that for some ξ0, e 0 f(t) ∈ L .

Further Reading

Further reading may be done in the “bible” of the Laplace transform [17]. 9 Summable Distributions*

Summary This chapter might be considered as the completion of the classification of distribu- tions. As indicated, it can be omitted in the first instance.

Learning Targets

 Understanding the structure of summable distributions.

9.1 Definition and Main Properties

We recall the definition |k|=k1 +···+kn if k = (k1,...,kn) is an n-tuple of nonnegative integers. We also recall from Section 2.2:

Definition 9.1. A summable distribution on Rn is a distribution T with the following property: there exists an integer m ≥ 0 and a constant C satisfying  |T, ϕ| ≤ C sup |Dkϕ| (ϕ ∈D(Rn)) . |k|≤m

The smallest number m satisfying this inequality is called the summability-order of T , abbreviated by sum-order (T ). Observe that locally, i. e. on every open bounded subset of Rn, each distribu- tion T is a summable distribution. Moreover, any distribution with compact support is summable. A summable distribution is of finite order in the usual sense, and we have order(T ) ≤ sum-order(T ), with equality if T has compact support. It immediately follows from the definition that the derivative of a summable distribution is summa- ble with ∂T sum-order ≤ sum-order(T ) + 1 . ∂xi n The space of summable distributions of sum-order 0 coincides with the space Mb(R ) Rn of bounded measures on , that is with the space of measures μ with finite total mass Rn d|μ|(x). This is a well-known fact. First observe that a distribution of order 0 can be uniquely extended to a continuous linear form on the space D0 of continu- ous functions with compact support, provided with its usual topology (see [2], Chap- ter III), thus to a measure, using a sequence of functions {ϕk} as in Section 6.2 and 0 defining T, ϕ=limk→∞T, ϕk ∗ ϕ for ϕ ∈D. Next apply for example [2], Chapter IV, § 4, Nr. 7. k If μ ∈Mb then the distribution D μ is summable with sum-order ≤|k|. We shall show the following structure theorem: The Iterated Poisson Equation 83

Theorem 9.2 (L. Schwartz [10]). Let T be a distribution. Then the following conditions on T are equivalent: 1. T is summable.  k 2. T is a finite sum D μk of derivatives of bounded measures μk. k k 1 3. T is a finite sum k D fk of derivatives of L functions fk. n 4. For every α ∈D, α ∗ T belongs to Mb(R ). 5. For every α ∈D, α ∗ T belongs to L1(Rn).

We need some preparations for the proof.

9.2 The Iterated Poisson Equation

In this section we will consider elementary solutions of the iterated Poisson equation

Δl E = δ where l is a positive integer and Δ the Laplace operator in Rn. One has, similarly to the case l = 1 (see Section 3.5)

n   2π 2 Δl r 2l−n = (2l − n)(2l − 2 − n)...(4 − n)(2 − n) 2l−1 (l − 1)!   δ. Γ n 2

Thus, if 2l − n<0 or 2l − n ≥ 0 and n odd, then there exists a constant Bl,n such that   l 2l−n Δ Bl,n r = δ. If now 2l − n ≥ 0 and n even, then

n 2π 2 Δl(r 2l−n log r) = [[(2l − n)(2l − 2 − n)...(4 − n)(2 − n)]] 2l−1 (l − 1)!   δ, Γ n 2 where in the expression between double square brackets the factor 0 has to be omit- ted. Thus, there exists a constant Al,n such that   l 2l−n Δ Al,n r log r = δ.

We conclude that for all l and n there exist constants Al,n and Bl,n such that   l 2l−n Δ r (Al,n log r + Bl,n) = δ.

Let us call this elementary solution of the iterated Poisson equation E = El,n. We now replace E by a parametrix γE,withγ ∈D(Rn), γ(x) = 1 for x in a neighborhood of 0 in Rn. Then one has for any T ∈D(Rn)  Δl (γE) − ζ = δ for some ζ ∈D, Δl (γE ∗ T)− ζ ∗ T = T. 84 Summable Distributions*

This result, of which the proof is straightforward, implies the following. If we take for T a bounded measure and take l so large that E is a continuous function, then ev- ery bounded measure is a finite sum of derivatives of functions in L1(Rn) (which are continuous and converge to zero at infinity). This proves in particular the equivalence of 2. and 3. of Theorem 9.2. Moreover, replacing T with α ∗ T in the above parametrix formula easily implies the equivalence of 4. and 5.

9.3 Proof of the Main Theorem

To prove Theorem 9.2, it is sufficient to show the implications: 2 ⇒ 1 ⇒ 4 ⇒ 2.  ⇒ = k 2 1.LetT |k|≤m D μk.Then          |k| k  k |T, ϕ| =  (−1) μk,D ϕ  ≤ C sup |D ϕ| |k|≤m |k|≤m for some constant C and all ϕ ∈D.HenceT is summable.

1 ⇒ 4.Letα ∈Dbe fixed and T summable. Then T satisfies a relation of the form  |T, ϕ| ≤ C sup |Dkϕ| (ϕ ∈D). |k|≤m Therefore,  |T ∗ α, ϕ| = |T, αˇ ∗ ϕ| ≤ C sup |Dkαˇ ∗ ϕ| |k|≤m ⎛ ⎞  ⎝ k ⎠ ≤ C sup D αˇ 1 sup |ϕ| (ϕ ∈D). |k|≤m

Hence T ∗ α ∈Mb.

4 ⇒ 2. This is the most difficult part. From the relation

T ∗ ϕ,ˇ αˇ=T ∗ α, ϕ for α, ϕ ∈D, we see, by 4, that

|T ∗ ϕ,ˇ αˇ| ≤ Cα for all ϕ ∈Dwith sup |ϕ|≤1,withCα a positive constant. By the Banach– Steinhaus theorem (or the principle of uniform boundedness) in D,seeSection10.1, we get: for any open bounded subset K of Rn there is a positive integer m and a con- stant CK such that  k |T ∗ ϕ,ˇ αˇ| ≤ CK sup |D α| (9.1) |k|≤m Canonical Extension of a Summable Distribution 85

for all α ∈Dwith Supp α ⊂ K and all ϕ ∈Dwith sup |ϕ|≤1.Letusdenote by Dm the space of Cm functions with compact support and by Dm(K) the subspace consisting of all α ∈Dm with Supp α ⊂ K. Similarly, define D(K) as the space of functions α ∈Dwith Supp α ⊂ K.LetusprovideDm(K) with the norm        α = sup Dkα α ∈Dm(K) . |k|≤m

It is not difficult to show that D(K) is dense in Dm(K). Indeed, taking a sequence of functions ϕk (k = 1, 2,...) as in Section 6.2, one sees, by applying the relation l l D (ϕk ∗ ϕ) = ϕk ∗ D ϕ for any n-tuple of nonnegative integers l with |l|≤m and using the uniform continuity of Dlϕ,thatanyϕ ∈Dm(K) can be approximated in m the norm topology of D (K) by functions of the form ϕk ∗ ϕ,whichareinD(K) for k large. From equation (9.1) we then see that every distribution T ∗ ϕˇ can be uniquely extended to Dm(K), such that equation (9.1) remains to hold. Therefore, m for all α ∈D (K), T ∗ α is in Mb, since still T ∗ ϕ,ˇ αˇ=T ∗ α, ϕ(ϕ ∈D). Now apply the parametrix formula again, taking α = γE= γEl,n with l large enough in order that α ∈Dm. The proof of the theorem is now complete.

9.4 Canonical Extension of a Summable Distribution

The content of this section (and the next one) is based on notes by the late Erik Thomas [13]. Let T be a summable distribution of sum-order m.DenotebyBm =Bm(Rn) the space of Cm functions ϕ on Rn whichareboundedaswellasitsderivativesDkϕ with |k|≤m.LetusprovideBm with the norm  k m pm(ϕ) = sup |D ϕ| (ϕ ∈B ). |k|≤m

Then, by the Hahn–Banach theorem, T can be extended to a continuous linear form Bm on . B= ∞ Bm Let m=0 and provide it with the convergence principle: a sequence {ϕj } in B converges to zero if, for all m, the scalars pm(ϕj ) tend to zero when j tends to infinity. Then the extension of T to Bm is also a continuous linear form on B. There exists a variety of extensions in general. We will now define a canonical exten- B sion of T to . To this end we make use of a representation of T as a sum of derivatives k of bounded measures T = |k|≤m D μk. The number m arising in this summation may be larger than the sum-order of T . With help of this representation we can easily extend T to a continuous linear form on B, since bounded measures are naturally defined on B. Indeed, the functions in B are integrable with respect to any bound- ed measure. We shall show that such an extension does not depend on the specific 86 Summable Distributions*

representation of T as a sum of derivatives of bounded measures, it is a canonical ex- tension. The reason is the following. Observe that any μ ∈Mb is concentrated, up to ε>0, on a compact set (depending on μ). It follows that a canonical extension has a special property, T has the bounded convergence property. A continuous linear form on Bm is said to have the bounded convergence prop- ∈Bm p ∞ → erty of order m if the following holds: if ϕj , supj m(ϕj )< and ϕj m 0 in E ,thenT, ϕj →0 when j tends to infinity. A similar definition holds in B: a continuous linear form on B has the bounded convergence property if ϕj ∈ B p ∞ → E  → , supj m(ϕj )< for all m,andϕj 0 in ,then T, ϕj 0 when j tends to infinity. Relying on this property we can easily relate T with its extension to B. Indeed, let α ∈Dbe such that α(x) = 1 in a neighborhood of x = 0,andsetαj (x) = ∈B E p α(x/j). Then, for any ϕ the functions αj ϕ tend to ϕ in and supj m(ϕj )< ∞ for all m.HenceT, ϕ=limj→∞ T, ϕj . Therefore the extension of T to B does not depend on the specific representation. We call it the canonical extension .In particular one can define the total mass T, 1 of T , sometimes denoted by T(dx), which accounts for the name “summable distribution”. Furthermore, the canonical extension satisfies again

|T, ϕ| ≤ C pm(ϕ) for some constant C and all ϕ ∈B, m being now the sum-order of T . Conversely, any continuous linear form T on B gives, by restriction to D,asum- mable distribution. There is however only a one-to-one relation between T and its restriction to D if we impose the condition: T has the bounded convergence property. The above allows one to define several operations on summable distribution whicharecommonforboundedmeasures.

Fourier Transform Any summable distribution T is tempered, so has a Fourier transform. But there is more: F T is function,

−2πix,y n F T(y) =Tx , e  (y ∈ R ).

Here we use the canonical extension of T .ClearlyF T is continuous (since any μ ∈

Mb is concentrated up to ε>0 on a compact set) and of polynomial growth.

Convolution Let T and S be summable distributions. Then the convolution product S ∗ T exists. Indeed, let us define S∗T,ϕ=T, ϕ∗Sˇ(ϕ ∈D). This is a good definition, since ϕ ∗ Sˇ ∈B.ClearlyS ∗ T is summable again. Due to the canonical representation of T and S as a sum of derivatives of bounded measures, we see that this convolution product is commutative and associative and furthermore,

F (T ∗ S) =F(T ) ·F(S) . Rank of a Distribution 87

Let O be the space of functions f ∈E(Rn) such that f and the derivatives of f have at most polynomial growth. Then O operates by multiplication on the space S(Rn) and on S(Rn). Moreover, since the functions in O have polynomial growth, they define themselves tempered distributions.

Theorem 9.3. Every T ∈F(O) is summable. More precisely, if P is a polynomial then PT is summable. Conversely, if PT is summable for every polynomial P,then T ∈F(O).

Proof. If T =F(f ) with f ∈Oand α ∈D,thenα ∗ T =F(βf ) with β ∈Sthe inverse Fourier transform of α.Sinceβf belongs to S we have α ∗ T ∈S, a fortiori α ∗ T ∈ L1. Therefore, by Theorem 9.2, T is summable. Similarly, if P is a polyno- mial, then PT =F(Df ) for some differential operator with constant coefficients, so PT ∈F(O).Conversely,ifPT is summable for all polynomials P,thenD F(T ) is continuous and of polynomial growth for all differential operators D with constant coefficients, and so F(T ) belongs to O (cf. Exercise 4.4 b.).

9.5 Rank of a Distribution

What is the precise relation between the sum-order of T and the number of terms in arepresentationofT as a sum of derivatives of bounded measures from Theorem 9.2. To answer this question, we introduce the notion of rank of T .

For every n-tuple of nonnegative integers k = (k1,...,kn) we set |k|∞ = max1≤i≤n ki. A partial differential operator with constant coefficients ak is said to have rank m if it is of the form  k D = akD |k|∞≤m and not every ak with |k|∞ = m vanishes.

Definition 9.4. A distribution T is said to have finite rank if there exists a positive inte- ger m and a constant C>0 such that  |T, ϕ| ≤ C sup |Dkϕ| (ϕ ∈D). |k|∞≤m

The smallest such m is called the rank of T .

Clearly, |k|∞ ≤|k|≤n|k|∞. So the distributions of finite rank coincide with the summable distributions. There is however a striking difference between sum- order and rank: the sum-order may grow with the dimension while the rank remains constant. For example sum-order(Δδ) = 2n and rank(Δδ) = 2, Δ being the usual Laplacian in Rn. 88 Summable Distributions*

Definition 9.5. A mollifier of rank m is a bounded measure K having the following prop- erties: k 1. D K is a bounded measure for all n-tuples k with |k|∞ ≤ m. 2. There exists a differential operator with constant coefficients D of rank m such that DK = δ.

Theorem 9.6. There exists mollifiers of rank m.LetK be a mollifier of rank m and let D be a differential operator of rank m such that DK = δ.ThenifT is any summable distribution of rank ≤ m, the summable distribution μ = K ∗ T is a bounded measure and T = Dμ.

Thus for summable distributions of rank ≤ m we get a representation of the form  k T = D μk , |k|∞≤m which gives an answer to the question posed at the beginning of this section. It turns out that the rank of a distribution is an important notion. n k Let E(m)(R ) be the space of functions ϕ such that D ϕ exists and is continuous n for all k with |k|∞ ≤ m,andletB(m)(R ) be the subspace of function ϕ ∈E(m) k such that D ϕ is bounded for all k with |k|∞ ≤ m.NoticethatB(m) is a with norm  k p(m)(ϕ) = sup |D ϕ| (ϕ ∈B(m)). |k|∞≤m

Clearly any summable distribution T of rank ≤ m has a canonical extension to the space B(m), having the bounded convergence property of rank m:ifϕj → 0 in E(m) p ∞  → and supj (m)(ϕj )< ,then T, ϕj 0. This follows in the same way as before.

Proof. (Theorem 9.6) For dimension n = 1 and rank m = 1,weset,ifa>0,

1 L(x) = L (x) = Y(x)e−x/a . a a Here Y is, as usual, the Heaviside function. Furthermore, set

d D = 1 + a . dx Then DL = δ and L and L are bounded measures. Now consider the case of rank 2 in dimension n = 1 in more detail. For a>0 set x/a Ka = La ∗ Lˇa,withLˇa(x) = (1/a) Y (−x) e .Then 1 K(x) = K (x) = e−|x|/a . a 2a Rank of a Distribution 89

Set d d d2 D = 1 + a 1 − a = 1 − a2 . dx dx dx2 Then K,K and K are bounded measures, and DK = δ, ϕ = ϕ ∗ δ = ϕ ∗ DK = K ∗ Dϕ , ϕ = K ∗ Dϕ , ϕ = K ∗ Dϕ .

Therefore there exists a constant C such that for all ϕ ∈D(R),

sup |ϕ|≤C sup |Dϕ|, sup |ϕ|≤C sup |Dϕ|, sup |ϕ|≤C sup |Dϕ| and thus, if T has rank ≤ 2, we therefore have, for some C,

|T, ϕ| ≤ C sup |Dϕ| for all ϕ ∈D(R), but also for ϕ ∈B(m)(R). It follows that

|K ∗ T, ϕ| = |T, K ∗ ϕ| ≤ C sup |DK ∗ ϕ|≤C sup |ϕ| for ϕ ∈D(R),sothatK ∗ T is a bounded measure. Let us now consider the case of dimension n>1,butstillm = 2. Set in this case

K(x) = Ka(x1) ⊗···⊗Ka(xn) and # $ /n ∂2 = − 2 D 1 a 2 . i=1 ∂xi k Then D K is a bounded measure for |k|∞ ≤ 2. By the same argument as before, we see that for some C,   sup |Dkϕ|≤C sup |Dϕ| ϕ ∈D(Rn) if |k|∞ ≤ 2,andifT has rank ≤ 2,wehaveagain   |T, ϕ| ≤ C sup |Dϕ| ϕ ∈D(Rn) , so that μ = K ∗ T is a bounded measure and T = Dμ. For higher rank m we take H = K ∗···∗K(ptimes) if m = 2p,and

H = K∗···∗K∗L(ptimes K) if m is odd, m = 2p+1.HereL(x) = L(x1)⊗···⊗ n −(x +···+x )/a L(xn) = (1/a )Y(x1,...,xn) · e 1 n . We choose also the corresponding powers of the differential operators, and find in the same way: if T has rank ≤ m, then μ = H ∗ T is a bounded measure and T = Dμ.

Further Reading

An application of this chapter is a mathematically correct definition of the Feynman path integral, see [13]. 10 Appendix

10.1 The Banach–Steinhaus Theorem

Topological spaces We recall some facts from topology. Let X be a set. Then X is said to be a topological space if a system of subsets T has been selected in X with the following three properties: 1. ∅∈T,X ∈T. 2. The union of arbitrary many sets of T belongs to T again. 3. The intersection of finitely many sets of T belongs to T again.

The elements of T are called open sets, its complements closed sets.Letx ∈ X.An open set U containing x is called a neighborhood of x. The intersection of two neigh- borhoods of x is again a neighborhood of x. A system of neighborhoods U(x) of x is called a neighborhood basis of x if every neighborhood of x contains an element of this system. Suppose that we have selected a neighborhood basis for every element of X. Then, clearly, the neighborhood basis of a specific element x has the following properties: 1. x ∈ U(x).

2. The intersection U1(x) ∩ U2(x) of two arbitrary neighborhoods in the basis con- tains an element of the basis. 3. If y ∈ U(x), then there is a neighborhood U(y) in the basis of y with U(y) ⊂ U(x).

If, conversely, for any x ∈ X asystemofsetsU(x) is given, satisfying the above three properties, then we can easily define a topology on X such that these sets form a neighborhood basis for x,foranyx. Indeed, one just calls a set open in X if it is the unionofsetsoftheformU(x). Let X be a topological space. Then X is called a Hausdorff space if for any two points x and y with x =y there exist neighborhoods U of x and V of y with U ∩ V =∅. Let f be a mapping from a topological space X to a topological space Y .Onesays that f is continuous at x ∈ X if for any neighborhood V of f(x) there is a neigh- borhood U of x with f(U) ⊂ V . The mapping is called continuous on X if it is continuous at every point of X. This property is obviously equivalent with saying that the inverse image of any open set in Y is open in X. Let X be again a topological space and Z, a subset of X.ThenZ can also be seen as a topological space, a topological subspace of X, with the induced topology: open sets in Z are intersections of open sets in X with Z. The Banach–Steinhaus Theorem 91

Fréchet spaces Let now X be a complex vector space.

Definition 10.1. AseminormonX is a function p : X → R satisfying 1. p(x) ≥ 0; 2. p(λx) =|λ| p(x); 3. p(x + y) ≤ p(x) + p(y) (triangle inequality) for all x, y ∈ X and λ ∈ C.

Observe that a seminorm differs only slightly from a norm: it is allowed that p(x) = 0 for some x =0. Clearly, by (iii), p is a convex function, hence all sets of the form {x ∈ X : p(x) < c},withc a positive number, are convex. We will consider vector spaces X provided with a countable set of seminorms p1,p2,....Givenx ∈ X, ε>0 and a positive integer m, we define the sets V(x,m; ε) as follows:  V(x,m; ε) = y ∈ X : pk(x − y) < ε for k = 1, 2,...,m .

Then, clearly, the intersection of two of such sets contains again a set of this form. Furthermore, if y ∈ V(x,m; ε) then there exists ε > 0 such that V(y,m; ε) ⊂ V(x,m; ε).WecanthusprovideX with a topology by calling a subset of X open if it is a union of sets of the form V(x,m; ε). This topology has the following properties: the mappings (x, y) x + y

(λ, x) λx, from X × X → X and C × X → X, respectively, are continuous. The space X is said to be a topological vector space. Clearly, any x ∈ X has a neighborhood basis consisting of convex sets. There- fore X is called a locally convex (topological vector) space. The space X is a Hausdorff space if and only if for any nonzero x ∈ X there is aseminormpk with pk(x) =0. From now we shall always assume that this property holds.

Let {xn} be a sequence of elements xn ∈ X.Onesaysthat{xn} converges to x ∈ X,innotationlimn→∞ xn = x or xn → x(n→∞), if for any neighborhood V of x there exists a natural number N such that xn ∈ V for n ≥ N. Alternatively, one may say that {xn} converges to x if for any ε>0 and any k ∈ N there is a natural number N = N(k,ε) satisfying pk(xn − x) < ε for n ≥ N.BecauseX is a Hausdorff space, the limit x, if it exists, is uniquely determined.

Asequence{xn} is called a Cauchy sequence if for any neighborhood V of x = 0 there is a natural number N such that xn − xm ∈ V for n, m ≥ N.Alternatively, one may say that {xn} is a Cauchy sequence if for any ε>0 and any k ∈ N there is a natural number N = N(k,ε) satisfying pk(xn − xm)<εfor n, m ≥ N. 92 Appendix

The space X is said to be complete (also: sequentially complete) if any Cauchy sequence is convergent.

Definition 10.2. A complete locally convex Hausdorff topological vector space with a topology defined by a countable set of seminorms is called a Fréchet space.

Examples of Fréchet Spaces 1. Let K be an open bounded subset of Rn and let D(K) be the space of functions ϕ ∈D(Rn) with Supp ϕ ⊂ K.ProvideD(K) with the set of seminorms (actual- ly norms in this case) given by  k pm(ϕ) = sup |D ϕ| (ϕ ∈D(K)) . |k|≤m

Then D(K) is a Fréchet space. Notice that the seminorms form an increasing

sequence in this case: pm(ϕ) ≤ pm+1(ϕ) for all m and ϕ. This property might be imposed on any Fréchet space X by considering the set of new seminorms  = +···+ ∈ N pm p1 pm (m ). Obviously, these new seminorms define the same topology on X. 2. The spaces E and S.ForE we choose the seminorms

= | k | pk,K(ϕ) supx∈K D ϕ ,

k being a n-tuple of nonnegative integers and K a compact subset of Rn.Taking for K aballaroundx = 0 of radius l(l ∈ N), which suffices, we see that we obtain a countable set of seminorms. For S we choose the seminorms

l k pk,l(ϕ) = sup |x D ϕ| .

Here both k and l are n-tuples of nonnegative integers.

The dual space

Let X be a Fréchet space with an increasing set of seminorms pm(m = 1, 2,...).De- note by X the space of (complex-valued) continuous linear forms x on X.Thevalue of x at x is usually denoted by x,x, showing a nice bilinear correspondence be- tween X and X

– x,λx+ μy=λ x,x+μ x,y , – λx + μy,x=λ x,x+μ y,x , for x,y ∈ X,x,y ∈ X,λ,μ∈ C. The space X is called the (continuous) dual space of X. The Banach–Steinhaus Theorem 93

Proposition 10.3. Alinearformx on X is continuous at x = 0 if and only if there is m ∈ N and a constant c>0 such that      x ,x ≤ cpm(x) for all x ∈ X.Inthiscasex is continuous everywhere on X.

Proof. Let x be continuous at x = 0. Then there is a neighborhood U(0) of x = 0 such that |x,x| < 1 for x ∈ U(0). Since the set of seminorms is increasing, we may assume that U(0) is of the form {x : pm(x) ≤ δ} for some δ>0.Thenweget for any x ∈ X and any ε>0,

δx x1 = ∈ U(0), pm(x) + ε

 thus |x ,x1| < 1 and           pm(x) + ε pm(x) + ε x ,x = x ,x  < . 1 δ δ  Hence |x ,x| ≤ cpm(x) for all x ∈ X with c = 1/δ,sinceε was arbitrary. Conversely, any linear form, satisfying a relation      x ,x ≤ cpm(x) (x ∈ X) for some m ∈ N and some constant c>0, is clearly continuous at x = 0.But,we get more. If x0 ∈ X is arbitrary, then we obtain      x ,x− x0 ≤ cpm(x − x0)(x∈ X) ,

 hence x is continuous at x0. Indeed, if ε>0 is given, then we get            x ,x− x0 = x ,x−x ,x0 <ε if pm(x − xo)<ε/cand this latter inequality defines a neighborhood of x0. We can also define continuity of a linear form x in another way.

Proposition 10.4. Alinearformx on X is continuous on X if one of the following con- ditions is satisfied: (i) x is continuous with respect to the topology of X; (ii) there exists m ∈ N and c>0 such that      x ,x ≤ cpm(x)

for all x ∈ X;   (iii) for any sequence {xn} with limn→∞ xn = x one has limn→∞x ,xn=x ,x. 94 Appendix

Proof. It is sufficient to show the equivalence of (ii) and (iii). Obviously, (ii) implies (iii). The converse implication is proved by contradiction. If condition (ii) is not sat- isfied, then we can find for all m ∈ N and all c>0 an element xm ∈ X with  |x ,xm| = 1 and pm(xm)<1/m,takingc = m.Thenwehavelimm→∞ xm = 0,  because the sequence of seminorms is increasing. But |x ,xm| = 1 for all m.This contradicts (iii). There exists sufficiently many continuous linear forms on X. Indeed, compare with the following proposition.

 Proposition 10.5. For any x0 = 0 in X there is a continuous linear form x on X with  x ,x0 =0.

Proof. Because x0 = 0, there exists a seminorm pm with pm(x0)>0. By Hahn–   Banach’s theorem, there exists a linear form x on X with |x ,x| ≤ pm(x) for all   x ∈ X and x ,x0=pm(x0). By Proposition 10.4 the linear form x is continuous.

Metric spaces Let X be a set. We recall:

Definition 10.6. AmetriconX is a mapping d : X × X → R with the following three properties: (i) d(x, y) ≥ 0 for all x, y ∈ X, d(x, y) = 0 if and only if x = y; (ii) d(x, y) = d(y, x) for all x, y ∈ X; (iii) d(x, y) ≤ d(x, z) + d(z, y) for all x, y,z ∈ X (triangle inequality).

Clearly, the balls B(x,ε) ={y ∈ X : d(x, y) < ε} can serve as a neighborhood basis of x ∈ X.Hereε is a positive number. The topology generated by these balls is called the topology associated with the metric d.ThesetX, provided with this topology, is then called a metric space.

Let now X beaFréchetspacewithanincreasingsetp1 ≤ p2 ≤···of semi- norms. We shall define a metric on X in such a way that the topology associated with the metric is the same as the original topology, defined by the seminorms. The space X is called metrizable.

Theorem 10.7. Any Fréchet space is metrizable with a translation invariant metric.

Proof. Fortheproofwefollow[8],§18,2.Set ∞  1 p (x) x = k 2k 1 + p (x) k=1 k and define d by d(x, y) = x − y for x, y ∈ X.Thend is a translation in- variant metric. To see this, observe that x =0 if and only if pk(x) = 0 for The Banach–Steinhaus Theorem 95

all k,hencex = 0. Thus the first property of a metric is satisfied. Furthermore, d(x, y) = d(y, x) since x = −x . The triangle inequality follows from the re- lation: if two real numbers a and b satisfy 0 ≤ a ≤ b,thena/(1 + a) ≤ b/(1 + b). Indeed, we then obtain p (x + y) p (x) + p (y) p (x) p (y) k ≤ k k ≤ k + k 1 + pk(x + y) 1 + pk(x) + pk(y) 1 + pk(x) 1 + pk(y) for all x,y ∈ X. Next we have to show that the metric defines the same topology as the seminorms. Therefore it is sufficient to show that every neighborhood of x = 0 in the first topolo- gy, contains a neighborhood of x = 0 in the second topology and conversely. a. The neighborhood given by the inequality x < 1/2m contains the neighbor- m+1 hood given by pm+1(x) < 1/2 . Indeed, since pk(x)/[1 + pk(x)] ≤ pk(x), m+1 ≤ ≤···≤ we have for any x satisfying pm+1(x) < 1/2 : p1(x) p2(x) m+1 m+1 m+1 k+ ∞ k m pm+1(x) < 1/2 ,hence x < 1/2 k=0 1/2 k=m+2 1/2 < 1/2 . m b. The neighborhood given by pk(x) < 1/2 contains the neighborhood given by x < 1/2m+k+1. Indeed, for any x satisfying x < 1/2m+k+1 one has (1/2k)· m+k+1 m+1 [pk(x)/(1 + pk(x))] < 1/2 ,hencepk(x)/[1 + pk(x)] < 1/2 .Thus m+1 m+1 m pk(x) · (1 − 1/2 )<1/2 and therefore pk(x) < 1/2 .

This completes the proof of the theorem.

Baire’s theorem The following theorem, due to Baire, plays an important role in our final result.

Theorem 10.8. If a complete metric space is the countable union of closed subsets, then at least one of these subsets contains a nonempty open subset. = = Proof. Let (X, d) be a complete metric space, Sn closed in X (n 1, 2,...)andX ∞ = \ n=1 Sn. Suppose no Sn contains an open subset. Then S1 X and X S1 is open. There is x1 ∈ X\S1 and a ball B(x1,ε1) ⊂ X\S1 with 0 <ε1 < 1/2.Nowtheball B(x1,ε1) does not belong to S2,henceX\S2 ∩ B(x1,ε1) contains a point x2 and aballB(x2,ε2) with 0 <ε2 < 1/4. In this way we get a sequence of balls B(xn,εn) with 1 B(x ,ε ) ⊃ B(x ,ε ) ⊃··· ;0<ε < ,B(x,ε ) ∩ S =∅. 1 1 2 2 n 2n n n n n For n

d(xn,x)≤ d(xn,xm) + d(xm,x)<εn + d(xm,x)→ εn (m →∞).

So x ∈ B(xn,εn) for all n.Hencex ∉ Sn for all n,sox ∉ X, which is a contradic- tion. 96 Appendix

The Banach–Steinhaus theorem We now come to our final result.

Theorem 10.9. Let X be a Fréchet space with an increasing set of seminorms. Let F be asubsetofX. Suppose for each x ∈ X the family of scalars {x,x : x ∈F}is bounded. Then there exists c>0 and a seminorm pm such that      x ,x ≤ cpm(x) for all x ∈ X and all x ∈F.

Proof. For the proof we stay close to [12], Chapter III, Theorem 9.1. Since X is a com- plete metric space, Baire’s theorem can be applied. For each positive integer n let    Sn ={x ∈ X : |x ,x| ≤ n for all x ∈F}. The continuity of each x ensures that Sn is closed. By hypothesis, each x belongs to some Sn. Thus, by Baire’s theo- rem, at least one of the Sn,saySN , contains a nonempty open set and hence contains a set of the form V(x0,m; ρ) ={x : pm(x − x0)<ρ} for some x0 ∈ SN ,ρ>0   and m ∈ N.Thatis,|x x| ≤ N for all x with pm(x − x0)<ρand all x ∈F. Now, if y is an arbitrary element of X with pm(y) < 1,thenwehavex0 + ρy ∈   V(x0,m; ρ) and thus |x ,x0 + ρy| ≤ N for all x ∈F. It follows that                x ,ρy ≤ x ,x0 + ρy + x ,x0 ≤ 2N.

Letting c = 2N/ρ,wehavefory with pm(y) < 1,           1 x ,y = x ,ρy ≤ c, ρ and thus, , -      y     =    + ≤ + x ,y  x ,  (pm(y) ε) c(pm(y) ε) pm(y) + ε for any ε>0,hence       x ,y ≤ cpm(y) for all y ∈ X and all x ∈F. The theorem just proved is known as Banach–Steinhaus theorem, and also as the principle of uniform boundedness.

Application We can now show the following result, announced in Section 4.3.

Theorem 10.10. Let {Tj } be a sequence of distributions such that the limit limj→∞Tj ,ϕ exists for all ϕ ∈D.SetT, ϕ=limj→∞ Tj ,ϕ (ϕ ∈D).ThenT is a distribution. The Beta and Gamma Function 97

Proof. Clearly T is a linear form on D.WeonlyhavetoprovethecontinuityofT .LetK be an open bounded subset of Rn and consider the Fréchet space D(K) of functions ϕ ∈Dwith Supp ϕ ⊂ K and seminorms  k pm(ϕ) = sup |D ϕ| (ϕ ∈D(K)) . |k|≤m

Since limj→∞ Tj ,ϕ exists, the family of scalars {Tj ,ϕ : j = 1, 2 ...} is bound- ed for every ϕ ∈D(K). Therefore, by the Banach–Steinhaus theorem, there exist a positive integer m and a constant CK > 0 such that for each j = 1, 2,...      k Tj ,ϕ ≤ CK sup |D ϕ| |k|≤m for all ϕ ∈D(K),hence  k |T, ϕ| ≤ CK sup |D ϕ| |k|≤m for all ϕ ∈D(K).SinceK was arbitrary, it follows from Proposition 2.3 that T is a distribution.

10.2 The Beta and Gamma Function

We begin with the definition of the gamma function ∞ Γ (λ) = e−x xλ−1 dx. (10.1) 0

− − This integral converges for Re λ>0.Expandingxλ λ0 = e(λ λ0) log x(x > 0) into a power series, we obtain for any (small) ε>0 the inequality                 λ−1 λ0−1 −ε λ0−1  x − x  ≤ x  λ − λ0   x   log x  (0 0 and ∞ Γ (λ) = e−x xλ−1 log x dx(Re λ>0). 0 Using partial integration we obtain

Γ (λ + 1) = λ Γ (λ) (Re λ>0). (10.2)

Since Γ (1) = 1,wehaveΓ (n + 1) = n! for n = 0, 1, 2,.... 98 Appendix

We are looking for an analytic continuation of Γ . The above relation (10.2) implies, for Re λ>0, Γ (λ + n + 1) Γ (λ) = . (10.3) (λ + n)(λ + n − 1) ···(λ + 1)λ The right-hand side has however a meaning for all λ with Re λ>−n − 1 minus the points λ = 0, −1,...,−n. We use this expression as definition for the analytic continuation: for every λ with λ =0, −1, −2,... we determine a natural number n with n>− Re λ − 1. Then we define Γ (λ) as in equation (10.3). Using equation (10.2) one easily verifies that this is a good definition: the answer does not depend on the special chosen n, provided n>− Re λ − 1. It follows also that the gamma function has simple poles at the points λ = 0, −1, −2,...with residue at λ =−k equal to (−1)k/k!. Substituting x = t2 in equation (10.1) one gets ∞ 2 Γ (λ) = 2 e−t t2λ−1 dt, (10.4) 0 √ hence Γ (1/2) = π. We continue with the beta function, defined for Re λ>0, Re μ>0 by

1 B(λ, μ) = xλ−1 (1 − x)μ−1 dx. (10.5) 0 There is a nice relation with the gamma function. One has for Re λ>0, Re μ>0, using equation (10.4), ∞ ∞ 2 2 Γ (λ) Γ (μ) = 4 x2λ−1 t2μ−1 e−(x +t ) dx dt. 0 0 Inpolarcoordinateswemaywritetheintegralas

∞ π/ 2 2 r 2(λ+μ)−1 e−r (cos ϕ)2λ−1 (sin ϕ)2μ−1 dϕ dr = 0 0 π/ 2 1 Γ + 2λ−1 2μ−1 = 1 2 (λ μ) (cos ϕ (sin ϕ) dϕ 4 B(λ, μ) , 0 by substituting u = sin2 ϕ in the latter integral. Hence Γ (λ) Γ (μ) B(λ, μ) = (Re λ>0, Re μ>0). (10.6) Γ (λ + μ) Let us consider two special cases. For λ = μ and Re λ>0 we obtain

Γ (λ)2 B(λ, λ) = . Γ (2λ) The Beta and Gamma Function 99

On the other hand,

1 π/ 2 = λ−1 − λ−1 = 1 2λ−1 2λ−1 B(λ, λ) x (1 x) dx 2 (cos ϕ) (sin ϕ) dϕ 0 0 π/ 2 = 2−2λ (sin 2ϕ)2λ−1 dϕ. 0

Splitting the path of integration [0,π/2] into [0,π/4] ∪ [π/4,π/2],usingthere- = 1 − = 2 lation sin 2ϕ sin 2( 2 π ϕ) and substituting t sin 2ϕ we obtain

π/ 2 1   Γ Γ 1 −2λ 2λ−1 −2λ+1 λ−1 − 1 1−2λ (λ) 2 2 (sin 2ϕ) dϕ = 2 t (1 − t) 2 dt = 2   . Γ λ + 1 0 0 2

Thus ! " Γ = 2λ−1 −1/2 Γ Γ + 1 (2λ) 2 π (λ) λ 2 . (10.7) This formula is called the duplication formula for the gamma function. Another important relation is the following: π Γ (λ) Γ (1 − λ) = B(λ, 1 − λ) = , (10.8) sin πλ first for real λ with 0 <λ<1 and then, by analytic continuation, for all noninteger λ in the complex plane. This formula can be shown by integration over paths in the complex plane. Consider

1 B(λ, 1 − λ) = xλ−1 (1 − x)−λ dx, (0 <λ<1). 0

Substituting x = t/(1 + t) we obtain ∞ tλ−1 B(λ, 1 − λ) = dt. 1 + t 0

To determine this integral, consider for 0

Figure 10.1. The closed path W .

λ−1 z now on G1 and G2, respectively, including the border, by means of analytic ex- tensions f1 and f2, respectively, of the, in a neighborhood of z = 1 defined, principal value,andinsuchawaythatf1(z) = f2(z) for all z ∈ k4. This can be done for example by means of the cuts ϕ =−(3/4)π and ϕ = (3/4)π, respectively. We obtain (λ−1)(log |z|+i arg z) –onG1: f1(z) = e , π/2 ≤ arg z ≤ π, (λ−1)(log |z|+i arg z) –onG2: f2(z) = e , −π ≤ arg z ≤ π/2.

On k4 we have to take arg z = π/2 in both cases. We now apply the residue theorem on G1 with g1(z) = f1(z)/(1 − z) and on G2 with g2(z) = f2(z)/(1 − z), respectively. Observe that g1 has no poles in G1 and g2 has a simple pole at z = 1 in G2, with residue −f2(1) =−1.Weobtain

g1(z) dz + g2(z) dz =−2πi.

∂G1 ∂G2

From the definitions of g1 and g2 follows:

g1(z) dz + g2(z) dz = 0 ,

k4 −k4

−r ρ e(λ−1)(log |x|+πi) yλ−1 g (z) dz = dx = e(λ−1)πi dy, 1 1 − x 1 + y − k1 ρ r

−ρ ρ e(λ−1)(log |x|−πi) yλ−1 g (z) dz = dx = e−(λ−1)πi dy. 2 1 − x 1 + y − −k1 r r The Beta and Gamma Function 101

Furthermore, for |z|=r : |zλ−1/(z − 1)|≤r λ−1/(1 − r),hence         2πrλ  +  ≤  g1(z) dz g2(z) dz  ,   1 − r k21 k22 and, similarly,         λ  +  ≤ 2πρ  g1(z) dz g2(z) dz  .   ρ − 1 k31 k32 Taking the limits r ↓ 0 and ρ →∞,weseethat 2 3 e(λ−1)πi − e−(λ−1)πi I + 2πi = 0 with I = B(λ, 1 − λ),hence ∞ tλ−1 π B(λ, 1 − λ) = dt = . 1 + t sin πλ 0 11 Hints to the Exercises

Exercise 3.2. Use the formula of jumps (Section 3.2, Example 3). For the last equa- tion, apply induction with respect to m.

Exercise 3.3. Write |x|=Y(x)x − Y(−x)x.Then|x| = Y(x)− Y(−x) by the formula of jumps, hence |x| = 2δ.Thus|x|(k) = 2δ(k−2) for k ≥ 2.

Exercise 3.4. Applying the formula of jumps, one could take f such that ⎧ ⎪   ⎨⎪ af + bf + cf = 0 = ⎪ af (0) n ⎩⎪  af (0) + bf(0) = m.

This system has a well-known classical solution. The special cases are easily worked out.

Exercise 3.5. The first expression is easily shown to be equal to the derivative of the distribution Y(x) log x, which itself is a regular distribution. The second expression is the derivative of the distribution Y(x) pv − δ, x thus defines a distribution. This is easily worked out, applying partial integration.

Exercise 3.6. This is most easily shown by performing a transformation

x = u + v, y = u − v.

Then dx dy = 2du dv, ∂2ϕ/∂x2 − ∂2ϕ/∂y2 = ∂2Φ/∂u ∂v for ϕ ∈D(R2),with Φ(u, v) = ϕ(u + v,u − v). One then obtains , - ∂2T ∂2T − ,ϕ = 2 ϕ(3, 1) − ϕ(2, 2) + ϕ(1, 1) − ϕ(2, 0) ∂x2 ∂y2 for ϕ ∈D(R2).

Exercise 3.7. Notice that # $ ∂ ∂ 1 ∂ 1 + i = 2 = 0 ∂x ∂y x + iy ∂z z for z =0.Clearly,1/(x + iy) is a locally integrable function (use polar coordinates). Let χ ∈D(R), χ ≥ 0,χ(x)= 1 in a neighborhood of x = 0 and Supp χ ⊂ (−1, 1). 2 2 2 2 Set χε(x, y) = χ[(x + y )/ε ] for any ε>0.Thenwehaveforϕ ∈D(R ) ,# $ - ,# $ - ∂ ∂ 1 ∂ ∂ 1 + i ,ϕ = + i ,χϕ ∂x ∂y x + iy ∂x ∂y x + iy ε Hints to the Exercises 103

for any ε>0. Working out the right-hand side in polar coordinates and letting ε tend to zero, gives the result.

Exercise 3.8. Apply Green’s formula, as in Section 3.5, or just compute using polar coordinates.

Exercise 3.9. Apply Green’s formula, as in Section 3.5.

Exercise 3.10. Show first that ∂E/∂t − ∂2E/∂x2 = 0 for t =0.Thenjustcompute, applying partial integration.

Exercise 4.4. a. The main task is to characterize all ψ ∈Dof the form ψ = ϕ for some ϕ ∈D.  ∞ Such ψ are easily shown to be the functions ψ in D with −∞ ψ(x) dx = 0.  = x = + b. Write F(x) 0 g(t)dt and show that F f const.

Exercise 4.5. The solutions are pv(1/x) + const.

Exercise 4.9. a. Use partial integration. ∞ A 9 : A sin λx ϕ(x) − ϕ(0) sin λx b. ϕ(x) dx = sin λx dx + ϕ(0) dx x x x −∞ −A −A if Supp ϕ ⊂ [−A, A]. ThefirsttermtendstozerobyExercise4.4a.Thesecond term is equal to

A λA sin λx sin y ϕ(0) dx = ϕ(0) dy , x y −A −λA

∞ sin y which tends to πϕ(0) if λ →∞,since dy = π. y −∞ ∞   cos λx cos λx − 1 1 c. lim ϕ(x)dx = ϕ(x) dx + pv ,ϕ . ε↓0 x x x |x|≥ε −∞ 104 Hints to the Exercises

∞ cos λx Hence ϕ Pf ϕ(x) dx is a distribution. If Supp ϕ ⊂ [−A, A],then x −∞ cos λx cos λx ϕ(x) dx = ϕ(x) dx x x | |≥ ≥| |≥ x ε A x ε cos λx = [ϕ(x) − ϕ(0)] dx x ≥| |≥ A x ε cos λx + ϕ(0) dx. x A≥|x|≥ε

The latter term is equal to zero. Hence

A 9 : cos λx ϕ(x) − ϕ(0) lim ϕ(x) dx = cos λx dx. ε↓0 x x |x|≥ε −A

This term tends to zero again if λ →∞by Exercise 4.4 a.

a ax Exercise 4.10. lim = πδ, lim = 0. a↓0 x2 + a2 a↓0 x2 + a2

Exercise 6.16. (i) e−|x| ∗ e−|x| = (1 +|y|) e−|y|. 2 1 d 2 1 2 For (ii) and (iii), observe that xe−ax =− e−ax =− δ ∗ e−ax . 2a dx 2a

Exercise 6.17. Observe that f(x) = f(−x), Fm(x) = Fm(−x) and f = δ−1 ∗ Y − δ1 ∗ Y.Then Supp Fm ⊂ [−m, m].

Exercise 6.18. Let A be the rotation over π/4 in R2: # $ 1 1 −1 A = √ . 2 11

Then χ(x) = Y(Ax)for x ∈ R2. Use this to determine f ∗ g and χ ∗ χ.

Exercise 6.28. Apply Sections 6.6 and 6.7.

Exercise 6.29. Idem.

Exercise 6.30. Condition: g has to be a continuous function.

Exercise 6.31. Apply Sections 6.7 and 6.8.

Exercise 6.32. Set, as usual, p = δ and write z for zδ.

Exercise 6.34. Apply Section 6.10. Hints to the Exercises 105

= 1 1 =− − − Exercise 7.20. T pv x , T πi Y(y) Y( y) .

sin 2πy cos 2πy − 1 Exercise 7.21. f(y)1 = + (y =0), f(1 0) = 1. 2πy 2πy

1 ∞  f is C ,alsobecauseTf ∈E . 1 1 Exercise 7.22. F(|x|) =− Pf . 2π 2 x2

Exercise 7.23. Take the Fourier or Laplace transform of the distributions.

2 D Exercise 7.24. Observe that, if fm tends to f in L ,thenTfm tends to Tf in .

References

[1] N. Bourbaki, Espaces Vectoriels Topologiques, Hermann, Paris, 1964. [2] N. Bourbaki, Intégration, Chapitres 1, 2, 3 et 4, Hermann, Paris, 1965. [3] I. Daubechies, Ten Lectures on Wavelets, Society for Industrial and Applied Mathematics, Philadelphia, Pennsylvania, 1992. [4] G. van Dijk, Introduction to Harmonic Analysis and Generalized Gelfand Pairs, de Gruyter, Berlin, 2009. [5] F.G.FriedlanderandM.Joshi,Introduction to the Theory of Distributions, Cambridge University Press, Cambridge, 1998. [6] I. M. Gelfand and G. E. Shilov, Generalized Functions, Vol. 1, Academic Press, New York, 1964. [7] L. Hörmander, The Analysis of Linear Partial Differential Operators, Vol. I–IV, Springer-Verlag, Berlin, 1983–85, Second edition 1990. [8] G. Köthe, Topologische Lineare Räume, Springer-Verlag, Berlin, 1960. [9] L. Schwartz, Méthodes Mathématiques pour les Sciences Physiques, Hermann, Paris, 1965. [10] L. Schwartz, Théorie des Distributions, nouvelle édition, Hermann, Paris, 1978. [11] E. M. Stein and G. Weiss, Introduction to Fourier Analysis on Euclidean Spaces, Princeton University Press, Princeton, 1971. [12] A. E. Taylor and D. C. Lay, Introduction to Functional Analysis, second edition, John Wiley and Sons, New York, 1980. [13] E. G. F. Thomas, Path Distributions on Sequence Spaces, preprint, 2000. [14] E. C. Titchnarsh, Theory of Fourier Integrals, Oxford University Press, Oxford, 1937. [15] F. Trèves, Topological Vector Spaces, Distributions and Kernels, Academic Press, New York, 1967. [16] E. T. Whittaker and G. N. Watson, A Course of Modern Analysis, Cambridge University Press, Cambridge, 1980. [17] D. Widder, The Laplace Transform, Princeton University Press, Princeton, 1941.

Index

Abel integral equation 78 metric 94 averaging theorem 41 metric space 94 metrizable space 94 Banach–Steinhaus theorem 23, 84, 90 mollifier 88 Bessel function 70 bounded neighborhood 90 –fromtheleft34 neighborhood basis 90 – from the right 34 Newton potential 40 bounded convergence property 86 open set 90 canonical extension 85, 86 order of a distribution 6 Cauchy distribution 35 Cauchy problem 71 Paley–Wiener Theorems 65 Cauchy sequence 91 parametrix 83 closed set 90 Partie finie 11 complete space 92 Plancherel’s theorem 57 convergence principle 4, 26, 59 Poisson equation 40 convolution algebra 42 principal value 11 convolution equation 43 convolution of distributions 33 radial function 69 rank of a distribution 87 derivative of a distribution 9 rapidly decreasing function 59 distribution 4 Riemann–Lebesgue lemma 52 dual space 92 duplication formula 99 Schwartz space 59 seminorm 91 elementary solution 40 slowly increasing function 60 Fourier transform spherical coordinates 18 –ofadistribution63 structure of a distribution 28, 61 –ofafunction52 summability order 82 Fourier–Bessel transform 70 summable distribution 6, 82 Fréchet space 91 support –ofadistribution8 Gauss distribution 35 –ofafunction3 Green’s formula 17 symbolic calculus 45 Hankel transform 70 tempered distribution 60 harmonic distribution 40 tensor product harmonic function 17, 40 – of distributions 31 heat equation 71 –offunctions31 Heaviside function 10 test function 3 inversion theorem 54 topological space 90 iterated Poisson equation 83 topological vector space 91 trapezoidal function 53 Laplace transform triangle function 53 –ofadistribution75 –ofafunction74 uniform boundedness 96 locally convex space 91 locally integrable function 5 Volterra integral equation 47