An Overview of the Theory of Distributions
Total Page:16
File Type:pdf, Size:1020Kb
An Overview of the Theory of Distributions Matt Guthrie Adapted from Halperin[1] 1 Chapter 1 Introduction For the range of time between the introduction of the operational calculus at the end of the 19th century and the mid 20th century, many formulas were in use which had not been adequately clarified from the mathematical point of view. For instance, consider the Heaviside function ( 0 for x < 0 H(x) = (1.1) 1 for x ≥ 0: It is said that the derivative of this function is the Dirac delta function δ(x), which has the mathematically impossible properties that it vanishes everywhere except at the origin where its value is so large that Z 1 δ(x) dx = 1: (1.2) −∞ This \function" and its successive \derivatives" have been used with considerable success, especially in electromagnetics and gravitation. It was suggested by Dirac himself that the delta function could be avoided by using instead a limiting procedure involving ordinary, mathematically possible, functions. However, the delta function can be kept and made rigorous by defining it as a measure, that is, as a set function in place of an ordinary point function. This suggests that the notion of a point function be enlarged to include new entities1 and that the notion of the derivative be correspondingly generalized so that within the new system of entities, every point function should have a rigorously defined derivative. This is done in the theory of distributions. The new system of entities, called distributions, includes all continuous functions, all Lebesgue locally summable functions, and new objects of which a simple example is the Dirac delta function mentioned above. The more general but rigorous process of differentiation assigns to every distribution a derivative which is again a distribution, and so every distribution, including every locally summable point function, has derivatives of all orders. The derivative of a locally summable point function is always a distribution although not, in general, a point function. However, it coincides with the classical derivative when the latter exists and is locally summable. 1Just as the notion of a rational number was enlarged by Dedekind to include all real numbers. 2 This theory of distributions gives rigorous content and validity to the formulas of operational calculus mentioned above. It can be developed not only for functions of one variable, but also for functions of several variables and it provides a simple but more complete theory of such topics as Fourier series and integrals, convolutions, and partial differential equations. A systematic exposition of the theory of distributions is given in Grubb's recent Distributions and Operators[2]. There's also the recommended reference work by Strichartz, A Guide to Distribution Theory and Fourier Transforms[3]. The comprehensive treatise on the subject, although quite old now, is Gel'fand, and Shilov's Generalized functions[4]. A very good (though pretty advanced) source that's now available from Dover is Topological Vector Spaces, Distributions and Kernels by Trves[5]. That book is one of the classic texts on functional analysis and if you're an analyst or aspire to be, there's no reason not to have it now. Of course, if you read French, you really should go back and read Schwartz's original treatise[6]. However, the following pages may serve as a useful introduction to some of the basic ideas. 3 Chapter 2 Point Functions as Functionals The following discussion will lead to the precise definition of distributions given later in this chapter. Let (a; b) be a finite closed interval. A continuous function f(x) can be considered a point function1 but it can also be considered in another way:2 it defines the functional F (φ) as Z b F (φ) = f(x)φ(x) dx; (2.1) a where F (φ) is a number3 defined for every continuous φ(x). This functional is linear, that is F (c1φ1 + c2φ2) = c1F (φ1) + c2F (φ2): (2.2) Since there are linear functionals which cannot be expressed in this way, in terms of any continuous (or even Lebesgue summable[7] f(x)), our first suggestion is that the distributions be defined as the arbitrary linear functionals F (φ) over some suitable set S of continuous φ(x). The φ(x) chosen will be called testing functions. If f(x) is absolutely continuous and has a derivative f 0(x), the derivative will also define a linear functional, namely Z b f 0(x)φ(x) dx: a This new functional can be expressed in terms of the original F (φ), for some φ, by use of integration by parts: Z b Z b f 0(x)φ(x) dx = − f(x)φ0(x) dx = −F (φ0); (2.3) a a 1Point functions need be defined almost everywhere and two functions are identified if they are equal almost everywhere. Thus, we say that f(x) is identically zero if it has the value zero almost everywhere, that f(x) is continuous if it can be identified with a continuous function; the upper bound of f(x) will mean the essential upper bound, the derivative of f(x) will mean the derivative of that function which can be identified with f(x) and which has a derivative if such a function exists, and so on. 2Just as every rational number was considered by Dedekind in a new way: as defining a cut in the set of all rationals. 3Numbers may be taken to mean real numbers or complex numbers. 4 which is valid for all φ(x) that vanish at a and b and have a continuous first derivative. This suggests that we allow S to include only such φ(x) and not all continuous φ(x). Equation (2.3) also suggests that for every linear F (φ), whether it comes from an f(x) possessing a derivative f 0(x) or not, F 0(φ), a derivative of F which is a linear functional on S, be defined by F 0(φ) = −F (φ0): (2.4) If F 0(φ) is to be defined for all φ 2 S, S must be restricted to include only such φ(x) as have φ0(x) also in S. This leads to the condition that φ(x) mush have derivatives of all orders and they (as well as φ(x) itself) must vanish at a and b. Important examples of such φ(x) are the functions 1=n φn(x) = (φc;d(x)) ; obtained by choosing n 2 N and a ≤ c < d ≤ b and defining ( 1 1 exp − x−c + d−x for c < x < d φc;d(x) = (2.5) 0 otherwise: It is not necessary to restrict S any further. However, in restricting S we must be careful to identify the point function f(x) and the functional F (φ) which it defines. Thus, we want S to include sufficiently many testing functions so that functions f(x) which are different on (a; b) will be identified with different F (φ). We do not want Z b f(x)φ(x) dx a to vanish for all φ 2 S except when f(x) is identically zero on (a; b). This requirement is satisfied if S includes all 1=n (φc;d(x)) ; for Z b Z d 1=n lim f(x)(φc;d(x)) dx = f(x) dx; (2.6) n!1 a c and if Z d f(x) dx = 0 (2.7) c for all a ≤ c < d ≤ b, then f(x) is identically zero on (a; b). Another point is this: the F (φ) which are defined by point functions, and all derivatives of such (the F (n)(φ)) have a certain continuity property and it is desirable to require this continuity property in defining distributions.4 4As will be shown in Chapter 5, this gives the smallest enlargement or completion of the system of locally summable point functions which permits unrestricted differentiation. 5 Finally, with a view to later applications, we will define distributions on open intervals. The closed interval gives a mathematically simpler situation and indeed the open interval will be discussed in terms of its closed subintervals. However, the word \distribution" will be reserved for the open interval and the terminology \continuous linear functional" will be used for the closed interval. The precise definitions are: • Definition of a distribution For any closed finite interval (a; b), let S(a;b) consist of all continuous φ(x) possessing deriva- tives φ(n)(x) of all orders which, along with φ(x) itself, vanish at a and b and for x outside (a; b). A linear functional F (φ), defined for all φ 2 S(a;b), is called a continuous linear func- tional on (a; b) if it has the property that whenever all φ, φm 2 S(a;b) and the φm(x) converge (n) (n) uniformly to φx, and for each n, the derivatives φm (x) converge uniformly to φ (x), then F (φm) will converge to F (φ). For an arbitrary open interval I, finite or infinite, a distribution on I is a linear functional F (φ) such that for every closed finite interval (a; b) contained in I, F (φ) is defined for φ 2 S(a;b) and, when restricted to these φ, defines a continuous linear functional on (a; b). • Identification of the distribution and point function A distribution F (φ) on an interval I is to be identified with a point function f(x) if, for every closed finite interval (a; b) 2 I, f(x) is summable on (a; b) and Z b F (φ) = f(x)φ(x) dx (2.8) a for all φ 2 S(a;b).