Supplement. The Dirac Delta Function, A Cautionary Tale

Note. In the study of charge distributions in electricity and magnetism, when considering point charges it is common to introduce the "Dirac delta function," $\delta(x)$. This function is defined to be extended real valued with the following properties:
$$\delta(x) = \begin{cases} 0 & \text{if } x \neq 0 \\ \infty & \text{if } x = 0 \end{cases} \qquad\text{and}\qquad \int_{-\infty}^{\infty} \delta(x)\,dx = 1.$$
However, this is very misleading! First, Riemann integrals do not address extended real valued functions. Second, such a function would violate Proposition 4.9 of Royden and Fitzpatrick's Real Analysis, 4th Edition:

Proposition 4.9. Let $f$ be a nonnegative measurable (extended real valued) function on set $E$. Then the Lebesgue integral $\int_E f = 0$ if and only if $f = 0$ a.e. on $E$.

Note. In this supplement, we refer to the junior-level undergraduate text Foundations of Electromagnetic Theory, 3rd Edition by John Reitz, Frederick Milford, Robert Christy, Addison-Wesley (1979). We reveal some errors in its presentation of $\delta(x)$ and explore an alternative approach to "get the math right."

Note. Reitz, Milford, and Christy introduce the Dirac delta function on pages 43 and 44.

Note. The motivation for $\delta(x)$ is given on page 43. On page 44, they comment that "...it is a legitimate mathematical object, which leads to no difficulties if one is cautious...." In the footnote, they falsely state: "The Riemann integral of such a function is zero if it exists at all, but the integration can be handled by the more general Lebesgue integral." Well, the Lebesgue integral does "handle" the integration, but it gives zero as the integral value.

Definition. Reitz, Milford, and Christy give more details in their Appendix IV, and the notation there is consistent with that used above in these notes (as opposed to a function of a vector variable as stated on their pages 43 and 44).

Note. In equations IV-2, IV-3, and IV-4 they write $\delta(x)$ as the limit of three different functions. In equation IV-2, they state
$$\delta(x) = \lim_{\varepsilon \to 0} \frac{1}{\varepsilon\sqrt{\pi}}\, e^{-x^2/\varepsilon^2}$$
(a normal distribution with mean $\mu = 0$ and standard deviation $\sigma = \varepsilon/\sqrt{2}$). Since this is a normal distribution, $\int_{-\infty}^{\infty} \frac{1}{\varepsilon\sqrt{\pi}} e^{-x^2/\varepsilon^2}\,dx = 1$. As $\varepsilon \to 0$, the distribution gets very narrow with a large spike at $x = 0$. If we replace $\varepsilon$ with $1/n$ where $n \in \mathbb{N}$, then we can create a sequence of functions $f_n(x) = \frac{n}{\sqrt{\pi}} e^{-(nx)^2}$ where $\lim_{n\to\infty} f_n(x) = \delta(x)$.

Note. Recall that from Royden and Fitzpatrick:

Monotone Convergence Theorem. Let $\{f_n\}$ be an increasing sequence of nonnegative measurable functions on set $E$. If $\{f_n\} \to f$ pointwise a.e. on $E$, then
$$\lim_{n\to\infty} \int_E f_n = \int_E \lim_{n\to\infty} f_n = \int_E f.$$

In the previous note, we have $f_n(x) = \frac{n}{\sqrt{\pi}} e^{-(nx)^2}$ and $\lim_{n\to\infty} f_n(x) = \delta(x)$, but the convergence is not monotone increasing, so the Monotone Convergence Theorem does not apply to give that the integral $\int_{-\infty}^{\infty} \delta(x)\,dx$ is 1. It is clear that Reitz, Milford, and Christy are trying to use the values of the integrals of the normal distributions to justify the claim that the integral of $\delta(x)$ should be 1, but this is not the case.
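Note. The behavior of the sequence $\{f_n\}$ is easy to check numerically: each $f_n$ integrates to (approximately) 1, the values $f_n(0) = n/\sqrt{\pi}$ blow up, and the convergence is not monotone increasing (for example, $f_1(1) > f_2(1)$). The following is a minimal sketch of such a check, assuming NumPy; the function name `f` and the grid choices are purely for illustration.

```python
import numpy as np

def f(n, x):
    # The n-th function in the sequence f_n(x) = (n / sqrt(pi)) e^{-(nx)^2}.
    return n / np.sqrt(np.pi) * np.exp(-(n * x) ** 2)

# A fine, wide grid; a simple Riemann sum approximates the integral over R.
x = np.linspace(-10.0, 10.0, 200_001)
dx = x[1] - x[0]

for n in [1, 2, 4, 8]:
    integral = np.sum(f(n, x)) * dx  # approximately 1 for every n
    print(f"n = {n}: f_n(0) = {f(n, 0.0):.4f}, integral = {integral:.6f}")

# The pointwise convergence is not monotone increasing: at x = 1 the values
# decrease, so the hypothesis of the Monotone Convergence Theorem fails.
print(f(1, 1.0), f(2, 1.0))  # approximately 0.2076 and 0.0207
```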
Note. Recall that from Royden and Fitzpatrick:

Fatou's Lemma. Let $\{f_n\}$ be a sequence of nonnegative measurable functions on set $E$. If $\{f_n\} \to f$ pointwise a.e. on $E$, then
$$\int_E f \leq \liminf \int_E f_n.$$

Now we can apply Fatou's Lemma to the sequence of normal distributions above, since the convergence to $\delta(x)$ is pointwise and all functions here are nonnegative. But this simply allows us to conclude that
$$\int_{-\infty}^{\infty} \delta(x)\,dx \leq \liminf_{n\to\infty} \int_{-\infty}^{\infty} \frac{n}{\sqrt{\pi}}\, e^{-(nx)^2}\,dx = 1,$$
still not an equality of 1, but merely an upper bound of 1.

Note. Reitz, Milford, and Christy's "no difficulties if one is cautious" comment is correct, with the right kind of caution! In Analysis 2 (MATH 4227/5227) Riemann-Stieltjes integrals are introduced (see my online class notes on 6-3. The Riemann-Stieltjes Integral; in particular, see Theorem 6-26). One way to resolve the desired properties of the Dirac delta function is to use Riemann-Stieltjes integrals. You should review these notes before reading the next note.

Note. Let
$$g(x) = \begin{cases} 0 & \text{for } x \in (-\infty, 0) \\ 1 & \text{for } x \in [0, \infty). \end{cases}$$
Then the derivative of $g$ is 0 if $x \neq 0$. The definition of limit gives that the derivative of $g$ is $\infty$ at 0. Also, we have the Riemann-Stieltjes integral
$$\int_{-\infty}^{\infty} dg = \int_{-\infty}^{0} dg + (1)\left[\lim_{x\downarrow 0} g(x) - \lim_{x\uparrow 0} g(x)\right] + \int_{0}^{\infty} dg = 0 + (1)[1] + 0 = 1.$$
So the derivative of $g(x)$ has the values of $\delta(x)$ given above, and the Riemann-Stieltjes integral of $f(x) = 1$ with respect to $g$ satisfies the integral property of $\delta(x)$ given above. If $f$ is a function defined on all of $\mathbb{R}$ (and continuous at $x_0$), then we can use the Riemann-Stieltjes integral to determine the value of $f$ at a specific point (say $x = x_0$):
$$\int_{-\infty}^{\infty} f(x)\,dg(x - x_0) = f(x_0)\left[\lim_{x\downarrow x_0} g(x - x_0) - \lim_{x\uparrow x_0} g(x - x_0)\right] = f(x_0)[1] = f(x_0).$$

Note. Another, more sophisticated solution involves the Dirac measure concentrated at $x_0$, denoted $\delta_{x_0}$. For a $\sigma$-algebra $\mathcal{M}$ of subsets of a set $X$ and a point $x_0$ belonging to $X$, the measure of 1 is assigned to a set in $\mathcal{M}$ that contains $x_0$, and a measure of 0 is assigned to a set that does not contain $x_0$. We then have for measurable function $f$ on $X$ that $\int_X f\,d\delta_{x_0} = f(x_0)$, as desired. This is explored in Problem 18.26 (in 18.2. Integration of Nonnegative Measurable Functions) and Problem 18.27(ii) (in 18.3. Integration of General Measurable Functions).

Note. The Dirac delta function is named for Paul A. M. Dirac (August 8, 1902–October 20, 1984). Another approach is to treat $\delta(x)$ not as a function, but as a distribution (or a generalized function). It is therefore more accurately called the "Dirac delta distribution." A classical work on the theory of distributions is I. M. Gel'fand and G. E. Shilov's Generalized Functions, Volumes 1–5, Academic Press (1966–1968).
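Note. The point-evaluation property of the Riemann-Stieltjes integral above can also be seen in a direct numerical approximation: in a Riemann-Stieltjes sum $\sum_i f(t_i)\,[g(x_{i+1} - x_0) - g(x_i - x_0)]$, every increment of the integrator vanishes except on the subinterval containing the jump at $x_0$. Here is a minimal sketch, assuming NumPy; the helper `rs_sum` and its grid parameters are purely for illustration.

```python
import numpy as np

def g(x):
    # Unit step function: g(x) = 0 for x < 0 and g(x) = 1 for x >= 0.
    return np.where(x >= 0, 1.0, 0.0)

def rs_sum(f, x0, a=-10.0, b=10.0, num=100_001):
    # Riemann-Stieltjes sum of f with respect to g(x - x0) over [a, b].
    # Only the subinterval containing x0 has a nonzero increment, so the
    # sum collapses to f evaluated at a tag near x0.
    x = np.linspace(a, b, num)
    tags = 0.5 * (x[:-1] + x[1:])            # evaluation points (midpoints)
    dg = g(x[1:] - x0) - g(x[:-1] - x0)      # increments of the integrator
    return np.sum(f(tags) * dg)

print(rs_sum(np.cos, 0.0))  # approximately cos(0) = 1
print(rs_sum(np.exp, 1.0))  # approximately e = 2.71828...
```

As the partition is refined, the single surviving tag approaches $x_0$, which is exactly why continuity of $f$ at $x_0$ is needed for the integral to recover $f(x_0)$.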
Revised: 11/9/2020.