Approximating with Gaussians

Craig Calcaterra and Axel Boldt
Metropolitan State University
[email protected]
June 11, 2013

Abstract

Linear combinations of translations of a single Gaussian, $e^{-x^2}$, are shown to be dense in $L^2(\mathbb{R})$. Two algorithms for determining the coefficients for the approximations are given, using orthogonal Hermite functions and least squares. Taking the Fourier transform of this result shows low-frequency trigonometric series are dense in $L^2$ with Gaussian weight.

Key Words: Hermite series, Fourier transform, low-frequency trigonometric series

AMS Subject Classifications: 41A30, 42A32, 42C10

1 Linear combinations of Gaussians with a single variance are dense in $L^2$

$L^2(\mathbb{R})$ denotes the space of square integrable functions $f : \mathbb{R} \to \mathbb{R}$ with norm $\|f\|_2 := \left( \int_{\mathbb{R}} |f(x)|^2\, dx \right)^{1/2}$. We use $f \underset{\epsilon}{\approx} g$ to mean $\|f - g\|_2 < \epsilon$. The following result was announced in [4].

Theorem 1 For any $f \in L^2(\mathbb{R})$ and any $\epsilon > 0$ there exist $t > 0$, $N \in \mathbb{N}$, and $a_n \in \mathbb{R}$ such that
$$f \underset{\epsilon}{\approx} \sum_{n=0}^{N} a_n e^{-(x-nt)^2}.$$

Proof. Since the span of the Hermite functions is dense in $L^2(\mathbb{R})$ we have for some $N$
$$f \underset{\epsilon/2}{\approx} \sum_{n=0}^{N} b_n \frac{d^n}{dx^n}\left(e^{-x^2}\right). \tag{1}$$

Now use finite backward differences to approximate the derivatives. We have for some small $t > 0$
$$\sum_{n=0}^{N} b_n \frac{d^n}{dx^n}\left(e^{-x^2}\right) \underset{\epsilon/2}{\approx} b_0 e^{-x^2} + b_1 \frac{1}{t}\left[ e^{-x^2} - e^{-(x-t)^2} \right] + b_2 \frac{1}{t^2}\left[ e^{-x^2} - 2e^{-(x-t)^2} + e^{-(x-2t)^2} \right]$$
$$+\, b_3 \frac{1}{t^3}\left[ e^{-x^2} - 3e^{-(x-t)^2} + 3e^{-(x-2t)^2} - e^{-(x-3t)^2} \right] + \cdots = \sum_{n=0}^{N} b_n \frac{1}{t^n} \sum_{k=0}^{n} (-1)^k \binom{n}{k} e^{-(x-kt)^2}. \tag{2}$$
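As a concrete illustration of the two steps above, here is a minimal numerical sketch (numpy and scipy assumed; the helper names are ours, not from the paper). It computes the coefficients $b_n$ of section 2 by quadrature, converts them to the Gaussian-sum coefficients exactly as in (2), and reports a rough error for a smooth test function. Modest values of $N$ and $t$ are used because, as discussed below, the $1/t^n$ factors quickly exhaust double precision.

```python
import numpy as np
from scipy.integrate import quad
from scipy.special import binom, eval_hermite, factorial

def b_coeff(f, n):
    # b_n = (-1)^n / (n! 2^n sqrt(pi)) * integral of f(x) H_n(x) dx,
    # the coefficient derived in section 2 (Eq. (5), with the cutoff chi
    # dropped, valid here because this test f decays rapidly).
    val, _ = quad(lambda x: f(x) * eval_hermite(n, x), -15, 15, limit=200)
    return (-1)**n * val / (factorial(n) * 2**n * np.sqrt(np.pi))

def gaussian_sum(f, N, t, x):
    # Collect a_k = sum_{n=k}^N b_n (-1)^k C(n,k) / t^n as in Eq. (2)/(6),
    # then evaluate sum_k a_k exp(-(x - kt)^2).
    b = [b_coeff(f, n) for n in range(N + 1)]
    a = [sum(b[n] * (-1)**k * binom(n, k) / t**n for n in range(k, N + 1))
         for k in range(N + 1)]
    return sum(a[k] * np.exp(-(x - k * t)**2) for k in range(N + 1))

x = np.linspace(-5, 5, 500)
f = lambda x: np.exp(-(x - 1)**2 / 2)          # smooth, rapidly decaying test
err = np.abs(gaussian_sum(f, N=8, t=0.1, x=x) - f(x))
print(err.max())                               # rough sup-norm error on the grid
```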

This result may be surprising; it promises we can approximate to any degree of accuracy a function such as the following characteristic function of an interval
$$\chi_{[-11,-10]}(x) := \begin{cases} 1 & \text{for } x \in [-11,-10] \\ 0 & \text{otherwise} \end{cases}$$

with support far from the means of the Gaussians $e^{-(x-nt)^2}$, which are located in $[0, \infty)$ at the points $x = nt$. The graphs of these functions $e^{-(x-nt)^2}$ are extremely simple geometrically, being Gaussians with the same variance. We only use the right translates, and they all shrink precipitously (exponentially) away from their means.

[Figure: $\sum a_n e^{-(x-nt)^2} \approx$ characteristic function?]

Surely there is a gap in this sketchy little proof? No. We will, however, flesh out the details in section 2. The coefficients $a_n$ are explicitly calculated and the $L^2$ convergence carefully justified. But these details are elementary. We include them in the interest of appealing to a broader audience.

Then is this merely another pathological curiosity from analysis? We probably need impractically large values of $N$ to approximate any interesting functions.

No, $N$ need only be as large as the Hermite expansion demands. Certainly this particular approach depends on the convergence of the Hermite expansion, and for many applications Hermite series converge more slowly than other Fourier approximations; after all, Hermite series converge on all of $\mathbb{R}$ while, e.g., trigonometric series focus on a bounded interval. Hermite expansions do have powerful convergence properties, though. For example, Hermite series converge uniformly on compact subsets whenever $f$ is twice continuously differentiable (i.e., $C^2$) and $O\left(e^{-cx^2}\right)$ for some $c > 1$ as $x \to \infty$. Alternately, if $f$ has finitely many discontinuities but is still $C^2$ elsewhere and $O\left(e^{-cx^2}\right)$, the expansion again converges uniformly on any closed interval which avoids the discontinuities [15], [16]. If $f$ is smooth and properly bounded, the Hermite series converges faster than algebraically [7].

Then is the method unstable? Yes, there are two serious drawbacks to using Theorem 1.

1. Numerical differentiation is inherently unstable. Fortunately we are estimating the derivatives of Gaussians, which are as smooth and bounded as we could hope, and so we have good control with an explicit error formula. It is true, though, that dividing by $t^n$ for small $t$ and large $n$ will eventually lead to huge coefficients $a_n$ and round-off error. There are quite a few general techniques available in the literature for combating round-off error in numerical differentiation. We review the well-known $n$-point difference formulas for derivatives in section 6.

2. The surprising approximation is only possible because it is weaker than the typical convergence of a series in the mean. Unfortunately
$$f(x) \neq \sum_{n=0}^{\infty} a_n e^{-(x-nt)^2}.$$

Theorem 1 requires recalculating all the $a_n$ each time $N$ is increased. Further, the $a_n$ are not unique. The least squares best choice of the $a_n$ is calculated in section 3, but this approach gives an ill-conditioned matrix. A different formula for the $a_n$ is given in Theorem 3 which is more computationally efficient.

Despite these drawbacks the result is worthy of note because of the new and unexpected opportunities which arise from an approximation method with such simple functions. In this vein, section 4 details an interesting corollary of Theorem 1: apply the Fourier transform to see that low-frequency trigonometric series are dense in $L^2(\mathbb{R})$ with Gaussian weight function.

2 Calculating the coefficients with orthogonal functions

In this section Theorem 3 gives an explicit formula for the coefficients $a_n$ of Theorem 1. Let's review the details of the Hermite-inspired expansion
$$f(x) = \sum_{n=0}^{\infty} b_n \frac{d^n}{dx^n}\left(e^{-x^2}\right)$$
claimed in the proof. The formula for these coefficients is
$$b_n := \frac{1}{n!\,2^n\sqrt{\pi}} \int_{\mathbb{R}} f(x)\, e^{x^2} \frac{d^n}{dx^n}\left(e^{-x^2}\right) dx.$$
Be warned this is not precisely the standard Hermite expansion, but a simple adaptation to our particular requirements.

Let's check this formula for the $b_n$ using the techniques of orthogonal functions. Remember the following properties of the Hermite polynomials $H_n$ ([16], e.g.). Define $H_n(x) := (-1)^n e^{x^2} \frac{d^n}{dx^n} e^{-x^2}$. The set of Hermite functions
$$\left\{ h_n(x) := \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}}\, H_n(x)\, e^{-x^2/2} : n \in \mathbb{N} \right\}$$
is a well-known basis of $L^2(\mathbb{R})$ and is orthonormal since
$$\int_{\mathbb{R}} H_m(x) H_n(x)\, e^{-x^2} dx = n!\,2^n\sqrt{\pi}\,\delta_{m,n}. \tag{3}$$
This means given any $g \in L^2(\mathbb{R})$ it is possible to write
$$g(x) = \sum_{n=0}^{\infty} c_n \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}}\, H_n(x)\, e^{-x^2/2} \tag{4}$$
(equality in the $L^2$ sense) where

$$c_n := \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}} \int_{\mathbb{R}} g(x)\, H_n(x)\, e^{-x^2/2}\, dx \in \mathbb{R}.$$

The necessity of this formula for $c_n$ can easily be checked by multiplying both sides of (4) by $H_n(x)\, e^{-x^2/2}$, integrating, and applying (3). However, we want
$$f(x) = \sum_{n=0}^{\infty} b_n \frac{d^n}{dx^n} e^{-x^2}$$

so apply this process to $g(x) = f(x)\, e^{x^2/2}$. But $f(x)\, e^{x^2/2}$ may not be square integrable. If it is not, we must truncate it: $f(x)\, e^{x^2/2} \chi_{[-M,M]}(x)$ is $L^2$ for any $M < \infty$ and $f \cdot \chi_{[-M,M]} \underset{\epsilon/3}{\approx} f$ for a sufficiently large choice of $M$. Now we get new $c_n$ as follows
$$f(x)\, e^{x^2/2} \chi_{[-M,M]}(x) = \sum_{n=0}^{\infty} c_n \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}}\, H_n(x)\, e^{-x^2/2} \quad\text{so}$$
$$f(x)\, \chi_{[-M,M]}(x) = \sum_{n=0}^{\infty} c_n \frac{(-1)^n}{\sqrt{n!\,2^n\sqrt{\pi}}}\, (-1)^n H_n(x)\, e^{-x^2} = \sum_{n=0}^{\infty} b_n \frac{d^n}{dx^n} e^{-x^2}$$
where

$$c_n = \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}} \int_{\mathbb{R}} f(x)\, e^{x^2/2} \chi_{[-M,M]}(x)\, H_n(x)\, e^{-x^2/2}\, dx = \frac{1}{\sqrt{n!\,2^n\sqrt{\pi}}} \int_{\mathbb{R}} f(x)\, \chi_{[-M,M]}(x)\, H_n(x)\, dx$$

so we must have
$$b_n = c_n \frac{(-1)^n}{\sqrt{n!\,2^n\sqrt{\pi}}} = \frac{1}{n!\,2^n\sqrt{\pi}} \int_{\mathbb{R}} f(x)\, \chi_{[-M,M]}(x)\, e^{x^2} \frac{d^n}{dx^n}\left(e^{-x^2}\right) dx. \tag{5}$$
Now the second step of the proof of Theorem 1 claims that the Gaussian's derivatives may be approximated by divided backward differences

$$\frac{d^n}{dx^n} e^{-x^2} \approx \frac{1}{t^n} \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, e^{-(x-kt)^2}$$
in the $L^2(\mathbb{R})$ norm. We'll use "big oh" notation: for a real function $\Psi$ the statement "$\Psi(t) = O(t)$ as $t \to 0$" means there exist $K > 0$ and $\delta > 0$ such that $|\Psi(t)| < K|t|$ for $0 < |t| < \delta$.

Proposition 2 For each $n \in \mathbb{N}$ and $p \in (0, \infty)$
$$\left( \int_{\mathbb{R}} \left| \frac{d^n}{dx^n} e^{-x^2} - \frac{1}{t^n} \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, e^{-(x-kt)^2} \right|^p dx \right)^{1/p} = O(t).$$

Proof. In Appendix 6 the pointwise formula is derived:

$$\frac{d^n}{dx^n} g(x) = \frac{1}{t^n} \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, g(x - kt) - \frac{t}{(n+1)!} \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, k^{n+1} g^{(n+1)}(\xi_k)$$
where all of the $\xi_k$ lie between $x$ and $x - nt$. Therefore the proposition holds with $g(x) = e^{-x^2}$ since $g^{(n+1)}(\xi_k)$ is integrable for each $k$. This is not perfectly obvious because we don't have explicit formulae for the $\xi_k$. But the tails of $g^{(n+1)}$ vanish exponentially, the continuity of $g^{(n+1)}$ guarantees a finite maximum on the bounded interval between the tails, and $|\xi_k - x| < k|t|$.

Continuing the derivation of the coefficients $a_n$ we now have for sufficiently small $t \neq 0$

$$f \underset{\epsilon}{\approx} \sum_{n=0}^{N} b_n \frac{1}{t^n} \sum_{k=0}^{n} (-1)^k \binom{n}{k}\, e^{-(x-kt)^2} = \sum_{k=0}^{N} \left[ \sum_{n=k}^{N} b_n \frac{(-1)^k}{t^n} \binom{n}{k} \right] e^{-(x-kt)^2} \tag{6}$$

In the last equality we just switched the order of summation (see [9], section 2.4 for an overview of such tricks). Combining (5) and (6) we have

Theorem 3 For any $f \in L^2(\mathbb{R})$ and any $\epsilon > 0$ there exist $N \in \mathbb{N}$ and $t_0 > 0$ such that for any $t \neq 0$ with $|t| < t_0$

$$f \underset{\epsilon}{\approx} \sum_{n=0}^{N} a_n e^{-(x-nt)^2}$$
for some choice of $a_n \in \mathbb{R}$ dependent on $N$ and $t$.

If $f(x)\, e^{x^2/2}$ is square integrable, then one choice of coefficients is

$$a_n = \frac{(-1)^n}{n!\sqrt{\pi}} \sum_{k=n}^{N} \frac{1}{(k-n)!\,(2t)^k} \int_{\mathbb{R}} f(x)\, e^{x^2} \frac{d^k}{dx^k}\left(e^{-x^2}\right) dx.$$
If $f(x)\, e^{x^2/2}$ is not square integrable, replace $f$ in the above formula with $f \cdot \chi_{[-M,M]}$ where $M$ is chosen large enough that $\left\| f - f \cdot \chi_{[-M,M]} \right\|_2 < \epsilon$.

Remark 4 The approximation in Theorem 3 also holds on $C[a,b]$ with the uniform norm, since the Hermite expansion is uniformly convergent on $C^2[a,b]$ (see [15], [16]) and the finite difference formula's error term from Appendix 6 converges to 0 uniformly as $t \to 0^+$. The Stone-Weierstrass Theorem does not apply in this situation because linear combinations of Gaussians with a single variance do not form an algebra.

Remark 5 As a consequence of Theorem 3, for any $\epsilon > 0$ the closed linear span of $\left\{ e^{-(x-s)^2} : s \in [0, \epsilon) \right\}$ is $L^2(\mathbb{R})$. It is even sufficient to replace $[0, \epsilon)$ with $\left\{ \frac{i}{2^j} : i, j \in \mathbb{N} \right\} \cap [0, \epsilon)$.

Let's explore some concrete examples in applying Theorem 3. Choose an interesting function with discontinuities and some support on the negative axis:
$$f(x) := (x-1)^2 \chi_{[-1,2]}(x) = \begin{cases} (x-1)^2 & \text{for } x \in [-1,2] \\ 0 & \text{otherwise} \end{cases}$$
and observe graphically:

[Figure: $f(x) := (x-1)^2 \chi_{[-1,2]}(x)$; Hermite series $N = 20$; Hermite series $N = 40$]

[Figure: Theorem 3 with $N = 20$, $t = .05$; $N = 20$, $t = .01$; $N = 40$, $t = .01$]

The Hermite approximation is slowed by discontinuities, but does converge. The next choice of $f$ is continuous but not smooth.

[Figure: $f(x) := (\sin x)\, \chi_{[-\pi,\pi]}(x)$; Hermite expansion $N = 10$; Hermite expansion $N = 20$]

[Figure: Theorem 3 with $N = 10$, $t = .01$; $N = 20$, $t = .05$; $N = 20$, $t = .01$]

In section 6 we review a standard technique for accelerating this convergence in $t$. In our experiments, though, we've found the Hermite expansion is generally the bottleneck, not the round-off error of the difference approximations for the derivatives of $e^{-x^2}$.

[Figure: Hermite expansion $N = 60$; $N = 100$; $N = 120$]

We need about 120 terms before visual accuracy is achieved for this simple function. There is a host of methods in the literature for improving convergence of the Hermite expansion, but generally we have better success with functions that are smooth and bounded [7]. Our last examples in this section illustrate how convergence is faster for functions which are smooth and "clamped off", meaning multiplied by $(x-a)^n (x+a)^n \chi_{[-a,a]}$, whether or not they are positive or symmetric.

[Figure: two "clamped off" examples, each with Hermite $N = 10$ and Hermite $N = 25$]

3 Calculating the coefficients with least squares

Theorem 1 promises any $L^2$ function can be approximated $f(x) \approx \sum_{n=0}^{N} a_n e^{-(x-nt)^2}$. Theorem 3 gives a formula for the coefficients $a_n$, but this formula is not unique, and in fact is not "best" according to the classical continuous least squares technique.

[Figure: Least squares approximation, $N = 5$, $t = .01$; Theorem 3 approximation, $N = 5$, $t = .01$]

In least squares we minimize the error function

$$E_2(a_0, \ldots, a_N) := \int_{\mathbb{R}} \left( f(x) - \sum_{n=0}^{N} a_n e^{-(x-nt)^2} \right)^2 dx$$

by setting $\frac{\partial E_2}{\partial a_j} = 0$ for $j = 0, \ldots, N$ and solving for the $a_n$. These $N+1$ linear equations are called the normal equations. The matrix form of this system is $M\vec{v} = \vec{b}$ where $M$ is the matrix
$$M = \left[ \sqrt{\frac{\pi}{2}}\; e^{-\left( k^2 + j^2 - \frac{(k+j)^2}{2} \right) t^2} \right]_{j,k=0}^{N}$$
and
$$\vec{v} = \left[ a_j \right]_{j=0}^{N} \quad\text{and}\quad \vec{b} = \left[ \int_{\mathbb{R}} f(x)\, e^{-(x-jt)^2}\, dx \right]_{j=0}^{N}.$$
$M$ is symmetric and invertible, so we can always solve for the $a_n$. But these least squares matrices are notorious for being ill-conditioned when using non-orthogonal approximating functions; the Hilbert matrix is the archetypal example. The current application is no exception: the matrix entries are very similar for most choices of $N$ and $t$, so round-off error is extreme. Choosing $N = 7$ instead of 5 in the graphed example above requires almost 300 significant digits.

4 Low-frequency trig series are dense in $L^2$ with Gaussian weight

For $f \in L^2(\mathbb{R}, \mathbb{C})$ define the norm
$$\|f\|_{2,G} := \left( \int_{\mathbb{R}} |f(x)|^2\, e^{-x^2}\, dx \right)^{1/2}.$$
Write $f \underset{\epsilon,G}{\approx} g$ to mean $\|f - g\|_{2,G} < \epsilon$.

Theorem 6 For every $f \in L^2(\mathbb{R}, \mathbb{C})$ and $\epsilon > 0$ there exist $N \in \mathbb{N}$ and $t_0 > 0$ such that for any $t \neq 0$ with $|t| < t_0$

$$f(x) \underset{\epsilon,G}{\approx} \sum_{n=0}^{N} a_n e^{-intx}$$
for some choice of $a_n \in \mathbb{C}$ dependent on $N$ and $t$.

Proof. We use the Fourier transform with convention
$$\mathcal{F}[f](s) = \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} f(x)\, e^{-isx}\, dx.$$

$\mathcal{F}$ is a linear isometry of $L^2(\mathbb{R}, \mathbb{C})$ with

$$\mathcal{F}\left[ e^{-\alpha x^2} \right] = \frac{1}{\sqrt{2\alpha}}\, e^{-\frac{s^2}{4\alpha}}, \qquad \mathcal{F}[f(x+r)] = e^{irs}\, \mathcal{F}[f(x)], \qquad \mathcal{F}[g * h] = \sqrt{2\pi}\, \mathcal{F}[g]\, \mathcal{F}[h],$$
where $*$ is convolution.

Let $f \in L^2$; we now show $f_2(x) := \frac{1}{\sqrt{2\pi}}\, e^{-x^2} * \mathcal{F}^{-1}[f](x) \in L^2$. Notice $g := \mathcal{F}^{-1}[f] \in L^2$ and

$$\|f_2\|_2^2 = \int_{\mathbb{R}} \left| \frac{1}{\sqrt{2\pi}} \int_{\mathbb{R}} g(x-y)\, e^{-y^2}\, dy \right|^2 dx \le \frac{1}{2\pi} \int_{\mathbb{R}} \int_{\mathbb{R}} |g(x-y)|^2\, e^{-2y^2}\, dy\, dx$$
$$= c_1 \left\| W_{t_0}\left[ |g|^2 \right] \right\|_1 = c_1 \left\| |g|^2 \right\|_1 = c_1 \|g\|_2^2 = c_1 \|f\|_2^2 < \infty$$
for some $c_1 > 0$. Here $W_t[h]$ is the solution to the diffusion equation at time $t$ with initial condition $h$. (The notation $W$ refers to the Weierstrass transform.) The third equality in the previous calculation holds because $W_t$ maintains the $L^1$ integral of any positive initial condition $h$ for all time $t > 0$ [17]. Now approximate the real and imaginary parts of $f_2$ with Theorem 3. Then we get

$$\frac{1}{\sqrt{2\pi}}\, e^{-x^2} * \mathcal{F}^{-1}[f](x) \underset{\epsilon}{\approx} \sum_{n=0}^{N} a_n e^{-(x-nt)^2}, \qquad a_n \in \mathbb{C},$$
and applying $\mathcal{F}$ gives

$$\frac{1}{\sqrt{2}}\, e^{-s^2/4} f(s) \underset{\epsilon}{\approx} \sum_{n=0}^{N} a_n e^{-ints}\, \frac{1}{\sqrt{2}}\, e^{-s^2/4}.$$
Hence
$$f(s) \underset{\sqrt{2}\epsilon,\,G}{\approx} \sum_{n=0}^{N} a_n e^{-ints}$$

using the fact that $e^{-s^2/4} > e^{-s^2}$.

This result is surprising, even in the context of this paper, because, for instance, series of the form $\sum_{n=-N}^{N} a_n e^{-i(x+nt)}$ for all $t$ and $a_n$ are not dense in $L^2$ and in fact only inhabit a 4-dimensional subspace of the infinite dimensional Hilbert space [3].

Corollary 7 On any finite interval $[a,b]$, for any $\omega > 0$, the finite linear combinations of sine and cosine functions with frequency lower than $\omega$ are dense in $L^2([a,b], \mathbb{R})$.

Proof. On $[a,b]$ the Gaussian is bounded and so the norms with or without weight function are equivalent. Apply Theorem 6 to $f \in L^2([a,b], \mathbb{R})$ and choose $t$ such that $Nt < \omega$ to get

$$f \underset{\epsilon}{\approx} \sum_{n=0}^{N} \operatorname{Re}(a_n) \cos(ntx) + \operatorname{Im}(a_n) \sin(ntx)$$
where

$$a_n = \frac{(-1)^n}{n!\,\sqrt{2}\,\pi} \sum_{k=n}^{N} \frac{1}{(k-n)!\,(2t)^k} \int_{\mathbb{R}} \left[ e^{-x^2} * \mathcal{F}^{-1}[f](x) \right] e^{x^2} \frac{d^k}{dx^k}\left(e^{-x^2}\right) dx.$$

Applying Remark 5 to this result shows even discrete sets of positive frequencies that approach 0 make the span of the corresponding sine and cosine functions equal to $L^2([a,b], \mathbb{R})$. Finally, low-frequency cosines span the even functions:

Proposition 8 On any finite interval $[0,b]$, for any $\omega > 0$, the finite linear combinations of cosine functions with frequency lower than $\omega$ are dense in $L^2([0,b], \mathbb{R})$.

Proof. Let $f \in L^2([0,b], \mathbb{R})$ and extend it as an even function on $[-b,b]$. Now use the previous corollary to write

$$f \underset{\epsilon}{\approx} \sum_{n=0}^{N} a_n \cos(ntx) + b_n \sin(ntx).$$

We'd like to conclude right now that the $b_n = 0$ or $b_n \approx 0$, but that is not true. However, every function $g$ on $[-b,b]$ may be written uniquely as a sum of even and odd functions

$$g = g_e + g_o, \qquad g_e(x) = \frac{g(x) + g(-x)}{2}, \qquad g_o(x) = \frac{g(x) - g(-x)}{2},$$

and so $g \underset{\epsilon}{\approx} h \Rightarrow g_e \underset{\epsilon}{\approx} h_e$. Therefore
$$f = f_e \underset{\epsilon}{\approx} \left[ \sum_{n=0}^{N} a_n \cos(ntx) + b_n \sin(ntx) \right]_e = \sum_{n=0}^{N} a_n \cos(ntx).$$

Beware this last result; it's not as strong as Fourier approximation. The coefficients for the sine functions calculated above may be large; the proposition merely promises the linear combination of the sine terms is small. Using least squares, however, yields vanishing sine coefficients.
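This even/odd cancellation is easy to watch numerically. Below is a least squares sketch of Proposition 8 (numpy assumed; the target and parameters are our toy choices): we fit an even extension on $[-b, b]$ using frequencies $nt$ that all stay below the interval's fundamental frequency, and the fitted sine coefficients come out at rounding-error level.

```python
import numpy as np

b_end, N, t = 1.0, 8, 0.2                    # frequencies n*t <= 1.6 < pi/b_end
xs = np.linspace(-b_end, b_end, 2001)        # symmetric grid on [-b, b]
target = np.abs(xs)                          # even extension of f(x) = x on [0, b]

# Design matrix: cosines of frequencies 0, t, ..., Nt, then sines t, ..., Nt.
cols = [np.cos(n * t * xs) for n in range(N + 1)]
cols += [np.sin(n * t * xs) for n in range(1, N + 1)]
A = np.stack(cols, axis=1)

coef, *_ = np.linalg.lstsq(A, target, rcond=None)
print(np.abs(coef[N + 1:]).max())            # sine coefficients vanish by symmetry
```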

5 Origins and generalizations

The mathematical inspiration for Theorem 1 comes from geometrical investigations in infinite dimensional control theory. We noticed that function translation and vector translation in $L^2(\mathbb{R})$ do not commute. Specifically, "function translation" is a flow on the infinite dimensional vector space $L^2(\mathbb{R})$ given by the map $F : L^2(\mathbb{R}) \times \mathbb{R} \to L^2(\mathbb{R})$ where $F_t(f)(x) := f(x+t)$. "Vector translation" in the direction of $g \in L^2(\mathbb{R})$ is the flow $G : L^2(\mathbb{R}) \times \mathbb{R} \to L^2(\mathbb{R})$ where $G_t(f) := f + tg$. Taking for example $g(x) := e^{-x^2}$ and composing $F$ and $G$ we see $F_t \circ G_t \neq G_t \circ F_t$ since for $f \equiv 0$

$$F_t \circ G_t(f)(x) = te^{-(x+t)^2} \quad\text{while}\quad G_t \circ F_t(f)(x) = te^{-x^2}.$$

Notice however the key fact

$$\frac{F_t \circ G_t - G_t \circ F_t}{t^2}(f) \to \frac{d}{dx}\left(e^{-x^2}\right) \text{ as } t \to 0.$$
In finite dimensions the commutator quotient above gives the Lie bracket $[X,Y]$ of the vector fields $X$ and $Y$ which generate the flows $F$ and $G$, respectively. A fundamental result in finite-dimensional control theory states that the reachable set via $X$ and $Y$ is given by the integral surface of the distribution made up of iterated Lie brackets starting from $X$ and $Y$ (Chow's Theorem, which is an interpretation of Frobenius' Foliation Theorem; see [13], e.g.). The idea we are exploiting is that iterated Lie brackets for our flows $F$ and $G$ will give successive derivatives of the Gaussian, whose span is dense in $L^2(\mathbb{R})$. Consequently, the reachable set via $F$ and $G$ from $f \equiv 0$ should be all of $L^2(\mathbb{R})$. That is to say, sums of translates and multiples of one Gaussian (with fixed variance) can approximate any square integrable function.

Unfortunately this program doesn't automatically work on the infinite dimensional vector space $L^2(\mathbb{R})$ since the function translation flow is not generated by a simple vector field on $L^2(\mathbb{R})$. So instead of studying vector fields, we consider flows as primary. The fundamental results can be rewritten and

still hold in the general context of a metric space [3]. Then other functions besides $g(x) = e^{-x^2}$ can be checked to be derivative generating, and other flows may be used in place of translation. E.g., Fourier approximation is achieved using dilation $F : L^2(\mathbb{R}, \mathbb{C}) \times \mathbb{R} \to L^2(\mathbb{R}, \mathbb{C})$ where $F_t(f)(x) := f(e^t x)$ and $G_t(f)(x) := f(x) + te^{ix}$. This gives us a general tool for determining the density of various families of functions.

Another opportunity for generalizing the results of this paper presents itself with the observation that Hermite expansions are valid for functions defined on $\mathbb{C}$ or $\mathbb{R}^n$ and in spaces of tempered distributions; divided differences work in all of these spaces as well. Note also that while the results of section 2 work for uniform approximations of continuous functions on finite intervals (Remark 4), this is an open question for low-frequency trigonometric approximations.

The results of this paper can be ported to the language of control theory, where we can then conclude the system

$$u_t = c_1(t)\, u_x + c_2(t)\, e^{-x^2} \tag{7}$$

is bang-bang controllable with controls of the form $c_1, c_2 : \mathbb{R}^+ \to \{-1, 0, 1\}$. Theorem 3 drives the initial condition $f \equiv 0$ to any state in $L^2$ under the system (7), but may be nowhere near optimal for approximating a function such as $e^{-(x+10)^2}$, since it uses only Gaussians $e^{-(x+s)^2}$ with choices of $s \ll 10$.

Finally, interpreting Theorem 1 in terms of signal analysis, we see a Gaussian filter is a universal synthesizer with arbitrarily short load time. Let $G(x) := \frac{1}{\sqrt{\pi}}\, e^{-x^2}$. A Gaussian filter is a linear time-invariant system represented by the operator
$$W(f)(x) := (f * G)(x) = \frac{1}{\sqrt{\pi}} \int_{\mathbb{R}} f(y)\, e^{-(y-x)^2}\, dy.$$
Notice if you feed $W$ a Dirac delta distribution $\delta_t$ (an ideal impulse at time $x = t$) you get $W(\delta_t) = G(x-t)$. Then Theorem 1 gives

Corollary 9 For any $f \in L^2(\mathbb{R})$, any $\epsilon > 0$, and any $\tau > 0$ there exist $t > 0$ and $N \in \mathbb{N}$ with $tN < \tau$ such that
$$f \underset{\epsilon}{\approx} W\left( \sum_{n=0}^{N} a_n \delta_{nt} \right)$$
for some choice of $a_n \in \mathbb{R}$.

Feed a Gaussian filter a linear combination of impulses and we can synthesize any signal with arbitrarily small load time $\tau$. The design of physical approximations to an analog Gaussian filter is detailed in [6], [11].
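To make Corollary 9 concrete, here is a minimal sketch (numpy assumed; the coefficients below are arbitrary placeholders, in practice they would come from Theorem 3 or least squares): the filter's response to the impulse train $\sum a_n \delta_{nt}$ is just a superposition of shifted kernel copies $a_n G(x - nt)$.

```python
import numpy as np

def gaussian_filter_response(a, t, xs):
    # W(sum_n a_n delta_{nt})(x) = sum_n a_n G(x - nt), with G(x) = e^{-x^2}/sqrt(pi)
    return sum(a_n * np.exp(-(xs - n * t)**2) / np.sqrt(np.pi)
               for n, a_n in enumerate(a))

xs = np.linspace(-3, 3, 601)
a = [2.0, -1.5, 0.7, 0.4]                       # placeholder impulse weights
y = gaussian_filter_response(a, t=0.05, xs=xs)  # all impulses fire within tau = 0.2
print(y.max())
```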

6 Appendix: Approximating higher derivatives

The results in this paper may be much improved with the wealth of techniques available from numerical analysis. E.g., [8] gives an algorithm which speeds the calculation of sums of Gaussians, and [10] explores Hermite expansion acceleration useful in step 1 of the proof of Theorem 1. This section is devoted to reviewing methods which improve the error in step 2, approximating derivatives of the Gaussian with finite differences. We also derive the error formula used in Proposition 2. Above we approximated derivatives with the formula
$$\frac{d^n}{dx^n} f(x) = \underbrace{\frac{1}{t^n} \sum_{k=0}^{n} (-1)^{n-k} \binom{n}{k}\, f(x+kt)}_{\text{gives round-off error as } t \to 0^+} + \underbrace{O(t)}_{\text{truncation error}}. \tag{8}$$
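The competition between the two error terms in (8) is easy to exhibit. The sketch below (numpy and scipy assumed; the parameters are our choices) evaluates the forward difference for $\frac{d^6}{dx^6} e^{-x^2}$ at a point: at moderate $t$ the error shrinks roughly linearly in $t$, matching the $O(t)$ rate of Proposition 2, and then explodes once the $1/t^n$ division amplifies round-off.

```python
import numpy as np
from scipy.special import binom, eval_hermite

n, x = 6, 0.5
# Exact value via d^n/dx^n e^{-x^2} = (-1)^n H_n(x) e^{-x^2}.
exact = (-1)**n * eval_hermite(n, x) * np.exp(-x**2)

for t in [10.0**(-e) for e in range(1, 8)]:
    fd = sum((-1)**(n - k) * binom(n, k) * np.exp(-(x + k * t)**2)
             for k in range(n + 1)) / t**n
    print(f"t = {t:.0e}   error = {abs(fd - exact):.3e}")
```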

The Nörlund-Rice integral may be of interest for extremely large $n$ as it avoids the calculation of the binomial coefficients by evaluating a complex integral. In this section, though, we devote our attention to deriving $n$-point formulas; these formulas decrease round-off error by increasing the number of evaluations $f(x + k_i t)$, shrinking the truncation error without sending $t \to 0$.

In approximating the $k$th derivative with an $n+1$ point formula
$$f^{(k)}(x) \approx \frac{1}{t^k} \sum_{i=0}^{n} c_i f(x + k_i t)$$
we wish to calculate the coefficients $c_i$. In the forward difference method $k_i = i$, but keeping these values general allows us to find the coefficients for the central or backward difference formulas just as easily. The following method for finding the $c_i$ was shown to us by our student Jeffrey Thornton, who rediscovered the formula. Taylor's Theorem gives

$$f(x + k_i t) = \sum_{j=0}^{n} \frac{(k_i t)^j}{j!}\, f^{(j)}(x) + \frac{(k_i t)^{n+1}}{(n+1)!}\, f^{(n+1)}(\xi_i)$$
for some $\xi_i$ between $x$ and $x + k_i t$. From this it follows

$$\sum_{i=0}^{n} c_i f(x + k_i t) = \left( \begin{bmatrix} 1 & 1 & \cdots & 1 \\ k_0 & k_1 & \cdots & k_n \\ \frac{k_0^2}{2!} & \frac{k_1^2}{2!} & \cdots & \frac{k_n^2}{2!} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{k_0^n}{n!} & \frac{k_1^n}{n!} & \cdots & \frac{k_n^n}{n!} \\ \frac{k_0^{n+1} f^{(n+1)}(\xi_0)}{(n+1)!} & \frac{k_1^{n+1} f^{(n+1)}(\xi_1)}{(n+1)!} & \cdots & \frac{k_n^{n+1} f^{(n+1)}(\xi_n)}{(n+1)!} \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{bmatrix} \right)^{T} \begin{bmatrix} f(x) \\ t f'(x) \\ \vdots \\ t^n f^{(n)}(x) \\ t^{n+1} \end{bmatrix}$$

Now pick $c = [c_i]$ as a solution to
$$\begin{bmatrix} 1 & 1 & \cdots & 1 \\ k_0 & k_1 & \cdots & k_n \\ \frac{k_0^2}{2!} & \frac{k_1^2}{2!} & \cdots & \frac{k_n^2}{2!} \\ \vdots & \vdots & \ddots & \vdots \\ \frac{k_0^n}{n!} & \frac{k_1^n}{n!} & \cdots & \frac{k_n^n}{n!} \end{bmatrix} \begin{bmatrix} c_0 \\ c_1 \\ \vdots \\ c_n \end{bmatrix} = \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \end{bmatrix} \tag{9}$$
(with the 1 in the $k$th position), which is possible since the $k_i$ are distinct, so the matrix is invertible, as is seen using the Vandermonde determinant
$$\prod_{0 \le i < j \le n} (k_j - k_i) \neq 0.$$

Then we must have
$$\sum_{i=0}^{n} c_i f(x + k_i t) = \begin{bmatrix} 0 \\ \vdots \\ 1 \\ \vdots \\ 0 \\ \frac{1}{(n+1)!} \sum_{i=0}^{n} c_i k_i^{n+1} f^{(n+1)}(\xi_i) \end{bmatrix}^{T} \begin{bmatrix} f(x) \\ t f'(x) \\ \vdots \\ t^n f^{(n)}(x) \\ t^{n+1} \end{bmatrix} = t^k f^{(k)}(x) + \frac{t^{n+1}}{(n+1)!} \sum_{i=0}^{n} c_i k_i^{n+1} f^{(n+1)}(\xi_i),$$
again with the 1 in the $k$th position. Therefore
$$f^{(k)}(x) = \frac{1}{t^k} \sum_{i=0}^{n} c_i f(x + k_i t) + \text{Error}$$
for $c_i$ which satisfy (9), where
$$\text{Error} = -\frac{t^{n+1-k}}{(n+1)!} \sum_{i=0}^{n} c_i k_i^{n+1} f^{(n+1)}(\xi_i).$$
This Error formula shows how truncation error may be decreased by increasing $n$ without shrinking $t$, thus combating round-off error at the expense of increased computation of sums. The coefficients in (8) are obtained by solving (9) for the $c_i$ with $k_i$ chosen as $k_i = i$. Thornton also points out that the $k_i$ may be chosen as complex values when $f$ is analytic (as is the case with our Gaussians). This gives us another opportunity to mitigate round-off error, since a greater quantity of regularly-spaced nodes $k_i$ can be packed into an epsilon ball around zero in the complex plane than on the real line.
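A sketch of solving (9) numerically (numpy assumed; the node sets are our examples). With $k_i = i$ and derivative order $k = n$ the solver reproduces the binomial weights of (8), while other node choices, such as centered nodes, give the classical central difference weights.

```python
import numpy as np
from math import factorial

def fd_coefficients(k_nodes, k):
    # Solve system (9): row j = 0..n holds the entries k_i^j / j!, rhs is e_k.
    n = len(k_nodes) - 1
    A = np.array([[ki**j / factorial(j) for ki in k_nodes]
                  for j in range(n + 1)])
    rhs = np.zeros(n + 1)
    rhs[k] = 1.0
    return np.linalg.solve(A, rhs)

print(fd_coefficients([0, 1, 2, 3], 3))       # binomial weights -1, 3, -3, 1
print(fd_coefficients([-2, -1, 0, 1, 2], 2))  # central 2nd-derivative weights
```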

As a final note we mention there have been numerous advances to the present day in inverting the Vandermonde matrix. We mention only the earliest application to numerical differentiation [14], which gives a formula in terms of the Stirling numbers.

References

[1] Alain Bensoussan, et al., "Representation and Control of Infinite Dimensional Systems," 2nd ed., Springer, 2006.

[2] G. G. Bilodeau, The Weierstrass Transform and Hermite Polynomials, Duke Mathematical Journal, Vol. 29, No. 2, 1962.

[3] Craig Calcaterra, Foliating Metric Spaces, preprint, arXiv:math/0608416, 2006.

[4] Craig Calcaterra, Linear Combinations of Gaussians with a Single Variance are Dense in L2, Proceedings of the World Congress on Engineering, 2008.

[5] S. Darlington, Synthesis of Reactance 4-poles, J. Math. & Phys., 18, pp. 257-353, 1939.

[6] Milton Dishal, Gaussian-Response Filter Design, Electrical Communication, Vol. 36, No. 1, pp. 3-26, 1959.

[7] David Gottlieb and Steven Orszag, "Numerical Analysis of Spectral Methods," SIAM, 1977.

[8] Leslie Greengard and Xiaobai Sun, A New Version of the Fast Gauss Transform, Documenta Mathematica, Extra Volume ICM 1998, III, pp. 575-584.

[9] Donald Knuth, "Concrete Mathematics," 2nd ed., Addison-Wesley, 1994.

[10] Greg Leibon, Daniel Rockmore, and Gregory Chirikjian, A Fast Hermite Transform with Applications to Protein Structure Determination, Proceedings of the 2007 International Workshop on Symbolic-Numeric Computation, ACM, New York, NY, pp. 117-124, 2007.

[11] J. Madrenas, M. Verleysen, P. Thissen, and J. L. Voz, A CMOS Analog Circuit for Gaussian Functions, IEEE Transactions on Circuits and Systems-II: Analog and Digital Signal Processing, Vol. 43, No. 1, 1996.

[12] Anthony Ralston and Philip Rabinowitz, "A First Course in Numerical Analysis," McGraw-Hill, 1978.

[13] Eduardo D. Sontag, "Mathematical Control Theory," 2nd ed., Springer-Verlag, 1998.

[14] A. Spitzbart and N. Macon, Numerical Differentiation Formulas, The American Mathematical Monthly, Vol. 64, No. 10, pp. 721-723, 1957.

[15] M. H. Stone, Developments in Hermite Polynomials, The Annals of Mathematics, 2nd Ser., Vol. 29, No. 1/4, pp. 1-13, 1927-1928.

[16] Gabor Szegő, "Orthogonal Polynomials," American Mathematical Society, 3rd ed., 1967.

[17] David Widder, "The Heat Equation," Pure and Applied Mathematics, Vol. 67, Academic Press, 1975.
