Mollification Formulas and Implicit Smoothing


R. K. Beatson H.-Q. Bui

Research Report UCDMS 2003/19, December 2003 Department of Mathematics and Statistics, University of Canterbury, Christchurch, NEW ZEALAND

[Figure 1: Implicit smoothing applied to a noisy Lidar scan. Panels: Original unsmoothed fit; Smoothed with c = 5mm; Smoothed with c = 10mm; Smoothed with c = 20mm.]

1 Introduction

This paper develops some mollification formulas involving convolutions between popular radial basis function (RBF) basic functions Φ, and suitable mollifiers k. Polyharmonic spline, scaled Bessel kernel (Matern function) and compactly supported basic functions are considered. An application which motivated the development of the formulas is a technique called implicit smoothing. This computationally efficient technique smooths a previously obtained RBF fit by replacing the basic function Φ with a smoother version Ψ during evaluation. In the case of the polyharmonic spline basic functions the smoothed basic function is a generalised multiquadric or shifted thin-plate spline (at least up to a polynomial). Special cases of one of the mollification formulas developed here were given in the 1D setting in [Beatson and Dyn 1996]. That paper concerned error estimates for quasi interpolation with 1D generalised multiquadrics, and showed by elementary methods

$$(\bullet^2+c^2)^{(2j-1)/2} = |\bullet|^{2j-1} \star \left[\, j\,\frac{(2j-1)!!}{(2j)!!}\, c^{2j}\,(\bullet^2+c^2)^{-(2j+1)/2} \right],$$

for $j \in \mathbb{N}$. A multivariate analog relating $\Phi(x) = |x|$ and the multiquadric $\Psi(x) = \sqrt{x^2+c^2}$ in $\mathbb{R}^3$ has been used to smooth implicit surface fits to lidar and laser scanner data (see Figures 1 and 2, and Section 2). This application is detailed in [Carr et al. 2003]. That paper presents the application but not the mathematics underlying it. Implicit smoothing of globally supported RBFs should also have many other applications. [Fasshauer 1999] has used related basic function substitution techniques as part of a process for the numerical solution of PDEs with compactly supported RBFs.

[Figure 2: A noisy lidar scan of a statue in Santa Barbara.]

The purpose of the current paper is to present a mathematical treatment of general versions of these, and related, mollification formulas. Initially our development for the polyharmonic spline case was based on viewing odd powers of $|x|$ as multiples of fundamental solutions of iterated versions of Laplace's equation. As such our treatment was restricted to polyharmonic splines in odd dimensions. Changing to arguments based more directly on generalised functions has enabled many restrictions to be dropped. For example in the case of the results for $|x|^\beta$ (Theorem 4.1) there is no longer any restriction on the parity of the dimension, nor any requirement that the power $\beta$ of $|x|$ be odd or integer. Furthermore we develop analogous formulas for polyharmonic splines in even dimension. Related results for scaled Bessel kernels (Matern functions) and compactly supported radial basis functions are also discussed.

Notation: In this paper the Fourier transform is defined as follows

$$\widehat{f}(\xi) := \int_{\mathbb{R}^d} e^{-ix\xi} f(x)\,dx, \qquad f \in L^1(\mathbb{R}^d).$$

Also, except where explicitly noted, the generalised Fourier transforms that appear are the generalised transforms of the functions viewed as distributions acting on $\mathcal{D}(\mathbb{R}^d\setminus\{0\})$ rather than on $\mathcal{D}(\mathbb{R}^d)$. This convention simplifies the discussion.

2 An application – implicit smoothing

This section concerns an application of the mollification formulas to come to the smoothing of RBF fits. This particular application motivated the development of the formulas. The process will be called implicit smoothing and can be viewed as smoothing an interpolant to noisy data rather than smoothing the data itself.

The process starts with a noisy data set to be approximated. Figure 2 shows one example of such a noisy data set, a "noisy" Lidar scan. Firstly an RBF is fitted to the noisy data yielding an RBF approximation

$$s(x) = p(x) + \sum_{i=1}^{N} \lambda_i \Phi(x - x_i). \tag{2.1}$$

Then the initial RBF approximation is smoothed by convolution with the mollifier $k$ yielding a smoother fit

$$\tilde{s}(x) = q(x) + \sum_{i=1}^{N} \lambda_i \Psi(x - x_i), \tag{2.2}$$

where $q = p \star k$ and $\Psi = \Phi \star k$. Figure 1 shows zoomed in views of the isosurfaces arising when this strategy is applied to the noisy Lidar scan of Figure 2. Here $\Phi(x) = |x|$ is the biharmonic spline basic function in $\mathbb{R}^3$ and $\Psi(x) = \sqrt{x^2+c^2}$ is the ordinary multiquadric. In Figure 1 one can clearly see the amount of smoothing increase with the parameter $c$. Figure 3 shows a thin-plate spline fit to data from the Mexican hat function $f(x) = (1 - x^2)\exp(-x^2/2)$ at 400 scattered points in $\mathbb{R}^2$. Uniform random noise of magnitude 0.7 has been added to the original Mexican hat height data. Here $\Phi(x) = x^2 \log|x|$ is the thin-plate spline and $\Psi(x) = (x^2+c^2)\log\sqrt{x^2+c^2}$ is the shifted thin-plate spline. Implicit smoothing has been employed to obtain an approximation to the noise free signal.

For important choices of $\Phi$, and suitable choices of $k$, the smoothed basic function $\Psi$ turns out to be a simple, easy to evaluate function. Therefore in applying the technique there is no need for any explicit, and computationally expensive, evaluation of convolutions. Rather one fits the initial radial basis function $s$, and then smooths it when evaluating by substituting the smoothed basic function $\Psi$ for the original basic function $\Phi$. Thus the technique can be viewed as smoothing by basic function substitution. In some important cases fast evaluators are available for the smoothed RBF $\tilde{s}$.

Advantages of the technique are

• No explicit convolution to do.

• There is no requirement that the data or evaluation points be gridded.

• Well understood linear filtering. Fourier transform and absolute moments of the smoothing kernel are known.

• Smoothing can be chosen for appropriate frequency or length scale.

• A posteriori parameter for user to play with – “noise level/frequency” need not be known a priori.

Disadvantages of the method

• A posteriori parameter for user to play with.

• Just linear filtering so will blur sharp features.
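The basic function substitution described in this section can be sketched in a few lines. The following is only an illustration under stated assumptions: the centres, coefficients and linear polynomial are made-up values rather than a fit to real data, the helper names (`evaluate_rbf`, `phi`, `make_psi`) are hypothetical, and the polynomial part is taken to be linear so that q = p. It uses Φ(x) = |x| and the multiquadric Ψ(x) = √(|x|² + c²) in R³.

```python
import math

def evaluate_rbf(x, centres, lam, poly, basic):
    """Evaluate poly(x) + sum_i lam_i * basic(|x - x_i|) at a point x."""
    total = poly(x)
    for xi, li in zip(centres, lam):
        total += li * basic(math.dist(x, xi))
    return total

def phi(r):
    """Biharmonic spline basic function in R^3, as a function of r = |x|."""
    return r

def make_psi(c):
    """Ordinary multiquadric with smoothing parameter c, as a function of r."""
    return lambda r: math.sqrt(r * r + c * c)

# Hypothetical, already-fitted RBF: these numbers are illustrative only.
centres = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0), (0.0, 0.0, 1.0)]
lam = [0.5, -0.25, 0.75, -1.0]
poly = lambda x: 1.0 + 0.5 * x[0] - 0.25 * x[2]  # assumed linear, so q = p

c = 0.1
x = (0.3, 0.2, 0.1)
s = evaluate_rbf(x, centres, lam, poly, phi)                # original fit
s_smooth = evaluate_rbf(x, centres, lam, poly, make_psi(c)) # smoothed fit
print(s, s_smooth)
```

No convolution is evaluated: the same coefficients are reused and only the basic function changes. Since 0 ≤ √(r² + c²) − r ≤ c, the two values can differ by at most c times the sum of the |λ_i|, which is a crude bound and not the sharper Korovkin-type estimate developed later in the paper.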

[Figure 3: Various fits to noisy data created from the Mexican hat function. Recoverable panel titles: (a) The Mexican hat function; (b) Exact fit to Mexican hat plus noise data; (d) Smoothed with c = 0.6.]

3 Technical lemmas

The following technical lemmas, which deal with distributions and special functions, will be needed in later sections.

Lemma 3.1. Identification of a convolution

(i) Let $\beta, \epsilon > 0$. Suppose $g \in C(\mathbb{R}^d)$, $g(x) = O(|x|^\beta)$ as $|x| \to \infty$, and $k \in L^\infty(\mathbb{R}^d)$, $k(x) = O(|x|^{-(d+\beta+\epsilon)})$ as $|x| \to \infty$. Then $g \star k$ is a $C(\mathbb{R}^d)$ function with $(g \star k)(x) = O(|x|^\beta)$ as $|x| \to \infty$.

(ii) Let $\beta$, $\epsilon$, $g$ and $k$ be as in part (i). Let $h \in C(\mathbb{R}^d)$ be such that $h(x) = O(|x|^\beta)$ as $|x| \to \infty$. Viewing $\widehat{g}$ and $\widehat{h}$ as tempered distributions suppose that there exist functions $G$ and $H$ in $L^1_{loc}(\mathbb{R}^d\setminus\{0\})$ such that for all test functions $\phi \in \mathcal{D}(\mathbb{R}^d\setminus\{0\})$

$$\langle \widehat{g}, \phi \rangle = \int_{\mathbb{R}^d} G(\xi)\phi(\xi)\,d\xi \qquad\text{and}\qquad \langle \widehat{h}, \phi \rangle = \int_{\mathbb{R}^d} H(\xi)\phi(\xi)\,d\xi.$$

Further writing $\widehat{k}$ for the classical Fourier transform of $k$ suppose

$$G(\xi)\widehat{k}(\xi) = H(\xi), \qquad\text{for almost all } \xi \neq 0.$$

Then

$$g \star k = h + p,$$

where $p$ is a polynomial of degree not exceeding the integer part of $\beta$.

Proof of part (i).

$$(g \star k)(x) = \int g(x-y)k(y)\,dy = O\left(\int (1+|x|+|y|)^\beta (1+|y|)^{-(d+\beta+\epsilon)}\,dy\right)$$
$$= O\left(\int (1+|x|)^\beta (1+|y|)^\beta (1+|y|)^{-(d+\beta+\epsilon)}\,dy\right) = O\left((1+|x|)^\beta\right).$$

Now fix $x \in \mathbb{R}^d$ and let $\{t_n\}$ be a sequence tending to zero in $\mathbb{R}^d$ with $|t_n| \le 1$ for all $n$. By an analogous argument to that above there is a constant $C$ so that

$$|g(x + t_n - y)k(y)| \le C(2+|x|)^\beta (1+|y|)^{-(d+\epsilon)}$$

for almost all $y$. Hence, applying the Lebesgue dominated convergence theorem,

$$\lim_{n\to\infty} (g \star k)(x + t_n) = (g \star k)(x).$$

It follows that $g \star k$ is continuous.

Proof of part (ii). Below we will view $g \star k$ as a tempered distribution. Let $\phi \in \mathcal{D}(\mathbb{R}^d\setminus\{0\})$. Hence $\widehat{\phi} \in \mathcal{S}$, the space of rapidly decreasing functions. The growth conditions on $g$, $k$, and $\widehat{\phi}$ combine to imply that all the iterated integrals below are absolutely convergent, so that the applications of Fubini's Theorem that occur are justified.

$$\langle \widehat{g \star k}, \phi \rangle = \langle g \star k, \widehat{\phi} \rangle = \iint g(x-y)k(y)\,dy\,\widehat{\phi}(x)\,dx$$
$$= \int \left[\int g(z)\widehat{\phi}(y+z)\,dz\right] k(y)\,dy = \int \left[\int g(z)\,\widehat{\left[e^{-iy\bullet}\phi(\bullet)\right]}(z)\,dz\right] k(y)\,dy$$
$$= \int \left\langle g, \widehat{\left[e^{-iy\bullet}\phi(\bullet)\right]} \right\rangle k(y)\,dy = \int \left\langle \widehat{g}, e^{-iy\bullet}\phi(\bullet) \right\rangle k(y)\,dy$$
$$= \int \left[\int G(\xi)e^{-iy\xi}\phi(\xi)\,d\xi\right] k(y)\,dy = \iint G(\xi)e^{-iy\xi}\phi(\xi)k(y)\,dy\,d\xi$$
$$= \int G(\xi)\phi(\xi)\left[\int e^{-iy\xi}k(y)\,dy\right] d\xi = \int G(\xi)\widehat{k}(\xi)\phi(\xi)\,d\xi = \int H(\xi)\phi(\xi)\,d\xi = \langle \widehat{h}, \phi \rangle.$$

Hence $\widehat{g \star k} - \widehat{h}$ is a distribution supported at the origin. Therefore $g \star k = p + h$ where $p$ is a polynomial. The growth of $g \star k$ and $h$ implies that $p$ is of degree at most the integer part of $\beta$.

In the following $B$ is the Beta function

$$B(z,w) := \int_0^\infty \frac{t^z}{(1+t)^{z+w}}\,\frac{dt}{t} = \frac{\Gamma(z)\Gamma(w)}{\Gamma(z+w)},$$

and $\psi$ is the Digamma function $\psi(z) := \Gamma'(z)/\Gamma(z)$ (see e.g. [Abramowitz and Stegun 1965]).

Footnote 1. Care is needed in interpreting the literature as many authors use $\psi(z)$ to denote the function $\Gamma'(z+1)/\Gamma(z+1)$ instead. See for example [Jones 1982, page 114].

Lemma 3.2. Let $c, w, v \in \mathbb{R}$ with $c > 0$ and $v > w/2 > 0$. Then

$$\int_0^\infty \frac{r^w}{(c^2+r^2)^v}\,\frac{dr}{r} = \frac{c^{w-2v}}{2}\,B\left(\frac{w}{2}, v - \frac{w}{2}\right). \tag{3.1}$$

Further

$$I(w,v) := \int_0^\infty \frac{r^w \log r}{(c^2+r^2)^v}\,\frac{dr}{r} = B\left(\frac{w}{2}, v - \frac{w}{2}\right)\frac{c^{w-2v}}{2} \times \left\{\log(c) + \frac{1}{2}\psi\left(\frac{w}{2}\right) - \frac{1}{2}\psi\left(v - \frac{w}{2}\right)\right\} \tag{3.2}$$

and

$$M(w,v) := \int_0^\infty \frac{r^w \log(r^2+c^2)}{(c^2+r^2)^v}\,\frac{dr}{r} = B\left(\frac{w}{2}, v - \frac{w}{2}\right)\frac{c^{w-2v}}{2} \times \left\{2\log(c) + \psi(v) - \psi\left(v - \frac{w}{2}\right)\right\}. \tag{3.3}$$

Proof of the first identity. The assumptions on $v$ and $w$ clearly imply the integral is convergent. Substituting $c^2 t = r^2$

$$\int_0^\infty \frac{r^w}{(c^2+r^2)^v}\,\frac{dr}{r} = \frac{1}{2}\int_0^\infty \frac{(c^2 t)^{w/2}}{(c^2+c^2 t)^v}\,\frac{dt}{t} = \frac{c^w}{2c^{2v}}\int_0^\infty \frac{t^{w/2}}{(1+t)^v}\,\frac{dt}{t} = \frac{c^{w-2v}}{2}\,B\left(\frac{w}{2}, v - \frac{w}{2}\right).$$

Proof of the second identity. Since $\int_0^\infty \frac{r^w |\log r|}{(c^2+r^2)^v}\,\frac{dr}{r} < \infty$ we use the Lebesgue Dominated Convergence Theorem and differentiate under the integral sign to obtain

$$I(w,v) = \frac{d}{dw}\int_0^\infty \frac{r^w}{(c^2+r^2)^v}\,\frac{dr}{r}.$$

Employing (3.1) this implies

$$I(w,v) = \frac{d}{dw}\left\{\frac{c^{w-2v}}{2}\,B\left(\frac{w}{2}, v - \frac{w}{2}\right)\right\} = \frac{d}{dw}\left\{\frac{c^{w-2v}}{2}\,\frac{\Gamma\left(\frac{w}{2}\right)\Gamma\left(v - \frac{w}{2}\right)}{\Gamma(v)}\right\}$$
$$= \frac{1}{2\Gamma(v)}\left\{\log(c)\,c^{w-2v}\,\Gamma\left(\tfrac{w}{2}\right)\Gamma\left(v - \tfrac{w}{2}\right) + \tfrac{1}{2}\psi\left(\tfrac{w}{2}\right)\Gamma\left(\tfrac{w}{2}\right)c^{w-2v}\,\Gamma\left(v - \tfrac{w}{2}\right) - \tfrac{1}{2}c^{w-2v}\,\Gamma\left(\tfrac{w}{2}\right)\psi\left(v - \tfrac{w}{2}\right)\Gamma\left(v - \tfrac{w}{2}\right)\right\}$$
$$= B\left(\frac{w}{2}, v - \frac{w}{2}\right)\frac{c^{w-2v}}{2} \times \left\{\log(c) + \frac{1}{2}\psi\left(\frac{w}{2}\right) - \frac{1}{2}\psi\left(v - \frac{w}{2}\right)\right\}.$$

Proof of the third identity. Proceeding as in the proof of the second integral identity

$$M(w,v) = -\frac{d}{dv}\int_0^\infty \frac{r^w}{(c^2+r^2)^v}\,\frac{dr}{r}.$$

Employing (3.1) this implies

$$M(w,v) = -\frac{d}{dv}\left\{\frac{c^{w-2v}}{2}\,\frac{\Gamma\left(\frac{w}{2}\right)\Gamma\left(v - \frac{w}{2}\right)}{\Gamma(v)}\right\}$$
$$= -\frac{\Gamma\left(\frac{w}{2}\right)}{2}\left\{-2\log(c)\,c^{w-2v}\,\frac{\Gamma\left(v - \frac{w}{2}\right)}{\Gamma(v)} + c^{w-2v}\,\frac{\psi\left(v - \frac{w}{2}\right)\Gamma\left(v - \frac{w}{2}\right)}{\Gamma(v)} - c^{w-2v}\,\frac{\Gamma\left(v - \frac{w}{2}\right)\Gamma'(v)}{(\Gamma(v))^2}\right\}$$
$$= B\left(\frac{w}{2}, v - \frac{w}{2}\right)\frac{c^{w-2v}}{2} \times \left\{2\log(c) + \psi(v) - \psi\left(v - \frac{w}{2}\right)\right\}.$$

4 Mollification formulas for powers of the modulus

In this section mollification formulas will be developed for powers of the modulus. The flavour of the main result, Theorem 4.1, is that the convolution of $|x|^\beta$ against an appropriate inverse multiquadric is the generalised multiquadric $\left(|x|^2+c^2\right)^{\beta/2}$. Further, a quantitative Korovkin Theorem, Proposition 4.2, estimates the distance between the original unsmoothed RBF $s$ and the corresponding smoothed RBF $\tilde{s} = s \star k_{d,\beta,c}$.

Define the generalised multiquadric basic function (the generalised Fourier transform of a Bessel kernel) as

$$\Psi_{\beta,c}(x) = (|x|^2+c^2)^{\beta/2}, \qquad x \in \mathbb{R}^d, \tag{4.1}$$

where $c > 0$. These functions are most often considered in the case that $\beta$ is a positive odd integer. Clearly $\Psi_{\beta,c}$ can be viewed as a smoothed out version of $|x|^\beta$. The results of this section show that the "smoothing out" is actually given by a convolution.

More precisely Theorem 4.1 to come shows that in $\mathbb{R}^d$ and for all $\beta > -d$

$$\Psi_{\beta,c}(x) = (\Phi_\beta \star k_{d,\beta,c})(x), \tag{4.2}$$

where $\Phi_\beta = |x|^\beta$ and the convolution kernel $k_{d,\beta,c}(x)$ is the generalised multiquadric with negative index $\Psi_{-\beta-2d,c}(x)$, normalised to have integral one. That is

$$k_{d,\beta,c}(x) = a_{d,\beta}\,c^{d+\beta}\,\Psi_{-\beta-2d,c}(x), \tag{4.3}$$

for some constant $a_{d,\beta}$. Write

$$\sigma_d := 2\pi^{d/2}/\Gamma(d/2), \tag{4.4}$$

for the surface area of the unit sphere in $\mathbb{R}^d$. Then

$$1 = \int_{\mathbb{R}^d} k_{d,\beta,c}(x)\,dx = \sigma_d\,a_{d,\beta}\,c^{d+\beta}\int_0^\infty \frac{r^d}{(r^2+c^2)^{(\beta+2d)/2}}\,\frac{dr}{r},$$

implying from equation (3.1) that

$$a_{d,\beta} = \pi^{-d/2}\,\frac{\Gamma((\beta+2d)/2)}{\Gamma((\beta+d)/2)}. \tag{4.5}$$

Interpretation of convolution against the kernel as a low pass filter will be aided by the expression (4.8) for its Fourier transform

$$\widehat{k}_{d,\beta,c}(\xi) = \frac{2^{1-(d+\beta)/2}}{\Gamma((\beta+d)/2)}\,K_{\frac{\beta+d}{2}}(c|\xi|)\,(c|\xi|)^{(\beta+d)/2}, \qquad \beta > 0.$$

This Fourier transform is a positive function tending to zero exponentially fast with $|\xi|$. Considering the graph of $\widehat{k}_{d,\beta,c}(|\xi|)$ against $|\xi|$ it is clear that the width of the graph at any fixed height is inversely proportional to $c$. This expresses precisely how the graph of $\widehat{k}_{d,\beta,c}(\xi)$ grows more and more peaked as $c$ increases. Thus convolution with $k_{d,\beta,c}$ will attenuate high frequencies more and more as $c$ increases.

Using equation (3.1) again one has

$$\int_{\mathbb{R}^d} |x|^\beta k_{d,\beta,c}(x)\,dx = \frac{\int_{\mathbb{R}^d} |x|^\beta k_{d,\beta,c}(x)\,dx}{\int_{\mathbb{R}^d} k_{d,\beta,c}(x)\,dx} = \frac{\int_0^\infty \frac{r^{d+\beta}}{(c^2+r^2)^{(\beta+2d)/2}}\,\frac{dr}{r}}{\int_0^\infty \frac{r^{d}}{(c^2+r^2)^{(\beta+2d)/2}}\,\frac{dr}{r}} = c^\beta\,B\left(\tfrac{d+\beta}{2},\tfrac{d}{2}\right)\Big/ B\left(\tfrac{d}{2},\tfrac{d+\beta}{2}\right) = c^\beta. \tag{4.6}$$

We are particularly interested in the polyharmonic splines. In odd dimension odd powers of the modulus are multiples of fundamental solutions of iterated versions of Laplace's equation. Radial basis functions based on sums of shifts of these fundamental solutions supplemented by polynomials have many wonderful properties. Such polyharmonic splines arise naturally as smoothest interpolants (see [Duchon 1977]) and have performed extremely well in many practical applications (see e.g. [Carr et al. 2001]). A particularly important special case is that of the basic function $\Phi(x) = |x|$ in $\mathbb{R}^3$. The corresponding RBFs, which take the form of a linear polynomial plus sums of shifts of the modulus, are called biharmonic splines in $\mathbb{R}^3$.

We will now show the mollification formula

Theorem 4.1. For all $\beta$ such that $\Re(\beta) > -d$

$$(\Phi_\beta \star k_{d,\beta,c})(x) = \Psi_{\beta,c}(x), \qquad\text{for all } x \in \mathbb{R}^d. \tag{4.7}$$

Remark: This result includes the cases where the power $\beta$ and the dimension $d$ are both odd, and $\Phi_\beta(x) = |x|^\beta$ is a polyharmonic spline basic function. It also includes many other cases. In particular the range of validity of the formula includes those exceptional $\beta$'s for which $\Phi_\beta$ and $\Psi_{\beta,c}$ are polynomial.

Proof. We note the following generalised Fourier transforms for the relevant functions viewed as distributions acting on $\mathcal{D}(\mathbb{R}^d\setminus\{0\})$.

$$(\mathcal{F}\Psi_{\beta,c})(\xi) = \frac{2\pi^{d/2}}{\Gamma(-\beta/2)}\left(\frac{|\xi|}{2c}\right)^{-(d+\beta)/2} K_{(d+\beta)/2}(c|\xi|)$$

for $\beta \notin 2\mathbb{N}_0$, where $K_{(d+\beta)/2}$ is a modified Bessel function (see [Abramowitz and Stegun 1965, page 374] or [Aronszajn and Smith 1961, page 415]). For $\beta < -d$ this is a classical Fourier transform on $\mathbb{R}^d$. Also

$$\left(\mathcal{F}|\bullet|^\beta\right)(\xi) = 2^{d+\beta}\,\pi^{d/2}\,\frac{\Gamma((d+\beta)/2)}{\Gamma(-\beta/2)}\,|\xi|^{-d-\beta}$$

for $\beta \notin (-d - 2\mathbb{N}_0) \cup (2\mathbb{N}_0)$ (see [Gelfand and Shilov 1964, page 363]).

Then using the normalising constant

$$a_{d,\beta} = \pi^{-d/2}\,\frac{\Gamma((\beta+2d)/2)}{\Gamma((\beta+d)/2)}$$

defined above and that $K_\nu(z) = K_{-\nu}(z)$ we find

$$\widehat{k}_{d,\beta,c} = a_{d,\beta}\,c^{d+\beta}\,\widehat{\Psi}_{-2d-\beta,c} = \pi^{-d/2}c^{d+\beta}\,\frac{\Gamma((\beta+2d)/2)}{\Gamma((\beta+d)/2)}\,\frac{2\pi^{d/2}}{\Gamma(-(-2d-\beta)/2)}\left(\frac{|\xi|}{2c}\right)^{(d+\beta)/2} K_{-(d+\beta)/2}(c|\xi|)$$
$$= \frac{2^{1-(d+\beta)/2}}{\Gamma((\beta+d)/2)}\,K_{\frac{\beta+d}{2}}(c|\xi|)\,(c|\xi|)^{(\beta+d)/2} \tag{4.8}$$

for $\beta > -d$. It is then clear that

$$\left(\mathcal{F}|\bullet|^\beta\right)(\xi)\,\widehat{k}_{d,\beta,c}(\xi) = \widehat{\Psi}_{\beta,c}(\xi), \qquad \xi \in \mathbb{R}^d\setminus\{0\},$$

for a set of $\beta$'s including $0 < \beta < 1$.

Now fix $c > 0$, $0 < \beta < 1$, and apply Lemma 3.1. The Lemma implies that

$$|\bullet|^\beta \star k_{d,\beta,c} = p_{\beta,c} + \Psi_{\beta,c} \tag{4.9}$$

where $p_{\beta,c}$ is a polynomial of degree 0 and the equality holds pointwise. Considering the point $x = 0$ we find

$$\left(|\bullet|^\beta \star k_{d,\beta,c}\right)(0) = \int_{\mathbb{R}^d} |t|^\beta k_{d,\beta,c}(t)\,dt = c^\beta,$$

where we have used (4.6). Then observing that $\Psi_{\beta,c}(0) = c^\beta$ we deduce that the polynomial $p_{\beta,c}$ in (4.9) must be identically zero. We have therefore established that for all $c > 0$ and $0 < \beta < 1$

$$(\Phi_\beta \star k_{d,\beta,c})(x) = \Psi_{\beta,c}(x), \qquad\text{for all } x \in \mathbb{R}^d. \tag{4.10}$$

Now fix $c > 0$ and $x \in \mathbb{R}^d$. The right-hand side of (4.7) is an entire function of $\beta$. The left-hand side is continuous on $\Omega_d := \{\beta : \Re(\beta) > -d\}$ by the Lebesgue dominated convergence theorem. A standard argument using Morera's theorem then shows that the left-hand side is analytic on $\Omega_d$. (4.10) shows that (4.7) holds for $0 < \beta < 1$. Hence, by analytic continuation, it holds for all $\Re(\beta) > -d$.

The remainder of this section will concern the application of the Theorem above to implicit smoothing. Recall from (4.6) that

$$\int_{\mathbb{R}^d} |x|^\beta k_{d,\beta,c}(x)\,dx = c^\beta.$$

A routine application of Hölder's inequality then shows that

$$\int_{\mathbb{R}^d} |x|^\alpha k_{d,\beta,c}(x)\,dx < c^\alpha, \qquad\text{for all } 0 < \alpha < \beta. \tag{4.11}$$

These expressions for the $\alpha$-th absolute moment of $k_{d,\beta,c}$ clearly quantify the manner in which the kernel becomes peaked as $c$ approaches zero. Loosely speaking they show that the dominant part of convolution against the kernel is averaging on a length scale of approximately $c$, at least for functions of sufficiently slow growth. Thus we can expect convolution against $k_{d,\beta,c}$ to lose, or smooth, detail at this length scale.

A more precise statement about the error between the original RBF $s$ and its smoothed version $\tilde{s}$ is implied by the quantitative Korovkin theorem we are about to present. See the last paragraph of this section for the details.

Given a uniformly continuous function $g : \mathbb{R}^d \to \mathbb{R}$ define its uniform norm modulus of continuity

$$\omega(g, \mathbb{R}^d, \delta) := \sup_{x,y \in \mathbb{R}^d :\, |x-y| \le \delta} |g(x) - g(y)|.$$

Let $f \in C^\ell(\mathbb{R}^d)$ be a function with all $\ell$-th order partials uniformly continuous on $\mathbb{R}^d$. Define the $\ell$-th order directional derivative of $f$ at $x$ in the direction of $u$, $f_u^{(\ell)}(x)$, as the $\ell$-th derivative of the univariate function $g(t) = f(x + tu)$, at $t = 0$. The joint modulus of continuity of all directional derivatives of order $\ell$ of $f$ is given by

$$\Omega(f^{(\ell)}, \mathbb{R}^d, \delta) := \sup_{|u|=1} \omega(f_u^{(\ell)}, \mathbb{R}^d, \delta).$$

For a discussion of the properties of this modulus of continuity see [Beatson and Light 1993]. Given $\ell \in \mathbb{N}_0$ and $0 < \alpha \le 1$ we will say the total derivative $f^{(\ell)}$ is in $\mathrm{Lip}_M(\alpha)$ if there exists a positive constant $M$ such that

$$\Omega(f^{(\ell)}, \mathbb{R}^d, \delta) \le M\delta^\alpha, \qquad\text{for all } 0 < \delta < \infty.$$

For each multiindex $\gamma$ adopt the usual notation defining $|\gamma| = \|\gamma\|_1$ and the normalised monomial

$$V_\gamma(x) = \frac{1}{\gamma!}x^\gamma = \frac{1}{\gamma_1!\gamma_2!\cdots\gamma_d!}\,x_1^{\gamma_1}x_2^{\gamma_2}\cdots x_d^{\gamma_d}.$$

With this notation in hand we can state the following folklore quantitative Korovkin Theorem. We include the simple proof for the sake of completeness, and also because we do not know of a convenient reference.

Proposition 4.2. Let $\ell \in \mathbb{N}_0$, $0 < \beta \le 1$, and $B > 0$. For each $c > 0$ let $k_c : \mathbb{R}^d \to \mathbb{R}$ be a bounded function such that

$$\int_{\mathbb{R}^d} t^\gamma k_c(t)\,dt = \delta_{0,|\gamma|}, \qquad 0 \le |\gamma| \le \ell, \tag{4.12}$$

$$\int_{\mathbb{R}^d} |t|^{\ell+\beta}\,|k_c(t)|\,dt \le B c^{\ell+\beta}. \tag{4.13}$$

Then for all $f \in C^\ell(\mathbb{R}^d)$ with $\ell$-th total derivative in $\mathrm{Lip}_M(\beta)$ and all $c > 0$

$$\|f \star k_c - f\|_\infty \le \frac{BM}{\ell!}\,c^{\ell+\beta}. \tag{4.14}$$

Proof. Taylor's theorem with integral remainder for univariate functions implies

$$\left| g(x-t) - \sum_{j=0}^{\ell} g^{(j)}(x)\frac{(-t)^j}{j!} \right| \le \frac{|t|^\ell}{\ell!}\,\omega(g^{(\ell)}, |t|), \qquad x, t \in \mathbb{R}.$$

Applying this along the line segment joining $x$ and $x - t$ in $\mathbb{R}^d$ we find the multivariate Taylor theorem in the form

$$\left| f(x-t) - \sum_{|\gamma| \le \ell} (D^\gamma f)(x)V_\gamma(-t) \right| \le \frac{|t|^\ell}{\ell!}\,\Omega\left(f^{(\ell)}, \mathbb{R}^d, |t|\right).$$

Hence using the hypotheses

$$|(f \star k_c)(x) - f(x)| = \left|\int_{\mathbb{R}^d} (f(x-t) - f(x))\,k_c(t)\,dt\right| = \left|\int_{\mathbb{R}^d} \Big(f(x-t) - \sum_{|\gamma| \le \ell} (D^\gamma f)(x)V_\gamma(-t)\Big) k_c(t)\,dt\right|$$
$$\le \int_{\mathbb{R}^d} \frac{|t|^\ell}{\ell!}\,\Omega\left(f^{(\ell)}, \mathbb{R}^d, |t|\right)|k_c(t)|\,dt \le \frac{M}{\ell!}\int_{\mathbb{R}^d} |t|^{\ell+\beta}\,|k_c(t)|\,dt \le \frac{BM}{\ell!}\,c^{\ell+\beta}.$$

As a first application of Theorem 4.1 and Proposition 4.2 consider implicit smoothing of an RBF of form (2.1) when $\Phi(x) = |x|$ and $p$ is of degree 1. Implicit smoothing of a surface in $\mathbb{R}^3$ modelled with such a biharmonic spline is illustrated in Figures 1 and 2. Then the unsmoothed function $s \in \mathrm{Lip}_M(1)$ where

$$M = \sup_{x \in \mathbb{R}^d \setminus \{x_i :\, 1 \le i \le N\}} |\nabla s(x)|.$$
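The mollification formula (4.7) and the moment identity (4.6) can be sanity-checked numerically in the simplest setting, d = 1. The sketch below is not from the paper: it assumes the substitution y = c tan θ, under which the convolution of |x|^β with k_{1,β,c} reduces to a_{1,β} times the integral of |x cos θ − c sin θ|^β over (−π/2, π/2), and it evaluates that integral with a plain midpoint rule. The function names are hypothetical.

```python
import math

def a_const(d, beta):
    """Normalising constant a_{d,beta} of equation (4.5)."""
    return (math.pi ** (-d / 2)
            * math.gamma((beta + 2 * d) / 2) / math.gamma((beta + d) / 2))

def mollified_power_1d(x, beta, c, n=20000):
    """Approximate (|.|^beta * k_{1,beta,c})(x) for beta > 0.

    Uses the assumed substitution y = c*tan(t), which turns the convolution
    into a_{1,beta} * integral over (-pi/2, pi/2) of |x cos t - c sin t|^beta,
    and a midpoint rule with n cells.
    """
    h = math.pi / n
    total = 0.0
    for i in range(n):
        t = -math.pi / 2 + (i + 0.5) * h
        total += abs(x * math.cos(t) - c * math.sin(t)) ** beta
    return a_const(1, beta) * h * total

def generalised_multiquadric(x, beta, c):
    """Psi_{beta,c}(x) = (x^2 + c^2)^(beta/2), equation (4.1) with d = 1."""
    return (x * x + c * c) ** (beta / 2)

if __name__ == "__main__":
    for beta in (0.5, 1.0, 3.0):
        lhs = mollified_power_1d(0.7, beta, 0.5)
        rhs = generalised_multiquadric(0.7, beta, 0.5)
        print(beta, lhs, rhs)
```

Evaluating at x = 0 gives the β-th absolute moment of the kernel, so the same routine also illustrates (4.6). The check covers only one dimension and real β > 0; it is a plausibility test, not a substitute for the proof.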

Convolving against $k_{d,1,c}$ we note that linear polynomials are preserved so that the smoothed RBF (2.2) takes the special form

$$\tilde{s}(x) = p(x) + \sum_{i=1}^{N} \lambda_i \Psi_{1,c}(x - x_i).$$

Applying the Korovkin theorem Proposition 4.2, using that $\int_{\mathbb{R}^d} |x|\,k_{d,1,c}(x)\,dx = c$ by (4.6), we see that $\|s - \tilde{s}\|_\infty \le Mc$.

5 Radial functions

This section outlines some known fundamental properties of radial functions.

A function $f : \mathbb{R}^d \to \mathbb{R}$ is radial if there is a univariate function $g$ such that $f(x) = g(|x|)$ for all $x$.

Lemma 5.1. Let $f, k : \mathbb{R}^d \to \mathbb{R}$ be such that the integral defining $(f \star k)(x)$ is absolutely convergent for all $x$. If $f$ and $k$ are radial then so is $f \star k$.

Proof. Given $x \in \mathbb{R}^d$ choose a rotation matrix $Q$ so that $Qx = |x|e_1$ where $e_1$ is the vector $(1, 0, \ldots, 0)$. Then

$$(f \star k)(x) = \int_{\mathbb{R}^d} f(x-t)k(t)\,dt = \int_{\mathbb{R}^d} f(Qx - Qt)k(Qt)\,dt = \int_{\mathbb{R}^d} f(|x|e_1 - s)k(s)\,ds = (f \star k)(|x|e_1).$$

Given a polynomial $p : \mathbb{R}^d \to \mathbb{R}$ write it in terms of the monomial basis as $p(x) = \sum_{\alpha \in \mathbb{N}_0^d} a_\alpha x^\alpha$. Define the homogeneous part of degree $j$ of $p$, $H_j p$, by $(H_j p)(x) = \sum_{|\alpha| = j} a_\alpha x^\alpha$.

Lemma 5.2. Let $p : \mathbb{R}^d \to \mathbb{R}$ be a polynomial which is also radial. Then

(a) All the homogeneous parts $H_j p$ of $p$, $j = 0, 1, \ldots$, are also radial.

(b) There is a univariate polynomial $q$ such that $p(x) = q(|x|)$, for all $x$.

Proof of (a). Suppose that $p$ is radial yet at least one homogeneous part of $p$ is not. Let $H_m p$ be the first non radial homogeneous part of $p$. Then there exist $x, y$ with $|x| = |y| = 1$ but $(H_m p)(x) \neq (H_m p)(y)$. Now let

$$e = p - \sum_{k=0}^{m-1} (H_k p) = \sum_{k=m}^{\infty} (H_k p).$$

Then

$$e(rx) - e(ry) = r^m\left((H_m p)(x) - (H_m p)(y)\right) + O\left(r^{m+1}\right),$$

as $r \to 0+$, implying $e(rx) \neq e(ry)$ for all sufficiently small $r > 0$. But $e$ is radial by choice of $m$. Contradiction.

Proof of (b). From part (a) if $j$ is odd then $H_j p$ is both odd and radial, therefore identically zero. Hence writing $e_1$ for the unit vector $(1, 0, \ldots, 0)$ and using part (a) again

$$p(x) = \sum_{j=0}^{\infty} (H_{2j}p)(x) = \sum_{j=0}^{\infty} (H_{2j}p)(|x|e_1) = \sum_{j=0}^{\infty} |x|^{2j}(H_{2j}p)(e_1) = \sum_{j=0}^{\infty} b_{2j}r^{2j}, \qquad r = |x|,$$

where $b_{2j} = (H_{2j}p)(e_1)$.

6 Mollification formulas for functions of the form $r^{2j}\log r$

In this section mollification formulas are developed for the generalised thin-plate spline basic function $|x|^{2j}\log|x|$, $j \in \mathbb{N}$. The flavour of the main result Theorem 6.1 is that convolution of the generalised thin-plate basic function against a certain inverse multiquadric yields the corresponding shifted thin-plate spline $\left(|x|^2+c^2\right)^j \log\sqrt{|x|^2+c^2}$, plus a polynomial of degree $2j - 2$.

In even dimension even powers of the modulus multiplied by $\log|x|$ are fundamental solutions of iterated versions of Laplace's equation. In particular RBFs taking the form of a linear polynomial plus a sum of shifts of $x^2\log|x|$ are the biharmonic RBFs in $\mathbb{R}^2$. These thin-plate splines can be shown to be the solutions of various smoothest interpolation and penalised smoothing problems (see e.g. [Duchon 1977], [Wahba 1990]) and have proved very successful in many scattered data fitting applications, see for example [Hutchinson and Gessler 1994]. For such functions we will show mollification formulas of the form

$$\left(|\bullet|^\beta \log|\bullet|\right) \star k_{d,\beta,c} = \left(\bullet^2+c^2\right)^{\beta/2}\log\sqrt{\bullet^2+c^2} + p_\beta, \tag{6.1}$$

where $\beta \in 2\mathbb{N}$, $k_{d,\beta,c}$ is as in (4.3), and $p_\beta$ is a polynomial depending on $d$, $\beta$ and $c$. The first function on the right above

$$\Xi_{\beta,c}(x) := (|x|^2+c^2)^{\beta/2}\log\left(|x|^2+c^2\right)^{1/2}, \qquad x \in \mathbb{R}^d, \tag{6.2}$$

is the shifted thin-plate spline basic function of Dyn, Levin and Rippa. Some properties of these functions can be found in [Dyn et al. 1986] and [Dyn 1989].

In contrast to the case of smoothing $|x|^\beta$ discussed in Section 4 a nonzero polynomial part does arise in smoothing $|x|^{2j}\log|x|$. For example the convolution on the left of (6.1) for $d = 2$, $2j = 2$, and $c = 1$ evaluated at zero can be rewritten using polar coordinates as

$$\int_0^\infty 2\pi r\; r^2\log(r)\,\frac{2}{\pi}\,(r^2+1)^{-3}\,dr.$$

This equals $1/2$, while $(r^2+1)\log\sqrt{r^2+1}$ evaluated at $r = 0$ is 0. This direct calculation for the special case $d = 2$ is in agreement with the general formula (6.5) which we are about to prove. Explicitly we will show

Theorem 6.1. For $j \in \mathbb{N}$

$$\left\{|\bullet|^{2j}\log|\bullet|\right\} \star k_{d,2j,c}\,(x) = \Xi_{2j,c}(x) + p_{2j}(x) \tag{6.3}$$

where $p_{2j}$ is a radial polynomial of degree $2j - 2$. Writing $p_{2j}$ in the form

$$p_{2j}(x) = b_{2j,0} + b_{2j,2}|x|^2 + \cdots + b_{2j,2j-2}|x|^{2j-2},$$

$$b_{2j,0} = c^{2j}\left\{\frac{1}{(2j-2)+d} + \frac{1}{(2j-4)+d} + \cdots + \frac{1}{d}\right\}, \tag{6.4}$$

and in particular

$$b_{2,0} = \frac{c^2}{d}. \tag{6.5}$$

Proof. Start with the mollification formula of Theorem 4.1

$$\left(|x|^2+c^2\right)^{\beta/2} = a_{d,\beta}\,c^{d+\beta}\int_{\mathbb{R}^d} |y|^\beta\left(|x-y|^2+c^2\right)^{-(\beta+2d)/2}dy$$

for all $x \in \mathbb{R}^d$ and all $\Re(\beta) > -d$. Differentiating both sides with respect to $\beta$ yields

$$\left(|x|^2+c^2\right)^{\beta/2}\log\left(|x|^2+c^2\right)^{1/2} = \frac{d}{d\beta}\left(a_{d,\beta}\,c^{d+\beta}\right)\int_{\mathbb{R}^d} |y|^\beta\left(|x-y|^2+c^2\right)^{-(\beta+2d)/2}dy$$
$$+\; a_{d,\beta}\,c^{d+\beta}\int_{\mathbb{R}^d} |y|^\beta\log|y|\left(|x-y|^2+c^2\right)^{-(\beta+2d)/2}dy$$
$$-\; a_{d,\beta}\,c^{d+\beta}\int_{\mathbb{R}^d} |y|^\beta\left(|x-y|^2+c^2\right)^{-(\beta+2d)/2}\log\left(|x-y|^2+c^2\right)^{1/2}dy$$
$$= A_\beta + B_\beta - C_\beta.$$

Calculation of quantity $A_\beta$. From (4.5)

$$\frac{d}{d\beta}\left(a_{d,\beta}\,c^{d+\beta}\right) - a_{d,\beta}\,c^{d+\beta}\log c = c^{d+\beta}\,\frac{d}{d\beta}\left\{\pi^{-d/2}\,\frac{\Gamma((\beta+2d)/2)}{\Gamma((\beta+d)/2)}\right\}$$
$$= c^{d+\beta}\,\pi^{-d/2}\left\{\frac{1}{2}\,\frac{\Gamma'\left(\frac{\beta+2d}{2}\right)}{\Gamma\left(\frac{\beta+d}{2}\right)} - \frac{1}{2}\,\frac{\Gamma\left(\frac{\beta+2d}{2}\right)\Gamma'\left(\frac{\beta+d}{2}\right)}{\Gamma\left(\frac{\beta+d}{2}\right)^2}\right\} = \frac{1}{2}\,a_{d,\beta}\,c^{d+\beta}\left\{\psi\left(\frac{\beta+2d}{2}\right) - \psi\left(\frac{\beta+d}{2}\right)\right\}.$$

Hence employing the mollification formula (4.7) we find

$$A_\beta = \left(|x|^2+c^2\right)^{\beta/2}\left\{\log c + \frac{1}{2}\psi\left(\frac{\beta+2d}{2}\right) - \frac{1}{2}\psi\left(\frac{\beta+d}{2}\right)\right\}.$$

Quantity $B_{2j}$ is the convolution

$$\left\{|\bullet|^{2j}\log|\bullet|\right\} \star k_{d,2j,c}\,(x)$$

we are interested in.

Calculation of quantity $C_{2j}$.

$$C_{2j} = \frac{1}{2}\,a_{d,2j}\,c^{d+2j}\int |x-y|^{2j}\left(|y|^2+c^2\right)^{-(d+j)}\log\left(|y|^2+c^2\right)dy.$$

Expanding the term $|x-y|^{2j} = \left(|x|^2 - 2\langle x,y\rangle + |y|^2\right)^j$ that occurs above using the Binomial Theorem it is clear that $|x-y|^{2j}$ is a polynomial of degree $2j$ in $x$ and $y$. Collecting terms in the expansion by degree in $x$

$$|x-y|^{2j} = \left(|x|^2 - 2\langle x,y\rangle + |y|^2\right)^j = |x|^{2j} - 2j\langle x,y\rangle|x|^{2j-2} + j|x|^{2j-2}|y|^2 + 2j(j-1)\langle x,y\rangle^2|x|^{2j-4} + \text{terms of lower degree in } x.$$

Substituting this expansion into the expression for $C_{2j}$ reveals that $C_{2j}$ is a polynomial of degree $2j$ in $x$. From Lemma 5.1 this convolution of radial functions yields a radial function. Hence from Lemma 5.2 $C_{2j}$ is a polynomial of degree $j$ in $|x|^2$.

The term involving $|x|^{2j}$ in the expression $C_{2j}$ is

$$\frac{1}{2}\,a_{d,2j}\,c^{d+2j}\,|x|^{2j}\int_{\mathbb{R}^d}\left(|y|^2+c^2\right)^{-(d+j)}\log\left(|y|^2+c^2\right)dy = |x|^{2j}\,\frac{1}{2}\,a_{d,2j}\,\sigma_d\,c^{d+2j}\int_0^\infty r^d\left(r^2+c^2\right)^{-(d+j)}\log\left(r^2+c^2\right)\frac{dr}{r}$$
$$= |x|^{2j}\,\frac{1}{2}\,a_{d,2j}\,\sigma_d\,c^{d+2j}\,M(d, d+j) = |x|^{2j}\left\{\log(c) + \frac{1}{2}\psi(d+j) - \frac{1}{2}\psi\left(\frac{2j+d}{2}\right)\right\}$$

where we have employed (3.3).

Combining the expressions for $A_{2j}$ and $B_{2j}$ with the expression above for the coefficient of $|x|^{2j}$ in $C_{2j}$ we find that the terms in $|x|^{2j}$ cancel and therefore the convolution has the form given in equation (6.3) of the Theorem.

We proceed to identify the constant part $b_{2j,0}$ of the polynomial $p_{2j}$. It follows from formula (6.3) that

$$\left\{|\bullet|^{2j}\log|\bullet|\right\} \star k_{d,2j,c}\,(0) = \Xi_{2j,c}(0) + b_{2j,0}. \tag{6.6}$$

Using the radial symmetry the left hand side can be rewritten as the univariate integral

$$\sigma_d\,a_{d,2j}\,c^{d+2j}\int_{r=0}^\infty r^{2j}\log(r)\left(r^2+c^2\right)^{-(2j+2d)/2}r^{d-1}\,dr.$$

Applying the notation and results of Lemma 3.2 this becomes

$$\frac{2}{B\left(\frac{d}{2}, \frac{2j+d}{2}\right)}\,c^{d+2j}\,I(d+2j, j+d) = \frac{c^{2j}}{2}\left\{\log(c^2) + \psi\left(j + \frac{d}{2}\right) - \psi\left(\frac{d}{2}\right)\right\}.$$

Observing that $\Xi_{2j,c}(0) = c^{2j}\log(c)$ it follows that

$$b_{2j,0} = \frac{c^{2j}}{2}\left\{\psi\left(j + \frac{d}{2}\right) - \psi\left(\frac{d}{2}\right)\right\}.$$

Employing the recurrence [Abramowitz and Stegun 1965, (6.3.5) and (6.3.6)]

$$\psi(n+z) = \frac{1}{(n-1)+z} + \frac{1}{(n-2)+z} + \cdots + \frac{1}{z} + \psi(z)$$

the expression for $b_{2j,0}$ can be rewritten as

$$b_{2j,0} = c^{2j}\left\{\frac{1}{(2j-2)+d} + \frac{1}{(2j-4)+d} + \cdots + \frac{1}{d}\right\},$$

establishing formula (6.4). Substituting $j = 1$ gives the formula (6.5).

As an application of the mollification formula of Theorem 6.1 consider implicit smoothing of an ordinary thin-plate spline of form

$$s(x) = p_1(x) + \sum_{i=1}^{N}\lambda_i\,|x - x_i|^2\log|x - x_i|$$

in $\mathbb{R}^2$. Implicit smoothing of such a fit is illustrated in Figure 3. If the RBF coefficients $\lambda_i$ satisfy the usual side conditions

$$\sum_{i=1}^{N}\lambda_i = 0 \qquad\text{and}\qquad \sum_{i=1}^{N}\lambda_i x_i = 0, \tag{6.7}$$

then the sum of all the constant parts arising from smoothing, $\sum_i \lambda_i c^2/d$, is zero. Thus the smoothed thin-plate spline is

$$\tilde{s}(x) = p_1(x) + \sum_{i=1}^{N}\lambda_i\left(|x - x_i|^2+c^2\right)\log\sqrt{|x - x_i|^2+c^2}.$$

Further, when (6.7) holds, one can form a far field expansion

$$A\log(|x|) + P_1(x) + \frac{P_2(x)}{|x|^2} + \frac{P_3(x)}{|x|^4} + \ldots,$$

of $s(x)$ where $P_j$ denotes a polynomial of degree $j$. This expansion converges to $s(x)$ with a geometric rate for all sufficiently large $|x|$. It follows that the gradient $s^{(1)}(x) = \nabla s(x)$ is bounded and hence $s \in \mathrm{Lip}_M(1)$ with

$$M = \sup_{x \in \mathbb{R}^d}|\nabla s(x)|.$$

Hence noting from (4.11) that $\int_{\mathbb{R}^d}|x|^\alpha k_{d,2,c}(x)\,dx < c^\alpha$ for all $0 < \alpha < 2$ and applying the Korovkin theorem Proposition 4.2 we find that in this case

$$\|s - \tilde{s}\|_\infty < Mc.$$

Consider now using the triharmonic spline in $\mathbb{R}^2$ based on sums of shifts of $|x|^4\log|x|$ plus quadratics

$$s(x) = p_2(x) + \sum_{i=1}^{N}\lambda_i\,|x - x_i|^4\log|x - x_i|.$$

The triharmonic spline is $C^3$. It is natural to use such a triharmonic spline, rather than the usual thin-plate spline, if the function being approximated is smoother, or if a smoother approximation is required.

The usual side conditions for the triharmonic spline are that the coefficients $\lambda_i$ are "orthogonal to" quadratics. That is that

$$\sum_{i=1}^{N}\lambda_i\,q(x_i) = 0, \qquad\text{for all quadratics } q. \tag{6.8}$$

These conditions imply that the polynomial parts arising from smoothing of the weighted shifts of $|\bullet|^4\log|\bullet|$ cancel to give the zero polynomial. Therefore the smoothed RBF has the form

$$\tilde{s}(x) = q_2(x) + \sum_{i=1}^{N}\lambda_i\left(|x - x_i|^2+c^2\right)^2\log\sqrt{|x - x_i|^2+c^2},$$

where $q_2 = p_2 \star k_{d,4,c}$ will usually differ from $p_2$.

Further, when the side conditions (6.8) hold, one can form a far field expansion

$$Q_1(x)\log(|x|) + P_2(x) + \frac{P_3(x)}{|x|^2} + \frac{P_4(x)}{|x|^4} + \ldots,$$

of $s(x)$ where $Q_1(x)$ is a polynomial of degree 1, and for all $j$, $P_j$ denotes a polynomial of degree $j$. This expansion converges to $s(x)$ with a geometric rate for all sufficiently large $|x|$. It follows that all the second partials of $s$ are bounded. Hence the first total derivative $s^{(1)} \in \mathrm{Lip}_M(1)$ for some constant $M$. Noting from (4.11) that $\int_{\mathbb{R}^d}|x|^\alpha k_{d,4,c}(x)\,dx < c^\alpha$ for all $0 < \alpha < 4$ and applying the Korovkin theorem Proposition 4.2 we find that in this case

$$\|s - \tilde{s}\|_\infty < Mc^2.$$

7 Bessel kernels and Matern functions

This section discusses mollification formulas for Bessel kernels and the scaled versions known as Matern functions. These mollification formulas can be exploited in an implicit smoothing technique for Matern RBFs.

Recall the Bessel kernels $G_{d,\alpha}$ for $\mathbb{R}^d$ given for $\alpha > 0$ by

$$\widehat{G}_{d,\alpha}(\xi) = \left(1 + |\xi|^2\right)^{-\alpha/2},$$

and

$$G_{d,\alpha}(x) = \frac{1}{\pi^{d/2}\,2^{(d+\alpha-2)/2}\,\Gamma(\alpha/2)}\,K_{(d-\alpha)/2}(|x|)\,|x|^{(\alpha-d)/2}.$$

Good references for the many wonderful properties of these functions are [Aronszajn and Smith 1961] and [Donoghue 1969]. The properties that we will need in the following are

• $G_{d,\alpha} \in L^2$, $\alpha > d/2$.

• $G_{d,\alpha}$ is continuous when $\alpha > d$.

• $G_{d,\alpha}(x) = D_{d,\alpha}|x|^{(\alpha-d-1)/2}e^{-|x|}(1 + o(1))$ as $|x| \to \infty$. Here $D_{d,\alpha}$ is a constant depending on $d$ and $\alpha$.

• $G_{d,\alpha}$ is positive definite for $\alpha > d$.

• $G_{d,\alpha} \star G_{d,\beta} = G_{d,\alpha+\beta}$, $\alpha, \beta > 0$.

The Bessel kernels are also basic functions corresponding to natural smoothest interpolation problems. [Schaback 1993] and [Schaback 1995] have shown that for $k > d/2$ RBF interpolation with basic function $G_{d,2k}$ yields the interpolant minimising the Sobolev norm for $W_2^k(\mathbb{R}^d)$

$$\|g\|^2_{W_2^k(\mathbb{R}^d)} = \int_{\mathbb{R}^d}|\widehat{g}(\omega)|^2\left(1 + |\omega|^2\right)^k d\omega$$

over all sufficiently smooth interpolants.

More explicit forms for some of the Bessel kernels are

$$G_{d,d+1}(x) = \frac{\pi^{-(d-1)/2}}{2^d\,\Gamma\left(\frac{d+1}{2}\right)}\,e^{-|x|},$$

which is Lipschitz,

$$G_{d,d+3}(x) = \frac{\pi^{-(d-1)/2}}{2^{d+1}\,\Gamma\left(\frac{d+3}{2}\right)}\,(1 + |x|)\,e^{-|x|},$$

which is twice continuously differentiable, and

$$G_{d,d+5}(x) = \frac{\pi^{-(d-1)/2}}{2^{d+2}\,\Gamma\left(\frac{d+5}{2}\right)}\,(3 + 3|x| + |x|^2)\,e^{-|x|},$$

which is four times continuously differentiable. These formulas and others can be derived using [Abramowitz and Stegun 1965, (10.2.15) or (10.2.17)].

Introduce a parameter $c$ by considering the dilated version scaled to have integral 1

$$M_{d,\alpha,c}(x) = c^{-d}\,G_{d,\alpha}(x/c). \tag{7.1}$$

We have used the letter $M$ for these functions as in the Statistics literature they are often called Matern functions rather than scaled Bessel kernels (see e.g. [Stein 1999]). From (7.1) and its immediate consequence

$$\int_{\mathbb{R}^d}|x|\,M_{d,\alpha,c}(x)\,dx = c\int_{\mathbb{R}^d}|x|\,G_{d,\alpha}(x)\,dx \tag{7.2}$$

it is clear that $c$ is a length scale associated with the kernel $M_{d,\alpha,c}$.

The positive definiteness of the $G_{d,\alpha}$'s implies that of the $M_{d,\alpha,c}$'s. Hence the scattered data interpolation problem of finding a Matern RBF of the form

\[ s(\cdot) = \sum_i \lambda_i M_{d,\alpha,c}(\cdot - x_i) \]

taking given values $f_i$ at a finite number of given distinct points $x_i$ is guaranteed to have a unique solution. [Mouat and Beatson 2002] discuss an application of Matern RBFs to the numerical solution of PDEs in which they significantly outperform multiquadrics.

The convolution property of the $G_{d,\alpha}$'s implies

\[ M_{d,\alpha,c} \star M_{d,\beta,c} = M_{d,\alpha+\beta,c}, \qquad \alpha, \beta > 0. \]

Therefore given an RBF built upon the basic function $M_{d,\alpha,c}$ we can carry out implicit smoothing in the form

\[ s = \sum_i \lambda_i M_{d,\alpha,c}(\cdot - x_i) \;\longrightarrow\; \Big( \sum_i \lambda_i M_{d,\alpha,c}(\cdot - x_i) \Big) \star M_{d,\beta,c} = \sum_i \lambda_i M_{d,\alpha+\beta,c}(\cdot - x_i) =: \tilde{s}. \]

The RBF $s$ based on the kernel $M_{d,\alpha,c}$ is Lipschitz whenever $\alpha \ge d + 1$. Hence, the Korovkin theorem and the moment estimate (7.2) imply that

\[ \| s - \tilde{s} \|_\infty \le \left( \int_{R^d} |x|\, G_{d,\beta}(x)\,dx \right) \left( \sup_{x \in R^d} |\nabla s(x)| \right) c, \]

whenever $\alpha \ge d + 1$.

8 Mollification formulas for compactly supported RBFs

This section discusses mollification formulas for compactly supported RBFs. These mollification formulas can be exploited in implicit smoothing techniques for compactly supported RBFs.

In this section we abuse notation and do not distinguish between radial functions in terms of the dimension of their Euclidean domain. Thus we write $f(x)$ for a radial function which, when written as $g(|x|)$, we can consider as having any finite dimensional Euclidean domain $R^n$. Consequently we need to identify the dimensionality of the domain differently and do so by writing $\star_d$ for the convolution in $R^d$. Following [Wendland 1995], given an $L_\infty$ compactly supported radial function $f$, form from it another radial function $(If)$ by defining (viewing $f$ as a function of the variable $r = |x|$)

\[ (If)(t) = \begin{cases} \displaystyle\int_{s=t}^{\infty} s f(s)\,ds, & 0 \le t < \infty, \\[1ex] (If)(-t), & \text{otherwise}. \end{cases} \tag{8.1} \]

Also define for $t > 0$

\[ (Df)(t) = -\frac{1}{t} \frac{d}{dt} f(t). \]

In the interior of intervals of continuity of $f$ the fundamental theorem of calculus implies

\[ (DIf)(t) = f(t). \]

Then Wendland derives from the work of [Schaback and Wu 1996] formulas for the convolutions of compactly supported radial functions, one of which should read

\[ I(f \star_{d+2} g) = 2\pi\, (If) \star_d (Ig). \tag{8.2} \]

Now let $k_c$ be the characteristic function of the ball radius $c$ in 3D, normalised to have integral one. By Bochner's theorem $C_3(x) = (k_c \star_3 k_c)(x)$ will necessarily be positive definite. Statisticians call this function the spherical covariance, but it is also known as the Euclid hat. The latter name derives from the analogy with the univariate piecewise linear hat function, or linear B-spline, on a uniform mesh, which is the convolution of two characteristic functions.

It follows easily from formula (8.2) that the spherical covariance, normalised to have value 1 at zero, and to be supported on a ball radius 1, is

\[ C_3(x) = 1 - \frac{3}{2}|x| + \frac{1}{2}|x|^3 \quad \text{for } |x| \le 1, \]

and zero otherwise.

Other frequently used positive definite compactly supported basic functions in $R^3$ are the Askey function $A$ and the Wendland function $W$ given by

\[ A(r) = \psi_{2,0}(r) = (1 - r)_+^2, \quad \text{which is Lipschitz}, \tag{8.3} \]

\[ W(r) = \psi_{3,1}(r) = (1 - r)_+^4 (4r + 1), \quad \text{which is } C^2. \tag{8.4} \]

To use these functions at a length scale $R$ one simply replaces $r$ by $|x|/R$.

We consider implicit smoothing of these basic functions by convolving them with the kernel $k_c$ defined above. The resulting function will clearly be supported on $\{x : |x| \le 1 + c\}$. Looking in the Fourier domain we see that the smoothed function will not be positive definite. Positive definiteness is important as it guarantees the existence of solutions to interpolation problems. However, it does not matter for our implicit smoothing application, in which one smooths a previously fitted RBF. If one wants the smoothed basic function to be positive definite then clearly one should choose a compactly supported positive definite function such as the Askey function or spherical covariance, normalised to have integral 1, as the smoothing kernel rather than $k_c$.

Maple code based on (8.2) yields piecewise definitions for the smooth approximations $\Psi = \Phi \star_3 k_c$ to these basic functions $\Phi$ when $0 < c < 1/2$. In particular $(A \star_3 k_c)(r)$ equals

\[ \begin{cases}
\dfrac{1}{10c^3} r^4 + \left(1 - \dfrac{1}{c}\right) r^2 + 1 - \dfrac{3}{2} c + \dfrac{3}{5} c^2, & 0 \le r < c, \\[2ex]
r^2 - 2r + 1 + \dfrac{3}{5} c^2 - \dfrac{2c^2}{5} \dfrac{1}{r}, & c \le r < 1 - c, \\[2ex]
\dfrac{1}{80c^3} r^5 - \dfrac{1}{20c^3} r^4 + \dfrac{1 - 3c^2}{16c^3} r^3 + \dfrac{1 + c}{2c} r^2 - \dfrac{1 + 6c^2 + 16c^3 + 9c^4}{16c^3} r \\[1ex]
\quad + \dfrac{1 + 10c^3 + 15c^4 + 6c^5}{20c^3} - \dfrac{1 - 5c^2 + 15c^4 + 16c^5 + 5c^6}{80c^3} \dfrac{1}{r}, & 1 - c \le r < 1 + c.
\end{cases} \]
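As a sanity check on the piecewise formula for $(A \star_3 k_c)(r)$, the following minimal Python sketch evaluates it with Horner's method and compares it against a direct numerical quadrature. The function names `askey_smoothed` and `askey_smoothed_quadrature` are hypothetical, introduced only for this illustration. The quadrature does not use (8.2); it relies on the standard identity for the convolution of two radial functions in $R^3$, $(f \star_3 g)(r) = \frac{2\pi}{r} \int_0^\infty s f(s) \int_{|r-s|}^{r+s} t\, g(t)\, dt\, ds$, which is an assumption of this sketch rather than something stated in the text.

```python
def askey_smoothed(r, c):
    """Evaluate (A *_3 k_c)(r) for A(r) = (1 - r)_+^2 from the piecewise formula.

    Hypothetical helper for illustration; assumes 0 < c < 1/2 as in the text.
    """
    if not 0.0 < c < 0.5:
        raise ValueError("the piecewise formula assumes 0 < c < 1/2")
    r = abs(r)
    c2, c3 = c * c, c ** 3
    if r < c:
        # Even polynomial in r, so apply Horner's method in r^2.
        r2 = r * r
        return (r2 / (10 * c3) + (1 - 1 / c)) * r2 + 1 - 1.5 * c + 0.6 * c2
    if r < 1 - c:
        return (r - 2) * r + 1 + 0.6 * c2 - 0.4 * c2 / r
    if r < 1 + c:
        # Coefficients of r^5, r^4, ..., r^0; the 1/r term is handled separately.
        coeffs = [
            1 / (80 * c3),
            -1 / (20 * c3),
            (1 - 3 * c2) / (16 * c3),
            (1 + c) / (2 * c),
            -(1 + 6 * c2 + 16 * c3 + 9 * c ** 4) / (16 * c3),
            (1 + 10 * c3 + 15 * c ** 4 + 6 * c ** 5) / (20 * c3),
        ]
        poly = 0.0
        for a in coeffs:  # Horner's method in r
            poly = poly * r + a
        tail = (1 - 5 * c2 + 15 * c ** 4 + 16 * c ** 5 + 5 * c ** 6) / (80 * c3)
        return poly - tail / r
    return 0.0  # outside the support {|x| <= 1 + c}


def askey_smoothed_quadrature(r, c, n=20000):
    """Midpoint-rule reference value of (A *_3 k_c)(r) for r > 0.

    Uses the radial convolution identity in R^3 with k_c = 3 / (4 pi c^3)
    on the ball of radius c, which reduces to a one-dimensional integral.
    """
    lo, hi = max(0.0, r - c), min(1.0, r + c)
    if hi <= lo:
        return 0.0
    h = (hi - lo) / n
    total = 0.0
    for i in range(n):
        s = lo + (i + 0.5) * h
        # 2 * (integral of t over [|r - s|, min(r + s, c)]); non-negative here.
        w = min(r + s, c) ** 2 - (r - s) ** 2
        total += s * (1 - s) ** 2 * w
    return 3 * total * h / (4 * c ** 3 * r)


if __name__ == "__main__":
    c = 0.25
    for r in (0.1, 0.5, 0.9, 1.2):
        print(r, askey_smoothed(r, c), askey_smoothed_quadrature(r, c))
```

To work at a length scale $R$, one would call `askey_smoothed(abs_x / R, c)` with $c$ measured in units of $R$. Since the coefficients depend only on $c$, they could be computed once and reused across all centres.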

Also $(C_3 \star_3 k_c)(r)$ equals

\[ \begin{cases}
-\dfrac{1}{140c^3} r^6 + \dfrac{3}{40c}\left(2 + \dfrac{1}{c^2}\right) r^4 + \dfrac{3c}{4}\left(1 - \dfrac{1}{c^2}\right) r^2 + 1 - \dfrac{9}{8} c + \dfrac{1}{4} c^3, & 0 \le r < c, \\[2ex]
\dfrac{1}{2} r^3 + \left(\dfrac{3}{5} c^2 - \dfrac{3}{2}\right) r + 1 + \left(\dfrac{3}{70} c^4 - \dfrac{3}{10} c^2\right) \dfrac{1}{r}, & c \le r < 1 - c, \\[2ex]
\dfrac{1}{280c^3} r^6 - \dfrac{3(1 + 2c^2)}{80c^3} r^4 + \dfrac{1 + 4c^3}{16c^3} r^3 + \dfrac{3(1 - c^2)}{8c} r^2 - \dfrac{3(1 + 5c^2 + 10c^3 - 4c^5)}{40c^3} r \\[1ex]
\quad + \dfrac{1 + 8c^3 + 9c^4 - 2c^6}{16c^3} - \dfrac{3(3 - 14c^2 + 35c^4 + 28c^5 - 4c^7)}{560c^3} \dfrac{1}{r}, & 1 - c \le r < 1 + c.
\end{cases} \]

Finally, $(W \star_3 k_c)(r)$ equals

\[ \begin{cases}
-\dfrac{1}{42c^3} r^8 + \left(\dfrac{6}{7c} - \dfrac{2}{7c^3}\right) r^6 + \left(-15 + 9c + \dfrac{6}{c}\right) r^4 + \left(-10 + 30c - 30c^2 + 10c^3\right) r^2 \\[1ex]
\quad + 1 - 6c^2 + 10c^3 - \dfrac{45}{7} c^4 + \dfrac{3}{2} c^5, & 0 \le r < c, \\[2ex]
4r^5 - 15r^4 + \left(20 + 12c^2\right) r^3 - \left(10 + 30c^2\right) r^2 + \left(24c^2 + \dfrac{36}{7} c^4\right) r \\[1ex]
\quad + 1 - 6c^2 - \dfrac{45}{7} c^4 + \left(\dfrac{12}{7} c^4 + \dfrac{4}{21} c^6\right) \dfrac{1}{r}, & c \le r < 1 - c, \\[2ex]
\dfrac{1}{84c^3} r^8 - \dfrac{15}{224c^3} r^7 + \dfrac{1 - 3c^2}{7c^3} r^6 - \dfrac{1 - 15c^2 - 16c^3}{8c^3} r^5 - \dfrac{3(2 + 5c + 3c^2)}{2c} r^4 \\[1ex]
\quad + \dfrac{1 + 30c^2 + 160c^3 + 225c^4 + 96c^5}{16c^3} r^3 - 5\left(1 + 3c + 3c^2 + c^3\right) r^2 \\[1ex]
\quad - \dfrac{3(1 + 7c^2 - 105c^4 - 224c^5 - 175c^6 - 48c^7)}{56c^3} r + \dfrac{1 + 14c^3 - 84c^5 - 140c^6 - 90c^7 - 21c^8}{28c^3} \\[1ex]
\quad - \dfrac{5 - 36c^2 + 126c^4 - 420c^6 - 576c^7 - 64c^9}{672c^3} \dfrac{1}{r}, & 1 - c \le r < 1 + c.
\end{cases} \]

These smoothing formulas can be usefully employed when the original unsmoothed compactly supported RBF, $s$, contains many shifts of a single basic function, $\Phi$, with a constant value of the radius $R$. Then the coefficients of the powers of $r = |x|/R$ above can all be precomputed, and the smoothed piecewise basic function, $\Psi$, evaluated reasonably efficiently by Horner's method applied to $r^2$ or $r$ as appropriate.

References

ABRAMOWITZ, M., AND STEGUN, I. A., Eds. 1965. Handbook of Mathematical Functions. Dover, New York.

ARONSZAJN, N., AND SMITH, K. T. 1961. Theory of Bessel Potentials. Part 1. Ann. Inst. Fourier., Grenoble 11, 385–475.

BEATSON, R. K., AND DYN, N. 1996. Multiquadric B-splines. J. Approximation Theory 87, 1–24.

BEATSON, R. K., AND LIGHT, W. A. 1993. Quasi-interpolation by thin plate splines on the square. Constructive Approximation 9, 407–433.

CARR, J. C., BEATSON, R. K., CHERRIE, J. B., MITCHELL, T. J., FRIGHT, W. R., MCCALLUM, B. C., AND EVANS, T. R. 2001. Reconstruction and representation of 3d objects with radial basis functions. In Computer Graphics (SIGGRAPH 2001 proceedings), 67–76.

CARR, J. C., BEATSON, R. K., MCCALLUM, B. C., FRIGHT, W. R., MCLENNAN, T. J., AND MITCHELL, T. J. 2003. Smooth surface reconstruction from noisy range data. In GRAPHITE 2003, ACM Press, New York, 119–126.

DONOGHUE, W. F. 1969. Distributions and Fourier Transforms. Academic Press, New York.

DUCHON, J. 1977. Splines minimizing rotation-invariant semi-norms in Sobolev spaces. In Constructive Theory of Functions of Several Variables, Springer-Verlag, Berlin, W. Schempp and K. Zeller, Eds., no. 571 in Lecture Notes in Mathematics, 85–100.

DYN, N., LEVIN, D., AND RIPPA, S. 1986. Numerical procedures for surface fitting of scattered data by radial functions. SIAM J. Sci. Stat. Comput. 7, 2, 639–659.

DYN, N. 1989. Interpolation and approximation with radial and related functions. In Approximation Theory VI, Academic Press, C. K. Chui, L. L. Schumaker, and J. D. Ward, Eds., vol. 1, 211–234.

FASSHAUER, G. E. 1999. On smoothing for multilevel approximation with radial basis functions. In Approximation Theory IX, Vanderbilt University Press, C. K. Chui and L. L. Schumaker, Eds., vol. 2, 55–62.

GELFAND, I. M., AND SHILOV, G. E. 1964. Generalized Functions, vol. 1. Academic Press, New York.

HUTCHINSON, M. F., AND GESSLER, P. E. 1994. Splines – more than just a smooth interpolator. Geoderma 62, 45–67.

JONES, D. S. 1982. The theory of generalised functions. Cambridge University Press, Cambridge.

MOUAT, C. T., AND BEATSON, R. K. 2002. RBF collocation. Tech. Rep. UCDMS 2002/3, Department of Mathematics and Statistics, University of Canterbury.

SCHABACK, R., AND WU, Z. 1996. Operators on radial functions. J. Computational and Applied Mathematics 71, 257–270.

SCHABACK, R. 1993. Comparison of radial basis function interpolants. In Multivariate approximation from CAGD to wavelets, World Scientific, Singapore, 293–305.

SCHABACK, R. 1995. Multivariate interpolation and approximation by translates of a basis function. In Approximation Theory VIII, Vol. I, World Scientific, Singapore, 491–514.

STEIN, M. L. 1999. Interpolation of Spatial Data: Some theory for Kriging. Springer-Verlag, New York.

WAHBA, G. 1990. Spline Models for Observational Data. No. 59 in CBMS-NSF Regional Conference Series in Applied Math. SIAM.

WENDLAND, H. 1995. Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in Computational Mathematics 4, 389–396.