The Gaussian Scale-Space Paradigm and the Multiscale Local Jet

International Journal of Computer Vision, 18, 61-75 (1996) © 1996 Kluwer Academic Publishers, Boston. Manufactured in The Netherlands.

LUC FLORACK
DETI/INESC-Aveiro, Universidade de Aveiro, 3800 Aveiro, Portugal, and Computer Vision Research Group, Utrecht University Hospital, Heidelberglaan 100, NL-3584 CX Utrecht, The Netherlands
luc@cv.ruu.nl

BART TER HAAR ROMENY
Computer Vision Research Group, Utrecht University Hospital, Heidelberglaan 100, NL-3584 CX Utrecht, The Netherlands
bart@cv.ruu.nl

MAX VIERGEVER
Computer Vision Research Group, Utrecht University Hospital, Heidelberglaan 100, NL-3584 CX Utrecht, The Netherlands
max@cv.ruu.nl

JAN KOENDERINK
Dept. of Medical and Physiological Physics, Utrecht University, Princetonplein 5, NL-3584 CC Utrecht, The Netherlands

Received September 1, 1994; Accepted January 11, 1995

Abstract. A representation of local image structure is proposed which takes into account both the image’s spatial structure at a given location, as well as its “deep structure”, that is, its local behaviour as a function of scale or resolution (scale-space). This is of interest for several low-level image tasks. The proposed basis of scale-space, for example, enables a precise local study of interactions of neighbouring image intensities in the course of the blurring process. It also provides an extrapolation scheme for local image data, obtained at a given spatial location and resolution, to a finite scale-space neighbourhood. This is especially useful for the determination of sampling rates and for interpolation algorithms in a multilocal context. Another, particularly straightforward application is image enhancement or deblurring, which is an instance of data extrapolation in the high-resolution direction. A potentially interesting feature of the proposed local image parametrisation is that it captures a trade-off between spatial and scale extrapolations from a given interior point that do not exceed a given tolerance. This trade-off suggests the possibility of a fairly coarse scale sampling at the expense of a dense spatial sampling (large relative spatial overlap of scale-space kernels). The central concept developed in this paper is an equivalence class called the multiscale local jet, which is a hierarchical, local characterisation of the image in a full scale-space neighbourhood. For this local jet, a basis of fundamental polynomials is constructed that captures the scale-space paradigm at the local level up to any given order.

1. Introduction

A crucial aspect to be taken into account in the description of local image structure is the resolution, or equivalently, the inner scale of spatial structures. In scale-space theory, a free scale parameter is introduced to account for the intrinsically multiscale nature of image structure (Witkin, 1983), (Koenderink, 1984b), (Koenderink and Van Doorn, 1987, 1990), (Babaud et al., 1986), (Lindeberg, 1990ab), (Florack et al., 1992, 1993, 1994). In particular, such a free scale parameter provides a sensible meaning to the notion of locality, independent of the image’s underlying sampling characteristics (pixels, voxels), by “blowing up” an infinitesimal neighbourhood to a finite extent.

The unconfounding of sampling grid and local image descriptors is important, because one usually does not care about the grid; images are sampled and rendered on various devices (including the human retina) based on fundamentally different grids. One can even display a greylevel image on a black-and-white device without affecting its relevant content, provided the details of interest are sufficiently large relative to the grid constant (cf. the halftone images in your everyday newspaper). Since the grid entails inevitable sampling and display artifacts, one would like to minimise its disturbing effect on any task the image is intended to support. In practice this means that the sampling grid constant must be small relative to the smallest inner scale of interest.

In view of the above arguments, it is natural to abandon local image descriptors that rely explicitly on the grid. Typical examples of these are the 5-point Laplacean widely applied in 2-dimensional image processing for the extraction of zero-crossings, the 2-point difference quotient intended to approximate a first order image derivative, etc. Operations like these should only show up as small-scale limiting cases (i.e. at scales close to the grid constant).

According to the theory of regular tempered distributions (Schwartz, 1950), a derivative of a (not necessarily smooth) function $L_0$ can be obtained in a well-posed way by correlating it with (i.e. taking the scalar product with) the conjugate derivative of a smooth test function, $\gamma \in C^\infty(\mathbb{R}^d)$ say:

$$L_{i_1 \ldots i_n} = (-1)^n \int d\vec{x}\; L_0(\vec{x})\, \partial_{i_1 \ldots i_n} \gamma(\vec{x}). \qquad (1)$$

Here, $\partial_{i_1 \ldots i_n}$ denotes the n-th order derivative w.r.t. the spatial variables $x^{i_k}$ ($k = 1, \ldots, n$), in which each index $i_k$ has a value from the range $1, \ldots, d$, labelling the d axes of some arbitrary Cartesian coordinate frame centred at $\vec{x} = \vec{0}$. The input function is denoted by $L_0$. The factor $(-1)^n$ expresses the antisymmetric nature of differentiation in a real function space; it is just the sign factor that shows up in an n-fold partial integration of (1).

The scale-space paradigm is in fact an operational instance of Schwartz theory, in which the test functions $\gamma$ are constructed to account for the physical notion of scale. To this end, the test functions are taken to be translated and isotropically scaled versions of a basic, normalised Gaussian kernel. Hence each Gaussian is characterised by its centre location $\vec{x}$ and width $\sigma$, the spatial location and scale of interest, respectively.

Since we will be dealing with local aspects of image structure, it is convenient to restrict attention to a single base point $\vec{x} = \vec{0}$. Clearly this is inessential; the generalisation of (1) to any other spatial location is obtained by a mere shift, thus yielding the more familiar convolution expression:

$$L_{i_1 \ldots i_n}(\vec{x}) = \int d\vec{x}'\; L_0(\vec{x}')\, \partial_{i_1 \ldots i_n} \gamma_\sigma(\vec{x} - \vec{x}'), \qquad (2)$$

in which $\gamma_\sigma$ denotes the normalised Gaussian of width $\sigma$, centred at $\vec{x} = \vec{0}$. The restriction to a Cartesian frame is likewise immaterial; we may assume (1) and (2) to hold in any (not necessarily rectilinear) coordinate system, such as a polar coordinate system, simply by interpreting the derivatives as covariant derivatives. This will be done throughout the paper. For the sake of notational simplicity we will use the Einstein summation convention for each pair of identical upper and lower indices. Indices may be raised and lowered by means of the metric tensor. See also (Kay, 1988), (Lawden, 1962) for an easy introduction to classical tensor calculus, and (Spivak, 1970), chapter 4, vol. I for a more elaborate exposition.

The correlation scheme based on Gaussian derivative filters is pretty straightforward. One can use it to obtain a discrete set of local measurements (the $L_{i_1 \ldots i_n}$, for $n = 0, \ldots, N$, say), and one can repeat this at any discrete number of locations and scales. Although (2) is a locally smooth representation from a pure mathematical point of view, it is important to appreciate that smoothness of this representation does not have any operational significance. That is to say, one cannot store a smooth representation in terms of mere numbers on a physical medium. What can be represented, however, is a routine that enables one to retrieve any sample from some presumed smooth representation. In other words, smoothness is not an attribute of the measurements, despite the smoothness of the underlying kernels by which they may have been obtained. Even worse, the very scale-space paradigm cannot sensibly be attributed to the (virtually arbitrary!) local measurements; any information reminiscent of their prior kernels has been integrated out. From these arguments it is clear that one needs

a dual representation (in the form of a local routine) that captures the scale-space paradigm in addition to the measurements. It is the purpose of this paper to make this explicit.

To this end, we propose a smooth, finite parametrisation of local image structure consistent with the scale-space paradigm. It takes into account both the image’s spatial structure at a given location, as well as its “deep structure”, that is, its local behaviour as a function of scale. This is of interest for fundamental as well as practical low-level image problems. For example, the parametrisation enables a precise study of local interactions of neighbouring image intensities when changing inner scale (Damon, 1990), (Johansen et al., 1986), (Lindeberg, 1992ab). It also provides a scheme for a smooth extrapolation of local image data, obtained at a given spatial location and resolution, into a finite scale-space neighbourhood (not confined to the grid). The extent of the extrapolation region can be related to an accuracy measure. This is especially useful in the determination of sampling strategies (spatial overlap of kernels, scale discretisation). Finally, one may use the parametrisation for image enhancement or deblurring, generalising the well-known scheme based on subtraction of the Laplacean of the image (which turns out to be the lowest order case) (Hummel et al., 1987), (Kimia and Zucker, 1993), (Wang et al., 1983).

The central concept in this paper is the multiscale local jet, which will be explained in the next section. It is basically a hierarchical, local characterisation of the image (or rather, of an equivalence class of metamerical images (Koenderink, 1992)) in a full, (d + 1)-dimensional scale-space neighbourhood.

2. Theory

In order to appreciate the notion of a multiscale local jet, consider the scale-space $L: \mathbb{R}^d \times \mathbb{R}^+ \to \mathbb{R}: (\vec{x}, \sigma) \mapsto L(\vec{x}, \sigma)$. It is constrained by the diffusion equation

$$\left(\Delta - \frac{\partial}{\partial t}\right) L = 0. \qquad (3)$$

The evolution parameter t relates to spatial scale according to $2s = \sigma^2$ and $t = s - s_0$. In other words, t = 0 corresponds to a physical inner scale $\sigma = \sigma_0$, which will be assumed to lie somewhere in between physically sensible limits, say $\max(\varepsilon/\sigma_0, \sigma_0/\ell) < 1$, where $\varepsilon$ denotes the grid constant (pixel or voxel separation) and $\ell$ the size of the image domain or region of interest.

The multiscale local jet of order N can now be defined as the equivalence class of images that have the same local structure up to N-th order (inclusive), in space as well as in scale. This is nontrivial, because scale and space are confounded by (3). Consequently, the order of the jet relates to both space and scale in some dependent way, and has to be given a precise meaning.

A simple way of parametrising a local jet is by considering a truncated local Taylor expansion that forms the common lowest order part of all local jet members. The coefficients in that expansion then parametrise the local jet. This definition seems to require a choice of a local coordinate system. It must be realised, however, that the concept of a local jet is really coordinate independent. In order to make this coordinate independence manifest, the coefficients in the Taylor expansion are to be regarded as covariant image derivatives, that is, image derivatives that behave as covariant tensors.

In order to make the above definition more precise, consider a formal expansion near the origin $(\vec{x}, t) = (\vec{0}, 0)$ of a coordinate system centred at a given interior point (recall that t corresponds to a physical scale $\sigma$, so t = 0 corresponds to some interior point $\sigma = \sigma_0$). If $(\delta\vec{x}, \delta t)$ is a “small” excursion away from the origin, “small” meaning $\delta t = O(s_0)$ and $\|\delta\vec{x}\| = O(\sigma_0)$, then the formal expansion reads

$$L_\infty(\delta\vec{x}, \delta t) = \sum_{n=0}^{\infty} \sum_{j=0}^{\infty} \frac{1}{n!\, j!}\, L^{(j)}_{i_1 \ldots i_n}\, \delta x^{i_1} \cdots \delta x^{i_n}\, \delta t^j, \qquad (4)$$

in which a subscript of L refers to a covariant spatial derivative and a parenthesised superscript to a scale derivative, evaluated at the point of interest. It is understood that all coefficients are to be evaluated at the origin $(\vec{x}, t) = (\vec{0}, 0)$, unless indicated otherwise. It will turn out that these coefficients can be obtained by convolving the input image with certain spatial derivatives of the Gaussian kernel corresponding to the level of scale at the point of interest.

The aim is to truncate the expansion to obtain a hierarchy of polynomials of increasing order which identically satisfy the diffusion constraint (3) for proper scale-space behaviour. These ordered polynomials may serve to represent the scale-space of L near any given point with increasing accuracy. They may be used to represent the image's local jet of a given order, that is, the equivalence class of all images with a certain order of contact at the point of interest. The notion of contact now refers to both its spatial as well as its scaling behaviour, and because these are linked through diffusion it is clear that the coefficients in (4) are interrelated as well. Indeed, all t-derivatives are determined by spatial derivatives only, for we have, for all j > 0:

$$L^{(j)}_{i_1 \ldots i_n} = g^{kl}\, L^{(j-1)}_{i_1 \ldots i_n k l}, \qquad (5)$$

or, upon resolving the recursion:

$$L^{(j)}_{i_1 \ldots i_n} = g^{k_1 l_1} \cdots g^{k_j l_j}\, L_{i_1 \ldots i_n k_1 l_1 \ldots k_j l_j}. \qquad (6)$$

Here and henceforth, $g^{ij}$ denotes the contravariant metric tensor, that is, the inverse of the covariant metric tensor $g_{ij}$. In a Cartesian coordinate system they both boil down to the invariant Kronecker symbol $\delta_{ij}$ (that is, 1 if i = j, 0 otherwise), and one may forget about the distinction between covariant and contravariant indices. In any case, the length of the infinitesimal arc connecting $x^i$ to $x^i + dx^i$ is, by definition, given by the covariant metric tensor:

$$dl^2 = g_{ij}\, dx^i\, dx^j. \qquad (7)$$

The Laplacean $\partial L/\partial t = \Delta L$ (at the point of interest) is the lowest order and most familiar instance of (6). Equations (5) and (6) follow from substituting the formal expansion (4) into the diffusion equation (3), using

$$\Delta\{x^{i_1} \cdots x^{i_n}\} = g^{kl} \sum_{r=1}^{n} \sum_{s=1, s \neq r}^{n} x^{i_1} \cdots x^{i_{r-1}}\, \delta^{i_r}_k\, x^{i_{r+1}} \cdots x^{i_{s-1}}\, \delta^{i_s}_l\, x^{i_{s+1}} \cdots x^{i_n}, \qquad (8)$$

or, equivalently, by exploiting the linearity of (3). So the full solution to the diffusion equation in a neighbourhood of the origin is given by the expansion

$$L_\infty(\delta\vec{x}, \delta t) = \sum_{n=0}^{\infty} \sum_{j=0}^{\infty} \frac{1}{n!\, j!}\, g^{k_1 l_1} \cdots g^{k_j l_j}\, L_{i_1 \ldots i_n k_1 l_1 \ldots k_j l_j}\, \delta x^{i_1} \cdots \delta x^{i_n}\, \delta t^j, \qquad (9)$$

which can also be written as

$$L_\infty(\delta\vec{x}, \delta t) = \exp\{\delta t\, \Delta + \delta\vec{x} \cdot \nabla\}\, L. \qquad (10)$$

This condensed formulation nicely reveals the role of the Laplacean as the infinitesimal generator of rescalings, quite similar to the role of the gradient as the generator of translations. The expansion (10) is valid on any finite $(\vec{x}, t)$-neighbourhood of the origin.

Our objective is to determine how to truncate the expansion such as to retain full compatibility with the diffusion equation. In other words, we want to consider a finite, N-th order approximation $L_N(\delta\vec{x}, \delta t)$ of $L_\infty(\delta\vec{x}, \delta t)$ with exact scale-space properties.

PROPOSITION 1 (SCALE-SPACE POLYNOMIALS) The N-th order polynomial solution to the diffusion equation is given by

$$L_N(\delta\vec{x}, \delta t) = \sum_{n=0}^{N} \sum_{j=0}^{[(N-n)/2]} \frac{1}{n!\, j!}\, g^{k_1 l_1} \cdots g^{k_j l_j}\, L_{i_1 \ldots i_n k_1 l_1 \ldots k_j l_j}\, \delta x^{i_1} \cdots \delta x^{i_n}\, \delta t^j.$$

Here, [x] denotes the entier of x, that is the largest integer n for which n ≤ x.

The finite expansion in Proposition 1 represents the image's multiscale local N-jet. Before turning to the proof, note that the N-th order polynomial $L_N$ is a sum of n-th order, homogeneous polynomials $L^{(n)}$, each of which contains homogeneous monomials of degree n. For consistency, one has to attribute an order or degree of homogeneity to t which is twice that of $x^i$ (dimensional analysis):

LEMMA 1 (HOMOGENEOUS POLYNOMIALS) The N-th order polynomial $L_N$ defined in Proposition 1 can be written as

$$L_N(\delta\vec{x}, \delta t) = \sum_{n=0}^{N} L^{(n)}(\delta\vec{x}, \delta t),$$

with homogeneous polynomials $L^{(n)} \overset{\text{def}}{=} L_n - L_{n-1}$ ($L_{-1} = 0$) given by

$$L^{(n)}(\delta\vec{x}, \delta t) = \sum_{j=0}^{[n/2]} \frac{1}{(n-2j)!\, j!}\, g^{k_1 l_1} \cdots g^{k_j l_j}\, L_{i_1 \ldots i_{n-2j} k_1 l_1 \ldots k_j l_j}\, \delta x^{i_1} \cdots \delta x^{i_{n-2j}}\, \delta t^j.$$
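The truncation rule of Proposition 1 can be checked in a simple special case. The following sketch assumes one spatial dimension and a Cartesian frame, so that the metric is trivial and the j-th scale derivative of the n-th spatial derivative is just the (n + 2j)-th spatial derivative of the image. The function names and the sample derivative values are illustrative only and are not taken from the paper. The sketch collects the coefficients of the terms $x^n t^j$ with $n + 2j \le N$ and verifies, in exact rational arithmetic, that the time derivative and the second spatial derivative of the truncated polynomial coincide.

```python
from fractions import Fraction
from math import factorial


def scale_space_polynomial(derivs):
    """Coefficients {(n, j): c} of x**n * t**j for the 1-D local N-jet.

    derivs[m] is the m-th spatial derivative at the origin and
    N = len(derivs) - 1.  Only terms with n + 2*j <= N are kept.
    """
    order = len(derivs) - 1
    coeffs = {}
    for n in range(order + 1):
        for j in range((order - n) // 2 + 1):
            # In 1-D the j-fold Laplacean of the n-th derivative is derivs[n + 2j].
            coeffs[(n, j)] = Fraction(derivs[n + 2 * j]) / (factorial(n) * factorial(j))
    return coeffs


def time_derivative(coeffs):
    """Coefficients of d/dt of the polynomial."""
    out = {}
    for (n, j), c in coeffs.items():
        if j > 0:
            out[(n, j - 1)] = out.get((n, j - 1), 0) + j * c
    return out


def laplacean(coeffs):
    """Coefficients of d^2/dx^2 of the polynomial."""
    out = {}
    for (n, j), c in coeffs.items():
        if n > 1:
            out[(n - 2, j)] = out.get((n - 2, j), 0) + n * (n - 1) * c
    return out


# Arbitrary (hypothetical) derivative values up to fifth order.
jet = scale_space_polynomial([3, -1, 4, 1, -5, 9])
diffusion_holds = time_derivative(jet) == laplacean(jet)
```

The check succeeds for arbitrary coefficient values because the truncation treats t as a quantity of degree two; keeping all terms with n ≤ N and j ≤ N separately would in general break the diffusion constraint.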

Lemma 1 is merely an order-by-order rearrangement of terms in Proposition 1. Now it is easily verified that the homogeneous polynomials $L^{(n)}$, and hence the $L_N$, indeed satisfy the diffusion equation:

Proof of Proposition 1: From (8) it follows, after some dummy index manipulations, that

$$\Delta L^{(n)}(\delta\vec{x}, \delta t) = \sum_{j=0}^{[n/2]-1} \frac{1}{(n-2j-2)!\, j!}\, g^{k_1 l_1} \cdots g^{k_{j+1} l_{j+1}}\, L_{i_1 \ldots i_{n-2j-2} k_1 l_1 \ldots k_{j+1} l_{j+1}}\, \delta x^{i_1} \cdots \delta x^{i_{n-2j-2}}\, \delta t^j.$$

One readily verifies that this is just $\frac{\partial}{\partial \delta t} L^{(n)}(\delta\vec{x}, \delta t)$.

The $L^{(n)}$ have a definite parity, viz. $(-1)^n$. Since the coefficients in Lemma 1 may have arbitrary values, they can be put in isolation so as to yield an n-th order, symmetric contratensor, which represents a fundamental, n-th order, homogeneous polynomial solution to the diffusion equation:

COROLLARY 1 (POLYNOMIAL BASIS) The n-th order homogeneous polynomial solution to the diffusion equation can be written as

$$L^{(n)}(\delta\vec{x}, \delta t) = L_{i_1 \ldots i_n}\, P^{i_1 \ldots i_n}(\delta\vec{x}, \delta t)$$

in terms of the basis polynomials

$$P^{i_1 \ldots i_n}(\vec{x}, t) = \sum_{j=0}^{[n/2]} \frac{1}{(n-2j)!\, j!}\, g^{(i_1 i_2} \cdots g^{i_{2j-1} i_{2j}}\, x^{i_{2j+1}} \cdots x^{i_n)}\, t^j,$$

in which parentheses surrounding upper indices denote index symmetrisation.

The homogeneous polynomials $P^{i_1 \ldots i_n}$ are independent of the measurement data. They can apparently be contracted onto arbitrary n-th order cotensors $L_{i_1 \ldots i_n}$ so as to yield polynomials that manifestly satisfy the diffusion equation, thus providing a smooth interpretation of the point measurements on a full local neighbourhood in space and scale. In other words, regardless of how the measurements are obtained, the contraction onto the corresponding fundamental contratensor may well be regarded as the exact local scale-space of some underlying d-dimensional input signal: at the local level, nothing indeed reminds us of a prior Gaussian filtering stage. Reversely, given the initial d-dimensional data, the n-th order cotensor $L_{i_1 \ldots i_n}$ represents its n-th order covariant spatial derivative at a given interior point; its multiscale local jet is then represented by Lemma 1. Thus the fundamental tensor polynomials in Corollary 1 are the dual objects which, by contraction onto the corresponding measurements, enforce a smooth internal representation by the scale-space paradigm.

Note that, because of symmetrisation, the number of essential components in Corollary 1 equals

$$\binom{n+d-1}{n}.$$

The degrees of freedom that have been discarded by truncation lead us to the issue of metamerical images (Koenderink, 1992). Recall that our starting point was the formal expansion of a locally defined scale-space (9), and our aim was to reconcile this with a set of local measurements up to some finite order. This led to the operationally well-defined expression in Proposition 1. Since both expressions satisfy the diffusion equation, so does their difference $R_N(\delta\vec{x}, \delta t) = L_\infty(\delta\vec{x}, \delta t) - L_N(\delta\vec{x}, \delta t)$ (we will call it the ghost image), by virtue of linearity. This ghost image can be represented in terms of the polynomials $P^{i_1 \ldots i_n}(\delta\vec{x}, \delta t)$, for n ≥ N + 1. Since the cotensors in this polynomial representation do not correspond to actual measurements, they must be considered as mere formal parameters. They constitute a local, order by order characterisation of an infinite-dimensional class of scalar field configurations that induce identical local measurements of orders 0, ..., N. For this reason the hypothetical images, obtained by arbitrary extensions of the physical measurement data by some cross-section of the formal parameter space, are called metamerical images.

What about the rate of convergence of the expansion? This is obviously a crucial question if a truncation of the expansion is required to be “sufficiently” accurate within a certain finite neighbourhood of the central point of expansion. Let us therefore take a closer look at the nature of the ghost image.
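The contraction of measurements onto the basis polynomials of Corollary 1 can be made concrete at second order. The sketch below assumes a Cartesian frame (metric equal to the Kronecker symbol), in which the lowest order basis polynomials read $P = 1$, $P^i = x^i$ and $P^{ij} = \frac{1}{2} x^i x^j + \delta^{ij} t$. The function names and the sample measurement values are hypothetical. It evaluates the resulting local 2-jet at an arbitrary offset in space and scale, and checks the diffusion constraint by central differences, which are exact up to rounding for a polynomial of this degree.

```python
def local_2jet(value, grad, hess, x, t):
    """Second order multiscale local jet at the offset (x, t), Cartesian frame.

    value, grad and hess are the zeroth, first and second order measurements
    at the point of interest; x is a list of d spatial offsets, t a scale offset.
    """
    d = len(x)
    result = value                                      # P = 1
    result += sum(grad[i] * x[i] for i in range(d))     # P^i = x^i
    for i in range(d):
        for j in range(d):
            p_ij = 0.5 * x[i] * x[j] + (t if i == j else 0.0)
            result += hess[i][j] * p_ij                 # P^{ij} = x^i x^j / 2 + delta^{ij} t
    return result


def diffusion_residual(jet, x, t, h=1e-2):
    """dL/dt minus the Laplacean of L, both by central differences."""
    d_dt = (jet(x, t + h) - jet(x, t - h)) / (2.0 * h)
    lap = 0.0
    for i in range(len(x)):
        forward = list(x)
        backward = list(x)
        forward[i] += h
        backward[i] -= h
        lap += (jet(forward, t) - 2.0 * jet(x, t) + jet(backward, t)) / (h * h)
    return d_dt - lap


# Hypothetical measurements for a two-dimensional image.
value, grad, hess = 1.0, [0.5, -2.0], [[3.0, 1.0], [1.0, -1.0]]


def jet(x, t):
    return local_2jet(value, grad, hess, x, t)


residual = diffusion_residual(jet, [0.3, -0.4], 0.2)
```

Along the scale axis the 2-jet changes at a rate given by the trace of the Hessian, which is the lowest order statement of the diffusion equation at the point of interest.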

Definition. Let $\Omega$ be some neighbourhood containing $(\vec{x}, t) = (\vec{0}, 0) \in \mathbb{R}^d \times \mathbb{R}$, such that $(\delta\vec{x}, \delta t) \in \Omega$ implies that the path $C = \{(\vec{x}, t) = (\lambda\,\delta\vec{x}, \lambda^2 \delta t) \mid \lambda \in I = [0, 1]\}$ is contained in $\Omega$. Then the homotopy $\Lambda: \Omega \times I \to \mathbb{R}$ for the image $L: \Omega \to \mathbb{R}$ is defined by

$$\Lambda(\delta\vec{x}, \delta t; \lambda) = L(\lambda\,\delta\vec{x}, \lambda^2 \delta t). \qquad (12)$$

The homotopy $\Lambda$ describes the smooth transition of greylevels one encounters when walking away from the origin $(\vec{0}, 0)$ along a specific path $C \subset \Omega$ towards a given neighbouring point $(\delta\vec{x}, \delta t)$. At the endpoints we have $L(\vec{0}, 0) = \Lambda(\delta\vec{x}, \delta t; 0)$ and $L(\delta\vec{x}, \delta t) = \Lambda(\delta\vec{x}, \delta t; 1)$. The path itself is obtained by a scaling of the excursion parameters $(\delta\vec{x}, \delta t)$. Note that $\Omega$ is star-shaped in the $(\vec{x}, \sigma)$-domain, not in the $(\vec{x}, t)$-domain.

In order to appreciate the greylevel deviation one encounters during the traversal of the path, consider the Taylor expansion of $\Lambda(\delta\vec{x}, \delta t; \lambda)$ near $\lambda = 0$, and evaluate the result at $\lambda = 1$ (which is possible by virtue of the star-shaped domain). Recall from analysis that the Taylor expansion can be derived by repeated partial integration of the identity

$$\Lambda(\delta\vec{x}, \delta t; 1) = \Lambda(\delta\vec{x}, \delta t; 0) + \int_0^1 d\lambda\, \frac{d\Lambda}{d\lambda}(\delta\vec{x}, \delta t; \lambda),$$

which immediately gives us the exact integral form for the remainder after N-fold repetition:

COROLLARY 2 (PHYSICAL vs. GHOST IMAGE) The full solution to the diffusion equation, $L_\infty(\delta\vec{x}, \delta t)$, can be separated into a finite dimensional “physical image”, $L_N(\delta\vec{x}, \delta t)$, and an infinite dimensional ghost image, $R_N(\delta\vec{x}, \delta t)$, as follows:

$$L_\infty(\delta\vec{x}, \delta t) = L_N(\delta\vec{x}, \delta t) + R_N(\delta\vec{x}, \delta t),$$

with

$$L_N(\delta\vec{x}, \delta t) = \sum_{n=0}^{N} \frac{1}{n!} \frac{d^n \Lambda}{d\lambda^n}(\delta\vec{x}, \delta t; 0),$$

$$R_N(\delta\vec{x}, \delta t) = \frac{1}{N!} \int_0^1 d\lambda\, (1 - \lambda)^N\, \frac{d^{N+1} \Lambda}{d\lambda^{N+1}}(\delta\vec{x}, \delta t; \lambda),$$

and with $\Lambda$ given by (12).

Now we see the motivation behind our particular choice of homotopy:

OBSERVATION 1 Let $L^{(n)}(\delta\vec{x}, \delta t)$ be defined as in Lemma 1. Then

$$\frac{1}{n!} \frac{d^n \Lambda}{d\lambda^n}(\delta\vec{x}, \delta t; 0) = L^{(n)}(\delta\vec{x}, \delta t),$$

with the identification given by the scale-space defining diffusion equation

$$\frac{\partial}{\partial t} = \Delta.$$

From Corollary 2 and Observation 1 it follows that the remainder $R_N(\delta\vec{x}, \delta t)$ in the expansion of the homotopy is indeed the local ghost image $L_\infty(\delta\vec{x}, \delta t) - L_N(\delta\vec{x}, \delta t)$, that is, all “lacking evidence” for an N-th order local measurement.

In some sense one would like to minimise the effect of the ghost image, since it induces an uncertainty in the measurements even in the hypothetical case that the local measurements are obtained with infinite precision. It is therefore of fundamental interest to find an upper bound for the ghost image as a function of order N and excursion parameters $(\delta\vec{x}, \delta t)$. In fact, we immediately obtain the following result:

LEMMA 2 (GHOST IMAGE) Let $R_N(\delta\vec{x}, \delta t)$ be defined as in Corollary 2. Then there exists a point $\tilde{\lambda} \in I$ on the path defined by the homotopy (12), such that

$$R_N(\delta\vec{x}, \delta t) = \frac{1}{(N+1)!} \frac{d^{N+1} \Lambda}{d\lambda^{N+1}}(\delta\vec{x}, \delta t; \tilde{\lambda}).$$

This is a standard way of rewriting the integral expression for the remainder $R_N(\delta\vec{x}, \delta t)$ in Corollary 2, known as the formula of Lagrange, which follows straightforwardly by virtue of continuity of the derivative in the integrand as a function of $\lambda$ ($\tilde{\lambda} \in I$ is the parameter value at which the (N + 1)-th order derivative of $\Lambda(\delta\vec{x}, \delta t; \lambda)$ attains its maximum value on the path C). It is clear that the ghost image must vanish at the origin, and that, by convergence of the Taylor series, it is a monotonically decreasing function of order N. However, Lemma 2, as it stands, is not a very useful way of writing the ghost image,

since it does not make the $(\delta\vec{x}, \delta t)$-dependence explicit. Since $\Lambda(\delta\vec{x}, \delta t; \lambda)$ is just $L(\lambda\,\delta\vec{x}, \lambda^2 \delta t)$, we may substitute $\lambda$-derivatives for $\vec{x}$- and t-derivatives using the chain rule together with Leibnitz's product rule. Since this is a rather tedious and cumbersome exercise, we will only present the result here and leave the details to the appendix:

RESULT 1 (GHOST IMAGE) Cf. Lemma 2:

Here, it is understood that each of the k t-derivatives stands for a Laplacean as usual.

By taking terms with index k = N - q + 1 apart, we see that

which entails Observation 1 as a special case (viz. for $\tilde{\lambda} = 0$).

Result 1 shows that the value of the ghost image at the point $(\delta\vec{x}, \delta t)$ can be written as a finite polynomial of excursion parameters, rather than an infinite series, the coefficients of which are image derivatives of orders N + 1, ..., 2(N + 1) evaluated at some point $(\tilde{\lambda}\,\delta\vec{x}, \tilde{\lambda}^2 \delta t)$ on the path C that connects $(\delta\vec{x}, \delta t)$ to the origin. This may come as a surprise, since the ghost image clearly has an infinite number of degrees of freedom. To resolve this apparent paradox, recall that the formal parameter $\tilde{\lambda} \in I$ corresponds to the point on the path C at which the (N + 1)-th order derivative of $\Lambda(\delta\vec{x}, \delta t; \lambda)$ attains its maximum value. This path, and hence the value of the formal parameter $\tilde{\lambda}$, depends on the endpoint $(\delta\vec{x}, \delta t)$. Thus the function $\tilde{\lambda}(\delta\vec{x}, \delta t)$ is the rug that covers the missing degrees of freedom.

3. Examples

3.1. Some Lowest Order Cases

Up to fourth order, the homogeneous polynomial solutions are given by

$$L^{(0)}(\vec{x}, t) = L,$$
$$L^{(1)}(\vec{x}, t) = L_i\, x^i,$$
$$L^{(2)}(\vec{x}, t) = \frac{1}{2} L_{ij}\, x^i x^j + g^{ij} L_{ij}\, t,$$
$$L^{(3)}(\vec{x}, t) = \frac{1}{6} L_{ijk}\, x^i x^j x^k + g^{jk} L_{ijk}\, x^i\, t,$$
$$L^{(4)}(\vec{x}, t) = \frac{1}{24} L_{ijkl}\, x^i x^j x^k x^l + \frac{1}{2} g^{kl} L_{ijkl}\, x^i x^j\, t + \frac{1}{2} g^{ij} g^{kl} L_{ijkl}\, t^2.$$

Addition of these, with the coefficients given by the corresponding fixed-scale spatial derivatives of a given image, will yield a representative member of the image's multiscale local 4-jet.

The symmetric tensors corresponding to the above scalar polynomials are given by

$$P(\vec{x}, t) = 1,$$
$$P^{i}(\vec{x}, t) = x^i,$$
$$P^{ij}(\vec{x}, t) = \frac{1}{2} x^i x^j + g^{ij}\, t,$$
$$P^{ijk}(\vec{x}, t) = \frac{1}{6} x^i x^j x^k + g^{(ij} x^{k)}\, t,$$
$$P^{ijkl}(\vec{x}, t) = \frac{1}{24} x^i x^j x^k x^l + \frac{1}{2} g^{(ij} x^k x^{l)}\, t + \frac{1}{2} g^{(ij} g^{kl)}\, t^2.$$

These fundamental polynomials, together with all higher order ones, constitute a (non-orthogonal) basis for the local decomposition of any scale-space.

3.2. Scale-Space Germs

By means of suitable contractions, all kinds of so-called germs can be constructed from the fundamental tensors of Corollary 1. Germs constrained

by diffusion play an important role in local Morse theory for the diffusion equation (Damon, 1990), (Koenderink, 1986), (Lindeberg, 1992b). Examples are the “H-versal germs” $f(\vec{x})$ and $g(\vec{x}, t)$ given by

$$f(\vec{x}) = \frac{1}{2} \sum_{i=1}^{d} a_i x_i^2 \quad \text{with} \quad \sum_{i=1}^{d} a_i = 0, \qquad (14)$$

and

$$g(\vec{x}, t) = \frac{1}{2} \sum_{i=1}^{d} x_i^2 + d\, t. \qquad (15)$$

To see how these relate to the second order fundamental polynomial $P^{ij}(\vec{x}, t)$, take an arbitrary, symmetric cotensor $L_{ij}$ of rank 2, and decompose it into its traceless part and the remainder, say

$$L_{ij} = Z_{ij} + R_{ij}, \qquad (16)$$

with

$$Z_{ij} = L_{ij} - \frac{1}{d}\, g^{kl} L_{kl}\, g_{ij},$$

$$R_{ij} = \frac{1}{d}\, g^{kl} L_{kl}\, g_{ij}.$$

Note that this decomposition is covariant. Then the germ $f(\vec{x})$ (14) is obtained by contraction of $P^{ij}(\vec{x}, t)$ and $Z_{ij}$:

$$f(\vec{x}) = Z_{ij}\, P^{ij}(\vec{x}, t), \qquad (19)$$

(equivalently, one can decompose $P^{ij}(\vec{x}, t)$ into a traceless tensor and one proportional to $g^{ij}$ and then contract $L_{ij}$ onto the former). The germ $g(\vec{x}, t)$ in (15) emerges, up to a scaling factor, by contracting $P^{ij}$ onto $R_{ij}$:

$$\lambda\, g(\vec{x}, t) = R_{ij}\, P^{ij}(\vec{x}, t), \qquad (20)$$

in which the scalar $\lambda$ is given by

$$\lambda = \frac{1}{d}\, g^{ij} L_{ij}. \qquad (21)$$

This decomposition makes the complementarity of the germs $f(\vec{x})$ and $g(\vec{x}, t)$ obvious; the local behaviour of second order image structure can always be explained in terms of these two germs.

Observe that the t-dependence drops out of (19), as a consequence of $g^{ij} Z_{ij} = 0$. To see that (19) is really nothing but (14), take any local Cartesian frame and diagonalise $L_{ij}$ by a suitable rotation of the frame (this construction of coordinates is usually called a gauge condition), say $L_{ij} \mapsto \mathrm{diag}(\lambda_1, \ldots, \lambda_d)$, upon which the scalar (19) reduces to

$$f(\vec{x}) = \frac{1}{2} \sum_{i=1}^{d} a_i x_i^2,$$

with the $a_i$ given by

$$a_i = \lambda_i - \frac{1}{d} \sum_{j=1}^{d} \lambda_j. \qquad (23)$$

These numbers are indeed subject to the constraint in (14), but are otherwise arbitrary, thus capturing d - 1 structural degrees of freedom of the Hessian. It can be shown that the eigenvalues $\lambda_k$ (k = 1, ..., d) can be expressed in terms of traces over powers of the Hessian $L_{ij}$ up to order d, inclusive. Note in particular that the residual degree of freedom that has dropped out of (23) is just (21):

$$\lambda = \frac{1}{d} \sum_{i=1}^{d} \lambda_i.$$

The true scale-dependent part is apparently confined to the “blob-like” germ $g(\vec{x}, t)$; indeed, the rate of change of inner scale is completely determined by a single degree of freedom of the Hessian $L_{ij}$, viz. the Laplacean of the image. The germ $g(\vec{x}, t)$ is the canonical form describing the scale evolution at the umbilical points of the image’s second order structure (that is at the points where the quadratic surface $L^{(2)}(\vec{x}, t = 0)$ is equally curved in all spatial directions), whereas the t-independent germ $f(\vec{x})$ describes the stationary behaviour of the image’s second order structure at the well-known Laplacean zero-crossings (in d = 2, these correspond to the anti-umbilical points). The latter germ can be decomposed into a linear combination of d - 1 independent canonical forms that describe the stationary behaviour at the various prototypical “hypersaddles”.
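The split of second order structure into a stationary saddle part and a scale-dependent blob part can be illustrated numerically. The sketch below assumes a Cartesian frame, so that the metric is the Kronecker symbol and the second order basis polynomial is $\frac{1}{2} x^i x^j + \delta^{ij} t$. The function names and the sample Hessian are hypothetical. It decomposes the Hessian into its traceless part and the remaining trace part, contracts each onto the basis polynomial, and confirms that only the trace part depends on t.

```python
def second_order_basis(x, t):
    """Second order basis polynomial x^i x^j / 2 + delta^{ij} t (Cartesian frame)."""
    d = len(x)
    return [[0.5 * x[i] * x[j] + (t if i == j else 0.0) for j in range(d)]
            for i in range(d)]


def split_hessian(hess):
    """Return (traceless part, trace part) of a symmetric d x d Hessian."""
    d = len(hess)
    mean_diagonal = sum(hess[i][i] for i in range(d)) / d
    trace_part = [[mean_diagonal if i == j else 0.0 for j in range(d)]
                  for i in range(d)]
    traceless = [[hess[i][j] - trace_part[i][j] for j in range(d)]
                 for i in range(d)]
    return traceless, trace_part


def contract(cotensor, contratensor):
    """Full contraction of two rank-2 arrays."""
    d = len(cotensor)
    return sum(cotensor[i][j] * contratensor[i][j]
               for i in range(d) for j in range(d))


# Hypothetical second order measurements of a two-dimensional image.
hess = [[2.0, 0.5], [0.5, -1.0]]
traceless, trace_part = split_hessian(hess)
x = [0.3, -0.7]

saddle_coarse = contract(traceless, second_order_basis(x, 0.0))
saddle_fine = contract(traceless, second_order_basis(x, 1.2))      # same value: no t-dependence
blob_rate = (contract(trace_part, second_order_basis(x, 1.0))
             - contract(trace_part, second_order_basis(x, 0.0)))   # trace of the Hessian
```

The scale rate of the blob part equals the Laplacean of the image at the point of interest, in line with the remark that a single degree of freedom of the Hessian determines the scale evolution.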

Fig. 1. From left to right: a second order light blob, its complementary saddle, a third order blob modulated by a linear ramp, and its complementary “monkey saddle”. Second and third order local image structure can always be explained in terms of linear combinations of such pairs.

Figure 1 shows a typical blob and saddle in 2D, corresponding to the decomposition of $L_{ij}$ into $R_{ij} = L_{ij} - Z_{ij}$ and $Z_{ij}$, respectively.

As a less trivial example, consider the complementary cotensors

$$Z_{ijk} = L_{ijk} - \frac{1}{d+2} \left( g^{lm} L_{ilm}\, g_{jk} + g^{lm} L_{jlm}\, g_{ik} + g^{lm} L_{klm}\, g_{ij} \right),$$

$$R_{ijk} = \frac{1}{d+2} \left( g^{lm} L_{ilm}\, g_{jk} + g^{lm} L_{jlm}\, g_{ik} + g^{lm} L_{klm}\, g_{ij} \right).$$

Again we have (cf. (16))

$$L_{ijk} = Z_{ijk} + R_{ijk}.$$

Contraction of $Z_{ijk}$ onto $P^{ijk}(\vec{x}, t)$ yields the monomial

$$\phi(\vec{x}) = Z_{ijk}\, P^{ijk}(\vec{x}, t), \qquad (28)$$

or

$$\phi(\vec{x}) = \frac{1}{6} Z_{ijk}\, x^i x^j x^k,$$

which is t-independent and is indeed a stationary solution of the diffusion equation. Its complement is given by the t-dependent monomial (a “blob” modulated by a linear ramp)

$$\psi(\vec{x}, t) = R_{ijk}\, P^{ijk}(\vec{x}, t). \qquad (30)$$

Note that $\psi(\vec{x}, t)$ contains only a single degree of freedom, say $w_i = \frac{1}{d+2}\, g^{jk} L_{ijk}$; all other degrees of freedom contribute to the scale-independent $\phi(\vec{x})$.

For the sake of simplicity, consider the 2-dimensional case. From differential geometry it is well-known that a cubic binary form $f(x, y) = a_0 y^3 + a_1 x y^2 + a_2 x^2 y + a_3 x^3$ represents a Monge patch parametrisation of a monkey saddle, if and only if the corresponding cubic obtained by setting $(x, y) = (1, \zeta)$ in the binary form has three distinct roots (Lipschutz, 1969). That this is indeed the case in (28) for arbitrary coefficients $L_{ijk}$ (degeneracies apart) follows from algebra (Salden et al., 1994): the cubic $f(\zeta) = a_0 \zeta^3 + a_1 \zeta^2 + a_2 \zeta + a_3$ has three distinct roots if and only if its discriminant $D = a_1^2 a_2^2 + 18 a_0 a_1 a_2 a_3 - 4 a_0 a_2^3 - 4 a_1^3 a_3 - 27 a_0^2 a_3^2$ is positive. The reader may verify that the discriminant corresponding to (28) in the 2-dimensional case is given in covariant form by

$$D = \frac{\left( g^{il} g^{jm} g^{kn} \left( 4 L_{ijk} L_{lmn} - 3 L_{ijl} L_{kmn} \right) \right)^2}{3072}, \qquad (32)$$

which is indeed nonnegative definite. In fact, D degenerates if and only if $Z_{ijk}$, and hence $\phi(\vec{x})$, vanishes identically, i.e. if and only if there exists a covector $w_i$ such that $L_{ijk} = w_i g_{jk} + w_j g_{ik} + w_k g_{ij}$, in which case $\psi(\vec{x}, t) = w_i x^i \left( \frac{1}{2} g_{jk} x^j x^k + (d + 2)\, t \right)$. This shows that, generically, (28) indeed parametrises a monkey saddle. Like with the second order saddle, a monkey saddle can neither be called a light nor a dark blob, and is indifferent with respect to blurring, while the “third order blob” (30) is a genuine blob modulated by a linear ramp, which is sensitive to blurring. However, owing to this linear ramp modulation a third order blob cannot be classified as light or dark either: changing the sign of the underlying blob essentially amounts to reversing the gradient direction of the modulating ramp, $w_i \mapsto -w_i$; the result is a $\pi$-rotated copy of the original picture. Third order local image structure can always be explained as a sum of a monkey saddle and a blob-on-a-ramp. Figure 1 shows a typical monkey saddle and “third order blob” in 2D, corresponding to the decomposition of $L_{ijk}$ into $Z_{ijk}$ and $R_{ijk} = L_{ijk} - Z_{ijk}$, respectively.

One may appreciate the advantage of representing germs in a manifest covariant form, such as (19), (20), (28), and (30) instead of the conventional, usually coordinate dependent representation, such as (14). One can always gauge down to canonical forms, such as (14), but in doing so things may get obscured by the choice of coordinates (in particular, the form of (14) depends on the coordinate system, so it is not at all clear whether it describes anything meaningful!). Using the covariant formalism, all seemingly different coordinate representations of one and the same germ are captured by a single, manifest invariant expression.

3.3. Deblurring Gaussian Blur

A special instance of (10) is deblurring, which is obtained by taking $\delta\vec{x} = \vec{0}$ and $\delta t = -\varepsilon\sigma^2$ for $0 < \varepsilon \le \frac{1}{2}$. According to Proposition 1 we get, setting $L_J(\varepsilon) = L_N(\delta\vec{x} = \vec{0}, \delta t = -\varepsilon\sigma^2)$ for J = [N/2]:

$$L_J(\varepsilon) = \sum_{j=0}^{J} \frac{(-\varepsilon)^j}{j!}\, \tilde{L}^{(j)}, \qquad (33)$$

with again $\tilde{L}^{(j)} = g^{k_1 l_1} \cdots g^{k_j l_j}\, \tilde{L}_{k_1 l_1 \ldots k_j l_j}$, evaluated at the origin. The tilde denotes spatial differentiation w.r.t. $\tilde{x}$, defined by the natural scaling $x^i = \sigma \tilde{x}^i$. The first order part of this scheme is a well-known deblurring technique, widely applied even long before scale-space theory had been established. In a scale-space, this scheme becomes operationally feasible up to some order N (but note that there is no a priori limitation for N). For a survey and a list of references, see e.g. (Wang et al., 1983). Some more references, as well as a different deblurring strategy can be found in (Kimia and Zucker, 1993).

In order to say something about the truncation error (or equivalently, about the ghost image) at the point of interest $(\delta\vec{x} = \vec{0}, \delta t = -\varepsilon\sigma^2)$, we may use Result 1. However, since the spatial location is kept fixed, it is more convenient to take one step back and consider Lemma 2 with $\Lambda(\delta\vec{x}, \delta t; \lambda)$ replaced by $\Theta(\delta t; \mu) = L(\delta\vec{x} = \vec{0}, \mu\,\delta t)$ ($\mu \in I$). This yields the conventional formula of Lagrange:

$$R_J(\delta t) = \frac{1}{(J+1)!} \frac{\partial^{J+1} \Theta}{\partial \mu^{J+1}}(\delta t; \tilde{\mu})$$

for some $\tilde{\mu} \in I$. Put differently, writing $\tilde{R}_J(\varepsilon)$ instead of $R_J(-\varepsilon\sigma^2)$:

$$|\tilde{R}_J(\varepsilon)| \le \frac{\varepsilon^{J+1}}{(J+1)!}\, M_{J+1},$$

with $M_j$ defined as the maximum value of the j-th scale derivative of the image at $\vec{x} = \vec{0}$ on the deblur interval $\delta\tilde{t} \in [-\varepsilon, 0]$ ($\sigma^2 \delta\tilde{t} = \delta t$):

$$M_j = \max_{\mu \in I} \left| \tilde{L}^{(j)}(\vec{0}, -\mu\varepsilon\sigma^2) \right|. \qquad (36)$$

Note that deblurring is not the inverse of blurring (which is a semigroup operation that has no inverse), but rather an “image enhancement” operation; it does not create high resolution details that do not already exist at the current level of scale, but sharpens existing details (including “noise” of course). From (33) this is evident from the fact that one can only manipulate a single (global) parameter $\varepsilon$ and, to a limited extent, the order N. One cannot expect to reconstruct the image at a higher resolution from this, since this requires an infinite number of local degrees of freedom (by convergence of the series expansion (9), this would have been possible if the limit N → ∞ were physically realisable).

Examples of deblurring using (33) are presented in Figures 2 and 3. Note that despite the significant amount of noise used in the examples, it is apparently the case that the use of high order derivatives (up to order 12 in this case) may be beneficial rather than prohibitive. An example of deblurring up to even higher orders is described in (Ter Haar Romeny et al., 1994). Note also that the global deblur parameter $\delta t = -\varepsilon\sigma^2$ scales proportionally to $\sigma^2$; in this example we have taken $\varepsilon$ slightly

below one half2 (E = 0.469 at scale (T = 7.39, hence By means of an example it was shown that the St = -25.6). Although E is essentially a free pa- lowest order local scale-space representation that is rameter, it is clear that, given a certain order N, one nontrivial with respect to space as well as scale re- should take it neither too large (E >> 1/2), nor too quires a complete set of local measurements up to sec- small (E << 1/2). In the first case one may expect the ond order, inclusive (6 degrees of freedom for a two- extrapolation to become unreliable, in the latter case dimensional image, 1 of which can locally be gauged the &-correctionsbecome negligible (albeit very ac- away by rotation). The second lowest order refine- curate) compared to zeroth order. The value E = 1/2 ment requires all local measurements up to fourth corresponds to the case s + St = 0, i.e. to a (hypo- order (15 - 1 essential components in two dimen- thetical) scale excursion to “infinite resolution” (recall sions), etc. Note that, with contemporary imaging that s = 02/2). techniques, these measurements usually have to be extracted a posteriori, that is, after the image acquisition and reconstruction stage. In biological visual systems, receptive fields of various sizes and weight 4. Conclusion and Discussion profiles are probed to produce these measurements directly; individual rod and cone outputs are disre- We have introduced the concept of a multiscale local jet by means of a finite parametrisation of local image garded already at a sensory stage. structure in the spatial as well as the scale domain. Local image extrapolation in space and scale (with It is characterised by an order of contact N, and subpixel accuracy) is a straightforward application, captures an equivalence class of local input signals comprising Gaussian “deblurring” as a special case. 
that induce identical measurements (“metamerical im- Deblurring according to such an extrapolation scheme ages”) when correlated against Gaussian derivatives is a remarkably simple, linear operation, which re- (scaled to a given scale and centred at a given loca- quires spatial derivatives of even orders 2N of the tion) of orders 0, . . . N. form AJL, J = 0,. . . ,N. It was shown empiri- The multiscale local jet can be represented by a fi- cally that even for noisy images there is no a priori nite hierarchy of fundamental polynomials with exact limitation to low orders, provided the details of inter- scale-space properties. These polynomials identically est have a sufficiently large intrinsic scale relative to satisfy the diffusion equation underlying scale-space. those that are irrelevant (“noise”, grid). In the exam- By virtue of this property, and because of their ex- ples shown in this paper, the use of spatial derivatives plicit form, they are useful in the study of the local of orders as high as 12 turned out to improve perfor- structure of scale-space: one can use them to make mance, even in images that are heavily corrupted by local approximations of scale-space within the “scale- small-scale Gaussian additive noise (0 dB signal to space manifold” C, i.e. within a neighbourhood of noise ratio). Note that an extrapolation scheme (such a given set of dependent and independent variables as deblun-ing) can be combined with a resampling (27t; L) E C defined by the diffusion equation. In scheme, so as to maintain a fixed relative spatial den- particular, they are convenient for the study of the sity of kernels at each scale (scale invariance). This generic behaviour of local scale-space features, and requires the general extrapolation formula to be used for the classification of bifurcations. Moreover, the (e.g., for appropriate spatial upsampling in the case fundamental polynomials constitute a smooth (non- of deblurring). 
orthogonal) basis for a local scale-space expansion. It was shown that there exists a local trade-off This basis embodies the local scale-space paradigm, between spatial and scale extrapolation, which may and is independent of the input signal. Measurements lie at the basis of a scale-space sampling strategy. In extracted from the input signal by means of correla- particular, this trade-off seems to admit a coarse scale tion with Gaussian derivative profiles correspond to sampling at the expense of a fairly large spatial over- the coefficients in a local expansion with respect to lap of Gaussian filters. It may also provide a theoret- this basis. In this sense the fundamental polynomials ical basis for physiological indications of overlapping are the duals of the local measurements: only as a receptive fields in mammalian visual cortex. A pre- pair do these convey a smooth, local representation cise sampling strategy, however, can only be reached of scale-space. once several related problems have been dealt with. 72 Florack, et al.
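The component counts quoted in the discussion (6 up to second order, 15 up to fourth order for a two-dimensional image) can be reproduced by counting the distinct partial derivatives of orders $0, \ldots, N$ in $d$ dimensions. The sketch below assumes the standard count $\binom{N+d}{d}$ for symmetric derivative tensors; the function name `jet_components` is ours and purely illustrative.

```python
from math import comb


def jet_components(order: int, dim: int) -> int:
    """Number of distinct partial derivatives of orders 0..order in `dim` dimensions.

    A symmetric derivative tensor of order n in `dim` dimensions has
    comb(n + dim - 1, dim - 1) independent components; summing these over
    n = 0..order gives comb(order + dim, dim).
    """
    return comb(order + dim, dim)


# Second and fourth order jets of a two-dimensional image.
second_order = jet_components(2, 2)
fourth_order = jet_components(4, 2)
# In 2D one component can locally be gauged away by a rotation.
essential_fourth_order = fourth_order - 1
```

The same function gives the count for any other order or dimension, for instance the twelfth order derivatives used in the deblurring examples.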


Fig. 2. Left image: A synthetic, binary test image (intensity difference 255 units) of 512 x 512 pixels. Second image: same as first one, but blurred to scale $\sigma = 7.39$ pixels. Third image: same as first one, but now with additive, pixel-uncorrelated, Gaussian noise with a standard deviation of 255 intensity units. Right image: same as third one, but blurred to scale $\sigma = 7.39$ pixels.

Fig. 3. Deblurring as calculated for the low-resolution images of Figure 2 (second and fourth image respectively) of orders 1 to 6 (inclusive), using $\varepsilon = 0.469$. The derivatives involved in the scale expansion have been evaluated at high resolution, scale $\sigma = 1.00$ pixels, on the low-resolution input images ($\sigma = 7.39$ pixels). Displayed from left to right, top to bottom, the first six images show the results of deblurring on the noise-free image: $L_{J=1}(\varepsilon), \ldots, L_{J=6}(\varepsilon)$, while the last six images show the corresponding results for the noise-perturbed image.
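To make the deblurring scheme concrete, the following is a minimal sketch, not the implementation used for the figures. It assumes that the series (33) reads $L_J(\varepsilon) = \sum_{j=0}^{J} \frac{(-\varepsilon)^j}{j!} (\sigma^2\Delta)^j L$, and it evaluates the Laplacian powers spectrally with periodic boundaries, whereas the experiments above use Gaussian derivative filters at scale 1.00 pixel. The function names and the smooth test blob are our own choices.

```python
from math import factorial

import numpy as np


def _squared_frequencies(shape):
    """Squared angular frequency |w|^2 on the FFT grid (radians per pixel)."""
    wy = 2.0 * np.pi * np.fft.fftfreq(shape[0])
    wx = 2.0 * np.pi * np.fft.fftfreq(shape[1])
    return wy[:, None] ** 2 + wx[None, :] ** 2


def gaussian_blur(image, sigma):
    """Blur to scale sigma; the Gaussian transfer function is exp(-sigma^2 |w|^2 / 2)."""
    w2 = _squared_frequencies(image.shape)
    return np.fft.ifft2(np.fft.fft2(image) * np.exp(-0.5 * sigma**2 * w2)).real


def deblur(blurred, sigma, eps, order):
    """Truncated series sum_{j<=order} (-eps)^j / j! (sigma^2 Laplacian)^j applied to `blurred`."""
    w2 = _squared_frequencies(blurred.shape)
    # The Laplacian acts as -|w|^2 in the Fourier domain, so the j-th term
    # becomes (eps sigma^2 |w|^2)^j / j!, a truncated exponential gain.
    gain = sum((eps * sigma**2 * w2) ** j / factorial(j) for j in range(order + 1))
    return np.fft.ifft2(np.fft.fft2(blurred) * gain).real


def demo_errors(size=128, blob_sigma=8.0, sigma=3.0, eps=0.25, max_order=6):
    """Maximum deviation from the exactly rescaled image, for orders 0..max_order."""
    y, x = np.mgrid[:size, :size] - size / 2.0
    image = np.exp(-(x**2 + y**2) / (2.0 * blob_sigma**2))
    blurred = gaussian_blur(image, sigma)
    # Deblurring by delta t = -eps sigma^2 moves s = sigma^2 / 2 to
    # sigma^2 (1 - 2 eps) / 2, i.e. to the scale sigma * sqrt(1 - 2 eps).
    target = gaussian_blur(image, sigma * np.sqrt(1.0 - 2.0 * eps))
    return [
        float(np.abs(deblur(blurred, sigma, eps, order) - target).max())
        for order in range(max_order + 1)
    ]
```

Calling `demo_errors()` returns one error per order, which for this smooth, noise-free blob should shrink as the order grows. The spectral gain grows polynomially with frequency, so on noisy data the derivatives would have to be taken at a finite inner scale, as is done for Figure 3.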

For example, we did not address the effect of internal noise inherent to the Gaussian filtering process (random fluctuations, quantisation errors, etc.), nor did we consider multilocal issues, matters of physical limits and economy of hardware. We only considered the external errors that follow from a truncation of the local scale-space expansion. Note that the internal errors (owing to quantisation and other sources of noise) of the filters used to obtain the measurements do not manifest themselves locally, that is, without the ability to compare neighbouring measurements. Because of their realisation through integration, the measurements by themselves do not contain any information about the filters (which is just why the local scale-space paradigm had to be embodied in the form of the fundamental polynomials in the first place).

While the external errors decrease, the internal errors increase as a function of order (not treated in this paper, see e.g. (Blom, 1993)). The apparent trade-off remains to be investigated. Finding this trade-off would solve the notorious problem of determining the order of differentiation that optimises the representation of local image structure. The local ghost images defined by the optimal order then correspond to fundamentally inaccessible local degrees of freedom, quite similar to the ghost images one encounters in image reconstruction, corresponding to (near) zero eigenvalues of the acquisition matrix (Barrett, 1992).

By construction, it was shown that a contraction of measurements and corresponding fundamental polynomials always yields a smooth, local, internal representation, regardless of the filter characteristics (and hence, regardless of internal errors). It remains an intriguing problem how one should "glue" together all local expansions obtained at various positions and scales. It is also not clear whether one can actually endow the scale-space domain with a natural metric, i.e. one compatible with a connection. In any case it is clear that filter characteristics do become crucial when neighbouring measurements are considered simultaneously. The scale-space paradigm prescribes that the ideal filter profiles should correspond to linear derivatives of a scaled Gaussian. The closer a realisation of correlating filters matches the ideal design, the better a bilocal connection can be expected to fit. This connection problem is clearly related to the issue of scale-space sampling. It is also of physiological relevance to the problem of local sign determination, i.e. the determination of the spatial context of a set of unlabelled local measurements (this problem should not be confused with the somatotopic nature of neural mappings). Metaphorically speaking, each local measurement can be regarded as a piece of a jig-saw puzzle, and only by virtue of an internal representation of the scale-space paradigm, together with a significant filter overlap factor, may one hope to be able to reconstruct such a puzzle (Koenderink, 1984a), (Koenderink, 1990), (Lotze, 1884).

Appendix

A.1. Proof of Result 1

Result 1 follows from

LEMMA 3. Definitions as for Result 1.

$$\frac{1}{n!} \frac{d^n \Lambda}{d\lambda^n}(\delta\vec{x}, \delta t; \lambda) = \sum_{q=1}^{n} \sum_{k=n-q}^{q} \frac{1}{q!} \binom{q}{k} \binom{k}{k-n+q} (2\lambda)^{k-n+q}\, L^{(k)}_{i_{k+1} \ldots i_q}(\phi^\alpha(\lambda))\, \delta x^{i_{k+1}} \cdots \delta x^{i_q}\, \delta t^k.$$

Before we turn to the proof of this, consider the following scale-space index conventions.

Definition.

For each spatial index $i = 1, \ldots, d$ we define the scale-space index $\alpha = 0, \ldots, d$, such that the values $\alpha = 0$ and $\alpha = 1, \ldots, d$ refer to the scale ($t$) and spatial ($x^i$) dimensions, respectively. We will sometimes write $\alpha$ as $\alpha = (i, 0)$.

For each $\alpha = 0, \ldots, d$ and $l = 0, 1, 2, \ldots$, let $\phi^\alpha_l(\lambda)$ be given by the $l$-th order derivative of $\phi^\alpha(\lambda) = (\lambda\,\delta x^i, \lambda^2\,\delta t)$. In other words, $\phi^\alpha_1(\lambda) = (\delta x^i, 2\lambda\,\delta t)$, $\phi^\alpha_2(\lambda) = (0, 2\,\delta t)$, and $\phi^\alpha_l(\lambda) = (0, 0)$ for all $l \ge 3$.

Finally, let $L_{\alpha_1 \ldots \alpha_q}$ denote the $q$-th order derivative of $L$ w.r.t. $z^{\alpha_1}, \ldots, z^{\alpha_q}$, with $z^\alpha \equiv (x^i, t)$. We will also write $L^{(k)}_{i_1 \ldots i_q}$ for a $k$-fold derivative w.r.t. $t$ and $q$-fold derivative w.r.t. $x^{i_1}, \ldots, x^{i_q}$.

Proof of Result 1: Note that $\Lambda(\delta\vec{x}, \delta t; \lambda) = L(\phi^\alpha(\lambda))$, so differentiation w.r.t. $\lambda$ requires application of the chain rule and, since this will yield products of $\lambda$-dependent factors, of Leibniz's product rule for differentiation. By induction one easily verifies that

$$\frac{1}{n!} \frac{d^n}{d\lambda^n} L(\phi^\alpha(\lambda)) = \sum_{q=1}^{n} \frac{1}{q!} \sum^{*}_{l_1, \ldots, l_q} L_{\alpha_1 \ldots \alpha_q}(\phi^\alpha(\lambda))\, \frac{\phi^{\alpha_1}_{l_1}(\lambda)}{l_1!} \cdots \frac{\phi^{\alpha_q}_{l_q}(\lambda)}{l_q!},$$

in which the Einstein convention is in effect for the $\alpha$-type indices. The $*$ on top of the summation symbols indicates that the $l$-type indices are subject to a common constraint, viz.

$$* : \sum_{j=1}^{q} l_j = n,$$

that is, the total number of derivations equals $n$. Now we make the scale and spatial summations explicit: suppose that $k$ of the $\alpha$-type indices are zero, say $\alpha_1, \ldots, \alpha_k = 0$ (after suitable rearrangement), so that the remaining $q - k$ indices must refer to space, i.e. $\alpha_{k+1} = i_{k+1}, \ldots, \alpha_q = i_q$. In order to account for all terms in the summation we must take into account a combinatorial factor $\binom{q}{k}$ and sum over all possible values of $k = 0, \ldots, q$. This leads to

$$\sum_{q=1}^{n} \frac{1}{q!} \sum_{k=0}^{q} \binom{q}{k} \sum^{*}_{l_1, \ldots, l_q} L^{(k)}_{i_{k+1} \ldots i_q}(\phi^\alpha(\lambda))\, \frac{\phi^{0}_{l_1}(\lambda)}{l_1!} \cdots \frac{\phi^{0}_{l_k}(\lambda)}{l_k!}\, \frac{\phi^{i_{k+1}}_{l_{k+1}}(\lambda)}{l_{k+1}!} \cdots \frac{\phi^{i_q}_{l_q}(\lambda)}{l_q!}.$$

The $q - k$ summations over $l_{k+1}, \ldots, l_q$ contribute by only one effective term each, since $\phi^i_l(\lambda)$ is nonzero only if $l = 1$, for which it equals $\delta x^i$. Evaluating the corresponding sums yields

$$\sum_{q=1}^{n} \frac{1}{q!} \sum_{k=0}^{q} \binom{q}{k} \sum^{**}_{l_1, \ldots, l_k} L^{(k)}_{i_{k+1} \ldots i_q}(\phi^\alpha(\lambda))\, \frac{\phi^{0}_{l_1}(\lambda)}{l_1!} \cdots \frac{\phi^{0}_{l_k}(\lambda)}{l_k!}\, \delta x^{i_{k+1}} \cdots \delta x^{i_q},$$

subject to the constraint

$$** : \sum_{j=1}^{k} l_j = n - q + k.$$

Both components $\phi^0_1(\lambda)$ and $\phi^0_2(\lambda)$ contain a factor $\delta t$, so we may extract this and rewrite the expression into

$$\sum_{q=1}^{n} \frac{1}{q!} \sum_{k=0}^{q} \binom{q}{k}\, X(\lambda; k, n-q)\, L^{(k)}_{i_{k+1} \ldots i_q}(\phi^\alpha(\lambda))\, \delta x^{i_{k+1}} \cdots \delta x^{i_q}\, \delta t^k,$$

in which $X(\lambda; k, n-q)$ stands for

$$X(\lambda; k, n-q) = \sum^{**}_{l_1, \ldots, l_k} \frac{\psi_{l_1}(\lambda)}{l_1!} \cdots \frac{\psi_{l_k}(\lambda)}{l_k!},$$

with $\psi_l(\lambda)$ defined by $\psi_1(\lambda) = 2\lambda$ and $\psi_2(\lambda) = 2$.

Finally, we must evaluate $X(\lambda; k, n-q)$. Again we may assume that the first $j$ indices $l_1, \ldots, l_j$ are equal to 1, and the remaining indices $l_{j+1}, \ldots, l_k$ are equal to 2, provided we sum over all values of $j = 0, \ldots, k$ with the appropriate combinatorial factor $\binom{k}{j}$, and provided we respect the constraint $**$. For a given $j$ this constraint becomes $j \cdot 1 + (k - j) \cdot 2 = n - q + k$, from which it follows that we only get an effective term if $j = k - n + q$ (and since $j \ge 0$ we may as well start the $k$-sum at $k = n - q$). Thus we find

$$X(\lambda; k, n-q) = \binom{k}{k-n+q} (2\lambda)^{k-n+q},$$

and substitution finally yields

$$\frac{1}{n!} \frac{d^n}{d\lambda^n} L(\phi^\alpha(\lambda)) = \sum_{q=1}^{n} \sum_{k=n-q}^{q} \frac{1}{q!} \binom{q}{k} \binom{k}{k-n+q} (2\lambda)^{k-n+q}\, L^{(k)}_{i_{k+1} \ldots i_q}(\phi^\alpha(\lambda))\, \delta x^{i_{k+1}} \cdots \delta x^{i_q}\, \delta t^k.$$

This completes the proof of Lemma 3 and thus of Result 1 (after some trivial index manipulations).

Acknowledgements

This work has been carried out as part of the national priority research programme "3D Computer Vision", supported by the Netherlands Ministries of Economic Affairs and Education & Science through a SPIN grant, and by several industrial participants. The final revision of the manuscript has been prepared partially at INRIA Sophia-Antipolis, France, and partially at INESC Aveiro, Portugal, as part of the ERCIM Fellowship Programme, financed by the Commission of the European Communities. Janita Wilting is gratefully acknowledged for her critical comments and suggestions.
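As a sanity check on the combinatorics of Appendix A.1, the sketch below tests one reading of Lemma 3 numerically. It assumes the lemma takes the form $\frac{1}{n!}\frac{d^n}{d\lambda^n} L(\lambda\,\delta x, \lambda^2\,\delta t) = \sum_{q=1}^{n}\sum_{k=n-q}^{q} \frac{1}{q!}\binom{q}{k}\binom{k}{k-n+q}(2\lambda)^{k-n+q}\, \partial_t^k \partial_x^{q-k} L\; \delta x^{q-k}\,\delta t^k$, which is our reconstruction from the proof and should be treated as such. The one-dimensional test solution $L = e^{ax + a^2 t}$ of the diffusion equation and all function names are our own choices.

```python
from math import comb, exp, factorial


def lemma3_rhs(n, lam, a, dx, dt):
    """Double sum of the (assumed) lemma for L(x, t) = exp(a x + a^2 t) in 1D."""
    b = a * a  # L solves dL/dt = d^2 L / dx^2 when b = a^2
    value = exp(a * lam * dx + b * lam**2 * dt)  # L at (lam dx, lam^2 dt)
    total = 0.0
    for q in range(1, n + 1):
        for k in range(n - q, q + 1):  # empty when n - q > q
            j = k - n + q  # number of first order scale factors
            weight = comb(q, k) * comb(k, j) / factorial(q)
            # k scale derivatives and q - k spatial derivatives of the exponential
            derivative = b**k * a ** (q - k) * value
            total += weight * (2 * lam) ** j * derivative * dx ** (q - k) * dt**k
    return total


def direct_coefficient(n, lam, a, dx, dt):
    """Coefficient of h^n in exp(u (lam + h) + v (lam + h)^2), with u = a dx, v = a^2 dt.

    Expanding the shift gives exp(u lam + v lam^2) * exp((u + 2 v lam) h + v h^2),
    whose h^n coefficient is a finite sum over the number m of h^2 factors.
    """
    u, v = a * dx, a * a * dt
    slope = u + 2 * v * lam
    series = sum(
        slope ** (n - 2 * m) * v**m / (factorial(n - 2 * m) * factorial(m))
        for m in range(n // 2 + 1)
    )
    return exp(u * lam + v * lam**2) * series
```

For any order `n`, `lemma3_rhs(n, lam, a, dx, dt)` and `direct_coefficient(n, lam, a, dx, dt)` should agree to rounding error. The check covers only this exponential family, so it supports the combinatorial factors without replacing the proof.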

Notes

1. The classical algebraic literature sometimes uses a bracket formalism based on so-called "transvectants", which makes the coordinate invariance of magic combinations of coefficients like these transparent.

2. This was done by visual inspection after a quick trial-and-error by taking a starting value $\delta t = -0.1$ and doubling this until a visually reasonable result was obtained. This explains the somewhat odd value of $\varepsilon = 0.469$; no attempt has been made to rigorously optimise the choice of $\varepsilon$. The qualitative results turn out to be rather insensitive to the precise value of $\varepsilon \approx \tfrac{1}{2}$.

References

Babaud, J., Witkin, A. P., Baudin, M. and Duda, R. O. 1986. Uniqueness of the gaussian kernel for scale-space filtering. IEEE Trans. Pattern Analysis and Machine Intelligence, 8(1):26-33.
Barrett, H. H. 1992. Image reconstruction and the solution of inverse problems in medical imaging. In Todd-Pokropek, A. E. and Viergever, M. A., editors, Medical Images: Formation, Handling and Evaluation, NATO ASI Series F98, Springer Verlag, Berlin, pp. 3-42.
Blom, J., ter Haar Romeny, B. M., Bel, A. and Koenderink, J. J. 1993. Spatial derivatives and the propagation of noise in gaussian scale-space. J. of Vis. Comm. and Im. Repr., 4(1):1-13.
Damon, J. 1990. Local Morse theory for solutions to the heat equation and Gaussian blurring. Internal report, Department of Mathematics, University of North Carolina, Chapel Hill, North Carolina, USA.
Florack, L. M. J., ter Haar Romeny, B. M., Koenderink, J. J. and Viergever, M. A. 1992. Scale and the differential structure of images. Image and Vision Computing, 10(6):376-388.
Florack, L. M. J., ter Haar Romeny, B. M., Koenderink, J. J. and Viergever, M. A. 1993. Cartesian differential invariants in scale-space. Journal of Mathematical Imaging and Vision, 3(4):327-348.
Florack, L. M. J., ter Haar Romeny, B. M., Koenderink, J. J. and Viergever, M. A. 1994. Linear scale-space. Journal of Mathematical Imaging and Vision, 4(4):325-351.
Hummel, R. A., Kimia, B. B. and Zucker, S. W. 1987. Deblurring gaussian blur. Computer Vision, Graphics, and Image Processing, 38:66-80.
Johansen, P., Skelboe, S., Grue, K. and Andersen, J. D. 1986. Representing signals by their top points in scale-space. In Proceedings of the 8-th International Conference on Pattern Recognition, Paris, pp. 215-217.
Kay, D. C. 1988. Tensor Calculus. Schaum's Outline Series. McGraw-Hill Book Company, New York.
Kimia, B. B. and Zucker, S. W. 1993. Analytic inverse of discrete Gaussian blur. Optical Engineering, 32(1):166-176.
Koenderink, J. J. 1984a. The concept of local sign. In van Doorn, A. J., van de Grind, W. A. and Koenderink, J. J., editors, Limits in Perception, VNU Science Press, Utrecht, pp. 495-547.
Koenderink, J. J. 1984b. The structure of images. Biol. Cybern., 50:363-370.
Koenderink, J. J. 1992. Local image structure. In Johansen, P., and Olsen, S., editors, Theory & Applications of Image Analysis, volume 2 of Series in Machine Perception and Artificial Intelligence, World Scientific, Singapore, pp. 15-21.
Koenderink, J. J. and van Doorn, A. J. 1986. Dynamic shape. Biol. Cybern., 53:383-396.
Koenderink, J. J. and van Doorn, A. J. 1987. Representation of local geometry in the visual system. Biol. Cybern., 55:367-375.
Koenderink, J. J. and van Doorn, A. J. 1990. Receptive field families. Biol. Cybern., 63:291-298.
Koenderink, J. J. 1990. The brain a geometry engine. Psychological Research, 52:122-127.
Lawden, D. F. 1962. An Introduction to Tensor Calculus and Relativity. Spottiswoode Ballantyne & Co Ltd.
Lindeberg, T. 1990. Scale-space for discrete signals. IEEE Trans. Pattern Analysis and Machine Intelligence, 12(3):234-245.
Lindeberg, T. and Eklundh, J. O. 1990. On the computation of a scale-space primal sketch. J. of Vis. Comm. and Im. Rep., 2(1):55-78.
Lindeberg, T. 1992a. On the behaviour in scale-space of local extrema and blobs. In Johansen, P., and Olsen, S., editors, Theory & Applications of Image Analysis, volume 2 of Series in Machine Perception and Artificial Intelligence, World Scientific, Singapore, pp. 38-47.
Lindeberg, T. 1992b. Scale-space behaviour of local extrema and blobs. Journal of Mathematical Imaging and Vision, 1(1):65-99.
Lipschutz, M. M. 1969. Differential Geometry. Schaum's Outline Series. McGraw Hill, New York.
Lotze, H. 1884. Mikrokosmos. Hirzel, Leipzig.
Salden, A. H., ter Haar Romeny, B. M. and Viergever, M. A. 1994. Local and multilocal scale-space description. In O, Y.-L., Toet, A., Heijmans, H. J. A. M., Foster, D. H. and Meer, P., editors, Proc. of the NATO Advanced Research Workshop Shape in Picture - Mathematical Description of Shape in Greylevel Images, NATO ASI Series F126. Springer Verlag, Berlin.
Schwartz, L. 1950-1951. Théorie des Distributions, volume I, II of Actualités scientifiques et industrielles; 1091, 1122. Publications de l'Institut de Mathématique de l'Université de Strasbourg, Paris.
Spivak, M. 1970-1975. A Comprehensive Introduction to Differential Geometry, volume I-V. Publish or Perish, Inc., Houston, Texas.
Ter Haar Romeny, B. M., Florack, L. M. J., de Swart, M., Wilting, J. J. and Viergever, M. A. 1994. Deblurring Gaussian blur. In Proc. Mathematical Methods in Medical Imaging II, volume 2299, San Diego, CA, pp. 25-26.
Wang, D. C. C., Vagnucci, A. H. and Li, C. C. 1983. Digital image enhancement: A survey. Computer Vision, Graphics, and Image Processing, 24:363-381.
Witkin, A. P. 1983. Scale space filtering. In Proc. International Joint Conference on Artificial Intelligence, Karlsruhe, W. Germany, pp. 1019-1023.