Journal of Mathematical Imaging and Vision 10, 237–252 (1999) c 1999 Kluwer Academic Publishers. Manufactured in The Netherlands.

Linear Scale-Space has First been Proposed in Japan

JOACHIM WEICKERT Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100 Copenhagen, Denmark [email protected]

SEIJI ISHIKAWA Department of Control Engineering, Kyushu Institute of Technology, Sensuicho 1-1, Tobata, Kitakyushu 804, Japan [email protected]

ATSUSHI IMIYA Department of Information and Computer Sciences, Faculty of Engineering, Chiba University, 1-33 Yayoi-Cho, Inage-Ku 263, Chiba, Japan [email protected]

Abstract. Linear scale-space is considered to be a modern bottom-up tool in . The American and European vision community, however, is unaware of the fact that it has already been axiomatically derived in 1959 in a Japanese paper by Taizo Iijima. This result formed the starting point of vast linear scale-space research in Japan ranging from various axiomatic derivations over deep structure analysis to applications to optical character recognition. Since the outcomes of these activities are unknown to western scale-space researchers, we give an overview of the contribution to the development of linear scale-space theories and analyses. In particular, we review four Japanese axiomatic approaches that substantiate linear scale-space theories proposed between 1959 and 1981. By juxtaposing them to ten American or European axiomatics, we present an overview of the state-of-the-art in Gaussian scale-space axiomatics. Furthermore, we show that many techniques for analysing linear scale-space have also been pioneered by Japanese researchers.

Keywords: scale-space, axiomatics, deep structure, optical character recognition (OCR)

1. Introduction should simplify the image without creating spurious structures. Since a scale-space introduces a hierarchy A rapidly increasing number of publications, work- of the image features, it constitutes an important step shops and conferences which are devoted to scale-space from a pixel-related image description to a semantical ideas confirms the impression that the scale-space image description. paradigm belongs to the challenging new topics in com- Usually a 1983 paper by Witkin [71] or an un- puter vision. This is also reflected in a number of recent published 1980 report by Stansfield [64] are regarded books on this subject [14, 18, 19, 47, 63, 68]. as the first references to the scale-space idea. Witkin In scale-space theory one embeds an image f : obtained a scale-space representation by 2 R → R into a continuous family {Tt f | t 0} of of the original image with Gaussians of increasing gradually smoother versions of it. The original image width. Koenderink [43] pointed out that this Gaussian corresponds to the scale t = 0 and increasing the scale scale-space is equivalent to calculating (Tt f )(x) as the 238 Weickert, Ishikawa and Imiya solution u(x, t) of the linear diffusion process always possible to derive the Gaussian scale-space as X the unique possibility. The fact that derivations such ∂ = ∂ = 1 , as [13, 48, 52, 56] have been found recently shows t u xi xi u : u (1) i that linear scale-space axiomatics belong to the current research topics in computer vision. u(x, 0) = f (x). (2) However, since the linear diffusion equation is well- established in mathematics and physics since Fourier’s Soon this linear filtering became very popular in pioneering works in 1822 [16], and image processing image processing, and many results have been ob- was already an active field in the fifties, one might won- tained with respect to the axiomatization, differential der whether the concept of Gaussian scale-space is not geometry, deep structure, and applications of linear dif- much older as well. Koenderink [44] states in a very fusion filtering. An excellent overview of all these as- nice preface discussing how scale-related ideas can be pects can be found in the recent book edited by Sporring traced back in literature, poetry, painting and cartogra- et al. [63]. phy that “the key ideas have been around for centuries Perona and Malik [57] pioneered the field of non- and essentially everything important was around by the linear diffusion processes, where the diffusivity is end of the nineteenth century.” The goal of the present adapted to the underlying image structure. Many reg- paper is to supplement these statements by showing that ularized variants of the Perona-Malik filter are well- not only the ingredients, but also their final mixture and posed and possess scale-space properties [68]. There application to image processing is much older than gen- exist also generalizations of the latter theories to erally assumed. To this end we present four Japanese processes with a diffusion ten- linear scale-space approaches, which are older than sor [68] and to dynamic scale-space paradigms which American and European ones. The first one of them can be analyzed in terms of algebraic invariance the- dates back to 1959. ory and differential and integral geometry [58]. Other The outline of this paper is as follows: In Section 2 important classes of nonlinear scale-spaces are related we describe the basic ideas of a one-dimensional ax- to morphology. Some of them are continuous-scale iomatic for Gaussian scale-space that has been dis- versions of classical morphological processes such as covered by Taizo Iijima in 1959 [22]. Section 3 dilation or erosion [5, 6, 40], others can be described studies a two-dimensional version of this axiomatic as intrinsic evolutions of level curves [1, 42, 54, 60]. leading to affine Gaussian scale-space. It has been These geometric scale-spaces are generated by non- established in 1962. In 1971 Iijima presented a linear partial differential equations (PDEs) which are more physical two-dimensional axiomatics of affine designed to have Euclidean [5, 6, 40, 42], affine [1, 60] Gaussian scale-space [27]. Its principles are sketched or projective invariances [7, 11, 12, 54]. Overviews of in Section 4. The next section is concerned with these nonlinear theories can be found in [10, 18, 68]. further scale-space research of Iijima and his stu- Besides the above-mentioned continuous scale- dents. Section 6 describes a two-dimensional ax- space concepts there has also been research on how iomatic which has been found by Nobuyuki Otsu in to construct adequate discrete scale-space frameworks, 1981 [55]. In Section 7 we shall relate all these results both for linear [3, 46, 47, 59] and nonlinear diffusion to the well-known linear axiomatics that have been es- filtering [67, 68]. Since a more detailed discussion of tablished since 1984. We conclude with a discussion in these algorithmically important frameworks would be Section 8. In order to give the reader an impression of beyond the scope of this paper, we shall focus on con- the spirit of these Japanese papers, we stick closely to tinuous theories in the sequel. the notations in the original work. The discussions are This diversity of scale-space approaches has trig- supplemented with remarks on the philosophical back- gered people to investigate which axiomatic principles ground, physical and biological motivations, results on are involved in these theories and how they differ [1, 4, the deep structure in scale-space and applications to 5, 13, 15, 43, 46, 48, 52, 54, 56, 72]. Apart from a few optical character recognition (OCR).1 notable exceptions such as the morphological equiva- lent of Gaussian scale-space [5], affine morphological scale-space [1] and intrinsic geometric heat flows [54], 2. Iijima’s One-Dimensional Axiomatic (1959) most of these axiomatics use (explicitly or implicitly) one requirement: a assumption (superposi- This section presents first some remarks on the his- tion principle). Within such a linear framework it was torical and philosophical roots of Japanese scale-space Linear Scale-Space has First been Proposed in Japan 239 research, and afterwards it describes the axioms and re- Iijima imposes basic principles which are in accor- sulting lemmas and theorems of the earliest scale-space dance with requirements from observation theory: a approach. robust object recognition should be invariant under changes in the reflected light intensity, parallel shifts 2.1. Historical and Philosophical Background in position, and expansions or contractions of the ob- ject. In addition to these three transformations he con- 8 Japanese scale-space research was initiated by Taizo siders an observation transformation which depends Iijima. After graduating in electrical engineering and on an observation parameter and which transforms ( ) ( ) mathematics from Tokyo Institute of Technology in the original image g x into a blurred version f x . 1948, he joined the Electrotechnical Laboratory (ETL). More precisely, he calls this class of blurring trans- In his thesis titled ‘A fundamental study on electro- formations ‘boke’ (defocusing), and he assumes that it has the structure2 magnetic radiation’ he derived the third analytical so- Z ∞ lution of the radiation equation. During these studies f (x) = 8[g(x0), x,]= {g(x0), x, x0,}dx0, he acquired the mathematical and physical background ∞ for his later scale-space research. (3) At the ETL Iijima was involved in different re- satisfying four conditions: search activities on speech and pattern recognition. Triggered by actual needs such as optical charac- (I) Linearity (with respect to multiplications): If the ter recognition (OCR), but also voice typewriting, or intensity of a pattern becomes A times its origi- medical diagnosis, he wanted to establish a general nal intensity, then the same should happen to the theoretical framework for extracting characteristic in- observed pattern: formation of patterns. This framework should avoid extreme standpoints such as purely deterministic or 8[Ag(x0), x,]=A8[g(x0), x,]. (4) purely stochastic classifications, and it should make use of the original physical or geometric characteris- (II) Translation invariance: Filtering a translated tics of the patterns [26]. image is the same as translating the filtered im- Besides this problem-driven background, there was age: also a philosophical motivation for Iijima’s scale-space research. Its principles go back to Zen Buddhism, 8[g(x0 a), x,]=8[g(x0), x a,]. (5) and they can be characterized by the sentence “Any- thing is nothing, nothing is anything”. Applied to the (III) : If a pattern is spatially en- scale-space context this means to obtain the desired larged by some factor , then there exists a information, it is necessary to control the unwanted in- 0 = 0(, ) such that formation. The blurred scale-space evolution may be interpreted as a kind of unwanted information which 8[g(x0/), x,]=8[g(x0), x/, 0]. (6) helps to understand the semantical content of the un- blurred image. In this sense important information is (IV) (Generalized) semigroup property:Ifgis existing among seemingly unimportant information. observed under a parameter 1 and this observa- tion is observed under a parameter 2, then this 2.2. Axioms is equivalent to observing g under a suitable pa- rameter 3: 00 0 00 Iijima’s first axiomatic formulation of the scale-space 8[8[g(x ), x ,1],x,2]=8[g(x ), x,3], concept can be found in a technical paper from 1959 ti- (7) tled ‘Basic theory of pattern observation’ [22]. A jour- nal version of this paper has been published in 1962 where 3 = 3(1,2), but not necessarily 3 = under the title ‘Basic theory on normalization of pat- 1 + 2. tern (In case of typical one-dimensional pattern)’ [23]. Both papers are written in Japanese. The restriction to Later on we shall see that—in order to determine the the one-dimensional case is for simplicity reasons. Ex- Gaussian uniquely—this axiomatic has to be supple- tensions to the two-dimensional case are discussed in mented with a fifth requirement: preservation of posi- Section 3. tivity. 240 Weickert, Ishikawa and Imiya

2.3. Consequences In a next step Iijima simplifies the result of Lemma 4. The case 8[g(x0), x,] 0 is of no sci- In [23] Iijima establishes four lemmas that confine the entific interest and is not considered any further. In class of integral operators (3) by consecutively impos- Eq. (10) the function can be eliminated by rescaling ing conditions (I)–(IV). via () = k/. Then the k-dependence in (11) vanishes by means of substitution of variables. Lemma 1. If 8 has the structure (3) and satisfies the The preceding results immediately yield linearity axiom (I), then it can be written as the integral Theorem 1. If 8 satisfies (3) and the axioms Z ∞ 8[g(x0), x,]= g(x0)(x,x0,)dx0. (8) (I)–( IV), then it is given by ∞ Z ∞ 0 0 0 0 x x dx 8[g(x ), x,]= g(x )m (13) Lemma 2. If 8 is given by (8) and satisfies the trans- ∞ lation invariance axiom (II), then it can be written as a convolution operation: with Z Z ∞ ∞ 1 ( ) = ( 2m + ) 8 ( 0), , = ( 0)( 0,) 0. m u exp i u d [g x x ] g x x x dx (9) 2 ∞ ∞ (m = 1, 2,...). (14) Lemma 3. If 8 is given by (9) and satisfies the scale invariance axiom (III), then it can be written as For this family it follows that 0 in (III) becomes 0 = /, and in (IV) satisfies 8[g(x0), x,] 3 Z ∞ 0 0 0 2m = 2m + 2m . = g(x ) (( )(x x )) ( ) dx , (10) 3 1 2 (15) ∞ = where () is an arbitrary function of . For the special case m 1 Eq. (14) becomes 1 u2 Lemma 4. If 8 is given by (10) and satisfies the semi- (u) = √ exp , (16) 1 group axiom (IV), then the convolution kernel has the 2 4 specific structure which gives Z ∞ 1 2m 2m (u)= exp(k + iu) d 0 2 ∞ 8[g(x ), x,] Z (k ∈ R, m = 1, 2,...), (11) 1 ∞ (x x 0)2 = √ g(x0) exp dx0. (17) 2 or 2 ∞ 4 0 0 Thus, 8[g(x ), x,] is just the convolution between g 8[g(x ), x,]0. √ and a Gaussian with standard deviation 2. In a subsequent theorem Iijima establishes that the The proofs of the first three lemmas are rather short case m = 1 is of specific interest: if one requires that and not very complicated, whereas the longer proof 8 is positivity preserving, i.e., of Lemma 4 involves some more sophisticated reason- ings in the Fourier domain: in order to derive the nat- 8[ f (x0), x,]>0 ∀ f(x)>0, ∀>0, (18) ural numbers m in Lemma 4, he shows that 9(), the Fourier transform of (u), must satisfy then m = 1 arises by necessity. The proof presents an

( ) explicit example, where the positivity is not preserved 9 2m (0) > 9() = exp 2m . (12) for m 1. (2m)! This concludes Iijimas axiomatic derivation of the Gaussian kernel by requiring linearity, translation in- for some m ∈ N. Establishing 9(2m)(0) =(2m)! variance, scale invariance, semi-group property, and k(2m) finally yields (11). preservation of positivity. Linear Scale-Space has First been Proposed in Japan 241

Interestingly, Iijima gives also reasons why e.g., the (IV) (Generalized ) semigroup property: For every 61 visual perception by a human is carried out through a and 62 there exists a 63 = 63(61,62)such that lens [26]: an optical lens has a Gaussian-like blurring 00 0 00 profile that be regarded as a natural consequence from 8[8[ f (x ), x ,61],x,62]=8[f(x ), x,63]. elementary observational principles. (23) Iijima considered linear scale-space as a first step in his theory of pattern recognition. In [26] one can find These four axioms in combination with the requirement an overview of his ideas, the main one-dimensional re- of positivity preservation are sufficient to derive that sults, their motivation from a viewpoint of observation the blurring transformation 8 is given by the affine theory and their theoretical foundation as a classifica- Gaussian scale-space: tion tool for OCR and other problems. This paper is written in English. 8[ f (x0), x,6] Z ∞ Z ∞ = ( 0 , 0 ) 3. Iijima’s Two-Dimensional Axiomatic (1962) f x1 x2 ∞ ∞ ( 0 , 0 ,6) 0 0 Having obtained these one-dimensional results, it was 1 x1 x1 x2 x2 dx1dx2 (24) straightforward for Iijima to generalize them to a two- with dimensional scale-space axiomatic. This was done in a Japanese technical paper from 1962 [24], followed 1(u1, u2,6) by a journal paper in 1963 [25], which also contains an ! 1 u2 2 u u + u2 informative English abstract. = exp 22 1 12 1 2 11 2 In [25] he considers a blurring transformation of 42 4 2 type (25) and f (x) = 8[ f (x0), x,6] 11 12 11 12 Z ∞ Z ∞ 6 = 2 , det = 1. = { ( 0), , 0,6} 0 0, 12 22 12 22 f x x x dx1dx2 (19) ∞ ∞ (26) where 6 is a symmetric positive definite 2 2 matrix. That transformation should satisfy four conditions: In order to obtain the usual (isotropic) Gaussian scale- space kernel, one has to impose one more axiom, namely invariance under rotations. Adding this rota- (I) Linearity (with respect to multiplications): tional invariance condition to axiom (III) yields the ordinary pure scale invariance condition. 8[Ff(x0), x,6]=F8[f(x0), x,6] ∀F ∈R. (20) 4. Iijima’s Two-Dimensional Axiomatic Based on Physical Principles (1971) (II) Translation invariance: In 1971 Iijima decided to reconsider his scale-space 0 0 8[ f (x a), x,6] = 8[f(x ), x a,6] and pattern recognition theory in order to arrive at a physically consistent reformulation and simplification ∀ ∈R. a (21) of his ideas [27–34]. As a first step in this theory, two-dimensional affine (III) Scale invariance and closedness under affine Gaussian scale-space has been derived from physical transformations: If the pattern is transformed principles. This is treated in a paper that is available as by an (invertible) matrix 3, then there exists a a complete English translation in [27]. 60 = 60(6, 3) such that The goal of this paper is to obtain a generalized figure f (r,)from an original figure f (r) in a way which is 8 (31 0), ,6 =8 ( 0), 31 ,60 . [ f x x ] [f x x ] comparable with the defocusing of an optical system. (22) Two principles are assumed to hold: 242 Weickert, Ishikawa and Imiya

(I) Conservation principle: The blurring transforma- 5. Further Scale-Space Analysis by Iijima tion does not change the total light energy within and his Students the image. This means that the image satisfies the continuity equation Iijima considered scale-space as a useful tool in the context of pattern analysis with application to character ∂ f (r,) + divI (r,)=0 (27) recognition, feature extraction and focus-of-attention. ∂ Below we shall sketch some key ideas of his recognition theory as well as its applications to OCR and deep where the flux I denotes the figure flow, r is the structure scale-space analysis. location, serves as a blurring parameter, and div is the divergence operator in R2. 5.1. Iijima’s Pattern Recognition Theory (II) Principle of maximum loss of figure impression: The figure flow is determined such that In [27] Iijima shows that the basic equation of figure |I T ∇f |2 is essentially invariant under conformal transforma- J(I ) := (28) tions describing multiplication of grey values with I T R1 I a constant, addition of linear brightness gradients, takes its maximum value. Here, R() denotes a translations, and affine transformations. Essentially in- positive definite matrix which is the medium con- variant means that it may be transformed into another stant of the blurring process. anisotropic linear diffusion equation. This expression was motivated as follows. The set of these conformal mappings form an al- Iijima regards f (r,) as an energy density dis- gebraic group, while the set of affine - ω( ,) = 1 2( ,) tribution and r : 2 f r as a quantity of ring transformations form a semigroup. Iijima studies information which he coins figure impression. He compositions of a conformal mapping and a blurring derives that transformation under the name observational transfor- Z Z mations. These transformations are used to construct a d ω( ,) = ( ,) theory of pattern classification: two figures are consid- r dr q r dr (29) d R2 R2 ered as equivalent if they result from the same original figure by observational transformations. where q(r,):=∇T f(r,)I(r,)is called loss Iijima compares the invariance of the basic equa- of figure impression. tion of figure under conformal transformations with He measures the flow intensity kI k by defining the invariance of Newton’s basic equations of motion a norm which relates it to R() via to Galileo’s transformation, and the invariance of the Maxwell equations to the Lorentz transformation. It kI k2 := I T (r,)R 1() I (r,). (30) seems that he was fully aware of the future importance of his discovery when he wrote in [27] that “this paper Thus, (28) describes just the (squared) loss of fig- provides a basis for exploring the recognition theory of ure impression normalized by the flow intensity. visual patterns and solving mathematically the various Iijima shows that (28) is maximized for problems in visual physiology”. This subsequent recognition theory is documented I (r,)=R() ∇f(r,). (31) in a series of Japanese papers which are available as full English translations [28–34]. Iijima regards a Together with the conservation principle this leads to Gaussian-blurred figure as an element in a Hilbert space the anisotropic linear diffusion equation which can be expanded in an orthonormal function sys- tem given in terms of Hermite polynomials. The simi- ∂ ( ,) larity between two patterns f and g is a function of the f r = ( () ∇ ( ,)) ∂ div R f r (32) scalar product (, ) in this Hilbert space:

which is just the formulation of affine Gaussian scale- ( f, g) S( f, g) := , (33) space from Section 3 as a partial . k f kkgk R() is a diffusion tensor. Iijima calls this equation the √ basic equation of figure. where k f k := ( f, f ) denotes the induced norm. Linear Scale-Space has First been Proposed in Japan 243

Gaussian blurring plays a central role in this theory. in [39]. For a solution f (x, t) of the isotropic lin- It should be performed at the very beginning, because ear diffusion equation, they constructed a curve which it makes the algorithms insensitive to noise, it re- comprises the stationary points, i.e., locations (x, t) duces effects of positional deviation, and it allows a with ∇ f (x, t) = 0. This so-called stationary curve r(t) coarser sampling. In order to overcome the problem obeys the equation that blurred patterns become more similar, he devised dr(t) a specific canonical transformation. Incorporating all Hess( f ) =1(∇f(r,t)). (34) these features has lead Iijima to a scale-space based dt subspace technique for matching two patterns which he They stated criteria for identifying stable viewpoints called simple similarity method. In order to determine on the stationary curve, for instance by requiring that the similarity between an observed pattern and a whole dr(t) dt vanishes there without vanishing in a neighbor- equivalence class of reference patterns (e.g., different hood. Afterwards they linked these stable viewpoints realizations of one letter of the alphabet), Iijima pro- to a topological scale-space tree [75], which introduces posed an extension named multiple similarity method a scale hierarchy. To each stable point (xi , ti ) they as- [38]. It is essentially a weighted mean square of the sign a region of interest which is given by a disk with simple similarity measures between the observed pat- center xi and radius ti . Applied to an image of Zhao tern matched with each of the reference patterns. himself, this focus-of-attention method extracted eyes, In 1973 Iijima condensed his whole scale-space and nostrils and the mouth as regions of interest [73]. pattern classification theory to a Japanese textbook Some of this work on scale-space trees was further [35]. It can be regarded as one of the first monographs pursued by the image processing group of Makoto Sato, on linear scale-space theory. another former Ph.D. student of Iijima. Sato’s group established linear scale-space results ranging from 5.2. Applications to Character Recognition deep structure analysis [8, 61, 62, 66] to the filtering of periodic or spherical patterns [41, 65]. All cited Sato Applications of Iijima’s theory to OCR have been papers are written in English. presented in English proceedings papers at the First Iijima continued his research on linear scale-space USA-Japan Computer Conference in 1972 [37], and techniques till the nineties [2]. After 1972 he held pro- at the First International Joint Conference on Pattern fessorships at Tokyo Institute of Technology, Tokyo Recognition in 1973 [38]. In [37] it is described how Engineering University, and the Advanced Institute of Iijima, Genchi and Mori, have realized the multiple Science and Technology. In spring 1997 he retired at similarity method in hardware in the optical character the age of 72. An English bibliography can be found reader ASPET/71. Gaussian blurring was performed in [2]. by optical defocusing. ASPET/71 was capable of read- ing 2000 alphanumeric characters per second, and 6. Otsu’s Two-Dimensional Axiomatic (1981) the scale-space part has been regarded as the reason for its reliability and robustness. Others stated about Iijima’s scale-space research also inspired other people ASPET/71 that “it has been proved to have better than his direct students. We shall illustrate this below performance than any similar conventional method” by discussing the work of Nobuyuki Otsu on scale- [53]. It is now exhibited at the National Museum of space axiomatics and deblurring. Sciences in Tokyo. The ASPET/71 was an analog machine, but its commercial variant OCR-V100 by 6.1. Derivation of the Gaussian Toshiba used digital technology fully. Iijima’s multiple similarity method has also become the main algorithm In 1981 another Japanese scale-space axiomatic has of Toshiba’s later OCR systems [50]. been established in the Ph.D. thesis of Nobuyuki Otsu [55]. This thesis, which was mainly concerned with 5.3. Deep Structure Analysis the extraction of invariants for pattern recognition, was written at the ETL, where Iijima was working in the In the eighties Iijima addressed together with Nan-yuan sixties. Otsu derived two-dimensional Gaussian scale- Zhao, his Ph.D. student, the problem of deep structure space in an axiomatic way by modifying the axioms analysis in scale-space [74]; see also the discussion described in Section 2. Section 4.1 of his thesis is 244 Weickert, Ishikawa and Imiya titled ‘Axiomatic derivation of the scale transforma- (V) His next requirement which he names “Normal- tion’. There he considers some transformation of an ization of energy” actually consists of two parts: image f into an image f˜, for which the following Preservation of nonnegativity, holds: f˜(r) 0 ∀ f (r) 0, (41) (I) Representation as a linear integral operator: and average grey level invariance, There exists a function W : R2 R2 → R such Z Z that ˜ Z f (r) dr = f (r)dr. (42) R2 R2 f˜(r) = W(r, r 0) f (r 0) dr0 ∀r ∈ R2. (35) R R2 ( ) ( ) = This leads to W r 0 and R2 W r dr 1, respectively. (II) Translation invariance: For all r ∈ R2 and for all ∈ R2 = 1 = 1 a it is required that Combining these results gives k 22 and c 22. Z This yields the Gaussian kernel ˜( ) = ( , 0) ( 0 ) 0. f r a W r r f r a dr (36) 1 x 2 + y2 R2 W(r) = exp (43) R 22 2 2 ( , 0 + ) ( 0) 0 Since this is just R2 W rRr a f r dr , and ˜( ) = ( , 0) ( 0) (I) states that f r a R2 W r a r f r and concludes the axiomatic derivation of two- dr0, it follows that the integral kernel is symmet- dimensional linear scale-space. ric, 6.2. Further Results W(r, r 0 + a) = W(r a, r 0), (37) Section 4.2 of Otsu’s thesis is titled ‘Representation and, thus, it is a convolution kernel: of scale-space transformation and semigroup’. It is devoted to the N-dimensional Gaussian scale-space. 0 0 W(r, r ) = W(r r ). (38) With := 2/2 he defines 1 |r|2 (III) Rotation invariance (of the kernel): For all rota- T () f (r) := exp f (r). (44) T 2 ( )N/2 tion matrices T2 and for all r = (x, y) ∈ R it 4 4 is assumed that Using Fourier techniques he shows that the generator of the scale-space transformation is the Laplacean: W(T2r) = W(r). (39) f˜(r,)=T() f (r) = exp(1) f (r). (45) Hence, W depends only on |r|: W(r) = W(x2 + 2) y . This gives (IV) Separability: There exists a function u : R → R such that ∂ ˜( ,) f r = 1( (1) ( )) = 1 ˜( ,). ∂ exp f r f r W(r) = u(x) u(y). (40) Thus, f˜ satisfies the isotropic linear diffusion equation. Combining this with (III) implies after elemen- The formal inversion of the scale-space transforma- tary manipulations that tion by means of (45) is

W(r) = k exp[c (x 2 + y2)] f (r) = [T ()]1 f˜(r,) = exp(1) f˜(r,) with some parameters k, c ∈ R. In order to ar- rive at k > 0 and c < 0, however, additional con- 2 = I 1 + 12 f˜(r,). straints are needed. 2 Linear Scale-Space has First been Proposed in Japan 245

For the case that or 12 f˜ is small, Otsu proposes to Since every continuous linear functional can be approximate [T ()]1 by [I 1] and to use it for written as an integral operator, it follows that recovering the original image from a blurred one.3 Florack’s topological duality paradigm [13] can also be interpreted as requiring the existence of 5 7. Relation to Other Work a linear integral operator: — Translation invariance: Let a translation a be ( )( ) = ( ) Having sketched the basic ideas of these Japanese defined by a f x : f x a . Then, axiomatics, it is natural to ask about similarities and dif- = ∀ ∈ RN , ∀ > . ferences to other approaches. Table1 gives an overview a Tt Tt a a t 0 of the current axiomatics for the continuous Gaussian scale-space. These axioms and some of their relations Since usually linearity and translation invariance are can be explained as follows:4 imposed in conjunction, we have summarized them under the term “convolution kernel”. Convolution kernel: There exists a family of func- Semigroup property: tions {k : R → R | t 0} such that t = ( ) ∀ , , ∀ . Z Tt+s f Tt Ts f t s 0 f 0 0 0 (Tt f )(x) = kt (x x ) f (x ) dx . RN This property ensures that one can implement the scale-space process as a cascade smoothing. In Section 6.1 we have already seen that this prop- Locality: For small t the value of Tt f at any point erty can be derived from the two assumptions: x is determined by its vicinity: — Linear integral operator: There exists a family of kernels {k | t 0} with lim (Tt f Tt g)(x) = o(x) t t→0+ Z ∞ 0 0 0 for all f, g ∈ C whose of order 0 (Tt f )(x) = kt (x, x ) f (x ) dx . RN are identical.

Table 1. Overview of continuous Gaussian scale-space axiomatics (I1 = Iijima [22, 23], I2 = Iijima [24, 25], I3 = Iijima [27], O = Otsu [55], K = Koenderink [43], Y = Yuille/Poggio [72], B = Babaud et al. [4], L1 = Lindeberg [46], F1 = Florack et al. [15], A = Alvarez et al. [1], P = Pauwels et al. [56], N = Nielsen et al. [52], L2 = Lindeberg [48], F2 = Florack [13]).

I1 I2 I3 O K Y B L1 F1 A P N L2 F2

Convolution kernel Semigroup property Locality Regularity Infinitesimal generator Max. loss principle Causality Nonnegativity Tikhonov regularization Average grey level invar. Flat kernel for t →∞ Isometry invariance Homogeneity & isotropy Separability Scale invariance Valid for dimension 1 2 2 2 1, 2 1, 2 1 1 >1N1,2N N N 246 Weickert, Ishikawa and Imiya

Regularity: A precise definition of the smoothness For this reason, Koenderink [43] required that at requirements for the scale-space operator depends spatial extrema (with nonvanishing determinant of on the author: the Hessian) isophotes in scale-space are upwards convex; In two dimensions he showed that at these — Since the original image creates the scale-space, extrema the diffusion equation it is natural to assume that it is continuously embedded, i.e., limt→0+ Tt = I . In the linear ∂t u = (x,t)1u (46) convolution case, this means that kt (x) tends to Dirac’s delta distribution [72] and its Fourier has to be satisfied. Hereby, denotes a positive- transform becomes 1 everywhere [15]. valued function. — Babaud et al. [4] and Florack [13] consider in- Hummel [21] established the equivalence be- finitely times differentiable convolution kernels tween causality and a maximum principle for certain which are rapidly decreasing functions in x, i.e., parabolic operators. ∞ they are vanishing at faster than any inverse We may also derive the causality equation (46) of polynomials. and its N-dimensional generalizations by requiring — Lindeberg uses kernels kt which are Borel mea- that local extrema with positive or negative definite surable in t [46], or kernels which converge for Hessians are not enhanced [4, 48]: This assumption t → 0+ in the L1 norm to the Dirac distribution states that such an extremum in x0 at scale t0 satisfies [48]. — Alvarez et al. [1] require that ∂t u > 0ifx0is a minimum,

∂t u < 0ifx0is a maximum. kTt ( f + hg) (Tt( f ) + hg)k∞ Cht This is just the causality requirement sign(∂ u) = , ∈ , t for all h t [0 1], and for all smooth f , g, sign(1u). Moreover, in one dimension, nonen- where C may depend on f and g. hancement of local extrema is equivalent to the re- — Pauwels et al. [56] assume that the convolution quirement that the number of local extrema does ( ) kernel kt x is separately continuous in x and not increase [4, 46]. In higher dimensions, however, in t. diffusion scale-spaces may create new extrema, see Infinitesimal generator: The existence of e.g., [9, 45, 72].

T f f Nonnegativity: If the nonnegativity of the convolu- lim t =: A[ f ] t→0+ t tion kernel,

guarantees that the semigroup can be represented by kt (x) 0 ∀ x, ∀ t > 0, the evolution equation is violated, new level crossings may appear for t > 0, ∂ = . t u A[u] such that the causality property does not hold. Within a linear framework with spatially continu- From the mathematical literature it is well-known ous convolution kernels, nonnegativity is equivalent that the existence of an infinitesimal generator fol- to the monotony requirement [1] lows from the semigroup property when being com- bined with regularity assumptions [20]. f (x) g(x) ∀ x Principle of maximum loss of figure impression: See Section 4. ⇒ (Tt f )(x) (Tt g)(x) ∀ x, ∀ t > 0 Causality: The scale-space evolution should not create new level curves when increasing the scale and the preservation of nonnegativity: parameter. If this is satisfied, iso-intensity linking through the scales is possible and a structure at a f (x) 0 ∀ x coarse scale can (in principle) be traced back to the ⇒ ( )( ) ∀ , ∀ > . original image. Tt f x 0 x t 0 Linear Scale-Space has First been Proposed in Japan 247

Tikhonov regularization: In the one-dimensional Homogeneity and isotropy: Koenderink [43] re- case, u is called a Tikhonov regularization of f ∈ quired that the scale-space treats all spatial points L2(R), if it minimizes the energy functional equally. He assumed that the diffusion equation (46), " # which results at extrema from the causality require- Z ∞ X di u 2 ment, should be the same at each spatial position (re- = ( )2 + . E f [u] f u i dx gardless whether there is an extremum or not) and R dxi i=1 for all scales. He named these requirements homo- geneity and isotropy.6 This concept and an N-dimensional generalization Separability: The convolution kernel k (x) with x = of it has been used by Nielsen et al. [52]. The first t (x ,...,x )T ∈RN may be split into N factors, each term under the integral ensures that u remains close 1 N acting along one coordinate axis: to f , while the second one is responsible for the smoothness of u: the positive coefficients serve i ( ) = ( ) ( ). as weights which guarantee smoothness of the i-th kt x k1,t x1 kN,t xN . Average grey level invariance: The average grey Scale invariance: Let (S f )(x) := f (x). Then 0 level invariance there exists some t (, t) with Z Z ST 0 = T S. Tt fdx= fdx ∀t>0 t t RN RN One may achieve this by requiring that, in the N- can be achieved by means of the continuity Eq. (27) dimensional case, the convolution kernel k has the in connection with reflecting or periodic boundary t structure conditions. It boils down to the normalization con- dition 1 x Z k (x) = 8 t 9 N (t) 9(t) kt (x) dx = 1, RN with a continuous, strictly increasing rescaling func- if we consider linear convolution kernels. tion 9. This means that all kernels can be derived by In this case normalization is also equivalent to grey stretching a parent kernel such that its area remains level shift invariance [1]: constant [56]. It is evident that this is related to the normalization condition. ( ) = , Tt 0 0 Scale invariance follows also from the semigroup property when being combined with isometry invari- Tt ( f + C) = Tt ( f ) + C ance and causality [48]. Moreover, scale invariance, for all images f and for all constants C. translation invariance and isometry invariance result Flat kernel for t →∞:Fort→∞, one expects that from the more general assumption of invariance un- the kernel spreads the information uniformly over der the spacetime symmetry group; see [13] for more the image. Therefore, if the integral over the kernel details. should remain finite, it follows that the kernel has to become entirely flat: limt→∞ kt (x) = 0. We observe that—despite the fact that all presented Isometry invariance: Let R ∈ RN be an orthogo- axiomatics use many similar requirements—not two of nal transformation (i.e., det R =1) and define them are identical. Each of the 14 axiomatics confirms (Rf)(x) := f (Rx). Then, and enhances the evidence that the others give: that Gaussian scale-space is unique within a linear frame- Tt (Rf)= R(Tt f) ∀ f, ∀t >0. work. This theoretical foundation is the backbone of a lot of successful applications of linear scale-space In the one-dimensional case with a linear con- theory. volution kernel this invariance under rotation and Nevertheless, apart from their historical merits, the mirroring comes down to the symmetry condition early Japanese approaches differ from the well-known kt (x) = kt (x). axiomatics after 1984 in several aspects: 248 Weickert, Ishikawa and Imiya

Firstly, it is interesting to note that all Japanese ax- community. They reveal many interesting qualities iomatics require only quite a few axioms in order to which should trigger everyone who is interested in derive Gaussian scale-space. Even recent approaches scale-space theory to have a closer look at them. which intend to use a minimal set of first principles do The discussed results demonstrate that an entire not utilize less axioms. world of linear scale-space theory has evolved in Iijima’s one- and two-dimensional frameworks from Japan ranging from axiomatics for isotropic and affine 1959 and 1962, respectively, do not only belong to the Gaussian scale-space over group theory and deep most systematic derivations of Gaussian scale-space, structure analysis to hardware implementations for they appear also rather modern: principles such as non- OCR. The Japanese scale-space paradigm was well- negativity and the semigroup property are typical for embedded into a general framework for pattern recog- axiomatizations after 1990, and also the importance nition and object classification [26, 35, 55], and many of scale invariance has been emphasized mainly in re- results have been established earlier than in America cent years. Also affine Gaussian scale-space which he and Europe. pioneered in 1962 has been further pursued only re- It is surprising that eastern and western scale-space cently [47]. theory have evolved with basically no interaction: To Iijima’s physical motivation for affine Gaussian the best of our knowledge, the first citation of Iijima’s scale-space from 1971 uses only two principles which work by non-Japanese scale-space researchers was reduce the essential features of linear diffusion filter- made in 1996 [69]. Conversely, also Japanese work ing to a minimum. In this sense it is written in a sim- after 1983 was not always aware of American and ilar spirit as Koenderink’s derivation [43], which also European scale-space results. Some (English) papers uses two highly condensed requirements (although of by Makoto Sato’s group [61, 62, 65, 66] cite both Iijima very different nature). Concepts such as the principle and Witkin. His paper with the probably most explicit of maximum loss of figure impression may remind reference to Iijima’s work has been presented at the some of the readers of properties of nonlinear scale- ICASSP ’87 [61], where Sato and Wada cite the orig- spaces like the Euclidean and affine shortening flow inal Japanese versions of [27, 28] and state: “The no- [1, 54, 60]: they shrink the Euclidean or affine perime- tion same as scale-space filtering was also proposed by ter of a closed curve as fast as possible. Moreover, the T. Iijima in the field of pattern recognition. He derived group-theoretical studies in [27] prove that Iijima has a partial differential equation, called basic equation, also pioneered modern scale-space analysis such as in from the continuity of the light energy in the waveform [1, 54, 58]. The canonical figures used in [27] to set observation”. up similarity measures can be regarded as predeces- In 1992 several direct hints to Iijima’s scale-space sors of the canonical frames that have become popu- research can be found in the widespread journal lar in computer vision [51]. Last but not least, Iijima’s Proceedings of the IEEE [50]. In an invited historical anisotropic diffusion Eq. (32) may be viewed as a linear review of OCR methods written by Mori, Suen and predecessor of various nonlinear anisotropic diffusion Yanamoto, 10 out of the 193 key references were papers scale-spaces as considered in [68]. by Iijima. Concerning his contributions, the authors Otsu’s two-dimensional axiomatic is very appeal- state that “the concept of blurring was first introduced ing due to its simplicity: in contrast to many other ap- into pattern recognition by his work, whereas it was proaches it does not require advanced mathematical widely attributed to Marr in the West. Iijima’s idea was techniques like Fourier analysis, complex integrals, or derived from his study on modeling the vision observa- functional analysis in order to derive the uniqueness of tion system. (...) Setting reasonable conditions for the the Gaussian kernel. It is therefore a well-suited ap- observation system, he proved that the mathematical proach even for undergraduate courses in image pro- form must be a convolution of a signal f (x0) with a cessing. Gaussian kernel”. They also give a recommendation to the non-Japanese audience: “Iijima’s theory is not so easy to understand, but his recent book [36] is readable, 8. Discussion although it is written in Japanese”. One can only speculate why nobody paid attention In this paper we have analyzed Japanese axiomatics for to these passages. Maybe, because none of the authors the linear diffusion scale-space that have been unknown referred to one of Iijima’s English scale-space papers. in the American and European image processing The present paper contains 12 references to English Linear Scale-Space has First been Proposed in Japan 249 publications by Iijima. They can be found in many li- Acknowledgments braries in America and Europe, and a short look at papers such as [26, 27] should convince everybody We thank Alfons Salden, Luc Florack, Peter Johansen that there remains no justification to deny Iijima’s pi- and Jon Sporring for helpful comments and stimulat- oneering role in linear scale-space theory because of ing discussions. Parts of this work have been carried language reasons. out while Seiji Ishikawa and Joachim Weickert were Another reason might be that Iijima’s work came with the Image Sciences Institute of Utrecht Univer- too early to be appreciated: His theory was mathemati- sity, The Netherlands. It has also been supported by cally much more demanding than other methods at this the Danish National Science Research Council under time. At a stage where pattern recognition was still in grant no. 9502164. its infancy and experimenting with very simple meth- ods, it was not easy to make techniques popular, which are based on advanced mathematics. Also computing Note facilities were more restricted in the sixties and seven- ties than they are today. Despite very remarkable devel- 1. A preliminary version of this paper focusing exclusively on the axiomatic aspects of two of these four Japanese frameworks has opments such as the scale-space based optical character been presented in [70]. readers, it was certainly more difficult to attract people 2. The variable x0 serves as a dummy variable. by presenting computed results that demonstrate the 3. This is an ill-posed problem which may lead to unstable results. advantages of a conceptually clean scale-space tech- 4. Of course, such a table can only give a “flavor” of the different ap- nique over ad-hoc strategies. proaches, and the precise description of each axiom may slightly vary from paper to paper. Several relations between the presented When scale-space became popular in America and axioms are discussed in [1, 48, 56]. Europe in the eighties, the situation was different: 5. Dirac point distributions and their derivatives are admitted as Computing power was much higher, and the pattern “functions under the integral”. recognition and computer vision community had be- 6. In our terminology, homogeneity and isotropy are much stricter come mature for better founded techniques which take requirements than translation and isometry invariance. They en- able Koenderink to derive Gaussian scale-space under only one advantage of centuries of research in mathematics and additional assumption (causality). physics. Today it is possible to establish an interna- tional conference solely devoted to scale-space ideas which attracts people from many countries and disci- References plines [19]. Would this have been possible 30 years ago? Certainly not. 1. L. Alvarez, F. Guichard, P.-L. Lions, and J.-M. Morel, “Axioms and fundamental equations in image processing,” Arch. Rational Unfortunately, it seems that Iijima was not the only Mech. Anal., Vol. 123, pp. 199–257, 1993. one who has pioneered the field of partial differential 2. Y. Aoki and T. Iijima, “Pluralizing method of simple similar- equations in image processing far ahead of his time, so ity,” IEICE Trans. Inf. Syst., Vol. E79-D, pp. 485–490, 1996 (in that his work fell into oblivion for decades. Another English). ˚ example is the fact that already in 1965 the Nobel prize 3. K. Astr¨om and A. Heyden, “Stochastic analysis of image ac- quisition and scale-space smoothing,” in Gaussian Scale-Space winner Dennis Gabor—the inventor of optical holog- Theory, J. Sporring, M. Nielsen, L. Florack, and P. Johansen raphy and the so-called Gabor functions—proposed a (Eds.), Kluwer: Dordrecht, 1997, pp. 129–136. deblurring algorithm based on combining mean cur- 4. J. Babaud, A.P.Witkin, M. Baudin, and R.O. Duda, “Uniqueness vature flow with backward smoothing along flowlines of the Gaussian kernel for scale space filtering,” IEEE Trans. [17, 49]. This long-time forgotten method is similar to Pattern Anal. Mach. Intell., Vol. 8, pp. 26–33, 1986. 5. R. van den Boomgaard, “The morphological equivalent of the modern PDE techniques for image enhancement. Gauss convolution,” Nieuw Archief VoorWiskunde, Vierde Serie, Maybe this review helps a little bit that the pioneer- Deel, Vol. 10, pp. 219–236, 1992. ing work of these people receives the acknowledgement 6. R.W. Brockett and P. Maragos, “Evolution equations for that it deserves. It would also be nice if it encour- continuous-scale morphological filtering,” IEEE Trans. Signal ages the mutual interest between the Japanese and the Processing, Vol. 42, pp. 3377–3386, 1994. 7. A.M. Bruckstein and D. Shaked, “On projective invariant western scale-space community. The fact that the same smoothing and evolution of planar curves and polygons,” J. theory has been developed twice in two very different Math. Imag. Vision, Vol. 7, pp. 225–240, 1997. cultures shows that this theory is natural and that it is 8. S. Chinveeraphan, R. Takamatsu, and M. Sato, “A hierarchical worthwhile to study all of its various aspects. description of digital grayscale images based on image dipoles,” 250 Weickert, Ishikawa and Imiya

in Proc. 13th Int. Conf. Pattern Recognition (ICPR 13, Vienna, 30. T. Iijima, “Basic theory on normalization of figures,” Electronics Aug. 25–30, 1996), 1996, Vol. B, pp. 246–250. and Communications in Japan, Vol 54-C, No. 12, pp. 106–112, 9. J. Damon, “Local Morse theory for solutions to the 1971 (in English). and Gaussian blurring,” J. Differential Equations, Vol. 115, pp. 31. T. Iijima, “A system of fundamental functions in an abstract 368–401, 1995. figure space,” Systems, Computers, Controls, Vol. 2, No. 6, 10. R. Deriche and O. Faugeras, “Les EDP en traitement des images pp. 96–103, 1971 (in English). et vision par ordinateur,” Traitement du Signal, Vol. 13, No. 6, 32. T. Iijima, “Basic theory of feature extraction for figures,” Sys- 1996. tems, Computers, Controls, Vol. 3, No. 1, pp. 32–39, 1972 (in 11. F. Dibos, “Projective analysis of 2-D images,” IEEE Trans. Im- English). age Proc., Vol. 7, pp. 274–279, 1998. 33. T. Iijima, “Basic theory on the structural recognition of figures,” 12. O. Faugeras, “Sur l’´evolution de courbes simples du plan pro- Systems, Computers, Controls, Vol. 3, No. 4, pp. 30–36, 1972 jectif r´eel,” C.R. Acad. Sci. Paris, t. 317, S´erie I, pp. 565–570, (in English). 1993. 34. T. Iijima, “Theoretical studies on the figure identification by 13. L.M.J. Florack, “Data, models, and images,” in Proc. 1996 IEEE pattern matching,” Systems, Computers, Controls, Vol. 3, No. 4, Int. Conf. Image Processing (ICIP–96, Lausanne, Sept. 16–19, pp. 37–44, 1972 (in English). 1996), 1996, Vol. I, pp. 469–472. 35. T. Iijima, Pattern Recognition, Corona-sha, 1973 (in Japanese). 14. L.M.J. Florack, Image Structure, Kluwer: Dordrecht, 1997. 36. T. Iijima, “Theory of pattern recognition,” Series of Basic In- 15. L.M.J. Florack, B.M. ter Haar Romeny, J.J. Koenderink, and formation Technology, Vol. 6, Morishita Publishing, 1989 (in M.A. Viergever, “Scale and the differential structure of images,” Japanese). Image Vision Comput., Vol. 10, pp. 376–388, 1992. 37. T. Iijima, H. Genchi, and K. Mori, “A theoretical study of pattern 16. J. Fourier, Theorie´ Analytique de la Chaleur, Didot: Paris, 1822. identification by matching method,” in Proc. First USA–Japan 17. D. Gabor, “Information theory in electron microscopy,” Labo- Computer Conference (Tokyo, Oct. 3–5, 1972), 1972, pp. 42–48 ratory Investigation, Vol. 14, pp. 801–807, 1965. (in English). 18. B.M. ter Haar Romeny (Ed.), Geometry-Driven Diffusion in 38. T. Iijima, H. Genchi, and K. Mori, “A theory of character recog- Computer Vision, Kluwer: Dordrecht, 1994. nition by pattern matching method,” in Proc. First Int. Joint Conf. 19. B. ter Haar Romeny, L. Florack, J. Koenderink, and M. Viergever Pattern Recognition (IJCPR, Washington, Oct. 1973), 1973, (Eds.), Scale-Space Theory in Computer Vision, Lecture Notes pp. 50–56 (in English). in Comp. Science, Vol. 1252, Springer: Berlin, 1997. 39. A. Imiya and R. Katsuta, “Extraction of a structure feature 20. E. Hille and R.S. Philips, Functional Analysis and Semi-Groups, from three-dimensional objects by scale-space analysis,” in American Mathematical Society: Providence, 1957. Scale-Space Theory in Computer Vision, B. ter Haar Romeny, 21. R.A. Hummel, “Representations based on zero-crossings in L. Florack, J. Koenderink, and M. Viergever (Eds.), Lecture scale space,” in Proc. IEEE Comp. Soc. Conf. Computer Vision Notes in Comp. Science, Vol. 1252, Springer: Berlin, 1997, and Pattern Recognition (CVPR ’86, Miami Beach, June 22– pp. 345–348. 26, 1986), IEEE Computer Society Press: Washington, 1986, 40. P.T. Jackway and M. Deriche, “Scale-space properties of the pp. 204–209. multiscale morphological dilation–erosion,” IEEE Trans. Pat- 22. T. Iijima, “Basic theory of pattern observation,” Papers of Tech- tern Anal. Mach. Intell., Vol. 18, pp. 38–51, 1996. nical Group on Automata and Automatic Control, IECE, Japan, 41. G. Kin and M. Sato, “Scale space filtering on spherical pat- Dec. 1959 (in Japanese). tern,” in Proc. 11th Int. Conf. Pattern Recognition (ICPR 11, 23. T. Iijima, “Basic theory on normalization of pattern (in case of The Hague, Aug. 30–Sept. 3, 1992), 1992, Vol. C, pp. 638–641. typical one-dimensional pattern),” Bulletin of the Electrotechni- 42. B.B. Kimia, A. Tannenbaum, and S.W. Zucker, “Toward a com- cal Laboratory, Vol. 26, pp. 368–388, 1962 (in Japanese). putational theory of shape: An overview,” Computer Vision— 24. T. Iijima, “Observation theory of two-dimensional visual pat- ECCV ’90, O. Faugeras (Ed.), Lecture Notes in Comp. Science, terns,” Papers of Technical Group on Automata and Automatic Vol. 427, Springer: Berlin, 1990, pp. 402–407. Control, IECE, Japan, Oct. 1962 (in Japanese). 43. J.J. Koenderink, “The structure of images,” Biol. Cybern., Vol. 25. T. Iijima, “Basic theory on normalization of two-dimensional 50, pp. 363–370, 1984. visual pattern,” Studies on Information and Control, Pattern 44. J.J. Koenderink, “Scale in perspective,” in Gaussian Scale-Space Recognition Issue, IECE, Japan, No. 1, pp. 15–22, 1963 (in Theory, J. Sporring, M. Nielsen, L. Florack, and P. Johansen Japanese). (Eds.), Kluwer: Dordrecht, 1997, pp. xv-xx. 26. T. Iijima, “Theory of pattern recognition,” Electronics and Com- 45. L.M. Lifshitz and S.M. Pizer, “A multiresolution hierarchical ap- munications in Japan, pp. 123–134, Nov. 1963 (in English). proach to based on intensity extrema,” IEEE 27. T. Iijima, “Basic equation of figure and observational transfor- Trans. Pattern Anal. Mach. Intell., Vol. 12, pp. 529–540, 1990. mation,” Systems, Computers, Controls, Vol.2, No. 4, pp. 70–77, 46. T. Lindeberg, “Scale-space for discrete signals,” IEEE Trans. 1971 (in English). Pattern Anal. Mach. Intell., Vol. 12, pp. 234–254, 1990. 28. T. Iijima, “Basic theory on the construction of figure space,” 47. T. Lindeberg, Scale-Space Theory in Computer Vision, Kluwer: Systems, Computers, Controls, Vol. 2, No. 5, pp. 51–57, 1971 Boston, 1994. (in English). 48. T. Lindeberg, “On the axiomatic formulations of linear scale- 29. T. Iijima, “A suppression kernel of a figure and its mathematical space,” in Gaussian Scale-Space Theory, J. Sporring, M. characteristics,” Systems, Computers, Controls, Vol. 2, No. 6, Nielsen, L. Florack, and P. Johansen (Eds.), Kluwer: Dordrecht, pp. 16–23, 1971 (in English). 1997, pp. 75–97. Linear Scale-Space has First been Proposed in Japan 251

49. M. Lindenbaum, M. Fischer, and A. Bruckstein, “On Ga- 68. J. Weickert, Anisotropic Diffusion in Image Processing, bor’s contribution to image enhancement,” Pattern Recognition, Teubner-Verlag: Stuttgart, 1998. Vol. 27, pp. 1–8, 1994. 69. J.A. Weickert, B.M. ter Haar Romeny, and M.A. Viergever, 50. S. Mori, C.Y. Suen, and K. Yanamoto, “Historical review of “Conservative image transformations with restoration and scale- OCR research and development,” in Proc. IEEE, 1992, Vol. 80, space properties,” in Proc. 1996 IEEE Int. Conf. Image Pro- pp. 1029–1058. cessing (ICIP-96, Lausanne, Sept. 16–19, 1996), 1996, Vol. I, 51. J.L. Mundy and A. Zisserman (Eds.), Geometric Invariance in pp. 465–468. Computer Vision, MIT Press: Cambridge, 1992. 70. J. Weickert, S. Ishikawa and, A. Imiya, “On the history of 52. M. Nielsen, L. Florack, and R. Deriche, “Regularization, scale- Gaussian scale-space axiomatics,” in Gaussian Scale-Space space, and filters,” in Computer Vision—ECCV Theory, J. Sporring, M. Nielsen, L. Florack, and P. Johansen ’96: Volume II, B. Baxton and R. Cipolla (Eds.), Lecture Notes (Eds.), Kluwer: Dordrecht, 1997, pp. 45–59. in Comp. Science, Springer: Berlin, 1996, Vol.1065, pp. 70–81. 71. A.P. Witkin, “Scale-space filtering,” in Proc. Eighth Int. Joint 53. H. Nishino, “PIPS (pattern information processing system) Conf. on Artificial Intelligence (IJCAI ’83, Karlsruhe, Aug. 8– project—background and outline,” in Proc. 4th Int. Joint Conf. 12, 1983), 1983, Vol. 2, pp. 1019–1022. Pattern Recognition (IJCPR 4, Kyoto, Nov. 7–10, 1978), 1978, 72. A.L. Yuille and T.A. Poggio, “Scaling theorems for zero cross- pp. 1152–1161. ings,” IEEE Trans. Pattern Anal. Mach. Intell., Vol.8, pp. 15–25, 54. P.J. Olver, G. Sapiro, and A. Tannenbaum, “Classification and 1986. uniqueness of invariant geometric flows,” C.R. Acad. Sci. Paris, 73. N.-Y. Zhao, “Feature extraction of images by stable gaze tree,” t. 319, S´erie I, pp. 339–344, 1994. Ph.D. thesis, Dept. of Computer Science, Tokyo Inst. of Tech- 55. N. Otsu, “Mathematical studies on feature extraction in pat- nology, 1985 (in Japanese). tern recognition,” Ph.D. thesis, Report No. 818, Electrotech- 74. N.-Y. Zhao and T. Iijima, “Theory on the method of determina- nical Laboratory, 1-1-4, Umezono, Sakura-mura, Niihari-gun, tion of view-point and field of vision during observation and Ibaraki, Japan, 1981 (in Japanese). measurement of figure,” IECE Japan, Trans. D, Vol. J68-D, 56. E.J. Pauwels, L.J. Van Gool, P.Fiddelaers, and T. Moons, “An ex- pp. 508–514, 1985 (in Japanese). tended class of scale-invariant and recursive scale space filters,” 75. N.-Y. Zhao and T. Iijima, “A theory of feature extraction by the IEEE Trans. Pattern Anal. Mach. Intell., Vol. 17, pp. 691–701, tree of stable view-points,” IECE Japan, Trans. D, Vol. J68-D, 1995. pp. 1125–1132, 1985 (in Japanese). 57. P. Perona and J. Malik, “Scale space and edge detection using anisotropic diffusion,” IEEE Trans. Pattern Anal. Mach. Intell., Vol. 12, pp. 629–639, 1990. 58. A.H. Salden, “Dynamic scale-space paradigms,” Ph.D. thesis, Utrecht University, The Netherlands, 1996. 59. A.H. Salden, B.M. ter Haar Romeny, and M.A. Viergever, “Lin- ear scale-space theory from physical principles,” J. Math. Imag. Vision, Vol. 9, No. 2, pp. 103–139, 1998. 60. G. Sapiro and A. Tannenbaum, “Affine invariant scale-space,” Int. J. Comput. Vision, Vol. 11, pp. 25–44, 1993. 61. M. Sato and T. Wada, “A hierarchical representation of random waveforms by scale-space filtering,” in Proc. IEEE Int. Conf. Acoustics, Speech and Signal Processing (ICASSP-87, Dallas, Joachim Weickert obtained a M.Sc. in Industrial Mathematics in April 1987), 1987, pp. 273–276. 1991 and a Ph.D. in Mathematics in 1996, both from Kaiserslautern 62. M. Sato, T. Wada, and H. Kawarada, “A morphological study University, Germany. After receiving the Ph.D. degree, he worked as on structure line,” in Proc. 9th Int. Conf. Pattern Recognition post-doctoral researcher at the Image Sciences Institute of Utrecht (ICPR 9, Rome, Nov. 14–17, 1988), 1988, pp. 559–562. University, The Netherlands. In 1997 he joined the computer vi- 63. J. Sporring, M. Nielsen, L. Florack, and P. Johansen (Eds.), sion group of the Department of Computer Science at Copenhagen Gaussian Scale-Space Theory, Kluwer: Dordrecht, 1997. University. His current research interests include all aspects of par- 64. J.L. Stansfield, “Conclusions from the commodity expert tial differential equations and scale-space theory in image analysis. project,” A.I. Memo 671, Artificial Intelligence Lab., Mas- He authored the book “Anisotropic Diffusion in Image Processing” sachusetts Inst. of Technolgy, Cambridge, MA, USA, 1980. (Teubner Verlag, Stuttgart, 1998). 65. T. Wada, Y.H. Gu, and M. Sato, “Scale-space filtering for pe- riodic waveforms,” Systems and Computers in Japan, Vol. 22, pp. 45–54, 1991. 66. T. Wada and M. Sato, “Scale-space tree and its hierarchy,” in Proc. 10th Int. Conf. Pattern Recognition (ICPR 10, Atlantic City, June 16–21, 1990), IEEE Computer Society Press: Los Alamitos, 1990, Vol. 3, pp. 103–108. 67. J. Weickert, “Nonlinear diffusion scale-spaces: From the con- tinuous to the discrete setting,” in ICAOS ’96: Images, and PDEs, M.-O. Berger, R. Deriche, I. Herlin, J. Jaffr´e, and Seiji Ishikawa obtained B.E., M.E., and Ph.D. from Tokyo Uni- J.-M. Morel (Eds.), Lecture Notes in Control and Information versity in 1974, 1976, and 1979, respectively, where he majored in Sciences, Springer: London, 1996, Vol. 219, pp. 111–118. Mathematical Engineering and Instrumentation Physics. In 1979, he 252 Weickert, Ishikawa and Imiya

joined the Department of Computer Science in Kyushu Institute of Technology and he is currently professor at the Department of Con- trol and Mechanical Engineering. He was a visiting research fellow at Sheffield University, UK, from 1983 to 1984, and a visiting pro- fessor at Utrecht University, The Netherlands, in 1996. His research interests include three-dimensional shape recovery, medical image analysis, consistent labeling, and scale-space.

Atsushi Imiya received B.E., M.E., and Dr. Eng. from Tokyo Ins- titute of Technology in 1980, 1982 and 1985, respectively. In his graduate course, Prof. Iijima was his supervisor. His dissertation, however, was not on scale-space theory, but on the theory of integral equations. Since 1998 he is Professor of Artifical Intelligence and Cognitive Science at Chiba University, and he is also a member of the Computing Center of Chiba University. He has acted as a secretary of SIG-CV of IPSJ. His main interests are artificial intelligence and mathematics.