Journal of Mathematical Imaging and Vision 10, 237–252 (1999) c 1999 Kluwer Academic Publishers. Manufactured in The Netherlands.
Linear Scale-Space has First been Proposed in Japan
JOACHIM WEICKERT Department of Computer Science, University of Copenhagen, Universitetsparken 1, 2100 Copenhagen, Denmark [email protected]
SEIJI ISHIKAWA Department of Control Engineering, Kyushu Institute of Technology, Sensuicho 1-1, Tobata, Kitakyushu 804, Japan [email protected]
ATSUSHI IMIYA Department of Information and Computer Sciences, Faculty of Engineering, Chiba University, 1-33 Yayoi-Cho, Inage-Ku 263, Chiba, Japan [email protected]
Abstract. Linear scale-space is considered to be a modern bottom-up tool in computer vision. The American and European vision community, however, is unaware of the fact that it has already been axiomatically derived in 1959 in a Japanese paper by Taizo Iijima. This result formed the starting point of vast linear scale-space research in Japan ranging from various axiomatic derivations over deep structure analysis to applications to optical character recognition. Since the outcomes of these activities are unknown to western scale-space researchers, we give an overview of the contribution to the development of linear scale-space theories and analyses. In particular, we review four Japanese axiomatic approaches that substantiate linear scale-space theories proposed between 1959 and 1981. By juxtaposing them to ten American or European axiomatics, we present an overview of the state-of-the-art in Gaussian scale-space axiomatics. Furthermore, we show that many techniques for analysing linear scale-space have also been pioneered by Japanese researchers.
Keywords: scale-space, axiomatics, deep structure, optical character recognition (OCR)
1. Introduction should simplify the image without creating spurious structures. Since a scale-space introduces a hierarchy A rapidly increasing number of publications, work- of the image features, it constitutes an important step shops and conferences which are devoted to scale-space from a pixel-related image description to a semantical ideas confirms the impression that the scale-space image description. paradigm belongs to the challenging new topics in com- Usually a 1983 paper by Witkin [71] or an un- puter vision. This is also reflected in a number of recent published 1980 report by Stansfield [64] are regarded books on this subject [14, 18, 19, 47, 63, 68]. as the first references to the scale-space idea. Witkin In scale-space theory one embeds an image f : obtained a scale-space representation by convolution 2 R → R into a continuous family {Tt f | t 0} of of the original image with Gaussians of increasing gradually smoother versions of it. The original image width. Koenderink [43] pointed out that this Gaussian corresponds to the scale t = 0 and increasing the scale scale-space is equivalent to calculating (Tt f )(x) as the 238 Weickert, Ishikawa and Imiya solution u(x, t) of the linear diffusion process always possible to derive the Gaussian scale-space as X the unique possibility. The fact that derivations such ∂ = ∂ = 1 , as [13, 48, 52, 56] have been found recently shows t u xi xi u : u (1) i that linear scale-space axiomatics belong to the current research topics in computer vision. u(x, 0) = f (x). (2) However, since the linear diffusion equation is well- established in mathematics and physics since Fourier’s Soon this linear filtering became very popular in pioneering works in 1822 [16], and image processing image processing, and many results have been ob- was already an active field in the fifties, one might won- tained with respect to the axiomatization, differential der whether the concept of Gaussian scale-space is not geometry, deep structure, and applications of linear dif- much older as well. Koenderink [44] states in a very fusion filtering. An excellent overview of all these as- nice preface discussing how scale-related ideas can be pects can be found in the recent book edited by Sporring traced back in literature, poetry, painting and cartogra- et al. [63]. phy that “the key ideas have been around for centuries Perona and Malik [57] pioneered the field of non- and essentially everything important was around by the linear diffusion processes, where the diffusivity is end of the nineteenth century.” The goal of the present adapted to the underlying image structure. Many reg- paper is to supplement these statements by showing that ularized variants of the Perona-Malik filter are well- not only the ingredients, but also their final mixture and posed and possess scale-space properties [68]. There application to image processing is much older than gen- exist also generalizations of the latter theories to erally assumed. To this end we present four Japanese anisotropic diffusion processes with a diffusion ten- linear scale-space approaches, which are older than sor [68] and to dynamic scale-space paradigms which American and European ones. The first one of them can be analyzed in terms of algebraic invariance the- dates back to 1959. ory and differential and integral geometry [58]. Other The outline of this paper is as follows: In Section 2 important classes of nonlinear scale-spaces are related we describe the basic ideas of a one-dimensional ax- to morphology. Some of them are continuous-scale iomatic for Gaussian scale-space that has been dis- versions of classical morphological processes such as covered by Taizo Iijima in 1959 [22]. Section 3 dilation or erosion [5, 6, 40], others can be described studies a two-dimensional version of this axiomatic as intrinsic evolutions of level curves [1, 42, 54, 60]. leading to affine Gaussian scale-space. It has been These geometric scale-spaces are generated by non- established in 1962. In 1971 Iijima presented a linear partial differential equations (PDEs) which are more physical two-dimensional axiomatics of affine designed to have Euclidean [5, 6, 40, 42], affine [1, 60] Gaussian scale-space [27]. Its principles are sketched or projective invariances [7, 11, 12, 54]. Overviews of in Section 4. The next section is concerned with these nonlinear theories can be found in [10, 18, 68]. further scale-space research of Iijima and his stu- Besides the above-mentioned continuous scale- dents. Section 6 describes a two-dimensional ax- space concepts there has also been research on how iomatic which has been found by Nobuyuki Otsu in to construct adequate discrete scale-space frameworks, 1981 [55]. In Section 7 we shall relate all these results both for linear [3, 46, 47, 59] and nonlinear diffusion to the well-known linear axiomatics that have been es- filtering [67, 68]. Since a more detailed discussion of tablished since 1984. We conclude with a discussion in these algorithmically important frameworks would be Section 8. In order to give the reader an impression of beyond the scope of this paper, we shall focus on con- the spirit of these Japanese papers, we stick closely to tinuous theories in the sequel. the notations in the original work. The discussions are This diversity of scale-space approaches has trig- supplemented with remarks on the philosophical back- gered people to investigate which axiomatic principles ground, physical and biological motivations, results on are involved in these theories and how they differ [1, 4, the deep structure in scale-space and applications to 5, 13, 15, 43, 46, 48, 52, 54, 56, 72]. Apart from a few optical character recognition (OCR).1 notable exceptions such as the morphological equiva- lent of Gaussian scale-space [5], affine morphological scale-space [1] and intrinsic geometric heat flows [54], 2. Iijima’s One-Dimensional Axiomatic (1959) most of these axiomatics use (explicitly or implicitly) one requirement: a linearity assumption (superposi- This section presents first some remarks on the his- tion principle). Within such a linear framework it was torical and philosophical roots of Japanese scale-space Linear Scale-Space has First been Proposed in Japan 239 research, and afterwards it describes the axioms and re- Iijima imposes basic principles which are in accor- sulting lemmas and theorems of the earliest scale-space dance with requirements from observation theory: a approach. robust object recognition should be invariant under changes in the reflected light intensity, parallel shifts 2.1. Historical and Philosophical Background in position, and expansions or contractions of the ob- ject. In addition to these three transformations he con- 8 Japanese scale-space research was initiated by Taizo siders an observation transformation which depends Iijima. After graduating in electrical engineering and on an observation parameter and which transforms ( ) ( ) mathematics from Tokyo Institute of Technology in the original image g x into a blurred version f x . 1948, he joined the Electrotechnical Laboratory (ETL). More precisely, he calls this class of blurring trans- In his thesis titled ‘A fundamental study on electro- formations ‘boke’ (defocusing), and he assumes that it has the structure2 magnetic radiation’ he derived the third analytical so- Z ∞ lution of the radiation equation. During these studies f (x) = 8[g(x0), x, ]= {g(x0), x, x0, }dx0, he acquired the mathematical and physical background ∞ for his later scale-space research. (3) At the ETL Iijima was involved in different re- satisfying four conditions: search activities on speech and pattern recognition. Triggered by actual needs such as optical charac- (I) Linearity (with respect to multiplications): If the ter recognition (OCR), but also voice typewriting, or intensity of a pattern becomes A times its origi- medical diagnosis, he wanted to establish a general nal intensity, then the same should happen to the theoretical framework for extracting characteristic in- observed pattern: formation of patterns. This framework should avoid extreme standpoints such as purely deterministic or 8[Ag(x0), x, ]=A8[g(x0), x, ]. (4) purely stochastic classifications, and it should make use of the original physical or geometric characteris- (II) Translation invariance: Filtering a translated tics of the patterns [26]. image is the same as translating the filtered im- Besides this problem-driven background, there was age: also a philosophical motivation for Iijima’s scale-space research. Its principles go back to Zen Buddhism, 8[g(x0 a), x, ]=8[g(x0), x a, ]. (5) and they can be characterized by the sentence “Any- thing is nothing, nothing is anything”. Applied to the (III) Scale invariance: If a pattern is spatially en- scale-space context this means to obtain the desired larged by some factor , then there exists a information, it is necessary to control the unwanted in- 0 = 0( , ) such that formation. The blurred scale-space evolution may be interpreted as a kind of unwanted information which 8[g(x0/ ), x, ]=8[g(x0), x/ , 0]. (6) helps to understand the semantical content of the un- blurred image. In this sense important information is (IV) (Generalized) semigroup property:Ifgis existing among seemingly unimportant information. observed under a parameter 1 and this observa- tion is observed under a parameter 2, then this 2.2. Axioms is equivalent to observing g under a suitable pa- rameter 3: 00 0 00 Iijima’s first axiomatic formulation of the scale-space 8[8[g(x ), x , 1],x, 2]=8[g(x ), x, 3], concept can be found in a technical paper from 1959 ti- (7) tled ‘Basic theory of pattern observation’ [22]. A jour- nal version of this paper has been published in 1962 where 3 = 3( 1, 2), but not necessarily 3 = under the title ‘Basic theory on normalization of pat- 1 + 2. tern (In case of typical one-dimensional pattern)’ [23]. Both papers are written in Japanese. The restriction to Later on we shall see that—in order to determine the the one-dimensional case is for simplicity reasons. Ex- Gaussian uniquely—this axiomatic has to be supple- tensions to the two-dimensional case are discussed in mented with a fifth requirement: preservation of posi- Section 3. tivity. 240 Weickert, Ishikawa and Imiya
2.3. Consequences In a next step Iijima simplifies the result of Lemma 4. The case 8[g(x0), x, ] 0 is of no sci- In [23] Iijima establishes four lemmas that confine the entific interest and is not considered any further. In class of integral operators (3) by consecutively impos- Eq. (10) the function can be eliminated by rescaling ing conditions (I)–(IV). via ( ) = k/ . Then the k-dependence in (11) vanishes by means of substitution of variables. Lemma 1. If 8 has the structure (3) and satisfies the The preceding results immediately yield linearity axiom (I), then it can be written as the integral Theorem 1. If 8 satisfies (3) and the axioms Z ∞ 8[g(x0), x, ]= g(x0) (x,x0, )dx0. (8) (I)–( IV), then it is given by ∞ Z ∞ 0 0 0 0 x x dx 8[g(x ), x, ]= g(x ) m (13) Lemma 2. If 8 is given by (8) and satisfies the trans- ∞ lation invariance axiom (II), then it can be written as a convolution operation: with Z Z ∞ ∞ 1 ( ) = ( 2m + ) 8 ( 0), , = ( 0) ( 0, ) 0. m u exp i u d [g x x ] g x x x dx (9) 2 ∞ ∞ (m = 1, 2,...). (14) Lemma 3. If 8 is given by (9) and satisfies the scale invariance axiom (III), then it can be written as For this family it follows that 0 in (III) becomes 0 = / , and in (IV) satisfies 8[g(x0), x, ] 3 Z ∞ 0 0 0 2m = 2m + 2m . = g(x ) ( ( )(x x )) ( ) dx , (10) 3 1 2 (15) ∞ = where ( ) is an arbitrary function of . For the special case m 1 Eq. (14) becomes 1 u2 Lemma 4. If 8 is given by (10) and satisfies the semi- (u) = √ exp , (16) 1 group axiom (IV), then the convolution kernel has the 2 4 specific structure which gives Z ∞ 1 2m 2m (u)= exp( k + i u) d 0 2 ∞ 8[g(x ), x, ] Z (k ∈ R, m = 1, 2,...), (11) 1 ∞ (x x 0)2 = √ g(x0) exp dx0. (17) 2 or 2 ∞ 4 0 0 Thus, 8[g(x ), x, ] is just the convolution between g 8[g(x ), x, ] 0. √ and a Gaussian with standard deviation 2. In a subsequent theorem Iijima establishes that the The proofs of the first three lemmas are rather short case m = 1 is of specific interest: if one requires that and not very complicated, whereas the longer proof 8 is positivity preserving, i.e., of Lemma 4 involves some more sophisticated reason- ings in the Fourier domain: in order to derive the nat- 8[ f (x0), x, ]>0 ∀ f(x)>0, ∀ >0, (18) ural numbers m in Lemma 4, he shows that 9( ), the Fourier transform of (u), must satisfy then m = 1 arises by necessity. The proof presents an