
DOCTORAL THESIS

Department of Computer Science, Electrical and Space Engineering
Division of Signals and Systems

ISSN 1402-1544
ISBN 978-91-7583-047-6 (print)
ISBN 978-91-7583-048-3 (pdf)
Luleå University of Technology 2014

Anders Landström

Elliptical Adaptive Structuring Elements for Mathematical Morphology

Anders Landström

Dept. of Computer Science, Electrical and Space Engineering
Luleå University of Technology
Luleå, Sweden

Supervisors: Matthew J. Thurley and Håkan Jonsson

Printed by Luleå University of Technology, Graphic Production 2014

Luleå 2014
www.ltu.se

To my mother and father, for all their support,

and Elizabeth, for all her patience.

Thank you.

Abstract

As technological advances drive the evolution of sensors as well as the systems using them, processing and analysis of multi-dimensional signals such as images become increasingly common in a wide range of applications, from consumer products to automated systems in the process industry. Image processing is often needed to enhance or suppress features in the acquired data, enabling better analysis of the signals and thereby better use of the system in question. Since imaging applications can be very different, image processing covers a wide range of methods and sub-fields.

Mathematical morphology constitutes a well-defined framework for non-linear image processing based on set relations. It relies on minimum and maximum values over neighborhoods (i.e. regions surrounding the individual points) defined by shapes or functions known as structuring elements. Classical morphological operations use a predefined structuring element which is applied in the same way at each point in the image. This is often not ideal, however, which has motivated the evolution of adaptive morphological filtering, where the structuring element changes from point to point. The field of adaptive mathematical morphology includes many different concepts with different strengths and weaknesses, and the specific choice of method should be made with the specific application in mind.

The main contribution of this thesis is a novel method for adaptive morphological filtering using Elliptical Adaptive Structuring Elements (EASE). The method enhances directional structures in images by orienting the structuring elements along the existing structure, and can be efficiently used to close gaps in such structures. The method is introduced by summarizing the underlying theory as well as presenting the practical application that motivated it: crack detection in casted steel. Furthermore, it is demonstrated how the method can be extended to allow filtering of incomplete (i.e. partially missing) image data without the need for pre-filtering. The EASE concept is also put in relation to other work through a survey of the field of adaptive mathematical morphology.

In conclusion, EASE allows fast structure-based adaptive morphological filtering of images based on solid mathematical theory, successfully enhancing directional structures such as lines and borders in the data. The method is user-friendly, as it requires no more than a few user-defined parameters, and can also be adapted for direct filtering of incomplete data.

Contents

Part I 1

Chapter 1 – Introduction 3
1.1 Background 3
1.2 Contribution 5
1.3 Related work 5
1.4 Thesis outline 7

Chapter 2 – The Application 9
2.1 Cracks in casted steel 9
2.2 Industrial Machine Vision 10
2.3 3D profile data 10
2.4 Photometric stereo 11

Chapter 3 – Underlying Theory 15
3.1 Digital images 15
3.2 Mathematical morphology 16
3.3 The Local Structure Tensor 26
3.4 Normalized Convolution 26
3.5 Normalized Differential Convolution 29

Chapter 4 – Elliptical Adaptive Structuring Elements 31
4.1 The EASE concept 31
4.2 Required parameters 34
4.3 Implementation and computational issues 35
4.4 EASE for incomplete data 36
4.5 Strengths and weaknesses 38

Chapter 5 – Contributions 41
5.1 Paper A 41
5.2 Paper B 42
5.3 Paper C 42
5.4 Paper D 42
5.5 Paper E 43
5.6 Paper F 43

Chapter 6 – Conclusions and Future Work 45
6.1 Conclusions 45
6.2 Future work 47

Part II 57

Paper A 59
1 Introduction 61
2 Measurements 64
3 Segmentation 65
4 Classification 73
5 Results 77
6 Discussion 78
7 Conclusion 80
8 Future work 81

Paper B 85
1 Introduction 87
2 Method 90
3 Implementation 92
4 Results 93
5 Discussion 100
6 Conclusion 101
7 Future work 101

Paper C 105
1 Introduction 107
2 Method 109
3 Results 117
4 Discussion 119
5 Conclusion 120
6 Future work 120

Paper D 123
1 Introduction 125
2 Method 126
3 Experiments and results 129
4 Discussion 132
5 Conclusion 134

Paper E 137
1 Introduction 139
2 Theory 141
3 Method 143
4 Experiments and results 145
5 Discussion 150

Paper F 155
1 Introduction 157
2 Overview of adaptive mathematical morphology 158
3 Theory 163
4 Selected methods 168
5 Experimental results 171
6 Discussion 175
7 Perspectives and trends 176

Acknowledgments

First of all, I would like to thank my supervisors Matthew Thurley and Håkan Jonsson for their support and guidance throughout this work. I am very grateful for the amount of freedom in my research, which has encouraged me to venture into new areas. I would also like to thank all colleagues at the Department of Computer Science, Electrical and Space Engineering at Luleå University of Technology. You have all contributed to making my PhD studies a memorable time, in a very positive sense. An extra thank you goes to everyone at the Division of Signals and Systems, and in particular to Roland Hostettler and Martin Simonsson for many good discussions on various topics – sometimes work-related, sometimes not. A special thank you goes to Frida Nellros, who has been a very good friend and colleague throughout many years of studies. Another special thank you goes to Vladimir Ćurić and Cris Luengo Hendriks at Uppsala University, for very good collaboration. This thesis would not have been the same without our discussions.

A thank you also goes to our measurement technology partners, Kemi-Tornio University of Applied Sciences (KTUAS). More specifically: Harri Pikkarainen, Jukka Leinonen, Juha Maronen, and Pauli Vaara. Furthermore, I have greatly appreciated the collaboration with industry throughout this thesis, which has provided an interesting challenge and a solid motivation for my work. I especially want to thank Robert Johansson and Mats Emmoth at SSAB Luleå for their interest and assistance. My intention has always been to do research that can be of practical use, and you have provided me with that connection. I would also like to acknowledge ProcessIT Innovations, and in particular Pär-Erik Martinsson, for all invested time and effort.

Finally, I would like to express my gratitude towards my family: for all support throughout this journey, and for always being understanding when time has been scarce.
This work was partly supported by the EU Interreg IVA Nord program and Jernkontoret.

Anders Landström
Luleå, November 2014

Part I

Chapter 1

Introduction

“Use a picture. It’s worth a thousand words.” – Arthur Brisbane

1.1 Background

1.1.1 Automated systems and Machine Vision

As technology advances, automated systems become more and more common. These solutions, designed to assist in different ways, often rely on processing of acquired sensor information, i.e. signals, in order to fulfill their task. Consider, for instance, an automated surveillance system using cameras to track movement [1], a car detecting pedestrians using a radar sensor [2], or an industrial system measuring the dimensions of produced goods using a laser scanner [3]. The complexity of the signals as well as of the systems can of course vary greatly, but the common task of many automated systems is to interpret signals in order to behave or react as desired. Many more advanced systems rely on multi-dimensional signals, which do not vary with respect to one variable only (e.g. time or distance) but can vary with respect to multiple dimensions. Images constitute a common type of such signals, where the color values captured by the sensor (i.e. camera) vary in both an x-direction (left–right) and a y-direction (down–up). Other examples of multi-dimensional signals are volumetric data (which varies with respect to x, y, and a third spatial axis z) or video (which varies with respect to x, y, and a time variable t). Automated systems based on multi-dimensional data are often referred to as Machine Vision (MV) or Computer Vision (CV) systems, and rely on image processing and analysis. This thesis will focus on data which can be represented in two dimensions, i.e. can be depicted as an image, but the presented theory can be extended into higher dimensions as well.


1.1.2 Image processing and analysis

When an image captured by camera sensors is presented as an input signal to a system, it is merely a collection of numbers in a known order representing different spatial positions. These positions correspond to the smallest spatial entities in the resulting digital images, which are known as pixels. From these pixel values, which together form digital images, the system must then extract more high-level knowledge. Image processing and analysis constitute vital tools for extracting usable information for automated systems based on the acquisition of images or other types of multi-dimensional data. Image processing, on the one hand, deals with signal transformations and filtering and is typically used to enhance or suppress features in image data, e.g. smoothing, sharpening, etc. Image analysis, on the other hand, covers the extraction of desired information from digital images, usually at a higher level, e.g. detected objects or estimated movement direction. However, there is no distinct border where image processing turns into image analysis [4]. While both topics constitute essential parts of an automated machine vision system, this thesis will largely focus on the former.

Image processing comprises many different types of tools and filters. While many operations can be expressed in a linear framework, where the image is typically convolved with a filter kernel either directly in the spatial domain or indirectly in the Fourier domain, there is also a substantial number of non-linear alternatives. The specific application considered in this thesis – detection of cracks in casted steel – is one example of a task where linear filters based on weighted sums of pixels may easily suppress the thin and partially broken crack signatures, which has motivated the exploration of non-linear methods. More specifically, this thesis presents a strategy for adaptive morphological filtering, which is based on minimum and maximum values over sets of points. The presented application, which deals with crack inspection, is quite specific, but the method itself can be applied to any type of image data.
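The difference between weighted-sum (linear) filtering and min/max filtering can be illustrated with a short sketch. This is a hypothetical NumPy example, not code from the thesis: a one-sample-wide dark "crack" in a bright 1D signal is diluted by a sliding mean but preserved exactly by a sliding minimum.

```python
import numpy as np

# Hypothetical 1D signal: bright background with a one-sample-wide dark "crack".
signal = np.ones(21)
signal[10] = 0.0

def sliding(values, size, func):
    """Apply func over odd-length sliding windows (edge-padded)."""
    r = size // 2
    padded = np.pad(values, r, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, size)
    return func(windows, axis=1)

mean5 = sliding(signal, 5, np.mean)  # linear smoothing dilutes the crack to 0.8
min5 = sliding(signal, 5, np.min)    # sliding minimum keeps it at 0 (and widens it)
max5 = sliding(signal, 5, np.max)    # sliding maximum removes it entirely
```

The sliding minimum and maximum are exactly the gray-scale erosion and dilation (with a flat, symmetric structuring element) introduced in the next section.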

1.1.3 Mathematical morphology

Mathematical morphology, originally developed by Matheron [5] and Serra [6], is a well-defined framework and a powerful tool for non-linear image processing. Morphological filters have been used for a wide range of tasks including analysis of DNA microarrays [7], LiDAR data [8], aggregates [9, 10], and pellets [3], to mention a few. The framework is based on shapes or functions known as structuring elements, which are used to probe the image. More specifically, the two basic operations, erosion and dilation, extract the minimum or maximum value, respectively, within the neighborhood defined by the structuring element. For a more thorough introduction to mathematical morphology, including formal definitions, the reader is referred to Chapter 3, Sect. 3.2.

Classical morphological operators are non-adaptive, i.e. the whole image is processed in the exact same way without taking variations in structure into account. This is often far from ideal, however, unless the aim of the operation is to detect a specific type of fixed signature. This has led to the development of adaptive mathematical morphology, which can be viewed as the morphological counterpart to other well-known methods for adaptive filtering [11–14] where the filter kernel should not operate across edges in the image. Morphological operators not only remove noise but also preserve shapes in the image well, since they rely on minimum or maximum values rather than the median or weighted mean over pixel values within the kernel.
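The probing idea can be sketched in a few lines of NumPy for the binary 2D case. This is a minimal toy implementation (the offsets, image, and wrap-around border handling are illustrative choices, not the thesis's implementation):

```python
import numpy as np

# Offsets of a cross-shaped structuring element (origin at its centre) - an example SE.
CROSS = [(0, 0), (-1, 0), (1, 0), (0, -1), (0, 1)]

def binary_erosion(image, offsets):
    """A point is kept only if the SE, translated there, fits inside the foreground."""
    out = np.ones_like(image)
    for dy, dx in offsets:
        # np.roll gives the image value at (y + dy, x + dx); borders wrap around,
        # so this toy version is only valid for shapes away from the image edges.
        out &= np.roll(image, (-dy, -dx), axis=(0, 1))
    return out

def binary_dilation(image, offsets):
    """A point is set if the reflected SE, translated there, hits the foreground."""
    out = np.zeros_like(image)
    for dy, dx in offsets:
        out |= np.roll(image, (dy, dx), axis=(0, 1))
    return out

img = np.zeros((7, 7), dtype=bool)
img[2:5, 2:5] = True                     # a 3x3 square
eroded = binary_erosion(img, CROSS)      # only the square's centre survives
dilated = binary_dilation(img, CROSS)    # the square grows by its 4-neighborhood
```

Note how erosion and dilation shrink and grow the square while keeping its blocky shape, rather than blurring its edges as a weighted mean would.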

1.2 Contribution

The main purpose of this thesis is to investigate and demonstrate what can be achieved by combining the non-linear framework of mathematical morphology with typical linear methods which efficiently estimate structure in the data. The result is a set of morphological filters based on Elliptical Adaptive Structuring Elements (EASE). The usefulness of the presented method is demonstrated through the application which has been the underlying catalyst of the work: automated detection of cracks in casted steel using Machine Vision.

Cracks constitute thin structures which have a main direction but can often, on a local level, be better described as randomly oriented structures due to their zig-zagging patterns. Moreover, crack signatures are often split into shorter segments which should ideally be linked together in order to simplify extraction of the crack. The situation is further complicated by the fact that the 3D profile data used is incomplete, i.e. contains missing pixel values, as a result of the measurement process. This motivated the following research questions:

RQ1 Can linear methods and mathematical morphology be combined into a general (i.e. not application-specific) filter able to efficiently enhance directional structures and link them together?

RQ2 How can partly missing data be handled in a method of this type?

RQ3 What would be the properties of such a filter?

The work presented in this thesis combines two fields that are often kept separate, thereby providing a generic method for adaptive mathematical morphology based on solid mathematical theory.
To some extent, the beauty lies in the simplicity: using the morphological framework – where parameters are directly linked to structures in the image – in combination with the well-known Local Structure Tensor (LST) – which can be easily calculated based on parameters that can be clearly linked to scale – the EASE concept for adaptive morphology can be defined for ordinary images as well as for partially incomplete data. It is therefore a simple tool to use for structure-based adaptive morphological filtering, allowing enhancement or linking of directional features in the data. As an example of a practical application of the presented method, this thesis also demonstrates its use in an industrial automated crack detection system.
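The kind of information the LST provides – a local orientation and a measure of how dominant that orientation is – can be illustrated with a schematic NumPy sketch. The synthetic ridge image, window choice, and anisotropy measure below are illustrative assumptions, not the thesis's implementation:

```python
import numpy as np

# Synthetic image: a ridge along the diagonal x = y, i.e. a 45-degree structure.
n = 64
ys, xs = np.mgrid[0:n, 0:n].astype(float)
f = np.exp(-((xs - ys) ** 2) / 20.0)

fy, fx = np.gradient(f)  # partial derivatives (rows ~ y, columns ~ x)

# Local Structure Tensor: outer products of the gradient, averaged over a window.
w = (slice(24, 40), slice(24, 40))
T = np.array([[np.mean(fx[w] ** 2),    np.mean(fx[w] * fy[w])],
              [np.mean(fx[w] * fy[w]), np.mean(fy[w] ** 2)]])

lam, vecs = np.linalg.eigh(T)  # eigenvalues sorted ascending: lam[0] <= lam[1]

# The dominant eigenvector points across the structure (gradient direction);
# the ridge itself runs perpendicular to it.
dominant = vecs[:, 1]
anisotropy = (lam[1] - lam[0]) / (lam[1] + lam[0])  # 0: isotropic, ~1: one clear direction
```

For this diagonal ridge the anisotropy is close to 1 and the dominant eigenvector is proportional to (1, −1), i.e. perpendicular to the ridge. For an isotropic patch the two eigenvalues would be comparable and no orientation should be imposed – precisely the dominancy information EASE exploits.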

1.3 Related work

This section briefly presents related work within the field of adaptive mathematical morphology, based on the more thorough survey presented in Paper F. It is concluded by comments on the presented method with respect to other published work.

Early theoretical work on morphological operators based on structuring elements that can change depending on their position in the image was conducted by Serra [15], and Charif-Chefchaouni and Schonfeld [16] later presented general theory for adaptive morphology in the binary case. Other early work on adaptive structuring elements was undertaken by Morales [17], Chen et al. [18], and Cheng and Venetsanopoulos [19]. During the last decade, interest in adaptive mathematical morphology has increased: methods such as general adaptive neighborhoods [20] and morphological amoebas [21] have been presented, and the theoretical framework for adaptive operations has been explored [22–25].

As structuring elements can be constructed in a variety of different ways, the resulting adaptive morphological filters can also be grouped in various ways. Maragos and Vachier [26] identified three groups:

• adaptivity with respect to the spatial position in the image,

• adaptivity with respect to gray level image values, and

• algebraic principles such as group and representation theory;

while Roerdink [25] distinguished between

• location-adaptive mathematical morphology (adaptability with respect to position) and

• input-adaptive mathematical morphology (adaptability with respect to image content).

It should be noted that the two latter groups match the first two groups considered by Maragos and Vachier well.
To complement previous work by providing a larger diversity of adaptivity, several mutually non-exclusive aspects to which existing methods can be associated were identified in Paper F:

• Similarity: Most current methods for adaptive mathematical morphology consider local similarity of neighboring pixels, i.e. each structuring element includes points that are (in some defined sense) similar to its origin (e.g. by comparing gray-scale values). Some methods allow for complete adaptivity of the shape of the structuring elements [20, 27], while others rely on the distance transform [28] or a geodesic distance which takes both spatial distance and gray level difference into account [29–31]. Other examples of this aspect are methods that use predefined shapes but vary the size of the shape depending on local similarity [32, 33].

• Structure: Structure in an image is related to similarity, but is still quite different: structure-based methods define structuring elements based on edges and contours, rather than restricting them by measures of similarity. This can be done by considering orientation only [34, 35] or by including other factors as well, such as distances to edges [36]. Some methods work at multiple scales in order to adapt the size of the structuring element to the local scale of structures in the image [37, 38].

• Partial Differential Equations: The main morphological operators can also be defined by diffusion equations, which enables processing by solving Partial Differential Equations (PDEs). Structuring elements are here defined implicitly by a unit ball that deforms over time in the PDE, rather than explicitly as a spatial shape or function [39]. PDE-based methods operate in a continuous framework and can of course consider both similarity [40] and structure [41]; they have also been defined for graph structures [42].

• Graphs: Adaptive morphological operations can also be defined on graphs. This type of filter has been implemented using PDEs [42] as well as minimum spanning trees [43] and paths [44]. A general framework for mathematical morphology on graphs, which handles spatially variant structuring elements, has also been presented [45].

• Group adaptivity: Morphological operators are in most cases defined as translation-invariant operators, but this approach is not always useful [46]. This has motivated the concept of group morphology [47], where operations are instead invariant under other transformations, such as non-commutative symmetry groups, rotation groups, or similar.

The method presented in this thesis (Papers B and E), which relies on both orientation and rate of anisotropy, is a clear example of input-adaptive mathematical morphology based on structure. As such, its structuring elements follow structures in the data but are still allowed to stretch outside regions of similarity, which is a requirement for any method intended for prolonging or linking of segments. There are similarities with other methods, such as the line-shaped or rectangular structuring elements presented by Verdú-Monedero et al. [35, 36, 48] or the continuous PDE-based morphology presented by Breuß et al. [41], which also use the LST explicitly or implicitly, but the EASE concept uses the full power of the LST in that not only the estimated orientation but also its dominancy is considered. This is a major difference, as one can always find a direction even though it may very well be completely irrelevant (which is the case when there is no prevalent direction). Using the dominancy information contained in the LST does not impose an orientation in such cases, which avoids introducing a geometrical bias.

1.4 Thesis outline

This thesis consists of two parts: I and II. Part I provides an overview of the work, while Part II contains the scientific papers which constitute its backbone. Following this introduction, Chapter 2 contains a description of the particular application that motivated it: automated detection of cracks in casted steel using Machine Vision. The underlying theory used for the presented filter method is then presented in Chapter 3, while the core of the contribution of this work – the Elliptical Adaptive Structuring Elements (EASE) – is summarized in Chapter 4. Chapter 5 then provides a summary of the scientific contributions in Part II of this work, while conclusions and thoughts regarding further work are presented in Chapter 6.

Chapter 2

The Application

“Real data is an insult to good theory.” – Unknown

2.1 Cracks in casted steel

The vast majority of the world's steel production tonnage is solidified by continuous casting [49]. The method has advantages in productivity, cost reduction and output quality, but because of factors such as design, operation, and maintenance it risks introducing various surface defects [50]. Being the major (semi-finished) product within steel casting, steel slabs (see Fig. 2.1) are thereby susceptible to crack formation. Since steel slabs are often intended for sheet steel rolling, such surface defects risk causing long sections of defective end-user products. Consequently, inspecting steel slabs before sending them through to the rolling mill, thereby avoiding related problems at the later stage, is important to the steel industry. Automated systems, providing an objective and consistent result which can be more easily analyzed than manual inspection, can improve production efficiency as well as working conditions in steel casting. Yet, much inspection of casted steel is still manually operated [51, 52].

The application considered in this thesis, which has motivated the development of the presented method for adaptive morphology, is automated crack inspection of casted steel slabs. In order to enable on-line testing of produced steel goods, Non-Destructive Testing (NDT) is required. Several NDT solutions have been presented for automated inspection of steel, based on various data acquisition techniques such as thermocouples [53], eddy currents [54], magnetic powder [55], ultrasound [56] or sulfur prints [57]. This work has investigated the use of two data acquisition techniques: 3D profile data and photometric stereo.



Figure 2.1: Examples of a steel slab (a) and a longitudinal surface crack (b).

2.2 Industrial Machine Vision

Machine Vision is a common tool within industrial applications, allowing automated measurement and inspection of input material as well as produced goods [58–63]. In order to provide a reliable result, however, data collected by sensors in a machine vision setup needs to be properly processed and analyzed. Development of new methods and robust algorithms for image processing and analysis is therefore an important aspect of the development and improvement of automated industrial systems for measurement and inspection.

Gray-scale intensity imaging is commonly used for steel surface inspection, in combination with various signal processing techniques such as wavelet transforms [64, 65], Gabor filters [52] and image morphology [52, 66, 67]. Yet, it has limitations: variations in lighting conditions may yield highly unpredictable gray levels which may cause pseudo-defects [65], and other parameters such as steel type may affect the image properties as well [67]. In addition to the more traditional machine vision approaches, numerous systems interpreting data collected by other Non-Destructive Testing (NDT) methods are also common in the literature; References [53–57] constitute a few examples.

2.3 3D profile data

As gray-scale imaging is not always suitable, other data acquisition methods have been investigated. Pernkopf [68] notes that range imaging provides better contrast for surface defects with “three-dimensional characteristics”, and that the occurrence of strong changes in the reflective properties of the casted steel surface motivates the use of range data over intensity imaging due to its lower sensitivity to inhomogeneous reflectance.


Figure 2.2: Laser line (enhanced in red) for laser triangulation of the surface profile for pellets on a conveyor belt, where the 3D structure is clearly visible (a, Original image from Ref. [3]), and for a steel surface (b).

This work considers 3D profile data collected by laser triangulation, where the position of a projected laser light (typically a laser line, see Fig. 2.2) is measured by a camera sensor at an offset angle, and the deviation from a zero position is converted to a surface height value. The resulting data describes the shape of a measured surface in spatial coordinates. Other examples of industrial systems using 3D profile data are size measurement of iron ore pellets [3], size measurement of limestone particles [10], and steel surface inspection [69].

The collected surface profile can be displayed as an image, where pixel values correspond to the height information retrieved by the sensor for each point on the surface. An example image of a steel slab is depicted in Fig. 2.3, where a few white “holes” in the data can also be seen. This effect comes from a common issue known as (self-)occlusion, which occurs when the projected laser light is hidden from the sensor, i.e. the line-of-sight between the sensor and the point to be measured is obstructed (see Fig. 2.4). This leads to holes of missing data in the measured height profile, where the height information is unknown.
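In an idealised triangulation geometry, where the camera observes the laser plane at a known angle θ from the surface normal, a lateral displacement d of the laser line maps to a height z = d / tan θ. The helper below is a hypothetical sketch under that simplifying assumption; a real scanner requires a full calibration model:

```python
import math

def height_from_offset(displacement_mm, camera_angle_deg):
    """Idealised laser triangulation: lateral line displacement -> surface height.

    Assumes the camera views the laser plane at camera_angle_deg from the
    surface normal (an illustrative simplification, not a calibrated model).
    """
    return displacement_mm / math.tan(math.radians(camera_angle_deg))
```

At a 45-degree viewing angle, for example, a 1.0 mm line displacement corresponds to 1.0 mm of surface height; steeper viewing angles trade height resolution for less self-occlusion.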

2.4 Photometric stereo

Another method for obtaining data with more 3D characteristics is photometric stereo, originally introduced by Woodham [70]. Photometric stereo uses multiple gray-scale images acquired from the same scene under illumination by differently placed light sources. Light from three different directions allows for uniquely determining the shape of a Lambertian surface (i.e. a surface which scatters light isotropically). More specifically, the different illumination patterns are converted into surface height gradient information by solving a linear system of equations. Two-source photometric stereo gives a less well-defined surface, but conditions for existence and uniqueness of the system of equations for the two-source case have been investigated [71].
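For the three-light Lambertian case, the per-pixel system is a 3×3 linear solve. A schematic NumPy sketch with assumed light directions and a simulated pixel (not the setup used in the papers):

```python
import numpy as np

# Three known, non-coplanar light directions (rows), normalised to unit length.
L = np.array([[0.0, 0.0, 1.0],
              [1.0, 0.0, 1.0],
              [0.0, 1.0, 1.0]], dtype=float)
L /= np.linalg.norm(L, axis=1, keepdims=True)

# Lambertian model: intensity i_k = albedo * (l_k . n). Simulate one pixel.
true_normal = np.array([0.2, -0.1, 1.0])
true_normal /= np.linalg.norm(true_normal)
true_albedo = 0.8
i = true_albedo * L @ true_normal  # the three measured intensities

# With three lights the scaled normal g = albedo * n solves L g = i exactly.
g = np.linalg.solve(L, i)
albedo = np.linalg.norm(g)     # recovered albedo: length of g
normal = g / albedo            # recovered unit surface normal
```

With only two lights, L is 2×3 and the system is underdetermined, which is why the two-source case needs the extra existence and uniqueness conditions cited above.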

Figure 2.3: A 150×100 mm (width×length) region of 3D profile data for a steel slab containing a crack, viewed in 3D (a) and from above (b). White pixels, e.g. at the bottom of the image on the right side of the crack (clearly seen in the 3D view), correspond to occluded data.

Figure 2.4: Example of self-occlusion. The height profile in the dashed red region is below the line-of-sight of the sensor and cannot be captured.

Figure 2.5: The photometric stereo setup used in Paper C: (a) Setup example with light sources L1 and L2 (blue and yellow) and the scan line in black; the gray area represents a part of the slab. (b) A portable measurement setup installation.

Similar to other presented work [72, 73], Paper C is based on a two-source setup (see Fig. 2.5) which consists of yellow and blue lights that are visible in the green and blue channel, respectively, of the image. The benefit of this setup is that two different images can be obtained by considering the two color channels, while constant lights can be used in conjunction with a line camera that ensures a static geometrical setup of the scene.

Chapter 3

Underlying Theory

“In theory, theory and practice are the same. In practice, they are not.” – Albert Einstein

3.1 Digital images

We now turn our interest to the toolbox required to define the presented method. Digital images are usually defined on a discrete rectangular grid of square elements known as pixels, and can be considered a specific sampling of an underlying continuous function (with the pixels as sampling points). Hence a digital image can be considered a two-dimensional matrix of such pixels, in which each value represents the image value for that specific position. We will express images by the underlying continuous functions in the following theory, and the more general concept of points will therefore be used. We denote a point (pixel) position by a vector, e.g. x, and its corresponding image value by a function value, e.g. f(x). An image with point values given by the image function f(x) is often referred to simply as f.

In the most basic type of images, point values are limited to one of two distinct values: false (0) or true (1). Such images are called binary and their image function range can be expressed as R(f) ∈ N0[0, 1], where f is the image function in question and N0 represents the natural numbers including zero. If we consider the image function range R[0, 1] instead of N0[0, 1], we go from binary to gray-scale images¹. The exact range R(f) is limited by the image color depth, which defines the number of values f can assume. This can be generalized into complex numbers or higher dimensions as well. For instance, the range of a color image, represented by a three-element vector for each point (values for the red, green, and blue color channels), lies in R³.

Commonly used within the field of image processing is the concept of tensors, which is a general notation for scalars, vectors and matrices. More specifically, a scalar is a tensor of order zero, a vector (which is a one-dimensional matrix) a tensor of order one, a two-dimensional matrix a tensor of order two, etc. In a gray-scale image a scalar value, and thereby a tensor of order zero, is assigned to each point. A color image, on the other hand, where each point is represented by three color values, can be considered a collection of tensors of order one. For a review of the use of tensors within image analysis, the reader is referred to Ref. [74].

¹ Gray-scale images can also be defined using natural numbers, e.g. N0[0, 255], but can always be rescaled to the interval R[0, 1].
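These range conventions can be made concrete with a few lines of NumPy (a trivial, hypothetical illustration):

```python
import numpy as np

gray_u8 = np.array([[0, 64, 128, 255]], dtype=np.uint8)  # values in N0[0, 255]
gray = gray_u8.astype(np.float64) / 255.0                # rescaled to R[0, 1]
binary = gray > 0.5                                      # thresholded binary image

# A color image assigns each point an order-one tensor (R, G, B) in R^3.
color = np.zeros((4, 4, 3))
```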

3.2 Mathematical morphology

Mathematical morphology, originally developed by Matheron [5] and Serra [6], is a framework for non-linear image processing based on set theory, focusing on geometrical structure. It relies on structuring elements, i.e. shapes or functions s which are used to probe an image f by considering where in the data they do or do not fit². Here f and s denote binary or gray-scale (zeroth order tensor) data. A short introduction to mathematical morphology, based on theory found in the works of Serra [6, 15] and Bouaynaya and Schonfeld [22], is given below. For a more extensive background on the subject, the reader is referred to the given references. A more thorough introduction to the non-adaptive case can also be found in Ref. [83].

Morphological operations can be expressed using both functional notation and operator symbols. These two formulations can be used interchangeably, i.e.

(·)s(f) = f (·) s    (3.1)

for any morphological operation (·) operating on the image f using the structuring element s. We will start by considering non-adaptive morphology, in both the binary and the gray-scale case. These operations use rigid structuring elements, which remain the same for each point in the processed image. We then move on to adaptive morphology, where the structuring element is no longer rigid but may vary from point to point.

3.2.1 Binary morphology

In the binary case, consider an image f(x) ∈ N0[0, 1] defined for points x within the image domain D(f) (see example in Fig. 3.1a). The information within f can then be equivalently expressed as the set

A = {x ∈ D(f) : f(x) ≠ 0} .    (3.2)

Similarly, a binary structuring element s(u) ∈ N0[0, 1] (see example in Fig. 3.1b) and its reflected structuring element s∗(u) (i.e. the reflection of s through the origin; see Fig. 3.1c), defined for points u in the domain D(s), can be denoted, respectively, as the sets

B = {u ∈ D(s) : s(u) ≠ 0} ,    (3.3)

² The morphological framework can also be defined on lattice structures using algebraic tools [75, 76], by Partial Differential Equations (PDEs) [77–79], and on graphs [45, 80–82].


Figure 3.1: A binary image f with origin at the upper left (a), a centered structuring element s (b) and its reflection s∗ (c), and the structuring elements s and s∗ translated to a point x (d,e). The “+” sign marks the origin for the image coordinates.

B∗ = {u ∈ D(s) : s(−u) ≠ 0} .    (3.4)

A structuring element sx(u) translated to a point x in the image f (Fig. 3.1d) and its reflection through x, s∗x(u) (Fig. 3.1e), can then be expressed, respectively, in set notation as [6, 83]

Bx = {u : u − x ∈ B} ,    (3.5)

B∗x = {u : x ∈ Bu} .    (3.6)

Note that

u ∈ B∗x ⇐⇒ x ∈ Bu,    (3.7)

which we will use as the primary definition of the reflected (sometimes also referred to as the transposed or reciprocal) structuring element. We can also write

B∗x = {u : x − u ∈ B} .    (3.8)

We now need a fundamental concept within mathematical morphology: duality of operators. A dual operator Ψ∗ of Ψ operating on a set A is defined as [6]

Ψ∗(A) = Ψ(Ac)c,    (3.9)

where (·)c denotes the set-theoretic complement of the set in question. By its definition, Ψ∗∗ = Ψ. Note that the symbol (∗) used to denote the dual operator is the same symbol which is used to denote the reflected structuring element. However, there is no risk of confusion, since Eq. (3.9) considers operators while reflected structuring elements are functions (or sets). Morphological operations are then based on two operations which are dual with respect to the reflected structuring element: the erosion and the dilation of an image f(x). The erosion and dilation of a binary image f using a binary structuring element s, denoted εs(f) (or f ⊖ s) and δs(f) (or f ⊕ s), respectively, are defined as [6]

εs(f) = f ⊖ s = A ⊖ B = {x : Bx ⊆ A} ,    (3.10)

c ∗ c ∗ δs(f) = f ⊕ s = A ⊕ B = (A B ) = {x : Bx ∩ A 6= ∅} . (3.11) 18 Underlying Theory


Figure 3.2: Morphological operations for the image and structuring element in Fig. 3.1; erosion (a), dilation (b), opening (c), and closing (d).

The very definition of the dilation with respect to the erosion, i.e. A ⊕ B = (Ac ⊖ B∗)c, makes a dilation by B∗ dual to an erosion by B, and vice versa, i.e. [6]

δ∗s∗(f) = (Ac ⊕ B∗)c = A ⊖ B = εs(f).    (3.12)

Geometrically, the erosion shrinks A by keeping only the points for which the corresponding translated structuring element is completely covered by A, while the dilation increases A by adding all points for which the corresponding translated structuring element touches A (see Figs. 3.2a–3.2b). Note that (A ⊖ B) ⊆ A ⊆ (A ⊕ B) for any structuring element B containing its origin, with equality if and only if B consists of one single (centered) point. Another fundamental property, which yields a consistent framework for mathematical morphology, is adjunction. The operations ε and δ are adjunct if and only if, for any functions f and g,

δ(f)(x) ≤ g(x) ∀x ∈ D(f, g) ⇐⇒ f(x) ≤ ε(g)(x) ∀x ∈ D(f, g),    (3.13)

where D(f, g) denotes the domain on which the functions f and g are operating. The importance of the adjunction property comes from the fact that adjunction holds for two operations δ and ε if and only if δ is a dilation and ε an erosion. Moreover, given that δ and ε are adjunct, their compositions γs(f) = δs(εs(f)) and ϕs(f) = εs(δs(f)) (also denoted f ◦ s and f • s) are an opening and a closing, respectively. That is, γ and ϕ fulfill a certain set of required properties (γ is increasing, idempotent, and anti-extensive, while ϕ is increasing, idempotent, and extensive) [76]. The defined erosion and dilation are indeed adjunct, which will be proved for the general case in Section 3.2.3. Hence, the opening and closing, which are likewise dual to each other with respect to the reflected structuring element, can in the binary case be defined as [6]

γs(f) = f ◦ s = A ◦ B = (A ⊖ B) ⊕ B = ⋃_{x∈D(f)} {Bx : Bx ⊆ A} ,    (3.14)

ϕs(f) = f • s = A • B = (A ⊕ B) ⊖ B = (Ac ◦ B∗)c .    (3.15)


Figure 3.3: The Umbra transform U[f] of a one-dimensional signal f(x).

The opening keeps the parts of A where the structuring element B fits, while the closing increases A by adding points excluded from an opening on the complement of A by the reflected structuring element B∗ (see Figs. 3.2c–3.2d).
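The set formulations in Eqs. (3.10)–(3.15) can be sketched directly in code. The following is a minimal illustration (not the thesis implementation) using Python sets of pixel coordinates; `domain`, `A`, and `B` are made-up toy data:

```python
def translate(B, x):
    """Translate the offset set B to the point x (Eq. (3.5))."""
    return {(u[0] + x[0], u[1] + x[1]) for u in B}

def erosion(A, B, domain):
    """Binary erosion (Eq. (3.10)): keep x where the translated SE fits inside A."""
    return {x for x in domain if translate(B, x) <= A}

def dilation(A, B, domain):
    """Binary dilation (Eq. (3.11)): keep x where the reflected SE touches A."""
    B_star = {(-u[0], -u[1]) for u in B}
    return {x for x in domain if translate(B_star, x) & A}

def opening(A, B, domain):      # Eq. (3.14): erosion followed by dilation
    return dilation(erosion(A, B, domain), B, domain)

def closing(A, B, domain):      # Eq. (3.15): dilation followed by erosion
    return erosion(dilation(A, B, domain), B, domain)

# Toy data: a 2-pixel-thick horizontal bar and a centered 1x3 horizontal SE.
domain = {(i, j) for i in range(7) for j in range(7)}
A = {(i, j) for i in (2, 3) for j in range(1, 6)}
B = {(0, -1), (0, 0), (0, 1)}
```

Since the bar is wider than the structuring element, the opening and closing both return A unchanged, and the ordering (A ⊖ B) ⊆ A ⊆ (A ⊕ B) noted above holds.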

3.2.2 Gray-scale morphology

Binary morphological operations can be extended for use in gray-scale images, i.e. images f where R(f) = R[0, 1]. The structuring element is then generalized into a function s ∈ R. A well-known strategy for the gray-scale extension of the binary operations relies on the Umbra transform U[f], defined as

U[f] = {(x, y) ∈ (D(f) × R(f)) : y ≤ f(x)} .    (3.16)

If we, for illustration purposes, consider a one-dimensional signal f(x), the Umbra transform converts the function to a two-dimensional region U[f](x, y) (i.e. the area underneath the function f, see Fig. 3.3). The function f(x) can be retrieved from its Umbra transform U[f] by extracting its top surface T(U[f]). Hence, we have

f = T(U[f]) = ⋁ {y ∈ Dy(U[f]) : (x, y) ∈ U[f]} ∀x ∈ Dx(U[f]),    (3.17)

where ⋁ denotes the maximum operation. This relation between functions and their Umbra transforms is not necessarily true for any continuous function: the function in question must be upper-semi-continuous [15]. Functions defined on a discrete domain, however, are always upper-semi-continuous, and this is therefore not an issue when dealing with digital images defined on a discrete grid [22]. Using the Umbra transform, erosion and dilation can be redefined for the gray-scale case by [6]

(·)s(f) = f (·) s = T(U[f] (·) U[s]).    (3.18)

Note that Umbra transforms are sets, and Equations (3.10)–(3.11) therefore apply on the right hand side of Equation (3.18). This definition of gray-scale morphology allows


Figure 3.4: Umbra structuring elements: U[s] (a), U[s](x,y) (b), and U[s]∗(x,y) (c).

for the properties of binary morphology to be kept in the operator conversion. As in the binary case, the opening and closing are defined by combining the erosion and dilation, but Eq. (3.18) actually holds for those operations as well. The resulting gray-scale morphological erosion and dilation can be more simply expressed using the translated structuring element

sx(u) = s(u − x)    (3.19)

and its reflection [24]

s∗x(u) = su(x) = s(x − u)    (3.20)

(compare to Equations (3.5)–(3.8)). Note that the reflected structuring element s∗x is actually simply a translation of s(−u). In the Umbra domain, the translated structuring element and its reflection are given by (see Fig. 3.4)

U[s](x,y) = U[sx + y] = U[sx] + y = {(u, v): v ≤ sx(u) + y} , (3.21)

U[s]∗(x,y) = {(u, v) : (x, y) ∈ U[s](u,v)} = {(u, v) : v ≥ y − s∗x(u)} .    (3.22)

Hence, x and y translate the Umbra structuring element across the Umbra space, just as the binary structuring element was previously translated within the image. Gray-scale erosion of f(x) by s(u) is then defined by Eqs. (3.18) and (3.10) as [6]

εs(f) = f ⊖ s = T[U[f] ⊖ U[s]] = T[{(x, y) : U[s](x,y) ⊆ U[f]}]
      = T[{(x, y) : U[sx + y] ⊆ U[f]}]
      = T[{(x, y) : y ≤ ⋀_{u∈D(sx)} {f(u) − sx(u)}}]
      = ⋀_{u∈D(sx)} {f(u) − sx(u)}
      = ⋁_{y∈R(f)} {sx(u) + y ≤ f(u) ∀u ∈ D(f, sx)} .    (3.23)


Figure 3.5: Morphological operations on the Umbra: erosion (a), dilation (b), opening (c), and closing (d).

The dilation can be similarly derived, yielding the expression

δs(f) = f ⊕ s = T[U[f] ⊕ U[s]] = ⋁_{u∈D(s∗x)} {f(u) + s∗x(u)}
      = ⋀_{y∈R(f)} {−s∗x(u) + y ≥ f(u) ∀u ∈ D(f, s∗x)} ,    (3.24)

while the opening and closing are, analogously to the binary case, given by

γs(f) = f ◦ s = (f ⊖ s) ⊕ s,    (3.25)

ϕs(f) = f • s = (f ⊕ s) ⊖ s.    (3.26)

A one-dimensional example of gray-scale morphology is presented in Fig. 3.5. Duality is in the gray-scale case defined by [6]

Ψ∗(f) = −Ψ(−f)    (3.27)


Figure 3.6: Morphological operations operating on 2D image data f using a flat disk structuring element s: the original image (a), the structuring element (b), and the resulting erosion (c), dilation (d), opening (e), and closing (f).

and as in the binary case, the gray-scale morphological operations above are pair-wise dual with respect to the reflected structuring element. For erosion and dilation, we have

δs∗(f) = ⋁_{u∈D(sx)} {f(u) + sx(u)} = −⋀_{u∈D(sx)} {−f(u) − sx(u)} = ε∗s(f).    (3.28)

Of particular interest are so-called flat structuring elements, which are commonly used in various applications. These structuring elements are closely related to the binary case, and are defined as

s(u) = 0 for u ∈ B, and s(u) = −∞ for u ∉ B,    (3.29)

where B is a subset of the domain of s. These structuring elements are usually depicted as a binary image, and their reflected structuring element can be defined by Eq. (3.7). In the case of flat structuring elements, Eqs. (3.23)–(3.24) translate into

εs(f) = f ⊖ s = ⋀ {f(u) : u ∈ Bx} ,    (3.30)

δs(f) = f ⊕ s = ⋁ {f(u) : u ∈ B∗x} ,    (3.31)

where Bx once again denotes the translation of B to the point x (Eq. (3.5)). Fig. 3.6 demonstrates the effects of gray-scale morphological operations using a flat disk.
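For flat structuring elements, Eqs. (3.30)–(3.31) reduce to moving minimum and maximum filters, which is also how they are typically computed in practice. A sketch using SciPy's gray-scale morphology (toy data; not the thesis implementation):

```python
import numpy as np
from scipy import ndimage

f = np.zeros((9, 9))
f[4, 4] = 1.0                                    # a single bright point

B = np.ones((3, 3), dtype=bool)                  # flat 3x3 structuring element

eroded  = ndimage.grey_erosion(f, footprint=B)   # Eq. (3.30): local minimum
dilated = ndimage.grey_dilation(f, footprint=B)  # Eq. (3.31): local maximum
opened  = ndimage.grey_dilation(eroded, footprint=B)   # Eq. (3.25)
closed  = ndimage.grey_erosion(dilated, footprint=B)   # Eq. (3.26)
```

The isolated bright point is removed by the opening, while the closing leaves this image unchanged; the ordering eroded ≤ f ≤ dilated holds since B contains its origin.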

3.2.3 Adaptive morphology

The structuring elements considered in the two previous sections have been rigid. This is often not ideal, however. In particular, the resulting morphological operations risk stretching over edges, destroying vital edge information in the process. Therefore, consider instead a structuring element which is allowed to vary for each point in the image. This is simply a generalization of the translated rigid structuring elements given by Eq. (3.19) [22, 24], and we can once again denote the structuring element by sx(u), u ∈ D(sx). This time, however, the structuring element is not just translated to the point x ∈ D(f), but is also allowed to vary in shape from point to point. Hence, in the adaptive case we have [22]

sx(u) = s[x](u − x),    (3.32)

where s[x] denotes that the function s itself (rather than just its values) depends on the point x. The corresponding reflected structuring element is defined by

s∗x(u) = su(x) = s[u](x − u) ∀x, u ∈ D(f).    (3.33)

If an adaptive structuring element sx is assigned to each point x in the image, the full set is known as a Structuring Element Map (SEM). Note that, as for rigid structuring elements, Eq. (3.29) can be used to obtain the flat case, in which case Eqs. (3.7), (3.30), and (3.31) still hold. Figure 3.7 shows an example of two points and their adaptive structuring elements. The EASE method is completely based on flat adaptive structuring elements, but for completeness we summarize


Figure 3.7: Two points x and u and their assigned adaptive flat structuring elements Bx and Bu. Since x ∈ Bu, u ∈ B∗x (Eq. (3.7)).

the general case as well, recognizing that the flat case is just a special case of the more general gray-scale theory. Following Bouaynaya and Schonfeld [22, 24], we start by going back to the binary case, i.e. where f, s ∈ N0[0, 1]: let the image f and structuring element sx be represented by the sets A and Bx, respectively (analogously to the non-adaptive binary case, see Sect. 3.2.1). Equation (3.33) then translates into

B∗x = {u : x ∈ Bu} .    (3.34)

The adaptive versions of the four basic morphological operations (erosion, dilation, open- ing, and closing) are then defined for the binary case by Eqs. (3.10)–(3.15), i.e.

εs(f) = f ⊖ s = A ⊖ B = {x : Bx ⊆ A} ,

δs(f) = f ⊕ s = A ⊕ B = (Ac ⊖ B∗)c = {x : B∗x ∩ A ≠ ∅} ,

γs(f) = f ◦ s = A ◦ B = (A ⊖ B) ⊕ B = ⋃_{x∈D(f)} {Bx : Bx ⊆ A} ,

ϕs(f) = f • s = A • B = (A ⊕ B) ⊖ B.

Once the adaptive morphological operators have been defined for binary data, they can be extended to gray-scale images by generalizing the Umbra approach to allow for structuring elements which vary for each point x. The Umbra structuring element is, as before, set by Eqs. (3.5)–(3.6), but in the adaptive case the actual shape of the function sx depends on x by Eq. (3.32). This generalization of the structuring function has a major impact on the Umbra structuring element U[s](x,y): it is now invariant with respect to y only, while x sets the actual shape of the structuring element, as desired [22]. Using Eqs. (3.21)–(3.22),

U[s](x,y) = U[sx + y] = U[sx] + y = {(u, v): v ≤ sx(u) + y} ,

U[s]∗(x,y) = {(u, v) : (x, y) ∈ U[s](u,v)} = {(u, v) : v ≥ y − su(x)} ,

where sx is defined by Eq. (3.32), the morphological erosion and dilation can be generalized for the adaptive case using Eqs. (3.23) and (3.24) [22]:

εs(f) = f ⊖ s = T[U[f] ⊖ U[s]] = ⋀_{u∈D(sx)} {f(u) − sx(u)}
      = ⋁_{y∈R(f)} {sx(u) + y ≤ f(u) ∀u ∈ D(f, sx)} ,

δs(f) = f ⊕ s = T[U[f] ⊕ U[s]] = ⋁_{u∈D(s∗x)} {f(u) + s∗x(u)}
      = ⋀_{y∈R(f)} {−s∗x(u) + y ≥ f(u) ∀u ∈ D(f, s∗x)} .

Regarding duality, Equation 3.28 still holds, and we have

δs∗(f) = ⋁_{u∈D(sx)} {f(u) + sx(u)} = −⋀_{u∈D(sx)} {−f(u) − sx(u)} = ε∗s(f).

Moreover, adjunction of the adaptive erosion and dilation can be proved as follows:

δs(f)(x) ≤ h(x) ∀x ∈ D(f, h) ⇐⇒ ⋁_{u∈D(s∗x)} {f(u) + s∗x(u)} ≤ h(x) ∀x ∈ D(f, h)
⇐⇒ f(u) + s∗x(u) ≤ h(x) ∀u ∈ D(s∗x), ∀x ∈ D(f, h)
⇐⇒ f(u) ≤ h(x) − s∗x(u) ∀u ∈ D(s∗x), ∀x ∈ D(f, h)
⇐⇒ f(u) ≤ h(x) − su(x) ∀x ∈ D(su), ∀u ∈ D(f, h)
⇐⇒ f(u) ≤ ⋀_{x∈D(su)} {h(x) − su(x)} ∀u ∈ D(f, h)
⇐⇒ f(u) ≤ εs(h)(u) ∀u ∈ D(f, h).    (3.35)

The defined erosion and dilation are thus adjunct, given that the same structuring elements sx are used for both operations, and we can use Eqs. (3.25)–(3.26) for the adaptive case as well:

γs(f) = f ◦ s = (f ⊖ s) ⊕ s,

ϕs(f) = f • s = (f ⊕ s) ⊖ s.

As pointed out by Roerdink [25], the condition that the same structuring elements (i.e. the same SEM) must be used for both erosion and dilation has often been overlooked in the literature. Note that the proof above holds for the non-adaptive case as well. For a more thorough theoretical analysis of the properties of the adaptive morphological operations, the reader is referred to Bouaynaya and Schonfeld [22]. A note can here be made regarding the reflected structuring element: as previously mentioned, the entity s∗x is known under various different names, such as the reflected, transposed, or reciprocal structuring element, and there is currently no consensus regarding proper terminology. While these names have all been used for both rigid and adaptive morphology, they risk leading to ambiguities in a general terminology, since there is reasonable room for misinterpretation in the adaptive case. For instance, s∗x can no longer be described simply as the reflection of sx through the origin. In Paper F, the following two observations were made for the general case:

1. Given the set of structuring elements {sx : x ∈ D}, there is a one-to-one relation between sx and s∗x for each point x.

2. We can alternate between sx and s∗x in a cyclic manner using the (·)∗ notation based on Eq. (3.33), i.e.

s∗∗x = (s∗x)∗ = sx.    (3.36)

These properties form a duality with respect to the set of structuring elements {sx : x ∈ D}, i.e. the whole set is needed to retrieve s∗x for a given point x. Based on this reasoning, the term dual structuring elements could be motivated. Note that this duality is for functions or sets, and is not equivalent to the duality of operators defined by Eqs. (3.9) and (3.28). This thesis, however, uses the currently accepted term reflected to avoid confusion. Finally, it should be noted that other definitions of adaptive morphology have also been suggested. More details can be found in Paper F.
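The practical difference between erosion and dilation in the flat adaptive case is that the erosion scans each pixel's own neighborhood Bx, while the dilation must use the reflected neighborhoods via Eq. (3.7): u contributes to x whenever x ∈ Bu. A naive sketch with a made-up one-dimensional SEM (not the thesis implementation):

```python
import numpy as np

def adaptive_erosion(f, sem):
    """Flat adaptive erosion: minimum over each pixel's own neighborhood B_x."""
    out = np.empty_like(f)
    for x, Bx in sem.items():
        out[x] = min(f[u] for u in Bx)
    return out

def adaptive_dilation(f, sem):
    """Flat adaptive dilation: u contributes to every x with x in B_u (Eq. (3.7))."""
    out = np.full(f.shape, -np.inf)
    for u, Bu in sem.items():
        for x in Bu:
            out[x] = max(out[x], f[u])
    return out

# Made-up SEM on a 1x5 image: each pixel plus its right neighbor (asymmetric).
f = np.array([[0.1, 0.9, 0.2, 0.8, 0.4]])
sem = {(0, j): [(0, j)] + ([(0, j + 1)] if j + 1 < 5 else [])
       for j in range(5)}

ero = adaptive_erosion(f, sem)
dil = adaptive_dilation(f, sem)
```

Both operations use the same SEM, so the pair is adjunct in the sense of Eq. (3.13); using each pixel's own neighborhood for the dilation as well would break this property.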

3.3 The Local Structure Tensor

The use of tensors for representing local structure in images was originally introduced by Knutsson [84], using quadrature filters to estimate the local orientation within the data. In this work, however, we will use another method for estimating the local structure tensor, based on the spatial image gradient [74]: for binary or gray-scale (zeroth order tensor) images, second order tensors can be used to represent local structure by associating to each point a 2 × 2 matrix constructed from the image gradient. More specifically, the Local Structure Tensor (LST) T(x) is given by

T(x) = [ Tx1x1  Tx1x2 ; Tx1x2  Tx2x2 ](x) = Gσ ∗ (∇f(x) ∇ᵀf(x)),    (3.37)

where f(x) denotes the image value for each point x, ∇ = [∂/∂x1, ∂/∂x2]ᵀ, and Gσ is a Gaussian kernel with standard deviation σ which regularizes the matrix. The LST T(x) holds information about local structure orientation around the point x (see Table 3.1), and this knowledge can be extracted by considering its eigenvalues and eigenvectors: the eigenvectors provide the local data curvature, while the relation between the eigenvalues indicates its dominance (or lack thereof).
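Eq. (3.37) amounts to smoothing the outer product of the gradient with itself. A sketch using SciPy (Sobel filters stand in for the gradient estimate here; Chapter 4 uses Scharr filters, and the test image is made up):

```python
import numpy as np
from scipy import ndimage

def local_structure_tensor(f, sigma=2.0):
    """LST per Eq. (3.37): Gaussian-regularized outer product of the gradient."""
    fx = ndimage.sobel(f, axis=1)    # derivative along x1 (columns)
    fy = ndimage.sobel(f, axis=0)    # derivative along x2 (rows)
    T11 = ndimage.gaussian_filter(fx * fx, sigma)
    T12 = ndimage.gaussian_filter(fx * fy, sigma)
    T22 = ndimage.gaussian_filter(fy * fy, sigma)
    return T11, T12, T22

# A vertical step edge: all variation is along x1, so T11 should dominate.
f = np.zeros((32, 32))
f[:, 16:] = 1.0
T11, T12, T22 = local_structure_tensor(f)
```

Near the edge, one eigenvalue of [[T11, T12], [T12, T22]] is large and the other is (close to) zero, matching the "strong dominant direction" row of Table 3.1.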

3.4 Normalized Convolution

In many imaging applications, confidence in point values may vary. This is particularly true for range data captured by laser triangulation, where occluded data is common (see Chapter 2). Normalized convolution, introduced by Knutsson and Westin [85], provides a method for dealing with tensor-valued image data where some point values should be considered more reliable than others, and where values may even be completely missing.

Table 3.1: Interpretation of the eigenvalues λ1 and λ2 of the LST T

  λ1 ≈ λ2 ≫ 0    No dominant direction (edge crossing or point)
  λ1 ≫ λ2 ≈ 0    Strong dominant direction (edge)
  λ1 ≈ λ2 ≈ 0    No dominant direction (no edge)

The basic idea of the method is to separate point values from point confidence, allowing convolution operations to weight point values differently depending on their reliability.

3.4.1 Definition

Consider a tensor-valued image function T(x), defined on points x ∈ D(T), with corresponding certainty weights wc(x) representing the confidence in the point value. A generalized form of convolution, operating on points u within the neighborhood Nx around the point x, is then given by

C(x) = {waB ⊛ wcT}(x) = Σ_{u∈Nx} wa(u − x)B(u − x) ◦ wc(u)T(u),    (3.38)

where B denotes the operator filter basis and wa(x) represents weighting coefficients for an applicability filter, e.g. a Gaussian kernel. The "◦" symbol represents a multilinear operation (e.g. scalar multiplication in the case of standard convolution), and the "⊛" symbol marks the corresponding convolution operation. Note that u denotes global point coordinates, while the translation (u − x) produces local point coordinates with respect to the processed point x. From Eq. (3.38),

C(x) = {waB ⊛ wcT}(x).    (3.39)

Introducing another tensor

N(x) = {waB ◦ B∗ ⊛ wc}(x),    (3.40)

where (·)∗ denotes the complex conjugate, normalized convolution is then defined as

CN(x) = {waB ⊛ wcT}N(x) = N⁻¹(x)C(x).    (3.41)

Note that the tensor N(x) contains certainty information associated with the new basis functions.
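In the simplest setting of a single constant basis (B = 1) with scalar multiplication, Eqs. (3.38)–(3.41) reduce to "normalized averaging": smooth both the certainty-weighted data and the certainty itself, then divide. A sketch (made-up toy data with a block of missing samples):

```python
import numpy as np
from scipy import ndimage

def normalized_average(f, wc, sigma=1.5):
    """Normalized convolution with a constant basis:
    C = wa * (wc f), N = wa * wc, CN = N^{-1} C  (Eqs. (3.38)-(3.41))."""
    C = ndimage.gaussian_filter(wc * f, sigma)   # certainty-weighted data
    N = ndimage.gaussian_filter(wc, sigma)       # available certainty
    return np.where(N > 1e-12, C / np.maximum(N, 1e-12), 0.0)

# A constant image with a block of missing (zero-certainty) samples.
f = np.full((16, 16), 0.5)
wc = np.ones_like(f)
wc[6:10, 6:10] = 0.0          # missing block
f = f * wc                    # unknown values carry no information

restored = normalized_average(f, wc)
```

The hole is filled with the surrounding value, whereas plain Gaussian smoothing of f would be biased towards zero there; this is precisely the bias that the normalization by N(x) removes.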

3.4.2 Basis coefficient estimation

Consider a gray-scale image (i.e. containing tensors of order zero) f(x) ∈ R and a set of (real-valued) basis functions {b1(u − x), b2(u − x), b3(u − x), . . . , bm(u − x)} defined

in local coordinates for points u within an n-point neighborhood Nx around a point x ∈ D(f), i.e. u ∈ {u1, u2, u3,..., un} ⊆ D(f). Then let the vector

f(x) = [f(u1), f(u2), f(u3), . . . , f(un)]ᵀ ∈ Rⁿ    (3.42)

denote the n corresponding point scalar values³, i.e. tensors of order zero, and define the basis vectors

bk = [bk(u1), bk(u2), bk(u3), . . . , bk(un)]ᵀ, k = 1, 2, 3, . . . , m.    (3.43)

The problem of describing the neighborhood f(x) by a linear combination of the bases, i.e. finding a set of coefficients {β1(x), β2(x), . . . , βm(x)} which approximate the neighborhood f(x), can then be expressed as

fB(x) = Σ_{k=1}^{m} βk(x)bk = Bβ(x),    (3.44)

where B = [b1, b2, . . . , bm] and β(x) = [β1(x), β2(x), . . . , βm(x)]. The least-squares solution to this problem, which minimizes the mean square error ‖f(x) − fB(x)‖, is given by

β(x) = (BᵀB)⁻¹ Bᵀf(x).    (3.45)

If we then introduce a diagonal weighting matrix W(x), Eq. (3.44) yields

W(x)fB(x) = W(x)Bβ(x), (3.46) which, by Eq. (3.45), has the least-squares solution

β(x) = ((W(x)B)ᵀ(W(x)B))⁻¹ (W(x)B)ᵀ W(x)f(x)
     = (BᵀWᵀ(x)W(x)B)⁻¹ BᵀWᵀ(x)W(x)f(x)
     = (BᵀW²(x)B)⁻¹ BᵀW²(x)f(x).    (3.47)

This weighted least-squares problem can be solved by normalized convolution. Using tensor index notation, define the diagonal matrix W²(x) by

W²i,i(x) = wa(ui − x)wc(ui), i = 1, 2, 3, . . . , n,    (3.48)

i.e. the combination of the applicability and certainty weights corresponding to the i:th point in the neighborhood vector f(x). Note that the applicability, which is related to the local position of each point within the neighborhood, depends on the local point coordinate (ui − x), while the certainty depends on the global coordinate ui. From the definition of the basis matrix B, we also have

Bi,j = Bᵀj,i = bj(ui − x), i = 1, 2, 3, . . . , n, j = 1, 2, 3, . . . , m.    (3.49)

³ Note that the actual neighborhood within the image is usually (but not necessarily) square, but it can still always be denoted in vector format, i.e. n = p² for a neighborhood of size p × p.

Then consider the quantities BᵀW²(x)B and BᵀW²(x)f(x) in Eq. (3.47). These can now be expressed element-wise as

(BᵀW²(x)B)i,j = Σ_{k=1}^{n} bi(uk − x)wa(uk − x)wc(uk)bj(uk − x),    (3.50)

(BᵀW²(x)f(x))i = Σ_{k=1}^{n} bi(uk − x)wa(uk − x)wc(uk)f(uk),    (3.51)

i.e. standard convolutions where k or, equivalently, uk is the summation variable. More precisely, from Eqs. (3.40)–(3.41) we have

BᵀW²(x)B = {waB ◦ B ⊛ wc}(x) = N(x),    (3.52)

BᵀW²(x)f(x) = {waB ⊛ wcf}(x) = C(x).    (3.53)

Hence, by Eq. (3.41), the solution to Eq. (3.47) can be retrieved by normalized convolution, i.e.

β(x) = CN(x) = {waB ⊛ wcf}N(x) = N⁻¹(x)C(x).    (3.54)

This strategy is used iteratively in Paper D for the reconstruction of missing 3D data.
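Eq. (3.47) can be solved directly for a single neighborhood; the sketch below fits the basis {1, ξ1, ξ2} to exactly planar made-up data with one missing sample, recovering the plane coefficients:

```python
import numpy as np

def basis_coefficients(f_vec, B, wa, wc):
    """Weighted least squares of Eq. (3.47):
    beta = (B^T W^2 B)^{-1} B^T W^2 f, with W^2 as in Eq. (3.48)."""
    W2 = wa * wc                   # diagonal of W^2 (applicability * certainty)
    BtW2 = B.T * W2                # B^T W^2 (column-wise scaling by the weights)
    return np.linalg.solve(BtW2 @ B, BtW2 @ f_vec)

# 5x5 neighborhood in local coordinates xi1, xi2 in [-2, 2].
xi = np.arange(-2, 3)
X1, X2 = np.meshgrid(xi, xi)                    # xi1 along columns, xi2 along rows
B = np.stack([np.ones(25), X1.ravel(), X2.ravel()], axis=1)   # bases {1, xi1, xi2}

f_vec = (0.3 + 0.7 * X1 + 0.2 * X2).ravel()     # a plane; true beta = (0.3, 0.7, 0.2)
wa = np.exp(-(X1**2 + X2**2).ravel() / 4.0)     # Gaussian applicability
wc = np.ones(25)
wc[7] = 0.0                                     # one sample has zero certainty

beta = basis_coefficients(f_vec, B, wa, wc)
```

Since the data is exactly planar, the weighted fit recovers the true coefficients despite the missing sample; per pixel, the same quantities are obtained by the convolutions of Eqs. (3.50)–(3.51).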

3.5 Normalized Differential Convolution

Knutsson and Westin [85] also introduced the concept of normalized differential convolution, which yields a result equivalent to that of normalized convolution except for the constant basis, which can be considered ignored (or cancelled) in the former [86].

3.5.1 Definition

To define normalized differential convolution, we first need the differential convolution

C∆(x) = {waB ⊛ wcT}∆(x)
      = ({wa ⊛ wc}{waB ⊛ wcT} − {waB ⊛ wc} ◦ {wa ⊛ wcT})(x),    (3.55)

based on the same notation as in Sect. 3.4. The first term in this expression contains a weighting factor combining applicability and certainty, acting on a generalized convolution (Eq. (3.38)). The subtracted second term can then be seen as a certainty-affected average operator acting on the certainty-affected data. In the words of the original authors, the resulting operation can be considered a "standard convolution weighted with the local energy minus the 'mean' operator acting on the 'mean' data". It should be noted that Equation (3.55) is not normalized, but will be heavily affected by the amount of available certainty. To see this, simply consider a point with zero certainty. Such a point would not contribute to the convolved sum, which would give the convolution result a bias towards zero. This is handled by the concept of normalized differential convolution which, analogously to the definition of normalized convolution, is defined by

CN∆(x) = N∆⁻¹(x)D∆(x),    (3.56)

where

D∆(x) = ({wa ⊛ wc}{waB ⊛ wcT} − {waB ⊛ wc} ◦ {wa ⊛ wcT})(x),    (3.57)

N∆(x) = ({wa ⊛ wc}{waB ◦ B∗ ⊛ wc} − {waB ⊛ wc} ◦ {waB ⊛ wc})(x).    (3.58)

3.5.2 Gradient estimation

By using polynomial basis functions for approximating a signal f within a certain neighborhood (including the constant basis b0 = 1), i.e. by calculating the coefficients βk, k = 0, 1, 2, for the three polynomial basis functions {bk} = {1, ξ1, ξ2} in

f(x + ξ) ≈ Σ_{k=0}^{2} βkbk(ξ) = β0 · 1 + β1 · ξ1 + β2 · ξ2,    (3.59)

we can estimate the gradient ∇f(x) as the coefficients of the two bases b1 = ξ1 and b2 = ξ2, given by the local pixel coordinate ξ = [ξ1 ξ2]ᵀ. To see this, simply consider the first-order Taylor expansion around a pixel x given by

f(x + ξ) ≈ f(x) · 1 + (∂f(x)/∂x1) · ξ1 + (∂f(x)/∂x2) · ξ2.    (3.60)

From (3.59)–(3.60), the estimated coefficients β1 and β2 are identified as the approximated components of the function gradient. In practice, as we are not interested in the constant basis, this operation can be performed by normalized differential convolution. More specifically,

∇f(x) ≈ CN∆(x | wa, [ξ1, ξ2]ᵀ, wc, f) = N∆⁻¹(x)D∆(x),    (3.61)

where wc denotes the certainty values for f and wa the chosen applicability function. No constant basis is thereby needed (as would be the case if ordinary normalized convolution were used), which reduces the dimensionality of the involved matrices and decreases the required computational time. This approach is used in Paper E for estimating the gradient in incomplete (partly missing) data.

Chapter 4

Elliptical Adaptive Structuring Elements

“Education is an admirable thing, but it is well to remember from time to time that nothing that is worth knowing can be taught.” – Oscar Wilde

4.1 The EASE concept

We are now ready to define the main contribution of this thesis: Elliptical Adaptive Structuring Elements (EASE) combine ideas from typical linear image processing (i.e. the local structure tensor) with the concept of adaptive mathematical morphology, yielding a powerful morphological tool which can be used to enhance directional features in the data without the smoothing typically induced by linear filters. The aim of this chapter is to provide an overview of the method. For details, the reader is referred to Part II of the thesis: in particular Papers B and E.

4.1.1 Obtaining structural information

Let x = (x1 x2)ᵀ denote point coordinates and f(x) the corresponding gray-scale value for every x ∈ D(f). We are interested in structures in the data and estimate these using the Local Structure Tensor (LST) T(x) (see Chapter 3), given for each point x by the 2 × 2 matrix defined by Eq. (3.37):

T(x) = [ Tx1x1  Tx1x2 ; Tx1x2  Tx2x2 ](x) = Gσ ∗ (∇f(x) ∇ᵀf(x))

(here stated once again for simplicity). The image gradient can be estimated by applying standard gradient filters, e.g. 3×3 pixel Scharr filters, operating on a slightly smoothed version of the input image, produced by applying a small Gaussian filter with low standard deviation σ0 to the input image f. The smoothing in Eq. (3.37) which results from the convolution by Gσ regularizes the matrix, but also sets a scale for which the resulting LST can be considered representative. This scale is set by relating σ to a user-defined filter bandwidth radius rw through the expression

σ = rw / √(2 ln 2),    (4.1)

i.e. Gσ decreases to half of its maximum value at distance rw from its center. For each point x, the eigenvalues λ1(x) and λ2(x) (λ1(x) ≥ λ2(x)) and the corresponding eigenvectors e1(x) and e2(x) of the symmetric LST T(x) can then be interpreted into information about structures (edges) present in the data, based on Table 3.1 in Chapter 3 (here summarized again for simplicity):

λ1 ≈ λ2 ≫ 0: No dominant direction (edge crossing or point),
λ1 ≫ λ2 ≈ 0: Strong dominant direction (edge),
λ1 ≈ λ2 ≈ 0: No dominant direction (no edge).

The eigenvector e1(x) represents the local dominant direction of variation in the data. Hence e2(x), being orthogonal to e1(x), represents the direction of the smallest variation in the region [74].

4.1.2 Structuring elements definition

EASE are defined by solid ellipses E(a, b, φ), where a is the semi-major axis, b the semi-minor axis, and φ the orientation (see Fig. 4.1). Each point x in the image f is then assigned such a structuring element. Hence, we have a = a(x), b = b(x), and φ = φ(x), which together define a Structuring Element Map for the image. Letting an x subscript denote that the elliptical structuring element has been translated to the point x, we now use the notation

Ex = Ex(a(x), b(x), φ(x)) (4.2)

Figure 4.1: Ellipse parameters a, b, and φ, and their relation to the eigenvectors e1 and e2 of the LST.

for the flat elliptical structuring element used at the point x. The axes a(x) and b(x) are set from the eigenvalues of T(x) by the expressions

a(x) = λ1(x) / (λ1(x) + λ2(x)) · M,    (4.3)

b(x) = λ2(x) / (λ1(x) + λ2(x)) · M,    (4.4)

where M denotes the maximum allowed semi-major axis. Divisions by zero and perfectly smooth regions are handled by adding a small positive number to the eigenvalues (small enough to be neglected for any other purpose, i.e. machine epsilon). These definitions yield

0 ≤ b(x) ≤ a(x) ≤ M,    (4.5)

a(x) + b(x) = M    (4.6)

for all values of λ1(x) and λ2(x). The resulting structuring elements will range dynamically from lines of length M where λ1(x) ≫ λ2(x) ≈ 0, i.e. near strong dominant edges in the data, to disks with radius M/2 where λ1(x) ≈ λ2(x), i.e. where no single direction represents the local image structure. The orientation φ(x) is retrieved from the corresponding eigenvectors by

φ(x) = arctan(e2,x2(x) / e2,x1(x)),   e2,x1(x) ≠ 0,
φ(x) = π/2,                           e2,x1(x) = 0,    (4.7)

where e2,x1(x) and e2,x2(x) denote the components of the eigenvector e2(x). Given the image f and the parameters M and rw, the values of a(x), b(x), and φ(x) for the point x are thereby uniquely defined.
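Given the eigen-decomposition of a single LST, Eqs. (4.3), (4.4), and (4.7) map directly to code (a sketch; `M` and the test tensor are made-up values, and eps guards the division as described above):

```python
import numpy as np

def ease_parameters(T, M=8.0, eps=np.finfo(float).eps):
    """Ellipse axes and orientation from a 2x2 LST (Eqs. (4.3)-(4.7))."""
    lam, vecs = np.linalg.eigh(T)            # ascending: lam[0] <= lam[1]
    l1, l2 = lam[1] + eps, lam[0] + eps      # lambda_1 >= lambda_2, guarded
    a = l1 / (l1 + l2) * M                   # Eq. (4.3)
    b = l2 / (l1 + l2) * M                   # Eq. (4.4)
    e2 = vecs[:, 0]                          # eigenvector of the smaller eigenvalue
    phi = np.pi / 2 if e2[0] == 0 else np.arctan(e2[1] / e2[0])   # Eq. (4.7)
    return a, b, phi

# Strong horizontal edge: all variation along x2, none along x1.
T_edge = np.array([[0.0, 0.0],
                   [0.0, 4.0]])
a, b, phi = ease_parameters(T_edge)
```

Here λ1 ≫ λ2 ≈ 0, so the ellipse degenerates towards a line (a ≈ M, b ≈ 0) oriented along the edge (φ = 0), while an isotropic tensor gives a disk with a = b = M/2; in both cases a + b = M, as required by Eq. (4.6).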

4.1.3 Morphological operations

The morphological operations erosion (εE) and dilation (δE) operating on the image f are then defined by

εE(f) = ⋀_{y : y∈Ex} f(y) ∀x ∈ D(f),    (4.8)

δE(f) = ⋁_{y : x∈Ey} f(y) ∀x ∈ D(f).    (4.9)

As the Structuring Element Map is defined once from the same input (pilot) image, the operations εE and δE are adjunct as long as the SEM remains unchanged (see Sect. 3.2.3). That is,

δ(f) ≤ g ⇐⇒ f ≤ ε(g)    (4.10)


Figure 4.2: An input image with clear structures, displayed in jet colormap to enhance contrast differences (a), and its closing using EASE (b). A subset of the structuring elements used is shown in white.

for any functions f and g, and we can thereby define the morphological opening and closing, respectively, of the gray-scale image f as

γE(f) = δE(εE(f)), (4.11)

ϕE(f) = εE(δE(f)).    (4.12)

An example of the obtained elliptical structuring elements and the resulting closing is shown in Fig. 4.2.
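The two operations can be sketched directly from Eqs. (4.8)–(4.9), with the dilation written as an update loop (the strategy of Eq. (4.13) in Section 4.3). Ellipse membership is tested analytically here on made-up toy data; the thesis implementation instead uses pre-computed index lists, and a degenerate b = 0 would need special handling:

```python
import numpy as np

def ellipse_mask(shape, center, a, b, phi):
    """Boolean mask of the solid ellipse E(a, b, phi) translated to `center`."""
    yy, xx = np.indices(shape)
    dx, dy = xx - center[1], yy - center[0]
    u = dx * np.cos(phi) + dy * np.sin(phi)    # coordinate along the major axis
    v = -dx * np.sin(phi) + dy * np.cos(phi)   # coordinate along the minor axis
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

def ease_erosion_dilation(f, sem):
    """Adaptive erosion (Eq. (4.8)) and dilation (Eq. (4.9), update form)."""
    ero = np.empty_like(f)
    dil = np.full(f.shape, -np.inf)
    for (i, j), (a, b, phi) in sem.items():
        E = ellipse_mask(f.shape, (i, j), a, b, phi)
        ero[i, j] = f[E].min()                 # minimum over E_x
        dil[E] = np.maximum(dil[E], f[i, j])   # x in E_y: update all covered pixels
    return ero, dil

# Toy SEM: identical circular SEs (a = b = 1.2) covering each pixel's 4-neighborhood.
f = np.zeros((5, 5))
f[2, 2] = 1.0
sem = {(i, j): (1.2, 1.2, 0.0) for i in range(5) for j in range(5)}
ero, dil = ease_erosion_dilation(f, sem)
```

The isolated spike is removed by the erosion and spread over its neighborhood by the dilation; with a genuinely varying SEM, the two loops still form an adjunct pair because the dilation scans x ∈ Ey rather than y ∈ Ex.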

4.2 Required parameters

Only a few parameters are required to use EASE. The underlying philosophy is that usage should be as easy as possible. In fact, empirical studies suggest that one single user-supplied parameter is sufficient for many cases (setting rw = M), but there is also room for further refinement when needed. The parameter rw or, equivalently, σ can be set to match the scale for which the LST should be representative, while σ0 (or the corresponding radial bandwidth) can be set to define the smoothing (and thereby the sensitivity) of the gradient estimation. These parameters should of course be set based on the size and nature of the structures of interest. Figure 4.3 shows how different choices of M and rw affect the morphological erosion.


Figure 4.3: An input image (a) and its erosions (b–j) using M = 4, 8, and 16 (top to bottom) and rw = 4, 8, and 16 (left to right).

4.3 Implementation and computational issues

The presented algorithm can be summarized as follows:

1. Pre-calculate relative pixel index lists for the required ellipses, i.e. the index shifts for the leftmost and rightmost border pixels (relative to the center pixel) for each row of the discretized structuring element. All needed structuring elements are then stored in a corresponding 3D Look-Up Table LUTE(a, b, φ) based on semi-major axes a ∈ [0,M], semi-minor axes b ∈ [0, a], and orientations φ ∈ [0, π). In practice, due to the symmetry of the ellipses, only half of the relative indices need to be stored. The other half can then be easily obtained by inverting the coordinates.

2. Estimate the image gradient ∇f(x) using 3×3 pixel Scharr filters on a pre-smoothed version of the input image, obtained using a small Gaussian filter with standard deviation σ0. Then calculate the LST for each pixel according to Eq. (3.37). The standard deviations σ and σ0 are set by defining bandwidth radii for the filters.

3. Obtain axis lengths and orientation for each pixel from the eigenvalues and eigen- vectors of the corresponding LST, using Eqs. (4.3), (4.4), and (4.7).

4. Perform an erosion or a dilation, or a combination of the two (an opening or a closing), using the elliptical structuring elements defined for each pixel x. More specifically, the elliptical neighborhoods provided by LUTE(a(x), b(x), φ(x)) are used to perform the operations given by Eqs. (4.8)–(4.9). In practice, the dilation is implemented as suggested by Lerallut [29], i.e. an updating procedure

(δE(f))(y) ←− ∨{(δE(f))(y), f(x)},   ∀y ∈ Bx, ∀x ∈ D(f).      (4.13)

Given LUTE(a, b, φ) (which can be pre-computed and loaded by the program at execution time) the actual filtering is performed by convolutions and min/max operations within the defined neighborhoods. Denoting the number of pixels in the image by N and the maximum number of neighborhood pixels used in the operations, resulting from the choice of M and rw, by R, this yields a computational cost for the method of O(NR) for the standard case (with no missing values, i.e. complete data).
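Steps 1 and 4 above can be sketched as follows (illustrative Python/NumPy, not the thesis implementation; the function names are hypothetical, full offset lists are stored per ellipse rather than only the leftmost and rightmost border pixel per row, and the Lerallut-style update of Eq. (4.13) is written out directly):

```python
import numpy as np

def ellipse_offsets(a, b, phi):
    """Relative pixel offsets of a discretized ellipse with semi-axes
    (a, b) and orientation phi. The thesis stores only the leftmost and
    rightmost offset per row; full lists are kept here for brevity."""
    r = int(np.ceil(max(a, b)))
    ys, xs = np.mgrid[-r:r + 1, -r:r + 1]
    u = np.cos(phi) * xs + np.sin(phi) * ys     # rotate into the ellipse frame
    v = -np.sin(phi) * xs + np.cos(phi) * ys
    inside = (u / max(a, 1e-9)) ** 2 + (v / max(b, 1e-9)) ** 2 <= 1.0
    return list(zip(ys[inside].tolist(), xs[inside].tolist()))

def adaptive_dilate(f, ses):
    """Lerallut-style update (Eq. 4.13): every pixel x pushes its value
    f(x) to all pixels y covered by its own structuring element B_x."""
    out = np.full(f.shape, -np.inf)
    H, W = f.shape
    for x0 in range(H):
        for x1 in range(W):
            for dy, dx in ses[x0][x1]:
                y0, y1 = x0 + dy, x1 + dx
                if 0 <= y0 < H and 0 <= y1 < W:
                    out[y0, y1] = max(out[y0, y1], f[x0, x1])
    return out
```

In EASE the triple (a(x), b(x), φ(x)) varies per pixel and indexes the precomputed LUT; in a quick test one can let every pixel reuse the same unit disk, in which case the result coincides with an ordinary non-adaptive dilation.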

4.4 EASE for incomplete data

EASE can also be adapted for use on incomplete data, using the concept of normalized convolution. We now assume that each point is assigned both a function value and a measure of certainty. Hence the value of each point, denoted either f(x) or T(x) depending on whether we are referring to the gray level f or the tensor T, is associated with a corresponding certainty (or confidence) weight c(x) ∈ [0, 1], representing our trust in the value at pixel x. In the case of incomplete data, we have for the original input data

cf(x) = { 0 if the value of f(x) is unknown,
          1 otherwise.                          (4.14)

This case cannot be directly handled by the standard approach resulting from (3.37), (4.3), and (4.4), but the tensor formulation enables a straightforward adaptation through the concept of normalized convolution.
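The core idea of normalized convolution with a constant basis can be sketched as follows (illustrative Python/NumPy, not code from the thesis; `convolve2d` is a deliberately naive zero-padded helper, not a library routine):

```python
import numpy as np

def convolve2d(f, k):
    """Naive zero-padded 'same'-size convolution (kernel assumed symmetric,
    so convolution and correlation coincide)."""
    kh, kw = k.shape
    ph, pw = kh // 2, kw // 2
    fp = np.pad(f, ((ph, ph), (pw, pw)))
    out = np.zeros(f.shape)
    for i in range(kh):
        for j in range(kw):
            out += k[i, j] * fp[i:i + f.shape[0], j:j + f.shape[1]]
    return out

def normalized_convolution(f, c, a):
    """Constant-basis normalized convolution: a certainty-weighted local
    average NC(f) = (a * (c f)) / (a * c). Missing pixels (c = 0)
    contribute nothing, so their stored values in f are irrelevant."""
    num = convolve2d(c * f, a)
    den = convolve2d(c, a)
    with np.errstate(invalid="ignore", divide="ignore"):
        return num / den        # NaN where no certain data lies under the kernel
```

Because numerator and denominator are weighted by the same certainty, a pixel with c(x) = 0 is simply excluded from the average rather than pulling the result toward whatever garbage value it stores.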

4.4.1 Estimating the gradient

The benefits from using normalized differential convolution for estimating gradients in the case of missing data have been demonstrated by Knutsson and Westin [85], and we simply adapt their terminology to this work: following the discussion in Chapter 3,

Sect. 3.5.2, the gradient components can be estimated as the coefficients β1 and β2 for the corresponding linear basis functions (i.e. the local coordinates) b1(ξ) = ξ1 and b2(ξ) = ξ2, where ξ = [ξ1 ξ2]^T denotes the local coordinates, as estimated by (3.56)–(3.58). Hence, the gradient ∇f can be expressed as

∇f(x) ≈ CN∆(x | aξ, [ξ1, ξ2]^T, cf, f) = N∆^{-1}(x)D∆(x),        (4.15)

where aξ is a user-chosen applicability function defining the influence of the neighborhood around a considered point.

4.4.2 Estimating the LST

When the signal is not fully trusted we should take into account that the estimated gradients ∇f(x) rely on different amounts of available information around x. We should therefore not use our estimated ∇f directly in (3.37), but instead identify that equation as a particular case of normalized convolution: Eq. (3.37) can be seen as normalized convolution of a fully trusted signal (full certainty for all pixels) using a constant basis. However, a certainty c∇f(x), representing our trust in the gradient obtained by (4.15), is needed. A simple approach for obtaining such a certainty would be to use the quantity aξ ∗ cf (the standard convolution of the certainty weights), but this would not take the effects of the basis functions into account. The quantity N∆, however, contains a description of the certainties associated with the new basis functions [85], and thereby holds certainty information for the estimated gradient ∇f. As a measure of this certainty, we use the determinant of N∆, i.e.

c∇f (x) = |N∆(x)|. (4.16)

This choice of c∇f, previously used by Westin [87, p. 104], captures the amount of certainty associated with the new basis functions and allows for reducing the impact on the final local structure tensor from gradients estimated from a small number of pixels in favor of gradients estimated using more available information. The local structure tensor T(x) can now be calculated from (3.39)–(3.41) as

T(x) = CN(x | aw, 1, c∇f, ∇f∇^T f) = N^{-1}(x)D(x),        (4.17)

where aw is an applicability function chosen by the user (note that aw is, in general, different from aξ). This applicability determines the level of smoothing of the tensor field. Note that in practice the operations can be performed element-wise for the three defining elements of the symmetric 2×2 matrix ∇f∇^T f, which drastically simplifies the required calculations.
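The element-wise computation noted above can be sketched as follows (illustrative Python/NumPy with hypothetical names; a tiny 3×3 box kernel stands in for the applicability aw):

```python
import numpy as np

def box_smooth(f):
    """3x3 zero-padded box sum; a tiny stand-in for the applicability a_w."""
    fp = np.pad(f, 1)
    return sum(fp[i:i + f.shape[0], j:j + f.shape[1]]
               for i in range(3) for j in range(3))

def certainty_weighted_lst(gx, gy, c):
    """Assemble the LST from gradient fields (gx, gy) with per-pixel
    certainty c, smoothing each of the three defining components of the
    symmetric outer product grad f grad f^T element-wise."""
    den = box_smooth(c)
    def nc(comp):  # constant-basis normalized convolution of one component
        return np.divide(box_smooth(c * comp), den,
                         out=np.zeros(den.shape), where=den > 0)
    return nc(gx * gx), nc(gx * gy), nc(gy * gy)
```

Only three fields (Jxx, Jxy, Jyy) need to be smoothed, since the outer product is symmetric; untrusted gradient estimates (c = 0) drop out of all three averages simultaneously.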

4.4.3 Adaptive morphological processing

The eigenvectors and eigenvalues of T(x) can now be used to set the elliptical structuring elements using Eqs. (4.3), (4.4), and (4.7). For the min/max operations in the actual morphological processing, missing data is then handled by simply ignoring the values of such pixels, setting the corresponding pixel values to −∞/+∞ depending on the type of operation. Apart from the maximum length of the semi-major axis, M, the applicability functions aξ and aw need to be defined. We set them as Gaussian kernels defined by radial bandwidths rξ and rw related to the corresponding standard deviations, analogously to the relation between rw and σ in Equation 4.1. Hence, the applicability functions aξ and aw decrease to half their maximum value at distances rξ and rw, respectively, from their centers. As an overview, the whole method can be summarized into the following steps:

1. Calculate the gradient ∇f by (4.15). Except for the input image and its certainty the user needs to provide rξ, which defines the smoothing used when calculating the gradient (thereby regularizing gradient calculations).

2. Retrieve the LST T(x) by (4.17). The user needs to set rw, which defines the spatial scale for which the LST should be representative.

3. Calculate eigenvalues and eigenvectors for T(x).

4. Define an elliptical structuring element for each pixel using (4.3), (4.4), and (4.7). The parameter M needs to be provided by the user.

5. Perform morphological operations based on (4.8) and (4.9) (the latter implemented by Eq. (4.13)), setting missing pixel values to +∞ or −∞ respectively.

Examples of structuring element and morphological openings for incomplete data are shown in Fig. 4.4.
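The relation between a radial bandwidth r and the standard deviation of the corresponding Gaussian applicability follows from solving exp(−r²/(2σ²)) = 1/2, which gives σ = r/√(2 ln 2). A small illustrative sketch (Python/NumPy, not from the thesis; `applicability` is a hypothetical helper):

```python
import numpy as np

def applicability(r):
    """Gaussian applicability that drops to half its maximum at distance r
    from the center, i.e. sigma = r / sqrt(2 ln 2)."""
    sigma = r / np.sqrt(2.0 * np.log(2.0))
    half = int(np.ceil(3 * sigma))            # truncate at ~3 sigma
    ax = np.arange(-half, half + 1)
    d2 = ax[:, None] ** 2 + ax[None, :] ** 2  # squared distance to the center
    return np.exp(-d2 / (2.0 * sigma ** 2))
```

Parametrizing by the half-maximum radius rather than σ directly keeps the user-facing parameters (rξ, rw) in units of pixels with an intuitive geometric meaning.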

4.5 Strengths and weaknesses

The result of using rigid (non-adaptive) structuring elements is easily predictable, and traditional morphology is therefore very useful for measuring sizes of objects in images and similar tasks. When it comes to feature enhancement, though, such as the crack segment linking procedure in Paper A, which relies on a series of morphological filters of increasing structuring element size, adaptive structuring elements are a good option. In many methods for adaptive morphology the variety of shapes used as structuring elements is more or less unlimited. In general, the structuring elements used in the frequently used similarity-based methods (see Chapter 1, Sect. 1.3) adapt very well to image regions. Some of them rely solely on point similarity (as defined by the user) and are completely unaffected by concepts such as scale. Others obtain spatial constraints from weighted combinations of gray level and spatial information. The structure-based EASE concept is from this perspective more similar to classical non-adaptive morphology, which usually relies on well-defined and often quite small structuring elements that represent some type of intuitive shape, e.g. disks, diamonds, lines, or crosses.


Figure 4.4: EASE on incomplete data. First row: An image containing randomly missing (black) data (a), a subset of the calculated structuring elements (b), and the resulting opening (c). Second row: Captured steel 3D profile data with cracks (d) and without cracks (f), with occluded data marked in red, and the adaptive openings (e and g).

Most structure-based methods rely on directional information. In particular, Verdú-Monedero et al. use average diffused squared gradient fields to obtain angles for line-shaped structuring elements [35, 36, 48], and Tankyevych et al. [37] use principal value analysis of the Hessian matrix of 3D voxels to obtain directions for 3D line structuring elements. Using the PDE approach, Breuß et al. [41] present continuous morphology based on orientations extracted from the LST. Methods based solely on orientation, however, cannot distinguish between having and not having structure, since a direction can always be obtained but will be more or less random, i.e. irrelevant, in flat regions. Some constrain the structuring elements by other measures such as the distance to edges [36], but until now the full power of the LST has been overlooked within mathematical morphology: the LST-based PDE approach by Breuß et al. [41] does not take the eigenvalues of the LST into account, and the diffused squared gradient fields used by Verdú-Monedero et al. [35, 36, 48] will use an equivalent direction, but cannot retrieve the information contained in the eigenvalues of the LST. By explicitly using the LST we can take the relation between its eigenvalues into account without need for interpolation into non-varying image regions. This strategy does not impose an orientation where there is no prevalent direction in the data and therefore avoids introducing a geometrical bias. The presented method is quite different from Group morphology, as there is no obvious way to define a group under which the intended structure-enhancing operations would be invariant. EASE is also quite far from the Graph aspect, but some comments should be made with respect to one particular graph-based method: path openings [44] are constructed for tasks similar to those for which the EASE method is primarily intended, i.e.
enhancement and bridging of partly broken directional structures. Path openings are likely a better option for noise elimination in clear directional structures (paths), but EASE are a better choice for bridging gaps in the structure: path openings are designed to find connected paths, and even if they are allowed to ignore pixels along the way (by incomplete path openings [88]), the number of pixels in the gaps that should be bridged is in general too large for path openings to be of practical use in the intended application. The EASE method is designed to enhance directional structures and bridge gaps within them by linking segments together into larger connected regions, as is often the purpose of morphological operations such as openings or closings. This is hard to accomplish by similarity-based methods (e.g. General Adaptive Neighborhoods [20] or Morphological Amoebas [21]) as these will constrain the structuring elements from reaching outside a region of similar points and therefore will not easily bridge gaps. This is a fundamental difference between the structure and similarity aspects, which should be carefully noted. Similarity-based methods, on the other hand, often provide a better option for noise filtering, where regions should not be expanded but cleaned from points with values deviating from a typical value for the region. An important practical issue, especially in systems intended for on-line use, is computational cost. Generally speaking, methods based on the propagation of geodesic distances (e.g. Morphological amoebas and Salience adaptive structuring elements) have relatively high computational complexity due to the distance propagation. Methods based solely on differences between gray level values, such as General adaptive neighborhoods, require less computational time (for more details, see Paper F).
For estimating structure, accurate interpolation risks being time-consuming, while the LST contains orientation dominancy information in its eigenvalues, which in lower dimensions can be efficiently calculated from closed-form expressions. Moreover, if only a limited number of predefined shapes is required, as is the case for EASE, precalculated libraries (LUTs) of structuring elements can be used for fast processing, in particular for batch processing.

Chapter 5

Contributions

“The whole is greater than the sum of its parts.” – Aristotle

The scientific contributions in Part II are related as follows: Paper A presents a system for crack detection, based on non-adaptive morphology. To overcome identified challenges, Elliptical Adaptive Structuring Elements (EASE) were designed in Paper B, and used in a practical application in Paper C. An alternative method for filling in partly incomplete data prior to processing is presented in Paper D, while Paper E shows how the EASE method can be adapted to allow for direct processing of such data without first filling holes by pre-filtering. Finally, Paper F provides a survey of the broader topic of adaptive mathematical morphology and puts the method into perspective relative to other work presented within the field.

5.1 Paper A

Paper A demonstrates how morphological image processing can be used in combination with statistical methods in an automated industrial system for crack detection in casted steel. More precisely, 3D profile data captured by laser triangulation is processed by morphological filtering using rigid (non-adaptive) structuring elements. The results for the presented method demonstrate proof-of-concept for a fully automated crack detection system. Yet it should be noted that the use of non-adaptive filters requires many constants to be set by the user. This was handled by relating the constants directly to actual physical measures, but motivated the development of suitable adaptive morphological filtering.

Personal contribution: Proposed method by the author together with Matthew Thurley, implementation and evaluation of results by the author.


5.2 Paper B

Paper B introduces the primary contribution of this work: Elliptical Adaptive Structuring Elements (EASE), although not named so at the time. Chapter 4 in this thesis is to a large extent based on this paper. The presented method for adaptive morphological filtering relies on the local structure tensor, which is used to define elliptical structuring elements for each pixel. The ellipses adapt automatically to the local direction of structures in the data while considering the directional dominancy. More specifically, the structuring element ranges dynamically from a line in regions with strong single-directional structure to a disk at locations where no such dominant direction exists. The impact of the two main parameters on the filtering result is investigated, as are the resulting shapes of the elliptical structuring elements used by the method. Furthermore, the usefulness of EASE is demonstrated on both ordinary images and 3D profile data, successfully enhancing directional image structures and crack features.

Personal contribution: General idea, proposed method, implementation, and evaluation of results by the author.

5.3 Paper C

Paper C presents a Machine Vision system for automated inspection of small corner cracks in casted steel. Two light sources of different colors are used together with a line-scan camera in a photometric stereo setup, and the two resulting reflection patterns are used to cancel shadow effects as well as estimate the surface gradient. EASE is used for adaptive enhancement of structural features. These are then used to segment the image by statistical methods, whereafter crack probabilities are estimated for each segmented region. Results show that true cracks are successfully assigned a high crack probability, while only a minor proportion of other regions cause similar probability values. Proof-of-concept for the presented crack detection method is thereby provided.

Personal contribution: General idea by the author together with Matthew Thurley, pro- posed method, implementation, and evaluation of results by the author.

5.4 Paper D

Paper D considers how regions of missing data can be filled in in incomplete images in general and 3D profile data in particular. A novel method for iterative hole-filling, based on normalized convolution, is presented and applied to both ordinary gray-scale images and 3D profile data. The presented algorithm fills holes in the data iteratively, beginning at the border of the existing (reliable) data. More precisely, the actual order of pixels to be filled in is set from a priority measure based on the confidence of surrounding pixels. The result is a hole-filling method which grows the reliable data regions in the image into the center of the holes. Once a pixel x has been selected for reconstruction, a representation of the image in a given set of basis functions (here the set of second order polynomials) is retrieved using normalized convolution. The value of pixel x (which until this point is missing) is then set from the local approximation of the image function in the new basis functions. The results show that the method is not ideal for reconstruction of randomly sampled images, but handles data with holes such as those typically present in 3D profile data better than ordinary normalized convolution.

Personal contribution: General idea, proposed method, implementation, and evaluation of results by the author together with Frida Nellros.

5.5 Paper E

Paper E extends the concept of Elliptical Adaptive Structuring Elements (EASE) to incomplete, i.e. partially missing, data. It is demonstrated how known techniques for convolving uncertain data – normalized convolution and normalized differential convolution – can be used to assign the EASE shapes. This approach enables robust processing of partially occluded or otherwise incomplete data. Results are presented for filtering of incomplete ordinary gray-scale images as well as 3D profile data where information is missing due to occlusion effects. The latter demonstrates the intended use of the method: enhancement of crack signatures in a surface inspection system for casted steel. The presented method is able to disregard unreliable data in a systematic and robust way, enabling adaptive morphological processing of the available information while avoiding any false edges or other unwanted features introduced by the values of faulty pixels.

Personal contribution: General idea, proposed method, implementation, and evaluation of results by the author.

5.6 Paper F

Paper F presents an overview of adaptive mathematical morphology, providing background and context as well as a unifying summary of the theoretical framework (which until now has been quite scattered in the literature). A broad review of previous work within the field is provided, and different aspects of adaptivity within mathematical morphology, useful for characterizing the many different types of adaptivity presented, are identified. Four methods, including EASE, are analyzed in more detail and their advantages and disadvantages are discussed from an application-oriented viewpoint, providing perspective on the consequences of different types of adaptivity.

The paper is concluded by a brief analysis of perspectives and trends within the field of adaptive mathematical morphology, discussing possible directions for future studies.

Personal contribution: General idea, literature study, and evaluation by the author together with Vladimir Ćurić. The other co-authors contributed in discussions and comments.

Chapter 6

Conclusions and Future Work

“Quotation is a serviceable substitute for wit.” – Oscar Wilde

6.1 Conclusions

This thesis demonstrates how concepts from (or related to) linear filtering can be combined with the non-linear morphological framework in a straightforward manner, resulting in a novel method for adaptive mathematical morphology based on the Local Structure Tensor (LST): Elliptical Adaptive Structuring Elements (EASE). Previous work has touched on the subject but not fully taken advantage of the information contained in the LST. By considering the relation between the eigenvalues as well as the direction obtained from the LST, edge dominancy information can be automatically included when constructing the Structuring Element Map (SEM). As ellipses are often used to depict 2×2 tensors there is, from a morphological perspective, a quite natural connection between the Local Structure Tensor (LST) and structuring elements, if only they can be defined in a meaningful way. This is what the EASE method is intended to accomplish.

The EASE definition can be extended into a robust technique for adaptive morphological processing of incomplete data, handling missing pixels within the method itself without need for prior reconstruction. This is very useful when processing partly occluded 3D profile data, as incomplete data can be robustly processed without introducing artificial structures in originally flat regions. If the structuring element is too small it may only cover missing data, but this can be handled by simply measuring the amount of valid (non-missing) data it covers. However, the possibility of changing the sizes of the structuring elements and applicability functions used more dynamically should be considered in the future. Also, an output certainty useful for discarding unreliable output pixel values could be defined by considering the relationship between the numbers of valid and missing pixels covered by each structuring element.
Results show that the structuring elements indeed follow data structures well, turning into aligned lines (or close to lines) around edges in the image. Furthermore, when

processing regions further away from distinct edges the method adapts dynamically to a non-directional approach by transforming the structuring element into smaller disks, which avoids a distinct geometrical bias from the filter itself in such locations. In particular, this is true when data contains random noise, in which case the noise level affects the morphological operations (which are based on minimum and maximum values within neighborhoods) but does not cause artificial edges in the data resulting from biases from the structuring element. The Elliptical Adaptive Structuring Elements are in this sense robust to noise and are automatically adjusted to the local edge information (or lack thereof) in the data. They constitute a dynamic bridge between the two extremes: disk and (direction-adaptive) line structuring elements. The strength of the method lies in its generality and simplicity: it relies on up to three parameters and can be applied to any type of image data. The parameters are quite intuitive and are related to real measures in the image: the maximum semi-major axis sets the sizes of the structuring elements, while the radial bandwidth for calculating the LST is related to the size of orientational features of interest. These two can in many cases even be set to the same number, since the possible reach of the assigned structuring elements then corresponds to the region considered for structure estimation. If needed, the level of Gaussian prefiltering when calculating the gradient can of course also be set by changing the underlying parameter with respect to the expected noise level, but for most cases a small smoothing operation should suffice. Hence, the presented morphological processing is reasonably easy to use even without deep insight into the underlying framework.
It has furthermore been demonstrated how the adaptive elliptical structuring elements can be applied to both gray-scale images and 3D profile data used in industrial applications, successfully enhancing directional structures within the data. While many methods for adaptive morphology are of a quite theoretical nature, and have not yet found explicit practical use, the application considered in this thesis – crack detection in casted steel – serves as a practical motivation for EASE. Yet the method is not in any way restricted to this particular application. Moreover, while this thesis demonstrates its use on gray-scale and 3D profile data, further extension into higher dimensions is theoretically straightforward (although of course more computationally demanding).

Many methods for adaptive morphological processing are associated with a relatively high computational cost, which may be one reason why few have reached the wider image analysis community even though non-adaptive morphology is widely known. From this perspective, the low number of computations required for the EASE method is a big advantage: the LST calculations are in the standard setup based on convolutions – a standard process which is usually implemented highly efficiently. The eigenvalues of the LST can in lower dimensions be calculated from closed-form expressions, and can therefore be handled quite efficiently as well (but will of course need more computational power in higher dimensions). For the incomplete data case, the number of computations will be higher. As for the morphological operations, a Look-Up Table (LUT) allows for a reasonably efficient implementation. In systems where the adaptive morphological operations should be repeated or otherwise performed several times, the LUT can be calculated once (off-line) and then loaded into on-line applications, reducing the computational load for on-line systems at the cost of memory.

6.2 Future work

An obvious continuation of this work, although outside the scope of this thesis, is the implementation of the EASE concept in an on-line machine vision system, enabling crack detection similar to the system presented in Paper A, where a smaller set of parameters (the large number of parameters in Paper A resulted largely from the use of non-adaptive morphology) would simplify analysis and system tuning. The EASE concept, however, is not in any way limited to the practical task for which it was originally intended: there are likely many applications which could benefit from bridging gaps or enhancing structure non-linearly. Also, there is no reason why the presented results cannot be generalized to higher dimensions, enabling processing of voxel information. However, even though the EASE method is reasonably efficient, the possibility of speeding up the process even more should be investigated in order to optimize for high-speed on-line applications or multi-dimensional signals such as 3D data.

EASE filtering is currently defined by up to three parameters: the maximum semi-major axis for the structuring elements, the filter bandwidth setting the scale for the LST, and the bandwidth for the prefiltering. The user currently needs to set at least one of them, the maximum semi-major axis length, in which case the scale for the LST is set accordingly. The benefit of this is that the user is provided control of the filter behavior. However, it would still be of interest to explore the possibility of letting the input image affect all parameters. This could be done by calculating the LST at different scales (for different values of rw) and selecting a value for M that captures the scale at which the LST captures enough information. Other approaches for estimating or interpreting image structure, where crossings can be separated from points, could also be investigated. For instance, the semi-major axis could be scaled based on the combined magnitude of the eigenvalues.
Future work should also consider estimating the level of noise in the input image in order to define a proper level of pre-smoothing for calculating the gradient. This would allow for a more proper handling of thin structures in noise-free data. EASE for incomplete data is currently limited by the size of the filters used. Meanwhile, Paper D demonstrates how an iterative scheme can be used to partly overcome this problem. It may be possible to combine the ideas, making the EASE algorithm for incomplete data more stable by iterative processing. Moreover, as disks (and thereby unit balls) are special cases of ellipses it may be possible to formulate the EASE concept using Partial Differential Equations (PDEs) – especially since the LST has previously been used similarly for obtaining orientations [41]. This would enable sub-pixel accuracy, as the operations would be defined in a continuous framework, but would likely increase the required processing time substantially since PDE solvers usually rely on iterative methods. All these suggestions, however, will likely not improve the method for every case, but should instead be considered variations for different scenarios. This is far from unique for EASE, but is also the case for adaptive mathematical morphology in general. Based on the evaluations and discussions in Paper F, it is hard to imagine any future method for adaptive morphology as “optimal” or “ideal” for every case – it is more about having a toolbox for handling a variety of situations. The search for new tools, however, is an important task that should not be overlooked.
For many methods for adaptive mathematical morphology, including EASE, the classical morphological tool of granulometries becomes hard to define: a granulometry should sort objects based on size, and as adaptive shapes may change completely from one scale to another without fulfilling the absorption property, obtaining an ordered result becomes far from trivial. Also challenging is morphological filtering of multi-valued data (e.g. color images): the process of ordering vectors may be complex enough in the non-adaptive case [89, 90], and does not become easier when introducing more degrees of freedom into the problem. If the same structuring elements are used for all image channels the task becomes more or less a direct ordering problem, but if structuring elements are allowed to vary from channel to channel the situation becomes challenging. Some work in this direction has been presented [29, 91], but there is still much room for further studies. A related challenge, which would be highly relevant to the presented method, is the definition of an ordering which could take different certainty levels into account. This is far from trivial, but would enable morphological processing of signal/certainty type data where certainty values are not restricted to the binary variable used in this thesis. Finally, recent work has introduced mathematical morphology on Riemannian manifolds [92, 93]. This idea has a lot of potential, and may very well be a good starting point for a generalized unifying framework for mathematical morphology. Moreover, the possibility of using geodesic distances in combination with 3D profile data is highly interesting from the practical perspective of this thesis.

Bibliography

[1] T. D. Raty, "Survey on contemporary remote surveillance systems for public safety," IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 40, no. 5, pp. 493–515, 2010.

[2] T. Gandhi and M. Trivedi, "Pedestrian protection systems: Issues, survey, and challenges," IEEE Transactions on Intelligent Transportation Systems, vol. 8, no. 3, pp. 413–430, 2007.

[3] M. Thurley and T. Andersson, "An industrial 3D vision system for size measurement of iron ore green pellets using morphological image segmentation," Minerals Engineering, vol. 21, no. 5, pp. 405–415, 2008.

[4] R. C. Gonzalez and R. E. Woods, Digital Image Processing (3rd Edition). Upper Saddle River, NJ, USA: Prentice-Hall Inc., 2006.

[5] G. Matheron, Random sets and integral geometry. New York: Wiley, 1975, vol. 1.

[6] J. Serra, Image analysis and mathematical morphology. London: Academic Press, 1982.

[7] J. Angulo and J. Serra, "Automatic analysis of DNA microarray images using mathematical morphology," Bioinformatics, vol. 19, no. 5, pp. 553–562, 2003.

[8] T. T. Vu, F. Yamazaki, and M. Matsuoka, "Multi-scale solution for building extraction from lidar and image data," International Journal of Applied Earth Observation and Geoinformation, vol. 11, no. 4, pp. 281–289, 2009.

[9] J. Lee, M. Smith, L. Smith, and P. Midha, "A mathematical morphology approach to image based 3D particle shape analysis," Machine Vision and Applications, vol. 16, no. 5, pp. 282–288, 2005.

[10] M. Thurley, "Automated online measurement of limestone particle size distributions using 3D range data," Journal of Process Control, vol. 21, no. 2, pp. 254–262, 2011.

[11] M. Nagao and T. Matsuyama, "Edge preserving smoothing," Computer Graphics and Image Processing, vol. 9, pp. 394–407, 1979.

[12] C. Tomasi and R. Manduchi, "Bilateral filtering for gray and color images," in Sixth International Conference on Computer Vision, pp. 839–846. IEEE, 1998.

[13] P. Milanfar, "A tour of modern image filtering: new insights and methods, both practical and theoretical," IEEE Signal Processing Magazine, vol. 30, no. 1, pp. 106–128, 2013.

[14] P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629–639, 1990.

[15] J. Serra, Image analysis and mathematical morphology. Vol. 2. New York, NY: Academic Press, 1988.

[16] M. Charif-Chefchaouni and D. Schonfeld, "Spatially-variant mathematical morphology," in IEEE International Conference on Image Processing (ICIP), vol. 2, pp. 555–559. IEEE, 1994.

[17] A. Morales, "Adaptive structuring element for noise and artifact removal," in Proceedings of the 23rd Conference on Information Sciences and Systems, March 1989.

[18] C.-S. Chen, J.-L. Wu, and Y.-P. Hung, "Theoretical aspects of vertically invariant gray-level morphological operators and their application on adaptive signal and image filtering," IEEE Transactions on Signal Processing, vol. 47, no. 4, pp. 1049–1060, 1999.

[19] F. Cheng and A. N. Venetsanopoulos, "Adaptive morphological operators, fast algorithms and their applications," Pattern Recognition, vol. 33, no. 6, pp. 917–933, 2000.

[20] J. Debayle and J. Pinoli, “Spatially adaptive morphological image filtering using intrinsic structuring elements,” Image Analysis & Stereology, vol. 24, no. 3, pp. 145–158, 2005.

[21] R. Lerallut, É. Decencière, and F. Meyer, "Image filtering using morphological amoebas," in Mathematical Morphology: 40 Years On, 2005.

[22] N. Bouaynaya and D. Schonfeld, “Spatially variant morphological image processing: theory and applications,” in Proceedings of SPIE, vol. 6077, pp. 673–684, 2006.

[23] N. Bouaynaya, M. Charif-Chefchaouni, and D. Schonfeld, "Theoretical foundations of spatially-variant mathematical morphology part I: Binary images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5, pp. 823–836, 2008.

[24] N. Bouaynaya and D. Schonfeld, "Theoretical foundations of spatially-variant mathematical morphology part II: Gray-level images," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5, pp. 837–850, 2008.

[25] J. Roerdink, “Adaptivity and group invariance in mathematical morphology,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2253–2256. IEEE, 2009.

[26] P. Maragos and C. Vachier, "Overview of adaptive morphology: trends and perspectives," in 16th IEEE International Conference on Image Processing (ICIP), pp. 2241–2244. IEEE, 2009.

[27] U. Braga-Neto, "Alternating sequential filters by adaptive-neighbourhood structuring functions," in Proc. of International Symposium on Mathematical Morphology, pp. 139–146, 1996.

[28] O. Cuisenaire, "Locally adaptable mathematical morphology using distance transformations," Pattern Recognition, vol. 39, no. 3, pp. 405–416, 2006.

[29] R. Lerallut, É. Decencière, and F. Meyer, "Image filtering using morphological amoebas," Image and Vision Computing, vol. 25, no. 4, pp. 395–404, 2007.

[30] J. Grazzini and P. Soille, "Edge-preserving smoothing using a similarity measure in adaptive geodesic neighbourhoods," Pattern Recognition, vol. 42, no. 10, pp. 2306–2316, 2009.

[31] V. Ćurić, C. Luengo Hendriks, and G. Borgefors, "Salience adaptive structuring elements," IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 7, pp. 809–819, 2012.

[32] P. Dokládal and E. Dokládalová, "Grey-scale morphology with spatially-variant rectangles in linear time," in Advanced Concepts for Intelligent Vision Systems, pp. 674–685. Springer, 2008.

[33] V. Ćurić and C. L. Luengo Hendriks, "Adaptive structuring elements based on salience information," in Computer Vision and Graphics, pp. 321–328. Springer, 2012.

[34] F. Shih and S. Cheng, "Adaptive mathematical morphology for edge linking," Information Sciences, vol. 167, no. 1, pp. 9–21, 2004.

[35] R. Verdú-Monedero and J. Angulo, "Spatially-variant directional mathematical morphology operators based on a diffused average squared gradient field," in Advanced Concepts for Intelligent Vision Systems, pp. 542–553. Springer, 2008.

[36] R. Verdú-Monedero, J. Angulo, and J. Serra, "Anisotropic morphological filters with spatially-variant structuring elements based on image-dependent gradient fields," IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 200–212, 2011.

[37] O. Tankyevych, H. Talbot, P. Dokládal, and N. Passat, "Direction-adaptive grey-level morphology: application to 3D vascular brain imaging," in 16th IEEE International Conference on Image Processing (ICIP), pp. 2261–2264. IEEE, 2009.

[38] J. Angulo and S. Velasco-Forero, "Structurally adaptive mathematical morphology based on nonlinear scale-space decompositions," Image Analysis & Stereology, vol. 30, no. 2, pp. 111–122, 2011.

[39] P. Maragos and C. Vachier, "A PDE formulation for viscous morphological operators with extensions to intensity-adaptive operators," in 15th IEEE International Conference on Image Processing (ICIP), pp. 2200–2203. IEEE, 2008.

[40] M. Welk, M. Breuß, and O. Vogel, “Morphological amoebas are self-snakes,” Journal of Mathematical Imaging and Vision, vol. 39, no. 2, pp. 87–99, 2011.

[41] M. Breuß, B. Burgeth, and J. Weickert, “Anisotropic continuous-scale morphology,” Pattern Recognition and Image Analysis, pp. 515–522, 2007.

[42] V.-T. Ta, A. Elmoataz, and O. Lézoray, "Nonlocal PDEs-based morphology on weighted graphs for image and data processing," IEEE Transactions on Image Processing, vol. 20, no. 6, pp. 1504–1516, 2011.

[43] J. Stawiaski and F. Meyer, “Minimum spanning tree adaptive image filtering,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2245–2248. IEEE, 2009.

[44] H. Heijmans, M. Buckley, and H. Talbot, “Path openings and closings,” Journal of Mathematical Imaging and Vision, vol. 22, no. 2–3, pp. 107–119, 2005.

[45] J. Cousty, L. Najman, F. Dias, and J. Serra, “Morphological filtering on graphs,” Computer Vision and Image Understanding, pp. 1–20, 2012.

[46] J. Roerdink and H. Heijmans, “Mathematical morphology for structures without translation symmetry,” Signal Processing, vol. 15, no. 3, pp. 271–277, 1988.

[47] J. Roerdink, “Group morphology,” Pattern Recognition, vol. 33, no. 6, pp. 877–895, 2000.

[48] R. Verdú-Monedero, J. Angulo, and J. Serra, "Spatially-variant anisotropic morphological filters driven by gradient fields," Mathematical Morphology and Its Application to Signal and Image Processing, pp. 115–125, 2009.

[49] “World steel in figures,” World Steel Association, Brussels, Belgium, Tech. Rep., 2014.

[50] R. Mahapatra, J. Brimacombe, I. Samarasekera, N. Walker, E. Paterson, and J. Young, "Mold behavior and its influence on quality in the continuous casting of steel slabs: Part I. Industrial trials, mold temperature measurements, and mathematical modeling," Metallurgical and Materials Transactions B, vol. 22, pp. 861–874, 1991.

[51] X. Li, S. Tso, X. Guan, and Q. Huang, "Improving automatic detection of defects in castings by applying wavelet technique," IEEE Transactions on Industrial Electronics, vol. 53, no. 6, pp. 1927–1934, 2006.

[52] J. Yun, S. Choi, B. Seo, C. Park, and S. Kim, "Defects Detection of Billet Surface Using Optimized Gabor Filters," in Proceedings of the 17th IFAC World Congress, pp. 77–82. The International Federation of Automatic Control, 2008.

[53] B. G. Thomas, “On-line Detection of Quality Problems in Continuous Casting of Steel,” in Modeling, Control and Optimization in Ferrous and Nonferrous Industry, 2003 Materials Science and Technology Symposium, pp. 29–45, 2003.

[54] P. Meilland, "Novel Multiplexed Eddy-Current Array for Surface Crack Detection on Rough Steel Surface," Proceedings of the 9th European Conference on Non-Destructive Testing (ECNDT), Berlin, 2006.

[55] T. Nishimine, O. Tsuyama, T. Tanaka, and H. Fujiwara, “Automatic magnetic particle testing system for square billets,” in Conference Record of the 1995 IEEE Industry Applications Conference, vol. 2, pp. 1585–1590. IEEE, 1995.

[56] M. Allazadeh, C. Garcia, K. Alderson, and A. Deardo, “Ultrasonic image analysis of steel slabs,” Advanced materials & processes, vol. 166, no. 12, pp. 26–27, 2008.

[57] J. Sirgo, R. Campo, A. Lopez, A. Diaz, and L. Sancho, "Measurement of centerline segregation in steel slabs," in Conference Record of the 2006 IEEE Industry Applications Conference, vol. 1, pp. 516–520. IEEE, 2006.

[58] R. T. Chin and C. A. Harlow, "Automated visual inspection: A survey," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 4, no. 6, pp. 557–573, 1982.

[59] R. Chin, “Automated visual inspection: 1981 to 1987,” Computer Vision, Graphics, and Image Processing, vol. 41, no. 3, pp. 346–381, 1988.

[60] E. Bayro-Corrochano, “Review of automated visual inspection 1983-1993, Part I: conventional approaches,” in Proceedings of SPIE, vol. 2055, p. 128, 1993.

[61] E. Bayro-Corrochano, “Review of automated visual inspection 1983-1993, Part II: approaches to intelligent systems,” in Proceedings of SPIE, vol. 2055, p. 159, 1993.

[62] T. Newman and A. Jain, “A survey of automated visual inspection,” Computer vision and image understanding, vol. 61, no. 2, pp. 231–262, 1995.

[63] E. Malamas, E. Petrakis, M. Zervakis, L. Petit, and J. Legat, “A survey on industrial vision systems, applications and tools,” Image and Vision Computing, vol. 21, no. 2, pp. 171–188, 2003.

[64] J. Yun, S. Choi, Y. Jeon, D. Choi, and S. Kim, "Detection of line defects in steel billets using undecimated wavelet transform," in International Conference on Control, Automation and Systems, pp. 1725–1728. IEEE, 2008.

[65] Y. Jeon, J. Yun, D. Choi, and S. Kim, "Defect detection algorithm for corner cracks in steel billet using discrete wavelet transform," in ICCAS-SICE, pp. 2769–2773. IEEE, 2009.

[66] M. Yazdchi, A. Mahyari, and A. Nazeri, "Detection and classification of surface defects of cold rolling mill steel using morphology and neural network," in International Conference on Computational Intelligence for Modelling Control & Automation, pp. 1071–1076. IEEE, 2008.

[67] D. Lee, Y. Kang, C. Park, and S. Won, “Defect Detection Algorithm in Steel Billets using Morphological Top-Hat Filter,” in IFAC Workshop on Automation in Mining, Mineral and Metal Industry (IFACMMM), 2009.

[68] F. Pernkopf, “3D surface acquisition and reconstruction for inspection of raw steel products,” Computers in Industry, vol. 56, no. 8-9, pp. 876–885, 2005.

[69] I. Alvarez, J. Marina, J. Enguita, C. Fraga, and R. Garcia, “Industrial online surface defects detection in continuous casting hot slabs,” in Proceedings of SPIE, vol. 7389, p. 73891X, 2009.

[70] R. Woodham, “Photometric method for determining surface orientation,” Optical engineering, vol. 1, no. 7, pp. 139–144, 1980.

[71] R. Kozera, “Existence and uniqueness in photometric stereo,” Applied Mathematics and Computation, vol. 44, no. 1, pp. 1–103, 1991.

[72] M. Drew, “Robust specularity detection from a single multi-illuminant color image,” Computer Vision, Graphics, and Image Processing: Image Understanding, vol. 59, no. 3, pp. 320–327, 1994.

[73] D. Kang, Y. J. Jang, and S. Won, “Development of an inspection system for planar steel surface using multispectral photometric stereo,” Optical Engineering, vol. 52, no. 3, pp. 039 701–039 701, 2013.

[74] L. Cammoun, C. Castaño-Moraga, E. Muñoz-Moreno, D. Sosa-Cabrera, B. Acar, M. Rodriguez-Florido, A. Brun, H. Knutsson, and J. Thiran, "A review of tensors and tensor signal processing," in Tensors in Image Processing and Computer Vision, pp. 1–32. Springer, 2009.

[75] H. J. A. M. Heijmans and C. Ronse, "The algebraic basis of mathematical morphology. I. Dilations and erosions," Computer Vision, Graphics, and Image Processing, vol. 50, no. 3, pp. 245–295, 1990.

[76] C. Ronse and H. J. A. M. Heijmans, "The algebraic basis of mathematical morphology. II. Openings and closings," Computer Vision, Graphics, and Image Processing, vol. 55, no. 1, pp. 74–97, 1991.

[77] P. Maragos, "Pattern spectrum and multiscale shape representation," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 701–716, 1989.

[78] L. Alvarez, F. Guichard, P.-L. Lions, and J.-M. Morel, “Axioms and fundamental equations of image processing,” Archive for rational mechanics and analysis, vol. 123, no. 3, pp. 199–257, 1993.

[79] R. W. Brockett and P. Maragos, "Evolution equations for continuous-scale morphological filtering," IEEE Transactions on Signal Processing, vol. 42, no. 12, pp. 3377–3386, 1994.

[80] L. Vincent, “Graphs and mathematical morphology,” Signal Processing, vol. 16, no. 4, pp. 365–388, 1989.

[81] H. Heijmans, P. Nacken, A. Toet, and L. Vincent, “Graph morphology,” Journal of Visual Communication and Image Representation, vol. 3, no. 1, pp. 24–38, 1992.

[82] L. Najman and J. Cousty, "A graph-based mathematical morphology reader," Pattern Recognition Letters, vol. 47, pp. 3–17, 2014.

[83] E. Dougherty and R. Lotufo, Hands-on Morphological Image Processing, ser. Tutorial Texts in Optical Engineering. Bellingham, Washington, USA: SPIE - The International Society for Optical Engineering, 2003, vol. TT59.

[84] H. Knutsson, “Representing local structure using tensors,” in The 6th Scandinavian Conference on Image Analysis, pp. 244–251, June 1989.

[85] H. Knutsson and C.-F. Westin, "Normalized and differential convolution," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Proceedings CVPR '93, pp. 515–523, Jun. 1993.

[86] C. Westin, K. Nordberg, and H. Knutsson, "On the equivalence of normalized convolution and normalized differential convolution," in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 457–460. IEEE, 1994.

[87] C. Westin, "A tensor framework for multidimensional signal processing," Ph.D. dissertation, Linköping University, Sweden, 1994.

[88] H. Talbot and B. Appleton, “Efficient complete and incomplete path openings and closings,” Image and Vision Computing, vol. 25, no. 4, pp. 416–425, 2007.

[89] E. Aptoula and S. Lefèvre, "A comparative study on multivariate mathematical morphology," Pattern Recognition, vol. 40, no. 11, pp. 2914–2929, 2007.

[90] J. Angulo, "Morphological colour operators in totally ordered lattices based on distances: Application to image filtering, enhancement and analysis," Computer Vision and Image Understanding, vol. 107, no. 1, pp. 56–73, 2007.

[91] V. González-Castro, J. Debayle, and J.-C. Pinoli, "Color adaptive neighborhood mathematical morphology and its application to pixel-level classification," Pattern Recognition Letters, vol. 47, pp. 50–62, 2014.

[92] J. Angulo and S. Velasco-Forero, "Mathematical morphology for real-valued images on Riemannian manifolds," in Mathematical Morphology and Its Applications to Signal and Image Processing, pp. 279–291. Springer, 2013.

[93] J. Angulo and S. Velasco-Forero, "Riemannian mathematical morphology," Pattern Recognition Letters, vol. 47, pp. 93–101, 2014.

Part II

Paper A

Morphology-Based Crack Detection for Steel Slabs

Authors: Anders Landström and Matthew J. Thurley

Reformatted version of paper originally published in: IEEE Journal of Selected Topics in Signal Processing, 2012, vol. 6, no. 7, pp. 866–875.

© 2012, Institute of Electrical and Electronics Engineers, reprinted with permission.

Morphology-Based Crack Detection for Steel Slabs

Anders Landström and Matthew J. Thurley

Abstract

Continuous casting is a highly efficient process used to produce most of the world steel production tonnage, but can cause cracks in the semi-finished steel product output. These cracks may cause problems further down the production chain, and detecting them early in the process would avoid unnecessary and costly processing of the defective goods. In order for a crack detection system to be accepted in industry, however, false detection of cracks in non-defective goods must be avoided. This is further complicated by the presence of scales; a brittle, often cracked, top layer originating from the casting process. We present an approach for an automated on-line crack detection system, based on 3D profile data of steel slab surfaces, utilizing morphological image processing and statistical classification by logistic regression. The initial segmentation successfully extracts 80% of the crack length present in the data, while discarding most potential pseudo-defects (non-defect surface features similar to defects). The subsequent statistical classification individually has a crack detection accuracy of over 80% (with respect to total segmented crack length), while discarding all remaining manually identified pseudo-defects. Taking more ambiguous regions into account gives a worst-case false classification of 131 mm within the 30 600 mm long sequence of 150 mm wide regions used as validation data. The combined system successfully identifies over 70% of the manually identified (unambiguous) crack length, while missing only a few crack regions containing short crack segments. The results provide proof-of-concept for a fully automated crack detection system based on the presented method.

1 Introduction

1.1 Background

Currently almost 95% of the world steel production tonnage is solidified by continuous casting [1]. The method has advantages in productivity, cost reduction and output quality, but, depending on factors such as design, operation and maintenance, may introduce various surface defects [2]. Steel slabs (see Fig. 1), the major (semi-finished) product within steel casting, are therefore susceptible to crack formation.

Figure 1: Examples of a steel slab (a) and a longitudinal surface crack (b).

Since steel slabs are often intended for sheet steel rolling, surface defects such as cracks may result in long sections of defective end-user products. Consequently, inspection of steel slabs before sending them through to the rolling mill, thereby avoiding related problems at the later stage, is important to the steel industry. However, most inspection systems within the area are still manually operated [3,4].

This work focuses on automated detection of longitudinal cracks in steel slabs, based on non-contact measurements. A good solution to the problem should efficiently detect potentially problematic cracks, while keeping the number of false positives (non-crack regions being identified as cracks) at a minimum. Particular attention must therefore be paid to the presence of scales, which constitute a brittle, often cracked, top layer, formed from oxidization in the manufacturing process. This scale layer is unavoidable during casting [5], and cracks therein are, from a top view perspective, similar to cracks in the steel and therefore risk causing false positives in the detection result. In the intended use of this work, a robust system must handle surfaces partially covered by scales without reporting false positives from pseudo-defects (non-defect surface features similar to defects).

1.2 Related research

Surface inspection by computer vision is a wide topic, and its progress in industrial processes has been documented in several surveys during the last decades [6–11].

Of particular interest to this work are automated systems for steel surface inspection, which is a field of ongoing study well represented in the literature. Gray-scale intensity imaging is commonly used, in combination with various signal processing techniques such as wavelet transforms [12,13], Gabor filters [4] and image morphology [4,5,14]. Use of gray-scale intensity images has its limitations, though. Variations in lighting conditions, giving rise to potential pseudo-defects, are a problem. In particular, light reflection from scale regions may vary substantially, making the gray level in intensity images highly unpredictable, which may give rise to pseudo-defects [13]. Other parameters, such as steel type, may affect properties in gray-scale images as well [5].

Due to the shortcomings of intensity imaging, other optical techniques for automated inspection of metallic surfaces have been suggested. Pernkopf [15] notes that range imaging provides better contrast for surface defects with "three-dimensional characteristics", and that the strong changes in reflective properties in scale regions motivate the use of range data over intensity imaging due to its lower sensitivity to inhomogeneous reflectance. Pernkopf and O'Leary [16] summarize two range imaging methods: light sectioning, using projected light to calculate distance, and photometric stereo, obtaining distances for a static scene from several intensity images using different light sources. Another solution, based on range data collected by conoscopic holography, is presented by Alvarez et al. [17].
In addition to the more traditional machine vision approaches, numerous systems interpreting data collected by other Non-Destructive Testing (NDT) methods such as thermocouples [18], eddy currents [19], magnetic powder [20], ultrasound [21] or sulfur prints [22] are also common in the literature. These systems generally require more contact, though, in contrast to intensity and range imaging.

Beyond steel inspection, other methods for crack detection have been applied in other contexts. The possibility of using watershed segmentation for crack detection in X-ray images of welds has been studied [23], and the Hough transform has been used for crack detection in color images of biscuits [24] as well as in X-ray images of welds [25].

Watershed segmentation can be used to identify cracks, given a proper set of seeds, but risks over-segmenting the data [23]. The method may therefore require substantial post-processing in order to properly separate cracks from pseudo-defects. A good set of seeds reduces post-processing, but instead increases the need for pre-processing. For example, use of the stochastic watershed [26] would require a minimal number of parameters while providing a good segmentation, but demands a substantial amount of processing time and is therefore not suitable for the intended on-line system. Moreover, when no crack is present, watershed segmentation will still produce a segmentation which must be post-processed in order to rule out the presence of cracks. The Hough transform is a good tool for identifying line segments in the data, and has been used in crack detection systems [24,25]. However, it requires a binary image as input, which again requires pre-processing. The situation is also complicated by the fact that cracks in steel may not be perfectly straight, which further calls for pre-processing.

1.3 Contribution

As stated in the previous section, inspection of metallic surfaces by intensity imaging suffers from limitations. In particular, the method may introduce pseudo-defects in scale regions. For robust non-contact inspection of steel surfaces partially covered by scales, range imaging provides a more promising alternative.

Due to the nature of the presented problem, where the cracks by definition have a longitudinal orientation and thereby represent a specific directional structure in the casted steel surface data, we present a solution based on mathematical morphology. The strength of mathematical morphology lies in its ability to identify and/or enhance features of specific shape and orientation in the data. It is a common technique for analyzing 3D profile data, and has previously been used in a wide range of applications such as LiDAR [27], aggregates [28,29], and pellets [30]. However, it has to the authors' knowledge not been used for analysis of 3D profile data in the intended context: crack detection for casted steel.

In this work, we present a strategy for morphology-based crack detection for steel slabs based on 3D surface profile data collected by laser triangulation. The system first segments the data using mathematical morphology, and the resulting connected regions are assigned a crack probability using a logistic regression model.

2 Measurements

Sets of 3D profile data for steel slab surfaces were acquired on two separate occasions, and are hence referred to as sets A and B, respectively. The data is processed in regions (here referred to as images) of 150×100 mm (width×length, or x×y) in size, identifying cracks by segmenting the data set and classifying the resulting connected regions. Set A, containing a total of 644 images collected from two slabs, was used as model set for the presented segmentation algorithm and thereafter used to train a classifier, while set B, containing a total of 323 images collected from four slabs, was used as an independent validation set. Reference maps for both sets, defining the location of strong crack signatures in the data, were produced by manually marking clear, open cracks in the data. Fig. 2 presents an example of crack data, captured at 0.1×0.1×0.0053 mm (width×length×depth) resolution.

Figure 2: Input data example: A 150×100 mm (width×length) region of steel slab profile data, viewed in 3D (a) and from above (b).

Table 1: Morphological notation

B : Binary image.
I : Gray-scale image.
s : Structuring element.
I ⊖ s : Erosion of I by s.
I ⊕ s : Dilation of I by s.
I ◦ s = (I ⊖ s) ⊕ s : Opening of I by s.
I • s = (I ⊕ s) ⊖ s : Closing of I by s.
I rec⊕8 IM : 8-pixel neighborhood reconstruction of I from a marker image IM.

3 Segmentation

Morphological image processing is frequently used throughout this section. Definitions of the morphological concepts can be found in the introduction to the subject provided by Dougherty and Lotufo [31], and the notation used is summarized in Table 1.

The nature of the morphological operations gives rise to a set of parameters, which must be set in accordance with a definition of what we are actually looking for. Therefore these parameters have been set experimentally from the model set, while considering their relation to adequate physical measures relevant to the problem. We thereby define the following concepts:

• Trench: A single-directional, less than 5 mm wide, distinct longitudinal depression in the surface.

• Scale: An elevated, less than 25 mm wide, part of the surface, possibly cracked by an opening no more than 5 mm in width.

• Crack signature: A trail of the bottom 0.1 mm wide portions of a trench, where the trench depth is larger than 0.1 mm.

To emphasize the relation to the physical measures, lengths are denoted in [mm] rather than pixels. These parameters are likely dependent on the steel grade being inspected, but an investigation of such relations lies beyond the scope of this work.

After preprocessing (described in Section 3.1), a coarse search for trenches is performed at 1×1 mm resolution (Section 3.2). The search is then gradually refined, first at 0.3×0.3 mm resolution where scales are excluded (Section 3.3) and finally at 0.1×0.1 mm resolution where cracks are identified (Section 3.4). This strategy decreases processing time and concentrates on regions where cracks are expected. Binary images displayed in this section are inverted, displaying pixels with value 1 (true) in black on a white background.
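The notation in Table 1 corresponds to standard grey-scale morphology routines available in common libraries; a minimal sketch using `scipy.ndimage` (the example image and the 3×3 structuring element are illustrative only, not taken from the paper):

```python
import numpy as np
from scipy.ndimage import grey_erosion, grey_dilation

def opening(I, s):
    """I ◦ s = (I ⊖ s) ⊕ s: erosion followed by dilation."""
    return grey_dilation(grey_erosion(I, footprint=s), footprint=s)

def closing(I, s):
    """I • s = (I ⊕ s) ⊖ s: dilation followed by erosion."""
    return grey_erosion(grey_dilation(I, footprint=s), footprint=s)

# A 5x5 image with a single bright peak on a plateau (illustrative).
I = np.array([[0, 0, 0, 0, 0],
              [0, 5, 5, 5, 0],
              [0, 5, 9, 5, 0],
              [0, 5, 5, 5, 0],
              [0, 0, 0, 0, 0]], dtype=float)
s = np.ones((3, 3), dtype=bool)  # 3x3 square structuring element

# Opening is anti-extensive: the peak, too small to contain the 3x3
# element, is flattened down to the plateau level (9 -> 5).
assert opening(I, s)[2, 2] == 5.0
```

Dually, closing fills dark pits narrower than the structuring element, which is exactly the property the trench search in Section 3.2 exploits.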

3.1 Preprocessing

Before crack signatures are extracted from the data, preprocessing is performed.


Figure 3: (a) Side view example of occlusion: Surface information from the crossed-out region cannot reach the sensor, resulting in occluded (unknown) 3D profile data such as the white pixels around the scales shown in (b).

Identifying the slab region

The actual slab region in the data is identified so that we do not search for cracks outside the slab.

Compensating for slope

The slope of the slab is compensated for by subtracting a least-squares fitted plane from the data, using a uniformly randomly distributed set of 15 000 sample points (1% of the pixels) for the least-squares fit. We here assume that each 150×100 mm part of the slab surface can be approximated by a plane.

Handling occluded data

Occluded regions in the measured data, caused by other parts of the surface blocking the path between the projected laser line and the sensor (see Fig. 3), are potential crack indicators and therefore set to the minimum height value of the currently processed 150×100 mm region.

Removing noise

Noise in the data is reduced by median filtering, using a centered neighborhood of size 0.3×0.3 mm.
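The preprocessing chain above (slope compensation by a least-squares plane fit on random samples, occlusion filling with the region minimum, median filtering) can be sketched as follows. The function name, the use of NaN to mark occluded pixels, and the 3×3 pixel window (0.3×0.3 mm at 0.1 mm/pixel) are assumptions for illustration; the paper does not specify an implementation:

```python
import numpy as np
from scipy.ndimage import median_filter

def preprocess(Z, n_samples=15000, seed=0):
    """Sketch of Section 3.1: Z is a 2D height map, NaN = occluded pixel."""
    rng = np.random.default_rng(seed)
    h, w = Z.shape
    # Sample valid pixels for the least-squares fit of z = a*x + b*y + c.
    valid = np.flatnonzero(~np.isnan(Z).ravel())
    idx = rng.choice(valid, size=min(n_samples, valid.size), replace=False)
    ys, xs = np.unravel_index(idx, Z.shape)
    A = np.column_stack([xs, ys, np.ones_like(xs)])
    coeff, *_ = np.linalg.lstsq(A, Z[ys, xs], rcond=None)
    # Slope compensation: subtract the fitted plane.
    X, Y = np.meshgrid(np.arange(w), np.arange(h))
    Z = Z - (coeff[0] * X + coeff[1] * Y + coeff[2])
    # Occluded pixels are potential crack indicators: set to the minimum.
    Z = np.where(np.isnan(Z), np.nanmin(Z), Z)
    # Median filtering over a 3x3 pixel neighborhood.
    return median_filter(Z, size=3)
```

On exactly planar input the output is (near) zero everywhere, confirming the slope compensation.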

3.2 Trench search (Low Resolution)

In order to reduce the total required processing time, an initial low resolution search for surface cavities is performed on a downsampled version of the preprocessed data set (see Fig. 4a). In the downsampling process, the data is converted to 1×1 mm resolution using a Gaussian filter before sampling (to be fully accurate, the new scale is 1.1×1.1 mm in order to use an odd-sized neighborhood, but for simplicity we write 1 mm in this text). This resolution is too low to capture a crack in detail, but the longitudinal trench in the data indicating a potential crack is clearly visible. Such trenches can be identified by morphological processing, so that only regions in their proximity need to be processed further.
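The downsampling step can be sketched as below; the choice of Gaussian sigma and the implementation via simple slicing are assumptions, not taken from the paper:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def downsample(Z, factor=11):
    """Gaussian smoothing as an anti-aliasing filter, then sampling every
    `factor`-th pixel. factor=11 takes 0.1 mm/pixel data to 1.1 mm/pixel
    (written "1 mm" in the text); sigma = factor/2 is an assumption."""
    return gaussian_filter(Z, sigma=factor / 2.0)[::factor, ::factor]
```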

Estimating trench depth

Let I1 denote the downsampled preprocessed 3D data (Fig. 4a) and define lh,5mm and lv,5mm as a horizontal and a vertical line structuring element, respectively, of length 5 mm. The difference between the minimum heights to which the two structuring elements can be pushed down into the preprocessed steel surface,

Iraw = (I1 • lh,5mm) − (I1 • lv,5mm),    (1)

then yields a first approximation Iraw of the trench depth, as presented in Fig. 4b. The operation separates longitudinal features from transversal ones, i.e. cracks from oscillation marks, by assigning them different signs (longitudinal trench depths being positive). The raw extracted signal may be affected by oscillation marks, but a more refined trench signature,

Itrench = Iraw • lv,5mm,    (2)

can be obtained by performing another morphological closing with the 5 mm vertical line structuring element on the data (Fig. 4c).
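Eqs. (1) and (2) amount to grayscale closings with flat line structuring elements. A sketch using `scipy.ndimage.grey_closing`, with lengths in pixels (1 px ≈ 1 mm at the downsampled resolution; the footprints are our encoding of the line elements):

```python
import numpy as np
from scipy import ndimage

def trench_signature(I1, length=5):
    """Sketch of Eqs. (1)-(2): difference between horizontal and
    vertical line closings, then another vertical closing. Longitudinal
    (vertical) trenches come out positive, transversal features negative."""
    l_h = np.ones((1, length))  # horizontal line structuring element
    l_v = np.ones((length, 1))  # vertical line structuring element
    I_raw = (ndimage.grey_closing(I1, footprint=l_h)
             - ndimage.grey_closing(I1, footprint=l_v))
    return ndimage.grey_closing(I_raw, footprint=l_v)
```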

Extracting trench markers

By thresholding Itrench, keeping only the largest 1% of the positive trench depths, a binary marker Btop can be obtained. A final binary trench map

Btrenches = Btop rec⊕8 (Btop ◦ lv,10mm),    (3)

containing regions of at least 10 mm (longitudinal) length connected to these largest trench depths, can then be retrieved from Btop through opening by reconstruction. The resulting binary image is presented in Fig. 4d.
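The binary opening by reconstruction in Eq. (3) can be sketched with `scipy.ndimage.binary_propagation`, which grows a seed image inside a mask; pixel lengths and the vertical-line footprint are illustrative:

```python
import numpy as np
from scipy import ndimage

def opening_by_reconstruction(B, length=10):
    """Sketch of Eq. (3): open with a vertical line, then reconstruct
    (8-connectivity) within the original marker image. Keeps only
    components containing a vertical run of at least `length` pixels."""
    seed = ndimage.binary_opening(B, structure=np.ones((length, 1)))
    conn8 = np.ones((3, 3), dtype=bool)  # 8-connected neighborhood
    return ndimage.binary_propagation(seed, structure=conn8, mask=B)
```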

3.3 Scale exclusion (Medium Resolution)

Scales constitute a brittle oxidized top layer covering the casted steel (see Fig. 3b). This top layer is often cracked; in particular where scales have separated from the steel surface.


Figure 4: The different images obtained during the trench search: I1 (a), Iraw (b), Itrench (c), and Btrenches (d).

Hence, the measured data is likely to contain pseudo-defects in the form of cracks in the scales which do not reach down into the actual steel surface below. Cracks will later be identified by considering local topology variations. At that stage, a scale crack will cause a signature very similar (or even identical) to that of a crack in the steel. Therefore, scale regions must be excluded from further processing to avoid false crack detection. Scales do not require a 0.1×0.1 mm resolution, and the procedure for excluding them is therefore performed at 0.3×0.3 mm resolution in order to reduce the number of required computations.

Identifying potential scales in one direction

Let I0.3 (Fig. 5a) denote a downsampled region around a surface trench identified in Btrenches (Fig. 4d). Then let lθ,5mm and Lθ,25mm denote lines of lengths 5 mm and 25 mm, respectively, in the same direction θ. The closing

Ic = I0.3 • lθ,5mm    (4)

Figure 5: The different images obtained during the scale exclusion: I0.3 (a), Idiff(45°) (b), Bscales (c), Bscales,e (d), Irec (e), and Bcav (f).

first fills in any small gaps (less than 5 mm) in the data in the direction θ. A signature for (possibly cracked) potential scales less than 25 mm in width, as viewed from the direction θ, can then be retrieved from the top-hat operation

Idiff (θ) = Ic − Ic ◦ Lθ,25mm. (5)

These identified potential scales, shown for θ = 45° in Fig. 5b, should not correspond to crack regions due to the local surface topology.

Combining several directions

A single direction may give a crude result, but by combining the results from several directions this can be improved. By letting θ = {−45°, 0°, 45°, 90°}, where the angles are denoted with respect to the x-axis (the transversal direction), a refined scale marker Bscales (Fig. 5c), representing only potential scales showing up in all four angles, can be retrieved from the expression

Bscales = ⋂_{k=1}^{4} ( Idiff(θk) > 0 ).    (6)
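Eqs. (4)–(6) can be sketched as below. The line footprints are rough rasterizations of oriented lines, and the default pixel lengths are kept small for clarity (at the 0.3 mm grid, 5 mm and 25 mm would be about 17 and 83 pixels); none of these choices come from the paper:

```python
import numpy as np
from scipy import ndimage

def line_footprint(length, theta_deg):
    """Approximate binary line footprint of a given pixel length and
    orientation (only the four angles used in the paper are needed)."""
    half = length // 2
    t = np.deg2rad(theta_deg)
    pts = {(int(round(r * np.sin(t))), int(round(r * np.cos(t))))
           for r in range(-half, half + 1)}
    rows = [p[0] for p in pts]
    cols = [p[1] for p in pts]
    r0, c0 = min(rows), min(cols)
    fp = np.zeros((max(rows) - r0 + 1, max(cols) - c0 + 1), dtype=bool)
    for r, c in pts:
        fp[r - r0, c - c0] = True
    return fp

def scale_marker(I, angles=(-45, 0, 45, 90), short=5, long_=25):
    """Sketch of Eqs. (4)-(6): per-direction closing with a short line,
    top-hat against an opening with a long line, intersected over all
    four directions."""
    marker = np.ones(I.shape, dtype=bool)
    for th in angles:
        Ic = ndimage.grey_closing(I, footprint=line_footprint(short, th))
        Idiff = Ic - ndimage.grey_opening(Ic, footprint=line_footprint(long_, th))
        marker &= Idiff > 0
    return marker
```

A small raised patch that is narrow in every direction (a scale-like structure) is flagged, while flat background is not.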

The marker image is then eroded by a disk dr=1mm with radius 1 mm,

Bscales,e = Bscales ⊖ dr=1mm,    (7)

which removes noise and shrinks the sizes of the obtained marker regions. This reduces the risk of noisy markers stretching into surface cavities (compare the original marker in Fig. 5c to the eroded marker in Fig. 5d).

Improving markers by reconstruction

From the resulting binary marker Bscales, e (Fig. 5d), a marker Imarker for gray-scale data can be constructed by letting

Imarker = I0.3  where Bscales,e = 1,
          0     where Bscales,e = 0.    (8)

A morphological reconstruction on the inverse data set, as given by

Irec = − ( (−I0.3) rec⊕8 (−Imarker) ),    (9)

then yields a surface where (possibly cracked) scales remain while cracks in the steel are filled in (Fig. 5e). A filter identifying cavities located where trenches were found in the low resolution step (section 3.2) can then be obtained by performing an opening by reconstruction on the filled-in regions, using the trench marker Btrenches (Fig. 4d) from (3). This results in a binary filter

Bcav = (Irec > I0.3) rec⊕8 Btrenches,    (10)

where cavities in the steel surface, as opposed to scales, are identified (Fig. 5f). This filter can then be used to exclude scales and other non-cavities from further crack analysis.
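A plain-NumPy sketch of the grayscale reconstruction operator used in Eqs. (9)–(10): iterative geodesic dilation under a mask with an 8-connected footprint, repeated until stability (`skimage.morphology.reconstruction` would be an optimized alternative):

```python
import numpy as np
from scipy import ndimage

def grayscale_reconstruct(marker, mask):
    """Morphological reconstruction by dilation under `mask`
    (8-connectivity): dilate the marker, clamp it by the mask, and
    repeat until nothing changes."""
    se = np.ones((3, 3))
    rec = np.minimum(marker, mask)
    while True:
        nxt = np.minimum(ndimage.grey_dilation(rec, footprint=se), mask)
        if np.array_equal(nxt, rec):
            return rec
        rec = nxt
```

With this helper, Eq. (9) would read `Irec = -grayscale_reconstruct(-Imarker, -I03)` for a height map `I03` and the marker from Eq. (8) (variable names are ours).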

3.4 Crack signature extraction (High Resolution)

In non-scale regions where longitudinal cavities were found, a more detailed investigation is needed. Fig. 6a shows an example of data processed in this step, I0.1.

Identifying sharp discontinuities

We restrict our attention to sharp discontinuities in the data in the horizontal (transversal) direction. This is done by retrieving a binary filter

Bdiscont = I0.1 • lh,0.3mm > I0.1  where Bcav = 1,
           0                       where Bcav = 0,    (11)

where lh,0.3mm is a horizontal line structuring element of length 0.3 mm (Fig. 6b). Bcav (Fig. 5f) is the cavity filter obtained from (10).

Figure 6: The different images obtained during the crack signature extraction: I0.1 (a), Bdiscont (b), Idepth (c), Bcracks (d), and BL,10 (e).

Depth estimation

The depths Idepth of horizontal discontinuities in the data are then approximated as

Idepth = I0.1 • lh,5mm − I0.1  where Bdiscont = 1,
         0                     where Bdiscont = 0,    (12)

where lh,5mm is a horizontal line structuring element of length 5 mm. Hence, the crack depth is defined as the difference between the actual data value and the minimum height to which a horizontal line of length 5 mm can be pushed down into the data. The resulting image is shown in Fig. 6c.
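Eq. (12) in code form (a sketch: `length` is the line length in pixels, which would be 50 px for 5 mm at 0.1 mm/pixel; the discontinuity mask would come from Eq. (11)):

```python
import numpy as np
from scipy import ndimage

def crack_depth(I, B_discont, length=5):
    """Sketch of Eq. (12): depth = horizontal line closing minus data,
    evaluated only where a sharp discontinuity was flagged."""
    depth = ndimage.grey_closing(I, footprint=np.ones((1, length))) - I
    return np.where(B_discont, depth, 0.0)
```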

Obtaining crack signatures

From the approximated crack depth Idepth, we define potential crack signatures Bcracks as

Bcracks = Idepth > 0.1,    (13)

neglecting depths smaller than 0.1 mm (see Fig. 6d).

Linking crack signatures

Neighboring remaining signatures are linked together while unconnected signatures are discarded, using a morphological filter given by the recursive expression

BL,k = Bcracks,                                    k = 0,
       (BL,k−1 • dk) rec⊕8 ((BL,k−1 • dk) ◦ lk),   k > 0.    (14)

Here, dk is a disk structuring element of radius k mm and lk a vertical line structuring element of length 2k + 0.1 mm, where k = {0.1, 0.2, 0.3, ..., 1.0}. The filter is a modified close/open alternating filter. Instead of using the same element for both the closing and the opening, a disk is used to close up parts that are not perfectly vertically aligned, while a vertical line is used to restrict the filter output to vertical features. An opening by reconstruction in each step makes the filter more conservative, keeping more of the grouped regions in each step. This linking produces a segmented data set, where each connected component marks a region corresponding to a potential crack in the steel surface (see Fig. 6e).
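The linking filter of Eq. (14) can be sketched with binary closings, openings, and reconstruction by propagation. The radii and lengths below are in pixels and purely illustrative; the paper's k runs over 0.1–1.0 mm:

```python
import numpy as np
from scipy import ndimage

def disk(radius):
    """Boolean disk footprint of the given pixel radius."""
    r = int(radius)
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    return x * x + y * y <= r * r

def link_signatures(B, ks=(1, 2, 3)):
    """Sketch of Eq. (14): at each step, close with a growing disk,
    open with a growing vertical line, and keep (by reconstruction)
    the closed components that survive the opening."""
    conn8 = np.ones((3, 3), dtype=bool)
    BL = B
    for k in ks:
        closed = ndimage.binary_closing(BL, structure=disk(k))
        opened = ndimage.binary_opening(closed, structure=np.ones((2 * k + 1, 1)))
        BL = ndimage.binary_propagation(opened, structure=conn8, mask=closed)
    return BL
```

Two nearly-aligned vertical segments are merged into one component, while an isolated pixel with no vertical extent is discarded.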

Excluding small regions

After the linking procedure, small regions are discarded by an area opening of size 5 mm². Connected components in the remaining signatures are reported as potential crack segments.
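The area opening can be sketched via connected-component labeling. At 0.1×0.1 mm per pixel, 5 mm² corresponds to 500 pixels; the small threshold in the test is for illustration only:

```python
import numpy as np
from scipy import ndimage

def area_opening(B, min_area):
    """Remove 8-connected components smaller than `min_area` pixels
    (a sketch of the area opening; the paper uses 5 mm^2)."""
    labels, _ = ndimage.label(B, structure=np.ones((3, 3)))
    areas = np.bincount(labels.ravel())
    areas[0] = 0  # background is never kept
    return areas[labels] >= min_area
```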

Figure 7: Ellipses fitted to the connected regions reported by the segmentation.

4 Classification

Once the data has been segmented, shape features are extracted for each reported connected component. These parameters can then be used to classify the potential cracks as (true) cracks or non-cracks. The potential cracks reported by the segmentation were first manually classified, producing a reference set where each connected region is marked as a crack, a non-crack, or unclassified (ambiguous regions where cracks are likely although no open crack is visible). This classification was based upon resemblance to the identified cracks in the model set (see section 2).

4.1 Variable extraction and orientation thresholding

The depth of each connected region is reflected by its median depth, retrieved from (12). Since the region obtained from the morphological filtering is in general wider than the actual thin crack, we only consider pixels where the estimated depth is non-zero. Other shape parameters are obtained by approximating each connected region by an ellipse with the same second central moments (see Fig. 7). The lengths of the minor and major axes can then be considered to reflect the width and length of the crack, and the orientation is defined by the minimum absolute angle between the major axis and the x-axis (the transversal direction). We exclude all connected regions with orientation less than 45° from further analysis, since these can hardly be considered longitudinal. Box plots for the obtained variable distributions for set A, within the manually defined groups cracks and non-cracks, are presented in Fig. 8.
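The moment-based ellipse fit can be computed directly from the eigendecomposition of the region's second central moments (equivalent to what e.g. `skimage.measure.regionprops` reports; the 4·√λ axis-length convention assumes a solid ellipse and is our choice of normalization):

```python
import numpy as np

def ellipse_params(mask):
    """Ellipse with the same second central moments as a binary region.
    Returns (major, minor, orientation_deg), with the orientation as the
    absolute angle between the major axis and the x-axis."""
    y, x = np.nonzero(mask)
    x = x - x.mean()
    y = y - y.mean()
    cov = np.cov(np.stack([x, y]))          # second central moments
    evals, evecs = np.linalg.eigh(cov)      # ascending eigenvalues
    minor, major = 4.0 * np.sqrt(np.maximum(evals, 0))
    vx, vy = evecs[:, 1]                    # major-axis direction
    orientation = np.degrees(np.arctan2(abs(vy), abs(vx)))
    return major, minor, orientation
```

A thin vertical region yields an orientation of 90°, i.e. clearly longitudinal under the 45° threshold above.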

Orientation

The orientation box plot in Fig. 8a shows that the identified cracks indeed show a longitudinal behavior, and the orientation threshold could probably be raised. In this work, however, the number of available samples was already considered scarce enough from a statistical point of view.

Figure 8: Box plots for studied variables in the model set; orientation (a), minor axis length (b), major axis length (c), and median depth (d), respectively. (Groups: cracks, 30 samples; non-cracks, 37 samples.)

Minor axis

In Fig. 8b, we see that cracks in the model set are in general wider than other reported regions. This is considered to be a result of frequently occurring zig-zag patterns in the cracks in set A – a property we can hardly assume to be true in general. Thus, the minor axis is considered unsuitable for classification.

Major axis & median depth

The major axis and the median depth (Figs. 8c and 8d) show separation between the groups, indicating that the segmentation succeeds in finding long, deep regions where cracks have been manually identified. These variables are therefore considered suitable for classification of the crack signatures.

4.2 Logistic regression

In Fig. 9a, the manually classified regions are marked in the space spanned by the two variables selected for classification: the major axis and the median depth. While the number of regions containing cracks is not very high, we see that there is indeed a tendency for crack regions to be longer and deeper than the non-crack noise from the segmentation. In this space, a statistical classifier was then obtained by fitting a logistic regression model to the manually classified groups containing cracks and non-cracks. This method was chosen because it does not assume normally distributed data [32, 33], which makes it a suitable choice for the available model data, where crack samples are quite sparsely distributed within the upper right part of Fig. 9a.

Figure 9: The two-dimensional classification space (a) and the corresponding CDFs (b). In both figures, lines marking 10%, 50% and 90% crack probabilities (seen from left to right) are displayed.

Posterior probability

Logistic regression can be used to estimate the posterior probability for each sample to belong to a certain group. More specifically, an n-variable logistic regression model is defined by the parameters {β0, β1, β2, ..., βn}, which for each sample represented by the variables {x1, x2, ..., xn} yields a posterior probability given by

Pposterior = 1 / (1 + e^(−(β0 + Σ_{k=1}^{n} βk xk))).    (15)

The interested reader is referred to Afifi [33] and Dobson [34] for more details on the topic of logistic regression.
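Eq. (15) in code form; the coefficient values in any example call are illustrative only, since the paper does not list its fitted β values:

```python
import numpy as np

def posterior(beta, x):
    """Eq. (15): posterior probability from an n-variable logistic
    regression model with parameters beta = (beta0, ..., betan) and
    sample variables x = (x1, ..., xn), e.g. major axis and median depth."""
    beta = np.asarray(beta, dtype=float)
    x = np.asarray(x, dtype=float)
    z = beta[0] + np.dot(beta[1:], x)
    return 1.0 / (1.0 + np.exp(-z))
```

With all-zero coefficients the posterior is exactly 0.5; longer, deeper regions receive higher crack probability for positive slope coefficients.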

Probability threshold

Cumulative distribution functions (CDFs) representing the posterior probabilities, given for each classified region by (15), are presented in Fig. 9b. The solid lines represent the CDFs in percentage of the total number of regions for each manually classified group, while the dashed lines represent the CDFs in total length percentage (total detected length of cracks vs. total length of manually marked cracks) for each manually classified group. A probability threshold, which corresponds to a boundary line in Fig. 9a, can be used to identify cracks in the data. It should be noted that the classifier is blind to what the segmentation fails to report. Also, as mentioned, regions with orientation less than 45° have already been discarded at this stage. By selecting a probability threshold higher than 50% we can avoid the misclassification risk near the 50% boundary, which will make the system more robust. This threshold should be a trade-off between 100% crack detection and 0% false positives. Since our primary focus in this work is on the latter, we want a high threshold value which still classifies most identified cracks correctly. Fig. 9b shows that we can set a probability requirement as high as 90% and still keep 93% of the total length of connected regions containing manually identified cracks, while safely avoiding any false positives. However, the unclassified samples must be taken into account as well. We can get worst-case scenarios by considering the two extremes:

1. all unclassified regions are assumed to be non-cracks, and

2. all unclassified regions are assumed to be cracks.

In case 1, the false detection rate is 21% of the total non-crack length, or 116 mm of falsely reported cracks. In case 2, on the other hand, 81% of the total crack length is correctly classified. The truth should lie in between these two extremes. The 21% false detection rate given by case 1 should be quite pessimistic, since the vast majority of the unclassified regions are located in the proximity of manually identified cracks, but it cannot be ruled out at this point.

Figure 10: Box plots for studied variables; orientation (a), minor axis length (b), major axis length (c), and median depth (d), respectively. Results for set A are presented again for comparison. (Groups: set A, 30 cracks and 37 non-cracks; set B, 24 cracks and 32 non-cracks.)

5 Results

The performance of the presented method was validated by segmenting and classifying a second independent validation set B, containing 323 regions of 3D surface data collected from four different steel slabs.

5.1 Segmentation results

The connected regions retrieved from the segmentation included 82% of the total length of the manually marked cracks. Correct crack signatures were extracted from all 17 images manually marked as containing cracks. Descriptive shape parameters were then extracted from each detected region. Box plots are presented in Fig. 10, together with the previous results for set A for comparison. As before, all regions with less than 45° absolute orientation relative to the x-axis have at this stage been discarded. Variable distributions for set B validate the earlier observations for set A (Section 4.1). In particular, the problem with using the minor axis for classification is evident – crack widths in set B are in general smaller than in set A, and to a large extent overlap the widths of the non-crack regions.

5.2 Classification results

The extracted variables were used as input to the logistic regression model obtained from set A, and the resulting classification was compared to a manual classification. Results are presented in Fig. 11. In Fig. 11a, two dashed lines marking 10% and 90% crack probability, respectively, as well as a solid line representing the 50% probability boundary, are shown in the 2D space spanned by the major axis and the median depth. Figure 11b shows the corresponding cumulative distribution functions (CDFs) for the region-wise posterior probabilities, presented in both number of regions and length percentage. By comparing CDFs for crack and non-crack regions, the separation tendency for the two groups can be evaluated. Probabilities are obtained from the logistic regression model, and are thereby based on the model data. It is clear that at the 90% crack probability level, suggested in section 4, the classification of the validation set is as accurate as for the model set. 83% of the number of segmented crack regions, or 94% of the total length of segments containing cracks, are correctly classified. Still, no false positives are reported for the manually classified non-cracks at that probability level. Taking the unclassified regions into account by evaluating worst-case scenarios, as described previously in section 4.2 under Probability threshold, results in less than 23% false positives (131 mm) and at least 83% correctly classified cracks (with respect to total length). Here, as well as for the model set, only 2 regions containing cracks are completely missed. As for the model set, the missed regions contain only short cracks.

5.3 Combined results

The segmentation successfully reports 527/645 mm (82%) of the manually marked cracks, while the classification identifies 490/528 mm (93%) of these segmented regions as cracks. This evaluation of the classification differs from the 94% stated in the previous section, since we do not here consider parts of the linked connected regions that do not overlap the manually marked visible crack segments. In total, 490 mm of the total 645 mm manually marked crack length were correctly classified, yielding a success rate (in length percentage) of 76%. The corresponding number for the model data is 73%. Worst-case scenario evaluation of the complete system is less trivial. A reference for non-crack and unclassified regions is hard to retrieve, since these regions result from the segmentation and cannot be manually marked before that point. What we can say, however, is that less than 131 mm of crack length is falsely identified. This number can be compared to the total 30 600 mm long sequence of 150 mm wide sections of data present in the validation data set.

6 Discussion

While the amount of collected 3D data could certainly be increased in a future study, the results show clear tendencies for the available surface profiles:

• The presented segmentation algorithm successfully extracts more than 80% of the total crack length present in the data, while discarding most potential pseudo-defects (non-defect surface features similar to defects).


Figure 11: Classification of the validation data (set B), in the two-dimensional classification space (a) and in CDF-format (b). In both figures, lines marking 10%, 50% and 90% crack probabilities (seen from left to right) are displayed.

• At a 90% probability level, the statistical classification individually has a crack detection accuracy of over 90% with respect to the total manually identified crack length, while discarding all remaining manually identified pseudo-defects. Taking ambiguous, unclassified regions into account gives a worst-case detection rate of over 80% and a worst-case false detection rate of 23% (corresponding to a length of 131 mm).

• The combined system (segmentation and classification) detection success rate, with respect to the total length of manually marked distinct open cracks, is over 70%.

• Only a few of the regions containing cracks are completely missed (a crack region may be identified even though all crack length within the region is not detected) and, most importantly,

• no false positives are reported within the manually classified data, and false detection in total is less than 131 mm among the 30 600 mm long sequence of 150 mm wide regions in the validation set.

These numbers indicate that cracks can to a large extent be separated from non-crack data, and that an accurate automated crack-detection system based on the presented method is feasible. Some cracks – predominantly smaller ones – will likely remain undetected as a result of the trade-off between 100% detection and 0% false positives, but longer cracks can be quite safely identified. It is therefore important to link crack segments together whenever possible, so that they get long enough to be classified correctly. A potential method for achieving this is the Hough transform, which has not been used in this work. The crack probability measure allows for a more continuous crack resemblance assessment for each region, rather than a strict binary classification. This property can be used to avoid false positives. By thresholding at a higher crack probability value, false positives can be avoided at the cost of successful classification of true cracks. Tuning this threshold with respect to associated costs can then be considered as an optimization problem. The presented method relies on a number of parameters, which are set experimentally (and with respect to the actual physical measures they represent).
A completely adaptive parameter selection would hardly be useful, due to the underlying physical quantities the numbers represent, but it may be possible to set them more systematically. In particular, the impact on these parameters from different types of steel grades should be further investigated. It may also be possible to reduce the number of parameters by introducing other methods such as watershed segmentation or the Hough transform, but this lies beyond the scope of this work.

7 Conclusion

The presented system provides a crack probability measure for each detected potential crack. We have shown that the data can be classified at 90% crack probability, resulting in less than 131 mm of the 30 600 mm long sequence of 150 mm wide regions in the validation set being detected as cracks without being manually identified as such. At this crack probability level, over 70% of the manually identified crack length was still successfully detected. No cracks were falsely detected in regions where manual identification completely ruled out the existence of cracks.

More data would allow for a more thorough statistical investigation of potential classification variables as well as a more accurate crack probability estimation, but the presented results clearly show the potential of the method. This work thereby provides a starting point for further development of the system into a fully automated morphology-based on-line crack detection installation.

8 Future work

The algorithm generally succeeds in identifying relatively long connected components where cracks are present. This makes the length an important variable in the classification, and cracks should therefore be linked together as much as possible so that they can be distinguished from noise. For this purpose, use of the Hough transform should be investigated. Combining data from several images, thereby providing more high-level knowledge, would also allow for longer estimated crack lengths and thereby even more extreme major axes for regions representing very long cracks. In addition to the Hough transform, watershed segmentation should also be considered in a future study – with focus on seed selection and how over-segmentation can be effectively reduced. The handling of missing data, by setting such pixels to the minimum value of the 150×100 mm data being processed, did not affect classification results but is quite crude and could certainly be improved. The authors have previously considered reconstruction of occluded regions in 3D profile data of rocks [35], but such methods are usually quite computationally demanding and thereby hard to implement in an on-line system. A more locally adapted reconstruction of missing data should make the segmentation more robust, though, and could be worth looking into further. Finally, more extensive measurements on different types of steel would provide more statistical data and allow for a more thorough investigation of the relation between steel grades and the parameters used, thereby resulting in a more accurate crack probability estimation.

Acknowledgment

The authors would like to thank ProcessIT Innovations for all their invested time and effort. We also thank SSAB for their participation and support. Finally, we thank our measurement technology partners, Kemi Technical University of Applied Sciences (KTUAS). In addition, we acknowledge the EU INTERREG IVA Nord program for partially supporting this research.

References

[1] “World steel in figures,” World Steel Association, Brussels, Belgium, Tech. Rep., 2011. [Online]. Available: www.worldsteel.org

[2] R. Mahapatra, J. Brimacombe, I. Samarasekera, N. Walker, E. Paterson, and J. Young, “Mold behavior and its influence on quality in the continuous casting of steel slabs: Part I. Industrial trials, mold temperature measurements, and mathematical modeling,” Metallurgical and Materials Transactions B, vol. 22, pp. 861–874, 1991, 10.1007/BF02651163.

[3] X. Li, S. Tso, X. Guan, and Q. Huang, “Improving automatic detection of defects in castings by applying wavelet technique,” IEEE Transactions on Industrial Electronics, vol. 53, no. 6, pp. 1927–1934, 2006.

[4] J. Yun, S. Choi, B. Seo, C. Park, and S. Kim, “Defects Detection of Billet Surface Using Optimized Gabor Filters,” in Proceedings of the 17th IFAC World Congress. The International Federation of Automatic Control, 2008.

[5] D. Lee, Y. Kang, C. Park, and S. Won, “Defect Detection Algorithm in Steel Billets using Morphological Top-Hat Filter,” in IFAC Workshop on Automation in Mining, Mineral and Metal Industry (IFACMMM), 2009.

[6] R. T. Chin and C. A. Harlow, “Automated visual inspection: A survey,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. PAMI-4, no. 6, pp. 557–573, Nov. 1982.

[7] R. Chin, “Automated visual inspection: 1981 to 1987,” Computer Vision, Graphics, and Image Processing, vol. 41, no. 3, pp. 346–381, 1988.

[8] E. Bayro-Corrochano, “Review of automated visual inspection 1983-1993, Part I: conventional approaches,” in Proceedings of SPIE, vol. 2055, 1993, p. 128.

[9] E. Bayro-Corrochano, “Review of automated visual inspection 1983-1993, Part II: approaches to intelligent systems,” in Proceedings of SPIE, vol. 2055, 1993, p. 159.

[10] T. Newman and A. Jain, “A survey of automated visual inspection,” Computer Vision and Image Understanding, vol. 61, no. 2, pp. 231–262, 1995.

[11] E. Malamas, E. Petrakis, M. Zervakis, L. Petit, and J. Legat, “A survey on industrial vision systems, applications and tools,” Image and Vision Computing, vol. 21, no. 2, pp. 171–188, 2003.

[12] J. Yun, S. Choi, Y. Jeon, D. Choi, and S. Kim, “Detection of line defects in steel billets using undecimated wavelet transform,” in International Conference on Control, Automation and Systems, 2008. IEEE, 2008, pp. 1725–1728.

[13] Y. Jeon, J. Yun, D. Choi, and S. Kim, “Defect detection algorithm for corner cracks in steel billet using discrete wavelet transform,” in ICCAS-SICE, 2009. IEEE, 2009, pp. 2769–2773.

[14] M. Yazdchi, A. Mahyari, and A. Nazeri, “Detection and classification of surface defects of cold rolling mill steel using morphology and neural network,” in International Conference on Computational Intelligence for Modelling Control & Automation, 2008. IEEE, 2008, pp. 1071–1076.

[15] F. Pernkopf, “3D surface acquisition and reconstruction for inspection of raw steel products,” Computers in Industry, vol. 56, no. 8-9, pp. 876–885, 2005.

[16] F. Pernkopf and P. O’Leary, “Image acquisition techniques for automatic visual inspection of metallic surfaces,” NDT & E International, vol. 36, no. 8, pp. 609–617, 2003.

[17] I. Alvarez, J. Marina, J. Enguita, C. Fraga, and R. Garcia, “Industrial online surface defects detection in continuous casting hot slabs,” in Proceedings of SPIE, vol. 7389, 2009, p. 73891X.

[18] B. Thomas, “On-line Detection of Quality Problems in Continuous Casting of Steel,” in Proceedings of the International Symposium on Process Control and Optimization in Ferrous and Nonferrous Industry, TMS, Warrendale, PA, (Chicago, IL), 2003.

[19] P. Meilland, “Novel Multiplexed Eddy-Current Array for Surface Crack Detection on Rough Steel Surface,” Proc. 9th ECNDT, Berlin, 2006.

[20] T. Nishimine, O. Tsuyama, T. Tanaka, and H. Fujiwara, “Automatic magnetic particle testing system for square billets,” in Conference Record of the 1995 IEEE Industry Applications Conference, vol. 2. IEEE, 2002, pp. 1585–1590.

[21] M. Allazadeh, C. Garcia, K. Alderson, and A. Deardo, “Ultrasonic image analysis of steel slabs,” Advanced Materials & Processes, vol. 166, no. 12, pp. 26–27, 2008.

[22] J. Sirgo, R. Campo, A. Lopez, A. Diaz, and L. Sancho, “Measurement of centerline segregation in steel slabs,” in Conference Record of the 2006 IEEE Industry Applications Conference, vol. 1. IEEE, 2006, pp. 516–520.

[23] V. Rathod and R. Anand, “A comparative study of different segmentation techniques for detection of flaws in NDE weld images,” Journal of Nondestructive Evaluation, vol. 31, pp. 1–16, 2012.

[24] S. Nashat, A. Abdullah, and M. Abdullah, “A robust crack detection method for non-uniform distributions of coloured and textured image,” in 2011 IEEE International Conference on Imaging Systems and Techniques (IST), May 2011, pp. 98–103.

[25] S. Jiaxin, D. Dong, Z. Xinjie, and W. Li, “Weld slim line defects extraction based on adaptive local threshold and modified Hough transform,” in 2nd International Congress on Image and Signal Processing, 2009. CISP ’09., Oct. 2009, pp. 1–5.

[26] J. Angulo and D. Jeulin, “Stochastic watershed segmentation,” in Proceedings of ISMM, 8th International Symposium on Mathematical Morphology, 2007.

[27] T. T. Vu, F. Yamazaki, and M. Matsuoka, “Multi-scale solution for building extraction from lidar and image data,” International Journal of Applied Earth Observation and Geoinformation, vol. 11, no. 4, pp. 281–289, 2009.

[28] J. Lee, M. Smith, L. Smith, and P. Midha, “A mathematical morphology approach to image based 3D particle shape analysis,” Machine Vision and Applications, vol. 16, no. 5, pp. 282–288, 2005.

[29] M. Thurley, “Automated online measurement of limestone particle size distributions using 3D range data,” Journal of Process Control, vol. 21, no. 2, pp. 254–262, 2011.

[30] M. Thurley and T. Andersson, “An industrial 3D vision system for size measurement of iron ore green pellets using morphological image segmentation,” Minerals Engineering, vol. 21, no. 5, pp. 405–415, 2008.

[31] E. Dougherty and R. Lotufo, Hands-on Morphological Image Processing, ser. Tu- torial Texts in Optical Engineering. Bellingham, Washington, USA: SPIE - The International Society for Optical Engineering, 2003, vol. TT59.

[32] D. Johnson, Applied Multivariate Methods for Data Analysts. Pacific Grove, Cali- fornia, USA: Duxbury Press, 1998.

[33] A. Afifi, V. Clark, and S. May, Computer-aided multivariate analysis, 4th ed. Boca Raton, Florida, USA: CRC, 2004.

[34] A. Dobson, An Introduction to Generalized Linear Models, 2nd ed. Boca Raton, Florida, USA: CRC, 2002.

[35] A. Landstr¨om,F. Nellros, H. Jonsson, and M. Thurley, “Image reconstruction by pri- oritized incremental normalized convolution,” in Image Analysis, ser. Lecture Notes in Computer Science, A. Heyden and F. Kahl, Eds. Springer Berlin / Heidelberg, 2011, vol. 6688, pp. 176–185. Paper B Adaptive Morphology using Tensor-Based Elliptical Structuring Elements

Authors: Anders Landström and Matthew J. Thurley

Reformatted version of paper originally published in: Pattern Recognition Letters, 2013, vol. 34, no. 12, pp. 1416–1422.

© 2012, Elsevier. Reprinted with permission.

Adaptive Morphology using Tensor-Based Elliptical Structuring Elements

Anders Landström and Matthew J. Thurley

Abstract

Mathematical Morphology is a common strategy for non-linear filtering of image data. In its traditional form the filters used, known as structuring elements, have constant shape once set. Such rigid structuring elements are excellent for detecting patterns of a specific shape, but risk destroying valuable information in the data as they do not adapt in any way to its structure.

We present a novel method for adaptive morphological filtering where the local structure tensor, a well-known method for estimation of structure within image data, is used to construct adaptive elliptical structuring elements which vary from pixel to pixel depending on the local image structure. More specifically, their shape varies from lines in regions of strong single-directional characteristics to disks at locations where the data has no prevalent direction.

1 Introduction

1.1 Background

Mathematical morphology, originally developed by Matheron [1] and Serra [2], is a powerful method for filtering highly non-linear image data. It is based on kernels called structuring elements, which are used to probe the image by considering pixel values within the resulting neighborhoods. More specifically, the two basic operations erosion and dilation extract the minimum or maximum value, respectively, within the neighborhood defined by the structuring element.

Traditional mathematical morphology uses one single user-defined structuring element for the whole image. This strategy enables efficient implementation and is very useful for detecting objects of a certain size and shape, but selecting a suitable structuring element becomes more challenging when the sizes and shapes of the objects in the image vary. The challenge posed by objects of different sizes can be handled by iteratively processing the image using a set of differently sized structuring elements, as is done e.g. when calculating morphological granulometries [1] or the ultimate opening [3], but this requires that the whole image is processed multiple times. Consequently, there is an ongoing interest in more adaptive approaches to mathematical morphology, largely based on theory originally introduced by Serra [4].


As pointed out by Roerdink [5], there is ambiguity in the literature regarding the terms associated with adaptive or spatially-variant morphology. The terms are sometimes used interchangeably, but can also be used to separate between different approaches. Roerdink identifies two major categories of generalizations of classical mathematical morphology:

1. Group morphology, where translation invariance is replaced by other forms of invariance, and

2. Adaptive morphology, where structuring elements depend on position or the input image.

Regarding group morphology, Roerdink [6, 7] provides a framework for extending morphological operations to other invariant mathematical groups such as motion-invariant gray-scale operators and gray-scale operators on the sphere. Within adaptive morphology, Roerdink [5] defines two separate forms of adaptivity: (a) letting the structuring element depend on the location in the image only (location-adaptive morphology), and (b) letting the structuring element depend on the actual image values (input-adaptive morphology). In this work we will concentrate on input-adaptive morphology, setting the shapes of the structuring elements from the image structure. Once set, however, the structuring elements remain fixed for each pixel and can then be considered dependent on the location in the image.

Input-adaptive morphology for binary images was extended by Charif-Chefchaouni and Schonfeld [8] based on the work by Serra [4]. Bouaynaya and Schonfeld continued the development of these ideas by defining input-adaptive mathematical morphology for gray-scale images using the Umbra transform, extending the formal theoretical framework even further [9]. Roerdink [5] later complemented the work by Bouaynaya and Schonfeld by showing that the same structuring element must be used for both erosion and dilation at an image pixel in order to achieve adjunction – a necessary relation between the two in order to achieve the defining properties of morphological operations [10]. For more extensive background and theory on the topic of adaptive morphology, we refer the reader to Bouaynaya et al. [11, 12]. For a review of current methods within the subject, the reader is referred to Maragos and Vachier [13].

1.2 Related Work

Several approaches to input-adaptive mathematical morphology have been suggested. Shih and Cheng [14] use adaptive elliptical structuring elements for edge linking in binary data, setting the parameters of the ellipses from local path curvature. Shih and Gaddipati [15] present a framework for general sweep mathematical morphology, where structuring elements vary along a given path (a closed curve or the boundary of a closed object). Lerallut et al. [16] introduce the concept of morphological amoebas, where a weighted distance measure is used to adapt the structuring element so that its growth across strong gradients is prevented. Recently, Ćurić et al. [17] presented salience adaptive structuring elements, which are less flexible than morphological amoebas and less affected by noise.

Debayle and Pinoli [18] define structuring elements based on a similarity measure, so that all pixels within a similar (as defined by the similarity measure) connected region share the same (non-centered) structuring element. Verdú-Monedero et al. use average diffused squared gradient fields to obtain angles for line-shaped structuring elements [19–21]. An interpolation by diffusion sets angles for pixels further away from the actual directional variations (i.e. edges) within the image. Tankyevych et al. [22] use principal value analysis of the Hessian matrix of 3D voxels to obtain directions for 3D line structuring elements. To detect objects of different radii, multiple scales are used. Breuß et al. [23] use tensors to approximate angles for continuous morphology, achieved by solving partial differential equations for diffusion.

In addition to the above, there is work on efficient implementation of spatially-variant morphological operations without concern for how the per-pixel sizes are retrieved. Cuisenaire [24] presents an implementation for differently sized structuring elements shaped as balls of a selected metric, and Dokládal and Dokládalová [25] show how rectangular structuring elements of varying sizes can be applied. For the case of adjunct morphological operations, Lerallut et al. [16] and Tankyevych et al. [22] have demonstrated how such operations can be efficiently implemented for adaptive structuring elements.

1.3 Contribution

In this work, we define a framework for input-adaptive morphology, based on the Local Structure Tensor (LST), where adaptive elliptical structuring elements range from lines to disks depending on the local image structure. Capturing the eigenvalues as well as the eigenvectors of the LST within the shape of the structuring element allows the method to adapt not only to the orientation within the data, but also to its dominance (the degree of anisotropy). This is important because any deviation in the data (such as noise) will yield an orientation, even when there is no true structure in that region of the image. Allowing the structuring element shape itself to vary depending on the eigenvalues of the LST avoids a distinct geometrical bias from the structuring element itself. Fixing the shape of the structuring element to lines, for instance, is highly likely to introduce artificial lines in the filtered image, since extreme values (e.g. noise) will cause distinct straight edges to appear in the result.

The orientation angles provided by the LST are equivalent to the angles given by gradient fields [26] and have thereby been implicitly used by Verdú-Monedero et al. for setting the orientations of line structuring elements [19–21]. In their work, more certain orientation information is interpolated by diffusion into regions where the gradient values are low. However, by explicitly using the LST we can take the relation between its eigenvalues into account instead of interpolating angles into non-varying image regions. This strategy does not impose an orientation where there is no prevalent direction in the data and therefore avoids introducing a geometrical bias. Interpolation is also generally quite time-consuming, while the LST contains orientation dominance information in its eigenvalues

– which in lower dimensions can be efficiently calculated from closed-form expressions. The LST has previously been used explicitly by Breuß et al. [23] for (diffusion-based) mathematical morphology, but likewise without regard to its eigenvalues.

2 Method

2.1 The Local Structure Tensor

Let x = (x1 x2)^T denote pixel coordinates and f(x) the corresponding gray-scale value. The Local Structure Tensor (LST) T(x), representing local directional features in the data, is then given by the 2 × 2 matrix

$$\mathbf{T}(\mathbf{x}) = \begin{pmatrix} T_{x_1 x_1} & T_{x_1 x_2} \\ T_{x_1 x_2} & T_{x_2 x_2} \end{pmatrix}(\mathbf{x}) = G_\sigma * \left( \nabla f(\mathbf{x})\, \nabla^{T} f(\mathbf{x}) \right), \qquad (1)$$

where $\nabla^{T} = \left( \frac{\partial}{\partial x_1}\;\; \frac{\partial}{\partial x_2} \right)$ and Gσ denotes a Gaussian filter kernel with standard deviation σ. The resulting smoothing regularizes the matrix. In practice, we estimate the image gradient by applying standard 3×3 pixel Scharr filters operating on a slightly smoothed version of the input image (produced by applying a small Gaussian filter with low standard deviation σ0 to the input image f), while σ for the tensor smoothing is defined by assigning a filter bandwidth radius rw. More specifically, σ is given by

$$\sigma = \frac{r_w}{\sqrt{2 \ln 2}}, \qquad (2)$$

i.e. Gσ decreases to half of its maximum value at distance rw from its center.

For each pixel x, we then consider the eigenvalues λ1(x) and λ2(x) (λ1(x) ≥ λ2(x)) and the corresponding eigenvectors e1(x) and e2(x) of the symmetric LST T(x). These provide information about edges present in the data. The eigenvalues can be interpreted as follows:

λ1 ≈ λ2 ≫ 0: No single dominant direction (edge crossing or point),
λ1 ≫ λ2 ≈ 0: Strong dominant direction (edge),
λ1 ≈ λ2 ≈ 0: No dominant direction (no edge).

The eigenvector e1(x) represents the local dominant direction of variation in the data. Hence e2(x), being orthogonal to e1(x), represents the direction of the smallest variation in the region [27].
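As an illustration of Eqs. (1)–(2) and the closed-form eigen-decomposition of a symmetric 2 × 2 matrix, the LST can be sketched as follows. This is a NumPy sketch with helper names of our own choosing; central differences stand in for the paper's 3×3 Scharr filters, so the numbers will differ slightly from the published implementation:

```python
import numpy as np

def _gauss1d(sigma):
    """Sampled, normalized 1D Gaussian kernel."""
    r = max(1, int(3.0 * sigma + 0.5))
    x = np.arange(-r, r + 1)
    k = np.exp(-x * x / (2.0 * sigma * sigma))
    return k / k.sum()

def _smooth(a, sigma):
    """Separable Gaussian smoothing with edge-replicated borders."""
    k = _gauss1d(sigma)
    p = len(k) // 2
    conv = lambda v: np.convolve(np.pad(v, p, mode='edge'), k, mode='valid')
    return np.apply_along_axis(conv, 1, np.apply_along_axis(conv, 0, a))

def local_structure_tensor(f, sigma0=0.95, r_w=8.0):
    """Per-pixel 2x2 LST of Eq. (1); returns the three distinct components."""
    fs = _smooth(np.asarray(f, float), sigma0)   # pre-smoothing with sigma0
    g1, g2 = np.gradient(fs)                     # central differences, not Scharr
    sigma = r_w / np.sqrt(2.0 * np.log(2.0))     # Eq. (2)
    return (_smooth(g1 * g1, sigma),
            _smooth(g1 * g2, sigma),
            _smooth(g2 * g2, sigma))

def lst_eigen(T11, T12, T22):
    """Closed-form eigenvalues (lam1 >= lam2) of the symmetric 2x2 LST."""
    half_tr = 0.5 * (T11 + T22)
    d = np.sqrt((0.5 * (T11 - T22)) ** 2 + T12 ** 2)
    return half_tr + d, half_tr - d
```

For an image varying along only one axis, `lst_eigen` yields λ1 ≫ λ2 ≈ 0, matching the "strong dominant direction" case above.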

2.2 Adaptive elliptical structuring elements

We define the (solid) elliptical structuring element E(a, b, φ), where a is the semi-major axis, b the semi-minor axis, and φ the orientation (see Fig. 1). Each pixel x in the image f is then assigned a structuring element. Hence, we have a = a(x), b = b(x), and φ = φ(x).

Figure 1: Ellipse parameters a, b, and φ, and their relation to the eigenvectors e1 and e2 of the LST.

The axes a(x) and b(x) are set from the eigenvalues of T(x) by the expressions

$$a(\mathbf{x}) = \frac{\lambda_1(\mathbf{x})}{\lambda_1(\mathbf{x}) + \lambda_2(\mathbf{x})} \cdot M, \qquad (3)$$

$$b(\mathbf{x}) = \frac{\lambda_2(\mathbf{x})}{\lambda_1(\mathbf{x}) + \lambda_2(\mathbf{x})} \cdot M, \qquad (4)$$

where M denotes the maximum allowed semi-major axis. Divisions by zero in perfectly smooth regions are handled by adding a small positive number to the eigenvalues (small enough to be neglected for any other purpose, i.e. machine epsilon). For all values of λ1(x) and λ2(x) we have a(x) + b(x) = M and 0 ≤ b(x) ≤ a(x) ≤ M. The structuring elements will dynamically range from lines of length M where λ1(x) ≫ λ2(x) ≈ 0, i.e. near strong dominant edges in the data, to disks with radius M/2 where λ1(x) ≈ λ2(x), i.e. where no single direction represents the local image structure (see Section 2.1).

The orientation φ(x) is retrieved from the corresponding eigenvectors by

$$\phi(\mathbf{x}) = \begin{cases} \arctan\!\left( \dfrac{e_{2,x_2}(\mathbf{x})}{e_{2,x_1}(\mathbf{x})} \right), & e_{2,x_1}(\mathbf{x}) \neq 0, \\[2ex] \pi/2, & e_{2,x_1}(\mathbf{x}) = 0, \end{cases} \qquad (5)$$

where e2,x1(x) and e2,x2(x) denote the components of the eigenvector e2(x). Given the image f and the parameters M and rw, the values of a(x), b(x), and φ(x) for the pixel x are thereby uniquely defined. To further reduce the complexity we suggest setting rw = M, ending up with only one parameter, but this can of course be adjusted as needed based on the size and nature of the features of interest. This will be further investigated in Section 4.1.
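The mapping of Eqs. (3)–(5) from eigenvalues and eigenvectors to ellipse parameters can be sketched as follows (function and argument names are our own; the machine-epsilon guard follows the text above):

```python
import numpy as np

def ellipse_params(lam1, lam2, e2_x1, e2_x2, M):
    """Per-pixel ellipse axes and orientation, Eqs. (3)-(5).

    lam1 >= lam2 are the LST eigenvalues; (e2_x1, e2_x2) are the components
    of the eigenvector e2; M is the maximum allowed semi-major axis.
    """
    eps = np.finfo(float).eps          # guards 0/0 in perfectly smooth regions
    denom = lam1 + lam2 + 2.0 * eps
    a = (lam1 + eps) / denom * M       # Eq. (3); note a + b == M always
    b = (lam2 + eps) / denom * M       # Eq. (4)
    safe = np.where(e2_x1 != 0.0, e2_x1, 1.0)
    phi = np.where(e2_x1 != 0.0,
                   np.arctan(e2_x2 / safe),
                   np.pi / 2.0)        # Eq. (5), incl. the e2_x1 == 0 branch
    return a, b, phi
```

A strong edge (λ1 ≫ λ2 ≈ 0) gives a ≈ M, b ≈ 0 (a line); λ1 = λ2 gives a = b = M/2 (a disk), including the flat-region case where both eigenvalues vanish.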

2.3 Morphological operations

For simplicity, we here introduce the notation

$$N_E(\mathbf{x}) = E_{\mathbf{x}}(a(\mathbf{x}), b(\mathbf{x}), \phi(\mathbf{x})) \qquad (6)$$

for the elliptical neighborhood surrounding the pixel x. The pixel-dependent ellipse parameters a(x), b(x), and φ(x) are calculated from the local image structure according to Eqs. (1)–(5), and the x subscript denotes that the elliptical structuring element has been translated to the pixel x. We then define the morphological operations erosion (εE) and dilation (δE) operating on the image f by

$$[\varepsilon_E(f)](\mathbf{x}) = \bigwedge_{\mathbf{y} : \mathbf{y} \in N_E(\mathbf{x})} f(\mathbf{y}) \quad \forall \mathbf{x} \in D(f), \qquad (7)$$

$$[\delta_E(f)](\mathbf{x}) = \bigvee_{\mathbf{y} : \mathbf{x} \in N_E(\mathbf{y})} f(\mathbf{y}) \quad \forall \mathbf{x} \in D(f), \qquad (8)$$

where $\bigwedge$ and $\bigvee$ denote the minimum and maximum operators, respectively, and D(f) is the support domain of the image f. The operations εE and δE are adjunct since the structuring elements are defined once from the same input image (referred to as the "pilot image" by Lerallut [16]) and thereafter remain fixed throughout the subsequent morphological operations [5]. That is,

$$\delta(f) \leq g \iff f \leq \varepsilon(g) \qquad (9)$$

for any functions f and g. We can thereby define the operations

$$\gamma_E(f) = (\delta_E \circ \varepsilon_E)(f), \qquad (10)$$

$$\varphi_E(f) = (\varepsilon_E \circ \delta_E)(f) \qquad (11)$$

to be the morphological opening and closing, respectively, of the gray-scale image f.
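A direct, unoptimized sketch of the erosion of Eq. (7) follows. Naming and axis conventions are our own, and every candidate offset is tested against the ellipse equation; the published implementation instead uses a precomputed offset table (see Section 3):

```python
import numpy as np

def adaptive_erosion(f, a, b, phi):
    """Reference implementation of Eq. (7): for each pixel, take the minimum
    of f over the ellipse with per-pixel semi-axes a >= b and orientation phi.
    Illustrative only; O(rows * cols * M^2)."""
    rows, cols = f.shape
    out = np.empty_like(f)
    for i in range(rows):
        for j in range(cols):
            aa, bb, p = a[i, j], b[i, j], phi[i, j]
            r = int(np.ceil(aa))
            best = f[i, j]                      # the center is always in N_E(x)
            cp, sp = np.cos(p), np.sin(p)
            for di in range(-r, r + 1):
                for dj in range(-r, r + 1):
                    y, x = i + di, j + dj
                    if not (0 <= y < rows and 0 <= x < cols):
                        continue
                    # offset rotated into the ellipse frame
                    u = dj * cp + di * sp
                    v = -dj * sp + di * cp
                    if (u / max(aa, 1e-9)) ** 2 + (v / max(bb, 1e-9)) ** 2 <= 1.0:
                        best = min(best, f[y, x])
            out[i, j] = best
    return out
```

With constant a = b this reduces to an ordinary (non-adaptive) erosion by a disk, which makes the sketch easy to sanity-check.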

3 Implementation

The presented method was tested by a fairly straightforward implementation in OpenCV v.2.3 [28]. The program can be summarized as follows:

1. Pre-calculate relative pixel index lists for the expected elliptical structuring elements, i.e. the index shifts for the leftmost and rightmost border pixels (relative to the center pixel) for each row of the discretized structuring element. All of the required structuring elements, as defined by M, are then stored in a corresponding 3D Look-Up Table LUT_E(a, b, φ) based on semi-major axes a ∈ [0, M], semi-minor axes b ∈ [0, a], and orientations φ ∈ [0, π). In practice, due to the symmetry of the ellipses, only half of the relative indices need to be stored. The other half can then be easily obtained by inverting the coordinates.

2. Estimate the image gradient ∇f(x) using 3×3 pixel Scharr filters on a pre-smoothed version of the input image, obtained using a small Gaussian filter with standard deviation σ0 = 0.95. Then calculate the LST for each pixel according to Eq. (1), defining the standard deviation σ by setting a filter bandwidth radius rw.

3. Obtain axis lengths and orientation for each pixel from the eigenvalues and eigenvectors of the corresponding LST, using Eqs. (3)–(5).

4. Perform an erosion or a dilation, or a combination of the two (an opening or a closing), using the elliptical structuring elements defined for each pixel x. More specifically, the elliptical neighborhoods provided by LUT_E(a(x), b(x), φ(x)) are used to perform the operations given by Eqs. (7) and (8). In practice, the dilation is implemented as suggested by Lerallut [16], i.e.

$$[\delta_E(f)](\mathbf{y}) = \bigvee \left\{ [\delta_E(f)](\mathbf{y}),\, f(\mathbf{x}) \right\}, \quad \forall \mathbf{y} \in N_E(\mathbf{x}),\ \forall \mathbf{x} \in D(f). \qquad (12)$$
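Steps 1 and 4 above can be sketched together: precompute the relative offsets of one discretized ellipse (a single look-up table entry), then perform the scatter-style dilation of Eq. (12), where each pixel pushes its value onto its own ellipse so that Eq. (8) holds without searching for the pixels y whose neighborhoods contain x. Names and the discretization rule are our own choices:

```python
import numpy as np

def ellipse_offsets(a, b, phi):
    """Relative pixel offsets of one discretized ellipse (one LUT entry)."""
    cp, sp = np.cos(phi), np.sin(phi)
    r = int(np.ceil(max(a, b, 0.5)))
    offs = []
    for di in range(-r, r + 1):
        for dj in range(-r, r + 1):
            u = dj * cp + di * sp          # offset in the ellipse frame
            v = -dj * sp + di * cp
            if (u / max(a, 0.5)) ** 2 + (v / max(b, 0.5)) ** 2 <= 1.0:
                offs.append((di, dj))
    return offs

def adaptive_dilation(f, a, b, phi):
    """Scatter-style dilation of Eq. (12): each pixel x pushes f(x) onto
    every pixel of its own elliptical neighborhood N_E(x)."""
    rows, cols = f.shape
    out = f.copy()
    for i in range(rows):
        for j in range(cols):
            for di, dj in ellipse_offsets(a[i, j], b[i, j], phi[i, j]):
                y, x = i + di, j + dj
                if 0 <= y < rows and 0 <= x < cols and f[i, j] > out[y, x]:
                    out[y, x] = f[i, j]
    return out
```

In a real implementation the offsets would be computed once per (a, b, φ) triple and stored, as in step 1, rather than recomputed per pixel.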

4 Results

4.1 Choice of Parameters

The impact of the two required input parameters M (maximum semi-major axis) and rw (filter bandwidth for tensor estimation) can be observed in Fig. 2, which shows erosion results for M = {4, 8, 16} and rw = {4, 8, 16} on a test image. Comparing the upper result row (M = 4) to the lower (M = 16), it is clear that the sizes of the structuring elements have increased substantially. Meanwhile, going from left (rw = 4) to right (rw = 16) shifts the adaptivity from smaller towards larger features in the data. Thus, the influences of the parameters are as expected: M provides a neighborhood size quantity, while rw needs to be balanced to capture the size of the features of interest in the data.

4.2 Feature Enhancement

Figure 3 shows the erosion (εE(f)), dilation (δE(f)), opening (γE(f)), and closing (ϕE(f)) of the Lena test image, using M = rw = 8. The opening and closing succeed in directionally enhancing and joining bright and dark features, respectively, without stretching across clear borders in the data. This is particularly clear for the feathers and along the brim of the hat.

The presented adaptive elliptical structuring elements range from disks of radius M/2 where no clear local orientation exists in the data to lines of length 2M + 1 where there is a strong directional structure. For comparison, the opening and closing based on these two extremes are shown in Fig. 4. For an even more detailed view of the differences, Fig. 5 shows a close-up view of the morphological closing of the upper left part of the hat, depicted contrast-stretched in jet colormap in order to enhance differences. The original image sub-section (Fig. 5a) has directional features in the hat, but also contains a noisy region without any directional features in the upper left. We see that the disk structuring element, which does not adapt at all to the image structure, gives a crude result (Fig. 5b).

Figure 2: The star test image (a) and its erosions (b–j) using M = 4, 8, and 16 (top to bottom) and rw = 4, 8, and 16 (left to right).

Figure 3: Operations performed on the Lena image (512×512 pixels) (a) using M = rw = 8: erosion (b), dilation (c), opening (d), and closing (e).

Figure 4: Openings and closings of the Lena image, using a regular (non-adaptive) disk structuring element of radius 4 (a, b) and an adaptive line structuring element of length 17 (c, d), corresponding to the two extremes for the adaptive elliptical structuring elements with M = rw = 8.

Adaptive line structuring elements (using the per-pixel angles obtained by Eq. (5)) adapt well to existing line features in the data, but also introduce lines in originally smooth regions containing noise (Fig. 5c). We see that the sensitivity of the line structuring elements towards noise in this case comes from the actual shape of the structuring elements rather than their directions, which indicates that this problem needs to be dealt with by using broader structuring elements in such regions.

The presented adaptive elliptical structuring elements (Fig. 5d) provide the trade-off between the two extremes. By utilizing a broader range of shapes for the structuring elements, the operation acts along locally dominant directions in the data without introducing artificial structures in noisy regions where no clear directional structure exists in the original image. This comes at the expense of very thin structures, which are more likely to be erased by the closing using adaptive elliptical structuring elements (as opposed to using adaptive lines).

Figure 5: The upper left part of the hat in the Lena image (a) and its closings using a disk structuring element (b), an adaptive line structuring element (c), and adaptive elliptical structuring elements (d). A subset of the structuring elements used is shown in white.

4.3 Impact of Noise

To study the impact of noise, Gaussian white noise with standard deviation s was added to the Lena image f, resulting in a noisy image fs. The pixel values in the elliptical closing will, as in Fig. 5, generally be higher than in the line closing, since high function values due to the added noise are allowed to affect a larger neighborhood. For the gradient the situation is different: the fact that adjacent lines in the data may cover completely different sets of pixels causes a highly varying result for noisy data, introducing a substantial amount of artificial edges similar to what we observe already in Fig. 5c.

By studying the change in the gradient magnitude of the closed images ϕ(fs) as the amount of noise is increased, the effect of noise on edge information in the image can be more systematically investigated. Figure 6 shows slightly contrast-stretched close-ups of parts of closings ϕ(fs) of the Lena image at noise level s = 0.05 (Figs. 6a and 6b), and a plot of the ratio between the average gradient magnitudes of the closed noisy images and the closed original image, i.e. the ratio of the mean of |∇ϕ(fs)| to the mean of |∇ϕ(f)|, as the standard deviation s of the added noise increases from 0 to 0.1 (Fig. 6c). Some elliptical structuring elements can be identified in Fig. 6b, but the lines in Fig. 6a are much more distinct. This visual observation is more systematically validated in Fig. 6c, which indeed shows that the noise causes a substantially larger increase of edge structure in the line-filtered image than in the ellipse-filtered image.
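The gradient-magnitude ratio used in this comparison can be sketched as follows (the helper name and the use of a central-difference gradient are our own assumptions, not taken from the paper):

```python
import numpy as np

def relative_gradient_magnitude(closed_noisy, closed_ref):
    """Ratio of the mean gradient magnitudes of two closed images, as a
    proxy for the artificial edge content introduced by noise (Sec. 4.3)."""
    def mean_grad(img):
        gy, gx = np.gradient(np.asarray(img, float))   # central differences
        return float(np.mean(np.hypot(gx, gy)))
    return mean_grad(closed_noisy) / mean_grad(closed_ref)
```

A ratio near 1 means the closing of the noisy image contains roughly the same amount of edge structure as the closing of the clean image; values well above 1 indicate noise-induced artificial edges.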

4.4 Practical Application

Our final example is of a more practical nature. In Fig. 7a we see 3D profile (range) data of a surface crack in casted steel. The crack has clear directional characteristics, but is quite noisy. By enhancing the crack signature we can more easily distinguish it from the rest of the surface, which simplifies automated crack detection and allows for a more accurate estimation of crack length. After performing an opening using elliptical structuring elements (Fig. 7b) on the image in question, the crack signature is indeed clearly enhanced. In particular, the operation efficiently joins crack sections that were originally split apart by noise, without noticeably affecting the horizontal trenches (known as oscillation marks).

This example demonstrates how the presented methods can be used in challenging applications in computer vision. Enhancement of the clearly directional crack is as good for the elliptical structuring elements as for the more extreme line structuring elements (Fig. 7c), but the advantage of the presented method is that we do not risk introducing artificial lines as a result of noise in the data, as was discussed in Section 4.3. In particular, the type of distinct dark lines that appear in the originally flat but noisy upper left region of Fig. 6a as a result of the shapes of the line structuring elements would in this application likely be falsely interpreted as cracks. Elliptical structuring elements therefore make the system less sensitive to noise, which is highly desirable for an automated industrial system subject to environmental factors such as vibrations, dust, reflections, etc.

[Panel (c): relative average gradient magnitude (y-axis, 1.0–1.8) vs. noise standard deviation s (x-axis, 0–0.1), plotted for elliptical, line, and disk structuring elements.]

Figure 6: Results of morphological closings of noise-corrupted data (s = 0.05) for line structuring elements (a) and elliptical structuring elements (b), and the increasing effect of noise on the gradient magnitude in filtered images (c).

Figure 7: A noisy vertical surface crack (a) and its opening using adaptive ellipses (b) and adaptive lines (c), using M = rw = 10. The elliptical structuring elements manage to enhance the crack as well as the line structuring elements do. Also, note that the horizontal trenches, known as oscillation marks, are not noticeably affected.

5 Discussion

We have shown how the concepts of tensors and morphology can be combined into adaptive morphology in a straightforward manner. Previous work has touched on the subject but not fully utilized the information held by the Local Structure Tensor (LST). By considering the relation between the eigenvalues as well as the direction obtained from the LST, edge dominance information is automatically included when assigning a structuring element to a pixel.

The results show that the structuring elements indeed become line-shaped (or close to line-shaped) where there are strong dominant directions in the data, i.e. near edges. In addition, when processing regions further away from distinct edges the method adapts dynamically to a non-directional approach by transforming the structuring elements into smaller disks, which avoids a distinct geometrical bias from the filter itself in such locations. In particular, this is true when the data contains random noise, in which case the noise level affects the morphological operations (which are based on minimum and maximum values within neighborhoods) but does not cause artificial edges resulting from biases from the structuring element.

If the user needs to emphasize extremely thin features but does not care that similar features risk being added from noise (e.g. if the Signal to Noise Ratio is known to be very high), line structuring elements may very well be the best choice. For cases where noise is more of an issue, however, and in particular where thin artificial structures are unwanted, the presented adaptive elliptical structuring elements should be preferable.

The method relies on no more than two input parameters: the maximum semi-major axis M and the radial bandwidth for structure estimation rw. The parameters are quite intuitive and are related to real measures in the image; M sets the sizes of the structuring elements while rw is related to the size of orientational features of interest.
These two can in many cases even be set to the same number, since the possible reach of the assigned structuring elements then corresponds to the region considered for structure estimation. If needed, the level of Gaussian prefiltering when calculating the gradient can of course also be set by changing the underlying parameter σ0 to match the level of noise expected by the user, but for most cases a small smoothing (σ0 < 1) should suffice. Hence, the presented morphological processing is reasonably easy to use even without deep insight into the underlying framework.

Regarding efficiency, the LST calculations are based on standard convolutions – operations that are usually implemented highly efficiently. The eigenvalues of the LST can in lower dimensions be calculated from closed-form expressions, and can therefore be handled quite efficiently as well (but will of course require more computational power in higher dimensions). As for the morphological operations, a Look-Up Table (LUT) allows for a reasonably efficient implementation. In systems where the adaptive morphological operations are repeated or otherwise performed several times, the LUT can be calculated once (off-line) and then loaded into on-line applications, reducing the computational load for on-line systems at the cost of memory.

6 Conclusion

The adaptive elliptical structuring elements are robust to noise and are automatically adjusted to the local edge information (or lack thereof) in the data. They constitute a dynamic bridge between the two extremes: disk and (direction-adaptive) line structuring elements. We have demonstrated how the adaptive elliptical structuring elements can be applied to both gray-scale images and 3D profile data used in industrial applications, allowing for enhancement of directional structures within the data.

7 Future work

In this work, we set the structuring elements by setting two parameters: the maximum semi-major axis M and the filter bandwidth rw used to obtain the local structure tensor. In the future we intend to investigate the possibility of setting those parameters adaptively as well. This could be done by calculating the LST at different scales (for different values of rw) and selecting a value for M that matches the scale at which the LST captures enough information. Alternatively, other approaches for estimating image structure, where there is an even stronger relation to the variation in the different directions (e.g. where crossings can be separated from points), could be used to scale M suitably.

Future work should also consider estimating the level of noise in the input image in order to define a proper level of pre-smoothing for calculating the gradient. This would allow for better handling of thin structures in noise-free data.

Finally, there is no reason why the presented results cannot be generalized to higher dimensions, enabling processing of voxel information.

References

[1] G. Matheron, Random sets and integral geometry. New York: Wiley, 1975, vol. 1.

[2] J. Serra, Image analysis and mathematical morphology. London: Academic Press, 1982.

[3] S. Beucher, “Numerical residues,” Image and Vision Computing, vol. 25, no. 4, pp. 405–415, 2007.

[4] J. Serra, Image analysis and mathematical morphology. Vol. 2. New York, NY: Academic Press, 1988.

[5] J. Roerdink, “Adaptivity and group invariance in mathematical morphology,” in 16th IEEE International Conference on Image Processing (ICIP). IEEE, 2009, pp. 2253–2256.

[6] J. Roerdink and H. Heijmans, “Mathematical morphology for structures without translation symmetry,” Signal Processing, vol. 15, no. 3, pp. 271–277, 1988.

[7] J. Roerdink, “Group morphology,” Pattern Recognition, vol. 33, no. 6, pp. 877–895, 2000.

[8] M. Charif-Chefchaouni and D. Schonfeld, “Spatially-variant mathematical morphology,” in IEEE International Conference on Image Processing (ICIP), vol. 2. IEEE, 1994, pp. 555–559.

[9] N. Bouaynaya and D. Schonfeld, “Spatially variant morphological image processing: theory and applications,” in Proceedings of SPIE, vol. 6077, pp. 673–684, 2006.

[10] J. Serra, “Viscous lattices,” Journal of Mathematical Imaging and Vision, vol. 22, no. 2, pp. 269–282, 2005.

[11] N. Bouaynaya, M. Charif-Chefchaouni, and D. Schonfeld, “Theoretical foundations of spatially-variant mathematical morphology part I: Binary images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5, pp. 823–836, 2008.

[12] N. Bouaynaya and D. Schonfeld, “Theoretical foundations of spatially-variant mathematical morphology part II: Gray-level images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5, pp. 837–850, 2008.

[13] P. Maragos and C. Vachier, “Overview of adaptive morphology: trends and perspectives,” in 16th IEEE International Conference on Image Processing (ICIP). IEEE, 2009, pp. 2241–2244.

[14] F. Shih and S. Cheng, “Adaptive mathematical morphology for edge linking,” Information Sciences, vol. 167, no. 1, pp. 9–21, 2004.

[15] F. Shih and V. Gaddipati, “General sweep mathematical morphology,” Pattern Recognition, vol. 36, no. 7, pp. 1489–1500, 2003.

[16] R. Lerallut, É. Decencière, and F. Meyer, “Image filtering using morphological amoebas,” Image and Vision Computing, vol. 25, no. 4, pp. 395–404, 2007.

[17] V. Ćurić, C. Luengo Hendriks, and G. Borgefors, “Salience adaptive structuring elements,” IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 7, pp. 809–819, 2012.

[18] J. Debayle and J. Pinoli, “Spatially adaptive morphological image filtering using intrinsic structuring elements,” Image Analysis & Stereology, vol. 24, no. 3, pp. 145–158, 2005.

[19] R. Verdú-Monedero and J. Angulo, “Spatially-variant directional mathematical morphology operators based on a diffused average squared gradient field,” in Advanced Concepts for Intelligent Vision Systems. Springer, 2008, pp. 542–553.

[20] R. Verdú-Monedero, J. Angulo, and J. Serra, “Spatially-variant anisotropic morphological filters driven by gradient fields,” Mathematical Morphology and Its Application to Signal and Image Processing, pp. 115–125, 2009.

[21] R. Verdú-Monedero, J. Angulo, and J. Serra, “Anisotropic morphological filters with spatially-variant structuring elements based on image-dependent gradient fields,” IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 200–212, 2011.

[22] O. Tankyevych, H. Talbot, P. Dokládal, and N. Passat, “Direction-adaptive grey-level morphology. Application to 3D vascular brain imaging,” in 16th IEEE International Conference on Image Processing (ICIP). IEEE, 2009, pp. 2261–2264.

[23] M. Breuß, B. Burgeth, and J. Weickert, “Anisotropic continuous-scale morphology,” Pattern Recognition and Image Analysis, pp. 515–522, 2007.

[24] O. Cuisenaire, “Locally adaptable mathematical morphology using distance transformations,” Pattern Recognition, vol. 39, no. 3, pp. 405–416, 2006.

[25] P. Dokládal and E. Dokládalová, “Grey-scale morphology with spatially-variant rectangles in linear time,” in Advanced Concepts for Intelligent Vision Systems, pp. 674–685. Springer, 2008.

[26] B. Rieger and L. Van Vliet, “A systematic approach to nD orientation representation,” Image and Vision Computing, vol. 22, no. 6, pp. 453–459, 2004.

[27] L. Cammoun, C. Castaño-Moraga, E. Muñoz-Moreno, D. Sosa-Cabrera, B. Acar, M. Rodriguez-Florido, A. Brun, H. Knutsson, and J. Thiran, “A review of tensors and tensor signal processing,” in Tensors in Image Processing and Computer Vision, pp. 1–32. Springer, 2009.

[28] “OpenCV.” [Online]. Available: http://opencv.willowgarage.com

Paper C

Sub-Millimeter Crack Detection in Casted Steel using Color Photometric Stereo

Authors: Anders Landström, Matthew J. Thurley, and Håkan Jonsson

Reformatted version of paper originally published in: Proceedings of the 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013, pp. 435–441.

© 2013, Institute of Electrical and Electronics Engineers. Reprinted with permission.

Sub-Millimeter Crack Detection in Casted Steel using Color Photometric Stereo

Anders Landström, Matthew J. Thurley, and Håkan Jonsson

Abstract

A novel method for automated inspection of small corner cracks in casted steel is presented, using a photometric stereo setup consisting of two light sources of different colors in conjunction with a line-scan camera. The resulting image is separated into two different reflection patterns which are used to cancel shadow effects and estimate the surface gradient. Statistical methods are used to first segment the image and then provide an estimated crack probability for each segmented region. Results show that true cracks are successfully assigned a high crack probability, while only a minor proportion of other regions cause similar probability values. About 80% of the cracks present in the segmented regions are given a crack probability higher than 70%, while the corresponding number for other non-crack regions is only 5%. The segmented regions contain over 70% of the manually identified crack pixels. We thereby provide proof-of-concept for the presented method.

1 Introduction

1.1 Background

Cracks in casted steel constitute a quality issue for end-user products and may also cause problems in subsequent production processes such as sheet steel rolling. Finding cracks after casting, thereby avoiding related problems at later stages, is therefore important to the steel industry. Much of this inspection is still performed manually, whereas a robust automated system could improve both working conditions and production efficiency. Automated methods also provide an objective and consistent result which can be more easily analyzed than manual inspection. Moreover, in order to enable on-line testing of produced goods, inspection must be done by Non-Destructive Testing (NDT).

There exist many different NDT methods for defect detection in steel, such as thermocouples [1], eddy currents [2], ultrasound [3], conoscopic holography [4], and laser triangulation [5]. In this work, however, we focus on the detection of small corner cracks, which are in general thinner than 1 mm. Due to their size, such cracks pose a challenge to the techniques listed above when an on-line installation is desired.

Light intensity imaging, on the other hand, i.e. standard gray-scale camera technology, enables fast acquisition of high-resolution surface data suitable for on-line NDT installations, and has been used throughout the years for various inspection tasks [6].


Gray-scale images are often hard to interpret, though, motivating the use of more advanced techniques. We present a photometric stereo system for identifying small cracks with sub-millimeter width in casted steel slabs using machine vision, enabling appropriate processing of such regions before slabs are shipped further down the production chain.

In photometric stereo, originally introduced by Woodham [7], multiple gray-scale images of the same scene are acquired under illumination from differently placed light sources. Assuming a surface is Lambertian (i.e. scatters light isotropically), its shape can be uniquely determined from three images obtained by illuminating the scene from three different directions. The different illumination patterns are then translated into height gradient information for the surface by solving a linear system of equations. Two-source photometric stereo gives a less well defined surface, but conditions for existence and uniqueness of the system of equations in the two-source case have been investigated by Kozera [8].

Most methods based on photometric stereo rely on data captured by taking several photos of the same scene, varying the origin of illumination using a set of light sources lit one at a time, but this approach would be cumbersome for the intended use. Instead, it should be possible to mount the equipment in an on-line measurement rig passing over the steel slabs, scanning the steel below (see Fig. 1). In order to enable such on-line scanning the images should be acquired simultaneously, which can be achieved by color photometric stereo.

Drew [9] estimates surface gradients from the different channels in a color image, using photometric stereo. No particular light setup is addressed, but conditions are imposed on the light setup by the underlying mathematical expressions, i.e. light directions must be linearly independent and colors must not be coplanar in RGB space. Color images have also been used by Balschbach et al. [10], who obtain height gradients for highly specular moving (wavy) water surfaces from refracted colored lights. Kang et al. [11] recently used colored light photometric stereo to retrieve signatures for artificial defects in planar steel surfaces. Their work considers surface reconstruction from gradients obtained from a configuration based on three and six differently colored light sources. However, surface reconstruction from gradients is quite susceptible to noise in the data and therefore requires very accurate input when addressing thin defects such as cracks.

Figure 1: The intended use of the final system: Scanning slabs from a measurement frame under which slabs can be placed for inspection.

1.2 Contribution

The presented approach captures an estimated surface gradient from the different light reflections of two colored light sources. In addition, the use of two light sources also allows for separation between cracks and various shadowing effects. Working directly with the gradient makes the system more robust to noise, as compared to integrating gradients obtained by photometric stereo into a surface. Moreover, combining estimated light intensities with gradient information provides more robust identification than purely intensity-based approaches, since the texture of the steel surface may affect the intensity of the reflected light. The work can be separated into three interrelated subproblems:

P1. Acquiring image data that captures the details of small surface cracks.

P2. Segmenting the image into possible crack regions.

P3. Classifying potential cracks as true or false signatures.

The rest of this article is arranged as follows: Section 2 presents solutions to the three subproblems P1–P3, based on a model set of steel samples containing cracks. Results for the presented method on another set of samples are then presented in Section 3, followed by an analysis in Section 4 and conclusions in Section 5. Future challenges and possibilities for the ongoing work are then addressed in Section 6.

2 Method

2.1 Measurement setup (P1)

A transversal crack is characterized by a rapid change in the height derivative in the perpendicular direction, and will remain dark (low light intensity) regardless of whether the projected light comes from the front or the back. A solution for subproblem P1 must provide high enough resolution to obtain a clear signature of small cracks, but should be robust against variation of texture and reflections in the steel surface. Photometric stereo enables simple and cost-efficient estimation of the surface gradient (i.e. slope). In addition, the two light sources enable us to retrieve a gray-scale intensity image of the sample where shadow effects appearing in only one of the images can be cancelled. We present a system which uses constant light of two different colors (yellow and blue) which can be well separated in the green and blue channels of the digital image captured by the camera.

The gradient, or first derivative, is measured perpendicular to the cracks using two light sources L1 and L2 that are positioned in front of and behind the camera sensor, with respect to the relative movement between the camera and the slab. Thus, the gradient estimation is done longitudinally along the measured slab: the direction in which L1 and L2 are co-linear (see Fig. 2). A line-scan color camera is used to capture one transversal line of data at a time, located at a constant and equal distance relative to L1 and L2. This setup avoids the need for position-dependent normalization of intensity values, simplifying the mathematical expressions considerably.

Assuming a Lambertian surface (i.e. isotropic light scattering), a coarse but simple estimation of the surface gradient can be achieved by considering how light incident from different directions is reflected (i.e. to what extent it reaches the sensor). Given the symmetrical co-linear measurement setup, the system of equations for two-light photometric stereo considered by Kozera [8], among others, reduces to a normalized difference of the two images for the longitudinal direction (note that nothing is known about the transversal direction). That is, an approximation of the gradient Sy of the surface S in the longitudinal (y-) direction of the slab can be obtained from the expression

    Sy = (I1 − I2) / (I1 + I2),    (1)

where I1 and I2 are images captured while illuminating the surface from L1 and L2, respectively. For simplicity, we here omit a factor yielded by the angle of the incident light, since it only scales the quantity. In a slightly different but related application, Nayar and Bolle [12] demonstrated how the same ratio of image intensities can be used to obtain a reflectance ratio for neighboring points in the same grayscale image, assuming identical light scattering for the two points.

It should be noted that a steel surface hardly provides perfect conditions; consequently, the resulting estimated gradient should not be expected to yield an exact slope, due to the risk of specular effects and similar disturbances. Steel is also hardly a Lambertian surface, and a more accurate model for light scattering should yield a better gradient estimation. Nevertheless, empirical studies show that the simple approximation provided by Eq. (1) produces a highly usable result.

Figure 2: (a) Setup example with light sources in blue and yellow, and scan line in black. The gray area represents a part of the slab. (b) The portable measurement setup installation used in this work.
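As a concrete illustration, the normalized difference of Eq. (1) can be computed pixelwise. The sketch below assumes I1 and I2 are co-registered grayscale images stored as floating-point NumPy arrays; the small epsilon term is our own addition to guard against division by zero in completely dark pixels, not part of the paper's formulation.

```python
# Sketch of the gradient estimate in Eq. (1); I1, I2 are co-registered
# grayscale images as floating-point NumPy arrays. The eps term is our own
# guard against division by zero in completely dark pixels.
import numpy as np

def estimate_gradient(I1, I2, eps=1e-8):
    """Normalized difference approximating the longitudinal slope Sy."""
    I1 = np.asarray(I1, dtype=float)
    I2 = np.asarray(I2, dtype=float)
    return (I1 - I2) / (I1 + I2 + eps)

# A surface lit equally from both sides yields zero estimated slope,
# while stronger reflection of the L1 light gives a positive value.
flat = estimate_gradient(np.full((4, 4), 0.5), np.full((4, 4), 0.5))
tilted = estimate_gradient(np.array([[0.8]]), np.array([[0.2]]))
```

Note that the result is bounded to [−1, 1] by construction, which matches the omitted angular factor only scaling, never reordering, the estimated slopes.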

2.2 Segmentation (P2)

We identify three key criteria for pixels within captured cracks:

C1. Low light intensity.

C2. Low light intensity in relation to the surrounding region (not necessarily the same as the above criterion).

C3. Rapid (positive) change in the perpendicular height derivative.

Rather than defining thresholds for the above criteria manually, we apply a probabilistic approach, combining them into a pixelwise crack probability value based on a logistic regression model constructed from a model set of images containing in total over 50 cracks of different sizes.

An example of a pair of input images I1 and I2 is shown in Fig. 3. We assume the placement of the steel is well known, which means we can safely restrict filtering and classification to the center part of the acquired data. This decreases computation time and avoids false detection of cracks in non-interesting regions. A standard centimeter-numbered ruler is displayed on the left side of the images. The difference in scale between the top and bottom of the images is a result of variation in the speed at which the line-scan camera was swept over the sample. This demonstrates how different resolutions in the longitudinal (y-) direction can be obtained by varying the speed of the camera relative to the scanned slab.


Figure 3: The front and back images (a and b, respectively) for a steel sample in the model set, where cracks are visible on the left side of the scarfed steel surface. The scale numbers to the left are denoted in centimeters.

Criterion C1

To cancel shadowing effects, we define the maximum intensity image C1 as the pixelwise maximum of I1 and I2, filtered in order to enhance elongated dark features, i.e.

    C1 = γE(max{I1, I2}).    (2)

Here, γE denotes an adaptive morphological opening using elliptical structuring elements. This method was introduced in Ref. [13] as a way of enhancing and linking elongated features such as cracks. The maximum length of the semi-major axes of the elliptical structuring elements used to probe the image is set to 3 pixels.
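The EASE opening γE itself is beyond a short snippet, but the shadow cancellation and opening of Eq. (2) can be sketched with a fixed structuring element as a stand-in (an assumption of this sketch; the 3-pixel window mirrors the semi-major axis limit above):

```python
# Stand-in sketch for Eq. (2): the adaptive elliptical opening gamma_E is
# replaced by a fixed-window grayscale opening from scipy.ndimage. Taking
# the pixelwise maximum first cancels shadows present in only one image.
import numpy as np
from scipy import ndimage

def max_intensity_image(I1, I2, size=3):
    M = np.maximum(I1, I2)                      # shadow cancellation
    return ndimage.grey_opening(M, size=(size, size))

img = np.random.default_rng(0).random((16, 16))
C1 = max_intensity_image(img, img * 0.5)
```

Because the opening is anti-extensive, the filtered image never exceeds the maximum-intensity image it probes.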

Criterion C2

In the image C1, shadows that are not present in both images are cancelled, while regions appearing dark in both images, such as cracks, remain. Furthermore, we also capture the pixel darkness in relation to the surrounding neighborhood by calculating the morphological close top-hat C2, i.e. the difference between the morphological (non-adaptive) closing ϕ(C1), using a reasonably large disk structuring element (radius 15 pixels), and C1 itself:

    C2 = ϕ(C1) − C1.    (3)
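The non-adaptive close top-hat of Eq. (3) is straightforward to sketch; here the disk of radius 15 is approximated by a square window, which is an assumption of this sketch rather than the paper's exact element:

```python
# Close top-hat of Eq. (3): C2 = closing(C1) - C1. The paper's disk element
# (radius 15) is approximated here by a 31x31 square window.
import numpy as np
from scipy import ndimage

def close_tophat(C1, size=31):
    return ndimage.grey_closing(C1, size=(size, size)) - C1

# A small dark pit in a bright plane gets a large top-hat response.
plane = np.ones((64, 64))
plane[30:33, 30:33] = 0.2
C2 = close_tophat(plane)
```

Since the closing is extensive, C2 is non-negative everywhere and large only where a pixel is dark relative to its neighborhood.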

Examples of the resulting images C1 and C2 are presented in Figs. 4a and 4b.

Criterion C3

Ideally the height derivative (slope) of the surface should turn from negative to positive, but this is not always true since the estimated derivative is only an approximation. For our third criterion we therefore simply consider the value of the 2nd derivative (which should be high where there are cracks), as estimated from the photometric stereo setup. The first derivative Sy of the surface S is estimated by Eq. (1), i.e.

    Sy = (I1 − I2) / (I1 + I2),

and a measure C3 related to our third criterion is then obtained from an estimation of the 2nd derivative. This is done by first convolving the image Sy with a small 2D Gaussian derivative kernel DG (the standard deviation used for the Gaussian is as small as 0.5 pixels in order to capture the thin details of the crack) and then linking signatures with adaptive morphological filtering in the same way as for C1, i.e.

    C3 = ϕE(Sy ∗ DG).    (4)

Here a closing ϕE is performed instead of an opening, in order to enhance bright, rather than dark, features in the data. The maximum semi-major axis of the structuring elements is set to 3 pixels here as well. An example of a resulting image is shown in Fig. 4c.
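Eq. (4) can be sketched with a Gaussian derivative along the longitudinal axis (a simplification of the paper's 2D kernel) followed by a grayscale closing; as above, a fixed 3-pixel window stands in for the adaptive elliptical closing ϕE:

```python
# Sketch of Eq. (4): estimate the gradient change by filtering Sy with a
# Gaussian derivative (sigma = 0.5) along the longitudinal (y-) axis, then
# link bright signatures with a closing. The fixed-window grey_closing is
# a stand-in for the paper's adaptive elliptical closing phi_E.
import numpy as np
from scipy import ndimage

def gradient_change(Sy, sigma=0.5, size=3):
    d = ndimage.gaussian_filter1d(Sy, sigma=sigma, axis=0, order=1)
    return ndimage.grey_closing(d, size=(size, size))

# A constant-slope image contains no gradient change:
C3 = gradient_change(np.full((8, 8), 0.3))
```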

Probabilistic Segmentation

As discussed at the beginning of Section 2.2, we apply a statistical method to separate pixels which are likely to constitute part of a crack from those which are unlikely to do so: a logistic regression model is fitted to the model data set, providing a posterior crack probability for each pixel based on the criteria values of manually marked cracks in the model set (see Fig. 4d). More specifically, given a criteria vector C(x) = {C1(x), C2(x), C3(x)} for a pixel x, we define a corresponding crack probability p(x|α, C) by the posterior probability for logistic regression,

    p(x|α, C) = 1 / (1 + exp(−α^T C(x))),    (5)

where α denotes the coefficient vector for the classifier obtained from the model set. A comparison of the value distributions for the variables and the pixel-wise crack probability (for the model set) is shown in Fig. 5.

The data is then segmented into a thresholded binary image T by selecting a probability threshold t, i.e.

    T = 1 where p(x|α, C) ≥ t, and T = 0 otherwise.    (6)

Based on the cumulative distribution functions (Fig. 5d) we select t = 0.75, keeping pixels with more than 75% crack probability (as defined by the model set). This yields



Figure 4: The maximum intensity C1 (a), its top-hat C2 (b), the gradient change C3 (c), and the resulting estimated per-pixel crack probability shaded in orange (d). The brightness of the orange shading is proportional to the assigned crack probability, i.e. regions with high crack probability are shaded in bright orange.


Figure 5: CDFs for the maximum light intensity C1 (a), the close top-hat C2 (b), the gradient change C3 (c), and the estimated crack probabilities (d) for manually identified cracks (red) and other parts of the slab sample (blue).

a set of segmented regions (see Fig. 6). A binary segmentation has thus been achieved by selecting one probability threshold rather than three variable thresholds. It should be noted that classification by logistic regression is usually performed by separating the classes at 50% class probability, but we select a higher, more conservative, required crack probability in order to limit the number of non-crack pixels in the resulting binary segmentation.

The segmentation phase is concluded by cleaning the thresholded binary image T of clearly uninteresting parts. This is done by morphological filtering: false signatures at vertical edges are avoided by excluding pixels where a long vertical line can be fitted into the segmented set, using a morphological open top-hat. The exact length of the line should be long enough to avoid deleting cracks, and was here set to 25 pixels. Finally, small regions in the resulting binary image are excluded by an area opening, which


Figure 6: Manual pixel classification (a) and the segmented regions, thresholded at 75% crack probability (b).

avoids subsequent processing of a large number of very small regions (noise). In this work, regions containing less than 25 pixels were removed.
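The whole segmentation step, Eqs. (5) and (6) plus the cleanup, can be sketched as below. The coefficient vector alpha is illustrative and not fitted to any data, and the vertical-line and area openings use plain scipy.ndimage primitives rather than the paper's exact implementation:

```python
# Sketch of the segmentation step: pixelwise logistic probability (Eq. (5)),
# thresholding at t = 0.75 (Eq. (6)), removal of long vertical-line artifacts
# via an open top-hat, and an area opening dropping regions under 25 pixels.
# The coefficient vector alpha below is illustrative, not fitted.
import numpy as np
from scipy import ndimage

def crack_probability(C, alpha):
    """C: criteria stacked along the last axis, shape (..., k)."""
    return 1.0 / (1.0 + np.exp(-(C @ alpha)))

def clean_segmentation(T, line_len=25, min_area=25):
    # Open top-hat: drop pixels covered by a fitted long vertical line.
    vertical = ndimage.binary_opening(T, structure=np.ones((line_len, 1), bool))
    T = T & ~vertical
    # Area opening: keep only connected regions of at least min_area pixels.
    labels, n = ndimage.label(T)
    areas = np.asarray(ndimage.sum(T, labels, index=np.arange(1, n + 1)))
    keep_ids = np.flatnonzero(areas >= min_area) + 1
    return T & np.isin(labels, keep_ids)

alpha = np.array([-4.0, 6.0, 3.0])          # made-up coefficients
C = np.random.default_rng(1).random((40, 20, 3))
T = crack_probability(C, alpha) >= 0.75     # Eq. (6) with t = 0.75
mask = clean_segmentation(T)
```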

2.3 Classification (P3)

After segmentation, potential cracks, i.e. segmented regions, are fed into another logistic regression classifier, but at this stage the classification is based on regional features (in contrast to the previous per-pixel classification). We denote identified regions by R and their features by F(R). A statistical classifier is defined based on the regional features orientation, width, solidity, and crack probability (as defined by Eq. (5)). These variables, extracted for each of the segmented regions, were selected from a larger set by lasso regularization, i.e. by introducing a bias towards zero for the coefficients [14]. A non-biased model was then created based on the resulting variables. Hence, the resulting model has an intended bias towards simplicity in terms of variables, but is without bias in the final coefficients. The width is calculated from a pruned binary skeleton. Discarded variables included other geometrical properties such as length and area.

Instead of manually setting a strictly predefined classification threshold, we set the regional crack probability p(R|β, F) to the posterior probability for logistic regression, as in Eq. (5) but with another coefficient vector β corresponding to the extracted regional features F(R). Choosing a threshold value then becomes a trade-off between a high detection rate and a low number of false detections. It is therefore crucial that the resulting


Figure 7: Regional crack probabilities depicted in orange shading (a) and the total CDFs (b) for manually identified cracks (red) and other parts of the slab sample (blue).

probability distributions for the two groups are well separated, which is indeed the case for the model set (see Fig. 7). For instance, a 70% threshold level for the regional crack probability keeps 86% of the segmented crack regions while discarding all but 6% of non-crack regions in the segmented set.
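The threshold trade-off described above can be made concrete with a small sweep over labeled region scores. The probability values below are made up for illustration; in practice they would come from the fitted region classifier:

```python
# Illustrative threshold trade-off for the region classifier: for labeled
# region probabilities (made-up numbers), report the fraction of true crack
# regions kept vs. non-crack regions kept at a given threshold, mirroring
# the "kept / discarded" figures quoted in the text.
import numpy as np

def rates(p_crack, p_other, t):
    keep_cracks = np.mean(p_crack >= t)   # detection rate at threshold t
    keep_other = np.mean(p_other >= t)    # false positive rate at threshold t
    return keep_cracks, keep_other

p_crack = np.array([0.95, 0.88, 0.72, 0.91, 0.66])        # crack regions
p_other = np.array([0.10, 0.05, 0.74, 0.20, 0.02, 0.31])  # non-crack regions
tp, fp = rates(p_crack, p_other, t=0.70)
```

Raising t trades detections for fewer false positives; sweeping t over [0, 1] traces out the full operating curve.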

3 Results

The described method was used to scan and process a validation set of images, resulting in a segmented set of potential crack regions with corresponding estimated crack probabilities. The validation set contained, just as the model set, in total over 50 corner crack samples of various sizes.

Figure 8 shows an example of a captured front image from the validation set (Fig. 8a) and a few images for the same example during intermediate algorithm steps. The estimated per-pixel crack probabilities are displayed in Fig. 8b, where pixels with higher crack probability are shaded in brighter orange, and the segmentation resulting from thresholding this probability image at 75% is displayed in Fig. 8c. These regions are then fed to a classifier, providing per-region crack probabilities as displayed in Fig. 8d. The CDFs corresponding to Figs. 8b and 8d are shown in Fig. 9, and demonstrate good separation between true cracks and other regions. At the 75% per-pixel threshold, more than 70% of the manually identified pixels are correctly classified in the first step (while discarding 98% of other pixels), while the second (regional) classification would, at a 70% threshold level, keep 78% of the segmented regions and discard all but 5% of the remaining



Figure 8: An example from the validation set: The front image (a), the per-pixel posterior probabilities shaded in orange (b), the resulting binary segmentation (c), and the per-region crack probabilities for the segmented regions (d). The brightness of the orange shading is proportional to the assigned crack probability, i.e. regions with high crack probability are shaded in bright orange.


Figure 9: CDFs for per-pixel crack probabilities (a) and per-region crack probabilities (b) for manually identified cracks (red) and other parts of the slab sample (blue).

non-crack regions. Note that these threshold values can easily be varied depending on the preferences of the system user, allowing for a trade-off between the number of successful detections and the risk of false detections. Furthermore, large-scale usage would allow for the collection of larger model sets and thereby more accurate probability estimations.

4 Discussion

In order to measure the surface slope in the longitudinal direction, the two light sources are located one in front of and the other behind the point that is currently measured. The camera is of line-scan type, capturing transversal rows, which keeps the distances to the light sources constant.

The 2nd derivative yields signatures for cracks but is quite sensitive to small deviations in the data, resulting in a substantial amount of noise. It therefore needs to be combined with other features in order to achieve a robust crack detection system. We have here used the light intensity, since its robustness against artificial noise is a good complement to the more sensitive 2nd derivative, which is useful for finding the rapid slope changes that characterize the cracks. Using light intensity directly makes the system more dependent on the surface texture of the steel and the ambient light settings of the installation. Texture effects are handled by taking the morphological close top-hat into account, thereby considering not only the light intensity but also its relation to the surrounding region. The dependency on ambient light requires control of the surrounding environment in order to achieve robustness, but this can be quite simply addressed by shielding the measurement system from external light sources.

Looking at the results for manually classified data, the method indeed assigns high crack probabilities to crack regions and low crack probabilities to non-crack regions. We see that other transversal ditch-like features in the data are likely to be assigned a high crack probability, but this is a logical consequence of the algorithm: such regions do represent a change from negative to positive slope.
By taking other parameters into account, such false positives can be drastically reduced, which is demonstrated by the separation in region crack probability for the segmented data (as compared to the pixelwise crack probability used in the segmentation).

A key challenge in this type of application is the small size of defect regions in relation to the non-defect area. Since the cracks we are looking for make up such a low proportion of the steel surface, classification performance must be very high in order to avoid false positives: even at very low false positive classification rates, the vast amount of non-defect data will still likely cause some non-crack regions to be falsely identified as cracks. Hence, the possibility of finding and using additional extractable characteristics to improve results should be further pursued, with the goal of restricting the number of false positives to an absolute minimum.

Due to the adaptive structuring elements, crack signatures can be emphasized using only one parameter. Using statistical methods to estimate crack probabilities also reduces the total number of required user-set parameters: instead of strictly defining for the data what we require, we leave much of this work to the data itself. The low number of parameters simplifies adjustments and analysis of the method. It should be noted that due to the data-driven segmentation and classification, a more representative model set should improve results substantially. In particular, one model set is most likely not suitable for all circumstances: for instance, different steel grades may have different properties. However, the fact that the segmentation depends to a large extent on the model set makes it reasonably easy to create different models for different situations.

5 Conclusion

We have demonstrated a novel method for automated detection of small corner cracks in casted steel. Cracks in general cause high crack probability values, while only a minor proportion of other regions are assigned similar crack probabilities. More specifically: about 80% of the cracks present in the segmented regions are given a crack probability higher than 70%, while the corresponding number for other non-crack regions is only 5%. The segmented regions contain over 70% of the manually identified crack pixels. The results thereby provide proof-of-concept for the presented method.

6 Future work

Since the large amount of surface area not containing cracks is likely to cause false positives even with a very low rate of false classifications, future work should focus on decreasing the false positive classification rate even further. One option which can help to avoid false crack signatures from light intensity patterns originating from more large-scale topological variations is to combine the photometric stereo images with 3D profile (range) data, which can assist substantially in defining the actual surface where we want to scan for cracks. More specifically, distinct surface height variations on the larger scale, such as the slab border, scales, and scarfing borders, can be more easily discarded from potential crack regions by using 3D data.

Higher resolution can be obtained not only by varying the speed of the camera in relation to the scanned surface, but also by varying the number of recorded scans required for each line in the image. This comes at the expense of higher noise levels due to fewer measurements being recorded for each point of the steel surface. Such effects on noise levels from increasing scanning speed should be investigated, in order to select an optimal measurement speed.

Large-scale testing and acquisition of data on-site is the natural next step, in order to provide a good statistical evaluation of the method and thereby make future prototype development possible. Comparison to other approaches also constitutes a topic for future work. Both of these are quite extensive tasks, however, and therefore lie beyond the scope of this work.

Acknowledgment

The authors would like to thank our measurement partners at Kemi-Tornio University of Applied Sciences (KTUAS), in particular Pauli Vaara, who has been of great assistance in the work with the measurement setup as well as when performing the measurements.

This work was partly funded by the Swedish Steel Producers' Association, Jernkontoret.

References

[1] B. G. Thomas, “On-line Detection of Quality Problems in Continuous Casting of Steel,” in Modeling, Control and Optimization in Ferrous and Nonferrous Industry, 2003 Materials Science and Technology Symposium, pp. 29–45, 2003.

[2] P. Meilland, “Novel Multiplexed Eddy-Current Array for Surface Crack Detection on Rough Steel Surface,” Proceedings of the 9th European Conference on Non-Destructive Testing (ECNDT), Berlin, 2006.

[3] M. Allazadeh, C. Garcia, K. Alderson, and A. Deardo, “Ultrasonic image analysis of steel slabs,” Advanced Materials & Processes, vol. 166, no. 12, pp. 26–27, 2008.

[4] I. Alvarez, J. Marina, J. Enguita, C. Fraga, and R. Garcia, “Industrial online surface defects detection in continuous casting hot slabs,” in Proceedings of SPIE, vol. 7389, p. 73891X, 2009.

[5] A. Landström and M. J. Thurley, “Morphology-based crack detection for steel slabs,” IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 7, pp. 866–875, 2012.

[6] E. Malamas, E. Petrakis, M. Zervakis, L. Petit, and J. Legat, “A survey on industrial vision systems, applications and tools,” Image and Vision Computing, vol. 21, no. 2, pp. 171–188, 2003.

[7] R. Woodham, “Photometric method for determining surface orientation,” Optical Engineering, vol. 19, no. 1, pp. 139–144, 1980.

[8] R. Kozera, “Existence and uniqueness in photometric stereo,” Applied Mathematics and Computation, vol. 44, no. 1, pp. 1–103, 1991.

[9] M. Drew, “Robust specularity detection from a single multi-illuminant color image,” Computer Vision, Graphics, and Image Processing: Image Understanding, vol. 59, no. 3, pp. 320–327, 1994.

[10] G. Balschbach, J. Klinke, and B. Jähne, “Multichannel shape from shading techniques for moving specular surfaces,” in Computer Vision – ECCV’98, ser. Lecture Notes in Computer Science, H. Burkhardt and B. Neumann, Eds., vol. 1407, pp. 170–184. Springer Berlin Heidelberg, 1998.

[11] D. Kang, Y. J. Jang, and S. Won, “Development of an inspection system for planar steel surface using multispectral photometric stereo,” Optical Engineering, vol. 52, no. 3, p. 039701, 2013.

[12] S. K. Nayar and R. M. Bolle, “Reflectance based object recognition,” International Journal of Computer Vision, vol. 17, no. 3, pp. 219–240, 1996.

[13] A. Landstr¨omand M. J. Thurley, “Adaptive morphology using tensor-based elliptical structuring elements,” Pattern Recognition Letters, vol. 34, no. 12, pp. 1416–1422, 2013.

[14] R. Tibshirani, “Regression shrinkage and selection via the lasso,” Journal of the Royal Statistical Society. Series B (Methodological), pp. 267–288, 1996. Paper D Image Reconstruction by Prioritized Incremental Normalized Convolution

Authors: Anders Landström, Frida Nellros, Håkan Jonsson, and Matthew J. Thurley

Reformatted version of paper originally published in: Proceedings of the 17th Scandinavian Conference on Image Analysis (SCIA), 2011, pp. 176–185.

© 2011, Springer. Reprinted with permission.

Image Reconstruction by Prioritized Incremental Normalized Convolution

Anders Landström, Frida Nellros, Håkan Jonsson, and Matthew J. Thurley

Abstract

A priority-based method for pixel reconstruction and incremental hole filling in incomplete images and 3D surface data is presented. The method is primarily intended for reconstruction of occluded areas in 3D surfaces and makes use of a novel prioritizing scheme, based on a pixelwise defined confidence measure, that determines the order in which pixels are iteratively reconstructed. The actual reconstruction of individual pixels is performed by interpolation using normalized convolution. The presented approach has been applied to the problem of reconstructing 3D surface data of a rock pile as well as randomly sampled image data. It is concluded that the method is not optimal in the latter case, but the results show an improvement over ordinary normalized convolution when applied to the rock data and are in this case comparable to those obtained from normalized convolution using adaptive neighborhood sizes.

1 Introduction

There are many ways in which an image can be incomplete. Image sensors can be faulty, 3D surface image data can contain areas of missing data due to surface reflectance properties and occlusion [9], or image pixels can be lost or distorted during transmission of data. As an example of missing data, including sensor occlusion, consider Fig. 6 (on page 133) showing two grayscale images depicting 3D surface data of a rock pile where missing pixels are marked in black. The image to the right shows the rows of range data as measured by a structured lighting sensor [9]. To simplify later analysis of the measurements, we consider how these missing pixels can be reconstructed. While many techniques deal with reconstruction of randomly missing pixels [3, 6–8], there is also a potential benefit from being able to reconstruct missing regions within an image. This particular kind of image reconstruction is known as hole filling, and the existing techniques can be summarized into three broad categories.

Inpainting describes the technique where an artist reconstructs missing sections in a painting. This process can be formalized into solvable mathematical problems with the aim of producing visually pleasing images [1, 2].

Geometric methods are typically used when image data comprises a surface of 3D points, or it is appropriate to represent image pixels in this way. Points are triangulated into a mesh, upon which the reconstruction is based. Holes appear as non-triangulated parts of the mesh. These methods fill each hole by computing a suitable patch that fits seamlessly within the close proximity of the hole. The patch is then sampled to get values for the missing points [5].

Kernel regression methods are commonly used for reconstruction of images based on sparse sets of irregularly sampled pixels, but can also be applied to whole regions of missing pixels. These methods are based on a foundation of linear algebra and use basis expansions of local neighborhoods to improve or fill in the data of a pixel [8]. The neighborhoods are weighted by an applicability function, so the key point in these methods is how to choose the neighborhood size, shape, and applicability. More recently, methods have been presented that adapt the neighborhood size according to the density of sampled points in the neighborhood, and the shape of the applicability function to the shapes of edges surrounding the neighborhood [7, 8].

In this paper we present a Prioritized Incremental algorithm using Normalized Convolution (PINC) for reconstruction of missing regions in incomplete images and 3D surface data. Specifically, when applied to surface data of piled particles (e.g. rocks) the presented method seeks to reconstruct the data in a way that preserves local topological variation and particle distinctness. The method makes use of a novel prioritizing scheme, based on a pixelwise defined confidence measure, that determines the order in which pixels are iteratively reconstructed. The actual reconstruction is performed by interpolation using the kernel regression method known as normalized convolution [6].

2 Method

At any time during the reconstruction process for an image, every pixel belongs to one of the three following classes:

1. Valid pixels, where the original image contains data.
2. Unfilled pixels; non-valid pixels where no value has been assigned.
3. Filled pixels; non-valid pixels that have been assigned an interpolated value.

The presented method reconstructs holes (regions of unfilled pixels) in a data set by interpolation of unfilled pixels in an order such that those with more "reliable data" in their proximity are processed before those with less. In order to achieve this reconstruction, a prioritizing strategy needs to be defined. Such a prioritization can be achieved by using a measure of the "validity" of the neighbors of an unfilled pixel. Let d(\vec{x}) denote the 2D Euclidean distance from the pixel \vec{x} to the closest pixel containing valid data. A pixelwise confidence measure can then be defined as

w_c(\vec{x}) = \begin{cases} 1 & \text{if } \vec{x} \text{ contains valid data}, \\ 0 & \text{if } \vec{x} \text{ contains unfilled data}, \\ \frac{1}{d(\vec{x})+1} & \text{if } \vec{x} \text{ contains filled data}. \end{cases} \quad (1)

As unfilled pixels are filled, their confidence measure is updated according to (1). Pixel confidence values for filled data thus decrease monotonically from 1 at the border of valid data towards 0 deep inside the unfilled regions. Consider an unfilled pixel \vec{x} with a surrounding n × n neighborhood N_{\vec{x},n} containing pixels \vec{x}_{N,1}, \vec{x}_{N,2}, \ldots, \vec{x}_{N,n^2}. By summing up the confidences of the pixels in N_{\vec{x},3}, a priority measure p(\vec{x}), based on the confidence in the immediate neighborhood of \vec{x}, can then be defined as

p(\vec{x}) = \sum_{k=1}^{9} w_c(\vec{x}_{N,k}). \quad (2)

Since d(\vec{x}) \geq 0, we have 0 \leq w_c(\vec{x}) \leq 1 and 0 \leq p(\vec{x}) \leq 8. For example, if \vec{x} is an unfilled pixel entirely surrounded by valid data, p(\vec{x}) is 8 (the unfilled center itself contributes zero), and that pixel should consequently be filled in before a pixel with fewer valid neighbors. The range of p(\vec{x}) depends on w_c(\vec{x}), and will thus change if another confidence measure is selected. The idea is that w_c should be chosen so that holes are filled inwards from their perimeters. The order of reconstruction is thus determined by prioritizing pixels using the defined confidence measure. Unfilled pixels of equal priority are processed in the same step. Knowledge of data set restrictions can be included where values are known to lie within a certain interval. This is accomplished by truncating each assigned value to the interval limits directly after the interpolation step. The PINC algorithm is a combination of a strategy for selecting the order in which to fill in missing data and a method for assigning values to the unfilled pixels. It can be summarized as follows:

1. While unfilled pixels remain, do:
2.   Select the set X of pixels with highest priority.
3.   For each pixel \vec{x} in X:
4.     Obtain coefficients for a local polynomial expansion around \vec{x}.
5.     Approximate and constrain the value at \vec{x}.
6.   Update the confidences of X and recalculate priorities.
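The bookkeeping behind this scheme, i.e. the confidence measure of Eq. (1) and the 3 × 3 priority sum of Eq. (2), can be sketched as follows. This is our own minimal Python illustration under assumed names (`confidence`, `priority`), not the authors' implementation; the distance map d is taken as precomputed.

```python
# Pixel classes used by the algorithm (our own encoding).
VALID, UNFILLED, FILLED = 0, 1, 2

def confidence(cls, dist):
    """w_c of Eq. (1): 1 for valid data, 0 for unfilled data,
    and 1/(d+1) for filled pixels at distance d from valid data."""
    if cls == VALID:
        return 1.0
    if cls == UNFILLED:
        return 0.0
    return 1.0 / (dist + 1.0)

def priority(x, y, classes, dist):
    """p of Eq. (2): the sum of w_c over the 3x3 neighborhood of
    pixel (x, y); out-of-bounds neighbors are simply skipped."""
    h, w = len(classes), len(classes[0])
    total = 0.0
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w:
                total += confidence(classes[ny][nx], dist[ny][nx])
    return total
```

An unfilled pixel entirely surrounded by valid data then receives the maximum priority p = 8, since its own term in the sum is zero.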

2.1 Local polynomial expansion: assigning values

Let \vec{x} denote an unfilled pixel and f the pixelwise signal values of the data set. The value at \vec{x}, f(\vec{x}), can be approximated by the constant-term coefficient of a best-fit local expansion in a selected basis (for instance, consider the constant term in a Taylor expansion). In this work a polynomial basis is used. Expressing the data values for N_{\vec{x},n} by the signal vector

\vec{f} = \vec{f}(N_{\vec{x},n}) = (f(\vec{x}_{N,1}), f(\vec{x}_{N,2}), \ldots, f(\vec{x}_{N,n^2}))^T \in \mathbb{R}^{n^2} \quad (3)

and letting \{\vec{b}_1, \vec{b}_2, \ldots, \vec{b}_m\} constitute a set of m < n^2 linearly independent bases spanning a subspace S of \mathbb{R}^{n^2}, it is possible to approximate \vec{f} by its projection \vec{f}_S onto S. By

letting \vec{B} = (\vec{b}_1\ \vec{b}_2\ \cdots\ \vec{b}_m) denote the basis matrix and \vec{c} = (c_1, c_2, \ldots, c_m)^T represent the corresponding coefficients for \vec{f}_S, we can write \vec{f}_S = \vec{B}\vec{c}. The coefficients contained in \vec{c} are given by

\arg\min_{\vec{c} \in \mathbb{R}^m} \|\vec{f}_S - \vec{f}\| = \arg\min_{\vec{c} \in \mathbb{R}^m} \|\vec{B}\vec{c} - \vec{f}\|, \quad (4)

which can be recognized as a least squares problem. However, an adjustment of the influence of the different pixels in N_{\vec{x},n} is desired, so that pixels closer to \vec{x} have a greater impact on the result than those further away. Care should also be taken regarding the reliability of the values in the neighborhood pixels. These desired objectives of pixelwise influence and reliability can be achieved by using normalized convolution with the diagonal matrices for applicability and certainty given by

\vec{W}_a = \operatorname{diag}\!\left(w_a(\vec{x}_{N,1}), \ldots, w_a(\vec{x}_{N,n^2})\right) \quad \text{and} \quad \vec{W}_c = \operatorname{diag}\!\left(w_c(\vec{x}_{N,1}), \ldots, w_c(\vec{x}_{N,n^2})\right),

respectively. For all pixels in the neighborhood N_{\vec{x},n}, w_a(\vec{x}_{N,k}) is a Gaussian mask providing applicability weights and w_c(\vec{x}_{N,k}) is the corresponding confidence mask, where k = 1, 2, \ldots, n^2. Following the outline provided by Farnebäck [4], influences of the neighborhood pixels in (4) are assigned weights by a matrix \vec{W}, implicitly defined by \vec{W}^2 = \vec{W}_a \vec{W}_c (Fig. 1). A vector \vec{c}_W, representing the basis coefficients for the weighted neighborhood, can then be obtained from

\arg\min_{\vec{c}_W \in \mathbb{R}^m} \|\vec{W}\vec{B}\vec{c} - \vec{W}\vec{f}\|. \quad (5)

The solution to this problem is then given by

\vec{c}_W = (\vec{B}^T \vec{W}_a \vec{W}_c \vec{B})^{-1} \vec{B}^T \vec{W}_a \vec{W}_c \vec{f}, \quad (6)

which can be solved efficiently [4]. Once the coefficients in \vec{c}_W have been calculated, the coefficient corresponding to the constant basis function can be retrieved, providing the approximation of the pixel value f(\vec{x}).
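As an illustration, the weighted least-squares solve of Eq. (6) for one neighborhood can be written out explicitly. This is a hedged numpy sketch under our own naming (`weighted_coefficients`), using the polynomial bases {1, x, y}; it is not the authors' code.

```python
import numpy as np

def weighted_coefficients(f, w_a, w_c):
    """Sketch of Eq. (6): coefficients c_W of a weighted least-squares
    fit of the bases {1, x, y} to a flattened n x n neighborhood.
    f: pixel values, w_a: applicability weights, w_c: confidence
    weights (all of length n*n)."""
    n = int(round(np.sqrt(f.size)))
    ax = np.arange(n) - n // 2
    Y, X = np.meshgrid(ax, ax)                     # local coordinates
    B = np.stack([np.ones(f.size), X.ravel(), Y.ravel()], axis=1)
    W = w_a * w_c                                  # diagonal of W_a W_c
    BtW = B.T * W                                  # B^T W_a W_c
    return np.linalg.solve(BtW @ B, BtW @ f)       # normal equations
```

For an unfilled pixel, `f` holds the (flattened) neighborhood values, and the returned constant-term coefficient is the interpolated value assigned to the pixel.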


Figure 1: Example of neighborhood weights corresponding to values of \vec{W}_a (left), \vec{W}_c (center) and \vec{W}^2 (right) for N_{\vec{x},9}.

3 Experiments and results

Three different data sets were reconstructed by both ordinary normalized convolution (NC) and the PINC algorithm:

1. A gray-scale image, corrupted by randomly removing 90% of the pixels.

2. The same image as in 1, with holes of three different shapes.

3. 3D surface data for a pile of rocks, where data is partially missing.

3.1 Reconstruction of randomly removed data

Figure 2 shows an original image and a version with 90% of the pixels randomly removed. Reconstructions were performed using a 15 × 15 neighborhood for coefficient extraction and a Gaussian mask with σ = 1.5 as the applicability function. The original image is restricted to values in the range [0, 1], so these limits were chosen as constraints for the reconstruction. Results of the NC and PINC algorithms, using zeroth and second order polynomials, are presented in Fig. 3. It should be noted that the PINC algorithm provides a less detailed result, but shows a more stable behavior for higher order polynomials, where ordinary normalized convolution returns small regions of extreme values (Fig. 3, upper right).

3.2 Reconstruction of holes

Figure 4 shows the same image as in the previous section (Sec. 3.1), now artificially corrupted by creating holes. The resulting reconstructed images, obtained using the same parameter setup as in the previous section, are presented in Fig. 5. Differences between the two reconstruction algorithms are visible, especially in the row of circles crossing the face region. Black and white defects remain in the lower section of the reconstructions performed by ordinary normalized convolution, especially for higher order polynomial expansions.

3.3 Reconstruction of missing 3D surface data

3D surface data from a structured lighting sensor [9] comprising a camera and a projector was used to test the algorithm. The data consists of spatially separated rows of 3D data points recorded on a 256 × 256 image grid and contains occlusions where the surface structure obscures the reflected light from reaching the camera (Fig. 6, right). The geometry of the sensor provides a pixelwise upper limit for the reconstruction of the occluded data, in the form of a linear interpolation between the measured pixels. The performance of the PINC algorithm was measured by calculating Root Mean Squared Error (RMSE) values between the interpolated data and a second data set comprising six overlapping measurements (Fig. 6, left), after rescaling the data sets to gray scale images of range [0, 1]. For comparison, the surface was also reconstructed by three

Figure 2: The original image (left) and the version with 90% randomly removed data (right).

Figure 3: Reconstructions of the right image in Fig. 2 using NC (first row) and PINC (second row). Orders of the polynomials used for interpolation are 0 (left) and 2 (right).

Figure 4: The original image (left) and a version with holes (right).

Figure 5: Reconstructions of the right image in Fig. 4 using NC (first row) and PINC (second row). Orders of the polynomials used for interpolation are 0 (left) and 2 (right).

different NC approaches. First, a neighborhood of the same size as used for the PINC algorithm, 9 × 9 pixels, was applied. Secondly, the neighborhood was extended to 15 × 15 to avoid the type of holes visible in the NC results presented in Fig. 5 (upper right). Thirdly, the NC algorithm was used with adaptive neighborhoods, where for each pixel the smallest surrounding neighborhood containing at least 25% valid data was used for interpolation. Results for second order polynomials are presented in Fig. 7. For the 9 × 9 neighborhood NC reconstruction, the RMSE value for the resulting data is 0.055. It should be noted that in this case the small neighborhood does not bridge the regions of missing data, giving a potentially large error for those pixels. The 15 × 15 neighborhood NC reconstruction fills in the holes, providing an RMSE value of 0.031. NC reconstruction using an adaptive neighborhood gives a lower RMSE value, 0.030. Finally, the suggested PINC algorithm gives the RMSE value 0.029.

4 Discussion

As can be seen in Fig. 3, the PINC algorithm does not reconstruct fine details as effectively as NC for 90% randomly distributed missing data. The region-growing nature of PINC can cause image structure originating from a local cluster of valid pixels to spread over the image and influence the reconstruction around more isolated pixels. However, while the presented incremental method is less likely to capture small details in the randomly sampled data, it is less sensitive to extreme values when using higher order polynomials. From Fig. 5, it is clear that the suggested incremental approach fills in missing data where ordinary normalized convolution does not. This is because the chosen neighborhood is too small to bridge the largest holes. The problem can be approached by using adaptive neighborhoods, as described in [7, 8]. Also, the result from the PINC algorithm is in general more visually pleasing than the NC reconstructed image. The presented results for 3D range data show that our method gives the best RMSE value for the tested data. Also, as expected, we see that NC needs a larger neighborhood to cover the missing regions. The use of locally adaptive neighborhoods partially solves this problem, but demands more computational power due to the unconstrained size of the neighborhoods when available pixels become very sparse. However, since the PINC algorithm currently reaches the same performance as NC with adaptive neighborhood sizes, it should be possible to improve PINC further by incorporating the locally adaptive neighborhood and applicability techniques presented in [7, 8]. Even though the NC reconstruction with a 15 × 15 neighborhood here produces an RMSE value that is comparable to PINC and adaptive NC, this is not something we can expect to hold in the general case.
The images used in this work are of quite low resolution, and all occluded regions are roughly of the same size, which in this case makes the 15 × 15 neighborhood suitable for all regions. With occluded regions of different sizes, NC would require a neighborhood that bridges the largest occluded region, which would introduce smoothing in the smaller cavities.

Figure 6: 3D surface data of a rock pile: Combined information from all six measurements (left) and from one measurement only (right).

Figure 7: Reconstructions of 3D surface data (Fig. 6, right) using 9 × 9 neighborhood NC (upper left), 15 × 15 neighborhood NC (upper right), adaptive neighborhood NC (lower left), and 9 × 9 neighborhood PINC (lower right). Order 2 polynomials were used, giving RMSE values 0.055, 0.031, 0.030 and 0.029, respectively.

5 Conclusion

By measuring RMSE values between a reconstructed partially occluded 3D rock pile surface and its true topology, we conclude that the suggested image reconstruction by prioritized incremental normalized convolution (PINC) performs better than ordinary normalized convolution (NC). Adapting the size of the neighborhoods seems to be another possible approach for improving the performance of NC, but we have shown that a comparable result can be achieved using smaller neighborhoods. The presented hole filling and reconstruction of randomly sampled data (Figs. 5 and 3, respectively) highlight the differences between the PINC and NC algorithms. The PINC algorithm is not adapted for reconstruction of data sets where most data is randomly removed, but is useful for its intended purpose: filling holes in 3D surface data.

References

[1] Amir Averbuch, G. Gelles, and Alon Schclar. Fast hole-filling in images via fast comparison of incomplete patches. In Proceedings of Multimedia Content Representation, Classification and Security, MRCS 2006, pages 738–744, Istanbul, Turkey, September 2006.

[2] M. Bertalmio, G. Sapiro, V. Caselles, and C. Ballester. Image inpainting. In Proceedings of the 27th annual conference on Computer graphics and interactive techniques, SIGGRAPH '00, pages 417–424. ACM Press, 2000.

[3] Flore Faille and Maria Petrou. Invariant image reconstruction from irregular samples and hexagonal grid splines. Image and Vision Computing, 28(8):1173–1183, 2010.

[4] Gunnar Farnebäck. Polynomial Expansion for Orientation and Motion Estimation. PhD thesis, Linköping University, Sweden, 2002.

[5] Tao Ju. Fixing geometric errors on polygonal models: A survey. Journal of Computer Science and Technology, 24:19–29, 2009.

[6] H. Knutsson and C.-F. Westin. Normalized and differential convolution. In Pro- ceedings of IEEE Computer Society Conference on Computer Vision and Pattern Recognition, CVPR ’93, pages 515–523, June 1993.

[7] Tuan Q. Pham, Lucas J. van Vliet, and Klamer Schutte. Robust fusion of irregularly sampled data using adaptive normalized convolution. EURASIP J. Appl. Signal Process., 2006:236–236, January 2006.

[8] H. Takeda, S. Farsiu, and P. Milanfar. Kernel regression for image processing and reconstruction. IEEE Transactions on Image Processing, 16(2):349–366, February 2007.

[9] M. J. Thurley and K. C. Ng. Identifying, visualizing, and comparing regions in irregularly spaced 3D surface data. Computer Vision and Image Understanding, 98(2):239–270, 2005.

Paper E

Adaptive Morphological Filtering of Incomplete Data

Authors: Anders Landström, Matthew J. Thurley, and Håkan Jonsson

Reformatted version of paper originally published in: Proceedings of the 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA), 2013, pp. 428–434.

© 2013, Institute of Electrical and Electronics Engineers. Reprinted with permission.

Adaptive Morphological Filtering of Incomplete Data

Anders Landström, Matthew J. Thurley, and Håkan Jonsson

Abstract

We demonstrate how known convolution techniques for uncertain data can be used to set the shapes of structuring elements in adaptive mathematical morphology, enabling robust morphological processing of partially occluded or otherwise incomplete data. Results are presented for filtering of both gray-scale images containing missing data and 3D profile data where information is missing due to occlusion effects. The latter demonstrates the intended use of the method: enhancement of crack signatures in a surface inspection system for cast steel. The presented method is able to disregard unreliable data in a systematic and robust way, enabling adaptive morphological processing of the available information while avoiding false edges or other unwanted features introduced by the values of faulty pixels.

1 Introduction

1.1 Background

Mathematical morphology is a non-linear framework for image processing, originally developed by Matheron [1] and Serra [2]. The methods are based on set theory, focusing on geometrical structure. In mathematical morphology, filtering is performed by probing an image with functions known as structuring elements (SEs), considering where in the data the SEs do or do not fit. In standard mathematical morphology, the same SE (defined by the user) is used to filter the whole image. This is often not ideal, however, and therefore there is an ongoing interest in adaptive mathematical morphology, where SEs are allowed to vary from pixel to pixel. Previous work [3] has demonstrated how the Local Structure Tensor (LST) can be used to assign the shapes of elliptical SEs for adaptive mathematical morphology. These elliptical SEs range from lines to disks depending on the degree of orientation dominancy (anisotropy) around the pixel for which each of the individual SEs is defined. We here present how these methods can be extended with established techniques for handling partly uncertain data, enabling processing of incomplete (i.e. partly missing) data directly, without need for additional pre-processing steps such as inpainting or other types of reconstruction of missing information.

139 140 Paper E

Distinguishing between known and unknown pixel values, i.e. considering pixel certainty, is highly important whenever input data is incomplete. In particular, pixel certainty should be used to handle missing values in our intended application: crack detection in 3D steel surface profile data captured by laser triangulation. In this type of data, pixel values may be missing due to occlusion, which occurs when the surface itself blocks the line-of-sight between the camera sensor and the laser line projected onto the measured surface. Such missing pixels may be located within cracks but also appear around scales, which constitute a brittle oxidized top layer resulting from the casting process itself. Many different methods for reconstruction of images exist, including inpainting [4] and kernel methods [5]. Inpainting strives to fill holes of missing values with statistically likely data, similar to what an artist would do when reconstructing a missing piece of a painting. Kernel methods rely on regression or interpolation of missing data based on pixel values in the surrounding regions, as defined by a kernel function penalizing distance away from the considered pixel. Our goal, however, is not to achieve reconstruction of missing values, but to prevent missing pixels from affecting the filtered image. Hence, rather than first reconstructing data from some given model, we use a kernel method known as normalized convolution [6] to define a method for adaptive morphological filtering which is unaffected by missing data. The resulting method ignores missing pixels and adjusts affected quantities accordingly. This is achieved using only a few user-set parameters, simplifying filter optimization for a given task.

1.2 Contribution

This work combines existing techniques, and demonstrates how a robust method for adaptive morphological processing which can handle missing pixel values without additional pre- or post-processing can be achieved by adopting existing computationally efficient methods intended for processing data of signal/certainty type. The contribution of this work thereby lies not in the individual parts, but in how they are combined into a robust technique for adaptive morphological processing of partly occluded or otherwise incomplete data. We emphasize the difference from reconstruction methods: reconstruction could certainly be applied as a pre-processing step, but this work demonstrates how missing data can be handled directly by the technique, without any need for additional pre-processing. The presented method is not limited to 3D profile data, but can be applied for adaptive morphological filtering of any type of incomplete images where not all pixel values are considered reliable. This could be a result of faulty image sensors, surface reflectance properties, or image pixels being lost or distorted during data transmission.

2 Theory

2.1 Adaptive morphology using elliptical structuring elements

The LST is a mathematical entity holding information about orientational structure in the data. Originally introduced by Knutsson [7], it constitutes a commonly used method for estimating directional features in 2D images as well as in higher dimensions. As demonstrated in Ref. [3], adaptive elliptical SEs for mathematical morphology can be set for each pixel x = (x_1\ x_2)^T from its local structure tensor

T(x) = G_\sigma \ast \left( \nabla f(x) \nabla^T f(x) \right). \quad (1)

Here f(x) denotes the image value for each pixel x, \nabla = \left(\frac{\partial}{\partial x_1}\ \ \frac{\partial}{\partial x_2}\right)^T is the gradient (nabla) operator, and G_\sigma is a Gaussian kernel with standard deviation \sigma which regularizes the matrix. For a more thorough background on image processing using the LST, the reader is referred to Ref. [8]. In Ref. [3] the eigenvalues and eigenvectors of T(x), which contain the dominant orientation in the data around each pixel, are used to set an elliptical SE for each pixel. More specifically, the semi-major axis length a(x) and the semi-minor axis length b(x) are defined as

a(x) = \frac{\lambda_1(x) + \epsilon}{\lambda_1(x) + \lambda_2(x) + 2\epsilon} M, \quad (2)

b(x) = \frac{\lambda_2(x) + \epsilon}{\lambda_1(x) + \lambda_2(x) + 2\epsilon} M, \quad (3)

where \lambda_1(x) and \lambda_2(x) are the (positive) eigenvalues of the local structure tensor T(x), \epsilon is a small positive number (i.e. machine epsilon) added to handle situations where the eigenvalues are zero, and M is the user-defined maximum length of the semi-major axis. The orientation \phi(x) is retrieved from the corresponding eigenvectors by

\phi(x) = \begin{cases} \arctan\!\left(\dfrac{e_{2,x_2}(x)}{e_{2,x_1}(x)}\right), & e_{2,x_1}(x) \neq 0, \\[2mm] \pi/2, & e_{2,x_1}(x) = 0, \end{cases} \quad (4)

where e_{2,x_1}(x) and e_{2,x_2}(x) denote the components of the eigenvector e_2(x). The result is an adaptive method for mathematical morphology, in which elliptical SEs E(x) align to directional structures in the data but remain disk-shaped where no such dominant direction exists. These SEs are so-called flat, meaning that they simply describe a neighborhood in which the morphological operations are performed (as opposed to SEs in the form of functions). Using flat SEs E(x), the two basic morphological operations erosion, \varepsilon_E(f), and dilation, \delta_E(f), of an image f(x) are defined as

\varepsilon_E(f)(x) = \bigwedge \{f(u) : u \in E(x)\}, \quad (5)

\delta_E(f)(x) = \bigvee \{f(u) : x \in E(u)\}, \quad (6)

where \bigwedge and \bigvee denote the minimum and maximum operators, respectively. The morphological opening \gamma_E and closing \varphi_E are then defined as

γE(f) = (δE ◦ εE)(f), (7)

\varphi_E(f) = (\varepsilon_E \circ \delta_E)(f). \quad (8)

The above formulation does not, however, produce satisfactory results for data where pixel information is uncertain or even missing: all pixels are allowed to affect the final outcome, and pixels that should be irrelevant due to low certainty may hold extreme values, with serious consequences for the minimum/maximum operations.
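To make Eqs. (2)–(8) concrete, the following numpy sketch (our own illustration under hypothetical function names, not the published implementation) computes the elliptical SE parameters from a 2 × 2 structure tensor and applies flat adaptive erosion and dilation with per-pixel neighborhoods, where each `E[y][x]` is a list of pixel offsets:

```python
import numpy as np

def ellipse_params(T, M):
    """Semi-axes a, b (Eqs. (2)-(3)) and orientation phi (Eq. (4)) from a
    symmetric 2x2 local structure tensor T; M caps the semi-major axis."""
    eps = np.finfo(float).eps
    vals, vecs = np.linalg.eigh(T)          # eigenvalues in ascending order
    lam1, lam2 = vals[1], vals[0]           # lambda_1 >= lambda_2
    denom = lam1 + lam2 + 2 * eps
    a = (lam1 + eps) / denom * M            # Eq. (2)
    b = (lam2 + eps) / denom * M            # Eq. (3)
    e2 = vecs[:, 0]                         # eigenvector of lambda_2
    if np.isclose(e2[0], 0.0):              # vertical structure case
        return a, b, np.pi / 2
    return a, b, np.arctan(e2[1] / e2[0])   # Eq. (4)

def adaptive_erosion(f, E):
    """Eq. (5): minimum of f over each pixel's own element E(x)."""
    h, w = f.shape
    out = np.empty_like(f)
    for y in range(h):
        for x in range(w):
            out[y, x] = min(f[y + dy, x + dx] for dy, dx in E[y][x]
                            if 0 <= y + dy < h and 0 <= x + dx < w)
    return out

def adaptive_dilation(f, E):
    """Eq. (6): pixel u contributes f(u) to every x in E(u); note the
    asymmetry with erosion."""
    h, w = f.shape
    out = np.full_like(f, -np.inf)
    for y in range(h):
        for x in range(w):
            for dy, dx in E[y][x]:
                u, v = y + dy, x + dx
                if 0 <= u < h and 0 <= v < w:
                    out[u, v] = max(out[u, v], f[y, x])
    return out
```

Opening and closing (Eqs. (7)–(8)) then follow by composing the two operations.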

2.2 Convolving data of signal/certainty type

Normalized convolution was introduced by Knutsson and Westin [6] and provides a means for processing data where pixel values are not considered equally reliable. This is done by assigning to each pixel a corresponding certainty weight representing our trust in the (scalar or higher tensor) value of that pixel. Following the notation of Knutsson and Westin [6] (slightly adapted for our terminology), a generalized convolution can be defined as

\{aB \,\hat{\ast}\, cf_T\}(x) = \sum_{u \in N_x} a(u - x) B(u - x) \cdot c(u) f_T(u), \quad (9)

where a and c denote the applicability and certainty weight functions, respectively, B contains the set of basis functions used, and f_T here represents the tensor-valued signal (in our case either the gray-level function f or the LST T). The (\cdot) symbol denotes a multilinear operation, and the (\hat{\ }) symbol marks the operation over which the convolution is performed. N_x is the neighborhood of pixels u around x on which the applicability function a(u - x) is non-zero. Note that we define u and x in a global coordinate system, while the difference (u - x) = \xi is a local coordinate, relative to x, on which the applicability function and the basis functions depend. Normalized convolution can then be formulated as

C_N(\,x \mid a, B, c, f_T) = N^{-1}(x) D(x), \quad (10)

where

D(x) = \{aB \,\hat{\ast}\, cf_T\}(x), \quad (11)

N(x) = \{aBB^{\ast} \,\hat{\ast}\, c\}(x), \quad (12)

and provides a tool for performing convolution operations on signals where our trust in the pixel (tensor) values varies over the image. In order to simplify notation, we have omitted the dependency of D and N on a, B, c, and f_T.

The output, C_N, contains a set of basis coefficients which are optimal in a least-squares sense. By using polynomial basis functions for approximating the signal within a certain neighborhood (including the constant basis b_0 = 1), i.e. by calculating the coefficients \beta_k, k = 0, 1, 2, for the three polynomial basis functions \{b_k\} = \{1, \xi_1, \xi_2\} in

f(x + \xi) \approx \sum_{k=0}^{2} \beta_k b_k(\xi) = \beta_0 \cdot 1 + \beta_1 \cdot \xi_1 + \beta_2 \cdot \xi_2, \quad (13)

we can estimate the gradient as the coefficients of the two bases b_1 = \xi_1 and b_2 = \xi_2, given by the local pixel coordinate \xi = (\xi_1\ \xi_2)^T. To see this, simply consider the first order Taylor expansion around a pixel x given by

f(x + \xi) \approx f(x) \cdot 1 + \frac{\partial f(x)}{\partial x_1} \cdot \xi_1 + \frac{\partial f(x)}{\partial x_2} \cdot \xi_2. \quad (14)

From Eqs. (13) and (14), the estimated coefficients \beta_1 and \beta_2 are identified as the approximated components of the function gradient. In practice, this can be done by another related operation: normalized differential convolution, given by

C_{N\Delta}(\,x \mid a, B, c, f_T) = N_\Delta^{-1}(x) D_\Delta(x), \quad (15)

where

D_\Delta(x) = \left( \{a \,\hat{\ast}\, c\}\{aB \,\hat{\ast}\, cf_T\} - \{aB \,\hat{\ast}\, c\}\{a \,\hat{\ast}\, cf_T\} \right)(x), \quad (16)

N_\Delta(x) = \left( \{a \,\hat{\ast}\, c\}\{aBB^{\ast} \,\hat{\ast}\, c\} - \{aB \,\hat{\ast}\, c\}\{aB^{\ast} \,\hat{\ast}\, c\} \right)(x). \quad (17)

This operation is more complex, and for more details the reader is referred to Knutsson and Westin [6, 9]. Most importantly for our purposes, it removes the need for a constant basis and therefore reduces the dimensionality of the involved matrices. It has been shown that the result of normalized differential convolution is equivalent to that of normalized convolution with the constant basis added [10].

3 Method

In the following text we assume that our input data is of signal/certainty type, i.e. every pixel value, denoted either f(x) or T(x) depending on whether we are referring to the gray-level f or the tensor T, is associated with a corresponding certainty (or confidence) weight c(x) \in [0, 1], representing our trust in the value at pixel x. In the case of unknown data, e.g. occluded pixels in laser triangulation, we have for the original input data f(x)

c_f(x) = \begin{cases} 0 & \text{if the value of } x \text{ is unknown}, \\ 1 & \text{otherwise}. \end{cases} \quad (18)

As already pointed out, this case cannot be directly handled by the more naive approach given by Eqs. (1)–(4), but the tensor formulation allows for a straightforward adaption by applying normalized convolution. Generalizing convolutions to signals with corresponding certainties, this concept is highly useful for our purpose.

3.1 Estimating the gradient

The benefits of using normalized differential convolution for estimating gradients in the case of missing data have been demonstrated by Knutsson and Westin [6], and we here simply adapt their terminology to this work: following Eqs. (13) and (14), we can estimate the gradient components as the coefficients β1 and β2 for the corresponding basis functions b1(ξ) = ξ1 and b2(ξ) = ξ2, as estimated by Eqs. (15)–(17), i.e. the gradient ∇f is given by

∇f(x) ≈ CN∆(x | aξ, [ξ1, ξ2]^T, cf, f) = N∆^{-1}(x) D∆(x),   (19)

where aξ is a user-chosen applicability function defining the influence of the neighborhood around a considered point.
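The gradient estimate of Eq. (19) can be sketched image-wide with convolutions. The sketch below uses normalized convolution with the bases {1, ξ1, ξ2}, which is equivalent to normalized differential convolution with the constant basis added [10]; the function name and the flat applicability are illustrative assumptions:

```python
import numpy as np
from scipy.ndimage import correlate

# Hedged sketch of Eq. (19): per-pixel least-squares gradient via normalized
# convolution. D_k = (a b_k) correlated with c*f ; N_kl = (a b_k b_l)
# correlated with c; solve a 3x3 system per pixel.
def nc_gradient(f, c, a):
    r = a.shape[0] // 2
    xi1, xi2 = np.meshgrid(np.arange(-r, r + 1), np.arange(-r, r + 1), indexing="ij")
    basis = [np.ones_like(a), xi1.astype(float), xi2.astype(float)]
    D = np.stack([correlate(c * f, a * b, mode="nearest") for b in basis], -1)
    N = np.stack([np.stack([correlate(c, a * bk * bl, mode="nearest")
                            for bl in basis], -1) for bk in basis], -2)
    beta = np.linalg.solve(N, D[..., None])[..., 0]
    return beta[..., 1], beta[..., 2]      # the two gradient components

y, x = np.mgrid[0:8, 0:8].astype(float)
f = 2 * y + 3 * x                           # known gradient (2, 3)
g1, g2 = nc_gradient(f, np.ones_like(f), np.ones((3, 3)))
print(np.round([g1[4, 4], g2[4, 4]], 6))    # [2. 3.] in the interior
```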

3.2 Estimating the LST

We should not use our estimated ∇f directly in Eq. (1), but instead identify that equation as a particular case of normalized convolution: Eq. (1) can be seen as normalized convolution of a fully trusted signal (full certainty for all pixels) using a constant basis. In our case, the fact that the estimated gradients ∇f(x) rely on different amounts of available information around x should be taken into account. Hence, we need to set a certainty c∇f(x), representing our trust in the gradient obtained by Eq. (19). A simple approach would be to use the quantity aξ ∗ cf (the standard convolution of the certainty weights), but this would not take the effects of the basis functions into account. The quantity N∆, however, contains a description of the certainties associated with the new basis functions [6], and thereby holds certainty information for the estimated gradient ∇f. As a measure of this certainty, we use the determinant of N∆, i.e.

c∇f(x) = |N∆(x)|.   (20)

This choice of c∇f, previously used by Westin [9, p. 104], captures the amount of certainty associated with the new basis functions, and reduces the impact on the final local structure tensor of gradients estimated from only a small number of pixels, in favor of gradients estimated using more available information. The local structure tensor T(x) can now be calculated from Eqs. (10)–(12) as

T(x) = CN(x | a1, 1, c∇f, ∇f∇^T f) = N^{-1}(x) D(x),   (21)

where a1 is an applicability function chosen by the user (note that a1 is, in general, different from aξ). This applicability determines the level of smoothing of the tensor field. Note that in practice the operations can be performed element-wise for the three defining elements of the symmetric 2×2 matrix ∇f∇^T f, which simplifies calculations drastically.
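The element-wise computation mentioned above can be sketched as certainty-weighted smoothing of the three distinct entries of the gradient outer product (a hedged sketch; the flat applicability and function name are assumptions for illustration):

```python
import numpy as np
from scipy.ndimage import convolve

# Hedged sketch of Eq. (21): the LST as normalized-convolution smoothing of
# the gradient outer product, element-wise for the symmetric 2x2 tensor.
def local_structure_tensor(g1, g2, c_grad, a1):
    T = np.empty(g1.shape + (2, 2))
    for (i, j), p in [((0, 0), g1 * g1), ((0, 1), g1 * g2), ((1, 1), g2 * g2)]:
        num = convolve(c_grad * p, a1, mode="constant")
        den = convolve(c_grad, a1, mode="constant")
        T[..., i, j] = num / np.maximum(den, 1e-12)
    T[..., 1, 0] = T[..., 0, 1]             # symmetry
    return T

g1 = np.ones((5, 5)); g2 = np.zeros((5, 5))  # uniform vertical gradient
T = local_structure_tensor(g1, g2, np.ones((5, 5)), np.ones((3, 3)))
print(T[2, 2])    # [[1. 0.] [0. 0.]]: anisotropic tensor along the gradient
```

The eigenvectors/eigenvalues of each 2×2 tensor then encode the local orientation and anisotropy used in the next step.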

3.3 Adaptive morphological processing

The eigenvectors and eigenvalues can then be used to set the elliptical SEs as described by Eqs. (2)–(4) in Sect. 2.1. In the actual morphological processing, based on min/max operations over the pixel-dependent neighborhoods defined by the elliptical SEs, missing data is then handled by simply ignoring the values of such pixels. This is easily achieved by setting the corresponding pixel values to −∞/+∞, depending on the type of operation. Apart from the maximum length of the semi-major axis, M, the applicability functions aξ and a1 need to be defined. We set them as Gaussian kernels defined by radial bandwidths rξ and r1, related to the standard deviations σξ and σ1 of the two Gaussian kernels by

σ(·) = r(·) / √(2 ln 2).   (22)

Hence, the applicability functions aξ and a1 decrease to half their maximum value at distances rξ and r1, respectively, from their centers. As an overview, the whole method can be summarized in the following steps:

1. Calculate the gradient ∇f by Eq. (19). Apart from the input image and its certainty, the user needs to provide rξ, which defines the smoothing used when calculating the gradient (thereby regularizing the gradient calculation).

2. Retrieve the LST T(x) by Eq. (21). The user needs to set r1, which defines the spatial scale for which the LST should be representative.

3. Calculate eigenvalues and eigenvectors for T(x).

4. Define an elliptical structuring element for each pixel using Eqs. (2)–(4). The parameter M needs to be provided by the user.

5. Perform morphological operations based on Eqs. (5) and (6), setting missing pixel values to +∞ or −∞ respectively.
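Steps 4–5 can be sketched as follows, under simplifying assumptions: the Gaussian kernel implements Eq. (22), and a fixed cross-shaped mask stands in for the per-pixel elliptical SEs of Eqs. (2)–(4). All function names are illustrative, not from the paper:

```python
import numpy as np

# Eq. (22): Gaussian kernel that falls to half its peak at distance r_half.
def gaussian_applicability(r_half, radius):
    sigma = r_half / np.sqrt(2.0 * np.log(2.0))
    d = np.arange(-radius, radius + 1)
    d2 = d[:, None] ** 2 + d[None, :] ** 2
    return np.exp(-d2 / (2.0 * sigma ** 2))

# Step 5: dilation over an SE mask, ignoring missing pixels via -inf
# (an erosion would use +inf and min instead).
def masked_dilate(f, cert, se):
    r = se.shape[0] // 2
    fp = np.pad(f, r, constant_values=-np.inf)
    cp = np.pad(cert, r, constant_values=0.0)
    out = np.empty_like(f)
    for i in range(f.shape[0]):
        for j in range(f.shape[1]):
            patch = fp[i:i + 2 * r + 1, j:j + 2 * r + 1].copy()
            cpat = cp[i:i + 2 * r + 1, j:j + 2 * r + 1]
            patch[(se == 0) | (cpat == 0)] = -np.inf   # ignore missing pixels
            out[i, j] = patch.max()
    return out

a = gaussian_applicability(r_half=2.0, radius=4)
print(round(a[4, 6] / a[4, 4], 3))        # 0.5: half height at distance 2

f = np.array([[1., 9., 2.], [3., 4., 5.], [6., 7., 8.]])
c = np.ones_like(f); c[0, 1] = 0.0        # the 9 is an occluded measurement
se = np.array([[0, 1, 0], [1, 1, 1], [0, 1, 0]])
print(masked_dilate(f, c, se)[1, 1])      # 7.0: the occluded 9 is ignored
```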

4 Experiments and results

In order to test the robustness of the presented method, we must know what the result would be if there were no missing data. This cannot be achieved for occlusions in measured 3D profile data, and we therefore apply the resulting morphological operations to two different data sets:

A. Gray-scale images corrupted by randomly removing pixels, and

B. 3D surface data for a casted steel surface, containing occluded pixels.

Apart from providing a case where we know the “truth”, the gray-scale case also demonstrates the usefulness of the method on other types of data, although it was designed specifically for 3D profile data.

4.1 Gray-scale Image with Randomly Removed Data

The performance of the presented strategy for assigning shapes to SEs was studied by randomly removing an increasing fraction α of the pixels in a set of 30 gray-scale test images (see examples in Fig. 1) and studying how the result of the morphological filtering of the corrupted data compares to the filtering of the complete original data. For comparison, results for the uncorrupted data were also compared to the method in Ref. [3] using the same set of test images. More specifically, we calculate the Root Mean Square Error (RMSE) for the adaptive opening γα at each level of missing data, as compared to the adaptive opening γ0 of the original data, i.e.

RMSE(γα) = √( (1/N) ∑_{i=1}^{N} (γα(xi) − γ0(xi))² ),   (23)

where xi, i = 1, 2, ..., N, denote the N pixels in the image.
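Eq. (23) is a standard pixel-wise error measure; a minimal sketch:

```python
import numpy as np

# Eq. (23): RMSE between the opening of corrupted data and the opening of
# the original data, over all N pixels.
def rmse(gamma_alpha, gamma_0):
    return np.sqrt(np.mean((gamma_alpha - gamma_0) ** 2))

g0 = np.array([1.0, 2.0, 3.0, 4.0])
ga = np.array([1.0, 2.0, 3.0, 0.0])
print(rmse(ga, g0))    # 2.0
```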


Figure 1: Enlarged sections of original images from the set (a and c) and corrupted versions at level α = 0.25 (b and d).

For comparison against an opening γR of uncorrupted data using the method described in Ref. [3] (with corresponding parameters), RMSE(γR) was calculated, which resulted in RMSE values within the interval [0.2, 1.69]·10^−2 (mean value 0.8·10^−2 and standard deviation 0.2·10^−2). Hence, there are only small differences between the results of the two approaches, given fully certain data. The error resulting from an increasing amount of missing data was compared to the corresponding RMSE values for two alternative approaches:

A1. Completely disregarding certainty (all values are considered valid in all operations).

A2. Considering certainty in the morphological min/max operations, but not when calculating the SEs.

Results for the image set are shown in Fig. 2. The error resulting from completely disregarding certainty is of course extreme, since more and more pixels in the resulting opened image will be set to zero by the increasing number of black pixels in the original image. This result is trivial, but emphasizes why certainty must be taken into account when processing data where all pixel values cannot be considered equally valid. The RMSE values, although showing better results for our presented method, do not differ much depending on whether certainty is considered when setting SE shapes or not; this is because the RMSE is a global measure of image similarity and may not fully depict differences at a more local level. If we study the resulting images in more detail (see examples in Figs. 3 and 4), we see the benefit

[Figure 2 plots RMSE(γα) as a function of α, on a logarithmic scale, for Alternative A1, Alternative A2, and the presented method.]

Figure 2: Comparison of RMSE values. Dashed lines represent the mean RMSE value over the set of images, for each noise level, while dotted lines mark the standard deviation around the mean.


Figure 3: Calculated structuring elements (left) and corresponding openings (right) for an original uncorrupted image (a and b), the case when missing pixels are ignored in the min/max operations only (c and d), and the presented method where missing pixels are completely ignored throughout the whole procedure (e and f). Parameters were set to rξ = 1.25, r1 = 4, and M = 5.


Figure 4: Calculated structuring elements (left) and corresponding openings (right) for an original uncorrupted image (a and b), the case when missing pixels are ignored in the min/max operations only (c and d), and the presented method where missing pixels are completely ignored throughout the whole procedure (e and f). Parameter setup: rξ = 1.25, r1 = 4, and M = 5.

from taking confidence into account here as well: details in the image become much more blurred if missing pixels are allowed to affect the shape of the SEs, because the contrast against the randomly distributed black pixels causes evenly distributed high gradient values and thereby makes the SEs wider overall. One can also easily predict that in the case of non-random distributions of missing pixels, any border along regions of missing data will cause a directional bias for the resulting elliptical SEs.

4.2 3D surface data containing occluded pixels

We now turn to our intended application of the presented method: detection of cracks in casted steel slabs. The objective is to enhance crack signatures in 3D profile data acquired by laser triangulation, without letting missing pixels give rise to false crack-like features in the data. Figure 5 shows two crack examples and the adaptive openings of the data using the presented approach. Figure 6 shows examples of scale regions which are also present in the measured data, but should not be falsely detected as cracks. Occluded pixels in the original data are shown in red. Note that the spatial distribution of the occluded pixels along the edges of the scales in Fig. 6 is not unlike that of the dark crack pixels in Fig. 5. The crack signature is successfully enhanced, while non-crack pixels containing missing data are not allowed to affect the result noticeably. There are a few pixels in the opening of the scales data where the SE could not reach correctly measured data. The number of such pixels obviously depends on the size of the structuring element. However, letting them remain with zero output certainty is actually not very problematic, since they can easily be identified and excluded from further analysis by considering the amount of occluded and non-occluded pixels covered by the SE. An important benefit of the presented method is that occluded pixels are not allowed to affect other values in any way, so that the problem is contained within the originally occluded region. The examples demonstrate the robustness of the presented method: the adaptive opening has enhanced and linked crack segments without being corrupted or otherwise distorted by the occluded pixels.

5 Discussion

We have shown how existing techniques can be combined into a robust method for adaptive morphological processing, handling the presence of missing pixels within the method itself. By formulating the problem so that the power of normalized convolution can be utilized for setting the shapes of Structuring Elements (SEs), incomplete data can be handled in a reasonable and systematic way. Our results show that 3D profile data containing occluded pixels can be robustly processed, without introducing artificial structures that may be mistaken for cracks. If the SE is too small it may only cover missing data, but this can be handled by simply measuring the amount of valid (non-missing) data it covers. However, the possibility of changing the size of the SE and the applicability functions more dynamically should


Figure 5: Captured steel 3D profile data containing cracks (a and c) and the adaptive openings (b and d). Occluded pixels in the original 3D profile data are shown in red. Parameters were set to rξ = 1, r1 = 10, and M = 10.


Figure 6: Captured steel 3D profile data containing scales (a and c) and the adaptive openings (b and d). Occluded pixels in the original 3D profile data are shown in red. Parameter setup: rξ = 1, r1 = 10, and M = 10.

be considered in the future. Also, an output certainty useful for discarding unreliable output pixel values could be defined by considering the relationship between the numbers of valid and missing pixels covered by each structuring element. The strength of the method lies in its generality: although designed for 3D profile data, it can be applied to any type of image data. In this work, we have demonstrated its use on gray-scale and 3D profile data, but further extension into higher dimensions is theoretically straightforward (though of course more computationally demanding).

References

[1] G. Matheron, Random sets and integral geometry. New York: Wiley, 1975, vol. 1.

[2] J. Serra, Image analysis and mathematical morphology. London: Academic Press, 1982.

[3] A. Landström and M. J. Thurley, “Adaptive morphology using tensor-based elliptical structuring elements,” Pattern Recognition Letters, vol. 34, no. 12, pp. 1416–1422, 2013.

[4] Z. Tauber, Z.-N. Li, and M. S. Drew, “Review and preview: Disocclusion by inpainting for image-based rendering,” IEEE Transactions on Systems, Man, and Cybernetics, Part C: Applications and Reviews, vol. 37, no. 4, pp. 527–540, 2007.

[5] H. Takeda, S. Farsiu, and P. Milanfar, “Kernel regression for image processing and reconstruction,” IEEE Transactions on Image Processing, vol. 16, no. 2, pp. 349–366, 2007.

[6] H. Knutsson and C.-F. Westin, “Normalized and differential convolution,” in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Proceedings CVPR ’93, pp. 515–523, Jun. 1993.

[7] H. Knutsson, “Representing local structure using tensors,” in The 6th Scandinavian Conference on Image Analysis, pp. 244–251, June 1989.

[8] L. Cammoun, C. Castaño-Moraga, E. Muñoz-Moreno, D. Sosa-Cabrera, B. Acar, M. Rodriguez-Florido, A. Brun, H. Knutsson, and J. Thiran, “A review of tensors and tensor signal processing,” in Tensors in Image Processing and Computer Vision, pp. 1–32. Springer, 2009.

[9] C. Westin, “A tensor framework for multidimensional signal processing,” Ph.D. dissertation, Linköping University, Sweden, 1994.

[10] C. Westin, K. Nordberg, and H. Knutsson, “On the equivalence of normalized convolution and normalized differential convolution,” in IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP), pp. 457–460. IEEE, 1994.

Paper F

Adaptive Mathematical Morphology – a Survey of the Field

Authors: Vladimir Ćurić, Anders Landström, Matthew J. Thurley, and Cris L. Luengo Hendriks

Reformatted version of paper originally published in: Pattern Recognition Letters, 2014, vol. 47, pp. 18–28.

© 2013, Elsevier. Reprinted with permission.

Adaptive Mathematical Morphology – a Survey of the Field

Vladimir Ćurić, Anders Landström, Matthew J. Thurley, and Cris L. Luengo Hendriks

Abstract

We present an up-to-date survey on the topic of adaptive mathematical morphology. A broad review of research performed within the field is provided, as well as an in-depth summary of the theoretical advances within the field. Adaptivity can come in many different ways, based on different attributes, measures, and parameters. Similarities and differences between a few selected methods for adaptive structuring elements are considered, providing perspective on the consequences of different types of adaptivity. We also provide a brief analysis of perspectives and trends within the field, discussing possible directions for future studies.

1 Introduction

1.1 Background

Mathematical morphology, introduced by Matheron [1] and Serra [2], is a powerful framework for nonlinear image processing, thereby providing a useful tool for a number of tasks such as image filtering, image segmentation, shape comparison, etc. The most common way to define morphological operators is by using the concept of structuring elements. Structuring elements are usually small shapes used to probe the image, defining the output of the filter operation from the interaction between the two. Mathematical morphology can also be defined on a lattice structure using algebraic tools [3, 4], in the continuous framework using partial differential equations [5–7], and on graph-like structures [8–10]. For a comprehensive study on the theory and different applications of mathematical morphology, the interested reader is referred to the excellent books on the topic by Soille [11] and Najman and Talbot [12].

In classical mathematical morphology, structuring elements remain the same for all points in the image domain, i.e. one single structuring element is used to process the whole image by translating it to every point in the image. We will refer to such non-adaptive structuring elements as rigid. One rigid structuring element is often not suitable for the whole image, however, since the variation of image structures regarding e.g. shape, size, and orientation often provides a challenge when processing all points identically. For instance, morphological operators that stretch over edges as a result of unsuitable structuring elements can quickly destroy or corrupt important information in the image. In some cases several different structuring elements are considered by repeatedly processing

the image and selecting one single result for each point [13, 14], but this means that the whole image is processed once for each considered structuring element, while only a small fraction of the computed values may actually be needed to produce the final result. The usefulness and necessity of using adaptive structuring elements is thereby evident. Consequently, adaptive morphology has attracted a lot of attention in recent years and constitutes a topic of ongoing research. Adaptive structuring elements should adapt to the image structures, considering different image attributes such as gray level values (luminance, contrast), spatial distances between pixels, edges in the image, image gradient, and noise. There are many ways in which this task can be achieved, and hence, due to the variation of different image attributes, there are a number of recent papers that deal with adaptive mathematical morphology. Adaptivity in mathematical morphology can also be considered from different perspectives depending on the mathematical structure considered, such as group invariance of morphological operators, or partial differential equations. Nevertheless, recent papers address two important aspects of adaptive mathematical morphology: (1) how to construct adaptive structuring elements that are suitable for the image analysis task at hand, and (2) how to properly define morphological operators with adaptive structuring elements.

1.2 Contribution

To our knowledge only one survey paper on the different approaches in adaptive mathematical morphology has appeared in the literature so far [15], and, although of excellent quality, it does not include the latest developments. In light of these remarks, the primary contribution of this paper is to:

1. provide an up-to-date survey of existing approaches for adaptive mathematical morphology (Sec. 2),

2. summarize and clarify, in one place, how adaptive morphological operators should be properly computed (Sec. 3),

3. present four methods for the construction of adaptive structuring elements (Sec. 4),

4. present application-oriented examples of approaches in adaptive mathematical morphology (Secs. 5 and 6), and

5. briefly discuss possible future directions in which this field might further develop (Sec. 7).

2 Overview of adaptive mathematical morphology

We here present a brief history of adaptive mathematical morphology and include (to our knowledge) all important work done in this field.

2.1 History

Mathematical morphology constitutes a well defined nonlinear theoretical framework based on set relations, using shapes or functions known as structuring elements to probe the processed data. Classical morphological operators are non-adaptive, i.e. the image is probed by a single rigid structuring element. This is often not ideal, however, as discussed in Sec. 1, which has motivated the development of adaptive mathematical morphology. Most likely the first ideas on morphological operators based on non-rigid structuring elements appeared in the work of Serra [16], providing a foundation for adaptive morphology. One of the first applications of adaptive structuring elements was presented by Beucher et al. [17], who used structuring elements that adapt with respect to their position in the image, following the law of perspective for traffic cameras. Similarly, Verly and Delanoy [18] designed adaptive structuring elements for application to range images. Charif-Chefchaouni and Schonfeld [19] considered general theory for adaptive morphology in the binary case. Other early work on adaptive structuring elements was undertaken by Morales [20], Chen et al. [21], and Cheng and Venetsanopoulos [22]. Attention to this area of mathematical morphology has increased within the last ten years, including the introduction of methods such as general adaptive neighborhoods [23] and morphological amoebas [24]. These papers present two different methods for constructing adaptive structuring elements based on local similarity measures for neighboring pixels, and seem to have incited momentum for adaptive morphology within the mathematical morphology community as well as with a wider audience. At about the same time, gray valued input-adaptive morphology was defined by Bouaynaya and Schonfeld [25], based on the earlier work for the binary case by Charif-Chefchaouni and Schonfeld [19] and the umbra transform.
Theoretical advances as well as limitations of adaptive mathematical morphology have been explored by Bouaynaya et al. [26] and Bouaynaya and Schonfeld [27]. Furthermore, Roerdink [28] addressed theoretical issues important for the development of the field, pointing out often overlooked issues regarding the properties of adaptive morphological operators. This study provides good ground for further development of the field, as well as a discussion on the terminology used for adaptive structuring elements. The latest studies of adaptive mathematical morphology include work on adaptive structuring functions, mostly inspired by recent image filtering techniques such as bilateral filtering [29] and nonlocal means [30, 31]. Furthermore, an interesting application of adaptive mathematical morphology to image regularization for inverse problems has also recently been presented by Purkait and Chanda [32]. It should be noted that adaptive structuring elements play a role similar to that of the kernels used in other well-known methods for adaptive filtering [33–36], where a desirable property of the filtering kernel is that it does not operate across edges in the image. The goals are different, however, as morphological operators are used not only to remove noise but also to preserve the shapes in the image. Morphological operators are based on the maximum or minimum value, rather than the median or weighted mean, over pixel values within the kernel.

2.2 Overview of the field

There exists a variety of different methods for constructing adaptive morphological operators that differ in how they adapt to different image attributes. As a result, there are different ways in which approaches to adaptive mathematical morphology can be grouped. Maragos and Vachier [15] considered the following three groups:

• adaptivity with respect to the spatial neighborhood position,

• adaptivity with respect to gray level image values, and

• algebraic principles such as group and representation theory.

At the same time, Roerdink [28] defined two categories of adaptiveness, referring to them as:

• location-adaptive mathematical morphology (adaptability with respect only to the position in the image domain) and

• input-adaptive mathematical morphology (adaptability with respect to the image content, i.e. the position in the image domain as well as values in the image range),

which obviously coincide with the first two groups considered by Maragos and Vachier [15]. We follow the aforementioned categories, but identify a larger diversity of adaptivity in mathematical morphology. We identify several aspects of adaptivity with which the various methods can be associated. Note that neither the aforementioned grouping of different methods nor the aspects we have identified provide strictly disjoint groups, since one method can belong to several different groups or aspects. Our primary goal is not to present a new categorization of different methods in adaptive mathematical morphology, but rather to complement the work of other authors.

Similarity

Most methods for defining adaptive structuring elements rely on local similarity of neighboring pixels. Structuring elements include points that are similar to the considered central point (the origin of the structuring element) according to some measure of local similarity, such as spatial similarity, similarity between gray level values, etc. Some methods allow for complete adaptivity of the shape of the structuring elements. For instance, Debayle and Pinoli [23] presented a method for adaptive morphology based on homogeneous regions, obtaining spatially adaptive structuring elements determined by the connected component that contains the origin of the structuring element. This approach is closely related to earlier work by Braga-Neto [37]. Cuisenaire [38] presented locally adaptive mathematical morphology based on the distance transform. Possibly the best known method for adaptive structuring elements, morphological amoebas, is based on a geodesic distance taking both spatial distance and gray level difference into account [39]. Similarly, Grazzini and Soille [40] considered spatially variable neighborhoods by utilizing another cost function for geodesic distances, while Ćurić et al. [41] defined adaptive structuring elements that are computed from the salience distance transform of the edge image. In the same line, Morard et al. [42] presented a method that uses a region growing technique based on the similarity between neighboring points. Other methods use a predefined shape where the size of the shape is set according to a measure of local similarity. For example, Dokládal and Dokládalová [43] used rectangles, while Ćurić and Luengo Hendriks [44] used circles. Recent studies on adaptive structuring functions have also considered measures of local similarity between neighborhood points [29, 45] as well as similarity between image patches [30, 31].
Recently, an interesting work that introduces adaptive morphological operators into a stochastic framework has been presented by Angulo and Velasco-Forero [46]. This method is based on random walk simulations as an estimation of adaptive structuring functions. Apart from including pixels which are similar to the considered centre of the structuring element, as defined by spatial or geodesic distance, adaptive morphological operators can also be adapted to a level set decomposition of the image [47, 48]. These studies are closely related to viscous mathematical morphology and viscous lattices [49].
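To make the geodesic-distance idea concrete, the amoeba of [39] can be sketched as a shortest-path search where each step costs one spatial unit plus λ times the gray-level jump; the SE contains all pixels within amoeba distance r of the center. This is a hedged illustration (4-connectivity and the function name are assumptions):

```python
import heapq
import numpy as np

# Hedged sketch of a morphological amoeba: Dijkstra-style growth of the SE,
# with step cost 1 + lam * |gray-level difference|, capped at radius r.
def amoeba_se(f, seed, lam, r):
    dist = {seed: 0.0}
    heap = [(0.0, seed)]
    while heap:
        d, (i, j) = heapq.heappop(heap)
        if d > dist.get((i, j), np.inf):
            continue
        for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ni, nj = i + di, j + dj
            if 0 <= ni < f.shape[0] and 0 <= nj < f.shape[1]:
                nd = d + 1.0 + lam * abs(f[ni, nj] - f[i, j])
                if nd <= r and nd < dist.get((ni, nj), np.inf):
                    dist[(ni, nj)] = nd
                    heapq.heappush(heap, (nd, (ni, nj)))
    return set(dist)

f = np.zeros((5, 5)); f[:, 3:] = 100.0   # strong vertical edge
se = amoeba_se(f, seed=(2, 1), lam=0.5, r=2.0)
print((2, 3) in se)   # False: the amoeba does not cross the edge
```

The large gray-level jump across the edge makes the step cost exceed r, so the amoeba stays within the homogeneous region around its seed.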

Structure

Related to similarity (or lack thereof), but still different by definition, is the concept of structure. Structure-based methods align the structuring elements to edges and contours, rather than restricting them by a measure of similarity. This can be done by considering orientation only, or by also considering the rate of anisotropy or distances to edges. Shih and Cheng [50] defined elliptical structuring elements from local path curvature obtained by edge linking in binary data. Tankyevych et al. [51] used adaptive morphology for vessel enhancement: directions for line structuring elements are obtained from a so-called “vesselness function” depending on the eigenvalues of the Hessian for each neighborhood. Adaptive line and rectangular structuring elements, where the latter are constrained in width by nearby distinct edges, were considered by Verdú-Monedero et al. [52], who obtained directions from diffused squared gradient fields. Landström and Thurley [53] used the Local Structure Tensor (LST) to define elliptical structuring elements which vary from lines to disks depending on the rate of anisotropy. This method has also been extended to enable processing of partly missing data [54]. A slightly different approach [55] is based on multiscale image decomposition, adapting the size of the structuring elements to the local scale of structures in the image. This approach processes two points of similar scales in the same way, rather than using a level set decomposition of the image.

Partial Differential Equations

A quite different approach to mathematical morphology is achieved by considering Partial Differential Equations (PDEs), where the main morphological operators can be defined using diffusion equations. This strategy defines morphological filters without explicit use of structuring elements. The implicit structuring element is a unit ball that can deform over time. PDE-based methods may very well take similarity and structure into account, depending on the formulation of the problem, and operate in a continuous framework. Breuß et al. [56] proposed continuous morphology based on tensors, solving PDEs. Maragos and Vachier [48] introduced adaptivity of morphological operators for PDEs, defining viscous dilation and erosion. A generalization of the PDEs for nonlocal erosion and dilation has also been defined on a graph structure [57]. Welk et al. [58] proposed differential equations for morphological amoebas and presented an interesting connection between morphological amoebas and self-snakes. In the same line, Welk [59] also considered a connection of morphological amoebas with a curvature-based PDE. It should be stressed that these connections of morphological amoebas with PDEs are specific to the amoeba median filter, and not general for morphological operators.

Graphs

As already mentioned, there exists work on nonlocal morphological operators on graphs by Ta et al. [57]. Adaptive morphological operators, in particular ones based on morphological amoebas, have been defined using the concept of minimal spanning trees [60]. Cousty et al. [10] presented a unified framework for graphs that includes discrete spatially variant structuring elements. Path openings and closings are morphological operations with flexible line segments as structuring elements [61]. Unfortunately, path openings as well as area openings and other attribute openings are often not considered as a part of adaptive mathematical morphology.

Group adaptivity

Morphological operators are mostly defined as translation-invariant operators. Nevertheless, for certain applications, the usefulness of translation-invariant morphological operators can be limited [62]. Therefore, group morphology has been introduced [63]. In this framework, morphological operators are invariant under different types of transformations such as non-commutative symmetry groups, rotation groups, or similar. Morphological operators with fixed structuring elements are translation-invariant operators. However, as pointed out by Roerdink [28], most adaptive morphological operators are also translation-invariant operators: if an operator adapts its structuring element to local features in the image, in a translation-invariant manner, the operator is translation-invariant, even if the structuring element is not the same at all locations in the image. Hence, spatially variant structuring elements do not necessarily make the corresponding adaptive morphological operators translation-variant.

Efficient implementations

Efficient algorithms have been proposed for morphological operators with rigid structuring elements (see e.g. [64, 65]). This is likely a strong reason for the wide use of morphological operators within the image processing community. Nevertheless, the number of presented algorithms for adaptive morphological operators with focus on efficiency is still quite limited. An efficient algorithm for adaptive morphological operators on binary images has been proposed [66]. Stawiaski and Meyer [60] proposed a fast implementation of morphological amoebas using the graph-based approach and the concept of minimal spanning trees, while Velasco-Forero and Angulo [31] presented an efficient implementation of nonlocal morphological operators using sparse matrices.

3 Theory

The definition of adaptive morphological operators has often been overlooked in the literature, as pointed out by Roerdink [28]. In this section we summarize general theoretical work aimed at properly and efficiently performing morphological operations using adaptive structuring elements, i.e. adaptive morphological operators. Approaches for generalizing classical non-adaptive morphology into the adaptive case are covered. Our aim is to: (1) provide a short summary of how to properly compute adaptive morphological operators; and (2) put the considered approaches in perspective by using a unified notation. For more details on the approaches, we refer the interested reader to the original publications. We start by addressing an important property known as adjunction, which is required to define mathematically correct openings and closings from the combination of an erosion and a dilation. Morphological operations are then formulated for the rigid case, providing a basis for a coherent notation, before addressing the theoretical work on adaptive morphology. More specifically, we consider three suggested approaches for defining adaptive morphology, based on structuring elements, structuring functions, and using the notion of impulse functions.

3.1 Adjunction

Let (L, ≤) be a complete lattice of gray valued functions with the domain D and range T. Two morphological operators ε and δ defined on a lattice L form an adjunction (ε, δ) when

δ(f) ≤ g ⇐⇒ f ≤ ε(g), (1)

for any pair of f, g ∈ L. One of the main issues within adaptive mathematical morphology concerns the proper definition of adjunct morphological operators, which is addressed in a number of papers [27, 28, 31]. Adjunction is important because if the morphological erosion ε and dilation δ fulfill (1), i.e. form an adjunction (ε, δ), the morphological opening γ and closing ϕ can be defined as δ ◦ ε and ε ◦ δ, respectively [4]. Moreover, it has been shown [28] that morphological operators ε and δ satisfy the adjunction property (1) only if the adaptive structuring elements are derived only once from the input image. That means that we should first compute adaptive structuring elements for every point in the image, i.e., have a set of structuring elements {Sx : x ∈ D}, where D denotes the domain of the image. These same shapes, sometimes referred to as the structuring element map, should be used when computing both the erosion and dilation, since only then will they form an adjunction. These structuring elements could be computed from the input image or from a smoothed version of the input image, called the pilot image [39].

3.2 Non-adaptive morphology

We first consider classical non-adaptive morphology, formulating it in a notation that simplifies subsequent generalization to the adaptive case. We here distinguish between (flat) structuring elements, which are defined solely as sets of pixels, and (non-flat) structuring functions, which have the range [−∞, 0]. Most commonly, morphological operations are performed using (flat) structuring elements, i.e. set-based structuring elements. Given a rigid structuring element S, its translation Sx to a point x ∈ D can be expressed as [2]

Sx = {y ∈ D : y − x ∈ S}, (2)

which simply yields the structuring element S corresponding to the local variable (y − x). Each structuring element Sx has another corresponding element Sx*. For non-adaptive structuring elements this is simply the translated reflection through the origin, given by

Sx* = {y ∈ D : x − y ∈ S}. (3)

Note that Sx* can also be implicitly, but equivalently, defined by the relation

y ∈ Sx* ⇐⇒ x ∈ Sy, x, y ∈ D. (4)

The erosion εS : L → L and dilation δS : L → L of a function f ∈ L by the structuring element S are then defined as [2]

εS(f)(x) = ⋀_{y ∈ Sx} f(y), x ∈ D, (5)

δS(f)(x) = ⋁_{y ∈ Sx*} f(y), x ∈ D, (6)

where ⋀ and ⋁ denote the infimum and supremum operators, respectively. The entity Sx* is known under various different names, such as the reflected, transposed, or reciprocal structuring element, and there is currently no consensus regarding proper terminology. While the aforementioned names have all been used for both the rigid and the adaptive case, they risk leading to ambiguities in a general terminology for adaptive morphology as there is reasonable room for misinterpretation (for instance, in the adaptive case Sx* cannot simply be described as the reflection of Sx through the origin). However, for the general case we observe the following:

1. Given the set of structuring elements {Sx : x ∈ D} there is a one-to-one relation between Sx and Sx* for each point x.

2. We can alternate between Sx and Sx* in a cyclic manner using the (·)* notation based on Eqs. (3) and (4) above, i.e.

Sx** = (Sx*)* = Sx. (7)

These properties form a duality with respect to the set of structuring elements {Sx : x ∈ D}, i.e. the whole set is needed to retrieve Sx* for a given point x, and we will consequently refer to Sx* as the dual structuring element. We then turn our focus to the non-flat case, i.e. structuring functions. Let s : D → [−∞, 0] be an arbitrary rigid structuring function and let sx represent its translation to a point x ∈ D. For any point y we then have

sx(y) = s(y − x), (8)

while the corresponding dual structuring function s* is defined as [27]

sx*(y) = sy(x) = s(x − y). (9)

It should here be noted that for structuring functions equation (7) becomes

sx** = (sx*)* = sx (10)

in analogy with the flat case. Erosion and dilation by the structuring function s are then defined for each point x ∈ D by [2]

εs(f)(x) = ⋀_{y ∈ D(sx)} (f(y) − sx(y)), x ∈ D, (11)

δs(f)(x) = ⋁_{y ∈ D(sx*)} (f(y) + sx*(y)), x ∈ D, (12)

where D(sx) is the domain of the structuring function s in point x. It should be noted that the flat case is given by expressing a structuring element S using a corresponding structuring function s defined as

s(y) = 0 for y ∈ S, and s(y) = −∞ for y ∉ S, (13)

which inserted into (11) and (12) yields (5) and (6), respectively.
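As an illustration, the flat erosion (5) and dilation (6) with the dual structuring element (3), together with the adjunction property (1), can be sketched as follows. This is a minimal, unoptimized example of our own; the representation of S as a list of (row, col) offsets and the border clipping are implementation choices, not part of the original formulation:

```python
import numpy as np

def erode(f, se):
    # Eq. (5): infimum of f over the translated structuring element S_x,
    # with se given as (row, col) offsets and clipped at the image border.
    h, w = f.shape
    out = np.empty_like(f, dtype=float)
    for r in range(h):
        for c in range(w):
            out[r, c] = min(f[r + dr, c + dc] for dr, dc in se
                            if 0 <= r + dr < h and 0 <= c + dc < w)
    return out

def dilate(f, se):
    # Eq. (6): supremum of f over the dual element S_x^*, which for a
    # rigid S is the reflection through the origin (eq. 3).
    dual = [(-dr, -dc) for dr, dc in se]
    h, w = f.shape
    out = np.empty_like(f, dtype=float)
    for r in range(h):
        for c in range(w):
            out[r, c] = max(f[r + dr, c + dc] for dr, dc in dual
                            if 0 <= r + dr < h and 0 <= c + dc < w)
    return out

# Numerical check of the adjunction (1): delta(f) <= g  <=>  f <= eps(g)
se = [(0, 0), (0, 1), (1, 0)]          # a small asymmetric flat SE
rng = np.random.default_rng(0)
adjoint = True
for _ in range(50):
    f = rng.integers(0, 10, (6, 6)).astype(float)
    g = rng.integers(0, 10, (6, 6)).astype(float)
    adjoint &= bool(np.all(dilate(f, se) <= g) == np.all(f <= erode(g, se)))
```

Because (ε, δ) form an adjunction, the composition δ ◦ ε is a proper opening and is, for instance, anti-extensive.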

3.3 Adaptive structuring elements

By far the most commonly used strategy within adaptive morphology relies on the use of (flat) structuring elements (see e.g. [39, 41, 53]). Given such a pixel-dependent structuring element S[x], which may vary with the considered pixel x, the translated structuring element Sx is defined as

Sx = {y : y − x ∈ S[x]}. (14)

Figure 1: Adaptive structuring elements are computed in points x, y and z. Since x ∈ Sy then y ∈ Sx*, and since x ∉ Sz then z ∉ Sx*.

Its dual structuring element Sx* (see Fig. 1) is then implicitly defined by (4) or, equivalently, defined explicitly as [27]

Sx* = {y : x − y ∈ S[y]}. (15)

Note that this approach simply generalizes the rigid structuring elements, i.e. (14) and (15) simplify to (2) and (3) if S[x] = S, ∀x ∈ D. Once structuring elements are constructed, the morphological erosion and dilation are again defined by (5) and (6). These morphological operators satisfy the adjunction property (1) only if {Sx : x ∈ D} are computed once for the input image and used to compute both operations. Velasco-Forero and Angulo [31] defined a structuring element system that overcomes the issue of adjunction. These adaptive structuring elements satisfy the following properties: (1) x ∈ Sx, and (2) y ∈ Sx ⇒ x ∈ Sy. Note that this is a sufficient but not necessary condition for ε and δ to form an adjunction (ε, δ). Retrieving the dual structuring elements in the adaptive case requires computing and storing the structuring elements for all points in the image. It may therefore be simpler to compute adjunct operators based on (4): computations can be done without explicitly calculating any dual structuring elements Sy*, by observing that a processed pixel x constitutes a part of the dual structuring element Sy* for any pixel y within its structuring element Sx. The adjunct dilation can be calculated by the following algorithm used by Lerallut et al. [39]:

for each point x ∈ D do
  compute Sx
  for each y ∈ Sx do
    δ(y) = max(f(x), δ(y))
  end for
end for

This way of computing the adjunct adaptive dilation does not require storage (or frequent recomputation) of all structuring elements.
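The scatter loop above translates almost directly into code. The sketch below is our own minimal rendering, with the structuring element map represented as a dictionary from each pixel to the pixel list of Sx (a representation we choose for clarity, not one prescribed by the original work); both operators use the same map, so they form an adjunction:

```python
import numpy as np

def adaptive_erode(f, se_map):
    # Gather: eps(f)(x) = min over y in S_x of f(y)  (eq. 5)
    out = np.empty(f.shape)
    for x, pixels in se_map.items():
        out[x] = min(f[y] for y in pixels)
    return out

def adaptive_dilate(f, se_map):
    # Scatter, following the loop of Lerallut et al.: a processed pixel x
    # belongs to the dual S_y^* of every y in S_x (eq. 4), so no dual
    # elements are ever stored explicitly.
    out = np.full(f.shape, -np.inf)
    for x, pixels in se_map.items():
        for y in pixels:
            out[y] = max(out[y], f[x])
    return out

# Toy structuring element map on a 2x2 image: S_x = {x, right neighbour}
se_map = {(r, c): [(r, c)] + ([(r, c + 1)] if c + 1 < 2 else [])
          for r in range(2) for c in range(2)}

# Same map for both operators => adjunction (eq. 1) holds numerically
rng = np.random.default_rng(1)
adjunct_ok = True
for _ in range(40):
    f = rng.integers(0, 9, (2, 2)).astype(float)
    g = rng.integers(0, 9, (2, 2)).astype(float)
    adjunct_ok &= bool(np.all(adaptive_dilate(f, se_map) <= g)
                       == np.all(f <= adaptive_erode(g, se_map)))
```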

3.4 Adaptive structuring functions

Since the adaptive structuring function now varies for each point x ∈ D, it is here denoted with s[x]. In the theory provided by Bouaynaya and Schonfeld [25, 27], non-adaptive morphological operators based on rigid structuring functions (Sec. 3.2) are directly generalized to the adaptive case. Hence, for adaptive structuring functions we have

sx(y) = s[x](y − x). (16)

The corresponding dual structuring function is defined as

sx*(y) = sy(x) = s[y](x − y). (17)

Note that, as for rigid structuring elements, (13) can be used to convert adaptive structuring elements into structuring functions, which gives the flat case covered in Sec. 3.3. For rigid structuring functions, i.e. if s[x] = s, ∀x ∈ D, the expressions (16) and (17) are simplified into (8) and (9), respectively. Similarly to a structuring element system, Velasco-Forero and Angulo [31] defined a morphological weight system where the structuring functions satisfy the following properties: (1) sx(x) = 0, and (2) sx(y) = sy(x). This is a sufficient, but not necessary, condition for (ε, δ) to form an adjunction.
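For concreteness, the non-flat erosion (11) and dilation (12) with pixel-dependent structuring functions (eq. 16) and the dual sx*(y) = s[y](x − y) (eq. 17) can be sketched on a 1D signal. The three-point support and the specific weights below are arbitrary choices of ours, purely for illustration:

```python
import numpy as np

n = 8
signal = np.array([3., 1., 4., 1., 5., 9., 2., 6.])
# Pixel-dependent weight for the off-origin offsets; values lie in
# [-inf, 0] as required of a structuring function.
weight = np.array([0., -1., -2., -1., 0., -3., -1., -2.])

def s(x, offset):
    # Adaptive structuring function s[x] (eq. 16), supported on {-1, 0, 1}
    if offset == 0:
        return 0.0
    if offset in (-1, 1):
        return weight[x]
    return -np.inf            # outside the support D(s_x)

def erode(f):
    # Eq. (11): eps(f)(x) = inf_y (f(y) - s_x(y)), with s_x(y) = s[x](y - x)
    return np.array([min(f[y] - s(x, y - x) for y in range(n)
                         if s(x, y - x) > -np.inf) for x in range(n)])

def dilate(f):
    # Eq. (12) with the dual s_x^*(y) = s[y](x - y)  (eq. 17)
    return np.array([max(f[y] + s(y, x - y) for y in range(n)
                         if s(y, x - y) > -np.inf) for x in range(n)])

# With the dual defined via (17), (erode, dilate) form an adjunction
rng = np.random.default_rng(2)
ok = True
for _ in range(30):
    f = rng.integers(0, 9, n).astype(float)
    g = rng.integers(0, 9, n).astype(float)
    ok &= bool(np.all(dilate(f) <= g) == np.all(f <= erode(g)))
```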

3.5 Impulse functions

Adaptive morphological operators can also be defined using the notion of impulse functions, without explicit use of the adjunction property [46, 52]. In this case, the morphological opening and closing are computed directly without resorting to compositions of the erosion and dilation. For y ∈ D and t ∈ T, the impulse function iy is defined by

iy(x) = 1 if x = y, and iy(x) = 0 if x ≠ y, (18)

for all x ∈ D, where 0 is the smallest element within the range T. Every function f ∈ L(D, T) can then be written as

f = ⋁_{x ∈ D} ix · f(x), (19)

and the erosion ε, dilation δ, and opening γ are defined, respectively, as [67]

ε(f) = ⋁ {ix · t : C_{D(sx),t} ≤ f, x ∈ D, t ∈ T}, (20)

δ(f) = ⋁ {C_{D(sx),f(x)} : x ∈ D}, (21)

γ(f) = ⋁ {C_{D(sx),t} : C_{D(sx),t} ≤ f, x ∈ D, t ∈ T}, (22)

where D(sx) is the support of the structuring function in point x and C_{D(sx),t} is the cone of base D(sx) and height t.

4 Selected methods

In this section we consider two of the most influential (and currently most cited) works of adaptive mathematical morphology, as well as our own recent methods on this topic. We present a more in-depth study of the following approaches:

• General Adaptive Neighborhoods (GANs) [23],

• Morphological Amoebas (MAs) [39],

• Salience Adaptive Structuring Elements (SASEs) [41], and

• Elliptical Adaptive Structuring Elements (EASEs) [53].

These methods are presented in order of increasing constraints on the structuring element shapes, ranging from the first method (GANs), which provides complete adaptivity, to the last one (EASEs), where shapes are predefined.

4.1 General adaptive neighborhoods

General adaptive neighborhoods (also called intrinsic structuring elements) were proposed by Debayle and Pinoli [23, 68, 69]. A connected neighborhood is considered for each point x ∈ D, which includes points based on a measure h(x) ∈ R (such as gray level values, contrast, or similar). This adaptive neighborhood, called a weak General Adaptive Neighborhood, is defined as

Vm(x) = {y ∈ D : |h(y) − h(x)| < m and y ∈ CC(x)}, (23)

where CC(x) denotes the connected component of the point x and m > 0 is a tolerance that determines the size of the neighborhood. In order to have a proper structuring element that can directly construct adjunct morphological erosion and dilation, the concept of strong GANs is introduced. They are used as adaptive structuring elements for each point x ∈ D, and are defined as

Sx^m = ⋃_{z ∈ D} {Vm(z) : x ∈ Vm(z)}. (24)

The GAN framework has been used in connection with Choquet filtering [70] and for logarithmic image processing [71]. Also, the GAN approach has recently been extended to spatially and intensity adaptive morphology [72], where level sets are processed at different scales. The computational complexity of the direct implementation of the GANs is O(N²), where N is the number of pixels in the image: for each point in the image it is necessary to check whether every point in the image is within the predefined tolerance m and in the same connected component as the considered point.
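A direct (and deliberately naive) implementation of the weak GAN (23) and the strong GAN (24) can be sketched as below. The flood-fill formulation with h = f and 4-connectivity are our own assumptions, and the double loop in strong_gan exhibits exactly the O(N²) behaviour discussed above:

```python
import numpy as np
from collections import deque

def weak_gan(f, x, m):
    # V_m(x) (eq. 23): pixels whose value differs from h(x) = f(x) by
    # less than m, restricted to the connected component of x
    # (4-connectivity), grown by breadth-first flood fill.
    h, w = f.shape
    seen = {x}
    queue = deque([x])
    while queue:
        r, c = queue.popleft()
        for nr, nc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            if (0 <= nr < h and 0 <= nc < w and (nr, nc) not in seen
                    and abs(f[nr, nc] - f[x]) < m):
                seen.add((nr, nc))
                queue.append((nr, nc))
    return seen

def strong_gan(f, x, m):
    # S_x^m (eq. 24): union of all weak GANs V_m(z) that contain x
    out = set()
    for r in range(f.shape[0]):
        for c in range(f.shape[1]):
            v = weak_gan(f, (r, c), m)
            if x in v:
                out |= v
    return out

img = np.array([[0., 0., 9.],
                [0., 9., 9.]])
```

On this toy image with m = 5, the GAN of the top-left pixel is exactly the connected region of zeros; the 9-valued region forms a separate neighborhood.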

4.2 Morphological amoebas

Morphological amoebas likely constitute the most well-known method for adaptive mathematical morphology. They rely on geodesic distances. When computing geodesic distances, a 2D image is embedded into a 3D surface with two spatial coordinates and one that represents the gray level value. A geodesic distance between two points (x, f(x)) and (y, f(y)) is the shortest distance over the surface between the two points [73]. Let P(x, y) = {x = x1, ..., xi, xi+1, ..., xn = y} be a path connecting points x and y. The cost of the path is computed as the sum of the costs of all adjacent points in the path, i.e., c(xi, xi+1), i = 1, ..., n − 1. The cost of the path P(x, y) is then equal to the distance

d(x, y) = min_{P(x,y)} Σ_{i=1}^{n−1} c(xi, xi+1). (25)

A structuring element Sx^r centered at a point x can then be computed as

Sx^r = {y ∈ D : d(x, y) < r}, (26)

where r > 0 is the parameter that determines its size.

For morphological amoebas c(xi, xi+1) = 1+λ|f(xi)−f(xi+1)|, where λ > 0 is a weight parameter. The difference |f(xi)−f(xi+1)| penalizes the changes in gray level values, i.e., restricts structuring elements from growing across high gradients. The number 1 stands for the spatial distance between adjacent pixels and better properties of morphological amoebas can be achieved if Euclidean or weighted distances [74] are used instead of 1. Similarly to morphological amoebas, Grazzini and Soille [40] used the following costs between two adjacent pixels xi and xi+1 to define adaptive structuring elements:

c(xi, xi+1) = ½ (|∇f(xi)| + |∇f(xi+1)|) · ‖xi − xi+1‖, (27)

and

c(xi, xi+1) = ½ |f(xi) − f(xi+1)| · ‖xi − xi+1‖. (28)

Region growing structuring elements [42] lie halfway between GANs and morphological amoebas. The points that belong to a structuring element are included with a region growing procedure, and the number of points in the structuring element is predefined. The growing procedure is based on the difference between gray level values of the adjacent points, which makes it similar to morphological amoebas if c(xi, xi+1) = |f(xi) − f(xi+1)|. This method produces adaptive structuring elements that do not necessarily occupy a whole homogeneous region as is the case with GANs. The computational complexity for morphological amoebas is O(N × r² log r²), where r determines the size of the distance propagation, i.e., is the radius of the morphological amoeba. Notice that the computational cost could reach O(N² log N) if the size of the structuring elements is the whole image.
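A minimal sketch of a morphological amoeba (eq. 26) grown by Dijkstra-like propagation of the amoeba distance, with cost c(xi, xi+1) = 1 + λ|f(xi) − f(xi+1)|; the 4-connected grid graph and the priority-queue implementation are our own choices:

```python
import heapq
import numpy as np

def amoeba(f, x, lam, r):
    # S_x^r = {y : d(x, y) < r} (eq. 26), where d is the minimal path cost
    # (eq. 25) accumulated from c = 1 + lam * |f(x_i) - f(x_{i+1})|.
    h, w = f.shape
    dist = {x: 0.0}
    heap = [(0.0, x)]
    while heap:
        d, (pr, pc) = heapq.heappop(heap)
        if d > dist.get((pr, pc), np.inf):
            continue                       # stale queue entry
        for nr, nc in ((pr - 1, pc), (pr + 1, pc),
                       (pr, pc - 1), (pr, pc + 1)):
            if 0 <= nr < h and 0 <= nc < w:
                nd = d + 1.0 + lam * abs(f[pr, pc] - f[nr, nc])
                if nd < r and nd < dist.get((nr, nc), np.inf):
                    dist[(nr, nc)] = nd
                    heapq.heappush(heap, (nd, (nr, nc)))
    return set(dist)
```

On a flat image the amoeba reduces to a city-block disk, while a high-contrast edge makes the crossing cost prohibitive, so the amoeba stops at the edge.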

4.3 Salience adaptive structuring elements

Salience adaptive structuring elements [41] are adaptive structuring elements computed with path-based distances on the salience map of the input image. The salience map is obtained from the salience distance transform [75] of the weighted edges or some other attributes in the input image, containing information about the important structure in the image. For this particular method, the salience map SM is computed as

SM(z) = Offset + ⋁_{w ∈ D} (NMS(f)(w) − ‖w − z‖), z ∈ D, (29)

where NMS(f) is the result of non-maximal suppression applied to the Gaussian gradient magnitude, as used in the Canny edge detector, and

Offset = ⋀_{z ∈ D} ⋁_{w ∈ D} (NMS(f)(w) − ‖w − z‖). (30)

The smoothness used in the Gaussian derivatives is comparable to that of the pilot image in morphological amoebas. Since the salience map SM already incorporates spatial and tonal information, the cost of the path between two adjacent points xi and xi+1 is computed as

c(xi, xi+1) = SM(xi) + SM(xi+1), (31)

and a salience adaptive structuring element Sx^r is defined by equation (26) using the cost given by equation (31). To allow a larger flexibility in size, the radii of the salience adaptive structuring elements also depend on the salience map SM, so that structuring elements located close to edges in the input image are smaller in size while structuring elements in homogeneous areas are larger. The salience map SM can be further controlled by scaling the initial edge map or using a different distance propagation for its construction. Salience adaptive structuring elements have the same computational cost as morphological amoebas since they are based on the geodesic distance propagation on the salience map.

4.4 Elliptical adaptive structuring elements

A structure-adaptive morphological filtering method was presented by Landström and Thurley [53]: elliptical adaptive structuring elements are based on the well-known Local Structure Tensor (LST), which is a 2x2 matrix for each pixel (in 2D) whose eigenvectors and eigenvalues, respectively, contain information about the orientation of structures in the image (i.e. edges) and the rate of anisotropy [76]. More specifically, the LST T is given by

T(x) = Gσ ∗ (∇f(x) ∇ᵀf(x)), (32)

where ∇ = (∂/∂x1, ∂/∂x2)ᵀ is the gradient (nabla) operator and Gσ is a Gaussian kernel with standard deviation σ, which acts to regularize the matrix as well as setting the scale at which structures should be considered.

The orientation of the elliptical structuring elements is then given by the orientation of the eigenvectors of T, while the semi-major and -minor axes a and b are set from

a(x) = (λ1(x) + ε) / (λ1(x) + λ2(x) + 2ε) · M, (33)

b(x) = (λ2(x) + ε) / (λ1(x) + λ2(x) + 2ε) · M, (34)

where M is the user-defined semi-major axis, ε > 0 is a small constant (i.e. machine epsilon), and λ1 and λ2 are the eigenvalues of T. The resulting elliptical structuring elements vary dynamically between lines, where the image is highly anisotropic, and disks, in isotropic regions. The LST is calculated on a neighborhood defined by a user-supplied radial bandwidth rw defining σ, whereafter structuring elements are defined from the eigenvectors and eigenvalues of the LST based on a user-set maximum semi-major axis M. Given a set of pre-defined structuring elements, which can be stored in a look-up table since a certain number of specific shapes are considered, the computational cost for the method is O(N × R), where R represents the maximum number of neighborhood pixels used in the operations, resulting from the choice of M and rw.
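The tensor field (32) and the axes (33)-(34) can be computed with a few lines of array code. The sketch below is our own plain-NumPy rendering: the separable Gaussian uses zero padding at the border, and the eigenvalues of the symmetric 2x2 tensor are taken in closed form. Here theta is the orientation of the dominant eigenvector of T, i.e. the across-structure direction, so the semi-major axis of the ellipse would be laid out perpendicular to it (an assumption of this sketch):

```python
import numpy as np

def _smooth(a, sigma):
    # Separable Gaussian G_sigma (zero-padded via np.convolve 'same')
    radius = int(3 * sigma) + 1
    t = np.arange(-radius, radius + 1)
    k = np.exp(-t ** 2 / (2 * sigma ** 2))
    k /= k.sum()
    a = np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 0, a)
    return np.apply_along_axis(lambda v: np.convolve(v, k, mode="same"), 1, a)

def ease_axes(f, sigma, M, eps=1e-12):
    fy, fx = np.gradient(f.astype(float))
    # Componentwise regularization of grad f * grad f^T  (eq. 32)
    t11 = _smooth(fx * fx, sigma)
    t12 = _smooth(fx * fy, sigma)
    t22 = _smooth(fy * fy, sigma)
    # Closed-form eigenvalues of the symmetric 2x2 tensor, lam1 >= lam2
    half_tr = (t11 + t22) / 2.0
    root = np.sqrt(((t11 - t22) / 2.0) ** 2 + t12 ** 2)
    lam1, lam2 = half_tr + root, half_tr - root
    a = (lam1 + eps) / (lam1 + lam2 + 2 * eps) * M     # eq. (33)
    b = (lam2 + eps) / (lam1 + lam2 + 2 * eps) * M     # eq. (34)
    theta = 0.5 * np.arctan2(2 * t12, t11 - t22)       # dominant eigenvector
    return a, b, theta
```

For a horizontal intensity ramp (vertical isophotes) the tensor is rank one, so a ≈ M, b ≈ 0 and theta = 0: the structuring element degenerates to a line along the structure.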

5 Experimental results

Studying the behaviour of different methods is important for understanding how different types of adaptivity affect the operations. We therefore present examples where a selected set of methods have been applied to the same images, considering the different behaviours caused by the different definitions as well as main differences and similarities. We do not intend to present an extensive comparison between different methods for adaptive morphological operators since they rely on: (1) different image attributes, (2) different measures of these attributes, and (3) different user-set parameters for these measures. As such they are not directly comparable, and a quantitative comparison can easily become biased due to the choice of images and parameters. The choice of parameters will of course play an important role in determining the size and/or shape of the structuring elements, and a number of different parameter setups were investigated for each of the considered methods. However, as the parameters are not directly comparable we focus on analyzing the behaviour of the structuring elements based on their definition and the underlying theory, and the presented results are based on parameters for which the methods have structuring elements of similar average size. The parameter values used in the experiments are summarized in Table 1. Note that for SASE the adaptive radius r(x) = k · mean(SM) − SM(x), x ∈ D is used. All methods considered here depend on the computation of adaptive structuring elements and the subsequent finding of the minimum or maximum of gray level values in the input image within the resulting neighborhood. In this study we perform three experiments.

Table 1: Parameters used in the performed experiments.

Experiment    | Image size | GAN    | MA               | SASE   | EASE
Fig. 2        | 186 × 186  | m = 5  | λ = 0.25, r = 40 | k = 25 | M = rw = 20
Fig. 3, left  | 91 × 91    | m = 30 | λ = 0.25, r = 10 | k = 7  | M = rw = 10
Fig. 3, right | 170 × 170  | m = 10 | λ = 0.25, r = 4  | k = 4  | M = rw = 4


Figure 2: First row: Calculated structuring elements for (a) GANs, (b) morphological amoebas, (c) SASEs, (d) EASEs. Second row: Same as the first row but for a noisy image.

5.1 Experiment 1: Structuring element shapes

We first compare the shapes of adaptive structuring elements for a noiseless synthetic image, as well as for the same image with added independent identically distributed white Gaussian noise with standard deviation σ = 5. Figure 2 shows the shape of adaptive structuring elements for three points in the image. The circles denote the considered points in the image. Note how the two GANs for the green and the blue points blend together in Fig. 2a, which demonstrates that the structuring elements overlap. In the noisy case (Fig. 2e), the GAN for the red point contains many holes, while the other GANs shrink to a single point. We see that GANs indeed disregard any spatial distances, and are completely based on zones of homogeneous intensity. With increasing noise level, originally wide structuring elements become more and more sparse (i.e. have holes) while thin structuring elements risk being reduced in size to a few pixels (or even a single one). Morphological amoebas align strictly to strong edges, while forming more disk-like shapes in regions of more isotropic structure (Fig. 2b, red structuring element). That is, edges cut off the structuring elements but do not otherwise affect the size of the amoebas. For noisy data (Fig. 2f), structuring elements shrink as the intensity variation within the amoebas increases, causing amoeba distances to increase more rapidly with respect to spatial distances. Note the few holes that appear in the structuring element denoted with red color. SASEs cross edges more than amoebas. Comparing the sizes of the larger blue and the smaller green structuring elements (Fig. 2c), we also see that the size of these structuring elements is affected more near strong edges (in contrast to the case with amoebas). The size of SASEs is affected in the noisy image due to the large difference between the strong and weak edges in the input image (Fig. 2g). In most cases they form shapes with fewer holes than morphological amoebas.
Adaptive elliptical structuring elements have convex and symmetric shapes without holes in them, and they are the least affected by the noise (Figs. 2d and 2h). Note that the line structuring element, denoted with red color, is the same for the noiseless and the noisy image. The elliptical structuring elements become line-shaped where there is a clear dominant directional structure in the image, while changing towards disks where directional structure is more ambiguous: the red structuring element is a vertical line while the blue structuring element is wider than its green counterpart. Note that the salience adaptive and the elliptical structuring elements in the upper part of the image breach the border between the shapes more than the other two methods (although this is dependent on the choice of λ for morphological amoebas).

5.2 Experiment 2: Gray level contrast

The second experiment concerns morphological dilation of a synthetic image with diamonds of different size and varying contrast, with smooth intensity transitions between the objects and background. The results are depicted in Fig. 3, left column. Dilation with GANs changes the complete background, assigning it the value of the tolerance m, while leaving the inner parts of the diamonds almost intact (Fig. 3 (b)). Note the artefacts that appear around the diamonds for dilation with morphological amoebas, where the shapes of the diamonds are all changed (Fig. 3 (c)). These effects result from small differences in contrast, in this case caused by the smoothing prefiltering, which decreases the gray level values at the corners of the diamonds in the pilot image. Although this could be changed by using another type of prefiltering to create the pilot image (here a Gaussian filter with standard deviation σ = 1 was used), the results indicate a sensitivity towards small contrast changes that can give rise to clear artefacts in the filtered image. SASEs do not handle the corners of the shapes well, but cause fewer artefacts than amoebas. This is the only method where dilation grows the shapes with low contrast more than the high-contrast ones (Fig. 3 (d)). Elliptical adaptive structuring elements cause the fewest artefacts, and the resulting shapes of the dilated diamonds are not affected by their different levels of contrast (Fig. 3 (e)).


Figure 3: Left column: Dilation of the diamond image; Right column: Opening of the historical text document. (a) Input image, (b) GANs, (c) Morphological amoebas, (d) SASEs, (e) EASEs.

5.3 Experiment 3: Structures in real world data

Our third experiment demonstrates morphological openings of a gray-level image containing parts of a historical text document (see Fig. 3 (a), right). The contours of the letters constitute structures that are occasionally discontinuous, i.e. have small gaps. Note that the text printed on the back side is partly visible in the background. The text is not significantly changed by GANs (Fig. 3 (b)), but noise in the background is reduced. Morphological amoebas (Fig. 3 (c)) fill in the smallest gaps in the letters and reduce noise. The salience adaptive structuring elements close gaps in the contour, but may cause blurred letters in the result (Fig. 3 (d)). The elliptical structuring elements, however, efficiently fill in the gaps without much change to the outer contours of the letters (Fig. 3 (e)).

6 Discussion

From classical non-adaptive morphology we are used to well-defined and often quite small structuring elements that represent some type of intuitive shape, e.g. disks, diamonds, lines, and crosses. In the adaptive case, however, the diversity of shapes used as structuring elements is more or less unlimited and varies substantially between the methods. Nevertheless, it should be noted that all structuring elements in the more closely investigated methods share two fundamental properties: (1) they are connected components and (2) they include the point x that they belong to (i.e. their origin). Methods based solely on connected homogeneous areas in the image, such as General Adaptive Neighborhoods, are completely unaware of spatial relations as long as connectivity with the origin of the structuring element is preserved. As a consequence, such methods are unaffected by scale (i.e. large vs. small objects). Also, for GANs, all points in the same connected homogeneous region will share the same adaptive structuring element independently of the neighboring structures in the image. Moreover, when applying this type of structuring elements to a close-to-uniform region, such as a black background with contrast variations, local variations in the input image may affect large regions in the output. On the other hand, if an image is highly corrupted by noise, methods based solely on gray-level values will still disregard highly noisy values and only include points for which the intensity levels are more similar to the considered origin. Hence, more extreme outliers (in terms of intensity levels) are not allowed to affect the result, which makes methods such as GANs suitable for noise removal. Methods based on weighted combinations of gray-level and spatial information provide a tool for spatial constraints, thereby taking the geometry of shapes into account. This is more similar to what we are used to from classical morphological processing.
In the case of morphological amoebas, for instance, the parameter λ regulates the weighting of two incommensurate quantities (spatial distance and gray-level difference) and has a strong influence on the size and shape of the morphological amoebas. The resulting structuring elements adapt well to edges when gray-level values have larger impact, but increasing the weight of the spatial distance enables crossing of even strong edges. The weighting of the two thereby provides a tradeoff between flexibility, i.e. avoiding edge crossings, and constraints that force the structuring elements to cross edges. In the case of morphological amoebas, the two parameters λ and r should be set simultaneously. Setting an explicit weight parameter, such as the λ in morphological amoebas, can be challenging. To overcome this issue, adaptive structuring elements can be computed from a mapping that holds information about both spatial distances and gray-level values. This is done for salience adaptive structuring elements, which cross edges but are smaller in size near strong edges. In general, the structuring elements used in similarity-based methods adapt very well to image regions. Such methods should therefore be very useful for noise removal but are hard to use for linking segments together into larger connected regions, as is often the purpose of morphological operations such as closings. Such linking of structures can be obtained by enforcing further restrictions on the structuring elements with regard to shape, as is done for the adaptive elliptical structuring elements. With the constraint of adaptivity to a predefined (but variable) shape, it is possible to prolong or link existing dominant structure in the image while preserving the rest of the structure. An important practical aspect is the computational cost for each of the methods considered in this study. Generally speaking, methods based on the propagation of geodesic distance (e.g.
Morphological amoebas and Salience adaptive structuring elements) have relatively high computational complexity due to the distance propagation. Nonetheless, the complexity may be reduced by using optimized algorithms for distance propagation. Methods based solely on differences between gray level values, such as General adaptive neighborhoods, require less computational time. If only a limited number of predefined shapes is required, such as for the Elliptical adaptive structuring elements, precalculated libraries of structuring elements can be used for fast processing (especially for batch processing).
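As a concrete illustration of the look-up table idea, the sketch below (entirely our own construction, not taken from [53]) precomputes binary elliptical masks over quantized orientations and minor axes, so that per-pixel filtering only needs a table lookup:

```python
import numpy as np

def ellipse_mask(a, b, theta, M):
    # Binary ellipse with semi-axes (a, b) rotated by theta, sampled on a
    # (2M+1) x (2M+1) grid centred at the origin.
    y, x = np.mgrid[-M:M + 1, -M:M + 1]
    u = x * np.cos(theta) + y * np.sin(theta)
    v = -x * np.sin(theta) + y * np.cos(theta)
    return (u / a) ** 2 + (v / b) ** 2 <= 1.0

# Quantize orientation and minor axis; per-pixel processing then maps the
# measured (theta, b) to the nearest table entry instead of rebuilding masks.
M = 7
n_theta, n_ratio = 16, 8
lut = {(i, j): ellipse_mask(M, max(1.0, M * (j + 1) / n_ratio),
                            np.pi * i / n_theta, M)
       for i in range(n_theta) for j in range(n_ratio)}
```

With 16 × 8 entries the table costs only a few kilobytes and removes all trigonometry from the per-pixel loop, which is what makes batch processing with precomputed shape libraries cheap.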

7 Perspectives and trends

As shown in Sec. 5 and discussed in Sec. 6, the methods presented in this survey have both benefits and drawbacks, depending on the specific task. One could then ask whether a single “ideal” method for adaptive mathematical morphology exists. The answer to this question is probably no, since every user (and every application) has a different idea of what “ideal” means in this context. None of the methods for adaptive morphological operators can be considered best for all purposes, and their usefulness is application dependent. Nonetheless, adaptive mathematical morphology seems to have room for several improvements and possible future studies. We note here some possible directions for future research. Few, if any, methods for adaptive morphological operators have spread to the wider image analysis community. This might be due to the relatively high computational cost required to compute adaptive structuring elements. Efficient algorithms for computing adaptive structuring elements need to be developed, and this might be necessary in order to allow adaptive morphology to become more extensively used.

Furthermore, most adaptive morphological operators that have been defined for the Euclidean space are, in fact, special cases of the recently introduced mathematical morphology on Riemannian manifolds [77]. This seems to be a promising direction in which to generalize adaptive mathematical morphology and build a unified framework for all methods. Hence, we believe that this direction will be further explored.

Although the basic morphological operators have been properly defined (as presented in Sec. 3), it is still an open problem how to obtain useful granulometries based on adaptive structuring elements. As the absorption property is not necessarily satisfied, care must be taken. The granulometry is a historically important tool in mathematical morphology, and its relationship to adaptive morphological operators needs to be defined.

As is easily noted, all methods presented in Sec. 4 are defined for gray-valued images and not for multivalued ones. Even the extension of classical mathematical morphology to multivalued data is a challenging task, since it depends on ordering vectors, and there is no unambiguous way to order vectors [78, 79]. This extension becomes even more complicated when structuring elements adapt to the image content, since structuring elements could adapt differently for different image channels. So far only a few studies on adaptive morphological operators for color images have been presented [39, 80], and this remains an interesting topic for further investigation.
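For reference, the absorption property in question is the defining axiom of a granulometry: a family of openings $(\gamma_\lambda)_{\lambda>0}$, indexed by a size parameter, forms a granulometry when for all $\lambda, \mu > 0$

    \gamma_\lambda \, \gamma_\mu \;=\; \gamma_\mu \, \gamma_\lambda \;=\; \gamma_{\max(\lambda,\mu)} ,

i.e., sieving with two mesh sizes is equivalent to sieving once with the larger one. For translation-invariant openings by scaled convex structuring elements this holds by construction, but when the structuring element varies from point to point the composed operators need not satisfy it, which is why care must be taken.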

Acknowledgement

We would like to thank Dr. Johan Debayle for providing us with his code on General Adaptive Neighborhoods. Fig. 3 (a, right) is used courtesy of Per Cullhed, Uppsala University Library, and Fredrik Wahlberg, the Handwritten Text Recognition project at Uppsala University.

References

[1] G. Matheron, Random sets and integral geometry. New York: Wiley, 1975, vol. 1.

[2] J. Serra, Image analysis and mathematical morphology. London: Academic Press, 1982.

[3] H. J. A. M. Heijmans and C. Ronse, “The algebraic basis of mathematical morphology. I. Dilations and erosions,” Computer Vision, Graphics, and Image Processing, vol. 50, no. 3, pp. 245–295, 1990.

[4] C. Ronse and H. J. A. M. Heijmans, “The algebraic basis of mathematical morphology. II. Openings and closings,” Computer Vision, Graphics, and Image Processing, vol. 55, no. 1, pp. 74–97, 1991.

[5] P. Maragos, “Pattern spectrum and multiscale shape representation,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 701–716, 1989.

[6] L. Alvarez, F. Guichard, P.-L. Lions, and J.-M. Morel, “Axioms and fundamental equations of image processing,” Archive for rational mechanics and analysis, vol. 123, no. 3, pp. 199–257, 1993.

[7] R. W. Brockett and P. Maragos, “Evolution equations for continuous-scale morphological filtering,” IEEE Transactions on Signal Processing, vol. 42, no. 12, pp. 3377–3386, 1994.

[8] L. Vincent, “Graphs and mathematical morphology,” Signal Processing, vol. 16, no. 4, pp. 365–388, 1989.

[9] H. Heijmans, P. Nacken, A. Toet, and L. Vincent, “Graph morphology,” Journal of Visual Communication and Image Representation, vol. 3, no. 1, pp. 24–38, 1992.

[10] J. Cousty, L. Najman, F. Dias, and J. Serra, “Morphological filtering on graphs,” Computer Vision and Image Understanding, pp. 1–20, 2012.

[11] P. Soille, Morphological Image Analysis: Principles and Applications, 2nd ed. Springer-Verlag New York, Inc., 2003.

[12] L. Najman and H. Talbot, Mathematical Morphology. John Wiley & Sons, 2013.

[13] C. L. Luengo Hendriks and L. J. Van Vliet, “A rotation-invariant morphology for shape analysis of anisotropic objects and structures,” in Visual Form 2001, pp. 378–387. Springer, 2001.

[14] S. Beucher, “Numerical residues,” Image and Vision Computing, vol. 25, no. 4, pp. 405–415, 2007.

[15] P. Maragos and C. Vachier, “Overview of adaptive morphology: trends and perspectives,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2241–2244. IEEE, 2009.

[16] J. Serra, Image analysis and mathematical morphology. Vol. 2. New York, NY: Academic Press, 1988.

[17] S. Beucher, J. Blosseville, and F. Lenoir, “Traffic spatial measurements using video image processing,” in Robotics and IECON’87 Conferences, pp. 648–655. International Society for Optics and Photonics, 1988.

[18] J. Verly and R. Delanoy, “Adaptive mathematical morphology for range imagery,” IEEE Transactions on Image Processing, vol. 2, no. 2, pp. 272–275, 1993.

[19] M. Charif-Chefchaouni and D. Schonfeld, “Spatially-variant mathematical morphology,” in IEEE International Conference on Image Processing (ICIP), vol. 2, pp. 555–559. IEEE, 1994.

[20] A. Morales, “Adaptive structuring element for noise and artifact removal,” in Proceedings of the 23rd Conference on Information Sciences and Systems, March 1989.

[21] C.-S. Chen, J.-L. Wu, and Y.-P. Hung, “Theoretical aspects of vertically invariant gray-level morphological operators and their application on adaptive signal and image filtering,” IEEE Transactions on Signal Processing, vol. 47, no. 4, pp. 1049–1060, 1999.

[22] F. Cheng and A. N. Venetsanopoulos, “Adaptive morphological operators, fast algorithms and their applications,” Pattern Recognition, vol. 33, no. 6, pp. 917–933, 2000.

[23] J. Debayle and J. Pinoli, “Spatially adaptive morphological image filtering using intrinsic structuring elements,” Image Analysis & Stereology, vol. 24, no. 3, pp. 145–158, 2005.

[24] R. Lerallut, É. Decencière, and F. Meyer, “Image filtering using morphological amoebas,” in Mathematical Morphology: 40 Years On, 2005.

[25] N. Bouaynaya and D. Schonfeld, “Spatially variant morphological image processing: theory and applications,” in Proceedings of SPIE, vol. 6077, pp. 673–684, 2006.

[26] N. Bouaynaya, M. Charif-Chefchaouni, and D. Schonfeld, “Theoretical foundations of spatially-variant mathematical morphology part I: Binary images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5, pp. 823–836, 2008.

[27] N. Bouaynaya and D. Schonfeld, “Theoretical foundations of spatially-variant mathematical morphology part II: Gray-level images,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 30, no. 5, pp. 837–850, 2008.

[28] J. Roerdink, “Adaptivity and group invariance in mathematical morphology,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2253–2256. IEEE, 2009.

[29] J. Angulo, “Morphological bilateral filtering and spatially-variant adaptive structuring functions,” in Mathematical Morphology and Its Applications to Image and Signal Processing, pp. 212–223. Springer, 2011.

[30] P. Salembier, “Study on nonlocal morphological operators,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2269–2272. IEEE, 2009.

[31] S. Velasco-Forero and J. Angulo, “On nonlocal mathematical morphology,” in Mathematical Morphology and Its Applications to Signal and Image Processing, pp. 219–230. Springer, 2013.

[32] P. Purkait and B. Chanda, “Adaptive morphologic regularizations for inverse problems,” in Mathematical Morphology and Its Applications to Signal and Image Processing, pp. 195–206. Springer, 2013.

[33] M. Nagao and T. Matsuyama, “Edge preserving smoothing,” Computer Graphics and Image Processing, vol. 9, pp. 394–407, 1979.

[34] C. Tomasi and R. Manduchi, “Bilateral filtering for gray and color images,” in Sixth International Conference on Computer Vision, pp. 839–846. IEEE, 1998.

[35] P. Milanfar, “A tour of modern image filtering: new insights and methods, both practical and theoretical,” Signal Processing Magazine, IEEE, vol. 30, no. 1, pp. 106–128, 2013.

[36] P. Perona and J. Malik, “Scale-space and edge detection using anisotropic diffusion,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 12, no. 7, pp. 629–639, 1990.

[37] U. Braga-Neto, “Alternating sequential filters by adaptive-neighbourhood structuring functions,” in Proc. of the International Symposium on Mathematical Morphology, pp. 139–146, 1996.

[38] O. Cuisenaire, “Locally adaptable mathematical morphology using distance transformations,” Pattern Recognition, vol. 39, no. 3, pp. 405–416, 2006.

[39] R. Lerallut, É. Decencière, and F. Meyer, “Image filtering using morphological amoebas,” Image and Vision Computing, vol. 25, no. 4, pp. 395–404, 2007.

[40] J. Grazzini and P. Soille, “Edge-preserving smoothing using a similarity measure in adaptive geodesic neighbourhoods,” Pattern Recognition, vol. 42, no. 10, pp. 2306–2316, 2009.

[41] V. Ćurić, C. Luengo Hendriks, and G. Borgefors, “Salience adaptive structuring elements,” IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 7, pp. 809–819, 2012.

[42] V. Morard, É. Decencière, P. Dokládal et al., “Region growing structuring elements and new operators based on their shape,” Signal and Image Processing, pp. 78–85, 2011.

[43] P. Dokládal and E. Dokládalová, “Grey-scale morphology with spatially-variant rectangles in linear time,” in Advanced Concepts for Intelligent Vision Systems, pp. 674–685. Springer, 2008.

[44] V. Ćurić and C. L. Luengo Hendriks, “Adaptive structuring elements based on salience information,” in Computer Vision and Graphics, pp. 321–328. Springer, 2012.

[45] V. Ćurić and C. L. Luengo Hendriks, “Salience-based parabolic structuring functions,” in Mathematical Morphology and Its Applications to Signal and Image Processing, pp. 183–194. Springer, 2013.

[46] J. Angulo and S. Velasco-Forero, “Stochastic morphological filtering and Bellman–Maslov chains,” in Mathematical Morphology and Its Applications to Signal and Image Processing, pp. 171–182. Springer, 2013.

[47] C. Vachier and F. Meyer, “News from viscousland,” in Proceedings of the 8th International Symposium on Mathematical Morphology, São José dos Campos, Brazil, pp. 189–200, 2007.

[48] P. Maragos and C. Vachier, “A PDE formulation for viscous morphological operators with extensions to intensity-adaptive operators,” in 15th IEEE International Conference on Image Processing (ICIP), pp. 2200–2203. IEEE, 2008.

[49] J. Serra, “Viscous lattices,” Journal of Mathematical Imaging and Vision, vol. 22, no. 2, pp. 269–282, 2005.

[50] F. Shih and S. Cheng, “Adaptive mathematical morphology for edge linking,” Information Sciences, vol. 167, no. 1, pp. 9–21, 2004.

[51] O. Tankyevych, H. Talbot, P. Dokládal, and N. Passat, “Direction-adaptive grey-level morphology. Application to 3D vascular brain imaging,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2261–2264. IEEE, 2009.

[52] R. Verdú-Monedero, J. Angulo, and J. Serra, “Anisotropic morphological filters with spatially-variant structuring elements based on image-dependent gradient fields,” IEEE Transactions on Image Processing, vol. 20, no. 1, pp. 200–212, 2011.

[53] A. Landström and M. J. Thurley, “Adaptive morphology using tensor-based elliptical structuring elements,” Pattern Recognition Letters, vol. 34, no. 12, pp. 1416–1422, 2013.

[54] A. Landström, M. J. Thurley, and H. Jonsson, “Adaptive morphological filtering of incomplete data,” in 2013 International Conference on Digital Image Computing: Techniques and Applications (DICTA), pp. 435–441. IEEE, 2013.

[55] J. Angulo and S. Velasco-Forero, “Structurally adaptive mathematical morphology based on nonlinear scale-space decompositions,” Image Analysis & Stereology, vol. 30, no. 2, pp. 111–122, 2011.

[56] M. Breuß, B. Burgeth, and J. Weickert, “Anisotropic continuous-scale morphology,” Pattern Recognition and Image Analysis, pp. 515–522, 2007.

[57] V.-T. Ta, A. Elmoataz, and O. Lézoray, “Nonlocal PDEs-based morphology on weighted graphs for image and data processing,” IEEE Transactions on Image Processing, vol. 20, no. 6, pp. 1504–1516, 2011.

[58] M. Welk, M. Breuß, and O. Vogel, “Morphological amoebas are self-snakes,” Journal of Mathematical Imaging and Vision, vol. 39, no. 2, pp. 87–99, 2011.

[59] M. Welk, “Relations between amoeba median algorithms and curvature-based PDEs.” in SSVM, pp. 392–403, 2013.

[60] J. Stawiaski and F. Meyer, “Minimum spanning tree adaptive image filtering,” in 16th IEEE International Conference on Image Processing (ICIP), pp. 2245–2248. IEEE, 2009.

[61] H. Heijmans, M. Buckley, and H. Talbot, “Path openings and closings,” Journal of Mathematical Imaging and Vision, vol. 22, no. 2–3, pp. 107–119, 2005.

[62] J. Roerdink and H. Heijmans, “Mathematical morphology for structures without translation symmetry,” Signal Processing, vol. 15, no. 3, pp. 271–277, 1988.

[63] J. Roerdink, “Group morphology,” Pattern Recognition, vol. 33, no. 6, pp. 877–895, 2000.

[64] R. Adams, “Radial decomposition of disks and spheres,” CVGIP: Graphical models and image processing, vol. 55, no. 5, pp. 325–332, 1993.

[65] J. Y. Gil and R. Kimmel, “Efficient dilation, erosion, opening, and closing algorithms,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 12, pp. 1606–1617, 2002.

[66] H. Hedberg, P. Dokladal, and V. Owall, “Binary morphology with spatially variant structuring elements: Algorithms and architecture,” IEEE Transactions on Image Processing, vol. 18, no. 3, pp. 562–572, 2009.

[67] I. Bloch, H. Heijmans, and C. Ronse, “Mathematical morphology,” in Handbook of Spatial Logics, pp. 857–944. Springer, 2007.

[68] J. Debayle and J.-C. Pinoli, “General adaptive neighborhood image processing – part I: introduction and theoretical aspects,” Journal of Mathematical Imaging and Vision, vol. 25, no. 2, pp. 245–266, 2006.

[69] J. Debayle and J.-C. Pinoli, “General adaptive neighborhood image processing – part II: practical application examples,” Journal of Mathematical Imaging and Vision, vol. 25, no. 2, pp. 266–284, 2006.

[70] J. Debayle and J.-C. Pinoli, “General adaptive neighborhood Choquet image filtering,” Journal of Mathematical Imaging and Vision, vol. 35, no. 3, pp. 173–185, 2009.

[71] J.-C. Pinoli and J. Debayle, “Logarithmic adaptive neighborhood image processing (LANIP): introduction, connections to human brightness perception, and application issues,” EURASIP Journal on Applied Signal Processing, vol. 2007, no. 1, pp. 114–114, 2007.

[72] J.-C. Pinoli and J. Debayle, “Spatially and intensity adaptive morphology,” IEEE Journal of Selected Topics in Signal Processing, vol. 6, no. 7, pp. 820–829, 2012.

[73] P. Soille, “Generalized geodesy via geodesic time,” Pattern Recognition Letters, vol. 15, no. 12, pp. 1235–1240, 1994.

[74] G. Borgefors, “Distance transformations in digital images,” Computer Vision, Graphics, and Image Processing, vol. 34, no. 3, pp. 344–371, 1986.

[75] P. L. Rosin and G. A. West, “Salience distance transforms,” Graphical Models and Image Processing, vol. 57, no. 6, pp. 483–521, 1995.

[76] L. Cammoun, C. Castaño-Moraga, E. Muñoz-Moreno, D. Sosa-Cabrera, B. Acar, M. Rodriguez-Florido, A. Brun, H. Knutsson, and J. Thiran, “A review of tensors and tensor signal processing,” in Tensors in Image Processing and Computer Vision, pp. 1–32. Springer, 2009.

[77] J. Angulo and S. Velasco-Forero, “Mathematical morphology for real-valued images on Riemannian manifolds,” in Mathematical Morphology and Its Applications to Signal and Image Processing, pp. 279–291. Springer, 2013.

[78] E. Aptoula and S. Lefèvre, “A comparative study on multivariate mathematical morphology,” Pattern Recognition, vol. 40, no. 11, pp. 2914–2929, 2007.

[79] J. Angulo, “Morphological colour operators in totally ordered lattices based on distances: Application to image filtering, enhancement and analysis,” Computer Vision and Image Understanding, vol. 107, no. 1, pp. 56–73, 2007.

[80] J. Debayle and J.-C. Pinoli, “Spatially adaptive color image processing,” in Advances in Low-Level Color Image Processing, pp. 195–222. Springer, 2014.