
Classifying image analysis techniques from their output

C. Guada 1, D. Gómez 2, J.T. Rodríguez 1, J. Yáñez 1, J. Montero 1,3

1 Facultad de Ciencias Matemáticas, Complutense University, Plaza de las Ciencias 3, Madrid, 28040, Spain. E-mail: [email protected], [email protected], [email protected], [email protected]

2 Facultad de Estudios Estadísticos, Complutense University, Av. Puerta de Hierro s/n, Madrid, 28040, Spain. E-mail: [email protected]

3 Instituto de Geociencias IGEO (CSIC-UCM), Plaza de las Ciencias 3, Madrid, 28040, Spain

Abstract

In this paper we discuss some main image processing techniques in order to propose a classification based upon the output these methods provide. Although a particular technique can be supervised or unsupervised, and can allow or not the existence of fuzzy information at some stage, each technique has usually been designed to focus on a specific objective, and their outputs differ according to each objective; thus, they are in fact different methods. But due to the essential relationship between them, they are quite often confused. In particular, this paper pursues a clarification of the differences between image classification, edge detection and image segmentation, among other image processing techniques.

Keywords: Image segmentation, image classification, edge detection, fuzzy sets, machine learning, graphs.

1. Introduction

Image analysis or image processing has become a scientific hot topic during the last decades, particularly because of the increasing amount of relevant information being stored in this format. Image analysis has a wide range of applications in different areas, such as image and data storage for transmission in business applications, medical imaging, acoustics and security, among many other fields. Image processing techniques are being increasingly demanded through all areas of science and industry.

Many different techniques are considered "image processing" or "image analysis" techniques. Usually, each technique is appropriate for a small range of tasks or for a specific problem.

For example, machine learning algorithms used in image analysis help in finding solutions to many problems in speech recognition 1,2 and vision 3. Computer vision uses image processing algorithms to extract significant information automatically from images. A number of algorithms are available to analyze images and recognize objects in them, and depending on the delivered output and the manner in which the images were encoded, specific learning methodologies are needed 4.

Another well-known example is image segmentation. Image segmentation methods try to simplify an image into something easier to analyze. Usually the image is transformed into segments or regions. These regions are supposed to be connected and represent a set of homogeneous pixels. Nevertheless, it is not clear when certain techniques should be considered as image segmentation methods, and such a situation also applies to other image processing techniques. It is quite common to classify two methods in the same group of techniques even if the output and the information they provide are very different. This is the case, for example, of edge detection methods 5, image clustering 6 and image thresholding algorithms 7, all of them classified as "image segmentation procedures" despite the obvious differences in their outputs.

There is a significant difference between image classification methods, edge detection methods, image segmentation methods and hierarchical image segmentation methods. Particularly, besides the strong relationship between them, all these methods address different problems, and produce different outputs. However, those different methods are not always clearly differentiated in the literature, and are sometimes even confused. This paper poses a critical view on some of these techniques in order to show the conceptual differences between them. Depending on how a particular algorithm detects objects in an image and shows the output, the method might be understood as performing either classification, detection, segmentation or hierarchical segmentation.

Hence, the main objective of this paper is to present a classification of a set of widely used image processing algorithms, attending to the problems they face, the learning scheme they are based on, and the representational framework they utilize. Some new image analysis concepts and methods are also provided.

In order to understand the remainder of the paper, the following notation is adopted. The image we consider is modeled as a two-dimensional function, where x and y are the coordinates in a plane, and f(x,y) represents each pixel by a fixed number of measurable attributes 8. Formally, an image can be defined as I = {f(x,y); x = 1,...,n and y = 1,...,m}, which can be represented computationally as:

• Binary ≡ f(x,y) ∈ {0,1}.
• Gray ≡ f(x,y) ∈ {0,...,255}.
• RGB ≡ f(x,y) ∈ {0,...,255}³.

A binary image is represented as a matrix allowing two possible values for each cell, two-tone colors (usually 0 refers to black and 1 to white). Similarly, a gray scale image may be defined as a two-dimensional function f(x,y) where the amplitude of f at any pair of coordinates (x,y) refers to the intensity (gray level) of the image at that point. Instead, a color image is defined by a combination of individual 2D images. For instance, the RGB color system represents a color image in three components (red, green and blue). That is, it has an array of three matrices of equal size where the intensity of color of a pixel is represented by the three colors 8. Figures 1, 2 and 3 show a binary image, a grayscale image and an RGB image, respectively.

The remainder of this paper is organized as follows: a review on image classification is presented in Section 2. Section 3 is devoted to fuzzy image classification. Some edge detection techniques are shown in Section 4, which is extended to fuzzy edge detection in Section 5. Image segmentation and some methods addressing this problem are analyzed in Section 6. Section 7 pays attention to hierarchical image segmentation. In Section 8 fuzzy image segmentation is addressed. Finally, some conclusions are drawn.
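The three encodings above can be made concrete with a small sketch (a minimal illustration using NumPy; the 4×4 size and the particular pixel values are arbitrary choices for the example):

```python
import numpy as np

# Binary image: f(x, y) in {0, 1}.
binary = np.array([[0, 1, 1, 0],
                   [1, 1, 1, 1],
                   [0, 1, 1, 0],
                   [0, 0, 1, 0]], dtype=np.uint8)

# Gray-level image: f(x, y) in {0, ..., 255}.
gray = np.array([[  0,  64, 128, 255],
                 [ 32,  96, 160, 224],
                 [ 16,  80, 144, 208],
                 [  8,  72, 136, 200]], dtype=np.uint8)

# RGB image: f(x, y) in {0, ..., 255}^3, i.e. three stacked matrices
# of equal size (here the three channels are identical, giving gray).
rgb = np.stack([gray, gray, gray], axis=-1)

print(binary.shape, gray.shape, rgb.shape)  # (4, 4) (4, 4) (4, 4, 3)
```

An n × m binary or gray image is thus a single matrix, while an RGB image adds a third axis of length 3, one matrix per color component.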

Fig. 1. Binary image.

Fig. 2. Gray level image.

Fig. 3. RGB image 9.

2. Image classification

Classification techniques have been widely used in image processing to extract information from images by assigning each pixel to a class. Therefore, two types of outputs can be obtained at the end of an image classification procedure. The first kind of output is a thematic map in which pixels are accompanied by a label for identification with a class. The second kind of output is a table summarizing the number of image pixels belonging to each class. Furthermore, both supervised and unsupervised learning schemes are applied for this image classification task 10. Next we provide an overview of these methods.

2.1. Supervised classification

Supervised classification is one of the most used techniques for image analysis, in which the supervisor provides information to the system about the categories present in each pattern in the training set.

Neural networks 11, genetic algorithms 12, support vector machines 13, bayesian networks 14, maximum likelihood classification 15 or minimum distance classification 16 are some techniques for supervised classification. Depending on certain factors such as data source, spatial resolution, available classification software, desired output type and others, it is more appropriate to use one of them to process a given image.

Supervised classification procedures are essential analytical tools for extracting quantitative information from images. According to Richards and Jia 4, the basic steps to implement these techniques are the following:

• Decide the classes to be identified in the image.
• Choose representative pixels of each class, which will form the training data.
• Estimate the parameters of the classification algorithm using the training data.
• Categorize all the remaining pixels in the image with the classifier into each of the desired regions.
• Summarize the classification results in tables or display the segmented image.
• Evaluate the accuracy of the final model using a test data set.

A variety of classification techniques such as those mentioned above have been used in image analysis. In addition, some of those methods have been jointly used (e.g., neural networks 17,18, genetic algorithms with multilevel thresholds 19 or some variant of support vector machines 20).

In general, supervised classification provides good results if representative pixels of each class are chosen for the training data 4. Next, we pay attention to methods for unsupervised classification.

2.2. Unsupervised classification, clustering

Unsupervised classification techniques are intended to identify groups of individuals having common characteristics from the observation of several variables for each individual. In this sense, the main goal is to find regularities in the input data because, unlike supervised classification techniques, there is no supervisor to provide existing classes. The procedure tries to find groups of patterns, focusing on those that occur more frequently in the input data 21.
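As a toy illustration of the supervised scheme of Section 2.1, the minimum distance classifier mentioned there can be sketched as follows (a minimal sketch, not the implementation of any cited reference; the class names and training pixels are invented for the example):

```python
import numpy as np

# Training data: representative pixels (RGB features) for each class.
training = {
    "vegetation": np.array([[30, 120, 40], [35, 130, 45], [28, 110, 38]]),
    "water":      np.array([[10, 40, 120], [15, 50, 130], [12, 45, 125]]),
}

# Estimate the classifier parameters from the training data: class means.
means = {c: pixels.mean(axis=0) for c, pixels in training.items()}

def classify(pixel):
    """Assign the pixel to the class with the nearest mean (Euclidean)."""
    pixel = np.asarray(pixel, dtype=float)
    return min(means, key=lambda c: np.linalg.norm(pixel - means[c]))

print(classify([32, 125, 41]))   # vegetation
print(classify([11, 47, 128]))   # water
```

Applying `classify` to every pixel of an image and tabulating the labels yields exactly the two kinds of output described above: a thematic map and a per-class pixel count.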

According to Nilsson 22, unsupervised classification consists of two stages to find patterns:

• Form a partition R of the set Ξ of unlabeled training patterns. The partition separates Ξ into mutually exclusive and exhaustive subsets R known as clusters.
• Design a classifier based on the labels assigned to the training patterns by the partition.

Clustering 23 is perhaps the most extended method of unsupervised classification for image processing, which obtains groups or clusters in the input image.

Particularly, cluster analysis looks for patterns in images, gathering pixels within natural groups that make sense in the context studied 21. The aim is that the proposed clustering has to be somehow optimal, in the sense that observations in a cluster have to be similar, and in turn be dissimilar to those of other clusters. In this way, we try to maximize the homogeneity of objects within the clusters while the heterogeneity between aggregates is maximized. In principle, the number of groups to form and the groups themselves are unknown 24.

A standard cluster analysis creates groups that are as homogeneous as possible, while the difference between the various groups is as large as possible. The analysis consists of the following main steps:

• First, we need to measure the similarity and dissimilarity between two separate objects (similarity measures the closeness of objects, so closer values suggest more homogeneity).
• Next, similarity and dissimilarity between clusters is measured, so that the difference between groups becomes larger and thus group observations become as close as possible.

In this manner, the analysis can be divided into these two basic steps, and in both steps we can use correlation measures or any distance 25 as similarity measure. Some similarity and dissimilarity measures for quantitative features are shown in 26.

Correlation measurements refer to the measurement of similarity between two objects in a symmetric matrix. If the objects tend to be more similar, the correlations are high. Conversely, if the objects tend to be more dissimilar, then the correlations are low. However, because these measures indicate similarity by pattern matching between the features and do not observe the magnitudes of the observations, they are rarely used in cluster analysis.

Distance measurements represent the similarity and proximity between observations. Unlike correlation measurements, these similarity measures of distance or proximity are widely used in cluster analysis. Among the distance measures available, the most common are the Euclidean distance metric, the standardized distance metric or the Mahalanobis distance 26.

Let X (n × p) be a data matrix with n observations of p variables. The similarity between the observations can be described by the symmetric matrix D (n × n):

      | d_11  d_12  ...  d_1n |
      | d_21  d_22  ...  d_2n |
  D = |  ...   ...  ...   ... |
      | d_n1  d_n2  ...  d_nn |

where d_ij represents the distance between observations i and j. If d is a distance, then d'_ij = max_{i,j}{d_ij} − d_ij represents a measure of proximity 25.

Once a measure of similarity is calculated, we are able to proceed with the formation of clusters. A popular algorithm is k-means clustering 27. The algorithm was first proposed in 1957 by Lloyd 28 and published much later 29. The k-means clustering algorithm classifies the pixels of the input image into multiple classes according to their distance from each other. The points are clustered around centroids µ_i, i = 1,...,k, obtained by minimizing the Euclidean distance. Let x_1 and x_2 be two pixels; the Euclidean distance between them is:

  d(x_1,x_2) = √( Σ_{i=1}^{N} (x_{1i} − x_{2i})² ).   (1)

Thus, the sum of distances to the k centers is minimized, and the objective is to minimize the maximum distance from every point to its nearest center 30. Then, having selected the centroids, each observation is assigned to the most similar cluster, based on the distance between the observation and the mean of the cluster. The average of the cluster is then recalculated, beginning an iterative process of centroid location depending on how observations are assigned to clusters, until the criterion function converges 31.

Clustering techniques have been used to perform unsupervised classification for image processing since 1969, when an algorithm finding boundaries in remote sensing data was proposed 32. Clustering implies grouping the pixels of an image in the multispectral space 4. Therefore, clustering seeks to identify a finite set of categories and clusters to classify the pixels, having previously defined a criterion of similarity between them. There are a number of available algorithms 33.

The resulting image of the k-means clustering algorithm with k = 3 over Figure 3 is shown in Figure 4.

Fig. 4. Cluster's result.

Thus, to achieve its main goal, any clustering algorithm must address three basic issues: i) how to measure the similarity or dissimilarity; ii) how the clusters are formed; and iii) how many groups are formed. These issues have to be addressed so that the principle of maximizing the similarity between individuals in each cluster and maximizing the dissimilarity between clusters can be met 33. A detailed review on clustering analysis is presented in 6.

Thus, summarizing, in the framework of supervised classification pixels are assigned to a known number of predefined groups. However, in cluster analysis the number of cluster groups and the groups themselves are not known in advance 24. Anyway, image classification is a complex process for pattern recognition based on the contextual information of the analyzed images 34. A more detailed discussion of image classification methods is presented in 35.

3. Fuzzy image classification

In this section, we present an overview of fuzzy image classification. As in Section 2, these methods are divided into supervised techniques and unsupervised techniques.

3.1. Supervised fuzzy classification

Fuzzy classification techniques can be considered extensions of classical classification techniques: we simply need to take advantage of the fuzzy logic proposed by Zadeh, in such a way that, when restricted to a crisp framework, we recover the techniques presented in Section 2.1. And similarly to traditional classification techniques, fuzzy classification includes both supervised and unsupervised methods.

Fuzzy logic was introduced in the mid-sixties of the last century 36 as an extension of classical binary logic. Some objects have a state of ambiguity regarding their membership to a particular class. In this way, a person may be both "tall" and "not tall" to some extent. The difference lies in the degree of membership assigned to the fuzzy set "tall" and its complement "not tall". Therefore, the intersection of a fuzzy set and its complement is not always an empty set. Also, conflictive views can simultaneously appear within a fuzzy set 37. Such a vagueness is in the roots of fuzzy logic, and brings specific problems difficult to be treated by classical logic. Still, these problems are real and should be addressed 38.

A fuzzy set C refers to a class of objects with membership degrees. This set is usually characterized by a continuous membership function µ_C(x) that assigns to each object a grade of membership in [0,1], indicating the degree of membership of x in C. The closer µ_C(x) is to 1, the greater the degree of membership of x in C 36. Classical overviews of fuzzy sets can be found in 36,39,40,41,42,43.

Hence, classes are defined according to certain attributes, and objects that possess these attributes belong to the respective class. Thus, in many applications of fuzzy classification, we consider a set of

fuzzy classes C. The degree of membership µ_C(x) of each object x ∈ X to each class C ∈ C has then to be determined 44.

The membership function is given by µ_c : X → [0,1] for each class c ∈ C (see 44, where a quite complete framework, subject to learning, is proposed).

Fuzzy rule-based systems (FRBS) are methods of reasoning, where knowledge is expressed by means of linguistic rules. FRBS are widely used in various contexts, as for example fuzzy control 45 or fuzzy classification 46. FRBS allow the simultaneous use of certain parts of this knowledge to perform inference.

The main stages of a FRBS are listed below:

• Fuzzification, understood as the transformation of the crisp input data into fuzzy data 47.
• Construction of the fuzzy rule base, expressed through linguistic variables, i.e., IF antecedent THEN result 48,49,50.
• Inference over fuzzy rules, where we find the consequence of each rule and then combine these consequences to get an output distribution 51.
• Defuzzification, to produce a crisp value from the fuzzy or linguistic output obtained from the previous inference process 37,52.

FRBS are commonly applied to classification problems. These classifiers are called fuzzy rule-based classification systems (FRBCS) or fuzzy classifiers 53.

Fuzzy classification is the process of grouping objects in a family of fuzzy sets by assigning degrees of membership to each of them, defined by the truth value of a fuzzy propositional function 44.

One of the advantages of fuzzy classifiers is that they do not assign two similar observations to different classes when the observations are near the boundaries of the classes. Also, FRBCS facilitate smooth degrees of membership in the transitions between different classes.

3.2. Unsupervised classification, fuzzy clustering

Regarding unsupervised techniques, perhaps the most widely used technique is fuzzy c-means, which extends clustering to a fuzzy framework.

The fuzzy c-means algorithm 54 constitutes an alternative approach to the methods defined in Subsection 2.2, based on fuzzy logic. Fuzzy c-means provides a powerful method for cluster analysis.

The objective function of the fuzzy c-means algorithm, given by Bezdek 55, is as follows:

  J_m(U,v) = Σ_{k=1}^{N} Σ_{i=1}^{c} (u_{ik})^m ‖y_k − v_i‖²_A,   (2)

where Y = {y_1, y_2, ..., y_N} ⊂ ℝⁿ are the observations; c is the number of clusters in Y, 2 ≤ c < N; m, 1 ≤ m < ∞, is the weighting exponent which represents the degree of fuzziness; U ∈ M_fc is a fuzzy partition matrix; v = (v_1, v_2, ..., v_c) are the vectors of centers, with v_i = (v_{i1}, v_{i2}, ..., v_{in}) the centroid of cluster i; ‖·‖_A denotes the A-norm on ℝⁿ; and A is a positive weight matrix (n × n) 56.

According to Bezdek et al. 56, the fuzzy c-means algorithm basically consists of the following four steps:

• Set c, m, A, ‖·‖_A and choose an initial matrix U^(0) ∈ M_fc.
• Then, at step k, k = 0,1,...,LMAX, calculate the means v_i, i = 1,2,...,c, with

  v_i = ( Σ_{k=1}^{N} (u_{ik})^m y_k ) / ( Σ_{k=1}^{N} (u_{ik})^m ), 1 ≤ i ≤ c.

• Calculate the updated membership matrix U^(k+1) = [u_{ij}^(k+1)] with

  u_{ik}^(k+1) = ( Σ_{j=1}^{c} ( d_{ik} / d_{jk} )^{2/(m−1)} )^{−1}, 1 ≤ k ≤ N, 1 ≤ i ≤ c.

• Compare U^(k+1) and U^(k) in any convenient matrix norm. If ‖U^(k+1) − U^(k)‖ < ε, stop; otherwise set U^(k) = U^(k+1) and return to the second step.

Fuzzy techniques have found wide application in image processing (some studies can be found in 57,58,59). They often complement the existing techniques and can contribute to the development of more robust methods 60. Applying the technique of fuzzy c-means on Figure 2, the resulting image is shown in Figure 5.

Fig. 5. Fuzzy c-means's result.
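The four steps above can be sketched in code (a minimal NumPy sketch assuming A is the identity matrix, so that ‖·‖_A reduces to the Euclidean norm; the one-dimensional data and all parameter values are invented for the example):

```python
import numpy as np

def fuzzy_c_means(Y, c=2, m=2.0, eps=1e-5, max_iter=100, seed=0):
    """Fuzzy c-means with the Euclidean norm (A = identity)."""
    N = len(Y)
    rng = np.random.default_rng(seed)
    # Step 1: choose an initial fuzzy partition (columns sum to 1).
    U = rng.random((c, N))
    U /= U.sum(axis=0)
    for _ in range(max_iter):
        # Step 2: update the centers v_i from the weighted observations.
        W = U ** m
        V = (W @ Y) / W.sum(axis=1, keepdims=True)
        # Step 3: update the memberships from the distances d_ik.
        D = np.linalg.norm(Y[None, :, :] - V[:, None, :], axis=2)
        D = np.fmax(D, 1e-12)              # avoid division by zero
        U_new = D ** (-2.0 / (m - 1.0))
        U_new /= U_new.sum(axis=0)
        # Step 4: stop when successive membership matrices are close.
        done = np.abs(U_new - U).max() < eps
        U = U_new
        if done:
            break
    return U, V

# Two well-separated groups of gray levels.
Y = np.array([[10.0], [12.0], [11.0], [200.0], [205.0], [198.0]])
U, V = fuzzy_c_means(Y)
print(np.round(np.sort(V.ravel()), 1))     # centers near 11 and 201
```

Each observation ends up with a membership in every cluster (the columns of U sum to 1), instead of the single hard label produced by k-means.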

As it has been noted, the fuzzy c-means algorithm produces a fuzzy partition of the input image 61, characterizing the membership of each pixel to all groups by a degree of membership 56. In conclusion, fuzzy classification assigns different degrees of membership of an object to the defined categories.

4. Edge detection

Edge detection has been an important topic in image processing for its ability to provide relevant information of the image and the boundaries of objects of interest.

Edge detection is a useful and basic operation to acquire information about an image, such as its size, shape and even texture. This technique is considered the most common approach to detect significant discontinuities in the values of intensities of an image, and this is achieved by taking spatial derivatives of first and second order (typically with the gradient and Laplacian, respectively). That is, non-smooth changes in the function f(x,y) of the image can be determined with the derivatives. Thus, the operators that describe edges are typically expressed by partial derivatives 8,62.

The result obtained with this technique consists of a binary image such that those pixels where sharp changes have occurred appear bright, while the other pixels remain dark. This output in turn allows a significant reduction in the amount of information while preserving the important structural properties of the image.

The first derivative is determined by the gradient, which is able to determine a change in the intensity function through a vector pointing in the direction of maximum growth of the image's intensity 62. Thus, the gradient of a two-dimensional intensity function f(x,y) is defined as:

  ∇f ≡ grad(f) = (g_x, g_y)^T = (∂f/∂x, ∂f/∂y)^T.   (3)

The magnitude of this vector is:

  ∇f = mag(∇f) = √(g_x² + g_y²) = √( (∂f/∂x)² + (∂f/∂y)² ).   (4)

And the direction is given by the angle:

  α(x,y) = tan⁻¹( g_y / g_x ).   (5)

However, the errors associated with the approximation may cause errors which require adequate consideration. For example, an error may be that the edges are not detected equally well in all directions, thus leading to erroneous direction estimation (anisotropic edge detection) 10.

Therefore, the value of the gradient is related to the change of intensity in areas where it is variable, and is zero where intensity is constant 8.

The second derivative is generally computed using the Laplace operator, or Laplacian, which for a two-dimensional function f(x,y) is formed from the second order derivatives 8:

  ∇²f(x,y) = ∂²f(x,y)/∂x² + ∂²f(x,y)/∂y².   (6)

Remarkably, the Laplacian is rarely used directly, because of its sensitivity to noise and because its magnitude can generate double edges. Moreover, it is unable to detect the direction of the edges, for which reason it is used in combination with other techniques 8.

Thus, the basic purpose of edge detection is to find the pixels in the image where the intensity or brightness function changes abruptly, in such a way that if the first derivative of the intensity is greater in magnitude than a predetermined threshold, or if the second derivative crosses zero, then a pixel is declared to be an edge 8,62. In practice, this can be determined through the convolution of the image with masks, which involves taking a 3×3 matrix of numbers and multiplying it pixel by pixel with a 3×3 section of the image. Then, the products are added and the result is placed in the center pixel of the image 63.

There are several algorithms which implement this method, using various masks 8,64. For example, the Sobel operator 65, the Prewitt operator 66, the Roberts cross operator 67, the Gaussian Laplacian filter 5, the moment-based operator 68, etc. Next, we look more closely at two of the most used operators.

4.1. Sobel operator

The Sobel operator was first introduced in 1968 by Sobel 65 and then formally accredited and described in 69. This algorithm uses a discrete differential operator, and finds the edges by calculating an approximation to the gradient of the intensity function at each pixel of an image through a 3×3 mask. That is, the gradient at the center pixel of a neighborhood is calculated by the Sobel detector 8:

        −1 −2 −1            −1  0  1
  G_x ⇒  0  0  0      G_y ⇒ −2  0  2
         1  2  1            −1  0  1

where the neighborhood is represented as:

  z_1 z_2 z_3
  z_4 z_5 z_6
  z_7 z_8 z_9

Consequently:

  g = √(g_x² + g_y²)
    = √( [(z_7 + 2z_8 + z_9) − (z_1 + 2z_2 + z_3)]² + [(z_3 + 2z_6 + z_9) − (z_1 + 2z_4 + z_7)]² ).   (7)

Thus, a pixel (x,y) is identified as an edge pixel if g > T at this point, where T refers to a predefined threshold.

The resulting image of edge detection using the Sobel operator on Figure 2 is shown in Figure 6.

Fig. 6. Edge detection's result - Sobel.

In summary, this technique estimates the gradient of an image at a pixel by the vector sum of the four possible estimates of the single central gradients (each single central gradient estimate is a vector sum of a pair of orthogonal vectors) in a 3×3 neighborhood. The vector sum operation provides an average over the directions of gradient measurement. The four gradients will have the same value if the density function is planar throughout the neighborhood, and any difference refers to deviations from the flatness of the function in the neighborhood 65.

4.2. Canny operator

This edge detection algorithm is very popular and attempts to find edges by looking for a local maximum of the gradient in the gradient direction 70. The gradient is calculated by the derivative of the Gaussian filter with a specified standard deviation σ to reduce noise. At each pixel the local gradient g = √(g_x² + g_y²) and its direction (Eq. (5)) are determined. Then, a pixel is an edge when its strength is a local maximum in the gradient direction, and thus the algorithm labels border pixels with 1, and those that are not at the gradient peak with 0 (Ref. 11).

This method detects both strong (very marked) and weak edges, and it uses two thresholds (T_1 and T_2 such that T_1 < T_2). The strong edges are those above T_2, and the weak edges are those between T_1 and T_2. Then, the algorithm links both types of edges, but only keeps the weak edges that are connected to strong edges. This helps ensure that the retained weak edges are true edges 8.

The resulting image of edge detection using the Canny operator on Figure 2 is shown in Figure 7.

Fig. 7. Edge detection's result - Canny.

Summarizing, edge detection techniques use a gradient operator, and subsequently evaluate, through a predetermined threshold, whether an edge has been found or not 71. Therefore, edge detection is a process by which the analysis of an image is simplified by reducing the amount of processed information, while in turn retaining valuable structural information on object edges 70. A more detailed analysis of edge detection techniques can be found in 5.
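The Sobel estimate of Eq. (7) can be reproduced directly from the two masks above (a minimal sketch; the test image and the threshold value are arbitrary choices for the example):

```python
import numpy as np

SOBEL_GX = np.array([[-1, -2, -1],
                     [ 0,  0,  0],
                     [ 1,  2,  1]])
SOBEL_GY = np.array([[-1,  0,  1],
                     [-2,  0,  2],
                     [-1,  0,  1]])

def sobel_edges(img, T=100.0):
    """Mark pixel (x, y) as an edge when g = sqrt(gx^2 + gy^2) > T (Eq. (7))."""
    img = img.astype(float)
    n, m = img.shape
    edges = np.zeros((n, m), dtype=np.uint8)
    for x in range(1, n - 1):
        for y in range(1, m - 1):
            z = img[x - 1:x + 2, y - 1:y + 2]   # the 3x3 neighborhood z1..z9
            gx = (SOBEL_GX * z).sum()
            gy = (SOBEL_GY * z).sum()
            if np.hypot(gx, gy) > T:
                edges[x, y] = 1
    return edges

img = np.zeros((5, 5))
img[:, 3:] = 255            # a vertical step edge
print(sobel_edges(img))     # 1s appear along the transition columns
```

The output is the binary edge map described above: bright (1) where the gradient magnitude exceeds the threshold, dark (0) elsewhere.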

5. Fuzzy edge detection

Frequently, the edges detected through an edge detection process (such as those previously presented) are false edges, because these classic methods are sensitive to various issues such as noise or thick edges, among others, and require a great amount of calculation, so that discretization may present problems 64,72.

Fuzzy logic has proved to be well suited to address the uncertainty characterizing the process of extracting information from an image 73. Hence, many algorithms have included fuzzy logic in the entire process or at some particular stage of the image processing 72. An overview of models and methods based on fuzzy sets for image processing and image understanding can be found in 74.

A first idea for applying fuzzy logic in edge detection was proposed by Pal and King 75. Subsequently, Russo 76 designed fuzzy rules for edge detection. Different algorithms have been created to address fuzzy edge detection since then.

Fuzzy edge detection detects classes of pixels corresponding to the variation in the level of gray intensity in different directions. This is achieved by using a specified membership function for each class, in such a way that the class assigned to each pixel corresponds to the highest membership value 64. Russo 76 proposed that an efficient way to process each pixel of the image must consider the set of neighboring pixels belonging to a rectangular window, called a fuzzy mask.

Let K be the number of gray levels in an image; the appropriate fuzzy sets to process that image are made up of membership functions defined on the universe [0,...,K−1] 76. Fuzzy sets are created to represent the intensities of each variable, and they are associated to the linguistic variables "black" and "white".

The resulting image of applying fuzzy edge detection on Figure 2 is shown in Figure 8.

Fig. 8. Fuzzy edge detection's result.

The fuzzy rules used in the fuzzy inference system for edge detection consist of "coloring a pixel white if it belongs to a uniform region; otherwise, coloring the pixel black":

(i) IF I_x is zero and I_y is zero THEN I_out is white.
(ii) IF I_x is not zero or I_y is not zero THEN I_out is black.

where I_x and I_y are the image gradients along the x-axis and y-axis, and they have a zero-mean Gaussian membership function. Also, triangular membership functions, white and black, are specified for I_out.

Fig. 9. Membership functions of the inputs/outputs.

As we mentioned previously, there are several versions of fuzzy edge detection algorithms in the literature. For instance, there is an algorithm that uses the Epanechnikov function, extended as a membership function for each class to be assigned to each pixel 64, or another algorithm which uses operators that detect specific patterns of neighboring pixels for the fuzzy rules 77. A study about representing the dataset as a fuzzy relation, associating a membership degree with each element of the relation, is presented in 78. Another issue is that fuzzy logic can be used only on a certain part of the edge detection process 72.

Fuzzy edge detection problems in which each pixel has a degree of membership to the border can be seen in some studies 79,80,81,82,83. Moreover, in these studies the concept of fuzzy boundary has been introduced.

6. Image segmentation

Around 1970, image segmentation boomed as an advanced research topic. In computer vision, image segmentation is a process that divides an image into multiple segments, in order to simplify or modify the representation of the image to be more meaningful and easy to analyze 84. Actually, there are many applications of image segmentation 85, as it is useful in robot vision 86, detection of cancerous cells 87, identification of knee cartilage and gray matter/white matter segmentation in MR images 88, among others.

The goal of segmentation is subdividing an image into regions or non-overlapping objects with similar properties. For each pixel, it is determined whether it belongs to an object or not, producing a binary image. The subdivision level is dependent on the problem to be solved. Segmentation stops when the objects of interest have been identified and isolated. However, in most cases, processed images are not trivial and hence the segmentation process becomes more complicated.

The most basic attributes for segmentation are the intensity of a monochrome image and the color components for a color image, as well as object boundaries and texture, which are very useful. In summary, the regions of a segmented image should be uniform and homogeneous with regard to some property such as color, texture, intensity, etc. 84,89. Also, the quality of the segmentation depends largely on the output. However, although image segmentation is an essential image processing technique and has applications in many fields, there is no single method applicable to all kinds of images. Current methods have been classified according to certain characteristics of their algorithms 84,90.

Formally, image segmentation can be defined as a process of partitioning the image into a set of non-intersecting regions such that each group of connected pixels is homogeneous 85,91. It can be defined as follows: image segmentation partitions an image R into C subregions R_1, R_2, ..., R_C for a uniformity predicate P, such that the following items are satisfied:

(i) ⋃_{i=1}^{C} R_i = R.
(ii) R_i ∩ R_j = ∅ for all i and j, i ≠ j.
(iii) P(R_i) = TRUE for all i = 1,2,...,C.
(iv) P(R_i ∪ R_j) = FALSE for any pair of adjacent regions R_i and R_j.
(v) R_i is connected, i = 1,2,...,C.

The first condition states that the segmentation must be complete, i.e., the union of the segmented regions R_i must contain all pixels in the image. The second condition indicates that regions have to be disjoint, that is, that there is no overlap between them. The third condition states that pixels belonging to the same region should have similar properties. The fourth condition states that pixels from adjacent regions R_i and R_j differ in some properties. Finally, the fifth condition expresses that each region must be connected 8.

The next sections describe the main methods of image segmentation. They are classified into four categories: thresholding methods (6.1), watershed methods (6.2), region segmentation methods (6.3) and graph partitioning methods (6.4).

6.1. Thresholding methods

Thresholding 92 is considered one of the simplest and most common techniques that has been widely used in image segmentation. In addition, it has served in a variety of applications such as object recognition, image analysis and interpretation of scenes. Its main features are that it reduces the complexity of the data and simplifies the process of recognition and classification of objects in the image 93.

Thresholding assumes that all pixels whose value, such as their gray level, is within a certain range belong to a certain class 94. In other words, thresholding methods separate objects from the background 7. So, according to a threshold, either preset by the researcher or determined by a technique that sets its initial value, a gray image is converted to a binary image 84. An advantage of this procedure is that it is useful to segment images with very bright objects and with dark and homogeneous backgrounds 33,100.

Let f(x,y) be a pixel of an image I, with x = 1,...,n and y = 1,...,m, and let B = {a,b} be a pair of binary gray levels. Then, a pixel belongs to the object if and only if its intensity value is greater than or equal to a threshold value T, and otherwise it belongs to the background 8,33. The resulting binary image of thresholding the image function at gray level T is given by:

  f_T(x,y) = a if f(x,y) > T,  and  f_T(x,y) = b if f(x,y) ≤ T.   (8)

In this way, the main step of this technique is to select the threshold value T, but this may not be a trivial task at all 84. However, many approaches to obtain a threshold value have been developed, as well as a number of indicators of performance evaluation 93. For example, the threshold T can remain constant for the entire image (global threshold), or it may change while the pixels in the image are being classified (variable threshold) 33. Hence, some thresholding algorithms lie in choosing the threshold T either through:

(iv) Object attribute-based techniques 103,104, which seek measures of similarity between the gray level and binary images.
(v) Spatial techniques 105, using probability distributions and/or correlations between pixels.
(vi) Local techniques 106, which adapt the threshold value at each pixel according to the local characteristics of the image.

Possibly, the best known and most used thresholding techniques are those that focus on the shape of the histogram. A threshold value can be selected by inspection of the histogram of the image: if two different modes are distinguished, then a value which separates them is chosen. Then, if necessary, this value is adjusted by trial and error to generate better results. However, updating this value by an automatic approach can be carried out through an iterative algorithm 8:

(i) An initial estimate of a global threshold T is performed.
(ii) The image is segmented by the value T into two groups G_1 and G_2 of pixels, resulting from Eq. (8).
ther automatically, or through methods that vary the (iii) The average intensities t1 and t2 for the two threshold for the process according to the proper- groups of pixels are calculated. ties of local residents of the image (local or regional (iv) A new threshold value is calculated: T = threshold) 8,84. Some algorithms are: basic global 1 (t +t ). thresholding 95, minimum error thresholding 96, it- 2 1 2 erative thresholding 97, entropy based thresholding (v) Repeat steps 2 to 4 until subsequent iterations 98 and Otsu thresholding 99. produce a difference in thresholds lower than a According to Sezgin and Sankur 7, thresholding given sensibility or precision (∆T). techniques can be divided into six groups based on (vi) The image is segmented according to the last the information the algorithm manipulates: value of T. 100 (i) Histogram shape-based techniques , where Therefore, the basic idea is to analyze the his- an analysis of the peaks, valleys and curvatures togram of the image, and ideally, if two dominant of the histogram is performed. modes exist, there is a remarkable valley where (ii) Clustering-based methods 101, where the gray threshold T will be in the middle 84. However, de- levels are grouped into two classes, back- pending on the image, histograms can be very com- ground and object, or alternatively they are plex (e.g., having more than two peaks or not obvi- modeled as a mixture of two Gaussians. ous valleys), and this could lead this method to select (iii) Entropy-based techniques 102, in which the al- a wrong value. Furthermore, another disadvantage gorithms use the entropy of the object and is that only two classes can be generated, therefore, the background, the cross entropy between the the method is not able to be applied in multispec- original binary image, etc. tral images. It also does not take into account spatial C. Guada et al. 
information in the image, which makes the method sensitive to noise 33.

The histogram of Figure 2 is shown in Figure 10. The threshold value is 170.8525.

Fig. 10. Histogram.

The outcome of thresholding the previous image is shown in Figure 11.

Fig. 11. Thresholding's result.

In general, thresholding is considered a simple and effective tool for supervised image segmentation. It must, however, achieve an ideal value T in order to segment images into objects and background. More detailed studies can be found in 7,107,108,109.

6.2. Watershed methods

Watershed 110 was introduced in the field of digital topography. This method considers a grayscale picture as a topographic relief 111,112,113,114. Later, this notion was studied in the field of image processing 115. Watershed aims at segmenting regions into catchments, i.e., areas where a stream of conceptual water bounded by a line trickles through the top of the mountains, which decomposes the image into rivers and valleys 116. Light pixels can be considered the tops of the mountains, and dark pixels the valleys 117. The gradient magnitude of the image is treated as a topographic surface 84. Thus, a monochrome image is considered as an altitude surface where the pixels of greater amplitude, or greater magnitude of the intensity gradient, correspond to the points of the river, i.e., the lines of the watershed that form the boundaries of the objects. On the other hand, the pixels of lower amplitude are referred to as valley points 84,89.

By leaving a drop of conceptual water on any pixel at some altitude, the drop will flow to a lower elevation point until reaching a local minimum. In turn, the pixels that drain to a common minimum form a retention gap characterizing a region of the image. Therefore, the accumulations of water in the neighborhoods of local minima form such catchments, and the points belonging to these basins belong to the same watershed 89. Thus, an advantage of this method is that it first detects the leading edges and then calculates the basins of the detected gradients 84.

Overall, the watershed algorithm is as follows:

• Calculate the curvature of each pixel.
• Find the local minima and assign a unique label to each of them.
• Find each flat area and classify it as minimum or valley.
• Browse the valleys and allow each drop to find a marked region.
• Allow the remaining unlabeled pixels to fall similarly, and attach them to the labeled regions.
• Join those regions having a depth of basin below a predetermined threshold.

There are different algorithms to calculate the watershed, depending on how the mountain rims are extracted from the topographic landscape. Two basic approaches are distinguished in the following subsections.

6.2.1. Rainfall approximation

The rainfall algorithm 118 is based on finding the local minima of the entire image, which are assigned a unique label; those that are adjacent are combined with a unique tag as well. Then a drop of water is placed in each unlabeled pixel. These drops will flow through neighboring lower amplitudes to a local minimum, assuming the value of its label. Then, local minimum pixels are shown in black, and the roads of every drop towards its local minimum are indicated in white 89.
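As a rough illustration of the rainfall idea, the following sketch (our own simplification: plain NumPy, 4-neighborhood, strict descent, ignoring plateaus and the merging of adjacent minima) lets each pixel's drop run downhill and labels it with the basin of the local minimum it reaches:

```python
import numpy as np

def rainfall_watershed(img):
    """Rainfall idea: each pixel's drop follows the steepest 4-neighbour
    descent until it reaches a local minimum; pixels draining to the same
    minimum receive the same basin label."""
    h, w = img.shape
    memo = {}             # pixel -> basin label, filled lazily
    next_label = [0]

    def lowest_neighbor(y, x):
        best, best_val = None, img[y, x]
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if 0 <= ny < h and 0 <= nx < w and img[ny, nx] < best_val:
                best, best_val = (ny, nx), img[ny, nx]
        return best       # None when (y, x) is a local minimum

    labels = np.zeros((h, w), dtype=int)
    for y in range(h):
        for x in range(w):
            path, p = [], (y, x)
            while p not in memo:
                q = lowest_neighbor(*p)
                if q is None:                  # local minimum: open a basin
                    memo[p] = next_label[0]
                    next_label[0] += 1
                    break
                path.append(p)
                p = q
            for r in path:                     # memoize the whole drop path
                memo[r] = memo[p]
            labels[y, x] = memo[(y, x)]
    return labels
```

Memoizing every visited pixel keeps each drop's walk short; a production implementation would additionally merge adjacent minima and handle flat areas, as the algorithm above requires.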

6.2.2. Flooding approximation

The flooding algorithm 110 extracts the mountain rims by gradually flooding the landscape. The flood comes from below, through a hole in each minimum of the relief, and immerses the surface in a lake, entering through the holes while filling up the various catchment basins 119. In other words, the method involves placing a water fountain in each local minimum; the valleys are then submerged with water, and each catchment is filled until it almost overflows, which creates a reservoir in which neighbors are connected by the ridge 89,117. This approach works from the bottom up to find the watershed, i.e., it starts from a local minimum, and each region is flooded incrementally until it connects with its neighbors 116.

By performing the watershed procedure, the transform complement of the image in Figure 1 is as shown in Figure 12.

Fig. 12. Transform complement of binary image.

Figure 13 shows the outcome of watershed.

Fig. 13. Watershed's result.

In summary, watershed is a very simple and advantageous technique for real-life applications, mainly for segmenting grayscale images, and it uses image morphology 84,117. Notably, because this method finds the basins and borders that are then used to segment the image, the key lies in changing the original image into an image whose basins correspond to the objects one wants to identify. Also, when it is combined with other morphological tools, the watershed transformation is at the basis of extremely powerful segmentation procedures 115,120.

A critical review of several definitions of the watershed transform and their algorithms can be found in 121,122.

6.3. Region segmentation methods

In this section, we present two region segmentation methods. These techniques try to ensure that neighboring pixels with similar properties end up in the same region, i.e., they focus on groups of pixels that have similar intensity 71,94. This leads to the class of algorithms known as region growing and split and merge, where the general procedure is to compare a pixel with its neighbors, and if they are homogeneous then they are assigned to the same class 94. Region growing and split and merge are presented in Section 6.3.1 and Section 6.3.2 respectively.

6.3.1. Region growing

Region growing 123 is a method for grouping neighboring pixels into segmented regions according to predefined criteria of growth. First, small sets of initial pixels called seed points are specified. From them, the regions grow by adding neighboring pixels with similar predefined properties, until no more pixels that meet the criteria for inclusion in the regions are found and all pixels in the image have been scanned 8,33. Similarly, if two or more regions are homogeneous, they are grouped into one. Also, once a region has no more similar pixels, a new region is established from another seed 124.

In summary, the basic procedure is presented below 33:

• Select a seed group of pixels in the original image.
• Define a true value criterion for the stopping rule.
• Combine all pairs of spatially adjacent regions fulfilling the criterion value.
• Stop the growth of the regions when it is not possible to combine more adjacent pixels.

However, a disadvantage of this method is the fact that getting a good result depends on the selection of the initial seed points and on the order in which the pixels are scanned for grouping. This dependence makes the segmented result sensitive to the location and ordering of the seeds 71,89.
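A minimal sketch of the growth step from a single seed (the function name and the tolerance-based homogeneity predicate are our own illustrative choices; the method itself does not fix a criterion) can look as follows:

```python
import numpy as np
from collections import deque

def region_grow(img, seed, tol=10.0):
    """Grow a region from `seed` by breadth-first search: a 4-neighbour is
    added when its intensity differs from the seed intensity by at most
    `tol` (a simple homogeneity predicate)."""
    h, w = img.shape
    seed_val = float(img[seed])
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    queue = deque([seed])
    while queue:
        y, x = queue.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and abs(float(img[ny, nx]) - seed_val) <= tol):
                mask[ny, nx] = True      # pixel joins the region
                queue.append((ny, nx))
    return mask
```

Running it with several seeds, one after another on the yet-unassigned pixels, yields the multi-region variant described above; the result clearly depends on where the seeds are placed.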

The resulting image of region growing on Figure 2, applying a seed at the pixel x = 148, y = 359, is shown in Figure 14.

Fig. 14. Region growing's result.

Consequently, region growing is a procedure which iteratively groups the pixels of an image into subregions according to a predefined criterion of homogeneity from a seed 33,124. Similarly, region growing has a good performance because it considers the spatial distribution of the pixels 124.

6.3.2. Split and merge

Split and merge 71 is based on a representation of the data in a quadtree: if a region of the original image is non-uniform in some attribute, then a square image segment is divided into four quadrants. Hence, if four neighbors are uniform, they are united into a square formed by these adjacent squares 89.

Definition 1. Let I be an image and P be a predicate. The image is subdivided into small quadrants, such that for any region Ri it is P(Ri) = TRUE. If P(R) = FALSE, the image is divided into quadrants. If P is FALSE for any quadrant, the quadrant is subdivided into sub-quadrants, and so on.

This process is represented by a tree with four descendant leaves called a quadtree, where the root refers to the initial image and each of its nodes represents its four subdivisions 8.

The procedure is as follows 8,33:

(i) Start with the complete image, where R represents the entire image region, with P(R) = FALSE.
(ii) The entire image R is divided into four disjoint quadrants, and if P is false for any quadrant (P(Ri) = FALSE), then the quadrants are subdivided into sub-quadrants until no more splits are possible.
(iii) Once no more splits can be done, any adjacent regions Ri and Rj for which P(Ri ∪ Rj) = TRUE are joined together.
(iv) The process ends once further connections are not possible.

Thus, split and merge arbitrarily subdivides an image into disjoint regions, and then these regions are joined or divided according to the conditions of image segmentation in Section 6. This method performs well when the colors of the objects in the image are not very opposite.

The resulting image of split and merge on Figure 15 is shown in Figure 16.

Fig. 15. RGB image 9.

Fig. 16. Split and merge's result.

Generally speaking, split and merge begins with nodes (squares) at some level of the quadtree: if a quadrant is not uniform, a split is done, subdividing it into four sub-quadrants; however, if four adjacent quadrants are uniform, a merge between them is performed 71.

Besides, it is important to mention that the main advantage of region segmentation based techniques is that they attempt to segment an image into regions quickly, because the image is analyzed by regions and not by pixels.

6.4. Graph partitioning methods

A digital image I is modeled as a valued graph or network composed of a set of nodes (υ ∈ V) and a set of arcs E (e ∈ E ⊆ V × V) that link such nodes 125,126,127. Graphs have been used in many segmentation and classification techniques in order to represent complex visual structures of computer vision and machine learning. In this section, algorithms which use graphs are presented, but first a formal proposition is shown.

Proposition 1. A digital image I is viewed as a graph where the nodes are the pixels. Let V = {P1,P2,...,Pn} be a finite set of pixels in the image and let E = {{Pa,Pb} | Pa,Pb ∈ V} be unordered pairs of neighboring pixels. Two pixels are neighbors if there is an arc eab = {Pa,Pb} ∈ E. Let G = (V,E) be the representation of the neighboring relations between the pixels in an image. Let also dab ≥ 0 be the degree of dissimilarity between pixels Pa and Pb, and let D = {dab | eab ∈ E} be the set of all the dissimilarities. Let N(I) = {G = (V,E); D} be the network which summarizes all the previous information about an image I.

With the above proposition, it is possible to extract meaningful information about the processed image by checking the various properties of the graphs. As we shall see throughout this section, graphs turn out to be extremely important and useful tools for image segmentation. Some of the most popular graph algorithms for image processing are presented next.

6.4.1. Random walker

A random walker 128 approach can be used for the task of image segmentation through graphs: a user interactively labels a small number of pixels, called seeds, and each of the unlabeled pixels releases a random walker, so the probability is calculated that the random walker reaches a labeled seed 129. This algorithm may be useful for segmenting an image into, for example, object and background, from predefined seeds indicating image regions belonging to the objects. These seeds are labeled, and the unlabeled pixels will be marked by the algorithm using the following approach: if a random walker starts at an unlabeled pixel, what is the probability that it arrives first at each seed pixel? A K-tuple vector is assigned to each pixel, which specifies the probability that a walker beginning at an unlabeled pixel first reaches each of the K seed points. Then, from these K-tuples, the most likely seed for the random walker is selected 127.

The random walker algorithm consists of four steps 127:

• Represent the image structure in a network, defining a function that maps changes in image intensities to the weights of the edges in the graph.
• Define the set of K labeled pixels to act as seed points.
• Solve the Dirichlet problem for each label except the last, which is equivalent to calculating the probability that a random walker beginning at each unlabeled pixel first reaches each of the K seed pixels.
• Select from these K-tuples the most likely seed for the random walker, that is, assign each node υi to the maximum probability label.

The algorithm above consists in generating the graph weights, establishing the system of equations to solve the problem, and implementing the practical details.
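The Dirichlet step can be illustrated on a toy grid. The sketch below handles two labels only; the Gaussian edge weighting and the `beta` parameter are common choices from the random-walker literature, not prescribed here. It solves the linear system L_U x_U = −B x_M, where L_U is the unlabeled block of the graph Laplacian and B couples unlabeled to seeded nodes:

```python
import numpy as np

def random_walker_prob(img, seeds, beta=0.1):
    """For each unlabeled pixel, probability that a random walker first
    reaches a seed labeled 1 rather than a seed labeled 0.
    `seeds` maps pixel (y, x) -> label in {0, 1}.
    Edge weights: w = exp(-beta * (intensity difference)^2)."""
    h, w = img.shape
    n = h * w
    idx = lambda y, x: y * w + x
    L = np.zeros((n, n))                       # combinatorial graph Laplacian
    for y in range(h):
        for x in range(w):
            for dy, dx in ((0, 1), (1, 0)):    # each undirected edge once
                ny, nx = y + dy, x + dx
                if ny < h and nx < w:
                    wgt = np.exp(-beta * (float(img[y, x]) - float(img[ny, nx])) ** 2)
                    a, b = idx(y, x), idx(ny, nx)
                    L[a, a] += wgt; L[b, b] += wgt
                    L[a, b] -= wgt; L[b, a] -= wgt
    marked = [idx(y, x) for (y, x) in seeds]
    unmarked = [i for i in range(n) if i not in marked]
    b_vals = np.array([seeds[(y, x)] for (y, x) in seeds], dtype=float)
    Lu = L[np.ix_(unmarked, unmarked)]
    B = L[np.ix_(unmarked, marked)]
    probs = np.linalg.solve(Lu, -B @ b_vals)   # Dirichlet problem
    out = np.zeros(n)
    out[unmarked] = probs
    for (y, x), lab in seeds.items():
        out[idx(y, x)] = lab
    return out.reshape(h, w)
```

Thresholding the resulting probability map at 0.5 gives the binary object/background segmentation; with K labels one solves K − 1 such systems, as described above.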
Several properties of this algorithm are quick calculation, rapid editing, the ability to produce an arbitrary segmentation with sufficient interaction, and intuitive segmentation 127. Moreover, an advantage is that the edges of faint objects are found when they are part of a consistent edge 130.

The resulting image of random walker on Figure 3 is shown in Figure 17. Notice that this figure shows the two selected seeds (K = 2), green and blue, as well as the probability that a random walker released at each pixel reaches the foreground seeds. That is, the objects nearest to the green seed have a color similar to the pixel where that seed is placed; similarly, the blue seed is surrounded by another color, to which its own is more similar. The intensity of the color represents the probability that a random walker leaving that pixel reaches each seed.

Fig. 17. Random walker's result.

Random walker contributes as a neutral technique for image segmentation because it does not consider any information of the image. Instead, the segmentation is derived from the K-tuples of each pixel, selecting the most probable seed for each pixel or random walker 130.

6.4.2. Graph cuts

The basic algorithm for this procedure, developed by Boykov and Kolmogorov 133 and based on growing augmenting paths through two search trees for the terminal nodes, consists of three steps that are iteratively repeated:

• Growth phase: the search trees S and T grow until they touch, giving an s → t path.
• Augmentation phase: the found path is augmented, and the search tree(s) break into forest(s).
• Adoption phase: the trees S and T are restored.

Graph cuts 131 is a multidimensional optimization tool that is capable of applying piecewise smoothing while maintaining marked discontinuities 132. It is supported both by graph theory and by the problem of minimizing an energy function, and it uses the maximum flow problem within the graph. Thus, through the max-flow min-cut theorem, the minimum cut of the graph is defined, such that no other cut has a smaller size 133.

An s-t cut is a subset of arcs C ⊆ E such that the terminals S and T are completely separated in the induced graph G(C) = (V, E \ C) 134.

Definition 2. An s-t cut, or simple cut, of a graph G = (V,E) is a binary partition of the nodes V into two subsets with a primary vertex in each: source S and sink T = V − S, such that s ∈ S and t ∈ T.

Thus, each graph has a set of edges E with two types of links between the non-terminal nodes P and the terminals: n-links, referring to arcs between neighboring non-terminal nodes, and t-links, which are used to connect non-terminal pixels to the terminals. Each graph edge is assigned some nonnegative weight or cost w(p,q), so the cost of a directed edge (p,q) may differ from the cost of the reverse edge (q,p). The set of all graph edges consists of the n-links in N and the t-links {(s,p), (p,t)} for the non-terminal nodes p ∈ P 132,134.

The minimum cut problem is based on finding the cut with the minimum cost among all cuts. The cost of a cut C = (S,T) is the sum of the costs/weights of the border arcs (p,q) such that p ∈ S and q ∈ T. Thus, the min-cut and max-flow problems are equivalent, because the maximum flow value is equal to the cost of the minimum cut 132,136,137. If f is a flow in a graph G with two terminal nodes called source s and sink t, respectively representing "object" and "background" labels, and a set of non-terminal nodes P, then the value of the maximum flow is equal to the capacity of a minimum cut C 132,134,135.

Other max-flow min-cut algorithms are based on push-relabel methods, which rely on two basic operations: pushing flow and relabeling from an overflowing node 135.

In general, the cut may generate a binary segmentation with arbitrary topological properties 134. For example, if one works with the observed intensity I(p) of a pixel p, a cut symbolizes a binary labeling process f which assigns labels fp ∈ {0,1} to the pixels in such a way that if p ∈ S then fp = 0, and if p ∈ T then fp = 1 132,135.

One advantage of this method is that it allows a geometric interpretation, and it also provides a powerful tool for energy minimization (a binary optimization method). There are many techniques based on graph cuts that produce good approximations. This technique is used in many applications as an optimization method for low-level vision problems, based on global energy formulations. In addition, graph cuts can be computationally very efficient, and this is why they provide a clean, flexible formulation of image segmentation 132.

Frequently, this method is combined with others; in particular, an improved graph cut method named normalized cuts 138 is based on modifying the cost function to eliminate outliers. The usage of this technique within the image processing context is presented in the following section.

6.4.3. Normalized cuts

This procedure works with weighted graphs G = (V,E), where the weight w(i,j) of each arc is a function of the similarity between nodes i and j 139.

A graph G = (V,E) is optimally divided into two disjoint subsets A and B, with A ∪ B = V and A ∩ B = ∅, when the Ncut value is minimized. The degree of dissimilarity between A and B is measured as a fraction of the total weight of the connection arcs over all the nodes in the network. This is called the normalized cut (Ncut) 139:

Ncut(A,B) = cut(A,B)/assoc(A,V) + cut(A,B)/assoc(B,V). (9)

where assoc(A,V) = ∑u∈A,t∈V w(u,t) is the total cost of connecting the nodes in A with all the nodes in the graph, and similarly for assoc(B,V). The total normalized association measure within a group for a given partition is expressed as 139:

Nassoc(A,B) = assoc(A,A)/assoc(A,V) + assoc(B,B)/assoc(B,V). (10)

where assoc(A,A) and assoc(B,B) refer to the total weight of the arcs connecting the nodes within A and within B respectively.

Given a partition of the nodes of the graph V into two disjoint complementary sets A and B, let x be an N = |V| dimensional indicator vector, with xi = 1 if node i is in A and xi = −1 otherwise. Also, let di = ∑j W(i,j) be the total connection weight from node i to all other nodes 138. Therefore, Eq. (9) can be rewritten as:

Ncut(A,B) = (∑(xi>0, xj<0) −wij xi xj) / (∑xi>0 di) + (∑(xi<0, xj>0) −wij xi xj) / (∑xi<0 di). (11)

The algorithm can be summarized in the following steps 138,139:

• Given an image, represent it in a weighted graph G = (V,E), and summarize the information into W and D (let D = diag(d1, d2, ..., dN) be an N × N diagonal matrix and W be an N × N symmetric matrix with W(i,j) = wij).
• Solve (D − W)x = λDx for the eigenvectors with the smallest eigenvalues.
• Use the eigenvector with the second smallest eigenvalue to subdivide the graph by finding the splitting points that minimize Ncut.
• Check whether it is necessary to split the current partition recursively by checking the stability of the cut.
• Stop if for any segment Ncut exceeds a specified value.

The resulting image of gray-scale image segmentation using normalized graph cuts on Figure 2 is shown in Figure 18.

Fig. 18. Normalized cuts' result.

Thus, the normalized cuts method originated because, if one only considers minimizing the cut value cut(A,B) = ∑u∈A,v∈B w(u,v), the problem arises that cutting off small sets of isolated pixels is favored 138.

6.4.4. Region adjacency graphs

Region adjacency graphs (RAG) 71 provide a spatial view of the image, in order to efficiently manipulate the information contained in it, through a graph in which each vertex is associated with a region and arcs link each pair of adjacent regions 124.

A region Ri is defined by a quadruple Ri = (k, xk, yk, Ak), k ∈ (1,...,n), where k refers to the index number of the region, (xk, yk) represents its center of gravity and Ak represents its area.

A RAG is defined by a set of regions and a set of pairs of regions: ({Ri}, {(Rj, Rk)} | i, j, k ∈ (1,...,n)). Thus, the set of regions describes all the regions involved, and the set of pairs of regions indicates which regions are adjacent 140.

As we mentioned above, the nodes represent the regions and are compared in terms of a formulation based on chains. Thus, a region can be perceived as a shape whose boundary is represented by a cyclic chain (boundary string), which consists of a sequence of simple graph arcs. According to a given algorithm, the similarities between two boundary chains are calculated 126.

Two important characteristics of the RAG raised by Pavlidis 71 are:

(i) The degree of a node is the number of regions adjacent to a given region, which usually is proportional to the size of the region.
(ii) If a region completely surrounds other regions, that node is a cutnode in the RAG, as shown in the following figure.

Fig. 19. Example of region adjacency graph 71.

Despite being a very useful and robust tool, this method has the disadvantage that it does not produce good results when the processed image does not have perfectly defined regions or discontinuities 141.

The resulting image of RAG on Figure 2 is shown in Figure 20. The algorithm uses the watershed algorithm to find the structure of the regions in the image. Then, two regions are considered neighbors if they are separated by a small number of pixels in the horizontal or vertical direction.

Fig. 20. Region adjacency graph's result.

Accordingly, this technique is used to provide an efficient system for manipulating the implicit information of an image through a spatial view. So each node in the RAG represents a region, and two nodes (regions) are linked if they are neighbors 142.

Summarizing, graph theory has the advantage of organizing the image elements into mathematically sound structures, making the formulation of the problem more understandable and enabling less complex computation 143.

The image segmentation techniques presented in this section have been categorized into four classes: thresholding methods, watershed methods, region segmentation methods and graph partitioning methods. In addition, some evaluation methods have been created to assess the quality of a segmented image. A review of these evaluation methods for image segmentation is presented in 144,145.

7. Hierarchical image segmentation

Hierarchy theory 146 has been frequently used in sociology and anthropology. These sciences study hierarchical sets of social classes or strata 147. In this section, we present the definition of hierarchical segmentation methods, which try to group individuals according to some criteria, going from one group including all individuals to n groups each containing a unique individual. This can be done in two ways, by agglomerative methods 148 or by divisive methods 149, where the groups are represented by a dendrogram, a two-dimensional diagram depicting the hierarchical clustering 4.

The agglomerative methods successively merge the n individuals into groups. The divisive methods, on the other hand, separate the n individuals into progressively smaller groupings.

First, we present the elements needed to obtain a hierarchical image segmentation using graphs. According to Gómez et al.
150,151, given an image I and its network N(I), a family S = (S0, S1, ..., SK) of segmentations of N(I) constitutes a hierarchical image segmentation of N(I) when the following properties are verified:

• St ∈ Sn(N(I)) for all t ∈ {0,1,...,K} (i.e., each St is an image segmentation of N(I)).
• There are two trivial partitions: the first, S0 = {{v}, v ∈ V}, contains groups with one node each (singletons), and the other, SK = {V}, contains all pixels in the same cluster.

• |St | > |St+1| for all t = 0,1,...,K − 1, indicat- The D&L algorithm consists on the following ing that in each iteration the number of groups or main steps150: communities increases. t t+1 t • S ⊆eS for all t = 0,1,...,K −1, means that S is (i) Calculate the division and link weights for t finer that S − 1 (i.e., given two partitions P and Q each edge. of the set of pixels V of a network I, it is said that (ii) Organize subgraph edges Gt sequentially ac- P is finer than Q if for all A ∈ P exists B ∈ Q such cording to the weights of the previous step. that A ⊆ B). (iii) Build a spanning forest Ft ⊂ Gt (through, for instance, a Kruskal-like algorithm 163) based One advantage of hierarchical segmentation is on the arrangement found in the previous step. that it can produce an ordering of the segmentations, (iv) Build a partition Pt following the binary pro- which may be informative for picture display. So cedure applied to Ft . in any hierarchical image segmentation proceedings, Et+1 Et no apriori information about the number of clusters (v) Define a new set of edges from by is required. A dendrogram helps in making the se- removing those edges that connect different Pt lection of the number of groups on the same levels. groups of . t+1 The partitions are achieved through the selection of (vi) Repeat steps i-v while E 6= /0. one of the solutions in the nested sequence of clus- ters making up the hierarchy. The best partition is The output of the D&L algorithm (with six par- at a certain level, where the groupings below this titions) applied to Figure 3 is shown in Figure 21. height are far apart. Another advantage of hierarchical segmentation is that it can be used to build fuzzy boundaries by a hierarchical segmentation algorithm designed by the authors 150,152,153,154,155,156. There are some algorithms of hierarchical image segmentation, as 158,159,160,161,162 among others. 
In the next subsection, we present an algorithm of hier- archical image segmentation based on graphs.

7.1. Divide & link algorithm The divide-and-link (D&L) algorithm proposed by Gomez´ et al. 155 and extended in 156 is a binary itera- tive unsupervised method that obtains a hierarchical partition of a network, to be shown in a dendrogram, Fig. 21. D&L’s result. and it is framed as a process to partition networks through graphs. In summary, any hierarchical image segmenta- Generally speaking, D&L tries to group a set of tion algorithm is a divisive process because it starts pixels V of a graph G = (V,E) through an iterative with a trivial partition, formed by a single group con- procedure that classifies nodes in V in two classes V0 taining all nodes, and ends with another trivial par- and V1. However, the rating depends on the type of tition formed by as many groups as nodes possesses edge. There are endpoints of division edges that are the finite set of elements you wish to group. It also assigned in different classes (split nodes) and there has a polynomial time complexity and a low compu- are endpoints of link edges that are assigned in the tational cost. One advantage is that it can be used in same class (link nodes). a variety of applications. C. Guada et al.
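To make the divisive scheme concrete, the following minimal Python sketch mimics the spirit of steps (i)–(vi) on a toy pixel graph. It is not the authors' D&L implementation: the division and link weights are replaced by a single intensity-gap weight, and the spanning-forest and binary-colouring steps are collapsed into the simplest rule that still yields a nested sequence of partitions (at each level, the edges carrying the maximal positive weight act as division edges and are removed).

```python
from collections import defaultdict, deque

def dl_sketch(nodes, edges, intensity):
    """Toy divisive hierarchy in the spirit of D&L steps (i)-(vi).

    edges: iterable of (u, v) pairs; intensity: dict node -> gray value.
    Returns the nested partitions, coarsest first (a divisive process:
    removing edges can only refine the connected components).
    """
    nodes = list(nodes)

    def components(es):
        # Connected components of the partial graph (V, es), as frozensets.
        adj = defaultdict(list)
        for u, v in es:
            adj[u].append(v)
            adj[v].append(u)
        seen, parts = set(), set()
        for s in nodes:
            if s in seen:
                continue
            comp, queue = {s}, deque([s])
            seen.add(s)
            while queue:
                x = queue.popleft()
                for y in adj[x]:
                    if y not in seen:
                        seen.add(y)
                        comp.add(y)
                        queue.append(y)
            parts.add(frozenset(comp))
        return parts

    es = list(edges)
    hierarchy = [components(es)]                 # coarsest partition
    while es:
        # Step (i): a single weight per edge (intensity gap of endpoints).
        w = {e: abs(intensity[e[0]] - intensity[e[1]]) for e in es}
        wmax = max(w.values())
        if wmax == 0:                            # only link edges remain
            break
        # Steps (ii)-(v) collapsed: maximal-weight (division) edges are
        # removed, so their endpoints fall into different groups.
        es = [e for e in es if w[e] < wmax]
        hierarchy.append(components(es))
    return hierarchy
```

On a 6-pixel line with intensities (0, 0, 0, 10, 10, 10), the sketch returns two levels, the whole image and the partition {0,1,2} / {3,4,5}, and every group of a level is contained in a group of the previous coarser level, as the definition of hierarchical image segmentation requires.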

8. Fuzzy image segmentation

The fuzzy image segmentation concept was introduced in 151,156 and extended in 150,157, where it was established that a fuzzy image segmentation should be a set of fuzzy regions R1,...,Rk of the image. The authors presented a way to define this concept based on the fact that crisp image segmentation can be characterized in terms of the set of edges that separates the adjacent regions of the segmentation.

Definition 1. Given an image modeled as a network image N(I) = (G = (V,E); D), a subset B ⊂ E characterizes an image segmentation if and only if the number of connected components of the partial graph generated by the edges E − B, denoted as G(E − B) = (V, E − B), decreases when any edge of B is deleted.

In this sense, a formal definition of fuzzy image segmentation is obtained through the fuzzification of the edge-based segmentation concept introduced in Definition 1 (see 150 for more details).

Definition 2. Given a network image N(I) = (G = (V,E); D), we will say that the fuzzy set B̃ = {(e, µB(e)), e ∈ E} produces a fuzzy image segmentation if and only if for all α ∈ [0,1] the crisp set B(α) = {e ∈ E : µB(e) > α} produces an image segmentation in the sense of Definition 1.

In the previous definition, the membership function of the fuzzy set B̃ for a given edge represents the degree of separation between its two adjacent pixels in the segmentation process.

In 157, it has been proven that there exists a relation between the fuzzy image segmentation concept and the concept of hierarchical image segmentation. In that paper, it is shown that it is possible to build a fuzzy image segmentation from a hierarchical image segmentation in the following way. Given a network image N(I) = (G = (V,E); D), let B = {B^0 = E, B^1, ..., B^K = ∅} be a hierarchical image segmentation, and for all t ∈ {0,1,...,K} let µ^t : E → {0,1} be the membership function associated to the boundary set B^t ⊂ E. Then, the fuzzy set B̃ defined as

µB(e) = ∑_{t=0}^{K} w_t µ^t(e)  ∀e ∈ E,  (12)

induces a fuzzy image segmentation of N(I) for any sequence of weights w = (w_0, w_1, ..., w_K) such that

w_t > 0 ∀t ∈ {0,1,...,K},  ∑_{t=0}^{K} w_t = 1.  (13)

The output of fuzzy image segmentation applied to Figure 3 is shown in Figure 22. This picture aggregates the outputs of the hierarchical image segmentation of Section 7.1 in order to build a fuzzy image segmentation 157.

Fig. 22. Fuzzy image segmentation's result.
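The construction of Eqs. (12)–(13) is straightforward to reproduce in code. Below is a small self-contained Python sketch (the edges, the boundary sets B^t and the uniform weights are toy choices, not taken from the paper's experiments): Eq. (12) averages the crisp boundary indicators of the hierarchy, Eq. (13) constrains the weights, and the α-cuts of Definition 2 recover crisp boundary sets.

```python
def fuzzy_boundary(edges, boundary_sets, weights):
    """Eq. (12): mu_B(e) = sum_t w_t * mu^t(e), where mu^t is the
    indicator function of the boundary set B^t of the hierarchy."""
    # Eq. (13): strictly positive weights summing to one.
    assert all(w > 0 for w in weights) and abs(sum(weights) - 1.0) < 1e-9
    return {e: sum(w for w, Bt in zip(weights, boundary_sets) if e in Bt)
            for e in edges}

def alpha_cut(mu, alpha):
    """Definition 2: crisp boundary set B(alpha) = {e : mu_B(e) > alpha}."""
    return {e for e, m in mu.items() if m > alpha}
```

For a 4-pixel chain with edges e1, e2, e3 and the nested boundary sets B^0 = {e1, e2, e3}, B^1 = {e2}, B^2 = ∅ with uniform weights, µB equals 1/3 on e1 and e3 and 2/3 on e2, so the 0.5-cut keeps only e2, which is exactly the boundary of the coarser two-region segmentation.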

9. Final remarks

The main characteristics and relationships between some of the best known techniques of digital image processing have been analyzed in this paper. These techniques have been classified according to the output they generate and to the problems they face. This classification also depends on whether they are supervised or unsupervised methods, and on whether they are crisp or fuzzy techniques.

In this way we can notice that different problems should be formally distinguished when trying to recognize an object in a digital image (such as classification, edge detection, segmentation and hierarchical segmentation). For example, it has been pointed out that the output of an edge detection problem does not necessarily produce a suitable image segmentation: edge detection only detects the borders of the objects and does not necessarily separate the objects in the image.

Furthermore, some of the techniques reviewed in this paper can be used as a previous step for another technique. For example, an edge detection output could be the input for an image segmentation if we find a partition of the set of pixels into connected regions. The opposite is also true: it is possible to find the boundary of a set of objects through the set of pixels that connect pixels of different regions. Also, in the output of an image segmentation, if two different and non-adjacent subsets of pixels share the same characteristics, we can apply an image classification technique so that these regions are assigned to the same class. Moreover, a clustering solution can be a previous step towards an image segmentation. Thus, although different, these techniques can be related to each other.

Computational intelligence is a must in a wide range of challenging real-world image processing problems. More sophisticated original approaches, variations and combinations should be expected depending on the specific characteristics of each image and on the decision-maker objectives (see, for example, 164). In particular, one of the most important applications of computational intelligence techniques is the imaging process, which is used in many fields such as color imaging 165, medical imaging 166, industrial tomography 167, chemical imaging 168 and thermography 169.

We would also like to remark that no significant differences in computational time were found when processing the images considered in this paper with the different algorithms, the running times being in general quite short.

Finally, it is important to point out that in the background of an edge detection process there is a classification problem, because each pixel is classified as either edge or non-edge. Similarly, a thresholding method can be viewed as a classification problem, because each pixel is classified as either object or background. In this manner, the classification of the methods shown in this paper can be extended even further, a problem to be considered in future research.

Acknowledgments

This research has been partially supported by the Government of Spain, grant TIN2015-66471-P, and by the Government of the Community of Madrid, grant S2013/ICE-2845 (CASI-CAM-CM).

References

1. J. Martin and D. Jurafsky, "Speech and language processing," International Edition, (2000).
2. S. Thrun, "Learning metric-topological maps for indoor mobile robot navigation," Artificial Intelligence, 99, 1:21–71, (1998).
3. E. Rosten and T. Drummond, "Machine learning for high-speed corner detection," Computer Vision–ECCV, Springer Berlin Heidelberg, 430–443, (2006).
4. J. Richards and X. Jia, "Remote Sensing Digital Image Analysis. An Introduction," Springer, (1999).
5. D. Marr and E. Hildreth, "Theory of Edge Detection," Proc. Roy. Soc. London, Series B, Biological Sciences, 207, 1167:187–217, (1980).
6. A. Jain, M. Murty and P. Flynn, "Data Clustering: A Review," ACM Computing Surveys, 31, 3:264–322, (1999).
7. M. Sezgin and B. Sankur, "Survey over image thresholding techniques and quantitative performance evaluation," Electronic Imaging, 13, 1:146–165, (2004).
8. R. González, R. Woods and S. Eddins, "Digital Image Processing using MATLAB," Chapman and Hall Computing, (2009).
9. D. Martin, C. Fowlkes, D. Tal and J. Malik, "A Database of Human Segmented Natural Images and its Application to Evaluating Segmentation Algorithms and Measuring Ecological Statistics," Proc. 8th Intl. Conf. Computer Vision, 2, 416–423, (2001).
10. B. Jähne, "Digital Image Processing," Springer, (2002).
11. W. McCulloch and W. Pitts, "A logical calculus of the ideas immanent in nervous activity," The Bulletin of Mathematical Biophysics, 5, 4:115–133, (1943).
12. A. Turing, "Computing machinery and intelligence," Mind, 59, 236:433–460, (1950).
13. C. Cortes and V. Vapnik, "Support-Vector Networks," Machine Learning, 20, 273–297, (1995).
14. J. Pearl, "Reverend Bayes on Inference Engines: A Distributed Hierarchical Approach," Association for the Advancement of Artificial Intelligence, AAAI-82, 133–136, (1982).
15. A. Scott and M. Symons, "Clustering methods based on likelihood ratio criteria," Biometrics, 27, 2:387–397, (1971).
16. A. Wacker and D. Landgrebe, "Minimum Distance Classification in Remote Sensing," Proc. of the First Canadian Symposium on Remote Sensing, 2, 577–599, (1972).
17. I. Aizenberg, N. Aizenberg, J. Hiltner, C. Moraga and E. Meyer zu Bexten, "Cellular neural networks and computational intelligence in medical image processing," Image and Vision Computing, 19, 4:177–183, (2001).
18. J. Jiang, P. Trundle and R. Ren, "Medical image analysis with artificial neural networks," Computerized Medical Imaging and Graphics, 34, 8:617–631, (2010).
19. S. Manikandan, K. Ramar, M. Willjuice Iruthayarajan and K. Srinivasagan, "Multilevel thresholding for segmentation of medical brain images using real coded genetic algorithm," Measurement, 47, 558–568, (2014).
20. H. Yang, H. Wang, Q. Wang and X. Zhang, "LS-SVM based image segmentation using color and texture information," JVCIR, 23, 7:1095–1112, (2012).
21. E. Alpaydin, "Introduction to Machine Learning," The MIT Press, (2010).
22. N. Nilsson, "Introduction to Machine Learning," Ebook, (1996).
23. H. Driver and A. Kroeber, "Quantitative expression of cultural relationships," University of California Publications in American Archeology and Ethnology, 31, 211–256, (1932).
24. A. Rencher, "Methods of Multivariate Analysis," Wiley Interscience, (2002).
25. W. Härdle and L. Simar, "Applied Multivariate Statistical Analysis," Springer, (2003).
26. R. Xu and D. Wunsch II, "Survey of Clustering Algorithms," IEEE Transactions on Neural Networks, 16, 3:645–678, (2005).
27. J. MacQueen, "Some methods for classification and analysis of multivariate observations," Proc. of 5th Berkeley Symposium on Mathematical Statistics and Probability 1, University of California Press, 31, 281–297, (1967).
28. S. Lloyd, "Least square quantization in PCM," Bell Telephone Laboratories Paper, (1957).
29. S. Lloyd, "Least square quantization in PCM," IEEE Transactions on Information Theory, 28, 2:129–137, (1982).
30. T. Kanungo, D. Mount, N. Netanyahu, C. Piatko, R. Silverman and A. Wu, "An Efficient k-Means Clustering Algorithm: Analysis and Implementation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 24, 7:881–892, (2002).
31. J. Wang, "Encyclopedia of Data Warehousing and Mining," Idea Group Reference, (2006).
32. A. Wacker, "A cluster approach to finding spatial boundaries in multispectral imagery," Laboratory for Applications of Remote Sensing, Purdue University, LARS Info. Note 122969, (1969).
33. R. Dass, Priyanka and S. Devi, "Image Segmentation Techniques," IJECT, 3, 1:66–70, (2012).
34. G. Toussaint, "The use of context in Pattern Recognition," Pattern Recognition, 10, 3:189–204, (1978).
35. J. Lu and Q. Weng, "A survey of image classification methods and techniques for improving classification performance," IJRSA, 28, 5:823–870, (2007).
36. L. Zadeh, "Fuzzy Sets," Information and Control, 8, 338–353, (1965).
37. Z. Chi, H. Yan and T. Pham, "Fuzzy Algorithms: With Applications to Image Processing and Pattern Recognition," World Scientific, 10, (1996).
38. V. Novák, I. Perfilieva and J. Mockor, "Mathematical principles of fuzzy logic," Springer Science & Business Media, 517, (2012).
39. V. Novák, I. Perfilieva and J. Mockor, "Mathematical Principles of Fuzzy Logic," Boston: Kluwer Academic Publishers, (1999).
40. D. Dubois, W. Ostasiewicz and H. Prade, "Fuzzy sets: history and basic notions," Fundamentals of Fuzzy Sets, Springer, 21–124, (2000).
41. D. Dubois and H. Prade, "Fundamentals of fuzzy sets," Springer & Business Media, 7, (2000).
42. G. Klir and B. Yuan, "Fuzzy sets and fuzzy logic," New Jersey: Prentice Hall, 4, (1995).
43. E. Kerre, "Introduction to the Basic Principles of Fuzzy Set Theory and some of its Applications, second revised edition," Gent: Communication and Cognition, (1993).
44. A. Amo, J. Montero, G. Biging and V. Cutello, "Fuzzy classification systems," EJOR, 156, 495–507, (2004).
45. L. Zadeh, "A rationale for fuzzy control," J. Dynamic Syst. Meas. Control, 94, 3–4, (1972).
46. F. Herrera, M. Lozano and J.L. Verdegay, "A learning process for fuzzy control rules using genetic algorithms," Fuzzy Sets and Systems, 100, 143–158, (1998).
47. M. Galar, J. Fernández, G. Beliakov and H. Bustince, "Interval-valued fuzzy sets applied to stereo matching of color images," IEEE Trans. on Image Processing, 20, 7:1949–1961, (2011).
48. E. Mamdani, "Application of fuzzy algorithms for control of simple dynamic plant," Proc. of the Institution of Elec. Engineers, 121, 12:1585–1588, (1974).
49. D. Ruan and E. Kerre, "Fuzzy IF-THEN Rules in Computational Intelligence: Theory and Applications," Boston: Kluwer Academic Publishers, (2000).
50. I. Perfilieva, "Logical foundations of rule-based systems," Fuzzy Sets and Systems, 157, 5:615–621, (2006).
51. D. Dubois and H. Prade, "What are fuzzy rules and how to use them," Fuzzy Sets and Systems, 84, 2:169–185, (1996).
52. W. van Leekwijck and E. Kerre, "Defuzzification: criteria and classification," Fuzzy Sets and Systems, 108, 2:159–178, (1999).
53. O. Cordón, M. del Jesús and F. Herrera, "A proposal on reasoning methods in fuzzy rule-based classification systems," Intl. Journal of Approximate Reasoning, 20, 21–45, (1999).
54. J. Dunn, "A Fuzzy Relative of the ISODATA Process and Its Use in Detecting Compact Well-Separated Clusters," Journal of Cybernetics, 3, 3:32–57, (1973).
55. J. Bezdek, "Pattern Recognition with Fuzzy Objective Function Algorithms," Plenum Press, (1981).
56. J. Bezdek, R. Ehrlich and W. Full, "FCM: The fuzzy c-means clustering algorithm," Computers & Geosciences, 10, 2-3:191–203, (1984).
57. J. Yang, S. Hao and P. Chung, "Color image segmentation using fuzzy c-means and eigenspace projections," Signal Processing, 82, 461–472, (2002).
58. D. Zhong and H. Yan, "Color image segmentation using color space analysis and fuzzy clustering," Proceedings of the 2000 IEEE Signal Processing Society Workshop, 2, 624–633, (2000).
59. M. Nachtegael, D. Van der Weken, E. Kerre and W. Philips, "Soft Computing in Image Processing," Studies in Fuzziness and Soft Computing, Springer, 210, (2007).
60. E. Kerre and M. Nachtegael, "Fuzzy techniques in image processing," Physica-Verlag Springer, 52, (2000).
61. E. Ruspini, "A new approach to clustering," Information and Control, 15, 22–32, (1969).
62. M. Sonka, V. Hlavac and R. Boyle, "Image Processing, Analysis and Machine Vision," Chapman and Hall Computing, (1993).
63. D. Phillips, "Image Processing in C," R & D Publications, (2000).
64. L. Liang and C. Looney, "Competitive fuzzy edge detection," Applied Soft Computing, 3, 123–137, (2003).
65. I. Sobel, "History and Definition of the Sobel Operator," (2014).
66. J. Prewitt, "Object Enhancement and Extraction," Picture Processing and Psychopictorics, Academic Press, (1970).
67. L. Roberts, "Machine perception of three-dimensional solids," Optical and Electro-optical Information Processing, 159–197, (1965).
68. A. Reeves, "A moment based two-dimensional edge operator," Proc. IEEE Comput. Vision & Patt. Recogn., 312–317, (1983).
69. R. Duda and P. Hart, "Pattern Classification and Scene Analysis," John Wiley and Sons, 271–272, (1973).
70. J. Canny, "A Computational Approach To Edge Detection," IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-8, 6:679–698, (1986).
71. T. Pavlidis, "Algorithms for Graphics and Image Processing," Springer, (1982).
72. C. López-Molina, "The breakdown structure of edge detection: Analysis of individual components and revisit of the overall structure," Ph.D. Thesis, Navarra Public University, (2012).
73. F. Russo, "Edge detection in noisy images using fuzzy reasoning," Instrumentation and Measurement Technology Conference, 1, 369–372, (1998).
74. I. Bloch, "Fuzzy sets for image processing and understanding," Fuzzy Sets and Systems, doi:10.1016/j.fss.2015.06.017, (2015).
75. S. Pal and R. King, "On Edge Detection of X-Ray Images Using Fuzzy Sets," IEEE Transactions on Pattern Analysis and Machine Intelligence, 5, 1:69–77, (1983).
76. F. Russo, "A user-friendly research tool for image processing with fuzzy rules," Proc. of the First IEEE International Conference on Fuzzy Systems, 561–568, (1992).
77. F. Russo and G. Ramponi, "Edge extraction by FIRE operators," IEEE Conf. on Fuzzy Systems, 1, 249–253, (1994).
78. H. Bustince, E. Barrenechea, J. Fernández, M. Pagola, J. Montero and C. Guerra, "Contrast of a fuzzy relation," Information Sciences, 180, 8:1326–1344, (2010).
79. H. Bustince, E. Barrenechea, M. Pagola and J. Fernández, "Interval-valued fuzzy sets constructed from matrices: Application to edge detection," Fuzzy Sets and Systems, 160, 13:1819–1840, (2009).
80. E. Barrenechea, H. Bustince and B. De Baets, "Construction of interval-valued fuzzy relations with application to the generation of fuzzy edge images," IEEE Trans. on Fuzzy Systems, 19, 819–830, (2011).
81. C. López-Molina, B. De Baets, H. Bustince, J. Sanz and E. Barrenechea, "Multiscale edge detection based on Gaussian smoothing and edge tracking," Knowledge-Based Systems, 44, 101–111, (2013).
82. H. Bustince, V. Mohedano, E. Barrenechea and M. Pagola, "Definition and construction of fuzzy DI-subsethood measures," Information Sciences, 176, 21:3190–3231, (2006).
83. N. Senthilkumaran and R. Rajesh, "Edge detection techniques for image segmentation: a survey of soft computing approaches," International Journal of Recent Trends in Engineering, 1, 2:250–254, (2009).
84. B. Basavaprasad and H. Ravindra, "A Survey on Traditional and Graph Theoretical Techniques for Image Segmentation," IJCA Proc. on Nat. Conf. on Recent Advances in Information Technology, NCRAIT, 1:38–46, (2014).
85. N. Pal and S. Pal, "A review on image segmentation techniques," Pattern Recognition, 26, 1277–1294, (1993).
86. J. Bruce, T. Balch and M. Veloso, "Fast and inexpensive color image segmentation for interactive robots," International Conference on Intelligent Robots and Systems (IROS 2000), Proceedings, 3, 2061–2066, (2000).
87. Y. Al-Kofahi, W. Lassoued, W. Lee and B. Roysam, "Improved automatic detection and segmentation of cell nuclei in histopathology images," IEEE Trans. on Biomedical Engineering, 57, 4:841–852, (2010).
88. V. Grau, A. Mewes, M. Alcañiz, R. Kikinis and S. Warfield, "Improved watershed transform for medical image segmentation using prior information," IEEE Trans. on Medical Imaging, 23, 4:447–458, (2004).
89. W. Pratt, "Digital Image Processing," Wiley-Interscience, (2001).
90. Y. Ma, K. Zhan and Z. Wang, "Applications of Pulse-Coupled Neural Networks," Springer, (2010).
91. K. Fu and J. Mui, "A survey on image segmentation," Pattern Recognition, 13, 3–16, (1981).
92. J. Weszka, R. Nagel and A. Rosenfeld, "A Technique for Facilitating Threshold Selection for Object Extraction from Digital Pictures," University of Maryland Center, TR-243, (1973).
93. T. Wu, "Image data field-based framework for image thresholding," Optics & Laser Technology, 62, 1–11, (2014).
94. R. Adams and L. Bischof, "Seeded Region Growing," IEEE Trans. on Pattern Analysis and Machine Intelligence, 16, 6:641–647, (1994).
95. S. Lee and S. Chung, "A comparative study of several global thresholding techniques for segmentation," Comput. Vision, Graphics, Image Process., 52, 171–190, (1990).
96. J. Kittler and J. Illingworth, "Minimum error thresholding," Pattern Recognition, 19, 1:41–47, (1986).
97. I. Daubechies, M. Defrise and C. Mol, "An iterative thresholding algorithm for linear inverse problems with a sparsity constraint," Comm. Pure Appl. Math., 57, 1413–1457, (2004).
98. T. Pun, "A new method for grey-level picture thresholding using the entropy of the histogram," Signal Processing, 2, 3:223–237, (1980).
99. N. Otsu, "A threshold selection method from gray level histograms," IEEE Trans. Systems Man Cybernet., 9, 62–66, (1979).
100. J. Weszka and A. Rosenfeld, "Threshold Evaluation Techniques," IEEE Trans. on Systems, Man and Cybernetics, SMC-8, 8:622–629, (1978).
101. T. Ridler and S. Calvard, "Picture Thresholding Using an Iterative Selection Method," IEEE Trans. on Systems, Man and Cybernetics, SMC-8, 8:630–632, (1978).
102. J. Kapur, P. Sahoo and A. Wong, "A New Method for Gray-Level Picture Thresholding Using the Entropy of the Histogram," Computer Vision, Graphics and Image Processing, 29, 273–285, (1985).
103. J. Russ and C. Russ, "Automatic discrimination of features in gray-scale images," Journal of Microscopy, 148, 3:263–277, (1987).
104. L. Hertz and R. Schafer, "Multilevel thresholding using edge matching," Computer Vision, Graphics, and Image Processing, 44, 3:279–295, (1988).
105. R. Kirby and A. Rosenfeld, "A Note on the Use of (Gray Level, Local Average Gray Level) Space as an Aid in Threshold Selection," IEEE Trans. on Systems, Man and Cybernetics, SMC-9, 12:860–864, (1979).
106. Y. Nakagawa and A. Rosenfeld, "Some experiments on variable thresholding," Pattern Recognition, 11, 3:191–204, (1979).
107. P. Sahoo, S. Soltani, A. Wong and Y. Chen, "A Survey of Thresholding Techniques," Comput. Vision, Graphics, and Image Processing, 41, 233–260, (1988).
108. C. Glasbey, "An Analysis of Histogram-Based Thresholding Algorithms," CVGIP: Graphical Models and Image Processing, 55, 6:532–537, (1993).
109. H. Bustince, M. Pagola, E. Barrenechea, J. Fernández, P. Melo-Pinto, P. Couto, H. Tizhoosh and J. Montero, "Ignorance functions. An application to the calculation of the threshold in prostate ultrasound images," Fuzzy Sets and Systems, 161, 1:20–36, (2010).
110. S. Beucher and C. Lantuéjoul, "Use of watersheds in contour detection," Intl. Workshop on Image Processing, Real-Time Edge and Motion Detection, (1979).
111. S. Collins, "Terrain parameters directly from a digital terrain model," Canadian Surveyor, 29, 5:507–518, (1975).
112. T. Peucker and D. Douglas, "Detection of surface-specific points by local parallel processing of discrete terrain elevation data," Comput. Vision, Graphics, Image Processing, 4, 375–387, (1975).
113. D. Marks, J. Dozier and J. Frew, "Automated basin delineation from digital elevation data," Geoprocessing, 2, 299–311, (1984).
114. L. Band, "Topographic partition of watersheds with digital elevation models," Water Resources Res., 22, 1:15–24, (1986).
115. L. Vincent and P. Soille, "Watersheds in Digital Spaces: An Efficient Algorithm Based on Immersion Simulations," IEEE Trans. on Pattern Analysis and Machine Intelligence, 13, 6:583–598, (1991).
116. A. Mangan and R. Whitaker, "Partitioning 3D Surface Meshes Using Watershed Segmentation," IEEE Trans. on Visualization and Computer Graphics, 5, 4:308–321, (1999).
117. E. Rashedi and H. Nezamabadi-pour, "A stochastic gravitational approach to feature based color image segmentation," Engineering Applications of Artificial Intelligence, 26, 4:1322–1332, (2013).
118. A. Moga and M. Gabbouj, "A Parallel Watershed Algorithm Based on the Shortest Paths Computation," 4th NTUG'95 & ZEUS'95 Parallel Programming and Applications, IOS Press, 316–324, (1995).
119. F. Meyer and S. Beucher, "Morphological segmentation," Journal of Visual Communication and Image Representation, 1, 21–46, (1990).
120. L. Vincent and S. Beucher, "The morphological approach to segmentation: An introduction," School of Mines, Paris, (1989).
121. S. Beucher and F. Meyer, "The morphological approach to segmentation: the watershed transformation," Mathematical Morphology in Image Processing (Ed. E.R. Dougherty), 433–481, (1993).
122. J. Roerdink and A. Meijster, "The Watershed Transform: Definitions, Algorithms and Parallelization Strategies," Fundamenta Informaticae, IOS Press, 41, 187–228, (2001).
123. C. Brice and C. Fennema, "Scene analysis using regions," Artificial Intelligence, 1, 3-4:205–226, (1970).
124. A. Trémeau and P. Colantoni, "Regions Adjacency Graph Applied to Color Image Segmentation," IEEE Trans. on Image Processing, 9, 4:735–744, (2000).
125. P. Felzenszwalb and D. Huttenlocher, "Efficient graph-based image segmentation," International Journal of Computer Vision, 59, 2:167–181, (2004).
126. J. Lladós, E. Martí and J. Villanueva, "Symbol Recognition by Error-Tolerant Subgraph Matching between Region Adjacency Graphs," IEEE Trans. on Pattern Analysis and Machine Intelligence, 23, 10:1137–1143, (2001).
127. L. Grady, "Random Walks for Image Segmentation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 28, 11:1768–1783, (2006).
128. H. Wechsler and M. Kidode, "A random walk procedure for texture discrimination," IEEE Trans. on Pattern Analysis and Machine Intelligence, PAMI-1, 3:272–280, (1979).
129. P. Tetali, "Random walks and the effective resistance of networks," Journal of Theoretical Probability, 4, 1:101–109, (1991).
130. L. Grady and G. Funka-Lea, "Multi-Label Image Segmentation for Medical Applications Based on Graph-Theoretic Electrical Potentials," ECCV, 230–245, (2004).
131. D. Greig, B. Porteous and A. Seheult, "Exact Maximum A Posteriori Estimation for Binary Images," Journal of the Royal Statistical Soc., Series B, 51, 271–279, (1989).
132. Y. Boykov and O. Veksler, "Graph Cuts in Vision and Graphics: Theories and Applications," Handbook of Mathematical Models in Computer Vision, 76–96, (2005).
133. Y. Boykov and V. Kolmogorov, "An Experimental Comparison of Min-Cut/Max-Flow Algorithms for Energy Minimization in Vision," IEEE Trans. on Pattern Analysis and Machine Intelligence, 26, 9:1124–1137, (2004).
134. Y. Boykov and G. Funka-Lea, "Graph Cuts and Efficient N-D Image Segmentation," Intl. Journal of Computer Vision, 70, 2:109–131, (2006).
135. S. Sinha, "Graph Cut Algorithms in Vision, Graphics and Machine Learning. An Integrative Paper," UNC Chapel Hill, (2004).
136. L. Ford and D. Fulkerson, "Maximal Flow Through a Network," Canadian Journal of Mathematics, 8, 399–404, (1956).
137. P. Elias, A. Feinstein and C. Shannon, "Note on maximum flow through a network," IRE Trans. on Inf. Theory, IT-2, 117–119, (1956).
138. S. Tatiraju and A. Mehta, "Image Segmentation using k-means clustering, EM and Normalized Cuts," University of California, Department of EECS.
139. J. Shi and J. Malik, "Normalized Cuts and Image Segmentation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 22, 8:888–905, (2000).
140. S. Flavell, S. Winter and D. Wilson, "Matching Region Adjacency Graphs," Microprocessing and Microprogramming, 31, 15:31–33, (1991).
141. A. Dutta, J. Lladós, H. Bunke and U. Pal, "Near Convex Region Adjacency Graph and Approximate Neighborhood String Matching for Symbol Spotting in Graphical Documents," 12th International Conference on Document Analysis and Recognition (ICDAR), 1078–1082, (2013).
142. P. Colantoni and B. Laget, "Color Image Segmentation using region adjacency graphs," Sixth Intl. Conf. on Image Proc. and Its App., 2, 698–702, (1997).
143. B. Peng, L. Zhang and D. Zhang, "A survey of graph theoretical approaches to image segmentation," Pattern Recognition, 46, 3:1020–1038, (2013).
144. Y. Zhang, "Image Segmentation," Science Press, (2001).
145. Y. Zhang, "A review of recent evaluation methods for image segmentation," IEEE Signal Proc. Society Press, Malaysia: Proc. of the VI Intl. Symposium on Signal Proc. and Its Appl., (2001).
146. H. Simon, "The architecture of complexity," Proc. of the American Philosophical Soc., 106, 467–482, (1962).
147. R. Lynd and H. Lynd, "Middletown in Transition: A Study in Cultural Conflicts," Harcourt, Brace and Company, New York, (1937).
148. M. Newman, "Fast algorithm for detecting community structure in networks," Physical Review E, 69, 6:066133, (2004).
149. M. Girvan and M. Newman, "Community structure in social and biological networks," Proc. of the National Academy of Sciences of the USA, 99, 12:7821–7826, (2002).
150. D. Gómez, E. Zarrazola, J. Yáñez and J. Montero, "A Divide-and-Link Algorithm for Hierarchical Clustering in Networks," Information Sciences, 316, 308–328, (2015).
151. D. Gómez, E. Zarrazola, J. Yáñez, J. Rodríguez and J. Montero, "A new concept of fuzzy image segmentation," The 11th Intl. FLINS Conference, (2014).
152. D. Gómez, J. Montero and J. Yáñez, "A divide-link algorithm based on fuzzy similarity for clustering networks," International Conference on Intelligent Systems Design and Applications, ISDA, 1247–1252, (2011).
153. D. Gómez, J. Montero and G. Biging, "Improvements to remote sensing using fuzzy classification, graphs and accuracy statistics," Pure and Applied Geophysics, 165, 1555–1575, (2008).
154. D. Gómez, J. Montero and J. Yáñez, "A coloring algorithm for image classification," Information Sciences, 176, 3645–3657, (2006).
155. D. Gómez, J. Montero, J. Yáñez and C. Poidomani, "A graph coloring algorithm approach for image segmentation," Omega, 35, 173–183, (2007).
156. D. Gómez and J. Montero, "Fuzzy sets in remote sensing classification," Soft Computing, 12, 243–249, (2008).
157. D. Gómez, J. Yáñez, C. Guada, J. Rodríguez, J. Montero and E. Zarrazola, "Fuzzy image segmentation based upon hierarchical clustering," Knowledge-Based Systems, 87, 26–37, (2015).
158. P. Arbelaez, M. Maire, C. Fowlkes and J. Malik, "Contour detection and hierarchical image segmentation," IEEE Trans. on Pattern Analysis and Machine Intelligence, 33, 5:898–916, (2011).
159. S. Lhermitte, J. Verbesselt, I. Jonckheere, K. Nackaerts, J. van Aardt, W. Verstraeten and P. Coppin, "Hierarchical image segmentation based on similarity of NDVI time series," Remote Sensing of Environment, 112, 2:506–521, (2008).
160. P. Schroeter and J. Bigün, "Hierarchical image segmentation by multi-dimensional clustering and orientation-adaptive boundary refinement," Pattern Recognition, 28, 5:695–709, (1995).
161. H. Cheng and Y. Sun, "A hierarchical approach to color image segmentation using homogeneity," IEEE Trans. on Image Processing, 9, 12:2071–2082, (2000).
162. J. Chamorro-Martinez, D. Sanchez, B. Prados-Suarez, E. Galan-Perales and M. Vila, "A hierarchical approach to fuzzy segmentation of colour images," IEEE Intl. Conference on Fuzzy Systems, 2, 966–971, (2003).
163. J. Kruskal, "On the Shortest Spanning Subtree of a Graph and the Traveling Salesman Problem," Proc. of the American Mathematical Soc., 7, 48–50, (1956).
164. A. Chatterjee and P. Siarry, "Computational Intelligence in Image Processing," Springer, (2013).
165. G. Sharma and R. Bala, "Digital Color Imaging Handbook," CRC Press, (2002).
166. J. Beutel, H. Kundel and R. Van Metter, "Handbook of Medical Imaging, Volume 1: Physics and Psychophysics," SPIE Press, (2000).
167. M. Wang, "Industrial Tomography. Systems and Applications," Woodhead Publishing, (2015).
168. C. Evans, E. Potma, M. Puoris'haag, D. Côté, C. Lin and X. Xie, "Chemical imaging of tissue in vivo with video-rate coherent anti-Stokes Raman scattering microscopy," Proc. of Natl. Acad. of Sciences of USA, 102, 46:16807–16812, (2005).
169. E. Ng, "A review of thermography as promising non-invasive detection modality for breast tumor," Intl. Journal of Thermal Sciences, 48, 5:849–859, (2009).