Robust Analysis of Feature Spaces: Color Image Segmentation
Dorin Comaniciu    Peter Meer
Department of Electrical and Computer Engineering
Rutgers University, Piscataway, NJ 08855, USA

Keywords: robust pattern analysis, low-level vision, content-based indexing

Abstract

A general technique for the recovery of significant image features is presented. The technique is based on the mean shift algorithm, a simple nonparametric procedure for estimating density gradients. Drawbacks of the current methods (including robust clustering) are avoided. Feature spaces of any nature can be processed, and as an example, color image segmentation is discussed. The segmentation is completely autonomous; only its class is chosen by the user. Thus, the same program can produce a high quality edge image or provide, by extracting all the significant colors, a preprocessor for content-based query systems. A 512 × 512 color image is analyzed in less than 10 seconds on a standard workstation. Gray level images are handled as color images having only the lightness coordinate.

1 Introduction

Feature space analysis is a widely used tool for solving low-level image understanding tasks. Given an image, feature vectors are extracted from local neighborhoods and mapped into the space spanned by their components. Significant features in the image then correspond to high density regions in this space. Feature space analysis is the procedure of recovering the centers of the high density regions, i.e., the representations of the significant image features. Histogram based techniques and the Hough transform are examples of the approach.

When the number of distinct feature vectors is large, the size of the feature space is reduced by grouping nearby vectors into a single cell. A discretized feature space is called an accumulator. Whenever the size of the accumulator cell is not adequate for the data, serious artifacts can appear. The problem was extensively studied in the context of the Hough transform, e.g., [5]. Thus, for satisfactory results a feature space should have a continuous coordinate system. The content of a continuous feature space can be modeled as a sample from a multivariate, multimodal probability distribution. Note that for real images the number of modes can be very large, of the order of tens.

The highest density regions correspond to clusters centered on the modes of the underlying probability distribution. Traditional clustering techniques [6] can be used for feature space analysis, but they are reliable only if the number of clusters is small and known a priori. Estimating the number of clusters from the data is computationally expensive and not guaranteed to produce satisfactory results.

A much too often used assumption is that the individual clusters obey multivariate normal distributions, i.e., the feature space can be modeled as a mixture of Gaussians. The parameters of the mixture are then estimated by minimizing an error criterion. For example, a large class of thresholding algorithms is based on the Gaussian mixture model of the histogram, e.g., [11]. However, there is no theoretical evidence that an extracted normal cluster necessarily corresponds to a significant image feature. On the contrary, a strong artifact cluster may appear when several features are mapped into partially overlapping regions.

Nonparametric density estimation [4, Chap. 6] avoids the use of the normality assumption. The two families of methods, Parzen windows and k-nearest neighbors, both require additional input information (the type of the kernel, the number of neighbors). This information must be provided by the user, and for multimodal distributions it is difficult to guess the optimal setting.

Nevertheless, a reliable general technique for feature space analysis can be developed using a simple nonparametric density estimation algorithm. In this paper we propose such a technique whose robust behavior is superior to that of methods employing robust estimators from statistics.

2 Requirements for Robustness

Estimation of a cluster center is called in statistics the multivariate location problem. To be robust, an estimator must tolerate a percentage of outliers, i.e., data points not obeying the underlying distribution of the cluster. Numerous robust techniques were proposed [10, Sec. 7.1], and in computer vision the most widely used is the minimum volume ellipsoid (MVE) estimator proposed by Rousseeuw [10, p. 258].

The MVE estimator is affine equivariant (an affine transformation of the input is passed on to the estimate) and has a high breakdown point (it tolerates up to half the data being outliers). The estimator finds the center of the highest density region by searching for the minimal volume ellipsoid containing at least h data points. The multivariate location estimate is the center of this ellipsoid. To avoid combinatorial explosion a probabilistic search is employed. Let the dimension of the data be p. A small number of (p + 1)-tuples of points are randomly chosen. For each (p + 1)-tuple the mean vector and covariance matrix are computed, defining an ellipsoid. The ellipsoid is inflated to include h points, and the one having the minimum volume provides the MVE estimate.

Based on MVE, a robust clustering technique with applications in computer vision was proposed in [7]. The data is analyzed under several "resolutions" by applying the MVE estimator repeatedly with h values representing fixed percentages of the data points. The best cluster then corresponds to the h value yielding the highest density inside the minimum volume ellipsoid. The cluster is removed from the feature space, and the whole procedure is repeated until the space is empty. The robustness of MVE should ensure that each cluster is associated with only one mode of the underlying distribution. The number of significant clusters is not needed a priori.

The robust clustering method was successfully employed for the analysis of a large variety of feature spaces, but was found to become less reliable once the number of modes exceeded ten. This is mainly due to the normality assumption embedded into the method. The ellipsoid defining a cluster can also be viewed as the high confidence region of a multivariate normal distribution. Arbitrary feature spaces are not mixtures of Gaussians, and constraining the shape of the removed clusters to be elliptical can introduce serious artifacts. The effect of these artifacts propagates as more and more clusters are removed. Furthermore, the estimated covariance matrices are not reliable, since they are based on only p + 1 points. Subsequent postprocessing based on all the points declared inliers cannot fully compensate for an initial error.

To be able to correctly recover a large number of significant features, the problem of feature space analysis must be solved in context. In image understanding tasks the data to be analyzed originates in the image domain; that is, the feature vectors satisfy additional, spatial constraints. While these constraints are indeed used in the current techniques, their role is mostly limited to compensating for feature allocation errors made during the independent analysis of the feature space. To be robust, the feature space analysis must fully exploit the image domain information.

As a consequence of the increased role of image domain information, the burden on the feature space analysis can be reduced. First all the significant features are extracted, and only then are the clusters containing the instances of these features recovered. The latter procedure uses image domain information and avoids the normality assumption.

Significant features correspond to high density regions, and to locate these regions a search window must be employed. The number of parameters defining the shape and size of the window should be minimal, and therefore whenever possible the feature space should be isotropic. A space is isotropic if the distance between two points is independent of the location of the point pair. The most widely used isotropic space is the Euclidean space, where a sphere, having only one parameter (its radius), can be employed as the search window. The isotropy requirement determines the mapping from the image domain to the feature space. If the isotropy condition cannot be satisfied, a Mahalanobis metric should be defined from the statement of the task.

We conclude that robust feature space analysis requires a reliable procedure for the detection of high density regions. Such a procedure is presented in the next section.

3 Mean Shift Algorithm

A simple, nonparametric technique for estimation of the density gradient was proposed in 1975 by Fukunaga and Hostetler [4, p. 534]. The idea was recently generalized by Cheng [2].

Assume, for the moment, that the probability density function p(x) of the p-dimensional feature vectors x is unimodal; this condition is for the sake of clarity only and will be removed later. A sphere S_x of radius r, centered on x, contains the feature vectors y such that \| y - x \| \le r. The expected value of the vector z = y - x, given x and S_x, is

    \mu = E[ z \mid S_x ] = \int_{S_x} (y - x) \, p(y \mid S_x) \, dy
        = \int_{S_x} (y - x) \, \frac{p(y)}{p(y \in S_x)} \, dy        (1)

If S_x is sufficiently small, we can approximate

    p(y \in S_x) = p(x) \, V_{S_x}, \qquad \text{where} \quad V_{S_x} = c \, r^p        (2)

... the location of the mode detected by the one-dimensional MVE mode detector, i.e., the center of the shortest rectangular window containing half the data points [10, Sec.
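The sample analogue of the expected offset E[z | S_x] in Eq. (1) suggests a simple mode-seeking iteration: translate the spherical search window by the mean of the points it currently contains, and repeat until the shift becomes negligible. The sketch below is only a minimal illustration of this idea, not the authors' implementation; the function name, the window radius, and the synthetic two-cluster data are arbitrary choices made for the example.

```python
import numpy as np

def mean_shift_mode(x0, points, r, max_iter=100, tol=1e-6):
    """Climb from x0 toward a density mode by repeated mean shift steps.

    Each step replaces x by the mean of the points inside the sphere of
    radius r centered on x, i.e., adds the sample mean shift vector.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        inside = points[np.linalg.norm(points - x, axis=1) <= r]
        if len(inside) == 0:          # empty window: nothing to average
            break
        shift = inside.mean(axis=0) - x   # sample estimate of E[z | S_x]
        x = x + shift
        if np.linalg.norm(shift) < tol:   # window stopped moving: at a mode
            break
    return x

# Two well-separated Gaussian clouds; starting near the first cloud,
# the window drifts toward that cloud's center of highest density.
rng = np.random.default_rng(0)
pts = np.vstack([rng.normal(0.0, 0.5, (500, 2)),
                 rng.normal(5.0, 0.5, (500, 2))])
mode = mean_shift_mode([0.8, -0.8], pts, r=1.0)
```

Because the sphere has a single parameter, its radius, this search respects the isotropy requirement of Section 2; only one number must be supplied by the user.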