International Journal of Computer Vision, 6:1, 59-70 (1991)
© 1991 Kluwer Academic Publishers. Manufactured in The Netherlands.
Robust Regression Methods for Computer Vision: A Review
PETER MEER,* DORON MINTZ, AND AZRIEL ROSENFELD
Center for Automation Research, University of Maryland, College Park, MD 20742

DONG YOON KIM
Agency for Defense Development, P.O. Box 35, Taejeon, Korea
Abstract. Regression analysis (fitting a model to noisy data) is a basic technique in computer vision. Robust regression methods that remain reliable in the presence of various types of noise are therefore of considerable importance. We review several robust estimation techniques and describe in detail the least-median-of-squares (LMedS) method. The method yields the correct result even when half of the data is severely corrupted. Its efficiency in the presence of Gaussian noise can be improved by complementing it with a weighted least-squares-based procedure. The high time-complexity of the LMedS algorithm can be reduced by a Monte Carlo type speed-up technique. We discuss the relationship of LMedS with the RANSAC paradigm and its limitations in the presence of noise corrupting all the data, and we compare its performance with the class of robust M-estimators. References to published applications of robust techniques in computer vision are also given.
1 Introduction

Regression analysis (fitting a model to noisy data) is an important statistical tool frequently employed in computer vision for a large variety of tasks. Tradition and ease of computation have made the least squares method the most popular form of regression analysis. The least squares method achieves optimum results when the underlying error distribution is Gaussian. However, the method becomes unreliable if the noise has nonzero-mean components and/or if outliers (samples with values far from the local trend) are present in the data. The outliers may be the result of clutter, large measurement errors, or impulse noise corrupting the data. At a transition between two homogeneous regions of the image, samples belonging to one region may become outliers for fits to the other region.

Three concepts are usually employed to evaluate a regression method: relative efficiency, breakdown point, and time complexity. The relative efficiency of a regression method is defined as the ratio between the lowest achievable variance for the estimated parameters (the Cramér-Rao bound) and the actual variance provided by the given method. The efficiency also depends on the underlying noise distribution. For example, in the presence of Gaussian noise the mean estimator has an asymptotic (large sample) efficiency of 1 (achieving the lower bound), while the median estimator's efficiency is only 2/π = 0.637 (Mosteller & Tukey 1977).

The breakdown point of a regression method is the smallest amount of outlier contamination that may force the value of the estimate outside an arbitrary range. For example, the (asymptotic) breakdown point of the mean is 0, since a single large outlier can corrupt the result. The median remains reliable if less than half of the data are contaminated, yielding asymptotically the maximum breakdown point, 0.5.

The time complexity of the least squares method is O(np²), where n is the number of data points and p is the number of parameters to be estimated. Feasibility of the computation requires a time complexity of at most O(n²).

A new, improved regression method should provide:

a. reliability in the presence of various types of noise, i.e., good asymptotic and small sample efficiency;
b. protection against a high percentage of outliers, i.e., a high breakdown point;
c. a time complexity not much greater than that of the least squares method.

*Present address: Department of Electrical and Computer Engineering, Rutgers University, Piscataway, NJ 08855-0909, U.S.A.
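The difference between the two breakdown points is easy to demonstrate numerically. The sketch below is our own illustration (not from the paper): a Gaussian sample is contaminated by a single gross outlier, which destroys the mean while leaving the median essentially unchanged.

```python
import random
from statistics import mean, median

random.seed(0)

# Clean Gaussian sample around a true location of 5.0.
data = [random.gauss(5.0, 1.0) for _ in range(100)]

# Replace a single observation by a gross outlier: the mean has
# breakdown point 0 and is dragged arbitrarily far, while the
# median (asymptotic breakdown point 0.5) barely moves.
corrupted = data[:]
corrupted[0] = 1e6

print(abs(mean(data) - 5.0) < 0.5)         # clean data: mean is fine
print(abs(median(data) - 5.0) < 0.5)       # clean data: median is fine
print(mean(corrupted) > 1000)              # one outlier ruins the mean
print(abs(median(corrupted) - 5.0) < 0.5)  # median is nearly unchanged
```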
Many statistical techniques have been proposed which satisfy some of the above conditions. These techniques are known as robust regression methods. In section 2 a review of robust regression methods is given. In section 3 the least-median-of-squares (LMedS) method is discussed in detail. A comparison of the LMedS method with the RANSAC paradigm in section 4 gives us the opportunity to analyze its behavior for data usually met in computer vision problems. Through a simple application of LMedS, window-operator based smoothing, we compare its performance in section 5 with the class of robust M-estimators. The article is concluded in section 6.

2 Robust Regression Methods

The early attempts to introduce robust regression methods involved straight-line fitting. In one class of methods the data is first partitioned into two or three nearly equal sized parts (i ≤ L; L < i ≤ R; R < i), where i is the index of the data and L = R in the former case. The slope β₁ and the intercept β₀ of the line are found by solving the system of nonlinear equations

    med_{i≤L} (y_i − β₀ − β₁x_i) = med_{i>R} (y_i − β₀ − β₁x_i)

The M-estimators are the most popular robust regression methods. (See the book by Hampel et al. (1986) for a leading reference.) These estimators minimize the sum of a symmetric, positive-definite function ρ(r_i) of the residuals r_i, with a unique minimum at r_i = 0. (A residual is defined as the difference between a data point and the fitted value.) For the least squares method ρ(r_i) = r_i². Several ρ functions have been proposed which reduce the influence of large residual values on the estimated fit. Huber (1981) employed the squared error for small residuals and the absolute error for large residuals. Andrews (1974) used a squared sine function for small and a constant for large residuals. Beaton and Tukey's (1974) biweight is another example of these ρ functions. The M-estimates of the parameters are obtained by converting the minimization

    min Σ_i ρ(r_i)        (2)

into a weighted least squares problem which is then handled by available subroutines. The weights depend on the assumed ρ function and the data. The reliability of the initial guess is of importance, and convergence of the solution is not proved for most of the ρ functions. Holland and Welsch (1977) developed an iteratively reweighted least squares implementation.
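To make the reduction of (2) to reweighted least squares concrete, here is a minimal one-parameter sketch (our own illustration, not the authors' code): a location estimate under Huber's ρ, iterated as weighted averaging. The function names and the fixed-scale handling are our simplifications; the tuning constant 1.345 is the Huber value cited later in the article.

```python
from statistics import median

def huber_weight(r, c=1.345):
    # Weight derived from Huber's rho: 1 in the least squares region,
    # c/|r| in the least-absolute-values region.
    return 1.0 if abs(r) <= c else c / abs(r)

def m_estimate_location(data, iters=25):
    # Iteratively reweighted least squares for a single location
    # parameter: each step solves the weighted normal equation
    # theta = sum(w_i * x_i) / sum(w_i).
    theta = median(data)                          # robust initial guess
    mad = median(abs(x - theta) for x in data)    # median absolute deviation
    scale = 1.4826 * mad if mad > 0 else 1.0      # kept fixed over iterations
    for _ in range(iters):
        w = [huber_weight((x - theta) / scale) for x in data]
        theta = sum(wi * xi for wi, xi in zip(w, data)) / sum(w)
    return theta

samples = [4.8, 5.1, 5.0, 4.9, 5.2, 5.05, 4.95, 50.0]  # one gross outlier
print(m_estimate_location(samples))
```

The gross outlier receives a tiny weight after the first iteration, so the estimate settles near the bulk of the data instead of near the contaminated mean.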
The presence of noise corrupting all the samples, however, may create severe artifacts. Since this is the case in numerous computer vision applications, in the next section we discuss the problem in detail.

4 Analysis of the LMedS Technique: Comparison with RANSAC

The LMedS estimates are computed by minimizing the cost function (7) through a search in the space of possible solutions. Subsets of the data are chosen by random sampling, and for each subset a model is estimated. The number of points in the subset is equal to the number of unknown model parameters, thus yielding closed-form solutions. If the number of points used is increased, in which case the model must be estimated by least squares, no improvement in the final LMedS estimates can be expected (Meer et al. 1990).

The search technique employed to find the LMedS estimates (projection pursuit) is common to several high breakdown point robust estimators (Rousseeuw & Leroy 1987). A similar processing principle was proposed independently by Fischler and Bolles (1981) for solving computer vision problems in the RANSAC paradigm. They too compute a model by solving a system of equations defined for a randomly chosen subset of points. All the data is then classified relative to this model. The points within some error tolerance are called the consensus set of the model. If the cardinality of the consensus set exceeds a threshold, the model is accepted and its parameters are recomputed based on the whole consensus set. If the model is not accepted, a new set of points is chosen and the resulting model is tested for validity. The error tolerance and the consensus set acceptance threshold must be set a priori.

In LMedS, the computed model is allocated an error measure, the median of the squared residuals. Several models are tried and the one yielding the minimum error is retained. The points are classified into inliers and outliers only relative to this final fit. The LMedS model can then be refined by a least squares procedure on the inlier data points. By its definition the LMedS estimator will always return at least 50 percent of the data points as inliers. The error tolerance is set automatically by the estimated robust standard deviation.

The LMedS estimator has a 0.5 breakdown point. Without a priori information about the structure of the data, in RANSAC the size of an acceptable consensus set also cannot be less than half the data size. For example, suppose that a constant must be estimated from data having only two values. Then, if the cardinality threshold on the consensus set is less than half the data size, the RANSAC procedure may validate either of the two constants.

To recover a model representing the relative and not the absolute majority in the data, additional procedures should be implemented. Bolles and Fischler (1981) proposed the use of nonparametric tests for detecting the "white" structure of residuals. A decomposition technique applied in both the spatial and parameter domains (Mintz et al. 1990) also succeeds in returning reliable estimates when less than half the data carry the model.

The maximum number of subsets required to validate a model for a tolerated probability of error is established by the same relation (12) in both LMedS and RANSAC. In LMedS, the algorithm will process that many p-tuples; in RANSAC, consensus may be found earlier. However, if a tolerated residual error threshold is used in LMedS, a stopping criterion similar to that of RANSAC is obtained. Using statistical terminology, RANSAC maximizes the number of inliers while seeking the best model, and LMedS minimizes a robust error measure of that model. The two criteria are not mathematically equivalent, but since in the general case at least half of the data should become inliers, they yield very similar results.

We conclude that LMedS and RANSAC, in spite of being independently developed in different research areas, are based on similar concepts. The only difference of significance is that the LMedS technique generates the error measure during the estimation procedure, while RANSAC must be supplied with it. This is an important feature when the noise is not homogeneous, that is, when the model is selectively corrupted.

The robustness of both LMedS and RANSAC in the presence of severe deviations from the sought model is contingent upon the existence of at least one subset of data carrying the correct model. When noise corrupts all the data (e.g., Gaussian noise), the quality of the initial model estimates degrades and could lead to incorrect decisions. The problem is emphasized when the model contains more parameters to represent the data than is necessary, that is, when the order of the model is incorrect.

Consider again the two-valued data case, specifically a step edge of amplitude h, the difference between the two values. Assume the data is evenly split, that is, ε is close to 0.5. All the data is corrupted by a significant, zero-mean noise process having values with significant probability in (−a, a), with a close to h/2. Let the RANSAC error threshold be a, and recall that this value must be chosen a priori. A first-order model is employed (p = 3) and thus planar fits are sought. The noise makes the models computed from triplets of data points unreliable. A model representing a tilted plane (similar to what the nonrobust least squares procedure would recover from all the data) already yields an absolute majority of points within the error tolerance, and is accepted by RANSAC. Reducing the error threshold decreases the expected number of points within the bounds, and thus a satisfactory consensus set size may not be obtained for the correct model. With high probability RANSAC will return an incorrect decision.

The LMedS procedure will also err. Most of the squared residuals relative to a tilted plane are between (0, a²) and thus the median of the squared residuals is much less than a². Relative to the correct, horizontal plane slightly more than half of the squared residuals are between (0, a²), with the median being close to a². Thus the LMedS algorithm too will prefer the tilted plane to the correct solution. For experimental data on this artifact of the LMedS estimators, and how it can be eliminated by a two-stage procedure, see Meer et al. (1990) and Mintz et al. (1990).

The robustness of the RANSAC paradigm, and the high breakdown point of the LMedS estimators, are thus meaningful only in situations where at least half of the data lie close to the desired model. For LMedS, in this case projection into the β₀ direction results in a well-defined mode, and there is no need for a complete projection pursuit procedure in which the optimum projection direction is also sought.

When LMedS local operators are applied to model discontinuities, the number of tolerated outliers decreases. Consider again the noiseless, ideal step edge to which a 5×5 robust local operator is applied. Assume that 10 pixels in the window belong to the edge (high amplitude) and 15 to the background (low amplitude), and that the center of the window falls on a background pixel. The operator returns the value of the majority of pixels, that is, the low amplitude of the background. The image is then corrupted with fraction ε = 0.2 of asymmetric noise driving the corrupted samples into saturation at the upper bound. Without loss of generality we can assume that only 3 of the pixels belonging to the background were corrupted in the processing window. There are now 13 pixels with high amplitudes and the operator returns, incorrectly, a high value similar to the amplitude of the edge. Thus, even when the fraction of corrupted points is much smaller than the theoretical breakdown point of a robust estimator, the operator may systematically fail near transitions between homogeneous regions in images. At transitions, samples of one region are outliers (noise) when fitting a model to the other region, and a small fraction of additional noise may reverse the class having the majority. The size of the local operator also limits ε, the maximum amount of tolerated contamination, but this artifact has less importance for images where the window operators have relatively large supports.

It should not be concluded from this section that high breakdown-point estimation procedures are not useful for computer vision. We have emphasized their sensitivity to piecewise models, that is, to data containing discontinuities, in order to increase awareness of possible pitfalls. In section 2.1 several successful applications of the LMedS estimators were mentioned. Recently we have proposed a new technique for noisy-image analysis in the piecewise polynomial domain combining simple nonrobust and robust processing modules. The method has a high breakdown point but avoids the problems discussed above. In the next section we use a simple application of the LMedS estimators, window-operator-based smoothing, to compare its performance with M-estimators.

5 Image Smoothing: LMedS vs. M-estimators

Signals unchanged by the application of an operator are called root signals of that operator. The existence of root signals assures the convergence of iterative nonlinear filtering procedures (Fitch et al. 1985). Their importance is also recognized in computer vision (e.g., Haralick & Watson 1981; Owens et al. 1989). In one dimension, a noiseless piecewise polynomial signal is a root signal of an LMedS-based window operator of degree equal to the highest polynomial degree. This property is in fact true for any 0.5 breakdown point estimator. Indeed, since the input is an ideal piecewise polynomial signal, a contiguous group of ⌈n/2⌉ pixels always carries information about the same polynomial. The 0.5 breakdown point assures that the estimated regression coefficients are those of this polynomial. The value returned by the operator is the value of the polynomial at the window center and is identical with the input. The noiseless one-dimensional piecewise polynomial signal remains undistorted. This root-signal property is not necessarily valid in two dimensions. When the window is centered on a corner, it is not always the case that 51% of the pixels belong to the surface containing the window center. Before proceeding to compare the root-signal properties of the M-estimators and the LMedS estimator, we give a short description of the former.

M-estimators have already been introduced in section 2. The minimization problem (2) is reduced to reweighted least squares by differentiating (2) with respect to the sought regression coefficients. After some simple manipulations, the following expression for the weights is obtained:

    w_i = (1/r_i) dρ(r_i)/dr_i        (16)

where r_i is the residual of the ith data point. Since the ρ functions have a minimum at r_i = 0, expression (16) can always be scaled to yield w_i = 1 at the origin. All the weights are positive and between 0 and 1. Different ρ functions result in different weights. We consider here two such functions.

The first robust M-estimator was proposed by Huber in 1964 (see Huber 1981). It is known as the minimax M-estimator and has least squares behavior for small residuals, and the more robust least-absolute-values behavior for large residuals. The change in behavior is controlled by a tuning constant c_H and the locally estimated noise standard deviation

    σ̂ = 1.4826 med |r_i − med r_i|        (17)

where med denotes the median taken over the entire window, and the factor 1.4826 compensates for the bias of the median estimator in Gaussian noise. Note that (17) is a robust estimator of σ and can tolerate up to half the data being corrupted. The minimax weight function has the expression

    w_{H,i} = 1                  if |r_i| ≤ c_H σ̂
    w_{H,i} = c_H σ̂ / |r_i|     if |r_i| > c_H σ̂        (18)

and is shown in figure 1b together with its ρ function (figure 1a). Note that the constant weight corresponds to the central (least squares) region, while in the least-absolute-values region the large residuals have hyperbolically decreasing weights. To achieve 95% asymptotic efficiency for Gaussian noise, Holland and Welsch (1977) recommended c_H = 1.345.

In the class of redescending M-estimators, the large residuals have zero weights, thus improving the outlier rejection properties. The biweight

    w_{B,i} = [1 − (r_i / c_B σ̂)²]²     if |r_i| ≤ c_B σ̂
    w_{B,i} = 0                          if |r_i| > c_B σ̂        (19)

proposed by Beaton and Tukey (1974) is a well-known example of this class. The weight function is shown in figure 2b and the ρ function it was derived from in figure 2a. Holland and Welsch (1977) recommended c_B = 4.685 to assure superior performance for Gaussian noise. This tuning constant value was obtained from a Monte Carlo study employing homogeneous data (i.e., from one model). In computer vision we often deal with piecewise models (i.e., data with discontinuities), and smaller tuning constants should be used to discriminate the transitions. In our experiments we took c_B = 2.

The weights cannot be computed without a standard deviation estimate, which in turn requires the estimation of an initial fit. The quality of the initial fit is of paramount importance, especially for redescending M-estimators. The use of robust estimators like least absolute deviations or LMedS is recommended (Hampel et al. 1986). We are interested in the outlier rejection capability of the M-estimators, since this property assures the undistorted recovery of the input. Therefore, to facilitate the comparison between M-estimators and LMedS, we employ an unweighted least squares procedure (all w_i = 1) to obtain the initial M-estimates. The residuals relative to this fit can be computed and the standard deviation estimate obtained. Each data point is then allocated a weight, and in the next iteration the new set of parameters is obtained by weighted least squares. The standard deviation estimate σ̂ should not be updated during the iterations (Holland & Welsch 1977). At consecutive iterations, samples yielding large residuals (outliers) have their weights reduced, and the estimates are computed mostly from values distributed around the true surface. Often for the latter samples a Gaussian distribution is assumed.
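The random-sampling LMedS search of section 4 can be sketched for the simplest case, a straight line (p = 2): each 2-point subset yields a closed-form fit, and the median of the squared residuals over all the data serves as the cost. The subset count follows the probability relation shared with RANSAC (relation (12) in the text). This is our own illustration; the function name and constants are not from the paper.

```python
import math
import random
from statistics import median

def lmeds_line(points, p_clean=0.999, eps=0.4):
    # Number of random 2-point subsets so that, with probability
    # p_clean, at least one subset is outlier-free when a fraction
    # eps of the data is contaminated (same relation as in RANSAC).
    p = 2
    m = math.ceil(math.log(1.0 - p_clean) /
                  math.log(1.0 - (1.0 - eps) ** p))

    best_fit, best_cost = None, float("inf")
    for _ in range(m):
        (x1, y1), (x2, y2) = random.sample(points, p)
        if x1 == x2:
            continue                    # degenerate subset, resample
        b1 = (y2 - y1) / (x2 - x1)      # closed-form fit from p points
        b0 = y1 - b1 * x1
        # LMedS cost: median of the squared residuals over ALL data.
        cost = median((y - b0 - b1 * x) ** 2 for x, y in points)
        if cost < best_cost:
            best_fit, best_cost = (b0, b1), cost
    return best_fit

random.seed(1)
# 30 inliers on y = 2 + 0.5x plus 15 gross outliers.
pts = [(float(x), 2.0 + 0.5 * x + random.gauss(0.0, 0.05)) for x in range(30)]
pts += [(random.uniform(0.0, 30.0), random.uniform(20.0, 40.0)) for _ in range(15)]
b0, b1 = lmeds_line(pts)
print(b0, b1)
```

Because the 15 outliers are a minority, the 23rd-smallest squared residual for the correct line falls on an inlier, so the median cost stays tiny for outlier-free subsets and large for contaminated ones; the final fit could then be refined by least squares on the detected inliers, as the text describes.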
Fig. 1. The Huber minimax M-estimator. (a) The ρ function. (b) The weight function.
Fig. 2. The biweight M-estimator. (a) The ρ function. (b) The weight function.

The following quality measure was used by us as a stopping criterion for the iterations:

    E^(l) = Σ_{i=1}^{n} ρ(r_i^(l))        (20)

where the index l denotes the current iteration. Convergence is achieved if

    |E^(l) − E^(l−1)| < 0.001,   l = 1, 2, ...        (21)

The number of allowed iterations was bounded by 25.

A 100×100 synthetic image containing several polyhedral objects was used as an example. Only planar surfaces are present, joined at either step or roof edges. In figure 3a the gray-level and in figure 3b the wireframe representation of the image is shown. The image was smoothed with 5×5 window operators by replacing the value of the center pixel with the intercept of the estimated plane. Note that as an artifact of this simple smoothing procedure a systematic error is introduced at the corners, where the center pixel does not belong to the surface containing the majority.

Fig. 3. A synthetic 100×100 image. (a) Gray-level representation. (b) Wireframe representation.

We have used least squares (figure 4a), the minimax M-estimator (figure 4b), the biweight M-estimator (figure 4c), and the LMedS estimator (figure 4d) in the window. The M-estimators did not converge for less than one percent of the pixels. For each method the smoothing error, defined as the square root of the sum of the squared differences between the input and output values of a pixel, was computed. (See table 1.)

Table 1. Root mean square smoothing errors.

Method:  Least Squares | Minimax | Biweight | LMedS
Error:   944           | 922     | 855      | 392

As is well known, the least squares fit fails at discontinuities and significant blurring of the input occurs. This fit is also the initial estimate for the M-estimators. It yields an incorrect, high standard deviation estimate, and the iterative procedure cannot completely recover the input. The two M-estimators also blur edges, although to a lesser extent. The minimax M-estimator has nonzero weights for all the pixels and thus introduces slightly more blurring at edges than the biweight estimator. All the errors made by the LMedS estimator are due to the artifact at the corners. This artifact can be eliminated by cooperative processes (Meer et al. 1990; Mintz et al. 1990).

Thus, due to its high breakdown point, only the LMedS window operator succeeds in preserving a piecewise polynomial image. This property is useful when the operator is used as a nonlinear interpolator. We have compared in this section the raw performances of the estimators. Their properties can be improved when using them as building blocks in more complex algorithms. The smoothing performance of the M-estimators is corrected when several weight functions are employed at successive iterations and the model order is sequentially increased under the control of a quality measure (Besl et al. 1988). Similarly, better tolerance of the LMedS smoothing operators to noise corrupting all the samples can be achieved (Mintz et al. 1990).

6 Conclusion

We have presented a review of the different robust regression methods with special emphasis on the least median of squares estimators. We did not present applications since most of them involve technical details beyond the goal of this article. A literature survey of computer vision algorithms employing robust techniques was given in section 2.1.

The desirable high breakdown point of the LMedS estimators is contingent upon the sought model being represented by weakly corrupted data. Using a probabilistic speed-up technique, the computation of LMedS estimates is feasible, although more demanding than that of M-estimates. The latter, however, require a reliable initial estimate, possibly the output of an LMedS algorithm. We conclude that application of robust estimators to real computer vision problems, with noisy data not perfectly representing the assumed model, should be preceded by a careful analysis of the problem. If a clear dichotomy exists between the "good" and "bad" data (as is usually the case in statistics), methods like LMedS are extremely useful. If good and bad data cannot be discriminated, additional precautions should be taken.

Acknowledgments

We would like to thank the anonymous reviewers whose observations substantially improved the presentation of the paper. The support of the Defense Advanced Research Projects Agency and U.S. Army Night Vision and Electro-Optics Laboratory under Grant DAAB07-86-K-F073, and of the National Science Foundation under Grant DCR-86-03723, is gratefully acknowledged.

References

Andrews, D.F. 1974. A robust method for multiple linear regression. Technometrics 16:523-531.
Beaton, A.E., and Tukey, J.W. 1974. The fitting of power series, meaning polynomials, illustrated on band-spectroscopic data. Technometrics 16:147-185.
Fig. 4. Window smoothing of the image in figure 3. (a) Least squares operator. (b) Minimax M-estimator. (c) Biweight M-estimator. (d) LMedS estimator.
Besl, P.J., Birch, J.B., and Watson, L.T. 1988. Robust window operators. Proc. 2nd Intern. Conf. Comput. Vision, Tampa, FL, pp. 591-600. See also Mach. Vision Appl. 2:179-214, 1989.
Bolles, R.C., and Fischler, M.A. 1981. A RANSAC-based approach to model fitting and its application to finding cylinders in range data. Proc. 7th Intern. Joint Conf. Artif. Intell., Vancouver, Canada, pp. 637-643.
Bovik, A.C., Huang, T.S., and Munson, Jr., D.C. 1987. The effect of median filtering on edge estimation and detection. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-9:181-194.
Brown, G.W., and Mood, A.M. 1951. On median tests for linear hypotheses. Proc. 2nd Berkeley Symp. Math. Stat. Prob., J. Neyman (ed.), University of California Press: Berkeley and Los Angeles, pp. 159-166.
Cheng, K.S., and Hettmansperger, T.P. 1983. Weighted least-squares rank estimates. Commun. Stat. A12:1069-1086.
Coyle, E.J., Lin, J.H., and Gabbouj, M. 1989. Optimal stack filtering and the estimation and structural approaches to image processing. IEEE Trans. Acoust., Speech, Sig. Process. 37:2037-2066.
Edelsbrunner, H., and Souvaine, D.L. 1988. Computing median-of-squares regression lines and guided topological sweep. Report UIUCDCS-R-88-1483, Department of Computer Science, University of Illinois at Urbana-Champaign.
Efron, B. 1988. Computer-intensive methods in statistical regression. SIAM Review 30:421-449.
Fischler, M.A., and Bolles, R.C. 1981. Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Commun. ACM 24:381-395.
Fitch, J.P., Coyle, E.J., and Gallagher, Jr., N.C. 1985. Root properties and convergence rates of median filters. IEEE Trans. Acoust., Speech, Sig. Process. 33:230-240.
Hampel, F.R., Ronchetti, E.M., Rousseeuw, P.J., and Stahel, W.A. 1986. Robust Statistics: An Approach Based on Influence Functions. Wiley: New York.
Haralick, R.M., and Joo, H. 1988. 2D-3D pose estimation. Proc. 9th Intern. Conf. Patt. Recog., Rome, pp. 385-391. See also IEEE Trans. Syst., Man, Cybern. 19:1426-1446, 1989.
Haralick, R.M., and Watson, L. 1981. A facet model for image data. Comput. Graph. Image Process. 15:113-129.
Heiler, S. 1981. Robust estimates in linear regression - A simulation approach. In H. Büning and P. Naeve (eds.), Computational Statistics. De Gruyter: Berlin, pp. 115-136.
Holland, P.W., and Welsch, R.E. 1977. Robust regression using iteratively reweighted least squares. Commun. Stat. A6:813-828.
Huber, P.J. 1981. Robust Statistics. Wiley: New York.
Jaeckel, L.A. 1972. Estimating regression coefficients by minimizing the dispersion of residuals. Ann. Math. Stat. 43:1449-1458.
Johnstone, I.M., and Velleman, P.F. 1985. The resistant line and related regression methods. J. Amer. Stat. Assoc. 80:1041-1059.
Jolion, J.M., Meer, P., and Rosenfeld, A. 1990a. Generalized minimum volume ellipsoid clustering with applications in computer vision. Proc. Intern. Workshop Robust Comput. Vision, Seattle, WA, pp. 339-351.
Jolion, J.M., Meer, P., and Bataouche, S. 1990b. Range image segmentation by robust clustering. CAR-TR-500, Computer Vision Laboratory, University of Maryland, College Park.
Kamgar-Parsi, B., Kamgar-Parsi, B., and Netanyahu, N.S. 1989. A nonparametric method for fitting a straight line to a noisy image. IEEE Trans. Patt. Anal. Mach. Intell. PAMI-11:998-1001.
Kashyap, R.L., and Eom, K.B. 1988. Robust image models and their applications. In Advances in Electronics and Electron Physics, vol. 70. Academic Press: San Diego, CA, pp. 79-157.
Kim, D.Y., Kim, J.J., Meer, P., Mintz, D., and Rosenfeld, A. 1989. Robust computer vision: A least median of squares based approach. Proc. DARPA Image Understanding Workshop, Palo Alto, CA, pp. 1117-1134.
Koivo, A.J., and Kim, C.W. 1989. Robust image modeling for classification of surface defects on wood boards. IEEE Trans. Syst., Man, Cybern. 19:1659-1666.
Kumar, R., and Hanson, A.R. 1989. Robust estimation of camera location and orientation from noisy data having outliers. Proc. Workshop on Interpretation of 3D Scenes, Austin, TX, pp. 52-60.
Lee, C.N., Haralick, R.M., and Zhuang, X. 1989. Recovering 3-D motion parameters from image sequences with gross errors. Proc. Workshop on Visual Motion, Irvine, CA, pp. 46-53.
Li, G. 1985. Robust regression. In D.C. Hoaglin, F. Mosteller, and J.W. Tukey (eds.), Exploring Data Tables, Trends and Shapes. Wiley: New York, pp. 281-343.
Meer, P., Mintz, D., and Rosenfeld, A. 1990. Least median of squares based robust analysis of image structure. Proc. DARPA Image Understanding Workshop, Pittsburgh, PA, pp. 231-254.
Mintz, D., and Amir, A. 1989. Generalized random sampling. CAR-TR-441, Computer Vision Laboratory, University of Maryland, College Park.
Mintz, D., Meer, P., and Rosenfeld, A. 1990. A fast, high breakdown point robust estimator for computer vision applications. Proc. DARPA Image Understanding Workshop, Pittsburgh, PA, pp. 255-258.
Mosteller, F., and Tukey, J.W. 1977. Data Analysis and Regression. Addison-Wesley: Reading, MA.
Owens, R., Venkatesh, S., and Ross, J. 1989. Edge detection is a projection. Patt. Recog. Lett. 9:233-244.
Press, W.H., Flannery, B.P., Teukolsky, S.A., and Vetterling, W.T. 1988. Numerical Recipes. Cambridge University Press: Cambridge, England.
Rousseeuw, P.J. 1984. Least median of squares regression. J. Amer. Stat. Assoc. 79:871-880.
Rousseeuw, P.J., and Leroy, A.M. 1987. Robust Regression & Outlier Detection. Wiley: New York.
Siegel, A.F. 1982. Robust regression using repeated medians. Biometrika 69:242-244.
Steele, J.M., and Steiger, W.L. 1986. Algorithms and complexity for least median of squares regression. Discrete Appl. Math. 14:93-100.
Theil, H. 1950. A rank-invariant method of linear and polynomial regression analysis (parts 1-3). Ned. Akad. Wetensch. Proc. Ser. A 53:386-392, 521-525, 1397-1412.
Tirumalai, A., and Schunck, B.G. 1988. Robust surface approximation using least median squares regression. CSE-TR-13-89, Artificial Intelligence Laboratory, University of Michigan, Ann Arbor.