Boundary Detection by Constrained Optimization ‘ DONALD GEMAN, MEMBER, IEEE, STUART GEMAN, MEMBER, IEEE, CHRISTINE GRAFFIGNE, and PING DONG, MEMBER, IEEE
Total Page:16
File Type:pdf, Size:1020Kb
IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. VOL. 12. NO. 7, JULY 1990 609 Boundary Detection by Constrained Optimization ‘ DONALD GEMAN, MEMBER, IEEE, STUART GEMAN, MEMBER, IEEE, CHRISTINE GRAFFIGNE, AND PING DONG, MEMBER, IEEE Abstract-We use a statistical framework for finding boundaries and tations related to image segmentation: partition and for partitioning scenes into homogeneous regions. The model is a joint boundary labels. These, in tum, are special cases of a probability distribution for the array of pixel gray levels and an array of “labels.” In boundary finding, the labels are binary, zero, or one, general “label model” in the form of an “energy func- representing the absence or presence of boundary elements. In parti- tional’’ involving two components, one of which ex- tioning, the label values are generic: two labels are the same when the presses the interactions between the labels and the (inten- corresponding scene locations are considered to belong to the same re- sity) data, and the other encodes constraints derived from gion. The distribution incorporates a measure of disparity between cer- general information or expectations about label patterns. tain spatial features of pairs of blocks of pixel gray levels, using the Kolmogorov-Smirnov nonparametric measure of difference between The labeling we seek is, by definition, the minimum of the distributions of these features. Large disparities encourage inter- this energy. vening boundaries and distinct partition labels. The number of model The partition labels do not classify. Instead they are parameters is minimized by forbidding label configurations that are in- generic and are assigned to blocks of pixels; the size of consistent with prior beliefs, such as those defining very small regions, the blocks (or label resolution) is variable, and depends or redundant or blindly ending boundary placements. Forbidden con- figurations are assigned probability zero. We examine the MAP (mar- on the resolution of the data and the intended interpreta- imum a posterion’) estimator of boundary placements and partition- tions. The boundary labels are just “on”/“off ,” and are ings. The forbidden states introduce constraints into the calculation of also of variable resolution, associated with an interpixel these configurations. Stochastic relaxation methods are extended to ac- sublattice. In both cases, the interaction term incorporates commodate constrained optimization, and experiments are performed a measure of disparity between certain spatial features of on some texture collages and some natural scenes. pairs of blocks of pixel gray levels, using the Kolmogo- i Zndex Terms-Annealing, Bayesian inference, boundary finding, rov-Smirnov nonparametric measure of difference be- constrained optimization, Gibbs distribution, MAP estimate, Markov tween the distributions of these features. Large disparities random field, segmentation, stochastic relaxation, texture discrimina- encourage intervening boundaries and distinct partition tion. labels. The number of model parameters is reduced by forbidding label configurations that are inconsistent with I. INTRODUCTION prior expectations, such as those defining very small re- ANY problems in image analysis, from signal res- gions, or redundant or blindly ending boundary place- Mtoration to object recognition, involve representa- ments. These forbidden states introduce constraints into tions of the observed data, usually radiant energy or range the calculation of the optimal label configuration. measurements, in terms of unobserved attributes or label Both models are applied mainly to the problem of tex- variables. These representations may be conclusive, or ture segmentation. The data is a gray-level image con- serve as intermediate data structures for further analysis, sisting of textured regions, such as a mosaic of microtex- ’ perhaps involving additional data, assorted “sketches,” tures from the Brodatz album [6], a patch of rug inside or stored models. This work concems two such represen- plastic, or radar-imaged ice floes in water. It is well- known that humans perceive “textural” boundaries be- Manuscript received April 18, 1988; revised September 14, 1989. Rec- tween regions of approximately the same average bright- ommended for acceptance by A. K. Jain. The work of D. Geman and P. Dong was supported in part by the Office of Naval Research under Contract ness because the regions themselves, although containing “14-86-K-0027. The work of S. Geman and C. Graffigne was sup- sharp intensity changes, are perceived as “homogene- ! ported in part by the Army Research Office under Contract DAAL03-86- ous” based on other properties, namely those involving K-0171 to the Center for Intelligent Control Systems, by the National Sci- ence Foundation under Grant DMS-8352087, and by the General Motors the spatial distribution of intensity values. The goal, then, Research Laboratories. is to find the (visually) distinct texture regions, either by D. Geman is with the Department of Mathematics and Statistics, Uni- assigning categorical labels to the pixels, or by construct- versity of Massachusetts, Amherst, MA 01003. S. Geman is with the Division of Applied Mathematics, Brown Univer- ing a boundary map. Obviously, the problem is more dif- sity, Providence, RI 02912. ficult than texture classification, in which we are pre- C. Graffigne was with the Division of Applied Mathematics, Brown sented with only one texture from a given list; , University, Providence, RI 02912. She is now with the French National Center for Scientific Research (CNRS), Equipe de Statistique Appliquke, segmentation may be complicated by an absence of infor- Bitiment 425, UniversitC de Paris-Sud, 91405 Orsay Cedex, France. mation about the number of textures, or about the size, , P. Dong was with the Department of Mathematics and Statistics, Uni- shape, or number of regions. versity of Massachusetts, Amherst, MA 01003. He is now with Codex, Mansfield. MA 02048. There is no “model” for the individual textures, and IEEE Log Number 90361 12. hence no capacity for texture synthesis. Our approach is 0162-8828/90/0700-0609$01.OO 0 1990 IEEE 610 IEEE TRANSACTIONS ON PATTERN ANALYSIS AND MACHINE INTELLIGENCE. VOL. 12. NO. 7. JULY 1990 therefore different from those for classification and seg- Apparently, some of these ideas have been around for mentation in which textures are discriminated based on a while. For example, the Kolmogorov-Smirnov statistic model parameters which are estimated from specific tex- is recommended in [51], and reference is made to still ture samples. Partitionings and boundary placements are earlier papers; more recently, see [60]. Moreover, the dis- driven by the observed spatial statistics as summarized by tributional properties of residuals (from surface-fitting) selected features. Still, the labeling is not unsupervised are advocated in [29], [48] for detecting discontinuities. because in some cases we use “training samples” to de- It is certainly our contention that the statistical warehouse termine feature thresholds for the disparity measures; see is full of useful tools for computer vision. Sections I1 and 111. Finally, our model may be interpreted in a Bayesian Since many visually distinct textures have nearly iden- framework as a “prior” joint probability distribution for tical histograms, segmentation must rely on features, or the array of pixel gray levels and the array of labels. For- transformations, which go beyond the raw gray levels, bidden configurations are assigned probability zero, and involving various spatial statistics. We experimented with the label estimate is then associated with the MAP (max- several conventional classes of features, in particular the imum a posteriori) estimate. We shall discuss this inter- well-known ones based on cooccurrence matrices [ 161, pretation in more detail later on. Suffice to say that, [3 13, but finally adopted a new class based mainly on “di- whereas our formulation of the problem and definition of rectional residuals” and involving third and higher order the “best” labeling are formally independent of the sto- distributions. However, our viewpoint is exactly that ex- chastic outlook, our optimization procedures are in fact pressed by Zobrist and Thompson [62], Triendl and Hen- strongly motivated by this viewpoint. In fact, in order to dersen [59], and others: instead of trying to find exactly deal with the more difficult textures, the deterministic the “right” features to convert texture to tone differences, (‘ ‘zero-temperature”) relaxation algorithm we use for one should find a mechanism for integrating multiple, most of our experiments must be replaced by a version of even redundant, ‘‘cues.’’ The label model provides a co- stochastic relaxation extended to accommodate con- herent method for integrating such information and, strained sampling and optimization. simultaneously, organizing the label patterns. Whereas texture discrimination may be regarded as the A. Applications detection of discontinuities in surface composition, we Texture is a dominant feature in remotely sensed im- also apply the boundary model to the problem of locating ages and regions cannot be distinguished by methods sudden changes in depth (occluding boundaries) or shape based solely on shading, such as edge detectors or clus- (surface creases, etc.). The idea is to define contours tering algorithms. Specifically, for example, one might which are faithful to the 3-D scene but avoid the “non- wish