Snakes: Active Contour Models
Total Page:16
File Type:pdf, Size:1020Kb
International Journal of Computer Vision, 321-331 (1988) o 1987 KIuwer Academic Publishers, Boston, Manufactured in The Netherlands Snakes: Active Contour Models MICHAEL KASS, ANDREW WITKIN, and DEMETRI TERZOPOULOS Schlumberger Palo Alto Research, 3340 Hillview Ave., Palo Alto, CA 94304 Abstract A snake is an energy-minimizing spline guided by external constraint forces and influenced by image forces that pull it toward features such as lines and edges. Snakes are active contour models: they lock onto nearby edges, localizing them accurately. Scale-space continuation can be used to enlarge the cap- ture region surrounding a feature. Snakes provide a unified account of a number of visual problems, in- cluding detection of edges, lines, and subjective contours; motion tracking; and stereo matching. We have used snakes successfully for interactive interpretation, in which user-imposed constraint forces guide the snake near features of interest. 1 Introduction organizations. By adding suitable energy terms to the minimization, it is possible for a user to push In recent computational vision research, low- the model out of a local minimum toward the level tasks such as edge or line detection, stereo desired solution. The result is an active model matching, and motion tracking have been widely that falls into the desired solution when placed regarded as autonomous bottom-up processes. near it. Marr and Nishihara [ 111, in a strong statement of Energy minimizing models have a rich history this view, say that up to the 2.5D sketch, no in vision going back at least to Sperling’s stereo “higher-level” information is yet brought to bear: model [16]. Such models have typically been the computations proceed by utilizing only what regarded as autonomous, but we have developed is available in the image itself. This rigidly se- interactive techniques for guiding them. Interact- quential approach propagates mistakes made at ing with such models allows us to explore the en- a low level without opportunity for correction. It ergy landscape very easily and develop effective therefore imposes stringent demands on the reli- energy functions that have few local minima and ability of low-level mechanisms. As a weaker but little dependence on starting points. We hope more attainable goal for low-level processing, we thereby to make the job of high-level interpreta- argue that it ought to provide sets of alternative tion manageable yet not constrained un- organizations among which higher-level pro- necessarily by irreversible low-level decisions. cesses may choose, rather than shackling them The problem domain we address is that of prematurely with a unique answer. finding salient image contours-edges, lines, and In this paper we investigate the use of energy subjective contours-as well as tracking those minimization as a framework within which to contours during motion and matching them in realize this goal. We seek to design energy func- stereopsis. Our variational approach to finding tions whose local minima comprise the set of image contours differs from the traditional ap- alternative solutions available to higher-level proach of detecting edges and then linking them. processes. The choice among these alternatives In our model, issues such as the connectivity of could require some type of search or high-level the contours and the presence of corners affect reasoning. In the absence of a well-developed the energy functional and hence the detailed high-level mechanism, however, we use an in- structure of the locally optimal contour. These teractive approach to explore the alternative issues can, in principle, be resolved by very high- 322 Kass, Witkin, and Terzopoulos (4 (4 Fig. 1. Lower-left: Original wood photograph from Brodatz. Others: Three different local minima for the active contour model. level computations. Perhaps more importantly, Without detailed knowledge about the object high-level mechanisms can interact with the con- in view, it is difficult to justify a choice among the tour model by pushing it toward an appropriate three interpretations. Knowing that wood is a local minimum. Optimization and relaxation layered structure, or perhaps inferring its layered have been used previously in edge and line detec- structure from elsewhere in the picture could tion, [3,5,13,24,25], but without the interactive help to rule out interpretation (b). Beyond that, guiding used here. the ‘correct’ interpretation could be very task de- In many image interpretation tasks, the correct pendent. In many domains, such as analyzing interpretation of low-level events can require seismic data, the choice of interpretation can de- high-level knowledge. Consider, for example, the pend on expert knowledge. Different seismic in- three perceptual organizations of two dark lines terpreters can derive significantly different per- in figure 1. The three different organizations cor- ceptual organizations from the same seismic respond to three different local minima in our sections depending on their knowledge and line-contour model. It is important to notice that training. Because a single ‘correct’ interpretation the shapes of the lines are materially different cannot always be defined, we suggest low-level in the three examples, not just because of a dif- mechanisms which seek appropriate local min- ferent linking of line segments. The segments ima instead of searching for global minima. themselves are changed by the perceptual Unlike most other techniques for finding organization. salient contours, our model is active. It is always Snakes 323 minimizing its energy functional and therefore toward salient image features like lines, edges, exhibits dynamic behavior. Because of the way and subjective contours. The external constraint the contours slither while minimizing their en- forces are responsible for putting the snake near ergy, we call them snakes. Changes in high-level the desired local minimum. These forces can, for interpretation can exert forces on a snake as it example, come from a user interface, automatic continues its minimization. Even in the absence attentional mechanisms, or high-level interpre- of such forces, snakes exhibit hysteresis when ex- tations. posed to moving stimuli. Representing the position of a snake paramet- Snakes do not try to solve the entire problem of rically by v(s) = (x(s), y(s)), we can write its energy finding salient image contours. They rely on functional as other mechanisms to place them somewhere near the desired contour. However, even in cases where no satisfactory automatic starting mech- anism exists, snakes can still be used for semi- automatic image interpretation. If an expert user = Io1 Ei,t(v(s)) + Eim&V(S)) pushes a snake close to an intended contour, its energy minimization will carry it the rest of the + Eccintv(s))ds (1) way. The minimization provides a ‘power assist’ where Eint represent the internal energy of the for a person pointing to a contour feature. spline due to bending, Eimage gives rise to the Snakes are an example of a more general tech- image forces, and E,,, gives rise to the external nique of matching a deformable model to an constraint forces. In this section, we develop Eini image by means of energy minimization. In spirit and give examples of E,,, for interactive inter- and motivation, this idea shares much with the pretation. Eimage is developed in section 3. rubber templates of Widrow [23]. From any start- ing point, the snake deforms itself into conform- ity with the nearest salient contour. We have ap- 2.1 Internal Energy plied the same basic techniques to the problem of 3D object reconstruction from silhouettes by The internal spline energy can be written using energy minimizing surfaces with preferred symmetries [ 171. We expect this general approach Eint = (~(shts)12 + P(~>1W>12V2 (2) will find a wide range of applicability in vision. The spline energy is composed of a first-order In section 2 we present a basic mathematical term controlled by a(s) and a second-order term description of snakes along with their Euler controlled by p(s). The first-order term makes the equations. Then in section 3 we give details of the snake act like a membrane and the second-order energy terms that can make a snake attracted to term makes it act like a thin plate. Adjusting the different types of important static, monocular weights a(s) and p(s) controls the relative impor- features such as lines, edges, and subjective con- tance of the membrane and thin-plate terms. Set- tours. Section 4 addresses the applicability of ting p(s) to zero at a point allows the snake to snake models to stereo correspondence and mo- become second-order discontinuous and develop tion tracking. Finally, section 5 discusses further a corner. The controlled continuity spline is a refinements and directions of our current work. generalization of a Tikonov stabilizer [19] and can formally be regarded as regularizing [14,15] the problem. 2 Basic Snake Behavior Details of our minimization procedure are given in the appendix. The procedure is an O(n) Our basic snake model is a controlled continuity iterative technique using sparse matrix methods. [18] spline under the influence of image forces Each iteration effectively takes implicit Euler and external constraint forces. The internal steps with respect to the internal energy and ex- spline forces serve to impose a piecewise smooth- plicit Euler steps with respect to the image and ness constraint. The image forces push the snake external constraint energy. The numeric con- 324 Kass, Witkin, and Terzopoulos siderations are relatively important. In a fully ex- only to push a snake near the feature. Once close plicit Euler method, it takes 0(n2)iterations each enough, the energy minimization will pull the of O(n) time for an impulse to travel down the snake in the rest of the way. Accurate tracking of length of a snake.