Next: The Kalman-filter model Up: Model-based Approaches Previous: Model-based Approaches

Snakes: an active model

 

Kass developed a novel technique for image segmentation that was able to solve a large class of segmentation problems that had eluded more conventional techniques. He was interested in developing a model-based technique that could recognize familiar objects in the presence of noise and other ambiguities. Kass [47] proposed the concept of a snake, an active contour model using ``an energy minimizing spline guided by external constraint forces and influenced by image forces that pull it toward features such as lines and edges.'' Snakes have been successful in performing tasks such as edge detection (both actual and subjective), corner detection, motion tracking, and stereo matching.

A spline is nothing more than a polynomial or set of polynomials used to describe or approximate curves and surfaces. Although the polynomials that make up the spline can be of arbitrary degree, the most commonly used are cubic polynomials. For example, a simple two-dimensional curve can be approximated by the following pair of cubic equations:

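\begin{eqnarray}
x(u) & = & a_x u^3 + b_x u^2 + c_x u + d_x \\
y(u) & = & a_y u^3 + b_y u^2 + c_y u + d_y
\end{eqnarray}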

Higher-order polynomials can have undesirable non-local properties. Complex shapes can be decomposed into smaller regions having fewer inflection points by the introduction of appropriately placed knots (i.e., the points at which neighboring polynomial segments join). In the pair of equations above, the variable u is called the spline parameter and typically varies over the interval [0,1]. Because cubic polynomials can have a maximum of two inflection points, complex shapes need to be decomposed into simpler segments or patches having at most two inflections. Although the coefficients a, b, c, d uniquely determine the shape of the spline, they are not usually specified directly. Instead, they are computed from other constraints such as continuity of the zeroth-, first-, and second-order derivatives at boundary points between neighboring polynomial segments.
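As a minimal sketch of how one such cubic segment can be evaluated, the following assumes each coordinate has the form a u^3 + b u^2 + c u + d with u in [0,1]. The function names and the coefficient tuples are illustrative only and are not taken from the original text.

```python
def cubic(coeffs, u):
    """Evaluate a*u^3 + b*u^2 + c*u + d using Horner's rule."""
    a, b, c, d = coeffs
    return ((a * u + b) * u + c) * u + d


def cubic_point(coeffs_x, coeffs_y, u):
    """Return the point (x(u), y(u)) on a two-dimensional cubic segment."""
    return cubic(coeffs_x, u), cubic(coeffs_y, u)


def sample_curve(coeffs_x, coeffs_y, n=5):
    """Sample n points (n >= 2) at evenly spaced values of u in [0, 1]."""
    return [cubic_point(coeffs_x, coeffs_y, i / (n - 1)) for i in range(n)]
```

For example, `sample_curve((1, 0, 0, 0), (0, 1, 0, 0))` traces the curve x = u^3, y = u^2 from the origin to the point (1, 1).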

Snakes fall into the category of active contour models because they dynamically alter their shape and position while trying to seek a minimal energy state. A two-dimensional dynamic contour called v can be defined in terms of its x and y coordinates, which in turn are parameterized by s , the linear parameter, and t , the time parameter,

\[ v(s,t) = \left( x(s,t),\; y(s,t) \right) \]

where $s$ is usually defined over the closed interval $[0,1]$ and $t$ is usually defined over the half-open interval $[0,\infty)$. The coefficients that minimize the energy of the spline can be found using either optimization techniques or differential calculus.

In Kass's original model, the total energy of a snake is made up of three subterms:

\[ E_{\mathrm{snake}} = E_{\mathrm{internal}} + E_{\mathrm{image}} + E_{\mathrm{constraint}} \]

$E_{\mathrm{internal}}$ is the internal energy of the spline and depends solely on the shape of the spline. $E_{\mathrm{image}}$ is the image energy and depends solely on the image intensity values along the path of the spline. $E_{\mathrm{constraint}}$ is the constraint energy and is created by artificial energy fields imposed by the user or the high-level control agent. Other energy terms can be defined, but for the purpose of this review section, I will limit the discussion to these three terms.

The internal energy is defined as

\[ E_{\mathrm{internal}} = \frac{1}{2} \left( \alpha \left| \frac{\partial v}{\partial s} \right|^2 + \beta \left| \frac{\partial^2 v}{\partial s^2} \right|^2 \right) \]

where $\alpha$ controls the amount of stretching the snake is willing to undergo and $\beta$ controls the amount of flexing it will allow. Large values of $\alpha$ will increase the internal energy of the snake as it stretches more and more, whereas small values of $\alpha$ will make the energy function insensitive to the amount of stretch. Similarly, large values of $\beta$ will increase the internal energy of the snake as it develops more curves, whereas small values of $\beta$ will make the energy function insensitive to curves in the snake. Smaller values of both $\alpha$ and $\beta$ will place fewer constraints on the size and shape of the snake.
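The effect of the stretching and flexing weights can be illustrated with a discrete approximation. The sketch below assumes a closed snake stored as a list of (x, y) points, approximates the first derivative with a forward difference and the second derivative with a central difference, and includes a factor of 1/2 as an assumption. The names `alpha` and `beta` for the two weights are likewise chosen here for illustration.

```python
def internal_energy(points, alpha, beta):
    """Discrete internal energy of a closed contour.

    points: list of (x, y) tuples, treated as a closed loop.
    alpha:  weight on stretching (squared first difference).
    beta:   weight on flexing (squared second difference).
    """
    n = len(points)
    total = 0.0
    for i in range(n):
        x0, y0 = points[(i - 1) % n]   # previous point (wraps around)
        x1, y1 = points[i]             # current point
        x2, y2 = points[(i + 1) % n]   # next point (wraps around)
        # Forward difference approximates the first derivative in s.
        dx, dy = x2 - x1, y2 - y1
        # Central difference approximates the second derivative in s.
        ddx, ddy = x0 - 2 * x1 + x2, y0 - 2 * y1 + y2
        stretch = dx * dx + dy * dy
        flex = ddx * ddx + ddy * ddy
        total += 0.5 * (alpha * stretch + beta * flex)
    return total
```

Setting `beta` to zero leaves only the stretching penalty, so the energy grows with the spacing between neighboring points; setting `alpha` to zero leaves only the flexing penalty, which is largest at sharp corners.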

The image energy of the snake is defined as

\[ E_{\mathrm{image}} = w_{\mathrm{line}}\, I(x,y) - w_{\mathrm{edge}} \left| \nabla I(x,y) \right|^2 \]

where $w_{\mathrm{line}}$ is called the line coefficient and $w_{\mathrm{edge}}$ is called the edge coefficient. Large positive values of $w_{\mathrm{line}}$ tend to make the snake align itself with dark regions in the image, I(x,y), whereas large negative values of $w_{\mathrm{line}}$ tend to make the snake align itself with bright regions in the image. Small absolute values of $w_{\mathrm{line}}$ make the snake more indifferent to intensity variations in the image. Similarly, large positive values of $w_{\mathrm{edge}}$ tend to make the snake align itself with sharp edges in the image whereas large negative values of $w_{\mathrm{edge}}$ make the snake avoid these edges. Small absolute values of $w_{\mathrm{edge}}$ make the snake indifferent to edges in the image.
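A rough numerical sketch of the image energy follows. It assumes the energy at a pixel is the line coefficient times the intensity minus the edge coefficient times the squared gradient magnitude, which matches the sign behavior described above. The image is assumed to be a list of rows of numbers, snake points are assumed to be integer pixel coordinates away from the border, and the names `w_line` and `w_edge` are illustrative.

```python
def image_energy(image, points, w_line, w_edge):
    """Sum the image energy over the snake points.

    image:  list of rows, image[y][x] is the intensity at (x, y).
    points: list of integer (x, y) pixel coordinates, not on the border.
    """
    total = 0.0
    for x, y in points:
        intensity = image[y][x]
        # Central differences approximate the intensity gradient.
        gx = (image[y][x + 1] - image[y][x - 1]) / 2.0
        gy = (image[y + 1][x] - image[y - 1][x]) / 2.0
        grad_sq = gx * gx + gy * gy
        # Positive w_line favors dark pixels; positive w_edge favors edges.
        total += w_line * intensity - w_edge * grad_sq
    return total
```

With a positive `w_edge`, a point sitting on a strong intensity step contributes a large negative term, so moving onto the edge lowers the total energy.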

The external constraint energy is defined as

\[ E_{\mathrm{constraint}} = k_{\mathrm{spring}} \left| (x_0, y_0) - v(s) \right|^2 + \frac{k_{\mathrm{volcano}}}{r^2} \]

where the $k_{\mathrm{spring}}$ terms are external spring factors and the $k_{\mathrm{volcano}}/r^2$ terms are called volcano factors. A large value of $k_{\mathrm{spring}}$ makes the snake behave as if there was a powerful spring connected between a point on the image at $(x_0, y_0)$ and a point on the snake $v(s)$. The larger the value of $k_{\mathrm{spring}}$, the more powerful the force of the spring. The $k_{\mathrm{volcano}}/r^2$ terms are called volcanoes because the plot of $1/r^2$ resembles the profile of a symmetric volcano. Computationally, these volcano terms act as a repulsion force between a point on the image at a distance $r$ from a point on the snake; the larger the value of $k_{\mathrm{volcano}}$, the stronger the force of repulsion. As a result, springs and volcanoes are a way of capturing high-level knowledge about images and the features contained within these images.
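The spring and volcano terms can be sketched as two small functions. The forms used here are assumptions: a quadratic spring energy that grows with the squared distance between the image point and the snake point, and an inverse-square volcano energy. The clamp on small distances (`r_min`) is added here only to avoid division by zero and is not described in the text; all names are illustrative.

```python
def spring_energy(k_spring, anchor, snake_point):
    """Energy of a spring tying snake_point to a fixed image point."""
    dx = anchor[0] - snake_point[0]
    dy = anchor[1] - snake_point[1]
    # Energy rises with squared distance, pulling the snake to the anchor.
    return k_spring * (dx * dx + dy * dy)


def volcano_energy(k_volcano, center, snake_point, r_min=1e-3):
    """Repulsive energy of a volcano centered at a fixed image point."""
    dx = center[0] - snake_point[0]
    dy = center[1] - snake_point[1]
    # Clamp r^2 so the energy stays finite at the volcano's center
    # (an assumption made for numerical safety).
    r_sq = max(dx * dx + dy * dy, r_min * r_min)
    # Energy falls off with distance, pushing the snake away.
    return k_volcano / r_sq


def constraint_energy(snake_point, springs, volcanoes):
    """Total constraint energy at one snake point.

    springs and volcanoes are lists of (k, (x, y)) pairs.
    """
    total = 0.0
    for k, anchor in springs:
        total += spring_energy(k, anchor, snake_point)
    for k, center in volcanoes:
        total += volcano_energy(k, center, snake_point)
    return total
```

Moving a snake point toward a spring anchor lowers its spring energy, while moving it toward a volcano center raises its volcano energy, which is the attraction and repulsion behavior described above.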

Using the internal and image energy forces, a snake will find desired image features in an autonomous fashion. The spring forces can be set up interactively by the user to confine the region in which a snake will operate. The volcano forces can be set up by the user to define regions that the snake should avoid.
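To illustrate how a snake can settle into a low-energy state, the sketch below uses a simple greedy local search: each point repeatedly moves to whichever of its eight neighboring pixels (or its current position) has the lowest energy. This is a swapped-in technique chosen for brevity, not necessarily the minimization scheme used in the original model. The callback `energy_at` is hypothetical and stands in for whatever combination of internal, image, and constraint energy is in use.

```python
def greedy_step(points, energy_at):
    """Move each point to its lowest-energy 8-neighbor, if any is better.

    energy_at(i, candidate, points) is a hypothetical callback returning
    the energy of placing point i at candidate, given the other points.
    Returns the updated point list and the number of points that moved.
    """
    new_points = list(points)
    moved = 0
    for i, (x, y) in enumerate(points):
        best = (x, y)
        best_energy = energy_at(i, best, new_points)
        for dx in (-1, 0, 1):
            for dy in (-1, 0, 1):
                candidate = (x + dx, y + dy)
                energy = energy_at(i, candidate, new_points)
                # Strict comparison so a point at a local minimum stays put.
                if energy < best_energy:
                    best, best_energy = candidate, energy
        if best != (x, y):
            new_points[i] = best
            moved += 1
    return new_points, moved


def minimize(points, energy_at, max_iters=100):
    """Repeat greedy steps until no point moves or max_iters is reached."""
    for _ in range(max_iters):
        points, moved = greedy_step(points, energy_at)
        if moved == 0:
            break
    return points
```

A greedy search of this kind only finds a local minimum, so the result depends on where the snake starts, which is one reason user-placed springs and volcanoes are useful for steering it.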

Snakes have several advantages over classical feature extraction techniques:

Snakes are not without their drawbacks, chief among which are the following:

In summary, snakes are a model-driven approach for solving many image understanding problems that are difficult, if not impossible, to tackle using classical approaches. Just like human vision, snakes start with an a priori model of what an object should look like. By using the smoothness constraints of the splines, they are able to fill in missing and noisy boundary information. As a result, they are more robust than non-model-based methods, which make little use of image structure. Much of the current research in active contour models deals with generalizing the form of the contours and overcoming the convergence and stability problems encountered during the energy minimization process [6] [25] [52] [53] [57].



Ramani Pichumani
Mon Jul 7 10:34:23 PDT 1997