Kass developed a novel technique for image segmentation, which was able to solve a large class of segmentation problems that had eluded more conventional techniques. He was interested in developing a model-based technique that could recognize familiar objects in the presence of noise and other ambiguities. Kass [47] proposed the concept of a snake, which was an active contour model using ``an energy minimizing spline guided by external constraint forces and influenced by image forces that pull it toward features such as lines and edges.'' Snakes have been successful in performing tasks such as edge detection (both actual as well as subjective), corner detection, motion tracking, and stereo matching.
A spline is nothing more than a polynomial or set of polynomials used to describe or approximate curves and surfaces. Although the polynomials that make up the spline can be of arbitrary degree, the most commonly used are cubic polynomials. For example, a simple two-dimensional curve can be approximated by the following pair of cubic equations:
Higher-order polynomials can have undesirable non-local properties.
Complex shapes can be decomposed into smaller regions having fewer
inflection points by the introduction of appropriately placed
knots (i.e., control points). In equation , the
variable u is called the spline parameter and typically varies
over the interval [0,1]. Because cubic polynomials can have a
maximum of two inflection points, complex shapes need to be decomposed
into simpler segments or patches having at most two inflections.
Although the coefficients a, b, c, d uniquely determine the shape of
the spline, they are not usually specified directly. Instead, they
are computed from other constraints such as continuity of zeroth,
first- and second-order derivatives at boundary points between
neighboring polynomial segments.
Snakes fall into the category of active contour models because they dynamically alter their shape and position while trying to seek a minimal energy state. A two-dimensional dynamic contour called v can be defined in terms of its x and y coordinates, which in turn are parameterized by s , the linear parameter, and t , the time parameter,
where , usually defined as the closed
interval [0,1], and
, usually defined as the half-open
interval [0,
). The coefficients that minimize the energy of
the spline can be found using either optimization techniques or
differential calculus.
In Kass's original model, the total energy of a snake is made up of three subterms:
is the internal energy of the spline and depends
solely on the shape of the spline.
is the image energy
and depends solely on the image intensity values along the path of the
spline.
is the constraint energy and is created by
artificial energy fields imposed by the user or the high-level control
agent. Other energy terms can be defined, but for the purpose of this
review section, I will limit the discussion to these three terms.
The internal energy is defined as
where controls the amount of stretching the snake
is willing to undergo and
controls the amount of flexing it
will allow. Large values of
will increase the internal
energy of the snake as it stretches more and more, whereas small
values of
will make the energy function insensitive to the
amount of stretch. Similarly, large values of
will increase
the internal energy of the snake as it develops more curves, whereas
small values of
will make the energy function insensitive to
curves in the snake. Smaller values of both
and
will
place fewer constraints on the size and shape of the snake.
The image energy of the snake is defined as
where is called the line coefficient and
is
called the edge coefficient. Large positive values of
tend to
make the snake align itself with dark regions in the image, I(x,y),
whereas large negative values of
tend to make the snake align
itself with bright regions in the image. Small absolute values of
make the snake more indifferent to intensity variations in the
image. Similarly, large positive values of
tend to make the
snake align itself with sharp edges in the image whereas large
negative values of
make the snake avoid these edges. Small
absolute values of
make the snake indifferent to edges in the
image.
The external constraint energy is defined as
where the terms are external spring factors and the
terms are called volcano factors. A large value of
makes the snake behave as if there was a powerful spring connected
between a point on the image at
and a point on the snake
. The larger the value of
, the more powerful the
force of the spring. The
terms are called volcanoes because the
plot of
resembles the profile of a
symmetric volcano. Computationally, these volcano terms act as a
repulsion force between a point on the image at a distance
from
a point on the snake; the larger the value of
, the stronger
the force of repulsion. As a result, springs and volcanoes are a way
of capturing high-level knowledge about images and the features
contained within these images.
Using the internal and image energy forces, a snake will find desired image features in an autonomous fashion. The spring forces can be set up interactively by the user to confine the region in which a snake will operate. The volcano forces can be set up by the user to define regions that the snake should avoid.
Snakes have several advantages over classical feature extraction techniques:
Snakes are not without their drawbacks, chief among which are the following:
In summary, snakes are a model-driven approach for solving many image understanding problems that are difficult, if not impossible, to tackle using classical approaches. Just like human vision, snakes start with an a priori model of what an object should look like. By using the smoothness constraints of the splines, they are able to fill in for missing and noisy boundary information. As a result, they are more robust than non-model based methods, which make little use of image structure. Much of the current research in active contour models deals with generalizing the form of the contours and overcoming the convergence and stability problems encountered during the energy minimization process [6] [25] [52] [53] [57].