
## Computational vision

Ponce and Kriegman have essentially extended the SUCCESSOR model to include algebraic surface patches and their intersecting curves [67]. They describe a new approach for recognizing curved three-dimensional objects from two-dimensional image contours derived from edge maps of gray-scale information. They claim that their technique is powerful enough to handle most objects created by existing CAD tools. Their basic approach is to construct a closed-form expression for the error estimate between the predicted contour and the observed data using a technique known as elimination theory. They do not explain the details of elimination theory but merely state that it is a technique for removing independent variables from a set of parametric equations. This allows complex equations to be reparameterized into a consistent set of independent variables (in their case, three rotation angles and three displacement variables). Ponce and Kriegman have successfully used this approach to fit image contours to models of curved three-dimensional objects using the SUCCESSOR system. They have achieved surface recognition accuracies of 1.5 pixels for toroidal-shaped objects containing many bumps and irregularities.
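Although they give no details of elimination theory, its core idea can be sketched with a toy example (an illustration under assumed inputs, not their actual derivation): a resultant eliminates the shared parameter from two polynomial equations, leaving a single implicit equation in the remaining variables.

```python
# A minimal sketch of elimination via resultants, using a toy curve
# rather than Ponce and Kriegman's surface models. Eliminating the
# parameter t from x = t^2, y = t^3 yields an implicit polynomial
# relating x and y alone.
import sympy as sp

t, x, y = sp.symbols("t x y")
# Parametric equations rewritten as polynomials that vanish on the curve.
p = x - t**2
q = y - t**3
# The resultant with respect to t eliminates t from the pair (p, q).
implicit = sp.resultant(p, q, t)
print(implicit)  # an implicit polynomial in x and y (x**3 - y**2 up to sign)
```

In their setting the same idea is applied at larger scale, so that the fitted equations involve only the six pose variables (three rotations and three displacements).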

While the exact mathematical details in their paper are skeletal, they essentially represent the theoretical contours of each object by an implicit equation of the form

F(x, y; dx, dy, dz, a, b, c) = 0

where x and y represent the position of the object on the image; dx, dy, and dz represent the displacement of the object from the viewing origin; and a, b, and c represent the orientation angles of the object. With this equation, they use the Levenberg-Marquardt algorithm to solve the non-linear least-squares minimization of the error function (in this case, the sum of the squared values of the contour expression F over all the contour data points). A more accurate method is to define the error function as the distance between the image data points and the theoretical points. They argue that, under the assumption that the distance between the actual and theoretical data points is relatively small, this distance can be approximated at each data point as

F / sqrt(nx^2 + ny^2)

where nx and ny are the scaled normals of the contour at that data point (the components of the gradient of F). The resulting minimization can be solved using a simple Newton-Raphson method. Using this latter technique, they were able to detect the position and orientation of various toroidal objects and differentiate them based on properties such as surface smoothness, color, and geometric aspect ratio.
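A minimal numerical sketch of this fitting step, assuming a hypothetical circular contour model in place of their algebraic surfaces, and substituting scipy's Levenberg-Marquardt solver for their Newton-Raphson implementation:

```python
# A sketch, not Ponce and Kriegman's implementation: a circle stands in
# for their algebraic contour model, the residual is the approximate
# point-to-contour distance F / sqrt(nx^2 + ny^2), and scipy's
# Levenberg-Marquardt solver replaces their Newton-Raphson step.
import numpy as np
from scipy.optimize import least_squares

def residuals(params, pts):
    """Approximate signed distance from each point to the model contour."""
    dx, dy, r = params
    # Implicit contour model: F(x, y) = (x - dx)^2 + (y - dy)^2 - r^2.
    F = (pts[:, 0] - dx) ** 2 + (pts[:, 1] - dy) ** 2 - r ** 2
    # Scaled normals (nx, ny) = gradient of F.
    nx = 2.0 * (pts[:, 0] - dx)
    ny = 2.0 * (pts[:, 1] - dy)
    return F / np.sqrt(nx ** 2 + ny ** 2)

# Synthetic contour data: points on a circle centered at (2, -1), radius 3.
theta = np.linspace(0.0, 2.0 * np.pi, 60, endpoint=False)
pts = np.column_stack([2.0 + 3.0 * np.cos(theta), -1.0 + 3.0 * np.sin(theta)])

fit = least_squares(residuals, x0=[0.5, 0.5, 2.0], args=(pts,), method="lm")
print(fit.x)  # recovered displacement and radius, close to (2, -1, 3)
```

The normalization by the gradient magnitude is what turns the raw implicit-function values into approximate distances, which is why this error measure tracks the image data more faithfully than the plain sum of squared F values.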

The advantage of their system is that it is one of the few working systems that can recognize simple three-dimensional objects independent of the viewing angle and the presence of occluding objects and features. However, their system has many significant drawbacks as well. For instance, each object model must be reparameterized into polynomial expressions involving the six variables described earlier. While they have done this for toroidal objects, they admit that this step is difficult to generalize and may require a great deal of CSG analysis to extend to other objects. Furthermore, they work only with two-dimensional edge maps (i.e., two-dimensional images representing only the edges of a gray-scale image). Since edge maps are susceptible to shadow effects from occluding surfaces, all the shadows had to be removed manually. In general, while their technique is very accurate, it requires very "clean" images, which means a certain amount of manual preprocessing must be performed on each image. Both of these limitations make their approach infeasible for general medical image processing. Most systems in the medical domain require robust, automated algorithms to deal with large studies involving dozens of axial image slices per study. Furthermore, medical images contain very complicated geometries for which no adequate models have been developed (let alone subjected to the rigors of elimination theory).

Despite these drawbacks, Ponce and Kriegman have taken an important first step in recognizing simple three-dimensional objects from two-dimensional contour data. For future research, they propose to extend their elimination technique to the so-called superquadrics used in computer graphics. They would also like to couple the image segmentation and object recognition steps in order to eliminate the error-prone manual preprocessing of the two-dimensional gray-scale data. With these extensions, their object recognition approach shows great promise for becoming a useful image understanding tool in the near future.


Ramani Pichumani
Mon Jul 7 10:34:23 PDT 1997