
Limitations of non model-based vision

 

Moses et al. have constructed a theoretical argument for why non-model-based vision systems cannot recognize objects consistently [58]. The authors offer a mathematical proof based on a definition of consistent recognition functions. The details of this proof are beyond the scope of this dissertation. The basic result, however, is that because different objects can produce similar-looking images or image features, it is not possible to distinguish these objects without prior knowledge of how the images were formed.

For example, consider the cylinder and the truncated cone in the accompanying figure. In view (a), the top and bottom as well as the left and right sides of the cylinder are parallel, whereas the left and right sides of the cone are antiparallel (that is, they converge rather than run parallel). However, when viewed from orientation (b), the left and right sides of the cylinder have also become antiparallel, making the cylinder indistinguishable from the cone. In order to recognize these two objects, a non-model-based vision system would have to be trained on all possible perspective transformations of both objects. Furthermore, it would be difficult for such a system to take a known viewing orientation into account to improve its discrimination capability.

A model-based system, on the other hand, would not have to be trained on all possible viewing orientations. It could also use the viewing orientation (either known or hypothesized) to discriminate between the two objects. For example, the ratio of the radii of the top and bottom disks spans a smaller interval for the cylinder than for the cone. Also, the degree of antiparallelism of the left and right sides is less pronounced for the cylinder than for the cone. Given the viewing orientation, the model-based system can predict a priori the degree of antiparallelism and the ratio of the radii that it should expect to see.
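The cylinder and cone argument can be illustrated numerically. The following is only a minimal sketch under stated assumptions, not the method of [58]: it assumes an ideal pinhole camera, uses the ratio of the projected widths of the top and bottom disks as a stand-in for the "ratio of the radii" and the degree of antiparallelism, and all names and dimensions (`apparent_width_ratio`, `classify`, `MODELS`, the 30 degree tilt, the cone radii) are hypothetical choices made for illustration.

```python
import numpy as np

def apparent_width_ratio(r_top, r_bottom, height, tilt, distance, focal=1.0, n=720):
    """Projected width of the top disk divided by that of the bottom disk.

    Assumed setup: pinhole camera on the z axis at z = distance, looking at
    the origin; the object's axis lies along y and is tipped toward the
    camera by `tilt` radians (rotation about the x axis).
    """
    t = np.linspace(0.0, 2.0 * np.pi, n, endpoint=False)
    widths = []
    for radius, y in ((r_top, height / 2.0), (r_bottom, -height / 2.0)):
        x = radius * np.cos(t)             # points on the disk rim (object frame)
        z = radius * np.sin(t)
        zc = y * np.sin(tilt) + z * np.cos(tilt)  # z coordinate after tilting
        depth = distance - zc              # distance from the camera along its axis
        u = focal * x / depth              # horizontal image coordinate
        widths.append(u.max() - u.min())
    return widths[0] / widths[1]

# Hypothetical object models: (top radius, bottom radius).
MODELS = {"cylinder": (1.0, 1.0), "cone": (1.106, 1.0)}

def classify(observed_ratio, tilt, height=2.0, distance=10.0):
    """Pick the model whose predicted ratio at the given tilt best matches."""
    predicted = {
        name: apparent_width_ratio(rt, rb, height, tilt, distance)
        for name, (rt, rb) in MODELS.items()
    }
    return min(predicted, key=lambda name: abs(predicted[name] - observed_ratio))

if __name__ == "__main__":
    tilt = np.radians(30.0)
    observed = apparent_width_ratio(1.0, 1.0, 2.0, tilt, 10.0)  # tilted cylinder
    print("observed ratio:", observed)
    print("assuming an upright view:", classify(observed, 0.0))
    print("using the known tilt:    ", classify(observed, tilt))
```

In this sketch the tilted cylinder produces nearly the same width ratio as the upright truncated cone, so the ratio alone is ambiguous; once the tilt is supplied, the predicted ratios of the two models separate and the observation can be attributed to the cylinder.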

A less contrived example of the limitations of non-model-based vision is the correction of blurred photographic images. Image blurring can be caused by camera motion, object motion, lens defocusing, chromatic separation, and image sensor diffusion. Without a model of the image formation process, it is difficult to reduce the degree of blurring. However, if we know the properties of the lens, the velocity of the objects, or the physics of the image sensing device, we can correct for the blurring in a principled and, under suitable assumptions, optimal way. Of course, the correction will still be limited by factors such as the signal-to-noise ratio of the image sensors and the quality of the lens. Furthermore, some degradation processes, such as intensity saturation and image bleeding, are mathematically or physically irreversible. The point is that for any given set of constraints, a model-based vision system will have a better object-discriminating capability than a non-model-based system. Indeed, many current research projects in computer vision deal with the construction of better geometric and computational models for recognizing objects and scenes.
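As a hedged illustration of how a model of the image formation process helps with deblurring, the sketch below restores a one-dimensional signal that was degraded by a known uniform motion blur. It uses Wiener-style regularized inverse filtering, which is a technique chosen here for illustration and is not named in the text above; the test scene, blur length, noise level, and regularization constant `k` are all assumed values.

```python
import numpy as np

def blur(signal, psf):
    """Circular convolution of the signal with a point spread function."""
    return np.real(np.fft.ifft(np.fft.fft(signal) * np.fft.fft(psf)))

def wiener_deblur(observed, psf, k):
    """Regularized inverse filter built from the known blur model.

    `k` stands in for the noise-to-signal power ratio (an assumed constant);
    it keeps the filter bounded where the blur response is close to zero.
    """
    H = np.fft.fft(psf)
    G = np.conj(H) / (np.abs(H) ** 2 + k)
    return np.real(np.fft.ifft(np.fft.fft(observed) * G))

def demo(n=256, length=9, noise_sigma=0.002, k=1e-3, seed=0):
    """Return (mean squared error before restoration, after restoration)."""
    rng = np.random.default_rng(seed)

    # Hypothetical scene: three rectangular features of different contrast.
    scene = np.zeros(n)
    scene[40:80] = 1.0
    scene[120:130] = 0.5
    scene[180:220] = 0.8

    # Uniform motion blur of `length` samples, centred on index 0
    # (wrapped around the array ends so that no shift is introduced).
    psf = np.zeros(n)
    half = length // 2
    psf[: half + 1] = 1.0 / length
    psf[-half:] = 1.0 / length

    observed = blur(scene, psf) + rng.normal(0.0, noise_sigma, n)
    restored = wiener_deblur(observed, psf, k)

    def mse(a):
        return float(np.mean((a - scene) ** 2))

    return mse(observed), mse(restored)

if __name__ == "__main__":
    before, after = demo()
    print("MSE of blurred signal: ", before)
    print("MSE of restored signal:", after)
```

The restoration depends entirely on knowing `psf`, that is, on the blur model. The constant `k` also reflects the limits mentioned above: at frequencies where the blur response is nearly zero or the noise dominates, the filter gives up on recovery rather than amplifying noise.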

Figure: A cone and a cylinder (a) can have identical perspective transformations when viewed at a particular orientation (b).



Ramani Pichumani
Mon Jul 7 10:34:23 PDT 1997