next up previous
Next: Four-dimensional geometric feature Up: published in proceedings Forth Previous: published in proceedings Forth

Introduction

Robust scene interpretation by means of machine vision is a key factor in various new applications in robotics. Part of this problem is the efficient recognition and classification of previously known three-dimensional (3D) shapes in arbitrary scenes. So far, solutions have either relied on heavily constrained conditions or have not been achieved in real time.

With the availability of ever faster computers and 3D-sensing technology (real-time stereo processing, laser range scanners, etc.), more general approaches are becoming feasible. They allow for weaker scene restrictions and hence facilitate new scenarios. Fundamental to visual object recognition are descriptions of general free-form shapes. A good overview of the currently prevalent approaches is given in [2].

In computer graphics, surface meshes are a popular description of free forms. They are also useful for recognition purposes, and the Internet makes them accessible to everybody for testing and comparing algorithms. A major drawback, however, is their large memory requirement. Furthermore, surface meshes are defined with respect to a global coordinate system. Thus, time-consuming registration is necessary to align the object of interest with the frame of the reference object model before matching is possible. The same problems apply to voxel-based descriptions of shape.

Representations based on superquadrics, generalized cylinders, and splines all suffer from a high sensitivity to noise and outliers in the sensed data. A significant effort is required to obtain a robust fitting procedure and to select the model order so as to avoid over-fitting.

It is, therefore, most desirable to develop a shape representation that (i) is compact, (ii) is robust, (iii) does not depend on a global coordinate frame, and (iv) has the descriptive capacity to distinguish arbitrary shapes.

A promising approach is to analyze the statistical occurrence of features on a surface in 3D space. This has been pursued by extracting local features such as surface curvatures or geometric relations such as distances. Their distributions are represented as discrete histograms or piecewise-linear approximations thereof. The classification step may be realized by matching a measured distribution against distributions in a reference database of prototypes or by the search for characteristic patterns in a distribution.
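The matching of a measured distribution against a database of prototypes can be illustrated with a minimal sketch. It assumes that distributions are stored as lists of bin counts and uses histogram intersection as the similarity measure; the function names, the prototype labels, and the choice of measure are illustrative assumptions, not taken from any of the cited works.

```python
def normalize(hist):
    """Scale non-negative bin counts so that they sum to 1."""
    total = sum(hist)
    if total <= 0:
        raise ValueError("histogram must contain at least one sample")
    return [h / total for h in hist]


def intersection(p, q):
    """Histogram intersection of two normalized histograms (1.0 = identical)."""
    return sum(min(a, b) for a, b in zip(p, q))


def classify(measured, prototypes):
    """Return the label of the prototype most similar to the measured histogram.

    `prototypes` maps a (hypothetical) object label to its bin counts.
    """
    m = normalize(measured)
    return max(prototypes,
               key=lambda name: intersection(m, normalize(prototypes[name])))


# Hypothetical four-bin prototypes for two objects.
prototypes = {"sphere": [10, 0, 0, 0], "cube": [1, 4, 4, 1]}
best = classify([2, 7, 9, 2], prototypes)  # "cube"
```

Any other similarity or distance measure between histograms could be substituted for `intersection` without changing the structure of the search.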

For instance, Osada et al. [8] sample the statistics of point-pair distances across the whole surface of 3D objects. They demonstrate similarity search based on the distance distribution. However, a lot of information on shape is discarded by reduction to this one-dimensional feature. Vandeborre et al. [11] use three distributions of one-dimensional geometric features, based on curvature, distance, and volume. In both works, recognition performance is moderate and only suitable for a preliminary selection as performed, e.g., by an Internet search engine.
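A distribution of point-pair distances can be sketched as follows. The sketch assumes that the surface has already been sampled into a list of 3D points, so the surface sampling itself is not shown; the function name, bin count, and number of pairs are illustrative choices rather than the settings used in [8].

```python
import math
import random


def distance_histogram(points, n_pairs=10000, n_bins=16, max_dist=None, seed=0):
    """Normalized histogram of distances between randomly chosen point pairs.

    `points` is a list of (x, y, z) tuples assumed to be sampled from the surface.
    """
    if len(points) < 2:
        raise ValueError("need at least two points")
    rng = random.Random(seed)
    dists = []
    for _ in range(n_pairs):
        a, b = rng.sample(points, 2)      # two distinct points
        dists.append(math.dist(a, b))     # Euclidean distance
    if max_dist is None:
        max_dist = max(dists)
    if max_dist <= 0:
        raise ValueError("all sampled points coincide")
    hist = [0] * n_bins
    for d in dists:
        # Clamp so that d == max_dist (or larger) falls into the last bin.
        k = min(int(d / max_dist * n_bins), n_bins - 1)
        hist[k] += 1
    return [h / n_pairs for h in hist]


# Corners of a unit cube as a toy point set.
corners = [(x, y, z) for x in (0, 1) for y in (0, 1) for z in (0, 1)]
hist = distance_histogram(corners, n_pairs=200, max_dist=2.0)
```

The result is a single one-dimensional histogram per object, which makes the loss of shape information noted above easy to see: very different point sets can produce similar distance statistics.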

Hameiri and Shimshoni [3] look for symmetric form primitives, such as cylinders, cones, etc., in depth images. As the basic local feature, they use the two principal surface curvatures, accumulated in a two-dimensional histogram. The surface-curvature histogram is characteristic of each ideal form primitive and known a priori from geometrical considerations. For real measured data, however, reliance upon curvatures is very sensitive to noise and artifacts. Moreover, for general shapes the distribution of curvatures will not be as crisp as for highly symmetric shapes and may hence be less informative, and many histograms may be required to cover all object views.

Multiple view-based histograms have been used by Hetzel et al. [4,7], who adapted a probabilistic approach from Schiele and Crowley [9] to depth images. According to Bayes' rule, the best match is calculated as the one with the highest a posteriori probability, given a set of random feature samples. As features, they employ a collection of local surface measures, namely, pixel depth, surface normal, and curvature. Generally, however, a high number of histograms per object model increases processing time.
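The a posteriori selection can be sketched under two stated assumptions: the feature samples are treated as statistically independent, and each stored histogram is read as the likelihood of a feature bin given the object. The function name, the labels, and the small constant `eps` that guards against empty bins are illustrative and are not details taken from [4,7,9].

```python
import math


def posterior(samples, likelihoods, priors=None, eps=1e-9):
    """Posterior probability of each object given a set of feature samples.

    samples     -- list of histogram bin indices of the observed features
    likelihoods -- dict: object label -> normalized histogram, read as p(bin | object)
    priors      -- dict: object label -> prior probability (uniform if omitted)
    """
    names = list(likelihoods)
    if priors is None:
        priors = {n: 1.0 / len(names) for n in names}
    log_post = {}
    for n in names:
        lp = math.log(priors[n])
        for k in samples:
            # Independence assumption: log-likelihoods of the samples add up.
            lp += math.log(likelihoods[n][k] + eps)
        log_post[n] = lp
    # Normalize in the log domain to avoid numerical underflow.
    m = max(log_post.values())
    z = sum(math.exp(v - m) for v in log_post.values())
    return {n: math.exp(v - m) / z for n, v in log_post.items()}


# Hypothetical three-bin likelihoods for two objects.
likelihoods = {"a": [0.7, 0.2, 0.1], "b": [0.1, 0.2, 0.7]}
post = posterior([0, 0, 1], likelihoods)
best = max(post, key=post.get)  # "a"
```

With one histogram per view, the loop over `names` runs over all views of all objects, which is where the processing-time cost mentioned above arises.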

An alternative line of research has sought to describe single, possibly characteristic points on an object by their local surface shape. This includes the spin images of Johnson and Hebert [6] and the surface signatures of Yamany and Farag [12]. For creating their histograms, surface points are picked and a plane is rotated about their local surface normal. The surrounding points are accumulated in that plane. Both approaches require dense surface meshes. Hillenbrand and Hirzinger [5] have characterized singular surface shape by four-point-relation densities that are directly constructed from a 3D-point set.
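The accumulation of surrounding points in a plane rotated about the surface normal can be sketched as follows. Each neighbor is described by its distance from the normal line and its signed height along the normal, so all points on a circle around the normal fall into the same cell. The sketch uses plain binning, and the function name, bin size, and image dimensions are illustrative assumptions rather than the settings of [6] or [12].

```python
import math


def spin_histogram(p, n, points, bin_size=0.25, n_alpha=8, n_beta=8):
    """2D histogram of the points around the oriented point (p, n).

    alpha -- radial distance of a point from the line through p along n
    beta  -- signed height of a point above the tangent plane at p
    """
    length = math.sqrt(sum(c * c for c in n))
    n = [c / length for c in n]                      # unit normal
    image = [[0] * n_alpha for _ in range(n_beta)]
    beta_max = n_beta * bin_size / 2.0               # rows cover [-beta_max, beta_max]
    for x in points:
        d = [xi - pi for xi, pi in zip(x, p)]
        beta = sum(di * ni for di, ni in zip(d, n))
        alpha = math.sqrt(max(sum(di * di for di in d) - beta * beta, 0.0))
        col = math.floor(alpha / bin_size)
        row = math.floor((beta_max - beta) / bin_size)
        if 0 <= col < n_alpha and 0 <= row < n_beta:  # discard points outside the image
            image[row][col] += 1
    return image


# Three points on a circle around the normal share one cell.
pts = [(1, 0, 0), (0, 1, 0), (-1, 0, 0), (0, 0, 0.5)]
img = spin_histogram((0, 0, 0), (0, 0, 1), pts)
```

Because the accumulated counts depend on how densely the surface is sampled around the chosen point, descriptors of this kind benefit from dense and evenly sampled surface data.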

In this paper, we propose statistical analysis of a new four-dimensional geometric feature. The distribution of this feature captures both local and global aspects of shape. The relevant measures may be calculated from a surface mesh or be estimated from multiple 3D-data points. Here we rely on triangular meshes as the input data. We need just one stored histogram per object that is learned from training data. In the presence of significant noise or occlusion, we still obtain reasonable recognition rates above 80%. The processing time with a database containing 20 object models is around five milliseconds. The present study describes preliminary results that justify further research along this line.

The paper is organized as follows. Section 2 introduces the four-dimensional geometric feature. In Section 3, the sampling of histograms in the training phase is discussed. Section 4 defines six different criteria for comparison of sensed data with the trained histograms. We evaluate these criteria for classifiers in Section 5. Recognition rates and processing times are demonstrated for artificial data, and performance under conditions of noise and partial object visibility is investigated. Furthermore, we verify generalization of the classifiers across a wide range of mesh resolutions. The paper concludes in Section 6 with a final rating of the different classifiers and an outlook on future work.

Figure 1: (a) Two surface points $\mathbf{p}_1, \mathbf{p}_2$ and their orientations $\mathbf{n}_1, \mathbf{n}_2$. (b) Illustration of the four parameters of our feature. The vector $\mathbf{n}'_2$ is the projection of $\mathbf{n}_2$ onto the $\mathbf{uw}$-plane. $\alpha$, $\arccos \beta$, and $\arccos \gamma$ are angles; $\delta$ is the length of the vector $\mathbf{p}_2-\mathbf{p}_1$.
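One reading of the four parameters in Figure 1 can be sketched in code. The exact definition is the subject of Section 2; the construction of the frame $(\mathbf{u}, \mathbf{v}, \mathbf{w})$ below, with $\mathbf{u} = \mathbf{n}_1$, is an assumption chosen to be consistent with the caption, the function names are hypothetical, and the ordering of the two points is taken as given.

```python
import math


def _dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def pair_feature(p1, n1, p2, n2):
    """Return (alpha, beta, gamma, delta) for two oriented surface points.

    n1 and n2 are assumed to be unit normals. The frame is an assumption:
    u = n1, v = (p2 - p1) x u (normalized), w = u x v.
    """
    d = tuple(b - a for a, b in zip(p1, p2))
    delta = math.sqrt(_dot(d, d))            # length of p2 - p1
    u = tuple(n1)
    v = _cross(d, u)
    v_len = math.sqrt(_dot(v, v))
    if delta == 0.0 or v_len == 0.0:
        raise ValueError("frame undefined: points coincide or p2 - p1 is parallel to n1")
    v = tuple(c / v_len for c in v)
    w = _cross(u, v)
    alpha = math.atan2(_dot(w, n2), _dot(u, n2))  # angle of n2 projected onto the uw-plane
    beta = _dot(v, n2)                            # cosine of the angle between v and n2
    gamma = _dot(u, d) / delta                    # cosine of the angle between u and p2 - p1
    return alpha, beta, gamma, delta


# Two points on a plane with parallel normals.
f = pair_feature((0, 0, 0), (0, 0, 1), (1, 0, 0), (0, 0, 1))
```

All four values are built from dot and cross products of the two points and their normals, so they do not change when both oriented points undergo the same rigid motion, which is what frees the feature from a global coordinate frame.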


Eric Wahl 2003-11-06