Figure 5: Examples of textured shapes (a circle centered in the middle of the image) for which the gradient is not sufficient for contour extraction. Left: weak contrast between contour and background (even for a human observer). Right: strong edges inside the contour.
In general, natural texture discrimination in the 2D image plane is based on
In pattern recognition, one research domain that has to deal with the detection of boundaries in 1D signals is speech recognition. Many features have been discussed for discriminating between kinds of speech (voiced, unvoiced) and for detecting irregularities in speech. These results are a good starting point for defining features that can be applied to detect boundaries in the 2D image plane based on a 1D gray value signal.
First, let us look at an example of a 1D gray value signal, i.e. one active ray, for the image shown in Figure 5 (right). The gray value signal can be found in Figure 6 (left). The boundary between the two textures is clearly visible within the active ray, but the usual image gradient is not suited to identifying its position on the ray. Instead of the gradient, the changes in the frequency of the signal and in the gray value range uniquely identify the boundary. In Figure 6 (right), the active ray has been mapped to the variance of the gray values, computed inside a window of width N=20 pixels on the active ray. The parameter a defines the starting position of the window. The window is shifted pixel by pixel over the whole 1D gray value signal. One can easily see that the gradient of this ``variance signal'' now identifies the boundary. Using the negative gradient of the variance as judgement function, we can integrate this energy into the energy minimization scheme (11).
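The sliding-window variance and its negative gradient can be sketched in a few lines. This is only an illustration under stated assumptions: the function names `window_variance` and `variance_judgement` are hypothetical, the discrete gradient is approximated with `numpy.gradient`, and the synthetic ray is made-up data rather than the signal from Figure 5.

```python
import numpy as np

def window_variance(ray, width=20):
    """Variance of the gray values in every window [a, a + width) of a 1D ray.

    Entry a of the result belongs to the window that starts at sample a.
    """
    ray = np.asarray(ray, dtype=float)
    if ray.ndim != 1 or not 0 < width <= ray.size:
        raise ValueError("ray must be 1D and at least `width` samples long")
    starts = range(ray.size - width + 1)  # shift the window pixel by pixel
    return np.array([ray[a:a + width].var() for a in starts])

def variance_judgement(ray, width=20):
    """Negative discrete gradient of the variance signal.

    Assumption: np.gradient (central differences) stands in for the gradient.
    """
    return -np.gradient(window_variance(ray, width))

# Synthetic ray (illustrative data only): two textures with the same mean
# gray value but different amplitude, meeting between samples 59 and 60.
index = np.arange(120)
sign = np.where(index % 2 == 0, 1.0, -1.0)
amplitude = np.where(index < 60, 1.0, 30.0)
ray = 100.0 + sign * amplitude

judgement = variance_judgement(ray, width=20)
a_min = int(np.argmin(judgement))  # window start where the variance rises fastest
print(a_min)
```

Note that the result is indexed by the window start a, so the minimum of the judgement function marks a window that overlaps the boundary rather than the boundary sample itself. With this sign convention the minimum corresponds to a rise in variance along the ray.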
Figure 6: Transformation of a 1D gray value signal into a 1D feature signal. Left: gray value signal, sampled in Figure 5 (right) along a ray starting at the center of the circle. Right: variance of the gray values within a window of width 20 pixels. The window is shifted over the whole signal with a step size of one pixel.
Another function known from speech recognition is the so-called average magnitude difference function (AMDF). The AMDF is computed over a window of width 2L+1 for a given displacement inside the window. Following the energy minimization scheme, the judgement function is defined such that it attains its minimum where the AMDF attains its maximum, because the AMDF is supposed to reach a maximum value at positions where the boundary between two textures is located in the middle of the window.
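A minimal sketch of an AMDF-based judgement function follows. The exact definition is not reproduced here, so the code assumes the form common in speech processing: the mean absolute difference between the samples of a window and the same samples displaced by `tau`, with both samples kept inside the window of width 2L+1. The names `amdf` and `amdf_judgement`, the choice `tau = L`, and the step-shaped test ray are all illustrative assumptions.

```python
import numpy as np

def amdf(ray, a, L, tau):
    """Assumed AMDF of the window ray[a : a + 2L + 1] for displacement tau.

    Mean of |g(i) - g(i + tau)| over all pairs that lie inside the window.
    """
    ray = np.asarray(ray, dtype=float)
    window = ray[a:a + 2 * L + 1]
    if window.size != 2 * L + 1 or not 0 < tau <= 2 * L:
        raise ValueError("window or displacement out of range")
    return np.abs(window[:-tau] - window[tau:]).mean()

def amdf_judgement(ray, L, tau):
    """Negative AMDF per window start a, so that minimizing it favors AMDF maxima."""
    n_windows = len(ray) - 2 * L
    return np.array([-amdf(ray, a, L, tau) for a in range(n_windows)])

# Synthetic ray (illustrative data only): gray value 50 up to sample 59,
# gray value 150 from sample 60 onward.
ray = np.where(np.arange(120) < 60, 50.0, 150.0)

L = 10
judgement = amdf_judgement(ray, L=L, tau=L)
a_best = int(np.argmin(judgement))  # window start with maximal AMDF
center = a_best + L                 # middle sample of that window
print(a_best, center)
```

With the displacement set to half the window width, the sample pairs straddle the boundary most often when the boundary sits near the window center, which matches the behavior described above. Other displacements or normalizations would be equally plausible readings.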
In contrast to region-based active contour models, the complexity of the energy computation and minimization does not grow to the point where real-time constraints can no longer be satisfied. For the variance and the AMDF, the computation runs in real time without dedicated hardware.
In the experimental part we summarize several other ``texture'' energies that have been evaluated. Most of them are window-based features, i.e. the feature at a given position is computed from the gray values of the active ray inside a window of a certain width. We do not claim that these features are the best ones, but in the context of real-time applications they proved to be the most efficient. They are also well suited to demonstrating the usability of the presented framework for the extraction of textured contours.