In artificial vision, the image model underlying many image analysis methods considers images as the sampling of a smooth world with meaningful discontinuities. An image is then a function L that maps a domain to a range, where both are discrete and bounded spaces. Usually, the domain is spatial, so it comes with a given connectivity, while the semantics of the range gives us the image modality: range, luminance or color images from world scenes; physical parameters from medical images such as MR, CT, DSA, SPECT or PET; etc. Besides, since in this model L is a static entity, when we want to study the dynamic behavior of the objects contained in an image we embed it in an evolving family of manifolds, indexed by the parameters that control the evolution. In practice, of course, the space where these parameters live is also discrete and bounded. Common evolution parameters are time and scale. In this setting, artificial vision tries to perform a signal-to-symbol transformation by computing low-level features that allow high-level tasks to automatically distinguish between the interesting objects appearing in the image. The most relevant features can be extracted from dissimilarity measures (edgeness), similarity measures (interiorness) or a summarization of both (medialness). Edgeness and interiorness are dual measures mainly related to segmentation processes, while medialness is more like a 'shape encoder' that gives the degree to which a point resembles the middle of an object.
When we deal with incommensurable axes, such as those of grey-level images (luminance, MR, CT, etc.), the most common edgeness measure is the gradient magnitude $\|\nabla L\|$ (see notation in appendix A). In the 2-d case, for example, the idea is that an edge (object boundary) in some direction corresponds to a sharp change of L in the direction perpendicular to it. Observe that 'sharp change of L' can be translated as 'ridge of $\|\nabla L\|$'. Moreover, as the gradient at each point indicates the direction of maximum change of L, the ridge at any point runs perpendicular to it. Therefore, a scheme for extracting the edges of L can consist of computing the ridge-like structures of $\|\nabla L\|$ (without committing to any particular definition for the moment).
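As a minimal sketch of this edgeness measure, the following Python fragment computes the gradient magnitude of a synthetic grey-level image with central differences. The use of NumPy, the helper name `gradient_magnitude` and the sigmoid test image are assumptions made for illustration only; no particular ridge definition is implied, and the per-row maximum below is just a crude stand-in for one.

```python
import numpy as np

def gradient_magnitude(image):
    """Gradient magnitude of a 2-d grey-level image (finite differences)."""
    # np.gradient returns the derivative along rows (y) first, then columns (x).
    gy, gx = np.gradient(image.astype(float))
    return np.hypot(gx, gy), gx, gy

# Hypothetical test image: a smooth vertical step (dark on the left, bright on
# the right), i.e. an object boundary running along the row direction.
cols = np.arange(32)
profile = 1.0 / (1.0 + np.exp(-(cols - 16.0)))   # sigmoid centred on column 16
image = np.tile(profile, (16, 1))

edgeness, gx, gy = gradient_magnitude(image)

# Crude ridge locator: the column where the edgeness is largest in each row.
ridge_columns = edgeness.argmax(axis=1)
```

In this example the gradient has no component along the rows (`gy` is zero everywhere), so it points across the step, and the maxima of the edgeness line up in a single column: the ridge of the gradient magnitude runs perpendicular to the gradient, as described above.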
A medialness measure, essentially, assigns to each point inside an object its distance to the boundary of that object, according to a given metric. The points at the longest distance are in the middle of the object; therefore, at each point medialness also accounts for the degree of being on a symmetry axis. Blum [5, 6, 7] was the first to propose this idea. He defined the Symmetry Axis (SA), also called Medial Axis (MA) or skeleton, of a binary object as the locus of the centers of the maximal disks contained in the object, in such a way that the boundary of the union of all maximal disks is the boundary of the object. Then, Blum defined the Symmetric Axis Transform (SAT) of an object, also called Medial Axis Transform (MAT), as its SA together with the radii of the maximal disks at each SA point. In this way, the SAT compresses arbitrary shapes of binary objects. However, everyday artificial vision tasks hardly ever deal with binary objects, at least in early processing (low-level tasks). Instead, features have to be computed from grey-level images where the boundaries of the objects are not known a priori. Therefore, the SAT idea has to be translated to the new data type. Two representative works in this line are:
On the other hand, ridge/valley-like structures on grey-level images have been proposed [44, 43, 25, 17] as a reliable approximation to the medial axis, provided they are extracted directly from the data instead of from a medialness measure of it. In plain words, this can be justified because ridge/valley-like structures tend to be in the middle of relatively brighter/darker regions at a given scale, such as the brain stem or the skull in MR and CT images, vessels in DSA, roof 'edges' in range images, and a number of phenomena in regular luminance images [34, 97]. Other images with a predominant number of ridge/valley-like structures are fingerprint images (in general, images with oriented textures), hand-written documents, patterns generated by structured light, aerial photographs depicting roads, rivers, firebreaks, etc.
The advantage of using ridge/valley-like structures of the image itself is that the computational complexity is much lower than that of extracting them from a medialness measure of it. On the other hand, merely extracting ridge/valley-like structures does not give size information about the objects that have them as 'center'. In many applications this is not necessary since we just need the center (e.g. as a landmark) but, when it is, the size of the objects can be related to the scale at which the ridge/valley-like structures are most salient [25, 28, 12, 51]. Another approach is to apply, at each point on a ridge/valley-like structure, a filter that responds optimally when it engages both boundaries simultaneously and whose size (scale) can be related to the size of the object. An example is the Laplacian of the Gaussian (LoG).
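The relation between filter scale and object size can be illustrated with a small sketch. It is deliberately simplified to one dimension: an ideal bright bar is probed at its center (the ridge point) with the second derivative of a Gaussian, multiplied by the square of the scale so that responses at different scales are comparable. The function names, the candidate scales and the NumPy implementation are assumptions chosen for illustration, not the procedure of any of the cited works.

```python
import numpy as np

def normalized_log_kernel(sigma):
    """Sampled 1-d Laplacian of a Gaussian, multiplied by sigma**2."""
    radius = int(np.ceil(4 * sigma))            # truncate the kernel at 4 sigma
    x = np.arange(-radius, radius + 1, dtype=float)
    gauss = np.exp(-x**2 / (2 * sigma**2)) / (np.sqrt(2 * np.pi) * sigma)
    second_derivative = (x**2 / sigma**4 - 1 / sigma**2) * gauss
    return sigma**2 * second_derivative

def select_scale(profile, position, sigmas):
    """Return the scale with the strongest response at `position`."""
    responses = []
    for sigma in sigmas:
        filtered = np.convolve(profile, normalized_log_kernel(sigma), mode="same")
        # A bright bar gives a negative LoG at its center, so flip the sign.
        responses.append(-filtered[position])
    responses = np.array(responses)
    return sigmas[int(responses.argmax())], responses

# Hypothetical profile across a bright object: a bar 17 samples wide
# (half-width about 8.5) centred at sample 128.
profile = np.zeros(257)
center = 128
profile[center - 8:center + 9] = 1.0

sigmas = np.arange(2.0, 17.0)                   # candidate scales 2, 3, ..., 16
best_sigma, responses = select_scale(profile, center, sigmas)
```

For this ideal bar the strongest response occurs at a scale close to the half-width of the bar, so the selected scale gives a size estimate for the object whose center is the ridge point. In 2-d, and for real image profiles, the constant relating scale and size differs, so the sketch only shows the principle.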
As we have seen, ridge/valley-like structures are powerful descriptors that can be involved in important tasks: detection of edges when applied to an edgeness measure, detection of medial axes when applied to a medialness measure, and detection of medial structures by themselves. However, in this section I have spoken about the use of ridge/valley-like structures in artificial vision without giving their specific mathematical definition. In the next section I will review several definitions in use and their main properties. I have classified them as creases, separatrices and drainage patterns: