A surface cluster is a set of surface patches isolated by suitable boundaries. The goal of the surface cluster formation process is to produce a volumetric image representation for portions of the scene that might be identifiable three dimensional objects. There are three motivations for this:
The first motivation for a surface cluster is a competence issue - surface clusters are new representations that bridge the conceptual distance between the segmented surface image and the object. The point is to create an "unidentified, but distinct" object interpretation associated with sets of image features - a volumetric representation describing solid objects with approximate spatial relationships but without identifications. With this structure, the key image understanding representations now become (following Marr): image - primal sketch - 2½D sketch - surface clusters - objects. The grouping creates a good starting point for further interpretation; it is a form of figure/ground separation for solid objects.
Such representations are needed for unidentifiable objects to allow intermediate levels of image interpretation even when full identification is not achieved, whether because of faults or lack of models in the database. Possible application areas include vehicle navigation, collision avoidance, object tracking or grasping.
The second motivation is to partition the image features into activity contexts for the later stages of recognition. This helps to focus scene understanding by making it more obvious how a group of image features should be interpreted and matched with a model. By eliminating irrelevant and unrelated image features, it should be easier to identify and verify objects, since only features belonging to a single object will be present.
The "context" aspect of surface clusters is exploited by model invocation (Chapter 8) and hypothesis construction (Chapter 9). Model invocation requires contexts within which to accumulate evidence to hypothesize models, and surface clusters are ideal for invoking the volumetric models (the ASSEMBLY - see Chapter 7). To help fully instantiate an ASSEMBLY hypothesis, it may be necessary to search for additional evidence. Because ASSEMBLY invocation has occurred in a surface cluster context, any additional structural evidence should also come from the context. Thus, processing has been focused to a distinct region of the image.
The final motivation for creating these aggregations is one of performance - eliminating unrelated image features produces an immediate reduction in the complexity of the scene analysis. The whole interpretation has been reduced to a set of smaller independent problems, which is necessary given the quantity of data in an image.
A casual mathematical analysis supports this point. Here, we are mainly concerned with intersurface relationships (i.e. relative surface orientation and matching data surfaces to model SURFACEs). Since every surface on an object has a relationship to every other surface on the object, an object with $N$ visible surfaces has $O(N^2)$ relationships. If there are $A$ objects in the scene, each with $B$ visible surfaces, there will be $AB$ total visible surfaces. So, initially, the analysis complexity is $O((AB)^2)$. However, if the surface cluster process succeeds in partitioning the image into the $A$ objects, then the complexity is reduced to $O(AB^2)$. For typical scenes, a nominal value for $A$ is 20, so this can lead to a substantial improvement in performance.
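The counting argument above can be checked with a few lines of arithmetic. The following is only an illustrative sketch, not part of the surface cluster process itself: the function names are hypothetical, the figure of 20 objects is the nominal value quoted in the text, and the figure of 5 visible surfaces per object is an assumption chosen purely for the example. It counts unordered pairs of surfaces, which differs from the squared count only by a constant factor.

```python
from math import comb


def pairwise_relationships(num_surfaces: int) -> int:
    """Number of unordered surface pairs among num_surfaces surfaces."""
    return comb(num_surfaces, 2)


def unpartitioned_cost(num_objects: int, surfaces_per_object: int) -> int:
    """Pairs to consider when every surface in the scene is related to every other."""
    return pairwise_relationships(num_objects * surfaces_per_object)


def partitioned_cost(num_objects: int, surfaces_per_object: int) -> int:
    """Pairs to consider when surfaces are only related within their own cluster."""
    return num_objects * pairwise_relationships(surfaces_per_object)


# 20 objects is the nominal value from the text;
# 5 surfaces per object is an assumed value for illustration only.
num_objects, surfaces_per_object = 20, 5

before = unpartitioned_cost(num_objects, surfaces_per_object)
after = partitioned_cost(num_objects, surfaces_per_object)

print(f"without clustering: {before} surface pairs")
print(f"with clustering:    {after} surface pairs")
print(f"reduction factor:   {before / after:.2f}")
```

Under these assumed numbers the reduction factor is of the same order as the number of objects, which is what the ratio of the two complexity estimates predicts when the partitioning is perfect.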