
Discussion

The purpose of invocation is to reduce the computation involved in the model-to-data matching process. This has been partially achieved by basing invocation on propagated plausibility values, which reduces the computation from a detailed object comparison to evidence accumulation. Unfortunately, virtually every object model must still be considered for each image structure, albeit in a simplified manner. On the other hand, because the model-to-data comparison has been simplified, it is now amenable to large-scale parallel processing.

One deficiency in this method is the absence of a justified formal criterion that determines when to invoke a model. Invoking whenever the plausibility was positive worked well in practice. However, most seriously incorrect hypotheses have plausibilities near -1.0, so a threshold somewhat below 0.0 could be considered, possibly with a different threshold for each object.
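The thresholding idea in the preceding paragraph can be sketched as follows. This is only an illustration, not the invocation code used in this work: the function name, the model names and the plausibility values are all hypothetical, and the only assumptions taken from the text are that plausibilities lie in [-1, 1] and that the default criterion is "plausibility greater than 0.0".

```python
from typing import Dict, List, Optional


def invoked_models(plausibility: Dict[str, float],
                   thresholds: Optional[Dict[str, float]] = None,
                   default_threshold: float = 0.0) -> List[str]:
    """Return the models whose plausibility exceeds their threshold.

    Hypothetical helper: `thresholds` holds optional per-model overrides;
    any model without an override uses `default_threshold`.
    """
    overrides = thresholds or {}
    return sorted(
        name
        for name, value in plausibility.items()
        if value > overrides.get(name, default_threshold)
    )


# Invented plausibility values for a single image context.
plausibility = {"robot_arm": 0.42, "trash_can": -0.15, "chair": -0.97}

# Criterion used in practice: invoke when the plausibility is positive.
positive_only = invoked_models(plausibility)

# Variant with a per-object threshold somewhat below 0.0.
relaxed = invoked_models(plausibility, thresholds={"trash_can": -0.3})
```

With these invented values the positive-only criterion invokes only "robot_arm", while the relaxed threshold also admits "trash_can". The strongly negative "chair" hypothesis is rejected in both cases.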

This work leaves several open "learning" problems:

How is the structure of the model network created and modified?
How are the features used for invocation selected?
How are the property and relationship weights chosen?

Other unsolved problems include resolving multiple invocations within the extended surface cluster hierarchy (as discussed above), preventing data feature evidence from being used for more than one model subcomponent, and deciding when to invoke generic models (as well as the object-specific ones). The theory could also be extended to non-shape properties (e.g. color and texture) and to the quantified descriptors (e.g. "larger", "much larger") proposed by Marr [112] as an attempt to achieve scale invariance. Finally, though the class hierarchy and evidence computations were defined, no significant testing of this feature was undertaken.

For each evidence computation, some natural constraints were proposed as specification criteria. However, there were never enough constraints to determine the computation uniquely. The hope is that the resulting freedom in choosing algorithms leads to only slightly different performance levels. This has been partially tested using substantially different property evidence evaluation and evidence integration functions, without significant differences in the invocation results.
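To make the point about under-determined computations concrete, the sketch below gives two different evidence integration functions. Neither is claimed to be a function used in this work. Both are hypothetical, and the constraints they are written to satisfy (outputs bounded in [-1, 1], monotonic in each input, and identical inputs yielding that same value) are assumptions chosen for illustration.

```python
import math
from typing import Sequence


def weighted_mean(evidence: Sequence[float], weights: Sequence[float]) -> float:
    """Integrate evidence values in [-1, 1] by a weighted arithmetic mean."""
    total = sum(weights)
    return sum(w * e for w, e in zip(weights, evidence)) / total


def squashed_mean(evidence: Sequence[float], weights: Sequence[float],
                  limit: float = 0.999) -> float:
    """Integrate evidence by averaging in the atanh domain.

    Values are clipped to [-limit, limit] so that atanh stays finite.
    Strong evidence (near -1 or +1) therefore counts for more than it
    does in the plain weighted mean.
    """
    total = sum(weights)
    stretched = sum(
        w * math.atanh(max(-limit, min(limit, e)))
        for w, e in zip(weights, evidence)
    )
    return math.tanh(stretched / total)


# Invented evidence values with equal weights.
evidence = [0.8, 0.5, -0.2]
weights = [1.0, 1.0, 1.0]

mean_result = weighted_mean(evidence, weights)
squashed_result = squashed_mean(evidence, weights)
```

For these inputs the two functions return different numerical values, but both are positive, so a "plausibility greater than 0.0" criterion would make the same invocation decision under either one.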

This chapter has concerned only visual recognition, but the invocation approach may have more general applicability. Any form of symbolic inference requires accessing the correct symbol, so model invocation is also a general cognitive problem, with the following aspects:

low level symbolic assertions are produced for the current input whether from an external (e.g. raw sense data) or internal (e.g. self-monitoring) source,
higher level concepts/symbols tend to be semi-distinctly characterizable based on "configurations" of lower level symbolic descriptions,
there are many potential higher level symbols, but only a small subset should be selected for closer consideration when matching a symbol,
the importance of a particular concept in invoking another is dependent on many factors, including structure, generics, experience and context, and
symbols "recognized" at one description level (either primitive or through matching) become usable for the invocation of more complex symbols.

Non-vision examples might include invoking a Schankian fast-food restaurant schema [144] or recognizing words in speech.

Final Comments

This chapter formalized the associative basis of a model invocation process whose major elements are object types, property evidence inputs and associative links based on generic and component relations. The theory was based on sets of constraints describing how different evidence affects plausibility, and on the use of surfaces and surface clusters as the contexts in which to accumulate evidence.


Bob Fisher 2004-02-26