Utilizing Temporal Associations

Having argued for the foundation of a 3D object recognition system based upon a viewer centred representation, we should bear in mind that the success of such a method essentially depends on the information gained from the collections of views. Although viewer centred representations are frequently implemented as a simple accumulation of the different aspects, a structured description showing the interrelations between the stored views is expected to be more beneficial. Building such a structure requires a guiding principle for organizing the information. Time has proven to be a good candidate, powerful and at the same time universal, because the knowledge of temporal neighbourhood naturally facilitates the deduction of causal and spatial relationships. In particular, the assumption of continuity leads to the conclusion that views appearing in close temporal succession will probably stem from the same object.

**Figure 1:** Considering the temporal context allows ambiguities to be resolved, as illustrated for the fourth view of the sequence. The depicted planes were kindly provided by the authors of Seibert and Waxman (1992).

With respect to recognition and automated learning, considering the temporal context makes it feasible to handle unfamiliar views of a trained object. Especially within the framework of viewer centred representations, information about temporal relations is important for resolving the ambiguities of single views, since the models of similar objects may contain indistinguishable aspects, as depicted in figure 1. Furthermore, the recognition of single views is always subject to inevitable errors, which can be remedied by evaluating view sequences (see table 1). Similarly, the confidence in classifications as well as the tolerance of errors can be increased by considering a sequence of views: a higher number of aspects assigned to the same object indicates a higher probability that the classification is correct.
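
The gain in confidence obtained from a sequence can be illustrated with a minimal sketch. It assumes that a single-view classifier returns, for every view, the set of objects whose models contain a matching aspect. The function name `classify_sequence`, the fractional voting rule and the object labels are illustrative assumptions and are not taken from the system described here.

```python
from collections import Counter


def classify_sequence(view_candidates):
    """Combine per-view candidate sets into one sequence-level decision.

    view_candidates: list of sets, one per view, holding the labels of all
    objects whose models contain a matching aspect (hypothetical input).
    Returns (best_object, confidence), confidence being the share of votes.
    """
    votes = Counter()
    for candidates in view_candidates:
        # An ambiguous view splits its single vote among all candidates;
        # an unfamiliar view (empty set) contributes no vote at all.
        for obj in candidates:
            votes[obj] += 1.0 / len(candidates)
    if not votes:
        return None, 0.0
    best, score = max(votes.items(), key=lambda item: item[1])
    return best, score / len(view_candidates)


# Five views; the fourth one is shared by the models of both objects.
sequence = [{"plane_a"}, {"plane_a"}, {"plane_a"},
            {"plane_a", "plane_b"}, {"plane_a"}]
best, confidence = classify_sequence(sequence)
print(best, confidence)
```

Taken alone, the fourth view cannot be assigned to either object, whereas its temporal neighbours settle the decision in favour of `plane_a`.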

Moreover, recording how often the sequences occur makes it possible to define a measure which denotes how typical a certain motion of an object is. Thus an increased stability of the object classification becomes achievable. In addition, the detection of exceptional situations is facilitated, which in turn allows possible dangers or required interventions to be recognized. In the example of a car, a rotation about the horizontal axis (fortunately) occurs less often than a turn about its vertical axis and may therefore be regarded as an exception.
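
One conceivable way of deriving such a typicality measure is to count how often one view follows another and to score a motion by the geometric mean of its transition frequencies. The following sketch is only an assumption about how the idea could be realised; the class `TransitionModel`, the `floor` parameter and the view labels are hypothetical.

```python
import math
from collections import defaultdict


class TransitionModel:
    """Counts view-to-view transitions seen in recorded sequences."""

    def __init__(self, floor=1e-3):
        self.counts = defaultdict(lambda: defaultdict(int))
        self.floor = floor  # assumed probability for unseen transitions

    def observe(self, sequence):
        for current, following in zip(sequence, sequence[1:]):
            self.counts[current][following] += 1

    def transition_prob(self, current, following):
        row = self.counts.get(current)
        if not row:
            return self.floor
        return max(row.get(following, 0) / sum(row.values()), self.floor)

    def typicality(self, sequence):
        """Geometric mean of the transition probabilities along a motion."""
        pairs = list(zip(sequence, sequence[1:]))
        if not pairs:
            raise ValueError("at least two views are required")
        log_sum = sum(math.log(self.transition_prob(a, b)) for a, b in pairs)
        return math.exp(log_sum / len(pairs))


model = TransitionModel()
for _ in range(9):  # frequent: turning about the vertical axis
    model.observe(["front", "left", "rear", "right", "front"])
model.observe(["front", "roof", "rear"])  # rare: rolling over

print(model.typicality(["front", "left", "rear"]))
print(model.typicality(["front", "roof", "rear"]))
```

A motion whose score falls below some threshold could then be flagged as an exceptional situation; the choice of such a threshold is left open here.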

There are several neurophysiological and psychological experiments hinting at the utilization of temporal associations in biology. Miyashita (1988) reports the training of macaque monkeys with a set of 97 randomly generated fractal images. In delayed matching to sample (DMS) tasks, the animals had to decide whether two consecutively presented patterns were the same or not. Training was performed over quite a long period of time by presenting the images in a cyclic repetition, thus maintaining the order of the training set. Subsequent single cell recordings in the inferotemporal cortex revealed neurons whose effective stimuli were formed by clusters of consecutive patterns of the training set. Accordingly, these randomly generated patterns established associations not because of their geometric similarity but because of their temporal connections. This conclusion is affirmed by further experiments of Sakai and Miyashita (1991), who managed to create associations between arbitrary pairs of shapes on the basis of a consecutive presentation.

**Table 1:** Possible deviations from a trained view sequence $ABCDE$ that become manageable by utilizing the temporal context. Within the examples, differing positions are denoted by lower case letters.

| Deviation | Examples | Possible causes |
|---|---|---|
| Mutation | internal: AxCDE, AxyDE, AxCyE; terminal: xBCDE, xyzDE, ABCxy | misclassification of single views or considerably altered sampling rate; sequence starts later or ends sooner, i.e. x, y, z belong to another object or are unfamiliar views |
| Deletion | BCDE, CDE, ACE | altered start or ending; changed speed: views are dropped or fused |
| Insertion | xABCDE, ABCDExy, AxBCDE, AxyBzCDE | altered start or ending; misclassification of single views or considerably altered sampling rate; slightly changed vantage point yields some new intermediate views |
| Inversion | baCDE, cbaDE, edcba | irregular movement; inverse direction of movement |
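
Mutations, deletions and insertions as listed in the table correspond to the elementary operations of the Levenshtein edit distance, so a tolerant comparison of an observed sequence with a trained one can be sketched along these lines. This is a substitute technique chosen for illustration, not necessarily the matching scheme used by the system; the function names are hypothetical, and a complete inversion is handled by the additional assumption that the reversed trained sequence is compared as well. Observed views that are known to the model are written in upper case here, unknown views in lower case.

```python
def edit_distance(observed, trained):
    """Levenshtein distance between two view sequences (strings or lists)."""
    previous = list(range(len(trained) + 1))
    for i, obs_view in enumerate(observed, 1):
        current = [i]
        for j, trained_view in enumerate(trained, 1):
            current.append(min(
                previous[j] + 1,          # extra observed view (insertion)
                current[j - 1] + 1,       # missing trained view (deletion)
                previous[j - 1] + (obs_view != trained_view),  # mutation
            ))
        previous = current
    return previous[-1]


def match_score(observed, trained):
    """Return (similarity in [0, 1], True if the reversed order fits better)."""
    forward = edit_distance(observed, trained)
    backward = edit_distance(observed, trained[::-1])
    best = min(forward, backward)
    longest = max(len(observed), len(trained), 1)
    return 1.0 - best / longest, backward < forward


print(edit_distance("AxCDE", "ABCDE"))     # one mutated view -> 1
print(edit_distance("ACE", "ABCDE"))       # two dropped views -> 2
print(edit_distance("AxyBzCDE", "ABCDE"))  # three inserted views -> 3
print(match_score("EDCBA", "ABCDE"))       # inverse direction of movement
```

Partial inversions such as baCDE are not treated specially in this sketch; they are simply charged as mutations of the affected positions.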

Psychophysical evidence for the existence of temporal associations can be found in Wallis (in press). His study makes use of training sequences which are artificially created by combining consecutive views of faces belonging to different (!) persons. If temporal associations develop, one would expect interconnections between the views of a sequence, possibly leading to their fusion into a single virtual face. Based on this hypothesis, recognition errors are expected to show a confusion between faces belonging to a common artificial sequence more often than between faces of different sequences. This was indeed confirmed when subjects were tested by the DMS method. In other words, the subjects formed associations between views because of their coincident appearance and not because of their similarity.

Summarizing the findings about temporal associations, Stryker (1991) deduces a powerful scheme which seems to obviate the need for mechanisms of geometric transformation or hierarchical connections of so-called trigger features. As previously mentioned for viewer centred representations, he considers temporal associations to be a trade-off between memory and computation, providing a means for the brain to accomplish perceptual constancy.