Consider a motion sequence of multiple objects viewed by a static camera.
Since no particular 3D model of motion is assumed, the problem is restricted
to the 2D projection of the real 3D motion.
Let $F_k$, $k = 1, 2, \dots, M$, be the $k$th frame of the
sequence, where $M$ is the total number of frames.
Assume feature points $P_{k,i}$, $i = 1, 2, \dots, N_k$, have been
detected in each $F_k$ prior to tracking. The number of points in the
$k$th frame, $N_k$, may vary from frame to frame.
Denote the total number of distinct points that appear in the sequence
by $N$. This number is equal to the total number of distinct trajectories.
An occluded trajectory counts as one although it consists of two or more
pieces.
When a point enters or leaves the view field in any frame $F_k$,
the trajectory is called partial.
A trajectory is broken if the point temporarily disappears
within the view field and later reappears. In this case, we
speak of (temporary) occlusion.
If a trajectory is broken, partial, or both, it is called
incomplete. Entries, exits, and temporary occlusions are called
events.
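The definitions above can be illustrated with a minimal sketch. It assumes a trajectory is summarized by a per-frame visibility list, which is a hypothetical representation chosen for illustration and not one prescribed by the text. It also assumes that a point missing from the first frames has entered late rather than started out occluded, since the two cases cannot be told apart from visibility alone.

```python
def classify_trajectory(visible):
    """Classify one trajectory from a per-frame visibility list.

    visible[k] is True when the point was detected in frame k (0-based).
    The function name and the returned keys are illustrative assumptions.
    """
    seen = [k for k, v in enumerate(visible) if v]
    if not seen:
        raise ValueError("the point never appears in the sequence")
    first, last = seen[0], seen[-1]

    # Partial: the point enters after the first frame or exits before the last.
    partial = first > 0 or last < len(visible) - 1
    # Broken: at least one frame is missing between first and last appearance.
    broken = len(seen) < last - first + 1

    return {
        "partial": partial,
        "broken": broken,
        "incomplete": partial or broken,
    }
```

For example, `classify_trajectory([True, True, False, True, True])` reports a broken but not partial trajectory, because the point is present in both the first and the last frame and disappears only in between.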
The feature point tracking problem is a motion correspondence problem under the general assumptions of:
Assumption 2 means limited accelerations, i.e., limited changes in motion directions and speeds. The speeds themselves are also limited so that a point is observed a sufficient number of times as it crosses the view field. However, small inter-frame displacements are not assumed. Assumption 4 implies directional continuity of broken trajectories, which makes smoothness applicable to occluded paths as well.
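One way to make "limited changes in motion directions and speeds" concrete is to compare two consecutive displacement vectors of a point. The sketch below is illustrative only: the function name and the two measures it returns are assumptions, not the cost function of any particular tracking algorithm.

```python
import math

def motion_change(p_prev, p_curr, p_next):
    """Return (direction change in radians, absolute speed change).

    Each argument is an (x, y) position of the same point in three
    consecutive frames. Speeds are displacements per frame.
    """
    d1 = (p_curr[0] - p_prev[0], p_curr[1] - p_prev[1])
    d2 = (p_next[0] - p_curr[0], p_next[1] - p_curr[1])
    s1 = math.hypot(d1[0], d1[1])
    s2 = math.hypot(d2[0], d2[1])

    if s1 == 0 or s2 == 0:
        # Direction is undefined for a stationary point; report no turn.
        angle = 0.0
    else:
        cos_a = (d1[0] * d2[0] + d1[1] * d2[1]) / (s1 * s2)
        # Clamp to guard against rounding slightly outside [-1, 1].
        angle = math.acos(max(-1.0, min(1.0, cos_a)))

    return angle, abs(s2 - s1)
```

Under the smooth-motion assumption both returned values stay small along a true trajectory, even when the displacements themselves are large.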
In addition to the general assumptions 1-4, most algorithms use specific assumptions concerning the admissible events. These assumptions are discussed in section 3.