Temporal Factorization

Lihi Zelnik

Temporal factorization [5] is the dual of the more familiar multi-body factorization. Traditional multi-body factorization approaches (e.g., [1,2,3]) provide spatial clustering/segmentation by grouping together points that move with consistent motions. This is done by grouping the columns of the correspondence matrix of [4]. Temporal factorization instead provides temporal clustering/segmentation by grouping together frames that capture consistent shapes. This is done by clustering the rows of the correspondence matrix of [4] instead of its columns. Temporal cuts are thus detected at non-rigid changes in the shape of the scene/object. For example, in a sequence showing a face that appears serious in some frames and smiles in others, all the "serious expression" frames will be grouped together and separated from all the "smile" frames, which will be classified as a second group, even though the head may meanwhile undergo various random motions.
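The row-clustering idea can be illustrated with a small synthetic sketch. This is not the algorithm of [5]; it is a minimal, noise-free illustration under stated assumptions: the correspondence matrix W has two interleaved rows per frame (x then y), the rank is known in advance, and frames are grouped by a simple connected-components pass over a thresholded affinity. The names frame_affinity and group_frames are hypothetical helpers introduced only for this example.

```python
import numpy as np

def frame_affinity(W, rank):
    """Frame-by-frame affinity from the row structure of a (2F x N) matrix W.

    Assumes rows 2f and 2f+1 hold the x and y coordinates of frame f.
    """
    # Register each row to its centroid to remove image translation, as in [4].
    W = W - W.mean(axis=1, keepdims=True)
    # U U^T is the row-wise counterpart of the point-by-point interaction
    # matrix used for grouping columns in multi-body factorization.
    U, _, _ = np.linalg.svd(W, full_matrices=False)
    U = U[:, :rank]
    Q = U @ U.T                                   # (2F x 2F)
    F = W.shape[0] // 2
    # Collapse each 2x2 block (x/y rows of a frame pair) into one score.
    blocks = Q.reshape(F, 2, F, 2)
    return np.sqrt((blocks ** 2).sum(axis=(1, 3)))  # (F x F)

def group_frames(A, rel_tol=1e-6):
    """Label frames by connected components of the thresholded affinity.

    Illustrative only: suitable for clean data, not for noisy measurements.
    """
    linked = A > rel_tol * A.max()
    labels = -np.ones(len(A), dtype=int)
    current = 0
    for seed in range(len(A)):
        if labels[seed] >= 0:
            continue
        labels[seed] = current
        stack = [seed]
        while stack:
            f = stack.pop()
            for g in np.flatnonzero(linked[f] & (labels < 0)):
                labels[g] = current
                stack.append(g)
        current += 1
    return labels

# Synthetic sequence: 8 frames view one rigid shape, then 7 frames view another,
# each frame under its own random affine camera and translation.
rng = np.random.default_rng(0)
N = 20
shapes = [rng.normal(size=(3, N)), rng.normal(size=(3, N))]
rows = []
for k, n_frames in enumerate([8, 7]):
    for _ in range(n_frames):
        M = rng.normal(size=(2, 3))   # random affine camera for this frame
        t = rng.normal(size=(2, 1))   # random image translation
        rows.append(M @ shapes[k] + t)
W = np.vstack(rows)                   # (30 x 20) correspondence matrix

# Rank 6 is assumed known here: two rigid shapes of rank 3 each.
labels = group_frames(frame_affinity(W, rank=6))
print(labels)
```

In this toy setting the camera motion differs in every frame, yet frames are grouped by which shape they observe, mirroring the "serious" versus "smile" example above. With real, noisy data the affinity is no longer exactly block-structured, so the rank and a more robust clustering step would have to be chosen with care.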

Example 1 - Temporal factorization using dense optical flow

Results of temporal factorization applied to a sequence taken from the movie "Braveheart". The actor (Mel Gibson) is serious at first and then smiles, while moving his head independently of his expression throughout the sequence. Optical flow was estimated relative to the first frame and the clustering was applied directly to it. We set the number of clusters to 2. This is a sample frame from the first detected temporal cluster, which shows the actor smiling:
Brave.a.jpg
This is a sample frame from the second detected temporal cluster, which shows the actor serious:
Brave.b.jpg
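For the dense case, one possible way to arrange the flow fields so that each frame contributes rows to a single matrix is sketched below. The array layout (F, H, W, 2), the interleaved u/v row order, and the name flow_to_matrix are assumptions made for illustration; the text above does not specify how the flow was arranged.

```python
import numpy as np

def flow_to_matrix(flows):
    """Stack flow fields of shape (F, H, W, 2) into a (2F x H*W) matrix.

    Row 2f holds the horizontal flow of frame f and row 2f+1 the vertical
    flow, both measured relative to the reference (first) frame.
    """
    flows = np.asarray(flows, dtype=float)
    F, H, Wd, _ = flows.shape
    # Move the (u, v) axis next to the frame axis, then flatten the pixels.
    return flows.transpose(0, 3, 1, 2).reshape(2 * F, H * Wd)

# Toy input: 4 frames of 3x5 flow, with frame 2 shifted uniformly.
flows = np.zeros((4, 3, 5, 2))
flows[2, ..., 0] = 1.5    # horizontal component
flows[2, ..., 1] = -0.5   # vertical component
D = flow_to_matrix(flows)
print(D.shape)
```

Each pixel then plays the role of a tracked point (a column) and each frame supplies two rows, so the rows can be handed to whatever row-clustering step is used.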

Example 2 - Temporal factorization using sparse tracked features

Results of temporal factorization (into 2 clusters) applied to a video clip taken from the movie "The Lord of the Rings: The Fellowship of the Ring". The clip shows two hobbits who are first calm and then screaming. The shape-based temporal factorization detected the cut between the two expressions and grouped all the "calm" frames separately from all the "scream" frames. An example "calm" frame is:
Lord.a.jpg
An example "scream" frame is:
Lord.b.jpg
Videos can be found here.

Bibliography

1: J. Costeira and T. Kanade.
A multi-body factorization method for motion analysis.
Proc. ICCV, pages 1071-1076, Cambridge, MA, June 1995.
2: C.W. Gear.
Multibody grouping from motion images.
IJCV, 29(2):133-150, 1998.
3: K. Kanatani.
Motion segmentation by subspace separation and model selection.
Proc. ICCV, Volume 1, pages 301-306, Vancouver, Canada, 2001.
4: C. Tomasi and T. Kanade.
Shape and motion from image streams under orthography: A factorization method.
IJCV, 9:137-154, November 1992.
5: L. Zelnik-Manor and M. Irani.
Temporal factorization vs. spatial factorization.
Proc. ECCV, Volume 2, pages 434-445, Prague, Czech Republic, 2004.