

High Resolution Tracking of Non-Rigid 3D Motion Using Harmonic Maps

In this section, we present a fully automatic method for high-resolution, non-rigid, dense 3D point tracking [9]. Harmonic maps were used in [11] for surface matching, but with a focus on rigid transformations. There, given the source manifold $M$ and the target manifold $D$, only the boundary condition $u\vert_{\partial M} : \partial M \rightarrow \partial D$ was used to constrain and uniquely determine the harmonic map $u: M \rightarrow D$. Applications such as high-resolution facial tracking, however, must account for non-rigid deformations with a high level of accuracy. To this end, our implementation of harmonic maps adds feature correspondence constraints to the boundary constraint. We select a set of motion-representative feature corners (for facial expression tracking, for example, the corners of the eyes, lips, and eyebrows) and establish inter-frame correspondences using standard techniques (for example, the hierarchical matching used in [10]). Integrating these correspondence constraints with the boundary condition yields harmonic maps that account not only for global rigid motion but also for subtle non-rigid deformations, and hence achieve high-accuracy registration and tracking.
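The effect of adding a feature constraint to the boundary condition can be illustrated with a minimal sketch. It is not the implementation used for the results in this section: it assumes a discrete harmonic map obtained by solving a graph Laplace equation with uniform edge weights, treats both boundary and feature constraints as hard positional constraints, and uses a small square grid as a stand-in for a surface mesh. The names `grid_mesh` and `harmonic_map` are hypothetical.

```python
import numpy as np

def grid_mesh(n):
    """Build an n x n grid graph: edges, planar positions, boundary vertex ids."""
    idx = lambda r, c: r * n + c
    edges = []
    for r in range(n):
        for c in range(n):
            if c + 1 < n:
                edges.append((idx(r, c), idx(r, c + 1)))
            if r + 1 < n:
                edges.append((idx(r, c), idx(r + 1, c)))
    pos = np.array([[c / (n - 1), r / (n - 1)]
                    for r in range(n) for c in range(n)])
    boundary = [idx(r, c) for r in range(n) for c in range(n)
                if r in (0, n - 1) or c in (0, n - 1)]
    return edges, pos, boundary

def harmonic_map(n_vertices, edges, constraints):
    """Discrete harmonic map with hard constraints (uniform weights assumed).

    constraints maps a vertex id to its prescribed 2D position; it may hold
    boundary vertices only, or boundary vertices plus interior features.
    """
    # Assemble the graph Laplacian L = D - A.
    L = np.zeros((n_vertices, n_vertices))
    for i, j in edges:
        L[i, i] += 1.0
        L[j, j] += 1.0
        L[i, j] -= 1.0
        L[j, i] -= 1.0
    fixed = sorted(constraints)
    free = [v for v in range(n_vertices) if v not in constraints]
    u = np.zeros((n_vertices, 2))
    u[fixed] = [constraints[v] for v in fixed]
    # Free vertices satisfy L_ff u_f = -L_fc u_c (zero Laplacian at each).
    A = L[np.ix_(free, free)]
    b = -L[np.ix_(free, fixed)] @ u[fixed]
    u[free] = np.linalg.solve(A, b)
    return u

n = 5
edges, pos, boundary = grid_mesh(n)
center = (n // 2) * n + (n // 2)

# Boundary condition only: boundary vertices are pinned to their own positions.
boundary_only = {v: pos[v] for v in boundary}
u_boundary = harmonic_map(n * n, edges, boundary_only)

# Boundary condition plus one feature correspondence at the center vertex.
with_feature = dict(boundary_only)
with_feature[center] = np.array([0.6, 0.5])
u_feature = harmonic_map(n * n, edges, with_feature)
```

In this toy setting the two maps agree on the boundary and differ in the interior, where the feature constraint pulls the center vertex and its neighbours toward the prescribed position. A mesh-based implementation would typically replace the uniform weights with weights derived from the surface geometry.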

The algorithm is illustrated in Figure 1 by considering the example of a synthetic surface $S$ undergoing non-rigid deformation.

Figure 1: Illustration of Harmonic Map: A Synthetic Example. (a) $S_o$: Initial configuration of surface. (b) $S_t$: Surface after non-rigid deformation. (c) $D_o$: Harmonic map of $S_o$ with the hard boundary constraints only. (d) $D_t$: Harmonic map of $S_t$ with the hard boundary constraints only. Although $D_o$ and $D_t$ conform to each other around the boundary, the interior non-rigid deformation is still unaccounted for. (e) $D'_o$: Harmonic map of $S_o$ with the 'tip of the nose' as an additional feature-correspondence constraint. Imposing correspondence constraints aligns $D'_o$ and $D_t$ better, resulting in accurate registration.

High-quality dense point clouds of facial geometry moving at video speeds are acquired using a phase-shifting based structured light ranging technique [6]. To use such data for the temporal study of the subtle dynamics in expressions, an efficient non-rigid 3D motion tracking algorithm is needed to establish inter-frame correspondences. Because our dynamic range sequences are acquired at a high frame rate (30 Hz), we can assume that the local deformation between two adjacent frames is small. To register two frames, we align their respective harmonic maps as closely as possible by imposing suitable boundary and feature constraints. The motivation is to establish a common parametric domain for the two surfaces. In our case, the harmonic maps are diffeomorphisms, that is, one-to-one and onto, so a point in the common domain corresponds to exactly one point on each surface, which allows us to recover the 3D registration between the two frames. This property makes harmonic maps a natural choice for surface parameterization in tracking applications.
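How a common parametric domain yields a 3D registration can be sketched as follows. This is a simplified illustration under stated assumptions, not the method used here: it assumes that each surface already has 2D harmonic-map coordinates for its vertices, and it matches each source vertex to the target vertex nearest to it in the 2D domain instead of interpolating within target triangles. The function name `register_via_domain` and the sample coordinates are hypothetical.

```python
import numpy as np

def register_via_domain(uv_src, uv_tgt, xyz_tgt):
    """Match source vertices to target vertices through a shared 2D domain.

    uv_src:  (n, 2) domain coordinates of the source surface vertices
    uv_tgt:  (m, 2) domain coordinates of the target surface vertices
    xyz_tgt: (m, 3) 3D positions of the target surface vertices
    Returns the index of the nearest target vertex for each source vertex,
    and the 3D position of that target vertex.
    """
    # Squared distances in the 2D domain between every source/target pair.
    d2 = ((uv_src[:, None, :] - uv_tgt[None, :, :]) ** 2).sum(axis=2)
    nearest = d2.argmin(axis=1)
    return nearest, xyz_tgt[nearest]

# Toy data: four target vertices and two source vertices.
uv_tgt = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
xyz_tgt = np.array([[0.0, 0.0, 0.0], [1.0, 0.0, 0.2],
                    [0.0, 1.0, 0.1], [1.0, 1.0, 0.5]])
uv_src = np.array([[0.9, 0.1], [0.1, 0.8]])

matches, matched_xyz = register_via_domain(uv_src, uv_tgt, xyz_tgt)
```

The dense pairwise distance matrix is only practical for small inputs; for point counts like those reported in this section, a spatial index over the target domain coordinates would be the more reasonable choice.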

The outline of the non-rigid tracking algorithm is given as follows:

The accuracy of the proposed tracking algorithm is demonstrated through experiments on real data. We performed tracking on four subjects performing various expressions, for a total of twelve sequences of 250-300 frames each (at 30 Hz). Each frame contains approximately 80K 3D points, whereas the generic face mesh contains 8K nodes. The tracking results are available as video clips at http://www.cs.sunysb.edu/~ial/expressionModeling.html, including opening and closing of the mouth (female subject) and a strongly asymmetric smile (male subject). Our technique remains accurate even in the case of topology change and severe 'folding' of the data (see Figure 2).

Figure 2: Snapshots from a Tracking Sequence of Subject A. (a) Initial data frame. (b) Initial tracked frame. (c) Data at the expression peak. (d) Tracked data at the peak. (e) Close-up at the peak.


Yang Wang 2006-02-15