Suppose that all the cameras obey the affine projection model (2). Writing this equation for all views and all object points at once gives
$$ W = MS. \tag{6} $$
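The matrix $W$ is called the measurement matrix, and (6) means that it can be decomposed into a matrix $M$ representing the camera motion and a matrix $S$ representing the object shape. This factorization can be computed through singular value decomposition (SVD) as follows. Let
$$ W = U_1 \Sigma_1 V_1^\top + U_2 \Sigma_2 V_2^\top \tag{7} $$
be the SVD of the measurement matrix, with $\Sigma_1 = \mathrm{diag}(\sigma_1, \sigma_2, \sigma_3)$, $\Sigma_2 = \mathrm{diag}(\sigma_4, \sigma_5, \ldots)$, and singular values $\sigma_1 \geq \sigma_2 \geq \cdots \geq 0$. The column vectors of the matrices $U_k$ and $V_k$ ($k = 1, 2$) are orthonormal, which means $U_k^\top U_k = I$ and $V_k^\top V_k = I$. If there is no noise in the 2D image coordinates, $W$ has rank 3 and $\Sigma_2$ in (7) is zero. Therefore one possible factorization of $W$ is given by
$$ \hat{M} = U_1 \Sigma_1^{1/2}, \qquad \hat{S} = \Sigma_1^{1/2} V_1^\top. \tag{8} $$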
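Although $W$ does not necessarily have rank 3 in the presence of noise, the solution (8) is optimal in the sense that the ``corrected'' measurement matrix $\hat{W} = \hat{M}\hat{S}$ minimizes the criterion $\|W - \hat{W}\|_F^2$ (squared Frobenius norm) under the constraint $\mathrm{rank}\,\hat{W} = 3$ [10].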
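The SVD-based factorization and its optimality under noise can be illustrated with a minimal NumPy sketch. The function name `factorize_rank3`, the matrix sizes, the noise level, and the synthetic data are all assumptions made for illustration. Splitting the square roots of the three largest singular values between the two factors is also just one possible choice, not necessarily the one intended in (8).

```python
import numpy as np

def factorize_rank3(W):
    """Split W into a motion factor and a shape factor using its rank-3 SVD.

    Hypothetical helper: keeps the three largest singular values and
    divides their square roots between the two factors.
    """
    U, s, Vt = np.linalg.svd(W, full_matrices=False)
    root = np.sqrt(s[:3])
    M_hat = U[:, :3] * root            # three leading left singular vectors, scaled
    S_hat = root[:, None] * Vt[:3]     # three leading right singular vectors, scaled
    return M_hat, S_hat, s

# Synthetic data (assumed sizes): 4 views with 2 rows each, 10 object points.
rng = np.random.default_rng(0)
M_true = rng.standard_normal((8, 3))
S_true = rng.standard_normal((3, 10))
W_clean = M_true @ S_true
W_noisy = W_clean + 0.01 * rng.standard_normal(W_clean.shape)

# Noise-free case: W has rank 3, so the factorization reproduces it exactly.
M_hat, S_hat, s = factorize_rank3(W_clean)
clean_residual = np.linalg.norm(W_clean - M_hat @ S_hat)

# Noisy case: the residual of the rank-3 "corrected" matrix equals the
# energy of the discarded singular values (Frobenius norm).
M_n, S_n, s_n = factorize_rank3(W_noisy)
noisy_residual = np.linalg.norm(W_noisy - M_n @ S_n)
tail_energy = np.sqrt(np.sum(s_n[3:] ** 2))
```

In the noisy case, `noisy_residual` matches `tail_energy`. This is the smallest Frobenius-norm error any rank-3 matrix can achieve.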
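However, the solution (8) is not unique. If $(\hat{M}, \hat{S})$ is one possible factorization, $(\hat{M}A, A^{-1}\hat{S})$ yields another factorization for an arbitrary $3 \times 3$ non-singular matrix $A$. Conversely, if the measurement matrix is factored in two ways, i.e. $W = \hat{M}\hat{S}$ and $W = \hat{M}'\hat{S}'$, there exists a $3 \times 3$ non-singular matrix $A$ which satisfies $\hat{M}' = \hat{M}A$ and $\hat{S}' = A^{-1}\hat{S}$ (see appendix A). This means that the camera motion $\hat{M}$ and the object shape $\hat{S}$ obtained by decomposing the measurement matrix differ from the ``true'' values, $M$ and $S$, only by an unknown affine transformation $A$, that is,
$$ \hat{M} = MA, \qquad \hat{S} = A^{-1}S. \tag{9} $$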
In other words, shape and motion can be recovered only up to an unknown affine transformation, and the 3D affine coordinate frame in which the motion and shape are described can be chosen arbitrarily. We call this kind of recovery affine reconstruction.
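The affine ambiguity can be checked numerically with a short sketch. The matrix sizes, the random data, and the particular matrix `A` below are assumptions chosen for illustration. Any non-singular $3 \times 3$ matrix would serve equally well.

```python
import numpy as np

rng = np.random.default_rng(1)
M = rng.standard_normal((8, 3))    # assumed motion factor (4 views, 2 rows each)
S = rng.standard_normal((3, 10))   # assumed shape factor (10 points)
W = M @ S                          # noise-free measurement matrix

# An arbitrary non-singular 3x3 matrix (its determinant is 7).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 1.0, 1.0],
              [1.0, 0.0, 3.0]])

# Transformed factors: motion times A, and A^{-1} times shape.
M2 = M @ A
S2 = np.linalg.solve(A, S)         # computes A^{-1} S without forming the inverse

# Both pairs reproduce the same measurement matrix.
same_product = np.allclose(M2 @ S2, W)

# Conversely, the matrix relating the two motion factors can be recovered
# by least squares, since M has full column rank.
A_rec = np.linalg.lstsq(M, M2, rcond=None)[0]
```

Here `same_product` is true although `M2` differs from `M`, and `A_rec` agrees with `A` up to rounding. The measurement matrix alone therefore cannot tell the two factorizations apart.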