This section introduces the mathematical background on perspective projections necessary for our purposes. Our notation follows [6].
A pinhole camera is modeled by its optical center C and its retinal plane (or image plane) $\mathcal{R}$. A 3-D point W is projected into an image point m given by the intersection of $\mathcal{R}$ with the line containing C and W.
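Let $\mathbf{w} = [x, y, z]^\top$ be the coordinates of W in the world reference frame (fixed arbitrarily) and $\mathbf{m} = [u, v]^\top$ the pixel coordinates of m. In homogeneous (or projective) coordinates
$$\tilde{\mathbf{w}} = [x, y, z, 1]^\top, \qquad \tilde{\mathbf{m}} = [u, v, 1]^\top, \tag{1}$$
the transformation from $\mathbf{w}$ to $\mathbf{m}$ is linear and is given by a $3 \times 4$ matrix $\tilde{P}$:
$$\kappa\, \tilde{\mathbf{m}} = \tilde{P}\, \tilde{\mathbf{w}}, \tag{2}$$
where $\kappa$ is a nonzero scale factor.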
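The camera is therefore modeled by its perspective projection matrix (henceforth simply camera matrix) $\tilde{P}$, which can be decomposed, using the QR factorization, into the product
$$\tilde{P} = A\,[R \mid \mathbf{t}]. \tag{3}$$
The matrix $A$ depends on the intrinsic parameters only and has the form
$$A = \begin{bmatrix} \alpha_u & \gamma & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{4}$$
where $\alpha_u$ and $\alpha_v$ are the focal lengths in horizontal and vertical pixels, $(u_0, v_0)$ are the coordinates of the principal point, and $\gamma$ is the skew factor.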
The camera position and orientation (extrinsic parameters) are encoded by the $3 \times 3$ rotation matrix $R$ and the translation vector $\mathbf{t}$, representing the rigid transformation that aligns the camera reference frame (Fig. 1) and the world reference frame.
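As an illustration of the camera model of equations (3) and (4), the following minimal sketch assembles a camera matrix, projects a point, and recovers the intrinsic and extrinsic parameters again. It assumes NumPy; the function names (`camera_matrix`, `project`, `decompose`) and all numerical values are hypothetical and chosen only for the example. The decomposition is written as an RQ factorization built on top of NumPy's QR routine, which is one possible way to realize the factorization mentioned above, not necessarily the one the authors had in mind.

```python
import numpy as np

def camera_matrix(A, R, t):
    """Assemble the 3x4 camera matrix P = A [R | t]."""
    return A @ np.hstack([R, np.reshape(t, (3, 1))])

def project(P, w):
    """Project a 3-D world point w to pixel coordinates (u, v)."""
    w_h = np.append(w, 1.0)      # homogeneous coordinates [x, y, z, 1]
    m_h = P @ w_h                # equals kappa * [u, v, 1]
    return m_h[:2] / m_h[2]      # divide by the scale factor kappa

def decompose(P):
    """Recover A, R, t from P = A [R | t] (A upper triangular, positive diagonal)."""
    J = np.flipud(np.eye(3))                  # row-reversal matrix, J @ J = I
    Q, U = np.linalg.qr((J @ P[:, :3]).T)     # ordinary QR of the reversed block
    A = J @ U.T @ J                           # upper triangular factor
    R = J @ Q.T                               # orthogonal factor
    S = np.diag(np.sign(np.diag(A)))          # sign fix: make diag(A) positive
    A, R = A @ S, S @ R
    t = np.linalg.solve(A, P[:, 3])           # last column of P is A t
    return A, R, t

# Illustrative values (assumptions, not taken from the text).
A = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])
theta = 0.1                                   # rotation about the y axis, radians
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.1, -0.05, 2.0])

P = camera_matrix(A, R, t)
m = project(P, np.array([0.2, -0.1, 3.0]))    # pixel coordinates of one point
A_rec, R_rec, t_rec = decompose(P)            # should reproduce A, R, t
```

The sign fix is what makes the factorization unique: without it, the triangular factor returned by a generic QR routine may have negative diagonal entries, which would correspond to negative focal lengths.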
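Let us consider the case of two cameras (see Fig. 2). If we take the first camera reference frame as the world reference frame, we can write the two following general camera matrices:
$$\tilde{P} = A\,[I \mid \mathbf{0}], \qquad \tilde{P}' = A'\,[R \mid \mathbf{t}].$$
A three-dimensional point $\mathbf{w}$ is projected onto both image planes, to points $\tilde{\mathbf{m}}$ and $\tilde{\mathbf{m}}'$, with $\kappa\,\tilde{\mathbf{m}} = \tilde{P}\tilde{\mathbf{w}}$ and $\kappa'\,\tilde{\mathbf{m}}' = \tilde{P}'\tilde{\mathbf{w}}$, which constitute a conjugate pair. From the left camera we obtain:
$$\kappa\,\tilde{\mathbf{m}} = A\,\mathbf{w} \quad\Longrightarrow\quad \mathbf{w} = \kappa\,A^{-1}\tilde{\mathbf{m}},$$
and substituting into the projection of the right camera gives
$$\kappa'\,\tilde{\mathbf{m}}' = A'(R\,\mathbf{w} + \mathbf{t}) = \kappa\,A' R A^{-1}\tilde{\mathbf{m}} + A'\mathbf{t}. \tag{8}$$
Equation (8) means that $\tilde{\mathbf{m}}'$ lies on the line going through the epipole $\tilde{\mathbf{e}}' = A'\mathbf{t}$ and the point $A' R A^{-1}\tilde{\mathbf{m}}$. In projective coordinates the collinearity of these three points can be expressed with the external product:
$$\tilde{\mathbf{m}}'^\top \left(\tilde{\mathbf{e}}' \wedge A' R A^{-1}\tilde{\mathbf{m}}\right) = 0,$$
or
$$\tilde{\mathbf{m}}'^\top F\,\tilde{\mathbf{m}} = 0, \tag{9}$$
where $F = [\tilde{\mathbf{e}}']_\wedge A' R A^{-1}$ is the fundamental matrix and $[\tilde{\mathbf{e}}']_\wedge$ denotes the skew-symmetric matrix such that $[\tilde{\mathbf{e}}']_\wedge \mathbf{x} = \tilde{\mathbf{e}}' \wedge \mathbf{x}$.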
Since $F$ is the product of a skew-symmetric (hence singular) $3 \times 3$ matrix and invertible matrices, the rank of the fundamental matrix $F$ is in general two and, being defined up to a scale factor, $F$ depends upon seven parameters. In the most general case, the only geometrical information that can be computed from pairs of images is the fundamental matrix. Its computation requires a minimum of eight point correspondences to obtain a unique solution [29,51].
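The properties of the fundamental matrix stated above can be checked numerically. The sketch below assumes NumPy and the standard closed form $F = [A'\mathbf{t}]_\wedge A' R A^{-1}$ for cameras $A[I \mid \mathbf{0}]$ and $A'[R \mid \mathbf{t}]$; the helper names (`skew`, `fundamental_from_cameras`, `pixel`) and the numerical values are hypothetical. It builds $F$ from known cameras rather than estimating it from correspondences, so it illustrates the epipolar constraint (9) and the rank deficiency, not the eight-point computation of [29,51].

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix with skew(v) @ x == np.cross(v, x)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def fundamental_from_cameras(A, A2, R, t):
    """F = [e']_x A' R A^{-1} for cameras A [I | 0] and A' [R | t]."""
    e2 = A2 @ t                                  # epipole in the second image
    F = skew(e2) @ A2 @ R @ np.linalg.inv(A)
    return F / np.linalg.norm(F)                 # F is only defined up to scale

def pixel(A, w_cam):
    """Homogeneous pixel coordinates [u, v, 1] of a point given in the camera frame."""
    m = A @ w_cam
    return m / m[2]

# Illustrative intrinsics and relative pose (assumptions, not from the text).
A = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
A2 = np.array([[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.1])

F = fundamental_from_cameras(A, A2, R, t)

# Random 3-D points in front of both cameras, expressed in the first camera frame.
rng = np.random.default_rng(0)
points = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(8, 3))

# Epipolar residuals m'^T F m for each conjugate pair (zero up to round-off).
residuals = np.array([pixel(A2, R @ w + t) @ F @ pixel(A, w) for w in points])

# Two non-negligible singular values and one at round-off level: rank two.
singular_values = np.linalg.svd(F, compute_uv=False)
```

Normalizing $F$ to unit Frobenius norm is a convenience that keeps the residuals and singular values on a comparable scale; any nonzero multiple of $F$ satisfies (9) equally well.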
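It can be seen that (9) is equivalent to
$$\tilde{\mathbf{m}}'^\top A'^{-\top} [\mathbf{t}]_\wedge R\, A^{-1} \tilde{\mathbf{m}} = 0.$$
Changing to normalized coordinates, $\tilde{\mathbf{n}} = A^{-1}\tilde{\mathbf{m}}$ and $\tilde{\mathbf{n}}' = A'^{-1}\tilde{\mathbf{m}}'$, one obtains the original formulation of the Longuet-Higgins [27] equation,
$$\tilde{\mathbf{n}}'^\top E\, \tilde{\mathbf{n}} = 0,$$
where $E = [\mathbf{t}]_\wedge R$ is the essential matrix.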
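Unlike the fundamental matrix, whose only property is being of rank two, the essential matrix is characterized by the two constraints found by Huang and Faugeras [25], namely the nullity of the determinant and the equality of the two non-zero singular values. Indeed, the following theorem holds: a real $3 \times 3$ matrix $E$ can be factorized as the product of a nonzero skew-symmetric matrix and a rotation matrix if and only if $E$ has two identical non-zero singular values and a zero singular value.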
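The two constraints on the essential matrix lend themselves to a direct numerical check. The following sketch assumes NumPy and the usual form $E = [\mathbf{t}]_\wedge R$; the names `essential` and `is_essential`, the relative tolerance, and the numerical values are hypothetical choices made for the example. For contrast, it also builds a fundamental matrix from the same $E$ with a non-trivial intrinsic matrix, which keeps rank two but generally loses the equality of the singular values.

```python
import numpy as np

def skew(v):
    """Skew-symmetric matrix with skew(v) @ x == np.cross(v, x)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def essential(R, t):
    """Essential matrix E = [t]_x R."""
    return skew(t) @ R

def is_essential(M, tol=1e-9):
    """Check for two equal non-zero singular values and one zero singular value."""
    s = np.linalg.svd(M, compute_uv=False)      # sorted in descending order
    return bool(s[0] > 0.0
                and abs(s[0] - s[1]) < tol * s[0]
                and s[2] < tol * s[0])

# Illustrative relative pose and intrinsics (assumptions, not from the text).
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.1])
A = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])

E = essential(R, t)
A_inv = np.linalg.inv(A)
F = A_inv.T @ E @ A_inv          # fundamental matrix when both views share A

# Longuet-Higgins equation in normalized coordinates for one 3-D point.
w = np.array([0.3, -0.2, 5.0])   # point in the first camera frame
n1 = w / w[2]                    # normalized coordinates in the first view
w2 = R @ w + t
n2 = w2 / w2[2]                  # normalized coordinates in the second view
residual = n2 @ E @ n1           # zero up to round-off

e_ok = is_essential(E)           # E satisfies both constraints
f_ok = is_essential(F)           # F is rank two but its singular values differ
```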
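Given two views of a scene, there is a linear projective transformation (a homography) relating the projection $\mathbf{m}$ of a point of a plane $\Pi$ in the first view to its projection $\mathbf{m}'$ in the second view. This mapping is given by a $3 \times 3$ invertible matrix $H_\Pi$ such that
$$\tilde{\mathbf{m}}' \simeq H_\Pi\, \tilde{\mathbf{m}}, \tag{15}$$
where $\simeq$ denotes equality up to a nonzero scale factor.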
(16) |
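To make the plane-induced homography concrete, the sketch below checks numerically that a single $3 \times 3$ matrix maps the first-view projections of coplanar points to their second-view projections. It assumes NumPy, cameras $A[I \mid \mathbf{0}]$ and $A'[R \mid \mathbf{t}]$, and a plane written as $\mathbf{n}^\top\mathbf{w} = d$ in the first camera frame, for which the standard closed form is $H_\Pi = A'(R + \mathbf{t}\,\mathbf{n}^\top / d)A^{-1}$. This closed form, the helper names, and the numerical values are assumptions introduced for the example.

```python
import numpy as np

def plane_homography(A, A2, R, t, n, d):
    """H = A' (R + t n^T / d) A^{-1} for the plane n^T w = d (first camera frame)."""
    return A2 @ (R + np.outer(t, n) / d) @ np.linalg.inv(A)

def pixel(A, w_cam):
    """Homogeneous pixel coordinates [u, v, 1] of a point given in the camera frame."""
    m = A @ w_cam
    return m / m[2]

# Illustrative cameras and plane (assumptions, not from the text).
A = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
A2 = np.array([[750.0, 0.0, 300.0], [0.0, 760.0, 250.0], [0.0, 0.0, 1.0]])
theta = 0.1
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([0.5, 0.0, 0.1])
n = np.array([0.1, -0.2, 1.0])
n = n / np.linalg.norm(n)        # unit normal of the plane
d = 5.0                          # distance of the plane from the first camera center

H = plane_homography(A, A2, R, t, n, d)

# Sample points on the plane, project them in both views, and compare
# the transferred point H m with the true second-view projection m'.
xy = np.array([[-1.0, -1.0], [1.0, -0.5], [0.0, 0.8], [0.5, 0.5]])
errors = []
for x, y in xy:
    z = (d - n[0] * x - n[1] * y) / n[2]     # solve n^T w = d for z
    w = np.array([x, y, z])
    m1 = pixel(A, w)
    m2 = pixel(A2, R @ w + t)
    m2_pred = H @ m1
    m2_pred = m2_pred / m2_pred[2]           # remove the projective scale
    errors.append(np.linalg.norm(m2_pred - m2))
max_error = max(errors)                      # transfer error in pixels
```

Points that do not lie on the plane are in general not transferred correctly by the same matrix, which is what distinguishes the homography of a plane from the epipolar constraint.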