This section introduces the mathematical background on perspective projections necessary for our purposes. Our notation follows [6].
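A pinhole camera is modeled by its optical center $C$ and its retinal plane (or image plane) $\mathcal{R}$. A 3-D point $W$ is projected into an image point $m$ given by the intersection of $\mathcal{R}$ with the line containing $C$ and $W$.

Let $\mathbf{w} = [x \;\, y \;\, z]^\top$ be the coordinates of $W$ in the world reference frame (fixed arbitrarily) and $\mathbf{m} = [u \;\, v]^\top$ the pixel coordinates of $m$. In homogeneous (or projective) coordinates

$$\tilde{\mathbf{w}} = [x \;\, y \;\, z \;\, 1]^\top, \qquad \tilde{\mathbf{m}} = [u \;\, v \;\, 1]^\top, \tag{1}$$

the transformation from $\tilde{\mathbf{w}}$ to $\tilde{\mathbf{m}}$ is linear and is given by a $3 \times 4$ matrix $\tilde{P}$:

$$\kappa\,\tilde{\mathbf{m}} = \tilde{P}\,\tilde{\mathbf{w}},$$

where $\kappa$ is a scale factor.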
The camera is therefore modeled by its perspective projection matrix (henceforth simply camera matrix) $\tilde{P}$, which can be decomposed, using the QR factorization, into the product

$$\tilde{P} = A\,[R \mid \mathbf{t}]. \tag{3}$$
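The upper triangular factor $A$ of this decomposition depends on the intrinsic parameters only:

$$A = \begin{bmatrix} \alpha_u & \gamma & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}, \tag{4}$$

where $\alpha_u$ and $\alpha_v$ are the focal lengths in horizontal and vertical pixels, $(u_0, v_0)$ are the coordinates of the principal point, and $\gamma$ is the skew factor.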
The camera position and orientation (extrinsic parameters) are encoded by the $3 \times 3$ rotation matrix $R$ and the translation $\mathbf{t}$, representing the rigid transformation that aligns the camera reference frame (Fig. 1) and the world reference frame.
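The projection described above can be sketched numerically. The following is a minimal illustration, not the authors' implementation: it assumes the camera matrix has the form $A\,[R \mid \mathbf{t}]$ with an upper triangular intrinsic matrix $A$, and all numeric values (focal lengths, principal point, pose) as well as the helper names `matmul`, `camera_matrix` and `project` are hypothetical. Plain Python lists are used so that the sketch has no dependencies.

```python
import math

def matmul(X, Y):
    """Product of two matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def camera_matrix(A, R, t):
    """Build the 3x4 camera matrix P = A [R | t]."""
    Rt = [R[i] + [t[i]] for i in range(3)]   # append t as a fourth column
    return matmul(A, Rt)

def project(P, w):
    """Project the 3-D point w; return pixel coordinates (u, v) and the scale kappa."""
    w_h = [[w[0]], [w[1]], [w[2]], [1.0]]    # homogeneous coordinates of w
    x, y, kappa = (row[0] for row in matmul(P, w_h))
    return (x / kappa, y / kappa), kappa

# Hypothetical intrinsic parameters: 800 px focal lengths, principal point (320, 240), no skew.
A = [[800.0, 0.0, 320.0],
     [0.0, 800.0, 240.0],
     [0.0, 0.0, 1.0]]

# Hypothetical extrinsic parameters: 10 degree rotation about the y axis plus a translation.
theta = math.radians(10.0)
R = [[math.cos(theta), 0.0, math.sin(theta)],
     [0.0, 1.0, 0.0],
     [-math.sin(theta), 0.0, math.cos(theta)]]
t = [0.1, 0.0, 0.5]

P = camera_matrix(A, R, t)
(u, v), kappa = project(P, [0.2, -0.1, 4.0])
print(u, v, kappa)
```

Dividing by the third homogeneous component recovers the pixel coordinates; with $R = I$ and $\mathbf{t} = \mathbf{0}$ a point on the optical axis lands on the principal point.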
Let us consider the case of two cameras (see Fig. 2).
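If we take the first camera reference frame as the world reference frame, we can write the two following general camera matrices:

$$\tilde{P} = A\,[I \mid \mathbf{0}] = [A \mid \mathbf{0}], \qquad \tilde{P}' = A'\,[R \mid \mathbf{t}].$$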
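A three-dimensional point $\mathbf{w}$ is projected onto both image planes, to points $\tilde{\mathbf{m}}$ and $\tilde{\mathbf{m}}'$, which constitute a conjugate pair. From the left camera we obtain:

$$\kappa\,\tilde{\mathbf{m}} = A\,[I \mid \mathbf{0}]\,\tilde{\mathbf{w}} = A\,\mathbf{w},$$

so that $\mathbf{w} = \kappa A^{-1}\tilde{\mathbf{m}}$. Substituting into the projection onto the right camera gives

$$\kappa'\,\tilde{\mathbf{m}}' = A'(R\,\mathbf{w} + \mathbf{t}) = \kappa\,A' R A^{-1}\tilde{\mathbf{m}} + A'\mathbf{t}. \tag{8}$$

Equation (8) means that $\tilde{\mathbf{m}}'$ lies on the line going through the epipole $\tilde{\mathbf{e}}' = A'\mathbf{t}$ (the projection of the first optical center onto the second image) and the point $A' R A^{-1}\tilde{\mathbf{m}}$. In projective coordinates the collinearity of these three points can be expressed with the external product:

$$\tilde{\mathbf{m}}'^\top \left(\tilde{\mathbf{e}}' \wedge A' R A^{-1}\tilde{\mathbf{m}}\right) = 0,$$

or

$$\tilde{\mathbf{m}}'^\top F\,\tilde{\mathbf{m}} = 0, \tag{9}$$

where $F = [\tilde{\mathbf{e}}']_\wedge A' R A^{-1}$ is the fundamental matrix and $[\tilde{\mathbf{e}}']_\wedge$ is the skew-symmetric matrix such that $[\tilde{\mathbf{e}}']_\wedge \mathbf{x} = \tilde{\mathbf{e}}' \wedge \mathbf{x}$. Since $\det([\tilde{\mathbf{e}}']_\wedge) = 0$, the rank of $F$ is in general two and, being defined up to a scale factor, it depends upon seven parameters.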
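As a numerical illustration of the epipolar constraint, the sketch below builds a fundamental matrix from known cameras and checks it on a synthetic conjugate pair. It assumes the closed form $F = [\tilde{\mathbf{e}}']_\wedge A' R A^{-1}$ with epipole $\tilde{\mathbf{e}}' = A'\mathbf{t}$ and the first camera frame as world frame. The calibration values, the pose, and all helper names are hypothetical and chosen only for the example; in practice $F$ would be estimated from point correspondences rather than computed from known cameras.

```python
import math

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def skew(v):
    """Skew-symmetric matrix such that skew(v) applied to x is the cross product of v and x."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def inv3(M):
    """Inverse of a 3x3 matrix via the adjugate (cyclic indices carry the cofactor signs)."""
    d = det3(M)
    return [[(M[(j + 1) % 3][(i + 1) % 3] * M[(j + 2) % 3][(i + 2) % 3]
              - M[(j + 1) % 3][(i + 2) % 3] * M[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def fundamental(A1, A2, R, t):
    """F = [e']_x A' R A^{-1}, with epipole e' = A' t."""
    e2 = matvec(A2, t)
    return matmul(skew(e2), matmul(A2, matmul(R, inv3(A1))))

# Hypothetical calibration and relative pose (illustrative values only).
A1 = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
A2 = [[820.0, 0.0, 310.0], [0.0, 820.0, 250.0], [0.0, 0.0, 1.0]]
c, s = math.cos(0.1), math.sin(0.1)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [-0.5, 0.02, 0.05]

F = fundamental(A1, A2, R, t)

# Conjugate pair generated from one 3-D point expressed in the first camera frame.
w = [0.3, -0.2, 5.0]
m1 = matvec(A1, w)                                          # kappa * m
m2 = matvec(A2, [a + b for a, b in zip(matvec(R, w), t)])   # kappa' * m'

Fm1 = matvec(F, m1)                                # epipolar line of m in the second image
residual = dot(m2, Fm1)                            # m'^T F m, zero up to rounding
scale = sum(abs(a * b) for a, b in zip(m2, Fm1))   # magnitude of the summed terms
f_max = max(abs(x) for row in F for x in row)

print(abs(residual) / scale)           # relative epipolar residual
print(abs(det3(F)) / f_max ** 3)       # relative determinant, consistent with rank two
```

Both printed quantities are reported relative to the magnitude of the entries involved, because pixel-scale coordinates make the absolute values large; they should be at the level of floating-point rounding.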
In the most general case, the only geometrical information that can be
computed from pairs of images is the fundamental matrix. Its
computation requires a minimum of eight point correspondences to
obtain a unique solution [29,51].
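It can be seen that (9) is equivalent to

$$\tilde{\mathbf{m}}'^\top A'^{-\top} [\mathbf{t}]_\wedge R\,A^{-1}\tilde{\mathbf{m}} = 0,$$

where $[\mathbf{t}]_\wedge$ denotes the skew-symmetric matrix associated with the external product by $\mathbf{t}$. Changing to normalized coordinates, $\tilde{\mathbf{n}} = A^{-1}\tilde{\mathbf{m}}$ and $\tilde{\mathbf{n}}' = A'^{-1}\tilde{\mathbf{m}}'$, one obtains the original formulation of the Longuet-Higgins equation [27],

$$\tilde{\mathbf{n}}'^\top E\,\tilde{\mathbf{n}} = 0,$$

where $E = [\mathbf{t}]_\wedge R$ is the essential matrix.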
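The change to normalized coordinates can be illustrated with a short sketch. It assumes that each intrinsic matrix is upper triangular with last row $(0, 0, 1)$, so that $A^{-1}\tilde{\mathbf{m}}$ can be computed by back-substitution, and that the essential matrix is $E = [\mathbf{t}]_\wedge R$. The numeric values and the helper names `normalize` and `essential` are hypothetical.

```python
import math

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def skew(v):
    """Skew-symmetric matrix such that skew(v) applied to x is the cross product of v and x."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]

def normalize(A, m):
    """Normalized coordinates n = A^{-1} m, assuming A is upper triangular with last row (0, 0, 1)."""
    u, v = m[0] / m[2], m[1] / m[2]               # pixel coordinates
    y = (v - A[1][2]) / A[1][1]
    x = (u - A[0][2] - A[0][1] * y) / A[0][0]     # accounts for a possible skew term
    return [x, y, 1.0]

def essential(R, t):
    """E = [t]_x R."""
    return matmul(skew(t), R)

# Hypothetical calibration and relative pose (illustrative values only).
A1 = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
A2 = [[820.0, 0.0, 310.0], [0.0, 820.0, 250.0], [0.0, 0.0, 1.0]]
c, s = math.cos(0.1), math.sin(0.1)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [-0.5, 0.02, 0.05]

# Conjugate pair generated from one 3-D point expressed in the first camera frame.
w = [0.3, -0.2, 5.0]
m1 = matvec(A1, w)
m2 = matvec(A2, [a + b for a, b in zip(matvec(R, w), t)])

n1 = normalize(A1, m1)
n2 = normalize(A2, m2)
E = essential(R, t)

residual = dot(n2, matvec(E, n1))   # n'^T E n, zero up to rounding
print(n1, n2, residual)
```

Working in normalized coordinates removes the dependence on the intrinsic parameters, so the constraint involves only the relative pose $(R, \mathbf{t})$.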
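Unlike the fundamental matrix, whose only property is being of rank two, the essential matrix is characterized by the two constraints found by Huang and Faugeras [25]: the vanishing of the determinant and the equality of the two non-zero singular values. Indeed, the following theorem holds:

Theorem (Huang and Faugeras [25]). A real $3 \times 3$ matrix $E$ can be factorized as the product of a nonzero skew-symmetric matrix and a rotation matrix if and only if $E$ has two identical nonzero singular values and a zero singular value.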
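One direction of this characterization can be checked without computing a singular value decomposition. For $E = [\mathbf{t}]_\wedge R$ with $R$ a rotation, $E E^\top = [\mathbf{t}]_\wedge [\mathbf{t}]_\wedge^\top = \|\mathbf{t}\|^2 I - \mathbf{t}\mathbf{t}^\top$, whose eigenvalues are $\|\mathbf{t}\|^2$ (twice) and $0$, so the singular values of $E$ are $\|\mathbf{t}\|, \|\mathbf{t}\|, 0$. The sketch below verifies this identity numerically; the identity is a standard fact used here for illustration rather than something stated in the text, and the pose values and helper names are hypothetical.

```python
import math

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def skew(v):
    """Skew-symmetric matrix such that skew(v) applied to x is the cross product of v and x."""
    return [[0.0, -v[2], v[1]],
            [v[2], 0.0, -v[0]],
            [-v[1], v[0], 0.0]]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

# Hypothetical relative pose (illustrative values only).
c, s = math.cos(0.1), math.sin(0.1)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [-0.5, 0.02, 0.05]

E = matmul(skew(t), R)                        # E = [t]_x R
Et = [list(col) for col in zip(*E)]           # transpose of E
EEt = matmul(E, Et)

# Expected value of E E^T: |t|^2 I - t t^T, with eigenvalues |t|^2, |t|^2 and 0.
t2 = sum(x * x for x in t)
expected = [[(t2 if i == j else 0.0) - t[i] * t[j] for j in range(3)] for i in range(3)]

err = max(abs(EEt[i][j] - expected[i][j]) for i in range(3) for j in range(3))
print(err)          # deviation from the identity, at rounding level
print(det3(E))      # determinant constraint, zero up to rounding
```

The trace of $E E^\top$ equals $2\|\mathbf{t}\|^2$, the sum of the squared singular values, which gives a quick additional consistency check.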
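Given two views of a scene, there is a linear projective transformation (a homography) relating the projection $\tilde{\mathbf{m}}$ of a point of a plane $\Pi$ in the first view to its projection in the second view, $\tilde{\mathbf{m}}'$. This mapping is given by a $3 \times 3$ invertible matrix $H_\Pi$ such that, up to a scale factor:

$$\tilde{\mathbf{m}}' \simeq H_\Pi\,\tilde{\mathbf{m}}. \tag{15}$$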
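For the camera matrices $\tilde{P} = A\,[I \mid \mathbf{0}]$ and $\tilde{P}' = A'\,[R \mid \mathbf{t}]$ and a plane $\Pi$ of equation $\mathbf{n}_\Pi^\top \mathbf{w} = d$ in the first camera reference frame, the matrix of the homography induced by $\Pi$ is

$$H_\Pi = A' \left( R + \frac{\mathbf{t}\,\mathbf{n}_\Pi^\top}{d} \right) A^{-1}. \tag{16}$$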
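The homography induced by a plane can be illustrated with synthetic data. The sketch below assumes the first camera frame is the world frame, the plane is given as $\mathbf{n}_\Pi^\top \mathbf{w} = d$ in that frame, and the homography has the usual closed form $A'(R + \mathbf{t}\,\mathbf{n}_\Pi^\top / d)A^{-1}$. The calibration, pose, plane parameters, and helper names are hypothetical.

```python
import math

def matmul(X, Y):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*Y)] for row in X]

def matvec(M, v):
    return [sum(a * b for a, b in zip(row, v)) for row in M]

def det3(M):
    return (M[0][0] * (M[1][1] * M[2][2] - M[1][2] * M[2][1])
            - M[0][1] * (M[1][0] * M[2][2] - M[1][2] * M[2][0])
            + M[0][2] * (M[1][0] * M[2][1] - M[1][1] * M[2][0]))

def inv3(M):
    """Inverse of a 3x3 matrix via the adjugate (cyclic indices carry the cofactor signs)."""
    d = det3(M)
    return [[(M[(j + 1) % 3][(i + 1) % 3] * M[(j + 2) % 3][(i + 2) % 3]
              - M[(j + 1) % 3][(i + 2) % 3] * M[(j + 2) % 3][(i + 1) % 3]) / d
             for j in range(3)] for i in range(3)]

def plane_homography(A1, A2, R, t, n, d):
    """H = A' (R + t n^T / d) A^{-1} for the plane n^T w = d in the first camera frame."""
    M = [[R[i][j] + t[i] * n[j] / d for j in range(3)] for i in range(3)]
    return matmul(A2, matmul(M, inv3(A1)))

def to_pixels(m):
    return (m[0] / m[2], m[1] / m[2])

# Hypothetical calibration, relative pose and plane (illustrative values only).
A1 = [[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]]
A2 = [[820.0, 0.0, 310.0], [0.0, 820.0, 250.0], [0.0, 0.0, 1.0]]
c, s = math.cos(0.1), math.sin(0.1)
R = [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]
t = [-0.5, 0.02, 0.05]
n, d = [0.1, -0.2, 1.0], 5.0

H = plane_homography(A1, A2, R, t, n, d)

errors = []
for x, y in [(0.3, -0.2), (-1.0, 0.5), (0.0, 0.0)]:
    z = (d - n[0] * x - n[1] * y) / n[2]       # choose z so that the point lies on the plane
    w = [x, y, z]
    m1 = matvec(A1, w)                                          # first view
    m2 = matvec(A2, [a + b for a, b in zip(matvec(R, w), t)])   # second view
    u_h, v_h = to_pixels(matvec(H, m1))        # point transferred by the homography
    u, v = to_pixels(m2)                       # point obtained by direct projection
    errors.append(max(abs(u_h - u), abs(v_h - v)))

print(max(errors))   # transfer error in pixels, at rounding level for points on the plane
```

Points that do not lie on the plane are in general not mapped onto their conjugates by the same matrix.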