Next: References Up: Computer Vision IT412 Previous: Solving for the calibration

Subsections

Algorithms for estimating C

We assume that we are given the 3D coordinate vectors ${\bf M}_i$ of N reference points M_i as well as the 2D retinal coordinates (u_i, v_i) of their images. In general, we have at least 6 points, preferably more, and they are arranged in a special pattern, such as that shown in figure 2.

**Figure 2:** The pattern of points on a calibration frame.
$\begin{figure} \par \centerline{ \psfig {figure=figure92.ps} } \par\end{figure}$

There are several methods for obtaining the coefficients of the matrix C. We will outline both linear and non-linear methods.

Linear methods for estimating C

Recall that in homogeneous coordinates we have a linear relationship between the image points m_i and the 3D reference points M_i given by

$\begin{displaymath} \left[ \begin{array} {c} su \\ sv \\ s \end{array} \righ... ...[ \begin{array} {c} X \\ Y \\ Z \\ 1 \end{array} \right]. \end{displaymath}$

Because of the arbitrary scale factor involved, we have set q₃₄ = 1. From this equation we can write

$\begin{displaymath} u = \frac{Xq_{11} + Yq_{12} + Zq_{13} + q_{14}}{Xq_{31} + Yq_{32} + Zq_{33} + 1} \end{displaymath}$

and

$\begin{displaymath} v = \frac{Xq_{21} + Yq_{22} + Zq_{23} + q_{24}}{Xq_{31} + Yq_{32} + Zq_{33} + 1}. \end{displaymath}$

This implies that

Xq₁₁ + Yq₁₂ + Zq₁₃ + q₁₄ - uXq₃₁ - uYq₃₂ - uZq₃₃ = u

and

Xq₂₁ + Yq₂₂ + Zq₂₃ + q₂₄ - vXq₃₁ - vYq₃₂ - vZq₃₃ = v.

In fact, using the given structure of C, this can be written in shorthand as

$\begin{displaymath} {\bf q}_1^{\top}{\bf M}_i - u_i{\bf q}_3^{\top}{\bf M}_i + q_{14} - u_iq_{34} = 0\end{displaymath}$

and

$\begin{displaymath} {\bf q}_2^{\top}{\bf M}_i - v_i{\bf q}_3^{\top}{\bf M}_i + q_{24} - v_iq_{34} = 0. \end{displaymath}$

We can write this as

$\begin{displaymath} {\bf Lq} = 0, \end{displaymath}$

(5)

where L is a $2N \times 12$ matrix and q is a $12 \times 1$ vector.

We will first solve this system, however, without taking into account any special structure in the matrix C.

So given a set of N 3D world points and their image coordinates, we can build up the following matrix equation:

$\begin{displaymath} \left[ \begin{array} {ccccccccccc} X_1 & Y_1 & Z_1 & 1 & 0 ... ... \\ . \\ . \\ . \\ . \\ u_m \\ v_m \end{array} \right], \end{displaymath}$

where the matrix of knowns is $2N \times 11$ .

With 11 unknowns and each point providing 2 constraint equations, we need at least six points to solve the equation.

The best least squares estimate of the q_ij is obtained using the pseudo-inverse. If we write the equation above as

$\begin{displaymath}[{\bf B}] [{\bf C}] = [{\bf UV}] \end{displaymath}$

then

$\begin{displaymath}[{\bf C}] = [{\bf B}]^+[{\bf UV}]. \end{displaymath}$

When the equations are over-constrained, as in our case, the pseudo-inverse is given by

$\begin{displaymath}[{\bf B}] ^+ = [{\bf B}^{\top}{\bf B}]^{-1}{\bf B}^{\top}. \end{displaymath}$

In general, this matrix equation is very ill-conditioned and care must be taken in finding its solution.

Let's return now to the formulation given in equation (5). This is just a system of linear equation and we want to solve for q. Constraints must be imposed upon q however, to avoid the trivial solution q = 0, which is not physically significant. It is natural to use the constraints given to us by the structure of the matrix C, namely $\Vert{\bf q}_3 \Vert^2 = 1$ and $({\bf q}_1 \wedge {\bf q}_3) \cdot ({\bf q}_2 \wedge {\bf q}_3) = 0$ . This is done via a technique known as constrained optimisation, which we will not cover in lectures. Suffice it to say, it leads to a closed form solution. What is of more interest though, is the question of the rank of the matrix L, since this leads to an understanding of how the reference points should be chosen.

We know from standard linear algebra that if we have an $n \times m$ matrix L then

$\begin{displaymath} \mbox {rank}({\bf L}) + \mbox {null}({\bf L}) = m, \end{displaymath}$

where null(L) represents the dimension of the nullspace of L. In our case m = 12 and there are three cases to consider:

rank(L) = 12. Then the nullspace has dimension 0, and there is only one solution to the system, namely q = 0, which is not very meaningful.
rank(L) = 11. Then the nullspace has dimension 1 and there is a unique solution (up to a scale factor).
rank(L) < 11. Now there is an infinite number of solutions to equation (5). One way in which this can happen is if all the reference points M_i are in a plane.

So to achieve a unique solution to the system (5), we need to ensure that the reference points M_i are in general position. This means that the chosen points must not lie in a certain configuration, which can be defined mathematically but is beyond the scope of this lecture. If six or more points are chosen at random, and do not lie on a plane, then we can be confident that this situation will not occur.

Non-linear methods for estimating C

It is possible to re-cast the problem of solving equation (5) as a non-linear minimization problem, where we attempt to minimize the distance in the image plane between the points m_i and the re-projected points M_i. We can do this by defining the quantity

$\begin{displaymath} E = \sum_{i=1}^N \Vert\frac{{\bf q}_1^{\top}{\bf M}_i + q_{1... ...i + q_{24}}{{\bf q}_3^{\top}{\bf M}_i + q_{34}} - v_i \Vert^2. \end{displaymath}$

We then use constrained minimization techniques to minimize E, subject to the constraint that $\Vert{\bf q}_3 \Vert^2 = 1$ . In general, non-linear methods lead to much more robust solutions.

Next: References Up: Computer Vision IT412 Previous: Solving for the calibration

Robyn Owens
10/29/1997