The pixels which compose an image can be considered as one very long vector. An image of a prototypical face can then serve as a basis vector for describing other images by inner product, and a set of such images can span a space for describing a shape. For example, images number 0, 30 and 60 from figure 6 can be used to define a space of a face turning. A particular face image can then be represented by the vector of three inner products obtained with these three images. The inner product must be computed at an appropriate location, but since correlation is a sequence of inner products, it is possible to find the peak of the correlation and then describe the image by the vector of inner products obtained at that position.
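This description-by-inner-products can be sketched in a few lines of NumPy. The image sizes and random data here are purely illustrative stand-ins for the face images of figure 6:

```python
import numpy as np

# Illustrative sketch: describing an image by inner products with
# reference images (sizes and data are placeholders, not from the paper).
rng = np.random.default_rng(0)

# Three "basis" face images, e.g. frames 0, 30 and 60 of a turning face,
# each flattened into one long vector.
basis = [rng.random(64 * 64) for _ in range(3)]

# A new image is described by its vector of three inner products.
image = rng.random(64 * 64)
descriptor = np.array([np.dot(b, image) for b in basis])
print(descriptor.shape)  # a 3-vector describing the image
```

In practice the inner products are taken at the position of the correlation peak, so that the descriptor is computed at the best-matching location rather than at a fixed offset.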
The problem with this approach is that it rapidly becomes expensive as the number of images increases. However, the image set can be reduced to a minimal orthogonal basis set, and the correlation with this basis set used to describe the contents. This is the idea behind the EigenSpace coding made popular by Turk and Pentland [TP91]. Correlation with a set of Eigen images is commonly thought of as a technique for recognition. However, such a coding can also serve as a compact image transmission code, provided that the position, scale and contrast are suitably normalised.
To construct an eigenspace, we begin with a database of images, as for example shown in figure 8. We then compute an average image as shown in figure 9. Finally, a fast algorithm is used to compute the covariance of the image data. For a 128 by 128 = 16,384 pixel image, this covariance matrix has, in theory, 16,384 by 16,384 coefficients. Fortunately, this covariance is highly diagonal, and a fast algorithm can be used to compute and diagonalise it. The principal components of the covariance matrix form an orthogonal basis set, which are the axes of the EigenSpace as shown in figure 9.
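One common way to avoid forming the full covariance, used by Turk and Pentland, is to diagonalise the much smaller M-by-M matrix built from the M training images; its eigenvectors map back to the principal axes. A minimal sketch, with random data standing in for the face database:

```python
import numpy as np

# Sketch of eigenspace construction (random data stands in for faces).
# For M images of N pixels (M << N), the N-by-N covariance is never
# formed: the M-by-M matrix A^T A has the same non-zero eigenvalues,
# and its eigenvectors map back to the N-dimensional eigenfaces.
rng = np.random.default_rng(0)
M, N = 4, 128 * 128                    # 4 images of 16,384 pixels

images = rng.random((M, N))
mean_face = images.mean(axis=0)        # the average image
A = (images - mean_face).T             # N x M matrix of centred images

L = A.T @ A                            # small M x M matrix
eigvals, V = np.linalg.eigh(L)         # cheap to diagonalise
order = np.argsort(eigvals)[::-1]      # largest eigenvalues first

eigenfaces = A @ V[:, order[:3]]       # map back to N-dim principal axes
eigenfaces /= np.linalg.norm(eigenfaces, axis=0)
print(eigenfaces.shape)                # (16384, 3)
```

The three resulting columns are orthonormal and correspond to the principal-component images of figure 9.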
Figure 8: A small face data base composed of 4 images.
Figure 9: The three principal components of the covariance matrix.
One of the simplest applications of the eigenfaces method [TP91] is the recognition of a subject. We have prepared a simple demo which works as follows. At the beginning of a session, the system classifies the subject's face in order to determine whether the subject is known. Classification is performed by multiplying the normalised face by each of the principal component images in order to obtain a vector. This vector positions the image in the ``face space'' defined by the current Eigenfaces. If the face is near a position in this space which corresponds to a known subject, then the subject's image from the face-space database is displayed. If the vector is not near a known subject, the subject is classified as unknown and no face is displayed. Using the Eigenface technique, our Quadra 700 with no additional hardware can digitize and classify a face within a 108 by 120 image for a database of 12 images at about 1 frame per second.
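The classification step can be sketched as a nearest-neighbour test in face space. The threshold, names and toy data below are assumptions for illustration, not values from the demo:

```python
import numpy as np

# Illustrative sketch of classification in "face space".
# Threshold and data are assumptions, not from the original demo.
def classify(face, mean_face, eigenfaces, known_vectors, threshold):
    """Project a normalised face and return the nearest known subject,
    or None if no known subject is close enough."""
    w = eigenfaces.T @ (face - mean_face)        # position in face space
    dists = {name: np.linalg.norm(w - v)
             for name, v in known_vectors.items()}
    name = min(dists, key=dists.get)
    return name if dists[name] < threshold else None

# Toy usage: one known subject in a 16-pixel, 3-axis face space.
rng = np.random.default_rng(1)
mean_face = np.zeros(16)
eigenfaces = np.linalg.qr(rng.random((16, 3)))[0]   # orthonormal axes
alice = rng.random(16)
known = {"alice": eigenfaces.T @ (alice - mean_face)}
print(classify(alice, mean_face, eigenfaces, known, threshold=0.1))
```

An unknown face projects far from every stored vector and falls outside the threshold, so the function returns None and no face is displayed.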
It is also possible to use the Eigenface technique to measure parameters. One example of this is eye-tracking. We capture a set of images of the subject looking in different directions and use these images to form an Eigen-Space. During execution of a task, a high-resolution window is placed over the subject's eyes, and the position in the Eigen-Space is computed. The nearest principal components are used to interpolate the current horizontal and vertical direction.
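The interpolation step could look like the following sketch, where each training window is labelled with a known (horizontal, vertical) gaze direction and a new window's gaze is interpolated from its nearest neighbours in the Eigen-Space. The inverse-distance weighting is an assumption; the paper does not specify the scheme:

```python
import numpy as np

# Hedged sketch of gaze interpolation from Eigen-Space coordinates.
# Training windows carry known (horizontal, vertical) gaze angles; the
# inverse-distance weighting below is an illustrative assumption.
def interpolate_gaze(w, train_coords, train_angles, k=3):
    d = np.linalg.norm(train_coords - w, axis=1)
    idx = np.argsort(d)[:k]                 # k nearest training samples
    weights = 1.0 / (d[idx] + 1e-9)         # closer samples weigh more
    weights /= weights.sum()
    return weights @ train_angles[idx]      # weighted (h, v) estimate

# Toy usage: three training samples in a 2-D Eigen-Space.
coords = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
angles = np.array([[-10.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
print(interpolate_gaze(np.array([0.0, 0.0]), coords, angles))  # close to (-10, 0)
```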
We are experimenting with this technique to determine the trade-off between the resolution of the windows on the eyes, the number of eigen-images needed, and the precision which we can obtain in eye tracking. The goal is to be able to drive a pointing device, such as a mouse, with such eye tracking. Facial expression contains useful information about the user's state of mind, which is of particular interest in the Neimo experiment. The Eigenfaces idea can easily be extended to classifying the user's facial expression. A set of facial expressions is obtained as the subject performs his task. These facial expressions are then used to form an Eigenspace. At each instant, the system determines the facial expression class which most closely corresponds to the user's current expression. In this way, we can experiment with anticipating the user's ``mood'' based on facial expression.