In 1992, Olivier Faugeras published a paper that overturned existing thinking about camera calibration and the extraction of metric information from our environment using cameras. He set out to determine what information could be extracted from a binocular stereo rig for which no three-dimensional metric calibration data was available. All that is assumed is a stereo camera system that is capable, by comparing the two images, of establishing some correspondences between them. Each such correspondence, written (m, m'), indicates that the two image points m and m' are very likely to be the images of the same world point M. The system therefore knows neither its intrinsic nor its extrinsic parameters; such a system is said to be uncalibrated.
Surprisingly, it is still possible to reconstruct some very rich non-metric representations of the environment. What is actually extracted are the projective invariants of the scene. Precisely what this means will become clearer as the lecture progresses, but it does indicate that researchers may have been overly optimistic in trying to extract complete metric information. Certainly, extracting metric information has proved to be very difficult and sensitive to noise, and it is not at all necessary for many applications, such as robot navigation.
It turns out that these projective invariants can be used to work out the camera calibration. Self-calibration refers to the process of calculating all the intrinsic parameters of the camera using only the information available in the images taken by that camera. No calibration frame or known object is needed: the only requirements are that the scene contains a static object and that the camera moves around taking images of it. Self-calibration is therefore ideal for a mobile camera, such as a camera mounted on a mobile robot. The camera movement itself does not need to be known.
The geometric information that relates two different viewpoints of the same scene is entirely contained in a mathematical construct known as the fundamental matrix. The two viewpoints could be a stereo pair of images, or a temporal pair of images. In the latter case the two images are taken at different times with the camera moving between image acquisitions.
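As a rough illustration of how the fundamental matrix ties two views together, the sketch below checks the standard epipolar constraint m'^T F m = 0 on synthetic data, with image points written in homogeneous pixel coordinates. Everything in it is an assumption made for illustration: the intrinsic matrix K, the rotation R, the translation t, the random world points, and the function names are all hypothetical. F is built here from known camera parameters purely so that the constraint can be verified; in the uncalibrated setting F would instead have to be estimated from the correspondences (m, m') alone.

```python
import numpy as np


def skew(t):
    """Return the 3x3 matrix [t]_x such that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])


def fundamental_from_cameras(K, R, t):
    """F = K^-T [t]_x R K^-1 for cameras K[I|0] and K[R|t] (same K assumed for both)."""
    K_inv = np.linalg.inv(K)
    return K_inv.T @ skew(t) @ R @ K_inv


def project(K, R, t, points):
    """Project Nx3 world points to homogeneous pixel coordinates (last entry 1)."""
    cam = points @ R.T + t          # world frame -> camera frame
    pix = cam @ K.T                 # camera frame -> image plane
    return pix / pix[:, 2:3]


def epipolar_residuals(F, m1, m2):
    """Evaluate m2_i^T F m1_i for every correspondence i."""
    return np.einsum("ni,ij,nj->n", m2, F, m1)


# Hypothetical intrinsics: focal length 800 px, principal point (320, 240).
K = np.array([[800.0, 0.0, 320.0],
              [0.0, 800.0, 240.0],
              [0.0, 0.0, 1.0]])

# Hypothetical motion between the views: 5 degree rotation about the y axis
# plus a small translation.
angle = np.deg2rad(5.0)
R = np.array([[np.cos(angle), 0.0, np.sin(angle)],
              [0.0, 1.0, 0.0],
              [-np.sin(angle), 0.0, np.cos(angle)]])
t = np.array([-0.2, 0.0, 0.02])

# Random world points M placed in front of both cameras.
rng = np.random.default_rng(0)
M = rng.uniform([-1.0, -1.0, 4.0], [1.0, 1.0, 8.0], size=(20, 3))

m1 = project(K, np.eye(3), np.zeros(3), M)   # image points m in the first view
m2 = project(K, R, t, M)                     # image points m' in the second view

F = fundamental_from_cameras(K, R, t)
residuals = epipolar_residuals(F, m1, m2)
max_residual = float(np.max(np.abs(residuals)))
```

For noise-free correspondences such as these, every residual should vanish up to floating-point error. With real image matches the residuals are only approximately zero, which is one reason the estimation of F needs care.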
We will begin this lecture by considering the geometry of two different images of the same scene, known as the epipolar geometry. We will then discuss the fundamental matrix and how it is calculated, and clarify precisely what information can be extracted in the uncalibrated case. Finally, we will consider self-calibration.