Date: June 25th, 2002

e-mail: fava@ee.wustl.edu

For additional information see my web page.

Depth from focus/defocus is the problem of estimating the 3D surface of a scene from a set of two or more images of that scene. The images are taken from the same point of view, but with different camera parameters (typically the focal setting or the axial position of the image plane); see the Figure below.

The difference between depth from focus and depth from defocus is that in the first case the camera parameters can be changed dynamically during the surface estimation process, while in the second case this is not allowed (see [1-12] for a sample of the literature on depth from focus/defocus).

In addition, both problems are called either active or passive depth from focus/defocus, depending on whether or not structured light can be projected onto the scene.

While many computer vision techniques estimate 3D surfaces from images obtained with pin-hole cameras, in depth from defocus we use real aperture cameras. Real aperture cameras have a short depth of field, resulting in images that appear focused only on a thin 3D slice of the scene. The image formation process can be explained with geometric optics. The lens is modeled via the thin lens law, i.e. 1/f = 1/u + 1/v, where f is the focal length, u is the distance between the lens plane and the plane in focus in the scene, and v is the distance between the lens plane and the image plane.
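As a concrete illustration, the thin lens law directly determines which scene plane is in focus for a given camera setting. The following is a minimal sketch with hypothetical values (a 50 mm lens, all distances in millimeters); the function names are illustrative, not from the literature cited below.

```python
# Minimal sketch of the thin lens law 1/f = 1/u + 1/v.
# All numeric values are hypothetical; distances are in millimeters.

def image_plane_distance(f, u):
    """Lens-to-image-plane distance v that brings a scene point
    at distance u into focus, from 1/v = 1/f - 1/u."""
    return 1.0 / (1.0 / f - 1.0 / u)

def in_focus_distance(f, v):
    """Scene distance u that is in focus for focal length f and
    lens-to-image-plane distance v, from 1/u = 1/f - 1/v."""
    return 1.0 / (1.0 / f - 1.0 / v)

# Example: a 50 mm lens focused on a point 1 m away.
f, u = 50.0, 1000.0
v = image_plane_distance(f, u)  # about 52.63 mm
```

Moving the image plane away from this v (or changing f) defocuses the point, which is exactly the cue that depth from focus/defocus exploits.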

A scene is typically modeled as a smooth, opaque, Lambertian (i.e. with constant bidirectional reflectance distribution function) surface s. Attached to the surface we have a texture r (also called the radiance, or focused image).

In this case, the intensity I(**y**) at a pixel **y** ∈ Z^{2} (we
denote vector coordinates with boldface fonts) of the CCD surface
can be described by:

    I(**y**) = ∫ h(**y**,**x**) r(**x**) d**x**        (1)

where the kernel h depends on the surface s and the optical
settings u, and **x** ∈ R^{2}. For a fixed surface s(**x**) = d
(i.e. a plane parallel to the lens plane at distance d from the
lens plane), the kernel h is a function of the difference **y**-**x**,
i.e. integral (1) becomes the convolution

    I(**y**) = ∫ h(**y**-**x**) r(**x**) d**x** = (h * r)(**y**)        (2)

More generally, the kernel h determines the amount of blurring that affects a specific area of the surface in the scene. With ideal optics, the kernel can be represented by a pillbox function. However, in many algorithms for depth from defocus the kernel is approximated by a Gaussian (see Figure below)

    h(**y**-**x**) = 1/(2πσ^{2}) exp( -||**y**-**x**||^{2}/(2σ^{2}) )        (3)

where the blurring parameter σ depends on the surface s and on the optical settings.
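Under the Gaussian approximation, the equifocal image formation of Eq. 2 reduces to an ordinary Gaussian blur of the radiance. The following NumPy sketch synthesizes a defocused image from a focused one; the radiance pattern and blur level are hypothetical, and the separable implementation is just one convenient choice.

```python
import numpy as np

def gaussian_kernel(sigma, radius=None):
    """Discrete 1D Gaussian, truncated and normalized to sum to 1."""
    if radius is None:
        radius = int(3 * sigma)
    x = np.arange(-radius, radius + 1)
    k = np.exp(-x**2 / (2.0 * sigma**2))
    return k / k.sum()

def defocus(radiance, sigma):
    """Approximate Eq. 2 for an equifocal plane: convolve the focused
    image ("radiance") with the Gaussian kernel of Eq. 3, exploiting
    the separability of the 2D Gaussian (blur rows, then columns)."""
    k = gaussian_kernel(sigma)
    blurred = np.apply_along_axis(
        lambda m: np.convolve(m, k, mode="same"), 0, radiance)
    blurred = np.apply_along_axis(
        lambda m: np.convolve(m, k, mode="same"), 1, blurred)
    return blurred

# A sharp synthetic radiance: a bright square on a dark background.
r = np.zeros((64, 64))
r[24:40, 24:40] = 1.0
I = defocus(r, sigma=2.0)  # synthetically defocused image
```

Because the kernel integrates to one, the blur redistributes intensity without creating or destroying it, which matches the role of h in Eqs. 1-3.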

Now, the problem of depth from defocus can be stated more
precisely: given a set of L ≥ 2 images I_{1},...,I_{L} obtained
from the same scene with focal settings u_{1},...,u_{L}, we want
to reconstruct the surface s of the scene. For some methods, this
may also require reconstructing the radiance r.

In the literature there exists a large variety of approximation
models for the above equations. The main simplification is the
*equifocal assumption*: the surface is locally represented by a
plane parallel to the image plane (i.e. an equifocal plane), so
that the image formation process can be locally approximated by
Eq. 2.
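To illustrate how the equifocal assumption is typically exploited, here is a naive, hypothetical depth-from-focus sketch (not one of the specific methods cited below): each small window is treated as a fronto-parallel plane and assigned the depth of the setting that maximizes a sharpness score within the window. The window-based scheme, the Laplacian-energy score, and all function names are illustrative assumptions.

```python
import numpy as np

def laplacian(img):
    """Discrete Laplacian via the standard 5-point stencil
    (borders are left at zero)."""
    L = np.zeros_like(img)
    L[1:-1, 1:-1] = (img[:-2, 1:-1] + img[2:, 1:-1] +
                     img[1:-1, :-2] + img[1:-1, 2:] -
                     4.0 * img[1:-1, 1:-1])
    return L

def depth_from_focus(stack, depths, win=8):
    """Hypothetical depth-from-focus sketch under the equifocal
    assumption: each win x win window is assigned the depth of the
    focal setting whose image maximizes the sum of squared
    Laplacian (a sharpness measure) inside the window.

    stack  : list of L images, all the same shape
    depths : the L in-focus distances of the corresponding settings
    """
    H, W = stack[0].shape
    sharp = np.array([laplacian(I)**2 for I in stack])  # (L, H, W)
    rows, cols = H // win, W // win
    depth_map = np.zeros((rows, cols))
    for i in range(rows):
        for j in range(cols):
            block = sharp[:, i*win:(i+1)*win, j*win:(j+1)*win]
            scores = block.sum(axis=(1, 2))     # one score per setting
            depth_map[i, j] = depths[np.argmax(scores)]
    return depth_map
```

This maximum-sharpness selection is the depth-from-focus side of the problem; depth-from-defocus methods instead invert the blur model of Eqs. 2-3 from a fixed, small set of images.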

There also exist a number of real-time systems for depth from defocus. Depth from defocus has been shown to be effective at small distances (e.g. in microscopy), and has been compared to stereo vision, provided that the optical system and the scene are properly re-scaled.

Here are some examples of defocused images and the corresponding depth reconstructions.

- [1] S. Chaudhuri and A. Rajagopalan. *Depth from Defocus: a Real Aperture Imaging Approach*. Springer Verlag, 1999.
- [2] J. Ens and P. Lawrence. An investigation of methods for determining depth from focus. *IEEE Trans. Pattern Anal. Mach. Intell.*, 15:97-108, 1993.
- [3] P. Favaro and S. Soatto. Shape and radiance estimation from the information divergence of blurred images. *Proc. European Conference on Computer Vision*, 1:755-768, June/July 2000.
- [4] P. Favaro and S. Soatto. Learning depth from defocus. *Proc. European Conference on Computer Vision*, 2002 (in press).
- [5] H. Jin and P. Favaro. A variational approach to shape from defocus. *Proc. European Conference on Computer Vision*, 2002 (in press).
- [6] A. Mennucci and S. Soatto. The accommodation cue, part 1: modeling. *ESSRL Technical Report 99-001, Washington University*, October 1999.
- [7] S. Nayar and Y. Nakagawa. Shape from focus. *IEEE Trans. Pattern Anal. Mach. Intell.*, 16(8):824-831, 1994.
- [8] A. Pentland. A new sense for depth of field. *IEEE Trans. Pattern Anal. Mach. Intell.*, 9:523-531, 1987.
- [9] S. Soatto and P. Favaro. A geometric approach to blind deconvolution with application to shape from defocus. *Proc. IEEE Computer Vision and Pattern Recognition*, 2:10-17, 2000.
- [10] M. Subbarao and G. Surya. Depth from defocus: a spatial domain approach. *Intl. J. of Computer Vision*, 13:271-294, 1994.
- [11] M. Watanabe and S. Nayar. Rational filters for passive depth from defocus. *Intl. J. of Computer Vision*, 27(3):203-225, 1998.
- [12] Y. Xiong and S. Shafer. Depth from focusing and defocusing. In *Proc. of the Intl. Conf. of Comp. Vision and Pat. Recogn.*, pages 68-73, 1993.
