Next: The discrete Fourier transform Up: Computer Vision IT412 Previous: Lecture 4

Fourier transform theory

So far we have been processing images by looking at the grey level at each point in the image. These methods are known as spatial methods.

However, there are many ways of transforming image data into alternative representations that are more amenable to certain types of analysis. The most common image transform takes spatial data and transforms it into frequency data. This is done using the Fourier transform.

The Fourier transform is simply a method of expressing a function (which is a point in some infinite-dimensional vector space of functions) in terms of the sum of its projections onto a set of basis functions. Since an image is only defined on a closed and bounded domain (the image window), we can assume that the image is defined as being zero outside this window. In other words, we can assume that the image function is integrable over the real line.

To see how the Fourier transform works, we will begin with a one-dimensional signal and consider a simple step function. This is equivalent to taking a horizontal slice through an image that is black on its left half and white on its right half, as shown in figure 1.

Now, a step function (or a square wave form) can be represented as a sum of sine waves of frequency $\omega, 3 \omega, 5 \omega, \ldots$ , where $\omega$ is the frequency of the square wave, and we recall that frequency = 1/wavelength. Normally, frequency refers to the rate of repetitions per unit time, that is, the number of cycles per second (Hertz). In images we are concerned with spatial frequency, that is, the rate at which brightness in the image varies across the image, or varies with viewing angle. Figure 2 shows the sum of the first few terms in a sine wave decomposition of a square wave. This sum converges to the square wave as the number of terms tends to infinity.
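The convergence described above can be sketched numerically. The following is a minimal sketch, assuming a square wave that alternates between -1 and +1 (the black-to-white slice in the text would be this wave rescaled and offset), and using the standard Fourier series weights 1/n and overall factor 4/π, which the text does not state explicitly. The function name `square_wave_partial_sum` is made up for illustration.

```python
import math

def square_wave_partial_sum(x, omega, n_terms):
    """Sum the first n_terms odd harmonics (omega, 3*omega, 5*omega, ...) at position x."""
    total = 0.0
    for k in range(n_terms):
        n = 2 * k + 1  # odd harmonic number: 1, 3, 5, ...
        total += math.sin(2 * math.pi * n * omega * x) / n
    # The factor 4/pi scales the sum so the limiting wave alternates between -1 and +1.
    return 4.0 / math.pi * total

# Evaluate in the middle of the positive half-cycle of a wave with omega = 1.
# The printed values approach 1 as more terms are included.
for n_terms in (1, 3, 10, 100):
    print(n_terms, square_wave_partial_sum(0.25, 1.0, n_terms))
```

Near the jump itself the partial sums overshoot (the Gibbs phenomenon), so the convergence is best inspected away from the discontinuity.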

**Figure 1:** A step function as a slice through an image.

**Figure 2:** A step function is the sum of an infinite number of sine waves.

From the decomposition of the signal into varying sinusoidal components we can construct a diagram displaying the amplitudes of all the sinusoids for all the frequencies. Such a diagram is given in figure 3 below for the square wave.

Note that we have to consider negative frequencies (whatever that might actually mean) so the sinusoidal component of frequency f and amplitude A₁ has to be split into two components of amplitude A₁/2 at the frequencies +f and -f. A graph of the amplitude of the Fourier components is known as the spectrum of the wave form.
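One way to make sense of the split, offered here as a supplementary identity rather than something taken from the lecture, is to write a cosine of frequency f in terms of complex exponentials using Euler's formula:

$\begin{displaymath} A_1 \cos(2 \pi f x) = \frac{A_1}{2} e^{2 \pi i f x} + \frac{A_1}{2} e^{-2 \pi i f x}. \end{displaymath}$

The second exponential is the same kind of function as the first, but with frequency -f, so each of the two frequencies +f and -f carries half of the amplitude.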

**Figure 3:** The amplitude of the sine waves at each frequency for a square wave.

So what does the Fourier transform really mean? When we calculate the Fourier transform of an image, we treat the intensity signal across the image as a function, not just an array of values. The Fourier transform describes a way of decomposing a function into a sum of orthogonal basis functions in just the same way as we decompose a point in Euclidean space into the sum of its basis vector components.

For example, a vector v in 3-space is described in terms of 3 orthogonal unit vectors i, j and k, and we can write v as the sum of its projections onto these 3 basis vectors:

$\begin{displaymath} {\bf v} = x {\bf i} + y {\bf j} + z {\bf k}. \end{displaymath}$

Given the vector v, we can calculate the components of v in each of the i, j, and k directions by calculating the dot product (or inner product or projection) of v and each of these basis vectors. Thus

$\begin{displaymath} x = {\bf v} \cdot {\bf i}, y = {\bf v} \cdot {\bf j}, \mbox{and } z = {\bf v} \cdot {\bf k}. \end{displaymath}$

A similar process is used to calculate the Fourier transform of a function. The function is just, conceptually, a point in some vector space (although now the vector space is infinite-dimensional). Given our orthogonal basis functions, we calculate the component of our given function in each of the basis functions by calculating the inner product between the two. The standard basis functions used for the Fourier transform are $\{\sin(2 \pi \omega x), \cos(2 \pi \omega x), \omega \in {\bf R} \}$ or, equivalently $\{e^{-i2 \pi \omega x}, \omega \in {\bf R} \}$ . It is the frequency $\omega$ that varies over the set of all real numbers to give us an infinite collection of basis functions. Since

$\begin{displaymath} e^{2 \pi i \omega x} = \cos(2 \pi \omega x) + {\rm i} \sin(2 \pi \omega x), \end{displaymath}$

we see that the Fourier transform has real and imaginary components. Moreover, the exponential form of the basis functions allows us to represent both real and complex valued functions by their Fourier transform.

We can show that any two basis functions of different frequencies are orthogonal by calculating their inner product and showing that it is 0. For example, for the real case and considering only the cosine terms,

$\begin{displaymath} \int \cos(2 \pi \omega_1 x) \cos(2 \pi \omega_2 x)dx = 0 \end{displaymath}$

for $\omega_1 \neq \omega_2$ , because the function being integrated is itself a sum of two cosine functions ( $2 \cos A \cos B = \cos(A+B) + \cos(A-B)$ ), each of which has equal areas above and below the x-axis.
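The orthogonality can be checked numerically. This is a minimal sketch, assuming the integral is taken over the finite interval [0, 1] with integer frequencies so that both cosines complete whole cycles; the interval, the midpoint rule, and the name `inner_product` are choices made for illustration.

```python
import math

def inner_product(omega1, omega2, n_samples=1000):
    """Midpoint-rule estimate of the integral over [0, 1] of
    cos(2*pi*omega1*x) * cos(2*pi*omega2*x)."""
    h = 1.0 / n_samples
    total = 0.0
    for k in range(n_samples):
        x = (k + 0.5) * h  # midpoint of the k-th sub-interval
        total += math.cos(2 * math.pi * omega1 * x) * math.cos(2 * math.pi * omega2 * x)
    return total * h

print(inner_product(2, 3))  # different frequencies: close to zero
print(inner_product(2, 2))  # same frequency: close to 0.5, the squared norm
```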

Thus, we project our given function f onto our basis functions $e^{-2 \pi i \omega x}$ to get the Fourier amplitudes $F(\omega)$ for each frequency $\omega$ :

$\begin{displaymath} {\cal F}(f(x)) = F(\omega) = \int f(x) e^{-2 \pi i \omega x}dx. \end{displaymath}$

In general, $F(\omega)$ will be complex, say of the form $a(\omega) + {\rm i}b(\omega)$ .
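The projection integral can be approximated directly by a sum. The sketch below assumes a test function that is negligible outside [-10, 10] and uses the Gaussian $e^{-\pi x^2}$, whose transform under this convention is known to be $e^{-\pi \omega^2}$; the function names and integration limits are illustrative choices only.

```python
import cmath
import math

def fourier_transform(f, omega, lo=-10.0, hi=10.0, n_samples=4000):
    """Midpoint-rule estimate of the integral of f(x) * exp(-2*pi*i*omega*x) over [lo, hi]."""
    h = (hi - lo) / n_samples
    total = 0j
    for k in range(n_samples):
        x = lo + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * omega * x)
    return total * h

def gaussian(x):
    return math.exp(-math.pi * x * x)

value = fourier_transform(gaussian, 0.5)
# The numerical estimate and the closed form exp(-pi * 0.5**2) should agree closely.
print(value, math.exp(-math.pi * 0.25))
```

Because the Gaussian is real and even, the imaginary part of the estimate is essentially zero; a function without that symmetry would give a genuinely complex value.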

We often express F in polar form though:

$\begin{displaymath} F(\omega) = \mid F(\omega) \mid e^{i \Phi(\omega)}, \end{displaymath}$

where

$\begin{displaymath} \mid F(\omega) \mid = \sqrt{a^2 + b^2} \end{displaymath}$

and

$\begin{displaymath} \Phi(\omega) = \tan^{-1}(\frac{b}{a}). \end{displaymath}$

**Figure 4:** The amplitude and phase angle of a sine wave at a particular frequency.

The norm of the amplitude, $\mid F(\omega) \mid$ , is called the Fourier spectrum of f, and the exponent $\Phi(\omega)$ is called the phase angle. The square of the amplitude is just $P(\omega) = a^2(\omega) + b^2(\omega)$ and is called the power spectrum of f.
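In code, these quantities are usually read off a complex number directly. The following is a small sketch using Python's standard `cmath` module with an arbitrary example value. One practical caveat, which is an addition to the text: the plain arctangent of b/a cannot distinguish opposite quadrants, so the quadrant-aware `cmath.phase` (equivalent to atan2(b, a)) is the safer way to obtain the phase angle.

```python
import cmath
import math

F = complex(-1.0, 1.0)  # example value a + ib with a = -1, b = 1

magnitude = abs(F)                  # Fourier spectrum: sqrt(a^2 + b^2)
phase = cmath.phase(F)              # phase angle via atan2(b, a), here 3*pi/4
power = F.real ** 2 + F.imag ** 2   # power spectrum: a^2 + b^2

# The plain arctangent of b/a gives -pi/4 here, which is the opposite quadrant.
naive_phase = math.atan(F.imag / F.real)

# Rebuilding F from its polar form |F| * exp(i * Phi) recovers the original value.
rebuilt = magnitude * cmath.exp(1j * phase)
print(magnitude, phase, power, naive_phase, rebuilt)
```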

In many applications only the amplitude information is needed and the phase information is discarded. However, despite this common practice, phase information should not be ignored. In images, as in sound signals, phase carries considerable information [3]. Oppenheim and Lim have shown that if we construct synthetic images made from the amplitude information of one image and the phase information of another, it is the image corresponding to the phase data that we perceive, if somewhat degraded.

**Figure 5:** The amplitude data are taken from the vdu image and the phase data are taken from the face image.

**Figure 6:** The phase data dominates our perception.
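The amplitude and phase swap can be sketched in one dimension. This is only an illustration of the mechanics, not the procedure used to produce the figures: it assumes short made-up signals in place of images, and a direct discrete Fourier sum (with hypothetical helper names `dft`, `idft` and `mix`) in place of the continuous transform.

```python
import cmath
import math

def dft(signal):
    """Discrete Fourier sum with the negative-exponent convention."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
            for k in range(n)]

def idft(spectrum):
    """Inverse sum with the positive exponent and a 1/n normalisation."""
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)) / n
            for x in range(n)]

def mix(amplitude_source, phase_source):
    """Combine the Fourier amplitude of one signal with the Fourier phase of another."""
    A = dft(amplitude_source)
    P = dft(phase_source)
    mixed = [abs(a) * cmath.exp(1j * cmath.phase(p)) for a, p in zip(A, P)]
    return idft(mixed)

signal_a = [0, 0, 1, 1, 1, 1, 0, 0]  # made-up signal supplying the amplitudes
signal_b = [0, 1, 2, 3, 0, 0, 0, 0]  # made-up signal supplying the phases
hybrid = mix(signal_a, signal_b)
print([round(v.real, 3) for v in hybrid])
```

Since both inputs are real, the mixed spectrum keeps the conjugate symmetry of a real signal, so the hybrid comes out real up to rounding error.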

Now, having generated the Fourier transform of a function, we want to be able to reconstruct the original function from its Fourier components. This is simply done by summing up all the Fourier components multiplied by their corresponding basis function, that is,

$\begin{displaymath} f(x) = \int F(\omega) e^{2 \pi i \omega x}d \omega. \end{displaymath}$

This is analogous to expressing the vector v as the sum of its projections onto the basis vectors. Note that the inverse Fourier transform uses the basis functions $e^{2 \pi i \omega x}$ , whilst the Fourier transform uses the basis functions $e^{-2 \pi i \omega x}$ . This prevents a sign change occurring in the reconstruction process, since ${\rm i} \times {\rm i} = -1, -{\rm i} \times {\rm i} = 1$ .
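The role of the opposite signs can be seen in a small discrete experiment. The sketch below uses a finite sum with a 1/n normalisation as a stand-in for the integrals (an assumption made so the example runs on a four-sample signal); the helper name `transform` is made up for illustration.

```python
import cmath
import math

def transform(values, sign):
    """Discrete sum over the basis functions exp(sign * 2*pi*i*k*x/n)."""
    n = len(values)
    return [sum(values[x] * cmath.exp(sign * 2j * math.pi * k * x / n) for x in range(n))
            for k in range(n)]

signal = [1.0, 2.0, 4.0, 8.0]
n = len(signal)
spectrum = transform(signal, -1)  # forward: negative exponent

# Inverse with the positive exponent recovers the original samples.
recovered = [v / n for v in transform(spectrum, +1)]

# Using the negative exponent twice returns the signal reversed, f(-x) instead of f(x).
flipped = [v / n for v in transform(spectrum, -1)]

print([round(v.real, 6) for v in recovered])
print([round(v.real, 6) for v in flipped])
```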

Example Consider again the square wave form shown in the figure below.

**Figure 7:** A square wave form and its Fourier spectrum.

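Now, taking the pulse to have height A on the interval $0 \leq x \leq X$ and to be zero elsewhere (a placement assumed here because it agrees with the spectrum quoted below),

$\begin{displaymath} F(\omega) = \int_0^X A e^{-2 \pi i \omega x}dx = \frac{A}{2 \pi i \omega} \left( 1 - e^{-2 \pi i \omega X} \right) = \frac{A}{\pi \omega} \sin(\pi \omega X) e^{-\pi i \omega X}. \end{displaymath}$

This is a complex-valued quantity, and the Fourier spectrum is given by its modulus, $AX \left| \frac{\sin(\pi \omega X)}{\pi \omega X} \right|$ .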
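The quoted modulus can be checked numerically. The sketch below assumes a pulse of height A on 0 ≤ x ≤ X and zero elsewhere; that placement is an assumption, since the figure is not reproduced here, and it affects only the phase, not the modulus. The function names and the sample values of A, X and ω are arbitrary.

```python
import cmath
import math

def pulse_transform(A, X, omega, n_samples=20000):
    """Midpoint-rule estimate of the integral of A * exp(-2*pi*i*omega*x) over 0 <= x <= X."""
    h = X / n_samples
    total = 0j
    for k in range(n_samples):
        x = (k + 0.5) * h
        total += A * cmath.exp(-2j * math.pi * omega * x)
    return total * h

def spectrum_formula(A, X, omega):
    """Modulus A * X * |sin(pi*omega*X) / (pi*omega*X)|, valid for omega != 0."""
    u = math.pi * omega * X
    return A * X * abs(math.sin(u) / u)

A, X, omega = 3.0, 2.0, 0.3
numeric = pulse_transform(A, X, omega)
# The transform itself is complex, but its modulus matches the closed form.
print(numeric, abs(numeric), spectrum_formula(A, X, omega))
```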

Important properties of the Fourier transform

Suppose we are given two functions f and g, with Fourier transforms F and G, and suppose that a and b are constants. Then

The Fourier transform is linear, that is,
$\begin{displaymath} {\cal F }(af(x) + bg(x)) = aF(\omega) + bG(\omega). \end{displaymath}$
Changing spatial scale inversely affects frequency and amplitude, that is,
$\begin{displaymath} {\cal F}(f(ax)) = \frac{1}{\mid a \mid}F(\frac{\omega}{a}), \quad a \neq 0. \end{displaymath}$
Shifting the function only changes the phase of the spectrum, that is,
$\begin{displaymath} {\cal F}(f(x - a)) = F(\omega)e^{-2 \pi i \omega a}. \end{displaymath}$
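The shift and scaling properties can be checked numerically under the forward convention $\int f(x) e^{-2 \pi i \omega x}dx$ . In this sketch a shift by a multiplies the transform by $e^{-2 \pi i \omega a}$ , and a scaling by a divides the frequency by a and the amplitude by |a|. The Gaussian test function, the integration limits and all names are choices made for illustration.

```python
import cmath
import math

def fourier_transform(f, omega, lo=-12.0, hi=12.0, n_samples=6000):
    """Midpoint-rule estimate of the integral of f(x) * exp(-2*pi*i*omega*x) over [lo, hi]."""
    h = (hi - lo) / n_samples
    total = 0j
    for k in range(n_samples):
        x = lo + (k + 0.5) * h
        total += f(x) * cmath.exp(-2j * math.pi * omega * x)
    return total * h

def g(x):
    return math.exp(-math.pi * x * x)  # smooth test function, negligible outside [-12, 12]

omega, shift, scale = 0.4, 0.7, 2.0

# Shift property: transform of g(x - shift) versus G(omega) * exp(-2*pi*i*omega*shift).
shifted = fourier_transform(lambda x: g(x - shift), omega)
shift_prediction = fourier_transform(g, omega) * cmath.exp(-2j * math.pi * omega * shift)

# Scaling property: transform of g(scale * x) versus G(omega / scale) / |scale|.
scaled = fourier_transform(lambda x: g(scale * x), omega)
scale_prediction = fourier_transform(g, omega / scale) / abs(scale)

print(abs(shifted - shift_prediction), abs(scaled - scale_prediction))
```

The shift leaves the modulus of the transform unchanged, which is the sense in which only the phase of the spectrum is affected.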

We can also take advantage of symmetries in the spatial and frequency domains as follows:

if f(x) is real, then $F(-\omega) = F(\omega)^*$
if f(x) is imaginary, then $F(-\omega) = -F(\omega)^*$
if f(x) is even, then $F(-\omega) = F(\omega)$
if f(x) is odd, then $F(-\omega) = -F(\omega)$ .

Here, the notation ^* indicates the complex conjugate operation. Thus, if f(x) is real and even, then $F(\omega)$ is real and even, and if f(x) is real and odd, then $F(\omega)$ is imaginary and odd.
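These symmetries have direct discrete counterparts that are easy to test. The sketch below assumes sampled signals and a direct discrete Fourier sum, with index n - k playing the role of frequency $-\omega$ ; the sample values and the helper name `dft` are arbitrary.

```python
import cmath
import math

def dft(signal):
    """Discrete Fourier sum with the negative-exponent convention."""
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
            for k in range(n)]

# Real signal: the value at -omega (index n - k) is the conjugate of the value at omega.
real_signal = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]
F = dft(real_signal)
n = len(F)
hermitian = all(abs(F[(n - k) % n] - F[k].conjugate()) < 1e-9 for k in range(n))

# Real and even signal (f[x] == f[-x mod n]): the spectrum is real.
even_signal = [2.0, 1.0, 0.5, 0.0, 0.5, 1.0]
E = dft(even_signal)
real_spectrum = all(abs(v.imag) < 1e-9 for v in E)

print(hermitian, real_spectrum)
```

In practice the conjugate symmetry means that only half of the spectrum of a real image needs to be computed and stored.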

The Convolution Theorem tells us that convolution in the spatial domain corresponds to multiplication in the frequency domain, and vice versa. That is,

$\begin{displaymath} f(x) \otimes g(x) \Leftrightarrow F(\omega)G(\omega) \end{displaymath}$

and equivalently

$\begin{displaymath} f(x)g(x) \Leftrightarrow F(\omega) \otimes G(\omega). \end{displaymath}$

Thus, convolution with large masks in the spatial domain can often be done more efficiently as multiplication in the frequency domain. Likewise, division in the frequency domain corresponds to deconvolution in the spatial domain. This is the basis of image restoration for blur caused by poor focus or motion.
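The theorem can be verified on a small sampled example. This sketch assumes the discrete setting, where the matching operation is circular convolution; the signals are zero-padded so that wrap-around does not affect the result. The helper names and sample values are made up for illustration.

```python
import cmath
import math

def dft(signal):
    n = len(signal)
    return [sum(signal[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
            for k in range(n)]

def idft(spectrum):
    n = len(spectrum)
    return [sum(spectrum[k] * cmath.exp(2j * math.pi * k * x / n) for k in range(n)) / n
            for x in range(n)]

def circular_convolve(f, g):
    """Direct spatial-domain convolution with indices taken modulo the signal length."""
    n = len(f)
    return [sum(f[m] * g[(x - m) % n] for m in range(n)) for x in range(n)]

f = [1.0, 2.0, 3.0, 4.0, 0.0, 0.0, 0.0, 0.0]     # signal, zero-padded to length 8
g = [0.25, 0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0]   # small smoothing mask, zero-padded

direct = circular_convolve(f, g)
# Multiply the two spectra pointwise, then transform back to the spatial domain.
via_frequency = idft([a * b for a, b in zip(dft(f), dft(g))])

print([round(v, 6) for v in direct])
print([round(v.real, 6) for v in via_frequency])
```

The direct sums used here cost the same either way; the efficiency gain mentioned above appears only when the transforms are computed with a fast algorithm.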

Two dimensional Fourier transforms

An image is thought of as a two-dimensional function, and so the Fourier transform of an image is also a two-dimensional object. Thus, if f is an image, then

$\begin{displaymath} F(\omega, \nu) = \int \int f(x,y)e^{-2 \pi i (\omega x + \nu y)}dxdy. \end{displaymath}$

Fortunately, it is possible to calculate this integral in two stages, since the 2D Fourier transform is separable. Thus, we first form the Fourier transform with respect to x:

$\begin{displaymath} F(\omega, y) = \int f(x,y)e^{-2 \pi i \omega x}dx \end{displaymath}$

and then we calculate the Fourier transform of this function of y:

$\begin{displaymath} F(\omega, \nu) = \int F(\omega, y) e^{-2 \pi i \nu y}dy. \end{displaymath}$
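The two-stage calculation can be sketched for a small sampled image. This is a minimal illustration assuming a discrete double sum in place of the double integral, with a tiny made-up 3 x 4 image; it compares the direct double sum against a pass over the rows followed by a pass over the columns. All function names are hypothetical.

```python
import cmath
import math

def dft_1d(values):
    n = len(values)
    return [sum(values[x] * cmath.exp(-2j * math.pi * k * x / n) for x in range(n))
            for k in range(n)]

def dft_2d_direct(image):
    """Double sum over x and y at once; result[v][u] holds the (u, v) frequency."""
    rows, cols = len(image), len(image[0])
    return [[sum(image[y][x] * cmath.exp(-2j * math.pi * (u * x / cols + v * y / rows))
                 for y in range(rows) for x in range(cols))
             for u in range(cols)]
            for v in range(rows)]

def dft_2d_separable(image):
    """Transform along x for every row, then along y for every column of that result."""
    rows, cols = len(image), len(image[0])
    row_pass = [dft_1d(row) for row in image]                 # row_pass[y][u]
    columns = [dft_1d([row_pass[y][u] for y in range(rows)])  # columns[u][v]
               for u in range(cols)]
    return [[columns[u][v] for u in range(cols)] for v in range(rows)]

image = [[1.0, 2.0, 0.0, 1.0],
         [0.0, 3.0, 1.0, 2.0],
         [4.0, 0.0, 2.0, 1.0]]

direct = dft_2d_direct(image)
separable = dft_2d_separable(image)
max_difference = max(abs(direct[v][u] - separable[v][u])
                     for v in range(3) for u in range(4))
print(max_difference)
```

The order of the two passes does not matter; transforming the columns first and the rows second gives the same result.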


Robyn Owens
10/29/1997