

Examples
We first consider a nonlinear test data set. Using a Gaussian kernel, we calculate
3 of the KPCA components, shown in Figure 1. We also show
a pseudo-density estimate [7] (Figure 2),
which is constructed from a particular weighted sum of squares of the components.
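A minimal sketch of the KPCA calculation with a Gaussian kernel is given below. The kernel width `sigma` is an assumed free parameter, not a value from the text, and the function returns the unnormalised projections of the training points onto the leading components, as plotted in Figure 1:

```python
import numpy as np

def kpca_components(X, n_components=3, sigma=1.0):
    """Sketch of kernel PCA with a Gaussian (RBF) kernel.

    Returns the (unnormalised) projections of the training data
    onto the leading kernel principal components.
    """
    n = X.shape[0]
    # Gaussian kernel matrix: K_ij = exp(-||x_i - x_j||^2 / (2 sigma^2))
    sq = np.sum(X**2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * X @ X.T
    K = np.exp(-d2 / (2.0 * sigma**2))
    # Centre the kernel matrix in feature space
    ones = np.full((n, n), 1.0 / n)
    Kc = K - ones @ K - K @ ones + ones @ K @ ones
    # Eigendecomposition; keep the leading eigenvectors
    vals, vecs = np.linalg.eigh(Kc)
    idx = np.argsort(vals)[::-1][:n_components]
    # Project each training point onto each retained component
    return Kc @ vecs[:, idx]
```

Applied to a noisy ring of 2D points, for instance, this yields the kind of component maps shown in Figure 1.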
Figure 1: A test nonlinear 2D data set (black circles), and 3 unnormalised KPCA components.
Figure 2: The pseudo-density estimate for the test data set.
As can be seen, this pseudo-density does indeed capture the nonlinear
distribution of the test data set.
As a second example, we consider images of handwritten digits, taken
from the UCI Database [2]. The 16x16 grayscale
images are encoded directly as 256-dimensional vectors, and 50 examples
of each digit are used in the data set. As can be seen from Figure 4,
the first 3 KPCA components separate out 3 of the digits, with the higher
components separating out the remaining digits in an analogous manner.
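The encoding step can be sketched as follows; the array shapes come from the text, but the random placeholder data stands in for the actual UCI digit images:

```python
import numpy as np

# Placeholder data standing in for the UCI digit images:
# 10 digit classes, 50 examples each, 16x16 grayscale.
rng = np.random.default_rng(0)
images = rng.random((10, 50, 16, 16))

# Encode each image directly as a 256-dimensional vector.
X = images.reshape(-1, 256)
print(X.shape)  # (500, 256): 50 examples of each of the 10 digits
```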
Figure 3: Examples of handwritten digits.
Figure 4: The digit training set, first 3 KPCA components. Coloured circles: 3 selected digits; black crosses: other 7 digits.
There are several important points to note about the behaviour of the
KPCA components, which should be contrasted with the behaviour of linear
PCA:

- The maximum number of components is determined not by the dimensionality
of the input data, but by the number of input data points.
- Not all sets of values of the components correspond to an actual point
in input space.
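The first point can be illustrated numerically: for n input points, the centred Gaussian kernel matrix can have up to n-1 nonzero eigenvalues, however low the input dimensionality. The kernel width and toy data below are arbitrary assumptions for illustration:

```python
import numpy as np

# 20 random points in only 2 input dimensions.
rng = np.random.default_rng(1)
n = 20
X = rng.standard_normal((n, 2))

# Gaussian kernel matrix, then centring in feature space.
sq = np.sum(X**2, axis=1)
K = np.exp(-(sq[:, None] + sq[None, :] - 2.0 * X @ X.T) / 2.0)
ones = np.full((n, n), 1.0 / n)
Kc = K - ones @ K - K @ ones + ones @ K @ ones

# Count eigenvalues that are numerically nonzero: far more
# components are available than the 2 input dimensions,
# bounded by the number of data points.
vals = np.linalg.eigvalsh(Kc)
n_nonzero = int(np.sum(vals > 1e-10))
print(n_nonzero)
```

Linear PCA on the same data would yield at most 2 components.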


Carole Twining
2001-10-02