next up previous
Next: Curse of dimensionality and Up: Approximation Using Recursive Partition Previous: Approximation Using Recursive Partition

The Most Discriminating Features (MDF)

Our goal is to find the best features to classify hand signs. The best criterion to evaluate feature sets is the Bayes error. However, in practice, it is hard to obtain a posteriori probability functions, and since the sample size is small, their estimates have severe biases and variances. One frequently used criterion in practice is based on a family of functions of scatter matrics, which are conceptually simple and give systematic feature extraction algorithms.

Suppose samples of Y are m-dimensional random vectors from c classes. The ith class has a probability tex2html_wrap_inline1710 , a mean vector tex2html_wrap_inline1712 and a scatter matrix tex2html_wrap_inline1714 . The within-class scatter matrix is defined by

equation276

The between-class scatter matrix is

equation286

where the grand mean m is defined as tex2html_wrap_inline1716 . The mixture scatter matrix is the covariance matrix of all the samples regardless of their class assignments:

equation299

Suppose we use k-dimensional linear features tex2html_wrap_inline1718 where W is an tex2html_wrap_inline1722 rectangular matrix whose column vectors are linearly independent. The above mapping represents a linear projection from m-dimensional space to k-dimensional space. The samples tex2html_wrap_inline1724 project to a corresponding set of samples tex2html_wrap_inline1726 whose within-class scatter, and between-class scatter matrices are tex2html_wrap_inline1728 and tex2html_wrap_inline1730 , respectively.

Thus, the problem of feature extraction for classification is to find tex2html_wrap_inline1732 which maximizes tex2html_wrap_inline1734 . tex2html_wrap_inline1736 is larger when the between class scatter is larger or the within-class scatter is smaller. For details of computing tex2html_wrap_inline1732 , the reader is referred to [26]. We call the feature extracted by the above method the most discriminating features (MDFs).



Yuntao Cui
Wed Jun 25 16:00:42 EDT 1997