An important issue to study here is how well the above approximators can approximate a function f. The answer is closely related to the way samples are generated for the learning set L. In [44], Weng and Chen have shown that the nearest neighbor approximator approaches f pointwise in probability. In this section, we show that the same result also holds for the recursive partition tree approximator.
Due to the high complexity and the undetermined nature of the way in which a learning set L is drawn from the real world, it is effective to consider that X(L), the set of samples in S, is generated randomly. We know that a fixed L is a special case of a random L, in which the probability distribution is concentrated at a single location. Thus, we consider X(L) as a random sample from S. The learning set L is generated by acquiring samples from S according to a d-dimensional probability distribution function P.
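To state the sampling model explicitly, the following is a minimal formalization; the symbol n for the size of L and the symbols $\mathbf{x}_i$ for the individual samples, as well as the independence assumption, are introduced here only for illustration and are not taken from the text:
\[
X(L) = \{\mathbf{x}_1, \mathbf{x}_2, \ldots, \mathbf{x}_n\}, \qquad \mathbf{x}_i \in S \subseteq \mathbb{R}^d,
\]
where the $\mathbf{x}_i$ are drawn independently from S according to the d-dimensional probability distribution function P.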
If S consists of a finite number of discrete points, a point being positively supported in P means that selecting that point as a sample is not a zero-probability event. If S consists of infinitely many points, a point being positively supported in P means that, for any small neighborhood centered at the point, selecting a sample from that neighborhood is not a zero-probability event. In practice, we are not interested in cases that almost never appear in a real-world application. An approximating function can assume any value in subregions of S that will never be used in the application, without hurting the real performance of the system. Thus, we just need to investigate how well the approximation can do at points that are positively supported.
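For concreteness, the condition of being positively supported can be written as follows; the point symbol $\mathbf{x}_0$ and the radius $\delta$ are hypothetical notation introduced only for this illustration, with $\mathbf{x}$ denoting a sample drawn according to P:
\[
\text{discrete } S:\;\; \Pr\{\mathbf{x} = \mathbf{x}_0\} > 0; \qquad
\text{general } S:\;\; \Pr\{\|\mathbf{x} - \mathbf{x}_0\| < \delta\} > 0 \;\text{ for every } \delta > 0.
\]
Only at points satisfying the appropriate condition does the approximation quality need to be investigated.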
For a classification problem, we have a discrete set of categories to be assigned to the input. Then, f is continuous only at the interior points of the regions that map to a single category.
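To see why continuity can hold only at interior points, consider the following sketch; the point $\mathbf{x}_0$ and the radius $\delta$ are hypothetical notation. If f takes values in a discrete set of category labels, then f is continuous at $\mathbf{x}_0$ exactly when it is locally constant there:
\[
\exists\, \delta > 0:\quad f(\mathbf{x}) = f(\mathbf{x}_0) \;\text{ whenever } \|\mathbf{x} - \mathbf{x}_0\| < \delta,
\]
that is, when $\mathbf{x}_0$ lies in the interior of the region assigned to its own category. At a point on the boundary between two categories this condition fails, so such points are excluded.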
The proof is presented in Appendix A. Theorem 1 means that the RPTA approaches f pointwise in probability as the size of the training set L increases without bound.
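Spelled out, the pointwise convergence in probability stated above takes the following form; the symbol $\hat{f}_n$ for the approximation produced by the RPTA from a training set of size n is assumed notation, not taken from the text:
\[
\lim_{n \to \infty} \Pr\bigl\{\, \|\hat{f}_n(\mathbf{x}) - f(\mathbf{x})\| > \epsilon \,\bigr\} = 0 \quad \text{for every } \epsilon > 0,
\]
at every positively supported point $\mathbf{x}$ at which f is continuous.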