An Incremental Approach to Training Data Selection in Neural Networks

Sethu Vijayakumar

Abstract of Master's thesis 93M17310, Tokyo Institute of Technology, March 1995.

Neural networks (NNs) are parallel, distributed, adaptive information processing structures that develop information processing abilities in response to exposure to an information environment. From the viewpoint of its input-output relationship, a single NN can be regarded as a single real-valued function.

The learning problem is to construct a NN that best approximates a desired function on the basis of the training data. This can also be viewed as an inverse problem: that of obtaining a learning operator which reconstructs the learned function from the set of m training data. The optimization of the learning operator has been studied thoroughly under various optimization criteria, such as the memorization criterion, the Wiener criterion, and the projection criterion. However, with the objective of optimizing the generalizing ability of the NN, there is a need to look into methods of selecting an optimal training set rather than simply using a given set of training data.
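The learning-operator view can be made concrete with a small sketch. Assuming a linear-in-parameters model with Gaussian basis functions (an illustrative choice, not the setting of the thesis), a memorization-style learning operator is the Moore-Penrose pseudoinverse of the design matrix, which maps the m sample values directly to the parameters of the learned function:

```python
import numpy as np

# Illustrative setup: reconstruct a function from m samples using a
# linear-in-parameters model f(x) = sum_k w_k * phi_k(x).  The learning
# operator maps the m-vector of sample values to the weights; under a
# memorization-style criterion it is the pseudoinverse of the design matrix.

def design_matrix(xs, n_basis=6):
    """Gaussian basis functions spread over [0, 1] (an assumed choice)."""
    centers = np.linspace(0.0, 1.0, n_basis)
    return np.exp(-((xs[:, None] - centers[None, :]) ** 2) / (2 * 0.1 ** 2))

def learn(xs, ys, n_basis=6):
    """Batch learning operator: minimum-norm least-squares solution."""
    A = design_matrix(xs, n_basis)
    return np.linalg.pinv(A) @ ys  # the learning operator applied to the data

def predict(w, xs, n_basis=6):
    return design_matrix(xs, n_basis) @ w

rng = np.random.default_rng(0)
xs = rng.uniform(0.0, 1.0, 20)
ys = np.sin(2 * np.pi * xs)  # noiseless target, for illustration only
w = learn(xs, ys)
```

Selecting the training set then amounts to choosing the sample locations `xs` so that the reconstructed function generalizes well away from them.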

Methods of selecting the training set can be broadly classified into two types. The first approach selects the entire training set in one go (batch), while the second incrementally selects the sample points one after another. In this research work, we use the incremental approach, since it leaves scope for adapting the sampling strategy based on each newly observed sample value, i.e., active sampling.

To solve the problems stated above, the aim of the research work was divided into two sub-objectives. The first is to select the next best sample point from the viewpoint of optimizing the generalizing ability of the NN. The second, a direct consequence of the first, is to incorporate the newly selected sample point into the learning process efficiently and incrementally.

To this end, a concept called incremental learning, based on successive learning with an increasing number of sample points, was introduced and formalized. Within this framework, a method for incrementally computing the new learning operator as a function of the old learning operator and the new sample point was devised, which addresses the second objective. The function learned from m+1 sample points is thus computed incrementally from the previously learned function. Furthermore, a scheme for selecting the next sample point so as to reduce the computational cost of this incremental computation was formulated. In this research work, an incremental Wiener criterion has been used as the optimization criterion for the learned function.
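The thesis's incremental Wiener criterion is not reproduced here, but the flavor of such an update can be sketched with the closely related recursive least-squares recursion: the state after m samples (weights plus an inverse Gram matrix) is updated with one new sample via a rank-one (Sherman-Morrison) identity, so the (m+1)-sample solution is obtained without re-solving from scratch. This is an illustrative analogue, not the method of the thesis:

```python
import numpy as np

# Recursive least-squares style incremental update: the solution after
# m samples is corrected by one new sample using a rank-one identity,
# instead of refitting on all m+1 samples.

def rls_init(n_params, ridge=1e-6):
    # The abstract does not specify an initialization; a small ridge
    # term regularizing the initial inverse Gram matrix is an assumption.
    return np.zeros(n_params), np.eye(n_params) / ridge

def rls_update(w, P, phi, y):
    """Incorporate one new sample (phi = feature vector, y = target)."""
    Pphi = P @ phi
    gain = Pphi / (1.0 + phi @ Pphi)      # Kalman-style gain vector
    w_new = w + gain * (y - phi @ w)      # correct by the prediction error
    P_new = P - np.outer(gain, Pphi)      # Sherman-Morrison rank-one update
    return w_new, P_new

# Incrementally fit y = phi . w_true, one sample at a time.
rng = np.random.default_rng(1)
w_true = np.array([1.0, -2.0, 0.5])
w, P = rls_init(3)
for _ in range(50):
    phi = rng.normal(size=3)
    w, P = rls_update(w, P, phi, phi @ w_true)
```

Each update costs O(n^2) in the number of parameters rather than the O(n^3) of refitting, which is the kind of computational saving the incremental computation of the learning operator is after.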

With regard to the first objective, it was proved theoretically that the sample points which reduce the computational cost also perform better from the viewpoint of generalizing ability, and this was verified through computer simulations. Hence, the problems of selecting the next best sample point to reduce computational cost and of optimizing the generalizing ability were shown to possess a common solution.

However, the task of obtaining the best sample point for optimizing the generalizing ability of the NN remains unsolved and will be taken up as a future objective in continuation of this research work.

Click here to download a compressed (gzip-ed) version of the thesis (80 pages).