Boosting and Other Ensemble Methods

TLDR

The study compares three neural-network ensemble methods (two boosting variants and committees of independently trained networks) with a single network. For each of the four algorithms, the authors measure training and test error curves on an OCR task as a function of training set size and computational cost, using three network architectures. A single network performs best on small training sets, while some version of boosting is best on large ones; at a given computational cost, boosting is always best. For the original boosting algorithm, the training error decreases as the training set grows until it approaches the test error rate, which may inform the search for better training algorithms.

Abstract

We compare the performance of three types of neural network-based ensemble techniques to that of a single neural network. The ensemble algorithms are two versions of boosting and committees of neural networks trained independently. For each of the four algorithms, we experimentally determine the test and training error curves in an optical character recognition (OCR) problem as a function of both training set size and computational cost, using three architectures. We show that a single machine is best for small training sets, while for large training sets some version of boosting is best. However, for a given computational cost, boosting is always best. Furthermore, we show a surprising result for the original boosting algorithm: as the training set size increases, the training error decreases until it asymptotes to the test error rate. This has potential implications in the search for better training algorithms.
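
As a rough illustration of the committee idea mentioned above, the following sketch combines the label predictions of several independently trained classifiers and measures the resulting error rate. The abstract does not say how committee outputs are combined, so plurality voting is an assumption here, and the names `committee_vote` and `error_rate` are hypothetical helpers introduced only for this example.

```python
from collections import Counter


def committee_vote(member_predictions):
    """Combine per-member label predictions by plurality vote.

    member_predictions: list of equal-length lists, one list per
    committee member, each holding that member's predicted labels.
    Plurality voting is an assumed combination rule, not one taken
    from the abstract.
    """
    if not member_predictions:
        raise ValueError("need at least one committee member")
    combined = []
    # zip(*...) groups the members' votes for each example together.
    for votes in zip(*member_predictions):
        label, _count = Counter(votes).most_common(1)[0]
        combined.append(label)
    return combined


def error_rate(predicted, actual):
    """Fraction of examples whose predicted label differs from the true one."""
    if len(predicted) != len(actual) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    wrong = sum(p != a for p, a in zip(predicted, actual))
    return wrong / len(actual)


# Made-up digit predictions from three hypothetical committee members.
members = [
    [3, 5, 8, 1],
    [3, 5, 1, 1],
    [7, 5, 8, 1],
]
true_labels = [3, 5, 8, 7]
voted = committee_vote(members)
```

Evaluating `error_rate(voted, true_labels)` for committees trained on different amounts of data would trace out one point per training set size, which is the kind of error curve the abstract describes.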
