Publication | Open Access

Learning with ensembles: How overfitting can be useful

Year: 1995 · Citations: 340 · References: 3

TLDR

The study investigates the characteristics of learning with ensembles. The authors use ensembles of students with diverse regularization parameters to make performance robust to the unknown level of noise in the training data. The analysis shows that large ensembles benefit from under-regularized students that over-fit the training data, that globally optimal performance can be obtained by choosing the students' training set sizes appropriately, and that small ensembles can further improve generalization through optimization of the ensemble weights, especially when training noise is present.

Abstract

We study the characteristics of learning with ensembles. Solving exactly the simple model of an ensemble of linear students, we find surprisingly rich behaviour. For learning in large ensembles, it is advantageous to use under-regularized students, which actually over-fit the training data. Globally optimal performance can be obtained by choosing the training set sizes of the students appropriately. For smaller ensembles, optimization of the ensemble weights can yield significant improvements in ensemble generalization performance, in particular if the individual students are subject to noise in the training process. Choosing students with a wide range of regularization parameters makes this improvement robust against changes in the unknown level of noise in the training data.

