Beating the hold-out

Abstract

The empirical error on a test set, the hold-out estimate, often is a more reliable estimate of generalization error than the observed error on the training set, the training estimate. K-fold cross validation is used in practice with the hope of being more accurate than the hold-out estimate without reducing the number of training examples. We argue that the k-fold estimate does in fact achieve this goal. Specifically, we show that for any nontrivial learning problem and learning algorithm that is insensitive to example ordering, the k-fold estimate is strictly more accurate than a single hold-out estimate on l/k of the data, for 2 < k < n (It = 12 is leave-one-out), based on its variance and all higher moments. Previous bounds were termed sanitycheck because they compared the k-fold estimate to the training estimate and, further, restricted the VC dimension and required a notion of hypothesis stability [2]. In order to avoid these dependencies, we consider a k-fold hypothesis that is a randomized combination or average of the LT individual hypotheses. We introduceprogressive validation as another possible improvement on the hold-out estimate. This estimate of the generalization error is, in many ways, as good as that of a single hold-out, but it uses an average of half as many examples for testing. The procedure also involves a hold-out set, but after an example has been tested, it is added to the training set and the learning algorithm is rerun.

References

Page 1

	Year	Citations

Page 1