Concepedia

TLDR

L2Boost is a computationally simple boosting variant derived from functional gradient descent with the L2 loss; like other boosting algorithms, it applies a prechosen learner iteratively. Using the explicit expression for the repeated refitting of residuals, the authors analyze symmetric linear learners in both regression and classification. They find an exponential bias-variance trade-off in which the variance grows very slowly with the number of boosting iterations, show that boosted smoothing splines achieve the optimal rate of convergence and adapt to higher-order, unknown smoothness, and demonstrate on simulated and real data that L2Boost, especially with component-wise cubic smoothing splines, is practical and effective for high-dimensional predictors.

Abstract

This article investigates a computationally simple variant of boosting, L2Boost, which is constructed from a functional gradient descent algorithm with the L2-loss function. Like other boosting algorithms, L2Boost uses many times in an iterative fashion a prechosen fitting method, called the learner. Based on the explicit expression of refitting of residuals of L2Boost, the case with (symmetric) linear learners is studied in detail in both regression and classification. In particular, with the boosting iteration m working as the smoothing or regularization parameter, a new exponential bias-variance trade-off is found with the variance (complexity) term increasing very slowly as m tends to infinity. When the learner is a smoothing spline, an optimal rate of convergence result holds for both regression and classification and the boosted smoothing spline even adapts to higher-order, unknown smoothness. Moreover, a simple expansion of a (smoothed) 0–1 loss function is derived to reveal the importance of the decision boundary, bias reduction, and impossibility of an additive bias-variance decomposition in classification. Finally, simulation and real dataset results are obtained to demonstrate the attractiveness of L2Boost. In particular, we demonstrate that L2Boosting with a novel component-wise cubic smoothing spline is both practical and effective in the presence of high-dimensional predictors.
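The residual-refitting idea in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: it assumes NumPy, and for brevity it swaps the paper's component-wise cubic smoothing spline for a component-wise least-squares learner (one slope on one predictor per iteration). The names `fit_componentwise_ls` and `l2boost` and the default `n_iter=50` are hypothetical, chosen only for this illustration.

```python
import numpy as np


def fit_componentwise_ls(X, r):
    """Weak learner (an assumption, not the paper's spline learner):
    regress r on each column of X separately and keep the best column."""
    # Squared norm of every column; assumes no column is identically zero.
    norms = np.einsum("ij,ij->j", X, X)
    # Least-squares slope through the origin for each column.
    betas = X.T @ r / norms
    # Reduction in residual sum of squares achieved by each column.
    gains = betas ** 2 * norms
    j = int(np.argmax(gains))
    return j, betas[j]


def l2boost(X, y, n_iter=50):
    """Sketch of L2 boosting: repeatedly refit the current residuals.

    For the loss (y - F)^2 / 2 the negative gradient is the residual
    y - F, so each functional gradient step is a fit to the residuals.
    n_iter plays the role of the regularization parameter m.
    """
    n, p = X.shape
    intercept = y.mean()             # initial fit: the sample mean
    coef = np.zeros(p)
    fitted = np.full(n, intercept)
    for _ in range(n_iter):
        residuals = y - fitted       # negative gradient of the L2 loss
        j, beta = fit_componentwise_ls(X, residuals)
        coef[j] += beta              # add the new fit to the ensemble
        fitted += beta * X[:, j]
    return intercept, coef
```

Because the number of iterations acts as the smoothing parameter, it would normally be chosen by cross-validation or a similar criterion rather than fixed in advance; the sketch also assumes the columns of `X` have been centered so that the sample-mean intercept stays appropriate.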
