Random Forests and Kernel Methods

TLDR

Random forests are ensemble methods that grow decision trees and average their predictions. They perform well in practice, especially in high-dimensional settings, and several studies suggest a theoretical link to kernel methods. This paper works out that connection in detail. By slightly modifying the definition of random forests, the authors obtain kernel-based estimators called KeRF (kernel based on random forests), give explicit expressions for some specific forest models, and establish upper bounds on their rates of consistency. Empirically, KeRF estimates perform comparably to or better than standard random forest estimates.

Abstract

Random forests are ensemble methods that grow trees as base learners and combine their predictions by averaging. Random forests are known for their good practical performance, particularly in high-dimensional settings. On the theoretical side, several studies highlight the potentially fruitful connection between random forests and kernel methods. In this paper, we work out this connection in detail. In particular, we show that by slightly modifying their definition, random forests can be rewritten as kernel methods (called KeRF, for kernel based on random forests), which are more interpretable and easier to analyze. Explicit expressions of KeRF estimates for some specific random forest models are given, together with upper bounds on their rate of consistency. We also show empirically that the KeRF estimates compare favourably to the random forest estimates.
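
The rewriting described in the abstract can be illustrated with a toy sketch. It is not the paper's implementation: it assumes that each "tree" is just a random partition of [0, 1) into intervals, and that the kernel is the fraction of trees in which two points fall in the same cell. The function names (random_partition, connection_kernel, kerf_predict, and so on) and the data-generating model are hypothetical, chosen only for illustration. The forest-style estimate averages the per-tree cell means, whereas the kernel-style estimate pools all training responses, weighted by how often each training point shares a cell with the query.

```python
import bisect
import random


def random_partition(n_cuts, rng):
    """Sorted random cut points splitting [0, 1) into n_cuts + 1 cells (a toy 'tree')."""
    return sorted(rng.random() for _ in range(n_cuts))


def cell(cuts, x):
    """Index of the cell of the partition that contains x."""
    return bisect.bisect_right(cuts, x)


def forest_predict(trees, X, Y, x):
    """Forest-style estimate: average over trees of the mean response in x's cell."""
    per_tree = []
    for cuts in trees:
        in_cell = [y for xi, y in zip(X, Y) if cell(cuts, xi) == cell(cuts, x)]
        # Toy convention (an assumption): a tree whose cell is empty predicts 0.
        per_tree.append(sum(in_cell) / len(in_cell) if in_cell else 0.0)
    return sum(per_tree) / len(per_tree)


def connection_kernel(trees, x, z):
    """Fraction of trees in which x and z fall in the same cell."""
    return sum(cell(cuts, x) == cell(cuts, z) for cuts in trees) / len(trees)


def kerf_predict(trees, X, Y, x):
    """Kernel-style estimate: responses weighted by the connection kernel."""
    weights = [connection_kernel(trees, x, xi) for xi in X]
    total = sum(weights)
    if total == 0:
        return 0.0  # same toy convention as above when no point is connected to x
    return sum(w * y for w, y in zip(weights, Y)) / total


# Synthetic data (assumed model: Y = 2 X + small Gaussian noise).
rng = random.Random(0)
X = [rng.random() for _ in range(200)]
Y = [2.0 * xi + rng.gauss(0.0, 0.1) for xi in X]
trees = [random_partition(7, rng) for _ in range(50)]

print("forest-style estimate at 0.5:", forest_predict(trees, X, Y, 0.5))
print("kernel-style estimate at 0.5:", kerf_predict(trees, X, Y, 0.5))
```

With a single tree the two estimates coincide, since the kernel reduces to the indicator of sharing a cell. With several trees they differ because the kernel-style estimate gives more weight to densely populated cells, which is one way to read the "slight modification" mentioned in the abstract.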
