
Random decision forests

Tin Kam Ho · 2002 · Unknown Venue

Citations: 4.9K · References: 14

TLDR

Decision trees are fast classifiers, yet traditional training limits their complexity, causing suboptimal training accuracy and potential loss of generalization. The authors propose a tree‑based classifier whose capacity can be arbitrarily expanded to improve accuracy on both training and unseen data. They construct multiple trees in randomly selected subspaces of the feature space, enabling complementary generalization. Experiments on handwritten digit recognition demonstrate that combining trees from different subspaces monotonically improves classification performance.

Abstract

Decision trees are attractive classifiers due to their high execution speed. But trees derived with traditional methods often cannot be grown to arbitrary complexity for possible loss of generalization accuracy on unseen data. The limitation on complexity usually means suboptimal accuracy on training data. Following the principles of stochastic modeling, we propose a method to construct tree-based classifiers whose capacity can be arbitrarily expanded for increases in accuracy for both training and unseen data. The essence of the method is to build multiple trees in randomly selected subspaces of the feature space. Trees in different subspaces generalize their classification in complementary ways, and their combined classification can be monotonically improved. The validity of the method is demonstrated through experiments on the recognition of handwritten digits.
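The core idea of the abstract — train each tree on a randomly selected subspace of the features and combine the trees by voting — can be illustrated with a minimal sketch. This is not the paper's implementation: for brevity, each "tree" below is only a one-level decision stump fit on a random feature subset, standing in for a fully grown tree, and all names (`fit_stump`, `random_subspace_ensemble`, etc.) are hypothetical.

```python
import random

def fit_stump(X, y, feats):
    # Exhaustive search for the (feature, threshold, sign) triple within
    # the given feature subspace that best separates two classes {0, 1}.
    best, best_acc = None, -1.0
    for f in feats:
        for t in sorted(set(row[f] for row in X)):
            for sign in (1, -1):
                preds = [1 if sign * (row[f] - t) >= 0 else 0 for row in X]
                acc = sum(p == label for p, label in zip(preds, y)) / len(y)
                if acc > best_acc:
                    best_acc, best = acc, (f, t, sign)
    return best

def predict_stump(stump, row):
    f, t, sign = stump
    return 1 if sign * (row[f] - t) >= 0 else 0

def random_subspace_ensemble(X, y, n_trees=25, subspace_size=2, seed=0):
    # Build each weak classifier in a randomly selected feature subspace,
    # as in the random subspace idea described in the abstract.
    rng = random.Random(seed)
    n_features = len(X[0])
    stumps = []
    for _ in range(n_trees):
        feats = rng.sample(range(n_features), subspace_size)
        stumps.append(fit_stump(X, y, feats))
    return stumps

def predict(stumps, row):
    # Combine the individual classifiers by simple majority vote.
    votes = sum(predict_stump(s, row) for s in stumps)
    return 1 if votes * 2 >= len(stumps) else 0

# Toy usage: four features, each of which separates the two classes.
X = [[0, 1, 0, 2], [1, 0, 2, 1], [9, 8, 10, 9], [10, 9, 8, 10]]
y = [0, 0, 1, 1]
ensemble = random_subspace_ensemble(X, y, n_trees=10, subspace_size=2, seed=1)
print([predict(ensemble, row) for row in X])  # → [0, 0, 1, 1]
```

Because each classifier only ever sees its own feature subset, the ensemble's capacity can be expanded by adding more of them without any single tree being grown deeper, which is the mechanism the abstract credits for the monotonic improvement.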
