Fifty Years of Classification and Regression Trees

TLDR

In the fifty years since the first regression tree algorithm was published, classification and regression trees have gained linear splits, nearest-neighbor and kernel density models fitted within partitions, and support for many traditional statistical models, while free software has broadened their use. This article surveys the developments and briefly reviews the key ideas behind some of the major algorithms.

Abstract

Fifty years have passed since the publication of the first regression tree algorithm. New techniques have added capabilities that far surpass those of the early methods. Modern classification trees can partition the data with linear splits on subsets of variables and fit nearest neighbor, kernel density, and other models in the partitions. Regression trees can fit almost every kind of traditional statistical model, including least-squares, quantile, logistic, Poisson, and proportional hazards models, as well as models for longitudinal and multiresponse data. Greater availability and affordability of software (much of which is free) have played a significant role in helping the techniques gain acceptance and popularity in the broader scientific community. This article surveys the developments and briefly reviews the key ideas behind some of the major algorithms.
