Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid

Author: Ron Kohavi

Year: 1996

Venue: Unknown

Citations: 1.4K

References: 10

TLDR

Naive-Bayes classifiers are surprisingly accurate even when their conditional independence assumption is violated, yet prior studies have mainly used small databases. The author proposes NBTree, a hybrid that combines a decision-tree structure with Naive-Bayes leaves. NBTree builds a decision tree in which each internal node splits on a single attribute and each leaf holds a Naive-Bayes classifier trained on the subset of data reaching that leaf. On some larger databases, Naive-Bayes accuracy does not scale up as well as that of decision trees, while NBTree frequently outperforms both.

Abstract

Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as regular decision-trees, but the leaves contain Naive-Bayesian classifiers. The approach retains the interpretability of Naive-Bayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially in the larger databases tested.
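The hybrid structure described in the abstract can be illustrated with a minimal sketch in plain Python. It assumes categorical attributes only, and every name in it (`NaiveBayes`, `build_nbtree`, `min_leaf`, `min_gain`) is hypothetical. The split-selection rule used here, picking the attribute whose per-branch Naive-Bayes models give the best training accuracy, is a stand-in chosen for brevity and is not the criterion from the paper.

```python
from collections import Counter, defaultdict
import math


class NaiveBayes:
    """Categorical Naive-Bayes with Laplace (add-one) smoothing."""

    def __init__(self, rows, labels, n_values):
        self.classes = sorted(set(labels))
        self.class_counts = Counter(labels)
        self.total = len(labels)
        self.n_values = n_values  # distinct values per attribute, for smoothing
        self.counts = Counter()   # (attribute index, value, class) -> count
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[(i, v, y)] += 1

    def predict(self, row):
        def log_score(c):
            s = math.log(self.class_counts[c] / self.total)
            for i, v in enumerate(row):
                s += math.log((self.counts[(i, v, c)] + 1)
                              / (self.class_counts[c] + self.n_values[i]))
            return s
        return max(self.classes, key=log_score)


def accuracy(model, rows, labels):
    return sum(model.predict(r) == y for r, y in zip(rows, labels)) / len(labels)


def partition(rows, labels, attr):
    """Group rows and labels by the value of one attribute."""
    groups = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        groups[row[attr]][0].append(row)
        groups[row[attr]][1].append(y)
    return groups


def build_nbtree(rows, labels, n_values, min_leaf=4, min_gain=0.05):
    """Return ("leaf", nb) or ("split", attr, {value: subtree}, fallback_nb)."""
    nb = NaiveBayes(rows, labels, n_values)
    best_attr, best_groups = None, None
    # Assumption: a split must beat the node's own accuracy by min_gain.
    best_acc = accuracy(nb, rows, labels) + min_gain
    for attr in range(len(n_values)):
        groups = partition(rows, labels, attr)
        if len(groups) < 2 or any(len(ys) < min_leaf for _, ys in groups.values()):
            continue
        # Weighted training accuracy of Naive-Bayes models fitted per branch.
        correct = sum(accuracy(NaiveBayes(rs, ys, n_values), rs, ys) * len(ys)
                      for rs, ys in groups.values())
        split_acc = correct / len(labels)
        if split_acc > best_acc:
            best_attr, best_acc, best_groups = attr, split_acc, groups
    if best_attr is None:
        return ("leaf", nb)  # no worthwhile split: keep a Naive-Bayes leaf
    children = {v: build_nbtree(rs, ys, n_values, min_leaf, min_gain)
                for v, (rs, ys) in best_groups.items()}
    return ("split", best_attr, children, nb)


def predict(tree, row):
    """Follow univariate splits to a leaf, then ask its Naive-Bayes model."""
    while tree[0] == "split":
        _, attr, children, fallback = tree
        if row[attr] not in children:  # value unseen at this node
            return fallback.predict(row)
        tree = children[row[attr]]
    return tree[1].predict(row)


# Toy data: class = a XOR b, plus an irrelevant third attribute.
rows = [(a, b, n) for a in (0, 1) for b in (0, 1) for n in (0, 1)] * 4
labels = [a ^ b for a, b, _ in rows]
n_values = [2, 2, 2]

flat = NaiveBayes(rows, labels, n_values)
tree = build_nbtree(rows, labels, n_values)
print("flat Naive-Bayes:", accuracy(flat, rows, labels))
print("hybrid tree:", sum(predict(tree, r) == y for r, y in zip(rows, labels)) / len(rows))
```

The XOR toy data is a deliberately extreme violation of the independence assumption: a flat Naive-Bayes model stays at chance on it, while a single univariate split leaves each branch with a problem that Naive-Bayes handles well. Real data and the results reported in the paper are of course less clear-cut.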
