Publication | Closed Access
Scaling up the accuracy of Naive-Bayes classifiers: a decision-tree hybrid
Year: 1996 | Citations: 1.4K | References: 10 | Venue: unknown
Topics: Engineering, Machine Learning, Text Mining, Classification Method, Information Retrieval, Data Science, Data Mining, Pattern Recognition, Decision Tree, Management, Decision Tree Learning, Naive-Bayes Induction Algorithms, Statistics, Multiple Classifier System, Predictive Analytics, Knowledge Discovery, Naive-Bayes Classifiers, Intelligent Classification, Computer Science, New Algorithm, Data Classification, Classification
Naive-Bayes classifiers are surprisingly accurate even when their conditional independence assumption is violated, yet prior studies have mainly used small databases. The authors show that on some larger databases the accuracy of Naive-Bayes does not scale up as well as that of decision trees, and they propose NBTree, a hybrid of the two. NBTree builds a decision tree whose internal nodes split on a single attribute and whose leaves each hold a Naive-Bayes classifier trained on the subset of data reaching that leaf. NBTree frequently outperforms both Naive-Bayes and decision trees, especially on the larger databases tested.
Naive-Bayes induction algorithms were previously shown to be surprisingly accurate on many classification tasks even when the conditional independence assumption on which they are based is violated. However, most studies were done on small databases. We show that in some larger databases, the accuracy of Naive-Bayes does not scale up as well as decision trees. We then propose a new algorithm, NBTree, which induces a hybrid of decision-tree classifiers and Naive-Bayes classifiers: the decision-tree nodes contain univariate splits as regular decision-trees, but the leaves contain Naive-Bayesian classifiers. The approach retains the interpretability of Naive-Bayes and decision trees, while resulting in classifiers that frequently outperform both constituents, especially in the larger databases tested.
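The hybrid structure described in the abstract can be illustrated with a minimal sketch: internal nodes carry univariate splits and each leaf holds a Naive-Bayes model fitted to the rows that reach it. This is not the authors' implementation. The abstract does not say how splits are chosen or when growth stops, so the information-gain criterion, the `depth` and `min_leaf` parameters, the Laplace smoothing, and all names below (`NaiveBayes`, `build`, `predict`, `accuracy`, the toy data) are assumptions made for illustration only.

```python
import math
from collections import Counter, defaultdict


class NaiveBayes:
    """Categorical Naive-Bayes with Laplace smoothing (illustrative helper)."""

    def __init__(self, rows, labels, n_values):
        # n_values[i] is the number of distinct values attribute i can take.
        self.classes = sorted(set(labels))
        self.n = len(labels)
        self.n_values = n_values
        self.class_counts = Counter(labels)
        self.counts = Counter()  # (attribute index, value, class) -> count
        for row, y in zip(rows, labels):
            for i, v in enumerate(row):
                self.counts[(i, v, y)] += 1

    def predict(self, row):
        def log_posterior(c):
            lp = math.log((self.class_counts[c] + 1) / (self.n + len(self.classes)))
            for i, v in enumerate(row):
                lp += math.log((self.counts[(i, v, c)] + 1)
                               / (self.class_counts[c] + self.n_values[i]))
            return lp
        return max(self.classes, key=log_posterior)


def entropy(labels):
    n = len(labels)
    return -sum(k / n * math.log2(k / n) for k in Counter(labels).values())


def build(rows, labels, n_values, depth=1, min_leaf=4):
    """Grow a tree of univariate splits; every node keeps a Naive-Bayes model.

    Assumption: splits are picked by information gain and growth stops at a
    fixed depth. The source does not specify either choice.
    """
    node = {"model": NaiveBayes(rows, labels, n_values)}
    best_attr, best_gain = None, 0.0
    if depth > 0:
        for i in range(len(n_values)):
            parts = defaultdict(list)
            for row, y in zip(rows, labels):
                parts[row[i]].append(y)
            if len(parts) < 2 or min(len(p) for p in parts.values()) < min_leaf:
                continue
            gain = entropy(labels) - sum(
                len(p) / len(labels) * entropy(p) for p in parts.values())
            if gain > best_gain:
                best_attr, best_gain = i, gain
    if best_attr is None:
        return node  # leaf: only the Naive-Bayes model is used
    groups = defaultdict(lambda: ([], []))
    for row, y in zip(rows, labels):
        groups[row[best_attr]][0].append(row)
        groups[row[best_attr]][1].append(y)
    node["attr"] = best_attr
    node["children"] = {v: build(r, l, n_values, depth - 1, min_leaf)
                        for v, (r, l) in groups.items()}
    return node


def predict(node, row):
    # Follow univariate splits down to a leaf, then ask its Naive-Bayes model.
    while "attr" in node:
        child = node["children"].get(row[node["attr"]])
        if child is None:
            break  # unseen value: fall back to the current node's model
        node = child
    return node["model"].predict(row)


def accuracy(predict_fn, rows, labels):
    return sum(predict_fn(r) == y for r, y in zip(rows, labels)) / len(labels)


# Hypothetical toy data: the class is the XOR of two binary attributes, which
# violates the conditional independence assumption of Naive-Bayes.
rows = [(0, 0)] * 4 + [(0, 1)] * 4 + [(1, 0)] * 4 + [(1, 1)] * 8
labels = [0] * 4 + [1] * 4 + [1] * 4 + [0] * 8
n_values = [2, 2]

plain_nb = NaiveBayes(rows, labels, n_values)
hybrid = build(rows, labels, n_values, depth=1)

nb_acc = accuracy(plain_nb.predict, rows, labels)
hybrid_acc = accuracy(lambda r: predict(hybrid, r), rows, labels)
print(f"Naive-Bayes: {nb_acc:.2f}  hybrid: {hybrid_acc:.2f}")
```

On this toy data a single split removes the attribute interaction that a global Naive-Bayes model cannot represent, so the per-leaf models fit their subsets far better. It is evaluated on its own training rows and says nothing about the results reported in the paper.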