Publication | Open Access
Interpretable classifiers using rules and Bayesian analysis: Building a better stroke prediction model
Year: 2015 | Citations: 756 | References: 68
Artificial Intelligence, Bayesian Decision Theory, Engineering, Machine Learning, Bayesian Rule Lists, Biomedical Artificial Intelligence, Classification Method, Data Science, Data Mining, Pattern Recognition, Biomedical Data Science, Biostatistics, Bayesian Methods, Interpretability, Decision Lists, Public Health, Statistics, Prediction Modelling, Predictive Analytics, Knowledge Discovery, Learning Classifier System, Intelligent Classification, Predictive Learning, Interpretable Classifiers, Bayesian Statistics, Classifier System, Health Informatics
Recent advances in personalized medicine motivate the development of highly accurate, interpretable medical scoring systems. The study introduces Bayesian Rule Lists, a generative model that yields a posterior distribution over decision lists. A sparsity-encouraging prior keeps the lists short, so each model is a small set of if-then rules that can serve as an alternative to existing scores such as CHADS₂. In the reported experiments, Bayesian Rule Lists achieve predictive accuracy on par with leading machine-learning algorithms, and the resulting stroke model is as interpretable as CHADS₂ but more accurate.
We aim to produce predictive models that are not only accurate, but are also interpretable to human experts. Our models are decision lists, which consist of a series of if... then... statements (e.g., if high blood pressure, then stroke) that discretize a high-dimensional, multivariate feature space into a series of simple, readily interpretable decision statements. We introduce a generative model called Bayesian Rule Lists that yields a posterior distribution over possible decision lists. It employs a novel prior structure to encourage sparsity. Our experiments show that Bayesian Rule Lists has predictive accuracy on par with the current top algorithms for prediction in machine learning. Our method is motivated by recent developments in personalized medicine, and can be used to produce highly accurate and interpretable medical scoring systems. We demonstrate this by producing an alternative to the CHADS$_{2}$ score, which is actively used in clinical practice for estimating the risk of stroke in patients who have atrial fibrillation. Our model is as interpretable as CHADS$_{2}$, but more accurate.
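To make the form of a decision list concrete, the following is a minimal Python sketch of how such a list could be evaluated: rules are checked in order, and the first one whose condition holds determines the prediction. It illustrates only the model's structure, not the Bayesian procedure that learns the list. The rule conditions, feature names (`prior_stroke`, `hypertension`, `age_over_75`), and risk values are all invented for illustration and are not taken from the paper.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple

# A patient is a mapping from (hypothetical) feature names to booleans.
Patient = Dict[str, bool]


@dataclass
class Rule:
    description: str                      # human-readable "if ..." clause
    condition: Callable[[Patient], bool]  # test applied to a patient
    risk: float                           # illustrative risk, NOT from the paper


def predict(rules: List[Rule], default_risk: float, patient: Patient) -> Tuple[float, str]:
    """Return the risk and description of the first rule that fires."""
    for rule in rules:
        if rule.condition(patient):
            return rule.risk, rule.description
    # No rule matched: fall through to the final "else" clause.
    return default_risk, "else (default)"


# Hypothetical decision list; order matters because earlier rules take priority.
RULES = [
    Rule("if prior stroke",
         lambda p: p.get("prior_stroke", False), 0.50),
    Rule("else if high blood pressure and age over 75",
         lambda p: p.get("hypertension", False) and p.get("age_over_75", False), 0.30),
    Rule("else if high blood pressure",
         lambda p: p.get("hypertension", False), 0.15),
]
DEFAULT_RISK = 0.05

if __name__ == "__main__":
    patient = {"hypertension": True, "age_over_75": True}
    risk, reason = predict(RULES, DEFAULT_RISK, patient)
    print(f"{reason}: estimated risk {risk:.2f}")
```

Because each prediction comes with the single rule that produced it, a reader can trace the output back to one if-then statement, which is the sense of interpretability the abstract describes.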