
Publication | Open Access

BagBoosting for tumor classification with gene expression data

Citations: 539 | References: 40 | Year: 2004

TLDR

Microarray experiments promise precise, early cancer diagnosis, but they require class‑prediction tools that handle many highly correlated variables, perform feature selection, and provide class‑probability estimates. The authors propose BagBoosting, an algorithm that uses bagging as a module within boosting. On real and simulated gene‑expression data, BagBoosting consistently improves on the predictive performance and probability estimates of both bagging and boosting alone, at the cost of greater computational effort. A comparison against several established microarray classifiers confirms its predictive potential. Software for the modified boosting algorithms, the benchmark studies, and the simulation of microarray data is available as an R package under the GNU public license at http://stat.ethz.ch/~dettling/bagboost.html.

Abstract

Motivation: Microarray experiments are expected to contribute significantly to progress in cancer treatment by enabling precise and early diagnosis. They create a need for class prediction tools that can deal with a large number of highly correlated input variables, perform feature selection, and provide class probability estimates that quantify the predictive uncertainty. A very promising solution is to combine the two ensemble schemes bagging and boosting into a novel algorithm called BagBoosting.

Results: When bagging is used as a module in boosting, the resulting classifier consistently improves on the predictive performance and the probability estimates of both bagging and boosting on real and simulated gene expression data. This quasi-guaranteed improvement can be obtained simply by making a bigger computing effort. The advantageous predictive potential is also confirmed by comparing BagBoosting to several established class prediction tools for microarray data.

Availability: Software for the modified boosting algorithms, for benchmark studies, and for the simulation of microarray data is available as an R package under the GNU public license at http://stat.ethz.ch/~dettling/bagboost.html
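
The idea of using bagging as a module in boosting can be illustrated with a minimal sketch. This is not the authors' R package. It is a Python illustration under stated assumptions: the boosting scheme is a LogitBoost-style update for two classes, the base learner is a regression stump, and the function names (`fit_stump`, `bagged_predict`, `bagboost_fit`, `bagboost_predict_proba`) and the defaults `n_boost` and `n_bag` are hypothetical. In each boosting iteration the single base learner is replaced by an average of base learners fitted on bootstrap samples.

```python
import numpy as np


def fit_stump(X, z, w):
    """Fit a weighted least-squares regression stump (one split on one feature)."""
    best = None
    for j in range(X.shape[1]):
        # Candidate thresholds: every observed value except the largest,
        # so both sides of the split are non-empty.
        for t in np.unique(X[:, j])[:-1]:
            left = X[:, j] <= t
            cl = np.average(z[left], weights=w[left])
            cr = np.average(z[~left], weights=w[~left])
            err = np.dot(w, (z - np.where(left, cl, cr)) ** 2)
            if best is None or err < best[0]:
                best = (err, j, t, cl, cr)
    if best is None:  # every feature is constant: fall back to a constant fit
        c = np.average(z, weights=w)
        return (0, np.inf, c, c)
    return best[1:]


def predict_stump(stump, X):
    j, t, cl, cr = stump
    return np.where(X[:, j] <= t, cl, cr)


def bagged_predict(stumps, X):
    """Average the predictions of all stumps in one bagged base learner."""
    return np.mean([predict_stump(s, X) for s in stumps], axis=0)


def bagboost_fit(X, y, n_boost=10, n_bag=10, seed=0):
    """Assumed setup: y holds 0/1 labels; returns a list of bagged stump sets."""
    rng = np.random.default_rng(seed)
    n = len(y)
    F = np.zeros(n)  # additive score on the half-log-odds scale
    model = []
    for _ in range(n_boost):
        p = 1.0 / (1.0 + np.exp(-2.0 * F))
        w = np.clip(p * (1.0 - p), 1e-8, None)   # LogitBoost-style weights
        z = np.clip((y - p) / w, -4.0, 4.0)      # clipped working response
        stumps = []
        for _ in range(n_bag):
            # Bagging step: refit the base learner on a bootstrap sample.
            idx = rng.integers(0, n, size=n)
            stumps.append(fit_stump(X[idx], z[idx], w[idx]))
        model.append(stumps)
        # Boosting step: add the bagged learner to the score.
        F += 0.5 * bagged_predict(stumps, X)
    return model


def bagboost_predict_proba(model, X):
    """Estimated probability of class 1 for each row of X."""
    F = sum(0.5 * bagged_predict(stumps, X) for stumps in model)
    return 1.0 / (1.0 + np.exp(-2.0 * F))
```

Setting `n_bag=1` reduces the sketch to boosting on a single bootstrap sample per iteration, while larger values spend more computation per iteration, which mirrors the abstract's remark that the improvement is bought with a bigger computing effort.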
