Publication | Open Access
Performance evaluation of machine learning for breast cancer diagnosis: A case study
Citations: 19 | References: 22 | Year: 2022
Ensemble Machine Learning, Engineering, Machine Learning, Intelligent Diagnostics, Machine Learning Tool, Diagnosis, Feature Selection, Data Science, Data Mining, Pattern Recognition, Breast Imaging, Decision Tree Learning, Biostatistics, Multiple Classifier System, Breast Cancer Diagnosis, Radiology, Predictive Analytics, Knowledge Discovery, Medical Image Computing, Data Classification, Breast Cancer, Classifier System, Medicine, Ensemble Algorithm
Breast cancer (BC) is one of the most common and aggressive malignancies in women worldwide. Machine learning (ML) has been shown to support rapid and cost-effective BC diagnosis. This study aimed to develop and test predictive models for BC based on women's lifestyle factors using several basic and ensemble ML classifiers. Data on 1503 suspected BC cases were retrospectively extracted from a hospital-based electronic database. First, important risk factors were identified using wrapper-J48, wrapper-SVM, wrapper-NB, logistic regression (LR), and correlation-based feature selection (CFS) methods. Then the performance of basic ML algorithms, including Naïve Bayes (NB), Bayesian network (BNeT), random forest (RF), multilayer perceptron (MLP), support vector machine (SVM), C4.5, eXtreme Gradient Boosting (XGBoost), and decision tree, and of two ensemble algorithms, Confidence weighted voting and Voting, was compared for predicting BC before and after feature selection (FS). SPSS 20 and Weka 3.8.4 were used to analyze the data, and the ML models were also implemented in R 3.5.0. The RF algorithm performed best before and after FS, with AUCs of 0.799 and 0.798, respectively. Combining the best models using the Confidence weighted voting method improved classifier performance further and achieved the best result, an AUC of 0.80. The results showed that ensemble ML algorithms outperformed basic methods. The developed models can accurately classify individuals who are at high risk for BC and can be employed as a screening tool for early BC detection.
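The abstract does not spell out how Confidence weighted voting combines the base classifiers or how AUC was computed, so the following is only a minimal sketch under one common reading: each base classifier votes for its predicted class with a weight equal to its confidence in that class, plain Voting counts each classifier equally, and AUC is the probability that a randomly chosen positive case is scored above a randomly chosen negative one. The function names and the toy numbers are hypothetical and are not taken from the study or from Weka.

```python
def majority_vote(probas, threshold=0.5):
    """Plain voting: fraction of base classifiers that predict the positive class.

    probas: P(positive) from each base classifier for a single case.
    """
    votes = [1 if p >= threshold else 0 for p in probas]
    return sum(votes) / len(votes)


def confidence_weighted_vote(probas, threshold=0.5):
    """Confidence-weighted voting (assumed definition, see lead-in).

    Each classifier votes for its predicted class, weighted by its
    confidence in that class; the result is the positive share of the
    total vote weight.
    """
    pos_weight = sum(p for p in probas if p >= threshold)
    neg_weight = sum(1.0 - p for p in probas if p < threshold)
    return pos_weight / (pos_weight + neg_weight)


def auc(labels, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly.

    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Toy data (illustrative only): rows are cases, columns are three
# hypothetical base classifiers' P(positive).
case_probas = [
    [0.90, 0.60, 0.20],
    [0.30, 0.40, 0.55],
    [0.80, 0.70, 0.65],
    [0.10, 0.45, 0.20],
]
labels = [1, 0, 1, 0]

ensemble_scores = [confidence_weighted_vote(row) for row in case_probas]
ensemble_auc = auc(labels, ensemble_scores)
```

Comparing `auc(labels, ensemble_scores)` with the AUC of each single classifier's column is the kind of basic-versus-ensemble comparison the abstract reports, here on made-up numbers.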