Publication | Open Access
A New Oversampling Method Based on the Classification Contribution Degree
Citations: 78
References: 36
Year: 2021
Data Classification, Classification Method, Engineering, Machine Learning, Data Science, Data Mining, Pattern Recognition, Class Imbalance, Predictive Analytics, Knowledge Discovery, Imbalanced Learning, New Oversampling Method, Data Imbalance, Intelligent Classification, Classifier System, Statistics
Data imbalance is a persistent problem in machine learning. SMOTE is a widely used oversampling method for imbalanced learning, but it suffers from sample overlapping, noise interference, and blind neighbor selection. To address these problems, we present a new oversampling method, OS-CCD, based on a new concept, the classification contribution degree. The classification contribution degree determines how many synthetic samples SMOTE generates for each positive sample. OS-CCD follows the spatial distribution of the original samples on the class boundary and avoids oversampling from noisy points. Experiments on twelve benchmark datasets show that OS-CCD outperforms six classical oversampling methods in terms of accuracy, F1-score, AUC, and ROC.
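The abstract describes a per-sample allocation: each positive sample receives its own number of SMOTE-generated synthetic points. The sketch below illustrates only that allocation step, under stated assumptions. The abstract does not say how the classification contribution degree is computed, so `weights` is a hypothetical stand-in for it. The function names, the proportional rounding rule, and the plain k-nearest-neighbor interpolation are illustrative choices, not the authors' implementation.

```python
import numpy as np


def allocate_counts(weights, n_total):
    """Split n_total synthetic samples across positives in proportion to weights.

    `weights` is a hypothetical placeholder for a per-sample score; it is NOT
    the paper's classification contribution degree, which is not defined here.
    """
    w = np.asarray(weights, dtype=float)
    if np.any(w < 0) or w.sum() <= 0:
        raise ValueError("weights must be non-negative with a positive sum")
    raw = w / w.sum() * n_total
    counts = np.floor(raw).astype(int)
    # Give the leftover samples to the entries with the largest fractional part.
    remainder = int(n_total - counts.sum())
    order = np.argsort(-(raw - counts), kind="stable")
    counts[order[:remainder]] += 1
    return counts


def weighted_smote(X_pos, weights, n_total, k=5, seed=0):
    """SMOTE-style interpolation with a per-sample number of synthetic points."""
    X_pos = np.asarray(X_pos, dtype=float)
    n = len(X_pos)
    if n < 2:
        raise ValueError("need at least two positive samples to interpolate")
    rng = np.random.default_rng(seed)
    k = min(k, n - 1)

    # Indices of the k nearest positive neighbors of each positive sample.
    dist = np.linalg.norm(X_pos[:, None, :] - X_pos[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    neighbors = np.argsort(dist, axis=1)[:, :k]

    counts = allocate_counts(weights, n_total)
    synthetic = []
    for i, c in enumerate(counts):
        for _ in range(c):
            j = rng.choice(neighbors[i])   # random neighbor of sample i
            gap = rng.random()             # interpolation factor in [0, 1)
            synthetic.append(X_pos[i] + gap * (X_pos[j] - X_pos[i]))
    return np.array(synthetic).reshape(-1, X_pos.shape[1]), counts


X_pos = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
weights = [0.0, 1.0, 1.0, 2.0]  # hypothetical scores; 0 marks a point to skip
synthetic, counts = weighted_smote(X_pos, weights, n_total=8, k=2)
```

A sample with weight 0 (for example, one judged to be noise) receives no synthetic points, which mirrors the abstract's goal of not oversampling from noisy points. How such a judgment is actually made in OS-CCD is not stated in the abstract.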