Publication | Closed Access

An Oversampling Technique by Integrating Reverse Nearest Neighbor in SMOTE: Reverse-SMOTE

Year: 2020

Citations: 26

References: 10

Abstract

In recent years, classification on imbalanced datasets has drawn growing attention in machine learning. SMOTE (Synthetic Minority Oversampling Technique) is a traditional approach to this problem. Its main drawback is overfitting: it synthesizes minority samples at random, without regard to the significance of the majority class. To address this, the paper proposes a new algorithm, Reverse-Synthetic Minority Oversampling Technique (R-SMOTE), based on SMOTE and Reverse-Nearest Neighbor (R-NN). R-SMOTE extracts a significant set of data points from the minority class and synthesizes new samples from the reverse nearest neighbors of those points. The proposed algorithm is compared with four standard oversampling techniques. The empirical analysis shows that R-SMOTE produces substantially improved results over the existing oversampling methods.
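
The abstract does not spell out the steps of R-SMOTE, so the following is only a minimal sketch of the two ingredients it names: reverse nearest neighbors and SMOTE-style interpolation. Everything specific here is an assumption made for illustration, not the paper's method: the function names (`knn_indices`, `reverse_neighbors`, `rnn_oversample`), the parameter `k`, and the rule that a minority point counts as "significant" when at least one other minority point lists it among its k nearest neighbors. The sketch also ignores the majority class entirely, which the paper may not.

```python
import numpy as np

def knn_indices(X, k):
    """Indices of the k nearest neighbors of each row of X, excluding the row itself."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(d, np.inf)  # a point is never its own neighbor
    return np.argsort(d, axis=1)[:, :k]

def reverse_neighbors(X, k):
    """rnn[j] lists every point i that has j among its k nearest neighbors."""
    rnn = [[] for _ in range(len(X))]
    for i, row in enumerate(knn_indices(X, k)):
        for j in row:
            rnn[j].append(i)
    return rnn

def rnn_oversample(X_min, n_new, k=3, seed=0):
    """Synthesize n_new points between minority points and their reverse neighbors."""
    if not 1 <= k < len(X_min):
        raise ValueError("k must satisfy 1 <= k < number of minority samples")
    rng = np.random.default_rng(seed)
    rnn = reverse_neighbors(X_min, k)
    # Assumption: "significant" means having at least one reverse neighbor.
    significant = [j for j, r in enumerate(rnn) if r]
    synthetic = []
    for _ in range(n_new):
        j = rng.choice(significant)   # a significant minority point
        i = rng.choice(rnn[j])        # one of its reverse nearest neighbors
        gap = rng.random()            # SMOTE-style interpolation factor in [0, 1)
        synthetic.append(X_min[j] + gap * (X_min[i] - X_min[j]))
    return np.array(synthetic)
```

Because each synthetic point is a convex combination of two existing minority points, it always lies on the segment between them. A call such as `rnn_oversample(X_min, n_new=100, k=5)` would return an array of shape `(100, n_features)` that can be appended to the minority class before training.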
