Publication | Closed Access
Learning from Imbalanced Data Sets: A Comparison of Various Strategies *
Year: 2000
Citations: 453
References: 6
Unknown Venue
Although most concept-learning systems designed to date assume that their training sets are well balanced, this assumption is not necessarily correct. Indeed, there exist many domains in which one class is represented by a large number of examples while the other is represented by only a few. The purpose of this paper is 1) to demonstrate experimentally that, at least in the case of connectionist systems, class imbalances hinder the performance of standard classifiers and 2) to compare the performance of several approaches previously proposed to deal with the problem.

1. Introduction

As the field of machine learning makes a rapid transition from the status of "academic discipline" to that of "applied science", a myriad of new issues, not previously considered by the machine learning community, is now coming to light. One such issue is the class imbalance problem. The class imbalance problem corresponds to domains for which one class is represented...