Publication | Closed Access
Boosting for Learning Multiple Classes with Imbalanced Class Distribution
Citations: 303
References: 18
Year: 2006
Engineering, Machine Learning, Classification Method, Data Science, Data Mining, Pattern Recognition, Class Imbalance, Cost-sensitive Boosting Algorithm, Management, Multiple Classifier System, Imbalanced Data, Predictive Analytics, Knowledge Discovery, Intelligent Classification, Computer Science, Data Classification, Classifier System, Cost-sensitive Learning, Imbalanced Class Distribution, Cost-sensitive Machine Learning
Imbalanced class distributions degrade classifier performance. The problem has mainly been studied in binary settings but also arises in multi-class tasks, where reported bi-class solutions are not applicable and cost matrices are often unavailable. The authors develop a cost-sensitive boosting algorithm to improve classification of multi-class imbalanced data. They use a genetic algorithm to search for the optimum cost setup of each class. Experiments show that the algorithm significantly improves classification performance on imbalanced data sets.
Imbalanced class distribution significantly limits the performance attainable by most standard classifier learning algorithms, which assume a relatively balanced class distribution and equal misclassification costs. This learning difficulty has attracted considerable research interest. Most efforts concentrate on bi-class problems. However, the bi-class case is not the only scenario in which the class imbalance problem arises, and the solutions reported for bi-class applications are not applicable to multi-class problems. In this paper, we develop a cost-sensitive boosting algorithm to improve the classification performance on imbalanced data involving multiple classes. One barrier to applying the cost-sensitive boosting algorithm to imbalanced data is that the cost matrix is often unavailable for a problem domain. To solve this problem, we apply a Genetic Algorithm to search for the optimum cost setup of each class. Empirical tests show that the proposed cost-sensitive boosting algorithm significantly improves classification performance on imbalanced data sets.
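To make the idea of cost-sensitive boosting concrete, the following is a minimal sketch of one boosting round in which each sample's weight is scaled by a per-class cost. It is an illustration under stated assumptions, not the formulation from the paper: the update rule, the function name `cost_weighted_update`, and the hand-picked `class_cost` values are all hypothetical. In the paper the per-class cost setup is found by a Genetic Algorithm search; here it is simply fixed by hand.

```python
import math


def cost_weighted_update(weights, y_true, y_pred, class_cost):
    """One AdaBoost-style round with class-cost-scaled sample weights.

    Illustrative only: the update rule and all names are assumptions,
    not taken from the paper.
    """
    triples = list(zip(weights, y_true, y_pred))
    # Cost-scaled weight mass of correctly and incorrectly classified samples.
    correct = sum(class_cost[y] * w for w, y, p in triples if y == p)
    wrong = sum(class_cost[y] * w for w, y, p in triples if y != p)
    if wrong == 0 or correct <= wrong:
        raise ValueError("weak learner must err somewhere yet beat cost-weighted chance")
    alpha = 0.5 * math.log(correct / wrong)  # vote weight of this weak learner

    # Misclassified samples gain weight, correct ones lose it; both scaled by class cost.
    new = [
        class_cost[y] * w * math.exp(alpha if y != p else -alpha)
        for w, y, p in triples
    ]
    total = sum(new)
    return alpha, [w / total for w in new]


# Toy three-class example: class 0 is the majority, classes 1 and 2 are rare.
y_true = [0, 0, 0, 0, 1, 2]
y_pred = [0, 0, 0, 1, 1, 0]          # hypothetical weak-learner output
weights = [1 / 6] * 6                # uniform initial distribution
class_cost = {0: 1.0, 1: 2.0, 2: 3.0}  # hand-picked costs, higher for rarer classes

alpha, new_weights = cost_weighted_update(weights, y_true, y_pred, class_cost)
```

In this toy run the misclassified sample from the rarest class ends up with the largest weight, so the next weak learner would focus on it. A search procedure such as a genetic algorithm could treat the values in `class_cost` as the quantities to optimize against a validation metric.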