Publication | Closed Access
MetaCost
Citations: 1.3K
References: 20
Year: 1999
Venue: Unknown
Artificial Intelligence, Engineering, Machine Learning, Arbitrary Classifier, Classification Method, Data Science, Data Mining, Pattern Recognition, Class Imbalance, Management, Statistics, Supervised Learning, Predictive Analytics, Knowledge Discovery, Computer Science, Metacost Scales, Classification, Classifier System, Cost-sensitive Learning, Cost-sensitive Machine Learning
Many classification algorithms assume that all errors cost the same, yet real KDD problems often involve unequal error costs, which makes cost-sensitive learning essential. This paper introduces MetaCost, a principled wrapper that makes an arbitrary classifier cost-sensitive by wrapping a cost-minimizing procedure around it. MetaCost treats the underlying classifier as a black box, requiring no internal changes, and, unlike stratification, applies to any number of classes and to arbitrary cost matrices. On a large suite of benchmark databases, MetaCost almost always achieves large cost reductions relative to the cost-blind classifier (C4.5RULES) and to two forms of stratification. Further tests identify its key components, and experiments on a larger database indicate that it scales well.
Research in machine learning, statistics and related fields has produced a wide variety of algorithms for classification. However, most of these algorithms assume that all errors have the same cost, which is seldom the case in KDD problems. Individually making each classification learner cost-sensitive is laborious, and often non-trivial. In this paper we propose a principled method for making an arbitrary classifier cost-sensitive by wrapping a cost-minimizing procedure around it. This procedure, called MetaCost, treats the underlying classifier as a black box, requiring no knowledge of its functioning or change to it. Unlike stratification, MetaCost is applicable to any number of classes and to arbitrary cost matrices. Empirical trials on a large suite of benchmark databases show that MetaCost almost always produces large cost reductions compared to the cost-blind classifier used (C4.5RULES) and to two forms of stratification. Further tests identify the key components of MetaCost and those that can be varied without substantial loss. Experiments on a larger database indicate that MetaCost scales well.
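The abstract does not spell out the wrapping procedure, so the following is only a minimal sketch of how such a wrapper is commonly described: train the black-box learner on bootstrap resamples, use the vote fractions as rough class-probability estimates, relabel each training example with the class of minimum expected cost, and retrain on the relabeled data. The names `min_cost_class`, `metacost`, and `fit_1nn`, the convention that `cost[i][j]` is the cost of predicting class `i` when the true class is `j`, the number of resamples, and the toy 1-nearest-neighbour learner are all assumptions made for illustration; none of them is taken from the text above.

```python
import random
from collections import Counter


def min_cost_class(probs, cost):
    """Return the class i that minimizes sum_j probs[j] * cost[i][j].

    Assumed convention: cost[i][j] is the cost of predicting class i
    when the true class is j.
    """
    n = len(probs)
    expected = [sum(probs[j] * cost[i][j] for j in range(n)) for i in range(n)]
    return min(range(n), key=expected.__getitem__)


def metacost(X, y, cost, fit, n_resamples=10, seed=0):
    """Sketch of a cost-sensitive wrapper around a black-box learner.

    fit(X, y) must return a function mapping an example to a class index.
    The learner's internals are never inspected.
    """
    rng = random.Random(seed)
    n_classes = len(cost)

    # 1. Train one model per bootstrap resample of the training set.
    models = []
    for _ in range(n_resamples):
        idx = [rng.randrange(len(X)) for _ in X]
        models.append(fit([X[i] for i in idx], [y[i] for i in idx]))

    # 2. Estimate class probabilities from vote fractions, then relabel
    #    each training example with its minimum-expected-cost class.
    relabeled = []
    for x in X:
        votes = Counter(m(x) for m in models)
        probs = [votes[c] / n_resamples for c in range(n_classes)]
        relabeled.append(min_cost_class(probs, cost))

    # 3. Retrain the same black-box learner on the relabeled data.
    return fit(X, relabeled)


def fit_1nn(X, y):
    """Hypothetical stand-in base learner: 1-nearest-neighbour on scalars."""
    pairs = list(zip(X, y))
    return lambda x: min(pairs, key=lambda p: abs(p[0] - x))[1]


if __name__ == "__main__":
    # Missing class 1 costs 10; a false alarm costs 1.
    cost = [[0, 10], [1, 0]]
    # Predicting 0 has expected cost 0.3 * 10 = 3.0; predicting 1 has 0.7.
    print(min_cost_class([0.7, 0.3], cost))
    X = list(range(10)) + list(range(20, 30))
    y = [0] * 10 + [1] * 10
    model = metacost(X, y, cost, fit_1nn)
    print(model(5), model(25))
```

The decision rule in `min_cost_class` is where the cost matrix enters: with uniform costs it reduces to picking the most probable class, while a skewed matrix can make the less probable class the cheaper prediction. Because the learner is only ever called through `fit`, any classifier that exposes the same interface could be substituted for the toy one.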