Publication | Closed Access
Confusion-Matrix-Based Kernel Logistic Regression for Imbalanced Data Classification
Citations: 177
References: 40
Year: 2017
Artificial Intelligence, Engineering, Machine Learning, Imbalanced Data Classification, Diagnosis, Classification Method, Data Science, Data Mining, Pattern Recognition, Class Imbalance, Management, Confusion Matrix, Imbalanced Data, Predictive Analytics, Knowledge Discovery, Benchmark Imbalanced Datasets, Computer Science, Data Classification, Classification, Classifier System, Cost-sensitive Learning, Cost-sensitive Machine Learning
Imbalanced data classification is critical in many anomaly, failure, and risk detection applications, yet existing methods (sampling, cost-sensitive, and ensemble) often rely on heuristic, task-dependent processes. This study introduces confusion-matrix-based kernel logistic regression (CM-KLOGR), which aims for better classification performance through a formulation free of heuristics, and evaluates it on benchmark imbalanced datasets. CM-KLOGR optimizes a harmonic-mean objective of confusion-matrix criteria, such as sensitivity and positive predictive value, within the kernel logistic regression (KLOGR) framework using minimum classification error and generalized probabilistic descent (MCE/GPD) learning. The harmonic-mean objective, the KLOGR formulation, and MCE/GPD learning together enable CM-KLOGR to improve multifaceted performance in a well-balanced manner.
There have been many attempts to classify imbalanced data, because this classification is critical in a wide variety of applications related to the detection of anomalies, failures, and risks. Many conventional methods, which can be categorized as sampling, cost-sensitive, or ensemble approaches, include heuristic and task-dependent processes. To achieve better classification performance through a formulation free of heuristics and task dependence, we propose confusion-matrix-based kernel logistic regression (CM-KLOGR). Its objective function is the harmonic mean of several evaluation criteria derived from a confusion matrix, such as sensitivity, positive predictive value, and others defined for the negative class. This objective function and its optimization are formulated consistently within the KLOGR framework, based on minimum classification error and generalized probabilistic descent (MCE/GPD) learning. Owing to the merits of the harmonic mean, KLOGR, and MCE/GPD, CM-KLOGR improves multifaceted performance in a well-balanced way. This paper presents the formulation of CM-KLOGR and demonstrates its effectiveness through comparative experiments on benchmark imbalanced datasets.
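To make the harmonic-mean idea concrete, the following minimal sketch scores two hypothetical classifiers from their hard confusion-matrix counts. It is an illustration only, not the paper's objective: the abstract names sensitivity and positive predictive value, and this sketch assumes the criteria for negatives are specificity and negative predictive value. The function names, the example counts, and the convention of treating an undefined ratio (0/0) as 0 are likewise assumptions. CM-KLOGR itself optimizes its objective with MCE/GPD learning, which this sketch does not attempt to reproduce.

```python
def confusion_criteria(tp, fn, fp, tn):
    """Four ratio criteria from binary confusion-matrix counts.

    Assumption: the 'criteria for negatives' are taken to be
    specificity and negative predictive value (NPV).
    """
    def ratio(num, den):
        # Assumption: an undefined ratio (0/0) is scored as 0.
        return num / den if den else 0.0

    return {
        "sensitivity": ratio(tp, tp + fn),  # recall on positives
        "specificity": ratio(tn, tn + fp),  # recall on negatives
        "ppv": ratio(tp, tp + fp),          # precision on positives
        "npv": ratio(tn, tn + fn),          # precision on negatives
    }


def harmonic_mean(values):
    """Harmonic mean; defined here as 0 if any value is 0."""
    values = list(values)
    if any(v == 0 for v in values):
        return 0.0
    return len(values) / sum(1.0 / v for v in values)


def harmonic_score(tp, fn, fp, tn):
    """Harmonic mean of the four criteria above."""
    return harmonic_mean(confusion_criteria(tp, fn, fp, tn).values())


def accuracy(tp, fn, fp, tn):
    return (tp + tn) / (tp + fn + fp + tn)


# Hypothetical dataset: 1000 samples, only 10 positives.
always_negative = dict(tp=0, fn=10, fp=0, tn=990)  # never predicts positive
detector = dict(tp=8, fn=2, fp=40, tn=950)         # finds most positives

for name, cm in [("always_negative", always_negative), ("detector", detector)]:
    print(name, round(accuracy(**cm), 3), round(harmonic_score(**cm), 3))
```

In this made-up example the trivial classifier has the higher accuracy yet a harmonic score of zero, because a single collapsed criterion drives the harmonic mean down. That sensitivity to the weakest criterion is the usual motivation for preferring a harmonic mean when balanced performance matters.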