Minority report in fraud detection

TLDR

The study introduces a fraud detection approach for skewed data distributions that builds on prior fraud detection research and Minority Report. The method trains backpropagation (BP), naive Bayes, and C4.5 classifiers on data partitions created by oversampling the minority class, uses a stacking meta-classifier to select the best base classifiers, and then combines their predictions with bagging. It is benchmarked against C4.5 trained with undersampling, oversampling, and SMOTE. Experiments on an automobile insurance fraud data set show that stacking-bagging yields slightly higher cost savings than the best bagged algorithm, C4.5, and outperforms BP without sampling or partitioning. The best-performing combination draws on classifiers from all three algorithms.

Abstract

This paper proposes an innovative fraud detection method, built upon existing fraud detection research and Minority Report, to deal with the data mining problem of skewed data distributions. This method uses backpropagation (BP), together with naive Bayesian (NB) and C4.5 algorithms, on data partitions derived from minority oversampling with replacement. Its originality lies in the use of a single meta-classifier (stacking) to choose the best base classifiers, and then combining these base classifiers' predictions (bagging) to improve cost savings (stacking-bagging). Results from a publicly available automobile insurance fraud detection data set demonstrate that stacking-bagging performs slightly better than the best performing bagged algorithm, C4.5, and its best classifier, C4.5 (2), in terms of cost savings. Stacking-bagging also outperforms the common technique used in industry (BP without either sampling or partitioning). Subsequently, this paper compares the new fraud detection method (meta-learning approach) against C4.5 trained using undersampling, oversampling, and SMOTE without partitioning (sampling approach). Results show that, given a fixed decision threshold and cost matrix, the partitioning and multiple algorithms approach achieves marginally higher cost savings than varying the entire training data set with different class distributions. The most interesting finding is that the combination of classifiers that produces the best cost savings draws contributions from all three algorithms.
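The partitioning and vote-combination steps described above can be illustrated with a minimal Python sketch. It is not the authors' implementation: the function names `make_partitions` and `bagged_vote`, the 50/50 class balance within each partition, and the simple majority vote are assumptions made for illustration, and the stacking step that selects which base classifiers take part in the vote is not shown.

```python
import random
from collections import Counter


def make_partitions(majority, minority, n_partitions, seed=0):
    """Build training partitions from a skewed data set.

    Assumption: each partition gets a disjoint slice of the majority class
    plus an equal number of minority examples drawn with replacement.
    """
    rng = random.Random(seed)
    shuffled = list(majority)
    rng.shuffle(shuffled)
    partitions = []
    for i in range(n_partitions):
        chunk = shuffled[i::n_partitions]  # disjoint slice of the majority class
        oversampled = [rng.choice(minority) for _ in chunk]  # with replacement
        partitions.append(chunk + oversampled)
    return partitions


def bagged_vote(predictions):
    """Combine the selected classifiers' labels for one example by majority vote."""
    return Counter(predictions).most_common(1)[0][0]


# Hypothetical toy data: 12 legitimate claims and 2 fraudulent ones.
legit = [("legit", i) for i in range(12)]
fraud = [("fraud", i) for i in range(2)]
parts = make_partitions(legit, fraud, n_partitions=3)
vote = bagged_vote(["fraud", "legit", "fraud"])
```

In this sketch every partition sees a different part of the majority class, while the scarce minority examples are reused across partitions. One classifier per algorithm and partition would then be trained, and the votes of the selected classifiers combined per example.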
