Concepedia

Publication | Closed Access

Learning with Noisy Labels

Citations: 563
References: 25
Year: 2013

TLDR

The paper theoretically studies binary classification when labels are randomly flipped with small probability, and the noise is class-conditional: the flip probability depends on the class. Two methods are proposed: an unbiased estimator of any given loss, with bounds for empirical risk minimization on noisy data, and a weighted surrogate loss, derived from a reduction to classification with weighted 0-1 loss, that yields strong risk guarantees. If the loss satisfies a simple symmetry condition, the unbiased estimator admits an efficient empirical-minimization algorithm, and the weighted-loss analysis shows that methods used in practice, such as biased SVM and weighted logistic regression, are provably noise-tolerant. The methods achieve over 88% accuracy on a synthetic non-separable dataset with 40% label corruption and are competitive with recently proposed noise-robust methods on benchmark datasets.
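The first method summarized above, the unbiased loss estimator, can be sketched briefly. This is an illustrative implementation, not the authors' code; the function name and the noise-rate variables are chosen here. For an observed label y in {+1, -1} with class-conditional flip rates rho_pos and rho_neg satisfying rho_pos + rho_neg < 1, the corrected loss is ((1 - rho_{-y}) * loss(t, y) - rho_y * loss(t, -y)) / (1 - rho_pos - rho_neg), whose expectation over the noisy label equals the clean loss.

```python
import numpy as np

def unbiased_loss(loss, rho_pos, rho_neg):
    """Wrap a loss ell(t, y) into its unbiased estimator under
    class-conditional label noise.

    rho_pos = P(observed label = -1 | true label = +1)
    rho_neg = P(observed label = +1 | true label = -1)
    Requires rho_pos + rho_neg < 1.
    """
    denom = 1.0 - rho_pos - rho_neg
    if denom <= 0:
        raise ValueError("noise rates must satisfy rho_pos + rho_neg < 1")

    def ell_tilde(t, y):
        # rho_y: flip rate of the observed label's class;
        # rho_my: flip rate of the opposite class.
        rho_y = np.where(y == 1, rho_pos, rho_neg)
        rho_my = np.where(y == 1, rho_neg, rho_pos)
        return ((1.0 - rho_my) * loss(t, y) - rho_y * loss(t, -y)) / denom

    return ell_tilde
```

Averaging `ell_tilde` over the noisy training labels then gives an unbiased estimate of the clean empirical risk, which is what the paper's ERM performance bounds address.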

Abstract

In this paper, we theoretically study the problem of binary classification in the presence of random classification noise—the learner, instead of seeing the true labels, sees labels that have independently been flipped with some small probability. Moreover, random label noise is class-conditional—the flip probability depends on the class. We provide two approaches to suitably modify any given surrogate loss function. First, we provide a simple unbiased estimator of any loss, and obtain performance bounds for empirical risk minimization in the presence of iid data with noisy labels. If the loss function satisfies a simple symmetry condition, we show that the method leads to an efficient algorithm for empirical minimization. Second, by leveraging a reduction of risk minimization under noisy labels to classification with weighted 0-1 loss, we suggest the use of a simple weighted surrogate loss, for which we are able to obtain strong empirical risk bounds. This approach has a very remarkable consequence—methods used in practice such as biased SVM and weighted logistic regression are provably noise-tolerant. On a synthetic non-separable dataset, our methods achieve over 88% accuracy even when 40% of the labels are corrupted, and are competitive with respect to recently proposed methods for dealing with label noise in several benchmark datasets.
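The second approach, the weighted surrogate loss, can be illustrated with a label-dependent weight on the standard logistic loss. This sketch is an assumption-laden illustration, not the authors' code: the function names are chosen here, and the weight formula is derived below from thresholding the noisy class-probability eta_tilde = (1 - rho_pos) * eta + rho_neg * (1 - eta) at the clean Bayes threshold eta = 1/2.

```python
import numpy as np

def weighted_logistic_loss(scores, noisy_y, alpha):
    """alpha-weighted logistic surrogate on noisy labels:
    weight (1 - alpha) on noisy positives, alpha on noisy negatives."""
    w = np.where(noisy_y == 1, 1.0 - alpha, alpha)
    return float(np.mean(w * np.log1p(np.exp(-noisy_y * scores))))

def alpha_star(rho_pos, rho_neg):
    """Noise-corrected weight: the value the noisy class-probability
    eta_tilde = (1 - rho_pos) * eta + rho_neg * (1 - eta)
    takes at the clean Bayes threshold eta = 1/2."""
    return (1.0 - rho_pos + rho_neg) / 2.0
```

With no noise, `alpha_star` is 1/2 and the loss reduces to half the usual logistic loss, so training an off-the-shelf weighted logistic regression with these class weights is exactly the kind of practical method the abstract argues is provably noise-tolerant.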
