Classification with Asymmetric Label Noise: Consistency and Maximal Denoising

Abstract

In many real-world classification problems, the labels of training examples are randomly corrupted. Most previous theoretical work on classification with label noise assumes that the two classes are separable, that the label noise is independent of the true class label, or that the noise proportions for each class are known. In this work, we give conditions that are necessary and sufficient for the true class-conditional distributions to be identifiable. These conditions are weaker than those analyzed previously, and allow for the classes to be nonseparable and the noise levels to be asymmetric and unknown. The conditions essentially state that a majority of the observed labels are correct and that the true class-conditional distributions are "mutually irreducible," a concept we introduce that limits the similarity of the two distributions. For any label noise problem, there is a unique pair of true class-conditional distributions satisfying the proposed conditions, and we argue that this pair corresponds in a certain sense to maximal denoising of the observed distributions.

Our results are facilitated by a connection to "mixture proportion estimation," which is the problem of estimating the maximal proportion of one distribution that is present in another. We establish a novel rate of convergence result for mixture proportion estimation, and apply this to obtain consistency of a discrimination rule based on surrogate loss minimization. Experimental results on benchmark data and a nuclear particle classification problem demonstrate the efficacy of our approach.
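To make "the maximal proportion of one distribution that is present in another" concrete, the following is a minimal sketch for the simplest possible setting: two fully known probability mass functions on the same finite set. It is an illustration only, not the estimator analyzed in the paper, which works from samples. The function name `max_mixture_proportion` and the example vectors are hypothetical. For known finite distributions `f` and `h`, writing `f = alpha*h + (1 - alpha)*g` with `g` a valid distribution requires `f[i] >= alpha*h[i]` at every point, so the largest feasible `alpha` is the smallest ratio `f[i]/h[i]` over points where `h[i] > 0`.

```python
from typing import Sequence


def max_mixture_proportion(f: Sequence[float], h: Sequence[float]) -> float:
    """Largest alpha in [0, 1] with f = alpha*h + (1 - alpha)*g for some pmf g.

    Assumption (not from the paper): f and h are known pmfs on the same
    finite set, given as equal-length sequences of probabilities.
    """
    if len(f) != len(h):
        raise ValueError("f and h must be defined on the same finite set")
    # g = (f - alpha*h) / (1 - alpha) is nonnegative iff f[i] >= alpha*h[i]
    # for all i, so only points with h[i] > 0 constrain alpha.
    ratios = [fi / hi for fi, hi in zip(f, h) if hi > 0]
    if not ratios:
        raise ValueError("h must put positive mass on at least one point")
    # Clip at 1.0 to guard against floating-point overshoot when f == h.
    return min(1.0, min(ratios))


# Hypothetical example: f is built as 0.3*h + 0.7*g, where g puts no mass
# on the first point, so no further portion of h can be extracted from g.
h = [0.5, 0.5, 0.0]
g = [0.0, 0.2, 0.8]
f = [0.3 * hi + 0.7 * gi for hi, gi in zip(h, g)]
alpha = max_mixture_proportion(f, h)
```

In this example the recovered proportion matches the 0.3 used to build `f` only because `g` vanishes where `h` has mass; if `g` itself contained a portion of `h`, the maximal proportion would be larger than the mixing weight, which is the kind of ambiguity the irreducibility condition in the abstract rules out.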
