Publication | Closed Access
Learning from massive noisy labeled data for image classification
Citations: 939
References: 25
Year: 2015
Unknown Venue
Geometric Learning, Convolutional Neural Network, Engineering, Machine Learning, Image Classification, Image Analysis, Data Science, Pattern Recognition, Clean Labels, Semi-supervised Learning, Supervised Learning, Label Noises, Machine Vision, Feature Learning, Computer Science, Deep Learning, Computer Vision, Massive Noisy, Convolutional Neural Networks
Large‑scale supervised datasets are essential for training convolutional neural networks, yet acquiring well‑labeled data is typically expensive and time‑consuming. This work proposes a general framework that trains CNNs using only a small set of clean labels alongside millions of readily available noisy labels. The framework models the relationships among images, class labels, and label noise with a probabilistic graphical model and incorporates it into an end‑to‑end deep learning system. Experiments on a real‑world clothing classification dataset demonstrate that the method more effectively corrects noisy labels and improves the performance of the trained CNNs.
Large-scale supervised datasets are crucial for training convolutional neural networks (CNNs) on a variety of computer vision problems. However, obtaining a massive amount of well-labeled data is usually very expensive and time-consuming. In this paper, we introduce a general framework for training CNNs with only a limited number of clean labels and millions of easily obtained noisy labels. We model the relationships between images, class labels, and label noise with a probabilistic graphical model and integrate it into an end-to-end deep learning system. To demonstrate the effectiveness of our approach, we collect a large-scale real-world clothing classification dataset with both noisy and clean labels. Experiments on this dataset indicate that our approach corrects the noisy labels more accurately and improves the performance of the trained CNNs.
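The abstract does not spell out the structure of the graphical model, so the following is only a minimal sketch of the general idea of correcting a noisy label with a classifier's prediction. It assumes a simple class-conditional noise model (a confusion matrix), which is a stand-in chosen for illustration and not necessarily the model used in the paper. The function name `correct_label`, the three class names, and all numeric values are hypothetical.

```python
def correct_label(class_probs, noisy_label, confusion):
    """Posterior over the true label given a prediction and a noisy label.

    class_probs[y]  : p(y | image), e.g. a CNN softmax output
    confusion[y][n] : p(noisy = n | true = y), an ASSUMED noise model
    """
    # Bayes rule: p(y | image, noisy) is proportional to
    # p(noisy | y) * p(y | image).
    unnormalized = [
        confusion[y][noisy_label] * class_probs[y]
        for y in range(len(class_probs))
    ]
    total = sum(unnormalized)
    if total == 0.0:
        raise ValueError("noisy label has zero probability under the model")
    return [u / total for u in unnormalized]


# Toy example with three hypothetical clothing classes.
CLASSES = ["jacket", "sweater", "dress"]

# Hypothetical noise model: each row is a true class and sums to 1.
confusion = [
    [0.70, 0.20, 0.10],  # true jacket
    [0.30, 0.60, 0.10],  # true sweater
    [0.05, 0.05, 0.90],  # true dress
]

# Hypothetical classifier output for one image labeled "jacket" (index 0).
class_probs = [0.2, 0.7, 0.1]

posterior = correct_label(class_probs, noisy_label=0, confusion=confusion)
best = CLASSES[max(range(len(posterior)), key=posterior.__getitem__)]
print(best, [round(p, 3) for p in posterior])
```

In this toy case the classifier's confidence in "sweater" outweighs the noisy "jacket" label, so the corrected label is "sweater". In an end-to-end system of the kind the abstract describes, such a posterior could serve as a soft training target, with the noise model itself estimated from the small clean-label set; that coupling is also an assumption here.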