Exploring the Limits of Weakly Supervised Pretraining

TLDR

State-of-the-art visual perception models depend on supervised pretraining, typically on ImageNet, which is now nearly ten years old and small by modern standards. Little is known about pretraining on datasets that are orders of magnitude larger, because such datasets are hard to collect and annotate. The authors train large convolutional networks to predict hashtags on billions of social media images and run extensive experiments on how large-scale pretraining affects transfer learning performance. Pretraining on hashtag prediction improves several image classification and object detection tasks and yields the highest ImageNet-1k single-crop top-1 accuracy reported to date: 85.4% (97.6% top-5).

Abstract

State-of-the-art visual perception models for a wide range of tasks rely on supervised pretraining. ImageNet classification is the de facto pretraining task for these models. Yet, ImageNet is now nearly ten years old and is by modern standards small. Even so, relatively little is known about the behavior of pretraining with datasets that are multiple orders of magnitude larger. The reasons are obvious: such datasets are difficult to collect and annotate. In this paper, we present a unique study of transfer learning with large convolutional networks trained to predict hashtags on billions of social media images. Our experiments demonstrate that training for large-scale hashtag prediction leads to excellent results. We show improvements on several image classification and object detection tasks, and report the highest ImageNet-1k single-crop, top-1 accuracy to date: 85.4% (97.6% top-5). We also perform extensive experiments that provide novel empirical data on the relationship between large-scale pretraining and transfer learning performance.
