Publication | Closed Access
80 Million Tiny Images: A Large Data Set for Nonparametric Object and Scene Recognition
2K
Citations
48
References
2008
Year
Wordnet Lexical DatabaseScene AnalysisEngineeringMachine LearningObject CategorizationImage RetrievalImage ClassificationImage AnalysisSemantic InformationData ScienceText-to-image RetrievalPattern RecognitionLarge DatasetVision RecognitionMachine VisionVision Language ModelComputer ScienceDeep LearningNonparametric ObjectComputer VisionScene RecognitionObject RecognitionScene UnderstandingMillion Tiny Images
The Internet provides billions of freely available images that densely sample the visual world. We aim to explore this visual world using a large dataset of 79 million images and non‑parametric methods. The dataset contains 32 × 32 color images loosely labeled with 75 062 WordNet nouns, and we use WordNet semantics with nearest‑neighbor methods to classify objects across semantic levels while reducing labeling noise. The database offers comprehensive coverage of object categories and scenes, and for common classes such as people it achieves recognition performance comparable to class‑specific Viola‑Jones detectors.
With the advent of the Internet, billions of images are now freely available online and constitute a dense sampling of the visual world. Using a variety of non-parametric methods, we explore this world with the aid of a large dataset of 79,302,017 images collected from the Internet. Motivated by psychophysical results showing the remarkable tolerance of the human visual system to degradations in image resolution, the images in the dataset are stored as 32 x 32 color images. Each image is loosely labeled with one of the 75,062 non-abstract nouns in English, as listed in the Wordnet lexical database. Hence the image database gives a comprehensive coverage of all object categories and scenes. The semantic information from Wordnet can be used in conjunction with nearest-neighbor methods to perform object classification over a range of semantic levels minimizing the effects of labeling noise. For certain classes that are particularly prevalent in the dataset, such as people, we are able to demonstrate a recognition performance comparable to class-specific Viola-Jones style detectors.
| Year | Citations | |
|---|---|---|
Page 1
Page 1