Concepedia

TLDR

Generic visual tasks often differ from a network's original training set and lack sufficient labeled data, making conventional adaptation of deep architectures difficult. The study evaluates whether deep convolutional activation features can be repurposed for such generic visual tasks and releases DeCAF to enable that experimentation. The authors investigate and visualize the semantic clustering of these features across scene recognition, domain adaptation, and fine‑grained recognition tasks, and provide an open‑source implementation with the associated network parameters. They compare features drawn from different network levels as fixed representations and report results that significantly outperform the state of the art on several vision challenges.

Abstract

We evaluate whether features extracted from the activation of a deep convolutional network trained in a fully supervised fashion on a large, fixed set of object recognition tasks can be re-purposed to novel generic tasks. Our generic tasks may differ significantly from the originally trained tasks and there may be insufficient labeled or unlabeled data to conventionally train or adapt a deep architecture to the new tasks. We investigate and visualize the semantic clustering of deep convolutional features with respect to a variety of such tasks, including scene recognition, domain adaptation, and fine-grained recognition challenges. We compare the efficacy of relying on various network levels to define a fixed feature, and report novel results that significantly outperform the state-of-the-art on several important vision challenges. We are releasing DeCAF, an open-source implementation of these deep convolutional activation features, along with all associated network parameters to enable vision researchers to be able to conduct experimentation with deep representations across a range of visual concept learning paradigms.
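The core recipe the abstract describes is to treat a frozen, pretrained network as a fixed feature extractor and train only a lightweight classifier on top for the new task. A minimal numpy-only sketch of that idea follows; it is not the authors' code, and the "pretrained" network here is a stand-in (a frozen random projection with a ReLU nonlinearity playing the role of deep activations), with a least-squares linear head on synthetic labels.

```python
import numpy as np

rng = np.random.default_rng(0)

def frozen_features(x, w):
    """Fixed 'deep' activations: frozen weights, ReLU nonlinearity.

    In the DeCAF setting these would come from an intermediate layer
    of a network pretrained on a large object-recognition task; here a
    frozen random projection stands in for those pretrained weights.
    """
    return np.maximum(x @ w, 0.0)

# Synthetic stand-in for a small labeled target task.
n, d, h = 200, 32, 64
w_frozen = rng.normal(size=(d, h))   # never updated during "transfer"
x = rng.normal(size=(n, d))
true_w = rng.normal(size=d)
y = np.sign(x @ true_w)              # binary labels in {-1, +1}

# Extract fixed features once, then fit only a linear head on top.
feats = frozen_features(x, w_frozen)
feats = np.hstack([feats, np.ones((n, 1))])   # bias column

w_head, *_ = np.linalg.lstsq(feats, y, rcond=None)
acc = np.mean(np.sign(feats @ w_head) == y)
print(f"training accuracy with frozen features: {acc:.2f}")
```

Only `w_head` is learned on the target task; the feature extractor stays fixed, which is what makes the approach viable when the new task has too little labeled data to train or fine-tune a deep network.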
