Scalable Object Detection Using Deep Neural Networks

Abstract

Deep convolutional neural networks have recently achieved state-of-the-art performance on a number of image recognition benchmarks, including the ImageNet Large-Scale Visual Recognition Challenge (ILSVRC-2012). The winning model on the localization sub-task was a network that predicts a single bounding box and a confidence score for each object category in the image. Such a model captures the whole-image context around the objects but cannot handle multiple instances of the same object in the image without naively replicating the number of outputs for each instance. In this work, we propose a saliency-inspired neural network model for detection, which predicts a set of class-agnostic bounding boxes along with a single score for each box, corresponding to its likelihood of containing any object of interest. The model naturally handles a variable number of instances for each class and allows for cross-class generalization at the highest levels of the network. We are able to obtain competitive recognition performance on VOC2007 and ILSVRC-2012, while using only the top few predicted locations in each image and a small number of neural network evaluations.
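
As a rough illustration of the output format described above, the following minimal sketch shows how a set of class-agnostic boxes, each carrying a single confidence score, could be reduced to the top few locations per image. It is not the authors' implementation: the function name `top_k_boxes`, the `(x_min, y_min, x_max, y_max)` box layout, and the example values are all assumptions made for illustration.

```python
from typing import List, Tuple

# Hypothetical box layout (not specified in the abstract):
# normalized (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


def top_k_boxes(
    boxes: List[Box], scores: List[float], k: int
) -> List[Tuple[Box, float]]:
    """Return the k highest-scoring (box, score) pairs, best first."""
    if len(boxes) != len(scores):
        raise ValueError("boxes and scores must have the same length")
    # Each box has one class-agnostic score, so ranking needs no class labels.
    ranked = sorted(zip(boxes, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]


# Made-up predictions for a single image (illustrative values only).
boxes = [
    (0.10, 0.20, 0.45, 0.80),
    (0.50, 0.15, 0.90, 0.70),
    (0.05, 0.05, 0.30, 0.25),
    (0.60, 0.60, 0.95, 0.95),
]
scores = [0.92, 0.88, 0.10, 0.35]

kept = top_k_boxes(boxes, scores, k=2)
for box, score in kept:
    print(box, score)
```

The choice of `k=2` is arbitrary; the abstract states only that the top few predicted locations in each image are used.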
