Publication | Closed Access
Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation
Citations: 31.2K | References: 39 | Year: 2014 | Venue: Unknown
Keywords: Convolutional Neural Network, Engineering, Machine Learning, Feature Detection, Rich Feature Hierarchies, Scalable Detection Algorithm, Image Classification, Image Analysis, Data Science, Pattern Recognition, Vision Recognition, Machine Vision, Feature Learning, Object Detection, Computer Science, Deep Learning, Computer Vision, Object Detection Performance, Object Recognition, Region Proposals
Object detection performance on PASCAL VOC has plateaued, with the top methods relying on complex ensembles that combine multiple low-level image features with high-level context. We introduce a simple, scalable algorithm that improves mean average precision (mAP) on VOC 2012 by more than 30% relative to the previous best result, reaching 53.3%. The method, called R-CNN, applies high-capacity CNNs to bottom-up region proposals to localize and segment objects and, when labeled training data is scarce, uses supervised pre-training on an auxiliary task followed by domain-specific fine-tuning. Experiments that probe what the network learns reveal a rich hierarchy of image features. Source code is available at http://www.cs.berkeley.edu/~rbg/rcnn.
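The 30% figure is a relative improvement, not a gain in percentage points. As a small worked check, the sketch below computes the upper bound on the previous best mAP that the two stated numbers (53.3% and "more than 30%") imply. The function names are illustrative and do not come from the paper or its code release.

```python
def relative_gain(new_map: float, old_map: float) -> float:
    """Relative improvement of new_map over old_map, as a fraction."""
    return (new_map - old_map) / old_map


def implied_previous_best(new_map: float, min_relative_gain: float) -> float:
    """Upper bound on the earlier mAP if new_map beats it by more than min_relative_gain."""
    return new_map / (1.0 + min_relative_gain)


# 53.3% mAP with a relative gain above 30% implies the previous best was below this bound.
bound = implied_previous_best(53.3, 0.30)
print(f"previous best must be below {bound:.1f}% mAP")
```

In other words, the stated numbers imply a previous best of under roughly 41% mAP, so the absolute gain is on the order of 12 points or more.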
Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
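The first insight, scoring features computed on region proposals, can be sketched as a short pipeline. This is a minimal illustration under stated assumptions, not the released R-CNN implementation: the proposals are supplied by the caller, `toy_features` is a hypothetical stand-in for a CNN, the per-class linear scorer is an assumption, and all names and sizes are chosen for the example only.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Box = Tuple[int, int, int, int]  # (x0, y0, x1, y1), upper bounds exclusive
Image = List[List[float]]        # grayscale image as rows of pixel values


def warp_region(image: Image, box: Box, size: int) -> Image:
    """Crop `box` and resize it to size x size with nearest-neighbour sampling."""
    x0, y0, x1, y1 = box
    h, w = y1 - y0, x1 - x0
    return [
        [image[y0 + (r * h) // size][x0 + (c * w) // size] for c in range(size)]
        for r in range(size)
    ]


def detect(
    image: Image,
    proposals: Sequence[Box],
    extract_features: Callable[[Image], List[float]],
    class_weights: Dict[str, List[float]],
    size: int = 4,
) -> List[Tuple[Box, str, float]]:
    """Score every proposal for every class using features of the warped region."""
    detections = []
    for box in proposals:
        feats = extract_features(warp_region(image, box, size))
        for label, weights in class_weights.items():
            score = sum(w * f for w, f in zip(weights, feats))
            detections.append((box, label, score))
    return detections


def toy_features(patch: Image) -> List[float]:
    """Hypothetical stand-in for a CNN: mean intensity plus a constant bias term."""
    flat = [v for row in patch for v in row]
    return [sum(flat) / len(flat), 1.0]


# Toy 8x8 image with a bright 4x4 square in the top-left corner.
image = [[1.0 if r < 4 and c < 4 else 0.0 for c in range(8)] for r in range(8)]
proposals = [(0, 0, 4, 4), (4, 4, 8, 8)]
dets = detect(image, proposals, toy_features, {"bright": [1.0, -0.5]})
for box, label, score in dets:
    print(box, label, score)
```

Warping every proposal to a fixed size lets a single feature extractor handle regions of any shape. In a real system, the toy feature function would be replaced by a pre-trained and fine-tuned network, which is the second insight in the abstract.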
| Year | Citations |
|---|---|
| 2017 | 75.5K |
| 2009 | 60.2K |
| 1998 | 56.5K |
| 2004 | 54.6K |
| 2005 | 31.6K |
| 2009 | 19K |
| 1989 | 11.6K |
| 2009 | 10K |
| 2001 | 6.4K |
| 2013 | 6.1K |