Publication | Closed Access

Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation

Citations: 31.2K
References: 39
Year: 2014

TLDR

Object detection performance on PASCAL VOC has plateaued, with top methods relying on complex ensembles that combine low-level image features with high-level context. We introduce a simple, scalable detection algorithm that improves mean average precision (mAP) on VOC 2012 by more than 30% relative to the previous best result, reaching 53.3% mAP. The method, called R-CNN (Regions with CNN features), applies high-capacity CNNs to bottom-up region proposals and, when labeled training data is scarce, uses supervised pre-training on an auxiliary task followed by domain-specific fine-tuning. Further experiments give insight into what the network learns, revealing a rich hierarchy of image features. Source code is available at http://www.cs.berkeley.edu/~rbg/rcnn.
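The improvement is relative, not an absolute gain of 30 percentage points, and a short calculation shows what that implies. The sketch below uses only the two figures quoted above (53.3% mAP and a gain of more than 30%). The function name `implied_previous_best` is hypothetical, and the actual previous best score is not given here.

```python
def implied_previous_best(new_map: float, relative_gain: float) -> float:
    """Previous mAP for which `new_map` is exactly a `relative_gain` improvement.

    If new = old * (1 + gain), then old = new / (1 + gain). A gain of
    *more than* 30% therefore means the previous best lies below this value.
    """
    return new_map / (1.0 + relative_gain)


# Figures quoted in the summary: 53.3% mAP, more than 30% relative gain.
bound = implied_previous_best(53.3, 0.30)
print(f"previous best mAP was below about {bound:.1f}%")
```

Under these figures the previous best result on VOC 2012 must have been below roughly 41% mAP.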

Abstract

Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012 -- achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also present experiments that provide insight into what the network learns, revealing a rich hierarchy of image features. Source code for the complete system is available at http://www.cs.berkeley.edu/~rbg/rcnn.
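The first insight, applying a CNN to bottom-up region proposals, can be illustrated with a minimal sketch. This is not the released R-CNN code. The box list stands in for a region proposal method, `extract_features` is a single random linear map with ReLU standing in for a CNN, and `class_weights` stands in for trained per-class scorers. All function names, sizes, and values are assumptions chosen for illustration.

```python
import numpy as np


def crop_and_warp(image, box, size=8):
    """Crop box (x0, y0, x1, y1) and warp it to size x size by nearest-neighbour sampling."""
    x0, y0, x1, y1 = box
    crop = image[y0:y1, x0:x1]
    rows = np.linspace(0, crop.shape[0] - 1, size).round().astype(int)
    cols = np.linspace(0, crop.shape[1] - 1, size).round().astype(int)
    return crop[np.ix_(rows, cols)]


def extract_features(patch, feat_weights):
    """Stand-in for the CNN: one linear map followed by ReLU (an assumption, not a real network)."""
    return np.maximum(feat_weights @ patch.ravel(), 0.0)


def score_regions(image, boxes, feat_weights, class_weights):
    """Compute a feature vector per proposal, then one linear score per class."""
    feats = np.stack(
        [extract_features(crop_and_warp(image, b), feat_weights) for b in boxes]
    )
    return feats @ class_weights.T  # shape: (num_boxes, num_classes)


rng = np.random.default_rng(0)
image = rng.random((32, 48, 3))                              # toy H x W x 3 image
boxes = [(0, 0, 16, 16), (8, 4, 40, 28), (20, 10, 48, 32)]   # hypothetical proposals
feat_weights = rng.standard_normal((16, 8 * 8 * 3))          # stand-in "CNN" parameters
class_weights = rng.standard_normal((20, 16))                # PASCAL VOC has 20 object classes

scores = score_regions(image, boxes, feat_weights, class_weights)
print(scores.shape)
```

Each proposal is warped to a fixed size so that a single network with a fixed input shape can process regions of any aspect ratio. The second insight, pre-training followed by fine-tuning, concerns how the feature extractor's parameters are obtained and is not modelled here.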
