Rich feature hierarchies for accurate object detection and semantic segmentation

Girshick, Ross; Donahue, Jeff; Darrell, Trevor; Malik, Jitendra

arXiv:1311.2524 (cs)

[Submitted on 11 Nov 2013 (v1), last revised 22 Oct 2014 (this version, v5)]

Abstract: Object detection performance, as measured on the canonical PASCAL VOC dataset, has plateaued in the last few years. The best-performing methods are complex ensemble systems that typically combine multiple low-level image features with high-level context. In this paper, we propose a simple and scalable detection algorithm that improves mean average precision (mAP) by more than 30% relative to the previous best result on VOC 2012, achieving a mAP of 53.3%. Our approach combines two key insights: (1) one can apply high-capacity convolutional neural networks (CNNs) to bottom-up region proposals in order to localize and segment objects, and (2) when labeled training data is scarce, supervised pre-training for an auxiliary task, followed by domain-specific fine-tuning, yields a significant performance boost. Since we combine region proposals with CNNs, we call our method R-CNN: Regions with CNN features. We also compare R-CNN to OverFeat, a recently proposed sliding-window detector based on a similar CNN architecture. We find that R-CNN outperforms OverFeat by a large margin on the 200-class ILSVRC2013 detection dataset. Source code for the complete system is available at this http URL.
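The "more than 30% relative" figure in the abstract is a relative gain, not a gain of 30 percentage points. The short Python sketch below works through that arithmetic. The abstract does not state the previous best mAP, so the code only derives the upper bound implied by the two numbers it does give (53.3% and 30%); the function names are illustrative and not taken from the paper or its source code.

```python
def relative_improvement(new: float, old: float) -> float:
    """Gain of `new` over `old`, expressed as a fraction of `old`."""
    return (new - old) / old


def max_baseline(new: float, min_relative_gain: float) -> float:
    """Largest baseline for which `new` is at least `min_relative_gain` better."""
    return new / (1.0 + min_relative_gain)


# Figures quoted in the abstract.
rcnn_map = 53.3      # mAP (%) on VOC 2012
min_gain = 0.30      # "more than 30% relative"

# Assumption-free consequence of those two numbers: the previous best
# result must lie below this bound for the claim to hold.
bound = max_baseline(rcnn_map, min_gain)
print(f"implied previous best: below {bound:.1f}% mAP")
print(f"relative gain at the bound: {relative_improvement(rcnn_map, bound):.0%}")
```

In other words, a relative improvement of more than 30% to 53.3% mAP implies a previous best result below roughly 41% mAP, which is an absolute gain of more than 12 points.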

Comments:	Extended version of our CVPR 2014 paper; latest update (v5) includes results using deeper networks (see Appendix G. Changelog)
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1311.2524 [cs.CV]
	(or arXiv:1311.2524v5 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1311.2524

Submission history

From: Ross Girshick
[v1] Mon, 11 Nov 2013 18:43:49 UTC (3,704 KB)
[v2] Tue, 15 Apr 2014 01:44:31 UTC (16,729 KB)
[v3] Wed, 7 May 2014 17:09:23 UTC (6,644 KB)
[v4] Mon, 9 Jun 2014 22:07:33 UTC (6,644 KB)
[v5] Wed, 22 Oct 2014 17:23:20 UTC (6,660 KB)
