Cascaded one-vs-rest detection network for fine-grained recognition without part annotations

Chen, Long; Wang, Shengke; Lam, Kin-Man; Zhou, Huiyu; Jian, Muwei; Dong, Junyu

doi:10.1007/s11042-018-5875-y

Published: 17 March 2018

Volume 78, pages 4381–4395, (2019)

Multimedia Tools and Applications

Long Chen¹,
Shengke Wang ORCID: orcid.org/0000-0002-4906-8773¹,
Kin-Man Lam²,
Huiyu Zhou³,
Muwei Jian¹ &
…
Junyu Dong¹


Abstract

Fine-grained recognition is a challenging task because the variances between subordinate categories are small. Most top-performing fine-grained recognition methods leverage object parts for better performance, and therefore require part annotations, which are extremely expensive to obtain. In this paper, we propose a novel cascaded deep CNN detection framework for fine-grained recognition that is trained to detect the whole object without considering parts. However, most current top-performing detection networks use an (N + 1)-class softmax loss (N object categories plus background). The background category, which has far more training samples, dominates the feature learning process, so the learned features are poorly suited to distinguishing object categories that have fewer samples. To address this issue, we introduce two strategies: 1) we leverage a cascaded structure to eliminate the background; 2) we introduce a novel one-vs-rest loss function to capture more minute variances among subordinate categories. Experiments show that the proposed framework achieves performance comparable to state-of-the-art part-free fine-grained recognition methods on the CUB-200-2011 Bird dataset. Our method also outperforms most existing part-annotation-based methods, while requiring no part annotations at the training stage and no annotations of any kind at the test stage.
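The contrast between the (N + 1)-class softmax loss and a one-vs-rest loss can be illustrated with a minimal sketch. This is not the authors' formulation, which the abstract does not specify. It assumes a generic one-vs-rest loss built from N independent binary logistic losses, and a hypothetical first cascade stage that discards proposals whose objectness score falls below a threshold. The function names, the threshold value, and the convention that index 0 denotes background are all assumptions made for illustration.

```python
import math


def softmax_cross_entropy(logits, target):
    """(N + 1)-way softmax loss; index 0 is assumed to be background."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[target]


def one_vs_rest_loss(logits, target):
    """Sum of N independent binary logistic losses, one per object category.

    Assumed generic form: each category k is scored against all others,
    with label 1 for the target category and 0 otherwise.
    """
    loss = 0.0
    for k, z in enumerate(logits):
        y = 1.0 if k == target else 0.0
        # Stable binary cross-entropy on a raw logit:
        # max(z, 0) - z * y + log(1 + exp(-|z|))
        loss += max(z, 0.0) - z * y + math.log1p(math.exp(-abs(z)))
    return loss


def cascade_filter(proposals, objectness, threshold=0.5):
    """Hypothetical stage 1: keep only proposals not scored as background."""
    return [p for p, s in zip(proposals, objectness) if s >= threshold]


# Usage: filter proposals first, then score survivors over N object classes only.
kept = cascade_filter(["box_a", "box_b", "box_c"], [0.9, 0.1, 0.5])
loss = one_vs_rest_loss([2.0, -1.0, 0.5], target=0)
```

In this sketch the second stage never sees a background class, so the many background samples cannot dominate its loss. Each category's binary term is also independent of the others, unlike the softmax, where all logits compete through a shared normalisation.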

Acknowledgments

2014DFA10410. H. Zhou is supported by UK EPSRC under Grant EP/N011074/1 and Royal Society-Newton Advanced Fellowship under Grant NA160342.

Author information

Authors and Affiliations

Ocean University of China, Qingdao, China
Long Chen, Shengke Wang, Muwei Jian & Junyu Dong
Hong Kong Polytechnic University, Hung Hom, Hong Kong
Kin-Man Lam
University of Leicester, University Rd, Leicester, LE1 7RH, UK
Huiyu Zhou

Corresponding author

Correspondence to Shengke Wang.

About this article

Cite this article

Chen, L., Wang, S., Lam, KM. et al. Cascaded one-vs-rest detection network for fine-grained recognition without part annotations. Multimed Tools Appl 78, 4381–4395 (2019). https://doi.org/10.1007/s11042-018-5875-y

Received: 13 July 2017
Revised: 02 February 2018
Accepted: 09 February 2018
Published: 17 March 2018
Issue Date: February 2019
DOI: https://doi.org/10.1007/s11042-018-5875-y
