Abstract
Fine-grained visual classification (FGVC) is widely used to identify sub-categories of ships, dogs, flowers, and so on, and aims to help non-experts distinguish sub-categories that differ only slightly. Its main challenges are small inter-class differences and large intra-class variations. Current effective methods use multi-scale or multi-granularity features to find these subtle differences. However, these methods focus on accuracy while neglecting computational cost in practice. In this paper, we therefore propose an improved, efficient Multi-granularity Learning method with Only Forward Once (MLOFO). It reduces the number of forward and backward propagations in training from several to one, cutting the computational cost several-fold. In addition, an intra-class metric loss, named prototype metric (PM) loss, is proposed to supervise the learning of effective classification features in a multi-granularity network (MGN) framework. The effectiveness of the proposed method is verified on four fine-grained classification datasets (CUB-200-2011, Stanford Cars, FGVC-Aircraft, and AircraftCarrier). Experimental results demonstrate that our method achieves state-of-the-art accuracies on these FGVC tasks. Furthermore, we discuss how the new PM loss, like label smoothing, compresses the distribution of intra-class features to achieve better generalization. Our method helps improve the training efficiency of the MGN model and, to a certain extent, the accuracy of fine-grained classification.
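The abstract does not give the formula for the PM loss, so the following is only a minimal sketch of the general idea of an intra-class metric loss: pulling each feature toward a per-class prototype so that intra-class features become more compact. It assumes a center-loss-style formulation in which the prototype is the class mean and the penalty is the mean squared Euclidean distance; the function names `class_prototypes` and `intra_class_loss` are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence


def class_prototypes(
    features: Sequence[Sequence[float]], labels: Sequence[int]
) -> Dict[int, List[float]]:
    """Return the mean feature vector of each class.

    Assumption: the prototype is the class mean. The paper may define
    its prototypes differently (for example, as learned parameters).
    """
    sums: Dict[int, List[float]] = {}
    counts: Dict[int, int] = {}
    for feat, label in zip(features, labels):
        if label not in sums:
            sums[label] = [0.0] * len(feat)
            counts[label] = 0
        sums[label] = [s + v for s, v in zip(sums[label], feat)]
        counts[label] += 1
    return {label: [s / counts[label] for s in sums[label]] for label in sums}


def intra_class_loss(
    features: Sequence[Sequence[float]],
    labels: Sequence[int],
    prototypes: Dict[int, List[float]],
) -> float:
    """Mean squared Euclidean distance from each feature to its class prototype.

    Assumption: a center-loss-style stand-in, not the paper's PM loss.
    """
    total = 0.0
    for feat, label in zip(features, labels):
        proto = prototypes[label]
        total += sum((a - b) ** 2 for a, b in zip(feat, proto))
    return total / len(features)


# Toy example: two classes with two 2-D features each.
features = [[0.0, 0.0], [2.0, 0.0], [5.0, 5.0], [5.0, 7.0]]
labels = [0, 0, 1, 1]
protos = class_prototypes(features, labels)
loss = intra_class_loss(features, labels, protos)
print(protos, loss)
```

In this toy example every feature lies at squared distance 1 from its class mean, so the loss is 1.0; minimizing such a term alongside a classification loss would shrink the spread of each class around its prototype.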
Additional information
This work has been supported by the Natural Science Foundation of Jiangsu Province under Grant BK20200581, and in part by the National Natural Science Foundation of China under Grant 61806220.
About this article
Cite this article
Wang, J., Li, Y., Li, H. et al. Efficient multi-granularity network for fine-grained image classification. J Real-Time Image Proc 19, 853–866 (2022). https://doi.org/10.1007/s11554-022-01228-w
DOI: https://doi.org/10.1007/s11554-022-01228-w