Scale-aware local difference attention on pyramidal features for crowd counting

Multimedia Tools and Applications

Abstract

Automatic crowd counting with computer vision has attracted considerable attention because of its many practical applications. One of the main difficulties of the task is scale variation: the size of people’s heads varies dramatically across images and between different regions of the same image. In this paper, we tackle the problem with a novel scale-aware counting model named FPN-LDA Net. The Feature Pyramid Network (FPN) handles scale variation by fusing multi-scale feature maps from different depth levels of the network, and the Local Difference Attention (LDA) module captures the local differences between the multi-scale pyramid pooling features at a specific location and those of its neighborhood. To handle head scale variation within a single image, the dynamically learned difference scores are used as weights that adaptively highlight the scale-varying head regions of the crowd and filter out irrelevant background regions. We conduct extensive experiments on three widely adopted benchmark datasets, UCF-QNRF, ShanghaiTech and UCF_CC_50, and the experimental results show the superiority of the proposed method.
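The two ideas named in the abstract can be illustrated with a small, self-contained sketch. It is not the authors' implementation: the abstract gives no formulas, so the function names, the 3x3 and 5x5 pooling windows, the fixed per-scale weights (which would be learned in a real network) and the sigmoid scoring are all assumptions chosen for illustration. The sketch works on a single-channel feature map stored as a list of lists.

```python
import math


def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2-D feature map."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]
        out.append(wide)
        out.append(list(wide))
    return out


def fuse_top_down(fine, coarse):
    """FPN-style fusion: upsample the coarser map and add it to the finer one."""
    up = upsample2x(coarse)
    if len(up) != len(fine) or len(up[0]) != len(fine[0]):
        raise ValueError("fine map must be exactly twice the size of coarse map")
    return [[f + u for f, u in zip(frow, urow)] for frow, urow in zip(fine, up)]


def box_mean(fmap, row, col, k):
    """Mean of the k x k window centred at (row, col), clipped at the borders."""
    h, w = len(fmap), len(fmap[0])
    r = k // 2
    vals = [fmap[i][j]
            for i in range(max(0, row - r), min(h, row + r + 1))
            for j in range(max(0, col - r), min(w, col + r + 1))]
    return sum(vals) / len(vals)


def difference_scores(fmap, scales=(3, 5), scale_weights=(1.0, 1.0)):
    """Score each location by how much it differs from its pooled neighbourhood.

    Assumption: the score is a sigmoid of the weighted sum of
    (value - neighbourhood mean) over several pooling window sizes.
    """
    h, w = len(fmap), len(fmap[0])
    scores = [[0.0] * w for _ in range(h)]
    for i in range(h):
        for j in range(w):
            diff = sum(wt * (fmap[i][j] - box_mean(fmap, i, j, k))
                       for k, wt in zip(scales, scale_weights))
            scores[i][j] = 1.0 / (1.0 + math.exp(-diff))
    return scores


def apply_attention(fmap, scores):
    """Re-weight the feature map element-wise with the difference scores."""
    return [[v * s for v, s in zip(vrow, srow)] for vrow, srow in zip(fmap, scores)]


# Toy example: a 4x4 fine map with one strong response, fused with a 2x2 coarse map.
fine_map = [[0.0] * 4 for _ in range(4)]
fine_map[1][1] = 4.0
coarse_map = [[0.5, 0.0], [0.0, 0.0]]
fused_map = fuse_top_down(fine_map, coarse_map)
attended_map = apply_attention(fused_map, difference_scores(fused_map))
```

In this toy setting a location that stands out from its surroundings at several window sizes receives a score above 0.5 and is emphasised, while a location inside a flat region receives exactly 0.5 and one that is weaker than its neighbourhood is suppressed.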

Author information

Corresponding author

Correspondence to Shizhou Zhang.

Ethics declarations

The manuscript has not been published before and is not being considered for publication elsewhere. All authors contributed important intellectual content to this manuscript and have read and approved the final version. We declare that there is no conflict of interest.

Additional information

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This work was supported in part by the National Natural Science Foundation of China (NSFC) under Grant U19B2037.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Zhang, Q., Zhang, S., Liu, X. et al. Scale-aware local difference attention on pyramidal features for crowd counting. Multimed Tools Appl 83, 5165–5180 (2024). https://doi.org/10.1007/s11042-023-15366-1

  • DOI: https://doi.org/10.1007/s11042-023-15366-1

Keywords