research-article

Multi-scale Multi-clue Crowd Counting Network

Authors:

Fei Han

ACAI '21: Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence

Article No.: 1, Pages 1 - 7

https://doi.org/10.1145/3508546.3508547

Published: 25 February 2022

Abstract

Crowd counting against complex backgrounds remains challenging, yet it is an important task for public safety. To address this problem, we propose a multi-scale multi-clue crowd counting network (MMNet), composed of a feature encoder backbone and four stacked multi-clue crowd estimation modules (MCEMs) that operate at multiple scales as decoders. Each module consists of three predictors: a shared attention predictor (SAP), a density map predictor (DMP), and a local counting map predictor (LCMP). DMP uses the information at each pixel of the image, while LCMP divides the image into patches and predicts the number of people in each patch. These two predictors address inaccurate counting under complex backgrounds from the perspective of the training target: DMP trains the model on the image's microscopic (pixel-level) information and LCMP on its macroscopic (patch-level) information. SAP addresses the problem from the perspective of feature extraction, generating multi-scale shared attention maps that help both predictors concentrate on the head regions in the image. Furthermore, we design a multi-task joint training strategy that automatically adjusts the loss weights of the different tasks to facilitate training and improve the robustness of the model. Extensive experiments on three challenging datasets (ShanghaiTech, UCF_CC_50, and UCF-QNRF) demonstrate the superior performance of MMNet.
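
The relationship between the two training targets named in the abstract, a per-pixel density map and a per-patch local counting map, can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the patch size, and the use of unsmoothed unit masses at head locations (rather than the Gaussian-smoothed density maps common in the literature) are assumptions made for illustration.

```python
import numpy as np


def density_from_points(points, height, width):
    """Build a toy density map with a unit mass at each (row, col) head location.

    Assumption: no Gaussian smoothing is applied, to keep the sketch small.
    """
    density = np.zeros((height, width), dtype=np.float64)
    for row, col in points:
        density[row, col] += 1.0
    return density


def local_count_map(density, patch):
    """Sum the density over non-overlapping patch x patch blocks."""
    height, width = density.shape
    if height % patch or width % patch:
        raise ValueError("image size must be divisible by the patch size")
    # Split each axis into (number of blocks, patch) and sum inside each block.
    blocks = density.reshape(height // patch, patch, width // patch, patch)
    return blocks.sum(axis=(1, 3))


# Hypothetical annotations: four heads in an 8 x 8 image.
heads = [(1, 1), (2, 6), (5, 5), (6, 6)]
density = density_from_points(heads, 8, 8)
counts = local_count_map(density, 4)

print(counts)                      # per-patch counts on a 2 x 2 grid
print(density.sum(), counts.sum())  # both maps integrate to the same total
```

Both maps sum to the same total count; the density map keeps the position of every person at pixel resolution, while the local counting map only records how many people fall inside each patch.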

Published In

ACAI '21: Proceedings of the 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence

December 2021

699 pages

ISBN: 9781450385053

DOI: 10.1145/3508546

Copyright © 2021 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

Science and Technology Commission of Shanghai Municipality
Science and Technology Major Project of Commission of Science and Technology of Shanghai
research of High-tech industry and technological innovation special project in Lingang New Area on ecological environment monitoring system based on 5G Internet of Things and edge computing

Conference

ACAI'21

ACAI'21: 2021 4th International Conference on Algorithms, Computing and Artificial Intelligence

December 22 - 24, 2021

Sanya, China

Acceptance Rates

Overall Acceptance Rate 173 of 395 submissions, 44%
