
Rethinking Global Context in Crowd Counting

  • Research Article
  • Published in Machine Intelligence Research

Abstract

This paper investigates the role of global context in crowd counting. Specifically, a pure transformer is used to extract features with global information from overlapping image patches. Inspired by image classification, we add a context token to the input sequence to facilitate information exchange with the tokens corresponding to image patches throughout the transformer layers. Since transformers do not explicitly model channel-wise interactions, which have proved effective in convolutional networks, we propose a token-attention module (TAM) that recalibrates the encoded features through channel-wise attention informed by the context token. Beyond that, the context token is also used to predict the total person count of the image through a regression-token module (RTM). Extensive experiments on various datasets, including ShanghaiTech, UCF-QNRF, JHU-CROWD++ and NWPU, demonstrate that the proposed context extraction techniques can significantly improve performance over the baselines.
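To make the described pipeline concrete, the following minimal PyTorch sketch illustrates how a context token could drive the two proposed heads: a token-attention module (TAM) that gates patch features channel-wise, and a regression-token module (RTM) that regresses the global count. This is an illustration inferred from the abstract only, not the authors' released implementation; the module names, layer sizes, and the squeeze-and-excitation-style gating are assumptions.

```python
# Hedged sketch of the context-token idea from the abstract (not official code).
# Assumptions: TokenAttentionModule/RegressionTokenModule names, SE-style gating,
# embedding dim, and reduction ratio are all illustrative.
import torch
import torch.nn as nn


class TokenAttentionModule(nn.Module):
    """Recalibrate patch tokens channel-wise, informed by the context token (TAM-like)."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, patch_tokens: torch.Tensor, context_token: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, N, C), context_token: (B, C)
        gate = self.mlp(context_token).unsqueeze(1)  # (B, 1, C) channel-wise weights
        return patch_tokens * gate                   # recalibrated features


class RegressionTokenModule(nn.Module):
    """Regress the total person count directly from the context token (RTM-like)."""

    def __init__(self, dim: int):
        super().__init__()
        self.head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 1))

    def forward(self, context_token: torch.Tensor) -> torch.Tensor:
        return self.head(context_token).squeeze(-1)  # (B,) predicted counts


if __name__ == "__main__":
    B, N, C = 2, 196, 384                 # batch, number of patch tokens, embedding dim (assumed)
    tokens = torch.randn(B, N + 1, C)     # transformer output: [context token | patch tokens]
    context, patches = tokens[:, 0], tokens[:, 1:]
    feats = TokenAttentionModule(C)(patches, context)  # features for subsequent density prediction
    count = RegressionTokenModule(C)(context)          # global count regressed from the context token
    print(feats.shape, count.shape)
```

In this sketch, the context token plays a role analogous to the class token in vision transformers: it aggregates global information across layers and then both re-weights the patch features (channel attention) and predicts the image-level count.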

Author information

Corresponding author

Correspondence to Yun Liu.

Ethics declarations

The authors declare that they have no conflict of interest with respect to this work.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Guolei Sun received the M.Sc. degree in computer science from King Abdullah University of Science and Technology, Saudi Arabia in 2018. From 2018 to 2019, he worked as a research engineer at the Inception Institute of Artificial Intelligence, UAE. Currently, he is a Ph.D. candidate at ETH Zürich, Switzerland, under the supervision of Prof. Luc Van Gool. He has published more than 20 papers in top journals and conferences such as TPAMI, CVPR, ICCV, and ECCV.

His research interests include computer vision and deep learning for tasks such as semantic segmentation, video understanding, and object counting.

Yun Liu received the B.Eng. and Ph.D. degrees in computer science from Nankai University, China in 2016 and 2020, respectively. He then worked with Prof. Luc Van Gool for one and a half years as a postdoctoral scholar at the Computer Vision Lab, ETH Zürich, Switzerland. Currently, he is a senior scientist at the Institute for Infocomm Research (I2R), A*STAR, Singapore.

His research interests include computer vision and machine learning.

Thomas Probst received the M.Sc. and Ph.D. degrees in computer science from Ulm University, Germany, and ETH Zürich, Switzerland, in 2014 and 2019, respectively. After that, he was a postdoctoral researcher at the Computer Vision Lab under Prof. Luc Van Gool at ETH Zürich, Switzerland.

His research interests include deep learning for geometry problems, and the perception of humans for robotics.

Danda Pani Paudel received the M.Sc. degree in computer vision and the Ph.D. degree in computer science from the University of Bourgogne, France, in 2012 and 2015, respectively. He worked as a research scholar at the University of Strasbourg, France from 2013 to 2015, devising global and local methods for 2D–3D registration problems. Currently, he is a researcher at the Computer Vision Lab, ETH Zürich, Switzerland, working with Prof. Luc Van Gool.

His research interests include computer vision, visual-SLAM, unsupervised learning and optimization methods.

Nikola Popovic is a Ph.D. candidate at the Computer Vision Lab, ETH Zürich, Switzerland. He has published a number of papers in major computer vision conferences.

His research interests include computer vision, machine learning and artificial intelligence.

Luc Van Gool received the B.Eng. degree in electromechanical engineering from the Katholieke Universiteit Leuven, Belgium in 1981. Currently, he is a professor at the Katholieke Universiteit Leuven, Belgium, and at ETH Zürich, Switzerland, where he leads and teaches computer vision research. He has been a program committee member of several major computer vision conferences. He received several Best Paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science committee. He is a co-founder of 10 spin-off companies.

His research interests include 3D reconstruction and modelling, object recognition, tracking, gesture analysis, and the combination of those.

About this article

Cite this article

Sun, G., Liu, Y., Probst, T. et al. Rethinking Global Context in Crowd Counting. Mach. Intell. Res. 21, 640–651 (2024). https://doi.org/10.1007/s11633-023-1475-z
