
Rethinking Global Context in Crowd Counting

  • Research Article
  • Published in Machine Intelligence Research

Abstract

This paper investigates the role of global context in crowd counting. Specifically, a pure transformer is used to extract features with global information from overlapping image patches. Inspired by image classification, we add a context token to the input sequence to facilitate information exchange with the tokens corresponding to image patches throughout the transformer layers. Since transformers do not explicitly model channel-wise interactions, which have proved effective in convolutional networks, we propose a token-attention module (TAM) that recalibrates the encoded features through channel-wise attention informed by the context token. Beyond that, the context token is also used to predict the total person count of the image through a regression-token module (RTM). Extensive experiments on various datasets, including ShanghaiTech, UCF-QNRF, JHU-CROWD++ and NWPU, demonstrate that the proposed context extraction techniques can significantly improve performance over the baselines.
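To make the described pipeline concrete, the following minimal PyTorch sketch illustrates how a context token could drive the two proposed heads: a token-attention module (TAM) that gates patch features channel-wise, and a regression-token module (RTM) that regresses the global count. This is an illustration inferred from the abstract only, not the authors' released implementation; the module names, layer sizes, and the squeeze-and-excitation-style gating are assumptions.

```python
# Hedged sketch of the context-token idea from the abstract (not official code).
# Assumptions: TokenAttentionModule/RegressionTokenModule names, SE-style gating,
# embedding dim, and reduction ratio are all illustrative.
import torch
import torch.nn as nn


class TokenAttentionModule(nn.Module):
    """Recalibrate patch tokens channel-wise, informed by the context token (TAM-like)."""

    def __init__(self, dim: int, reduction: int = 4):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, dim // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(dim // reduction, dim),
            nn.Sigmoid(),
        )

    def forward(self, patch_tokens: torch.Tensor, context_token: torch.Tensor) -> torch.Tensor:
        # patch_tokens: (B, N, C), context_token: (B, C)
        gate = self.mlp(context_token).unsqueeze(1)  # (B, 1, C) channel-wise weights
        return patch_tokens * gate                   # recalibrated features


class RegressionTokenModule(nn.Module):
    """Regress the total person count directly from the context token (RTM-like)."""

    def __init__(self, dim: int):
        super().__init__()
        self.head = nn.Sequential(nn.LayerNorm(dim), nn.Linear(dim, 1))

    def forward(self, context_token: torch.Tensor) -> torch.Tensor:
        return self.head(context_token).squeeze(-1)  # (B,) predicted counts


if __name__ == "__main__":
    B, N, C = 2, 196, 384                 # batch, number of patch tokens, embedding dim (assumed)
    tokens = torch.randn(B, N + 1, C)     # transformer output: [context token | patch tokens]
    context, patches = tokens[:, 0], tokens[:, 1:]
    feats = TokenAttentionModule(C)(patches, context)  # features for subsequent density prediction
    count = RegressionTokenModule(C)(context)          # global count regressed from the context token
    print(feats.shape, count.shape)
```

In this sketch, the context token plays a role analogous to the class token in vision transformers: it aggregates global information across layers and then both re-weights the patch features (channel attention) and predicts the image-level count.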

Author information

Corresponding author

Correspondence to Yun Liu.

Ethics declarations

The authors declare that they have no conflict of interest with respect to this work.

Additional information

Colored figures are available in the online version at https://link.springer.com/journal/11633

Guolei Sun received the M.Sc. degree in computer science from King Abdullah University of Science and Technology, Saudi Arabia in 2018. From 2018 to 2019, he worked as a research engineer at the Inception Institute of Artificial Intelligence, UAE. Currently, he is a Ph.D. candidate at ETH Zürich, Switzerland, under the supervision of Prof. Luc Van Gool. He has published more than 20 papers in top journals and conferences such as TPAMI, CVPR, ICCV, and ECCV.

His research interests include computer vision and deep learning for tasks such as semantic segmentation, video understanding, and object counting.

Yun Liu received the B.Eng. and Ph.D. degrees in computer science from Nankai University, China in 2016 and 2020, respectively. He then worked with Prof. Luc Van Gool for one and a half years as a postdoctoral scholar at the Computer Vision Lab, ETH Zürich, Switzerland. Currently, he is a senior scientist at the Institute for Infocomm Research (I2R), A*STAR, Singapore.

His research interests include computer vision and machine learning.

Thomas Probst received the M.Sc. and Ph.D. degrees in computer science from Ulm University, Germany, and ETH Zürich, Switzerland, in 2014 and 2019, respectively. After that, he was a postdoctoral researcher at the Computer Vision Lab under Prof. Luc Van Gool at ETH Zürich, Switzerland.

His research interests include deep learning for geometry problems, and the perception of humans for robotics.

Danda Pani Paudel received the M.Sc. degree in computer vision and the Ph.D. degree in computer science from the University of Bourgogne, France, in 2012 and 2015, respectively. He worked as a research scholar at the University of Strasbourg, France from 2013 to 2015, devising global and local methods for 2D–3D registration problems. Currently, he is a researcher at the Computer Vision Lab, ETH Zürich, Switzerland, working with Prof. Luc Van Gool.

His research interests include computer vision, visual-SLAM, unsupervised learning and optimization methods.

Nikola Popovic is a Ph.D. candidate at the Computer Vision Lab, ETH Zürich, Switzerland. He has published a number of papers in major computer vision conferences.

His research interests include computer vision, machine learning and artificial intelligence.

Luc Van Gool received the B.Eng. degree in electromechanical engineering from the Katholieke Universiteit Leuven, Belgium in 1981. Currently, he is a professor at the Katholieke Universiteit Leuven, Belgium, and at ETH Zürich, Switzerland, where he leads and teaches computer vision research. He has been a program committee member of several major computer vision conferences. He received several Best Paper awards, won a David Marr Prize and a Koenderink Award, and was nominated Distinguished Researcher by the IEEE Computer Science committee. He is a co-founder of 10 spin-off companies.

His research interests include 3D reconstruction and modelling, object recognition, tracking, gesture analysis, and the combination of those.

About this article

Cite this article

Sun, G., Liu, Y., Probst, T. et al. Rethinking Global Context in Crowd Counting. Mach. Intell. Res. 21, 640–651 (2024). https://doi.org/10.1007/s11633-023-1475-z
