PapMOT: Exploring Adversarial Patch Attack Against Multiple Object Tracking

Long, Jiahuan; Jiang, Tingsong; Yao, Wen; Jia, Shuai; Zhang, Weijia; Zhou, Weien; Ma, Chao; Chen, Xiaoqian

doi:10.1007/978-3-031-72983-6_8

Jiahuan Long¹³,¹⁴,
Tingsong Jiang¹⁴,
Wen Yao¹⁴,
Shuai Jia¹³,
Weijia Zhang¹³,
Weien Zhou¹⁴,
Chao Ma¹³ &
Xiaoqian Chen¹⁴

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 15109)

Included in the following conference series:

European Conference on Computer Vision

Abstract

Tracking multiple objects in a continuous video stream is crucial for many computer vision tasks. It involves detecting objects and associating them with their respective identities across successive frames. Despite significant progress in multiple object tracking (MOT), recent studies have revealed that existing MOT methods are vulnerable to adversarial attacks. However, all of these attacks are digital attacks that inject pixel-level noise into input images, and are therefore ineffective in physical scenarios. To fill this gap, we propose PapMOT, which generates physical adversarial patches against MOT for both digital and physical scenarios. Besides attacking the detection mechanism, PapMOT also optimizes a printable patch that can be detected as new targets to mislead the identity association process. Moreover, we introduce a patch enhancement strategy that further degrades the temporal consistency of tracking results across video frames, resulting in more aggressive attacks. We further develop new evaluation metrics to assess the robustness of MOT against such attacks. Extensive evaluations on multiple datasets demonstrate that PapMOT can successfully attack various architectures of MOT trackers in digital scenarios. We also validate the effectiveness of PapMOT for physical attacks by deploying printed adversarial patches in the real world.
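
To make the identity association step concrete, the following is a minimal sketch of a toy tracking-by-detection loop, not the PapMOT method or any tracker evaluated in the paper. It assumes greedy IoU matching between existing tracks and new detections, with unmatched detections spawning new identities. The function names (`iou`, `associate`), the threshold of 0.3, and all box coordinates are hypothetical and chosen only for illustration. It shows why a patch that is itself detected as an object can inject a spurious identity into the tracker's output.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def associate(tracks, detections, next_id, iou_threshold=0.3):
    """Toy greedy association (an assumption, not a real tracker's logic).

    tracks: dict mapping track id -> last known box.
    detections: list of boxes found in the current frame.
    Returns (updated_tracks, next_id). Unmatched tracks are simply dropped.
    """
    # Score every (track, detection) pair and visit the best overlaps first.
    pairs = sorted(
        ((iou(box, det), tid, j)
         for tid, box in tracks.items()
         for j, det in enumerate(detections)),
        reverse=True,
    )
    used_tracks, used_dets, updated = set(), set(), {}
    for score, tid, j in pairs:
        if score < iou_threshold:
            break  # remaining pairs overlap too little to be the same object
        if tid in used_tracks or j in used_dets:
            continue
        updated[tid] = detections[j]
        used_tracks.add(tid)
        used_dets.add(j)
    # Any detection left unmatched is treated as a newly appeared object.
    for j, det in enumerate(detections):
        if j not in used_dets:
            updated[next_id] = det
            next_id += 1
    return updated, next_id


# Hypothetical scene: one pedestrian already tracked with id 1.
tracks = {1: (10, 10, 50, 100)}
person = (12, 10, 52, 100)   # the pedestrian in the next frame
patch = (20, 40, 40, 60)     # a worn patch that the detector reports as an object

clean_tracks, _ = associate(tracks, [person], next_id=2)
attacked_tracks, _ = associate(tracks, [person, patch], next_id=2)

print(sorted(clean_tracks))     # [1]
print(sorted(attacked_tracks))  # [1, 2]
```

In this toy setting the patch box overlaps the existing track too little to be matched, so it is registered as a second identity. Real trackers add motion models, appearance features, and track confirmation rules, so the sketch only illustrates the general failure mode that the abstract describes.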

Acknowledgements

This work was supported in part by NSFC (62322113, 62376156).

Author information

Authors and Affiliations

MoE Key Lab of Artificial Intelligence, AI Institute, Shanghai Jiao Tong University, Shanghai, China
Jiahuan Long, Shuai Jia, Weijia Zhang & Chao Ma
Defense Innovation Institute, Chinese Academy of Military Science, Beijing, China
Jiahuan Long, Tingsong Jiang, Wen Yao, Weien Zhou & Xiaoqian Chen

Corresponding authors

Correspondence to Wen Yao or Chao Ma.

Editor information

Editors and Affiliations

University of Birmingham, Birmingham, UK
Aleš Leonardis
University of Trento, Trento, Italy
Elisa Ricci
Technical University of Darmstadt, Darmstadt, Germany
Stefan Roth
Princeton University, Princeton, NJ, USA
Olga Russakovsky
Czech Technical University in Prague, Prague, Czech Republic
Torsten Sattler
École des Ponts ParisTech, Marne-la-Vallée, France
Gül Varol

1 Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (zip 41483 KB)

About this paper

Cite this paper

Long, J. et al. (2025). PapMOT: Exploring Adversarial Patch Attack Against Multiple Object Tracking. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15109. Springer, Cham. https://doi.org/10.1007/978-3-031-72983-6_8

DOI: https://doi.org/10.1007/978-3-031-72983-6_8
Published: 29 October 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72982-9
Online ISBN: 978-3-031-72983-6
eBook Packages: Computer Science, Computer Science (R0)
