research-article

P2FTrack: Multi-Object Tracking with Motion Prior and Feature Posterior

Published: 23 December 2024

Abstract

Multiple object tracking (MOT) has emerged as a crucial component of the rapidly developing field of computer vision. However, existing MOT methods often overlook the relationship between features and motion, which hinders their ability to balance performance across coupled motion and complex scenes. In this work, we propose a novel end-to-end MOT method that integrates motion and feature information. To achieve this, we introduce a motion prior generator that transforms motion information into attention masks. Additionally, we use prior-posterior fusion multi-head attention to combine the motion-derived priors with the attention-based posteriors. We evaluate the proposed method on the MOT17 and DanceTrack datasets through comprehensive experiments and ablation studies, and it achieves state-of-the-art performance among feature-based methods at reasonable speed.
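The abstract does not give the exact formulation, so the following is only a minimal sketch of one plausible reading of "motion prior as attention mask" fused with a feature-based attention posterior. Everything in it is an assumption for illustration: the Gaussian distance prior, the additive fusion in log space, the single attention head, and the names `motion_prior_log_mask`, `fused_attention`, and `sigma` do not come from the paper.

```python
import math


def softmax(row):
    """Numerically stable softmax over a list of floats."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def motion_prior_log_mask(pred_centers, det_centers, sigma=20.0):
    """Hypothetical motion prior: log of a Gaussian affinity between each
    track's motion-predicted center and each detection center (pixels)."""
    mask = []
    for px, py in pred_centers:
        row = []
        for dx, dy in det_centers:
            d2 = (px - dx) ** 2 + (py - dy) ** 2
            row.append(-d2 / (2.0 * sigma ** 2))
        mask.append(row)
    return mask


def fused_attention(queries, keys, log_prior):
    """Single-head attention weights with an additive log-prior mask.

    Adding the log prior to the scaled dot-product logits before the softmax
    is equivalent to multiplying prior and feature affinity, then normalizing.
    This fusion rule is an assumption, not the paper's stated method.
    """
    d = len(queries[0])
    weights = []
    for q, prior_row in zip(queries, log_prior):
        logits = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
        weights.append(softmax([l + p for l, p in zip(logits, prior_row)]))
    return weights


# Two tracks and two detections with identical appearance features,
# so only the motion prior can tell them apart.
queries = [[1.0, 0.0], [1.0, 0.0]]
keys = [[1.0, 0.0], [1.0, 0.0]]
pred_centers = [(100.0, 100.0), (300.0, 100.0)]
det_centers = [(105.0, 100.0), (295.0, 100.0)]

log_prior = motion_prior_log_mask(pred_centers, det_centers)
weights = fused_attention(queries, keys, log_prior)
```

In this toy setup the feature-only attention is uniform because the appearance vectors are identical, while the fused weights concentrate on the detection nearest to each track's predicted position. A multi-head version would apply the same additive mask inside each head.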

    Published In

    ACM Transactions on Multimedia Computing, Communications, and Applications  Volume 21, Issue 1
    January 2025
    842 pages
    EISSN:1551-6865
    DOI:10.1145/3703004
    • Editor: Abuabdulmotaleb El Saddik

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 23 December 2024
    Online AM: 14 October 2024
    Accepted: 08 October 2024
    Revised: 14 August 2024
    Received: 05 March 2024
    Published in TOMM Volume 21, Issue 1

    Author Tags

    1. multi-object tracking
    2. prior-posterior fusion
    3. transformer

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
