Review and Analysis of RGBT Single Object Tracking Methods: A Fusion Perspective

Abstract

Visual tracking is a fundamental task in computer vision with significant practical applications in domains such as surveillance, security, robotics, and human-computer interaction. Trackers that rely on visible-light data alone, however, degrade under conditions such as low illumination, occlusion, and camouflage, which can significantly reduce accuracy. To cope with these challenges, researchers have explored combining the visible and infrared modalities: by leveraging the complementary strengths of visible and infrared data, RGB-infrared (RGBT) fusion tracking has emerged as a promising way to address these limitations and improve tracking accuracy in challenging scenarios. In this paper, we present a review of RGB-infrared fusion tracking. Specifically, we group existing RGBT tracking methods into four categories according to their underlying architectures, feature representations, and fusion strategies: feature decoupling-based methods, feature selection-based methods, collaborative graph-based methods, and traditional fusion methods. We then provide a critical analysis of their strengths, limitations, representative methods, and future research directions. To further illustrate the advantages and disadvantages of these methods, we review publicly available RGBT tracking datasets and analyze the main results reported on them. Moreover, we discuss current limitations of RGBT tracking and outline opportunities and future directions, such as greater dataset diversity and unsupervised or weakly supervised learning. In conclusion, our survey aims to serve as a useful resource for researchers and practitioners interested in the emerging field of RGBT tracking and to promote further progress and innovation in this area.
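To make the fusion perspective concrete, the sketch below shows a minimal, hypothetical feature-level fusion module in PyTorch: visible and thermal feature maps are re-weighted by learned, image-dependent quality scores before being merged, in the spirit of the feature-selection category described above. The module name, layer sizes, and overall structure are illustrative assumptions for this review, not the design of any specific tracker surveyed here.

```python
import torch
import torch.nn as nn


class AdaptiveModalityFusion(nn.Module):
    """Illustrative feature-level fusion of visible (RGB) and thermal (T) feature maps.

    Each modality's contribution is re-weighted by a learned quality score,
    echoing the feature-selection idea; this is a sketch, not a surveyed method.
    """

    def __init__(self, channels: int):
        super().__init__()
        # Quality estimator shared by both modalities:
        # global pooling + small bottleneck -> one scalar score per image
        self.quality = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // 4, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // 4, 1, kernel_size=1),
        )
        # Merge the re-weighted modalities back into a single feature map
        self.fuse = nn.Conv2d(2 * channels, channels, kernel_size=1)

    def forward(self, feat_rgb: torch.Tensor, feat_t: torch.Tensor) -> torch.Tensor:
        # Softmax over the two modality scores keeps the weights comparable
        weights = torch.softmax(
            torch.cat([self.quality(feat_rgb), self.quality(feat_t)], dim=1), dim=1
        )
        w_rgb, w_t = weights[:, 0:1], weights[:, 1:2]
        fused = torch.cat([w_rgb * feat_rgb, w_t * feat_t], dim=1)
        return self.fuse(fused)


if __name__ == "__main__":
    # Toy usage: backbone features of one search region from each modality
    rgb = torch.randn(1, 256, 31, 31)
    tir = torch.randn(1, 256, 31, 31)
    fused = AdaptiveModalityFusion(256)(rgb, tir)
    print(fused.shape)  # torch.Size([1, 256, 31, 31])
```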

      Published In

ACM Transactions on Multimedia Computing, Communications, and Applications (Just Accepted)
ISSN: 1551-6857
EISSN: 1551-6865

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Online AM: 07 March 2024
      Accepted: 17 February 2024
      Revised: 02 January 2024
      Received: 16 April 2023

      Author Tags

      1. information fusion
      2. RGBT
      3. visual tracking
      4. deep learning

      Qualifiers

      • Survey
