Fast RGB-T Tracking via Cross-Modal Correlation Filters

Published: 21 March 2019
Abstract

This paper studies how to perform RGB-T object tracking in the correlation filter framework. Given the input RGB and thermal videos, we learn a correlation filter for each modality because of its high accuracy and speed. To account for the interdependency between the RGB and thermal modalities, we introduce a low-rank constraint so that the filters are learned collaboratively, based on the observation that the features of the two modalities should produce similar filters and hence consistent localizations of the target object. For optimization, we design an efficient ADMM (Alternating Direction Method of Multipliers) algorithm to solve the proposed model. Experimental results on the benchmark datasets (i.e., GTOT, RGBT210 and OSU-CT) suggest that the proposed approach performs favorably in both accuracy and efficiency against state-of-the-art RGB-T methods.
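The abstract outlines the core computation: one correlation filter per modality, a low-rank constraint that couples the filters, and ADMM for optimization. The sketch below is a minimal illustration of that idea, not the authors' implementation: it assumes single-channel features and replaces the low-rank term with a consensus constraint (both filters equal a shared filter), which for two modalities is the rank-one special case. The function names (gaussian_label, train_cross_modal_cf, locate) and hyper-parameters (lam, rho, n_iters) are illustrative assumptions.

```python
import numpy as np

def gaussian_label(shape, sigma=2.0):
    """Desired response: a Gaussian peak at the centre of the training patch."""
    h, w = shape
    ys, xs = np.mgrid[0:h, 0:w]
    dist2 = (ys - h // 2) ** 2 + (xs - w // 2) ** 2
    return np.exp(-0.5 * dist2 / sigma ** 2)

def train_cross_modal_cf(patch_rgb, patch_t, lam=1e-2, rho=1.0, n_iters=10):
    """Learn one Fourier-domain filter per modality with consensus ADMM.

    Each filter solves a ridge regression towards the Gaussian label; the
    consensus variable g_hat forces the two filters to agree (rank-one coupling),
    a simplified stand-in for the paper's low-rank constraint.
    """
    y_hat = np.fft.fft2(gaussian_label(patch_rgb.shape))
    x_hats = [np.fft.fft2(patch_rgb), np.fft.fft2(patch_t)]
    w_hats = [np.zeros_like(y_hat) for _ in x_hats]   # per-modality filters
    g_hat = np.zeros_like(y_hat)                      # shared consensus filter
    u_hats = [np.zeros_like(y_hat) for _ in x_hats]   # scaled dual variables
    for _ in range(n_iters):
        # w-step: per-frequency ridge regression plus the ADMM proximity term.
        for m, x_hat in enumerate(x_hats):
            num = np.conj(x_hat) * y_hat + 0.5 * rho * (g_hat - u_hats[m])
            den = np.abs(x_hat) ** 2 + lam + 0.5 * rho
            w_hats[m] = num / den
        # g-step: average the dual-corrected modality filters.
        g_hat = sum(w + u for w, u in zip(w_hats, u_hats)) / len(w_hats)
        # dual update.
        for m in range(len(u_hats)):
            u_hats[m] += w_hats[m] - g_hat
    return w_hats

def locate(w_hats, patch_rgb, patch_t):
    """Sum the two modality response maps and return the peak position.

    The offset of the peak from the patch centre gives the target displacement.
    """
    resp = sum(np.real(np.fft.ifft2(w * np.fft.fft2(p)))
               for w, p in zip(w_hats, (patch_rgb, patch_t)))
    return np.unravel_index(np.argmax(resp), resp.shape)

# Toy usage: random 64x64 patches standing in for the RGB and thermal crops.
if __name__ == "__main__":
    rng = np.random.default_rng(0)
    rgb, thermal = rng.standard_normal((64, 64)), rng.standard_normal((64, 64))
    filters = train_cross_modal_cf(rgb, thermal)
    print("response peak at", locate(filters, rgb, thermal))
```

A full tracker would learn the filters over multi-channel features on each frame's search region and could fuse the two response maps with learned weights rather than a plain sum; the consensus coupling here only conveys the collaborative-learning idea.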

Published In

Neurocomputing, Volume 334, Issue C, March 2019, 276 pages

Publisher

Elsevier Science Publishers B.V., Netherlands

Publication History

Published: 21 March 2019

Author Tags

1. RGB-Thermal
2. Cross-Modal
3. Correlation Filters
4. Low-Rank
5. Tracking

Qualifiers

• Research-article
