Jointly modeling motion and appearance cues for robust RGB-T tracking

P Zhang, J Zhao, C Bo, D Wang, H Lu… - IEEE Transactions on Image Processing, 2021 - ieeexplore.ieee.org
In this study, we propose a novel RGB-T tracking framework by jointly modeling both appearance and motion cues. First, to obtain a robust appearance model, we develop a novel late fusion method to infer the fusion weight maps of both RGB and thermal (T) modalities. The fusion weights are determined by using offline-trained global and local multimodal fusion networks, and then adopted to linearly combine the response maps of RGB and T modalities. Second, when the appearance cue is unreliable, we comprehensively take motion cues, i.e., target and camera motions, into account to make the tracker robust. We further propose a tracker switcher to switch the appearance and motion trackers flexibly. Numerous results on three recent RGB-T tracking datasets show that the proposed tracker performs significantly better than other state-of-the-art algorithms.
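The late-fusion step described above can be sketched as a per-pixel weighted combination of the two modalities' response maps. This is a minimal illustration only: the function and variable names are hypothetical, and in the paper the weight maps are produced by offline-trained global and local multimodal fusion networks rather than supplied directly.

```python
import numpy as np

def fuse_responses(resp_rgb: np.ndarray, resp_t: np.ndarray,
                   w_rgb: np.ndarray, w_t: np.ndarray) -> np.ndarray:
    """Linearly combine RGB and thermal (T) response maps using
    fusion weight maps of the same spatial shape.

    Weights are normalized per pixel so they sum to one; in the
    paper these weights come from learned fusion networks
    (placeholder inputs here).
    """
    total = w_rgb + w_t + 1e-12          # guard against division by zero
    w_rgb, w_t = w_rgb / total, w_t / total
    return w_rgb * resp_rgb + w_t * resp_t

# Toy example: two 1x2 response maps fused with equal weights.
resp_rgb = np.array([[0.2, 0.8]])
resp_t = np.array([[0.6, 0.4]])
w = np.ones_like(resp_rgb)
fused = fuse_responses(resp_rgb, resp_t, w, w)      # [[0.4, 0.6]]
peak = np.unravel_index(np.argmax(fused), fused.shape)  # predicted target location
```

The peak of the fused map then gives the appearance tracker's prediction; per the abstract, when this appearance cue is unreliable the framework falls back on motion cues via a tracker switcher.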