Fusing two-stream convolutional neural networks for RGB-T object tracking

C Li, X Wu, N Zhao, X Cao, J Tang - Neurocomputing, 2018 - Elsevier
Abstract
This paper investigates how to integrate the complementary information from RGB and thermal (RGB-T) sources for object tracking. We propose a novel Convolutional Neural Network (ConvNet) architecture, comprising a two-stream ConvNet and a FusionNet, to achieve adaptive fusion of different source data for robust RGB-T tracking. Both the RGB and thermal streams extract generic semantic information about the target object. In particular, the thermal stream is pre-trained on the ImageNet dataset to encode rich semantic information, and then fine-tuned on thermal images to capture the specific properties of thermal data. To fuse the modalities adaptively while avoiding redundant noise, the FusionNet selects the most discriminative feature maps from the outputs of the two-stream ConvNet, and is updated online to adapt to appearance variations of the target object. Finally, object locations are efficiently predicted by applying a multi-channel correlation filter to the fused feature maps. Extensive experiments on the recent public benchmark GTOT verify the effectiveness of the proposed approach against other state-of-the-art RGB-T trackers.
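The pipeline the abstract describes (two feature streams, selection of discriminative maps, then a multi-channel correlation filter) can be sketched in miniature. This is not the authors' FusionNet: the selection criterion here is a crude variance score standing in for the learned fusion weights, and the feature maps are random arrays standing in for ConvNet outputs. It only illustrates the data flow: select top-k maps from the concatenated streams, then compute a correlation response in the Fourier domain.

```python
import numpy as np

def select_discriminative_maps(rgb_maps, thermal_maps, k):
    """Toy stand-in for FusionNet: concatenate the two streams' feature
    maps (shape (C, H, W) each) and keep the k maps with the highest
    spatial variance, a crude proxy for discriminability."""
    maps = np.concatenate([rgb_maps, thermal_maps], axis=0)
    scores = maps.reshape(maps.shape[0], -1).var(axis=1)
    top = np.argsort(scores)[::-1][:k]
    return maps[top]

def correlation_response(feature_maps, filters):
    """Multi-channel correlation filter response: per-channel circular
    correlation in the Fourier domain, summed over channels."""
    F = np.fft.fft2(feature_maps, axes=(-2, -1))
    H = np.fft.fft2(filters, axes=(-2, -1))
    return np.fft.ifft2((np.conj(H) * F).sum(axis=0)).real

# Random arrays stand in for the two streams' ConvNet outputs.
rng = np.random.default_rng(0)
rgb = rng.normal(size=(4, 16, 16))
thermal = rng.normal(size=(4, 16, 16))

fused = select_discriminative_maps(rgb, thermal, k=3)
# Correlating the fused maps with themselves: the response peaks at
# zero shift, i.e. flat index 0 of the (16, 16) response map.
resp = correlation_response(fused, filters=fused)
print(fused.shape, resp.shape, np.argmax(resp))
```

In the actual tracker, the filters would be learned from the first frame and updated online, and the peak of the response map gives the predicted target translation.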