research-article

Visual tracking via dynamic weighting with pyramid-redetection based Siamese networks

Authors:

Fei Xue

Volume 65, Issue C

https://doi.org/10.1016/j.jvcir.2019.102635

Published: 01 December 2019

Highlights

• An improved end-to-end Siamese network algorithm for visual tracking.

• The dynamic weighting module is used for both offline and online learning.

• The residual architecture is exploited for better prediction.

• An online pyramid-redetection module is used to re-track the target object.

• Experiments on both short-term and long-term tracking show excellent tracking results.

Abstract

Siamese-network-based similarity learning is currently a significant branch of visual tracking. However, most existing deep Siamese networks rely heavily on offline-trained knowledge and assign the same importance to every prediction view. In this paper, we first introduce a dynamic weighting module into the Siamese framework, which lets the offline-trained network adapt to the current circumstances and weight the predictive response maps discriminatively. The idea rests on the observation that different maps have different predictive preferences and should therefore not be treated equally. Second, to focus more on the accurate preferences, we add a residual structure to form the residual dynamic weighting module. Third, we construct a simple online pyramid-redetection module that avoids relying on local search alone and also takes the global view into account. Extensive experiments on both short-term and long-term tracking demonstrate that the proposed tracker performs competitively against many mainstream state-of-the-art trackers.
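The idea of weighting response maps unequally and adding a residual path can be illustrated with a small sketch. The code below is not the paper's network: the confidence score (`peak_sharpness`), the softmax weighting, and the equal-weight identity term are all assumptions chosen only to make the concept concrete, and every function name is hypothetical. In the paper the weights are learned; here they are computed by a hand-written rule.

```python
import math
from typing import List, Tuple

ResponseMap = List[List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax: weights are positive and sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def peak_sharpness(response: ResponseMap) -> float:
    """Assumed confidence score: peak value minus mean value of the map."""
    values = [v for row in response for v in row]
    return max(values) - sum(values) / len(values)


def fuse_response_maps(maps: List[ResponseMap]) -> Tuple[ResponseMap, List[float]]:
    """Fuse maps as sum_k (1/K + w_k) * R_k.

    The 1/K term is an identity-style (residual) path that treats all maps
    equally; w_k is the dynamic weight added on top of it (an assumption).
    """
    weights = softmax([peak_sharpness(m) for m in maps])
    base = 1.0 / len(maps)
    rows, cols = len(maps[0]), len(maps[0][0])
    fused = [
        [sum((base + w) * m[r][c] for w, m in zip(weights, maps)) for c in range(cols)]
        for r in range(rows)
    ]
    return fused, weights


def argmax_2d(response: ResponseMap) -> Tuple[int, int]:
    """Return the (row, col) of the largest value in the map."""
    cells = ((r, c) for r in range(len(response)) for c in range(len(response[0])))
    return max(cells, key=lambda rc: response[rc[0]][rc[1]])


if __name__ == "__main__":
    # Toy maps: one with a sharp central peak, one nearly flat.
    sharp = [[0.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 0.0]]
    flat = [[0.5, 0.4, 0.4], [0.4, 0.4, 0.4], [0.4, 0.4, 0.4]]
    fused, weights = fuse_response_maps([sharp, flat])
    print("weights:", weights)
    print("predicted location:", argmax_2d(fused))
```

With these toy inputs the sharply peaked map receives the larger weight, so the fused peak follows it rather than the nearly flat map.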


Index Terms

Visual tracking via dynamic weighting with pyramid-redetection based Siamese networks
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Tracking
      2. Computer vision tasks
        Scene understanding
  2. Machine learning
    1. Machine learning approaches
      1. Neural networks

Index terms have been assigned to the content through auto-classification.


Published In


Journal of Visual Communication and Image Representation Volume 65, Issue C

Dec 2019

271 pages

ISSN:1047-3203


Elsevier Inc.

Publisher

Academic Press, Inc.

United States
