Abstract
Visual object tracking remains a challenging computer vision problem with numerous real-world applications. Discriminative correlation filter (DCF)-based methods are a recent state-of-the-art approach to this problem. The learning rate used when updating a DCF is typically fixed, regardless of the situation. However, this rate is important for robust tracking, because real-world video sequences involve a variety of dynamic changes, such as occlusions, motion blur, and deformations. In this study, we propose the Meta-Q-learning Correlation Filter (MQCF), a method that uses reinforcement learning to dynamically determine the learning rate of a baseline DCF-based tracker built on hand-crafted Histogram of Oriented Gradients (HOG) features. Incorporating reinforcement learning enables us to autonomously train a function that, given an image patch, outputs a situation-dependent learning rate for the baseline tracker. We evaluated this method on two open benchmarks, OTB-2015 and VOT-2015, and found that our MQCF tracker outperformed a baseline state-of-the-art tracker by 1.8% in Area Under Curve on OTB-2015 and achieved an 8.4% relative gain in Expected Average Overlap on the VOT-2015 challenge. Our results demonstrate the advantages of so-called meta-learning for DCF-based visual object tracking.
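The core idea of the abstract, a value function that maps the current tracking situation to a per-frame learning rate, can be illustrated with a minimal tabular Q-learning sketch. Everything below is a hypothetical stand-in, not the authors' implementation: the discrete states substitute for the paper's HOG-based image-patch states, the `simulated_reward` function substitutes for an overlap-based tracking reward, and the candidate rates are arbitrary.

```python
import numpy as np

# Hypothetical sketch: a tabular Q-learning agent that picks a per-frame
# learning rate for a DCF-style model update. States, transitions, and
# rewards are toy stand-ins for the tracker-derived quantities in the paper.

RATES = [0.005, 0.01, 0.02, 0.04]   # candidate learning rates (the actions)
N_STATES = 3                        # e.g. {stable, blurred, occluded} (assumed)

rng = np.random.default_rng(0)
Q = np.zeros((N_STATES, len(RATES)))  # Q-table: state x action values

def simulated_reward(state, rate):
    """Toy reward: occluded frames (state 2) favor a small update rate,
    stable frames (state 0) favor a larger one. Peaks at the 'right' rate."""
    best = {0: 0.04, 1: 0.02, 2: 0.005}[state]
    return 1.0 - abs(rate - best) / best

alpha, gamma, eps = 0.1, 0.9, 0.1     # step size, discount, exploration rate
for step in range(10000):
    s = int(rng.integers(N_STATES))
    # epsilon-greedy action selection over candidate learning rates
    a = int(rng.integers(len(RATES))) if rng.random() < eps else int(Q[s].argmax())
    r = simulated_reward(s, RATES[a])
    s_next = int(rng.integers(N_STATES))          # toy i.i.d. state transitions
    # standard Q-learning update
    Q[s, a] += alpha * (r + gamma * Q[s_next].max() - Q[s, a])

# The greedy policy then maps each situation to a situation-dependent rate,
# which is the role the learned function plays for the baseline tracker.
policy = {s: RATES[int(Q[s].argmax())] for s in range(N_STATES)}
```

In a real tracker, the state would be derived from the current image patch and the reward from tracking quality (e.g. bounding-box overlap), but the control loop, observe, choose a rate, update the filter, receive a reward, has the same shape as this sketch.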
Acknowledgments
The authors would like to thank the associate editor and anonymous reviewers for their time and constructive comments towards the improvement of this work. This research was partly supported by JSPS KAKENHI (Nos. 17H06310 and 19H04180).
Declaration
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.
Cite this article
Kubo, A., Meshgi, K. & Ishii, S. A Meta-Q-Learning Approach to Discriminative Correlation Filter based Visual Tracking. J Intell Robot Syst 101, 11 (2021). https://doi.org/10.1007/s10846-020-01273-2