
A coupling method of learning structured support correlation filters for visual tracking

  • Original article
  • Published in The Visual Computer

A Correction to this article was published on 04 December 2023


Abstract

Correlation filtering is one of the mainstream approaches to visual target tracking, in part because the introduction of cyclic samples makes filter optimization computationally convenient. However, the usual correlation filtering frameworks, which reduce to a least-squares ridge regression model with various regularizations, emphasize fitting a linear system to the samples themselves and are therefore prone to over-fitting. When the appearance of either the target or the background varies, the credibility of the filtered response decreases. Some researchers have therefore tried to incorporate other models, such as SVMs, to obtain better robustness. In this study, we propose StrucSCF, a natural coupling method that integrates a background-aware structured SVM into the correlation filtering framework; it puts more emphasis on the discrepancy between the target and background samples to enhance the discrimination and robustness of tracking. Meanwhile, to update the structured-SVM-based filters online in real time, we exploit the fast Fourier transform on the circulant samples to speed up solving for the filters. In addition, we extend StrucSCF with Laplacian temporal regularization to demonstrate that it is as extensible as the conventional correlation filtering framework. The proposed StrucSCF achieves competitive performance against the baseline and other advanced methods on mainstream benchmarks.





Notes

  1. Here, for convenience, the samples other than the sole target sample are collectively referred to as negative samples.

  2. We first make clear that all vector variables in the full text are column vectors by default unless expressly stated otherwise; accordingly, a vector carrying the superscript \( ^\textsf{T} \) denotes a row vector.

  3. For clarity and conciseness, we treat the feature matrix as a vector in the following formulation; since the 2-D DFT and the 1-D DFT share the same properties, the derivation extends naturally to the matrix form used in practice.

  4. http://www.vlfeat.org/matconvnet/.

References

  1. Abbass, M.Y., Kwon, K., Kim, N., et al.: A survey on online learning for visual tracking. Vis. Comput. 37(5), 993–1014 (2021)


  2. Fan, C., Zhang, R., Ming, Y.: Mp-ln: motion state prediction and localization network for visual object tracking. Vis. Comput. (2022). https://doi.org/10.1007/s00371-021-02296-y

  3. Zhang, W., Du, Y., Chen, Z., et al.: Robust adaptive learning with siamese network architecture for visual tracking. Vis. Comput. 37(5), 881–894 (2021)


  4. Yang, S., Chen, H., Xu, F., et al.: High-performance uavs visual tracking based on siamese network. Vis. Comput. 38(6), 2107–2123 (2022)


  5. Qu, Z., Shi, H., Tan, S., et al.: A flow-guided self-calibration siamese network for visual tracking. Vis. Comput. (2022). https://doi.org/10.1007/s00371-021-02362-5

  6. Bolme, D.S., Beveridge, J.R., Draper, B.A., et al.: Visual object tracking using adaptive correlation filters. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 2544–2550 (2010)

  7. Danelljan, M., Khan, F.S., Felsberg, M., et al.: Adaptive color attributes for real-time visual tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 1090–1097 (2014)

  8. Bertinetto, L., Valmadre, J., Golodetz, S., et al.: Staple: Complementary learners for real-time tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 1401–1409 (2016)

  9. Lukezic, A., Vojír, T., Zajc, L.C., et al.: Discriminative correlation filter tracker with channel and spatial reliability. Int. J. Comput. Vis. 126(7), 671–688 (2018)


  10. Lukezic, A., Vojir, T., Zajc, L.C., et al.: Discriminative correlation filter with channel and spatial reliability. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 4847–4856 (2017)

  11. Danelljan, M., Häger, G., Khan, F.S., et al.: Adaptive decontamination of the training set: A unified formulation for discriminative visual tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 1430–1438 (2016)

  12. Galoogahi, H.K., Fagg, A., Lucey, S.: Learning background-aware correlation filters for visual tracking. In: IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society, pp. 1144–1152 (2017)

  13. Dai, K., Wang, D., Lu, H., et al.: Visual tracking via adaptive spatially-regularized correlation filters. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Computer Vision Foundation/IEEE, pp. 4670–4679 (2019)

  14. Huang, Z., Fu, Y., Li, C., Lin, F., et al.: Learning aberrance repressed correlation filters for real-time UAV tracking. In: IEEE/CVF International Conference on Computer Vision (ICCV). IEEE Computer Society, pp. 2891–2900 (2019)

  15. Li, Y., Fu, C., Ding, F., et al.: Autotrack: Towards high-performance visual tracking for UAV with automatic spatio-temporal regularization. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 11920–11929 (2020)

  16. Liao, J., Qi, C., Cao, J.: Temporal constraint background-aware correlation filter with saliency map. IEEE Trans. Multim. 23, 3346–3361 (2021)


  17. Li, F., Tian, C., Zuo, W., et al.: Learning spatial-temporal regularized correlation filters for visual tracking. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 4904–4913 (2018)

  18. Zhang, K., Wang, W., Wang, J., et al.: Learning adaptive target-and-surrounding soft mask for correlation filter based visual tracking. IEEE Trans. Circuits Syst. Video Technol. 32(6), 3708–3721 (2022)


  19. Zuo, W., Wu, X., Lin, L., et al.: Learning support correlation filters for visual tracking. IEEE Trans. Pattern Anal. Mach. Intell. 41(5), 1158–1172 (2019)


  20. Sun, Y., Sun, C., Wang, D., et al.: ROI pooled correlation filters for visual tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Computer Vision Foundation/IEEE, pp. 5783–5791 (2019)

  21. Xu, T., Feng, Z., Wu, X., et al.: Learning adaptive discriminative correlation filters via temporal consistency preserving spatial feature selection for robust visual object tracking. IEEE Trans. Image Process. 28(11), 5596–5609 (2019)


  22. Lin, F., Fu, C., He, Y., et al.: Learning temporary block-based bidirectional incongruity-aware correlation filters for efficient UAV object tracking. IEEE Trans. Circuits Syst. Video Technol. 31(6), 2160–2174 (2021)


  23. Wang, Y., Hu, S., Wu, S.: Object tracking based on Huber loss function. Vis. Comput. 35, 1641–1654 (2019)


  24. Ersi, E.F., Nooghabi, M.K.: Revisiting correlation-based filters for low-resolution and long-term visual tracking. Vis. Comput. 35(10), 1447–1459 (2019)


  25. Miao, Q., Xu, C., Li, F., et al.: Delayed rectification of discriminative correlation filters for visual tracking. Vis. Comput. (2022). https://doi.org/10.1007/s00371-022-02401-9

  26. Fan, J., Yang, X., Lu, R., et al.: Long-term visual tracking algorithm for uavs based on kernel correlation filtering and surf features. Vis. Comput. (2022). https://doi.org/10.1007/s00371-021-02331-y

  27. Wang, M., Liu, Y., Huang, Z.: Large margin object tracking with circulant feature maps. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 4800–4808 (2017)

  28. Platt, J.: Sequential minimal optimization: a fast algorithm for training support vector machines. In: Advances in Kernel Methods: Support Vector Learning (1998)

  29. Henriques, J.F., Caseiro, R., Martins, P. et al.: Exploiting the circulant structure of tracking-by-detection with kernels. In: European Conference on Computer Vision (ECCV) Part IV. Springer, pp. 702–715 (2012)

  30. Henriques, J.F., Caseiro, R., Martins, P., et al.: High-speed tracking with kernelized correlation filters. IEEE Trans. Pattern Anal. Mach. Intell. 37(3), 583–596 (2015)


  31. Danelljan, M., Häger, G., Khan, F.S., et al.: Learning spatially regularized correlation filters for visual tracking. In: IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society, pp. 4310–4318 (2015)

  32. Danelljan, M., Robinson, A., Khan, F. S., et al.: Beyond correlation filters: Learning continuous convolution operators for visual tracking. In: European Conference on Computer Vision (ECCV) Part V. Springer, pp. 472–488 (2016)

  33. Mueller, M., Smith, N., Ghanem, B.: Context-aware correlation filter tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 1387–1395 (2017)

  34. Hare, S., Saffari, A., Torr, P.H.S.: Struck: Structured output tracking with kernels. In: IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society, pp. 263–270 (2011)

  35. Ning, J., Yang, J., Jiang, S., et al.: Object tracking via dual linear structured SVM and explicit feature map. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 4266–4274 (2016)

  36. Ramanan, D.: Dual coordinate solvers for large-scale structural svms. CoRR abs/1312.1743 (2013). http://arxiv.org/abs/1312.1743

  37. Zhang, K., Zhang, L., Yang, M.: Real-time compressive tracking. In: European Conference on Computer Vision (ECCV) Part III. Springer, pp. 864–877 (2012)

  38. Zhang, J., Ma, S., Sclaroff, S.: MEEM: robust tracking via multiple experts using entropy minimization. In: European Conference on Computer Vision (ECCV) Part VI. Springer, pp. 188–203 (2014)

  39. Rodriguez, A., Boddeti, V.N., Kumar, B.V., et al.: Maximum margin correlation filter: a new approach for localization and classification. IEEE Trans. Image Process. 22(2), 631–643 (2013)


  40. Suykens, J., Vandewalle, J.: Least squares support vector machine classifiers. Neural Process. Lett. 9(3), 293–300 (1999)


  41. Ye, J., Tao, X.: Svm versus least squares svm. J. Mach. Learn. Res. 2, 644–651 (2007)


  42. Lee, C.P., Lin, C.J.: A study on l2-loss (squared hinge-loss) multiclass svm. Neural Comput. 25(5), 1302–1323 (2013)


  43. Boyd, S.P., Parikh, N., Chu, E., et al.: Distributed optimization and statistical learning via the alternating direction method of multipliers. Found. Trends Mach. Learn. 3(1), 1–122 (2011)


  44. Felzenszwalb, P.F., Girshick, R.B., McAllester, D.A., et al.: Object detection with discriminatively trained part-based models. IEEE Trans. Pattern Anal. Mach. Intell. 32(9), 1627–1645 (2010)


  45. Simonyan K., Zisserman, A.: Very deep convolutional networks for large-scale image recognition. In: International Conference on Learning Representations (ICLR) (2015). http://arxiv.org/abs/1409.1556

  46. Vedaldi A., Lenc, K.: Matconvnet: Convolutional neural networks for MATLAB. In: Annual ACM Conference on Multimedia Conference (MM). ACM, pp. 689–692 (2015)

  47. Wu, Y., Lim, J., Yang, M.: Object tracking benchmark. IEEE Trans. Pattern Anal. Mach. Intell. 37(9), 1834–1848 (2015)


  48. Kristan, M., Leonardis, A., He, Z.: The visual object tracking VOT2017 challenge results. In: IEEE International Conference on Computer Vision Workshops (ICCVW). IEEE Computer Society, pp. 1949–1972 (2017)

  49. Kristan, M., Leonardis, A., Matas, J.: The sixth visual object tracking VOT2018 challenge results. In: European Conference on Computer Vision Workshops (ECCVW) Part I. Springer, pp. 3–53 (2018)

  50. Liang, P., Blasch, E., Ling, H.: Encoding color information for visual tracking: algorithms and benchmark. IEEE Trans. Image Process. 24(12), 5630–5644 (2015)


  51. Mueller, M., Smith, N., Ghanem, B.: A benchmark and simulator for UAV tracking. In: European Conference on Computer Vision (ECCV) Part I. Springer, pp. 445–461 (2016)

  52. Wu, Y., Lim, J., Yang, M.: Online object tracking: a benchmark. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pp. 2411–2418 (2013)

  53. Li, Y., Zhu, J., Hoi, S. C. H., et al.: Robust estimation of similarity transformation for visual object tracking. In: AAAI Conference on Artificial Intelligence (AAAI). AAAI Press, pp. 8666–8673 (2019)

  54. Li, Y., Zhu, J.: A scale adaptive kernel correlation filter tracker with feature integration. In: European Conference on Computer Vision Workshops (ECCVW) Part II. Springer, pp. 254–265 (2014)

  55. Danelljan, M., Häger, G., Khan, F.S., et al.: Accurate scale estimation for robust visual tracking. In: British Machine Vision Conference (BMVC). BMVA Press (2014). http://www.bmva.org/bmvc/2014/papers/paper038/index.html

  56. Danelljan, M., Bhat, G., Khan, F.S., et al.: ECO: Efficient convolution operators for tracking. In: IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). IEEE Computer Society, pp. 6931–6939 (2017)

  57. Feng, W., Han, R., Guo, Q., et al.: Dynamic saliency-aware regularization for correlation filter-based object tracking. IEEE Trans. Image Process. 28(7), 3232–3245 (2019)


  58. Song, Y., Ma, C., Gong, L., et al.: CREST: convolutional residual learning for visual tracking. In: IEEE International Conference on Computer Vision (ICCV). IEEE Computer Society, pp. 2574–2583 (2017)

  59. Li, X., Ma, C., Wu, B., et al.: Target-aware deep tracking. In: IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Computer Vision Foundation / IEEE, pp. 1369–1378 (2019)

  60. Parikh, N., Boyd, S.P.: Proximal algorithms. Found. Trends Optim. 1(3), 127–239 (2014)


Download references

Acknowledgements

This work was partially supported by the National Natural Science Foundation of China under Grant 51935005, the Natural Science Foundation of Heilongjiang Province, China, under Grant LH2021F023, and the Basic Scientific Research Program (Grant No. JCKY20200603C010).

Author information

Corresponding author

Correspondence to Gong Li.

Ethics declarations

Conflict of interest

All the authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

The original online version of this article was revised: there were errors in Appendix C.

Appendices

Appendix A

1.1 Convergence of {\( \varvec{z}^\kappa \)}

Here we introduce an operator called the proximal operator [60]. For a closed proper convex function \( f:{\mathbb {R}}^n\rightarrow {\mathbb {R}}\bigcup \{-\infty ,\infty \} \), the proximal operator \( {\textbf {prox}}_{\lambda f}:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n \) is

$$\begin{aligned} {\textbf {prox}}_{\lambda f}\left( \varvec{v}\right) =\arg \min _{\varvec{x}}f\left( \varvec{x}\right) +\left( 1/2\lambda \right) \left\| \varvec{x}-\varvec{v}\right\| ^2_2, \end{aligned}$$
(A.1)

where \( \lambda \ge 0 \), and \( \left( 1/2\lambda \right) \) is short for \( \left( 1/(2\lambda )\right) \).

If f is the indicator function \( \textrm{I}_{{\mathbb {R}}_+^n}\left( \varvec{x}\right) =\left\{ \begin{aligned}0&,x_i \ge 0,\forall i\\\infty&,\textrm{otherwise} \end{aligned}\right. \), the proximal operator becomes \( {\textbf {prox}}_{\lambda f}\left( \varvec{v}\right) =\Pi _{{\mathbb {R}}_+^n}\left( \varvec{v}\right) =\left[ \varvec{v}\right] ^+ \), where \( \Pi _C \) denotes the projection mapping onto the set C.
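As a quick numerical illustration (a NumPy sketch with made-up values, not code from the paper), the proximal operator of this indicator function can be checked to coincide with the projection \( \left[ \varvec{v}\right] ^+ \):

```python
import numpy as np

# Sketch of (A.1) for f = indicator of the nonnegative orthant: the proximal
# operator reduces to elementwise clipping at zero, independent of lambda.
def prox_indicator_nonneg(v):
    """Projection onto R^n_+, i.e. [v]^+ = max(v, 0)."""
    return np.maximum(v, 0.0)

v = np.array([-1.5, 0.0, 2.3, -0.2])
p = prox_indicator_nonneg(v)
print(p)  # elementwise max(v, 0)

# Sanity check against the defining minimization: no feasible x (x >= 0)
# achieves a smaller value of ||x - v||^2 than the projection does.
rng = np.random.default_rng(0)
for _ in range(100):
    x = np.abs(rng.normal(size=4))
    assert np.sum((p - v) ** 2) <= np.sum((x - v) ** 2) + 1e-12
```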

By analogy with the proximal operator, we define an operator \( {{\textbf {prox}}}_{\lambda f;g}: {\mathbb {R}}^n\rightarrow {\mathbb {R}}^n \) as follows:

$$\begin{aligned} {\textbf {prox}}_{\lambda f;g}\left( \varvec{v}\right) \overset{\triangle }{=}\arg \min _{\varvec{x}}f\left( \varvec{x}\right) +\left( 1/2\lambda \right) \left\| g\left( \varvec{x}\right) -\varvec{v}\right\| _2^2, \end{aligned}$$
(A.2)

where \( f:{\mathbb {R}}^n\rightarrow {\mathbb {R}} \bigcup \left\{ \infty \right\} \) is a differentiable closed convex function, and \( g:{\mathbb {R}}^n\rightarrow {\mathbb {R}}^n \) is an affine function \( g\left( \varvec{x}\right) =\varvec{Ax}+\varvec{b},\varvec{A}\in {\mathbb {R}}^{n\times n}, \varvec{b}\in {\mathbb {R}}^n\). Note that if we let \( \varvec{v}^*=\varvec{Ax}^*+\varvec{b}\), where \(\varvec{x}^*=\arg \min _{\varvec{x}}f\left( \varvec{x}\right) \), then \( \varvec{v}^*=g\left( {\textbf {prox}}_{\lambda f;g}\left( \varvec{v}^*\right) \right) \). Defining the operator \( T=g\circ {\textbf {prox}}_{\lambda f;g} \), we see that T has a fixed point, since \( \varvec{v}^*=T\varvec{v}^* \); that is, \( \varvec{v}^* \) is a fixed point of T. We denote the set of fixed points of T by \( \textrm{Fix}\left( T\right) \), so \( \varvec{v}^*\in \textrm{Fix}\left( T\right) \). We now prove that
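The fixed-point property \( \varvec{v}^*=g\left( {\textbf {prox}}_{\lambda f;g}\left( \varvec{v}^*\right) \right) \) can be verified numerically. The sketch below uses an illustrative choice not taken from the paper: a quadratic \( f(\varvec{x})=\frac{1}{2}\Vert \varvec{x}-\varvec{c}\Vert ^2 \) with minimizer \( \varvec{x}^*=\varvec{c} \) and a random affine g, for which the inner minimization in (A.2) has a closed form:

```python
import numpy as np

# Illustrative check of v* = g(prox_{lambda f; g}(v*)) with
# f(x) = 0.5*||x - c||^2 (so x* = c) and affine g(x) = A x + b.
rng = np.random.default_rng(5)
n, lam = 5, 0.4
A = rng.normal(size=(n, n))
b = rng.normal(size=n)
c = rng.normal(size=n)                   # x* = argmin f

def prox(v):
    # argmin_x 0.5*||x - c||^2 + (1/(2*lam)) * ||A x + b - v||^2,
    # solved from the normal equations (I + (1/lam) A^T A) x = c + (1/lam) A^T (v - b)
    M = np.eye(n) + (1.0 / lam) * A.T @ A
    return np.linalg.solve(M, c + (1.0 / lam) * A.T @ (v - b))

T = lambda v: A @ prox(v) + b            # T = g o prox_{lambda f; g}
v_star = A @ c + b                       # v* = g(x*)
assert np.allclose(T(v_star), v_star)    # v* is a fixed point of T
print("v* = T v* verified")
```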

Proposition 1

T is an averaged operator; more precisely, it is a 1/2-averaged operator (also called firmly nonexpansive).

Remark

T being a \( \lambda \)-averaged operator means that there exists a nonexpansive operator \( T' \) and \( \lambda \in \left( 0,1\right) \) such that \( T=\left( 1-\lambda \right) I+\lambda T' \), where I is the identity mapping. According to \( T=\left( 1/2\right) I+\left( 1/2\right) T' \iff T'=2T-I\), we get

$$\begin{aligned} \begin{aligned}&\left\| \varvec{v}_1-\varvec{v}_2\right\| _2^2\ge \left\| \left( 2T-I\right) \varvec{v}_1-\left( 2T-I\right) \varvec{v}_2\right\| _2^2\\ \iff&\left\| \varvec{v}_1-\varvec{v}_2\right\| _2^2\ge \left\| \varvec{v}_1-\varvec{v}_2-\left( T\varvec{v}_1-T\varvec{v}_2\right) \right\| _2^2\\&\qquad \qquad \qquad +\left\| T\varvec{v}_1-T\varvec{v}_2\right\| _2^2\\ \iff&\left<\varvec{v}_1-\varvec{v}_2,T\varvec{v}_1-T\varvec{v}_2\right>\ge \left\| T\varvec{v}_1-T\varvec{v}_2\right\| _2^2,\\ \forall \varvec{v}_1,&\varvec{v}_2\in \textrm{dom}T. \end{aligned} \end{aligned}$$
(A.3)

Firm nonexpansiveness means that T satisfies condition (A.3); e.g., \(\Pi _{{\mathbb {R}}_+^n} \) is firmly nonexpansive, since \( \left<\varvec{x}_1-\varvec{x}_2,\Pi _{{\mathbb {R}}_+^n}\varvec{x}_1-\Pi _{{\mathbb {R}}_+^n}\varvec{x}_2\right>\ge \left\| \Pi _{{\mathbb {R}}_+^n}\varvec{x}_1-\Pi _{{\mathbb {R}}_+^n}\varvec{x}_2\right\| _2^2 \). Clearly, the averaged operators form a subset of the nonexpansive operators, and the composition of averaged operators is again an averaged operator.
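A quick empirical sanity check of (A.3) for \( \Pi _{{\mathbb {R}}_+^n} \) (an illustrative NumPy sketch, not code from the paper):

```python
import numpy as np

# Empirical check of the firm-nonexpansiveness condition (A.3) for the
# projection onto the nonnegative orthant:
#   <x1 - x2, Pi x1 - Pi x2> >= ||Pi x1 - Pi x2||^2  for all x1, x2.
proj = lambda x: np.maximum(x, 0.0)

rng = np.random.default_rng(1)
for _ in range(1000):
    x1, x2 = rng.normal(size=(2, 8))
    d = proj(x1) - proj(x2)
    assert np.dot(x1 - x2, d) >= np.dot(d, d) - 1e-12
print("(A.3) holds for 1000 random pairs")
```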

Proof

Given arbitrary \( \varvec{v}_1,\varvec{v}_2\in {\mathbb {R}}^n \), let

$$\begin{aligned} \varvec{x}_1= & {} {\textbf {prox}}_{\lambda f;g}\left( \varvec{v}_1\right) \nonumber \\= & {} \arg \min _{\varvec{x}}f\left( \varvec{x}\right) +\left( 1/2\lambda \right) \left\| \varvec{Ax}+\varvec{b}-\varvec{v}_1\right\| _2^2, \end{aligned}$$
(A.4)
$$\begin{aligned} \varvec{x}_2= & {} {\textbf {prox}}_{\lambda f;g}\left( \varvec{v}_2\right) \nonumber \\= & {} \arg \min _{\varvec{x}}f\left( \varvec{x}\right) +\left( 1/2\lambda \right) \left\| \varvec{Ax}+\varvec{b}-\varvec{v}_2\right\| _2^2. \end{aligned}$$
(A.5)

According to the optimality conditions, we have

$$\begin{aligned} \varvec{0}= & {} \nabla f\left( \varvec{x}_1\right) +\left( 1/\lambda \right) \varvec{A}^\textsf{T}\left( \varvec{Ax}_1+\varvec{b}-\varvec{v}_1\right) \nonumber \\{} & {} \iff \nabla f\left( \varvec{x}_1\right) =\left( 1/\lambda \right) \varvec{A}^\textsf{T}\left( \varvec{v}_1-\varvec{Ax}_1-\varvec{b}\right) , \end{aligned}$$
(A.6)
$$\begin{aligned} \varvec{0}= & {} \nabla f\left( \varvec{x}_2\right) +\left( 1/\lambda \right) \varvec{A}^\textsf{T}\left( \varvec{Ax}_2+\varvec{b}-\varvec{v}_2\right) \nonumber \\{} & {} \iff \nabla f\left( \varvec{x}_2\right) =\left( 1/\lambda \right) \varvec{A}^\textsf{T}\left( \varvec{v}_2-\varvec{Ax}_2-\varvec{b}\right) . \end{aligned}$$
(A.7)

Since f is a differentiable convex function, \( \nabla f \) is a monotone operator, which means that

$$\begin{aligned} \left<\varvec{x}_1-\varvec{x}_2,\nabla f\left( \varvec{x}_1\right) -\nabla f\left( \varvec{x}_2\right) \right>\ge 0, \end{aligned}$$
(A.8)

Substituting (A.6) and (A.7) into the above inequality and using \( T\varvec{v}_1=\varvec{Ax}_1 + \varvec{b},T\varvec{v}_2=\varvec{Ax}_2 + \varvec{b}\), we get

$$\begin{aligned}&\left<\varvec{x}_1-\varvec{x}_2,\varvec{A}^\textsf{T}\left( \varvec{v}_1-\varvec{Ax}_1\right) -\varvec{A}^\textsf{T}\left( \varvec{v}_2-\varvec{Ax}_2\right) \right>\ge 0\nonumber \\&\iff \left<\varvec{Ax}_1-\varvec{Ax}_2,\varvec{v}_1-\varvec{Ax}_1-\left( \varvec{v}_2-\varvec{Ax}_2\right) \right>\ge 0\nonumber \\&\iff \left<T\varvec{v}_1-T\varvec{v}_2,\varvec{v}_1-T\varvec{v}_1-\left( \varvec{v}_2-T\varvec{v}_2\right) \right>\ge 0\nonumber \\&\iff \left<\varvec{v}_1-\varvec{v}_2,T\varvec{v}_1-T\varvec{v}_2\right>\ge \left\| T\varvec{v}_1-T\varvec{v}_2\right\| _2^2. \end{aligned}$$
(A.9)

So T is firmly nonexpansive; in other words, it is a 1/2-averaged operator. \(\square \)

Since T has a fixed point \( \varvec{v}^* \), for an arbitrary \( \varvec{v}^1 \in {\mathbb {R}}^n \),

$$\begin{aligned}&\left\| \varvec{v}^1-\varvec{v}^*\right\| ^2_2\ge \left\| T\varvec{v}^1-\varvec{v}^*\right\| ^2_2+\left\| \varvec{v}^1-T\varvec{v}^1\right\| ^2_2\nonumber \\ \ge&\left\| T\varvec{v}^2-\varvec{v}^*\right\| ^2_2+\left\| \varvec{v}^2-T\varvec{v}^2\right\| ^2_2+\left\| \varvec{v}^1-T\varvec{v}^1\right\| ^2_2\nonumber \\ \ge&\ldots \ge \left\| T\varvec{v}^k-\varvec{v}^*\right\| ^2_2+\sum _{i=1}^{k}\left\| \varvec{v}^i-T\varvec{v}^i\right\| ^2_2, \end{aligned}$$
(A.10)

so \( \sum _{i=1}^{\infty }\left\| \varvec{v}^i-T\varvec{v}^i\right\| ^2_2 \) is bounded; hence \( \left\| \varvec{v}^k-T\varvec{v}^k\right\| _2\rightarrow 0 \) as \( k\rightarrow \infty \), which means that the sequence \( \left\{ \varvec{v}^k\right\} _{k=1}^\infty \) generated by the iteration \( \varvec{v}^{k+1}=T\varvec{v}^k \) is bounded and converges to a fixed point of T, denoted by \( \varvec{v}'=\lim \limits _{k\rightarrow \infty } \varvec{v}^k\in \textrm{Fix}\left( T\right) \). Consider a bounded subset D of \( {\mathbb {R}}^n \) that contains \( \textrm{Fix}\left( T\right) \) and whose intersection with \( {\mathbb {R}}^n_+ \) is nonempty. For an arbitrary \( \varvec{x}^1\in D\bigcap {\mathbb {R}}_+^n \), the sequence \( \left\{ \varvec{x}^k\right\} _{k=1}^\infty \) on \( {\mathbb {R}}_+^n \) generated by the iteration \( \varvec{x}^{k+1}=\Pi _{{\mathbb {R}}_+^n}\left( T\varvec{x}^{k}\right) \) is bounded. Let \( S=\Pi _{{\mathbb {R}}_+^n}\circ T \); then S, as the composition of averaged operators, is also an averaged operator. Let \( {\overline{{\textbf {conv}}}}D \) denote the closure of the convex hull of \( D\bigcup \left\{ \varvec{x}^k\right\} _{k=1}^\infty \); then \( \left\{ \varvec{x}^k\right\} _{k=1}^\infty \subset \overline{{\textbf {conv}}}D \bigcap {\mathbb {R}}_+^n \), and S is a mapping from \( \overline{{\textbf {conv}}}D \bigcap {\mathbb {R}}_+^n \) to itself. Since \( \overline{{\textbf {conv}}}D \bigcap {\mathbb {R}}_+^n \) is bounded and closed and therefore compact, there exists a convergent subsequence \( \left\{ \varvec{x}^{k_i}\right\} _{i=1}^\infty \) such that \( \varvec{u}=\lim \limits _{i\rightarrow \infty }\varvec{x}^{k_i} \in \overline{{\textbf {conv}}}D \bigcap {\mathbb {R}}_+^n \). We then introduce a family of operators \( S_i\varvec{x}^{k_i}=\left( 1-\lambda _i\right) \varvec{x}^{k_1}+\lambda _iS\varvec{x}^{k_i} \), where \( \lambda _i \in \left( 0,1\right) \) is set as

$$\begin{aligned} \lambda _i=\frac{\varvec{x}^{k_i}-\varvec{x}^{k_1}}{S\varvec{x}^{k_i}-\varvec{x}^{k_1}}=\frac{\varvec{x}^{k_i}-\varvec{x}^{k_1}}{\varvec{x}^{k_i+1}-\varvec{x}^{k_1}}\ge \frac{\varvec{x}^{k_i}-\varvec{x}^{k_1}}{S\varvec{x}^{k_{i+1}}-\varvec{x}^{k_1}}; \end{aligned}$$
(A.11)

hence, \( \lambda _i\rightarrow 1 \) as \( i\rightarrow \infty \) and \( \varvec{x}^{k_i}=S_i\varvec{x}^{k_i} \). So

$$\begin{aligned} \begin{aligned} \left\| S\varvec{x}^{k_i}-\varvec{x}^{k_i}\right\| _2&=\left\| S\varvec{x}^{k_i}-S_i\varvec{x}^{k_i}\right\| _2\\&=\left\| S\varvec{x}^{k_i}-\left( 1-\lambda _i\right) \varvec{x}^{k_1}-\lambda _iS\varvec{x}^{k_i}\right\| _2\\&=\left\| \left( 1-\lambda _i\right) \left( S\varvec{x}^{k_i}-\varvec{x}^{k_1}\right) \right\| _2\\&=\left( 1-\lambda _i\right) \left\| S\varvec{x}^{k_i}-\varvec{x}^{k_1}\right\| _2\rightarrow 0\left( i\rightarrow \infty \right) , \end{aligned} \end{aligned}$$
(A.12)

thereby

$$\begin{aligned} \begin{aligned}&\left\| \varvec{u}-S\varvec{u}\right\| _2=\left\| \varvec{u}-\varvec{x}^{k_i}-S\varvec{u}+S\varvec{x}^{k_i}-S\varvec{x}^{k_i}+\varvec{x}^{k_i}\right\| _2\\ \le&\left\| \varvec{u}-\varvec{x}^{k_i}\right\| _2+\left\| S\varvec{u}-S\varvec{x}^{k_i}\right\| _2+\left\| S\varvec{x}^{k_i}-\varvec{x}^{k_i}\right\| _2\\ \le&\;2\left\| \varvec{u}-\varvec{x}^{k_i}\right\| _2+\left\| S\varvec{x}^{k_i}-\varvec{x}^{k_i}\right\| _2\rightarrow 0 \left( i\rightarrow \infty \right) , \end{aligned} \end{aligned}$$
(A.13)

that is, \( \varvec{u}=S\varvec{u} \), so \( \varvec{u} \) is a fixed point of S. Moreover,

$$\begin{aligned} \begin{aligned}&\left\| \varvec{x}^{k+1}-\varvec{u}\right\| _2=\left\| S\varvec{x}^{k}-\varvec{u}\right\| _2\le \left\| \varvec{x}^{k}-\varvec{u}\right\| _2\le \left\| \varvec{x}^{k_i}-\varvec{u}\right\| _2,\\&\;\forall \; k>k_i, \end{aligned} \end{aligned}$$
(A.14)

so \( \lim \limits _{k\rightarrow \infty }\left\| \varvec{x}^{k}-\varvec{u}\right\| _2= \lim \limits _{i\rightarrow \infty }\left\| \varvec{x}^{k_i}-\varvec{u}\right\| _2=0\); that is, the sequence \( \left\{ \varvec{x}^k\right\} \) generated by the iteration \( \varvec{x}^{k+1}=S\varvec{x}^{k} \) converges to \( \varvec{u} \).
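The whole argument can be exercised on a toy instance. In the sketch below (all choices illustrative, not from the paper), T is the firmly nonexpansive proximal map of \( \frac{1}{2}\Vert \varvec{x}-\varvec{a}\Vert ^2 \), \( S=\Pi _{{\mathbb {R}}_+^n}\circ T \), and the iteration \( \varvec{x}^{k+1}=S\varvec{x}^k \) converges to a fixed point of S:

```python
import numpy as np

# Toy instance of the fixed-point argument: T is the proximal map of
# f(x) = 0.5*||x - a||^2, namely T(v) = (v + lam*a)/(1 + lam), which is firmly
# nonexpansive with unique fixed point a. With S = Pi o T, the iteration
# x^{k+1} = S(x^k) converges to the fixed point of S, here [a]^+ = max(a, 0).
a = np.array([1.0, -2.0, 0.5])
lam = 0.7
T = lambda v: (v + lam * a) / (1.0 + lam)
S = lambda v: np.maximum(T(v), 0.0)   # projection onto R^n_+ composed with T

x = np.array([5.0, 5.0, 5.0])
for _ in range(200):
    x = S(x)
print(x)  # converges to [a]^+ = [1.0, 0.0, 0.5]
```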

Proposition 2

{\( \varvec{z}^\kappa \)} is convergent.

Proof

Let

$$\begin{aligned} \begin{aligned} {\mathcal {L}}_0\left( \varvec{f},\varvec{\gamma };\lambda ,\varvec{z}^\kappa \right) =&\frac{1}{2}\left\| \varvec{1}\varvec{x}^\textsf{T}\varvec{f}-\varvec{Xf}-\varvec{y}-\varvec{z}^\kappa \right\| ^2+\frac{\lambda }{2}\left\| \varvec{f}\right\| ^2\\&+\gamma ^\textsf{T}\overline{\varvec{P}}\varvec{f}. \end{aligned} \end{aligned}$$
(A.15)

In Algorithm 1, with regard to the kth iterates \( \varvec{f}^k,\varvec{g}^k \) in the ADMM steps, we have \( \varvec{f}^k-\varvec{g}^k \rightarrow \varvec{0}\), and

$$\begin{aligned}&\frac{1}{2}\left\| \varvec{1}\varvec{x}^\textsf{T}\varvec{g}^k\right\| ^2\!-\!\left<\varvec{1}\varvec{x}^\textsf{T}\varvec{g}^k,\varvec{Xg}^k\!+\!\varvec{y}+\varvec{z}\right>\!+\!\frac{1}{2n}\left\| \varvec{1}^\textsf{T}\varvec{Xg}^k\right\| ^2\nonumber \\&-\frac{1}{2n}\left\| \varvec{1}^\textsf{T}\varvec{Xf}^k\right\| ^2+\frac{1}{2}\left\| \varvec{Xf}^k+\varvec{y}+\varvec{z}\right\| ^2+\frac{\lambda }{2}\left\| \varvec{f}^k\right\| ^2\nonumber \\&\rightarrow \min _{\varvec{f}}{\mathcal {L}}_0\left( \varvec{f},\varvec{\gamma }|_{z^\kappa };\lambda ,\varvec{z}^\kappa \right) , \end{aligned}$$
(A.16)

where \( \varvec{\gamma }^k|_{\varvec{z}^\kappa } =\arg \max _{\varvec{\gamma }} \inf _{\varvec{f}} {\mathcal {L}}_0\left( \varvec{f},\varvec{\gamma };\lambda ,\varvec{z}^\kappa \right) \), so \( \varvec{f}^k, \varvec{g}^k\rightarrow \arg \min _{\varvec{f}}{\mathcal {L}}_0\left( \varvec{f},\varvec{\gamma }^k|_{\varvec{z}^\kappa };\lambda ,\varvec{z}^\kappa \right) \).

Let \( f\left( \varvec{f}\right) =\left\| \varvec{f}\right\| ^2+\left( 2/\lambda \right) \left( \varvec{\gamma }^k|_{\varvec{z}^\kappa }\right) ^\textsf{T}\overline{\varvec{P}}\varvec{f},g\left( \varvec{f}\right) =\left( \varvec{1}\varvec{x}^\textsf{T}-\varvec{X}\right) \varvec{f}-\varvec{y} \), and follow the iteration steps below

$$\begin{aligned} \begin{aligned} \left\{ \begin{aligned} \varvec{f}^k|_{\varvec{z}^\kappa }&={\textbf {prox}}_{\lambda f;g}\left( \varvec{z}^\kappa \right) =\arg \min _{\varvec{f}}{\mathcal {L}}_0\left( \varvec{f},\varvec{\gamma }^k|_{\varvec{z}^\kappa };\lambda ,\varvec{z}^\kappa \right) \\ \varvec{z}^{\kappa +.5}&=g\big (\varvec{f}^k|_{\varvec{z}^\kappa }\big )=\varvec{1}\varvec{x}^\textsf{T}\varvec{f}^k|_{\varvec{z}^\kappa }-\varvec{X}\varvec{f}^k|_{\varvec{z}^\kappa }-\varvec{y}\\ \varvec{z}^{\kappa +1}&=\Pi _{{\mathbb {R}}_+^n}\left( \varvec{z}^{\kappa +.5}\right) \end{aligned} \right. \end{aligned} \end{aligned}$$
(A.17)

Since \( \varvec{z}^{\kappa +1}=\Pi _{{\mathbb {R}}_+^n}\left( \varvec{z}^{\kappa +.5}\right) =\Pi _{{\mathbb {R}}_+^n}\left( g\left( {\textbf {prox}}_{\lambda f;g}\left( \varvec{z}^\kappa \right) \right) \right) =\Pi _{{\mathbb {R}}_+^n}\left( T\varvec{z}^\kappa \right) =S\varvec{z}^\kappa \), the sequence {\( \varvec{z}^\kappa \)} is convergent. \(\square \)

Appendix B

1.1 Derivation of \( \widehat{\varvec{f}_l} \)

With \( \varvec{g}_l,l=1,2,\ldots ,L \) fixed, set the partial derivatives of (20) with respect to \( \varvec{f}_l,l=1,2,\ldots ,L \) to zero:

$$\begin{aligned} \begin{aligned} \frac{\partial {\mathcal {L}}_p}{\partial \varvec{f}_l}=&\;\varvec{X}_l^\textsf{T}\left( \varvec{y}+\varvec{z}+\varvec{X}_l\varvec{f}_l+\sum _{i\ne l}^{L}\varvec{X}_i\varvec{f}_i\right) \\&-\frac{1}{n}\varvec{X}_l^\textsf{T}\varvec{1}\left( \varvec{1}^\textsf{T}\varvec{X}_l\varvec{f}_l+\varvec{1}^\textsf{T}\sum _{i\ne l}^L\varvec{X}_i\varvec{f}_i\right) \\&+\lambda \varvec{f}_l+\rho \left( \varvec{f}_l-\varvec{g}_l+\varvec{\nu }_l\right) =\varvec{0},l=1,\ldots ,L. \end{aligned} \end{aligned}$$
(B.1)

Then take the DFT of both sides of (B.1) to get

$$\begin{aligned} \begin{aligned} \hspace{-0.07in}\varvec{0}=&\;\widehat{\varvec{x}_l}\odot \left( \widehat{\varvec{y}}+\widehat{\varvec{z}}+\widehat{\varvec{x}_l}^*\odot \widehat{\varvec{f}_l}+\sum _{i\ne l}^{L}\widehat{\varvec{x}_i}^*\odot \widehat{\varvec{f}_i}\right) \\&-\widehat{\varvec{x}_l}\!\odot \!\left[ 1,0,\ldots ,0\right] ^\textsf{T}\odot \left( \widehat{\varvec{x}_l}^*\!\odot \!\widehat{\varvec{f}_l}\!+\!\sum _{i\ne l}^L\widehat{\varvec{x}_i}^*\!\odot \!\widehat{\varvec{f}_i}\right) \\&+\lambda \widehat{\varvec{f}_l}+\rho \left( \widehat{\varvec{f}_l}-\widehat{\varvec{g}_l}+\widehat{\varvec{\nu }_l}\right) ,l=1,2,\ldots ,L. \end{aligned} \end{aligned}$$
(B.2)

To solve the above nL equations, let \( \widehat{\varvec{\psi }_l}=-\widehat{\varvec{x}_l}\odot \left( \widehat{\varvec{y}}+\widehat{\varvec{z}}\right) +\rho \left( \widehat{\varvec{g}_l}-\widehat{\varvec{\nu }_l}\right) \) and

$$\begin{aligned} \varvec{s}=\sum _{i=1}^{L}\widehat{\varvec{x}_i}^*\odot \widehat{\varvec{f}_i}, \end{aligned}$$
(B.3)

we have

$$\begin{aligned} \widehat{\varvec{f}_l}=\frac{1}{\rho +\lambda }\left( \widehat{\varvec{\psi }_l}\!-\!\widehat{\varvec{x}_l}\odot \left( 0,1,\ldots ,1\right) ^\textsf{T}\odot \varvec{s}\right) ,l\!=\!1,\ldots ,L; \end{aligned}$$
(B.4)

Then, plugging (B.4) into (B.3) gives

$$\begin{aligned} \varvec{s}\!=\!\frac{1}{\rho +\lambda }\sum _{i=1}^{L}\widehat{\varvec{x}_i}^*\odot \left( \widehat{\varvec{\psi }_i}\!-\!\widehat{\varvec{x}_i}\odot \left( 0,1,\ldots ,1\right) ^\textsf{T}\odot \varvec{s}\right) , \end{aligned}$$
(B.5)

that is,

$$\begin{aligned} \varvec{s}=\frac{\sum _{i=1}^{L}\widehat{\varvec{x}_i}^*\odot \widehat{\varvec{\psi }_i}}{\left( \rho +\lambda \right) \varvec{1}+\sum _{i=1}^{L}\widehat{\varvec{x}_i}\odot \left( 0,1,\ldots ,1\right) ^\textsf{T}\odot \widehat{\varvec{x}_i}^*}. \end{aligned}$$
(B.6)

Plugging (B.6) into (B.4) and setting \( \widehat{\varvec{\upsilon }_l}=\widehat{\varvec{x}_l}^*\odot \left( 0,1,\ldots ,1\right) ^\textsf{T}\odot \widehat{\varvec{\psi }_l},\widehat{\varvec{\omega }_l}=\widehat{\varvec{x}_l}^*\odot \left( 0,1,\ldots ,1\right) ^\textsf{T}\odot \widehat{\varvec{x}_l} \), we get

$$\begin{aligned} \widehat{\varvec{f}_l}=\frac{1}{\rho +\lambda }\left( \widehat{\varvec{\psi }_l}-\frac{\widehat{\varvec{x}_l}\odot \sum _i^L\widehat{\varvec{\upsilon }_i}}{\left( \rho +\lambda \right) \varvec{1}+\sum _i^L\widehat{\varvec{\omega }_i}}\right) ,l=1,\ldots ,L. \end{aligned}$$
(B.7)
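As a numerical sanity check (not part of the paper's implementation; all quantities below are random surrogate vectors, with small dimensions \(n\) and \(L\) chosen arbitrarily), the closed form (B.7) can be verified to coincide with (B.4)/(B.6) and to zero out the stationarity condition (B.2):

```python
import numpy as np

rng = np.random.default_rng(0)
n, L = 16, 3          # arbitrary small sizes for the check
rho, lam = 1.3, 0.7   # arbitrary positive penalty parameters

def cvec(shape):
    # random complex stand-ins for the hatted (DFT-domain) quantities
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

x = cvec((L, n)); y = cvec(n); z = cvec(n)
g = cvec((L, n)); nu = cvec((L, n))

m = np.ones(n); m[0] = 0.0      # the (0,1,...,1)^T mask
e1 = np.zeros(n); e1[0] = 1.0   # the [1,0,...,0]^T mask

psi = -x * (y + z) + rho * (g - nu)                  # definition before (B.3)
s = (np.conj(x) * psi).sum(0) / (
    (rho + lam) + m * (x * np.conj(x)).sum(0).real)  # (B.6)
f = (psi - x * m * s) / (rho + lam)                  # (B.4)

# closed form (B.7) must agree with the (B.4)/(B.6) route
ups = np.conj(x) * m * psi
om = np.conj(x) * m * x
f7 = (psi - x * ups.sum(0) / ((rho + lam) + om.sum(0))) / (rho + lam)
assert np.allclose(f, f7)

# the solution must also satisfy the fixed point (B.3) and zero out (B.2)
S = (np.conj(x) * f).sum(0)
assert np.allclose(S, s)
resid = x * (y + z + S) - x * e1 * S + lam * f + rho * (f - g + nu)
assert np.abs(resid).max() < 1e-10
print("(B.7) satisfies (B.2); max residual:", np.abs(resid).max())
```

The residual vanishes to machine precision because (B.4)-(B.7) solve the element-wise linear system (B.2) exactly.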

Appendix C

1.1 Derivation of \( \varvec{f}_l^\star \)

Let \( \nabla ^2\varvec{f}_l \) denote a matrix of the same shape as \( \varvec{f}_l \) whose elements, taking the 4-neighborhood difference as an example, are \( (\nabla ^2\varvec{f}_l)[u,v]{:=}\Delta (\varvec{f}_l[u,v]) =4\varvec{f}_l[u,v]-\varvec{f}_l[(u+1)\bmod H,v]-\varvec{f}_l[(u-1)\bmod H,v]-\varvec{f}_l[u,(v+1)\bmod W]-\varvec{f}_l[u,(v-1)\bmod W] \) for \( u\in \{0,1,\ldots ,H\!-\!1\},v\in \{0,1,\ldots ,W\!-\!1\}\). By the linearity and spatial-shift properties of the DFT, we have \( \widehat{\nabla ^2\varvec{f}_l}[u,v]=(4-e^{i\frac{2\pi }{H}u}-e^{-i\frac{2\pi }{H}u}-e^{i\frac{2\pi }{W}v}-e^{-i\frac{2\pi }{W}v})\widehat{\varvec{f}_l}[u,v]=(4-2{{\mathfrak {R}}}{{\mathfrak {e}}}(e^{i\frac{2\pi }{H}u}+e^{i\frac{2\pi }{W}v}))\widehat{\varvec{f}_l}[u,v]\). For notational convenience, we flatten \( \widehat{\varvec{f}_l} \), and likewise \( \widehat{\nabla ^2\varvec{f}_l} \), into a vector, and define a vector \( \varvec{\delta } \) whose elements are \( \varvec{\delta }[u\times W+v]=4-2{{\mathfrak {R}}}{\mathfrak {e}}(e^{i\frac{2\pi }{H}u}+e^{i\frac{2\pi }{W}v}) \) for all \( u\in \{0,1,\ldots ,H-1\}, v\in \{0,1,\ldots ,W-1\} \). By Parseval's identity, taking the DFT of (26) yields the equivalent optimization objective

$$\begin{aligned} \min _{\widehat{\varvec{f}_l}}&\;\frac{1}{2n}\left\| \widehat{\varvec{y}}+\widehat{\varvec{z}}+\sum _l^L\widehat{\varvec{x}_l}^*\odot \widehat{\varvec{f}_l}\right\| ^2+\frac{\rho }{2n}\sum _l^L\left\| \widehat{\varvec{f}_l}-\widehat{\varvec{g}_l}+\widehat{\varvec{\nu }_l}\right\| ^2\nonumber \\&+\frac{\lambda }{2n}\sum _l^L\left\| \widehat{\varvec{f}_l}\right\| ^2+\frac{\theta }{2n}\sum _l^L\left\| \varvec{\delta }\odot \left( \widehat{\varvec{f}_l}-\widehat{\varvec{f}_l^\textrm{pre}}\right) \right\| ^2\nonumber \\&-\frac{1}{2n}\sum _l^L\left\| (1,0,\ldots ,0)\odot \widehat{\varvec{x}_l}^*\odot \widehat{\varvec{f}_l}\right\| ^2. \end{aligned}$$
(C.1)
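The vector \( \varvec{\delta } \) introduced above can be checked numerically: its entries are exactly the DFT eigenvalues of the circular 4-neighborhood Laplacian. The following sketch (not from the paper; the grid size and test signal are arbitrary) confirms this with NumPy:

```python
import numpy as np

H, W = 8, 10  # arbitrary grid size for the check

# circulant 4-neighborhood Laplacian kernel, centered at (0,0) with wrap-around
K = np.zeros((H, W))
K[0, 0] = 4.0
K[1, 0] = K[-1, 0] = K[0, 1] = K[0, -1] = -1.0

u = np.arange(H)[:, None]
v = np.arange(W)[None, :]
delta = 4 - 2 * np.cos(2 * np.pi * u / H) - 2 * np.cos(2 * np.pi * v / W)

# the 2-D DFT of the kernel equals delta[u, v] (real, nonnegative)
assert np.allclose(np.fft.fft2(K), delta)

# hence DFT(Laplacian(f)) = delta ⊙ DFT(f) for any signal f
f = np.random.default_rng(1).standard_normal((H, W))
lap = (4 * f - np.roll(f, 1, 0) - np.roll(f, -1, 0)
       - np.roll(f, 1, 1) - np.roll(f, -1, 1))
assert np.allclose(np.fft.fft2(lap), delta * np.fft.fft2(f))
print("delta matches the DFT eigenvalues of the 4-neighborhood Laplacian")
```

This is precisely why the temporal-smoothness term in (C.1) becomes a diagonal (element-wise) operation in the Fourier domain.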

Setting the partial derivatives of (C.1) with respect to \( \widehat{\varvec{f}_l}, l=1,2,\ldots ,L \) to zero gives

$$\begin{aligned} \begin{aligned} \hspace{-0.07in}\varvec{0}=&\;\widehat{\varvec{x}_l}\odot \left( \widehat{\varvec{y}}+\widehat{\varvec{z}}+\widehat{\varvec{x}_l}^*\odot \widehat{\varvec{f}_l}+\sum _{i\ne l}^{L}\widehat{\varvec{x}_i}^*\odot \widehat{\varvec{f}_i}\right) \\&-\widehat{\varvec{x}_l}\!\odot \!\left[ 1,0,\ldots ,0\right] ^\textsf{T}\odot \left( \widehat{\varvec{x}_l}^*\!\odot \!\widehat{\varvec{f}_l}\!+\!\sum _{i\ne l}^L\widehat{\varvec{x}_i}^*\!\odot \!\widehat{\varvec{f}_i}\right) \\&+\lambda \widehat{\varvec{f}_l}+\rho \left( \widehat{\varvec{f}_l}-\widehat{\varvec{g}_l}+\widehat{\varvec{\nu }_l}\right) +\theta \varvec{d}\odot \left( \widehat{\varvec{f}_l}-\widehat{\varvec{f}_l^\textrm{pre}}\right) \!,\\ l=&\;1,2,\ldots ,L \end{aligned} \end{aligned}$$
(C.2)

where \( \varvec{d}=\varvec{\delta }^*\odot \varvec{\delta } \). The remaining steps are analogous to those in Appendix B.

Let \( \widehat{\varvec{\psi }^\Delta _l}=\widehat{\varvec{\psi }_l}+\theta \varvec{d}\odot \widehat{\varvec{f}_l^\textrm{pre}} \) and \( \widehat{\varvec{\upsilon }^\Delta _l}=\widehat{\varvec{x}_l}^*\odot \left( 0,1,\ldots ,1\right) ^\textsf{T}\odot \widehat{\varvec{\psi }^\Delta _l},l=1,2,\ldots ,L \); finally, we have

$$\begin{aligned} \begin{aligned}&\widehat{\varvec{f}_l^\star }=\frac{\varvec{1}}{\left( \rho \!+\!\lambda \right) \!\varvec{1}\!+\!\theta \varvec{d}}\!\left( \widehat{\varvec{\psi }_l^\Delta }\!-\!\frac{\widehat{\varvec{x}_l}\odot \sum _i^L\widehat{\varvec{\upsilon }^\Delta _i}}{\left( \rho \!+\!\lambda \right) \!\varvec{1}\!+\!\theta \varvec{d}\!+\!\sum _i^L\widehat{\varvec{\omega }_i}}\right) ,\\&l=1,2,\ldots ,L, \end{aligned} \end{aligned}$$
(C.3)

and then take the inverse FFT to obtain \( \varvec{f}_l^\star \).
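As with Appendix B, a short numerical check (not part of the paper; random surrogate vectors, arbitrary small sizes, and a random nonnegative stand-in for \( \varvec{d}=\varvec{\delta }^*\odot \varvec{\delta } \)) confirms that the closed form (C.3) zeroes out the stationarity condition (C.2):

```python
import numpy as np

rng = np.random.default_rng(2)
n, L = 16, 3                   # arbitrary small sizes for the check
rho, lam, theta = 1.3, 0.7, 0.5

def cvec(shape):
    # random complex stand-ins for the hatted (DFT-domain) quantities
    return rng.standard_normal(shape) + 1j * rng.standard_normal(shape)

x = cvec((L, n)); y = cvec(n); z = cvec(n)
g = cvec((L, n)); nu = cvec((L, n)); fpre = cvec((L, n))
d = rng.standard_normal(n) ** 2   # d = delta* ⊙ delta is real and nonnegative

m = np.ones(n); m[0] = 0.0        # the (0,1,...,1)^T mask
e1 = np.zeros(n); e1[0] = 1.0     # the [1,0,...,0]^T mask

psi = -x * (y + z) + rho * (g - nu)   # same psi as in Appendix B
psiD = psi + theta * d * fpre         # psi^Delta
ups = np.conj(x) * m * psiD           # upsilon^Delta
om = (np.conj(x) * m * x).sum(0)      # sum of omega_i
D = (rho + lam) + theta * d
fstar = (psiD - x * ups.sum(0) / (D + om)) / D   # (C.3)

# the closed form must zero out the stationarity condition (C.2)
S = (np.conj(x) * fstar).sum(0)
resid = (x * (y + z + S) - x * e1 * S + lam * fstar
         + rho * (fstar - g + nu) + theta * d * (fstar - fpre))
assert np.abs(resid).max() < 1e-10
print("(C.3) satisfies (C.2); max residual:", np.abs(resid).max())
```

Setting \( \theta =0 \) reduces this to the Appendix B case, as expected from the derivation.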

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.


About this article


Cite this article

Liu, P., Li, G., Zhao, W. et al. A coupling method of learning structured support correlation filters for visual tracking. Vis Comput 40, 181–199 (2024). https://doi.org/10.1007/s00371-023-02774-5

