Abstract
Recently, correlation-filter-based trackers have attracted considerable attention for their high computational efficiency. However, they cannot handle occlusion and scale variation well. This paper aims to prevent tracker failure in these two situations by integrating depth information into a correlation-filter-based tracker. Using RGB-D data, we construct a depth context model that reveals the spatial correlation between the target and its surrounding regions. Furthermore, we adopt a region growing method to make the tracker robust to occlusion and scale variation. Additional optimizations, such as a model updating scheme, are applied to improve performance on longer video sequences. Both qualitative and quantitative evaluations on challenging benchmark image sequences demonstrate that the proposed tracker performs favourably against state-of-the-art algorithms.
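To make the core mechanism concrete, the following is a minimal single-channel MOSSE-style correlation filter sketch, not the paper's full RGB-D pipeline: the filter is learned in the Fourier domain against a desired Gaussian response, and detection returns the response peak as the estimated target position. All function names and the regularization parameter `lam` are illustrative assumptions.

```python
import numpy as np

def train_filter(patch, target_response, lam=1e-2):
    """Closed-form correlation filter (MOSSE-style):
    H* = (G . conj(F)) / (F . conj(F) + lambda),
    where F, G are the FFTs of the patch and the desired response."""
    F = np.fft.fft2(patch)
    G = np.fft.fft2(target_response)
    return (G * np.conj(F)) / (F * np.conj(F) + lam)

def detect(H_star, patch):
    """Correlate the learned filter with a new patch; the peak of the
    spatial response gives the estimated target location."""
    F = np.fft.fft2(patch)
    response = np.real(np.fft.ifft2(H_star * F))
    return np.unravel_index(np.argmax(response), response.shape)
```

Training on a patch with a Gaussian response centered on the target and then detecting on the same patch recovers the target position; in a tracker, detection runs on the next frame's search window instead, and depth cues (as in the proposed method) help decide whether the peak is trustworthy under occlusion.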
Additional information
Project supported by the National Natural Science Foundation of China (Nos. 61502509, 61402504, and 61272145), the National High-Tech R&D Program (863) of China (No. 2012AA012706), and the Research Fund for the Doctoral Program of Higher Education of China (No. 21024307130004)
ORCID: Lei LUO, http://orcid.org/0000-0002-9329-1411
Cite this article
Chen, Z.Y., Luo, L., Huang, D.F., et al. Exploiting a depth context model in visual tracking with correlation filter. Frontiers Inf Technol Electronic Eng 18, 667–679 (2017). https://doi.org/10.1631/FITEE.1500389