Abstract
At present, face recognition algorithms suffer from poor face tracking and low real-time performance in multi-target recognition scenarios. This paper details a real-time multi-target face detection, tracking, and recognition algorithm comprising three methods: fast tracking, fast detection, and fast recognition. The first step proposes a new network based on GOTURN to achieve fast face tracking: the prior information of the previous frame image is used to predict the position of the face boxes in the current frame. The second step is based on MTCNN for face detection, using the prior information of the present structure to avoid generating a massive number of invalid candidate boxes, thereby achieving rapid detection of faces. Finally, fast face recognition is realized by a reduced MobileFaceNet. By avoiding repeated exposure and repeated identification of the same target, the algorithm successfully transforms a multi-target scene into a single-target scene. On the OTB2015 and 300_VW test sets, the evaluated trackers tracked faces with accuracy rates of 92.2% and 99.6%, respectively. On the Xiph test set, multi-target face detection and tracking speed reached 102 fps on the CPU. Compared with the original MobileFaceNet, the streamlined network has an accuracy rate of 99.1% on LFW, with feature extraction speed increased by 25% and model size reduced by 45%. Experimental results show that the algorithm has high recognition accuracy and real-time performance in multi-target recognition scenes.
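The control flow summarized in the abstract (track known faces from the previous frame, detect only what is new, recognize each target once) can be illustrated with a minimal sketch. This is not the authors' implementation: the names `Track`, `Pipeline`, `track_fn`, `detect_fn`, and `recognize_fn` are hypothetical placeholders standing in for the GOTURN-based tracker, the MTCNN-based detector, and the reduced MobileFaceNet, and the IoU test used to decide whether a detection is new is an assumption made for illustration only.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (x1, y1, x2, y2)


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0, ix2 - ix1) * max(0, iy2 - iy1)
    union = ((a[2] - a[0]) * (a[3] - a[1])
             + (b[2] - b[0]) * (b[3] - b[1]) - inter)
    return inter / union if union > 0 else 0.0


@dataclass
class Track:
    box: Box
    identity: Optional[str] = None  # filled in once, then reused


class Pipeline:
    """Hypothetical track -> detect -> recognize-once loop (illustrative only)."""

    def __init__(self,
                 track_fn: Callable[[object, Box], Box],
                 detect_fn: Callable[[object, List[Box]], List[Box]],
                 recognize_fn: Callable[[object, Box], str],
                 iou_threshold: float = 0.5) -> None:
        self.track_fn = track_fn          # stand-in for the tracking network
        self.detect_fn = detect_fn        # stand-in for the face detector
        self.recognize_fn = recognize_fn  # stand-in for the recognition network
        self.iou_threshold = iou_threshold  # assumed value, not from the paper
        self.tracks: List[Track] = []
        self.recognize_calls = 0

    def step(self, frame: object) -> List[Tuple[Box, Optional[str]]]:
        # 1. Tracking: predict each known face's box from its previous box.
        for t in self.tracks:
            t.box = self.track_fn(frame, t.box)

        # 2. Detection: tracked boxes are passed as prior information;
        #    only detections that do not overlap a known track start a new one.
        known = [t.box for t in self.tracks]
        for box in self.detect_fn(frame, known):
            if all(iou(box, t.box) < self.iou_threshold for t in self.tracks):
                self.tracks.append(Track(box))

        # 3. Recognition: run only for tracks that have no identity yet,
        #    so the same target is never identified twice.
        for t in self.tracks:
            if t.identity is None:
                t.identity = self.recognize_fn(frame, t.box)
                self.recognize_calls += 1

        return [(t.box, t.identity) for t in self.tracks]
```

In this sketch the recognition cost grows with the number of distinct targets rather than with the number of frames, which is one way to read the abstract's claim that a multi-target scene is reduced to a single-target scene. Track loss and re-identification are deliberately left out.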
Acknowledgements
This research was supported by a fund from the Hubei Province Key Laboratory of Intelligent Information Processing and Real-time Industrial System (Grant No. znxx2018QN06), the Major Project for New Generation of AI (Grant No. 2018AAA0100400), and the National Natural Science Foundation of Hunan (Grant No. 2018JJ2098).
Cite this article
Li, J., Wang, Y., Fang, G. et al. Real-time detection tracking and recognition algorithm based on multi-target faces. Multimed Tools Appl 80, 17223–17238 (2021). https://doi.org/10.1007/s11042-020-09601-2