Abstract
This paper presents convolutional neural network (CNN)-based algorithms for single-object detection and tracking. CNN-based object detection methods are directly applicable to static images, but not to videos. Model-free visual object tracking methods, on the other hand, cannot detect an object until a ground-truth bounding box of the target is provided. Moreover, training both object detectors and visual trackers requires many annotated video datasets of the target object. In this work, three simple yet effective object detection and tracking algorithms for videos are proposed. They efficiently combine a state-of-the-art object detector with a visual tracker for circumstances in which only a few static images of the target are available for training. The proposed algorithms are tested on a drone detection task, and the experimental results demonstrate their effectiveness.
Acknowledgements
This work was supported by the National Research Foundation of Korea (NRF) grant funded by the Korean government (MIST) (No. 2017R1C1B5017125 and No. 2019R1F1A1040709).
Cite this article
Lee, DH. CNN-based single object detection and tracking in videos and its application to drone detection. Multimed Tools Appl 80, 34237–34248 (2021). https://doi.org/10.1007/s11042-020-09924-0
DOI: https://doi.org/10.1007/s11042-020-09924-0