Abstract
Vehicle detection in aerial images is used in many fields and has attracted growing research attention. The task is difficult because vehicles appear in arbitrary orientations and dense arrangements, the background is complex, and object scale varies. To achieve better detection performance, an improved detection model, BFR-RetinaNet, is proposed, based on the single-stage object detection model RetinaNet. BFR-RetinaNet refines vehicle localization by adding a directional anchor box regression branch. The model also introduces a balanced feature pyramid structure to strengthen feature extraction and reduce interference from complex backgrounds. Experimental results on the aerial dataset show that the proposed model improves precision, recall, and average precision to varying degrees. It achieves 86.2% precision, 98.4% recall, and 90.8% average precision (AP), which are 7.96, 0.45, and 0.58 points higher than R-RetinaNet, respectively.
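The reported metrics can be made concrete with a short sketch. The Python below is illustrative only and is not taken from the paper: it uses the standard definitions of precision and recall from true positive, false positive, and false negative counts, and it back-computes the R-RetinaNet scores implied by the gains quoted in the abstract. The function names and the example counts are hypothetical, and because the abstract rounds its figures, the implied baseline values are approximate.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted vehicles that match a ground-truth vehicle."""
    total = tp + fp
    return tp / total if total else 0.0


def recall(tp: int, fn: int) -> float:
    """Fraction of ground-truth vehicles that were detected."""
    total = tp + fn
    return tp / total if total else 0.0


def implied_baseline(reported: dict, gains: dict) -> dict:
    """Subtract each gain (in points) from the matching reported score."""
    return {name: round(reported[name] - gains[name], 2) for name in reported}


# Scores and gains as quoted in the abstract (percent and percentage points).
reported = {"precision": 86.2, "recall": 98.4, "ap": 90.8}
gains = {"precision": 7.96, "recall": 0.45, "ap": 0.58}

baseline = implied_baseline(reported, gains)
print(baseline)

# Hypothetical counts, only to show how the two definitions behave.
print(precision(tp=8, fp=2), recall(tp=8, fn=2))
```

Read this way, the gain is concentrated in precision: the recall and AP improvements are each under one point, while precision rises by almost eight points.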
This work is partially supported by the Open Research Project of the State Key Laboratory of Industrial Control Technology (No. ICT2021B10), the Natural Science Foundation of Hunan Province (2021JJ30456), the Open Fund of Science and Technology on Parallel and Distributed Processing Laboratory (WDZC20205500119), the Hunan Provincial Science and Technology Department High-tech Industry Science and Technology Innovation Leading Project (2020GK2009) and the Scientific and Technological Progress and Innovation Program of the Transportation Department of Hunan Province (201927).
Copyright information
© 2022 Springer Nature Switzerland AG
About this paper
Cite this paper
Zhang, J.I.N., Luo, M., Sun, C., Qu, P. (2022). BFR-RetinaNet: An Improved RetinaNet Model for Vehicle Detection in Aerial Images. In: Lai, Y., Wang, T., Jiang, M., Xu, G., Liang, W., Castiglione, A. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2021. Lecture Notes in Computer Science, vol 13155. Springer, Cham. https://doi.org/10.1007/978-3-030-95384-3_2
DOI: https://doi.org/10.1007/978-3-030-95384-3_2
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-95383-6
Online ISBN: 978-3-030-95384-3
eBook Packages: Computer Science, Computer Science (R0)