Abstract
Aerial images are usually large (around 2K resolution). Such high-resolution images contain thousands of small objects, and detecting all of them is very challenging. Real-time detection and classification is considerably more complex than on ordinary images (under 1K resolution, with a high Object-to-Image Ratio, OIR). Deep learning offers many object detection algorithms, but they are not designed for aerial images and are often sub-optimal at detecting and precisely localizing small-scale objects. In this work, a novel technique based on a modified SSD architecture, OIR-SSD, is proposed for real-time object detection on aerial images with high mean Average Precision (mAP). OIR-SSD comprises two approaches: approach-I targets higher mAP, whereas approach-II targets real-time object detection. Approach-I improves mAP from 0.72 to 0.92 (a 28% improvement) on the Stanford dataset and from 0.04 to 0.44 (an 11-fold increase) on VisDrone2018 at 4 Frames Per Second (FPS), whereas approach-II improves mAP from 0.72 to 0.82 at 42 FPS.
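The reported numbers can be put on a common footing with a little arithmetic. The following is a minimal sketch, not part of the paper: the helper names `relative_gain` and `frame_budget_ms` are hypothetical, relative gain is assumed to mean (new - old) / old, and the 4 FPS figure is assumed to apply to both approach-I results.

```python
def relative_gain(before: float, after: float) -> float:
    """Relative improvement, as a percentage of the baseline value."""
    return (after - before) / before * 100.0


def frame_budget_ms(fps: float) -> float:
    """Processing time available per frame, in milliseconds."""
    return 1000.0 / fps


# (label, baseline mAP, improved mAP, FPS) as quoted in the abstract.
# Assumption: the 4 FPS figure covers both approach-I results.
reported = [
    ("approach-I, Stanford", 0.72, 0.92, 4),
    ("approach-I, VisDrone2018", 0.04, 0.44, 4),
    ("approach-II", 0.72, 0.82, 42),
]

for label, before, after, fps in reported:
    print(
        f"{label}: +{relative_gain(before, after):.1f}% mAP, "
        f"{after / before:.1f}x baseline, "
        f"{frame_budget_ms(fps):.1f} ms per frame"
    )
```

Under this definition the Stanford gain for approach-I is about 28%, the VisDrone2018 result is 11 times its baseline (a gain of 1000%), and approach-II gains about 14%. The frame budgets show the trade-off between the two approaches: roughly 250 ms per frame at 4 FPS against roughly 24 ms per frame at 42 FPS.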
© 2019 Springer Nature Switzerland AG
About this paper
Cite this paper
Sharma, R., Pandey, R., Nigam, A. (2019). Real Time Object Detection on Aerial Imagery. In: Vento, M., Percannella, G. (eds) Computer Analysis of Images and Patterns. CAIP 2019. Lecture Notes in Computer Science, vol 11678. Springer, Cham. https://doi.org/10.1007/978-3-030-29888-3_39
DOI: https://doi.org/10.1007/978-3-030-29888-3_39
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-29887-6
Online ISBN: 978-3-030-29888-3
eBook Packages: Computer Science, Computer Science (R0)