Abstract
With the rapid development of remote sensing technology and the widespread application of remote sensing images, remote sensing object detection has become an active research direction. However, we observe three primary challenges in remote sensing object detection: scale variations, small objects, and complex backgrounds. To address these challenges, we propose a novel detector, the Multi-Scale Context-Aware Network (MSCANet). First, we introduce a Multi-Scale Fusion Module (MSFM) that provides receptive fields at several scales, so that contextual information can be extracted adequately for objects of different sizes. Second, we propose the Multi-Scale Guidance Module (MSGM), which fuses deep and shallow feature maps from multiple scales and thereby reduces the loss of feature information for small objects. Finally, we introduce the Context-Aware DownSampling Module (CADM), which dynamically adjusts the weights of context information at different scales and effectively reduces interference from complex backgrounds. Experimental results demonstrate that the proposed MSCANet achieves superior performance, with a mean average precision (mAP) of 97.1% and 73.4% on the challenging RSOD and DIOR datasets, respectively. This indicates that the proposed network is well suited to remote sensing object detection and has considerable reference value.
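For readers unfamiliar with the mAP metric quoted above, the following is a minimal sketch of how a mean average precision figure can be computed. It is not the authors' evaluation code: the abstract does not state the IoU threshold or interpolation scheme, so this sketch assumes all-point interpolation of the precision-recall curve and assumes each detection has already been matched to ground truth as a true or false positive. The function names and the toy class labels are hypothetical.

```python
from typing import Dict, List, Tuple

# A detection is (confidence, is_true_positive). Matching detections to
# ground-truth boxes at an IoU threshold is assumed to have happened already.
Detection = Tuple[float, bool]


def average_precision(detections: List[Detection], num_gt: int) -> float:
    """All-point interpolated AP for a single class (assumed protocol)."""
    if num_gt == 0:
        return 0.0
    # Rank detections by descending confidence.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)
    precisions: List[float] = []
    recalls: List[float] = []
    tp = fp = 0
    for _, is_tp in ranked:
        if is_tp:
            tp += 1
        else:
            fp += 1
        precisions.append(tp / (tp + fp))
        recalls.append(tp / num_gt)
    # Replace each precision with the maximum precision at any higher rank,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Accumulate rectangle areas wherever recall increases.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap


def mean_average_precision(
    per_class: Dict[str, Tuple[List[Detection], int]]
) -> float:
    """Unweighted mean of the per-class AP values."""
    aps = [average_precision(dets, n_gt) for dets, n_gt in per_class.values()]
    return sum(aps) / len(aps) if aps else 0.0


# Toy data with hypothetical class names, not taken from RSOD or DIOR.
toy = {
    "class_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "class_b": ([(0.6, True)], 1),
}
print(f"mAP = {mean_average_precision(toy):.4f}")
```

In this toy case the first class has one false positive ranked between two true positives, so its AP falls below 1, while the second class is detected perfectly; the mAP is the plain average of the two.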
Data availability
No datasets were generated or analysed during the current study.
Funding
This work was supported by the Project of Research and Develop the Key Technology of Mine Explosion-proof Pure Electric Transport Vehicle, Key Research and Development Projects in Anhui Province (202004b11020029).
Author information
Contributions
Huaping Zhou and Weidong Liu wrote the main manuscript text and designed the remote sensing object detection model. Kelei Sun performed the data processing and analysis. Jin Wu and Tao Wu prepared Figs. 1–17. All authors reviewed the manuscript.
Ethics declarations
Competing interests
The authors declare no competing interests.
Additional information
Communicated by Hassan Babaie.
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Zhou, H., Liu, W., Sun, K. et al. MSCANet: A multi-scale context-aware network for remote sensing object detection. Earth Sci Inform 17, 5521–5538 (2024). https://doi.org/10.1007/s12145-024-01447-8
DOI: https://doi.org/10.1007/s12145-024-01447-8