Improving small object detection via context-aware and feature-enhanced plug-and-play modules

Published: 01 March 2024

Abstract

Detecting small objects is a challenging task in computer vision because such objects occupy only a small number of pixels and often have blurred contours. As a result, few discriminative features are available to model them effectively. In this paper, we propose three lightweight plug-and-play modules that can be integrated into object detection algorithms, particularly those in the YOLO series, to improve the accuracy of small object detection. First, the Spatially Enhanced Convolutional Block Attention Module (SE-CBAM) is integrated into the feature extraction layer to strengthen the network's feature extraction capability. Second, a Contextual Information Pooling Enhancement Module (CIE-Pool) is added at the multi-scale feature fusion stage to extract and enhance the background information around objects, which improves the recognition rate of small objects. Third, a new layer is added to the detection head; it takes the shallow feature map produced by the feature extraction network after Adaptive Feature Processing (AFP) and so provides richer information about small objects. The method was evaluated on the VisDrone2021 and AI-TOD datasets. The experimental results show that the proposed method greatly improves the detection accuracy of small objects while maintaining real-time capability. It also maintains high accuracy and speed against complex backgrounds and when detecting heavily blurred small objects.
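The abstract does not give the internals of SE-CBAM, so the following is only a minimal sketch of what "plug-and-play" means for an attention module: it takes a feature map and returns one of the same shape, so it can be placed between existing layers without changing anything else. The sketch loosely follows the standard CBAM pattern (a channel gate followed by a spatial gate) and is not the authors' module. It assumes a feature map stored as nested lists `[C][H][W]`, and it replaces CBAM's 7x7 spatial convolution with a two-weight per-pixel mix to stay short. All function and parameter names are hypothetical.

```python
import math
import random


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def channel_gate(x, w1, w2):
    """Per-channel gates in (0, 1) from average- and max-pooled descriptors.

    x is [C][H][W]; w1 is [hidden][C] and w2 is [C][hidden] (a shared 2-layer MLP).
    """
    def mlp(v):
        hidden = [max(0.0, sum(w * vi for w, vi in zip(row, v))) for row in w1]
        return [sum(w * h for w, h in zip(row, hidden)) for row in w2]

    avg = [sum(map(sum, ch)) / (len(ch) * len(ch[0])) for ch in x]
    mx = [max(map(max, ch)) for ch in x]
    return [sigmoid(a + m) for a, m in zip(mlp(avg), mlp(mx))]


def spatial_gate(x, w_avg, w_max):
    """Per-pixel gates in (0, 1) from channel-wise mean and max.

    Assumption: a two-weight mix stands in for CBAM's 7x7 convolution.
    """
    C, H, W = len(x), len(x[0]), len(x[0][0])
    gate = []
    for i in range(H):
        row = []
        for j in range(W):
            col = [x[c][i][j] for c in range(C)]
            row.append(sigmoid(w_avg * sum(col) / C + w_max * max(col)))
        gate.append(row)
    return gate


def attention_block(x, w1, w2, w_avg=1.0, w_max=1.0):
    """Apply the channel gate, then the spatial gate; output shape equals input shape."""
    cg = channel_gate(x, w1, w2)
    x = [[[v * g for v in row] for row in ch] for ch, g in zip(x, cg)]
    sg = spatial_gate(x, w_avg, w_max)
    return [[[v * sg[i][j] for j, v in enumerate(row)]
             for i, row in enumerate(ch)] for ch in x]


def random_weights(rows, cols, rng):
    """Hypothetical helper: untrained weights, for illustration only."""
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]
```

Because the block only rescales its input by gates between 0 and 1, it can be dropped after any layer of a backbone; in a real detector the weights would be learned together with the rest of the network.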

Published In

Journal of Real-Time Image Processing, Volume 21, Issue 2
Apr 2024
529 pages

Publisher

Springer-Verlag

Berlin, Heidelberg

Publication History

Published: 01 March 2024
Accepted: 23 January 2024
Received: 17 October 2023

Author Tags

  1. Small object detection
  2. Contextual information
  3. Feature enhancement
  4. Attention
  5. Plug-and-play modules

Qualifiers

  • Research-article
