DOI: 10.1145/3487075.3487087
research-article

SSD Target Detection Algorithm Based on Multi-Scale Fusion and Attention

Published: 07 December 2021

Abstract

Traditional SSD target detection algorithms suffer from feature maps that carry little effective information and from a high miss rate on difficult targets. To address these problems, we propose an improved SSD target detection algorithm. First, a CBAM module is added after each feature layer of the SSD. CBAM is a hybrid module that combines spatial attention and channel attention; it strengthens the network's ability to discriminate targets from background, increases the weight given to effective features, and suppresses interference from irrelevant information. Second, following the idea of FPN, a feature fusion module is constructed that integrates feature layers of different scales, thereby improving the network's ability to detect difficult targets. Experiments on the PASCAL VOC data set show that the improved network achieves markedly better detection performance.
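
The two components named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the weights are random, the function names (`channel_attention`, `spatial_attention`, `cbam`, `fuse_top_down`) are hypothetical, and the fusion step assumes both feature maps already have the same number of channels (FPN normally uses a 1x1 lateral convolution for that). The 38x38 and 19x19 sizes are the first two SSD300 feature-map resolutions and are used here only as an example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def channel_attention(feat, w1, w2):
    """Channel attention on a (C, H, W) map.

    A shared two-layer MLP (w1: (C//r, C), w2: (C, C//r)) is applied to the
    average-pooled and max-pooled channel descriptors; the summed result is
    squashed to (0, 1) and used to rescale each channel.
    """
    avg = feat.mean(axis=(1, 2))                     # (C,)
    mx = feat.max(axis=(1, 2))                       # (C,)
    mlp = lambda v: w2 @ np.maximum(w1 @ v, 0.0)     # ReLU hidden layer
    scale = sigmoid(mlp(avg) + mlp(mx))              # (C,)
    return feat * scale[:, None, None]

def spatial_attention(feat, kernel):
    """Spatial attention on a (C, H, W) map.

    The channel-wise mean and max maps are stacked into a (2, H, W) tensor,
    convolved with one (2, k, k) kernel (k odd, zero padding), and the
    resulting (H, W) mask rescales every spatial position.
    """
    pooled = np.stack([feat.mean(axis=0), feat.max(axis=0)])
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = feat.shape[1:]
    mask = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            mask[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return feat * sigmoid(mask)[None, :, :]

def cbam(feat, w1, w2, kernel):
    """Channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(feat, w1, w2), kernel)

def fuse_top_down(shallow, deep):
    """FPN-style fusion: upsample the deeper map 2x (nearest neighbour),
    crop to the shallow map's size, and add element-wise.
    Assumes equal channel counts (a simplification)."""
    up = deep.repeat(2, axis=1).repeat(2, axis=2)
    return shallow + up[:, :shallow.shape[1], :shallow.shape[2]]

# Toy example with random weights and two feature maps.
rng = np.random.default_rng(0)
C = 8
w1 = rng.normal(size=(C // 4, C))      # reduction ratio 4 (assumed)
w2 = rng.normal(size=(C, C // 4))
kernel = rng.normal(size=(2, 7, 7))    # 7x7 spatial kernel (assumed)
shallow = rng.normal(size=(C, 38, 38))
deep = rng.normal(size=(C, 19, 19))

attended_shallow = cbam(shallow, w1, w2, kernel)
attended_deep = cbam(deep, w1, w2, kernel)
fused = fuse_top_down(attended_shallow, attended_deep)
print(fused.shape)
```

Because both attention masks lie in (0, 1), the attended map never exceeds the input in magnitude; the module only reweights features. Whether the paper applies attention before or after fusion is not stated in the abstract, so the order used here is an assumption.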


Cited By

  • (2024) Research on Person Re-Identification Method Based on Metric Learning and Supervised Learning. 2024 16th International Conference on Intelligent Human-Machine Systems and Cybernetics (IHMSC), 128-131. DOI: 10.1109/IHMSC62065.2024.00036. Online publication date: 24-Aug-2024


        Published In

        ACM Other conferences
        CSAE '21: Proceedings of the 5th International Conference on Computer Science and Application Engineering
        October 2021
        660 pages
        ISBN: 9781450389853
        DOI: 10.1145/3487075

        Publisher

        Association for Computing Machinery

        New York, NY, United States


        Author Tags

        1. CBAM
        2. Multi-scale feature
        3. SSD
        4. Target detection

        Qualifiers

        • Research-article
        • Research
        • Refereed limited

        Conference

        CSAE 2021

        Acceptance Rates

        Overall Acceptance Rate 368 of 770 submissions, 48%


