SSD Target Detection Algorithm Based on Multi-Scale Fusion and Attention

Authors:

Yijian Pei

CSAE '21: Proceedings of the 5th International Conference on Computer Science and Application Engineering

Article No.: 12, Pages 1 - 5

https://doi.org/10.1145/3487075.3487087

Published: 07 December 2021

Abstract

Traditional SSD target detection algorithms produce feature maps that carry weak effective information, and they miss a high proportion of difficult targets. To address these problems, we propose an improved SSD target detection algorithm. First, a CBAM module is added after each feature layer of the SSD. CBAM is a hybrid module that combines spatial attention and channel attention; it strengthens the network's ability to discriminate targets from background, increases the weight given to effective features, and suppresses interference from irrelevant information. Second, following the idea of FPN, a feature fusion module is constructed that integrates feature layers of different scales, which improves the network's ability to detect difficult targets. Experiments on the PASCAL VOC dataset indicate that the proposed changes substantially improve detection performance.
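The two ideas named in the abstract can be illustrated with a small NumPy sketch. It is not the paper's implementation: the abstract gives no layer sizes, weights, or ordering, so everything below is an assumption made for illustration. In particular, the sketch assumes all feature maps already share one channel count (a real SSD would need 1x1 lateral convolutions for that), assumes an odd attention kernel size, uses random weights, reuses one set of attention weights for every layer, and applies attention before fusion. The function names (`cbam`, `top_down_fuse`, `resize_nearest`) are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(x, w1, w2):
    """Rescale each channel of x (C, H, W) by a gate in (0, 1).

    A shared two-layer MLP (w1: (C/r, C), w2: (C, C/r)) is applied to the
    average-pooled and max-pooled channel descriptors, then summed.
    """
    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)  # ReLU hidden layer

    gate = sigmoid(mlp(x.mean(axis=(1, 2))) + mlp(x.max(axis=(1, 2))))
    return x * gate[:, None, None]


def spatial_attention(x, kernel):
    """Rescale each location of x (C, H, W) by a gate in (0, 1).

    kernel: (2, k, k) filter (k odd) applied with zero padding to the
    channel-wise average and max maps.
    """
    pooled = np.stack([x.mean(axis=0), x.max(axis=0)])  # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(pooled, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    logits = np.empty((h, w))
    for i in range(h):
        for j in range(w):
            logits[i, j] = np.sum(padded[:, i:i + k, j:j + k] * kernel)
    return x * sigmoid(logits)[None, :, :]


def cbam(x, w1, w2, kernel):
    """Channel attention followed by spatial attention."""
    return spatial_attention(channel_attention(x, w1, w2), kernel)


def resize_nearest(x, size):
    """Nearest-neighbour resize of x (C, H, W) to (C, size[0], size[1])."""
    h, w = x.shape[1:]
    rows = np.arange(size[0]) * h // size[0]
    cols = np.arange(size[1]) * w // size[1]
    return x[:, rows][:, :, cols]


def top_down_fuse(features):
    """FPN-style top-down pathway.

    features: list of (C, H, W) maps ordered shallow (large) to deep (small).
    Each map is summed with the resized, already-fused deeper map.
    """
    fused = [features[-1]]
    for feat in reversed(features[:-1]):
        fused.append(feat + resize_nearest(fused[-1], feat.shape[1:]))
    return fused[::-1]


# Illustrative run on three random maps with SSD300-like spatial sizes.
rng = np.random.default_rng(0)
C, r = 8, 2  # channel count and MLP reduction ratio (assumed values)
w1 = rng.normal(size=(C // r, C))
w2 = rng.normal(size=(C, C // r))
kernel = rng.normal(size=(2, 7, 7))
maps = [rng.normal(size=(C, s, s)) for s in (38, 19, 10)]
attended = [cbam(m, w1, w2, kernel) for m in maps]
fused = top_down_fuse(attended)
print([f.shape for f in fused])  # [(8, 38, 38), (8, 19, 19), (8, 10, 10)]
```

Because both attention gates are sigmoid outputs, attention can only shrink activations, which is how irrelevant regions and channels get suppressed. The top-down sum then lets the large, shallow maps borrow context from the deeper ones, which is the usual motivation for FPN-style fusion when targets are small or hard to detect.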

Index Terms

SSD Target Detection Algorithm Based on Multi-Scale Fusion and Attention
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
      2. Computer vision tasks
  2. Machine learning

Index terms have been assigned to the content through auto-classification.

Published In

CSAE '21: Proceedings of the 5th International Conference on Computer Science and Application Engineering

October 2021

660 pages

ISBN: 9781450389853

DOI: 10.1145/3487075

Copyright © 2021 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

CSAE 2021: The 5th International Conference on Computer Science and Application Engineering

October 19 - 21, 2021

Sanya, China

Acceptance Rates

Overall Acceptance Rate 368 of 770 submissions, 48%
