RT-DETRmg: a lightweight real-time detection model for small traffic signs

The Journal of Supercomputing

Abstract

In intelligent transportation systems, real-time detection performance and accuracy are essential metrics. This paper proposes a lightweight real-time detection model, RT-DETRmg, to address the challenges of false and missed detections of small traffic signs and to improve the algorithm's real-time performance. RT-DETRmg enhances the multi-scale feature extraction capability of the RT-DETR backbone network by incorporating a Multiple Scale Sequence Fusion module, which effectively integrates global and local semantic information from different scales of images. Additionally, a cascaded group attention module is utilized within an efficient hybrid encoder to reduce computational complexity, thereby enhancing real-time performance. To further optimize small object detection, a small receptive field feature layer is introduced, while a large receptive field feature layer is removed. Experimental results on the TT100K and GTSDB datasets demonstrate the superiority of RT-DETRmg over existing models. On the TT100K dataset, RT-DETRmg achieves a 2.0% improvement in mean average precision and a 6.6% increase in frames per second compared to the baseline RT-DETR model, while reducing model parameters and computational complexity. On the GTSDB dataset, RT-DETRmg further demonstrates its strong generalization ability, achieving a 2.2% improvement in the F1 score and a 1.7% increase in mean average precision compared to the baseline network. These findings highlight the effectiveness of RT-DETRmg in enhancing both detection accuracy and real-time performance of small traffic signs in diverse scenarios.
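
The abstract reports results in terms of false detections, missed detections, and the F1 score. As a minimal sketch of how these quantities relate, the snippet below uses the standard definitions of precision, recall, and F1; it is not code from the paper, and the function name `detection_scores` and the example counts are hypothetical. False detections are counted as false positives and missed detections as false negatives:

```python
def detection_scores(true_positives, false_positives, false_negatives):
    """Return (precision, recall, f1) from raw detection counts.

    false_positives ~ false detections, false_negatives ~ missed detections.
    Each ratio falls back to 0.0 when its denominator is zero.
    """
    detected = true_positives + false_positives   # boxes the model emitted
    actual = true_positives + false_negatives     # signs present in the images
    precision = true_positives / detected if detected else 0.0
    recall = true_positives / actual if actual else 0.0
    # F1 is the harmonic mean of precision and recall, written in count form.
    denom = 2 * true_positives + false_positives + false_negatives
    f1 = 2 * true_positives / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts for illustration only (not results from the paper).
precision, recall, f1 = detection_scores(
    true_positives=80, false_positives=20, false_negatives=20
)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

Fewer false detections raise precision, fewer missed detections raise recall, and F1 stays high only when both do.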

Data availability

Data are provided within the manuscript or supplementary information files.

Funding

This work was supported by the Clinical research project initiated by researchers from the Sichuan Provincial Health Commission (23LCYJ020), the Special Project for City-College Science and Technology Strategic Cooperation of Nanchong City in 2022 (22SXQT0292), and Sichuan Provincial Key R&D Plan (Major Science and Technology Project) (2022YFS0020).

Author information

Contributions

Y.W. and J.C. contributed to conceptualization and writing—original draft preparation; B.Y. helped with the methodology; Y.W. assisted with software; Y.C. and R.L. carried out validation; Y.W., J.C., and R.L. conducted investigation; Y.S. and Y.C. were involved in writing—review and editing; and Y.S. and Y.C. were responsible for visualization. All authors have read and agreed to the published version of the manuscript.

Corresponding author

Correspondence to Jinling Chen.

Ethics declarations

Conflict of interest

The authors declare no competing interests.

Ethics approval

Not applicable.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Wang, Y., Chen, J., Yang, B. et al. RT-DETRmg: a lightweight real-time detection model for small traffic signs. J Supercomput 81, 307 (2025). https://doi.org/10.1007/s11227-024-06800-8

  • DOI: https://doi.org/10.1007/s11227-024-06800-8