MOD-YOLO: Improved YOLOv5 Based on Multi-softmax and Omni-Dimensional Dynamic Convolution for Multi-label Bridge Defect Detection

  • Conference paper
Advanced Intelligent Computing Technology and Applications (ICIC 2024)

Abstract

Object detection methods provide efficient and accurate solutions for defect detection in industrial production and quality control. Because bridge defects of different categories often overlap in the same region, detection methods that assign a single label to each target struggle to detect them accurately. This paper proposes MOD-YOLO, a bridge defect detection scheme built on YOLOv5 that combines the proposed multi-softmax classification loss function with omni-dimensional dynamic convolution (ODConv). MOD-YOLO is evaluated on the CODEBRIM dataset and achieves the highest performance among the compared state-of-the-art (SOTA) models, including the YOLO series and transformer-based detectors.
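The single-label limitation noted in the abstract can be illustrated with a minimal sketch. It is not the paper's multi-softmax loss, whose formulation is not given on this page; it only contrasts a standard softmax over classes with independent per-class sigmoids. The logits and the three class names in the comments are hypothetical values chosen for illustration.

```python
import math

def softmax(logits):
    """Standard softmax: scores compete, and the outputs sum to 1."""
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(logits):
    """Independent per-class probabilities: each class is scored on its own."""
    return [1.0 / (1.0 + math.exp(-x)) for x in logits]

# Hypothetical logits for one predicted box over three illustrative
# defect classes (say crack, spallation, corrosion). The first two
# defects overlap in the same region, so both logits are high.
logits = [4.0, 4.0, -3.0]

single = softmax(logits)  # single-label view of the box
multi = sigmoid(logits)   # multi-label view of the box

print("softmax:", [round(p, 3) for p in single])
print("sigmoid:", [round(p, 3) for p in multi])
```

Under softmax, the two co-occurring classes split the probability mass and neither can exceed 0.5, whereas the per-class scores can both be close to 1. This is the general reason a classification head designed for one label per target handles overlapping defect categories poorly.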

Author information

Corresponding author

Correspondence to Ping Ma.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

He, X., Ma, P., Chen, Y., Liu, Y. (2024). MOD-YOLO: Improved YOLOv5 Based on Multi-softmax and Omni-Dimensional Dynamic Convolution for Multi-label Bridge Defect Detection. In: Huang, DS., Chen, W., Pan, Y. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14869. Springer, Singapore. https://doi.org/10.1007/978-981-97-5603-2_4

  • DOI: https://doi.org/10.1007/978-981-97-5603-2_4

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-5602-5

  • Online ISBN: 978-981-97-5603-2

  • eBook Packages: Computer Science, Computer Science (R0)
