MOD-YOLO: Improved YOLOv5 Based on Multi-softmax and Omni-Dimensional Dynamic Convolution for Multi-label Bridge Defect Detection

He, Xinyi; Ma, Ping; Chen, Yiyang; Liu, Yuan

doi:10.1007/978-981-97-5603-2_4

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 14869)

Included in the following conference series:

International Conference on Intelligent Computing

Abstract

Object detection methods provide efficient and accurate solutions for defect detection in industrial production and quality control. Because bridge defects of different categories often overlap, detection methods that assign a single label to each target struggle to detect bridge defects accurately. This paper proposes MOD-YOLO, a bridge defect detection scheme built on YOLOv5 that combines the proposed multi-softmax classification loss function with omni-dimensional dynamic convolution (ODConv). Evaluated on the CODEBRIM dataset, MOD-YOLO achieves the highest performance among the compared state-of-the-art (SOTA) models, including the YOLO series and transformer-based detectors.
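
The single-label limitation described in the abstract can be illustrated with a small sketch. This is not the paper's multi-softmax loss, whose definition is not given on this page; it only contrasts a single softmax over all classes with independent per-class sigmoid scores, a common multi-label baseline. The class names, logits, and the 0.5 threshold are assumptions chosen for illustration.

```python
import math

def softmax(logits):
    """Numerically stable softmax; the outputs always sum to 1."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def sigmoid(z):
    """Independent per-class score in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))

# Hypothetical defect classes and logits for one region where two defects overlap.
CLASSES = ["crack", "spallation", "efflorescence", "exposed_bars", "corrosion"]
logits = [4.0, 4.0, -3.0, -3.0, -3.0]

single_label = softmax(logits)              # classes compete for one unit of probability
multi_label = [sigmoid(z) for z in logits]  # each class is scored on its own

def predicted(probs, threshold=0.5):
    """Return the classes whose score exceeds the (assumed) threshold."""
    return [c for c, p in zip(CLASSES, probs) if p > threshold]

print(predicted(single_label))  # both strong classes split the mass, so neither passes 0.5
print(predicted(multi_label))   # both overlapping defects are reported
```

With two equally strong classes, the softmax can give each of them at most half of the probability mass, so a single-label head cannot report both with high confidence, whereas independent scores can.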

Author information

Authors and Affiliations

School of Artificial Intelligence and Computer Science, Jiangnan University, Wuxi, China
Xinyi He, Ping Ma & Yuan Liu
School of Mechanical and Electrical Engineering, Soochow University, Suzhou, China
Yiyang Chen

Corresponding author

Correspondence to Ping Ma.

Editor information

Editors and Affiliations

Eastern Institute of Technology, Ningbo, China
De-Shuang Huang
China University of Mining and Technology, Xuzhou, China
Wei Chen
Eastern Institute of Technology, Ningbo, China
Yijie Pan

About this paper

Cite this paper

He, X., Ma, P., Chen, Y., Liu, Y. (2024). MOD-YOLO: Improved YOLOv5 Based on Multi-softmax and Omni-Dimensional Dynamic Convolution for Multi-label Bridge Defect Detection. In: Huang, DS., Chen, W., Pan, Y. (eds) Advanced Intelligent Computing Technology and Applications. ICIC 2024. Lecture Notes in Computer Science, vol 14869. Springer, Singapore. https://doi.org/10.1007/978-981-97-5603-2_4

DOI: https://doi.org/10.1007/978-981-97-5603-2_4
Published: 01 August 2024
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-5602-5
Online ISBN: 978-981-97-5603-2
eBook Packages: Computer Science, Computer Science (R0)
