Abstract
Defect detection systems based on machine vision are widely used as an essential part of intelligent manufacturing systems. However, in traditional object detection methods that take images as input, variation in defect area, blurred images, and complex background interference can seriously impair detection accuracy. To address these challenges, this paper proposes a dual-path neural network based on the shearlet transform (STDPNet), which exploits the strengths of the shearlet transform in multi-scale analysis and combines it with an improved object detection algorithm. First, images are decomposed at multiple scales and in multiple directions with the shearlet transform, and the multi-directional sub-band information, rather than the image itself, is fed to the detection network. Then, a dual-path object detection network is designed to account for the differences between frequency bands, and a transfer learning strategy between the paths is introduced to improve model performance. Finally, results on the public NEU surface defect dataset show that STDPNet achieves a mean average precision of 86.81% at a detection speed of 44.45 f/s, which exceeds that of Faster R-CNN by 12%. Experiments on different datasets show that its accuracy is significantly higher than that of other models, and that the proposed method is particularly advantageous for large, fuzzy, and indistinguishable defect types.
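The headline metric above is mean average precision (mAP). As a minimal sketch of how such a figure is typically computed, the Python below ranks the detections of each class by confidence, integrates the precision-recall curve, and averages over classes. It assumes detections have already been matched to ground-truth boxes (for example by an IoU threshold) and uses all-point interpolation; the function names, the toy classes, and the interpolation choice are illustrative assumptions, not the evaluation protocol stated in the paper.

```python
def average_precision(scored_hits, n_ground_truth):
    """Average precision for one class (all-point interpolation).

    scored_hits: list of (confidence, is_true_positive) pairs, one per
    detection. Matching to ground truth is assumed to be done already.
    n_ground_truth: number of ground-truth objects of this class.
    """
    if n_ground_truth == 0:
        return 0.0
    # Rank detections from most to least confident.
    ranked = sorted(scored_hits, key=lambda d: d[0], reverse=True)
    true_positives = 0
    precisions, recalls = [], []
    for rank, (_, is_hit) in enumerate(ranked, start=1):
        if is_hit:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_ground_truth)
    # Replace each precision by the best precision at any higher recall,
    # which makes the curve monotonically non-increasing.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])
    # Sum the area under the stepwise precision-recall curve.
    area, previous_recall = 0.0, 0.0
    for precision, recall in zip(precisions, recalls):
        area += (recall - previous_recall) * precision
        previous_recall = recall
    return area


def mean_average_precision(per_class):
    """per_class maps a class name to (scored_hits, n_ground_truth)."""
    scores = [average_precision(hits, n) for hits, n in per_class.values()]
    return sum(scores) / len(scores)


# Toy data with hypothetical classes, not results from the paper.
toy = {
    "class_a": ([(0.9, True), (0.8, False), (0.7, True)], 2),
    "class_b": ([(0.6, True)], 1),
}
print(f"mAP = {mean_average_precision(toy):.4f}")
```

A reported mAP also depends on the IoU threshold used for matching and on the interpolation scheme, so values computed under different conventions are not directly comparable.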
Data availability
The datasets generated during and/or analyzed during the current study are available from the corresponding author on reasonable request.
Acknowledgements
This study was supported by the National Science Foundation of China (No. 51975130) and the Basic Scientific Research Project of the Education Department of Liaoning Province (LJKMZ20220915).
Author information
Contributions
DA conceptualized the study; RH helped in methodology and writing—original draft preparation; RH, LF were involved in formal analysis and investigation; RH, ZL contributed to writing—review and editing; DA, PZ acquired the funding; ZC helped in resources; DA, PZ supervised the study.
Ethics declarations
Conflict of interest
All authors certify that they have no affiliations with or involvement in any organization or entity with any financial interest or non-financial interest in the subject matter or materials discussed in this manuscript.
About this article
Cite this article
An, D., Hu, R., Fan, L. et al. STDPNet: a dual-path surface defect detection neural network based on shearlet transform. Vis Comput 40, 5841–5856 (2024). https://doi.org/10.1007/s00371-023-03139-8
DOI: https://doi.org/10.1007/s00371-023-03139-8