Deep transformer networks for precise pothole segmentation tasks

Authors: Iason Katsamenis, Athanasios Sakelliou, Nikolaos Bakalos, Eftychios Protopapadakis, Christos Klaridopoulos, Nikolaos Frangakis, Matthaios Bimpas, Dimitris Kalogeras

PETRA '23: Proceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments

Pages 596 - 602

https://doi.org/10.1145/3594806.3596560

Published: 10 August 2023

Abstract

Potholes on the road surface are a significant safety hazard and can cause severe damage to vehicles. Identifying and repairing potholes is a challenging task that requires efficient and accurate methods. In recent years, deep learning models, such as U-Nets and transformers, have been used for image segmentation tasks with promising results. This paper proposes a transformer-based model, specifically the SegFormer framework, for pothole segmentation using high-resolution images captured from a road inspection vehicle. The proposed network outperformed the traditional U-Net model, which demonstrates state-of-the-art performance in various segmentation tasks, and achieved an average F1-score close to 80%. The results show that the proposed method can effectively identify and localize potholes, providing a useful auxiliary tool for road maintenance and safety.
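
As an illustration of the F1-score metric named in the abstract, the following is a minimal sketch of a pixel-wise F1 computation for binary pothole masks. It is not the authors' evaluation code. The abstract does not state how the average is taken, so averaging per-image scores over the test images is an assumption, as are the function names `pixel_f1` and `mean_f1` and the convention of scoring an image with no pothole pixels in either mask as 1.0.

```python
from typing import Sequence

# A 2D binary mask: 1 marks a pothole pixel, 0 marks background.
Mask = Sequence[Sequence[int]]


def pixel_f1(pred: Mask, target: Mask) -> float:
    """Pixel-wise F1 for one image: 2*TP / (2*TP + FP + FN)."""
    tp = fp = fn = 0
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p and t:
                tp += 1  # predicted pothole, truly pothole
            elif p:
                fp += 1  # predicted pothole, truly background
            elif t:
                fn += 1  # missed pothole pixel
    denom = 2 * tp + fp + fn
    # Assumed convention: no pothole pixels in either mask counts as a perfect score.
    return 1.0 if denom == 0 else 2 * tp / denom


def mean_f1(preds: Sequence[Mask], targets: Sequence[Mask]) -> float:
    """Average of per-image F1 scores (assumed averaging scheme)."""
    scores = [pixel_f1(p, t) for p, t in zip(preds, targets)]
    return sum(scores) / len(scores)


# Hypothetical 2x3 masks, for illustration only.
example_pred = [[1, 1, 0], [0, 1, 0]]
example_target = [[1, 0, 0], [0, 1, 1]]
example_score = pixel_f1(example_pred, example_target)  # TP=2, FP=1, FN=1
```

With two true positives, one false positive, and one false negative, the example evaluates to 4/6, roughly 0.67. A score "close to 80%" under this definition would mean the overlap term 2*TP makes up about four fifths of the denominator.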

Index Terms

Deep transformer networks for precise pothole segmentation tasks
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Image segmentation
        Object detection
        Object identification
  2. Machine learning
    1. Machine learning approaches
      1. Learning latent representations
      2. Neural networks

Index terms have been assigned to the content through auto-classification.

Published In

PETRA '23: Proceedings of the 16th International Conference on PErvasive Technologies Related to Assistive Environments

July 2023

797 pages

ISBN: 9798400700699

DOI: 10.1145/3594806

Copyright © 2023 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

European Union's Horizon 2020 Research and Innovation Programme

Conference

PETRA '23

July 5 - 7, 2023

Corfu, Greece
