DOI: 10.1007/978-3-031-47994-6_3
Article

Clinical Dialogue Transcription Error Correction with Self-supervision

Published: 12 December 2023

Abstract

A clinical dialogue is a conversation between a clinician and a patient to share medical information, which is critical in clinical decision-making. Reliance on manual note-taking is highly inefficient and leads to transcription errors when notes are digitised. Speech-to-text applications built on Automatic Speech Recognition (ASR) can potentially overcome these errors by applying post-ASR error correction. Pre-trained language models are increasingly used for this task. However, their performance suffers from a lack of domain-specific vocabulary and from the mismatch between the error correction task and their pre-training objectives. This research explores these challenges in the gastrointestinal specialism by introducing self-supervision strategies to fine-tune pre-trained language models for clinical dialogue error correction. We show that our mask-filling objective specialised for the medical domain (med-mask-filling) outperforms the best-performing commercial ASR system by 10.27%.
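
The abstract does not spell out how med-mask-filling training examples are built, so the following is only a minimal sketch of one plausible reading: mask the medical terms in a clean sentence and ask the model to reconstruct the original. The term list, the `<mask>` token, and the function name `med_mask` are all hypothetical and chosen for illustration; they are not taken from the paper.

```python
import re

# Hypothetical mini-vocabulary of gastrointestinal terms (an assumption for
# illustration; a real system would need a proper medical term source).
MEDICAL_TERMS = {"colonoscopy", "endoscopy", "omeprazole", "gastritis"}

# Hypothetical mask token; the actual token depends on the language model used.
MASK_TOKEN = "<mask>"


def med_mask(sentence, terms=MEDICAL_TERMS, mask_token=MASK_TOKEN):
    """Build one self-supervised pair: (masked input, original target).

    Every alphabetic word found in `terms` (case-insensitive) is replaced
    by `mask_token`; all other text is left untouched.
    """
    def replace(match):
        word = match.group(0)
        return mask_token if word.lower() in terms else word

    masked = re.sub(r"[A-Za-z]+", replace, sentence)
    return masked, sentence


masked, target = med_mask(
    "The patient had a colonoscopy after starting omeprazole."
)
print(masked)  # The patient had a <mask> after starting <mask>.
print(target)  # The patient had a colonoscopy after starting omeprazole.
```

Pairs of this shape could then be used to fine-tune a sequence-to-sequence model to fill in domain vocabulary, which is the kind of objective the abstract contrasts with general-purpose pre-training.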

Published In

Artificial Intelligence XL: 43rd SGAI International Conference on Artificial Intelligence, AI 2023, Cambridge, UK, December 12–14, 2023, Proceedings
Dec 2023
524 pages
ISBN: 978-3-031-47993-9
DOI: 10.1007/978-3-031-47994-6
Editors: Max Bramer, Frederic Stahl

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. Automatic Speech Recognition
  2. Error Correction
  3. Language Models

Qualifiers

  • Article
