Clinical Dialogue Transcription Error Correction with Self-supervision

Authors: Gayani Nanayakkara, Nirmalie Wiratunga, Anjana Wijekoon

Artificial Intelligence XL: 43rd SGAI International Conference on Artificial Intelligence, AI 2023, Cambridge, UK, December 12–14, 2023, Proceedings

Pages 33–46

https://doi.org/10.1007/978-3-031-47994-6_3

Published: 12 December 2023

Abstract

A clinical dialogue is a conversation between a clinician and a patient to share medical information, which is critical in clinical decision-making. Relying on manual note-taking is highly inefficient and leads to transcription errors when the notes are digitised. Speech-to-text applications built on Automatic Speech Recognition (ASR) can potentially overcome these errors using post-ASR error correction. Pre-trained language models are increasingly used in this area. However, their performance suffers from the lack of domain-specific vocabulary and the mismatch between error correction and pre-training objectives. This research explores these challenges in the gastrointestinal specialism by introducing self-supervision strategies to fine-tune pre-trained language models for clinical dialogue error correction. We show that our mask-filling objective specialised for the medical domain (med-mask-filling) outperforms the best-performing commercial ASR system by 10.27%.
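The abstract does not spell out how med-mask-filling builds its training data, so the following is only a minimal sketch of one plausible reading: self-supervised pairs are made by masking domain terms in clean text, and a model is then fine-tuned to restore them. The vocabulary `MEDICAL_TERMS`, the function `make_med_mask_pair`, and the BART-style `<mask>` token are all assumptions made for illustration, not details taken from the paper.

```python
import re

# Hypothetical domain vocabulary; the paper's actual term source is not stated in the abstract.
MEDICAL_TERMS = {"colonoscopy", "gastritis", "omeprazole"}
MASK_TOKEN = "<mask>"  # assumed BART-style mask token


def make_med_mask_pair(sentence, medical_terms=MEDICAL_TERMS, mask_token=MASK_TOKEN):
    """Return (masked_input, target) with every medical term replaced by the mask token."""
    # Split into word tokens and standalone punctuation marks.
    tokens = re.findall(r"\w+|[^\w\s]", sentence)
    # Mask only tokens found in the (assumed) medical vocabulary.
    masked = [mask_token if t.lower() in medical_terms else t for t in tokens]
    return " ".join(masked), " ".join(tokens)


masked, target = make_med_mask_pair("The patient had a colonoscopy after gastritis.")
print(masked)  # model input with medical terms hidden
print(target)  # text the model would be trained to reconstruct
```

Masking only domain terms, rather than random tokens, is one way to focus fine-tuning on the vocabulary that general-purpose models lack; whether the authors mask in exactly this way is not confirmed by the abstract.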

Published In

Artificial Intelligence XL: 43rd SGAI International Conference on Artificial Intelligence, AI 2023, Cambridge, UK, December 12–14, 2023, Proceedings

Dec 2023

524 pages

ISBN: 978-3-031-47993-9

DOI: 10.1007/978-3-031-47994-6

Editors:
Max Bramer (University of Portsmouth, Portsmouth, UK; https://ror.org/03ykbk197),
Frederic Stahl (DFKI: German Research Center for Artificial Intelligence, Oldenburg, Germany; https://ror.org/01ayc5b57)

© The Author(s), under exclusive license to Springer Nature Switzerland AG 2023.

Publisher

Springer-Verlag

Berlin, Heidelberg
