Joakim Edin


2024

An Unsupervised Approach to Achieve Supervised-Level Explainability in Healthcare Records
Joakim Edin | Maria Maistro | Lars Maaløe | Lasse Borgholt | Jakob Drachmann Havtorn | Tuukka Ruotsalo
Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing

Electronic healthcare records are vital for patient safety as they document conditions, plans, and procedures in both free text and medical codes. Language models have significantly enhanced the processing of such records, streamlining workflows and reducing manual data entry, thereby saving healthcare providers significant resources. However, the black-box nature of these models often leaves healthcare professionals hesitant to trust them. State-of-the-art explainability methods increase model transparency but rely on human-annotated evidence spans, which are costly. In this study, we propose an approach to produce plausible and faithful explanations without needing such annotations. We demonstrate on the automated medical coding task that adversarial robustness training improves explanation plausibility and introduce AttInGrad, a new explanation method superior to previous ones. By combining both contributions in a fully unsupervised setup, we produce explanations of comparable quality, or better, to that of a supervised approach. We release our code and model weights.

2020

MultiQT: Multimodal learning for real-time question tracking in speech
Jakob D. Havtorn | Jan Latko | Joakim Edin | Lars Maaløe | Lasse Borgholt | Lorenzo Belgrano | Nicolai Jacobsen | Regitze Sdun | Željko Agić
Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics

We address a challenging and practical task of labeling questions in speech in real time during telephone calls to emergency medical services in English, which embeds within a broader decision support system for emergency call-takers. We propose a novel multimodal approach to real-time sequence labeling in speech. Our model treats speech and its own textual representation as two separate modalities or views, as it jointly learns from streamed audio and its noisy transcription into text via automatic speech recognition. Our results show significant gains of jointly learning from the two modalities when compared to text or audio only, under adverse noise and limited volume of training data. The results generalize to medical symptoms detection where we observe a similar pattern of improvements with multimodal learning.