Abstract
In this study, we introduce ExBEHRT, an extended version of BEHRT (BERT applied to electronic health record data), and apply various algorithms to interpret its results. While BEHRT considers only diagnoses and patient age, we extend the feature space to several multi-modal records, namely demographics, clinical characteristics, vital signs, smoking status, diagnoses, procedures, medications and lab tests, by applying a novel method to unify the frequencies and temporal dimensions of the different features. We show that the additional features significantly improve model performance for various downstream tasks across different diseases. To ensure robustness, we interpret the model predictions using an adaptation of expected gradients, which has not previously been applied to transformers on EHR data and provides more granular interpretations than earlier approaches such as feature and token importances. Furthermore, by clustering the model's representations of oncology patients, we show that the model has an implicit understanding of the disease and is able to classify patients with the same cancer type into different risk groups. Given the additional features and interpretability, ExBEHRT can help make informed decisions about disease progression, diagnoses and risk factors of various diseases.
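The paper's specific adaptation of expected gradients is not detailed in this abstract. As a rough, hypothetical illustration only, the sketch below shows the generic expected-gradients attribution of Erion et al. applied to a transformer's input embeddings; `model` (a scalar-output predictor over embeddings), `embed`, `input_ids` and `baseline_ids` are placeholder names, not ExBEHRT's actual interface.

```python
# Hypothetical sketch of expected gradients over input embeddings, in the
# spirit of Erion et al.; the paper's exact adaptation for EHR transformers
# may differ. `model` maps embeddings to a scalar risk score and `embed`
# maps token ids to embeddings -- both are assumed placeholders.
import torch

def expected_gradients(model, embed, input_ids, baseline_ids, n_samples=50):
    x = embed(input_ids).detach()                 # (1, m, d) patient embedding
    total = torch.zeros_like(x)
    for _ in range(n_samples):
        # sample a baseline patient and an interpolation coefficient alpha
        idx = torch.randint(0, baseline_ids.size(0), (1,))
        x_base = embed(baseline_ids[idx]).detach()  # (1, m, d) reference embedding
        alpha = torch.rand(1)
        point = x_base + alpha * (x - x_base)
        point.requires_grad_(True)
        score = model(point).sum()                # scalar prediction
        grad, = torch.autograd.grad(score, point)
        total += (x - x_base) * grad              # Monte Carlo sample of the path integral
    return total / n_samples                      # per-token, per-dimension attributions
```

Summing the attributions over the embedding dimension then yields one importance score per input token, which is what makes this more granular than a single feature- or token-level importance.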
Notes
- 1. Binary classification of whether a patient had at least one prolonged length of stay in hospital (\(> 7\) days) during their journey.
- 2. A visualization of all clusters can be found in Fig. 8 in the appendix.
- 3. In the table, % of journey with cancer denotes the ratio of the time between a patient's first and last cancer diagnosis to the duration of the whole recorded patient journey. Cancer-free refers to the percentage of patients within a cluster who have records of at least two visits without a cancer diagnosis after their last visit with a cancer diagnosis. The average death rate comes directly from the EHR database and unfortunately does not include information on the cause of death. An illustrative computation of these two metrics is sketched after this list.
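As a purely illustrative sketch (not code from the paper), the footnote metrics could be derived from per-visit records roughly as follows; the column names `patient_id`, `visit_date` (a datetime) and `cancer_dx` (a boolean flag) are assumptions about the schema, not the actual database layout.

```python
# Hypothetical derivation of "% of journey with cancer" and "cancer-free"
# from per-visit records, matching the definitions in note 3.
import pandas as pd

def journey_metrics(visits: pd.DataFrame) -> pd.DataFrame:
    out = []
    for pid, g in visits.sort_values("visit_date").groupby("patient_id"):
        cancer = g[g["cancer_dx"]]
        if cancer.empty:
            continue
        journey_days = (g["visit_date"].iloc[-1] - g["visit_date"].iloc[0]).days or 1
        cancer_days = (cancer["visit_date"].iloc[-1] - cancer["visit_date"].iloc[0]).days
        # visits strictly after the last cancer diagnosis, without a cancer code
        after = g[g["visit_date"] > cancer["visit_date"].iloc[-1]]
        out.append({
            "patient_id": pid,
            "pct_journey_with_cancer": cancer_days / journey_days,
            "cancer_free": (~after["cancer_dx"]).sum() >= 2,
        })
    return pd.DataFrame(out)
```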
References
Azhir, A., et al.: Behrtday: Dynamic mortality risk prediction using time-variant COVID-19 patient specific trajectories. In: AMIA Annual Symposium Proceedings (2022)
Campello, R.J.G.B., Moulavi, D., Sander, J.: Density-based clustering based on hierarchical density estimates. In: Pei, J., Tseng, V.S., Cao, L., Motoda, H., Xu, G. (eds.) PAKDD 2013. LNCS (LNAI), vol. 7819, pp. 160–172. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-37456-2_14
Erion, G., Janizek, J.D., Sturmfels, P., Lundberg, S.M., Lee, S.I.: Improving performance of deep learning models with axiomatic attribution priors and expected gradients. Nat. Mach. Intell. 3, 620–631 (2021)
Kalyan, K.S., Rajasekharan, A., Sangeetha, S.: AMMU: a survey of transformer-based biomedical pretrained language models. J. Biomed. Inf. 126, 103982 (2022)
Lee, J., et al.: BioBERT: a pre-trained biomedical language representation model for biomedical text mining. Bioinformatics 36, 1234–1240 (2019)
Li, Y., et al.: Hi-BEHRT: hierarchical transformer-based model for accurate prediction of clinical events using multimodal longitudinal electronic health records. J. Biomed. Health Inf. 27, 1106–1117 (2021)
Li, Y., et al.: BEHRT: transformer for electronic health records. Sci. Rep. 10, 7155 (2020)
McInnes, L., Healy, J., Melville, J.: UMAP: uniform manifold approximation and projection for dimension reduction. J. Open Source Softw. (2018)
Meng, Y., Speier, W., Ong, M.K., Arnold, C.W.: Bidirectional representation learning from transformers using multimodal electronic health record data to predict depression. J. Biomed. Health Inf. 25, 3121–3129 (2021)
Pang, C., et al.: CEHR-BERT: incorporating temporal information from structured EHR data to improve prediction tasks. In: Proceedings of Machine Learning for Health (2021)
Poulain, R., Gupta, M., Beheshti, R.: Few-shot learning with semi-supervised transformers for electronic health records. In: Proceedings of Machine Learning Research, vol. 182 (2022)
Prakash, P., Chilukuri, S., Ranade, N., Viswanathan, S.: RareBERT: transformer architecture for rare disease patient identification using administrative claims. In: Proceedings of the AAAI Conference on Artificial Intelligence (2021)
Rao, S., et al.: An explainable transformer-based deep learning model for the prediction of incident heart failure. IEEE J. Biomed. Health Inf. 26, 3362–3372 (2022). https://doi.org/10.1109/JBHI.2022.3148820
Rasmy, L., Xiang, Y., Xie, Z., Tao, C., Zhi, D.: Med-BERT: pre-trained contextualized embeddings on large-scale structured electronic health records for disease prediction. NPJ Digit. Med. 4, 86 (2021)
Shang, J., Ma, T., Xiao, C., Sun, J.: Pre-training of graph augmented transformers for medication recommendation. Int. Joint Conf. Artif. Intell. (2019)
Vig, J.: A multiscale visualization of attention in the transformer model. In: ACL (2019)
Appendix
A sample input of ExBEHRT. Each concept has its own embedding, in which every token is mapped to a 288-dimensional vector learned during model training. After embedding, the concepts are summed element-wise to form a single \(288 \times m\) input for the model.
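As a minimal sketch of the summation described in this caption, the snippet below builds one learned embedding table per concept and sums the per-token embeddings element-wise. The concept names, vocabulary sizes and the class name `ConceptEmbedding` are illustrative assumptions, not ExBEHRT's actual implementation.

```python
# Illustrative summed multi-concept embedding; real ExBEHRT details may differ.
import torch
import torch.nn as nn

class ConceptEmbedding(nn.Module):
    def __init__(self, vocab_sizes: dict, dim: int = 288):
        super().__init__()
        # one learned embedding table per concept (diagnoses, age, labs, ...)
        self.tables = nn.ModuleDict(
            {name: nn.Embedding(size, dim) for name, size in vocab_sizes.items()}
        )

    def forward(self, tokens: dict) -> torch.Tensor:
        # each tokens[name] has shape (batch, m); the per-concept embeddings
        # are summed element-wise into a single (batch, m, 288) model input
        return sum(self.tables[name](ids) for name, ids in tokens.items())

emb = ConceptEmbedding({"diagnosis": 2000, "age": 120, "visit_segment": 3})
x = emb({
    "diagnosis": torch.randint(0, 2000, (1, 6)),
    "age": torch.randint(0, 120, (1, 6)),
    "visit_segment": torch.randint(0, 3, (1, 6)),
})
print(x.shape)  # torch.Size([1, 6, 288])
```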
Fig. 8. The unsupervised cluster assignments from HDBSCAN, visualized with a 2-dimensional UMAP projection. The gray points are patients not assigned to any cluster (10%). The labels indicate the most frequent diagnosis code of each cluster. Besides cluster 10, all labels are neoplasms. (Color figure online)
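The following is a hypothetical sketch of the pipeline implied by this caption: HDBSCAN clustering of patient representations, visualized with a 2-D UMAP projection. The array `patient_repr` stands in for ExBEHRT's learned patient embeddings, and the hyperparameters shown are assumptions, not those used in the paper.

```python
# Cluster patient representations with HDBSCAN and plot a 2-D UMAP projection.
import numpy as np
import umap
import hdbscan
import matplotlib.pyplot as plt

patient_repr = np.random.rand(5000, 288)           # placeholder for ExBEHRT patient embeddings

labels = hdbscan.HDBSCAN(min_cluster_size=50).fit_predict(patient_repr)
xy = umap.UMAP(n_components=2, random_state=0).fit_transform(patient_repr)

noise = labels == -1                                # patients not assigned to any cluster
plt.scatter(xy[noise, 0], xy[noise, 1], c="lightgray", s=2)
plt.scatter(xy[~noise, 0], xy[~noise, 1], c=labels[~noise], cmap="tab20", s=2)
plt.show()
```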
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Rupp, M., Peter, O., Pattipaka, T. (2023). ExBEHRT: Extended Transformer for Electronic Health Records. In: Chen, H., Luo, L. (eds) Trustworthy Machine Learning for Healthcare. TML4H 2023. Lecture Notes in Computer Science, vol 13932. Springer, Cham. https://doi.org/10.1007/978-3-031-39539-0_7
DOI: https://doi.org/10.1007/978-3-031-39539-0_7
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-39538-3
Online ISBN: 978-3-031-39539-0