Language models are an effective representation learning technique for electronic health record data

Published: 01 January 2021

Abstract

Widespread adoption of electronic health records (EHRs) has fueled the use of machine learning to build prediction models for various clinical outcomes. However, this process is often constrained by having a relatively small number of patient records for training the model. We demonstrate that patient representation schemes inspired by techniques in natural language processing can increase the accuracy of clinical prediction models by transferring information learned from the entire patient population to the task of training a specific model, for which only a subset of the population is relevant. Such patient representation schemes enable a 3.5% mean improvement in AUROC on five prediction tasks compared to standard baselines, with the average improvement rising to 19% when only a small number of patient records are available for training the clinical prediction model.
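The transfer-learning idea summarized above can be illustrated with a minimal sketch: pretrain medical-code embeddings on the full, largely unlabeled patient population, pool them into fixed-length patient vectors, and train a small supervised model on the labeled cohort. This is a simplified stand-in (word2vec-style code embeddings rather than the paper's language-model pretraining), and all data, identifiers, and helper names below are hypothetical.

```python
# Minimal sketch (not the authors' exact pipeline): learn medical-code embeddings on
# the whole population, mean-pool them into a patient vector, and train a downstream
# classifier on the much smaller labeled cohort. All data here are hypothetical.
import numpy as np
from gensim.models import Word2Vec
from sklearn.linear_model import LogisticRegression

# One time-ordered sequence of medical codes per patient (diagnoses, drugs, procedures).
population_sequences = [
    ["ICD10:E11.9", "RX:metformin", "ICD10:I10"],
    ["ICD10:J45.909", "RX:albuterol", "CPT:94010"],
    # ... the full patient population; outcome labels are not required at this stage
]

# Step 1: unsupervised pretraining on the entire population.
w2v = Word2Vec(sentences=population_sequences, vector_size=128,
               window=10, min_count=1, workers=4, epochs=5)

def patient_vector(codes):
    """Mean-pool the embeddings of a patient's codes into one fixed-length vector."""
    vecs = [w2v.wv[c] for c in codes if c in w2v.wv]
    return np.mean(vecs, axis=0) if vecs else np.zeros(w2v.vector_size)

# Step 2: supervised training on the labeled cohort for a specific clinical outcome.
labeled_sequences = population_sequences      # hypothetical labeled subset
labels = [1, 0]                               # hypothetical outcome labels
X = np.stack([patient_vector(seq) for seq in labeled_sequences])
clf = LogisticRegression(max_iter=1000).fit(X, labels)
```

Because the embedding step sees every patient record, the downstream classifier can benefit from population-level structure even when its own labeled training set is small.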

Highlights

Electronic health records are often used to predict clinical outcomes.
A primary factor limiting prediction quality is the limited amount of training data available.
We demonstrate a representation learning technique that works around this limitation.
Models trained on these representations offer superior performance in many settings (a sketch of such an evaluation follows below).
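The highlights and the abstract both emphasize the low-data regime. The sketch below (hypothetical, with random stand-in feature matrices) shows one way such a comparison could be run: train on progressively smaller labeled subsets and report held-out AUROC for a baseline feature set versus pretrained patient vectors.

```python
# Minimal sketch of a low-data evaluation: AUROC on a held-out set as the labeled
# training subset shrinks. The feature matrices are random stand-ins; in practice
# X_baseline would be bag-of-codes counts and X_pretrained the pooled patient
# vectors from a pretraining step like the one sketched earlier.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

def auroc_by_train_size(X, y, train_sizes=(100, 500, 2000)):
    """Fit on the first n training rows for each n and score AUROC on the test split."""
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, random_state=0, stratify=y)
    scores = {}
    for n in train_sizes:
        n = min(n, len(y_tr))
        clf = LogisticRegression(max_iter=1000).fit(X_tr[:n], y_tr[:n])
        scores[n] = roc_auc_score(y_te, clf.predict_proba(X_te)[:, 1])
    return scores

rng = np.random.default_rng(0)
y = rng.integers(0, 2, size=3000)                  # hypothetical outcome labels
X_baseline = rng.poisson(1.0, size=(3000, 300))    # stand-in for raw count features
X_pretrained = rng.normal(size=(3000, 128))        # stand-in for pretrained patient vectors
print(auroc_by_train_size(X_baseline, y))
print(auroc_by_train_size(X_pretrained, y))
```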

Published In

Journal of Biomedical Informatics, Volume 113, Issue C, January 2021, 364 pages

Publisher

Elsevier Science, San Diego, CA, United States

Author Tags

1. Electronic health record
2. Representation learning
3. Transfer learning
4. Risk stratification
5. Machine learning

Qualifiers

• Research-article

Cited By

• (2024) Clinical outcome prediction using observational supervision with electronic health records and audit logs. Journal of Biomedical Informatics 147:C, 10.1016/j.jbi.2023.104522. Online publication date: 1-Feb-2024.
• (2024) Machine learning for administrative health records. Artificial Intelligence in Medicine 144:C, 10.1016/j.artmed.2023.102642. Online publication date: 5-Jan-2024.
• (2023) Event stream GPT. Proceedings of the 37th International Conference on Neural Information Processing Systems, pp. 24322–24334, 10.5555/3666122.3667179. Online publication date: 10-Dec-2023.
• (2023) Patient Information Retrieval Based on BERT Variants and Clinical Texts in Electronic Medical Records. Proceedings of the 2023 7th International Conference on Big Data and Internet of Things, pp. 188–194, 10.1145/3617695.3617702. Online publication date: 11-Aug-2023.
• (2023) MedLink: De-Identified Patient Health Record Linkage. Proceedings of the 29th ACM SIGKDD Conference on Knowledge Discovery and Data Mining, pp. 2672–2682, 10.1145/3580305.3599427. Online publication date: 6-Aug-2023.
• (2023) Natural Language Processing in Electronic Health Records in relation to healthcare decision-making. Computers in Biology and Medicine 155:C, 10.1016/j.compbiomed.2023.106649. Online publication date: 1-Mar-2023.
• (2023) Multimodal LLMs for Health Grounded in Individual-Specific Data. Machine Learning for Multimodal Healthcare Data, pp. 86–102, 10.1007/978-3-031-47679-2_7. Online publication date: 29-Jul-2023.
• (2023) Soft Prompt Transfer for Zero-Shot and Few-Shot Learning in EHR Understanding. Advanced Data Mining and Applications, pp. 18–32, 10.1007/978-3-031-46671-7_2. Online publication date: 27-Aug-2023.
• (2023) Cost-Sensitive Best Subset Selection for Logistic Regression: A Mixed-Integer Conic Optimization Perspective. KI 2023: Advances in Artificial Intelligence, pp. 114–129, 10.1007/978-3-031-42608-7_10. Online publication date: 26-Sep-2023.
• (2021) Applying interpretable deep learning models to identify chronic cough patients using EHR data. Computer Methods and Programs in Biomedicine 210:C, 10.1016/j.cmpb.2021.106395. Online publication date: 1-Oct-2021.
