research-article

Risk Prediction on Electronic Health Records with Prior Medical Knowledge

Authors:

Aidong Zhang

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

Pages 1910 - 1919

https://doi.org/10.1145/3219819.3220020

Published: 19 July 2018

Abstract

Predicting the risk of potential diseases from Electronic Health Records (EHR) has attracted considerable attention in recent years, especially with the development of deep learning techniques. Compared with traditional machine learning models, deep learning based approaches achieve superior performance on the risk prediction task. However, no existing work explicitly takes prior medical knowledge (such as the relationships between diseases and their corresponding risk factors) into account. In the medical domain, knowledge is usually represented by discrete and arbitrary rules. Thus, how to integrate such medical rules into existing risk prediction models to improve performance is a challenge. To tackle this challenge, we propose a novel and general framework called PRIME for the risk prediction task, which can successfully incorporate discrete prior medical knowledge into all of the state-of-the-art predictive models using the posterior regularization technique. Different from traditional posterior regularization, we do not need to manually set a bound for each piece of prior medical knowledge when modeling the desired distribution of the target disease on patients. Moreover, the proposed PRIME can automatically learn the importance of different prior knowledge with a log-linear model. Experimental results on three real medical datasets demonstrate the effectiveness of the proposed framework for the task of risk prediction.
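To make the idea in the abstract concrete, here is a minimal sketch of posterior regularization with a log-linear desired distribution. It is an illustration under stated assumptions, not the paper's implementation: the function names, the toy rule features, the fixed rule weights, the KL direction, and the trade-off parameter `lam` are all hypothetical choices made for this example. In the framework described above, the rule weights would be learned jointly with the predictive model rather than set by hand.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def desired_distribution(features, weights):
    """Log-linear desired distribution q(y|x) proportional to exp(sum_k w_k * f_k(x, y)).

    features[y][k] is the value of (hypothetical) rule feature k for label y;
    weights[k] is the importance assigned to rule k.
    """
    scores = [sum(w * f for w, f in zip(weights, feats)) for feats in features]
    return softmax(scores)


def kl_divergence(q, p, eps=1e-12):
    """KL(q || p) for two discrete distributions over the same labels."""
    return sum(qi * math.log((qi + eps) / (pi + eps)) for qi, pi in zip(q, p))


def regularized_loss(p, label, features, weights, lam=0.5):
    """Negative log-likelihood plus a posterior-regularization penalty.

    Assumption: the penalty is lam * KL(q || p), pulling the model's
    prediction p toward the rule-based desired distribution q.
    """
    nll = -math.log(p[label] + 1e-12)
    q = desired_distribution(features, weights)
    return nll + lam * kl_divergence(q, p)


# Toy patient with labels [no disease, disease].
p_model = [0.7, 0.3]                      # prediction from any base model
rule_features = [[0.0, 0.0], [1.0, 1.0]]  # two rules both fire for "disease"
rule_weights = [0.8, 0.4]                 # hand-set here; learned in practice

q = desired_distribution(rule_features, rule_weights)
loss = regularized_loss(p_model, 1, rule_features, rule_weights)
print("desired distribution:", q)
print("regularized loss:", loss)
```

Because the desired distribution is a normalized log-linear model over rule features, no per-rule bound has to be specified by hand; only the rule weights determine how strongly each piece of knowledge shifts the target distribution.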

Supplementary Material

MP4 File (ma_risk_eletronic.mp4)

Index Terms

Risk Prediction on Electronic Health Records with Prior Medical Knowledge
1. Applied computing
  1. Life and medical sciences
    1. Health informatics
2. Information systems
  1. Information systems applications
    1. Data mining

Published In

KDD '18: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery & Data Mining

July 2018

2925 pages

ISBN: 9781450355520

DOI: 10.1145/3219819

General Chairs: Yike Guo (Imperial College London), Faisal Farooq (IBM)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

National Science Foundation

Conference

KDD '18

KDD '18: The 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining

August 19 - 23, 2018

London, United Kingdom

Acceptance Rates

KDD '18 Paper Acceptance Rate 107 of 983 submissions, 11%;

Overall Acceptance Rate 1,133 of 8,635 submissions, 13%
