Shedding Light on the Black Box: Explaining Deep Neural Network Prediction of Clinical Outcomes

  • Image & Signal Processing
  • Journal of Medical Systems

Abstract

Deep neural network models are emerging as an important method in healthcare delivery, following their recent success in other domains such as image recognition. Because of their multiple non-linear inner transformations, deep neural networks are viewed by many as black boxes. For practical use, deep learning models require explanations that are intuitive to clinicians. In this study, we developed a deep neural network model to predict outcomes following major cardiovascular procedures, using a temporal image representation of past medical history as input. We created a novel explanation for the model's predictions by defining impact scores that associate clinical observations with the outcome. For comparison, a logistic regression model was fitted to the same dataset. We compared the impact scores with the log odds ratios by calculating three types of correlations, which provided a partial validation of the impact scores. The deep neural network model achieved an area under the receiver operating characteristic curve (AUC) of 0.787, compared to 0.746 for the logistic regression model. Moderate correlations were found between the impact scores and the log odds ratios. Impact scores generated by the explanation algorithm have the potential to shed light on the “black box” deep neural network model and could facilitate its adoption by clinicians.
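
The two evaluation ideas mentioned in the abstract, AUC for model discrimination and correlation between per-variable impact scores and log odds ratios, can be illustrated with a minimal sketch. The abstract does not name the three correlation types, so Pearson and Spearman are used here purely as illustrative assumptions. The function names and the toy values are hypothetical and are not taken from the study.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = mean_rank
        i = j + 1
    return result


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


def auc(labels, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Hypothetical toy values, one entry per clinical variable.
    impact_scores = [0.8, -0.2, 0.5, 0.1, -0.6]
    log_odds_ratios = [1.1, -0.4, 0.3, 0.2, -0.9]
    print("Pearson: ", pearson(impact_scores, log_odds_ratios))
    print("Spearman:", spearman(impact_scores, log_odds_ratios))

    # Hypothetical toy outcomes and predicted risks for four patients.
    print("AUC:     ", auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))
```

The pairwise AUC above is quadratic in the number of patients, which is acceptable for a small illustration; a rank-based formula would be the usual choice for larger cohorts.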



Acknowledgments

This study was funded by: NIH grant R56 (AG052536-01A1); The Clinical and Translational Science Institute at Children’s National (CTSI-CN) through the NIH Clinical and Translational Science Award (CTSA) program (UL1TR001876); CREATE: A VHA NLP Software Ecosystem for Collaborative Development and Integration (#CRE 12–315); Veterans Health Administration Health Services Research & Development (# CRE 12-321); Career Development Award from the NHLBI (K08HL136850).

Author information

Corresponding author

Correspondence to Yijun Shao.

Ethics declarations

Conflict of interests

All authors declare that they have no conflicts of interest.

Ethical approval

All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Declaration of Helsinki and its later amendments or comparable ethical standards. This article does not contain any studies with animals performed by any of the authors.

Informed consent

Informed consent was obtained from all individual participants included in the study.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

This article is part of the Topical Collection on Image & Signal Processing

Appendix

Table A List of all variables used in DNN model and the mean and standard deviation of the impact scores over all patients

Cite this article

Shao, Y., Cheng, Y., Shah, R.U. et al. Shedding Light on the Black Box: Explaining Deep Neural Network Prediction of Clinical Outcomes. J Med Syst 45, 5 (2021). https://doi.org/10.1007/s10916-020-01701-8

  • DOI: https://doi.org/10.1007/s10916-020-01701-8
