  • Perspective

Designing clinically translatable artificial intelligence systems for high-dimensional medical imaging

Abstract

The National Institutes of Health in 2018 identified key focus areas for the future of artificial intelligence in medical imaging, creating a foundational roadmap for research in image acquisition, algorithms, data standardization and translatable clinical decision support systems. Among the key issues raised in the report, data availability, the need for novel computing architectures and explainable artificial intelligence algorithms are still relevant, despite the tremendous progress made over the past few years alone. Furthermore, translational goals of data sharing, validation of performance for regulatory approval, generalizability and mitigation of unintended bias must be accounted for early in the development process. In this Perspective, we explore challenges unique to high-dimensional clinical imaging data, in addition to highlighting some of the technical and ethical considerations involved in developing machine learning systems that better represent the high-dimensional nature of many imaging modalities. Furthermore, we argue that methods that attempt to address explainability, uncertainty and bias should be treated as core components of any clinical machine learning system.

Fig. 1: Cloud-based collaborative annotation workflows.
Fig. 2: Quantifying uncertainty in machine learning outputs.
Fig. 3: Misleading nature of post-hoc model explanations.

Acknowledgements

R.S. was supported in part by the American Heart Association Postdoctoral Fellowship Award (grant number 834986).

Author information

Corresponding author

Correspondence to William Hiesinger.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Peer review information Nature Machine Intelligence thanks Pearse Keane, Yipeng Hu and the other, anonymous, reviewer(s) for their contribution to the peer review of this work.

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Shad, R., Cunningham, J.P., Ashley, E.A. et al. Designing clinically translatable artificial intelligence systems for high-dimensional medical imaging. Nat Mach Intell 3, 929–935 (2021). https://doi.org/10.1038/s42256-021-00399-8

  • DOI: https://doi.org/10.1038/s42256-021-00399-8
