Acknowledgements
We thank X. Liu, A. Denniston and M. McCradden for helpful discussions. M.B. is funded through an Imperial College London President's PhD Scholarship. C.J. is supported by Microsoft Research and EPSRC through the Microsoft PhD Scholarship Programme. B.G. is supported through funding from the European Research Council under the European Union's Horizon 2020 research and innovation programme (grant agreement no. 757173, Project MIRA, ERC-2017-STG).
Author information
Contributions
The authors contributed equally to this work in terms of formulating the arguments, interpreting the available evidence and cowriting the manuscript.
Ethics declarations
Competing interests
B.G. is a part-time employee of HeartFlow and Kheiron Medical Technologies and holds stock options with both as part of the standard compensation package. M.B. and C.J. declare no competing interests.
About this article
Cite this article
Bernhardt, M., Jones, C. & Glocker, B. Potential sources of dataset bias complicate investigation of underdiagnosis by machine learning algorithms. Nat Med 28, 1157–1158 (2022). https://doi.org/10.1038/s41591-022-01846-8