Introduction
The potential applications of artificial intelligence and machine learning (AI/ML) in medicine are progressing rapidly. AI is a broad term for the capability of computer and software systems to perform tasks typically associated with human intelligence, while ML is a subset of AI in which computers learn from data through pattern-recognition methods, including artificial neural networks. Radiology is a frontrunner in this space: between 2015 and early 2020, 129 radiology AI/ML devices received regulatory clearance from the United States Food and Drug Administration (FDA), and 126 devices received the Conformité Européenne (CE) mark in Europe1. These approvals are only accelerating, with the FDA clearing 126 radiology AI/ML devices in the twelve months to July 20222. Both the speed and volume of AI/ML devices present a delicate balance for regulatory bodies: ensuring the safety and effectiveness of devices while keeping pace with the clinical innovation and value that they may provide.
Here we discuss the current and future regulatory landscapes of AI/ML in radiology and highlight pressing challenges that regulatory bodies must navigate. Other medical specialties will soon face similar hurdles as AI/ML becomes increasingly ubiquitous, and they may benefit from considering these challenges proactively.
Current regulatory landscape
Like other medical regulations, those for radiology AI/ML use a risk-based approach that considers safety and effectiveness. The approach varies between jurisdictions (Table 1), with the points of difference reflecting many of the most challenging areas to regulate.
The US FDA regulates medical software as devices, as defined in section 201 of the Food, Drug, and Cosmetic Act3, and has classified most radiology AI/ML devices as Class II2. Class II classification indicates that a device is moderate-risk and requires “special controls” specific to the device to assure safety and effectiveness. A novel device is authorized through a de novo request that identifies these special controls, while a less burdensome 510(k) request allows clearance of subsequent devices considered substantially equivalent to a “predicate” device. This process differs from the more intensive premarket approval process used for high-risk Class III devices4. As an example of the de novo/510(k) pathways, Viz.ai was granted a de novo request for an acute stroke large vessel occlusion (LVO) detection device in February 2018 under the newly created regulation number 892.2080 with the designated QAS product code5. This device is considered radiological computer-aided triage and notification software, and one of the special controls involves demonstrating how the device will provide effective triage. By July 2022, 30 devices had received subsequent 510(k) clearance under the QAS product code; a related product code, QFM, was also created under regulation number 892.2080, with the first QFM device using the Viz.ai device as a predicate2,6.
The European Union (EU) Medical Device Regulation 2017/745 (MDR) regulates medical software as active devices, as defined in Article 2 of the MDR7. Unlike the centralized FDA approval process, the EU market approval (CE mark) process for devices occurs in a decentralized manner through one of ~40 Notified Bodies. This approval then covers market access for the whole EU. Classification is guided by Rule 11 in Annex VIII of the MDR, which focuses on the intended purpose of the device and states that software “intended to provide information which is used to take decisions with diagnosis or therapeutic purposes” is a Class IIa device. There are exceptions, including designating devices as Class IIb or III if such decisions may cause a serious deterioration in health or death, or when the software monitors vital physiological parameters in certain situations. Devices are then considered for approval within this classification; there is no further breakdown into regulation numbers or product codes like the de novo and 510(k) request pathways of the FDA. While the EU has created a central database called the European Database on Medical Devices (EUDAMED), its full functionality has been delayed and mandatory use is not yet enforced8. The aforementioned Viz.ai LVO algorithm has received clearance as a device in the EU9; while EUDAMED lists Viz.ai as a manufacturer, it does not yet list any of its devices10.
A benefit of having the de novo and 510(k) pathways is that the burden of the regulatory approval process can vary based on the incremental risk of a device compared with other available devices. Manufacturers can also take advantage of clearer expectations of device features, including performance metrics from predicate devices. The 510(k) pathway has, however, been criticized for the divergence between the AI/ML tasks performed by devices and those of their predicates11,12. At a broader level, 510(k) devices are the most recalled medical devices, which has raised concerns about the pathway; it is also possible to use a predicate device that has itself been recalled, with descendant devices having a higher risk of their own recall13,14. In addition, the less burdensome nature of the 510(k) pathway may cause manufacturers to opt for 510(k) clearance over a de novo approach, potentially leading them to curtail innovative features that extend beyond a predicate device.
While the MDR may impart its own limitations on a device, its processes typically allow a manufacturer to obtain regulatory approval for broader features in a less onerous manner than the FDA. This approach is exemplified in “comprehensive” chest radiograph algorithms from Annalise.ai, Lunit, and Qure.ai. The CE-marked versions of these algorithms detect 124, 10, and 15 different chest radiographic findings, respectively15,16,17. In contrast, the FDA has cleared these same algorithms for just 5, 2, and 1 findings, respectively. Furthermore, while the Annalise.ai and Lunit FDA-cleared devices are limited to providing binary triage information (e.g., pleural effusion present or absent), the CE-marked versions of the devices can provide localization information such as heat maps (Fig. 1).
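To make the contrast concrete, the following minimal Python sketch shows how the same underlying model output can support either output style. The probability map, finding name, and 0.5 threshold are hypothetical illustrations, not any vendor’s actual interface or cleared configuration.

```python
import numpy as np

def triage_output(probability_map: np.ndarray, threshold: float = 0.5) -> dict:
    """Reduce a per-pixel probability map to the two output styles
    discussed above. The 0.5 threshold is illustrative, not a
    regulatory or vendor-specified value."""
    finding_present = bool(probability_map.max() >= threshold)

    # Binary triage output: a single present/absent flag per finding.
    binary = {"pleural_effusion_present": finding_present}

    # Localization output: the full probability map, which a viewer
    # can render as a heat-map overlay on the radiograph.
    localization = {"heat_map": probability_map}

    return {"binary": binary, "localization": localization}

# Hypothetical per-pixel probabilities from a segmentation-style model.
rng = np.random.default_rng(0)
prob_map = rng.random((512, 512)) * 0.4  # mostly low probabilities
prob_map[300:360, 100:200] = 0.9         # a confident focal region

result = triage_output(prob_map)
print(result["binary"])                           # {'pleural_effusion_present': True}
print(result["localization"]["heat_map"].shape)   # (512, 512)
```

As the examples above show, the same algorithm may be cleared for the narrower binary claim in one jurisdiction while the broader localization claim is covered in another.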
Future regulatory landscape
As AI/ML devices increase in use and complexity, regulatory approaches will need to evolve rapidly to address these and further challenges. Recent developments in generative AI/ML, especially the release of the large multimodal model GPT-4 from OpenAI, underscore how quickly these advances can occur18.
The FDA released a discussion paper proposing a regulatory framework in 201919, an action plan in 202120, and draft guidance on Predetermined Change Control Plans (PCCPs) for device modifications in 202321. These documents describe important areas, including good machine learning practice, algorithm bias and robustness, continuous learning (software that adapts incrementally over time), and assessment of real-world performance. The pace of innovation is clear, though none of these documents reference “generative AI” or “large language model” specifically. While the principles that they do reference will remain, there will be many additional considerations for generative AI, especially as possible device inputs and outputs increase from a limited set to a much larger (potentially infinite) set22,23. Separately, the FDA has attempted to strengthen the 510(k) process more broadly, although further improvements have been proposed, including establishing more robust performance criteria and related testing methods24.
Medical uses of AI are also covered in the EU’s flagship regulatory proposal, the AI Act. Designed to both foster innovation and protect citizens’ fundamental rights, this proposed Act would distinguish between different risk categories, imposing regulations ranging from notification requirements to outright bans for each tier. While detailed provisions are still subject to amendment and intensive negotiations, Chapter 3 lays down the obligations of providers and users of “High-Risk AI Systems”. It builds on and is closely linked to existing product regulation approaches, including conformity assessments, under the “New Legislative Framework”25.
Key regulatory challenges remain, and we consider several below. In doing so, we also consider the lessons that can be learned from parallels with pharmaceutical regulation.
Enhancing post-market surveillance
The FDA and MDR approval processes for radiology AI/ML devices are mostly based on model performance with retrospective data. By nature, these data cannot encompass all situations a device may encounter following clearance, nor will they account for data drift (when a change in input data, such as a switch of MRI machines from 1.5 T to 3 T, decreases algorithm performance). Current post-market surveillance focuses on device malfunctions and serious injuries or deaths rather than on maintaining ongoing device performance. Annex XIV of the MDR requires post-market clinical follow-up, although it provides flexibility by allowing the depth and extent of follow-up to be proportionate to the intended purpose and risks of a device7; for the most part, there has not been ongoing, systematic assessment of radiology AI/ML device performance. In contrast, post-approval clinical trials and real-world evidence, such as the FDA’s collaborative Sentinel system26, are both critical to pharmaceutical development and pharmacovigilance. They have revealed many safety events and even led to the withdrawal of drugs27. The benefits of the Viz.ai LVO detection device have started to be shown in real-world evidence, including its ability to decrease workflow times. However, these studies have stemmed from academic and related initiatives rather than regulatory requirements28,29,30.
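As a minimal sketch of what ongoing, systematic monitoring could involve, the Python example below flags a shift between the input data seen at clearance time and data seen in deployment using a two-sample Kolmogorov–Smirnov test. The per-exam summary statistic, scanner scenario, and alarm threshold are illustrative assumptions rather than a prescribed surveillance method.

```python
import numpy as np
from scipy.stats import ks_2samp

def drift_alarm(baseline: np.ndarray, recent: np.ndarray,
                p_threshold: float = 0.01) -> bool:
    """Flag a distribution shift between clearance-time data and
    deployment data. A KS test on one summary feature (here, mean
    image intensity per exam) is a simple illustrative choice."""
    _, p_value = ks_2samp(baseline, recent)
    return p_value < p_threshold

# Hypothetical summary statistics: mean signal intensity per exam.
rng = np.random.default_rng(42)
baseline_1p5t = rng.normal(loc=100.0, scale=10.0, size=5000)  # 1.5 T era
recent_3t = rng.normal(loc=120.0, scale=12.0, size=500)       # after a 3 T switch

if drift_alarm(baseline_1p5t, recent_3t):
    print("Input drift detected: trigger performance re-assessment")
```

A drift alarm of this kind detects only changes in inputs; coupling it with periodic accuracy checks against adjudicated cases would be needed to confirm whether device performance has actually degraded.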
Supporting continuous/active learning
AI/ML devices can have an iterative ability to continue to learn, especially as more training data become available. Such updates may occur in a continuous, automatic manner within the model or in a discrete, manual manner with human input. The former involves greater risk and therefore warrants more regulatory attention, but the latter still involves more risk than most current models, which have a closed system and are “frozen”. Strategies have been proposed to enable continuous learning, including retesting and simulated checks31. The recent draft guidance from the FDA on PCCPs also describes important components of device updates, including re-training practices and performance assessment21. However, it can be challenging to anticipate, at the time of submitting for regulatory clearance, which future changes will be necessary. To overcome this barrier, companies may try to make PCCPs more encompassing and less specific; it is not yet clear how this scenario will play out. If the regulatory burden of a PCCP submission is too great, manufacturers may forego a PCCP, similar to opting for the 510(k) pathway over a de novo approach.
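A minimal sketch of how such a retesting gate could work, assuming a locked test set and a single pre-specified metric; a real PCCP would pre-specify multiple metrics, subgroup analyses, and statistical tests rather than the simple margin used here.

```python
from dataclasses import dataclass

@dataclass
class CandidateModel:
    version: str
    auc_on_locked_test_set: float  # evaluated on data never used for training

def approve_update(deployed_auc: float, candidate: CandidateModel,
                   noninferiority_margin: float = 0.01) -> bool:
    """Gate an automated model update behind a locked test set: deploy
    the retrained candidate only if its performance is non-inferior to
    the currently deployed version. The margin is illustrative."""
    return candidate.auc_on_locked_test_set >= deployed_auc - noninferiority_margin

deployed_auc = 0.91
candidate = CandidateModel(version="weekly-retrain-2024-03",
                           auc_on_locked_test_set=0.92)

if approve_update(deployed_auc, candidate):
    print(f"Deploy {candidate.version}")  # passes the pre-specified gate
else:
    print(f"Hold {candidate.version} for human review")
```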
Enabling conditional clearances/approvals
Conditional clearances and approvals with appropriate regulatory guardrails could enable AI/ML devices that do not yet have sufficient evidence for a full assessment of safety and effectiveness to obtain this further evidence through post-clearance/approval studies. Accelerated approval pathways (called conditional marketing authorization in the EU) have been used for over 30 years to enable pharmaceuticals to reach patients faster based on preliminary evidence so that patients can benefit from drugs that are “reasonably likely” to offer clinical benefit while further clinical trials are performed32,33. These pathways are not without their own challenges: many accelerated approvals have either not completed confirmatory trials or failed to verify their clinical benefit32,34. The coupling of post-market surveillance, continuous/active learning, and conditional clearances/approvals, which all require ongoing assessment of devices, provides an opportunity to be stricter than pharmaceutical approvals in ensuring completion of confirmatory studies. Such assessment could be particularly streamlined for radiologic AI/ML devices that have real-time feedback on device performance and accuracy.
Moving beyond explainable and verifiable AI
Many current radiology AI/ML devices replicate tasks that radiologists could themselves perform, with the device providing benefit by reducing the time to interpretation and/or the time required for interpretation. The outputs of these devices have therefore been verifiable, and the “thinking” of the devices is explainable. As devices become more complex, especially when predicting a future clinical outcome, verifiability and explainability may become less clear. While this black-box nature can be instinctively unsettling for clinicians, it is analogous to the many medications that have been approved despite an incomplete understanding of their pharmacologic mechanism, including common medications such as paracetamol and lithium. A key enabler for this transition will be an increased focus on device performance on clinical outcomes rather than only model metrics35.
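As one concrete illustration of how explanations can be generated for imaging models, the sketch below implements occlusion sensitivity, a simple model-agnostic technique that maps which image regions a model relied on. The toy model and image are stand-ins for illustration; this is not the method of any particular cleared device.

```python
import numpy as np

def occlusion_sensitivity(model, image: np.ndarray, patch: int = 32,
                          stride: int = 32) -> np.ndarray:
    """Occlude successive patches and record how much the model's
    output probability drops; large drops mark regions the model
    relied on. `model` is any callable mapping an image to a
    probability."""
    base = model(image)
    height, width = image.shape
    sensitivity = np.zeros_like(image, dtype=float)
    for y in range(0, height - patch + 1, stride):
        for x in range(0, width - patch + 1, stride):
            occluded = image.copy()
            occluded[y:y + patch, x:x + patch] = image.mean()
            sensitivity[y:y + patch, x:x + patch] = base - model(occluded)
    return sensitivity

# Toy stand-in model: its "finding probability" tracks the intensity
# of one region, mimicking a detector keyed to a focal opacity.
def toy_model(img: np.ndarray) -> float:
    return float(img[64:96, 64:96].mean() / 255.0)

rng = np.random.default_rng(7)
xray = rng.integers(0, 255, size=(128, 128)).astype(float)
xray[64:96, 64:96] = 250.0  # a bright focal region the toy model keys on

saliency = occlusion_sensitivity(toy_model, xray)
print(saliency.argmax())  # the highest sensitivity falls within the bright region
```

Such maps make a prediction inspectable, but for models predicting future clinical outcomes there may be no human-verifiable ground truth to inspect against, which is why the outcome-focused evaluation noted above matters.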
Enabling autonomous AI/ML
A question arising across medical and non-medical industries is whether AI/ML devices can function autonomously. In radiology, this question entails myriad considerations, including regulatory limitations, medicolegal implications, and societal acceptance. There are already mooted use cases for autonomous AI/ML in radiology, including for “detecting normal” (e.g., eliminating the need to interpret a chest radiograph that a “comprehensive” algorithm has called normal). The developments in generative AI/ML further increase the possibilities of autonomy (e.g., an AI/ML device can more easily create a detailed radiology report). The regulatory approach will need to consider both when device autonomy is acceptable and how to ensure appropriate escalation back to a radiologist or other clinician when necessary.
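A minimal sketch of what such an escalation rule could look like for a “detecting normal” workflow; the confidence threshold is hypothetical and is exactly the kind of parameter a manufacturer would need to justify to a regulator.

```python
def route_study(normal_probability: float,
                autonomy_threshold: float = 0.995) -> str:
    """Illustrative routing rule: only studies the model calls normal
    with very high confidence bypass the radiologist; everything else,
    including uncertain cases, escalates to human interpretation."""
    if normal_probability >= autonomy_threshold:
        return "auto-report: no acute findings"
    return "escalate to radiologist worklist"

for p in (0.999, 0.97, 0.40):
    print(f"P(normal)={p:.3f} -> {route_study(p)}")
```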
Conclusion
While it is lucent that regulations should ensure the safety and effectiveness of radiology AI/ML devices, the balance of keeping pace with AI/ML innovation makes the immediate next steps for regulatory evolution more opaque. We described many challenges that regulatory bodies will need to continue to consider as the potential of AI/ML in radiology and medicine more broadly is realized.
References
Muehlematter, U. J., Daniore, P. & Vokinger, K. N. Approval of artificial intelligence and machine learning-based medical devices in the USA and Europe (2015-20): a comparative analysis. Lancet Digit. Health 3, e195–e203 (2021).
U.S. Food & Drug Administration. Artificial Intelligence and Machine Learning (AI/ML)-Enabled Medical Devices. https://www.fda.gov/medical-devices/software-medical-device-samd/artificial-intelligence-and-machine-learning-aiml-enabled-medical-devices (October 5, 2022).
U.S. Code. Federal Food, Drug, and Cosmetic Act. Title 21, Section 321 (as amended through December 29, 2022).
U.S. Food and Drug Administration. Premarket Approval (PMA). https://www.fda.gov/medical-devices/premarket-submissions-selecting-and-preparing-correct-submission/premarket-approval-pma (2019).
U.S. Food and Drug Administration. DEN170073 (ContaCT). https://www.accessdata.fda.gov/cdrh_docs/pdf17/DEN170073.pdf (2020).
U.S. Food and Drug Administration. K183285 (cmTriage). https://www.accessdata.fda.gov/cdrh_docs/pdf18/K183285.pdf (2019).
Regulation (EU) 2017/745 of the European Parliament and of the Council (2017).
Taylor, N. P. European Commission targets spring of 2024 for fully functional Eudamed database. https://www.medtechdive.com/news/european-commission-eudamed-timeline-2024/626917/ (2022).
Viz.ai. https://www.viz.ai/press-release/viz-ai-receives-ce-mark-for-europe-stroke-care (2021).
European Commission. EUDAMED – European Database on Medical Devices; Manufacturer, US-MF-000024837, Viz.ai [United States]. https://ec.europa.eu/tools/eudamed/#/screen/search-eo/ea856f0e-3c95-4e5c-b843-07d823f61c06 (2023).
Muehlematter, U. J., Bluethgen, C. & Vokinger, K. N. FDA-cleared artificial intelligence and machine learning-based medical devices and their 510(k) predicate networks. Lancet Digit. Health 5, e618–e626 (2023).
Hwang, T. J., Kesselheim, A. S. & Vokinger, K. N. Lifecycle regulation of artificial intelligence- and machine learning-based software devices in medicine. J. Am. Med. Assoc. 322, 2285–2286 (2019).
Kadakia, K. T., Dhruva, S. S., Caraballo, C., Ross, J. S. & Krumholz, H. M. Use of recalled devices in new device authorizations under the US Food and Drug Administration’s 510(k) pathway and risk of subsequent recalls. J. Am. Med. Assoc. 329, 136–143 (2023).
Everhart, A. O., Sen, S., Stern, A. D., Zhu, Y. & Karaca-Mandic, P. Association between regulatory submission characteristics and recalls of medical devices receiving 510(k) clearance. J. Am. Med. Assoc. 329, 144–156 (2023).
Annalise.ai. https://annalise.ai/ (2023).
Lunit. https://www.lunit.io/ (2023).
Qure.ai. https://qure.ai/ (2023).
Lee, P., Bubeck, S. & Petro, J. Benefits, limits, and risks of GPT-4 as an AI chatbot for medicine. N. Engl. J. Med. 388, 1233–1239 (2023).
U.S. Food & Drug Administration. Proposed Regulatory Framework for Modifications to Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD): Discussion Paper and Request for Feedback. https://www.fda.gov/media/122535/download (2019).
U.S. Food & Drug Administration. Artificial Intelligence/Machine Learning (AI/ML)-Based Software as a Medical Device (SaMD) Action Plan. https://www.fda.gov/media/145022/download (2021).
U.S. Food & Drug Administration. Marketing Submission Recommendations for a Predetermined Change Control Plan for Artificial Intelligence/Machine Learning (AI/ML)-Enabled Device Software Functions; Draft Guidance for Industry and Food and Drug Administration Staff. https://www.fda.gov/media/166704/download (2023).
Shah, N. H., Entwistle, D. & Pfeffer, M. A. Creation and adoption of large language models in medicine. J. Am. Med. Assoc. 330, 866–869 (2023).
Gottlieb, S. & Silvis, L. Regulators face novel challenges as artificial intelligence tools enter medical practice. JAMA Health Forum 4, e232300 (2023).
Rathi, V. K. & Ross, J. S. Modernizing the FDA’s 510(k) pathway. N. Engl. J. Med. 381, 1891–1893 (2019).
Veale, M. & Zuiderveen Borgesius, F. Demystifying the draft EU artificial intelligence act: analysing the good, the bad, and the unclear elements of the proposed approach. Comput. Law Rev. Int. 22, 97–112 (2021).
Sentinel Initiative. https://www.sentinelinitiative.org/ (2023).
Downing, N. S. et al. Postmarket safety events among novel therapeutics approved by the US Food and Drug Administration between 2001 and 2010. J. Am. Med. Assoc. 317, 1854–1863 (2017).
Hassan, A. E. et al. Early experience utilizing artificial intelligence shows significant reduction in transfer times and length of stay in a hub and spoke model. Interv. Neuroradiol. 26, 615–622 (2020).
Morey, J. R. et al. Real-world experience with artificial intelligence-based triage in transferred large vessel occlusion stroke patients. Cerebrovasc. Dis. 50, 450–455 (2021).
Matsoukas, S., Stein, L. K. & Fifi, J. T. Artificial intelligence-assisted software significantly decreases all workflow metrics for large vessel occlusion transfer patients, within a large spoke and hub system. Cerebrovasc. Dis. Extra 13, 41–46 (2023).
Babic, B., Gerke, S., Evgeniou, T. & Cohen, I. G. Algorithms on regulatory lockdown in medicine. Science 366, 1202–1204 (2019).
Beaver, J. A. et al. A 25-year experience of US Food and Drug Administration accelerated approval of malignant hematology and oncology drugs and biologics: a review. JAMA Oncol. 4, 849–856 (2018).
Fashoyin-Aje, L. A., Mehta, G. U., Beaver, J. A. & Pazdur, R. The on- and off-ramps of oncology accelerated approval. N. Engl. J. Med. 387, 1439–1442 (2022).
Gyawali, B., Ross, J. S. & Kesselheim, A. S. Fulfilling the mandate of the US Food and Drug Administration’s accelerated approval pathway: the need for reforms. JAMA Intern. Med. 181, 1275–1276 (2021).
Park, S. H. et al. Methods for clinical evaluation of artificial intelligence algorithms for medical diagnosis. Radiology 306, 20–31 (2023).
Author information
Contributions
J.M.H., J.J.V. and B.C.B. conceived of the article. All authors contributed to the drafting, critical review, and approval of the article. J.M.H. and J.J.V. contributed equally to the work and share first authorship.
Ethics declarations
Competing interests
J.M.H., B.C.B. and K.J.D. have received institutional funding from industry partners for AI/ML algorithm development and/or validation projects, including from Annalise.ai, Nuance, and Viz.ai. J.M.H. and B.C.B. are listed as inventors on patents for radiology AI/ML algorithms. J.M.H. has received support for attending academic conferences from conference organizers to provide courses on AI/ML and is an investor in Elly Health. J.J.V. has received grants to the institution from Enlitic and Qure.ai; consulting fees from Tegus; payment to the institution for lectures from Roche; a travel grant from Qure.ai; participation on a data safety monitoring board or advisory board for Contextflow, Noaber Foundation, and NLC Ventures; leadership or fiduciary roles on the steering committee of the PINPOINT Project (payment to the institution from AstraZeneca), the RSNA Common Data Elements Steering Committee (unpaid), as chair of the EuSoMII scientific committee (unpaid), chair of the ESR value-based radiology subcommittee (unpaid), and section editor of the European Journal of Radiology (unpaid); and phantom shares in Contextflow and Quibim. E.R.S.C. receives research funding from Arnold Ventures. J.A.P. is the PI of the European Research Council-funded project “iManage: Algorithms at Work”. K.P.A. is a Senior Consulting Editor for Radiology: Artificial Intelligence, Associate Editor for the Journal of Medical Imaging, and Editorial Board Member for the Journal of Digital Imaging.
Additional information
Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/.
About this article
Cite this article
Hillis, J.M., Visser, J.J., Cliff, E.R.S. et al. The lucent yet opaque challenge of regulating artificial intelligence in radiology. npj Digit. Med. 7, 69 (2024). https://doi.org/10.1038/s41746-024-01071-2