Autonomous Artificial Intelligence Agents for Clinical Decision Making in Oncology
Authors:
Dyke Ferber,
Omar S. M. El Nahhas,
Georg Wölflein,
Isabella C. Wiest,
Jan Clusmann,
Marie-Elisabeth Leßman,
Sebastian Foersch,
Jacqueline Lammert,
Maximilian Tschochohei,
Dirk Jäger,
Manuel Salto-Tellez,
Nikolaus Schultz,
Daniel Truhn,
Jakob Nikolas Kather
Abstract:
Multimodal artificial intelligence (AI) systems have the potential to enhance clinical decision-making by interpreting various types of medical data. However, the effectiveness of these models across all medical fields is uncertain. Each discipline presents unique challenges that need to be addressed for optimal performance. This complexity is further increased when attempting to integrate differe…
▽ More
Multimodal artificial intelligence (AI) systems have the potential to enhance clinical decision-making by interpreting various types of medical data. However, the effectiveness of these models across all medical fields is uncertain. Each discipline presents unique challenges that need to be addressed for optimal performance. This complexity is further increased when attempting to integrate different fields into a single model. Here, we introduce an alternative approach to multimodal medical AI that utilizes the generalist capabilities of a large language model (LLM) as a central reasoning engine. This engine autonomously coordinates and deploys a set of specialized medical AI tools. These tools include text, radiology and histopathology image interpretation, genomic data processing, web searches, and document retrieval from medical guidelines. We validate our system across a series of clinical oncology scenarios that closely resemble typical patient care workflows. We show that the system has a high capability in employing appropriate tools (97%), drawing correct conclusions (93.6%), and providing complete (94%), and helpful (89.2%) recommendations for individual patient cases while consistently referencing relevant literature (82.5%) upon instruction. This work provides evidence that LLMs can effectively plan and execute domain-specific models to retrieve or synthesize new information when used as autonomous agents. This enables them to function as specialist, patient-tailored clinical assistants. It also simplifies regulatory compliance by allowing each component tool to be individually validated and approved. We believe, that our work can serve as a proof-of-concept for more advanced LLM-agents in the medical domain.
△ Less
Submitted 6 April, 2024;
originally announced April 2024.
Fully transformer-based biomarker prediction from colorectal cancer histology: a large-scale multicentric study
Authors:
Sophia J. Wagner,
Daniel Reisenbüchler,
Nicholas P. West,
Jan Moritz Niehues,
Gregory Patrick Veldhuizen,
Philip Quirke,
Heike I. Grabsch,
Piet A. van den Brandt,
Gordon G. A. Hutchins,
Susan D. Richman,
Tanwei Yuan,
Rupert Langer,
Josien Christina Anna Jenniskens,
Kelly Offermans,
Wolfram Mueller,
Richard Gray,
Stephen B. Gruber,
Joel K. Greenson,
Gad Rennert,
Joseph D. Bonner,
Daniel Schmolze,
Jacqueline A. James,
Maurice B. Loughrey,
Manuel Salto-Tellez,
Hermann Brenner
, et al. (6 additional authors not shown)
Abstract:
Background: Deep learning (DL) can extract predictive and prognostic biomarkers from routine pathology slides in colorectal cancer. For example, a DL test for the diagnosis of microsatellite instability (MSI) in CRC has been approved in 2022. Current approaches rely on convolutional neural networks (CNNs). Transformer networks are outperforming CNNs and are replacing them in many applications, but…
▽ More
Background: Deep learning (DL) can extract predictive and prognostic biomarkers from routine pathology slides in colorectal cancer. For example, a DL test for the diagnosis of microsatellite instability (MSI) in CRC has been approved in 2022. Current approaches rely on convolutional neural networks (CNNs). Transformer networks are outperforming CNNs and are replacing them in many applications, but have not been used for biomarker prediction in cancer at a large scale. In addition, most DL approaches have been trained on small patient cohorts, which limits their clinical utility. Methods: In this study, we developed a new fully transformer-based pipeline for end-to-end biomarker prediction from pathology slides. We combine a pre-trained transformer encoder and a transformer network for patch aggregation, capable of yielding single and multi-target prediction at patient level. We train our pipeline on over 9,000 patients from 10 colorectal cancer cohorts. Results: A fully transformer-based approach massively improves the performance, generalizability, data efficiency, and interpretability as compared with current state-of-the-art algorithms. After training on a large multicenter cohort, we achieve a sensitivity of 0.97 with a negative predictive value of 0.99 for MSI prediction on surgical resection specimens. We demonstrate for the first time that resection specimen-only training reaches clinical-grade performance on endoscopic biopsy tissue, solving a long-standing diagnostic problem. Interpretation: A fully transformer-based end-to-end pipeline trained on thousands of pathology slides yields clinical-grade performance for biomarker prediction on surgical resections and biopsies. Our new methods are freely available under an open source license.
△ Less
Submitted 1 March, 2023; v1 submitted 23 January, 2023;
originally announced January 2023.