Reinforcement Learning in Dynamic Treatment Regimes Needs Critical Reexamination

Luo, Zhiyao; Pan, Yangchen; Watkinson, Peter; Zhu, Tingting

arXiv:2405.18556 (cs)

[Submitted on 28 May 2024 (v1), last revised 3 Jun 2024 (this version, v2)]

Abstract: In the rapidly changing healthcare landscape, the implementation of offline reinforcement learning (RL) in dynamic treatment regimes (DTRs) presents a mix of unprecedented opportunities and challenges. This position paper offers a critical examination of the current status of offline RL in the context of DTRs. We argue for a reassessment of applying RL in DTRs, citing concerns such as inconsistent and potentially inconclusive evaluation metrics, the absence of naive and supervised learning baselines, and the diverse choice of RL formulation in existing research. Through a case study with more than 17,000 evaluation experiments using a publicly available Sepsis dataset, we demonstrate that the performance of RL algorithms can vary significantly with changes in evaluation metrics and Markov Decision Process (MDP) formulations. Surprisingly, in some instances RL algorithms are surpassed by random baselines, depending on the policy evaluation method and reward design. This calls for more careful policy evaluation and algorithm development in future DTR works. Additionally, we discuss potential enhancements toward more reliable development of RL-based dynamic treatment regimes and invite further discussion within the community. Code is available at this https URL.

Comments:	Accepted at ICML 2024. 9 pages for main content, 34 pages in total
Subjects:	Machine Learning (cs.LG); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2405.18556 [cs.LG]
	(or arXiv:2405.18556v2 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2405.18556

Submission history

From: Zhiyao Luo
[v1] Tue, 28 May 2024 20:03:18 UTC (2,244 KB)
[v2] Mon, 3 Jun 2024 20:16:11 UTC (2,244 KB)
