survey

Bias in Reinforcement Learning: A Review in Healthcare Applications

Published: 15 September 2023

Abstract

Reinforcement learning (RL) can assist in medical decision making using patient data collected in electronic health record (EHR) systems. RL, a type of machine learning, can use these data to develop treatment policies. However, RL models are typically trained using imperfect retrospective EHR data. Therefore, if care is not taken in training, RL policies can propagate existing bias in healthcare. Literature that considers and addresses the issues of bias and fairness in sequential decision making is reviewed. The major themes that emerge for mitigating bias relate to (1) data management; (2) algorithmic design; and (3) clinical understanding of the resulting policies.

      Information

      Published In

      ACM Computing Surveys, Volume 56, Issue 2
      February 2024
      974 pages
      EISSN: 1557-7341
      DOI: 10.1145/3613559

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 15 September 2023
      Online AM: 18 July 2023
      Accepted: 11 July 2023
      Revised: 16 November 2022
      Received: 17 February 2022
      Published in CSUR Volume 56, Issue 2

      Author Tags

      1. Reinforcement learning
      2. electronic health records
      3. algorithmic bias
      4. treatment planning
      5. bias management

      Qualifiers

      • Survey

      Funding Sources

      • Science Alliance, The University of Tennessee, and the Laboratory Directed Research and Development Program of Oak Ridge National Laboratory, managed by UT-Battelle, LLC, for the U.S. Department of Energy
