Work in Progress

Conversational AI in health: Design considerations from a Wizard-of-Oz dermatology case study with users, clinicians and a medical LLM

Authors:

Patricia Strachan,

Julie Anne Séguin,

Karyn C Schroeder,

Mathias S Fleck,

Alan Karthikesalingam,

Vivek Natarajan,

Greg S Corrado,

Christopher Semturs,

Mike Schaekermann

CHI EA '24: Extended Abstracts of the CHI Conference on Human Factors in Computing Systems

Article No.: 88, Pages 1 - 10

https://doi.org/10.1145/3613905.3651891

Published: 11 May 2024

Abstract

Although skin concerns are common, access to specialist care is limited. Artificial intelligence (AI)-assisted tools that support medical decisions may give patients feedback on their concerns while also helping ensure that the most urgent cases are routed to dermatologists. Although AI-based conversational agents have been explored recently, how patients and clinicians perceive them is not well understood. We conducted a Wizard-of-Oz study involving 18 participants with real skin concerns. Participants were randomly assigned to interact with either a clinician agent (portrayed by a dermatologist) or an LLM agent (supervised by a dermatologist) via synchronous multimodal chat. In both conditions, participants found the conversation helpful in understanding their medical situation and in alleviating their concerns. Through qualitative coding of the conversation transcripts, we provide insight into the importance of empathy and effective information-seeking. We conclude with design considerations for future AI-based conversational agents in healthcare settings.

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation
2. Human-centered computing
  1. Collaborative and social computing
    1. Empirical studies in collaborative and social computing
  2. Human computer interaction (HCI)
    1. Empirical studies in HCI

Published In

CHI EA '24: Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems

May 2024

4761 pages

ISBN:9798400703317

DOI:10.1145/3613905

Editors:
Florian Floyd Mueller, Monash University
Penny Kyburz, The Australian National University
Julie R. Williamson, University of Glasgow
Corina Sas, Lancaster University

Copyright © 2024 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Work in progress
Research
Refereed limited

Data Availability

Supplemental Material: A.1 Participant pre-interaction survey https://dl.acm.org/doi/10.1145/3613905.3651891#3613905.3651891-supplement-1.pdf

Funding Sources

Google

Conference

CHI '24

CHI '24: CHI Conference on Human Factors in Computing Systems

May 11 - 16, 2024

Honolulu, HI, USA

Acceptance Rates

Overall Acceptance Rate 6,164 of 23,696 submissions, 26%
