Work in Progress

Conversational AI in health: Design considerations from a Wizard-of-Oz dermatology case study with users, clinicians and a medical LLM

Authors:
Brenna Li, Google, United States and University of Toronto, Canada (ORCID: 0000-0003-3692-243X)
Amy Wang, Google, United States and McMaster University, Canada (ORCID: 0000-0001-6514-8784)
Patricia Strachan, Google, United States (ORCID: 0009-0004-6385-5943)
Julie Anne Séguin, Google, United States (ORCID: 0000-0002-8371-9576)
Sami Lachgar, Google, United States (ORCID: 0009-0008-3663-1441)
Karyn C Schroeder, Work done at Google via YoGiYo 2GROW, United States (ORCID: 0009-0004-0223-0173)
Mathias S Fleck, Google, United States (ORCID: 0000-0002-6335-6165)
Renee Wong, Google, United States (ORCID: 0009-0003-0403-7679)
Alan Karthikesalingam, Google, United States (ORCID: 0009-0000-4958-5976)
Vivek Natarajan, Google, United States (ORCID: 0000-0001-7849-2074)
Yossi Matias, Google, United States (ORCID: 0000-0003-3960-6002)
Greg S Corrado, Google, United States (ORCID: 0000-0001-8817-0992)
Dale Webster, Google, United States (ORCID: 0000-0002-3023-8824)
Yun Liu, Google, United States (ORCID: 0000-0003-4079-8275)
Naama Hammel, Google, United States (ORCID: 0000-0002-1284-0061)
Rory Sayres, Google, United States (ORCID: 0000-0002-8269-5779)
Christopher Semturs, Google, United States (ORCID: 0000-0001-6108-2773)
Mike Schaekermann, Google, United States (ORCID: 0000-0002-1735-9680)

CHI EA '24: Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems, May 2024, Article No.: 88, Pages 1–10. https://doi.org/10.1145/3613905.3651891

Published: 11 May 2024

ABSTRACT

Although skin concerns are common, access to specialist care is limited. Artificial intelligence (AI)-assisted tools to support medical decisions may provide patients with feedback on their concerns while also helping ensure the most urgent cases are routed to dermatologists. Although AI-based conversational agents have been explored recently, how they are perceived by patients and clinicians is not well understood. We conducted a Wizard-of-Oz study involving 18 participants with real skin concerns. Participants were randomly assigned to interact with either a clinician agent (portrayed by a dermatologist) or an LLM agent (supervised by a dermatologist) via synchronous multimodal chat. In both conditions, participants found the conversation helpful in understanding their medical situation and in alleviating their concerns. Through qualitative coding of the conversation transcripts, we provide insight into the importance of empathy and effective information-seeking. We conclude with design considerations for future AI-based conversational agents in healthcare settings.

Footnotes

⁎ Both authors contributed equally.
† Both authors advised equally.

Supplemental Material

Talk Video: 3613905.3651891-talk-video.mp4 (mp4, 24.4 MB)
Talk Video captions: 3613905.3651891-talk-video.vtt (vtt, 4.6 KB)
Supplemental Material (pdf, 379.3 KB)

A.1 Participant pre-interaction survey

Index Terms

1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Natural language generation
2. Human-centered computing
  1. Collaborative and social computing
    1. Empirical studies in collaborative and social computing
  2. Human computer interaction (HCI)
    1. Empirical studies in HCI

Published in
CHI EA '24: Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems
May 2024
4761 pages
ISBN: 9798400703317
DOI: 10.1145/3613905
Editors:
Florian Floyd Mueller, Monash University
Penny Kyburz, The Australian National University
Julie R. Williamson, University of Glasgow
Corina Sas, Lancaster University
Copyright © 2024 Owner/Author
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.
Publisher
Association for Computing Machinery
New York, NY, United States
Author Tags
Artificial Intelligence
Chatbot
Dermatology
Large Language Models
Medical
Wizard-of-Oz
Qualifiers
- Work in Progress
- Research
- Refereed limited
Acceptance Rates
Overall Acceptance Rate: 6,164 of 23,696 submissions, 26%