demonstration

"You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations

Authors:

David Elsweiler

CHIIR '24: Proceedings of the 2024 Conference on Human Information Interaction and Retrieval

Pages 411 - 416

https://doi.org/10.1145/3627508.3638330

Published: 10 March 2024

Abstract

Conversational agents are increasingly used to address emotional needs in addition to information needs. One use case of growing interest is counselling-style mental health and behaviour change interventions, for which large language model (LLM)-based approaches are becoming more popular. Research in this context has so far been largely system-focused, neglecting user behaviour and the impact it can have on LLM-generated texts. To address this issue, we share a dataset of text-based user interactions related to behaviour change with two GPT-4-based conversational agents, collected in a preregistered user study. The dataset includes conversation data, user language analysis, perception measures, and user feedback on LLM-generated turns, and can offer valuable insights to inform the design of such systems based on real interactions.

Index Terms

"You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations
1. Computing methodologies
  1. Artificial intelligence
    1. Natural language processing
      1. Language resources
2. Human-centered computing
  1. Human computer interaction (HCI)
    1. Interaction paradigms
      1. Natural language interfaces

Published In

CHIIR '24: Proceedings of the 2024 Conference on Human Information Interaction and Retrieval

March 2024

481 pages

ISBN: 9798400704345

DOI: 10.1145/3627508

Editors: Paul Clough (University of Sheffield Information School), Morgan Harvey (University of Sheffield Information School), Frank Hopfgartner (Universität Koblenz)

Copyright © 2024 Owner/Author.

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Demonstration
Research
Refereed limited

Conference

CHIIR '24

CHIIR '24: 2024 ACM SIGIR Conference on Human Information Interaction and Retrieval

March 10 - 14, 2024

Sheffield, United Kingdom

Acceptance Rates

Overall Acceptance Rate 55 of 163 submissions, 34%
