DOI: 10.1145/3627508.3638330

"You tell me": A Dataset of GPT-4-Based Behaviour Change Support Conversations

Published: 10 March 2024


Conversational agents are increasingly used to address emotional needs in addition to information needs. One use case attracting growing interest is counselling-style mental health and behaviour change interventions, for which large language model (LLM)-based approaches are becoming more popular. Research in this context has so far been largely system-focused, forgoing the aspect of user behaviour and the impact it can have on LLM-generated texts. To address this issue, we share a dataset of text-based user interactions related to behaviour change with two GPT-4-based conversational agents, collected in a preregistered user study. The dataset includes conversation data, user language analysis, perception measures, and user feedback on LLM-generated turns, and can offer valuable insights to inform the design of such systems based on real interactions.







Published In

CHIIR '24: Proceedings of the 2024 Conference on Human Information Interaction and Retrieval
March 2024
481 pages
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.


Association for Computing Machinery

New York, NY, United States


Author Tags

  1. behaviour change
  2. conversational agents
  3. dialogue
  4. information behaviour
  5. large language models


Qualifiers

  • Demonstration
  • Research
  • Refereed limited



Acceptance Rates

Overall Acceptance Rate 55 of 163 submissions, 34%

