DOI: 10.1145/3308532.3329473

An End-to-End Conversational Style Matching Agent

Published: 01 July 2019

Abstract

We present an end-to-end voice-based conversational agent that can engage in naturalistic multi-turn dialogue and align with the interlocutor's conversational style. The system uses a series of deep neural network components for speech recognition, dialogue generation, prosodic analysis, and speech synthesis to generate language and prosodic expression with qualities that match those of the user. We conducted a user study (N=30) in which participants talked with the agent for 15 to 20 minutes, resulting in over 8 hours of natural interaction data. Users with high-consideration conversational styles reported the agent to be more trustworthy when it matched their conversational style, whereas users with high-involvement conversational styles were indifferent. Finally, we provide design guidelines for multi-turn dialogue interactions using conversational style adaptation.

Published In

IVA '19: Proceedings of the 19th ACM International Conference on Intelligent Virtual Agents
July 2019
282 pages
ISBN:9781450366724
DOI:10.1145/3308532
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. artificial intelligence
  2. conversational agent
  3. conversational style
  4. dialogue

Qualifiers

  • Research-article

Conference

IVA '19

Acceptance Rates

IVA '19 Paper Acceptance Rate 15 of 63 submissions, 24%;
Overall Acceptance Rate 53 of 196 submissions, 27%
