Abstract
Entrainment in conversation is the process by which participants become more similar to each other in verbal and non-verbal behavior, including acoustic-prosodic features (such as pitch and speech rate), lexical choice, and syntactic structure. This convergence is key to effective human-human conversation. To replicate that effectiveness, it is equally important to study entrainment in human-machine conversation. This review examines the verbal and non-verbal aspects that machines can adapt to improve entrainment in human-machine conversation. First, we categorize the verbal and non-verbal behaviors of human users that machines are capable of adapting. Next, we analyze the likely challenges that have prevented the speech technology sector from enabling smooth, natural interactions between humans and machines, and thus from leveraging entrainment for more fluid and intuitive human-machine conversation. Finally, we advocate a mechanomorphic design strategy for human-machine conversation and outline the rationale for its potential efficacy.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Phukon, M., Shrivastava, A. (2024). Effect of Speech Entrainment in Human-Computer Conversation: A Review. In: Choi, B.J., Singh, D., Tiwary, U.S., Chung, WY. (eds) Intelligent Human Computer Interaction. IHCI 2023. Lecture Notes in Computer Science, vol 14531. Springer, Cham. https://doi.org/10.1007/978-3-031-53827-8_4
DOI: https://doi.org/10.1007/978-3-031-53827-8_4
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-53826-1
Online ISBN: 978-3-031-53827-8
eBook Packages: Computer Science (R0)