DOI: 10.5555/1138317.1138322

Evaluating users' reactions to human-like interfaces

Published: 01 January 2004

Abstract

An increasing number of dialogue systems are deployed to provide public services in our everyday lives. They are becoming more service-minded, and several of them offer different channels for interaction. The rationale is to make automatic services available in new environments and more attractive to use. From a developer's perspective, this increases the complexity of the requirements elicitation activity, as new combinations of and variations in end-user interaction need to be considered. The aim of our investigation is to propose new parameters and metrics for evaluating multimodal dialogue systems endowed with embodied conversational agents (ECAs). These new metrics focus on the users rather than on the system. Our assumption is that users' intentional use of prosodic variation and production of communicative non-verbal behaviour can indicate their attitude towards the system and might also help to evaluate their overall experience of the interaction. To test our hypothesis, we carried out analyses on several Swedish corpora of interactions between users and multimodal dialogue systems. We analysed the prosodic variation in the way users ended their interactions with the system, and we observed users' production of non-verbal communicative expressions. Our study supports the idea that observing users' prosodic variation and production of communicative non-verbal behaviour during interaction with dialogue systems can indicate whether or not users are satisfied with the system's performance.
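
Although the chapter reports corpus analyses rather than software, the proposed user-centred metric can be made concrete with a short sketch. The following Python snippet is our own illustration, not the authors' implementation: it assumes the third-party librosa library for pitch tracking, and the file name is hypothetical. It estimates F0 over a single user utterance (for example, a closing turn) and summarises its variation, the kind of prosodic measure the abstract describes.

    # Illustrative sketch only: summarise prosodic (F0) variation in one
    # utterance. Assumes librosa and numpy are installed; treating low
    # variation as possible dissatisfaction is a hypothesis, not a rule.
    import numpy as np
    import librosa

    def prosodic_variation(wav_path):
        """Return simple F0 statistics for a single recorded utterance."""
        y, sr = librosa.load(wav_path, sr=None)      # keep the native sample rate
        f0, voiced_flag, voiced_prob = librosa.pyin(
            y,
            fmin=librosa.note_to_hz("C2"),           # ~65 Hz lower bound
            fmax=librosa.note_to_hz("C7"),           # ~2093 Hz upper bound
            sr=sr,
        )
        f0 = f0[~np.isnan(f0)]                       # keep voiced frames only
        if f0.size == 0:
            return {"mean_hz": 0.0, "std_hz": 0.0, "range_semitones": 0.0}
        return {
            "mean_hz": float(f0.mean()),
            "std_hz": float(f0.std()),
            # pitch range in semitones, a common summary of prosodic variation
            "range_semitones": float(12 * np.log2(f0.max() / f0.min())),
        }

    # Hypothetical usage on a user's final utterance from a dialogue log:
    print(prosodic_variation("closing_utterance.wav"))

Under the chapter's hypothesis, a flat contour (small std_hz and range_semitones) in a user's closing turn would be one candidate signal of a less engaged or less satisfied user, to be validated against corpus annotations.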

Published In

From brows to trust: evaluating embodied conversational agents
January 2004
352 pages
ISBN: 140202729X

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. communicative gestures
  2. conversational agents
  3. evaluation
  4. multimodal interface
  5. non-verbal behaviour
  6. prosodic cues

Qualifiers

  • Chapter
