DOI: 10.1145/3136755.3136767
Research article

Crowdsourcing ratings of caller engagement in thin-slice videos of human-machine dialog: benefits and pitfalls

Published: 03 November 2017

Abstract

We analyze the efficacy of different crowds of naive human raters in rating engagement during human–machine dialog interactions. Each rater viewed multiple 10-second, thin-slice videos of native and non-native English speakers interacting with a computer-assisted language learning (CALL) system and rated how engaged or disengaged those callers were while interacting with the automated agent. We examine how the crowd's ratings compared with the callers' self-ratings of engagement, and further study how the distribution of these rating assignments varies as a function of whether the automated system or the caller was speaking. Finally, we discuss the potential applications and pitfalls of such crowdsourced paradigms in designing, developing, and analyzing engagement-aware dialog systems.
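The crowd-versus-self-rating comparison described above can be sketched in code. This is an illustrative toy, not the paper's actual method: the 5-point scale, median aggregation, the `mean_absolute_difference` metric, and all data are assumptions made for the example.

```python
# Hypothetical sketch: collapse each thin-slice clip's crowd ratings to a
# single score, then compare against the caller's self-rating.
from statistics import mean, median


def aggregate_crowd_ratings(ratings_by_clip):
    """Collapse each clip's crowd ratings (e.g. a 1-5 scale) to the median."""
    return {clip: median(r) for clip, r in ratings_by_clip.items()}


def mean_absolute_difference(crowd, self_ratings):
    """Average |crowd median - self-rating| over clips rated by both sides."""
    shared = crowd.keys() & self_ratings.keys()
    return mean(abs(crowd[c] - self_ratings[c]) for c in shared)


# Toy data: five naive raters per 10-second clip (names are illustrative).
ratings_by_clip = {
    "clip_01": [4, 5, 4, 3, 4],
    "clip_02": [2, 1, 2, 2, 3],
}
self_ratings = {"clip_01": 5, "clip_02": 2}

crowd = aggregate_crowd_ratings(ratings_by_clip)
print(crowd)                                          # {'clip_01': 4, 'clip_02': 2}
print(mean_absolute_difference(crowd, self_ratings))  # 0.5
```

Median aggregation is used here only because it is robust to a single outlier rater; a real analysis would also report chance-corrected agreement statistics.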


Cited By

  • (2022) "Automatic engagement estimation in smart education/learning settings: a systematic review of engagement definitions, datasets, and methods." Smart Learning Environments 9:1. DOI: 10.1186/s40561-022-00212-y. Online publication date: 12-Nov-2022.
  • (2018) "User Affect and No-Match Dialogue Scenarios." Proceedings of the 4th International Workshop on Multimodal Analyses Enabling Artificial Agents in Human-Machine Interaction, 6–14. DOI: 10.1145/3279972.3279979. Online publication date: 16-Oct-2018.


Published In
      ICMI '17: Proceedings of the 19th ACM International Conference on Multimodal Interaction
      November 2017
      676 pages
ISBN: 9781450355438
DOI: 10.1145/3136755

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. crowdsourcing
      2. engagement
      3. multimodal dialog
      4. thin-slicing

      Qualifiers

      • Research-article

Conference

ICMI '17

Acceptance Rates

ICMI '17 paper acceptance rate: 65 of 149 submissions, 44%
Overall acceptance rate: 453 of 1,080 submissions, 42%

