DOI: 10.1145/3503161.3551605
research-article

Graph-based Group Modelling for Backchannel Detection

Published: 10 October 2022

Abstract

The brief responses given by listeners in group conversations are known as backchannels, which makes backchannel detection an essential facet of group interaction analysis. Most current backchannel detection studies explore various audio-visual cues for individuals. However, as with any group interaction task, analysing all group members is of utmost importance for backchannel detection. This study uses a graph neural network to model group interaction through all members' implicit and explicit behaviours. The proposed method achieves the best and second-best performance on the agreement estimation and backchannel detection tasks, respectively, of the 2022 MultiMediate: Multi-modal Group Behaviour Analysis for Artificial Mediation challenge.
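To make the idea of modelling a group as a graph more concrete, the following is a minimal sketch of a single message-passing round over a fully connected group graph. It is not the architecture from the paper: the feature layout (a head-nod score and a speaking-activity score per participant), the fixed 0.5 mixing weight, and the function names `fully_connected` and `message_passing_step` are all assumptions made for illustration, and no weights are learned.

```python
from typing import List


def fully_connected(n: int) -> List[List[int]]:
    """Adjacency list linking every participant to every other participant."""
    return [[j for j in range(n) if j != i] for i in range(n)]


def message_passing_step(
    features: List[List[float]], neighbours: List[List[int]]
) -> List[List[float]]:
    """Mix each node's features with the mean of its neighbours' features.

    The fixed 0.5 / 0.5 mix is a stand-in for the learned update
    that a trained graph neural network would apply.
    """
    updated = []
    for i, own in enumerate(features):
        nbrs = neighbours[i]
        if not nbrs:
            # An isolated node keeps its own features unchanged.
            updated.append(list(own))
            continue
        dim = len(own)
        mean_nbr = [sum(features[j][d] for j in nbrs) / len(nbrs) for d in range(dim)]
        updated.append([0.5 * own[d] + 0.5 * mean_nbr[d] for d in range(dim)])
    return updated


# Hypothetical per-participant features: [head-nod score, speaking activity].
group = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0], [0.0, 0.0]]
context = message_passing_step(group, fully_connected(len(group)))
```

After one round, each participant's vector also carries information about the rest of the group, which is the property a group-level model relies on when deciding whether a given listener produced a backchannel.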

Supplementary Material

MP4 File (MM22-mmgc59.mp4)
Presentation video of the paper "Graph-based Group Modelling for Backchannel Detection".

Published In

MM '22: Proceedings of the 30th ACM International Conference on Multimedia
October 2022
7537 pages
ISBN: 9781450392037
DOI: 10.1145/3503161
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. agreement estimation
  2. backchannel detection
  3. graph neural networks

Qualifiers

  • Research-article

Conference

MM '22
Acceptance Rates

Overall Acceptance Rate 995 of 4,171 submissions, 24%

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months): 81
  • Downloads (Last 6 weeks): 8
Reflects downloads up to 21 Sep 2024

Citations

Cited By

  • (2024) Enabling Social Robots to Perceive and Join Socially Interacting Groups using F-formation: A Comprehensive Overview. ACM Transactions on Human-Robot Interaction. DOI: 10.1145/3682072. Online publication date: 29-Jul-2024
  • (2023) MultiMediate '23: Engagement Estimation and Bodily Behaviour Recognition in Social Interactions. Proceedings of the 31st ACM International Conference on Multimedia, 9640-9645. DOI: 10.1145/3581783.3613851. Online publication date: 26-Oct-2023
  • (2023) Unveiling Subtle Cues: Backchannel Detection Using Temporal Multimodal Attention Networks. Proceedings of the 31st ACM International Conference on Multimedia, 9586-9590. DOI: 10.1145/3581783.3612870. Online publication date: 26-Oct-2023
  • (2023) Backchannel Detection and Agreement Estimation from Video with Transformer Networks. 2023 International Joint Conference on Neural Networks (IJCNN), 1-8. DOI: 10.1109/IJCNN54540.2023.10191640. Online publication date: 18-Jun-2023
