DOI: 10.1145/3577190.3614152

Analyzing and Recognizing Interlocutors' Gaze Functions from Multimodal Nonverbal Cues

Published: 09 October 2023

Abstract

We present a novel framework for analyzing and recognizing the functions of gaze in group conversations. Given the multiplicity and ambiguity of gaze functions, we first define 43 nonexclusive gaze functions that play essential roles in conversations, such as monitoring, regulation, and expressiveness. Based on these functions, we create a functional gaze corpus; a corpus analysis reveals several frequent functions, such as addressing and thinking while speaking, and attending by listeners. Next, targeting the ten most frequent functions, we build convolutional neural networks (CNNs) that recognize the frame-based presence or absence of each gaze function from multimodal inputs, including head pose, utterance status, gaze/avert status, eyeball direction, and facial expression. Comparing different input sets, our experiments confirm that the proposed CNN using all modality inputs achieves the best performance, reaching an F value of 0.839 for listening while looking.
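The frame-based recognition step described above can be sketched as follows. This is a minimal illustrative sketch, not the authors' model: the per-modality feature dimensions, window length, and two-layer architecture are assumptions, and the weights are random rather than trained. The key structural idea is that per-frame multimodal features are concatenated and passed through a temporal CNN whose output layer uses an independent sigmoid per gaze function, so nonexclusive functions can be active simultaneously.

```python
import numpy as np

# Hypothetical per-frame feature dimensions (assumed, not from the paper):
# head pose (3), utterance status (1), gaze/avert status (1),
# eyeball direction (2), facial expression action units (17)
FEAT_DIM = 3 + 1 + 1 + 2 + 17   # = 24
N_FUNCTIONS = 10                # the ten most frequent gaze functions
WINDOW = 32                     # temporal context window in frames (assumed)

rng = np.random.default_rng(0)

def conv1d(x, w, b):
    """Valid 1D convolution over time. x: (T, C_in), w: (K, C_in, C_out)."""
    k, _, c_out = w.shape
    t_out = x.shape[0] - k + 1
    out = np.empty((t_out, c_out))
    for t in range(t_out):
        out[t] = np.tensordot(x[t:t + k], w, axes=([0, 1], [0, 1])) + b
    return out

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy two-layer temporal CNN with randomly initialized (untrained) weights.
w1, b1 = rng.normal(0, 0.1, (5, FEAT_DIM, 16)), np.zeros(16)
w2, b2 = rng.normal(0, 0.1, (5, 16, N_FUNCTIONS)), np.zeros(N_FUNCTIONS)

def predict(frames):
    """frames: (T, FEAT_DIM) multimodal features -> per-frame function probs."""
    h = np.maximum(conv1d(frames, w1, b1), 0.0)   # ReLU
    return sigmoid(conv1d(h, w2, b2))             # one sigmoid per function

x = rng.normal(size=(WINDOW, FEAT_DIM))           # one window of input frames
probs = predict(x)
print(probs.shape)   # (24, 10): per-frame probabilities for 10 functions
```

Because each function gets its own sigmoid rather than a shared softmax, a single frame can carry several gaze functions at once, matching the nonexclusive labeling scheme.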


Cited By

  • (2024) Exploring Interlocutor Gaze Interactions in Conversations based on Functional Spectrum Analysis. In Proceedings of the 26th International Conference on Multimodal Interaction, 86–94. https://doi.org/10.1145/3678957.3685708
  • (2024) Exploring Multimodal Nonverbal Functional Features for Predicting the Subjective Impressions of Interlocutors. IEEE Access 12, 96769–96782. https://doi.org/10.1109/ACCESS.2024.3426537

Published In

ICMI '23: Proceedings of the 25th International Conference on Multimodal Interaction
October 2023
858 pages
ISBN:9798400700552
DOI:10.1145/3577190


Publisher

Association for Computing Machinery

New York, NY, United States



Author Tags

  1. convolutional neural network
  2. gaze
  3. nonverbal behavior
  4. social signal processing

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICMI '23

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%


