Exploring Regularizations with Face, Body and Image Cues for Group Cohesion Prediction

Authors:

Xiaojiang Peng, and

Yu Qiao

ICMI '19: 2019 International Conference on Multimodal Interaction

October 2019

Pages 557 - 561

https://doi.org/10.1145/3340555.3355712

Published: 14 October 2019

Abstract

This paper presents our approach to the group cohesion prediction sub-challenge of EmotiW 2019. The task is to predict group cohesiveness in images. We mainly explore several regularizations with three types of visual cues: face, body, and global image. Our main contribution is two-fold. First, we jointly train the group cohesion prediction task and the group emotion recognition task using a multi-task learning strategy with all visual cues. Second, we design two regularizations, a rank loss and an hourglass loss: the former enforces a margin between the distances of distant categories and near categories, and the latter avoids the centralized predictions that arise when training with MSE loss alone. With careful evaluation, we achieve second place in this sub-challenge with an MSE of 0.43821 on the test set. https://github.com/DaleAG/Group_Cohesion_Prediction
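The abstract names the training objective only in words, so the following is a minimal sketch of how a multi-task loss with a margin-based rank term might be assembled. It is not the authors' implementation. The triplet form of the rank term, the weights `w_emotion` and `w_rank`, the `margin` value, and all function names are assumptions made for illustration. The hourglass loss is left out because the abstract does not define it.

```python
import math


def mse_loss(preds, targets):
    """Mean squared error for the cohesion regression task."""
    return sum((p - t) ** 2 for p, t in zip(preds, targets)) / len(preds)


def cross_entropy(logits, label):
    """Softmax cross-entropy for one sample of the emotion task."""
    m = max(logits)  # subtract the max for numerical stability
    log_sum = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum - logits[label]


def rank_margin_loss(preds, targets, margin=0.5):
    """Assumed triplet form of a rank loss.

    For an anchor i, a sample j with a nearer label and a sample k with a
    farther label, the predicted gap to k should exceed the predicted gap
    to j by at least `margin`.
    """
    total, count = 0.0, 0
    n = len(preds)
    for i in range(n):
        for j in range(n):
            for k in range(n):
                if len({i, j, k}) < 3:
                    continue
                if abs(targets[i] - targets[j]) < abs(targets[i] - targets[k]):
                    near = abs(preds[i] - preds[j])
                    far = abs(preds[i] - preds[k])
                    total += max(0.0, near - far + margin)
                    count += 1
    return total / count if count else 0.0


def total_loss(cohesion_preds, cohesion_targets, emotion_logits,
               emotion_labels, w_emotion=1.0, w_rank=1.0, margin=0.5):
    """Weighted sum of regression, classification and rank terms.

    The weights are hypothetical; the abstract gives no values.
    """
    reg = mse_loss(cohesion_preds, cohesion_targets)
    cls = sum(cross_entropy(l, y)
              for l, y in zip(emotion_logits, emotion_labels)) / len(emotion_labels)
    rank = rank_margin_loss(cohesion_preds, cohesion_targets, margin)
    return reg + w_emotion * cls + w_rank * rank


if __name__ == "__main__":
    targets = [0.0, 1.0, 3.0]
    collapsed = [1.5, 1.5, 1.5]   # predictions squeezed toward the middle
    ordered = [0.0, 1.0, 3.0]     # predictions that respect the label order
    print(rank_margin_loss(collapsed, targets))
    print(rank_margin_loss(ordered, targets))
```

In this sketch, predictions that collapse to a single value are penalized by the rank term on every triplet, while predictions that preserve the label ordering with enough separation incur no rank penalty. This illustrates one way a margin term can complement MSE, under the assumptions stated above.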

Published In

ICMI '19: 2019 International Conference on Multimodal Interaction

October 2019

601 pages

ISBN:9781450368605

DOI:10.1145/3340555

Editors:
Wen Gao, Peking University, China
Helen Mei Ling Meng, Chinese University of Hong Kong, China
Matthew Turk, Toyota Technological Institute at Chicago, USA
Susan R. Fussell, Cornell University, USA
Björn Schuller, Imperial College London / University of Augsburg, UK
Yale Song, Microsoft Research, USA
Kai Yu, Shanghai Jiao Tong University, China

Copyright © 2019 ACM.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ICMI '19

ICMI '19: INTERNATIONAL CONFERENCE ON MULTIMODAL INTERACTION

October 14 - 18, 2019

Suzhou, China

Acceptance Rates

Overall Acceptance Rate 453 of 1,080 submissions, 42%
