Multimodal Fusion with Pre-Trained Model Features in Affective Behaviour Analysis In-the-wild

Wen, Zhuofan; Zhang, Fengyu; Zhang, Siyuan; Sun, Haiyang; Xu, Mingyu; Sun, Licai; Lian, Zheng; Liu, Bin; Tao, Jianhua


arXiv:2403.15044 (cs)

[Submitted on 22 Mar 2024]


Abstract: Multimodal fusion is central to most multimodal tasks. With the recent surge of large pre-trained models, combining fusion methods with pre-trained model features can achieve strong performance on many of these tasks. In this paper, we present an approach that leverages both for Expression (Expr) Recognition and Valence-Arousal (VA) Estimation. We run pre-trained models on the Aff-Wild2 database and extract their final hidden layers as features. After preprocessing, the extracted features are aligned by interpolation or convolution, and different models are then employed for modal fusion. Our code is available on GitHub at FulgenceWen/ABAW6th.
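
The alignment step mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: it assumes linear interpolation along the time axis, and it uses plain per-frame concatenation as a stand-in for the fusion models, which the abstract does not specify. The function names `align_by_interpolation` and `fuse_by_concatenation` and the toy feature values are hypothetical.

```python
from typing import List

Features = List[List[float]]  # one row per frame, one column per feature dimension


def align_by_interpolation(features: Features, target_len: int) -> Features:
    """Linearly resample a (T, D) feature sequence to (target_len, D).

    Assumption: linear interpolation over time; the abstract only says
    "interpolation or convolution" without giving details.
    """
    src_len = len(features)
    if src_len == 0 or target_len <= 0:
        raise ValueError("need a non-empty sequence and a positive target length")
    if src_len == 1 or target_len == 1:
        # Nothing to interpolate between: repeat the first frame.
        return [list(features[0]) for _ in range(target_len)]

    aligned = []
    for i in range(target_len):
        # Map the output index onto the source time axis.
        pos = i * (src_len - 1) / (target_len - 1)
        lo = int(pos)
        hi = min(lo + 1, src_len - 1)
        w = pos - lo
        aligned.append(
            [(1 - w) * a + w * b for a, b in zip(features[lo], features[hi])]
        )
    return aligned


def fuse_by_concatenation(visual: Features, audio: Features) -> Features:
    """Align audio to the visual frame count, then concatenate per frame.

    Assumption: concatenation is only a placeholder for a learned fusion model.
    """
    aligned_audio = align_by_interpolation(audio, len(visual))
    return [v + a for v, a in zip(visual, aligned_audio)]


# Toy example: 3 visual frames with 2 features, 5 audio frames with 1 feature.
visual = [[1.0, 1.0], [2.0, 2.0], [3.0, 3.0]]
audio = [[0.0], [1.0], [2.0], [3.0], [4.0]]
fused = fuse_by_concatenation(visual, audio)
print(len(fused), len(fused[0]))  # 3 3
```

In this sketch the audio stream is resampled to the visual frame rate so that every frame carries one feature vector per modality. A learned fusion model would take the place of the final concatenation.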

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2403.15044 [cs.CV]
	(or arXiv:2403.15044v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2403.15044

Submission history

From: ZhuoFan Wen
[v1] Fri, 22 Mar 2024 09:00:24 UTC (147 KB)
