DOI: 10.1145/3474085.3475484

JDMAN: Joint Discriminative and Mutual Adaptation Networks for Cross-Domain Facial Expression Recognition

Published: 17 October 2021

Abstract

Cross-domain Facial Expression Recognition (FER) is challenging due to the difficulty of concurrently handling the domain shift and semantic gap during domain adaptation. Existing methods mainly focus on reducing the domain discrepancy for transferable features but fail to decrease the semantic one, which may result in negative transfer. To this end, we propose Joint Discriminative and Mutual Adaptation Networks (JDMAN), which collaboratively bridge the domain shift and semantic gap by domain- and category-level co-adaptation based on mutual information and discriminative metric learning techniques. Specifically, we design a mutual information minimization module for domain-level adaptation, which narrows the domain shift by simultaneously distilling the domain-invariant components and eliminating the untransferable ones lying in different domains. Moreover, we propose a semantic metric learning module for category-level adaptation, which can close the semantic discrepancy during discriminative intra-domain representation learning and transferable inter-domain knowledge discovery. These two modules are jointly leveraged in our JDMAN to safely transfer the source knowledge to target data in an end-to-end manner. Extensive experimental results on six databases show that our method achieves state-of-the-art performance. The code of our JDMAN is available at https://github.com/YingjianLi/JDMAN.
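To make the domain-level idea concrete: neural mutual-information estimators of the kind the mutual information minimization module builds on typically work with the Donsker-Varadhan lower bound, I(X;Z) ≥ E_joint[T(x,z)] − log E_marginals[e^T(x,z)]. The toy sketch below is our own illustration, not the authors' code: the fixed dot-product critic `T` and the synthetic standard-normal features are assumptions made purely to show the estimator's behavior, where paired (dependent) features score higher than independent ones.

```python
import numpy as np

def dv_mi_estimate(x, z, rng):
    """Donsker-Varadhan lower-bound estimate of I(X; Z).

    Uses a fixed illustrative critic T(x, z) = 0.25 * <x, z>;
    in practice T would be a trainable network.
    """
    t = lambda a, b: 0.25 * np.sum(a * b, axis=1)
    joint_term = t(x, z).mean()                 # expectation under the joint
    z_shuffled = z[rng.permutation(len(z))]     # break pairing -> product of marginals
    marginal_term = np.log(np.exp(t(x, z_shuffled)).mean())
    return joint_term - marginal_term

rng = np.random.default_rng(0)
x = rng.standard_normal((4000, 2))
noise = rng.standard_normal((4000, 2))

dependent = dv_mi_estimate(x, x, rng)        # z = x: strongly dependent pair
independent = dv_mi_estimate(x, noise, rng)  # z independent of x

print(dependent > independent)  # True: dependent features yield the larger estimate
```

In a JDMAN-style setup such an estimate would be minimized with respect to the feature extractor on the untransferable components (while the critic is trained to tighten the bound), which is what drives the domain shift down.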




    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. domain shift
    2. facial expression recognition
    3. metric learning
    4. mutual information
    5. semantic gap

    Qualifiers

    • Research-article

    Funding Sources

    • Shenzhen Key Technical Project
    • Guangdong Basic and Applied Basic Research Foundation
    • Open Project Fund from Shenzhen Institute of Artificial Intelligence and Robotics for Society
    • Medical Biometrics Perception and Analysis Engineering Laboratory
    • Shenzhen Fundamental Research Fund

    Conference

    MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


    Cited By

    • (2024) Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition. Proceedings of the 32nd ACM International Conference on Multimedia, 4236-4245. DOI: 10.1145/3664647.3680747
    • (2024) Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition. IEEE Transactions on Multimedia, 26, 6676-6688. DOI: 10.1109/TMM.2024.3355637
    • (2024) Cross-Domain Sample Relationship Learning for Facial Expression Recognition. IEEE Transactions on Multimedia, 26, 3788-3798. DOI: 10.1109/TMM.2023.3316027
    • (2024) CFAN-SDA: Coarse-Fine Aware Network With Static-Dynamic Adaptation for Facial Expression Recognition in Videos. IEEE Transactions on Circuits and Systems for Video Technology, 34(12), 13507-13517. DOI: 10.1109/TCSVT.2024.3450652
    • (2024) CLIP-Guided Bidirectional Prompt and Semantic Supervision for Dynamic Facial Expression Recognition. 2024 IEEE International Joint Conference on Biometrics (IJCB), 1-10. DOI: 10.1109/IJCB62174.2024.10744485
    • (2024) Cross-domain facial expression recognition based on adversarial attack fine-tuning learning. Engineering Applications of Artificial Intelligence, 136(PB). DOI: 10.1016/j.engappai.2024.109014
    • (2023) Cross-Domain Facial Expression Recognition through Reliable Global-Local Representation Learning and Dynamic Label Weighting. Electronics, 12(21), 4553. DOI: 10.3390/electronics12214553
    • (2023) Cross-domain facial expression recognition via disentangling identity representation. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 1213-1221. DOI: 10.24963/ijcai.2023/135
    • (2023) Learning from More: Combating Uncertainty Cross-multidomain for Facial Expression Recognition. Proceedings of the 31st ACM International Conference on Multimedia, 5889-5898. DOI: 10.1145/3581783.3611702
    • (2023) Deep Margin-Sensitive Representation Learning for Cross-Domain Facial Expression Recognition. IEEE Transactions on Multimedia, 25, 1359-1373. DOI: 10.1109/TMM.2022.3141604
