DOI: 10.1145/3474085.3475484

JDMAN: Joint Discriminative and Mutual Adaptation Networks for Cross-Domain Facial Expression Recognition

Published: 17 October 2021

Abstract

Cross-domain Facial Expression Recognition (FER) is challenging due to the difficulty of concurrently handling the domain shift and semantic gap during domain adaptation. Existing methods mainly focus on reducing the domain discrepancy for transferable features but fail to decrease the semantic one, which may result in negative transfer. To this end, we propose Joint Discriminative and Mutual Adaptation Networks (JDMAN), which collaboratively bridge the domain shift and semantic gap by domain- and category-level co-adaptation based on mutual information and discriminative metric learning techniques. Specifically, we design a mutual information minimization module for domain-level adaptation, which narrows the domain shift by simultaneously distilling the domain-invariant components and eliminating the untransferable ones lying in different domains. Moreover, we propose a semantic metric learning module for category-level adaptation, which can close the semantic discrepancy during discriminative intra-domain representation learning and transferable inter-domain knowledge discovery. These two modules are jointly leveraged in our JDMAN to safely transfer the source knowledge to target data in an end-to-end manner. Extensive experimental results on six databases show that our method achieves state-of-the-art performance. The code of our JDMAN is available at https://github.com/YingjianLi/JDMAN.
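To make the domain-level idea concrete: neural mutual-information estimators of the kind the mutual information minimization module builds on typically work with the Donsker-Varadhan lower bound, I(X;Z) ≥ E_joint[T(x,z)] − log E_marginals[e^T(x,z)]. The toy sketch below is our own illustration, not the authors' code: the fixed dot-product critic `T` and the synthetic standard-normal features are assumptions made purely to show the estimator's behavior, where paired (dependent) features score higher than independent ones.

```python
import numpy as np

def dv_mi_estimate(x, z, rng):
    """Donsker-Varadhan lower-bound estimate of I(X; Z).

    Uses a fixed illustrative critic T(x, z) = 0.25 * <x, z>;
    in practice T would be a trainable network.
    """
    t = lambda a, b: 0.25 * np.sum(a * b, axis=1)
    joint_term = t(x, z).mean()                 # expectation under the joint
    z_shuffled = z[rng.permutation(len(z))]     # break pairing -> product of marginals
    marginal_term = np.log(np.exp(t(x, z_shuffled)).mean())
    return joint_term - marginal_term

rng = np.random.default_rng(0)
x = rng.standard_normal((4000, 2))
noise = rng.standard_normal((4000, 2))

dependent = dv_mi_estimate(x, x, rng)        # z = x: strongly dependent pair
independent = dv_mi_estimate(x, noise, rng)  # z independent of x

print(dependent > independent)  # True: dependent features yield the larger estimate
```

In a JDMAN-style setup such an estimate would be minimized with respect to the feature extractor on the untransferable components (while the critic is trained to tighten the bound), which is what drives the domain shift down.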




    Published In

    MM '21: Proceedings of the 29th ACM International Conference on Multimedia
    October 2021
    5796 pages
    ISBN:9781450386517
    DOI:10.1145/3474085


    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. domain shift
    2. facial expression recognition
    3. metric learning
    4. mutual information
    5. semantic gap

    Qualifiers

    • Research-article

    Funding Sources

    • Shenzhen Key Technical Project
    • Guangdong Basic and Applied Basic Research Foundation
    • Open Project Fund from Shenzhen Institute of Artificial Intelligence and Robotics for Society
    • Medical Biometrics Perception and Analysis Engineering Laboratory
    • Shenzhen Fundamental Research Fund

    Conference

    MM '21: ACM Multimedia Conference
    October 20 - 24, 2021
    Virtual Event, China

    Acceptance Rates

    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%


    Cited By

    • (2024) Learning with Alignments: Tackling the Inter- and Intra-domain Shifts for Cross-multidomain Facial Expression Recognition. Proceedings of the 32nd ACM International Conference on Multimedia, 4236-4245. DOI: 10.1145/3664647.3680747
    • (2024) Adaptive Global-Local Representation Learning and Selection for Cross-Domain Facial Expression Recognition. IEEE Transactions on Multimedia, 26, 6676-6688. DOI: 10.1109/TMM.2024.3355637
    • (2024) Cross-Domain Sample Relationship Learning for Facial Expression Recognition. IEEE Transactions on Multimedia, 26, 3788-3798. DOI: 10.1109/TMM.2023.3316027
    • (2024) CFAN-SDA: Coarse-Fine Aware Network With Static-Dynamic Adaptation for Facial Expression Recognition in Videos. IEEE Transactions on Circuits and Systems for Video Technology, 34(12), 13507-13517. DOI: 10.1109/TCSVT.2024.3450652
    • (2024) CLIP-Guided Bidirectional Prompt and Semantic Supervision for Dynamic Facial Expression Recognition. 2024 IEEE International Joint Conference on Biometrics (IJCB), 1-10. DOI: 10.1109/IJCB62174.2024.10744485
    • (2024) Cross-domain facial expression recognition based on adversarial attack fine-tuning learning. Engineering Applications of Artificial Intelligence, 136(PB). DOI: 10.1016/j.engappai.2024.109014
    • (2023) Cross-Domain Facial Expression Recognition through Reliable Global-Local Representation Learning and Dynamic Label Weighting. Electronics, 12(21), 4553. DOI: 10.3390/electronics12214553
    • (2023) Cross-domain facial expression recognition via disentangling identity representation. Proceedings of the Thirty-Second International Joint Conference on Artificial Intelligence, 1213-1221. DOI: 10.24963/ijcai.2023/135
    • (2023) Learning from More: Combating Uncertainty Cross-multidomain for Facial Expression Recognition. Proceedings of the 31st ACM International Conference on Multimedia, 5889-5898. DOI: 10.1145/3581783.3611702
    • (2023) Deep Margin-Sensitive Representation Learning for Cross-Domain Facial Expression Recognition. IEEE Transactions on Multimedia, 25, 1359-1373. DOI: 10.1109/TMM.2022.3141604
