research-article

TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition

Authors:

Zhiyong Li

ACM Transactions on Multimedia Computing, Communications and Applications, Volume 19, Issue 3

Article No.: 113, Pages 1 - 17

https://doi.org/10.1145/3570329

Published: 15 March 2023

Abstract

Single-label facial expression recognition (FER), which aims to assign a single expression label to each facial image, usually suffers from noisy and incomplete labels: the manual annotations for some training images are wrong or incomplete, which degrades performance. Although prior work has attempted to leverage external sources or manual annotations to handle this problem, doing so usually incurs extra cost. This article explores a simple yet effective three-phase paradigm (“warm-up,” “selection,” and “relabeling”) for the FER task. First, the warm-up phase builds an initial recognition network from the noisy samples for discriminative feature extraction and facial expression prediction. Then, the selection phase defines several rules to choose high-confidence samples according to their prediction scores, and the relabeling phase assigns two potential labels to those samples and updates the network with a composite two-label loss. Compared with previous studies, the three-phase learning can correct noisy labels in the ground truth without extra information and automatically assign two potential labels to single-label samples without manual annotation. As a result, the label information is purified and supplemented at little cost, yielding significant performance improvement. Extensive experiments on three datasets demonstrate that our approach is robust to noisy training samples and outperforms several state-of-the-art methods.
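The selection and relabeling phases described above can be illustrated with a minimal sketch. The abstract does not give the actual selection rules or the form of the composite two-label loss, so everything specific here is an assumption made for illustration: the confidence threshold, the choice of the two highest-scoring classes as the potential labels, the weighted cross-entropy, and all function and parameter names (`select_confident`, `top2_labels`, `two_label_loss`, `threshold`, `alpha`) are hypothetical and are not the authors' method.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (numerically stable)."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_confident(probs, threshold=0.7):
    """Hypothetical selection rule: keep samples whose top score exceeds a threshold."""
    return [i for i, p in enumerate(probs) if max(p) > threshold]


def top2_labels(p):
    """Hypothetical relabeling: take the two highest-scoring classes as potential labels."""
    order = sorted(range(len(p)), key=lambda k: p[k], reverse=True)
    return order[0], order[1]


def two_label_loss(p, primary, secondary, alpha=0.8):
    """Hypothetical composite loss: weighted sum of two cross-entropy terms."""
    eps = 1e-12  # guards against log(0)
    return -(alpha * math.log(p[primary] + eps)
             + (1 - alpha) * math.log(p[secondary] + eps))


# Toy prediction scores for three images over three expression classes,
# standing in for the output of a warmed-up network.
logits = [[4.0, 1.0, 0.0], [1.0, 1.2, 0.9], [0.0, 0.5, 5.0]]
probs = [softmax(row) for row in logits]

selected = select_confident(probs)                # indices of confident samples
labels = [top2_labels(probs[i]) for i in selected]
losses = [two_label_loss(probs[i], a, b) for i, (a, b) in zip(selected, labels)]
```

In this toy run the second image is left out because none of its scores is dominant, while the first and third images each receive a primary and a secondary label. In a real training loop the loss would be backpropagated through the network; the sketch only evaluates it.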

Index Terms

TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision representations
        Image representations

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications Volume 19, Issue 3

May 2023

514 pages

ISSN:1551-6857

EISSN:1551-6865

DOI:10.1145/3582886

Editor:
Abdulmotaleb El Saddik
Mohamed Bin Zayed University of Artificial Intelligence, UAE and University of Ottawa, Canada

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 15 March 2023

Online AM: 17 November 2022

Accepted: 22 October 2022

Revised: 22 August 2022

Received: 13 April 2022

Published in TOMM Volume 19, Issue 3

Qualifiers

Research-article

Funding Sources

National Natural Science Foundation of China
State Grid Science and Technology Project
Special Project of Foshan Science and Technology Innovation Team
National Natural Science Foundation of Changsha
