TP-FER: An Effective Three-phase Noise-tolerant Recognizer for Facial Expression Recognition

Published: 15 March 2023

Abstract

Single-label facial expression recognition (FER), which aims to assign a single expression label to each facial image, usually suffers from noisy and incomplete labels: the manual annotations for some training images are wrong or incomplete, which degrades performance. Although prior work has attempted to leverage external sources or manual annotations to handle this problem, doing so usually incurs extra cost. This article explores a simple yet effective three-phase paradigm (“warm-up,” “selection,” and “relabeling”) for the FER task. First, the warm-up phase builds an initial recognition network from the noisy samples for discriminative feature extraction and facial expression prediction. Then, the selection phase defines several rules to choose high-confidence samples according to their prediction scores, and the relabeling phase assigns two potential labels to those samples and updates the network with a composite two-label loss. Compared with previous studies, the three-phase learning can effectively correct noisy ground-truth labels without extra information and automatically assign two potential labels to single-label samples without manual annotation. As a result, the label information is purified and supplemented at little cost, yielding significant performance improvement. Extensive experiments on three datasets demonstrate that our approach is robust to noisy training samples and outperforms several state-of-the-art methods.
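
The selection and relabeling phases can be illustrated with a minimal sketch. The abstract does not state the selection rules or the form of the composite two-label loss, so everything specific here is an assumption made for illustration rather than the authors' method: the single confidence threshold, the choice of the top-two predicted classes as the two potential labels, the weighted sum of two cross-entropy terms, and the names `select_confident`, `top_two_labels`, `two_label_loss`, `threshold`, and `alpha`. The toy logits stand in for the outputs of a warmed-up network.

```python
import math


def softmax(logits):
    """Convert raw scores into a probability distribution."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def select_confident(probs, threshold=0.7):
    """Assumed selection rule: keep samples whose top score reaches the threshold."""
    return [i for i, p in enumerate(probs) if max(p) >= threshold]


def top_two_labels(p):
    """Assumed relabeling rule: take the two highest-scoring classes."""
    order = sorted(range(len(p)), key=lambda k: p[k], reverse=True)
    return order[0], order[1]


def two_label_loss(p, primary, secondary, alpha=0.8):
    """Assumed composite loss: weighted sum of two cross-entropy terms."""
    eps = 1e-12  # guards against log(0)
    return -(alpha * math.log(p[primary] + eps)
             + (1 - alpha) * math.log(p[secondary] + eps))


# Toy outputs of a warmed-up network for three images and three classes.
logits = [
    [4.0, 1.0, 0.5],  # confidently class 0
    [1.0, 1.1, 0.9],  # ambiguous, should not be selected
    [0.2, 0.1, 3.5],  # confidently class 2
]
noisy_labels = [1, 1, 2]  # the first annotation disagrees with the network

probs = [softmax(z) for z in logits]
selected = select_confident(probs)
relabeled = {i: top_two_labels(probs[i]) for i in selected}
losses = {i: two_label_loss(probs[i], *relabeled[i]) for i in selected}

for i in selected:
    print(i, noisy_labels[i], relabeled[i], round(losses[i], 4))
```

In this toy run the first image, annotated as class 1, is selected and relabeled with class 0 as its primary label and class 1 as its secondary label, while the ambiguous second image is left out of the update.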

Information

Published In

ACM Transactions on Multimedia Computing, Communications, and Applications, Volume 19, Issue 3
May 2023
514 pages
ISSN: 1551-6857
EISSN: 1551-6865
DOI: 10.1145/3582886
Editor: Abdulmotaleb El Saddik

Publisher

Association for Computing Machinery
New York, NY, United States

Publication History

Published: 15 March 2023
Online AM: 17 November 2022
Accepted: 22 October 2022
Revised: 22 August 2022
Received: 13 April 2022
Published in TOMM Volume 19, Issue 3

Author Tags

1. Facial expression recognition
2. label noise
3. three-phase learning

Qualifiers

• Research-article

Funding Sources

• National Natural Science Foundation of China
• State Grid Science and Technology Project
• Special Project of Foshan Science and Technology Innovation Team
• National Natural Science Foundation of Changsha