ASM: Adaptive Sample Mining for In-The-Wild Facial Expression Recognition

Zhang, Ziyang; Sun, Xiao; An, Liuwei; Wang, Meng

doi:10.1007/978-981-99-8469-5_23

Part of the book series: Lecture Notes in Computer Science ((LNCS,volume 14429))

Included in the following conference series:

Chinese Conference on Pattern Recognition and Computer Vision (PRCV)

539 Accesses
1 Citations

Abstract

Given the similarity between facial expression categories, the presence of compound facial expressions, and the subjectivity of annotators, facial expression recognition (FER) datasets often suffer from ambiguity and noisy labels. Ambiguous expressions are challenging to differentiate from expressions with noisy labels, which hurt the robustness of FER models. Furthermore, the difficulty of recognition varies across different expression categories, rendering a uniform approach unfair for all expressions. In this paper, we introduce a novel approach called Adaptive Sample Mining (ASM) to dynamically address ambiguity and noise within each expression category. First, the Adaptive Threshold Learning module generates two thresholds, namely the clean and noisy thresholds, for each category. These thresholds are based on the mean class probabilities at each training epoch. Next, the Sample Mining module partitions the dataset into three subsets: clean, ambiguity, and noise, by comparing the sample confidence with the clean and noisy thresholds. Finally, the Tri-Regularization module employs a mutual learning strategy for the ambiguity subset to enhance discrimination ability, and an unsupervised learning strategy for the noise subset to mitigate the impact of noisy labels. Extensive experiments prove that our method can effectively mine both ambiguity and noise, and outperform SOTA methods on both synthetic noisy and original datasets. The supplement material is available at https://github.com/zzzzzzyang/ASM.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Log in via an institution

Subscribe and save

Springer+ Basic

$34.99 /Month

Get 10 units per month
Download Article/Chapter or eBook
1 Unit = 1 Article or 1 Chapter
Cancel anytime

Buy Now

Chapter: USD 29.95; Price excludes VAT (USA)

eBook: USD 69.99; Price excludes VAT (USA)

Softcover Book: USD 89.99; Price excludes VAT (USA)

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

SAST: a suppressing ambiguity self-training framework for facial expression recognition

Article 06 December 2023

Dynamic adaptive threshold based learning for noisy annotations robust facial expression recognition

Article 04 November 2023

Self-supervised facial expression recognition with fine-grained feature selection

Article 17 March 2024

References

Arpit, D., et al.: A closer look at memorization in deep networks. In: International Conference on Machine Learning, pp. 233–242. PMLR (2017)
Google Scholar
Barsoum, E., Zhang, C., Ferrer, C.C., Zhang, Z.: Training deep networks for facial expression recognition with crowd-sourced label distribution. In: Proceedings of the 18th ACM International Conference on Multimodal Interaction, pp. 279–283 (2016)
Google Scholar
Blum, A., Mitchell, T.: Combining labeled and unlabeled data with co-training. In: Proceedings of the Eleventh Annual Conference on Computational Learning Theory, pp. 92–100 (1998)
Google Scholar
Farzaneh, A.H., Qi, X.: Facial expression recognition in the wild via deep attentive center loss. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 2402–2411 (2021)
Google Scholar
Goodfellow, I.J., et al.: Challenges in representation learning: a report on three machine learning contests. In: Lee, M., Hirose, A., Hou, Z.-G., Kil, R.M. (eds.) ICONIP 2013. LNCS, vol. 8228, pp. 117–124. Springer, Heidelberg (2013). https://doi.org/10.1007/978-3-642-42051-1_16
Chapter Google Scholar
Guo, L.Z., Li, Y.F.: Class-imbalanced semi-supervised learning with adaptive thresholding. In: Proceedings of the 39th International Conference on Machine Learning, pp. 8082–8094 (2022)
Google Scholar
Guo, L.Z., Zhang, Y.G., Wu, Z.F., Shao, J.J., Li, Y.F.: Robust semi-supervised learning when not all classes have labels. In: Advanced Neural Information Processing Systems, vol. 35, pp. 3305–3317 (2022)
Google Scholar
Guo, Y., Zhang, L., Hu, Y., He, X., Gao, J.: MS-Celeb-1M: a dataset and benchmark for large-scale face recognition. In: Leibe, B., Matas, J., Sebe, N., Welling, M. (eds.) ECCV 2016. LNCS, vol. 9907, pp. 87–102. Springer, Cham (2016). https://doi.org/10.1007/978-3-319-46487-9_6
Chapter Google Scholar
Han, B., et al.: Co-teaching: robust training of deep neural networks with extremely noisy labels. In: Advances in Neural Information Processing Systems 31 (2018)
Google Scholar
He, K., Zhang, X., Ren, S., Sun, J.: Deep residual learning for image recognition. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 2016
Google Scholar
Li, J., Socher, R., Hoi, S.C.: DIVIDEMIX: learning with noisy labels as semi-supervised learning. arXiv preprint arXiv:2002.07394 (2020)
Li, S., Deng, W., Du, J.: Reliable crowdsourcing and deep locality-preserving learning for expression recognition in the wild. In: Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 2852–2861 (2017)
Google Scholar
Van der Maaten, L., Hinton, G.: Visualizing data using t-SNE. J. Mach. Learn. Res. 9(11), 2579–2605 (2008)
Google Scholar
Malach, E., Shalev-Shwartz, S.: Decoupling “when to update” from “how to update”. In: Advances in Neural Information Processing Systems 30 (2017)
Google Scholar
Mollahosseini, A., Hasani, B., Mahoor, M.H.: AffectNet: a database for facial expression, valence, and arousal computing in the wild. IEEE Trans. Affect. Comput. 10(1), 18–31 (2017)
Article Google Scholar
Nigam, K., Ghani, R.: Analyzing the effectiveness and applicability of co-training. In: Proceedings of the Ninth International Conference on Information and Knowledge Management, pp. 86–93 (2000)
Google Scholar
Sarfraz, F., Arani, E., Zonooz, B.: Noisy concurrent training for efficient learning under label noise. In: Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision, pp. 3159–3168 (2021)
Google Scholar
Shao, J., Wu, Z., Luo, Y., Huang, S., Pu, X., Ren, Y.: Self-paced label distribution learning for in-the-wild facial expression recognition. In: Proceedings of the 30th ACM International Conference on Multimedia, pp. 161–169 (2022)
Google Scholar
She, J., Hu, Y., Shi, H., Wang, J., Shen, Q., Mei, T.: Dive into ambiguity: latent distribution mining and pairwise uncertainty estimation for facial expression recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6248–6257 (2021)
Google Scholar
Sindhwani, V., Niyogi, P., Belkin, M.: A co-regularization approach to semi-supervised learning with multiple views. In: Proceedings of ICML Workshop on Learning with Multiple Views, vol. 2005, pp. 74–79. Citeseer (2005)
Google Scholar
Wang, K., Peng, X., Yang, J., Lu, S., Qiao, Y.: Suppressing uncertainties for large-scale facial expression recognition. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 6897–6906 (2020)
Google Scholar
Wang, K., Peng, X., Yang, J., Meng, D., Qiao, Y.: Region attention networks for pose and occlusion robust facial expression recognition. IEEE Trans. Image Process. 29, 4057–4069 (2020)
Article Google Scholar
Wang, Y., et al.: FreeMatch: self-adaptive thresholding for semi-supervised learning. arXiv preprint arXiv:2205.07246 (2022)
Wei, H., Feng, L., Chen, X., An, B.: Combating noisy labels by agreement: a joint training method with co-regularization. In: Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, pp. 13726–13735 (2020)
Google Scholar
Zeng, J., Shan, S., Chen, X.: Facial expression recognition with inconsistently annotated datasets. In: Ferrari, V., Hebert, M., Sminchisescu, C., Weiss, Y. (eds.) ECCV 2018. LNCS, vol. 11217, pp. 227–243. Springer, Cham (2018). https://doi.org/10.1007/978-3-030-01261-8_14
Chapter Google Scholar
Zhang, H., Cisse, M., Dauphin, Y.N., Lopez-Paz, D.: mixup: beyond empirical risk minimization. arXiv preprint arXiv:1710.09412 (2017)
Zhang, K., Zhang, Z., Li, Z., Qiao, Y.: Joint face detection and alignment using multitask cascaded convolutional networks. IEEE Signal Process. Lett. 23(10), 1499–1503 (2016)
Article Google Scholar
Zhang, Y., Wang, C., Deng, W.: Relative uncertainty learning for facial expression recognition. In: Advances in Neural Information Processing Systems 34 (2021)
Google Scholar
Zhang, Y., Wang, C., Ling, X., Deng, W.: Learn from all: Erasing attention consistency for noisy label facial expression recognition. In: Avidan, S., Brostow, G., Cissé, M., Farinella, G.M., Hassner, T. (eds.) Computer Vision-ECCV 2022: 17th European Conference, Tel Aviv, Israel, October 23–27, 2022, Proceedings, Part XXVI. pp. 418–434. Springer, Cham (2022). https://doi.org/10.1007/978-3-031-19809-0_24

Download references

Acknowledgments

This work was supported by the National Key R &D Programme of China (2022YFC3803202), Major Project of Anhui Province under Grant 202203a05020011. This work was done in Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machine.

Author information

Authors and Affiliations

School of Computer Science and Information Engineering, Hefei University of Technology, Heifei, China
Ziyang Zhang, Xiao Sun, Liuwei An & Meng Wang
Anhui Province Key Laboratory of Affective Computing and Advanced Intelligent Machines, Hefei University of Technology, Heifei, China
Ziyang Zhang, Xiao Sun, Liuwei An & Meng Wang
Institute of Artificial Intelligence, Hefei Comprehensive National Science Center, Heifei, China
Xiao Sun & Meng Wang

Authors

Ziyang Zhang
View author publications
You can also search for this author in PubMed Google Scholar
Xiao Sun
View author publications
You can also search for this author in PubMed Google Scholar
Liuwei An
View author publications
You can also search for this author in PubMed Google Scholar
Meng Wang
View author publications
You can also search for this author in PubMed Google Scholar

Corresponding author

Correspondence to Xiao Sun .

Editor information

Editors and Affiliations

Nanjing University of Information Science and Technology, Nanjing, China
Qingshan Liu
Xiamen University, Xiamen, China
Hanzi Wang
Beijing University of Posts and Telecommunications, Beijing, China
Zhanyu Ma
Sun Yat-sen University, Guangzhou, China
Weishi Zheng
Peking University, Beijing, China
Hongbin Zha
Chinese Academy of Sciences, Beijing, China
Xilin Chen
Chinese Academy of Sciences, Beijing, China
Liang Wang
Xiamen University, Xiamen, China
Rongrong Ji

Rights and permissions

Reprints and permissions

Copyright information

About this paper

Cite this paper

Zhang, Z., Sun, X., An, L., Wang, M. (2024). ASM: Adaptive Sample Mining for In-The-Wild Facial Expression Recognition. In: Liu, Q., et al. Pattern Recognition and Computer Vision. PRCV 2023. Lecture Notes in Computer Science, vol 14429. Springer, Singapore. https://doi.org/10.1007/978-981-99-8469-5_23

Download citation

DOI: https://doi.org/10.1007/978-981-99-8469-5_23
Published: 25 December 2023
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-8468-8
Online ISBN: 978-981-99-8469-5
eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics

ASM: Adaptive Sample Mining for In-The-Wild Facial Expression Recognition

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

SAST: a suppressing ambiguity self-training framework for facial expression recognition

Dynamic adaptive threshold based learning for noisy annotations robust facial expression recognition

Self-supervised facial expression recognition with fine-grained feature selection

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Publish with us

Subscribe and save

Buy Now

Navigation

ASM: Adaptive Sample Mining for In-The-Wild Facial Expression Recognition

Abstract

Access this chapter

Subscribe and save

Buy Now

Similar content being viewed by others

SAST: a suppressing ambiguity self-training framework for facial expression recognition

Dynamic adaptive threshold based learning for noisy annotations robust facial expression recognition

Self-supervised facial expression recognition with fine-grained feature selection

References

Acknowledgments

Author information

Authors and Affiliations

Corresponding author

Editor information

Editors and Affiliations

Rights and permissions

Copyright information

About this paper

Cite this paper

Download citation

Share this paper

Publish with us

Search

Navigation