Fine-grained facial expression recognition via relational reasoning and hierarchical relation optimization

Published: 01 December 2022

Highlights

We formulate fine-grained FER as a mapping problem and propose an R3HO-Net.
The intra-image graph is built from the structural relations among image sets.
An intra-label graph is constructed from the semantic associations between labels.
A hierarchical relation optimization strategy refines fine-grained predictions.
Experiments on the FG-Emotions dataset show that R3HO-Net outperforms the state of the art.

Abstract

Facial expression recognition (FER), which aims to identify the type of a facial expression, has made significant progress. However, most existing FER approaches ignore both the structural relations among the images in a set and the semantic associations between labels. Recently, some studies have turned to fine-grained FER, which involves a hierarchical label structure, but they explore only the hierarchical relations among labels. Motivated by this, we propose a relational reasoning and hierarchical relation optimization network (R3HO-Net) that exploits all three kinds of relations simultaneously. Concretely, we first construct two sub-graphs, i.e., an intra-image graph (IIG) and an intra-label graph (ILG). Meanwhile, we propose an entropy-based relation-adaptive initialization strategy to construct a heterogeneous inter-graph (HIG). The fine-grained stream of R3HO-Net, which includes a relational update GCN module, then updates the graphs iteratively and outputs the mapping probabilities between heterogeneous node pairs to infer the final mapping results. Moreover, we propose a hierarchical label optimization module and a hierarchical optimization loss to refine the fine-grained predictions. Extensive experiments on several benchmarks demonstrate the superiority of the proposed approach.
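The abstract does not specify the internals of the relational update GCN module. As a rough illustration only (not the authors' actual architecture), the sketch below applies one standard symmetric-normalized graph-convolution update, in the style of Kipf and Welling's GCN, to a toy heterogeneous graph mixing image and label nodes, and reads off image-to-label mapping probabilities with a softmax. The adjacency pattern, feature sizes, and node counts are all invented for this example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One symmetric-normalized graph convolution:
    H' = relu(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)  # ReLU

# Toy heterogeneous setup: nodes 0-2 are image nodes, nodes 3-4 are label nodes.
rng = np.random.default_rng(0)
A = np.array([  # symmetric block adjacency: image-image, label-label, image-label links
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
], dtype=float)
H = rng.standard_normal((5, 8))   # initial node features
W = rng.standard_normal((8, 4))   # weight matrix (random stand-in for a learned one)

H1 = gcn_layer(A, H, W)           # one relational update over the whole graph
# Mapping scores between image nodes (0-2) and label nodes (3-4):
scores = H1[:3] @ H1[3:].T
probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
print(probs.shape)  # (3, 2): per-image probabilities over labels
```

In the paper this update would be iterated and the heterogeneous image-label edges would come from the entropy-based initialization; here a single pass over a fixed toy graph simply shows the mechanics.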


Cited By

  • Editorial for Pattern Recognition Letters special issue on face-based emotion understanding, Pattern Recognition Letters 168:C (2023), 8–9, doi:10.1016/j.patrec.2023.02.022. Online publication date: 1 Apr 2023.
  • Dead pixel test using effective receptive field, Pattern Recognition Letters 167:C (2023), 149–156, doi:10.1016/j.patrec.2023.02.018. Online publication date: 1 Mar 2023.

Published In

Pattern Recognition Letters, Volume 164, Issue C, Dec 2022, 293 pages

Publisher

Elsevier Science Inc., United States


        Author Tags

        1. Fine-grained facial expression recognition
        2. Hierarchical label
        3. Relational reasoning
        4. GCN

        Qualifiers

        • Research-article
