Fine-grained facial expression recognition via relational reasoning and hierarchical relation optimization

Published: 01 December 2022

Highlights

We formulate fine-grained FER as a mapping problem and propose an R3HO-Net.
The intra-image graph is built from the structural relations among image sets.
An intra-label graph is constructed from the semantic associations between labels.
A hierarchical relation optimization strategy refines fine-grained predictions.
Experiments on the FG-Emotions dataset show that R3HO-Net outperforms the state of the art.

Abstract

Facial expression recognition (FER), which aims to identify the type of a facial expression, has made significant progress. However, most existing FER approaches ignore both the structural relations among the images in a set and the semantic associations between labels. Recently, some studies have turned to fine-grained FER, which involves a hierarchical label structure, but they explore only the hierarchical relations among labels. Motivated by this, we propose a relational reasoning and hierarchical relation optimization network (R3HO-Net) that exploits all three kinds of relations simultaneously. Concretely, we first construct two sub-graphs, i.e., an intra-image graph (IIG) and an intra-label graph (ILG). Meanwhile, we propose an entropy-based relation-adaptive initialization strategy to construct a heterogeneous inter-graph (HIG). The fine-grained stream of R3HO-Net, which includes a relational update GCN module, then updates the graphs iteratively and outputs the mapping probabilities between heterogeneous node pairs to infer the final mapping results. Moreover, we propose a hierarchical label optimization module and a hierarchical optimization loss to refine the fine-grained predictions. Extensive experiments on several benchmarks demonstrate the superiority of the proposed approach.
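The abstract does not specify the internals of the relational update GCN module. As a rough illustration only (not the authors' actual architecture), the sketch below applies one standard symmetric-normalized graph-convolution update, in the style of Kipf and Welling's GCN, to a toy heterogeneous graph mixing image and label nodes, and reads off image-to-label mapping probabilities with a softmax. The adjacency pattern, feature sizes, and node counts are all invented for this example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One symmetric-normalized graph convolution:
    H' = relu(D^-1/2 (A + I) D^-1/2 H W)."""
    A_hat = A + np.eye(A.shape[0])          # add self-loops
    d = A_hat.sum(axis=1)                   # node degrees
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))  # D^-1/2
    A_norm = D_inv_sqrt @ A_hat @ D_inv_sqrt
    return np.maximum(A_norm @ H @ W, 0.0)  # ReLU

# Toy heterogeneous setup: nodes 0-2 are image nodes, nodes 3-4 are label nodes.
rng = np.random.default_rng(0)
A = np.array([  # symmetric block adjacency: image-image, label-label, image-label links
    [0, 1, 0, 1, 0],
    [1, 0, 1, 1, 0],
    [0, 1, 0, 0, 1],
    [1, 1, 0, 0, 1],
    [0, 0, 1, 1, 0],
], dtype=float)
H = rng.standard_normal((5, 8))   # initial node features
W = rng.standard_normal((8, 4))   # weight matrix (random stand-in for a learned one)

H1 = gcn_layer(A, H, W)           # one relational update over the whole graph
# Mapping scores between image nodes (0-2) and label nodes (3-4):
scores = H1[:3] @ H1[3:].T
probs = np.exp(scores) / np.exp(scores).sum(axis=1, keepdims=True)
print(probs.shape)  # (3, 2): per-image probabilities over labels
```

In the paper this update would be iterated and the heterogeneous image-label edges would come from the entropy-based initialization; here a single pass over a fixed toy graph simply shows the mechanics.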


Cited By

  • Editorial for Pattern Recognition Letters special issue on face-based emotion understanding, Pattern Recognition Letters 168:C (2023), 8–9, doi:10.1016/j.patrec.2023.02.022. Online publication date: 1 Apr 2023.
  • Dead pixel test using effective receptive field, Pattern Recognition Letters 167:C (2023), 149–156, doi:10.1016/j.patrec.2023.02.018. Online publication date: 1 Mar 2023.

Published In

Pattern Recognition Letters, Volume 164, Issue C, Dec 2022, 293 pages

Publisher

Elsevier Science Inc., United States


        Author Tags

        1. Fine-grained facial expression recognition
        2. Hierarchical label
        3. Relational reasoning
        4. GCN

        Qualifiers

        • Research-article
