research-article

Facial expression recognition via coarse-grained and fine-grained feature representation

Published: 01 January 2022

Abstract

Recognizing facial expressions relies on the movement of facial parts (action units) such as the eyes, mouth, and nose. Existing methods either use complex subnetworks to learn part-based facial features or train neural networks on an extensively perturbed dataset. In contrast, we propose an end-to-end trainable convolutional neural network for facial expression recognition. First, we propose a Local Prediction Penalty to encourage facial expression recognition research that does not rely on part-based learning. The technique penalizes the local predictive power of the feature extractor, coercing it to learn coarse-grained features (the general facial expression). The Local Prediction Penalty forces the network to disregard predictive signals learned from local receptive fields and to depend instead on the global facial region. Second, we propose a Spatial Self-Attention method for fine-grained feature representation, which learns distinct facial features from pixel positions. Spatial Self-Attention accumulates attention features at privileged positions without changing the spatial feature dimension. Lastly, we use a classifier to combine all learned features (coarse-grained and fine-grained) for better feature representation. Extensive experiments demonstrate that the proposed methods significantly improve facial expression recognition performance.
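The abstract does not give the Spatial Self-Attention equations, so the following is only a minimal sketch of the general idea it describes: attention computed across pixel positions and accumulated back onto the feature map, leaving the spatial dimensions unchanged. It assumes generic scaled dot-product attention with a residual sum. The function name `spatial_self_attention` and the projection matrices `w_q`, `w_k`, and `w_v` are hypothetical and are not taken from the paper.

```python
import numpy as np


def spatial_self_attention(features, w_q, w_k, w_v):
    """Generic self-attention over the pixel positions of an (H, W, C) map.

    Assumption: plain scaled dot-product attention plus a residual sum.
    This is an illustration, not the architecture used in the paper.
    """
    h, w, c = features.shape
    x = features.reshape(h * w, c)            # one row per pixel position
    q, k, v = x @ w_q, x @ w_k, x @ w_v       # hypothetical linear projections
    scores = q @ k.T / np.sqrt(q.shape[1])    # (H*W, H*W) position-to-position scores
    scores = scores - scores.max(axis=1, keepdims=True)  # stabilise the softmax
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    attended = weights @ v                    # attention features per position
    return (x + attended).reshape(h, w, c)    # accumulate; shape is preserved


rng = np.random.default_rng(0)
fmap = rng.standard_normal((4, 4, 8))         # toy feature map, H=W=4, C=8
w_q = rng.standard_normal((8, 4)) * 0.1
w_k = rng.standard_normal((8, 4)) * 0.1
w_v = rng.standard_normal((8, 8)) * 0.1       # must map back to C channels
out = spatial_self_attention(fmap, w_q, w_k, w_v)
print(out.shape == fmap.shape)
```

Because the attention output is summed with the input rather than pooled, the result can be passed to later layers exactly as the original feature map would be.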


Published In

Journal of Intelligent & Fuzzy Systems: Applications in Engineering and Technology, Volume 43, Issue 4
2022
1429 pages

Publisher

IOS Press
Netherlands

Author Tags

1. Facial expression recognition
2. spatial self-attention
3. coarse-grained
4. fine-grained
5. convolutional neural network
