Abstract
Facial expression recognition is a core computer vision task with applications across many domains. In this paper, we propose a self-supervised learning approach for facial expression recognition. Our approach builds on recent advances in diffusion models, specifically the Classification and Regression Diffusion (CARD) model. To enhance the model's discriminative capability, we integrate the Convolutional Block Attention Module (CBAM), an attention mechanism, to extract discriminative feature maps. Furthermore, we exploit unlabelled data by using the simple framework for contrastive learning (SimCLR), a self-supervised learning (SSL) method, to extract meaningful features. We evaluate the approach through extensive experiments on the FER2013 dataset and compare our results with existing benchmarks. The results show significant performance improvements, with the model achieving 66.6% accuracy on FER2013. These quantitative results demonstrate the efficacy of the proposed SSL-based model for accurate and robust facial expression recognition.
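The contrastive objective named in the abstract can be illustrated with a minimal sketch of the normalised temperature-scaled cross-entropy (NT-Xent) loss used by the simple contrastive learning framework. This is not the authors' implementation: the function names, the temperature of 0.5, and the use of plain Python lists in place of encoder outputs are assumptions made for illustration only.

```python
import math


def _normalize(vec):
    """Scale a vector to unit length so dot products equal cosine similarity."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]


def nt_xent_loss(views_a, views_b, temperature=0.5):
    """NT-Xent loss for N pairs of embeddings.

    views_a[i] and views_b[i] are embeddings of two augmented views of the
    same image (a positive pair); every other embedding in the batch acts
    as a negative. The temperature value is an illustrative assumption.
    """
    n = len(views_a)
    z = [_normalize(v) for v in list(views_a) + list(views_b)]
    total = 0.0
    for i in range(2 * n):
        pos = (i + n) % (2 * n)  # index of the other view of the same image
        # Temperature-scaled cosine similarity to every other embedding.
        logits = {
            j: sum(a * b for a, b in zip(z[i], z[j])) / temperature
            for j in range(2 * n)
            if j != i
        }
        # Numerically stable log-sum-exp over the candidates.
        m = max(logits.values())
        log_denom = m + math.log(sum(math.exp(s - m) for s in logits.values()))
        # Cross-entropy with the positive as the correct "class".
        total += log_denom - logits[pos]
    return total / (2 * n)


# Toy embeddings: two images, two views each.
aligned = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[1.0, 0.0], [0.0, 1.0]])
mismatched = nt_xent_loss([[1.0, 0.0], [0.0, 1.0]], [[0.0, 1.0], [1.0, 0.0]])
```

In this toy case the loss is lower when the two views of each image map to the same direction (`aligned`) than when they map to the direction of the other image (`mismatched`), which is the behaviour the contrastive pre-training stage relies on when learning from unlabelled faces.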
Acknowledgment
We thank NORPART-CONNECT for the support and funding that enabled this research. The work was also supported by the European Union through the Horizon 2020 Research and Innovation Programme within the ALAMEDA project (addressing brain disease diagnosis and treatment gaps) under grant agreement No GA 101017558.
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Hassan, S., Ullah, M., Imran, A.S., Cheikh, F.A. (2024). Attention-Guided Self-supervised Framework for Facial Emotion Recognition. In: Liu, F., Sadanandan, A.A., Pham, D.N., Mursanto, P., Lukose, D. (eds) PRICAI 2023: Trends in Artificial Intelligence. PRICAI 2023. Lecture Notes in Computer Science, vol 14327. Springer, Singapore. https://doi.org/10.1007/978-981-99-7025-4_26
DOI: https://doi.org/10.1007/978-981-99-7025-4_26
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-7024-7
Online ISBN: 978-981-99-7025-4
eBook Packages: Computer Science, Computer Science (R0)