Dynamic Occlusion Expression Recognition Based on Improved GAN

  • Conference paper
  • First Online:
Artificial Intelligence and Machine Learning (IAIC 2023)

Abstract

To address the issue of local occlusion in practical dynamic expression recognition, this paper first introduces a facial restoration network that combines a Vision Transformer (ViT) and a GAN. This network can accurately identify missing facial regions and restore them in detail and efficiently. Second, for the expression recognition task, a more robust dynamic expression recognition network is trained by cascading a ViT with a Two-Stream CNN, effectively leveraging the ViT’s feature extraction capability and the Two-Stream CNN’s ability to capture spatio-temporal features. Finally, by combining these two networks, dynamically occluded expressions can be recognized efficiently. Extensive experiments demonstrate that the facial image restoration network trained on the CelebA and VGG Face2 datasets outperforms other networks in handling small and medium occlusions. Expression recognition experiments on the AFEW and MMI datasets show that the proposed recognition network achieves accuracies of 54.95% and 81.2%, respectively, for dynamic expression recognition, surpassing mainstream networks. Moreover, the restoration network outperforms mainstream networks in handling occlusions and yields an average accuracy improvement of 5.34% in occluded expression recognition.
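
To make the cascade concrete, the following is a minimal PyTorch sketch of the two-stage pipeline described above: an occlusion-restoration generator whose output, together with optical flow, is fed to a two-stream (spatial/temporal) recognizer. The class names, layer choices, tensor sizes, and the concatenation-based fusion are illustrative assumptions made for this sketch, not the authors' implementation; only the overall restore-then-recognize structure follows the abstract.

```python
# Minimal sketch (assumptions noted in the comments); not the authors' implementation.
import torch
import torch.nn as nn

class RestorationGenerator(nn.Module):
    """Stand-in for the ViT+GAN restoration generator: occluded frame in,
    restored frame of the same size out (interface assumed, not specified)."""
    def __init__(self, channels=3):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(channels, 64, 3, padding=1), nn.ReLU(),
            nn.Conv2d(64, channels, 3, padding=1),
        )

    def forward(self, x):               # x: (B, 3, H, W) occluded RGB frame
        return self.net(x)              # restored RGB frame

class TwoStreamRecognizer(nn.Module):
    """Stand-in for the ViT + Two-Stream CNN recognizer: a spatial stream on
    RGB frames, a temporal stream on optical flow, fused by concatenation."""
    def __init__(self, num_classes=7):
        super().__init__()
        self.spatial = nn.Sequential(nn.Conv2d(3, 32, 3, stride=2), nn.ReLU(),
                                     nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.temporal = nn.Sequential(nn.Conv2d(2, 32, 3, stride=2), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.classifier = nn.Linear(64, num_classes)

    def forward(self, rgb, flow):       # rgb: (B, 3, H, W), flow: (B, 2, H, W)
        feats = torch.cat([self.spatial(rgb), self.temporal(flow)], dim=1)
        return self.classifier(feats)   # expression logits

# Cascade: restore the occluded frame first, then classify the expression.
restorer, recognizer = RestorationGenerator(), TwoStreamRecognizer()
occluded = torch.randn(1, 3, 224, 224)  # dummy occluded frame
flow = torch.randn(1, 2, 224, 224)      # dummy optical-flow field
logits = recognizer(restorer(occluded), flow)
print(logits.shape)                     # torch.Size([1, 7])
```

In the paper, the generator would be the ViT-based GAN restoration network and each stream a much deeper backbone; the sketch only illustrates the cascading interface between the restoration and recognition stages.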

Acknowledgements

This article has been supported by the State Grid Corporation Technology Guide Project (5700-202218185A-1-1-ZN).

Author information

Corresponding author

Correspondence to Yongli Wang.

Copyright information

© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.

About this paper

Cite this paper

Liang, M., Zhang, M., Liu, K., Li, X., Wang, Y. (2024). Dynamic Occlusion Expression Recognition Based on Improved GAN. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_14

  • DOI: https://doi.org/10.1007/978-981-97-1277-9_14

  • Published:

  • Publisher Name: Springer, Singapore

  • Print ISBN: 978-981-97-1276-2

  • Online ISBN: 978-981-97-1277-9

  • eBook Packages: Computer Science, Computer Science (R0)
