Abstract
To address local occlusion in practical dynamic expression recognition, this paper first introduces a facial restoration network that combines a Vision Transformer (ViT) with a GAN; the network accurately locates missing facial regions and restores them in detail and efficiently. Second, for the recognition task, a more robust dynamic expression recognition network is trained by cascading a ViT with a Two-Stream CNN, effectively combining the ViT's feature-extraction capability with the Two-Stream CNN's ability to capture spatio-temporal features. Finally, combining the two networks enables efficient recognition of dynamically occluded expressions. Extensive experiments demonstrate that the facial image restoration network, trained on the CelebA and VGG Face2 datasets, outperforms other networks on small and medium occlusions. Expression recognition experiments on the AFEW and MMI datasets show that the proposed network achieves accuracies of 54.95% and 81.2%, respectively, for dynamic expression recognition, surpassing mainstream networks. Moreover, the restoration network outperforms mainstream networks in handling occlusions and yields an average accuracy improvement of 5.34% on occluded expression recognition.
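The two-stage pipeline described above (restore occluded regions first, then extract ViT features and fuse spatial and temporal streams) can be sketched as follows. This is a minimal, dependency-free illustration of the data flow only, not the paper's implementation: the restoration, ViT, and stream modules are hypothetical stand-ins (mean-fill inpainting, patch average-pooling, and simple pooled scores).

```python
# Hypothetical sketch of the paper's pipeline: occluded frames are restored,
# then ViT-style features feed a spatial stream (per-frame appearance) and a
# temporal stream (frame-to-frame motion). All modules are toy stand-ins.

def restore_frame(frame, occlusion_mask):
    """Stand-in for the ViT+GAN restoration network: fill masked pixels
    with the mean of the visible ones (a trivial inpainting baseline)."""
    visible = [p for p, m in zip(frame, occlusion_mask) if not m]
    fill = sum(visible) / len(visible) if visible else 0.0
    return [fill if m else p for p, m in zip(frame, occlusion_mask)]

def vit_features(frame, patch=4):
    """Stand-in for ViT patch features: average-pool fixed-size patches."""
    return [sum(frame[i:i + patch]) / patch for i in range(0, len(frame), patch)]

def spatial_stream(feats):
    # Appearance score for one frame: mean of its patch features.
    return sum(feats) / len(feats)

def temporal_stream(prev_feats, feats):
    # Motion score: mean absolute feature difference between adjacent frames.
    return sum(abs(a - b) for a, b in zip(prev_feats, feats)) / len(feats)

def recognize(frames, masks):
    """Full cascade: restore -> ViT features -> two-stream scores.
    Returns the (spatial, temporal) scores a classifier head would fuse."""
    restored = [restore_frame(f, m) for f, m in zip(frames, masks)]
    feats = [vit_features(f) for f in restored]
    spatial = sum(spatial_stream(f) for f in feats) / len(feats)
    temporal = sum(temporal_stream(a, b) for a, b in zip(feats, feats[1:]))
    temporal /= max(len(feats) - 1, 1)
    return spatial, temporal
```

In the actual system each stand-in would be a trained network, and the two stream outputs would be fused by a classification head rather than returned directly.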
Acknowledgements
This work was supported by the State Grid Corporation Technology Guide Project (5700-202218185A-1-1-ZN).
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
Cite this paper
Liang, M., Zhang, M., Liu, K., Li, X., Wang, Y. (2024). Dynamic Occlusion Expression Recognition Based on Improved GAN. In: Jin, H., Pan, Y., Lu, J. (eds) Artificial Intelligence and Machine Learning. IAIC 2023. Communications in Computer and Information Science, vol 2058. Springer, Singapore. https://doi.org/10.1007/978-981-97-1277-9_14
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-1276-2
Online ISBN: 978-981-97-1277-9