Quality-aware face alignment using high-resolution spatial dependencies

Ma, Jinyan; Li, Xuefei; Li, Jing; Wan, Jun; Liu, Tong; Li, Guohao

doi:10.1007/s11042-023-17295-5

Published: 16 October 2023

Volume 83, pages 42165–42187, (2024)

Multimedia Tools and Applications

Jinyan Ma¹, Xuefei Li¹, Jing Li¹, Jun Wan², Tong Liu¹ & Guohao Li¹

Abstract

Although CNN-based face alignment algorithms have achieved promising results, their accuracy still suffers on faces with severe occlusions and large poses, mainly because of (1) their inability to model long-range dependencies and construct effective face shape constraints, and (2) the limited size of labeled facial datasets. To address these problems, this study proposes a transformer-based, data-distillation, semi-supervised face alignment algorithm. The transformer-based heatmap detection network uses a transformer to model face shape constraints more effectively, thus improving robustness under partial occlusion. Moreover, a quality-aware pseudolabeled sample distillation network is designed to help the transformer acquire the inductive biases inherent to CNNs by evaluating the quality of the pseudolabeled data generated by the transformer-based heatmap detection network. This study also proposes an intensive training strategy that uses more unlabeled data, without manual intervention, to further improve the performance of the transformer-based heatmap detection network. Experimental results on the 300W, AFLW, and 300VW datasets demonstrate the superiority of our method over state-of-the-art face alignment methods.
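The idea of keeping only high-quality pseudolabels can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not describe how the quality-aware network scores a sample, so the mean heatmap peak value below is a hypothetical stand-in for that learned score, and the function names, the toy heatmaps, and the 0.5 threshold are all assumptions made for illustration.

```python
from typing import Dict, List, Tuple

Heatmap = List[List[float]]  # one H x W grid of scores per landmark


def decode_landmark(heatmap: Heatmap) -> Tuple[int, int, float]:
    """Return (x, y, peak) for the highest-scoring cell of one heatmap."""
    best_x, best_y, best = 0, 0, float("-inf")
    for y, row in enumerate(heatmap):
        for x, value in enumerate(row):
            if value > best:
                best_x, best_y, best = x, y, value
    return best_x, best_y, best


def pseudo_label(heatmaps: List[Heatmap]) -> Tuple[List[Tuple[int, int]], float]:
    """Decode every landmark and score the sample by its mean peak value.

    ASSUMPTION: the mean peak is only a stand-in for a learned quality score.
    """
    decoded = [decode_landmark(h) for h in heatmaps]
    points = [(x, y) for x, y, _ in decoded]
    quality = sum(peak for _, _, peak in decoded) / len(decoded)
    return points, quality


def select_pseudo_labels(
    predictions: Dict[str, List[Heatmap]], threshold: float = 0.5
) -> Dict[str, List[Tuple[int, int]]]:
    """Keep only unlabeled samples whose quality reaches the threshold."""
    kept = {}
    for name, heatmaps in predictions.items():
        points, quality = pseudo_label(heatmaps)
        if quality >= threshold:
            kept[name] = points
    return kept


# Two toy unlabeled samples, each with two 3x3 landmark heatmaps.
sharp = [
    [[0.0, 0.1, 0.0], [0.1, 0.9, 0.1], [0.0, 0.1, 0.0]],
    [[0.8, 0.1, 0.0], [0.1, 0.0, 0.0], [0.0, 0.0, 0.0]],
]
blurry = [
    [[0.2, 0.2, 0.2], [0.2, 0.3, 0.2], [0.2, 0.2, 0.2]],
    [[0.1, 0.1, 0.1], [0.1, 0.1, 0.2], [0.1, 0.1, 0.1]],
]
selected = select_pseudo_labels({"sharp": sharp, "blurry": blurry})
print(selected)  # only the sample with confident peaks is kept
```

In this toy run the sample with sharp, confident peaks passes the threshold and the diffuse one is discarded. In a semi-supervised loop of the kind the abstract outlines, the retained pseudolabels would be added to the training set for the next round.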

Data Availability Statement

The 300W dataset is openly available from the Intelligent Behaviour Understanding Group (Accession 300W at https://ibug.doc.ic.ac.uk/resources/300-W/). The AFLW dataset is openly available from Graz University of Technology (Accession AFLW at https://www.tugraz.at/institute/icg/research/team-bischof/lrs/downloads/aflw/). The 300VW dataset is openly available from the Intelligent Behaviour Understanding Group (Accession 300VW at https://ibug.doc.ic.ac.uk/resources/300-VW/).

Acknowledgements

This research benefited from grants from the National Science Foundation of China (Grant No. 62002233) and the National Natural Science Foundation of China (Grant No. 62372335).

Author information

Authors and Affiliations

School of Computer Science, Wuhan University, Bayi Road, Wuhan, 430072, Hubei, China
Jinyan Ma, Xuefei Li, Jing Li, Tong Liu & Guohao Li
School of Information and Safety Engineering, Zhongnan University of Economics and Law, South Lake street, Wuhan, 430073, Hubei, China
Jun Wan

Corresponding authors

Correspondence to Xuefei Li or Jing Li.

Ethics declarations

Competing Interests

This work was supported by National Science Foundation of China (Grant No. 62002233) and National Natural Science Foundation of China (Grant No. 62372335).

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.

About this article

Cite this article

Ma, J., Li, X., Li, J. et al. Quality-aware face alignment using high-resolution spatial dependencies. Multimed Tools Appl 83, 42165–42187 (2024). https://doi.org/10.1007/s11042-023-17295-5

Received: 04 October 2022
Revised: 04 September 2023
Accepted: 22 September 2023
Published: 16 October 2023
Issue Date: April 2024
DOI: https://doi.org/10.1007/s11042-023-17295-5
