Make-Your-Anchor: A Diffusion-based 2D Avatar Generation Framework

Huang, Ziyao; Tang, Fan; Zhang, Yong; Cun, Xiaodong; Cao, Juan; Li, Jintao; Lee, Tong-Yee

arXiv:2403.16510 (cs)

[Submitted on 25 Mar 2024]

Abstract: Despite the remarkable progress of talking-head-based avatar-creation solutions, directly generating anchor-style videos with full-body motions remains challenging. In this study, we propose Make-Your-Anchor, a novel system that requires only a one-minute video clip of an individual for training and then enables the automatic generation of anchor-style videos with precise torso and hand movements. Specifically, we fine-tune a proposed structure-guided diffusion model on the input video to render 3D mesh conditions into human appearances. We adopt a two-stage training strategy for the diffusion model, effectively binding movements with specific appearances. To produce videos of arbitrary length, we extend the 2D U-Net in the frame-wise diffusion model to a 3D style without additional training cost, and propose a simple yet effective batch-overlapped temporal denoising module to bypass the constraints on video length during inference. Finally, a novel identity-specific face enhancement module is introduced to improve the visual quality of facial regions in the output videos. Comparative experiments demonstrate the effectiveness and superiority of the system in terms of visual quality, temporal coherence, and identity preservation, outperforming state-of-the-art diffusion and non-diffusion methods. Project page: \url{this https URL}.
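
The abstract names a batch-overlapped temporal denoising module but does not spell out its mechanics. The sketch below shows one plausible reading, not the authors' implementation: a long sequence of frame latents is split into fixed-size windows that overlap, each window is denoised independently, and predictions are averaged wherever windows overlap. The function names, the `window` and `overlap` parameters, and the plain averaging rule are all assumptions made for illustration.

```python
import numpy as np

def overlapped_windows(num_frames, window, overlap):
    """Return (start, end) index pairs covering all frames with overlapping windows.

    Hypothetical helper: the windowing scheme is an assumption, not taken from the paper.
    """
    if not 0 <= overlap < window:
        raise ValueError("overlap must satisfy 0 <= overlap < window")
    if num_frames <= window:
        return [(0, num_frames)]
    stride = window - overlap
    starts = list(range(0, num_frames - window + 1, stride))
    # Add a final window flush with the end if the strided windows fall short.
    if starts[-1] + window < num_frames:
        starts.append(num_frames - window)
    return [(s, s + window) for s in starts]

def denoise_step_overlapped(latents, denoise_fn, window=8, overlap=2):
    """Run one denoising step per window and average predictions in overlaps.

    latents: array of shape (num_frames, ...).
    denoise_fn: stand-in for a video denoiser that maps a window of frames
    to an array of the same shape.
    """
    total = np.zeros_like(latents, dtype=np.float64)
    # Per-frame counter, shaped to broadcast over the remaining dimensions.
    count = np.zeros((latents.shape[0],) + (1,) * (latents.ndim - 1))
    for start, end in overlapped_windows(latents.shape[0], window, overlap):
        total[start:end] += denoise_fn(latents[start:end])
        count[start:end] += 1
    return total / count
```

Because every frame is covered by at least one window, the per-frame count is never zero, and the number of frames is not tied to the window size the denoiser accepts.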

Comments: Accepted at CVPR 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2403.16510 [cs.CV]
(or arXiv:2403.16510v1 [cs.CV] for this version)
https://doi.org/10.48550/arXiv.2403.16510

Submission history

From: Ziyao Huang
[v1] Mon, 25 Mar 2024 07:54:18 UTC (48,153 KB)
