MagicAnimate: Temporally Consistent Human Image Animation using Diffusion Model

Xu, Zhongcong; Zhang, Jianfeng; Liew, Jun Hao; Yan, Hanshu; Liu, Jia-Wei; Zhang, Chenxu; Feng, Jiashi; Shou, Mike Zheng


arXiv:2311.16498 (cs)

[Submitted on 27 Nov 2023]


Abstract: This paper studies the human image animation task, which aims to generate a video of a given reference identity following a particular motion sequence. Existing animation works typically employ a frame-warping technique to animate the reference image towards the target motion. Despite achieving reasonable results, these approaches struggle to maintain temporal consistency throughout the animation because they lack temporal modeling and preserve the reference identity poorly. In this work, we introduce MagicAnimate, a diffusion-based framework that aims to enhance temporal consistency, preserve the reference image faithfully, and improve animation fidelity. To achieve this, we first develop a video diffusion model to encode temporal information. Second, to maintain appearance coherence across frames, we introduce a novel appearance encoder to retain the intricate details of the reference image. Building on these two innovations, we further employ a simple video fusion technique to encourage smooth transitions in long video animations. Empirical results demonstrate the superiority of our method over baseline approaches on two benchmarks. Notably, our approach outperforms the strongest baseline by over 38% in terms of video fidelity on the challenging TikTok dancing dataset. Code and model will be made available.
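The abstract does not spell out how the video fusion step works. One common way to get smooth transitions in a long video is to process it in overlapping segments and average the frames where segments overlap. The sketch below illustrates that idea only, under stated assumptions: the function names `segment_starts` and `fuse_segments`, the window length, and the stride are all hypothetical and are not taken from the paper or its released code.

```python
import numpy as np


def segment_starts(total_frames, seg_len, stride):
    """Start indices of overlapping windows that together cover every frame.

    Hypothetical helper: window length and stride are illustrative choices.
    """
    if total_frames <= seg_len:
        return [0]
    starts = list(range(0, total_frames - seg_len + 1, stride))
    # Add a final window flush with the end if the regular stride misses it.
    if starts[-1] + seg_len < total_frames:
        starts.append(total_frames - seg_len)
    return starts


def fuse_segments(segments, starts, total_frames):
    """Average per-segment predictions wherever segments overlap.

    segments: list of arrays shaped (frames_in_segment, ...frame dims...)
    starts:   first frame index of each segment in the full video
    """
    frame_shape = segments[0].shape[1:]
    acc = np.zeros((total_frames, *frame_shape), dtype=np.float64)
    count = np.zeros(total_frames, dtype=np.float64)
    for seg, start in zip(segments, starts):
        n = seg.shape[0]
        acc[start:start + n] += seg
        count[start:start + n] += 1
    # Reshape the per-frame counts so they broadcast over the frame dims.
    return acc / count.reshape(-1, *([1] * len(frame_shape)))


if __name__ == "__main__":
    starts = segment_starts(total_frames=6, seg_len=4, stride=2)
    # Stand-in "predictions": each segment is filled with a constant value.
    segments = [np.full((4, 2, 2), float(2 * i)) for i in range(len(starts))]
    fused = fuse_segments(segments, starts, total_frames=6)
    print(starts, fused[:, 0, 0])
```

In a diffusion setting, averaging of this kind would typically be applied to the per-step predictions for each segment rather than to finished frames; whether MagicAnimate does so is not stated in the abstract.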

Comments:	Project Page at this https URL
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR)
Cite as:	arXiv:2311.16498 [cs.CV]
	(or arXiv:2311.16498v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2311.16498

Submission history

From: Zhongcong Xu
[v1] Mon, 27 Nov 2023 18:32:31 UTC (10,397 KB)
