CCEdit: Creative and Controllable Video Editing via Diffusion Models

Feng, Ruoyu; Weng, Wenming; Wang, Yanhui; Yuan, Yuhui; Bao, Jianmin; Luo, Chong; Chen, Zhibo; Guo, Baining

arXiv:2309.16496 (cs)

[Submitted on 28 Sep 2023 (v1), last revised 7 Apr 2024 (this version, v3)]

Abstract: In this paper, we present CCEdit, a versatile generative video editing framework based on diffusion models. Our approach employs a novel trident network structure that separates structure and appearance control, ensuring precise and creative editing capabilities. Utilizing the foundational ControlNet architecture, we maintain the structural integrity of the video during editing. The incorporation of an additional appearance branch enables users to exert fine-grained control over the edited key frame. These two side branches seamlessly integrate into the main branch, which is constructed upon existing text-to-image (T2I) generation models, through learnable temporal layers. The versatility of our framework is demonstrated through a diverse range of choices in both structure representations and personalized T2I models, as well as the option to provide the edited key frame. To facilitate comprehensive evaluation, we introduce the BalanceCC benchmark dataset, comprising 100 videos and 4 target prompts for each video. Our extensive user studies compare CCEdit with eight state-of-the-art video editing methods. The outcomes demonstrate CCEdit's substantial superiority over all other methods.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2309.16496 [cs.CV]
	(or arXiv:2309.16496v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.16496

Submission history

From: Ruoyu Feng
[v1] Thu, 28 Sep 2023 15:03:44 UTC (4,400 KB)
[v2] Fri, 1 Dec 2023 03:28:21 UTC (15,292 KB)
[v3] Sun, 7 Apr 2024 02:39:31 UTC (15,150 KB)
