Iterative Multi-granular Image Editing using Diffusion Models

Joseph, K J; Udhayanan, Prateksha; Shukla, Tripti; Agarwal, Aishwarya; Karanam, Srikrishna; Goswami, Koustava; Srinivasan, Balaji Vasan

arXiv:2309.00613v2 (cs)

[Submitted on 1 Sep 2023 (v1), last revised 28 Oct 2023 (this version, v2)]

Abstract: Recent advances in text-guided image synthesis have dramatically changed how creative professionals generate artistic and aesthetically pleasing visual assets. To fully support such creative endeavors, the process should be able to: 1) iteratively edit the generations and 2) control the spatial reach of desired changes (global, local, or anything in between). We formalize this pragmatic problem setting as Iterative Multi-granular Editing. While there has been substantial progress with diffusion-based models for image synthesis and editing, they are all one-shot (i.e., no iterative editing capabilities) and do not naturally yield multi-granular control (i.e., covering the full spectrum of local-to-global edits). To overcome these drawbacks, we propose EMILIE: Iterative Multi-granular Image Editor. EMILIE introduces a novel latent iteration strategy, which repurposes a pre-trained diffusion model to facilitate iterative editing. This is complemented by a gradient control operation for multi-granular control. We introduce a new benchmark dataset to evaluate our newly proposed setting. We conduct exhaustive quantitative and qualitative evaluation against recent state-of-the-art approaches adapted to our task, to bring out the mettle of EMILIE. We hope our work will attract attention to this newly identified, pragmatic problem setting.

Comments:	Accepted to IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) 2024
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI); Machine Learning (cs.LG)
Cite as:	arXiv:2309.00613 [cs.CV]
	(or arXiv:2309.00613v2 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2309.00613

Submission history

From: Joseph K J
[v1] Fri, 1 Sep 2023 17:59:29 UTC (26,644 KB)
[v2] Sat, 28 Oct 2023 11:16:53 UTC (26,644 KB)
