TransRef: Multi-Scale Reference Embedding Transformer for Reference-Guided Image Inpainting

Liu, Taorong; Liao, Liang; Chen, Delin; Xiao, Jing; Wang, Zheng; Lin, Chia-Wen; Satoh, Shin'ichi

arXiv:2306.11528 (cs)

[Submitted on 20 Jun 2023 (v1), last revised 3 Oct 2024 (this version, v3)]

Abstract:Completing corrupted images that contain complicated semantic content and diverse hole patterns is challenging even for state-of-the-art learning-based inpainting methods trained on large-scale data. A reference image capturing the same scene as a corrupted image offers informative guidance for completing it, because the reference shares texture and structure priors similar to those of the missing regions. In this work, we propose a transformer-based encoder-decoder network, named TransRef, for reference-guided image inpainting. Specifically, the guidance is conducted progressively through a reference embedding procedure, in which the reference features are aligned and then fused with the features of the corrupted image. For precise utilization of the reference features for guidance, a reference-patch alignment (Ref-PA) module is proposed to align the patch features of the reference and corrupted images and harmonize their style differences, while a reference-patch transformer (Ref-PT) module is proposed to refine the embedded reference feature. Moreover, to facilitate research on reference-guided image restoration tasks, we construct a publicly accessible benchmark dataset containing 50K pairs of input and reference images. Both quantitative and qualitative evaluations demonstrate the efficacy of the reference information and of the proposed method over state-of-the-art methods in completing complex holes. Code and dataset can be accessed at this https URL.
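
The abstract does not say how the alignment, style harmonization, and fusion steps are implemented. As a rough illustration only of what "align, harmonize, fuse" could look like on patch features, the toy sketch below uses scaled dot-product attention for alignment, channel-wise mean/std matching for harmonization, and a mask-based replacement for fusion. All three are stand-ins assumed for illustration, not TransRef's actual Ref-PA or Ref-PT modules, and every function name here is hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def align_reference_patches(corrupted, reference):
    """Assumed alignment step: for each corrupted-image patch feature,
    return an attention-weighted mix of the reference patch features."""
    scale = 1.0 / math.sqrt(len(corrupted[0]))
    aligned = []
    for query in corrupted:
        scores = [scale * sum(q * k for q, k in zip(query, key))
                  for key in reference]
        weights = softmax(scores)
        aligned.append([sum(w * ref[d] for w, ref in zip(weights, reference))
                        for d in range(len(query))])
    return aligned

def harmonize_style(aligned, corrupted, eps=1e-5):
    """Assumed harmonization step: shift and scale each channel of the
    aligned features to match the corrupted features' mean and std."""
    channels = len(aligned[0])
    out = [[0.0] * channels for _ in aligned]
    for d in range(channels):
        src = [p[d] for p in aligned]
        tgt = [p[d] for p in corrupted]
        src_mean = sum(src) / len(src)
        tgt_mean = sum(tgt) / len(tgt)
        src_std = math.sqrt(sum((v - src_mean) ** 2 for v in src) / len(src) + eps)
        tgt_std = math.sqrt(sum((v - tgt_mean) ** 2 for v in tgt) / len(tgt) + eps)
        for i, v in enumerate(src):
            out[i][d] = (v - src_mean) / src_std * tgt_std + tgt_mean
    return out

def fuse(corrupted, guidance, hole_mask):
    """Assumed fusion step: keep known patches, fill hole patches
    (mask value 1) with the guidance features."""
    return [g if m else c for c, g, m in zip(corrupted, guidance, hole_mask)]

# Toy data: three 2-channel patch features; the last patch is a hole.
corrupted = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
reference = [[2.0, 0.0], [0.0, 2.0]]
hole_mask = [0, 0, 1]

aligned = align_reference_patches(corrupted, reference)
guidance = harmonize_style(aligned, corrupted)
fused = fuse(corrupted, guidance, hole_mask)
```

In this toy setup only the hole patch takes its value from the reference; a learned model would replace each hand-written step with trained layers applied at several feature scales.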

Comments:	Under review
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2306.11528 [cs.CV]
	(or arXiv:2306.11528v3 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.2306.11528

Submission history

From: Taorong Liu
[v1] Tue, 20 Jun 2023 13:31:33 UTC (33,228 KB)
[v2] Wed, 21 Jun 2023 01:51:59 UTC (33,228 KB)
[v3] Thu, 3 Oct 2024 14:02:10 UTC (18,032 KB)
