LD-BFR: Vector-Quantization-Based Face Restoration Model with Latent Diffusion Enhancement

Published: 28 October 2024

Abstract

Blind Face Restoration (BFR) aims to restore high-quality face images from low-quality images with unknown degradation. Previous GAN-based and ViT-based methods have shown promising results but lose identity details when degradation is severe, while recent diffusion-based methods operate at the image level and are slow at inference. To restore images with any type of degradation at high quality while taking less time than classic diffusion-based methods, we propose LD-BFR, a novel BFR framework that integrates the strengths of both vector quantization and latent diffusion. First, we employ a Dual Cross-Attention vector quantization to restore the degraded image in a global manner. We then use the restored high-quality quantized feature as guidance for our latent diffusion model to generate high-quality restored images with rich details. The proposed high-quality feature injection module injects this feature as a condition to guide the generation of the latent diffusion model. Extensive experiments demonstrate the superior performance of our model over SOTA BFR methods. The code is available at: https://github.com/YuzhenD/LD-BFR.git
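
The vector-quantization step mentioned in the abstract can be illustrated with a generic nearest-neighbour codebook lookup. The sketch below is not the authors' Dual Cross-Attention design or their released code. The function names (`squared_distance`, `quantize`), the toy codebook, and the toy features are all assumptions chosen for illustration. It only shows the basic idea of snapping a degraded feature vector to its closest entry in a dictionary of clean features.

```python
# Illustrative sketch only: nearest-neighbour codebook lookup, the core step
# of generic vector quantization. The codebook and features are toy values
# (assumptions), not the LD-BFR model or its learned dictionary.
from typing import List, Sequence, Tuple


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(
    features: Sequence[Sequence[float]],
    codebook: Sequence[Sequence[float]],
) -> Tuple[List[int], List[List[float]]]:
    """Replace each feature vector with its nearest codebook entry.

    Returns the chosen code indices and the quantized vectors.
    """
    indices: List[int] = []
    quantized: List[List[float]] = []
    for feature in features:
        # Pick the index of the codebook entry closest to this feature.
        best = min(
            range(len(codebook)),
            key=lambda k: squared_distance(feature, codebook[k]),
        )
        indices.append(best)
        quantized.append(list(codebook[best]))
    return indices, quantized


# Toy "high-quality" codebook with three 2-D entries (hypothetical values).
codebook = [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
# Toy "degraded" features: noisy versions of the codebook entries.
degraded = [[0.2, -0.1], [0.9, 1.3], [3.5, 0.4]]

indices, restored = quantize(degraded, codebook)
print(indices)   # → [0, 1, 2]
print(restored)  # → [[0.0, 0.0], [1.0, 1.0], [4.0, 0.0]]
```

In a learned model the codebook would be trained on high-quality face features and the lookup would run on encoder outputs. Here every noisy input simply maps back to the clean entry it was derived from.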

    Published In

    MM '24: Proceedings of the 32nd ACM International Conference on Multimedia
    October 2024
    11719 pages
    ISBN:9798400706868
    DOI:10.1145/3664647

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. blind face restoration
    2. diffusion
    3. vector-quantization

    Qualifiers

    • Research-article

    Funding Sources

    • National Natural Science Foundation of China
    • the Fundamental Research Funds for the Central Universities
    • Beijing Natural Science Foundation
    • Young Elite Scientists Sponsorship Program by CAST
    • Shanghai Science and Technology Commission
    • Shanghai Municipal Science and Technology Major Project
    • Shanghai Sailing Program

    Conference

    MM '24
    MM '24: The 32nd ACM International Conference on Multimedia
    October 28 - November 1, 2024
    Melbourne VIC, Australia

    Acceptance Rates

    MM '24 Paper Acceptance Rate 1,150 of 4,385 submissions, 26%;
    Overall Acceptance Rate 2,145 of 8,556 submissions, 25%

    Article Metrics

    • Total Citations: 0
    • Total Downloads: 57
    • Downloads (Last 12 months): 57
    • Downloads (Last 6 weeks): 10
    Reflects downloads up to 08 Feb 2025
