Research article
Open access

ProteusNeRF: Fast Lightweight NeRF Editing using 3D-Aware Image Context

Published: 13 May 2024

Abstract

Neural Radiance Fields (NeRFs) have recently emerged as a popular option for photo-realistic object capture due to their ability to faithfully capture high-fidelity volumetric content even from handheld video input. Although much research has been devoted to efficient optimization leading to real-time training and rendering, options for interactively editing NeRFs remain limited. We present a very simple but effective neural network architecture that is fast and efficient while maintaining a low memory footprint. This architecture can be incrementally guided through user-friendly image-based edits. Our representation allows straightforward object selection via semantic feature distillation at the training stage. More importantly, we propose a local 3D-aware image context to facilitate view-consistent image editing that can then be distilled into fine-tuned NeRFs via geometric and appearance adjustments. We evaluate our setup on a variety of examples to demonstrate appearance and geometric edits and report a 10-30x speedup over concurrent work focusing on text-guided NeRF editing. Video results and code can be found on our project webpage at https://proteusnerf.github.io.
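The abstract describes distilling view-consistent 2D image edits back into a fine-tuned radiance field. Purely as an illustration of that general idea, and not of the authors' actual implementation (see the paper and project page for that), the short PyTorch sketch below shows a generic "render, edit in image space, fit the field to the edited renderings" loop. TinyField, edit_colors, and all hyperparameters are hypothetical stand-ins invented for this sketch.

import torch
import torch.nn as nn

class TinyField(nn.Module):
    # Hypothetical stand-in for a lightweight radiance-field backbone:
    # maps 3D points to RGB. A real NeRF would also predict density and
    # alpha-composite samples along camera rays.
    def __init__(self, hidden: int = 64):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3, hidden), nn.ReLU(),
            nn.Linear(hidden, 3), nn.Sigmoid(),
        )

    def forward(self, points: torch.Tensor) -> torch.Tensor:
        return self.mlp(points)

def edit_colors(rgb: torch.Tensor) -> torch.Tensor:
    # Stand-in for a user-guided 2D edit (here just a recolor); in practice
    # this would be an image-space editing model applied to rendered views.
    out = rgb.clone()
    out[..., 0] = (out[..., 0] * 1.3).clamp(0.0, 1.0)
    return out

def distill_edits(field: TinyField, point_batches, steps: int = 200, lr: float = 1e-3):
    # "Render" each batch with the current field, edit the result in color
    # space, then fine-tune the field to reproduce the edited renderings.
    targets = [edit_colors(field(p).detach()) for p in point_batches]
    opt = torch.optim.Adam(field.parameters(), lr=lr)
    for _ in range(steps):
        loss = sum(((field(p) - t) ** 2).mean() for p, t in zip(point_batches, targets))
        opt.zero_grad()
        loss.backward()
        opt.step()
    return field

# Toy usage: four random point batches stand in for rays from four views.
field = TinyField()
batches = [torch.rand(1024, 3) for _ in range(4)]
distill_edits(field, batches)

The actual system additionally distills semantic features for object selection and uses a shared 3D-aware image context to keep the 2D edits consistent across views; neither step is modeled in this toy loop.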

      Published In

      Proceedings of the ACM on Computer Graphics and Interactive Techniques, Volume 7, Issue 1 (May 2024), 399 pages
      EISSN: 2577-6193
      DOI: 10.1145/3665094
      This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 13 May 2024
      Published in PACMCGIT Volume 7, Issue 1

      Author Tags

      1. Generative AI
      2. Interactive 3D Editing
      3. Neural Editing
      4. Neural Radiance Field
      5. ProteusNeRF
      6. Stable Diffusion Model

      Qualifiers

      • Research-article
      • Research
      • Refereed
