
HyperNeRF: a higher-dimensional representation for topologically varying neural radiance fields

Published: 10 December 2021

Abstract

Neural Radiance Fields (NeRF) are able to reconstruct scenes with unprecedented fidelity, and various recent works have extended NeRF to handle dynamic scenes. A common approach to reconstruct such non-rigid scenes is through the use of a learned deformation field mapping from coordinates in each input image into a canonical template coordinate space. However, these deformation-based approaches struggle to model changes in topology, as topological changes require a discontinuity in the deformation field, but these deformation fields are necessarily continuous. We address this limitation by lifting NeRFs into a higher dimensional space, and by representing the 5D radiance field corresponding to each individual input image as a slice through this "hyper-space". Our method is inspired by level set methods, which model the evolution of surfaces as slices through a higher dimensional surface. We evaluate our method on two tasks: (i) interpolating smoothly between "moments", i.e., configurations of the scene, seen in the input images while maintaining visual plausibility, and (ii) novel-view synthesis at fixed moments. We show that our method, which we dub HyperNeRF, outperforms existing methods on both tasks. Compared to Nerfies, HyperNeRF reduces average error rates by 4.1% for interpolation and 8.6% for novel-view synthesis, as measured by LPIPS. Additional videos, results, and visualizations are available at hypernerf.github.io.
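The slicing idea described above lends itself to a compact sketch. The JAX snippet below (JAX is the framework behind the JaxNeRF codebase the paper builds on [5, 9]) illustrates the hyper-space mechanism: a per-frame latent code and a small "slicing" MLP map each 3D sample point to ambient coordinates w, and a single template radiance field is then queried at the lifted point (x, w). This is a minimal illustration under stated assumptions, not the authors' implementation: all names and layer sizes are invented, and the deformation field and view-direction input are omitted for brevity.

```python
# Minimal sketch of HyperNeRF-style hyper-space slicing (illustrative only).
import jax
import jax.numpy as jnp

def posenc(x, num_freqs=6):
    """NeRF-style sinusoidal positional encoding of a coordinate vector."""
    freqs = 2.0 ** jnp.arange(num_freqs)
    angles = x[..., None, :] * freqs[:, None]          # (..., num_freqs, dim)
    enc = jnp.concatenate([jnp.sin(angles), jnp.cos(angles)], axis=-1)
    return enc.reshape(*x.shape[:-1], -1)              # flatten to one vector

def mlp(params, x):
    """Plain ReLU MLP; params is a list of (W, b) pairs."""
    for W, b in params[:-1]:
        x = jax.nn.relu(x @ W + b)
    W, b = params[-1]
    return x @ W + b

def init_mlp(key, sizes):
    """Random small-weight initialization for the layer sizes given."""
    params = []
    for d_in, d_out in zip(sizes[:-1], sizes[1:]):
        key, sub = jax.random.split(key)
        params.append((jax.random.normal(sub, (d_in, d_out)) * 0.1,
                       jnp.zeros(d_out)))
    return params

def hyper_field(slice_params, template_params, x, frame_code):
    """Query the template field at the lifted point (x, w).

    x          : (3,) spatial sample along a ray.
    frame_code : per-frame latent selecting which 'moment' to render.
    Returns (density, rgb); view direction is omitted for brevity.
    """
    # Slicing network: predicts ambient coordinates w for this point/frame,
    # i.e. where this frame's slice passes through hyper-space.
    w = mlp(slice_params, jnp.concatenate([posenc(x), frame_code]))
    # Template network: one radiance field over the lifted (x, w) space;
    # topological changes become smooth motion along the w axes.
    out = mlp(template_params, jnp.concatenate([posenc(x), w]))
    return jax.nn.softplus(out[0]), jax.nn.sigmoid(out[1:4])

# Hypothetical sizes: 2 ambient dimensions, 8-dim per-frame codes.
W_DIM, CODE_DIM, ENC_DIM = 2, 8, 3 * 2 * 6
k1, k2 = jax.random.split(jax.random.PRNGKey(0))
slice_params = init_mlp(k1, [ENC_DIM + CODE_DIM, 64, 64, W_DIM])
template_params = init_mlp(k2, [ENC_DIM + W_DIM, 128, 128, 4])

x = jnp.array([0.1, -0.2, 0.5])
code = jnp.zeros(CODE_DIM)                             # latent for frame 0
density, rgb = hyper_field(slice_params, template_params, x, code)
print(density, rgb)
```

In training, one such latent code per input image would be optimized jointly with the networks (as in GLO [3]); interpolating between two frames' codes then sweeps the slice smoothly through hyper-space, which is what makes moment interpolation across topology changes possible.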

Supplementary Material

ZIP File (a238-park.zip)
Supplemental files.
MP4 File (a238-park.mp4)

References

[1]
Kara-Ali Aliev, Artem Sevastopolsky, Maria Kolos, Dmitry Ulyanov, and Victor Lempitsky. 2019. Neural point-based graphics. arXiv:1906.08240 (2019).
[2]
Jonathan T. Barron, Ben Mildenhall, Matthew Tancik, Peter Hedman, Ricardo Martin-Brualla, and Pratul P. Srinivasan. 2021. Mip-NeRF: A Multiscale Representation for Anti-Aliasing Neural Radiance Fields. arXiv:2103.13415 (2021).
[3]
Piotr Bojanowski, Armand Joulin, David Lopez-Paz, and Arthur Szlam. 2018. Optimizing the Latent Space of Generative Networks. ICML (2018).
[4]
Aljaž Božič, Pablo Palafox, Michael Zollhöfer, Angela Dai, Justus Thies, and Matthias Nießner. 2020. Neural Non-Rigid Tracking. arXiv:2006.13240 (2020).
[5]
James Bradbury, Roy Frostig, Peter Hawkins, Matthew James Johnson, Chris Leary, Dougal Maclaurin, George Necula, Adam Paszke, Jake VanderPlas, Skye Wanderman-Milne, and Qiao Zhang. 2018. JAX: composable transformations of Python+NumPy programs. http://github.com/google/jax
[6]
Christoph Bregler, Aaron Hertzmann, and Henning Biermann. 2000. Recovering non-rigid 3D shape from image streams. CVPR (2000).
[7]
Zhiqin Chen and Hao Zhang. 2019. Learning implicit fields for generative shape modeling. In CVPR. 5939--5948.
[8]
Alvaro Collet, Ming Chuang, Pat Sweeney, Don Gillett, Dennis Evseev, David Calabrese, Hugues Hoppe, Adam Kirk, and Steve Sullivan. 2015. High-quality streamable free-viewpoint video. ACM TOG (2015).
[9]
Boyang Deng, Jonathan T. Barron, and Pratul P. Srinivasan. 2020. JaxNeRF: an efficient JAX implementation of NeRF. https://github.com/google-research/google-research/tree/master/jaxnerf
[10]
Mingsong Dou, Sameh Khamis, Yury Degtyarev, Philip Davidson, Sean Ryan Fanello, Adarsh Kowdle, Sergio Orts Escolano, Christoph Rhemann, David Kim, Jonathan Taylor, et al. 2016. Fusion4D: Real-time performance capture of challenging scenes. ACM TOG (2016).
[11]
Ohad Fried, Ayush Tewari, Michael Zollhöfer, Adam Finkelstein, Eli Shechtman, Dan B Goldman, Kyle Genova, Zeyu Jin, Christian Theobalt, and Maneesh Agrawala. 2019. Text-based editing of talking-head video. ACM TOG (2019).
[12]
Guy Gafni, Justus Thies, Michael Zollhöfer, and Matthias Nießner. 2021. Dynamic Neural Radiance Fields for Monocular 4D Facial Avatar Reconstruction. In CVPR. 8649--8658.
[13]
Ian Goodfellow, Jean Pouget-Abadie, Mehdi Mirza, Bing Xu, David Warde-Farley, Sherjil Ozair, Aaron Courville, and Yoshua Bengio. 2014. Generative adversarial nets. NeurIPS 27 (2014).
[14]
Amir Hertz, Or Perel, Raja Giryes, Olga Sorkine-Hornung, and Daniel Cohen-Or. 2021. Progressive Encoding for Neural Optimization. arXiv:2104.09125 (2021).
[15]
Phillip Isola, Jun-Yan Zhu, Tinghui Zhou, and Alexei A Efros. 2017. Image-to-image translation with conditional adversarial networks. In CVPR.
[16]
Arthur Jacot, Franck Gabriel, and Clément Hongler. 2018. Neural tangent kernel: Convergence and generalization in neural networks. In NeurIPS.
[17]
Chiyu Jiang, Jingwei Huang, Andrea Tagliasacchi, Leonidas Guibas, et al. 2020. ShapeFlow: Learnable Deformations Among 3D Shapes. arXiv:2006.07982 (2020).
[18]
Hyeongwoo Kim, Pablo Garrido, Ayush Tewari, Weipeng Xu, Justus Thies, Matthias Nießner, Patrick Pérez, Christian Richardt, Michael Zollhöfer, and Christian Theobalt. 2018. Deep Video Portraits. ACM TOG (2018).
[19]
Diederik P Kingma and Max Welling. 2013. Auto-Encoding Variational Bayes. arXiv:1312.6114 (2013).
[20]
Tianye Li, Mira Slavcheva, Michael Zollhoefer, Simon Green, Christoph Lassner, Changil Kim, Tanner Schmidt, Steven Lovegrove, Michael Goesele, and Zhaoyang Lv. 2021. Neural 3D Video Synthesis. arXiv:2103.02597 (2021).
[21]
Zhengqi Li, Simon Niklaus, Noah Snavely, and Oliver Wang. 2020. Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes. arXiv:2011.13084 (2020).
[22]
Stephen Lombardi, Tomas Simon, Jason Saragih, Gabriel Schwartz, Andreas Lehrmann, and Yaser Sheikh. 2019. Neural volumes: learning dynamic renderable volumes from images. ACM TOG (2019).
[23]
Ricardo Martin-Brualla, Rohit Pandey, Shuoran Yang, Pavel Pidlypenskyi, Jonathan Taylor, Julien Valentin, Sameh Khamis, Philip Davidson, Anastasia Tkach, Peter Lincoln, et al. 2018. LookinGood: Enhancing performance capture with real-time neural re-rendering. SIGGRAPH Asia (2018).
[24]
Ricardo Martin-Brualla, Noha Radwan, Mehdi S. M. Sajjadi, Jonathan T. Barron, Alexey Dosovitskiy, and Daniel Duckworth. 2021. NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections. CVPR (2021).
[25]
Abhimitra Meka, Christian Haene, Rohit Pandey, Michael Zollhoefer, Sean Fanello, Graham Fyffe, Adarsh Kowdle, Xueming Yu, Jay Busch, Jason Dourgarian, Peter Denny, Sofien Bouaziz, Peter Lincoln, Matt Whalen, Geoff Harvey, Jonathan Taylor, Shahram Izadi, Andrea Tagliasacchi, Paul Debevec, Christian Theobalt, Julien Valentin, and Christoph Rhemann. 2019. Deep Reflectance Fields - High-Quality Facial Reflectance Field Inference From Color Gradient Illumination. SIGGRAPH.
[26]
Lars Mescheder, Michael Oechsle, Michael Niemeyer, Sebastian Nowozin, and Andreas Geiger. 2019. Occupancy networks: Learning 3D reconstruction in function space. In CVPR.
[27]
Moustafa Meshry, Dan B Goldman, Sameh Khamis, Hugues Hoppe, Rohit Pandey, Noah Snavely, and Ricardo Martin-Brualla. 2019. Neural rerendering in the wild. In CVPR.
[28]
Ben Mildenhall, Pratul P. Srinivasan, Matthew Tancik, Jonathan T. Barron, Ravi Ramamoorthi, and Ren Ng. 2020. NeRF: Representing Scenes as Neural Radiance Fields for View Synthesis. ECCV (2020).
[29]
Richard A Newcombe, Dieter Fox, and Steven M Seitz. 2015. DynamicFusion: Reconstruction and tracking of non-rigid scenes in real-time. CVPR (2015).
[30]
Michael Niemeyer, Lars Mescheder, Michael Oechsle, and Andreas Geiger. 2019. Occupancy Flow: 4D Reconstruction by Learning Particle Dynamics. ICCV (2019).
[31]
Stanley Osher and James A Sethian. 1988. Fronts propagating with curvature-dependent speed: Algorithms based on Hamilton-Jacobi formulations. Journal of computational physics 79, 1 (1988), 12--49.
[32]
Jeong Joon Park, Peter Florence, Julian Straub, Richard Newcombe, and Steven Lovegrove. 2019. DeepSDF: Learning continuous signed distance functions for shape representation. In CVPR.
[33]
Keunhong Park, Utkarsh Sinha, Jonathan T. Barron, Sofien Bouaziz, Dan B Goldman, Steven M. Seitz, and Ricardo Martin-Brualla. 2020. Nerfies: Deformable Neural Radiance Fields. arXiv:2011.12948 (2020).
[34]
Albert Pumarola, Enric Corona, Gerard Pons-Moll, and Francesc Moreno-Noguer. 2020. D-NeRF: Neural Radiance Fields for Dynamic Scenes. arXiv:2011.13961 (2020).
[35]
Tanner Schmidt, Richard Newcombe, and Dieter Fox. 2015. DART: dense articulated real-time tracking with consumer depth cameras. Autonomous Robots (2015).
[36]
Johannes Lutz Schönberger and Jan-Michael Frahm. 2016. Structure-from-Motion Revisited. CVPR (2016).
[37]
Vincent Sitzmann, Justus Thies, Felix Heide, Matthias Nießner, Gordon Wetzstein, and Michael Zollhöfer. 2019a. DeepVoxels: Learning Persistent 3D Feature Embeddings. In CVPR.
[38]
Vincent Sitzmann, Michael Zollhöfer, and Gordon Wetzstein. 2019b. Scene Representation Networks: Continuous 3D-Structure-Aware Neural Scene Representations. In NeurIPS.
[39]
Tiancheng Sun, Jonathan T Barron, Yun-Ta Tsai, Zexiang Xu, Xueming Yu, Graham Fyffe, Christoph Rhemann, Jay Busch, Paul E Debevec, and Ravi Ramamoorthi. 2019. Single image portrait relighting. SIGGRAPH (2019).
[40]
Matthew Tancik, Pratul P. Srinivasan, Ben Mildenhall, Sara Fridovich-Keil, Nithin Raghavan, Utkarsh Singhal, Ravi Ramamoorthi, Jonathan T. Barron, and Ren Ng. 2020. Fourier Features Let Networks Learn High Frequency Functions in Low Dimensional Domains. NeurIPS (2020).
[41]
A. Tewari, O. Fried, J. Thies, V. Sitzmann, S. Lombardi, K. Sunkavalli, R. Martin-Brualla, T. Simon, J. Saragih, M. Nießner, R. Pandey, S. Fanello, G. Wetzstein, J.-Y. Zhu, C. Theobalt, M. Agrawala, E. Shechtman, D. B Goldman, and M. Zollhöfer. 2020. State of the Art on Neural Rendering. Computer Graphics Forum (2020).
[42]
Justus Thies, Michael Zollhöfer, and Matthias Nießner. 2019. Deferred neural rendering: Image synthesis using neural textures. ACM TOG (2019).
[43]
Justus Thies, Michael Zollhöfer, Marc Stamminger, Christian Theobalt, and Matthias Nießner. 2016. Face2Face: Real-time face capture and reenactment of RGB videos. CVPR (2016).
[44]
Lorenzo Torresani, Aaron Hertzmann, and Chris Bregler. 2008. Nonrigid structure-from-motion: Estimating shape and motion with hierarchical priors. TPAMI (2008).
[45]
Edgar Tretschk, Ayush Tewari, Vladislav Golyanik, Michael Zollhöfer, Christoph Lassner, and Christian Theobalt. 2021. Non-Rigid Neural Radiance Fields: Reconstruction and Novel View Synthesis of a Dynamic Scene From Monocular Video. arXiv:2012.12247 (2021).
[46]
Zhou Wang, Eero P Simoncelli, and Alan C Bovik. 2003. Multiscale structural similarity for image quality assessment. Asilomar Conference on Signals, Systems & Computers (2003).
[47]
Wenqi Xian, Jia-Bin Huang, Johannes Kopf, and Changil Kim. 2020. Space-time Neural Irradiance Fields for Free-Viewpoint Video. arXiv:2011.12950 (2020).
[48]
Lior Yariv, Yoni Kasten, Dror Moran, Meirav Galun, Matan Atzmon, Ronen Basri, and Yaron Lipman. 2020. Multiview neural surface reconstruction by disentangling geometry and appearance. arXiv:2003.09852 (2020).
[49]
Jae Shin Yoon, Kihwan Kim, Orazio Gallo, Hyun Soo Park, and Jan Kautz. 2020. Novel view synthesis of dynamic scenes with globally coherent depths from a monocular camera. In CVPR.
[50]
Richard Zhang, Phillip Isola, Alexei A Efros, Eli Shechtman, and Oliver Wang. 2018. The unreasonable effectiveness of deep features as a perceptual metric. In CVPR.


Published In

ACM Transactions on Graphics, Volume 40, Issue 6
December 2021, 1351 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3478513

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery, New York, NY, United States

Publication History

Published: 10 December 2021
Published in TOG Volume 40, Issue 6


Author Tags

1. 3D synthesis
2. dynamic scenes
3. neural radiance fields
4. neural rendering
5. novel view synthesis

Qualifiers

• Research-article

Article Metrics

• Downloads (last 12 months): 246
• Downloads (last 6 weeks): 71

Reflects downloads up to 06 Oct 2024.

Cited By

• (2024) UDR-GS: Enhancing Underwater Dynamic Scene Reconstruction with Depth Regularization. Symmetry 16:8 (1010). https://doi.org/10.3390/sym16081010. Published online 8-Aug-2024.
• (2024) FaceFolds: Meshed Radiance Manifolds for Efficient Volumetric Rendering of Dynamic Faces. Proceedings of the ACM on Computer Graphics and Interactive Techniques 7:1 (1-17). https://doi.org/10.1145/3651304. Published online 13-May-2024.
• (2024) HQ3DAvatar: High-quality Implicit 3D Head Avatar. ACM Transactions on Graphics 43:3 (1-24). https://doi.org/10.1145/3649889. Published online 9-Apr-2024.
• (2024) Physics-Informed Learning of Characteristic Trajectories for Smoke Reconstruction. ACM SIGGRAPH 2024 Conference Papers (1-11). https://doi.org/10.1145/3641519.3657483. Published online 13-Jul-2024.
• (2024) 4D-Rotor Gaussian Splatting: Towards Efficient Novel View Synthesis for Dynamic Scenes. ACM SIGGRAPH 2024 Conference Papers (1-11). https://doi.org/10.1145/3641519.3657463. Published online 13-Jul-2024.
• (2024) Promptable Game Models: Text-guided Game Simulation via Masked Diffusion Models. ACM Transactions on Graphics 43:2 (1-16). https://doi.org/10.1145/3635705. Published online 3-Jan-2024.
• (2024) VolTeMorph: Real-time, Controllable and Generalizable Animation of Volumetric Representations. Computer Graphics Forum 43:6. https://doi.org/10.1111/cgf.15117. Published online 29-May-2024.
• (2024) Recent Trends in 3D Reconstruction of General Non-Rigid Scenes. Computer Graphics Forum 43:2. https://doi.org/10.1111/cgf.15062. Published online 30-Apr-2024.
• (2024) ShellNeRF: Learning a Controllable High-resolution Model of the Eye and Periocular Region. Computer Graphics Forum 43:2. https://doi.org/10.1111/cgf.15041. Published online 24-Apr-2024.
• (2024) ParticleNeRF: A Particle-Based Encoding for Online Neural Radiance Fields. 2024 IEEE/CVF Winter Conference on Applications of Computer Vision (WACV) (5963-5972). https://doi.org/10.1109/WACV57701.2024.00587. Published online 3-Jan-2024.
