
FSGS: Real-Time Few-Shot View Synthesis Using Gaussian Splatting

  • Conference paper
  • Computer Vision – ECCV 2024 (ECCV 2024)

Abstract

Novel view synthesis from limited observations remains a crucial and ongoing challenge. In NeRF-based few-shot view synthesis, there is often a trade-off between the accuracy of the synthesized views and the efficiency of the 3D representation. To tackle this dilemma, we introduce FSGS, a few-shot view synthesis framework based on 3D Gaussian Splatting that enables real-time, photo-realistic synthesis from a minimal number of training views. FSGS employs Proximity-guided Gaussian Unpooling, designed specifically for sparse-view settings, to bridge the gap left by sparse initial point sets: new Gaussians are strategically placed between existing ones, guided by a Gaussian proximity score, enhancing adaptive density control. We observe that Gaussian optimization can produce overly smooth textures and is prone to overfitting when training views are limited. To mitigate these issues, FSGS synthesizes virtual views to replicate the parallax effect experienced during training, coupled with geometric regularization applied across both actual training and synthesized viewpoints. This strategy ensures that new Gaussians are placed in the most representative locations, fostering more accurate and detailed scene reconstruction. Our comprehensive evaluation across various datasets, including NeRF-Synthetic, LLFF, Shiny, and Mip-NeRF360, shows that FSGS not only delivers exceptional rendering quality but also achieves inference more than 2000 times faster than existing state-of-the-art methods for sparse-view synthesis. Project webpage: https://zehaozhu.github.io/FSGS/.
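The two mechanisms named in the abstract are concrete enough to sketch. Below is a minimal NumPy sketch of the Proximity-guided Gaussian Unpooling idea: score each Gaussian by its distance to its nearest neighbors and, where the point set is locally sparse, insert new centers between existing ones. The neighbor count `k`, the threshold `grow_threshold`, and the midpoint placement rule are illustrative assumptions, not the paper's implementation (which also assigns attributes such as opacity and scale to the new Gaussians inside the adaptive density control loop).

```python
import numpy as np

def gaussian_unpool(centers: np.ndarray, k: int = 3, grow_threshold: float = 0.2) -> np.ndarray:
    """Sketch of proximity-guided unpooling: for each center, compute a
    proximity score (mean distance to its k nearest neighbors); where the
    score exceeds a threshold, insert new centers at the midpoints to
    those neighbors."""
    # Pairwise distances (N x N); adequate for a small illustrative N.
    dists = np.linalg.norm(centers[:, None, :] - centers[None, :, :], axis=-1)
    np.fill_diagonal(dists, np.inf)  # exclude self-distance

    new_points = []
    for i in range(len(centers)):
        nn_idx = np.argsort(dists[i])[:k]      # indices of k nearest neighbors
        proximity = dists[i, nn_idx].mean()    # proximity score for this Gaussian
        if proximity > grow_threshold:         # locally under-covered region
            # Place new Gaussians between this center and each neighbor.
            new_points.append(0.5 * (centers[i] + centers[nn_idx]))

    if not new_points:
        return centers
    return np.concatenate([centers] + new_points, axis=0)

# Example: densify a sparse SfM-style point set.
rng = np.random.default_rng(0)
sparse = rng.uniform(size=(50, 3))
dense = gaussian_unpool(sparse)
print(sparse.shape, "->", dense.shape)
```

The geometric regularization applied across real and synthesized viewpoints can be illustrated similarly. One common scale-invariant form, assumed here purely for illustration rather than taken from the paper, penalizes disagreement between a rendered depth map and a monocular depth prior via one minus their Pearson correlation, so that only relative depth structure is constrained:

```python
import numpy as np

def pearson_depth_penalty(rendered_depth: np.ndarray, prior_depth: np.ndarray) -> float:
    """Scale-invariant depth regularizer: 1 - Pearson correlation between a
    rendered depth map and a monocular depth prior; insensitive to the
    prior's unknown scale and shift."""
    r = rendered_depth.ravel().astype(np.float64)
    p = prior_depth.ravel().astype(np.float64)
    r = (r - r.mean()) / (r.std() + 1e-8)  # standardize rendered depth
    p = (p - p.mean()) / (p.std() + 1e-8)  # standardize prior depth
    return float(1.0 - (r * p).mean())     # 0 when perfectly correlated
```

In a full pipeline such a penalty would be evaluated on both the training views and the synthesized pseudo-views and added to the photometric loss; the loss weighting and the choice of depth estimator are beyond what the abstract specifies.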

Z. Zhu and Z. Fan—Equal Contribution.

Z. Fan—Project Lead.


Acknowledgement

This work was supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number 140D0423C0074. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.

Author information

Corresponding author

Correspondence to Zehao Zhu.

Electronic supplementary material

Below is the link to the electronic supplementary material.

Supplementary material 1 (pdf 3495 KB)

Copyright information

© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Cite this paper

Zhu, Z., Fan, Z., Jiang, Y., Wang, Z. (2025). FSGS: Real-Time Few-Shot View Synthesis Using Gaussian Splatting. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15097. Springer, Cham. https://doi.org/10.1007/978-3-031-72933-1_9

  • DOI: https://doi.org/10.1007/978-3-031-72933-1_9

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-72932-4

  • Online ISBN: 978-3-031-72933-1

  • eBook Packages: Computer Science (R0)
