Abstract
Novel view synthesis from limited observations remains a crucial and ongoing challenge. In NeRF-based few-shot view synthesis, there is often a trade-off between the accuracy of the synthesized views and the efficiency of the 3D representation. To resolve this trade-off, we introduce FSGS, a few-shot view synthesis framework based on 3D Gaussian Splatting that enables real-time, photo-realistic synthesis from a minimal number of training views. FSGS employs Proximity-guided Gaussian Unpooling, designed specifically for sparse-view settings, to bridge the gap left by the sparse initial point sets: new Gaussians are strategically placed between existing ones according to a Gaussian proximity score, enhancing adaptive density control. We have observed that Gaussian optimization with limited training views can yield overly smooth textures and a propensity for overfitting. To mitigate these issues, FSGS synthesizes virtual views that replicate the parallax effect experienced during training, and applies geometric regularization across both the actual training views and the synthesized viewpoints. This strategy ensures that new Gaussians are placed in the most representative locations, fostering more accurate and detailed scene reconstruction. Our comprehensive evaluation on the NeRF-Synthetic, LLFF, Shiny, and Mip-NeRF360 datasets shows that FSGS not only delivers exceptional rendering quality but also achieves an inference speed more than 2000 times faster than existing state-of-the-art methods for sparse-view synthesis. Project webpage: https://zehaozhu.github.io/FSGS/.
Z. Zhu and Z. Fan—Equal Contribution.
Z. Fan—Project Lead.
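To make the densification step concrete, the following is a minimal NumPy sketch of the proximity-guided Gaussian unpooling outlined in the abstract. The function name, the neighbor count k, and the threshold tau are illustrative assumptions rather than the authors' implementation; the full method also derives attributes such as covariance and opacity for each newly inserted Gaussian.

```python
# Minimal sketch of proximity-guided Gaussian unpooling (assumed details).
import numpy as np

def gaussian_unpooling(centers: np.ndarray, k: int = 3, tau: float = 0.1) -> np.ndarray:
    """Insert new Gaussian centers between existing ones whose proximity
    score (mean distance to the k nearest neighbors) exceeds tau."""
    # Pairwise distances between all Gaussian centers: (N, N).
    diff = centers[:, None, :] - centers[None, :, :]
    dist = np.linalg.norm(diff, axis=-1)
    np.fill_diagonal(dist, np.inf)  # exclude self-distance

    # k nearest neighbors per Gaussian and the resulting proximity score.
    nn_idx = np.argsort(dist, axis=1)[:, :k]            # (N, k)
    nn_dist = np.take_along_axis(dist, nn_idx, axis=1)  # (N, k)
    proximity = nn_dist.mean(axis=1)                    # (N,)

    # Under-dense Gaussians (large proximity score) get a new neighbor at
    # the midpoint to each of their k nearest neighbors.
    grow = proximity > tau
    new_centers = 0.5 * (centers[grow][:, None, :] + centers[nn_idx[grow]])
    return np.concatenate([centers, new_centers.reshape(-1, 3)], axis=0)
```

For example, gaussian_unpooling(points, k=3, tau=0.2) adds up to three midpoint Gaussians around each isolated point while leaving well-covered regions untouched, mirroring the proximity-guided growth described above.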
Acknowledgement
This work is supported by the Intelligence Advanced Research Projects Activity (IARPA) via Department of Interior/Interior Business Center (DOI/IBC) contract number 140D0423C0074. The U.S. Government is authorized to reproduce and distribute reprints for Governmental purposes notwithstanding any copyright annotation thereon. Disclaimer: The views and conclusions contained herein are those of the authors and should not be interpreted as necessarily representing the official policies or endorsements, either expressed or implied, of IARPA, DOI/IBC, or the U.S. Government.
Copyright information
© 2025 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Zhu, Z., Fan, Z., Jiang, Y., Wang, Z. (2025). FSGS: Real-Time Few-Shot View Synthesis Using Gaussian Splatting. In: Leonardis, A., Ricci, E., Roth, S., Russakovsky, O., Sattler, T., Varol, G. (eds) Computer Vision – ECCV 2024. ECCV 2024. Lecture Notes in Computer Science, vol 15097. Springer, Cham. https://doi.org/10.1007/978-3-031-72933-1_9
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-72932-4
Online ISBN: 978-3-031-72933-1