Abstract
In this work, we present a novel light field rendering framework that allows a viewer to walk around a virtual scene reconstructed from a multi-view image/video dataset containing both color and depth information. With immersive media applications in mind, the framework is designed to support dynamic scenes through input videos, to give the viewer full freedom of movement in a large area, and to achieve real-time rendering, even in Virtual Reality (VR). This paper argues that Depth-Image-Based Rendering (DIBR) is one of the few state-of-the-art techniques that meets all of these requirements. We therefore implemented OpenDIBR, an openly available DIBR renderer, as a proof of concept for the framework. It uses NVIDIA's Video Codec SDK to rapidly decode the color and depth videos on the GPU. The decoded depth maps and color frames are then warped to the output view in OpenGL. The individual input contributions are blended through a per-pixel weighted average that depends on the input and output camera positions. Experiments comparing visual quality conclude that OpenDIBR is, both objectively and subjectively, on par with TMIV and better than NeRF. In terms of performance, OpenDIBR runs at 90 Hz for up to 4 full HD input videos on desktop, or 2–4 in VR, and this can be increased further by lowering the video bitrates, reducing the depth map resolution, or dynamically lowering the number of rendered input videos.
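To make the blending step concrete, the C++ sketch below shows one plausible per-pixel weighting scheme. It is a minimal illustration under an assumed inverse-distance weight; the Contribution struct, the blendPixel function, and the epsilon value are our own illustrative choices, not OpenDIBR's actual implementation, which performs this step in an OpenGL shader.

#include <array>

// Blend N warped input contributions into one output pixel. Each
// contribution carries a warped color sample and the distance between
// its source camera and the current output (viewer) camera.
struct Contribution {
    std::array<float, 3> rgb; // warped color sample for this output pixel
    float camDistance;        // |input camera position - output camera position|
    bool valid;               // false if the warp left a hole at this pixel
};

std::array<float, 3> blendPixel(const Contribution* contribs, int n) {
    std::array<float, 3> out{0.0f, 0.0f, 0.0f};
    float weightSum = 0.0f;
    for (int i = 0; i < n; ++i) {
        if (!contribs[i].valid) continue;
        // Inputs captured closer to the viewer get larger weights; the
        // epsilon avoids division by zero when the viewer stands exactly
        // at an input camera.
        float w = 1.0f / (contribs[i].camDistance + 1e-4f);
        for (int k = 0; k < 3; ++k) out[k] += w * contribs[i].rgb[k];
        weightSum += w;
    }
    if (weightSum > 0.0f) {
        for (int k = 0; k < 3; ++k) out[k] /= weightSum;
    }
    return out;
}

In the real renderer this logic would run per fragment on the GPU, and occluded contributions would typically be discarded by a depth test before blending.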
Data availability
The datasets generated during and/or analysed during the current study are available at https://cloud.ilabt.imec.be/index.php/s/z4bm23cy2ineApS. Instructions and more information can be found at https://github.com/IDLabMedia/open-dibr/wiki.
Code availability
The OpenDIBR framework is available at https://github.com/IDLabMedia/open-dibr.
References
Attal B, Ling S, Gokaslan A et al (2020) Matryodshka: Real-time 6dof video view synthesis using multi-sphere images. In: Europ. Conf. Comput. Vis. (ECCV). Springer, Cham, pp 441–459
Baños RM, Botella C, Alcañiz M et al (2004) Immersion and emotion: their impact on the sense of presence. Cyberpsychol Behav 7(6):734–741
Biocca F, Delaney B (1995) Immersive virtual reality technology. Commun Age Virtual Reality 15(32):10–5555
Bolles RC, Baker HH, Marimont DH (1987) Epipolar-plane image analysis: an approach to determining structure from motion. Int J Comput Vis 1:7–55
Bonatto D, Fachada S, Lafruit G (2020) Ravis: Real-time accelerated view synthesizer for immersive video 6dof vr. Electronic Imaging 2020:382–391
Bonatto D, Fachada S, Rogge S et al (2021) Real-time depth video-based rendering for 6-dof hmd navigation and light field displays. IEEE Access 9:146,868–146,887
Broxton M, Flynn J, Overbeck R et al (2020) Immersive light field video with a layered mesh representation. In: ACM Trans. Graph. (SIGGRAPH). ACM, New York, pp 1–15
Buehler C, Bosse M, McMillan L et al (2001) Unstructured lumigraph rendering. In: Proc. 28th Annu. Conf. Comp. Graph. Interact. Techn. (SIGGRAPH ’01). ACM, New York, pp 425–432
Chan SC (2021) Image-Based Rendering. Springer International Publishing, New York, pp 656–664
Chen X, Liang H, Xu H et al (2021) Disocclusion-type aware hole filling method for view synthesis. Multimed Tools Appl 80:11,557–11,581
Chen SE, Williams L (1993) View interpolation for image synthesis. In: 20th Annu. Conf. Comp. Graph. Interact. Techn. ACM, New York, pp 279–288
Courteaux M, Artois J, De Pauw S et al (2022) Silvr: a synthetic immersive large-volume plenoptic dataset. In: 13th ACM Multimedia Systems Conf. ACM, New York, pp 221–226
Debevec PE, Taylor CJ, Malik J (1996) Modeling and rendering architecture from photographs: A hybrid geometry- and image-based approach. In: Proc. 23rd Annu. Conf. Comp. Graph. Interact. Techn. (SIGGRAPH). ACM, New York, pp 11–20
Dinechin GDd, Paljic A (2020) From real to virtual: An image-based rendering toolkit to help bring the world around us into virtual reality. In: 2020 IEEE Conf. Virtual Reality 3D User Interfaces Abstracts and Workshops (VRW), Atlanta, pp 348–353
Do L, Bravo G, Zinger S et al (2012) Gpu-accelerated real-time free-viewpoint dibr for 3dtv. IEEE Trans Consumer Electr 58(2):633–640
Dziembowski A (2020) Software Manual of IV-PSNR for Immersive Video [N19495]. document ISO/IEC JTC1/SC29/WG11
Fehn C (2004) Depth-image-based rendering (dibr), compression and transmission for a new approach on 3d-tv. Proc SPIE 5291:93–105
Field DA (1988) Laplacian smoothing and delaunay triangulations. Commun Appl Numer Methods 4(6):709–712
Geršak G, Lu H, Guna J (2020) Effect of vr technology matureness on vr sickness. Multimed Tools Appl 79(21–22):14,491–14,507
Gortler SJ, Grzeszczuk R, Szeliski R et al (1996) The lumigraph. In: 23rd Annu. Conf. Comp. Graph. Interact. Techn. (SIGGRAPH). ACM, New York, pp 43–54
Hedman P, Philip J, Price T et al (2018) Deep blending for free-viewpoint image-based rendering. ACM Trans Graph 37(6):1–15
Hedman P, Srinivasan PP, Mildenhall B et al (2021) Baking neural radiance fields for real time view synthesis. In: IEEE/CVF Internat. Conf. Comput. Vis. (ICCV), Montreal, pp 5855–5864
Jung J, Kroon B (2022) Common Test Conditions for MPEG Immersive Video [N0232]. document ISO/IEC JTC1/SC29/WG04
Kazhdan M, Bolitho M, Hoppe H (2006) Poisson surface reconstruction. In: Proc. fourth Eurograph. Sympos. Geometry Processing. Eurographics Association, Goslar, pp 61–70
Kertész G, Vámossy Z (2015) Current challenges in multi-view computer vision. In: 2015 IEEE 10th Jubil. Internat. Sympos. Appl. Computat. Intell. Informat, Timisoara, pp 237–241
Koniaris B, Kosek M, Sinclair D et al (2017) Real-time rendering with compressed animated light fields. In: Proc. 43rd Graph. Interface Conf., Canadian Human-Computer Communications Society, Waterloo, pp 33–40
Levoy M, Hanrahan P (1996) Light field rendering. In: 23rd Annu. Conf. Comp. Graph. Interact. Techn. ACM, New York, pp 31–42
Li J, He Y, Jiao J et al (2021) Extending 6-dof vr experience via multi-sphere images interpolation. In: 29th ACM Internat. Conf. Multimedia. ACM, New York, pp 4632–4640
Mildenhall B, Srinivasan PP, Ortiz-Cayon R et al (2019) Local light field fusion: Practical view synthesis with prescriptive sampling guidelines. ACM Trans Graph 38(4):1–14
Mildenhall B, Srinivasan PP, Tancik M et al (2020) Nerf: Representing scenes as neural radiance fields for view synthesis. In: Eur. Conf. Comput. Vis. (ECCV)
Müller T, Evans A, Schied C et al (2022) Instant neural graphics primitives with a multiresolution hash encoding. ACM Trans Graph 41(4):102:1–102:15
Netflix Technology Blog (2018) VMAF: the journey continues. https://netflixtechblog.com/vmaf-the-journey-continues-44b51ee9ed12. Accessed 25 Oct 2022
NVIDIA (2022) NVIDIA Video Codec SDK. https://developer.nvidia.com/nvidia-video-codec-sdk. Accessed 12 Jul 2022
Oh KJ, Yea S, Ho YS (2009) Hole filling method using depth based in-painting for view synthesis in free viewpoint television and 3-d video. In: 2009 Picture Coding Symposium, IEEE, Chicago, pp 1–4
Overbeck RS, Erickson D, Evangelakos D et al (2018) Welcome to light fields. In: ACM SIGGRAPH Virtual, Augmented, Mixed Reality. ACM, New York
Penner E, Zhang L (2017) Soft 3d reconstruction for view synthesis. ACM Trans Graph (SIGGRAPH Asia) 36(6):1–11
Pumarola A, Corona E, Pons-Moll G et al (2021) D-nerf: neural radiance fields for dynamic scenes. In: IEEE/CVF Conf. Comput. Vis. Pattern Recogn. (CVPR), Nashville, pp 10,318–10,327
Riegler G, Koltun V (2020) Free view synthesis. In: 16th Europ. Conf. Comput. Vis. (ECCV). Springer, Cham, pp 623–640
Schönberger JL, Zheng E, Frahm JM et al (2016) Pixelwise view selection for unstructured multi-view stereo. In: 16th Europ. Conf. Comput. Vis. (ECCV). Springer, Cham, pp 501–518
Schönberger JL, Frahm JM (2016) Structure-from-motion revisited. In: IEEE Conf Comput Vis Pattern Recognition, Las Vegas, pp 4104–4113
Seitz SM, Curless B, Diebel J et al (2006) A comparison and evaluation of multi-view stereo reconstruction algorithms. In: IEEE/CVF Conf Comput Vis Pattern Recogn. (CVPR), New York, pp 519–528
Stankiewicz O, Wegner K, Tanimoto M et al (2013) Enhanced Depth Estimation Reference Software (DERS) for Free-viewpoint Television [M31518]. document ISO/IEC JTC1/SC29/WG11
Sun W, Xu L, Au OC et al (2010) An overview of free view-point depth-image-based rendering (dibr). In: APSIPA Annual Summit Conf., Singapore, pp 1023–1030
MPEG-I Visual (2022) Test model of mpeg immersive video (tmiv). https://gitlab.com/mpeg-i-visual/tmiv. Accessed 10 Oct 2022
Wang Z, Bovik A, Sheikh H et al (2004) Image quality assessment: from error visibility to structural similarity. IEEE Trans Image Processing 13(4):600–612
Xie Y, Souto AL, Fachada S et al (2021) Performance analysis of dibr-based view synthesis with kinect azure. In: 2021 Internat. Conf. 3D Immersion (IC3D), Brussels, pp 1–6
Yao L, Han Y, Li X (2019) Fast and high-quality virtual view synthesis from multi-view plus depth videos. Multimed Tools Appl 78:19,325–19,340
Zhang C, Chen T (2004) A survey on image-based rendering—representation, sampling and compression. Signal Process Image Commun 19(1):1–28
Zhang R, Isola P, Efros AA et al (2018) The unreasonable effectiveness of deep features as a perceptual metric. In: IEEE/CVF Conf. Comput. Vis. Pattern Recogn. (CVPR), Salt Lake City, pp 586–595
Zhou T, Tucker R, Flynn J et al (2018) Stereo magnification: Learning view synthesis using multiplane images. In: ACM Trans. Graph. ACM, New York, pp 1–12
Funding
This work was funded in part by the Research Foundation - Flanders (FWO) under Grant 1SD8221N, in part by IDLab (Ghent University - imec), Flanders Innovation and Entrepreneurship (VLAIO), and the European Union.
Additional information
Publisher's note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Artois, J., Courteaux, M., Van Wallendael, G. et al. OpenDIBR: Open Real-Time Depth-Image-Based renderer of light field videos for VR. Multimed Tools Appl 83, 25797–25815 (2024). https://doi.org/10.1007/s11042-023-16250-8