Depth synthesis and local warps for plausible image-based navigation

Published: 04 July 2013

Abstract

Modern camera calibration and multiview stereo techniques enable users to smoothly navigate between different views of a scene captured using standard cameras. The underlying automatic 3D reconstruction methods work well for buildings and regular structures but often fail on vegetation, vehicles, and other complex geometry present in everyday urban scenes. Consequently, missing depth information makes Image-Based Rendering (IBR) for such scenes very challenging. Our goal is to provide plausible free-viewpoint navigation for such datasets. To do this, we introduce a new IBR algorithm that is robust to missing or unreliable geometry, providing plausible novel views even in regions quite far from the input camera positions. We first oversegment the input images, creating superpixels of homogeneous color content, which often tend to preserve depth discontinuities. We then introduce a depth synthesis approach for poorly reconstructed regions, based on a graph structure on the oversegmentation and appropriate traversal of the graph. The superpixels augmented with synthesized depth allow us to define a local shape-preserving warp which compensates for inaccurate depth. Our rendering algorithm blends the warped images and generates plausible image-based novel views for our challenging target scenes. Our results demonstrate novel view synthesis in real time for multiple challenging scenes with significant depth complexity, providing a convincing immersive navigation experience.
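The abstract describes depth synthesis only as a graph over the oversegmentation plus a suitable traversal. As a rough illustration of that idea, and not the authors' actual algorithm, the sketch below builds on three assumptions of ours: superpixels are nodes of an adjacency graph, edge cost is the Euclidean distance between mean colors, and a superpixel without depth takes the median depth of the k cheapest-to-reach superpixels that have depth. The function name `synthesize_depth` and all of its parameters are hypothetical.

```python
import heapq
from statistics import median


def synthesize_depth(adjacency, color, depth, target, k=3):
    """Estimate a depth for `target` from nearby superpixels that have depth.

    adjacency: dict, superpixel id -> list of neighbouring superpixel ids
    color:     dict, superpixel id -> mean color as a tuple of floats
    depth:     dict, superpixel id -> known depth (ids without depth are absent)
    Returns the median depth of the k cheapest-to-reach superpixels with depth,
    or None if no superpixel with depth is reachable.
    """
    def color_distance(a, b):
        # Assumed edge cost: Euclidean distance between mean colors.
        return sum((x - y) ** 2 for x, y in zip(color[a], color[b])) ** 0.5

    best = {target: 0.0}      # cheapest known path cost per superpixel
    queue = [(0.0, target)]   # Dijkstra frontier ordered by path cost
    found = []                # depths of the closest superpixels with depth

    while queue and len(found) < k:
        cost, node = heapq.heappop(queue)
        if cost > best.get(node, float("inf")):
            continue  # stale queue entry, a cheaper path was already processed
        if node != target and node in depth:
            found.append(depth[node])
        for neighbour in adjacency[node]:
            new_cost = cost + color_distance(node, neighbour)
            if new_cost < best.get(neighbour, float("inf")):
                best[neighbour] = new_cost
                heapq.heappush(queue, (new_cost, neighbour))

    return median(found) if found else None


# Toy example: superpixel 0 has no depth. Superpixels 1 and 3 look like it,
# while superpixel 2 is much brighter and lies at a very different depth.
adjacency = {0: [1, 2], 1: [0, 3], 2: [0], 3: [1]}
color = {
    0: (0.5, 0.5, 0.5),
    1: (0.5, 0.5, 0.5),
    2: (0.9, 0.9, 0.9),
    3: (0.5, 0.5, 0.6),
}
depth = {1: 10.0, 2: 50.0, 3: 12.0}

print(synthesize_depth(adjacency, color, depth, target=0, k=2))  # prints 11.0
```

In this toy graph the traversal reaches superpixels 1 and 3 before the dissimilar superpixel 2, so the estimate stays near the depths of similar-looking regions. Taking the median is our choice for illustration; the abstract does not say how the collected depths are combined.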

Supplementary Material

chaurasia (chaurasia.zip)
Supplemental movie and image files for "Depth synthesis and local warps for plausible image-based navigation"

      Published In

      ACM Transactions on Graphics  Volume 32, Issue 3
      June 2013
      129 pages
      ISSN:0730-0301
      EISSN:1557-7368
      DOI:10.1145/2487228
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 04 July 2013
      Accepted: 01 February 2013
      Revised: 01 January 2013
      Received: 01 July 2012
      Published in TOG Volume 32, Issue 3

      Author Tags

      1. Image-based rendering
      2. image warp
      3. multiview stereo
      4. superpixels
      5. variational warp
      6. wide baseline

      Qualifiers

      • Research-article
      • Research
      • Refereed

      Bibliometrics

      Article Metrics

      • Downloads (last 12 months): 83
      • Downloads (last 6 weeks): 6
      Reflects downloads up to 15 Oct 2024

      Cited By

      • (2024) A Hierarchical 3D Gaussian Representation for Real-Time Rendering of Very Large Datasets. ACM Transactions on Graphics 43:4 (1-15). DOI: 10.1145/3658160. Online publication date: 19-Jul-2024
      • (2024) Reducing the Memory Footprint of 3D Gaussian Splatting. Proceedings of the ACM on Computer Graphics and Interactive Techniques 7:1 (1-17). DOI: 10.1145/3651282. Online publication date: 13-May-2024
      • (2024) The Metamorphosis of Storytelling: Time-based Interactivity in Virtual Reality Filmmaking. Extended Abstracts of the 2024 CHI Conference on Human Factors in Computing Systems (1-5). DOI: 10.1145/3613905.3648672. Online publication date: 11-May-2024
      • (2024) TRIPS: Trilinear Point Splatting for Real-Time Radiance Field Rendering. Computer Graphics Forum 43:2. DOI: 10.1111/cgf.15012. Online publication date: 30-Apr-2024
      • (2024) CMC: Few-shot Novel View Synthesis via Cross-view Multiplane Consistency. 2024 IEEE Conference Virtual Reality and 3D User Interfaces (VR) (960-968). DOI: 10.1109/VR58804.2024.00115. Online publication date: 16-Mar-2024
      • (2024) HDhuman: High-Quality Human Novel-View Rendering From Sparse Views. IEEE Transactions on Visualization and Computer Graphics 30:8 (5328-5338). DOI: 10.1109/TVCG.2023.3290543. Online publication date: 1-Aug-2024
      • (2024) A Framework for Single-View Multi-Plane Image Inpainting. 2024 IEEE 7th International Conference on Multimedia Information Processing and Retrieval (MIPR) (536-541). DOI: 10.1109/MIPR62202.2024.00092. Online publication date: 7-Aug-2024
      • (2024) ExtraNeRF: Visibility-Aware View Extrapolation of Neural Radiance Fields with Diffusion Models. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (20385-20395). DOI: 10.1109/CVPR52733.2024.01927. Online publication date: 16-Jun-2024
      • (2024) 4K4D: Real-Time 4D View Synthesis at 4K Resolution. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (20029-20040). DOI: 10.1109/CVPR52733.2024.01893. Online publication date: 16-Jun-2024
      • (2024) Spacetime Gaussian Feature Splatting for Real-Time Dynamic View Synthesis. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (8508-8520). DOI: 10.1109/CVPR52733.2024.00813. Online publication date: 16-Jun-2024
