
Learning-based view synthesis for light field cameras

Published: 05 December 2016

Abstract

With the introduction of consumer light field cameras, light field imaging has recently become widespread. However, there is an inherent trade-off between angular and spatial resolution, so these cameras often sample sparsely in either the spatial or the angular domain. In this paper, we use machine learning to mitigate this trade-off. Specifically, we propose a novel learning-based approach to synthesize new views from a sparse set of input views. We build upon existing view synthesis techniques and break the process down into disparity and color estimation components. We use two sequential convolutional neural networks to model these components and train both networks simultaneously by minimizing the error between the synthesized and ground truth images. We demonstrate the performance of our approach using only the four corner sub-aperture views from light fields captured by the Lytro Illum camera. Experimental results show that our approach synthesizes high-quality images superior to those of state-of-the-art techniques on a variety of challenging real-world scenes. We believe our method could decrease the required angular resolution of consumer light field cameras, allowing their spatial resolution to increase.
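The abstract outlines a two-stage pipeline: a first CNN estimates disparity at the novel view, the four corner sub-aperture images are warped by that disparity, and a second CNN predicts the final color, with both networks trained jointly against ground-truth views. Below is a minimal sketch of that structure. It is an illustration under stated assumptions, not the authors' implementation: the PyTorch framework, the layer sizes, and the helper names (DisparityNet, ColorNet, warp_fn, disp_features) are all assumptions introduced here.

```python
# A minimal sketch (assumptions: PyTorch, illustrative layer sizes) of the
# two-stage pipeline described in the abstract: a disparity CNN followed by
# a color CNN, trained jointly by minimizing synthesis error.
import torch
import torch.nn as nn

class DisparityNet(nn.Module):
    """Hypothetical disparity estimator: maps per-pixel disparity features
    (computed from the four corner views) to a disparity map."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 100, 7, padding=3), nn.ReLU(),
            nn.Conv2d(100, 100, 5, padding=2), nn.ReLU(),
            nn.Conv2d(100, 50, 3, padding=1), nn.ReLU(),
            nn.Conv2d(50, 1, 1),  # one disparity value per pixel
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

class ColorNet(nn.Module):
    """Hypothetical color estimator: fuses the four disparity-warped corner
    views into the final RGB image at the novel view."""
    def __init__(self, in_channels: int):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_channels, 100, 7, padding=3), nn.ReLU(),
            nn.Conv2d(100, 100, 5, padding=2), nn.ReLU(),
            nn.Conv2d(100, 3, 3, padding=1),  # RGB output
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.net(x)

def synthesize(disp_net, color_net, disp_features, corner_views, warp_fn):
    """End-to-end synthesis of one novel view.

    disp_features: (B, C, H, W) features for disparity estimation.
    corner_views:  (B, 4, 3, H, W) the four corner sub-aperture images.
    warp_fn:       differentiable warp of each corner view to the novel
                   view given a disparity map (e.g. via F.grid_sample);
                   its implementation is omitted in this sketch.
    """
    disparity = disp_net(disp_features)            # (B, 1, H, W)
    warped = warp_fn(corner_views, disparity)      # (B, 4, 3, H, W)
    b, n, c, h, w = warped.shape
    return color_net(warped.reshape(b, n * c, h, w))  # (B, 3, H, W)

# Joint training: gradients flow through the differentiable warp into the
# disparity network, so both CNNs are optimized against the same synthesis
# loss (the "error between the synthesized and ground truth images"), e.g.:
#   loss = ((synthesize(...) - ground_truth) ** 2).mean(); loss.backward()
```

Note the design point implied by the abstract's joint training: the warp between the two networks must be differentiable, since the disparity network receives its only supervision through the color loss.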

Supplementary Material

ZIP File (a193-kalantari.zip)
Supplemental file.



Published In

ACM Transactions on Graphics, Volume 35, Issue 6
November 2016
1045 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/2980179
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 05 December 2016
Published in TOG Volume 35, Issue 6


Author Tags

  1. convolutional neural network
  2. disparity estimation
  3. light field
  4. view synthesis

Qualifiers

  • Research-article


Article Metrics

  • Downloads (last 12 months): 509
  • Downloads (last 6 weeks): 101
Reflects downloads up to 25 Dec 2024

Cited By
  • (2025) Disparity Enhancement-Based Light Field Angular Super-Resolution. IEEE Signal Processing Letters, 32, 81-85. DOI: 10.1109/LSP.2024.3496582. Online publication date: 2025.
  • (2025) Optimization of feature association strategies in multi-target tracking based on light field images. Measurement, 242, 116205. DOI: 10.1016/j.measurement.2024.116205. Online publication date: Jan-2025.
  • (2024) A Review of Deep Learning-Based Light Field Image Reconstruction and Enhancement (Invited). Laser & Optoelectronics Progress, 61:16, 1611015. DOI: 10.3788/LOP241404. Online publication date: 2024.
  • (2024) Metasurface Light Field Imaging: Research Status and Prospects (Invited). Laser & Optoelectronics Progress, 61:16, 1611007. DOI: 10.3788/LOP241399. Online publication date: 2024.
  • (2024) A Semi-supervised Angular Super-Resolution Method for Autostereoscopic 3D Surface Measurement. Optics Letters. DOI: 10.1364/OL.516099. Online publication date: 19-Jan-2024.
  • (2024) Efficient light field acquisition for integral imaging with adaptive viewport optimization. Optics Express, 32:18, 31280. DOI: 10.1364/OE.531264. Online publication date: 13-Aug-2024.
  • (2024) Learning-based light field imaging: an overview. Journal on Image and Video Processing, 2024:1. DOI: 10.1186/s13640-024-00628-1. Online publication date: 30-May-2024.
  • (2024) DirectL: Efficient Radiance Fields Rendering for 3D Light Field Displays. ACM Transactions on Graphics, 43:6, 1-19. DOI: 10.1145/3687897. Online publication date: 19-Dec-2024.
  • (2024) Learning to Handle Large Obstructions in Video Frame Interpolation. Proceedings of the 32nd ACM International Conference on Multimedia, 5221-5229. DOI: 10.1145/3664647.3681006. Online publication date: 28-Oct-2024.
  • (2024) More Realistic 3D Environment Reconstruction from Scanned Data Based on Multi-process Technologies. Proceedings of the International Conference on Computer Vision and Deep Learning, 1-5. DOI: 10.1145/3653781.3653799. Online publication date: 19-Jan-2024.
  • Show More Cited By
