Jump: virtual reality video

Authors:

Robert Anderson,

Jonathan T. Barron,

Janne Kontkanen,

Carlos Hernández,

Sameer Agarwal,

Steven M. Seitz

ACM Transactions on Graphics (TOG), Volume 35, Issue 6

Article No.: 198, Pages 1 - 13

https://doi.org/10.1145/2980179.2980257

Published: 05 December 2016

Abstract

We present Jump, a practical system for capturing high resolution, omnidirectional stereo (ODS) video suitable for wide scale consumption in currently available virtual reality (VR) headsets. Our system consists of a video camera built using off-the-shelf components and a fully automatic stitching pipeline capable of capturing video content in the ODS format. We have discovered and analyzed the distortions inherent to ODS when used for VR display as well as those introduced by our capture method and show that they are small enough to make this approach suitable for capturing a wide variety of scenes. Our stitching algorithm produces robust results by reducing the problem to one of pairwise image interpolation followed by compositing. We introduce novel optical flow and compositing methods designed specifically for this task. Our algorithm is temporally coherent and efficient, is currently running at scale on a distributed computing platform, and is capable of processing hours of footage each day.
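
The abstract describes stitching as pairwise image interpolation followed by compositing. As a rough illustration of the interpolation step only, the sketch below forward-warps a 1D scanline to a virtual viewpoint between two adjacent cameras, given a per-pixel correspondence. It is not the authors' algorithm: the function name `interpolate_scanline`, the integer `flow` representation, and the nearest-pixel splatting are all assumptions made for this example, and hole filling and compositing are omitted.

```python
import math


def interpolate_scanline(left, right, flow, t):
    """Forward-warp a 1D scanline to a virtual view between two cameras.

    left, right: lists of pixel intensities from two adjacent cameras.
    flow[i]: integer offset such that left[i] corresponds to right[i + flow[i]]
             (a hypothetical stand-in for a dense optical flow field).
    t: virtual view position, 0.0 at the left camera and 1.0 at the right.
    Returns a list the same length as `left`; None marks unfilled holes.
    """
    n = len(left)
    out = [None] * n
    for i in range(n):
        j = i + flow[i]
        if not 0 <= j < len(right):
            continue  # correspondence falls outside the right image
        # Move the pixel a fraction t of the way along its flow vector,
        # rounding half up to the nearest output pixel.
        x = int(math.floor(i + t * flow[i] + 0.5))
        if 0 <= x < n:
            # Blend the two source colors by the view position.
            # If two pixels land on the same x, the later one overwrites.
            out[x] = (1.0 - t) * left[i] + t * right[j]
    return out


left = [10, 20, 30, 40, 50]
right = [0, 0, 10, 20, 30]   # the same content shifted right by two pixels
flow = [2, 2, 2, 2, 2]
halfway = interpolate_scanline(left, right, flow, 0.5)
```

In this toy input the content shifts by two pixels between cameras, so the halfway view shows it shifted by one pixel, with holes at the borders where no correspondence exists.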

Supplementary Material

ZIP File (a198-anderson.zip)

Supplemental file.


Index Terms

Jump: virtual reality video
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Image and video acquisition
        Computational photography
  2. Computer graphics
    1. Graphics systems and interfaces
      1. Virtual reality

Published In

ACM Transactions on Graphics Volume 35, Issue 6

November 2016

1045 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/2980179

Copyright © 2016 Owner/Author.

This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.

Publisher

Association for Computing Machinery

New York, NY, United States
