research-article

High-quality streamable free-viewpoint video

Authors: David Calabrese, Steve Sullivan

ACM Transactions on Graphics (TOG), Volume 34, Issue 4

Article No.: 69, Pages 1 - 13

https://doi.org/10.1145/2766945

Published: 27 July 2015

Abstract

We present the first end-to-end solution to create high-quality free-viewpoint video encoded as a compact data stream. Our system records performances using a dense set of RGB and IR video cameras, generates dynamic textured surfaces, and compresses these to a streamable 3D video format. Four technical advances contribute to high fidelity and robustness: multimodal multi-view stereo fusing RGB, IR, and silhouette information; adaptive meshing guided by automatic detection of perceptually salient areas; mesh tracking to create temporally coherent subsequences; and encoding of tracked textured meshes as an MPEG video stream. Quantitative experiments demonstrate geometric accuracy, texture fidelity, and encoding efficiency. We release several datasets with calibrated inputs and processed results to foster future research.

Supplementary Material

ZIP File (a69-collet.zip): supplemental files, 640.81 MB

MP4 File (a69.mp4): 32.77 MB



Index Terms

High-quality streamable free-viewpoint video
1. Computing methodologies
  1. Computer graphics


Published In


ACM Transactions on Graphics Volume 34, Issue 4

August 2015

1307 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/2809654


Copyright © 2015 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States
