research-article

Unstructured video-based rendering: interactive exploration of casually captured videos

Authors:

Gabriel J. Brostow,

Marc Pollefeys

ACM Transactions on Graphics (TOG), Volume 29, Issue 4

Article No.: 87, Pages 1 - 11

https://doi.org/10.1145/1778765.1778824

Published: 26 July 2010

Abstract

We present an algorithm designed for navigating around a performance that was filmed as a "casual" multi-view video collection: real-world footage captured on hand-held cameras by a few audience members. The objective is to easily navigate in 3D, generating a video-based rendering (VBR) of a performance filmed with widely separated cameras. Casually filmed events are especially challenging because they yield footage with complicated backgrounds and camera motion. Such challenging conditions preclude the use of most algorithms that depend on correlation-based stereo or 3D shape-from-silhouettes.

Our algorithm builds on the concepts developed for the exploration of photo collections of empty scenes. Interactive performer-specific view interpolation is now possible through innovations in interactive rendering and offline matting relating to (i) modeling the foreground subject as video-sprites on billboards, (ii) modeling the background geometry with adaptive view-dependent textures, and (iii) view interpolation that follows a performer. The billboards are embedded in a simple but realistic reconstruction of the environment. The reconstructed environment provides very effective visual cues for spatial navigation as the user transitions between viewpoints. The prototype is tested on footage from several challenging events and demonstrates the editorial utility of the whole system and the particular value of our new inter-billboard optimization.
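To make the billboard and view-dependent texturing ideas concrete, the following is a minimal sketch and not the authors' implementation. It assumes an upright billboard that rotates about a unit-length vertical axis to face the virtual camera, and it assumes source views are blended with weights that fall off with the angle between each source camera's viewing direction and the virtual one. The names `billboard_axes`, `blend_weights`, and the `sharpness` parameter are hypothetical and chosen only for illustration; the abstract does not specify the weighting scheme.

```python
import math


def dot(a, b):
    """Dot product of two 3-vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    """Return v scaled to unit length."""
    n = math.sqrt(dot(v, v))
    if n == 0.0:
        raise ValueError("cannot normalize a zero-length vector")
    return tuple(c / n for c in v)


def billboard_axes(center, camera_pos, up=(0.0, 1.0, 0.0)):
    """Axes (right, up, normal) of an upright billboard facing the camera.

    Assumption: `up` is unit length and the billboard only rotates about it,
    so the sprite stays vertical while turning towards the viewer.
    """
    to_cam = tuple(c - p for c, p in zip(camera_pos, center))
    # Drop the vertical component so the billboard does not tilt.
    d = dot(to_cam, up)
    horizontal = tuple(t - d * u for t, u in zip(to_cam, up))
    normal = normalize(horizontal)
    right = normalize(cross(up, normal))
    return right, up, normal


def blend_weights(virtual_dir, source_dirs, sharpness=8.0):
    """Normalized blending weights for source views.

    Assumption: a source view contributes more when its viewing direction is
    angularly close to the virtual camera's; `sharpness` controls the falloff.
    """
    v = normalize(virtual_dir)
    raw = []
    for s in source_dirs:
        cos_angle = max(-1.0, min(1.0, dot(v, normalize(s))))
        raw.append(math.exp(-sharpness * math.acos(cos_angle)))
    total = sum(raw)
    return [r / total for r in raw]
```

In this sketch, a virtual camera placed midway between two symmetric source cameras yields equal weights, and the weight shifts towards a source camera as the virtual view approaches it. A camera directly above the billboard has no horizontal direction, so `billboard_axes` raises an error in that case rather than guessing an orientation.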

Supplementary Material

JPG File (tp124-10.jpg), 20.24 KB

Supplemental material (087.zip), 58.67 MB: video.avi (main video). Additional material can be found at http://cvg.ethz.ch/research/unstructured-vbr/

MP4 File (tp124-10.mp4), 135.91 MB

Information

Published In

ACM Transactions on Graphics Volume 29, Issue 4

July 2010

942 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/1778765

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Seventh Framework Programme
