Abstract
In this chapter we explore the possibility of interactively navigating a collection of casually captured videos of a performance: real-world footage recorded on hand-held cameras by a few members of the audience. The aim is to navigate the video collection in 3D by generating video-based renderings of the performance from an offline, pre-computed reconstruction of the event.
We propose two techniques to obtain this reconstruction, taking into account that the videos may have been recorded in complex, uncontrolled outdoor environments. One approach recovers the event geometry by exploring the temporal domain of each video independently, while the other explores the spatial domain of the video collection at each time instant independently. We also discuss the pros and cons of the two methods and their applicability to the navigation problem. Finally, we propose an interactive GPU-accelerated viewing tool to navigate the video collection.
© 2011 Springer-Verlag Berlin Heidelberg
About this chapter
Cite this chapter
Taneja, A., Ballan, L., Puwein, J., Brostow, G.J., Pollefeys, M. (2011). 3D Reconstruction and Video-Based Rendering of Casually Captured Videos. In: Cremers, D., Magnor, M., Oswald, M.R., Zelnik-Manor, L. (eds) Video Processing and Computational Video. Lecture Notes in Computer Science, vol 7082. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-24870-2_4
DOI: https://doi.org/10.1007/978-3-642-24870-2_4
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-24869-6
Online ISBN: 978-3-642-24870-2
eBook Packages: Computer Science (R0)