Video Pop-up: Monocular 3D Reconstruction of Dynamic Scenes

Russell, Chris; Yu, Rui; Agapito, Lourdes

doi:10.1007/978-3-319-10584-0_38

Part of the book series: Lecture Notes in Computer Science (LNIP, volume 8695)

Included in the following conference series: European Conference on Computer Vision

Abstract

Consider a video sequence captured by a single camera observing a complex dynamic scene containing an unknown mixture of multiple moving and possibly deforming objects. In this paper, we propose an unsupervised approach to the challenging problem of simultaneously segmenting the scene into its constituent objects and reconstructing a 3D model of the scene. The strength of our approach comes from its ability to deal with real-world dynamic scenes and to seamlessly handle different types of motion: rigid, articulated and non-rigid. We formulate the problem as a hierarchical graph-cut-based segmentation in which we decompose the whole scene into background and foreground objects and model the complex motion of non-rigid or articulated objects as a set of overlapping rigid parts. We evaluate the motion segmentation functionality of our approach on the Berkeley Motion Segmentation Dataset. In addition, to validate the capability of our approach to deal with real-world scenes, we provide 3D reconstructions of some challenging videos from the YouTube-Objects dataset.
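The idea of describing an articulated object as a set of overlapping rigid parts can be illustrated with a toy sketch. This is not the method of the paper, which works on video trajectories with a hierarchical graph-cut formulation and recovers 3D structure; it is a minimal 2D, two-frame illustration under simplifying assumptions. The function names (`fit_rigid_2d`, `residual`, `overlapping_parts`), the hand-picked seed points and the tolerance are all hypothetical and chosen only for this example.

```python
import math

def fit_rigid_2d(src, dst):
    """Least-squares 2D rotation + translation mapping src points onto dst."""
    n = len(src)
    cx = sum(p[0] for p in src) / n
    cy = sum(p[1] for p in src) / n
    dx = sum(q[0] for q in dst) / n
    dy = sum(q[1] for q in dst) / n
    dot = cross = 0.0
    for (px, py), (qx, qy) in zip(src, dst):
        ax, ay = px - cx, py - cy          # centred source point
        bx, by = qx - dx, qy - dy          # centred target point
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)         # optimal rotation angle
    c, s = math.cos(theta), math.sin(theta)
    tx = dx - (c * cx - s * cy)            # translation aligning the centroids
    ty = dy - (s * cx + c * cy)
    return theta, (tx, ty)

def residual(model, p, q):
    """Distance between the model's prediction for p and the observation q."""
    theta, (tx, ty) = model
    c, s = math.cos(theta), math.sin(theta)
    x = c * p[0] - s * p[1] + tx
    y = s * p[0] + c * p[1] + ty
    return math.hypot(x - q[0], y - q[1])

def overlapping_parts(src, dst, seeds, tol=1e-6):
    """Fit one rigid model per seed set (assumed given), then let each point
    join every part whose model explains its motion, so parts may overlap."""
    models = [
        fit_rigid_2d([src[i] for i in s], [dst[i] for i in s]) for s in seeds
    ]
    return [
        {i for i in range(len(src)) if residual(m, src[i], dst[i]) < tol}
        for m in models
    ]

# Toy two-link "arm": points 0-2 stay still, points 2-4 rotate by 90 degrees
# about the hinge at (2, 0). The hinge (point 2) is consistent with both motions.
src = [(0, 0), (1, 0), (2, 0), (3, 0), (4, 0)]
dst = [(0, 0), (1, 0), (2, 0), (2, 1), (2, 2)]
seeds = [[0, 1], [3, 4]]                   # hypothetical hand-picked seeds
parts = overlapping_parts(src, dst, seeds)
print(parts)
```

In this toy setting the hinge point is explained by both rigid motions, so it ends up in both parts; that shared membership is what "overlapping" refers to here. A real system would also have to discover the number of parts and their seeds automatically and cope with noise, which this sketch does not attempt.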

This research was funded by the European Research Council under the ERC Starting Grant agreement 204871-HUMANIS.

Author information

Authors and Affiliations

University College London, UK
Chris Russell, Rui Yu & Lourdes Agapito

Editor information

Editors and Affiliations

Department of Computer Science, University of Toronto, 6 King’s College Road, M5H 3S5, Toronto, ON, Canada
David Fleet
Faculty of Electrical Engineering, Department of Cybernetics, Czech Technical University in Prague, Technicka 2, 166 27, Prague 6, Czech Republic
Tomas Pajdla
Max-Planck-Institut für Informatik, Campus E1 4, 66123, Saarbrücken, Germany
Bernt Schiele
PSI, iMinds, KU Leuven, ESAT, Kasteelpark Arenberg 10, Bus 2441, 3001, Leuven, Belgium
Tinne Tuytelaars

About this paper

Cite this paper

Russell, C., Yu, R., Agapito, L. (2014). Video Pop-up: Monocular 3D Reconstruction of Dynamic Scenes. In: Fleet, D., Pajdla, T., Schiele, B., Tuytelaars, T. (eds) Computer Vision – ECCV 2014. ECCV 2014. Lecture Notes in Computer Science, vol 8695. Springer, Cham. https://doi.org/10.1007/978-3-319-10584-0_38

DOI: https://doi.org/10.1007/978-3-319-10584-0_38
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-10583-3
Online ISBN: 978-3-319-10584-0
eBook Packages: Computer Science, Computer Science (R0)
