Automatic photo pop-up

Authors:

Derek Hoiem,

Alexei A. Efros,

Martial Hebert

ACM Transactions on Graphics (TOG), Volume 24, Issue 3

Pages 577 - 584

https://doi.org/10.1145/1073204.1073232

Published: 01 July 2005

Abstract

This paper presents a fully automatic method for creating a 3D model from a single photograph. The model is made up of several texture-mapped planar billboards and has the complexity of a typical children's pop-up book illustration. Our main insight is that instead of attempting to recover precise geometry, we statistically model geometric classes defined by their orientations in the scene. Our algorithm labels regions of the input image into coarse categories: "ground", "sky", and "vertical". These labels are then used to "cut and fold" the image into a pop-up model using a set of simple assumptions. Because of the inherent ambiguity of the problem and the statistical nature of the approach, the algorithm is not expected to work on every image. However, it performs surprisingly well for a wide range of scenes taken from a typical person's photo album.
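
To make the "cut and fold" idea concrete, the following is a minimal Python sketch, not the authors' implementation. It assumes a per-pixel label map is already available (the abstract's three classes) and finds, for each image column, the row where a "vertical" region meets the "ground" beneath it, which is where a billboard would be folded up. The function names `fold_rows` and `ground_depth`, the level pinhole camera model, and the default focal length and camera height are all illustrative assumptions that do not come from the abstract.

```python
# Illustrative sketch only: names and camera assumptions are hypothetical.
G, V, S = "ground", "vertical", "sky"


def fold_rows(labels):
    """For each column, return the first ground row directly beneath a
    vertical region (the candidate fold line), or None if the column has
    no vertical-over-ground contact. The lowest such contact is kept."""
    height, width = len(labels), len(labels[0])
    folds = []
    for c in range(width):
        fold = None
        for r in range(height - 1):
            if labels[r][c] == V and labels[r + 1][c] == G:
                fold = r + 1
        folds.append(fold)
    return folds


def ground_depth(row, horizon_row, focal_px=1.0, camera_height=1.0):
    """Depth of a ground pixel under an ASSUMED level pinhole camera:
    a ground point at depth Z projects focal_px * camera_height / Z
    pixels below the horizon, so Z = focal_px * camera_height / offset."""
    offset = row - horizon_row
    if offset <= 0:
        raise ValueError("ground pixels must lie below the horizon")
    return focal_px * camera_height / offset


# Toy 6x4 label map: a vertical block on the left, open ground on the right.
labels = [
    [S, S, S, S],
    [S, S, S, S],
    [V, V, S, S],
    [V, V, G, G],
    [G, G, G, G],
    [G, G, G, G],
]

print(fold_rows(labels))  # [4, 4, None, None]
```

In this toy map the two left columns fold at row 4, while the right columns contain only sky and ground and so have no fold. Sky pixels would simply be cut away, and each fold row could be converted to a relative depth with `ground_depth` once a horizon row is chosen.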

Supplementary Material

MP4 File (pps022.mp4), 34.90 MB

Index Terms

Automatic photo pop-up
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
    1. Image manipulation
      1. Texturing
    2. Shape modeling
      1. Parametric curve and surface models

Published In

ACM Transactions on Graphics Volume 24, Issue 3

July 2005

826 pages

ISSN:0730-0301

EISSN:1557-7368

DOI:10.1145/1073204

Copyright © 2005 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Article

Bibliometrics & Citations

Bibliometrics

Article Metrics

Total Citations: 345
Total Downloads: 3,186
Downloads (Last 12 months): 17
Downloads (Last 6 weeks): 5

Reflects downloads up to 03 Feb 2025

Citations

Cited By

Esposito A (2025) The Representation of Architectural Space for Caspar David Friedrich: The Case Study of Eldena Abbey. Arts 14:1 (7). Online publication date: 20-Jan-2025
https://doi.org/10.3390/arts14010007
Gąsienica-Józkowy J, Cyganek B, Knapik M, Głogowski S, Przebinda Ł (2024) Deep Learning-Based Monocular Estimation of Distance and Height for Edge Devices. Information 15:8 (474). Online publication date: 9-Aug-2024
https://doi.org/10.3390/info15080474
Furferi R (2024) Deep Learning Approaches for 3D Model Generation from 2D Artworks to Aid Blind People with Tactile Exploration. Heritage 8:1 (12). Online publication date: 28-Dec-2024
https://doi.org/10.3390/heritage8010012
Ye C, Qiu L, Gu X, Zuo Q, Wu Y, Dong Z, Bo L, Xiu Y, Han X (2024) StableNormal: Reducing Diffusion Variance for Stable and Sharp Normal. ACM Transactions on Graphics 43:6 (1-18). Online publication date: 19-Dec-2024
https://dl.acm.org/doi/10.1145/3687971
Nguyen A, Choi S, Kim W, Kim J, Oh H, Kang J, Lee S (2024) Single-Image 3-D Reconstruction: Rethinking Point Cloud Deformation. IEEE Transactions on Neural Networks and Learning Systems 35:5 (6613-6627). Online publication date: May-2024
https://doi.org/10.1109/TNNLS.2022.3211929
Bae G, Davison A (2024) Rethinking Inductive Biases for Surface Normal Estimation. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (9535-9545). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.00911
Pandey K, Guerrero P, Gadelha M, Hold-Geoffroy Y, Singh K, Mitra N (2024) Diffusion Handles Enabling 3D Edits for Diffusion Models by Lifting Activations to 3D. 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) (7695-7704). Online publication date: 16-Jun-2024
https://doi.org/10.1109/CVPR52733.2024.00735
Li J, Wang S, Paquette E (2024) Texture-Driven Adaptive Mesh Refinement with Application to 3D Relief. Computer-Aided Design 167 (103640). Online publication date: Feb-2024
https://doi.org/10.1016/j.cad.2023.103640
Wei F, Zhu J, Wang H, Shen J (2024) CFDepthNet: Monocular Depth Estimation Introducing Coordinate Attention and Texture Features. Neural Processing Letters 56:3. Online publication date: 24-Apr-2024
https://doi.org/10.1007/s11063-024-11477-4
Fu Z, Li X, Huai T, Li W, Dong D, He L (2024) Robust depth completion based on Semantic Aggregation. Applied Intelligence 54:5 (3825-3840). Online publication date: 1-Mar-2024
https://dl.acm.org/doi/10.1007/s10489-024-05366-5