
Scene understanding by labeling pixels

Published: 27 October 2014

Abstract

Pixels labeled with a scene's semantics and geometry let computers describe what they see.

Published In

Communications of the ACM  Volume 57, Issue 11
November 2014
95 pages
ISSN:0001-0782
EISSN:1557-7317
DOI:10.1145/2684442
  • Editor: Moshe Y. Vardi
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

  • Research-article
  • Popular
  • Refereed
