DOI: 10.1145/2425836.2425884
poster

Towards unsupervised semantic segmentation of street scenes from motion cues

Published: 26 November 2012

Abstract

Motion provides a rich source of information about the world. It can be used as an important cue to analyse the behaviour of objects in a scene and consequently identify interesting locations within it. In this paper, given an unannotated video sequence of a dynamic scene from a fixed viewpoint, we first present a set of useful motion features that can be efficiently extracted at each pixel by optical flow. Using these features, we then develop an algorithm that can extract motion topic models and identify semantically significant regions and landmarks in a complex scene from a short video sequence. For example, by watching a street scene our algorithm can extract meaningful regions such as roads and important landmarks such as parking spots. Our method is robust to complicating factors such as shadows and occlusions.
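
The abstract does not say which motion features are used or how they are passed to the topic model. Purely as an illustrative sketch, and not the authors' method, the Python below assumes that dense optical flow is already available as an (H, W, 2) array, that each moving pixel is mapped to one of a few direction bins ("motion words"), and that each spatial cell of the image acts as a "document" whose word counts could then be given to a topic model such as latent Dirichlet allocation. The function names, bin count, cell size, and magnitude threshold are all hypothetical.

```python
import numpy as np

def quantize_flow(flow, n_dirs=8, min_mag=0.5):
    """Map an (H, W, 2) flow field to per-pixel direction words.

    Assumption: the vocabulary is just n_dirs direction bins.
    Pixels moving slower than min_mag get -1 (treated as static).
    """
    dx, dy = flow[..., 0], flow[..., 1]
    mag = np.hypot(dx, dy)
    angle = np.arctan2(dy, dx) % (2 * np.pi)  # wrap into [0, 2*pi)
    words = np.floor(angle / (2 * np.pi) * n_dirs).astype(int) % n_dirs
    words[mag < min_mag] = -1
    return words

def accumulate_counts(word_maps, cell=8, n_dirs=8):
    """Build a (n_cells, n_dirs) count matrix over all frames.

    Assumption: each cell x cell block of pixels is one 'document'.
    """
    h, w = word_maps[0].shape
    rows, cols = h // cell, w // cell
    counts = np.zeros((rows * cols, n_dirs), dtype=np.int64)
    for words in word_maps:
        # Only moving pixels inside the region covered by whole cells.
        ys, xs = np.nonzero(words[: rows * cell, : cols * cell] >= 0)
        docs = (ys // cell) * cols + (xs // cell)
        np.add.at(counts, (docs, words[ys, xs]), 1)
    return counts
```

Each row of the resulting matrix is a histogram of motion directions observed in one image cell. Cells with similar histograms (for example, two lanes of a road with opposite dominant directions) would be expected to load on different topics, which is one plausible reading of how "motion topic models" could yield region labels.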


Cited By

  • (2017) Unsupervised Semantic Scene Labeling for Streaming Data. 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pp. 5910-5919. DOI: 10.1109/CVPR.2017.626. Online publication date: Jul-2017

      Published In

      IVCNZ '12: Proceedings of the 27th Conference on Image and Vision Computing New Zealand
      November 2012
      547 pages
      ISBN: 9781450314732
      DOI: 10.1145/2425836

      Sponsors

      • HRS: Hoare Research Software Ltd.
      • Google Inc.
      • Dept. of Information Science, Univ. of Otago: Department of Information Science, University of Otago, Dunedin, New Zealand

      Publisher

      Association for Computing Machinery

      New York, NY, United States


      Author Tags

      1. motion cues
      2. scene understanding
      3. semantic segmentation

      Qualifiers

      • Poster

      Conference

      IVCNZ '12
      Sponsor:
      • HRS
      • Dept. of Information Science, Univ. of Otago
      IVCNZ '12: Image and Vision Computing New Zealand
      November 26 - 28, 2012
      Dunedin, New Zealand

      Acceptance Rates

      Overall Acceptance Rate 55 of 74 submissions, 74%

