Enhancing scene parsing by transferring structures via efficient low-rank graph matching

Published: 31 October 2016

Abstract

Scene parsing has attracted significant attention for its practical and theoretical value in computer vision. A typical scene parsing algorithm seeks to densely label the pixels or 3-dimensional points of a scene. Traditionally, this procedure relies on a pre-trained classifier to identify the label information, followed by a smoothing step via a Markov Random Field to enhance consistency. LabelTransfer is a category of scene parsing algorithms that enhances the traditional scene parsing framework by finding dense correspondences and transferring labels across scenes. In this paper, we present a novel scene parsing algorithm which matches maximal similar structures between scenes via efficient low-rank graph matching. The inputs of the algorithm are images and, if available, well-aligned point clouds. The images and the point clouds are processed in separate pipelines. The image pipeline learns a reliable classifier and matches local structures via graph matching. The point cloud pipeline conducts a preliminary segmentation and generates feasible label sets. The two pipelines are merged at the inference step, for which we develop effective and efficient potential functions. We propose a new graph matching model incorporating low-rank and Frobenius regularization, which not only guarantees an accurate solution but also provides high optimization efficiency via an eigen-decomposition strategy. Several challenging experiments are conducted, showing competitive performance of the proposed method compared to state-of-the-art LabelTransfer algorithms. Further, with point clouds, the performance can be significantly enhanced.
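
To make the idea of matching structures through an eigen-decomposition more concrete, the following is a minimal sketch of the classic spectral relaxation of graph matching. It is not the low-rank, Frobenius-regularized model proposed in the paper, whose formulation is not given in the abstract. The function names, the distance-based affinity, and the `sigma` parameter are all assumptions chosen for illustration. The sketch builds an affinity matrix over candidate assignments, approximates its principal eigenvector by power iteration, and greedily rounds the result to a one-to-one matching.

```python
import math
from itertools import product


def affinity_matrix(pts1, pts2, sigma=0.2):
    """Affinity between candidate assignments (i -> a) and (j -> b).

    Two assignments are compatible when the distance between i and j in the
    first scene is close to the distance between a and b in the second.
    The Gaussian form and sigma are illustrative assumptions.
    """
    cands = list(product(range(len(pts1)), range(len(pts2))))
    n = len(cands)
    M = [[0.0] * n for _ in range(n)]
    for p, (i, a) in enumerate(cands):
        for q, (j, b) in enumerate(cands):
            if i == j or a == b:
                continue  # identical or conflicting assignments: zero affinity
            d1 = math.dist(pts1[i], pts1[j])
            d2 = math.dist(pts2[a], pts2[b])
            M[p][q] = math.exp(-((d1 - d2) ** 2) / sigma ** 2)
    return cands, M


def principal_eigenvector(M, iters=200):
    """Approximate the leading eigenvector of a symmetric non-negative matrix."""
    n = len(M)
    x = [1.0 / math.sqrt(n)] * n
    for _ in range(iters):
        y = [sum(M[r][c] * x[c] for c in range(n)) for r in range(n)]
        norm = math.sqrt(sum(v * v for v in y))
        if norm == 0.0:
            break
        x = [v / norm for v in y]
    return x


def greedy_discretize(cands, x):
    """Accept assignments in decreasing score order, keeping them one-to-one."""
    used1, used2, match = set(), set(), {}
    for _, (i, a) in sorted(zip(x, cands), reverse=True):
        if i not in used1 and a not in used2:
            match[i] = a
            used1.add(i)
            used2.add(a)
    return match


def spectral_match(pts1, pts2):
    """Match nodes of two small point sets by spectral relaxation."""
    cands, M = affinity_matrix(pts1, pts2)
    return greedy_discretize(cands, principal_eigenvector(M))


if __name__ == "__main__":
    scene1 = [(0.0, 0.0), (1.0, 0.0), (0.0, 2.0), (3.0, 3.0)]
    # scene2 is scene1 shifted by (5, 5) with its nodes reordered.
    scene2 = [(5.0, 7.0), (5.0, 5.0), (8.0, 8.0), (6.0, 5.0)]
    print(spectral_match(scene1, scene2))
```

The dense affinity matrix grows with the square of the number of candidate assignments, which is one reason more efficient formulations such as the one described above are of interest for larger scenes.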


Cited By

  • (2021) Pole-Like Objects Segmentation and Multiscale Classification-Based Fusion from Mobile Point Clouds in Road Scenes. Remote Sensing 13:21 (4382). DOI: 10.3390/rs13214382. Online publication date: 30-Oct-2021

    Published In

    SIGSPATIAL '16: Proceedings of the 24th ACM SIGSPATIAL International Conference on Advances in Geographic Information Systems
    October 2016
    649 pages
    ISBN: 9781450345897
    DOI: 10.1145/2996913

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. graph matching
    2. scene parsing
    3. semantic segmentation

    Qualifiers

    • Research-article

    Funding Sources

    • Google Research Award

    Conference

    SIGSPATIAL'16

    Acceptance Rates

    SIGSPATIAL '16 Paper Acceptance Rate 40 of 216 submissions, 19%;
    Overall Acceptance Rate 257 of 1,238 submissions, 21%
