Visual Place Recognition with Repetitive Structures

Authors:

Masatoshi Okutomi, and

Tomas Pajdla

IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 37, Issue 11

Pages 2346 - 2359

https://doi.org/10.1109/TPAMI.2015.2409868

Published: 01 November 2015

Abstract

Repeated structures such as building facades, fences, or road markings often pose a significant challenge for place recognition. They are notoriously hard to handle when establishing correspondences using multi-view geometry. They also violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting of evidence and a significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval and geometric verification. The retrieval is based on robust detection of repeated image structures and a suitable modification of weights in the bag-of-visual-words model. We also demonstrate that the explicit detection of repeated patterns is beneficial for robust visual word matching for geometric verification. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco, demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline as well as the more recently proposed burstiness weighting and Fisher vector encoding.
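The over-counting problem mentioned in the abstract can be illustrated with a small toy example. The sketch below is not the method of the paper: it only contrasts raw visual-word counts with two generic ways of damping repeated words, a square-root weighting (in the spirit of burstiness-style normalization) and a hypothetical hard cap on the count. The function names, the tiny 7-word vocabulary, and the three synthetic images are all assumptions made for illustration.

```python
import math
from collections import Counter


def bow_vector(word_ids, vocab_size, weighting="raw", cap=1.0):
    """Build an L2-normalised bag-of-visual-words vector.

    weighting:
      "raw"  - plain term counts (standard baseline)
      "sqrt" - square root of counts (generic burstiness-style damping)
      "cap"  - counts truncated at `cap` (hypothetical illustration)
    """
    counts = Counter(word_ids)
    vec = [0.0] * vocab_size
    for word, count in counts.items():
        if weighting == "raw":
            vec[word] = float(count)
        elif weighting == "sqrt":
            vec[word] = math.sqrt(count)
        elif weighting == "cap":
            vec[word] = min(float(count), cap)
        else:
            raise ValueError(f"unknown weighting: {weighting}")
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(a, b):
    """Cosine similarity of two already L2-normalised vectors."""
    return sum(x * y for x, y in zip(a, b))


VOCAB_SIZE = 7
# Word 0 stands for a repeated facade element (e.g. a window).
# Words 1-3 are distinctive for the true place, words 4-6 for another place.
query = [0] * 20 + [1, 2, 3]
same_place = [0] * 5 + [1, 2, 3]      # fewer windows visible from this view
other_place = [0] * 20 + [4, 5, 6]    # similar facade, different location


def scores(weighting):
    """Return (similarity to same place, similarity to other place)."""
    q = bow_vector(query, VOCAB_SIZE, weighting)
    a = bow_vector(same_place, VOCAB_SIZE, weighting)
    b = bow_vector(other_place, VOCAB_SIZE, weighting)
    return cosine(q, a), cosine(q, b)


if __name__ == "__main__":
    for w in ("raw", "sqrt", "cap"):
        same, other = scores(w)
        print(f"{w:>4}: same place = {same:.3f}, other place = {other:.3f}")
```

With raw counts, the repeated word dominates the similarity and the wrong place ranks first in this toy setup; both damped weightings restore the correct ranking. This only shows why some modification of the weights is needed, not which modification the paper proposes.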

Index Terms

Visual Place Recognition with Repetitive Structures
1. Computing methodologies
  1. Artificial intelligence
    1. Computer vision
      1. Computer vision problems
        Object recognition
      2. Computer vision tasks
        Scene understanding
  2. Computer graphics
2. Information systems

Index terms have been assigned to the content through auto-classification.

Information

Published In

IEEE Transactions on Pattern Analysis and Machine Intelligence Volume 37, Issue 11

Nov. 2015

208 pages

ISSN:0162-8828

Copyright © 2015.

Publisher

IEEE Computer Society

United States
