Visual Place Recognition with Repetitive Structures

Published: 01 November 2015

Abstract

    Repeated structures such as building facades, fences, or road markings often pose a significant challenge for place recognition. They make it notoriously hard to establish correspondences using multi-view geometry, and they violate the feature independence assumed in the bag-of-visual-words representation, which often leads to over-counting of evidence and a significant degradation of retrieval performance. In this work we show that repeated structures are not a nuisance but, when appropriately represented, form an important distinguishing feature for many places. We describe a representation of repeated structures suitable for scalable retrieval and geometric verification. The retrieval is based on robust detection of repeated image structures and a suitable modification of weights in the bag-of-visual-words model. We also demonstrate that the explicit detection of repeated patterns is beneficial for robust visual word matching in geometric verification. Place recognition results are shown on datasets of street-level imagery from Pittsburgh and San Francisco, demonstrating significant gains in recognition performance compared to the standard bag-of-visual-words baseline as well as the more recently proposed burstiness weighting and Fisher vector encoding.
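
    The over-counting effect mentioned in the abstract can be illustrated with a toy example. The sketch below is not the representation proposed in the paper; it assumes a made-up seven-word vocabulary, hand-written word lists, and a hypothetical saturation threshold `cap`, and it only shows how a burst of one repeated visual word can dominate a raw histogram and how truncating the counts limits that effect.

```python
import math
from collections import Counter


def bow_vector(words, vocab_size, cap=None):
    """Return an L2-normalised bag-of-visual-words histogram.

    `cap` is a hypothetical saturation threshold (an assumption for this
    sketch): per-word counts above it are truncated, so a burst of one
    repeated word cannot dominate the vector.
    """
    counts = Counter(words)
    vec = [0.0] * vocab_size
    for word_id, count in counts.items():
        vec[word_id] = float(count if cap is None else min(count, cap))
    norm = math.sqrt(sum(v * v for v in vec))
    return [v / norm for v in vec] if norm > 0 else vec


def cosine(a, b):
    """Dot product of two L2-normalised vectors, i.e. their cosine similarity."""
    return sum(x * y for x, y in zip(a, b))


# Toy data (assumed): word 0 stands for a repeated facade element.
VOCAB_SIZE = 7
query = [0] * 20 + [1, 2, 3]        # many repeats plus three distinctive words
same_place = [0] * 5 + [1, 2, 3]    # same distinctive words, fewer repeats visible
other_place = [0] * 20 + [4, 5, 6]  # different place sharing only the repeated word


def scores(cap):
    """Similarity of the query to (same_place, other_place) for a given cap."""
    q = bow_vector(query, VOCAB_SIZE, cap)
    return (cosine(q, bow_vector(same_place, VOCAB_SIZE, cap)),
            cosine(q, bow_vector(other_place, VOCAB_SIZE, cap)))


raw_same, raw_other = scores(cap=None)
cap_same, cap_other = scores(cap=2)
print(f"raw counts: same={raw_same:.3f} other={raw_other:.3f}")
print(f"capped:     same={cap_same:.3f} other={cap_other:.3f}")
```

    With raw counts the unrelated image scores higher because the repeated word swamps the three distinctive ones; with the counts truncated the correct image ranks first. The value `cap=2` is arbitrary and chosen only to make the toy example readable.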

Published In

    IEEE Transactions on Pattern Analysis and Machine Intelligence, Volume 37, Issue 11, Nov. 2015, 208 pages

Publisher

    IEEE Computer Society, United States

Author Tags

    1. image retrieval
    2. place recognition
    3. bag of visual words
    4. geometric verification

Qualifiers

    • Research-article
