
Investigating the Role of Image Retrieval for Visual Localization

An Exhaustive Benchmark

Published in: International Journal of Computer Vision

Abstract

Visual localization, i.e., camera pose estimation in a known scene, is a core component of technologies such as autonomous driving and augmented reality. State-of-the-art localization approaches often rely on image retrieval techniques for one of two purposes: (1) provide an approximate pose estimate or (2) determine which parts of the scene are potentially visible in a given query image. It is common practice to use state-of-the-art image retrieval algorithms for both purposes. These algorithms are often trained to retrieve the same landmark under a large range of viewpoint changes, a goal that often differs from the requirements of visual localization. In order to investigate the consequences for visual localization, this paper focuses on understanding the role of image retrieval for multiple visual localization paradigms. First, we introduce a novel benchmark setup and compare state-of-the-art retrieval representations on multiple datasets using localization performance as the metric. Second, we investigate several definitions of “ground truth” for image retrieval. Using these definitions as upper bounds for the visual localization paradigms, we show that there is still significant room for improvement. Third, using these tools and in-depth analysis, we show that retrieval performance on classical landmark retrieval or place recognition tasks correlates with localization performance for some but not all paradigms. Finally, we analyze the effects of blur and dynamic scenes in the images. We conclude that there is a need for retrieval approaches specifically designed for localization paradigms. Our benchmark and evaluation protocols are available at https://github.com/naver/kapture-localization.




Notes

  1. \(\mathbf{q}_q = \sum_i w_i \mathbf{q}_i\) is re-normalized to be a unit quaternion.

  2. Note that compared to the query pose, the 3D points are very seldom collinear with the reference poses and can thus be accurately triangulated.

  3. Note that only datasets with publicly available ground truth are used to generate these results. For Aachen Day-Night and InLoc, no GT poses were available (see Sect. 4.1).

  4. Code available at http://www.ok.ctrl.titech.ac.jp/~torii/project/247/.

  5. Matlab code and pretrained models are available at https://github.com/Relja/netvlad. We used the VGG-16-based NetVLAD model trained on Pitts30k (Arandjelović et al. 2016).

  6. Pytorch implementation and models are available at https://europe.naverlabs.com/Research/Computer-Vision/Learning-Visual-Representations/Deep-Image-Retrieval/.

  7. We used the TensorFlow code publicly available at https://github.com/tensorflow/models/tree/master/research/delf/delf/python/delg.

  8. Note that this is different from the model used in our 3DV paper (Pion et al. 2020), where we used a model with ResNet50 backbone trained on GLD v1.

  9. For InLoc, the viewpoint difference between the reference images is too large to allow robust feature matching and point triangulation. Using the available depth maps to obtain the 3D points for all features for local SFM is identical to the way we perform global SFM on InLoc. That is why for InLoc, we do not show results for local SFM.

  10. For the other datasets, GT camera poses are not available for the test images, which makes it hard to generate retrieval GT.

  11. https://www.pyimagesearch.com/2020/06/15/opencv-fast-fourier-transform-fft-for-blur-detection-in-images-and-video-streams/.
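Footnote 1's weighted quaternion blend is simple to implement. Below is a minimal NumPy sketch; the function name and the hemisphere-alignment step are our own additions (the footnote only specifies the weighted sum followed by re-normalization):

```python
import numpy as np

def weighted_quaternion_average(quats, weights):
    """Blend unit quaternions by a weighted sum, then re-normalize.

    Quaternions are (w, x, y, z) arrays. Since q and -q encode the same
    rotation, each quaternion is sign-flipped into the hemisphere of the
    first one before summing, so the weighted sum is well behaved.
    """
    quats = np.asarray(quats, dtype=float)
    weights = np.asarray(weights, dtype=float)
    ref = quats[0]
    # Align hemispheres: negate quaternions with a negative dot product.
    signs = np.where(quats @ ref < 0, -1.0, 1.0)
    q = (weights[:, None] * signs[:, None] * quats).sum(axis=0)
    return q / np.linalg.norm(q)  # re-normalize to a unit quaternion

# Example: equal-weight average of the identity and a 90-degree
# z-rotation yields a 45-degree z-rotation.
qs = [[1.0, 0.0, 0.0, 0.0],
      [np.sqrt(0.5), 0.0, 0.0, np.sqrt(0.5)]]
q_avg = weighted_quaternion_average(qs, [0.5, 0.5])
```

This linear blend is only a good rotation average for nearby poses, which matches its use here for interpolating between retrieved reference poses.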
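Footnote 2's point about triangulation from reference poses can be illustrated with the standard linear (DLT) two-view method. This is a generic sketch, not the paper's pipeline; the projection matrices and the toy scene below are invented for illustration:

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one 3D point from two views.

    P1, P2: 3x4 projection matrices; x1, x2: 2D observations.
    With sufficient baseline (the point not collinear with the two
    camera centers), the linear system has a well-conditioned solution.
    """
    A = np.vstack([
        x1[0] * P1[2] - P1[0],
        x1[1] * P1[2] - P1[1],
        x2[0] * P2[2] - P2[0],
        x2[1] * P2[2] - P2[1],
    ])
    _, _, vt = np.linalg.svd(A)
    X = vt[-1]
    return X[:3] / X[3]  # dehomogenize

# Two cameras with a 1-unit baseline along x, both looking down +z.
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])
P2 = np.hstack([np.eye(3), np.array([[-1.0], [0.0], [0.0]])])
X_true = np.array([0.3, -0.2, 4.0])
x1 = P1 @ np.append(X_true, 1.0); x1 = x1[:2] / x1[2]
x2 = P2 @ np.append(X_true, 1.0); x2 = x2[:2] / x2[2]
X_hat = triangulate_dlt(P1, P2, x1, x2)
```

In the noiseless case the DLT recovers the point exactly; as the baseline shrinks toward zero the system becomes degenerate, which is exactly the collinearity caveat in the footnote.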
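Footnote 11 links to an FFT-based blur detector. The idea can be sketched in NumPy as follows (the function name and the `size` cutoff are our assumptions; the linked tutorial uses OpenCV for image I/O and a data-dependent threshold):

```python
import numpy as np

def fft_blur_score(gray, size=60):
    """FFT-based blur measure: zero out a low-frequency block at the
    center of the shifted spectrum, reconstruct, and return the mean
    log-magnitude of what remains. Blurred images carry little
    high-frequency energy, so low scores indicate blur."""
    h, w = gray.shape
    cy, cx = h // 2, w // 2
    fft = np.fft.fftshift(np.fft.fft2(gray))
    fft[cy - size:cy + size, cx - size:cx + size] = 0  # drop low freqs
    recon = np.fft.ifft2(np.fft.ifftshift(fft))
    mag = 20 * np.log(np.abs(recon) + 1e-8)  # epsilon avoids log(0)
    return float(np.mean(mag))

# A sharp random texture scores higher than a heavily smoothed copy.
rng = np.random.default_rng(0)
sharp = rng.random((256, 256))
kernel = np.ones(31) / 31.0
blurred = np.apply_along_axis(lambda r: np.convolve(r, kernel, "same"), 1, sharp)
blurred = np.apply_along_axis(lambda c: np.convolve(c, kernel, "same"), 0, blurred)
```

The score threshold separating "blurry" from "sharp" must be tuned per dataset, which is why the paper treats blur as a per-dataset analysis rather than a fixed filter.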

References

  • Arandjelović, R., Gronát, P., Torii, A., Pajdla, T., & Sivic, J. (2016). NetVLAD: CNN architecture for weakly supervised place recognition. In CVPR.

  • Arandjelović, R., & Zisserman, A. (2012). Three things everyone should know to improve object retrieval. In CVPR.

  • Arandjelović, R., & Zisserman, A. (2013). All about VLAD. In CVPR.

  • Arandjelović, R., & Zisserman, A. (2014) DisLocation: Scalable descriptor distinctiveness for location recognition. In ACCV (pp. 188–204). Springer.

  • Arth, C., Wagner, D., Klopschitz, M., Irschara, A., & Schmalstieg, D. (2009) Wide area localization on mobile phones. In IEEE International Symposium on Mixed and Augmented Reality.

  • Avrithis, Y., Kalantidis, Y., Tolias, G., & Spyrou, E. (2010). Retrieving landmark and non-landmark images from community photo collections. In ACMMM.

  • Babenko, A., & Lempitsky, V. (2015). Aggregating deep convolutional features for image retrieval. In ICCV.

  • Babenko, A., Slesarev, A., Chigorin, A., & Lempitsky, V. (2014). Neural codes for image retrieval. In ECCV.

  • Balntas, V., Li, S., & Prisacariu, V. (2018). RelocNet: Continuous metric learning relocalisation using neural nets. In ECCV.

  • Brachmann, E., Humenberger, M., Rother, C., & Sattler, T. (2021). On the limits of pseudo ground truth in visual camera re-localisation. In ICCV.

  • Brachmann, E., & Rother, C. (2018). Learning less is more—6D camera localization via 3D surface regression. In CVPR.

  • Brachmann, E., & Rother, C. (2019). Expert sample consensus applied to camera re-localization. In ICCV.

  • Brahmbhatt, S., Gu, J., Kim, K., Hays, J., & Kautz, J. (2018). Geometry-aware learning of maps for camera localization. In CVPR.

  • Brejcha, J., & Čadík, M. (2017). State-of-the-art in visual geo-localization. Pattern Analysis and Applications (PAA), 20(3), 613–637.


  • Cao, B., Araujo, A., & Sim, J. (2020). Unifying deep local and global features for image search. In ECCV.

  • Cao, S., & Snavely, N. (2013). Graph-based discriminative learning for location recognition. In CVPR.

  • Castle, R., Klein, G., & Murray, D. (2008). Video-rate localization in multiple maps for wearable augmented reality. In IEEE international symposium on wearable computers.

  • Cavallari, T., Bertinetto, L., Mukhoti, J., Torr, P., & Golodetz, S. (2017). Let’s take this online: Adapting scene coordinate regression network predictions for online RGB-D camera relocalisation. In 3DV.

  • Cavallari, T., Golodetz, S., Lord, N., Valentin, J., Prisacariu, V., Di Stefano, L., & Torr, P. (2019). Real-time RGB-D camera pose estimation in novel scenes using a relocalisation cascade. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 42(10), 2465–2477.


  • Chen, D., Baatz, G., Köser, K., Tsai, S., Vedantham, R., Pylvänäinen, T., Roimela, K., Chen, X., Bach, J., Pollefeys, M., Girod, B., & Grzeszczuk, R. (2011). City-scale landmark identification on mobile devices. In CVPR.

  • Chum, O., & Matas, J. (2008). Optimal randomized RANSAC. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 30(8), 1472–1482.


  • Crandall, D., Backstrom, L., Huttenlocher, D., & Kleinberg, J. (2009). Mapping the world’s photos. In WWW.

  • Csurka, G., Dance, C., Fan, L., Willamowski, J., & Bray, C. (2004). Visual categorization with bags of keypoints. In ECCV Workshops.

  • Csurka, G., Dance, C., & Humenberger, M. (2018). From handcrafted to deep local invariant features. arXiv:1807.10254

  • Cui, Q., Fragoso, V., Sweeney, C., & Sen, P. (2017). GraphMatch: Efficient large-scale graph construction for structure from motion. In 3DV.

  • Deng, J., Guo, J., & Zafeiriou, S. (2019). ArcFace: Additive angular margin loss for deep face recognition. In CVPR.

  • Ding, M., Wang, Z., Sun, J., Shi, J., & Luo, P. (2019). CamNet: Coarse-to-fine retrieval for camera re-localization. In ICCV.

  • Dusmanu, M., Rocco, I., Pajdla, T., Pollefeys, M., Sivic, J., Torii, A., & Sattler, T. (2019). D2-Net: A trainable CNN for joint description and detection of local features. In CVPR

  • Fischler, M., & Bolles, R. (1981). Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM, 24(6), 381–395.


  • Garcia-Fidalgo, E., & Ortiz, A. (2015). Vision-based topological mapping and localization methods: A survey. Robotics and Autonomous Systems (RAS), 64(2), 1–20.


  • Germain, H., Bourmaud, G., & Lepetit, V. (2019). Sparse-to-dense hypercolumn matching for long-term visual localization. In 3DV.

  • Gordo, A., Almazán, J., Revaud, J., & Larlus, D. (2017). End-to-end learning of deep visual representations for image retrieval. International Journal of Computer Vision (IJCV), 124, 237–254.


  • Hausler, S., Garg, S., Xu, M., Milford, M., & Fischer, T. (2021). Patch-NetVLAD: Multi-scale fusion of locally-global descriptors for place recognition. In CVPR.

  • Hays, J., & Efros, A. (2008). IM2GPS: Estimating geographic information from a single image. In CVPR.

  • He, K., Gkioxari, G., Dollár, P., & Girshick, R. (2017). Mask R-CNN. In ICCV.

  • Heinly, J., Schönberger, J., Dunn, E., & Frahm, J. M. (2015). Reconstructing the world in six days as captured by the Yahoo 100 million image dataset. In CVPR.

  • Heng, L., Choi, B., Cui, Z., Geppert, M., Hu, S., Kuan, B., Liu, P., Nguyen, R., Yeo, Y., Geiger, A., Lee, G., Pollefeys, M., & Sattler, T. (2019). Project AutoVision: Localization and 3D scene perception for an autonomous vehicle with a multi-camera system. In ICRA.

  • Humenberger, M., Cabon, Y., Guerin, N., Morat, J., Revaud, J., Rerole, P., Pion, N., de Souza, C., Leroy, V., & Csurka, G. (2020). Robust image retrieval-based visual localization using Kapture. arXiv:2007.13867

  • Irschara, A., Zach, C., Frahm, J. M., & Bischof, H. (2009). From structure-from-motion point clouds to fast location recognition. In CVPR.

  • Jégou, H., & Chum, O. (2012). Negative evidences and co-occurrences in image retrieval: The benefit of PCA and whitening. In ECCV.

  • Jégou, H., Douze, M., Schmid, C., & Pérez, P. (2010). Aggregating local descriptors into a compact image representation. In CVPR.

  • Kalantidis, Y., Mellina, C., & Osindero, S. (2016). Cross-dimensional Weighting for aggregated deep convolutional features. In ECCV Workshops.

  • Kalantidis, Y., Tolias, G., Avrithis, Y., Phinikettos, M., Spyrou, E., Mylonas, P., & Kollias, S. (2011). VIRaL: Visual image retrieval and localization. Multimedia Tools and Applications (MTA), 74(9), 3121–3135.


  • Kendall, A., & Cipolla, R. (2017). Geometric loss functions for camera pose regression with deep learning. In CVPR.

  • Kendall, A., Grimes, M., & Cipolla, R. (2015). PoseNet: A convolutional network for real-time 6-DOF camera relocalization. In ICCV.

  • Kim, H., Dunn, E., & Frahm, J. M. (2017). Learned contextual feature reweighting for image geo-localization. In CVPR.

  • Kneip, L., Scaramuzza, D., & Siegwart, R. (2011). A novel parametrization of the perspective-three-point problem for a direct computation of absolute camera position and orientation. In CVPR.

  • Knopp, J., Sivic, J., & Pajdla, T. (2010). Avoiding confusing features in place recognition. In ECCV.

  • Kukelova, Z., Bujnak, M., & Pajdla, T. (2013). Real-time solution to the absolute pose problem with unknown radial distortion and focal length. In ICCV.

  • Larsson, V., Kukelova, Z., & Zheng, Y. (2017). Making minimal solvers for absolute pose estimation compact and robust. In ICCV.

  • Laskar, Z., Melekhov, I., Kalia, S., & Kannala, J. (2017). Camera relocalization by computing pairwise relative poses using convolutional neural network. In ICCV Workshops.

  • Lebeda, K., Matas, J., & Chum, O. (2012). Fixing the locally optimized RANSAC. In BMVC.

  • Lee, D., Ryu, S., Yeon, S., Lee, Y., Kim, D., Han, C., Cabon, Y., Weinzaepfel, P., Guerin, N., Csurka, G., & Humenberger, M. (2021). Large-scale localization datasets in crowded indoor spaces. In CVPR.

  • Li, X., Wang, S., Zhao, Y., Verbeek, J., & Kannala, J. (2020). Hierarchical scene coordinate classification and regression for visual localization. In CVPR.

  • Li, Y., Crandall, D., & Huttenlocher, D. (2009). Landmark classification in large-scale image collections. In ICCV.

  • Li, Y., Snavely, N., & Huttenlocher, D. (2010). Location recognition using prioritized feature matching. In ECCV.

  • Li, Y., Snavely, N., Huttenlocher, D., & Fua, P. (2012) Worldwide pose estimation using 3D point clouds. In ECCV.

  • Lim, H., Sinha, S., Cohen, M., Uyttendaele, M., & Kim, H. (2015). Real-time monocular image-based 6-DoF localization. International Journal of Robotics Research, 34(4–5), 476–492.


  • Lin, T. Y., Maire, M., Belongie, S., Hays, J., Perona, P., Ramanan, D., Dollár, P., & Zitnick, C. (2014). Microsoft COCO: Common objects in context. In ECCV.

  • Liu, L., Li, H., & Dai, Y. (2019). Stochastic attraction-repulsion embedding for large scale image localization. In ICCV.

  • Liu, R., Li, Z., & Jia, J. (2008). Image partial blur detection and classification. In CVPR.

  • Lowe, D. (2004). Distinctive image features from scale-invariant keypoints. International Journal of Computer Vision (IJCV), 60(2), 91–110.


  • Lowry, S., Sünderhauf, N., Newman, P., Leonard, J., Cox, D., Corke, P., & Milford, M. (2016). Visual place recognition: A survey. IEEE Transactions on Robotics, 32(1), 1–19.


  • Lu, F., & Milios, E. (1997). Globally consistent range scan alignment for environment mapping. Autonomous Robots, 4, 333–349.


  • Lynen, S., Sattler, T., Bosse, M., Hesch, J., Pollefeys, M., & Siegwart, R. (2015). Get out of my Lab: Large-scale, real-time visual-inertial localization. In RSS.

  • Maddern, W., Pascoe, G., Linegar, C., & Newman, P. (2017). 1 Year, 1000 km: The Oxford RobotCar dataset. International Journal of Robotics Research, 36(1), 3–15.


  • Massiceti, D., Krull, A., Brachmann, E., Rother, C., & Torr, P. (2017). Random forests versus neural networks—What’s best for camera localization? In ICRA.

  • Middelberg, S., Sattler, T., Untzelmann, O., & Kobbelt, L. (2014). Scalable 6-DoF localization on mobile devices. In ECCV.

  • Myers, J., & Well, A. (2003). Research design and statistical analysis. Lawrence Erlbaum Associates.

  • Noh, H., Araujo, A., Sim, J., Weyand, T., & Han, B. (2017) Large-scale image retrieval with attentive deep local features. In ICCV.

  • Pearson, K. (1895). Notes on regression and inheritance in the case of two parents. Proceedings of the Royal Society of London, 58, 240–242.


  • Perronnin, F., & Dance, C. (2007). Fisher kernels on visual vocabularies for image categorization. In CVPR.

  • Philbin, J., Chum, O., Isard, M., Sivic, J., & Zisserman, A. (2007). Object retrieval with large vocabularies and fast spatial matching. In CVPR.

  • Philbin, J., Chum, O., Isard, M., Sivic, J., & Zisserman, A. (2008). Lost in quantization: Improving particular object retrieval in large scale image databases. In CVPR.

  • Piasco, N., Sidibé, D., Demonceaux, C., & Gouet-Brunet, V. (2018). A survey on visual-based localization: On the benefit of heterogeneous data. Pattern Recognition, 74(2), 90–109.


  • Pion, N., Humenberger, M., Csurka, G., Cabon, Y., & Sattler, T. (2020). Benchmarking image retrieval for visual localization. In 3DV.

  • Radenović, F., Iscen, A., Tolias, G., Avrithis, Y., & Chum, O. (2018). Revisiting Oxford and Paris: Large-scale image retrieval benchmarking. In CVPR.

  • Radenović, F., Tolias, G., & Chum, O. (2019). Fine-tuning CNN image retrieval with no human annotation. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 41(7), 1655–1668.


  • Razavian, A., Sullivan, J., Carlsson, S., & Maki, A. (2015). Visual instance retrieval with deep convolutional networks. ITE Transactions on Media Technology and Applications, 4(3), 251–258.


  • Revaud, J., Almazan, J., de Rezende, R. S., & de Souza, C. R. (2019a). Learning with average precision: Training image retrieval with a listwise loss. In ICCV.

  • Revaud, J., Weinzaepfel, P., De Souza, C., & Humenberger, M. (2019b). R2D2: Reliable and repeatable detectors and descriptors. In NeurIPS.

  • Revaud, J., Weinzaepfel, P., De Souza, C., Pion, N., Csurka, G., Cabon, Y., & Humenberger, M. (2019c). R2D2: Reliable and repeatable detectors and descriptors for joint sparse keypoint detection and local feature extraction. arXiv:1906.06195

  • Sarlin, P. E., Cadena, C., Siegwart, R., & Dymczyk, M. (2019). From coarse to fine: Robust hierarchical localization at large scale. In CVPR.

  • Sarlin, P. E., Unagar, A., Larsson, M., Germain, H., Toft, C., Larsson, V., Pollefeys, M., Lepetit, V., Hammarstrand, L., Kahl, F., & Sattler, T. (2021). Back to the feature: Learning robust camera localization from pixels to pose. In CVPR.

  • Sattler, T., Havlena, M., Radenović, F., Schindler, K., & Pollefeys, M. (2015). Hyperpoints and fine vocabularies for large-scale location recognition. In ICCV.

  • Sattler, T., Havlena, M., Schindler, K., & Pollefey, M. (2016). Large-scale location recognition and the geometric burstiness problem. In CVPR.

  • Sattler, T., Leibe, B., & Kobbelt, L. (2017). Efficient & effective prioritized matching for large-scale image-based localization. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 39(9), 1744–1756.


  • Sattler, T., Maddern, W., Toft, C., Torii, A., Hammarstrand, L., Stenborg, E., Safari, D., Okutomi, M., Pollefeys, M., Sivic, J., Kahl, F., & Pajdla, T. (2018). Benchmarking 6DoF outdoor visual localization in changing conditions. In CVPR.

  • Sattler, T., Weyand, T., Leibe, B., & Kobbelt, L. (2012). Image retrieval for image-based localization revisited. In BMVC.

  • Sattler, T., Zhou, Q., Pollefeys, M., & Leal-Taixé, L. (2019). Understanding the limitations of CNN-based absolute camera pose regression. In CVPR.

  • Schindler, G., Brown, M., & Szeliski, R. (2007). City-scale location recognition. In CVPR.

  • Schönberger, J., & Frahm, J. M. (2016). Structure-from-motion revisited. In CVPR.

  • Schönberger, J., Hardmeier, H., Sattler, T., & Pollefeys, M. (2017). Comparative evaluation of hand-crafted and learned local features. In CVPR.

  • Se, S., Lowe, D., & Little, J. (2002). Global localization using distinctive visual features. In IROS.

  • Shotton, J., Glocker, B., Zach, C., Izadi, S., Criminisi, A., & Fitzgibbon, A. (2013). Scene coordinate regression forests for camera relocalization in RGB-D images. In CVPR.

  • Sivic, J., & Zisserman, A. (2003). Video Google: A text retrieval approach to object matching in videos. In ICCV.

  • Snavely, N., Seitz, S., & Szeliski, R. (2008). Modeling the world from internet photo collections. International Journal of Computer Vision (IJCV), 80(2), 189–210.


  • Sun, X., Xie, Y., Luo, P., & Wang, L. (2017). A dataset for benchmarking image-based localization. In CVPR.

  • Taira, H., Okutomi, M., Sattler, T., Cimpoi, M., Pollefeys, M., Sivic, J., Pajdla, T., & Torii, A. (2018). InLoc: Indoor visual localization with dense matching and view synthesis. In CVPR.

  • Taira, H., Okutomi, M., Sattler, T., Cimpoi, M., Pollefeys, M., Sivic, J., Pajdla, T., & Torii, A. (2019a). InLoc: Indoor visual localization with dense matching and view synthesis. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI). (Early Access).

  • Taira, H., Rocco, I., Sedlar, J., Okutomi, M., Sivic, J., Pajdla, T., Sattler, T., & Torii, A. (2019b). Is This the Right Place? Geometric-semantic pose verification for indoor visual localization. In ICCV.

  • Tang, S., Tang, C., Huang, R., Zhu, S., & Tan, P. (2021). Learning camera localization via dense scene matching. In CVPR.

  • Tolias, G., & Jégou, H. (2014). Visual query expansion with or without geometry: Refining local descriptors by feature aggregation. Pattern Recognition, 47(10), 3466–3476.


  • Tolias, G., Sicre, R., & Jégou, H. (2016). Particular object retrieval with integral maxpooling of CNN activations. In ICLR.

  • Torii, A., Arandjelović, R., Sivic, J., Okutomi, M., & Pajdla, T. (2015a). 24/7 Place recognition by view synthesis. In CVPR.

  • Torii, A., Arandjelović, R., Sivic, J., Okutomi, M., & Pajdla, T. (2018). 24/7 Place recognition by view synthesis. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 40(2), 257–271.


  • Torii, A., Sivic, J., Okutomi, M., & Pajdla, T. (2015b). Visual place recognition with repetitive structures. IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 37(11), 2346–2359.

  • Torii, A., Sivic, J., & Pajdla, T. (2011). Visual localization by linear combination of image descriptors. In ICCV Workshops.

  • Torii, A., Taira, H., Sivic, J., Pollefeys, M., Okutomi, M., Pajdla, T., & Sattler, T. (2021). Are large-scale 3D models really necessary for accurate visual localization? IEEE Transactions on Pattern Analysis and Machine Intelligence (PAMI), 43(3), 814–829.


  • Ventura, J., Arth, C., Reitmayr, G., & Schmalstieg, D. (2014). Global localization from monocular SLAM on a mobile phone. IEEE Transactions on Visualization and Computer Graphics, 20(4), 531–539.


  • Vo, N., Jacobs, N., & Hays, J. (2017). Revisiting IM2GPS in the deep learning era. In ICCV.

  • Walch, F., Hazirbas, C., Leal-Taixé, L., Sattler, T., Hilsenbeck, S., & Cremers, D. (2017). Image-based localization using LSTMs for structured feature correlation. In ICCV.

  • Weinzaepfel, P., Csurka, G., Cabon, Y., & Humenberger, M. (2019). Visual localization by learning objects-of-interest dense match regression. In CVPR.

  • Weyand, T., Araujo, A., Cao, B., & Sim, J. (2020). Google Landmarks dataset v2: A large-scale benchmark for instance-level recognition and retrieval. In CVPR.

  • Wijmans, E., & Furukawa, Y. (2017). Exploiting 2D floorplan for building-scale panorama RGB-D alignment. In CVPR.

  • Yang, L., Bai, Z., Tang, C., Li, H., Furukawa, Y., & Tan, P. (2019). SANet: Scene agnostic network for camera localization. In ICCV.

  • Zamir, A., Hakeem, A., Van Gool, L., Shah, M., & Szeliski, R. (2016). Large-scale visual geo-localization. In Advances in computer vision and pattern recognition. Springer.

  • Zamir, A. R., & Shah, M. (2010). Accurate image localization based on google maps street view. In ECCV.

  • Zhang, W., & Kosecka, J. (2006). Image based localization in urban environments. In International symposium on 3D data processing, visualization, and transmission.

  • Zhang, Z., Sattler, T., & Scaramuzza, D. (2021). Reference pose generation for long-term visual localization via learned features and view synthesis. International Journal of Computer Vision (IJCV), 129, 821–844.


  • Zheng, E., & Wu, C. (2015). Structure from motion using structure-less resection. In ICCV.

  • Zheng, L., Zhao, Y., Wang, S., Wang, J., & Tian, Q. (2016). Good practice in CNN feature transfer. arXiv:1604.00133

  • Zhou, Q., Sattler, T., Pollefeys, M., & Leal-Taixé, L. (2020). To learn or not to learn: Visual localization from essential matrices. In ICRA.


Acknowledgements

This work received funding through the EU Horizon 2020 research and innovation programme under Grant agreement No. 857306 (RICAIP) and the European Regional Development Fund under IMPACT No. CZ.02.1.01/0.0/0.0/15_003/0000468.

Author information

Corresponding author

Correspondence to Martin Humenberger.

Additional information

Communicated by Jun Sato.

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Noé Pion's work was done during an appointment at NAVER LABS Europe, Meylan, France.


About this article


Cite this article

Humenberger, M., Cabon, Y., Pion, N. et al. Investigating the Role of Image Retrieval for Visual Localization. Int J Comput Vis 130, 1811–1836 (2022). https://doi.org/10.1007/s11263-022-01615-7
