Attention-based Pyramid Aggregation Network for Visual Place Recognition

Zhu, Yingying; Wang, Jiong; Xie, Lingxi; Zheng, Liang

doi:10.1145/3240508.3240525

Computer Science > Computer Vision and Pattern Recognition

arXiv:1808.00288 (cs)

[Submitted on 1 Aug 2018]

Abstract: Visual place recognition is challenging in urban environments and is usually viewed as a large-scale image retrieval task. Its intrinsic challenges are that confusing objects such as cars and trees frequently occur in complex urban scenes, and that buildings with repetitive structures may cause over-counting and the burstiness problem, both of which degrade the image representations. To address these problems, we present an Attention-based Pyramid Aggregation Network (APANet), which is trained end-to-end for place recognition. One main component of APANet, spatial pyramid pooling, can effectively encode multi-size buildings containing geo-information. The other, the attention block, serves as a region evaluator that suppresses confusing regional features while highlighting discriminative ones. At test time, we further propose a simple yet effective PCA power whitening strategy, which significantly improves on the widely used PCA whitening by reasonably limiting the impact of over-counting. Experimental evaluations demonstrate that the proposed APANet outperforms state-of-the-art methods on two place recognition benchmarks and generalizes well to standard image retrieval datasets.
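
The abstract names two mechanisms without giving formulas: attention-weighted aggregation over spatial pyramid regions, and PCA power whitening. The NumPy sketch below is one possible reading, not the authors' implementation. Everything specific in it is an assumption made for illustration: the function names, the grid levels (1, 2, 3), max pooling within each region, the linear region scorer `score_w` with a softmax (the paper's attention block is a learned network), and the choice to model power whitening as scaling each principal component by its eigenvalue raised to `-alpha / 2`, so that `alpha = 1` gives ordinary PCA whitening and `alpha = 0` gives a plain PCA projection.

```python
import numpy as np


def pyramid_regions(h, w, levels=(1, 2, 3)):
    """Yield (r0, r1, c0, c1) bounds of every cell in an n x n grid per level.

    Assumes h and w are at least max(levels) so that no cell is empty.
    """
    for n in levels:
        rows = np.linspace(0, h, n + 1).astype(int)
        cols = np.linspace(0, w, n + 1).astype(int)
        for i in range(n):
            for j in range(n):
                yield rows[i], rows[i + 1], cols[j], cols[j + 1]


def attention_pyramid_pool(fmap, score_w, levels=(1, 2, 3), eps=1e-12):
    """Aggregate a (C, H, W) feature map into one L2-normalized C-dim vector.

    score_w is a hypothetical (C,) linear scorer standing in for the
    learned attention block; it is NOT the architecture from the paper.
    """
    _, h, w = fmap.shape
    # One max-pooled descriptor per pyramid region (assumed pooling choice).
    regions = np.stack([
        fmap[:, r0:r1, c0:c1].max(axis=(1, 2))
        for r0, r1, c0, c1 in pyramid_regions(h, w, levels)
    ])
    regions = regions / (np.linalg.norm(regions, axis=1, keepdims=True) + eps)
    # Softmax over region scores: high-scoring regions dominate the sum.
    scores = regions @ score_w
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()
    desc = weights @ regions
    return desc / (np.linalg.norm(desc) + eps)


def fit_pca_power_whitening(X, dim, alpha=0.5, eps=1e-12):
    """Fit PCA on rows of X and scale components by eigenvalue ** (-alpha / 2).

    The exponent form is an assumption; the abstract gives no formula.
    """
    mean = X.mean(axis=0)
    Xc = X - mean
    cov = Xc.T @ Xc / (len(X) - 1)
    eigval, eigvec = np.linalg.eigh(cov)       # eigenvalues in ascending order
    order = np.argsort(eigval)[::-1][:dim]     # keep the top-`dim` components
    eigval, eigvec = eigval[order], eigvec[:, order]
    scale = (eigval + eps) ** (-alpha / 2.0)
    return mean, eigvec * scale                # (D, dim) projection matrix


def apply_whitening(X, mean, proj, eps=1e-12):
    """Project descriptors and L2-normalize each row."""
    Y = (X - mean) @ proj
    return Y / (np.linalg.norm(Y, axis=1, keepdims=True) + eps)
```

Under this reading, an exponent below 1 damps the amplification that full whitening applies to low-variance components, which is one way to limit how strongly any single direction can dominate the final descriptor.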

Comments:	Accepted to ACM Multimedia 2018
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:1808.00288 [cs.CV]
	(or arXiv:1808.00288v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1808.00288
Related DOI:	https://doi.org/10.1145/3240508.3240525

Submission history

From: Jiong Wang
[v1] Wed, 1 Aug 2018 12:10:40 UTC (722 KB)
