Article

Is Quantized ANN Search Cursed? Case Study of Quantifying Search and Index Quality

Authors:

Gylfi Þór Guðmundsson,

Björn Þór Jónsson

Pages 163 - 170

https://doi.org/10.1007/978-3-031-46994-7_14

Published: 27 October 2023

Abstract

Traditional evaluation of an approximate high-dimensional index typically consists of running a benchmark with known ground truth, analyzing the performance in terms of traditional result quality and latency measures, and then comparing those measures to competing index structures. Such analysis can give an overall indication of the suitability of the index for the application that the benchmark represents. When the index inevitably fails to return the sought items for some queries, however, this methodology does not help to explain why the index fails in those cases. Furthermore, when considering many different parameter settings, the process of repeatedly indexing the entire collection is prohibitively time-consuming. In this paper, we define three causes for failures in hierarchical quantized search. We show that the two failure cases that relate to the index can be evaluated and quantified using only the index structure and ground-truth data. In our evaluation, we use eCP, a lightweight algorithm that builds the index hierarchy top-down a priori without any costly segmentation of the dataset, and show that significant insight can be gained into the quality of the index structure, or lack thereof.
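
The "traditional result quality" measures mentioned in the abstract are typically computed against the benchmark's ground truth. The following is a minimal sketch, assuming recall@k as the quality measure (the abstract does not name a specific one). The function names `recall_at_k` and `failed_queries` and the toy identifiers are hypothetical and are not taken from the paper or from eCP.

```python
from typing import List, Sequence


def recall_at_k(returned: Sequence[int], ground_truth: Sequence[int], k: int) -> float:
    """Fraction of the k true nearest neighbors that appear in the top-k results."""
    if k <= 0:
        raise ValueError("k must be positive")
    truth = set(ground_truth[:k])
    if not truth:
        return 0.0
    hits = len(truth.intersection(returned[:k]))
    return hits / len(truth)


def failed_queries(
    results: Sequence[Sequence[int]],
    ground_truth: Sequence[Sequence[int]],
    k: int,
) -> List[int]:
    """Indices of queries for which at least one true neighbor was missed."""
    return [
        q
        for q, (res, gt) in enumerate(zip(results, ground_truth))
        if recall_at_k(res, gt, k) < 1.0
    ]


# Toy data (assumed): two queries, k = 2.
results = [[1, 2], [3, 4]]
truth = [[2, 1], [3, 5]]
missed = failed_queries(results, truth, k=2)
```

A measure like this reports which queries fail but not why, which is the gap the abstract describes: attributing a miss to a specific cause requires inspecting the index structure as well.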

Published In

Similarity Search and Applications: 16th International Conference, SISAP 2023, A Coruña, Spain, October 9–11, 2023, Proceedings

Oct 2023

324 pages

ISBN:978-3-031-46993-0

DOI:10.1007/978-3-031-46994-7

Editors:
Oscar Pedreira, University of A Coruña, Coruña, Spain (https://ror.org/01qckj285)
Vladimir Estivill-Castro, Pompeu Fabra University, Barcelona, Spain (https://ror.org/04n0g0b29)

Publisher

Springer-Verlag

Berlin, Heidelberg
