DOI: 10.1007/978-3-031-46994-7_14

Is Quantized ANN Search Cursed? Case Study of Quantifying Search and Index Quality

Published: 27 October 2023

Abstract

Traditional evaluation of an approximate high-dimensional index typically consists of running a benchmark with known ground truth, analyzing the performance in terms of traditional result quality and latency measures, and then comparing those measures to competing index structures. Such analysis can give an overall indication of the suitability of the index for the application that the benchmark represents. When the index inevitably fails to return the sought items for some queries, however, this methodology does not help to explain why the index fails in those cases. Furthermore, when considering many different parameter settings, the process of repeatedly indexing the entire collection is prohibitively time-consuming. In this paper, we define three causes for failures in hierarchical quantized search. We show that the two failure cases that relate to the index can be evaluated and quantified using only the index structure and ground-truth data. In our evaluation, we use eCP, a lightweight algorithm that builds the index hierarchy top-down a priori without any costly segmentation of the dataset, and show that significant insight can be gained into the quality of the index structure, or lack thereof.
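
As a rough illustration of how an index-related failure might be measured from the index structure and ground truth alone, the sketch below checks, for each query, whether the cluster holding its true nearest neighbor is among the `b` clusters the search would visit. This is not the paper's definition of its failure cases: the flat single-level layout, the Euclidean distance, and the names `nearest_leaves`, `reachable_fraction`, and `b` are all assumptions made for this example.

```python
import numpy as np


def nearest_leaves(query, centroids, b):
    """Return the indices of the b centroids closest to the query (Euclidean)."""
    dists = np.linalg.norm(centroids - query, axis=1)
    return set(np.argsort(dists)[:b].tolist())


def reachable_fraction(queries, gt_vectors, centroids, b):
    """Fraction of queries whose true neighbor lies in one of the b visited clusters.

    Hypothetical setup: each vector is assigned to its closest centroid, and a
    query visits the b clusters whose centroids are closest to it.
    """
    hits = 0
    for q, g in zip(queries, gt_vectors):
        # Cluster that the ground-truth neighbor was assigned to.
        gt_leaf = int(np.argmin(np.linalg.norm(centroids - g, axis=1)))
        if gt_leaf in nearest_leaves(q, centroids, b):
            hits += 1
    return hits / len(queries)


# Toy data: three cluster centroids, two queries, and their true neighbors.
centroids = np.array([[0.0, 0.0], [10.0, 0.0], [0.0, 10.0]])
queries = np.array([[4.0, 0.0], [0.0, 1.0]])
gt_vectors = np.array([[6.0, 0.0], [0.0, 0.5]])

frac_b1 = reachable_fraction(queries, gt_vectors, centroids, b=1)
frac_b2 = reachable_fraction(queries, gt_vectors, centroids, b=2)
```

In this toy data the first query and its true neighbor fall on opposite sides of a cluster boundary, so visiting a single cluster misses the neighbor while visiting two finds it. Nothing here requires rebuilding the index or running the full search, only the centroids and the ground-truth pairs.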

Published In

Similarity Search and Applications: 16th International Conference, SISAP 2023, A Coruña, Spain, October 9–11, 2023, Proceedings
Oct 2023
324 pages
ISBN: 978-3-031-46993-0
DOI: 10.1007/978-3-031-46994-7

Publisher

Springer-Verlag

Berlin, Heidelberg

Author Tags

  1. High-dimensional indexing
  2. Hierarchical vectorial quantization
  3. Evaluation methodology
