Billion-scale similarity search with GPUs

Johnson, Jeff; Douze, Matthijs; Jégou, Hervé

arXiv:1702.08734v1 (cs)

[Submitted on 28 Feb 2017]

Abstract: Similarity search finds application in specialized database systems handling complex data such as images or videos, which are typically represented by high-dimensional features and require specific indexing structures. This paper tackles the problem of better utilizing GPUs for this task. While GPUs excel at data-parallel tasks, prior approaches are bottlenecked by algorithms that expose less parallelism, such as k-min selection, or make poor use of the memory hierarchy.
We propose a design for k-selection that operates at up to 55% of theoretical peak performance, enabling a nearest neighbor implementation that is 8.5x faster than prior GPU state of the art. We apply it in different similarity search scenarios, by proposing optimized designs for brute-force, approximate, and compressed-domain search based on product quantization. In all these setups, we outperform the state of the art by large margins. Our implementation enables the construction of a high-accuracy k-NN graph on 95 million images from the YFCC100M dataset in 35 minutes, and of a graph connecting 1 billion vectors in less than 12 hours on 4 Maxwell Titan X GPUs. We have open-sourced our approach for the sake of comparison and reproducibility.
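
As a rough illustration of the two building blocks the abstract names, brute-force search and k-selection, the following is a minimal CPU sketch in plain Python. It is not the paper's GPU implementation: the helper names `squared_l2` and `brute_force_knn` and the toy data are hypothetical, and the standard-library `heapq.nsmallest` stands in for the k-selection step, keeping only the k best candidates instead of sorting every distance.

```python
import heapq


def squared_l2(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def brute_force_knn(query, database, k):
    """Exact k-NN by scanning every database vector (hypothetical helper).

    Returns a list of (distance, index) pairs, nearest first.
    """
    # Brute-force step: score the query against every database vector.
    scored = ((squared_l2(query, vec), idx) for idx, vec in enumerate(database))
    # k-selection step: retain only the k smallest distances.
    return heapq.nsmallest(k, scored)


# Toy data, assumed purely for illustration.
database = [[0.0, 0.0], [1.0, 0.0], [0.0, 2.0], [3.0, 3.0]]
neighbors = brute_force_knn([0.9, 0.1], database, k=2)
neighbor_ids = [idx for _, idx in neighbors]
```

Here `neighbor_ids` lists the two database rows closest to the query, nearest first. The selection step is the part the abstract identifies as a bottleneck on GPUs, since it exposes less parallelism than the distance computation.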

Subjects:	Computer Vision and Pattern Recognition (cs.CV); Databases (cs.DB); Data Structures and Algorithms (cs.DS); Information Retrieval (cs.IR)
Cite as:	arXiv:1702.08734 [cs.CV]
	(or arXiv:1702.08734v1 [cs.CV] for this version)
	https://doi.org/10.48550/arXiv.1702.08734

Submission history

From: Matthijs Douze
[v1] Tue, 28 Feb 2017 10:42:31 UTC (819 KB)
