-
Derived Codebooks for High-Accuracy Nearest Neighbor Search
Authors:
Fabien André,
Anne-Marie Kermarrec,
Nicolas Le Scouarnec
Abstract:
High-dimensional Nearest Neighbor (NN) search is central in multimedia search systems. Product Quantization (PQ) is a widespread NN search technique which has a high performance and good scalability. PQ compresses high-dimensional vectors into compact codes thanks to a combination of quantizers. Large databases can, therefore, be stored entirely in RAM, enabling fast responses to NN queries. In al…
▽ More
High-dimensional Nearest Neighbor (NN) search is central in multimedia search systems. Product Quantization (PQ) is a widespread NN search technique which has a high performance and good scalability. PQ compresses high-dimensional vectors into compact codes thanks to a combination of quantizers. Large databases can, therefore, be stored entirely in RAM, enabling fast responses to NN queries. In almost all cases, PQ uses 8-bit quantizers as they offer low response times. In this paper, we advocate the use of 16-bit quantizers. Compared to 8-bit quantizers, 16-bit quantizers boost accuracy but they increase response time by a factor of 3 to 10. We propose a novel approach that allows 16-bit quantizers to offer the same response time as 8-bit quantizers, while still providing a boost of accuracy. Our approach builds on two key ideas: (i) the construction of derived codebooks that allow a fast and approximate distance evaluation, and (ii) a two-pass NN search procedure which builds a candidate set using the derived codebooks, and then refines it using 16-bit quantizers. On 1 billion SIFT vectors, with an inverted index, our approach offers a Recall@100 of 0.85 in 5.2 ms. By contrast, 16-bit quantizers alone offer a Recall@100 of 0.85 in 39 ms, and 8-bit quantizers a Recall@100 of 0.82 in 3.8 ms.
△ Less
Submitted 16 May, 2019;
originally announced May 2019.
-
Quicker ADC : Unlocking the hidden potential of Product Quantization with SIMD
Authors:
Fabien André,
Anne-Marie Kermarrec,
Nicolas Le Scouarnec
Abstract:
Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. A common approach is to rely on Product Quantization, which allows the storage of large vector databases in memory and efficient distance computations. Yet, implementations of nearest neighbor search with Product Quantization have their performance limited by the many memory acce…
▽ More
Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. A common approach is to rely on Product Quantization, which allows the storage of large vector databases in memory and efficient distance computations. Yet, implementations of nearest neighbor search with Product Quantization have their performance limited by the many memory accesses they perform. Following this observation, André et al. proposed Quick ADC with up to $6\times$ faster implementations of $m\times{}4$ product quantizers (PQ) leveraging specific SIMD instructions. Quicker ADC is a generalization of Quick ADC not limited to $m\times{}4$ codes and supporting AVX-512, the latest revision of SIMD instruction set. In doing so, Quicker ADC faces the challenge of using efficiently 5,6 and 7-bit shuffles that do not align to computer bytes or words. To this end, we introduce (i) irregular product quantizers combining sub-quantizers of different granularity and (ii) split tables allowing lookup tables larger than registers. We evaluate Quicker ADC with multiple indexes including Inverted Multi-Indexes and IVF HNSW and show that it outperforms the reference optimized implementations (i.e., FAISS and polysemous codes) for numerous configurations. Finally, we release an open-source fork of FAISS enhanced with Quicker ADC at http://github.com/nlescoua/faiss-quickeradc.
△ Less
Submitted 14 November, 2019; v1 submitted 21 December, 2018;
originally announced December 2018.
-
Cuckoo++ Hash Tables: High-Performance Hash Tables for Networking Applications
Authors:
Nicolas Le Scouarnec
Abstract:
Hash tables are an essential data-structure for numerous networking applications (e.g., connection tracking, firewalls, network address translators). Among these, cuckoo hash tables provide excellent performance by allowing lookups to be processed with very few memory accesses (2 to 3 per lookup). Yet, for large tables, cuckoo hash tables remain memory bound and each memory access impacts performa…
▽ More
Hash tables are an essential data-structure for numerous networking applications (e.g., connection tracking, firewalls, network address translators). Among these, cuckoo hash tables provide excellent performance by allowing lookups to be processed with very few memory accesses (2 to 3 per lookup). Yet, for large tables, cuckoo hash tables remain memory bound and each memory access impacts performance. In this paper, we propose algorithmic improvements to cuckoo hash tables allowing to eliminate some unnecessary memory accesses; these changes are conducted without altering the properties of the original cuckoo hash table so that all existing theoretical analysis remain applicable. On a single core, our hash table achieves 37M lookups per second for positive lookups (i.e., when the key looked up is present in the table), and 60M lookups per second for negative lookups, a 50% improvement over the implementation included into the DPDK. On a 18-core, with mostly positive lookups, our implementation achieves 496M lookups per second, a 45% improvement over DPDK.
△ Less
Submitted 27 December, 2017;
originally announced December 2017.
-
Accelerated Nearest Neighbor Search with Quick ADC
Authors:
Fabien André,
Anne-Marie Kermarrec,
Nicolas Le Scouarnec
Abstract:
Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. Because it offers low responses times, Product Quantization (PQ) is a popular solution. PQ compresses high-dimensional vectors into short codes using several sub-quantizers, which enables in-RAM storage of large databases. This allows fast answers to NN queries, without accessing…
▽ More
Efficient Nearest Neighbor (NN) search in high-dimensional spaces is a foundation of many multimedia retrieval systems. Because it offers low responses times, Product Quantization (PQ) is a popular solution. PQ compresses high-dimensional vectors into short codes using several sub-quantizers, which enables in-RAM storage of large databases. This allows fast answers to NN queries, without accessing the SSD or HDD. The key feature of PQ is that it can compute distances between short codes and high-dimensional vectors using cache-resident lookup tables. The efficiency of this technique, named Asymmetric Distance Computation (ADC), remains limited because it performs many cache accesses.
In this paper, we introduce Quick ADC, a novel technique that achieves a 3 to 6 times speedup over ADC by exploiting Single Instruction Multiple Data (SIMD) units available in current CPUs. Efficiently exploiting SIMD requires algorithmic changes to the ADC procedure. Namely, Quick ADC relies on two key modifications of ADC: (i) the use 4-bit sub-quantizers instead of the standard 8-bit sub-quantizers and (ii) the quantization of floating-point distances. This allows Quick ADC to exceed the performance of state-of-the-art systems, e.g., it achieves a Recall@100 of 0.94 in 3.4 ms on 1 billion SIFT descriptors (128-bit codes).
△ Less
Submitted 24 April, 2017;
originally announced April 2017.
-
Fast Product-Matrix Regenerating Codes
Authors:
Nicolas Le Scouarnec
Abstract:
Distributed storage systems support failures of individual devices by the use of replication or erasure correcting codes. While erasure correcting codes offer a better storage efficiency than replication for similar fault tolerance, they incur higher CPU consumption, higher network consumption and higher disk I/Os. To address these issues, codes specific to storage systems have been designed. Thei…
▽ More
Distributed storage systems support failures of individual devices by the use of replication or erasure correcting codes. While erasure correcting codes offer a better storage efficiency than replication for similar fault tolerance, they incur higher CPU consumption, higher network consumption and higher disk I/Os. To address these issues, codes specific to storage systems have been designed. Their main feature is the ability to repair a single lost disk efficiently. In this paper, we focus on one such class of codes that minimize network consumption during repair, namely regenerating codes. We implement the original Product-Matrix Regenerating codes as well as a new optimization we propose and show that the resulting optimized codes allow achieving 790 MB/s for encoding in typical settings. Reported speeds are significantly higher than previous studies, highlighting that regenerating codes can be used with little CPU penalty.
△ Less
Submitted 9 December, 2014;
originally announced December 2014.
-
Cache policies for cloud-based systems: To keep or not to keep
Authors:
Nicolas Le Scouarnec,
Christoph Neumann,
Gilles Straub
Abstract:
In this paper, we study cache policies for cloud-based caching. Cloud-based caching uses cloud storage services such as Amazon S3 as a cache for data items that would have been recomputed otherwise. Cloud-based caching departs from classical caching: cloud resources are potentially infinite and only paid when used, while classical caching relies on a fixed storage capacity and its main monetary co…
▽ More
In this paper, we study cache policies for cloud-based caching. Cloud-based caching uses cloud storage services such as Amazon S3 as a cache for data items that would have been recomputed otherwise. Cloud-based caching departs from classical caching: cloud resources are potentially infinite and only paid when used, while classical caching relies on a fixed storage capacity and its main monetary cost comes from the initial investment. To deal with this new context, we design and evaluate a new caching policy that minimizes the overall cost of a cloud-based system. The policy takes into account the frequency of consumption of an item and the cloud cost model. We show that this policy is easier to operate, that it scales with the demand and that it outperforms classical policies managing a fixed capacity.
△ Less
Submitted 24 April, 2014; v1 submitted 2 December, 2013;
originally announced December 2013.
-
CROSS-MBCR: Exact Minimum Bandwith Coordinated Regenerating Codes
Authors:
Steve Jiekak,
Nicolas Le Scouarnec
Abstract:
We study the exact and optimal repair of multiple failures in codes for distributed storage. More particularly, we provide an explicit construction of exact minimum bandwidth coordinated regenerating codes (MBCR) for n=d+t,k,d >= k,t >= 1. Our construction differs from existing constructions by allowing both t>1 (i.e., repair of multiple failures) and d>k (i.e., contacting more than k devices duri…
▽ More
We study the exact and optimal repair of multiple failures in codes for distributed storage. More particularly, we provide an explicit construction of exact minimum bandwidth coordinated regenerating codes (MBCR) for n=d+t,k,d >= k,t >= 1. Our construction differs from existing constructions by allowing both t>1 (i.e., repair of multiple failures) and d>k (i.e., contacting more than k devices during repair).
△ Less
Submitted 3 July, 2012;
originally announced July 2012.
-
Regenerating Codes: A System Perspective
Authors:
Steve Jiekak,
Anne-Marie Kermarrec,
Nicolas Le Scouarnec,
Gilles Straub,
Alexandre Van Kempen
Abstract:
The explosion of the amount of data stored in cloud systems calls for more efficient paradigms for redundancy. While replication is widely used to ensure data availability, erasure correcting codes provide a much better trade-off between storage and availability. Regenerating codes are good candidates for they also offer low repair costs in term of network bandwidth. While they have been proven op…
▽ More
The explosion of the amount of data stored in cloud systems calls for more efficient paradigms for redundancy. While replication is widely used to ensure data availability, erasure correcting codes provide a much better trade-off between storage and availability. Regenerating codes are good candidates for they also offer low repair costs in term of network bandwidth. While they have been proven optimal, they are difficult to understand and parameterize. In this paper we provide an analysis of regenerating codes for practitioners to grasp the various trade-offs. More specifically we make two contributions: (i) we study the impact of the parameters by conducting an analysis at the level of the system, rather than at the level of a single device; (ii) we compare the computational costs of various implementations of codes and highlight the most efficient ones. Our goal is to provide system designers with concrete information to help them choose the best parameters and design for regenerating codes.
△ Less
Submitted 26 July, 2013; v1 submitted 23 April, 2012;
originally announced April 2012.
-
Exact Scalar Minimum Storage Coordinated Regenerating Codes
Authors:
Nicolas Le Scouarnec
Abstract:
We study the exact and optimal repair of multiple failures in codes for distributed storage. More particularly, we examine the use of interference alignment to build exact scalar minimum storage coordinated regenerating codes (MSCR). We show that it is possible to build codes for the case of k = 2 and d > k by aligning interferences independently but that this technique cannot be applied as soon a…
▽ More
We study the exact and optimal repair of multiple failures in codes for distributed storage. More particularly, we examine the use of interference alignment to build exact scalar minimum storage coordinated regenerating codes (MSCR). We show that it is possible to build codes for the case of k = 2 and d > k by aligning interferences independently but that this technique cannot be applied as soon as k > 2 and d > k. Our results also apply to adaptive regenerating codes.
△ Less
Submitted 2 February, 2012;
originally announced February 2012.
-
Repairing Multiple Failures with Coordinated and Adaptive Regenerating Codes
Authors:
Anne-Marie Kermarrec,
Gilles Straub,
Nicolas Le Scouarnec
Abstract:
Erasure correcting codes are widely used to ensure data persistence in distributed storage systems. This paper addresses the simultaneous repair of multiple failures in such codes. We go beyond existing work (i.e., regenerating codes by Dimakis et al.) by describing (i) coordinated regenerating codes (also known as cooperative regenerating codes) which support the simultaneous repair of multiple d…
▽ More
Erasure correcting codes are widely used to ensure data persistence in distributed storage systems. This paper addresses the simultaneous repair of multiple failures in such codes. We go beyond existing work (i.e., regenerating codes by Dimakis et al.) by describing (i) coordinated regenerating codes (also known as cooperative regenerating codes) which support the simultaneous repair of multiple devices, and (ii) adaptive regenerating codes which allow adapting the parameters at each repair. Similarly to regenerating codes by Dimakis et al., these codes achieve the optimal tradeoff between storage and the repair bandwidth. Based on these extended regenerating codes, we study the impact of lazy repairs applied to regenerating codes and conclude that lazy repairs cannot reduce the costs in term of network bandwidth but allow reducing the disk-related costs (disk bandwidth and disk I/O).
△ Less
Submitted 17 September, 2013; v1 submitted 1 February, 2011;
originally announced February 2011.