SNARF: a learning-enhanced range filter

K Vaidya, S Chatterjee, E Knorr… - Proceedings of the …, 2022 - dl.acm.org
Proceedings of the VLDB Endowment, 2022
We present Sparse Numerical Array-Based Range Filters (SNARF), a learned range filter that efficiently supports range queries for numerical data. SNARF builds a model of the data distribution and uses it to map the keys into a bit array, which is stored in compressed form. The model and the compressed bit array together constitute SNARF and are used to answer membership queries.
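To make the mapping concrete, here is a minimal sketch of the general idea, not the implementation from the paper. It assumes a piecewise-linear estimate of the cumulative distribution as the model and uses a sorted list of set-bit positions as a stand-in for the compressed bit array. The class name ToyLearnedRangeFilter, its methods, and the parameters bits_per_key and sample_every are all hypothetical names chosen for illustration.

```python
import bisect


class ToyLearnedRangeFilter:
    """Illustrative learned range filter (an assumption-laden sketch, not SNARF itself)."""

    def __init__(self, keys, bits_per_key=8, sample_every=4):
        keys = sorted(set(keys))
        if not keys:
            raise ValueError("at least one key is required")
        self.n = len(keys)
        self.num_bits = self.n * bits_per_key

        # Model: keep every `sample_every`-th key with its rank, plus the last key.
        idx = list(range(0, self.n, sample_every))
        if idx[-1] != self.n - 1:
            idx.append(self.n - 1)
        self.knot_keys = [keys[i] for i in idx]
        self.knot_ranks = idx

        # Stand-in for the compressed bit array: sorted positions of the set bits.
        self.set_bits = sorted({self._position(k) for k in keys})

    def _cdf(self, x):
        """Monotone piecewise-linear estimate of the fraction of keys <= x."""
        if x <= self.knot_keys[0]:
            return 0.0
        if x >= self.knot_keys[-1]:
            return 1.0
        j = bisect.bisect_right(self.knot_keys, x) - 1
        x0, x1 = self.knot_keys[j], self.knot_keys[j + 1]
        r0, r1 = self.knot_ranks[j], self.knot_ranks[j + 1]
        rank = min(r0 + (r1 - r0) * (x - x0) / (x1 - x0), r1)
        return rank / (self.n - 1)

    def _position(self, x):
        """Map a value to a bit position; non-decreasing in x."""
        return min(int(self._cdf(x) * self.num_bits), self.num_bits - 1)

    def may_contain_range(self, lo, hi):
        """False means no key lies in [lo, hi]; True means a key may lie there."""
        if lo > hi:
            return False
        p_lo, p_hi = self._position(lo), self._position(hi)
        i = bisect.bisect_left(self.set_bits, p_lo)
        return i < len(self.set_bits) and self.set_bits[i] <= p_hi
```

Because the position mapping in this sketch is non-decreasing, any key inside [lo, hi] sets a bit between the positions of lo and hi, so the filter never returns a false negative. False positives occur when a set bit falls in that interval although no key lies in the queried range. For example, after building the filter with ToyLearnedRangeFilter(range(0, 1000, 10)), a call to may_contain_range(5, 15) covers the key 10 and therefore reports a possible match.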
We evaluate SNARF on multiple synthetic and real-world datasets, both as a stand-alone filter and integrated into RocksDB. For range queries, SNARF achieves a false positive rate up to 50x lower than state-of-the-art range filters such as SuRF and Rosetta at the same space usage. We also evaluate SNARF in RocksDB as a replacement filter that screens requests before they access on-disk data structures. For certain read-only workloads, SNARF improves the execution time of RocksDB by up to 10x compared to SuRF and Rosetta.