OneSparse: A Unified System for Multi-index Vector Search

Authors:

Ruicheng Zheng,

Mao Yang

WWW '24: Companion Proceedings of the ACM Web Conference 2024

Pages 393 - 402

https://doi.org/10.1145/3589335.3648338

Published: 13 May 2024

Abstract

Multi-index vector search has become a cornerstone of many applications, such as recommendation systems. Efficient search in such a multi-modal hybrid vector space is challenging because no single index design performs well for all kinds of vector data. Existing approaches to processing multi-index hybrid queries suffer from either algorithmic limitations or processing inefficiency. In this paper, we propose OneSparse, a unified multi-vector index query system that incorporates multiple posting-based vector indices and enables highly efficient retrieval of multi-modal datasets. OneSparse introduces a novel multi-index query engine design based on inter-index intersection push-down. It also optimizes the vector posting format to expedite multi-index queries. Our experiments show that OneSparse achieves more than a 6x improvement in search performance while maintaining comparable accuracy. OneSparse has already been integrated into Microsoft's online web search and advertising systems, yielding a latency gain of more than 5x for Bing web search and a 2.0% Revenue Per Mille (RPM) gain for Bing sponsored search.
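To make the idea of intersecting posting lists across indices concrete, the following is a minimal sketch and not the OneSparse implementation. It assumes a toy sparse inverted index and a toy cluster-based dense index, both keyed by document id, and it assumes that a hybrid query keeps only documents found in both. The names `SPARSE_INDEX`, `DENSE_INDEX`, `intersect_sorted`, and `hybrid_search` are hypothetical, as is the additive scoring. The point illustrated is only that intersecting doc-id postings before scoring avoids scoring documents that one index would later discard.

```python
from typing import Dict, List, Tuple

# Hypothetical toy indices (illustrative only, not the paper's data layout).
# Sparse inverted index: term -> {doc_id: term weight}.
SPARSE_INDEX: Dict[str, Dict[int, float]] = {
    "vector": {1: 0.5, 3: 1.0, 8: 0.2},
    "search": {2: 0.7, 3: 0.4, 8: 0.9},
}
# Dense cluster-based index: cluster id -> {doc_id: embedding}.
DENSE_INDEX: Dict[int, Dict[int, List[float]]] = {
    0: {1: [1.0, 0.0], 2: [0.0, 1.0], 3: [0.6, 0.8]},
    1: {8: [0.8, 0.6], 9: [0.0, 1.0]},
}


def intersect_sorted(a: List[int], b: List[int]) -> List[int]:
    """Two-pointer intersection of two ascending doc-id lists."""
    i, j, out = 0, 0, []
    while i < len(a) and j < len(b):
        if a[i] == b[j]:
            out.append(a[i])
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return out


def hybrid_search(terms: List[str], clusters: List[int],
                  query_vec: List[float], k: int) -> List[Tuple[int, float]]:
    """Intersect doc ids from both indices first, then score only survivors."""
    sparse_ids = sorted({d for t in terms for d in SPARSE_INDEX.get(t, {})})
    dense_ids = sorted({d for c in clusters for d in DENSE_INDEX.get(c, {})})
    survivors = intersect_sorted(sparse_ids, dense_ids)

    scored = []
    for doc in survivors:
        # Sparse part: sum of the weights of the query terms present in doc.
        sparse_score = sum(SPARSE_INDEX.get(t, {}).get(doc, 0.0) for t in terms)
        # Dense part: inner product with the doc's stored embedding.
        vec = next(DENSE_INDEX[c][doc] for c in clusters
                   if doc in DENSE_INDEX.get(c, {}))
        dense_score = sum(q * v for q, v in zip(query_vec, vec))
        scored.append((doc, sparse_score + dense_score))

    # Highest combined score first; ties broken by doc id.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored[:k]


if __name__ == "__main__":
    print(hybrid_search(["vector", "search"], [0, 1], [1.0, 0.0], k=2))
```

In this sketch document 9 is never scored, because it appears only in the dense index. A real system would also have to decide how to weight the two scores and how many clusters to probe, which the abstract does not specify.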

Index Terms

Information systems → Information retrieval

Published In

WWW '24: Companion Proceedings of the ACM Web Conference 2024

May 2024

1928 pages

ISBN: 9798400701726

DOI: 10.1145/3589335

General Chairs: Tat-Seng Chua (National University of Singapore), Chong-Wah Ngo (Singapore Management University)

Program Chairs: Ravi Kumar (Google), Hady W. Lauw (Singapore Management University), Roy Ka-Wei Lee (Singapore University of Technology and Design)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Sponsors

SIGWEB: ACM Special Interest Group on Hypertext, Hypermedia, and Web

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

WWW '24

Sponsor:

SIGWEB

WWW '24: The ACM Web Conference 2024

May 13 - 17, 2024

Singapore, Singapore

Acceptance Rates

Overall Acceptance Rate 1,899 of 8,196 submissions, 23%
