research-article
Open access

Vexless: A Serverless Vector Data Management System Using Cloud Functions

Published: 30 May 2024

Abstract

Cloud functions, exemplified by AWS Lambda and Azure Functions, are emerging as a new computing paradigm in the cloud. They provide elastic, serverless, and low-cost cloud computing, making them highly suitable for bursty and sparse workloads, which are quite common in practice. Thus, there is a new trend in designing data systems that leverage cloud functions. In this paper, we focus on vector databases, which have recently gained significant attention partly due to large language models. In particular, we investigate how to use cloud functions to build high-performance and cost-efficient vector databases. This presents significant challenges in terms of how to perform sharding, how to reduce communication overhead, and how to minimize cold-start times.
We introduce Vexless, the first vector database system optimized for cloud functions, and present three optimizations to address these challenges. To perform sharding, we propose a global coordinator (orchestrator) that assigns workloads to cloud function instances based on their available hardware resources. To overcome communication overhead, we propose the use of stateful cloud functions, eliminating the need for costly communications during synchronization. To minimize cold-start overhead, we introduce a workload-aware cloud function lifetime management strategy. Vexless has been implemented using Azure Functions. Experimental results demonstrate that Vexless can significantly reduce costs, especially on bursty and sparse workloads, compared to cloud VM instances, while achieving similar or higher query performance and accuracy.
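To make the sharding idea concrete, the following is a minimal sketch of a coordinator that places index shards on function instances according to their free memory. It is not the algorithm used by Vexless, which the abstract does not specify: the greedy largest-first policy, the Instance class, the assign_shards function, and the memory figures are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    """A hypothetical cloud function instance with a memory budget in MB."""
    name: str
    free_mb: int
    shards: list = field(default_factory=list)


def assign_shards(shard_sizes_mb, instances):
    """Greedy placement (an assumed policy, not the paper's):
    take shards from largest to smallest and put each one on the
    instance that currently has the most free memory."""
    ordered = sorted(shard_sizes_mb.items(), key=lambda kv: -kv[1])
    for shard_id, size in ordered:
        target = max(instances, key=lambda inst: inst.free_mb)
        if target.free_mb < size:
            raise ValueError(f"no instance can hold shard {shard_id!r}")
        target.shards.append(shard_id)
        target.free_mb -= size
    return {inst.name: inst.shards for inst in instances}


if __name__ == "__main__":
    placement = assign_shards(
        {"s0": 900, "s1": 600, "s2": 500},
        [Instance("a", 1500), Instance("b", 1500)],
    )
    print(placement)  # {'a': ['s0'], 'b': ['s1', 's2']}
```

A real coordinator would likely also weigh CPU, query load, and data locality; memory alone is used here to keep the sketch short.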

Cited By

  • (2024) Vector Database Management Techniques and Systems. Companion of the 2024 International Conference on Management of Data. 10.1145/3626246.3654691 (597-604). Online publication date: 9-Jun-2024
  • (2024) Survey of vector database management systems. The VLDB Journal 33:5. 10.1007/s00778-024-00864-x (1591-1615). Online publication date: 15-Jul-2024

    Published In

    Proceedings of the ACM on Management of Data  Volume 2, Issue 3
    SIGMOD
    June 2024
    1953 pages
    EISSN:2836-6573
    DOI:10.1145/3670010
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. cloud functions
    2. serverless computing
    3. serverless databases
    4. vector databases
