research-article

RALF: Accuracy-Aware Scheduling for Feature Store Maintenance

Authors:

Joseph M. Hellerstein,

Natacha Crooks,

Joseph E. Gonzalez

Proceedings of the VLDB Endowment, Volume 17, Issue 3

Pages 563 - 576

https://doi.org/10.14778/3632093.3632116

Published: 01 November 2023

Abstract

Feature stores (sometimes called embedding stores) are becoming ubiquitous in model serving systems: downstream applications query these stores for auxiliary inputs at inference time. Stored features are derived by featurizing rapidly changing base data sources. Featurization can be prohibitively expensive to trigger on every data update, particularly for features that are vector embeddings computed by a model. Yet existing systems naively apply a one-size-fits-all policy for when and how to update these features, and do not consider query access patterns or the impact on prediction accuracy. This paper introduces RALF, which orchestrates feature updates by leveraging downstream error feedback to minimize feature store regret, a metric for how much featurization degrades downstream accuracy. We evaluate RALF on two representative feature store workloads, anomaly detection and recommendation, using real-world datasets. System experiments with a 275,077-key anomaly detection workload on 800 cores show up to a 32.7% reduction in prediction error or up to a 1.6× reduction in compute cost with accuracy-aware scheduling.
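The general idea of scheduling feature updates from downstream error feedback can be illustrated with a small sketch. This is not RALF's actual policy or API, which the abstract does not specify; the class `ErrorFeedbackScheduler`, its methods, and the "largest accumulated error first" rule are all hypothetical and chosen only for illustration. The sketch assumes that queries report a per-key prediction error and that a fixed budget limits how many keys can be re-featurized per round.

```python
import heapq
from collections import defaultdict


class ErrorFeedbackScheduler:
    """Toy scheduler (hypothetical): refresh keys with the largest accumulated error first."""

    def __init__(self):
        # Error reported by downstream queries since each key's last refresh.
        self.pending_error = defaultdict(float)

    def report_error(self, key, error):
        """Record downstream prediction error observed for a query on `key`."""
        self.pending_error[key] += error

    def next_updates(self, budget):
        """Return up to `budget` keys to re-featurize, highest pending error first."""
        chosen = heapq.nlargest(
            budget, self.pending_error.items(), key=lambda kv: kv[1]
        )
        keys = [k for k, _ in chosen]
        for k in keys:
            # A refreshed feature starts over with no accumulated error.
            del self.pending_error[k]
        return keys


scheduler = ErrorFeedbackScheduler()
for key, err in [("a", 0.1), ("b", 0.9), ("a", 0.3), ("c", 0.05), ("b", 0.2)]:
    scheduler.report_error(key, err)

first_round = scheduler.next_updates(budget=2)   # keys "b" then "a"
second_round = scheduler.next_updates(budget=2)  # only "c" remains
```

Under this toy rule, keys that are queried often and predicted poorly are refreshed first, while keys that are never queried accumulate no error and consume no compute, which is one way to read the abstract's contrast with a one-size-fits-all update policy.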

Index Terms

RALF: Accuracy-Aware Scheduling for Feature Store Maintenance
1. Information systems
  1. Information systems applications
2. Theory of computation
  1. Theory and algorithms for application domains

Index terms have been assigned to the content through auto-classification.

Published In

Proceedings of the VLDB Endowment Volume 17, Issue 3

November 2023

353 pages

ISSN:2150-8097

Editors:
Meihui Zhang (Beijing Institute of Technology),
Cyrus Shahabi (University of Southern California)

Publisher

VLDB Endowment

Badges

Artifacts Available / v1.1
