Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3514221.3526128acmconferencesArticle/Chapter ViewAbstractPublication PagesmodConference Proceedingsconference-collections
research-article

Budget-aware Index Tuning with Reinforcement Learning

Published: 11 June 2022 Publication History

Abstract

Index tuning aims to find the optimal index configuration for an input workload. It is a resource-intensive task since it requires making multiple expensive "what-if" calls to the query optimizer to estimate the cost of a query given an index configuration without actually building the indexes. In this paper, we study the problem of budget-aware index tuning where the number of what-if calls allowed when searching for the optimal configuration during tuning is constrained. This problem is challenging as it requires addressing the trade-off between investing what-if calls on exploring new configurations versus exploiting a known promising configuration. We formulate budget-aware index tuning as a Markov decision process, and propose a solution based on Monte Carlo tree search, a classic reinforcement learning technology. Experimental evaluation on both standard industry benchmarks and real workloads shows that our solution can significantly outperform alternative budget-aware solutions in terms of the quality of the index configuration.

References

[1]
[n.d.]. Amazon Relational Database Service. https://aws.amazon.com/rds/.
[2]
[n.d.]. Azure SQL Database. https://azure.microsoft.com/en-us/products/azure-sql/database/.
[3]
[n.d.]. DTA utility. https://docs.microsoft.com/en-us/sql/tools/dta/dta-utility?view=sql-server-ver15.
[4]
[n.d.]. GitHub Repository of No DBA. https://github.com/shankur/autoindex.
[5]
[n.d.]. Google Cloud SQL. https://cloud.google.com/sql.
[6]
[n.d.]. Oracle Database Cloud Service. https://www.oracle.com/database/.
[7]
Sanjay Agrawal, Surajit Chaudhuri, and Vivek R. Narasayya. 2000. Automated Selection of Materialized Views and Indexes in SQL Databases. In VLDB. 496--505.
[8]
Jean-Yves Audibert, Rémi Munos, and Csaba Szepesvari. 2006. Use of variance estimation in the multi-armed bandit problem. (2006).
[9]
Peter Auer. 2002. Using Confidence Bounds for Exploitation-Exploration Trade-offs. J. Mach. Learn. Res. 3 (2002), 397--422.
[10]
Peter Auer, Nicolò Cesa-Bianchi, and Paul Fischer. 2002. Finite-time Analysis of the Multiarmed Bandit Problem. Mach. Learn. 47, 2--3 (2002), 235--256.
[11]
Debabrota Basu, Qian Lin, Weidong Chen, Hoang Tam Vo, Zihong Yuan, Pierre Senellart, and Stéphane Bressan. 2015. Cost-Model Oblivious Database Tuning with Reinforcement Learning. In DEXA. 253--268.
[12]
R. E. Bellman. 1957. Dynamic Programming. Princeton University Press.
[13]
Dimitri P. Bertsekas. 2005. Dynamic programming and optimal control, 3rd Edition. Athena Scientific.
[14]
Cameron Browne, Edward Jack Powley, Daniel Whitehouse, Simon M. Lucas, Peter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez Liebana, Spyridon Samothrakis, and Simon Colton. 2012. A Survey of Monte Carlo Tree Search Methods. IEEE Trans. Comput. Intell. AI Games 4, 1 (2012), 1--43.
[15]
Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD. 227--238.
[16]
Nicolas Bruno and Surajit Chaudhuri. 2007. An Online Approach to Physical Design Tuning. In ICDE. 826--835.
[17]
Nicolas Bruno and Surajit Chaudhuri. 2007. Physical design refinement: The 'merge-reduce' approach. ACM Trans. Database Syst. 32, 4 (2007), 28.
[18]
Nicolas Bruno and Surajit Chaudhuri. 2008. Constrained physical design tuning. Proc. VLDB Endow. 1, 1 (2008), 4--15.
[19]
Surajit Chaudhuri, Mayur Datar, and Vivek R. Narasayya. 2004. Index Selection for Databases: A Hardness Study and a Principled Heuristic Solution. IEEE Trans. Knowl. Data Eng. 16, 11 (2004), 1313--1323.
[20]
Surajit Chaudhuri, Ashish Kumar Gupta, and Vivek R. Narasayya. 2002. Compressing SQL workloads. In SIGMOD. 488--499.
[21]
Surajit Chaudhuri and Vivek Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server.
[22]
Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB. 146--155.
[23]
Surajit Chaudhuri and Vivek R. Narasayya. 1998. AutoAdmin 'What-if' Index Analysis Utility. In SIGMOD. 367--378.
[24]
Surajit Chaudhuri and Vivek R. Narasayya. 1999. Index Merging. In ICDE.
[25]
Sunil Choenni, Henk M. Blanken, and Thiel Chang. 1993. On the Selection of Secondary Indices in Relational Databases. Data Knowl. Eng. 11, 3 (1993).
[26]
Douglas Comer. 1978. The Difficulty of Optimum Index Selection. ACM Trans. Database Syst. 3, 4 (1978), 440--445.
[27]
Sudipto Das, Miroslav Grbic, Igor Ilic, Isidora Jovandic, Andrija Jovanovic, Vivek R. Narasayya, Miodrag Radulovic, Maja Stikic, Gaoxiang Xu, and Surajit Chaudhuri. 2019. Automatically Indexing Millions of Databases in Microsoft Azure SQL Database. In SIGMOD. 666--679.
[28]
Debabrata Dash, Neoklis Polyzotis, and Anastasia Ailamaki. 2011. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads. Proc. VLDB Endow. 4, 6 (2011), 362--372.
[29]
Shaleen Deep, Anja Gruenheid, Paraschos Koutris, Jeffrey F. Naughton, and Stratis Viglas. 2020. Comprehensive and Efficient Workload Compression. Proc. VLDB Endow. 14, 3 (2020), 418--430.
[30]
Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD. 1241--1258.
[31]
S. J. Finkelstein, M. Schkolnick, and P. Tiberio. 1988. Physical Database Design for Relational Databases. ACM Trans. Database Syst. 13, 1 (1988).
[32]
Vincent François-Lavet, Peter Henderson, Riashat Islam, Marc G. Bellemare, and Joelle Pineau. 2018. An Introduction to Deep Reinforcement Learning. Found. Trends Mach. Learn. 11, 3--4 (2018), 219--354.
[33]
Sylvain Gelly and David Silver. 2011. Monte-Carlo tree search and rapid action value estimation in computer Go. Artif. Intell. 175, 11 (2011), 1856--1875.
[34]
Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, and Jeffrey D. Ullman. 1997. Index Selection for OLAP. In ICDE. 208--219.
[35]
Todd Hester, Matej Vecerík, Olivier Pietquin, Marc Lanctot, Tom Schaul, Bilal Piot, Dan Horgan, John Quan, Andrew Sendonaris, Ian Osband, Gabriel Dulac-Arnold, John P. Agapiou, Joel Z. Leibo, and Audrunas Gruslys. 2018. Deep Q-learning From Demonstrations. In AAAI. 3223--3230.
[36]
Leslie Pack Kaelbling, Michael L. Littman, and Andrew W. Moore. 1996. Reinforcement Learning: A Survey. J. Artif. Intell. Res. 4 (1996), 237--285.
[37]
Andrew Kane. 2017. Introducing Dexter, the Automatic Indexer for Postgres. https://medium.com/@ankane/introducing-dexter-the-automatic-indexer-for-postgres-5f8fa8b28f27.
[38]
Levente Kocsis and Csaba Szepesvári. 2006. Bandit Based Monte-Carlo Planning. In ECML. 282--293.
[39]
Jan Kossmann, Stefan Halfpap, Marcel Jankrift, and Rainer Schlosser. 2020. Magic mirror in my hand, which is the best in the land? An Experimental Evaluation of Index Selection Algorithms. Proc. VLDB Endow. 13, 11 (2020), 2382--2395.
[40]
Andreas Krause and Daniel Golovin. 2014. Submodular Function Maximization. In Tractability: Practical Approaches to Hard Problems. Cambridge University Press, 71--104.
[41]
Hai Lan, Zhifeng Bao, and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM. 2105--2108.
[42]
Viktor Leis. [n.d.]. Join Order Benchmark. https://github.com/gregrahn/join-order-benchmark.
[43]
Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter A. Boncz, Alfons Kemper, and Thomas Neumann. 2015. How Good Are Query Optimizers, Really? Proc. VLDB Endow. 9, 3 (2015), 204--215.
[44]
Timothy P. Lillicrap, Jonathan J. Hunt, Alexander Pritzel, Nicolas Heess, Tom Erez, Yuval Tassa, David Silver, and Daan Wierstra. 2016. Continuous control with deep reinforcement learning. In ICLR.
[45]
G. L. Nemhauser et al. 1978. An analysis of approximations for maximizing submodular set functions - I. Math. Program. 14, 1 (1978).
[46]
Stratos Papadomanolakis, Debabrata Dash, and Anastassia Ailamaki. 2007. Efficient Use of the Query Optimizer for Automated Database Design. ACM.
[47]
Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, and Renata Borovica- Gajic. 2020. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. CoRR abs/2010.09208 (2020).
[48]
Laurent Péret and Frédérick Garcia. 2004. On-Line Search for Solving Markov Decision Processes via Heuristic Sampling. In ECAI. 530--534.
[49]
M. L. Puterman and M. C. Shin. 1978. Modified policy iteration algorithms for discounted Markov decision problems. Management Science 24, 11 (1978).
[50]
Lijing Qin, Shouyuan Chen, and Xiaoyan Zhu. 2014. Contextual Combinatorial Bandit and its Application on Diversified Online Recommendation. In SDM. 461--469.
[51]
S. Ross. 2014. Introduction to stochastic dynamic programming. Academic press.
[52]
Reuven Rubinstein. 1999. The cross-entropy method for combinatorial and continuous optimization. Methodology and computing in applied probability 1, 2 (1999), 127--190.
[53]
Gavin A Rummery and Mahesan Niranjan. 1994. On-line Q-learning using connectionist systems. Vol. 37. University of Cambridge.
[54]
Zahra Sadri, Le Gruenwald, and Eleazar Leal. 2020. Online Index Selection Using Deep Reinforcement Learning for a Cluster Database. In ICDE Workshops.
[55]
Rainer Schlosser, Jan Kossmann, and Martin Boissier. 2019. Efficient Scalable Multi-attribute Index Selection Using Recursive Strategies. In ICDE. 1238--1249.
[56]
Karl Schnaitter, Neoklis Polyzotis, and Lise Getoor. 2009. Index Interactions in Physical Design Tuning: Modeling, Analysis, and Applications. Proc. VLDB Endow. 2, 1 (2009), 1234--1245.
[57]
Ankur Sharma, Felix Martin Schuhknecht, and Jens Dittrich. 2018. The Case for Automatic Database Administration using Deep Reinforcement Learning. CoRR abs/1801.05643 (2018).
[58]
Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An introduction. MIT press.
[59]
Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, and Joseph Antonakakis. 2019. SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning. In SIGMOD. 1153--1170.
[60]
Gary Valentin, Michael Zuliani, Daniel C. Zilio, Guy M. Lohman, and Alan Skelley. 2000. DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes. In ICDE. 101--110.
[61]
Christopher J. C. H. Watkins and Peter Dayan. 1992. Technical Note Q-Learning. Mach. Learn. 8 (1992), 279--292.
[62]
Kyu-Young Whang. 1985. Index Selection in Relational Databases. In Foundations of Data Organization. 487--500.
[63]
Wentao Wu, Yun Chi, Shenghuo Zhu, Jun'ichi Tatemura, Hakan Hacigümüs, and Jeffrey F. Naughton. 2013. Predicting query execution time: Are optimizer cost models really unusable?. In ICDE. 1081--1092.
[64]
Wentao Wu, Chi Wang, Tarique Siddiqui, Junxiong Wang, Vivek Narasayya, Surajit Chaudhuri, and Philip A. Bernstein. 2022. Budget-aware Index Tuning with Reinforcement Learning (Extended Version). Technical Report. Microsoft Research. https://www.microsoft.com/en-us/research/people/wentwu/publications/

Cited By

View all
  • (2024)Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management SystemsProceedings of the VLDB Endowment10.14778/3681954.368203017:11(3680-3693)Online publication date: 1-Jul-2024
  • (2024)The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-ActionsProceedings of the VLDB Endowment10.14778/3681954.368200717:11(3373-3387)Online publication date: 30-Aug-2024
  • (2024)Refactoring Index Tuning Process with Benefit EstimationProceedings of the VLDB Endowment10.14778/3654621.365462217:7(1528-1541)Online publication date: 30-May-2024
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Conferences
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data
June 2022
2597 pages
ISBN:9781450392495
DOI:10.1145/3514221
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 11 June 2022

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. budget allocation
  2. index tuning
  3. reinforcement learning

Qualifiers

  • Research-article

Conference

SIGMOD/PODS '22
Sponsor:

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)180
  • Downloads (Last 6 weeks)14
Reflects downloads up to 30 Aug 2024

Other Metrics

Citations

Cited By

View all
  • (2024)Hit the Gym: Accelerating Query Execution to Efficiently Bootstrap Behavior Models for Self-Driving Database Management SystemsProceedings of the VLDB Endowment10.14778/3681954.368203017:11(3680-3693)Online publication date: 1-Jul-2024
  • (2024)The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-ActionsProceedings of the VLDB Endowment10.14778/3681954.368200717:11(3373-3387)Online publication date: 30-Aug-2024
  • (2024)Refactoring Index Tuning Process with Benefit EstimationProceedings of the VLDB Endowment10.14778/3654621.365462217:7(1528-1541)Online publication date: 30-May-2024
  • (2024)ML-Powered Index Tuning: An Overview of Recent Progress and Open ChallengesACM SIGMOD Record10.1145/3641832.364183652:4(19-30)Online publication date: 19-Jan-2024
  • (2024)Wred: Workload Reduction for Scalable Index TuningProceedings of the ACM on Management of Data10.1145/36393052:1(1-26)Online publication date: 26-Mar-2024
  • (2024)Robustness of Updatable Learning-based Index Advisors against Poisoning AttackProceedings of the ACM on Management of Data10.1145/36392652:1(1-26)Online publication date: 26-Mar-2024
  • (2024)Online Index Recommendation for Slow Queries2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00398(5294-5306)Online publication date: 13-May-2024
  • (2024)MFIX: An Efficient and Reliable Index Advisor via Multi-Fidelity Bayesian Optimization2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00331(4343-4356)Online publication date: 13-May-2024
  • (2024)TRAP: Tailored Robustness Assessment for Index Advisors via Adversarial Perturbation2024 IEEE 40th International Conference on Data Engineering (ICDE)10.1109/ICDE60146.2024.00011(42-55)Online publication date: 13-May-2024
  • (2023)ADOPT: Adaptively Optimizing Attribute Orders for Worst-Case Optimal Join Algorithms via Reinforcement LearningProceedings of the VLDB Endowment10.14778/3611479.361148916:11(2805-2817)Online publication date: 1-Jul-2023
  • Show More Cited By

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media