Budget-aware Index Tuning with Reinforcement Learning

Authors:

Tarique Siddiqui,

Vivek Narasayya,

Surajit Chaudhuri,

Philip A. Bernstein

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

Pages 1528 - 1541

https://doi.org/10.1145/3514221.3526128

Published: 11 June 2022

Abstract

Index tuning aims to find the optimal index configuration for an input workload. It is a resource-intensive task because it requires many expensive "what-if" calls to the query optimizer, each of which estimates the cost of a query under a given index configuration without actually building the indexes. In this paper, we study the problem of budget-aware index tuning, where the number of what-if calls allowed while searching for the optimal configuration is constrained. This problem is challenging because it requires balancing what-if calls spent on exploring new configurations against calls spent on exploiting a known promising configuration. We formulate budget-aware index tuning as a Markov decision process and propose a solution based on Monte Carlo tree search, a classic reinforcement learning technique. Experimental evaluation on both standard industry benchmarks and real workloads shows that our solution can significantly outperform alternative budget-aware solutions in terms of the quality of the index configuration.
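To make the exploration-versus-exploitation trade-off under a what-if call budget concrete, the following is a minimal Python sketch. It is not the paper's Monte Carlo tree search algorithm: it swaps in a much simpler flat UCB1-style bandit over individual candidate indexes. The `WhatIfOracle` class, the additive toy cost model, the index names, and the parameters `k`, `budget`, and `c` are all assumptions made for illustration; a real what-if call would go to the query optimizer.

```python
import math
import random


class WhatIfOracle:
    """Hypothetical stand-in for the optimizer's what-if API; counts every call."""

    def __init__(self, base_cost, benefit):
        self.base_cost = base_cost  # assumed known cost with no new indexes
        self.benefit = benefit      # toy per-index cost reduction (assumed additive)
        self.calls = 0

    def cost(self, config):
        self.calls += 1
        return self.base_cost - sum(self.benefit[i] for i in config)


def budgeted_search(candidates, oracle, k, budget, c=0.5, seed=0):
    """Spend exactly `budget` what-if calls looking for a good k-index configuration."""
    rng = random.Random(seed)
    visits = {i: 0 for i in candidates}
    mean_reward = {i: 0.0 for i in candidates}
    best_config, best_cost = frozenset(), oracle.base_cost

    for t in range(1, budget + 1):
        def ucb(i):
            if visits[i] == 0:
                return math.inf  # untried indexes are explored first
            # Exploitation term plus an exploration bonus that shrinks with visits.
            return mean_reward[i] + c * math.sqrt(math.log(t) / visits[i])

        # Shuffle so ties are broken randomly, then keep the k highest-scoring indexes.
        order = list(candidates)
        rng.shuffle(order)
        config = frozenset(sorted(order, key=ucb, reverse=True)[:k])

        cost = oracle.cost(config)  # one what-if call taken from the budget
        reward = (oracle.base_cost - cost) / oracle.base_cost  # fractional improvement

        # Credit the observed improvement to every index in the configuration.
        for i in config:
            visits[i] += 1
            mean_reward[i] += (reward - mean_reward[i]) / visits[i]

        if cost < best_cost:
            best_config, best_cost = config, cost

    return best_config, best_cost


# Toy usage with made-up index names and benefits.
benefit = {"idx_a": 30.0, "idx_b": 5.0, "idx_c": 20.0, "idx_d": 1.0, "idx_e": 10.0}
oracle = WhatIfOracle(base_cost=100.0, benefit=benefit)
config, cost = budgeted_search(list(benefit), oracle, k=2, budget=20)
print(sorted(config), cost, oracle.calls)
```

The sketch evaluates one configuration per what-if call and never caches results, so repeated configurations waste budget; a tree-based search over configurations, as the abstract describes, is one way to spend the same budget more carefully.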

Published In

SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

June 2022

2597 pages

ISBN: 9781450392495

DOI: 10.1145/3514221

General Chair: Zachary Ives, University of Pennsylvania (USA)

Program Chairs: Angela Bonifati, Lyon 1 University (France); Amr El Abbadi, University of California, Santa Barbara (USA)

Copyright © 2022 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGMOD/PODS '22: International Conference on Management of Data

Sponsor: SIGMOD

June 12 - 17, 2022

Philadelphia, PA, USA

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
