research-article

Open access

Wii: Dynamic Budget Reallocation In Index Tuning

Authors:

Vivek Narasayya,

Surajit ChaudhuriAuthors Info & Claims

Proceedings of the ACM on Management of Data, Volume 2, Issue 3

Article No.: 182, Pages 1 - 26

https://doi.org/10.1145/3654985

Published: 30 May 2024 Publication History

Abstract

Index tuning aims to find the optimal index configuration for an input workload. It is often a time-consuming and resource-intensive process, largely attributed to the huge amount of "what-if" calls made to the query optimizer during configuration enumeration. Therefore, in practice it is desirable to set a budget constraint that limits the number of what-if calls allowed. This yields a new problem of budget allocation, namely, deciding on which query-configuration pairs (QCP's) to issue what-if calls. Unfortunately, optimal budget allocation is NP-hard, and budget allocation decisions made by existing solutions can be inferior. In particular, many of the what-if calls allocated by using existing solutions are devoted to QCP's whose what-if costs can be approximated by using cost derivation, a well-known technique that is computationally much more efficient and has been adopted by commercial index tuning software. This results in considerable waste of the budget, as these what-if calls are unnecessary. In this paper, we propose "Wii," a lightweight mechanism that aims to avoid such spurious what-if calls. It can be seamlessly integrated with existing configuration enumeration algorithms. Experimental evaluation on top of both standard industrial benchmarks and real workloads demonstrates that Wii can eliminate significant number of spurious what-if calls. Moreover, by reallocating the saved budget to QCP's where cost derivation is less accurate, existing algorithms can be significantly improved in terms of the final configuration found.

References

[1]

2023. DTA utility. https://docs.microsoft.com/en-us/sql/tools/dta/dta-utility?view=sql-server-ver15.

[2]

Mert Akdere, Ugur cCetintemel, Matteo Riondato, Eli Upfal, and Stanley B. Zdonik. 2012. Learning-based Query Performance Modeling and Prediction. In ICDE. 390--401.

[3]

Cameron Browne, Edward Jack Powley, Daniel Whitehouse, Simon M. Lucas, Peter I. Cowling, Philipp Rohlfshagen, Stephen Tavener, Diego Perez Liebana, Spyridon Samothrakis, and Simon Colton. 2012. A Survey of Monte Carlo Tree Search Methods. IEEE Trans. Comput. Intell. AI Games, Vol. 4, 1 (2012), 1--43.

[4]

Matteo Brucato, Tarique Siddiqui, Wentao Wu, Vivek Narasayya, and Surajit Chaudhuri. 2024. Wred: Workload Reduction for Scalable Index Tuning. Proc. ACM Manag. Data, Vol. 2, 1, Article 50 (2024), 26 pages.

Digital Library

[5]

Nicolas Bruno and Surajit Chaudhuri. 2005. Automatic Physical Database Tuning: A Relaxation-based Approach. In SIGMOD. 227--238.

[6]

Surajit Chaudhuri, Mayur Datar, and Vivek R. Narasayya. 2004. Index Selection for Databases: A Hardness Study and a Principled Heuristic Solution. IEEE Trans. Knowl. Data Eng., Vol. 16, 11 (2004), 1313--1323.

Digital Library

[7]

Surajit Chaudhuri and Vivek Narasayya. 2020. Anytime Algorithm of Database Tuning Advisor for Microsoft SQL Server.

[8]

Surajit Chaudhuri and Vivek R. Narasayya. 1997. An Efficient Cost-Driven Index Selection Tool for Microsoft SQL Server. In VLDB. 146--155.

Digital Library

[9]

Surajit Chaudhuri and Vivek R. Narasayya. 1998. AutoAdmin 'What-if' Index Analysis Utility. In SIGMOD. 367--378.

[10]

Sunil Choenni, Henk M. Blanken, and Thiel Chang. 1993. On the Selection of Secondary Indices in Relational Databases. Data Knowl. Eng., Vol. 11, 3 (1993).

Digital Library

[11]

Douglas Comer. 1978. The Difficulty of Optimum Index Selection. ACM Trans. Database Syst., Vol. 3, 4 (1978), 440--445.

Digital Library

[12]

Debabrata Dash, Neoklis Polyzotis, and Anastasia Ailamaki. 2011. CoPhy: A Scalable, Portable, and Interactive Index Advisor for Large Workloads. Proc. VLDB Endow., Vol. 4, 6 (2011), 362--372.

Digital Library

[13]

Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD. 1241--1258.

Digital Library

[14]

Archana Ganapathi, Harumi A. Kuno, Umeshwar Dayal, Janet L. Wiener, Armando Fox, Michael I. Jordan, and David A. Patterson. 2009. Predicting Multiple Metrics for Queries: Better Decisions Enabled by Machine Learning. In ICDE.

[15]

Himanshu Gupta, Venky Harinarayan, Anand Rajaraman, and Jeffrey D. Ullman. 1997. Index Selection for OLAP. In ICDE. 208--219.

Digital Library

[16]

Benjamin Hilprecht and Carsten Binnig. 2022. Zero-Shot Cost Models for Out-of-the-box Learned Cost Prediction. Proc. VLDB Endow., Vol. 15, 11 (2022), 2361--2374.

Digital Library

[17]

Andrew Kane. 2017. Introducing Dexter, the Automatic Indexer for Postgres. https://medium.com/@ankane/introducing-dexter-the-automatic-indexer-for-postgres-5f8fa8b28f27.

[18]

Levente Kocsis and Csaba Szepesvá ri. 2006. Bandit Based Monte-Carlo Planning. In ECML. 282--293.

[19]

Jan Kossmann, Stefan Halfpap, Marcel Jankrift, and Rainer Schlosser. 2020. Magic mirror in my hand, which is the best in the land? An Experimental Evaluation of Index Selection Algorithms. Proc. VLDB Endow., Vol. 13, 11 (2020), 2382--2395.

Digital Library

[20]

Jan Kossmann, Alexander Kastius, and Rainer Schlosser. 2022. SWIRL: Selection of Workload-aware Indexes using Reinforcement Learning. In EDBT. 2:155--2:168.

[21]

Hai Lan, Zhifeng Bao, and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM. 2105--2108.

[22]

Viktor Leis, Andrey Gubichev, Atanas Mirchev, Peter A. Boncz, Alfons Kemper, and Thomas Neumann. 2015. How Good Are Query Optimizers, Really? Proc. VLDB Endow., Vol. 9, 3 (2015), 204--215.

Digital Library

[23]

Jiexing Li, Arnd Christian Kö nig, Vivek R. Narasayya, and Surajit Chaudhuri. 2012. Robust Estimation of Resource Consumption for SQL Queries using Statistical Techniques. Proc. VLDB Endow., Vol. 5, 11 (2012), 1555--1566.

Digital Library

[24]

Ryan C. Marcus, Parimarjan Negi, Hongzi Mao, Chi Zhang, Mohammad Alizadeh, Tim Kraska, Olga Papaemmanouil, and Nesime Tatbul. 2019. Neo: A Learned Query Optimizer. Proc. VLDB Endow., Vol. 12, 11 (2019), 1705--1718.

Digital Library

[25]

Ryan C. Marcus and Olga Papaemmanouil. 2019. Plan-Structured Deep Neural Network Models for Query Performance Prediction. Proc. VLDB Endow., Vol. 12, 11 (2019), 1733--1746.

Digital Library

[26]

Stratos Papadomanolakis, Debabrata Dash, and Anastassia Ailamaki. 2007. Efficient Use of the Query Optimizer for Automated Database Design. ACM.

[27]

Debjyoti Paul, Jie Cao, Feifei Li, and Vivek Srikumar. 2021. Database Workload Characterization with Query Plan Encoders. Proc. VLDB Endow., Vol. 15, 4 (2021), 923--935.

Digital Library

[28]

R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, and Renata Borovica-Gajic. 2021. DBA bandits: Self-driving index tuning under ad-hoc, analytical workloads with safety guarantees. In ICDE. IEEE, 600--611.

[29]

R. Malinga Perera, Bastian Oetomo, Benjamin I. P. Rubinstein, and Renata Borovica-Gajic. 2022. HMAB: Self-Driving Hierarchy of Bandits for Integrated Physical Database Design Tuning. Proc. VLDB Endow., Vol. 16, 2 (2022), 216--229.

Digital Library

[30]

Rainer Schlosser, Jan Kossmann, and Martin Boissier. 2019. Efficient Scalable Multi-attribute Index Selection Using Recursive Strategies. In ICDE. 1238--1249.

[31]

Karl Schnaitter, Neoklis Polyzotis, and Lise Getoor. 2009. Index Interactions in Physical Design Tuning: Modeling, Analysis, and Applications. Proc. VLDB Endow., Vol. 2, 1 (2009), 1234--1245.

Digital Library

[32]

Ankur Sharma, Felix Martin Schuhknecht, and Jens Dittrich. 2018. The Case for Automatic Database Administration using Deep Reinforcement Learning. CoRR, Vol. abs/1801.05643 (2018).

[33]

Jiachen Shi, Gao Cong, and Xiaoli Li. 2022. Learned Index Benefits: Machine Learning Based Index Performance Estimation. Proc. VLDB Endow., Vol. 15, 13 (2022), 3950--3962.

Digital Library

[34]

Tarique Siddiqui, Alekh Jindal, Shi Qiao, Hiren Patel, and Wangchao Le. 2020. Cost Models for Big Data Query Processing: Learning, Retrofitting, and Our Findings. In SIGMOD. ACM, 99--113.

[35]

Tarique Siddiqui, Saehan Jo, Wentao Wu, Chi Wang, Vivek R. Narasayya, and Surajit Chaudhuri. 2022a. ISUM: Efficiently Compressing Large and Complex Workloads for Scalable Index Tuning. In SIGMOD. ACM, 660--673.

[36]

Tarique Siddiqui and Wentao Wu. 2023. ML-Powered Index Tuning: An Overview of Recent Progress and Open Challenges. SIGMOD Rec., Vol. 52, 4 (2023), 19--30.

Digital Library

[37]

Tarique Siddiqui, Wentao Wu, Vivek R. Narasayya, and Surajit Chaudhuri. 2022b. DISTILL: Low-Overhead Data-Driven Techniques for Filtering and Costing Indexes for Scalable Index Tuning. Proc. VLDB Endow., Vol. 15, 10 (2022), 2019--2031.

Digital Library

[38]

Ji Sun and Guoliang Li. 2019. An End-to-End Learning-based Cost Estimator. Proc. VLDB Endow., Vol. 13, 3 (2019), 307--319.

Digital Library

[39]

Richard S Sutton and Andrew G Barto. 2018. Reinforcement learning: An introduction. MIT press.

Digital Library

[40]

Gary Valentin, Michael Zuliani, Daniel C. Zilio, Guy M. Lohman, and Alan Skelley. 2000. DB2 Advisor: An Optimizer Smart Enough to Recommend Its Own Indexes. In ICDE. 101--110.

[41]

Xiaoying Wang, Wentao Wu, Chi Wang, Vivek Narasayya, and Surajit Chaudhuri. 2024. Wii: Dynamic Budget Reallocation In Index Tuning (Extended Version). Technical Report. Microsoft Research. https://www.microsoft.com/en-us/research/people/wentwu/publications/

[42]

Kyu-Young Whang. 1985. Index Selection in Relational Databases. In Foundations of Data Organization. 487--500.

[43]

Wentao Wu, Yun Chi, Hakan Hacigü mü s, and Jeffrey F. Naughton. 2013a. Towards Predicting Query Execution Time for Concurrent and Dynamic Database Workloads. Proc. VLDB Endow., Vol. 6, 10 (2013), 925--936.

Digital Library

[44]

Wentao Wu, Yun Chi, Shenghuo Zhu, Jun'ichi Tatemura, Hakan Hacigümüs, and Jeffrey F. Naughton. 2013b. Predicting query execution time: Are optimizer cost models really unusable?. In ICDE. 1081--1092.

[45]

Wentao Wu, Jeffrey F. Naughton, and Harneet Singh. 2016. Sampling-Based Query Re-Optimization. In SIGMOD. ACM, 1721--1736.

[46]

Wentao Wu, Chi Wang, Tarique Siddiqui, Junxiong Wang, Vivek R. Narasayya, Surajit Chaudhuri, and Philip A. Bernstein. 2022. Budget-aware Index Tuning with Reinforcement Learning. In SIGMOD. ACM, 1528--1541.

[47]

Wentao Wu, Xi Wu, Hakan Hacigü mü s, and Jeffrey F. Naughton. 2014. Uncertainty Aware Query Execution Time Prediction. Proc. VLDB Endow., Vol. 7, 14 (2014), 1857--1868.

Digital Library

[48]

Yue Zhao, Gao Cong, Jiachen Shi, and Chunyan Miao. 2022. QueryFormer: A Tree Transformer Model for Query Plan Representation. Proc. VLDB Endow., Vol. 15, 8 (2022), 1658--1670.

Digital Library

Index Terms

Wii: Dynamic Budget Reallocation In Index Tuning
1. Information systems
  1. Data management systems
    1. Database administration
      1. Autonomous database administration
    2. Database management system engines
      1. Database query processing
        Query optimization

Recommendations

Budget-aware Index Tuning with Reinforcement Learning
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

Index tuning aims to find the optimal index configuration for an input workload. It is a resource-intensive task since it requires making multiple expensive "what-if" calls to the query optimizer to estimate the cost of a query given an index ...
ISUM: Efficiently Compressing Large and Complex Workloads for Scalable Index Tuning
SIGMOD '22: Proceedings of the 2022 International Conference on Management of Data

Today's database systems include index advisors that recommend an appropriate set of indexes for an input workload. Since index tuning on large and complex workloads can be resource-intensive and time-consuming, workload compression techniques have been ...
Continuous online index tuning in moving object databases

In a Moving Object Database (MOD), the dataset, for example, the location of objects and their distribution, and the workload change frequently. Traditional static indexes are not able to cope well with such changes, that is, their effectiveness and ...

Comments

Information & Contributors

Information

Published In

cover image Proceedings of the ACM on Management of Data

Proceedings of the ACM on Management of Data Volume 2, Issue 3

SIGMOD

June 2024

1953 pages

EISSN:2836-6573

DOI:10.1145/3670010

Editor:
Divyakant Agrawal
UC Santa Barbara, United States

Issue’s Table of Contents

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 30 May 2024

Published in PACMMOD Volume 2, Issue 3

Permissions

Request permissions for this article.

Request Permissions

Author Tags

Qualifiers

Research-article

Contributors

Other Metrics

View Article Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

0
Total Citations
98
Total Downloads

Downloads (Last 12 months)98
Downloads (Last 6 weeks)61

Reflects downloads up to

Other Metrics

View Author Metrics

Citations

View Options

View options

PDF

View or Download as a PDF file.

eReader

View online with eReader.

Get Access

Login options

Check if you have access through your login credentials or your institution to get full access on this article.

Full Access

Get this Article

Media

Figures

Other

Tables

View Issue’s Table of Contents