
Towards predicting query execution time for concurrent and dynamic database workloads

Published: 01 August 2013

Abstract

Predicting query execution time is crucial for many database management tasks, including admission control, query scheduling, and progress monitoring. While a number of recent papers have explored this problem, the bulk of the existing work considers either prediction for a single query or prediction for a static workload of concurrent queries, where by "static" we mean that the queries to be run are fixed and known. In this paper, we consider the more general problem of dynamic concurrent workloads. Unlike most previous work on query execution time prediction, our proposed framework is based on analytic modeling rather than machine learning. We first use the optimizer's cost model to estimate the I/O and CPU requirements for each pipeline of each query in isolation, and then use a combination of a queueing model and a buffer pool model that merges the I/O and CPU requests from concurrent queries to predict running times. We compare the proposed approach with a machine-learning based approach that is a variant of previous work. Our experiments show that our analytic-model based approach can lead to competitive and often better prediction accuracy than its machine-learning based counterpart.
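To make the idea of merging CPU and I/O requests through a queueing model more concrete, the following is a minimal sketch of exact mean-value analysis for a single-class closed queueing network. It is an illustration under stated assumptions, not the model used in the paper: the abstract does not specify the queueing formulation, and the function name `mva`, the two-station setup (one CPU, one disk), and the service demand values are all hypothetical. The buffer pool model is not covered here.

```python
from typing import List, Tuple


def mva(demands: List[float], n_queries: int,
        think_time: float = 0.0) -> Tuple[float, float, List[float]]:
    """Exact mean-value analysis for a single-class closed network.

    demands    -- per-station service demand of one query, in seconds
                  (assumed here to come from isolated cost estimates)
    n_queries  -- number of queries running concurrently
    think_time -- delay between a query finishing and the next starting

    Returns (response time, throughput, mean queue length per station).
    """
    if n_queries < 1:
        raise ValueError("n_queries must be at least 1")

    # With zero customers, every station is empty.
    queue = [0.0] * len(demands)
    response = 0.0
    throughput = 0.0

    # Add one query at a time and update the station statistics.
    for n in range(1, n_queries + 1):
        # An arriving query waits behind the mean queue seen with n-1 queries.
        residence = [d * (1.0 + q) for d, q in zip(demands, queue)]
        response = sum(residence)
        # Little's law applied to the whole network.
        throughput = n / (think_time + response)
        # Little's law applied to each station.
        queue = [throughput * r for r in residence]

    return response, throughput, queue


if __name__ == "__main__":
    # Hypothetical demands: 20 ms of CPU and 30 ms of disk per query.
    for level in (1, 2, 4):
        r, x, _ = mva([0.02, 0.03], level)
        print(f"concurrency={level} response={r:.4f}s throughput={x:.2f}/s")
```

In this sketch the predicted response time grows with the concurrency level because each query queues behind the others at the CPU and the disk, which is the kind of interaction a prediction made for a query in isolation cannot capture.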

Published In

Proceedings of the VLDB Endowment, Volume 6, Issue 10
August 2013
180 pages

Publisher

VLDB Endowment
