Abstract
Learning representations for query plans plays a pivotal role in the machine learning-based query optimizers of database management systems. To this end, the literature proposes particular model architectures that transform tree-structured query plans into representations that downstream machine learning models can learn from. However, existing research rarely compares and analyzes the query plan representation capabilities of these tree models or their direct impact on the performance of the overall optimizer. To address this problem, we perform a comparative study of how different state-of-the-art tree models affect the optimizer's cost estimation and plan selection performance on relatively complex workloads. Additionally, we explore the possibility of using graph neural networks (GNNs) for the query plan representation task. We propose BiGG, a novel tree model that employs a bidirectional GNN aggregated by gated recurrent units (GRUs), and demonstrate experimentally that BiGG significantly improves cost estimation and achieves comparatively strong plan selection performance relative to the state-of-the-art tree models.
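The idea of a bidirectional GNN whose node states are aggregated by a GRU can be illustrated with a minimal sketch. The abstract does not specify BiGG's layers, so everything below is an assumption made for illustration: the mean aggregation, the tanh update, the post-order GRU readout, the random untrained weights, and all names (`bidirectional_layer`, `gru_step`, `encode_plan`, the toy plan). It is not the authors' implementation.

```python
import math
import random


def matvec(W, x):
    """Multiply matrix W (list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]


def rand_matrix(rows, cols, rng):
    """Random, untrained weights; a real model would learn these."""
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def mean(vectors, dim):
    """Element-wise mean; a zero vector when there are no neighbours."""
    if not vectors:
        return [0.0] * dim
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def bidirectional_layer(h, children, W_self, W_up, W_down):
    """One message-passing round that uses both edge directions of the tree.

    Child-to-parent and parent-to-child messages get separate weights
    (assumed design, chosen only to show what 'bidirectional' can mean).
    """
    dim = len(W_self)
    parent = {c: p for p, cs in children.items() for c in cs}
    out = {}
    for node, vec in h.items():
        from_children = mean([h[c] for c in children.get(node, [])], dim)
        from_parent = mean([h[parent[node]]] if node in parent else [], dim)
        z = [a + b + c for a, b, c in zip(matvec(W_self, vec),
                                          matvec(W_up, from_children),
                                          matvec(W_down, from_parent))]
        out[node] = [math.tanh(v) for v in z]
    return out


def gru_step(state, x, P):
    """A standard GRU cell update (bias terms omitted for brevity)."""
    z = [sigmoid(a + b) for a, b in zip(matvec(P["Wz"], x), matvec(P["Uz"], state))]
    r = [sigmoid(a + b) for a, b in zip(matvec(P["Wr"], x), matvec(P["Ur"], state))]
    gated = [ri * si for ri, si in zip(r, state)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(P["Wh"], x), matvec(P["Uh"], gated))]
    return [(1 - zi) * si + zi * ci for zi, si, ci in zip(z, state, cand)]


def encode_plan(features, children, order, dim=4, rounds=2, seed=0):
    """Embed a plan tree: bidirectional message passing, then a GRU readout."""
    rng = random.Random(seed)
    in_dim = len(next(iter(features.values())))
    W_in = rand_matrix(dim, in_dim, rng)
    h = {n: matvec(W_in, x) for n, x in features.items()}
    for _ in range(rounds):
        W_self, W_up, W_down = (rand_matrix(dim, dim, rng) for _ in range(3))
        h = bidirectional_layer(h, children, W_self, W_up, W_down)
    P = {k: rand_matrix(dim, dim, rng) for k in ("Wz", "Uz", "Wr", "Ur", "Wh", "Uh")}
    state = [0.0] * dim
    for node in order:  # the GRU folds node states into one plan vector
        state = gru_step(state, h[node], P)
    return state


# Hypothetical toy plan: a join over two scans, with one-hot operator features.
features = {"join": [1.0, 0.0, 0.0], "scan_a": [0.0, 1.0, 0.0], "scan_b": [0.0, 0.0, 1.0]}
children = {"join": ["scan_a", "scan_b"]}
post_order = ["scan_a", "scan_b", "join"]
embedding = encode_plan(features, children, post_order)
```

In this sketch, `embedding` is a fixed-size vector for the whole plan that a downstream model, such as a cost estimator, could consume. Using separate weights for the two edge directions lets a node's state reflect both its subtree and its position under its ancestors, which is one plausible motivation for a bidirectional design.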
© 2024 The Author(s), under exclusive license to Springer Nature Switzerland AG
About this paper
Cite this paper
Chang, B., Kamali, A., Kantere, V. (2024). A Novel Technique for Query Plan Representation Based on Graph Neural Nets. In: Wrembel, R., Chiusano, S., Kotsis, G., Tjoa, A.M., Khalil, I. (eds) Big Data Analytics and Knowledge Discovery. DaWaK 2024. Lecture Notes in Computer Science, vol 14912. Springer, Cham. https://doi.org/10.1007/978-3-031-68323-7_25
DOI: https://doi.org/10.1007/978-3-031-68323-7_25
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-68322-0
Online ISBN: 978-3-031-68323-7
eBook Packages: Computer Science, Computer Science (R0)