Location via proxy:   [ UP ]  
[Report a bug]   [Manage cookies]                
skip to main content
10.1145/3485447.3511991acmconferencesArticle/Chapter ViewAbstractPublication PageswebconfConference Proceedingsconference-collections
research-article

A Sampling-based Learning Framework for Big Databases

Published: 25 April 2022 Publication History
  • Get Citation Alerts
  • Abstract

    The autonomous database of the next generation aims to apply the reinforcement learning (RL) on tasks like query optimization and performance tuning with little or no human DBAs’ intervention. Despite the promise, to obtain a decent policy model in the domain of database optimization is still challenging — primarily due to the inherent computational overhead involved in the data hungry RL frameworks — in particular on large databases. In the line of mitigating this adverse effect, we propose Mirror in this work. The core to Mirror is a sampling process built in an RL framework together with a transferring process of the policy model from the sampled database to its original counterpart. While being conceptually simple, we identify that the policy transfer between databases involves heavy noise and prediction drifting that cannot be neglectable. Thereby we build a theoretical-guided sampling algorithm in Mirror assisted by a continuous fine-tuning module. The experiments on the PostgreSQL and an industry database PolarDB validate that Mirror has effectively reduced the computational cost while maintaining a satisfactory performance.

    References

    [1]
    David Andre, Nir Friedman, and Ronald Parr. 1997. Generalized Prioritized Sweeping. In NIPS. 1001–1007.
    [2]
    Joy Arulraj, Ran Xian, Lin Ma, and Andrew Pavlo. 2019. Predictive Indexing. CoRR abs/1901.07064(2019).
    [3]
    Renata Borovica, Ioannis Alagiannis, and Anastasia Ailamaki. 2012. Automated physical designers: what you see is (not) what you get. In Proceedings of the Fifth International Workshop on Testing Database Systems, DBTest 2012, Scottsdale, AZ, USA, May 21, 2012. 9.
    [4]
    Yu Chen and Ke Yi. 2017. Two-Level Sampling for Join Size Estimation. In SIGMOD. 759–774.
    [5]
    Sudipto Das, Miroslav Grbic, Igor Ilic, Isidora Jovandic, Andrija Jovanovic, Vivek R. Narasayya, Miodrag Radulovic, Maja Stikic, Gaoxiang Xu, and Surajit Chaudhuri. 2019. Automatically Indexing Millions of Databases in Microsoft Azure SQL Database. In SIGMOD. 666–679.
    [6]
    Bailu Ding, Sudipto Das, Ryan Marcus, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2019. AI Meets AI: Leveraging Query Executions to Improve Index Recommendations. In SIGMOD, Peter A. Boncz, Stefan Manegold, Anastasia Ailamaki, Amol Deshpande, and Tim Kraska (Eds.). ACM, 1241–1258.
    [7]
    Bailu Ding, Sudipto Das, Wentao Wu, Surajit Chaudhuri, and Vivek R. Narasayya. 2018. Plan Stitch: Harnessing the Best of Many Plans. PVLDB 11, 10 (2018), 1123–1136.
    [8]
    Adam Dziedzic, Jingjing Wang, Sudipto Das, Bolin Ding, Vivek R. Narasayya, and Manoj Syamala. 2018. Columnstore and B+ tree - Are Hybrid Physical Designs Important?. In SIGMOD. 177–190.
    [9]
    Peter J. Haas and Joseph M. Hellerstein. 1999. Ripple Joins for Online Aggregation. In SIGMOD. 287–298.
    [10]
    Geoffrey E. Hinton, Oriol Vinyals, and Jeffrey Dean. 2015. Distilling the Knowledge in a Neural Network. CoRR abs/1503.02531(2015). arxiv:1503.02531http://arxiv.org/abs/1503.02531
    [11]
    Andreas Kipf, Thomas Kipf, Bernhard Radke, Viktor Leis, Peter A. Boncz, and Alfons Kemper. 2019. Learned Cardinalities: Estimating Correlated Joins with Deep Learning. In CIDR.
    [12]
    Tim Kraska, Alex Beutel, Ed H. Chi, Jeffrey Dean, and Neoklis Polyzotis. 2018. The Case for Learned Index Structures. In SIGMOD, Gautam Das, Christopher M. Jermaine, and Philip A. Bernstein (Eds.). ACM, 489–504.
    [13]
    Sanjay Krishnan, Zongheng Yang, Ken Goldberg, Joseph M. Hellerstein, and Ion Stoica. 2018. Learning to Optimize Join Queries With Deep Reinforcement Learning. CoRR abs/1808.03196(2018).
    [14]
    Hai Lan, Zhifeng Bao, and Yuwei Peng. 2020. An Index Advisor Using Deep Reinforcement Learning. In CIKM, Mathieu d’Aquin, Stefan Dietze, Claudia Hauff, Edward Curry, and Philippe Cudré-Mauroux (Eds.). ACM, 2105–2108.
    [15]
    Guoliang Li, Xuanhe Zhou, Shifu Li, and Bo Gao. 2019. QTune: A Query-Aware Database Tuning System with Deep Reinforcement Learning. PVLDB 12, 12 (2019), 2118–2130.
    [16]
    Henry Liu, Mingbin Xu, Ziting Yu, Vincent Corvinelli, and Calisto Zuzarte. 2015. Cardinality estimation using neural networks. In CASCON. 53–59.
    [17]
    Minghua Ma, Zheng Yin, Shenglin Zhang, Sheng Wang, Christopher Zheng, Xinhao Jiang, Hanwen Hu, Cheng Luo, Yilin Li, Nengjun Qiu, Feifei Li, Changcheng Chen, and Dan Pei. 2020. Diagnosing Root Causes of Intermittent Slow Queries in Large-Scale Cloud Databases. Proc. VLDB Endow. 13, 8 (2020), 1176–1189.
    [18]
    Horia Mania, Aurelia Guy, and Benjamin Recht. 2018. Simple random search provides a competitive approach to reinforcement learning. CoRR abs/1803.07055(2018).
    [19]
    Ryan Marcus and Olga Papaemmanouil. 2018. Deep Reinforcement Learning for Join Order Enumeration. In SIGMOD. 3:1–3:4.
    [20]
    Ryan Marcus, Emily Zhang, and Tim Kraska. 2020. CDFShop: Exploring and Optimizing Learned Index Structures. In SIGMOD, David Maier, Rachel Pottinger, AnHai Doan, Wang-Chiew Tan, Abdussalam Alawini, and Hung Q. Ngo (Eds.). ACM, 2789–2792.
    [21]
    Jennifer Ortiz, Magdalena Balazinska, Johannes Gehrke, and S. Sathiya Keerthi. 2018. Learning State Representations for Query Optimization with Deep Reinforcement Learning. In SIGMOD. 4:1–4:4.
    [22]
    Cheng Peng, Canqing Zhang, Cheng Peng, and Junfeng Man. 2017. A reinforcement learning approach to map reduce auto-configuration under networked environment. IJSN 12, 3 (2017), 135–140.
    [23]
    Andrei A. Rusu, Neil C. Rabinowitz, Guillaume Desjardins, Hubert Soyer, James Kirkpatrick, Koray Kavukcuoglu, Razvan Pascanu, and Raia Hadsell. 2016. Progressive Neural Networks. CoRR abs/1606.04671(2016).
    [24]
    Tim Salimans, Jonathan Ho, Xi Chen, and Ilya Sutskever. 2017. Evolution Strategies as a Scalable Alternative to Reinforcement Learning. CoRR abs/1703.03864(2017).
    [25]
    Michael Schaarschmidt, Alexander Kuhnle, Ben Ellis, Kai Fricke, Felix Gessert, and Eiko Yoneki. 2018. LIFT: Reinforcement Learning in Computer Systems by Learning From Demonstrations. CoRR abs/1808.07903(2018).
    [26]
    Tom Schaul, John Quan, Ioannis Antonoglou, and David Silver. 2016. Prioritized Experience Replay. In ICLR.
    [27]
    Ankur Sharma, Felix Martin Schuhknecht, and Jens Dittrich. 2018. The Case for Automatic Database Administration using Deep Reinforcement Learning. CoRR abs/1801.05643(2018).
    [28]
    Ji Sun and Guoliang Li. 2019. An End-to-End Learning-based Cost Estimator. CoRR abs/1906.02560(2019).
    [29]
    Jian Tan, Tieying Zhang, Feifei Li, Jie Chen, Qixing Zheng, Ping Zhang, Honglin Qiao, Yue Shi, Wei Cao, and Rui Zhang. 2019. iBTune: Individualized Buffer Tuning for Large-scale Cloud Databases. Proc. VLDB Endow. 12, 10 (2019), 1221–1234.
    [30]
    Immanuel Trummer, Junxiong Wang, Deepak Maram, Samuel Moseley, Saehan Jo, and Joseph Antonakakis. 2019. SkinnerDB: Regret-Bounded Query Evaluation via Reinforcement Learning. In SIGMOD. 1153–1170.
    [31]
    Hado van Hasselt, Arthur Guez, and David Silver. 2016. Deep Reinforcement Learning with Double Q-Learning. In AAAI. 2094–2100.
    [32]
    TaiNing Wang and Chee-Yong Chan. 2020. Improved Correlated Sampling for Join Size Estimation. In ICDE. 325–336.
    [33]
    Tsui-Wei Weng, Pin-Yu Chen, Lam M. Nguyen, Mark S. Squillante, Ivan V. Oseledets, and Luca Daniel. 2018. PROVEN: Certifying Robustness of Neural Networks with a Probabilistic Approach. CoRR abs/1812.08329(2018).
    [34]
    Tsui-Wei Weng, Huan Zhang, Hongge Chen, Zhao Song, Cho-Jui Hsieh, Luca Daniel, Duane S. Boning, and Inderjit S. Dhillon. 2018. Towards Fast Computation of Certified Robustness for ReLU Networks. In ICML(Proceedings of Machine Learning Research, Vol. 80), Jennifer G. Dy and Andreas Krause (Eds.). PMLR, 5273–5282.
    [35]
    Sai Wu, Xinyi Yu, Gang Chen, Yusong Gao, Xiaojie Feng, and Wei Cao. 2019. Progressive Neural Index Search for Database System. CoRR abs/1912.07001(2019).
    [36]
    Feng Yu, Wen-Chi Hou, Cheng Luo, Dunren Che, and Mengxia Zhu. 2013. CS2: a new database synopsis for query estimation. In Proceedings of the ACM SIGMOD International Conference on Management of Data, SIGMOD 2013, New York, NY, USA, June 22-27, 2013. 469–480.
    [37]
    Huan Zhang, Tsui-Wei Weng, Pin-Yu Chen, Cho-Jui Hsieh, and Luca Daniel. 2018. Efficient Neural Network Robustness Certification with General Activation Functions. CoRR abs/1811.00866(2018). arxiv:1811.00866http://arxiv.org/abs/1811.00866
    [38]
    Ji Zhang, Yu Liu, Ke Zhou, Guoliang Li, Zhili Xiao, Bin Cheng, Jiashu Xing, Yangtao Wang, Tianheng Cheng, Li Liu, Minwei Ran, and Zekang Li. 2019. An End-to-End Automatic Cloud Database Tuning System Using Deep Reinforcement Learning. In SIGMOD. 415–432.
    [39]
    X. Zhou, C. Chai, G. Li, and J. SUN. 2020. Database Meets Artificial Intelligence: A Survey. IEEE Transactions on Knowledge and Data Engineering (2020), 1–1. https://doi.org/10.1109/TKDE.2020.2994641

    Index Terms

    1. A Sampling-based Learning Framework for Big Databases
          Index terms have been assigned to the content through auto-classification.

          Recommendations

          Comments

          Information & Contributors

          Information

          Published In

          cover image ACM Conferences
          WWW '22: Proceedings of the ACM Web Conference 2022
          April 2022
          3764 pages
          ISBN:9781450390965
          DOI:10.1145/3485447
          Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

          Sponsors

          Publisher

          Association for Computing Machinery

          New York, NY, United States

          Publication History

          Published: 25 April 2022

          Permissions

          Request permissions for this article.

          Check for updates

          Author Tags

          1. autonomous database
          2. database performance tuning
          3. query optimization
          4. reinforcement learning

          Qualifiers

          • Research-article
          • Research
          • Refereed limited

          Funding Sources

          • NSFC
          • Zhejiang Provincial Natural Science Foundation
          • Alibaba Group through Alibaba Innovative Research (AIR) Program
          • Fundamental Research Funds for the Central Universities of China

          Conference

          WWW '22
          Sponsor:
          WWW '22: The ACM Web Conference 2022
          April 25 - 29, 2022
          Virtual Event, Lyon, France

          Acceptance Rates

          Overall Acceptance Rate 1,899 of 8,196 submissions, 23%

          Contributors

          Other Metrics

          Bibliometrics & Citations

          Bibliometrics

          Article Metrics

          • 0
            Total Citations
          • 328
            Total Downloads
          • Downloads (Last 12 months)61
          • Downloads (Last 6 weeks)1
          Reflects downloads up to 27 Jul 2024

          Other Metrics

          Citations

          View Options

          Get Access

          Login options

          View options

          PDF

          View or Download as a PDF file.

          PDF

          eReader

          View online with eReader.

          eReader

          HTML Format

          View this article in HTML Format.

          HTML Format

          Media

          Figures

          Other

          Tables

          Share

          Share

          Share this Publication link

          Share on social media