Proactive Resume and Pause of Resources for Microsoft Azure SQL Database Serverless

Authors: Vaishali Jhalani, Anupriya Inumella, Sanjana Dulipeta Sridhar, Alexandru Chirica, Ajay Kalhan

SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data

Pages 227 - 240

https://doi.org/10.1145/3626246.3653371

Published: 09 June 2024

Abstract

Demand-driven resource allocation for cloud databases has become a popular research direction. Recent approaches have evolved from reactive policies to proactive decision making. These approaches use not only the current resource demand but also the predicted demand to make more informed resource allocation decisions for each database, thereby improving quality of service and reducing operational costs. We present an infrastructure that enables proactive resource allocation for millions of serverless Azure SQL databases. Our solution finds a near-optimal middle ground between high availability of resources, low operational costs, and low computational overhead of the proactive policy. We describe the design principles we followed and the architectural decisions we made during this cross-team, multi-year journey. Given the size and scope of our solution, we believe that relational cloud databases at other companies could benefit from proactive resource allocation capabilities.

Index Terms

Computer systems organization → Architectures → Other architectures → Self-organizing autonomic computing

Published In

SIGMOD/PODS '24: Companion of the 2024 International Conference on Management of Data

June 2024

694 pages

ISBN:9798400704222

DOI:10.1145/3626246

General Chairs: Pablo Barcelo (Universidad Catolica, Chile), Nayat Sanchez-Pi (INRIA Chile)

Program Chairs: Alexandra Meliou (University of Massachusetts Amherst, USA), S. Sudarshan (Indian Institute of Technology Bombay)

Copyright © 2024 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

SIGMOD: ACM Special Interest Group on Management of Data

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Conference

SIGMOD/PODS '24

Sponsor:

SIGMOD

SIGMOD/PODS '24: International Conference on Management of Data

June 9 - 15, 2024

Santiago AA, Chile

Acceptance Rates

Overall Acceptance Rate 785 of 4,003 submissions, 20%
