Doppler: automated SKU recommendation in migrating SQL workloads to the cloud

Published: 01 August 2022

Abstract

Selecting the optimal cloud target to migrate SQL estates from on-premises to the cloud remains a challenge. Current solutions are not only time-consuming and error-prone, requiring significant user input, but also fail to provide appropriate recommendations. We present Doppler, a scalable recommendation engine that provides right-sized Azure SQL Platform-as-a-Service (PaaS) recommendations without requiring access to sensitive customer data and queries. Doppler introduces a novel price-performance methodology that allows customers to get a personalized rank of relevant cloud targets solely based on low-level resource statistics, such as latency and memory usage. Doppler supplements this rank with internal knowledge of Azure customer behavior to help guide new migration customers towards one optimal target. Experimental results over a 9-month period from prospective and existing customers indicate that Doppler can identify optimal targets and adapt to changes in customer workloads. It has also found cost-saving opportunities among over-provisioned cloud customers, without compromising on capacity or other requirements. Doppler has been integrated and released in the Azure Data Migration Assistant v5.5, which receives hundreds of assessment requests daily.

Published In

Proceedings of the VLDB Endowment  Volume 15, Issue 12
August 2022
551 pages
ISSN:2150-8097

Publisher

VLDB Endowment

Qualifiers

  • Research-article

Cited By

  • (2024) The Holon Approach for Simultaneously Tuning Multiple Components in a Self-Driving Database Management System with Machine Learning via Synthesized Proto-Actions. Proceedings of the VLDB Endowment 17:11 (3373-3387). DOI: 10.14778/3681954.3682007. Online publication date: 30-Aug-2024
  • (2024) Lorentz: Learned SKU Recommendation Using Profile Data. Proceedings of the ACM on Management of Data 2:3 (1-25). DOI: 10.1145/3654952. Online publication date: 30-May-2024
  • (2024) DoppelGanger++: Towards Fast Dependency Graph Generation for Database Replay. Proceedings of the ACM on Management of Data 2:1 (1-26). DOI: 10.1145/3639322. Online publication date: 26-Mar-2024
  • (2024) Vertically Autoscaling Monolithic Applications with CaaSPER: Scalable Container-as-a-Service Performance Enhanced Resizing Algorithm for the Cloud. Companion of the 2024 International Conference on Management of Data (241-254). DOI: 10.1145/3626246.3653378. Online publication date: 9-Jun-2024
  • (2024) Proactive Resume and Pause of Resources for Microsoft Azure SQL Database Serverless. Companion of the 2024 International Conference on Management of Data (227-240). DOI: 10.1145/3626246.3653371. Online publication date: 9-Jun-2024
  • (2023) Towards Building Autonomous Data Services on Azure. Companion of the 2023 International Conference on Management of Data (217-224). DOI: 10.1145/3555041.3589674. Online publication date: 4-Jun-2023