A Theory of Auto-Scaling for Resource Reservation in Cloud Services

Authors:

Konstantinos Psychas,

Javad Ghaderi

ACM SIGMETRICS Performance Evaluation Review, Volume 48, Issue 3

Pages 27 - 32

https://doi.org/10.1145/3453953.3453958

Published: 05 March 2021

Abstract

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit or reject it and, if it is admitted, on which server to schedule it. The objective is to maximize the expected total reward received by the system. This problem is motivated by the control of cloud computing clusters, in which jobs are requests for Virtual Machines or Containers that reserve resources for various services, and rewards represent the service priority of requests or the price paid per time unit of service by clients. We study this problem in an asymptotic regime where the number of servers and the jobs' arrival rates scale by a factor L, as L becomes large. We propose a resource reservation policy that asymptotically achieves at least 1/2 and, under a certain monotone property on jobs' rewards and resources, at least 1 - 1/e of the optimal expected reward. The policy automatically scales the number of VM slots for each job type as the demand changes, and decides in advance on which servers the slots should be created, without knowledge of the traffic rates. It effectively tracks a low-complexity greedy packing of the existing jobs in the system while maintaining only a small number, g(L) = ω(log L), of reserved VM slots for high-priority jobs that pack well.
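The admit-or-reject and placement decision described in the abstract can be illustrated with a toy sketch. This is not the paper's reservation policy: it is a plain first-fit baseline with no reservation, it ignores job departures, and every name in it (`Server`, `admit_first_fit`, the two-resource capacities, the reward values) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class Server:
    """A server with a fixed capacity per resource, e.g. (CPU, memory)."""
    capacity: Tuple[float, ...]
    used: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start empty: zero usage on every resource.
        if not self.used:
            self.used = [0.0] * len(self.capacity)

    def fits(self, demand: Tuple[float, ...]) -> bool:
        # A job fits only if every resource stays within capacity.
        return all(u + d <= c for u, d, c in zip(self.used, demand, self.capacity))

    def place(self, demand: Tuple[float, ...]) -> None:
        for i, d in enumerate(demand):
            self.used[i] += d


def admit_first_fit(servers: List[Server], demand: Tuple[float, ...]) -> Optional[int]:
    """Admit the job on the first server that can hold it; return None to reject."""
    for idx, server in enumerate(servers):
        if server.fits(demand):
            server.place(demand)
            return idx
    return None


# Hypothetical workload: (resource demand, reward) pairs in arrival order.
servers = [Server((4.0, 8.0)), Server((4.0, 8.0))]
jobs = [((2.0, 4.0), 1.0), ((3.0, 2.0), 5.0), ((2.0, 4.0), 1.0), ((3.0, 6.0), 5.0)]

placements: List[Optional[int]] = []
total_reward = 0.0
for demand, reward in jobs:
    idx = admit_first_fit(servers, demand)
    placements.append(idx)
    if idx is not None:
        total_reward += reward

print(placements, total_reward)
```

In this run the last job, which carries a high reward, is rejected because low-reward jobs admitted earlier already occupy the capacity it needs. That is the kind of loss that reserving slots for high-priority job types is meant to limit.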

Published In

ACM SIGMETRICS Performance Evaluation Review Volume 48, Issue 3

December 2020

140 pages

ISSN: 0163-5999

DOI: 10.1145/3453953

Editor:
Zhenhua Liu
Stony Brook University

Copyright © 2021 Copyright is held by the owner/author(s).

Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States
