A Theory of Auto-Scaling for Resource Reservation in Cloud Services

Published: 05 March 2021

Abstract

We consider a distributed server system consisting of a large number of servers, each with limited capacity on multiple resources (CPU, memory, disk, etc.). Jobs with different rewards arrive over time and require certain amounts of resources for the duration of their service. When a job arrives, the system must decide whether to admit or reject it and, if it is admitted, on which server to schedule it. The objective is to maximize the expected total reward received by the system. This problem is motivated by the control of cloud computing clusters, in which jobs are requests for Virtual Machines (VMs) or Containers that reserve resources for various services, and rewards represent the service priority of requests or the price paid per time unit of service by clients. We study this problem in an asymptotic regime where the number of servers and the jobs' arrival rates scale by a factor L, as L becomes large. We propose a resource reservation policy that asymptotically achieves at least 1/2 and, under a certain monotone property on the jobs' rewards and resources, at least 1 − 1/e of the optimal expected reward. The policy automatically scales the number of VM slots for each job type as the demand changes, and decides in advance on which servers the slots should be created, without knowledge of the traffic rates. It effectively tracks a low-complexity greedy packing of the existing jobs in the system while maintaining only a small number, g(L) = ω(log L), of reserved VM slots for high-priority jobs that pack well.
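
To make the admit-or-reject decision in the abstract concrete, the following is a minimal sketch of a plain first-fit admission rule on servers with two resources. It is not the reservation policy proposed in the paper: it ignores job departures, service durations, and reserved VM slots, and every name in it (`Server`, `admit_first_fit`, the capacities, the arrival list) is a hypothetical choice made for illustration only.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Sequence, Tuple


@dataclass
class Server:
    """A server with a fixed capacity per resource (hypothetical model)."""
    capacity: Tuple[float, ...]                 # e.g. (CPU cores, memory in GB)
    used: List[float] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start empty unless a usage vector was supplied explicitly.
        if not self.used:
            self.used = [0.0] * len(self.capacity)

    def fits(self, demand: Sequence[float]) -> bool:
        # A job fits only if every resource stays within capacity.
        return all(u + d <= c for u, d, c in zip(self.used, demand, self.capacity))

    def place(self, demand: Sequence[float]) -> None:
        self.used = [u + d for u, d in zip(self.used, demand)]


def admit_first_fit(servers: List[Server], demand: Sequence[float]) -> Optional[int]:
    """Admit the job on the first server where it fits; return None to reject."""
    for index, server in enumerate(servers):
        if server.fits(demand):
            server.place(demand)
            return index
    return None


# Illustrative data (assumed): two identical servers, four arriving jobs,
# each given as (reward, (cpu_demand, memory_demand)).
servers = [Server((4.0, 8.0)), Server((4.0, 8.0))]
arrivals = [
    (3.0, (2.0, 4.0)),
    (1.0, (3.0, 2.0)),
    (5.0, (2.0, 4.0)),
    (2.0, (2.0, 1.0)),
]

total_reward = 0.0
placements: List[Optional[int]] = []
for reward, demand in arrivals:
    chosen = admit_first_fit(servers, demand)
    placements.append(chosen)
    if chosen is not None:
        total_reward += reward      # reward collected only for admitted jobs

print(placements)    # [0, 1, 0, None]
print(total_reward)  # 9.0
```

In this toy run the last job is rejected because neither server has two CPU units left, even though their combined free CPU would suffice. Packing effects of this kind are one reason the choice of server matters, and not only the admission decision.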

Published In

ACM SIGMETRICS Performance Evaluation Review, Volume 48, Issue 3
December 2020
140 pages
ISSN: 0163-5999
DOI: 10.1145/3453953
Permission to make digital or hard copies of part or all of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for third-party components of this work must be honored. For all other uses, contact the Owner/Author.

Publisher

Association for Computing Machinery

New York, NY, United States
