research-article

A self-stabilizing and auto-provisioning orchestration for microservices in edge-cloud continuum

Authors:

Qin Guo

Volume 242, Issue C

https://doi.org/10.1016/j.comnet.2024.110279

Published: 02 July 2024

Abstract

Modern user-facing services are evolving from large monolithic applications into complex graphs of loosely coupled microservices. This shift makes it possible to offload some microservices of a user-facing service to edge devices close to end users, but it also complicates application deployment and resource provisioning in the edge-cloud continuum, because of the complex relationships among microservices and unstable public networks. To reduce resource wastage and improve user experience, this paper presents SMO, a self-managed orchestration system for microservices. SMO uses a self-stabilizing placement mechanism that optimally deploys microservices by accounting for both interference and communication overhead. At runtime, it applies a tailored multi-agent deep deterministic policy gradient (MADDPG) model, combined with attention and prioritized replay, to automatically provision appropriate resources for each microservice subject to the differentiated tail-latency Service Level Objectives (SLOs) of multiple user workloads. Experimental results demonstrate that SMO saves up to 39% of CPU cores and 47% of memory footprint while providing guarantees for heterogeneous tail-latency SLOs.
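To make the notion of differentiated tail-latency SLOs concrete, the following minimal sketch checks, per workload, whether an observed latency percentile stays within its target. It is an illustration only and not SMO's implementation: the workload names, latency samples, percentile levels, and SLO thresholds are all hypothetical, and the nearest-rank percentile is an assumed choice.

```python
import math


def tail_latency(samples_ms, percentile):
    """Nearest-rank percentile of latency samples given in milliseconds."""
    if not samples_ms:
        raise ValueError("need at least one latency sample")
    ordered = sorted(samples_ms)
    # 1-based nearest rank; multiply before dividing to limit rounding error.
    rank = math.ceil(percentile * len(ordered) / 100)
    return ordered[max(rank, 1) - 1]


def slo_violations(latencies_by_workload, slos):
    """Return {workload: observed tail latency} for workloads missing their SLO.

    `slos` maps a workload name to a (percentile, target_ms) pair, so every
    workload can carry its own objective.
    """
    violations = {}
    for workload, (percentile, target_ms) in slos.items():
        observed = tail_latency(latencies_by_workload[workload], percentile)
        if observed > target_ms:
            violations[workload] = observed
    return violations


# Hypothetical measurements and objectives, chosen only for illustration.
latencies = {
    "checkout": [12, 15, 11, 14, 90, 13, 12, 16, 15, 14],
    "search": [30, 28, 35, 31, 29, 33, 32, 30, 34, 31],
}
slos = {"checkout": (99, 50), "search": (90, 40)}

print(slo_violations(latencies, slos))  # {'checkout': 90}
```

A provisioner, whether learned or rule-based, could use a check of this kind as a feedback signal, for example by granting additional CPU or memory only to the microservices serving a workload that misses its objective.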

Information & Contributors

Information

Published In

Computer Networks: The International Journal of Computer and Telecommunications Networking Volume 242, Issue C

Apr 2024

489 pages

ISSN:1389-1286

Elsevier B.V.

Publisher

Elsevier North-Holland, Inc.

United States

Publication History

Published: 02 July 2024

Affiliations

Binlei Cai

Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250103, China

Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250103, China

Jinan Institute of Supercomputing Technology, Jinan 250103, China

Xiaoli Wang

Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250103, China

Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250103, China

Bin Wang

Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250103, China

Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250103, China

Meihong Yang

Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250103, China

Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250103, China

Ying Guo

Key Laboratory of Computing Power Network and Information Security, Ministry of Education, Shandong Computer Science Center (National Supercomputer Center in Jinan), Qilu University of Technology (Shandong Academy of Sciences), Jinan 250103, China

Shandong Provincial Key Laboratory of Computer Networks, Shandong Fundamental Research Center for Computer Science, Jinan 250103, China

Jinan Institute of Supercomputing Technology, Jinan 250103, China

Qin Guo

School of Science, Shandong Jianzhu University, Jinan 250101, China
