research-article

A self-stabilizing and auto-provisioning orchestration for microservices in edge-cloud continuum

Published: 02 July 2024
Abstract

    Modern user-facing services are progressively evolving from large monolithic applications to complex graphs of loosely-coupled microservices. While this shift provides opportunities to offload some microservices of a user-facing service to edge devices that are close to the end users, it also complicates the application deployment and resource provisioning in the edge-cloud continuum, due to complex relationships across microservices and unstable public networks. To reduce resource wastage and improve user experience, this paper presents SMO, a self-managed orchestration system for microservices. SMO leverages a self-stabilizing placement mechanism to optimally deploy microservices through perceiving both interferences and communication overheads. During runtime, it further tailors a multi-agent deep deterministic policy gradient (MADDPG)-based model in combination with attention and prioritized replay, which automatically provisions appropriate resources for each microservice subject to the differentiated tail latency Service Level Objectives (SLOs) of multiple user workloads. Experimental results demonstrate that SMO saves up to 39% of CPU cores and 47% of memory footprint while providing guarantees for heterogeneous tail latency SLOs.
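The abstract names prioritized replay as one ingredient of the MADDPG-based provisioning model but does not say how it is configured. As a rough illustration only, the sketch below shows the proportional variant of prioritized experience replay in its commonly described form: transition i is drawn with probability p_i^alpha / sum_k p_k^alpha and receives an importance-sampling weight (N * P(i))^(-beta), normalized by the largest possible weight. The function name `sample_prioritized`, the default values of `alpha` and `beta`, and the use of plain Python lists are assumptions made for this example and are not taken from SMO.

```python
import random


def sample_prioritized(priorities, batch_size, alpha=0.6, beta=0.4, rng=None):
    """Sample replay-buffer indices in proportion to priority ** alpha.

    Hypothetical helper for illustration; not SMO's implementation.
    Assumes every priority is strictly positive.
    Returns (indices, importance_weights, sampling_probabilities).
    """
    rng = rng or random.Random()
    n = len(priorities)

    # Sampling probability of each stored transition.
    scaled = [p ** alpha for p in priorities]
    total = sum(scaled)
    probs = [s / total for s in scaled]

    # Draw a batch with replacement according to those probabilities.
    indices = rng.choices(range(n), weights=probs, k=batch_size)

    # Importance-sampling weights correct for the non-uniform draw.
    # Dividing by the largest possible weight keeps them in (0, 1].
    max_weight = (n * min(probs)) ** (-beta)
    weights = [((n * probs[i]) ** (-beta)) / max_weight for i in indices]
    return indices, weights, probs


if __name__ == "__main__":
    # Transitions with larger (e.g. TD-error based) priority are drawn more often.
    idx, w, p = sample_prioritized([4.0, 1.0, 1.0, 2.0], batch_size=5,
                                   rng=random.Random(0))
    print(idx, [round(x, 3) for x in w])
```

In a training loop, the returned weights would typically scale each sampled transition's loss so that frequently replayed transitions do not bias the update.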


    Published In

    Computer Networks: The International Journal of Computer and Telecommunications Networking  Volume 242, Issue C
    Apr 2024
    489 pages

    Publisher

    Elsevier North-Holland, Inc.

    United States

    Author Tags

    1. Edge-cloud continuum
    2. Microservice deployment
    3. Resource management
    4. Reinforcement learning
    5. Tail latency

    Qualifiers

    • Research-article
