GCNPart: Interference-Aware Resource Partitioning Framework with Graph Convolutional Neural Networks and Deep Reinforcement Learning

Chen, Ruobing; Shi, Haosen; Wu, Jinping; Li, Yusen; Liu, Xiaoguang; Wang, Gang

doi:10.1007/978-3-031-22677-9_30

Part of the book series: Lecture Notes in Computer Science (LNCS, volume 13777)

Included in the following conference series:

International Conference on Algorithms and Architectures for Parallel Processing

Abstract

Cloud server providers usually consolidate workloads to maximize server utilization. To eliminate the performance interference caused by contention for multiple shared resources, resource partitioning has become an important problem in everyday commercial server deployments. However, partitioning multiple critical resources in a coordinated way is particularly challenging because of the complex contention behaviors and the large search space that must be explored to find the optimal solution.

In this paper, we propose GCNPart, which focuses on allocating the optimal partition of contended shared resources to colocated applications in order to optimize system performance. Existing resource partitioning frameworks lack analysis and good modeling of applications, which makes them inefficient or limits their generality. We formulate the resource partitioning problem as a sequential decision problem. GCNPart builds an accurate application performance model based on graph convolutional neural networks (GCN) to learn the mapping from multiple resources to applications, and then constructs a deep reinforcement learning (DRL) model that takes temporal information into account when making real-time resource partitioning decisions. Extensive experiments show that, compared with existing resource partitioning frameworks, GCNPart improves system throughput by 5.35% to 26.57%.
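The abstract does not describe the GCN architecture, so the following is only a rough sketch of what one graph-convolution step over an application/resource graph might look like. It assumes the commonly used propagation rule ReLU(D^{-1/2}(A + I)D^{-1/2} H W); the graph layout, the node features, and the weights are all hypothetical and are not taken from the paper.

```python
import math


def normalize_adjacency(adj):
    """Return D^{-1/2} (A + I) D^{-1/2} for a square adjacency matrix."""
    n = len(adj)
    # Add self-loops so each node keeps its own features.
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]


def matmul(a, b):
    """Plain dense matrix product of two lists of lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def gcn_layer(adj, features, weights):
    """One graph-convolution layer: ReLU(A_norm @ H @ W)."""
    propagated = matmul(normalize_adjacency(adj), features)
    return [[max(0.0, v) for v in row] for row in matmul(propagated, weights)]


# Hypothetical graph: nodes 0 and 1 are colocated applications,
# nodes 2 and 3 are two shared resources. Every application is
# connected to every resource it competes for.
adjacency = [
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [1, 1, 0, 0],
]
# Hypothetical 2-dimensional node features (e.g. a usage counter and an allocation share).
node_features = [[0.8, 0.5], [0.3, 0.5], [1.0, 0.0], [0.0, 1.0]]
# Hypothetical fixed weights; a real model would learn these.
layer_weights = [[0.5, -0.2], [0.1, 0.4]]

embeddings = gcn_layer(adjacency, node_features, layer_weights)
for node, row in enumerate(embeddings):
    print(node, [round(v, 3) for v in row])
```

Each output row mixes a node's own features with those of its neighbors, which is the mechanism by which a model of this kind could relate resource allocations to application behavior. How GCNPart actually builds its graph and feeds the result to the DRL agent is not stated in the abstract.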

Acknowledgment

This work is supported by Key-Area Research and Development Program of Guangdong Province 2021B0101310002; State Key Laboratory of Computer Architecture, ICT, CAS, under Grant No. CARCHB202013; National Science Foundation of China (62141412, 61872201); Science and Technology Development Plan of Tianjin (20JCZDJC00610); Fundamental Research Funds for the Central Universities.

Author information

Authors and Affiliations

Nankai University, Tianjin, 300300, China
Ruobing Chen, Jinping Wu, Yusen Li, Xiaoguang Liu & Gang Wang
Nanyang Technological University, Singapore, Singapore
Haosen Shi

Corresponding author

Correspondence to Yusen Li.

Editor information

Editors and Affiliations

Technical University of Denmark, Kongens Lyngby, Denmark
Weizhi Meng
University of New Brunswick, Fredericton, NB, Canada
Rongxing Lu
University of Exeter, Exeter, UK
Geyong Min
Rutgers University, Newark, NJ, USA
Jaideep Vaidya

About this paper

Cite this paper

Chen, R., Shi, H., Wu, J., Li, Y., Liu, X., Wang, G. (2023). GCNPart: Interference-Aware Resource Partitioning Framework with Graph Convolutional Neural Networks and Deep Reinforcement Learning. In: Meng, W., Lu, R., Min, G., Vaidya, J. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2022. Lecture Notes in Computer Science, vol 13777. Springer, Cham. https://doi.org/10.1007/978-3-031-22677-9_30

DOI: https://doi.org/10.1007/978-3-031-22677-9_30
Published: 11 January 2023
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-22676-2
Online ISBN: 978-3-031-22677-9
eBook Packages: Computer Science, Computer Science (R0)
