research-article

CPS: A Cooperative Para-virtualized Scheduling Framework for Manycore Machines

Published: 07 February 2024

Abstract

Today's cloud platforms offer large virtual machine (VM) instances with many virtual CPUs (vCPUs) on manycore machines. These machines typically have a deep memory hierarchy to enhance communication between cores. Previous research has addressed the performance scalability issues caused by the double scheduling problem in virtualized environments, but it has mainly concentrated on the preemption problem of synchronization primitives and on the traditional NUMA architecture. This paper targets a new class of scalability issues caused by the absence of runtime hypervisor-internal states (RHS) in the guest. We demonstrate two typical RHS problems, namely the invisible pCPU (physical CPU) load and dynamic cache group mapping. These RHS problems cause VM performance to collapse and CPU utilization to drop because the guest VM lacks visibility into the latest runtime internal states maintained by the hypervisor, such as pCPU load and vCPU-pCPU mappings. Consequently, the guest VM makes inefficient scheduling decisions.
To address the RHS issue, we argue that the guest and host schedulers should expose their latest scheduling decisions to each other. Hence, we present a cooperative para-virtualized scheduling framework called CPS, which facilitates the proactive exchange of timely scheduling information between the hypervisor and guest VMs. To ensure effective scheduling decisions for VMs, we propose a series of techniques based on the exchanged information. We have implemented CPS in Linux KVM and designed corresponding solutions to tackle the two RHS problems. Evaluation results demonstrate that, for the two identified problems, CPS improves the performance of PARSEC by 81.1% and FxMark by 1.01x on average.
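As a rough illustration of the idea in the abstract, the sketch below shows how a guest scheduler might use host-published state (pCPU load and the vCPU-pCPU mapping) when placing a task. It is a minimal sketch under stated assumptions, not the CPS implementation: the names `SharedSchedState`, `pick_vcpu`, `guest_load`, and `prefer_group`, the load scale, and the additive cost are all hypothetical and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class SharedSchedState:
    """Hypothetical snapshot published by the hypervisor to the guest."""
    vcpu_to_pcpu: Dict[int, int]      # vCPU id -> pCPU id currently backing it
    pcpu_load: Dict[int, float]       # pCPU id -> load from other tenants, in [0, 1]
    pcpu_cache_group: Dict[int, int]  # pCPU id -> cache group id


def pick_vcpu(state: SharedSchedState,
              guest_load: Dict[int, float],
              prefer_group: Optional[int] = None) -> int:
    """Pick the vCPU with the lowest combined guest-visible and host-side load.

    If prefer_group is given and some vCPU is currently backed by a pCPU in
    that cache group, only those vCPUs are considered.
    """
    def cost(vcpu: int) -> float:
        pcpu = state.vcpu_to_pcpu[vcpu]
        # Assumption: loads are simply added; a real policy may weigh them differently.
        return guest_load.get(vcpu, 0.0) + state.pcpu_load.get(pcpu, 0.0)

    candidates = list(state.vcpu_to_pcpu)
    if prefer_group is not None:
        same_group = [
            v for v in candidates
            if state.pcpu_cache_group[state.vcpu_to_pcpu[v]] == prefer_group
        ]
        if same_group:
            candidates = same_group
    # Break ties by vCPU id so the choice is deterministic.
    return min(candidates, key=lambda v: (cost(v), v))
```

Without the host-side load, two idle vCPUs look identical to the guest even when one of them is backed by a heavily loaded pCPU; with the shared snapshot, the guest can tell them apart, and because the mapping is read at decision time, a change in the vCPU-pCPU mapping also changes which vCPUs count as sharing a cache group.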

Cited By

  • (2024) IOGuard: Software-Based I/O Page Fault Handling with One CPU Core. Proceedings of the 15th Asia-Pacific Symposium on Internetware, pages 337-346. DOI: 10.1145/3671016.3671394. Online publication date: 24-Jul-2024

Published In

ASPLOS '23: Proceedings of the 28th ACM International Conference on Architectural Support for Programming Languages and Operating Systems, Volume 4
March 2023
430 pages
ISBN:9798400703942
DOI:10.1145/3623278
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. para-virtualized scheduling
  2. cache group
  3. manycore machine
  4. performance scalability

Qualifiers

  • Research-article

Funding Sources

  • The National Natural Science Foundation of China (NSFC)

Conference

ASPLOS '23

Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%
