DOI: 10.1145/3124680.3124734
research-article
Public Access

HyperFresh: Live Refresh of Hypervisors Using Nested Virtualization

Published: 02 September 2017

Abstract

Bugs in hypervisors are becoming common as hypervisors grow in size and complexity. Latent bugs, such as memory leaks, can lead to hypervisor failures resulting in complete loss of all its virtual machines (or guests). However, reliable operation of hypervisors, even in the presence of bugs, is critical in cloud platforms. A hypervisor can be regularly restarted, with or without updates, to reset its state, preempt unexpected failures, and extend its operational uptime. However, a hypervisor restart and update is highly disruptive to guests, which must be either migrated to another host, or shut down. We propose HyperFresh, a fast and guest-transparent approach to replace an old, and possibly unstable, hypervisor with a fresh one beneath live unmodified guests. Using nested virtualization, a thin hyperplexor layer runs the hypervisor and its guests. To prepare for refresh, all guest memory is co-mapped in advance to a fresh co-resident hypervisor. When the refresh operation is triggered, the hyperplexor simply switches control of the guest VCPUs and I/O state to the new hypervisor. Our HyperFresh prototype on the KVM/QEMU platform yields switching times of around 100ms with low performance impact on guest workload.
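
The two phases described in the abstract (co-mapping guest memory in advance, then switching VCPU and I/O state on refresh) can be illustrated with a toy model. This is only a sketch under stated assumptions, not the authors' KVM/QEMU implementation: the class names `Hypervisor` and `Hyperplexor`, the methods `co_map_guest_memory` and `refresh`, and the dictionary-based state are all hypothetical and chosen for illustration.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Hypervisor:
    """Toy stand-in for a hypervisor running on top of the hyperplexor."""
    name: str
    guest_pages: dict = field(default_factory=dict)  # guest page -> host frame
    vcpu_state: Optional[dict] = None
    io_state: Optional[dict] = None
    running_guest: bool = False


class Hyperplexor:
    """Toy stand-in for the thin layer beneath both hypervisors."""

    def __init__(self, old: Hypervisor, fresh: Hypervisor) -> None:
        self.old = old
        self.fresh = fresh
        self.co_mapped = False

    def co_map_guest_memory(self) -> None:
        # Preparation, done in advance: give the fresh hypervisor mappings to
        # the same host frames. Only the mapping table is duplicated here;
        # no guest page contents are copied.
        self.fresh.guest_pages = dict(self.old.guest_pages)
        self.co_mapped = True

    def refresh(self) -> None:
        # Switch step: move only VCPU and I/O state to the fresh hypervisor.
        if not self.co_mapped:
            raise RuntimeError("guest memory must be co-mapped before refresh")
        self.old.running_guest = False
        self.fresh.vcpu_state, self.old.vcpu_state = self.old.vcpu_state, None
        self.fresh.io_state, self.old.io_state = self.old.io_state, None
        self.fresh.running_guest = True


# Hypothetical guest: two pages, one VCPU register, one device.
old = Hypervisor("old", {0: 0x1000, 1: 0x2000}, {"rip": 0x400000}, {"nic": "up"}, True)
fresh = Hypervisor("fresh")
plexor = Hyperplexor(old, fresh)
plexor.co_map_guest_memory()  # ahead of time
plexor.refresh()              # at refresh time
```

In this model the work done at refresh time does not depend on the number of guest pages, because the memory mappings were set up beforehand. That is one plausible reading of why the abstract can report switching times of around 100ms.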




    Published In

    APSys '17: Proceedings of the 8th Asia-Pacific Workshop on Systems
    September 2017
    207 pages
    ISBN:9781450351973
    DOI:10.1145/3124680
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. Hypervisor
    2. Reliability
    3. Virtual Machines
    4. Virtualization

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Conference

    APSys '17

    Acceptance Rates

    APSys '17 Paper Acceptance Rate 27 of 51 submissions, 53%;
    Overall Acceptance Rate 169 of 430 submissions, 39%

    Bibliometrics & Citations

    Bibliometrics

    Article Metrics

    • Downloads (Last 12 months): 44
    • Downloads (Last 6 weeks): 12
    Reflects downloads up to 14 Jan 2025

    Citations

    Cited By

    • (2024) Efficient Virtual Resource Management in Nested Virtualization Environments. 2024 International Conference on Innovation and Intelligence for Informatics, Computing, and Technologies (3ICT), 751-756. DOI: 10.1109/3ict64318.2024.10824265. Online publication date: 17-Nov-2024
    • (2023) V-Recover: Virtual Machine Recovery When Live Migration Fails. IEEE Transactions on Cloud Computing, 1-12. DOI: 10.1109/TCC.2023.3282466. Online publication date: 2023
    • (2023) Rust-Shyper: A reliable embedded hypervisor supporting VM migration and hypervisor live-update. Journal of Systems Architecture 142 (102948). DOI: 10.1016/j.sysarc.2023.102948. Online publication date: Sep-2023
    • (2023) HyperTP: A unified approach for live hypervisor replacement in datacenters. Journal of Parallel and Distributed Computing 181 (104733). DOI: 10.1016/j.jpdc.2023.104733. Online publication date: Nov-2023
    • (2022) The Effects of Soft Errors and Mitigation Strategies for Virtualization Servers. IEEE Transactions on Cloud Computing 10:2, 1065-1081. DOI: 10.1109/TCC.2020.2973146. Online publication date: 1-Apr-2022
    • (2022) VM Migration and Live-Update for Reliable Embedded Hypervisor. Dependable Software Engineering. Theories, Tools, and Applications, 53-69. DOI: 10.1007/978-3-031-21213-0_4. Online publication date: 11-Dec-2022
    • (2021) Mitigating Virtualization Failures Through Migration to a Co-Located Hypervisor. IEEE Access 9, 105255-105269. DOI: 10.1109/ACCESS.2021.3098644. Online publication date: 2021
    • (2021) Seamless Update of a Hypervisor Stack. Intelligent Computing, 509-519. DOI: 10.1007/978-3-030-80119-9_31. Online publication date: 13-Jul-2021
    • (2019) Fast and live hypervisor replacement. Proceedings of the 15th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments, 45-58. DOI: 10.1145/3313808.3313821. Online publication date: 14-Apr-2019
    • (2019) Fast Local VM Migration Against Hypervisor Corruption. 2019 15th European Dependable Computing Conference (EDCC), 97-102. DOI: 10.1109/EDCC.2019.00028. Online publication date: Sep-2019
