Content Look-Aside Buffer for Redundancy-Free Virtual Disk I/O and Caching

Published: 08 April 2017

Abstract

    Storage consolidation in a virtualized environment introduces numerous duplicate blocks across virtual disks and puts considerable pressure on disk I/O and caching. In this paper, we present a content look-aside buffer (CLB) approach that provides redundancy-free virtual disk I/O and caching simultaneously. CLB attaches persistent fingerprints to virtual disk blocks, which enables detection of I/O redundancy before disk access. At run time, CLB exploits content pages already present in the guest disk caches to service redundant reads through page sharing, thus eliminating both redundant I/O requests and redundant disk cache copies. For write requests, CLB uses a group invalidating writeback protocol to update fingerprints, supporting crash consistency while minimizing disk write overhead. By implementing and evaluating a CLB prototype on the KVM hypervisor, we demonstrate that CLB delivers considerably improved I/O performance with realistic workloads. Our CLB prototype improves the throughput of sequential and random reads on duplicate data by 4.1x and 26.2x, respectively. For typical read-intensive workloads, such as booting a VM and launching an application, CLB's I/O deduplication and cache deduplication eliminate 94.9% to 98.5% of read requests and save 50% to 100% of cache memory in each VM, respectively. Compared with QEMU's raw virtual disk format, CLB improves per-disk VM density by 8x to 16x. For mixed read-write workloads, the cost of online fingerprint updating offsets part of the read benefit; nevertheless, CLB still substantially improves overall performance.
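
    The read path described in the abstract can be illustrated with a toy model. The sketch below is not the CLB prototype: the class name `LookAsideDisk`, the use of SHA-1 as the fingerprint function, the in-memory list standing in for a virtual disk, and the single-block invalidation on write are all assumptions made for illustration. In particular, the paper's group invalidating writeback protocol and its crash-consistency guarantees are not modeled here.

```python
import hashlib


class LookAsideDisk:
    """Toy model of a fingerprint-indexed read path (illustrative only)."""

    def __init__(self, blocks):
        self.blocks = list(blocks)   # simulated virtual disk contents
        # Per-block fingerprints, assumed to be kept alongside the blocks.
        self.fingerprints = [self._fp(b) for b in self.blocks]
        self.pages = {}              # fingerprint -> single cached page
        self.disk_reads = 0          # number of simulated disk accesses

    @staticmethod
    def _fp(data):
        # SHA-1 is an assumption; any collision-resistant hash would do here.
        return hashlib.sha1(data).hexdigest()

    def read(self, block_no):
        fp = self.fingerprints[block_no]
        if fp is not None and fp in self.pages:
            # Identical content is already cached: share it, skip the disk.
            return self.pages[fp]
        data = self.blocks[block_no]
        self.disk_reads += 1
        if fp is None:
            # Fingerprint was invalidated by a write; refresh it lazily.
            fp = self.fingerprints[block_no] = self._fp(data)
        self.pages[fp] = data
        return data

    def write(self, block_no, data):
        # Invalidate first so a stale fingerprint is never trusted.
        self.fingerprints[block_no] = None
        self.blocks[block_no] = data
```

    With three blocks where the first and third hold the same content, reading both costs one simulated disk access and both reads return the same page object. Because Python `bytes` are immutable, sharing the object is safe in this sketch; a real hypervisor would need copy-on-write protection for shared pages.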

    Cited By

    • (2019) Research Characterization on I/O Improvements of Storage Environments. Advances on P2P, Parallel, Grid, Cloud and Internet Computing. DOI: 10.1007/978-3-030-33509-0_26, pages 287-298. Online publication date: 20-Oct-2019
    • (2020) Characterization Research on I/O Improvements Targeting DISC and HPC Applications. IECON 2020 The 46th Annual Conference of the IEEE Industrial Electronics Society. DOI: 10.1109/IECON43393.2020.9255087, pages 2095-2100. Online publication date: 18-Oct-2020

    Published In

    ACM SIGPLAN Notices  Volume 52, Issue 7
    VEE '17
    July 2017
    256 pages
    ISSN:0362-1340
    EISSN:1558-1160
    DOI:10.1145/3140607

    VEE '17: Proceedings of the 13th ACM SIGPLAN/SIGOPS International Conference on Virtual Execution Environments
    April 2017
    261 pages
    ISBN:9781450349482
    DOI:10.1145/3050748
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. I/O deduplication
    2. memory deduplication
    3. virtual disk

    Qualifiers

    • Tutorial
    • Research
    • Refereed limited
