DOI: 10.1145/3445814.3446722
research-article

BCD deduplication: effective memory compression using partial cache-line deduplication

Published: 17 April 2021

Abstract

In this paper, we identify a new form of partial data redundancy among multiple cache lines that is not exploited by traditional memory compression or memory deduplication. We propose Base and Compressed Difference (BCD) deduplication, which exploits these partial matches among cache lines through a novel combination of compression and deduplication to increase the effective capacity of main memory. Experimental results show that BCD achieves an average compression ratio of 1.94× for SPEC2017, DaCapo, TPC-DS, and TPC-H, which is 48.4% higher than the best prior work. We also present an efficient implementation of BCD in a modern memory hierarchy that compresses data in both the last-level cache (LLC) and main memory with modest area overhead. Even with the additional metadata accesses and compression/deduplication operations, cycle-level simulations show that BCD improves the performance of the SPEC2017 benchmarks by 2.7% on average because it increases the effective capacity of the LLC. Overall, the results show that BCD can significantly increase the capacity of main memory with little performance overhead.
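
The idea of a partial match between cache lines can be illustrated with a toy software model. The sketch below is not the paper's hardware design: it assumes, purely for illustration, that a 64-byte line is viewed as eight little-endian 64-bit words, that the "base" is the six high-order bytes of each word, and that the "difference" is the two low-order bytes. Bases are deduplicated through a dictionary, while differences are stored per line without further compression (the abstract says BCD compresses them). All names here (`split_line`, `PartialDedupStore`, `DIFF_BYTES`, `pointer_bytes`) are hypothetical.

```python
import struct

LINE_BYTES = 64   # one cache line
WORD_BYTES = 8    # view the line as eight 64-bit words
DIFF_BYTES = 2    # assumption: low-order bytes per word kept as the difference


def split_line(line: bytes):
    """Split a line into (base, diff): high-order vs. low-order bytes of each word."""
    assert len(line) == LINE_BYTES
    base, diff = bytearray(), bytearray()
    for i in range(0, LINE_BYTES, WORD_BYTES):
        word = line[i:i + WORD_BYTES]   # little-endian: low-order bytes come first
        diff += word[:DIFF_BYTES]
        base += word[DIFF_BYTES:]
    return bytes(base), bytes(diff)


def join_line(base: bytes, diff: bytes) -> bytes:
    """Inverse of split_line: interleave the diff and base bytes word by word."""
    hi = WORD_BYTES - DIFF_BYTES
    out = bytearray()
    for w in range(LINE_BYTES // WORD_BYTES):
        out += diff[w * DIFF_BYTES:(w + 1) * DIFF_BYTES]
        out += base[w * hi:(w + 1) * hi]
    return bytes(out)


class PartialDedupStore:
    """Toy store: each unique base is kept once; every line keeps its own diff."""

    def __init__(self):
        self.base_ids = {}   # base bytes -> base id
        self.bases = []      # base id -> base bytes
        self.lines = []      # line index -> (base id, diff bytes)

    def write(self, line: bytes) -> int:
        base, diff = split_line(line)
        bid = self.base_ids.get(base)
        if bid is None:                  # first time this base is seen
            bid = len(self.bases)
            self.base_ids[base] = bid
            self.bases.append(base)
        self.lines.append((bid, diff))
        return len(self.lines) - 1

    def read(self, idx: int) -> bytes:
        bid, diff = self.lines[idx]
        return join_line(self.bases[bid], diff)

    def stored_bytes(self, pointer_bytes: int = 4) -> int:
        """Bytes used, assuming a fixed-size pointer from each line to its base."""
        base_total = sum(len(b) for b in self.bases)
        line_total = sum(pointer_bytes + len(d) for _, d in self.lines)
        return base_total + line_total


def make_line(offset: int) -> bytes:
    """Eight pointer-like words that share their high-order bytes."""
    words = [0x00007F00DEAD0000 + offset + 8 * k for k in range(8)]
    return struct.pack("<8Q", *words)


store = PartialDedupStore()
lines = [make_line(off) for off in (0x000, 0x100, 0x200)]
for line in lines:
    store.write(line)

ratio = len(lines) * LINE_BYTES / store.stored_bytes()
print(f"{len(store.bases)} base(s) for {len(lines)} lines, ratio {ratio:.2f}x")
```

The three example lines differ as whole lines, so exact-match deduplication would find nothing to share, yet they map to a single base in this model. The fixed byte split and the 4-byte pointer are arbitrary choices made to keep the sketch short. They say nothing about how BCD actually selects bases or encodes differences.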



Published In

ASPLOS '21: Proceedings of the 26th ACM International Conference on Architectural Support for Programming Languages and Operating Systems
April 2021
1090 pages
ISBN: 9781450383172
DOI: 10.1145/3445814
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. DRAM
  2. Memory compression
  3. deduplication

Qualifiers

  • Research-article

Conference

ASPLOS '21
Acceptance Rates

Overall Acceptance Rate 535 of 2,713 submissions, 20%

Article Metrics

  • Downloads (Last 12 months): 176
  • Downloads (Last 6 weeks): 25
Reflects downloads up to 04 Oct 2024

Cited By

  • (2024) Sparrow: Flexible Memory Deduplication in Android Systems with Similar-Page Awareness. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546588. Online publication date: 25-Mar-2024
  • (2024) FSDedup: Feature-Aware and Selective Deduplication for Improving Performance of Encrypted Non-Volatile Main Memory. ACM Transactions on Storage 20:4, pp. 1-33. DOI: 10.1145/3662736. Online publication date: 1-May-2024
  • (2024) A Low-Cost Fault-Tolerant Racetrack Cache Based on Data Compression. IEEE Transactions on Circuits and Systems II: Express Briefs 71:8, pp. 3940-3944. DOI: 10.1109/TCSII.2024.3375640. Online publication date: Aug-2024
  • (2024) DyLeCT: Achieving Huge-page-like Translation Performance for Hardware-compressed Memory. 2024 ACM/IEEE 51st Annual International Symposium on Computer Architecture (ISCA), pp. 1129-1143. DOI: 10.1109/ISCA59077.2024.00085. Online publication date: 29-Jun-2024
  • (2024) A novel approximate cache block compressor for error-resilient image data. Computers and Electrical Engineering 115:C. DOI: 10.1016/j.compeleceng.2024.109106. Online publication date: 2-Jul-2024
  • (2023) QZRAM: A Transparent Kernel Memory Compression System Design for Memory-Intensive Applications with QAT Accelerator Integration. Applied Sciences 13:18, article 10526. DOI: 10.3390/app131810526. Online publication date: 21-Sep-2023
  • (2023) ESD: An ECC-assisted and Selective Deduplication for Encrypted Non-Volatile Main Memory. 2023 IEEE International Symposium on High-Performance Computer Architecture (HPCA), pp. 977-990. DOI: 10.1109/HPCA56546.2023.10071011. Online publication date: Feb-2023
  • (2022) Research on Data Routing Strategy of Deduplication in Cloud Environment. IEEE Access 10, pp. 9529-9542. DOI: 10.1109/ACCESS.2021.3139757. Online publication date: 2022
  • (2021) A Cost-Efficient Metadata Scheme for High-Performance Deduplication Systems. 2021 IEEE 23rd Int Conf on High Performance Computing & Communications; 7th Int Conf on Data Science & Systems; 19th Int Conf on Smart City; 7th Int Conf on Dependability in Sensor, Cloud & Big Data Systems & Application (HPCC/DSS/SmartCity/DependSys), pp. 49-56. DOI: 10.1109/HPCC-DSS-SmartCity-DependSys53884.2021.00034. Online publication date: Dec-2021
