DOI:10.1145/1815961.1815973
research-article

Reducing cache power with low-cost, multi-bit error-correcting codes

Published: 19 June 2010

Abstract

Technology advancements have enabled the integration of large on-die embedded DRAM (eDRAM) caches. eDRAM is significantly denser than traditional SRAMs, but must be periodically refreshed to retain data. Like SRAM, eDRAM is susceptible to device variations, which play a role in determining refresh time for eDRAM cells. Refresh power potentially represents a large fraction of overall system power, particularly during low-power states when the CPU is idle. Future designs need to reduce cache power without incurring the high cost of flushing cache data when entering low-power states.
In this paper, we show the significant impact of variations on refresh time and cache power consumption for large eDRAM caches. We propose Hi-ECC, a technique that incorporates multi-bit error-correcting codes to significantly reduce the refresh rate. Multi-bit error-correcting codes usually have a complex decoder design and a high storage cost. Hi-ECC avoids the decoder complexity by using strong ECC codes to identify and disable sections of the cache with multi-bit failures, while providing efficient single-bit error correction for the common case. Hi-ECC includes additional optimizations that allow us to amortize the storage cost of the code over large data words, providing the benefit of multi-bit correction at the same storage cost as a single-bit error-correcting (SECDED) code (2% overhead). Our proposal achieves a 93% reduction in refresh power vs. a baseline eDRAM cache without error-correcting capability, and a 66% reduction in refresh power vs. a system using SECDED codes.
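
The amortization argument in the abstract can be illustrated with a rough calculation. The sketch below is not taken from the paper: it uses the textbook bound that a binary BCH code correcting t errors needs at most m*t check bits over GF(2^m), plus one overall parity bit to detect one additional error. The function name `bch_check_bits` and the data-word sizes (64 bits, 64 bytes, 1 KB) are assumptions chosen for illustration only.

```python
def bch_check_bits(data_bits: int, t: int, extra_detect: bool = True) -> int:
    """Upper bound on check bits for a (shortened) binary BCH code.

    Assumption: the code corrects t errors and needs at most m*t parity bits,
    where m is the smallest integer such that the codeword (data + parity)
    fits in 2^m - 1 bits. One more parity bit is added when an extra error
    must be detected (e.g. SECDED for t=1, or 5EC6ED for t=5).
    """
    m = 1
    while (1 << m) - 1 < data_bits + m * t:
        m += 1
    return m * t + (1 if extra_detect else 0)


def overhead(data_bits: int, t: int) -> float:
    """Check bits as a fraction of the data bits they protect."""
    return bch_check_bits(data_bits, t) / data_bits


if __name__ == "__main__":
    # Hypothetical configurations: (label, data bits, correctable errors t).
    configs = [
        ("SECDED over a 64-bit word", 64, 1),
        ("SECDED over a 64-byte line", 512, 1),
        ("5-error correction over a 1 KB line", 8192, 5),
    ]
    for label, bits, t in configs:
        print(f"{label}: {bch_check_bits(bits, t)} check bits, "
              f"{overhead(bits, t):.2%} overhead")
```

Under these assumptions, SECDED over a 64-byte word costs 11 check bits (about 2.1%), in line with the 2% figure quoted above, while a five-error-correcting code over a 1 KB word stays below 1%. The check-bit count grows roughly with t times the logarithm of the word size, so widening the protected word shrinks the relative cost of a stronger code.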


Published In

ISCA '10: Proceedings of the 37th annual international symposium on Computer architecture
June 2010
520 pages
ISBN:9781450300537
DOI:10.1145/1815961
  • ACM SIGARCH Computer Architecture News, Volume 38, Issue 3
    ISCA '10
    June 2010
    508 pages
    ISSN:0163-5964
    DOI:10.1145/1816038
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

  • IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. dram
  2. ecc
  3. edram
  4. idle power
  5. idle states
  6. multi-bit ecc
  7. refresh power
  8. vccmin

Qualifiers

  • Research-article

Conference

ISCA '10
Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%

Article Metrics

  • Downloads (Last 12 months): 62
  • Downloads (Last 6 weeks): 10
Reflects downloads up to 08 Feb 2025

Cited By

  • (2024) Achieving DRAM-Like PCM by Trading Off Capacity for Latency. IEEE Transactions on Computers 73:4 (1180-1189). Online publication date: Apr-2024. DOI:10.1109/TC.2024.3355779
  • (2023) Vitruvius+: An Area-Efficient RISC-V Decoupled Vector Coprocessor for High Performance Computing Applications. ACM Transactions on Architecture and Code Optimization 20:2 (1-25). Online publication date: 1-Mar-2023. DOI:10.1145/3575861
  • (2023) HyGain: High-performance, Energy-efficient Hybrid Gain Cell-based Cache Hierarchy. ACM Transactions on Architecture and Code Optimization 20:2 (1-20). Online publication date: 1-Mar-2023. DOI:10.1145/3572839
  • (2023) YaConv: Convolution with Low Cache Footprint. ACM Transactions on Architecture and Code Optimization 20:1 (1-18). Online publication date: 10-Feb-2023. DOI:10.1145/3570305
  • (2023) Reducing Power Dissipation in Memory Repair for High Fault Rates. IEEE Transactions on Very Large Scale Integration (VLSI) Systems 31:12 (2112-2125). Online publication date: Dec-2023. DOI:10.1109/TVLSI.2020.3046605
  • (2022) MLFTCache: Multilevel Fault Tolerance Scheme for Write-Back L2 Cache Under Irradiation. IEEE Transactions on Nuclear Science 69:5 (1182-1192). Online publication date: May-2022. DOI:10.1109/TNS.2022.3151805
  • (2022) HiRA: Hidden Row Activation for Reducing Refresh Latency of Off-the-Shelf DRAM Chips. 2022 55th IEEE/ACM International Symposium on Microarchitecture (MICRO) (815-834). Online publication date: Oct-2022. DOI:10.1109/MICRO56248.2022.00062
  • (2022) Enabling efficient sub-block disabled caches using coarse grain spatial predictions. Microprocessors & Microsystems 90:C. Online publication date: 1-Apr-2022. DOI:10.1016/j.micpro.2022.104479
  • (2021) Marvel: A Vertical Resistive Accelerator for Low-Power Deep Learning Inference in Monolithic 3D. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE) (1240-1245). Online publication date: 1-Feb-2021. DOI:10.23919/DATE51398.2021.9474208
  • (2021) Understanding Power Consumption and Reliability of High-Bandwidth Memory with Voltage Underscaling. 2021 Design, Automation & Test in Europe Conference & Exhibition (DATE) (517-522). Online publication date: 1-Feb-2021. DOI:10.23919/DATE51398.2021.9474024
