DOI: 10.1145/2744769.2744846

Guidelines to design parity protected write-back L1 data cache

Published: 07 June 2015

Abstract

Several decades of technology scaling have brought the challenge of soft errors to modern computing systems, and caches are most susceptible to them. While it is straightforward to protect L2 and other lower-level caches using error-correcting codes (ECC), protecting the L1 data cache poses a challenge. Parity-based protection of the L1 data cache is a more power-efficient alternative; however, some questions still linger: How effective is parity protection for caches? How can we design a parity-based L1 data cache so as to maximize the protection achieved? The goal of this paper is to perform a quantitative evaluation of the protection afforded by various parity-protected cache design alternatives, and to formulate guidelines for the design of power-efficient and reliable L1 data caches. Towards this goal, this paper develops an algorithm to accurately model the vulnerability of data in caches in the presence of various configurations of parity protection, and validates it against extensive fault injection campaigns. We find that (i) checking parity at reads only (and not at writes) provides 11% more protection with 30% lower power overhead than checking at both reads and writes; and (ii) when parity is implemented at word-level granularity, which yields 53% better protection than a block-level parity implementation, the dirty bits in the cache should also be implemented at the same granularity; otherwise, there is no improvement in protection. We find that several popular commercial processors, even ones specifically designed for reliability, do not follow these design guidelines, resulting in suboptimal designs.
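The interplay between parity granularity and dirty-bit granularity described in finding (ii) can be illustrated with a small sketch. This is not the paper's vulnerability-modeling algorithm. It is a toy model under stated assumptions: a block of eight 32-bit words, even parity, and a recovery policy in which a detected error is recoverable only when the affected data is clean and can be re-fetched from the next cache level. All function names are hypothetical.

```python
# Toy model (assumption): 8 words per cache block, one even-parity bit
# per word (word-level) or per block (block-level).

def parity(value: int) -> int:
    """Even-parity bit: 1 if the value has an odd number of set bits."""
    return bin(value).count("1") & 1

def block_parity(words):
    """Single parity bit covering every word in the block."""
    p = 0
    for w in words:
        p ^= parity(w)
    return p

def word_parities(words):
    """One parity bit per word."""
    return [parity(w) for w in words]

def flip_bit(words, word_idx, bit_idx):
    """Return a copy of the block with one bit flipped (a modeled soft error)."""
    out = list(words)
    out[word_idx] ^= 1 << bit_idx
    return out

def block_error_detected(words, stored_parity):
    """Block-level check on read: reports an error, but not where it is."""
    return block_parity(words) != stored_parity

def faulty_words(words, stored_parities):
    """Word-level check on read: indices of words whose parity mismatches."""
    return [i for i, w in enumerate(words) if parity(w) != stored_parities[i]]

def recoverable(word_idx, dirty_words, word_level_dirty):
    """Assumed policy: a detected error is recoverable only if the data is clean.

    With a block-level dirty bit, one written word marks the whole block dirty.
    """
    if word_level_dirty:
        return word_idx not in dirty_words
    return len(dirty_words) == 0

words = list(range(8))
corrupted = flip_bit(words, 3, 5)           # soft error in word 3
dirty = {0}                                 # only word 0 was written
print(block_error_detected(corrupted, block_parity(words)))
print(faulty_words(corrupted, word_parities(words)))
print(recoverable(3, dirty, word_level_dirty=True))
print(recoverable(3, dirty, word_level_dirty=False))
```

In this toy model, word-level parity locates the error in word 3, but only word-level dirty bits reveal that word 3 is still clean and could be re-fetched. With a single block-level dirty bit, the write to word 0 makes the whole block look dirty, so the finer parity granularity brings no benefit.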

      Published In

      DAC '15: Proceedings of the 52nd Annual Design Automation Conference
      June 2015
      1204 pages
      ISBN:9781450335201
      DOI:10.1145/2744769
      Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Qualifiers

      • Research-article

      Conference

      DAC '15
      DAC '15: The 52nd Annual Design Automation Conference 2015
      June 7 - 11, 2015
      San Francisco, California

      Acceptance Rates

      Overall Acceptance Rate 1,770 of 5,499 submissions, 32%

      Cited By

      • (2023) gemV-tool: A Comprehensive Soft Error Reliability Estimation Tool for Design Space Exploration. Electronics 12:22 (4573). DOI: 10.3390/electronics12224573. Online publication date: 8-Nov-2023
      • (2021) Characterizing System-Level Masking Effects against Soft Errors. Electronics 10:18 (2286). DOI: 10.3390/electronics10182286. Online publication date: 17-Sep-2021
      • (2019) Cache Reconfiguration Using Machine Learning for Vulnerability-aware Energy Optimization. ACM Transactions on Embedded Computing Systems 18:2 (1-24). DOI: 10.1145/3309762. Online publication date: 2-Apr-2019
      • (2019) RAW-Tag. IEEE Transactions on Dependable and Secure Computing 16:4 (651-664). DOI: 10.1109/TDSC.2017.2706263. Online publication date: 1-Jul-2019
      • (2018) Vulnerability-aware Energy Optimization for Reconfigurable Caches in Multitasking Systems. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems (1-1). DOI: 10.1109/TCAD.2018.2834410. Online publication date: 2018
      • (2017) BenchPrime. ACM Transactions on Embedded Computing Systems 16:5s (1-22). DOI: 10.1145/3126499. Online publication date: 27-Sep-2017
      • (2017) Protecting Caches from Soft Errors. ACM Transactions on Embedded Computing Systems 16:4 (1-28). DOI: 10.1145/3063180. Online publication date: 11-May-2017
      • (2017) Application-Guided Power-Efficient Fault Tolerance for H.264 Context Adaptive Variable Length Coding. IEEE Transactions on Computers 66:4 (560-574). DOI: 10.1109/TC.2016.2616313. Online publication date: 1-Apr-2017
      • (2016) Multi-level cache vulnerability estimation: The first step to protect memory. 2016 IEEE International Conference on Systems, Man, and Cybernetics (SMC) (001165-001170). DOI: 10.1109/SMC.2016.7844399. Online publication date: Oct-2016
