
Word- and Partition-Level Write Variation Reduction for Improving Non-Volatile Cache Lifetime

Published: 01 August 2017

Abstract

Non-volatile memory technologies are among the most promising candidates for implementing the main memories and caches of future microprocessors, replacing traditional DRAM and SRAM. However, one of their most challenging design issues is limited write endurance. In this article, we first propose to exploit narrow-width values to improve the lifetime of non-volatile last-level caches through word-level write variation reduction. A leading-zeros masking scheme is proposed to reduce the write stress on the upper half of narrow-width data. To balance the write variation between the upper and lower halves of narrow-width data, two swapping schemes, swap on write (SW) and swap on replacement (SRepl), are proposed. Two existing optimization schemes, multiple dirty bit (MDB) and read before write (RBW), are adopted alongside our word-level swapping design. To further reduce write variation at the partition level, we propose to exploit cache partitioning to improve lifetime. Based on the observation that different applications exhibit different cache access (write) behaviors, we propose to partition the last-level cache among applications and to balance write variation by partition swapping. Both software-based and hardware-based partitioning and swapping schemes are proposed and evaluated for different situations. Our experimental results show that both our word- and partition-level designs effectively improve the lifetime of non-volatile caches with low performance and energy overheads.
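
As a rough illustration of the word-level idea in the abstract, the Python sketch below models a single word as two physical halves with per-half write counters. It is a toy model under stated assumptions, not the design evaluated in the article: the 32-bit word size, the 16-bit split, the `narrow` and `swapped` flag bits, the `Word` class, the `wear_after` helper, and the policy of toggling the swap flag on every narrow write are all hypothetical choices made for the example.

```python
from dataclasses import dataclass, field

HALF_BITS = 16                      # assumed: 32-bit word split into two 16-bit halves
HALF_MASK = (1 << HALF_BITS) - 1


def is_narrow(value: int) -> bool:
    """Return True if the upper half of a 32-bit value is all zeros."""
    return ((value >> HALF_BITS) & HALF_MASK) == 0


@dataclass
class Word:
    """Toy model of one cache word (hypothetical, for illustration only)."""
    halves: list = field(default_factory=lambda: [0, 0])  # physical cell contents
    writes: list = field(default_factory=lambda: [0, 0])  # per-half write counts
    narrow: bool = False   # assumed flag: upper half is implied zero
    swapped: bool = False  # assumed flag: logical lower half sits in physical half 1

    def write(self, value: int, swap_on_write: bool = True) -> None:
        value &= 0xFFFFFFFF
        lower, upper = value & HALF_MASK, value >> HALF_BITS
        self.narrow = is_narrow(value)
        if self.narrow and swap_on_write:
            # Alternate the physical half that absorbs narrow writes.
            self.swapped = not self.swapped
        lo = 1 if self.swapped else 0
        self.halves[lo] = lower
        self.writes[lo] += 1
        if not self.narrow:
            # Wide value: the upper half must be stored as well.
            self.halves[1 - lo] = upper
            self.writes[1 - lo] += 1
        # Narrow value: the upper half is masked, so its cells are not written.

    def read(self) -> int:
        lo = 1 if self.swapped else 0
        upper = 0 if self.narrow else self.halves[1 - lo]
        return (upper << HALF_BITS) | self.halves[lo]


def wear_after(values, swap_on_write):
    """Write values to a fresh word; return (per-half write counts, final read)."""
    word = Word()
    for v in values:
        word.write(v, swap_on_write=swap_on_write)
    return word.writes, word.read()
```

In this model, calling `wear_after([1, 2, 3, 4], False)` puts all four narrow writes on one physical half, while `wear_after([1, 2, 3, 4], True)` splits them evenly across the two halves; in both cases the value read back is unchanged. A swap-on-replacement variant would toggle the flag only when the line is replaced, which this sketch does not model.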




Published In

ACM Transactions on Design Automation of Electronic Systems, Volume 23, Issue 1
January 2018
279 pages
ISSN: 1084-4309
EISSN: 1557-7309
DOI: 10.1145/3129756
Editor: Naehyuck Chang
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Publication History

Published: 01 August 2017
Accepted: 01 April 2017
Revised: 01 January 2017
Received: 01 September 2016
Published in TODAES Volume 23, Issue 1


Author Tags

  1. Non-volatile memory
  2. cache partitioning
  3. last-level cache
  4. narrow-width value
  5. wear-leveling

Qualifiers

  • Research-article
  • Research
  • Refereed


Cited By

  • (2024) PROLONG: Priority based Write Bypassing Technique for Longer Lifetime in STT-RAM based LLC. Proceedings of the International Symposium on Memory Systems, 89-103. DOI: 10.1145/3695794.3695803. Online publication date: 30-Sep-2024
  • (2023) Auto-tuning Fixed-point Precision with TVM on RISC-V Packed SIMD Extension. ACM Transactions on Design Automation of Electronic Systems 28:3, 1-21. DOI: 10.1145/3569939. Online publication date: 22-Mar-2023
  • (2023) WiSE: When Learning Assists Resolving STT-MRAM Efficiency Challenges. IEEE Transactions on Emerging Topics in Computing 11:1, 43-55. DOI: 10.1109/TETC.2022.3163438. Online publication date: 1-Jan-2023
  • (2022) The Support of MLIR HLS Adaptor for LLVM IR. Workshop Proceedings of the 51st International Conference on Parallel Processing, 1-8. DOI: 10.1145/3547276.3548515. Online publication date: 29-Aug-2022
  • (2022) Register-Pressure Aware Predicator for Length Multiplier of RVV. Workshop Proceedings of the 51st International Conference on Parallel Processing, 1-9. DOI: 10.1145/3547276.3548513. Online publication date: 29-Aug-2022
  • (2022) An Architectural-Level Reliability Improvement Scheme in STT-MRAM Main Memory. Microprocessors and Microsystems 90, 104462. DOI: 10.1016/j.micpro.2022.104462. Online publication date: Apr-2022
  • (2022) Data block manipulation for error rate reduction in STT-MRAM based main memory. The Journal of Supercomputing 78:11, 13342-13372. DOI: 10.1007/s11227-022-04394-7. Online publication date: 1-Jul-2022
  • (2021) NOSTalgy: Near-Optimum Run-Time STT-MRAM Quality-Energy Knob Management for Approximate Computing Applications. IEEE Transactions on Computers 70:3, 414-427. DOI: 10.1109/TC.2020.2989243. Online publication date: 9-Feb-2021
  • (2020) CAST: Content-Aware STT-MRAM Cache Write Management for Different Levels of Approximation. IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems 39:12, 4385-4398. DOI: 10.1109/TCAD.2020.2986320. Online publication date: Dec-2020
  • (2019) Write Variation Aware Cache Partitioning for Improved Lifetime in Non-volatile Caches. 2019 32nd International Conference on VLSI Design and 2019 18th International Conference on Embedded Systems (VLSID), 425-430. DOI: 10.1109/VLSID.2019.00091. Online publication date: Jan-2019
