Research article, DOI: 10.1145/2540708.2540745

Aegis: partitioning data block for efficient recovery of stuck-at-faults in phase change memory

Published: 07 December 2013

Abstract

While Phase Change Memory (PCM) holds great promise as a complement to, or even a replacement for, DRAM-based memory and flash-based storage, it must overcome its limited write endurance to remain reliable over an extended period of intensive use. Limited write endurance leads to permanent stuck-at faults after a certain number of writes, leaving some memory cells permanently stuck at either '0' or '1'. State-of-the-art solutions partition a data block into bit groups and apply bit inversion to selected groups. The effectiveness of this approach hinges on how the block is partitioned. While all existing solutions can separate faults into different groups for error correction, they fall short on three fundamental capabilities desired of any partition scheme. First, the scheme should maximize the probability of successfully re-partitioning a block so that two faults currently in the same group are placed into two different groups. Second, it should partition a block into a small number of groups for space efficiency. Third, it should spread faults across the groups as uniformly as possible, so that more faults can be accommodated within the same number of groups. A recovery solution with these capabilities can provide strong fault tolerance with minimal overhead.
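To make the group-inversion idea concrete, here is a minimal Python sketch of how inverting whole bit groups can mask stuck-at cells. It is an illustration under stated assumptions, not the mechanism of any particular scheme: the function names, the `stuck` dictionary (cell position mapped to stuck value), and the `group_of` list (group id per bit) are all hypothetical.

```python
def choose_inversions(data, stuck, group_of, num_groups):
    """Pick one inversion flag per group so every stuck cell already holds
    the value that would be written. Returns None if two faults in the
    same group disagree (one needs inversion, the other does not)."""
    flags = [None] * num_groups
    for pos, stuck_val in stuck.items():
        need_invert = data[pos] != stuck_val
        g = group_of[pos]
        if flags[g] is None:
            flags[g] = need_invert
        elif flags[g] != need_invert:
            return None  # conflict: this partition cannot mask the faults
    return [bool(f) for f in flags]  # fault-free groups default to False


def write_cells(data, stuck, group_of, flags):
    """Simulate the write: invert flagged groups; stuck cells keep their value."""
    cells = []
    for pos, bit in enumerate(data):
        written = bit ^ int(flags[group_of[pos]])
        cells.append(stuck.get(pos, written))
    return cells


def read_cells(cells, group_of, flags):
    """Undo the per-group inversion to recover the original data."""
    return [c ^ int(flags[group_of[pos]]) for pos, c in enumerate(cells)]
```

When `choose_inversions` returns `None`, two conflicting faults share a group, which is the situation where a scheme would need to re-partition the block.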
We propose Aegis, a recovery solution with a systematic partition scheme that uses fewer groups to accommodate more faults than state-of-the-art schemes. The uniqueness of Aegis's partition scheme lies in its guarantee that any two bits in the same group will not share a group after a re-partition. Empowered by this partition scheme, Aegis can recover significantly more faults with reduced space overhead relative to state-of-the-art solutions.
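The abstract does not spell out how the partition is constructed. The following Python sketch shows one way to obtain the stated property, assuming the bits are laid out on an a-by-b grid with b prime and a no larger than b, and that each partition groups bits along lines of a fixed slope. The layout, the slope parameter `k`, and the function names are assumptions made for illustration and may differ from the paper's actual scheme.

```python
from itertools import combinations


def slope_partition(n_bits, a, b, k):
    """Group id for each bit under slope k (hypothetical construction).
    Bit i sits at x = i % a, y = i // a; its group is (y - k*x) mod b."""
    assert 2 <= b and all(b % d for d in range(2, b)), "b must be prime"
    assert a <= b and a * b >= n_bits
    return [((i // a) - k * (i % a)) % b for i in range(n_bits)]


def max_shared_configs(n_bits, a, b):
    """Largest number of slopes under which any single pair of bits
    lands in the same group, checked exhaustively over all pairs."""
    parts = [slope_partition(n_bits, a, b, k) for k in range(b)]
    return max(
        sum(p[i] == p[j] for p in parts)
        for i, j in combinations(range(n_bits), 2)
    )
```

Under these assumptions, two bits in different columns share a group for exactly one slope, and two bits in the same column never do, so moving to any other slope separates a pair that currently collides. Each partition uses at most b groups.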

Published In

MICRO-46: Proceedings of the 46th Annual IEEE/ACM International Symposium on Microarchitecture
December 2013
498 pages
ISBN: 9781450326384
DOI: 10.1145/2540708

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Cartesian plane
  2. partition scheme
  3. phase-change memory
  4. reliability
  5. stuck-at faults

Qualifiers

  • Research-article

Conference

MICRO-46

Acceptance Rates

MICRO-46 Paper Acceptance Rate: 39 of 239 submissions, 16%
Overall Acceptance Rate 484 of 2,242 submissions, 22%
