research-article
Open access

Reliability-aware Garbage Collection for Hybrid HBM-DRAM Memories

Published: 20 January 2021
Abstract

    Emerging workloads in cloud and data center infrastructures demand high main memory bandwidth and capacity. Unfortunately, DRAM alone is unable to satisfy contemporary main memory demands. High-bandwidth memory (HBM) uses 3D die-stacking to deliver 4–8× higher bandwidth. HBM has two drawbacks: (1) capacity is low, and (2) soft error rate is high. Hybrid memory combines DRAM and HBM to promise low fault rates, high bandwidth, and high capacity. Prior OS approaches manage HBM by mapping pages to HBM versus DRAM based on hotness (access frequency) and risk (susceptibility to soft errors). Unfortunately, these approaches operate at a coarse-grained page granularity, and frequent page migrations hurt performance.
    This article proposes a new class of reliability-aware garbage collectors for hybrid HBM-DRAM systems that place hot and low-risk objects in HBM and the rest in DRAM. Our analysis of nine real-world Java workloads shows that: (1) newly allocated objects in the nursery are frequently written, making them both hot and low-risk, (2) a small fraction of the mature objects are hot and low-risk, and (3) allocation site is a good predictor for hotness and risk. We propose RiskRelief, a novel reliability-aware garbage collector that uses allocation site prediction to place hot and low-risk objects in HBM. Allocation sites are profiled offline, and RiskRelief uses heuristics to classify each allocation site as DRAM or HBM. The proposed heuristics expose Pareto-optimal trade-offs between soft error rate (SER) and execution time. RiskRelief improves SER by 9× compared to an HBM-Only system while also improving performance by 29% compared to a DRAM-Only system. Compared to a state-of-the-art OS approach for reliability-aware data placement, RiskRelief eliminates all page migration overheads, which substantially improves performance while delivering similar SER. Reliability-aware garbage collection opens up a new opportunity to manage emerging HBM-DRAM memories at fine granularity while requiring no extra hardware support and leaving the programming model unchanged.
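    The allocation-site classification described in the abstract can be illustrated with a minimal sketch. This is not RiskRelief's actual heuristic, which the abstract does not specify. The profile fields (`accesses_per_byte`, `write_fraction`), the thresholds, and the site names are all hypothetical, and the sketch assumes, in line with the abstract's observation about nursery objects, that frequently written data is low-risk.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SiteProfile:
    """Offline profile of one allocation site (all fields are hypothetical)."""
    site_id: str
    accesses_per_byte: float  # hotness proxy: accesses per allocated byte
    write_fraction: float     # share of accesses that are writes, in [0, 1]


def classify_site(profile, hot_threshold=1.0, write_threshold=0.5):
    """Return "HBM" for sites that look hot and low-risk, otherwise "DRAM".

    Assumption: a high write fraction means values are overwritten often,
    so the site is treated as low-risk with respect to soft errors.
    """
    is_hot = profile.accesses_per_byte >= hot_threshold
    is_low_risk = profile.write_fraction >= write_threshold
    return "HBM" if is_hot and is_low_risk else "DRAM"


def build_placement_table(profiles, **thresholds):
    """Map each allocation site to the memory its objects would be placed in."""
    return {p.site_id: classify_site(p, **thresholds) for p in profiles}


# Hypothetical profiles: hot and write-heavy, hot but read-mostly, and cold.
profiles = [
    SiteProfile("Parser.java:42", accesses_per_byte=3.2, write_fraction=0.8),
    SiteProfile("Cache.java:17", accesses_per_byte=2.5, write_fraction=0.1),
    SiteProfile("Config.java:5", accesses_per_byte=0.05, write_fraction=0.9),
]
table = build_placement_table(profiles)
print(table)
```

    With the default thresholds, only the hot, write-heavy site maps to HBM; the read-mostly and cold sites fall back to DRAM. Moving the thresholds shifts more or fewer sites into HBM, which is one simple way to picture the trade-off between soft error rate and execution time that the abstract mentions.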

    Cited By

    • (2022) Reliability-Aware Runahead. 2022 IEEE International Symposium on High-Performance Computer Architecture (HPCA), 772-785. DOI: 10.1109/HPCA53966.2022.00062. Online publication date: Apr-2022

    Published In

    ACM Transactions on Architecture and Code Optimization, Volume 18, Issue 1
    March 2021
    402 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/3446348
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 20 January 2021
    Accepted: 01 October 2020
    Revised: 01 October 2020
    Received: 01 May 2020
    Published in TACO Volume 18, Issue 1

    Author Tags

    1. Soft-error reliability
    2. garbage collection
    3. high-bandwidth memory
    4. hybrid memories

    Qualifiers

    • Research-article
    • Research
    • Refereed

    Funding Sources

    • ERC
    • FWO

    Article Metrics

    • Downloads (Last 12 months): 302
    • Downloads (Last 6 weeks): 44
    Reflects downloads up to 11 Aug 2024
