research-article

High performance cache replacement using re-reference interval prediction (RRIP)

Authors:

Aamer Jaleel,

Kevin B. Theobald,

Simon C. Steely, Jr.,

Joel Emer

ISCA '10: Proceedings of the 37th annual international symposium on Computer architecture

Pages 60 - 71

https://doi.org/10.1145/1815961.1815971

Published: 19 June 2010

Abstract

Practical cache replacement policies attempt to emulate optimal replacement by predicting the re-reference interval of a cache block. The commonly used LRU replacement policy always predicts a near-immediate re-reference interval on both cache hits and misses. Applications that exhibit a distant re-reference interval perform badly under LRU. Such applications usually have a working set larger than the cache or have frequent bursts of references to non-temporal data (called scans). To improve the performance of such workloads, this paper proposes cache replacement using Re-reference Interval Prediction (RRIP). We propose Static RRIP (SRRIP), which is scan-resistant, and Dynamic RRIP (DRRIP), which is both scan-resistant and thrash-resistant. Both RRIP policies require only 2 bits per cache block and integrate easily into the existing LRU approximations found in modern processors. Our evaluations using PC games, multimedia, server, and SPEC CPU2006 workloads on a single-core processor with a 2MB last-level cache (LLC) show that SRRIP and DRRIP outperform LRU replacement on the throughput metric by an average of 4% and 10%, respectively. Our evaluations with over 1000 multi-programmed workloads on a 4-core CMP with an 8MB shared LLC show that SRRIP and DRRIP outperform LRU replacement on the throughput metric by an average of 7% and 9%, respectively. We also show that RRIP outperforms LFU, the state-of-the-art scan-resistant replacement algorithm to date. For the cache configurations under study, RRIP requires 2X less hardware than LRU and 2.5X less hardware than LFU.
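The abstract states what SRRIP achieves but not how it works, so the following is only an illustrative sketch, not the authors' implementation. It assumes the commonly described hit-priority form of SRRIP: each block carries a 2-bit re-reference prediction value (RRPV), a miss inserts the block with a "long" prediction, a hit resets the prediction to "near-immediate", and the victim is a block predicted "distant" (all blocks are aged until one exists). The class and function names, the 4-way set, and the synthetic trace are hypothetical choices made for this example; DRRIP is not shown.

```python
from collections import OrderedDict

RRPV_BITS = 2                      # the abstract's "2 bits per cache block"
RRPV_MAX = (1 << RRPV_BITS) - 1    # 3: predicted "distant" re-reference
RRPV_INSERT = RRPV_MAX - 1         # 2: assumed "long" prediction on a miss


class SRRIPSet:
    """One cache set under an assumed hit-priority SRRIP policy."""

    def __init__(self, ways):
        self.tags = [None] * ways        # block tag stored in each way
        self.rrpv = [RRPV_MAX] * ways    # empty ways look "distant"

    def access(self, tag):
        """Return True on a hit, False on a miss (the block is then filled)."""
        if tag in self.tags:
            self.rrpv[self.tags.index(tag)] = 0   # hit: near-immediate reuse
            return True
        # Miss: age all blocks until at least one is predicted distant.
        while RRPV_MAX not in self.rrpv:
            self.rrpv = [v + 1 for v in self.rrpv]
        victim = self.rrpv.index(RRPV_MAX)        # first distant block
        self.tags[victim] = tag
        self.rrpv[victim] = RRPV_INSERT
        return False


class LRUSet:
    """One cache set under true LRU, used here only as a baseline."""

    def __init__(self, ways):
        self.ways = ways
        self.blocks = OrderedDict()      # oldest entry first

    def access(self, tag):
        if tag in self.blocks:
            self.blocks.move_to_end(tag)          # mark most recently used
            return True
        if len(self.blocks) >= self.ways:
            self.blocks.popitem(last=False)       # evict least recently used
        self.blocks[tag] = True
        return False


def scan_trace(rounds=10, scan_len=4):
    """Two hot blocks re-referenced each round, followed by a one-shot scan."""
    trace, next_id = [], 0
    for _ in range(rounds):
        trace += ["a", "b", "a", "b"]
        trace += [f"s{next_id + i}" for i in range(scan_len)]
        next_id += scan_len
    return trace


def count_hits(cache, trace):
    return sum(cache.access(tag) for tag in trace)


if __name__ == "__main__":
    trace = scan_trace()
    print("SRRIP hits:", count_hits(SRRIPSet(4), trace))   # prints 38
    print("LRU hits:  ", count_hits(LRUSet(4), trace))     # prints 20
```

On this synthetic 80-access trace, the scan evicts the hot blocks under LRU every round, while the SRRIP sketch keeps them resident because scan blocks are inserted with a long prediction and are chosen as victims first. This holds only for short scans: with this 4-way set, a scan longer than four blocks would eventually age the hot blocks to "distant" as well.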

Index Terms

1. Computer systems organization
  1. Architectures
    1. Parallel architectures
2. Hardware
  1. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory

Published In

ISCA '10: Proceedings of the 37th annual international symposium on Computer architecture

June 2010

520 pages

ISBN:9781450300537

DOI:10.1145/1815961

General Chair: André Seznec (INRIA Rennes)

Program Chairs: Uri Weiser (Technion), Ronny Ronen (Intel)

ACM SIGARCH Computer Architecture News Volume 38, Issue 3
ISCA '10
June 2010
508 pages
ISSN:0163-5964
DOI:10.1145/1816038

Copyright © 2010 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

In-Cooperation

IEEE CS

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

ISCA '10

Sponsor:

SIGARCH

ISCA '10: The 37th Annual International Symposium on Computer Architecture

June 19 - 23, 2010

Saint-Malo, France

Acceptance Rates

Overall Acceptance Rate 543 of 3,203 submissions, 17%
