Capturing dynamic memory reference behavior with adaptive cache topology

Published: 01 October 1998

Abstract

Memory references exhibit locality and are therefore not uniformly distributed across the sets of a cache. This skew reduces the effectiveness of a cache because a considerable number of less-recently-used lines remain cached even though they are unlikely to be re-referenced before they are replaced. In this paper, we describe a technique that dynamically identifies these less-recently-used lines and uses the cache frames they occupy to approximate the global least-recently-used replacement policy more accurately, while maintaining the fast access time of a direct-mapped cache. We also explore using these underutilized cache frames to reduce cache misses through data prefetching. In the proposed design, the locations in which a line can reside are not predetermined. Instead, the cache is dynamically partitioned into groups of cache lines. Because both the total number of groups and the associativity of each group adapt to the dynamic reference pattern, we call this design the adaptive group-associative cache. Performance evaluation using trace-driven simulations of the TPC-C benchmark and selected programs from the SPEC95 benchmark suite shows that the group-associative cache achieves a hit ratio that is consistently better than that of a 4-way set-associative cache. For some of the workloads, the hit ratio approaches that of a fully-associative cache.
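
The gap between a direct-mapped cache and a fully-associative cache with global LRU replacement, which the abstract uses as its two reference points, can be illustrated with a minimal trace-driven sketch. This is not the adaptive group-associative design from the paper. The function names, the 8-frame cache size, and the trace are hypothetical values chosen only to make two line addresses collide in one direct-mapped frame.

```python
from collections import OrderedDict


def direct_mapped_hits(trace, num_frames):
    """Count hits when each line address maps to exactly one frame."""
    frames = [None] * num_frames
    hits = 0
    for line in trace:
        idx = line % num_frames  # fixed placement: low-order bits pick the frame
        if frames[idx] == line:
            hits += 1
        else:
            frames[idx] = line  # a miss evicts whatever occupied the frame
    return hits


def fully_associative_lru_hits(trace, num_frames):
    """Count hits when a line may occupy any frame, with global LRU replacement."""
    cache = OrderedDict()  # keys ordered from least to most recently used
    hits = 0
    for line in trace:
        if line in cache:
            hits += 1
            cache.move_to_end(line)  # mark as most recently used
        else:
            if len(cache) >= num_frames:
                cache.popitem(last=False)  # evict the least recently used line
            cache[line] = True
    return hits


# Hypothetical trace: lines 0 and 8 map to the same frame of an 8-frame
# direct-mapped cache and keep evicting each other, while the other
# frames stay unused until lines 1, 2 and 3 arrive.
NUM_FRAMES = 8
trace = [0, 8] * 4 + [1, 2, 3]

dm = direct_mapped_hits(trace, NUM_FRAMES)
fa = fully_associative_lru_hits(trace, NUM_FRAMES)
print(f"direct-mapped hits: {dm} of {len(trace)}")
print(f"fully-associative LRU hits: {fa} of {len(trace)}")
```

On this toy trace the direct-mapped cache never hits, because the two conflicting lines alternate in one frame while the other frames sit idle. The fully-associative cache hits on every re-reference after the first two misses. Idle frames of this kind are what the abstract describes as underutilized.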


Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 01 October 1998
Published in SIGOPS Volume 32, Issue 5
