Data prefetching in a cache hierarchy with high bandwidth and capacity

Published: 01 September 2007

Abstract

In this paper we evaluate four hardware data prefetchers in the context of a high-performance three-level on-chip cache hierarchy with high bandwidth and capacity. We consider two classic prefetchers (Sequential Tagged and Stride) and two correlating prefetchers: PC/DC, a recent method with a superior score and small tables, and P-DFCM, a new method. Like PC/DC, P-DFCM focuses on local delta sequences, but it is based on the DFCM value predictor. We explore different prefetch degrees and distances. Running SPEC2000, Olden and IAbench applications, we find that this kind of cache hierarchy turns prefetching aggressiveness into success for all four prefetchers. Sequential Tagged performs best and deserves further attention to cut its losses in some applications. P-DFCM matches or even improves on the PC/DC results, using far fewer table accesses while keeping table sizes low.
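
To make the idea of predicting from local delta sequences more concrete, the following is a minimal sketch of a DFCM-style predictor used as a prefetcher. It is an illustration under stated assumptions, not the P-DFCM design evaluated in the paper: the class name `DeltaFcmPrefetcher`, the parameters `order` and `degree`, and the use of unbounded Python dictionaries in place of small hashed hardware tables are all hypothetical choices. A first-level table keeps, per load PC, the last address and the most recent deltas; a second-level table maps that delta context to the delta that followed it last time.

```python
class DeltaFcmPrefetcher:
    """Illustrative DFCM-style delta predictor (not the paper's exact design)."""

    def __init__(self, order=2, degree=2):
        self.order = order        # assumed: number of recent deltas used as context
        self.degree = degree      # prefetch degree: max addresses predicted per access
        self.first_level = {}     # pc -> (last address, tuple of recent deltas)
        self.second_level = {}    # delta context -> delta that followed it last time

    def access(self, pc, addr):
        """Train on one access by instruction `pc`; return predicted addresses."""
        last_addr, history = self.first_level.get(pc, (None, ()))
        if last_addr is not None:
            delta = addr - last_addr
            if len(history) == self.order:
                # Learn that this delta context was followed by `delta`.
                self.second_level[history] = delta
            # Slide the per-PC (local) delta history forward.
            history = (history + (delta,))[-self.order:]
        self.first_level[pc] = (addr, history)

        # Predict by chaining second-level lookups, at most `degree` steps.
        predictions = []
        context, next_addr = history, addr
        for _ in range(self.degree):
            if len(context) < self.order or context not in self.second_level:
                break
            step = self.second_level[context]
            next_addr += step
            predictions.append(next_addr)
            context = (context + (step,))[-self.order:]
        return predictions


if __name__ == "__main__":
    prefetcher = DeltaFcmPrefetcher(order=2, degree=2)
    # One load instruction whose address deltas repeat as +8, +8, +48.
    for address in [0, 8, 16, 64, 72, 80, 128]:
        print(address, prefetcher.access(pc=0x400, addr=address))
```

In this sketch a plain stride predictor would mispredict every third access of the +8, +8, +48 pattern, whereas the delta context disambiguates it once one full period has been observed. Real hardware would bound both tables and index the second level with a hash of the deltas, which this sketch deliberately omits.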

    Published In

    ACM SIGARCH Computer Architecture News  Volume 35, Issue 4
    September 2007
    59 pages
    ISSN:0163-5964
    DOI:10.1145/1327312

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tag

    1. hardware data prefetching

    Qualifiers

    • Column

    Funding Sources

    • Spanish Ministry of Education and Science/European Union FEDER
