Open access

Spectral prefetcher: An effective mechanism for L2 cache prefetching

Published: 01 December 2005

Abstract

Effective data prefetching requires accurate mechanisms to predict the patterns embedded in miss reference behavior. This paper proposes a novel prefetching mechanism, called the spectral prefetcher (SP), that accurately identifies the pattern by dynamically adjusting to its frequency. The proposed mechanism divides the memory address space into tag concentration zones (TCzones) and, within each TCzone, detects either the pattern of tags (higher-order bits) or the pattern of strides (differences between consecutive tags). The prefetcher dynamically determines whether the tag pattern or the stride pattern will make prefetching more effective and switches accordingly. To measure the performance of our scheme, we use a cycle-accurate, aggressive out-of-order simulator that models bus occupancy, bus protocol, and limited bandwidth. Our experimental results show a performance improvement of 1.59 on average, and 2.10 at best, for the memory-intensive benchmarks we studied. Further, we show that SP outperforms the previously proposed scheme, at twice the size of SP, by 39%, and outperforms a larger L2 cache of equivalent storage area by 31%.
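The abstract names the ingredients (per-zone tag histories, a tag pattern, a stride pattern, and a switch between them) but not the hardware details. The following is only a minimal software sketch of that idea, not the paper's design. Everything concrete in it is an assumption made for illustration: the bit layout (`BLOCK_BITS`, `ZONE_BITS`), the history depth (`HISTORY`), the exact-repetition test in `find_period`, the fixed "tags first, then strides" preference (the paper switches dynamically), and the names `split`, `predict_next_tag`, and `SketchPrefetcher`.

```python
from collections import defaultdict, deque

BLOCK_BITS = 6   # assumed: 64-byte cache blocks
ZONE_BITS = 4    # assumed: 16 zones, selected by the bits above the block offset
HISTORY = 8      # assumed: number of miss tags remembered per zone


def split(addr):
    """Return (zone, tag) for a miss address under the assumed bit layout."""
    zone = (addr >> BLOCK_BITS) & ((1 << ZONE_BITS) - 1)
    tag = addr >> (BLOCK_BITS + ZONE_BITS)  # the higher-order bits
    return zone, tag


def find_period(seq):
    """Smallest p such that seq repeats with period p over the whole window.

    Requires at least two full repetitions; returns None if no such p exists.
    """
    n = len(seq)
    for p in range(1, n // 2 + 1):
        if all(seq[i] == seq[i - p] for i in range(p, n)):
            return p
    return None


def predict_next_tag(tags):
    """Predict the next tag from the tag pattern, else from the stride pattern."""
    tags = list(tags)
    p = find_period(tags)
    if p is not None:
        return tags[-p], "tags"
    # Strides are the differences between consecutive tags.
    strides = [b - a for a, b in zip(tags, tags[1:])]
    p = find_period(strides)
    if p is not None:
        return tags[-1] + strides[-p], "strides"
    return None, None


class SketchPrefetcher:
    """Keeps one short tag history per zone and proposes a prefetch address."""

    def __init__(self):
        self.history = defaultdict(lambda: deque(maxlen=HISTORY))

    def on_miss(self, addr):
        zone, tag = split(addr)
        hist = self.history[zone]
        hist.append(tag)
        next_tag, _ = predict_next_tag(hist)
        if next_tag is None:
            return None  # no pattern detected yet, so no prefetch
        # Rebuild a block-aligned address in the same zone.
        return (next_tag << (BLOCK_BITS + ZONE_BITS)) | (zone << BLOCK_BITS)
```

Under these assumptions, a zone whose misses carry tags 10, 12, 14, 16 has no repeating tag pattern but a constant stride of 2, so the sketch falls back to the stride pattern and proposes tag 18. A zone alternating between tags 3 and 7 is caught directly by the tag pattern.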

    Published In

    ACM Transactions on Architecture and Code Optimization  Volume 2, Issue 4
    December 2005
    116 pages
    ISSN:1544-3566
    EISSN:1544-3973
    DOI:10.1145/1113841
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Author Tags

    1. L2 cache
    2. Prefetch
    3. absolute and differential domain
    4. adaptive
    5. autocorrelation
    6. frequency
    7. memory

    Qualifiers

    • Article

    Cited By

    • (2016) A Survey of Recent Prefetching Techniques for Processor Caches. ACM Computing Surveys 49:2 (1-35). DOI: 10.1145/2907071. Online publication date: 2-Aug-2016
    • (2011) Evaluating the Memory System Performance of Software-Initiated Inter-core LLC Prepushing. Proceedings of the 2011 IEEE Ninth International Symposium on Parallel and Distributed Processing with Applications Workshops (216-221). DOI: 10.1109/ISPAW.2011.56. Online publication date: 26-May-2011
    • (2010) Simple but Effective Heterogeneous Main Memory with On-Chip Memory Controller Support. Proceedings of the 2010 ACM/IEEE International Conference for High Performance Computing, Networking, Storage and Analysis (1-11). DOI: 10.1109/SC.2010.50. Online publication date: 13-Nov-2010
    • (2008) Exploiting producer patterns and L2 cache for timely dependence-based prefetching. 2008 IEEE International Conference on Computer Design (685-692). DOI: 10.1109/ICCD.2008.4751935. Online publication date: Oct-2008
