A Primer on Hardware Prefetching
June 2014
Publisher: Morgan & Claypool Publishers
ISBN: 978-1-60845-952-0
Published: 01 June 2014
Pages: 68
Abstract

Since the 1970s, microprocessor-based digital platforms have been riding Moore's law, allowing density to double for the same area roughly every two years. However, whereas microprocessor fabrication has focused on increasing instruction execution rate, memory fabrication technologies have focused primarily on increasing capacity, with negligible increase in speed. This divergence in performance between processors and memory has led to a phenomenon referred to as the Memory Wall. To overcome the memory wall, designers have resorted to a hierarchy of cache memory levels, which rely on the principle of memory access locality to reduce the observed memory access time and the performance gap between processors and memory. Unfortunately, important workload classes exhibit adverse memory access patterns that baffle the simple policies built into modern cache hierarchies to move instructions and data across cache levels. As such, processors often spend much time idling upon a demand fetch of memory blocks that miss in higher cache levels. Prefetching (predicting future memory accesses and issuing requests for the corresponding memory blocks in advance of explicit accesses) is an effective approach to hide memory access latency. A myriad of prefetching techniques have been proposed, and nearly every modern processor includes some hardware prefetching mechanisms targeting simple and regular memory access patterns. This primer offers an overview of the various classes of hardware prefetchers for instructions and data proposed in the research literature, and presents examples of techniques incorporated into modern microprocessors.

Cited By

  1. Eris F, Louis M, Eris K, Abellán J and Joshi A (2022). Puppeteer: A Random Forest Based Manager for Hardware Prefetchers Across the Memory Hierarchy, ACM Transactions on Architecture and Code Optimization, 20:1, (1-25), Online publication date: 31-Mar-2023.
  2. Khan T, Ugur M, Nathella K, Sunwoo D, Litz H, Jiménez D and Kasikci B. Whisper: Profile-Guided Branch Misprediction Elimination for Data Center Applications. Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, (19-34)
  3. Khan T, Zhang D, Sriraman A, Devietti J, Pokam G, Litz H and Kasikci B. Ripple. Proceedings of the 48th Annual International Symposium on Computer Architecture, (734-747)
  4. Ye C, Xu Y, Shen X, Liao X, Jin H and Solihin Y. Supporting legacy libraries on non-volatile memory. Proceedings of the 48th Annual International Symposium on Computer Architecture, (443-455)
  5. Vicarte J, Shome P, Nayak N, Trippel C, Morrison A, Kohlbrenner D and Fletcher C. Opening pandora's box. Proceedings of the 48th Annual International Symposium on Computer Architecture, (347-360)
  6. Naithani A, Ainsworth S, Jones T and Eeckhout L. Vector runahead. Proceedings of the 48th Annual International Symposium on Computer Architecture, (195-208)
  7. Ros A and Jimborean A. A cost-effective entangling prefetcher for instructions. Proceedings of the 48th Annual International Symposium on Computer Architecture, (99-111)
  8. Ainsworth S and Jones T (2018). An Event-Triggered Programmable Prefetcher for Irregular Workloads, ACM SIGPLAN Notices, 53:2, (578-592), Online publication date: 30-Nov-2018.
  9. Ainsworth S and Jones T. An Event-Triggered Programmable Prefetcher for Irregular Workloads. Proceedings of the Twenty-Third International Conference on Architectural Support for Programming Languages and Operating Systems, (578-592)
  10. Mittal S (2016). A Survey of Recent Prefetching Techniques for Processor Caches, ACM Computing Surveys, 49:2, (1-35), Online publication date: 30-Jun-2017.
  11. Hong B, Kim G, Ahn J, Kwon Y, Kim H and Kim J. Accelerating Linked-list Traversal Through Near-Data Processing. Proceedings of the 2016 International Conference on Parallel Architectures and Compilation, (113-124)
  12. Ainsworth S and Jones T. Graph Prefetching Using Data Structure Knowledge. Proceedings of the 2016 International Conference on Supercomputing, (1-11)
  13. Peled L, Mannor S, Weiser U and Etsion Y (2015). Semantic locality and context-based prefetching using reinforcement learning, ACM SIGARCH Computer Architecture News, 43:3S, (285-297), Online publication date: 4-Jan-2016.
  14. Miguel J, Albericio J, Moshovos A and Jerger N. Doppelgänger. Proceedings of the 48th International Symposium on Microarchitecture, (50-61)
  15. Peled L, Mannor S, Weiser U and Etsion Y. Semantic locality and context-based prefetching using reinforcement learning. Proceedings of the 42nd Annual International Symposium on Computer Architecture, (285-297)
Contributors
  • University of Michigan, Ann Arbor
