Long short term memory based hardware prefetcher: a case study

Authors:

Xiaochen Guo

MEMSYS '17: Proceedings of the International Symposium on Memory Systems

Pages 305 - 311

https://doi.org/10.1145/3132402.3132405

Published: 02 October 2017

Abstract

Hardware prefetching is an efficient mechanism to hide cache miss penalties. Accuracy, coverage, and timeliness are the three primary metrics for evaluating prefetcher performance. Highly accurate hardware prefetchers are needed to predict complex memory access patterns in multicore systems. In this paper, we propose a long short term memory (LSTM) prefetcher, a neural network based hardware prefetcher. Offline experiments show that the proposed LSTM prefetcher achieves higher accuracy and better coverage on a set of evaluated traces.
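To make the accuracy and coverage metrics concrete, the following is a minimal sketch of an offline trace evaluation. It assumes the commonly used definitions (accuracy = useful prefetches / issued prefetches, coverage = misses covered by a prefetch / total misses), which may differ in detail from the definitions used in the paper. The names `evaluate`, `next_line_prefetcher`, and `buffer_size` are hypothetical, and a simple next-line predictor stands in for the LSTM model. Timeliness is not modeled.

```python
from collections import OrderedDict


def next_line_prefetcher(block):
    """Hypothetical stand-in predictor: prefetch the next sequential block."""
    return [block + 1]


def evaluate(miss_trace, predict, buffer_size=16):
    """Offline accuracy and coverage over a trace of missed cache-block numbers.

    Assumption: every entry in miss_trace would miss without prefetching.
    """
    pending = OrderedDict()  # prefetched blocks not yet used, oldest first
    issued = 0               # total prefetches issued
    useful = 0               # prefetched blocks that were later demanded

    for block in miss_trace:
        if block in pending:
            # This miss is covered by an earlier prefetch.
            useful += 1
            del pending[block]
        for candidate in predict(block):
            if candidate not in pending:
                issued += 1
                pending[candidate] = True
                if len(pending) > buffer_size:
                    pending.popitem(last=False)  # evict the oldest prefetch

    accuracy = useful / issued if issued else 0.0
    coverage = useful / len(miss_trace) if miss_trace else 0.0
    return accuracy, coverage


if __name__ == "__main__":
    trace = [10, 11, 12, 20, 21, 40]
    acc, cov = evaluate(trace, next_line_prefetcher)
    print(f"accuracy={acc:.2f} coverage={cov:.2f}")
```

Any predictor that maps a block number to a list of candidate blocks can be passed as `predict`, so a learned model could be compared against a simple baseline on the same trace.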

Index Terms

Long short term memory based hardware prefetcher: a case study
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
2. Hardware
  1. Integrated circuits
    1. Semiconductor memory
      1. Dynamic memory

Published In

MEMSYS '17: Proceedings of the International Symposium on Memory Systems

October 2017

409 pages

ISBN:9781450353359

DOI:10.1145/3132402

General Chair:
Bruce Jacob
University of Maryland

Copyright © 2017 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article

Funding Sources

Lehigh University

Conference

MEMSYS 2017

MEMSYS 2017: The International Symposium on Memory Systems, 2017

October 2 - 5, 2017

Alexandria, Virginia
