Predicting memory accesses: the road to compact ML-driven prefetcher

Authors:

Ajitesh Srivastava,

Angelos Lazaris,

Benjamin Brooks,

Rajgopal Kannan,

Viktor K. Prasanna

MEMSYS '19: Proceedings of the International Symposium on Memory Systems

Pages 461 - 470

https://doi.org/10.1145/3357526.3357549

Published: 30 September 2019

Abstract

With the advent of fast processors, TPUs, accelerators, and heterogeneous architectures, computation is no longer the only bottleneck. In fact, for many applications, execution speed is limited by memory performance. Improving memory performance requires more accurate prefetching. While sophisticated machine learning algorithms have been shown to predict memory accesses with high accuracy, they suffer from several issues that prevent them from being practical hardware prefetchers. These issues center on model size, which leads to high memory requirements, high latency, and difficulty in online retraining. As a first step towards building ML-based prefetchers, we propose a compressed-LSTM approach for accurate memory access prediction. With a novel compression technique based on output encoding, we show that for the problem of predicting one of n memory locations, our technique yields an O(n/log n) compression factor over the traditional LSTM approach. We further demonstrate through experiments on several benchmarks that the drop in prediction accuracy due to compression is small and that training is fast. The actual compression obtained is on the order of 100×.
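To make the O(n/log n) factor concrete, the following is a minimal sketch, not the authors' implementation. It assumes the output encoding is a plain binary code of the location index, so the output layer needs about log2(n) units instead of the n units of a one-hot softmax layer. The function names, the hidden size of 128, and the choice n = 2^16 are hypothetical, and only output-layer parameters are counted.

```python
from typing import List, Sequence


def bits_needed(n: int) -> int:
    """Number of binary outputs needed to distinguish n locations."""
    # (n - 1).bit_length() equals ceil(log2(n)) for n >= 2, without float error.
    return max(1, (n - 1).bit_length())


def encode(index: int, n_bits: int) -> List[int]:
    """Binary target vector for a location index, least significant bit first."""
    return [(index >> i) & 1 for i in range(n_bits)]


def decode(probs: Sequence[float], threshold: float = 0.5) -> int:
    """Recover a location index by thresholding per-bit output probabilities."""
    return sum(1 << i for i, p in enumerate(probs) if p >= threshold)


def output_layer_params(hidden: int, outputs: int) -> int:
    """Weights plus biases of a dense layer from `hidden` units to `outputs` units."""
    return hidden * outputs + outputs


# Hypothetical sizes chosen only for illustration.
n = 2 ** 16      # number of candidate memory locations
hidden = 128     # assumed LSTM hidden size

one_hot_params = output_layer_params(hidden, n)
binary_params = output_layer_params(hidden, bits_needed(n))

# For the output layer alone the ratio is n / ceil(log2(n)).
ratio = one_hot_params / binary_params
print(bits_needed(n), one_hot_params, binary_params, ratio)
```

The ratio here covers only the output layer under the stated assumptions; the overall compression of a full model also depends on the input and recurrent layers, which this sketch leaves out.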

Published In

MEMSYS '19: Proceedings of the International Symposium on Memory Systems

September 2019

517 pages

ISBN:9781450372060

DOI:10.1145/3357526

Copyright © 2019 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Conference

MEMSYS '19

MEMSYS '19: The International Symposium on Memory Systems

September 30 - October 3, 2019

Washington, District of Columbia, USA
