research-article

Open access

Fine-grained address segmentation for attention-based variable-degree prefetching

Authors: Pengmiao Zhang, Ajitesh Srivastava, Rajgopal Kannan, and Viktor K. Prasanna

CF '22: Proceedings of the 19th ACM International Conference on Computing Frontiers

May 2022

Pages 103 - 112

https://doi.org/10.1145/3528416.3530236

Published: 17 May 2022

Abstract

Machine learning algorithms have shown potential to improve prefetching performance by accurately predicting future memory accesses. Existing approaches borrow their modeling from text prediction, treating prefetching as a classification problem for sequence prediction. However, the vast and sparse memory address space leads to a large vocabulary, which makes this modeling impractical. The number and order of outputs for multiple cache line prefetching are also fundamentally different from those in text prediction.

We propose TransFetch, a novel way to model prefetching. To reduce vocabulary size, we use fine-grained address segmentation as input. To predict unordered sets of future addresses, we use delta bitmaps for multiple outputs. We apply an attention-based network to learn the mapping between input and output. Prediction experiments demonstrate that address segmentation achieves 26% - 36% higher F1-score than delta inputs and 15% - 24% higher F1-score than page & offset inputs for the SPEC 2006, SPEC 2017, and GAP benchmarks. Simulation results show that TransFetch achieves a 38.75% IPC improvement compared with no prefetching, outperforming the best-performing rule-based prefetcher, BOP, by 10.44% and the ML-based prefetcher Voyager by 6.64%.
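The two encodings named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the cache line size, segment width, segment count, and delta range below are hypothetical values chosen for illustration, and the function names `segment_address` and `delta_bitmap` are invented here. The sketch shows how splitting a block address into small fixed-width segments bounds the per-token vocabulary, and how a bitmap over nearby block deltas can label an unordered set of future accesses.

```python
from typing import List

# All constants are assumptions for illustration, not values from the paper.
BLOCK_BITS = 6     # assumed 64-byte cache lines
SEGMENT_BITS = 4   # assumed segment width: each token takes one of 16 values
NUM_SEGMENTS = 4   # assumed number of segments kept per block address
DELTA_RANGE = 8    # assumed bitmap coverage: deltas in [-8, 8], excluding 0


def segment_address(addr: int) -> List[int]:
    """Split a block address into fixed-width segments, least significant first."""
    block = addr >> BLOCK_BITS
    mask = (1 << SEGMENT_BITS) - 1
    return [(block >> (i * SEGMENT_BITS)) & mask for i in range(NUM_SEGMENTS)]


def delta_bitmap(current: int, future: List[int]) -> List[int]:
    """Mark which block deltas (relative to the current access) occur in `future`.

    The result is order-free and duplicate-free, so it describes a set of
    prefetch candidates of variable size.
    """
    cur_block = current >> BLOCK_BITS
    bitmap = [0] * (2 * DELTA_RANGE)
    for addr in future:
        delta = (addr >> BLOCK_BITS) - cur_block
        if delta == 0 or abs(delta) > DELTA_RANGE:
            continue  # same block or out of the covered range
        # Negative deltas fill indices 0..DELTA_RANGE-1, positive ones the rest.
        index = delta + DELTA_RANGE if delta < 0 else delta + DELTA_RANGE - 1
        bitmap[index] = 1
    return bitmap


if __name__ == "__main__":
    print(segment_address(0x12345 << BLOCK_BITS))
    print(delta_bitmap(100 << BLOCK_BITS, [101 << BLOCK_BITS, 98 << BLOCK_BITS]))
```

Under these assumptions, each input token has only 16 possible values regardless of how large the address space is, and the number of set bits in the output determines how many lines would be prefetched.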



Index Terms

Fine-grained address segmentation for attention-based variable-degree prefetching
1. Computer systems organization
  1. Architectures
    1. Other architectures
      1. Neural networks
  2. Dependable and fault-tolerant systems and networks
    1. Processors and memory architectures
2. Information systems
  1. Information systems applications
    1. Data mining


Published In


CF '22: Proceedings of the 19th ACM International Conference on Computing Frontiers

May 2022

321 pages

ISBN:9781450393386

DOI:10.1145/3528416

General Chair: Luca Sterpone (Politecnico di Torino, IT)
Program Chairs: Andrea Bartolini (Università di Bologna, IT), Anastasiia Butko (Lawrence Berkeley National Laboratory)

Copyright © 2022 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGMICRO: ACM Special Interest Group on Microarchitectural Research and Processing

Publisher

Association for Computing Machinery

New York, NY, United States


Conference

CF '22: 19th ACM International Conference on Computing Frontiers
Sponsor: SIGMICRO
May 17 - 22, 2022
Turin, Italy

Acceptance Rates

Overall Acceptance Rate 273 of 785 submissions, 35%

Article Metrics

Total Citations: 8
Total Downloads: 599
Downloads (Last 12 months): 324
Downloads (Last 6 weeks): 36

Cited By

Gupta N, Kannan N, Zhang P, Prasanna V (2024). TabConv: Low-Computation CNN Inference via Table Lookups. Proceedings of the 21st ACM International Conference on Computing Frontiers, 180-188. Online publication date: 7-May-2024
https://dl.acm.org/doi/10.1145/3649153.3649212
Jang J, Shim S, Egay V, Lee J, Park J, Chae S, Kang U (2023). Accurate Open-Set Recognition for Memory Workload. ACM Transactions on Knowledge Discovery from Data 17:9, 1-14. Online publication date: 15-Jun-2023
https://dl.acm.org/doi/10.1145/3597027
Marino K, Zhang P, Prasanna V (2023). ME-ViT: A Single-Load Memory-Efficient FPGA Accelerator for Vision Transformers. 2023 IEEE 30th International Conference on High Performance Computing, Data, and Analytics (HiPC), 213-223. Online publication date: 18-Dec-2023
https://doi.org/10.1109/HiPC58850.2023.00039
Gupta N, Zhang P, Kannan R, Prasanna V (2023). PaCKD: Pattern-Clustered Knowledge Distillation for Compressing Memory Access Prediction Models. 2023 IEEE High Performance Extreme Computing Conference (HPEC), 1-7. Online publication date: 25-Sep-2023
https://doi.org/10.1109/HPEC58863.2023.10363610
Gorle A, Zhang P, Kannan R, Prasanna V (2023). G-MAP: A Graph Neural Network-Based Framework for Memory Access Prediction. 2023 IEEE High Performance Extreme Computing Conference (HPEC), 1-7. Online publication date: 25-Sep-2023
https://doi.org/10.1109/HPEC58863.2023.10363605
Yoo H, Kim J, Han T (2023). RL-Based Cache Replacement: A Modern Interpretation of Belady’s Algorithm With Bypass Mechanism and Access Type Analysis. IEEE Access 11, 145238-145253. Online publication date: 2023
https://doi.org/10.1109/ACCESS.2023.3346790
Zhang P, Kannan R, Srivastava A, Nori A, Prasanna V (2022). ReSemble: Reinforced Ensemble Framework for Data Prefetching. SC22: International Conference for High Performance Computing, Networking, Storage and Analysis, 1-14. Online publication date: Nov-2022
https://doi.org/10.1109/SC41404.2022.00086
Zhang P, Kannan R, Tong X, Nori A, Prasanna V (2022). SHARP: Software Hint-Assisted Memory Access Prediction for Graph Analytics. 2022 IEEE High Performance Extreme Computing Conference (HPEC), 1-8. Online publication date: 19-Sep-2022
https://doi.org/10.1109/HPEC55821.2022.9926307
