Predicting memory accesses: the road to compact ML-driven prefetcher

A Srivastava, A Lazaris, B Brooks, R Kannan… - Proceedings of the International Symposium on Memory Systems, 2019 - dl.acm.org
With the advent of fast processors, TPUs, accelerators, and heterogeneous architectures, computation is no longer the only bottleneck. In fact, for many applications, speed of execution is limited by memory performance. Addressing memory performance requires more accurate prefetching. While sophisticated machine learning algorithms have been shown to predict memory accesses with high accuracy, they suffer from several issues that prevent them from being practical hardware prefetchers. These issues center on the size of the model, which results in high memory requirements, high latency, and difficulty in online retraining. As a first step towards building ML-based prefetchers, we propose a compressed-LSTM approach for accurate memory access prediction. With a novel compression technique based on output encoding, we show that for the problem of predicting one of n memory locations, our technique achieves an O(n / log n) compression factor over the traditional LSTM approach. We further demonstrate through experiments on several benchmarks that the prediction accuracy drop due to compression is small and that training is fast. The actual compression obtained is on the order of 100×.
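To make the output-encoding idea concrete: a traditional LSTM prefetcher ends in an n-way softmax layer of size hidden_dim × n, whereas encoding the target location as a binary string needs only ⌈log2 n⌉ outputs, which is where the O(n / log n) reduction in output-layer parameters comes from. The following is a minimal, illustrative PyTorch sketch of that scheme, not the authors' implementation; the class name, dimensions, and the decode_bits helper are assumptions made for the example, and the plain embedding lookup on the input side is kept for simplicity.

```python
import math
import torch
import torch.nn as nn

class CompressedLSTMPrefetcher(nn.Module):
    """Sketch of an LSTM that predicts one of n memory locations via a
    log2(n)-bit binary output encoding instead of an n-way softmax.
    Illustrative only; not the paper's exact architecture."""

    def __init__(self, n_locations, embed_dim=64, hidden_dim=128):
        super().__init__()
        self.n_bits = math.ceil(math.log2(n_locations))
        self.embed = nn.Embedding(n_locations, embed_dim)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, batch_first=True)
        # Output layer is hidden_dim x ceil(log2 n) instead of hidden_dim x n:
        # the source of the O(n / log n) compression of the output layer.
        self.out = nn.Linear(hidden_dim, self.n_bits)

    def forward(self, access_ids):
        # access_ids: (batch, seq_len) integer ids of past memory accesses
        h, _ = self.lstm(self.embed(access_ids))
        # Logits for each bit of the predicted next-access id.
        return self.out(h[:, -1])

def decode_bits(logits):
    # Threshold each bit and reassemble the integer location id.
    bits = (logits > 0).long()
    weights = 2 ** torch.arange(bits.shape[-1] - 1, -1, -1, device=bits.device)
    return (bits * weights).sum(dim=-1)
```

Training such a model would pair the bit logits with nn.BCEWithLogitsLoss against the binary encoding of the true next-access id, rather than cross-entropy over n classes. Note that the abstract's reported ~100× figure is the overall compression measured in the paper's experiments, not a property of this sketch.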