Optimizing sparse matrix-vector multiplication for large-scale data analytics

D Buono, F Petrini, F Checconi, X Liu, X Que… - Proceedings of the …, 2016 - dl.acm.org
D Buono, F Petrini, F Checconi, X Liu, X Que, C Long, TC Tuan
Proceedings of the 2016 International Conference on Supercomputing, 2016dl.acm.org
Sparse Matrix-Vector multiplication (SpMV) is a fundamental kernel, used by a large class of
numerical algorithms. Emerging big-data and machine learning applications are propelling
a renewed interest in SpMV algorithms that can tackle massive amount of unstructured data--
-rapidly approaching the TeraByte range---with predictable, high performance. In this paper
we describe a new methodology to design SpMV algorithms for shared memory
multiprocessors (SMPs) that organizes the original SpMV algorithm into two distinct phases …
Sparse Matrix-Vector multiplication (SpMV) is a fundamental kernel, used by a large class of numerical algorithms. Emerging big-data and machine learning applications are propelling a renewed interest in SpMV algorithms that can tackle massive amount of unstructured data---rapidly approaching the TeraByte range---with predictable, high performance. In this paper we describe a new methodology to design SpMV algorithms for shared memory multiprocessors (SMPs) that organizes the original SpMV algorithm into two distinct phases. In the first phase we build a scaled matrix, that is reduced in the second phase, providing numerous opportunities to exploit memory locality. Using this methodology, we have designed two algorithms. Our experiments on irregular big-data matrices (an order of magnitude larger than the current state of the art) show a quasi-optimal scaling on a large-scale POWER8 SMP system, with an average performance speedup of 3.8x, when compared to an equally optimized version of the CSR algorithm. In terms of absolute performance, with our implementation, the POWER8 SMP system is comparable to a 256-node cluster. In terms of size, it can process matrices with up to 68 billion edges, an order of magnitude larger than state-of-the-art clusters.
ACM Digital Library