DOI: 10.1109/ICPADS.2015.97
Article

LSRB-CSR: A Low Overhead Storage Format for SpMV on the GPU Systems

Published: 14 December 2015

Abstract

Sparse matrix-vector multiplication (SpMV) is a basic building block of many scientific applications. Several GPU-accelerated SpMV algorithms for the CSR format suffer from workload imbalance on irregular matrices. In this paper, we propose a new auxiliary-array-assisted CSR format called local segmented reduction based CSR (LSRB-CSR), which enables synchronization-free preprocessing and an efficient SpMV algorithm using lightweight auxiliary arrays. It is efficient for both regular and irregular matrices, with tiny preprocessing overhead. We compare our LSRB-CSR-based SpMV algorithm with the CSR-based SpMV from cuSPARSE, the segmented-reduction-based SpMV algorithm adopted by the CUDPP library, and the CSR5-based SpMV algorithm on both regular and irregular sparse matrices. Compared to cuSPARSE, our LSRB-CSR-based SpMV algorithm improves performance by 26% on regular matrices and by up to 4750% on irregular matrices. Compared to CUDPP, it improves average SpMV performance by 210% on regular matrices and by 250% on irregular matrices. Our algorithm achieves performance comparable to the CSR5-based SpMV algorithm on regular matrices and better performance on irregular matrices. Experimental results show that the conversion overhead from CSR to LSRB-CSR is, on average, only 1/10 of the conversion overhead from CSR to CSR5.
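As background for the abstract, the following is a minimal CPU-side sketch in Python of the two baseline ideas it mentions: row-wise CSR SpMV, where the work per row depends on the row length, and SpMV expressed as a segmented reduction over the nonzeros. It is not the LSRB-CSR algorithm, whose auxiliary arrays are not described in the abstract. The function names and the array names `row_ptr`, `col_idx`, and `vals` are illustrative assumptions that follow the usual CSR convention.

```python
def spmv_csr_rowwise(row_ptr, col_idx, vals, x):
    """Baseline CSR SpMV: one unit of work per row.

    On a GPU, mapping one thread to one row makes the work per thread
    proportional to the row length, which is the source of the workload
    imbalance on irregular matrices.
    """
    y = []
    for r in range(len(row_ptr) - 1):
        acc = 0.0
        for k in range(row_ptr[r], row_ptr[r + 1]):
            acc += vals[k] * x[col_idx[k]]
        y.append(acc)
    return y


def spmv_csr_segmented(row_ptr, col_idx, vals, x):
    """Same product as a segmented reduction over the nonzeros.

    Every nonzero costs the same, so the work can be split evenly
    regardless of how the nonzeros are distributed across rows.
    """
    n_rows = len(row_ptr) - 1
    # Step 1: one product per nonzero (uniform work per element).
    prod = [v * x[c] for v, c in zip(vals, col_idx)]
    # Step 2: segment descriptor: the row that owns each nonzero.
    row_of = []
    for r in range(n_rows):
        row_of.extend([r] * (row_ptr[r + 1] - row_ptr[r]))
    # Step 3: reduce the products within each segment (row).
    y = [0.0] * n_rows
    for k, p in enumerate(prod):
        y[row_of[k]] += p
    return y


def row_lengths(row_ptr):
    """Nonzeros per row; a wide spread indicates an irregular matrix."""
    return [row_ptr[r + 1] - row_ptr[r] for r in range(len(row_ptr) - 1)]


# Example matrix [[1, 0, 2], [0, 0, 0], [3, 4, 5]] in CSR form.
row_ptr = [0, 2, 2, 5]
col_idx = [0, 2, 0, 1, 2]
vals = [1.0, 2.0, 3.0, 4.0, 5.0]
x = [1.0, 2.0, 3.0]

print(spmv_csr_rowwise(row_ptr, col_idx, vals, x))    # [7.0, 0.0, 26.0]
print(spmv_csr_segmented(row_ptr, col_idx, vals, x))  # [7.0, 0.0, 26.0]
print(row_lengths(row_ptr))                           # [2, 0, 3]
```

Both functions return the same vector. The difference is only in how the work is partitioned, which is what matters once the loops are mapped onto GPU threads.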

Cited By

  • (2023) Sparse Matrix-Vector Product for the bmSparse Matrix Format in GPUs. Euro-Par 2023: Parallel Processing Workshops, pp. 246-256. DOI: 10.1007/978-3-031-50684-0_19. Online publication date: 28-Aug-2023
  • (2018) Regularizing irregularity. Proceedings of the 1st ACM SIGMOD Joint International Workshop on Graph Data Management Experiences & Systems (GRADES) and Network Data Analytics (NDA), pp. 1-8. DOI: 10.1145/3210259.3210263. Online publication date: 10-Jun-2018

Published In

ICPADS '15: Proceedings of the 2015 IEEE 21st International Conference on Parallel and Distributed Systems (ICPADS)
December 2015
857 pages
ISBN:9780769557854

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article
