Enhancing the Sparse Matrix Storage Using Reordering Techniques

Freire, Manuel; Marichal, Raul; Gonzaga de Oliveira, Sanderson L.; Dufrechou, Ernesto; Ezzatti, Pablo

doi:10.1007/978-3-031-52186-7_5

Part of the book series: Communications in Computer and Information Science (CCIS, volume 1887)

Included in the following conference series:

Latin American High Performance Computing Conference

Abstract

Sparse linear algebra kernels are memory-bound routines, and their performance varies significantly with the nonzero pattern of the sparse matrix operands. The computing power and memory bandwidth of modern massively parallel devices encourage researchers to develop sparse linear algebra kernels that exploit these platforms efficiently. One main line of work improves the storage of matrices, aiming to optimize communication between the memory and the cores. In previous work, a strategy combining delta encoding with matrix reorderings compressed the indexing data of the matrix, saving storage and communication. This work presents an algorithm that improves the reordering strategy and the resulting compression of the indexing data. The results show that this strategy leads to substantial storage savings, which can also reduce data movement between main memory and processors.
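
As a rough illustration of why a reordering can help delta encoding of the indexing data, the Python sketch below delta-encodes the column indices of each row and measures how many bits the widest gap needs, before and after a symmetric reordering. It is not the algorithm proposed in the chapter. The reordering is a plain reverse Cuthill-McKee traversal, chosen here only as a familiar bandwidth-reducing heuristic; the sparsity pattern is a toy path graph with scrambled labels; and all function names are hypothetical. It also assumes a symmetric pattern and a fixed-width encoding of the gaps.

```python
from collections import deque


def delta_encode(rows):
    """Per row: keep the first column index, then store the gaps
    between consecutive (sorted) column indices."""
    out = []
    for cols in rows:
        cols = sorted(cols)
        out.append([c - p for p, c in zip([0] + cols[:-1], cols)])
    return out


def max_gap_bits(rows):
    """Bits needed for the widest gap (first index of each row excluded),
    assuming every gap is stored with the same fixed width."""
    return max((d.bit_length() for row in delta_encode(rows) for d in row[1:]),
               default=0)


def reverse_cuthill_mckee(rows):
    """Simple RCM ordering for a symmetric pattern (illustrative only).
    Returns a list where order[k] is the old index placed at position k."""
    n = len(rows)
    adj = [sorted({c for c in cols if c != i}) for i, cols in enumerate(rows)]
    deg = [len(a) for a in adj]
    visited = [False] * n
    order = []
    # Start each connected component from a vertex of minimum degree.
    for start in sorted(range(n), key=lambda v: deg[v]):
        if visited[start]:
            continue
        visited[start] = True
        queue = deque([start])
        while queue:
            v = queue.popleft()
            order.append(v)
            # Visit unvisited neighbours in order of increasing degree.
            for w in sorted(adj[v], key=lambda u: deg[u]):
                if not visited[w]:
                    visited[w] = True
                    queue.append(w)
    return order[::-1]


def permute(rows, order):
    """Apply the same permutation to rows and columns."""
    new_of = {old: new for new, old in enumerate(order)}
    return [sorted(new_of[c] for c in rows[old]) for old in order]


def path_pattern(seq):
    """Pattern of a path graph visiting the vertices in `seq`, plus the diagonal."""
    rows = [{v} for v in range(len(seq))]
    for a, b in zip(seq, seq[1:]):
        rows[a].add(b)
        rows[b].add(a)
    return [sorted(r) for r in rows]


# Toy input: a tridiagonal-like pattern whose labels have been scrambled.
rows = path_pattern([0, 5, 2, 7, 1, 4, 6, 3])
reordered = permute(rows, reverse_cuthill_mckee(rows))

print("bits for widest gap, original :", max_gap_bits(rows))
print("bits for widest gap, reordered:", max_gap_bits(reordered))
```

On this toy input the reordering recovers a tridiagonal pattern, so the widest gap shrinks from 3 bits to 1 bit. Real matrices will not compress this cleanly; the sketch only shows the mechanism by which gathering nonzeros near the diagonal makes the deltas smaller.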

Acknowledgments

This work is partially funded by the UDELAR CSIC-INI project CompactDisp: Formatos dispersos eficientes para arquitecturas de hardware modernas. The authors also thank PEDECIBA Informática and the University of the Republic, Uruguay.

Author information

Authors and Affiliations

Instituto de Computación, INCO, Facultad de Ingeniería, Universidad de la República, Montevideo, Uruguay
Manuel Freire, Raul Marichal, Ernesto Dufrechou & Pablo Ezzatti
ICT-Unifesp, Universidade Federal de São Paulo, São José dos Campos, Brazil
Sanderson L. Gonzaga de Oliveira

Corresponding author

Correspondence to Manuel Freire.

Editor information

Editors and Affiliations

Industrial University of Santander, Bucaramanga, Colombia
Carlos J. Barrios H.
Argonne National Laboratory, Lemont, IL, USA
Silvio Rizzi
Centro Nacional de Alta Tecnología, San José, Costa Rica
Esteban Meneses
University of Buenos Aires & Center for Computational Simulation Aplicaciones Tecnológicas, Buenos Aires, Argentina
Esteban Mocskos
Argonne National Laboratory, Lemont, IL, USA
Jose M. Monsalve Diaz
University of Cartagena, Cartagena, Colombia
Javier Montoya

About this paper

Cite this paper

Freire, M., Marichal, R., Gonzaga de Oliveira, S.L., Dufrechou, E., Ezzatti, P. (2024). Enhancing the Sparse Matrix Storage Using Reordering Techniques. In: Barrios H., C.J., Rizzi, S., Meneses, E., Mocskos, E., Monsalve Diaz, J.M., Montoya, J. (eds) High Performance Computing. CARLA 2023. Communications in Computer and Information Science, vol 1887. Springer, Cham. https://doi.org/10.1007/978-3-031-52186-7_5

DOI: https://doi.org/10.1007/978-3-031-52186-7_5
Published: 28 January 2024
Publisher Name: Springer, Cham
Print ISBN: 978-3-031-52185-0
Online ISBN: 978-3-031-52186-7
eBook Packages: Computer Science, Computer Science (R0)
