research-article

Scaling sparse matrix-matrix multiplication in the Accumulo database

Authors:

Gunduz Vehbi Demirci,

Cevdet Aykanat

Distributed and Parallel Databases, Volume 38, Issue 1

Pages 31 - 62

https://doi.org/10.1007/s10619-019-07257-y

Published: 01 March 2020

Abstract

We propose and implement a sparse matrix-matrix multiplication (SpGEMM) algorithm that runs on top of Accumulo’s iterator framework, which enables high-performance distributed parallelism. By using row-by-row parallel SpGEMM, the proposed algorithm provides write locality while ingesting the output matrix back into the database. The proposed solution also reduces the need to scan the input matrices multiple times by using Accumulo’s batch-scanning capability, which accesses multiple ranges of key-value pairs in parallel. Although batch scanning introduces some latency overhead, this overhead is alleviated by the proposed solution and by node-level parallelism structures. We also propose a matrix partitioning scheme that reduces the total communication volume and balances the workload among servers. Extensive experiments on both real-world and synthetic sparse matrices show that the proposed algorithm scales significantly better than the outer-product parallel SpGEMM algorithm available in the Graphulo library. Applying the proposed matrix partitioning improves the performance of the proposed algorithm considerably further.
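
To make the row-by-row formulation concrete, the following is a minimal single-process sketch in Python, not the authors' Accumulo iterator implementation. It assumes a hypothetical dict-of-dicts matrix layout (row index mapped to a column-to-value map) and a hypothetical function name `spgemm_row_by_row`. It shows that row i of C = A × B depends only on row i of A and on the rows of B selected by the nonzero columns of that row, which is why each output row can be produced in one place and why the required rows of B form a set of separate ranges.

```python
from collections import defaultdict


def spgemm_row_by_row(A, B):
    """Compute C = A * B with the row-by-row formulation.

    A and B are sparse matrices stored as {row: {col: value}}.
    This layout is an assumption made for illustration only.
    """
    C = {}
    for i, a_row in A.items():
        # Accumulator for the nonzeros of output row i.
        acc = defaultdict(float)
        for k, a_ik in a_row.items():
            # Only rows of B indexed by the nonzero columns of A's row i
            # are touched; a missing row of B contributes nothing.
            for j, b_kj in B.get(k, {}).items():
                acc[j] += a_ik * b_kj
        if acc:
            # Row i of C is complete here and can be written out as a unit.
            C[i] = dict(acc)
    return C


A = {0: {0: 1.0, 2: 2.0}, 1: {1: 3.0}}
B = {0: {1: 4.0}, 1: {0: 5.0}, 2: {1: 6.0}}
C = spgemm_row_by_row(A, B)
print(C)
```

In this toy input, row 0 of A has nonzeros in columns 0 and 2, so only rows 0 and 2 of B are read to build row 0 of C. In a distributed setting, such a set of non-contiguous rows is the kind of multi-range access the abstract attributes to batch scanning.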


Published In

Distributed and Parallel Databases Volume 38, Issue 1

Mar 2020

251 pages

ISSN:0926-8782

© Springer Science+Business Media, LLC, part of Springer Nature 2019.

Publisher

Kluwer Academic Publishers

United States


Funding Sources

The Scientific and Technological Research Council of Turkey (TUBITAK)
