Article

SPLATT: Efficient and Parallel Sparse Tensor-Matrix Multiplication

Authors:

Shaden Smith,

Niranjay Ravindran,

Nicholas D. Sidiropoulos,

George Karypis

IPDPS '15: Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium

Pages 61 - 70

https://doi.org/10.1109/IPDPS.2015.27

Published: 25 May 2015

Abstract

Multi-dimensional arrays, or tensors, are increasingly found in fields such as signal processing and recommender systems. Real-world tensors can be enormous in size and often very sparse. There is a need for efficient, high-performance tools capable of processing the massive sparse tensors of today and the future. This paper introduces SPLATT, a C library with shared-memory parallelism for three-mode tensors. SPLATT contains algorithmic improvements over competing state-of-the-art tools for sparse tensor factorization. SPLATT has a fast, parallel method of multiplying a matricized tensor by a Khatri-Rao product, which is a key kernel in tensor factorization methods. SPLATT uses a novel data structure that exploits the sparsity patterns of tensors. This data structure has a small memory footprint similar to competing methods and allows for the computational improvements featured in our work. We also present a method of finding cache-friendly reorderings and utilizing them with a novel form of cache tiling. To our knowledge, this is the first work to investigate reordering and cache tiling in this context. SPLATT averages almost 30x speedup compared to our baseline when using 16 threads and reaches over 80x speedup on NELL-2.
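
To make the kernel named in the abstract concrete, the following is a minimal sketch of multiplying a matricized three-mode tensor by a Khatri-Rao product, under stated assumptions. It stores the tensor as a plain list of coordinates and is not SPLATT's data structure, parallel scheme, or tiling strategy; the function name `mttkrp_mode1` and the sample data are hypothetical and chosen only for illustration. It relies on the standard identity that entry (i, r) of the result equals the sum over all nonzeros X(i, j, k) of X(i, j, k) * B(j, r) * C(k, r), so the Khatri-Rao product never has to be formed explicitly.

```python
# Illustrative sketch only: a coordinate-format baseline, NOT SPLATT's
# data structure or algorithm. All names here are hypothetical.

def mttkrp_mode1(nonzeros, B, C, num_rows):
    """Multiply the mode-1 matricized tensor by the Khatri-Rao product of C and B.

    nonzeros: iterable of (i, j, k, value) entries of the sparse tensor X
    B: J x R factor matrix, given as a list of rows
    C: K x R factor matrix, given as a list of rows
    num_rows: I, the length of the first mode
    """
    rank = len(B[0])
    M = [[0.0] * rank for _ in range(num_rows)]
    for i, j, k, value in nonzeros:
        row_b, row_c, row_m = B[j], C[k], M[i]
        for r in range(rank):
            # Each nonzero adds value * B[j][r] * C[k][r] to M[i][r].
            row_m[r] += value * row_b[r] * row_c[r]
    return M


# A 2 x 2 x 2 tensor with three nonzeros and rank-2 factor matrices.
X = [(0, 0, 0, 1.0), (0, 1, 1, 2.0), (1, 0, 1, 3.0)]
B = [[1.0, 2.0], [3.0, 4.0]]
C = [[5.0, 6.0], [7.0, 8.0]]
M = mttkrp_mode1(X, B, C, num_rows=2)
print(M)
```

The work in this baseline is proportional to the number of nonzeros times the rank. Different nonzeros can update the same output row, which is one reason data layout and access order matter when such a loop is parallelized or tuned for cache behavior.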

Published In: IPDPS '15: Proceedings of the 2015 IEEE International Parallel and Distributed Processing Symposium, May 2015, 1110 pages

ISBN: 9781479986491

Publisher: IEEE Computer Society, United States
