Computations on matrices

research-article

Optimizing the Linear Fascicle Evaluation Algorithm for Multi-core and Many-core Systems

ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 4, Article No.: 22, Pages 1–45, https://doi.org/10.1145/3418075

Sparse matrix-vector multiplication (SpMV) operations are commonly used in various scientific and engineering applications. The performance of the SpMV operation often depends on exploiting regularity patterns in the matrix. Various representations and ...

research-article

Public Access

Load-balancing Sparse Matrix Vector Product Kernels on GPUs

ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 1, Article No.: 2, Pages 1–26, https://doi.org/10.1145/3380930

Efficient processing of Irregular Matrices on Single Instruction, Multiple Data (SIMD)-type architectures is a persistent challenge. Resolving it requires innovations in the development of data formats, computational techniques, and implementations that ...

research-article

Acceleration of PageRank with Customized Precision Based on Mantissa Segmentation

ACM Transactions on Parallel Computing (TOPC), Volume 7, Issue 1, Article No.: 4, Pages 1–19, https://doi.org/10.1145/3380934

We describe the application of a communication-reduction technique for the PageRank algorithm that dynamically adapts the precision of the data access to the numerical requirements of the algorithm as the iteration converges. Our variable-precision ...

research-article

Partitioning Models for Scaling Parallel Sparse Matrix-Matrix Multiplication

ACM Transactions on Parallel Computing (TOPC), Volume 4, Issue 3, Article No.: 13, Pages 1–34, https://doi.org/10.1145/3155292

We investigate outer-product–parallel, inner-product–parallel, and row-by-row-product–parallel formulations of sparse matrix-matrix multiplication (SpGEMM) on distributed memory architectures. For each of these three formulations, we propose a ...

research-article

Public Access

Trade-Offs Between Synchronization, Communication, and Computation in Parallel Linear Algebra Computations

ACM Transactions on Parallel Computing (TOPC), Volume 3, Issue 1, Article No.: 3, Pages 1–47, https://doi.org/10.1145/2897188

This article derives trade-offs between three basic costs of a parallel algorithm: synchronization, data movement, and computational cost. These trade-offs are lower bounds on the execution time of the algorithm that are independent of the number of ...

research-article

Public Access

Hypergraph Partitioning for Sparse Matrix-Matrix Multiplication

ACM Transactions on Parallel Computing (TOPC), Volume 3, Issue 3, Article No.: 18, Pages 1–34, https://doi.org/10.1145/3015144

We propose a fine-grained hypergraph model for sparse matrix-matrix multiplication (SpGEMM), a key computational kernel in scientific computing and data analysis whose performance is often communication bound. This model correctly describes both the ...

research-article

Work-Efficient Matrix Inversion in Polylogarithmic Time

ACM Transactions on Parallel Computing (TOPC), Volume 2, Issue 3, Article No.: 15, Pages 1–29, https://doi.org/10.1145/2809812

We present an algorithm for inversion of symmetric positive definite matrices that combines the practical requirement of an optimal number of arithmetic operations and the theoretical goal of a polylogarithmic critical path length. The algorithm reduces ...

research-article

Noise-Tolerant Explicit Stencil Computations for Nonuniform Process Execution Rates

ACM Transactions on Parallel Computing (TOPC), Volume 2, Issue 1, Article No.: 7, Pages 1–33, https://doi.org/10.1145/2742351

Next-generation HPC computing platforms are likely to be characterized by significant, unpredictable nonuniformities in execution time among compute nodes and cores. The resulting load imbalances from this nonuniformity are expected to arise from a ...

research-article

Public Access

Avoiding Communication in Successive Band Reduction

ACM Transactions on Parallel Computing (TOPC), Volume 1, Issue 2, Article No.: 11, Pages 1–37, https://doi.org/10.1145/2686877

The running time of an algorithm depends on both arithmetic and communication (i.e., data movement) costs, and the relative costs of communication are growing over time. In this work, we present sequential and distributed-memory parallel algorithms for ...
