Author: Boman, Erik G

research-article

Optimal size of the block in block GMRES on GPUs: computational model and experiments

Numerical Algorithms (SPNA), Volume 92, Issue 1, Pages 119–147, https://doi.org/10.1007/s11075-022-01439-z

Abstract

The block version of GMRES (BGMRES) is most advantageous over the single right hand side (RHS) counterpart when the cost of communication is high while the cost of floating point operations is not. This is the particular case on modern graphics ...

research-article

Low-synch Gram–Schmidt with delayed reorthogonalization for Krylov solvers

Parallel Computing (PACO), Volume 112, Issue C, https://doi.org/10.1016/j.parco.2022.102940

Abstract

The parallel strong-scaling of iterative methods is often determined by the number of global reductions at each iteration. Low-synch Gram–Schmidt algorithms are applied here to the Arnoldi algorithm to reduce the number of global ...

research-article

Parallel graph coloring algorithms for distributed GPU environments

Parallel Computing (PACO), Volume 110, Issue C, https://doi.org/10.1016/j.parco.2022.102896

Abstract

Graph coloring is often used in parallelizing scientific computations that run in distributed and multi-GPU environments; it identifies sets of independent data that can be updated in parallel. Many algorithms exist for graph coloring ...

Highlights

We present the first multi-GPU graph coloring implementation.
Our framework ...

research-article

EXAGRAPH: Graph and combinatorial methods for enabling exascale applications

International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 35, Issue 6, Pages 553–571, https://doi.org/10.1177/10943420211029299

Combinatorial algorithms in general and graph algorithms in particular play a critical enabling role in numerous scientific applications. However, the irregular memory access nature of these algorithms makes them one of the hardest algorithmic kernels to ...

research-article

A survey of numerical linear algebra methods utilizing mixed-precision arithmetic

International Journal of High Performance Computing Applications (SAGE-HPCA), Volume 35, Issue 4, Pages 344–369, https://doi.org/10.1177/10943420211003313

The efficient utilization of mixed-precision numerical linear algebra algorithms can offer attractive acceleration to scientific computing applications. Especially with the hardware integration of low-precision special-function units designed for machine ...

research-article

Scalable Asynchronous Domain Decomposition Solvers

SIAM Journal on Scientific Computing (SISC), Volume 42, Issue 6, Pages C384–C409, https://doi.org/10.1137/19M1291303

Parallel implementations of linear iterative solvers generally alternate between phases of data exchange and phases of local computation. Increasingly large problem sizes and more heterogeneous compute architectures make load balancing and the design of ...

research-article

An Algebraic Sparsified Nested Dissection Algorithm Using Low-Rank Approximations

SIAM Journal on Matrix Analysis and Applications (SIMAX), Volume 41, Issue 2, Pages 715–746, https://doi.org/10.1137/19M123806X

We propose a new algorithm for the fast solution of large, sparse, symmetric positive-definite linear systems, spaND (sparsified Nested Dissection). It is based on nested dissection, sparsification, and low-rank compression. After eliminating all ...

research-article

A robust hierarchical solver for ill-conditioned systems with applications to ice sheet modeling

Journal of Computational Physics (JOCP), Volume 396, Issue C, Pages 819–836, https://doi.org/10.1016/j.jcp.2019.07.024

Abstract

A hierarchical solver is proposed for solving sparse ill-conditioned linear systems in parallel. The solver is based on a modification of the LoRaSp method, but employs a deferred-compression technique, which provably reduces the ...

Highlights

We introduced the deferred-compression technique in hierarchical solvers for solving sparse ill-conditioned linear systems.

research-article

A distributed-memory hierarchical solver for general sparse linear systems

Parallel Computing (PACO), Volume 74, Issue C, Pages 49–64, https://doi.org/10.1016/j.parco.2017.12.004

Highlights

Derived a new formulation of a sequential hierarchical solver, which compresses dense fill-in blocks.

Abstract

We present a parallel hierarchical solver for general sparse linear systems on distributed-memory machines. For large-scale problems, this fully algebraic algorithm is faster and more memory-efficient than sparse direct solvers because ...

research-article

Domain decomposition preconditioners for communication-avoiding Krylov methods on a hybrid CPU/GPU cluster

SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Pages 933–944, https://doi.org/10.1109/SC.2014.81

Krylov subspace projection methods are widely used iterative methods for solving large-scale linear systems of equations. Researchers have demonstrated that communication-avoiding (CA) techniques can improve Krylov methods' performance on modern ...

research-article

Scalable matrix computations on large scale-free graphs using 2D graph partitioning

SC '13: Proceedings of the International Conference on High Performance Computing, Networking, Storage and Analysis, Article No.: 50, Pages 1–12, https://doi.org/10.1145/2503210.2503293

Scalable parallel computing is essential for processing large scale-free (power-law) graphs. The distribution of data across processes becomes important on distributed-memory computers with thousands of cores. It has been shown that two-dimensional ...

Article

Multithreaded Algorithms for Maximum Matching in Bipartite Graphs

IPDPS '12: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, Pages 860–872, https://doi.org/10.1109/IPDPS.2012.82

We design, implement, and evaluate algorithms for computing a matching of maximum cardinality in a bipartite graph on multicore and massively multithreaded computers. As computers with larger numbers of slower cores dominate the commodity processor ...

Article

ShyLU: A Hybrid-Hybrid Solver for Multicore Platforms

IPDPS '12: Proceedings of the 2012 IEEE 26th International Parallel and Distributed Processing Symposium, Pages 631–643, https://doi.org/10.1109/IPDPS.2012.64

With the ubiquity of multicore processors, it is crucial that solvers adapt to the hierarchical structure of modern architectures. We present ShyLU, a "hybrid-hybrid" solver for general sparse linear systems that is hybrid in two ways: First, it ...

article

The Zoltan and Isorropia parallel toolkits for combinatorial scientific computing: Partitioning, ordering and coloring

Scientific Programming (SCIP), Volume 20, Issue 2, Pages 129–150, https://doi.org/10.1155/2012/713587

Partitioning and load balancing are important problems in scientific computing that can be modeled as combinatorial problems using graphs or hypergraphs. The Zoltan toolkit was developed primarily for partitioning and load balancing to support dynamic ...

article

A Quasi-algebraic Multigrid Approach to Fracture Problems Based on Extended Finite Elements

SIAM Journal on Scientific Computing (SISC), Volume 34, Issue 2, Pages 603–626, https://doi.org/10.1137/110819913

The modeling of discontinuities arising from fracture of materials poses a number of significant computational challenges. The extended finite element method provides an attractive alternative to standard finite elements in that they do not require fine ...

poster

Poster: a hybrid-hybrid solver for manycore platforms

SC '11 Companion: Proceedings of the 2011 companion on High Performance Computing Networking, Storage and Analysis Companion, Pages 35–36, https://doi.org/10.1145/2148600.2148619

With the increasing levels of parallelism in a compute node, it is important to exploit multiple levels of parallelism even within a single compute node. We present ShyLU (pronounced "Shy-loo" for Scalable Hybrid LU), a "hybrid-hybrid" solver for ...

Article

Enabling next-generation parallel circuit simulation with Trilinos

Euro-Par'11: Proceedings of the 2011 international conference on Parallel Processing, Pages 315–323, https://doi.org/10.1007/978-3-642-29737-3_36

The Xyce Parallel Circuit Simulator, which has demonstrated scalable circuit simulation on hundreds of processors, heavily leverages the high-performance scientific libraries provided by Trilinos. With the move towards multi-core CPUs and GPU technology,...

article

Hypergraph-Based Unsymmetric Nested Dissection Ordering for Sparse LU Factorization

SIAM Journal on Scientific Computing (SISC), Volume 32, Issue 6, Pages 3426–3446, https://doi.org/10.1137/080720395

In this paper we discuss a hypergraph-based unsymmetric nested dissection (HUND) ordering for reducing the fill-in incurred during Gaussian elimination. It has several important properties. It takes a global perspective of the entire matrix, as opposed ...

article

Distributed-Memory Parallel Algorithms for Distance-2 Coloring and Related Problems in Derivative Computation

SIAM Journal on Scientific Computing (SISC), Volume 32, Issue 4, Pages 2418–2446, https://doi.org/10.1137/080732158

The distance-2 graph coloring problem aims at partitioning the vertex set of a graph into the fewest sets consisting of vertices pairwise at distance greater than 2 from each other. Its applications include derivative computation in numerical ...

Article

Factors impacting performance of multithreaded sparse triangular solve

VECPAR'10: Proceedings of the 9th international conference on High performance computing for computational science, Pages 32–44

As computational science applications grow more parallel with multi-core supercomputers having hundreds of thousands of computational cores, it will become increasingly difficult for solvers to scale. Our approach is to use hybrid MPI/threaded numerical ...
