
Fast and highly scalable parallel computations for fundamental matrix problems on distributed memory systems

Published: 01 December 2010

Abstract

We present fast and highly scalable parallel computations for a number of important and fundamental matrix problems on distributed memory systems (DMS). These problems include matrix multiplication, matrix chain product, computing the powers, the inverse, the characteristic polynomial, the determinant, the rank, and the Krylov matrix of a matrix, computing an LU- and a QR-factorization of a matrix, and solving linear systems of equations. Our highly scalable parallel computations for these problems are based on a highly scalable implementation of the fastest sequential matrix multiplication algorithm on DMS. We show that, compared with the best known parallel time complexities on parallel random access machines (PRAM), the most powerful but unrealistic shared memory model of parallel computing, our parallel matrix computations achieve the same speeds on distributed memory parallel computers (DMPC) and incur only an extra polylog factor in the time complexities on DMS with hypercubic networks. Furthermore, our parallel matrix computations are fully scalable on DMPC and highly scalable over a wide range of system sizes on DMS with hypercubic networks. Such fast (in terms of parallel time complexity) and highly scalable (in terms of our definition of scalability) parallel matrix computations have rarely been achieved before on any distributed memory system.
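
The approach summarized above rests on one observation: each of the listed problems can be reduced to a modest number of matrix multiplications, so a single fast, highly scalable parallel multiply on a DMS drives all of the other computations. As an illustration only, below is a minimal sequential NumPy sketch (not the paper's parallel DMS algorithm) of two such reductions: computing matrix powers by repeated squaring, which needs only O(log k) matrix products, and assembling a Krylov matrix from the successive products A^i b.

import numpy as np

def matrix_power(A: np.ndarray, k: int) -> np.ndarray:
    """Compute A**k using O(log k) matrix multiplications (repeated squaring)."""
    result = np.eye(A.shape[0])
    base = A.copy()
    while k > 0:
        if k & 1:                  # this bit of k is set: fold the current square in
            result = result @ base
        base = base @ base         # square; each iteration costs one matrix product
        k >>= 1
    return result

def krylov_matrix(A: np.ndarray, b: np.ndarray, m: int) -> np.ndarray:
    """Return the n x m Krylov matrix [b, A b, A^2 b, ..., A^(m-1) b]."""
    cols = [b]
    for _ in range(m - 1):
        cols.append(A @ cols[-1])  # next column is A times the previous one
    return np.column_stack(cols)

if __name__ == "__main__":
    A = np.array([[2.0, 1.0], [0.0, 3.0]])
    b = np.array([1.0, 1.0])
    print(matrix_power(A, 5))      # A^5 via three squarings plus selected multiplies
    print(krylov_matrix(A, b, 3))  # columns b, Ab, A^2 b

In the paper's setting, each matrix product in such reductions would be carried out by the scalable parallel multiplication routine on the DMS rather than by a sequential call, which is what yields the stated parallel time complexities.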


Cited By

  • (2018) Complexity Analysis of Vedic Mathematics Algorithms for Multicore Environment. International Journal of Rough Sets and Data Analysis 4(4):31-47. https://doi.org/10.4018/IJRSDA.2017100103
  • (2018) Energy- and reliability-aware task scheduling onto heterogeneous MPSoC architectures. The Journal of Supercomputing 62(1):265-289. https://doi.org/10.1007/s11227-011-0720-3


Published In

The Journal of Supercomputing  Volume 54, Issue 3
December 2010
129 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Cost optimality
  2. Distributed memory parallel computer
  3. Distributed memory system
  4. Hypercubic network
  5. Matrix computation
  6. Scalability
  7. Speedup

Qualifiers

  • Article
