Article

Toward memory-efficient linear solvers

Authors:

Elizabeth R. Jessup

VECPAR'02: Proceedings of the 5th international conference on High performance computing for computational science

Pages 315 - 328

Published: 26 June 2002

Abstract

We describe a new technique for solving a sparse linear system Ax = b as a block system AX = B, where multiple starting vectors and right-hand sides are chosen so as to accelerate convergence. Efficiency is gained by reusing the matrix A in block operations with X and B. Techniques for reducing the cost of the extra matrix-vector operations are presented.
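
The reuse of A in block operations can be illustrated with a minimal sketch. It is not the paper's algorithm, and it does not show how the extra starting vectors or right-hand sides are chosen. It only contrasts a block product AX, which reads each nonzero of A once, with k separate products Ax, which read A k times. The compressed sparse row (CSR) storage, the function names `csr_matvec` and `csr_matmat`, and the 3x3 example matrix are all assumptions made for illustration.

```python
def csr_matvec(indptr, indices, data, x):
    """y = A x for A stored in CSR form; reads every nonzero of A once."""
    y = [0.0] * (len(indptr) - 1)
    for i in range(len(indptr) - 1):
        for p in range(indptr[i], indptr[i + 1]):
            y[i] += data[p] * x[indices[p]]
    return y


def csr_matmat(indptr, indices, data, X):
    """Y = A X for a block X given as a list of rows, each of length k.

    Each nonzero of A is read once and applied to all k columns,
    so A is traversed once rather than k times.
    """
    k = len(X[0])
    Y = [[0.0] * k for _ in range(len(indptr) - 1)]
    for i in range(len(indptr) - 1):
        for p in range(indptr[i], indptr[i + 1]):
            a, row = data[p], X[indices[p]]
            for j in range(k):
                Y[i][j] += a * row[j]
    return Y


# Hypothetical 3x3 example: A = [[4, 1, 0], [1, 3, 0], [0, 0, 2]]
indptr = [0, 2, 4, 5]
indices = [0, 1, 0, 1, 2]
data = [4.0, 1.0, 1.0, 3.0, 2.0]

X = [[1.0, 0.0], [2.0, 1.0], [3.0, -1.0]]  # block of k = 2 columns
Y_block = csr_matmat(indptr, indices, data, X)

# Reference: one separate pass over A per column of X.
columns = [[row[j] for row in X] for j in range(2)]
Y_cols = [csr_matvec(indptr, indices, data, c) for c in columns]
```

Both routes do the same arithmetic. The difference is that the block version fetches the entries of A from memory once for all columns, which is the kind of saving the abstract attributes to block operations.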



Published In

VECPAR'02: Proceedings of the 5th international conference on High performance computing for computational science

June 2002

732 pages

ISBN: 3540008527

Editors:

José M. L. M. Palma, Faculdade de Engenharia da Universidade do Porto, Porto, Portugal

A. Augusto Sousa, Faculdade de Engenharia da Universidade do Porto, Porto, Portugal

Jack Dongarra, University of Tennessee, Department of Computer Science, Knoxville, TN

Vicente Hernández, Universidad Politécnica de Valencia, Departamento de Sistemas Informáticos y Computación, Valencia, Spain

Publisher

Springer-Verlag

Berlin, Heidelberg
