Toward memory-efficient linear solvers
Pages 315 - 328
Abstract
We describe a new technique for solving a sparse linear system Ax = b as a block system AX = B, where multiple starting vectors and right-hand sides are chosen so as to accelerate convergence. Efficiency is gained by reusing the matrix A in block operations with X and B. Techniques for reducing the cost of the extra matrix-vector operations are presented.
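The reuse of A in block operations can be illustrated with a minimal sketch. It is not the authors' implementation: the compressed sparse row (CSR) storage, the function names `csr_matvec` and `csr_matmat`, and the 3x3 test matrix are all assumptions made for illustration. The point it shows is that a product with a block X of k vectors loads each stored entry of A once and applies it to all k columns, whereas k separate matrix-vector products traverse A k times.

```python
def csr_matvec(indptr, indices, data, x):
    """y = A x for A in CSR form; every call traverses A once."""
    n = len(indptr) - 1
    y = [0.0] * n
    for i in range(n):
        for p in range(indptr[i], indptr[i + 1]):
            y[i] += data[p] * x[indices[p]]
    return y


def csr_matmat(indptr, indices, data, X):
    """Y = A X, where X is a list of n rows holding k values each.

    Each stored entry of A is read once and applied to all k columns,
    so the traffic for A is shared by the k vectors.
    """
    n = len(indptr) - 1
    k = len(X[0])
    Y = [[0.0] * k for _ in range(n)]
    for i in range(n):
        out_row = Y[i]
        for p in range(indptr[i], indptr[i + 1]):
            a = data[p]              # one load of the matrix entry
            x_row = X[indices[p]]    # the k values it multiplies
            for j in range(k):
                out_row[j] += a * x_row[j]
    return Y


# Hypothetical test matrix (an assumption, not from the paper):
# A = [[ 4, -1,  0],
#      [-1,  4, -1],
#      [ 0, -1,  4]]
indptr = [0, 2, 5, 7]
indices = [0, 1, 0, 1, 2, 1, 2]
data = [4.0, -1.0, -1.0, 4.0, -1.0, -1.0, 4.0]

# Block of k = 2 vectors, stored row by row.
X = [[1.0, 0.0], [2.0, 1.0], [3.0, 0.0]]

# One pass over A for both columns.
Y = csr_matmat(indptr, indices, data, X)

# Two passes over A, one per column, for comparison.
columns = [csr_matvec(indptr, indices, data, [row[j] for row in X])
           for j in range(2)]
```

Both routes give the same numbers; they differ only in how often A is read from memory, which is the cost the block formulation aims to amortize.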
Information
Published In
June 2002
732 pages
ISBN:3540008527
Publisher
Springer-Verlag
Berlin, Heidelberg
Publication History
Published: 26 June 2002
Qualifiers
- Article