Abstract
An efficient sparse LU factorization algorithm for popular shared-memory multiprocessors is presented. Because interprocess communication is critical to performance on these architectures, the algorithm introduces only O(n) synchronization events. No global barrier is used; a completely asynchronous scheduling scheme is central to the implementation. The algorithm aims to optimize single-node performance and to minimize communication overhead. It has been tested successfully on SUN Enterprise, DEC AlphaServer, SGI Origin 2000, Cray T90, J90, and NEC SX-4 parallel computers, delivering up to 2.3 GFlop/s on an eight-processor DEC AlphaServer for medium-size semiconductor device simulations and structural engineering problems.
The work of O. Schenk was supported by a grant from the Cray Research and Development Grant Program and the Swiss Commission of Technology and Innovation under contract number 3975.1.
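The barrier-free, asynchronous scheduling idea mentioned in the abstract can be illustrated with a small sketch. This is not the authors' implementation: it assumes the task dependencies form a tree (as an elimination tree would), and the function name `run_tree_schedule`, the `parent` array layout, and the placeholder for the numerical work are all hypothetical. Each node is released as soon as its children are done, with one lock-protected update per node, so the number of synchronization events in this sketch grows linearly with the number of nodes.

```python
import queue
import threading


def run_tree_schedule(parent, num_workers=4):
    """Process every node of a forest after all of its children.

    parent[i] is the parent of node i, or -1 for a root (hypothetical layout).
    No global barrier is used. Returns the nodes in completion order.
    """
    n = len(parent)
    if n == 0:
        return []

    # pending[i] counts the children of node i that are not finished yet.
    pending = [0] * n
    for p in parent:
        if p >= 0:
            pending[p] += 1

    ready = queue.Queue()          # nodes whose children are all finished
    for node in range(n):
        if pending[node] == 0:
            ready.put(node)        # leaves can start immediately

    lock = threading.Lock()        # protects pending[] and order
    order = []

    def worker():
        while True:
            node = ready.get()
            if node is None:       # sentinel: shut this worker down
                return
            # Placeholder: the numerical work for this node would go here.
            with lock:             # one synchronization event per node
                order.append(node)
                p = parent[node]
                if p >= 0:
                    pending[p] -= 1
                    if pending[p] == 0:
                        ready.put(p)   # parent becomes runnable
                finished = len(order) == n
            if finished:
                # Only the worker that completed the last node gets here.
                for _ in range(num_workers):
                    ready.put(None)

    threads = [threading.Thread(target=worker) for _ in range(num_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return order
```

For example, `run_tree_schedule([2, 2, 4, 4, -1])` processes the leaves 0, 1, and 3 in some order, node 2 only after 0 and 1, and the root 4 last. The exact completion order depends on thread timing.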
© 1999 Springer-Verlag
About this paper
Cite this paper
Schenk, O., Gärtner, K., Fichtner, W. (1999). Scalable parallel sparse factorization with left-right looking strategy on shared memory multiprocessors. In: Sloot, P., Bubak, M., Hoekstra, A., Hertzberger, B. (eds) High-Performance Computing and Networking. HPCN-Europe 1999. Lecture Notes in Computer Science, vol 1593. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0100583
DOI: https://doi.org/10.1007/BFb0100583
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-65821-4
Online ISBN: 978-3-540-48933-7
eBook Packages: Springer Book Archive