
Use of parallel level 3 BLAS in LU factorization on three vector multiprocessors: the ALLIANT FX/80, the CRAY-2, and the IBM 3090 VF

Published: 01 June 1990

Abstract

We show how to transform the B-spline curve and surface fitting problems into suffix computations of continued fractions. A parallel substitution scheme is then introduced to compute the suffix values on a newly proposed mesh-of-unshuffle network. The derived parallel algorithm allows curve interpolation through n points to be solved in O(log n) time using Θ(n/log n) processors, and surface interpolation through m × n points to be solved in O(log m log n) time using Θ(mn/(log m log n)) processors. Both interpolation algorithms are cost-optimal for their respective problems. In addition, the surface fitting problem can be solved even faster, in O(log m + log n) time, if Θ(mn) processors are used in the network.
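The idea of a suffix computation over a continued fraction can be illustrated with a minimal Python sketch. It is not the paper's substitution scheme or its mesh-of-unshuffle network. It assumes suffix values defined by s[n-1] = a[n-1] and s[i] = a[i] + b[i] / s[i+1], and it uses a standard alternative technique: each step is treated as a 2 × 2 matrix (a Möbius map), and because matrix products are associative, all suffix products can be formed in about log2(n) doubling rounds. The function names, the recurrence form, and the sample inputs are illustrative assumptions, and every s[i] is assumed to be nonzero.

```python
from fractions import Fraction


def suffix_sequential(a, b):
    """Reference: s[n-1] = a[n-1], s[i] = a[i] + b[i] / s[i+1] (assumed form)."""
    n = len(a)
    s = [Fraction(0)] * n
    s[n - 1] = Fraction(a[n - 1])
    for i in range(n - 2, -1, -1):
        s[i] = Fraction(a[i]) + Fraction(b[i]) / s[i + 1]
    return s


def compose(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    (p00, p01), (p10, p11) = p
    (q00, q01), (q10, q11) = q
    return ((p00 * q00 + p01 * q10, p00 * q01 + p01 * q11),
            (p10 * q00 + p11 * q10, p10 * q01 + p11 * q11))


def suffix_by_scan(a, b):
    """Same suffix values, computed with a doubling (Hillis-Steele style) scan."""
    n = len(a)
    m = n - 1
    # The map x -> a[i] + b[i] / x corresponds to the matrix [[a[i], b[i]], [1, 0]].
    prods = [((Fraction(a[i]), Fraction(b[i])), (Fraction(1), Fraction(0)))
             for i in range(m)]
    d = 1
    while d < m:
        # Each round is independent per index i, so it could run in parallel;
        # here the rounds are only simulated with a list comprehension.
        prods = [compose(prods[i], prods[i + d]) if i + d < m else prods[i]
                 for i in range(m)]
        d *= 2
    last = Fraction(a[n - 1])
    # Apply each accumulated map to the final term a[n-1].
    out = [(p00 * last + p01) / (p10 * last + p11)
           for (p00, p01), (p10, p11) in prods]
    out.append(last)
    return out
```

With exact rational arithmetic, both functions return the same list, so the sequential version can serve as a check on the scan. The scan performs more total work than the sequential loop; reducing the processor count to Θ(n/log n), as the abstract describes, needs a further blocking step that this sketch does not attempt.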




Publisher: Association for Computing Machinery, New York, NY, United States

Published in SIGARCH Volume 18, Issue 3b


Cited By

  • (2016) The International Journal of Supercomputer Applications. The International Journal of Supercomputing Applications 6:4 (392-406). doi:10.1177/109434209200600410. Online publication date: 16-Sep-2016
  • (2016) Level 3 BLAS in LU Factorization on the CRAY-2, ETA-10P, and IBM 3090-200/VF. The International Journal of Supercomputing Applications 3:2 (40-70). doi:10.1177/109434208900300204. Online publication date: 16-Sep-2016
  • (2005) Experiences in numerical software on IBM distributed memory architectures. Parallel Scientific Computing (207-218). doi:10.1007/BFb0030149. Online publication date: 17-Jun-2005
  • (1993) The role of APL and J in high-performance computation. ACM SIGAPL APL Quote Quad 24:1 (17-32). doi:10.1145/166198.166201. Online publication date: 1-Sep-1993
  • (1993) The role of APL and J in high-performance computation. Proceedings of the international conference on APL (17-32). doi:10.1145/166197.166201. Online publication date: 1-Sep-1993
  • (1992) Performance of parallel Cholesky factorization algorithms using BLAS. The Journal of Supercomputing 6:3-4 (315-329). doi:10.1007/BF00155804. Online publication date: Dec-1992
  • (1991) Threshold pivoting for dense LU factorization on distributed memory multiprocessors. Proceedings of the 1991 ACM/IEEE conference on Supercomputing (600-607). doi:10.1145/125826.126136. Online publication date: 1-Aug-1991
  • (1991) A new approach for automatic parallelization of blocked linear algebra computations. Proceedings of the 1991 ACM/IEEE conference on Supercomputing (122-129). doi:10.1145/125826.125898. Online publication date: 1-Aug-1991
