Abstract
NAG has always aimed to make its software available on any type of computer for which there is reasonable demand, which in practice means any computer in widespread use for general-purpose scientific computing. The NAG Fortran 77 Library is currently available on more than fifty different machine ranges and on roughly a hundred different compiler versions. Portability of the library has therefore always been a prime consideration, but the advent of vector and parallel computers has required us to pay much more careful attention to the performance of the library, and the challenge has been to try to satisfy the sometimes conflicting requirements of performance and portability.
We shall discuss how we have approached the development of portable software for modern shared memory machines, and how we are addressing the problem of distributed memory systems.
Copyright information
© 1994 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Hammarling, S. (1994). The challenge of portable libraries for high performance machines. In: Dongarra, J., Waśniewski, J. (eds) Parallel Scientific Computing. PARA 1994. Lecture Notes in Computer Science, vol 879. Springer, Berlin, Heidelberg. https://doi.org/10.1007/BFb0030155
DOI: https://doi.org/10.1007/BFb0030155
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-58712-5
Online ISBN: 978-3-540-49050-0
eBook Packages: Springer Book Archive