
Three Algorithms for Cholesky Factorization on Distributed Memory Using Packed Storage

  • Conference paper
Applied Parallel Computing. State of the Art in Scientific Computing (PARA 2006)

Part of the book series: Lecture Notes in Computer Science (LNTCS, volume 4699)


Abstract

We present three algorithms for Cholesky factorization using minimum block storage for a distributed memory (DM) environment. One of the distributed square block packed (SBP) format algorithms performs similarly to ScaLAPACK PDPOTRF, and our algorithm with iteration overlapping typically outperforms it by 15–50% for small and medium-sized matrices. By storing the blocks contiguously, we get better-performing BLAS operations. Our DM algorithms are not sensitive to cache conflicts and thus give smooth and predictable performance. We also investigate the intricacies of using rectangular full packed (RFP) format with ScaLAPACK routines and point out some advantages and drawbacks.
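
The abstract does not spell out the SBP layout, so the following is only a minimal single-process sketch of the general idea, under stated assumptions: the lower triangle is cut into square nb-by-nb blocks, each block is stored contiguously (row-major here), and the blocks are packed in block-column order. The helper names (`sbp_index`, `to_sbp`, `cholesky_sbp`, and the small block kernels standing in for BLAS/LAPACK calls) are hypothetical, the matrix order is assumed to be a multiple of nb, and nothing here models the distribution over processes or the iteration overlapping described in the paper.

```python
import math

def sbp_index(i, j, N):
    """Packed position of lower-triangle block (i, j), i >= j (assumed block-column order)."""
    return j * N - j * (j - 1) // 2 + (i - j)

def to_sbp(A, nb):
    """Pack the lower triangle of dense A into contiguous nb*nb blocks (n % nb == 0 assumed)."""
    N = len(A) // nb
    blocks = []
    for j in range(N):
        for i in range(j, N):
            blocks.append([float(A[i * nb + r][j * nb + c])
                           for r in range(nb) for c in range(nb)])
    return blocks, N

def potrf(a, nb):
    """In-place Cholesky of one diagonal block; only its lower part is read."""
    for c in range(nb):
        d = math.sqrt(a[c * nb + c] - sum(a[c * nb + p] ** 2 for p in range(c)))
        a[c * nb + c] = d
        for r in range(c + 1, nb):
            s = sum(a[r * nb + p] * a[c * nb + p] for p in range(c))
            a[r * nb + c] = (a[r * nb + c] - s) / d
    for r in range(nb):                      # clear the unused upper part
        for c in range(r + 1, nb):
            a[r * nb + c] = 0.0

def trsm(b, l, nb):
    """In-place solve of X * L^T = B for X, with L lower triangular."""
    for r in range(nb):
        for c in range(nb):
            s = sum(b[r * nb + p] * l[c * nb + p] for p in range(c))
            b[r * nb + c] = (b[r * nb + c] - s) / l[c * nb + c]

def gemm_nt(c, a, b, nb):
    """C -= A * B^T on contiguous blocks."""
    for r in range(nb):
        for q in range(nb):
            c[r * nb + q] -= sum(a[r * nb + p] * b[q * nb + p] for p in range(nb))

def cholesky_sbp(blocks, N, nb):
    """Right-looking blocked Cholesky, overwriting the packed blocks with L."""
    for k in range(N):
        lkk = blocks[sbp_index(k, k, N)]
        potrf(lkk, nb)                       # factor the diagonal block
        for i in range(k + 1, N):            # scale the panel below it
            trsm(blocks[sbp_index(i, k, N)], lkk, nb)
        for j in range(k + 1, N):            # update the trailing lower triangle
            for i in range(j, N):
                gemm_nt(blocks[sbp_index(i, j, N)],
                        blocks[sbp_index(i, k, N)],
                        blocks[sbp_index(j, k, N)], nb)

def to_dense_lower(blocks, N, nb):
    """Unpack the factor into a dense lower-triangular matrix."""
    n = N * nb
    L = [[0.0] * n for _ in range(n)]
    for j in range(N):
        for i in range(j, N):
            blk = blocks[sbp_index(i, j, N)]
            for r in range(nb):
                for c in range(nb):
                    L[i * nb + r][j * nb + c] = blk[r * nb + c]
    return L
```

For a 4-by-4 input with entries `min(i, j) + 1` and `nb = 2`, `to_sbp` yields three blocks instead of four, and the unpacked factor from `cholesky_sbp` is the all-ones lower-triangular matrix. Because every block is one contiguous run of nb*nb values, each kernel works on a compact operand, which is the property the abstract credits for better-performing BLAS operations.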




Editor information

Bo Kågström Erik Elmroth Jack Dongarra Jerzy Waśniewski


Copyright information

© 2007 Springer-Verlag Berlin Heidelberg

About this paper

Cite this paper

Gustavson, F.G., Karlsson, L., Kågström, B. (2007). Three Algorithms for Cholesky Factorization on Distributed Memory Using Packed Storage. In: Kågström, B., Elmroth, E., Dongarra, J., Waśniewski, J. (eds) Applied Parallel Computing. State of the Art in Scientific Computing. PARA 2006. Lecture Notes in Computer Science, vol 4699. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75755-9_67


  • DOI: https://doi.org/10.1007/978-3-540-75755-9_67

  • Publisher Name: Springer, Berlin, Heidelberg

  • Print ISBN: 978-3-540-75754-2

  • Online ISBN: 978-3-540-75755-9

  • eBook Packages: Computer Science, Computer Science (R0)
