Adapting linear algebra codes to the memory hierarchy using a hypermatrix scheme

Authors:

José R. Herrero,

Juan J. Navarro

PPAM'05: Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics

Pages 1058 - 1065

https://doi.org/10.1007/11752578_128

Published: 11 September 2005

Abstract

We describe how we adapt data and computations to the underlying memory hierarchy by means of a hierarchical data structure known as a hypermatrix. Applying orthogonal block forms produced the best performance on the platforms used.
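To make the idea of a hierarchical block structure concrete, the following is a minimal sketch, not the authors' implementation. It assumes a two-level layout: an upper level holding references to dense data submatrices, with `None` marking blocks that are entirely zero. The names `to_hypermatrix`, `hm_multiply`, `block_multiply_add`, and `to_dense`, as well as the block size `bs`, are hypothetical and chosen only for illustration. The sketch does not reproduce the paper's kernels or its orthogonal block forms.

```python
from typing import List, Optional

Block = List[List[float]]                  # dense data submatrix (bottom level)
HyperMatrix = List[List[Optional[Block]]]  # upper level of references; None = zero block


def to_hypermatrix(a: List[List[float]], bs: int) -> HyperMatrix:
    """Partition a square matrix into bs x bs blocks (assumes len(a) % bs == 0)."""
    n = len(a)
    assert n % bs == 0, "illustrative sketch: size must be a multiple of bs"
    nb = n // bs
    hm: HyperMatrix = []
    for bi in range(nb):
        row: List[Optional[Block]] = []
        for bj in range(nb):
            block = [a[bi * bs + i][bj * bs:(bj + 1) * bs] for i in range(bs)]
            # Store no data at all for blocks that contain only zeros.
            nonzero = any(v != 0 for r in block for v in r)
            row.append(block if nonzero else None)
        hm.append(row)
    return hm


def block_multiply_add(c: Block, a: Block, b: Block) -> None:
    """c += a * b on dense blocks; a tuned kernel would replace this loop nest."""
    bs = len(a)
    for i in range(bs):
        for k in range(bs):
            aik = a[i][k]
            for j in range(bs):
                c[i][j] += aik * b[k][j]


def hm_multiply(x: HyperMatrix, y: HyperMatrix, bs: int) -> HyperMatrix:
    """Block-level product that skips every pair involving a zero block."""
    nb = len(x)
    z: HyperMatrix = [[None] * nb for _ in range(nb)]
    for i in range(nb):
        for j in range(nb):
            for k in range(nb):
                if x[i][k] is None or y[k][j] is None:
                    continue
                if z[i][j] is None:
                    z[i][j] = [[0.0] * bs for _ in range(bs)]
                block_multiply_add(z[i][j], x[i][k], y[k][j])
    return z


def to_dense(hm: HyperMatrix, bs: int) -> List[List[float]]:
    """Expand the block structure back into a plain 2-D list."""
    nb = len(hm)
    out = [[0.0] * (nb * bs) for _ in range(nb * bs)]
    for bi in range(nb):
        for bj in range(nb):
            blk = hm[bi][bj]
            if blk is None:
                continue
            for i in range(bs):
                for j in range(bs):
                    out[bi * bs + i][bj * bs + j] = blk[i][j]
    return out
```

In this sketch the block size determines how much data each inner kernel call touches, which is the knob one would tune so that a block fits a given cache level. A deeper hierarchy would nest further levels of references in the same way.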

Cited By

Herrero J, Navarro J (2006) Using non-canonical array layouts in dense matrix operations. Proceedings of the 8th international conference on Applied parallel computing: state of the art in scientific computing, 580-588. DOI: 10.5555/1775059.1775141. Online publication date: 18-Jun-2006
https://dl.acm.org/doi/10.5555/1775059.1775141
Herrero J, Navarro J (2006) Compiler-optimized kernels. Proceedings of the 2006 international conference on Computational Science and Its Applications - Volume Part V, 762-771. DOI: 10.1007/11751649_84. Online publication date: 8-May-2006
https://dl.acm.org/doi/10.1007/11751649_84

Index Terms

Adapting linear algebra codes to the memory hierarchy using a hypermatrix scheme
1. Computing methodologies
  1. Symbolic and algebraic manipulation
    1. Symbolic and algebraic algorithms
      1. Linear algebra algorithms

Index terms have been assigned to the content through auto-classification.

Information & Contributors

Information

Published In

PPAM'05: Proceedings of the 6th international conference on Parallel Processing and Applied Mathematics

September 2005

1121 pages

ISBN:3540341412

Editors:
Roman Wyrzykowski (Institute of Computational and Information Sciences, Czestochowa University of Technology)
Jack Dongarra (Computer Science Department, University of Tennessee, Knoxville, TN)
Norbert Meyer (Computer Science Department, Poznan Supercomputing and Networking Center, Poland)
Jerzy Waśniewski (Informatics & Mathematical Modeling, Technical University of Denmark, Lyngby, Denmark)

Publisher

Springer-Verlag

Berlin, Heidelberg

Qualifiers

Article
