DOI: 10.5555/1009385.1010066
Article

Optimizing Parallel Multiplication Operation for Rectangular and Transposed Matrices

Published: 07 July 2004

Abstract

In many applications, matrix multiplication involves matrices of different shapes. The shape of the matrices can significantly impact the performance of a matrix multiplication algorithm. This paper describes extensions of the SRUMMA parallel matrix multiplication algorithm [1] to improve performance for transposed and rectangular matrices. Our approach relies on a set of hybrid algorithms that are chosen based on the shapes of the matrices and the transpose operators involved. The algorithm exploits performance characteristics of clusters and shared memory systems: it differs from other parallel matrix multiplication algorithms by its explicit use of shared memory and remote memory access (RMA) communication rather than message passing. Experimental results on clusters and shared memory systems demonstrate consistent performance advantages over pdgemm from the ScaLAPACK parallel linear algebra package.
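To make the terms in the abstract concrete: a pdgemm-style multiply computes C = op(A) · op(B), where op is either the identity or the transpose, and the operands may be far from square. The following is a minimal serial sketch of that operation, computed one tile of C at a time. It is not the SRUMMA algorithm and does not reproduce the paper's hybrid selection logic; the function names `op` and `blocked_matmul` and the `block` parameter are hypothetical, chosen only for illustration.

```python
def op(m, trans):
    """Return m unchanged, or its transpose as a new list of rows."""
    return [list(col) for col in zip(*m)] if trans else m


def blocked_matmul(a, b, trans_a=False, trans_b=False, block=2):
    """Compute op(a) @ op(b) for rectangular list-of-lists matrices.

    The result is filled in tile by tile. In a parallel setting each tile
    of C could be owned by one process (an assumption for illustration,
    not a description of the paper's data distribution).
    """
    a, b = op(a, trans_a), op(b, trans_b)
    m, k, n = len(a), len(a[0]), len(b[0])
    if len(b) != k:
        raise ValueError("inner dimensions do not match")
    c = [[0] * n for _ in range(m)]
    for i0 in range(0, m, block):          # tile rows of C
        for j0 in range(0, n, block):      # tile columns of C
            for i in range(i0, min(i0 + block, m)):
                for j in range(j0, min(j0 + block, n)):
                    c[i][j] = sum(a[i][p] * b[p][j] for p in range(k))
    return c


if __name__ == "__main__":
    a = [[1, 2, 3], [4, 5, 6]]             # 2 x 3, rectangular
    print(blocked_matmul(a, a, trans_b=True))   # A * A^T, a 2 x 2 result
    print(blocked_matmul(a, a, trans_a=True))   # A^T * A, a 3 x 3 result
```

The same input yields a 2 x 2 or a 3 x 3 result depending on which operand is transposed, which illustrates why the shape and transpose flags change how much data each tile of C needs.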

Cited By

  • (2006) Advances, Applications and Performance of the Global Arrays Shared Memory Programming Toolkit. International Journal of High Performance Computing Applications 20:2 (203-231). DOI: 10.1177/1094342006064503. Online publication date: 1-May-2006
  • (2006) Memory efficient parallel matrix multiplication operation for irregular problems. Proceedings of the 3rd conference on Computing frontiers (229-240). DOI: 10.1145/1128022.1128054. Online publication date: 3-May-2006

Published In

ICPADS '04: Proceedings of the Parallel and Distributed Systems, Tenth International Conference
July 2004
ISBN:0769521525

Publisher

IEEE Computer Society

United States

Qualifiers

  • Article
