DOI: 10.1109/PDP.2015.89
Article

A Flexible and Portable Large-Scale DGEMM Library for Linpack on Next-Generation Multi-GPU Systems

Published: 04 March 2015
  • Abstract

    In recent years, high performance computing has benefited greatly from special accelerator cards such as GPUs. Matrix multiplication, performed by the BLAS function DGEMM, is one of the prime examples where such accelerators excel. DGEMM is the computational hotspot of many tasks, among them the Linpack benchmark. Current GPUs achieve more than 1 TFLOPS of real performance in this task. Because GPUs are connected via PCI Express, multiple GPUs can easily be installed in a single compute node. This enables the construction of multi-TFLOPS systems out of off-the-shelf components. At such high performance, it is often difficult to feed the GPUs with sufficient data to run at full performance. In this paper we first analyze the scalability of our DGEMM implementation for multiple fast GPUs. Then we suggest a new scheme optimized for this situation and present an implementation.


        Published In

        PDP '15: Proceedings of the 2015 23rd Euromicro International Conference on Parallel, Distributed, and Network-Based Processing
        March 2015
        768 pages
        ISBN: 9781479984916

        Publisher

        IEEE Computer Society

        United States

        Author Tag

        1. HPL Linpack DGEMM OpenCL CUDA BLAS GPU multi-GPU DMA

        Qualifiers

        • Article

        Cited By

        • (2024) An Illustration of Extending Hedgehog to Multi-Node GPU Architectures Using GEMM. SN Computer Science 5:5. DOI: 10.1007/s42979-024-02917-y. Online publication date: 15-Jun-2024
        • (2023) 5 ExaFlop/s HPL-MxP Benchmark with Linear Scalability on the 40-Million-Core Sunway Supercomputer. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-13. DOI: 10.1145/3581784.3607030. Online publication date: 12-Nov-2023
        • (2022) SnuHPL. Proceedings of the 36th ACM International Conference on Supercomputing, pp. 1-12. DOI: 10.1145/3524059.3532370. Online publication date: 28-Jun-2022
        • (2021) HPC LINPACK Parameter Optimization on Homo-/Heterogeneous System of ARM Neoverse N1SDP. The International Conference on High Performance Computing in Asia-Pacific Region, pp. 139-143. DOI: 10.1145/3432261.3439864. Online publication date: 20-Jan-2021
