DOI: 10.1109/CLUSTER.2012.46
Article

Autotuning Stencil-Based Computations on GPUs

Published: 24 September 2012

Abstract

    Finite-difference, stencil-based discretization approaches are widely used in the solution of partial differential equations describing physical phenomena. Newton-Krylov iterative methods commonly used in stencil-based solutions generate matrices that exhibit diagonal sparsity patterns. To exploit these structures on modern GPUs, we extend the standard diagonal sparse matrix representation and define new matrix and vector data types in the PETSc parallel numerical toolkit. After identifying a number of GPU-specific optimizations and tuning parameters for the operations associated with these types, we create tunable CUDA implementations of them. We discuss our implementation of GPU autotuning capabilities in the Orio framework and present performance results for several kernels, comparing them with vendor-tuned library implementations.
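
As a rough illustration of the diagonal sparsity the abstract refers to, the sketch below stores a matrix in the standard diagonal (DIA) layout and multiplies it by a vector. It is not the extended representation, the PETSc data types, or the CUDA kernels described in the paper; the names `dia_matvec` and `laplacian_1d_dia` are hypothetical, and the 1D three-point stencil is chosen only as a small example.

```python
def dia_matvec(offsets, data, x):
    """Compute y = A @ x for a square matrix A stored by diagonals.

    Assumed layout (standard DIA, row-indexed): data[d][i] holds
    A[i][i + offsets[d]]; entries that fall outside the matrix are ignored.
    """
    n = len(x)
    y = [0.0] * n
    for off, diag in zip(offsets, data):
        # Rows for which column i + off lies inside [0, n).
        lo = max(0, -off)
        hi = min(n, n - off)
        for i in range(lo, hi):
            y[i] += diag[i] * x[i + off]
    return y


def laplacian_1d_dia(n):
    """Three-point stencil (-1, 2, -1) as three stored diagonals."""
    offsets = [-1, 0, 1]
    data = [[-1.0] * n, [2.0] * n, [-1.0] * n]
    return offsets, data


offsets, data = laplacian_1d_dia(5)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = dia_matvec(offsets, data, x)
print(y)
```

Each output row depends only on a few fixed offsets into `x`, so the inner loop over `i` has no dependencies between iterations. A GPU version would plausibly assign one row to each thread, with choices such as block size left as tuning parameters; that mapping is an assumption here, not a description of the paper's kernels.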


    Published In

    CLUSTER '12: Proceedings of the 2012 IEEE International Conference on Cluster Computing
    September 2012
    630 pages
    ISBN:9780769548074

    Publisher

    IEEE Computer Society

    United States

    Author Tags

    1. CUDA
    2. GPU
    3. autotuning
    4. stencil

    Cited By

    • (2024) Automatic Static Analysis-Guided Optimization of CUDA Kernels. Proceedings of the 15th International Workshop on Programming Models and Applications for Multicores and Manycores, pp. 11-21. DOI: 10.1145/3649169.3649249. Online publication date: 3-Mar-2024
    • (2021) Performance portability through machine learning guided kernel selection in SYCL libraries. Parallel Computing 107:C. DOI: 10.1016/j.parco.2021.102813. Online publication date: 1-Oct-2021
    • (2019) Tiling Optimizations for Stencil Computations Using Rewrite Rules in Lift. ACM Transactions on Architecture and Code Optimization 16:4, pp. 1-25. DOI: 10.1145/3368858. Online publication date: 26-Dec-2019
    • (2019) Performance tuning case study on graphics processing unit-accelerated monte carlo simulations for proton therapy. Proceedings of the Conference on Research in Adaptive and Convergent Systems, pp. 1-6. DOI: 10.1145/3338840.3355638. Online publication date: 24-Sep-2019
    • (2019) An Autotuning Protocol to Rapidly Build Autotuners. ACM Transactions on Parallel Computing 5:2, pp. 1-25. DOI: 10.1145/3291527. Online publication date: 4-Jan-2019
    • (2018) A Strategy for Automatic Performance Tuning of Stencil Computations on GPUs. Scientific Programming 2018. DOI: 10.1155/2018/6093054. Online publication date: 1-Jan-2018
    • (2018) High performance stencil code generation with Lift. Proceedings of the 2018 International Symposium on Code Generation and Optimization, pp. 100-112. DOI: 10.1145/3168824. Online publication date: 24-Feb-2018
    • (2017) Tessellating stencils. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-13. DOI: 10.1145/3126908.3126920. Online publication date: 12-Nov-2017
    • (2015) FAST. Proceedings of the 29th ACM on International Conference on Supercomputing, pp. 187-196. DOI: 10.1145/2751205.2751214. Online publication date: 8-Jun-2015
    • (2014) Derivation of optimal input parameters for minimizing execution time of matrix-based computations on a GPU. Parallel Computing 40:10, pp. 628-645. DOI: 10.1016/j.parco.2014.09.002. Online publication date: 1-Dec-2014
