PARTANS: An autotuning framework for stencil computation on multi-GPU systems

T Lutz, C Fensch, M Cole - ACM Transactions on Architecture and Code …, 2013 - dl.acm.org

ACM Transactions on Architecture and Code Optimization (TACO), 2013 • dl.acm.org

GPGPUs are a powerful and energy-efficient solution for many problems. For higher
performance or larger problems, it is necessary to distribute the problem across multiple
GPUs, increasing the already high programming complexity.

In this article, we focus on abstracting the complexity of multi-GPU programming for stencil computation. We show that the best strategy depends not only on the stencil operator, problem size, and GPU, but also on the PCI express layout. This adds nonuniform characteristics to a seemingly homogeneous setup, causing up to 23% performance loss. We address this issue with an autotuner that optimizes the distribution across multiple GPUs.
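To make the idea of tuning the distribution concrete, the following is a minimal sketch and not the PARTANS implementation or its API. It assumes a 2D grid split into contiguous row slabs, one per GPU, and it replaces measured runtimes with a toy analytical cost model in which each device has a hypothetical compute rate and a hypothetical link bandwidth. All function and parameter names (`split_rows`, `halo_rows`, `estimate_cost`, `autotune`, `compute_rate`, `link_bandwidth`) are illustrative assumptions.

```python
from itertools import product


def split_rows(n_rows, weights):
    """Split n_rows into contiguous slabs sized in proportion to weights."""
    total = sum(weights)
    bounds, start, acc = [], 0, 0
    for i, w in enumerate(weights):
        acc += w
        # The last slab always ends at n_rows so that no row is lost to rounding.
        end = n_rows if i == len(weights) - 1 else round(n_rows * acc / total)
        bounds.append((start, end))
        start = end
    return bounds


def halo_rows(bounds, radius):
    """Rows each slab receives from its neighbours for a stencil of this radius."""
    last = len(bounds) - 1
    return [(radius if i > 0 else 0) + (radius if i < last else 0)
            for i in range(len(bounds))]


def estimate_cost(bounds, radius, n_cols, compute_rate, link_bandwidth):
    """Toy cost model (an assumption): per-device compute time plus halo
    transfer time, with the slowest device setting the step time."""
    halos = halo_rows(bounds, radius)
    times = [(hi - lo) * n_cols / rate + h * n_cols / bw
             for (lo, hi), h, rate, bw
             in zip(bounds, halos, compute_rate, link_bandwidth)]
    return max(times)


def autotune(n_rows, n_cols, radius, compute_rate, link_bandwidth,
             steps=(1, 2, 3)):
    """Exhaustively try relative slab weights and keep the cheapest split."""
    best = None
    for weights in product(steps, repeat=len(compute_rate)):
        bounds = split_rows(n_rows, weights)
        # Skip splits whose slabs are too thin to supply a full halo.
        if any(hi - lo < radius for lo, hi in bounds):
            continue
        cost = estimate_cost(bounds, radius, n_cols,
                             compute_rate, link_bandwidth)
        if best is None or cost < best[0]:
            best = (cost, bounds)
    return best


if __name__ == "__main__":
    # Two identical GPUs, but the second sits behind a much slower link.
    cost, bounds = autotune(100, 10, 1, [1.0, 1.0], [1.0, 0.01])
    print(bounds)
```

With equal link bandwidths the search settles on an even split. When one device has a slower link, the search shifts rows away from it, which is the kind of nonuniformity in a seemingly homogeneous setup that the abstract describes. A real autotuner would time actual kernel runs rather than rely on a model like this one.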
