Exploiting graphical processing units for data-parallel scientific applications

Published: 01 December 2009

Abstract

Graphical processing units (GPUs) have recently attracted attention for scientific applications such as particle simulations. This is driven partly by the low commodity pricing of GPUs, but also by recent toolkit and library developments that make them more accessible to scientific programmers. We discuss the application of GPU programming to two significantly different paradigms: regular mesh field equations with unusual boundary conditions, and graph analysis algorithms. The differing optimization techniques required for these two paradigms cover many of the challenges faced when developing GPU applications. We discuss the relevance of these application paradigms to simulation engines and games. GPUs were aimed primarily at the accelerated graphics market, but since this market is often closely coupled to advanced game products, it is interesting to speculate about the future of fully integrated accelerator hardware for combined visualization and simulation. As well as reporting the speed-up achieved on selected simulation paradigms, we discuss suitable data-parallel algorithms and present code examples for exploiting GPU features such as large numbers of threads and localized texture memory. We find a surprising variation in the performance that can be achieved on GPUs for our applications and discuss how these findings relate to previously known effects in parallel computing, such as memory speed-related super-linear speed-up. Copyright © 2009 John Wiley & Sons, Ltd.

Published In

Concurrency and Computation: Practice & Experience, Volume 21, Issue 18
December 2009
285 pages
ISSN: 1532-0626
EISSN: 1532-0634

Publisher

John Wiley and Sons Ltd.

United Kingdom

Author Tags

  1. CUDA
  2. GPUs
  3. data-parallelism
  4. field equations
  5. graph algorithms

Qualifiers

  • Article

Cited By

  • (2023) Optimization Techniques for GPU Programming. ACM Computing Surveys 55:11 (1-81). DOI: 10.1145/3570638. Online publication date: 16-Mar-2023
  • (2014) Simulating and benchmarking the shallow-water fluid dynamical equations on multiple graphical processing units. Proceedings of the Twelfth Australasian Symposium on Parallel and Distributed Computing - Volume 152 (29-36). DOI: 10.5555/2667672.2667676. Online publication date: 20-Jan-2014
  • (2014) Developmental directions in parallel accelerators. Proceedings of the Twelfth Australasian Symposium on Parallel and Distributed Computing - Volume 152 (21-27). DOI: 10.5555/2667672.2667675. Online publication date: 20-Jan-2014
  • (2013) Simulating growth kinetics in a data-parallel 3d lattice photobioreactor. Modelling and Simulation in Engineering 2013 (20-20). DOI: 10.1155/2013/153241. Online publication date: 1-Jan-2013
  • (2013) Empirical measurement of instruction level parallelism for four generations of ARM CPUs. Proceedings of the 2013 International Workshop on Programming Models and Applications for Multicores and Manycores (101-106). DOI: 10.1145/2442992.2443003. Online publication date: 23-Feb-2013
  • (2013) Parallel multi-objective Ant Programming for classification using GPUs. Journal of Parallel and Distributed Computing 73:6 (713-728). DOI: 10.1016/j.jpdc.2013.01.017. Online publication date: 1-Jun-2013
  • (2012) Hard-sphere collision simulations with multiple GPUs, PCIe extension buses and GPU-GPU communications. Proceedings of the Tenth Australasian Symposium on Parallel and Distributed Computing - Volume 127 (13-22). DOI: 10.5555/2523685.2523687. Online publication date: 31-Jan-2012
  • (2011) Solving a kind of boundary-value problem for ordinary differential equations using Fermi-The next generation CUDA computing architecture. Journal of Computational and Applied Mathematics 236:3 (384-393). DOI: 10.1016/j.cam.2011.07.028. Online publication date: 1-Sep-2011
  • (2011) On the GPGPU parallelization issues of finite element approximate inverse preconditioning. Journal of Computational and Applied Mathematics 236:3 (294-307). DOI: 10.1016/j.cam.2011.07.016. Online publication date: 1-Sep-2011
  • (2010) Asynchronous Communication Schemes for Finite Difference Methods on Multiple GPUs. Proceedings of the 2010 10th IEEE/ACM International Conference on Cluster, Cloud and Grid Computing (763-768). DOI: 10.1109/CCGRID.2010.86. Online publication date: 17-May-2010
