DOI: 10.1109/ICPP.2011.45
Article

A Comprehensive Performance Comparison of CUDA and OpenCL

Published: 13 September 2011

Abstract

This paper presents a comprehensive performance comparison between CUDA and OpenCL. We have selected 16 benchmarks ranging from synthetic applications to real-world ones. We make an extensive analysis of the performance gaps, taking into account programming models, optimization strategies, architectural details, and underlying compilers. Our results show that, for most applications, CUDA performs at most 30% better than OpenCL. We also show that this difference is due to unfair comparisons: in fact, OpenCL can achieve similar performance to CUDA under a fair comparison. Therefore, we define a fair comparison of the two types of applications, providing guidelines for further analyses. We also investigate OpenCL's portability by running the benchmarks on other prevailing platforms with minor modifications. Overall, we conclude that OpenCL's portability does not fundamentally affect its performance, and OpenCL can be a good alternative to CUDA.

Published In

ICPP '11: Proceedings of the 2011 International Conference on Parallel Processing
September 2011
796 pages
ISBN: 9780769545103

Publisher

IEEE Computer Society

United States

Author Tags

  1. CUDA
  2. OpenCL
  3. Performance Comparison

Qualifiers

  • Article

Cited By

  • (2024) A Comparison of OpenCL, CUDA, and HIP as Compilation Targets for a Functional Array Language. Proceedings of the 1st ACM SIGPLAN International Workshop on Functional Programming for Productivity and Performance, pp. 1-9. DOI: 10.1145/3677997.3678226. Online publication date: 28-Aug-2024
  • (2021) C-for-metal. Proceedings of the 2021 IEEE/ACM International Symposium on Code Generation and Optimization, pp. 289-300. DOI: 10.1109/CGO51591.2021.9370324. Online publication date: 27-Feb-2021
  • (2020) LLVM-based automation of memory decoupling for OpenCL applications on FPGAs. Microprocessors & Microsystems, 72:C. DOI: 10.1016/j.micpro.2019.102909. Online publication date: 1-Feb-2020
  • (2019) VComputeLib. Proceedings of the 17th International Conference on Advances in Mobile Computing & Multimedia, pp. 242-251. DOI: 10.1145/3365921.3365936. Online publication date: 2-Dec-2019
  • (2019) A programmable shared-memory system for an array of processing-in-memory devices. Cluster Computing, 22:2, pp. 385-398. DOI: 10.1007/s10586-018-2844-1. Online publication date: 1-Jun-2019
  • (2018) Performance Comparison of CUDA and OpenACC Based on Optimizations. Proceedings of the 2018 2nd High Performance Computing and Cluster Technologies Conference, pp. 53-57. DOI: 10.1145/3234664.3234681. Online publication date: 22-Jun-2018
  • (2018) MOCL. Proceedings of the 15th ACM International Conference on Computing Frontiers, pp. 26-35. DOI: 10.1145/3203217.3203244. Online publication date: 8-May-2018
  • (2018) New algorithms for fixed-length approximate string matching and approximate circular string matching under the Hamming distance. The Journal of Supercomputing, 74:5, pp. 1815-1834. DOI: 10.1007/s11227-017-2192-6. Online publication date: 1-May-2018
  • (2017) Bounded exhaustive test-input generation on GPUs. Proceedings of the ACM on Programming Languages, 1:OOPSLA, pp. 1-25. DOI: 10.1145/3133918. Online publication date: 12-Oct-2017
  • (2017) CuMF_SGD. Proceedings of the 26th International Symposium on High-Performance Parallel and Distributed Computing, pp. 79-92. DOI: 10.1145/3078597.3078602. Online publication date: 26-Jun-2017
