DOI: 10.1145/3229710.3229717
research-article

Evaluating Support for OpenMP Offload Features

Published: 13 August 2018

Abstract

OpenMP language features have been evolving to keep pace with rapid developments in hardware platforms. DOE applications tend to push the bleeding edge of features ratified in the OpenMP specification and to expose the rough edges of their implementations. The software harness on DOE supercomputers such as Titan and the (upcoming) Summit includes the Cray, Clang, Flang, XL, and GCC compilers. It is critical, especially for Summit, that these compilers support OpenMP offloading features. This paper evaluates support for OpenMP 4.5 target offload directives across compiler implementations on Titan and Summitdev. Summitdev is an early-access system one generation removed from Summit's architecture, which lets application teams test the architecture. Our tests not only evaluate the OpenMP implementations but also expose ambiguities in the OpenMP 4.5 specification. We also evaluate compiler implementations using kernels extracted from production DOE applications, which helps assess the interaction of different OpenMP directives independently of other application artifacts. We are aware that the implementations are constantly evolving and are advertised as having only partial OpenMP 4.x support. We see this work as a synergistic effort to identify and correct features that DOE applications require, and to prevent deployment delays later on. Going forward, we also plan to work with standard benchmarking bodies such as SPEC/HPG to donate our tests and mini-apps/kernels for potential inclusion in the next releases of the SPEC OMP and SPEC ACCEL benchmark suites.

Published In

ICPP Workshops '18: Workshop Proceedings of the 47th International Conference on Parallel Processing
August 2018
409 pages
ISBN: 9781450365239
DOI: 10.1145/3229710
© 2018 Association for Computing Machinery. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the United States Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

In-Cooperation

  • University of Oregon

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Evaluation
  2. Offloading
  3. OpenMP 4.5

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP '18 Comp

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%

Article Metrics

  • Downloads (last 12 months): 25
  • Downloads (last 6 weeks): 3

Reflects downloads up to 09 Nov 2024

Cited By

  • (2023) Exploring OpenMP GPU Offloading for Implementing Convolutional Neural Networks. Proceedings of the 14th International Workshop on Programming Models and Applications for Multicores and Manycores, 60-69. DOI: 10.1145/3582514.3582523. Online publication date: 25-Feb-2023
  • (2023) OpenMP Offload Features and Strategies for High Performance across Architectures and Compilers. 2023 IEEE International Parallel and Distributed Processing Symposium Workshops (IPDPSW), 564-573. DOI: 10.1109/IPDPSW59300.2023.00098. Online publication date: May-2023
  • (2022) ECP SOLLVE: Validation and Verification Testsuite Status Update and Compiler Insight for OpenMP. 2022 IEEE/ACM International Workshop on Performance, Portability and Productivity in HPC (P3HPC), 123-135. DOI: 10.1109/P3HPC56579.2022.00017. Online publication date: Nov-2022
  • (2022) A Portable Sparse Solver Framework for Large Matrices on Heterogeneous Architectures. 2022 IEEE 29th International Conference on High Performance Computing, Data, and Analytics (HiPC), 145-155. DOI: 10.1109/HiPC56025.2022.00030. Online publication date: Dec-2022
  • (2021) Resiliency in numerical algorithm design for extreme scale simulations. The International Journal of High Performance Computing Applications (109434202110551). DOI: 10.1177/10943420211055188. Online publication date: 10-Dec-2021
  • (2021) An Empirical Study of Parallelizing Test Execution Using CUDA Unified Memory and OpenMP GPU Offloading. 2021 IEEE International Conference on Software Testing, Verification and Validation Workshops (ICSTW), 271-278. DOI: 10.1109/ICSTW52544.2021.00052. Online publication date: Apr-2021
  • (2020) An open-source solution to performance portability for Summit and Sierra supercomputers. IBM Journal of Research and Development 64:3/4, 12:1-12:23. DOI: 10.1147/JRD.2019.2955944. Online publication date: 1-May-2020
  • (2020) OpenMP Target Device Offloading for the SX-Aurora TSUBASA Vector Engine. Parallel Processing and Applied Mathematics, 237-249. DOI: 10.1007/978-3-030-43229-4_21. Online publication date: 19-Mar-2020
