Abstract
Modern high-performance machines are challenging to program because they offer a wide array of compute resources that often require low-level, specialized knowledge to exploit. OpenMP is a directive-based approach that effectively exploits shared-memory multicores. The recently introduced OpenMP 4.0 standard extends the directive-based approach to accelerators. However, programming clusters still requires other specialized languages or libraries.
In this work we propose using the target offloading constructs to program nodes distributed in a cluster. We introduce an abstract model of a cluster that defines a clique of distinct shared-memory domains that are manipulated with the target constructs. We have implemented this model in the LLVM compiler with an OpenMP runtime that supports transparent offloading to nodes in a cluster using MPI. Our initial results on HMMER, a widely used bioinformatics tool, show excellent scaling behavior with a small constant-factor overhead compared to a baseline MPI implementation. Our work raises the intriguing possibility of a natural progression for a program: from serial execution, to parallel execution on a multicore, to offloading onto accelerators, and finally, with minimal additional effort, to execution on a cluster.
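As a rough illustration of the idea, the sketch below offloads a reduction with the standard OpenMP 4.0 `target` and `map` constructs. It is not the authors' implementation: the function name `sum_squares_offload` is hypothetical, and treating `device_id` as the index of a cluster node is an assumption drawn from the model described above. With a conventional OpenMP toolchain the same code targets an accelerator or falls back to the host, and with OpenMP disabled the pragmas are ignored and the loop runs serially.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical helper: sums the squares of x[0..n-1] inside a target region.
 * Assumption: under the cluster model sketched in the abstract, device_id
 * would name a remote node rather than an accelerator. */
double sum_squares_offload(const double *x, int n, int device_id)
{
    assert(x != NULL && n > 0);
    double total = 0.0;

    /* Copy the input to the target's memory domain and bring the
     * accumulated result back when the region ends. */
    #pragma omp target device(device_id) map(to: x[0:n]) map(tofrom: total)
    /* Inside the target, use its cores exactly as on a shared-memory host. */
    #pragma omp parallel for reduction(+:total)
    for (int i = 0; i < n; i++) {
        total += x[i] * x[i];
    }

    return total;
}
```

A caller would pass an array, its length, and a device number, for example `sum_squares_offload(data, 4, 0)`. The source is the same at every stage of the progression; only the compiler flags and the runtime behind the `target` construct change.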
Notes
1. For example, the IBM Power® System E880 is configurable up to 16 TB. See http://www-03.ibm.com/systems/power/hardware/e880/.
2. HMMER 3.1b2: http://hmmer.org.
© 2015 Springer International Publishing Switzerland
About this paper
Cite this paper
Jacob, A.C. et al. (2015). Exploiting Fine- and Coarse-Grained Parallelism Using a Directive Based Approach. In: Terboven, C., de Supinski, B., Reble, P., Chapman, B., Müller, M. (eds) OpenMP: Heterogenous Execution and Data Movements. IWOMP 2015. Lecture Notes in Computer Science, vol 9342. Springer, Cham. https://doi.org/10.1007/978-3-319-24595-9_3
DOI: https://doi.org/10.1007/978-3-319-24595-9_3
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24594-2
Online ISBN: 978-3-319-24595-9
eBook Packages: Computer Science (R0)