DOI: 10.1145/1787275.1787323

Hybrid parallel programming with MPI and Unified Parallel C

Published: 17 May 2010

Abstract

The Message Passing Interface (MPI) is one of the most widely used programming models for parallel computing. However, the amount of memory available to an MPI process is limited by the amount of local memory within a compute node. Partitioned Global Address Space (PGAS) models such as Unified Parallel C (UPC) are growing in popularity because of their ability to provide a shared global address space that spans the memories of multiple compute nodes. However, taking advantage of UPC can require a large recoding effort for existing parallel applications.
In this paper, we explore a new hybrid parallel programming model that combines MPI and UPC. This model allows MPI programmers incremental access to a greater amount of memory, enabling memory-constrained MPI codes to process larger data sets. In addition, the hybrid model offers UPC programmers an opportunity to create static UPC groups that are connected over MPI. As we demonstrate, the use of such groups can significantly improve the scalability of locality-constrained UPC codes. This paper presents a detailed description of the hybrid model and demonstrates its effectiveness in two applications: a random access (RA) benchmark and the Barnes-Hut cosmological simulation. Experimental results indicate that the hybrid model can greatly enhance performance: with hybrid UPC groups that span two cluster nodes, RA performance increases by a factor of 1.33, and with groups that span four cluster nodes, Barnes-Hut achieves a twofold speedup at the cost of a 2% increase in code size.
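
To make the model concrete, the following minimal sketch illustrates the general shape of a hybrid MPI+UPC program in the spirit of the abstract: UPC threads within a group cooperate through a shared array, and one designated thread per group exchanges results with the other groups over MPI. It assumes a launch configuration in which thread 0 of each UPC group holds that group's MPI rank (roughly a "funneled" arrangement); the array names, sizes, and this rank mapping are illustrative assumptions, not the paper's actual API or launch mechanics.

    #include <upc.h>   /* UPC runtime library; shared, MYTHREAD, THREADS,
                          upc_barrier, and upc_forall are part of the language */
    #include <mpi.h>
    #include <stdio.h>

    #define N 1024

    /* One shared array per UPC group, distributed cyclically over the group's
     * threads, plus one partial-sum slot per thread (names are illustrative). */
    shared double data[N * THREADS];
    shared double partial[THREADS];

    int main(int argc, char **argv)
    {
        int i, t;
        double local_sum = 0.0, group_sum = 0.0, global_sum = 0.0;

        /* Assumption: only thread 0 of each UPC group participates in MPI,
         * so MPI_COMM_WORLD contains one rank per group. */
        if (MYTHREAD == 0)
            MPI_Init(&argc, &argv);
        upc_barrier;

        /* Each UPC thread initializes and sums the elements it has affinity to. */
        upc_forall (i = 0; i < N * THREADS; i++; &data[i]) {
            data[i] = (double) i;
            local_sum += data[i];
        }

        partial[MYTHREAD] = local_sum;
        upc_barrier;

        if (MYTHREAD == 0) {
            /* Reduce within the UPC group by reading every thread's slot ... */
            for (t = 0; t < THREADS; t++)
                group_sum += partial[t];

            /* ... then reduce across groups over MPI. */
            MPI_Allreduce(&group_sum, &global_sum, 1, MPI_DOUBLE, MPI_SUM,
                          MPI_COMM_WORLD);
            printf("group sum = %g, global sum = %g\n", group_sum, global_sum);
            MPI_Finalize();
        }
        upc_barrier;
        return 0;
    }

The appeal of this arrangement, as the abstract notes, is that an existing MPI code can keep its communication structure while individual ranks are widened into UPC groups that share a larger address space, giving memory-constrained kernels more room without a full rewrite.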

Published In

CF '10: Proceedings of the 7th ACM international conference on Computing frontiers
May 2010
370 pages
ISBN:9781450300445
DOI:10.1145/1787275

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 17 May 2010


Author Tags

  1. hybrid parallel programming
  2. MPI
  3. PGAS
  4. UPC

Qualifiers

  • Research-article

Conference

CF'10: Computing Frontiers Conference
May 17-19, 2010
Bertinoro, Italy

Acceptance Rates

CF '10 Paper Acceptance Rate: 30 of 113 submissions, 27%
Overall Acceptance Rate: 273 of 785 submissions, 35%

Cited By

  • (2021) DiPOSH: A portable OpenSHMEM implementation for short API-to-network path. Concurrency and Computation: Practice and Experience, 33(11). DOI: 10.1002/cpe.6179. Online publication date: 4-Feb-2021.
  • (2020) A Hybrid MPI+PGAS Approach to Improve Strong Scalability Limits of Finite Element Solvers. 2020 IEEE International Conference on Cluster Computing (CLUSTER), 303-313. DOI: 10.1109/CLUSTER49012.2020.00041. Online publication date: Sep-2020.
  • (2020) On the parallelization and performance analysis of Barnes-Hut algorithm using Java parallel platforms. SN Applied Sciences, 2(4). DOI: 10.1007/s42452-020-2386-z. Online publication date: 10-Mar-2020.
  • (2019) Optimized Execution of Parallel Loops via User-Defined Scheduling Policies. Proceedings of the 48th International Conference on Parallel Processing, 1-10. DOI: 10.1145/3337821.3337913. Online publication date: 5-Aug-2019.
  • (2019) Evaluation of Compilers Effects on OpenMP Soft Error Resiliency. 2019 IEEE Computer Society Annual Symposium on VLSI (ISVLSI), 259-264. DOI: 10.1109/ISVLSI.2019.00055. Online publication date: Jul-2019.
  • (2018) Multi-level load balancing with an integrated runtime approach. Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, 31-40. DOI: 10.1109/CCGRID.2018.00018. Online publication date: 1-May-2018.
  • (2018) Erfahrungen beim Aufbau von großen Clustern aus Einplatinencomputern für Forschung und Lehre [Experiences building large clusters of single-board computers for research and teaching]. Informatik-Spektrum, 41(3), 189-199. DOI: 10.1007/s00287-017-1083-9. Online publication date: 5-Jan-2018.
  • (2016) Mobile clusters of single board computers: an option for providing resources to student projects and researchers. SpringerPlus, 5(1). DOI: 10.1186/s40064-016-1981-3. Online publication date: 22-Mar-2016.
  • (2015) Exascale Machines Require New Programming Paradigms and Runtimes. Supercomputing Frontiers and Innovations: an International Journal, 2(2), 6-27. DOI: 10.14529/jsfi150201. Online publication date: 6-Apr-2015.
  • (2015) Design of a Multithreaded Barnes-Hut Algorithm for Multicore Clusters. IEEE Transactions on Parallel and Distributed Systems, 26(7), 1861-1873. DOI: 10.1109/TPDS.2014.2331243. Online publication date: 1-Jul-2015.
