DOI: 10.1145/2792745.2792785
XSEDE Conference Proceedings
Research article

Porting scientific libraries to PGAS in XSEDE resources: practice and experience

Published: 26 July 2015

Abstract

The next generation of supercomputers presents new and complex challenges that may require a change in the current paradigm of parallel application development. Hybrid programming is usually described as the best approach for exascale machines, and PGAS programming models are considered an attractive complement to MPI in this hybrid model for achieving good performance on them. The approach is especially promising for one-sided and irregular communication patterns. However, PGAS is still an emerging technology, and there is little prior experience with porting existing MPI applications to this model. Given its relevance for the next generation of systems, early experience in porting applications, and knowledge of the issues likely to arise in this new paradigm, is valuable. In this paper we present two scientific applications that are currently implemented in MPI and are promising candidates for the PGAS paradigm. We describe how these applications were ported, the challenges faced, and some of the solutions we found. We also show that PGAS models can achieve strong performance compared to MPI.
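The porting details are in the paper itself; as a rough illustration of the one-sided communication style the abstract refers to, the sketch below (not taken from the paper) shows a minimal OpenSHMEM exchange in which each PE writes its rank directly into a neighbour's symmetric buffer, with no matching receive posted on the target side. It assumes an OpenSHMEM 1.2-style API, such as the one provided by MVAPICH2-X.

#include <shmem.h>
#include <stdio.h>

int main(void)
{
    shmem_init();
    int me   = shmem_my_pe();
    int npes = shmem_n_pes();

    /* Symmetric heap allocation: every PE owns a remotely accessible copy. */
    long *dest = (long *) shmem_malloc(sizeof(long));
    *dest = -1;
    long src = (long) me;

    shmem_barrier_all();                 /* ensure all symmetric buffers exist */

    /* One-sided put: deposit our PE number into the next PE's buffer.
       Unlike MPI_Send/MPI_Recv, the target posts no matching receive. */
    shmem_long_put(dest, &src, 1, (me + 1) % npes);

    shmem_barrier_all();                 /* complete all puts before reading   */

    printf("PE %d received %ld\n", me, *dest);

    shmem_free(dest);
    shmem_finalize();
    return 0;
}

Built with an oshcc-style wrapper and run on several PEs, each PE should print the rank of its left neighbour; the point is only that data movement is expressed as a put into remote memory plus a synchronization, which is the pattern that replaces MPI send/receive pairs when code is moved to the PGAS model.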


Cited By

  • (2019) Global Task Data-Dependencies in PGAS Applications. High Performance Computing, pp. 312-329. https://doi.org/10.1007/978-3-030-20656-7_16. Online publication date: 17-May-2019.
  • (2018) Recent experiences in using MPI-3 RMA in the DASH PGAS runtime. Proceedings of Workshops of HPC Asia, pp. 21-30. https://doi.org/10.1145/3176364.3176367. Online publication date: 31-Jan-2018.
  • (2016) DASH: A C++ PGAS Library for Distributed Data Structures and Parallel Algorithms. 2016 IEEE 18th International Conference on High Performance Computing and Communications; IEEE 14th International Conference on Smart City; IEEE 2nd International Conference on Data Science and Systems (HPCC/SmartCity/DSS), pp. 983-990. https://doi.org/10.1109/HPCC-SmartCity-DSS.2016.0140. Online publication date: Dec-2016.

Published In

XSEDE '15: Proceedings of the 2015 XSEDE Conference: Scientific Advancements Enabled by Enhanced Cyberinfrastructure
July 2015
296 pages
ISBN:9781450337205
DOI:10.1145/2792745
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Sponsors

  • San Diego Super Computing Ctr
  • HPCWire
  • Omnibond Systems, LLC
  • SGI
  • Internet2
  • Indiana University
  • CASC: The Coalition for Academic Scientific Computation
  • NICS: National Institute for Computational Sciences
  • Intel
  • DDN: DataDirect Networks, Inc
  • DELL
  • CORSA Technology
  • Allinea Software
  • Cray
  • RENCI: Renaissance Computing Institute

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. MPI
  2. MVAPICH2-X
  3. OpenSHMEM
  4. PGAS
  5. XSEDE
  6. porting
  7. stampede

Qualifiers

  • Research-article

Conference

XSEDE '15

Acceptance Rates

XSEDE '15 Paper Acceptance Rate: 49 of 70 submissions (70%)
Overall Acceptance Rate: 129 of 190 submissions (68%)

