Enabling highly scalable remote memory access programming with MPI-3 one sided

Authors:

Robert Gerstenberger,

Torsten Hoefler

Communications of the ACM, Volume 61, Issue 10

Pages 106 - 113

https://doi.org/10.1145/3264413

Published: 26 September 2018

Abstract

Modern high-performance networks offer remote direct memory access (RDMA), which exposes a process's virtual address space to other processes in the network. The Message Passing Interface (MPI) specification has recently been extended with a programming interface called MPI-3 Remote Memory Access (MPI-3 RMA) for efficiently exploiting state-of-the-art RDMA features. MPI-3 RMA enables a powerful programming model that alleviates many of the downsides of message passing. In this work, we design and develop bufferless protocols that demonstrate how to implement this interface and support scaling to millions of cores with negligible memory consumption while providing the highest performance and minimal overheads. To arm programmers, we provide a spectrum of performance models for RMA functions that enable rigorous mathematical analysis of application performance and facilitate the development of codes that solve given tasks within specified time and energy budgets. We validate the usability of our library and models in several application studies with up to half a million processes. In a wider sense, our work illustrates how to use RMA principles to accelerate computation- and data-intensive codes.

Information

Published In

Communications of the ACM Volume 61, Issue 10

October 2018

107 pages

ISSN:0001-0782

EISSN:1557-7317

DOI:10.1145/3281635

Editor:
Andrew A. Chien
Association for Computing Machinery, New York, NY

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed

Funding Sources

ETH Zurich
