
Quantifying the Performance Benefits of Partitioned Communication in MPI

Published: 13 September 2023

Abstract

Partitioned communication was introduced in MPI 4.0 as a user-friendly interface for pipelined communication patterns, which are particularly common in MPI+threads settings. It lets the user divide a global buffer into smaller independent chunks, called partitions, which can then be communicated independently. In this work, we first model the performance gain that can be expected when using partitioned communication. Next, we describe the improvements we made to MPICH to enable those gains and to provide a high-quality implementation of MPI partitioned communication. We then evaluate partitioned communication in various common use cases and compare its performance with other MPI point-to-point and one-sided approaches. Specifically, we first investigate two scenarios commonly encountered for small partition sizes in a multithreaded environment: thread contention and the overhead of using many partitions. We propose two solutions to alleviate the measured penalty and demonstrate their use. We then focus on large messages and the gain obtained when exploiting the delay resulting from computation or load imbalance. We conclude with our perspectives on the benefits of partitioned communication and the various results obtained.
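The pipelining idea behind partitioned communication can be illustrated with a toy analytical model (an illustrative sketch only, not the model developed in the paper): if a computation of total time C produces a message whose total transfer time is M, splitting the buffer into n equal partitions lets each partition's transfer overlap with the computation of the next one.

```python
def pipelined_time(C, M, n):
    """Toy pipeline model (illustrative assumption, not the paper's model).

    C: total time to compute the full buffer
    M: total time to transfer the full buffer
    n: number of equal partitions

    With n == 1 this recovers the fully serialized time C + M.
    """
    c, m = C / n, M / n  # per-partition compute and transfer time
    # The first partition must be computed before anything can be sent,
    # the slower of (compute, transfer) dominates each of the n-1 middle
    # stages, and the last transfer cannot overlap with any computation.
    return c + (n - 1) * max(c, m) + m

# With C == M, pipelining can hide almost all of the transfer time:
serial = pipelined_time(10.0, 10.0, 1)  # 20.0 (no overlap)
piped = pipelined_time(10.0, 10.0, 8)   # 11.25 (most transfers hidden)
```

In this simple model the achievable saving is bounded by min(C, M), and the benefit of adding partitions saturates once the per-partition overhead (ignored here) becomes comparable to c and m, which is consistent with the small-partition penalties studied in the paper.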


Cited By

  • (2024) Partitioned Reduction for Heterogeneous Environments. 32nd Euromicro International Conference on Parallel, Distributed and Network-Based Processing (PDP), 285-289. https://doi.org/10.1109/PDP62718.2024.00047
  • (2024) CMB: A Configurable Messaging Benchmark to Explore Fine-Grained Communication. IEEE 24th International Symposium on Cluster, Cloud and Internet Computing (CCGrid), 28-38. https://doi.org/10.1109/CCGrid59990.2024.00013
  • (2024) To Share or Not to Share: A Case for MPI in Shared-Memory. Recent Advances in the Message Passing Interface, 89-102. https://doi.org/10.1007/978-3-031-73370-3_6
  • (2023) Towards Correctness Checking of MPI Partitioned Communication in MUST. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 224-227. https://doi.org/10.1145/3624062.3624089

Published In

ICPP '23: Proceedings of the 52nd International Conference on Parallel Processing
August 2023
858 pages
ISBN:9798400708435
DOI:10.1145/3605573
Publication rights licensed to ACM. ACM acknowledges that this contribution was authored or co-authored by an employee, contractor or affiliate of the United States government. As such, the Government retains a nonexclusive, royalty-free right to publish or reproduce this article, or to allow others to do so, for Government purposes only.

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. MPI
  2. distributed systems
  3. partitioned communication

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

ICPP 2023
ICPP 2023: 52nd International Conference on Parallel Processing
August 7-10, 2023
Salt Lake City, UT, USA

Acceptance Rates

Overall Acceptance Rate 91 of 313 submissions, 29%


Article Metrics

  • Downloads (last 12 months): 94
  • Downloads (last 6 weeks): 7
Reflects downloads up to 24 December 2024.

