Micro-benchmarking MPI partitioned point-to-point communication
Y Hassan Temucin, RE Grant, A Afsahi
Proceedings of the 51st International Conference on Parallel Processing, 2022 • dl.acm.org
Modern High-Performance Computing (HPC) architectures have created a need for scalable hybrid programming models. The latest Message Passing Interface (MPI) 4.0 standard has introduced a new communication model: MPI Partitioned Point-to-Point communication. This model allows multiple threads to contribute data with lower overheads than traditional MPI point-to-point communication. In this paper, we design the first publicly available micro-benchmark suite for MPI Partitioned. It measures metrics that give insight into the benefits of the new model and into scenarios where traditional MPI point-to-point communication is better suited. We provide suggestions to application developers on how to choose a partition size based on their application's compute and message size. We evaluate MPI Partitioned communication with both hot and cold CPU caches, under system noise drawn from different probability distributions, in direct point-to-point communication, and in commonly used MPI communication patterns such as a halo exchange and Sweep3D.