Enabling flexible collective communication offload with triggered operations

KD Underwood, J Coffman, R Larsen, KS Hemmert, BW Barrett, R Brightwell, M Levenhagen
2011 IEEE 19th Annual Symposium on High Performance Interconnects, 2011 - ieeexplore.ieee.org
Low-latency collective communications are key to application scalability. As systems grow larger, minimizing collective communication time becomes increasingly challenging. Offload is an effective technique for accelerating collective operations; however, algorithms for collective communication constantly evolve, so flexible implementations are critical. This paper presents triggered operations, a semantic building block that allows the key components of collective communications to be offloaded while allowing the host-side software to define the algorithm. Simulations are used to demonstrate the performance improvements achievable through the offload of MPI_Allreduce using these building blocks.
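To make the idea of a triggered operation concrete, the following is a minimal host-only sketch in Python. It assumes a counter-with-threshold model of triggering, which the abstract does not spell out, and it uses a binary tree reduce-then-broadcast as a stand-in algorithm. The names `Counter`, `trigger_at`, `Node`, and `simulated_allreduce` are hypothetical; they are not the paper's interface or any real network API. The point it illustrates is the split described above: the "host" arms every operation up front and thereby defines the algorithm, and the operations then fire on their own as data arrives.

```python
class Counter:
    """Hypothetical counting event: runs armed operations once the count
    reaches their threshold (an assumed model, not the paper's API)."""

    def __init__(self):
        self.value = 0
        self.pending = []  # list of (threshold, operation) pairs

    def trigger_at(self, threshold, operation):
        """Arm an operation to run when the count reaches `threshold`."""
        self.pending.append((threshold, operation))
        self._fire_ready()

    def increment(self):
        self.value += 1
        self._fire_ready()

    def _fire_ready(self):
        ready = [op for t, op in self.pending if t <= self.value]
        self.pending = [(t, op) for t, op in self.pending if t > self.value]
        for op in ready:
            op()


class Node:
    """One simulated process with a reduction buffer and a result slot."""

    def __init__(self):
        self.acc = 0                # running sum of contributions
        self.acc_ct = Counter()     # counts contributions added to acc
        self.result = None          # final allreduce value
        self.result_ct = Counter()  # counts arrival of the final value


def simulated_allreduce(values):
    """Sum-allreduce over an assumed binary tree rooted at rank 0."""
    n = len(values)
    nodes = [Node() for _ in range(n)]

    # Setup phase: the "host" arms every operation before any data moves.
    for rank, node in enumerate(nodes):
        kids = [nodes[c] for c in (2 * rank + 1, 2 * rank + 2) if c < n]
        threshold = len(kids) + 1  # own contribution plus one per child

        if rank == 0:
            def finish(node=node):
                # Root has the full sum; start the broadcast phase.
                node.result = node.acc
                node.result_ct.increment()
            node.acc_ct.trigger_at(threshold, finish)
        else:
            parent = nodes[(rank - 1) // 2]

            def send_up(node=node, parent=parent):
                # Simulated atomic add into the parent's buffer.
                parent.acc += node.acc
                parent.acc_ct.increment()
            node.acc_ct.trigger_at(threshold, send_up)

        for kid in kids:
            def send_down(node=node, kid=kid):
                # Forward the final value one level down the tree.
                kid.result = node.result
                kid.result_ct.increment()
            node.result_ct.trigger_at(1, send_down)

    # Data phase: each process contributes; armed operations do the rest.
    for node, value in zip(nodes, values):
        node.acc += value
        node.acc_ct.increment()

    return [node.result for node in nodes]


if __name__ == "__main__":
    print(simulated_allreduce([1, 2, 3, 4, 5]))
```

Changing the algorithm in this sketch only means changing which operations the setup loop arms and at which thresholds; the triggering mechanism itself stays the same, which is the flexibility argument the abstract makes.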