Efficient task placement and routing of nearest neighbor exchanges in dragonfly networks

Authors:

Torsten Hoefler

HPDC '14: Proceedings of the 23rd international symposium on High-performance parallel and distributed computing

Pages 129 - 140

https://doi.org/10.1145/2600212.2600225

Published: 23 June 2014

Abstract

Dragonflies are recent network designs that are among the most promising topologies for the Exascale effort because of their scalability and cost. Although they can achieve very high throughput under uniform random all-to-all traffic, these networks can suffer significant performance degradation under other common high-performance computing workloads, such as stencil (multi-dimensional nearest-neighbor) patterns. Often, the shortfall from peak performance stems from an insufficient understanding of how the workload interacts with the network and of how application-specific task-to-node mapping strategies can serve as optimization vehicles.

To address these issues, we propose a theoretical performance analysis framework that takes as inputs a network specification and a traffic demand matrix characterizing an arbitrary workload, and predicts where bottlenecks will occur in the network and how they will affect the effective sustainable injection bandwidth. We then focus our analysis on a specific high-interest communication pattern, the multi-dimensional Cartesian nearest-neighbor exchange, and provide analytic bounds (owing to bottlenecks in the remote links of the Dragonfly) on its expected performance across a multitude of possible mapping strategies.
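As a rough illustration of the kind of bottleneck analysis described above, the following Python sketch computes per-link loads from a traffic demand matrix and reports the most loaded link together with the fraction of the requested injection rate that the network could sustain. It is not the framework proposed in the paper: it assumes a single fixed path per flow, and the names `link_loads`, `sustainable_fraction`, the router labels `RA` and `RB`, and the toy two-group network are all hypothetical.

```python
from collections import defaultdict


def link_loads(demand, routes):
    """Sum the demand carried by each link.

    demand: {(src, dst): requested rate}
    routes: {(src, dst): [link, ...]}, one fixed path per flow
            (a simplifying assumption, not the paper's routing model)
    """
    loads = defaultdict(float)
    for flow, volume in demand.items():
        for link in routes[flow]:
            loads[link] += volume
    return dict(loads)


def sustainable_fraction(demand, routes, capacity):
    """Return (fraction, bottleneck_link).

    fraction is the largest factor in (0, 1] by which all demands can be
    scaled so that no link exceeds its capacity; bottleneck_link is the
    link that limits it (None if no link carries traffic).
    """
    worst_link, worst_ratio = None, float("inf")
    for link, load in link_loads(demand, routes).items():
        if load <= 0:
            continue
        ratio = capacity[link] / load
        if ratio < worst_ratio:
            worst_link, worst_ratio = link, ratio
    return min(1.0, worst_ratio), worst_link


# Toy example (hypothetical): two groups joined by one remote link RA -> RB.
# Each source node asks to inject at rate 1.0 towards the other group.
demand = {("a0", "b0"): 1.0, ("a1", "b1"): 1.0}
routes = {
    ("a0", "b0"): [("a0", "RA"), ("RA", "RB"), ("RB", "b0")],
    ("a1", "b1"): [("a1", "RA"), ("RA", "RB"), ("RB", "b1")],
}
# Every link has unit capacity in this sketch.
capacity = {link: 1.0 for path in routes.values() for link in path}

fraction, bottleneck = sustainable_fraction(demand, routes, capacity)
print(fraction, bottleneck)  # 0.5 ('RA', 'RB')
```

In this toy case both flows share the single remote link, so each node can sustain only half of its requested injection rate. A different task-to-node mapping would change the `demand` and `routes` inputs and therefore the reported bottleneck.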

Finally, using a comprehensive set of simulation results, we validate the correctness of the theoretical approach. In the process, we address some misconceptions regarding Dragonfly network behavior and evaluation (such as the choice of throughput maximization over workload completion time minimization as the optimization objective), as well as the question of whether the standard notion of Dragonfly balance can be extended to workloads other than uniform random traffic.
