- research-article, November 2014
Understanding the effects of communication and coordination on checkpointing at scale
SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Pages 883–894, https://doi.org/10.1109/SC.2014.77
Fault-tolerance poses a major challenge for future large-scale systems. Active research into coordinated, uncoordinated, and hybrid checkpointing systems has explored how the introduction of asynchrony can address anticipated scalability issues. However,...
- research-article, November 2014
Nonblocking epochs in MPI one-sided communication
SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Pages 475–486, https://doi.org/10.1109/SC.2014.44
The synchronization model of the MPI one-sided communication paradigm can lead to serialization and latency propagation. For instance, a process can propagate non-RMA communication-related latencies to remote peers waiting in their respective epoch-...
- research-article, November 2014
Slim fly: a cost effective low-diameter network topology
SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Pages 348–359, https://doi.org/10.1109/SC.2014.34
We introduce a high-performance cost-effective network topology called Slim Fly that approaches the theoretically optimal network diameter. Slim Fly is based on graphs that approximate the solution to the degree-diameter problem. We analyze Slim Fly and ...
- research-article, November 2014
Maximizing throughput on a dragonfly network
SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Pages 336–347, https://doi.org/10.1109/SC.2014.33
Interconnection networks are a critical resource for large supercomputers. The dragonfly topology, which provides a low network diameter and large bisection bandwidth, is being explored as a promising option for building multi-Petaflop/s and Exaflop/s ...
- research-article, November 2014
Cypress: combining static and dynamic analysis for top-down communication trace compression
SC '14: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, Pages 143–153, https://doi.org/10.1109/SC.2014.17
Communication traces are increasingly important, both for parallel applications' performance analysis/optimization, and for designing next-generation HPC systems. Meanwhile, the problem size and the execution scale on supercomputers keep growing, ...