- Article, June 2021
A Tunable Implementation of Quality-of-Service Classes for HPC Networks
- Kevin A. Brown,
- Neil McGlohon,
- Sudheer Chunduri,
- Eric Borch,
- Robert B. Ross,
- Christopher D. Carothers,
- Kevin Harms
Abstract: High-performance computer (HPC) networks are often shared by communication traffic from multiple applications with varying communication characteristics and resource requirements. These applications contend for shared network buffers and channels, ...
- research-article, November 2019
GPCNeT: designing a benchmark suite for inducing and measuring contention in HPC networks
- Sudheer Chunduri,
- Taylor Groves,
- Peter Mendygral,
- Brian Austin,
- Jacob Balma,
- Krishna Kandalla,
- Kalyan Kumaran,
- Glenn Lockwood,
- Scott Parker,
- Steven Warren,
- Nathan Wichmann,
- Nicholas Wright
SC '19: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2019, Article No.: 42, Pages 1–33, https://doi.org/10.1145/3295500.3356215
Network congestion is one of the biggest problems facing HPC systems today, affecting system throughput, performance, user experience, and reproducibility. Congestion manifests as run-to-run variability due to contention for shared resources (e.g., ...
- research-article, June 2019
The Effect of System Utilization on Application Performance Variability
ROSS '19: Proceedings of the 9th International Workshop on Runtime and Operating Systems for Supercomputers, June 2019, Pages 11–18, https://doi.org/10.1145/3322789.3328743
Application performance variability caused by network contention is a major issue on dragonfly based systems. This work-in-progress study makes two contributions. First, we analyze real workload logs and conduct application experiments on the production ...
- research-article, May 2019
Modeling and Analysis of Application Interference on Dragonfly+
SIGSIM-PADS '19: Proceedings of the 2019 ACM SIGSIM Conference on Principles of Advanced Discrete Simulation, May 2019, Pages 161–172, https://doi.org/10.1145/3316480.3325517
Dragonfly class of networks are considered as promising interconnects for next-generation supercomputers. While Dragonfly+ networks offer more path diversity than the original Dragonfly design, they are still prone to performance variability due to ...
Characterization of MPI usage on a production supercomputer
SC '18: Proceedings of the International Conference for High Performance Computing, Networking, Storage, and Analysis, November 2018, Article No.: 30, Pages 1–15, https://doi.org/10.1109/SC.2018.00033
MPI is the most prominent programming model used in scientific computing today. Despite the importance of MPI, however, how scientific computing applications use it in production is not well understood. This lack of understanding is attributed primarily ...
- research-article, May 2018
Parallel low discrepancy parameter sweep for public health policy
CCGrid '18: Proceedings of the 18th IEEE/ACM International Symposium on Cluster, Cloud and Grid Computing, May 2018, Pages 291–300, https://doi.org/10.1109/CCGRID.2018.00044
Numerical simulations are used to analyze the effectiveness of alternate public policy choices in limiting the spread of infections. In practice, it is usually not feasible to predict their precise impacts due to inherent uncertainties, especially at ...
- research-article, November 2017
Run-to-run variability on Xeon Phi based Cray XC systems
- Sudheer Chunduri,
- Kevin Harms,
- Scott Parker,
- Vitali Morozov,
- Samuel Oshin,
- Naveen Cherukuri,
- Kalyan Kumaran
SC '17: Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, November 2017, Article No.: 52, Pages 1–13, https://doi.org/10.1145/3126908.3126926
The increasing complexity of HPC systems has introduced new sources of variability, which can contribute to significant differences in run-to-run performance of applications. With components at various levels of the system contributing variability, ...
- short-paper, May 2017
Analytical Performance Modeling and Validation of Intel's Xeon Phi Architecture
CF'17: Proceedings of the Computing Frontiers Conference, May 2017, Pages 247–250, https://doi.org/10.1145/3075564.3075593
Modeling the performance of scientific applications on emerging hardware plays a central role in achieving extreme-scale computing goals. Analytical models that capture the interaction between applications and hardware characteristics are attractive ...
- research-article, December 2016
Static and Dynamic Frequency Scaling on Multicore CPUs
- Wenlei Bao,
- Changwan Hong,
- Sudheer Chunduri,
- Sriram Krishnamoorthy,
- Louis-Noël Pouchet,
- Fabrice Rastello,
- P. Sadayappan
ACM Transactions on Architecture and Code Optimization (TACO), Volume 13, Issue 4, Article No.: 51, Pages 1–26, https://doi.org/10.1145/3011017
Dynamic Voltage and Frequency Scaling (DVFS) typically adapts CPU power consumption by modifying a processor’s operating frequency (and the associated voltage). Typical DVFS approaches include using default strategies such as running at the lowest or ...