A Quantitative Analysis of Node Sharing on HPC Clusters Using XDMoD Application Kernels

Published: 17 July 2016

Abstract

In this investigation, we study how application performance is affected when jobs are permitted to share compute nodes. A series of application kernels consisting of a diverse set of benchmark calculations were run in both exclusive and node-sharing modes on the Center for Computational Research's high-performance computing (HPC) cluster. Very little increase in runtime was observed due to job contention among application kernel jobs run on shared nodes. The small differences in runtime were quantitatively modeled in order to characterize the resource contention and attempt to determine the circumstances under which it would or would not be important. A machine learning regression model applied to the runtime data successfully fitted the small differences between the exclusive and shared node runtime data; it also provided insight into the contention for node resources that occurs when jobs are allowed to share nodes. Analysis of a representative job mix shows that runtime of shared jobs is affected primarily by the memory subsystem, in particular by the reduction in the effective cache size due to sharing; this leads to higher utilization of DRAM. Insights such as these are crucial when formulating policies proposing node sharing as a mechanism for improving HPC utilization.
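The core comparison described above can be loosely illustrated in code. The sketch below is not the authors' pipeline (they fit a machine learning regression model to the runtime data); it only shows the simpler first step of quantifying, per application kernel, the relative slowdown of shared-node runs over exclusive-node runs. All kernel names and runtimes here are made-up example values.

```python
from statistics import mean

# Hypothetical runtimes in seconds for each application kernel, measured in
# both exclusive and node-sharing modes (numbers are illustrative only).
runtimes = {
    "NAMD":   {"exclusive": [100.0, 102.0, 101.0], "shared": [103.0, 104.0, 105.0]},
    "NWChem": {"exclusive": [200.0, 198.0, 202.0], "shared": [205.0, 207.0, 204.0]},
}

def relative_slowdown(kernel: str) -> float:
    """Mean shared-mode runtime divided by mean exclusive-mode runtime, minus 1.

    A value near zero indicates little contention from node sharing.
    """
    r = runtimes[kernel]
    return mean(r["shared"]) / mean(r["exclusive"]) - 1.0

for k in runtimes:
    print(f"{k}: {relative_slowdown(k):+.1%}")
```

With runtime differences this small (a few percent), attributing them to specific subsystems requires the kind of regression modeling the paper performs, using hardware counters such as cache and DRAM utilization as predictors.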




Published In

XSEDE16: Proceedings of the XSEDE16 Conference on Diversity, Big Data, and Science at Scale
July 2016
405 pages
ISBN: 9781450347556
DOI: 10.1145/2949550
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. HPC
  2. SUPReMM
  3. TACC_Stats
  4. XDMoD
  5. node sharing
  6. performance co-pilot

Qualifiers

  • Research-article
  • Research
  • Refereed limited

Conference

XSEDE16

Acceptance Rates

Overall Acceptance Rate 129 of 190 submissions, 68%


Cited By

  • (2024) Software Resource Disaggregation for HPC with Serverless Computing. 2024 IEEE International Parallel and Distributed Processing Symposium (IPDPS), pp. 139-156. DOI: 10.1109/IPDPS57955.2024.00021
  • (2023) Are We Ready for Broader Adoption of ARM in the HPC Community: Performance and Energy Efficiency Analysis of Benchmarks and Applications Executed on High-End ARM Systems. Proceedings of the HPC Asia 2023 Workshops, pp. 78-86. DOI: 10.1145/3581576.3581618
  • (2021) Satori. Proceedings of the 48th Annual International Symposium on Computer Architecture, pp. 292-305. DOI: 10.1109/ISCA52012.2021.00031
  • (2019) Spread-n-Share. Proceedings of the International Conference for High Performance Computing, Networking, Storage and Analysis, pp. 1-15. DOI: 10.1145/3295500.3356152
  • (2019) Contention Aware Workload and Resource Co-Scheduling on Power-Bounded Systems. 2019 IEEE International Conference on Networking, Architecture and Storage (NAS), pp. 1-8. DOI: 10.1109/NAS.2019.8834721
  • (2019) Opportunities for Partitioning Non-volatile Memory DIMMs Between Co-scheduled Jobs on HPC Nodes. Euro-Par 2019: Parallel Processing Workshops, pp. 82-94. DOI: 10.1007/978-3-030-48340-1_7
  • (2018) Slurm Simulator. Proceedings of the Practice and Experience on Advanced Research Computing: Seamless Creativity, pp. 1-8. DOI: 10.1145/3219104.3219111
  • (2018) Tangram: Colocating HPC Applications with Oversubscription. 2018 IEEE High Performance extreme Computing Conference (HPEC), pp. 1-7. DOI: 10.1109/HPEC.2018.8547644
  • (2017) Co-locating Graph Analytics and HPC Applications. 2017 IEEE International Conference on Cluster Computing (CLUSTER), pp. 659-660. DOI: 10.1109/CLUSTER.2017.111
  • (2017) A Slurm Simulator: Implementation and Parametric Analysis. High Performance Computing Systems: Performance Modeling, Benchmarking, and Simulation, pp. 197-217. DOI: 10.1007/978-3-319-72971-8_10
