
Rethinking Node Allocation Strategy for Data-intensive Applications in Consideration of Spatially Bursty I/O

Authors:

Yusheng Liu

ICS '18: Proceedings of the 2018 International Conference on Supercomputing

Pages 12 - 21

https://doi.org/10.1145/3205289.3205305

Published: 12 June 2018

Abstract

By default, job schedulers in HPC systems allocate adjacent compute nodes to a job to lower communication overhead. However, this strategy is no longer suitable for data-intensive jobs running on systems with an I/O forwarding layer, where each I/O node performs I/O on behalf of a subset of nearby compute nodes. Under the default node allocation strategy, a job's nodes are located close to each other, so the job uses only a limited number of I/O nodes. Because the I/O activity of jobs is bursty, only a minority of jobs in the system are busy processing I/O at any moment. Consequently, the bursty I/O traffic in the system is also concentrated in space, which makes the load on I/O nodes highly unbalanced. In this paper, we use the job logs and I/O traces collected from Tianhe-1A to quantitatively analyze the two causes of spatially bursty I/O: uneven I/O traffic among a job's processes and uneven distribution of a job's nodes. Based on this analysis, we propose a node allocation strategy that takes into account the different amounts of I/O traffic generated by processes, so that the traffic is spread more evenly across more I/O nodes. Our evaluations on Tianhe-1A with synthetic benchmarks and realistic applications show that the proposed strategy can further exploit the potential of the I/O forwarding layer and improve I/O performance.
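The effect described in the abstract can be illustrated with a small sketch. This is not the strategy proposed in the paper, whose details are not given here; it is a simple greedy heuristic chosen for illustration. The forwarding ratio `NODES_PER_ION`, the mapping `io_node_of`, and all function names are hypothetical assumptions. The sketch compares the default contiguous placement with a placement that assigns the heaviest processes first to whichever I/O node currently carries the least traffic.

```python
from __future__ import annotations

from collections import defaultdict

NODES_PER_ION = 4  # hypothetical: compute nodes served by one I/O node


def io_node_of(compute_node: int) -> int:
    """Assumed static forwarding map: adjacent compute nodes share an I/O node."""
    return compute_node // NODES_PER_ION


def io_node_load(placement: dict[int, float]) -> dict[int, float]:
    """Sum the traffic each I/O node forwards (placement: node id -> traffic)."""
    load: dict[int, float] = defaultdict(float)
    for node, traffic in placement.items():
        load[io_node_of(node)] += traffic
    return dict(load)


def contiguous_allocation(traffic: list[float], free_nodes: list[int]) -> dict[int, float]:
    """Default strategy: place processes on adjacent nodes in rank order."""
    if len(traffic) > len(free_nodes):
        raise ValueError("not enough free nodes")
    nodes = sorted(free_nodes)[: len(traffic)]
    return dict(zip(nodes, traffic))


def traffic_aware_allocation(traffic: list[float], free_nodes: list[int]) -> dict[int, float]:
    """Greedy sketch: heaviest process first, onto the least-loaded I/O node."""
    if len(traffic) > len(free_nodes):
        raise ValueError("not enough free nodes")
    free_by_ion: dict[int, list[int]] = defaultdict(list)
    for node in sorted(free_nodes):
        free_by_ion[io_node_of(node)].append(node)
    load = {ion: 0.0 for ion in free_by_ion}
    placement: dict[int, float] = {}
    for t in sorted(traffic, reverse=True):
        # Only I/O nodes that still have a free compute node are candidates.
        candidates = [ion for ion, nodes in free_by_ion.items() if nodes]
        ion = min(candidates, key=lambda i: (load[i], i))
        placement[free_by_ion[ion].pop(0)] = t
        load[ion] += t
    return placement


# Four processes with uneven I/O traffic, 16 free compute nodes (4 I/O nodes).
traffic = [8.0, 8.0, 1.0, 1.0]
free = list(range(16))
default_load = io_node_load(contiguous_allocation(traffic, free))
aware_load = io_node_load(traffic_aware_allocation(traffic, free))
print(max(default_load.values()), max(aware_load.values()))  # 18.0 8.0
```

In this toy setting, the contiguous placement sends all 18 units of traffic through a single I/O node, while the greedy placement uses four I/O nodes with a peak load of 8. The sketch deliberately ignores communication locality, which is the reason the default strategy keeps a job's nodes adjacent, so any real allocator would have to weigh that cost as well.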

Index Terms

1. Computer systems organization
  1. Architectures

Published In

ICS '18: Proceedings of the 2018 International Conference on Supercomputing

June 2018, 407 pages

ISBN: 9781450357838

DOI: 10.1145/3205289

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

SIGARCH: ACM Special Interest Group on Computer Architecture

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Funding Sources

National Key Research and Development Program of China
National Natural Science Foundation of China
China National Special Fund for Public Welfare

Conference

ICS '18

Sponsor:

SIGARCH

ICS '18: 2018 International Conference on Supercomputing

June 12 - 15, 2018

Beijing, China

Acceptance Rates

Overall Acceptance Rate 629 of 2,180 submissions, 29%

Cited By

W. Yang and J. Yu. 2024. Trade-off topology design for hierarchical network based on job characteristics. CCF Transactions on High Performance Computing. Online publication date: 21 May 2024.
https://doi.org/10.1007/s42514-024-00193-z

J. Bez, S. Byna, and S. Ibrahim. 2023. I/O Access Patterns in HPC Applications: A 360-Degree Survey. ACM Computing Surveys 56, 2 (2023), 1--41. Online publication date: 15 September 2023.
https://dl.acm.org/doi/10.1145/3611007

X. He, W. Xiao, X. Deng, Q. Chen, B. Yang, and Z. Chen. 2023. DFBuffer: High-performance data forwarding software optimized for single-process I/O scenarios. In 2022 IEEE 28th International Conference on Parallel and Distributed Systems (ICPADS). 522--529. Online publication date: January 2023.
https://doi.org/10.1109/ICPADS56603.2022.00074

G. Xian, Y. Tang, W. Yang, X. Li, X. Zhang, and J. Yu. 2021. Visual Analysis of the High-performance Computing Jobs Based on the Comprehensive Load Scoring Algorithm. In 2021 7th International Conference on Computer and Communications (ICCC). 1436--1443. Online publication date: 10 December 2021.
https://doi.org/10.1109/ICCC54389.2021.9674694

W. Yang, Z. Yang, Y. Zhou, F. Wang, C. Chen, and Y. Wang. 2019. A Comprehensive Analysis of User Job Data on a Petascale Supercomputer Dedicated to CFD. In 2019 IEEE 5th International Conference on Computer and Communications (ICCC). 86--91. Online publication date: December 2019.
https://doi.org/10.1109/ICCC47050.2019.9064094
