DOI: 10.1145/1362622.1362636

Investigation of leading HPC I/O performance using a scientific-application derived benchmark

Published: 10 November 2007

Abstract

With the exponential growth of high-fidelity sensor and simulated data, the scientific community is increasingly reliant on ultrascale HPC resources to handle their data analysis requirements. However, to utilize such extreme computing power effectively, the I/O components must be designed in a balanced fashion, as any architectural bottleneck will quickly render the platform intolerably inefficient. To understand I/O performance of data-intensive applications in realistic computational settings, we develop a lightweight, portable benchmark called MADbench2, which is derived directly from a large-scale Cosmic Microwave Background (CMB) data analysis package. Our study represents one of the most comprehensive I/O analyses of modern parallel filesystems, examining a broad range of system architectures and configurations, including Lustre on the Cray XT3 and an Intel Itanium2 cluster; GPFS on IBM Power5 and AMD Opteron platforms; two BlueGene/L installations utilizing GPFS and PVFS2 filesystems; and CXFS on the SGI Altix3700. We present extensive synchronous I/O performance data comparing a number of key parameters including concurrency, POSIX- versus MPI-IO, and unique- versus shared-file accesses, using both the default environment and highly tuned I/O parameters. Finally, we explore the potential of asynchronous I/O and quantify the volume of computation required to hide a given volume of I/O. Overall our study quantifies the vast differences in performance and functionality of parallel filesystems across state-of-the-art platforms, while providing system designers and computational scientists a lightweight tool for conducting further analyses.
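To make the access patterns under comparison concrete, the following is a minimal, hypothetical C/MPI sketch of the three idioms the study exercises: unique-file POSIX writes (one file per process), shared-file collective MPI-IO writes, and a nonblocking MPI-IO write overlapped with computation. This is not MADbench2 source; the file names, transfer size, and stand-in compute loop are invented for illustration.

```c
/*
 * Hypothetical sketch of the I/O idioms compared in the paper.
 * NOT MADbench2 source: file names, sizes, and the compute loop
 * are invented for illustration.
 * Build (assumed): mpicc -O2 io_patterns.c -o io_patterns
 */
#include <mpi.h>
#include <stdio.h>
#include <stdlib.h>

#define NELEMS (1 << 20)   /* 2^20 doubles = 8 MiB per process (arbitrary) */

int main(int argc, char **argv)
{
    int rank;
    MPI_Init(&argc, &argv);
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);

    double *buf = malloc(NELEMS * sizeof(double));
    for (int i = 0; i < NELEMS; i++) buf[i] = (double)rank;

    /* Pattern 1: unique-file POSIX I/O -- each rank writes its own file. */
    char fname[64];
    snprintf(fname, sizeof fname, "data.%05d", rank);
    FILE *fp = fopen(fname, "wb");
    fwrite(buf, sizeof(double), NELEMS, fp);
    fclose(fp);

    /* Pattern 2: shared-file MPI-IO -- all ranks write disjoint regions
     * of one file with a collective call, letting the MPI-IO layer
     * aggregate requests. */
    MPI_File fh;
    MPI_Offset off = (MPI_Offset)rank * NELEMS * sizeof(double);
    MPI_File_open(MPI_COMM_WORLD, "data.shared",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_write_at_all(fh, off, buf, NELEMS, MPI_DOUBLE,
                          MPI_STATUS_IGNORE);
    MPI_File_close(&fh);

    /* Pattern 3: asynchronous I/O -- start a nonblocking write, do
     * "busy work" while it (ideally) proceeds, then wait. */
    MPI_Request req;
    MPI_File_open(MPI_COMM_WORLD, "data.async",
                  MPI_MODE_CREATE | MPI_MODE_WRONLY, MPI_INFO_NULL, &fh);
    MPI_File_iwrite_at(fh, off, buf, NELEMS, MPI_DOUBLE, &req);
    double acc = 0.0;
    for (int i = 0; i < NELEMS; i++) acc += buf[i] * 1.000001;
    MPI_Wait(&req, MPI_STATUS_IGNORE);  /* the write is only complete here */
    MPI_File_close(&fh);

    if (rank == 0) printf("done (acc = %g)\n", acc);
    free(buf);
    MPI_Finalize();
    return 0;
}
```

Whether the nonblocking write in the last pattern actually proceeds in the background depends on the MPI implementation's asynchronous progress; quantifying how much computation must be available to hide a given volume of I/O on each platform is precisely one of the questions the paper addresses.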



Published In

SC '07: Proceedings of the 2007 ACM/IEEE conference on Supercomputing
November 2007
723 pages
ISBN: 9781595937643
DOI: 10.1145/1362622

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 10 November 2007

Qualifiers

  • Research-article

Conference

SC '07

Acceptance Rates

SC '07 paper acceptance rate: 54 of 268 submissions (20%)
Overall acceptance rate: 1,516 of 6,373 submissions (24%)

Article Metrics

  • Downloads (last 12 months): 22
  • Downloads (last 6 weeks): 2
Reflects downloads up to 01 Jan 2025.

Cited By

  • (2024) PeakFS: An Ultra-High Performance Parallel File System via Computing-Network-Storage Co-Optimization for HPC Applications. IEEE Transactions on Parallel and Distributed Systems, 35(12):2578-2595. https://doi.org/10.1109/TPDS.2024.3485754
  • (2023) Accelerating In Situ Analysis using Non-volatile Memory. Proceedings of the SC '23 Workshops of The International Conference on High Performance Computing, Network, Storage, and Analysis, 995-1004. https://doi.org/10.1145/3624062.3624176
  • (2023) I/O Access Patterns in HPC Applications: A 360-Degree Survey. ACM Computing Surveys, 56(2):1-41. https://doi.org/10.1145/3611007
  • (2021) File System Semantics Requirements of HPC Applications. Proceedings of the 30th International Symposium on High-Performance Parallel and Distributed Computing, 19-30. https://doi.org/10.1145/3431379.3460637
  • (2021) Contour: A Process Variation Aware Wear-Leveling Mechanism for Inodes of Persistent Memory File Systems. IEEE Transactions on Computers, 70(7):1034-1045. https://doi.org/10.1109/TC.2020.3002537
  • (2021) Arbitration Policies for On-Demand User-Level I/O Forwarding on HPC Platforms. 2021 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 577-586. https://doi.org/10.1109/IPDPS49936.2021.00066
  • (2020) The ESIF-HPC-2 benchmark suite. Proceedings of the Workshop on Benchmarking in the Datacenter, 1-8. https://doi.org/10.1145/3380868.3398200
  • (2020) Pacon: Improving Scalability and Efficiency of Metadata Service through Partial Consistency. 2020 IEEE International Parallel and Distributed Processing Symposium (IPDPS), 986-996. https://doi.org/10.1109/IPDPS47924.2020.00105
  • (2020) Adaptive request scheduling for the I/O forwarding layer using reinforcement learning. Future Generation Computer Systems. https://doi.org/10.1016/j.future.2020.05.005
  • (2019) Understanding Parallel I/O Performance Trends Under Various HPC Configurations. Proceedings of the ACM Workshop on Systems and Network Telemetry and Analytics, 29-36. https://doi.org/10.1145/3322798.3329258
