DOI:10.1145/1996130.1996138

A cost-intelligent application-specific data layout scheme for parallel file systems

Published: 08 June 2011

Abstract

I/O data access is a recognized performance bottleneck of high-end computing. Several commercial and research parallel file systems have been developed in recent years to ease this bottleneck. These advanced file systems perform well on some applications but may not perform well on others, and they have not reached their full potential in mitigating the I/O-wall problem. Data access is application dependent. Based on the application-specific optimization principle, in this study we propose a cost-intelligent data access strategy to improve the performance of parallel file systems. We first present a novel model to estimate the data access cost of different data layout policies. Next, we extend the cost model to calculate the overall I/O cost of any given application and choose an appropriate layout policy for that application. A complex application may consist of different data access patterns, and averaging over those patterns may not be the best solution for applications that do not have a dominant pattern. For such applications we further propose a hybrid data replication strategy, so that a file can have replicas with different layout policies for the best performance. Theoretical analysis and experimental testing have been conducted to verify the proposed cost-intelligent layout approach. Analytical and experimental results show that the proposed cost model is effective and that the application-specific data layout approach achieved up to 74% performance improvement for data-intensive applications.
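
The selection step described in the abstract can be illustrated with a minimal sketch. It is not the paper's cost model, which the abstract does not spell out: the two layout names, the startup and bandwidth constants, and the `AccessPattern` fields below are all hypothetical placeholders. The sketch only shows the shape of the decision: sum a per-pattern cost estimate to pick one layout for the whole application, or pick the cheapest layout per pattern when replicas with different layouts are allowed.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class AccessPattern:
    """One access pattern of an application (fields are illustrative)."""
    name: str
    request_bytes: int   # size of each request
    num_requests: int    # how many such requests the application issues


# A cost function maps an access pattern to an estimated time in seconds.
CostFn = Callable[[AccessPattern], float]


def make_cost_fn(startup_s: float, bandwidth_bps: float) -> CostFn:
    """Toy cost estimate: per-request startup time plus transfer time.

    Assumption: this stands in for a real per-layout cost model.
    """
    def cost(p: AccessPattern) -> float:
        return p.num_requests * (startup_s + p.request_bytes / bandwidth_bps)
    return cost


def total_cost(patterns: List[AccessPattern], cost_fn: CostFn) -> float:
    """Overall I/O cost of an application under one layout."""
    return sum(cost_fn(p) for p in patterns)


def choose_single_layout(patterns: List[AccessPattern],
                         layouts: Dict[str, CostFn]) -> str:
    """Pick the one layout with the lowest overall estimated cost."""
    return min(layouts, key=lambda name: total_cost(patterns, layouts[name]))


def choose_hybrid_layouts(patterns: List[AccessPattern],
                          layouts: Dict[str, CostFn]) -> Dict[str, str]:
    """Pick the cheapest layout per pattern, as if each pattern could be
    served from a replica stored with that layout. Replica storage and
    consistency costs are ignored in this sketch."""
    return {p.name: min(layouts, key=lambda name: layouts[name](p))
            for p in patterns}


# Hypothetical layouts: one with low startup cost but low bandwidth,
# one with high startup cost but high bandwidth.
layouts: Dict[str, CostFn] = {
    "layout_a": make_cost_fn(startup_s=0.001, bandwidth_bps=100e6),
    "layout_b": make_cost_fn(startup_s=0.010, bandwidth_bps=400e6),
}

patterns = [
    AccessPattern("small_random", request_bytes=4096, num_requests=1000),
    AccessPattern("large_sequential", request_bytes=64 * 2**20, num_requests=10),
]

single = choose_single_layout(patterns, layouts)
hybrid = choose_hybrid_layouts(patterns, layouts)
hybrid_cost = sum(layouts[hybrid[p.name]](p) for p in patterns)

print("single layout:", single, total_cost(patterns, layouts[single]))
print("hybrid layouts:", hybrid, hybrid_cost)
```

With these made-up numbers no single layout suits both patterns, so the per-pattern choice yields a lower estimated cost than the best single layout, which is the situation the hybrid replication strategy targets.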

Published In

HPDC '11: Proceedings of the 20th international symposium on High performance distributed computing
June 2011
296 pages
ISBN:9781450305525
DOI:10.1145/1996130

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. data layout
  2. data-access performance modeling
  3. data-intensive
  4. parallel file systems

Qualifiers

  • Research-article

Conference

HPDC '11

Acceptance Rates

Overall Acceptance Rate 166 of 966 submissions, 17%
