Abstract
The MPI-IO interface is a critical component in I/O software stacks for high-performance computing, and many successful optimizations have been incorporated into implementations to help provide high-performance I/O for a variety of access patterns. However, in spite of these optimizations, there is still a large performance gap between "easy" access patterns and more difficult ones, particularly when applications are unable to describe I/O using collective calls.
In this paper we present LogFS, a component that implements log-based storage for applications using the MPI-IO interface. We first discuss how this approach allows us to exploit the temporal freedom present in the MPI-IO consistency semantics, allowing optimization of a variety of access patterns that are not well served by existing approaches. We then describe how this component is integrated into the ROMIO MPI-IO implementation as a stackable layer, allowing LogFS to be used on any file system supported by ROMIO. Finally, we show performance results comparing the LogFS approach to current practice using a variety of benchmarks.
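The general idea of log-based storage can be illustrated with a small sketch. This is a toy, single-process model and not the LogFS implementation: the class `LogBackedFile` and its methods are hypothetical names introduced here for illustration. It assumes only what the abstract states, namely that writes may be recorded in a log and applied to the real file later, which is permissible because, in the default non-atomic mode, MPI-IO only guarantees visibility of writes to other processes at synchronization points such as `MPI_File_sync` or file close.

```python
class LogBackedFile:
    """Toy model (hypothetical): writes go to a log and are applied at sync."""

    def __init__(self):
        self.data = bytearray()  # stands in for the contents of the real file
        self.log = []            # (offset, payload) records in arrival order

    def write(self, offset, payload):
        # Appending to the log is sequential, however scattered the offsets are.
        self.log.append((offset, bytes(payload)))

    def sync(self):
        # Replay records in arrival order so that later writes win on overlap.
        for offset, payload in self.log:
            end = offset + len(payload)
            if end > len(self.data):
                # Grow the file with zero bytes to cover the written region.
                self.data.extend(b"\0" * (end - len(self.data)))
            self.data[offset:end] = payload
        self.log.clear()


f = LogBackedFile()
f.write(8, b"BBBB")   # out-of-order, non-contiguous writes
f.write(0, b"AAAA")
f.write(2, b"xx")     # overlaps the previous write
f.sync()              # only now does the "file" reflect the writes
print(bytes(f.data))
```

In this sketch the file is untouched until `sync()` is called, which is the kind of deferral that relaxed consistency semantics make possible; a real implementation would additionally have to handle reads of logged data, multiple processes, and persistent log storage.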
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Kimpe, D., Ross, R., Vandewalle, S., Poedts, S. (2007). Transparent Log-Based Data Storage in MPI-IO Applications. In: Cappello, F., Herault, T., Dongarra, J. (eds) Recent Advances in Parallel Virtual Machine and Message Passing Interface. EuroPVM/MPI 2007. Lecture Notes in Computer Science, vol 4757. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-75416-9_34
DOI: https://doi.org/10.1007/978-3-540-75416-9_34
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-75415-2
Online ISBN: 978-3-540-75416-9
eBook Packages: Computer Science (R0)