Current APIs for multiprocessor multi-disk file systems make it hard to develop out-of-core algorithms that choreograph parallel data accesses. Consequently, the efficiency these algorithms promise is hard to achieve in practice. We address this deficiency by specifying an API that includes data-access primitives for data choreography. With our API, the programmer can access specific blocks from each disk in a single operation, thereby fully utilizing the parallelism of the underlying storage system. Our API supports the development of libraries of commonly used higher-level routines such as matrix-matrix addition, matrix-matrix multiplication, and BMMC (bit-matrix-multiply/complement) permutations. We illustrate the API's ease of use by implementing these three high-level routines with it.
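To make the idea of "specific blocks from each disk in a single operation" concrete, here is a minimal Python sketch. It is not the paper's API: the names `SimulatedDisk`, `parallel_read`, `parallel_write`, and `add_stripes`, the in-memory disks, and the block size are all assumptions chosen for illustration. It only shows the general shape such a primitive might take, with one call naming one block per disk, and how an elementwise addition could be layered on top of it.

```python
from concurrent.futures import ThreadPoolExecutor

BLOCK_SIZE = 4  # records per block; an illustrative value, not from the paper


class SimulatedDisk:
    """Hypothetical in-memory stand-in for one disk holding fixed-size blocks."""

    def __init__(self, num_blocks):
        self.blocks = [[0] * BLOCK_SIZE for _ in range(num_blocks)]

    def read_block(self, index):
        return list(self.blocks[index])

    def write_block(self, index, data):
        if len(data) != BLOCK_SIZE:
            raise ValueError("a block must hold exactly BLOCK_SIZE records")
        self.blocks[index] = list(data)


def parallel_read(disks, block_indices):
    """Read block_indices[d] from disks[d] for every disk d in one call."""
    if len(block_indices) != len(disks):
        raise ValueError("need exactly one block index per disk")
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        return list(pool.map(lambda d, i: d.read_block(i), disks, block_indices))


def parallel_write(disks, block_indices, blocks):
    """Write blocks[d] to block_indices[d] on disks[d] for every disk d."""
    if not (len(block_indices) == len(blocks) == len(disks)):
        raise ValueError("need exactly one index and one block per disk")
    with ThreadPoolExecutor(max_workers=len(disks)) as pool:
        # list() forces completion and surfaces any exception from a worker
        list(pool.map(lambda d, i, b: d.write_block(i, b),
                      disks, block_indices, blocks))


def add_stripes(disks, a_index, b_index, c_index):
    """Elementwise C = A + B, where each operand is one block on every disk."""
    n = len(disks)
    a = parallel_read(disks, [a_index] * n)
    b = parallel_read(disks, [b_index] * n)
    c = [[x + y for x, y in zip(ab, bb)] for ab, bb in zip(a, b)]
    parallel_write(disks, [c_index] * n, c)


# Usage: three disks, three blocks each.
disks = [SimulatedDisk(num_blocks=3) for _ in range(3)]
parallel_write(disks, [0, 0, 0], [[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
parallel_write(disks, [1, 1, 1], [[10] * BLOCK_SIZE for _ in range(3)])
add_stripes(disks, 0, 1, 2)
result = parallel_read(disks, [2, 2, 2])   # the same block number on each disk
mixed = parallel_read(disks, [0, 1, 2])    # a different block number per disk
```

The final call is the case of interest: each disk is asked for a different block number, yet the request is still a single operation that keeps every disk busy. A real implementation would issue asynchronous I/O against physical devices rather than threads over Python lists.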