quFiles: the right file at the right time
A quFile is a unifying abstraction that simplifies data management by encapsulating different physical representations of the same logical data. Similar to a quBit (quantum bit), the particular representation of the logical data displayed by a quFile is ...
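The representation a quFile exposes depends on who is looking, much as a quBit's value depends on observation. A minimal sketch of that idea, assuming a hypothetical `QuFile` container whose names and context keys are illustrative rather than the paper's API:

```python
# Illustrative quFile-like container: one logical file, several physical
# representations, and a policy that picks the one matching the caller.

class QuFile:
    def __init__(self, name):
        self.name = name
        self.representations = {}   # label -> (path, context predicate)

    def add_representation(self, label, path, predicate):
        """Register one physical encoding and the contexts it serves."""
        self.representations[label] = (path, predicate)

    def resolve(self, context):
        """Pick the first representation whose predicate accepts the
        caller's context (device class, bandwidth, battery, ...)."""
        for path, predicate in self.representations.values():
            if predicate(context):
                return path
        raise FileNotFoundError(f"no representation of {self.name} fits")

# Example: one logical movie, two physical encodings.
movie = QuFile("talk.mpg")
movie.add_representation("full", "/store/talk_full.mpg",
                         lambda ctx: ctx["bandwidth_mbps"] >= 10)
movie.add_representation("mobile", "/store/talk_mobile.3gp",
                         lambda ctx: ctx["bandwidth_mbps"] < 10)
print(movie.resolve({"bandwidth_mbps": 2}))   # -> /store/talk_mobile.3gp
```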
Tracking back references in a write-anywhere file system
Many file systems reorganize data on disk, for example to defragment storage, shrink volumes, or migrate data between different classes of storage. Advanced file system features such as snapshots, writable clones, and deduplication make these tasks ...
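A back-reference index answers "which files point at this physical block?", which is exactly what defragmentation, shrinking, and migration need once snapshots and clones share blocks. A toy, in-memory version of the bookkeeping (the paper maintains an efficient on-disk index; names here are illustrative):

```python
# Toy back-reference map: physical block -> set of (file, offset) owners,
# so the file system can find everyone who must be remapped when it
# relocates a block. In-memory only; the real structure lives on disk.
from collections import defaultdict

backrefs = defaultdict(set)

def write_block(file_id, offset, phys):
    backrefs[phys].add((file_id, offset))

def free_block(file_id, offset, phys):
    backrefs[phys].discard((file_id, offset))
    if not backrefs[phys]:
        del backrefs[phys]

def owners(phys):
    """Who must be updated if block `phys` is relocated?"""
    return backrefs.get(phys, set())

write_block("inode42", 0, phys=1001)
write_block("clone7", 0, phys=1001)   # a writable clone shares the block
print(owners(1001))                   # both owners need their pointers fixed
```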
End-to-end data integrity for file systems: a ZFS case study
We present a study of the effects of disk and memory corruption on file system data integrity. Our analysis focuses on Sun's ZFS, a modern commercial offering with numerous reliability mechanisms. Through careful and thorough fault injection, we show ...
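ZFS detects disk corruption by keeping each block's checksum in its parent block pointer and verifying it on every read. A simplified sketch of that write/read path (SHA-256 stands in for ZFS's selectable checksum functions, and the in-memory `disk` dict is a stand-in for the device):

```python
import hashlib

disk = {}   # address -> bytes (stand-in for the storage device)

def zfs_write(addr, data):
    """Write a block; return the parent's block pointer (addr, checksum)."""
    disk[addr] = data
    return (addr, hashlib.sha256(data).digest())

def zfs_read(blkptr):
    """Read via a block pointer, verifying the checksum end to end."""
    addr, expected = blkptr
    data = disk[addr]
    if hashlib.sha256(data).digest() != expected:
        raise IOError(f"checksum mismatch at block {addr}")
    return data

ptr = zfs_write(7, b"important data")
disk[7] = b"important dataX"         # simulate silent corruption on disk
try:
    zfs_read(ptr)
except IOError as e:
    print(e)                         # corruption is detected, not returned
```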
Black-box problem diagnosis in parallel file systems
We focus on automatically diagnosing different performance problems in parallel file systems by identifying, gathering and analyzing OS-level, black-box performance metrics on every node in the cluster. Our peer-comparison diagnosis approach compares ...
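A rough sketch of peer comparison, assuming one scalar metric per node (the paper compares distributions of black-box metrics; the median-absolute-deviation test and node names below are illustrative):

```python
# Peer comparison: I/O servers in a parallel file system see statistically
# similar load, so a node whose metric diverges from its peers is a fault
# suspect. This sketch flags outliers by median absolute deviation.
from statistics import median

def suspect_nodes(samples, threshold=3.0):
    """samples: {node: metric}, e.g. average disk await time per server."""
    med = median(samples.values())
    mad = median(abs(v - med) for v in samples.values()) or 1e-9
    return [n for n, v in samples.items() if abs(v - med) / mad > threshold]

# Healthy servers behave alike; oss4's latency is the odd one out.
await_ms = {"oss1": 4.1, "oss2": 3.9, "oss3": 4.3, "oss4": 48.0}
print(suspect_nodes(await_ms))       # -> ['oss4']
```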
A clean-slate look at disk scrubbing
A number of techniques have been proposed to reduce the risk of data loss in hard drives, from redundant disks (e.g., RAID systems) to error coding within individual drives. Disk scrubbing is a background process that reads disks during idle periods to ...
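A skeleton of such a scrubber, hedged as illustrative: real scrubbers throttle against foreground traffic and choose scan orders carefully, but the core loop of reading regions during idle time and checking stored checksums looks roughly like this:

```python
import hashlib, io, time

def scrub(device, checksums, region_size, is_idle):
    """Read every region during idle periods; verify its stored checksum."""
    for i, expected in enumerate(checksums):
        while not is_idle():
            time.sleep(0.1)                  # yield to foreground I/O
        device.seek(i * region_size)
        data = device.read(region_size)
        if hashlib.sha256(data).digest() != expected:
            repair(i)                        # e.g., rebuild from RAID parity

def repair(region):
    print(f"region {region}: latent error found, rebuilding from redundancy")

dev = io.BytesIO(b"A" * 4096 + b"B" * 4096)  # two regions, second corrupted
sums = [hashlib.sha256(b"A" * 4096).digest(),
        hashlib.sha256(b"C" * 4096).digest()]
scrub(dev, sums, 4096, is_idle=lambda: True)
```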
Understanding latent sector errors and how to protect against them
Latent sector errors (LSEs) refer to the situation where particular sectors on a drive become inaccessible. LSEs are a critical factor in data reliability, since a single LSE can lead to data loss when encountered during RAID reconstruction after a disk ...
DFS: a file system for virtualized flash storage
This paper presents the design, implementation and evaluation of Direct File System (DFS) for virtualized flash storage. Instead of using traditional layers of abstraction, our layers of abstraction are designed for directly accessing flash memory ...
Extending SSD lifetimes with disk-based write caches
We present Griffin, a hybrid storage device that uses a hard disk drive (HDD) as a write cache for a Solid State Device (SSD). Griffin is motivated by two observations: First, HDDs can match the sequential write bandwidth of mid-range SSDs. Second, both ...
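A conceptual sketch of the hybrid design under those observations: absorb writes in an append-only HDD log, then migrate only the latest version of each block to the SSD (class and method names are illustrative, not Griffin's implementation):

```python
class HybridStore:
    def __init__(self):
        self.hdd_log = []    # append-only (block, data) records on the HDD
        self.ssd = {}        # block -> data on the SSD

    def write(self, block, data):
        self.hdd_log.append((block, data))   # sequential HDD append: fast,
                                             # and costs no flash erasures
    def read(self, block):
        for b, d in reversed(self.hdd_log):  # newest copy may be in the log
            if b == block:
                return d
        return self.ssd[block]

    def migrate(self):
        self.ssd.update(dict(self.hdd_log))  # later appends win, so
        self.hdd_log.clear()                 # overwrites coalesce

s = HybridStore()
for _ in range(100):
    s.write(5, b"hot block")   # 100 overwrites absorbed by the HDD log
s.migrate()                    # a single write reaches the SSD
```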
Write endurance in flash drives: measurements and analysis
We examine the write endurance of USB flash drives using a range of approaches: chip-level measurements, reverse engineering, timing analysis, whole-device endurance testing, and simulation. The focus of our investigation is not only measured endurance, ...
Accelerating parallel analysis of scientific simulation data via Zazen
As a new generation of parallel supercomputers enables researchers to conduct scientific simulations of unprecedented scale and resolution, terabyte-scale simulation output has become increasingly commonplace. Analysis of such massive data sets is ...
Efficient object storage journaling in a distributed parallel file system
Journaling is a widely used technique to increase file system robustness against metadata and/or data corruptions. While the overhead of journaling can be masked by the page cache for small-scale, local file systems, we found that Lustre's use of ...
Panache: a parallel file system cache for global file access
Cloud computing promises large-scale and seamless access to vast quantities of data across the globe. Applications will demand the reliability, consistency, and performance of a traditional cluster file system regardless of the physical distance between ...
BASIL: automated IO load balancing across storage devices
Live migration of virtual hard disks between storage arrays has long been possible. However, there is a dearth of online tools to perform automated virtual disk placement and IO load balancing across multiple storage arrays. This problem is quite ...
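A greedy rebalancer in the spirit of this problem (BASIL builds device models from measured latency; the normalized load numbers and the simple move loop below are illustrative stand-ins):

```python
def balance(placement, load, tolerance=0.1):
    """placement: {array: [vdisk, ...]}; load: {vdisk: normalized load}."""
    def total(a):
        return sum(load[v] for v in placement[a])
    while True:
        hot = max(placement, key=total)
        cold = min(placement, key=total)
        gap = total(hot) - total(cold)
        if gap <= tolerance:
            return placement
        movable = [v for v in placement[hot] if load[v] < gap]
        if not movable:                      # no move can shrink the gap
            return placement
        v = max(movable, key=load.__getitem__)
        placement[hot].remove(v)             # migrate the biggest helpful
        placement[cold].append(v)            # vdisk to the cold array

p = {"arrayA": ["db", "mail", "web"], "arrayB": ["logs"]}
w = {"db": 0.6, "mail": 0.3, "web": 0.2, "logs": 0.1}
print(balance(p, w))   # -> db alone on arrayB, the rest share arrayA
```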
Discovery of application workloads from network file traces
An understanding of application I/O access patterns is useful in several situations. First, gaining insight into what applications are doing with their data at a semantic level helps in designing efficient storage systems. Second, it helps create ...
Provenance for the cloud
The cloud is poised to become the next computing environment for both data storage and computation due to its pay-as-you-go and provision-as-you-go models. Cloud storage is already being used to back up desktop user data, host shared scientific data, ...
I/O deduplication: utilizing content similarity to improve I/O performance
Duplication of data in storage systems is becoming increasingly common. We introduce I/O Deduplication, a storage optimization that utilizes content similarity for improving I/O performance by eliminating I/O operations and reducing the mechanical ...
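One way to exploit content similarity is a read cache indexed by content hash instead of sector number, so duplicate sectors share a cache slot and duplicate reads never reach the disk. A minimal sketch, with the sector-to-hash map learned from writes (names are illustrative, not the paper's implementation):

```python
import hashlib

sector_to_hash = {}   # sector -> content hash, learned from writes
cache = {}            # content hash -> data (content-addressed cache)

def write_sector(disk, sector, data):
    disk[sector] = data
    sector_to_hash[sector] = hashlib.sha1(data).digest()

def read_sector(disk, sector):
    h = sector_to_hash.get(sector)
    if h in cache:
        return cache[h]          # duplicate content: no disk I/O needed
    data = disk[sector]          # miss: fetch from the device
    cache[hashlib.sha1(data).digest()] = data
    return data

disk = {}
write_sector(disk, 0, b"same payload")
write_sector(disk, 1, b"same payload")   # duplicate of sector 0
read_sector(disk, 0)                     # goes to disk, fills the cache
read_sector(disk, 1)                     # served from cache: I/O eliminated
```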
HydraFS: a high-throughput file system for the HYDRAstor content-addressable storage system
Cristian Ungureanu, Benjamin Atkin, Akshat Aranya, Salil Gokhale, Stephen Rago, Grzegorz Całkowski, Cezary Dubnicki, Aniruddha Bohra
A content-addressable storage (CAS) system is a valuable tool for building storage solutions, providing efficiency by automatically detecting and eliminating duplicate blocks; it can also be capable of high throughput, at least for streaming access. ...
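The core CAS contract in a few lines: a block's address is the hash of its contents, so duplicate writes are detected and eliminated automatically (a minimal sketch, not HYDRAstor's distributed implementation):

```python
import hashlib

class CAS:
    def __init__(self):
        self.blocks = {}

    def put(self, data):
        """Store a block; its content hash is its address."""
        addr = hashlib.sha256(data).hexdigest()
        self.blocks.setdefault(addr, data)   # duplicate -> no new storage
        return addr

    def get(self, addr):
        return self.blocks[addr]

store = CAS()
a1 = store.put(b"backup block")
a2 = store.put(b"backup block")    # deduplicated: same address, one copy
assert a1 == a2 and len(store.blocks) == 1
```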
Bimodal content defined chunking for backup streams
Data deduplication has become a popular technology for reducing the amount of storage space necessary for backup and archival data. Content defined chunking (CDC) techniques are well established methods of separating a data stream into variable-size ...
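A baseline CDC chunker with a polynomial rolling hash, for reference; the bimodal variant described in the paper runs chunking at two expected sizes, preferring large chunks in regions that deduplicate well and smaller chunks near changes. All constants below are illustrative:

```python
import os

WINDOW, PRIME, MOD = 48, 31, 1 << 61
MIN_SIZE, MAX_SIZE, DIVISOR = 2048, 65536, 8192   # ~8 KiB average chunks

def chunks(data):
    """Split bytes at positions where a rolling hash hits a magic value."""
    start, h, out = 0, 0, []
    top = pow(PRIME, WINDOW - 1, MOD)       # weight of the departing byte
    for i, byte in enumerate(data):
        h = (h * PRIME + byte) % MOD
        if i - start >= WINDOW:             # keep the window WINDOW bytes wide
            h = (h - data[i - WINDOW] * top * PRIME) % MOD
        size = i - start + 1
        if (size >= MIN_SIZE and h % DIVISOR == DIVISOR - 1) or size >= MAX_SIZE:
            out.append(data[start:i + 1])   # content-defined boundary
            start, h = i + 1, 0
    if start < len(data):
        out.append(data[start:])
    return out

blob = os.urandom(200_000)
assert b"".join(chunks(blob)) == blob       # chunking is lossless
```

Because boundaries depend only on local content, inserting bytes early in a stream shifts at most a few chunks; the rest re-align and deduplicate against the previous backup.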
Evaluating performance and energy in file system server workloads
Recently, power has emerged as a critical factor in designing components of storage systems, especially for power-hungry data centers. While there is some research into power-aware storage stack components, there are no systematic studies evaluating ...
SRCMap: energy proportional storage using dynamic consolidation
We investigate the problem of creating an energy proportional storage system through power-aware dynamic storage consolidation. Our proposal, Sample-Replicate-Consolidate Mapping (SRCMap), is a storage virtualization layer optimization that enables ...
Membrane: operating system support for restartable file systems
Swaminathan Sundararaman, Sriram Subramanian, Abhishek Rajimwale, Andrea C. Arpaci-Dusseau, Remzi H. Arpaci-Dusseau, Michael M. Swift
We introduce Membrane, a set of changes to the operating system to support restartable file systems. Membrane allows an operating system to tolerate a broad class of file system failures and does so while remaining transparent to running applications; ...
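A user-level toy showing the shape of that recovery loop, assuming hypothetical `snapshot`/`restore` hooks (Membrane itself works inside the kernel, with lightweight checkpoints and logging of in-flight operations):

```python
class RestartableFS:
    """Wrap a file system object; mask its crashes from callers."""
    def __init__(self, fs_factory):
        self.fs_factory = fs_factory
        self.fs = fs_factory()
        self.checkpoint = self.fs.snapshot()  # hypothetical checkpoint hook
        self.op_log = []                      # operations since checkpoint

    def apply(self, op, *args):
        self.op_log.append((op, args))
        try:
            return getattr(self.fs, op)(*args)
        except Exception:                         # fault detected
            self.fs = self.fs_factory()           # restart the file system
            self.fs.restore(self.checkpoint)      # roll back to checkpoint
            for o, a in self.op_log[:-1]:         # replay completed ops
                getattr(self.fs, o)(*a)
            return getattr(self.fs, op)(*args)    # retry the one that failed
```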