- research-article, November 2024
MoEsaic: Shared Mixture of Experts
SoCC '24: Proceedings of the 2024 ACM Symposium on Cloud Computing, Pages 434–442, https://doi.org/10.1145/3698038.3698521
Mixture of Expert (MoE) models consist of several experts, each specializing in a specific task. During inference, a subset of the experts is invoked based on their relevance to the request. MoE's modular architecture lets users compose their model from ...
- research-article, July 2024
A Selective Preprocessing Offloading Framework for Reducing Data Traffic in DL Training
HotStorage '24: Proceedings of the 16th ACM Workshop on Hot Topics in Storage and File Systems, Pages 63–70, https://doi.org/10.1145/3655038.3665947
Deep learning (DL) training is data-intensive and often bottlenecked by fetching data from remote storage. Recognizing that many samples' sizes diminish during data preprocessing, we explore selectively offloading preprocessing to remote storage to ...
- research-article, November 2022
A case for using cache line deltas for high frequency VM snapshotting
SoCC '22: Proceedings of the 13th Symposium on Cloud Computing, Pages 526–539, https://doi.org/10.1145/3542929.3563481
Active-standby schemes for Virtual Machine (VM) high availability require periodic synchronization of memory and CPU state. The most common approach to synchronization is to use page tables and software to identify "dirty" memory pages at the source and ...
- research-article, October 2018
Fail-Slow at Scale: Evidence of Hardware Performance Faults in Large Production Systems
- Haryadi S. Gunawi,
- Riza O. Suminto,
- Russell Sears,
- Casey Golliher,
- Swaminathan Sundararaman,
- Xing Lin,
- Tim Emami,
- Weiguang Sheng,
- Nematollah Bidokhti,
- Caitie McCaffrey,
- Deepthi Srinivasan,
- Biswaranjan Panda,
- Andrew Baptist,
- Gary Grider,
- Parks M. Fields,
- Kevin Harms,
- Robert B. Ross,
- Andree Jacobson,
- Robert Ricci,
- Kirk Webb,
- Peter Alvaro,
- H. Birali Runesha,
- Mingzhe Hao,
- Huaicheng Li
ACM Transactions on Storage (TOS), Volume 14, Issue 3, Article No.: 23, Pages 1–26, https://doi.org/10.1145/3242086
Fail-slow hardware is an under-studied failure mode. We present a study of 114 reports of fail-slow hardware incidents, collected from large-scale cluster deployments in 14 institutions. We show that all hardware types such as disk, SSD, CPU, memory, ...
- Article, July 2018
Model governance: reducing the anarchy of production ML
- Vinay Sridhar,
- Sriram Subramanian,
- Dulcardo Arteaga,
- Swaminathan Sundararaman,
- Drew Roselli,
- Nisha Talagala
USENIX ATC '18: Proceedings of the 2018 USENIX Conference on Usenix Annual Technical Conference, Pages 351–357
As the influence of machine learning grows over decisions in businesses and human life, so grows the need for Model Governance. In this paper, we motivate the need for, define the problem of, and propose a solution for Model Governance in production ML. ...
- Article, February 2018
The case of FEMU: cheap, accurate, scalable and extensible flash emulator
- Huaicheng Li,
- Mingzhe Hao,
- Michael Hao Tong,
- Swaminathan Sundararaman,
- Matias Bjørling,
- Haryadi S. Gunawi
FEMU is a software (QEMU-based) flash emulator for fostering future full-stack software/hardware SSD research. FEMU is cheap (open-sourced), relatively accurate (0.5-38% variance as a drop-in replacement of OpenChannel SSD), scalable (can support 32 ...
- Article, February 2018
Fail-slow at scale: evidence of hardware performance faults in large production systems
- Haryadi S. Gunawi,
- Riza O. Suminto,
- Russell Sears,
- Casey Golliher,
- Swaminathan Sundararaman,
- Xing Lin,
- Tim Emami,
- Weiguang Sheng,
- Nematollah Bidokhti,
- Caitie McCaffrey,
- Gary Grider,
- Parks M. Fields,
- Kevin Harms,
- Robert B. Ross,
- Andree Jacobson,
- Robert Ricci,
- Kirk Webb,
- Peter Alvaro,
- H. Birali Runesha,
- Mingzhe Hao,
- Huaicheng Li
Fail-slow hardware is an under-studied failure mode. We present a study of 101 reports of fail-slow hardware incidents, collected from large-scale cluster deployments in 12 institutions. We show that all hardware types such as disk, SSD, CPU, memory and ...
- research-article, October 2017
Tiny-Tail Flash: Near-Perfect Elimination of Garbage Collection Tail Latencies in NAND SSDs
- Shiqin Yan,
- Huaicheng Li,
- Mingzhe Hao,
- Michael Hao Tong,
- Swaminathan Sundararaman,
- Andrew A. Chien,
- Haryadi S. Gunawi
ACM Transactions on Storage (TOS), Volume 13, Issue 3, Article No.: 22, Pages 1–26, https://doi.org/10.1145/3121133
Flash storage has become the mainstream destination for storage users. However, SSDs do not always deliver the performance that users expect. The core culprit of flash performance instability is the well-known garbage collection (GC) process, which ...
- Article, February 2017
Tiny-tail flash: near-perfect elimination of garbage collection tail latencies in NAND SSDs
- Shiqin Yan,
- Huaicheng Li,
- Mingzhe Hao,
- Michael Hao Tong,
- Swaminathan Sundararaman,
- Andrew A. Chien,
- Haryadi S. Gunawi
TTFLASH is a "tiny-tail" flash drive (SSD) that eliminates GC-induced tail latencies by circumventing GC-blocked I/Os with four novel strategies: plane-blocking GC, rotating GC, GC-tolerant read, and GC-tolerant flush. It is built on three SSD internal ...
- Article, February 2016
CloudCache: on-demand flash cache management for Cloud Computing
Host-side flash caching has emerged as a promising solution to the scalability problem of virtual machine (VM) storage in cloud computing systems, but it still faces serious limitations in capacity and endurance. This paper presents CloudCache, an on-...
- research-article, October 2015
Mjölnir: collecting trash in a demanding new world
- Zev Weiss,
- Sriram Subramanian,
- Swaminathan Sundararaman,
- Vinay Sridhar,
- Nisha Talagala,
- Andrea C. Arpaci-Dusseau,
- Remzi H. Arpaci-Dusseau
INFLOW '15: Proceedings of the 3rd Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads, Article No.: 4, Pages 1–10, https://doi.org/10.1145/2819001.2819006
As flash devices become ubiquitous in data centers and cost per gigabyte drops, flash systems need to provide data services similar to those of traditional storage. We present Mjölnir, a powerful and scalable engine that addresses the core problems that ...
- research-article, October 2015
Towards software defined persistent memory: rethinking software support for heterogenous memory architectures
INFLOW '15: Proceedings of the 3rd Workshop on Interactions of NVM/FLASH with Operating Systems and Workloads, Article No.: 6, Pages 1–10, https://doi.org/10.1145/2819001.2819004
The emergence of persistent memories promises a sea-change in application and data center architectures, with efficiencies and performance not possible with today's volatile DRAM and persistent slow storage. We present Software Defined Persistent Memory,...
- Article, July 2015
NVMKV: a scalable, lightweight, FTL-aware key-value store
USENIX ATC '15: Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference, Pages 207–219
Key-value stores are ubiquitous in high performance data-intensive, scale out, and NoSQL environments. Many KV stores use flash devices for meeting their performance needs. However, by using flash as a simple block device, these KV stores are unable to ...
- Article, February 2015
ANViL: advanced virtualization for modern non-volatile memory devices
- Zev Weiss,
- Sriram Subramanian,
- Swaminathan Sundararaman,
- Nisha Talagala,
- Andrea C. Arpaci-Dusseau,
- Remzi H. Arpaci-Dusseau
We present a new form of storage virtualization based on block-level address remapping. By allowing the host system to manipulate this address map with a set of three simple operations (clone, move, and delete), we enable a variety of useful features ...
- Article, June 2014
NVMKV: a scalable and lightweight flash aware key-value store
- Leonardo Mármol,
- Swaminathan Sundararaman,
- Nisha Talagala,
- Raju Rangaswami,
- Sushma Devendrappa,
- Bharath Ramsundar,
- Sriram Ganesan
HotStorage'14: Proceedings of the 6th USENIX conference on Hot Topics in Storage and File Systems, Page 8
State-of-the-art flash-optimized KV stores frequently rely upon a log structure and/or compaction-based strategy to optimally organize content on flash. However, these strategies lead to excessive I/O, beyond the write amplification generated within the ...
- research-article, April 2014
Snapshots in a flash with ioSnap
- Sriram Subramanian,
- Swaminathan Sundararaman,
- Nisha Talagala,
- Andrea C. Arpaci-Dusseau,
- Remzi H. Arpaci-Dusseau
EuroSys '14: Proceedings of the Ninth European Conference on Computer Systems, Article No.: 23, Pages 1–14, https://doi.org/10.1145/2592798.2592825
Snapshots are a common and heavily relied upon feature in storage systems. The high performance of flash-based storage systems brings new, more stringent, requirements for this classic capability. We present ioSnap, a flash optimized snapshot system. ...
- research-article, June 2013
HEC: improving endurance of high performance flash-based cache devices
SYSTOR '13: Proceedings of the 6th International Systems and Storage Conference, Article No.: 10, Pages 1–11, https://doi.org/10.1145/2485732.2485743
Flash memory is widely used for its fast random I/O access performance in a gamut of enterprise storage applications. However, due to the limited endurance and asymmetric write performance of flash memory, minimizing writes to a flash device is critical ...
- Article, February 2013
Write policies for host-side flash caches
Host-side flash-based caching offers a promising new direction for optimizing access to networked storage. Current work has argued for using host-side flash primarily as a read cache and employing a write-through policy which provides the strictest ...
- research-article, February 2012
Making the common case the only case with anticipatory memory allocation
- Swaminathan Sundararaman,
- Yupu Zhang,
- Sriram Subramanian,
- Andrea C. Arpaci-Dusseau,
- Remzi H. Arpaci-Dusseau
ACM Transactions on Storage (TOS), Volume 7, Issue 4, Article No.: 13, Pages 1–24, https://doi.org/10.1145/2078861.2078863
We present anticipatory memory allocation (AMA), a new method to build kernel code that is robust to memory-allocation failures. AMA avoids the usual difficulties in handling allocation failures through a novel combination of static and dynamic ...