No abstract available.
Spartan: a distributed array framework with smart tiling
Application programmers in domains like machine learning, scientific computing, and computational biology are accustomed to using powerful, high-productivity array languages such as MATLAB, R, and NumPy. Distributed array frameworks aim to scale array ...
Experience with rules-based programming for distributed, concurrent, fault-tolerant code
This paper describes how a rules-based approach allowed us to solve a broad class of challenging distributed system problems in the RAMCloud storage system. In the rules-based approach, behavior is described with small sections of code that trigger ...
Tiered replication: a cost-effective alternative to full cluster geo-replication
Cloud storage systems typically use three-way random replication to guard against data loss within the cluster, and utilize cluster geo-replication to protect against correlated failures. This paper presents a much lower cost alternative to full cluster ...
Callisto-RTS: fine-grain parallel loops
We introduce Callisto-RTS, a parallel runtime system designed for multi-socket shared-memory machines. It supports very fine-grained scheduling of parallel loops, down to batches of work of around 1K cycles. Fine-grained scheduling helps avoid load ...
LAMA: optimized locality-aware memory allocation for key-value cache
The in-memory cache system is a performance-critical layer in today's web server architecture. Memcached is one of the most effective, representative, and prevalent among such systems. An important problem is memory allocation. The default design does ...
LSM-trie: an LSM-tree-based ultra-large key-value store for small data
Key-value (KV) stores have become a backbone of large-scale applications in today's data centers. The data set of the store on a single server can grow to billions of KV items or many terabytes, while individual data items are often small (with their ...
MetaSync: file synchronization across multiple untrusted storage services
Cloud-based file synchronization services, such as Dropbox, are a worldwide resource for many millions of users. However, individual services often have tight resource limits, suffer from temporary outages or even shutdowns, and sometimes silently ...
Pyro: a spatial-temporal big-data storage system
With the rapid growth of mobile devices and applications, geo-tagged data has become a major workload for big data storage systems. In order to achieve scalability, existing solutions build an additional index layer above general purpose distributed ...
CDStore: toward reliable, secure, and cost-efficient cloud storage via convergent dispersal
We present CDStore, which disperses users' backup data across multiple clouds and provides a unified multicloud storage solution with reliability, security, and cost-efficiency guarantees. CDStore builds on an augmented secret sharing scheme called ...
Surviving peripheral failures in embedded systems
Peripherals fail. Yet, modern embedded systems largely leave the burden of tolerating peripheral failures to the programmer. This paper presents Phoenix, a semi-automated peripheral recovery system for resource-constrained embedded systems. Phoenix ...
Log2: a cost-aware logging mechanism for performance diagnosis
Logging has been a common practice for monitoring and diagnosing performance issues. However, logging comes at a cost, especially for large-scale online service systems. First, the overhead incurred by intensive logging is non-negligible. Second, it is ...
Identifying trends in enterprise data protection systems
Enterprises routinely use data protection techniques to achieve business continuity in the event of failures. To ensure that backup and recovery goals are met in the face of the steep data growth rates of modern workloads, data protection systems need ...
Systematically exploring the behavior of control programs
Many networked systems today, ranging from home automation networks to global wide-area networks, are operated using centralized control programs. Bugs in such programs pose serious risks to system security and stability. We develop a new technique to ...
Fence: protecting device availability with uniform resource control
Applications such as software updaters or a runaway web app, even if low priority, can cause performance degradation, loss of battery life, or other issues that reduce a computing device's availability. The core problem is that OS resource control ...
Request-oriented durable write caching for application performance
Non-volatile write cache (NVWC) can help to improve the performance of I/O-intensive tasks, especially write-dominated tasks. The benefit of NVWC, however, cannot be fully exploited if an admission policy blindly caches all writes without ...
NVMKV: a scalable, lightweight, FTL-aware key-value store
Key-value stores are ubiquitous in high performance data-intensive, scale out, and NoSQL environments. Many KV stores use flash devices for meeting their performance needs. However, by using flash as a simple block device, these KV stores are unable to ...
Lightweight application-level crash consistency on transactional flash storage
Applications implement their own update protocols to ensure consistency of data on the file system. However, since current file systems provide only a preliminary ordering guarantee, notably fsync, these update protocols become complex, slow, and error-...
WALDIO: eliminating the filesystem journaling in resolving the journaling of journal anomaly
This work is dedicated to resolving the Journaling of Journal Anomaly in the Android I/O stack. We orchestrate SQLite and the EXT4 filesystem so that SQLite's file-backed journaling activity can dispense with the expensive filesystem intervention, the journaling, ...
SpanFS: a scalable file system on fast storage devices
Most recent storage devices, such as NAND flash-based solid state drives (SSDs), provide low access latency and high degree of parallelism. However, conventional file systems, which are designed for slow hard disk drives, often encounter severe ...
Shoal: smart allocation and replication of memory for parallel programs
Modern NUMA multi-core machines exhibit complex latency and throughput characteristics, making it hard to allocate memory optimally for a given program's access patterns. However, sub-optimal allocation can significantly impact performance of parallel ...
Thread and memory placement on NUMA systems: asymmetry matters
It is well known that the placement of threads and memory plays a crucial role for performance on NUMA (Non-Uniform Memory-Access) systems. The conventional wisdom is to place threads close to their memory, to collocate on the same node threads that ...
Latency-tolerant software distributed shared memory
We present Grappa, a modern take on software distributed shared memory (DSM) for in-memory data-intensive applications. Grappa enables users to program a cluster as if it were a single, large, non-uniform memory access (NUMA) machine. Performance scales ...
NightWatch: integrating lightweight and transparent cache pollution control into dynamic memory allocation systems
Cache pollution, by which weak-locality data unduly replaces strong-locality data, may notably degrade application performance in a shared-cache multicore machine. This paper presents NightWatch, a cache management subsystem that provides general, ...
Secure deduplication of general computations
The world's fast-growing data has become highly concentrated on enterprise or cloud storage servers. Data deduplication reduces redundancy in this data, saving storage and simplifying management. While existing systems can deduplicate computations on ...
Lamassu: storage-efficient host-side encryption
Many storage customers are adopting encryption solutions to protect critical data. Most existing encryption solutions sit in, or near, the application that is the source of critical data, upstream of the primary storage system. Placing encryption near ...
SecPod: a framework for virtualization-based security systems
The OS kernel is critical to the security of a computer system. Many systems have been proposed to improve its security. A fundamental weakness of those systems is that page tables, the data structures that control the memory protection, are not ...
Between mutual trust and mutual distrust: practical fine-grained privilege separation in multithreaded applications
Threads in a multithreaded process share the same address space and thus are implicitly assumed to be mutually trusted. However, one (compromised) thread attacking another is a real world threat. It remains challenging to achieve privilege separation ...
GridGraph: large-scale graph processing on a single machine using 2-level hierarchical partitioning
In this paper, we present GridGraph, a system for processing large-scale graphs on a single machine. GridGraph breaks graphs into 1D-partitioned vertex chunks and 2D-partitioned edge blocks using a first fine-grained level partitioning in ...
GraphQ: graph query processing with abstraction refinement: scalable and programmable analytics over very large graphs on a single PC
This paper introduces GraphQ, a scalable querying framework for very large graphs. GraphQ is built on a key insight that many interesting graph properties -- such as finding cliques of a certain size, or finding vertices with a certain page rank -- can ...
Accurate latency-based congestion feedback for datacenters
The nature of congestion feedback largely governs the behavior of congestion control. In datacenter networks, where RTTs are in hundreds of microseconds, accurate feedback is crucial to achieve both high utilization and low queueing delay. Proposals for ...
Index Terms
- Proceedings of the 2015 USENIX Conference on Usenix Annual Technical Conference