- research-article, February 2024
Diciclo: Flexible User-level Services for Efficient Multitenant Isolation
ACM Transactions on Computer Systems (TOCS), Volume 42, Issue 1-2, Article No.: 3, Pages 1–47, https://doi.org/10.1145/3639404
Containers are a mainstream virtualization technique for running stateful workloads over persistent storage. In highly utilized multitenant hosts, resource contention at the system kernel leads to inefficient container input/output (I/O) handling. ...
- research-article, December 2023
Charlotte: Reformulating Blockchains into a Web of Composable Attested Data Structures for Cross-Domain Applications
ACM Transactions on Computer Systems (TOCS), Volume 41, Issue 1-4, Article No.: 2, Pages 1–52, https://doi.org/10.1145/3607534
Cross-domain applications are rapidly adopting blockchain techniques for immutability, availability, integrity, and interoperability. However, for most applications, global consensus is unnecessary and may not even provide sufficient guarantees.
We ...
- research-article, July 2022
ROME: All Overlays Lead to Aggregation, but Some Are Faster than Others
ACM Transactions on Computer Systems (TOCS), Volume 39, Issue 1-4, Article No.: 4, Pages 1–33, https://doi.org/10.1145/3516430
Aggregation is common in data analytics and crucial to distilling information from large datasets, but current data analytics frameworks do not fully exploit the potential for optimization in such phases. The lack of optimization is particularly notable ...
- research-article, October 2021
Apache Nemo: A Framework for Optimizing Distributed Data Processing
- Won Wook Song,
- Youngseok Yang,
- Jeongyoon Eo,
- Jangho Seo,
- Joo Yeon Kim,
- Sanha Lee,
- Gyewon Lee,
- Taegeon Um,
- Haeyoon Cho,
- Byung-Gon Chun
ACM Transactions on Computer Systems (TOCS), Volume 38, Issue 3-4, Article No.: 5, Pages 1–31, https://doi.org/10.1145/3468144
Optimizing scheduling and communication of distributed data processing for resource and data characteristics is crucial for achieving high performance. Existing approaches to such optimizations largely fall into two categories. First, distributed runtimes ...
- research-article, July 2021
SmartIO: Zero-overhead Device Sharing through PCIe Networking
- Jonas Markussen,
- Lars Bjørlykke Kristiansen,
- Pål Halvorsen,
- Halvor Kielland-Gyrud,
- Håkon Kvale Stensland,
- Carsten Griwodz
ACM Transactions on Computer Systems (TOCS), Volume 38, Issue 1-2, Article No.: 2, Pages 1–78, https://doi.org/10.1145/3462545
The large variety of compute-heavy and data-driven applications accelerate the need for a distributed I/O solution that enables cost-effective scaling of resources between networked hosts. For example, in a cluster system, different machines may have ...
- research-article, May 2020
A Retargetable System-level DBT Hypervisor
ACM Transactions on Computer Systems (TOCS), Volume 36, Issue 4, Article No.: 14, Pages 1–24, https://doi.org/10.1145/3386161
System-level Dynamic Binary Translation (DBT) provides the capability to boot an Operating System (OS) and execute programs compiled for an Instruction Set Architecture (ISA) different from that of the host machine. Due to their performance-critical ...
- research-article, April 2019
Derecho: Fast State Machine Replication for Cloud Services
- Sagar Jha,
- Jonathan Behrens,
- Theo Gkountouvas,
- Mae Milano,
- Weijia Song,
- Edward Tremel,
- Robbert Van Renesse,
- Sydney Zink,
- Kenneth P. Birman
ACM Transactions on Computer Systems (TOCS), Volume 36, Issue 2, Article No.: 4, Pages 1–49, https://doi.org/10.1145/3302258
Cloud computing services often replicate data and may require ways to coordinate distributed actions. Here we present Derecho, a library for such tasks. The API provides interfaces for structuring applications into patterns of subgroups and shards, ...
- research-article, October 2017
Apache REEF: Retainable Evaluator Execution Framework
- Byung-Gon Chun,
- Tyson Condie,
- Yingda Chen,
- Brian Cho,
- Andrew Chung,
- Carlo Curino,
- Chris Douglas,
- Matteo Interlandi,
- Beomyeol Jeon,
- Joo Seong Jeong,
- Gyewon Lee,
- Yunseong Lee,
- Tony Majestro,
- Dahlia Malkhi,
- Sergiy Matusevych,
- Brandon Myers,
- Mariia Mykhailova,
- Shravan Narayanamurthy,
- Joseph Noor,
- Raghu Ramakrishnan,
- Sriram Rao,
- Russell Sears,
- Beysim Sezgin,
- Taegeon Um,
- Julia Wang,
- Markus Weimer,
- Youngseok Yang
ACM Transactions on Computer Systems (TOCS), Volume 35, Issue 2, Article No.: 5, Pages 1–31, https://doi.org/10.1145/3132037
Resource Managers like YARN and Mesos have emerged as a critical layer in the cloud computing system stack, but the developer abstractions for leasing cluster resources and instantiating application logic are very low level. This flexibility comes at a ...
- research-article, May 2015
Fireflies: A Secure and Scalable Membership and Gossip Service
ACM Transactions on Computer Systems (TOCS), Volume 33, Issue 2, Article No.: 5, Pages 1–32, https://doi.org/10.1145/2701418
An attacker who controls a computer in an overlay network can effectively control the entire overlay network if the mechanism managing membership information can successfully be targeted. This article describes Fireflies, an overlay network protocol ...
- research-article, March 2015
Energy-Oriented Partial Desktop Virtual Machine Migration
- Nilton Bila,
- Eric J. Wright,
- Eyal De Lara,
- Kaustubh Joshi,
- H. Andrés Lagar-Cavilla,
- Eunbyung Park,
- Ashvin Goel,
- Matti Hiltunen,
- Mahadev Satyanarayanan
ACM Transactions on Computer Systems (TOCS), Volume 33, Issue 1, Article No.: 2, Pages 1–51, https://doi.org/10.1145/2699683
Modern offices are crowded with personal computers. While studies have shown these to be idle most of the time, they remain powered, consuming up to 60% of their peak power. Hardware-based solutions engendered by PC vendors (e.g., low-power states, Wake-...
- research-article, January 2015
The Next 700 BFT Protocols
ACM Transactions on Computer Systems (TOCS), Volume 32, Issue 4, Article No.: 12, Pages 1–45, https://doi.org/10.1145/2658994
We present Abstract (ABortable STate mAChine replicaTion), a new abstraction for designing and reconfiguring generalized replicated state machines that are, unlike traditional state machines, allowed to abort executing a client’s request if “something ...
- research-article, December 2013
QoS-Aware scheduling in heterogeneous datacenters with Paragon
ACM Transactions on Computer Systems (TOCS), Volume 31, Issue 4, Article No.: 12, Pages 1–34, https://doi.org/10.1145/2556583
Large-scale datacenters (DCs) host tens of thousands of diverse applications each day. However, interference between colocated workloads and the difficulty of matching applications to one of the many hardware platforms available can degrade performance, ...
- research-article, December 2013
CORFU: A distributed shared log
ACM Transactions on Computer Systems (TOCS), Volume 31, Issue 4, Article No.: 10, Pages 1–24, https://doi.org/10.1145/2535930
CORFU is a global log which clients can append-to and read-from over a network. Internally, CORFU is distributed over a cluster of machines in such a way that there is no single I/O bottleneck to either appends or reads. Data is fully replicated for ...
- research-article, May 2013
Parametric Content-Based Publish/Subscribe
ACM Transactions on Computer Systems (TOCS), Volume 31, Issue 2, Article No.: 4, Pages 1–52, https://doi.org/10.1145/2465346.2465347
Content-based publish/subscribe (CPS) is an appealing abstraction for building scalable distributed systems, e.g., message boards, intrusion detectors, or algorithmic stock trading platforms. Recently, CPS extensions have been proposed for location-...
- research-article, February 2013
TritonSort: A Balanced and Energy-Efficient Large-Scale Sorting System
- Alexander Rasmussen,
- George Porter,
- Michael Conley,
- Harsha V. Madhyastha,
- Radhika Niranjan Mysore,
- Alexander Pucher,
- Amin Vahdat
ACM Transactions on Computer Systems (TOCS), Volume 31, Issue 1, Article No.: 3, Pages 1–28, https://doi.org/10.1145/2427631.2427634
We present TritonSort, a highly efficient, scalable sorting system. It is designed to process large datasets, and has been evaluated against as much as 100TB of input data spread across 832 disks in 52 nodes at a rate of 0.938TB/min. When evaluated ...
- research-article, November 2012
Fay: Extensible Distributed Tracing from Kernels to Clusters
ACM Transactions on Computer Systems (TOCS), Volume 30, Issue 4, Article No.: 13, Pages 1–35, https://doi.org/10.1145/2382553.2382555
Fay is a flexible platform for the efficient collection, processing, and analysis of software execution traces. Fay provides dynamic tracing through use of runtime instrumentation and distributed aggregation within machines and across clusters. At the ...
- research-article, December 2011
Depot: Cloud Storage with Minimal Trust
ACM Transactions on Computer Systems (TOCS), Volume 29, Issue 4, Article No.: 12, Pages 1–38, https://doi.org/10.1145/2063509.2063512
This article describes the design, implementation, and evaluation of Depot, a cloud storage system that minimizes trust assumptions. Depot tolerates buggy or malicious behavior by any number of clients or servers, yet it provides safety and liveness ...
- research-article, December 2011
EventGuard: A System Architecture for Securing Publish-Subscribe Networks
ACM Transactions on Computer Systems (TOCS), Volume 29, Issue 4, Article No.: 10, Pages 1–40, https://doi.org/10.1145/2063509.2063510
Publish-subscribe (pub-sub) is an emerging paradigm for building a large number of distributed systems. A wide area pub-sub system is usually implemented on an overlay network infrastructure to enable information dissemination from publishers to ...
- research-article, August 2011
On the design of perturbation-resilient atomic commit protocols for mobile transactions
ACM Transactions on Computer Systems (TOCS), Volume 29, Issue 3, Article No.: 7, Pages 1–36, https://doi.org/10.1145/2003690.2003691
Distributed mobile transactions utilize commit protocols to achieve atomicity and consistent decisions. This is challenging, as mobile environments are typically characterized by frequent perturbations such as network disconnections and node failures. ...
- research-article, May 2011
DieCast: Testing Distributed Systems with an Accurate Scale Model
- Diwaker Gupta,
- Kashi Venkatesh Vishwanath,
- Marvin McNett,
- Amin Vahdat,
- Ken Yocum,
- Alex Snoeren,
- Geoffrey M. Voelker
ACM Transactions on Computer Systems (TOCS), Volume 29, Issue 2, Article No.: 4, Pages 1–48, https://doi.org/10.1145/1963559.1963560
Large-scale network services can consist of tens of thousands of machines running thousands of unique software configurations spread across hundreds of physical networks. Testing such services for complex performance problems and configuration errors ...