- research-article, July 2021
Systemizing Interprocedural Static Analysis of Large-scale Systems Code with Graspan
- Zhiqiang Zuo,
- Kai Wang,
- Aftab Hussain,
- Ardalan Amiri Sani,
- Yiyu Zhang,
- Shenming Lu,
- Wensheng Dou,
- Linzhang Wang,
- Xuandong Li,
- Chenxi Wang,
- Guoqing Harry Xu
ACM Transactions on Computer Systems (TOCS), Volume 38, Issue 1-2, Article No.: 4, Pages 1–39, https://doi.org/10.1145/3466820
There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence ...
- research-article, August 2015
SKMD: Single Kernel on Multiple Devices for Transparent CPU-GPU Collaboration
ACM Transactions on Computer Systems (TOCS), Volume 33, Issue 3, Article No.: 9, Pages 1–27, https://doi.org/10.1145/2798725
Heterogeneous computing on CPUs and GPUs has traditionally used fixed roles for each device: the GPU handles data parallel work by taking advantage of its massive number of cores while the CPU handles non data-parallel work, such as the sequential code ...
- research-article, January 2015
The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors
ACM Transactions on Computer Systems (TOCS), Volume 32, Issue 4, Article No.: 10, Pages 1–47, https://doi.org/10.1145/2699681
What opportunities for multicore scalability are latent in software interfaces, such as system call APIs? Can scalability challenges and opportunities be identified even before any implementation exists, simply by considering interface specifications? ...
- article, August 2004
Cluster communication protocols for parallel-programming systems
ACM Transactions on Computer Systems (TOCS), Volume 22, Issue 3, Pages 281–325, https://doi.org/10.1145/1012268.1012269
Clusters of workstations are a popular platform for high-performance computing. For many parallel applications, efficient use of a fast interconnection network is essential for good performance. Several modern System Area Networks include programmable ...
- article, February 2004
Parallel program performance prediction using deterministic task graph analysis
ACM Transactions on Computer Systems (TOCS), Volume 22, Issue 1, Pages 94–136, https://doi.org/10.1145/966785.966788
In this article, we consider analytical techniques for predicting detailed performance characteristics of a single shared memory parallel program for a particular input. Analytical models for parallel programs have been successful at providing simple ...
- article, August 1999
Ace: a language for parallel programming with customizable protocols
ACM Transactions on Computer Systems (TOCS), Volume 17, Issue 3, Pages 202–248, https://doi.org/10.1145/320656.320657
Customizing the protocols that manage accesses to different data structures within an application can improve the performance of software shared-memory programs substantially. Existing systems for using customizable protocols are hard to use directly ...
- article, May 1999
RecPlay: a fully integrated practical record/replay system
ACM Transactions on Computer Systems (TOCS), Volume 17, Issue 2, Pages 133–152, https://doi.org/10.1145/312203.312214
This article presents a practical solution for the cyclic debugging of nondeterministic parallel programs. The solution consists of a combination of record/replay with automatic on-the-fly data race detection. This combination enables us to limit the ...
- article, May 1999
Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback
ACM Transactions on Computer Systems (TOCS), Volume 17, Issue 2, Pages 89–132, https://doi.org/10.1145/312203.312210
This article presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code; each version uses ...
- article, August 1998
A quantitative comparison of parallel computation models
ACM Transactions on Computer Systems (TOCS), Volume 16, Issue 3, Pages 271–318, https://doi.org/10.1145/290409.290412
In recent years, a large number of parallel computation models have been proposed to replace the PRAM as the parallel computation model presented to the algorithm designer. Although mostly the theoretical justifications for these models are sound, and ...
- article, February 1998
Performance evaluation of the Orca shared-object system
ACM Transactions on Computer Systems (TOCS), Volume 16, Issue 1, Pages 1–40, https://doi.org/10.1145/273011.273014
Orca is a portable, object-based distributed shared memory (DSM) system. This article studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. The article gives a quantitative analysis of Orca's coherence ...
- article, November 1997
Eraser: a dynamic data race detector for multithreaded programs
ACM Transactions on Computer Systems (TOCS), Volume 15, Issue 4, Pages 391–411, https://doi.org/10.1145/265924.265927
Multithreaded programming is difficult and error prone. It is easy to make a mistake in synchronization that produces a data race, yet it can be extremely hard to locate this mistake during debugging. This article describes a new tool, called Eraser, ...
- article, February 1997
Scheduler-conscious synchronization
ACM Transactions on Computer Systems (TOCS), Volume 15, Issue 1, Pages 3–40, https://doi.org/10.1145/244764.244765
Efficient synchronization is important for achieving good performance in parallel programs, especially on large-scale multiprocessors. Most synchronization algorithms have been designed to run on a dedicated machine, with one application process per ...
- article, August 1996
The Vesta parallel file system
ACM Transactions on Computer Systems (TOCS), Volume 14, Issue 3, Pages 225–264, https://doi.org/10.1145/233557.233558
The Vesta parallel file system is designed to provide parallel file access to application programs running on multicomputers with parallel I/O subsystems. Vesta uses a new abstraction of files: a file is not a sequence of bytes, but rather it can be ...
- article, May 1996
Portable run-time support for dynamic object-oriented parallel processing
ACM Transactions on Computer Systems (TOCS), Volume 14, Issue 2, Pages 139–170, https://doi.org/10.1145/227695.227696
Mentat is an object-oriented parallel processing system designed to simplify the task of writing portable parallel programs for parallel machines and workstation networks. The Mentat compiler and run-time system work together to automatically manage the ...
- article, August 1994
The TickerTAIP parallel RAID architecture
ACM Transactions on Computer Systems (TOCS), Volume 12, Issue 3, Pages 236–269, https://doi.org/10.1145/185514.185517
Traditional disk arrays have a centralized architecture, with a single controller through which all requests flow. Such a controller is a single point of failure, and its performance limits the maximum number of disks to which the array can scale. We ...
- article, November 1993
Cooperative shared memory: software and hardware for scalable multiprocessors
ACM Transactions on Computer Systems (TOCS), Volume 11, Issue 4, Pages 300–318, https://doi.org/10.1145/161541.161544
We believe the paucity of massively parallel, shared-memory machines follows from the lack of a shared-memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware ...
- article, November 1993
Access normalization: loop restructuring for NUMA computers
ACM Transactions on Computer Systems (TOCS), Volume 11, Issue 4, Pages 353–375, https://doi.org/10.1145/161541.159766
In scalable parallel machines, processors can make local memory accesses much faster than they can make remote memory accesses. Additionally, when a number of remote accesses must be made, it is usually more efficient to use block transfers of data ...
- article, May 1990
“Topologies”—distributed objects on multicomputers
ACM Transactions on Computer Systems (TOCS), Volume 8, Issue 2, Pages 111–157, https://doi.org/10.1145/78952.78954
Application programs written for large-scale multicomputers with interconnection structures known to the programmer (e.g., hypercubes or meshes) use complex communication structures for connecting the applications' parallel tasks. Such structures ...
- article, May 1989
High-speed implementations of rule-based systems
ACM Transactions on Computer Systems (TOCS), Volume 7, Issue 2, Pages 119–146, https://doi.org/10.1145/63404.63405
Rule-based systems are widely used in artificial intelligence for modeling intelligent behavior and building expert systems. Most rule-based programs, however, are extremely computation intensive and run quite slowly. The slow speed of execution has ...
- article, August 1987
High-performance operating system primitives for robotics and real-time control systems
ACM Transactions on Computer Systems (TOCS), Volume 5, Issue 3, Pages 189–231, https://doi.org/10.1145/24068.24070
To increase speed and reliability of operation, multiple computers are replacing uniprocessors and wired-logic controllers in modern robots and industrial control systems. However, performance increases are not attained by such hardware alone. The ...