Parallel computing methodologies

Showing 1–20 of 20 results

research-article
Public Access
July 2021
Systemizing Interprocedural Static Analysis of Large-scale Systems Code with Graspan
ACM Transactions on Computer Systems (TOCS), Volume 38, Issue 1-2, Article No. 4, Pages 1–39. https://doi.org/10.1145/3466820

There is more than a decade-long history of using static analysis to find bugs in systems such as Linux. Most of the existing static analyses developed for these systems are simple checkers that find bugs based on pattern matching. Despite the presence ...
Total Citations: 3; Total Downloads: 1,188; Last 12 Months: 392; Last 6 Weeks: 61

research-article
August 2015
SKMD: Single Kernel on Multiple Devices for Transparent CPU-GPU Collaboration
ACM Transactions on Computer Systems (TOCS), Volume 33, Issue 3, Article No. 9, Pages 1–27. https://doi.org/10.1145/2798725

Heterogeneous computing on CPUs and GPUs has traditionally used fixed roles for each device: the GPU handles data parallel work by taking advantage of its massive number of cores while the CPU handles non data-parallel work, such as the sequential code ...
Total Citations: 38; Total Downloads: 848; Last 12 Months: 35; Last 6 Weeks: 1

research-article
Open Access
January 2015
The Scalable Commutativity Rule: Designing Scalable Software for Multicore Processors
ACM Transactions on Computer Systems (TOCS), Volume 32, Issue 4, Article No. 10, Pages 1–47. https://doi.org/10.1145/2699681

What opportunities for multicore scalability are latent in software interfaces, such as system call APIs? Can scalability challenges and opportunities be identified even before any implementation exists, simply by considering interface specifications? ...
Total Citations: 46; Total Downloads: 2,414; Last 12 Months: 298; Last 6 Weeks: 33

article
August 2004
Cluster communication protocols for parallel-programming systems
ACM Transactions on Computer Systems (TOCS), Volume 22, Issue 3, Pages 281–325. https://doi.org/10.1145/1012268.1012269

Clusters of workstations are a popular platform for high-performance computing. For many parallel applications, efficient use of a fast interconnection network is essential for good performance. Several modern System Area Networks include programmable ...
Total Citations: 8; Total Downloads: 1,824; Last 12 Months: 9; Last 6 Weeks: 0

article
February 2004
Parallel program performance prediction using deterministic task graph analysis
- Vikram S. Adve,
- Mary K. Vernon
ACM Transactions on Computer Systems (TOCS), Volume 22, Issue 1, Pages 94–136. https://doi.org/10.1145/966785.966788

In this article, we consider analytical techniques for predicting detailed performance characteristics of a single shared memory parallel program for a particular input. Analytical models for parallel programs have been successful at providing simple ...
Total Citations: 62; Total Downloads: 2,400; Last 12 Months: 20; Last 6 Weeks: 2

article
Free
August 1999
Ace: a language for parallel programming with customizable protocols
- Mukund Raghavachari,
- Anne Rogers
ACM Transactions on Computer Systems (TOCS), Volume 17, Issue 3, Pages 202–248. https://doi.org/10.1145/320656.320657

Customizing the protocols that manage accesses to different data structures within an application can improve the performance of software shared-memory programs substantially. Existing systems for using customizable protocols are hard to use directly ...
Total Citations: 3; Total Downloads: 680; Last 12 Months: 75; Last 6 Weeks: 7

article
Free
May 1999
RecPlay: a fully integrated practical record/replay system
- Michiel Ronsse,
- Koen De Bosschere
ACM Transactions on Computer Systems (TOCS), Volume 17, Issue 2, Pages 133–152. https://doi.org/10.1145/312203.312214

This article presents a practical solution for the cyclic debugging of nondeterministic parallel programs. The solution consists of a combination of record/replay with automatic on-the-fly data race detection. This combination enables us to limit the ...
Total Citations: 254; Total Downloads: 2,115; Last 12 Months: 140; Last 6 Weeks: 27

article
Free
May 1999
Eliminating synchronization overhead in automatically parallelized programs using dynamic feedback
- Pedro C. Diniz,
- Martin C. Rinard
ACM Transactions on Computer Systems (TOCS), Volume 17, Issue 2, Pages 89–132. https://doi.org/10.1145/312203.312210

This article presents dynamic feedback, a technique that enables computations to adapt dynamically to different execution environments. A compiler that uses dynamic feedback produces several different versions of the same source code; each version uses ...
Total Citations: 9; Total Downloads: 630; Last 12 Months: 67; Last 6 Weeks: 5

article
Free
August 1998
A quantitative comparison of parallel computation models
- Ben H. H. Juurlink,
- Harry A. G. Wijshoff
ACM Transactions on Computer Systems (TOCS), Volume 16, Issue 3, Pages 271–318. https://doi.org/10.1145/290409.290412

In recent years, a large number of parallel computation models have been proposed to replace the PRAM as the parallel computation model presented to the algorithm designer. Although mostly the theoretical justifications for these models are sound, and ...
Total Citations: 19; Total Downloads: 1,309; Last 12 Months: 84; Last 6 Weeks: 13

article
Free
February 1998
Performance evaluation of the Orca shared-object system
ACM Transactions on Computer Systems (TOCS), Volume 16, Issue 1, Pages 1–40. https://doi.org/10.1145/273011.273014

Orca is a portable, object-based distributed shared memory (DSM) system. This article studies and evaluates the design choices made in the Orca system and compares Orca with other DSMs. The article gives a quantitative analysis of Orca's coherence ...
Total Citations: 103; Total Downloads: 897; Last 12 Months: 135; Last 6 Weeks: 23

article
Free
November 1997
Eraser: a dynamic data race detector for multithreaded programs
ACM Transactions on Computer Systems (TOCS), Volume 15, Issue 4, Pages 391–411. https://doi.org/10.1145/265924.265927

Multithreaded programming is difficult and error prone. It is easy to make a mistake in synchronization that produces a data race, yet it can be extremely hard to locate this mistake during debugging. This article describes a new tool, called Eraser, ...
Total Citations: 1,225; Total Downloads: 5,944; Last 12 Months: 761; Last 6 Weeks: 180

article
Free
February 1997
Scheduler-conscious synchronization
ACM Transactions on Computer Systems (TOCS), Volume 15, Issue 1, Pages 3–40. https://doi.org/10.1145/244764.244765

Efficient synchronization is important for achieving good performance in parallel programs, especially on large-scale multiprocessors. Most synchronization algorithms have been designed to run on a dedicated machine, with one application process per ...
Total Citations: 72; Total Downloads: 1,098; Last 12 Months: 64; Last 6 Weeks: 8

article
Free
August 1996
The Vesta parallel file system
- Peter F. Corbett,
- Dror G. Feitelson
ACM Transactions on Computer Systems (TOCS), Volume 14, Issue 3, Pages 225–264. https://doi.org/10.1145/233557.233558

The Vesta parallel file system is designed to provide parallel file access to application programs running on multicomputers with parallel I/O subsystems. Vesta uses a new abstraction of files: a file is not a sequence of bytes, but rather it can be ...
Total Citations: 142; Total Downloads: 1,315; Last 12 Months: 138; Last 6 Weeks: 20

article
Free
May 1996
Portable run-time support for dynamic object-oriented parallel processing
ACM Transactions on Computer Systems (TOCS), Volume 14, Issue 2, Pages 139–170. https://doi.org/10.1145/227695.227696

Mentat is an object-oriented parallel processing system designed to simplify the task of writing portable parallel programs for parallel machines and workstation networks. The Mentat compiler and run-time system work together to automatically manage the ...
Total Citations: 48; Total Downloads: 619; Last 12 Months: 58; Last 6 Weeks: 9

article
Free
August 1994
The TickerTAIP parallel RAID architecture
ACM Transactions on Computer Systems (TOCS), Volume 12, Issue 3, Pages 236–269. https://doi.org/10.1145/185514.185517

Traditional disk arrays have a centralized architecture, with a single controller through which all requests flow. Such a controller is a single point of failure, and its performance limits the maximum number of disks to which the array can scale. We ...
Total Citations: 75; Total Downloads: 1,053; Last 12 Months: 123; Last 6 Weeks: 19

article
Free
November 1993
Cooperative shared memory: software and hardware for scalable multiprocessors
ACM Transactions on Computer Systems (TOCS), Volume 11, Issue 4, Pages 300–318. https://doi.org/10.1145/161541.161544

We believe the paucity of massively parallel, shared-memory machines follows from the lack of a shared-memory programming performance model that can inform programmers of the cost of operations (so they can avoid expensive ones) and can tell hardware ...
Total Citations: 121; Total Downloads: 601; Last 12 Months: 108; Last 6 Weeks: 6

article
Free
November 1993
Access normalization: loop restructuring for NUMA computers
- Wei Li,
- Keshav Pingali
ACM Transactions on Computer Systems (TOCS), Volume 11, Issue 4, Pages 353–375. https://doi.org/10.1145/161541.159766

In scalable parallel machines, processors can make local memory accesses much faster than they can make remote memory accesses. Additionally, when a number of remote accesses must be made, it is usually more efficient to use block transfers of data ...
Total Citations: 55; Total Downloads: 427; Last 12 Months: 76; Last 6 Weeks: 13

article
Free
May 1990
“Topologies”—distributed objects on multicomputers
- Karsten Schwan,
- Win Bo
ACM Transactions on Computer Systems (TOCS), Volume 8, Issue 2, Pages 111–157. https://doi.org/10.1145/78952.78954

Application programs written for large-scale multicomputers with interconnection structures known to the programmer (e.g., hypercubes or meshes) use complex communication structures for connecting the applications' parallel tasks. Such structures ...
Total Citations: 18; Total Downloads: 745; Last 12 Months: 84; Last 6 Weeks: 15

article
Free
May 1989
High-speed implementations of rule-based systems
ACM Transactions on Computer Systems (TOCS), Volume 7, Issue 2, Pages 119–146. https://doi.org/10.1145/63404.63405

Rule-based systems are widely used in artificial intelligence for modeling intelligent behavior and building expert systems. Most rule-based programs, however, are extremely computation intensive and run quite slowly. The slow speed of execution has ...
Total Citations: 93; Total Downloads: 1,617; Last 12 Months: 120; Last 6 Weeks: 7

article
Free
August 1987
High-performance operating system primitives for robotics and real-time control systems
ACM Transactions on Computer Systems (TOCS), Volume 5, Issue 3, Pages 189–231. https://doi.org/10.1145/24068.24070

To increase speed and reliability of operation, multiple computers are replacing uniprocessors and wired-logic controllers in modern robots and industrial control systems. However, performance increases are not attained by such hardware alone. The ...
Total Citations: 46; Total Downloads: 1,921; Last 12 Months: 193; Last 6 Weeks: 24