Software and its engineering

16 Results for: Book/Issue: PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming

Showing 1 - 16 of 16 Results

poster
Public Access
March 2022
Optimizing sparse computations jointly
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 459–460, https://doi.org/10.1145/3503221.3508439

This work proposes a framework called FuSy that analyzes the data dependence graphs (DAGs) of two sparse kernels and creates an efficient schedule to execute the kernels in combination. Sparse kernels are frequently used in scientific codes and in ...
Total Citations: 2
Total Downloads: 324
Last 12 Months: 108
Last 6 weeks: 13
poster
March 2022
Rethinking graph data placement for graph neural network training on multiple GPUs
- Shihui Song
- Peng Jiang
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 455–456, https://doi.org/10.1145/3503221.3508435

The existing Graph Neural Network (GNN) systems adopt graph partitioning to divide the graph data for multi-GPU training. Although they support large graphs, we find that the existing techniques lead to large data loading overhead. In this work, we for ...
Total Citations: 1
Total Downloads: 378
Last 12 Months: 32
Last 6 weeks: 2
research-article
Open Access
March 2022
Results Reproduced / v1.1
Artifacts Available / v1.1
Artifacts Evaluated & Reusable / v1.1
Parallel block-delayed sequences
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 61–75, https://doi.org/10.1145/3503221.3508434

Programming languages using functions on collections of values, such as map, reduce, scan and filter, have been used for over fifty years. Such collections have proven to be particularly useful in the context of parallelism because such functions are ...
Total Citations: 3
Total Downloads: 1,307
Last 12 Months: 372
Last 6 weeks: 44
poster
March 2022
Parallel algorithms for masked sparse matrix-matrix products
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 453–454, https://doi.org/10.1145/3503221.3508430

Computing the product of two sparse matrices (SpGEMM) is a fundamental operation in various combinatorial and graph algorithms as well as various bioinformatics and data analytics applications for computing inner-product similarities. For an important ...
Total Citations: 2
Total Downloads: 276
Last 12 Months: 46
Last 6 weeks: 11
poster
March 2022
An LLVM-based open-source compiler for NVIDIA GPUs
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 448–449, https://doi.org/10.1145/3503221.3508428

We present GASS, an LLVM-based open-source compiler for NVIDIA GPU's SASS machine assembly. GASS is the first open-source compiler targeting SASS, and it provides a unified toolchain for currently fragmented low-level performance research on NVIDIA GPUs. ...
Total Citations: 1
Total Downloads: 756
Last 12 Months: 175
Last 6 weeks: 16
research-article
March 2022
Understanding and detecting deep memory persistency bugs in NVM programs with DeepMC
- Benjamin Reidys
- Jian Huang
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 322–336, https://doi.org/10.1145/3503221.3508427

To facilitate programming with non-volatile memory (NVM), a set of memory persistency models, such as strict and epoch persistency, have been proposed. Although these models provide high-level guidance for reasoning about the data persistence, ...
Total Citations: 3
Total Downloads: 302
Last 12 Months: 44
Last 6 weeks: 0
research-article
Open Access
March 2022
Artifacts Available / v1.1
Artifacts Evaluated & Reusable / v1.1
Interference relation-guided SMT solving for multi-threaded program verification
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 163–176, https://doi.org/10.1145/3503221.3508424

Concurrent program verification is challenging due to a large number of thread interferences. A popular approach is to encode concurrent programs as SMT formulas and then rely on off-the-shelf SMT solvers to accomplish the verification. In most existing ...
Total Citations: 4
Total Downloads: 774
Last 12 Months: 238
Last 6 weeks: 41
research-article
March 2022
Results Reproduced / v1.1
Artifacts Available / v1.1
Artifacts Evaluated & Reusable / v1.1
CASE: a compiler-assisted SchEduling framework for multi-GPU systems
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 17–31, https://doi.org/10.1145/3503221.3508423

Modern computing platforms tend to deploy multiple GPUs on a single node to boost performance. GPUs have large computing capacities and are an expensive resource. Increasing their utilization without causing performance degradation of individual ...
Total Citations: 9
Total Downloads: 953
Last 12 Months: 272
Last 6 weeks: 14
research-article
March 2022
Dopia: online parallelism management for integrated CPU/GPU architectures
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 32–45, https://doi.org/10.1145/3503221.3508421

Recent desktop and mobile processors often integrate CPU and GPU onto the same die. The limited memory bandwidth of these integrated architectures can negatively affect the performance of data-parallel workloads when all computational resources are ...
Total Citations: 4
Total Downloads: 552
Last 12 Months: 102
Last 6 weeks: 6
research-article
March 2022
Artifacts Evaluated & Reusable / v1.1
Results Reproduced / v1.1
Asymmetry-aware scalable locking
- Nian Liu
- Jinyu Gu
- Dahai Tang
- Kenli Li
- Binyu Zang
- Haibo Chen
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 294–308, https://doi.org/10.1145/3503221.3508420

The pursuit of power-efficiency is popularizing asymmetric multicore processors (AMP) such as ARM big.LITTLE, Apple M1 and recent Intel Alder Lake with big and little cores. However, we find that existing scalable locks fail to scale on AMP and cause ...
Total Citations: 3
Total Downloads: 357
Last 12 Months: 89
Last 6 weeks: 8
poster
March 2022
Hardening selective protection across multiple program inputs for HPC applications
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 437–438, https://doi.org/10.1145/3503221.3508414

With the ever-shrinking size of transistors and increasing scale of applications, silent data corruptions (SDCs) have become a common yet serious issue in HPC applications. Selective instruction duplication (SID) is a popular fault-tolerance technique ...
Total Citations: 5
Total Downloads: 153
Last 12 Months: 38
Last 6 weeks: 2
research-article
Open Access
March 2022
Artifacts Available / v1.1
Stream processing with dependency-guided synchronization
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 1–16, https://doi.org/10.1145/3503221.3508413

Real-time data processing applications with low latency requirements have led to the increasing popularity of stream processing systems. While such systems offer convenient APIs that can be used to achieve data parallelism automatically, they offer ...
Total Citations: 3
Total Downloads: 958
Last 12 Months: 213
Last 6 weeks: 31
research-article
Open Access
March 2022
Vapro: performance variance detection and diagnosis for production-run parallel applications
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 150–162, https://doi.org/10.1145/3503221.3508411

Performance variance is a serious problem for parallel applications, which can cause performance degradation and make applications' behavior hard to understand. Therefore, detecting and diagnosing performance variance are of crucial importance for users ...
Total Citations: 6
Total Downloads: 865
Last 12 Months: 348
Last 6 weeks: 42
research-article
Open Access
March 2022
Artifacts Evaluated & Functional / v1.1
Artifacts Available / v1.1
PerFlow: a domain specific framework for automatic performance analysis of parallel applications
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 177–191, https://doi.org/10.1145/3503221.3508405

Performance analysis is widely used to identify performance issues of parallel applications. However, complex communications and data dependence, as well as the interactions between different kinds of performance issues make high-efficiency performance ...
Total Citations: 4
Total Downloads: 995
Last 12 Months: 355
Last 6 weeks: 41
research-article
March 2022
Artifacts Available / v1.1
Artifacts Evaluated & Reusable / v1.1
Results Reproduced / v1.1
Deadlock-free asynchronous message reordering in Rust with multiparty session types
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 246–261, https://doi.org/10.1145/3503221.3508404

Rust is a modern systems language focused on performance and reliability. Complementing Rust's promise to provide "fearless concurrency", developers frequently exploit asynchronous message passing. Unfortunately, sending and receiving messages in an ...
Total Citations: 13
Total Downloads: 421
Last 12 Months: 117
Last 6 weeks: 3
poster
Public Access
March 2022
Automatic synthesis of parallel Unix commands and pipelines with KumQuat
PPoPP '22: Proceedings of the 27th ACM SIGPLAN Symposium on Principles and Practice of Parallel Programming, Pages 431–432, https://doi.org/10.1145/3503221.3508400

We present KumQuat, a system for automatically generating data-parallel implementations of Unix shell commands and pipelines. The generated parallel versions split input streams, execute multiple instantiations of the original pipeline commands to ...
Total Citations: 1
Total Downloads: 237
Last 12 Months: 112
Last 6 weeks: 19
