Volume 19, Issue 1, March 2022
research-article
Open Access
Locality-Aware CTA Scheduling for Gaming Applications
Article No.: 1, Pages 1–26, https://doi.org/10.1145/3477497

The compute work rasterizer or the GigaThread Engine of a modern NVIDIA GPU focuses on maximizing compute work occupancy across all streaming multiprocessors in a GPU while retaining design simplicity. In this article, we identify the operational aspects ...

research-article
Open Access
Iterative Compilation Optimization Based on Metric Learning and Collaborative Filtering
Article No.: 2, Pages 1–25, https://doi.org/10.1145/3480250

Pass selection and phase ordering are two critical compiler auto-tuning problems. Traditional heuristic methods cannot effectively address these NP-hard problems especially given the increasing number of compiler passes and diverse hardware architectures. ...

research-article
Open Access
ReuseTracker: Fast Yet Accurate Multicore Reuse Distance Analyzer
Article No.: 3, Pages 1–25, https://doi.org/10.1145/3484199

One widely used metric that measures data locality is reuse distance—the number of unique memory locations that are accessed between two consecutive accesses to a particular memory location. State-of-the-art techniques that measure reuse distance in ...
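
The definition above can be illustrated with a minimal sketch. This is a naive quadratic-time computation written only to show what the metric measures; it is not the ReuseTracker method, and the function name `reuse_distances` and the toy trace are assumptions made for illustration.

```python
def reuse_distances(trace):
    """Return one reuse distance per access; None marks a first-time access.

    Hypothetical helper for illustration only (naive O(n^2) scan).
    """
    last_seen = {}  # address -> index of its most recent access
    distances = []
    for i, addr in enumerate(trace):
        if addr in last_seen:
            # Count the distinct addresses strictly between the two accesses.
            between = trace[last_seen[addr] + 1 : i]
            distances.append(len(set(between)))
        else:
            distances.append(None)
        last_seen[addr] = i
    return distances


# Toy trace of memory locations (assumed data, not from the article).
trace = ["a", "b", "c", "b", "a"]
print(reuse_distances(trace))  # [None, None, None, 1, 2]
```

In the toy trace, the second access to `b` has distance 1 (only `c` lies in between), and the second access to `a` has distance 2 (`b` and `c`, with `b` counted once).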

research-article
Open Access
GPU Domain Specialization via Composable On-Package Architecture
Article No.: 4, Pages 1–23, https://doi.org/10.1145/3484505

As GPUs scale their low-precision matrix math throughput to boost deep learning (DL) performance, they upset the balance between math throughput and memory system capabilities. We demonstrate that a converged GPU design trying to address diverging ...

research-article
Open Access
SMT-Based Contention-Free Task Mapping and Scheduling on 2D/3D SMART NoC with Mixed Dimension-Order Routing
Article No.: 5, Pages 1–21, https://doi.org/10.1145/3487018

SMART NoCs achieve ultra-low latency by enabling single-cycle multiple-hop transmission via bypass channels. However, contention along bypass channels can seriously degrade the performance of SMART NoCs by breaking the bypass paths. Therefore, contention-...

research-article
Open Access
Marvel: A Data-Centric Approach for Mapping Deep Learning Operators on Spatial Accelerators
Article No.: 6, Pages 1–26, https://doi.org/10.1145/3485137

A spatial accelerator’s efficiency depends heavily on both its mapper and cost models to generate optimized mappings for various operators of DNN models. However, existing cost models lack a formal boundary over their input programs (operators) for ...

research-article
Open Access
Joint Program and Layout Transformations to Enable Convolutional Operators on Specialized Hardware Based on Constraint Programming
Article No.: 7, Pages 1–26, https://doi.org/10.1145/3487922

The success of Deep Artificial Neural Networks (DNNs) in many domains created a rich body of research concerned with hardware accelerators for compute-intensive DNN operators. However, implementing such operators efficiently with complex hardware ...

research-article
Open Access
SecNVM: An Efficient and Write-Friendly Metadata Crash Consistency Scheme for Secure NVM
Article No.: 8, Pages 1–26, https://doi.org/10.1145/3488724

Data security is an indispensable part of non-volatile memory (NVM) systems. However, implementing data security efficiently on NVM is challenging, since we have to guarantee the consistency of user data and the related security metadata. Existing ...

research-article
Open Access
TLB-pilot: Mitigating TLB Contention Attack on GPUs with Microarchitecture-Aware Scheduling
Article No.: 9, Pages 1–23, https://doi.org/10.1145/3491218

Co-running GPU kernels on a single GPU can provide high system throughput and improve hardware utilization, but this raises concerns about application security. We reveal that the translation lookaside buffer (TLB) attack, one of the common attacks on CPUs, can ...

research-article
Open Access
HeapCheck: Low-cost Hardware Support for Memory Safety
Article No.: 10, Pages 1–24, https://doi.org/10.1145/3495152

Programs written in C/C++ are vulnerable to memory-safety errors such as buffer overflows and use-after-free. While several mechanisms to detect such errors have been previously proposed, they suffer from a variety of drawbacks, including poor performance, ...

research-article
Open Access
Task-RM: A Resource Manager for Energy Reduction in Task-Parallel Applications under Quality of Service Constraints
Article No.: 11, Pages 1–26, https://doi.org/10.1145/3494537

Improving energy efficiency is an important goal of computer system design. This article focuses on a general model of task-parallel applications under quality-of-service requirements on the completion time. Our technique, called Task-RM, exploits the ...

research-article
Open Access
CASHT: Contention Analysis in Shared Hierarchies with Thefts
Article No.: 12, Pages 1–27, https://doi.org/10.1145/3494538

Cache management policies should consider workloads’ contention behavior when managing a shared cache. Prior art makes estimates about shared cache behavior by adding extra logic or time to isolate per-workload cache statistics. These approaches provide ...

research-article
Open Access
Optimizing Small-Sample Disk Fault Detection Based on LSTM-GAN Model
Article No.: 13, Pages 1–24, https://doi.org/10.1145/3500917

In recent years, research on disk fault detection that combines SMART data with different machine learning algorithms has proven effective. However, these methods require a large amount of data. In the early stages of the establishment ...

research-article
Open Access
E-BATCH: Energy-Efficient and High-Throughput RNN Batching
Article No.: 14, Pages 1–23, https://doi.org/10.1145/3499757

Recurrent Neural Network (RNN) inference exhibits low hardware utilization due to the strict data dependencies across time-steps. Batching multiple requests can increase throughput. However, RNN batching requires a large amount of padding since the ...

research-article
Open Access
CARL: Compiler Assigned Reference Leasing
Article No.: 15, Pages 1–28, https://doi.org/10.1145/3498730

Data movement is a common performance bottleneck, and its chief remedy is caching. Traditional cache management is transparent to the workload: data that should be kept in cache are determined by the recency information only, while the program information,...
