Volume 18, Issue 4, December 2021
Publisher: Association for Computing Machinery, New York, NY, United States
ISSN: 1544-3566
EISSN: 1544-3973
research-article
Open Access
Towards Enhanced System Efficiency while Mitigating Row Hammer
Article No.: 40, Pages 1–26
https://doi.org/10.1145/3458749

In recent years, DRAM-based main memories have become susceptible to the Row Hammer (RH) problem, which causes bits to flip in a row without accessing it directly. Frequent activation of a row, called an aggressor row, causes its adjacent rows’ (...

research-article
Open Access
All-gather Algorithms Resilient to Imbalanced Process Arrival Patterns
Article No.: 41, Pages 1–22
https://doi.org/10.1145/3460122

Two novel algorithms for the all-gather operation resilient to imbalanced process arrival patterns (PATs) are presented. The first one, Background Disseminated Ring (BDR), is based on the regular parallel ring algorithm often supplied in MPI ...

research-article
Open Access
Configurable Multi-directional Systolic Array Architecture for Convolutional Neural Networks
Article No.: 42, Pages 1–24
https://doi.org/10.1145/3460776

The systolic array architecture is one of the most popular choices for convolutional neural network hardware accelerators. The biggest advantage of the systolic array architecture is its simple and efficient design principle. Without complicated control ...

research-article
Open Access
SLO-Aware Inference Scheduler for Heterogeneous Processors in Edge Platforms
Article No.: 43, Pages 1–26
https://doi.org/10.1145/3460352

With the proliferation of applications with machine learning (ML), the importance of edge platforms has been growing to process streaming sensor data locally without resorting to remote servers. Such edge platforms are commonly equipped with ...

research-article
Open Access
Gem5-X: A Many-core Heterogeneous Simulation Platform for Architectural Exploration and Optimization
Article No.: 44, Pages 1–27
https://doi.org/10.1145/3461662

The increasing adoption of smart systems in our daily life has led to the development of new applications with varying performance and energy constraints, and suitable computing architectures need to be developed for these new applications. In this ...

research-article
Open Access
PICO: A Presburger In-bounds Check Optimization for Compiler-based Memory Safety Instrumentations
Article No.: 45, Pages 1–27
https://doi.org/10.1145/3460434

Memory safety violations such as buffer overflows are a threat to security to this day. A common solution to ensure memory safety for C is code instrumentation. However, this often causes high execution-time overhead and is therefore rarely used in ...

research-article
Open Access
Low I/O Intensity-aware Partial GC Scheduling to Reduce Long-tail Latency in SSDs
Article No.: 46, Pages 1–25
https://doi.org/10.1145/3460433

This article proposes a low I/O intensity-aware scheduling scheme on garbage collection (GC) in SSDs for minimizing the I/O long-tail latency to ensure I/O responsiveness. The basic idea is to assemble partial GC operations by referring to several ...

research-article
Open Access
Low-precision Logarithmic Number Systems: Beyond Base-2
Article No.: 47, Pages 1–25
https://doi.org/10.1145/3461699

Logarithmic number systems (LNS) are used to represent real numbers in many applications using a constant base raised to a fixed-point exponent making its distribution exponential. This greatly simplifies hardware multiply, divide, and square root. LNS ...

research-article
Open Access
Monolithically Integrating Non-Volatile Main Memory over the Last-Level Cache
Article No.: 48, Pages 1–26
https://doi.org/10.1145/3462632

Many emerging non-volatile memories are compatible with CMOS logic, potentially enabling their integration into a CPU’s die. This article investigates such monolithically integrated CPU–main memory chips. We exploit non-volatile memories employing 3D ...

research-article
Open Access
Byte-Select Compression
Article No.: 49, Pages 1–27
https://doi.org/10.1145/3462209

Cache-block compression is a highly effective technique for both reducing accesses to lower levels in the memory hierarchy (cache compression) and minimizing data transfers (link compression). While many effective cache-block compression algorithms have ...

research-article
Open Access
CIB-HIER: Centralized Input Buffer Design in Hierarchical High-radix Routers
Article No.: 50, Pages 1–21
https://doi.org/10.1145/3468062

Hierarchical organization is widely used in high-radix routers to enable efficient scaling to higher switch port count. A general-purpose hierarchical router must be symmetrically designed with the same input buffer depth, resulting in a large amount of ...

research-article
Open Access
Domain-Specific Multi-Level IR Rewriting for GPU: The Open Earth Compiler for GPU-accelerated Climate Simulation
Article No.: 51, Pages 1–23
https://doi.org/10.1145/3469030

Most compilers have a single core intermediate representation (IR) (e.g., LLVM) sometimes complemented with vaguely defined IR-like data structures. This IR is commonly low-level and close to machine instructions. As a result, optimizations relying on ...

research-article
Open Access
System-level Early-stage Modeling and Evaluation of IVR-assisted Processor Power Delivery System
Article No.: 52, Pages 1–27
https://doi.org/10.1145/3468145

Despite being employed in numerous efforts to improve power delivery efficiency, the integrated voltage regulator (IVR) approach has yet to be evaluated rigorously and quantitatively in a full power delivery system (PDS) setting. To fulfill this need, ...

research-article
Open Access
GraphAttack: Optimizing Data Supply for Graph Applications on In-Order Multicore Architectures
Article No.: 53, Pages 1–26
https://doi.org/10.1145/3469846

Graph structures are a natural representation of important and pervasive data. While graph applications have significant parallelism, their characteristic pointer indirect loads to neighbor data hinder scalability to large datasets on multicore systems. ...

research-article
Open Access
Scenario-Aware Program Specialization for Timing Predictability
Article No.: 54, Pages 1–26
https://doi.org/10.1145/3473333

The successful application of static program analysis strongly depends on flow facts of a program such as loop bounds, control-flow constraints, and operating modes. This problem heavily affects the design of real-time systems, since static program ...

research-article
Open Access
WaFFLe: Gated Cache-Ways with Per-Core Fine-Grained DVFS for Reduced On-Chip Temperature and Leakage Consumption
Article No.: 55, Pages 1–25
https://doi.org/10.1145/3471908

Managing thermal imbalance in contemporary chip multi-processors (CMPs) is crucial in assuring functional correctness of modern mobile as well as server systems. Localized regions with high activity, e.g., register files, ALUs, FPUs, and so on, ...

research-article
Open Access
SortCache: Intelligent Cache Management for Accelerating Sparse Data Workloads
Article No.: 56, Pages 1–24
https://doi.org/10.1145/3473332

Sparse data applications have irregular access patterns that stymie modern memory architectures. Although hyper-sparse workloads have received considerable attention in the past, moderately-sparse workloads prevalent in machine learning applications, ...

research-article
Open Access
Device Hopping: Transparent Mid-Kernel Runtime Switching for Heterogeneous Systems
Article No.: 57, Pages 1–25
https://doi.org/10.1145/3471909

Existing OS techniques for homogeneous many-core systems make it simple for single and multithreaded applications to migrate between cores. Heterogeneous systems do not benefit so fully from this flexibility, and applications that cannot migrate in mid-...

research-article
Open Access
LargeGraph: An Efficient Dependency-Aware GPU-Accelerated Large-Scale Graph Processing
Article No.: 58, Pages 1–24
https://doi.org/10.1145/3477603

Many out-of-GPU-memory systems have recently been designed to support iterative processing of large-scale graphs. However, these systems still suffer from long convergence times because of inefficient propagation of active vertices’ new states along graph ...

research-article
Open Access
Spiking Neural Networks in Spintronic Computational RAM
Article No.: 59, Pages 1–21
https://doi.org/10.1145/3475963

Spiking Neural Networks (SNNs) represent a biologically inspired computation model capable of emulating neural computation in the human brain and brain-like structures. The main promise is very low energy consumption. Classic von Neumann architecture based ...
