ITCO: Vol 73, No 8

Volume 73, Issue 8, Aug. 2024

Publisher:

IEEE Computer Society
1730 Massachusetts Ave., NW Washington, DC
United States

ISSN:0018-9340


research-article

Multi-Objective Hardware-Mapping Co-Optimisation for Multi-DNN Workloads on Chiplet-Based Accelerators

Pages 1883–1898 https://doi.org/10.1109/TC.2024.3386067

The need to efficiently execute different Deep Neural Networks (DNNs) on the same computing platform, coupled with the requirement for easy scalability, makes Multi-Chip Module (MCM)-based accelerators a preferred design choice. Such an accelerator brings ...

research-article

TensorMap: A Deep RL-Based Tensor Mapping Framework for Spatial Accelerators

Pages 1899–1912 https://doi.org/10.1109/TC.2024.3398424

The mapping of tensor computation is a complex and important process for spatial accelerators. Today's mapping works depend on hand-tuned kernel libraries or search-based heuristics from human experts. The former is time-intensive while the latter ...

research-article

Uniformity and Independence of H3 Hash Functions for Bloom Filters

Pages 1913–1923 https://doi.org/10.1109/TC.2024.3398426

In this paper, we investigate the effects of violating the conditions of hash function uniformity and/or independence on the false positive probability of Bloom Filters (BF). To this end, we focus on hash functions of the H3 family with a partitioned ...

research-article

CoDA: A Co-Design Framework for Versatile and Efficient Attention Accelerators

Pages 1924–1938 https://doi.org/10.1109/TC.2024.3398488

As a primary component of Transformers, the attention mechanism suffers from quadratic computational complexity. To achieve efficient implementations, its hardware accelerator designs have aroused great research interest. However, most existing accelerators ...

research-article

Relieving Write Disturbance for Phase Change Memory With RESET-Aware Data Encoding

Pages 1939–1952 https://doi.org/10.1109/TC.2024.3398490

The write disturbance (WD) problem is becoming increasingly severe in PCM due to the continuous scaling down of memory technology. Previous studies have attempted to transform WD-vulnerable data patterns of the new data to alleviate the WD problem. ...

research-article

Enhancing Neural Network Reliability: Insights From Hardware/Software Collaboration With Neuron Vulnerability Quantization

Pages 1953–1966 https://doi.org/10.1109/TC.2024.3398492

Ensuring the reliability of deep neural networks (DNNs) is paramount in safety-critical applications. Although introducing supplementary fault-tolerant mechanisms can augment the reliability of DNNs, an efficiency tradeoff may be introduced. This study ...

research-article

Ada-WL: An Adaptive Wear-Leveling Aware Data Migration Approach for Flexible SSD Array Scaling in Clusters

Pages 1967–1982 https://doi.org/10.1109/TC.2024.3398493

Recently, the flash-based Solid State Drive (SSD) array has been widely implemented in real-world large-scale clusters. With the increasing number of users in upper-tier applications and the burst of Input/Output requests in this data explosive era, data ...

research-article

Effective Huge Page Strategies for TLB Miss Reduction in Nested Virtualization

Pages 1983–1996 https://doi.org/10.1109/TC.2024.3398498

Huge page strategies, such as Linux Transparent Huge Page (THP), have become a prevalent solution to mitigate the performance bottleneck caused by increasingly high memory address translation overhead. However, in cloud environments, virtualization ...

research-article

ReDas: A Lightweight Architecture for Supporting Fine-Grained Reshaping and Multiple Dataflows on Systolic Array

Pages 1997–2011 https://doi.org/10.1109/TC.2024.3398500

The systolic accelerator is one of the premier architectural choices for DNN acceleration. However, the conventional systolic architecture suffers from low PE utilization due to the mismatch between the fixed array and diverse DNN workloads. Recent ...

research-article

Revocable and Efficient Blockchain-Based Fine-Grained Access Control Against EDoS Attacks in Cloud Storage

Pages 2012–2024 https://doi.org/10.1109/TC.2024.3398502

Users have become accustomed to storing data on the cloud using ciphertext policy attribute-based encryption (CP-ABE) for fine-grained access control. However, this encryption method does not consider the ability of malicious users to launch thousands of ...

research-article

AQA: An Adaptive Post-Training Quantization Method for Activations of CNNs

Pages 2025–2035 https://doi.org/10.1109/TC.2024.3398503

Post-training quantization (PTQ) is a common technique to improve the efficiency of embedded neural network accelerators. Existing PTQ schemes for CNN activations usually rely on a calibration dataset with good data representation to reduce ...

research-article

Toward Finding S-Box Circuits With Optimal Multiplicative Complexity

Pages 2036–2050 https://doi.org/10.1109/TC.2024.3398507

In this paper, we present a new method to find S-box circuits with optimal multiplicative complexity (MC), i.e., MC-optimal S-box circuits. We provide new observations for efficiently constructing circuits and computing MC, combined with a popular ...

research-article

FutureDID: A Fully Decentralized Identity System With Multi-Party Verification

Pages 2051–2065 https://doi.org/10.1109/TC.2024.3398509

Decentralized identity (DID) systems conforming to the World Wide Web Consortium (W3C) Decentralized Identifiers (DIDs) and Verifiable Credentials Data Model recommendations have recently attracted attention due to their better autonomy, interoperability, ...

research-article

A Machine Learning-Empowered Cache Management Scheme for High-Performance SSDs

Pages 2066–2080 https://doi.org/10.1109/TC.2024.3404064

NAND Flash-based solid-state drives (SSDs) have gained widespread usage in data storage thanks to their exceptional performance and low power consumption. The computational capability of SSDs has been elevated to tackle complex algorithms. Inside an SSD, ...

research-article

DPU-Direct: Unleashing Remote Accelerators via Enhanced RDMA for Disaggregated Datacenters

Pages 2081–2095 https://doi.org/10.1109/TC.2024.3404089

This paper presents DPU-Direct, an accelerator disaggregation system that connects accelerator nodes (ANs) and CPU nodes (CNs) over a standard Remote Direct Memory Access (RDMA) network. DPU-Direct eliminates the latency introduced by the CPU-based ...

research-article

BSR-FL: An Efficient Byzantine-Robust Privacy-Preserving Federated Learning Framework

Pages 2096–2110 https://doi.org/10.1109/TC.2024.3404102

Federated learning (FL) is a technique that enables clients to collaboratively train a model by sharing local models instead of raw private data. However, existing reconstruction attacks can recover the sensitive training samples from the shared models. ...

research-article

BlockCompass: A Benchmarking Platform for Blockchain Performance

Pages 2111–2122 https://doi.org/10.1109/TC.2024.3404103

Blockchain technology has gained momentum due to its immutability and transparency. Several blockchain platforms, each with different consensus protocols, have been proposed. However, choosing and configuring such a platform is a non-trivial task. ...

IEEE Transactions on Computers
