Design of a High-Efficiency Sequential Load Modulated Balanced Amplifier Based on Multiple Multiobjective Bayesian Optimization
In this article, an overall optimization strategy for power amplifiers (PAs) based on a Bayesian algorithm is proposed to perform multiple multiobjective optimization design of a sequential load modulated balanced amplifier (SLMBA). Specifically, by combining ...
Optimization of High-Efficiency GaN Load Modulated Balanced Amplifier for Integrated Sensing and Communication Applications
The novelty of this article is an optimization strategy for designing a high-efficiency load modulated balanced amplifier (LMBA) for integrated sensing and communication applications. Taking into account the operating characteristics, theoretical ...
Constructive Place-and-Route for FinFET-Based Transistor Arrays in Analog Circuits Under Nonlinear Gradients
- Arvind K. Sharma,
- Meghna Madhusudan,
- Steven M. Burns,
- Soner Yaldiz,
- Parijat Mukherjee,
- Ramesh Harjani,
- Sachin S. Sapatnekar
The design of active array structures in analog circuits requires careful matching to minimize the impact of variations. This work presents a constructive approach for building these arrays to directly incorporate shifts due to process variations, ...
Microwave Network-Assisted Analysis and Machine Learning-Assisted Synthesis of Arbitrarily Tapped Coils and Its Application to On-Chip Ultrawideband ESD Protection Circuits
Since the data rates in advanced chips dramatically increase, the electrostatic-discharge (ESD) protection circuit at I/O ports causes significant ultrawideband challenges. To address these challenges, an arbitrarily tapped coil (AT-coil) structure with ...
Multiagent Based Reinforcement Learning (MA-RL): An Automated Designer for Complex Analog Circuits
Despite efforts in analog circuit design automation, complex analog circuit design still requires extensive manual iterations, making it labor-intensive and time-consuming. Recently, reinforcement learning (RL) algorithms have been ...
Timing Error Tolerant CNN Accelerator With Layerwise Approximate Multiplication
Exploiting the error tolerance in computation, approximate circuits become an emerging computing paradigm to increase the energy efficiency in digital systems, which is crucial in high-performance and low-power systems for the Edge Internet of Things (...
VirSoC: Automatic Synthesis of Virtual System-on-Chip Environments
Modern system-on-chip (SoC) functionalities include significant software interacting closely with low-level hardware to realize system functionalities. This software is developed concurrently with the hardware and must be validated before the hardware is ...
Beware Your Standard Cells! On Their Role in Static Power Side-Channel Attacks
Static or leakage power, which is especially prominent in advanced technology nodes, enables so-called static power side-channel attacks (S-PSCAs). While countermeasures exist, they often incur considerable overheads. Besides, hardware Trojans represent ...
PUF-Kyber: Design of a PUF-Based Kyber Architecture Benchmarked on Diverse ARM Processors
It is well-studied that quantum computing breaks the security of the current worldwide implemented public key cryptosystems. This forces us toward post quantum cryptography (PQC) whose security remains solid even against adversaries having access to ...
CPU Address-Leakage Transient Execution Attack Detection and Its Countermeasures
Modern advanced CPU designs are frequently exposed to transient execution vulnerabilities, which allow attackers to harness microarchitectural side effects for data exfiltration. The leaked data may encompass direct target data, such as RSA keys or ...
Smaller Together: Groupwise Encoding of Sparse Neural Networks
With the drive toward ever more intelligent devices, neural networks (NNs) are deployed on smaller and smaller systems. For these embedded microcontrollers, memory consumption becomes a significant challenge. We propose multiple encoding schemes that ...
CoFS: A Collaboration-Aware Fairness Scheme for NVMe SSD in Cloud Storage System
Since cloud service providers adopt the sharing mode to improve the utilization of solid state drives (SSDs) and reduce management costs, fairness is a critical design consideration and has drawn great interest in recent years. There are many methods to ...
Improving Dependability of Distributed Real-Time Applications via Safety and Security Co-Design
With the increasing deployment in mission-critical domains, it is of foremost importance to improve the dependability of distributed real-time applications for cyber–physical systems (CPSs) with safety- and security-critical threats. Different from ...
Longer Is Shorter: Making Long Paths to Improve the Worst-Case Response Time of DAG Tasks
Directed acyclic graph (DAG) tasks are widely used to model parallel real-time workloads. The real-time performance of a DAG task depends not only on its total workload but also on its graph structure. Intuitively, with the same total workload, a DAG task ...
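As a rough illustration of how both total workload and graph structure enter a response-time estimate, the sketch below computes the classical Graham-style bound R <= L + (W - L)/m, where W is the total workload, L is the longest path length, and m is the number of cores. This is a well-known textbook bound and not the method of the article above; the function name `graham_bound` and the dictionary-based task encoding are assumptions made for this example.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def graham_bound(wcet, preds, m):
    """Classical bound L + (W - L) / m for a DAG task on m identical cores.

    wcet:  dict mapping each node to its worst-case execution time
    preds: dict mapping a node to the set of its predecessors (hypothetical encoding)
    """
    # Make sure every node appears in the graph, even if it has no predecessors.
    graph = {v: preds.get(v, ()) for v in wcet}
    finish = {}
    # static_order() yields each node after all of its predecessors.
    for v in TopologicalSorter(graph).static_order():
        finish[v] = wcet[v] + max((finish[p] for p in graph[v]), default=0)
    longest = max(finish.values())   # L: longest path length
    total = sum(wcet.values())       # W: total workload
    return longest + (total - longest) / m


# Diamond-shaped DAG: a -> {b, c} -> d
wcet = {"a": 1, "b": 2, "c": 3, "d": 1}
preds = {"b": {"a"}, "c": {"a"}, "d": {"b", "c"}}
bound = graham_bound(wcet, preds, m=2)
```

Two tasks with the same W but different L yield different bounds, which is the dependence on graph structure that the abstract refers to.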
Domino-Pro-Max: Toward Efficient Network Simplification and Reparameterization for Embedded Hardware Systems
The prohibitive complexity of convolutional neural networks (CNNs) has triggered an increasing demand for network simplification. To this end, one natural solution is to remove the redundant channels or layers to explore simplified network structures. ...
TVTAC: Triple Voltage Threshold Approximate Cache for Energy Harvesting Nonvolatile Processors
Energy harvesting is considered to be a substitute for batteries in many modern systems. Systems based on energy harvesting receive environmental energy from sources such as sun, radio frequency, wind, vibration, etc., and convert them to electrical ...
Block-Wise Mixed-Precision Quantization: Enabling High Efficiency for Practical ReRAM-Based DNN Accelerators
- Xueying Wu,
- Edward Hanson,
- Nansu Wang,
- Qilin Zheng,
- Xiaoxuan Yang,
- Huanrui Yang,
- Shiyu Li,
- Feng Cheng,
- Partha Pratim Pande,
- Janardhan Rao Doppa,
- Krishnendu Chakrabarty,
- Hai Li
Resistive random access memory (ReRAM)-based processing-in-memory (PIM) architectures have demonstrated great potential to accelerate deep neural network (DNN) training/inference. However, the computational accuracy of analog PIM is compromised due to ...
An Ising Model-Based Parallel Tempering Processing Architecture for Combinatorial Optimization
Combinatorial optimization problems (COPs) are prevalent in various domains and present formidable challenges for modern computers. Searching for the ground state of the Ising model emerges as a promising approach to solve these problems. Recent studies ...
PipePIM: Maximizing Computing Unit Utilization in ML-Oriented Digital PIM by Pipelining and Dual Buffering
A digital processing-in-memory (PIM) that integrates computing units (CUs) with DRAM banks emerges as a promising technique for accelerating matrix–vector multiplication (MV). However, activating and precharging all banks incur significant ...
ReD: A Reliable and Deadlock-Free Routing for 2.5-D Chiplet-Based Interposer Networks
2.5-D integration offers a cost-effective and reliable solution for implementing large-scale modular systems. A 2.5-D chiplet system can be designed by connecting smaller chiplets through an interposer, where the chiplets may have heterogeneous ...
A Multichiplet Computing-in-Memory Architecture Exploration Framework Based on Various CIM Devices
- Zhuoyu Dai,
- Feibin Xiang,
- Xiangqu Fu,
- Yifan He,
- Wenyu Sun,
- Yongpan Liu,
- Guanhua Yang,
- Feng Zhang,
- Jinshan Yue,
- Ling Li
Computing-in-memory (CIM) architectures based on various devices, such as resistive random access memory, SRAM, DRAM, etc., have demonstrated promising energy efficiency. Single-device-based CIM chips show different advantages on performance, power, or ...
Area-Efficient Barrett Modular Multiplication With Optimized Karatsuba Algorithm
This article presents an area-efficient Barrett modular multiplication (BMM) algorithm, facilitating the development of cryptosystems like fully homomorphic encryption. Instead of implementing three normal multiplications required by classic BMM, our ...
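For context, the three multiplications of classic Barrett modular multiplication mentioned above can be sketched as follows. This is a minimal textbook version, not the article's area-efficient or Karatsuba-based design; the helper names `barrett_setup` and `barrett_mulmod` are assumptions made for this example.

```python
def barrett_setup(n):
    """Precompute k = bit length of n and mu = floor(4**k / n) for modulus n."""
    k = n.bit_length()
    mu = (1 << (2 * k)) // n
    return k, mu


def barrett_mulmod(a, b, n, k, mu):
    """Return (a * b) mod n for 0 <= a, b < n using classic Barrett reduction."""
    assert 0 <= a < n and 0 <= b < n
    x = a * b                    # multiplication 1: full product
    q = (x * mu) >> (2 * k)      # multiplication 2: quotient estimate (never too large)
    r = x - q * n                # multiplication 3: subtract the estimated multiple of n
    while r >= n:                # a few correction steps make the result exact
        r -= n
    return r


n = 97
k, mu = barrett_setup(n)
result = barrett_mulmod(45, 76, n, k, mu)
```

The division by n happens only once, in `barrett_setup`; each modular multiplication afterwards uses only multiplications, a shift, and subtractions.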
A Comprehensive Dataflow-Mapping Optimization for Fully Pipelined Execution in Spatial Programmable Architecture
Although spatial programmable architectures have demonstrated high performance and programmability for a variety of applications, they suffer from the pipeline unbalancing issue, which restricts resource utilization and degrades performance. In this ...
A Fully Pipelined Reconfigurable Montgomery Modular Multiplier Supporting Variable Bit-Widths
Recently, there has been increased emphasis on privacy-preserving computation technologies, such as homomorphic encryption (HE) and zero-knowledge proof (ZKP). Modular multiplication is a critical component for both HE and ZKP. Variable bit-width is a ...
DAG-Aware Synthesis Orchestration
Modern logic synthesis techniques use multilevel technology-independent representations like and-inverter graphs (AIGs) for digital logic. This involves structural rewriting, resubstitution, and refactoring based on directed acyclic graphs (DAGs) ...
9-Input Threshold Function Identification Using a New Necessary Condition of Threshold Function
Identification of a threshold function (TF) is a significant task that determines whether a given Boolean function is a TF or not. The state-of-the-art only identifies all 8-input NP-class TFs. In this article, we propose a new necessary condition for a ...
ROVER: RTL Optimization via Verified E-Graph Rewriting
Manual register transfer level (RTL) design and optimization remains prevalent across the semiconductor industry because commercial logic and high-level synthesis tools are unable to match human designs. Our experience in industrial datapath design ...
Automatic Mapping of Heterogeneous DNN Models on Adaptive Multiaccelerator Systems
As DNNs are developing rapidly, the computational and memory burden imposed on hardware systems grows exponentially. This becomes even more severe for large language models (LLMs) and multimodal models. As a promising solution that achieves high ...
Switching Activity Factor-Based ECSM Characterization (SAFE): A Novel Technique for Aging-Aware Static Timing Analysis
- Lomash Chandra Acharya,
- Arvind Sharma,
- Neeraj Mishra,
- Khoirom Johnson Singh,
- Mahipal Dargupally,
- Neha Gupta,
- Nayakanti Sai Shabarish,
- Ajoy Mandal,
- Venkatraman Ramakrishnan,
- Sudeb Dasgupta,
- Anand Bulusu
We propose switching activity factor-based effective current source model (SAFE) for aging-aware static timing analysis (STA), a new technique for estimating the timing performance of digital circuits. SAFE is based on the development of device-level ...
FS-TRA: Evaluating Sequential Circuit Reliability via a Fanout-Source Tracking and Reduction Approach
The input vector-oriented reliability estimation of sequential circuits plays an important role in predicting their reliability boundaries and identifying their reliability-critical gates. This article presents an input vector-oriented programmable method ...