- research-article, December 2024
A Cascaded ReRAM-based Crossbar Architecture for Transformer Neural Network Acceleration
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 30, Issue 1, Article No.: 10, Pages 1–23
https://doi.org/10.1145/3701034
Emerging resistive random-access memory (ReRAM) based processing-in-memory (PIM) accelerators have been increasingly explored in recent years because they can efficiently perform in-situ matrix-vector multiplication (MVM) operations involved in a wide ...
- short-paper, December 2024
PIMSys: A Virtual Prototype for Processing in Memory
MEMSYS '24: Proceedings of the International Symposium on Memory Systems, Pages 26–33
https://doi.org/10.1145/3695794.3695797
Data-driven applications are increasingly central to our information technology society, propelled by AI techniques reshaping various sectors of our economy. Despite their transformative potential, these applications demand immense data processing, ...
- research-article, September 2024
ReHarvest: An ADC Resource-Harvesting Crossbar Architecture for ReRAM-Based DNN Accelerators
- Jiahong Xu,
- Haikun Liu,
- Zhuohui Duan,
- Xiaofei Liao,
- Hai Jin,
- Xiaokang Yang,
- Huize Li,
- Cong Liu,
- Fubing Mao,
- Yu Zhang
ACM Transactions on Architecture and Code Optimization (TACO), Volume 21, Issue 3, Article No.: 63, Pages 1–26
https://doi.org/10.1145/3659208
ReRAM-based Processing-In-Memory (PIM) architectures have been increasingly explored to accelerate various Deep Neural Network (DNN) applications because they can achieve extremely high performance and energy-efficiency for in-situ analog Matrix-Vector ...
- research-article, September 2024
Static Scheduling of Weight Programming for DNN Acceleration with Resource Constrained PIM
ACM Transactions on Embedded Computing Systems (TECS), Volume 23, Issue 6, Article No.: 89, Pages 1–22
https://doi.org/10.1145/3615657
Most existing architectural studies on ReRAM-based processing-in-memory (PIM) DNN accelerators assume that all weights of the DNN can be mapped to the crossbar at once. However, these studies are over-idealized. ReRAM crossbar resources for calculation ...
- research-article, September 2024
LowPASS: A Low power PIM-based accelerator with Speculative Scheme for SNNs
ISLPED '24: Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, Pages 1–6
https://doi.org/10.1145/3665314.3672279
Spiking neural networks (SNNs) are considered as energy-efficient alternatives to deep neural networks (DNNs). By adopting event-driven information processing, SNNs can significantly reduce the computational demands associated with DNNs, while still ...
- research-article, September 2024
HeTraX: Energy Efficient 3D Heterogeneous Manycore Architecture for Transformer Acceleration
ISLPED '24: Proceedings of the 29th ACM/IEEE International Symposium on Low Power Electronics and Design, Pages 1–6
https://doi.org/10.1145/3665314.3670814
Transformers have revolutionized deep learning and generative modeling to enable unprecedented advancements in natural language processing tasks and beyond. However, designing hardware accelerators for executing transformer models is challenging due to ...
- research-article, November 2024
Towards Efficient SRAM-PIM Architecture Design by Exploiting Unstructured Bit-Level Sparsity
- Cenlin Duan,
- Jianlei Yang,
- Yiou Wang,
- Yikun Wang,
- Yingjie Qi,
- Xiaolin He,
- Bonan Yan,
- Xueyan Wang,
- Xiaotao Jia,
- Weisheng Zhao
DAC '24: Proceedings of the 61st ACM/IEEE Design Automation Conference, Article No.: 209, Pages 1–6
https://doi.org/10.1145/3649329.3655948
Bit-level sparsity in neural network models harbors immense untapped potential. Eliminating redundant calculations of randomly distributed zero-bits significantly boosts computational efficiency. Yet, traditional digital SRAM-PIM architecture, limited by ...
- research-article, November 2023
Fusing In-storage and Near-storage Acceleration of Convolutional Neural Networks
- Ikenna Okafor,
- Akshay Krishna Ramanathan,
- Nagadastagiri Reddy Challapalle,
- Zheyu Li,
- Vijaykrishnan Narayanan
ACM Journal on Emerging Technologies in Computing Systems (JETC), Volume 20, Issue 1, Article No.: 1, Pages 1–22
https://doi.org/10.1145/3597496
Video analytics has a wide range of applications and has attracted much interest over the years. While it can be both computationally and energy-intensive, video analytics can greatly benefit from in/near memory compute. The practice of moving compute ...
- research-article, November 2023
MAGIC-DHT: Fast in-memory computing for Discrete Hadamard Transform
Abstract: Discrete Hadamard transform (DHT) is a signal processing tool that decomposes an arbitrary input vector into a superposition of Walsh functions. Due to its wide range of applications in processing big data, a fast and energy-efficient hardware ...
Highlights:
- Hadamard Transform.
- Memristor.
- ReRAM.
- PIM.
- Parallel Computing.
- research-article, November 2023
Design and implementation of a real-time onboard system for a stratospheric balloon mission using commercial off-the-self components and a model-based approach
Computers and Electrical Engineering (CENG), Volume 111, Issue PB
https://doi.org/10.1016/j.compeleceng.2023.108953
Highlights:
- Onboard software (OBSW) of HERCCULES mission that characterize the convective heat and radiative environment in the stratosphere.
- Commercial off-the-shelf hardware and software using a Raspberry Pi 4B as an OBC with Linux OS and the ...
Stratospheric balloons have emerged as an affordable and flexible alternative to traditional spacecrafts as they are implemented using commercial off-the-shelf (COTS) equipment without following strict methodologies. HERCCULES is a stratospheric ...
- research-article, April 2023
Two-scale concurrent topology optimization method of constrained layer damping structure subjected to non-uniform blast load
Structural and Multidisciplinary Optimization (SPSMO), Volume 66, Issue 5
https://doi.org/10.1007/s00158-023-03554-4
Abstract: The constrained layer damping structure is used to resist the impact damage from blast load and furthermore the two-scale topology optimization of the damping layer can deduce the additional mass of micro damping material while satisfying the ...
- research-article, December 2022
Hidden-ROM: A Compute-in-ROM Architecture to Deploy Large-Scale Neural Networks on Chip with Flexible and Scalable Post-Fabrication Task Transfer Capability
ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, Article No.: 45, Pages 1–9
https://doi.org/10.1145/3508352.3549335
Motivated by reducing the data transfer activities in data-intensive neural network computing, SRAM-based compute-in-memory (CiM) has made significant progress. Unfortunately, SRAM has low density and limited on-chip capacity. This makes the deployment ...
- research-article, December 2023
FracDRAM: Fractional Values in Off-the-Shelf DRAM
MICRO '22: Proceedings of the 55th Annual IEEE/ACM International Symposium on Microarchitecture, Pages 885–899
https://doi.org/10.1109/MICRO56248.2022.00066
As one of the cornerstones of computing, dynamic random-access memory (DRAM) is prevalent across digital systems. Over the years, researchers have proposed modifications to DRAM macros or explored alternative uses of existing DRAM chips to extend the ...
- rfc, October 2022
RFC 9128: YANG Data Model for Protocol Independent Multicast (PIM)
This document defines a YANG data model that can be used to configure and manage devices supporting Protocol Independent Multicast (PIM). The model covers the PIM protocol configuration, operational state, and event notifications data.
- review-article, August 2022
A survey on hardware accelerators: Taxonomy, trends, challenges, and perspectives
Journal of Systems Architecture: the EUROMICRO Journal (JOSA), Volume 129, Issue C
https://doi.org/10.1016/j.sysarc.2022.102561
Abstract: In recent years, the limits of the multicore approach emerged in the so-called “dark silicon” issue and diminishing returns of an ever-increasing core count. Hardware manufacturers, out of necessity, switched their focus to ...
- research-article, August 2022
Novel Fault-Tolerant Processing in Memory Cell in Ternary Quantum-Dot Cellular Automata
Journal of Electronic Testing: Theory and Applications (JELT), Volume 38, Issue 4, Pages 419–444
https://doi.org/10.1007/s10836-022-06018-7
Abstract: Processing-in-memory (PIM) is a computational architecture in which the processing unit and memory are integrated into a single unit. Different technologies and methods can be used to implement PIM, but a more optimal design for PIM can be ...
- research-article, August 2022
ReGNN: a ReRAM-based heterogeneous architecture for general graph neural networks
DAC '22: Proceedings of the 59th ACM/IEEE Design Automation Conference, Pages 469–474
https://doi.org/10.1145/3489517.3530479
Graph Neural Networks (GNNs) have both graph processing and neural network computational features. Traditional graph accelerators and NN accelerators cannot meet these dual characteristics of GNN applications simultaneously. In this work, we propose a ...
- Article, July 2022
A Critical Assessment of DRAM-PIM Architectures - Trends, Challenges and Solutions
Embedded Computer Systems: Architectures, Modeling, and Simulation, Pages 362–379
https://doi.org/10.1007/978-3-031-15074-6_23
Abstract: Recently, we are witnessing a surge in DRAM-based Processing in Memory (PIM) publications from academia and industry. The architectures and design techniques proposed in these publications vary largely, ranging from integration of computation ...
- research-article, June 2022
Gearbox: a case for supporting accumulation dispatching and hybrid partitioning in PIM-based accelerators
ISCA '22: Proceedings of the 49th Annual International Symposium on Computer Architecture, Pages 218–230
https://doi.org/10.1145/3470496.3527402
Processing-in-memory (PIM) minimizes data movement overheads by placing processing units near each memory segment. Recent PIMs employ processing units with a SIMD architecture. However, kernels with random accesses, such as sparse-matrix-dense-vector (...
- short-paper, June 2022
Bulk JPEG decoding on in-memory processors
SYSTOR '22: Proceedings of the 15th ACM International Conference on Systems and Storage, Pages 51–57
https://doi.org/10.1145/3534056.3534946
JPEG is a common encoding format for digital images. Applications that process large numbers of images can be accelerated by decoding multiple images concurrently. We examine the suitability of using a large array of in-memory processors (PIM) to obtain ...