- research-article, December 2023
ReFOCUS: Reusing Light for Efficient Fourier Optics-Based Photonic Neural Network Accelerator
MICRO '23: Proceedings of the 56th Annual IEEE/ACM International Symposium on Microarchitecture, October 2023, Pages 569–583. https://doi.org/10.1145/3613424.3623798
In recent years, there has been a significant focus on achieving low-latency and high-throughput convolutional neural network (CNN) inference. Integrated photonics offers the potential to substantially expedite neural networks due to its inherent low-...
- research-article, July 2023
Accelerating Deformable Convolution Networks with Dynamic and Irregular Memory Accesses
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 28, Issue 4, Article No.: 67, Pages 1–23. https://doi.org/10.1145/3597431
Deformable convolution networks (DCNs), proposed to address image recognition with geometric or photometric variations, typically involve deformable convolution that convolves on arbitrary locations of input features. The locations change with different ...
- research-article, January 2023
GShuttle: Optimizing Memory Access Efficiency for Graph Convolutional Neural Network Accelerators
Journal of Computer Science and Technology (JCST), Volume 38, Issue 1, Feb 2023, Pages 115–127. https://doi.org/10.1007/s11390-023-2875-9
Graph convolutional neural networks (GCNs) have emerged as an effective approach to extending deep learning for graph data analytics, but they are computationally challenging given the irregular graphs and the large number of nodes in a graph. ...
- research-article, May 2023
Hardware Efficiency Stochastic Computing based on Hybrid Spatial Coding
NANOARCH '22: Proceedings of the 17th ACM International Symposium on Nanoscale Architectures, December 2022, Article No.: 23, Pages 1–6. https://doi.org/10.1145/3565478.3572535
As the era of silicon-based microchips approaches physical limits under Moore's Law, new computational paradigms have been proposed for future systems, e.g., stochastic computing. However, current stochastic computing faces the challenge of high ...
- research-article, December 2022
Squeezing Accumulators in Binary Neural Networks for Extremely Resource-Constrained Applications
ICCAD '22: Proceedings of the 41st IEEE/ACM International Conference on Computer-Aided Design, October 2022, Article No.: 141, Pages 1–7. https://doi.org/10.1145/3508352.3549418
The cost and power consumption of BNN (Binarized Neural Network) hardware is dominated by additions. In particular, accumulators account for a large fraction of hardware overhead, which could be effectively reduced by using reduced-width accumulators. ...
- research-article, May 2022
TCX: a programmable tensor processor
DATE '22: Proceedings of the 2022 Conference & Exhibition on Design, Automation & Test in Europe, March 2022, Pages 1023–1028
Neural network processors and accelerators are domain-specific architectures deployed to meet the high computational requirements of deep learning algorithms. This paper proposes a new instruction set extension for tensor computing, TCX, with RISC-...
- research-article, February 2022
N3H-Core: Neuron-designed Neural Network Accelerator via FPGA-based Heterogeneous Computing Cores
FPGA '22: Proceedings of the 2022 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, February 2022, Pages 112–122. https://doi.org/10.1145/3490422.3502367
Accelerating neural network inference with FPGAs has emerged as a popular option, since the reconfigurability and high-performance computing capability of FPGAs intrinsically satisfy the computation demands of fast-evolving neural algorithms. ...
- research-article, October 2021
PointAcc: Efficient Point Cloud Accelerator
MICRO '21: MICRO-54: 54th Annual IEEE/ACM International Symposium on Microarchitecture, October 2021, Pages 449–461. https://doi.org/10.1145/3466752.3480084
Deep learning on point clouds plays a vital role in a wide range of applications such as autonomous driving and AR/VR. These applications interact with people in real time on edge devices and thus require low latency and low energy. Compared to ...
- research-article, October 2020
Core Placement Optimization for Multi-chip Many-core Neural Network Systems with Reinforcement Learning
ACM Transactions on Design Automation of Electronic Systems (TODAES), Volume 26, Issue 2, Article No.: 11, Pages 1–27. https://doi.org/10.1145/3418498
Multi-chip many-core neural network systems are capable of providing high parallelism, benefiting from decentralized execution, and they can be scaled to very large systems with reasonable fabrication costs. As multi-chip many-core systems scale up, ...
- research-article, November 2020
Exploration of design space and runtime optimization for affective computing in machine learning empowered ultra-low power SoC
DAC '20: Proceedings of the 57th ACM/EDAC/IEEE Design Automation Conference, July 2020, Article No.: 95, Pages 1–6
The incorporation of artificial intelligence into the rapidly growing population of IoT devices demands a high level of built-in intelligence, e.g., machine learning capability at the device level. Affective computing brings a new degree of cognitive intelligence to ...