
ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars

Published: 18 June 2016
  Abstract

    A number of recent efforts have attempted to design accelerators for popular machine learning algorithms, such as those involving convolutional and deep neural networks (CNNs and DNNs). These algorithms typically involve a large number of multiply-accumulate (dot-product) operations. A recent project, DaDianNao, adopts a near data processing approach, where a specialized neural functional unit performs all the digital arithmetic operations and receives input weights from adjacent eDRAM banks.
    This work explores an in-situ processing approach, where memristor crossbar arrays not only store input weights, but are also used to perform dot-product operations in an analog manner. While the use of crossbar memory as an analog dot-product engine is well known, no prior work has designed or characterized a full-fledged accelerator based on crossbars. In particular, our work makes the following contributions: (i) We design a pipelined architecture, with some crossbars dedicated for each neural network layer, and eDRAM buffers that aggregate data between pipeline stages. (ii) We define new data encoding techniques that are amenable to analog computations and that can reduce the high overheads of analog-to-digital conversion (ADC). (iii) We define the many supporting digital components required in an analog CNN accelerator and carry out a design space exploration to identify the best balance of memristor storage/compute, ADCs, and eDRAM storage on a chip. On a suite of CNN and DNN workloads, the proposed ISAAC architecture yields improvements of 14.8×, 5.5×, and 7.5× in throughput, energy, and computational density (respectively), relative to the state-of-the-art DaDianNao architecture.
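    To make the in-situ dot-product idea concrete, the short Python sketch below (written for this summary, not taken from the paper) emulates an idealized memristor crossbar: the weight matrix is stored as cell conductances, inputs are applied to the word lines one bit position per cycle, each bit line accumulates the voltage-conductance products as a current (the value an ADC would digitize), and shift-and-add merges the per-cycle partial sums. The 16-bit inputs, noiseless cells, and single-bit row drivers are illustrative assumptions rather than the paper's exact configuration; bit-serial input with shift-and-add is shown here only as one standard way to keep per-cycle ADC resolution low.

    import numpy as np

    def crossbar_dot_product(weights, inputs, input_bits=16):
        """Bit-serial matrix-vector multiply on an idealized (noise-free) crossbar."""
        result = np.zeros(weights.shape[1], dtype=np.int64)
        for b in range(input_bits):
            # One cycle: drive each row with the b-th bit of its input (1-bit DAC).
            bit_slice = (inputs >> b) & 1
            # Each bit-line current is the sum of voltage x conductance down its
            # column (Kirchhoff's current law); this is what the ADC digitizes.
            column_currents = bit_slice @ weights
            # Shift-and-add combines the digitized partial sums across cycles.
            result += column_currents.astype(np.int64) << b
        return result

    # Sanity check: the bit-serial analog emulation matches a digital dot product.
    rng = np.random.default_rng(0)
    W = rng.integers(0, 4, size=(128, 8))    # small multi-bit cell "conductances"
    x = rng.integers(0, 2**16, size=128)     # 16-bit activations
    assert (crossbar_dot_product(W, x) == x @ W).all()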

    Published In

    ACM SIGARCH Computer Architecture News, Volume 44, Issue 3 (ISCA'16)
    June 2016, 730 pages
    ISSN: 0163-5964
    DOI: 10.1145/3007787
    • ISCA '16: Proceedings of the 43rd International Symposium on Computer Architecture
      June 2016, 756 pages
      ISBN: 9781467389471
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 June 2016
    Published in SIGARCH Volume 44, Issue 3


    Author Tags

    1. CNN
    2. DNN
    3. accelerator
    4. analog
    5. memristor
    6. neural

    Qualifiers

    • Research-article

    Article Metrics

    • Downloads (Last 12 months): 1,332
    • Downloads (Last 6 weeks): 128
    Reflects downloads up to 10 Aug 2024

    Cited By

    • (2024) AFPR-CIM: An Analog-Domain Floating-Point RRAM-based Compute-In-Memory Architecture with Dynamic Range Adaptive FP-ADC. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546882. Online: 25-Mar-2024.
    • (2024) Algorithm-Hardware Co-Design for Energy-Efficient A/D Conversion in ReRAM-Based Accelerators. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546760. Online: 25-Mar-2024.
    • (2024) Bit-Trimmer: Ineffectual Bit-Operation Removal for CLM Architecture. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546755. Online: 25-Mar-2024.
    • (2024) High-Performance Data Mapping for BNNs on PCM-Based Integrated Photonics. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546687. Online: 25-Mar-2024.
    • (2024) Pipeline Design of Nonvolatile-based Computing in Memory for Convolutional Neural Networks Inference Accelerators. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-2. DOI: 10.23919/DATE58400.2024.10546661. Online: 25-Mar-2024.
    • (2024) A general yet accurate approach for energy-efficient processing-in-memory architecture computations. SCIENTIA SINICA Informationis 54(8), p. 1827. DOI: 10.1360/SSI-2023-0345. Online: 7-Aug-2024.
    • (2024) IMI: In-memory Multi-job Inference Acceleration for Large Language Models. Proceedings of the 53rd International Conference on Parallel Processing, pp. 752-761. DOI: 10.1145/3673038.3673053. Online: 12-Aug-2024.
    • (2024) Benchmarking Test-Time DNN Adaptation at Edge with Compute-In-Memory. ACM Journal on Autonomous Transportation Systems 1(3), pp. 1-26. DOI: 10.1145/3665898. Online: 27-May-2024.
    • (2024) TEFLON: Thermally Efficient Dataflow-aware 3D NoC for Accelerating CNN Inferencing on Manycore PIM Architectures. ACM Transactions on Embedded Computing Systems 23(5), pp. 1-23. DOI: 10.1145/3665279. Online: 14-Aug-2024.
    • (2024) P-ReTI: Silicon Photonic Accelerator for Greener and Real-Time AI. Proceedings of the Great Lakes Symposium on VLSI 2024, pp. 766-769. DOI: 10.1145/3649476.3660376. Online: 12-Jun-2024.
