
ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars

Published: 18 June 2016
  Abstract

    A number of recent efforts have attempted to design accelerators for popular machine learning algorithms, such as those involving convolutional and deep neural networks (CNNs and DNNs). These algorithms typically involve a large number of multiply-accumulate (dot-product) operations. A recent project, DaDianNao, adopts a near data processing approach, where a specialized neural functional unit performs all the digital arithmetic operations and receives input weights from adjacent eDRAM banks.
    This work explores an in-situ processing approach, where memristor crossbar arrays not only store input weights, but are also used to perform dot-product operations in an analog manner. While the use of crossbar memory as an analog dot-product engine is well known, no prior work has designed or characterized a full-fledged accelerator based on crossbars. In particular, our work makes the following contributions: (i) We design a pipelined architecture, with some crossbars dedicated for each neural network layer, and eDRAM buffers that aggregate data between pipeline stages. (ii) We define new data encoding techniques that are amenable to analog computations and that can reduce the high overheads of analog-to-digital conversion (ADC). (iii) We define the many supporting digital components required in an analog CNN accelerator and carry out a design space exploration to identify the best balance of memristor storage/compute, ADCs, and eDRAM storage on a chip. On a suite of CNN and DNN workloads, the proposed ISAAC architecture yields improvements of 14.8×, 5.5×, and 7.5× in throughput, energy, and computational density (respectively), relative to the state-of-the-art DaDianNao architecture.
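    To make the in-situ dot-product idea concrete, the short Python sketch below (written for this summary, not taken from the paper) emulates an idealized memristor crossbar: the weight matrix is stored as cell conductances, inputs are applied to the word lines one bit position per cycle, each bit line accumulates the voltage-conductance products as a current (the value an ADC would digitize), and shift-and-add merges the per-cycle partial sums. The 16-bit inputs, noiseless cells, and single-bit row drivers are illustrative assumptions rather than the paper's exact configuration; bit-serial input with shift-and-add is shown here only as one standard way to keep per-cycle ADC resolution low.

    import numpy as np

    def crossbar_dot_product(weights, inputs, input_bits=16):
        """Bit-serial matrix-vector multiply on an idealized (noise-free) crossbar."""
        result = np.zeros(weights.shape[1], dtype=np.int64)
        for b in range(input_bits):
            # One cycle: drive each row with the b-th bit of its input (1-bit DAC).
            bit_slice = (inputs >> b) & 1
            # Each bit-line current is the sum of voltage x conductance down its
            # column (Kirchhoff's current law); this is what the ADC digitizes.
            column_currents = bit_slice @ weights
            # Shift-and-add combines the digitized partial sums across cycles.
            result += column_currents.astype(np.int64) << b
        return result

    # Sanity check: the bit-serial analog emulation matches a digital dot product.
    rng = np.random.default_rng(0)
    W = rng.integers(0, 4, size=(128, 8))    # small multi-bit cell "conductances"
    x = rng.integers(0, 2**16, size=128)     # 16-bit activations
    assert (crossbar_dot_product(W, x) == x @ W).all()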

    Published In

    ACM SIGARCH Computer Architecture News, Volume 44, Issue 3 (ISCA'16)
    June 2016, 730 pages
    ISSN: 0163-5964
    DOI: 10.1145/3007787
    • ISCA '16: Proceedings of the 43rd International Symposium on Computer Architecture
      June 2016, 756 pages
      ISBN: 9781467389471
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 18 June 2016
    Published in SIGARCH Volume 44, Issue 3


    Author Tags

    1. CNN
    2. DNN
    3. accelerator
    4. analog
    5. memristor
    6. neural

    Qualifiers

    • Research-article

    Article Metrics

    • Downloads (Last 12 months): 1,332
    • Downloads (Last 6 weeks): 128
    Reflects downloads up to 10 Aug 2024

    Cited By

    • (2024) AFPR-CIM: An Analog-Domain Floating-Point RRAM-based Compute-In-Memory Architecture with Dynamic Range Adaptive FP-ADC. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546882. Online: 25-Mar-2024.
    • (2024) Algorithm-Hardware Co-Design for Energy-Efficient A/D Conversion in ReRAM-Based Accelerators. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546760. Online: 25-Mar-2024.
    • (2024) Bit-Trimmer: Ineffectual Bit-Operation Removal for CLM Architecture. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546755. Online: 25-Mar-2024.
    • (2024) High-Performance Data Mapping for BNNs on PCM-Based Integrated Photonics. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-6. DOI: 10.23919/DATE58400.2024.10546687. Online: 25-Mar-2024.
    • (2024) Pipeline Design of Nonvolatile-based Computing in Memory for Convolutional Neural Networks Inference Accelerators. 2024 Design, Automation & Test in Europe Conference & Exhibition (DATE), pp. 1-2. DOI: 10.23919/DATE58400.2024.10546661. Online: 25-Mar-2024.
    • (2024) A general yet accurate approach for energy-efficient processing-in-memory architecture computations. SCIENTIA SINICA Informationis 54(8), p. 1827. DOI: 10.1360/SSI-2023-0345. Online: 7-Aug-2024.
    • (2024) IMI: In-memory Multi-job Inference Acceleration for Large Language Models. Proceedings of the 53rd International Conference on Parallel Processing, pp. 752-761. DOI: 10.1145/3673038.3673053. Online: 12-Aug-2024.
    • (2024) Benchmarking Test-Time DNN Adaptation at Edge with Compute-In-Memory. ACM Journal on Autonomous Transportation Systems 1(3), pp. 1-26. DOI: 10.1145/3665898. Online: 27-May-2024.
    • (2024) TEFLON: Thermally Efficient Dataflow-aware 3D NoC for Accelerating CNN Inferencing on Manycore PIM Architectures. ACM Transactions on Embedded Computing Systems 23(5), pp. 1-23. DOI: 10.1145/3665279. Online: 14-Aug-2024.
    • (2024) P-ReTI: Silicon Photonic Accelerator for Greener and Real-Time AI. Proceedings of the Great Lakes Symposium on VLSI 2024, pp. 766-769. DOI: 10.1145/3649476.3660376. Online: 12-Jun-2024.
