Showing 1–14 of 14 results for author: Gokmen, T

Searching in archive cs. Search in all archives.
  1. arXiv:2406.12774  [pdf, other]

    cs.LG cs.AR math.OC

    Towards Exact Gradient-based Training on Analog In-memory Computing

    Authors: Zhaoxian Wu, Tayfun Gokmen, Malte J. Rasch, Tianyi Chen

    Abstract: Given the high economic and environmental costs of using large vision or language models, analog in-memory accelerators present a promising solution for energy-efficient AI. While inference on analog accelerators has been studied recently, the training perspective is underexplored. Recent studies have shown that the "workhorse" of digital AI training - stochastic gradient descent (SGD) algorithm c…

    Submitted 18 June, 2024; originally announced June 2024.

    Comments: 10 pages, 5 figures, 2 tables

  2. arXiv:2401.13754  [pdf, other]

    math.NA cs.ET

    Multi-Function Multi-Way Analog Technology for Sustainable Machine Intelligence Computation

    Authors: Vassilis Kalantzis, Mark S. Squillante, Shashanka Ubaru, Tayfun Gokmen, Chai Wah Wu, Anshul Gupta, Haim Avron, Tomasz Nowicki, Malte Rasch, Murat Onen, Vanessa Lopez Marrero, Effendi Leobandung, Yasuteru Kohda, Wilfried Haensch, Lior Horesh

    Abstract: Numerical computation is essential to many areas of artificial intelligence (AI), whose computing demands continue to grow dramatically, yet their continued scaling is jeopardized by the slowdown in Moore's law. Multi-function multi-way analog (MFMWA) technology, a computing architecture comprising arrays of memristors supporting in-memory computation of matrix operations, can offer tremendous imp…

    Submitted 24 January, 2024; originally announced January 2024.

    MSC Class: 65F10; C3; G1

    ACM Class: G.1.3

  3. Fast offset corrected in-memory training

    Authors: Malte J. Rasch, Fabio Carta, Omebayode Fagbohungbe, Tayfun Gokmen

    Abstract: In-memory computing with resistive crossbar arrays has been suggested to accelerate deep-learning workloads in highly efficient manner. To unleash the full potential of in-memory computing, it is desirable to accelerate the training as well as inference for large deep neural networks (DNNs). In the past, specialized in-memory training algorithms have been proposed that not only accelerate the forw…

    Submitted 8 March, 2023; originally announced March 2023.

    Comments: 14 pages, 10 figures

  4. arXiv:2201.13377  [pdf]

    cs.LG cs.ET eess.SY

    Neural Network Training with Asymmetric Crosspoint Elements

    Authors: Murat Onen, Tayfun Gokmen, Teodor K. Todorov, Tomasz Nowicki, Jesus A. del Alamo, John Rozen, Wilfried Haensch, Seyoung Kim

    Abstract: Analog crossbar arrays comprising programmable nonvolatile resistors are under intense investigation for acceleration of deep neural network training. However, the ubiquitous asymmetric conductance modulation of practical resistive devices critically degrades the classification performance of networks trained with conventional algorithms. Here, we describe and experimentally demonstrate an alterna…

    Submitted 31 January, 2022; originally announced January 2022.

  5. A flexible and fast PyTorch toolkit for simulating training and inference on analog crossbar arrays

    Authors: Malte J. Rasch, Diego Moreda, Tayfun Gokmen, Manuel Le Gallo, Fabio Carta, Cindy Goldberg, Kaoutar El Maghraoui, Abu Sebastian, Vijay Narayanan

    Abstract: We introduce the IBM Analog Hardware Acceleration Kit, a new and first of a kind open source toolkit to simulate analog crossbar arrays in a convenient fashion from within PyTorch (freely available at https://github.com/IBM/aihwkit). The toolkit is under active development and is centered around the concept of an "analog tile" which captures the computations performed on a crossbar array. Analog t…

    Submitted 5 April, 2021; originally announced April 2021.

    Comments: Submitted to AICAS2021

  6. arXiv:1909.07908  [pdf]

    cs.LG cs.ET cs.NE stat.ML

    Algorithm for Training Neural Networks on Resistive Device Arrays

    Authors: Tayfun Gokmen, Wilfried Haensch

    Abstract: Hardware architectures composed of resistive cross-point device arrays can provide significant power and speed benefits for deep neural network training workloads using stochastic gradient descent (SGD) and backpropagation (BP) algorithm. The training accuracy on this imminent analog hardware however strongly depends on the switching characteristics of the cross-point elements. One of the key requ…

    Submitted 17 September, 2019; originally announced September 2019.

    Comments: 26 pages, 7 figures

  7. arXiv:1907.10228  [pdf]

    cs.ET cs.NE

    Zero-shifting Technique for Deep Neural Network Training on Resistive Cross-point Arrays

    Authors: Hyungjun Kim, Malte Rasch, Tayfun Gokmen, Takashi Ando, Hiroyuki Miyazoe, Jae-Joon Kim, John Rozen, Seyoung Kim

    Abstract: A resistive memory device-based computing architecture is one of the promising platforms for energy-efficient Deep Neural Network (DNN) training accelerators. The key technical challenge in realizing such accelerators is to accumulate the gradient information without a bias. Unlike the digital numbers in software which can be assigned and accessed with desired accuracy, numbers stored in resistive…

    Submitted 2 August, 2019; v1 submitted 24 July, 2019; originally announced July 2019.

  8. Design and Characterization of Superconducting Nanowire-Based Processors for Acceleration of Deep Neural Network Training

    Authors: Murat Onen, Brenden A. Butters, Emily Toomey, Tayfun Gokmen, Karl K. Berggren

    Abstract: Training of deep neural networks (DNNs) is a computationally intensive task and requires massive volumes of data transfer. Performing these operations with the conventional von Neumann architectures creates unmanageable time and power costs. Recent studies have shown that mixed-signal designs involving crossbar architectures are capable of achieving acceleration factors as high as 30,000x over the…

    Submitted 5 July, 2019; originally announced July 2019.

  9. arXiv:1906.02698  [pdf, ps, other]

    cs.NE cs.ET cs.LG

    Training large-scale ANNs on simulated resistive crossbar arrays

    Authors: Malte J. Rasch, Tayfun Gokmen, Wilfried Haensch

    Abstract: Accelerating training of artificial neural networks (ANN) with analog resistive crossbar arrays is a promising idea. While the concept has been verified on very small ANNs and toy data sets (such as MNIST), more realistically sized ANNs and datasets have not yet been tackled. However, it is to be expected that device materials and hardware design constraints, such as noisy computations, finite num…

    Submitted 6 June, 2019; originally announced June 2019.

  10. arXiv:1807.01356  [pdf, ps, other]

    cs.ET cs.LG stat.ML

    Efficient ConvNets for Analog Arrays

    Authors: Malte J. Rasch, Tayfun Gokmen, Mattia Rigotti, Wilfried Haensch

    Abstract: Analog arrays are a promising upcoming hardware technology with the potential to drastically speed up deep learning. Their main advantage is that they compute matrix-vector products in constant time, irrespective of the size of the matrix. However, early convolution layers in ConvNets map very unfavorably onto analog arrays, because kernel matrices are typically small and the constant time operati…

    Submitted 3 July, 2018; originally announced July 2018.

  11. arXiv:1806.00166  [pdf]

    cs.LG cs.ET stat.ML

    Training LSTM Networks with Resistive Cross-Point Devices

    Authors: Tayfun Gokmen, Malte Rasch, Wilfried Haensch

    Abstract: In our previous work we have shown that resistive cross point devices, so called Resistive Processing Unit (RPU) devices, can provide significant power and speed benefits when training deep fully connected networks as well as convolutional neural networks. In this work, we further extend the RPU concept for training recurrent neural networks (RNNs) namely LSTMs. We show that the mapping of recurre…

    Submitted 31 May, 2018; originally announced June 2018.

    Comments: 17 pages, 5 figures

  12. Analog CMOS-based Resistive Processing Unit for Deep Neural Network Training

    Authors: Seyoung Kim, Tayfun Gokmen, Hyung-Min Lee, Wilfried E. Haensch

    Abstract: Recently we have shown that an architecture based on resistive processing unit (RPU) devices has potential to achieve significant acceleration in deep neural network (DNN) training compared to today's software-based DNN implementations running on CPU/GPU. However, currently available device candidates based on non-volatile memory technologies do not satisfy all the requirements to realize the RPU…

    Submitted 20 June, 2017; originally announced June 2017.

  13. arXiv:1705.08014  [pdf]

    cs.LG cs.NE stat.ML

    Training Deep Convolutional Neural Networks with Resistive Cross-Point Devices

    Authors: Tayfun Gokmen, O. Murat Onen, Wilfried Haensch

    Abstract: In a previous work we have detailed the requirements to obtain a maximal performance benefit by implementing fully connected deep neural networks (DNN) in form of arrays of resistive devices for deep learning. This concept of Resistive Processing Unit (RPU) devices we extend here towards convolutional neural networks (CNNs). We show how to map the convolutional layers to RPU arrays such that the p…

    Submitted 22 May, 2017; originally announced May 2017.

    Comments: 22 pages, 6 figures, 2 tables

  14. arXiv:1603.07341  [pdf]

    cs.LG cs.NE stat.ML

    Acceleration of Deep Neural Network Training with Resistive Cross-Point Devices

    Authors: Tayfun Gokmen, Yurii Vlasov

    Abstract: In recent years, deep neural networks (DNN) have demonstrated significant business impact in large scale analysis and classification tasks such as speech recognition, visual object detection, pattern extraction, etc. Training of large DNNs, however, is universally considered as time consuming and computationally intensive task that demands datacenter-scale computational resources recruited for man…

    Submitted 23 March, 2016; originally announced March 2016.

    Comments: 19 pages, 5 figures, 2 tables

    Journal ref: Front. Neurosci. 10, 333 (2016)