Abstract
Many advanced artificial intelligence tasks, such as policy optimization, decision making and autonomous navigation, demand high-bandwidth data transfer and probabilistic computing, posing great challenges for conventional computing hardware. Digital computers based on the von Neumann architecture excel at precise, deterministic computation, but their efficiency is limited by the high cost of both data transfer between memory and computing units and massive random number generation. Here we develop a stochastic computation-in-memory system that can efficiently perform both in situ random number generation and computation based on the nanoscale physical behaviour of memristors. The system is built on a hardware-implemented multiple-memristor-array platform. To demonstrate its functionality and efficiency, we implement a typical risk-sensitive reinforcement learning task, the storm coast task, with a four-layer Bayesian deep neural network. The system efficiently decomposes aleatoric and epistemic uncertainties by exploiting the inherent stochasticity of memristors. Compared with a conventional digital computer, our memristor-based system achieves 10-fold higher speed and 150-fold higher energy efficiency in uncertainty decomposition. This stochastic computation-in-memory system paves the way for high-speed, energy-efficient implementations of a wide range of probabilistic artificial intelligence algorithms.
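The decomposition of predictive uncertainty into aleatoric and epistemic parts, as performed by the system above, follows the standard law-of-total-variance treatment for Bayesian neural networks. As a rough software illustration only (not the paper's hardware implementation; variable names are hypothetical), the two components can be separated from Monte Carlo weight samples of a regression model:

```python
import numpy as np

def decompose_uncertainty(mean_samples, var_samples):
    """Law-of-total-variance decomposition for a regression BNN.

    mean_samples, var_samples: arrays of shape (n_weight_samples,),
    holding the predictive mean and variance produced by each sampled
    weight configuration (illustrative inputs, not from the paper).
    """
    aleatoric = np.mean(var_samples)   # expected data noise
    epistemic = np.var(mean_samples)   # spread caused by weight uncertainty
    total = aleatoric + epistemic      # total predictive variance
    return total, aleatoric, epistemic

# Toy example with 360 weight samples (the sampling count used in the paper)
rng = np.random.default_rng(0)
means = rng.normal(1.0, 0.2, size=360)  # model disagreement -> epistemic
vars_ = np.full(360, 0.5)               # constant data noise -> aleatoric
total, alea, epi = decompose_uncertainty(means, vars_)
```

In hardware, the weight samples come from the intrinsic read noise of the memristor arrays rather than from a software random number generator.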
Data availability
The data supporting the plots within this paper and other data supporting the findings in this study are available in a Zenodo repository (ref. 41).
Code availability
The codes used for the simulations described in Methods are available in a Zenodo repository (ref. 41) and a GitHub repository (https://github.com/YudengLin/memristorBDNN). The code that supports the communication between the custom-built ESCIM system and the integrated chip is available from the corresponding author on reasonable request.
References
Chouard, T. & Venema, L. Machine intelligence. Nature 521, 435 (2015).
Duan, Y., Edwards, J. S. & Dwivedi, Y. K. Artificial intelligence for decision making in the era of Big Data – evolution, challenges and research agenda. Int. J. Inf. Manage. 48, 63–71 (2019).
Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature 521, 452–459 (2015).
Abdar, M. et al. A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf. Fusion 76, 243–297 (2021).
Wang, H. & Yeung, D.-Y. Towards Bayesian deep learning: a framework and some existing methods. IEEE Trans. Knowl. Data Eng. 28, 3395–3408 (2016).
Michelmore, R. et al. Uncertainty quantification with statistical guarantees in end-to-end autonomous driving control. In 2020 IEEE International Conference on Robotics and Automation (ICRA) 7344–7350 (IEEE, 2020).
McAllister, R. et al. Concrete problems for autonomous vehicle safety: advantages of Bayesian deep learning. In Proc. 26th International Joint Conference on Artificial Intelligence (IJCAI) 4745–4753 (Elsevier, 2017).
Ticknor, J. L. A Bayesian regularized artificial neural network for stock market forecasting. Expert Syst. Appl. 40, 5501–5506 (2013).
Jang, H. & Lee, J. Generative Bayesian neural network model for risk-neutral pricing of American index options. Quant. Finance 19, 587–603 (2019).
Begoli, E., Bhattacharya, T. & Kusnezov, D. The need for uncertainty quantification in machine-assisted medical decision making. Nat. Mach. Intell. 1, 20–23 (2019).
Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Hüllermeier, E. & Waegeman, W. Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods. Mach. Learn. 110, 457–506 (2021).
Depeweg, S., Hernandez-Lobato, J.-M., Doshi-Velez, F. & Udluft, S. Decomposition of uncertainty in Bayesian deep learning for efficient and risk-sensitive learning. In Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 1184–1193 (PMLR, 2018).
Coates, A. et al. Deep learning with COTS HPC systems. In Proc. 30th International Conference on Machine Learning (eds Dasgupta, S. & McAllester, D.) 1337–1345 (PMLR, 2013).
Jouppi, N. P. et al. In-datacenter performance analysis of a tensor processing unit. In Proc. 44th Annual International Symposium on Computer Architecture 1–12 (ACM, 2017).
Horowitz, M. 1.1 Computing's energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC) 10–14 (IEEE, 2014).
Thomas, D. B., Howes, L. & Luk, W. A comparison of CPUs, GPUs, FPGAs, and massively parallel processor arrays for random number generation. In Proc. ACM/SIGDA International Symposium on Field Programmable Gate Arrays 63–72 (ACM, 2009).
Askar, T., Shukirgaliyev, B., Lukac, M. & Abdikamalov, E. Evaluation of pseudo-random number generation on GPU cards. Computation 9, 142 (2021).
Thomas, D. B., Luk, W., Leong, P. H. W. & Villasenor, J. D. Gaussian random number generators. ACM Comput. Surv. 39, 11 (2007).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).
Prezioso, M. et al. Training and operation of an integrated neuromorphic network based on metal-oxide memristors. Nature 521, 61–64 (2015).
Lin, Y. et al. Demonstration of generative adversarial network by intrinsic random noises of analog RRAM devices. In 2018 IEEE International Electron Devices Meeting (IEDM) 3.4.1–3.4.4 (IEEE, 2018).
Gao, L., Chen, P.-Y. & Yu, S. Demonstration of convolution kernel operation on resistive cross-point array. IEEE Electron Device Lett. 37, 870–873 (2016).
Yu, S. Neuro-inspired computing with emerging nonvolatile memorys. Proc. IEEE 106, 260–285 (2018).
Burr, G. W. et al. Neuromorphic computing using non-volatile memory. Adv. Phys. X 3, 89–124 (2017).
Dalgaty, T. et al. In situ learning using intrinsic memristor variability via Markov chain Monte Carlo sampling. Nat. Electron. 4, 151–161 (2021).
Dalgaty, T., Esmanhotto, E., Castellani, N., Querlioz, D. & Vianello, E. Ex situ transfer of Bayesian neural networks to resistive memory-based inference hardware. Adv. Intell. Syst. 3, 2000103 (2021).
Balatti, S., Ambrogio, S., Wang, Z. & Ielmini, D. True random number generation by variability of resistive switching in oxide-based devices. IEEE J. Emerg. Select. Top. Circuits Syst. 5, 214–221 (2015).
Vodenicarevic, D. et al. Low-energy truly random number generation with superparamagnetic tunnel junctions for unconventional computing. Phys. Rev. Appl. 8, 054045 (2017).
Kim, G. et al. Self-clocking fast and variation tolerant true random number generator based on a stochastic Mott memristor. Nat. Commun. 12, 2906 (2021).
Jiang, H. et al. A novel true random number generator based on a stochastic diffusive memristor. Nat. Commun. 8, 882 (2017).
Lin, B. et al. A high-performance and calibration-free true random number generator based on the resistance perturbation in RRAM array. In 2020 IEEE International Electron Devices Meeting (IEDM) 38.6.1–38.6.4 (IEEE, 2020).
Wu, W. et al. Improving analog switching in HfOx-based resistive memory with a thermal enhanced layer. IEEE Electron Device Lett. 38, 1019–1022 (2017).
Chen, J. et al. A parallel multibit programing scheme with high precision for RRAM-based neuromorphic systems. IEEE Trans. Electron Devices 67, 2213–2217 (2020).
Puglisi, F. M., Pavan, P. & Larcher, L. Random telegraph noise in HfOx Resistive Random Access Memory: from physics to compact modeling. In 2016 IEEE International Reliability Physics Symposium (IRPS) MY-8-1–MY-8-5 (IEEE, 2016).
Ambrogio, S. et al. Statistical fluctuations in HfOx resistive-switching memory: part II – random telegraph noise. IEEE Trans. Electron Devices 61, 2920–2927 (2014).
Blundell, C., Cornebise, J., Kavukcuoglu, K. & Wierstra, D. Weight uncertainty in neural network. In Proc. 32nd International Conference on Machine Learning (eds Bach, F. & Blei, D.) 1613–1622 (PMLR, 2015).
Depeweg, S., Hernández-Lobato, J. M., Doshi-Velez, F. & Udluft, S. Learning and policy search in stochastic dynamical systems with Bayesian neural networks. In 5th International Conference on Learning Representations 1–14 (ICLR, 2017).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).
Lin, Y. YudengLin/memristorBDNN: uncertainty quantification via a memristor Bayesian deep neural network for risk-sensitive reinforcement learning. Zenodo https://doi.org/10.5281/zenodo.7947059 (2023).
Acknowledgements
This work was supported in part by the STI 2030-Major Projects (2021ZD0201200), the National Natural Science Foundation of China (92064001, 62025111 and 61974081), the XPLORER Prize, the Shanghai Municipal Science and Technology Major Project and the Beijing Advanced Innovation Center for Integrated Circuits.
Author information
Authors and Affiliations
Contributions
Y. Lin and Q.Z. contributed to the overall experimental design. B.G., J.Z. and H.W. supervised this project and proposed the overall architecture. Y. Lin, P.Y., Y.Z. and Y. Liu benchmarked the system performance. Z.L., C.L., W.Z. and S.H. helped with the simulations and data analysis. Y. Lin and B.G. contributed to writing and editing the manuscript. All authors examined the results and reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Read noise distribution in the different current states.
a–f, Read-noise distributions at the six target current states shown in the figures. The read-noise value of a memristor in a given current state is the difference between its read current and its average current. We use a block mapping method to program each cell to the target current state within an error margin of \(\Delta I=\pm 0.3\) μA. For each target state, 2800 memristor cells are programmed, and each cell is measured over 1000 read cycles. All read-noise distributions at the target current states can be fitted with a double-exponential distribution (solid lines).
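The double-exponential (Laplace) fit described in the caption can be reproduced in software. A minimal sketch, assuming the read-noise samples are available as a NumPy array; the sample count matches the 2800 cells per state, but the scale value here is illustrative, not a measured figure:

```python
import numpy as np

def fit_laplace(noise):
    """Maximum-likelihood fit of a Laplace (double-exponential) distribution:
    the location is the sample median, the scale is the mean absolute
    deviation from that median."""
    mu = np.median(noise)
    b = np.mean(np.abs(noise - mu))
    return mu, b

def laplace_pdf(x, mu, b):
    # Density of the fitted double-exponential curve (the solid line)
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)

# Synthetic stand-in for measured read noise: 2800 cells, arbitrary 0.05 scale
rng = np.random.default_rng(42)
samples = rng.laplace(loc=0.0, scale=0.05, size=2800)
mu, b = fit_laplace(samples)
```

The fitted `mu` and `b` parametrize the solid-line curve overlaid on each histogram.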
Extended Data Fig. 2 Functional block diagram of the ESCIM system.
The core board with a field-programmable gate array (FPGA) acts as the lower-level controller: it communicates with the host computer and generates the control signals for memristor operations. The TIA&ADC board and each DAC board provide 64 current-quantization channels and 64 voltage-supply channels, respectively. The eight socket boards carrying 4K memristor chips can be connected in parallel to the DUT board. The motherboard, which contains voltage- and digital-signal-conversion circuits, interconnects the other boards.
Extended Data Fig. 3 Structure of the memristor BDNN and deployment on eight 4K memristor chips in the ESCIM system.
a, Structure of the memristor BDNN. Each layer input is quantized to 8 bits. The activation functions in the hidden layers are rectifier functions, that is, ReLU(x) = max(x, 0), and those in the output layer are identity functions, that is, Linear(x) = x. The bias input of each layer is not shown. The memristor matrices contain 1800 (6 × 100 × 3), 30300 (101 × 100 × 3) and 606 (101 × 2 × 3) cells, respectively. b, The memristor matrices are mapped onto eight 4K memristor chips. The chips are filled sequentially by column, and each matrix can start in a new column.
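As a purely illustrative sketch of the layer shapes described in panel a (6 inputs, two hidden layers of 100 units, 2 outputs, ReLU hidden activations and a linear output layer), with random placeholder weights rather than the trained, memristor-encoded parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
# Placeholder weights and biases matching the stated layer dimensions;
# biases account for the "+1" row in the 101-row memristor matrices.
W1, b1 = rng.normal(size=(6, 100)) * 0.1, np.zeros(100)
W2, b2 = rng.normal(size=(100, 100)) * 0.1, np.zeros(100)
W3, b3 = rng.normal(size=(100, 2)) * 0.1, np.zeros(2)

def relu(x):
    # Rectifier activation used in the hidden layers
    return np.maximum(x, 0)

def forward(x):
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    return h2 @ W3 + b3   # identity (linear) output layer

y = forward(np.ones(6))   # a single 6-dimensional state input
```

In the actual system each weight is realized by memristor cells (hence the ×3 factor in the cell counts), and sampling the noisy read currents yields the weight distributions of the BDNN.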
Extended Data Fig. 4 Average drift current under various current states.
We statistically analyse the average drift current \(\delta I\) with respect to the initial current under various current states. For each state, 1890 cells are programmed into the target current state and their drift currents are averaged. The drift current is the difference between the present read current and the initial current.
Extended Data Fig. 5
Flow chart of ex situ training using memristor variational inference and the drift compensation technique.
Extended Data Fig. 6 The read current distribution of the BDNN after memristor programming.
The purple histogram shows the programmed results, and the grey histogram depicts the target current state.
Extended Data Fig. 7 Prediction distribution of the memristor BDNN in the ESCIM system.
a–i, Typical y′ samples of the next state at several typical and noteworthy locations \((x, y)\), where \(x\) = −10, 0, 10 and \(y\) = 1, 5, 9. We set the action \((a_x, a_y) = (0, 0)\) and, for simplicity, analyse the \(y\)-axis value of the next location. The ground-truth next state is sampled 360 times in the true dynamic sea environment (with the same random seeds), and the next state is predicted 360 times to obtain the probability density. There is a slight difference between the prediction distribution (purple histogram) and the ground-truth distribution (yellow histogram). The histograms are truncated at y′ < 0. Notably, the smaller the \(y\) value, the higher the random disturbance in the predictive samples.
Extended Data Fig. 8 Predicted performance of the memristor BDNN over time with and without compensation.
We use the Jensen–Shannon (JS) divergence as a performance metric to measure the similarity between two probability distributions (JS divergence \(\in [0, 1]\)). The time is counted from the moment when the programming process is finished. At ~3 seconds, the average JS divergence over the nine typical states is 0.021. The figure shows that the average JS divergence with drift compensation remains nearly constant over ~7000 seconds, indicating that the prediction performance of the memristor BDNN is as good as that at the beginning, and the memristor BDNN suitably accomplishes the regression task in a complex dynamic environment.
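The JS divergence between two predicted histograms can be computed as follows; a minimal sketch with the base-2 logarithm, which bounds the result to [0, 1] as stated above (function names and inputs are illustrative, not from the paper):

```python
import numpy as np

def js_divergence(p, q):
    """Jensen–Shannon divergence between two discrete distributions,
    in bits (base-2 log), so the result lies in [0, 1]."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)

    def kl(a, b):
        # KL divergence in bits; zero-probability terms contribute nothing,
        # and b = m > 0 wherever a > 0, so the log is always defined.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

print(js_divergence([1, 0], [1, 0]))  # identical histograms → 0.0
print(js_divergence([1, 0], [0, 1]))  # disjoint histograms → 1.0
```

Applying this to the predicted and ground-truth histograms of each of the nine typical states, then averaging, gives the metric tracked over time in the figure.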
Extended Data Fig. 9
Circuit modules of the simulated memristor core to evaluate the speed and energy cost of the BDNN in the RL storm coast task.
Extended Data Fig. 10 Energy cost and latency of the GPU and ESCIM system in performing uncertainty decomposition in the risk-sensitive RL storm coast task.
a, Compared with the NVIDIA Tesla A100 GPU, the memristor-based ESCIM system is approximately 27 times more energy efficient at the 130 nm node and 150 times more energy efficient at the 28 nm node. b, Its latency is likewise 5 times lower than that of the GPU at 130 nm and 10 times lower at 28 nm.
Supplementary information
Supplementary Information
Supplementary Figs. 1–16, Tables 1–8, Notes 1–5 and Video 1.
Supplementary Video 1
The video shows the result of the risk-sensitive RL storm coast task. The boat trajectory passes through sea areas with low epistemic uncertainty and low environmental stochasticity. Aleatoric and epistemic uncertainty warnings are raised when the boat enters sea areas with high environmental stochasticity or high epistemic uncertainty, respectively. These warnings guide the boat to paddle upwards to leave the high-uncertainty areas. The stable point occurs at a suitable distance from the coast because the uncertainties are taken into account.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, Y., Zhang, Q., Gao, B. et al. Uncertainty quantification via a memristor Bayesian deep neural network for risk-sensitive reinforcement learning. Nat. Mach. Intell. 5, 714–723 (2023). https://doi.org/10.1038/s42256-023-00680-y
This article is cited by
- Memristors enabling probabilistic AI at the edge. Nature Computational Science (2025)
- A dual-domain compute-in-memory system for general neural network inference. Nature Electronics (2025)
- Deep Bayesian active learning using in-memory computing hardware. Nature Computational Science (2024)
- Generative complex networks within a dynamic memristor with intrinsic variability. Nature Communications (2023)