Abstract
Many advanced artificial intelligence tasks, such as policy optimization, decision making and autonomous navigation, demand high-bandwidth data transfer and probabilistic computing, posing great challenges for conventional computing hardware. Digital computers based on the von Neumann architecture excel at precise, deterministic computation, but their efficiency is limited by the high cost of both data transfer between memory and computing units and massive random number generation. Here we develop a stochastic computation-in-memory system that can efficiently perform both in situ random number generation and computation based on the nanoscale physical behaviour of memristors. The system is built on a hardware-implemented multiple-memristor-array platform. To demonstrate its functionality and efficiency, we implement a typical risk-sensitive reinforcement learning task, the storm coast task, with a four-layer Bayesian deep neural network. The system efficiently decomposes aleatoric and epistemic uncertainties by exploiting the inherent stochasticity of memristors. Compared with a conventional digital computer, our memristor-based system achieves 10-fold higher speed and 150-fold higher energy efficiency in uncertainty decomposition. This stochastic computation-in-memory system paves the way for high-speed, energy-efficient implementations of a wide range of probabilistic artificial intelligence algorithms.
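The decomposition of predictive uncertainty into aleatoric and epistemic parts, as performed by the system above, follows the standard law-of-total-variance treatment for Bayesian neural networks. As a rough software illustration only (not the paper's hardware implementation; variable names are hypothetical), the two components can be separated from Monte Carlo weight samples of a regression model:

```python
import numpy as np

def decompose_uncertainty(mean_samples, var_samples):
    """Law-of-total-variance decomposition for a regression BNN.

    mean_samples, var_samples: arrays of shape (n_weight_samples,),
    holding the predictive mean and variance produced by each sampled
    weight configuration (illustrative inputs, not from the paper).
    """
    aleatoric = np.mean(var_samples)   # expected data noise
    epistemic = np.var(mean_samples)   # spread caused by weight uncertainty
    total = aleatoric + epistemic      # total predictive variance
    return total, aleatoric, epistemic

# Toy example with 360 weight samples (the sampling count used in the paper)
rng = np.random.default_rng(0)
means = rng.normal(1.0, 0.2, size=360)  # model disagreement -> epistemic
vars_ = np.full(360, 0.5)               # constant data noise -> aleatoric
total, alea, epi = decompose_uncertainty(means, vars_)
```

In hardware, the weight samples come from the intrinsic read noise of the memristor arrays rather than from a software random number generator.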
Data availability
The data supporting the plots within this paper and other data supporting the findings in this study are available in a Zenodo repository (ref. 41).
Code availability
The codes used for the simulations described in Methods are available in a Zenodo repository (ref. 41) and a GitHub repository (https://github.com/YudengLin/memristorBDNN). The code that supports the communication between the custom-built ESCIM system and the integrated chip is available from the corresponding author on reasonable request.
References
Chouard, T. & Venema, L. Machine intelligence. Nature 521, 435 (2015).
Duan, Y., Edwards, J. S. & Dwivedi, Y. K. Artificial intelligence for decision making in the era of Big Data – evolution, challenges and research agenda. Int. J. Inf. Manage. 48, 63–71 (2019).
Ghahramani, Z. Probabilistic machine learning and artificial intelligence. Nature 521, 452–459 (2015).
Abdar, M. et al. A review of uncertainty quantification in deep learning: techniques, applications and challenges. Inf. Fusion 76, 243–297 (2021).
Wang, H. & Yeung, D.-Y. Towards Bayesian deep learning: a framework and some existing methods. IEEE Trans. Knowl. Data Eng. 28, 3395–3408 (2016).
Michelmore, R. et al. Uncertainty quantification with statistical guarantees in end-to-end autonomous driving control. In 2020 IEEE International Conference on Robotics and Automation (ICRA) 7344–7350 (IEEE, 2020).
McAllister, R. et al. Concrete problems for autonomous vehicle safety: advantages of Bayesian deep learning. In Proc. 26th International Joint Conference on Artificial Intelligence (IJCAI) 4745–4753 (Elsevier, 2017).
Ticknor, J. L. A Bayesian regularized artificial neural network for stock market forecasting. Expert Syst. Appl. 40, 5501–5506 (2013).
Jang, H. & Lee, J. Generative Bayesian neural network model for risk-neutral pricing of American index options. Quant. Finance 19, 587–603 (2019).
Begoli, E., Bhattacharya, T. & Kusnezov, D. The need for uncertainty quantification in machine-assisted medical decision making. Nat. Mach. Intell. 1, 20–23 (2019).
Topol, E. J. High-performance medicine: the convergence of human and artificial intelligence. Nat. Med. 25, 44–56 (2019).
Hüllermeier, E. & Waegeman, W. Aleatoric and epistemic uncertainty in machine learning: an introduction to concepts and methods. Mach. Learn. 110, 457–506 (2021).
Depeweg, S., Hernandez-Lobato, J.-M., Doshi-Velez, F. & Udluft, S. Decomposition of uncertainty in Bayesian deep learning for efficient and risk-sensitive learning. In Proc. 35th International Conference on Machine Learning (eds Dy, J. & Krause, A.) 1184–1193 (PMLR, 2018).
Coates, A. et al. Deep learning with COTS HPC systems. In Proc. 30th International Conference on Machine Learning (eds Dasgupta, S. & McAllester, D.) 1337–1345 (PMLR, 2013).
Jouppi, N. P. et al. In-datacenter performance analysis of a tensor processing unit. In Proc. 44th Annual International Symposium on Computer Architecture 1–12 (ACM, 2017).
Horowitz, M. 1.1 Computing's energy problem (and what we can do about it). In 2014 IEEE International Solid-State Circuits Conference Digest of Technical Papers (ISSCC) 10–14 (IEEE, 2014).
Thomas, D. B., Howes, L. & Luk, W. A comparison of CPUs, GPUs, FPGAs, and massively parallel processor arrays for random number generation. In Proc. ACM/SIGDA International Symposium on Field Programmable Gate Arrays 63–72 (ACM, 2009).
Askar, T., Shukirgaliyev, B., Lukac, M. & Abdikamalov, E. Evaluation of pseudo-random number generation on GPU cards. Computation 9, 142 (2021).
Thomas, D. B., Luk, W., Leong, P. H. W. & Villasenor, J. D. Gaussian random number generators. ACM Comput. Surv. 39, 11 (2007).
Yao, P. et al. Fully hardware-implemented memristor convolutional neural network. Nature 577, 641–646 (2020).
Ambrogio, S. et al. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature 558, 60–67 (2018).
Prezioso, M. et al. Training and operation of an integrated neuromorphic network based on metal-oxide memristors. Nature 521, 61–64 (2015).
Lin, Y. et al. Demonstration of generative adversarial network by intrinsic random noises of analog RRAM devices. In 2018 IEEE International Electron Devices Meeting (IEDM) 3.4.1–3.4.4 (IEEE, 2018).
Gao, L., Chen, P.-Y. & Yu, S. Demonstration of convolution kernel operation on resistive cross-point array. IEEE Electron Device Lett. 37, 870–873 (2016).
Yu, S. Neuro-inspired computing with emerging nonvolatile memorys. Proc. IEEE 106, 260–285 (2018).
Burr, G. W. et al. Neuromorphic computing using non-volatile memory. Adv. Phys. X 3, 89–124 (2017).
Dalgaty, T. et al. In situ learning using intrinsic memristor variability via Markov chain Monte Carlo sampling. Nat. Electron. 4, 151–161 (2021).
Dalgaty, T., Esmanhotto, E., Castellani, N., Querlioz, D. & Vianello, E. Ex situ transfer of Bayesian neural networks to resistive memory-based inference hardware. Adv. Intell. Syst. 3, 2000103 (2021).
Balatti, S., Ambrogio, S., Wang, Z. & Ielmini, D. True random number generation by variability of resistive switching in oxide-based devices. IEEE J. Emerg. Select. Top. Circuits Syst. 5, 214–221 (2015).
Vodenicarevic, D. et al. Low-energy truly random number generation with superparamagnetic tunnel junctions for unconventional computing. Phys. Rev. Appl. 8, 054045 (2017).
Kim, G. et al. Self-clocking fast and variation tolerant true random number generator based on a stochastic Mott memristor. Nat. Commun. 12, 2906 (2021).
Jiang, H. et al. A novel true random number generator based on a stochastic diffusive memristor. Nat. Commun. 8, 882 (2017).
Lin, B. et al. A high-performance and calibration-free true random number generator based on the resistance perturbation in RRAM array. In 2020 IEEE International Electron Devices Meeting (IEDM) 38.6.1–38.6.4 (IEEE, 2020).
Wu, W. et al. Improving analog switching in HfOx-based resistive memory with a thermal enhanced layer. IEEE Electron Device Lett. 38, 1019–1022 (2017).
Chen, J. et al. A parallel multibit programing scheme with high precision for RRAM-based neuromorphic systems. IEEE Trans. Electron Devices 67, 2213–2217 (2020).
Puglisi, F. M., Pavan, P. & Larcher, L. Random telegraph noise in HfOx Resistive Random Access Memory: from physics to compact modeling. In 2016 IEEE International Reliability Physics Symposium (IRPS) MY-8-1–MY-8-5 (IEEE, 2016).
Ambrogio, S. et al. Statistical fluctuations in HfOx resistive-switching memory: part II – random telegraph noise. IEEE Trans. Electron Devices 61, 2920–2927 (2014).
Blundell, C., Cornebise, J., Kavukcuoglu, K. & Wierstra, D. Weight uncertainty in neural network. In Proc. 32nd International Conference on Machine Learning (eds Bach, F. & Blei, D.) 1613–1622 (PMLR, 2015).
Depeweg, S., Hernández-Lobato, J. M., Doshi-Velez, F. & Udluft, S. Learning and policy search in stochastic dynamical systems with Bayesian neural networks. In 5th International Conference on Learning Representations 1–14 (ICLR, 2017).
Schulman, J., Wolski, F., Dhariwal, P., Radford, A. & Klimov, O. Proximal policy optimization algorithms. Preprint at https://arxiv.org/abs/1707.06347 (2017).
Lin, Y. YudengLin/memristorBDNN: uncertainty quantification via a memristor Bayesian deep neural network for risk-sensitive reinforcement learning. Zenodo https://doi.org/10.5281/zenodo.7947059 (2023).
Acknowledgements
This work was supported in part by the STI 2030-Major Projects (2021ZD0201200), the National Natural Science Foundation of China (92064001, 62025111 and 61974081), the XPLORER Prize, the Shanghai Municipal Science and Technology Major Project and the Beijing Advanced Innovation Center for Integrated Circuits.
Author information
Authors and Affiliations
Contributions
Y. Lin and Q.Z. contributed to the overall experimental design. B.G., J.Z. and H.W. supervised this project and proposed the overall architecture. Y. Lin, P.Y., Y.Z. and Y. Liu benchmarked the system performance. Z.L., C.L., W.Z. and S.H. helped with the simulations and data analysis. Y. Lin and B.G. contributed to writing and editing the manuscript. All authors examined the results and reviewed the manuscript.
Corresponding authors
Ethics declarations
Competing interests
The authors declare no competing interests.
Peer review
Peer review information
Nature Machine Intelligence thanks the anonymous reviewers for their contribution to the peer review of this work.
Additional information
Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Extended data
Extended Data Fig. 1 Read noise distribution in the different current states.
a–f, Read-noise distributions at the six target current states shown in the figures. The read-noise value of a memristor in a given current state is the difference between its read current and its average current. We use a block mapping method to program each cell to the target current state within an error margin of \(\Delta I=\pm 0.3\) μA. For each target state, 2800 memristor cells are programmed, and each cell is measured over 1000 read cycles. All read-noise distributions at the target current states can be fitted with a double-exponential distribution (solid lines).
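The double-exponential (Laplace) fit described in the caption can be reproduced in software. A minimal sketch, assuming the read-noise samples are available as a NumPy array; the sample count matches the 2800 cells per state, but the scale value here is illustrative, not a measured figure:

```python
import numpy as np

def fit_laplace(noise):
    """Maximum-likelihood fit of a Laplace (double-exponential) distribution:
    the location is the sample median, the scale is the mean absolute
    deviation from that median."""
    mu = np.median(noise)
    b = np.mean(np.abs(noise - mu))
    return mu, b

def laplace_pdf(x, mu, b):
    # Density of the fitted double-exponential curve (the solid line)
    return np.exp(-np.abs(x - mu) / b) / (2.0 * b)

# Synthetic stand-in for measured read noise: 2800 cells, arbitrary 0.05 scale
rng = np.random.default_rng(42)
samples = rng.laplace(loc=0.0, scale=0.05, size=2800)
mu, b = fit_laplace(samples)
```

The fitted `mu` and `b` parametrize the solid-line curve overlaid on each histogram.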
Extended Data Fig. 2 Functional block diagram of the ESCIM system.
The core board with a field-programmable gate array (FPGA) acts as the lower-level controller: it communicates with the host computer and generates the control signals for memristor operations. The TIA&ADC board and each DAC board provide 64 current-quantization channels and 64 voltage-supply channels, respectively. The eight socket boards carrying 4K memristor chips can be connected in parallel to the DUT board. The motherboard, which contains voltage- and digital-signal-conversion circuits, interconnects the other boards.
Extended Data Fig. 3 Structure of the memristor BDNN and deployment on eight 4K memristor chips in the ESCIM system.
a, Structure of the memristor BDNN. Each layer input is quantized to 8 bits. The activation functions in the hidden layers are rectifier functions, that is, ReLU(x) = max(x, 0), and those in the output layer are identity functions, that is, Linear(x) = x. The bias input of each layer is not shown. The memristor matrices contain 1800 (6 × 100 × 3), 30300 (101 × 100 × 3) and 606 (101 × 2 × 3) cells, respectively. b, The memristor matrices are mapped onto eight 4K memristor chips. The chips are filled sequentially by column, and each matrix can start in a new column.
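As a purely illustrative sketch of the layer shapes described in panel a (6 inputs, two hidden layers of 100 units, 2 outputs, ReLU hidden activations and a linear output layer), with random placeholder weights rather than the trained, memristor-encoded parameters:

```python
import numpy as np

rng = np.random.default_rng(1)
# Placeholder weights and biases matching the stated layer dimensions;
# biases account for the "+1" row in the 101-row memristor matrices.
W1, b1 = rng.normal(size=(6, 100)) * 0.1, np.zeros(100)
W2, b2 = rng.normal(size=(100, 100)) * 0.1, np.zeros(100)
W3, b3 = rng.normal(size=(100, 2)) * 0.1, np.zeros(2)

def relu(x):
    # Rectifier activation used in the hidden layers
    return np.maximum(x, 0)

def forward(x):
    h1 = relu(x @ W1 + b1)
    h2 = relu(h1 @ W2 + b2)
    return h2 @ W3 + b3   # identity (linear) output layer

y = forward(np.ones(6))   # a single 6-dimensional state input
```

In the actual system each weight is realized by memristor cells (hence the ×3 factor in the cell counts), and sampling the noisy read currents yields the weight distributions of the BDNN.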
Extended Data Fig. 4 Average drift current under various current states.
We statistically analyse the average drift current \(\delta I\) with respect to the initial current under various current states. For each state, 1890 cells are programmed into the target current state and their drift currents are averaged. The drift current is the difference between the present read current and the initial current.
Extended Data Fig. 5
Flow chart of ex situ training using memristor variational inference and the drift compensation technique.
Extended Data Fig. 6 The read current distribution of the BDNN after memristor programming.
The purple histogram shows the programmed results, and the grey histogram depicts the target current state.
Extended Data Fig. 7 Prediction distribution of the memristor BDNN in the ESCIM system.
a–i, Typical y′ samples of the next state at several typical and noteworthy locations \((x, y)\), where \(x\) = −10, 0, 10 and \(y\) = 1, 5, 9. We set the action \((a_x, a_y) = (0, 0)\) and, for simplicity, analyse the \(y\)-axis value of the next location. The ground-truth next state is sampled 360 times in the true dynamic sea environment (with the same random seeds), and the next state is predicted 360 times to obtain the probability density. There is a slight difference between the prediction distribution (purple histogram) and the ground-truth distribution (yellow histogram). The histograms are truncated at y′ < 0. Notably, the smaller the \(y\) value, the higher the random disturbance in the predictive samples.
Extended Data Fig. 8 Predicted performance of the memristor BDNN over time with and without compensation.
We use the Jensen–Shannon (JS) divergence as a performance metric to measure the similarity between two probability distributions (JS divergence \(\in [0, 1]\)). The time is counted from the moment when the programming process is finished. At ~3 seconds, the average JS divergence over the nine typical states is 0.021. The figure shows that the average JS divergence with drift compensation remains nearly constant over ~7000 seconds, indicating that the prediction performance of the memristor BDNN is as good as that at the beginning, and the memristor BDNN suitably accomplishes the regression task in a complex dynamic environment.
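The JS divergence between two predicted histograms can be computed as follows; a minimal sketch with the base-2 logarithm, which bounds the result to [0, 1] as stated above (function names and inputs are illustrative, not from the paper):

```python
import numpy as np

def js_divergence(p, q):
    """Jensen–Shannon divergence between two discrete distributions,
    in bits (base-2 log), so the result lies in [0, 1]."""
    p = np.asarray(p, dtype=float) / np.sum(p)
    q = np.asarray(q, dtype=float) / np.sum(q)
    m = 0.5 * (p + q)

    def kl(a, b):
        # KL divergence in bits; zero-probability terms contribute nothing,
        # and b = m > 0 wherever a > 0, so the log is always defined.
        mask = a > 0
        return np.sum(a[mask] * np.log2(a[mask] / b[mask]))

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

print(js_divergence([1, 0], [1, 0]))  # identical histograms → 0.0
print(js_divergence([1, 0], [0, 1]))  # disjoint histograms → 1.0
```

Applying this to the predicted and ground-truth histograms of each of the nine typical states, then averaging, gives the metric tracked over time in the figure.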
Extended Data Fig. 9
Circuit modules of the simulated memristor core to evaluate the speed and energy cost of the BDNN in the RL storm coast task.
Extended Data Fig. 10 Energy cost and latency of the GPU and ESCIM system in performing uncertainty decomposition in the risk-sensitive RL storm coast task.
a, Compared with the NVIDIA Tesla A100 GPU, the memristor-based ESCIM system is approximately 27 times more energy efficient at the 130 nm node and 150 times more energy efficient at the 28 nm node. b, Its latency is likewise 5 times lower than that of the GPU at 130 nm and 10 times lower at 28 nm.
Supplementary information
Supplementary Information
Supplementary Figs. 1–16, Tables 1–8, Notes 1–5 and Video 1.
Supplementary Video 1
The video shows the result of the risk-sensitive RL storm coast task. The boat trajectory passes through sea areas with low epistemic uncertainty and low environmental stochasticity. Aleatoric and epistemic uncertainty warnings are raised when the boat enters sea areas with high environmental stochasticity or high epistemic uncertainty, respectively. These warnings guide the boat to paddle upwards to leave the high-uncertainty areas. The stable point occurs at a suitable distance from the coast because the uncertainties are taken into account.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Lin, Y., Zhang, Q., Gao, B. et al. Uncertainty quantification via a memristor Bayesian deep neural network for risk-sensitive reinforcement learning. Nat. Mach. Intell. 5, 714–723 (2023). https://doi.org/10.1038/s42256-023-00680-y
This article is cited by
- Memristors enabling probabilistic AI at the edge. Nature Computational Science (2025)
- A dual-domain compute-in-memory system for general neural network inference. Nature Electronics (2025)
- Deep Bayesian active learning using in-memory computing hardware. Nature Computational Science (2024)
- Generative complex networks within a dynamic memristor with intrinsic variability. Nature Communications (2023)