Open access

A Survey on Bayesian Deep Learning

Published: 28 September 2020
Abstract

A comprehensive artificial intelligence system needs to not only perceive the environment with different “senses” (e.g., seeing and hearing) but also infer the world’s conditional (or even causal) relations and the corresponding uncertainty. The past decade has seen major advances in many perception tasks, such as visual object recognition and speech recognition, using deep learning models. For higher-level inference, however, probabilistic graphical models, with their Bayesian nature, are still more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework to tightly integrate deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference and, in turn, feedback from the inference process can enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications to recommender systems, topic models, control, and so on. We also discuss the relationship and differences between Bayesian deep learning and other related topics, such as the Bayesian treatment of neural networks.
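    The two-component framework sketched in the abstract is easiest to see in miniature. The Python sketch below is our illustration, not code from the survey: a tiny neural feature extractor stands in for the perception component, and conjugate Bayesian linear regression on its features stands in for the probabilistic inference component. The perceive helper, the random (untrained) network weights, and the sine-wave toy data are all assumptions made for this example.

    import numpy as np

    # Toy BDL sketch: neural "perception" features + Bayesian "task" inference.
    rng = np.random.default_rng(0)

    # Perception component (illustrative): a tiny untrained tanh network.
    # A real BDL system would train this jointly with the task component.
    W1 = rng.normal(size=(1, 32))
    b1 = rng.normal(size=32)

    def perceive(x):
        # Map raw inputs of shape (n, 1) to feature vectors of shape (n, 32).
        return np.tanh(x @ W1 + b1)

    # Task component: conjugate Bayesian linear regression on the features,
    # with prior w ~ N(0, alpha^{-1} I) and Gaussian noise precision beta
    # (the standard closed-form posterior; see Bishop 2006, Chapter 3).
    alpha, beta = 1.0, 25.0
    x_train = rng.uniform(-3, 3, size=(40, 1))
    y_train = np.sin(x_train[:, 0]) + rng.normal(scale=0.2, size=40)

    Phi = perceive(x_train)
    S = np.linalg.inv(alpha * np.eye(32) + beta * Phi.T @ Phi)  # posterior covariance
    m = beta * S @ Phi.T @ y_train                              # posterior mean

    # Predictive mean and variance; the variance combines observation noise
    # with parameter uncertainty and grows away from the training region.
    x_test = np.linspace(-5, 5, 5).reshape(-1, 1)
    phi = perceive(x_test)
    pred_mean = phi @ m
    pred_std = np.sqrt(1.0 / beta + np.sum(phi @ S * phi, axis=1))

    for x, mu, sd in zip(x_test[:, 0], pred_mean, pred_std):
        print(f"x={x:+.2f}  predictive mean={mu:+.3f}  std={sd:.3f}")

    As the abstract notes, information flows both ways: better perceptual features sharpen the posterior, while in a jointly trained system the posterior’s uncertainty would in turn reshape the features.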


    Published In

ACM Computing Surveys, Volume 53, Issue 5
September 2021, 782 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3426973

    Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 28 September 2020
    Accepted: 01 September 2020
    Revised: 01 August 2020
    Received: 01 March 2020
    Published in CSUR Volume 53, Issue 5


    Author Tags

1. Bayesian networks
2. deep learning
3. generative models
4. probabilistic graphical models

    Qualifiers

    • Survey
    • Research
    • Refereed
