Open access

A Survey on Bayesian Deep Learning

Published: 28 September 2020
Abstract

A comprehensive artificial intelligence system needs to not only perceive the environment with different “senses” (e.g., seeing and hearing) but also infer the world’s conditional (or even causal) relations and the corresponding uncertainty. The past decade has seen major advances in many perception tasks, such as visual object recognition and speech recognition, using deep learning models. For higher-level inference, however, probabilistic graphical models, with their Bayesian nature, are still more powerful and flexible. In recent years, Bayesian deep learning has emerged as a unified probabilistic framework to tightly integrate deep learning and Bayesian models. In this general framework, the perception of text or images using deep learning can boost the performance of higher-level inference and, in turn, feedback from the inference process can enhance the perception of text or images. This survey provides a comprehensive introduction to Bayesian deep learning and reviews its recent applications to recommender systems, topic models, control, and so on. We also discuss the relationship and differences between Bayesian deep learning and other related topics, such as the Bayesian treatment of neural networks.
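    The two-component framework sketched in the abstract is easiest to see in miniature. The Python sketch below is our illustration, not code from the survey: a tiny neural feature extractor stands in for the perception component, and conjugate Bayesian linear regression on its features stands in for the probabilistic inference component. The perceive helper, the random (untrained) network weights, and the sine-wave toy data are all assumptions made for this example.

    import numpy as np

    # Toy BDL sketch: neural "perception" features + Bayesian "task" inference.
    rng = np.random.default_rng(0)

    # Perception component (illustrative): a tiny untrained tanh network.
    # A real BDL system would train this jointly with the task component.
    W1 = rng.normal(size=(1, 32))
    b1 = rng.normal(size=32)

    def perceive(x):
        # Map raw inputs of shape (n, 1) to feature vectors of shape (n, 32).
        return np.tanh(x @ W1 + b1)

    # Task component: conjugate Bayesian linear regression on the features,
    # with prior w ~ N(0, alpha^{-1} I) and Gaussian noise precision beta
    # (the standard closed-form posterior; see Bishop 2006, Chapter 3).
    alpha, beta = 1.0, 25.0
    x_train = rng.uniform(-3, 3, size=(40, 1))
    y_train = np.sin(x_train[:, 0]) + rng.normal(scale=0.2, size=40)

    Phi = perceive(x_train)
    S = np.linalg.inv(alpha * np.eye(32) + beta * Phi.T @ Phi)  # posterior covariance
    m = beta * S @ Phi.T @ y_train                              # posterior mean

    # Predictive mean and variance; the variance combines observation noise
    # with parameter uncertainty and grows away from the training region.
    x_test = np.linspace(-5, 5, 5).reshape(-1, 1)
    phi = perceive(x_test)
    pred_mean = phi @ m
    pred_std = np.sqrt(1.0 / beta + np.sum(phi @ S * phi, axis=1))

    for x, mu, sd in zip(x_test[:, 0], pred_mean, pred_std):
        print(f"x={x:+.2f}  predictive mean={mu:+.3f}  std={sd:.3f}")

    As the abstract notes, information flows both ways: better perceptual features sharpen the posterior, while in a jointly trained system the posterior’s uncertainty would in turn reshape the features.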


    Published In

ACM Computing Surveys, Volume 53, Issue 5
September 2021, 782 pages
ISSN: 0360-0300
EISSN: 1557-7341
DOI: 10.1145/3426973

    Publisher

Association for Computing Machinery, New York, NY, United States

    Publication History

    Published: 28 September 2020
    Accepted: 01 September 2020
    Revised: 01 August 2020
    Received: 01 March 2020
    Published in CSUR Volume 53, Issue 5


    Author Tags

1. Bayesian networks
2. deep learning
3. generative models
4. probabilistic graphical models

    Qualifiers

    • Survey
    • Research
    • Refereed
