Abstract
A necessary step in the development of artificial intelligence is to enable a machine to represent how the world works, building an internal structure from data. This structure should offer a good trade-off between expressive power and querying efficiency. Bayesian networks have proven to be an effective and versatile tool for this task. They have been applied to modeling knowledge in a variety of fields, ranging from bioinformatics to law, and from image processing to economic risk analysis. A crucial aspect is learning the dependency graph of a Bayesian network from data. This task, called structure learning, is NP-hard and is the subject of intense research. In short, it can be thought of as choosing one graph among many candidates, based on a collection of samples drawn from the distribution that generated the data. The number of possible graphs grows super-exponentially with the number of variables, so searching this space and selecting one graph over the others quickly becomes burdensome. In this survey, we review the most relevant structure learning algorithms that have been proposed in the literature. We classify them according to the approach they follow for solving the problem, and we also show alternatives for handling missing data and continuous variables. An extensive review of existing software tools is also given.
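To give a sense of how fast the number of candidate graphs grows, the following minimal sketch counts the labeled directed acyclic graphs on n variables using the recurrence commonly attributed to Robinson, a(n) = Σ_{k=1..n} (-1)^(k+1) C(n, k) 2^(k(n-k)) a(n-k) with a(0) = 1. The function name `count_dags` is an illustrative choice, not something defined in the survey.

```python
from functools import lru_cache
from math import comb


@lru_cache(maxsize=None)
def count_dags(n: int) -> int:
    """Number of labeled DAGs on n nodes (hypothetical helper name).

    Uses the inclusion-exclusion recurrence over the k nodes that
    have no incoming arcs; a(0) = 1 by convention.
    """
    if n == 0:
        return 1
    return sum(
        (-1) ** (k + 1) * comb(n, k) * 2 ** (k * (n - k)) * count_dags(n - k)
        for k in range(1, n + 1)
    )


if __name__ == "__main__":
    # Print the count for small networks to show the growth rate.
    for n in range(1, 8):
        print(n, count_dags(n))
```

With only five variables there are already 29281 candidate graphs, which illustrates why exhaustive enumeration is impractical beyond very small networks.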
Acknowledgements
This work has been partly supported by the Spanish Ministry of Science, Innovation and Universities, grant TIN2016-77902-C3-3-P and by ERDF (FEDER) funds.
Cite this article
Scanagatta, M., Salmerón, A. & Stella, F. A survey on Bayesian network structure learning from data. Prog Artif Intell 8, 425–439 (2019). https://doi.org/10.1007/s13748-019-00194-y
DOI: https://doi.org/10.1007/s13748-019-00194-y