Abstract
In several domains it is common to have data from different but closely related problems. For instance, in manufacturing, many products follow the same industrial process under different conditions, and in industrial diagnosis, different pieces of equipment often have similar specifications. In these cases there is often plenty of data for some scenarios but very little for others. To learn accurate models for the rare cases, it is desirable to use data and knowledge from similar cases, a technique known as transfer learning. In this paper we propose an inductive transfer learning method for Bayesian networks that considers both structure and parameter learning. For structure learning we use conditional independence tests, combining the measures from the target task with those obtained from one or more auxiliary tasks through a novel weighted sum of the conditional independence measures. For parameter learning, we propose two variants of the linear pool for probability aggregation, which combine the probability estimates from the target task with those from the auxiliary tasks. To validate our approach, we used three Bayesian network models that are commonly used for evaluating learning techniques, and generated variants of each model by changing the structure as well as the parameters. We then learned one of the variants from a small dataset and combined it with information from the other variants. The experimental results show a significant improvement in both structure and parameters when knowledge is transferred from similar tasks. We also evaluated the method on real-world data from a manufacturing process involving several products, obtaining an improvement in the log-likelihood of the data given the model when knowledge is transferred from related products.
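To make the parameter-aggregation idea concrete, the following is a minimal sketch of a generic linear pool, that is, a weighted average of discrete probability distributions defined over the same outcomes. It is not one of the two variants proposed in the paper, which the abstract does not specify; the function name `linear_pool` and the example weights are assumptions chosen only for illustration.

```python
from typing import List, Sequence


def linear_pool(distributions: Sequence[Sequence[float]],
                weights: Sequence[float]) -> List[float]:
    """Generic linear pool (illustrative, not the paper's variants).

    Returns the weighted average of several discrete distributions that
    share the same outcome ordering. Weights are normalised to sum to 1.
    """
    if not distributions:
        raise ValueError("at least one distribution is required")
    if len(distributions) != len(weights):
        raise ValueError("one weight is required per distribution")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")
    total = float(sum(weights))
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    size = len(distributions[0])
    if any(len(d) != size for d in distributions):
        raise ValueError("all distributions must have the same length")

    normalised = [w / total for w in weights]
    # Each pooled probability is the weighted sum of the matching entries.
    return [
        sum(w * d[i] for w, d in zip(normalised, distributions))
        for i in range(size)
    ]


# Hypothetical example: a target estimate from little data and an
# auxiliary estimate from a related task, with assumed weights 1 and 3.
target_estimate = [0.9, 0.1]
auxiliary_estimate = [0.6, 0.4]
pooled = linear_pool([target_estimate, auxiliary_estimate], [1.0, 3.0])
```

With these assumed weights the pooled estimate is approximately [0.675, 0.325]. In a transfer setting the weights would plausibly reflect how much data supports each estimate and how similar the auxiliary task is to the target, but that choice is an assumption here rather than something stated in the abstract.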
Additional information
Editors: Nicolo Cesa-Bianchi, David R. Hardoon, and Gayle Leen.
About this article
Cite this article
Luis, R., Sucar, L.E. & Morales, E.F. Inductive transfer for learning Bayesian networks. Mach Learn 79, 227–255 (2010). https://doi.org/10.1007/s10994-009-5160-4
DOI: https://doi.org/10.1007/s10994-009-5160-4