Multi-dimensional Bayesian network classifiers: A survey

Artificial Intelligence Review

Abstract

Multi-dimensional classification is a cutting-edge problem in which the values of multiple class variables have to be assigned simultaneously to a given example. It extends the well-known multi-label subproblem, in which all class variables are binary. In this article, we review and expand the set of performance evaluation measures suitable for assessing multi-dimensional classifiers. We focus on multi-dimensional Bayesian network classifiers, which cope directly with multi-dimensional classification and consider dependencies among class variables. We offer a comprehensive survey of this state-of-the-art classification model, covering the complexity of its learning and inference processes. We also describe algorithms for structural learning, present real-world applications in which these classifiers have been used, and compile a collection of related software.

Notes

  1. This is a simplification taken from Read et al. (2013) to facilitate discussion of the problem complexity. Actually, we will see later that each class variable can take a different number of values.

  2. A graph is said to be maximal connected if there is a path between every pair of vertices in its undirected version (Bielza et al. 2011).

  3. Note that we have modified the term \(r_s = \sum _{j=1}^{d} |\Omega _{C_j}|\) of Fernandes et al. (2013) by d in the denominator of the equation in order to correctly normalize the score to lie between 0 and 1.

  4. Ensemble learning, a popular approach to handling concept drift, consists of combining the predictions of a set of individual classifiers, the so-called ensemble, in order to predict new incoming examples. A comprehensive review of ensemble approaches for data stream analysis was conducted by Krawczyk et al. (2017).

  5. http://mulan.sourceforge.net/datasets-mlc.html.

  6. http://palm.seu.edu.cn/zhangml/files/MLNB.rar.

  7. http://www.sc.ehu.es/ccwbayes/members/jafernandes/files/Multi-dimensional_Pre-processing.zip.

  8. https://github.com/jacintoArias/academic-FMC.

  9. https://github.com/marcobb8/tr_bn.

  10. https://github.com/ComputationalIntelligenceGroup/MBCTree.
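As a hedged illustration of the connectivity condition stated in note 2, the following Python sketch checks whether there is a path between every pair of vertices in the undirected version of a directed graph. The function name `is_weakly_connected`, the vertex labels, and the treatment of the empty graph are assumptions made for this example; they are not taken from the survey or from Bielza et al. (2011).

```python
from collections import deque


def is_weakly_connected(vertices, arcs):
    """Return True if the undirected version of a directed graph is connected.

    vertices: iterable of hashable vertex labels.
    arcs: iterable of (parent, child) pairs; both ends must be in `vertices`.
    """
    # Build the undirected version: every arc links its two ends both ways.
    neighbours = {v: set() for v in vertices}
    if not neighbours:
        return True  # assumption of this sketch: the empty graph is connected
    for parent, child in arcs:
        neighbours[parent].add(child)
        neighbours[child].add(parent)

    # Breadth-first search from an arbitrary start vertex.
    start = next(iter(neighbours))
    seen = {start}
    queue = deque([start])
    while queue:
        current = queue.popleft()
        for nxt in neighbours[current] - seen:
            seen.add(nxt)
            queue.append(nxt)

    # Connected exactly when the search reached every vertex.
    return len(seen) == len(neighbours)


# Hypothetical class subgraphs over three class variables.
chain = is_weakly_connected(["C1", "C2", "C3"], [("C1", "C2"), ("C3", "C2")])
split = is_weakly_connected(["C1", "C2", "C3"], [("C1", "C2")])
```

In this sketch, `chain` describes a graph whose undirected version links all three vertices, whereas `split` leaves `C3` isolated, so only the first satisfies the condition.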

References

  • Abdelbar AM, Hedetniemi SM (1998) Approximating MAPs for belief networks is NP-hard and other theorems. Artif Intell 102(1):21–38

  • Aliferis CF, Statnikov A, Tsamardinos I, Mani S, Koutsoukos XD (2010a) Local causal and Markov blanket induction for causal discovery and feature selection for classification. Part I: Algorithms and empirical evaluation. J Mach Learn Res 11:171–234

  • Aliferis CF, Statnikov A, Tsamardinos I, Mani S, Koutsoukos XD (2010b) Local causal and Markov blanket induction for causal discovery and feature selection for classification. Part II: Analysis and extensions. J Mach Learn Res 11:235–284

  • Antonucci A, Corani G, Mauá D, Gabaglio S (2013) An ensemble of Bayesian networks for multilabel classification. In: Proceedings of the 23rd international joint conference on artificial intelligence, AAAI Press, pp 1220–1225

  • Arias J, Gámez JA, Nielsen TD, Puerta JM (2016) A scalable pairwise class interaction framework for multidimensional classification. Int J Approx Reason 68:194–210

  • Arnborg S, Corneil DG, Proskurowski A (1987) Complexity of finding embeddings in a k-tree. SIAM J Alg Discrete Methods 8(2):277–284

  • Benjumeda M, Bielza C, Larrañaga P (2018) Tractability of most probable explanations in multidimensional Bayesian network classifiers. Int J Approx Reason 93:74–87

  • Bielza C, Larrañaga P (2014) Discrete Bayesian network classifiers: A survey. ACM Comput Surv 47(1):5

  • Bielza C, Li G, Larrañaga P (2011) Multi-dimensional classification with Bayesian networks. Int J Approx Reason 52(6):705–727

  • Blanco R, Inza I, Merino M, Quiroga J, Larrañaga P (2005) Feature selection in Bayesian classifiers for the prognosis of survival of cirrhotic patients treated with TIPS. J Biomed Inform 38(5):376–388

  • Bolt JH, van der Gaag LC (2017) Balanced sensitivity functions for tuning multi-dimensional Bayesian network classifiers. Int J Approx Reason 80:361–376

  • Borchani H, Bielza C, Larrañaga P (2010) Learning CB-decomposable multi-dimensional Bayesian network classifiers. In: Proceedings of the 5th European workshop on probabilistic graphical models, pp 25–32

  • Borchani H, Bielza C, Martínez-Martín P, Larrañaga P (2012) Markov blanket-based approach for learning multi-dimensional Bayesian network classifiers: An application to predict the European Quality of Life-5 Dimensions (EQ-5D) from the 39-item Parkinson’s Disease Questionnaire (PDQ-39). J Biomed Inform 45(6):1175–1184

  • Borchani H, Bielza C, Toro C, Larrañaga P (2013) Predicting human immunodeficiency virus inhibitors using multi-dimensional Bayesian network classifiers. Artif Intell Med 57(3):219–229

  • Borchani H, Larrañaga P, Gama J, Bielza C (2016) Mining multi-dimensional concept-drifting data streams using Bayesian network classifiers. Intell Data Anal 20(2):257–280

  • Bouckaert RR (1992) Optimizing causal orderings for generating DAGs from data. In: Proceedings of the 8th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 9–16

  • Boutell MR, Luo J, Shen X, Brown CM (2004) Learning multi-label scene classification. Pattern Recogn 37(9):1757–1771

  • Brier GW (1950) Verification of forecasts expressed in terms of probability. Mon Weather Rev 78(1):1–3

  • Buntine W (1991) Theory refinement on Bayesian networks. In: Proceedings of the 7th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 52–60

  • Charte F, Charte D (2015) Working with multilabel datasets in R: The mldr package. R J 7(2):149–162

  • Charte F, Rivera AJ, Charte D, del Jesus MJ, Herrera F (2018) Tips, guidelines and tools for managing multi-label datasets: The mldr.datasets R package and the Cometa data repository. Neurocomputing 289:68–85

  • Cheng W, Hühn J, Hüllermeier E (2009) Decision tree and instance-based learning for label ranking. In: Proceedings of the 26th annual international conference on machine learning, ACM, pp 161–168

  • Chow C, Liu C (1968) Approximating discrete probability distributions with dependence trees. IEEE Trans Inf Theory 14(3):462–467

  • Chu YJ, Liu TH (1965) On the shortest arborescence of a directed graph. Sci Sinica 14:1396–1400

  • Cooper GF, Herskovits E (1992) A Bayesian method for the induction of probabilistic networks from data. Mach Learn 9(4):309–347

  • Corani G, Antonucci A, Mauá DD, Gabaglio S (2014) Trading off speed and accuracy in multilabel classification. In: Proceedings of the 7th European workshop on probabilistic graphical models, Lecture Notes in Artificial Intelligence, Springer, pp 145–159

  • Dawid AP (1992) Applications of a general propagation algorithm for probabilistic expert systems. Stat Comput 2(1):25–36

  • Dean T, Kanazawa K (1989) A model for reasoning about persistence and causation. Comput Intell 5(2):142–150

  • Deb K, Pratap A, Agarwal S, Meyarivan T (2002) A fast and elitist multiobjective genetic algorithm: NSGA-II. IEEE Trans Evol Comput 6(2):182–197

  • Dechter R (1999) Bucket elimination: A unifying framework for reasoning. Artif Intell 113(1–2):41–85

  • Dechter R, Rish I (1997) A scheme for approximating probabilistic inference. In: Proceedings of the 13th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 132–141

  • Dempster AP, Laird NM, Rubin DB (1977) Maximum likelihood from incomplete data via the EM algorithm. J R Stat Soc Ser B (Methodol) 39(1):1–38

  • Fayyad U, Irani K (1993) Multi-interval discretization of continuous-valued attributes for classification learning. In: Proceedings of the 13th international joint conference on artificial intelligence, pp 1022–1027

  • Fernandes JA, Lozano JA, Inza I, Irigoien X, Pérez A, Rodríguez JD (2013) Supervised pre-processing approaches in multiple class variables classification for fish recruitment forecasting. Environ Modell Softw 40:245–254

  • Fernandez-Gonzalez P, Bielza C, Larrañaga P (2015) Multidimensional classifiers for neuroanatomical data. In: ICML Workshop on statistics, machine learning and neuroscience (Stamlins 2015)

  • Frank E, Hall M (2001) A simple approach to ordinal classification. In: Proceedings of the 12th European conference on machine learning, Lecture Notes in Artificial Intelligence, Springer, pp 145–156

  • Friedman N (1997) Learning belief networks in the presence of missing values and hidden variables. In: Proceedings of the 14th international conference on machine learning, Morgan Kaufmann Publishers Inc, vol 97, pp 125–133

  • Friedman N (1998) The Bayesian structural EM algorithm. In: Proceedings of the 14th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 129–138

  • Friedman N, Geiger D, Goldszmidt M (1997) Bayesian network classifiers. Mach Learn 29(2–3):131–163

  • Fürnkranz J, Hüllermeier E, Mencía EL, Brinker K (2008) Multilabel classification via calibrated label ranking. Mach Learn 73(2):133–153

  • van der Gaag LC, de Waal PR (2006) Multi-dimensional Bayesian network classifiers. In: Proceedings of the 3rd European workshop in probabilistic graphical models, pp 107–114

  • Gama J, Castillo G (2006) Learning with local drift detection. In: Proceedings of the 2nd international conference on advanced data mining and applications, Lecture Notes in Artificial Intelligence, Springer, pp 42–55

  • Gama J, Zliobaite I, Bifet A, Pechenizkiy M, Bouchachia A (2014) A survey on concept drift adaptation. ACM Comput Surv 46(4):44

  • Gelsema ES (1995) Abductive reasoning in Bayesian belief networks using a genetic algorithm. Pattern Recogn Lett 16(8):865–871

  • Gibaja E, Ventura S (2015) A tutorial on multi-label learning. ACM Comput Surv 47(3):52

  • Gil-Begue S, Larrañaga P, Bielza C (2018) Multi-dimensional Bayesian network classifier trees. In: Proceedings of the 19th international conference on intelligent data engineering and automated learning, Lecture Notes in Computer Science, Springer, pp 354–363

  • Godbole S, Sarawagi S (2004) Discriminative methods for multi-labeled classification. In: Proceedings of the 8th Pacific-Asia conference on knowledge discovery and data mining, Lecture Notes in Artificial Intelligence, Springer, pp 22–30

  • Guan DJ (1998) Generalized Gray codes with applications. In: Proceedings of the national science council of the Republic of China, part a: Physical science and engineering, vol 22, No 6, pp 841–848

  • Hall MA (2000) Correlation-based feature selection for discrete and numeric class machine learning. In: Proceedings of the 17th international conference on machine learning, Morgan Kaufmann Publishers Inc, pp 359–366

  • Henrion M (1988) Propagating uncertainty in Bayesian networks by probabilistic logic sampling. In: Machine intelligence and pattern recognition, vol 5, Elsevier, pp 149–163

  • Hernández-González J, Inza I, Lozano JA (2015) Multidimensional learning from crowds: Usefulness and application of expertise detection. Int J Intell Syst 30(3):326–354

  • Hinkley DV (1971) Inference about the change-point from cumulative sum tests. Biometrika 58(3):509–523

  • Hüllermeier E, Fürnkranz J, Cheng W, Brinker K (2008) Label ranking by learning pairwise preferences. Artif Intell 172(16–17):1897–1916

  • Hutter F, Hoos HH, Stützle T (2005) Efficient stochastic local search for MPE solving. In: Proceedings of the 19th international joint conference on artificial intelligence, Morgan Kaufmann Publishers Inc, pp 169–174

  • John GH, Langley P (1995) Estimating continuous distributions in Bayesian classifiers. In: Proceedings of the 11th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 338–345

  • Kask K, Dechter R (1999) Mini-bucket heuristics for improved search. In: Proceedings of the 15th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 314–323

  • Kask K, Dechter R (2001) A general scheme for automatic generation of search heuristics from specification dependencies. Artif Intell 129(1–2):91–131

  • Koller D, Friedman N (2009) Probabilistic graphical models: Principles and techniques. The MIT Press, London

  • Kong X, Philip SY (2011) An ensemble-based approach to fast classification of multi-label data streams. In: Proceedings of the 7th international conference on collaborative computing: Networking, applications and worksharing, IEEE, pp 95–104

  • Krawczyk B, Minku LL, Gama J, Stefanowski J, Woźniak M (2017) Ensemble learning for data stream analysis: A survey. Inf Fusion 37:132–156

  • Kruskal JB (1956) On the shortest spanning subtree of a graph and the traveling salesman problem. Proc Am Math Soc 7(1):48–50

  • Kullback S (1997) Information theory and statistics. Courier Corporation

  • Kwisthout J (2011) Most probable explanations in Bayesian networks: Complexity and tractability. Int J Approx Reason 52(9):1452–1469

  • Langley P, Sage S (1994) Induction of selective Bayesian classifiers. In: Proceedings of the 10th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 399–406

  • Li Z, D’Ambrosio B (1993) An efficient approach for finding the MPE in belief networks. In: Proceedings of the 9th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 342–349

  • Marinescu R, Dechter R (2009) AND/OR branch-and-bound search for combinatorial optimization in graphical models. Artif Intell 173(16–17):1457–1491

  • Mencía EL, Fürnkranz J (2010) Efficient multilabel classification algorithms for large-scale problems in the legal domain. In: Semantic processing of legal texts, Springer, pp 192–215

  • Minku LL, White AP, Yao X (2010) The impact of diversity on online ensemble learning in the presence of concept drift. IEEE Trans Knowl Data Eng 22(5):730–742

  • Minsky M (1961) Steps toward artificial intelligence. Proc Inst Radio Eng 49(1):8–30

  • Nodelman U, Shelton CR, Koller D (2002) Continuous time Bayesian networks. In: Proceedings of the 18th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 378–387

  • Ortigosa-Hernández J, Rodríguez JD, Alzate L, Lucania M, Inza I, Lozano JA (2012) Approaching sentiment analysis by using semi-supervised learning of multi-dimensional classifiers. Neurocomputing 92:98–115

  • Page ES (1954) Continuous inspection schemes. Biometrika 41(1/2):100–115

  • Park S, Fürnkranz J (2008) Multi-label classification with label constraints. In: Proceedings of the joint European conference on machine learning and principles and practice of knowledge discovery in databases workshop on preference learning, pp 157–171

  • Pastink A, van der Gaag LC (2015) Multi-classifiers of small treewidth. In: Proceedings of the 13th European conference on symbolic and quantitative approaches to reasoning and uncertainty, Lecture Notes in Artificial Intelligence, Springer, pp 199–209

  • Pearl J (1988) Probabilistic reasoning in intelligent systems: Networks of plausible inference. Morgan Kaufmann Publishers, New York

  • Pérez A, Larrañaga P, Inza I (2009) Bayesian classifiers based on kernel density estimation: Flexible classifiers. Int J Approx Reason 50(2):341–362

  • Provost F, Domingos P (2000) Improving probability estimation trees. Mach Learn 52(3):199–215

  • Qazi M, Fung G, Krishnan S, Rosales R, Steck H, Rao RB, Poldermans D, Chandrasekaran D (2007) Automated heart wall motion abnormality detection from ultrasound images using Bayesian networks. In: Proceedings of the 20th international joint conference on artificial intelligence, Morgan Kaufmann Publishers Inc, pp 519–525

  • Qu W, Zhang Y, Zhu J, Qiu Q (2009) Mining multi-label concept-drifting data streams using dynamic classifier ensemble. In: Proceedings of the 1st Asian conference on machine learning, Lecture Notes in Artificial Intelligence, Springer, pp 308–321

  • Read J (2008) A pruned problem transformation method for multi-label classification. In: Proceedings of the New Zealand computer science research student conference, pp 143–150

  • Read J, Pfahringer B, Holmes G, Frank E (2011) Classifier chains for multi-label classification. Mach Learn 85(3):333–359

  • Read J, Bifet A, Holmes G, Pfahringer B (2012) Scalable and efficient multi-label classification for evolving data streams. Mach Learn 88(1–2):243–272

  • Read J, Bielza C, Larrañaga P (2013) Multi-dimensional classification with super-classes. IEEE Trans Knowl Data Eng 26(7):1720–1733

  • Read J, Reutemann P, Pfahringer B, Holmes G (2016) MEKA: A multi-label/multi-target extension to WEKA. J Mach Learn Res 17:667–671

  • Rebane G, Pearl J (1987) The recovery of causal poly-trees from statistical data. In: Proceedings of the 3rd conference on uncertainty in artificial intelligence, AUAI Press, pp 222–228

  • Rissanen J (1978) Modeling by shortest data description. Automatica 14(5):465–471

  • Rivas JJ, Orihuela-Espina F, Sucar LE (2018) Circular chain classifiers. In: Proceedings of the 9th international conference on probabilistic graphical models, proceedings of machine learning research, pp 380–391

  • Rivolli A, de Carvalho ACPLF (2018) The utiml package: Multi-label classification in R. R J 10(2):24–37

  • Robinson RW (1973) Counting labeled acyclic digraphs. In: New directions in the theory of graphs, Academic Press, pp 239–273

  • Rodríguez JD, Lozano JA (2008) Multi-objective learning of multi-dimensional Bayesian classifiers. In: Proceedings of the 8th international conference on hybrid intelligent systems, IEEE Computer Society, pp 501–506

  • Rodríguez JD, Perez A, Arteta D, Tejedor D, Lozano JA (2012) Using multidimensional Bayesian network classifiers to assist the treatment of multiple sclerosis. IEEE Trans Syst Man Cybern Part C Appl Rev 42(6):1705–1715

  • Rojas-Guzman C, Kramer MA (1993) GALGO: A genetic algorithm decision support tool for complex uncertain systems modeled with Bayesian belief networks. In: Proceedings of the 9th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 368–375

  • Rudin C (2019) Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nat Mach Intell 1(5):206–215

  • Sahami M (1996) Learning limited dependence Bayesian classifiers. In: Proceedings of the 2nd international conference on knowledge discovery and data mining, AAAI Press, 1, pp 335–338

  • Santos E (1991) On the generation of alternative explanations with implications for belief revision. In: Proceedings of the 7th conference on uncertainty in artificial intelligence, Morgan Kaufmann Publishers Inc, pp 339–347

  • Schapire RE, Singer Y (1999) Improved boosting algorithms using confidence-rated predictions. Mach Learn 37(3):297–336

  • Schapire RE, Singer Y (2000) BoosTexter: A boosting-based system for text categorization. Mach Learn 39(2–3):135–168

  • Schwarz G (1978) Estimating the dimension of a model. Ann Stat 6(2):461–464

  • Sechidis K, Tsoumakas G, Vlahavas I (2011) On the stratification of multi-label data. In: Proceedings of the joint European conference on machine learning and knowledge discovery in databases, Lecture Notes in Artificial Intelligence, Springer, pp 145–158

  • Shimony SE (1994) Finding MAPs for belief networks is NP-hard. Artif Intell 68(2):399–410

  • Shimony SE, Charniak W (1990) A new algorithm for finding MAP assignments to belief networks. In: Proceedings of the 6th annual conference on uncertainty in artificial intelligence, Elsevier, pp 185–196

  • Song G, Ye Y (2014) A new ensemble method for multi-label data stream classification in non-stationary environment. In: Proceedings of the 2014 international joint conference on neural networks, IEEE, pp 1776–1783

  • Stella F, Amer Y (2012) Continuous time Bayesian network classifiers. J Biomed Inform 45(6):1108–1119

  • Sucar LE, Bielza C, Morales EF, Hernandez-Leal P, Zaragoza JH, Larrañaga P (2014) Multi-label classification with Bayesian network-based chain classifiers. Pattern Recogn Lett 41:14–22

  • Sy BK (1992) Reasoning MPE to multiply connected belief networks using message passing. In: Proceedings of the 10th national conference on artificial intelligence, AAAI Press, pp 570–576

  • Szymanski P, Kajdanowicz T (2019) Scikit-multilearn: A scikit-based Python environment for performing multi-label classification. J Mach Learn Res 20:209–230

  • Teyssier M, Koller D (2005) Ordering-based search: A simple and effective algorithm for learning Bayesian networks. In: Proceedings of the 21st conference on uncertainty in artificial intelligence, AUAI Press, pp 584–590

  • Tsoumakas G, Katakis I (2007) Multi-label classification: An overview. Int J Data Warehouse Min 3(3):1–13

  • Tsoumakas G, Vlahavas I (2007) Random k-labelsets: An ensemble method for multilabel classification. In: Proceedings of the 18th European conference on machine learning, Lecture Notes in Artificial Intelligence, Springer, pp 406–417

  • Tsoumakas G, Katakis I, Vlahavas I (2009) Mining multi-label data. In: Data mining and knowledge discovery handbook, Springer, pp 667–685

  • Tsoumakas G, Spyromitros-Xioufis E, Vilcek J, Vlahavas I (2011) MULAN: A Java library for multi-label learning. J Mach Learn Res 12:2411–2414

  • de Waal PR, van der Gaag LC (2007) Inference and learning in multi-dimensional Bayesian network classifiers. In: Proceedings of the 9th European conference on symbolic and quantitative approaches to reasoning with uncertainty, Lecture Notes in Artificial Intelligence, Springer, pp 501–511

  • Wang L, Shen H, Tian H (2017) Weighted ensemble classification of multi-label data streams. In: Proceedings of the 21st Pacific-Asia conference on knowledge discovery and data mining, Lecture Notes in Artificial Intelligence, Springer, pp 551–562

  • Wang P, Zhang P, Guo L (2012) Mining multi-label data streams using ensemble-based active learning. In: Proceedings of the 2012 SIAM international conference on data mining, SIAM, pp 1131–1140

  • Widmer G, Kubat M (1996) Learning in the presence of concept drift and hidden contexts. Mach Learn 23(1):69–101

  • Xioufis ES, Spiliopoulou M, Tsoumakas G, Vlahavas IP (2011) Dealing with concept drift and class imbalance in multi-label stream classification. In: Proceedings of the 22nd international joint conference on artificial intelligence, AAAI Press, pp 1583–1588

  • Yang Y (1999) An evaluation of statistical approaches to text categorization. Inf Retrieval 1(1–2):69–90

  • Yang Y, Ding M (2019) Decision function with probability feature weighting based on Bayesian network for multi-label classification. Neural Comput Appl 31(9):4819–4828

  • Yang Y, Liu X (1999) A re-examination of text categorization methods. In: Proceedings of the 22nd annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 42–49

  • Zaragoza JH, Sucar LE, Morales EF (2011a) A two-step method to learn multidimensional Bayesian network classifiers based on mutual information measures. In: Proceedings of the 24th international FLAIRS conference, AAAI Press, pp 644–649

  • Zaragoza JH, Sucar LE, Morales EF, Bielza C, Larrañaga P (2011b) Bayesian chain classifiers for multidimensional classification. In: Proceedings of the 22nd international joint conference on artificial intelligence, AAAI Press, pp 2192–2197

  • Zhang ML, Zhou ZH (2007) ML-KNN: A lazy learning approach to multi-label learning. Pattern Recogn 40(7):2038–2048

  • Zhang ML, Zhou ZH (2014) A review on multi-label learning algorithms. IEEE Trans Knowl Data Eng 26(8):1819–1837

  • Zhang ML, Peña JM, Robles V (2009) Feature selection for multi-label naive Bayes classification. Inf Sci 179(19):3218–3229

  • Zhu M, Liu S, Jiang J (2016) A hybrid method for learning multi-dimensional Bayesian network classifiers based on an optimization model. Appl Intell 44(1):123–148

  • Zhu S, Ji X, Xu W, Gong Y (2005) Multi-labelled classification using maximum entropy method. In: Proceedings of the 28th annual international ACM SIGIR conference on research and development in information retrieval, ACM, pp 274–281

Acknowledgements

This work has been partially supported by the Spanish Ministry of Science and Innovation through the PID2019-109247GB-IOO project. Santiago Gil-Begue has been supported by the predoctoral grant FPU17/04341 from the Spanish Ministry of Science, Innovation and Universities.

Author information

Correspondence to Santiago Gil-Begue.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Gil-Begue, S., Bielza, C. & Larrañaga, P. Multi-dimensional Bayesian network classifiers: A survey. Artif Intell Rev 54, 519–559 (2021). https://doi.org/10.1007/s10462-020-09858-x

  • DOI: https://doi.org/10.1007/s10462-020-09858-x
