-
Entropy-based Pruning for Learning Bayesian Networks using BIC
Authors:
Cassio P. de Campos,
Mauro Scanagatta,
Giorgio Corani,
Marco Zaffalon
Abstract:
For decomposable score-based structure learning of Bayesian networks, existing approaches first compute a collection of candidate parent sets for each variable and then optimize over this collection by choosing one parent set for each variable without creating directed cycles while maximizing the total score. We target the task of constructing the collection of candidate parent sets when the score of choice is the Bayesian Information Criterion (BIC). We provide new non-trivial results that can be used to prune the search space of candidate parent sets of each node. We analyze how these new results relate to previous ideas in the literature both theoretically and empirically. We show in experiments with UCI data sets that gains can be significant. Since the new pruning rules are easy to implement and have low computational costs, they can be promptly integrated into all state-of-the-art methods for structure learning of Bayesian networks.
Submitted 19 July, 2017;
originally announced July 2017.
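For readers unfamiliar with the score named in this abstract, the following is a minimal sketch of the standard BIC local score of one candidate parent set on complete discrete data (maximized log-likelihood minus a penalty of $\frac{\log N}{2}$ per free parameter). It is not the pruning rules of the paper; the function name `bic_local_score`, the row-tuple data layout, and the `arity` list are assumptions made for illustration.

```python
import math
from collections import Counter

def bic_local_score(data, child, parents, arity):
    """BIC local score of `child` given the candidate `parents`.

    data    : list of tuples of ints, one tuple per complete observation (assumed layout)
    child   : column index of the variable being scored
    parents : list of column indices forming the candidate parent set
    arity   : arity[i] is the number of states of variable i
    """
    n = len(data)
    joint = Counter()          # counts of (parent configuration, child state)
    parent_counts = Counter()  # counts of parent configuration alone
    for row in data:
        pa = tuple(row[p] for p in parents)
        joint[(pa, row[child])] += 1
        parent_counts[pa] += 1

    # Maximized log-likelihood: sum over observed cells of N_jk * log(N_jk / N_j).
    loglik = sum(c * math.log(c / parent_counts[pa])
                 for (pa, _), c in joint.items())

    # Penalty: (log N / 2) * (r - 1) * q, where q is the number of
    # parent configurations (1 for the empty parent set).
    q = math.prod(arity[p] for p in parents)
    penalty = 0.5 * math.log(n) * (arity[child] - 1) * q
    return loglik - penalty

# Hypothetical toy data: variable 1 is an exact copy of variable 0.
data = [(0, 0), (1, 1)] * 4
arity = [2, 2]
with_parent = bic_local_score(data, 1, [0], arity)
without_parent = bic_local_score(data, 1, [], arity)
```

On this toy data the parent set `[0]` scores higher than the empty set, because the likelihood gain outweighs the extra penalty. A collection of candidate parent sets, as described in the abstract, would be built by evaluating such scores per variable and discarding sets that cannot be optimal.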
-
Approximation Complexity of Maximum A Posteriori Inference in Sum-Product Networks
Authors:
Diarmaid Conaty,
Denis D. Mauá,
Cassio P. de Campos
Abstract:
We discuss the computational complexity of approximating maximum a posteriori inference in sum-product networks. We first show NP-hardness in trees of height two by a reduction from maximum independent set; this implies non-approximability within a sublinear factor. We show that this is a tight bound, as we can find an approximation within a linear factor in networks of height two. We then show that, in trees of height three, it is NP-hard to approximate the problem within a factor $2^{f(n)}$ for any sublinear function $f$ of the size of the input $n$. Again, this bound is tight, as we prove that the usual max-product algorithm finds (in any network) approximations within factor $2^{c \cdot n}$ for some constant $c < 1$. Last, we present a simple algorithm, and show that it provably produces solutions at least as good as, and potentially much better than, the max-product algorithm. We empirically analyze the proposed algorithm against max-product using synthetic and realistic networks.
Submitted 5 September, 2017; v1 submitted 17 March, 2017;
originally announced March 2017.
-
A Structured View on Weighted Counting with Relations to Counting, Quantum Computation and Applications
Authors:
Cassio P. de Campos,
Georgios Stamoulis,
Dennis Weyland
Abstract:
Weighted counting problems are a natural generalization of counting problems where a weight is associated with every computational path of polynomial-time non-deterministic Turing machines and the goal is to compute the sum of the weights of all paths (instead of just computing the number of accepting paths). Useful closure properties and plenty of applications make weighted counting problems interesting. The general definition of these problems captures even undecidable problems, but it turns out that obtaining an exponentially small additive approximation is just as hard as solving conventional counting problems. In many cases such an approximation is sufficient and working with weighted counting problems tends to be very convenient. We present a structured view on weighted counting by defining classes that depend on the range of the function that assigns weights to paths and by showing the relationships between these different classes. These classes constitute generalizations of the usual counting problems. Weighted counting allows us to easily cast a number of famous results of computational complexity in its terms, especially regarding counting and quantum computation. Moreover, these classes are flexible enough and capture the complexity of various problems in fields such as probabilistic graphical models and stochastic combinatorial optimization. Using the weighted counting terminology and our results, we are able to simplify and answer some open questions in those fields.
Submitted 10 January, 2019; v1 submitted 23 January, 2017;
originally announced January 2017.
-
Learning Bayesian Networks with Incomplete Data by Augmentation
Authors:
Tameem Adel,
Cassio P. de Campos
Abstract:
We present new algorithms for learning Bayesian networks from data with missing values using a data augmentation approach. An exact Bayesian network learning algorithm is obtained by recasting the problem into a standard Bayesian network learning problem without missing data. To the best of our knowledge, this is the first exact algorithm for this problem. As expected, the exact algorithm does not scale to large domains. We build on the exact method to create an approximate algorithm using a hill-climbing technique. This algorithm scales to large domains so long as a suitable standard structure learning method for complete data is available. We perform a wide range of experiments to demonstrate the benefits of learning Bayesian networks with this new approach.
Submitted 8 October, 2016; v1 submitted 27 August, 2016;
originally announced August 2016.
-
Learning Bounded Treewidth Bayesian Networks with Thousands of Variables
Authors:
Mauro Scanagatta,
Giorgio Corani,
Cassio P. de Campos,
Marco Zaffalon
Abstract:
We present a method for learning treewidth-bounded Bayesian networks from data sets containing thousands of variables. Bounding the treewidth of a Bayesian network greatly reduces the complexity of inferences. Yet, being a global property of the graph, it considerably increases the difficulty of the learning process. We propose a novel algorithm for this task, able to scale to large domains and large treewidths. Our novel approach consistently outperforms the state of the art on data sets with up to ten thousand variables.
Submitted 11 May, 2016;
originally announced May 2016.
-
Advances in Learning Bayesian Networks of Bounded Treewidth
Authors:
Siqi Nie,
Denis Deratani Maua,
Cassio Polpo de Campos,
Qiang Ji
Abstract:
This work presents novel algorithms for learning Bayesian network structures with bounded treewidth. Both exact and approximate methods are developed. The exact method combines mixed-integer linear programming formulations for structure learning and treewidth computation. The approximate method consists in uniformly sampling $k$-trees (maximal graphs of treewidth $k$), and subsequently selecting, exactly or approximately, the best structure whose moral graph is a subgraph of that $k$-tree. Some properties of these methods are discussed and proven. The approaches are empirically compared to each other and to a state-of-the-art method for learning bounded treewidth structures on a collection of public data sets with up to 100 variables. The experiments show that our exact algorithm outperforms the state of the art, and that the approximate approach is fairly accurate.
Submitted 6 June, 2014; v1 submitted 5 June, 2014;
originally announced June 2014.
-
On the Complexity of Strong and Epistemic Credal Networks
Authors:
Denis D. Maua,
Cassio Polpo de Campos,
Alessio Benavoli,
Alessandro Antonucci
Abstract:
Credal networks are graph-based statistical models whose parameters take values in a set, instead of being sharply specified as in traditional statistical models (e.g., Bayesian networks). The computational complexity of inferences on such models depends on the irrelevance/independence concept adopted. In this paper, we study inferential complexity under the concepts of epistemic irrelevance and strong independence. We show that inferences under strong independence are NP-hard even in trees with ternary variables. We prove that under epistemic irrelevance the polynomial time complexity of inferences in credal trees is not likely to extend to more general models (e.g. singly connected networks). These results clearly distinguish networks that admit efficient inferences and those where inferences are most likely hard, and settle several open questions regarding computational complexity.
Submitted 26 September, 2013;
originally announced September 2013.
-
Inference in Polytrees with Sets of Probabilities
Authors:
Jose Carlos Ferreira da Rocha,
Fabio Gagliardi Cozman,
Cassio Polpo de Campos
Abstract:
Inferences in directed acyclic graphs associated with probability sets and probability intervals are NP-hard, even for polytrees. In this paper we focus on such inferences, and propose: 1) a substantial improvement on Tessem's A/R algorithm for polytrees with probability intervals; 2) a new algorithm for direction-based local search (in sets of probability) that improves on existing methods; 3) a collection of branch-and-bound algorithms that combine the previous techniques. The first two techniques lead to approximate solutions, while branch-and-bound procedures can produce either exact or approximate solutions. We report on dramatic improvements on existing techniques for inference with probability sets and intervals, in some cases reducing the computational effort by many orders of magnitude.
Submitted 19 October, 2012;
originally announced December 2012.
-
The Complexity of Approximately Solving Influence Diagrams
Authors:
Denis D. Maua,
Cassio Polpo de Campos,
Marco Zaffalon
Abstract:
Influence diagrams allow for intuitive and yet precise description of complex situations involving decision making under uncertainty. Unfortunately, most of the problems described by influence diagrams are hard to solve. In this paper we discuss the complexity of approximately solving influence diagrams. We do not assume no-forgetting or regularity, which makes the class of problems we address very broad. Remarkably, we show that when both the tree-width and the cardinality of the variables are bounded the problem admits a fully polynomial-time approximation scheme.
Submitted 16 October, 2012;
originally announced October 2012.
-
Propositional and Relational Bayesian Networks Associated with Imprecise and Qualitative Probabilistic Assesments
Authors:
Fabio Gagliardi Cozman,
Cassio Polpo de Campos,
Jaime Ide,
Jose Carlos Ferreira da Rocha
Abstract:
This paper investigates a representation language with flexibility inspired by probabilistic logic and compactness inspired by relational Bayesian networks. The goal is to handle propositional and first-order constructs together with precise, imprecise, indeterminate and qualitative probabilistic assessments. The paper shows how this can be achieved through the theory of credal networks. New exact and approximate inference algorithms based on multilinear programming and iterated/loopy propagation of interval probabilities are presented; their superior performance, compared to existing ones, is shown empirically.
Submitted 11 July, 2012;
originally announced July 2012.
-
Belief Updating and Learning in Semi-Qualitative Probabilistic Networks
Authors:
Cassio Polpo de Campos,
Fabio Gagliardi Cozman
Abstract:
This paper explores semi-qualitative probabilistic networks (SQPNs) that combine numeric and qualitative information. We first show that exact inferences with SQPNs are $\mathrm{NP}^{\mathrm{PP}}$-complete. We then show that existing qualitative relations in SQPNs (plus probabilistic logic and imprecise assessments) can be dealt with effectively through multilinear programming. We then discuss learning: we consider a maximum likelihood method that generates point estimates given an SQPN and empirical data, and we describe a Bayesian-minded method that employs the Imprecise Dirichlet Model to generate set-valued estimates.
Submitted 4 July, 2012;
originally announced July 2012.
-
Strategy Selection in Influence Diagrams using Imprecise Probabilities
Authors:
Cassio Polpo de Campos,
Qiang Ji
Abstract:
This paper describes a new algorithm to solve the decision making problem in Influence Diagrams based on algorithms for credal networks. Decision nodes are associated to imprecise probability distributions and a reformulation is introduced that finds the global maximum strategy with respect to the expected utility. We work with Limited Memory Influence Diagrams, which generalize most Influence Diagram proposals and handle simultaneous decisions. Besides the global optimum method, we explore an anytime approximate solution with a guaranteed maximum error and show that imprecise probabilities are handled in a straightforward way. Complexity issues and experiments with random diagrams and an effects-based military planning problem are discussed.
Submitted 13 June, 2012;
originally announced June 2012.
-
Improving parameter learning of Bayesian nets from incomplete data
Authors:
Giorgio Corani,
Cassio P. De Campos
Abstract:
This paper addresses the estimation of parameters of a Bayesian network from incomplete data. The task is usually tackled by running the Expectation-Maximization (EM) algorithm several times in order to obtain a high log-likelihood estimate. We argue that choosing the maximum log-likelihood estimate (as well as the maximum penalized log-likelihood and the maximum a posteriori estimate) has severe drawbacks, being affected both by overfitting and model uncertainty. Two ideas are discussed to overcome these issues: a maximum entropy approach and a Bayesian model averaging approach. Both ideas can be easily applied on top of EM, while the entropy idea can be also implemented in a more sophisticated way, through a dedicated non-linear solver. A vast set of experiments shows that these ideas produce significantly better estimates and inferences than the traditional and widely used maximum (penalized) log-likelihood and maximum a posteriori estimates. In particular, if EM is adopted as optimization engine, the model averaging approach is the best performing one; its performance is matched by the entropy approach when implemented using the non-linear solver. The results suggest that the applicability of these ideas is immediate (they are easy to implement and to integrate in currently available inference engines) and that they constitute a better way to learn Bayesian network parameters.
Submitted 12 October, 2011;
originally announced October 2011.
-
Solving Limited Memory Influence Diagrams
Authors:
Denis Deratani Mauá,
Cassio Polpo de Campos,
Marco Zaffalon
Abstract:
We present a new algorithm for exactly solving decision making problems represented as influence diagrams. We do not require the usual assumptions of no forgetting and regularity; this allows us to solve problems with simultaneous decisions and limited information. The algorithm is empirically shown to outperform a state-of-the-art algorithm on randomly generated problems of up to 150 variables and $10^{64}$ solutions. We show that the problem is NP-hard even if the underlying graph structure of the problem has small treewidth and the variables take on a bounded number of states, but that a fully polynomial time approximation scheme exists for these cases. Moreover, we show that the bound on the number of states is a necessary condition for any efficient approximation scheme.
Submitted 9 September, 2011; v1 submitted 8 September, 2011;
originally announced September 2011.
-
New Results for the MAP Problem in Bayesian Networks
Authors:
Cassio P. de Campos
Abstract:
This paper presents new results for the (partial) maximum a posteriori (MAP) problem in Bayesian networks, which is the problem of querying the most probable state configuration of some of the network variables given evidence. First, it is demonstrated that the problem remains hard even in networks with very simple topology, such as binary polytrees and simple trees (including the Naive Bayes structure). Such proofs extend previous complexity results for the problem. Inapproximability results are also derived in the case of trees if the number of states per variable is not bounded. Although the problem is shown to be hard and inapproximable even in very simple scenarios, a new exact algorithm is described that is empirically fast in networks of bounded treewidth and bounded number of states per variable. The same algorithm is used as basis of a Fully Polynomial Time Approximation Scheme for MAP under such assumptions. Approximation schemes were generally thought to be impossible for this problem, but we show otherwise for classes of networks that are important in practice. The algorithms are extensively tested using some well-known networks as well as random generated cases to show their effectiveness.
Submitted 29 July, 2010; v1 submitted 22 July, 2010;
originally announced July 2010.