Parallel Algorithm for Learning Optimal Bayesian Network Structure

Authors: Yoshinori Tamada, Seiya Imoto, Satoru Miyano

The Journal of Machine Learning Research, Volume 12

Pages 2437 - 2459

Published: 01 July 2011

Abstract

We present a parallel algorithm for the score-based optimal structure search of Bayesian networks. This algorithm is based on a dynamic programming (DP) algorithm having O(n ⋅ 2ⁿ) time and space complexity, which is known to be the fastest algorithm for the optimal structure search of networks with n nodes. The bottleneck of the problem is the memory requirement, and therefore, the algorithm is currently applicable for up to a few tens of nodes. While a recently proposed algorithm overcomes this limitation by a space-time trade-off, our proposed algorithm realizes direct parallelization of the original DP algorithm with O(n^σ) time and space overhead calculations, where σ > 0 controls the communication-space trade-off. The overall time and space complexity is O(n^(σ+1) ⋅ 2ⁿ). This algorithm splits the search space so that the required communication between independent calculations is minimal. Because of this advantage, our algorithm can run on distributed memory supercomputers. Through computational experiments, we confirmed that our algorithm can run in parallel using up to 256 processors with a parallelization efficiency of 0.74, compared to the original DP algorithm with a single processor. We also demonstrate optimal structure search for a 32-node network without any constraints, which is the largest network search presented in the literature.
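
To make the DP formulation mentioned in the abstract more concrete, the following is a minimal single-processor sketch of a subset-based dynamic program for optimal structure search. It is not the parallel algorithm of the paper and may differ from the authors' exact formulation. The names `optimal_network`, `local_score`, and `toy_score` are hypothetical, and the scoring function is a made-up table standing in for a real decomposable score (higher is better). Node subsets are encoded as bitmasks, which is why memory grows with 2ⁿ.

```python
def optimal_network(n, local_score):
    """Exhaustive subset DP (illustrative sketch, not the paper's parallel method).

    local_score(v, parent_mask) is an assumed, higher-is-better local score
    of node v given the parent set encoded by the bitmask parent_mask.
    Returns (best_total_score, parents), where parents[v] is a list of nodes.
    """
    full = (1 << n) - 1

    # best_ps[v][C] = (score, parent_mask): best parent set of v chosen inside C.
    best_ps = [[None] * (full + 1) for _ in range(n)]
    for v in range(n):
        for C in range(full + 1):
            if (C >> v) & 1:
                continue  # v cannot be its own parent
            cand = (local_score(v, C), C)
            rest = C
            while rest:
                low = rest & -rest          # lowest set bit of rest
                rest ^= low
                sub = best_ps[v][C ^ low]   # smaller mask, already computed
                if sub[0] > cand[0]:
                    cand = sub
            best_ps[v][C] = cand

    # net[S] = (score, sink): best score of a DAG over S and its chosen sink.
    net = [(0.0, -1)] + [None] * full
    for S in range(1, full + 1):
        best = None
        for v in range(n):
            if not (S >> v) & 1:
                continue
            rest = S ^ (1 << v)
            total = net[rest][0] + best_ps[v][rest][0]
            if best is None or total > best[0]:
                best = (total, v)
        net[S] = best

    # Peel off sinks from the full set to recover each node's parents.
    parents = [None] * n
    S = full
    while S:
        v = net[S][1]
        S ^= 1 << v
        mask = best_ps[v][S][1]
        parents[v] = [u for u in range(n) if (mask >> u) & 1]
    return net[full][0], parents


def toy_score(v, mask):
    """Hypothetical score favouring the chain 0 -> 1 -> 2, with a per-parent penalty."""
    table = {(1, 0b001): 5.0, (2, 0b010): 5.0}
    return table.get((v, mask), 0.0) - 0.1 * bin(mask).count("1")


score, parents = optimal_network(3, toy_score)
print(score, parents)
```

Both tables are indexed by every subset of the nodes, so this sketch is only practical for small n. That memory growth is the bottleneck the abstract refers to, and it is the motivation for splitting the search space across distributed-memory processors.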

Index Terms

Parallel Algorithm for Learning Optimal Bayesian Network Structure
1. Computing methodologies
  1. Machine learning
2. Mathematics of computing
  1. Probability and statistics

Publisher: JMLR.org
