DOI: 10.5555/2074094.2074110

The Bayesian structural EM algorithm

Published: 24 July 1998

    Abstract

    In recent years there has been a flurry of work on learning Bayesian networks from data. One of the hard problems in this area is how to effectively learn the structure of a belief network from incomplete data, that is, in the presence of missing values or hidden variables. In a recent paper, I introduced an algorithm called Structural EM that combines the standard Expectation Maximization (EM) algorithm, which optimizes parameters, with structure search for model selection. That algorithm learns networks based on penalized likelihood scores, which include the BIC/MDL score and various approximations to the Bayesian score. In this paper, I extend Structural EM to deal directly with Bayesian model selection. I prove the convergence of the resulting algorithm and show how to apply it for learning a large class of probabilistic models, including Bayesian networks and some variants thereof.
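
The idea of interleaving EM with structure search can be illustrated with a toy sketch. The code below is not the paper's algorithm: it is a minimal illustration of the earlier penalized-likelihood variant mentioned in the abstract, assuming only two binary variables X and Y, missing values only in Y, two candidate structures (no edge, or X -> Y), and a BIC-style penalty. All function names (`expected_counts`, `fit_and_score`, `structural_em`) are hypothetical.

```python
import math

def xlogp(count, prob):
    """count * log(prob), treating a zero count or zero probability as contributing 0."""
    return count * math.log(prob) if count > 0 and prob > 0 else 0.0

def expected_counts(data, p_y_given_x):
    """E-step: complete each missing Y with fractional counts under the current model."""
    counts = [[0.0, 0.0], [0.0, 0.0]]  # counts[x][y]
    for x, y in data:
        if y is None:
            counts[x][1] += p_y_given_x[x]
            counts[x][0] += 1.0 - p_y_given_x[x]
        else:
            counts[x][y] += 1.0
    return counts

def fit_and_score(counts, structure):
    """Fit one candidate structure to the expected counts and return (score, P(Y=1|X=x))."""
    n = sum(map(sum, counts))
    n_x = [sum(counts[0]), sum(counts[1])]
    if structure == "edge":   # X -> Y: a separate P(Y=1 | X=x) for each x
        p_y = [counts[x][1] / n_x[x] if n_x[x] > 0 else 0.5 for x in (0, 1)]
        dim = 3               # free parameters: P(X=1), P(Y=1|X=0), P(Y=1|X=1)
    else:                     # no edge: a single P(Y=1) shared by both x values
        p = (counts[0][1] + counts[1][1]) / n
        p_y = [p, p]
        dim = 2               # free parameters: P(X=1), P(Y=1)
    loglik = 0.0
    for x in (0, 1):
        loglik += xlogp(n_x[x], n_x[x] / n)
        loglik += xlogp(counts[x][1], p_y[x])
        loglik += xlogp(counts[x][0], 1.0 - p_y[x])
    return loglik - 0.5 * dim * math.log(n), p_y  # BIC-style penalized score

def structural_em(data, iters=50, tol=1e-9):
    """Alternate the E-step with a joint choice of structure and parameters."""
    structure, p_y = "indep", [0.5, 0.5]
    for _ in range(iters):
        counts = expected_counts(data, p_y)
        candidates = {s: fit_and_score(counts, s) for s in ("indep", "edge")}
        best = max(candidates, key=lambda s: candidates[s][0])
        new_p = candidates[best][1]
        if best == structure and max(abs(a - b) for a, b in zip(new_p, p_y)) < tol:
            break
        structure, p_y = best, new_p
    return structure, p_y

# Toy data: Y mostly copies X, with some Y values missing (None).
data = ([(0, 0)] * 40 + [(0, 1)] * 5 + [(1, 0)] * 5 + [(1, 1)] * 40
        + [(0, None)] * 10 + [(1, None)] * 10)
structure, p_y = structural_em(data)
```

The point of the sketch is that every candidate structure is scored against expected counts computed once under the current model, so the structure can change inside the EM loop without rerunning EM to convergence for each candidate. On the toy data above, where Y strongly depends on X, the loop selects the `"edge"` structure.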

    Published In

    UAI'98: Proceedings of the Fourteenth Conference on Uncertainty in Artificial Intelligence
    July 1998
    538 pages
    ISBN: 155860555X

    Sponsors

    • NEC
    • Hugin Expert A/S
    • Information Extraction and Transportation
    • Microsoft Research
    • AT&T Labs Research

    Publisher

    Morgan Kaufmann Publishers Inc.

    San Francisco, CA, United States
