Abstract
We examine the class of multi-linear representations (MLR) for expressing probability distributions over discrete variables. Recently, MLRs have been considered as intermediate representations that facilitate inference in distributions represented as graphical models.
We show that MLR is an expressive representation of discrete distributions: it can concisely represent classes of distributions that have exponential size in other commonly used representations, while supporting probabilistic inference in time linear in the size of the representation. Our key contribution is a set of techniques for learning bounded-size distributions represented as MLRs, which support efficient probabilistic inference. We demonstrate experimentally that the MLR representations we learn support accurate and very efficient inference.
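To make the linear-time inference claim concrete, here is a minimal sketch, not the authors' implementation: an MLR stored as a sparse map from monomial supports to coefficients, with probability-of-evidence computed in a single pass over the terms. The two-variable toy distribution and all names (`mlr`, `prob_of_evidence`) are invented for illustration.

```python
# Sketch (assumptions labeled): an MLR over binary variables x1..xn is a
# multi-linear polynomial f(x) = sum_S c_S * prod_{i in S} x_i with
# f(x) = P(x) on {0,1}^n. Stored sparsely as {support -> coefficient}.

# Hypothetical 2-variable MLR f(x1,x2) = 0.2 - 0.1*x1 + 0.1*x2 + 0.2*x1*x2,
# which interpolates P(0,0)=0.2, P(1,0)=0.1, P(0,1)=0.3, P(1,1)=0.4.
mlr = {
    frozenset(): 0.2,
    frozenset({1}): -0.1,
    frozenset({2}): 0.1,
    frozenset({1, 2}): 0.2,
}

def prob_of_evidence(mlr, variables, evidence):
    """Sum P(x) over all completions of `evidence` (a dict var -> 0/1).

    Each monomial c * prod_{i in S} x_i contributes:
      0                          if some evidence variable in S is 0,
      c * 2**len(hidden - S)     otherwise,
    since summing x_i over {0,1} gives 1 if x_i appears in the term
    and 2 if it does not.
    """
    hidden = set(variables) - set(evidence)
    total = 0.0
    for support, coef in mlr.items():
        if any(evidence.get(v) == 0 for v in support):
            continue  # the monomial vanishes under this evidence
        total += coef * 2 ** len(hidden - support)
    return total

assert abs(prob_of_evidence(mlr, {1, 2}, {}) - 1.0) < 1e-12          # normalization
assert abs(prob_of_evidence(mlr, {1, 2}, {1: 1}) - 0.5) < 1e-12      # marginal P(x1=1)
assert abs(prob_of_evidence(mlr, {1, 2}, {1: 1, 2: 1}) - 0.4) < 1e-12  # full query
```

Each query touches every stored term exactly once, so its cost is proportional to the number of terms, which is the sense in which inference is linear in the size of the representation.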
Additional information
Editors: Aleksander Kołcz, Dunja Mladenić, Wray Buntine, Marko Grobelnik, and John Shawe-Taylor.
Cite this article
Roth, D., Samdani, R. Learning multi-linear representations of distributions for efficient inference. Mach Learn 76, 195–209 (2009). https://doi.org/10.1007/s10994-009-5130-x