Abstract
A wide range of important problems in machine learning, expert systems, social network analysis, bioinformatics and information theory can be formulated as maximum a-posteriori (MAP) inference problems on statistical relational models. While off-the-shelf inference algorithms based on local search and message-passing may provide adequate solutions in some situations, they frequently give poor results on models with high-density networks. Unfortunately, such models arise routinely in real-world applications, so accurate and scalable MAP inference on them remains a key challenge. In this paper, we first introduce a novel family of extended factor graphs parameterized by a smoothing parameter χ ∈ [0,1]. Applying belief propagation (BP) message-passing to this family yields a new family of Weighted Survey Propagation algorithms (WSP-χ) applicable to relational domains. Unlike off-the-shelf inference algorithms, WSP-χ detects the “backbone” ground atoms of a solution cluster containing potentially optimal MAP solutions: the cluster backbone atoms are not only parts of the optimal solutions, but they can also be exploited to scale MAP inference by iteratively fixing them to simplify the network until it can be solved accurately by any conventional MAP inference method. We also propose a lazy variant of this WSP-χ family of algorithms. Our experiments on several real-world problems show the efficiency of WSP-χ and its lazy variants over existing prominent MAP inference solvers such as MaxWalkSAT, RockIt, IPP, SP-Y and WCSP.
Appendix A: Derivation of WSP-χ’s update equations
Here we derive the update equations for WSP-χ’s message passing. For simplicity, and without loss of generality, we consider the derivation of WSP-1, the pure version of WSP-χ on \(\hat {\mathcal {G}}\) obtained by setting χ = 1 and γ = 0 in (8).
1.1 A1. Variable-to-factor
Let us start by computing the update of the component \(\mu ^{s}_{X_{j} \rightarrow \hat {f_{i}}}\). This component represents the probability that Xj is constrained by the other extended factors to satisfy \(\hat {f_{i}}\); it is therefore specified by the event that the variable Xj = si,j and its mega-node \(P_{j} = Z^{j} \cup \{\hat {f_{i}}\}\). We use \(P_{j} = Z^{j} \cup \{\hat {f_{i}}\}\) as notation for the following event for a ground atom Xj:
Then we can compute \(\mu ^{s}_{X_{j} \rightarrow \hat {f_{i}}}\) as follows:
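The display itself is not reproduced here; a reconstruction consistent with the annotation in (24a), which identifies exactly this sum with \(\mu^{s}_{X_{k} \rightarrow \hat{f_{i}}}\) (written here for \(X_{j}\)), is:

$$ \mu^{s}_{X_{j} \rightarrow \hat{f_{i}}} = \sum\limits_{Z^{j} \subseteq \mathcal{F}^{s}_{\hat{f_{i}}}(j)} \left\{ \mu_{X_{j} \rightarrow \hat{f_{i}}} \bigg| X_{j}=s_{i,j},\ P_{j} = Z^{j} \cup \{\hat{f_{i}}\} \right\} $$ (19a)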
Similarly, the component \(\mu ^{u}_{X_{j} \rightarrow \hat {f_{i}}}\) is specified by the event that Xj = ui,j and its mega-node \(P_{j} \subseteq \mathcal {F}^{u}_{\hat {f_{i}}}(j)\). Thus, we have:
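A reconstruction consistent with the annotation in (24a), which identifies this sum with \(\mu^{u}_{X_{k} \rightarrow \hat{f_{i}}}\) (written here for \(X_{j}\)), is:

$$ \mu^{u}_{X_{j} \rightarrow \hat{f_{i}}} = \sum\limits_{Z^{j} \subseteq \mathcal{F}^{u}_{\hat{f_{i}}}(j)} \left\{ \mu_{X_{j} \rightarrow \hat{f_{i}}} \bigg| X_{j}=u_{i,j},\ P_{j} = Z^{j} \right\} $$ (20a)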
Finally, the component \(\mu ^{*}_{X_{j} \rightarrow \hat {f_{i}}}\) is specified by the event that Xj = si,j with \(P_{j} = \mathcal {F}^{s}_{\hat {f_{i}}}(j)\), or Xj = ∗ with Pj = ∅. Thus we have the following:
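One consistent reading of this description, summing the weights of the two stated events (a reconstruction, not the paper's typeset equation):

$$ \mu^{*}_{X_{j} \rightarrow \hat{f_{i}}} = \left\{ \mu_{X_{j} \rightarrow \hat{f_{i}}} \bigg| X_{j}=s_{i,j},\ P_{j} = \mathcal{F}^{s}_{\hat{f_{i}}}(j) \right\} + \left\{ \mu_{X_{j} \rightarrow \hat{f_{i}}} \bigg| X_{j}=*,\ P_{j} = \emptyset \right\} $$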
1.2 A2. Factor-to-variable
Let us start with the component \(\eta ^{s}_{\hat {f_{i}} \rightarrow X_{j}}\). This component implies that Xj = si,j and \(\hat {f_{i}} \in P_{j}\), while the only possible assignment for the other ground atoms \(X_{k} \in \mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}\) is ui,k, with their mega-nodes \(P_{k} \subseteq \mathcal {F}^{u}_{\hat {f_{i}}}(k)\). That is, it takes the form:
Note that since the component \(\eta ^{s}_{\hat {f_{i}} \rightarrow X_{j}}\) is constrained to satisfy \(\hat {f_{i}}\), we multiply the right-hand side of (22) by the term \(e^{\hat {w}_{i} \cdot y}\), which is the reward term for satisfying \(\hat {f_{i}}\). Now, substituting the definition of \(\mu ^{u}_{X_{k} \rightarrow \hat {f_{i}}}\) from (20a) into (22), we obtain the following:
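Carrying out this substitution term by term (a reconstruction; the reward factor and the product of violating messages are exactly as described above), the update reads:

$$ \eta^{s}_{\hat{f_{i}} \rightarrow X_{j}} = e^{\hat{w}_{i} \cdot y} \prod\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} $$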
Now consider the component \(\eta ^{u}_{\hat {f_{i}} \rightarrow X_{j}}\), which represents the probability that Xj can violate \(\hat {f_{i}}\); that is, Xj = ui,j and \(P_{j} \subseteq \mathcal {F}^{u}_{\hat {f_{i}}}(j)\). This probability combines three possibilities (with weights labeled W1, W2 and W3) for the other ground atoms \(X_{k} \in \mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}\) in a potential complete assignment:
1. There is exactly one ground atom in \(\mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}\) satisfying \(\hat {f_{i}}\), and all the other ground atoms violate it:
$$ \begin{array}{@{}rcl@{}} \text{W}_{1} & = & \sum\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \overbrace{\sum\limits_{Z^{k} \subseteq \mathcal{F}^{s}_{\hat{f_{i}}}(k)} \left\{ \mu_{X_{k} \rightarrow \hat{f_{i}}} \bigg| X_{k}=s_{i,k},\ P_{k} = Z^{k} \cup \{\hat{f_{i}}\} \right\} }^{\text{From Eq.}~(19a)\text{ this equals }\mu^{s}_{X_{k} \rightarrow \hat{f_{i}}}} \\ && \times \prod\limits_{X_{i} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{k},X_{j}\}} \overbrace{\sum\limits_{Z^{i} \subseteq \mathcal{F}^{u}_{\hat{f_{i}}}(i)} \left\{ \mu_{X_{i} \rightarrow \hat{f_{i}}} \bigg| X_{i}=u_{i,i},\ P_{i} = Z^{i} \right\} }^{\text{From Eq.}~(20a)\text{ this equals }\mu^{u}_{X_{i} \rightarrow \hat{f_{i}}}} \end{array} $$ (24a)
$$ \begin{array}{@{}rcl@{}} & = & \sum\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \mu^{s}_{X_{k} \rightarrow \hat{f_{i}}} \times \prod\limits_{X_{i} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{k},X_{j}\}} \mu^{u}_{X_{i} \rightarrow \hat{f_{i}}} \end{array} $$ (24b)
2. There are two or more ground atoms in \(\mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}\) that satisfy \(\hat {f_{i}}\) or equal the joker ∗, and all other ground atoms violate it:
$$ \begin{array}{@{}rcl@{}} \text{W}_{2} & = & \sum\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \left[\sum\limits_{Z^{k} \subseteq \mathcal{F}^{s}_{\hat{f_{i}}}(k)} \left\{\mu_{X_{k} \rightarrow \hat{f_{i}}} \bigg| X_{k}=s_{i,k},\ P_{k} = Z^{k} \right\} + \left\{ \mu_{X_{k} \rightarrow \hat{f_{i}}} \bigg| X_{k}=*,\ P_{k} =\emptyset \right\} \right] \\ && \times \prod\limits_{X_{i} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{k},X_{j}\}} \sum\limits_{Z^{i} \subseteq \mathcal{F}^{u}_{\hat{f_{i}}}(i)} \left\{\mu_{X_{i} \rightarrow \hat{f_{i}}} \bigg| X_{i}=u_{i,i},\ P_{i} = Z^{i} \right\} \end{array} $$ (25a)
$$ \begin{array}{@{}rcl@{}} & = & \prod\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \left[\mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} + \mu^{*}_{X_{k} \rightarrow \hat{f_{i}}} \right] - \sum\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \mu^{*}_{X_{k} \rightarrow \hat{f_{i}}} \times \prod\limits_{X_{i} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{k},X_{j}\}} \mu^{u}_{X_{i} \rightarrow \hat{f_{i}}} \\ && - \prod\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} \end{array} $$ (25b)

Note that the total weight assigned to the event that each ground atom is either satisfying or ∗ is \({\prod }_{X_{k} \in \mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}} \left [\mu ^{u}_{X_{k} \rightarrow \hat {f_{i}}} + \mu ^{*}_{X_{k} \rightarrow \hat {f_{i}}} \right ]\); W2 is obtained by subtracting from this quantity the weight of the event that fewer than two ground atoms are ∗ or satisfying.
This latter event is the union of two disjoint events: either all other ground atoms in \(\mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}\) are violating (with weight \({\prod }_{X_{k} \in \mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}} \mu ^{u}_{X_{k} \rightarrow \hat {f_{i}}}\)), or exactly one ground atom is ∗ or satisfying (with weight \({\sum }_{X_{k} \in \mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}} \mu ^{*}_{X_{k} \rightarrow \hat {f_{i}}} \times {\prod }_{X_{i} \in \mathcal {X}_{\hat {f_{i}}} \setminus \{X_{k},X_{j}\}} \mu ^{u}_{X_{i} \rightarrow \hat {f_{i}}}\)).
3. All other ground atoms in \(\mathcal {X}_{\hat {f_{i}}} \setminus \{X_{j}\}\) violate \(\hat {f_{i}}\). In this case, a penalty term \(e^{-\hat {w}_{i} \cdot y}\) for violating \(\hat {f_{i}}\) enters the message update:
$$ \begin{array}{@{}rcl@{}} \text{W}_{3} & = & \left[\prod\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \overbrace{\sum\limits_{Z^{k} \subseteq \mathcal{F}^{u}_{\hat{f_{i}}}(k)} \left\{\mu_{X_{k} \rightarrow \hat{f_{i}}} \bigg| X_{k}=u_{i,k},\ P_{k} = Z^{k} \right\}}^{\text{From Eq.}~(6)\text{ this equals }\mu^{u}_{X_{k} \rightarrow \hat{f_{i}}}} \right] \times \overbrace{e^{-\hat{w}_{i} \cdot y}}^{\text{A penalty term, see (6)}} \end{array} $$ (26a)
$$ \begin{array}{@{}rcl@{}} & = & \left[ \prod\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} \right] \times e^{-\hat{w}_{i} \cdot y} \end{array} $$ (26b)
Now, bringing together the weight forms of W1, W2, and W3 from (24b), (25b) and (26b) results in:
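Collecting these terms gives the following unnormalized update (a reconstruction from (24b), (25b) and (26b); all sums and products range over \(X_{k}, X_{l} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}\)):

$$ \eta^{u}_{\hat{f_{i}} \rightarrow X_{j}} = \sum\limits_{X_{k}} \mu^{s}_{X_{k} \rightarrow \hat{f_{i}}} \prod\limits_{X_{l} \neq X_{k}} \mu^{u}_{X_{l} \rightarrow \hat{f_{i}}} + \prod\limits_{X_{k}} \left[\mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} + \mu^{*}_{X_{k} \rightarrow \hat{f_{i}}}\right] - \sum\limits_{X_{k}} \mu^{*}_{X_{k} \rightarrow \hat{f_{i}}} \prod\limits_{X_{l} \neq X_{k}} \mu^{u}_{X_{l} \rightarrow \hat{f_{i}}} - \left(1 - e^{-\hat{w}_{i} \cdot y}\right) \prod\limits_{X_{k}} \mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} $$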
Finally, the component \(\eta ^{*}_{\hat {f_{i}} \rightarrow X_{j}}\) represents the probability that Xj is unconstrained by \(\hat {f_{i}}\). This probability combines two possibilities: either Xj satisfies \(\hat {f_{i}}\) and all other ground atoms are unconstrained, or Xj itself is unconstrained (i.e., Xj = ∗ with Pj = ∅). So we have:
Note that the first part of (25a) and (25b) is identical to (28); substituting the computed form of this part from (25a) and (25b) into (28), we obtain:
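Substituting the first part of (25b) as described, one consistent reconstruction of the resulting update (the exact published form of (29) may group terms differently) is:

$$ \eta^{*}_{\hat{f_{i}} \rightarrow X_{j}} = \prod\limits_{X_{k} \in \mathcal{X}_{\hat{f_{i}}} \setminus \{X_{j}\}} \left[\mu^{u}_{X_{k} \rightarrow \hat{f_{i}}} + \mu^{*}_{X_{k} \rightarrow \hat{f_{i}}}\right] $$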
1.3 A3. Estimating the marginals
Now let us explain the derivation of the ground atoms’ marginals over max-cores in \(\hat {\mathcal {G}}\). Computing the unnormalized positive marginal of a ground atom Xj requires multiplying the satisfying incoming messages from the ground clauses in which Xj appears positively by the violating incoming messages from the ground clauses in which Xj appears negatively:
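Read literally, and writing \(\mathcal{F}^{+}(j)\) and \(\mathcal{F}^{-}(j)\) for the sets of extended factors in which \(X_{j}\) appears positively and negatively (notation introduced here for illustration), this gives:

$$ M^{+}(X_{j}) = \prod\limits_{\hat{f_{i}} \in \mathcal{F}^{+}(j)} \eta^{s}_{\hat{f_{i}} \rightarrow X_{j}} \times \prod\limits_{\hat{f_{i}} \in \mathcal{F}^{-}(j)} \eta^{u}_{\hat{f_{i}} \rightarrow X_{j}} $$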
Similarly, we obtain the unnormalized negative marginal by multiplying the satisfying incoming messages from the factors in which Xj appears negatively by the violating incoming messages from the factors in which Xj appears positively:
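With the same illustrative notation \(\mathcal{F}^{+}(j)\), \(\mathcal{F}^{-}(j)\) for the factors in which \(X_{j}\) appears positively and negatively, this reads:

$$ M^{-}(X_{j}) = \prod\limits_{\hat{f_{i}} \in \mathcal{F}^{-}(j)} \eta^{s}_{\hat{f_{i}} \rightarrow X_{j}} \times \prod\limits_{\hat{f_{i}} \in \mathcal{F}^{+}(j)} \eta^{u}_{\hat{f_{i}} \rightarrow X_{j}} $$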
Finally, we estimate the unnormalized joker marginal by multiplying the unconstrained incoming messages from all factors in which Xj appears:
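Spelled out (with \(M^{*}\) denoting this unnormalized quantity, a notation introduced here), this is:

$$ M^{*}(X_{j}) = \prod\limits_{\hat{f_{i}}\,:\, X_{j} \in \mathcal{X}_{\hat{f_{i}}}} \eta^{*}_{\hat{f_{i}} \rightarrow X_{j}} $$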
Now by normalizing the quantities in (30c), (31) and (32a), we obtain the marginal of Xj as follows:
and
where \(\mathcal {Z}_{i}\) is the normalizing constant, given the evidence E.
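To make the bookkeeping of the appendix concrete, here is a minimal Python sketch of the WSP-1 updates for a single extended factor and a single atom. It implements \(\eta^s\) (reward times the product of violating messages) and \(\eta^u = \text{W}_1 + \text{W}_2 + \text{W}_3\) from (24b), (25b) and (26b), plus the marginal estimate described above. The function names, the flat-list message representation, and the explicit `(w, y)` arguments are illustrative choices, not the paper's notation or API.

```python
import math

def factor_to_variable(mu_s, mu_u, mu_star, w, y):
    """Unnormalized factor-to-variable components (eta^s, eta^u) of WSP-1
    for one target atom X_j, following (24b), (25b) and (26b).

    mu_s, mu_u, mu_star: incoming components mu^s, mu^u, mu^* from the
        OTHER atoms X_k of the extended factor (X_j excluded).
    w, y: factor weight w_i and the parameter y of the reward/penalty
        terms e^{+w*y} / e^{-w*y}.
    """
    n = len(mu_u)
    prod_u = math.prod(mu_u)                      # prod_k mu^u_k

    # eta^s: X_j satisfies the factor while every other atom violates it,
    # times the reward term e^{w*y}.
    eta_s = math.exp(w * y) * prod_u

    # W1: exactly one other atom satisfies the factor, the rest violate it.
    w1 = sum(mu_s[k] * math.prod(mu_u[l] for l in range(n) if l != k)
             for k in range(n))

    # W2: at least two other atoms are satisfying or joker (*): total weight
    # minus "all violating" minus "exactly one * or satisfying".
    w2 = (math.prod(u + s for u, s in zip(mu_u, mu_star))
          - sum(mu_star[k] * math.prod(mu_u[l] for l in range(n) if l != k)
                for k in range(n))
          - prod_u)

    # W3: all other atoms violate the factor, times the penalty e^{-w*y}.
    w3 = prod_u * math.exp(-w * y)

    return eta_s, w1 + w2 + w3


def marginals(eta_s_pos, eta_u_pos, eta_s_neg, eta_u_neg, eta_star_all):
    """Normalized (positive, negative, joker) marginals of one atom X_j.

    eta_*_pos / eta_*_neg: messages from the factors in which X_j appears
        positively / negatively; eta_star_all: eta^* from all its factors.
    """
    m_pos = math.prod(eta_s_pos) * math.prod(eta_u_neg)
    m_neg = math.prod(eta_s_neg) * math.prod(eta_u_pos)
    m_star = math.prod(eta_star_all)
    z = m_pos + m_neg + m_star                    # normalizing constant
    return m_pos / z, m_neg / z, m_star / z
```

In a full solver these two steps would alternate with the variable-to-factor updates until convergence, after which the marginals drive the decimation of backbone atoms.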
Ibrahim, MH., Pal, C. & Pesant, G. Leveraging cluster backbones for improving MAP inference in statistical relational models. Ann Math Artif Intell 88, 907–949 (2020). https://doi.org/10.1007/s10472-020-09698-z