research-article
Open access

Structure Learning of H-Colorings

Published: 06 June 2020

Abstract

We study the following structure learning problem for H-colorings. For a fixed (and known) constraint graph H with q colors, given access to uniformly random H-colorings of an unknown graph G = (V, E), how many samples are required to learn the edges of G? We give a characterization of the constraint graphs H for which the problem is identifiable for every G, and show that there are identifiable constraint graphs for which one cannot hope to learn every graph G efficiently. We provide refined results for the case of proper vertex q-colorings of graphs of maximum degree d. In particular, we prove that in the tree uniqueness region (i.e., when q > d), the problem is identifiable and we can learn G in poly(d, q) × O(n² log n) time, where n = |V|. In the tree non-uniqueness region (i.e., when q ≤ d), we show that the problem is not identifiable and thus G cannot be learned. Moreover, when q ≤ d − √d + Θ(1), we establish that even learning an equivalent graph (any graph with the same set of H-colorings) is computationally hard: the sample complexity is exponential in n in the worst case. We further explore the connection between the efficiency/hardness of the structure learning problem and the uniqueness/non-uniqueness phase transition for general H-colorings, and prove that under a well-known uniqueness condition in statistical physics, we can learn G in poly(d, q) × O(n² log n) time.
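
To make the problem statement concrete, here is a minimal Python sketch for the special case of proper vertex q-colorings. It is not the algorithm analyzed in the article: it is a naive baseline that keeps every vertex pair never observed with equal colors, and the function names `proper_colorings`, `learn_edges`, and `demo` are hypothetical. The colorings are enumerated by brute force, which is only feasible for toy graphs.

```python
import itertools
import random


def proper_colorings(n, edges, q):
    """Enumerate all proper q-colorings of a graph on vertices 0..n-1 (brute force)."""
    return [c for c in itertools.product(range(q), repeat=n)
            if all(c[u] != c[v] for u, v in edges)]


def learn_edges(samples, n):
    """Naive learner (an illustrative assumption, not the article's method):
    keep every pair (u, v), u < v, that never receives equal colors in a sample."""
    candidates = set(itertools.combinations(range(n), 2))
    for coloring in samples:
        candidates = {(u, v) for (u, v) in candidates
                      if coloring[u] != coloring[v]}
    return candidates


def demo(num_samples=200, seed=0):
    """Draw uniform proper 3-colorings of a 4-vertex path (d = 2 < q = 3)."""
    rng = random.Random(seed)
    n, q = 4, 3
    true_edges = {(0, 1), (1, 2), (2, 3)}
    colorings = proper_colorings(n, true_edges, q)
    samples = [rng.choice(colorings) for _ in range(num_samples)]
    return true_edges, learn_edges(samples, n)


if __name__ == "__main__":
    true_edges, learned = demo()
    print("true edges:   ", sorted(true_edges))
    print("learned edges:", sorted(learned))
```

The learner's output always contains the true edge set, because adjacent vertices never share a color. The same toy setup also illustrates non-identifiability when q ≤ d: with q = 2, the 4-vertex path and the 4-cycle have exactly the same proper colorings, so no learner can tell them apart from samples, and this baseline returns the cycle for both.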


Cited By

  • (2021) Parameter Estimation for Undirected Graphical Models With Hard Constraints. IEEE Transactions on Information Theory 67:10 (6790-6809). DOI: 10.1109/TIT.2021.3094404. Online publication date: Oct-2021

Published In

ACM Transactions on Algorithms, Volume 16, Issue 3
July 2020
368 pages
ISSN:1549-6325
EISSN:1549-6333
DOI:10.1145/3403658
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 06 June 2020
Online AM: 07 May 2020
Accepted: 01 February 2020
Revised: 01 August 2019
Received: 01 May 2018
Published in TALG Volume 16, Issue 3

Author Tags

  1. H-colorings
  2. Markov random fields
  3. Structure learning
  4. identifiability
  5. spin systems

Qualifiers

  • Research-article
  • Research
  • Refereed

Bibliometrics

  • Downloads (last 12 months): 120
  • Downloads (last 6 weeks): 14

Reflects downloads up to 10 Nov 2024
