research-article
Open access

Structure Learning of H-Colorings

Published: 06 June 2020
  • Abstract

    We study the following structure learning problem for H-colorings. For a fixed (and known) constraint graph H with q colors, given access to uniformly random H-colorings of an unknown graph G=(V,E), how many samples are required to learn the edges of G? We give a characterization of the constraint graphs H for which the problem is identifiable for every G and show that there are identifiable constraint graphs for which one cannot hope to learn every graph G efficiently. We provide refined results for the case of proper vertex q-colorings of graphs of maximum degree d. In particular, we prove that in the tree uniqueness region (i.e., when q > d), the problem is identifiable and we can learn G in poly(d,q) × O(n² log n) time. In the tree non-uniqueness region (i.e., when q ≤ d), we show that the problem is not identifiable and thus G cannot be learned. Moreover, when q ≤ d − √d + Θ(1), we establish that even learning an equivalent graph (any graph with the same set of H-colorings) is computationally hard; the sample complexity is exponential in n in the worst case. We further explore the connection between the efficiency/hardness of the structure learning problem and the uniqueness/non-uniqueness phase transition for general H-colorings and prove that under a well-known uniqueness condition in statistical physics, we can learn G in poly(d,q) × O(n² log n) time.
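
    The problem statement can be made concrete with a small brute-force sketch. This is an illustration only, not the algorithm from the article: the helper names (clique, h_colorings, naive_learn_edges) are hypothetical, the colorings are enumerated exhaustively rather than sampled, and the edge rule (declare {u, v} an edge when no observed coloring gives u and v the same color) is a naive heuristic for proper colorings that is assumed here for demonstration.

```python
from itertools import combinations, product


def clique(q):
    """Constraint graph H = K_q without self-loops (proper q-colorings)."""
    return {(a, b) for a in range(q) for b in range(q) if a != b}


def h_colorings(n, edges, q, allowed):
    """Brute-force list of all H-colorings of the graph on vertices 0..n-1.

    `allowed` is the edge set of H as ordered color pairs.
    """
    return [s for s in product(range(q), repeat=n)
            if all((s[u], s[v]) in allowed for u, v in edges)]


def naive_learn_edges(n, samples):
    """Heuristic for proper colorings (an assumption, not the paper's method):
    guess {u, v} is an edge iff no sample gives u and v the same color."""
    return {(u, v) for u, v in combinations(range(n), 2)
            if all(s[u] != s[v] for s in samples)}


# q = 3 colors on the path 0-1-2 (maximum degree d = 2, so q > d).
path3 = [(0, 1), (1, 2)]
learned = naive_learn_edges(3, h_colorings(3, path3, 3, clique(3)))

# q = 2 colors with d = 2 (so q <= d): the path 0-1-2-3 and the 4-cycle
# have exactly the same proper 2-colorings, so no learner can tell them apart.
path4 = [(0, 1), (1, 2), (2, 3)]
cycle4 = path4 + [(0, 3)]
same = h_colorings(4, path4, 2, clique(2)) == h_colorings(4, cycle4, 2, clique(2))

print(sorted(learned), same)
```

    In the first example the heuristic recovers the path's two edges from the full set of colorings. In the second, the two graphs are indistinguishable from their colorings, which is a toy instance of non-identifiability when q ≤ d.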


    Cited By

    • (2021) Parameter Estimation for Undirected Graphical Models With Hard Constraints. IEEE Transactions on Information Theory 67:10 (6790-6809). DOI: 10.1109/TIT.2021.3094404. Online publication date: Oct-2021

    Published In

    ACM Transactions on Algorithms  Volume 16, Issue 3
    July 2020
    368 pages
    ISSN:1549-6325
    EISSN:1549-6333
    DOI:10.1145/3403658

    Publisher

    Association for Computing Machinery

    New York, NY, United States

    Publication History

    Published: 06 June 2020
    Online AM: 07 May 2020
    Accepted: 01 February 2020
    Revised: 01 August 2019
    Received: 01 May 2018
    Published in TALG Volume 16, Issue 3


    Author Tags

    1. H-colorings
    2. Markov random fields
    3. Structure learning
    4. identifiability
    5. spin systems

    Qualifiers

    • Research-article
    • Research
    • Refereed
