Learning probabilities from random observables in high dimensions: the maximum entropy distribution and others

Obuchi, Tomoyuki; Cocco, Simona; Monasson, Rémi

arXiv:1503.02802v1 (cond-mat)

[Submitted on 10 Mar 2015 (this version), latest version 21 Jul 2015 (v2)]

Abstract: We consider the problem of learning a target probability distribution over a set of $N$ binary variables from the knowledge of the expectation values (under this target distribution) of $M$ observables, drawn uniformly at random. We study the space of all probability distributions compatible with these $M$ expectation values within some fixed accuracy, called the version space. We introduce a biased measure over the version space, which gives a boost increasing exponentially with the entropy of the distributions and with an arbitrary inverse `temperature' $\Gamma$. The choice of $\Gamma$ allows us to interpolate smoothly between the unbiased measure over all distributions in the version space ($\Gamma=0$) and the pointwise measure concentrated at the maximum entropy distribution ($\Gamma \to \infty$). Using the replica method, we compute the volume of the version space and other quantities of interest, such as the distance $R$ between the target distribution and the center-of-mass distribution over the version space, as functions of $\alpha=(\log M)/N$ and $\Gamma$ for large $N$. Phase transitions are found at critical values of $\alpha$, corresponding to qualitative improvements in the learning of the target distribution and to a decrease of the distance $R$. However, for fixed $\alpha$, the distance $R$ does not vary with $\Gamma$, which means that the maximum entropy distribution is not closer to the target distribution than any other distribution compatible with the observable values. Our results are confirmed by Monte Carlo sampling of the version space for small system sizes ($N\le 10$).
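To make the setting concrete, the following is a minimal sketch of the $\Gamma \to \infty$ end point only, the maximum entropy distribution, for a toy system with $N=3$. It is not the authors' code and does not implement the replica calculation or the Monte Carlo sampling of the version space. Several choices are assumptions made purely for illustration: the observables assign independent standard Gaussian values to each configuration, the target distribution is a mildly non-uniform random one, the fit uses plain gradient descent on the convex dual problem, and the Euclidean distance printed at the end is only a stand-in for the paper's $R$. All function and variable names are hypothetical.

```python
import itertools
import math
import random


def expectations(p, observables):
    """Expectation value of each observable under the distribution p."""
    return [sum(pi * oi for pi, oi in zip(p, obs)) for obs in observables]


def gibbs(lambdas, observables, n_conf):
    """Distribution p(s) proportional to exp(sum_mu lambda_mu * O_mu(s))."""
    energies = [sum(l * obs[s] for l, obs in zip(lambdas, observables))
                for s in range(n_conf)]
    e_max = max(energies)  # subtract the maximum for numerical stability
    weights = [math.exp(e - e_max) for e in energies]
    z = sum(weights)
    return [w / z for w in weights]


def entropy(p):
    """Shannon entropy in nats."""
    return -sum(pi * math.log(pi) for pi in p if pi > 0.0)


def fit_max_entropy(targets, observables, n_conf, max_iter=50000, tol=1e-10):
    """Fit the Lagrange multipliers by gradient descent on the convex dual
    log Z(lambda) - lambda . targets (an illustrative choice of optimizer)."""
    # The dual's Hessian is the covariance matrix of the observables; its
    # largest eigenvalue is bounded by the sum of the maximal squared values,
    # so this step size cannot overshoot.
    step = 1.0 / sum(max(o * o for o in obs) for obs in observables)
    lambdas = [0.0] * len(observables)
    for _ in range(max_iter):
        p = gibbs(lambdas, observables, n_conf)
        model = expectations(p, observables)
        grad = [t - m for t, m in zip(targets, model)]
        if max(abs(g) for g in grad) < tol:
            break
        lambdas = [l + step * g for l, g in zip(lambdas, grad)]
    return gibbs(lambdas, observables, n_conf)


random.seed(0)
N, M = 3, 3  # toy sizes, far from the large-N regime of the abstract
configs = list(itertools.product([-1, 1], repeat=N))
n_conf = len(configs)  # 2**N configurations

# Assumed target: a mildly non-uniform random distribution with full support.
raw = [random.uniform(0.5, 1.5) for _ in configs]
target = [r / sum(raw) for r in raw]

# Assumed observables: one independent Gaussian value per configuration.
observables = [[random.gauss(0.0, 1.0) for _ in configs] for _ in range(M)]
targets = expectations(target, observables)

p_maxent = fit_max_entropy(targets, observables, n_conf)
moment_error = max(abs(t - m) for t, m in
                   zip(targets, expectations(p_maxent, observables)))
distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(target, p_maxent)))

print("largest moment mismatch:", moment_error)
print("entropy of target / max-ent:", entropy(target), entropy(p_maxent))
print("Euclidean distance target to max-ent:", distance)
```

The target itself belongs to the version space, so the fitted distribution should have an entropy at least as large as the target's while reproducing the same $M$ expectation values. Whether it is any closer to the target than other members of the version space is the question the abstract addresses.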

Comments:	30 pages, 13 figures
Subjects:	Statistical Mechanics (cond-mat.stat-mech); Methodology (stat.ME)
Cite as:	arXiv:1503.02802 [cond-mat.stat-mech]
	(or arXiv:1503.02802v1 [cond-mat.stat-mech] for this version)
	https://doi.org/10.48550/arXiv.1503.02802

Submission history

From: Tomoyuki Obuchi
[v1] Tue, 10 Mar 2015 08:02:06 UTC (2,295 KB)
[v2] Tue, 21 Jul 2015 09:17:34 UTC (2,087 KB)
