DOI: 10.5555/3327144.3327260

Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes

Published: 03 December 2018

Abstract

We prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in ℝ^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound.
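For reference, the two bounds can be written out explicitly in LaTeX (the Θ̃ and Õ notation hides polylogarithmic factors; the symbol n for the number of samples and the subscripts are introduced here only for readability):

\[
  n_{\text{general}}(k, d, \varepsilon) \;=\; \widetilde{\Theta}\!\left(\frac{k d^{2}}{\varepsilon^{2}}\right),
  \qquad
  n_{\text{axis-aligned}}(k, d, \varepsilon) \;=\; \widetilde{O}\!\left(\frac{k d}{\varepsilon^{2}}\right).
\]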
The upper bound is based on a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in ℝ^d admits an efficient sample compression scheme.
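The compression recipe above (retain a small subset of the samples, decode it into a candidate distribution, then select among the candidates) can be illustrated with a short toy sketch. Everything below is illustrative rather than the paper's construction: the class is a single 1-D Gaussian, the two-sample decoder is hypothetical, and candidate selection uses held-out log-likelihood instead of the hypothesis-selection step that underlies the actual guarantees.

# Toy sketch of learning via a sample compression scheme (illustrative only).
# A "decoder" reconstructs a candidate distribution from a small subset of the
# samples; the learner enumerates all such subsets and keeps the best candidate.
import itertools
import math
import random

def decode(x1: float, x2: float):
    """Hypothetical decoder: map two retained samples to a candidate Gaussian."""
    mu = x1                              # first sample as a rough location
    sigma = max(abs(x2 - x1), 1e-6)      # gap as a rough scale; avoid sigma = 0
    return mu, sigma

def avg_log_likelihood(data, mu, sigma):
    """Average Gaussian log-density of `data` under N(mu, sigma^2)."""
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)
               - 0.5 * math.log(2 * math.pi) for x in data) / len(data)

def learn_gaussian(samples, holdout):
    """Enumerate every 2-sample 'compression', decode it, keep the best candidate."""
    best, best_score = None, -math.inf
    for x1, x2 in itertools.permutations(samples, 2):
        mu, sigma = decode(x1, x2)
        score = avg_log_likelihood(holdout, mu, sigma)
        if score > best_score:
            best, best_score = (mu, sigma), score
    return best

if __name__ == "__main__":
    random.seed(0)
    true_mu, true_sigma = 2.0, 3.0
    data = [random.gauss(true_mu, true_sigma) for _ in range(60)]
    samples, holdout = data[:20], data[20:]
    mu_hat, sigma_hat = learn_gaussian(samples, holdout)
    print(f"estimate: mean {mu_hat:.2f}, std {sigma_hat:.2f} "
          f"(truth: {true_mu}, {true_sigma})")

The closure property mentioned in the abstract would correspond, in this toy picture, to building decoders for products and mixtures out of the decoders for the component class.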


Cited By

• (2022) Robust model selection and nearly-proper learning for GMMs. Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 22830-22843. DOI: 10.5555/3600270.3601929. Online publication date: 28-Nov-2022.

Published In

NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems
December 2018
11021 pages

Publisher

Curran Associates Inc., Red Hook, NY, United States

