DOI: 10.5555/3327144.3327260

Nearly tight sample complexity bounds for learning mixtures of Gaussians via sample compression schemes

Published: 03 December 2018

Abstract

We prove that Θ̃(kd²/ε²) samples are necessary and sufficient for learning a mixture of k Gaussians in ℝ^d, up to error ε in total variation distance. This improves both the known upper bounds and lower bounds for this problem. For mixtures of axis-aligned Gaussians, we show that Õ(kd/ε²) samples suffice, matching a known lower bound.
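For reference, the two bounds can be written out explicitly in LaTeX (the Θ̃ and Õ notation hides polylogarithmic factors; the symbol n for the number of samples and the subscripts are introduced here only for readability):

\[
  n_{\text{general}}(k, d, \varepsilon) \;=\; \widetilde{\Theta}\!\left(\frac{k d^{2}}{\varepsilon^{2}}\right),
  \qquad
  n_{\text{axis-aligned}}(k, d, \varepsilon) \;=\; \widetilde{O}\!\left(\frac{k d}{\varepsilon^{2}}\right).
\]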
The upper bound is based on a novel technique for distribution learning based on a notion of sample compression. Any class of distributions that allows such a sample compression scheme can also be learned with few samples. Moreover, if a class of distributions has such a compression scheme, then so do the classes of products and mixtures of those distributions. The core of our main result is showing that the class of Gaussians in ℝ^d admits an efficient sample compression scheme.
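The compression recipe above (retain a small subset of the samples, decode it into a candidate distribution, then select among the candidates) can be illustrated with a short toy sketch. Everything below is illustrative rather than the paper's construction: the class is a single 1-D Gaussian, the two-sample decoder is hypothetical, and candidate selection uses held-out log-likelihood instead of the hypothesis-selection step that underlies the actual guarantees.

# Toy sketch of learning via a sample compression scheme (illustrative only).
# A "decoder" reconstructs a candidate distribution from a small subset of the
# samples; the learner enumerates all such subsets and keeps the best candidate.
import itertools
import math
import random

def decode(x1: float, x2: float):
    """Hypothetical decoder: map two retained samples to a candidate Gaussian."""
    mu = x1                              # first sample as a rough location
    sigma = max(abs(x2 - x1), 1e-6)      # gap as a rough scale; avoid sigma = 0
    return mu, sigma

def avg_log_likelihood(data, mu, sigma):
    """Average Gaussian log-density of `data` under N(mu, sigma^2)."""
    return sum(-0.5 * ((x - mu) / sigma) ** 2 - math.log(sigma)
               - 0.5 * math.log(2 * math.pi) for x in data) / len(data)

def learn_gaussian(samples, holdout):
    """Enumerate every 2-sample 'compression', decode it, keep the best candidate."""
    best, best_score = None, -math.inf
    for x1, x2 in itertools.permutations(samples, 2):
        mu, sigma = decode(x1, x2)
        score = avg_log_likelihood(holdout, mu, sigma)
        if score > best_score:
            best, best_score = (mu, sigma), score
    return best

if __name__ == "__main__":
    random.seed(0)
    true_mu, true_sigma = 2.0, 3.0
    data = [random.gauss(true_mu, true_sigma) for _ in range(60)]
    samples, holdout = data[:20], data[20:]
    mu_hat, sigma_hat = learn_gaussian(samples, holdout)
    print(f"estimate: mean {mu_hat:.2f}, std {sigma_hat:.2f} "
          f"(truth: {true_mu}, {true_sigma})")

The closure property mentioned in the abstract would correspond, in this toy picture, to building decoders for products and mixtures out of the decoders for the component class.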


Cited By

• (2022) Robust model selection and nearly-proper learning for GMMs. Proceedings of the 36th International Conference on Neural Information Processing Systems, pages 22830-22843. DOI: 10.5555/3600270.3601929. Online publication date: 28-Nov-2022.

Published In

NIPS'18: Proceedings of the 32nd International Conference on Neural Information Processing Systems
December 2018
11021 pages

Publisher

Curran Associates Inc., Red Hook, NY, United States

