DOI: 10.5555/3157096.3157313

Minimax estimation of maximum mean discrepancy with radial kernels

Published: 05 December 2016

Abstract

Maximum Mean Discrepancy (MMD) is a distance on the space of probability measures which has found numerous applications in machine learning and nonparametric testing. This distance is based on the notion of embedding probabilities in a reproducing kernel Hilbert space. In this paper, we present the first known lower bounds for the estimation of MMD based on finite samples. Our lower bounds hold for any radial universal kernel on ℝ^d and match the existing upper bounds up to constants that depend only on the properties of the kernel. Using these lower bounds, we establish the minimax rate optimality of the empirical estimator and its U-statistic variant, which are usually employed in applications.
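
For readers unfamiliar with the two estimators named in the abstract, the following is a minimal sketch of how they are commonly computed from two finite samples. It assumes a Gaussian kernel as one example of a radial kernel, and it uses the standard forms from the kernel two-sample test literature: the plug-in estimate keeps the diagonal terms k(x_i, x_i), and the U-statistic variant drops them. The names `gaussian_kernel`, `mmd_squared`, and `bandwidth` are illustrative choices and do not come from the paper, whose exact definitions may differ in detail.

```python
import math
import random


def gaussian_kernel(x, y, bandwidth=1.0):
    """Gaussian (radial) kernel k(x, y) = exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * bandwidth ** 2))


def mmd_squared(xs, ys, kernel=gaussian_kernel, unbiased=False):
    """Estimate the squared MMD between two samples of points (tuples of floats).

    unbiased=False: plug-in estimate, diagonal terms k(x_i, x_i) included.
    unbiased=True:  U-statistic variant, diagonal terms excluded.
    """
    m, n = len(xs), len(ys)
    if unbiased and (m < 2 or n < 2):
        raise ValueError("the U-statistic variant needs at least two points per sample")

    def within(points):
        # Average kernel value over pairs drawn from a single sample.
        size = len(points)
        total = 0.0
        for i, a in enumerate(points):
            for j, b in enumerate(points):
                if unbiased and i == j:
                    continue  # skip diagonal terms for the U-statistic
                total += kernel(a, b)
        pairs = size * (size - 1) if unbiased else size * size
        return total / pairs

    # Average kernel value over all cross-sample pairs.
    cross = sum(kernel(a, b) for a in xs for b in ys) / (m * n)
    return within(xs) + within(ys) - 2.0 * cross


# Illustrative usage on synthetic one-dimensional data (assumed, not from the paper).
rng = random.Random(0)
sample_p = [(rng.gauss(0.0, 1.0),) for _ in range(50)]
sample_q = [(rng.gauss(1.0, 1.0),) for _ in range(50)]
plug_in_mmd = math.sqrt(mmd_squared(sample_p, sample_q))
u_stat_mmd2 = mmd_squared(sample_p, sample_q, unbiased=True)
```

The plug-in estimate of the squared MMD is never negative for a positive definite kernel, so its square root is well defined. The U-statistic variant can be slightly negative when the two distributions are close, which is why the sketch leaves it squared.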

Published In

NIPS'16: Proceedings of the 30th International Conference on Neural Information Processing Systems
December 2016
5100 pages

Publisher

Curran Associates Inc., Red Hook, NY, United States
