Consistent, Two-Stage Sampled Distribution Regression via Mean Embedding

Szabo, Zoltan; Gretton, Arthur; Poczos, Barnabas; Sriperumbudur, Bharath

arXiv:1402.1754v2 (math)

[Submitted on 7 Feb 2014 (v1), revised 21 Apr 2014 (this version, v2), latest version 26 Jan 2015 (v6)]

Abstract: We study the distribution regression problem: regressing to a real-valued response from a probability distribution. Because of the inherent two-stage sampled difficulty of this important machine learning problem (in practice we only have samples from sampled distributions), very little is known about its theoretical properties. In this paper, we propose an algorithmically simple approach to the distribution regression problem: embed the distributions into a reproducing kernel Hilbert space, and learn a ridge regressor from the embeddings to the outputs. Our main contribution is to prove that this technique is consistent in the two-stage sampled setting under fairly mild conditions (for probability distributions on locally compact Polish spaces on which kernels have been defined). The method gives state-of-the-art results on (i) supervised entropy learning and (ii) the prediction of aerosol optical depth based on satellite images.
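
The two steps named in the abstract (mean embedding, then ridge regression on the embeddings) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration and is not taken from the paper or its released code: the Gaussian kernel on the inputs, the linear kernel between embeddings, the bandwidth and regularization values, the function names, and the toy task of predicting the mean of a one-dimensional Gaussian from a sample drawn from it.

```python
import numpy as np


def gaussian_gram(X, Y, bandwidth=1.0):
    """Pairwise Gaussian kernel values exp(-||x - y||^2 / (2 * bandwidth^2))."""
    sq_dists = (
        np.sum(X**2, axis=1)[:, None]
        + np.sum(Y**2, axis=1)[None, :]
        - 2.0 * X @ Y.T
    )
    # Clip tiny negative values caused by floating-point round-off.
    return np.exp(-np.maximum(sq_dists, 0.0) / (2.0 * bandwidth**2))


def embedding_inner_product(bag_a, bag_b, bandwidth=1.0):
    """Inner product of the empirical mean embeddings of two samples ("bags").

    With a linear kernel on the embeddings (an assumption of this sketch),
    this is the average of k(x, x') over all pairs x in bag_a, x' in bag_b.
    """
    return gaussian_gram(bag_a, bag_b, bandwidth).mean()


def fit_ridge(bags, y, lam=1e-3, bandwidth=1.0):
    """Kernel ridge regression from embedded bags to real-valued responses."""
    n = len(bags)
    G = np.array(
        [[embedding_inner_product(a, b, bandwidth) for b in bags] for a in bags]
    )
    # Minimizer of (1/n) * sum (f(mu_i) - y_i)^2 + lam * ||f||^2.
    alpha = np.linalg.solve(G + n * lam * np.eye(n), y)
    return alpha, G


def predict(new_bag, bags, alpha, bandwidth=1.0):
    """Predict the response for a new bag from the fitted coefficients."""
    k_new = np.array(
        [embedding_inner_product(new_bag, b, bandwidth) for b in bags]
    )
    return float(k_new @ alpha)


# Hypothetical toy data: each bag is a sample from N(m, 1); the response is m.
rng = np.random.default_rng(0)
means = rng.uniform(-2.0, 2.0, size=30)
bags = [rng.normal(m, 1.0, size=(50, 1)) for m in means]

alpha, G = fit_ridge(bags, means)
test_bag = rng.normal(0.5, 1.0, size=(50, 1))
y_hat = predict(test_bag, bags, alpha)
```

The two-stage sampling shows up directly in the sketch: the regressor never sees the underlying distributions, only the finite bags, so each entry of `G` is itself an estimate. The consistency result described in the abstract concerns this setting; the sketch makes no claim about matching the paper's experimental setup.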

Comments:	Case of Hölder continuous kernels (K): added. A typo on kernel bounds: corrected. Code: made available (this https URL)
Subjects:	Statistics Theory (math.ST); Machine Learning (cs.LG); Functional Analysis (math.FA); Machine Learning (stat.ML)
MSC classes:	62G08, 46E22, 47B32
ACM classes:	G.3; I.2.6
Cite as:	arXiv:1402.1754 [math.ST]
	(or arXiv:1402.1754v2 [math.ST] for this version)
	https://doi.org/10.48550/arXiv.1402.1754

Submission history

From: Zoltan Szabo [view email]
[v1] Fri, 7 Feb 2014 20:37:59 UTC (58 KB)
[v2] Mon, 21 Apr 2014 11:35:58 UTC (62 KB)
[v3] Sun, 4 May 2014 19:29:36 UTC (36 KB)
[v4] Sat, 7 Jun 2014 17:42:06 UTC (51 KB)
[v5] Sat, 25 Oct 2014 21:03:01 UTC (57 KB)
[v6] Mon, 26 Jan 2015 22:20:59 UTC (57 KB)
