Nonparametric Stochastic Contextual Bandits

Guan, Melody Y.; Jiang, Heinrich

arXiv:1801.01750 (cs)

[Submitted on 5 Jan 2018]

Abstract: We analyze the $K$-armed bandit problem where the reward for each arm is a noisy realization based on an observed context under mild nonparametric assumptions. We attain tight results for top-arm identification and a sublinear regret of $\widetilde{O}\Big(T^{\frac{1+D}{2+D}}\Big)$, where $D$ is the context dimension, for a modified UCB algorithm that is simple to implement ($k$NN-UCB). We then give global intrinsic dimension dependent and ambient dimension independent regret bounds. We also discuss recovering topological structures within the context space based on expected bandit performance and provide an extension to infinite-armed contextual bandits. Finally, we experimentally show the improvement of our algorithm over existing multi-armed bandit approaches for both simulated tasks and MNIST image classification.
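
As a rough illustration of the idea behind a $k$-nearest-neighbor UCB rule, the following Python sketch scores each arm by the mean reward of its $k$ nearest past contexts plus an exploration bonus. The abstract does not give the index used by $k$NN-UCB, so everything here is an assumption made for illustration: the class name `KnnUcbSketch`, the parameters `k` and `bonus_scale`, and the form of the bonus (a $1/\sqrt{k}$ term plus the $k$-NN radius). It should not be read as the authors' algorithm. The helper `regret_exponent` only evaluates the exponent $\frac{1+D}{2+D}$ stated in the abstract.

```python
import numpy as np


def regret_exponent(D):
    """Exponent of T in the regret bound quoted in the abstract: (1 + D) / (2 + D)."""
    return (1 + D) / (2 + D)


class KnnUcbSketch:
    """Illustrative k-NN UCB contextual bandit (hypothetical, not the paper's exact rule)."""

    def __init__(self, n_arms, k=5, bonus_scale=1.0):
        self.n_arms = n_arms
        self.k = k                      # number of neighbors (assumed hyperparameter)
        self.bonus_scale = bonus_scale  # scale of the 1/sqrt(k) bonus (assumed)
        self.contexts = [[] for _ in range(n_arms)]
        self.rewards = [[] for _ in range(n_arms)]

    def _index(self, arm, x):
        n = len(self.rewards[arm])
        if n == 0:
            return np.inf               # force at least one pull of every arm
        X = np.asarray(self.contexts[arm])
        r = np.asarray(self.rewards[arm])
        dists = np.linalg.norm(X - x, axis=1)
        k_eff = min(self.k, n)
        nearest = np.argsort(dists)[:k_eff]
        mean = r[nearest].mean()
        # Assumed bonus: shrinks with more neighbors and with a smaller k-NN radius.
        bonus = self.bonus_scale / np.sqrt(k_eff) + dists[nearest].max()
        return mean + bonus

    def select(self, x):
        """Return the arm with the largest optimistic index at context x."""
        x = np.atleast_1d(np.asarray(x, dtype=float))
        scores = [self._index(a, x) for a in range(self.n_arms)]
        return int(np.argmax(scores))

    def update(self, arm, x, reward):
        """Record the observed reward for the pulled arm at context x."""
        self.contexts[arm].append(np.atleast_1d(np.asarray(x, dtype=float)))
        self.rewards[arm].append(float(reward))
```

In use, each round would call `select(x)` on the observed context, pull the returned arm, and pass the observed reward to `update`. For intuition about the stated rate, `regret_exponent(D)` tends to 1 as $D$ grows, so the bound approaches linear regret in high ambient dimension, which is consistent with the abstract's interest in intrinsic-dimension-dependent bounds.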

Comments: AAAI 2018
Subjects: Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as: arXiv:1801.01750 [cs.LG] (or arXiv:1801.01750v1 [cs.LG] for this version)
DOI: https://doi.org/10.48550/arXiv.1801.01750

Submission history

From: Melody Guan
[v1] Fri, 5 Jan 2018 13:27:42 UTC (1,617 KB)
