Neural Contextual Bandits with Deep Representation and Shallow Exploration

Xu, Pan; Wen, Zheng; Zhao, Handong; Gu, Quanquan

arXiv:2012.01780 (cs)

[Submitted on 3 Dec 2020]

Abstract: We study a general class of contextual bandits, where each context-action pair is associated with a raw feature vector, but the reward generating function is unknown. We propose a novel learning algorithm that transforms the raw feature vector using the last hidden layer of a deep ReLU neural network (deep representation learning), and uses an upper confidence bound (UCB) approach to explore in the last linear layer (shallow exploration). We prove that under standard assumptions, our proposed algorithm achieves $\tilde{O}(\sqrt{T})$ finite-time regret, where $T$ is the learning time horizon. Compared with existing neural contextual bandit algorithms, our approach is much more computationally efficient, since it only needs to explore in the last layer of the deep neural network.
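The "deep representation, shallow exploration" idea in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes the hidden ReLU layers are frozen at random values (no representation training is shown), and it uses a standard ridge-regression UCB rule on the last-hidden-layer features. The names `relu_features`, `ShallowUCB`, `run`, the parameters `lam` and `alpha`, and the synthetic `tanh` reward are all hypothetical choices made for illustration.

```python
import numpy as np


def relu_features(x, weights):
    """Map a raw feature vector to the output of the last hidden ReLU layer."""
    h = x
    for W in weights:
        h = np.maximum(W @ h, 0.0)
    return h


class ShallowUCB:
    """UCB exploration over the last linear layer only (assumed LinUCB-style rule)."""

    def __init__(self, dim, lam=1.0, alpha=1.0):
        self.A = lam * np.eye(dim)   # regularized Gram matrix of chosen features
        self.b = np.zeros(dim)       # sum of reward-weighted features
        self.alpha = alpha           # exploration weight (hypothetical constant)

    def ucb(self, phi):
        A_inv = np.linalg.inv(self.A)
        theta = A_inv @ self.b                      # ridge estimate of last-layer weights
        bonus = np.sqrt(phi @ A_inv @ phi)          # confidence width in feature space
        return float(phi @ theta + self.alpha * bonus)

    def select(self, phis):
        return int(np.argmax([self.ucb(p) for p in phis]))

    def update(self, phi, reward):
        self.A += np.outer(phi, phi)
        self.b += reward * phi


def run(T=200, K=4, d=5, width=16, seed=0):
    """Play T rounds with K actions on a synthetic reward (illustrative only)."""
    rng = np.random.default_rng(seed)
    # Frozen random hidden layers: an assumption of this sketch.
    weights = [rng.normal(size=(width, d)) / np.sqrt(d),
               rng.normal(size=(width, width)) / np.sqrt(width)]
    policy = ShallowUCB(width)
    total = 0.0
    for _ in range(T):
        contexts = rng.normal(size=(K, d))          # one raw feature vector per action
        phis = [relu_features(x, weights) for x in contexts]
        a = policy.select(phis)
        reward = np.tanh(contexts[a].sum()) + 0.1 * rng.normal()
        policy.update(phis[a], reward)
        total += reward
    return total, policy
```

Calling `run()` returns the cumulative reward and the fitted policy. The per-round cost of exploration depends on the width of the last hidden layer rather than on the total number of network parameters, which is the efficiency argument the abstract makes.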

Comments:	28 pages, 1 figure, 1 table
Subjects:	Machine Learning (cs.LG); Machine Learning (stat.ML)
Cite as:	arXiv:2012.01780 [cs.LG]
	(or arXiv:2012.01780v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2012.01780

Submission history

From: Quanquan Gu
[v1] Thu, 3 Dec 2020 09:17:55 UTC (57 KB)
