Indexed Minimum Empirical Divergence-Based Algorithms for Linear Bandits

Bian, Jie; Tan, Vincent Y. F.

arXiv:2405.15200v1 (cs)

[Submitted on 24 May 2024]

Abstract: The Indexed Minimum Empirical Divergence (IMED) algorithm is a highly effective approach that offers a stronger theoretical guarantee of asymptotic optimality than the Kullback-Leibler Upper Confidence Bound (KL-UCB) algorithm for the multi-armed bandit problem. It has also been observed to empirically outperform UCB-based algorithms and Thompson Sampling. Despite its effectiveness, the generalization of this algorithm to contextual bandits with linear payoffs has remained elusive. In this paper, we present novel linear versions of the IMED algorithm, which we call the family of LinIMED algorithms. We demonstrate that LinIMED provides a $\widetilde{O}(d\sqrt{T})$ regret upper bound, where $d$ is the dimension of the context and $T$ is the time horizon. Furthermore, extensive empirical studies reveal that LinIMED and its variants outperform widely used linear bandit algorithms such as LinUCB and Linear Thompson Sampling in some regimes.
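
For readers unfamiliar with the IMED index that the abstract builds on, the following is a minimal sketch of the classical (non-linear) IMED rule for a multi-armed bandit, assuming Bernoulli rewards. It is not the LinIMED algorithm of this paper. Each arm with pull count N and empirical mean mu gets the index N * KL(mu, mu_best) + log N, where mu_best is the largest empirical mean, and the arm with the smallest index is pulled. The function names, the clipping constant `eps`, and the simulation setup are illustrative assumptions.

```python
import math
import random


def bernoulli_kl(p, q, eps=1e-12):
    """KL divergence between Bernoulli(p) and Bernoulli(q), clipped away from 0 and 1."""
    p = min(max(p, eps), 1 - eps)
    q = min(max(q, eps), 1 - eps)
    return p * math.log(p / q) + (1 - p) * math.log((1 - p) / (1 - q))


def imed_indices(counts, means):
    """IMED index per arm: N * KL(mean, best empirical mean) + log N."""
    best = max(means)
    return [n * bernoulli_kl(m, best) + math.log(n) for n, m in zip(counts, means)]


def run_imed(true_means, horizon, seed=0):
    """Simulate IMED on a Bernoulli bandit (hypothetical setup); return pull counts."""
    rng = random.Random(seed)
    k = len(true_means)
    counts = [0] * k
    sums = [0.0] * k
    for t in range(horizon):
        if t < k:
            arm = t  # pull every arm once so that all counts are positive
        else:
            means = [s / n for s, n in zip(sums, counts)]
            indices = imed_indices(counts, means)
            arm = min(range(k), key=indices.__getitem__)  # smallest index wins
        reward = 1.0 if rng.random() < true_means[arm] else 0.0
        counts[arm] += 1
        sums[arm] += reward
    return counts
```

For the empirically best arm the divergence term vanishes, so its index is just log N. A suboptimal arm is therefore pulled again only once the logarithm of the best arm's count exceeds that arm's accumulated divergence term.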

Comments:	Accepted to the Transactions on Machine Learning Research (TMLR)
Subjects:	Machine Learning (cs.LG); Information Theory (cs.IT)
Cite as:	arXiv:2405.15200 [cs.LG]
	(or arXiv:2405.15200v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.2405.15200

Submission history

From: Vincent Tan
[v1] Fri, 24 May 2024 04:11:58 UTC (7,049 KB)
