Make the Minority Great Again: First-Order Regret Bound for Contextual Bandits

Allen-Zhu, Zeyuan; Bubeck, Sébastien; Li, Yuanzhi


arXiv:1802.03386 (cs)

[Submitted on 9 Feb 2018]


Abstract: Regret bounds in online learning compare the player's performance to $L^*$, the optimal performance in hindsight with a fixed strategy. Typically such bounds scale with the square root of the time horizon $T$. The more refined concept of a first-order regret bound replaces this with a scaling of $\sqrt{L^*}$, which may be much smaller than $\sqrt{T}$. It is well known that minor variants of standard algorithms satisfy first-order regret bounds in the full-information and multi-armed bandit settings. In a COLT 2017 open problem, Agarwal, Krishnamurthy, Langford, Luo, and Schapire raised the issue that existing techniques do not seem sufficient to obtain first-order regret bounds for the contextual bandit problem. In the present paper, we resolve this open problem by presenting a new strategy based on augmenting the policy space.
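
For orientation, the quantities named in the abstract can be written out as below. This is only a sketch in standard online-learning notation, and the notation is an assumption here rather than something taken from the paper: $\ell_t(a) \in [0,1]$ is the loss of action $a$ in round $t$, $x_t$ is the context, $a_t$ is the action the player picks, and $\Pi$ is the policy class. Dependence on the number of actions and on the size of $\Pi$ is deliberately suppressed.

```latex
% Assumed notation (not from the paper): losses \ell_t(a) \in [0,1],
% contexts x_t, player's actions a_t, policy class \Pi.
\begin{gather*}
L_T = \sum_{t=1}^{T} \ell_t(a_t), \qquad
L^* = \min_{\pi \in \Pi} \sum_{t=1}^{T} \ell_t\bigl(\pi(x_t)\bigr), \\
\text{typical bound:}\quad \mathbb{E}[L_T] - L^* = O\bigl(\sqrt{T}\bigr), \\
\text{first-order bound:}\quad \mathbb{E}[L_T] - L^* = O\bigl(\sqrt{L^*}\bigr)
\ \text{(up to lower-order terms)}.
\end{gather*}
```

Under the assumption that losses lie in $[0,1]$, we always have $L^* \le T$, so the first-order form is never worse in its leading term. As a purely illustrative calculation, if $T = 10^6$ but the best policy incurs only $L^* = 100$, then $\sqrt{T} = 1000$ while $\sqrt{L^*} = 10$.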

Comments:	15 pages
Subjects:	Machine Learning (cs.LG)
Cite as:	arXiv:1802.03386 [cs.LG]
	(or arXiv:1802.03386v1 [cs.LG] for this version)
	https://doi.org/10.48550/arXiv.1802.03386

Submission history

From: Sebastien Bubeck
[v1] Fri, 9 Feb 2018 18:47:21 UTC (17 KB)
