Online Reinforcement Learning in Stochastic Games

Wei, Chen-Yu; Hong, Yi-Te; Lu, Chi-Jen

arXiv:1712.00579 (cs)

[Submitted on 2 Dec 2017]

Abstract: We study online reinforcement learning in average-reward stochastic games (SGs). An SG models a two-player zero-sum game in a Markov environment, where state transitions and one-step payoffs are determined simultaneously by a learner and an adversary. We propose the UCSG algorithm, which achieves sublinear regret compared to the game value when competing with an arbitrary opponent. This improves on previous results under the same setting. The regret bound depends on the diameter, an intrinsic value related to the mixing property of SGs. If we let the opponent play an optimistic best response to the learner, UCSG finds an $\varepsilon$-maximin stationary policy with a sample complexity of $\tilde{\mathcal{O}}\left(\text{poly}(1/\varepsilon)\right)$, where $\varepsilon$ is the gap to the best policy.
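To make "regret compared to the game value" concrete, the following is a minimal sketch for the simplest special case: a single-state zero-sum game, i.e. a matrix game. It is not the UCSG algorithm and does not model state transitions or average-reward dynamics. The payoff matrix `PAYOFF`, the helper names `maximin_2xn` and `regret_vs_value`, and the fixed opponent sequence are all hypothetical choices made for illustration.

```python
import random

def maximin_2xn(payoff, grid=1000):
    """Approximate the learner's maximin mixed strategy for a 2-row
    payoff matrix by grid search over p = P(learner plays row 0)."""
    best_p, best_val = 0.0, float("-inf")
    n_cols = len(payoff[0])
    for k in range(grid + 1):
        p = k / grid
        # Worst-case expected payoff over the opponent's pure actions.
        worst = min(p * payoff[0][j] + (1 - p) * payoff[1][j]
                    for j in range(n_cols))
        if worst > best_val:
            best_p, best_val = p, worst
    return best_p, best_val

def regret_vs_value(payoff, p, value, opponent_actions, seed=0):
    """Regret after T rounds: T * (game value) minus the learner's total reward."""
    rng = random.Random(seed)
    total = 0.0
    for j in opponent_actions:
        i = 0 if rng.random() < p else 1  # sample the learner's row
        total += payoff[i][j]
    return len(opponent_actions) * value - total

# Hypothetical payoff matrix (learner's reward in [0, 1]); illustration only.
PAYOFF = [[1.0, 0.0],
          [0.0, 1.0]]

p_star, game_value = maximin_2xn(PAYOFF)
T = 10_000
opponent = [0] * T  # an arbitrary opponent; here it always plays column 0
regret = regret_vs_value(PAYOFF, p_star, game_value, opponent)
print(p_star, game_value, regret / T)
```

In this toy game the maximin policy guarantees the game value in expectation against any opponent, so the per-round regret stays near zero. In the setting of the abstract the learner does not know the game in advance and must learn such a policy online, which is where a sublinear regret bound becomes meaningful.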

Subjects: Machine Learning (cs.LG)
Cite as: arXiv:1712.00579 [cs.LG]
(or arXiv:1712.00579v1 [cs.LG] for this version)
https://doi.org/10.48550/arXiv.1712.00579

Submission history

From: Chen-Yu Wei
[v1] Sat, 2 Dec 2017 09:46:33 UTC (51 KB)
