Finite-time regret of Thompson sampling algorithms for exponential family multi-armed bandits
Supplementary Material (472.70 KB)
Recommendations
Thompson sampling for budgeted multi-armed bandits
IJCAI'15: Proceedings of the 24th International Conference on Artificial Intelligence. Thompson sampling is one of the earliest randomized algorithms for multi-armed bandits (MAB). In this paper, we extend Thompson sampling to budgeted MAB, where there is a random cost for pulling an arm and the total cost is constrained by a budget. We ...
Thompson Sampling for Dynamic Multi-armed Bandits
ICMLA '11: Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops, Volume 01. The importance of multi-armed bandit (MAB) problems is on the rise due to their recent application in a large variety of areas such as online advertising, news article selection, wireless networks, and medicinal trials, to name a few. The most common ...
Double Thompson sampling for dueling bandits
NIPS'16: Proceedings of the 30th International Conference on Neural Information Processing Systems. In this paper, we propose a Double Thompson Sampling (D-TS) algorithm for dueling bandit problems. As its name suggests, D-TS selects both the first and the second candidates according to Thompson Sampling. Specifically, D-TS maintains a posterior ...
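The papers above all build on the same core routine: maintain a posterior over each arm's mean reward, sample a mean from every posterior, and pull the arm with the largest sample. A minimal Beta-Bernoulli sketch of this routine is below; the function name and parameters are illustrative and do not come from any of the papers listed here.

```python
import random

def thompson_sampling_bernoulli(arms, horizon, seed=0):
    """Beta-Bernoulli Thompson sampling (illustrative sketch).

    arms: true success probabilities of each Bernoulli arm (for simulation).
    horizon: number of rounds to play.
    Returns total reward and the per-arm success/failure counts.
    """
    rng = random.Random(seed)
    k = len(arms)
    successes = [0] * k  # arm i's posterior is Beta(1 + successes[i], 1 + failures[i])
    failures = [0] * k
    total_reward = 0
    for _ in range(horizon):
        # Sample one mean per arm from its Beta posterior, play the argmax.
        samples = [rng.betavariate(1 + successes[i], 1 + failures[i]) for i in range(k)]
        arm = max(range(k), key=lambda i: samples[i])
        # Simulate the Bernoulli reward and update that arm's posterior.
        reward = 1 if rng.random() < arms[arm] else 0
        successes[arm] += reward
        failures[arm] += 1 - reward
        total_reward += reward
    return total_reward, successes, failures

# Usage: two arms with (assumed) true means 0.3 and 0.7.
reward, s, f = thompson_sampling_bernoulli([0.3, 0.7], horizon=1000)
```

The budgeted and dueling variants in the recommended papers change what is sampled and how arms are compared, but keep this sample-then-argmax structure.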
Published In
- Editors:
- S. Koyejo,
- S. Mohamed,
- A. Agarwal,
- D. Belgrave,
- K. Cho,
- A. Oh
Publisher
Curran Associates Inc.
Red Hook, NY, United States
Qualifiers
- Research-article
- Research
- Refereed limited