DOI: 10.1109/ICMLA.2011.144
Article

Thompson Sampling for Dynamic Multi-armed Bandits

Published: 18 December 2011

Abstract

The importance of multi-armed bandit (MAB) problems is on the rise due to their recent application in a large variety of areas such as online advertising, news article selection, wireless networks, and medicinal trials, to name a few. The most common assumption made when solving such MAB problems is that the unknown reward probability $\theta_k$ of each bandit arm $k$ is fixed. However, this assumption rarely holds in practice simply because real-life problems often involve underlying processes that are dynamically evolving. In this paper, we model problems where reward probabilities $\theta_k$ are drifting, and introduce a new method called Dynamic Thompson Sampling (DTS) that facilitates Order Statistics based Thompson Sampling for these dynamically evolving MABs. The DTS algorithm adapts its success probability estimates, $\hat{\theta}_k$, faster than traditional Thompson Sampling schemes and thus leads to improved performance in terms of lower regret. Extensive experiments demonstrate that DTS outperforms current state-of-the-art approaches, namely pure Thompson Sampling, UCB-Normal and UCB_f, for the case of dynamic reward probabilities. Furthermore, this performance advantage increases persistently with the number of bandit arms.
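
The abstract does not spell out the DTS update rule, so the following is only a minimal sketch of the general idea it describes: Thompson Sampling whose estimates can keep tracking a drifting $\theta_k$. It assumes Bernoulli rewards and Beta posteriors, and it achieves faster adaptation by capping the effective sample size $\alpha + \beta$ of each posterior. The class name `CappedBetaArm`, the function `run_bandit`, and the parameter `cap` are hypothetical names chosen for illustration, not taken from the paper.

```python
import random


class CappedBetaArm:
    """Beta(alpha, beta) posterior for one arm with a capped sample size.

    Assumption for illustration: once alpha + beta exceeds `cap`, both
    parameters are rescaled so that old observations are gradually
    forgotten. With cap = float("inf") this is plain Thompson Sampling.
    """

    def __init__(self, cap=100.0):
        self.alpha = 1.0  # prior pseudo-count of successes
        self.beta = 1.0   # prior pseudo-count of failures
        self.cap = cap

    def sample(self, rng):
        # Draw one plausible success probability from the posterior.
        return rng.betavariate(self.alpha, self.beta)

    def update(self, reward):
        # Standard Bernoulli/Beta conjugate update (reward is 0 or 1).
        self.alpha += reward
        self.beta += 1 - reward
        # Rescale so the posterior never becomes too confident to move.
        total = self.alpha + self.beta
        if total > self.cap:
            scale = self.cap / total
            self.alpha *= scale
            self.beta *= scale

    def mean(self):
        # Point estimate of the arm's success probability.
        return self.alpha / (self.alpha + self.beta)


def run_bandit(probs_at, n_arms, horizon, cap, seed=0):
    """Play `horizon` rounds; probs_at(t) returns the true probabilities at round t."""
    rng = random.Random(seed)
    arms = [CappedBetaArm(cap) for _ in range(n_arms)]
    total_reward = 0
    for t in range(horizon):
        # Thompson step: sample every posterior, pull the arm with the largest draw.
        draws = [arm.sample(rng) for arm in arms]
        k = max(range(n_arms), key=draws.__getitem__)
        reward = 1 if rng.random() < probs_at(t)[k] else 0
        arms[k].update(reward)
        total_reward += reward
    return arms, total_reward


if __name__ == "__main__":
    # Hypothetical drifting environment: the best arm switches halfway through.
    def switching(t):
        return [0.2, 0.8] if t < 500 else [0.8, 0.2]

    arms, total = run_bandit(switching, n_arms=2, horizon=1000, cap=50.0)
    print("estimates:", [round(a.mean(), 3) for a in arms], "reward:", total)
```

A small `cap` makes the estimates react quickly to change at the cost of noisier estimates in stationary phases; a large `cap` approaches ordinary Thompson Sampling. Choosing it is a tuning decision that this sketch leaves open.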



Published In

ICMLA '11: Proceedings of the 2011 10th International Conference on Machine Learning and Applications and Workshops - Volume 01
December 2011
507 pages
ISBN:9780769546070

Publisher

IEEE Computer Society

United States


Author Tags

  1. Bayesian Techniques
  2. Learning Algorithms
  3. Multi-Armed Bandits



Cited By

  • (2024) Adaptive Hyperparameter Tuning Within Neural Network-Based Efficient Global Optimization. Computational Science – ICCS 2024, pp. 74-89. DOI: 10.1007/978-3-031-63775-9_6. Online publication date: 2-Jul-2024
  • (2023) An information-theoretic analysis of nonstationary bandit learning. Proceedings of the 40th International Conference on Machine Learning, pp. 24831-24849. DOI: 10.5555/3618408.3619442. Online publication date: 23-Jul-2023
  • (2021) Dynamic Learning in Hyper-Heuristics to Solve Flowshop Problems. Intelligent Systems, pp. 155-169. DOI: 10.1007/978-3-030-91702-9_11. Online publication date: 29-Nov-2021
  • (2019) Weighted linear bandits for non-stationary environments. Proceedings of the 33rd International Conference on Neural Information Processing Systems, pp. 12040-12049. DOI: 10.5555/3454287.3455365. Online publication date: 8-Dec-2019
  • (2018) Expertise Drift in Referral Networks. Proceedings of the 17th International Conference on Autonomous Agents and MultiAgent Systems, pp. 425-433. DOI: 10.5555/3237383.3237449. Online publication date: 9-Jul-2018
  • (2015) Efficient search with an ensemble of heuristics. Proceedings of the 24th International Conference on Artificial Intelligence, pp. 784-791. DOI: 10.5555/2832249.2832358. Online publication date: 25-Jul-2015
  • (2015) Real-Time Bid Prediction using Thompson Sampling-Based Expert Selection. Proceedings of the 21th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 1869-1878. DOI: 10.1145/2783258.2788586. Online publication date: 10-Aug-2015
