Bandits with Knapsacks and predictions

Published: 15 July 2024
DOI: 10.5555/3702676.3702731

Abstract

We study the Bandits with Knapsacks problem with the aim of designing a learning-augmented online learning algorithm that achieves better regret guarantees than the state-of-the-art primal-dual algorithms with worst-case guarantees, under both stochastic and adversarial inputs. In the adversarial case, we obtain better competitive ratios when the input predictions are accurate, while maintaining worst-case guarantees when they are imprecise. We introduce two algorithms, tailored to the full-feedback and bandit-feedback settings, respectively. Both integrate a static prediction with a worst-case no-α-regret algorithm. This yields a competitive ratio of (π + (1 − π)/α)⁻¹ when the prediction is perfect, and a competitive ratio of α/(1 − π) when the prediction is highly imprecise, where π ∈ (0, 1) is chosen by the learner and α is Slater's parameter. We complement this analysis by studying the stochastic setting under full feedback, providing an algorithm that guarantees a pseudo-regret of Õ(√T) with poor predictions and zero pseudo-regret with perfect predictions.
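The two ratios are consistent with a simple probabilistic mixture: if the learner follows the prediction with probability π and otherwise delegates to a no-α-regret algorithm, a perfect prediction contributes roughly π·OPT while the fallback still secures about (1 − π)·OPT/α, for a (π + (1 − π)/α)⁻¹ fraction of OPT overall; with a useless prediction only the fallback term survives, giving α/(1 − π). The sketch below illustrates one such mixture inside a budgeted bandit loop. It is a minimal, hypothetical reading of the abstract, not the paper's actual construction: WorstCaseAlgorithm, env_step, and the stopping rule are all assumed interfaces introduced here for illustration.

```python
import random

NUM_ACTIONS = 5       # illustrative action set size
T = 1000              # time horizon
B = 200.0             # total knapsack budget
PI = 0.5              # learner-chosen mixing weight, pi in (0, 1)
PREDICTED_ACTION = 2  # the static prediction (e.g., an advised arm)


class WorstCaseAlgorithm:
    """Stand-in for any no-alpha-regret primal-dual BwK algorithm."""

    def select_action(self, remaining_budget):
        # A real implementation would run, e.g., dual mirror descent;
        # this placeholder just picks uniformly at random.
        return random.randrange(NUM_ACTIONS)

    def update(self, action, reward, cost):
        # Feed back the observed reward/cost (full or bandit feedback).
        pass


def run(env_step):
    """env_step(action) -> (reward, cost), both assumed in [0, 1]."""
    fallback = WorstCaseAlgorithm()
    budget, total_reward = B, 0.0
    for _ in range(T):
        if budget < 1.0:  # stop while one worst-case round's cost still fits
            break
        # Trust the static prediction with probability PI; otherwise
        # fall back to the worst-case-safe algorithm.
        if random.random() < PI:
            action = PREDICTED_ACTION
        else:
            action = fallback.select_action(budget)
        reward, cost = env_step(action)
        fallback.update(action, reward, cost)
        budget -= cost
        total_reward += reward
    return total_reward


if __name__ == "__main__":
    # Toy environment: random rewards, cheap random costs.
    print(run(lambda a: (random.random(), 0.3 * random.random())))
```

In this reading, π trades robustness for consistency: π close to 1 extracts almost all of a perfect prediction's value but weakens the fallback guarantee, while π close to 0 recovers the plain worst-case ratio α.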

Published In

UAI '24: Proceedings of the Fortieth Conference on Uncertainty in Artificial Intelligence
July 2024
4270 pages
Editors: Negar Kiyavash, Joris M. Mooij

Sponsors

  • HUAWEI
  • Google
  • D. E. Shaw & Co
  • Barcelona School of Economics
  • Universitat Pompeu Fabra

Publisher

JMLR.org
