research-article

Open access

No-Regret Learning in Bilateral Trade via Global Budget Balance

Authors:

Martino Bernasconi,

Matteo Castiglioni,

Federico Fusco

STOC 2024: Proceedings of the 56th Annual ACM Symposium on Theory of Computing

Pages 247 - 258

https://doi.org/10.1145/3618260.3649653

Published: 11 June 2024

Abstract

Bilateral trade models the problem of intermediating between two rational agents — a seller and a buyer — both characterized by a private valuation for an item they want to trade. We study the online learning version of the problem, in which at each time step a new seller and buyer arrive and the learner has to set prices for them without any knowledge about their (adversarially generated) valuations.
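A single round of this protocol is often formalized as follows. The notation is an assumption for illustration, since the abstract does not fix any: the seller and buyer hold private valuations s_t and b_t in [0,1], the learner posts a price p_t to the seller and a price q_t to the buyer, and the quantity of interest is the gain from trade.

```latex
\[
\text{trade at time } t \iff s_t \le p_t \ \text{ and } \ q_t \le b_t,
\qquad
\mathrm{GFT}_t(p_t, q_t) = \mathbb{1}\{s_t \le p_t\}\,\mathbb{1}\{q_t \le b_t\}\,(b_t - s_t).
\]
```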

In this setting, known impossibility results rule out the existence of no-regret algorithms when budget balance has to be enforced at each time step. In this paper, we introduce the notion of global budget balance, which only requires the learner to fulfill budget balance over the entire time horizon. Under this natural relaxation, we provide the first no-regret algorithms for adversarial bilateral trade under various feedback models. First, we show that in the full-feedback model, the learner can guarantee Õ(√T) regret against the best fixed prices in hindsight, and that this bound is optimal up to poly-logarithmic terms. Second, we provide a learning algorithm guaranteeing an Õ(T^{3/4}) regret upper bound with one-bit feedback, which we complement with an Ω(T^{5/7}) lower bound that holds even in the two-bit feedback model. Finally, we introduce and analyze an alternative benchmark that is provably stronger than the best fixed prices in hindsight and is inspired by the literature on bandits with knapsacks.
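The difference between per-round and global budget balance can be illustrated with a small simulation. This is a minimal sketch under stated assumptions, not the paper's algorithm: it assumes the learner posts a price p to the seller and a price q to the buyer, that a trade happens when the seller's valuation s is at most p and the buyer's valuation b is at least q, and that the learner's revenue from a trade is q - p. The names `run_protocol` and `bank_then_subsidize` and the sample valuations are hypothetical.

```python
from typing import Callable, Iterable, Tuple

Prices = Tuple[float, float]  # (p, q): p is offered to the seller, q is charged to the buyer


def run_protocol(valuations: Iterable[Tuple[float, float]],
                 policy: Callable[[float], Prices]) -> Tuple[float, float]:
    """Play the posted-price protocol; return (total gain from trade, net revenue)."""
    total_gft = 0.0
    net_revenue = 0.0
    for s, b in valuations:
        p, q = policy(net_revenue)      # prices are chosen before (s, b) is used
        if s <= p and q <= b:           # both agents accept, so the trade happens
            total_gft += b - s          # gain from moving the item to the buyer
            net_revenue += q - p        # learner collects q and pays out p
    return total_gft, net_revenue


def bank_then_subsidize(budget: float) -> Prices:
    """Hypothetical policy: earn a margin first, spend it later on a subsidy."""
    if budget >= 0.5:
        return (0.75, 0.25)             # p > q: not budget balanced in this round
    return (0.25, 0.75)                 # p < q: learner keeps a margin of 0.5


# Hypothetical (seller, buyer) valuations, chosen to be exact in floating point.
valuations = [(0.125, 0.875), (0.625, 0.75), (0.25, 0.75)]
gft, revenue = run_protocol(valuations, bank_then_subsidize)
print(gft, revenue)
```

In this run the second trade costs the learner money, which a per-round constraint would forbid, but the margin banked in the first round covers it, so the net revenue over the whole horizon stays nonnegative.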

Index Terms

No-Regret Learning in Bilateral Trade via Global Budget Balance
1. Computing methodologies
  1. Machine learning
    1. Learning settings
      1. Online learning settings
2. Theory of computation
  1. Theory and algorithms for application domains
    1. Algorithmic game theory and mechanism design
      1. Algorithmic game theory


Published In

STOC 2024: Proceedings of the 56th Annual ACM Symposium on Theory of Computing

June 2024

2049 pages

ISBN:9798400703836

DOI:10.1145/3618260

General Chairs: Bojan Mohar (Simon Fraser University, Canada), Igor Shinkar (Simon Fraser University, Canada)

Program Chair: Ryan O'Donnell (Carnegie Mellon University, USA)

Copyright © 2024 Owner/Author.

This work is licensed under a Creative Commons Attribution International 4.0 License.

Sponsors

SIGACT: ACM Special Interest Group on Algorithms and Computation Theory

Publisher

Association for Computing Machinery

New York, NY, United States

Funding Sources

Ministero dell'Università e della Ricerca
European Research Council
Ministero dell'Università e della Ricerca

Conference

STOC '24

Sponsor: SIGACT

STOC '24: 56th Annual ACM Symposium on Theory of Computing

June 24 - 28, 2024

Vancouver, BC, Canada

Acceptance Rates

Overall Acceptance Rate 1,469 of 4,586 submissions, 32%
