Simulation of Unintentional Collusion Caused by Auto Pricing in Supply Chain Markets

Hirano, Masanori; Matsushima, Hiroyasu; Izumi, Kiyoshi; Mukai, Taisei

doi:10.1007/978-3-030-69322-0_24

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 12568)

Included in the following conference series:

International Conference on Principles and Practice of Multi-Agent Systems

Abstract

In this paper, we address the problem of unintentional price collusion that arises from auto pricing, such as systems based on reinforcement learning. First, we used Q-learning, Sarsa, and deep Q-learning models for auto pricing to test whether they cause collusion. To this end, we performed multi-agent simulations of a competitive market with a predefined demand function, in which the agents learn their pricing strategies through reinforcement learning. We also defined and calculated a new collusion metric that represents the degree to which agents collude. Second, we tested open and shield bidding with varying numbers of agents. In our results, deep Q-learning shows the highest collusion metric. Contrary to expectations, shield bidding has no significant effect on collusion levels when agents employ a high-performing reinforcement learning method such as deep Q-learning. Moreover, increasing the number of agents also contributes to lower collusion levels.
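
The kind of experiment summarized above can be illustrated with a minimal sketch of tabular Q-learning agents setting prices in a toy market. Everything specific here is an assumption for illustration and not taken from the chapter: the price grid `PRICES`, the unit cost `COST`, the winner-takes-all demand rule in `demand_share`, the use of last-round prices as the state (a stand-in for open bidding), and the `collusion_index` normalization, which is a hypothetical measure and not the collusion metric defined by the authors.

```python
import random

PRICES = [1.0, 1.5, 2.0, 2.5, 3.0]  # hypothetical discrete price grid
COST = 1.0                          # hypothetical unit cost


def demand_share(prices, i):
    """Toy demand rule (assumption): the cheapest sellers split one unit equally."""
    low = min(prices)
    winners = [j for j, p in enumerate(prices) if p == low]
    return 1.0 / len(winners) if i in winners else 0.0


def profits(prices):
    """Per-agent profit: margin over cost times the demand share won."""
    return [(p - COST) * demand_share(prices, i) for i, p in enumerate(prices)]


class QAgent:
    """Tabular Q-learning agent with epsilon-greedy exploration."""

    def __init__(self, n_actions, alpha=0.1, gamma=0.9, epsilon=0.1, rng=None):
        self.n_actions = n_actions
        self.alpha, self.gamma, self.epsilon = alpha, gamma, epsilon
        self.rng = rng or random.Random()
        self.q = {}  # state -> list of action values

    def values(self, state):
        return self.q.setdefault(state, [0.0] * self.n_actions)

    def act(self, state):
        if self.rng.random() < self.epsilon:
            return self.rng.randrange(self.n_actions)
        vals = self.values(state)
        return max(range(self.n_actions), key=vals.__getitem__)

    def update(self, state, action, reward, next_state):
        # Standard Q-learning target: reward plus discounted best next value.
        target = reward + self.gamma * max(self.values(next_state))
        vals = self.values(state)
        vals[action] += self.alpha * (target - vals[action])


def simulate(n_agents=2, steps=5000, seed=0):
    """Run the toy market and return the average posted price per step."""
    rng = random.Random(seed)
    agents = [QAgent(len(PRICES), rng=rng) for _ in range(n_agents)]
    state = tuple([0] * n_agents)  # last price index of every agent
    history = []
    for _ in range(steps):
        actions = [agent.act(state) for agent in agents]
        prices = [PRICES[k] for k in actions]
        rewards = profits(prices)
        next_state = tuple(actions)
        for agent, k, r in zip(agents, actions, rewards):
            agent.update(state, k, r, next_state)
        state = next_state
        history.append(sum(prices) / n_agents)
    return history


def collusion_index(avg_price, competitive=COST, monopoly=PRICES[-1]):
    """Hypothetical index: 0 at the competitive price, 1 at the highest price."""
    return (avg_price - competitive) / (monopoly - competitive)


if __name__ == "__main__":
    hist = simulate()
    tail = hist[-500:]
    print(collusion_index(sum(tail) / len(tail)))
```

Varying `n_agents`, or hiding the other agents' prices from the state, would be one way to mimic in this toy setting the comparisons across agent counts and bidding visibility that the abstract describes.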

Acknowledgment

This work was supported by Council for Science, Technology and Innovation (CSTI), Cross-ministerial Strategic Innovation Promotion Program (SIP), “AI Collaboration for Improved Value Chain Efficiency and Flexibility” (Funding agency: NEDO).

Author information

Authors and Affiliations

School of Engineering, The University of Tokyo, Tokyo, Japan
Masanori Hirano & Kiyoshi Izumi
Center for Data Science Education and Research, Shiga University, Shiga, Japan
Hiroyasu Matsushima
Comprehensive Research Organization, Waseda University, Tokyo, Japan
Taisei Mukai

Corresponding author

Correspondence to Masanori Hirano.

Editor information

Editors and Affiliations

Nagoya Institute of Technology, Nagoya, Japan
Takahiro Uchiya
University of Tasmania, Tasmania, TAS, Australia
Quan Bai
University of Alcalá, Alcala de Henares, Spain
Iván Marsá Maestre

About this paper

Cite this paper

Hirano, M., Matsushima, H., Izumi, K., Mukai, T. (2021). Simulation of Unintentional Collusion Caused by Auto Pricing in Supply Chain Markets. In: Uchiya, T., Bai, Q., Marsá Maestre, I. (eds) PRIMA 2020: Principles and Practice of Multi-Agent Systems. PRIMA 2020. Lecture Notes in Computer Science (LNAI), vol 12568. Springer, Cham. https://doi.org/10.1007/978-3-030-69322-0_24

DOI: https://doi.org/10.1007/978-3-030-69322-0_24
Published: 14 February 2021
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-69321-3
Online ISBN: 978-3-030-69322-0
eBook Packages: Computer Science, Computer Science (R0)
