DOI: 10.1145/3604237.3626880

JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading

Published: 25 November 2023
    Abstract

    Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes, it is important to have large-scale, efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. Many applications require processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, whether for identical or different securities, with per-message processing times up to 75x faster. The implementation of our simulator, JAX-LOB, is based on design choices that aim to best exploit the strengths of JAX without compromising the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages to provide an example of how one may address an optimal execution problem with reinforcement learning, and share preliminary results from end-to-end RL training on GPUs. The project code is available on GitHub.
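    The abstract's core idea, processing thousands of books in parallel on a GPU, rests on writing the matching logic as a pure function over fixed-shape arrays so that JAX can batch and compile it. The sketch below is a minimal, hypothetical illustration of that pattern, not the actual JAX-LOB API or data structure: a toy ask-side book stored as a (levels, 2) array of (price, size) rows, with a single-book market-buy matcher vectorised across a batch of books via jax.vmap and compiled with jax.jit.

```python
import jax
import jax.numpy as jnp

# Hypothetical book layout (illustration only, not JAX-LOB's):
# each book is an (N, 2) array of (price, size) rows for resting
# ask orders, sorted by price. Fixed shapes keep the function
# jit/vmap compatible.

def match_market_buy(book, qty):
    """Match a market buy of `qty` against one ask book.

    Returns the updated book and the total quantity filled.
    Pure and shape-stable: no Python branching on values.
    """
    sizes = book[:, 1]
    cum = jnp.cumsum(sizes)                               # cumulative depth
    filled_before = jnp.concatenate([jnp.zeros(1), cum[:-1]])
    take = jnp.clip(qty - filled_before, 0.0, sizes)      # per-level fill
    new_book = book.at[:, 1].set(sizes - take)
    return new_book, take.sum()

# Batch the single-book kernel across many books in one compiled call.
batched_match = jax.jit(jax.vmap(match_market_buy))

books = jnp.tile(
    jnp.array([[100.0, 5.0], [101.0, 7.0], [102.0, 3.0]]), (4, 1, 1)
)  # 4 identical toy books, 3 price levels each
qtys = jnp.array([4.0, 6.0, 20.0, 0.0])

new_books, fills = batched_match(books, qtys)  # fills: [4., 6., 15., 0.]
```

    Because the kernel avoids data-dependent shapes and Python control flow, the same code scales from 4 books to thousands simply by enlarging the leading batch dimension, which is the property the simulator's design exploits.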




    Published In

    ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
    November 2023
    697 pages
    ISBN:9798400702402
    DOI:10.1145/3604237

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. high frequency trading
    2. limit order books
    3. market replay
    4. order book simulator
    5. reinforcement learning
    6. trade execution

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • UKRI AI World Leading Researcher Fellowship
    • TAILOR
    • Hasler Foundation

    Conference

    ICAIF '23
