DOI: 10.1145/3604237.3626880

JAX-LOB: A GPU-Accelerated limit order book simulator to unlock large scale reinforcement learning for trading

Published: 25 November 2023
    Abstract

    Financial exchanges across the world use limit order books (LOBs) to process orders and match trades. For research purposes, it is important to have large-scale, efficient simulators of LOB dynamics. LOB simulators have previously been implemented in the context of agent-based models (ABMs), reinforcement learning (RL) environments, and generative models, processing order flows from historical data sets and hand-crafted agents alike. Many applications require processing multiple books, either for the calibration of ABMs or for the training of RL agents. We showcase the first GPU-enabled LOB simulator designed to process thousands of books in parallel, whether for identical or different securities, with per-message processing times up to 75x faster. The implementation of our simulator, JAX-LOB, is based on design choices that aim to best exploit the strengths of JAX without compromising the realism of LOB-related mechanisms. We integrate JAX-LOB with other JAX packages to provide an example of how one may address an optimal execution problem with reinforcement learning, and share preliminary results from end-to-end RL training on GPUs. The project code is available on GitHub.
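    The abstract's core idea, processing thousands of books in parallel on a GPU, rests on writing the matching logic as a pure function over fixed-shape arrays so that JAX can batch and compile it. The sketch below is a minimal, hypothetical illustration of that pattern, not the actual JAX-LOB API or data structure: a toy ask-side book stored as a (levels, 2) array of (price, size) rows, with a single-book market-buy matcher vectorised across a batch of books via jax.vmap and compiled with jax.jit.

```python
import jax
import jax.numpy as jnp

# Hypothetical book layout (illustration only, not JAX-LOB's):
# each book is an (N, 2) array of (price, size) rows for resting
# ask orders, sorted by price. Fixed shapes keep the function
# jit/vmap compatible.

def match_market_buy(book, qty):
    """Match a market buy of `qty` against one ask book.

    Returns the updated book and the total quantity filled.
    Pure and shape-stable: no Python branching on values.
    """
    sizes = book[:, 1]
    cum = jnp.cumsum(sizes)                               # cumulative depth
    filled_before = jnp.concatenate([jnp.zeros(1), cum[:-1]])
    take = jnp.clip(qty - filled_before, 0.0, sizes)      # per-level fill
    new_book = book.at[:, 1].set(sizes - take)
    return new_book, take.sum()

# Batch the single-book kernel across many books in one compiled call.
batched_match = jax.jit(jax.vmap(match_market_buy))

books = jnp.tile(
    jnp.array([[100.0, 5.0], [101.0, 7.0], [102.0, 3.0]]), (4, 1, 1)
)  # 4 identical toy books, 3 price levels each
qtys = jnp.array([4.0, 6.0, 20.0, 0.0])

new_books, fills = batched_match(books, qtys)  # fills: [4., 6., 15., 0.]
```

    Because the kernel avoids data-dependent shapes and Python control flow, the same code scales from 4 books to thousands simply by enlarging the leading batch dimension, which is the property the simulator's design exploits.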




    Published In

    ICAIF '23: Proceedings of the Fourth ACM International Conference on AI in Finance
    November 2023
    697 pages
    ISBN:9798400702402
    DOI:10.1145/3604237

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. high frequency trading
    2. limit order books
    3. market replay
    4. order book simulator
    5. reinforcement learning
    6. trade execution

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

    Funding Sources

    • UKRI AI World Leading Researcher Fellowship
    • TAILOR
    • Hasler Foundation

    Conference

    ICAIF '23
