DOI: 10.1145/3677052.3698668
research-article

EX-DRL: Hedging Against Heavy Losses with EXtreme Distributional Reinforcement Learning

Published: 14 November 2024

Abstract

Recent advances in Distributional Reinforcement Learning (DRL) for modeling loss distributions have shown promise for developing hedging strategies in derivatives markets. A common DRL approach learns the quantiles of the loss distribution at specified levels using Quantile Regression (QR). This approach suits option hedging because it directly supports quantile-based risk measures such as Value at Risk (VaR) and Conditional Value at Risk (CVaR). However, these risk measures depend on accurate estimation of extreme quantiles in the tail of the loss distribution, and QR-based DRL can estimate them imprecisely because tail observations are rare and extreme, as highlighted in the literature. To address this issue, we propose EXtreme DRL (EX-DRL), which enhances extreme quantile prediction by modeling the tail of the loss distribution with a Generalized Pareto Distribution (GPD). This method introduces supplementary data to mitigate the scarcity of extreme quantile observations, thereby improving the accuracy of QR-based estimation. Comprehensive experiments on gamma hedging of options demonstrate that EX-DRL improves on existing QR-based models by providing more precise estimates of extreme quantiles, which in turn improves the computation and reliability of risk metrics for complex financial risk management. The implementation is available here.
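To make the role of the GPD concrete, the following is a minimal sketch of a classical peaks-over-threshold estimate of VaR and CVaR from a sample of losses. It is not the EX-DRL algorithm and is not taken from the authors' implementation. The function names, the 95% threshold level, and the method-of-moments fit (valid only for shape parameter xi < 0.5) are assumptions chosen for illustration; the VaR and CVaR expressions are the standard GPD tail formulas, with CVaR finite only for xi < 1.

```python
import math
import random
import statistics


def fit_gpd_moments(excesses):
    """Method-of-moments GPD fit to threshold excesses (assumes xi < 0.5)."""
    m = statistics.fmean(excesses)
    s2 = statistics.variance(excesses)
    ratio = m * m / s2
    xi = 0.5 * (1.0 - ratio)         # shape parameter
    sigma = 0.5 * m * (ratio + 1.0)  # scale parameter
    return xi, sigma


def gpd_var(q, u, xi, sigma, p_u):
    """VaR at level q from the GPD tail above threshold u.

    p_u is the fraction of losses exceeding u; requires 1 - q < p_u.
    """
    tail_ratio = (1.0 - q) / p_u
    if abs(xi) < 1e-9:
        # Exponential-tail limit of the GPD as xi -> 0.
        return u - sigma * math.log(tail_ratio)
    return u + sigma / xi * (tail_ratio ** (-xi) - 1.0)


def gpd_cvar(q, u, xi, sigma, p_u):
    """CVaR at level q implied by the fitted GPD tail (finite for xi < 1)."""
    var_q = gpd_var(q, u, xi, sigma, p_u)
    return var_q / (1.0 - xi) + (sigma - xi * u) / (1.0 - xi)


def tail_risk(losses, q=0.99, threshold_level=0.95):
    """Estimate (VaR_q, CVaR_q) by fitting a GPD to losses above a threshold."""
    ordered = sorted(losses)
    n = len(ordered)
    u = ordered[int(threshold_level * n)]  # empirical threshold quantile
    excesses = [x - u for x in ordered if x > u]
    p_u = len(excesses) / n
    xi, sigma = fit_gpd_moments(excesses)
    return gpd_var(q, u, xi, sigma, p_u), gpd_cvar(q, u, xi, sigma, p_u)


if __name__ == "__main__":
    # Synthetic exponential losses stand in for simulated hedging losses.
    rng = random.Random(0)
    losses = [rng.expovariate(1.0) for _ in range(20000)]
    var99, cvar99 = tail_risk(losses)
    print(f"VaR 99%: {var99:.3f}  CVaR 99%: {cvar99:.3f}")
```

The sketch uses only the few hundred observations above the threshold to extrapolate to the 99% level, which illustrates why a parametric tail model can help when direct quantile estimates rely on very few extreme samples. A maximum-likelihood fit would typically replace the moment estimator in practice.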

      Published In

      ICAIF '24: Proceedings of the 5th ACM International Conference on AI in Finance
      November 2024
      878 pages
      ISBN: 9798400710810
      DOI: 10.1145/3677052

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Author Tags

      1. Distributional Reinforcement Learning
      2. Generalized Pareto Distribution
      3. Option Hedging

      Qualifiers

      • Research-article
      • Research
      • Refereed limited

      Conference

      ICAIF '24
