DOI: 10.1145/3383455.3422540
Deep reinforcement learning for automated stock trading: an ensemble strategy

Published: 07 October 2021

Abstract

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement learning schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby adjusting robustly to different market situations. To avoid the large memory consumption of training networks with a continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks, which have adequate liquidity. The performance of the trading agent under each reinforcement learning algorithm is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy outperforms the three individual algorithms and the two baselines in terms of risk-adjusted return, as measured by the Sharpe ratio.
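The ensemble mechanism summarized in the abstract works, per the full paper, by periodically retraining all three agents and letting whichever earned the highest Sharpe ratio on a recent validation window trade the next period. A minimal sketch of that selection rule follows; the function names, window layout, and sample returns are illustrative, not the authors' implementation:

```python
import numpy as np

def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio of a series of daily returns (risk-free rate taken as 0)."""
    r = np.asarray(daily_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def pick_best_agent(validation_returns):
    """Pick the agent whose validation-window returns earned the highest Sharpe ratio.

    `validation_returns` maps an agent label (e.g. 'PPO', 'A2C', 'DDPG') to the
    daily returns it produced on the most recent validation window.
    """
    return max(validation_returns, key=lambda name: sharpe_ratio(validation_returns[name]))

# Illustrative quarterly step: the winning agent trades the next quarter.
validation_returns = {
    "PPO":  [0.010, 0.008, 0.012, 0.009],    # steady gains -> high Sharpe
    "A2C":  [0.020, -0.015, 0.025, -0.010],  # volatile -> lower Sharpe
    "DDPG": [0.001, 0.000, 0.002, -0.001],   # nearly flat -> low Sharpe
}
best = pick_best_agent(validation_returns)   # "PPO": steady returns beat volatile ones
```

In the paper this selection is repeated on a rolling window every quarter; the Sharpe computation here assumes daily returns and a zero risk-free rate.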



Published In

ICAIF '20: Proceedings of the First ACM International Conference on AI in Finance
October 2020
422 pages
ISBN: 978-1-4503-7584-9
DOI: 10.1145/3383455
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. actor-critic framework
  2. automated stock trading
  3. deep reinforcement learning
  4. ensemble strategy
  5. Markov decision process

Qualifiers

  • Research-article

Conference

ICAIF '20: ACM International Conference on AI in Finance
October 15-16, 2020
New York, New York



Cited By

  • R-DDQN: Optimizing Algorithmic Trading Strategies Using a Reward Network in a Double DQN. Mathematics 12, 11 (2024), 1621. DOI: 10.3390/math12111621
  • Crowd Panic Behavior Simulation Using Multi-Agent Modeling. Electronics 13, 18 (2024), 3622. DOI: 10.3390/electronics13183622
  • Reinforcement Learning Review: Past Acts, Present Facts and Future Prospects. IT Journal Research and Development 8, 2 (2024), 120-142. DOI: 10.25299/itjrd.2023.13474
  • A study on automated improvement of securities trading strategies using machine learning optimization algorithms. Applied Mathematics and Nonlinear Sciences 9, 1 (2024). DOI: 10.2478/amns-2024-2175
  • Mesoscale effects of trader learning behaviors in financial markets: A multi-agent reinforcement learning study. PLOS ONE 19, 4 (2024), e0301141. DOI: 10.1371/journal.pone.0301141
  • AutoRL X: Automated Reinforcement Learning on the Web. ACM Transactions on Interactive Intelligent Systems (2024). DOI: 10.1145/3670692
  • Security and Privacy Issues in Deep Reinforcement Learning: Threats and Countermeasures. ACM Computing Surveys 56, 6 (2024), 1-39. DOI: 10.1145/3640312
  • FNSPID: A Comprehensive Financial News Dataset in Time Series. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024), 4918-4927. DOI: 10.1145/3637528.3671629
  • PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024), 6148-6157. DOI: 10.1145/3637528.3671611
  • Large Language Model for Dynamic Strategy Interchange in Financial Markets. 2024 9th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA) (2024), 306-312. DOI: 10.1109/ICCCBDA61447.2024.10569928
