DOI: 10.1145/3383455.3422540
Deep reinforcement learning for automated stock trading: an ensemble strategy

Published: 07 October 2021

Abstract

Stock trading strategies play a critical role in investment. However, it is challenging to design a profitable strategy in a complex and dynamic stock market. In this paper, we propose an ensemble strategy that employs deep reinforcement learning schemes to learn a stock trading strategy by maximizing investment return. We train a deep reinforcement learning agent and obtain an ensemble trading strategy using three actor-critic based algorithms: Proximal Policy Optimization (PPO), Advantage Actor Critic (A2C), and Deep Deterministic Policy Gradient (DDPG). The ensemble strategy inherits and integrates the best features of the three algorithms, thereby adjusting robustly to different market situations. To avoid the large memory consumption of training networks with a continuous action space, we employ a load-on-demand technique for processing very large data. We test our algorithms on the 30 Dow Jones stocks, which have adequate liquidity. The performance of the trading agent under each reinforcement learning algorithm is evaluated and compared with both the Dow Jones Industrial Average index and the traditional min-variance portfolio allocation strategy. The proposed deep ensemble strategy outperforms the three individual algorithms and the two baselines in terms of risk-adjusted return, as measured by the Sharpe ratio.
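The ensemble mechanism summarized in the abstract works, per the full paper, by periodically retraining all three agents and letting whichever earned the highest Sharpe ratio on a recent validation window trade the next period. A minimal sketch of that selection rule follows; the function names, window layout, and sample returns are illustrative, not the authors' implementation:

```python
import numpy as np

def sharpe_ratio(daily_returns, periods_per_year=252):
    """Annualized Sharpe ratio of a series of daily returns (risk-free rate taken as 0)."""
    r = np.asarray(daily_returns, dtype=float)
    return np.sqrt(periods_per_year) * r.mean() / r.std(ddof=1)

def pick_best_agent(validation_returns):
    """Pick the agent whose validation-window returns earned the highest Sharpe ratio.

    `validation_returns` maps an agent label (e.g. 'PPO', 'A2C', 'DDPG') to the
    daily returns it produced on the most recent validation window.
    """
    return max(validation_returns, key=lambda name: sharpe_ratio(validation_returns[name]))

# Illustrative quarterly step: the winning agent trades the next quarter.
validation_returns = {
    "PPO":  [0.010, 0.008, 0.012, 0.009],    # steady gains -> high Sharpe
    "A2C":  [0.020, -0.015, 0.025, -0.010],  # volatile -> lower Sharpe
    "DDPG": [0.001, 0.000, 0.002, -0.001],   # nearly flat -> low Sharpe
}
best = pick_best_agent(validation_returns)   # "PPO": steady returns beat volatile ones
```

In the paper this selection is repeated on a rolling window every quarter; the Sharpe computation here assumes daily returns and a zero risk-free rate.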



Published In

ICAIF '20: Proceedings of the First ACM International Conference on AI in Finance
October 2020
422 pages
ISBN: 978-1-4503-7584-9
DOI: 10.1145/3383455
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States


Author Tags

  1. actor-critic framework
  2. automated stock trading
  3. deep reinforcement learning
  4. ensemble strategy
  5. Markov decision process

Qualifiers

  • Research-article

Conference

ICAIF '20: ACM International Conference on AI in Finance
October 15-16, 2020
New York, New York



Cited By

  • R-DDQN: Optimizing Algorithmic Trading Strategies Using a Reward Network in a Double DQN. Mathematics 12, 11 (2024), 1621. DOI: 10.3390/math12111621
  • Crowd Panic Behavior Simulation Using Multi-Agent Modeling. Electronics 13, 18 (2024), 3622. DOI: 10.3390/electronics13183622
  • Reinforcement Learning Review: Past Acts, Present Facts and Future Prospects. IT Journal Research and Development 8, 2 (2024), 120-142. DOI: 10.25299/itjrd.2023.13474
  • A study on automated improvement of securities trading strategies using machine learning optimization algorithms. Applied Mathematics and Nonlinear Sciences 9, 1 (2024). DOI: 10.2478/amns-2024-2175
  • Mesoscale effects of trader learning behaviors in financial markets: A multi-agent reinforcement learning study. PLOS ONE 19, 4 (2024), e0301141. DOI: 10.1371/journal.pone.0301141
  • AutoRL X: Automated Reinforcement Learning on the Web. ACM Transactions on Interactive Intelligent Systems (2024). DOI: 10.1145/3670692
  • Security and Privacy Issues in Deep Reinforcement Learning: Threats and Countermeasures. ACM Computing Surveys 56, 6 (2024), 1-39. DOI: 10.1145/3640312
  • FNSPID: A Comprehensive Financial News Dataset in Time Series. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024), 4918-4927. DOI: 10.1145/3637528.3671629
  • PAIL: Performance based Adversarial Imitation Learning Engine for Carbon Neutral Optimization. Proceedings of the 30th ACM SIGKDD Conference on Knowledge Discovery and Data Mining (2024), 6148-6157. DOI: 10.1145/3637528.3671611
  • Large Language Model for Dynamic Strategy Interchange in Financial Markets. 2024 9th International Conference on Cloud Computing and Big Data Analytics (ICCCBDA) (2024), 306-312. DOI: 10.1109/ICCCBDA61447.2024.10569928
