Outperforming algorithmic trading reinforcement learning systems: A supervised approach to the cryptocurrency market

Published: 15 September 2022

Abstract

The interdisciplinary relationship between machine learning and financial markets has long been a theme of great interest to both research communities. Recently, reinforcement learning and deep learning methods have gained prominence in active asset trading, aiming to outperform classical benchmarks such as the Buy and Hold strategy. This paper explores both the supervised learning and reinforcement learning approaches applied to active asset trading, drawing attention to the benefits of each. This work extends the comparison between the supervised and reinforcement learning approaches by using state-of-the-art strategies from both families. We propose the ResNet-LSTM actor (RSLSTM-A), which adopts the ResNet architecture, one of the best-performing deep learning approaches for time series classification. We compare RSLSTM-A against classical and recent reinforcement learning techniques, such as recurrent reinforcement learning, deep Q-network, and advantage actor–critic. To run our tests, we simulated a currency exchange market environment with the price time series of the Bitcoin, Litecoin, Ethereum, Monero, Nxt, and Dash cryptocurrencies. We show that our approach achieves better overall performance, confirming that supervised learning can outperform reinforcement learning for trading. We also present a graphical representation of the features extracted by the ResNet neural network to identify which type of characteristics each residual block generates.
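As a rough illustration of the kind of architecture the abstract describes, the sketch below combines residual 1-D convolutional blocks with an LSTM head that scores buy, hold, and sell actions. It is a minimal, hypothetical PyTorch example: the layer sizes, class names, and three-action output are assumptions made for illustration, not the authors' exact RSLSTM-A model.

# Hypothetical sketch of a ResNet-LSTM actor for time-series trading signals.
# Assumes PyTorch; sizes and names are illustrative, not the paper's model.
import torch
import torch.nn as nn


class ResidualBlock1D(nn.Module):
    """Two 1-D convolutions with a skip connection, ResNet-style."""

    def __init__(self, channels: int, kernel_size: int = 3):
        super().__init__()
        padding = kernel_size // 2
        self.conv1 = nn.Conv1d(channels, channels, kernel_size, padding=padding)
        self.bn1 = nn.BatchNorm1d(channels)
        self.conv2 = nn.Conv1d(channels, channels, kernel_size, padding=padding)
        self.bn2 = nn.BatchNorm1d(channels)
        self.relu = nn.ReLU()

    def forward(self, x):
        out = self.relu(self.bn1(self.conv1(x)))
        out = self.bn2(self.conv2(out))
        return self.relu(out + x)  # residual (skip) connection


class ResNetLSTMActor(nn.Module):
    """Residual conv blocks extract local price-pattern features; an LSTM
    summarizes them over time; a linear head scores the trading actions."""

    def __init__(self, n_features: int = 1, channels: int = 64,
                 hidden: int = 32, n_actions: int = 3):
        super().__init__()
        self.stem = nn.Conv1d(n_features, channels, kernel_size=3, padding=1)
        self.blocks = nn.Sequential(ResidualBlock1D(channels),
                                    ResidualBlock1D(channels))
        self.lstm = nn.LSTM(channels, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_actions)

    def forward(self, prices):
        # prices: (batch, window, n_features); Conv1d expects (batch, channels, window)
        x = self.blocks(self.stem(prices.transpose(1, 2)))
        _, (h, _) = self.lstm(x.transpose(1, 2))
        return self.head(h[-1])  # logits over buy / hold / sell


# Example: score a batch of 8 price windows of 50 time steps each.
logits = ResNetLSTMActor()(torch.randn(8, 50, 1))
print(logits.shape)  # torch.Size([8, 3])

Trained with a supervised classification loss on labeled price movements, a model of this shape would play the role the paper assigns to its supervised actor; the reinforcement learning baselines named in the abstract would instead learn the action policy from trading rewards.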

Highlights

ResNet-LSTM actor (RSLSTM-A) as our proposed method for financial trading decision problems.
Comparison of our method against state-of-the-art reinforcement learning methods.
Real-world evaluation on the cryptocurrency market, surpassing the benchmark.
Robustness evaluation with and without transaction costs.
Feature extraction insights through graphical visualization of the layers' outputs.

      Published In

Expert Systems with Applications: An International Journal, Volume 202, Issue C
September 2022
      1548 pages

      Publisher

      Pergamon Press, Inc.

      United States

      Author Tags

      1. Deep neural network
      2. Reinforcement learning
      3. Stock trading
      4. Time series classification
      5. Cryptocurrencies

      Qualifiers

      • Research-article


      Cited By

      • (2024) Artificial intelligence techniques in financial trading. Journal of King Saud University - Computer and Information Sciences, 36(3). https://doi.org/10.1016/j.jksuci.2024.102015. Online publication date: 1 March 2024.
      • (2023) Research on the Portfolio Model of Deep Reinforcement Learning Based on Twin Delayed Deep Deterministic Policy Gradient Algorithm. Proceedings of the 2023 3rd International Conference on Big Data, Artificial Intelligence and Risk Management, pp. 392-396. https://doi.org/10.1145/3656766.3656833. Online publication date: 24 November 2023.
      • (2023) Novel insights into the modeling financial time-series through machine learning methods. Expert Systems with Applications: An International Journal, 234(C). https://doi.org/10.1016/j.eswa.2023.121012. Online publication date: 30 December 2023.
      • (2023) Digital financial asset price fluctuation forecasting in digital economy era using blockchain information. Expert Systems with Applications: An International Journal, 228(C). https://doi.org/10.1016/j.eswa.2023.120329. Online publication date: 15 October 2023.
      • (2023) Machine learning-based computation offloading in edge and fog: a systematic review. Cluster Computing, 26(5), pp. 3113-3144. https://doi.org/10.1007/s10586-023-04100-z. Online publication date: 1 October 2023.
      • (2022) Developing a smart stock trading system equipped with a novel risk control mechanism for investors with different risk appetites. Expert Systems with Applications: An International Journal, 210(C). https://doi.org/10.1016/j.eswa.2022.118614. Online publication date: 30 December 2022.
