Recurrent neural networks for stochastic control problems with delay

  • Original Article
Mathematics of Control, Signals, and Systems

Abstract

Stochastic control problems with delay are challenging because the system is path-dependent and therefore intrinsically high-dimensional. In this paper, we propose and systematically study deep neural network-based algorithms for solving stochastic control problems with delay. Specifically, we employ neural networks for sequence modeling (e.g., recurrent neural networks such as long short-term memory) to parameterize the policy and optimize the objective function. The proposed algorithms are tested on three benchmark examples: a linear-quadratic problem, optimal consumption with fixed finite delay, and portfolio optimization with complete memory. In particular, we observe that the recurrent architecture captures the path-dependent feature naturally and flexibly, and that it yields better performance with more efficient and stable training than feedforward networks. This advantage is evident even in portfolio optimization with complete memory, which features infinite delay.
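
The idea of parameterizing the policy with a recurrent network and optimizing the objective directly can be illustrated with a small sketch. It is not the authors' implementation: the toy delayed linear dynamics, the coefficients `a1`, `a2`, `sigma`, the scalar Elman-style cell, and the function names are all assumptions made for illustration, and plain random search stands in for the gradient-based training a deep learning framework would provide. The policy sees only the current state, so any dependence on the past path has to be carried by the hidden state `h`.

```python
import math
import random


def rnn_policy_step(params, h, x):
    """One step of a scalar Elman-style recurrent cell (hypothetical policy).

    Returns the new hidden state and the control computed from it.
    """
    w_h, w_x, b, w_out = params
    h_new = math.tanh(w_h * h + w_x * x + b)
    return h_new, w_out * h_new


def simulate_cost(params, n_paths=200, n_steps=50, delay_steps=5,
                  dt=0.02, sigma=0.2, seed=0):
    """Monte Carlo estimate of a quadratic cost under the recurrent policy.

    Assumed toy dynamics (Euler scheme, not taken from the paper):
        X_{k+1} = X_k + (a1*X_k + a2*X_{k-delay} + u_k)*dt + sigma*dW_k
    Cost: E[ sum_k (X_k^2 + u_k^2)*dt + X_T^2 ].
    """
    rng = random.Random(seed)   # fixed seed: same noise for every policy
    a1, a2 = -0.5, 1.0          # hypothetical drift coefficients
    total = 0.0
    for _ in range(n_paths):
        # Constant initial history on the delay window [-delay, 0].
        path = [1.0] * (delay_steps + 1)
        h, cost = 0.0, 0.0
        for _ in range(n_steps):
            x, x_delayed = path[-1], path[-1 - delay_steps]
            h, u = rnn_policy_step(params, h, x)
            cost += (x * x + u * u) * dt
            dw = rng.gauss(0.0, math.sqrt(dt))
            path.append(x + (a1 * x + a2 * x_delayed + u) * dt + sigma * dw)
        total += cost + path[-1] ** 2
    return total / n_paths


def random_search(n_iters=30, seed=1):
    """Gradient-free stand-in for training: keep perturbations that lower cost."""
    rng = random.Random(seed)
    best = (0.0, 0.0, 0.0, 0.0)          # zero weights give zero control
    best_cost = simulate_cost(best)
    for _ in range(n_iters):
        cand = tuple(p + rng.gauss(0.0, 0.5) for p in best)
        cand_cost = simulate_cost(cand)
        if cand_cost < best_cost:
            best, best_cost = cand, cand_cost
    return best, best_cost


if __name__ == "__main__":
    print("uncontrolled cost:", simulate_cost((0.0, 0.0, 0.0, 0.0)))
    print("after random search:", random_search()[1])
```

Reusing the same seed for every candidate means all policies are compared on identical noise paths, which keeps the comparison between candidates from being dominated by sampling error. A feedforward policy would instead need the whole delay window as input, which is where the dimension grows with the delay length.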

Notes

  1. https://github.com/frankhan91/RNN-ControlwithDelay.

Acknowledgements

J.H. and R.H. are grateful to the reviewers for their valuable and constructive comments.

Author information

Corresponding author

Correspondence to Ruimeng Hu.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

R.H. was partially supported by the NSF grant DMS-1953035, the Faculty Career Development Award and the Research Assistance Program Award, University of California, Santa Barbara.

About this article

Cite this article

Han, J., Hu, R. Recurrent neural networks for stochastic control problems with delay. Math. Control Signals Syst. 33, 775–795 (2021). https://doi.org/10.1007/s00498-021-00300-3

  • DOI: https://doi.org/10.1007/s00498-021-00300-3
