Abstract
This work is concerned with Markov chains on a finite state space. It is assumed that a state-dependent cost is incurred at each transition, and that the evolution of the system is observed by an agent with positive and constant risk-sensitivity. For a general transition matrix, the problem of approximating the risk-sensitive average criterion in terms of the risk-sensitive discounted index is studied. It is proved that, as the discount factor increases to 1, an appropriate normalization of the discounted value functions converges to the average cost, extending recent results derived under the assumption that the state space is communicating.
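For orientation, a minimal sketch of the two criteria mentioned in the abstract, written in LaTeX using a formulation that is common in the risk-sensitive Markov chain literature. The notation (one-step cost C, risk-sensitivity coefficient \lambda > 0, discount factor \alpha \in (0,1)) and the precise form of the discounted index are assumptions for illustration only and need not coincide with the definitions used in the paper:

% Risk-sensitive discounted index (certainty equivalent of the discounted total cost)
V_\alpha(x) \;=\; \frac{1}{\lambda}\,\log \mathbb{E}_x\!\left[\exp\!\Big(\lambda \sum_{t=0}^{\infty} \alpha^{t} C(X_t)\Big)\right],
\qquad
% Risk-sensitive average criterion
J(x) \;=\; \limsup_{n\to\infty} \frac{1}{n\lambda}\,\log \mathbb{E}_x\!\left[\exp\!\Big(\lambda \sum_{t=0}^{n-1} C(X_t)\Big)\right].

In this notation, the vanishing-discount statement of the abstract corresponds to showing that a normalization such as (1-\alpha)V_\alpha(x) converges to the average cost J(x) as \alpha \uparrow 1.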
Acknowledgements
The authors are grateful to the referees and the associate editor for their careful reading of the original manuscript and helpful suggestions to improve the paper.
About this article
Cite this article
Blancas-Rivera, R., Cavazos-Cadena, R. & Cruz-Suárez, H. Discounted approximations in risk-sensitive average Markov cost chains with finite state space. Math Meth Oper Res 91, 241–268 (2020). https://doi.org/10.1007/s00186-019-00689-3
Keywords
- Exponential utility
- Certainty equivalent
- Vanishing discount method
- Largest discounted cost from a given state
- Uniform discounted approximations