Optimal Control of Conditional Value-at-Risk in Continuous Time

Published: 01 January 2017

Abstract

We consider continuous-time stochastic optimal control problems featuring conditional value-at-risk (CVaR) in the objective. The major difficulty in these problems arises from time inconsistency, which prevents us from directly using dynamic programming. To resolve this challenge, we convert the problem to an equivalent bilevel optimization problem in which the inner optimization problem is a standard stochastic control problem. Furthermore, we provide conditions under which the outer objective function is convex and differentiable. We compute the outer objective's value via a Hamilton--Jacobi--Bellman equation and its gradient via the viscosity solution of a linear parabolic equation, which allows us to perform gradient descent. The significance of this result is that we provide an efficient dynamic-programming-based algorithm for optimal control of CVaR without lifting the state space. To broaden the applicability of the proposed algorithm, we propose convergent approximation schemes for cases where our key assumptions do not hold and characterize the relevant suboptimality bounds. In addition, we extend our method to a more general class of risk metrics, which includes mean-variance and median-deviation. We also demonstrate a concrete application to portfolio optimization under CVaR constraints. Our results contribute an efficient framework for solving time-inconsistent CVaR-based sequential optimization problems.
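
The bilevel structure mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes the reformulation is the standard Rockafellar and Uryasev representation, CVaR_alpha(X) = min over y of { y + E[(X - y)^+] / (1 - alpha) }; the abstract does not state this formula, so treat it as an assumption. The names `outer_objective` and `cvar_bilevel` are hypothetical. The inner problem here is only a sample mean with no control, and the outer minimization enumerates sample points instead of using the gradient descent the paper describes.

```python
import numpy as np


def outer_objective(y, losses, alpha):
    """Assumed outer function F(y) = y + E[(X - y)^+] / (1 - alpha).

    In the paper the expectation would come from an inner stochastic
    control problem; here it is a plain sample mean (illustration only).
    """
    losses = np.asarray(losses, dtype=float)
    return y + np.mean(np.maximum(losses - y, 0.0)) / (1.0 - alpha)


def cvar_bilevel(losses, alpha):
    """Minimize F over y by checking every sample value.

    F is convex and piecewise linear in y with kinks at the samples,
    so a minimizer can always be found among the sample values.
    Returns (minimal value, minimizing y).
    """
    losses = np.asarray(losses, dtype=float)
    values = [outer_objective(y, losses, alpha) for y in losses]
    best = int(np.argmin(values))
    return values[best], losses[best]


# Ten equally likely losses; at alpha = 0.8 the tail holds the worst two.
losses = np.arange(1.0, 11.0)
cvar, y_star = cvar_bilevel(losses, alpha=0.8)
print(cvar, y_star)
```

For this toy sample the minimal value is approximately 9.5, the mean of the two largest losses, and the minimizing `y` lies between 8 and 9, where the outer function is flat. Enumeration only works because the sample is small; with a controlled diffusion the outer objective would be evaluated by solving the inner problem for each `y`.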

Cited By

  • (2024) Risk-sensitive policy optimization via predictive CVaR policy gradient. Proceedings of the 41st International Conference on Machine Learning, pp. 24354-24369. DOI: 10.5555/3692070.3693046. Online publication date: 21-Jul-2024
  • (2021) Toward Improving the Distributional Robustness of Risk-Aware Controllers in Learning-Enabled Environments. 2021 60th IEEE Conference on Decision and Control (CDC), pp. 6024-6031. DOI: 10.1109/CDC45484.2021.9682981. Online publication date: 14-Dec-2021
  • (2018) Conditional Value-at-Risk for Reachability and Mean Payoff in Markov Decision Processes. Proceedings of the 33rd Annual ACM/IEEE Symposium on Logic in Computer Science, pp. 609-618. DOI: 10.1145/3209108.3209176. Online publication date: 9-Jul-2018

Published In

SIAM Journal on Control and Optimization  Volume 55, Issue 2
DOI:10.1137/sjcodc.55.2

Publisher

Society for Industrial and Applied Mathematics

United States

Author Tags

  1. conditional value-at-risk
  2. risk measures
  3. time inconsistency
  4. stochastic optimal control
  5. Hamilton--Jacobi--Bellman equations
  6. viscosity solutions
  7. dynamic programming

MSC Codes

  1. 49L20
  2. 49L25
  3. 90C39
  4. 93E20
  5. 91G80

Qualifiers

  • Research-article
