Efficient sampling in approximate dynamic programming algorithms

Cervellera, Cristiano; Muselli, Marco

doi:10.1007/s10589-007-9054-8

Published: 23 June 2007

Computational Optimization and Applications, Volume 38, pages 417–443 (2007)

Cristiano Cervellera¹ &
Marco Muselli²

Abstract

Dynamic Programming (DP) is a standard optimization tool for solving Stochastic Optimal Control (SOC) problems over either a finite or an infinite horizon of stages. Under very general assumptions, commonly employed numerical algorithms rely on approximating the cost-to-go functions by means of suitable parametric models built from a set of sampling points in the d-dimensional state space. This paper discusses the problem of sample complexity, i.e., how “fast” the number of points must grow with the input dimension in order to obtain an accurate estimate of the cost-to-go functions in typical DP approaches such as value iteration and policy iteration. It is shown that choosing the sampling points from low-discrepancy sequences, commonly used for efficient numerical integration, makes it possible to achieve, under suitable hypotheses, an almost linear sample complexity, thus helping to mitigate the curse of dimensionality of the approximate DP procedure.
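
As a rough illustration of what sampling from a low-discrepancy sequence can look like, the sketch below generates points of the Halton sequence and measures how evenly one coordinate covers the unit interval. The abstract does not say which sequence the authors use, so the choice of Halton is an assumption made here for illustration, and the function names (`radical_inverse`, `halton`, `star_discrepancy_1d`) are hypothetical. It is not the paper's algorithm, only a minimal example of the kind of point set the abstract refers to.

```python
import random


def radical_inverse(index: int, base: int) -> float:
    """Van der Corput radical inverse: mirror the base-`base` digits of
    `index` around the radix point to get a value in [0, 1)."""
    result, fraction = 0.0, 1.0 / base
    while index > 0:
        index, digit = divmod(index, base)
        result += digit * fraction
        fraction /= base
    return result


def halton(n_points: int, primes=(2, 3)) -> list:
    """First `n_points` of the Halton sequence in [0, 1)^d, using one
    prime base per state dimension (d = len(primes))."""
    return [
        tuple(radical_inverse(i, base) for base in primes)
        for i in range(1, n_points + 1)
    ]


def star_discrepancy_1d(xs) -> float:
    """Exact star discrepancy of a one-dimensional point set in [0, 1]."""
    xs = sorted(xs)
    n = len(xs)
    return max(max((i + 1) / n - x, x - i / n) for i, x in enumerate(xs))


if __name__ == "__main__":
    n = 64
    # Candidate sampling points for a 2-dimensional state space.
    states = halton(n)
    first_coordinate = [s[0] for s in states]

    # Uniform random sample of the same size, for comparison only.
    rng = random.Random(0)
    random_coordinate = [rng.random() for _ in range(n)]

    print("Halton discrepancy:", star_discrepancy_1d(first_coordinate))
    print("Random discrepancy:", star_discrepancy_1d(random_coordinate))
```

In an approximate DP setting, points such as `states` would be rescaled to the actual state domain and used as the locations at which the cost-to-go function is evaluated before fitting the parametric model.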

Author information

Authors and Affiliations

Istituto di Studi sui Sistemi Intelligenti per l’Automazione, Consiglio Nazionale delle Ricerche, Via de Marini 6, 16149, Genova, Italy
Cristiano Cervellera
Istituto di Elettronica e di Ingegneria dell’Informazione e delle Telecomunicazioni, Consiglio Nazionale delle Ricerche, Via de Marini 6, 16149, Genova, Italy
Marco Muselli

Corresponding author

Correspondence to Cristiano Cervellera.

About this article

Cite this article

Cervellera, C., Muselli, M. Efficient sampling in approximate dynamic programming algorithms. Comput Optim Appl 38, 417–443 (2007). https://doi.org/10.1007/s10589-007-9054-8

Received: 03 March 2004
Revised: 28 March 2005
Published: 23 June 2007
Issue Date: December 2007
DOI: https://doi.org/10.1007/s10589-007-9054-8
