On Linear Programming in a Markov Decision Problem

Published: 01 January 1970

Abstract

This paper treats a Markov decision problem with an infinite planning horizon and no discounting. This model is analyzed by application, perhaps repeated, of a simple linear program.
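For orientation, one standard linear program for the undiscounted (average-reward) problem is sketched below. It is the textbook formulation for the case in which every stationary policy induces a single recurrent class, and it is not necessarily the program analyzed in this paper. The symbols are notation assumed here rather than taken from the abstract: S is the state set, A(s) the actions available in state s, r(s,a) the one-step reward, p(j|s,a) the transition probability, and x(s,a) the long-run frequency of the state-action pair (s,a).

```latex
\begin{align*}
\max_{x}\quad & \sum_{s\in S}\sum_{a\in A(s)} r(s,a)\,x(s,a) \\
\text{s.t.}\quad & \sum_{a\in A(j)} x(j,a) \;-\; \sum_{s\in S}\sum_{a\in A(s)} p(j\mid s,a)\,x(s,a) = 0, \qquad j\in S, \\
& \sum_{s\in S}\sum_{a\in A(s)} x(s,a) = 1, \\
& x(s,a)\ \ge\ 0, \qquad s\in S,\ a\in A(s).
\end{align*}
```

Under the single-recurrent-class assumption, the optimal value of this program is the maximal long-run average reward per period, and an optimal stationary policy can be recovered from the actions with x(s,a) > 0.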

Published In

Management Science, Volume 16, Issue 5
January 1970
145 pages

Publisher

INFORMS

Linthicum, MD, United States