
Iterative Aggregation-Disaggregation Procedures for Discounted Semi-Markov Reward Processes

Published: 01 June 1985

Abstract

The equation v = q + Mv, where M is a matrix with nonnegative elements and spectral radius less than one, arises in Markovian decision processes and input-output models. In this paper, we solve the equation using an iterative aggregation-disaggregation procedure that alternates between solving an aggregated problem and disaggregating the variables, one block at a time, in terms of the aggregate variables of the other blocks. The disaggregated variables are then used to guide the choice of weights in the subsequent aggregation. Computational experiments on randomly generated and inventory problems indicate that this algorithm is significantly faster than successive approximations when the spectral radius of M is near one, and is slower in unstructured problems with spectral radii in the neighborhood of 0.8. The algorithm appears promising for large structured problems, where it can often reduce computational time and main memory storage requirements and offer greater robustness to initial values.
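To make the idea concrete, here is a minimal sketch that compares successive approximations with a simplified aggregation-disaggregation iteration on v = q + Mv. It is not the procedure of the paper: it assumes an additive, block-constant correction with uniform weights inside each block, followed by one successive-approximation sweep, and it assumes M has row sums below one so that the aggregated system is solvable. The names `apply`, `solve`, `iad_step`, and `iterate`, as well as the 4-state example, are hypothetical and chosen only for illustration.

```python
def apply(M, q, v):
    """One successive-approximation sweep: return q + M v."""
    n = len(v)
    return [q[i] + sum(M[i][j] * v[j] for j in range(n)) for i in range(n)]


def residual_norm(M, q, v):
    """Max-norm of the residual q + M v - v."""
    Tv = apply(M, q, v)
    return max(abs(Tv[i] - v[i]) for i in range(len(v)))


def solve(A, b):
    """Solve a small dense system A x = b (Gaussian elimination, partial pivoting)."""
    n = len(b)
    aug = [list(A[i]) + [b[i]] for i in range(n)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(aug[r][c]))
        aug[c], aug[p] = aug[p], aug[c]
        for r in range(c + 1, n):
            f = aug[r][c] / aug[c][c]
            for k in range(c, n + 1):
                aug[r][k] -= f * aug[c][k]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(aug[i][k] * x[k] for k in range(i + 1, n))
        x[i] = (aug[i][n] - s) / aug[i][i]
    return x


def iad_step(M, q, v, blocks):
    """Aggregate, solve the small problem, shift each block, then sweep once."""
    n, K = len(v), len(blocks)
    Tv = apply(M, q, v)
    r = [Tv[i] - v[i] for i in range(n)]  # residual q + M v - v
    # Aggregated matrix: block-to-block mass, averaged uniformly over each row block.
    A = [[sum(M[i][j] for i in blocks[a] for j in blocks[c]) / len(blocks[a])
          for c in range(K)] for a in range(K)]
    rhs = [sum(r[i] for i in blocks[a]) / len(blocks[a]) for a in range(K)]
    I_minus_A = [[(1.0 if a == c else 0.0) - A[a][c] for c in range(K)]
                 for a in range(K)]
    y = solve(I_minus_A, rhs)  # one correction per block
    shifted = list(v)
    for a, members in enumerate(blocks):
        for i in members:
            shifted[i] += y[a]
    return apply(M, q, shifted)  # disaggregation via one ordinary sweep


def iterate(step, M, q, tol=1e-8, max_iter=10000):
    """Run `step` from v = 0 until the residual max-norm drops below tol."""
    v = [0.0] * len(q)
    for k in range(max_iter):
        if residual_norm(M, q, v) < tol:
            return v, k
        v = step(v)
    return v, max_iter


# Illustrative data (assumed): discount factor 0.99, states in the same block
# share a transition row but earn different rewards.
beta = 0.99
row_a = [0.4, 0.4, 0.1, 0.1]
row_b = [0.1, 0.1, 0.4, 0.4]
M = [[beta * p for p in row] for row in (row_a, row_a, row_b, row_b)]
q = [1.0, 2.0, 3.0, 4.0]
blocks = [[0, 1], [2, 3]]

v_sa, k_sa = iterate(lambda v: apply(M, q, v), M, q)
v_iad, k_iad = iterate(lambda v: iad_step(M, q, v, blocks), M, q)
print("successive approximations:", k_sa, "iterations")
print("aggregation-disaggregation sketch:", k_iad, "iterations")
```

The example is deliberately favorable to aggregation: because states in a block share a transition row, the error after any sweep is constant within each block, which is exactly the component the aggregated solve removes. On unstructured matrices this simplified variant may help much less, and it is not guaranteed to converge.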



Published In

Operations Research, Volume 33, Issue 3
June 1985
236 pages

Publisher

INFORMS

Linthicum, MD, United States

Author Tags

  1. 118 efficient computation
  2. 133 efficient computation

Qualifiers

  • Article


Cited By

  • (2016) Interpretable policies for dynamic product recommendations. Proceedings of the Thirty-Second Conference on Uncertainty in Artificial Intelligence, 607-616. DOI: 10.5555/3020948.3021011. Online publication date: 25-Jun-2016
  • (1999) Decision-theoretic planning. Journal of Artificial Intelligence Research 11:1, 1-94. DOI: 10.5555/3013545.3013546. Online publication date: 1-Jul-1999
  • (1997) Model reduction techniques for computing approximately optimal solutions for Markov decision processes. Proceedings of the Thirteenth conference on Uncertainty in artificial intelligence, 124-131. DOI: 10.5555/2074226.2074241. Online publication date: 1-Aug-1997
  • (1996) Planning, learning and coordination in multiagent decision processes. Proceedings of the 6th conference on Theoretical aspects of rationality and knowledge, 195-210. DOI: 10.5555/1029693.1029710. Online publication date: 17-Mar-1996
  • (1994) Using abstractions for decision-theoretic planning with time constraints. Proceedings of the Twelfth AAAI National Conference on Artificial Intelligence, 1016-1022. DOI: 10.5555/2891730.2891887. Online publication date: 1-Aug-1994
  • (1991) Aggregation and Disaggregation Techniques and Methodology in Optimization. Operations Research 39:4, 553-582. DOI: 10.1287/opre.39.4.553. Online publication date: 1-Aug-1991
