DOI: 10.5555/1643031.1643045

Decomposition techniques for planning in stochastic domains

Published: 20 August 1995

Abstract

This paper is concerned with modeling planning problems involving uncertainty as discrete-time, finite-state stochastic automata. Solving planning problems is reduced to computing policies for Markov decision processes. Classical methods for solving Markov decision processes cannot cope with the size of the state spaces for typical problems encountered in practice. As an alternative, we investigate methods that decompose global planning problems into a number of local problems, solve the local problems separately, and then combine the local solutions to generate a global solution. We present algorithms that decompose planning problems into smaller problems given an arbitrary partition of the state space. The local problems are interpreted as Markov decision processes, and solutions to the local problems are interpreted as policies restricted to the subsets of the state space defined by the partition. One algorithm relies on constructing and solving an abstract version of the original decision problem. A second algorithm iteratively approximates parameters of the local problems to converge to an optimal solution. We show how properties of a specified partition affect the time and storage required for these algorithms.
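The general idea of solving local problems over a partition can be illustrated with a small sketch. The code below is not the paper's algorithm; it is a block-wise value iteration written only to show the flavor of the approach, under several assumptions: a discounted reward formulation, a hypothetical corridor domain (`make_chain`), and local problems in which the values of states outside the current region are held fixed as the "parameters" that get re-estimated on each outer pass. All function names are invented for this illustration.

```python
from typing import Dict, List, Tuple

# transitions[s][a] = list of (probability, next_state, reward)
Transitions = Dict[int, Dict[str, List[Tuple[float, int, float]]]]


def make_chain(n: int, slip: float = 0.1) -> Transitions:
    """Hypothetical toy domain: a corridor of n states, goal at the right end."""
    t: Transitions = {}
    for s in range(n):
        if s == n - 1:
            t[s] = {"stay": [(1.0, s, 0.0)]}  # absorbing goal, no further cost
            continue
        left, right = max(s - 1, 0), s + 1
        t[s] = {
            "left": [(1 - slip, left, -1.0), (slip, right, -1.0)],
            "right": [(1 - slip, right, -1.0), (slip, left, -1.0)],
        }
    return t


def backup(t: Transitions, v: Dict[int, float], s: int, gamma: float):
    """One Bellman backup at state s: returns (best action, its value)."""
    best_a, best_q = None, float("-inf")
    for a, outcomes in t[s].items():
        q = sum(p * (r + gamma * v[s2]) for p, s2, r in outcomes)
        if q > best_q:
            best_a, best_q = a, q
    return best_a, best_q


def solve_region(t, v, region, gamma, tol):
    """Local problem: value iteration over `region` only.

    Values of states outside the region are read but never written,
    so they act as fixed boundary estimates for this local solve.
    """
    while True:
        delta = 0.0
        for s in region:
            _, q = backup(t, v, s, gamma)
            delta = max(delta, abs(q - v[s]))
            v[s] = q
        if delta < tol:
            return


def solve_by_partition(t, partition, gamma=0.95, tol=1e-9):
    """Repeat the local solves until a full pass changes no value by tol."""
    v = {s: 0.0 for s in t}
    while True:
        before = dict(v)
        for region in partition:
            solve_region(t, v, region, gamma, tol)
        if max(abs(v[s] - before[s]) for s in t) < tol:
            break
    # Combine: read off a greedy policy for every state from the final values.
    policy = {s: backup(t, v, s, gamma)[0] for s in t}
    return v, policy


def value_iteration(t, gamma=0.95, tol=1e-9):
    """Global baseline: the same procedure with a single region."""
    return solve_by_partition(t, [list(t)], gamma, tol)


if __name__ == "__main__":
    mdp = make_chain(12)
    blocks = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
    values, policy = solve_by_partition(mdp, blocks)
    print(policy)
```

In this sketch each region's policy is simply the restriction of the final greedy policy to that region, which mirrors the abstract's reading of local solutions as policies restricted to subsets of the state space. How the partition is chosen, and how boundary parameters are estimated in the paper's own algorithms, is not specified in the abstract and is not modeled here.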

Published In

IJCAI'95: Proceedings of the 14th International Joint Conference on Artificial Intelligence - Volume 2
August 1995
2077 pages
ISBN: 1558603638

Sponsors

  • Société canadienne pour l'étude de l'intelligence par ordinateur
  • Canadian Society for Computational Studies of Intelligence
  • AAAI: American Association for Artificial Intelligence
  • The International Joint Conferences on Artificial Intelligence, Inc.

Publisher

Morgan Kaufmann Publishers Inc.

San Francisco, CA, United States

Qualifiers

  • Article

Cited By

  • (2018) Parallel Hierarchical Pre-Gauss-Seidel Value Iteration Algorithm. International Journal of Decision Support System Technology 10:2 (1-22). DOI: 10.4018/IJDSST.2018040101. Online publication date: 1-Apr-2018
  • (2015) Planning for Crowdsourcing Hierarchical Tasks. Proceedings of the 2015 International Conference on Autonomous Agents and Multiagent Systems (1191-1199). DOI: 10.5555/2772879.2773301. Online publication date: 4-May-2015
  • (2014) Automatic abstraction controller in reinforcement learning agent via automata. Applied Soft Computing 25:C (118-128). DOI: 10.1016/j.asoc.2014.08.071. Online publication date: 1-Dec-2014
  • (2008) Economic hierarchical Q-learning. Proceedings of the 23rd National Conference on Artificial Intelligence - Volume 2 (689-695). DOI: 10.5555/1620163.1620179. Online publication date: 13-Jul-2008
  • (2006) Stochastic over-subscription planning using hierarchies of MDPs. Proceedings of the Sixteenth International Conference on Automated Planning and Scheduling (121-130). DOI: 10.5555/3037104.3037121. Online publication date: 6-Jun-2006
  • (2006) A compact, hierarchically optimal Q-function decomposition. Proceedings of the Twenty-Second Conference on Uncertainty in Artificial Intelligence (332-340). DOI: 10.5555/3020419.3020460. Online publication date: 13-Jul-2006
  • (2006) Causal Graph Based Decomposition of Factored MDPs. The Journal of Machine Learning Research 7 (2259-2301). DOI: 10.5555/1248547.1248628. Online publication date: 1-Dec-2006
  • (2005) Fast exact planning in Markov decision processes. Proceedings of the Fifteenth International Conference on Automated Planning and Scheduling (151-160). DOI: 10.5555/3037062.3037082. Online publication date: 5-Jun-2005
  • (2005) Hybrid BDI-POMDP framework for multiagent teaming. Journal of Artificial Intelligence Research 23:1 (367-420). DOI: 10.5555/1622503.1622512. Online publication date: 1-Apr-2005
  • (2005) Restricted value iteration. Journal of Artificial Intelligence Research 23:1 (123-165). DOI: 10.5555/1622503.1622507. Online publication date: 1-Feb-2005
