Coarticulation: an approach for generating concurrent plans in Markov decision processes

Published: 07 August 2005
DOI: 10.1145/1102351.1102442

Abstract
We study an approach for performing concurrent activities in Markov decision processes (MDPs) based on the coarticulation framework. We assume that the agent has multiple degrees of freedom (DOF) in the action space, which enable it to perform several activities simultaneously. We demonstrate that one natural way of generating concurrency in the system is to coarticulate among the set of learned activities available to the agent. In general, due to the multiple DOF in the system, there often exists a redundant set of admissible sub-optimal policies associated with each learned activity. This flexibility enables the agent, given a new task defined in terms of a set of prioritized subgoals, to commit to several subgoals concurrently according to their priority levels. We present efficient approximate algorithms for computing such policies and for generating concurrent plans, and we evaluate our approach in a simulated domain.
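The priority-based idea in the abstract can be illustrated in code. The following is a minimal sketch, not the paper's actual algorithm: it assumes each learned activity exposes a Q-function, treats the set of actions within epsilon of optimal for a subgoal as that subgoal's "redundant admissible set", and intersects these sets in priority order so the agent serves lower-priority subgoals only when doing so does not violate higher-priority ones. All names here (`coarticulate`, `q_values`, `epsilon`) are hypothetical.

```python
def admissible_actions(q_sa, epsilon):
    """Actions whose value is within epsilon of the best action
    for one subgoal: the 'redundant' set of near-optimal choices."""
    best = max(q_sa.values())
    return {a for a, v in q_sa.items() if v >= best - epsilon}

def coarticulate(q_values, state, priorities, epsilon=0.1):
    """Pick a single action by intersecting admissible action sets
    in priority order (highest-priority subgoal first).

    q_values:   dict subgoal -> dict state -> dict action -> float
    priorities: list of subgoal names, highest priority first
    """
    top = priorities[0]
    candidates = admissible_actions(q_values[top][state], epsilon)
    for subgoal in priorities[1:]:
        refined = candidates & admissible_actions(q_values[subgoal][state], epsilon)
        if refined:
            # Commit to the lower-priority subgoal only if some action
            # serves it while remaining admissible for all higher ones.
            candidates = refined
    # Break remaining ties in favour of the highest-priority subgoal.
    return max(candidates, key=lambda a: q_values[top][state][a])
```

For example, if `right` is slightly sub-optimal for a high-priority reaching subgoal but also near-optimal for a lower-priority obstacle-avoidance subgoal, the sketch selects `right`, exploiting the redundancy instead of executing the subgoals sequentially.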


Published In
    ICML '05: Proceedings of the 22nd international conference on Machine learning
    August 2005
    1113 pages
    ISBN:1595931805
    DOI:10.1145/1102351

Publisher

Association for Computing Machinery, New York, NY, United States

    Acceptance Rates

    Overall Acceptance Rate 140 of 548 submissions, 26%

    Cited By

    View all
    • (2018)Modeling sensory-motor decisions in natural behaviorPLOS Computational Biology10.1371/journal.pcbi.100651814:10(e1006518)Online publication date: 25-Oct-2018
    • (2018)Learning Generalizable Control ProgramsIEEE Transactions on Autonomous Mental Development10.1109/TAMD.2010.21033113:3(216-231)Online publication date: 12-Dec-2018
    • (2018)Efficient behavior learning in human---robot collaborationAutonomous Robots10.1007/s10514-017-9674-542:5(1103-1115)Online publication date: 28-Dec-2018
    • (2016)Relational activity processes for modeling concurrent cooperation2016 IEEE International Conference on Robotics and Automation (ICRA)10.1109/ICRA.2016.7487765(5505-5511)Online publication date: May-2016
