Probabilistic inference for determining options in reinforcement learning

Published: 01 September 2016

Abstract

Tasks that require many sequential decisions or complex solutions are hard to solve using conventional reinforcement learning algorithms. Building on the semi-Markov decision process (SMDP) setting and the option framework, we propose a model that aims to alleviate these difficulties. Instead of learning a single monolithic policy, the agent learns a set of simpler sub-policies as well as the initiation and termination probabilities for each of those sub-policies. While existing option learning algorithms frequently require manual specification of components such as the sub-policies, we present an algorithm that infers all relevant components of the option framework from data. Furthermore, the proposed approach is based on parametric option representations and works well in combination with current policy search methods, which are particularly well suited for continuous real-world tasks. We present results on SMDPs with discrete as well as continuous state-action spaces. The results show that the presented algorithm can combine simple sub-policies to solve complex tasks and can improve learning performance on simpler tasks.
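
To make the components named in the abstract concrete, the following is a minimal sketch of how a policy composed of options might generate behaviour: each option pairs a sub-policy with a termination probability, and an initiation distribution picks the next option whenever the active one terminates. It does not implement the inference algorithm the abstract refers to. All names (`Option`, `rollout`, `initiation`, `step`) and the deterministic toy chain environment are assumptions made for illustration and do not come from the article.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Sequence, Tuple


@dataclass
class Option:
    """Hypothetical option: a sub-policy plus a termination probability."""
    sub_policy: Callable[[int], Sequence[float]]  # state -> action probabilities
    termination: Callable[[int], float]           # state -> P(terminate | state)


def sample_index(probs: Sequence[float], rng: random.Random) -> int:
    """Draw an index from a categorical distribution."""
    return rng.choices(range(len(probs)), weights=probs, k=1)[0]


def rollout(
    options: List[Option],
    initiation: Callable[[int], Sequence[float]],  # state -> option probabilities
    step: Callable[[int, int], int],               # (state, action) -> next state
    state: int,
    horizon: int,
    rng: random.Random,
) -> List[Tuple[int, int, int]]:
    """Sample (state, option, action) triples from the composed policy."""
    trajectory = []
    active = None
    for _ in range(horizon):
        # Choose an option at the start, or when the active option terminates.
        if active is None or rng.random() < options[active].termination(state):
            active = sample_index(initiation(state), rng)
        action = sample_index(options[active].sub_policy(state), rng)
        trajectory.append((state, active, action))
        state = step(state, action)
    return trajectory


# Toy assumption: a chain of states 0..9 with actions 0 (left) and 1 (right).
def step(state: int, action: int) -> int:
    return max(0, min(9, state + (1 if action == 1 else -1)))


options = [
    # Option 0 always moves right and terminates once state >= 5.
    Option(lambda s: [0.0, 1.0], lambda s: 1.0 if s >= 5 else 0.0),
    # Option 1 always moves left and terminates once state <= 2.
    Option(lambda s: [1.0, 0.0], lambda s: 1.0 if s <= 2 else 0.0),
]


def initiation(state: int) -> Sequence[float]:
    # Prefer option 0 in the lower half of the chain, option 1 otherwise.
    return [1.0, 0.0] if state < 5 else [0.0, 1.0]


trajectory = rollout(options, initiation, step, state=0, horizon=8,
                     rng=random.Random(0))
for s, o, a in trajectory:
    print(f"state={s} option={o} action={a}")
```

Because every probability in this toy setup is 0 or 1, the rollout is deterministic: option 0 moves the agent from state 0 to state 5, where it terminates and option 1 takes over. In a learning setting, the sub-policies, termination probabilities, and initiation distribution would be parametric and estimated from data rather than written by hand.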

Information

Published In

Machine Learning, Volume 104, Issue 2-3
September 2016
314 pages

Publisher

Kluwer Academic Publishers

United States

Author Tags

  1. Options
  2. Reinforcement learning
  3. Robot learning
  4. Semi Markov decision process

Qualifiers

  • Article

Cited By

  • (2023) Hierarchical imitation learning with vector quantized models. Proceedings of the 40th International Conference on Machine Learning, pp. 17896-17919. doi:10.5555/3618408.3619145. Online publication date: 23-Jul-2023
  • (2023) Robust subtask learning for compositional generalization. Proceedings of the 40th International Conference on Machine Learning, pp. 15371-15387. doi:10.5555/3618408.3619035. Online publication date: 23-Jul-2023
  • (2023) Learning Hierarchical Planning-Based Policies from Offline Data. Machine Learning and Knowledge Discovery in Databases: Research Track, pp. 489-505. doi:10.1007/978-3-031-43421-1_29. Online publication date: 18-Sep-2023
  • (2022) Behavior priors for efficient reinforcement learning. The Journal of Machine Learning Research 23(1), pp. 9989-10056. doi:10.5555/3586589.3586810. Online publication date: 1-Jan-2022
  • (2022) Continuous Action Reinforcement Learning From a Mixture of Interpretable Experts. IEEE Transactions on Pattern Analysis and Machine Intelligence 44(10, Part 2), pp. 6795-6806. doi:10.1109/TPAMI.2021.3103132. Online publication date: 1-Oct-2022
  • (2022) Gaussian Process Self-triggered Policy Search in Weakly Observable Environments. 2022 International Conference on Robotics and Automation (ICRA), pp. 5946-5952. doi:10.1109/ICRA46639.2022.9811781. Online publication date: 23-May-2022
  • (2021) Discovery of options via meta-learned subgoals. Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 29861-29873. doi:10.5555/3540261.3542546. Online publication date: 6-Dec-2021
  • (2021) Flexible option learning. Proceedings of the 35th International Conference on Neural Information Processing Systems, pp. 4632-4646. doi:10.5555/3540261.3540615. Online publication date: 6-Dec-2021
  • (2021) Hierarchical Reinforcement Learning. ACM Computing Surveys 54(5), pp. 1-35. doi:10.1145/3453160. Online publication date: 5-Jun-2021
  • (2021) Online Baum-Welch algorithm for Hierarchical Imitation Learning. 2021 60th IEEE Conference on Decision and Control (CDC), pp. 3717-3722. doi:10.1109/CDC45484.2021.9683044. Online publication date: 14-Dec-2021
