Abstract
We present a new reinforcement learning approach for deterministic continuous control problems in environments with unknown, arbitrary reward functions. The difficulty of finding solution trajectories for such problems can be reduced by incorporating limited prior knowledge of the approximate local system dynamics. The presented algorithm builds an adaptive state graph of sample points within the continuous state space. The nodes of the graph are generated by an efficient, principled exploration scheme that directs the agent towards promising regions while maintaining good online performance. Global solution trajectories are formed as combinations of local controllers that connect nodes of the graph, thereby naturally allowing continuous actions and continuous time steps. We demonstrate our approach on various movement planning tasks in continuous domains.
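The core idea of the abstract, sampling nodes in the continuous state space, connecting them where a local controller can move between them, and searching the resulting graph for a global trajectory, can be illustrated with a minimal sketch. The code below is not the authors' implementation; it assumes a 2-D state space, a straight-line local controller whose cost is approximated by Euclidean distance (standing in for the learned local dynamics and reward), and plain Dijkstra search in place of the paper's adaptive exploration scheme. Names such as `build_state_graph` and `local_controller_cost` are hypothetical.

```python
import heapq
import math
import random

def local_controller_cost(a, b, max_range=0.3):
    """Hypothetical local controller: it can connect two states only if
    they lie within max_range; the transition cost is approximated by
    Euclidean distance."""
    d = math.dist(a, b)
    return d if d <= max_range else None

def build_state_graph(samples, cost_fn=local_controller_cost):
    """Connect every ordered pair of sampled states that the local
    controller can reach, storing (neighbor, cost) adjacency lists."""
    graph = {i: [] for i in range(len(samples))}
    for i, a in enumerate(samples):
        for j, b in enumerate(samples):
            if i != j:
                c = cost_fn(a, b)
                if c is not None:
                    graph[i].append((j, c))
    return graph

def shortest_path(graph, start, goal):
    """Dijkstra search over the state graph; the returned node sequence
    corresponds to a global trajectory assembled from local-controller
    segments."""
    dist = {start: 0.0}
    prev = {}
    pq = [(0.0, start)]
    while pq:
        d, u = heapq.heappop(pq)
        if u == goal:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale queue entry
        for v, c in graph[u]:
            nd = d + c
            if nd < dist.get(v, float("inf")):
                dist[v], prev[v] = nd, u
                heapq.heappush(pq, (nd, v))
    if goal not in dist:
        return None  # no chain of local controllers reaches the goal
    path = [goal]
    while path[-1] != start:
        path.append(prev[path[-1]])
    return list(reversed(path))

if __name__ == "__main__":
    random.seed(0)
    samples = [(random.random(), random.random()) for _ in range(50)]
    samples[0], samples[1] = (0.0, 0.0), (1.0, 1.0)  # start and goal states
    graph = build_state_graph(samples)
    print(shortest_path(graph, start=0, goal=1))
```

Because edges exist only where a local controller succeeds, path costs are grounded in the local dynamics rather than in a discretization of the full state space; in the paper this graph additionally grows adaptively as the exploration scheme adds new sample points.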
Copyright information
© 2007 Springer-Verlag Berlin Heidelberg
Cite this paper
Neumann, G., Pfeiffer, M., Maass, W. (2007). Efficient Continuous-Time Reinforcement Learning with Adaptive State Graphs. In: Kok, J.N., Koronacki, J., Mantaras, R.L.d., Matwin, S., Mladenič, D., Skowron, A. (eds) Machine Learning: ECML 2007. Lecture Notes in Computer Science, vol 4701. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-74958-5_25
DOI: https://doi.org/10.1007/978-3-540-74958-5_25
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-74957-8
Online ISBN: 978-3-540-74958-5