Hybrid Least-Squares Algorithms for Approximate Policy Evaluation

Johns, Jeff; Petrik, Marek; Mahadevan, Sridhar

doi:10.1007/978-3-642-04180-8_9

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 5781)

Included in the following conference series:

Joint European Conference on Machine Learning and Knowledge Discovery in Databases

Abstract

The goal of approximate policy evaluation is to “best” represent a target value function according to a specific criterion. Different algorithms offer different choices of the optimization criterion. Two popular least-squares algorithms for performing this task are the Bellman residual method, which minimizes the Bellman residual, and the fixed point method, which minimizes the projection of the Bellman residual. When used within policy iteration, the fixed point algorithm tends to ultimately find better performing policies whereas the Bellman residual algorithm exhibits more stable behavior between rounds of policy iteration. We propose two hybrid least-squares algorithms to try to combine the advantages of these algorithms. We provide an analytical and geometric interpretation of hybrid algorithms and demonstrate their utility on a simple problem. Experimental results on both small and large domains suggest hybrid algorithms may find solutions that lead to better policies when performing policy iteration.

This is an extended abstract of an article published in the journal Machine Learning [1].
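The two criteria named in the abstract can be made concrete with a small numerical sketch. The code below is an illustration under stated assumptions, not the authors' algorithms: it assumes a known transition matrix `P`, reward vector `R`, feature matrix `Phi`, and state weighting `d` (all hypothetical toy values), and it blends the two objectives as a convex combination with a parameter `xi`. The abstract does not specify how the hybrids are defined, so this blend is only one plausible reading. Setting `xi = 1` gives the Bellman residual solution and `xi = 0` gives the fixed point solution.

```python
import numpy as np

def policy_evaluation_weights(Phi, P, R, gamma, d, xi):
    """Least-squares weights for V_hat = Phi @ w.

    Minimizes xi * ||BR||_D^2 + (1 - xi) * ||Pi BR||_D^2, where
    BR = R - (Phi - gamma * P @ Phi) @ w is the Bellman residual and
    Pi is the D-weighted projection onto span(Phi).
    The convex combination is an assumption made for this sketch.
    """
    D = np.diag(d)
    M = Phi - gamma * P @ Phi  # V_hat - gamma * P @ V_hat equals M @ w
    # D-weighted orthogonal projection onto the column space of Phi
    Pi = Phi @ np.linalg.solve(Phi.T @ D @ Phi, Phi.T @ D)
    # Combined weighting matrix; uses the identity Pi.T @ D @ Pi == D @ Pi
    W = D @ (xi * np.eye(len(d)) + (1.0 - xi) * Pi)
    # Normal equations of the weighted least-squares problem
    return np.linalg.solve(M.T @ W @ M, M.T @ W @ R)

# Hypothetical 4-state chain under a fixed policy (rows of P sum to 1)
P = np.array([[0.50, 0.50, 0.00, 0.00],
              [0.25, 0.50, 0.25, 0.00],
              [0.00, 0.25, 0.50, 0.25],
              [0.00, 0.00, 0.50, 0.50]])
R = np.array([0.0, 0.0, 0.0, 1.0])    # reward only in the last state
Phi = np.array([[1.0, 0.0],
                [1.0, 1.0],
                [1.0, 2.0],
                [1.0, 3.0]])          # constant and linear features
d = np.full(4, 0.25)                  # uniform state weighting
gamma = 0.9

w_br = policy_evaluation_weights(Phi, P, R, gamma, d, xi=1.0)      # Bellman residual
w_fp = policy_evaluation_weights(Phi, P, R, gamma, d, xi=0.0)      # fixed point
w_hybrid = policy_evaluation_weights(Phi, P, R, gamma, d, xi=0.5)  # blend
```

With a tabular basis (`Phi` equal to the identity) both criteria can be driven to zero, so every `xi` recovers the exact value function; the methods differ only when the features cannot represent it.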

References

Johns, J., Petrik, M., Mahadevan, S.: Hybrid Least-Squares Algorithms for Approximate Policy Evaluation. Machine Learning (2009) doi: 10.1007/s10994-009-5128-4

Author information

Authors and Affiliations

Department of Computer Science, University of Massachusetts Amherst, USA
Jeff Johns, Marek Petrik & Sridhar Mahadevan

Editor information

Editors and Affiliations

NICTA, Locked Bag 8001, Canberra, 2601, Australia and Helsinki Institute of IT, Finland
Wray Buntine
Dept. of Knowledge Technologies, Jožef Stefan Institute, Jamova 39, 1000, Ljubljana, Slovenia
Marko Grobelnik & Dunja Mladenić
University College London, The Centre for Computational Statistics and Machine Learning Department of Computer Science, Gower St., WC1E 6BT, London, UK
John Shawe-Taylor

Cite this paper

Johns, J., Petrik, M., Mahadevan, S. (2009). Hybrid Least-Squares Algorithms for Approximate Policy Evaluation. In: Buntine, W., Grobelnik, M., Mladenić, D., Shawe-Taylor, J. (eds) Machine Learning and Knowledge Discovery in Databases. ECML PKDD 2009. Lecture Notes in Computer Science, vol 5781. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-04180-8_9

DOI: https://doi.org/10.1007/978-3-642-04180-8_9
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-04179-2
Online ISBN: 978-3-642-04180-8
eBook Packages: Computer Science, Computer Science (R0)
