Abstract
We investigate the problem of using function approximation in reinforcement learning where the agent's policy is represented as a classifier mapping states to actions. High classification accuracy is usually assumed to correlate with high policy quality. This is not necessarily the case: increasing classification accuracy can actually decrease the policy's quality. This phenomenon occurs when the learning process begins to focus on classifying less "important" states. In this paper, we introduce a measure of a state's decision-making importance that can be used to improve policy learning. As a result, the focused learning process is shown to converge faster to better policies.
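The idea can be illustrated with a small sketch. The importance measure below (the gap between the best and second-best action values, i.e. the regret incurred by the cheapest misclassification) is one plausible instantiation, not necessarily the paper's exact definition; the Q-values are invented for illustration. A cost-sensitive classifier trained with these weights would largely ignore states where all actions are nearly equivalent and concentrate on states where a wrong label is costly.

```python
import numpy as np

# Hypothetical Q-values for 4 states x 3 actions (illustrative numbers only).
Q = np.array([
    [1.0, 0.9, 0.8],    # low importance: actions nearly equivalent
    [5.0, 1.0, 0.5],    # high importance: a wrong action is costly
    [2.0, 1.9, 1.95],   # very low importance
    [3.0, 0.0, 2.9],    # low importance despite one bad action
])

# Greedy policy labels: the best action in each state becomes the class label.
labels = Q.argmax(axis=1)

# One possible importance measure (an assumption, not the paper's exact
# definition): the gap between the best and second-best action values.
sorted_Q = np.sort(Q, axis=1)
importance = sorted_Q[:, -1] - sorted_Q[:, -2]

# A cost-sensitive learner would weight each training example by its
# importance, so unimportant states barely affect the classification loss.
weights = importance / importance.sum()
```

Plain classification accuracy treats every state equally, so a classifier can "improve" by getting many unimportant states right while mislabeling the few states that matter; importance weighting removes that failure mode.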
Keywords
- Policy Language
- Neural Information Processing System
- Good Policy
- Policy Learning
- Reinforcement Learning Method
© 2004 Springer-Verlag Berlin Heidelberg
Li, L., Bulitko, V., Greiner, R. (2004). Batch Reinforcement Learning with State Importance. In: Boulicaut, JF., Esposito, F., Giannotti, F., Pedreschi, D. (eds) Machine Learning: ECML 2004. ECML 2004. Lecture Notes in Computer Science(), vol 3201. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-30115-8_53
DOI: https://doi.org/10.1007/978-3-540-30115-8_53
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-23105-9
Online ISBN: 978-3-540-30115-8