Article

Programmable reinforcement learning agents

Authors:

David Andre,

Stuart J. Russell

NIPS'00: Proceedings of the 13th International Conference on Neural Information Processing Systems

Pages 975 - 981

Published: 01 January 2000

Abstract

We present an expressive agent design language for reinforcement learning that allows the user to constrain the policies considered by the learning process. The language includes standard features such as parameterized subroutines, temporary interrupts, aborts, and memory variables, but also allows for unspecified choices in the agent program. To learn the choices that the program leaves unspecified, we present provably convergent learning algorithms. We demonstrate by example that agent programs written in the language are concise as well as modular, which facilitates state abstraction and the transfer of learned skills.
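The idea of a program with "unspecified choices" can be illustrated with a minimal sketch. It is not the authors' language or their learning algorithm: it is plain Python with ordinary tabular Q-learning, and the corridor task, the names `step`, `run_episode`, `learn`, and `greedy_policy`, and all parameter values are assumptions made up for this illustration. The programmer fixes the control loop, and only the marked choice point is filled in by learning.

```python
import random

N = 5                        # hypothetical corridor with cells 0..4; cell 4 is the goal
CHOICES = ("left", "right")  # options available at the open choice point


def step(pos, move):
    """Toy environment (assumed): deterministic move, reward -1 per step, 0 at the goal."""
    nxt = max(0, pos - 1) if move == "left" else min(N - 1, pos + 1)
    reward = 0.0 if nxt == N - 1 else -1.0
    return nxt, reward


def run_episode(q, rng, epsilon=0.2, alpha=0.5, gamma=0.9, max_steps=50):
    """Partial program: the loop is written by hand, the choice is learned."""
    pos = 0
    for _ in range(max_steps):
        if pos == N - 1:
            break
        # Unspecified choice point: epsilon-greedy over the learned values.
        if rng.random() < epsilon:
            move = rng.choice(CHOICES)
        else:
            move = max(CHOICES, key=lambda c: q[(pos, c)])
        nxt, reward = step(pos, move)
        # Standard Q-learning update for the (state, choice) pair.
        best_next = 0.0 if nxt == N - 1 else max(q[(nxt, c)] for c in CHOICES)
        q[(pos, move)] += alpha * (reward + gamma * best_next - q[(pos, move)])
        pos = nxt


def learn(episodes=200, seed=0):
    """Run seeded episodes and return the learned table of choice values."""
    rng = random.Random(seed)
    q = {(s, c): 0.0 for s in range(N) for c in CHOICES}
    for _ in range(episodes):
        run_episode(q, rng)
    return q


def greedy_policy(q):
    """Read off the learned choice for every non-goal cell."""
    return {s: max(CHOICES, key=lambda c: q[(s, c)]) for s in range(N - 1)}
```

Calling `greedy_policy(learn())` returns the choice learned for each non-goal cell. In this toy task the only sensible completion of the program is to choose `"right"` everywhere, which makes the sketch easy to check.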


Information

Published in: NIPS'00: Proceedings of the 13th International Conference on Neural Information Processing Systems, January 2000, 1051 pages

Publisher: MIT Press, Cambridge, MA, United States
