DOI: 10.1145/860575.860716
Article

Automatic generation of an agent's basic behaviors

Published: 14 July 2003

Abstract

The agent approach, as described in [9], aims at designing "intelligent" behaviors. Yet Reinforcement Learning (RL) methods often fail when confronted with complex tasks. We are therefore developing a methodology for the automated design of agents, in the framework of Markov Decision Processes, for cases where the global task can be decomposed into simpler, possibly concurrent, sub-tasks. Our main idea is to combine basic behaviors automatically using RL methods, which led us to propose the two complementary mechanisms presented in this paper. The first mechanism builds a global policy as a weighted combination of reusable basic policies, the weights being learned by the agent (using Simulated Annealing in our case). An agent designed this way is highly scalable: without any further refinement of the global behavior, it can automatically combine several instances of the same basic behavior to handle concurrent occurrences of the same sub-task. The second mechanism creates new basic behaviors for the combination; it relies on an incremental learning method that builds on the approximate solution obtained by combining older behaviors.
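As a concrete reading of the first mechanism, the sketch below combines the action preferences of several basic policies through a weight vector and tunes those weights by simulated annealing, as the abstract describes. This is a minimal illustration under assumed interfaces, not the authors' implementation: the dictionary form of a basic policy's output and the evaluate helper (average reward of the combined policy over a few episodes) are assumptions.

```python
import math
import random

def combined_policy(weights, basic_policies, observation):
    # Score each action as a weighted sum of the basic policies' preferences.
    # Each basic policy is assumed to map an observation to a dict
    # {action: preference}; this interface is hypothetical, as the paper
    # does not commit to a concrete representation here.
    scores = {}
    for w, policy in zip(weights, basic_policies):
        for action, preference in policy(observation).items():
            scores[action] = scores.get(action, 0.0) + w * preference
    return max(scores, key=scores.get)

def anneal_weights(evaluate, n_weights, steps=2000, t0=1.0, cooling=0.995):
    # Simulated annealing over the weight vector. evaluate(weights) is
    # assumed to return the average reward obtained by running the combined
    # policy in the environment (also a hypothetical helper).
    weights = [random.random() for _ in range(n_weights)]
    current = best = evaluate(weights)
    best_weights = list(weights)
    temperature = t0
    for _ in range(steps):
        # Perturb the weights, keeping them non-negative.
        candidate = [max(0.0, w + random.gauss(0.0, 0.1)) for w in weights]
        score = evaluate(candidate)
        # Always accept improvements; accept regressions with a probability
        # that shrinks as the temperature cools (Metropolis criterion).
        if score >= current or random.random() < math.exp((score - current) / temperature):
            weights, current = candidate, score
            if score > best:
                best, best_weights = score, list(candidate)
        temperature *= cooling
    return best_weights
```

Under this scheme, the scalability claim falls out naturally: handling two concurrent occurrences of the same sub-task amounts to listing two instances of the corresponding basic policy, each with its own learned weight.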

References

[1]
J. Baxter and P. Bartlett. Infinite-horizon policy-gradient estimation. Journal of Artificial Intelligence Research, 15:319--350, 2001.
[2]
O. Buffet, A. Dutech, and F. Charpillet. Adaptive combination of behaviors in an agent. In Proceedings of the 15th European Conference on Artificial Intelligence (ECAI'02), 2002.
[3]
K. Dixon, R. Malak, and P. Khosla. Incorporating prior knowledge and previously learned information into reinforcement learning agents. Technical report, Carnegie Mellon University, Institute for Complex Engineered Systems, 2000.
[4]
A. Dutech. Solving POMDPs using selected past events. In Proceedings of the 14th European Conference on Artificial Intelligence (ECAI'00), 2000.
[5]
S. Gadanho and L. Custodio. Asynchronous learning by emotions and cognition. In Proceedings of the Seventh International Conference on the Simulation of Adaptive Behavior (SAB2002), 2002.
[6]
M. Humphrys. Action selection methods using reinforcement learning. In From Animals to Animats 4: 4th International Conference on Simulation of Adaptive Behavior (SAB-96), September 1996.
[7]
R. A. McCallum. Instance-based utile distinctions for reinforcement learning with hidden state. In Proceedings of the 12th International Conference on Machine Learning (ICML'95), 1995.
[8]
L. Peshkin, K. Kim, N. Meuleau, and L. Kaelbling. Learning to cooperate via policy search. In Proceedings of the 16th Conference on Uncertainty in Artificial Intelligence (UAI'00), 2000.
[9]
S. Russell and P. Norvig. Artificial Intelligence: A Modern Approach. Prentice Hall, Englewood Cliffs, NJ, 1995.
[10]
S. Singh, T. Jaakkola, and M. Jordan. Learning without state estimation in partially observable Markovian decision processes. In Proceedings of the 11th International Conference on Machine Learning (ICML'94), 1994.
[11]
R. Sutton, D. Precup, and S. Singh. Between MDPs and Semi-MDPs: Learning, planning, and representing knowledge at multiple temporal scales. Technical report, University of Massachusetts, Department of Computer and Information Sciences, Amherst, MA, 1998.
[12]
T. Tyrrell. Computational Mechanisms for Action Selection. PhD thesis, University of Edinburgh, 1993.
[13]
G. Wang and S. Mahadevan. Hierarchical optimization of policy-coupled semi-Markov decision processes. In Proceedings of the 16th International Conference on Machine Learning (ICML'99). Morgan Kaufmann, 1999.

Cited By

  • (2005) Angels and artifacts: Moral agents in the age of computers and networks. Journal of Information, Communication and Ethics in Society, 3(3):151-157. DOI: 10.1108/14779960580000269. Online publication date: 31 Aug 2005.

Published In

AAMAS '03: Proceedings of the Second International Joint Conference on Autonomous Agents and Multiagent Systems
July 2003
1200 pages
ISBN:1581136838
DOI:10.1145/860575
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Publisher

Association for Computing Machinery

New York, NY, United States

Author Tags

  1. Markov decision processes
  2. adaptation
  3. complex environments
  4. reinforcement learning
  5. scalability

Qualifiers

  • Article

Conference

AAMAS03

Acceptance Rates

Overall Acceptance Rate 1,155 of 5,036 submissions, 23%
