Abstract
In both Case-Based Reasoning and Reinforcement Learning, the system under consideration gains its expertise from experience. Building on this common ground, as well as on further characteristics and results of the two disciplines, we develop an approach that facilitates the distributed learning of behaviour policies in cooperative multi-agent domains without communication between the learning agents. We evaluate our algorithms in a case study in reactive production scheduling.
Copyright information
© 2006 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Gabel, T., Riedmiller, M. (2006). Multi-agent Case-Based Reasoning for Cooperative Reinforcement Learners. In: Roth-Berghofer, T.R., Göker, M.H., Güvenir, H.A. (eds) Advances in Case-Based Reasoning. ECCBR 2006. Lecture Notes in Computer Science, vol 4106. Springer, Berlin, Heidelberg. https://doi.org/10.1007/11805816_5
DOI: https://doi.org/10.1007/11805816_5
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-36843-4
Online ISBN: 978-3-540-36846-5
eBook Packages: Computer Science (R0)