Abstract
In this paper we propose an algorithm for multi-agent Q-learning. The algorithm is inspired by the natural behaviour of ants, which deposit pheromone in the environment to communicate. Beyond simulating the behaviour of an ant colony, the benefit is a way of designing complex multi-agent systems in which complex behaviour emerges from relatively simple interacting agents. The proposed Q-learning update equation includes a belief factor, which reflects the confidence the agent has in the pheromone detected in its environment. Agents communicate implicitly to co-ordinate and co-operate in learning to solve a problem.
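The abstract refers to a Q-learning update that incorporates a belief factor weighting the pheromone an agent detects. Since the update equation itself is not reproduced on this page, the sketch below is only a plausible illustration in Python of how such an update might look: the class name, the way pheromone enters the learning target, the deposit and evaporation rules, and all parameter values are assumptions for illustration, not the authors' actual equation.

```python
import random
from collections import defaultdict

# Hypothetical sketch of a pheromone-weighted Q-learning update.
# The belief factor `beta` and the way pheromone enters the target
# are assumptions; the paper's actual update equation may differ.

class PheromoneQLearner:
    def __init__(self, actions, alpha=0.1, gamma=0.9, beta=0.5, epsilon=0.1):
        self.q = defaultdict(float)          # Q(s, a) table
        self.pheromone = defaultdict(float)  # synthetic pheromone per state
        self.actions = actions
        self.alpha, self.gamma = alpha, gamma
        self.beta = beta        # belief factor: confidence in detected pheromone
        self.epsilon = epsilon

    def choose_action(self, state):
        # Epsilon-greedy selection over Q-values.
        if random.random() < self.epsilon:
            return random.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state):
        # Standard one-step Q-learning target, plus a pheromone term
        # scaled by the belief factor (assumed form of the update).
        best_next = max(self.q[(next_state, a)] for a in self.actions)
        target = reward + self.gamma * (best_next + self.beta * self.pheromone[next_state])
        self.q[(state, action)] += self.alpha * (target - self.q[(state, action)])
        # The agent deposits pheromone in the state it leaves (assumed deposit rule),
        # providing the implicit communication channel for other agents.
        self.pheromone[state] += 1.0

    def evaporate(self, rate=0.05):
        # Pheromone decays over time, as natural ant trails do.
        for s in self.pheromone:
            self.pheromone[s] *= (1.0 - rate)
```

Under these assumptions, agents sharing the same `pheromone` table co-ordinate implicitly: each agent's deposits bias the learning targets of the others, while evaporation limits the influence of stale trails.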
Copyright information
© 2002 Springer-Verlag Berlin Heidelberg
Cite this paper
Monekosso, N., Remagnino, P., Szarowicz, A. (2002). An Improved Q-Learning Algorithm Using Synthetic Pheromones. In: Dunin-Keplicz, B., Nawarecki, E. (eds) From Theory to Practice in Multi-Agent Systems. CEEMAS 2001. Lecture Notes in Computer Science, vol 2296. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-45941-3_21
DOI: https://doi.org/10.1007/3-540-45941-3_21
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-43370-5
Online ISBN: 978-3-540-45941-5