Multi-agent Reinforcement Learning for Control Systems: Challenges and Proposals

Graña, Manuel; Fernandez-Gauna, Borja

doi:10.1007/978-3-319-24834-9_3

Part of the book series: Lecture Notes in Computer Science (LNISA, volume 9375)

Included in the following conference series:

International Conference on Intelligent Data Engineering and Automated Learning

Abstract

Multi-agent Reinforcement Learning (MARL) methods offer a promising alternative to traditional analytical approaches for the design of control systems. We review the most important MARL algorithms from a control perspective, focusing on online and model-free methods. We also review some of the sophisticated developments in state-of-the-art single-agent Reinforcement Learning that may be transferred to MARL, and list the most important remaining challenges. Finally, we propose some ideas for future research aimed at overcoming some of these challenges.

Acknowledgments

This research has been partially funded by grant TIN2011-23823 of the Ministerio de Ciencia e Innovación of the Spanish Government (MINECO), and the Basque Government grant IT874-13 for the research group. Manuel Graña was supported by EC under FP7, Coordination and Support Action, Grant Agreement Number 316097, ENGINE European Research Centre of Network Intelligence for Innovation Enhancement.

Author information

Authors and Affiliations

Grupo de Inteligencia Computacional (GIC), Universidad del País Vasco (UPV/EHU), San Sebastián, Spain
Manuel Graña & Borja Fernandez-Gauna
ENGINE Centre, Wrocław University of Technology, Wybrzeże Wyspiańskiego 27, 50-370, Wrocław, Poland
Manuel Graña

Corresponding author

Correspondence to Manuel Graña.

Editor information

Editors and Affiliations

Wroclaw University of Technology, Wroclaw, Poland
Konrad Jackowski
Department of Systems, Wroclaw University of Technology, Wroclaw, Poland
Robert Burduk
Wroclaw Univ of Tech, Wroclaw, Poland
Krzysztof Walkowiak
Wroclaw University of Technology, Faculty of Electronics, Wroclaw, Poland
Michal Wozniak
School of Electrical & Electronic E, University of Manchester, Manchester, United Kingdom
Hujun Yin

Cite this paper

Graña, M., Fernandez-Gauna, B. (2015). Multi-agent Reinforcement Learning for Control Systems: Challenges and Proposals. In: Jackowski, K., Burduk, R., Walkowiak, K., Wozniak, M., Yin, H. (eds) Intelligent Data Engineering and Automated Learning – IDEAL 2015. IDEAL 2015. Lecture Notes in Computer Science, vol 9375. Springer, Cham. https://doi.org/10.1007/978-3-319-24834-9_3

DOI: https://doi.org/10.1007/978-3-319-24834-9_3
Published: 07 January 2016
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-24833-2
Online ISBN: 978-3-319-24834-9
eBook Packages: Computer Science, Computer Science (R0)
