research-article

A federated advisory teacher–student framework with simultaneous learning agents

Published: 18 February 2025

Abstract

Multi-agent reinforcement learning requires numerous interactions with the environment and other agents to learn an optimal policy. The teacher–student framework is one paradigm that can enhance the learning performance of reinforcement learning by allowing agents to seek advice from one another. However, recent studies show limitations in knowledge sharing between agents, as advisory learning is only peer-to-peer at a given time. Some methods enable a student to accept multiple pieces of advice, but they typically rely on pre-trained and/or policy-fixed teachers, rendering them unsuitable for agents with simultaneous advisory learning. Simultaneous learning with multiple pieces of advice has not been thoroughly investigated. Furthermore, most research has concentrated on the sharing of knowledge samples, a practice vulnerable to security breaches that could allow attackers to deduce details about the environment. To address these challenges, we propose a federated advisory framework that uses a federated learning structure to aggregate multiple sources of advice with deep reinforcement learning, ensuring that the shared advice is not sample-based. Our experimental comparisons with leading advisory learning techniques confirm that our approach significantly enhances learning performance.
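
As a rough illustration of what "aggregating multiple sources of advice" without exchanging samples could look like, the sketch below averages the model parameters of several simultaneously learning agents in the style of federated averaging (FedAvg). The abstract does not specify the aggregation rule, so everything here is an assumption made for illustration: the function name `federated_average`, the flat dictionary representation of parameters, and the per-agent weights are hypothetical and are not taken from the paper.

```python
from typing import Dict, List, Sequence

# Hypothetical representation: each agent's model is a dict mapping a
# layer name to a flat list of parameter values.
Params = Dict[str, List[float]]


def federated_average(agent_params: Sequence[Params],
                      weights: Sequence[float]) -> Params:
    """Weighted average of agents' parameters (FedAvg-style sketch).

    Only parameters are combined; no environment samples are exchanged.
    This is an assumed aggregation rule, not the paper's stated method.
    """
    if not agent_params:
        raise ValueError("at least one agent is required")
    if len(agent_params) != len(weights):
        raise ValueError("one weight per agent is required")
    total = float(sum(weights))
    if total <= 0.0:
        raise ValueError("weights must sum to a positive value")

    aggregated: Params = {}
    for name, reference in agent_params[0].items():
        # Average each parameter position across all agents.
        aggregated[name] = [
            sum(w * p[name][i] for p, w in zip(agent_params, weights)) / total
            for i in range(len(reference))
        ]
    return aggregated


if __name__ == "__main__":
    agent_a = {"layer": [1.0, 2.0]}
    agent_b = {"layer": [3.0, 4.0]}
    # Equal weights give the plain mean of the two agents' parameters.
    print(federated_average([agent_a, agent_b], [1.0, 1.0]))
```

In a setting like the one the abstract describes, each agent would presumably continue training on its own experience after receiving the aggregated parameters. How often aggregation happens and how the agents are weighted are design choices this sketch leaves open.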

Published In

Knowledge-Based Systems, Volume 305, Issue C
Dec 2024
1481 pages

Publisher

Elsevier Science Publishers B. V.

Netherlands

Author Tags

  1. Reinforcement learning
  2. Federated learning
  3. Teacher–student framework
  4. Multi-agent system
  5. Multiple sources of advice
