Research article · DOI: 10.1145/3652037.3663948 · PETRA Conference Proceedings

Exploring Performance in Complex Search-and-Retrieve Tasks: A Comparative Analysis of PPO and GAIL Robots

Published: 26 June 2024

Abstract

Prior research has extensively examined supervised and unsupervised methods for search tasks, focusing primarily on the accuracy and performance of individual models. However, there is a gap in understanding how these models collaborate during complex tasks. This study's primary objective was to evaluate a simulated robot employing Proximal Policy Optimization (PPO) and another using Generative Adversarial Imitation Learning (GAIL), working together on a search-and-retrieve task. The environment, built in Unity3D, contained 56 distractors (e.g., walls, tables, and chairs) carrying negative points and 112 targets (e.g., pistol, laptop, and armour) carrying positive points during gameplay. A PPO robot searched for targets in an environment alongside a GAIL robot player. The results demonstrated that pairing the GAIL robot with the PPO robot led to superior performance by the multi-robot team on the search-and-retrieve task. The GAIL robot outperformed the PPO robot when both players performed individually, and PPO robots performed comparatively poorly when acting alone without GAIL players. Our findings highlight the value of collaborative multi-robot search involving generative imitation learning within a simulated environment.
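The scoring scheme described in the abstract — targets yield positive points, distractors yield negative points — can be sketched as a minimal reward function. The object names and point values below are illustrative assumptions for a generic search-and-retrieve environment, not the paper's actual parameters:

```python
# Hypothetical sketch of a target/distractor reward scheme like the one
# described in the abstract. Object names and point values are assumptions,
# not taken from the paper's Unity3D environment.
TARGETS = {"pistol": +1.0, "laptop": +1.0, "armour": +1.0}
DISTRACTORS = {"wall": -0.5, "table": -0.5, "chair": -0.5}


def step_reward(collected_object: str) -> float:
    """Return the reward for collecting (or colliding with) one object."""
    if collected_object in TARGETS:
        return TARGETS[collected_object]
    if collected_object in DISTRACTORS:
        return DISTRACTORS[collected_object]
    return 0.0  # neutral objects contribute nothing


def episode_score(collected: list) -> float:
    """Cumulative score over one search-and-retrieve episode."""
    return sum(step_reward(obj) for obj in collected)
```

Under this kind of scheme, a PPO agent would maximize the cumulative score directly, while a GAIL agent would instead learn to match expert demonstrations via a discriminator-derived reward; the point totals here serve only to evaluate both.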


    Published In

    PETRA '24: Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments
June 2024 · 708 pages
ISBN: 9798400717604
DOI: 10.1145/3652037
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Deep Reinforcement Learning
    2. Generative Adversarial Imitation Learning
    3. Proximal Policy Optimization
    4. Reinforcement Learning
    5. Unity3D
    6. Virtual Environments

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

