Research article · DOI: 10.1145/3652037.3663948 · PETRA Conference Proceedings

Exploring Performance in Complex Search-and-Retrieve Tasks: A Comparative Analysis of PPO and GAIL Robots

Published: 26 June 2024

Abstract

Prior research has extensively examined supervised and unsupervised methods for search tasks, focusing primarily on the accuracy and performance of individual models. However, there is a gap in understanding how these models collaborate during complex tasks. This study's primary objective was to evaluate a simulated robot employing Proximal Policy Optimization (PPO) and another using Generative Adversarial Imitation Learning (GAIL), working together on a search-and-retrieve task. The environment, built in Unity3D, contained 56 distractors (e.g., walls, tables, and chairs) carrying negative points and 112 targets (e.g., pistol, laptop, and armour) carrying positive points during gameplay. A PPO robot searched for targets in an environment alongside a GAIL robot player. The results demonstrated that pairing the GAIL robot with the PPO robot led to superior performance by the multi-robot team on the search-and-retrieve task. The GAIL robot outperformed the PPO robot when both players performed individually, and PPO robots performed comparatively poorly when acting alone without GAIL players. Our findings highlight the value of collaborative multi-robot search involving generative imitation learning within a simulated environment.
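The scoring scheme described in the abstract — targets yield positive points, distractors yield negative points — can be sketched as a minimal reward function. The object names and point values below are illustrative assumptions for a generic search-and-retrieve environment, not the paper's actual parameters:

```python
# Hypothetical sketch of a target/distractor reward scheme like the one
# described in the abstract. Object names and point values are assumptions,
# not taken from the paper's Unity3D environment.
TARGETS = {"pistol": +1.0, "laptop": +1.0, "armour": +1.0}
DISTRACTORS = {"wall": -0.5, "table": -0.5, "chair": -0.5}


def step_reward(collected_object: str) -> float:
    """Return the reward for collecting (or colliding with) one object."""
    if collected_object in TARGETS:
        return TARGETS[collected_object]
    if collected_object in DISTRACTORS:
        return DISTRACTORS[collected_object]
    return 0.0  # neutral objects contribute nothing


def episode_score(collected: list) -> float:
    """Cumulative score over one search-and-retrieve episode."""
    return sum(step_reward(obj) for obj in collected)
```

Under this kind of scheme, a PPO agent would maximize the cumulative score directly, while a GAIL agent would instead learn to match expert demonstrations via a discriminator-derived reward; the point totals here serve only to evaluate both.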


    Published In

    PETRA '24: Proceedings of the 17th International Conference on PErvasive Technologies Related to Assistive Environments
June 2024 · 708 pages
ISBN: 9798400717604
DOI: 10.1145/3652037
    Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected].

    Publisher

    Association for Computing Machinery

    New York, NY, United States


    Author Tags

    1. Deep Reinforcement Learning
    2. Generative Adversarial Imitation Learning
    3. Proximal Policy Optimization
    4. Reinforcement Learning
    5. Unity3D
    6. Virtual Environments

    Qualifiers

    • Research-article
    • Research
    • Refereed limited

