Research Article

Reinforcement learning-based dynamic obstacle avoidance and integration of path planning

Published: 01 November 2021

Abstract

Deep reinforcement learning can encode fairly complex behaviors by collecting and learning from empirical data. In this study, we propose a framework for reinforcement learning-based decentralized collision avoidance in which each agent makes its decisions independently, without communicating with the others. In environments containing various kinds of dynamic obstacles with irregular movements, mobile robot agents learn to avoid obstacles and reach a target point efficiently. Moreover, a path planner is integrated with the reinforcement learning-based obstacle avoidance to handle situations in which no path can be found, thereby improving path efficiency. The robots were trained on an obstacle-avoidance policy with the soft actor-critic algorithm in environments that account for dynamic characteristics. The trained policy was implemented in the Robot Operating System (ROS) and tested in virtual and real environments on a differential-drive wheeled robot to demonstrate the effectiveness of the proposed method. Videos are available at https://youtu.be/xxzoh1XbAl0.
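The integration the abstract describes, a global path planner supplying local goals while a learned policy handles reactive avoidance, can be sketched as a simple decision loop. This is a minimal illustration, not the paper's implementation: `next_waypoint`, `rl_local_policy`, and the thresholds are hypothetical stand-ins (the trained SAC network is replaced by a hand-written reactive rule), assuming a planar robot with a 1-D range scan and velocity commands (linear, angular).

```python
import math

def next_waypoint(path, position, lookahead=0.5):
    """Pick the first path point at least `lookahead` away from the robot:
    a simple stand-in for the global planner's local goal selection."""
    for wp in path:
        if math.dist(wp, position) >= lookahead:
            return wp
    return path[-1]  # near the end of the path: head for the final point

def rl_local_policy(scan, goal_dir):
    """Hypothetical stand-in for the trained SAC policy: steer toward the
    local goal, but rotate in place when an obstacle is close ahead."""
    if min(scan) < 0.4:            # obstacle nearby (meters, assumed units)
        return 0.0, 0.8            # stop and turn
    linear = 0.5
    angular = max(-1.0, min(1.0, 2.0 * goal_dir))  # clamp the turn rate
    return linear, angular

def step(path, position, heading, scan):
    """One control cycle: planner proposes a waypoint, policy commands velocity."""
    wp = next_waypoint(path, position)
    goal_dir = math.atan2(wp[1] - position[1], wp[0] - position[0]) - heading
    return rl_local_policy(scan, goal_dir)
```

In a ROS deployment the returned pair would typically be published as a `geometry_msgs/Twist`; here the loop is kept framework-free so the handoff between planner and policy stays visible.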


Cited By

  • (2024) Memory-based soft actor-critic with prioritized experience replay for autonomous navigation. Intelligent Service Robotics 17(3):621-630. https://doi.org/10.1007/s11370-024-00514-9
  • (2023) Sampling-Based Path Planning in Highly Dynamic and Crowded Pedestrian Flow. IEEE Transactions on Intelligent Transportation Systems 24(12):14732-14742. https://doi.org/10.1109/TITS.2023.3292927
  • (2023) Reinforcement learning in robotic motion planning by combined experience-based planning and self-imitation learning. Robotics and Autonomous Systems 170. https://doi.org/10.1016/j.robot.2023.104545
  • (2023) Distance estimation with semantic segmentation and edge detection of surround view images. Intelligent Service Robotics 16(5):633-641. https://doi.org/10.1007/s11370-023-00486-2


        Published In

Intelligent Service Robotics, Volume 14, Issue 5
        Nov 2021
        170 pages

        Publisher

        Springer-Verlag

        Berlin, Heidelberg

        Publication History

        Published: 01 November 2021
        Accepted: 09 September 2021
        Received: 10 April 2021

        Author Tags

        1. Mobile robot
        2. Navigation
        3. Collision avoidance
        4. Reinforcement learning
        5. Deep learning

