Abstract
When deep reinforcement learning algorithms are used for Unmanned Aerial Vehicle (UAV) autonomous obstacle avoidance and target tracking, problems such as slow convergence and low success rates often arise. This paper therefore proposes a new deep reinforcement learning algorithm, the Multiple Pools Twin Delayed Deep Deterministic Policy Gradient (MPTD3) algorithm. First, the state space and action space of the UAV are established as continuous models, which is closer to engineering practice than discrete models. Then, a multiple-experience-pool mechanism and gradient truncation are designed to improve the convergence of the algorithm. Furthermore, the algorithm gains generalization ability by endowing the UAV with environmental perception. Experimental results verify the effectiveness of the proposed method.
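The two convergence mechanisms named in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation; the pool count, mixing ratios, and the L2-norm clipping threshold are illustrative assumptions, shown only to make the "multiple experience pools" and "gradient truncation" ideas concrete.

```python
import random
from collections import deque

import numpy as np

class MultiPoolReplayBuffer:
    """Illustrative multiple-experience-pool buffer: transitions are routed
    into separate pools (e.g. successful vs. ordinary episodes) and each
    mini-batch is drawn from all pools with fixed mixing ratios."""

    def __init__(self, capacity=10000, ratios=(0.5, 0.5)):
        # One deque per pool; ratios give each pool's share of a mini-batch.
        self.pools = [deque(maxlen=capacity) for _ in ratios]
        self.ratios = ratios

    def add(self, transition, pool_idx=0):
        # Caller decides which pool a transition belongs to.
        self.pools[pool_idx].append(transition)

    def sample(self, batch_size):
        # Draw from every pool according to its ratio, capped by pool size.
        batch = []
        for pool, ratio in zip(self.pools, self.ratios):
            k = min(int(batch_size * ratio), len(pool))
            batch.extend(random.sample(pool, k))
        return batch

def truncate_gradient(grad, max_norm=1.0):
    """Gradient truncation (norm clipping): rescale the gradient vector
    whenever its L2 norm exceeds max_norm, limiting update step size."""
    norm = np.linalg.norm(grad)
    if norm > max_norm:
        grad = grad * (max_norm / norm)
    return grad
```

In a TD3-style training loop, the buffer's `sample` would feed the critic update and `truncate_gradient` would be applied to the actor and critic gradients before each optimizer step; in PyTorch the latter role is typically played by `torch.nn.utils.clip_grad_norm_`.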
Acknowledgements
This work was supported by the National Natural Science Foundation of China under grant numbers 61903133 and 61733004.
Code or Data Availability
Source code in Python generated during the current study is available from the corresponding author on reasonable request.
Funding
This research was funded by the National Natural Science Foundation of China under grant number 61733004 (Prof. Yaonan Wang) and grant number 61903133 (Prof. Weilai Jiang).
Author information
Authors and Affiliations
Contributions
Guoqiang Xu contributed to the design and implementation of the research and to the writing of the manuscript; Weilai Jiang and Yaonan Wang guided the experiments and revised the manuscript.
Corresponding author
Ethics declarations
Ethical Approval
Not applicable, as this study does not involve biological applications.
Consent to Participate
All authors of this research paper have consented to participate in the research study.
Consent to Publication
All authors of this research paper have read and approved the final version submitted.
Conflict of Interest
The authors declare that they have no conflict of interest.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Xu, G., Jiang, W., Wang, Z. et al. Autonomous Obstacle Avoidance and Target Tracking of UAV Based on Deep Reinforcement Learning. J Intell Robot Syst 104, 60 (2022). https://doi.org/10.1007/s10846-022-01601-8