Abstract
Most asymmetrically coordinated manipulation tasks performed by humanoid manipulators have multi-level goals. A bottle-cap screwing task, for example, is composed of several sub-objectives: reaching, grasping, aligning, and screwing. In addition, the flexible-interaction requirements of dual-arm robots challenge trajectory planning methods, because the planning problem is high-dimensional and strongly coupled. Traditional reinforcement learning algorithms cannot quickly learn and generate the required trajectories. Drawing on the idea of multi-agent control, this paper proposes a dual-agent deep deterministic policy gradient (DDPG) algorithm in which two agents simultaneously plan the coordinated trajectories of the left and right arms online, solving the online trajectory-planning problem for multi-objective tasks of humanoid manipulators. The design of observations and actions in the dual-agent structure reduces the dimensionality and partially decouples the trajectory-planning problem, thereby accelerating learning. Moreover, a reward function is constructed to realize coordinated control between the two agents and to drive them to generate continuous trajectories for multi-objective tasks. Finally, the effectiveness of the proposed algorithm is verified in a Gym-based Baxter multi-objective task simulation environment. The results show that the algorithm can quickly learn and plan coordinated trajectories of humanoid manipulators online for multi-objective tasks.
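The dual-agent structure described above lends itself to a compact illustration. The sketch below (PyTorch) is not the authors' released code: the per-arm observation and action dimensions, the network widths, and the exact form of the coordination penalty are assumptions chosen for illustration. It shows the essential pattern the abstract describes: one independent DDPG actor-critic pair per arm, each acting on its own reduced observation, with a shared reward term that couples the two agents.

```python
# Minimal sketch of a dual-agent DDPG structure, assuming per-arm
# observations and 7-DoF joint actions; dimensions and weights are
# illustrative, not taken from the paper.
import torch
import torch.nn as nn

OBS_DIM, ACT_DIM = 17, 7  # assumed per-arm observation / action sizes


class Actor(nn.Module):
    """Deterministic policy mu(s) -> a for one arm."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, ACT_DIM), nn.Tanh())  # actions in [-1, 1]

    def forward(self, obs):
        return self.net(obs)


class Critic(nn.Module):
    """Action-value estimate Q(s, a) for one arm."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(OBS_DIM + ACT_DIM, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 1))

    def forward(self, obs, act):
        return self.net(torch.cat([obs, act], dim=-1))


class DDPGAgent:
    """One DDPG agent per arm; each sees only its own reduced observation,
    which is what lowers the dimension of the joint planning problem."""
    def __init__(self):
        self.actor, self.critic = Actor(), Critic()

    def act(self, obs, noise_std=0.1):
        with torch.no_grad():
            a = self.actor(torch.as_tensor(obs, dtype=torch.float32))
        # Gaussian exploration noise, clipped back to the valid range.
        return (a + noise_std * torch.randn_like(a)).clamp(-1.0, 1.0)


def coordination_reward(r_left, r_right, rel_pose_err, w=0.5):
    """Couples the two agents: each arm's own task reward plus a shared
    term penalizing misalignment between the end-effectors (assumed form)."""
    shared = -w * rel_pose_err
    return r_left + shared, r_right + shared


# Each agent plans its arm's motion online from its own observation.
left_agent, right_agent = DDPGAgent(), DDPGAgent()
a_left = left_agent.act(torch.zeros(OBS_DIM))
a_right = right_agent.act(torch.zeros(OBS_DIM))
```

Splitting the planner into two agents with per-arm observations is what gives the claimed decoupling; the shared penalty term is one plausible way to make the two otherwise-independent policies converge toward coordinated behavior.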
Supported in part by the National Natural Science Foundation of China (U2013602, 52075115, 51521003, 61911530250), National Key R&D Program of China (2020YFB13134), Self-Planned Task (SKLRS202001B, SKLRS202110B) of State Key Laboratory of Robotics and System (HIT), Shenzhen Science and Technology Research and Development Foundation (JCYJ20190813171009236), and Basic Scientific Research of Technology (JCKY2020603C009).
Copyright information
© 2023 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Liang, K., Zha, F., Sheng, W., Guo, W., Wang, P., Sun, L. (2023). Research on Target Trajectory Planning Method of Humanoid Manipulators Based on Reinforcement Learning. In: Yang, H., et al. Intelligent Robotics and Applications. ICIRA 2023. Lecture Notes in Computer Science, vol 14270. Springer, Singapore. https://doi.org/10.1007/978-981-99-6492-5_39
DOI: https://doi.org/10.1007/978-981-99-6492-5_39
Publisher Name: Springer, Singapore
Print ISBN: 978-981-99-6491-8
Online ISBN: 978-981-99-6492-5