Deep Reinforcement Learning for Parameter Tuning of Robot Visual Servoing

Published: 21 February 2023
Abstract

    Robot visual servoing controls the motion of a robot through real-time visual observations, and kinematics is a key approach to achieving it. A central challenge of kinematics-based visual servoing is that its parameters must vary over the course of a single task and must be retuned whenever the method is applied to a different task. Existing work on parameter tuning either lacks adaptation or cannot automate the tuning of all parameters, and it transfers poorly from one task to another. This work develops a Deep Reinforcement Learning (DRL) framework for robot visual servoing that automates the tuning of all parameters, both within one task and across tasks. In visual servoing, forward kinematics governs motion speed, while inverse kinematics governs motion smoothness. We therefore develop two separate modules in the proposed DRL framework: one tunes the time-varying forward-kinematics parameters to accelerate the motion, and the other tunes the inverse-kinematics parameters to ensure smoothness. Moreover, we customize a knowledge-transfer method that generalizes the proposed DRL models to various robot tasks without reconstructing the neural network. We verify the proposed method on simulated robot tasks. The experimental results show that it outperforms state-of-the-art methods and manual parameter configuration in terms of movement speed and smoothness, both in one task and across tasks.
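    As a rough illustration of the idea described in the abstract, the sketch below shows a time-varying gain being retuned at every control step of the classical image-based visual servoing law v = -λ L⁺ e. This is a hedged sketch, not the authors' implementation: the hand-written `policy` function stands in for a trained DRL actor, and the interaction matrix and error dynamics are simplified placeholder assumptions. It only conveys the pattern of a large gain far from the goal (speed) and a small gain near it (smoothness).

    ```python
    # Hedged sketch (NOT the paper's implementation): time-varying gain
    # tuning for the classic image-based visual servoing law
    #     v = -lambda * L^+ * e
    # where e is the image-feature error and L^+ the pseudo-inverse of
    # the interaction matrix. A trained DRL actor would replace policy().
    import numpy as np

    def interaction_matrix_pinv(n_features):
        # Placeholder assumption: identity stands in for L^+, which a
        # real system estimates from feature coordinates and depth.
        return np.eye(n_features)

    def policy(error_norm):
        # Stand-in for a trained actor: large gain far from the goal
        # (favoring speed), small gain near it (favoring smoothness).
        return np.clip(0.5 * error_norm, 0.05, 1.0)

    def servo(e0, steps=200, dt=0.05):
        e = np.asarray(e0, dtype=float)
        Lp = interaction_matrix_pinv(e.size)
        gains = []
        for _ in range(steps):
            lam = policy(np.linalg.norm(e))  # time-varying parameter
            v = -lam * Lp @ e                # camera velocity command
            e = e + dt * v                   # simplified dynamics: e_dot = L v
            gains.append(float(lam))
        return e, gains

    final_e, gains = servo([2.0, -1.5, 0.5, 1.0])
    print(np.linalg.norm(final_e))  # error norm after servoing
    ```

    Under this schedule the gain decays together with the error, so the trajectory slows as it approaches the goal instead of overshooting, which is the qualitative behavior the paper's two modules aim to automate.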


    Cited By

    • (2024) Strengthening Cooperative Consensus in Multi-Robot Confrontation. ACM Transactions on Intelligent Systems and Technology 15, 2, 1–27. DOI: 10.1145/3639371. Online publication date: 22-Feb-2024.
    • (2024) Time-Varying Weights in Multi-Reward Architecture for Deep Reinforcement Learning. IEEE Transactions on Emerging Topics in Computational Intelligence 8, 2, 1865–1881. DOI: 10.1109/TETCI.2024.3359039. Online publication date: Apr-2024.
    • (2023) Dynamic Weights and Prior Reward in Policy Fusion for Compound Agent Learning. ACM Transactions on Intelligent Systems and Technology 14, 6, 1–28. DOI: 10.1145/3623405. Online publication date: 14-Nov-2023.


      Published In

      ACM Transactions on Intelligent Systems and Technology  Volume 14, Issue 2
      April 2023
      430 pages
      ISSN:2157-6904
      EISSN:2157-6912
      DOI:10.1145/3582879
      Editor: Huan Liu

      Publisher

      Association for Computing Machinery

      New York, NY, United States

      Publication History

      Published: 21 February 2023
      Online AM: 12 January 2023
      Accepted: 11 December 2022
      Revised: 25 October 2022
      Received: 04 April 2022
      Published in TIST Volume 14, Issue 2


      Author Tags

      1. Robot visual servoing
      2. kinematics
      3. parameter tuning
      4. Deep Reinforcement Learning
      5. knowledge transfer

      Qualifiers

      • Research-article

      Funding Sources

      • Science and Technology Innovation Committee Foundation of Shenzhen
      • Hong Kong Research Grant Council under RIF
