
A deep transfer‐learning‐based dynamic reinforcement learning for intelligent tightening system

Published: 28 January 2021

Abstract

Reinforcement learning (RL) has been widely applied in static environments with standard reward functions. For intelligent tightening tasks, transforming expert knowledge into a mathematical expression that RL agents can recognize is a challenge. Changing assembly standards force the model to relearn updated knowledge at a high time cost. In addition, because reward functions are difficult to design accurately, the RL model itself limits its application in complex and dynamic engineering environments. To solve these problems, a deep transfer-learning-based dynamic reinforcement learning (DRL-DTL) model is presented and applied to an intelligent tightening system. Specifically, a deep convolution transfer-learning model (DCTL) is proposed to build a mathematical mapping between the model's agents and subjective expert knowledge, which enables agents to learn from human knowledge efficiently. A dynamic expert library is then established to improve the algorithm's adaptability to changing environments, and an inverse RL method based on prior knowledge is presented to acquire reward functions. Experiments are conducted on a tightening assembly system, and the results show that a tightening robot equipped with the proposed model can autonomously inspect quality problems during the tightening process and make adjustment decisions based on the optimal policy the agent computes.
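To make the pipeline concrete, below is a minimal, hypothetical sketch of the decision loop the abstract describes: expert demonstrations stored in a library that can grow over time, a toy stand-in for inverse RL that recovers a reward from those demonstrations, and a tabular Q-learning step that derives the adjustment policy. All names (recover_reward, q_learn, the discretized states and actions) are illustrative assumptions, not the authors' implementation, which uses a deep convolution transfer-learning model rather than the frequency counts shown here.

```python
# Hypothetical sketch of the DRL-DTL decision loop, assuming a discretized
# tightening process. Not the paper's API; a toy illustration only.
import random
from collections import defaultdict

# Assumed discrete spaces: states index torque-angle segments of the
# tightening curve, actions adjust the tightening parameters.
STATES = range(10)
ACTIONS = ["hold", "increase_torque", "decrease_torque"]

def recover_reward(expert_trajectories):
    """Toy stand-in for inverse RL: score each (state, action) pair by how
    often the expert chose it, so demonstrated behavior earns higher reward."""
    counts = defaultdict(float)
    for trajectory in expert_trajectories:
        for state, action in trajectory:
            counts[(state, action)] += 1.0
    total = sum(counts.values()) or 1.0
    return {sa: c / total for sa, c in counts.items()}

def q_learn(reward, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.2):
    """Tabular Q-learning against the recovered reward, with toy dynamics:
    the state advances one curve segment per step until the curve ends."""
    q = defaultdict(float)
    for _ in range(episodes):
        state = 0
        while state < len(STATES) - 1:
            if random.random() < epsilon:
                action = random.choice(ACTIONS)          # explore
            else:
                action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
            next_state = state + 1
            r = reward.get((state, action), 0.0)
            best_next = max(q[(next_state, a)] for a in ACTIONS)
            q[(state, action)] += alpha * (r + gamma * best_next - q[(state, action)])
            state = next_state
    return q

# A "dynamic expert library": new demonstrations can be appended at any time,
# after which the reward and policy are re-derived.
expert_library = [[(s, "increase_torque") for s in range(9)]]
policy_q = q_learn(recover_reward(expert_library))
print(max(ACTIONS, key=lambda a: policy_q[(0, a)]))  # action chosen at state 0
```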

Graphical Abstract

A deep transfer-learning-based dynamic reinforcement learning (DRL-DTL) model is proposed to give industrial machine systems intelligent decision-making ability. Data generated in the physical layer (including multisensors) are input to the digital twin layer (which contains the DRL-DTL model) for feature analysis and exploration of optimal solutions. The optimal scenario is obtained in the decision layer and finally fed back to the physical layer. The figure also illustrates the state space and action space of tightening curves.
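The following is a minimal sketch of that three-layer feedback loop under stated assumptions: the SensorSample fields, the torque-per-degree feature, and the decide() thresholds are hypothetical placeholders for illustration, not quantities from the paper.

```python
# Hypothetical sketch of the three-layer loop in the graphical abstract:
# physical layer -> digital twin layer -> decision layer -> physical layer.
from dataclasses import dataclass

@dataclass
class SensorSample:
    """Physical layer: one multisensor reading from the tightening process."""
    torque: float   # assumed units: N*m
    angle: float    # assumed units: degrees

def digital_twin(sample: SensorSample) -> float:
    """Digital twin layer: extract a feature from the tightening curve;
    here, a toy torque-per-degree gradient."""
    return sample.torque / max(sample.angle, 1e-9)

def decide(feature: float, target: float = 0.5) -> str:
    """Decision layer: choose the adjustment fed back to the physical layer."""
    if feature < target:
        return "increase_torque"
    if feature > target:
        return "decrease_torque"
    return "hold"

# One pass around the loop: sense, analyze, decide, feed back.
print(decide(digital_twin(SensorSample(torque=12.0, angle=30.0))))
```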


Cited By

  • IPSadas. International Journal of Intelligent Systems. 2021;37(8):5290-5324. https://doi.org/10.1002/int.22793. Online publication date: 28 December 2021.
  • Parallel crop planning based on price forecast. International Journal of Intelligent Systems. 2021;37(8):4772-4793. https://doi.org/10.1002/int.22739. Online publication date: 16 November 2021.


Published In

International Journal of Intelligent Systems, Volume 36, Issue 3
March 2021
355 pages
ISSN: 0884-8173
DOI: 10.1002/int.v36.3

Publisher

John Wiley and Sons Ltd.

United Kingdom

Publication History

Published: 28 January 2021

Author Tags

  1. deep transfer‐learning
  2. dynamic expert library
  3. inverse reinforcement learning
  4. tightening quality decision system

Qualifiers

  • Research-article
