Abstract
This paper develops a novel off-policy iterative algorithm that solves the optimal control problem for discrete-time complex dynamic networks using only data measured along the system trajectories. A local performance index is defined, and the optimal synchronization problem for discrete-time nonlinear complex dynamic networks is solved by approximating the solutions of the coupled Hamilton–Jacobi–Bellman equations, without knowledge of the node dynamics or the topology of the directed graph. On this basis, an off-policy iteration algorithm is designed to iteratively improve the target policy, and its convergence is proved theoretically. Actor-critic neural networks trained by gradient descent approximate the optimal control policies and performance index functions from data generated by applying prescribed behavior policies. Finally, two numerical simulation examples demonstrate the effectiveness of the proposed method.
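The off-policy idea summarized in the abstract, evaluating and improving a target policy from data produced by a different behavior policy, can be illustrated on a much simpler problem. The sketch below is not the algorithm of the paper: it assumes a single scalar linear plant with quadratic cost instead of a nonlinear network, and it replaces the actor-critic neural networks with an exact quadratic Q-function fitted by least squares. The plant parameters `a` and `b`, the cost weights `q` and `rho`, and all function names are hypothetical choices made for illustration; the learner only ever sees the recorded tuples `(x, u, x_next)`.

```python
import random


def collect_data(n_samples, a=0.9, b=0.5, seed=0):
    """Record (x, u, x_next) while a random behavior policy drives the plant.

    The plant x_next = a*x + b*u is an assumed toy model; the learner never
    reads a or b, only the recorded transitions.
    """
    rng = random.Random(seed)
    data, x = [], 1.0
    for _ in range(n_samples):
        u = rng.uniform(-1.0, 1.0)  # behavior policy: pure exploration
        x_next = a * x + b * u
        data.append((x, u, x_next))
        x = x_next
    return data


def features(x, u):
    """Basis for Q(x, u) = h_xx*x^2 + 2*h_xu*x*u + h_uu*u^2."""
    return [x * x, 2.0 * x * u, u * u]


def solve(A, y):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(y)
    M = [row[:] + [y[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(c + 1, n):
            f = M[r][c] / M[c][c]
            for k in range(c, n + 1):
                M[r][k] -= f * M[c][k]
    sol = [0.0] * n
    for r in reversed(range(n)):
        s = M[r][n] - sum(M[r][k] * sol[k] for k in range(r + 1, n))
        sol[r] = s / M[r][r]
    return sol


def evaluate(data, K, q=1.0, rho=1.0):
    """Fit the Q-function of the target policy u = -K*x from behavior data.

    Uses the Bellman equation Q(x, u) = cost(x, u) + Q(x_next, -K*x_next),
    solved in the least-squares sense via the normal equations.
    """
    AtA = [[0.0] * 3 for _ in range(3)]
    Aty = [0.0] * 3
    for x, u, x_next in data:
        phi = features(x, u)
        phi_next = features(x_next, -K * x_next)  # target policy acts at x_next
        d = [p - pn for p, pn in zip(phi, phi_next)]
        cost = q * x * x + rho * u * u
        for i in range(3):
            Aty[i] += d[i] * cost
            for j in range(3):
                AtA[i][j] += d[i] * d[j]
    return solve(AtA, Aty)


def off_policy_iteration(data, K0=0.0, n_iter=10):
    """Alternate policy evaluation and greedy improvement on one fixed data set."""
    K = K0  # K0 = 0 is stabilizing here because the assumed plant has |a| < 1
    for _ in range(n_iter):
        h_xx, h_xu, h_uu = evaluate(data, K)
        K = h_xu / h_uu  # minimizing Q over u gives u = -(h_xu / h_uu) * x
    return K


if __name__ == "__main__":
    gain = off_policy_iteration(collect_data(200))
    print("learned feedback gain:", gain)
```

The same batch of behavior data is reused in every iteration, which is the defining feature of an off-policy scheme: the target gain `K` changes, but no new experiment is run. In this toy setting the learned gain can be checked against the solution of the scalar Riccati equation for the assumed plant.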
Acknowledgements
This work was supported in part by the National Key Research and Development Program of China under Grant 2018YFB1701903 and in part by the National Natural Science Foundation of China under Grant 61973138.
Ethics declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Cite this article
Wang, J., Wang, Y. & Ji, Z. Off-Policy: Model-Free Optimal Synchronization Control for Complex Dynamical Networks. Neural Process Lett 54, 2941–2958 (2022). https://doi.org/10.1007/s11063-022-10748-2
DOI: https://doi.org/10.1007/s11063-022-10748-2